Fugu-MT: arxivの論文翻訳

このサイトではarxivの論文のうち、30ページ以下でCreative Commonsライセンス（CC 0, CC BY, CC BY-SA）の論文を日本語訳しています。本文がCCでない論文、長すぎる論文はメタデータのみを翻訳しています。（arxivのメタデータは CC 0です。）翻訳文のライセンスはCC BY-SA 4.0です。翻訳にはFugu-Machine Translatorを利用しています。

本サイトの運営者は本サイト（すべての情報・翻訳含む）の品質を保証せず、本サイト（すべての情報・翻訳含む）を使用して発生したあらゆる結果について一切の責任を負いません。

公開日が20210518となっている論文です。

Title	Authors	Abstract	論文公表日・翻訳日
# 刺激的、有用、心配、未来的:8か国における人工知能に対する大衆の認識 Exciting, Useful, Worrying, Futuristic: Public Perception of Artificial Intelligence in 8 Countries ( http://arxiv.org/abs/2001.00081v2 ) ライセンス: Link先を確認	Patrick Gage Kelley, Yongwei Yang, Courtney Heldreth, Christopher Moessner, Aaron Sedley, Andreas Kramm, David T. Newman, and Allison Woodruff	(参考訳) 人工知能(ai)の影響と利用が拡大し、その変革的な潜在性がより顕在化するにつれて、その使用の経済的、政治的、社会的、倫理的な影響に関する多くの疑問が提起されている。世論は、製品導入、商業開発、研究資金、規制に影響を与えるこれらの議論において重要な役割を担っている。本稿では,8カ国と6大陸にまたがる10,005人の回答者を対象に,人工知能の世論調査を行った。我々は、AIが社会に重大な影響を与えるという認識を広く報告し、AIの責任ある開発と利用を強く支援し、また、各国のAIに対する反応を区別する4つの主要なテーマ(興奮、有用、心配、未来)で、AIに対する大衆の感情を特徴づける。 As the influence and use of artificial intelligence (AI) have grown and its transformative potential has become more apparent, many questions have been raised regarding the economic, political, social, and ethical implications of its use. Public opinion plays an important role in these discussions, influencing product adoption, commercial development, research funding, and regulation. In this paper we present results of an in-depth survey of public opinion of artificial intelligence conducted with 10,005 respondents spanning eight countries and six continents. We report widespread perception that AI will have significant impact on society, accompanied by strong support for the responsible development and use of AI, and also characterize the public's sentiment towards AI with four key themes (exciting, useful, worrying, and futuristic) whose prevalence distinguishes response to AI in different countries.	翻訳日:2023-06-09 22:45:16 公開日:2021-05-18
# 一般確率論における確率の動的制約 How dynamics constrains probabilities in general probabilistic theories ( http://arxiv.org/abs/2002.05088v3 ) ライセンス: Link先を確認	Thomas D. Galley and Lluis Masanes	(参考訳) 本稿では,システムの動的構造と確率構造を区別する一般確率論を解析するための一般的な枠組みを紹介する。力学構造は、可逆力学の作用とともに純粋な状態の集合であり、確率構造は測定値と結果確率を決定する。動的群と安定化部分群がゲルファント対を形成する推移的力学構造に対して、すべての確率的構造は剛性(無限に変形することはできない)であり、動的群の球面表現と一対一の対応であることを示す。動的構造がユニタリ群によって作用する複素グラスマン多様体のものであるとき、すべての確率構造を分類するために我々の方法を適用する。これは量子論の一般化であり、純粋な状態は複素ベクトル空間の1次元部分空間で表現される代わりに、1より大きい固定次元の部分空間で表現される。また、コンパクトな2点均質な力学構造を持つ系(すなわち、与えられた距離を持つ全ての純状態は、同じ距離を持つ任意の純粋な状態に可逆的に変換可能である)がユークリッド・ヨルダン・アルゲブラに対応する系を含むことを示す。 We introduce a general framework for analysing general probabilistic theories, which emphasises the distinction between the dynamical and probabilistic structures of a system. The dynamical structure is the set of pure states together with the action of the reversible dynamics, whilst the probabilistic structure determines the measurements and the outcome probabilities. For transitive dynamical structures whose dynamical group and stabiliser subgroup form a Gelfand pair we show that all probabilistic structures are rigid (cannot be infinitesimally deformed) and are in one-to-one correspondence with the spherical representations of the dynamical group. We apply our methods to classify all probabilistic structures when the dynamical structure is that of complex Grassmann manifolds acted on by the unitary group. This is a generalisation of quantum theory where the pure states, instead of being represented by one-dimensional subspaces of a complex vector space, are represented by subspaces of a fixed dimension larger than one. We also show that systems with compact two-point homogeneous dynamical structures (i.e. every pair of pure states with a given distance can be reversibly transformed to any other pair of pure states with the same distance), which include systems corresponding to Euclidean Jordan Algebras, all have rigid probabilistic structures.	翻訳日:2023-06-03 21:23:36 公開日:2021-05-18
# タンパク質格子問題における制限量子スピードアップの可能性の検討 Investigating the potential for a limited quantum speedup on protein lattice problems ( http://arxiv.org/abs/2004.01118v2 ) ライセンス: Link先を確認	Carlos Outeiral, Garrett M. Morris, Jiye Shi, Martin Strahm, Simon C. Benjamin and Charlotte M. Deane	(参考訳) タンパク質の折りたたみは計算生物学における中心的な課題であり、分子生物学、創薬、触媒設計における重要な応用である。ハードコンビネート最適化問題として、量子アニーリングの潜在的なターゲット問題として研究されている。いくつかの実験的な実装が文献で議論されているが、これらのアプローチの計算スケーリングは解明されていない。本稿では,多数の小ペプチド折り畳み問題に適用する量子アニーリングの数値的研究を行い,短期的応用に有用な知見を推察することを目的とした。 2つの結論として、タンパク質格子の折り畳みに適用した場合、ナイーブな量子アニールさえも古典的アプローチより優れている可能性があり、ハミルトン派や関連するスケジュールの注意深いエンジニアリングはこの問題に対して顕著な相対的な改善をもたらす。全体として、量子アルゴリズムはタンパク質の折り畳みや構造予測領域の問題を改善できる可能性が示唆された。 Protein folding is a central challenge in computational biology, with important applications in molecular biology, drug discovery and catalyst design. As a hard combinatorial optimisation problem, it has been studied as a potential target problem for quantum annealing. Although several experimental implementations have been discussed in the literature, the computational scaling of these approaches has not been elucidated. In this article, we present a numerical study of quantum annealing applied to a large number of small peptide folding problems, aiming to infer useful insights for near-term applications. We present two conclusions: that even naive quantum annealing, when applied to protein lattice folding, has the potential to outperform classical approaches, and that careful engineering of the Hamiltonians and schedules involved can deliver notable relative improvements for this problem. Overall, our results suggest that quantum algorithms may well offer improvements for problems in the protein folding and structure prediction realm.	翻訳日:2023-05-27 03:17:05 公開日:2021-05-18
# 近似ダイナミクスはより最適制御につながる:効率的な完全微分 Approximate Dynamics Lead to More Optimal Control: Efficient Exact Derivatives ( http://arxiv.org/abs/2005.09943v4 ) ライセンス: Link先を確認	Jesper Hasseriis Mohr Jensen, Frederik Skovbo M{\o}ller, Jens Jakob S{\o}rensen, Jacob Friis Sherson	(参考訳) 正確な微分は、量子最適化のランドスケープにおいて局所的な反転と収束を効率的に行うために重要である。ユニタリ制御タスクに対する解析的厳密な制御導関数(gradient and hessian)を導出することにより、この精度要件を満たす計算可能性が伝播スキームと問題表現の選択に依存することを示した。正確な伝播が十分に安価である場合でも、(適切な)近似伝達子を最適化する方が、おそらく驚くほど効率的である。重要なことに、最初の分析的考察を終えると、現実のシステムへの直接的な適用において、標準的な数値的技術のみが明示的に必要となる。これらの結果はヒルベルト空間次元を増大させる2つの具体的な問題に対して数値的に検証される。最善のスキームは機械精度に対して単位忠実度を得るが、他のスキームの結果は計算時間において10桁、最悪の場合には10桁の精度で一貫して分離される。これらのギャップはシステムのサイズと複雑さによって継続的に増大するため、この手法は、例えば、多体コンテキストにおいて、別々に発行される高忠実な体制で、非常に高次元のダイナミクスを数値的に効率的に最適化することができる。 Accurate derivatives are important for efficiently locally traversing and converging in quantum optimization landscapes. By deriving analytically exact control derivatives (gradient and Hessian) for unitary control tasks, we show here that the computational feasibility of meeting this accuracy requirement depends on the choice of propagation scheme and problem representation. Even when exact propagation is sufficiently cheap it is, perhaps surprisingly, much more efficient to optimize the (appropriately) approximate propagators: approximations in the dynamics are traded off for significant complexity reductions in the exact derivative calculations. Importantly, past the initial analytical considerations, only standard numerical techniques are explicitly required with straightforward application to realistic systems. These results are numerically verified for two concrete problems of increasing Hilbert space dimensionality. The best schemes obtain unit fidelity to machine precision whereas the results for other schemes are separated consistently by orders of magnitude in computation time and in worst case 10 orders of magnitude in achievable fidelity. Since these gaps continually increase with system size and complexity, this methodology allows numerically efficient optimization of very high-dimensional dynamics, e.g. in many-body contexts, operating in the high-fidelity regime which will be published separately.	翻訳日:2023-05-19 06:00:28 公開日:2021-05-18
# 可変絡み合いセンサネットワークを用いた量子強調データ分類 Quantum-enhanced data classification with a variational entangled sensor network ( http://arxiv.org/abs/2006.11962v2 ) ライセンス: Link先を確認	Yi Xia, Wei Li, Quntao Zhuang, Zheshen Zhang	(参考訳) ノイズの多い中間スケール量子(NISQ)ハードウェア上に構築された変分量子回路(VQC)は、古典的な処理とともに、量子シミュレーション、古典的な最適化、機械学習のための有望なアーキテクチャを構成する。しかし、古典的なスキームよりも量子的優位性を示すために必要なVQC深さは、利用可能なNISQデバイスの範囲を超えている。畳み込みセンサネットワーク(SLAEN)によって補助される監視学習は、古典的な機械学習アルゴリズムで訓練されたVQCを活用して、センサが共有するマルチパーティの絡み合いを調整し、実用的なデータ処理問題を解決する。本稿では、SLAENの初の実験実験を報告し、多次元無線周波数信号の分類における誤差確率のエンタングルメントによる低減を示す。我々の研究は、NISQ時代における量子化データ処理の新たな道のりを開拓している。 Variational quantum circuits (VQCs) built upon noisy intermediate-scale quantum (NISQ) hardware, in conjunction with classical processing, constitute a promising architecture for quantum simulations, classical optimization, and machine learning. However, the required VQC depth to demonstrate a quantum advantage over classical schemes is beyond the reach of available NISQ devices. Supervised learning assisted by an entangled sensor network (SLAEN) is a distinct paradigm that harnesses VQCs trained by classical machine-learning algorithms to tailor multipartite entanglement shared by sensors for solving practically useful data-processing problems. Here, we report the first experimental demonstration of SLAEN and show an entanglement-enabled reduction in the error probability for classification of multidimensional radio-frequency signals. Our work paves a new route for quantum-enhanced data processing and its applications in the NISQ era.	翻訳日:2023-05-13 05:18:09 公開日:2021-05-18
# 結合線からのフラクタル位相相 Fractonic topological phases from coupled wires ( http://arxiv.org/abs/2010.15148v3 ) ライセンス: Link先を確認	Joseph Sullivan, Arpit Dua and Meng Cheng	(参考訳) 3次元では、ギャップ付き位相は「フラクトロニック」準粒子励起をサポートすることができ、これは完全に非運動的であるか、あるいは低次元のサブ多様体の中でしか動くことができない。本研究では, 三次元連結ワイヤ構造を用いてフラクトロニック位相を探索し, トポロジ的位相を2次元で実現・特徴づける手法として成功している。フラクタル励起を伴うガッピング相とギャップレス相の両方がモデルから現れることが判明した。ガッピングの場合、フラクタル励起はワイヤ方向に沿って移動可能であるが、横面における移動性は一般的に減少する。一般の励起は、既知のギャップを持つフラクトンモデルとは異なる無限次核融合構造を持つことを示す。 2D結合ワイヤ構造と同様に、多くのモデルは、無限成分のルッティンガー液体によって記述できる隙間のない(あるいはキラルな)表面状態を示す。しかし、表面理論の普遍性クラスは表面配向に強く依存するため、フラクトン相に特有の新しいタイプのバルク境界対応が明らかになる。 In three dimensions, gapped phases can support "fractonic" quasiparticle excitations, which are either completely immobile or can only move within a low-dimensional submanifold, a peculiar topological phenomenon going beyond the conventional framework of topological quantum field theory. In this work we explore fractonic topological phases using three-dimensional coupled wire constructions, which have proven to be a successful tool to realize and characterize topological phases in two dimensions. We find that both gapped and gapless phases with fractonic excitations can emerge from the models. In the gapped case, we argue that fractonic excitations are mobile along the wire direction, but their mobility in the transverse plane is generally reduced. We show that the excitations in general have infinite-order fusion structure, distinct from previously known gapped fracton models. Like the 2D coupled wire constructions, many models exhibit gapless (or even chiral) surface states, which can be described by infinite-component Luttinger liquids. However, the universality class of the surface theory strongly depends on the surface orientation, thus revealing a new type of bulk-boundary correspondence unique to fracton phases.	翻訳日:2023-04-27 06:15:40 公開日:2021-05-18
# 動的場推論と超対称性 Dynamical field inference and supersymmetry ( http://arxiv.org/abs/2010.15414v2 ) ライセンス: Link先を確認	Margret Westerkamp, Igor Ovchinnikov, Philipp Frank, Torsten En{\ss}lin	(参考訳) 物理分野の進化に関する知識は、科学、技術、経済学において最重要である。動的場推論(DFI)は、有限データから確率的に駆動される動的に進化する場を再構成する問題に対処する。情報場理論(英: information field theory、IFT)とは、情報場の理論である。ここでは、DFI、IFT、最近開発された超対称確率論(STS)の関係が、教育学的議論において確立されている。 IFTでは、全時空推論問題の分割関数からフィールド期待値を計算することができる。推論問題の分割関数は、場に依存した関数決定式と同様に動力学を保証する機能ディラック関数を起動し、適切な正規化を確立する。 STSは、それぞれフェルミオンゴーストとボソニックラグランジュフィールドを導入することによって、これらの問題表現を置き換える。これらの場の作用は超対称性を持ち、ボソンとフェルミオンの間の交換演算が存在し、系を不変にする。これとは対照的に、力学場の測定はこの超対称性に従わない。超対称性は自然に破壊することもでき、そこでは系がカオス的に進化する。これはシステムの予測可能性に影響を与えるため、DFIをより難しくする。ファインマン図の助けを借りて,簡略化された図示システムの非線形カオス力学と測定制約の相互作用を考察し,フェルミオン補正が系軌道上の正しい後方統計を得るために不可欠であることを示した。 Knowledge on evolving physical fields is of paramount importance in science, technology, and economics. Dynamical field inference (DFI) addresses the problem of reconstructing a stochastically driven, dynamically evolving field from finite data. It relies on information field theory (IFT), the information theory for fields. Here, the relations of DFI, IFT, and the recently developed supersymmetric theory of stochastics (STS) are established in a pedagogical discussion. In IFT, field expectation values can be calculated from the partition function of the full space-time inference problem. The partition function of the inference problem invokes a functional Dirac function to guarantee the dynamics, as well as a field-dependent functional determinant, to establish proper normalization, both impeding the necessary evaluation of the path integral over all field configurations. STS replaces these problematic expressions via the introduction of fermionic ghost and bosonic Lagrange fields, respectively. The action of these fields has a supersymmetry, which means there exists an exchange operation between bosons and fermions that leaves the system invariant. In contrast to this, measurements of the dynamical fields do not adhere to this supersymmetry. The supersymmetry can also be broken spontaneously, in which case the system evolves chaotically. This affects the predictability of the system and thereby make DFI more challenging. We investigate the interplay of measurement constraints with the non-linear chaotic dynamics of a simplified, illustrative system with the help of Feynman diagrams and show that the Fermionic corrections are essential to obtain the correct posterior statistics over system trajectories.	翻訳日:2023-04-27 00:57:09 公開日:2021-05-18
# 正演算値測度に基づくコヒーレンスを絡み合いに変換する Converting coherence based on positive-operator-valued measures into entanglement ( http://arxiv.org/abs/2011.00220v3 ) ライセンス: Link先を確認	Sunho Kim, Chunhe Xiong, Asutosh Kumar, and Junde Wu	(参考訳) 量子資源理論は、量子物理学における現象を広範囲に研究するための多様で強力な枠組みを提供する。量子コヒーレンス(quantum coherence)は、多くの量子情報処理の基本的な要素である。量子情報に対する広範かつ現在の関心の対象であり、その成立以来多くの新しい概念が導入され、一般化されてきた。ここではブロックコヒーレンスをブロックインコヒーレント操作により絡み合いに変換することができることを示す。さらに,naimark拡張によるブロックコヒーレンスに関連するpovmベースのコヒーレンスが,絡み合い生成の観点から潜在的資源として機能することを見出した。最後に、povmベースのコヒーレンスから絡み合う方法を説明し、埋め込みチャネルと補助システムを必要とする戦略を提示し、いくつかの例を示し、一般化する。 Quantum resource theories provide a diverse and powerful framework for extensively studying the phenomena in quantum physics. Quantum coherence, a quantum resource, is the basic ingredient in many quantum information tasks. It is a subject of broad and current interest in quantum information, and many new concepts have been introduced and generalized since its establishment. Here we show that the block coherence can be transformed into entanglement via a block incoherent operation. Moreover, we find that the POVM-based coherence associated with block coherence through the Naimark extension acts as a potential resource from the perspective of generating entanglement. Finally, we discuss avenues of creating entanglement from POVM-based coherence, present strategies that require embedding channels and auxiliary systems, give some examples, and generalize them.	翻訳日:2023-04-26 05:46:36 公開日:2021-05-18
# 微分可能な量子回路を用いた非線形微分方程式の解法 Solving nonlinear differential equations with differentiable quantum circuits ( http://arxiv.org/abs/2011.10395v2 ) ライセンス: Link先を確認	Oleksandr Kyriienko, Annie E. Paine, Vincent E. Elfving	(参考訳) 非線形微分方程式系を解く量子アルゴリズムを提案する。量子特徴写像符号化を用いて、パラメタライズド量子回路の期待値として関数を定義する。解析的な形で関数微分を微分可能な量子回路(DQC)として表現するために自動微分を用いるので、勾配を計算するための不正確な有限差分手順を避けることができる。微分方程式と境界条件を満たすようdqcを訓練するハイブリッド量子古典ワークフローについて述べる。特定の例として,高次元特徴空間における微分方程式を解くためのスペクトル法の実装について述べる。技術的な観点から、適合多項式の強力な基底セットを提供し、豊かな表現性を持つチェビシェフ量子特徴写像を設計する。本研究では, ナビエ・ストークス方程式の解法をシミュレートし, 収束ダイバージェントノズル内の流体流動の密度, 温度, 速度プロファイルを計算する。 We propose a quantum algorithm to solve systems of nonlinear differential equations. Using a quantum feature map encoding, we define functions as expectation values of parametrized quantum circuits. We use automatic differentiation to represent function derivatives in an analytical form as differentiable quantum circuits (DQCs), thus avoiding inaccurate finite difference procedures for calculating gradients. We describe a hybrid quantum-classical workflow where DQCs are trained to satisfy differential equations and specified boundary conditions. As a particular example setting, we show how this approach can implement a spectral method for solving differential equations in a high-dimensional feature space. From a technical perspective, we design a Chebyshev quantum feature map that offers a powerful basis set of fitting polynomials and possesses rich expressivity. We simulate the algorithm to solve an instance of Navier-Stokes equations, and compute density, temperature and velocity profiles for the fluid flow in a convergent-divergent nozzle.	翻訳日:2023-04-23 15:05:29 公開日:2021-05-18
# labフレームコヒーレント回転波束による多原子分子中の分子フレーム光電子角分布 Molecular Frame Photoelectron Angular Distributions in Polyatomic Molecules from Lab Frame Coherent Rotational Wavepacket Evolution ( http://arxiv.org/abs/2012.04561v2 ) ライセンス: Link先を確認	Margaret Gregory, Paul Hockett, Albert Stolow, Varun Makhija	(参考訳) 実験室フレーム(LF)測定(LFPAD)から分子フレーム(MF)光電子角分布(MFPAD)を得るための行列ベースの再構成プロトコルの適用について検討した。 Similarly to other recent works on the topic of MF reconstruction, this protocol makes use of time-resolved LF measurements, in which a rotational wavepacket is prepared and probed via photoionization, followed by a numerical reconstruction routine; however, in contrast to other methodologies, the protocol developed herein does not require determination of photoionization matrix elements, and consequently takes a relatively simple numerical form (matrix transform making use of the Moore-Penrose inverse). 重要なことに、この単純さにより、多原子分子に対するMFPADの再構築が成功している。このスキームは、N_2$とC_2H_4$の2つの実例に対して数値的に示される。この新しい手法は、多原子分子の光イオン化を含むmf再構成問題に適用できると期待されている。 The application of a matrix-based reconstruction protocol for obtaining Molecular Frame (MF) photoelectron angular distributions (MFPADs) from laboratory frame (LF) measurements (LFPADs) is explored. Similarly to other recent works on the topic of MF reconstruction, this protocol makes use of time-resolved LF measurements, in which a rotational wavepacket is prepared and probed via photoionization, followed by a numerical reconstruction routine; however, in contrast to other methodologies, the protocol developed herein does not require determination of photoionization matrix elements, and consequently takes a relatively simple numerical form (matrix transform making use of the Moore-Penrose inverse). Significantly, the simplicity allows application of the method to the successful reconstruction of MFPADs for polyatomic molecules. The scheme is demonstrated numerically for two realistic cases, $N_2$ and $C_2H_4$. The new technique is expected to be generally applicable for a range of MF reconstruction problems involving photoionization of polyatomic molecules.	翻訳日:2023-04-21 18:25:16 公開日:2021-05-18
# 縮退支援量子スターリング熱エンジンにおける作業の温度依存性最大化と効率 Temperature dependent maximization of work and efficiency in a degeneracy assisted quantum Stirling heat engine ( http://arxiv.org/abs/2012.11362v3 ) ライセンス: Link先を確認	Sarbani Chatterjee, Arghadip Koner, Sohini Chatterjee, and Chandan Kumar	(参考訳) 本稿では,高調波発振器を作動媒体とする量子スターリング熱エンジンを提案する。本研究では, 所定の周波数での高調波発振器量子スターリングヒートエンジン(HO-QSHE)の効率を, 蓄熱器の温度比で最大化できることを示す。音波発振器の低温または等価に高い周波数制限では、HO-QSHEの効率はカルノー効率に近づく。さらに,量子スターリング熱エンジンの動作媒質として,箱型量子システムにおける粒子のアンサンブルを分析する。ここで、作業と効率の両方を熱貯水池の温度の比で最大化することができる。これらの研究により、最適な性能で量子スターリング熱エンジンを動作させることができる。 HO-QSHEの理論的な研究は、ほとんどの実システムは平衡付近の小さな変位に対する調和振動子として近似できるので、実験的な実現のための衝動を与える。 We propose a quantum Stirling heat engine with an ensemble of harmonic oscillators as the working medium. We show that the efficiency of the harmonic oscillator quantum Stirling heat engine (HO-QSHE) at a given frequency can be maximized at a specific ratio of the temperatures of the thermal reservoirs. In the low temperature or equivalently high frequency limit of the harmonic oscillators, the efficiency of the HO-QSHE approaches the Carnot efficiency. Further, we analyse quantum Stirling heat engine with an ensemble of particle in box quantum systems as the working medium. Here both work and efficiency can be maximized at a specific ratio of temperatures of the thermal reservoirs. These studies will enable us to operate the quantum Stirling heat engines at its optimal performance. The theoretical study of the HO-QSHE would provide impetus for its experimental realisation, as most real systems can be approximated as harmonic oscillators for small displacements near equilibrium.	翻訳日:2023-04-20 00:28:25 公開日:2021-05-18
# 量子モンテカルロにおけるgrover-rudolph状態形成の問題点 The Problem with Grover-Rudolph State Preparation for Quantum Monte-Carlo ( http://arxiv.org/abs/2101.02240v2 ) ライセンス: Link先を確認	Steven Herbert	(参考訳) 我々は,Grover-Rudolph法を用いて,解析的に定義された対数凹型確率分布の平均(および他のモーメント)を量子状態として推定するために量子モンテカルロを用いる場合,量子スピードアップは存在しないことを証明した。 We prove that there is no quantum speed-up when using quantum Monte-Carlo to estimate the mean (and other moments) of analytically-defined log-concave probability distributions prepared as quantum states using the Grover-Rudolph method.	翻訳日:2023-04-17 17:41:16 公開日:2021-05-18
# 1+1次元ゲージ場理論のリアルタイムダイナミクスへの連続的アプローチ:シュウィンガー模型の地平線相関から Continuum approach to real time dynamics of 1+1D gauge field theory: out of horizon correlations of the Schwinger model ( http://arxiv.org/abs/2101.07807v2 ) ライセンス: Link先を確認	Ivan Kukuljan	(参考訳) シュウィンガー模型における非平衡実時間ダイナミクス(d=1+1の量子電磁力学)を研究するための切断ハミルトン法を開発した。これは、局所的および大域的ゲージ変換の下で不変性を確実に捉え、時空を離散化する必要のない純粋連続体法である。 1+1dの量子電磁力学は、最近sine-gordonモデルで発見されたダイナミックな地平線破れ効果を認めている。モデルのクエンチの後、振動性のある長距離相関が発展し、水平線境界をはっきりと破る。その結果, 水平方向の相関関係の振動周波数は, 相関中間子対を介する効果を示すモデルの中間子の質量の2倍に相当することがわかった。また、以前は質量のないシュウィンガーモデルで知られていた大規模モデルのクラスタ違反について報告する。この結果は、1+1D量子電磁力学における新しい非平衡現象を明らかにし、ゲージ場理論に地平線違反効果が存在することを証明するための第一歩となる。 We develop a truncated Hamiltonian method to study nonequilibrium real time dynamics in the Schwinger model - the quantum electrodynamics in D=1+1. This is a purely continuum method that captures reliably the invariance under local and global gauge transformations and does not require a discretisation of space-time. We use it to study a phenomenon that is expected not to be tractable using lattice methods: we show that the 1+1D quantum electrodynamics admits the dynamical horizon violation effect which was recently discovered in the case of the sine-Gordon model. Following a quench of the model, oscillatory long-range correlations develop, manifestly violating the horizon bound. We find that the oscillation frequencies of the out-of-horizon correlations correspond to twice the masses of the mesons of the model suggesting that the effect is mediated through correlated meson pairs. We also report on the cluster violation in the massive version of the model, previously known in the massless Schwinger model. The results presented here reveal a novel nonequilibrium phenomenon in 1+1D quantum electrodynamics and make a first step towards establishing that the horizon violation effect is present in gauge field theory.	翻訳日:2023-04-14 17:51:54 公開日:2021-05-18
# 量子テレポーテーションにおける量子ビットチャネルの特徴付け Characterizing qubit channels in the context of quantum teleportation ( http://arxiv.org/abs/2102.02054v2 ) ライセンス: Link先を確認	Arkaprabha Ghosal, Debarshi Das, Subhashish Banerjee	(参考訳) 例えば、アリスが純粋な2量子ビット(最大に絡み合うか、最大に絡み合うか)の状態を準備し、その状態の半分を例えばボブにキュービット(単位的または非単位的)チャネルで送るシナリオを考える。最後に、共有状態はテレポーテーションチャネルとして使用される。このシナリオでは、量子テレポーテーション(QT)の資源としての最終的な状態の有効性に関するキュービットチャネルの集合を、最大平均忠実度とフィデリティ偏差(入力状態上のフィデリティ値のゆらぎ)の観点から特徴づけることに焦点をあてる。重要なことは、初期準備状態が普遍QT(すなわち、最大エンタングル状態)または普遍QT(すなわち、非最大エンタングル状態のサブセット)に有用でないとき、最終状態が普遍QT(古典的境界よりも厳密な最大平均フィデリティを持ち、フィデリティの偏差がゼロである)に有用になるようなキュービットチャネルのサブセットの存在を指摘したことである。興味深いことに、後者の場合、非単位的チャネル(散逸的相互作用)は単位的チャネル(非散逸的相互作用)よりも有効であり、非最大に絡み合った純粋な状態から普遍的qtに対して有用な状態を生成する。 We consider a scenario where a party, say, Alice prepares a pure two-qubit (either maximally entangled or non-maximally entangled) state and sends one half of this state to another distant party, say, Bob through a qubit (either unital or non-unital) channel. Finally, the shared state is used as a teleportation channel. In this scenario, we focus on characterizing the set of qubit channels with respect to the final state's efficacy as a resource of quantum teleportation (QT) in terms of maximal average fidelity and fidelity deviation (fluctuation in fidelity values over the input states). Importantly, we point out the existence of a subset of qubit channels for which the final state becomes useful for universal QT (having maximal average fidelity strictly greater than the classical bound and having zero fidelity deviation) when the initially prepared state is either useful for universal QT (i.e., for a maximally entangled state) or not useful for universal QT (i.e., for a subset of non-maximally entangled pure states). Interestingly, in the latter case, we show that non-unital channels (dissipative interactions) are more effective than unital channels (non-dissipative interactions) in producing useful states for universal QT from non-maximally entangled pure states.	翻訳日:2023-04-12 22:24:11 公開日:2021-05-18
# 低周波トラップにおける浮上磁石の地上冷却 Ground-State Cooling of Levitated Magnets in Low-Frequency Traps ( http://arxiv.org/abs/2102.03344v2 ) ライセンス: Link先を確認	Kirill Streltsov, Julen S. Pedernales, Martin B. Plenio	(参考訳) 低周波トラップ中を浮遊するメソスコピック磁性粒子の機械的自由度に対する基底状態冷却方式を提案する。本手法では,二値センサと適切な形状のパルスを用い,磁石の位置を弱く適応的に測定する。これにより粒子の位置と運動量を正確に決定することができ、初期高エントロピーの熱状態を純粋なコヒーレント状態に変換する。その後、トラップ中心をシフトしてエネルギーを抽出する。エネルギー抽出のタスクをコヒーレント変位操作に委譲することにより、発振器に結合した2レベル系の消散に依存する冷却スキームに関連する制約を克服する。我々は,本プロトコルを加熱速度や不完全な読み出し特性を含む実測実験条件で数値的に評価し,低温で作動する磁気浮上トラップに適していることを示す。その結果,ミクロンスケール粒子の地中冷却の道が開けた。 We present a ground-state cooling scheme for the mechanical degrees of freedom of mesoscopic magnetic particles levitated in low-frequency traps. Our method makes use of a binary sensor and suitably shaped pulses to perform weak, adaptive measurements on the position of the magnet. This allows us to precisely determine the position and momentum of the particle, transforming the initial high-entropy thermal state into a pure coherent state. The energy is then extracted by shifting the trap center. By delegating the task of energy extraction to a coherent displacement operation we overcome the limitations associated with cooling schemes that rely on the dissipation of a two-level system coupled to the oscillator. We numerically benchmark our protocol in realistic experimental conditions, including heating rates and imperfect readout fidelities, showing that it is well suited for magnetogravitational traps operating at cryogenic temperatures. Our results pave the way for ground-state cooling of micron-scale particles.	翻訳日:2023-04-12 11:41:01 公開日:2021-05-18
# ge Hut Wire Double Quantum Dotにおける異方性g因子とスピン軌道場 Anisotropic g-Factor and Spin-Orbit Field in a Ge Hut Wire Double Quantum Dot ( http://arxiv.org/abs/2102.03707v2 ) ライセンス: Link先を確認	Ting Zhang, He Liu, Fei Gao, Gang Xu, Ke Wang, Xin Zhang, Gang Cao, Ting Wang, Jian-Jun Zhang, Xuedong Hu, Hai-Ou Li and Guo-Ping Guo	(参考訳) ナノワイヤのホールは、マヨラナゼロモードの構築やスピン軌道量子ビットの操作において重要な役割を果たす強いスピン軌道相互作用のために近年大きな注目を集めている。ここでは、二重ドットのスピン遮断状態における強い異方性リーク電流から、全 g-テンソルを抽出し、スピン軌道場がナノワイヤの軸に対して59{\degの方位角を持つ平面内にあることを確認する。スピン軌道場の方向は、ナノワイヤに沿った強いスピン軌道相互作用を示すが、これはGe小屋ワイヤの界面反転非対称性に由来する可能性がある。また,ge hut線ダブルドットの穴に対するスピン緩和機構として,リードへのスピンフリップコネネネリングと、ダブルドット内のスピン軌道相互作用の2つの異なる機構を示す。これらの結果はgeベースの量子プロセッサの実現可能性を確立するのに役立つ。 Holes in nanowires have drawn significant attention in recent years because of the strong spin-orbit interaction, which plays an important role in constructing Majorana zero modes and manipulating spin-orbit qubits. Here, from the strongly anisotropic leakage current in the spin blockade regime for a double dot, we extract the full g-tensor and find that the spin-orbit field is in plane with an azimuthal angle of 59{\deg} to the axis of the nanowire. The direction of the spin-orbit field indicates a strong spin-orbit interaction along the nanowire, which may have originated from the interface inversion asymmetry in Ge hut wires. We also demonstrate two different spin relaxation mechanisms for the holes in the Ge hut wire double dot: spin-flip cotunneling to the leads, and spin-orbit interaction within the double dot. These results help establish feasibility of a Ge-based quantum processor.	翻訳日:2023-04-12 07:26:57 公開日:2021-05-18
# 操作のコヒーレンスと干渉計 Coherence of operations and interferometry ( http://arxiv.org/abs/2102.04863v2 ) ライセンス: Link先を確認	Michele Masini, Thomas Theurer, Martin B. Plenio	(参考訳) 量子コヒーレンス(quantum coherence)は、量子力学が古典物理学の力を超越する応用の鍵となる特徴の1つである。これは、量子資源理論を通じてコヒーレンスを定量化するために行われたかなりの努力を説明する。しかし、具体的な技術的タスクへのフレームワークの適用はほとんど失われている。本稿では、この問題に対処し、干渉計測実験の性能にコヒーレンスを検出または生成する操作の能力を結合する。 Quantum coherence is one of the key features that fuels applications for which quantum mechanics exceeds the power of classical physics. This explains the considerable efforts that were undertaken to quantify coherence via quantum resource theories. An application of the resulting framework to concrete technological tasks is however largely missing. Here, we address this problem and connect the ability of an operation to detect or create coherence to the performance of interferometric experiments.	翻訳日:2023-04-12 03:16:22 公開日:2021-05-18
# スパンニングツリー組換えのコンパクト性統計 Compactness statistics for spanning tree recombination ( http://arxiv.org/abs/2103.02699v2 ) ライセンス: Link先を確認	Jeanne N. Clelland, Nicholas Bossenbroek, Thomas Heckmaster, Adam Nelson, Peter Rock, Jade VanAusdall	(参考訳) アンサンブル分析はゲリーマンデリングを定量化するための重要なツールとなり、主要なアイデアは、提案された計画を比較することができる大きなランダムな計画("ensemble")のサンプルを生成することである。もし提案された計画が、様々な再配置基準に関してアンサンブルと比較して極端な異常であるならば、計画が意図的に特定の結果を生み出すように設計されたことを示す可能性がある。計画を構成する方法が与えられたら、その方法によって特定の計画を構築する確率を記述する計画の空間上の確率分布を特定できますか? 近年,MCMC法がアンサンブル構築の主流となっている。ここでは、2018年にMGGG Reistricting Labによって導入された「ReCom」と呼ばれるMCMC手法に焦点を当てる。 ReComは他の方法よりもコンパクトな地区で計画を作成する傾向があり、我々はこの現象をよりよく理解しようとした。この尺度はduchin と tenner によって提案され,polsby-popper score などの地理的周辺値に基づくコンパクト性尺度の難しさを回避した。基本ReComのステップをモデル化するため,2つの格子グラフとボーダー郡地区グラフの2分割計画のアンサンブルを構築した。また,2つの地区ごとの分布木数の積にほぼ比例する特定の計画の採集確率は,その計画におけるカットエッジ数の指数関数的減衰関数にほぼ比例することを示した。これはReCom法による分割計画のコンパクト性を理解するための重要なステップである。 Ensemble analysis has become an important tool for quantifying gerrymandering; the main idea is to generate a large, random sample of districting plans (an "ensemble") to which any proposed plan may be compared. If a proposed plan is an extreme outlier compared to the ensemble with regard to various redistricting criteria, this may indicate that the plan was deliberately engineered to produce a specific outcome. Many methods have been used to construct ensembles, and a fundamental question that arises is: Given a method for constructing plans, can we identify a probability distribution on the space of plans that describes the probability of constructing any particular plan by that method? Recently, MCMC methods have become a predominant tool for constructing ensembles. Here we focus on the MCMC method known as "ReCom," which was introduced in 2018 by the MGGG Redistricting Lab. ReCom tends to produce plans with more compact districts than some other methods, and we sought to better understand this phenomenon. We adopted a discrete analog of district perimeter called "cut edges" as a quantitative measure for district compactness; this measure was proposed by Duchin and Tenner, and it avoids some of the difficulties associated with compactness measures based on geographic perimeter, such as the Polsby-Popper score. To model the basic ReCom step, we constructed ensembles of 2-district plans for two grid graphs and for the precinct graph of Boulder County, CO. We found that the probability of sampling any particular plan -- which is roughly proportional to the product of the numbers of spanning trees for each of the two districts -- is also approximately proportional to an exponentially decaying function of the number of cut edges in the plan. This is an important step towards understanding compactness properties for districting plans produced by the ReCom method.	翻訳日:2023-04-09 07:42:25 公開日:2021-05-18
# コヒーレント・圧縮光によるオプトメカニカル冷却--熱弁開弁の熱力学的コスト- Optomechanical cooling with coherent and squeezed light: the thermodynamic cost of opening the heat valve ( http://arxiv.org/abs/2103.03596v2 ) ライセンス: Link先を確認	Juliette Monsel, Nastaran Dashti, Sushanth Kini Manjeshwar, Jakob Eriksson, Henric Ernbrink, Ebba Olsson, Emelie Torneus, Witlef Wieczorek and Janine Splettstoesser	(参考訳) 各種光学系において、駆動光空洞との結合による機械運動の地中冷却が実証されている。本研究では,熱弁を用いた光機械式サイドバンド冷却の熱力学的性能解析を行う。性能定量化器として, 低到達性有効温度(フォノン数)だけでなく, 標準冷蔵機の冷却電力と同等の避難熱流や, キャビティ出力光場の測定から, 全てを実験的に推定できる適切な熱力学効率についても検討する。特に,コヒーレント光によって供給される標準的なオプティメカルセットアップに加えて,コヒーレントレーザドライブを圧縮光で置き換えたり,周波数依存(ファノ)ミラーでキャビティを使用するという,地中冷却を実現するための2つの方法を検討した。弱結合限界の内外におけるこれらのセットアップのダイナミクスを考察し、既存の実験システムのパラメータに基づく具体例を示す。熱力学の枠組みを適用することで、これら3つの異なる冷却機構について詳細な知見を得ることができ、熱力学のメカニズムを網羅的に理解することができる。 Ground-state cooling of mechanical motion by coupling to a driven optical cavity has been demonstrated in various optomechanical systems. In our work, we provide a so far missing thermodynamic performance analysis of optomechanical sideband cooling in terms of a heat valve. As performance quantifiers, we examine not only the lowest reachable effective temperature (phonon number) but also the evacuated-heat flow as an equivalent to the cooling power of a standard refrigerator, as well as appropriate thermodynamic efficiencies, which all can be experimentally inferred from measurements of the cavity output light field. Importantly, in addition to the standard optomechanical setup fed by coherent light, we investigate two recent alternative setups for achieving ground-state cooling: replacing the coherent laser drive by squeezed light or using a cavity with a frequency-dependent (Fano) mirror. We study the dynamics of these setups within and beyond the weak-coupling limit and give concrete examples based on parameters of existing experimental systems. By applying our thermodynamic framework, we gain detailed insights into these three different optomechanical cooling setups, allowing a comprehensive understanding of the thermodynamic mechanisms at play.	翻訳日:2023-04-09 00:20:29 公開日:2021-05-18
# 強相互作用するフェルミ・ボース混合系におけるフェルミ・ポーラロンの安定性と分解 Stability and breakdown of Fermi polarons in a strongly interacting Fermi-Bose mixture ( http://arxiv.org/abs/2103.03625v2 ) ライセンス: Link先を確認	Isabella Fritsche, Cosetta Baroni, Erich Dobler, Emil Kirilov, Bo Huang, Rudolf Grimm, Georg M. Bruun, Pietro Massignan	(参考訳) ウルトラコールド$^6$Li原子のフェルミ海に浸漬したボソニック$^{41}$K不純物の強相互作用不均衡混合物の特性について検討した。これにより、ボース=アインシュタイン凝縮体を形成する場合を含む、大きな不純物濃度のフェルミポラロンシナリオを探索することができる。このシステムは高周波注入分光法によって特徴づけられ、種間相互作用はよく特性化されたフェッシュバッハ共鳴によって広く調整可能である。不純物雲の熱分率で形成されるフェルミポーラロンのエネルギーは、両方の種の等しい密度に接近しても、不純物濃度にかなり敏感であることがわかった。高濃度に対する明らかな非感度は、ランダウの準粒子理論に基づく理論的な予測と、ポーラロン間の弱い効果的な相互作用と一致している。ボソニックの$^{41}$Kガスの凝縮分は熱成分よりもはるかに密度が高いため、フェルミ・ポーラロンの記述が破壊される。その代わり、周波数スペクトルの新しい分岐を小さなエネルギーシフトで観測し、これは$^{41}$kの凝縮物の内部で$^{6}$liのフェルミオンによって形成されたボースポーラロンの存在と一致する。ラビ振動測定による凝縮物の挙動のより深い調査は、この観測を裏付け、我々はフェルミとボース・ポーラロン(基本的に異なる2つの準粒子)を1つの雲で実現したことを示している。 We investigate the properties of a strongly interacting imbalanced mixture of bosonic $^{41}$K impurities immersed in a Fermi sea of ultracold $^6$Li atoms. This enables us to explore the Fermi polaron scenario for large impurity concentrations including the case where they form a Bose-Einstein condensate. The system is characterized by means of radio-frequency injection spectroscopy and interspecies interactions are widely tunable by means of a well-characterized Feshbach resonance. We find that the energy of the Fermi polarons formed in the thermal fraction of the impurity cloud remains rather insensitive to the impurity concentration, even as we approach equal densities for both species. The apparent insensitivity to high concentration is consistent with a theoretical prediction, based on Landau's quasiparticle theory, of a weak effective interaction between the polarons. The condensed fraction of the bosonic $^{41}$K gas is much denser than its thermal component, which leads to a break-down of the Fermi polaron description. Instead, we observe a new branch in the radio-frequency spectrum with a small energy shift, which is consistent with the presence of Bose polarons formed by $^{6}$Li fermions inside the $^{41}$K condensate. A closer investigation of the behavior of the condensate by means of Rabi oscillation measurements support this observation, indicating that we have realized Fermi and Bose polarons, two fundamentally different quasiparticles, in one cloud.	翻訳日:2023-04-09 00:07:54 公開日:2021-05-18
# 超波長可変量子周波数変換用フォトニック結晶繊維の群速度対称性 Group-velocity symmetry in photonic crystal fibre for ultra-tunable quantum frequency conversion ( http://arxiv.org/abs/2103.04824v2 ) ライセンス: Link先を確認	Charlotte Parry, Philip B. Main, Thomas A. Wright and Peter J. Mosley	(参考訳) 単一光子の低ノイズ周波数変換は、ファイバーベースの量子ネットワークを確立する上で重要なツールである。単一フォトニック結晶繊維は、対称群速度プロファイルを用いて、超広帯域の光源光子の4波混合をブラッグ散乱することで周波数変換が可能となる。さらに,ポンプチューニングがデバイス製造における現実的な相違を緩和する方法について論じる。これにより、1つの高い適応性を持つ周波数変換インタフェースにより、通信帯域を介して量子ネットワーク内の異種ノードをリンクすることができる。 Low-noise frequency conversion of single photons is a critical tool in establishing fibre-based quantum networks. We show that a single photonic crystal fibre can achieve frequency conversion by Bragg-scattering four-wave mixing of source photons from an ultra-broad wavelength range by engineering a symmetric group velocity profile. Furthermore, we discuss how pump tuning can mitigate realistic discrepancies in device fabrication. This enables a single highly adaptable frequency conversion interface to link disparate nodes in a quantum network via the telecoms band.	翻訳日:2023-04-08 18:23:52 公開日:2021-05-18
# ゴリニ-コサコフスキ-スダールシャン-リンドブラド型マルコフ系における有限サイズ孤立量子系の全熱力学的エントロピー生成速度の負の可能性 Possibility of the total thermodynamic entropy production rate of a finite-sized isolated quantum system to be negative for the Gorini-Kossakowski-Sudarshan-Lindblad-type Markovian dynamics of its subsystem ( http://arxiv.org/abs/2103.05308v2 ) ライセンス: Link先を確認	Takaaki Aoki, Yuichiro Matsuzaki, and Hideaki Hakoshima	(参考訳) 孤立量子系の全熱力学的エントロピー生成速度について検討する。特に、中心高調波発振器(系)が周囲の有限個の高調波発振器(バス)と結合する恒星構成における結合高調波発振器の量子モデルを考える。このモデルでは、システムと浴のギブス状態の初期状態がテンソル積によって与えられるとき、全ての高調波発振器は常にギブス状態であり、温度は時間に依存する。これにより、各調和振動子に対する時間依存熱力学的エントロピーと、それらの和として全非平衡熱力学的エントロピーを定義できる。熱力学エントロピーが熱力学の第3法則を満たすことを解析的に確認する。数値解は,gorini-kossakowski-sudarshan-lindblad (gksl) 型マルコフマスター方程式によって系の力学が十分に近似された場合でも,全熱力学的エントロピー生成速度は負であり,全熱力学的エントロピーは熱力学の第2法則を満たすことを示した。この結果は、系がgksl型マルコフ力学の下では、全エントロピー生成率は非負であるという共通の信念に対する反例である。 We investigate a total thermodynamic entropy production rate of an isolated quantum system. In particular, we consider a quantum model of coupled harmonic oscillators in a star configuration, where a central harmonic oscillator (system) is coupled to a finite number of surrounding harmonic oscillators (bath). In this model, when the initial state of the total system is given by the tensor product of the Gibbs states of the system and the bath, every harmonic oscillator is always in a Gibbs state with a time-dependent temperature. This enables us to define time-dependent thermodynamic entropy for each harmonic oscillator and total nonequilibrium thermodynamic entropy as the summation of them. We analytically confirm that the total thermodynamic entropy satisfies the third law of thermodynamics. Our numerical solutions show that, even when the dynamics of the system is well approximated by the Gorini-Kossakowski-Sudarshan-Lindblad (GKSL)-type Markovian master equation, the total thermodynamic entropy production rate can be negative, while the total thermodynamic entropy satisfies the second law of thermodynamics. This result is a counterexample to the common belief that the total entropy production rate is non-negative when the system is under the GKSL-type Markovian dynamics.	翻訳日:2023-04-08 16:11:31 公開日:2021-05-18
# 断続テイラー級数によるハミルトンシミュレーションのためのnisqアルゴリズム NISQ Algorithm for Hamiltonian Simulation via Truncated Taylor Series ( http://arxiv.org/abs/2103.05500v2 ) ライセンス: Link先を確認	Jonathan Wei Zhong Lau, Tobias Haug, Leong Chuan Kwek, Kishor Bharti	(参考訳) 多体量子システムのダイナミクスをシミュレートすることは、量子コンピュータが古典的コンピュータよりも量子優位を示す最初の分野の1つであると考えられている。ノイズの多い中間スケール量子(nisq)アルゴリズムは、現在利用可能な量子ハードウェアを効果的に利用することを目指している。量子シミュレーションでは、様々な種類のNISQアルゴリズムが個々の利点と課題によって提案されている。本稿では,既存のアルゴリズムの利点を共有し,いくつかの欠点を緩和する新しいアルゴリズムであるttqs(tncated taylor quantum simulator)を提案する。本アルゴリズムは古典量子フィードバックループを持たず,構築によって不毛台地問題を回避している。我々のハイブリッド量子古典アルゴリズムの古典的部分は、半定値緩和を含む1つの2次等式制約を持つ2次制約付き二次プログラム(QCQP)に対応する。 QCQPに基づく古典最適化は、最近ハミルトン基底問題に対するNISQアルゴリズムである量子補助固有解法(QAE)の古典的なステップとして導入された。したがって,本研究は,ハミルトン基底状態問題に対するnisqアルゴリズムとハミルトニアンシミュレーションの間の概念的統一性を提供する。量子支援シミュレータ (QAS) や変分量子シミュレータ (VQS) などのハミルトンシミュレーションのための微分方程式に基づく NISQ アルゴリズムをアルゴリズムの特定の場合として回収する。私たちは、現在のクラウド量子コンピュータのおもちゃの例でアルゴリズムをテストします。また,アルゴリズムの精度を向上させるための体系的手法を提案する。 Simulating the dynamics of many-body quantum systems is believed to be one of the first fields that quantum computers can show a quantum advantage over classical computers. Noisy intermediate-scale quantum (NISQ) algorithms aim at effectively using the currently available quantum hardware. For quantum simulation, various types of NISQ algorithms have been proposed with individual advantages as well as challenges. In this work, we propose a new algorithm, truncated Taylor quantum simulator (TTQS), that shares the advantages of existing algorithms and alleviates some of the shortcomings. Our algorithm does not have any classical-quantum feedback loop and bypasses the barren plateau problem by construction. The classical part in our hybrid quantum-classical algorithm corresponds to a quadratically constrained quadratic program (QCQP) with a single quadratic equality constraint, which admits a semidefinite relaxation. The QCQP based classical optimization was recently introduced as the classical step in quantum assisted eigensolver (QAE), a NISQ algorithm for the Hamiltonian ground state problem. Thus, our work provides a conceptual unification between the NISQ algorithms for the Hamiltonian ground state problem and the Hamiltonian simulation. We recover differential equation-based NISQ algorithms for Hamiltonian simulation such as quantum assisted simulator (QAS) and variational quantum simulator (VQS) as particular cases of our algorithm. We test our algorithm on some toy examples on current cloud quantum computers. We also provide a systematic approach to improve the accuracy of our algorithm.	翻訳日:2023-04-08 16:01:42 公開日:2021-05-18
# デュアルユニタリ量子回路における固有熱化:スペクトル関数の漸近 Eigenstate thermalization in dual-unitary quantum circuits: Asymptotics of spectral functions ( http://arxiv.org/abs/2103.11694v2 ) ライセンス: Link先を確認	Felix Fritzsch and Toma\v{z} Prosen	(参考訳) 固有状態熱化仮説は、(準)エネルギー固有基底における典型的な作用素の行列要素の統計的性質を導出することによって、孤立量子系における熱化の最も成功した記述である。本稿では,2元量子回路における作用素のクラスに対する行列要素の分布を,対応する固有状態の周波数に依存して検討する。スペクトル関数、すなわち、この周波数分解分布の第2モーメントに対する正確な漸近的表現を提供する。後者は、双対ユニタリ回路の基本構成ブロックから正確に計算できる局所作用素間の動的相関の減衰から得られる。漸近表現と正確な対角化による結果を比較すると,良好な一致が得られた。有限系サイズの小さなゆらぎは、中間時間の動的相関とそれらの漸近力学からの偏差に明示的に関係している。さらに,高いモーメントを数値計算することで,行列要素の期待ガウス分布を確認する。 The eigenstate thermalization hypothesis provides to date the most successful description of thermalization in isolated quantum systems by conjecturing statistical properties of matrix elements of typical operators in the (quasi-)energy eigenbasis. Here we study the distribution of matrix elements for a class of operators in dual-unitary quantum circuits in dependence of the frequency associated with the corresponding eigenstates. We provide an exact asymptotic expression for the spectral function, i.e., the second moment of this frequency resolved distribution. The latter is obtained from the decay of dynamical correlations between local operators which can be computed exactly from the elementary building blocks of the dual-unitary circuits. Comparing the asymptotic expression with results obtained by exact diagonalization we find excellent agreement. Small fluctuations at finite system size are explicitly related to dynamical correlations at intermediate times and the deviations from their asymptotical dynamics. Moreover, we confirm the expected Gaussian distribution of the matrix elements by computing higher moments numerically.	翻訳日:2023-04-07 04:46:44 公開日:2021-05-18
# 強結合キャビティ量子電気力学におけるフォノン効果の非マルコフ摂動理論 Non-Markovian perturbation theories for phonon effects in strong-coupling cavity quantum electrodynamics ( http://arxiv.org/abs/2103.14327v2 ) ライセンス: Link先を確認	Matias Bundgaard-Nielsen, Jesper M{\o}rk and Emil Vosmar Denning	(参考訳) フォノン相互作用は、固体エミッタや蛍光分子に基づくキャビティ量子電気力学系では避けられず、格子や化学結合の振動が電子自由度に結合する。振動環境の非マルコフ応答のため、そのような効果を計算的に効率的に記述することは重要な理論的課題である。これは、エミッタ-キャビティ結合が典型的なフォノンエネルギー範囲に匹敵するか大きいときに特に顕著であり、偏光子形成は光遷移の振動ドレッシングと一致する。本稿では,4つの非マルコフ的摂動的マスター方程式を用いて,光物質結合強度の広い範囲にわたる力学を記述し,テンソルネットワークを用いた数値的正確な参照計算と比較する。マスター方程式は異なる基底変換を用いて導出され、新しい基底における摂動拡大はその後導入され、解析される。 2つのアプローチが特に成功し,堅牢であることに気付きました。本論では, 励起子キャビティ偏光子の振動ドレッシングを基礎として, 第一報が提案され, 開発されている。これにより、ポラリトン分裂が環境における典型的なフォノン周波数スケールを超えると現れる異なるフォノン・ポーラリトンサイドバンドを記述することができる。第2のアプローチは、電子状態の変分最適化された極性振動ドレッシングに基づいている。どちらの手法も、放射スペクトルの基準計算と良好な定性的かつ定量的な一致を示し、熱フォノンの集団が顕著な高温でも数値的に堅牢である。 Phonon interactions are inevitable in cavity quantum electrodynamical systems based on solid-state emitters or fluorescent molecules, where vibrations of the lattice or chemical bonds couple to the electronic degrees of freedom. Due to the non-Markovian response of the vibrational environment, it remains a significant theoretical challenge to describe such effects in a computationally efficient manner. This is particularly pronounced when the emitter-cavity coupling is comparable to or larger than the typical phonon energy range, and polariton formation coincides with vibrational dressing of the optical transitions. In this Article, we consider four non-Markovian perturbative master equation approaches to describe such dynamics over a broad range of light-matter coupling strengths and compare them to numerically exact reference calculations using a tensor network. The master equations are derived using different basis transformations and a perturbative expansion in the new basis is subsequently introduced and analyzed. We find that two approaches are particularly successful and robust. The first of these is suggested and developed in this Article and is based on a vibrational dressing of the exciton-cavity polaritons. This enables the description of distinct phonon-polariton sidebands that appear when the polariton splitting exceeds the typical phonon frequency scale in the environment. The second approach is based on a variationally optimized polaronic vibrational dressing of the electronic state. Both of these approaches demonstrate good qualitative and quantitative agreement with reference calculations of the emission spectrum and are numerically robust, even at elevated temperatures, where the thermal phonon population is significant.	翻訳日:2023-04-06 19:28:51 公開日:2021-05-18
# 時間外行列における量子カオスのシグネチャ Signatures of quantum chaos in an out-of-time-order matrix ( http://arxiv.org/abs/2105.08282v1 ) ライセンス: Link先を確認	Magdalini Zonnios, Jesper Levinsen, Meera M. Parish, Felix A. Pollock, Kavan Modi	(参考訳) 液体のカオス性を決定するためにインク滴を用いる有名なインク滴実験に動機づけられ,量子プロセスのスクランブル容量を実験的に測定する方法を提案する。ここで、興味のある系は、系のカオス性を特定する動的性質を持つ小さな量子プローブと相互作用する。具体的には、プロセスのカオス性に関する明確な情報理論的意味を提供する、時間外行列(OTOM)と呼ばれる、時間外相関器(OTOC)の完全量子バージョンを提案する。我々は、ランダムなユニタリ過程を用いたカオスのシグネチャとしてのotomの有用性と、カオス性がチューニング可能な量子キックロータについて説明する。 Motivated by the famous ink-drop experiment, where ink droplets are used to determine the chaoticity of a fluid, we propose an experimentally implementable method for measuring the scrambling capacity of quantum processes. Here, a system of interest interacts with a small quantum probe whose dynamical properties identify the chaoticity of the system. Specifically, we propose a fully quantum version of the out-of-time-order correlator (OTOC) - which we term the out-of-time-order matrix (OTOM) - whose correlations offer clear information theoretic meanings about the chaoticity of a process. We illustrate the utility of the OTOM as a signature of chaos using random unitary processes as well as in the quantum kicked rotor, where the chaoticity is tuneable.	翻訳日:2023-03-30 20:18:10 公開日:2021-05-18
# 関数全体のヒルベルト空間とユークリッド平面のトープリッツ量子化 Hilbert Spaces of Entire Functions and Toeplitz Quantization of Euclidean Planes ( http://arxiv.org/abs/2105.08400v1 ) ライセンス: Link先を確認	Micho Durdevich and Stephen Bruce Sontz	(参考訳) 先程の論文で提示されたトープリッツ量子化の理論は拡張され、古典ユークリッド平面の多様で興味深い非可換な実現を含むようにさらに発展した。これは関数全体のヒルベルト空間を用いて行われ、1つの複素変数の多項式が密部分空間を形成する。複素座標は自然に非有界乗法演算子として作用し、その随伴子とともに作用素の高度に非可換な -代数である。トープリッツ作用素は、この代数の特殊元として幾何学的に構成され、他の二次非可換代数の記号と関連付けられ、平面上の多項式として量子化される。そのような概念的枠組みは、初期スカラー積上の興味深い非自明な条件を促進する。これらは詳細に分析される。様々な例が計算される。 The theory of Toeplitz quantization presented in our previous paper is extended and further developed to include diverse and interesting non-commutative realizations of the classical Euclidean plane. This is done using Hilbert spaces of entire functions, where polynomials in one complex variable form a dense subspace. The complex coordinate naturally acts as an unbounded multiplication operator generating, together with its adjoint, a highly non-commutative -algebra of operators. The Toeplitz operators are then geometrically constructed as special elements from this algebra; they are associated to the symbols from another quadratic non-commutative algebra, which is interpretable as polynomials over a plane to be quantized. Such a conceptual framework promotes interesting non-trivial conditions on the initial scalar product. These are analyzed in detail. Various illustrative examples are computed.	翻訳日:2023-03-30 20:09:22 公開日:2021-05-18
# 回折結合を介する低温原子の自己組織化 Self-Organization in Cold Atoms Mediated by Diffractive Coupling ( http://arxiv.org/abs/2105.08340v1 ) ライセンス: Link先を確認	Thorsten Ackemann, Guillaume Labeyrie, Giuseppe Baio, Ivor Kre\v{s}i\'c, Josh G. M. Walker, Adrian Costa Boquete, Paul Griffin, William J. Firth, Robin Kaiser, Gian-Luca Oppo, and Gordon R.M. Robb	(参考訳) 本稿では,1枚の反射鏡からのフィードバックによって誘導される光媒介相互作用による低温原子の自己組織化について論じる。ポンプビームと自発側バンドとの拡散劣化は格子周期を選択する。回転対称性と並進対称性の自発的破壊はポンプに横切る2次元平面で起こる。自己誘起原子格子上の回折リップルの結合部位を解明する。光ビームに印加された原子雲の非線形位相シフトは結合強度を決定するパラメータである。相互作用は、熱原子の原子結晶化につながる外部自由度と、量子縮退ガスの超固体、または励起状態の集団やゼーマン準準準位のような内部自由度のいずれかに調整することができる。 poincar{\'e}球面上の光偏光度(ヘリシティと偏光方向)を用いることで、原子ゼーマン状態の特定の既約テンソル成分を結合することができ、双極子と四極子の性質の自発的磁気秩序に導くことができる。臨界相互作用強度の要件は、異なる状況で比較される。縦送りキャビティ, 逆伝搬ビームスキーム, CARL不安定性への接続と拡張について論じる。 This article discusses self-organization in cold atoms via light-mediated interactions induced by feedback from a single retro-reflecting mirror. Diffractive dephasing between the pump beam and the spontaneous sidebands selects the lattice period. Spontaneous breaking of the rotational and translational symmetry occur in the 2D plane transverse to the pump. We elucidate how diffractive ripples couple sites on the self-induced atomic lattice. The nonlinear phase shift of the atomic cloud imprinted onto the optical beam is the parameter determining coupling strength. The interaction can be tailored to operate either on external degrees of freedom leading to atomic crystallization for thermal atoms and supersolids for a quantum degenerate gas, or on internal degrees of freedom like populations of the excited state or Zeeman sublevels. Using the light polarization degrees of freedom on the Poincar{\'e} sphere (helicity and polarization direction), specific irreducible tensor components of the atomic Zeeman states can be coupled leading to spontaneous magnetic ordering of states of dipolar and quadrupolar nature. The requirements for critical interaction strength are compared for the different situations. Connections and extensions to longitudinally pumped cavities, counterpropagating beam schemes and the CARL instability are discussed.	翻訳日:2023-03-30 20:07:42 公開日:2021-05-18
# 2光子量子電池のキャラクタリゼーション:初期条件,安定性,作業抽出 Characterization of a Two-Photon Quantum Battery: Initial Conditions, Stability and Work Extraction ( http://arxiv.org/abs/2105.08337v1 ) ライセンス: Link先を確認	Anna Delmonte, Alba Crescente, Matteo Carrega, Dario Ferraro, Maura Sassetti	(参考訳) 2光子相互作用によってキャビティ放射と結合した2レベル系に基づく量子電池を考える。キャビティの初期条件, フォック状態, コヒーレント状態, 圧縮状態を考慮して, 蓄積エネルギー, 平均充電電力, エネルギー変動量, 抽出可能な作業量など, 様々な特性について検討した。最初の状態がバッテリーの性能向上につながることを示す。しかし、同じ平均光子数を持つコヒーレントな状態は、蓄えられたエネルギーの強いゆらぎの影響を受けているとしても、特に、保存されたエネルギーを短時間でほぼ完全に取り出すことができるため、非常に興味深い性能をもたらす。 We consider a quantum battery that is based on a two-level system coupled with a cavity radiation by means of a two-photon interaction. Various figures of merit, such as stored energy, average charging power, energy fluctuations, and extractable work are investigated, considering, as possible initial conditions for the cavity, a Fock state, a coherent state, and a squeezed state. We show that the first state leads to better performances for the battery. However, a coherent state with the same average number of photons, even if it is affected by stronger fluctuations in the stored energy, results in quite interesting performance, in particular since it allows for almost completely extracting the stored energy as usable work at short enough times.	翻訳日:2023-03-30 20:07:23 公開日:2021-05-18
# 無限円筒ポテンシャル井戸における自由粒子の非相対論的シナリオ Non-Relativistic Scenario of a Free Particle in an Infinite Cylindrical Potential Well ( http://arxiv.org/abs/2105.08283v1 ) ライセンス: Link先を確認	Pratik Adarsh and Sabyasachi Ghosh	(参考訳) 基本量子力学の形式論には様々な種類の無限ポテンシャル井戸問題が存在する。無限平方井戸(1次元)、立方体箱、球面井戸は教科書では非常に一般的である。本稿では、無限の円筒形井戸である比較的珍しいポテンシャル井戸を考察し、そのエネルギー固有値と固有関数をSchr\"{o}dinger方程式を用いて探究する。また、放射波関数や密度プロットをプロットする。 There are various types of infinite potential well problems occurring in elementary quantum mechanics formalism. The infinite square well (one dimensional), cubical box and, spherical well are quite common in textbooks. In this paper, we consider a rather uncommon potential well, an infinite cylindrical well, and try to find its energy eigenvalues and eigenfunctions using Schr\"{o}dinger equation. We also plot some radial wavefunctions and density plots.	翻訳日:2023-03-30 20:06:36 公開日:2021-05-18
# エルビウムドーパントの接地におけるコヒーレント制御と光励起状態 Coherent control in the ground and optically excited state of an ensemble of erbium dopants ( http://arxiv.org/abs/2105.08487v1 ) ライセンス: Link先を確認	Pablo Cova Fari\~na, Benjamin Merkel, Natalia Herrera Valencia, Penghong Yu, Alexander Ulanowski, Andreas Reiserer	(参考訳) エルビウムドーパントのアンサンブルは、光ファイバー通信の最小損失波長帯域で動作する量子メモリと周波数変換器を実現することができる。彼らの操作は電子スピン状態の初期化、コヒーレント制御、読み出しを必要とする。本研究では、スプリットリングマイクロ波共振器を用いて、地上と光学的励起状態の両方でそのような制御を実証する。提案手法は、ドパントとホストの他の組み合わせにも適用でき、新しい量子メモリプロトコルやセンシングスキームの開発を容易にする可能性がある。 Ensembles of erbium dopants can realize quantum memories and frequency converters that operate in the minimal-loss wavelength band of fiber optical communication. Their operation requires the initialization, coherent control and readout of the electronic spin state. In this work, we use a split-ring microwave resonator to demonstrate such control in both the ground and optically excited state. The presented techniques can also be applied to other combinations of dopant and host, and may facilitate the development of new quantum memory protocols and sensing schemes.	翻訳日:2023-03-30 19:58:13 公開日:2021-05-18
# 大型ランダムアローヘッド行列 : キャビティに結合した量子スピンの多フラクタリティ,半局所化,保護輸送 Large Random Arrowhead Matrices: Multifractality, Semi-Localization, and Protected Transport in Disordered Quantum Spins Coupled to a Cavity ( http://arxiv.org/abs/2105.08444v1 ) ライセンス: Link先を確認	J\'er\^ome Dubail, Thomas Botzung, Johannes Schachenmayer, Guido Pupillo, and David Hagenm\"uller	(参考訳) 我々は、空洞モードに結合した不均一に拡張された量子エミッタの最小モデルである対角障害を持つ大きなランダムな矢印ハミルトニアンの正確な解を提供する。エネルギー間隔の分布は、ポアソン統計と半ポアソン統計に非常に近い分布(後者は通常「アンダーソン」局在化-非局在化遷移の臨界点に関連付けられている)の間で連続的に調整できる。 2つの分極子と1つの暗黒状態の連続体を含むすべての固有状態が多重フラクタルであることを示し、これは光物質結合強度の全ての値に対して重要な「半局所化」相が存在することを示す。初期地点からの脱出確率を計算した結果,有限結合強度の時間とともに線形に増大する脱出確率と,初期地点のエネルギーを選択することで脱出速度を制御することができる。乱れの配置で平均される脱出率は中間結合強度に対して最大値を示し, 集合的強結合限界("キャビティ保護"効果)よりも低い値で飽和する。意外なことに、飽和値は障害によって増加し、空洞が障害に対する輸送を保護しているだけでなく、後者を補助的改善輸送にすることも示している。その結果, 定常励磁電流は脱出確率と類似した特性を示し, キャビティ保護輸送シナリオを平衡外へ拡張することを示した。最後に,無秩序システムにおける長距離輸送に暗黒状態が寄与できることを実証する。 We provide an exact solution of large random arrowhead Hamiltonians with diagonal disorder, a minimal model for inhomogeneously broadened quantum emitters coupled to a cavity mode. We find that the distribution of energy spacing can be continuously tuned between Poisson statistics and a distribution that is very close to semi-Poisson statistics - the latter being usually associated to the critical point of "Anderson" localization-delocalization transitions. We demonstrate that all the eigenstates - including two polaritons and a continuum of dark states - are multifractal, which indicates the existence of a critical "semi-localized" phase for all values of the light-matter coupling strength, where dark states are localized over multiple, arbitrarily-distant sites. By computing the escape probability from an initial site, we find that the system has a peculiar diffusive-like behavior with an escape probability growing linearly with time for any finite coupling strength, and that the escape rate can be controlled by selecting the energy of the initial site. The escape rate averaged over the disorder configurations is found to exhibit a maximum for intermediate coupling strengths, before saturating at a lower value in the collective strong coupling limit - a "cavity protection" effect. Surprisingly, we show that the saturation value increases with the disorder, indicating that the cavity does not only protect transport against disorder but can also turn the latter into an ally improving transport. We finally investigate the system in a two-terminal configuration, and show that the steady-state excitation current exhibits similar features as the escape probability, thereby extending our cavity-protected transport scenario to out-of-equilibrium situations. We finally demonstrate that dark states can provide the major contribution to long-distance transport in disordered systems.	翻訳日:2023-03-30 19:58:03 公開日:2021-05-18
# 時間と位相調整電磁界による光電子放出 Photoelectron emission via time and phase-tailored electromagnetic fields ( http://arxiv.org/abs/2105.08435v1 ) ライセンス: Link先を確認	Jonas W\"atzel, Johannes Hahn and Jamal Berakdar	(参考訳) 光電子のエネルギーと角分布は、駆動場の時間と空間位相構造を選択することで調整可能であることが示されている。これらの結論は、レーザー場と時間非対称thzパルスと/または渦レーザーパルスとを組み合わせた原子ターゲットの単一活性電子モデル内で、波面の空間変調位相を持つ量子力学的計算から導かれる。 The energy and the angular distributions of photoelectrons are shown to be tunable by choosing the time and the spatial phase structure of the driving fields. These conclusions are derived from quantum mechanical calculations done within a single-active electron model for an atomic target subjected to a combination of laser field and a time-asymmetric THz pulse and/or vortex-laser pulse with a spatially modulated phase of the wavefront.	翻訳日:2023-03-30 19:57:28 公開日:2021-05-18
# itの柔軟性と動的能力の戦略的整合:実証的研究 Strategic alignment between IT flexibility and dynamic capabilities: an empirical investigation ( http://arxiv.org/abs/2105.08429v1 ) ライセンス: Link先を確認	Rogier van de Wetering, Patrick Mikalef and Adamantia Pateli	(参考訳) ダイナミック機能理論は企業価値創造のプロセスにおける主要なフレームワークとして登場した。その中核的な概念は、会社の資源に基づく視点の前提を補完し、現代の情報システム研究において重要な理論と管理の枠組みと考えられている。しかし、dctsの大きな貢献にもかかわらず、その強みと中心となる焦点は、本質的には歴史的な業績説明に使われている。さらに、DCTを拡張して、常に変化するIT環境や他の命令的ドライバに適合するようにするために、何人かの研究者が貴重な貢献をしている。しかし、さらなるパフォーマンス向上のための命令的なステップを導出するために、企業が現在の成熟度を統合的に評価できるdct拡張は開発されていない。本稿では,IT の柔軟性と動的能力に関する戦略的アライメントモデルの構築と,322 社の大規模データを用いた相関分析と回帰分析による仮説の実証的検証を目的とする。企業の基盤となる寸法の相乗効果をitフレキシビリティアーキテクチャと動的能力の組み合わせによって、組織は環境条件の変化に対処し、競争相手のパフォーマンスを高めることができると推測する。本研究の結果から,全次元のバランスの程度と競争力のある企業業績との間には,戦略的アライメントの程度に有意な正の相関があることが示唆された。したがって、戦略的アライメントは、常に変化する環境において企業が競争上の優位性に大きく影響を与える重要な条件と見なすことができる。提案されたフレームワークは、ITの柔軟性と動的な能力の成熟度と整合性を評価し、改善するのに役立つ。 Dynamic capabilities theory emerged as a leading framework in the process of value creation for firms. Its core notion complements the premise of the resource-based view of the firm and is considered an important theoretical and management framework in modern information systems research. However, despite DCTs significant contributions, its strength and core focus are essentially in its use for historical firm performance explanation. Furthermore, valuable contributions have been made by several researchers in order to extend the DCT to fit the constantly changing IT environments and other imperative drivers for competitive performance. However, no DCT extension has been developed which allows firms to integrally assess their current state of maturity in order to derive imperative steps for further performance enhancements. In light of empirical advancement, this paper aims to develop a strategic alignment model for IT flexibility and dynamic capabilities and empirically validates proposed hypotheses using correlation and regression analyses on a large data sample of 322 international firms. We conjecture that the combined synergetic effect of the underlying dimensions of a firms IT flexibility architecture and dynamic capabilities enables organizations to cope with changing environmental conditions and drive competitive firm performance. Findings of this study suggest that there is a significant positive relationship between the firms degree of strategic alignment defined as the degree of balance between all dimensions and competitive firm performance. Strategic alignment can, therefore, be seen as an important condition that significantly influences a firms competitive advantage in constantly changing environments. The proposed framework helps firms assess and improve their maturity and alignment of IT flexibility and dynamic capabilities.	翻訳日:2023-03-30 19:57:06 公開日:2021-05-18
# 熱原子ビームの光学キャビティへの超放射放出 Superradiant emission of a thermal atomic beam into an optical cavity ( http://arxiv.org/abs/2105.08718v1 ) ライセンス: Link先を確認	Simon B. J\"ager, Haonan Liu, John Cooper, Travis L. Nicholson, and Murray J. Holland	(参考訳) 理論的には、光学キャビティを横切る際に1つのモードに結合する原子双極子の熱線の集合ダイナミクスを理論的に解析する。この設定のために半古典モデルから導出し、超ラジアント放出の開始とその安定性を決定する。放射光の直線幅に関する解析式を導出し,それらを数値シミュレーションと比較する。さらに、定常超放射相と多成分超放射相の2つの異なる超放射相を発見し、予測する。後者の場合、集団双極子の振幅モードの安定性解析を用いて計算できる周波数スペクトルのサイドバンドを観測する。両超ラジアント相は, 自由空間自然放出と$T_2$脱落過程に対して堅牢であることを示す。 We theoretically analyze the collective dynamics of a thermal beam of atomic dipoles that couple to a single mode when traversing an optical cavity. For this setup we derive a semiclassical model and determine the onset of superradiant emission and its stability. We derive analytical expressions for the linewidth of the emitted light and compare them with numerical simulations. In addition, we find and predict two different superradiant phases; a steady-state superradiant phase and a multi-component superradiant phase. In the latter case we observe sidebands in the frequency spectrum that can be calculated using a stability analysis of the amplitude mode of the collective dipole. We show that both superradiant phases are robust against free-space spontaneous emission and $T_2$ dephasing processes.	翻訳日:2023-03-30 19:50:42 公開日:2021-05-18
# ノイズチャネル識別における量子アドバンテージ Quantum advantage for noisy channel discrimination ( http://arxiv.org/abs/2105.08707v1 ) ライセンス: Link先を確認	Zane M. Rossi, Jeffery Yu, Isaac L. Chuang, Sho Sugiura	(参考訳) 多くの量子力学実験は、既知の量子回路と未知の量子過程の間のマルチラウンド対話プロトコルと見なすことができる。未知のプロセスに対する完全量子コヒーレントなアクセスは、非コヒーレントなアクセスが許可されているときに比べて多くの識別タスクにおいて利点をもたらすことが知られているが、この利点がプロセスがうるさいときに持続するかどうかは不明である。ここでは,2つの単一量子ビット回転チャネルを区別する場合に,量子アドバンテージを維持できることを示す。数値計算と解析により,雑音強度関数としての完全コヒーレントプロトコルと完全コヒーレントプロトコルによる最適性能の差が明らかとなった。さらに、コヒーレント量子優位領域のサイズはチャネル使用数において逆多項式的に縮小し、中間状態においては、改良された戦略は完全コヒーレントかつ完全非コヒーレントなサブルーチンのハイブリッドである。完全コヒーレントプロトコルは量子信号処理に基づいており、現実的な雑音の存在下での量子優位性の研究のための一般化可能なアルゴリズムフレームワークを提案する。 Many quantum mechanical experiments can be viewed as multi-round interactive protocols between known quantum circuits and an unknown quantum process. Fully quantum "coherent" access to the unknown process is known to provide an advantage in many discrimination tasks compared to when only incoherent access is permitted, but it is unclear if this advantage persists when the process is noisy. Here, we show that a quantum advantage can be maintained when distinguishing between two noisy single qubit rotation channels. Numerical and analytical calculations reveal a distinct transition between optimal performance by fully coherent and fully incoherent protocols as a function of noise strength. Moreover, the size of the region of coherent quantum advantage shrinks inverse polynomially in the number of channel uses, and in an intermediate regime an improved strategy is a hybrid of fully-coherent and fully-incoherent subroutines. The fully coherent protocol is based on quantum signal processing, suggesting a generalizable algorithmic framework for the study of quantum advantage in the presence of realistic noise.	翻訳日:2023-03-30 19:50:31 公開日:2021-05-18
# NISQハードウェアにおける古典的量子ノイズ低減 Classical-Quantum Noise Mitigation for NISQ Hardware ( http://arxiv.org/abs/2105.08701v1 ) ライセンス: Link先を確認	Andrew Shaw	(参考訳) この研究において、グローバルホワイトノイズモデルは第一原理から証明される。 NISQハードウェアのグローバルホワイトノイズモデルへの付着は、古典的ホワイトノイズ外挿法(CLAWE)を用いてノイズ軽減に使用される。 In this work, the global white-noise model is proved from first principles. The adherence of NISQ hardware to the global white-noise model is used to perform noise mitigation using Classical White-noise Extrapolation (CLAWE).	翻訳日:2023-03-30 19:49:51 公開日:2021-05-18
# 非条件セキュアな鍵分布を示す量子リピータノード A Quantum Repeater Node Demonstrating Unconditionally Secure Key Distribution ( http://arxiv.org/abs/2105.08691v1 ) ライセンス: Link先を確認	S. Langenfeld, P. Thomas, O. Morin, and G. Rempe	(参考訳) 長距離量子通信は光ファイバーの光子損失を克服するために量子リピータを必要とする。ここでは、光空洞に2つのメモリ原子を持つリピータノードを示す。両方の原子は個別に、繰り返し光子と絡み合っており、それぞれの通信相手がそれぞれ独立に受信するまで分配される。原子のベル状態の測定に続いて古典的な通信が鍵を確立する。我々はキーレートのスケーリングの利点を実証し、有効減衰長を2倍に増やし、無条件でセキュアな通信を行う場合のエラーレート閾値11\%、リピータベースの量子ネットワークの場合のコーナーストーンを突破する。 Long-distance quantum communication requires quantum repeaters to overcome photon loss in optical fibers. Here we demonstrate a repeater node with two memory atoms in an optical cavity. Both atoms are individually and repeatedly entangled with photons that are distributed until each communication partner has independently received one of them. An atomic Bell-state measurement followed by classical communication serves to establish a key. We demonstrate scaling advantage of the key rate, increase the effective attenuation length by a factor of two, and beat the error-rate threshold of 11\% for unconditionally secure communication, the corner stones for repeater-based quantum networks.	翻訳日:2023-03-30 19:49:36 公開日:2021-05-18
# ダイヤモンド中の単一SiV$^{-}およびSnV$^{-}$中心の高忠実スピン回転のための光制御プロトコル Optical control protocols for high-fidelity spin rotations of single SiV$^{-}$ and SnV$^{-}$ centers in diamond ( http://arxiv.org/abs/2105.08594v1 ) ライセンス: Link先を確認	Evangelia Takou and Sophia E. Economou	(参考訳) ダイヤモンドのシリコン空孔とスズ空孔欠陥は、その優れた光学特性のために、NV中心の代替量子ビットとして興味がある。これらの欠陥における光遷移の可用性は、その資産の1つであるが、高忠実性光コヒーレント制御は証明されていない。ここでは,これらの欠陥に対応する新しい光制御スキームを設計する。外部磁場の有無および存在下での電子スピン量子ビットの任意の単一量子ビット回転の性能をコヒーレントな誤差と非コヒーレントな誤差の両方を考慮して評価する。 9,8.0\%$(T=4$~K)と99.71\%$(T=6$~K)を超える回転は、現実的な緩和とリークエラーの存在下でそれぞれSi-VとSn-Vに対して達成できる。 Silicon-vacancy and tin-vacancy defects in diamond are of interest as alternative qubits to the NV center due to their superior optical properties. While the availability of optical transitions in these defects is one of their assets, high-fidelity optical coherent control has not been demonstrated. Here, we design novel optical control schemes tailored to these defects. We evaluate the performance of arbitrary single-qubit rotations of the electron spin qubit both in the absence and presence of an external magnetic field, by taking into account both coherent and incoherent errors. We find that rotations in excess of $98.0\%$ ($T=4$~K) and $99.71\%$ ($T=6$~K) can be achieved for Si-V and Sn-V respectively in the presence of realistic relaxation and leakage errors.	翻訳日:2023-03-30 19:49:14 公開日:2021-05-18
# ロボティクス研究所におけるコビッドとそれ以上の時代の継続性を教える Teaching Continuity in Robotics Labs in the Age of Covid and Beyond ( http://arxiv.org/abs/2105.08839v1 ) ライセンス: Link先を確認	R. Pito Salas	(参考訳) 本論文は,コンピュータ科学分野におけるロボット工学者およびロボット工学者の育成には,実際のロボットとの広範な直接作業が必要であり,ロボット工学の学習ラボへのアクセスが制限された場合,この教育的ミッションは負の影響を受けると論じる。これはまさに、Covidパンデミックの始まりである2020年初頭にロボティクス研究所が遭遇した問題だ。論文は、遠隔/仮想ロボット工学の教育ラボの説明に変わり、それが何を意味するのか、どのような利点があるのか、どのように使用されるのかを詳細に調べる。このビジョンの一部は2020年に当社の機関で実施され、それ以来常に使用されてきた。構築された特定のアーキテクチャと実装について述べられている。この結論のエキサイティングな洞察は、パンデミックによって奨励され、引き起こされた作業は、ロボティクス教育へのアクセスを拡大し、ある機関がロボティクス教育を大規模に拡張し、コストを抑えながらこれを行う能力を高めるという、長期的利益を非常に有すると考えられる。 This paper argues that training of future Roboticists and Robotics Engineers in Computer Science departments, requires the extensive direct work with real robots, and that this educational mission will be negatively impacted when access to robotics learning laboratories is curtailed. This is exactly the problem that Robotics Labs encountered in early 2020, at the start of the Covid pandemic. The paper then turns to the description of a remote/virtual robotics teaching laboratory and examines in detail what that would mean, what the benefits would be, and how it may be used. Part of this vision was implemented at our institution during 2020 and has been in constant use since then. The specific architecture and implementation, as far as it has been built, is described. The exciting insight in the conclusion is that the work that was encouraged and triggered by a pandemic seems to have very positive longer-term benefits of increasing access to robotics education, increasing the ability of any one institution to scale their robotics education greatly, and potentially do this while reducing costs.	翻訳日:2023-03-30 19:40:59 公開日:2021-05-18
# 教育者、ソリケーター、フラマー、モチベーター、共感者:オンラインエクストリーム運動における役割の特徴 Educators, Solicitors, Flamers, Motivators, Sympathizers: Characterizing Roles in Online Extremist Movements ( http://arxiv.org/abs/2105.08827v1 ) ライセンス: Link先を確認	Shruti Phadke, Tanushree Mitra	(参考訳) ソーシャルメディアは、白人至上主義や反LGBTQのような過激派社会運動がオンラインで繁栄する手段を提供する。しかし、このような動きの参加者が果たす役割についてはほとんど分かっていない。本稿では,これらの参加者が果たす役割,役割のダイナミクス,オンライン過激主義の普及に与える影響について考察する。当社の参加者は、オンライン過激派アカウントであり、公開Facebookページ4,876人、あるいは289Southern Poverty Law Centerのウェブサイトから情報を共有しているグループです。定量的特徴のクラスタ化と質的専門家による検証により,教育者,ソリケータ,フレイラー,モチベータ,交感神経の5つの役割を同定した。例えば、ソリケーターは過激派ウェブサイトからのリンクを使って寄付を集め、過激派問題に参加する一方、フレイラーは過激派コンテンツを共有して怒りを喚起する。我々はさらに,これらの役割の安定性や,過激なアカウントがある役割から別の役割へと移行する可能性など,役割のダイナミクスについても調査する。フライヤーとモチベーションは高い確率で同調者に移行することができるが、運動、教育者、事務員にとっての役割はより安定している。さらに、教育者やソリテーターが過激なリンク投稿をトリガーするのに対して、フレイラーはフェイクニュースソースからの情報の拡散に影響を及ぼす。本研究は,過激派運動への深い関与の軌跡に様々な役割を担い,様々な反過激派介入の可能性を理解する上で有効である。本研究は, オンライン過激主義運動が参加活動を通じてどのように発展していくのか, オンライン過激主義を動員するために, どのように同盟関係を築いていくのかを理解することにつながる。 Social media provides the means by which extremist social movements, such as white supremacy and anti LGBTQ, thrive online. Yet, we know little about the roles played by the participants of such movements. In this paper, we investigate these participants to characterize their roles, their role dynamics, and their influence in spreading online extremism. Our participants, online extremist accounts, are 4,876 public Facebook pages or groups that have shared information from the websites of 289 Southern Poverty Law Center designated extremist groups. By clustering the quantitative features followed by qualitative expert validation, we identify five roles surrounding extremist activism: educators, solicitors, flamers, motivators, sympathizers. For example, solicitors use links from extremist websites to attract donations and participation in extremist issues, whereas flamers share inflammatory extremist content inciting anger. We further investigate role dynamics such as, how stable these roles are over time and how likely will extremist accounts transition from one role into another. We find that roles core to the movement, educators and solicitors, are more stable, while flamers and motivators can transition to sympathizers with high probability. We further find that educators and solicitors exert the most influence in triggering extremist link posts, whereas flamers are influential in triggering the spread of information from fake news sources. Our results help in situating various roles on the trajectory of deeper engagement into the extremist movements and understanding the potential effect of various counter extremism interventions. Our findings have implications for understanding how online extremist movements flourish through participatory activism and how they gain a spectrum of allies for mobilizing extremism online.	翻訳日:2023-03-30 19:39:43 公開日:2021-05-18
# オープンデータを用いた世界中の歩行者のアクセシビリティ測定のための汎用フレームワーク A Generalized Framework for Measuring Pedestrian Accessibility around the World Using Open Data ( http://arxiv.org/abs/2105.08814v1 ) ライセンス: Link先を確認	Shiqin Liu, Carl Higgs, Jonathan Arundel, Geoff Boeing, Nicholas Cerdera, David Moctezuma, Ester Cerin, Deepti Adlakha, Melanie Lowe, and Billie Giles-Corti	(参考訳) 歩行者のアクセシビリティは、都市交通と土地利用政策の重要な要素であり、健康で持続可能な都市を作る上で重要である。歩行者のアクセシビリティの不平等を測定する指標の開発と評価は、都市計画者や政策立案者が都市計画介入の進捗をベンチマークし、監視するのに役立つ。しかし,都市設計と都市比較を可能にするために,都市設計と交通特性の指標を世界規模で測定・評価することは,公的な,高品質,同等の空間データや,インジケータの構築と分析のためのカスタマイズ可能なフレームワークを提供する空間分析ツールが限られているため,課題である。これらの課題に対処するため,オープンで一貫したデータを用いた歩行者アクセシビリティ指標を構築するための,オープンソースのソフトウェアフレームワークを開発した。歩行者のアクセシビリティを高分解能・空間的集約スケールで一貫して測定し,都市内・都市間分析を可能にした。本研究で開発されたオープンソースおよびオープンデータ手法は,地域計画と政策立案を支援するため,世界中の他の都市に拡張することができる。ソフトウェアは、オープンリポジトリで再利用するために公開されています。 Pedestrian accessibility is an important factor in urban transport and land use policy and critical for creating healthy, sustainable cities. Developing and evaluating indicators measuring inequalities in pedestrian accessibility can help planners and policymakers benchmark and monitor the progress of city planning interventions. However, measuring and assessing indicators of urban design and transport features at high resolution worldwide to enable city comparisons is challenging due to limited availability of official, high quality, and comparable spatial data, as well as spatial analysis tools offering customizable frameworks for indicator construction and analysis. To address these challenges, this study develops an open source software framework to construct pedestrian accessibility indicators for cities using open and consistent data. It presents a generalized method to consistently measure pedestrian accessibility at high resolution and spatially aggregated scale, to allow for both within- and between-city analyses. The open source and open data methods developed in this study can be extended to other cities worldwide to support local planning and policymaking. The software is made publicly available for reuse in an open repository.	翻訳日:2023-03-30 19:39:08 公開日:2021-05-18
# エルミート行列集合における正の写像によって引き起こされる前順序のキャラクタリゼーション Characterization of preorders induced by positive maps in the set of Hermitian matrices ( http://arxiv.org/abs/2105.08778v1 ) ライセンス: Link先を確認	Julio I. de Vicente	(参考訳) Uhlmann は、エルミート行列 $A$ を別の$B$ に変換する正の、ユニタリかつトレース保存写像が存在することを示した。この本では、ユニタリティーまたはトレース保存の条件の1つを落として、そのような変換の存在を特徴づける。このことはエルミート行列の集合において2つの可能な事前順序を導き出し、任意のエルミート行列の正半定値の欠如と関連する単調性の性質を定量化する尺度を構築するためにどのように使用できるか論じる。 2つの形式主義のそれぞれの尺度は本質的に一意であることが判明した。 Uhlmann showed that there exists a positive, unital and trace-preserving map transforming a Hermitian matrix $A$ into another $B$ if and only if the vector of eigenvalues of $A$ majorizes that of $B$. In this work I characterize the existence of such a transformation when one of the conditions of unitality or trace preservation is dropped. This induces two possible preorders in the set of Hermitian matrices and I argue how this can be used to construct measures quantifying the lack of positive semidefiniteness of any given Hermitian matrix with relevant monotonicity properties. It turns out that the measures in each of the two formalisms are essentially unique.	翻訳日:2023-03-30 19:38:46 公開日:2021-05-18
# Holstein-Tavis-Cummingsモデルに基づく有機分子の集合効果 Collective Effects of Organic Molecules based on Holstein-Tavis-Cummings Model ( http://arxiv.org/abs/2105.08775v1 ) ライセンス: Link先を確認	Quansheng Zhang and Ke Zhang	(参考訳) 本研究では,ホルシュタイン・タヴィス・カミングスモデルに基づく光学キャビティに閉じ込められた有機分子の集合効果について検討した。量子ランジュバン法を用いて, 振動運動の自由度を分離的に除去することにより, 空洞透過スペクトルの表現を解析的に求め, 分極状態の特徴を解析する。応用として, 超低温分子の検出において, 下方偏光状態の周波数シフトが分子数に依存していることが示される。また、蛍光スペクトルを数値解析する。スペクトルプロファイルの様々な分子数の変動は、分子配座の修飾のためのシグネチャを与える。 We study the collective effects of an ensemble of organic molecules confined in an optical cavity based on Holstein-Tavis-Cummings model. By using the quantum Langevin approach and adiabatically eliminating the degree of freedom of the vibrational motion, we analytically obtain the expression of the cavity transmission spectrum to analyze the features of polaritonic states. As an application, we show that the dependence for the frequency shift of the lower polaritonic state on the number of molecules can be used in the detection of the ultra-cold molecules. We also numerically analyze the fluorescence spectrum. The variation of the spectral profile with various numbers of molecules gives signatures for the modification of molecular conformation.	翻訳日:2023-03-30 19:38:30 公開日:2021-05-18
# 連続可変量子鍵分布の正準攻撃に対する安全性 Security of continuous-variable quantum key distribution against canonical attacks ( http://arxiv.org/abs/2105.08774v1 ) ライセンス: Link先を確認	Panagiotis Papanastasiou, Carlo Ottaviani, and Stefano Pirandola	(参考訳) そこで,本研究では,ガウス的攻撃であるガウス的攻撃の存在下でのガウス的コヒーレント状態コヒーレント状態qkdプロトコルの性能について検討する。我々は漸近的なキーレートを示し、最近開発されたツールボックスを用いて、その結果を有限サイズに拡張する。 We investigate the performance of Gaussianmodulated coherent-state QKD protocols in the presence of canonical attacks, which are collective Gaussian attacks resulting in Gaussian channels described by one of the possible canonical forms. We present asymptotic key rates and then we extend the results to the finite-size regime using a recently-developed toolbox for composable security.	翻訳日:2023-03-30 19:38:22 公開日:2021-05-18
# 高品質ダイヤモンド充填オープンキャビティ High quality-factor diamond-confined open microcavity ( http://arxiv.org/abs/2105.08736v1 ) ライセンス: Link先を確認	Sigurd Fl{\aa}gan, Daniel Riedel, Alisa Javadi, Tomasz Jakubczyk, Patrick Maletinsky and Richard J. Warburton	(参考訳) 高いコヒーレントで光学的に対応可能な電子スピンを持つダイヤモンドの窒素空隙(nv)中心は、量子ネットワークにおけるノードの有望な候補である。しかし、NV中心は、長い放射寿命、ゼロフォノンライン(ZPL)への小さな分岐比、高指数ホスト材料からの抽出効率の低下によるコヒーレントな単一光子の供給源である。原則として、これら3つの欠点は、共振結合によって光学キャビティの単一モードに対処できる。空洞電磁力学の弱い結合状態を利用して、ZPLと単一空洞モードとの共鳴結合は、ZPLへの遷移速度と分岐比を高める。さらに、キャビティは光を明確に定義されたモードに流し込み、外部光学による検出を容易にする。本稿では,真空中での電界がダイヤモンド膜に強く拘束された状態での単一結晶ダイヤモンド膜を含むファブリ・ペロのマイクロキャビティ幾何構造を提案する。ダイヤモンド-空気界面には電界反ノードがある。表面の損失はあったものの, 品質因子が120,000ドルを超え, 微粒な$\mathcal{F}=11\,500$が観察された。異なる損失機構間の相互作用と、これらの損失チャネルがキャビティの性能に与える影響について検討する。この分析から,"waviness"(マイクロキャビティモードに匹敵する空間周波数の粗さ)は品質因子がさらに高い値に達するのを防ぐメカニズムであることが示唆された。最後に, 抽出した空洞パラメータをNV中心に適用し, 150を超える予測されたPurcell因子を算出する。 With a highly coherent, optically addressable electron spin, the nitrogen vacancy (NV) centre in diamond is a promising candidate for a node in a quantum network. However, the NV centre is a poor source of coherent single photons owing to a long radiative lifetime, a small branching ratio into the zero-phonon line (ZPL) and a poor extraction efficiency out of the high-index host material. In principle, these three shortcomings can be addressed by resonant coupling to a single mode of an optical cavity. Utilising the weak-coupling regime of cavity electrodynamics, resonant coupling between the ZPL and a single cavity-mode enhances the transition rate and branching ratio into the ZPL. Furthermore, the cavity channels the light into a well-defined mode thereby facilitating detection with external optics. Here, we present an open Fabry-Perot microcavity geometry containing a single-crystal diamond membrane, which operates in a regime where the vacuum electric field is strongly confined to the diamond membrane. There is a field anti-node at the diamond-air interface. Despite the presence of surface losses, quality factors exceeding $120\,000$ and a finesse $\mathcal{F}=11\,500$ were observed. We investigate the interplay between different loss mechanisms, and the impact these loss channels have on the performance of the cavity. This analysis suggests that the "waviness" (roughness with a spatial frequency comparable to that of the microcavity mode) is the mechanism preventing the quality factors from reaching even higher values. Finally, we apply the extracted cavity parameters to the NV centre and calculate a predicted Purcell factor exceeding 150.	翻訳日:2023-03-30 19:38:14 公開日:2021-05-18
# インド株取引自動化への深層強化学習の適用 Application of deep reinforcement learning for Indian stock trading automation ( http://arxiv.org/abs/2106.16088v1 ) ライセンス: Link先を確認	Supriya Bajpai	(参考訳) 株式取引において、特徴抽出と取引戦略設計は、機械学習技術を用いて長期的な利益を達成するための2つの重要なタスクである。報酬を最大化するために取引信号を取得することで取引戦略を設計するいくつかの方法が提案されている。本稿では,インド市場における株式取引戦略と投資決定に深層強化学習の理論を適用した。実験は、古典的な3つの深層強化学習モデル、深層Qネットワーク、深層Qネットワーク、深層Qネットワークを10のインドストックデータセット上で体系的に実施する。モデルの性能を評価し、比較する。 In stock trading, feature extraction and trading strategy design are the two important tasks to achieve long-term benefits using machine learning techniques. Several methods have been proposed to design trading strategy by acquiring trading signals to maximize the rewards. In the present paper the theory of deep reinforcement learning is applied for stock trading strategy and investment decisions to Indian markets. The experiments are performed systematically with three classical Deep Reinforcement Learning models Deep Q-Network, Double Deep Q-Network and Dueling Double Deep Q-Network on ten Indian stock datasets. The performance of the models are evaluated and comparison is made.	翻訳日:2023-03-30 19:30:46 公開日:2021-05-18
# 動的エンタープライズアーキテクチャ機能と組織的メリット--経験的仲介研究 Dynamic enterprise architecture capabilities and organizational benefits: an empirical mediation study ( http://arxiv.org/abs/2105.10036v1 ) ライセンス: Link先を確認	Rogier van de Wetering	(参考訳) 近年、文献はエンタープライズアーキテクチャ、EA、研究の文脈における理論構築に重点を置いている。特に、学者は、戦略的な目標と技術の使用を一致させるために組織固有のリソースを組織化し、展開するeaベースの能力に焦点をあてる傾向がある。 EA研究の進展にもかかわらず、文献にはかなりのギャップが残っている。最も注目すべきギャップは、EAベースの能力の概念化は、理論上まだしっかりとした基盤を欠いていることと、EAベースの能力がビジネス変革を促進し、会社に利益をもたらす方法に関する決定的な証拠がないことである。そこで本研究では, EA ベースの機能に着目し, 動的機能ビューを理論的基盤として利用し, 動的エンタープライズアーキテクチャ能力が組織的メリットにどのように寄与するかを説明する新しい研究モデルを開発し, テストする。研究モデルに関連する仮説は、299人のCIO、ITマネージャ、リードアーキテクトからの回答を含むデータセットを使用してテストされる。結果として、動的エンタープライズアーキテクチャ機能は、企業のプロセス革新とビジネス/ITアライメントに肯定的な影響を与えます。これらの仲介力はどちらも、組織の利益に肯定的に結びついています。本研究は、組織に利益をもたらすために、動的エンタープライズアーキテクチャの能力を効果的に脱線させる方法についての理解を深める。 In recent years the literature has put a greater emphasis on theory building in the context of Enterprise Architecture, EA, research. Specifically, scholars tend to focus on EA-based capabilities that organize and deploy organization-specific resources to align strategic objectives with the particular use of technology. Despite the growth in EA studies, substantial gaps remain in the literature. The most noteworthy gaps are that the conceptualization of EA-based capabilities still lacks a firm base in theory and that there is no conclusive evidence on how EA-based capabilities drive business transformation and deliver benefits to the firm. Therefore, this study focuses on EA-based capabilities, using the dynamic capabilities view as a theoretical foundation, develops and tests a new research model that explains how dynamic enterprise architecture capabilities lead to organizational benefits. Hypotheses associated with the research model are tested using a dataset that contains responses from 299 CIOs, IT managers, and lead architects. Results show that dynamic enterprise architecture capabilities positively influence the firm's process innovation and business/IT alignment. These mediating forces are both positively associated with organizational benefits. This study advances our understanding of how to efficaciously de-lineate dynamic enterprise architecture capabilities in delivering benefits to the organization.	翻訳日:2023-03-30 19:29:39 公開日:2021-05-18
# Informatiekunde -- Curriculum 2003 Informatiekunde -- Curriculum 2003 ( http://arxiv.org/abs/2105.09311v1 ) ライセンス: Link先を確認	V. Kamphuis and H. A. Proper	(参考訳) 本書はニジェーゲン情報情報学研究所(NIII)の業務情報学プログラムのカリキュラムについて論じる。 2003年(平成15年)から適用されるカリキュラムの構造に関する「レポジトリ」を提供することが目的である。過去3年間、情報科学の分野としてのイメージは、国家レベルとニメゲンレベルの両方でさらに強調されてきた。 2003年のカリキュラムは、この包括化の結果であり、また、情報科学のトレーニングで現在NIII内に構築されている3年間の経験の成果である。この文書では、2000年、2001年、2002年の既存の「スタートアップ」カリキュラムからの「移行」にも明確な注意が払われる。ここで注意すべきは、コホート2000の学生が原則としてこのプログラムの学士課程を完了したことである(2003年)。専門は情報科学教育。そのため、この集団には具体的な「移住」は必要ない。 This document discusses the curriculum of the business informatics program of the Nijmegen Institute for Informatics and business informatics (NIII). The aim is to provide a 'repository' with regard to the structure of the curriculum, which will apply from 2003. In the past three years, the image of information science as a discipline has been further concretised at both national and Nijmegen level. Curriculum 2003 is on the one hand the result of this concretization and on the other hand of the three years of experience that has now been built up within the NIII with the information science training. In this document, therefore, explicit attention will also be paid to the 'migration' from the existing 'start-up' curricula: 2000, 2001 and 2002. It should be noted here that the students of cohort 2000 will in principle have completed the bachelor's phase of the program this year (2003). Complete information science education. No specific `migration 'is therefore necessary for this group of students.	翻訳日:2023-03-30 19:29:19 公開日:2021-05-18
# Informatiekunde -- Visie 2003 Informatiekunde -- Visie 2003 ( http://arxiv.org/abs/2105.09310v1 ) ライセンス: Link先を確認	V. Kamphuis and H. A. Proper	(参考訳) この文書は、ニジェーゲン情報科学研究所(NIII)のビジネス情報学カリキュラムと研究プログラム(Informatiekunde)の基礎となるビジョンを(オランダ語で)論じている。この文書の最終的な目的は、これらのビジョンに関する「リポジトリ」と、プログラムのカリキュラムと研究計画の具体的な構造の基礎を提供することである。ビジネス情報学は、NIIIにおける教育と研究のための比較的新しい分野であるため、この文書の現在の(2003年)バージョンは、主に教育的視点に焦点を当てている。今後数年間で、この文書のアップデートがビジネスインフォマティクスの研究にさらに注目されるようになると期待されている。しかし、この文書が毎年更新可能であるという事実は、もちろん年次変更を期待するという意味ではない。この文書に記載されているものの安定性に関する野望は5年から6年である。現在のバージョンでは、これは特に情報科学研究プログラムのビジョンに当てはまる。この文書の研究部分は、今後数年でさらに具体化されていく必要があるだろう。 This document discusses (in Dutch) the vision underlying the business informatics (Informatiekunde) curriculum and research programme at the Nijmegen Institute for Informatics and Information Science (NIII). The ultimate aim of this document is to provide a 'repository' with regard to these visions, and a basis for the specific structure of the program's curriculum and research plans. Since business informatics is a relatively new area for teaching and research at NIII, the current (2003) version of this document primarily focuses on the educational perspective. It is to be expected that in the coming years, updates to this document will also pay more attention to business informatics research. The fact that this document can be updated annually does not mean, however, that we expect an annual change of course. The ambition with regard to the stability of what is described in this document is 5 to 6 years. In the current version, this applies specifically to the vision of the information science study program. The research part of this document will have to be fleshed out even more specifically in the coming years.	翻訳日:2023-03-30 19:29:08 公開日:2021-05-18
# 確率と滑らかさを誤認したガウス過程平均の収束保証 Convergence Guarantees for Gaussian Process Means With Misspecified Likelihoods and Smoothness ( http://arxiv.org/abs/2001.10818v3 ) ライセンス: Link先を確認	George Wynne, Fran\c{c}ois-Xavier Briol and Mark Girolami	(参考訳) ガウス過程は機械学習、統計学、応用数学においてユビキタスである。関数を近似するための柔軟なモデリングフレームワークを提供し、同時に不確実性を定量化する。しかし、これはモデルが十分に特定されている場合にのみ当てはまるが、実際にはそうではないことが多い。本稿では,モデルの滑らかさと可能性関数が不明確である場合に,ガウス過程の性質について検討する。この設定において、実践的関連性に関する重要な理論的疑問は、ガウス過程の近似が問題の難しさ、我々のモデル、そして誤特定の程度をどの程度正確にするかである。モデルと実験設計の選択を知らせてくれるので、この問題に対する答えは特に有用です。特に,カーネルとカーネルのハイパーパラメータの実験的設計と選択が,モデルの誤特定を軽減するためにどのように適応できるかを述べる。 Gaussian processes are ubiquitous in machine learning, statistics, and applied mathematics. They provide a flexible modelling framework for approximating functions, whilst simultaneously quantifying uncertainty. However, this is only true when the model is well-specified, which is often not the case in practice. In this paper, we study the properties of Gaussian process means when the smoothness of the model and the likelihood function are misspecified. In this setting, an important theoretical question of practial relevance is how accurate the Gaussian process approximations will be given the difficulty of the problem, our model and the extent of the misspecification. The answer to this problem is particularly useful since it can inform our choice of model and experimental design. In particular, we describe how the experimental design and choice of kernel and kernel hyperparameters can be adapted to alleviate model misspecification.	翻訳日:2023-01-05 21:12:43 公開日:2021-05-18
# ポテンシャルインスタンスの推論によるツリーアンサンブルの検証 Verifying Tree Ensembles by Reasoning about Potential Instances ( http://arxiv.org/abs/2001.11905v3 ) ライセンス: Link先を確認	Laurens Devos, Wannes Meert, Jesse Davis	(参考訳) 特定の属性がモデルの予測に不釣り合いな影響を与えているか"、"部分的に説明された例に対して、どのような予測ができるのか"のようなブラックボックスモデルに質問することができると想像してください。この最後の質問は、部分的な記述がデータ内の観察された例に対応していない場合、特に重要である。これらの機能は、特に堅牢性、公平性、バイアスといった問題に関連して、ユーザがモデルの振る舞いをよりよく理解できるようにするため、非常に役に立ちます。本稿では,木々のアンサンブルに対してこのようなアプローチを提案する。この課題は一般に難易度が高いため,(1) 課題の簡略化を問う質問に対して入力空間の一部を抽出し,(2) 段階的かつ常に回答を返却し,入力領域のどの部分がまだ不確実かを示す分割・征服的アプローチに従うという戦略を提示する。このアプローチの有用性は、さまざまなユースケースで示されています。 Imagine being able to ask questions to a black box model such as "Which adversarial examples exist?", "Does a specific attribute have a disproportionate effect on the model's prediction?" or "What kind of predictions could possibly be made for a partially described example?" This last question is particularly important if your partial description does not correspond to any observed example in your data, as it provides insight into how the model will extrapolate to unseen data. These capabilities would be extremely helpful as they would allow a user to better understand the model's behavior, particularly as it relates to issues such as robustness, fairness, and bias. In this paper, we propose such an approach for an ensemble of trees. Since, in general, this task is intractable we present a strategy that (1) can prune part of the input space given the question asked to simplify the problem; and (2) follows a divide and conquer approach that is incremental and can always return some answers and indicates which parts of the input domains are still uncertain. The usefulness of our approach is shown on a diverse set of use cases.	翻訳日:2023-01-05 05:47:17 公開日:2021-05-18
# 高速近似固有空間の構築と高速グラフフーリエ変換への応用 Constructing fast approximate eigenspaces with application to the fast graph Fourier transforms ( http://arxiv.org/abs/2002.09723v3 ) ライセンス: Link先を確認	Cristian Rusu and Lorenzo Rosasco	(参考訳) 対称行列および一般行列に付随する固有空間の数値的効率的な近似について検討する。固有空間は、効率よく操作できる基本成分の固定個数に分解される(拡張直交の命題やスケーリングやせん断変換を考える)。これらの成分の数は、近似精度と固有空間上の投影の計算複雑性の間のトレードオフを制御する。単一基本成分の最小化問題を書き、閉形式解を提供する。次に,これらすべてのコンポーネントを収束するまで反復的に更新するアルゴリズムを提案する。ランダム行列に関する結果と、有向グラフおよび無向グラフに対するグラフフーリエ変換の近似への応用を示す。 We investigate numerically efficient approximations of eigenspaces associated to symmetric and general matrices. The eigenspaces are factored into a fixed number of fundamental components that can be efficiently manipulated (we consider extended orthogonal Givens or scaling and shear transformations). The number of these components controls the trade-off between approximation accuracy and the computational complexity of projecting on the eigenspaces. We write minimization problems for the single fundamental components and provide closed-form solutions. Then we propose algorithms that iterative update all these components until convergence. We show results on random matrices and an application on the approximation of graph Fourier transforms for directed and undirected graphs.	翻訳日:2022-12-29 19:19:35 公開日:2021-05-18
# ゲノムデータセットの高次元特徴選択 High-Dimensional Feature Selection for Genomic Datasets ( http://arxiv.org/abs/2002.12104v2 ) ライセンス: Link先を確認	Majid Afshar, Hamid Usefi	(参考訳) 機械学習とパターン認識の中心的な問題は、最も重要な特徴を認識するプロセスである。本稿では,まず無関係な特徴を取り除き,残りの特徴間の相関を検出する新しい特徴選択法(drpt)を提案する。 $d=[a\mid \mathbf{b}]$をデータセットとし、$\mathbf{b}$をクラスラベルとし、$a$を列を特徴とする行列とする。我々は最小二乗法と$A$の擬逆法を用いて$A\mathbf{x} = \mathbf{b}$を解く。各々の$\mathbf{x}$の成分は対応する列(機能)に割り当てられた重みと見なすことができる。我々は$\mathbf{x}$の局所最大値に基づいてしきい値を定義し、しきい値よりも重みが小さい特徴を除去する。還元行列の相関を検出するために、我々はまだ$a$と呼ぶが、摂動$\tilde a$ を$a$とする。相関は$\delta\mathbf{x}=\mid \mathbf{x} -\tilde{\mathbf{x}}\mid $, ここで $\tilde{\mathbf{x}}$ は$\tilde a\tilde{\mathbf{x}}=\mathbf{b}$ の最小四分法である。まず、$\delta\mathbf{x}$に基づいて機能をクラスタし、次に機能のエントロピーを使用します。最後に、その重みとエントロピーに基づいて各サブクラスタから特徴を選択する。 drptの有効性は、9,117から267,604までの10の遺伝子データセットに対して7つの最先端特徴選択法との比較を行い検証した。その結果,各特徴選択アルゴリズムと比較して,DRPTの性能はいくつかの面で好ましいことがわかった。 \e である。 A central problem in machine learning and pattern recognition is the process of recognizing the most important features. In this paper, we provide a new feature selection method (DRPT) that consists of first removing the irrelevant features and then detecting correlations between the remaining features. Let $D=[A\mid \mathbf{b}]$ be a dataset, where $\mathbf{b}$ is the class label and $A$ is a matrix whose columns are the features. We solve $A\mathbf{x} = \mathbf{b}$ using the least squares method and the pseudo-inverse of $A$. Each component of $\mathbf{x}$ can be viewed as an assigned weight to the corresponding column (feature). We define a threshold based on the local maxima of $\mathbf{x}$ and remove those features whose weights are smaller than the threshold. To detect the correlations in the reduced matrix, which we still call $A$, we consider a perturbation $\tilde A$ of $A$. We prove that correlations are encoded in $\Delta\mathbf{x}=\mid \mathbf{x} -\tilde{\mathbf{x}}\mid $, where $\tilde{\mathbf{x}}$ is the least quares solution of $\tilde A\tilde{\mathbf{x}}=\mathbf{b}$. We cluster features first based on $\Delta\mathbf{x}$ and then using the entropy of features. Finally, a feature is selected from each sub-cluster based on its weight and entropy. The effectiveness of DRPT has been verified by performing a series of comparisons with seven state-of-the-art feature selection methods over ten genetic datasets ranging up from 9,117 to 267,604 features. The results show that, over all, the performance of DRPT is favorable in several aspects compared to each feature selection algorithm. \e	翻訳日:2022-12-28 08:04:40 公開日:2021-05-18
# ENTMOOT: ツリーモデルのアンサンブルを最適化するフレームワーク ENTMOOT: A Framework for Optimization over Ensemble Tree Models ( http://arxiv.org/abs/2003.04774v3 ) ライセンス: Link先を確認	Alexander Thebelt, Jan Kronqvist, Miten Mistry, Robert M. Lee, Nathan Sudermann-Merx, Ruth Misener	(参考訳) グラディエント強化木やその他の回帰木モデルは、幅広い実世界の産業用途でよく機能する。これらの木モデル (i)重要な予測機能についての洞察を提供する。 (ii)スパースデータを効果的に管理し、 (iii)優れた予測能力を有する。その利点にもかかわらず、彼らは一般的に意思決定タスクやブラックボックス最適化に不人気であり、それは構造を最適化するのが困難であり、信頼性の高い不確実性尺度が欠如しているからである。 ENTMOOTは、(既に訓練済みの)ツリーモデルをより大きな最適化問題に統合するための新しいフレームワークです。 ENTMOOTの貢献は以下のとおりである。 (i)木モデルと互換性のある信頼性の高い不確実性尺度を明示的に導入すること。 (ii)これらの不確実性認識木モデルを取り入れたより大きな最適化問題を解くこと。 (iii) 解がグローバルに最適であることを証明すること、すなわち、より良い解は存在しないこと。特に、entmootアプローチによって、木モデルの意思決定とブラックボックス最適化へのシンプルな統合が可能になり、一般的に使用されているフレームワークとの強力な競合であることが証明される。 Gradient boosted trees and other regression tree models perform well in a wide range of real-world, industrial applications. These tree models (i) offer insight into important prediction features, (ii) effectively manage sparse data, and (iii) have excellent prediction capabilities. Despite their advantages, they are generally unpopular for decision-making tasks and black-box optimization, which is due to their difficult-to optimize structure and the lack of a reliable uncertainty measure. ENTMOOT is our new framework for integrating (already trained) tree models into larger optimization problems. The contributions of ENTMOOT include: (i) explicitly introducing a reliable uncertainty measure that is compatible with tree models, (ii) solving the larger optimization problems that incorporate these uncertainty aware tree models, (iii) proving that the solutions are globally optimal, i.e. no better solution exists. In particular, we show how the ENTMOOT approach allows a simple integration of tree models into decision-making and black-box optimization, where it proves as a strong competitor to commonly-used frameworks.	翻訳日:2022-12-24 20:16:17 公開日:2021-05-18
# 広帯域におけるCNNの従来未確認スケールへの一般化能力の探索 Exploring the ability of CNNs to generalise to previously unseen scales over wide scale ranges ( http://arxiv.org/abs/2004.01536v7 ) ライセンス: Link先を確認	Ylva Jansson and Tony Lindeberg	(参考訳) 大規模なバリエーションを扱う能力は多くの現実世界の視覚的タスクにとって不可欠である。ディープネットワークにおけるスケールを扱うための簡単なアプローチは、一連のスケールチャネルで複数のスケールで画像を同時に処理することだ。スケール不変性は、原則として、スケールチャネル間の重量共有と、スケールチャネルからの出力を最大または平均的にプールすることで達成できる。このようなスケールチャネルネットワークが、重要なスケール範囲のトレーニングセットに存在しないスケールに一般化する能力は、これまで検討されていなかった。そこで、我々は、スケールチャネルネットワークの不変性と共分散特性の理論解析を行い、これまで見られなかったスケールに一般化する様々な種類のスケールチャネルネットワークの能力を実験的に評価する。我々は,従来のアプローチの限界を識別し,より解像度の低い画像のより大きな部分をスケールチャネルが処理する,新たなタイプのスケールチャネルアーキテクチャを提案する。提案するFovMaxとFovAvgのネットワークは,1スケールのトレーニングデータを用いたトレーニングにおいても,ほぼ同一のスケールで動作し,また,小型のサンプルシステムでも改善が期待できる。 The ability to handle large scale variations is crucial for many real world visual tasks. A straightforward approach for handling scale in a deep network is to process an image at several scales simultaneously in a set of scale channels. Scale invariance can then, in principle, be achieved by using weight sharing between the scale channels together with max or average pooling over the outputs from the scale channels. The ability of such scale channel networks to generalise to scales not present in the training set over significant scale ranges has, however, not previously been explored. We, therefore, present a theoretical analysis of invariance and covariance properties of scale channel networks and perform an experimental evaluation of the ability of different types of scale channel networks to generalise to previously unseen scales. We identify limitations of previous approaches and propose a new type of foveated scale channel architecture, where the scale channels process increasingly larger parts of the image with decreasing resolution. Our proposed FovMax and FovAvg networks perform almost identically over a scale range of 8, also when training on single scale training data, and do also give improvements in the small sample regime.	翻訳日:2022-12-17 04:47:31 公開日:2021-05-18
# Byzantine-Robust Client Weighting によるフェデレーションラーニング Towards Federated Learning With Byzantine-Robust Client Weighting ( http://arxiv.org/abs/2004.04986v2 ) ライセンス: Link先を確認	Amit Portnoy, Yoav Tirosh, and Danny Hendler	(参考訳) フェデレーション学習(federated learning, fl)は、中央サーバが協調する計算プロセスにおいて、モデルを協調的にトレーニングするクライアント間でデータを分散する分散機械学習パラダイムである。保有するデータインスタンスの割合に基づいて各クライアントに重みを割り当てることにより、正確なジョイントモデルへの収束率を大幅に向上させることができる。以前のいくつかの作品は、一部のクライアントがモデルに関する任意の、あるいは悪意のある情報を送信できるビザンチンの設定でflを研究した。しかし、これらの作業はデータアンバランスの問題を完全に無視するか、クライアントの重みがサーバに周知されていると仮定するかのいずれかであり、実際には、重みはクライアント自身によってサーバに報告され、従って信頼できない可能性がある。そこで本研究では, 実用的重み関係に基づく前処理法を提案し, モデル品質とビザンチンのロバスト性とのバランスが良好であることを実証的に示す。また,本手法をランダムに選択したクライアントウェイトのサンプルに適用できることを解析的に確立した。 Federated Learning (FL) is a distributed machine learning paradigm where data is distributed among clients who collaboratively train a model in a computation process coordinated by a central server. By assigning a weight to each client based on the proportion of data instances it possesses, the rate of convergence to an accurate joint model can be greatly accelerated. Some previous works studied FL in a Byzantine setting, in which a fraction of the clients may send arbitrary or even malicious information regarding their model. However, these works either ignore the issue of data unbalancedness altogether or assume that client weights are apriori known to the server, whereas, in practice, it is likely that weights will be reported to the server by the clients themselves and therefore cannot be relied upon. We address this issue for the first time by proposing a practical weight-truncation-based preprocessing method and demonstrating empirically that it is able to strike a good balance between model quality and Byzantine robustness. We also establish analytically that our method can be applied to a randomly selected sample of client weights.	翻訳日:2022-12-14 20:36:43 公開日:2021-05-18
# 空間変圧器ネットワークが不変性をサポートしていない場合の理解とその対策 Understanding when spatial transformer networks do not support invariance, and what to do about it ( http://arxiv.org/abs/2004.11678v5 ) ライセンス: Link先を確認	Lukas Finnveden, Ylva Jansson and Tony Lindeberg	(参考訳) 空間トランスフォーマーネットワーク(STN)は、畳み込みニューラルネットワーク(CNN)が画像変換に不変性を学習できるように設計された。 STNはもともとCNNの特徴マップと入力画像の変換のために提案されていた。これにより、変換パラメータを予測する際に、より複雑な機能の使用が可能になる。しかし、STNは純粋に空間変換を行うため、一般的な場合、変換された画像の特徴写像を元のものと整列する能力を持たない。したがって、STNはCNN特徴写像を変換する際に不変性をサポートできない。そこで本研究では,この問題に対する簡単な証明と実用的意義について検討し,分類精度の低下と組み合わせることを提案する。そこで我々は,複雑な特徴を利用する代替STNアーキテクチャについて検討する。また,より深い局所化ネットワークは訓練が難しいが,分類ネットワークとパラメータを共有するローカライズネットワークは,より深く成長するにつれて安定し,困難なデータセットの分類精度が向上することがわかった。最後に,ローカライズネットワークの複雑さと反復画像アライメントの相互作用について検討する。 Spatial transformer networks (STNs) were designed to enable convolutional neural networks (CNNs) to learn invariance to image transformations. STNs were originally proposed to transform CNN feature maps as well as input images. This enables the use of more complex features when predicting transformation parameters. However, since STNs perform a purely spatial transformation, they do not, in the general case, have the ability to align the feature maps of a transformed image with those of its original. STNs are therefore unable to support invariance when transforming CNN feature maps. We present a simple proof for this and study the practical implications, showing that this inability is coupled with decreased classification accuracy. We therefore investigate alternative STN architectures that make use of complex features. We find that while deeper localization networks are difficult to train, localization networks that share parameters with the classification network remain stable as they grow deeper, which allows for higher classification accuracy on difficult datasets. Finally, we explore the interaction between localization network complexity and iterative image alignment.	翻訳日:2022-12-10 03:52:24 公開日:2021-05-18
# 検索エンジンとの会話:serpベースの会話応答生成 Conversations with Search Engines: SERP-based Conversational Response Generation ( http://arxiv.org/abs/2004.14162v2 ) ライセンス: Link先を確認	Pengjie Ren, Zhumin Chen, Zhaochun Ren, Evangelos Kanoulas, Christof Monz, and Maarten de Rijke	(参考訳) 本稿では,ユーザが自然言語でクエリを表現できるという意味で,検索エンジンと会話することで複雑な情報要求に答える問題に対処し,短いシステム応答から必要な情報を会話形式で直接受信する。近年、会話エージェント(cas)や会話検索(cs)の研究など、同様の目標に向けた試みがいくつか行われている。しかし、複雑な情報のニーズに対処しないか、あるいは概念フレームワークや実験室ベースのユーザリサーチの開発に限られている。本稿では,(1)適切なデータセットの作成,(2)会話用パイプラインの開発のためのsaac(search as a conversation)データセット,(2)検索エンジンとの会話のための最先端パイプラインの開発,(2)このデータセットを用いた検索エンジンとの対話(case)という2つの目標を追求する。 SaaCはマルチターンの会話検索データセットに基づいて構築されており、クラウドソーシングプラットフォームから労働者を雇い、関連する各項目を短い会話応答にまとめる。 caseは、サポート対象のトークン識別モジュールとaprior-awareポインタジェネレータを導入することで、最先端の処理を強化します。我々は,CaSEが強いベースラインより優れていることを示す実験を行った。また、CaSE以外のさらなる改善の余地があるかを示すために、SaaCデータセットの広範な分析を行う。最後に、我々は、CaSEのSaaCデータセットとコードと、このトピックに関する今後の研究を促進するために使用されるすべてのモデルをリリースする。 In this paper, we address the problem of answering complex information needs by conversing conversations with search engines, in the sense that users can express their queries in natural language, and directly receivethe information they need from a short system response in a conversational manner. Recently, there have been some attempts towards a similar goal, e.g., studies on Conversational Agents (CAs) and Conversational Search (CS). However, they either do not address complex information needs, or they are limited to the development of conceptual frameworks and/or laboratory-based user studies. We pursue two goals in this paper: (1) the creation of a suitable dataset, the Search as a Conversation (SaaC) dataset, for the development of pipelines for conversations with search engines, and (2) the development of astate-of-the-art pipeline for conversations with search engines, the Conversations with Search Engines (CaSE), using this dataset. SaaC is built based on a multi-turn conversational search dataset, where we further employ workers from a crowdsourcing platform to summarize each relevant passage into a short, conversational response. CaSE enhances the state-of-the-art by introducing a supporting token identification module and aprior-aware pointer generator, which enables us to generate more accurate responses. We carry out experiments to show that CaSE is able to outperform strong baselines. We also conduct extensive analyses on the SaaC dataset to show where there is room for further improvement beyond CaSE. Finally, we release the SaaC dataset and the code for CaSE and all models used for comparison to facilitate future research on this topic.	翻訳日:2022-12-08 14:18:46 公開日:2021-05-18
# WOAD:未公開動画のオンラインアクション検出を監督 WOAD: Weakly Supervised Online Action Detection in Untrimmed Videos ( http://arxiv.org/abs/2006.03732v2 ) ライセンス: Link先を確認	Mingfei Gao, Yingbo Zhou, Ran Xu, Richard Socher, Caiming Xiong	(参考訳) 非トリミングビデオ中のオンラインアクション検出は、発生時のアクションを識別することを目的としているため、リアルタイムアプリケーションにとって非常に重要である。従来は、オンライン行動検出システムのスケーラビリティを妨げる時間的行動境界の面倒なアノテーションをトレーニングに頼っていた。ビデオクラスラベルのみを用いてトレーニング可能な弱教師付きフレームワークであるWOADを提案する。 WOADには、時間的提案生成(TPG)とオンラインアクション認識(OAR)の2つの共同訓練モジュールが含まれている。ビデオクラスのラベルによって監督され、TPGはオフラインで動作し、OARの擬似フレームレベルのラベルを正確にマイニングするターゲットとなる。 TPGからの監視信号により、OARはオンライン方式で行動検出を行うことを学ぶ。 thumos'14, activitynet1.2, activitynet1.3の実験結果は,弱教師付き手法が弱教師付きベースラインをほとんど上回っており,従来の強教師付き手法と同等の性能を達成していることを示している。さらに、WOADは、利用可能な時に強力な監視を活用するために柔軟です。本手法は,オンラインフレームごとの行動認識とオンライン行動開始検出の両方のタスクにおいて,最先端の結果を得る。 Online action detection in untrimmed videos aims to identify an action as it happens, which makes it very important for real-time applications. Previous methods rely on tedious annotations of temporal action boundaries for training, which hinders the scalability of online action detection systems. We propose WOAD, a weakly supervised framework that can be trained using only video-class labels. WOAD contains two jointly-trained modules, i.e., temporal proposal generator (TPG) and online action recognizer (OAR). Supervised by video-class labels, TPG works offline and targets at accurately mining pseudo frame-level labels for OAR. With the supervisory signals from TPG, OAR learns to conduct action detection in an online fashion. Experimental results on THUMOS'14, ActivityNet1.2 and ActivityNet1.3 show that our weakly-supervised method largely outperforms weakly-supervised baselines and achieves comparable performance to the previous strongly-supervised methods. Beyond that, WOAD is flexible to leverage strong supervision when it is available. When strongly supervised, our method obtains the state-of-the-art results in the tasks of both online per-frame action recognition and online detection of action start.	翻訳日:2022-11-25 04:11:36 公開日:2021-05-18
# テキスト生成の評価:調査 Evaluation of Text Generation: A Survey ( http://arxiv.org/abs/2006.14799v2 ) ライセンス: Link先を確認	Asli Celikyilmaz, Elizabeth Clark, Jianfeng Gao	(参考訳) 本稿は,ここ数年で開発された自然言語生成システム(NLG)の評価手法について検討する。 nlg評価方法は,(1)人間中心評価指標,(2)訓練を必要としない自動評価指標,(3)機械学習指標の3つのカテゴリに分類した。各カテゴリにおいて、最近提案されたNLGタスクとニューラルNLGモデルの評価に焦点をあて、現在行われている進歩と課題について論じる。次に,テキストの自動要約と長文生成のためのタスク固有のnlg評価の2つの例を示し,今後の研究の方向性を述べる。 The paper surveys evaluation methods of natural language generation (NLG) systems that have been developed in the last few years. We group NLG evaluation methods into three categories: (1) human-centric evaluation metrics, (2) automatic metrics that require no training, and (3) machine-learned metrics. For each category, we discuss the progress that has been made and the challenges still being faced, with a focus on the evaluation of recently proposed NLG tasks and neural NLG models. We then present two examples for task-specific NLG evaluations for automatic text summarization and long text generation, and conclude the paper by proposing future research directions.	翻訳日:2022-11-16 20:45:03 公開日:2021-05-18
# 不均一情報ネットワークのための事前学習モデル Pre-Trained Models for Heterogeneous Information Networks ( http://arxiv.org/abs/2007.03184v2 ) ライセンス: Link先を確認	Yang Fang, Xiang Zhao, Yifan Chen, Weidong Xiao, Maarten de Rijke	(参考訳) ネットワーク表現学習では,ヘテロジニアスな情報ネットワークを低次元空間で表現する方法を学習し,効率的な探索,分類,予測を容易にする。従来のネットワーク表現学習手法では、ドメイン固有の問題に対処するために十分なタスク固有のラベル付きデータが必要である。トレーニングされたモデルは、通常、ドメイン外データセットに転送できない。我々は、異種情報ネットワークの特徴を捉えるための自己教師付き事前学習および微調整フレームワークPF-HINを提案する。ダウンストリームのタスクとデータセットごとにモデル全体をトレーニングしなければならない従来のネットワーク表現学習モデルとは異なり、PF-HINはモデルと少数のタスク固有のパラメータを微調整するだけで、モデル効率と効率性が向上する。事前学習中、我々はまず与えられたノードの近傍をシーケンスに変換する。 PF-HINは2つの自己教師付きタスク、マスキングノードモデリング、隣接ノード予測に基づいて事前訓練される。モデルのトレーニングには深層双方向トランスフォーマーエンコーダを採用し、パラメータの削減には分解型埋め込みパラメータ化と層間パラメータ共有を利用する。微調整の段階では、リンク予測、類似性検索、ノード分類、ノードクラスタリングという4つのベンチマークダウンストリームタスクを選択します。 pf-hinは、これら各タスクにおける最先端の代替手段を4つのデータセットで一貫して大幅に上回っている。 In network representation learning we learn how to represent heterogeneous information networks in a low-dimensional space so as to facilitate effective search, classification, and prediction solutions. Previous network representation learning methods typically require sufficient task-specific labeled data to address domain-specific problems. The trained model usually cannot be transferred to out-of-domain datasets. We propose a self-supervised pre-training and fine-tuning framework, PF-HIN, to capture the features of a heterogeneous information network. Unlike traditional network representation learning models that have to train the entire model all over again for every downstream task and dataset, PF-HIN only needs to fine-tune the model and a small number of extra task-specific parameters, thus improving model efficiency and effectiveness. During pre-training, we first transform the neighborhood of a given node into a sequence. PF-HIN is pre-trained based on two self-supervised tasks, masked node modeling and adjacent node prediction. We adopt deep bi-directional transformer encoders to train the model, and leverage factorized embedding parameterization and cross-layer parameter sharing to reduce the parameters. In the fine-tuning stage, we choose four benchmark downstream tasks, i.e., link prediction, similarity search, node classification, and node clustering. PF-HIN consistently and significantly outperforms state-of-the-art alternatives on each of these tasks, on four datasets.	翻訳日:2022-11-12 18:11:54 公開日:2021-05-18
# Deep Retrieval: 大規模レコメンデーションのための検索可能な構造を学ぶ Deep Retrieval: Learning A Retrievable Structure for Large-Scale Recommendations ( http://arxiv.org/abs/2007.07203v2 ) ライセンス: Link先を確認	Weihao Gao, Xiangjun Fan, Chong Wang, Jiankai Sun, Kai Jia, Wenzhi Xiao, Ruofan Ding, Xingyan Bin, Hui Yang, Xiaobing Liu	(参考訳) 大規模レコメンデーションにおける中核的な問題は、重要候補を正確かつ効率的に、好ましくは準線形時間で検索することである。先進的なアプローチは主に2段階の手順に基づいており、まず内積モデルを学び、次に近い近接探索アルゴリズム(ANN)を用いて上位候補を見つける。本稿では,ANNアルゴリズムにおけるユークリッド空間の仮定に頼らずに,ユーザとイテムのインタラクションデータ(例えばクリック)を直接検索可能な構造を学習するために,Deep Retrieval(DR)を提案する。 DRの構造は全ての候補項目を離散潜在空間に符号化する。候補の潜在コードはモデルパラメータであり、同じ目的関数を最大化するために他のニューラルネットワークパラメータと共に学習する。モデルが学習されると、構造上のビーム探索を行い、最上位候補を検索して再ランキングを行う。経験的に、我々はまず、2つの公開データセットのブルートフォースベースラインとほぼ同じ精度を、サブ線形計算複雑性を持つ dr が達成できることを実証した。さらに本研究では,実運用レコメンデーションシステムにおいて,デプロイされたDRアプローチが,エンゲージメントの指標として十分に調整されたANNベースラインを著しく上回ることを示す。我々の知る限りでは、DRは産業レコメンデーションシステムのために数億のアイテムをスケールで展開した最初の非ANNアルゴリズムの1つである。 One of the core problems in large-scale recommendations is to retrieve top relevant candidates accurately and efficiently, preferably in sub-linear time. Previous approaches are mostly based on a two-step procedure: first learn an inner-product model, and then use some approximate nearest neighbor (ANN) search algorithm to find top candidates. In this paper, we present Deep Retrieval (DR), to learn a retrievable structure directly with user-item interaction data (e.g. clicks) without resorting to the Euclidean space assumption in ANN algorithms. DR's structure encodes all candidate items into a discrete latent space. Those latent codes for the candidates are model parameters and learnt together with other neural network parameters to maximize the same objective function. With the model learnt, a beam search over the structure is performed to retrieve the top candidates for reranking. Empirically, we first demonstrate that DR, with sub-linear computational complexity, can achieve almost the same accuracy as the brute-force baseline on two public datasets. Moreover, we show that, in a live production recommendation system, a deployed DR approach significantly outperforms a well-tuned ANN baseline in terms of engagement metrics. To the best of our knowledge, DR is among the first non-ANN algorithms successfully deployed at the scale of hundreds of millions of items for industrial recommendation systems.	翻訳日:2022-11-11 05:49:55 公開日:2021-05-18
# GPU上の多面体ニューラルネットワークのスケーリング検証 Scaling Polyhedral Neural Network Verification on GPUs ( http://arxiv.org/abs/2007.10868v2 ) ライセンス: Link先を確認	Christoph M\"uller, Fran\c{c}ois Serre, Gagandeep Singh, Markus P\"uschel, Martin Vechev	(参考訳) ニューラルネットワークの敵攻撃に対する堅牢性を証明することは、自律運転や診断などの安全クリティカルなシステムに確実に採用するために不可欠である。残念なことに、最先端の検証者はより大きなネットワークにスケールしないか、堅牢性を証明するには不正確で、実践的な採用を制限している。本稿では,従来よりもはるかに大きなディープニューラルネットワークのロバスト性を証明可能なスケーラブルな検証器であるGPUPolyを紹介する。 GPUPolyの背後にある重要な技術的洞察は、GPU上のニューラルネットワーク検証のためのカスタムのサウンドポリヘドラアルゴリズムの設計である。我々のアルゴリズムは、基盤となる検証タスクの利用可能なGPU並列性と固有の疎性を活用する。 GPUPolyは大規模ネットワークにスケールする。例えば、約34.5msで1Mのニューロン、34層の深い残留ネットワークの堅牢性を証明することができる。我々は、GPUPolyが現実のニューラルネットワークの実用的な検証に向けた有望なステップであると考えている。 Certifying the robustness of neural networks against adversarial attacks is essential to their reliable adoption in safety-critical systems such as autonomous driving and medical diagnosis. Unfortunately, state-of-the-art verifiers either do not scale to bigger networks or are too imprecise to prove robustness, limiting their practical adoption. In this work, we introduce GPUPoly, a scalable verifier that can prove the robustness of significantly larger deep neural networks than previously possible. The key technical insight behind GPUPoly is the design of custom, sound polyhedra algorithms for neural network verification on a GPU. Our algorithms leverage the available GPU parallelism and inherent sparsity of the underlying verification task. GPUPoly scales to large networks: for example, it can prove the robustness of a 1M neuron, 34-layer deep residual network in approximately 34.5 ms. We believe GPUPoly is a promising step towards practical verification of real-world neural networks.	翻訳日:2022-11-08 13:05:07 公開日:2021-05-18
# プライバシ/レート・歪み理論によるロバスト機械学習 Robust Machine Learning via Privacy/Rate-Distortion Theory ( http://arxiv.org/abs/2007.11693v2 ) ライセンス: Link先を確認	Ye Wang, Shuchin Aeron, Adnan Siraj Rakin, Toshiaki Koike-Akino, Pierre Moulin	(参考訳) ニューラルネットワークの一般的な脆弱性に対処するために、ロバストなマシンラーニングの定式化が登場している。我々の研究は、最適ロバスト学習とプライバシ・ユーティリティ・トレードオフ問題との関連性を引き合いに出し、これは率歪み問題の一般化である。頑健な分類器と対向的な摂動の間のゲームのサドルポイントは、最大条件エントロピー問題の解によって見つけることができる。この情報理論的な観点は、ロバストネスとクリーンなデータ性能の基本的なトレードオフに光を当て、それは最終的に、基礎となるデータ分布と摂動制約の幾何学的構造から生じる。 Robust machine learning formulations have emerged to address the prevalent vulnerability of deep neural networks to adversarial examples. Our work draws the connection between optimal robust learning and the privacy-utility tradeoff problem, which is a generalization of the rate-distortion problem. The saddle point of the game between a robust classifier and an adversarial perturbation can be found via the solution of a maximum conditional entropy problem. This information-theoretic perspective sheds light on the fundamental tradeoff between robustness and clean data performance, which ultimately arises from the geometric structure of the underlying data distribution and perturbation constraints.	翻訳日:2022-11-07 22:38:48 公開日:2021-05-18
# マルチドメイン学習のためのNASを介してアダプタをプラグインすること What and Where: Learn to Plug Adapters via NAS for Multi-Domain Learning ( http://arxiv.org/abs/2007.12415v2 ) ライセンス: Link先を確認	Hanbin Zhao, Hao Zeng, Xin Qin, Yongjian Fu, Hui Wang, Bourahla Omar, and Xi Li	(参考訳) 重要かつ困難な問題として、マルチドメイン学習(MDL)は一般的に、共通のドメインに依存しないネットワークにプラグインされた、効果的な軽量なドメイン固有アダプタモジュールのセットを探している。通常、既存のアダプタプラグと構造設計の方法は、モデル学習の前にすべてのドメインに対して手作りで固定され、学習の柔軟性と計算集約性をもたらす。このモチベーションにより,neural architecture search (nas) を用いたデータ駆動アダプタ接続戦略を学習し,アダプタモジュールの接続先を自動的に決定する。さらに、NAS駆動学習方式におけるアダプタ構造設計のためのNAS-adapterモジュールを提案し、異なるドメインに対する効果的なアダプタモジュール構造を自動的に発見する。実験結果は,mdlモデルが既存手法と同等の性能条件下での有効性を示す。 As an important and challenging problem, multi-domain learning (MDL) typically seeks for a set of effective lightweight domain-specific adapter modules plugged into a common domain-agnostic network. Usually, existing ways of adapter plugging and structure design are handcrafted and fixed for all domains before model learning, resulting in the learning inflexibility and computational intensiveness. With this motivation, we propose to learn a data-driven adapter plugging strategy with Neural Architecture Search (NAS), which automatically determines where to plug for those adapter modules. Furthermore, we propose a NAS-adapter module for adapter structure design in a NAS-driven learning scheme, which automatically discovers effective adapter module structures for different domains. Experimental results demonstrate the effectiveness of our MDL model against existing approaches under the conditions of comparable performance.	翻訳日:2022-11-07 05:55:32 公開日:2021-05-18
# リコメンダシステムにおける便益的特徴相互作用の検出 Detecting Beneficial Feature Interactions for Recommender Systems ( http://arxiv.org/abs/2008.00404v6 ) ライセンス: Link先を確認	Yixin Su, Rui Zhang, Sarah Erfani, Zhenghua Xu	(参考訳) 特徴相互作用は、推薦システムにおいて高い精度を達成するために不可欠である。多くの研究がそれぞれの特徴の相互作用を考慮に入れている。しかし、いくつかの特徴的相互作用は推奨結果に関係しない可能性があり、それらを考慮してノイズを生じさせ、推奨精度を低下させる可能性があるため、これは最適ではない。特徴的相互作用から最善を尽くすため,提案手法では,特徴的相互作用を効果的にモデル化するグラフニューラルネットワーク手法と,推薦精度の観点から有益である特徴的相互作用を自動的に検出する手法を提案する。自動特徴相互作用検出は、エッジ予測とL0アクティベーション正規化により達成される。提案モデルは,情報ボトルネック原理と統計相互作用理論を用いて有効であることが証明された。実験結果から我々のモデルは (i)既存の基準線を精度で上回り、 (ii) 有用な特徴相互作用を自動的に識別する。 Feature interactions are essential for achieving high accuracy in recommender systems. Many studies take into account the interaction between every pair of features. However, this is suboptimal because some feature interactions may not be that relevant to the recommendation result, and taking them into account may introduce noise and decrease recommendation accuracy. To make the best out of feature interactions, we propose a graph neural network approach to effectively model them, together with a novel technique to automatically detect those feature interactions that are beneficial in terms of recommendation accuracy. The automatic feature interaction detection is achieved via edge prediction with an L0 activation regularization. Our proposed model is proved to be effective through the information bottleneck principle and statistical interaction theory. Experimental results show that our model (i) outperforms existing baselines in terms of accuracy, and (ii) automatically identifies beneficial feature interactions.	翻訳日:2022-11-03 19:28:12 公開日:2021-05-18
# 画像分類のためのメモリ効率の良いクラスインクリメンタル学習 Memory Efficient Class-Incremental Learning for Image Classification ( http://arxiv.org/abs/2008.01411v2 ) ライセンス: Link先を確認	Hanbin Zhao, Hui Wang, Yongjian Fu, Fei Wu, Xi Li	(参考訳) メモリリソース制限の制約により、クラスインクリメンタルラーニング(CIL)は通常、新たに追加されたクラスが到着すると、共同分類モデルを更新する際に「破滅的な忘れる」問題に悩まされる。忘れる問題に対処するため、多くのCILメソッドは、模範的なサンプルをメモリバッファのサイズに制限して保存することで、古いクラスの知識を転送する。メモリバッファをより効率的に活用するために,本研究では,従来の実高忠実な模範サンプルよりも,補助的な低忠実な模範サンプルを維持することを提案する。このようなメモリ効率の良い模範保存スキームは、古いクラスの知識伝達をより効果的にする。しかし、低忠実度例のサンプルは、しばしば元の例のサンプル、すなわちドメインシフトとは別の領域に分散される。この問題を軽減するため、我々は、上記のドメイン間ギャップを大幅に狭めるドメイン互換特徴抽出器と分類器を構築しようとする二重学習スキームを提案する。その結果、これらの低忠実度補助サンプルは、元の例サンプルを低メモリコストで適度に置き換えることができる。さらに, 純粋な真のクラスラベルのサンプルを用いて, バイアス付き分類器(古いクラスに関する蒸留ラベルの知識を含むサンプルを学習)を改良する, 頑健な分類器適応方式を提案する。実験により,本研究の有効性が実証された。 With the memory-resource-limited constraints, class-incremental learning (CIL) usually suffers from the "catastrophic forgetting" problem when updating the joint classification model on the arrival of newly added classes. To cope with the forgetting problem, many CIL methods transfer the knowledge of old classes by preserving some exemplar samples into the size-constrained memory buffer. To utilize the memory buffer more efficiently, we propose to keep more auxiliary low-fidelity exemplar samples rather than the original real high-fidelity exemplar samples. Such a memory-efficient exemplar preserving scheme makes the old-class knowledge transfer more effective. However, the low-fidelity exemplar samples are often distributed in a different domain away from that of the original exemplar samples, that is, a domain shift. To alleviate this problem, we propose a duplet learning scheme that seeks to construct domain-compatible feature extractors and classifiers, which greatly narrows down the above domain gap. As a result, these low-fidelity auxiliary exemplar samples have the ability to moderately replace the original exemplar samples with a lower memory cost. In addition, we present a robust classifier adaptation scheme, which further refines the biased classifier (learned with the samples containing distillation label knowledge about old classes) with the help of the samples of pure true class labels. Experimental results demonstrate the effectiveness of this work against the state-of-the-art approaches.	翻訳日:2022-11-02 23:21:09 公開日:2021-05-18
# 因果規則:不均一な治療効果の解釈的推論 Causal Rule Ensemble: Interpretable Inference of Heterogeneous Treatment Effects ( http://arxiv.org/abs/2009.09036v3 ) ライセンス: Link先を確認	Kwonsang Lee, Falco J. Bargagli-Stoffi, Francesca Dominici	(参考訳) 社会科学や健康科学では、治療が人口平均よりも明らかに大きいか小さい因果効果を持つ研究集団のサブグループを特定することが重要である。近年,因果効果の不均一性に対処するための方法論開発が数多く行われている。一般的なアプローチは、あらかじめ特定された共変量集合が与えられた条件平均処理効果(CATE)を推定することである。しかし、このアプローチは新たな部分群を発見できない。最近の因果機械学習(ML)アプローチでは、多数の観測や共変量が存在する場合、個々のレベルでCATEを推定する。しかしながら、これらのMLアプローチの大部分は、異種部分群の解釈可能な特徴づけを提供していない。本稿では,新しい因果ルールアンサンブル(CRE)法を提案する。 1) 著しく異質な治療効果を持つde novoサブグループ(causal rules)を発見する。 2)これらのサブグループの解釈性は,決定規則によって定義されるので保証する。 3) CATEは, 偏差が小さく, 統計的精度が高いこれらの新発見サブグループのそれぞれについて推定する。新たに発見された因果規則に対する推定因果効果の整合性を保証する理論的結果を提供する。 CREの優れた特徴は、因果規則の発見に使用できるMLアルゴリズムの選択や、因果規則内の因果効果の推定方法に非依存である点である。シミュレーションにより,cre手法は既存の手法に比べて性能が向上し,解釈性が向上することを示す。また,未測定埋没バイアスに対する新しい感度解析も導入した。 CRE法を用いて,大気汚染の長期曝露による死亡率に対する因果的影響に弱いサブグループを同定する。 In social and health sciences, it is critically important to identify subgroups of the study population where a treatment has a notably larger or smaller causal effect compared to the population average. In recent years, there have been many methodological developments for addressing heterogeneity of causal effects. A common approach is to estimate the conditional average treatment effect (CATE) given a pre-specified set of covariates. However, this approach does not allow to discover new subgroups. Recent causal machine learning (ML) approaches estimate the CATE at an individual level in presence of large number of observations and covariates with great accuracy. Nevertheless, the bulk of these ML approaches do not provide an interpretable characterization of the heterogeneous subgroups. In this paper, we propose a new Causal Rule Ensemble (CRE) method that: 1) discovers de novo subgroups with significantly heterogeneous treatment effects (causal rules); 2) ensures interpretability of these subgroups because they are defined in terms of decision rules; and 3) estimates the CATE for each of these newly discovered subgroups with small bias and high statistical precision. We provide theoretical results that guarantee consistency of the estimated causal effects for the newly discovered causal rules. A nice feature of CRE is that it is agnostic to the choices of the ML algorithms that can be used to discover the causal rules, and the estimation methods for the causal effects within the discovered causal rules. Via simulations, we show that the CRE method has competitive performance as compared to existing approaches while providing enhanced interpretability. We also introduce a new sensitivity analysis to unmeasured confounding bias. We apply the CRE method to discover subgroups that are more vulnerable to the causal effects of long-term exposure to air pollution on mortality.	翻訳日:2022-10-17 03:25:08 公開日:2021-05-18
# ロバストか公正か:対人訓練の公正性を目指して To be Robust or to be Fair: Towards Fairness in Adversarial Training ( http://arxiv.org/abs/2010.06121v2 ) ライセンス: Link先を確認	Han Xu, Xiaorui Liu, Yaxin Li, Anil K. Jain, Jiliang Tang	(参考訳) 敵のトレーニングアルゴリズムは、敵の例に対する機械学習モデルの堅牢性を改善するために信頼できることが証明されている。しかし, 逆行訓練アルゴリズムは, 異なるデータ群間の精度と頑健さの相違が生じやすいことがわかった。例えば、cifar-10の対向的に訓練されたresnet18モデルは、クラス"automobile"では93%のクリーン精度と67%のpgd l-infty-8ロバスト精度を持つが、クラス"cat"では65%と17%しかない。この現象はバランスの取れたデータセットで発生し、クリーンサンプルのみを使用すると自然に訓練されたモデルには存在しない。本研究では,DNNモデルのロバストな誤差を最小限に抑える一般対角訓練アルゴリズムにおいて,この現象が生じることを実証的,理論的に示す。これらの知見に触発されて、敵防衛を行う際の不公平問題を軽減するためのFair-Robust-Learning(FRL)フレームワークを提案する。 FRLの有効性を実験的に検証した。 Adversarial training algorithms have been proved to be reliable to improve machine learning models' robustness against adversarial examples. However, we find that adversarial training algorithms tend to introduce severe disparity of accuracy and robustness between different groups of data. For instance, a PGD adversarially trained ResNet18 model on CIFAR-10 has 93% clean accuracy and 67% PGD l-infty-8 robust accuracy on the class "automobile" but only 65% and 17% on the class "cat". This phenomenon happens in balanced datasets and does not exist in naturally trained models when only using clean samples. In this work, we empirically and theoretically show that this phenomenon can happen under general adversarial training algorithms which minimize DNN models' robust errors. Motivated by these findings, we propose a Fair-Robust-Learning (FRL) framework to mitigate this unfairness problem when doing adversarial defenses. Experimental results validate the effectiveness of FRL.	翻訳日:2022-10-07 22:53:40 公開日:2021-05-18
# タスクとドメインにまたがる技術的疑問 Technical Question Answering across Tasks and Domains ( http://arxiv.org/abs/2010.09780v2 ) ライセンス: Link先を確認	Wenhao Yu, Lingfei Wu, Yu Deng, Qingkai Zeng, Ruchi Mahindru, Sinem Guven, Meng Jiang	(参考訳) 自動技術支援システムの構築は重要な課題である。概念的には、技術的なフォーラムでユーザー質問に答えるためには、人間の専門家がまず関連文書を検索し、答えのスニペットを特定するために慎重に読む必要がある。大きな成功にもかかわらず、研究者たちは一般領域質問応答(QA)に対処することに成功しているが、技術的QAの調査に対する注意ははるかに少ない。具体的には、既存の手法はいくつかの固有の課題に苦しむ (i)質問と回答が実質的に重なることは滅多になく、 (ii)データサイズが非常に限られている。本稿では,タスクやドメイン間での技術的QAを効果的に扱うための,ディープラーニング学習の枠組みを提案する。この目的のために,文書検索と読解作業のための調整可能な共同学習手法を提案する。我々のTechQA実験は最先端手法と比較して優れた性能を示した。 Building automatic technical support system is an important yet challenge task. Conceptually, to answer a user question on a technical forum, a human expert has to first retrieve relevant documents, and then read them carefully to identify the answer snippet. Despite huge success the researchers have achieved in coping with general domain question answering (QA), much less attentions have been paid for investigating technical QA. Specifically, existing methods suffer from several unique challenges (i) the question and answer rarely overlaps substantially and (ii) very limited data size. In this paper, we propose a novel framework of deep transfer learning to effectively address technical QA across tasks and domains. To this end, we present an adjustable joint learning approach for document retrieval and reading comprehension tasks. Our experiments on the TechQA demonstrates superior performance compared with state-of-the-art methods.	翻訳日:2022-10-05 20:29:01 公開日:2021-05-18
# トピック・スペース・トラジェクトリー:機械学習文学の事例研究 Topic Space Trajectories: A case study on machine learning literature ( http://arxiv.org/abs/2010.12294v3 ) ライセンス: Link先を確認	Bastian Sch\"afermeier and Gerd Stumme and Tom Hanika	(参考訳) 科学会場での年次刊行物、例えば会議や雑誌の数は急速に増えている。したがって、研究者にとっても研究トピックとその進捗を追跡することが難しくなる。このタスクでは、研究者は自動出版分析によって支援できる。しかし、そのような方法の多くは解釈不能で純粋に数値表現をもたらす。人的分析者を支援するため,研究トピックを網羅的に追跡する構造であるトピック空間トラジェクトリを提案する。 8つの異なる解析手法に基づいてこれらの軌道を解釈する方法を実証する。その結果,非負の行列係数化と適切な可視化手法が得られた。我々は,32の出版会場から50年間の機械学習研究を対象とする出版コーパスへのアプローチの適用性を示した。本手法は,論文分類,今後の研究課題の予測,未発表の論文提出のための会議や雑誌の掲載を推奨するために利用することができる。 The annual number of publications at scientific venues, for example, conferences and journals, is growing quickly. Hence, even for researchers it becomes harder and harder to keep track of research topics and their progress. In this task, researchers can be supported by automated publication analysis. Yet, many such methods result in uninterpretable, purely numerical representations. As an attempt to support human analysts, we present topic space trajectories, a structure that allows for the comprehensible tracking of research topics. We demonstrate how these trajectories can be interpreted based on eight different analysis approaches. To obtain comprehensible results, we employ non-negative matrix factorization as well as suitable visualization techniques. We show the applicability of our approach on a publication corpus spanning 50 years of machine learning research from 32 publication venues. Our novel analysis method may be employed for paper classification, for the prediction of future research topics, and for the recommendation of fitting conferences and journals for submitting unpublished work.	翻訳日:2022-10-04 00:03:04 公開日:2021-05-18
# 現代・歴史的テキストにおける文字エントロピー:未解読写本の比較尺度 Character Entropy in Modern and Historical Texts: Comparison Metrics for an Undeciphered Manuscript ( http://arxiv.org/abs/2010.14697v2 ) ライセンス: Link先を確認	Luke Lindemann and Claire Bowern	(参考訳) 本稿では,voynich写本を多言語で比較分析するためのコーパスとして,カーリアー言語,スクリバル手,転写システムで区切られたvoynichテキストのコーパス,wikipediaから収集された294言語サンプルのコーパス,8言語で書き起こされた18の歴史的テキストのコーパスの3つのコーパスについて概説する。これらのコーパスは、イェール大学のVoynich Working Groupによるその後の研究で活用される。本稿では,Voynicheseにおける条件付き文字エントロピーの分析により,Voynich文字と言語の特徴を研究するためのコーパスの有用性を実証する。文字エントロピーと言語,スクリプトサイズとタイプ,グリフの構成性,スクリバル規則と略語,位置的文字変種,ビッグラム周波数の相互作用について論じる。この分析は、スクリプト構成性、文字サイズ、予測可能性の間の相互作用を特徴付ける。条件付きエントロピーレベルを自然言語に合わせるには,グリフ合成の実質的な操作が不十分であることを示す。ヴォイニチェ文字の異常に予測可能な性質は、特定のスクリプトや転写システム、基礎言語、置換暗号に起因するものではない。 Voynicheseはコーパスのすべての比較テキストと異なるのは、文字の配置が単語内で非常に制約されているためであり、これは下層の言語から音韻的区別が失われていることを示している。 This paper outlines the creation of three corpora for multilingual comparison and analysis of the Voynich manuscript: a corpus of Voynich texts partitioned by Currier language, scribal hand, and transcription system, a corpus of 294 language samples compiled from Wikipedia, and a corpus of eighteen transcribed historical texts in eight languages. These corpora will be utilized in subsequent work by the Voynich Working Group at Yale University. We demonstrate the utility of these corpora for studying characteristics of the Voynich script and language, with an analysis of conditional character entropy in Voynichese. We discuss the interaction between character entropy and language, script size and type, glyph compositionality, scribal conventions and abbreviations, positional character variants, and bigram frequency. This analysis characterizes the interaction between script compositionality, character size, and predictability. We show that substantial manipulations of glyph composition are not sufficient to align conditional entropy levels with natural languages. The unusually predictable nature of the Voynichese script is not attributable to a particular script or transcription system, underlying language, or substitution cipher. Voynichese is distinct from every comparison text in our corpora because character placement is highly constrained within the word, and this may indicate the loss of phonemic distinctions from the underlying language.	翻訳日:2022-10-02 05:22:06 公開日:2021-05-18
# ガウス重み付きグラフデータベースのアライメントのためのシャープしきい値 Sharp threshold for alignment of graph databases with Gaussian weights ( http://arxiv.org/abs/2010.16295v2 ) ライセンス: Link先を確認	Luca Ganassali	(参考訳) 重み付きグラフ(行列)データベースアライメントにおける再構成の基本的限界について検討する。 2つのグラフのモデルを考えると、$\pi^$は植込みされた一様置換であり、すべてのエッジウェイトが$(A_{i,j}, B_{\pi^である。 (i)\pi^* (j)})_{1 \leq i<j \leq n}$ は、ゼロ平均、単位分散、相関パラメータ $\rho \in [0,1]$ を持つガウス変数の対である。もし$n \rho^2 \geq (4+\epsilon) \log n + \omega(1)$ for some $\epsilon>0$なら、正確な再構成を達成するデータベース$a,b$の観測に基づいて、$\hat{\pi}$ --すなわちmap estimator -- が存在することを証明する。逆に、$n \rho^2 \leq 4 \log n - \log \log n - \omega(1)$ ならば、任意の推定器 $\hat{\pi}$ は確率 $o(1)$ で $\hat{\pi}=\pi$ を検証する。この結果から, 精度回復のための情報理論しきい値が, Wuらによる最近の研究(2020年)で得られたものと同一であること, 言い換えればガウス重み付きグラフアライメントでは, 再構築の問題は検出のそれよりも難しくないことがわかった。復元作業はベクトル型データベースアライメント(これは$(u_i, v_{\pi^)の信号を取る)に対して既によく理解されていた。 (i)})_{1 \leq i\leq n}$ ここで$(u_i, v_{\pi^) (i)})$ は$\mathbb{r}^{d_u} \times \mathbb{r}^{d_v}$ の i.i.d.ペアであり、グラフ(または行列)データベースの定式化は、ハードフェーズが広く予想されるような、劇的に異なる問題をもたらす。これらの証明は、置換のエネルギーの相関構造の研究とともに、写像推定器と第二モーメント法の解析に基づいている。 We study the fundamental limits for reconstruction in weighted graph (or matrix) database alignment. We consider a model of two graphs where $\pi^$ is a planted uniform permutation and all pairs of edge weights $(A_{i,j}, B_{\pi^(i),\pi^(j)})_{1 \leq i<j \leq n}$ are i.i.d. pairs of Gaussian variables with zero mean, unit variance and correlation parameter $\rho \in [0,1]$. We prove that there is a sharp threshold for exact recovery of $\pi^$: if $n \rho^2 \geq (4+\epsilon) \log n + \omega(1)$ for some $\epsilon>0$, there is an estimator $\hat{\pi}$ -- namely the MAP estimator -- based on the observation of databases $A,B$ that achieves exact reconstruction with high probability. Conversely, if $n \rho^2 \leq 4 \log n - \log \log n - \omega(1)$, then any estimator $\hat{\pi}$ verifies $\hat{\pi}=\pi$ with probability $o(1)$. This result shows that the information-theoretic threshold for exact recovery is the same as the one obtained for detection in a recent work by Wu et al. (2020): in other words, for Gaussian weighted graph alignment, the problem of reconstruction is not more difficult than that of detection. Though the reconstruction task was already well understood for vector-shaped database alignment (that is taking signal of the form $(u_i, v_{\pi^(i)})_{1 \leq i\leq n}$ where $(u_i, v_{\pi^(i)})$ are i.i.d. pairs in $\mathbb{R}^{d_u} \times \mathbb{R}^{d_v}$), its formulation for graph (or matrix) databases brings a drastically different problem for which the hard phase is conjectured to be wide. The proofs build upon the analysis of the MAP estimator and the second moment method, together with the study of the correlation structure of energies of permutations.	翻訳日:2022-10-01 16:27:50 公開日:2021-05-18
# GPRNetを用いた地下事業用モデル再構築システム GPR-based Model Reconstruction System for Underground Utilities Using GPRNet ( http://arxiv.org/abs/2011.02635v3 ) ライセンス: Link先を確認	Jinglun Feng, Liang Yang, Ejup Hoxha, Diar Sanakov, Stanislav Sotnikov, Jizhong Xiao	(参考訳) 地中レーダ(gpr)は、地下の物体(リバー、ユーティリティパイプ)を検知・発見するための最も重要な非破壊評価(nde)機器の1つである。これまでの多くの研究は、GPR画像に基づく特徴検出のみに焦点を当てており、より詳細な地下物体の非常に微細で詳細な3Dモデルの再構築を成功させるために、粗いGPR測定を処理できない。そこで本稿では,GPRデータを収集し,地下ユーティリティをローカライズし,地下オブジェクトの高密度点クラウドモデルを再構築する,新しいロボットシステムを提案する。このシステムは3つのモジュールから構成される。 1 視覚慣性に基づくGPRデータ収集モジュールで、全方向ロボットの位置情報をGPR計測にタグ付けする。 2) 生のgpr b-scan画像をオブジェクトモデルの断面に解釈するためのディープニューラルネットワーク(dnn)マイグレーションモジュール 3)DNNベースの3D再構成モジュール、すなわちGPRNetは、細かな3Dポイントクラウドを持つ地下ユーティリティモデルを生成する。本稿では,本手法を定量的・定性的に検証し,パイプ状ユーティリティの濃密かつ完全点クラウドモデル,すなわちgpr生データ不完全性および各種ノイズの少ない入力に基づいて生成する手法について検証する。実験の結果, 合成データとフィールドテストデータにより, 本手法の有効性がさらに向上した。 Ground Penetrating Radar (GPR) is one of the most important non-destructive evaluation (NDE) instruments to detect and locate underground objects (i.e., rebars, utility pipes). Many previous researches focus on GPR image-based feature detection only, and none can process sparse GPR measurements to successfully reconstruct a very fine and detailed 3D model of underground objects for better visualization. To address this problem, this paper presents a novel robotic system to collect GPR data, localize the underground utilities, and reconstruct the underground objects' dense point cloud model. This system is composed of three modules: 1) visual-inertial-based GPR data collection module, which tags the GPR measurements with positioning information provided by an omnidirectional robot; 2) a deep neural network (DNN) migration module to interpret the raw GPR B-scan image into a cross-section of object model; 3) a DNN-based 3D reconstruction module, i.e., GPRNet, to generate underground utility model with the fine 3D point cloud. In this paper, both the quantitative and qualitative experiment results verify our method that can generate a dense and complete point cloud model of pipe-shaped utilities based on a sparse input, i.e., GPR raw data incompleteness and various noise. The experiment results on synthetic data and field test data further support the effectiveness of our approach.	翻訳日:2022-09-29 13:01:00 公開日:2021-05-18
# CODER:用語正規化のための言語間医療用語埋め込みの知識注入 CODER: Knowledge infused cross-lingual medical term embedding for term normalization ( http://arxiv.org/abs/2011.02947v3 ) ライセンス: Link先を確認	Zheng Yuan and Zhengyun Zhao and Haixia Sun and Jiao Li and Fei Wang and Sheng Yu	(参考訳) 本稿では, 言語間医療用語表現のための知識グラフを用いた比較学習コーダを提案する。 CODERは医療用語の正規化のために設計されており、同じまたは類似の医療概念を言語間サポートで表す異なる用語のクローズドベクター表現を提供する。統合医療言語システム (unified medical language system) と呼ばれる医学知識グラフ (kg) 上の対比学習を通じてコーダを訓練し, kg から用語と関係の3重項を用いて類似度を計算する。関係性のあるトレーニングは、医療知識を埋め込みに注入し、より優れた機械学習機能を提供することを目指している。我々は,ゼロショット項正規化,意味的類似性,関係分類ベンチマークにおけるコーダの評価を行い,コーダアウトが様々な最先端の生物医学用語埋め込み,概念埋め込み,文脈埋め込みを行うことを示した。私たちのコードとモデルはhttps://github.com/ganjinzero/coderで利用可能です。 This paper proposes CODER: contrastive learning on knowledge graphs for cross-lingual medical term representation. CODER is designed for medical term normalization by providing close vector representations for different terms that represent the same or similar medical concepts with cross-lingual support. We train CODER via contrastive learning on a medical knowledge graph (KG) named the Unified Medical Language System, where similarities are calculated utilizing both terms and relation triplets from KG. Training with relations injects medical knowledge into embeddings and aims to provide potentially better machine learning features. We evaluate CODER in zero-shot term normalization, semantic similarity, and relation classification benchmarks, which show that CODERoutperforms various state-of-the-art biomedical word embedding, concept embeddings, and contextual embeddings. Our codes and models are available at https://github.com/GanjinZero/CODER.	翻訳日:2022-09-29 11:57:13 公開日:2021-05-18
# Margin-dynamic-softmax Lossによる深いクロスモーダルハッシュ Deep Cross-modal Hashing via Margin-dynamic-softmax Loss ( http://arxiv.org/abs/2011.03451v2 ) ライセンス: Link先を確認	Rong-Cheng Tu, Xian-Ling Mao, Rongxin Tu, Binbin Bian, Wei Wei, Heyan Huang	(参考訳) クロスモーダル検索作業における高い検索効率と低ストレージコストのため,クロスモーダルハッシュ法が注目されている。教師付きクロスモーダルハッシュ法では,データポイントのラベルに十分に含まれている意味情報を学習ハッシュコードに保存させる方法が検索性能向上の鍵となる。したがって、ほとんど全ての教師付きクロスモーダルハッシュ手法は、通常、ハッシュモデルの学習を完全にまたは部分的に導くためにラベル情報を持つデータポイント間の類似性を定義することに依存する。しかし、データポイント間の定義された類似性は、部分的にデータポイントのラベル情報を取り込み、豊富な意味情報を見逃し、検索性能のさらなる向上を妨げる。そこで,本研究では,データポイント間の類似性を定義せずに,新しいクロスモーダルハッシュ法を提案し,それをDCHML(textit{Margin-dynamic-softmax Loss})と呼ぶ。具体的には、dchmlはまずプロキシハッシュネットワークを訓練し、データセットの各カテゴリ情報をプロキシハッシュコードと呼ばれるセマンティック識別ハッシュコードに変換する。各プロキシハッシュコードは、対応するカテゴリのセマンティック情報を適切に保存することができる。次に、モダリティ固有のハッシュネットワークのトレーニングプロセスを監督するためにデータポイント間の類似性を定義することなく、プロキシハッシュコードを教師付き情報として直接利用する新しい \textit{margin-dynamic-softmax loss} を提案する。最後に、新しい \textit{margin-dynamic-softmax loss} を最小化することで、モダリティ固有のハッシュネットワークを訓練して、クロスモーダル類似性と豊富な意味情報を同時に保存できるハッシュコードを生成することができる。 Due to their high retrieval efficiency and low storage cost for cross-modal search task, cross-modal hashing methods have attracted considerable attention. For the supervised cross-modal hashing methods, how to make the learned hash codes preserve semantic information sufficiently contained in the label of datapoints is the key to further enhance the retrieval performance. Hence, almost all supervised cross-modal hashing methods usually depends on defining a similarity between datapoints with the label information to guide the hashing model learning fully or partly. However, the defined similarity between datapoints can only capture the label information of datapoints partially and misses abundant semantic information, then hinders the further improvement of retrieval performance. Thus, in this paper, different from previous works, we propose a novel cross-modal hashing method without defining the similarity between datapoints, called Deep Cross-modal Hashing via \textit{Margin-dynamic-softmax Loss} (DCHML). Specifically, DCHML first trains a proxy hashing network to transform each category information of a dataset into a semantic discriminative hash code, called proxy hash code. Each proxy hash code can preserve the semantic information of its corresponding category well. Next, without defining the similarity between datapoints to supervise the training process of the modality-specific hashing networks , we propose a novel \textit{margin-dynamic-softmax loss} to directly utilize the proxy hashing codes as supervised information. Finally, by minimizing the novel \textit{margin-dynamic-softmax loss}, the modality-specific hashing networks can be trained to generate hash codes which can simultaneously preserve the cross-modal similarity and abundant semantic information well.	翻訳日:2022-09-29 05:43:26 公開日:2021-05-18
# PairRE: ペア関係ベクトルによる知識グラフ埋め込み PairRE: Knowledge Graph Embeddings via Paired Relation Vectors ( http://arxiv.org/abs/2011.03798v3 ) ライセンス: Link先を確認	Linlin Chao, Jianshan He, Taifeng Wang, Wei Chu	(参考訳) N-to-1, 1-to-N, N-to-Nなどの複雑な関係を扱う能力と、対称性や反対称性などの様々な関係パターンを符号化する能力である。しかし、既存の手法ではこれら2つの問題を同時に解くことができず、結果が不十分である。この問題を軽減するために,各関係表現に対してペアベクトルを持つモデルであるペアレを提案する。ペアベクトルは、損失関数のマージンの適応的な調整を複素関係に適合させることができる。加えて、PairREは3つの重要な関係パターン、対称性/反対称性、逆および合成を符号化することができる。関係表現に関する単純な制約が与えられた場合、PairREはさらにサブリレーションをエンコードできる。リンク予測ベンチマークの実験は、ペアリングの鍵となる機能を示す。さらに、我々は2つの知識グラフデータセットに新しい最先端のOpen Graphベンチマークを設定した。 Distance based knowledge graph embedding methods show promising results on link prediction task, on which two topics have been widely studied: one is the ability to handle complex relations, such as N-to-1, 1-to-N and N-to-N, the other is to encode various relation patterns, such as symmetry/antisymmetry. However, the existing methods fail to solve these two problems at the same time, which leads to unsatisfactory results. To mitigate this problem, we propose PairRE, a model with paired vectors for each relation representation. The paired vectors enable an adaptive adjustment of the margin in loss function to fit for complex relations. Besides, PairRE is capable of encoding three important relation patterns, symmetry/antisymmetry, inverse and composition. Given simple constraints on relation representations, PairRE can encode subrelation further. Experiments on link prediction benchmarks demonstrate the proposed key capabilities of PairRE. Moreover, We set a new state-of-the-art on two knowledge graph datasets of the challenging Open Graph Benchmark.	翻訳日:2022-09-28 22:05:59 公開日:2021-05-18
# poisson multi-bernoulli mixture filterによる点と拡張目標の共存 A Poisson multi-Bernoulli mixture filter for coexisting point and extended targets ( http://arxiv.org/abs/2011.04464v2 ) ライセンス: Link先を確認	\'Angel F. Garc\'ia-Fern\'andez, Jason L. Williams, Lennart Svensson, Yuxuan Xia	(参考訳) 本稿では,Poisson Multi-Bernoulli Mixing (PMBM) フィルタを提案する。 PMBMフィルタは、データアソシエーションの確率情報に基づいて、マルチターゲットフィルタリング後部を計算するための再帰と、単一ターゲット予測と更新を提供する。本稿では,まずPMBMフィルタを一般化された測定モデルに適用し,点と拡張対象から導出する測定値を含む手法を提案する。次に,ポイントと拡張対象の両方に対応し,ポイントターゲットに対するガウス密度と拡張ターゲットに対するガンマガウス逆ウィッシュアート密度を伝播するフィルタリング再帰を導出する単一ターゲット空間を提案する。また,PMBMフィルタの計算効率のよい近似法として,ポアソン・マルチベルヌーリフィルタ(PMB)を開発した。結果のフィルタは数値シミュレーションによって解析される。 This paper proposes a Poisson multi-Bernoulli mixture (PMBM) filter for coexisting point and extended targets, i.e., for scenarios where there may be simultaneous point and extended targets. The PMBM filter provides a recursion to compute the multi-target filtering posterior based on probabilistic information on data associations, and single-target predictions and updates. In this paper, we first derive the PMBM filter update for a generalised measurement model, which can include measurements originated from point and extended targets. Second, we propose a single-target space that accommodates both point and extended targets and derive the filtering recursion that propagates Gaussian densities for point targets and gamma Gaussian inverse Wishart densities for extended targets. As a computationally efficient approximation of the PMBM filter, we also develop a Poisson multi-Bernoulli (PMB) filter for coexisting point and extended targets. The resulting filters are analysed via numerical simulations.	翻訳日:2022-09-28 02:28:10 公開日:2021-05-18
# 深部ニューラルネットワークと校正データを用いた広視野小開口望遠鏡の点展開関数推定 Point Spread Function Estimation for Wide Field Small Aperture Telescopes with Deep Neural Networks and Calibration Data ( http://arxiv.org/abs/2011.10243v2 ) ライセンス: Link先を確認	Peng Jia, Xuebo Wu, Zhengyang Li, Bo Li, Weihua Wang, Qiang Liu, Adam Popowicz	(参考訳) 点拡散関数(PSF)は望遠鏡の状態を反映し、PSFベースのアストロメトリー、測光、画像復元などのデータ処理手法の発展に重要な役割を果たしている。しかし、広視野小型開口望遠鏡(WFSAT)では、光学系によって誘導される収差が非常に複雑であり、恒星画像の信号対雑音比が低すぎるため、視野全体の位置でPSFを推定するのは困難である。本稿では,より深いニューラルネットワーク(DNN)に基づくPSFモデリング法を開発し,そのPSF推定への応用を示す。望遠鏡アライメントとテストの段階では、光学素子を工学的許容範囲(傾きとまともさ)で修正することで、システムキャリブレーションデータを収集する。次に、これらのデータを用いてDNN(Tel-Net)を訓練する。訓練後、Tel-Netは、複数の離散サンプリングされた星画像から任意の視野でPSFを推定できる。本手法の性能評価にはシミュレーションデータと実験データの両方を用いた。その結果,Tel-Netはどの状態でもFoVの任意の位置でもWFSATのPSFを再構築できることがわかった。比較した古典的手法である逆距離重み (IDW) の補間結果よりも, はるかに精度が高い。提案手法は,PSFの強い事前情報を必要とするWFSATのためのディープニューラルネットワークに基づくデータ処理手法の開発の基礎となる。 The point spread function (PSF) reflects states of a telescope and plays an important role in development of data processing methods, such as PSF based astrometry, photometry and image restoration. However, for wide field small aperture telescopes (WFSATs), estimating PSF in any position of the whole field of view is hard, because aberrations induced by the optical system are quite complex and the signal to noise ratio of star images is often too low for PSF estimation. In this paper, we further develop our deep neural network (DNN) based PSF modelling method and show its applications in PSF estimation. During the telescope alignment and testing stage, our method collects system calibration data through modification of optical elements within engineering tolerances (tilting and decentering). Then we use these data to train a DNN (Tel--Net). After training, the Tel--Net can estimate PSF in any field of view from several discretely sampled star images. We use both simulated and experimental data to test performance of our method. The results show that the Tel--Net can successfully reconstruct PSFs of WFSATs of any states and in any positions of the FoV. Its results are significantly more precise than results obtained by the compared classic method - Inverse Distance Weight (IDW) interpolation. Our method provides foundations for developing of deep neural network based data processing methods for WFSATs, which require strong prior information of PSFs.	翻訳日:2022-09-23 06:52:40 公開日:2021-05-18
# 深部領域一般化のためのバッチ正規化埋め込み Batch Normalization Embeddings for Deep Domain Generalization ( http://arxiv.org/abs/2011.12672v3 ) ライセンス: Link先を確認	Mattia Segu, Alessio Tonioni, Federico Tombari	(参考訳) ドメインの一般化は、異なるドメインと見えないドメインで堅牢に実行されるように機械学習モデルをトレーニングすることを目的としている。最近では、複数のデータセットを使用してモデルをトレーニングし、ドメイン不変の機能を抽出している。まず、アドホックなバッチ正規化レイヤを使用してドメイン依存表現を明示的にトレーニングし、独立したドメインの統計を収集します。そこで我々は,これらの統計データを用いて,距離関数を用いて領域へのメンバシップを計測できる共有潜在空間の領域をマッピングする。テスト時には、未知の領域から同じ空間にサンプルを投影し、既知の領域の線形結合としてそれらの領域の特性を推論する。トレーニングとテスト時に同じマッピング戦略を適用し、潜在表現と強力で軽量なアンサンブルモデルの両方を学習します。一般的なドメイン一般化ベンチマーク(pacs、office-31、office-caltech)では、現在の最先端技術よりも分類精度が大幅に向上している。 Domain generalization aims at training machine learning models to perform robustly across different and unseen domains. Several recent methods use multiple datasets to train models to extract domain-invariant features, hoping to generalize to unseen domains. Instead, first we explicitly train domain-dependant representations by using ad-hoc batch normalization layers to collect independent domain's statistics. Then, we propose to use these statistics to map domains in a shared latent space, where membership to a domain can be measured by means of a distance function. At test time, we project samples from an unknown domain into the same space and infer properties of their domain as a linear combination of the known ones. We apply the same mapping strategy at training and test time, learning both a latent representation and a powerful but lightweight ensemble model. We show a significant increase in classification accuracy over current state-of-the-art techniques on popular domain generalization benchmarks: PACS, Office-31 and Office-Caltech.	翻訳日:2022-09-21 02:11:20 公開日:2021-05-18
# 3DSNet: 教師なし形状の3Dスタイル転送 3DSNet: Unsupervised Shape-to-Shape 3D Style Transfer ( http://arxiv.org/abs/2011.13388v4 ) ライセンス: Link先を確認	Mattia Segu, Margarita Grinvald, Roland Siegwart, Federico Tombari	(参考訳) あるイメージから別のイメージへスタイルを転送することは、コンピュータビジョンにおいて広く研究されている課題である。しかし、3d設定でのスタイル転送は、ほとんど未解決の問題である。そこで本研究では,不整合コンテンツとスタイル表現に基づく3次元オブジェクト間のスタイル伝達のための学習ベースアプローチを提案する。提案手法は, 点雲とメッシュの2つの形状を合成し, ソースとターゲットの3dモデルの内容とスタイルを組み合わせて, ソースの内容を保持しながら, ターゲットのスタイルに類似した新たな形状を生成する。さらに,提案手法を拡張して,選択した領域のマルチモーダル分布を暗黙的に学習する。学習した分布からスタイルコードをサンプリングすることで、モデルが入力形状に表現できるスタイルの種類を増加させます。実験により,多くのベンチマークにおいて提案手法の有効性が検証された。私たちのフレームワークの実装は受け入れ次第リリースします。 Transferring the style from one image onto another is a popular and widely studied task in computer vision. Yet, style transfer in the 3D setting remains a largely unexplored problem. To our knowledge, we propose the first learning-based approach for style transfer between 3D objects based on disentangled content and style representations. The proposed method can synthesize new 3D shapes both in the form of point clouds and meshes, combining the content and style of a source and target 3D model to generate a novel shape that resembles in style the target while retaining the source content. Furthermore, we extend our technique to implicitly learn the multimodal style distribution of the chosen domains. By sampling style codes from the learned distributions, we increase the variety of styles that our model can confer to an input shape. Experimental results validate the effectiveness of the proposed 3D style transfer method on a number of benchmarks. The implementation of our framework will be released upon acceptance.	翻訳日:2022-09-20 08:28:14 公開日:2021-05-18
# SemSegLoss:セマンティックセグメンテーションのための損失関数のpythonパッケージ SemSegLoss: A python package of loss functions for semantic segmentation ( http://arxiv.org/abs/2106.05844v1 ) ライセンス: Link先を確認	Shruti Jadon	(参考訳) Image Segmentationは、自動疾患検出から自動運転車まで、幅広い用途があるため、活発な研究分野である。近年、様々な研究論文がバイアスデータ、スパースセグメンテーション、不均衡データセットの場合に使用される異なる損失関数を提案している。本稿では,画像セグメンテーションに広く用いられているよく知られた損失関数のいくつかからなるピソンパッケージであるSemSegLossを紹介する。研究者が新規な損失関数の開発を支援し、様々なアプリケーションのためのモデルアーキテクチャに関する広範な実験を行うために開発された。提案パッケージの使いやすさと柔軟性により、開発時間を短縮し、セマンティックセグメンテーションのための機械学習モデルの評価戦略が強化された。さらに、イメージセグメンテーションを使用するアプリケーションは、関数の一般性のためにSemSegLossを使用することができる。この幅広い応用は、あらゆる産業におけるAIの発展と成長につながるだろう。 Image Segmentation has been an active field of research as it has a wide range of applications, ranging from automated disease detection to self-driving cars. In recent years, various research papers proposed different loss functions used in case of biased data, sparse segmentation, and unbalanced dataset. In this paper, we introduce SemSegLoss, a python package consisting of some of the well-known loss functions widely used for image segmentation. It is developed with the intent to help researchers in the development of novel loss functions and perform an extensive set of experiments on model architectures for various applications. The ease-of-use and flexibility of the presented package have allowed reducing the development time and increased evaluation strategies of machine learning models for semantic segmentation. Furthermore, different applications that use image segmentation can use SemSegLoss because of the generality of its functions. This wide range of applications will lead to the development and growth of AI across all industries.	翻訳日:2021-06-13 13:56:50 公開日:2021-05-18
# (参考訳) 言語間低リソース音声認識のためのアダプタの活用 Exploiting Adapters for Cross-lingual Low-resource Speech Recognition ( http://arxiv.org/abs/2105.11905v1 ) ライセンス: CC BY 4.0	Wenxin Hou, Han Zhu, Yidong Wang, Jindong Wang, Tao Qin, Renjun Xu, Takahiro Shinozaki	(参考訳) 言語間適応は、複数のリッチリソース言語を利用して低リソースターゲット言語のためのモデルを構築する問題を解決することを目的としている。低リソース言語は訓練データに制限があるため、音声認識モデルは容易に過度に適合する。本稿では,パラメータ効率のよい言語間音声適応のための複数のアダプタの性能について検討する。アダプタを暗黙的に活用するこれまでのMetaAdapterに基づいて,アダプタから知識を明示的に学習するSimAdapterと呼ばれる新しいアルゴリズムを提案する。 metaadapterはメタラーニングを利用して、トレーニングデータからテスト言語に一般的な知識を転送します。 SimAdapterは、アダプタを使って微調整中にソース言語とターゲット言語の類似性を学ぶことを目的としている。我々はCommon Voiceデータセットで5つの低リソース言語について広範な実験を行った。その結果、メタアダプタとシムアダプタはWERを2.98%、2.55%減らすことができ、トレーニング可能なパラメータは2.5%と15.5%に留まった。さらに,これら2つのアルゴリズムを最大3.55%のwar削減で性能向上のために統合可能であることを示した。 Cross-lingual speech adaptation aims to solve the problem of leveraging multiple rich-resource languages to build models for a low-resource target language. Since the low-resource language has limited training data, speech recognition models can easily overfit. In this paper, we propose to use adapters to investigate the performance of multiple adapters for parameter-efficient cross-lingual speech adaptation. Based on our previous MetaAdapter that implicitly leverages adapters, we propose a novel algorithms called SimAdapter for explicitly learning knowledge from adapters. Our algorithm leverages adapters which can be easily integrated into the Transformer structure.MetaAdapter leverages meta-learning to transfer the general knowledge from training data to the test language. SimAdapter aims to learn the similarities between the source and target languages during fine-tuning using the adapters. We conduct extensive experiments on five-low-resource languages in Common Voice dataset. Results demonstrate that our MetaAdapter and SimAdapter methods can reduce WER by 2.98% and 2.55% with only 2.5% and 15.5% of trainable parameters compared to the strong full-model fine-tuning baseline. Moreover, we also show that these two novel algorithms can be integrated for better performance with up to 3.55% relative WER reduction.	翻訳日:2021-06-06 10:29:23 公開日:2021-05-18
# (参考訳) 高次微分方程式と非線形微分方程式を解くための効率的かつ効率的な方法:比列ネット An Effective and Efficient Method to Solve the High-Order and the Non-Linear Ordinary Differential Equations: the Ratio Net ( http://arxiv.org/abs/2105.11309v1 ) ライセンス: CC BY 4.0	Chen-Xin Qin, Ru-Hao Liu, Mao-Cai Li, Chi-Chun Zhou, and Yi-Liua	(参考訳) 高次および非線形常微分方程式を解く効率的かつ効率的な方法が提供される。その方法は比率ネットに基づいている。本手法を多項式法や多層パーセプトロンネットワーク法などの既存手法と比較することにより,比重ネットが良好な結果を与え,高い効率を示すことを示す。 An effective and efficient method that solves the high-order and the non-linear ordinary differential equations is provided. The method is based on the ratio net. By comparing the method with existing methods such as the polynomial based method and the multilayer perceptron network based method, we show that the ratio net gives good results and has higher efficiency.	翻訳日:2021-06-06 10:07:52 公開日:2021-05-18
# (参考訳) アプリケーションモデルの進化に関する一般的な理論 -- フルバージョン A General Theory for the Evolution of Application Models -- Full version ( http://arxiv.org/abs/2105.11308v1 ) ライセンス: CC BY 4.0	H. A. Proper and Th. P. van der Weide	(参考訳) 本稿では,情報システムの発展に焦点をあてる。まず、進化の概念の軽視が提供され、そのような進化の一般理論への最初の試みとなる。この理論は、概念レベルでの基盤となる情報構造、一方での進化、他方で情報構造とその集団に関する操作の説明と意味論を区別する。この理論の主な問題は、オブジェクトの型付け、型関連性、オブジェクトの識別である。これらの概念の観点から、進化の健全性に関するいくつかの公理を提案する。この一般的な理論では、基礎となるデータモデルはパラメータであり、オブジェクト・ロール・モデリングやオブジェクト指向技術を含む幅広いモデリング手法に適用できる。 In this article we focus on evolving information systems. First a delimitation of the concept of evolution is provided, resulting in a first attempt to a general theory for such evolutions. The theory makes a distinction between the underlying information structure at the conceptual level, its evolution on the one hand, and the description and semantics of operations on the information structure and its population on the other hand. Main issues within this theory are object typing, type relatedness and identification of objects. In terms of these concepts, we propose some axioms on the well-formedness of evolution. In this general theory, the underlying data model is a parameter, making the theory applicable for a wide range of modelling techniques, including object-role modelling and object oriented techniques.	翻訳日:2021-06-06 09:59:24 公開日:2021-05-18
# 単純なリッジペナルティで公平を達成する Achieving Fairness with a Simple Ridge Penalty ( http://arxiv.org/abs/2105.13817v1 ) ライセンス: Link先を確認	Marco Scutari and Manuel Proissl	(参考訳) ユーザ定義の公正度に基づく線形回帰モデルの推定は、2次制約を持つ非凸2次プログラミング最適化問題を解くことで達成できる。本研究では,尾根ペナルティによってユーザ定義の公平度を強制する,このタスクに対する代替的で柔軟なアプローチを提案する。提案手法は, より直感的に解釈できる回帰係数の推定値を生成すること, 数学的に単純で, 部分的に閉じた解を持つこと, 線形回帰を超えて拡張しやすいこと, の3つの制限に対処する。両手法を5つの異なるデータセットで実証的に評価し,提案手法が適合性の向上と予測精度の向上をもたらすとともに,所望の公平性レベルを達成するのに等しく有効であることを見出した。さらに,非凸2次アプローチの当初の実験的評価におけるバイアスの源泉を明らかにするとともに,提案手法を広範囲なモデルに拡張する方法について論じる。 Estimating a fair linear regression model subject to a user-defined level of fairness can be achieved by solving a non-convex quadratic programming optimisation problem with quadratic constraints. In this work we propose an alternative, more flexible approach to this task that enforces a user-defined level of fairness by means of a ridge penalty. Our proposal addresses three limitations of the former approach: it produces regression coefficient estimates that are more intuitive to interpret; it is mathematically simpler, with a solution that is partly in closed form; and it is easier to extend beyond linear regression. We evaluate both approaches empirically on five different data sets, and we find that our proposal provides better goodness of fit and better predictive accuracy while being equally effective at achieving the desired fairness level. In addition we highlight a source of bias in the original experimental evaluation of the non-convex quadratic approach, and we discuss how our proposal can be extended to a wide range of models.	翻訳日:2021-06-06 08:51:07 公開日:2021-05-18
# データからのダイナミクス学習のためのエネルギー保存ニューラルネットワークのベンチマーク Benchmarking Energy-Conserving Neural Networks for Learning Dynamics from Data ( http://arxiv.org/abs/2012.02334v4 ) ライセンス: Link先を確認	Yaofeng Desmond Zhong, Biswadip Dey, Amit Chakraborty	(参考訳) ここ数年、深層学習フレームワークに物理学に基づく帰納的バイアスを導入することへの関心が高まっている。特に、観測された時系列データからダイナミクスを学習するためにニューラルネットワークを使用しながら、エネルギー保存を強制する方法を模索する文献が増えている。本研究では,HNN,LNN,DeLaN,SymanODEN,CHNN,CLNNなど10種類のエネルギー保存型ニューラルネットワークモデルについて検討した。これらのモデルの背後にある理論をコンパクトに導出し、それらの類似性と相違を説明する。性能は4つの物理系で比較される。エネルギーベースコントローラの設計にこれらのエネルギー保存モデルを活用する可能性について指摘する。 The last few years have witnessed an increased interest in incorporating physics-informed inductive bias in deep learning frameworks. In particular, a growing volume of literature has been exploring ways to enforce energy conservation while using neural networks for learning dynamics from observed time-series data. In this work, we survey ten recently proposed energy-conserving neural network models, including HNN, LNN, DeLaN, SymODEN, CHNN, CLNN and their variants. We provide a compact derivation of the theory behind these models and explain their similarities and differences. Their performance are compared in 4 physical systems. We point out the possibility of leveraging some of these energy-conserving models to design energy-based controllers.	翻訳日:2021-05-23 15:01:30 公開日:2021-05-18
# (参考訳) 確率論的モデルによる薬物発見予測の不確実性の定量化 Quantifying sources of uncertainty in drug discovery predictions with probabilistic models ( http://arxiv.org/abs/2105.09474v1 ) ライセンス: CC BY-SA 4.0	Stanley E. Lazic, Dominic P. Williams	(参考訳) 予測の不確実性を知ることは、高価な投資判断を行う場合や患者の安全性が最重要となる場合に重要であるが、薬物発見における機械学習(ml)モデルは、一般的には単一の最良の見積もりを提供し、すべての不確実性源を無視する。したがって、これらのモデルからの予測は自信過剰であり、失敗する運命にある化合物がさらに開発されると、患者をリスクと廃棄物にすることができる。確率的予測モデル(PPM)は、データとモデルの両方に不確実性を取り入れ、予測の不確実性を表す予測値の分布を返す。 PPMは、いつ予測が不確実であるかをユーザーに知らせるだけでなく、これらのモデルからの直感的なアウトプットによって、コミュニケーションのリスクと意思決定がより簡単になる。多くの一般的な機械学習メソッドは、PPMまたはベイジアンアナログを持ち、PPMを現在のワークフローに簡単に適合させることができる。我々は毒性予測を例に挙げるが、薬物発見に使用される全ての予測モデルにも同様の原理を適用する。不確実性を無視した結果や、不確実性の原因となるPPMについても述べられている。我々は議論を広く非数学的な聴衆に公開することを目指している。方程式は数学的読者向けに具体化するために提供され(しかし、理解を失うことなくスキップできる)、計算研究者はコードを利用できる(https://github.com/stanlazic/ml_uncertainty_quantification)。 Knowing the uncertainty in a prediction is critical when making expensive investment decisions and when patient safety is paramount, but machine learning (ML) models in drug discovery typically provide only a single best estimate and ignore all sources of uncertainty. Predictions from these models may therefore be over-confident, which can put patients at risk and waste resources when compounds that are destined to fail are further developed. Probabilistic predictive models (PPMs) can incorporate uncertainty in both the data and model, and return a distribution of predicted values that represents the uncertainty in the prediction. PPMs not only let users know when predictions are uncertain, but the intuitive output from these models makes communicating risk easier and decision making better. Many popular machine learning methods have a PPM or Bayesian analogue, making PPMs easy to fit into current workflows. We use toxicity prediction as a running example, but the same principles apply for all prediction models used in drug discovery. The consequences of ignoring uncertainty and how PPMs account for uncertainty are also described. We aim to make the discussion accessible to a broad non-mathematical audience. Equations are provided to make ideas concrete for mathematical readers (but can be skipped without loss of understanding) and code is available for computational researchers (https://github.com/stanlazic/ML_uncertainty_quantification).	翻訳日:2021-05-22 01:53:30 公開日:2021-05-18
# (参考訳) 構造力学と振動の解法と反転のための深層学習 Deep learning for solution and inversion of structural mechanics and vibrations ( http://arxiv.org/abs/2105.09477v1 ) ライセンス: CC BY 4.0	Ehsan Haghighat, Ali Can Bekar, Erdogan Madenci, Ruben Juanes	(参考訳) ディープラーニングはここ数年でもっとも人気のある機械学習手法だ。本章では,構造力学および振動問題に対する深層学習と物理インフォームドニューラルネットワークの適用について述べる。演示問題は、データのデノイズ化、時間依存の常微分方程式と偏微分方程式の解、与えられたデータに対するシステムの応答を特徴づけることである。 Deep learning has been the most popular machine learning method in the last few years. In this chapter, we present the application of deep learning and physics-informed neural networks concerning structural mechanics and vibration problems. Demonstration problems involve de-noising data, solution to time-dependent ordinary and partial differential equations, and characterizing the system's response for a given data.	翻訳日:2021-05-22 01:52:22 公開日:2021-05-18
# (参考訳) 光gbmに基づく海洋水中の支配波周期の予測 A LightGBM based Forecasting of Dominant Wave Periods in Oceanic Waters ( http://arxiv.org/abs/2105.08721v1 ) ライセンス: CC BY 4.0	Pujan Pokhrel, Elias Ioup, Md Tamjidul Hoque, Mahdi Abdelguerfi and Julian Simeonov	(参考訳) 本稿では,海洋水中の優占波周期を予測するための光勾配ブースティング(lightgbm)を提案する。まず,CDIPブイから収集したデータを用いて,様々なデータフィルタリング手法を適用する。データフィルタリングにより、トレーニングと検証のために高品質なデータセットを得ることができる。次に, 波高, 周期, 歪, 曲率などの波面特性と, ブイの湿度, 圧力, 気温などの大気特性を抽出する。その後、hvブロッククロスバリデーション方式を用いてLightGBMとExtra Treesを使用するアルゴリズムを訓練し、最大30日間の波浪期間を予測する。 lightgbm の r2 スコアは 0.94, 0.94, 0.94 で、1 日先、15 日先、15 日先、予測 30 日先である。同様に、エクストラツリー(ET)は1日先、15日前、30日前、R2スコアが0.88、0.86、0.85である。テストデータセットの場合、lightgbmのr2スコアは 0.94, 0.94, 0.94で、1日前、15日前、30日前である。 ET の R2 スコアは 0.88, 0.86, 0.85 であり、1 日先、15 日前、30 日先、予測されている。同様のR2スコアとテストデータセットは、本論文で開発された機械学習モデルが堅牢であることを示している。 LightGBM アルゴリズムはテスト対象のウィンドウに対して ET よりも優れており、最終アルゴリズムとして扱われる。予測地平線が大きくなるにつれて,両手法の性能は著しく低下しない。同様に,提案手法は,本論文に含まれる数値的アプローチよりも優れている。 1日間の予測のために、提案アルゴリズムはsi, bias, cc, rmseを0.09, 0.00, 0.97, 1.78とし、欧州中距離気象予報センター(ecmwf)モデルの0.268, 0.40, 0.63, 2.18と比較した。 In this paper, we propose a Light Gradient Boosting (LightGBM) to forecast dominant wave periods in oceanic waters. First, we use the data collected from CDIP buoys and apply various data filtering methods. The data filtering methods allow us to obtain a high-quality dataset for training and validation purposes. We then extract various wave-based features like wave heights, periods, skewness, kurtosis, etc., and atmospheric features like humidity, pressure, and air temperature for the buoys. Afterward, we train algorithms that use LightGBM and Extra Trees through a hv-block cross-validation scheme to forecast dominant wave periods for up to 30 days ahead. LightGBM has the R2 score of 0.94, 0.94, and 0.94 for 1-day ahead, 15-day ahead, and 30-day ahead prediction. Similarly, Extra Trees (ET) has an R2 score of 0.88, 0.86, and 0.85 for 1-day ahead, 15-day ahead, and 30 day ahead prediction. In case of the test dataset, LightGBM has R2 score of 0.94, 0.94, and 0.94 for 1-day ahead, 15-day ahead and 30-day ahead prediction. ET has R2 score of 0.88, 0.86, and 0.85 for 1-day ahead, 15-day ahead, and 30-day ahead prediction. A similar R2 score for both training and the test dataset suggests that the machine learning models developed in this paper are robust. Since the LightGBM algorithm outperforms ET for all the windows tested, it is taken as the final algorithm. Note that the performance of both methods does not decrease significantly as the forecast horizon increases. Likewise, the proposed method outperforms the numerical approaches included in this paper in the test dataset. For 1 day ahead prediction, the proposed algorithm has SI, Bias, CC, and RMSE of 0.09, 0.00, 0.97, and 1.78 compared to 0.268, 0.40, 0.63, and 2.18 for the European Centre for Medium-range Weather Forecasts (ECMWF) model, which outperforms all the other methods in the test dataset.	翻訳日:2021-05-21 01:34:55 公開日:2021-05-18
# (参考訳) コンテキストアウェアセキュリティ監視のための知識グラフ上の機械学習 Machine learning on knowledge graphs for context-aware security monitoring ( http://arxiv.org/abs/2105.08741v1 ) ライセンス: CC BY 4.0	Josep Soler Garrido, Dominik Dold, Johannes Frank	(参考訳) 監視ツールが生成するデータ量の増加や、攻撃者が活動を隠す際の洗練度の高さから、機械学習技術は侵入検出の文脈で注目を集めている。しかし、既存の手法は、生成されたアラートの量と関連性の観点から、しばしば重要な制限を示す。近年、知識グラフはサイバーセキュリティ分野の応用を見つけており、人間の理解可能な語彙を使って複数のドメインからのデータをシームレスに統合する能力によって、これらの欠点のいくつかを緩和する可能性を示している。産業システムにおける異常な活動を評価するためのリンク予測手法を実験的に評価し, 侵入検知のための知識グラフへの機械学習の適用について検討する。初期教師なし訓練の後,提案手法は様々なシナリオにおいて直感的によく校正され,解釈可能な警告を生成することを示し,侵入検出目的の知識グラフに対するリレーショナル機械学習の潜在的メリットを示唆している。 Machine learning techniques are gaining attention in the context of intrusion detection due to the increasing amounts of data generated by monitoring tools, as well as the sophistication displayed by attackers in hiding their activity. However, existing methods often exhibit important limitations in terms of the quantity and relevance of the generated alerts. Recently, knowledge graphs are finding application in the cybersecurity domain, showing the potential to alleviate some of these drawbacks thanks to their ability to seamlessly integrate data from multiple domains using human-understandable vocabularies. We discuss the application of machine learning on knowledge graphs for intrusion detection and experimentally evaluate a link-prediction method for scoring anomalous activity in industrial systems. After initial unsupervised training, the proposed method is shown to produce intuitively well-calibrated and interpretable alerts in a diverse range of scenarios, hinting at the potential benefits of relational machine learning on knowledge graphs for intrusion detection purposes.	翻訳日:2021-05-21 01:20:37 公開日:2021-05-18
# (参考訳) 限られた露出とほぼ確実性で安全に行動することを学ぶ Learning to Act Safely with Limited Exposure and Almost Sure Certainty ( http://arxiv.org/abs/2105.08748v1 ) ライセンス: CC BY 4.0	Agustin Castellano, Hancheng Min, Juan Bazerque, Enrique Mallada	(参考訳) 本研究の目的は,未知の環境での安全行動の学習を,確率が保証されても,最適性,安全でない事象への曝露レベル,安全でない事象の最大検出時間とのトレードオフを行ない,無拘束の探索試験を必要とせずに達成できる,という概念を提唱することにある。この概念を2つの相補的な設定で説明する。本稿では,まず標準的マルチアームバンディット問題に着目し,不確実性の存在下での学習安全性の本質的なトレードオフについて検討する。十分な探索に関する軽度な仮定の下で、予測された)有限個のラウンドで全ての安全でないマシンを確実に検出するアルゴリズムを提供する。この分析はまた、環境を確保するのに必要なラウンド数と安全なマシンを捨てる確率とのトレードオフも明らかにしている。次に、ほぼ確実に制約のあるマルコフ決定プロセス(mdp)のための最適なポリシーを見つける問題を考える。その結果、(作用)値関数は、報酬プロセスとは独立に実現可能なポリシーを識別できるバリアベースの分解を満足していることが示される。この分解を用いて、有限個のステップでそのような安全でない状態-作用対を識別するバリア学習アルゴリズムを開発した。我々の分析は、安全でない行動を検出するために必要なMDPのタイムラグと、安全でない事象への暴露のレベルとのトレードオフをさらに強調している。シミュレーションは、上記のトレードオフをさらに説明し、安全性の制約が学習プロセスのさらなるスピードアップにつながることを示唆する。 This paper aims to put forward the concept that learning to take safe actions in unknown environments, even with probability one guarantees, can be achieved without the need for an unbounded number of exploratory trials, provided that one is willing to navigate trade-offs between optimality, level of exposure to unsafe events, and the maximum detection time of unsafe actions. We illustrate this concept in two complementary settings. We first focus on the canonical multi-armed bandit problem and seek to study the intrinsic trade-offs of learning safety in the presence of uncertainty. Under mild assumptions on sufficient exploration, we provide an algorithm that provably detects all unsafe machines in an (expected) finite number of rounds. The analysis also unveils a trade-off between the number of rounds needed to secure the environment and the probability of discarding safe machines. We then consider the problem of finding optimal policies for a Markov Decision Process (MDP) with almost sure constraints. We show that the (action) value function satisfies a barrier-based decomposition which allows for the identification of feasible policies independently of the reward process. Using this decomposition, we develop a Barrier-learning algorithm, that identifies such unsafe state-action pairs in a finite expected number of steps. Our analysis further highlights a trade-off between the time lag for the underlying MDP necessary to detect unsafe actions, and the level of exposure to unsafe events. Simulations corroborate our theoretical findings, further illustrating the aforementioned trade-offs, and suggesting that safety constraints can further speed up the learning process.	翻訳日:2021-05-21 01:11:26 公開日:2021-05-18
# (参考訳) 細粒度視覚分類のための自己教師あり学習 Self-Supervised Learning for Fine-Grained Visual Categorization ( http://arxiv.org/abs/2105.08788v1 ) ライセンス: CC BY 4.0	Muhammad Maaz, Hanoona Abdul Rasheed, Dhanalaxmi Gaddam	(参考訳) 自己教師付き学習(SSL)の最近の研究は、分類タスクの画像から有用な意味表現を学習する能力を示している。本研究では,FGVCにおけるSSLの有用性について検討した。 FGVCは、一般的なカテゴリ内で視覚的に類似したサブカテゴリのオブジェクトを区別することを目的としている。データセット内の小さなクラス間、しかし大きなクラス内バリエーションは、難しいタスクになります。このようなきめ細かいデータに対するアノテートラベルの制限はSSLの必要性を助長し、追加のアノテーションのコストを伴わずに学習を促進することができる。 cub-200-2011 データセットではトレーニング中のランダム作物増量と試験中の中心作物増量を利用して 86.36 % のtop-1 分類精度を達成している。本研究では,FGVCにおける各種プリテキストタスク,特に回転,プリテキスト不変表現学習(PIRL),デコンストラクションと構築学習(DCL)の有用性について検討する。補助的なタスクとしての回転は、グローバルな特徴を学習するモデルを促進し、微妙な詳細に焦点を絞ることから切り離す。ジグソーパッチを使用するPIRLは、差別的な地域に集中しようとするが、それらを正確にローカライズするのに苦労する。 DCLは局所的な識別特徴の学習に役立ち、87.41 %$ top-1 の精度でベースラインを上回ります。デコンストラクション学習はモデルを局所的なオブジェクト部分に集中させ、レコンストラクション学習は部分間の相関を学習するのに役立つ。我々の発見を推論するための広範な実験を行う。私たちのコードはhttps://github.com/mmaaz60/ssl_for_fgvcで利用可能です。 Recent research in self-supervised learning (SSL) has shown its capability in learning useful semantic representations from images for classification tasks. Through our work, we study the usefulness of SSL for Fine-Grained Visual Categorization (FGVC). FGVC aims to distinguish objects of visually similar sub categories within a general category. The small inter-class, but large intra-class variations within the dataset makes it a challenging task. The limited availability of annotated labels for such a fine-grained data encourages the need for SSL, where additional supervision can boost learning without the cost of extra annotations. Our baseline achieves $86.36\%$ top-1 classification accuracy on CUB-200-2011 dataset by utilizing random crop augmentation during training and center crop augmentation during testing. In this work, we explore the usefulness of various pretext tasks, specifically, rotation, pretext invariant representation learning (PIRL), and deconstruction and construction learning (DCL) for FGVC. Rotation as an auxiliary task promotes the model to learn global features, and diverts it from focusing on the subtle details. PIRL that uses jigsaw patches attempts to focus on discriminative local regions, but struggles to accurately localize them. DCL helps in learning local discriminating features and outperforms the baseline by achieving $87.41\%$ top-1 accuracy. The deconstruction learning forces the model to focus on local object parts, while reconstruction learning helps in learning the correlation between the parts. We perform extensive experiments to reason our findings. Our code is available at https://github.com/mmaaz60/ssl_for_fgvc.	翻訳日:2021-05-21 00:36:04 公開日:2021-05-18
# (参考訳) Corelated Adversarial Joint Disrepancy Adaptation Network Correlated Adversarial Joint Discrepancy Adaptation Network ( http://arxiv.org/abs/2105.08808v1 ) ライセンス: CC0 1.0	Youshan Zhang and Brian D. Davison	(参考訳) ドメイン適応は、あるドメインから別のドメインに知識を移す際にドメインシフトの問題を軽減することを目的としている。しかし、既存の作品の多くは、クラスラベルを考慮せずに限界的な特徴を抽出することに依存している。さらに、対象のドメインラベルを使ってパラメータをチューニングしながら、そのモデルをいわゆる教師なしドメイン適応(unsupervised domain adapt)と呼ぶメソッドもある。これらの問題に対処するために,2つの領域の合同不一致を最小限に抑え,相関ラベルを用いたパラメータの調整による競合性能を実現する,correlationd adversarial joint discrepancy adaptation network (cajnet) と呼ばれる新しい手法を提案する。ジョイント特徴を訓練することで、2つの領域間の限界分布と条件分布を調整できる。さらに,対象領域の強力な指標である確率に基づくtop-$\mathcal{k}$ correlationd label (\mathcal{k}$-label)を導入する。ベンチマークデータセットに対する大規模な実験により、最先端の分類精度が大幅に向上した。 Domain adaptation aims to mitigate the domain shift problem when transferring knowledge from one domain into another similar but different domain. However, most existing works rely on extracting marginal features without considering class labels. Moreover, some methods name their model as so-called unsupervised domain adaptation while tuning the parameters using the target domain label. To address these issues, we propose a novel approach called correlated adversarial joint discrepancy adaptation network (CAJNet), which minimizes the joint discrepancy of two domains and achieves competitive performance with tuning parameters using the correlated label. By training the joint features, we can align the marginal and conditional distributions between the two domains. In addition, we introduce a probability-based top-$\mathcal{K}$ correlated label ($\mathcal{K}$-label), which is a powerful indicator of the target domain and effective metric to tune parameters to aid predictions. Extensive experiments on benchmark datasets demonstrate significant improvements in classification accuracy over the state of the art.	翻訳日:2021-05-21 00:20:44 公開日:2021-05-18
# (参考訳) GloVe ワード埋め込みと補助語彙資源を用いた消費者健康語彙の充実のための自動化手法 An Automated Method to Enrich Consumer Health Vocabularies Using GloVe Word Embeddings and An Auxiliary Lexical Resource ( http://arxiv.org/abs/2105.08812v1 ) ライセンス: CC BY 4.0	Mohammed Ibrahim, Susan Gauch, Omar Salman, Mohammed Alqahatani	(参考訳) 背景: 明快な言語は、任意の当事者間のコミュニケーションを容易にする。在職者は、ドメインに共通する専門用語を理解できないため、専門家とのコミュニケーションが困難になる可能性がある。医療分野では、病状や治療の理解が不十分な、医学用語に精通する素人を見つけることは稀である。このギャップを埋めるために、いくつかの専門用語とオントロジーが作成され、平凡な医学用語を専門的な医学用語にマッピングする。目的: 提示された語彙の多くは手動または半自動で構築され、時間と人的労力に大きな投資を必要とし、結果としてこれらの語彙の成長が遅くなる。本稿では,任意の領域の語彙に適用できるという利点を持つ,在職者の語彙を豊かにするための自動手法を提案する。方法: 完全に自動化されたアプローチでは、マシンラーニング、特にglove(global vectors for word embeddeds)を使用して、ソーシャルメディアのヘルスケアプラットフォームから収集したコーパスに基づいて、consumer health vocabularies(chv)を拡張し、拡張します。提案手法は,WordNetのオントロジーから同義語や偽名を取り込むことにより,CHVをさらに改善する。 The basic GloVe and our novel algorithm in Using WordNet using two laymen datasets from the National Library of Medicine (NLM), Open-Access Consumer Health Vocabulary (OAC CHV) and MedlinePlus Healthcare Vocabulary。結果は、GloVeが48.44%のFスコアで新しいレイメン語を見つけることができたことを示している。さらに, 強化グローブアプローチは, 平均fスコアが61%, 相対的に25%向上したベーシックグローブよりも優れていた。さらに、強化されたGloVeはP<.001の2つの基底真理データセットに対して統計的に有意であった。 Background: Clear language makes communication easier between any two parties. A layman may have difficulty communicating with a professional due to not understanding the specialized terms common to the domain. In healthcare, it is rare to find a layman knowledgeable in medical terminology which can lead to poor understanding of their condition and/or treatment. To bridge this gap, several professional vocabularies and ontologies have been created to map laymen medical terms to professional medical terms and vice versa. Objective: Many of the presented vocabularies are built manually or semi-automatically requiring large investments of time and human effort and consequently the slow growth of these vocabularies. In this paper, we present an automatic method to enrich laymen's vocabularies that has the benefit of being able to be applied to vocabularies in any domain. Methods: Our entirely automatic approach uses machine learning, specifically Global Vectors for Word Embeddings (GloVe), on a corpus collected from a social media healthcare platform to extend and enhance consumer health vocabularies (CHV). Our approach further improves the CHV by incorporating synonyms and hyponyms from the WordNet ontology. The basic GloVe and our novel algorithms incorporating WordNet were evaluated using two laymen datasets from the National Library of Medicine (NLM), Open-Access Consumer Health Vocabulary (OAC CHV) and MedlinePlus Healthcare Vocabulary. Results: The results show that GloVe was able to find new laymen terms with an F-score of 48.44%. Furthermore, our enhanced GloVe approach outperformed basic GloVe with an average F-score of 61%, a relative improvement of 25%. Furthermore, the enhanced GloVe showed a statistical significance over the two ground truth datasets with P<.001.	翻訳日:2021-05-20 23:55:12 公開日:2021-05-18
# (参考訳) リモート生理計測予測による映像系列からの非接触痛認識 Non-contact Pain Recognition from Video Sequences with Remote Physiological Measurements Prediction ( http://arxiv.org/abs/2105.08822v1 ) ライセンス: CC BY 4.0	Ruijing Yang, Ziyu Guan, Zitong Yu, Guoying Zhao, Xiaoyi Feng, Jinye Peng	(参考訳) 自動鎮痛は診断と治療において最重要である。既存の作品は、顔の外観の変化の評価、生理的な手がかりの活用、マルチモーダルな方法でそれらを融合させるという3つのカテゴリに分類される。しかし,(1)外見の変化は主観的痛み認知を妨げる主観的要因の影響を受けやすい。また,表情に基づくアプローチでは,表現のモデル化に重要な長期的空間的-時間的依存性を無視し,(2)不便で不快な人体にセンサを装着することで生理学的手がかりを得る。本稿では,出現変化と生理的手がかりの両方を非接触的にエンコードして痛み認識を行うマルチタスク学習フレームワークを提案する。このフレームワークは、学習された外観表現に対する注意機構を通じて局所的および長期的依存性の両方を捉えることができ、補助タスクでビデオから復元された生理的手がかり(remote photoplethysmography, rppg)によりさらに強化される。このフレームワークはrPPGにより強化された時空間注意ネットワーク(rSTAN)と呼ばれ、一般に利用可能な痛みデータベース上での非接触痛認識の最先端性能を確立することができる。これはrppg予測を非接触自動痛み認識の補助タスクとして使用できることを示す。 Automatic pain recognition is paramount for medical diagnosis and treatment. The existing works fall into three categories: assessing facial appearance changes, exploiting physiological cues, or fusing them in a multi-modal manner. However, (1) appearance changes are easily affected by subjective factors which impedes objective pain recognition. Besides, the appearance-based approaches ignore long-range spatial-temporal dependencies that are important for modeling expressions over time; (2) the physiological cues are obtained by attaching sensors on human body, which is inconvenient and uncomfortable. In this paper, we present a novel multi-task learning framework which encodes both appearance changes and physiological cues in a non-contact manner for pain recognition. The framework is able to capture both local and long-range dependencies via the proposed attention mechanism for the learned appearance representations, which are further enriched by temporally attended physiological cues (remote photoplethysmography, rPPG) that are recovered from videos in the auxiliary task. This framework is dubbed rPPG-enriched Spatio-Temporal Attention Network (rSTAN) and allows us to establish the state-of-the-art performance of non-contact pain recognition on publicly available pain databases. It demonstrates that rPPG predictions can be used as an auxiliary task to facilitate non-contact automatic pain recognition.	翻訳日:2021-05-20 23:37:53 公開日:2021-05-18
# (参考訳) Fusion-DHL:WiFi, IMU, Floorplan Fusion for Dense History of Locations in Indoor Environments Fusion-DHL: WiFi, IMU, and Floorplan Fusion for Dense History of Locations in Indoor Environments ( http://arxiv.org/abs/2105.08837v1 ) ライセンス: CC BY 4.0	Sachini Herath, Saghar Irandoust, Bowen Chen, Yiming Qian, Pyojin Kim, Yasutaka Furukawa	(参考訳) 本稿では,WiFi,IMU,フロアプラン情報を融合して屋内環境における正確な位置履歴を推定するマルチモーダルセンサ融合アルゴリズムを提案する。このアルゴリズムは,1)IMUセンサデータから相対的な運動軌跡を推定する慣性ナビゲーションアルゴリズム,2)位置制約を取得し,その軌跡をジオローカライズする業界におけるWiFiベースのローカライゼーションAPI,3)フロアプランと整合する位置履歴を洗練するための畳み込みニューラルネットワークを使用する。 4つの大学ビルと3つのショッピングモールで、wi-fi、imu、フロアプランデータを使った新しいデータセットを構築するためのデータ取得アプリを開発した。定性的かつ定量的な評価により,提案システムは現在の標準よりも2倍の精度と数桁の高密度な位置履歴を生成でき,エネルギー消費は最小限であることが示された。私たちはコード、データ、モデルを公開します。 The paper proposes a multi-modal sensor fusion algorithm that fuses WiFi, IMU, and floorplan information to infer an accurate and dense location history in indoor environments. The algorithm uses 1) an inertial navigation algorithm to estimate a relative motion trajectory from IMU sensor data; 2) a WiFi-based localization API in industry to obtain positional constraints and geo-localize the trajectory; and 3) a convolutional neural network to refine the location history to be consistent with the floorplan. We have developed a data acquisition app to build a new dataset with WiFi, IMU, and floorplan data with ground-truth positions at 4 university buildings and 3 shopping malls. Our qualitative and quantitative evaluations demonstrate that the proposed system is able to produce twice as accurate and a few orders of magnitude denser location history than the current standard, while requiring minimal additional energy consumption. We will publicly share our code, data and models.	翻訳日:2021-05-20 23:22:55 公開日:2021-05-18
# (参考訳) シーケンスからシーケンスタスクへの表現学習:マルチフィルタガウス混合オートエンコーダ Representation Learning in Sequence to Sequence Tasks: Multi-filter Gaussian Mixture Autoencoder ( http://arxiv.org/abs/2105.08840v1 ) ライセンス: CC BY 4.0	Yunhao Yang, Zhaokun Xue	(参考訳) 文の不均一性は、機械翻訳のようなシーケンスタスクに連続して存在する。大きく異なる意味や文法構造を持つ文は、ネットワークを訓練しながら収束の困難を増す可能性がある。本稿では,シーケンスタスクにおける不均一性を解決するためのモデルを提案する。 Multi-filter Gaussian Mixture Autoencoder (MGMAE) はオートエンコーダを用いて入力の表現を学習する。表現はエンコーダからの出力であり、その次元がエンコーダの隠れた次元である潜在空間にある。潜在空間におけるトレーニングデータの表現はガウス混合の訓練に使用される。潜在空間表現はガウス分布のいくつかの混合に分割される。フィルタ(デコーダ)は、具体的にはガウス分布の1つに適合するように調整される。各ガウシアンはこのガウシアン内の不均一性の原因となるように1つのフィルターに対応している。これにより、トレーニングデータの均一性を解消できる。ジオクエリデータセットと英語とフランス語の翻訳について比較実験を行った。実験の結果,従来のエンコーダ・デコーダモデルと比較すると,機械翻訳や質問応答といったシーケンスタスクの処理性能が向上することがわかった。 Heterogeneity of sentences exists in sequence to sequence tasks such as machine translation. Sentences with largely varied meanings or grammatical structures may increase the difficulty of convergence while training the network. In this paper, we introduce a model to resolve the heterogeneity in the sequence to sequence task. The Multi-filter Gaussian Mixture Autoencoder (MGMAE) utilizes an autoencoder to learn the representations of the inputs. The representations are the outputs from the encoder, lying in the latent space whose dimension is the hidden dimension of the encoder. The representations of training data in the latent space are used to train Gaussian mixtures. The latent space representations are divided into several mixtures of Gaussian distributions. A filter (decoder) is tuned to fit the data in one of the Gaussian distributions specifically. Each Gaussian is corresponding to one filter so that the filter is responsible for the heterogeneity within this Gaussian. Thus the heterogeneity of the training data can be resolved. Comparative experiments are conducted on the Geo-query dataset and English-French translation. Our experiments show that compares to the traditional encoder-decoder model, this network achieves better performance on sequence to sequence tasks such as machine translation and question answering.	翻訳日:2021-05-20 23:09:47 公開日:2021-05-18
# (参考訳) Gym-ANM:研究開発における電力系統管理のための強化学習を活用したオープンソースソフトウェア Gym-ANM: Open-source software to leverage reinforcement learning for power system management in research and education ( http://arxiv.org/abs/2105.08846v1 ) ライセンス: CC BY 4.0	Robin Henry and Damien Ernst	(参考訳) Gym-ANMは、電気ネットワークにおけるアクティブネットワーク管理(ANM)タスクをモデル化する強化学習(RL)環境の設計を容易にするPythonパッケージである。ここでは、新しい環境の実装方法と、既存の環境と相互作用するコードを書く方法を説明する。また、ANM6-Easyは、一般的なANM課題を強調するために設計された環境である。最後に,sm-anmが科学コミュニティに与える影響について,研究と教育の両面で検討する。このパッケージは、将来のエネルギーシステムを制御するアルゴリズムの探索において、電力システムとRLコミュニティの協力を促進することを願っている。 Gym-ANM is a Python package that facilitates the design of reinforcement learning (RL) environments that model active network management (ANM) tasks in electricity networks. Here, we describe how to implement new environments and how to write code to interact with pre-existing ones. We also provide an overview of ANM6-Easy, an environment designed to highlight common ANM challenges. Finally, we discuss the potential impact of Gym-ANM on the scientific community, both in terms of research and education. We hope this package will facilitate collaboration between the power system and RL communities in the search for algorithms to control future energy systems.	翻訳日:2021-05-20 23:02:30 公開日:2021-05-18
# (参考訳) AI教育における構造的不平等の克服 Confronting Structural Inequities in AI for Education ( http://arxiv.org/abs/2105.08847v1 ) ライセンス: CC BY 4.0	Michael Madaio, Su Lin Blodgett, Elijah Mayfield, Ezekiel Dixon-Rom\'an	(参考訳) 教育技術と、それらが展開される教育制度は、何が重要なのか、学習者がどのように学ぶべきかについて、特にイデオロギーを実践する。人工知能技術(教育など)が辺境化社会に不平等な結果をもたらしたため、AIシステムの異なる影響を評価し緩和するための様々なアプローチが開発されている。しかし,本稿では,AIモデルの性能格差に基づく公平性評価の主流パラダイムが,教育用AIシステム(re)が生み出す構造的不平等に直面するには不十分である,と論じる。我々は、批判理論と黒人フェミニスト奨学金によって知らされる構造的不正のレンズを描き、広く研究され広く研究されている教育AIシステムのカテゴリを批判的に尋問し、どのように教育AI技術が、モデルの性能に関わらず、構造的不正と不平等の歴史的正当性に束縛されているかを実証する。私たちは、教育AI研究のより公平な未来に向けて、代替のビジョンに近づきます。 Educational technologies, and the systems of schooling in which they are deployed, enact particular ideologies about what is important to know and how learners should learn. As artificial intelligence technologies -- in education and beyond -- have led to inequitable outcomes for marginalized communities, various approaches have been developed to evaluate and mitigate AI systems' disparate impact. However, we argue in this paper that the dominant paradigm of evaluating fairness on the basis of performance disparities in AI models is inadequate for confronting the structural inequities that educational AI systems (re)produce. We draw on a lens of structural injustice informed by critical theory and Black feminist scholarship to critically interrogate several widely-studied and widely-adopted categories of educational AI systems and demonstrate how educational AI technologies are bound up in and reproduce historical legacies of structural injustice and inequity, regardless of the parity of their models' performance. We close with alternative visions for a more equitable future for educational AI research.	翻訳日:2021-05-20 22:54:38 公開日:2021-05-18
# (参考訳) 効果的な注意は解釈可能性に光を当てる Effective Attention Sheds Light On Interpretability ( http://arxiv.org/abs/2105.08855v1 ) ライセンス: CC BY 4.0	Kaiser Sun and Ana Marasovi\'c	(参考訳) 変圧器自己注意サブレイヤの注意行列は、2つの成分に確実に分解することができ、その1つ(有効注意)のみがモデル出力に寄与する。これにより、効果的な注意の可視化が標準的な注意の解釈と異なる結論を与えるかどうかを問うことができる。グルータスクとbertのサブセットを使用して、2つのアテンション行列を比較する解析を行い、それらの解釈が異なることを示す。効果的な注意力は、セパレータトークンのような言語モデリング事前訓練に関連する特徴とは無関係であり、エンドタスクを解くためにモデルが捉えた言語的特徴を説明する可能性がある。この違いを考慮に入れると,設計によって出力されるモデルとより関連があるため,トランスフォーマーの挙動の研究に効果的に注意を払うことを推奨する。 An attention matrix of a transformer self-attention sublayer can provably be decomposed into two components and only one of them (effective attention) contributes to the model output. This leads us to ask whether visualizing effective attention gives different conclusions than interpretation of standard attention. Using a subset of the GLUE tasks and BERT, we carry out an analysis to compare the two attention matrices, and show that their interpretations differ. Effective attention is less associated with the features related to the language modeling pretraining such as the separator token, and it has more potential to illustrate linguistic features captured by the model for solving the end-task. Given the found differences, we recommend using effective attention for studying a transformer's behavior since it is more pertinent to the model output by design.	翻訳日:2021-05-20 22:17:38 公開日:2021-05-18
# Value Functionは必要なものすべて: ハイドプラットフォームのための統一学習フレームワーク Value Function is All You Need: A Unified Learning Framework for Ride Hailing Platforms ( http://arxiv.org/abs/2105.08791v1 ) ライセンス: Link先を確認	Xiaocheng Tang, Fan Zhang, Zhiwei (Tony) Qin, Yansheng Wang, Dingyuan Shi, Bingchen Song, Yongxin Tong, Hongtu Zhu, Jieping Ye	(参考訳) DiDi、Uber、Lyftなどの大型配車プラットフォームは、都市内の数万台の車両を1日中数百万の乗車要求に接続し、注文の発送と車両配置のタスクを通じて、交通効率を向上させるための素晴らしい約束を提供する。しかし、既存の研究では2つのタスクが単純化されており、これら2つの間の複雑な相互作用、供給と需要のリアルタイムな変動、そして問題の大規模な性質による必要な調整にほとんど対応していない。本稿では,両タスクに取り組むための統合価値ベース動的学習フレームワーク(v1d3)を提案する。フレームワークの中心にはグローバルな共有バリュー関数があり、リアルタイムプラットフォームトランザクションから生成されたオンラインエクスペリエンスを使用して継続的に更新される。サンプル効率とロバスト性を改善するために,高速オンライン学習と,豊富な履歴ドライバ軌道データを活用する大規模なオフライン学習手法を組み合わせた,新しい定期的なアンサンブル手法を提案する。これにより、提案するフレームワークは、非常にダイナミックな環境に迅速に適応し、繰り返しパターンに頑健に一般化し、管理車両の人口間の暗黙的な調整を促進することができる。実世界のデータセットに基づく広範な実験では、両タスクで最近提案された他の方法よりも大幅に改善されている。特に、v1d3は、kdd cup 2020 rlコンペティションにおけるディスパッチとリプレースの両方のトラックの勝者を上回り、ドライバー総収入とユーザエクスペリエンス関連の指標の両方を改善する最新結果を達成している。 Large ride-hailing platforms, such as DiDi, Uber and Lyft, connect tens of thousands of vehicles in a city to millions of ride demands throughout the day, providing great promises for improving transportation efficiency through the tasks of order dispatching and vehicle repositioning. Existing studies, however, usually consider the two tasks in simplified settings that hardly address the complex interactions between the two, the real-time fluctuations between supply and demand, and the necessary coordinations due to the large-scale nature of the problem. In this paper we propose a unified value-based dynamic learning framework (V1D3) for tackling both tasks. At the center of the framework is a globally shared value function that is updated continuously using online experiences generated from real-time platform transactions. To improve the sample-efficiency and the robustness, we further propose a novel periodic ensemble method combining the fast online learning with a large-scale offline training scheme that leverages the abundant historical driver trajectory data. This allows the proposed framework to adapt quickly to the highly dynamic environment, to generalize robustly to recurrent patterns and to drive implicit coordinations among the population of managed vehicles. Extensive experiments based on real-world datasets show considerably improvements over other recently proposed methods on both tasks. Particularly, V1D3 outperforms the first prize winners of both dispatching and repositioning tracks in the KDD Cup 2020 RL competition, achieving state-of-the-art results on improving both total driver income and user experience related metrics.	翻訳日:2021-05-20 14:01:32 公開日:2021-05-18
# Pathdreamer: 室内ナビゲーションのための世界モデル Pathdreamer: A World Model for Indoor Navigation ( http://arxiv.org/abs/2105.08756v1 ) ライセンス: Link先を確認	Jing Yu Koh, Honglak Lee, Yinfei Yang, Jason Baldridge, Peter Anderson	(参考訳) 不慣れな建物をナビゲートする人々は、無数の視覚的、空間的、セマンティックな手がかりを利用して、ナビゲーション目標を効率的に達成します。同様の能力を持つ計算エージェントの装備に向けて,新しい屋内環境を探索するエージェントの視覚的世界モデルPathdreamerを紹介した。ひとつ以上の視覚的な観察から、pathdreamerは、訓練中に見えない建物において、訪問されていない視点に対して、おそらく高解像度の360度視覚観察(rgb、セマンティックセグメンテーション、深さ)を生成する。不確実性の高い地域では(例えば) 隅々を予測し、目に見えない部屋の内容を想像すると、Pathdreamerは多様なシーンを予測でき、エージェントは与えられた軌道に対して複数の現実的な結果をサンプリングすることができる。 Pathdreamerは視覚・言語ナビゲーション(VLN)の下流タスクにおいて、人間の環境に関する有用な視覚的・空間的・意味的な知識を符号化する。具体的には、Pathdreamerの今後の計画が、環境の観測されていない部分からの実際の観測に先んじることの利点の半分をもたらすことを示す。 pathdreamerは、特定のオブジェクトやvlnへのナビゲートなど、具体化されたナビゲーションタスクに挑戦するためのモデルベースのアプローチのアンロックを支援することを願っている。 People navigating in unfamiliar buildings take advantage of myriad visual, spatial and semantic cues to efficiently achieve their navigation goals. Towards equipping computational agents with similar capabilities, we introduce Pathdreamer, a visual world model for agents navigating in novel indoor environments. Given one or more previous visual observations, Pathdreamer generates plausible high-resolution 360 visual observations (RGB, semantic segmentation and depth) for viewpoints that have not been visited, in buildings not seen during training. In regions of high uncertainty (e.g. predicting around corners, imagining the contents of an unseen room), Pathdreamer can predict diverse scenes, allowing an agent to sample multiple realistic outcomes for a given trajectory. We demonstrate that Pathdreamer encodes useful and accessible visual, spatial and semantic knowledge about human environments by using it in the downstream task of Vision-and-Language Navigation (VLN). Specifically, we show that planning ahead with Pathdreamer brings about half the benefit of looking ahead at actual observations from unobserved parts of the environment. We hope that Pathdreamer will help unlock model-based approaches to challenging embodied navigation tasks such as navigating to specified objects and VLN.	翻訳日:2021-05-20 14:00:14 公開日:2021-05-18
# 異常検出のためのマスク付きコントラスト学習 Masked Contrastive Learning for Anomaly Detection ( http://arxiv.org/abs/2105.08793v1 ) ライセンス: Link先を確認	Hyunsoo Cho, Jinseok Seol, Sang-goo Lee	(参考訳) 異常検出は、安全クリティカルなソフトウェアシステムにおける基本的な側面の一つであるが、長年の課題である。複雑化を緩和し、効率性を示すために多くの作品が提案されている。特に,ラベルを付加せずに多彩な表現を学習できることから,自己指導型学習手法が関心を喚起している。自己指導型学習戦術の中で、コントラスト学習は、異常検出を含む様々な分野において、その優位性を検証するための特定の枠組みである。しかし、対照的な学習の主な目的は、ラベルなしでタスクに依存しない特徴を学ぶことである。本稿では,マスク付きコントラスト学習という,タスク固有のコントラスト学習の変種を提案する。さらに,補助的な自己監督タスクを通じて学習した能力を活用することで,パフォーマンスをさらに向上する,自己組織化推論と呼ばれる新しい推論手法を提案する。モデルを組み合わせることで、さまざまなベンチマークデータセットにおいて、従来の最先端手法よりも大きなマージンを達成できます。 Detecting anomalies is one fundamental aspect of a safety-critical software system, however, it remains a long-standing problem. Numerous branches of works have been proposed to alleviate the complication and have demonstrated their efficiencies. In particular, self-supervised learning based methods are spurring interest due to their capability of learning diverse representations without additional labels. Among self-supervised learning tactics, contrastive learning is one specific framework validating their superiority in various fields, including anomaly detection. However, the primary objective of contrastive learning is to learn task-agnostic features without any labels, which is not entirely suited to discern anomalies. In this paper, we propose a task-specific variant of contrastive learning named masked contrastive learning, which is more befitted for anomaly detection. Moreover, we propose a new inference method dubbed self-ensemble inference that further boosts performance by leveraging the ability learned through auxiliary self-supervision tasks. By combining our models, we can outperform previous state-of-the-art methods by a significant margin on various benchmark datasets.	翻訳日:2021-05-20 13:59:50 公開日:2021-05-18
# LCP-RIT at SemEval-2021 Task 1: Exploring Linguistic Features for Lexical Complexity Prediction LCP-RIT at SemEval-2021 Task 1: Exploring Linguistic Features for Lexical Complexity Prediction ( http://arxiv.org/abs/2105.08780v1 ) ライセンス: Link先を確認	Abhinandan Desai and Kai North and Marcos Zampieri and Christopher M. Homan	(参考訳) 本稿では,チームLCP-RITによるSemEval-2021 Task 1: Lexical Complexity Prediction (LCP)の提出について述べる。タスクオーガナイザは、コンプレックスの拡張バージョン(shardlow et al., 2020)を参加者に提供した。コンプレックスは英語のマルチドメインデータセットで、コンテキスト内の単語が5点のlikertスケールを使用して複雑さに対して注釈付けされたものだ。我々のシステムはロジスティック回帰と幅広い言語的特徴(例)を用いる。心理言語学的な特徴、n-gram、単語頻度、posタグ) このデータセットにおける単一単語の複雑さを予測する。言語特性の違いが分類性能に与える影響を分析し,平均絶対誤差,平均二乗誤差,ピアソン相関,スピアマン相関の観点から評価した。 This paper describes team LCP-RIT's submission to the SemEval-2021 Task 1: Lexical Complexity Prediction (LCP). The task organizers provided participants with an augmented version of CompLex (Shardlow et al., 2020), an English multi-domain dataset in which words in context were annotated with respect to their complexity using a five point Likert scale. Our system uses logistic regression and a wide range of linguistic features (e.g. psycholinguistic features, n-grams, word frequency, POS tags) to predict the complexity of single words in this dataset. We analyze the impact of different linguistic features in the classification performance and we evaluate the results in terms of mean absolute error, mean squared error, Pearson correlation, and Spearman correlation.	翻訳日:2021-05-20 13:54:33 公開日:2021-05-18
# 合成符号ミキシングによる機械翻訳への英語テキスト変換器の探索 Exploring Text-to-Text Transformers for English to Hinglish Machine Translation with Synthetic Code-Mixing ( http://arxiv.org/abs/2105.08807v1 ) ライセンス: Link先を確認	Ganesh Jawahar, El Moatez Billah Nagoudi, Muhammad Abdul-Mageed, Laks V.S. Lakshmanan	(参考訳) 単言語対とコード混合言語対の翻訳問題に焦点をあてたモデルについて述べる。具体的には、モノリンガルな英語のテキストをHinglish(コードミキシングされたヒンディー語と英語)に変換する幅広いモデルを提供しています。最近の事前学習された言語モデルの成功を考えると、我々は2つのトランスフォーマベースのエンコーダ-デコーダモデル(すなわちmt5とmbart)の有用性をテストし、両方がうまく機能するようにした。また,コード混合のための学習データのpaucityを考慮し,バイリンガル分散表現からコード混合テキストを生成するための依存性のない手法を提案し,言語モデルの性能向上に活用する。特に、この追加データを用いて、まず合成データ上で言語モデルを微調整し、次にゴールドコード混合データを用いて、カリキュラム学習アプローチを採用する。単純ではあるが,本手法は様々な条件下で,いくつかの標準手法(逆変換法,同値制約理論に基づく方法)と競合する(場合によってはさらに優れている)ことが判明した。本研究は,mT5モデルをカリキュラム学習手順に従って微調整し,最高の翻訳性能(12.67BLEU)を達成することを示す。私たちのモデルは、英語と英語の公式共有タスク全体のランキングで第一位です。 We describe models focused at the understudied problem of translating between monolingual and code-mixed language pairs. More specifically, we offer a wide range of models that convert monolingual English text into Hinglish (code-mixed Hindi and English). Given the recent success of pretrained language models, we also test the utility of two recent Transformer-based encoder-decoder models (i.e., mT5 and mBART) on the task finding both to work well. Given the paucity of training data for code-mixing, we also propose a dependency-free method for generating code-mixed texts from bilingual distributed representations that we exploit for improving language model performance. In particular, armed with this additional data, we adopt a curriculum learning approach where we first finetune the language models on synthetic data then on gold code-mixed data. We find that, although simple, our synthetic code-mixing method is competitive with (and in some cases is even superior to) several standard methods (backtranslation, method based on equivalence constraint theory) under a diverse set of conditions. Our work shows that the mT5 model, finetuned following the curriculum learning procedure, achieves best translation performance (12.67 BLEU). Our models place first in the overall ranking of the English-Hinglish official shared task.	翻訳日:2021-05-20 13:54:17 公開日:2021-05-18
# 限られたデータからの顔認識における画像強調の有効性の分析 Analyzing the effectiveness of image augmentations for face recognition from limited data ( http://arxiv.org/abs/2105.08796v1 ) ライセンス: Link先を確認	Aleksei Zhuchkov	(参考訳) 本研究は,限られたデータから顔認識問題に対する画像強調の効率を解析する。拡張のための基本的な操作,生成方法,およびそれらの組み合わせを検討した。以上の結果より, 顔認証システムの品質は向上し, 生成的アプローチと基本手法の組み合わせは, 他の試験手法よりも優れていたことが示唆された。 This work presents an analysis of the efficiency of image augmentations for the face recognition problem from limited data. We considered basic manipulations, generative methods, and their combinations for augmentations. Our results show that augmentations, in general, can considerably improve the quality of face recognition systems and the combination of generative and basic approaches performs better than the other tested techniques.	翻訳日:2021-05-20 13:50:03 公開日:2021-05-18
# クロスアクションアテンションを用いたマルチパーソン極端運動予測 Multi-Person Extreme Motion Prediction with Cross-Interaction Attention ( http://arxiv.org/abs/2105.08825v1 ) ライセンス: Link先を確認	Wen Guo, Xiaoyu Bie, Xavier Alameda-Pineda, Francesc Moreno	(参考訳) 人間の動き予測は、過去の3D骨格の連続から将来の人間のポーズを予測することを目的としている。この問題は近年注目されているが、ほとんどの場合単独の人間に対処されている。本稿では,人間による協調作業を含む新しい視点から,この問題を考察する。本システムでは,2人の対話者を対象とした2つの過去の骨格列を入力とし,それぞれの動作を予測することを目的とする。本研究では,両者の歴史的情報を活用し,その空間的・時間的距離に拘わらず,自己ポーズと他者のポーズ間の相互依存を予測できる新たな相互行為注意機構を考案する。このような対話的な状況をトレーニングするデータセットが存在しないため、アクロバティックを行うプロのダンサーによる新しいラボベースの個人インタラクションデータセットであるExPI(Extreme Pose Interaction)をキャプチャした。 ExPIには、30kフレームと60kインスタンスの115のシーケンスと、アノテーション付きの3Dボディポーズと形状が含まれている。このデータセット上でのクロスインタラクションネットワークを徹底的に評価し、短期予測と長期予測の両方において、各人が独立的に推論するベースラインを一貫して上回っています。私たちは、データセットとトレイン/テストの分割を共同でリリースして、このトピックに関する将来の研究を促進する予定です。 Human motion prediction aims to forecast future human poses given a sequence of past 3D skeletons. While this problem has recently received increasing attention, it has mostly been tackled for single humans in isolation. In this paper we explore this problem from a novel perspective, involving humans performing collaborative tasks. We assume that the input of our system are two sequences of past skeletons for two interacting persons, and we aim to predict the future motion for each of them. For this purpose, we devise a novel cross interaction attention mechanism that exploits historical information of both persons and learns to predict cross dependencies between self poses and the poses of the other person in spite of their spatial or temporal distance. Since no dataset to train such interactive situations is available, we have captured ExPI (Extreme Pose Interaction), a new lab-based person interaction dataset of professional dancers performing acrobatics. ExPI contains 115 sequences with 30k frames and 60k instances with annotated 3D body poses and shapes. We thoroughly evaluate our cross-interaction network on this dataset and show that both in short-term and long-term predictions, it consistently outperforms baselines that independently reason for each person. We plan to release our code jointly with the dataset and the train/test splits to spur future research on the topic.	翻訳日:2021-05-20 13:49:57 公開日:2021-05-18
# 確率ネットワークとキューにおける学習と情報 Learning and Information in Stochastic Networks and Queues ( http://arxiv.org/abs/2105.08769v1 ) ライセンス: Link先を確認	Neil Walton, Kuang Xu	(参考訳) 待ち行列システムの安定性と最適化における情報と学習の役割を概観する。近年,意思決定における情報の役割の増大に支えられた待ち行列システムに,教師あり学習,盗賊学習,強化学習の技法が応用されている。待ち行列システムへのこれらの領域の適用を合理化するための観測結果と新たな結果を提案する。我々は、MaxWeight と BackPressure ポリシーが Blackwell の Approachability Theorem の応用であることを証明する。これは待ち行列理論の結果と逆学習を結びつける。次に,サービスパラメータ推定のための統計的学習の要件について論じる。例として、サービス分類にパーセプトロンアルゴリズムを適用する場合、キューサイズの後悔がいかに制限されるかを示す。次に,意思決定における状態情報の役割について述べる。ここでは, てんかん情報(不確定なパラメータの情報)と失語症情報(不確定な状態の情報)の役割を対比する。最後に,強化学習と待ち行列理論の最近の進歩を概観し,現在の研究課題について考察する。 We review the role of information and learning in the stability and optimization of queueing systems. In recent years, techniques from supervised learning, bandit learning and reinforcement learning have been applied to queueing systems supported by increasing role of information in decision making. We present observations and new results that help rationalize the application of these areas to queueing systems. We prove that the MaxWeight and BackPressure policies are an application of Blackwell's Approachability Theorem. This connects queueing theoretic results with adversarial learning. We then discuss the requirements of statistical learning for service parameter estimation. As an example, we show how queue size regret can be bounded when applying a perceptron algorithm to classify service. Next, we discuss the role of state information in improved decision making. Here we contrast the roles of epistemic information (information on uncertain parameters) and aleatoric information (information on an uncertain state). Finally we review recent advances in the theory of reinforcement learning and queueing, as well as, provide discussion on current research challenges.	翻訳日:2021-05-20 13:45:58 公開日:2021-05-18
# タスク非定常性追跡によるメタ強化学習 Meta-Reinforcement Learning by Tracking Task Non-stationarity ( http://arxiv.org/abs/2105.08834v1 ) ライセンス: Link先を確認	Riccardo Poiani, Andrea Tirinzoni, Marcello Restelli	(参考訳) 多くの現実世界のドメインは、エージェントの目標と環境力学に影響を与える構造化された非定常性の対象である。メタ強化学習(rl)は、関連するタスクに迅速に適応するトレーニングエージェントに成功している。しかし、非定常領域のための既存のメタRLアルゴリズムのほとんどは、タスク生成プロセスに強い仮定を行うか、トレーニング時にサンプリングを必要とする。本稿では,タスクの時間的進化を明示的に追跡することで,将来に向けて最適化する新しいアルゴリズム(TRIO)を提案する。トレーニング時にTRIOは、経験サンプルから潜伏パラメータを素早く識別する変分モジュールを学習する。このモジュールは、タスクの不確実性を考慮した最適探索ポリシーと共同で学習される。テスト時にTRIOは、オンラインの潜在パラメータの進化を追跡し、将来のタスクに対する不確実性を減らし、メタ学習ポリシーによる迅速な適応を得る。既存のほとんどの方法とは異なり、トリオはマルコフのタスク進化過程を仮定せず、訓練時の非定常性に関する情報を必要とせず、環境における複雑な変化を捉えている。シミュレーション問題に対するアルゴリズムの評価を行い,競合ベースラインよりも優れていることを示す。 Many real-world domains are subject to a structured non-stationarity which affects the agent's goals and the environmental dynamics. Meta-reinforcement learning (RL) has been shown successful for training agents that quickly adapt to related tasks. However, most of the existing meta-RL algorithms for non-stationary domains either make strong assumptions on the task generation process or require sampling from it at training time. In this paper, we propose a novel algorithm (TRIO) that optimizes for the future by explicitly tracking the task evolution through time. At training time, TRIO learns a variational module to quickly identify latent parameters from experience samples. This module is learned jointly with an optimal exploration policy that takes task uncertainty into account. At test time, TRIO tracks the evolution of the latent parameters online, hence reducing the uncertainty over future tasks and obtaining fast adaptation through the meta-learned policy. Unlike most existing methods, TRIO does not assume Markovian task-evolution processes, it does not require information about the non-stationarity at training time, and it captures complex changes undergoing in the environment. We evaluate our algorithm on different simulated problems and show it outperforms competitive baselines.	翻訳日:2021-05-20 13:45:44 公開日:2021-05-18
# コンフォーマルヒストグラム回帰 Conformal histogram regression ( http://arxiv.org/abs/2105.08747v1 ) ライセンス: Link先を確認	Matteo Sesia, Yaniv Romano	(参考訳) 本稿では,スキューデータに自動的に適応可能な非パラメトリック回帰の予測間隔を計算するためのコンフォメーション手法を提案する。ブラックボックス機械学習アルゴリズムを用いて、ヒストグラムを用いて結果の条件分布を推定し、その出力を近似条件付きの最短予測間隔に変換する。結果として得られる予測間隔は有限サンプルにおいて限界範囲を持つことが証明され、ブラックボックスモデルが一致する場合に条件範囲と最適長さを漸近的に達成する。シミュレーションおよび実データを用いた数値実験により、共形量子化回帰やその他の分布共形予測手法を含む最先端の代替手法と比較して性能が向上した。 This paper develops a conformal method to compute prediction intervals for non-parametric regression that can automatically adapt to skewed data. Leveraging black-box machine learning algorithms to estimate the conditional distribution of the outcome using histograms, it translates their output into the shortest prediction intervals with approximate conditional coverage. The resulting prediction intervals provably have marginal coverage in finite samples, while asymptotically achieving conditional coverage and optimal length if the black-box model is consistent. Numerical experiments with simulated and real data demonstrate improved performance compared to state-of-the-art alternatives, including conformalized quantile regression and other distributional conformal prediction approaches.	翻訳日:2021-05-20 13:43:07 公開日:2021-05-18
# RecPipe: 推奨品質とパフォーマンスを両立させる共設計モデルとハードウェア RecPipe: Co-designing Models and Hardware to Jointly Optimize Recommendation Quality and Performance ( http://arxiv.org/abs/2105.08820v1 ) ライセンス: Link先を確認	Udit Gupta, Samuel Hsia, Jeff (Jun) Zhang, Mark Wilkening, Javin Pombra, Hsien-Hsin S. Lee, Gu-Yeon Wei, Carole-Jean Wu, David Brooks	(参考訳) ディープラーニングレコメンデーションシステムは、厳格なテールレイテンシターゲットと高いシステム負荷の下で高品質でパーソナライズされたコンテンツを提供する必要がある。本稿では,推薦品質と推論性能を協調的に最適化するRecPipeを提案する。 central to recpipeは、計算の複雑さを減らし、異なる並列処理の機会を露出しながら、品質を維持するために、レコメンデーションモデルを多段階パイプラインに分解する。 recpipeは、多段階のレコメンデーションエンジンを、コモディティで異種プラットフォーム(cpuやgpuなど)にマッピングする推論スケジューラを実装している。そこで我々は,品質,テールレイテンシ,システムスループットを共同で最適化するカスタムアクセラレータrecpipeaccel(rpaccel)を設計した。 RPAc-celはRecPipeを通じてオープンされた異なるデザイン空間を利用するように設計されている。特にRPAccelは、サブバッチでクエリをパイプラインレコメンデーションステージに処理し、デュアルな静的および動的埋め込みキャッシュ、トップkフィルタリングユニットのセット、再構成可能なsystolic配列を実装している。先行技術とアイソクオリティに比較して、RPAccelはレイテンシとスループットを3倍と6倍改善することを示した。 Deep learning recommendation systems must provide high quality, personalized content under strict tail-latency targets and high system loads. This paper presents RecPipe, a system to jointly optimize recommendation quality and inference performance. Central to RecPipe is decomposing recommendation models into multi-stage pipelines to maintain quality while reducing compute complexity and exposing distinct parallelism opportunities. RecPipe implements an inference scheduler to map multi-stage recommendation engines onto commodity, heterogeneous platforms (e.g., CPUs, GPUs).While the hardware-aware scheduling improves ranking efficiency, the commodity platforms suffer from many limitations requiring specialized hardware. Thus, we design RecPipeAccel (RPAccel), a custom accelerator that jointly optimizes quality, tail-latency, and system throughput. RPAc-cel is designed specifically to exploit the distinct design space opened via RecPipe. In particular, RPAccel processes queries in sub-batches to pipeline recommendation stages, implements dual static and dynamic embedding caches, a set of top-k filtering units, and a reconfigurable systolic array. Com-pared to prior-art and at iso-quality, we demonstrate that RPAccel improves latency and throughput by 3x and 6x.	翻訳日:2021-05-20 13:42:55 公開日:2021-05-18
# 加速流からの最適化アルゴリズムへの縮約理論のアプローチ A Contraction Theory Approach to Optimization Algorithms from Acceleration Flows ( http://arxiv.org/abs/2105.08832v1 ) ライセンス: Link先を確認	Pedro Cisneros-Velarde, Francesco Bullo	(参考訳) 最近では、関連する最適化フローの離散化、すなわち軌道が関連する最適化問題を解く微分方程式(ODE)システムから最適化アルゴリズムの設計に焦点が当てられている。このような設計アプローチは、適切なODEを設計し、識別するための原則化された方法論を見つける方法という、重要な問題を引き起こします。本稿では, この問題の解法として, 縮約理論を用いた解法を提案する。まず、縮退理論が暗黙的かつ明示的なオイラー積分法の安定性をいかに保証するかを説明する一般的な数学的結果を紹介する。そこで我々は,ODEの新しいシステム,すなわち Accelerated-Contracting-Nesterov フロー,およびそれを確立するための収縮理論を,指数収束率を持つ最適化フローとして提案し,その関連する最適化アルゴリズムの線形収束率を即時確立する。この流れの単純明示的なオイラー離散化はネステロフ加速度法に対応する。最後に,時間変動最適化問題に対する最適化アルゴリズムの設計において,このアプローチが性能保証にどのようにつながるかを示す。 Much recent interest has focused on the design of optimization algorithms from the discretization of an associated optimization flow, i.e., a system of differential equations (ODEs) whose trajectories solve an associated optimization problem. Such a design approach poses an important problem: how to find a principled methodology to design and discretize appropriate ODEs. This paper aims to provide a solution to this problem through the use of contraction theory. We first introduce general mathematical results that explain how contraction theory guarantees the stability of the implicit and explicit Euler integration methods. Then, we propose a novel system of ODEs, namely the Accelerated-Contracting-Nesterov flow, and use contraction theory to establish it is an optimization flow with exponential convergence rate, from which the linear convergence rate of its associated optimization algorithm is immediately established. Remarkably, a simple explicit Euler discretization of this flow corresponds to the Nesterov acceleration method. Finally, we present how our approach leads to performance guarantees in the design of optimization algorithms for time-varying optimization problems.	翻訳日:2021-05-20 13:42:32 公開日:2021-05-18
# ソーシャルメディアにおける画像人気予測のためのマルチモーダルディープラーニングフレームワーク Multimodal Deep Learning Framework for Image Popularity Prediction on Social Media ( http://arxiv.org/abs/2105.08809v1 ) ライセンス: Link先を確認	Fatma S. Abousaleh, Wen-Huang Cheng, Neng-Hao Yu, and Yu Tsao	(参考訳) 何十億枚もの写真が、様々な種類のソーシャルネットワークを通じて毎日ウェブにアップロードされる。これらの画像の中には何百万ものビューを受け取り人気を得るものもあれば、全く気づかないものもある。これは、ソーシャルメディアで画像人気を予測するという問題を引き起こす。画像の人気は、視覚コンテンツ、美的品質、ユーザ、ポストメタデータ、時間など、いくつかの要因に影響される可能性がある。したがって、これら全ての要因を考慮することは、画像の人気を正確に予測するのに不可欠である。さらに,予測モデルの効率性も重要な役割を担っている。本研究では,様々なモダリティからの情報を利用するマルチモーダル学習と,様々な分野における畳み込みニューラルネットワーク(CNN)の現在の成功を動機として,様々な種類の視覚的特徴と社会的特徴を統合ネットワークモデルに組み込むことで,投稿画像の人気を予測する深層学習モデル(VSCNN)を提案する。 VSCNNはまず、2つの個別CNNを利用して入力された視覚的特徴と社会的特徴から高レベル表現を抽出することを学ぶ。これら2つのネットワークの出力をジョイントネットワークに融合し、出力層における人気スコアを推定する。 Flickrに投稿された約432K画像のデータセットを広範囲に実験することにより,提案手法の性能を評価する。シミュレーションの結果、提案したVSCNNモデルは、それぞれ平均絶対誤差と平均二乗誤差の2.33%、7.59%、14.16%以上の相対的な改善により、最先端モデルよりも大幅に優れていることが示された。 Billions of photos are uploaded to the web daily through various types of social networks. Some of these images receive millions of views and become popular, whereas others remain completely unnoticed. This raises the problem of predicting image popularity on social media. The popularity of an image can be affected by several factors, such as visual content, aesthetic quality, user, post metadata, and time. Thus, considering all these factors is essential for accurately predicting image popularity. In addition, the efficiency of the predictive model also plays a crucial role. In this study, motivated by multimodal learning, which uses information from various modalities, and the current success of convolutional neural networks (CNNs) in various fields, we propose a deep learning model, called visual-social convolutional neural network (VSCNN), which predicts the popularity of a posted image by incorporating various types of visual and social features into a unified network model. VSCNN first learns to extract high-level representations from the input visual and social features by utilizing two individual CNNs. The outputs of these two networks are then fused into a joint network to estimate the popularity score in the output layer. We assess the performance of the proposed method by conducting extensive experiments on a dataset of approximately 432K images posted on Flickr. The simulation results demonstrate that the proposed VSCNN model significantly outperforms state-of-the-art models, with a relative improvement of greater than 2.33%, 7.59%, and 14.16% in terms of Spearman's Rho, mean absolute error, and mean squared error, respectively.	翻訳日:2021-05-20 13:42:00 公開日:2021-05-18
# スパース・スパイキング勾配降下 Sparse Spiking Gradient Descent ( http://arxiv.org/abs/2105.08810v1 ) ライセンス: Link先を確認	Nicolas Perez-Nieves and Dan F.M. Goodman	(参考訳) 低エネルギー消費のため、ニューロモルフィックコンピューティングデバイスにスパイキングニューラルネットワーク(SNN)をエミュレートすることへの関心が高まっている。近年の進歩により、SNNをトレーニングすることで、従来のニューラルネットワーク(ANN)と精度で競合し始めることができると同時に、ニューロモルフィックハードウェア上での動作時のエネルギー効率も向上している。しかし、SNNのトレーニングプロセスは、SNNの時空間的疎結合性を生かしていないANN向けに開発された高密度テンソル操作に基づいている。本稿では,現在の art 法と同等かそれ以上の精度を実現しつつ,より高速かつメモリ効率を向上できる最初のスパース snn バックプロパゲーションアルゴリズムを提案する。提案手法は,70倍までの後方通過速度を達成し,精度を損なうことなく,最大40%のメモリ効率を向上できる実データ(Fashion-MNIST,Neuromophic-MNIST,Spike Heidelberg Digits)に対して有効性を示す。 There is an increasing interest in emulating Spiking Neural Networks (SNNs) on neuromorphic computing devices due to their low energy consumption. Recent advances have allowed training SNNs to a point where they start to compete with traditional Artificial Neural Networks (ANNs) in terms of accuracy, while at the same time being energy efficient when run on neuromorphic hardware. However, the process of training SNNs is still based on dense tensor operations originally developed for ANNs which do not leverage the spatiotemporally sparse nature of SNNs. We present here the first sparse SNN backpropagation algorithm which achieves the same or better accuracy as current state of the art methods while being significantly faster and more memory efficient. We show the effectiveness of our method on real datasets of varying complexity (Fashion-MNIST, Neuromophic-MNIST and Spiking Heidelberg Digits) achieving a speedup in the backward pass of up to 70x, and 40% more memory efficient, without losing accuracy.	翻訳日:2021-05-20 13:38:39 公開日:2021-05-18
# rx-anon -- 修正モンドリアンアルゴリズムに基づく異種データの復号化に関する新しいアプローチ rx-anon -- A Novel Approach on the De-Identification of Heterogeneous Data based on a Modified Mondrian Algorithm ( http://arxiv.org/abs/2105.08842v1 ) ライセンス: Link先を確認	Fabian Singhofer, Aygul Garifullina, Mathias Kern, Ansgar Scherp	(参考訳) データ匿名化の伝統的なアプローチは、関係データとテキストデータとは独立に考える。本稿では,関係属性とテキスト属性からなる異種半構造化文書の匿名化手法であるrx-anonを提案する。テキストから抽出したセンシティブな用語を構造化データにマップする。これにより、k匿名性のような概念を使って、異種データ入力の結合されたプライバシー保護バージョンを生成することができます。我々は,異種データを一貫して匿名化するために,冗長な機密情報の概念を導入する。非構造化テキストデータと構造化データ属性との匿名化の影響を制御するために,修正されたパラメータ付きmondrianアルゴリズムを導入する。パラメータ $\lambda$ は、匿名化プロセス中に関係属性とテキスト属性に異なる重みを与えることができる。本手法は,リレーショナルデータとテキストデータの共同匿名化の問題に適応した正規化確実性ペナルティスコアを用いて,実世界の2つのデータセットを用いて評価する。提案手法は,モンドリアン分割の制御にチューニングパラメータを用いることで情報損失を低減できることを示すとともに,関係属性やセンシティブな用語のk匿名性を保証する。 rx-anonはフレームワークアプローチであるため、他の匿名化アルゴリズム、プライバシモデル、テキスト類似度メトリクスによって再利用および拡張することができる。 Traditional approaches for data anonymization consider relational data and textual data independently. We propose rx-anon, an anonymization approach for heterogeneous semi-structured documents composed of relational and textual attributes. We map sensitive terms extracted from the text to the structured data. This allows us to use concepts like k-anonymity to generate a joined, privacy-preserved version of the heterogeneous data input. We introduce the concept of redundant sensitive information to consistently anonymize the heterogeneous data. To control the influence of anonymization over unstructured textual data versus structured data attributes, we introduce a modified, parameterized Mondrian algorithm. The parameter $\lambda$ allows to give different weight on the relational and textual attributes during the anonymization process. We evaluate our approach with two real-world datasets using a Normalized Certainty Penalty score, adapted to the problem of jointly anonymizing relational and textual data. The results show that our approach is capable of reducing information loss by using the tuning parameter to control the Mondrian partitioning while guaranteeing k-anonymity for relational attributes as well as for sensitive terms. As rx-anon is a framework approach, it can be reused and extended by other anonymization algorithms, privacy models, and textual similarity metrics.	翻訳日:2021-05-20 13:38:19 公開日:2021-05-18
# 神経正準変換を伴う有限温度における相互作用フェルミオンのab-initio研究 Ab-initio study of interacting fermions at finite temperature with neural canonical transformation ( http://arxiv.org/abs/2105.08644v1 ) ライセンス: Link先を確認	Hao Xie, Linfeng Zhang, Lei Wang	(参考訳) 連続体における相互作用するフェルミオンの熱的性質に対する変動密度行列アプローチを提案する。変分密度行列は、離散確率モデルとともに置換同変多体ユニタリ変換によってパラメトリゼーションされる。ユニタリ変換は、フェルミオン座標の流れを介して相関効果を組み込んだ神経正準変換の量子対として実装される。最初の応用として、フェルミ液体からウィグナー分子への相互作用が引き起こされる2次元量子ドット中の電子を研究する。本手法は,フェルミオンサイン問題により従来の量子モンテカルロ法が深刻な困難に直面する低温状態において,正確な結果を与える。このアプローチは、さらなる拡張のために一般的かつ柔軟であり、従って超低温量子ガス、凝縮物質、暖かい高密度物質物理学の文脈で強相関フェルミオンに関する新しい物理結果を提供するという約束を持っている。 We present a variational density matrix approach to the thermal properties of interacting fermions in the continuum. The variational density matrix is parametrized by a permutation equivariant many-body unitary transformation together with a discrete probabilistic model. The unitary transformation is implemented as a quantum counterpart of neural canonical transformation, which incorporates correlation effects via a flow of fermion coordinates. As the first application, we study electrons in a two-dimensional quantum dot with an interaction-induced crossover from Fermi liquid to Wigner molecule. The present approach provides accurate results in the low-temperature regime, where conventional quantum Monte Carlo methods face severe difficulties due to the fermion sign problem. The approach is general and flexible for further extensions, thus holds the promise to deliver new physical results on strongly correlated fermions in the context of ultracold quantum gases, condensed matter, and warm dense matter physics.	翻訳日:2021-05-20 13:36:46 公開日:2021-05-18
# 共通認知モデルの予測処理実装に向けて Towards a Predictive Processing Implementation of the Common Model of Cognition ( http://arxiv.org/abs/2105.07308v2 ) ライセンス: Link先を確認	Alexander Ororbia, M. A. Kelly	(参考訳) 本稿では,強力な,かつ単純なニューラルモデルから構築した認知的アーキテクチャを提案する。具体的には、ニューラル生成符号化とホログラフィック連想記憶に基づく認知の共通モデルの実装について述べる。提案システムは,多様なタスクから継続的に学習するエージェントを開発するための基盤となり,既存の認知アーキテクチャよりも大規模で人的パフォーマンスをモデル化する。 In this article, we present a cognitive architecture that is built from powerful yet simple neural models. Specifically, we describe an implementation of the common model of cognition grounded in neural generative coding and holographic associative memory. The proposed system creates the groundwork for developing agents that learn continually from diverse tasks as well as model human performance at larger scales than what is possible with existant cognitive architectures.	翻訳日:2021-05-20 11:27:30 公開日:2021-05-18
# 十分な校正 Calibrating sufficiently ( http://arxiv.org/abs/2105.07283v2 ) ライセンス: Link先を確認	Dirk Tasche	(参考訳) 確率的分類器を訓練して校正する場合、キャリブレーション損失のいわゆるグループ損失成分を容易に見逃すことができる。グルーピングロス(grouping loss)とは、観測可能な情報と実際に校正訓練で活用された情報との間のギャップを指す。グループ化損失とsufficiencyの概念との関係について検討し,sufficiencyの有用な基準としてコモノトニック性を特定する。 langford & zadrozny (2005) の探索還元アプローチを再検討し、グループ化損失を減らす確率的分類器の推定子を生成することを発見した。最後に,確率的分類器の訓練と「不十分」校正を支援するツールとして,ブライア曲線について考察する。 When probabilistic classifiers are trained and calibrated, the so-called grouping loss component of the calibration loss can easily be overlooked. Grouping loss refers to the gap between observable information and information actually exploited in the calibration exercise. We investigate the relation between grouping loss and the concept of sufficiency, identifying comonotonicity as a useful criterion for sufficiency. We revisit the probing reduction approach of Langford & Zadrozny (2005) and find that it produces an estimator of probabilistic classifiers that reduces grouping loss. Finally, we discuss Brier curves as tools to support training and 'sufficient' calibration of probabilistic classifiers.	翻訳日:2021-05-20 11:27:26 公開日:2021-05-18
# (参考訳) Recursive Hierarchy-Interactive Attention and Entity-Order Perception による遠隔監視型関係抽出 Distantly Supervised Relation Extraction via Recursive Hierarchy-Interactive Attention and Entity-Order Perception ( http://arxiv.org/abs/2105.08213v1 ) ライセンス: CC BY 4.0	Ridong Han, Tao Peng, Jiayu Han, Lin Yue, Hai Cui, Lu Liu	(参考訳) 最近,遠隔教師付き関係抽出が注目されている。しかし、ほとんど全ての先行作品は、文中で2つの実体の出現順序が意味論の理解に寄与しているという事実を無視している。さらに、関係階層を活用しているが、関係レベル間のヒューリスティックな効果を十分に活用していない。本稿では,関係関係の階層構造を用いて,関係レベル間の対話的情報をモデル化し,より長期的関係を扱う新しい階層型階層型対話型注意ネットワーク(RHIA)を設計する。再帰的構造において階層的関係連鎖に沿った関係強化文表現を生成する。さらに、文エンコーダがよりエンティティの外観情報を保持できるように、Entity-Order Perception (EOP)と呼ばれる新たな訓練目標を導入する。人気のNew York Times(NYT)データセットに関する実体実験が実施されている。従来のベースラインと比較して,p-r曲線,auc,top-n精度,その他の評価指標を用いて最先端のパフォーマンスを実現する。 Distantly supervised relation extraction has drawn significant attention recently. However, almost all prior works ignore the fact that, in a sentence, the appearance order of two entities contributes to the understanding of its semantics. Furthermore, they leverage relation hierarchies but don't fully exploit the heuristic effect between relation levels, i.e., higher-level relations can give useful information to the lower ones. In this paper, we design a novel Recursive Hierarchy-Interactive Attention network (RHIA), which uses the hierarchical structure of the relation to model the interactive information between the relation levels to further handle long-tail relations. It generates relation-augmented sentence representations along hierarchical relation chains in a recursive structure. Besides, we introduce a newfangled training objective, called Entity-Order Perception (EOP), to make the sentence encoder retain more entity appearance information. Substantial experiments on the popular New York Times (NYT) dataset are conducted. Compared to prior baselines, our approach achieves state-of-the-art performance in terms of precision-recall (P-R) curves, AUC, Top-N precision and other evaluation metrics.	翻訳日:2021-05-20 01:08:28 公開日:2021-05-18
# (参考訳) The Commodities News Corpus: 良質なコモディティニュースのためのリソース The Commodities News Corpus: A Resource forUnderstanding Commodity News Better ( http://arxiv.org/abs/2105.08214v1 ) ライセンス: CC BY 4.0	Meisin Lee, Lay Ki Soon, Eu-gene Siew	(参考訳) コモディティ・ニュースは、最近の商品価格運動の要約や、ムーブメントに繋がった注目すべき出来事など、豊富な情報を含んでいる。イベント抽出を通じて、商品ニュースから抽出した有用な情報は、商品価格予測に使用できる商品価格運動と商品間の因果関係のマイニングに極めて有用である。今後の研究を容易にするために、以下の情報と注釈付きデータセットを紹介する。 (i) エンティティ(nomi-nalとnamedの両方)、 (ii) イベント(trigger words and argument role)、 (iii) イベントメタデータ(modality, polarity and intensity)、 (iv) イベント-イベント関係。 Commodity News contains a wealth of information such as sum-mary of the recent commodity price movement and notable events that led tothe movement. Through event extraction, useful information extracted fromcommodity news is extremely useful in mining for causal relation betweenevents and commodity price movement, which can be used for commodity priceprediction. To facilitate the future research, we introduce a new dataset withthe following information identified and annotated: (i) entities (both nomi-nal and named), (ii) events (trigger words and argument roles), (iii) eventmetadata: modality, polarity and intensity and (iv) event-event relations.	翻訳日:2021-05-20 00:53:47 公開日:2021-05-18
# (参考訳) ヘイスタックにおける針の発見:共同検出・追跡による4Kビデオにおけるニーフライング物体検出 Finding a Needle in a Haystack: Tiny Flying Object Detection in 4K Videos using a Joint Detection-and-Tracking Approach ( http://arxiv.org/abs/2105.08253v1 ) ライセンス: CC BY 4.0	Ryota Yoshihashi, Rei Kawakami, Shaodi You, Tu Tuan Trinh, Makoto Iida, Takeshi Naemura	(参考訳) 高解像度ビデオで小さな物体を検出することは、視覚情報がほとんど信頼できないため難しい。特に、課題には、非常に低い解像度のオブジェクト、圧縮によるmpegアーティファクト、多くのハードネガティブな検索領域が含まれる。追跡は、信頼性の低い外観と信頼性の低い動作推定のため等しく困難である。幸いなことに、この2つの困難なタスクを組み合わせることで、相互に利益が得られます。そこで,本論文では,単一,トレーニング可能な,エンドツーエンドのネットワークを通じて学習した多フレーム表現を用いて,検出と追跡を共同で行うリカレント相関ネットワークというニューラルネットワークモデルを提案する。このフレームワークは、長期にわたる畳み込みメモリネットワークを利用して、検出のための情報的外観変化を学習し、学習された表現は、その性能を高めるために追跡において共有される。鳥や無人航空機などの小型飛行物体を含むシーンの画像を含むデータセットを用いた実験において、提案手法は、深部単一フレーム検出器や既存のモーションベース検出器に対する検出性能の一貫した改善をもたらした。さらに,鳥画像データセットのトラッカとして評価された場合,ネットワークは最先端の汎用オブジェクトトラッカと同様に動作する。 Detecting tiny objects in a high-resolution video is challenging because the visual information is little and unreliable. Specifically, the challenge includes very low resolution of the objects, MPEG artifacts due to compression and a large searching area with many hard negatives. Tracking is equally difficult because of the unreliable appearance, and the unreliable motion estimation. Luckily, we found that by combining this two challenging tasks together, there will be mutual benefits. Following the idea, in this paper, we present a neural network model called the Recurrent Correlational Network, where detection and tracking are jointly performed over a multi-frame representation learned through a single, trainable, and end-to-end network. The framework exploits a convolutional long short-term memory network for learning informative appearance changes for detection, while the learned representation is shared in tracking for enhancing its performance. In experiments with datasets containing images of scenes with small flying objects, such as birds and unmanned aerial vehicles, the proposed method yielded consistent improvements in detection performance over deep single-frame detectors and existing motion-based detectors. Furthermore, our network performs as well as state-of-the-art generic object trackers when it was evaluated as a tracker on a bird image dataset.	翻訳日:2021-05-20 00:40:09 公開日:2021-05-18
# (参考訳) ログロススコアからのラベル推論攻撃 Label Inference Attacks from Log-loss Scores ( http://arxiv.org/abs/2105.08266v1 ) ライセンス: CC BY 4.0	Abhinav Aggarwal, Shiva Prasad Kasiviswanathan, Zekun Xu, Oluwaseyi Feyisetan, Nathanael Teissier	(参考訳) ログロス(クロスエントロピー損失とも呼ばれる)メトリックは、分類アルゴリズムのパフォーマンスを評価するために機械学習アプリケーションに広く使われている。本稿では,データセットのラベルを単一の(あるいは複数)log-lossスコアから推測する問題を,他のデータにアクセスせずに検討する。驚くべきことに、任意の有限個のラベルクラスに対して、任意の精度演算が可能であれば、注意深く構築された単一の予測ベクトルのログロススコアからデータセットのラベルを正確に推測できることが示されている。さらに,log-lossスコアにノイズを加えたり,演算精度が制限されたりしても成功するラベル推論アルゴリズム(attacks)を提案する。私たちのアルゴリズムはすべて数論と組合せ論のアイデアに依存しており、モデルトレーニングは必要ありません。実際のデータセット上で実験的なシミュレーションを行い、実際の攻撃の容易さを実証した。 Log-loss (also known as cross-entropy loss) metric is ubiquitously used across machine learning applications to assess the performance of classification algorithms. In this paper, we investigate the problem of inferring the labels of a dataset from single (or multiple) log-loss score(s), without any other access to the dataset. Surprisingly, we show that for any finite number of label classes, it is possible to accurately infer the labels of the dataset from the reported log-loss score of a single carefully constructed prediction vector if we allow arbitrary precision arithmetic. Additionally, we present label inference algorithms (attacks) that succeed even under addition of noise to the log-loss scores and under limited precision arithmetic. All our algorithms rely on ideas from number theory and combinatorics and require no model training. We run experimental simulations on some real datasets to demonstrate the ease of running these attacks in practice.	翻訳日:2021-05-20 00:16:38 公開日:2021-05-18
# (参考訳) echocp : コントラスト経胸腔的心エコー図を用いた卵管診断のための心エコー図データセット EchoCP: An Echocardiography Dataset in Contrast Transthoracic Echocardiography for Patent Foramen Ovale Diagnosis ( http://arxiv.org/abs/2105.08267v1 ) ライセンス: CC BY 4.0	Tianchen Wang, Zhihe Li, Meiping Huang, Jian Zhuang, Shanshan Bi, Jiawei Zhang, Yiyu Shi, Hongwen Fei, Xiaowei Xu	(参考訳) 特許前卵胞(英: patent foramen ovale, pfo)は、心房中隔の反上部に位置する中隔、霊長体、中隔の間の潜在的分離である。 PFOは、米国で5番目に多い死因である暗号化的脳卒中を引き起こす主要な要因の1つである。 PFO診断では, 造影心エコー法(cTTE)が, 他と比べ, より堅牢な方法として好まれる。しかし,心エコービデオのソノグラフィーが手作業で行うため,cTTEによる現在のPFO診断は極めて遅い。現在、コミュニティでこの重要なトピックのための公開データセットはありません。本稿では, PFO 診断をターゲットとした, cTTE における最初の心エコー画像データセットとして EchoCP を提案する。 EchoCPは、安静とValsalva操作ビデオの両方を持つ30の患者で構成される。さらに, 術中心室セグメンテーション法に基づくpfo診断のベースライン自動決定法を確立し, 平均平均diceスコア 0.89 を得たが, 改善の余地は多く, pfo診断の精度は 0.70/0.67 に留まった。挑戦的なEchoCPデータセットがさらなる研究を刺激し、複数のドメインに影響を及ぼす革新的で汎用的なソリューションにつながることを期待しています。データセットがリリースされます。 Patent foramen ovale (PFO) is a potential separation between the septum, primum and septum secundum located in the anterosuperior portion of the atrial septum. PFO is one of the main factors causing cryptogenic stroke which is the fifth leading cause of death in the United States. For PFO diagnosis, contrast transthoracic echocardiography (cTTE) is preferred as being a more robust method compared with others. However, the current PFO diagnosis through cTTE is extremely slow as it is proceeded manually by sonographers on echocardiography videos. Currently there is no publicly available dataset for this important topic in the community. In this paper, we present EchoCP, as the first echocardiography dataset in cTTE targeting PFO diagnosis. EchoCP consists of 30 patients with both rest and Valsalva maneuver videos which covers various PFO grades. We further establish an automated baseline method for PFO diagnosis based on the state-of-the-art cardiac chamber segmentation technique, which achieves 0.89 average mean Dice score, but only 0.70/0.67 mean accuracies for PFO diagnosis, leaving large room for improvement. We hope that the challenging EchoCP dataset can stimulate further research and lead to innovative and generic solutions that would have an impact in multiple domains. Our dataset is released.	翻訳日:2021-05-19 23:52:10 公開日:2021-05-18
# (参考訳) 知識伝達による運転支援物体検出の検討 Exploring Driving-aware Salient Object Detection via Knowledge Transfer ( http://arxiv.org/abs/2105.08286v1 ) ライセンス: CC BY 4.0	Jinming Su, Changqun Xia, and Jia Li	(参考訳) 近年,ニューラルネットワークの急速な発展に伴い,sod(general salient object detection)が大きな進歩を遂げている。しかし、タスク固有のデータセットがないため、タスク対応SODの研究はほとんど行われていない。本稿では,有向物体の画素レベルのマスクが注釈付けされた運転タスク指向のデータセットを構築する。一般的なSODデータセットと比較すると、クロスドメインの知識差とタスク固有のシーンギャップは、運転時の健全な物体に焦点を合わせる2つの主な課題であることがわかった。これらの知見に触発されて,知識伝達畳み込みニューラルネットワークを用いた運転タスク認識型SODのベースラインモデルを提案した。このネットワークでは,知識差を補うために,注意に基づく知識伝達モジュールを構築する。さらに、複雑なタスク固有のシーンにおけるオブジェクトの詳細な特徴復号を行うために、効率的な境界認識機能復号モジュールを導入する。ネットワーク全体は知識伝達と機能デコードモジュールを漸進的に統合する。実験により,提案したデータセットは非常に困難であることが示され,提案手法は,タスク認識型SODの開発を容易にする12の最先端メソッドよりも優れていた。 Recently, general salient object detection (SOD) has made great progress with the rapid development of deep neural networks. However, task-aware SOD has hardly been studied due to the lack of task-specific datasets. In this paper, we construct a driving task-oriented dataset where pixel-level masks of salient objects have been annotated. Comparing with general SOD datasets, we find that the cross-domain knowledge difference and task-specific scene gap are two main challenges to focus the salient objects when driving. Inspired by these findings, we proposed a baseline model for the driving task-aware SOD via a knowledge transfer convolutional neural network. In this network, we construct an attentionbased knowledge transfer module to make up the knowledge difference. In addition, an efficient boundary-aware feature decoding module is introduced to perform fine feature decoding for objects in the complex task-specific scenes. The whole network integrates the knowledge transfer and feature decoding modules in a progressive manner. Experiments show that the proposed dataset is very challenging, and the proposed method outperforms 12 state-of-the-art methods on the dataset, which facilitates the development of task-aware SOD.	翻訳日:2021-05-19 23:49:41 公開日:2021-05-18
# (参考訳) ソーシャルネットワーク上のカスケード予測のための独立非対称埋め込みモデル Independent Asymmetric Embedding Model for Cascade Prediction on Social Network ( http://arxiv.org/abs/2105.08291v1 ) ライセンス: CC BY 4.0	Wenjin Xie and Xiaomeng Wang and Tao Jia	(参考訳) ソーシャルネットワーク上での情報拡散の予測は,マーケティングや世論管理において極めて重要な意味を持つ。カスケード予測は、メッセージをソーシャルネットワークに再投稿する可能性のある個人を予測することを目的としている。ある種類の手法は、人口統計学的、構造的、時間的特徴を予測に利用するか、特定の情報拡散モデルに明示的に依存する。他のモデルは完全にデータ駆動であり、グローバルネットワーク構造を必要としない。そこで,ネットワーク埋め込みに基づく大規模拡散予測モデルを提案する。これらのモデルは、ユーザをカスケード情報を使用して潜在空間に埋め込むが、埋め込み時のユーザ間の介入に対する考慮が欠如している。本稿では,カスケード予測のための社会的埋め込み学習のための独立な非対称埋め込み法を提案する。既存の手法と異なり、各個体を1つの潜伏影響空間と複数の潜伏感受性空間に埋め込む。さらに,提案手法は,カスケード内のユーザ組み合わせの共起制御を捕捉し,計算効率を向上する。実世界のデータセット上で行った広範な実験の結果は、予測精度とコスト効率の両方を検証できた。 The prediction for information diffusion on social networks has great practical significance in marketing and public opinion control. Cascade prediction aims to predict the individuals who will potentially repost the message on the social network. One kind of methods either exploit demographical, structural, and temporal features for prediction, or explicitly rely on particular information diffusion models. The other kind of models are fully data-driven and do not require a global network structure. Thus massive diffusion prediction models based on network embedding are proposed. These models embed the users into the latent space using their cascade information, but are lack of consideration for the intervene among users when embedding. In this paper, we propose an independent asymmetric embedding method to learn social embedding for cascade prediction. Different from existing methods, our method embeds each individual into one latent influence space and multiple latent susceptibility spaces. Furthermore, our method captures the co-occurrence regulation of user combination in cascades to improve the calculating effectiveness. The results of extensive experiments conducted on real-world datasets verify both the predictive accuracy and cost-effectiveness of our approach.	翻訳日:2021-05-19 23:40:31 公開日:2021-05-18
# (参考訳) 自己申告症状は毎日のcovid-19感染者を予測できるか? Can Self Reported Symptoms Predict Daily COVID-19 Cases? ( http://arxiv.org/abs/2105.08321v1 ) ライセンス: CC BY 4.0	Parth Patwa and Viswanatha Reddy and Rohan Sukumaran and Sethuraman TV and Eptehal Nashnoush and Sheshank Shankar and Rishemjit Kaur and Abhishek Singh and Ramesh Raskar	(参考訳) 新型コロナウイルスのパンデミックが世界中の生活や経済に影響を及ぼし、多くの死者を出した。ワクチン接種は重要な介入であるが、ロールアウトは世界中で遅く、不平等である。そのため、大規模な検査はウイルスをモニターし、封じ込めるための重要な方法の1つとして残っている。大規模なテストは高価で厳しいです。したがって、ケース数を見積もる別の方法が必要である。オンライン調査は、パンデミック中のデータ収集に有効な方法であることが示されている。本研究では,自己申告症状を用いて新型コロナウイルスの感染率を推定する機械学習モデルを開発した。最善のモデルでは、1州あたり平均絶対誤差(mae)が226.30(maeは27.09%)と予測され、自己報告症状を用いて実際の感染者数を予測する可能性が示された。モデルは、状態レベルでトレーニングされるローカルモデルと、すべての州で集約された複合データに基づいてトレーニングされる単一のグローバルモデルという、2つのレベルのデータ粒度で開発されている。その結果,グローバルモデルとは対照的に,局所モデルに対する誤差が低かった。また、最も重要な症状(機能)は、状態によって大きく異なることも示している。この研究は、クラウドソーシングデータに基づいて開発されたモデルは、オンラインプラットフォームを介してキュレーションされ、既存の疫学的監視インフラを費用対効果で補完できることを示した。 The COVID-19 pandemic has impacted lives and economies across the globe, leading to many deaths. While vaccination is an important intervention, its roll-out is slow and unequal across the globe. Therefore, extensive testing still remains one of the key methods to monitor and contain the virus. Testing on a large scale is expensive and arduous. Hence, we need alternate methods to estimate the number of cases. Online surveys have been shown to be an effective method for data collection amidst the pandemic. In this work, we develop machine learning models to estimate the prevalence of COVID-19 using self-reported symptoms. Our best model predicts the daily cases with a mean absolute error (MAE) of 226.30 (normalized MAE of 27.09%) per state, which demonstrates the possibility of predicting the actual number of confirmed cases by utilizing self-reported symptoms. The models are developed at two levels of data granularity - local models, which are trained at the state level, and a single global model which is trained on the combined data aggregated across all states. Our results indicate a lower error on the local models as opposed to the global model. In addition, we also show that the most important symptoms (features) vary considerably from state to state. This work demonstrates that the models developed on crowd-sourced data, curated via online platforms, can complement the existing epidemiological surveillance infrastructure in a cost-effective manner.	翻訳日:2021-05-19 23:29:16 公開日:2021-05-18
# (参考訳) ELdrオントロジーに基づくアクティブラーニングの概念と接続型クエリ Actively Learning Concepts and Conjunctive Queries under ELdr-Ontologies ( http://arxiv.org/abs/2105.08326v1 ) ライセンス: CC BY 4.0	Maurice Funk, Jean Christoph Jung, Carsten Lutz	(参考訳) 本稿では,学習アルゴリズムがオラクル(ドメインエキスパートなど)を対話的にクエリすることのできる,Angluinのアクティブラーニングフレームワークにおいて,記述論理ELdrで定式化されたオントロジーの存在下で概念やクエリを学習する問題を考察する。 1) el-concepts, (2) symmetry-free eli-concepts, (3) chordal, symmetry-free, そしてbounded arityである結合クエリ(cqs)である。いずれの場合も、学習者は、ABoxesと同値クエリに基づいて、そのクラスから与えられた概念/クエリがターゲットと同等であるかどうかを問うオラクルメンバーシップクエリにポーズすることができる。 (3) における有界アリティに対する制限は、同値クエリで非制限な CQ が認められると取り除かれる。また,EL-concepts は ELI-ontology の存在下で学習可能な多項式クエリではないことを示す。 We consider the problem to learn a concept or a query in the presence of an ontology formulated in the description logic ELdr, in Angluin's framework of active learning that allows the learning algorithm to interactively query an oracle (such as a domain expert). We show that the following can be learned in polynomial time: (1) EL-concepts, (2) symmetry-free ELI-concepts, and (3) conjunctive queries (CQs) that are chordal, symmetry-free, and of bounded arity. In all cases, the learner can pose to the oracle membership queries based on ABoxes and equivalence queries that ask whether a given concept/query from the considered class is equivalent to the target. The restriction to bounded arity in (3) can be removed when we admit unrestricted CQs in equivalence queries. We also show that EL-concepts are not polynomial query learnable in the presence of ELI-ontologies.	翻訳日:2021-05-19 23:15:01 公開日:2021-05-18
# (参考訳) 凸クラスタリングソリューションについて On Convex Clustering Solutions ( http://arxiv.org/abs/2105.08348v1 ) ライセンス: CC BY 4.0	Canh Hao Nguyen, Hiroshi Mamitsuka	(参考訳) 凸クラスタリング(convex clustering)は、その凸定式化による効率や最適性などの優れた特性を持つ魅力的なクラスタリングアルゴリズムである。 k平均クラスタリングと凝集クラスタリングの両方を一般化すると考えられている。しかし、凸クラスタリングがこれらのアルゴリズムの望ましい特性を維持するかどうかは不明である。一般的な期待としては、凸クラスタリングは非凸クラスタのような難しいクラスタタイプを学ぶことができる。現在の凸クラスタリングの理解は、十分に分離されたクラスタ上の一貫性結果のみに限られている。我々はその解に対する新しい理解を示す。凸クラスタリングは凸クラスタのみを学習できることを証明する。すると、クラスターは大きなギャップを持つ有界球を持つことを示す。さらに、ソリューション、正規化ハイパーパラメータ、クラスタ化可能なケース、一貫性を特徴付ける。 Convex clustering is an attractive clustering algorithm with favorable properties such as efficiency and optimality owing to its convex formulation. It is thought to generalize both k-means clustering and agglomerative clustering. However, it is not known whether convex clustering preserves desirable properties of these algorithms. A common expectation is that convex clustering may learn difficult cluster types such as non-convex ones. Current understanding of convex clustering is limited to only consistency results on well-separated clusters. We show new understanding of its solutions. We prove that convex clustering can only learn convex clusters. We then show that the clusters have disjoint bounding balls with significant gaps. We further characterize the solutions, regularization hyperparameters, inclusterable cases and consistency.	翻訳日:2021-05-19 21:34:15 公開日:2021-05-18
# (参考訳) 深層強化学習を用いたオンラインマルチモーダル交通計画 Online Multimodal Transportation Planning using Deep Reinforcement Learning ( http://arxiv.org/abs/2105.08374v1 ) ライセンス: CC BY 4.0	Amirreza Farahani, Laura Genga and Remco Dijkman	(参考訳) 本稿では,トラックにコンテナを割り当てたり,目的地に輸送する列車にコンテナを割り当てるマルチモーダル輸送計画問題を解決するための深層強化学習手法を提案する。従来の計画手法は"オフライン"(すなわち、輸送開始前にコンテナのバッチを決定する)で動作するが、提案されたアプローチは"オンライン"であり、輸送が実行される間、個々のコンテナに対して決定を下すことができる。オンライン交通計画は、当初の交通計画に影響を及ぼす可能性のある予期せぬ出来事に効果的に対応し、企業による輸送コストの削減を支援する。提案したDeep Reinforcement Learningアルゴリズムで異なるコンテナ選択ヒューリスティックスを実装し,ロジスティクス企業における実ケーススタディに基づいて,現実的なシナリオをシミュレートしたデータを用いて,各ヒューリスティックのパフォーマンスを評価した。実験の結果,提案手法はコンテナ割り当ての効果的なパターンを学習できることがわかった。競争相手の総輸送コストと車両使用能力の面では20.48%から55.32%、キャパシティは7.51%から20.54%に向上した。さらに,オフライン環境で整数線形計画解法で生成した最適解の容量は,コストで2.7%,オフラインで0.72%であった。 In this paper we propose a Deep Reinforcement Learning approach to solve a multimodal transportation planning problem, in which containers must be assigned to a truck or to trains that will transport them to their destination. While traditional planning methods work "offline" (i.e., they take decisions for a batch of containers before the transportation starts), the proposed approach is "online", in that it can take decisions for individual containers, while transportation is being executed. Planning transportation online helps to effectively respond to unforeseen events that may affect the original transportation plan, thus supporting companies in lowering transportation costs. We implemented different container selection heuristics within the proposed Deep Reinforcement Learning algorithm and we evaluated its performance for each heuristic using data that simulate a realistic scenario, designed on the basis of a real case study at a logistics company. The experimental results revealed that the proposed method was able to learn effective patterns of container assignment. It outperformed tested competitors in terms of total transportation costs and utilization of train capacity by 20.48% to 55.32% for the cost and by 7.51% to 20.54% for the capacity. Furthermore, it obtained results within 2.7% for the cost and 0.72% for the capacity of the optimal solution generated by an Integer Linear Programming solver in an offline setting.	翻訳日:2021-05-19 21:13:03 公開日:2021-05-18
# (参考訳) i2c2w:正確なシーン認識のための画像から文字への変換器 I2C2W: Image-to-Character-to-Word Transformers for Accurate Scene Text Recognition ( http://arxiv.org/abs/2105.08383v1 ) ライセンス: CC0 1.0	Chuhui Xue, Shijian Lu, Song Bai, Wenqing Zhang, Changhu Wang	(参考訳) 自然言語処理の進歩を利用して、最近のシーンのテキスト認識者はエンコーダ-デコーダアーキテクチャを採用しており、テキストイメージはまず代表的特徴に変換され、その後 ‘direct decoding’ を介して文字のシーケンスに変換される。しかし、シーンテキスト画像は複雑な背景や幾何歪みなどの様々な音源の豊かなノイズに悩まされ、デコーダを混乱させ、ノイズの多いデコード時間ステップで視覚的特徴の不正なアライメントにつながる。本稿では,シーンの様々なノイズに対して正確かつ耐性のある新しいシーンテキスト認識装置I2C2Wを提案する。 i2c2wはイメージ・ツー・キャラクタモジュール(i2c)と文字・ワードモジュール(c2w)から構成される。 i2cは文字を検出し、単語内の相対位置を予測する。時間ステップの制限なしに、異なる視覚的特徴のアライメントに基づいて、不正かつ冗長な文字を含む全ての文字を検出する。検出された文字を入力として、C2Wは文字の意味とその位置から学習し、不正かつ冗長な検出をフィルタリングし、最終的な単語認識を生成する。 7つの公開データセットに対する大規模な実験は、I2C2Wが優れた認識性能を達成し、不規則なシーンテキストデータセットに対して大きなマージンで最先端のパフォーマンスを達成していることを示している。 Leveraging the advances of natural language processing, most recent scene text recognizers adopt an encoder-decoder architecture where text images are first converted to representative features and then a sequence of characters via `direct decoding'. However, scene text images suffer from rich noises of different sources such as complex background and geometric distortions which often confuse the decoder and lead to incorrect alignment of visual features at noisy decoding time steps. This paper presents I2C2W, a novel scene text recognizer that is accurate and tolerant to various noises in scenes. I2C2W consists of an image-to-character module (I2C) and a character-to-word module (C2W) which are complementary and can be trained end-to-end. I2C detects characters and predicts their relative positions in a word. It strives to detect all possible characters including incorrect and redundant ones based on different alignments of visual features without the restriction of time steps. Taking the detected characters as input, C2W learns from character semantics and their positions to filter out incorrect and redundant detection and produce the final word recognition. Extensive experiments over seven public datasets show that I2C2W achieves superior recognition performances and outperforms the state-of-the-art by large margins on challenging irregular scene text datasets.	翻訳日:2021-05-19 20:58:29 公開日:2021-05-18
# (参考訳) 道路網における小型物体の検出の改善 Improved detection of small objects in road network sequences ( http://arxiv.org/abs/2105.08416v1 ) ライセンス: CC BY 4.0	Iv\'an Garc\'ia, Rafael Marcos Luque and Ezequiel L\'opez	(参考訳) 現在の道路ネットワークにある膨大な数の既存のIPカメラは、取得したデータを利用してビデオを分析し、重要なイベントを検出する機会である。この目的のためには、数年前まで古典的な人工視覚技術を用いて行われた作業である移動車を検出する必要がある。現在、深層学習ネットワークによって大幅に改善されている。それでも、オブジェクト検出はコンピュータビジョンにおける主要なオープン問題の1つと考えられている。現在のシナリオは絶えず進化しており、この分野を改善しようとする新しいモデルやテクニックが現れています。特に、道路シーンに登場する車両に主に対応している小型物体の検出に関して、新たな問題や欠点が現れる。これらのことは、小さな元素の低い検出率を改善する新しいソリューションが不可欠であることを意味している。様々な研究ラインの中で、この研究は小さな物体の検出に焦点を当てている。特に,本提案では,ビデオ監視カメラで撮影した画像からの車両検出を目的とした。本研究では,畳み込みニューラルネットワーク \emph{(CNN)} による検出に基づく超解像プロセスを適用することで,小型物体の検出を行う新しい手法を提案する。ニューラルネットワークは、画像の解像度を向上させるプロセスと統合され、オブジェクト検出性能が向上する。この手法は,様々なスケールの要素を含む一組のトラヒック画像に対して,モデルにより得られた検出結果に応じて効率性をテストすることで,幅広い状況において良好な結果が得られることを示す。 The vast number of existing IP cameras in current road networks is an opportunity to take advantage of the captured data and analyze the video and detect any significant events. For this purpose, it is necessary to detect moving vehicles, a task that was carried out using classical artificial vision techniques until a few years ago. Nowadays, significant improvements have been obtained by deep learning networks. Still, object detection is considered one of the leading open issues within computer vision. The current scenario is constantly evolving, and new models and techniques are appearing trying to improve this field. In particular, new problems and drawbacks appear regarding detecting small objects, which correspond mainly to the vehicles that appear in the road scenes. All this means that new solutions that try to improve the low detection rate of small elements are essential. Among the different emerging research lines, this work focuses on the detection of small objects. In particular, our proposal aims to vehicle detection from images captured by video surveillance cameras. In this work, we propose a new procedure for detecting small-scale objects by applying super-resolution processes based on detections performed by convolutional neural networks \emph{(CNN)}. The neural network is integrated with processes that are in charge of increasing the resolution of the images to improve the object detection performance. This solution has been tested for a set of traffic images containing elements of different scales to test the efficiency according to the detections obtained by the model, thus demonstrating that our proposal achieves good results in a wide range of situations.	翻訳日:2021-05-19 20:44:02 公開日:2021-05-18
# (参考訳) 深層学習概念に基づくPtychographyのパラメータ改善手法 A parameter refinement method for Ptychography based on Deep Learning concepts ( http://arxiv.org/abs/2105.08058v1 ) ライセンス: CC BY 4.0	Francesco Guzzi, George Kourousias, Fulvio Bill\`e, Roberto Pugliese, Alessandra Gianoncelli and Sergio Carrato	(参考訳) x-ray ptychography(x-ray ptychography)は、生物およびナノテクノロジーの標本の詳細な定量的イメージングを提供する高度な計算顕微鏡技術である。しかし, 伝播距離, 位置誤差, 部分コヒーレンスにおける粗いパラメータは, 実験の有効性をしばしば低下させる。本研究では,これらのアクタを正式に導入し,最適化問題としての再構築全体を解決した。最新のDeep Learningフレームワークは、セットアップの不整合を自律的に補正するために使用され、ポチコグラフィーの再構築の質が向上する。自動的なプロシージャは信頼性のある分析の時間を短縮するために非常に重要であり、この種の顕微鏡を使用するすべての分野に重大な影響を及ぼす。ソフトウェアフレームワークであるSciComPtyにアルゴリズムを実装し、オープンソースとしてリリースしました。我々は,elettra シンクロトロン施設のツインミックビームラインで取得した合成データセットと実データの両方でシステムをテストした。 X-ray Ptychography is an advanced computational microscopy technique which is delivering exceptionally detailed quantitative imaging of biological and nanotechnology specimens. However coarse parametrisation in propagation distance, position errors and partial coherence frequently menaces the experiment viability. In this work we formally introduced these actors, solving the whole reconstruction as an optimisation problem. A modern Deep Learning framework is used to correct autonomously the setup incoherences, thus improving the quality of a ptychography reconstruction. Automatic procedures are indeed crucial to reduce the time for a reliable analysis, which has a significant impact on all the fields that use this kind of microscopy. We implemented our algorithm in our software framework, SciComPty, releasing it as open-source. We tested our system on both synthetic datasets and also on real data acquired at the TwinMic beamline of the Elettra synchrotron facility.	翻訳日:2021-05-19 20:31:52 公開日:2021-05-18
# (参考訳) DRILL:不均衡生涯学習のための動的表現 DRILL: Dynamic Representations for Imbalanced Lifelong Learning ( http://arxiv.org/abs/2105.08445v1 ) ライセンス: CC BY 4.0	Kyra Ahrens, Fares Abawi, Stefan Wermter	(参考訳) 継続的あるいは生涯的学習は、機械学習、特に自然言語処理(NLP)における長年にわたる課題である。 bertのような最先端の言語モデルは、マルチタスク学習シナリオにおける優れたパフォーマンスのために、この分野で新たな時代を迎えてきたが、データ分布のシフトを伴う連続したデータストリームに晒された時に忘れられてしまう。本稿では,オープンドメインテキスト分類のための新しい連続学習アーキテクチャDRILLを紹介する。 DRILLは生物学的にインスパイアされた自己組織化ニューラルアーキテクチャを利用して、BERTから潜在言語表現をタスクインクリメンタルに選択的にゲートする。本実験では,DRILLがタスク境界に関する事前知識を必要とせず,非定常的な非定常データの現実的なシナリオにおいて,現在の手法よりも優れていることを示す。我々の知る限りでは、DRILLはNLPのオープンドメイン生涯学習に自己組織化ニューラルアーキテクチャを使用した最初の種類のものだ。 Continual or lifelong learning has been a long-standing challenge in machine learning to date, especially in natural language processing (NLP). Although state-of-the-art language models such as BERT have ushered in a new era in this field due to their outstanding performance in multitask learning scenarios, they suffer from forgetting when being exposed to a continuous stream of data with shifting data distributions. In this paper, we introduce DRILL, a novel continual learning architecture for open-domain text classification. DRILL leverages a biologically inspired self-organizing neural architecture to selectively gate latent language representations from BERT in a task-incremental manner. We demonstrate in our experiments that DRILL outperforms current methods in a realistic scenario of imbalanced, non-stationary data without prior knowledge about task boundaries. To the best of our knowledge, DRILL is the first of its kind to use a self-organizing neural architecture for open-domain lifelong learning in NLP.	翻訳日:2021-05-19 20:16:59 公開日:2021-05-18
# (参考訳) 局所制御された距離ベクトル流を用いた深部能動輪郭 Deep Active Contours Using Locally Controlled Distance Vector Flow ( http://arxiv.org/abs/2105.08447v1 ) ライセンス: CC BY 4.0	Parastoo Akbari, Atefeh Ziaei, and Hamed Azarnoush	(参考訳) ACM(Active Contours Model)はコンピュータビジョンや画像処理に広く使われている。近年、畳み込みニューラルネットワーク(CNN)は、エネルギー関数と初期化のパラメータへのACMの依存に伴う制限を取り除くために、輪郭の進化と画像セグメント化の過程において、ユーザに代わってアクティブな輪郭と組み合わせられている。しかし、それ以前の作業は、ここで対処する自動初期化を目標としなかった。手動初期化に加えて、現在のメソッドは初期位置に対して高い感度を持ち、境界を正確に定義できない。エネルギー関数パラメータの割り当て問題に加えて,手動初期化や捕捉範囲の不足,境界への収束性の低下といった問題に対処する完全自動画像分割手法を提案する。 2つのcnnを訓練し, 有効輪郭重み付けパラメータを予測し, 距離変換(dt)と初期化円を抽出するための基底真理マスクを生成する。距離変換は、画像の各ピクセルから境界上の最も近い点へ向けてベクトル場を形成するために使用され、その大きさはユークリッド距離写像と等しい。本研究では, ビルディングインスタンスセグメンテーションデータセットであるVayhingenとBingハッシュと, INBreastとDDSM-BCRPの2つのマンモグラフィ画像データセットを含む4つの公開データセットについて評価を行った。今回のアプローチは,平均交点オーバーユニオン(miou)の0.59 ans 2.39パーセント,境界f-score(boundf)の7.38および8.62パーセントという,vaihingenとbing hutsデータセットの最新の研究を上回っている。 INBreast と DDSM-BCRP データセットのDice similarity coefficient は94.23% と 90.89% である。 Active contours Model (ACM) has been extensively used in computer vision and image processing. In recent studies, Convolutional Neural Networks (CNNs) have been combined with active contours replacing the user in the process of contour evolution and image segmentation to eliminate limitations associated with ACM's dependence on parameters of the energy functional and initialization. However, prior works did not aim for automatic initialization which is addressed here. In addition to manual initialization, current methods are highly sensitive to initial location and fail to delineate borders accurately. We propose a fully automatic image segmentation method to address problems of manual initialization, insufficient capture range, and poor convergence to boundaries, in addition to the problem of assignment of energy functional parameters. We train two CNNs, which predict active contour weighting parameters and generate a ground truth mask to extract Distance Transform (DT) and an initialization circle. Distance transform is used to form a vector field pointing from each pixel of the image towards the closest point on the boundary, the size of which is equal to the Euclidean distance map. We evaluate our method on four publicly available datasets including two building instance segmentation datasets, Vaihingen and Bing huts, and two mammography image datasets, INBreast and DDSM-BCRP. Our approach outperforms latest research by 0.59 ans 2.39 percent in mean Intersection-over-Union (mIoU), 7.38 and 8.62 percent in Boundary F-score (BoundF) for Vaihingen and Bing huts datasets, respectively. Dice similarity coefficient for the INBreast and DDSM-BCRP datasets is 94.23% and 90.89%, respectively indicating our method is comparable to state-of-the-art frameworks.	翻訳日:2021-05-19 20:07:23 公開日:2021-05-18
# (参考訳) 顔アンチスプーフィングのための教師なし複合ドメイン適応 Unsupervised Compound Domain Adaptation for Face Anti-Spoofing ( http://arxiv.org/abs/2105.08463v1 ) ライセンス: CC BY 4.0	Ankush Panwar, Pratyush Singh, Suman Saha, Danda Pani Paudel and Luc Van Gool	(参考訳) 現実の環境での顔認証システムを堅牢なものにすることを目的とした顔認証の課題に対処する。モデルがトレーニングされたラベル付きソースドメインと比較して,ライブ対スプーフ顔画像の検出状況は,対象領域で大きく異なる場合がある。このような違いは、新規で未知のスプーフタイプ、照明条件、背景などによって引き起こされる可能性がある。これらの差異は対象を複合ドメインとし、教師なし複ドメイン適応の問題を提起する。本稿では,本研究で初めて,対スプーフィング作業における複合ドメイン仮定の有効性を実証する。そこで本研究では,ソースモデルを対象ドメインに適応させるメモリ拡張手法を提案する。カリキュラム学習とドメインに依存しないソースネットワークトレーニングアプローチを用いることで、適応プロセスをさらに改善する。提案手法は,複数の新しいスプーフ型からなる複合ターゲットドメインに適応する。複数のベンチマークデータセットに対する実験により,提案手法が最先端よりも優れていることを示す。 We address the problem of face anti-spoofing which aims to make the face verification systems robust in the real world settings. The context of detecting live vs. spoofed face images may differ significantly in the target domain, when compared to that of labeled source domain where the model is trained. Such difference may be caused due to new and unknown spoof types, illumination conditions, scene backgrounds, among many others. These varieties of differences make the target a compound domain, thus calling for the problem of the unsupervised compound domain adaptation. We demonstrate the effectiveness of the compound domain assumption for the task of face anti-spoofing, for the first time in this work. To this end, we propose a memory augmentation method for adapting the source model to the target domain in a domain aware manner. The adaptation process is further improved by using the curriculum learning and the domain agnostic source network training approaches. The proposed method successfully adapts to the compound target domain consisting multiple new spoof types. Our experiments on multiple benchmark datasets demonstrate the superiority of the proposed method over the state-of-the-art.	翻訳日:2021-05-19 19:57:13 公開日:2021-05-18
# (参考訳) ビデオポリープセグメンテーションのためのプログレッシブノーマライズドセルフアテンションネットワーク Progressively Normalized Self-Attention Network for Video Polyp Segmentation ( http://arxiv.org/abs/2105.08468v1 ) ライセンス: CC BY 4.0	Ge-Peng Ji, Yu-Cheng Chou, Deng-Ping Fan, Geng Chen, Huazhu Fu, Debesh Jha, and Ling Shao	(参考訳) 既存のビデオポリプセグメンテーション(VPS)モデルは、通常、特徴を抽出するために畳み込みニューラルネットワーク(CNN)を使用する。しかし、cnnは、その限られた受容領域のため、連続するビデオフレームの全体的時間的および空間的情報を十分に活用できないため、偽陽性のセグメンテーション結果が得られる。本稿では,RTX 2080 GPU上でのリアルタイム速度 (~140fps) のポリプビデオから表現を効率よく学習し,後処理を行わない新しい PNS-Net (Progressively Normalized Self-attention Network) を提案する。当社のPNS-Netは,再帰性とCNNを完全に装備する,基本的正規化自己注意ブロックのみをベースとしています。 VPSデータセットに挑戦する実験は、提案されたNS-Netが最先端のパフォーマンスを達成することを示す。また,チャネル分割,ソフトアテンション,プログレッシブ学習戦略の有効性を検討するために,広範囲な実験を行った。 PNS-Netは、異なる設定でうまく機能し、VPSタスクに対する有望なソリューションになります。 Existing video polyp segmentation (VPS) models typically employ convolutional neural networks (CNNs) to extract features. However, due to their limited receptive fields, CNNs can not fully exploit the global temporal and spatial information in successive video frames, resulting in false-positive segmentation results. In this paper, we propose the novel PNS-Net (Progressively Normalized Self-attention Network), which can efficiently learn representations from polyp videos with real-time speed (~140fps) on a single RTX 2080 GPU and no post-processing. Our PNS-Net is based solely on a basic normalized self-attention block, equipping with recurrence and CNNs entirely. Experiments on challenging VPS datasets demonstrate that the proposed PNS-Net achieves state-of-the-art performance. We also conduct extensive experiments to study the effectiveness of the channel split, soft-attention, and progressive learning strategy. We find that our PNS-Net works well under different settings, making it a promising solution to the VPS task.	翻訳日:2021-05-19 19:42:13 公開日:2021-05-18
# (参考訳) ベイズ型プレイヤーモデリングによる高速ゲームコンテンツ適応 Fast Game Content Adaptation Through Bayesian-based Player Modelling ( http://arxiv.org/abs/2105.08484v1 ) ライセンス: CC BY 4.0	Miguel Gonz\'alez-Duque, Rasmus Berg Palm and Sebastian Risi	(参考訳) ゲーム(および多くのユーザ向けシステム)では、ユーザの好みや経験にコンテンツを適用することが重要な課題である。本稿では,動的難易度調整(DDA)の文脈において,この目標を実現する新しい手法を提案する。ここでの目的は、ゲームの内容がプレイヤーのスキルレベルに常に適応し、難しすぎるか難しすぎる状態を避けることで、プレイヤーをエンゲージさせることである。 DDAの現在のシステムは、高価なデータマイニングや、特定のドメイン用に設計された手作りのルールに依存しており、通常はプレイヤーをフローに維持するために適応し、デザイナーが意図的に簡単または困難であるコンテンツを提示する余地は残っていない。本稿では,領域に依存せず,特定の困難を対象とするベイズ最適化ベースのddaシステムを提案する。我々はこのフレームワークをパズルゲームSudokuと単純なRogueライクゲームという2つの異なる領域に展開する。獲得関数の最適化を変更することで,5回未満の反復(sudoku)と15回の反復(simple roguelike)で,異なるスキルレベルを持つプレイヤーに対して,難易度の高いパズルを提示できる。これらの結果は、様々な領域におけるコンテンツ適応の有望な代替案を指している。 In games (as well as many user-facing systems), adapting content to user's preferences and experience is an important challenge. This paper explores a novel method to realize this goal in the context of dynamic difficulty adjustment (DDA). Here the aim is to constantly adapt the content of a game to the skill level of the player, keeping them engaged by avoiding states that are either too difficult or too easy. Current systems for DDA rely on expensive data mining, or on hand-crafted rules designed for particular domains, and usually adapts to keep players in the flow, leaving no room for the designer to present content that is purposefully easy or difficult. This paper presents a Bayesian Optimization-based system for DDA that is agnostic to the domain and that can target particular difficulties. We deploy this framework in two different domains: the puzzle game Sudoku, and a simple Roguelike game. By modifying the acquisition function's optimization, we are reliably able to present a puzzle with a bespoke difficulty for players with different skill levels in less than five iterations (for Sudoku) and fifteen iterations (for the simple Roguelike), significantly outperforming simpler heuristics for difficulty adjustment in said domains, with the added benefit of maintaining a model of the user. These results point towards a promising alternative for content adaption in a variety of different domains.	翻訳日:2021-05-19 19:30:56 公開日:2021-05-18
# (参考訳) ターゲットディスプレイ広告におけるマルチタスク学習によるオーディエンス多段階変換の系列依存性のモデル化 Modeling the Sequential Dependence among Audience Multi-step Conversions with Multi-task Learning in Targeted Display Advertising ( http://arxiv.org/abs/2105.08489v1 ) ライセンス: CC BY 4.0	Dongbo Xi, Zhen Chen, Peng Yan, Yinger Zhang, Yongchun Zhu, Fuzhen Zhuang, Yu Chen	(参考訳) 現実世界のほとんどのオンラインアプリケーション(eコマースや金融など)では、顧客獲得は通常、オーディエンスの多段階変換プロセスである。例えば、インプレッション->クリック->購入プロセスは通常、Eコマースプラットフォームのオーディエンスによって実行される。しかし、従来の広告よりも金融広告(クレジットカード広告など)で顧客を獲得することは困難である。一方、オーディエンスマルチステップ変換パスは長くなる。一方、正のフィードバックは段階的にスパーサー(クラス不均衡)であり、活性化の遅延による最終正のフィードバックを得ることは困難である。マルチタスク学習は、この方向の典型的なソリューションです。この方向にかなりのマルチタスクの取り組みがなされているが、長年にわたる課題は、エンド・ツー・エンドの変換を改善するために、オーディエンス間の長いパスのシーケンシャルな依存を明示的にモデル化する方法である。本稿では,適応型情報転送(ait)モジュールを用いて,オーディエンス間の逐次依存性をモデル化する適応型情報転送マルチタスク(aitm)フレームワークを提案する。 AITモジュールは、異なる変換段階で転送する情報の種類と量を適応的に学習することができる。さらに、損失関数に振舞い期待キャリブレータを組み合わせることで、AITMフレームワークはより正確なエンドツーエンド変換識別を得ることができる。提案するフレームワークはMeituanアプリにデプロイされ、Meituan Co-Branded Credit Cardsのエンドツーエンドの変換レートの高いバナーをリアルタイムでユーザに提示する。産業用および公共用両方の実世界のデータセットのオフライン実験結果から,提案したフレームワークは最先端のベースラインに比べて性能が著しく向上していることが明らかとなった。 In most real-world large-scale online applications (e.g., e-commerce or finance), customer acquisition is usually a multi-step conversion process of audiences. For example, an impression->click->purchase process is usually performed of audiences for e-commerce platforms. However, it is more difficult to acquire customers in financial advertising (e.g., credit card advertising) than in traditional advertising. On the one hand, the audience multi-step conversion path is longer. On the other hand, the positive feedback is sparser (class imbalance) step by step, and it is difficult to obtain the final positive feedback due to the delayed feedback of activation. Multi-task learning is a typical solution in this direction. While considerable multi-task efforts have been made in this direction, a long-standing challenge is how to explicitly model the long-path sequential dependence among audience multi-step conversions for improving the end-to-end conversion. In this paper, we propose an Adaptive Information Transfer Multi-task (AITM) framework, which models the sequential dependence among audience multi-step conversions via the Adaptive Information Transfer (AIT) module. The AIT module can adaptively learn what and how much information to transfer for different conversion stages. Besides, by combining the Behavioral Expectation Calibrator in the loss function, the AITM framework can yield more accurate end-to-end conversion identification. The proposed framework is deployed in Meituan app, which utilizes it to real-timely show a banner to the audience with a high end-to-end conversion rate for Meituan Co-Branded Credit Cards. Offline experimental results on both industrial and public real-world datasets clearly demonstrate that the proposed framework achieves significantly better performance compared with state-of-the-art baselines.	翻訳日:2021-05-19 19:15:08 公開日:2021-05-18
# (参考訳) 画素レベルでのリモートセンシング画像のマルチビューコントラスト符号化 Multi-view Contrastive Coding of Remote Sensing Images at Pixel-level ( http://arxiv.org/abs/2105.08501v1 ) ライセンス: CC BY 4.0	Yuxing Chen	(参考訳) 我々の惑星は複数のセンサー(マルチスペクトル、ライダー、SARなど)と異なる時間で衛星によって観測される。多視点観察は、一つのものよりも補完的な情報をもたらす。あるいは、幾何やセマンティクスなど、異なるビュー間で共有される共通の機能もある。近年,マルチビューリモートセンシング画像のアライメントと,ビュー不変因子のモデル化による単一センサ画像の特徴表現の改善のために,コントラスト学習手法が提案されている。しかし、これらの手法は、事前に定義されたタスクの事前学習、あるいは画像レベルの分類のみに焦点を当てている。さらに、これらの手法は不確実性推定の研究を欠いている。本研究では,この制限を克服するために,ラベルのないマルチビュー設定に基づく画素単位のコントラスト的アプローチを提案する。これは、特徴アライメントにおける対照的な損失と、多視点画像間の均一性によって達成される。このアプローチでは, 擬似媒介ResUnetを用いて, シフトした正の対から特徴を整列させ, 超球上の特徴の誘導分布を均一化することを目的とした表現を学習する。マルチビューリモートセンシング画像の学習特徴を、線形プロトコルの評価と教師なしの変更検出タスクに基づいて評価する。提案手法を動作させるアプローチの重要な特性を分析し,シフト等分散の要求が提案手法の成功を保証し,表現の不確実性の推定が性能向上につながることを見出した。さらに、マルチビューコントラスト学習の性能は、異なるセンサの選択によって影響を受ける。その結果,最先端のマルチビューコントラスト法よりも効率と精度が向上した。 Our planet is viewed by satellites through multiple sensors (e.g., multi-spectral, Lidar and SAR) and at different times. Multi-view observations bring us complementary information than the single one. Alternatively, there are common features shared between different views, such as geometry and semantics. Recently, contrastive learning methods have been proposed for the alignment of multi-view remote sensing images and improving the feature representation of single sensor images by modeling view-invariant factors. However, these methods are based on the pretraining of the predefined tasks or just focus on image-level classification. Moreover, these methods lack research on uncertainty estimation. In this work, a pixel-wise contrastive approach based on an unlabeled multi-view setting is proposed to overcome this limitation. This is achieved by the use of contrastive loss in the feature alignment and uniformity between multi-view images. In this approach, a pseudo-Siamese ResUnet is trained to learn a representation that aims to align features from the shifted positive pairs and uniform the induced distribution of the features on the hypersphere. The learned features of multi-view remote sensing images are evaluated on a liner protocol evaluation and an unsupervised change detection task. We analyze key properties of the approach that make it work, finding that the requirement of shift equivariance ensured the success of the proposed approach and the uncertainty estimation of representations leads to performance improvements. Moreover, the performance of multi-view contrastive learning is affected by the choice of different sensors. Results demonstrate both improvements in efficiency and accuracy over the state-of-the-art multi-view contrastive methods.	翻訳日:2021-05-19 18:59:26 公開日:2021-05-18
# (参考訳) ニューラルマシン翻訳における最小ベイズリスク復号の特性の理解 Understanding the Properties of Minimum Bayes Risk Decoding in Neural Machine Translation ( http://arxiv.org/abs/2105.08504v1 ) ライセンス: CC BY 4.0	Mathias M\"uller and Rico Sennrich	(参考訳) ニューラルマシン翻訳(nmt)は現在、短すぎる翻訳や頻繁な単語の過剰生成といったバイアスを示しており、トレーニングデータやドメインシフトのノイズをコピーするロバスト性が乏しい。最近の研究はこれらの欠点をビーム探索(nmtのデファクト標準推論アルゴリズム)と結びつけており、eikema & aziz (2020) は最小ベイズリスク(mbr)をバイアスのないサンプルにデコードすることを提案している。本稿では,これまでに報告された多数のバイアスとビームサーチの故障事例に対するmbr復号の特性について実験的に検討する。 MBRは、実用関数として使用されるMT測定値から、長さとトークンの周波数バイアスがまだ残っているが、トレーニングデータやドメインシフトのコピーノイズに対する堅牢性も向上している。 Neural Machine Translation (NMT) currently exhibits biases such as producing translations that are too short and overgenerating frequent words, and shows poor robustness to copy noise in training data or domain shift. Recent work has tied these shortcomings to beam search -- the de facto standard inference algorithm in NMT -- and Eikema & Aziz (2020) propose to use Minimum Bayes Risk (MBR) decoding on unbiased samples instead. In this paper, we empirically investigate the properties of MBR decoding on a number of previously reported biases and failure cases of beam search. We find that MBR still exhibits a length and token frequency bias, owing to the MT metrics used as utility functions, but that MBR also increases robustness against copy noise in the training data and domain shift.	翻訳日:2021-05-19 18:46:57 公開日:2021-05-18
# (参考訳) Transformers \`a Grande Vitesse Transformers \`a Grande Vitesse ( http://arxiv.org/abs/2105.08526v1 ) ライセンス: CC BY-SA 4.0	Farid Arthaud, Guillaume Lecoeur, Alban Pierre	(参考訳) 堅牢な走行時間予測は、交通インフラ、特に交通規制と乗客満足度の両方に大きな影響を与える鉄道網の管理において最も重要なものである。我々は,鉄道網全体の規模で鉄道区間を走行する列車の走行時間を予測することを目的として,理論的循環計画に対する列車の遅延を推定する。鉄道会社内の既存の実装は、列車の遅延が残りの旅行の間一定であるように近似して機能する。列車の遅延の進行を予測することは、主要な道路交通予測問題と異なり、列車の間隔、駅の混雑、不均一な車両など、いくつかの難解な現象を含むため、ユニークな難題である。まず,フランス国鉄の遅延伝播現象の実証的証拠を提示し,列車間の相互作用によって遅延が増幅されることを示した。次に, 変圧器アーキテクチャと事前学習した組込みを用いた新しい手法を提案し, 鉄道網全体のスケールで列車の遅延をリアルタイムに並列に予測する手法を提案する(ピーク時3k以上の列車は平均70分間隔で予測を行う)。提案手法は,現在使われている,実験的な予測手法と比較して,実世界のデータに対して非常に肯定的な結果をもたらす。私たちの仕事は、フランスの鉄道会社sncfによる旅客情報システムの実装の初期段階にあり、交通規制決定を支援するツールとして候補となっている。 Robust travel time predictions are of prime importance in managing any transportation infrastructure, and particularly in rail networks where they have major impacts both on traffic regulation and passenger satisfaction. We aim at predicting the travel time of trains on rail sections at the scale of an entire rail network in real-time, by estimating trains' delays relative to a theoretical circulation plan. Existing implementations within railway companies generally work using the approximation that a train's delay will stay constant for the rest of its trip. Predicting the evolution of a given train's delay is a uniquely hard problem, distinct from mainstream road traffic forecasting problems, since it involves several hard-to-model phenomena: train spacing, station congestion and heterogeneous rolling stock among others. We first offer empirical evidence of the previously unexplored phenomenon of delay propagation in the French National Railway Network, leading to delays being amplified by interactions between trains. We then contribute a novel technique using the transformer architecture and pre-trained embeddings to make real-time massively parallel predictions for train delays at the scale of the whole rail network (over 3k trains at peak hours, making predictions at an average horizon of 70 minutes). Our approach yields very positive results on real-world data when compared to currently-used and experimental prediction techniques. Our work is in the early stages of implementation for industrial use at the French railway company SNCF for passenger information systems, and a contender as a tool to aid traffic regulation decisions.	翻訳日:2021-05-19 18:35:22 公開日:2021-05-18
# (参考訳) VASS到達性問題に対するアッカーマン下界の改善 Improved Ackermannian lower bound for the VASS reachability problem ( http://arxiv.org/abs/2105.08551v1 ) ライセンス: CC BY 4.0	S{\l}awomir Lasota	(参考訳) このドラフトは、最近 Czerwi\'nski と Orlikowski によって発表された状態を持つベクトル加算系(VASS)における到達可能性問題に対するアッカーマン下界のフォローアップである。独立して、同じ結果がlerouxによって発表されたが、かなり異なる証拠がある。 czerwi\'nski と orlikowski は、次元 6k$ で $f_k$-hardness、次元 4k+9$ で leroux を証明しているが、単純化された構成により、既に次元 3k+2$ で $f_k$-hardness が得られる。 This draft is a follow-up of the Ackermannian lower bound for the reachability problem in vector addition systems with states (VASS), recently announced by Czerwi\'nski and Orlikowski. Independently, the same result has been announced by Leroux, but with a significantly different proof. We provide a simplification of the former construction, thus improving the lower bound for VASS in fixed dimension: while Czerwi\'nski and Orlikowski prove $F_k$-hardness in dimension $6k$, and Leroux in dimension $4k+9$, the simplified construction yields $F_k$-hardness already in dimension $3k+2$.	翻訳日:2021-05-19 18:16:10 公開日:2021-05-18
# (参考訳) 複雑な3次元環境におけるキュラス探索のための$\beta$-VAE符号化 Fixed $\beta$-VAE Encoding for Curious Exploration in Complex 3D Environments ( http://arxiv.org/abs/2105.08568v1 ) ライセンス: CC BY 4.0	Auguste Lehuger, Matthew Crosby	(参考訳) 好奇心は、環境報酬を内在的な報酬で増やす一般的な方法であり、探索を促進し、スパース報酬設定において特に有用である。キュリオシティは次の状態予測誤差を用いて計算されるため、使用する状態エンコーディングの種類は性能に大きな影響を与える。ランダムな特徴と逆動的特徴は、Atariや他の主に2D環境の以前の結果に基づいて、VAEよりも一般的に好まれる。しかし、VAEと異なり、最適な行動のための十分な情報をエンコードしていないため、環境が複雑化するにつれて、ますます重要になる。本稿では,3D物理環境であるAnimal-AIを用いて,固定された$\beta$-VAEエンコーディングを好奇心で効果的に利用できることを示す。これをカリキュラム学習と組み合わせて、未解決の探索集約的なデトラウトタスクを解き、次の最良エンコーディングに対してトレーニングカリキュラムのサンプル効率を22倍に向上させる。また、atariのブレイクアウトの結果は、ランダムな機能や逆ダイナミクス機能よりも優れたエンコーディングで一致しています。 Curiosity is a general method for augmenting an environment reward with an intrinsic reward, which encourages exploration and is especially useful in sparse reward settings. As curiosity is calculated using next state prediction error, the type of state encoding used has a large impact on performance. Random features and inverse-dynamics features are generally preferred over VAEs based on previous results from Atari and other mostly 2D environments. However, unlike VAEs, they may not encode sufficient information for optimal behaviour, which becomes increasingly important as environments become more complex. In this paper, we use the sparse reward 3D physics environment Animal-AI, to demonstrate how a fixed $\beta$-VAE encoding can be used effectively with curiosity. We combine this with curriculum learning to solve the previously unsolved exploration intensive detour tasks while achieving 22\% gain in sample efficiency on the training curriculum against the next best encoding. We also corroborate the results on Atari Breakout, with our custom encoding outperforming random features and inverse-dynamics features.	翻訳日:2021-05-19 18:04:46 公開日:2021-05-18
# (参考訳) 画像キャプションのための因果干渉によるマルチタスク学習 Dependent Multi-Task Learning with Causal Intervention for Image Captioning ( http://arxiv.org/abs/2105.08573v1 ) ライセンス: CC BY 4.0	Wenqing Chen, Jidong Tian, Caoyun Fan, Hao He, and Yaohui Jin	(参考訳) 画像キャプションの最近の研究は、主に抽出列生成のパラダイムに従い、オブジェクトベースの特徴列を事前抽出し、単一のシーケンス対シーケンスタスクとして画像キャプションを定式化する。 1) モデルが矛盾する事実を生成する内容の不整合,2) モデルが重要な情報の一部を見逃すような情報がない,という2つの問題を発見した。因果的な観点からすると、モデルが視覚的特徴と特定の表現(例えば「長い髪」と「女性」の視覚的特徴)の間の散発的な統計的相関を捉えたからである。本稿では,因果介入(dmtci)を用いた依存型マルチタスク学習フレームワークを提案する。まず、中間タスク、カテゴリの袋生成、最終タスクの前に、画像キャプションを伴います。中間タスクは、モデルが視覚的特徴をよりよく理解し、コンテンツ一貫性の問題を軽減するのに役立つ。次に、Pearlのdo-calculusをモデルに適用し、視覚的特徴と可能共同創設者とのリンクを遮断し、モデルが因果的視覚的特徴にフォーカスできるようにする。特に、高周波の概念セットは、実際の共同設立者が連続空間で推測されるプロキシ共同設立者と見なされる。最後に,マルチエージェント強化学習(marl)戦略を用いてエンドツーエンドトレーニングを可能にし,タスク間エラーの蓄積を低減する。実験により,本モデルがベースラインモデルより優れ,最先端モデルと競合する性能が得られた。 Recent work for image captioning mainly followed an extract-then-generate paradigm, pre-extracting a sequence of object-based features and then formulating image captioning as a single sequence-to-sequence task. Although promising, we observed two problems in generated captions: 1) content inconsistency where models would generate contradicting facts; 2) not informative enough where models would miss parts of important information. From a causal perspective, the reason is that models have captured spurious statistical correlations between visual features and certain expressions (e.g., visual features of "long hair" and "woman"). In this paper, we propose a dependent multi-task learning framework with the causal intervention (DMTCI). Firstly, we involve an intermediate task, bag-of-categories generation, before the final task, image captioning. The intermediate task would help the model better understand the visual features and thus alleviate the content inconsistency problem. Secondly, we apply Pearl's do-calculus on the model, cutting off the link between the visual features and possible confounders and thus letting models focus on the causal visual features. Specifically, the high-frequency concept set is considered as the proxy confounders where the real confounders are inferred in the continuous space. Finally, we use a multi-agent reinforcement learning (MARL) strategy to enable end-to-end training and reduce the inter-task error accumulations. The extensive experiments show that our model outperforms the baseline models and achieves competitive performance with state-of-the-art models.	翻訳日:2021-05-19 17:49:29 公開日:2021-05-18
# (参考訳) 海洋水中の有意な波高の予測 Forecasting Significant Wave Heights in Oceanic Waters ( http://arxiv.org/abs/2105.08583v1 ) ライセンス: CC BY 4.0	Pujan Pokhrel, Elias Ioup, Md Tamjidul Hoque, Mahdi Abdelguerfi, Julian Simeonov	(参考訳) 本稿では,海洋水中の波高を推定するための余分木(et)アルゴリズムに基づく機械学習手法を提案する。点計測を行うCDIPブイから複数の特徴を導出するため,まず様々なパラメータを解析し,30分間隔で予測する。提案アルゴリズムは、それぞれ1日前の予測でScatter Index (SI), Bias, correlation Coefficient, Root Mean Squared Error (RMSE) が0.130,-0.002, 0.97, 0.14であり、テストデータセット上で14日前の予測では0.110,-0.001, 0.98, 0.122である。他の最先端の手法では120時間前にしか予測できないが、さらに14日延長する。この14日間の制限は予測限界ではないが、実験のセットアップによって生じる。提案手法は,スペクトル特性,hvブロッククロスバリデーション,厳密QC基準を含む。提案アルゴリズムは,1日先進予測のための有意な波高予測に一般的に使用される最先端手法よりも,はるかに優れた性能を示す。さらに, 数値計算法と比較して, 提案手法の性能が向上し, 海洋水中の波高を早期に予測できる長周期に拡張できることを示した。 This paper proposes a machine learning method based on the Extra Trees (ET) algorithm for forecasting Significant Wave Heights in oceanic waters. To derive multiple features from the CDIP buoys, which make point measurements, we first nowcast various parameters and then forecast them at 30-min intervals. The proposed algorithm has Scatter Index (SI), Bias, Correlation Coefficient, Root Mean Squared Error (RMSE) of 0.130, -0.002, 0.97, and 0.14, respectively, for one day ahead prediction and 0.110, -0.001, 0.98, and 0.122, respectively, for 14-day ahead prediction on the testing dataset. While other state-of-the-art methods can only forecast up to 120 hours ahead, we extend it further to 14 days. This 14-day limit is not the forecasting limit, but it arises due to our experiment's setup. Our proposed setup includes spectral features, hv-block cross-validation, and stringent QC criteria. The proposed algorithm performs significantly better than the state-of-the-art methods commonly used for significant wave height forecasting for one-day ahead prediction. Moreover, the improved performance of the proposed machine learning method compared to the numerical methods, shows that this performance can be extended to even longer time periods allowing for early prediction of significant wave heights in oceanic waters.	翻訳日:2021-05-19 17:32:17 公開日:2021-05-18
# (参考訳) テキスト分類のための自己解釈型畳み込みニューラルネットワーク Self-interpretable Convolutional Neural Networks for Text Classification ( http://arxiv.org/abs/2105.08589v1 ) ライセンス: CC BY 4.0	Wei Zhao, Rahul Singh, Tarun Joshi, Agus Sudjianto, Vijayan N. Nair	(参考訳) 自然言語処理(NLP)のディープラーニングモデルは本質的に複雑であり、本質的にはブラックボックスと見なされることが多い。本稿では,relu-dnnに固有な局所線形モデルを用いて,テキスト分類問題に対する畳み込みニューラルネットワークの解釈手法を開発した。 CNNモデルは、畳み込み層に埋め込み、最大プールを用いてそれらをフィルタリングし、分類のためにReLU-DNNを使用して最適化する。全体的な自己解釈モデルを得るために、ReLU DNNからの局所線形モデルのシステムは、最大プールフィルタを通して適切なn-gramにマッピングされる。実験データセットを用いた結果から,提案手法は,より複雑なcnnモデルに対して,自己解釈可能で同等の性能を持つ並列モデルを生成することが示された。また,畳み込み層と分類層の複雑さがモデル性能に与える影響についても検討した。 Deep learning models for natural language processing (NLP) are inherently complex and often viewed as black box in nature. This paper develops an approach for interpreting convolutional neural networks for text classification problems by exploiting the local-linear models inherent in ReLU-DNNs. The CNN model combines the word embedding through convolutional layers, filters them using max-pooling, and optimizes using a ReLU-DNN for classification. To get an overall self-interpretable model, the system of local linear models from the ReLU DNN are mapped back through the max-pool filter to the appropriate n-grams. Our results on experimental datasets demonstrate that our proposed technique produce parsimonious models that are self-interpretable and have comparable performance with respect to a more complex CNN model. We also study the impact of the complexity of the convolutional layers and the classification layers on the model performance.	翻訳日:2021-05-19 17:18:17 公開日:2021-05-18
# (参考訳) WOVe:GloVeワード埋め込みに単語順序を組み込む WOVe: Incorporating Word Order in GloVe Word Embeddings ( http://arxiv.org/abs/2105.08597v1 ) ライセンス: CC BY 4.0	Mohammed Ibrahim, Susan Gauch, Tyler Gerth, Brandon Cox	(参考訳) 単語ベクトル表現は、構造化されていないテキストから有用な情報を抽出する新しい機会を開く。単語をベクトルとして定義することで、機械学習アルゴリズムがテキストを理解して情報を抽出することが容易になった。ワードベクトル表現は、単語同義語、単語類似、構文解析など、多くのアプリケーションで使われている。 GloVeは、単語コンテキストと行列ベクトル化に基づいて、エフェクティブなベクトル学習アルゴリズムである。従来のベクトル学習アルゴリズムを改善する。しかし、グローブモデルは文脈の中で単語が現れる順序を明示的に考慮しない。本稿では,グローブワード埋め込みに単語順序を組み込む複数の手法を提案する。実験の結果, 単語順ベクトル(WOVe)の単語埋め込みは, アナログ補完と単語類似性の自然なランゲージタスクにおいて, 未修正のGloVeよりも優れていることがわかった。単語類似性タスクでは、直接結合性がわずかに優れており、平均的なランクが2%上昇している。しかし、GloVeのベースラインでは単語類似タスクが大幅に改善され、平均36.34%の精度が向上した。 Word vector representations open up new opportunities to extract useful information from unstructured text. Defining a word as a vector made it easy for the machine learning algorithms to understand a text and extract information from. Word vector representations have been used in many applications such word synonyms, word analogy, syntactic parsing, and many others. GloVe, based on word contexts and matrix vectorization, is an ef-fective vector-learning algorithm. It improves on previous vector-learning algorithms. However, the GloVe model fails to explicitly consider the order in which words appear within their contexts. In this paper, multiple methods of incorporating word order in GloVe word embeddings are proposed. Experimental results show that our Word Order Vector (WOVe) word embeddings approach outperforms unmodified GloVe on the natural lan-guage tasks of analogy completion and word similarity. WOVe with direct concatenation slightly outperformed GloVe on the word similarity task, increasing average rank by 2%. However, it greatly improved on the GloVe baseline on a word analogy task, achieving an average 36.34% improvement in accuracy.	翻訳日:2021-05-19 17:08:16 公開日:2021-05-18
# (参考訳) 弾性部分マッチングを用いた機能データの形状解析 Shape Analysis of Functional Data with Elastic Partial Matching ( http://arxiv.org/abs/2105.08604v1 ) ライセンス: CC BY 4.0	Darshan Bryner and Anuj Srivastava	(参考訳) 弾性リーマン計量は、関数および曲線形状データの統計処理に過去に成功している。しかし、この使用法には重要な制限が課されており、関数の境界は固定され、一致すると仮定されている。未整合境界を示す機能データは、通常、異なる地理的領域に関連するCOVID-19感染率曲線などの変動進化率を持つ力学系から生じる。この場合、そのようなデータをスライディングバウンダリでモデル化し、部分マッチングを使用する方がより自然である。本稿では,位相可変性と不確定境界下での関数の部分マッチング,比較,クラスタリングを可能にする包括的リーマンフレームワークを開発した。我々は,(1)時変群と時変群の合同作用を形成すること,(2)この共同作用に不変な計量を導入し,弾性的部分マッチングへの勾配に基づくアプローチを可能にすること,(3)計量特性を損なうことなく,両者の相対的影響を制御できる修正を提示すること,により過去の作業を拡張した。このフレームワークは、COVID-19レートカーブの登録とクラスタリング、必須パターンの特定、ミスマッチエラーの最小化、以前の方法と比較してクラスタ内のばらつきの低減のために説明されている。 Elastic Riemannian metrics have been used successfully in the past for statistical treatments of functional and curve shape data. However, this usage has suffered from an important restriction: the function boundaries are assumed fixed and matched. Functional data exhibiting unmatched boundaries typically arise from dynamical systems with variable evolution rates such as COVID-19 infection rate curves associated with different geographical regions. In this case, it is more natural to model such data with sliding boundaries and use partial matching, i.e., only a part of a function is matched to another function. Here, we develop a comprehensive Riemannian framework that allows for partial matching, comparing, and clustering of functions under both phase variability and uncertain boundaries. We extend past work by: (1) Forming a joint action of the time-warping and time-scaling groups; (2) Introducing a metric that is invariant to this joint action, allowing for a gradient-based approach to elastic partial matching; and (3) Presenting a modification that, while losing the metric property, allows one to control relative influence of the two groups. This framework is illustrated for registering and clustering shapes of COVID-19 rate curves, identifying essential patterns, minimizing mismatch errors, and reducing variability within clusters compared to previous methods.	翻訳日:2021-05-19 17:00:33 公開日:2021-05-18
# (参考訳) ベイズニューラルネットワークによる逆例の検出 Detecting Adversarial Examples with Bayesian Neural Network ( http://arxiv.org/abs/2105.08620v1 ) ライセンス: CC BY 4.0	Yao Li, Tongyi Tang, Cho-Jui Hsieh, Thomas C. M. Lee	(参考訳) ディープニューラルネットワーク(Deep Neural Network, DNN)は、人間の自然なイメージと区別できないまま、DNNを騙すために慎重に作られた例である。本稿では,ランダム成分が予測器の滑らかさを向上し,ディープニューラルネットワークの出力分布をシミュレートしやすくするという観測結果に動機づけられた,逆行例を検出する新しい枠組みを提案する。そこで本研究では,バテクタに略されるベイズ型逆例検出器を提案し,逆例検出の性能を向上させる。具体的には,実例と逆例の隠れ層出力の分布差について検討し,ベイズニューラルネットワーク(bnn)のランダム性を用いて隠れ層出力分布をシミュレートし,分布分散を利用して逆例を検出することを提案する。 bnnの利点は、ランダム成分を持たないニューラルネットワークはそのような特性を持たないが、出力が確率的であることである。ポピュラーアタックに対するいくつかのベンチマークデータセットでの実証結果から、提案するバテクタは、敵対的な例検出において最先端の検出器よりも優れていることが分かる。 Deep neural networks (DNNs) are vulnerable against adversarial examples, i.e., examples that are carefully crafted to fool the DNNs while being indistinguishable from the natural images to humans. In this paper, we propose a new framework to detect adversarial examples motivated by the observations that random components can improve the smoothness of predictors and make it easier to simulate output distribution of deep neural network. With these observations, we propose a novel Bayesian adversarial example detector, short for BATector, to improve the performance of adversarial example detection. In specific, we study the distributional difference of hidden layer output between natural and adversarial examples, and propose to use the randomness of Bayesian neural network (BNN) to simulate hidden layer output distribution and leverage the distribution dispersion to detect adversarial examples. The advantage of BNN is that the output is stochastic while neural network without random components do not have such characteristics. Empirical results on several benchmark datasets against popular attacks show that, the proposed BATector outperforms the state-of-the-art detectors in adversarial example detection.	翻訳日:2021-05-19 16:32:16 公開日:2021-05-18
# (参考訳) Zorro: グラフニューラルネットワークにおける妥当性,スパース,安定した説明 Zorro: Valid, Sparse, and Stable Explanations in Graph Neural Networks ( http://arxiv.org/abs/2105.08621v1 ) ライセンス: CC BY 4.0	Thorben Funke, Megha Khosla, Avishek Anand	(参考訳) グラフニューラルネットワークの普及と応用により、GNNモデルの判断を解釈し理解するためのいくつかの提案がなされている。 GNNモデルの説明は他の入力設定と原理的に異なる。グラフ構造で接続された特徴やその他の関連インスタンスを入力する決定を属性とすることが重要である。我々は,gnnモデルが生成するラベル分布と説明との相互情報を最大化する先行説明生成手法が制限的であることを見出した。具体的には、既存のアプローチでは、予測、スパース、あるいは入力摂動に頑健な説明を強制しない。本稿では,GNNにおける説明手法が従うべき基本原理を概説し,説明の有効性の尺度として計量忠実度を導入する。本稿では、簡単な組合せ法を用いて忠実度を最適化する速度歪み理論の原理に基づく新しいアプローチZorroを提案する。実データと合成データセットの大規模な実験により、Zorroは既存のGNNの説明手法よりもスペーサー、安定、忠実な説明を生み出すことが明らかになった。 With the ever-increasing popularity and applications of graph neural networks, several proposals have been made to interpret and understand the decisions of a GNN model. Explanations for a GNN model differ in principle from other input settings. It is important to attribute the decision to input features and other related instances connected by the graph structure. We find that the previous explanation generation approaches that maximize the mutual information between the label distribution produced by the GNN model and the explanation to be restrictive. Specifically, existing approaches do not enforce explanations to be predictive, sparse, or robust to input perturbations. In this paper, we lay down some of the fundamental principles that an explanation method for GNNs should follow and introduce a metric fidelity as a measure of the explanation's effectiveness. We propose a novel approach Zorro based on the principles from rate-distortion theory that uses a simple combinatorial procedure to optimize for fidelity. Extensive experiments on real and synthetic datasets reveal that Zorro produces sparser, stable, and more faithful explanations than existing GNN explanation approaches.	翻訳日:2021-05-19 16:18:36 公開日:2021-05-18
# (参考訳) 相関構造を用いた抽象画像の美的評価 Assessing aesthetics of generated abstract images using correlation structure ( http://arxiv.org/abs/2105.08635v1 ) ライセンス: CC BY 4.0	Sina Khajehabdollahi, Georg Martius, Anna Levina	(参考訳) 自然画像や人間の選択画像から偏りなく抽象美的画像を生成することができるか? 美的イメージは相関関数に含まれているか? 本稿では,これらの質問に対する回答について述べる。ランダム重みと異なるアーキテクチャを持つ合成パターン生成ネットワークを用いて画像を生成する。ランダムに選択された重みであっても、相関関数はネットワークアーキテクチャによって決定される。制御された実験では、人間はすべての生成された画像の大規模なデータセットから美的イメージを抽出した。統計的解析により、相関関数は美的画像では確かに異なることが分かる。 Can we generate abstract aesthetic images without bias from natural or human selected image corpi? Are aesthetic images singled out in their correlation functions? In this paper we give answers to these and more questions. We generate images using compositional pattern-producing networks with random weights and varying architecture. We demonstrate that even with the randomly selected weights the correlation functions remain largely determined by the network architecture. In a controlled experiment, human subjects picked aesthetic images out of a large dataset of all generated images. Statistical analysis reveals that the correlation function is indeed different for aesthetic images.	翻訳日:2021-05-19 15:54:24 公開日:2021-05-18
# (参考訳) CoTexT: Code-Text Transformerによるマルチタスク学習 CoTexT: Multi-task Learning with Code-Text Transformer ( http://arxiv.org/abs/2105.08645v1 ) ライセンス: CC BY 4.0	Long Phan, Hieu Tran, Daniel Le, Hieu Nguyen, James Anibal, Alec Peltekian, and Yanfang Ye	(参考訳) マルチタスク学習を通じて自然言語(NL)とプログラミング言語(PL)の代表的な文脈を学習するトランスフォーマーベースのアーキテクチャエンコーダデコーダモデルであるCoTexTを提案する。 CoTexTは、コード要約/文書化、コード生成、欠陥検出、コードデバッギングなど、下流のNL-PLタスクをサポートする汎用的な理解とコードテキスト生成を学ぶために、大規模なプログラミング言語コーパスに基づいて、自己管理型で事前訓練されている。我々は、CoTexTを利用可能なPLコーパスの異なる組み合わせで訓練する。これは、"bimodal"データと"unimodal"データの両方で、後者は、入力シーケンス内の自然文と対応するコードスニペットの組み合わせであり、後者は単なるコードスニペットである。マルチタスク学習のCoTexTをCodeXGLUE上で生成・分類タスクで評価し,すべての下流タスクで最先端を実現する。 We present CoTexT, a transformer-based architecture encoder-decoder pre-trained model that learns the representative context between natural language (NL) and programming language (PL) through multi-task learning. CoTexT is pre-trained, in self-supervised fashion, based on large programming language corpus to learn general-purpose understanding and code-text generation supporting downstream NL-PL task such as code summarizing/documentation, code generation, defect detection, code debugging, etc. We train CoTexT on different combination of available PL corpus including both "bimodal" and "unimodal" data where the former is the combinations of both natural texts and their corresponding code snippets in an input sequence and the latter is merely code snippets. We evaluate multi-task learning CoTexT on different generation and classification tasks on CodeXGLUE and it achieves state-of-the-art on all downstream tasks.	翻訳日:2021-05-19 15:44:29 公開日:2021-05-18
# (参考訳) IntFormer: Transformerアーキテクチャの助けを借りて歩行者の意図を予測する IntFormer: Predicting pedestrian intention with the aid of the Transformer architecture ( http://arxiv.org/abs/2105.08647v1 ) ライセンス: CC BY-SA 4.0	J. Lorenzo, I. Parra and M. A. Sotelo	(参考訳) 歩行者の横断行動を理解することは、インテリジェントな車両開発において重要な目標であり、セキュリティと交通の流れの改善につながる。本稿では,IntFormerという手法を開発した。これはトランスフォーマーアーキテクチャとrubiksnetと呼ばれる新しい畳み込みビデオ分類モデルに基づいている。最近のベンチマークでの評価手順に従うと、我々のモデルは高い性能(約40$ seq)で最先端の結果に達することを示す。 1秒あたり)とサイズ(8\times $smaller than the best performing model)で、リアルタイム使用に適している。また、各入力特徴についても検討し、Ego-vehicleの速度が最も重要な変数であることを発見した。 Understanding pedestrian crossing behavior is an essential goal in intelligent vehicle development, leading to an improvement in their security and traffic flow. In this paper, we developed a method called IntFormer. It is based on transformer architecture and a novel convolutional video classification model called RubiksNet. Following the evaluation procedure in a recent benchmark, we show that our model reaches state-of-the-art results with good performance ($\approx 40$ seq. per second) and size ($8\times $smaller than the best performing model), making it suitable for real-time usage. We also explore each of the input features, finding that ego-vehicle speed is the most important variable, possibly due to the similarity in crossing cases in PIE dataset.	翻訳日:2021-05-19 15:33:27 公開日:2021-05-18
# (参考訳) スケーラブルコンテンツに基づくビジュアルメディア検索のためのマルチモーダルディープラーニングフレームワーク A multimodal deep learning framework for scalable content based visual media retrieval ( http://arxiv.org/abs/2105.08665v1 ) ライセンス: CC BY 4.0	Ambareesh Ravi, Amith Nandakumar	(参考訳) 本稿では,画像と映像の両方に対して協調的に動作可能な深層学習の力を活用し,コンテンツベースビジュアルメディア検索システムのための新しい,効率的,モジュール性,スケーラブルなフレームワークを提案し,検索のための効率的な比較・フィルタリング指標を提案する。提案手法を従来の手法と比較し,提案手法の有効性と効率性,検索アーキテクチャの能力をさらに高める可能性のある改善を実証する。 We propose a novel, efficient, modular and scalable framework for content based visual media retrieval systems by leveraging the power of Deep Learning which is flexible to work both for images and videos conjointly and we also introduce an efficient comparison and filtering metric for retrieval. We put forward our findings from critical performance tests comparing our method to the predominant conventional approach to demonstrate the feasibility and efficiency of the proposed solution with best practices, possible improvements that may further augment the ability of retrieval architectures.	翻訳日:2021-05-19 15:24:19 公開日:2021-05-18
# (参考訳) 数値エッジ属性を用いた知識グラフからの埋め込み学習 Learning Embeddings from Knowledge Graphs With Numeric Edge Attributes ( http://arxiv.org/abs/2105.08683v1 ) ライセンス: CC BY-SA 4.0	Sumit Pai, Luca Costabello	(参考訳) 知識グラフのエッジに関連する数値は、遺伝的データからソーシャルネットワークまで、多くのシナリオにおいて不確実性、エッジの重要性、さらには帯域外知識を表すために使われてきた。それにもかかわらず、従来の知識グラフ埋め込みモデルは、予測力を損なうような情報をキャプチャするように設計されていない。本稿では,従来の知識グラフ埋め込みアーキテクチャのスコアリング層に,数値エッジ属性を注入する新しい手法を提案する。数値知識グラフの公開実験により,本手法は従来の数値知識ベースラインよりも,最近のukgeモデルよりも優れていることが示された。 Numeric values associated to edges of a knowledge graph have been used to represent uncertainty, edge importance, and even out-of-band knowledge in a growing number of scenarios, ranging from genetic data to social networks. Nevertheless, traditional knowledge graph embedding models are not designed to capture such information, to the detriment of predictive power. We propose a novel method that injects numeric edge attributes into the scoring layer of a traditional knowledge graph embedding architecture. Experiments with publicly available numeric-enriched knowledge graphs show that our method outperforms traditional numeric-unaware baselines as well as the recent UKGE model.	翻訳日:2021-05-19 15:10:58 公開日:2021-05-18
# (参考訳) 動的チーム構成のためのコーチプレイヤマルチエージェント強化学習 Coach-Player Multi-Agent Reinforcement Learning for Dynamic Team Composition ( http://arxiv.org/abs/2105.08692v1 ) ライセンス: CC BY 4.0	Bo Liu, Qiang Liu, Peter Stone, Animesh Garg, Yuke Zhu and Animashree Anandkumar	(参考訳) 現実世界のマルチエージェントシステムでは、異なる能力を持つエージェントがチーム全体の目標を変更することなく参加または離脱することができる。このようなダイナミックな構成でチームをコーディネートすることは難しい。この問題に対処するためのコーチ・プレイヤ・フレームワークであるCOPAを提案する。コーチは環境をグローバルに把握し、個々の戦略を分散することで、部分的な視点しか持たないプレイヤーをコーディネートしていると仮定する。具体的には,1) コーチと選手の双方に注意機構を導入し,2) 学習の規則化のための変動目標を提案し,3) コーチが選手といつコミュニケーションをするかを決めるための適応的なコミュニケーション手法を設計する。本手法は,資源収集タスク,救助ゲーム,およびStarCraftマイクロマネジメントタスクにおいて検証する。新しいチーム構成にゼロショットの一般化を実証する。本手法は,全プレイヤーが環境をフルに把握できる環境よりも,同等あるいは優れた性能を実現する。また,適応的なコミュニケーション戦略を用いることで,コーチが13%の時間でコミュニケーションを行う場合でも,パフォーマンスは高いままである。 In real-world multiagent systems, agents with different capabilities may join or leave without altering the team's overarching goals. Coordinating teams with such dynamic composition is challenging: the optimal team strategy varies with the composition. We propose COPA, a coach-player framework to tackle this problem. We assume the coach has a global view of the environment and coordinates the players, who only have partial views, by distributing individual strategies. Specifically, we 1) adopt the attention mechanism for both the coach and the players; 2) propose a variational objective to regularize learning; and 3) design an adaptive communication method to let the coach decide when to communicate with the players. We validate our methods on a resource collection task, a rescue game, and the StarCraft micromanagement tasks. We demonstrate zero-shot generalization to new team compositions. Our method achieves comparable or better performance than the setting where all players have a full view of the environment. Moreover, we see that the performance remains high even when the coach communicates as little as 13% of the time using the adaptive communication strategy.	翻訳日:2021-05-19 14:58:53 公開日:2021-05-18
# (参考訳) Manifold-Aware Wasserstein GAN を用いた人間の動作予測 Human Motion Prediction Using Manifold-Aware Wasserstein GAN ( http://arxiv.org/abs/2105.08715v1 ) ライセンス: CC BY 4.0	Baptiste Chopin, Naima Otberdout, Mohamed Daoudi, Angela Bartolo	(参考訳) ヒューマンモーション予測は、事前ポーズシーケンスが与えられた将来の人間のポーズを予測することを目的としている。予測運動の不連続性と長期地平線の性能劣化は、現在も文献で直面する主な課題である。本研究では,人間の動きのコンパクトな表現を用いて,これらの課題に対処する。具体的には、3次元人間のポーズの時間的進化を軌跡としてモデル化し、人間の動きを球面多様体上の単一点にマッピングする。これらの非ユークリッド表現を学ぶために、異なる損失を通じて人間の運動の時間的および空間的依存性を捉える多様体認識ワッサースタイン生成逆モデルを構築する。大規模な実験により、我々のアプローチはCMU MoCapとHuman 3.6Mデータセットの最先端よりも優れていることが示された。定性的結果は予測運動の滑らかさを示す。 Human motion prediction aims to forecast future human poses given a prior pose sequence. The discontinuity of the predicted motion and the performance deterioration in long-term horizons are still the main challenges encountered in current literature. In this work, we tackle these issues by using a compact manifold-valued representation of human motion. Specifically, we model the temporal evolution of the 3D human poses as trajectory, what allows us to map human motions to single points on a sphere manifold. To learn these non-Euclidean representations, we build a manifold-aware Wasserstein generative adversarial model that captures the temporal and spatial dependencies of human motion through different losses. Extensive experiments show that our approach outperforms the state-of-the-art on CMU MoCap and Human 3.6M datasets. Our qualitative results show the smoothness of the predicted motions.	翻訳日:2021-05-19 14:33:48 公開日:2021-05-18
# 交代最小化による線形メタラーニング Sample Efficient Linear Meta-Learning by Alternating Minimization ( http://arxiv.org/abs/2105.08306v1 ) ライセンス: Link先を確認	Kiran Koshy Thekumparampil, Prateek Jain, Praneeth Netrapalli, Sewoong Oh	(参考訳) メタラーニングは、与えられたタスクセットから知識を合成して利用し、非常に小さなデータを使って新しいタスクを迅速に学習する。低次元部分空間にある線形回帰タスクのメタラーニングは、この領域で広く研究されている基本的な問題である。しかし、既存の結果は、非常に最適な推定誤差を保証するか、タスク毎に$\Omega(d)$サンプルを必要とする($d$はデータ次元である)。本研究では,低次元部分空間と回帰器を交互に学習する簡易交互最小化法(MLLAM)について検討する。定数部分空間次元 mllam は、タスクごとに$\omega(\log d)$ のサンプルしか必要とせず、ほぼ最適な推定誤差が得られる。しかし、タスク毎に必要なサンプル数はタスク数で対数的に増加する。低雑音環境下でのこの対策として,タスク毎のサンプル数が任意に多数存在する場合でも,MLLAMと同じ強い統計的保証を保証するタスクサブセット選択方式を提案する。 Meta-learning synthesizes and leverages the knowledge from a given set of tasks to rapidly learn new tasks using very little data. Meta-learning of linear regression tasks, where the regressors lie in a low-dimensional subspace, is an extensively-studied fundamental problem in this domain. However, existing results either guarantee highly suboptimal estimation errors, or require $\Omega(d)$ samples per task (where $d$ is the data dimensionality) thus providing little gain over separately learning each task. In this work, we study a simple alternating minimization method (MLLAM), which alternately learns the low-dimensional subspace and the regressors. We show that, for a constant subspace dimension MLLAM obtains nearly-optimal estimation error, despite requiring only $\Omega(\log d)$ samples per task. However, the number of samples required per task grows logarithmically with the number of tasks. To remedy this in the low-noise regime, we propose a novel task subset selection scheme that ensures the same strong statistical guarantee as MLLAM, even with bounded number of samples per task for arbitrarily large number of tasks.	翻訳日:2021-05-19 14:18:14 公開日:2021-05-18
# データレス知識蒸留におけるコントラストモデルインバージョン Contrastive Model Inversion for Data-Free Knowledge Distillation ( http://arxiv.org/abs/2105.08584v1 ) ライセンス: Link先を確認	Gongfan Fang, Jie Song, Xinchao Wang, Chengchao Shen, Xingen Wang, Mingli Song	(参考訳) トレーニング済みのモデルからトレーニングデータを復元することを目的としているモデル反転は、最近実現可能であることが証明された。しかし, 既存の逆転法では, 合成されたインスタンスは互いに非常によく似ており, 知識蒸留などの下流タスクに限定的な有効性を示すモード崩壊問題に悩まされることが多い。本稿では,データ多様性を最適化可能な目的として明示的にモデル化し,モード崩壊問題を緩和するContrastive Model Inversion~(CMI)を提案する。我々の主な観察では、同じ量のデータの制約の下では、高いデータの多様性は、通常より強いインスタンス識別を示す。この目的のために、我々はCMIにおいて、前回のバッチで既に合成されたものと区別できるように合成インスタンスを奨励する対照的な学習目標を紹介した。 CIFAR-10, CIFAR-100, Tiny-ImageNetの事前学習モデル実験により, CMIは芸術的状況よりも視覚的に可視なインスタンスを生成するだけでなく, 生成したデータを知識蒸留に使用する場合, 極めて優れた性能が得られることが示された。コードは \url{https://github.com/zju-vipa/DataFree} で入手できる。 Model inversion, whose goal is to recover training data from a pre-trained model, has been recently proved feasible. However, existing inversion methods usually suffer from the mode collapse problem, where the synthesized instances are highly similar to each other and thus show limited effectiveness for downstream tasks, such as knowledge distillation. In this paper, we propose Contrastive Model Inversion~(CMI), where the data diversity is explicitly modeled as an optimizable objective, to alleviate the mode collapse issue. Our main observation is that, under the constraint of the same amount of data, higher data diversity usually indicates stronger instance discrimination. To this end, we introduce in CMI a contrastive learning objective that encourages the synthesizing instances to be distinguishable from the already synthesized ones in previous batches. Experiments of pre-trained models on CIFAR-10, CIFAR-100, and Tiny-ImageNet demonstrate that CMI not only generates more visually plausible instances than the state of the arts, but also achieves significantly superior performance when the generated data are used for knowledge distillation. Code is available at \url{https://github.com/zju-vipa/DataFree}.	翻訳日:2021-05-19 14:17:37 公開日:2021-05-18
# 線形複雑度を有する変圧器の相対位置符号化 Relative Positional Encoding for Transformers with Linear Complexity ( http://arxiv.org/abs/2105.08399v1 ) ライセンス: Link先を確認	Antoine Liutkus, Ond\v{r}ej C\'ifka, Shih-Lun Wu, Umut \c{S}im\c{s}ekli, Yi-Hsuan Yang, Ga\"el Richard	(参考訳) トランスフォーマーモデルの最近の進歩は、線形空間と時間複雑さのために、前例のないシーケンス長を許容している。一方、相対位置符号化 (relative positional encoding, rpe) は古典的トランスフォーマーにとって有益であり、推論のための絶対位置ではなくラグを利用する。しかし、最近のトランスフォーマーの線形変種には RPE が利用できないのは、注意行列の明示的な計算を必要とするためである。本稿では,このギャップを埋めて,古典的な付加形(正弦波)PEの代替として使用でき,RPEのように確実に振る舞うPEを生成する方法として,確率的位置エンコーディングを提案する。主な理論的貢献は、位置符号化と相関したガウス過程の相互共分散構造を関連付けることである。本稿では,Long-Range Arenaベンチマークと音楽生成におけるアプローチの性能について述べる。 Recent advances in Transformer models allow for unprecedented sequence lengths, due to linear space and time complexity. In the meantime, relative positional encoding (RPE) was proposed as beneficial for classical Transformers and consists in exploiting lags instead of absolute positions for inference. Still, RPE is not available for the recent linear-variants of the Transformer, because it requires the explicit computation of the attention matrix, which is precisely what is avoided by such methods. In this paper, we bridge this gap and present Stochastic Positional Encoding as a way to generate PE that can be used as a replacement to the classical additive (sinusoidal) PE and provably behaves like RPE. The main theoretical contribution is to make a connection between positional encoding and cross-covariance structures of correlated Gaussian processes. We illustrate the performance of our approach on the Long-Range Arena benchmark and on music generation.	翻訳日:2021-05-19 14:17:15 公開日:2021-05-18
# 事前と後方のパラメトリゼーション不変な解釈 Parametrization invariant interpretation of priors and posteriors ( http://arxiv.org/abs/2105.08304v1 ) ライセンス: Link先を確認	Jesus Cerquides	(参考訳) 本稿では、リーマン多様体上の確率を利用して、ベイズ予想における事前および後続の解釈を再考する。主マインドシフトは、「事前分布が我々のモデルのパラメータ上で確率分布を確立する」という考えから「事前分布が確率分布を超えた確率分布を確立する」という考えに移行することである。そのため、我々の確率モデルがフィッシャー計量を持つリーマン多様体であると仮定する。この考え方の下では、確率分布上の任意の分布は「内在的」であり、つまり多様体に選択される特定のパラメトリゼーションに不変である。我々はベルヌーイ分布の多様体上の分布の単純な解析を通じてアイデアを例示する。最大アフター推定の最大の欠点の1つは、それらはパラメトリゼーションに依存することである。ここで開発された理解に基づき、パラメトリゼーションとは独立な最大アフター推定を定義することができる。 In this paper we leverage on probability over Riemannian manifolds to rethink the interpretation of priors and posteriors in Bayesian inference. The main mindshift is to move away from the idea that "a prior distribution establishes a probability distribution over the parameters of our model" to the idea that "a prior distribution establishes a probability distribution over probability distributions". To do that we assume that our probabilistic model is a Riemannian manifold with the Fisher metric. Under this mindset, any distribution over probability distributions should be "intrinsic", that is, invariant to the specific parametrization which is selected for the manifold. We exemplify our ideas through a simple analysis of distributions over the manifold of Bernoulli distributions. One of the major shortcomings of maximum a posteriori estimates is that they depend on the parametrization. Based on the understanding developed here, we can define the maximum a posteriori estimate which is independent of the parametrization.	翻訳日:2021-05-19 14:16:59 公開日:2021-05-18
# スタイル誘導型プランニングによるスタイリズドストーリー生成 Stylized Story Generation with Style-Guided Planning ( http://arxiv.org/abs/2105.08625v1 ) ライセンス: Link先を確認	Xiangzhe Kong, Jialiang Huang, Ziquan Tung, Jian Guan and Minlie Huang	(参考訳) 現在のストーリーテリングシステムは、ナレーションスタイルを考慮せずにコヒーレントなプロットでストーリーを生成することに焦点を当てている。そこで,本稿では,先進的な文脈を与えられたスペクティブスタイルで物語を生成する新しいタスク,スタイル化されたストーリージェネレーションを提案する。この問題に対処するために,まず文体化されたキーワードを計画し,そのキーワードの誘導で全ストーリーを生成する新しい生成モデルを提案する。さらに、生成したストーリーと特定スタイルの整合性を評価するために、2つの自動メトリクスを提案する。実験では、ROCStoriesデータセット(Mostafazadeh et al., 2016)に基づいて、当社のモデルが制御可能であることを実証した。本研究は,今後の研究におけるスタイリズドストーリー生成の展望を示す。 Current storytelling systems focus more ongenerating stories with coherent plots regard-less of the narration style, which is impor-tant for controllable text generation. There-fore, we propose a new task, stylized story gen-eration, namely generating stories with speci-fied style given a leading context. To tacklethe problem, we propose a novel generationmodel that first plans the stylized keywordsand then generates the whole story with theguidance of the keywords. Besides, we pro-pose two automatic metrics to evaluate theconsistency between the generated story andthe specified style. Experiments demonstratesthat our model can controllably generateemo-tion-driven orevent-driven stories based onthe ROCStories dataset (Mostafazadeh et al.,2016). Our study presents insights for stylizedstory generation in further research.	翻訳日:2021-05-19 14:16:45 公開日:2021-05-18
# ビデオグラウンドのためのシーケンスマッチングを用いた並列アテンションネットワーク Parallel Attention Network with Sequence Matching for Video Grounding ( http://arxiv.org/abs/2105.08481v1 ) ライセンス: Link先を確認	Hao Zhang, Aixin Sun, Wei Jing, Liangli Zhen, Joey Tianyi Zhou, Rick Siow Mong Goh	(参考訳) ビデオグラウンディングは、意味的に言語クエリに対応する時間モーメントを検索することを目的としている。本研究では,マルチモーダル表現学習とターゲットモーメント境界予測という課題に対処するために,シーケンスマッチングを用いた並列注意ネットワーク(SeqPAN)を提案する。我々は,ビデオとテキスト間の自己モダルコンテキストとクロスモダル注意情報を効果的に捉えるために,自己誘導型並列アテンションモジュールを設計した。自然言語処理におけるシーケンスラベリングタスクにインスパイアされた我々は、真理モーメントを開始、内部、終了領域に分割した。次に,領域ラベルを用いた開始/終了境界予測を導くシーケンスマッチング戦略を提案する。 3つのデータセットの実験結果は、SeqPANが最先端の手法よりも優れていることを示している。さらに、自己誘導並列注意モジュールとシーケンスマッチングモジュールの有効性を検証する。 Given a video, video grounding aims to retrieve a temporal moment that semantically corresponds to a language query. In this work, we propose a Parallel Attention Network with Sequence matching (SeqPAN) to address the challenges in this task: multi-modal representation learning, and target moment boundary prediction. We design a self-guided parallel attention module to effectively capture self-modal contexts and cross-modal attentive information between video and text. Inspired by sequence labeling tasks in natural language processing, we split the ground truth moment into begin, inside, and end regions. We then propose a sequence matching strategy to guide start/end boundary predictions using region labels. Experimental results on three datasets show that SeqPAN is superior to state-of-the-art methods. Furthermore, the effectiveness of the self-guided parallel attention module and the sequence matching module is verified.	翻訳日:2021-05-19 14:16:30 公開日:2021-05-18
# 適応型ビデオ圧縮センシングのための強化学習 Reinforcement Learning for Adaptive Video Compressive Sensing ( http://arxiv.org/abs/2105.08205v1 ) ライセンス: Link先を確認	Sidi Lu, Xin Yuan, Aggelos K Katsaggelos, Weisong Shi	(参考訳) 映像圧縮センシングに強化学習を適用し,圧縮比を適応させる。具体的には、低速カメラを用いて高速映像を撮影するビデオスナップショット圧縮画像(SCI)について、スナップショット計測から複数の(B)ビデオフレームを再構成できると考えられる。前回の研究では、異なる場面でビデオSCIシステムにBを適応する方法が研究の欠如となっている。本稿では,強化学習(RL)を用いて,このギャップを埋める。再構成のための様々な畳み込みニューラルネットワークと同様にrlモデルが学習され、ビデオsciシステムの適応的センシングを実現する。さらに、再構成のないビデオSCI測定を直接使用したオブジェクト検出ネットワークの性能を用いて、RLに基づく適応的なビデオ圧縮センシングを行う。したがって,提案手法は低コストかつリアルタイムに実現可能である。我々の研究は、ビデオSCIの実際の応用に向けて一歩前進する。 We apply reinforcement learning to video compressive sensing to adapt the compression ratio. Specifically, video snapshot compressive imaging (SCI), which captures high-speed video using a low-speed camera is considered in this work, in which multiple (B) video frames can be reconstructed from a snapshot measurement. One research gap in previous studies is how to adapt B in the video SCI system for different scenes. In this paper, we fill this gap utilizing reinforcement learning (RL). An RL model, as well as various convolutional neural networks for reconstruction, are learned to achieve adaptive sensing of video SCI systems. Furthermore, the performance of an object detection network using directly the video SCI measurements without reconstruction is also used to perform RL-based adaptive video compressive sensing. Our proposed adaptive SCI method can thus be implemented in low cost and real time. Our work takes the technology one step further towards real applications of video SCI.	翻訳日:2021-05-19 14:15:20 公開日:2021-05-18
# NExT-QA: 時間的行動の説明に対する質問のNext Phase NExT-QA:Next Phase of Question-Answering to Explaining Temporal Actions ( http://arxiv.org/abs/2105.08276v1 ) ライセンス: Link先を確認	Junbin Xiao, Xindi Shang, Angela Yao and Tat-Seng Chua	(参考訳) ビデオ質問応答(VideoQA)ベンチマークであるNExT-QAを導入し,映像理解の促進と時間的行動の説明を行う。本データセットに基づいて,因果行動推論,時間的行動推論,共通場面理解を対象とする複数選択およびオープンエンドQAタスクを設定した。ベースラインの広範囲な解析とビデオQA手法の確立により, 浅いシーン記述では高い性能を示すが, 因果的・時間的行動推論では弱いことがわかった。さらに, 複数選択QAに適応したモデルでは, 解の一般化に苦慮している。これにより、これらのモデルが改善の可能性を推論し強調する能力に疑問が持ち上がっている。 NExT-QAが次世代のVQA研究を指導し、表面的なシーン記述を超えて、ビデオのより深い理解へと進むことを願っている。 (データセットと関連するリソースはhttps://github.com/doc-doc/NExT-QA.git)。 We introduce NExT-QA, a rigorously designed video question answering (VideoQA) benchmark to advance video understanding from describing to explaining the temporal actions. Based on the dataset, we set up multi-choice and open-ended QA tasks targeting causal action reasoning, temporal action reasoning, and common scene comprehension. Through extensive analysis of baselines and established VideoQA techniques, we find that top-performing methods excel at shallow scene descriptions but are weak in causal and temporal action reasoning. Furthermore, the models that are effective on multi-choice QA, when adapted to open-ended QA, still struggle in generalizing the answers. This raises doubt on the ability of these models to reason and highlights possibilities for improvement. With detailed results for different question types and heuristic observations for future works, we hope NExT-QA will guide the next generation of VQA research to go beyond superficial scene description towards a deeper understanding of videos. (The dataset and related resources are available at https://github.com/doc-doc/NExT-QA.git)	翻訳日:2021-05-19 14:15:08 公開日:2021-05-18
# スパースアクションタスクのためのsparsity prior regularized q-learning Sparsity Prior Regularized Q-learning for Sparse Action Tasks ( http://arxiv.org/abs/2105.08666v1 ) ライセンス: Link先を確認	Jing-Cheng Pang, Tian Xu, Sheng-Yi Jiang, Yu-Ren Liu, Yang Yu	(参考訳) 多くの意思決定タスクにおいて、特定のアクションは、銃術の「火」や株式取引の「買い」など、その頻度や総量によって制限される。我々はそのような行動を「スパースアクション」と呼ぶ。スパースアクションは、しばしば優れたパフォーマンスを達成する上で重要な役割を果たす。しかしながら、emph{classical bellman update} によって推定されるそれらのq値は、通常、標本のスパース性のため、大きな推定誤差を被る。 emph{greedy} のポリシーは、バイアス付き Q-函数によって大きく誤解される可能性があり、スパース作用を積極的に行い、大きな準最適をもたらす。本稿では,sparseアクションに低い確率を割り当てる参照分布を構築し,その参照分布に明示的な制約を持つ正規化対象を提案する。さらに、正規化ベルマン演算子と正規化最適ポリシーを導出し、エラーの伝播を遅くし、エージェントがよりスパースアクションを取るよう誘導する。実験の結果,本手法は,典型的なスパース動作タスクにおける最先端性能を実現する。 In many decision-making tasks, some specific actions are limited in their frequency or total amounts, such as "fire" in the gunfight game and "buy/sell" in the stock trading. We name such actions as "sparse action". Sparse action often plays a crucial role in achieving good performance. However, their Q-values, estimated by \emph{classical Bellman update}, usually suffer from a large estimation error due to the sparsity of their samples. The \emph{greedy} policy could be greatly misled by the biased Q-function and takes sparse action aggressively, which leads to a huge sub-optimality. This paper constructs a reference distribution that assigns a low probability to sparse action and proposes a regularized objective with an explicit constraint to the reference distribution. Furthermore, we derive a regularized Bellman operator and a regularized optimal policy that can slow down the propagation of error and guide the agent to take sparse action more carefully. The experiment results demonstrate that our method achieves state-of-the-art performance on typical sparse action tasks.	翻訳日:2021-05-19 14:14:33 公開日:2021-05-18
# 逐次独立メカニズムの高速・低速学習 Fast and Slow Learning of Recurrent Independent Mechanisms ( http://arxiv.org/abs/2105.08710v1 ) ライセンス: Link先を確認	Kanika Madan, Rosemary Nan Ke, Anirudh Goyal, Bernhard Bernhard Sch\"olkopf, Yoshua Bengio	(参考訳) 知識を交換可能な部品に分解することは、分布の変化がある場合に一般化の利点を約束する。環境と相互作用する学習エージェントは、既存の知識の新たな組み合わせを必要とする状況に直面しやすい。このような知識の分解は、分布外変化を体系的に一般化できる上で特に重要であると仮定する。そこで本研究では,エージェントが必要とする知識の一部と報酬関数が定常的であり,タスク間で再利用可能な,特定のトレーニングフレームワークを提案する。注意機構は、どのモジュールを現在のタスクに適応できるかを動的に選択し、選択したモジュールのパラメータは、学習者が経験する変化に直面すると迅速に変更でき、一方で注意機構のパラメータは安定してゆっくりと変化するメタパラメータとして動作する。我々は,注意のボトルネックを通じて相互に疎通するモジュール群が捉えた知識の断片に着目した。画像レベルの入力を伴う部分的に観測されたグリッドの世界におけるナビゲーションを含む強化学習装置において,提案方式のモジュール的側面をメタラーニングすることで,より高速な適応を実現することができる。また,パラメータとメタパラメータの役割を逆転させることは,動的に選択されたモジュールを高速に適応するための特別な役割を示唆する。 Decomposing knowledge into interchangeable pieces promises a generalization advantage when there are changes in distribution. A learning agent interacting with its environment is likely to be faced with situations requiring novel combinations of existing pieces of knowledge. We hypothesize that such a decomposition of knowledge is particularly relevant for being able to generalize in a systematic manner to out-of-distribution changes. To study these ideas, we propose a particular training framework in which we assume that the pieces of knowledge an agent needs and its reward function are stationary and can be re-used across tasks. An attention mechanism dynamically selects which modules can be adapted to the current task, and the parameters of the selected modules are allowed to change quickly as the learner is confronted with variations in what it experiences, while the parameters of the attention mechanisms act as stable, slowly changing, meta-parameters. We focus on pieces of knowledge captured by an ensemble of modules sparsely communicating with each other via a bottleneck of attention. We find that meta-learning the modular aspects of the proposed system greatly helps in achieving faster adaptation in a reinforcement learning setup involving navigation in a partially observed grid world with image-level input. We also find that reversing the role of parameters and meta-parameters does not work nearly as well, suggesting a particular role for fast adaptation of the dynamically selected modules.	翻訳日:2021-05-19 14:14:13 公開日:2021-05-18
# 不均一文脈における分布ロバスト学習 Distributionally Robust Learning in Heterogeneous Contexts ( http://arxiv.org/abs/2105.08532v1 ) ライセンス: Link先を確認	Muhammad Osama, Dave Zachariah, Petre Stoica	(参考訳) 本研究では,異なる文脈で得られた学習データから学習する際の問題点について考察する。我々は,超過リスクに着目した分散ロバストな手法を開発し,従来の超保守的ミニマックスアプローチよりもパフォーマンスとロバスト性のトレードオフをより適切なものにする。提案手法は計算可能であり,統計的保証を提供する。実データと合成データの両方を用いてその性能を示す。 We consider the problem of learning from training data obtained in different contexts, where the test data is subject to distributional shifts. We develop a distributionally robust method that focuses on excess risks and achieves a more appropriate trade-off between performance and robustness than the conventional and overly conservative minimax approach. The proposed method is computationally feasible and provides statistical guarantees. We demonstrate its performance using both real and synthetic data.	翻訳日:2021-05-19 14:13:51 公開日:2021-05-18
# sparta:空間的注意と敵対的ロバストな活性化 Sparta: Spatially Attentive and Adversarially Robust Activation ( http://arxiv.org/abs/2105.08269v1 ) ライセンス: Link先を確認	Qing Guo, Felix Juefei-Xu, Changqing Zhou, Yang Liu, Song Wang	(参考訳) 敵対的トレーニング(AT)は、深層畳み込みニューラルネットワーク(CNN)の堅牢性を改善する最も効果的な方法の1つである。一般的なネットワークトレーニングと同じように、atの有効性は基本的なネットワークコンポーネントの設計に依存する。本稿では,AT における強靭性 CNN における基本 ReLU 活性化成分の役割について,詳細な研究を行う。 ReLUアクティベーションの空間的共有性および入力非依存性により、CNNは標準的あるいは逆的トレーニングによるホワイトボックス攻撃に対してより堅牢であることがわかった。この問題に対処するため、我々はReLUを新しいSpartaアクティベーション関数(Spatially Attentive and Adversarially Robust Activation)に拡張し、CNNがより高いロバスト性、すなわち、敵の事例におけるエラー率、そしてより高い精度、すなわちクリーンな例におけるエラー率、すなわち既存の最先端(SOTA)アクティベーション関数よりも高い精度を実現する。さらに, Sparta と SOTA 活性化関数の関係について検討し, 本手法の利点について考察した。包括的な実験により,提案手法が優れたクロスcnnおよびクロスデータセット転送性を示すことがわかった。前者の場合、1つのCNN(例えばResNet-18)に対して逆向きに訓練されたSparta関数を固定し、他のCNN(例えばResNet-34)をトレーニングするために直接使用することができる。後者では、あるデータセット(例えば、CIFAR-10)でトレーニングされたSparta関数を使用して、別のデータセット(例えば、SVHN)で敵対的に堅牢なCNNをトレーニングすることができる。どちらの場合も、SpartaはバニラReLUよりも堅牢性が高く、提案手法の柔軟性と汎用性を検証する。 Adversarial training (AT) is one of the most effective ways for improving the robustness of deep convolution neural networks (CNNs). Just like common network training, the effectiveness of AT relies on the design of basic network components. In this paper, we conduct an in-depth study on the role of the basic ReLU activation component in AT for robust CNNs. We find that the spatially-shared and input-independent properties of ReLU activation make CNNs less robust to white-box adversarial attacks with either standard or adversarial training. To address this problem, we extend ReLU to a novel Sparta activation function (Spatially attentive and Adversarially Robust Activation), which enables CNNs to achieve both higher robustness, i.e., lower error rate on adversarial examples, and higher accuracy, i.e., lower error rate on clean examples, than the existing state-of-the-art (SOTA) activation functions. We further study the relationship between Sparta and the SOTA activation functions, providing more insights about the advantages of our method. With comprehensive experiments, we also find that the proposed method exhibits superior cross-CNN and cross-dataset transferability. For the former, the adversarially trained Sparta function for one CNN (e.g., ResNet-18) can be fixed and directly used to train another adversarially robust CNN (e.g., ResNet-34). For the latter, the Sparta function trained on one dataset (e.g., CIFAR-10) can be employed to train adversarially robust CNNs on another dataset (e.g., SVHN). In both cases, Sparta leads to CNNs with higher robustness than the vanilla ReLU, verifying the flexibility and versatility of the proposed method.	翻訳日:2021-05-19 14:13:44 公開日:2021-05-18
# 理論誘導残差ネットワークによる経路学習 Learning to Route via Theory-Guided Residual Network ( http://arxiv.org/abs/2105.08279v1 ) ライセンス: Link先を確認	Chang Liu, Guanjie Zheng, Zhenhui Li	(参考訳) 交通量と関連する問題は、常に近代都市に対する懸念であった。深層学習と強化学習の助けを借りて、スマート交通信号制御システムやタクシー配車システムなど、これらの交通問題を解決するための様々な政策を提案してきた。実際の都市で直接適用すると実際のコストがかかるため、人々は通常、都市シミュレーターでこれらのポリシーを検証する。しかし, 都市シミュレータで検証されたこれらの政策は, シミュレータが現実と大きく異なる場合, 実際の都市で失敗する可能性がある。この問題に取り組むためには,実際の交通シミュレーションシステムを構築する必要がある。そこで本研究では,交通シミュレータにおいて最も重要な部分の一つである人間のルーティングモデルを学習することを提案する。この問題には2つの大きな課題がある。第一に、人間の経路決定は、共通時間と距離要素以外の複数の要因によって決定される。第2に,現行のルートデータは通常,プライバシとデバイス可用性の問題から,車両のごく一部をカバーする。これらの問題に対処するために、理論的部分は人間の経路決定の一般的な原則(例えば、最速経路)を強調し、残余部分は乾燥可能な条件設定(例えば、ローカル道路やハイウェイ)を捉えることができる理論誘導残差ネットワークモデルを提案する。理論部分は、訓練に必要なデータを必要としない従来の最短経路アルゴリズムから成り立っているため、残余のネットワークは限られたデータから人間のルーティングモデルを学習することができる。我々は複数の実世界のデータセットに対して広範囲に実験を行い、特に小さなデータを用いて、モデルの優れた性能を示す。さらに、ケーススタディを通じて、私たちのモデルが実際のルートを回復する上で優れている理由も示しています。 The heavy traffic and related issues have always been concerns for modern cities. With the help of deep learning and reinforcement learning, people have proposed various policies to solve these traffic-related problems, such as smart traffic signal control systems and taxi dispatching systems. People usually validate these policies in a city simulator, since directly applying them in the real city introduces real cost. However, these policies validated in the city simulator may fail in the real city if the simulator is significantly different from the real world. To tackle this problem, we need to build a real-like traffic simulation system. Therefore, in this paper, we propose to learn the human routing model, which is one of the most essential part in the traffic simulator. This problem has two major challenges. First, human routing decisions are determined by multiple factors, besides the common time and distance factor. Second, current historical routes data usually covers just a small portion of vehicles, due to privacy and device availability issues. To address these problems, we propose a theory-guided residual network model, where the theoretical part can emphasize the general principles for human routing decisions (e.g., fastest route), and the residual part can capture drivable condition preferences (e.g., local road or highway). Since the theoretical part is composed of traditional shortest path algorithms that do not need data to train, our residual network can learn human routing models from limited data. We have conducted extensive experiments on multiple real-world datasets to show the superior performance of our model, especially with small data. Besides, we have also illustrated why our model is better at recovering real routes through case studies.	翻訳日:2021-05-19 14:12:21 公開日:2021-05-18
# ゼロショットレコメンダシステム Zero-Shot Recommender Systems ( http://arxiv.org/abs/2105.08318v1 ) ライセンス: Link先を確認	Hao Ding, Yifei Ma, Anoop Deoras, Yuyang Wang, Hao Wang	(参考訳) 推薦システム(RS)の性能は、利用可能なトレーニングデータの量に大きく依存する。これはアーリーステージの製品にニワトリの問題を生じさせ、そのデータ量は彼らのRSの性能に依存する。一方、ゼロショット学習は、古いデータセットから全く新しいデータセットへのある程度の一般化を約束する。本稿では,RSにおけるゼロショット学習の可能性を検討する。我々は、ZESRecと呼ばれるアルゴリズムを開発し、古いデータセットでトレーニングし、重複するユーザも重複するアイテムも存在しない新しいデータセットに一般化する。カテゴリー的な項目インデックス、すなわち項目idとは異なり、zesrecは項目の自然言語記述(または記述埋め込み)を連続的なインデックスとして使用するため、自然に見えない項目に一般化する。ユーザの観点からは、zesrecはアイテムとのインタラクションを使用してユーザを表現するためにシーケンシャルなrsの最近の進歩をベースにしている。 2組の現実世界のRSデータセットを調査し、ZESRecがこのようなゼロショット設定でレコメンデーションをうまく実現できることを示し、データスカーススタートアップやアーリーステージ製品におけるチキンとエッグの問題を解決する新たな機会を開く。 Performance of recommender systems (RS) relies heavily on the amount of training data available. This poses a chicken-and-egg problem for early-stage products, whose amount of data, in turn, relies on the performance of their RS. On the other hand, zero-shot learning promises some degree of generalization from an old dataset to an entirely new dataset. In this paper, we explore the possibility of zero-shot learning in RS. We develop an algorithm, dubbed ZEro-Shot Recommenders (ZESRec), that is trained on an old dataset and generalize to a new one where there are neither overlapping users nor overlapping items, a setting that contrasts typical cross-domain RS that has either overlapping users or items. Different from categorical item indices, i.e., item ID, in previous methods, ZESRec uses items' natural-language descriptions (or description embeddings) as their continuous indices, and therefore naturally generalize to any unseen items. In terms of users, ZESRec builds upon recent advances on sequential RS to represent users using their interactions with items, thereby generalizing to unseen users as well. We study two pairs of real-world RS datasets and demonstrate that ZESRec can successfully enable recommendations in such a zero-shot setting, opening up new opportunities for resolving the chicken-and-egg problem for data-scarce startups or early-stage products.	翻訳日:2021-05-19 14:11:57 公開日:2021-05-18
# 経時的医療記録の類似性尺度としての多変量抽象型動的時間ワープ法の実装と評価 Implementation and Evaluation of a Multivariate Abstraction-Based, Interval-Based Dynamic Time-Warping Method as a Similarity Measure for Longitudinal Medical Records ( http://arxiv.org/abs/2105.08450v1 ) ライセンス: Link先を確認	Yuval Shahar and Matan Lion	(参考訳) A)インターバルベース表現(iRep): [1] 生のタイムスタンプデータをインターバルベース抽象化に抽象化する、[2] 比較周期スコピング、[3] 抽象インターバルを与えられた時間粒度に分割する、(B) インターバルベースマッチング(iMatch): 変更されたDTWを使用してパーティショニングされた抽象概念レコードにマッチする、インターバルベースの動的タイムワープ(iDTW)に拡張した。ドメイン知識を使って、医療記録の生データ(4つから5つの関連する概念のうち最大3つの概念)を2つのインターバルタイプに抽象化しました。 low, High) と Gradient の抽象化(例) Incrasing, Decrasing)。すべての一次元(状態または勾配)または多次元(状態と勾配)の抽象組み合わせを作成しました。課題: 自己骨髄移植または同種骨髄移植を161例, B型肝炎を125例, C型肝炎を125例, 来年のマイクロアルブミン尿症またはマクロアルブミン尿症を151例とした。 k-nearest-neighbors majority, k=1 to sqrt(n), n = set sizeを用いた。 23400(オンコロジー)、19,800(肝炎)、7,128(糖尿病)の10倍のクロスバリデーション実験を行った。測度:曲線の下の領域(auc)、最適なユーデン指数。 Paired t-tests compared result vectors for equivalent configurations than a test variable, to determine a significant mean accuracy difference (P<0.05。抽象化を用いた平均分類と予測は,生のタイムスタンプデータのみを使用するよりも有意に良好であった。各ドメインにおいて、少なくとも1つの抽象化の組み合わせは、生のデータを使用するよりも大幅にパフォーマンスが向上した。特徴数の増加、多次元抽象化の使用によりパフォーマンスが向上した。生のデータと異なり、最適性能はk=5で、抽象化を用いて達成されることが多い。 We extended dynamic time warping (DTW) into interval-based dynamic time warping (iDTW), including (A) interval-based representation (iRep): [1] abstracting raw, time-stamped data into interval-based abstractions, [2] comparison-period scoping, [3] partitioning abstract intervals into a given temporal granularity; (B) interval-based matching (iMatch): matching partitioned, abstract-concepts records, using a modified DTW. Using domain knowledge, we abstracted the raw data of medical records, for up to three concepts out of four or five relevant concepts, into two interval types: State abstractions (e.g. LOW, HIGH) and Gradient abstractions (e.g. INCREASING, DECREASING). We created all uni-dimensional (State or Gradient) or multi-dimensional (State and Gradient) abstraction combinations. Tasks: Classifying 161 oncology patients records as autologous or allogenic bone-marrow transplantation; classifying 125 hepatitis patients records as B or C hepatitis; predicting micro- or macro-albuminuria in the next year for 151 Type 2 diabetes patients. We used a k-Nearest-Neighbors majority, k=1 to SQRT(N), N = set size. 50,328 10-fold cross-validation experiments were performed: 23,400 (Oncology), 19,800 (Hepatitis), 7,128 (Diabetes). Measures: Area Under the Curve (AUC), optimal Youden's Index. Paired t-tests compared result vectors for equivalent configurations other than a tested variable, to determine a significant mean accuracy difference (P<0.05). Mean classification and prediction using abstractions was significantly better than using only raw time-stamped data. In each domain, at least one abstraction combination led to a significantly better performance than using raw data. Increasing feature number, and using multi-dimensional abstractions, enhanced performance. Unlike when using raw data, optimal performance was often reached with k=5, using abstractions.	翻訳日:2021-05-19 14:11:34 公開日:2021-05-18
# 適応型ABAC政策学習 : 強化学習アプローチ Adaptive ABAC Policy Learning: A Reinforcement Learning Approach ( http://arxiv.org/abs/2105.08587v1 ) ライセンス: Link先を確認	Leila Karimi, Mai Abdelhakim, James Joshi	(参考訳) コンピュータシステムの急速な進歩により、より効率的かつ効率的なアクセス制御(AC)アプローチへの需要が高まっている。近年、ABAC(Atribute Based Access Control)アプローチは、このような複雑なコンピューティング環境のACニーズを満たす上で有望であることが示されている。 abacモデルは、システム内のエンティティの属性と認可ポリシーに基づく要求者へのアクセスを許可するが、その汎用性と柔軟性はより高いコストを伴う。さらに、組織システムの複雑さの増大とリソースへの連合的なアクセスの必要性により、ACの執行と管理がより困難になる。本稿では,認証管理タスクを自動化するための適応型ABACポリシー学習手法を提案する。 abacポリシー学習を強化学習問題としてモデル化する。特に,承認エンジンがフィードバック制御ループを介してABACモデルを適応させるコンテキスト的盗聴システムを提案する。学習過程を高速化するために,属性値階層に基づく学習モデルと計画手法を初期化する4つの手法を提案する。実例として,ホームIoT環境のための適応型ABACポリシー学習モデルの開発に注力する。提案手法を実データおよび合成データに対して評価する。評価において、完全なデータセットとスパースデータセットの両方を考慮する。実験結果から,提案手法は,多くのシナリオにおける教師付き学習に基づくものと同等の性能を達成し,いくつかの状況でそれを上回る結果が得られた。 With rapid advances in computing systems, there is an increasing demand for more effective and efficient access control (AC) approaches. Recently, Attribute Based Access Control (ABAC) approaches have been shown to be promising in fulfilling the AC needs of such emerging complex computing environments. An ABAC model grants access to a requester based on attributes of entities in a system and an authorization policy; however, its generality and flexibility come with a higher cost. Further, increasing complexities of organizational systems and the need for federated accesses to their resources make the task of AC enforcement and management much more challenging. In this paper, we propose an adaptive ABAC policy learning approach to automate the authorization management task. We model ABAC policy learning as a reinforcement learning problem. In particular, we propose a contextual bandit system, in which an authorization engine adapts an ABAC model through a feedback control loop; it relies on interacting with users/administrators of the system to receive their feedback that assists the model in making authorization decisions. We propose four methods for initializing the learning model and a planning approach based on attribute value hierarchy to accelerate the learning process. We focus on developing an adaptive ABAC policy learning model for a home IoT environment as a running example. We evaluate our proposed approach over real and synthetic data. We consider both complete and sparse datasets in our evaluations. Our experimental results show that the proposed approach achieves performance that is comparable to ones based on supervised learning in many scenarios and even outperforms them in several situations.	翻訳日:2021-05-19 14:10:57 公開日:2021-05-18
# 分散マルチロボットサブモジュラー動作選択のためのグラフニューラルネットワーク Graph Neural Networks for Decentralized Multi-Robot Submodular Action Selection ( http://arxiv.org/abs/2105.08601v1 ) ライセンス: Link先を確認	Lifeng Zhou, Vishnu D. Sharma, Qingbiao Li, Amanda Prorok, Alejandro Ribeiro, Vijay Kumar	(参考訳) 本稿では,分散化サブモジュラー最大化のための学習ベースアプローチを開発する。ロボットが行動プリミティブなどのアクションを共同で選択し、ローカルコミュニケーションのみによるチームサブモジュラー目標を最大化するためのアプリケーションに注目します。このようなアプリケーションは、エリアカバレッジのためのマルチロボットモーションプランニング、環境探索、ターゲット追跡など、大規模なマルチロボット協調に不可欠である。しかし、現在の分散化部分モジュラー最大化アルゴリズムは、ロボット間通信の仮定を必要とするか、あるいはいくつかの準最適保証を失う。本研究では,分散通信を用いた大規模部分モジュラー最大化に向けた汎用学習アーキテクチャを提案する。特に、我々の学習アーキテクチャはグラフニューラルネットワーク(GNN)を利用して、ロボットの局所的な相互作用を捉え、ロボットの分散意思決定を学ぶ。専門的なソリューションを模倣して学習モデルを訓練し,局所的な観察とコミュニケーションのみを含む分散的な行動選択モデルを実装した。我々は,大規模ロボットネットワークを用いたアクティブターゲットカバレッジのシナリオにおいて,GNNベースの学習手法の性能を示す。シミュレーションの結果,我々のアプローチは,エキスパートアルゴリズムのカバレッジ性能にほぼ匹敵するものの,30体以上のロボットで複数の注文を高速に実行することがわかった。また,従来は認識されていなかったシナリオ,例えば,より大きな環境やロボットのネットワークなどにおいて,我々のアプローチの一般化能力を示す。 In this paper, we develop a learning-based approach for decentralized submodular maximization. We focus on applications where robots are required to jointly select actions, e.g., motion primitives, to maximize team submodular objectives with local communications only. Such applications are essential for large-scale multi-robot coordination such as multi-robot motion planning for area coverage, environment exploration, and target tracking. But the current decentralized submodular maximization algorithms either require assumptions on the inter-robot communication or lose some suboptimal guarantees. In this work, we propose a general-purpose learning architecture towards submodular maximization at scale, with decentralized communications. Particularly, our learning architecture leverages a graph neural network (GNN) to capture local interactions of the robots and learns decentralized decision-making for the robots. We train the learning model by imitating an expert solution and implement the resulting model for decentralized action selection involving local observations and communications only. We demonstrate the performance of our GNN-based learning approach in a scenario of active target coverage with large networks of robots. The simulation results show our approach nearly matches the coverage performance of the expert algorithm, and yet runs several orders faster with more than 30 robots. The results also exhibit our approach's generalization capability in previously unseen scenarios, e.g., larger environments and larger networks of robots.	翻訳日:2021-05-19 14:10:37 公開日:2021-05-18
# ASM2TV: 適応型セミスーパービジョンマルチタスクマルチビュー学習フレームワーク ASM2TV: An Adaptive Semi-Supervised Multi-Task Multi-View Learning Framework ( http://arxiv.org/abs/2105.08643v1 ) ライセンス: Link先を確認	Zekai Chen, Maiwang Shi, Xiao Zhang, Haochao Ying	(参考訳) IoTにおけるヒューマンアクティビティ認識(HAR)のような現実のシナリオの多くは、マルチタスクのマルチビュー学習問題として形式化することができる。各タスクは、複数のソースから収集された複数の共有機能ビューで構成される。最近のアプローチの共通点は、共通知識を明らかにするために、タスクをまたいだ各ビューに対して、最初のフェーズで典型的なハード/ソフトの共有戦略を個別に採用することである。一方、タスク間の複数のビューは、実用的な状況下で相互に関連している可能性がある。一方で、ラベル付きデータが少ない場合、教師付きメソッドは不十分かもしれない。これらの課題に対処するために,準教師付きマルチタスク多視点学習のための新しいフレームワーク ASM2TV を提案する。本稿では,任意のタスクに対して最も望ましい候補共有ブロックを適応的に選択する,学習可能なタスクビュー対応共有ポリシであるゲーティングコントロールポリシーを提案する。重要な点として,本提案手法は大量の未ラベルの断片化時系列をフル活用し,広範囲のアプリケーションに対応する汎用的なフレームワークである。さまざまな主題やソースから収集された2つの多様な実世界のHARベンチマークデータセットの実験は、我々のフレームワークが他の最先端技術よりも優れていることを示している。 Many real-world scenarios, such as human activity recognition (HAR) in IoT, can be formalized as a multi-task multi-view learning problem. Each specific task consists of multiple shared feature views collected from multiple sources, either homogeneous or heterogeneous. Common among recent approaches is to employ a typical hard/soft sharing strategy at the initial phase separately for each view across tasks to uncover common knowledge, underlying the assumption that all views are conditionally independent. On the one hand, multiple views across tasks possibly relate to each other under practical situations. On the other hand, supervised methods might be insufficient when labeled data is scarce. To tackle these challenges, we introduce a novel framework ASM2TV for semi-supervised multi-task multi-view learning. We present a new perspective named gating control policy, a learnable task-view-interacted sharing policy that adaptively selects the most desirable candidate shared block for any view across any task, which uncovers more fine-grained task-view-interacted relatedness and improves inference efficiency. Significantly, our proposed gathering consistency adaption procedure takes full advantage of large amounts of unlabeled fragmented time-series, making it a general framework that accommodates a wide range of applications. Experiments on two diverse real-world HAR benchmark datasets collected from various subjects and sources demonstrate our framework's superiority over other state-of-the-arts.	翻訳日:2021-05-19 14:10:16 公開日:2021-05-18
# DCAP:ユーザ応答予測のためのディープクロス注意製品ネットワーク DCAP: Deep Cross Attentional Product Network for User Response Prediction ( http://arxiv.org/abs/2105.08649v1 ) ライセンス: Link先を確認	Zekai Chen, Fangtian Zhong, Zhumin Chen, Xiao Zhang, Robert Pless, Xiuzhen Cheng	(参考訳) ユーザが広告をクリックしたりアイテムを購入したりといった特定のコンテキストで事前定義されたポジティブな応答を提供する確率を予測することを目的としたユーザ応答予測は、オンライン広告やレコメンデーションシステム、検索ランキングといった多くの産業アプリケーションにとって不可欠である。しかし、これらのタスクで収集されたデータの高次元性と超広さのため、クロス機能は必然的に高価である。ユーザ応答の予測に関する以前の研究では、機能ベクタを2次あるいは高次クロス機能を明示的に、あるいは暗黙的にモデル化するために、機能ベクターを拡張して機能インタラクションを利用した。しかし、これらの既存手法は、モデルアーキテクチャの制限のために十分なクロスフィーチャを学習しなかったり、同じ重みを持つ全ての高階特徴相互作用をモデル化することによって妨げられる。この研究は、新しいアーキテクチャであるDeep Cross Attentional Product Network (DCAP)を提案することで、このギャップを埋めることを目的としている。さらに、マルチヘッドアテンションメカニズムとProduct Neural Network(PNN)にインスパイアされた各ネットワーク層における異なるクロス機能の重要性を区別し、実践者がより詳細なユーザ行動分析を行うことを可能にする。さらに,提案モデルは容易に実装でき,並行して訓練できる。実世界の3つのデータセットに関する総合的な実験を行う。その結果,提案モデルDCAPは最先端モデルと比較して優れた予測性能が得られることが示された。 User response prediction, which aims to predict the probability that a user will provide a predefined positive response in a given context such as clicking on an ad or purchasing an item, is crucial to many industrial applications such as online advertising, recommender systems, and search ranking. However, due to the high dimensionality and super sparsity of the data collected in these tasks, handcrafting cross features is inevitably time expensive. Prior studies in predicting user response leveraged the feature interactions by enhancing feature vectors with products of features to model second-order or high-order cross features, either explicitly or implicitly. Nevertheless, these existing methods can be hindered by not learning sufficient cross features due to model architecture limitations or modeling all high-order feature interactions with equal weights. This work aims to fill this gap by proposing a novel architecture Deep Cross Attentional Product Network (DCAP), which keeps cross network's benefits in modeling high-order feature interactions explicitly at the vector-wise level. Beyond that, it can differentiate the importance of different cross features in each network layer inspired by the multi-head attention mechanism and Product Neural Network (PNN), allowing practitioners to perform a more in-depth analysis of user behaviors. Additionally, our proposed model can be easily implemented and train in parallel. We conduct comprehensive experiments on three real-world datasets. The results have robustly demonstrated that our proposed model DCAP achieves superior prediction performance compared with the state-of-the-art models.	翻訳日:2021-05-19 14:09:54 公開日:2021-05-18
# 破損測定による低ランク行列回復問題に対するシャープ制限等尺特性境界 Sharp Restricted Isometry Property Bounds for Low-rank Matrix Recovery Problems with Corrupted Measurements ( http://arxiv.org/abs/2105.08232v1 ) ライセンス: Link先を確認	Ziye Ma, Yingjie Bi, Javad Lavaei, Somayeh Sojoudi	(参考訳) 本稿では,雑音による線形測定による一般的な低ランク行列回復問題について検討する。本研究の目的は,局所探索手法の制限等尺性(RIP)の条件が,誤差の少ない基底真理を見つけることができるかを理解することである。非凸問題の景観を解析することにより、まず、RIP定数が1/2より小さいという仮定の下で、任意の局所最小化器と基底真理の間の最大距離に関する大域的保証を提案する。ノイズの強度が減少するにつれて、この距離がゼロに縮まることを示す。我々の新しい保証は、RIP定数の点で鋭く、既存の結果よりもはるかに強い。次に、任意の RIP 定数を持つ問題に対する局所的な保証を示し、任意の局所最小化器は基底的真理にかなり近いか、それから遠く離れていることを示す。これらの結果から,問題の雑音強度とRIP定数が,真の解に対する局所最小値の位置に与える影響が示された。 In this paper, we study a general low-rank matrix recovery problem with linear measurements corrupted by some noise. The objective is to understand under what conditions on the restricted isometry property (RIP) of the problem local search methods can find the ground truth with a small error. By analyzing the landscape of the non-convex problem, we first propose a global guarantee on the maximum distance between an arbitrary local minimizer and the ground truth under the assumption that the RIP constant is smaller than 1/2. We show that this distance shrinks to zero as the intensity of the noise reduces. Our new guarantee is sharp in terms of the RIP constant and is much stronger than the existing results. We then present a local guarantee for problems with an arbitrary RIP constant, which states that any local minimizer is either considerably close to the ground truth or far away from it. The developed results demonstrate how the noise intensity and the RIP constant of the problem affect the locations of the local minima relative to the true solution.	翻訳日:2021-05-19 14:09:16 公開日:2021-05-18
# oneshot differentially top-k selection Oneshot Differentially Private Top-k Selection ( http://arxiv.org/abs/2105.08233v1 ) ライセンス: Link先を確認	Gang Qiao, Weijie J. Su, Li Zhang	(参考訳) プライバシリークのないトップ$1の要素を効率的かつ正確に選択できることは、さまざまなデータ分析タスクの不可欠なコンポーネントであり、大きな注目を集めている。本稿では,上位k$問題に対する高速かつ低歪みかつ微分プライベートなプリミティブである「textit{oneshot mechanism}」を紹介する。文献の既存手法と比較すると,本アルゴリズムは数にLaplaceノイズを付加し,高額なノイズ数とその推定値を一括してリリースすることにより,有効性を保ちながら計算コストを大幅に削減する。このメカニズムのプライバシーの証明は、独立した理論的関心を持つ新しい結合技術に依存している。最後に,複数仮説検定とペア比較によるランク付けにワンショット機構を適用し,その差分プライベートな結果を得る。 Being able to efficiently and accurately select the top-$k$ elements without privacy leakage is an integral component of various data analysis tasks and has gained significant attention. In this paper, we introduce the \textit{oneshot mechanism}, a fast, low-distortion, and differentially private primitive for the top-$k$ problem. Compared with existing approaches in the literature, our algorithm adds Laplace noise to the counts and releases the top-$k$ noisy counts and their estimates in a oneshot fashion, thereby substantially reducing the computational cost while maintaining satisfying utility. Our proof of privacy for this mechanism relies on a novel coupling technique that is of independent theoretical interest. Finally, we apply the oneshot mechanism to multiple hypothesis testing and ranking from pairwise comparisons and thus obtain their differentially private counterparts.	翻訳日:2021-05-19 14:09:00 公開日:2021-05-18
# 局所感性ハッシュによる線形最小二乗値反復 Sublinear Least-Squares Value Iteration via Locality Sensitive Hashing ( http://arxiv.org/abs/2105.08285v1 ) ライセンス: Link先を確認	Anshumali Shrivastava, Zhao Song, Zhaozhuo Xu	(参考訳) 本稿では,動作数において実行時の複雑性を部分線形に有する,最初の証明可能な最小二乗値反復 (lsvi) アルゴリズムを提案する。本稿では,最大内部積探索問題として値反復値関数推定法を定式化し,局所性に敏感なハッシュ (LSH) [Indyk and Motwani STOC'98, Andoni and Razenshteyn STOC'15, Andoni, Laarhoven, Razenshteyn and Waingarten SODA'17] 型データ構造を提案する。さらに, 近似最大内積探索理論と強化学習の後悔分析との関係を明らかにした。我々は、近似係数を選択することで、我々のSublinear LSVIアルゴリズムが元のLSVIアルゴリズムと同じ後悔を保ちつつ、実行時の複雑さをアクションの数でサブリニアに減らすことを証明した。私たちの知る限りでは、これはlshと強化学習を組み合わせることで、証明可能な改善をもたらす最初の仕事です。データ構造と反復アルゴリズムを組み合わせた新しい手法が、コスト削減と最適化のさらなる研究の扉を開くことを願っている。 We present the first provable Least-Squares Value Iteration (LSVI) algorithms that have runtime complexity sublinear in the number of actions. We formulate the value function estimation procedure in value iteration as an approximate maximum inner product search problem and propose a locality sensitive hashing (LSH) [Indyk and Motwani STOC'98, Andoni and Razenshteyn STOC'15, Andoni, Laarhoven, Razenshteyn and Waingarten SODA'17] type data structure to solve this problem with sublinear time complexity. Moreover, we build the connections between the theory of approximate maximum inner product search and the regret analysis of reinforcement learning. We prove that, with our choice of approximation factor, our Sublinear LSVI algorithms maintain the same regret as the original LSVI algorithms while reducing the runtime complexity to sublinear in the number of actions. To the best of our knowledge, this is the first work that combines LSH with reinforcement learning resulting in provable improvements. We hope that our novel way of combining data-structures and iterative algorithm will open the door for further study into cost reduction in optimization.	翻訳日:2021-05-19 14:08:46 公開日:2021-05-18
# 密度に基づく原子表現の最適ラジアル基底 Optimal radial basis for density-based atomic representations ( http://arxiv.org/abs/2105.08717v1 ) ライセンス: Link先を確認	Alexander Goscinski, F\'elix Musil, Sergey Pozdnyakov, and Michele Ceriotti	(参考訳) 原子スケールでの物質の特性をターゲットにしたほぼ全ての機械学習アルゴリズムの入力は、デカルト原子座標のリストをより対称な表現に変換することを含む。これらの最も一般的な表現の多くは、原子密度の対称性相関の拡張と見なすことができ、主に基底の選択によって異なる。ここでは、データセットの構造的多様性を最も効率的に表現するために選択された適応的で最適な数値基底を構築する方法について論じる。トレーニングデータセットごとに、この最適なベースはユニークで、スプラインで近似することで、プリミティブベースに関して追加コストなしで計算することができる。この構成は、正確で計算効率の良い表現をもたらし、分子と凝縮相の両方の機械学習モデルを含む例を示す。 The input of almost every machine learning algorithm targeting the properties of matter at the atomic scale involves a transformation of the list of Cartesian atomic coordinates into a more symmetric representation. Many of these most popular representations can be seen as an expansion of the symmetrized correlations of the atom density, and differ mainly by the choice of basis. Here we discuss how to build an adaptive, optimal numerical basis that is chosen to represent most efficiently the structural diversity of the dataset at hand. For each training dataset, this optimal basis is unique, and can be computed at no additional cost with respect to the primitive basis by approximating it with splines. We demonstrate that this construction yields representations that are accurate and computationally efficient, presenting examples that involve both molecular and condensed-phase machine-learning models.	翻訳日:2021-05-19 14:08:21 公開日:2021-05-18
# UncertaintyFuseNet: uncertainty-aware Hierarchical Feature Fusion with Ensemble Monte Carlo Dropout for COVID-19 Detection UncertaintyFuseNet: Robust Uncertainty-aware Hierarchical Feature Fusion with Ensemble Monte Carlo Dropout for COVID-19 Detection ( http://arxiv.org/abs/2105.08590v1 ) ライセンス: Link先を確認	Moloud Abdar, Soorena Salari, Sina Qahremani, Hak-Keung Lam, Fakhri Karray, Sadiq Hussain, Abbas Khosravi, U. Rajendra Acharya, Saeid Nahavandi	(参考訳) 新型コロナウイルス(Coronavirus disease 2019)は1億5100万人以上に感染し、現在まで世界中で約317万人が死亡している。新型コロナウイルス(covid-19)の急速な拡大は、人間の生命と健康を脅かし続けている。そのため,CTとX線データセットを用いて,新型コロナウイルスと他の疾患を正確に区別できる機械学習と深層学習を基盤としたCADシステムの開発が不可欠であり,最優先事項である。 CT画像とX線画像のどちらを用いた以前の研究と異なり、実装に十分なサンプルが得られたデータ型を両方使用した。一方で、この広汎性ウイルスの極度の感受性のため、モデル不確実性は考慮されるべきであるが、ほとんどの研究はそれを見落としている。そこで我々は,不確実性モジュールであるEnsemble Monte Carlo (EMC)ドロップアウトからなる,$UncertaintyFuseNet$という新しい強力な融合モデルを提案する。以上の結果から,CTスキャンとX線データを用いたCOVID-19検出のための融合の有用性が示唆された。また、提案する$uncertaintyfusenet$モデルはノイズに対してかなり頑健で、未発見のデータでもうまく動作します。この研究のソースコードとモデルは、https://github.com/moloud 1987/uncertaintyfusenet-for-covid-19-classificationで入手できる。 The COVID-19 (Coronavirus disease 2019) has infected more than 151 million people and caused approximately 3.17 million deaths around the world up to the present. The rapid spread of COVID-19 is continuing to threaten human's life and health. Therefore, the development of computer-aided detection (CAD) systems based on machine and deep learning methods which are able to accurately differentiate COVID-19 from other diseases using chest computed tomography (CT) and X-Ray datasets is essential and of immediate priority. Different from most of the previous studies which used either one of CT or X-ray images, we employed both data types with sufficient samples in implementation. On the other hand, due to the extreme sensitivity of this pervasive virus, model uncertainty should be considered, while most previous studies have overlooked it. Therefore, we propose a novel powerful fusion model named $UncertaintyFuseNet$ that consists of an uncertainty module: Ensemble Monte Carlo (EMC) dropout. The obtained results prove the effectiveness of our proposed fusion for COVID-19 detection using CT scan and X-Ray datasets. Also, our proposed $UncertaintyFuseNet$ model is significantly robust to noise and performs well with the previously unseen data. The source codes and models of this study are available at: https://github.com/moloud1987/UncertaintyFuseNet-for-COVID-19-Classification.	翻訳日:2021-05-19 14:06:09 公開日:2021-05-18
# SAIL-VOS 3D:映像データからのオブジェクト検出と3Dメッシュ再構成のための合成データセットとベースライン SAIL-VOS 3D: A Synthetic Dataset and Baselines for Object Detection and 3D Mesh Reconstruction from Video Data ( http://arxiv.org/abs/2105.08612v1 ) ライセンス: Link先を確認	Yuan-Ting Hu, Jiahong Wang, Raymond A. Yeh, Alexander G. Schwing	(参考訳) 映像データからオブジェクトの詳細な3D情報を抽出することは、全体像理解の重要な目標である。最近の手法では、単一の画像からオブジェクトのメッシュを再構築する場合に印象的な結果が得られたが、オブジェクトの一部が観測できないため、結果が曖昧なままであることが多い。さらに、メッシュ再構築のための既存の画像ベースのデータセットは、時間情報を統合するモデルの研究を許可しません。 SAIL-VOS 3D:SAIL-VOSを拡張したフレーム単位のメッシュアノテーションを備えた合成ビデオデータセット。また,時間モデルによる映像データから3次元メッシュを再構成するための最初のベースラインを開発した。提案するベースラインがSAIL-VOS 3DとPix3Dに対して有効であることを示し,時間的情報により復元精度が向上することを示した。リソースと追加情報はhttp://sailvos.web.illinois.eduで入手できる。 Extracting detailed 3D information of objects from video data is an important goal for holistic scene understanding. While recent methods have shown impressive results when reconstructing meshes of objects from a single image, results often remain ambiguous as part of the object is unobserved. Moreover, existing image-based datasets for mesh reconstruction don't permit to study models which integrate temporal information. To alleviate both concerns we present SAIL-VOS 3D: a synthetic video dataset with frame-by-frame mesh annotations which extends SAIL-VOS. We also develop first baselines for reconstruction of 3D meshes from video data via temporal models. We demonstrate efficacy of the proposed baseline on SAIL-VOS 3D and Pix3D, showing that temporal information improves reconstruction quality. Resources and additional information are available at http://sailvos.web.illinois.edu.	翻訳日:2021-05-19 14:05:46 公開日:2021-05-18
# Image Cropping on Twitter: Fairness Metrics, their limitation, and the importance of Representation, Design, and Agency Image Cropping on Twitter: Fairness Metrics, their Limitations, and the Importance of Representation, Design, and Agency ( http://arxiv.org/abs/2105.08667v1 ) ライセンス: Link先を確認	Kyra Yee, Uthaipon Tantipongpipat, Shubhanshu Mishra	(参考訳) twitterは機械学習を使って画像を収穫する。 2020年秋、twitterのユーザーは、自動トリッピングシステムが浅黒い肌の個人よりも明るい肌を好むことを懸念し、またこのシステムが頭の代わりに女性の身体をトリッピングすることを好んでいることを懸念した。これらの懸念に対処するために,形式化されたグループフェアネスメトリクスを用いて広範な分析を行う。作付けにおける系統的な相違は,最も顕著な点に基づく作付けが,その相違を増幅するという事実を含む寄与要因を同定する。しかし, 自動収穫における表現的害のリスクを捉えるには, 形式化された公正度指標と定量分析が不十分であることを示す。ユーザエージェンシーをよりよく保存するソリューションとして,サリエンシに基づく収穫の除去を提案する。表現的危害に関する懸念に十分対処できる新しいソリューションを開発するために、我々の批判は、人間中心の設計を含む量的および質的手法の組み合わせを動機付ける。 Twitter uses machine learning to crop images, where crops are centered around the part predicted to be the most salient. In fall 2020, Twitter users raised concerns that the automated image cropping system on Twitter favored light-skinned over dark-skinned individuals, as well as concerns that the system favored cropping woman's bodies instead of their heads. In order to address these concerns, we conduct an extensive analysis using formalized group fairness metrics. We find systematic disparities in cropping and identify contributing factors, including the fact that the cropping based on the single most salient point can amplify the disparities. However, we demonstrate that formalized fairness metrics and quantitative analysis on their own are insufficient for capturing the risk of representational harm in automatic cropping. We suggest the removal of saliency-based cropping in favor of a solution that better preserves user agency. For developing a new solution that sufficiently address concerns related to representational harm, our critique motivates a combination of quantitative and qualitative methods that include human-centered design.	翻訳日:2021-05-19 14:05:00 公開日:2021-05-18
# 勾配で戦うグラデーション:敵の攻撃に対する動的防御 Fighting Gradients with Gradients: Dynamic Defenses against Adversarial Attacks ( http://arxiv.org/abs/2105.08714v1 ) ライセンス: Link先を確認	Dequan Wang, An Ju, Evan Shelhamer, David Wagner, Trevor Darrell	(参考訳) 敵の攻撃は防御を破るためにモデルに最適化する。既存の防御は静的であり、攻撃が変化しても一度トレーニングされたままである。モデルは反撃し、テスト時に攻撃に対して防御を最適化すべきである。防御エントロピー最小化(dent)により,テスト中にモデルと入力に適応する動的防御を提案する。 dentは既存のモデルとの互換性と列車時の防御のために、トレーニングではなくテストを変更する。 Dentは、CIFAR-10/100およびImageNetに対する、敵に訓練された防御と名指しで訓練されたモデルの堅牢性を改善する。特にdentは、cifar-10のオートアタックに対して、$\epsilon_\infty$ = 8/255で絶対的に20ポイント以上の防御を強化している。 Adversarial attacks optimize against models to defeat defenses. Existing defenses are static, and stay the same once trained, even while attacks change. We argue that models should fight back, and optimize their defenses against attacks at test time. We propose dynamic defenses, to adapt the model and input during testing, by defensive entropy minimization (dent). Dent alters testing, but not training, for compatibility with existing models and train-time defenses. Dent improves the robustness of adversarially-trained defenses and nominally-trained models against white-box, black-box, and adaptive attacks on CIFAR-10/100 and ImageNet. In particular, dent boosts state-of-the-art defenses by 20+ points absolute against AutoAttack on CIFAR-10 at $\epsilon_\infty$ = 8/255.	翻訳日:2021-05-19 14:04:39 公開日:2021-05-18
# データ次元にパラメータ化されたreluネットワークトレーニングの計算複雑性 The Computational Complexity of ReLU Network Training Parameterized by Data Dimensionality ( http://arxiv.org/abs/2105.08675v1 ) ライセンス: Link先を確認	Vincent Froese, Christoph Hertrich, Rolf Niedermeier	(参考訳) 線形整列ユニット(ReLU)を用いた単純なニューラルネットワークの学習における計算複雑性の理解が近年,集中的な研究の対象となっている。そこで本論文では,2層reluネットワークの各種損失関数に対するパラメータ化複雑性に関するいくつかの結果について述べる。他のパラメータに関する簡単な議論の後、トレーニングデータの次元$d$が計算複雑性に与える影響を分析することに重点を置いている。パラメータ $d$ に対する W[1]-hardness という観点でランニングタイムの下界を提供し、既知のブルートフォース戦略が本質的に最適であることを証明する(指数時間仮説を仮定する)。これまでの研究と比較すると、結果は幅広い損失関数に対して、すべての$p\in[0,\infty]$に対して$\ell^p$-lossを含む。特に、定数$d$と凸損失関数の既知の多項式時間アルゴリズムを、より一般的な損失関数のクラスに拡張し、これらの場合もランニングタイムの下限と一致する。 Understanding the computational complexity of training simple neural networks with rectified linear units (ReLUs) has recently been a subject of intensive research. Closing gaps and complementing results from the literature, we present several results on the parameterized complexity of training two-layer ReLU networks with respect to various loss functions. After a brief discussion of other parameters, we focus on analyzing the influence of the dimension $d$ of the training data on the computational complexity. We provide running time lower bounds in terms of W[1]-hardness for parameter $d$ and prove that known brute-force strategies are essentially optimal (assuming the Exponential Time Hypothesis). In comparison with previous work, our results hold for a broad(er) range of loss functions, including $\ell^p$-loss for all $p\in[0,\infty]$. In particular, we extend a known polynomial-time algorithm for constant $d$ and convex loss functions to a more general class of loss functions, matching our running time lower bounds also in these cases.	翻訳日:2021-05-19 14:04:26 公開日:2021-05-18
# ハイパーグラフによる高次相互作用の非パラメトリックモデリング Nonparametric Modeling of Higher-Order Interactions via Hypergraphons ( http://arxiv.org/abs/2105.08678v1 ) ライセンス: Link先を確認	Krishnakumar Balasubramanian	(参考訳) 大規模ハイパーグラフの限界であるハイパーグラフを用いた高次相互作用のモデル化における統計的およびアルゴリズム的側面について検討する。ハイパーグラフはモデリングの観点からは非常に強力であるが、実際に効率的に推定できる制限された単純なリプシッツハイパーグラフ(SLH)のクラスを考える。また、SLHのクラスに最適である推定器の収束率も提供する。理論を裏付けるシミュレーション結果が提供される。 We study statistical and algorithmic aspects of using hypergraphons, that are limits of large hypergraphs, for modeling higher-order interactions. Although hypergraphons are extremely powerful from a modeling perspective, we consider a restricted class of Simple Lipschitz Hypergraphons (SLH), that are amenable to practically efficient estimation. We also provide rates of convergence for our estimator that are optimal for the class of SLH. Simulation results are provided to corroborate the theory.	翻訳日:2021-05-19 14:04:08 公開日:2021-05-18
# LEWIS: 教師なしテキストスタイル転送のためのLevenshtein編集 LEWIS: Levenshtein Editing for Unsupervised Text Style Transfer ( http://arxiv.org/abs/2105.08206v1 ) ライセンス: Link先を確認	Machel Reid and Victor Zhong	(参考訳) 多くのタイプのテキストスタイル転送は、小さな正確な編集(例えば)だけで実現できる。気持ちの移り変わりはひどい時間だったのに... 素晴らしい時間を過ごしたのに) 本稿では,Levenshtein編集操作を用いてテキストを変換するスタイル転送のための粗大なエディタを提案する。挿入、置換、削除)。従来の単一スパン編集法とは異なり,本手法はソーステキスト中の複数のスパンを同時に編集する。並列スタイルのテキストペア(例)なしでトレーニングするペアの +/sentiment 文) では、教師なしのデータ合成手順を提案する。まず、スタイル分類子に注意を向けて、テキストをスタイル非依存のテンプレートに変換する。そしてテンプレートのスロットを微調整された事前学習された言語モデルで埋めます。提案手法は感情(yelp, amazon)と礼儀正しい(polite)トランスファー(polite)において,既存の生成および編集スタイルトランスファー手法を上回っている。特にマルチスパン編集はシングルスパン編集よりも高い性能と多様な出力を実現する。さらに,教師なしデータ合成における従来の手法と比較して,高品質な並列スタイルペアが得られ,モデル性能が向上する。 Many types of text style transfer can be achieved with only small, precise edits (e.g. sentiment transfer from I had a terrible time... to I had a great time...). We propose a coarse-to-fine editor for style transfer that transforms text using Levenshtein edit operations (e.g. insert, replace, delete). Unlike prior single-span edit methods, our method concurrently edits multiple spans in the source text. To train without parallel style text pairs (e.g. pairs of +/- sentiment statements), we propose an unsupervised data synthesis procedure. We first convert text to style-agnostic templates using style classifier attention (e.g. I had a SLOT time...), then fill in slots in these templates using fine-tuned pretrained language models. Our method outperforms existing generation and editing style transfer methods on sentiment (Yelp, Amazon) and politeness (Polite) transfer. In particular, multi-span editing achieves higher performance and more diverse output than single-span editing. Moreover, compared to previous methods on unsupervised data synthesis, our method results in higher quality parallel style pairs and improves model performance.	翻訳日:2021-05-19 14:03:42 公開日:2021-05-18
# BookSum: 長文ナラティブ要約のためのデータセットのコレクション BookSum: A Collection of Datasets for Long-form Narrative Summarization ( http://arxiv.org/abs/2105.08209v1 ) ライセンス: Link先を確認	Wojciech Kry\'sci\'nski, Nazneen Rajani, Divyansh Agarwal, Caiming Xiong, Dragomir Radev	(参考訳) 利用可能なテキスト要約データセットの大部分は、長期因果関係や時間依存がなく、強いレイアウトやスタイルバイアスを含む短い形式のソースドキュメントを含んでいる。関連性はあるものの、このようなデータセットは将来のテキスト要約システムに限定的な課題をもたらすだろう。長文要約のためのデータセットの集合であるBookSumを導入することで,これらの問題に対処する。私たちのデータセットは、小説、戯曲、物語などの文学領域のソースドキュメントをカバーしており、難易度の増加の3つのレベル(段落、章、書籍レベル)において、高度に抽象的な人間による要約を含んでいます。データセットのドメインと構造は、非常に長いドキュメントの処理、非自明な因果関係と時間的依存関係、リッチな談話構造など、要約システムに固有の課題をもたらします。今後の作業を容易にするため、データセットのベースラインとして、複数の抽出および抽象的な要約モデルを訓練し、評価した。 The majority of available text summarization datasets include short-form source documents that lack long-range causal and temporal dependencies, and often contain strong layout and stylistic biases. While relevant, such datasets will offer limited challenges for future generations of text summarization systems. We address these issues by introducing BookSum, a collection of datasets for long-form narrative summarization. Our dataset covers source documents from the literature domain, such as novels, plays and stories, and includes highly abstractive, human written summaries on three levels of granularity of increasing difficulty: paragraph-, chapter-, and book-level. The domain and structure of our dataset poses a unique set of challenges for summarization systems, which include: processing very long documents, non-trivial causal and temporal dependencies, and rich discourse structures. To facilitate future work, we trained and evaluated multiple extractive and abstractive summarization models as baselines for our dataset.	翻訳日:2021-05-19 14:03:23 公開日:2021-05-18
# 感情誘発機:デュアルジェネレータに基づく会話生成を誘発する感情 Emotion Eliciting Machine: Emotion Eliciting Conversation Generation based on Dual Generator ( http://arxiv.org/abs/2105.08251v1 ) ライセンス: Link先を確認	Hao Jiang, Yutao Zhu, Xinyu Zhang, Zhicheng Dou, Pan Du, Te Pi, Yantao Jia	(参考訳) 近年、感情的なチャットボットの構築に大きな進歩が見られた。チャットボットが与えられた感情で応答を生成するための素晴らしい方法が提案されている。しかし,会話中のユーザの感情変化は十分に検討されていない。本研究では,人間と機械の会話において,ユーザのポジティブな感情を誘発する応答を生成することを目的としたポジティブ感情誘発問題について検討する。この問題に対処するために,弱い教師付き感情除去機械(EEM)を提案する。具体的には,まず,事前学習した感情分類器に基づいて,ユーザの感情状態変化の弱いラベルを変換で収集する。次に,会話におけるユーザの感情状態の変化に基づいて,正と負の両方の応答生成をモデル化する二重エンコーダデコーダ構造を提案する。二重構造の上に感情誘発因子を導入し、感情誘発時の反応に対する肯定的および否定的な感情的影響のバランスをとる。この要因はまた、感情誘発のきめ細かい制御方法を提供する。大規模な実世界のデータセットによる実験結果から、EEMは肯定的な感情誘発反応の生成において既存のモデルよりも優れていた。 Recent years have witnessed great progress on building emotional chatbots. Tremendous methods have been proposed for chatbots to generate responses with given emotions. However, the emotion changes of the user during the conversation has not been fully explored. In this work, we study the problem of positive emotion elicitation, which aims to generate responses that can elicit positive emotion of the user, in human-machine conversation. We propose a weakly supervised Emotion Eliciting Machine (EEM) to address this problem. Specifically, we first collect weak labels of user emotion status changes in a conversion based on a pre-trained emotion classifier. Then we propose a dual encoder-decoder structure to model the generation of responses in both positive and negative side based on the changes of the user's emotion status in the conversation. An emotion eliciting factor is introduced on top of the dual structure to balance the positive and negative emotional impacts on the generated response during emotion elicitation. The factor also provides a fine-grained controlling manner for emotion elicitation. Experimental results on a large real-world dataset show that EEM outperforms the existing models in generating responses with positive emotion elicitation.	翻訳日:2021-05-19 14:03:04 公開日:2021-05-18
# KECRS:知識に富んだ会話レコメンデーションシステムを目指して KECRS: Towards Knowledge-Enriched Conversational Recommendation System ( http://arxiv.org/abs/2105.08261v1 ) ライセンス: Link先を確認	Tong Zhang, Yong Liu, Peixiang Zhong, Chen Zhang, Hao Wang, Chunyan Miao	(参考訳) チャットベースの会話レコメンデーションシステム(CRS)は、自然言語による対話を通じて、ユーザにアイテムレコメンデーションを提供する。ユーザの意図をよりよく理解するために、外部知識グラフ(KG)がチャットベースのCRSに導入されている。しかし、既存のチップチャットベースのCRSは、通常反復的なアイテムレコメンデーションを生成し、KGからの知識をCRSに適切に注入して情報的応答を生成することはできない。これらの問題を解決するため、まず、推奨項目を新規かつ潜在的に興味のあるものにするために、会話レコメンデーションタスクを再構成する。そこで我々はKECRS(Knowledge-Enriched Conversational Recommendation System)を提案する。特に,バグ・オブ・エンティティ(boe)損失と輸液損失を発達させ,より多様で有益な応答を生成するために,kg と crs との統合性が向上した。 BOE損失は、CRSに人書きの発話とKGから学ぶための追加の監視信号を提供する。注入損失は、この2つの埋め込みにおける同じ単語の距離を最小化することにより、単語埋め込みとエンティティ埋め込みの間のギャップを埋める。さらに、高品質なKG, \ie The Movie Domain Knowledge Graph (TMDKG)を構築することで、研究の促進を図る。大規模データセットによる実験結果から,KECRSは推奨精度と応答生成品質の両方の観点から,最先端のチャットベースのCRSよりも優れていた。 The chit-chat-based conversational recommendation systems (CRS) provide item recommendations to users through natural language interactions. To better understand user's intentions, external knowledge graphs (KG) have been introduced into chit-chat-based CRS. However, existing chit-chat-based CRS usually generate repetitive item recommendations, and they cannot properly infuse knowledge from KG into CRS to generate informative responses. To remedy these issues, we first reformulate the conversational recommendation task to highlight that the recommended items should be new and possibly interested by users. Then, we propose the Knowledge-Enriched Conversational Recommendation System (KECRS). Specifically, we develop the Bag-of-Entity (BOE) loss and the infusion loss to better integrate KG with CRS for generating more diverse and informative responses. BOE loss provides an additional supervision signal to guide CRS to learn from both human-written utterances and KG. Infusion loss bridges the gap between the word embeddings and entity embeddings by minimizing distances of the same words in these two embeddings. Moreover, we facilitate our study by constructing a high-quality KG, \ie The Movie Domain Knowledge Graph (TMDKG). Experimental results on a large-scale dataset demonstrate that KECRS outperforms state-of-the-art chit-chat-based CRS, in terms of both recommendation accuracy and response generation quality.	翻訳日:2021-05-19 14:02:50 公開日:2021-05-18
# CoMAE:共感応答生成のための多要素階層フレームワーク CoMAE: A Multi-factor Hierarchical Framework for Empathetic Response Generation ( http://arxiv.org/abs/2105.08316v1 ) ライセンス: Link先を確認	Chujie Zheng, Yong Liu, Wei Chen, Yongcai Leng and Minlie Huang	(参考訳) オープンドメインダイアログシステムの成功には共感の能力が不可欠である。多次元性の性質から,コミュニケーション機構や対話行動,感情など,共感表現に関連するさまざまな要因が存在する。しかしながら、既存の共感的応答生成法は、通常、1つの共感因子のみを考慮するか、異なる要因間の階層的関係を無視し、共感モデリングの弱い能力をもたらす。本稿では,共感表現の3つの重要な要素を階層的にモデル化した,共感応答生成のための多要素階層型フレームワークCoMAEを提案する。実験により,我々のCoMAEモデルが従来の方法よりも共感的な反応を生成できることが示された。また,実生活コーパスにおける経験的分析と広範な実験を通して,異なる要因の階層的モデリングの重要性を強調する。私たちのコードと使用済みデータはhttps://github.com/chujiezheng/comae.comから入手できます。 The capacity of empathy is crucial to the success of open-domain dialog systems. Due to its nature of multi-dimensionality, there are various factors that relate to empathy expression, such as communication mechanism, dialog act and emotion. However, existing methods for empathetic response generation usually either consider only one empathy factor or ignore the hierarchical relationships between different factors, leading to a weak ability of empathy modeling. In this paper, we propose a multi-factor hierarchical framework, CoMAE, for empathetic response generation, which models the above three key factors of empathy expression in a hierarchical way. We show experimentally that our CoMAE-based model can generate more empathetic responses than previous methods. We also highlight the importance of hierarchical modeling of different factors through both the empirical analysis on a real-life corpus and the extensive experiments. Our codes and used data are available at https://github.com/chujiezheng/CoMAE.	翻訳日:2021-05-19 14:02:23 公開日:2021-05-18
# エンティティ型制約付き関係分類 Relation Classification with Entity Type Restriction ( http://arxiv.org/abs/2105.08393v1 ) ライセンス: Link先を確認	Shengfei Lyu, Huanhuan Chen	(参考訳) 関係分類は文中の2つの実体間の関係を予測することを目的としている。既存の方法は、すべての関係を文中の2つの実体の候補関係とみなす。これらの方法は、エンティティタイプによる候補関係の制限を無視し、いくつかの不適切な関係が候補関係となる。本稿では,関係性を制限するためにエンティティタイプを利用した関係分類法であるRelation Classification with ENtity Type Regulation (RECENT)を提案する。特に、関係型とエンティティ型の相互制約を形式化し、関係分類に導入する。さらに、提案するパラダイムであるRECENTはモデルに依存しない。それぞれ2つの代表モデルGCNとSpanBERTに基づいて、RECENT_GCNとRECENT_SpanBERTをトレーニングする。標準データセットの実験結果は、RECENTがGCNとSpanBERTのパフォーマンスをそれぞれ6.9ポイント、F1が4.4ポイント改善したことを示している。特にRECENT_SpanBERTはTACREDで新しい最先端を実現している。 Relation classification aims to predict a relation between two entities in a sentence. The existing methods regard all relations as the candidate relations for the two entities in a sentence. These methods neglect the restrictions on candidate relations by entity types, which leads to some inappropriate relations being candidate relations. In this paper, we propose a novel paradigm, RElation Classification with ENtity Type restriction (RECENT), which exploits entity types to restrict candidate relations. Specially, the mutual restrictions of relations and entity types are formalized and introduced into relation classification. Besides, the proposed paradigm, RECENT, is model-agnostic. Based on two representative models GCN and SpanBERT respectively, RECENT_GCN and RECENT_SpanBERT are trained in RECENT. Experimental results on a standard dataset indicate that RECENT improves the performance of GCN and SpanBERT by 6.9 and 4.4 F1 points, respectively. Especially, RECENT_SpanBERT achieves a new state-of-the-art on TACRED.	翻訳日:2021-05-19 14:02:08 公開日:2021-05-18
# 付加的な構成性を再考する: 単語埋め込みによるAND, OR, NOT操作 Revisiting Additive Compositionality: AND, OR and NOT Operations with Word Embeddings ( http://arxiv.org/abs/2105.08585v1 ) ライセンス: Link先を確認	Masahiro Naito, Sho Yokoi, Geewook Kim, Hidetoshi Shimodaira	(参考訳) word2vec や glove のような典型的な単語埋め込みメソッドは、その意味を埋め込み(付加的合成性)を付加することで構成できるという特性を持つことはよく知られている。加法構成性を説明するためにいくつかの理論が提案されているが、以下の疑問は未解決である: (q1) これらの理論の仮定は、実際的な単語埋め込みには当てはまらない。 (q2) 通常の加法構成性は、単語の意味の操作や操作と見なすことができるが、埋め込みによって他の演算がどのように計算されるかはよく分かっていない。我々は,周波数重み付けセンタリングの考え方によってこの問題に対処した。本稿では, (q1) に対する回答として, 実用的な単語埋め込みと付加合成性理論とのギャップを橋渡しする後処理法を提案する。また、(Q2)への応答として単語埋め込みの線形操作により、意味のORまたはNOTを取る方法を提供する。さらに,本手法の処理後処理(トップ100における3.5倍の精度向上)により,通常の加法的構成性であるAND操作の精度が向上し,ORおよびNOT操作が正しく行えることを実験的に確認した。 It is well-known that typical word embedding methods such as Word2Vec and GloVe have the property that the meaning can be composed by adding up the embeddings (additive compositionality). Several theories have been proposed to explain additive compositionality, but the following questions remain unanswered: (Q1) The assumptions of those theories do not hold for the practical word embedding. (Q2) Ordinary additive compositionality can be seen as an AND operation of word meanings, but it is not well understood how other operations, such as OR and NOT, can be computed by the embeddings. We address these issues by the idea of frequency-weighted centering at its core. This paper proposes a post-processing method for bridging the gap between practical word embedding and the assumption of theory about additive compositionality as an answer to (Q1). It also gives a method for taking OR or NOT of the meaning by linear operation of word embedding as an answer to (Q2). Moreover, we confirm experimentally that the accuracy of AND operation, i.e., the ordinary additive compositionality, can be improved by our post-processing method (3.5x improvement in top-100 accuracy) and that OR and NOT operations can be performed correctly.	翻訳日:2021-05-19 14:01:55 公開日:2021-05-18
# PoBRL:Blending Reinforcement Learning Policiesによる多文書要約の最適化 PoBRL: Optimizing Multi-Document Summarization by Blending Reinforcement Learning Policies ( http://arxiv.org/abs/2105.08244v1 ) ライセンス: Link先を確認	Andy Su, Difei Su, John M.Mulvey, H.Vincent Poor	(参考訳) 多文書要約を解くための新しい強化学習フレームワークPoBRLを提案する。 PoBRLは、高品質な要約に必要な3つの目的、すなわち重要性、妥当性、長さを共同で最適化する。我々の戦略は、この多目的最適化を、強化学習によって個別に解ける様々なサブプロブレムに分解する。 PoBRLを利用して、学習した各ポリシーをブレンドして、元の入力の簡潔で完全な表現である要約を生成する。実験結果から,複数のマルチドキュメントデータセットにおける最先端の性能を示す。また,本手法が高品質な出力を生成することを示す。 We propose a novel reinforcement learning based framework PoBRL for solving multi-document summarization. PoBRL jointly optimizes over the following three objectives necessary for a high-quality summary: importance, relevance, and length. Our strategy decouples this multi-objective optimization into different subproblems that can be solved individually by reinforcement learning. Utilizing PoBRL, we then blend each learned policies together to produce a summary that is a concise and complete representation of the original input. Our empirical analysis shows state-of-the-art performance on several multi-document datasets. Human evaluation also shows that our method produces high-quality output.	翻訳日:2021-05-19 14:01:35 公開日:2021-05-18
# E-Commerce Fresh Retailのマークダウン: 対実予測と多機能最適化アプローチ Markdowns in E-Commerce Fresh Retail: A Counterfactual Prediction and Multi-Period Optimization Approach ( http://arxiv.org/abs/2105.08313v1 ) ライセンス: Link先を確認	Junhao Hua, Ling Yan, Huan Xu, Cheng Yang	(参考訳) 本稿では,大量の観測トランザクションデータを活用することで,非現実的予測と多周期価格最適化からなる,マークダウンのための新しいデータ駆動型かつ解釈可能な価格設定手法を提案する。まず, 準パラメトリック構造モデルを構築し, 個々の価格弾性を学習し, 反事実需要を予測する。この半パラメトリックモデルは、非パラメトリック機械学習モデルの予測可能性と経済モデルの解釈可能性の両方を活用する。第2に,有限販売地平線上での消耗品全体の利益を最大化する多周期動的価格アルゴリズムを提案する。決定論的需要を用いる従来のアプローチとは異なり、予測プロセスに必然的にランダム性を持つため、反事実的需要の不確かさをモデル化する。確率モデルに基づいてマルコフ決定プロセスによる逐次価格戦略を導出し,それを解決するための2段階のアルゴリズムを設計する。提案アルゴリズムは非常に効率的である。指数関数から多項式への時間の複雑さを減少させる。実験の結果,我々の価格アルゴリズムの利点が示され,提案したフレームワークは有名なeコマースの新鮮小売シナリオであるFreshippoにうまく展開されている。 In this paper, by leveraging abundant observational transaction data, we propose a novel data-driven and interpretable pricing approach for markdowns, consisting of counterfactual prediction and multi-period price optimization. Firstly, we build a semi-parametric structural model to learn individual price elasticity and predict counterfactual demand. This semi-parametric model takes advantage of both the predictability of nonparametric machine learning model and the interpretability of economic model. Secondly, we propose a multi-period dynamic pricing algorithm to maximize the overall profit of a perishable product over its finite selling horizon. Different with the traditional approaches that use the deterministic demand, we model the uncertainty of counterfactual demand since it inevitably has randomness in the prediction process. Based on the stochastic model, we derive a sequential pricing strategy by Markov decision process, and design a two-stage algorithm to solve it. The proposed algorithm is very efficient. It reduces the time complexity from exponential to polynomial. Experimental results show the advantages of our pricing algorithm, and the proposed framework has been successfully deployed to the well-known e-commerce fresh retail scenario - Freshippo.	翻訳日:2021-05-19 14:01:28 公開日:2021-05-18
# SATによるハイブリッドシステムの再構成 Reconfiguring Hybrid Systems Using SAT ( http://arxiv.org/abs/2105.08398v1 ) ライセンス: Link先を確認	Kaja Balzereit and Oliver Niggemann	(参考訳) リコンフィグレーションは、システム目標に再び到達できるように、システム構成を自動的に適応することで、障害からシステムを取り戻すことを目的としている。古典的なアプローチは通常、対応するリカバリアクションを手動で定義する事前定義された障害セットを使用する。これは頻繁な変更によって特徴づけられる現代のハイブリッドシステムでは不可能である。代わりに、AIベースのアプローチは、非デフォルトシステムのモデルを活用し、有効な振る舞いを再び確立する再構成操作のセットを検索する必要がある。この研究は、3つの主要な課題を解決する新しいアルゴリズムを提示している。欠陥の振る舞いをモデル化する必要はありません (ii)主に連続系変数や制御信号の数が多いため、もともと大きすぎる探索空間を識別し、縮小する。 3) 命題論理にはSATソルバを用いており, 第一に妥当性という二項の概念を定義している。第二に、検索自体を実装する -- 任意のソリューションを素早く識別するために最適なソリューションを犠牲にする。この手法はプロセス工学シミュレーションシステム上で障害を再構成できることが示されている。 Reconfiguration aims at recovering a system from a fault by automatically adapting the system configuration, such that the system goal can be reached again. Classical approaches typically use a set of pre-defined faults for which corresponding recovery actions are defined manually. This is not possible for modern hybrid systems which are characterized by frequent changes. Instead, AI-based approaches are needed which leverage on a model of the non-faulty system and which search for a set of reconfiguration operations which will establish a valid behavior again. This work presents a novel algorithm which solves three main challenges: (i) Only a model of the non-faulty system is needed, i.e. the faulty behavior does not need to be modeled. (ii) It discretizes and reduces the search space which originally is too large -- mainly due to the high number of continuous system variables and control signals. (iii) It uses a SAT solver for propositional logic for two purposes: First, it defines the binary concept of validity. Second, it implements the search itself -- sacrificing the optimal solution for a quick identification of an arbitrary solution. It is shown that the approach is able to reconfigure faults on simulated process engineering systems.	翻訳日:2021-05-19 14:01:10 公開日:2021-05-18
# CFR-MIX: Combinatorial Action Spaceによる不完全な情報集約型ゲームの解決 CFR-MIX: Solving Imperfect Information Extensive-Form Games with Combinatorial Action Space ( http://arxiv.org/abs/2105.08440v1 ) ライセンス: Link先を確認	Shuxin Li, Youzhi Zhang, Xinrun Wang, Wanqi Xue, Bo An	(参考訳) 多くの現実世界のシナリオでは、エージェントのチームが互いに調整し、対戦相手と競う。このタイプのゲーム解決の課題は、チームの共同アクションスペースがエージェント数で指数関数的に増大し、既存のアルゴリズム、例えば、反事実後悔最小化(cfr)の非効率化につながることである。そこで本研究では,CFRの新しいフレームワークであるCFR-MIXを提案する。まず,各エージェントの個別戦略を用いた共同行動戦略と,エージェント間の協調を維持するための一貫性関係を示す新しい戦略表現を提案する。 cfrフレームワークの下で個々の戦略との均衡を計算するために,戦略間の一貫性関係を累積後悔値間の一貫性関係に変換する。さらに, 累積的後悔値に対する新しい分解法を提案し, 累積的後悔値間の整合性関係を保証する。最後に, 混合層を用いた新しいアルゴリズムCFR-MIXを導入し, 個別動作の累積後悔値の非線形結合として, 共同動作の累積後悔値を推定する。実験の結果,CFR-MIXは様々なゲームにおいて既存のアルゴリズムよりも優れていた。 In many real-world scenarios, a team of agents coordinate with each other to compete against an opponent. The challenge of solving this type of game is that the team's joint action space grows exponentially with the number of agents, which results in the inefficiency of the existing algorithms, e.g., Counterfactual Regret Minimization (CFR). To address this problem, we propose a new framework of CFR: CFR-MIX. Firstly, we propose a new strategy representation that represents a joint action strategy using individual strategies of all agents and a consistency relationship to maintain the cooperation between agents. To compute the equilibrium with individual strategies under the CFR framework, we transform the consistency relationship between strategies to the consistency relationship between the cumulative regret values. Furthermore, we propose a novel decomposition method over cumulative regret values to guarantee the consistency relationship between the cumulative regret values. Finally, we introduce our new algorithm CFR-MIX which employs a mixing layer to estimate cumulative regret values of joint actions as a non-linear combination of cumulative regret values of individual actions. Experimental results show that CFR-MIX outperforms existing algorithms on various games significantly.	翻訳日:2021-05-19 14:00:55 公開日:2021-05-18
# N-ary Relational Factsのリンク予測:グラフに基づくアプローチ Link Prediction on N-ary Relational Facts: A Graph-based Approach ( http://arxiv.org/abs/2105.08476v1 ) ライセンス: Link先を確認	Quan Wang, Haifeng Wang, Yajuan Lyu, Yong Zhu	(参考訳) 知識グラフ(KG)のリンク予測は重要な研究トピックである。それまでの研究は主に二項関係に焦点をあて、現実世界のKGではユビキタスだが、高次関係にはあまり注意を払わなかった。本稿では,n-項関係の事実に対するリンク予測を考察し,この課題に対するグラフベースアプローチを提案する。我々のアプローチの鍵は、事実の n-項構造を小さな不均一なグラフとして表現し、エッジバイアスの完全な接続された注意でこのグラフをモデル化することです。完全接続された注意は普遍的な頂点間相互作用を捉える一方、エッジアウェアの注意バイアスによりグラフ構造とその不均一性を特に符号化する。この方法では、我々のアプローチは、各n-ary事実におけるグローバルとローカルの依存関係を完全にモデル化します。広範な評価は、我々のアプローチの有効性と優位性を検証する。さまざまなn-aryリレーショナルベンチマークにおいて,現在の最先端よりも実質的に,一貫して優れたパフォーマンスを実現しています。私たちのコードは公開されています。 Link prediction on knowledge graphs (KGs) is a key research topic. Previous work mainly focused on binary relations, paying less attention to higher-arity relations although they are ubiquitous in real-world KGs. This paper considers link prediction upon n-ary relational facts and proposes a graph-based approach to this task. The key to our approach is to represent the n-ary structure of a fact as a small heterogeneous graph, and model this graph with edge-biased fully-connected attention. The fully-connected attention captures universal inter-vertex interactions, while with edge-aware attentive biases to particularly encode the graph structure and its heterogeneity. In this fashion, our approach fully models global and local dependencies in each n-ary fact, and hence can more effectively capture associations therein. Extensive evaluation verifies the effectiveness and superiority of our approach. It performs substantially and consistently better than current state-of-the-art across a variety of n-ary relational benchmarks. Our code is publicly available.	翻訳日:2021-05-19 14:00:39 公開日:2021-05-18
# DACBench:動的アルゴリズム構成のためのベンチマークライブラリ DACBench: A Benchmark Library for Dynamic Algorithm Configuration ( http://arxiv.org/abs/2105.08541v1 ) ライセンス: Link先を確認	Theresa Eimer, Andr\'e Biedenkapp, Maximilian Reimer, Steven Adriaensen, Frank Hutter, Marius Lindauer	(参考訳) Dynamic Algorithm Configuration (DAC)は、ターゲットアルゴリズムのハイパーパラメータを動的に制御してパフォーマンスを向上させることを目的としている。いくつかの理論的および実証的な結果は、進化計算、AI計画、ディープラーニングのような領域におけるハイパーパラメータを動的に制御する利点を示している。しかし、これらの結果の複製やDACの新しい手法の研究は、既存のベンチマークがしばしば同一インタフェースと互換性がないため困難である。ベンチマークの容易化とDACの研究を目的として,AIドメインから既存のDACベンチマークを収集,標準化するベンチマークライブラリであるDACBenchと,新たなベンチマーク用のテンプレートを提案する。 dacbenchの設計には, (i) 柔軟性, (ii) 再現性, (iii) 拡張性, (iv) 自動ドキュメンテーションと可視化といった重要なデシデラタに注目した。 DACの可能性,適用性,課題を示すために,6つの初期ベンチマークの集合が,いくつかの難易度でどのように比較されるかを検討する。 Dynamic Algorithm Configuration (DAC) aims to dynamically control a target algorithm's hyperparameters in order to improve its performance. Several theoretical and empirical results have demonstrated the benefits of dynamically controlling hyperparameters in domains like evolutionary computation, AI Planning or deep learning. Replicating these results, as well as studying new methods for DAC, however, is difficult since existing benchmarks are often specialized and incompatible with the same interfaces. To facilitate benchmarking and thus research on DAC, we propose DACBench, a benchmark library that seeks to collect and standardize existing DAC benchmarks from different AI domains, as well as provide a template for new ones. For the design of DACBench, we focused on important desiderata, such as (i) flexibility, (ii) reproducibility, (iii) extensibility and (iv) automatic documentation and visualization. To show the potential, broad applicability and challenges of DAC, we explore how a set of six initial benchmarks compare in several dimensions of difficulty.	翻訳日:2021-05-19 14:00:22 公開日:2021-05-18
# 野生の単一ビューの地球中心ポス Single View Geocentric Pose in the Wild ( http://arxiv.org/abs/2105.08229v1 ) ライセンス: Link先を確認	Gordon Christie, Kevin Foster, Shea Hagstrom, Gregory D. Hager, Myron Z. Brown	(参考訳) セマンティックマッピング、マップアライメント、変化検出などの地球観測タスクの現在の方法は、ほぼナディア画像に依存しているが、自然災害のような動的な世界イベントに対応する最初の画像は斜めであることが多い。これらの課題は、観測対象視差により斜め画像にとってはるかに困難である。近年、衛星画像に登録された空中ライダーによる訓練により、地上の高さと重力に対する方向を規定した地中心のポーズを復元することに成功した。本稿では,アフィン不変性を利用した新しい課題のモデルを提案する。また,本手法を現実のアプリケーションに適用する上で,現実的な課題にも対処する。私たちのデータとコードは公開されています。 Current methods for Earth observation tasks such as semantic mapping, map alignment, and change detection rely on near-nadir images; however, often the first available images in response to dynamic world events such as natural disasters are oblique. These tasks are much more difficult for oblique images due to observed object parallax. There has been recent success in learning to regress geocentric pose, defined as height above ground and orientation with respect to gravity, by training with airborne lidar registered to satellite images. We present a model for this novel task that exploits affine invariance properties to outperform state of the art performance by a wide margin. We also address practical issues required to deploy this method in the wild for real-world applications. Our data and code are publicly available.	翻訳日:2021-05-19 13:59:43 公開日:2021-05-18
# 教師なしスケッチに基づく画像検索に向けて Towards Unsupervised Sketch-based Image Retrieval ( http://arxiv.org/abs/2105.08237v1 ) ライセンス: Link先を確認	Conghui Hu, Yongxin Yang, Yunpeng Li, Timothy M. Hospedales, Yi-Zhe Song	(参考訳) 現在の教師付きスケッチベース画像検索(SBIR)は優れた性能を発揮する。しかし、データ収集とラベリングのコストは、実際のアプリケーションの実用的なデプロイに対する難解な障壁を伴います。本稿では,従来訓練に必要であったラベル付けコスト(カテゴリアノテーションとスケッチ写真ペアリング)を取り除くための教師なしsbirの最初の試みについて述べる。既存の単一ドメインの教師なし表現学習手法は、この問題のユニークなクロスドメイン性(スケッチとフォト)のため、このアプリケーションでは性能が悪い。そこで我々は,教師なし表現学習とスケッチ写真領域アライメントを同時に行う新しい枠組みを提案する。技術的には、これは関節分布最適輸送(JDOT)を利用して表現学習中に異なる領域からのデータを整列させ、トレーニング可能なクラスタプロトタイプと機能記憶バンクで拡張し、スケーラビリティと効率をさらに向上させます。広範な実験により,新しい教師なし設定では優れた性能を達成し,ゼロショット設定では最先端よりも優れた性能を示すことができた。 Current supervised sketch-based image retrieval (SBIR) methods achieve excellent performance. However, the cost of data collection and labeling imposes an intractable barrier to practical deployment of real applications. In this paper, we present the first attempt at unsupervised SBIR to remove the labeling cost (category annotations and sketch-photo pairings) that is conventionally needed for training. Existing single-domain unsupervised representation learning methods perform poorly in this application, due to the unique cross-domain (sketch and photo) nature of the problem. We therefore introduce a novel framework that simultaneously performs unsupervised representation learning and sketch-photo domain alignment. Technically this is underpinned by exploiting joint distribution optimal transport (JDOT) to align data from different domains during representation learning, which we extend with trainable cluster prototypes and feature memory banks to further improve scalability and efficacy. Extensive experiments show that our framework achieves excellent performance in the new unsupervised setting, and performs comparably or better than state-of-the-art in the zero-shot setting.	翻訳日:2021-05-19 13:59:31 公開日:2021-05-18
# セルフポイントフロー:最適移動とランダム歩行を伴うポイントクラウドからの自己教師付きシーンフロー推定 Self-Point-Flow: Self-Supervised Scene Flow Estimation from Point Clouds with Optimal Transport and Random Walk ( http://arxiv.org/abs/2105.08248v1 ) ライセンス: Link先を確認	Ruibo Li, Guosheng Lin, Lihua Xie	(参考訳) 注釈付きシーンフローデータの不足により,ポイントクラウドにおける自己教師ありシーンフロー学習が注目されている。自己監督的な方法では、2点雲間の対応性を確立することが効果的なアプローチである。従来の手法では、3次元点座標上の距離のみを考慮し、(1)色や表面の正常といった他の識別的指標を見落とし、(2)マッチングが制約のない状況で操作され、複数の点が同じ対応点に終止符を打つため、しばしばサブパー性能を生成する。この問題に対処するため、このマッチングタスクを最適な輸送問題として定式化する。出力最適割り当て行列を用いて擬似基底真理の生成を導くことができる。この最適輸送法では,複数の記述子を考慮した輸送コストを設計し,質量等式制約による1対1のマッチングを奨励する。また、各点にグラフを構築することにより、擬似ラベルの局所的一貫性を促進するランダムウォークモジュールを導入する。 FlyingThings3D と KITTI の総合的な実験により,本手法が自己教師付き学習手法の最先端性能を実現することを示す。我々の自己指導手法は、訓練に基礎的な真実の流れを必要としないが、教師付き学習手法と同等に機能する。 Due to the scarcity of annotated scene flow data, self-supervised scene flow learning in point clouds has attracted increasing attention. In the self-supervised manner, establishing correspondences between two point clouds to approximate scene flow is an effective approach. Previous methods often obtain correspondences by applying point-wise matching that only takes the distance on 3D point coordinates into account, introducing two critical issues: (1) it overlooks other discriminative measures, such as color and surface normal, which often bring fruitful clues for accurate matching; and (2) it often generates sub-par performance, as the matching is operated in an unconstrained situation, where multiple points can be ended up with the same corresponding point. To address the issues, we formulate this matching task as an optimal transport problem. The output optimal assignment matrix can be utilized to guide the generation of pseudo ground truth. In this optimal transport, we design the transport cost by considering multiple descriptors and encourage one-to-one matching by mass equality constraints. Also, constructing a graph on the points, a random walk module is introduced to encourage the local consistency of the pseudo labels. Comprehensive experiments on FlyingThings3D and KITTI show that our method achieves state-of-the-art performance among self-supervised learning methods. Our self-supervised method even performs on par with some supervised learning approaches, although we do not need any ground truth flow for training.	翻訳日:2021-05-19 13:59:12 公開日:2021-05-18
# 知識蒸留とクロスモーダルマッチングの併用による弱教師付き密集ビデオキャプション Weakly Supervised Dense Video Captioning via Jointly Usage of Knowledge Distillation and Cross-modal Matching ( http://arxiv.org/abs/2105.08252v1 ) ライセンス: Link先を確認	Bofeng Wu, Guocheng Niu, Jun Yu, Xinyan Xiao, Jian Zhang and Hua Wu	(参考訳) 本稿では,ペアワイズなイベントセンテンスアノテーションを使わずに動画キャプション(dvc)を行う手法を提案する。まず,関連する課題から抽出した知識を用いて,高品質なイベント提案を生成する。次に,提案文と文のセマンティックマッチングを構築するために,典型的にクロスモーダル検索タスクに適用されるコントラッシブ・ロスとサイクル・一貫性・ロスを取り入れ,最終的にキャプション生成モジュールのトレーニングに使用される。また、アノテート画像に基づく事前学習によりマッチングモジュールのパラメータを初期化し、マッチング性能を向上させる。 activitynet-captionデータセットに関する広範な実験は、蒸留に基づくイベント提案生成と、弱い教師付きdvcとのクロスモーダル検索に基づく意味マッチングの意義を明らかにし、この手法が既存の最先端手法に優れていることを示す。 This paper proposes an approach to Dense Video Captioning (DVC) without pairwise event-sentence annotation. First, we adopt the knowledge distilled from relevant and well solved tasks to generate high-quality event proposals. Then we incorporate contrastive loss and cycle-consistency loss typically applied to cross-modal retrieval tasks to build semantic matching between the proposals and sentences, which are eventually used to train the caption generation module. In addition, the parameters of matching module are initialized via pre-training based on annotated images to improve the matching performance. Extensive experiments on ActivityNet-Caption dataset reveal the significance of distillation-based event proposal generation and cross-modal retrieval-based semantic matching to weakly supervised DVC, and demonstrate the superiority of our method to existing state-of-the-art methods.	翻訳日:2021-05-19 13:58:47 公開日:2021-05-18
# Exemplar-based Open-Set Panoptic Segmentation Network Exemplar-Based Open-Set Panoptic Segmentation Network ( http://arxiv.org/abs/2105.08336v1 ) ライセンス: Link先を確認	Jaedong Hwang, Seoung Wug Oh, Joon-Young Lee, Bohyung Han	(参考訳) 我々は、panoptic segmentationをopen-worldに拡張し、open-set panoptic segmentation~(ops)タスクを導入する。このタスクは、既知のクラスだけでなく、トレーニング中に認識されていない未知のクラスに対しても、panopticのセグメンテーションを実行する必要がある。タスクの実践的課題を調査し,既存のデータセットであるCOCO上にベンチマークを構築する。さらに,実証理論にインスパイアされた,新しいオープン・セット・パノプティブ・セグメンテーション・ネットワーク~(EOPSN)を提案する。提案手法は,クラスタ化によって識別され,疑似グラウンドルーツとして使用されるexemplarsに基づく新しいクラスを識別する。各クラスのサイズは、クラスに関連する既存のクラスと類似性に基づいて、新しい例をマイニングすることによって増加する。提案するベンチマークでeopsnを評価し,提案の有効性を実証する。私たちの仕事の第一の目的は、オープンワールドのシナリオにおける認識にコミュニティの注意を引き付けることです。我々のアルゴリズムの実装は、プロジェクトのWebページで利用可能である。 We extend panoptic segmentation to the open-world and introduce an open-set panoptic segmentation~(OPS) task. This task requires performing panoptic segmentation for not only \known classes but also \unknown ones that have not been acknowledged during training. We investigate the practical challenges of the task and construct a benchmark on top of an existing dataset, COCO. In addition, we propose a novel exemplar-based open-set panoptic segmentation network~(EOPSN) inspired by exemplar theory. Our approach identifies a new class based on exemplars, which are identified by clustering and employed as pseudo-ground-truths. The size of each class increases by mining new exemplars based on the similarities to the existing ones associated with the class. We evaluate EOPSN on the proposed benchmark and demonstrate the effectiveness of our proposals. The primary goal of our work is to draw the attention of the community to the recognition in the open-world scenarios. The implementation of our algorithm is available on the project webpage: https://cv.snu.ac.kr/research/EOPSN.	翻訳日:2021-05-19 13:58:29 公開日:2021-05-18
# 小型非均質データセットからの手術ロボット動作の教師なし同定 Unsupervised identification of surgical robotic actions from small non homogeneous datasets ( http://arxiv.org/abs/2105.08488v1 ) ライセンス: Link先を確認	Daniele Meli, Paolo Fiorini	(参考訳) ロボット支援手術は確立された臨床実践である。研修生のパフォーマンス評価や、自律的な実行とモニタリングのための手術プロセスモデリングなど、さまざまな用途において外科的アクションの自動識別が必要である。しかし,手術が複雑で長い場合,手作業で記録に注釈を付ける重荷がかかるため,指導された行動同定は不可能である。さらに、手術手順の実施例が記録されることも少なくない。本稿では,da vinci research kitで実施した標準手術訓練課題であるリングトランスファーにおいて,手術動作の教師なし識別のための新しいアルゴリズムを提案する。非常に限られた実行データセットから自動的にキネマティックおよびセマンティックな視覚的特徴を抽出することにより、同様のアプリケーションで最先端の結果を大幅に上回り、ノイズやショートアクション、非均一なワークフローの存在下でもセグメンテーション(88%対82%のマッチングスコア)とクラスタリング(67%対54%のF1スコア)の品質を向上させることができる。非反復的なアクションシーケンス標準商用仕様のハードウェア上の完全なアクション識別は、単一の実行のために1 s未満で実行される。 Robot-assisted surgery is an established clinical practice. The automatic identification of surgical actions is needed for a range of applications, including performance assessment of trainees and surgical process modeling for autonomous execution and monitoring. However, supervised action identification is not feasible, due to the burden of manually annotating recordings of potentially complex and long surgical executions. Moreover, often few example executions of a surgical procedure can be recorded. This paper proposes a novel algorithm for unsupervised identification of surgical actions in a standard surgical training task, the ring transfer, executed with da Vinci Research Kit. Exploiting kinematic and semantic visual features automatically extracted from a very limited dataset of executions, we are able to significantly outperform the state-of-the-art results for a similar application, improving the quality of segmentation (88% vs. 82% matching score) and clustering (67% vs. 54% F1-score) even in the presence of noise, short actions and non homogeneous workflows, i.e. non repetitive action sequences. Full action identification on hardware with standard commercial specifications is performed in less than 1 s for single execution.	翻訳日:2021-05-19 13:58:16 公開日:2021-05-18
# 高速かつ効率的なシーンテキスト認識のための視覚変換器 Vision Transformer for Fast and Efficient Scene Text Recognition ( http://arxiv.org/abs/2105.08582v1 ) ライセンス: Link先を確認	Rowel Atienza	(参考訳) Scene Text Recognition (STR) は、コンピュータがオブジェクトラベル、道路標識、指示書などの自然なシーンでテキストを読むことを可能にする。 STRは、どのオブジェクトを選択するか、どの方向に進むか、次のアクションのステップは何かといった、マシンが情報的な決定を行うのを助ける。 STRの研究の本体では、常に認識精度に焦点が当てられている。速度と計算効率にはあまり重点が置かれておらず、特にエネルギー制約のあるモバイルマシンでも同様に重要である。本稿では、計算およびパラメータ効率のよい視覚変換器(ViT)上に構築された単純な単一ステージモデルアーキテクチャを持つSTRであるViTSTRを提案する。 TRBAのような、84.3%の精度の強力なベースライン法では、私たちの小さなViTSTRは、パラメータの43.4%と42.2%のFLOPSを使用して、2.4倍の速度で82.6%(データ拡張で84.2%)の競争精度を達成する。 ViTSTRの小さなバージョンは80.3%の精度(データ拡張で82.1%)、2.5倍の速度で、パラメータの10.9%と11.9%のFLOPSしか必要としない。データ拡張では、我々のベースViTSTRはTRBAの精度85.2%(拡張なしで83.7%)を2.3倍に向上するが、73.2%以上のパラメータと61.5%以上のFLOPSを必要とする。トレードオフに関して言えば、ほぼ全てのViTSTR構成は、精度、速度、計算効率を同時に最大化するために、フロンティア付近にある。 Scene text recognition (STR) enables computers to read text in natural scenes such as object labels, road signs and instructions. STR helps machines perform informed decisions such as what object to pick, which direction to go, and what is the next step of action. In the body of work on STR, the focus has always been on recognition accuracy. There is little emphasis placed on speed and computational efficiency which are equally important especially for energy-constrained mobile machines. In this paper we propose ViTSTR, an STR with a simple single stage model architecture built on a compute and parameter efficient vision transformer (ViT). On a comparable strong baseline method such as TRBA with accuracy of 84.3%, our small ViTSTR achieves a competitive accuracy of 82.6% (84.2% with data augmentation) at 2.4x speed up, using only 43.4% of the number of parameters and 42.2% FLOPS. The tiny version of ViTSTR achieves 80.3% accuracy (82.1% with data augmentation), at 2.5x the speed, requiring only 10.9% of the number of parameters and 11.9% FLOPS. With data augmentation, our base ViTSTR outperforms TRBA at 85.2% accuracy (83.7% without augmentation) at 2.3x the speed but requires 73.2% more parameters and 61.5% more FLOPS. In terms of trade-offs, nearly all ViTSTR configurations are at or near the frontiers to maximize accuracy, speed and computational efficiency all at the same time.	翻訳日:2021-05-19 13:57:54 公開日:2021-05-18
# 圧縮特徴リプレイによるオンライン連続学習のためのACAE-REMIND ACAE-REMIND for Online Continual Learning with Compressed Feature Replay ( http://arxiv.org/abs/2105.08595v1 ) ライセンス: Link先を確認	Kai Wang, Luis Herranz, Joost van de Weijer	(参考訳) オンライン連続学習は、学習者が一度だけデータを考えることができる、複数の異なるタスクから、非IIDデータストリームから学習することを目的としている。通常、メソッドは制限されたバッファを使用して、ストリームにいくつかのイメージを保存することができる。近年,画像の中間層表現が保存(あるいは生成)される機能リプレイは,メモリの削減を図りながら,画像リプレイよりも優れた結果をもたらすことが判明した。量子化された例はメモリ使用量をさらに削減できる。しかし、これらの方法の欠点は、固定された(あるいは非常に非推移的な)バックボーンネットワークを使用することである。これは、全てのタスクを区別できる表現の学習を著しく制限する。この問題を解決するために,中間層で高い圧縮率で特徴再生を行うための補助分類器自動エンコーダ (ACAE) モジュールを提案する。画像あたりのメモリフットプリントの削減により、リプレイ用に多くの例を節約できます。実験では、オンライン連続学習環境下でタスク非依存評価を行い、ImageNet-Subset、CIFAR100、CIFAR10データセット上で最先端のパフォーマンスを得る。 Online continual learning aims to learn from a non-IID stream of data from a number of different tasks, where the learner is only allowed to consider data once. Methods are typically allowed to use a limited buffer to store some of the images in the stream. Recently, it was found that feature replay, where an intermediate layer representation of the image is stored (or generated) leads to superior results than image replay, while requiring less memory. Quantized exemplars can further reduce the memory usage. However, a drawback of these methods is that they use a fixed (or very intransigent) backbone network. This significantly limits the learning of representations that can discriminate between all tasks. To address this problem, we propose an auxiliary classifier auto-encoder (ACAE) module for feature replay at intermediate layers with high compression rates. The reduced memory footprint per image allows us to save more exemplars for replay. In our experiments, we conduct task-agnostic evaluation under online continual learning setting and get state-of-the-art performance on ImageNet-Subset, CIFAR100 and CIFAR10 dataset.	翻訳日:2021-05-19 13:57:25 公開日:2021-05-18
# 都市交通現場におけるセマンティック・コンシスタント・レアルドメイン適応のためのコンテンツディスタングル Content Disentanglement for Semantically Consistent Synthetic-to-RealDomain Adaptation in Urban Traffic Scenes ( http://arxiv.org/abs/2105.08704v1 ) ライセンス: Link先を確認	Mert Keser, Artem Savkin, Federico Tombari	(参考訳) 合成データ生成は、自動運転における新しい交通シナリオを生成するための魅力的なアプローチである。しかし、合成データのみに訓練されたディープラーニング技術は、実データ上でのテスト時に劇的なパフォーマンス低下に遭遇する。このような性能低下は、一般に、実データと合成データの間の領域ギャップに起因する。上記の領域ギャップを軽減するために、ドメイン適応法が適用されている。これらの手法は視覚的に魅力的な結果をもたらすが、翻訳されたサンプルは通常意味的不一致をもたらす。本研究では,合成データと実データ間の意味的に一貫したドメイン適応を可能にする,教師なしのエンドツーエンドドメイン適応ネットワークアーキテクチャを提案する。セマンティックセグメンテーションの下流タスクにおけるアーキテクチャを評価し,最先端手法と比較して優れた性能が得られることを示す。 Synthetic data generation is an appealing approach to generate novel traffic scenarios in autonomous driving. However, deep learning techniques trained solely on synthetic data encounter dramatic performance drops when they are tested on real data. Such performance drop is commonly attributed to the domain gap between real and synthetic data. Domain adaptation methods have been applied to mitigate the aforementioned domain gap. These methods achieve visually appealing results, but the translated samples usually introduce semantic inconsistencies. In this work, we propose a new, unsupervised, end-to-end domain adaptation network architecture that enables semantically consistent domain adaptation between synthetic and real data. We evaluate our architecture on the downstream task of semantic segmentation and show that our method achieves superior performance compared to the state-of-the-art methods.	翻訳日:2021-05-19 13:57:07 公開日:2021-05-18
# 検索エンジンの魔法:検索エンジンとの会話を通して情報にアクセスする Wizard of Search Engine: Access to Information Through Conversations with Search Engines ( http://arxiv.org/abs/2105.08301v1 ) ライセンス: Link先を確認	Pengjie Ren, Zhongkun Liu, Xiaomeng Song, Hongtao Tian, Zhumin Chen, Zhaochun Ren and Maarten de Rijke	(参考訳) 会話情報探索(CIS)は、人々を情報に結びつける上でますます重要な役割を担っている。適切なリソースが不足しているため、CISに関する以前の研究は理論・概念的枠組み、実験室ベースのユーザー研究、あるいはCISの特定の側面(例えば、質問を明確にすること)の研究に限られている。本研究では,3つの側面からCISの研究を促進するために努力する。 1) 意図検出(ID)、キーフレーズ抽出(KE)、行動予測(AP)、クエリ選択(QS)、通過選択(PS)、応答生成(RG)の6つのサブタスクでCIS用のパイプラインを定式化する。 2) cisのすべての側面に関する包括的かつ詳細な調査を可能にする,ウィザード・オブ・検索エンジン(wise)と呼ばれるベンチマークデータセットをリリースする。 (3)6つのサブタスクを共同で個別にトレーニングし、評価できるニューラルネットワークを設計し、利用可能なデータを完全に活用することでWISEの要求を大規模に削減できる事前訓練/微調整学習方式を考案する。 WISE統計に基づくCISの有用な特徴について報告する。また、いくつかの指標で示されるような効果的なCISを実現するために、最良のモデル変種が可能であることを示す。我々は、この重要な研究方向性のさらなる改善を計測し、将来の研究を促進するためのデータセット、コード、および評価スクリプトをリリースする。 Conversational information seeking (CIS) is playing an increasingly important role in connecting people to information. Due to the lack of suitable resource, previous studies on CIS are limited to the study of theoretical/conceptual frameworks, laboratory-based user studies, or a particular aspect of CIS (e.g., asking clarifying questions). In this work, we make efforts to facilitate research on CIS from three aspects. (1) We formulate a pipeline for CIS with six sub-tasks: intent detection (ID), keyphrase extraction (KE), action prediction (AP), query selection (QS), passage selection (PS), and response generation (RG). (2) We release a benchmark dataset, called wizard of search engine (WISE), which allows for comprehensive and in-depth research on all aspects of CIS. (3) We design a neural architecture capable of training and evaluating both jointly and separately on the six sub-tasks, and devise a pre-train/fine-tune learning scheme, that can reduce the requirements of WISE in scale by making full use of available data. We report some useful characteristics of CIS based on statistics of WISE. We also show that our best performing model variant isable to achieve effective CIS as indicated by several metrics. We release the dataset, the code, as well as the evaluation scripts to facilitate future research by measuring further improvements in this important research direction.	翻訳日:2021-05-19 13:56:56 公開日:2021-05-18
# インドの政治危機中、Twitter上でのインフルエンサーの偏光 Divided We Rule: Influencer Polarization on Twitter During Political Crises in India ( http://arxiv.org/abs/2105.08361v1 ) ライセンス: Link先を確認	Saloni Dash, Dibyendu Mishra, Gazal Shekhawat, Joyojeet Pal	(参考訳) インフルエンサーは、ソーシャルメディアにおける情報伝達の性質とネットワークの鍵となる。インフルエンサーは、問題への関与を通じて政治的談話において特に重要であり、彼らの正当性は、オンライン操作によってのみまたは部分的に導かれるか、あるいは芸能人、ジャーナリストなどの専門知識のオフライン領域を持つ。インフルエンサーの政治的関与と極性の定量化には、インドの政治危機における6kインフルエンサーと26kインドの政治家のツイートをエンコードするために、googleのuniversal sentence encoder(use)を使用します。次に、リツイートグラフとともに、政治的問題に関して、インフルエンサーの姿勢と極性を計算するのに役立つツイート埋め込みに基づいて、インフルエンサーの集合ベクトル表現を得る。新型コロナウイルス(COVID-19)では、政府側にインフルエンサーが集まっている一方で、市民権、カシミールの州昇格、農民の抗議に関する他の3つの論争的な問題について、主に政府主導のファンアカウントであり、現職の地位を拡大している。本手法は、現在のインドにおける政治的分裂の洞察を提供するとともに、他の文脈におけるインフルエンサーや偏極を研究する手段を提供する。 Influencers are key to the nature and networks of information propagation on social media. Influencers are particularly important in political discourse through their engagement with issues, and may derive their legitimacy either solely or partly through online operation, or have an offline sphere of expertise such as entertainers, journalists etc. To quantify influencers' political engagement and polarity, we use Google's Universal Sentence Encoder (USE) to encode the tweets of 6k influencers and 26k Indian politicians during political crises in India. We then obtain aggregate vector representations of the influencers based on their tweet embeddings, which alongside retweet graphs help compute the stance and polarity of these influencers with respect to the political issues. We find that while on COVID-19 there is a confluence of influencers on the side of the government, on three other contentious issues around citizenship, Kashmir's statehood, and farmers' protests, it is mainly government-aligned fan accounts that amplify the incumbent's positions. We propose that this method offers insight into the political schisms in present-day India, but also offers a means to study influencers and polarization in other contexts.	翻訳日:2021-05-19 13:56:31 公開日:2021-05-18
# エンティティベースのクエリ解釈 Entity-Based Query Interpretation ( http://arxiv.org/abs/2105.08581v1 ) ライセンス: Link先を確認	Vaibhav Kasturia, Marcel Gohsen, Matthias Hagen	(参考訳) パリ・ヒルトン(paris hilton)は、有名人の最新のニュースを見つけたり、パリで特定のホテルを見つけたりすることを目的としているのだろうか? そして、世界の20以上の「パリ」の中で、どちらがそうですか。本稿では,エンティティベースのクエリ解釈を導出することで,このあいまいさを解消することを提案する。あるクエリに対して,クエリの適切な部分を,背景知識ベースで意味的に互換性のあるエンティティにリンクすること。提案手法は, 検索応答時間が数百ミリ秒を超えるべきではないため, 有効性だけでなく, 効率性にも焦点をあてるものである。提案手法では,クエリセグメンテーションを前処理ステップとして,有望なセグメントベースの「骨格」を見つけることを提案する。これらの骨格は、包含されたセグメントを知識ベースからエンティティにリンクし、最後のステップで解釈をランク付けすることで「解釈」に拡張される。 2,800のクエリをコーパスで比較した結果,これまでで最も有効なクエリエンティティリンク手法よりも,実行時の解釈精度を向上するアプローチが示された。 Web search queries can be rather ambiguous: Is "paris hilton" meant to find the latest news on the celebrity or to find a specific hotel in Paris? And in which of the worldwide more than 20 "Parises"? We propose to solve this ambiguity problem by deriving entity-based query interpretations: given some query, the task is to link suitable parts of the query to semantically compatible entities in a background knowledge base. Our suggested approach to identify the most reasonable interpretations of a query based on the contained entities focuses on effectiveness but also on efficiency since web search response times should not exceed some hundreds of milliseconds. In our approach, we propose to use query segmentation as a pre-processing step that finds promising segment-based "skeletons". These skeletons are then enhanced to "interpretations" by linking the contained segments to entities from a knowledge base and then ranking the interpretations in a final step. An experimental comparison on a corpus of 2,800 queries shows our approach to have a better interpretation accuracy at a better run time than the previously most effective query entity linking methods.	翻訳日:2021-05-19 13:56:10 公開日:2021-05-18
# 残留ネットワークと埋め込み利用:グラフ畳み込みネットワークを用いたノード分類の新手法 Residual Network and Embedding Usage: New Tricks of Node Classification with Graph Convolutional Networks ( http://arxiv.org/abs/2105.08330v1 ) ライセンス: Link先を確認	Huixuan Chi, Yuying Wang, Qinfen Hao, Hong Xia	(参考訳) グラフ畳み込みネットワーク(GCN)とその後の変種は、グラフ上のタスク、特にノード分類タスクを解決するために提案されている。しかし文献では、ほとんどのトリックやテクニックが実装の詳細として言及されているか、ソースコードでしか見えない。本稿ではまず,GCNのミニバッチトレーニングで使用される既存の効果的なトリックについて要約する。これに基づいて,gcn_resフレームワークと組込み使用法という2つの新しい手法が,異なるデータセットにおけるベースラインのテスト精度を向上させるために,残差ネットワークと事前学習された組込みを活用することで提案されている。 Open Graph Benchmark (OGB) の実験では、これらの手法を組み合わせることで、様々なGCNのテスト精度が1.21%〜2.84%向上した。実装はhttps://github.com/ytchx 1999/PyG-OGB-Tricks.comで公開しています。 Graph Convolutional Networks (GCNs) and subsequent variants have been proposed to solve tasks on graphs, especially node classification tasks. In the literature, however, most tricks or techniques are either briefly mentioned as implementation details or only visible in source code. In this paper, we first summarize some existing effective tricks used in GCNs mini-batch training. Based on this, two novel tricks named GCN_res Framework and Embedding Usage are proposed by leveraging residual network and pre-trained embedding to improve baseline's test accuracy in different datasets. Experiments on Open Graph Benchmark (OGB) show that, by combining these techniques, the test accuracy of various GCNs increases by 1.21%~2.84%. We open source our implementation at https://github.com/ytchx1999/PyG-OGB-Tricks.	翻訳日:2021-05-19 13:55:33 公開日:2021-05-18
# グラフニューラルネットワークを用いた時系列異常検出のためのスタックングVAE Stacking VAE with Graph Neural Networks for Effective and Interpretable Time Series Anomaly Detection ( http://arxiv.org/abs/2105.08397v1 ) ライセンス: Link先を確認	Wenkai Li, Wenbo Hu, Ning Chen, Cheng Feng	(参考訳) 実世界の保守アプリケーションにおいて、深層生成モデルは、複数のセンサから収集された時系列信号からエンティティの異常事象を検出する上で有望な性能を示した。それにもかかわらず、このようなモデルを時系列異常検出に活用するための2つの重要な課題を概説する:1)効率的かつ効率的な再構成モデルの開発、2)多変量時系列データチャネル間の類似性と相互関係構造を利用する。これらの課題に対処するため,本稿では,グラフニューラルネットワークを用いた重畳変動自動エンコーダ(VAE)モデルを提案する。具体的には,チャネル間の類似性を持つ多変量時系列データに対して,重み共有方式を用いた積み重ねブロックワイズ再構築フレームワークを提案する。さらに,グラフ学習モジュールを用いて疎隣接行列を学習し,時系列データチャネル間の安定な相互関係構造情報を明示的に把握し,系列パターンの解釈可能な再構成を行う。実験結果から,提案モデルが3つの公開データセットに対して高いベースラインを達成し,その一方でトレーニング効率の維持が図られた。さらに,本モデルで学習した直感的な安定構造は,検出結果の解釈可能性を大幅に向上させることを示した。 In real-world maintenance applications, deep generative models have shown promising performance in detecting anomalous events of entities from time-series signals collected from multiple sensors. Nevertheless, we outline two important challenges of leveraging such models for times-series anomaly detection: 1) developing effective and efficient reconstruction models and 2) exploiting the similarity and interrelation structures among the multivariate time series data channels. To address these challenges, in this paper we propose a stacking variational auto-encoder (VAE) model with graph neural networks for the effective and interpretable time-series anomaly detection. Specifically, we propose a stacking block-wise reconstruction framework with a weight-sharing scheme for the multivariate time series data with similarities among channels. Moreover, with a graph learning module, our model learns a sparse adjacency matrix to explicitly capture the stable interrelation structure information among multiple time series data channels for interpretable reconstruction of series patterns. Experimental results show that our proposed model outperforms the strong baselines on three public datasets with considerable improvements and meanwhile still maintains the training efficiency. Furthermore, we demonstrate that the intuitive stable structure learned by our model significantly improves the interpretability of our detection results.	翻訳日:2021-05-19 13:55:18 公開日:2021-05-18
# 深層学習によるアルツハイマー病診断の自動評価 Automatic Assessment of Alzheimer's Disease Diagnosis Based on Deep Learning Techniques ( http://arxiv.org/abs/2105.08446v1 ) ライセンス: Link先を確認	Alejandro Puente-Castro, Enrique Fernandez-Blanco, Alejandro Pazos, Cristian R. Munteanu	(参考訳) 早期発見はアルツハイマー病(AD)の進行を防ぐために重要である。したがって、専門家はできるだけ早く予防治療を開始することができる。彼らはADの早期かつ最も検出が難しい診断において、迅速かつ正確な評価を要求する。本研究の主な目的は、一般的には使われない矢状磁気共鳴画像(MRI)における疾患の存在を自動的に検出するシステムを開発することである。 ADNIデータセットとOASISデータセットの矢状MRIが採用された。より正確な結果を得るために,Transfer Learning (TL) 技術を用いて実験を行った。第一に、ADとそのステージに関する損傷は、矢状MRIにおいて区別でき、第二に、矢状MRIを用いたDLモデルを用いて得られた結果は、水平平面MRIを用いた最先端のMRIと類似している。矢状面MRIは一般的には使われていないが、この研究は、少なくとも、ADを早期に同定する他の平面からのMRIと同じくらい効果があることを証明した。これはさらなる研究の道を開くかもしれない。最後に、ある分野において、データセットの例を得るのは非常に高価であることに留意する必要がある。本研究は,これらの分野でDLモデルを構築できることを実証する一方,TLは少ない例でタスクを完了するための必須のツールである。 Early detection is crucial to prevent the progression of Alzheimer's disease (AD). Thus, specialists can begin preventive treatment as soon as possible. They demand fast and precise assessment in the diagnosis of AD in the earliest and hardest to detect stages. The main objective of this work is to develop a system that automatically detects the presence of the disease in sagittal magnetic resonance images (MRI), which are not generally used. Sagittal MRIs from ADNI and OASIS data sets were employed. Experiments were conducted using Transfer Learning (TL) techniques in order to achieve more accurate results. There are two main conclusions to be drawn from this work: first, the damages related to AD and its stages can be distinguished in sagittal MRI and, second, the results obtained using DL models with sagittal MRIs are similar to the state-of-the-art, which uses the horizontal-plane MRI. Although sagittal-plane MRIs are not commonly used, this work proved that they were, at least, as effective as MRI from other planes at identifying AD in early stages. This could pave the way for further research. Finally, one should bear in mind that in certain fields, obtaining the examples for a data set can be very expensive. This study proved that DL models could be built in these fields, whereas TL is an essential tool for completing the task with fewer examples.	翻訳日:2021-05-19 13:55:01 公開日:2021-05-18
# 不平等な長期水道需要予測 Univariate Long-Term Municipal Water Demand Forecasting ( http://arxiv.org/abs/2105.08486v1 ) ライセンス: Link先を確認	Blake VanBerlo, Matthew A.S. Ross, Daniel Hsia	(参考訳) 本研究は,カナダのロンドンにおける都市全体の水消費のモデル化について述べる。線形回帰やfacebookの預言法,リカレントニューラルネットワーク,畳み込みニューラルネットワークなど,水消費を伴う不定時系列予測のタスクに対して,複数のモデリング手法が評価された。命題は選択のモデルとして同定され、5倍のクロス検証で平均2.51%の絶対誤差を達成した。預言者はまた、水需要管理の利害関係者にとって価値のある他の利点があることが判明した。提案手法の実装は,他の自治体でも適用可能であるため,オープンソース化されている。 This study describes an investigation into the modelling of citywide water consumption in London, Canada. Multiple modelling techniques were evaluated for the task of univariate time series forecasting with water consumption, including linear regression, Facebook's Prophet method, recurrent neural networks, and convolutional neural networks. Prophet was identified as the model of choice, having achieved a mean absolute percentage error of 2.51%, averaged across a 5-fold cross validation. Prophet was also found to have other advantages deemed valuable to water demand management stakeholders, including inherent interpretability and graceful handling of missing data. The implementation for the methods described in this paper has been open sourced, as they may be adaptable by other municipalities.	翻訳日:2021-05-19 13:54:41 公開日:2021-05-18
# multi-aspect temporal network embedded: a mixed of hawkes process view Multi-Aspect Temporal Network Embedding: A Mixture of Hawkes Process View ( http://arxiv.org/abs/2105.08566v1 ) ライセンス: Link先を確認	Yutian Chang and Guannan Liu and Yuan Zuo and Junjie Wu	(参考訳) 近年,ネットワーク組込みの研究が盛んに行われている。現存する研究は、ネットワーク構造の本質的なダイナミクスを明らかにする重要な情報として地区形成を取り上げ、近隣の歴史的影響を捉えるために、時間的エッジ形成シーケンスの符号化を提案した。しかし,本稿では,エッジの形成は時間的影響を含む様々な要因に起因しうると論じる。実のところ、異なるノードの側面は、識別された隣人の形成を駆動し、時間的スコープを超えながら関連するマルチスペクトル埋め込みを生み出します。そこで本研究では,ホークスに基づく時間的ネットワーク埋め込み(mhne)モデルを用いて,ネットワークのアスペクト駆動近傍形成を捉える手法を提案する。 MHNEでは、複数アスペクトの埋め込みをホークス過程の混合にエンコードし、励起効果と潜伏面をモデル化する利点を得る。具体的には、履歴イベントの励起効果を考慮して異なる重みを割り当てるためにグラフアテンション機構を使用し、一方、gumbel-softmaxはアスペクト上の分布を導出するために接続される。 8つの異なる時間ネットワークに関する大規模な実験は、MHNEによって得られたマルチアスペクト埋め込みの性能を最先端の手法と比較した。 Recent years have witnessed the tremendous research interests in network embedding. Extant works have taken the neighborhood formation as the critical information to reveal the inherent dynamics of network structures, and suggested encoding temporal edge formation sequences to capture the historical influences of neighbors. In this paper, however, we argue that the edge formation can be attributed to a variety of driving factors including the temporal influence, which is better referred to as multiple aspects. As a matter of fact, different node aspects can drive the formation of distinctive neighbors, giving birth to the multi-aspect embedding that relates to but goes beyond a temporal scope. Along this vein, we propose a Mixture of Hawkes-based Temporal Network Embeddings (MHNE) model to capture the aspect-driven neighborhood formation of networks. In MHNE, we encode the multi-aspect embeddings into the mixture of Hawkes processes to gain the advantages in modeling the excitation effects and the latent aspects. Specifically, a graph attention mechanism is used to assign different weights to account for the excitation effects of history events, while a Gumbel-Softmax is plugged in to derive the distribution over the aspects. Extensive experiments on 8 different temporal networks have demonstrated the great performance of the multi-aspect embeddings obtained by MHNE in comparison with the state-of-the-art methods.	翻訳日:2021-05-19 13:54:29 公開日:2021-05-18
# ニューラルネットワークを用いたPDE制約モデル:最適化と大域収束 PDE-constrained Models with Neural Network Terms: Optimization and Global Convergence ( http://arxiv.org/abs/2105.08633v1 ) ライセンス: Link先を確認	Justin Sirignano, Jonathan MacArt, Konstantinos Spiliopoulos	(参考訳) 近年、深層学習を用いて、科学と工学における偏微分方程式(pde)モデルを開発した。 PDEの機能形式はニューラルネットワークによって決定され、ニューラルネットワークパラメータは利用可能なデータに校正される。 PDEを最適化することで、組み込みニューラルネットワークの校正を行うことができる。これらの応用に動機づけられ,ニューラルネットワークを用いた線形楕円型pdesの最適化を厳格に検討した。 PDEのニューラルネットワークパラメータは勾配降下を用いて最適化され、その勾配は隣接PDEを用いて評価される。パラメータの数が大きくなると、PDE と随伴する PDE は非局所 PDE 系に収束する。この制限付きPDEシステムを用いて、最適化中にニューラルネットワーク-PDEのグローバル最小値への収束を証明できる。極限PDEシステムは、固有値が正であるが任意に小さい非局所線型作用素を含む。固有値に対するスペクトルギャップの欠如は、大域収束証明の主要な課題である。結合型PDEと随伴型PDEシステムのスペクトル分解の注意深い解析が必要である。最後に, ニューラルネットワークをレイノルズ平均化 Navier-Stokes (RANS) 方程式の閉包モデルとして機能させる流体力学への応用のために, ニューラルネットワークモデルをトレーニングする。 RANSニューラルネットワークモデルは、乱流チャネルフローの複数のデータセットに基づいてトレーニングされ、Reynoldsの異なる数値でサンプル外評価される。 Recent research has used deep learning to develop partial differential equation (PDE) models in science and engineering. The functional form of the PDE is determined by a neural network, and the neural network parameters are calibrated to available data. Calibration of the embedded neural network can be performed by optimizing over the PDE. Motivated by these applications, we rigorously study the optimization of a class of linear elliptic PDEs with neural network terms. The neural network parameters in the PDE are optimized using gradient descent, where the gradient is evaluated using an adjoint PDE. As the number of parameters become large, the PDE and adjoint PDE converge to a non-local PDE system. Using this limit PDE system, we are able to prove convergence of the neural network-PDE to a global minimum during the optimization. The limit PDE system contains a non-local linear operator whose eigenvalues are positive but become arbitrarily small. The lack of a spectral gap for the eigenvalues poses the main challenge for the global convergence proof. Careful analysis of the spectral decomposition of the coupled PDE and adjoint PDE system is required. Finally, we use this adjoint method to train a neural network model for an application in fluid mechanics, in which the neural network functions as a closure model for the Reynolds-averaged Navier-Stokes (RANS) equations. The RANS neural network model is trained on several datasets for turbulent channel flow and is evaluated out-of-sample at different Reynolds numbers.	翻訳日:2021-05-19 13:54:08 公開日:2021-05-18
# 賭けによる予測アルゴリズムの強化 Enhancement of prediction algorithms by betting ( http://arxiv.org/abs/2105.08669v1 ) ライセンス: Link先を確認	Vladimir Vovk	(参考訳) 本稿では,確率的予測アルゴリズムの品質向上のための手法を提案する。これは、最近開発された共形テストmartingalesの成功に触発されている。 This note proposes a procedure for enhancing the quality of probabilistic prediction algorithms via betting against their predictions. It is inspired by the success of the conformal test martingales that have been developed recently.	翻訳日:2021-05-19 13:53:47 公開日:2021-05-18
# AIと共有の繁栄 AI and Shared Prosperity ( http://arxiv.org/abs/2105.08475v1 ) ライセンス: Link先を確認	Katya Klinova and Anton Korinek	(参考訳) 人間の労働を自動化するAIの今後の進歩は、労働市場や不平等に深刻な影響を及ぼす可能性がある。本稿では、生産性の向上が社会を豊かにすると同時に、労働需要の増加に寄与することを考慮して、労働市場における特定のタイプのaiシステムの影響を分析する枠組みを提案する。この分析により、倫理に配慮した企業は、aiシステムの作成や展開を可能にし、研究者や政策立案者は、労働市場や不平等に対する彼らの行動の影響を考慮に入れ、aiの進歩を、共通の繁栄と全人類の包括的経済未来を促進する方向に導くことができる。 Future advances in AI that automate away human labor may have stark implications for labor markets and inequality. This paper proposes a framework to analyze the effects of specific types of AI systems on the labor market, based on how much labor demand they will create versus displace, while taking into account that productivity gains also make society wealthier and thereby contribute to additional labor demand. This analysis enables ethically-minded companies creating or deploying AI systems as well as researchers and policymakers to take into account the effects of their actions on labor markets and inequality, and therefore to steer progress in AI in a direction that advances shared prosperity and an inclusive economic future for all of humanity.	翻訳日:2021-05-19 13:53:29 公開日:2021-05-18
# 自分の寝室を飾る: 生成的広告ネットワークによるローカルな画像生成 Decorating Your Own Bedroom: Locally Controlling Image Generation with Generative Adversarial Networks ( http://arxiv.org/abs/2105.08222v1 ) ライセンス: Link先を確認	Chen Zhang, Yinghao Xu, Yujun Shen	(参考訳) GAN(Generative Adversarial Networks)は高品質な画像の合成に成功している。しかし、十分に訓練されたGANモデルの生成過程を制御し、出力イメージをカスタマイズする方法は、明らかにされていない。 GANで使用される入力潜時符号の変調は、出力画像の変動係数を合理的に変更できることが最近発見されたが、そのような操作は通常、画像全体を変更するために現れる。本研究では,出力画像のローカル編集をサポートするためのloganと呼ばれる効果的な手法を提案する。具体的には,コンテンツ変調とスタイル変調の2つの演算子を優先マスクとともに導入し,中間生成特性の正確な制御を容易にする。寝室の合成を例にとれば、部屋内の個々のオブジェクトをシームレスに削除、挿入、シフト、回転することが可能です。さらに, 部屋を完全に取り除き, 家具やスタイルをカスタマイズして再調合することができる。実験結果から,多目的画像編集のための事前学習されたGANの画像生成を操る大きな可能性を示した。 Generative Adversarial Networks (GANs) have made great success in synthesizing high-quality images. However, how to steer the generation process of a well-trained GAN model and customize the output image is much less explored. It has been recently found that modulating the input latent code used in GANs can reasonably alter some variation factors in the output image, but such manipulation usually presents to change the entire image as a whole. In this work, we propose an effective approach, termed as LoGAN, to support local editing of the output image. Concretely, we introduce two operators, i.e., content modulation and style modulation, together with a priority mask to facilitate the precise control of the intermediate generative features. Taking bedroom synthesis as an instance, we are able to seamlessly remove, insert, shift, and rotate the individual objects inside a room. Furthermore, our method can completely clear out a room and then refurnish it with customized furniture and styles. Experimental results show the great potentials of steering the image generation of pre-trained GANs for versatile image editing.	翻訳日:2021-05-19 13:53:17 公開日:2021-05-18
# chexnetで事前学習したresnet50を用いたコロナ病に対応するx線画像の分類 Transfer learning approach to Classify the X-ray image that corresponds to corona disease Using ResNet50 pretrained by ChexNet ( http://arxiv.org/abs/2105.08382v1 ) ライセンス: Link先を確認	Mahyar Bolhassani	(参考訳) コロナウイルスは世界中の人々に悪影響を及ぼした。コビッド19ウイルスと肺炎やインフルエンザなどの他の呼吸器疾患との間には共通の症状がある。したがって、迅速な診断は患者を救うだけでなく、感染拡大を防ぐためにも重要である。最も頼りになる診断方法の1つは、肺のx線像である。深層学習アプローチの助けを借りて、深層モデルに影響のある肺の状態を学ぶように教えることができる。したがって、新しいサンプルをCovid19感染患者であるかどうかの分類が可能である。このプロジェクトでは、imagenetデータセットとchexnetデータセットで事前トレーニングされたresnet50に基づく深いモデルをトレーニングします。 kaggle が導入した非バランスな coronahack 胸部 x-ray データセットに基づいて,バイナリ分類とマルチクラス分類の両方を適用した。また,焦点損失とクロスエントロピー損失を用いた場合の比較を行った。 Coronavirus adversely has affected people worldwide. There are common symptoms between the Covid19 virus disease and other respiratory diseases like pneumonia or Influenza. Therefore, diagnosing it fast is crucial not only to save patients but also to prevent it from spreading. One of the most reliant methods of diagnosis is through X-ray images of a lung. With the help of deep learning approaches, we can teach the deep model to learn the condition of an affected lung. Therefore, it can classify the new sample as if it is a Covid19 infected patient or not. In this project, we train a deep model based on ResNet50 pretrained by ImageNet dataset and CheXNet dataset. Based on the imbalanced CoronaHack Chest X-Ray dataset introducing by Kaggle we applied both binary and multi-class classification. Also, we compare the results when using Focal loss and Cross entropy loss.	翻訳日:2021-05-19 13:52:59 公開日:2021-05-18
# 固定フロップ数でのハイパーネットワークのオーバーパラメトリゼーションによる高速ニューラルネットワークの高速化 Overparametrization of HyperNetworks at Fixed FLOP-Count Enables Fast Neural Image Enhancement ( http://arxiv.org/abs/2105.08470v1 ) ライセンス: Link先を確認	Lorenz K. Muller	(参考訳) 深層畳み込みニューラルネットワークは、小型のモバイルカメラセンサーで撮影された画像を強化し、分解、分解、超高解像度化といったタスクに優れる。しかし、モバイルデバイスで実際に使用する場合、これらのネットワークはFLOPを多用し、畳み込み層のFLOPを削減し、パラメータ数を減少させる。これは、過大なパラメータを持つニューラルネットワークがしばしば最も一般化しているという最近の発見から、問題となっている。本稿では,フロップと標準畳み込みのパラメータの固定比率を破るためにハイパーネットワークの利用を提案する。これにより、ZRR(Zurich RAW-to-DSLR)データセットにおけるSSIMおよびMS-SSIMの従来の最先端アーキテクチャを10倍に削減できる。 zrr では、より大きな画像限界において、固定フロップ数における「二重増分」挙動と一致する一般化曲線をさらに観察する。最後に、既存のネットワーク(VDN)に同じ手法を適用することで、スマートフォン画像デノイングデータセット(SIDD)の忠実さを維持しながら計算コストを削減できることを示す。キー関数のコードは、appendixで与えられる。 Deep convolutional neural networks can enhance images taken with small mobile camera sensors and excel at tasks like demoisaicing, denoising and super-resolution. However, for practical use on mobile devices these networks often require too many FLOPs and reducing the FLOPs of a convolution layer, also reduces its parameter count. This is problematic in view of the recent finding that heavily over-parameterized neural networks are often the ones that generalize best. In this paper we propose to use HyperNetworks to break the fixed ratio of FLOPs to parameters of standard convolutions. This allows us to exceed previous state-of-the-art architectures in SSIM and MS-SSIM on the Zurich RAW- to-DSLR (ZRR) data-set at > 10x reduced FLOP-count. On ZRR we further observe generalization curves consistent with 'double-descent' behavior at fixed FLOP-count, in the large image limit. Finally we demonstrate the same technique can be applied to an existing network (VDN) to reduce its computational cost while maintaining fidelity on the Smartphone Image Denoising Dataset (SIDD). Code for key functions is given in the appendix.	翻訳日:2021-05-19 13:52:48 公開日:2021-05-18
# 平均的マルチエージェント強化学習のための置換不変ポリシー最適化:原則的アプローチ Permutation Invariant Policy Optimization for Mean-Field Multi-Agent Reinforcement Learning: A Principled Approach ( http://arxiv.org/abs/2105.08268v1 ) ライセンス: Link先を確認	Yan Li, Lingxiao Wang, Jiachen Yang, Ethan Wang, Zhaoran Wang, Tuo Zhao, Hongyuan Zha	(参考訳) 多エージェント強化学習(MARL)は, エージェントの数が指数関数的に増加するにつれて, より多くのエージェントの存在下でより困難になる。このようなスケール上の課題に対処するために,置換不変性を持つ協調的marl問題のクラスを同定し,平均場マルコフ決定過程(mdp)として定式化する。そこで,置換不変なアクター批判型ニューラルアーキテクチャのコアとなる平均場近似ポリシ最適化(MF-PPO)アルゴリズムを提案する。我々は,MF-PPOが収束のサブ線形速度で世界的最適政策を達成することを証明した。さらに、サンプルの複雑さはエージェントの数に依存しない。マルチエージェント粒子環境(MPE)における数値実験により,MF-PPOの理論的利点を検証する。特に、置換不変ニューラルアーキテクチャによって引き起こされる帰納バイアスにより、MF-PPOは、その一般化性能の鍵となる、より少ないモデルパラメータで既存の競合より優れていることを示す。 Multi-agent reinforcement learning (MARL) becomes more challenging in the presence of more agents, as the capacity of the joint state and action spaces grows exponentially in the number of agents. To address such a challenge of scale, we identify a class of cooperative MARL problems with permutation invariance, and formulate it as a mean-field Markov decision processes (MDP). To exploit the permutation invariance therein, we propose the mean-field proximal policy optimization (MF-PPO) algorithm, at the core of which is a permutation-invariant actor-critic neural architecture. We prove that MF-PPO attains the globally optimal policy at a sublinear rate of convergence. Moreover, its sample complexity is independent of the number of agents. We validate the theoretical advantages of MF-PPO with numerical experiments in the multi-agent particle environment (MPE). In particular, we show that the inductive bias introduced by the permutation-invariant neural architecture enables MF-PPO to outperform existing competitors with a smaller number of model parameters, which is the key to its generalization performance.	翻訳日:2021-05-19 13:52:17 公開日:2021-05-18
# ModelPS: スケールでトレーニング済みモデルを編集するためのインタラクティブで協調的なプラットフォーム ModelPS: An Interactive and Collaborative Platform for Editing Pre-trained Models at Scale ( http://arxiv.org/abs/2105.08275v1 ) ライセンス: Link先を確認	Yuanming Li, Huaizheng Zhang, Shanshan Jiang, Fan Yang, Yonggang Wen and Yong Luo	(参考訳) AIエンジニアリングは、さまざまなバックグラウンドを持つソフトウェア開発者の間でディープニューラルネットワーク(DNN)モデルを民主化するための重要な規律として登場した。特に、デプロイメント段階でこれらのDNNモデルを変更することは、非常に難しい課題です。本研究では,協調的なdnnモデル編集とインテリジェントなモデル提供を実現するために,ローコードソリューションであるmodelps("model photoshop"の頭字語)を提案し,開発する。 modelpsソリューションは2つのトランスフォーメーションフィーチャを具体化している: 1) 開発者チームがdnnモデルをローコード形式で画像で共有および編集するためのユーザフレンドリーなwebインターフェース、2) 開発者が所定のデプロイメント要件や制約のためにモデル編集設定をカスタマイズするのを支援するバックエンドのmodel genieエンジン。広範にわたるディープラーニング(DL)モデルを用いたケーススタディでは,生産性の向上により,開発オーバーヘッドと通信オーバーヘッドの両方を大幅に削減できることが示された。コードはGitHubでオープンソースパッケージとしてリリースされた。 AI engineering has emerged as a crucial discipline to democratize deep neural network (DNN) models among software developers with a diverse background. In particular, altering these DNN models in the deployment stage posits a tremendous challenge. In this research, we propose and develop a low-code solution, ModelPS (an acronym for "Model Photoshop"), to enable and empower collaborative DNN model editing and intelligent model serving. The ModelPS solution embodies two transformative features: 1) a user-friendly web interface for a developer team to share and edit DNN models pictorially, in a low-code fashion, and 2) a model genie engine in the backend to aid developers in customizing model editing configurations for given deployment requirements or constraints. Our case studies with a wide range of deep learning (DL) models show that the system can tremendously reduce both development and communication overheads with improved productivity. The code has been released as an open-source package at GitHub.	翻訳日:2021-05-19 13:51:58 公開日:2021-05-18
# DRIVE: 1ビット分散平均推定 DRIVE: One-bit Distributed Mean Estimation ( http://arxiv.org/abs/2105.08339v1 ) ライセンス: Link先を確認	Shay Vargaftik, Ran Ben Basat, Amit Portnoy, Gal Mendelson, Yaniv Ben-Itzhak, Michael Mitzenmacher	(参考訳) 我々は、$n$クライアントが$d(1+o(1))$ビットのみを使用して$d$-dimensional real-valuedベクターを送信する問題を考える。このような圧縮問題は、フェデレートされた分散学習や、他の領域でも発生する。従来の圧縮アルゴリズムを精度と計算効率で上回る、新しい数学的結果を提供し、それに対応する新しいアルゴリズムを導出する。本手法は,分散学習タスクとフェデレーション学習タスクの集合において,様々なデータセットを用いて評価し,その状態に対して一貫した改善を示す。 We consider the problem where $n$ clients transmit $d$-dimensional real-valued vectors using only $d(1+o(1))$ bits each, in a manner that allows a receiver to approximately reconstruct their mean. Such compression problems arise in federated and distributed learning, as well as in other domains. We provide novel mathematical results and derive corresponding new algorithms that outperform previous compression algorithms in accuracy and computational efficiency. We evaluate our methods on a collection of distributed and federated learning tasks, using a variety of datasets, and show a consistent improvement over the state of the art.	翻訳日:2021-05-19 13:51:39 公開日:2021-05-18
# Euler-Maruyamaスキームを模倣したニューラルネットワークによる確率力学系の学習 Learning stochastic dynamical systems with neural networks mimicking the Euler-Maruyama scheme ( http://arxiv.org/abs/2105.08449v1 ) ライセンス: Link先を確認	Noura Dridi, Lucas Drumetz, Ronan Fablet	(参考訳) 確率微分方程式(SDE)は力学系の最も重要な表現の一つである。これらは、システムの決定論的構成要素と、ランダムな未知の要因を表す確率的要素を含む能力で特筆される。しかし、これはSDEの学習を通常の微分方程式(ODE)よりもはるかに困難にする。本稿では、SDEのパラメータをSDE統合スキームを組み込んだニューラルネットワークで表現するデータ駆動手法を提案する。損失関数は最大確率基準に基づいており、1つのマルコフ・ガウスの仮定に従っている。このアルゴリズムは、幾何学的ブラウン運動とロレンツ-63モデルの確率バージョンに適用される。後者は、状態に依存する確率的なコンポーネントが存在するため、特に対処が難しい。アルゴリズムの性能は異なるシミュレーション結果を用いて検証される。さらに,非線型ドリフト推定に用いる基準勾配マッチング法と,確率項を考慮しないニューラルネットワークに基づく手法との比較を行った。 Stochastic differential equations (SDEs) are one of the most important representations of dynamical systems. They are notable for the ability to include a deterministic component of the system and a stochastic one to represent random unknown factors. However, this makes learning SDEs much more challenging than ordinary differential equations (ODEs). In this paper, we propose a data driven approach where parameters of the SDE are represented by a neural network with a built-in SDE integration scheme. The loss function is based on a maximum likelihood criterion, under order one Markov Gaussian assumptions. The algorithm is applied to the geometric brownian motion and a stochastic version of the Lorenz-63 model. The latter is particularly hard to handle due to the presence of a stochastic component that depends on the state. The algorithm performance is attested using different simulations results. Besides, comparisons are performed with the reference gradient matching method used for non linear drift estimation, and a neural networks-based method, that does not consider the stochastic term.	翻訳日:2021-05-19 13:51:28 公開日:2021-05-18
# 6GネットワークのためのAI-Native Network Slicing AI-Native Network Slicing for 6G Networks ( http://arxiv.org/abs/2105.08576v1 ) ライセンス: Link先を確認	Wen Wu, Conghao Zhou, Mushu Li, Huaqing Wu, Haibo Zhou, Ning Zhang, Xuemin (Sherman) Shen, Weihua Zhuang	(参考訳) 第5世代(5G)ネットワークのグローバル展開では、5Gを超えて第6世代(6G)ネットワークを想定する必要がある。 6Gネットワークには、宇宙空間の統合ネットワーク、高度なネットワーク仮想化、ユビキタスインテリジェンスが期待されている。本稿では、インテリジェントなネットワーク管理を促進し、新興AIサービスをサポートするために、6Gネットワークのための人工知能(AI)ネイティブネットワークスライシングアーキテクチャを提案する。 AIは、提案されたネットワークスライシングアーキテクチャで構築されており、AIとネットワークスライシングのシナジーを可能にする。 AIソリューションは、インテリジェントネットワーク管理、すなわちスライシングのためのAIを促進するために、ネットワークスライシングのライフサイクル全体について調査されている。さらに、スライスインスタンスを構築し、効率的なリソース管理、すなわちAIスライスを実行することによって、新興AIサービスをサポートするために、ネットワークスライシングアプローチについて議論する。最後に、ケーススタディを示し、6GでAIネイティブネットワークスライシングに不可欠なオープンリサーチ問題について議論する。 With the global roll-out of the fifth generation (5G) networks, it is necessary to look beyond 5G and envision the sixth generation (6G) networks. The 6G networks are expected to have space-air-ground integrated networking, advanced network virtualization, and ubiquitous intelligence. This article proposes an artificial intelligence (AI)-native network slicing architecture for 6G networks to facilitate intelligent network management and support emerging AI services. AI is built in the proposed network slicing architecture to enable the synergy of AI and network slicing. AI solutions are investigated for the entire lifecycle of network slicing to facilitate intelligent network management, i.e., AI for slicing. Furthermore, network slicing approaches are discussed to support emerging AI services by constructing slice instances and performing efficient resource management, i.e., slicing for AI. Finally, a case study is presented, followed by a discussion of open research issues that are essential for AI-native network slicing in 6G.	翻訳日:2021-05-19 13:50:51 公開日:2021-05-18
# インスタンス標的中毒における学習と認定 Learning and Certification under Instance-targeted Poisoning ( http://arxiv.org/abs/2105.08709v1 ) ライセンス: Link先を確認	Ji Gao, Amin Karbasi, Mohammad Mahmoody	(参考訳) 本稿では,特定のターゲットインスタンスで学習者を騙すことを目標として,学習セットのごく一部を変更する可能性がある,インスタンス標的中毒攻撃下でのpac学習可能性と認定について検討する。最初のコントリビューションは、様々な設定で問題を形式化し、学習者のランダムさや敵の攻撃がそれに依存するかどうかといった微妙な側面を明確に議論することである。敵の予算がサンプルの複雑さに比例してスケールすると、PAC学習性と認定が達成可能であることを示す。対照的に、敵の予算がサンプルの複雑さと線形に増加すると、敵は期待される0-1の損失を1に引き上げる可能性がある。さらに,同じ攻撃モデルを用いて分布特異的pac学習に結果を拡張し,ガウス分布下での半空間学習において,認証による適切な学習が可能であることを示す。最後に,実データ集合上のk近傍のロバスト性,ロジスティック回帰,多層パーセプトロン,畳み込みニューラルネットワークを実験的に検討し,ターゲットポジショニング攻撃に対して検証する。我々の実験結果によると、多くのモデル、特に最先端のニューラルネットワークは、これらの強力な攻撃に対して脆弱である。興味深いことに、標準精度の高いメソッドは、インスタンスターゲットの毒殺攻撃に対してより脆弱である可能性がある。 In this paper, we study PAC learnability and certification under instance-targeted poisoning attacks, where the adversary may change a fraction of the training set with the goal of fooling the learner at a specific target instance. Our first contribution is to formalize the problem in various settings, and explicitly discussing subtle aspects such as learner's randomness and whether (or not) adversary's attack can depend on it. We show that when the budget of the adversary scales sublinearly with the sample complexity, PAC learnability and certification are achievable. In contrast, when the adversary's budget grows linearly with the sample complexity, the adversary can potentially drive up the expected 0-1 loss to one. We further extend our results to distribution-specific PAC learning in the same attack model and show that proper learning with certification is possible for learning halfspaces under Gaussian distribution. Finally, we empirically study the robustness of K nearest neighbour, logistic regression, multi-layer perceptron, and convolutional neural network on real data sets, and test them against targeted-poisoning attacks. Our experimental results show that many models, especially state-of-the-art neural networks, are indeed vulnerable to these strong attacks. Interestingly, we observe that methods with high standard accuracy might be more vulnerable to instance-targeted poisoning attacks.	翻訳日:2021-05-19 13:50:34 公開日:2021-05-18
# BBE:エージェント・ベース・モデリングによるゲーム内ベッティング取引所の組織動態のシミュレーション BBE: Simulating the Microstructural Dynamics of an In-Play Betting Exchange via Agent-Based Modelling ( http://arxiv.org/abs/2105.08310v1 ) ライセンス: Link先を確認	Dave Cliff	(参考訳) I describe the rationale for, and design of, an agent-based simulation model of a contemporary online sports-betting exchange: such exchanges, closely related to the exchange mechanisms at the heart of major financial markets, have revolutionized the gambling industry in the past 20 years, but gathering sufficiently large quantities of rich and temporally high-resolution data from real exchanges - i.e., the sort of data that is needed in large quantities for Deep Learning - is often very expensive, and sometimes simply impossible; this creates a need for a plausibly realistic synthetic data generator, which is what this simulation now provides. The simulator, named the "Bristol Betting Exchange" (BBE), is intended as a common platform, a data-source and experimental test-bed, for researchers studying the application of AI and machine learning (ML) techniques to issues arising in betting exchanges; and, as far as I have been able to determine, BBE is the first of its kind: a free open-source agent-based simulation model consisting not only of a sports-betting exchange, but also a minimal simulation model of racetrack sporting events (e.g., horse-races or car-races) about which bets may be made, and a population of simulated bettors who each form their own private evaluation of odds and place bets on the exchange before and - cruciallyduring the race itself (i.e., so-called "in-play" betting) and whose betting opinions change second-by-second as each race event unfolds. BBEは、AI/MLと高度なデータ分析技術の適用を通じて、スポーツイベントに賭けるための収益戦略の自動発見や改善のための大規模な高解像度データセットの生成を可能にする概念実証システムとして提供される。本稿は,関連文献の広範な調査を行い,bbeの動機と設計を説明し,簡単な図示的結果を示す。 I describe the rationale for, and design of, an agent-based simulation model of a contemporary online sports-betting exchange: such exchanges, closely related to the exchange mechanisms at the heart of major financial markets, have revolutionized the gambling industry in the past 20 years, but gathering sufficiently large quantities of rich and temporally high-resolution data from real exchanges - i.e., the sort of data that is needed in large quantities for Deep Learning - is often very expensive, and sometimes simply impossible; this creates a need for a plausibly realistic synthetic data generator, which is what this simulation now provides. The simulator, named the "Bristol Betting Exchange" (BBE), is intended as a common platform, a data-source and experimental test-bed, for researchers studying the application of AI and machine learning (ML) techniques to issues arising in betting exchanges; and, as far as I have been able to determine, BBE is the first of its kind: a free open-source agent-based simulation model consisting not only of a sports-betting exchange, but also a minimal simulation model of racetrack sporting events (e.g., horse-races or car-races) about which bets may be made, and a population of simulated bettors who each form their own private evaluation of odds and place bets on the exchange before and - crucially - during the race itself (i.e., so-called "in-play" betting) and whose betting opinions change second-by-second as each race event unfolds. BBE is offered as a proof-of-concept system that enables the generation of large high-resolution data-sets for automated discovery or improvement of profitable strategies for betting on sporting events via the application of AI/ML and advanced data analytics techniques. This paper offers an extensive survey of relevant literature and explains the motivation and design of BBE, and presents brief illustrative results.	翻訳日:2021-05-19 13:50:11 公開日:2021-05-18
# 二重交換モデルにおける停止相分離:機械学習による大規模シミュレーション Arrested phase separation in double-exchange models: machine-learning enabled large-scale simulation ( http://arxiv.org/abs/2105.08221v1 ) ライセンス: Link先を確認	Puhan Zhang, Gia-Wei Chern	(参考訳) 本稿では,小型完全対角化解から学習したディープラーニングニューラルネットワークポテンシャルに基づいて,シングルバンド二重交換モデルにおける電子相分離の大規模動的シミュレーションを提案する。ドープ孔を半充填した絶縁背景から分離し, 興味深い相関誘起凍結挙動を明らかにする。ホールの凝集は、電荷キャリアと局所磁気モーメントの結合による強磁性クラスターの形成によって安定化されるが、この安定化はまた、反強磁性スピン-スピン相関が背景で十分に発達しているときにホールの凝縮電位を生成する。自走ブラックホールの運動量が劇的に減少すると、強磁性クラスターのさらなる成長が妨げられ、位相分離が達成される。磁気抵抗効果を示す材料における相分離ダイナミクスの意義について考察した。 We present large-scale dynamical simulations of electronic phase separation in the single-band double-exchange model based on deep-learning neural-network potentials trained from small-size exact diagonalization solutions. We uncover an intriguing correlation-induced freezing behavior as doped holes are segregated from half-filled insulating background during equilibration. While the aggregation of holes is stabilized by the formation of ferromagnetic clusters through Hund's coupling between charge carriers and local magnetic moments, this stabilization also creates confining potentials for holes when antiferromagnetic spin-spin correlation is well developed in the background. The dramatically reduced mobility of the self-trapped holes prematurely disrupts further growth of the ferromagnetic clusters, leading to an arrested phase separation. Implications of our findings for phase separation dynamics in materials that exhibit colossal magnetoresistance effect are discussed.	翻訳日:2021-05-19 13:49:30 公開日:2021-05-18
# 重み共有ディープニューラルネットワークを用いた多核ジオメトリの電子シュレーディンガー方程式の解法 Solving the electronic Schr\"odinger equation for multiple nuclear geometries with weight-sharing deep neural networks ( http://arxiv.org/abs/2105.08351v1 ) ライセンス: Link先を確認	Michael Scherbela, Rafael Reisenhofer, Leon Gerard, Philipp Marquetand and Philipp Grohs	(参考訳) schr\"odinger方程式の正確な数値解は量子化学において極めて重要である。しかし、現在の高精度手法の計算コストは相互作用する粒子の数に比例する。モンテカルロ法と教師なしニューラルネットワークのトレーニングを組み合わせることで、この設定における次元性の呪いを克服し、計算コストを適度にスケーリングして個々の分子の正確な波動関数を得るための有望なアプローチが提案されている。これらの手法は現在、分子の測度に関して波動関数が示す規則性を利用していない。近年のDeep Transfer Learningの機械翻訳やコンピュータビジョンタスクへの応用に着想を得て、ニューラルネットワークベースのモデルを異なる分子ジオメトリに最適化する際に、重み付け制約を導入することで、この規則性を活用する。すなわち、ニューラルネットワークモデルにおける重みの最大95%が、実際には様々な分子ジオメトリにわたって等しいように最適化プロセスを制限する。この手法は、同じ分子の核ジオメトリの集合を等級で考える場合の最適化を加速し、異なる分子にまたがって高い精度をもたらす事前学習されたニューラルネットワークの波動関数への有望な道を開くことを見出した。 Accurate numerical solutions for the Schr\"odinger equation are of utmost importance in quantum chemistry. However, the computational cost of current high-accuracy methods scales poorly with the number of interacting particles. Combining Monte Carlo methods with unsupervised training of neural networks has recently been proposed as a promising approach to overcome the curse of dimensionality in this setting and to obtain accurate wavefunctions for individual molecules at a moderately scaling computational cost. These methods currently do not exploit the regularity exhibited by wavefunctions with respect to their molecular geometries. Inspired by recent successful applications of deep transfer learning in machine translation and computer vision tasks, we attempt to leverage this regularity by introducing a weight-sharing constraint when optimizing neural network-based models for different molecular geometries. That is, we restrict the optimization process such that up to 95 percent of weights in a neural network model are in fact equal across varying molecular geometries. We find that this technique can accelerate optimization when considering sets of nuclear geometries of the same molecule by an order of magnitude and that it opens a promising route towards pre-trained neural network wavefunctions that yield high accuracy even across different molecules.	翻訳日:2021-05-19 13:49:16 公開日:2021-05-18
# 領域制約のロバスト性について On the Robustness of Domain Constraints ( http://arxiv.org/abs/2105.08619v1 ) ライセンス: Link先を確認	Ryan Sheatsley and Blaine Hoak and Eric Pauley and Yohan Beugin and Michael J. Weisman and Patrick McDaniel	(参考訳) 機械学習は、モデルのパフォーマンスを損なうように設計された逆例入力に対して脆弱である。しかし、逆例がモデル化された領域における現実的な入力を表すかどうかは不明である。ネットワークやフィッシングのような様々なドメインは、敵が攻撃を実現するために満たさなければならない特徴(敵固有の目標に加えて)の間のドメイン制約と複雑な関係を持つ。本稿では,ドメイン制約が敵の能力を制限する方法と,現実的な(制約に従順な)例を作成するために敵の戦略をどのように適用できるかを検討する。そこで本研究では,データからドメイン制約を学習する手法を開発し,学習した制約を敵対的工法に組み込む方法を示す。ネットワーク侵入とフィッシングデータセットにおける我々のアプローチの有効性を評価し,(1)最先端の工法アルゴリズムが生成する敵例の最大82%がドメイン制約に違反し,(2)ドメイン制約は敵例に対して堅牢であり,制約を強制するとモデル精度が最大34%向上することを示した。我々は、ドメイン制約を満たすために入力を変更する必要があるだけでなく、これらの制約が有効な敵例の生成をより困難にしていることを観察する。 Machine learning is vulnerable to adversarial examples-inputs designed to cause models to perform poorly. However, it is unclear if adversarial examples represent realistic inputs in the modeled domains. Diverse domains such as networks and phishing have domain constraints-complex relationships between features that an adversary must satisfy for an attack to be realized (in addition to any adversary-specific goals). In this paper, we explore how domain constraints limit adversarial capabilities and how adversaries can adapt their strategies to create realistic (constraint-compliant) examples. In this, we develop techniques to learn domain constraints from data, and show how the learned constraints can be integrated into the adversarial crafting process. We evaluate the efficacy of our approach in network intrusion and phishing datasets and find: (1) up to 82% of adversarial examples produced by state-of-the-art crafting algorithms violate domain constraints, (2) domain constraints are robust to adversarial examples; enforcing constraints yields an increase in model accuracy by up to 34%. We observe not only that adversaries must alter inputs to satisfy domain constraints, but that these constraints make the generation of valid adversarial examples far more challenging.	翻訳日:2021-05-19 13:48:34 公開日:2021-05-18
# DID-eFed: 分散アイデンティティを備えたサービスとしてのフェデレーション学習の実現 DID-eFed: Facilitating Federated Learning as a Service withDecentralized Identities ( http://arxiv.org/abs/2105.08671v1 ) ライセンス: Link先を確認	Jiahui Geng, Neel Kanwal, Martin Gilje Jaatun, Chunming Rong	(参考訳) 私たちはビッグデータの時代に入り、人工知能応用の繁栄の「燃料」と考えられている。 eu一般データ保護規則(gdpr)の制定は、ビッグデータにおける個人のプライバシーに関する懸念を引き起こす。フェデレートラーニング(FL)は、ユーザプライバシとデータの機密性要件に準拠したまま、複数のパーティ間で共有される高性能モデルを構築するのに役立つ機能的なソリューションとして現れます。 FLは、実アプリケーションで集中的に研究され、使用されているが、関心のあるサードパーティへのFLaaS(Federated Learning as a Service)としての展望と応用に関する研究は、まだ限られている。本稿では,分散ID(DID)とスマートコントラクトによってFLが促進されるFLaaSシステム,DID-eFedを提案する。 didは当社のシステムにおいて、より柔軟で信頼性の高い分散アクセス管理を可能にします。 DID-eFedが病院や研究機関のFLaaSを可能にするシナリオについて述べる。 We have entered the era of big data, and it is considered to be the "fuel" for the flourishing of artificial intelligence applications. The enactment of the EU General Data Protection Regulation (GDPR) raises concerns about individuals' privacy in big data. Federated learning (FL) emerges as a functional solution that can help build high-performance models shared among multiple parties while still complying with user privacy and data confidentiality requirements. Although FL has been intensively studied and used in real applications, there is still limited research related to its prospects and applications as a FLaaS (Federated Learning as a Service) to interested 3rd parties. In this paper, we present a FLaaS system: DID-eFed, where FL is facilitated by decentralized identities (DID) and a smart contract. DID enables a more flexible and credible decentralized access management in our system, while the smart contract offers a frictionless and less error-prone process. We describe particularly the scenario where our DID-eFed enables the FLaaS among hospitals and research institutions.	翻訳日:2021-05-19 13:48:14 公開日:2021-05-18
# (参考訳) slgpt: transfer learningを使用してsimulinkモデルファイルを直接生成し、simulinkツールチェーンのバグを見つける SLGPT: Using Transfer Learning to Directly Generate Simulink Model Files and Find Bugs in the Simulink Toolchain ( http://arxiv.org/abs/2105.07465v2 ) ライセンス: CC BY 4.0	Sohil Lal Shrestha and Christoph Csallner	(参考訳) Simulinkのような商用サイバー物理システム(CPS)開発ツールのバグを見つけることは、コードベースに数百万行のコードが含まれており、完全な形式言語仕様が利用できないため難しい。ディープラーニング技術は、サンプルモデルからそのような言語仕様を学ぶことを約束する一方で、ディープラーニングは、うまく機能するために多数のトレーニングデータが必要です。 SLGPTは、転送学習を用いて、大規模なトレーニングデータに基づいて事前学習された強力な生成事前学習トランスフォーマ2(GPT-2)モデルを活用することでこの問題に対処する。 SLGPTは、オープンソースリポジトリから抽出されたランダムに生成されたモデルとモデルの両方でGPT-2をSimulinkに適合させる。 SLGPTは、最も近い競合であるDeepFuzzSLよりもオープンソースモデルに近いSimulinkモデルを作成し、DeepFuzzSLが発見したSimulink開発ツールチェーンのスーパーセットを発見した。 Finding bugs in a commercial cyber-physical system (CPS) development tool such as Simulink is hard as its codebase contains millions of lines of code and complete formal language specifications are not available. While deep learning techniques promise to learn such language specifications from sample models, deep learning needs a large number of training data to work well. SLGPT addresses this problem by using transfer learning to leverage the powerful Generative Pre-trained Transformer 2 (GPT-2) model, which has been pre-trained on a large set of training data. SLGPT adapts GPT-2 to Simulink with both randomly generated models and models mined from open-source repositories. SLGPT produced Simulink models that are both more similar to open-source models than its closest competitor, DeepFuzzSL, and found a super-set of the Simulink development toolchain bugs found by DeepFuzzSL.	翻訳日:2021-05-19 12:10:06 公開日:2021-05-18
# (参考訳) マルチモーダル深層ニューラルネットワークにおける説明可能性の検討 A Review on Explainability in Multimodal Deep Neural Nets ( http://arxiv.org/abs/2105.07878v2 ) ライセンス: CC BY 4.0	Gargi Joshi, Rahee Walambe, Ketan Kotecha	(参考訳) ディープニューラルネットワークを利用した人工知能技術は、コンピュータビジョンアプリケーションや自然言語処理タスクなど、いくつかのアプリケーション領域で大きな成功を収めています。人間レベルのパフォーマンスを上回ることで、言語、視覚、感覚、テキストの異なるモダリティが正確な予測と識別において重要な役割を果たすアプリケーションの研究が促進された。深層学習モデルを用いたマルチモーダル融合法が文献で提案されている。その優れた性能にもかかわらず、深層ニューラルネットワークの複雑で不透明でブラックボックスな性質は、社会的受容と使用性を制限する。これにより、モデル解釈可能性と説明可能性の探求が生まれ、さらにマルチモーダルAIメソッドを含む複雑なタスクにもたらされた。本稿では,マルチモーダル深層ニューラルネットワーク,特に視覚と言語タスクにおける説明可能性に関する包括的な調査と解説を行うため,本論文を概説する。本稿では,マルチモーダルaiとその汎用ドメインへの応用に関するいくつかの話題を取り上げ,その意義,データセット,手法と技法の基本構成要素,課題,応用,今後のトレンドについて述べる。 Artificial Intelligence techniques powered by deep neural nets have achieved much success in several application domains, most significantly and notably in the Computer Vision applications and Natural Language Processing tasks. Surpassing human-level performance propelled the research in the applications where different modalities amongst language, vision, sensory, text play an important role in accurate predictions and identification. Several multimodal fusion methods employing deep learning models are proposed in the literature. Despite their outstanding performance, the complex, opaque and black-box nature of the deep neural nets limits their social acceptance and usability. This has given rise to the quest for model interpretability and explainability, more so in the complex tasks involving multimodal AI methods. This paper extensively reviews the present literature to present a comprehensive survey and commentary on the explainability in multimodal deep neural nets, especially for the vision and language tasks. Several topics on multimodal AI and its applications for generic domains have been covered in this paper, including the significance, datasets, fundamental building blocks of the methods and techniques, challenges, applications, and future trends in this domain	翻訳日:2021-05-19 12:00:27 公開日:2021-05-18
# ワールドワイドロードシーン画像におけるポットホールの自動捕捉学習 Learning to Automatically Catch Potholes in Worldwide Road Scene Images ( http://arxiv.org/abs/2105.07986v2 ) ライセンス: Link先を確認	J. Javier Yebes, David Montero, Ignacio Arriola	(参考訳) 世界中の舗装道路に存在するいくつかの道路の危険の中で、ポットホールは最も厄介なものの1つであり、メンテナンスコストも高い。技術や研究の進展により、これらの危険を自動的に検出することへの関心が高まっている。我々の研究は、現実世界の道路シーンの画像から抜け穴を検出するという課題に取り組みました。主な斬新さは、最新のAIの進歩を応用して、穴の視覚的外観を学ぶことにある。私たちはpotholeアノテーションで画像の大規模なデータセットを構築しました。彼らは、様々な環境条件下で異なるカメラ、車両、視点で撮影された世界中の異なる都市の道路シーンを含んでいた。次に,高速なr-cnnとssd深層ニューラルネットワークに基づく4種類の物体検出モデルを微調整した。車両に埋め込むことができるGPGPU機能を備えたNvidia DrivePX2プラットフォーム上で,高い平均精度を達成し,ポットホール検出器を試験した。さらに、AUTOPILOT H2020プロジェクトの一環として、検出されたポットホールを所定のIoTプラットフォームに通知するために、実際の車両にデプロイされた。 Among several road hazards that are present in any paved way in the world, potholes are one of the most annoying and also involving higher maintenance costs. There exists an increasing interest on the automated detection of these hazards enabled by technological and research progress. Our research work tackled the challenge of pothole detection from images of real world road scenes. The main novelty resides on the application of the latest progress in AI to learn the visual appearance of potholes. We built a large dataset of images with pothole annotations. They contained road scenes from different cities in the world, taken with different cameras, vehicles and viewpoints under varied environmental conditions. Then, we fine-tuned four different object detection models based on Faster R-CNN and SSD deep neural networks. We achieved high average precision and the pothole detector was tested on the Nvidia DrivePX2 platform with GPGPU capability, which can be embedded on vehicles. Moreover, it was deployed on a real vehicle to notify the detected potholes to a given IoT platform as part of AUTOPILOT H2020 project.	翻訳日:2021-05-19 11:14:16 公開日:2021-05-18
# 異常検出におけるadversarial discriminative transferの重要性 Importance Weighted Adversarial Discriminative Transfer for Anomaly Detection ( http://arxiv.org/abs/2105.06649v2 ) ライセンス: Link先を確認	Cangning Fan, Fangyi Zhang, Peng Liu, Xiuyu Sun, Hao Li, Ting Xiao, Wei Zhao, Xianglong Tang	(参考訳) 異常検出のための以前の転送方法は、一般的にソースまたはターゲットドメインのラベル付きデータの可用性を前提としている。しかし、大規模なラベル付きデータが高価すぎる多くの実アプリケーションでは、そのような仮定は有効ではない。そこで本稿では,対象ドメインにラベル付き正規/異常データがなく,関連するソースドメインからの正規データのみが存在するケースにおいて,異常検出知識を教師なしで転送するための重み付き対向オートエンコーダ方式を提案する。具体的には、ソース領域とターゲット領域の両方で正規データの分布を調整することを学習するが、ターゲット領域における異常データの分布は変わらない。このようにして、対象領域内の正常データと異常データの分布との間に明らかなギャップが生じ、ドメイン内の異常検出を可能にする。複数の合成データセットに対する大規模な実験とUCSDベンチマークにより,本手法の有効性が示された。コードはhttps://github.com/fancangning/anomaly_detection_transferで入手できる。 Previous transfer methods for anomaly detection generally assume the availability of labeled data in source or target domains. However, such an assumption is not valid in most real applications where large-scale labeled data are too expensive. Therefore, this paper proposes an importance weighted adversarial autoencoder-based method to transfer anomaly detection knowledge in an unsupervised manner, particularly for a rarely studied scenario where a target domain has no labeled normal/abnormal data while only normal data from a related source domain exist. Specifically, the method learns to align the distributions of normal data in both source and target domains, but leave the distribution of abnormal data in the target domain unchanged. In this way, an obvious gap can be produced between the distributions of normal and abnormal data in the target domain, therefore enabling the anomaly detection in the domain. Extensive experiments on multiple synthetic datasets and the UCSD benchmark demonstrate the effectiveness of our approach. The code is available at https://github.com/fancangning/anomaly_detection_transfer.	翻訳日:2021-05-19 11:14:02 公開日:2021-05-18
# 視覚トランスフォーマーは堅牢な学習者です Vision Transformers are Robust Learners ( http://arxiv.org/abs/2105.07581v2 ) ライセンス: Link先を確認	Sayak Paul and Pin-Yu Chen	(参考訳) 複数の自己注意層で構成されたトランスフォーマーは、さまざまなデータモダリティに適用可能な汎用的な学習プリミティブに対して、パラメータ効率を向上して最先端のSOTA(State-of-the-art)標準精度を達成するコンピュータビジョンの最近のブレークスルーを含む、強い約束を持っている。セルフアテンションは入力データ内に存在する異なるコンポーネントを体系的に整列させるのに役立つため、モデルロバスト性ベンチマークでその性能を調査する根拠を残している。本研究では,視覚トランスフォーマ (vit) の共通の腐敗や摂動, 分布シフト, 自然逆流に対するロバスト性について検討する。 vitモデルとsoma畳み込みニューラルネットワーク(cnns)の総合的な性能比較を行うために,ロバスト分類に関する6種類の画像ネットデータセットを用いた。 6つの体系的に設計された実験を通して、ViTsが実際により堅牢な学習者である理由を説明するために、定量的および定性的な指標の両方を提供する分析を行う。例えば、より少ないパラメータと類似したデータセットと事前トレーニングの組み合わせで、ViTはImageNet-Aで28.10%の精度を提供する。画像マスキング,フーリエスペクトル感度および離散コサインエネルギースペクトルへの拡散に関する解析により,ViTの強靭性向上に寄与する興味深い性質が明らかになった。実験を再現するためのコードは以下の通りである。 Transformers, composed of multiple self-attention layers, hold strong promises toward a generic learning primitive applicable to different data modalities, including the recent breakthroughs in computer vision achieving state-of-the-art (SOTA) standard accuracy with better parameter efficiency. Since self-attention helps a model systematically align different components present inside the input data, it leaves grounds to investigate its performance under model robustness benchmarks. In this work, we study the robustness of the Vision Transformer (ViT) against common corruptions and perturbations, distribution shifts, and natural adversarial examples. We use six different diverse ImageNet datasets concerning robust classification to conduct a comprehensive performance comparison of ViT models and SOTA convolutional neural networks (CNNs), Big-Transfer. Through a series of six systematically designed experiments, we then present analyses that provide both quantitative and qualitative indications to explain why ViTs are indeed more robust learners. For example, with fewer parameters and similar dataset and pre-training combinations, ViT gives a top-1 accuracy of 28.10% on ImageNet-A which is 4.3x higher than a comparable variant of BiT. Our analyses on image masking, Fourier spectrum sensitivity, and spread on discrete cosine energy spectrum reveal intriguing properties of ViT attributing to improved robustness. Code for reproducing our experiments is available here: https://git.io/J3VO0.	翻訳日:2021-05-19 11:13:46 公開日:2021-05-18
# 高次元クラスタデータ可視化のためのt-SNEの理論基礎 Theoretical Foundations of t-SNE for Visualizing High-Dimensional Clustered Data ( http://arxiv.org/abs/2105.07536v2 ) ライセンス: Link先を確認	T. Tony Cai and Rong Ma	(参考訳) 本研究では,一般的な非線形次元低減・データ可視化手法であるt-distributed stochastic neighbor embedded(t-sne)の理論的基礎について検討する。勾配降下法に基づく t-SNE の解析のための新しい理論的枠組みを提案する。 t-SNEの初期の誇張段階において、基礎となるグラフであるラプラシアンに基づく電力反復に対する漸近的同値性を示し、その制限挙動を特徴づけ、ラプラシアンスペクトルクラスタリングとの深い関係、および暗黙の正則化として早期停止を含む基本原理を明らかにする。結果は,このような計算戦略の固有機構と経験的利点を説明する。 t-SNEの埋め込み段階では, 繰り返しを通して低次元写像の運動特性を特徴づけ, クラスタ間反発と低次元写像の拡張挙動を特徴とする増幅位相を同定する。一般的な理論では、クラスタ化されたデータを視覚化するためのt-SNEの高速収束率と例外的な経験的性能を説明し、t-SNE出力の解釈をもたらし、様々なアプリケーションでチューニングパラメータを選択するための理論的ガイダンスを提供する。 This study investigates the theoretical foundations of t-distributed stochastic neighbor embedding (t-SNE), a popular nonlinear dimension reduction and data visualization method. A novel theoretical framework for the analysis of t-SNE based on the gradient descent approach is presented. For the early exaggeration stage of t-SNE, we show its asymptotic equivalence to a power iteration based on the underlying graph Laplacian, characterize its limiting behavior, and uncover its deep connection to Laplacian spectral clustering, and fundamental principles including early stopping as implicit regularization. The results explain the intrinsic mechanism and the empirical benefits of such a computational strategy. For the embedding stage of t-SNE, we characterize the kinematics of the low-dimensional map throughout the iterations, and identify an amplification phase, featuring the intercluster repulsion and the expansive behavior of the low-dimensional map. The general theory explains the fast convergence rate and the exceptional empirical performance of t-SNE for visualizing clustered data, brings forth the interpretations of the t-SNE output, and provides theoretical guidance for selecting tuning parameters in various applications.	翻訳日:2021-05-19 11:13:22 公開日:2021-05-18
# エビデンス理論における近似エントロピーに基づく基本確率割当て積分の不確かさの測定 Uncertainty Measurement of Basic Probability Assignment Integrity Based on Approximate Entropy in Evidence Theory ( http://arxiv.org/abs/2105.07382v2 ) ライセンス: Link先を確認	Tianxiang Zhan, Yuanpeng He, Hanwen Li, Fuyuan Xiao	(参考訳) 証拠理論は、確率の延長は未知や不正確な情報にうまく対処できるというものである。不確かさの測定は証拠理論と確率理論の両方において重要な役割を果たす。近似エントロピー (ApEn) は、複素系の不規則性を記述するためにピンカスによって提案されている。時系列が不規則であればあるほど、近似エントロピーは大きくなる。ネットワークのApEnは、ネットワークが新しいノードを生成する能力、または未発見ノードの可能性を表す。ネットワーク特性と基本確率割当(BPA)の関連付けにより、完全性に関するBPAの不確実性の尺度を得ることができる。論文の主な貢献は、基本確率割り当ての完全性を定義することであり、BPAの近似エントロピーは、BPAの完全性の不確実性を測定するために提案される。提案手法は,証拠理論におけるBPAの不確実性を計算するための論理ネットワーク構造に基づく。提案手法に基づく不確実性は,BPAの完全性の不確実性を表し,BPAの信頼性の同定に寄与する。 Evidence theory is that the extension of probability can better deal with unknowns and inaccurate information. Uncertainty measurement plays a vital role in both evidence theory and probability theory. Approximate Entropy (ApEn) is proposed by Pincus to describe the irregularities of complex systems. The more irregular the time series, the greater the approximate entropy. The ApEn of the network represents the ability of a network to generate new nodes, or the possibility of undiscovered nodes. Through the association of network characteristics and basic probability assignment (BPA) , a measure of the uncertainty of BPA regarding completeness can be obtained. The main contribution of paper is to define the integrity of the basic probability assignment then the approximate entropy of the BPA is proposed to measure the uncertainty of the integrity of the BPA. The proposed method is based on the logical network structure to calculate the uncertainty of BPA in evidence theory. The uncertainty based on the proposed method represents the uncertainty of integrity of BPA and contributes to the identification of the credibility of BPA.	翻訳日:2021-05-19 11:13:02 公開日:2021-05-18

Title

Authors

Abstract

論文公表日・翻訳日

# 刺激的、有用、心配、未来的:8か国における人工知能に対する大衆の認識

Exciting, Useful, Worrying, Futuristic: Public Perception of Artificial Intelligence in 8 Countries ( http://arxiv.org/abs/2001.00081v2 )

ライセンス: Link先を確認

Patrick Gage Kelley, Yongwei Yang, Courtney Heldreth, Christopher Moessner, Aaron Sedley, Andreas Kramm, David T. Newman, and Allison Woodruff

(参考訳) 人工知能(ai)の影響と利用が拡大し、その変革的な潜在性がより顕在化するにつれて、その使用の経済的、政治的、社会的、倫理的な影響に関する多くの疑問が提起されている。世論は、製品導入、商業開発、研究資金、規制に影響を与えるこれらの議論において重要な役割を担っている。本稿では,8カ国と6大陸にまたがる10,005人の回答者を対象に,人工知能の世論調査を行った。我々は、AIが社会に重大な影響を与えるという認識を広く報告し、AIの責任ある開発と利用を強く支援し、また、各国のAIに対する反応を区別する4つの主要なテーマ(興奮、有用、心配、未来)で、AIに対する大衆の感情を特徴づける。

As the influence and use of artificial intelligence (AI) have grown and its transformative potential has become more apparent, many questions have been raised regarding the economic, political, social, and ethical implications of its use. Public opinion plays an important role in these discussions, influencing product adoption, commercial development, research funding, and regulation. In this paper we present results of an in-depth survey of public opinion of artificial intelligence conducted with 10,005 respondents spanning eight countries and six continents. We report widespread perception that AI will have significant impact on society, accompanied by strong support for the responsible development and use of AI, and also characterize the public's sentiment towards AI with four key themes (exciting, useful, worrying, and futuristic) whose prevalence distinguishes response to AI in different countries.

翻訳日:2023-06-09 22:45:16 公開日:2021-05-18

# 一般確率論における確率の動的制約

How dynamics constrains probabilities in general probabilistic theories ( http://arxiv.org/abs/2002.05088v3 )

ライセンス: Link先を確認

Thomas D. Galley and Lluis Masanes

(参考訳) 本稿では,システムの動的構造と確率構造を区別する一般確率論を解析するための一般的な枠組みを紹介する。力学構造は、可逆力学の作用とともに純粋な状態の集合であり、確率構造は測定値と結果確率を決定する。動的群と安定化部分群がゲルファント対を形成する推移的力学構造に対して、すべての確率的構造は剛性(無限に変形することはできない)であり、動的群の球面表現と一対一の対応であることを示す。動的構造がユニタリ群によって作用する複素グラスマン多様体のものであるとき、すべての確率構造を分類するために我々の方法を適用する。これは量子論の一般化であり、純粋な状態は複素ベクトル空間の1次元部分空間で表現される代わりに、1より大きい固定次元の部分空間で表現される。また、コンパクトな2点均質な力学構造を持つ系(すなわち、与えられた距離を持つ全ての純状態は、同じ距離を持つ任意の純粋な状態に可逆的に変換可能である)がユークリッド・ヨルダン・アルゲブラに対応する系を含むことを示す。

We introduce a general framework for analysing general probabilistic theories, which emphasises the distinction between the dynamical and probabilistic structures of a system. The dynamical structure is the set of pure states together with the action of the reversible dynamics, whilst the probabilistic structure determines the measurements and the outcome probabilities. For transitive dynamical structures whose dynamical group and stabiliser subgroup form a Gelfand pair we show that all probabilistic structures are rigid (cannot be infinitesimally deformed) and are in one-to-one correspondence with the spherical representations of the dynamical group. We apply our methods to classify all probabilistic structures when the dynamical structure is that of complex Grassmann manifolds acted on by the unitary group. This is a generalisation of quantum theory where the pure states, instead of being represented by one-dimensional subspaces of a complex vector space, are represented by subspaces of a fixed dimension larger than one. We also show that systems with compact two-point homogeneous dynamical structures (i.e. every pair of pure states with a given distance can be reversibly transformed to any other pair of pure states with the same distance), which include systems corresponding to Euclidean Jordan Algebras, all have rigid probabilistic structures.

翻訳日:2023-06-03 21:23:36 公開日:2021-05-18

# タンパク質格子問題における制限量子スピードアップの可能性の検討

Investigating the potential for a limited quantum speedup on protein lattice problems ( http://arxiv.org/abs/2004.01118v2 )

ライセンス: Link先を確認

Carlos Outeiral, Garrett M. Morris, Jiye Shi, Martin Strahm, Simon C. Benjamin and Charlotte M. Deane

(参考訳) タンパク質の折りたたみは計算生物学における中心的な課題であり、分子生物学、創薬、触媒設計における重要な応用である。ハードコンビネート最適化問題として、量子アニーリングの潜在的なターゲット問題として研究されている。いくつかの実験的な実装が文献で議論されているが、これらのアプローチの計算スケーリングは解明されていない。本稿では,多数の小ペプチド折り畳み問題に適用する量子アニーリングの数値的研究を行い,短期的応用に有用な知見を推察することを目的とした。 2つの結論として、タンパク質格子の折り畳みに適用した場合、ナイーブな量子アニールさえも古典的アプローチより優れている可能性があり、ハミルトン派や関連するスケジュールの注意深いエンジニアリングはこの問題に対して顕著な相対的な改善をもたらす。全体として、量子アルゴリズムはタンパク質の折り畳みや構造予測領域の問題を改善できる可能性が示唆された。

Protein folding is a central challenge in computational biology, with important applications in molecular biology, drug discovery and catalyst design. As a hard combinatorial optimisation problem, it has been studied as a potential target problem for quantum annealing. Although several experimental implementations have been discussed in the literature, the computational scaling of these approaches has not been elucidated. In this article, we present a numerical study of quantum annealing applied to a large number of small peptide folding problems, aiming to infer useful insights for near-term applications. We present two conclusions: that even naive quantum annealing, when applied to protein lattice folding, has the potential to outperform classical approaches, and that careful engineering of the Hamiltonians and schedules involved can deliver notable relative improvements for this problem. Overall, our results suggest that quantum algorithms may well offer improvements for problems in the protein folding and structure prediction realm.

翻訳日:2023-05-27 03:17:05 公開日:2021-05-18

# 近似ダイナミクスはより最適制御につながる:効率的な完全微分

Approximate Dynamics Lead to More Optimal Control: Efficient Exact Derivatives ( http://arxiv.org/abs/2005.09943v4 )

ライセンス: Link先を確認

Jesper Hasseriis Mohr Jensen, Frederik Skovbo M{\o}ller, Jens Jakob S{\o}rensen, Jacob Friis Sherson

(参考訳) 正確な微分は、量子最適化のランドスケープにおいて局所的な反転と収束を効率的に行うために重要である。ユニタリ制御タスクに対する解析的厳密な制御導関数(gradient and hessian)を導出することにより、この精度要件を満たす計算可能性が伝播スキームと問題表現の選択に依存することを示した。正確な伝播が十分に安価である場合でも、(適切な)近似伝達子を最適化する方が、おそらく驚くほど効率的である。重要なことに、最初の分析的考察を終えると、現実のシステムへの直接的な適用において、標準的な数値的技術のみが明示的に必要となる。これらの結果はヒルベルト空間次元を増大させる2つの具体的な問題に対して数値的に検証される。最善のスキームは機械精度に対して単位忠実度を得るが、他のスキームの結果は計算時間において10桁、最悪の場合には10桁の精度で一貫して分離される。これらのギャップはシステムのサイズと複雑さによって継続的に増大するため、この手法は、例えば、多体コンテキストにおいて、別々に発行される高忠実な体制で、非常に高次元のダイナミクスを数値的に効率的に最適化することができる。

Accurate derivatives are important for efficiently locally traversing and converging in quantum optimization landscapes. By deriving analytically exact control derivatives (gradient and Hessian) for unitary control tasks, we show here that the computational feasibility of meeting this accuracy requirement depends on the choice of propagation scheme and problem representation. Even when exact propagation is sufficiently cheap it is, perhaps surprisingly, much more efficient to optimize the (appropriately) approximate propagators: approximations in the dynamics are traded off for significant complexity reductions in the exact derivative calculations. Importantly, past the initial analytical considerations, only standard numerical techniques are explicitly required with straightforward application to realistic systems. These results are numerically verified for two concrete problems of increasing Hilbert space dimensionality. The best schemes obtain unit fidelity to machine precision whereas the results for other schemes are separated consistently by orders of magnitude in computation time and in worst case 10 orders of magnitude in achievable fidelity. Since these gaps continually increase with system size and complexity, this methodology allows numerically efficient optimization of very high-dimensional dynamics, e.g. in many-body contexts, operating in the high-fidelity regime which will be published separately.

翻訳日:2023-05-19 06:00:28 公開日:2021-05-18

# 可変絡み合いセンサネットワークを用いた量子強調データ分類

Quantum-enhanced data classification with a variational entangled sensor network ( http://arxiv.org/abs/2006.11962v2 )

ライセンス: Link先を確認

Yi Xia, Wei Li, Quntao Zhuang, Zheshen Zhang

(参考訳) ノイズの多い中間スケール量子(NISQ)ハードウェア上に構築された変分量子回路(VQC)は、古典的な処理とともに、量子シミュレーション、古典的な最適化、機械学習のための有望なアーキテクチャを構成する。しかし、古典的なスキームよりも量子的優位性を示すために必要なVQC深さは、利用可能なNISQデバイスの範囲を超えている。畳み込みセンサネットワーク(SLAEN)によって補助される監視学習は、古典的な機械学習アルゴリズムで訓練されたVQCを活用して、センサが共有するマルチパーティの絡み合いを調整し、実用的なデータ処理問題を解決する。本稿では、SLAENの初の実験実験を報告し、多次元無線周波数信号の分類における誤差確率のエンタングルメントによる低減を示す。我々の研究は、NISQ時代における量子化データ処理の新たな道のりを開拓している。

Variational quantum circuits (VQCs) built upon noisy intermediate-scale quantum (NISQ) hardware, in conjunction with classical processing, constitute a promising architecture for quantum simulations, classical optimization, and machine learning. However, the required VQC depth to demonstrate a quantum advantage over classical schemes is beyond the reach of available NISQ devices. Supervised learning assisted by an entangled sensor network (SLAEN) is a distinct paradigm that harnesses VQCs trained by classical machine-learning algorithms to tailor multipartite entanglement shared by sensors for solving practically useful data-processing problems. Here, we report the first experimental demonstration of SLAEN and show an entanglement-enabled reduction in the error probability for classification of multidimensional radio-frequency signals. Our work paves a new route for quantum-enhanced data processing and its applications in the NISQ era.

翻訳日:2023-05-13 05:18:09 公開日:2021-05-18

# 結合線からのフラクタル位相相

Fractonic topological phases from coupled wires ( http://arxiv.org/abs/2010.15148v3 )

ライセンス: Link先を確認

Joseph Sullivan, Arpit Dua and Meng Cheng

(参考訳) 3次元では、ギャップ付き位相は「フラクトロニック」準粒子励起をサポートすることができ、これは完全に非運動的であるか、あるいは低次元のサブ多様体の中でしか動くことができない。本研究では, 三次元連結ワイヤ構造を用いてフラクトロニック位相を探索し, トポロジ的位相を2次元で実現・特徴づける手法として成功している。フラクタル励起を伴うガッピング相とギャップレス相の両方がモデルから現れることが判明した。ガッピングの場合、フラクタル励起はワイヤ方向に沿って移動可能であるが、横面における移動性は一般的に減少する。一般の励起は、既知のギャップを持つフラクトンモデルとは異なる無限次核融合構造を持つことを示す。 2D結合ワイヤ構造と同様に、多くのモデルは、無限成分のルッティンガー液体によって記述できる隙間のない(あるいはキラルな)表面状態を示す。しかし、表面理論の普遍性クラスは表面配向に強く依存するため、フラクトン相に特有の新しいタイプのバルク境界対応が明らかになる。

In three dimensions, gapped phases can support "fractonic" quasiparticle excitations, which are either completely immobile or can only move within a low-dimensional submanifold, a peculiar topological phenomenon going beyond the conventional framework of topological quantum field theory. In this work we explore fractonic topological phases using three-dimensional coupled wire constructions, which have proven to be a successful tool to realize and characterize topological phases in two dimensions. We find that both gapped and gapless phases with fractonic excitations can emerge from the models. In the gapped case, we argue that fractonic excitations are mobile along the wire direction, but their mobility in the transverse plane is generally reduced. We show that the excitations in general have infinite-order fusion structure, distinct from previously known gapped fracton models. Like the 2D coupled wire constructions, many models exhibit gapless (or even chiral) surface states, which can be described by infinite-component Luttinger liquids. However, the universality class of the surface theory strongly depends on the surface orientation, thus revealing a new type of bulk-boundary correspondence unique to fracton phases.

翻訳日:2023-04-27 06:15:40 公開日:2021-05-18

# 動的場推論と超対称性

Dynamical field inference and supersymmetry ( http://arxiv.org/abs/2010.15414v2 )

ライセンス: Link先を確認

Margret Westerkamp, Igor Ovchinnikov, Philipp Frank, Torsten En{\ss}lin

(参考訳) 物理分野の進化に関する知識は、科学、技術、経済学において最重要である。動的場推論(DFI)は、有限データから確率的に駆動される動的に進化する場を再構成する問題に対処する。情報場理論(英: information field theory、IFT)とは、情報場の理論である。ここでは、DFI、IFT、最近開発された超対称確率論(STS)の関係が、教育学的議論において確立されている。 IFTでは、全時空推論問題の分割関数からフィールド期待値を計算することができる。推論問題の分割関数は、場に依存した関数決定式と同様に動力学を保証する機能ディラック関数を起動し、適切な正規化を確立する。 STSは、それぞれフェルミオンゴーストとボソニックラグランジュフィールドを導入することによって、これらの問題表現を置き換える。これらの場の作用は超対称性を持ち、ボソンとフェルミオンの間の交換演算が存在し、系を不変にする。これとは対照的に、力学場の測定はこの超対称性に従わない。超対称性は自然に破壊することもでき、そこでは系がカオス的に進化する。これはシステムの予測可能性に影響を与えるため、DFIをより難しくする。ファインマン図の助けを借りて,簡略化された図示システムの非線形カオス力学と測定制約の相互作用を考察し,フェルミオン補正が系軌道上の正しい後方統計を得るために不可欠であることを示した。

Knowledge on evolving physical fields is of paramount importance in science, technology, and economics. Dynamical field inference (DFI) addresses the problem of reconstructing a stochastically driven, dynamically evolving field from finite data. It relies on information field theory (IFT), the information theory for fields. Here, the relations of DFI, IFT, and the recently developed supersymmetric theory of stochastics (STS) are established in a pedagogical discussion. In IFT, field expectation values can be calculated from the partition function of the full space-time inference problem. The partition function of the inference problem invokes a functional Dirac function to guarantee the dynamics, as well as a field-dependent functional determinant, to establish proper normalization, both impeding the necessary evaluation of the path integral over all field configurations. STS replaces these problematic expressions via the introduction of fermionic ghost and bosonic Lagrange fields, respectively. The action of these fields has a supersymmetry, which means there exists an exchange operation between bosons and fermions that leaves the system invariant. In contrast to this, measurements of the dynamical fields do not adhere to this supersymmetry. The supersymmetry can also be broken spontaneously, in which case the system evolves chaotically. This affects the predictability of the system and thereby make DFI more challenging. We investigate the interplay of measurement constraints with the non-linear chaotic dynamics of a simplified, illustrative system with the help of Feynman diagrams and show that the Fermionic corrections are essential to obtain the correct posterior statistics over system trajectories.

翻訳日:2023-04-27 00:57:09 公開日:2021-05-18

# 正演算値測度に基づくコヒーレンスを絡み合いに変換する

Converting coherence based on positive-operator-valued measures into entanglement ( http://arxiv.org/abs/2011.00220v3 )

ライセンス: Link先を確認

Sunho Kim, Chunhe Xiong, Asutosh Kumar, and Junde Wu

(参考訳) 量子資源理論は、量子物理学における現象を広範囲に研究するための多様で強力な枠組みを提供する。量子コヒーレンス(quantum coherence)は、多くの量子情報処理の基本的な要素である。量子情報に対する広範かつ現在の関心の対象であり、その成立以来多くの新しい概念が導入され、一般化されてきた。ここではブロックコヒーレンスをブロックインコヒーレント操作により絡み合いに変換することができることを示す。さらに,naimark拡張によるブロックコヒーレンスに関連するpovmベースのコヒーレンスが,絡み合い生成の観点から潜在的資源として機能することを見出した。最後に、povmベースのコヒーレンスから絡み合う方法を説明し、埋め込みチャネルと補助システムを必要とする戦略を提示し、いくつかの例を示し、一般化する。

Quantum resource theories provide a diverse and powerful framework for extensively studying the phenomena in quantum physics. Quantum coherence, a quantum resource, is the basic ingredient in many quantum information tasks. It is a subject of broad and current interest in quantum information, and many new concepts have been introduced and generalized since its establishment. Here we show that the block coherence can be transformed into entanglement via a block incoherent operation. Moreover, we find that the POVM-based coherence associated with block coherence through the Naimark extension acts as a potential resource from the perspective of generating entanglement. Finally, we discuss avenues of creating entanglement from POVM-based coherence, present strategies that require embedding channels and auxiliary systems, give some examples, and generalize them.

翻訳日:2023-04-26 05:46:36 公開日:2021-05-18

# 微分可能な量子回路を用いた非線形微分方程式の解法

Solving nonlinear differential equations with differentiable quantum circuits ( http://arxiv.org/abs/2011.10395v2 )

ライセンス: Link先を確認

Oleksandr Kyriienko, Annie E. Paine, Vincent E. Elfving

(参考訳) 非線形微分方程式系を解く量子アルゴリズムを提案する。量子特徴写像符号化を用いて、パラメタライズド量子回路の期待値として関数を定義する。解析的な形で関数微分を微分可能な量子回路(DQC)として表現するために自動微分を用いるので、勾配を計算するための不正確な有限差分手順を避けることができる。微分方程式と境界条件を満たすようdqcを訓練するハイブリッド量子古典ワークフローについて述べる。特定の例として,高次元特徴空間における微分方程式を解くためのスペクトル法の実装について述べる。技術的な観点から、適合多項式の強力な基底セットを提供し、豊かな表現性を持つチェビシェフ量子特徴写像を設計する。本研究では, ナビエ・ストークス方程式の解法をシミュレートし, 収束ダイバージェントノズル内の流体流動の密度, 温度, 速度プロファイルを計算する。

We propose a quantum algorithm to solve systems of nonlinear differential equations. Using a quantum feature map encoding, we define functions as expectation values of parametrized quantum circuits. We use automatic differentiation to represent function derivatives in an analytical form as differentiable quantum circuits (DQCs), thus avoiding inaccurate finite difference procedures for calculating gradients. We describe a hybrid quantum-classical workflow where DQCs are trained to satisfy differential equations and specified boundary conditions. As a particular example setting, we show how this approach can implement a spectral method for solving differential equations in a high-dimensional feature space. From a technical perspective, we design a Chebyshev quantum feature map that offers a powerful basis set of fitting polynomials and possesses rich expressivity. We simulate the algorithm to solve an instance of Navier-Stokes equations, and compute density, temperature and velocity profiles for the fluid flow in a convergent-divergent nozzle.

翻訳日:2023-04-23 15:05:29 公開日:2021-05-18

# labフレームコヒーレント回転波束による多原子分子中の分子フレーム光電子角分布

Molecular Frame Photoelectron Angular Distributions in Polyatomic Molecules from Lab Frame Coherent Rotational Wavepacket Evolution ( http://arxiv.org/abs/2012.04561v2 )

ライセンス: Link先を確認

Margaret Gregory, Paul Hockett, Albert Stolow, Varun Makhija

(参考訳) 実験室フレーム(LF)測定(LFPAD)から分子フレーム(MF)光電子角分布(MFPAD)を得るための行列ベースの再構成プロトコルの適用について検討した。 Similarly to other recent works on the topic of MF reconstruction, this protocol makes use of time-resolved LF measurements, in which a rotational wavepacket is prepared and probed via photoionization, followed by a numerical reconstruction routine; however, in contrast to other methodologies, the protocol developed herein does not require determination of photoionization matrix elements, and consequently takes a relatively simple numerical form (matrix transform making use of the Moore-Penrose inverse). 重要なことに、この単純さにより、多原子分子に対するMFPADの再構築が成功している。このスキームは、N_2$とC_2H_4$の2つの実例に対して数値的に示される。この新しい手法は、多原子分子の光イオン化を含むmf再構成問題に適用できると期待されている。

The application of a matrix-based reconstruction protocol for obtaining Molecular Frame (MF) photoelectron angular distributions (MFPADs) from laboratory frame (LF) measurements (LFPADs) is explored. Similarly to other recent works on the topic of MF reconstruction, this protocol makes use of time-resolved LF measurements, in which a rotational wavepacket is prepared and probed via photoionization, followed by a numerical reconstruction routine; however, in contrast to other methodologies, the protocol developed herein does not require determination of photoionization matrix elements, and consequently takes a relatively simple numerical form (matrix transform making use of the Moore-Penrose inverse). Significantly, the simplicity allows application of the method to the successful reconstruction of MFPADs for polyatomic molecules. The scheme is demonstrated numerically for two realistic cases, $N_2$ and $C_2H_4$. The new technique is expected to be generally applicable for a range of MF reconstruction problems involving photoionization of polyatomic molecules.

翻訳日:2023-04-21 18:25:16 公開日:2021-05-18

# 縮退支援量子スターリング熱エンジンにおける作業の温度依存性最大化と効率

Temperature dependent maximization of work and efficiency in a degeneracy assisted quantum Stirling heat engine ( http://arxiv.org/abs/2012.11362v3 )

ライセンス: Link先を確認

Sarbani Chatterjee, Arghadip Koner, Sohini Chatterjee, and Chandan Kumar

(参考訳) 本稿では,高調波発振器を作動媒体とする量子スターリング熱エンジンを提案する。本研究では, 所定の周波数での高調波発振器量子スターリングヒートエンジン(HO-QSHE)の効率を, 蓄熱器の温度比で最大化できることを示す。音波発振器の低温または等価に高い周波数制限では、HO-QSHEの効率はカルノー効率に近づく。さらに,量子スターリング熱エンジンの動作媒質として,箱型量子システムにおける粒子のアンサンブルを分析する。ここで、作業と効率の両方を熱貯水池の温度の比で最大化することができる。これらの研究により、最適な性能で量子スターリング熱エンジンを動作させることができる。 HO-QSHEの理論的な研究は、ほとんどの実システムは平衡付近の小さな変位に対する調和振動子として近似できるので、実験的な実現のための衝動を与える。

We propose a quantum Stirling heat engine with an ensemble of harmonic oscillators as the working medium. We show that the efficiency of the harmonic oscillator quantum Stirling heat engine (HO-QSHE) at a given frequency can be maximized at a specific ratio of the temperatures of the thermal reservoirs. In the low temperature or equivalently high frequency limit of the harmonic oscillators, the efficiency of the HO-QSHE approaches the Carnot efficiency. Further, we analyse quantum Stirling heat engine with an ensemble of particle in box quantum systems as the working medium. Here both work and efficiency can be maximized at a specific ratio of temperatures of the thermal reservoirs. These studies will enable us to operate the quantum Stirling heat engines at its optimal performance. The theoretical study of the HO-QSHE would provide impetus for its experimental realisation, as most real systems can be approximated as harmonic oscillators for small displacements near equilibrium.

翻訳日:2023-04-20 00:28:25 公開日:2021-05-18

# 量子モンテカルロにおけるgrover-rudolph状態形成の問題点

The Problem with Grover-Rudolph State Preparation for Quantum Monte-Carlo ( http://arxiv.org/abs/2101.02240v2 )

ライセンス: Link先を確認

Steven Herbert

(参考訳) 我々は,Grover-Rudolph法を用いて,解析的に定義された対数凹型確率分布の平均(および他のモーメント)を量子状態として推定するために量子モンテカルロを用いる場合,量子スピードアップは存在しないことを証明した。

We prove that there is no quantum speed-up when using quantum Monte-Carlo to estimate the mean (and other moments) of analytically-defined log-concave probability distributions prepared as quantum states using the Grover-Rudolph method.

翻訳日:2023-04-17 17:41:16 公開日:2021-05-18

# 1+1次元ゲージ場理論のリアルタイムダイナミクスへの連続的アプローチ:シュウィンガー模型の地平線相関から

Continuum approach to real time dynamics of 1+1D gauge field theory: out of horizon correlations of the Schwinger model ( http://arxiv.org/abs/2101.07807v2 )

ライセンス: Link先を確認

Ivan Kukuljan

(参考訳) シュウィンガー模型における非平衡実時間ダイナミクス(d=1+1の量子電磁力学)を研究するための切断ハミルトン法を開発した。これは、局所的および大域的ゲージ変換の下で不変性を確実に捉え、時空を離散化する必要のない純粋連続体法である。 1+1dの量子電磁力学は、最近sine-gordonモデルで発見されたダイナミックな地平線破れ効果を認めている。モデルのクエンチの後、振動性のある長距離相関が発展し、水平線境界をはっきりと破る。その結果, 水平方向の相関関係の振動周波数は, 相関中間子対を介する効果を示すモデルの中間子の質量の2倍に相当することがわかった。また、以前は質量のないシュウィンガーモデルで知られていた大規模モデルのクラスタ違反について報告する。この結果は、1+1D量子電磁力学における新しい非平衡現象を明らかにし、ゲージ場理論に地平線違反効果が存在することを証明するための第一歩となる。

We develop a truncated Hamiltonian method to study nonequilibrium real time dynamics in the Schwinger model - the quantum electrodynamics in D=1+1. This is a purely continuum method that captures reliably the invariance under local and global gauge transformations and does not require a discretisation of space-time. We use it to study a phenomenon that is expected not to be tractable using lattice methods: we show that the 1+1D quantum electrodynamics admits the dynamical horizon violation effect which was recently discovered in the case of the sine-Gordon model. Following a quench of the model, oscillatory long-range correlations develop, manifestly violating the horizon bound. We find that the oscillation frequencies of the out-of-horizon correlations correspond to twice the masses of the mesons of the model suggesting that the effect is mediated through correlated meson pairs. We also report on the cluster violation in the massive version of the model, previously known in the massless Schwinger model. The results presented here reveal a novel nonequilibrium phenomenon in 1+1D quantum electrodynamics and make a first step towards establishing that the horizon violation effect is present in gauge field theory.

翻訳日:2023-04-14 17:51:54 公開日:2021-05-18

# 量子テレポーテーションにおける量子ビットチャネルの特徴付け

Characterizing qubit channels in the context of quantum teleportation ( http://arxiv.org/abs/2102.02054v2 )

ライセンス: Link先を確認

Arkaprabha Ghosal, Debarshi Das, Subhashish Banerjee

(参考訳) 例えば、アリスが純粋な2量子ビット(最大に絡み合うか、最大に絡み合うか)の状態を準備し、その状態の半分を例えばボブにキュービット(単位的または非単位的)チャネルで送るシナリオを考える。最後に、共有状態はテレポーテーションチャネルとして使用される。このシナリオでは、量子テレポーテーション(QT)の資源としての最終的な状態の有効性に関するキュービットチャネルの集合を、最大平均忠実度とフィデリティ偏差(入力状態上のフィデリティ値のゆらぎ)の観点から特徴づけることに焦点をあてる。重要なことは、初期準備状態が普遍QT(すなわち、最大エンタングル状態)または普遍QT(すなわち、非最大エンタングル状態のサブセット)に有用でないとき、最終状態が普遍QT(古典的境界よりも厳密な最大平均フィデリティを持ち、フィデリティの偏差がゼロである)に有用になるようなキュービットチャネルのサブセットの存在を指摘したことである。興味深いことに、後者の場合、非単位的チャネル(散逸的相互作用)は単位的チャネル(非散逸的相互作用)よりも有効であり、非最大に絡み合った純粋な状態から普遍的qtに対して有用な状態を生成する。

We consider a scenario where a party, say, Alice prepares a pure two-qubit (either maximally entangled or non-maximally entangled) state and sends one half of this state to another distant party, say, Bob through a qubit (either unital or non-unital) channel. Finally, the shared state is used as a teleportation channel. In this scenario, we focus on characterizing the set of qubit channels with respect to the final state's efficacy as a resource of quantum teleportation (QT) in terms of maximal average fidelity and fidelity deviation (fluctuation in fidelity values over the input states). Importantly, we point out the existence of a subset of qubit channels for which the final state becomes useful for universal QT (having maximal average fidelity strictly greater than the classical bound and having zero fidelity deviation) when the initially prepared state is either useful for universal QT (i.e., for a maximally entangled state) or not useful for universal QT (i.e., for a subset of non-maximally entangled pure states). Interestingly, in the latter case, we show that non-unital channels (dissipative interactions) are more effective than unital channels (non-dissipative interactions) in producing useful states for universal QT from non-maximally entangled pure states.

翻訳日:2023-04-12 22:24:11 公開日:2021-05-18

# 低周波トラップにおける浮上磁石の地上冷却

Ground-State Cooling of Levitated Magnets in Low-Frequency Traps ( http://arxiv.org/abs/2102.03344v2 )

ライセンス: Link先を確認

Kirill Streltsov, Julen S. Pedernales, Martin B. Plenio

(参考訳) 低周波トラップ中を浮遊するメソスコピック磁性粒子の機械的自由度に対する基底状態冷却方式を提案する。本手法では,二値センサと適切な形状のパルスを用い,磁石の位置を弱く適応的に測定する。これにより粒子の位置と運動量を正確に決定することができ、初期高エントロピーの熱状態を純粋なコヒーレント状態に変換する。その後、トラップ中心をシフトしてエネルギーを抽出する。エネルギー抽出のタスクをコヒーレント変位操作に委譲することにより、発振器に結合した2レベル系の消散に依存する冷却スキームに関連する制約を克服する。我々は,本プロトコルを加熱速度や不完全な読み出し特性を含む実測実験条件で数値的に評価し,低温で作動する磁気浮上トラップに適していることを示す。その結果,ミクロンスケール粒子の地中冷却の道が開けた。

We present a ground-state cooling scheme for the mechanical degrees of freedom of mesoscopic magnetic particles levitated in low-frequency traps. Our method makes use of a binary sensor and suitably shaped pulses to perform weak, adaptive measurements on the position of the magnet. This allows us to precisely determine the position and momentum of the particle, transforming the initial high-entropy thermal state into a pure coherent state. The energy is then extracted by shifting the trap center. By delegating the task of energy extraction to a coherent displacement operation we overcome the limitations associated with cooling schemes that rely on the dissipation of a two-level system coupled to the oscillator. We numerically benchmark our protocol in realistic experimental conditions, including heating rates and imperfect readout fidelities, showing that it is well suited for magnetogravitational traps operating at cryogenic temperatures. Our results pave the way for ground-state cooling of micron-scale particles.

翻訳日:2023-04-12 11:41:01 公開日:2021-05-18

# ge Hut Wire Double Quantum Dotにおける異方性g因子とスピン軌道場

Anisotropic g-Factor and Spin-Orbit Field in a Ge Hut Wire Double Quantum Dot ( http://arxiv.org/abs/2102.03707v2 )

ライセンス: Link先を確認

Ting Zhang, He Liu, Fei Gao, Gang Xu, Ke Wang, Xin Zhang, Gang Cao, Ting Wang, Jian-Jun Zhang, Xuedong Hu, Hai-Ou Li and Guo-Ping Guo

(参考訳) ナノワイヤのホールは、マヨラナゼロモードの構築やスピン軌道量子ビットの操作において重要な役割を果たす強いスピン軌道相互作用のために近年大きな注目を集めている。ここでは、二重ドットのスピン遮断状態における強い異方性リーク電流から、全 g-テンソルを抽出し、スピン軌道場がナノワイヤの軸に対して59{\degの方位角を持つ平面内にあることを確認する。スピン軌道場の方向は、ナノワイヤに沿った強いスピン軌道相互作用を示すが、これはGe小屋ワイヤの界面反転非対称性に由来する可能性がある。また,ge hut線ダブルドットの穴に対するスピン緩和機構として,リードへのスピンフリップコネネネリングと、ダブルドット内のスピン軌道相互作用の2つの異なる機構を示す。これらの結果はgeベースの量子プロセッサの実現可能性を確立するのに役立つ。

Holes in nanowires have drawn significant attention in recent years because of the strong spin-orbit interaction, which plays an important role in constructing Majorana zero modes and manipulating spin-orbit qubits. Here, from the strongly anisotropic leakage current in the spin blockade regime for a double dot, we extract the full g-tensor and find that the spin-orbit field is in plane with an azimuthal angle of 59{\deg} to the axis of the nanowire. The direction of the spin-orbit field indicates a strong spin-orbit interaction along the nanowire, which may have originated from the interface inversion asymmetry in Ge hut wires. We also demonstrate two different spin relaxation mechanisms for the holes in the Ge hut wire double dot: spin-flip cotunneling to the leads, and spin-orbit interaction within the double dot. These results help establish feasibility of a Ge-based quantum processor.

翻訳日:2023-04-12 07:26:57 公開日:2021-05-18

# 操作のコヒーレンスと干渉計

Coherence of operations and interferometry ( http://arxiv.org/abs/2102.04863v2 )

ライセンス: Link先を確認

Michele Masini, Thomas Theurer, Martin B. Plenio

(参考訳) 量子コヒーレンス(quantum coherence)は、量子力学が古典物理学の力を超越する応用の鍵となる特徴の1つである。これは、量子資源理論を通じてコヒーレンスを定量化するために行われたかなりの努力を説明する。しかし、具体的な技術的タスクへのフレームワークの適用はほとんど失われている。本稿では、この問題に対処し、干渉計測実験の性能にコヒーレンスを検出または生成する操作の能力を結合する。

Quantum coherence is one of the key features that fuels applications for which quantum mechanics exceeds the power of classical physics. This explains the considerable efforts that were undertaken to quantify coherence via quantum resource theories. An application of the resulting framework to concrete technological tasks is however largely missing. Here, we address this problem and connect the ability of an operation to detect or create coherence to the performance of interferometric experiments.

翻訳日:2023-04-12 03:16:22 公開日:2021-05-18

# スパンニングツリー組換えのコンパクト性統計

Compactness statistics for spanning tree recombination ( http://arxiv.org/abs/2103.02699v2 )

ライセンス: Link先を確認

Jeanne N. Clelland, Nicholas Bossenbroek, Thomas Heckmaster, Adam Nelson, Peter Rock, Jade VanAusdall

(参考訳) アンサンブル分析はゲリーマンデリングを定量化するための重要なツールとなり、主要なアイデアは、提案された計画を比較することができる大きなランダムな計画("ensemble")のサンプルを生成することである。もし提案された計画が、様々な再配置基準に関してアンサンブルと比較して極端な異常であるならば、計画が意図的に特定の結果を生み出すように設計されたことを示す可能性がある。計画を構成する方法が与えられたら、その方法によって特定の計画を構築する確率を記述する計画の空間上の確率分布を特定できますか? 近年,MCMC法がアンサンブル構築の主流となっている。ここでは、2018年にMGGG Reistricting Labによって導入された「ReCom」と呼ばれるMCMC手法に焦点を当てる。 ReComは他の方法よりもコンパクトな地区で計画を作成する傾向があり、我々はこの現象をよりよく理解しようとした。この尺度はduchin と tenner によって提案され,polsby-popper score などの地理的周辺値に基づくコンパクト性尺度の難しさを回避した。基本ReComのステップをモデル化するため,2つの格子グラフとボーダー郡地区グラフの2分割計画のアンサンブルを構築した。また,2つの地区ごとの分布木数の積にほぼ比例する特定の計画の採集確率は,その計画におけるカットエッジ数の指数関数的減衰関数にほぼ比例することを示した。これはReCom法による分割計画のコンパクト性を理解するための重要なステップである。

Ensemble analysis has become an important tool for quantifying gerrymandering; the main idea is to generate a large, random sample of districting plans (an "ensemble") to which any proposed plan may be compared. If a proposed plan is an extreme outlier compared to the ensemble with regard to various redistricting criteria, this may indicate that the plan was deliberately engineered to produce a specific outcome. Many methods have been used to construct ensembles, and a fundamental question that arises is: Given a method for constructing plans, can we identify a probability distribution on the space of plans that describes the probability of constructing any particular plan by that method? Recently, MCMC methods have become a predominant tool for constructing ensembles. Here we focus on the MCMC method known as "ReCom," which was introduced in 2018 by the MGGG Redistricting Lab. ReCom tends to produce plans with more compact districts than some other methods, and we sought to better understand this phenomenon. We adopted a discrete analog of district perimeter called "cut edges" as a quantitative measure for district compactness; this measure was proposed by Duchin and Tenner, and it avoids some of the difficulties associated with compactness measures based on geographic perimeter, such as the Polsby-Popper score. To model the basic ReCom step, we constructed ensembles of 2-district plans for two grid graphs and for the precinct graph of Boulder County, CO. We found that the probability of sampling any particular plan -- which is roughly proportional to the product of the numbers of spanning trees for each of the two districts -- is also approximately proportional to an exponentially decaying function of the number of cut edges in the plan. This is an important step towards understanding compactness properties for districting plans produced by the ReCom method.

翻訳日:2023-04-09 07:42:25 公開日:2021-05-18

# コヒーレント・圧縮光によるオプトメカニカル冷却--熱弁開弁の熱力学的コスト-

Optomechanical cooling with coherent and squeezed light: the thermodynamic cost of opening the heat valve ( http://arxiv.org/abs/2103.03596v2 )

ライセンス: Link先を確認

Juliette Monsel, Nastaran Dashti, Sushanth Kini Manjeshwar, Jakob Eriksson, Henric Ernbrink, Ebba Olsson, Emelie Torneus, Witlef Wieczorek and Janine Splettstoesser

(参考訳) 各種光学系において、駆動光空洞との結合による機械運動の地中冷却が実証されている。本研究では,熱弁を用いた光機械式サイドバンド冷却の熱力学的性能解析を行う。性能定量化器として, 低到達性有効温度(フォノン数)だけでなく, 標準冷蔵機の冷却電力と同等の避難熱流や, キャビティ出力光場の測定から, 全てを実験的に推定できる適切な熱力学効率についても検討する。特に,コヒーレント光によって供給される標準的なオプティメカルセットアップに加えて,コヒーレントレーザドライブを圧縮光で置き換えたり,周波数依存(ファノ)ミラーでキャビティを使用するという,地中冷却を実現するための2つの方法を検討した。弱結合限界の内外におけるこれらのセットアップのダイナミクスを考察し、既存の実験システムのパラメータに基づく具体例を示す。熱力学の枠組みを適用することで、これら3つの異なる冷却機構について詳細な知見を得ることができ、熱力学のメカニズムを網羅的に理解することができる。

Ground-state cooling of mechanical motion by coupling to a driven optical cavity has been demonstrated in various optomechanical systems. In our work, we provide a so far missing thermodynamic performance analysis of optomechanical sideband cooling in terms of a heat valve. As performance quantifiers, we examine not only the lowest reachable effective temperature (phonon number) but also the evacuated-heat flow as an equivalent to the cooling power of a standard refrigerator, as well as appropriate thermodynamic efficiencies, which all can be experimentally inferred from measurements of the cavity output light field. Importantly, in addition to the standard optomechanical setup fed by coherent light, we investigate two recent alternative setups for achieving ground-state cooling: replacing the coherent laser drive by squeezed light or using a cavity with a frequency-dependent (Fano) mirror. We study the dynamics of these setups within and beyond the weak-coupling limit and give concrete examples based on parameters of existing experimental systems. By applying our thermodynamic framework, we gain detailed insights into these three different optomechanical cooling setups, allowing a comprehensive understanding of the thermodynamic mechanisms at play.

翻訳日:2023-04-09 00:20:29 公開日:2021-05-18

# 強相互作用するフェルミ・ボース混合系におけるフェルミ・ポーラロンの安定性と分解

Stability and breakdown of Fermi polarons in a strongly interacting Fermi-Bose mixture ( http://arxiv.org/abs/2103.03625v2 )

ライセンス: Link先を確認

Isabella Fritsche, Cosetta Baroni, Erich Dobler, Emil Kirilov, Bo Huang, Rudolf Grimm, Georg M. Bruun, Pietro Massignan

(参考訳) ウルトラコールド$^6$Li原子のフェルミ海に浸漬したボソニック$^{41}$K不純物の強相互作用不均衡混合物の特性について検討した。これにより、ボース=アインシュタイン凝縮体を形成する場合を含む、大きな不純物濃度のフェルミポラロンシナリオを探索することができる。このシステムは高周波注入分光法によって特徴づけられ、種間相互作用はよく特性化されたフェッシュバッハ共鳴によって広く調整可能である。不純物雲の熱分率で形成されるフェルミポーラロンのエネルギーは、両方の種の等しい密度に接近しても、不純物濃度にかなり敏感であることがわかった。高濃度に対する明らかな非感度は、ランダウの準粒子理論に基づく理論的な予測と、ポーラロン間の弱い効果的な相互作用と一致している。ボソニックの$^{41}$Kガスの凝縮分は熱成分よりもはるかに密度が高いため、フェルミ・ポーラロンの記述が破壊される。その代わり、周波数スペクトルの新しい分岐を小さなエネルギーシフトで観測し、これは$^{41}$kの凝縮物の内部で$^{6}$liのフェルミオンによって形成されたボースポーラロンの存在と一致する。ラビ振動測定による凝縮物の挙動のより深い調査は、この観測を裏付け、我々はフェルミとボース・ポーラロン(基本的に異なる2つの準粒子)を1つの雲で実現したことを示している。

We investigate the properties of a strongly interacting imbalanced mixture of bosonic $^{41}$K impurities immersed in a Fermi sea of ultracold $^6$Li atoms. This enables us to explore the Fermi polaron scenario for large impurity concentrations including the case where they form a Bose-Einstein condensate. The system is characterized by means of radio-frequency injection spectroscopy and interspecies interactions are widely tunable by means of a well-characterized Feshbach resonance. We find that the energy of the Fermi polarons formed in the thermal fraction of the impurity cloud remains rather insensitive to the impurity concentration, even as we approach equal densities for both species. The apparent insensitivity to high concentration is consistent with a theoretical prediction, based on Landau's quasiparticle theory, of a weak effective interaction between the polarons. The condensed fraction of the bosonic $^{41}$K gas is much denser than its thermal component, which leads to a break-down of the Fermi polaron description. Instead, we observe a new branch in the radio-frequency spectrum with a small energy shift, which is consistent with the presence of Bose polarons formed by $^{6}$Li fermions inside the $^{41}$K condensate. A closer investigation of the behavior of the condensate by means of Rabi oscillation measurements support this observation, indicating that we have realized Fermi and Bose polarons, two fundamentally different quasiparticles, in one cloud.

翻訳日:2023-04-09 00:07:54 公開日:2021-05-18

# 超波長可変量子周波数変換用フォトニック結晶繊維の群速度対称性

Group-velocity symmetry in photonic crystal fibre for ultra-tunable quantum frequency conversion ( http://arxiv.org/abs/2103.04824v2 )

ライセンス: Link先を確認

Charlotte Parry, Philip B. Main, Thomas A. Wright and Peter J. Mosley

(参考訳) 単一光子の低ノイズ周波数変換は、ファイバーベースの量子ネットワークを確立する上で重要なツールである。単一フォトニック結晶繊維は、対称群速度プロファイルを用いて、超広帯域の光源光子の4波混合をブラッグ散乱することで周波数変換が可能となる。さらに,ポンプチューニングがデバイス製造における現実的な相違を緩和する方法について論じる。これにより、1つの高い適応性を持つ周波数変換インタフェースにより、通信帯域を介して量子ネットワーク内の異種ノードをリンクすることができる。

Low-noise frequency conversion of single photons is a critical tool in establishing fibre-based quantum networks. We show that a single photonic crystal fibre can achieve frequency conversion by Bragg-scattering four-wave mixing of source photons from an ultra-broad wavelength range by engineering a symmetric group velocity profile. Furthermore, we discuss how pump tuning can mitigate realistic discrepancies in device fabrication. This enables a single highly adaptable frequency conversion interface to link disparate nodes in a quantum network via the telecoms band.

翻訳日:2023-04-08 18:23:52 公開日:2021-05-18

# ゴリニ-コサコフスキ-スダールシャン-リンドブラド型マルコフ系における有限サイズ孤立量子系の全熱力学的エントロピー生成速度の負の可能性

Possibility of the total thermodynamic entropy production rate of a finite-sized isolated quantum system to be negative for the Gorini-Kossakowski-Sudarshan-Lindblad-type Markovian dynamics of its subsystem ( http://arxiv.org/abs/2103.05308v2 )

ライセンス: Link先を確認

Takaaki Aoki, Yuichiro Matsuzaki, and Hideaki Hakoshima

(参考訳) 孤立量子系の全熱力学的エントロピー生成速度について検討する。特に、中心高調波発振器(系)が周囲の有限個の高調波発振器(バス)と結合する恒星構成における結合高調波発振器の量子モデルを考える。このモデルでは、システムと浴のギブス状態の初期状態がテンソル積によって与えられるとき、全ての高調波発振器は常にギブス状態であり、温度は時間に依存する。これにより、各調和振動子に対する時間依存熱力学的エントロピーと、それらの和として全非平衡熱力学的エントロピーを定義できる。熱力学エントロピーが熱力学の第3法則を満たすことを解析的に確認する。数値解は,gorini-kossakowski-sudarshan-lindblad (gksl) 型マルコフマスター方程式によって系の力学が十分に近似された場合でも,全熱力学的エントロピー生成速度は負であり,全熱力学的エントロピーは熱力学の第2法則を満たすことを示した。この結果は、系がgksl型マルコフ力学の下では、全エントロピー生成率は非負であるという共通の信念に対する反例である。

We investigate a total thermodynamic entropy production rate of an isolated quantum system. In particular, we consider a quantum model of coupled harmonic oscillators in a star configuration, where a central harmonic oscillator (system) is coupled to a finite number of surrounding harmonic oscillators (bath). In this model, when the initial state of the total system is given by the tensor product of the Gibbs states of the system and the bath, every harmonic oscillator is always in a Gibbs state with a time-dependent temperature. This enables us to define time-dependent thermodynamic entropy for each harmonic oscillator and total nonequilibrium thermodynamic entropy as the summation of them. We analytically confirm that the total thermodynamic entropy satisfies the third law of thermodynamics. Our numerical solutions show that, even when the dynamics of the system is well approximated by the Gorini-Kossakowski-Sudarshan-Lindblad (GKSL)-type Markovian master equation, the total thermodynamic entropy production rate can be negative, while the total thermodynamic entropy satisfies the second law of thermodynamics. This result is a counterexample to the common belief that the total entropy production rate is non-negative when the system is under the GKSL-type Markovian dynamics.

翻訳日:2023-04-08 16:11:31 公開日:2021-05-18

# 断続テイラー級数によるハミルトンシミュレーションのためのnisqアルゴリズム

NISQ Algorithm for Hamiltonian Simulation via Truncated Taylor Series ( http://arxiv.org/abs/2103.05500v2 )

ライセンス: Link先を確認

Jonathan Wei Zhong Lau, Tobias Haug, Leong Chuan Kwek, Kishor Bharti

(参考訳) 多体量子システムのダイナミクスをシミュレートすることは、量子コンピュータが古典的コンピュータよりも量子優位を示す最初の分野の1つであると考えられている。ノイズの多い中間スケール量子(nisq)アルゴリズムは、現在利用可能な量子ハードウェアを効果的に利用することを目指している。量子シミュレーションでは、様々な種類のNISQアルゴリズムが個々の利点と課題によって提案されている。本稿では,既存のアルゴリズムの利点を共有し,いくつかの欠点を緩和する新しいアルゴリズムであるttqs(tncated taylor quantum simulator)を提案する。本アルゴリズムは古典量子フィードバックループを持たず,構築によって不毛台地問題を回避している。我々のハイブリッド量子古典アルゴリズムの古典的部分は、半定値緩和を含む1つの2次等式制約を持つ2次制約付き二次プログラム(QCQP)に対応する。 QCQPに基づく古典最適化は、最近ハミルトン基底問題に対するNISQアルゴリズムである量子補助固有解法(QAE)の古典的なステップとして導入された。したがって,本研究は,ハミルトン基底状態問題に対するnisqアルゴリズムとハミルトニアンシミュレーションの間の概念的統一性を提供する。量子支援シミュレータ (QAS) や変分量子シミュレータ (VQS) などのハミルトンシミュレーションのための微分方程式に基づく NISQ アルゴリズムをアルゴリズムの特定の場合として回収する。私たちは、現在のクラウド量子コンピュータのおもちゃの例でアルゴリズムをテストします。また,アルゴリズムの精度を向上させるための体系的手法を提案する。

Simulating the dynamics of many-body quantum systems is believed to be one of the first fields that quantum computers can show a quantum advantage over classical computers. Noisy intermediate-scale quantum (NISQ) algorithms aim at effectively using the currently available quantum hardware. For quantum simulation, various types of NISQ algorithms have been proposed with individual advantages as well as challenges. In this work, we propose a new algorithm, truncated Taylor quantum simulator (TTQS), that shares the advantages of existing algorithms and alleviates some of the shortcomings. Our algorithm does not have any classical-quantum feedback loop and bypasses the barren plateau problem by construction. The classical part in our hybrid quantum-classical algorithm corresponds to a quadratically constrained quadratic program (QCQP) with a single quadratic equality constraint, which admits a semidefinite relaxation. The QCQP based classical optimization was recently introduced as the classical step in quantum assisted eigensolver (QAE), a NISQ algorithm for the Hamiltonian ground state problem. Thus, our work provides a conceptual unification between the NISQ algorithms for the Hamiltonian ground state problem and the Hamiltonian simulation. We recover differential equation-based NISQ algorithms for Hamiltonian simulation such as quantum assisted simulator (QAS) and variational quantum simulator (VQS) as particular cases of our algorithm. We test our algorithm on some toy examples on current cloud quantum computers. We also provide a systematic approach to improve the accuracy of our algorithm.

翻訳日:2023-04-08 16:01:42 公開日:2021-05-18

# デュアルユニタリ量子回路における固有熱化:スペクトル関数の漸近

Eigenstate thermalization in dual-unitary quantum circuits: Asymptotics of spectral functions ( http://arxiv.org/abs/2103.11694v2 )

ライセンス: Link先を確認

Felix Fritzsch and Toma\v{z} Prosen

(参考訳) 固有状態熱化仮説は、(準)エネルギー固有基底における典型的な作用素の行列要素の統計的性質を導出することによって、孤立量子系における熱化の最も成功した記述である。本稿では,2元量子回路における作用素のクラスに対する行列要素の分布を,対応する固有状態の周波数に依存して検討する。スペクトル関数、すなわち、この周波数分解分布の第2モーメントに対する正確な漸近的表現を提供する。後者は、双対ユニタリ回路の基本構成ブロックから正確に計算できる局所作用素間の動的相関の減衰から得られる。漸近表現と正確な対角化による結果を比較すると,良好な一致が得られた。有限系サイズの小さなゆらぎは、中間時間の動的相関とそれらの漸近力学からの偏差に明示的に関係している。さらに,高いモーメントを数値計算することで,行列要素の期待ガウス分布を確認する。

The eigenstate thermalization hypothesis provides to date the most successful description of thermalization in isolated quantum systems by conjecturing statistical properties of matrix elements of typical operators in the (quasi-)energy eigenbasis. Here we study the distribution of matrix elements for a class of operators in dual-unitary quantum circuits in dependence of the frequency associated with the corresponding eigenstates. We provide an exact asymptotic expression for the spectral function, i.e., the second moment of this frequency resolved distribution. The latter is obtained from the decay of dynamical correlations between local operators which can be computed exactly from the elementary building blocks of the dual-unitary circuits. Comparing the asymptotic expression with results obtained by exact diagonalization we find excellent agreement. Small fluctuations at finite system size are explicitly related to dynamical correlations at intermediate times and the deviations from their asymptotical dynamics. Moreover, we confirm the expected Gaussian distribution of the matrix elements by computing higher moments numerically.

翻訳日:2023-04-07 04:46:44 公開日:2021-05-18

# 強結合キャビティ量子電気力学におけるフォノン効果の非マルコフ摂動理論

Non-Markovian perturbation theories for phonon effects in strong-coupling cavity quantum electrodynamics ( http://arxiv.org/abs/2103.14327v2 )

ライセンス: Link先を確認

Matias Bundgaard-Nielsen, Jesper M{\o}rk and Emil Vosmar Denning

(参考訳) フォノン相互作用は、固体エミッタや蛍光分子に基づくキャビティ量子電気力学系では避けられず、格子や化学結合の振動が電子自由度に結合する。振動環境の非マルコフ応答のため、そのような効果を計算的に効率的に記述することは重要な理論的課題である。これは、エミッタ-キャビティ結合が典型的なフォノンエネルギー範囲に匹敵するか大きいときに特に顕著であり、偏光子形成は光遷移の振動ドレッシングと一致する。本稿では,4つの非マルコフ的摂動的マスター方程式を用いて,光物質結合強度の広い範囲にわたる力学を記述し,テンソルネットワークを用いた数値的正確な参照計算と比較する。マスター方程式は異なる基底変換を用いて導出され、新しい基底における摂動拡大はその後導入され、解析される。 2つのアプローチが特に成功し,堅牢であることに気付きました。本論では, 励起子キャビティ偏光子の振動ドレッシングを基礎として, 第一報が提案され, 開発されている。これにより、ポラリトン分裂が環境における典型的なフォノン周波数スケールを超えると現れる異なるフォノン・ポーラリトンサイドバンドを記述することができる。第2のアプローチは、電子状態の変分最適化された極性振動ドレッシングに基づいている。どちらの手法も、放射スペクトルの基準計算と良好な定性的かつ定量的な一致を示し、熱フォノンの集団が顕著な高温でも数値的に堅牢である。

Phonon interactions are inevitable in cavity quantum electrodynamical systems based on solid-state emitters or fluorescent molecules, where vibrations of the lattice or chemical bonds couple to the electronic degrees of freedom. Due to the non-Markovian response of the vibrational environment, it remains a significant theoretical challenge to describe such effects in a computationally efficient manner. This is particularly pronounced when the emitter-cavity coupling is comparable to or larger than the typical phonon energy range, and polariton formation coincides with vibrational dressing of the optical transitions. In this Article, we consider four non-Markovian perturbative master equation approaches to describe such dynamics over a broad range of light-matter coupling strengths and compare them to numerically exact reference calculations using a tensor network. The master equations are derived using different basis transformations and a perturbative expansion in the new basis is subsequently introduced and analyzed. We find that two approaches are particularly successful and robust. The first of these is suggested and developed in this Article and is based on a vibrational dressing of the exciton-cavity polaritons. This enables the description of distinct phonon-polariton sidebands that appear when the polariton splitting exceeds the typical phonon frequency scale in the environment. The second approach is based on a variationally optimized polaronic vibrational dressing of the electronic state. Both of these approaches demonstrate good qualitative and quantitative agreement with reference calculations of the emission spectrum and are numerically robust, even at elevated temperatures, where the thermal phonon population is significant.

翻訳日:2023-04-06 19:28:51 公開日:2021-05-18

# 時間外行列における量子カオスのシグネチャ

Signatures of quantum chaos in an out-of-time-order matrix ( http://arxiv.org/abs/2105.08282v1 )

ライセンス: Link先を確認

Magdalini Zonnios, Jesper Levinsen, Meera M. Parish, Felix A. Pollock, Kavan Modi

(参考訳) 液体のカオス性を決定するためにインク滴を用いる有名なインク滴実験に動機づけられ,量子プロセスのスクランブル容量を実験的に測定する方法を提案する。ここで、興味のある系は、系のカオス性を特定する動的性質を持つ小さな量子プローブと相互作用する。具体的には、プロセスのカオス性に関する明確な情報理論的意味を提供する、時間外行列(OTOM)と呼ばれる、時間外相関器(OTOC)の完全量子バージョンを提案する。我々は、ランダムなユニタリ過程を用いたカオスのシグネチャとしてのotomの有用性と、カオス性がチューニング可能な量子キックロータについて説明する。

Motivated by the famous ink-drop experiment, where ink droplets are used to determine the chaoticity of a fluid, we propose an experimentally implementable method for measuring the scrambling capacity of quantum processes. Here, a system of interest interacts with a small quantum probe whose dynamical properties identify the chaoticity of the system. Specifically, we propose a fully quantum version of the out-of-time-order correlator (OTOC) - which we term the out-of-time-order matrix (OTOM) - whose correlations offer clear information theoretic meanings about the chaoticity of a process. We illustrate the utility of the OTOM as a signature of chaos using random unitary processes as well as in the quantum kicked rotor, where the chaoticity is tuneable.

翻訳日:2023-03-30 20:18:10 公開日:2021-05-18

# 関数全体のヒルベルト空間とユークリッド平面のトープリッツ量子化

Hilbert Spaces of Entire Functions and Toeplitz Quantization of Euclidean Planes ( http://arxiv.org/abs/2105.08400v1 )

ライセンス: Link先を確認

Micho Durdevich and Stephen Bruce Sontz

(参考訳) 先程の論文で提示されたトープリッツ量子化の理論は拡張され、古典ユークリッド平面の多様で興味深い非可換な実現を含むようにさらに発展した。これは関数全体のヒルベルト空間を用いて行われ、1つの複素変数の多項式が密部分空間を形成する。複素座標は自然に非有界乗法演算子として作用し、その随伴子とともに作用素の高度に非可換な *-代数である。トープリッツ作用素は、この代数の特殊元として幾何学的に構成され、他の二次非可換代数の記号と関連付けられ、平面上の多項式として量子化される。そのような概念的枠組みは、初期スカラー積上の興味深い非自明な条件を促進する。これらは詳細に分析される。様々な例が計算される。

The theory of Toeplitz quantization presented in our previous paper is extended and further developed to include diverse and interesting non-commutative realizations of the classical Euclidean plane. This is done using Hilbert spaces of entire functions, where polynomials in one complex variable form a dense subspace. The complex coordinate naturally acts as an unbounded multiplication operator generating, together with its adjoint, a highly non-commutative *-algebra of operators. The Toeplitz operators are then geometrically constructed as special elements from this algebra; they are associated to the symbols from another quadratic non-commutative algebra, which is interpretable as polynomials over a plane to be quantized. Such a conceptual framework promotes interesting non-trivial conditions on the initial scalar product. These are analyzed in detail. Various illustrative examples are computed.

翻訳日:2023-03-30 20:09:22 公開日:2021-05-18

# 回折結合を介する低温原子の自己組織化

Self-Organization in Cold Atoms Mediated by Diffractive Coupling ( http://arxiv.org/abs/2105.08340v1 )

ライセンス: Link先を確認

Thorsten Ackemann, Guillaume Labeyrie, Giuseppe Baio, Ivor Kre\v{s}i\'c, Josh G. M. Walker, Adrian Costa Boquete, Paul Griffin, William J. Firth, Robin Kaiser, Gian-Luca Oppo, and Gordon R.M. Robb

(参考訳) 本稿では,1枚の反射鏡からのフィードバックによって誘導される光媒介相互作用による低温原子の自己組織化について論じる。ポンプビームと自発側バンドとの拡散劣化は格子周期を選択する。回転対称性と並進対称性の自発的破壊はポンプに横切る2次元平面で起こる。自己誘起原子格子上の回折リップルの結合部位を解明する。光ビームに印加された原子雲の非線形位相シフトは結合強度を決定するパラメータである。相互作用は、熱原子の原子結晶化につながる外部自由度と、量子縮退ガスの超固体、または励起状態の集団やゼーマン準準準位のような内部自由度のいずれかに調整することができる。 poincar{\'e}球面上の光偏光度(ヘリシティと偏光方向)を用いることで、原子ゼーマン状態の特定の既約テンソル成分を結合することができ、双極子と四極子の性質の自発的磁気秩序に導くことができる。臨界相互作用強度の要件は、異なる状況で比較される。縦送りキャビティ, 逆伝搬ビームスキーム, CARL不安定性への接続と拡張について論じる。

This article discusses self-organization in cold atoms via light-mediated interactions induced by feedback from a single retro-reflecting mirror. Diffractive dephasing between the pump beam and the spontaneous sidebands selects the lattice period. Spontaneous breaking of the rotational and translational symmetry occur in the 2D plane transverse to the pump. We elucidate how diffractive ripples couple sites on the self-induced atomic lattice. The nonlinear phase shift of the atomic cloud imprinted onto the optical beam is the parameter determining coupling strength. The interaction can be tailored to operate either on external degrees of freedom leading to atomic crystallization for thermal atoms and supersolids for a quantum degenerate gas, or on internal degrees of freedom like populations of the excited state or Zeeman sublevels. Using the light polarization degrees of freedom on the Poincar{\'e} sphere (helicity and polarization direction), specific irreducible tensor components of the atomic Zeeman states can be coupled leading to spontaneous magnetic ordering of states of dipolar and quadrupolar nature. The requirements for critical interaction strength are compared for the different situations. Connections and extensions to longitudinally pumped cavities, counterpropagating beam schemes and the CARL instability are discussed.

翻訳日:2023-03-30 20:07:42 公開日:2021-05-18

# 2光子量子電池のキャラクタリゼーション:初期条件,安定性,作業抽出

Characterization of a Two-Photon Quantum Battery: Initial Conditions, Stability and Work Extraction ( http://arxiv.org/abs/2105.08337v1 )

ライセンス: Link先を確認

Anna Delmonte, Alba Crescente, Matteo Carrega, Dario Ferraro, Maura Sassetti

(参考訳) 2光子相互作用によってキャビティ放射と結合した2レベル系に基づく量子電池を考える。キャビティの初期条件, フォック状態, コヒーレント状態, 圧縮状態を考慮して, 蓄積エネルギー, 平均充電電力, エネルギー変動量, 抽出可能な作業量など, 様々な特性について検討した。最初の状態がバッテリーの性能向上につながることを示す。しかし、同じ平均光子数を持つコヒーレントな状態は、蓄えられたエネルギーの強いゆらぎの影響を受けているとしても、特に、保存されたエネルギーを短時間でほぼ完全に取り出すことができるため、非常に興味深い性能をもたらす。

We consider a quantum battery that is based on a two-level system coupled with a cavity radiation by means of a two-photon interaction. Various figures of merit, such as stored energy, average charging power, energy fluctuations, and extractable work are investigated, considering, as possible initial conditions for the cavity, a Fock state, a coherent state, and a squeezed state. We show that the first state leads to better performances for the battery. However, a coherent state with the same average number of photons, even if it is affected by stronger fluctuations in the stored energy, results in quite interesting performance, in particular since it allows for almost completely extracting the stored energy as usable work at short enough times.

翻訳日:2023-03-30 20:07:23 公開日:2021-05-18

# 無限円筒ポテンシャル井戸における自由粒子の非相対論的シナリオ

Non-Relativistic Scenario of a Free Particle in an Infinite Cylindrical Potential Well ( http://arxiv.org/abs/2105.08283v1 )

ライセンス: Link先を確認

Pratik Adarsh and Sabyasachi Ghosh

(参考訳) 基本量子力学の形式論には様々な種類の無限ポテンシャル井戸問題が存在する。無限平方井戸(1次元)、立方体箱、球面井戸は教科書では非常に一般的である。本稿では、無限の円筒形井戸である比較的珍しいポテンシャル井戸を考察し、そのエネルギー固有値と固有関数をSchr\"{o}dinger方程式を用いて探究する。また、放射波関数や密度プロットをプロットする。

There are various types of infinite potential well problems occurring in elementary quantum mechanics formalism. The infinite square well (one dimensional), cubical box and, spherical well are quite common in textbooks. In this paper, we consider a rather uncommon potential well, an infinite cylindrical well, and try to find its energy eigenvalues and eigenfunctions using Schr\"{o}dinger equation. We also plot some radial wavefunctions and density plots.

翻訳日:2023-03-30 20:06:36 公開日:2021-05-18

# エルビウムドーパントの接地におけるコヒーレント制御と光励起状態

Coherent control in the ground and optically excited state of an ensemble of erbium dopants ( http://arxiv.org/abs/2105.08487v1 )

ライセンス: Link先を確認

Pablo Cova Fari\~na, Benjamin Merkel, Natalia Herrera Valencia, Penghong Yu, Alexander Ulanowski, Andreas Reiserer

(参考訳) エルビウムドーパントのアンサンブルは、光ファイバー通信の最小損失波長帯域で動作する量子メモリと周波数変換器を実現することができる。彼らの操作は電子スピン状態の初期化、コヒーレント制御、読み出しを必要とする。本研究では、スプリットリングマイクロ波共振器を用いて、地上と光学的励起状態の両方でそのような制御を実証する。提案手法は、ドパントとホストの他の組み合わせにも適用でき、新しい量子メモリプロトコルやセンシングスキームの開発を容易にする可能性がある。

Ensembles of erbium dopants can realize quantum memories and frequency converters that operate in the minimal-loss wavelength band of fiber optical communication. Their operation requires the initialization, coherent control and readout of the electronic spin state. In this work, we use a split-ring microwave resonator to demonstrate such control in both the ground and optically excited state. The presented techniques can also be applied to other combinations of dopant and host, and may facilitate the development of new quantum memory protocols and sensing schemes.

翻訳日:2023-03-30 19:58:13 公開日:2021-05-18

# 大型ランダムアローヘッド行列 : キャビティに結合した量子スピンの多フラクタリティ,半局所化,保護輸送

Large Random Arrowhead Matrices: Multifractality, Semi-Localization, and Protected Transport in Disordered Quantum Spins Coupled to a Cavity ( http://arxiv.org/abs/2105.08444v1 )

ライセンス: Link先を確認

J\'er\^ome Dubail, Thomas Botzung, Johannes Schachenmayer, Guido Pupillo, and David Hagenm\"uller

(参考訳) 我々は、空洞モードに結合した不均一に拡張された量子エミッタの最小モデルである対角障害を持つ大きなランダムな矢印ハミルトニアンの正確な解を提供する。エネルギー間隔の分布は、ポアソン統計と半ポアソン統計に非常に近い分布(後者は通常「アンダーソン」局在化-非局在化遷移の臨界点に関連付けられている)の間で連続的に調整できる。 2つの分極子と1つの暗黒状態の連続体を含むすべての固有状態が多重フラクタルであることを示し、これは光物質結合強度の全ての値に対して重要な「半局所化」相が存在することを示す。初期地点からの脱出確率を計算した結果,有限結合強度の時間とともに線形に増大する脱出確率と,初期地点のエネルギーを選択することで脱出速度を制御することができる。乱れの配置で平均される脱出率は中間結合強度に対して最大値を示し, 集合的強結合限界("キャビティ保護"効果)よりも低い値で飽和する。意外なことに、飽和値は障害によって増加し、空洞が障害に対する輸送を保護しているだけでなく、後者を補助的改善輸送にすることも示している。その結果, 定常励磁電流は脱出確率と類似した特性を示し, キャビティ保護輸送シナリオを平衡外へ拡張することを示した。最後に,無秩序システムにおける長距離輸送に暗黒状態が寄与できることを実証する。

We provide an exact solution of large random arrowhead Hamiltonians with diagonal disorder, a minimal model for inhomogeneously broadened quantum emitters coupled to a cavity mode. We find that the distribution of energy spacing can be continuously tuned between Poisson statistics and a distribution that is very close to semi-Poisson statistics - the latter being usually associated to the critical point of "Anderson" localization-delocalization transitions. We demonstrate that all the eigenstates - including two polaritons and a continuum of dark states - are multifractal, which indicates the existence of a critical "semi-localized" phase for all values of the light-matter coupling strength, where dark states are localized over multiple, arbitrarily-distant sites. By computing the escape probability from an initial site, we find that the system has a peculiar diffusive-like behavior with an escape probability growing linearly with time for any finite coupling strength, and that the escape rate can be controlled by selecting the energy of the initial site. The escape rate averaged over the disorder configurations is found to exhibit a maximum for intermediate coupling strengths, before saturating at a lower value in the collective strong coupling limit - a "cavity protection" effect. Surprisingly, we show that the saturation value increases with the disorder, indicating that the cavity does not only protect transport against disorder but can also turn the latter into an ally improving transport. We finally investigate the system in a two-terminal configuration, and show that the steady-state excitation current exhibits similar features as the escape probability, thereby extending our cavity-protected transport scenario to out-of-equilibrium situations. We finally demonstrate that dark states can provide the major contribution to long-distance transport in disordered systems.

翻訳日:2023-03-30 19:58:03 公開日:2021-05-18

# 時間と位相調整電磁界による光電子放出

Photoelectron emission via time and phase-tailored electromagnetic fields ( http://arxiv.org/abs/2105.08435v1 )

ライセンス: Link先を確認

Jonas W\"atzel, Johannes Hahn and Jamal Berakdar

(参考訳) 光電子のエネルギーと角分布は、駆動場の時間と空間位相構造を選択することで調整可能であることが示されている。これらの結論は、レーザー場と時間非対称thzパルスと/または渦レーザーパルスとを組み合わせた原子ターゲットの単一活性電子モデル内で、波面の空間変調位相を持つ量子力学的計算から導かれる。

The energy and the angular distributions of photoelectrons are shown to be tunable by choosing the time and the spatial phase structure of the driving fields. These conclusions are derived from quantum mechanical calculations done within a single-active electron model for an atomic target subjected to a combination of laser field and a time-asymmetric THz pulse and/or vortex-laser pulse with a spatially modulated phase of the wavefront.

翻訳日:2023-03-30 19:57:28 公開日:2021-05-18

# itの柔軟性と動的能力の戦略的整合:実証的研究

Strategic alignment between IT flexibility and dynamic capabilities: an empirical investigation ( http://arxiv.org/abs/2105.08429v1 )

ライセンス: Link先を確認

Rogier van de Wetering, Patrick Mikalef and Adamantia Pateli

(参考訳) ダイナミック機能理論は企業価値創造のプロセスにおける主要なフレームワークとして登場した。その中核的な概念は、会社の資源に基づく視点の前提を補完し、現代の情報システム研究において重要な理論と管理の枠組みと考えられている。しかし、dctsの大きな貢献にもかかわらず、その強みと中心となる焦点は、本質的には歴史的な業績説明に使われている。さらに、DCTを拡張して、常に変化するIT環境や他の命令的ドライバに適合するようにするために、何人かの研究者が貴重な貢献をしている。しかし、さらなるパフォーマンス向上のための命令的なステップを導出するために、企業が現在の成熟度を統合的に評価できるdct拡張は開発されていない。本稿では,IT の柔軟性と動的能力に関する戦略的アライメントモデルの構築と,322 社の大規模データを用いた相関分析と回帰分析による仮説の実証的検証を目的とする。企業の基盤となる寸法の相乗効果をitフレキシビリティアーキテクチャと動的能力の組み合わせによって、組織は環境条件の変化に対処し、競争相手のパフォーマンスを高めることができると推測する。本研究の結果から,全次元のバランスの程度と競争力のある企業業績との間には,戦略的アライメントの程度に有意な正の相関があることが示唆された。したがって、戦略的アライメントは、常に変化する環境において企業が競争上の優位性に大きく影響を与える重要な条件と見なすことができる。提案されたフレームワークは、ITの柔軟性と動的な能力の成熟度と整合性を評価し、改善するのに役立つ。

Dynamic capabilities theory emerged as a leading framework in the process of value creation for firms. Its core notion complements the premise of the resource-based view of the firm and is considered an important theoretical and management framework in modern information systems research. However, despite DCTs significant contributions, its strength and core focus are essentially in its use for historical firm performance explanation. Furthermore, valuable contributions have been made by several researchers in order to extend the DCT to fit the constantly changing IT environments and other imperative drivers for competitive performance. However, no DCT extension has been developed which allows firms to integrally assess their current state of maturity in order to derive imperative steps for further performance enhancements. In light of empirical advancement, this paper aims to develop a strategic alignment model for IT flexibility and dynamic capabilities and empirically validates proposed hypotheses using correlation and regression analyses on a large data sample of 322 international firms. We conjecture that the combined synergetic effect of the underlying dimensions of a firms IT flexibility architecture and dynamic capabilities enables organizations to cope with changing environmental conditions and drive competitive firm performance. Findings of this study suggest that there is a significant positive relationship between the firms degree of strategic alignment defined as the degree of balance between all dimensions and competitive firm performance. Strategic alignment can, therefore, be seen as an important condition that significantly influences a firms competitive advantage in constantly changing environments. The proposed framework helps firms assess and improve their maturity and alignment of IT flexibility and dynamic capabilities.

翻訳日:2023-03-30 19:57:06 公開日:2021-05-18

# 熱原子ビームの光学キャビティへの超放射放出

Superradiant emission of a thermal atomic beam into an optical cavity ( http://arxiv.org/abs/2105.08718v1 )

ライセンス: Link先を確認

Simon B. J\"ager, Haonan Liu, John Cooper, Travis L. Nicholson, and Murray J. Holland

(参考訳) 理論的には、光学キャビティを横切る際に1つのモードに結合する原子双極子の熱線の集合ダイナミクスを理論的に解析する。この設定のために半古典モデルから導出し、超ラジアント放出の開始とその安定性を決定する。放射光の直線幅に関する解析式を導出し,それらを数値シミュレーションと比較する。さらに、定常超放射相と多成分超放射相の2つの異なる超放射相を発見し、予測する。後者の場合、集団双極子の振幅モードの安定性解析を用いて計算できる周波数スペクトルのサイドバンドを観測する。両超ラジアント相は, 自由空間自然放出と$T_2$脱落過程に対して堅牢であることを示す。

We theoretically analyze the collective dynamics of a thermal beam of atomic dipoles that couple to a single mode when traversing an optical cavity. For this setup we derive a semiclassical model and determine the onset of superradiant emission and its stability. We derive analytical expressions for the linewidth of the emitted light and compare them with numerical simulations. In addition, we find and predict two different superradiant phases; a steady-state superradiant phase and a multi-component superradiant phase. In the latter case we observe sidebands in the frequency spectrum that can be calculated using a stability analysis of the amplitude mode of the collective dipole. We show that both superradiant phases are robust against free-space spontaneous emission and $T_2$ dephasing processes.

翻訳日:2023-03-30 19:50:42 公開日:2021-05-18

# ノイズチャネル識別における量子アドバンテージ

Quantum advantage for noisy channel discrimination ( http://arxiv.org/abs/2105.08707v1 )

ライセンス: Link先を確認

Zane M. Rossi, Jeffery Yu, Isaac L. Chuang, Sho Sugiura

(参考訳) 多くの量子力学実験は、既知の量子回路と未知の量子過程の間のマルチラウンド対話プロトコルと見なすことができる。未知のプロセスに対する完全量子コヒーレントなアクセスは、非コヒーレントなアクセスが許可されているときに比べて多くの識別タスクにおいて利点をもたらすことが知られているが、この利点がプロセスがうるさいときに持続するかどうかは不明である。ここでは,2つの単一量子ビット回転チャネルを区別する場合に,量子アドバンテージを維持できることを示す。数値計算と解析により,雑音強度関数としての完全コヒーレントプロトコルと完全コヒーレントプロトコルによる最適性能の差が明らかとなった。さらに、コヒーレント量子優位領域のサイズはチャネル使用数において逆多項式的に縮小し、中間状態においては、改良された戦略は完全コヒーレントかつ完全非コヒーレントなサブルーチンのハイブリッドである。完全コヒーレントプロトコルは量子信号処理に基づいており、現実的な雑音の存在下での量子優位性の研究のための一般化可能なアルゴリズムフレームワークを提案する。

Many quantum mechanical experiments can be viewed as multi-round interactive protocols between known quantum circuits and an unknown quantum process. Fully quantum "coherent" access to the unknown process is known to provide an advantage in many discrimination tasks compared to when only incoherent access is permitted, but it is unclear if this advantage persists when the process is noisy. Here, we show that a quantum advantage can be maintained when distinguishing between two noisy single qubit rotation channels. Numerical and analytical calculations reveal a distinct transition between optimal performance by fully coherent and fully incoherent protocols as a function of noise strength. Moreover, the size of the region of coherent quantum advantage shrinks inverse polynomially in the number of channel uses, and in an intermediate regime an improved strategy is a hybrid of fully-coherent and fully-incoherent subroutines. The fully coherent protocol is based on quantum signal processing, suggesting a generalizable algorithmic framework for the study of quantum advantage in the presence of realistic noise.

翻訳日:2023-03-30 19:50:31 公開日:2021-05-18

# NISQハードウェアにおける古典的量子ノイズ低減

Classical-Quantum Noise Mitigation for NISQ Hardware ( http://arxiv.org/abs/2105.08701v1 )

ライセンス: Link先を確認

Andrew Shaw

(参考訳) この研究において、グローバルホワイトノイズモデルは第一原理から証明される。 NISQハードウェアのグローバルホワイトノイズモデルへの付着は、古典的ホワイトノイズ外挿法(CLAWE)を用いてノイズ軽減に使用される。

In this work, the global white-noise model is proved from first principles. The adherence of NISQ hardware to the global white-noise model is used to perform noise mitigation using Classical White-noise Extrapolation (CLAWE).

翻訳日:2023-03-30 19:49:51 公開日:2021-05-18

# 非条件セキュアな鍵分布を示す量子リピータノード

A Quantum Repeater Node Demonstrating Unconditionally Secure Key Distribution ( http://arxiv.org/abs/2105.08691v1 )

ライセンス: Link先を確認

S. Langenfeld, P. Thomas, O. Morin, and G. Rempe

(参考訳) 長距離量子通信は光ファイバーの光子損失を克服するために量子リピータを必要とする。ここでは、光空洞に2つのメモリ原子を持つリピータノードを示す。両方の原子は個別に、繰り返し光子と絡み合っており、それぞれの通信相手がそれぞれ独立に受信するまで分配される。原子のベル状態の測定に続いて古典的な通信が鍵を確立する。我々はキーレートのスケーリングの利点を実証し、有効減衰長を2倍に増やし、無条件でセキュアな通信を行う場合のエラーレート閾値11\%、リピータベースの量子ネットワークの場合のコーナーストーンを突破する。

Long-distance quantum communication requires quantum repeaters to overcome photon loss in optical fibers. Here we demonstrate a repeater node with two memory atoms in an optical cavity. Both atoms are individually and repeatedly entangled with photons that are distributed until each communication partner has independently received one of them. An atomic Bell-state measurement followed by classical communication serves to establish a key. We demonstrate scaling advantage of the key rate, increase the effective attenuation length by a factor of two, and beat the error-rate threshold of 11\% for unconditionally secure communication, the corner stones for repeater-based quantum networks.

翻訳日:2023-03-30 19:49:36 公開日:2021-05-18

# ダイヤモンド中の単一SiV$^{-}およびSnV$^{-}$中心の高忠実スピン回転のための光制御プロトコル

Optical control protocols for high-fidelity spin rotations of single SiV$^{-}$ and SnV$^{-}$ centers in diamond ( http://arxiv.org/abs/2105.08594v1 )

ライセンス: Link先を確認

Evangelia Takou and Sophia E. Economou

(参考訳) ダイヤモンドのシリコン空孔とスズ空孔欠陥は、その優れた光学特性のために、NV中心の代替量子ビットとして興味がある。これらの欠陥における光遷移の可用性は、その資産の1つであるが、高忠実性光コヒーレント制御は証明されていない。ここでは,これらの欠陥に対応する新しい光制御スキームを設計する。外部磁場の有無および存在下での電子スピン量子ビットの任意の単一量子ビット回転の性能をコヒーレントな誤差と非コヒーレントな誤差の両方を考慮して評価する。 9,8.0\%$(T=4$~K)と99.71\%$(T=6$~K)を超える回転は、現実的な緩和とリークエラーの存在下でそれぞれSi-VとSn-Vに対して達成できる。

Silicon-vacancy and tin-vacancy defects in diamond are of interest as alternative qubits to the NV center due to their superior optical properties. While the availability of optical transitions in these defects is one of their assets, high-fidelity optical coherent control has not been demonstrated. Here, we design novel optical control schemes tailored to these defects. We evaluate the performance of arbitrary single-qubit rotations of the electron spin qubit both in the absence and presence of an external magnetic field, by taking into account both coherent and incoherent errors. We find that rotations in excess of $98.0\%$ ($T=4$~K) and $99.71\%$ ($T=6$~K) can be achieved for Si-V and Sn-V respectively in the presence of realistic relaxation and leakage errors.

翻訳日:2023-03-30 19:49:14 公開日:2021-05-18

# ロボティクス研究所におけるコビッドとそれ以上の時代の継続性を教える

Teaching Continuity in Robotics Labs in the Age of Covid and Beyond ( http://arxiv.org/abs/2105.08839v1 )

ライセンス: Link先を確認

R. Pito Salas

(参考訳) 本論文は,コンピュータ科学分野におけるロボット工学者およびロボット工学者の育成には,実際のロボットとの広範な直接作業が必要であり,ロボット工学の学習ラボへのアクセスが制限された場合,この教育的ミッションは負の影響を受けると論じる。これはまさに、Covidパンデミックの始まりである2020年初頭にロボティクス研究所が遭遇した問題だ。論文は、遠隔/仮想ロボット工学の教育ラボの説明に変わり、それが何を意味するのか、どのような利点があるのか、どのように使用されるのかを詳細に調べる。このビジョンの一部は2020年に当社の機関で実施され、それ以来常に使用されてきた。構築された特定のアーキテクチャと実装について述べられている。この結論のエキサイティングな洞察は、パンデミックによって奨励され、引き起こされた作業は、ロボティクス教育へのアクセスを拡大し、ある機関がロボティクス教育を大規模に拡張し、コストを抑えながらこれを行う能力を高めるという、長期的利益を非常に有すると考えられる。

This paper argues that training of future Roboticists and Robotics Engineers in Computer Science departments, requires the extensive direct work with real robots, and that this educational mission will be negatively impacted when access to robotics learning laboratories is curtailed. This is exactly the problem that Robotics Labs encountered in early 2020, at the start of the Covid pandemic. The paper then turns to the description of a remote/virtual robotics teaching laboratory and examines in detail what that would mean, what the benefits would be, and how it may be used. Part of this vision was implemented at our institution during 2020 and has been in constant use since then. The specific architecture and implementation, as far as it has been built, is described. The exciting insight in the conclusion is that the work that was encouraged and triggered by a pandemic seems to have very positive longer-term benefits of increasing access to robotics education, increasing the ability of any one institution to scale their robotics education greatly, and potentially do this while reducing costs.

翻訳日:2023-03-30 19:40:59 公開日:2021-05-18

# 教育者、ソリケーター、フラマー、モチベーター、共感者:オンラインエクストリーム運動における役割の特徴

Educators, Solicitors, Flamers, Motivators, Sympathizers: Characterizing Roles in Online Extremist Movements ( http://arxiv.org/abs/2105.08827v1 )

ライセンス: Link先を確認

Shruti Phadke, Tanushree Mitra

(参考訳) ソーシャルメディアは、白人至上主義や反LGBTQのような過激派社会運動がオンラインで繁栄する手段を提供する。しかし、このような動きの参加者が果たす役割についてはほとんど分かっていない。本稿では,これらの参加者が果たす役割,役割のダイナミクス,オンライン過激主義の普及に与える影響について考察する。当社の参加者は、オンライン過激派アカウントであり、公開Facebookページ4,876人、あるいは289Southern Poverty Law Centerのウェブサイトから情報を共有しているグループです。定量的特徴のクラスタ化と質的専門家による検証により,教育者,ソリケータ,フレイラー,モチベータ,交感神経の5つの役割を同定した。例えば、ソリケーターは過激派ウェブサイトからのリンクを使って寄付を集め、過激派問題に参加する一方、フレイラーは過激派コンテンツを共有して怒りを喚起する。我々はさらに,これらの役割の安定性や,過激なアカウントがある役割から別の役割へと移行する可能性など,役割のダイナミクスについても調査する。フライヤーとモチベーションは高い確率で同調者に移行することができるが、運動、教育者、事務員にとっての役割はより安定している。さらに、教育者やソリテーターが過激なリンク投稿をトリガーするのに対して、フレイラーはフェイクニュースソースからの情報の拡散に影響を及ぼす。本研究は,過激派運動への深い関与の軌跡に様々な役割を担い,様々な反過激派介入の可能性を理解する上で有効である。本研究は, オンライン過激主義運動が参加活動を通じてどのように発展していくのか, オンライン過激主義を動員するために, どのように同盟関係を築いていくのかを理解することにつながる。

Social media provides the means by which extremist social movements, such as white supremacy and anti LGBTQ, thrive online. Yet, we know little about the roles played by the participants of such movements. In this paper, we investigate these participants to characterize their roles, their role dynamics, and their influence in spreading online extremism. Our participants, online extremist accounts, are 4,876 public Facebook pages or groups that have shared information from the websites of 289 Southern Poverty Law Center designated extremist groups. By clustering the quantitative features followed by qualitative expert validation, we identify five roles surrounding extremist activism: educators, solicitors, flamers, motivators, sympathizers. For example, solicitors use links from extremist websites to attract donations and participation in extremist issues, whereas flamers share inflammatory extremist content inciting anger. We further investigate role dynamics such as, how stable these roles are over time and how likely will extremist accounts transition from one role into another. We find that roles core to the movement, educators and solicitors, are more stable, while flamers and motivators can transition to sympathizers with high probability. We further find that educators and solicitors exert the most influence in triggering extremist link posts, whereas flamers are influential in triggering the spread of information from fake news sources. Our results help in situating various roles on the trajectory of deeper engagement into the extremist movements and understanding the potential effect of various counter extremism interventions. Our findings have implications for understanding how online extremist movements flourish through participatory activism and how they gain a spectrum of allies for mobilizing extremism online.

翻訳日:2023-03-30 19:39:43 公開日:2021-05-18

# オープンデータを用いた世界中の歩行者のアクセシビリティ測定のための汎用フレームワーク

A Generalized Framework for Measuring Pedestrian Accessibility around the World Using Open Data ( http://arxiv.org/abs/2105.08814v1 )

ライセンス: Link先を確認

Shiqin Liu, Carl Higgs, Jonathan Arundel, Geoff Boeing, Nicholas Cerdera, David Moctezuma, Ester Cerin, Deepti Adlakha, Melanie Lowe, and Billie Giles-Corti

(参考訳) 歩行者のアクセシビリティは、都市交通と土地利用政策の重要な要素であり、健康で持続可能な都市を作る上で重要である。歩行者のアクセシビリティの不平等を測定する指標の開発と評価は、都市計画者や政策立案者が都市計画介入の進捗をベンチマークし、監視するのに役立つ。しかし,都市設計と都市比較を可能にするために,都市設計と交通特性の指標を世界規模で測定・評価することは,公的な,高品質,同等の空間データや,インジケータの構築と分析のためのカスタマイズ可能なフレームワークを提供する空間分析ツールが限られているため,課題である。これらの課題に対処するため,オープンで一貫したデータを用いた歩行者アクセシビリティ指標を構築するための,オープンソースのソフトウェアフレームワークを開発した。歩行者のアクセシビリティを高分解能・空間的集約スケールで一貫して測定し,都市内・都市間分析を可能にした。本研究で開発されたオープンソースおよびオープンデータ手法は,地域計画と政策立案を支援するため,世界中の他の都市に拡張することができる。ソフトウェアは、オープンリポジトリで再利用するために公開されています。

Pedestrian accessibility is an important factor in urban transport and land use policy and critical for creating healthy, sustainable cities. Developing and evaluating indicators measuring inequalities in pedestrian accessibility can help planners and policymakers benchmark and monitor the progress of city planning interventions. However, measuring and assessing indicators of urban design and transport features at high resolution worldwide to enable city comparisons is challenging due to limited availability of official, high quality, and comparable spatial data, as well as spatial analysis tools offering customizable frameworks for indicator construction and analysis. To address these challenges, this study develops an open source software framework to construct pedestrian accessibility indicators for cities using open and consistent data. It presents a generalized method to consistently measure pedestrian accessibility at high resolution and spatially aggregated scale, to allow for both within- and between-city analyses. The open source and open data methods developed in this study can be extended to other cities worldwide to support local planning and policymaking. The software is made publicly available for reuse in an open repository.

翻訳日:2023-03-30 19:39:08 公開日:2021-05-18

# エルミート行列集合における正の写像によって引き起こされる前順序のキャラクタリゼーション

Characterization of preorders induced by positive maps in the set of Hermitian matrices ( http://arxiv.org/abs/2105.08778v1 )

ライセンス: Link先を確認

Julio I. de Vicente

(参考訳) Uhlmann は、エルミート行列 $A$ を別の$B$ に変換する正の、ユニタリかつトレース保存写像が存在することを示した。この本では、ユニタリティーまたはトレース保存の条件の1つを落として、そのような変換の存在を特徴づける。このことはエルミート行列の集合において2つの可能な事前順序を導き出し、任意のエルミート行列の正半定値の欠如と関連する単調性の性質を定量化する尺度を構築するためにどのように使用できるか論じる。 2つの形式主義のそれぞれの尺度は本質的に一意であることが判明した。

Uhlmann showed that there exists a positive, unital and trace-preserving map transforming a Hermitian matrix $A$ into another $B$ if and only if the vector of eigenvalues of $A$ majorizes that of $B$. In this work I characterize the existence of such a transformation when one of the conditions of unitality or trace preservation is dropped. This induces two possible preorders in the set of Hermitian matrices and I argue how this can be used to construct measures quantifying the lack of positive semidefiniteness of any given Hermitian matrix with relevant monotonicity properties. It turns out that the measures in each of the two formalisms are essentially unique.

翻訳日:2023-03-30 19:38:46 公開日:2021-05-18

# Holstein-Tavis-Cummingsモデルに基づく有機分子の集合効果

Collective Effects of Organic Molecules based on Holstein-Tavis-Cummings Model ( http://arxiv.org/abs/2105.08775v1 )

ライセンス: Link先を確認

Quansheng Zhang and Ke Zhang

(参考訳) 本研究では,ホルシュタイン・タヴィス・カミングスモデルに基づく光学キャビティに閉じ込められた有機分子の集合効果について検討した。量子ランジュバン法を用いて, 振動運動の自由度を分離的に除去することにより, 空洞透過スペクトルの表現を解析的に求め, 分極状態の特徴を解析する。応用として, 超低温分子の検出において, 下方偏光状態の周波数シフトが分子数に依存していることが示される。また、蛍光スペクトルを数値解析する。スペクトルプロファイルの様々な分子数の変動は、分子配座の修飾のためのシグネチャを与える。

We study the collective effects of an ensemble of organic molecules confined in an optical cavity based on Holstein-Tavis-Cummings model. By using the quantum Langevin approach and adiabatically eliminating the degree of freedom of the vibrational motion, we analytically obtain the expression of the cavity transmission spectrum to analyze the features of polaritonic states. As an application, we show that the dependence for the frequency shift of the lower polaritonic state on the number of molecules can be used in the detection of the ultra-cold molecules. We also numerically analyze the fluorescence spectrum. The variation of the spectral profile with various numbers of molecules gives signatures for the modification of molecular conformation.

翻訳日:2023-03-30 19:38:30 公開日:2021-05-18

# 連続可変量子鍵分布の正準攻撃に対する安全性

Security of continuous-variable quantum key distribution against canonical attacks ( http://arxiv.org/abs/2105.08774v1 )

ライセンス: Link先を確認

Panagiotis Papanastasiou, Carlo Ottaviani, and Stefano Pirandola

(参考訳) そこで,本研究では,ガウス的攻撃であるガウス的攻撃の存在下でのガウス的コヒーレント状態コヒーレント状態qkdプロトコルの性能について検討する。我々は漸近的なキーレートを示し、最近開発されたツールボックスを用いて、その結果を有限サイズに拡張する。

We investigate the performance of Gaussianmodulated coherent-state QKD protocols in the presence of canonical attacks, which are collective Gaussian attacks resulting in Gaussian channels described by one of the possible canonical forms. We present asymptotic key rates and then we extend the results to the finite-size regime using a recently-developed toolbox for composable security.

翻訳日:2023-03-30 19:38:22 公開日:2021-05-18

# 高品質ダイヤモンド充填オープンキャビティ

High quality-factor diamond-confined open microcavity ( http://arxiv.org/abs/2105.08736v1 )

ライセンス: Link先を確認

Sigurd Fl{\aa}gan, Daniel Riedel, Alisa Javadi, Tomasz Jakubczyk, Patrick Maletinsky and Richard J. Warburton

(参考訳) 高いコヒーレントで光学的に対応可能な電子スピンを持つダイヤモンドの窒素空隙(nv)中心は、量子ネットワークにおけるノードの有望な候補である。しかし、NV中心は、長い放射寿命、ゼロフォノンライン(ZPL)への小さな分岐比、高指数ホスト材料からの抽出効率の低下によるコヒーレントな単一光子の供給源である。原則として、これら3つの欠点は、共振結合によって光学キャビティの単一モードに対処できる。空洞電磁力学の弱い結合状態を利用して、ZPLと単一空洞モードとの共鳴結合は、ZPLへの遷移速度と分岐比を高める。さらに、キャビティは光を明確に定義されたモードに流し込み、外部光学による検出を容易にする。本稿では,真空中での電界がダイヤモンド膜に強く拘束された状態での単一結晶ダイヤモンド膜を含むファブリ・ペロのマイクロキャビティ幾何構造を提案する。ダイヤモンド-空気界面には電界反ノードがある。表面の損失はあったものの, 品質因子が120,000ドルを超え, 微粒な$\mathcal{F}=11\,500$が観察された。異なる損失機構間の相互作用と、これらの損失チャネルがキャビティの性能に与える影響について検討する。この分析から,"waviness"(マイクロキャビティモードに匹敵する空間周波数の粗さ)は品質因子がさらに高い値に達するのを防ぐメカニズムであることが示唆された。最後に, 抽出した空洞パラメータをNV中心に適用し, 150を超える予測されたPurcell因子を算出する。

With a highly coherent, optically addressable electron spin, the nitrogen vacancy (NV) centre in diamond is a promising candidate for a node in a quantum network. However, the NV centre is a poor source of coherent single photons owing to a long radiative lifetime, a small branching ratio into the zero-phonon line (ZPL) and a poor extraction efficiency out of the high-index host material. In principle, these three shortcomings can be addressed by resonant coupling to a single mode of an optical cavity. Utilising the weak-coupling regime of cavity electrodynamics, resonant coupling between the ZPL and a single cavity-mode enhances the transition rate and branching ratio into the ZPL. Furthermore, the cavity channels the light into a well-defined mode thereby facilitating detection with external optics. Here, we present an open Fabry-Perot microcavity geometry containing a single-crystal diamond membrane, which operates in a regime where the vacuum electric field is strongly confined to the diamond membrane. There is a field anti-node at the diamond-air interface. Despite the presence of surface losses, quality factors exceeding $120\,000$ and a finesse $\mathcal{F}=11\,500$ were observed. We investigate the interplay between different loss mechanisms, and the impact these loss channels have on the performance of the cavity. This analysis suggests that the "waviness" (roughness with a spatial frequency comparable to that of the microcavity mode) is the mechanism preventing the quality factors from reaching even higher values. Finally, we apply the extracted cavity parameters to the NV centre and calculate a predicted Purcell factor exceeding 150.

翻訳日:2023-03-30 19:38:14 公開日:2021-05-18

# インド株取引自動化への深層強化学習の適用

Application of deep reinforcement learning for Indian stock trading automation ( http://arxiv.org/abs/2106.16088v1 )

ライセンス: Link先を確認

Supriya Bajpai

(参考訳) 株式取引において、特徴抽出と取引戦略設計は、機械学習技術を用いて長期的な利益を達成するための2つの重要なタスクである。報酬を最大化するために取引信号を取得することで取引戦略を設計するいくつかの方法が提案されている。本稿では,インド市場における株式取引戦略と投資決定に深層強化学習の理論を適用した。実験は、古典的な3つの深層強化学習モデル、深層Qネットワーク、深層Qネットワーク、深層Qネットワークを10のインドストックデータセット上で体系的に実施する。モデルの性能を評価し、比較する。

In stock trading, feature extraction and trading strategy design are the two important tasks to achieve long-term benefits using machine learning techniques. Several methods have been proposed to design trading strategy by acquiring trading signals to maximize the rewards. In the present paper the theory of deep reinforcement learning is applied for stock trading strategy and investment decisions to Indian markets. The experiments are performed systematically with three classical Deep Reinforcement Learning models Deep Q-Network, Double Deep Q-Network and Dueling Double Deep Q-Network on ten Indian stock datasets. The performance of the models are evaluated and comparison is made.

翻訳日:2023-03-30 19:30:46 公開日:2021-05-18

# 動的エンタープライズアーキテクチャ機能と組織的メリット--経験的仲介研究

Dynamic enterprise architecture capabilities and organizational benefits: an empirical mediation study ( http://arxiv.org/abs/2105.10036v1 )

ライセンス: Link先を確認

Rogier van de Wetering

(参考訳) 近年、文献はエンタープライズアーキテクチャ、EA、研究の文脈における理論構築に重点を置いている。特に、学者は、戦略的な目標と技術の使用を一致させるために組織固有のリソースを組織化し、展開するeaベースの能力に焦点をあてる傾向がある。 EA研究の進展にもかかわらず、文献にはかなりのギャップが残っている。最も注目すべきギャップは、EAベースの能力の概念化は、理論上まだしっかりとした基盤を欠いていることと、EAベースの能力がビジネス変革を促進し、会社に利益をもたらす方法に関する決定的な証拠がないことである。そこで本研究では, EA ベースの機能に着目し, 動的機能ビューを理論的基盤として利用し, 動的エンタープライズアーキテクチャ能力が組織的メリットにどのように寄与するかを説明する新しい研究モデルを開発し, テストする。研究モデルに関連する仮説は、299人のCIO、ITマネージャ、リードアーキテクトからの回答を含むデータセットを使用してテストされる。結果として、動的エンタープライズアーキテクチャ機能は、企業のプロセス革新とビジネス/ITアライメントに肯定的な影響を与えます。これらの仲介力はどちらも、組織の利益に肯定的に結びついています。本研究は、組織に利益をもたらすために、動的エンタープライズアーキテクチャの能力を効果的に脱線させる方法についての理解を深める。

In recent years the literature has put a greater emphasis on theory building in the context of Enterprise Architecture, EA, research. Specifically, scholars tend to focus on EA-based capabilities that organize and deploy organization-specific resources to align strategic objectives with the particular use of technology. Despite the growth in EA studies, substantial gaps remain in the literature. The most noteworthy gaps are that the conceptualization of EA-based capabilities still lacks a firm base in theory and that there is no conclusive evidence on how EA-based capabilities drive business transformation and deliver benefits to the firm. Therefore, this study focuses on EA-based capabilities, using the dynamic capabilities view as a theoretical foundation, develops and tests a new research model that explains how dynamic enterprise architecture capabilities lead to organizational benefits. Hypotheses associated with the research model are tested using a dataset that contains responses from 299 CIOs, IT managers, and lead architects. Results show that dynamic enterprise architecture capabilities positively influence the firm's process innovation and business/IT alignment. These mediating forces are both positively associated with organizational benefits. This study advances our understanding of how to efficaciously de-lineate dynamic enterprise architecture capabilities in delivering benefits to the organization.

翻訳日:2023-03-30 19:29:39 公開日:2021-05-18

# Informatiekunde -- Curriculum 2003

Informatiekunde -- Curriculum 2003 ( http://arxiv.org/abs/2105.09311v1 )

ライセンス: Link先を確認

V. Kamphuis and H. A. Proper

(参考訳) 本書はニジェーゲン情報情報学研究所(NIII)の業務情報学プログラムのカリキュラムについて論じる。 2003年(平成15年)から適用されるカリキュラムの構造に関する「レポジトリ」を提供することが目的である。過去3年間、情報科学の分野としてのイメージは、国家レベルとニメゲンレベルの両方でさらに強調されてきた。 2003年のカリキュラムは、この包括化の結果であり、また、情報科学のトレーニングで現在NIII内に構築されている3年間の経験の成果である。この文書では、2000年、2001年、2002年の既存の「スタートアップ」カリキュラムからの「移行」にも明確な注意が払われる。ここで注意すべきは、コホート2000の学生が原則としてこのプログラムの学士課程を完了したことである(2003年)。専門は情報科学教育。そのため、この集団には具体的な「移住」は必要ない。

This document discusses the curriculum of the business informatics program of the Nijmegen Institute for Informatics and business informatics (NIII). The aim is to provide a 'repository' with regard to the structure of the curriculum, which will apply from 2003. In the past three years, the image of information science as a discipline has been further concretised at both national and Nijmegen level. Curriculum 2003 is on the one hand the result of this concretization and on the other hand of the three years of experience that has now been built up within the NIII with the information science training. In this document, therefore, explicit attention will also be paid to the 'migration' from the existing 'start-up' curricula: 2000, 2001 and 2002. It should be noted here that the students of cohort 2000 will in principle have completed the bachelor's phase of the program this year (2003). Complete information science education. No specific `migration 'is therefore necessary for this group of students.

翻訳日:2023-03-30 19:29:19 公開日:2021-05-18

# Informatiekunde -- Visie 2003

Informatiekunde -- Visie 2003 ( http://arxiv.org/abs/2105.09310v1 )

ライセンス: Link先を確認

V. Kamphuis and H. A. Proper

(参考訳) この文書は、ニジェーゲン情報科学研究所(NIII)のビジネス情報学カリキュラムと研究プログラム(Informatiekunde)の基礎となるビジョンを(オランダ語で)論じている。この文書の最終的な目的は、これらのビジョンに関する「リポジトリ」と、プログラムのカリキュラムと研究計画の具体的な構造の基礎を提供することである。ビジネス情報学は、NIIIにおける教育と研究のための比較的新しい分野であるため、この文書の現在の(2003年)バージョンは、主に教育的視点に焦点を当てている。今後数年間で、この文書のアップデートがビジネスインフォマティクスの研究にさらに注目されるようになると期待されている。しかし、この文書が毎年更新可能であるという事実は、もちろん年次変更を期待するという意味ではない。この文書に記載されているものの安定性に関する野望は5年から6年である。現在のバージョンでは、これは特に情報科学研究プログラムのビジョンに当てはまる。この文書の研究部分は、今後数年でさらに具体化されていく必要があるだろう。

This document discusses (in Dutch) the vision underlying the business informatics (Informatiekunde) curriculum and research programme at the Nijmegen Institute for Informatics and Information Science (NIII). The ultimate aim of this document is to provide a 'repository' with regard to these visions, and a basis for the specific structure of the program's curriculum and research plans. Since business informatics is a relatively new area for teaching and research at NIII, the current (2003) version of this document primarily focuses on the educational perspective. It is to be expected that in the coming years, updates to this document will also pay more attention to business informatics research. The fact that this document can be updated annually does not mean, however, that we expect an annual change of course. The ambition with regard to the stability of what is described in this document is 5 to 6 years. In the current version, this applies specifically to the vision of the information science study program. The research part of this document will have to be fleshed out even more specifically in the coming years.

翻訳日:2023-03-30 19:29:08 公開日:2021-05-18

# 確率と滑らかさを誤認したガウス過程平均の収束保証

Convergence Guarantees for Gaussian Process Means With Misspecified Likelihoods and Smoothness ( http://arxiv.org/abs/2001.10818v3 )

ライセンス: Link先を確認

George Wynne, Fran\c{c}ois-Xavier Briol and Mark Girolami

(参考訳) ガウス過程は機械学習、統計学、応用数学においてユビキタスである。関数を近似するための柔軟なモデリングフレームワークを提供し、同時に不確実性を定量化する。しかし、これはモデルが十分に特定されている場合にのみ当てはまるが、実際にはそうではないことが多い。本稿では,モデルの滑らかさと可能性関数が不明確である場合に,ガウス過程の性質について検討する。この設定において、実践的関連性に関する重要な理論的疑問は、ガウス過程の近似が問題の難しさ、我々のモデル、そして誤特定の程度をどの程度正確にするかである。モデルと実験設計の選択を知らせてくれるので、この問題に対する答えは特に有用です。特に,カーネルとカーネルのハイパーパラメータの実験的設計と選択が,モデルの誤特定を軽減するためにどのように適応できるかを述べる。

Gaussian processes are ubiquitous in machine learning, statistics, and applied mathematics. They provide a flexible modelling framework for approximating functions, whilst simultaneously quantifying uncertainty. However, this is only true when the model is well-specified, which is often not the case in practice. In this paper, we study the properties of Gaussian process means when the smoothness of the model and the likelihood function are misspecified. In this setting, an important theoretical question of practial relevance is how accurate the Gaussian process approximations will be given the difficulty of the problem, our model and the extent of the misspecification. The answer to this problem is particularly useful since it can inform our choice of model and experimental design. In particular, we describe how the experimental design and choice of kernel and kernel hyperparameters can be adapted to alleviate model misspecification.

翻訳日:2023-01-05 21:12:43 公開日:2021-05-18

# ポテンシャルインスタンスの推論によるツリーアンサンブルの検証

Verifying Tree Ensembles by Reasoning about Potential Instances ( http://arxiv.org/abs/2001.11905v3 )

ライセンス: Link先を確認

Laurens Devos, Wannes Meert, Jesse Davis

(参考訳) 特定の属性がモデルの予測に不釣り合いな影響を与えているか"、"部分的に説明された例に対して、どのような予測ができるのか"のようなブラックボックスモデルに質問することができると想像してください。この最後の質問は、部分的な記述がデータ内の観察された例に対応していない場合、特に重要である。これらの機能は、特に堅牢性、公平性、バイアスといった問題に関連して、ユーザがモデルの振る舞いをよりよく理解できるようにするため、非常に役に立ちます。本稿では,木々のアンサンブルに対してこのようなアプローチを提案する。この課題は一般に難易度が高いため,(1) 課題の簡略化を問う質問に対して入力空間の一部を抽出し,(2) 段階的かつ常に回答を返却し,入力領域のどの部分がまだ不確実かを示す分割・征服的アプローチに従うという戦略を提示する。このアプローチの有用性は、さまざまなユースケースで示されています。

Imagine being able to ask questions to a black box model such as "Which adversarial examples exist?", "Does a specific attribute have a disproportionate effect on the model's prediction?" or "What kind of predictions could possibly be made for a partially described example?" This last question is particularly important if your partial description does not correspond to any observed example in your data, as it provides insight into how the model will extrapolate to unseen data. These capabilities would be extremely helpful as they would allow a user to better understand the model's behavior, particularly as it relates to issues such as robustness, fairness, and bias. In this paper, we propose such an approach for an ensemble of trees. Since, in general, this task is intractable we present a strategy that (1) can prune part of the input space given the question asked to simplify the problem; and (2) follows a divide and conquer approach that is incremental and can always return some answers and indicates which parts of the input domains are still uncertain. The usefulness of our approach is shown on a diverse set of use cases.

翻訳日:2023-01-05 05:47:17 公開日:2021-05-18

# 高速近似固有空間の構築と高速グラフフーリエ変換への応用

Constructing fast approximate eigenspaces with application to the fast graph Fourier transforms ( http://arxiv.org/abs/2002.09723v3 )

ライセンス: Link先を確認

Cristian Rusu and Lorenzo Rosasco

(参考訳) 対称行列および一般行列に付随する固有空間の数値的効率的な近似について検討する。固有空間は、効率よく操作できる基本成分の固定個数に分解される(拡張直交の命題やスケーリングやせん断変換を考える)。これらの成分の数は、近似精度と固有空間上の投影の計算複雑性の間のトレードオフを制御する。単一基本成分の最小化問題を書き、閉形式解を提供する。次に,これらすべてのコンポーネントを収束するまで反復的に更新するアルゴリズムを提案する。ランダム行列に関する結果と、有向グラフおよび無向グラフに対するグラフフーリエ変換の近似への応用を示す。

We investigate numerically efficient approximations of eigenspaces associated to symmetric and general matrices. The eigenspaces are factored into a fixed number of fundamental components that can be efficiently manipulated (we consider extended orthogonal Givens or scaling and shear transformations). The number of these components controls the trade-off between approximation accuracy and the computational complexity of projecting on the eigenspaces. We write minimization problems for the single fundamental components and provide closed-form solutions. Then we propose algorithms that iterative update all these components until convergence. We show results on random matrices and an application on the approximation of graph Fourier transforms for directed and undirected graphs.

翻訳日:2022-12-29 19:19:35 公開日:2021-05-18

# ゲノムデータセットの高次元特徴選択

High-Dimensional Feature Selection for Genomic Datasets ( http://arxiv.org/abs/2002.12104v2 )

ライセンス: Link先を確認

Majid Afshar, Hamid Usefi

(参考訳) 機械学習とパターン認識の中心的な問題は、最も重要な特徴を認識するプロセスである。本稿では,まず無関係な特徴を取り除き,残りの特徴間の相関を検出する新しい特徴選択法(drpt)を提案する。 $d=[a\mid \mathbf{b}]$をデータセットとし、$\mathbf{b}$をクラスラベルとし、$a$を列を特徴とする行列とする。我々は最小二乗法と$A$の擬逆法を用いて$A\mathbf{x} = \mathbf{b}$を解く。各々の$\mathbf{x}$の成分は対応する列(機能)に割り当てられた重みと見なすことができる。我々は$\mathbf{x}$の局所最大値に基づいてしきい値を定義し、しきい値よりも重みが小さい特徴を除去する。還元行列の相関を検出するために、我々はまだ$a$と呼ぶが、摂動$\tilde a$ を$a$とする。相関は$\delta\mathbf{x}=\mid \mathbf{x} -\tilde{\mathbf{x}}\mid $, ここで $\tilde{\mathbf{x}}$ は$\tilde a\tilde{\mathbf{x}}=\mathbf{b}$ の最小四分法である。まず、$\delta\mathbf{x}$に基づいて機能をクラスタし、次に機能のエントロピーを使用します。最後に、その重みとエントロピーに基づいて各サブクラスタから特徴を選択する。 drptの有効性は、9,117から267,604までの10の遺伝子データセットに対して7つの最先端特徴選択法との比較を行い検証した。その結果,各特徴選択アルゴリズムと比較して,DRPTの性能はいくつかの面で好ましいことがわかった。 \e である。

A central problem in machine learning and pattern recognition is the process of recognizing the most important features. In this paper, we provide a new feature selection method (DRPT) that consists of first removing the irrelevant features and then detecting correlations between the remaining features. Let $D=[A\mid \mathbf{b}]$ be a dataset, where $\mathbf{b}$ is the class label and $A$ is a matrix whose columns are the features. We solve $A\mathbf{x} = \mathbf{b}$ using the least squares method and the pseudo-inverse of $A$. Each component of $\mathbf{x}$ can be viewed as an assigned weight to the corresponding column (feature). We define a threshold based on the local maxima of $\mathbf{x}$ and remove those features whose weights are smaller than the threshold. To detect the correlations in the reduced matrix, which we still call $A$, we consider a perturbation $\tilde A$ of $A$. We prove that correlations are encoded in $\Delta\mathbf{x}=\mid \mathbf{x} -\tilde{\mathbf{x}}\mid $, where $\tilde{\mathbf{x}}$ is the least quares solution of $\tilde A\tilde{\mathbf{x}}=\mathbf{b}$. We cluster features first based on $\Delta\mathbf{x}$ and then using the entropy of features. Finally, a feature is selected from each sub-cluster based on its weight and entropy. The effectiveness of DRPT has been verified by performing a series of comparisons with seven state-of-the-art feature selection methods over ten genetic datasets ranging up from 9,117 to 267,604 features. The results show that, over all, the performance of DRPT is favorable in several aspects compared to each feature selection algorithm. \e

翻訳日:2022-12-28 08:04:40 公開日:2021-05-18

# ENTMOOT: ツリーモデルのアンサンブルを最適化するフレームワーク

ENTMOOT: A Framework for Optimization over Ensemble Tree Models ( http://arxiv.org/abs/2003.04774v3 )

ライセンス: Link先を確認

Alexander Thebelt, Jan Kronqvist, Miten Mistry, Robert M. Lee, Nathan Sudermann-Merx, Ruth Misener

(参考訳) グラディエント強化木やその他の回帰木モデルは、幅広い実世界の産業用途でよく機能する。これらの木モデル (i)重要な予測機能についての洞察を提供する。 (ii)スパースデータを効果的に管理し、 (iii)優れた予測能力を有する。その利点にもかかわらず、彼らは一般的に意思決定タスクやブラックボックス最適化に不人気であり、それは構造を最適化するのが困難であり、信頼性の高い不確実性尺度が欠如しているからである。 ENTMOOTは、(既に訓練済みの)ツリーモデルをより大きな最適化問題に統合するための新しいフレームワークです。 ENTMOOTの貢献は以下のとおりである。 (i)木モデルと互換性のある信頼性の高い不確実性尺度を明示的に導入すること。 (ii)これらの不確実性認識木モデルを取り入れたより大きな最適化問題を解くこと。 (iii) 解がグローバルに最適であることを証明すること、すなわち、より良い解は存在しないこと。特に、entmootアプローチによって、木モデルの意思決定とブラックボックス最適化へのシンプルな統合が可能になり、一般的に使用されているフレームワークとの強力な競合であることが証明される。

Gradient boosted trees and other regression tree models perform well in a wide range of real-world, industrial applications. These tree models (i) offer insight into important prediction features, (ii) effectively manage sparse data, and (iii) have excellent prediction capabilities. Despite their advantages, they are generally unpopular for decision-making tasks and black-box optimization, which is due to their difficult-to optimize structure and the lack of a reliable uncertainty measure. ENTMOOT is our new framework for integrating (already trained) tree models into larger optimization problems. The contributions of ENTMOOT include: (i) explicitly introducing a reliable uncertainty measure that is compatible with tree models, (ii) solving the larger optimization problems that incorporate these uncertainty aware tree models, (iii) proving that the solutions are globally optimal, i.e. no better solution exists. In particular, we show how the ENTMOOT approach allows a simple integration of tree models into decision-making and black-box optimization, where it proves as a strong competitor to commonly-used frameworks.

翻訳日:2022-12-24 20:16:17 公開日:2021-05-18

# 広帯域におけるCNNの従来未確認スケールへの一般化能力の探索

Exploring the ability of CNNs to generalise to previously unseen scales over wide scale ranges ( http://arxiv.org/abs/2004.01536v7 )

ライセンス: Link先を確認

Ylva Jansson and Tony Lindeberg

(参考訳) 大規模なバリエーションを扱う能力は多くの現実世界の視覚的タスクにとって不可欠である。ディープネットワークにおけるスケールを扱うための簡単なアプローチは、一連のスケールチャネルで複数のスケールで画像を同時に処理することだ。スケール不変性は、原則として、スケールチャネル間の重量共有と、スケールチャネルからの出力を最大または平均的にプールすることで達成できる。このようなスケールチャネルネットワークが、重要なスケール範囲のトレーニングセットに存在しないスケールに一般化する能力は、これまで検討されていなかった。そこで、我々は、スケールチャネルネットワークの不変性と共分散特性の理論解析を行い、これまで見られなかったスケールに一般化する様々な種類のスケールチャネルネットワークの能力を実験的に評価する。我々は,従来のアプローチの限界を識別し,より解像度の低い画像のより大きな部分をスケールチャネルが処理する,新たなタイプのスケールチャネルアーキテクチャを提案する。提案するFovMaxとFovAvgのネットワークは,1スケールのトレーニングデータを用いたトレーニングにおいても,ほぼ同一のスケールで動作し,また,小型のサンプルシステムでも改善が期待できる。

The ability to handle large scale variations is crucial for many real world visual tasks. A straightforward approach for handling scale in a deep network is to process an image at several scales simultaneously in a set of scale channels. Scale invariance can then, in principle, be achieved by using weight sharing between the scale channels together with max or average pooling over the outputs from the scale channels. The ability of such scale channel networks to generalise to scales not present in the training set over significant scale ranges has, however, not previously been explored. We, therefore, present a theoretical analysis of invariance and covariance properties of scale channel networks and perform an experimental evaluation of the ability of different types of scale channel networks to generalise to previously unseen scales. We identify limitations of previous approaches and propose a new type of foveated scale channel architecture, where the scale channels process increasingly larger parts of the image with decreasing resolution. Our proposed FovMax and FovAvg networks perform almost identically over a scale range of 8, also when training on single scale training data, and do also give improvements in the small sample regime.

翻訳日:2022-12-17 04:47:31 公開日:2021-05-18

# Byzantine-Robust Client Weighting によるフェデレーションラーニング

Towards Federated Learning With Byzantine-Robust Client Weighting ( http://arxiv.org/abs/2004.04986v2 )

ライセンス: Link先を確認

Amit Portnoy, Yoav Tirosh, and Danny Hendler

(参考訳) フェデレーション学習(federated learning, fl)は、中央サーバが協調する計算プロセスにおいて、モデルを協調的にトレーニングするクライアント間でデータを分散する分散機械学習パラダイムである。保有するデータインスタンスの割合に基づいて各クライアントに重みを割り当てることにより、正確なジョイントモデルへの収束率を大幅に向上させることができる。以前のいくつかの作品は、一部のクライアントがモデルに関する任意の、あるいは悪意のある情報を送信できるビザンチンの設定でflを研究した。しかし、これらの作業はデータアンバランスの問題を完全に無視するか、クライアントの重みがサーバに周知されていると仮定するかのいずれかであり、実際には、重みはクライアント自身によってサーバに報告され、従って信頼できない可能性がある。そこで本研究では, 実用的重み関係に基づく前処理法を提案し, モデル品質とビザンチンのロバスト性とのバランスが良好であることを実証的に示す。また,本手法をランダムに選択したクライアントウェイトのサンプルに適用できることを解析的に確立した。

Federated Learning (FL) is a distributed machine learning paradigm where data is distributed among clients who collaboratively train a model in a computation process coordinated by a central server. By assigning a weight to each client based on the proportion of data instances it possesses, the rate of convergence to an accurate joint model can be greatly accelerated. Some previous works studied FL in a Byzantine setting, in which a fraction of the clients may send arbitrary or even malicious information regarding their model. However, these works either ignore the issue of data unbalancedness altogether or assume that client weights are apriori known to the server, whereas, in practice, it is likely that weights will be reported to the server by the clients themselves and therefore cannot be relied upon. We address this issue for the first time by proposing a practical weight-truncation-based preprocessing method and demonstrating empirically that it is able to strike a good balance between model quality and Byzantine robustness. We also establish analytically that our method can be applied to a randomly selected sample of client weights.

翻訳日:2022-12-14 20:36:43 公開日:2021-05-18

# 空間変圧器ネットワークが不変性をサポートしていない場合の理解とその対策

Understanding when spatial transformer networks do not support invariance, and what to do about it ( http://arxiv.org/abs/2004.11678v5 )

ライセンス: Link先を確認

Lukas Finnveden, Ylva Jansson and Tony Lindeberg

(参考訳) 空間トランスフォーマーネットワーク(STN)は、畳み込みニューラルネットワーク(CNN)が画像変換に不変性を学習できるように設計された。 STNはもともとCNNの特徴マップと入力画像の変換のために提案されていた。これにより、変換パラメータを予測する際に、より複雑な機能の使用が可能になる。しかし、STNは純粋に空間変換を行うため、一般的な場合、変換された画像の特徴写像を元のものと整列する能力を持たない。したがって、STNはCNN特徴写像を変換する際に不変性をサポートできない。そこで本研究では,この問題に対する簡単な証明と実用的意義について検討し,分類精度の低下と組み合わせることを提案する。そこで我々は,複雑な特徴を利用する代替STNアーキテクチャについて検討する。また,より深い局所化ネットワークは訓練が難しいが,分類ネットワークとパラメータを共有するローカライズネットワークは,より深く成長するにつれて安定し,困難なデータセットの分類精度が向上することがわかった。最後に,ローカライズネットワークの複雑さと反復画像アライメントの相互作用について検討する。

Spatial transformer networks (STNs) were designed to enable convolutional neural networks (CNNs) to learn invariance to image transformations. STNs were originally proposed to transform CNN feature maps as well as input images. This enables the use of more complex features when predicting transformation parameters. However, since STNs perform a purely spatial transformation, they do not, in the general case, have the ability to align the feature maps of a transformed image with those of its original. STNs are therefore unable to support invariance when transforming CNN feature maps. We present a simple proof for this and study the practical implications, showing that this inability is coupled with decreased classification accuracy. We therefore investigate alternative STN architectures that make use of complex features. We find that while deeper localization networks are difficult to train, localization networks that share parameters with the classification network remain stable as they grow deeper, which allows for higher classification accuracy on difficult datasets. Finally, we explore the interaction between localization network complexity and iterative image alignment.

翻訳日:2022-12-10 03:52:24 公開日:2021-05-18

# 検索エンジンとの会話:serpベースの会話応答生成

Conversations with Search Engines: SERP-based Conversational Response Generation ( http://arxiv.org/abs/2004.14162v2 )

ライセンス: Link先を確認

Pengjie Ren, Zhumin Chen, Zhaochun Ren, Evangelos Kanoulas, Christof Monz, and Maarten de Rijke

(参考訳) 本稿では,ユーザが自然言語でクエリを表現できるという意味で,検索エンジンと会話することで複雑な情報要求に答える問題に対処し,短いシステム応答から必要な情報を会話形式で直接受信する。近年、会話エージェント(cas)や会話検索(cs)の研究など、同様の目標に向けた試みがいくつか行われている。しかし、複雑な情報のニーズに対処しないか、あるいは概念フレームワークや実験室ベースのユーザリサーチの開発に限られている。本稿では,(1)適切なデータセットの作成,(2)会話用パイプラインの開発のためのsaac(search as a conversation)データセット,(2)検索エンジンとの会話のための最先端パイプラインの開発,(2)このデータセットを用いた検索エンジンとの対話(case)という2つの目標を追求する。 SaaCはマルチターンの会話検索データセットに基づいて構築されており、クラウドソーシングプラットフォームから労働者を雇い、関連する各項目を短い会話応答にまとめる。 caseは、サポート対象のトークン識別モジュールとaprior-awareポインタジェネレータを導入することで、最先端の処理を強化します。我々は,CaSEが強いベースラインより優れていることを示す実験を行った。また、CaSE以外のさらなる改善の余地があるかを示すために、SaaCデータセットの広範な分析を行う。最後に、我々は、CaSEのSaaCデータセットとコードと、このトピックに関する今後の研究を促進するために使用されるすべてのモデルをリリースする。

In this paper, we address the problem of answering complex information needs by conversing conversations with search engines, in the sense that users can express their queries in natural language, and directly receivethe information they need from a short system response in a conversational manner. Recently, there have been some attempts towards a similar goal, e.g., studies on Conversational Agents (CAs) and Conversational Search (CS). However, they either do not address complex information needs, or they are limited to the development of conceptual frameworks and/or laboratory-based user studies. We pursue two goals in this paper: (1) the creation of a suitable dataset, the Search as a Conversation (SaaC) dataset, for the development of pipelines for conversations with search engines, and (2) the development of astate-of-the-art pipeline for conversations with search engines, the Conversations with Search Engines (CaSE), using this dataset. SaaC is built based on a multi-turn conversational search dataset, where we further employ workers from a crowdsourcing platform to summarize each relevant passage into a short, conversational response. CaSE enhances the state-of-the-art by introducing a supporting token identification module and aprior-aware pointer generator, which enables us to generate more accurate responses. We carry out experiments to show that CaSE is able to outperform strong baselines. We also conduct extensive analyses on the SaaC dataset to show where there is room for further improvement beyond CaSE. Finally, we release the SaaC dataset and the code for CaSE and all models used for comparison to facilitate future research on this topic.

翻訳日:2022-12-08 14:18:46 公開日:2021-05-18

# WOAD:未公開動画のオンラインアクション検出を監督

WOAD: Weakly Supervised Online Action Detection in Untrimmed Videos ( http://arxiv.org/abs/2006.03732v2 )

ライセンス: Link先を確認

Mingfei Gao, Yingbo Zhou, Ran Xu, Richard Socher, Caiming Xiong

(参考訳) 非トリミングビデオ中のオンラインアクション検出は、発生時のアクションを識別することを目的としているため、リアルタイムアプリケーションにとって非常に重要である。従来は、オンライン行動検出システムのスケーラビリティを妨げる時間的行動境界の面倒なアノテーションをトレーニングに頼っていた。ビデオクラスラベルのみを用いてトレーニング可能な弱教師付きフレームワークであるWOADを提案する。 WOADには、時間的提案生成(TPG)とオンラインアクション認識(OAR)の2つの共同訓練モジュールが含まれている。ビデオクラスのラベルによって監督され、TPGはオフラインで動作し、OARの擬似フレームレベルのラベルを正確にマイニングするターゲットとなる。 TPGからの監視信号により、OARはオンライン方式で行動検出を行うことを学ぶ。 thumos'14, activitynet1.2, activitynet1.3の実験結果は,弱教師付き手法が弱教師付きベースラインをほとんど上回っており,従来の強教師付き手法と同等の性能を達成していることを示している。さらに、WOADは、利用可能な時に強力な監視を活用するために柔軟です。本手法は,オンラインフレームごとの行動認識とオンライン行動開始検出の両方のタスクにおいて,最先端の結果を得る。

Online action detection in untrimmed videos aims to identify an action as it happens, which makes it very important for real-time applications. Previous methods rely on tedious annotations of temporal action boundaries for training, which hinders the scalability of online action detection systems. We propose WOAD, a weakly supervised framework that can be trained using only video-class labels. WOAD contains two jointly-trained modules, i.e., temporal proposal generator (TPG) and online action recognizer (OAR). Supervised by video-class labels, TPG works offline and targets at accurately mining pseudo frame-level labels for OAR. With the supervisory signals from TPG, OAR learns to conduct action detection in an online fashion. Experimental results on THUMOS'14, ActivityNet1.2 and ActivityNet1.3 show that our weakly-supervised method largely outperforms weakly-supervised baselines and achieves comparable performance to the previous strongly-supervised methods. Beyond that, WOAD is flexible to leverage strong supervision when it is available. When strongly supervised, our method obtains the state-of-the-art results in the tasks of both online per-frame action recognition and online detection of action start.

翻訳日:2022-11-25 04:11:36 公開日:2021-05-18

# テキスト生成の評価:調査

Evaluation of Text Generation: A Survey ( http://arxiv.org/abs/2006.14799v2 )

ライセンス: Link先を確認

Asli Celikyilmaz, Elizabeth Clark, Jianfeng Gao

(参考訳) 本稿は,ここ数年で開発された自然言語生成システム(NLG)の評価手法について検討する。 nlg評価方法は,(1)人間中心評価指標,(2)訓練を必要としない自動評価指標,(3)機械学習指標の3つのカテゴリに分類した。各カテゴリにおいて、最近提案されたNLGタスクとニューラルNLGモデルの評価に焦点をあて、現在行われている進歩と課題について論じる。次に,テキストの自動要約と長文生成のためのタスク固有のnlg評価の2つの例を示し,今後の研究の方向性を述べる。

The paper surveys evaluation methods of natural language generation (NLG) systems that have been developed in the last few years. We group NLG evaluation methods into three categories: (1) human-centric evaluation metrics, (2) automatic metrics that require no training, and (3) machine-learned metrics. For each category, we discuss the progress that has been made and the challenges still being faced, with a focus on the evaluation of recently proposed NLG tasks and neural NLG models. We then present two examples for task-specific NLG evaluations for automatic text summarization and long text generation, and conclude the paper by proposing future research directions.

翻訳日:2022-11-16 20:45:03 公開日:2021-05-18

# 不均一情報ネットワークのための事前学習モデル

Pre-Trained Models for Heterogeneous Information Networks ( http://arxiv.org/abs/2007.03184v2 )

ライセンス: Link先を確認

Yang Fang, Xiang Zhao, Yifan Chen, Weidong Xiao, Maarten de Rijke

(参考訳) ネットワーク表現学習では,ヘテロジニアスな情報ネットワークを低次元空間で表現する方法を学習し,効率的な探索,分類,予測を容易にする。従来のネットワーク表現学習手法では、ドメイン固有の問題に対処するために十分なタスク固有のラベル付きデータが必要である。トレーニングされたモデルは、通常、ドメイン外データセットに転送できない。我々は、異種情報ネットワークの特徴を捉えるための自己教師付き事前学習および微調整フレームワークPF-HINを提案する。ダウンストリームのタスクとデータセットごとにモデル全体をトレーニングしなければならない従来のネットワーク表現学習モデルとは異なり、PF-HINはモデルと少数のタスク固有のパラメータを微調整するだけで、モデル効率と効率性が向上する。事前学習中、我々はまず与えられたノードの近傍をシーケンスに変換する。 PF-HINは2つの自己教師付きタスク、マスキングノードモデリング、隣接ノード予測に基づいて事前訓練される。モデルのトレーニングには深層双方向トランスフォーマーエンコーダを採用し、パラメータの削減には分解型埋め込みパラメータ化と層間パラメータ共有を利用する。微調整の段階では、リンク予測、類似性検索、ノード分類、ノードクラスタリングという4つのベンチマークダウンストリームタスクを選択します。 pf-hinは、これら各タスクにおける最先端の代替手段を4つのデータセットで一貫して大幅に上回っている。

In network representation learning we learn how to represent heterogeneous information networks in a low-dimensional space so as to facilitate effective search, classification, and prediction solutions. Previous network representation learning methods typically require sufficient task-specific labeled data to address domain-specific problems. The trained model usually cannot be transferred to out-of-domain datasets. We propose a self-supervised pre-training and fine-tuning framework, PF-HIN, to capture the features of a heterogeneous information network. Unlike traditional network representation learning models that have to train the entire model all over again for every downstream task and dataset, PF-HIN only needs to fine-tune the model and a small number of extra task-specific parameters, thus improving model efficiency and effectiveness. During pre-training, we first transform the neighborhood of a given node into a sequence. PF-HIN is pre-trained based on two self-supervised tasks, masked node modeling and adjacent node prediction. We adopt deep bi-directional transformer encoders to train the model, and leverage factorized embedding parameterization and cross-layer parameter sharing to reduce the parameters. In the fine-tuning stage, we choose four benchmark downstream tasks, i.e., link prediction, similarity search, node classification, and node clustering. PF-HIN consistently and significantly outperforms state-of-the-art alternatives on each of these tasks, on four datasets.

翻訳日:2022-11-12 18:11:54 公開日:2021-05-18

# Deep Retrieval: 大規模レコメンデーションのための検索可能な構造を学ぶ

Deep Retrieval: Learning A Retrievable Structure for Large-Scale Recommendations ( http://arxiv.org/abs/2007.07203v2 )

ライセンス: Link先を確認

Weihao Gao, Xiangjun Fan, Chong Wang, Jiankai Sun, Kai Jia, Wenzhi Xiao, Ruofan Ding, Xingyan Bin, Hui Yang, Xiaobing Liu

(参考訳) 大規模レコメンデーションにおける中核的な問題は、重要候補を正確かつ効率的に、好ましくは準線形時間で検索することである。先進的なアプローチは主に2段階の手順に基づいており、まず内積モデルを学び、次に近い近接探索アルゴリズム(ANN)を用いて上位候補を見つける。本稿では,ANNアルゴリズムにおけるユークリッド空間の仮定に頼らずに,ユーザとイテムのインタラクションデータ(例えばクリック)を直接検索可能な構造を学習するために,Deep Retrieval(DR)を提案する。 DRの構造は全ての候補項目を離散潜在空間に符号化する。候補の潜在コードはモデルパラメータであり、同じ目的関数を最大化するために他のニューラルネットワークパラメータと共に学習する。モデルが学習されると、構造上のビーム探索を行い、最上位候補を検索して再ランキングを行う。経験的に、我々はまず、2つの公開データセットのブルートフォースベースラインとほぼ同じ精度を、サブ線形計算複雑性を持つ dr が達成できることを実証した。さらに本研究では,実運用レコメンデーションシステムにおいて,デプロイされたDRアプローチが,エンゲージメントの指標として十分に調整されたANNベースラインを著しく上回ることを示す。我々の知る限りでは、DRは産業レコメンデーションシステムのために数億のアイテムをスケールで展開した最初の非ANNアルゴリズムの1つである。

One of the core problems in large-scale recommendations is to retrieve top relevant candidates accurately and efficiently, preferably in sub-linear time. Previous approaches are mostly based on a two-step procedure: first learn an inner-product model, and then use some approximate nearest neighbor (ANN) search algorithm to find top candidates. In this paper, we present Deep Retrieval (DR), to learn a retrievable structure directly with user-item interaction data (e.g. clicks) without resorting to the Euclidean space assumption in ANN algorithms. DR's structure encodes all candidate items into a discrete latent space. Those latent codes for the candidates are model parameters and learnt together with other neural network parameters to maximize the same objective function. With the model learnt, a beam search over the structure is performed to retrieve the top candidates for reranking. Empirically, we first demonstrate that DR, with sub-linear computational complexity, can achieve almost the same accuracy as the brute-force baseline on two public datasets. Moreover, we show that, in a live production recommendation system, a deployed DR approach significantly outperforms a well-tuned ANN baseline in terms of engagement metrics. To the best of our knowledge, DR is among the first non-ANN algorithms successfully deployed at the scale of hundreds of millions of items for industrial recommendation systems.

翻訳日:2022-11-11 05:49:55 公開日:2021-05-18

# GPU上の多面体ニューラルネットワークのスケーリング検証

Scaling Polyhedral Neural Network Verification on GPUs ( http://arxiv.org/abs/2007.10868v2 )

ライセンス: Link先を確認

Christoph M\"uller, Fran\c{c}ois Serre, Gagandeep Singh, Markus P\"uschel, Martin Vechev

(参考訳) ニューラルネットワークの敵攻撃に対する堅牢性を証明することは、自律運転や診断などの安全クリティカルなシステムに確実に採用するために不可欠である。残念なことに、最先端の検証者はより大きなネットワークにスケールしないか、堅牢性を証明するには不正確で、実践的な採用を制限している。本稿では,従来よりもはるかに大きなディープニューラルネットワークのロバスト性を証明可能なスケーラブルな検証器であるGPUPolyを紹介する。 GPUPolyの背後にある重要な技術的洞察は、GPU上のニューラルネットワーク検証のためのカスタムのサウンドポリヘドラアルゴリズムの設計である。我々のアルゴリズムは、基盤となる検証タスクの利用可能なGPU並列性と固有の疎性を活用する。 GPUPolyは大規模ネットワークにスケールする。例えば、約34.5msで1Mのニューロン、34層の深い残留ネットワークの堅牢性を証明することができる。我々は、GPUPolyが現実のニューラルネットワークの実用的な検証に向けた有望なステップであると考えている。

Certifying the robustness of neural networks against adversarial attacks is essential to their reliable adoption in safety-critical systems such as autonomous driving and medical diagnosis. Unfortunately, state-of-the-art verifiers either do not scale to bigger networks or are too imprecise to prove robustness, limiting their practical adoption. In this work, we introduce GPUPoly, a scalable verifier that can prove the robustness of significantly larger deep neural networks than previously possible. The key technical insight behind GPUPoly is the design of custom, sound polyhedra algorithms for neural network verification on a GPU. Our algorithms leverage the available GPU parallelism and inherent sparsity of the underlying verification task. GPUPoly scales to large networks: for example, it can prove the robustness of a 1M neuron, 34-layer deep residual network in approximately 34.5 ms. We believe GPUPoly is a promising step towards practical verification of real-world neural networks.

翻訳日:2022-11-08 13:05:07 公開日:2021-05-18

# プライバシ/レート・歪み理論によるロバスト機械学習

Robust Machine Learning via Privacy/Rate-Distortion Theory ( http://arxiv.org/abs/2007.11693v2 )

ライセンス: Link先を確認

Ye Wang, Shuchin Aeron, Adnan Siraj Rakin, Toshiaki Koike-Akino, Pierre Moulin

(参考訳) ニューラルネットワークの一般的な脆弱性に対処するために、ロバストなマシンラーニングの定式化が登場している。我々の研究は、最適ロバスト学習とプライバシ・ユーティリティ・トレードオフ問題との関連性を引き合いに出し、これは率歪み問題の一般化である。頑健な分類器と対向的な摂動の間のゲームのサドルポイントは、最大条件エントロピー問題の解によって見つけることができる。この情報理論的な観点は、ロバストネスとクリーンなデータ性能の基本的なトレードオフに光を当て、それは最終的に、基礎となるデータ分布と摂動制約の幾何学的構造から生じる。

Robust machine learning formulations have emerged to address the prevalent vulnerability of deep neural networks to adversarial examples. Our work draws the connection between optimal robust learning and the privacy-utility tradeoff problem, which is a generalization of the rate-distortion problem. The saddle point of the game between a robust classifier and an adversarial perturbation can be found via the solution of a maximum conditional entropy problem. This information-theoretic perspective sheds light on the fundamental tradeoff between robustness and clean data performance, which ultimately arises from the geometric structure of the underlying data distribution and perturbation constraints.

翻訳日:2022-11-07 22:38:48 公開日:2021-05-18

# マルチドメイン学習のためのNASを介してアダプタをプラグインすること

What and Where: Learn to Plug Adapters via NAS for Multi-Domain Learning ( http://arxiv.org/abs/2007.12415v2 )

ライセンス: Link先を確認

Hanbin Zhao, Hao Zeng, Xin Qin, Yongjian Fu, Hui Wang, Bourahla Omar, and Xi Li

(参考訳) 重要かつ困難な問題として、マルチドメイン学習(MDL)は一般的に、共通のドメインに依存しないネットワークにプラグインされた、効果的な軽量なドメイン固有アダプタモジュールのセットを探している。通常、既存のアダプタプラグと構造設計の方法は、モデル学習の前にすべてのドメインに対して手作りで固定され、学習の柔軟性と計算集約性をもたらす。このモチベーションにより,neural architecture search (nas) を用いたデータ駆動アダプタ接続戦略を学習し,アダプタモジュールの接続先を自動的に決定する。さらに、NAS駆動学習方式におけるアダプタ構造設計のためのNAS-adapterモジュールを提案し、異なるドメインに対する効果的なアダプタモジュール構造を自動的に発見する。実験結果は,mdlモデルが既存手法と同等の性能条件下での有効性を示す。

As an important and challenging problem, multi-domain learning (MDL) typically seeks for a set of effective lightweight domain-specific adapter modules plugged into a common domain-agnostic network. Usually, existing ways of adapter plugging and structure design are handcrafted and fixed for all domains before model learning, resulting in the learning inflexibility and computational intensiveness. With this motivation, we propose to learn a data-driven adapter plugging strategy with Neural Architecture Search (NAS), which automatically determines where to plug for those adapter modules. Furthermore, we propose a NAS-adapter module for adapter structure design in a NAS-driven learning scheme, which automatically discovers effective adapter module structures for different domains. Experimental results demonstrate the effectiveness of our MDL model against existing approaches under the conditions of comparable performance.

翻訳日:2022-11-07 05:55:32 公開日:2021-05-18

# リコメンダシステムにおける便益的特徴相互作用の検出

Detecting Beneficial Feature Interactions for Recommender Systems ( http://arxiv.org/abs/2008.00404v6 )

ライセンス: Link先を確認

Yixin Su, Rui Zhang, Sarah Erfani, Zhenghua Xu

(参考訳) 特徴相互作用は、推薦システムにおいて高い精度を達成するために不可欠である。多くの研究がそれぞれの特徴の相互作用を考慮に入れている。しかし、いくつかの特徴的相互作用は推奨結果に関係しない可能性があり、それらを考慮してノイズを生じさせ、推奨精度を低下させる可能性があるため、これは最適ではない。特徴的相互作用から最善を尽くすため,提案手法では,特徴的相互作用を効果的にモデル化するグラフニューラルネットワーク手法と,推薦精度の観点から有益である特徴的相互作用を自動的に検出する手法を提案する。自動特徴相互作用検出は、エッジ予測とL0アクティベーション正規化により達成される。提案モデルは,情報ボトルネック原理と統計相互作用理論を用いて有効であることが証明された。実験結果から我々のモデルは (i)既存の基準線を精度で上回り、 (ii) 有用な特徴相互作用を自動的に識別する。

Feature interactions are essential for achieving high accuracy in recommender systems. Many studies take into account the interaction between every pair of features. However, this is suboptimal because some feature interactions may not be that relevant to the recommendation result, and taking them into account may introduce noise and decrease recommendation accuracy. To make the best out of feature interactions, we propose a graph neural network approach to effectively model them, together with a novel technique to automatically detect those feature interactions that are beneficial in terms of recommendation accuracy. The automatic feature interaction detection is achieved via edge prediction with an L0 activation regularization. Our proposed model is proved to be effective through the information bottleneck principle and statistical interaction theory. Experimental results show that our model (i) outperforms existing baselines in terms of accuracy, and (ii) automatically identifies beneficial feature interactions.

翻訳日:2022-11-03 19:28:12 公開日:2021-05-18

# 画像分類のためのメモリ効率の良いクラスインクリメンタル学習

Memory Efficient Class-Incremental Learning for Image Classification ( http://arxiv.org/abs/2008.01411v2 )

ライセンス: Link先を確認

Hanbin Zhao, Hui Wang, Yongjian Fu, Fei Wu, Xi Li

(参考訳) メモリリソース制限の制約により、クラスインクリメンタルラーニング(CIL)は通常、新たに追加されたクラスが到着すると、共同分類モデルを更新する際に「破滅的な忘れる」問題に悩まされる。忘れる問題に対処するため、多くのCILメソッドは、模範的なサンプルをメモリバッファのサイズに制限して保存することで、古いクラスの知識を転送する。メモリバッファをより効率的に活用するために,本研究では,従来の実高忠実な模範サンプルよりも,補助的な低忠実な模範サンプルを維持することを提案する。このようなメモリ効率の良い模範保存スキームは、古いクラスの知識伝達をより効果的にする。しかし、低忠実度例のサンプルは、しばしば元の例のサンプル、すなわちドメインシフトとは別の領域に分散される。この問題を軽減するため、我々は、上記のドメイン間ギャップを大幅に狭めるドメイン互換特徴抽出器と分類器を構築しようとする二重学習スキームを提案する。その結果、これらの低忠実度補助サンプルは、元の例サンプルを低メモリコストで適度に置き換えることができる。さらに, 純粋な真のクラスラベルのサンプルを用いて, バイアス付き分類器(古いクラスに関する蒸留ラベルの知識を含むサンプルを学習)を改良する, 頑健な分類器適応方式を提案する。実験により,本研究の有効性が実証された。

With the memory-resource-limited constraints, class-incremental learning (CIL) usually suffers from the "catastrophic forgetting" problem when updating the joint classification model on the arrival of newly added classes. To cope with the forgetting problem, many CIL methods transfer the knowledge of old classes by preserving some exemplar samples into the size-constrained memory buffer. To utilize the memory buffer more efficiently, we propose to keep more auxiliary low-fidelity exemplar samples rather than the original real high-fidelity exemplar samples. Such a memory-efficient exemplar preserving scheme makes the old-class knowledge transfer more effective. However, the low-fidelity exemplar samples are often distributed in a different domain away from that of the original exemplar samples, that is, a domain shift. To alleviate this problem, we propose a duplet learning scheme that seeks to construct domain-compatible feature extractors and classifiers, which greatly narrows down the above domain gap. As a result, these low-fidelity auxiliary exemplar samples have the ability to moderately replace the original exemplar samples with a lower memory cost. In addition, we present a robust classifier adaptation scheme, which further refines the biased classifier (learned with the samples containing distillation label knowledge about old classes) with the help of the samples of pure true class labels. Experimental results demonstrate the effectiveness of this work against the state-of-the-art approaches.

翻訳日:2022-11-02 23:21:09 公開日:2021-05-18

# 因果規則:不均一な治療効果の解釈的推論

Causal Rule Ensemble: Interpretable Inference of Heterogeneous Treatment Effects ( http://arxiv.org/abs/2009.09036v3 )

ライセンス: Link先を確認

Kwonsang Lee, Falco J. Bargagli-Stoffi, Francesca Dominici

(参考訳) 社会科学や健康科学では、治療が人口平均よりも明らかに大きいか小さい因果効果を持つ研究集団のサブグループを特定することが重要である。近年,因果効果の不均一性に対処するための方法論開発が数多く行われている。一般的なアプローチは、あらかじめ特定された共変量集合が与えられた条件平均処理効果(CATE)を推定することである。しかし、このアプローチは新たな部分群を発見できない。最近の因果機械学習(ML)アプローチでは、多数の観測や共変量が存在する場合、個々のレベルでCATEを推定する。しかしながら、これらのMLアプローチの大部分は、異種部分群の解釈可能な特徴づけを提供していない。本稿では,新しい因果ルールアンサンブル(CRE)法を提案する。 1) 著しく異質な治療効果を持つde novoサブグループ(causal rules)を発見する。 2)これらのサブグループの解釈性は,決定規則によって定義されるので保証する。 3) CATEは, 偏差が小さく, 統計的精度が高いこれらの新発見サブグループのそれぞれについて推定する。新たに発見された因果規則に対する推定因果効果の整合性を保証する理論的結果を提供する。 CREの優れた特徴は、因果規則の発見に使用できるMLアルゴリズムの選択や、因果規則内の因果効果の推定方法に非依存である点である。シミュレーションにより,cre手法は既存の手法に比べて性能が向上し,解釈性が向上することを示す。また,未測定埋没バイアスに対する新しい感度解析も導入した。 CRE法を用いて,大気汚染の長期曝露による死亡率に対する因果的影響に弱いサブグループを同定する。

In social and health sciences, it is critically important to identify subgroups of the study population where a treatment has a notably larger or smaller causal effect compared to the population average. In recent years, there have been many methodological developments for addressing heterogeneity of causal effects. A common approach is to estimate the conditional average treatment effect (CATE) given a pre-specified set of covariates. However, this approach does not allow to discover new subgroups. Recent causal machine learning (ML) approaches estimate the CATE at an individual level in presence of large number of observations and covariates with great accuracy. Nevertheless, the bulk of these ML approaches do not provide an interpretable characterization of the heterogeneous subgroups. In this paper, we propose a new Causal Rule Ensemble (CRE) method that: 1) discovers de novo subgroups with significantly heterogeneous treatment effects (causal rules); 2) ensures interpretability of these subgroups because they are defined in terms of decision rules; and 3) estimates the CATE for each of these newly discovered subgroups with small bias and high statistical precision. We provide theoretical results that guarantee consistency of the estimated causal effects for the newly discovered causal rules. A nice feature of CRE is that it is agnostic to the choices of the ML algorithms that can be used to discover the causal rules, and the estimation methods for the causal effects within the discovered causal rules. Via simulations, we show that the CRE method has competitive performance as compared to existing approaches while providing enhanced interpretability. We also introduce a new sensitivity analysis to unmeasured confounding bias. We apply the CRE method to discover subgroups that are more vulnerable to the causal effects of long-term exposure to air pollution on mortality.

翻訳日:2022-10-17 03:25:08 公開日:2021-05-18

# ロバストか公正か:対人訓練の公正性を目指して

To be Robust or to be Fair: Towards Fairness in Adversarial Training ( http://arxiv.org/abs/2010.06121v2 )

ライセンス: Link先を確認

Han Xu, Xiaorui Liu, Yaxin Li, Anil K. Jain, Jiliang Tang

(参考訳) 敵のトレーニングアルゴリズムは、敵の例に対する機械学習モデルの堅牢性を改善するために信頼できることが証明されている。しかし, 逆行訓練アルゴリズムは, 異なるデータ群間の精度と頑健さの相違が生じやすいことがわかった。例えば、cifar-10の対向的に訓練されたresnet18モデルは、クラス"automobile"では93%のクリーン精度と67%のpgd l-infty-8ロバスト精度を持つが、クラス"cat"では65%と17%しかない。この現象はバランスの取れたデータセットで発生し、クリーンサンプルのみを使用すると自然に訓練されたモデルには存在しない。本研究では,DNNモデルのロバストな誤差を最小限に抑える一般対角訓練アルゴリズムにおいて,この現象が生じることを実証的,理論的に示す。これらの知見に触発されて、敵防衛を行う際の不公平問題を軽減するためのFair-Robust-Learning(FRL)フレームワークを提案する。 FRLの有効性を実験的に検証した。

Adversarial training algorithms have been proved to be reliable to improve machine learning models' robustness against adversarial examples. However, we find that adversarial training algorithms tend to introduce severe disparity of accuracy and robustness between different groups of data. For instance, a PGD adversarially trained ResNet18 model on CIFAR-10 has 93% clean accuracy and 67% PGD l-infty-8 robust accuracy on the class "automobile" but only 65% and 17% on the class "cat". This phenomenon happens in balanced datasets and does not exist in naturally trained models when only using clean samples. In this work, we empirically and theoretically show that this phenomenon can happen under general adversarial training algorithms which minimize DNN models' robust errors. Motivated by these findings, we propose a Fair-Robust-Learning (FRL) framework to mitigate this unfairness problem when doing adversarial defenses. Experimental results validate the effectiveness of FRL.

翻訳日:2022-10-07 22:53:40 公開日:2021-05-18

# タスクとドメインにまたがる技術的疑問

Technical Question Answering across Tasks and Domains ( http://arxiv.org/abs/2010.09780v2 )

ライセンス: Link先を確認

Wenhao Yu, Lingfei Wu, Yu Deng, Qingkai Zeng, Ruchi Mahindru, Sinem Guven, Meng Jiang

(参考訳) 自動技術支援システムの構築は重要な課題である。概念的には、技術的なフォーラムでユーザー質問に答えるためには、人間の専門家がまず関連文書を検索し、答えのスニペットを特定するために慎重に読む必要がある。大きな成功にもかかわらず、研究者たちは一般領域質問応答(QA)に対処することに成功しているが、技術的QAの調査に対する注意ははるかに少ない。具体的には、既存の手法はいくつかの固有の課題に苦しむ (i)質問と回答が実質的に重なることは滅多になく、 (ii)データサイズが非常に限られている。本稿では,タスクやドメイン間での技術的QAを効果的に扱うための,ディープラーニング学習の枠組みを提案する。この目的のために,文書検索と読解作業のための調整可能な共同学習手法を提案する。我々のTechQA実験は最先端手法と比較して優れた性能を示した。

Building automatic technical support system is an important yet challenge task. Conceptually, to answer a user question on a technical forum, a human expert has to first retrieve relevant documents, and then read them carefully to identify the answer snippet. Despite huge success the researchers have achieved in coping with general domain question answering (QA), much less attentions have been paid for investigating technical QA. Specifically, existing methods suffer from several unique challenges (i) the question and answer rarely overlaps substantially and (ii) very limited data size. In this paper, we propose a novel framework of deep transfer learning to effectively address technical QA across tasks and domains. To this end, we present an adjustable joint learning approach for document retrieval and reading comprehension tasks. Our experiments on the TechQA demonstrates superior performance compared with state-of-the-art methods.

翻訳日:2022-10-05 20:29:01 公開日:2021-05-18

# トピック・スペース・トラジェクトリー:機械学習文学の事例研究

Topic Space Trajectories: A case study on machine learning literature ( http://arxiv.org/abs/2010.12294v3 )

ライセンス: Link先を確認

Bastian Sch\"afermeier and Gerd Stumme and Tom Hanika

(参考訳) 科学会場での年次刊行物、例えば会議や雑誌の数は急速に増えている。したがって、研究者にとっても研究トピックとその進捗を追跡することが難しくなる。このタスクでは、研究者は自動出版分析によって支援できる。しかし、そのような方法の多くは解釈不能で純粋に数値表現をもたらす。人的分析者を支援するため,研究トピックを網羅的に追跡する構造であるトピック空間トラジェクトリを提案する。 8つの異なる解析手法に基づいてこれらの軌道を解釈する方法を実証する。その結果,非負の行列係数化と適切な可視化手法が得られた。我々は,32の出版会場から50年間の機械学習研究を対象とする出版コーパスへのアプローチの適用性を示した。本手法は,論文分類,今後の研究課題の予測,未発表の論文提出のための会議や雑誌の掲載を推奨するために利用することができる。

The annual number of publications at scientific venues, for example, conferences and journals, is growing quickly. Hence, even for researchers it becomes harder and harder to keep track of research topics and their progress. In this task, researchers can be supported by automated publication analysis. Yet, many such methods result in uninterpretable, purely numerical representations. As an attempt to support human analysts, we present topic space trajectories, a structure that allows for the comprehensible tracking of research topics. We demonstrate how these trajectories can be interpreted based on eight different analysis approaches. To obtain comprehensible results, we employ non-negative matrix factorization as well as suitable visualization techniques. We show the applicability of our approach on a publication corpus spanning 50 years of machine learning research from 32 publication venues. Our novel analysis method may be employed for paper classification, for the prediction of future research topics, and for the recommendation of fitting conferences and journals for submitting unpublished work.

翻訳日:2022-10-04 00:03:04 公開日:2021-05-18

# 現代・歴史的テキストにおける文字エントロピー:未解読写本の比較尺度

Character Entropy in Modern and Historical Texts: Comparison Metrics for an Undeciphered Manuscript ( http://arxiv.org/abs/2010.14697v2 )

ライセンス: Link先を確認

Luke Lindemann and Claire Bowern

(参考訳) 本稿では,voynich写本を多言語で比較分析するためのコーパスとして,カーリアー言語,スクリバル手,転写システムで区切られたvoynichテキストのコーパス,wikipediaから収集された294言語サンプルのコーパス,8言語で書き起こされた18の歴史的テキストのコーパスの3つのコーパスについて概説する。これらのコーパスは、イェール大学のVoynich Working Groupによるその後の研究で活用される。本稿では,Voynicheseにおける条件付き文字エントロピーの分析により,Voynich文字と言語の特徴を研究するためのコーパスの有用性を実証する。文字エントロピーと言語,スクリプトサイズとタイプ,グリフの構成性,スクリバル規則と略語,位置的文字変種,ビッグラム周波数の相互作用について論じる。この分析は、スクリプト構成性、文字サイズ、予測可能性の間の相互作用を特徴付ける。条件付きエントロピーレベルを自然言語に合わせるには,グリフ合成の実質的な操作が不十分であることを示す。ヴォイニチェ文字の異常に予測可能な性質は、特定のスクリプトや転写システム、基礎言語、置換暗号に起因するものではない。 Voynicheseはコーパスのすべての比較テキストと異なるのは、文字の配置が単語内で非常に制約されているためであり、これは下層の言語から音韻的区別が失われていることを示している。

This paper outlines the creation of three corpora for multilingual comparison and analysis of the Voynich manuscript: a corpus of Voynich texts partitioned by Currier language, scribal hand, and transcription system, a corpus of 294 language samples compiled from Wikipedia, and a corpus of eighteen transcribed historical texts in eight languages. These corpora will be utilized in subsequent work by the Voynich Working Group at Yale University. We demonstrate the utility of these corpora for studying characteristics of the Voynich script and language, with an analysis of conditional character entropy in Voynichese. We discuss the interaction between character entropy and language, script size and type, glyph compositionality, scribal conventions and abbreviations, positional character variants, and bigram frequency. This analysis characterizes the interaction between script compositionality, character size, and predictability. We show that substantial manipulations of glyph composition are not sufficient to align conditional entropy levels with natural languages. The unusually predictable nature of the Voynichese script is not attributable to a particular script or transcription system, underlying language, or substitution cipher. Voynichese is distinct from every comparison text in our corpora because character placement is highly constrained within the word, and this may indicate the loss of phonemic distinctions from the underlying language.

翻訳日:2022-10-02 05:22:06 公開日:2021-05-18

# ガウス重み付きグラフデータベースのアライメントのためのシャープしきい値

Sharp threshold for alignment of graph databases with Gaussian weights ( http://arxiv.org/abs/2010.16295v2 )

ライセンス: Link先を確認

Luca Ganassali

(参考訳) 重み付きグラフ(行列)データベースアライメントにおける再構成の基本的限界について検討する。 2つのグラフのモデルを考えると、$\pi^*$は植込みされた一様置換であり、すべてのエッジウェイトが$(A_{i,j}, B_{\pi^*である。 (i)\pi^* (j)})_{1 \leq i<j \leq n}$ は、ゼロ平均、単位分散、相関パラメータ $\rho \in [0,1]$ を持つガウス変数の対である。もし$n \rho^2 \geq (4+\epsilon) \log n + \omega(1)$ for some $\epsilon>0$なら、正確な再構成を達成するデータベース$a,b$の観測に基づいて、$\hat{\pi}$ --すなわちmap estimator -- が存在することを証明する。逆に、$n \rho^2 \leq 4 \log n - \log \log n - \omega(1)$ ならば、任意の推定器 $\hat{\pi}$ は確率 $o(1)$ で $\hat{\pi}=\pi$ を検証する。この結果から, 精度回復のための情報理論しきい値が, Wuらによる最近の研究(2020年)で得られたものと同一であること, 言い換えればガウス重み付きグラフアライメントでは, 再構築の問題は検出のそれよりも難しくないことがわかった。復元作業はベクトル型データベースアライメント(これは$(u_i, v_{\pi^*)の信号を取る)に対して既によく理解されていた。 (i)})_{1 \leq i\leq n}$ ここで$(u_i, v_{\pi^*) (i)})$ は$\mathbb{r}^{d_u} \times \mathbb{r}^{d_v}$ の i.i.d.ペアであり、グラフ(または行列)データベースの定式化は、ハードフェーズが広く予想されるような、劇的に異なる問題をもたらす。これらの証明は、置換のエネルギーの相関構造の研究とともに、写像推定器と第二モーメント法の解析に基づいている。

We study the fundamental limits for reconstruction in weighted graph (or matrix) database alignment. We consider a model of two graphs where $\pi^*$ is a planted uniform permutation and all pairs of edge weights $(A_{i,j}, B_{\pi^*(i),\pi^*(j)})_{1 \leq i<j \leq n}$ are i.i.d. pairs of Gaussian variables with zero mean, unit variance and correlation parameter $\rho \in [0,1]$. We prove that there is a sharp threshold for exact recovery of $\pi^*$: if $n \rho^2 \geq (4+\epsilon) \log n + \omega(1)$ for some $\epsilon>0$, there is an estimator $\hat{\pi}$ -- namely the MAP estimator -- based on the observation of databases $A,B$ that achieves exact reconstruction with high probability. Conversely, if $n \rho^2 \leq 4 \log n - \log \log n - \omega(1)$, then any estimator $\hat{\pi}$ verifies $\hat{\pi}=\pi$ with probability $o(1)$. This result shows that the information-theoretic threshold for exact recovery is the same as the one obtained for detection in a recent work by Wu et al. (2020): in other words, for Gaussian weighted graph alignment, the problem of reconstruction is not more difficult than that of detection. Though the reconstruction task was already well understood for vector-shaped database alignment (that is taking signal of the form $(u_i, v_{\pi^*(i)})_{1 \leq i\leq n}$ where $(u_i, v_{\pi^*(i)})$ are i.i.d. pairs in $\mathbb{R}^{d_u} \times \mathbb{R}^{d_v}$), its formulation for graph (or matrix) databases brings a drastically different problem for which the hard phase is conjectured to be wide. The proofs build upon the analysis of the MAP estimator and the second moment method, together with the study of the correlation structure of energies of permutations.

翻訳日:2022-10-01 16:27:50 公開日:2021-05-18

# GPRNetを用いた地下事業用モデル再構築システム

GPR-based Model Reconstruction System for Underground Utilities Using GPRNet ( http://arxiv.org/abs/2011.02635v3 )

ライセンス: Link先を確認

Jinglun Feng, Liang Yang, Ejup Hoxha, Diar Sanakov, Stanislav Sotnikov, Jizhong Xiao

(参考訳) 地中レーダ(gpr)は、地下の物体(リバー、ユーティリティパイプ)を検知・発見するための最も重要な非破壊評価(nde)機器の1つである。これまでの多くの研究は、GPR画像に基づく特徴検出のみに焦点を当てており、より詳細な地下物体の非常に微細で詳細な3Dモデルの再構築を成功させるために、粗いGPR測定を処理できない。そこで本稿では,GPRデータを収集し,地下ユーティリティをローカライズし,地下オブジェクトの高密度点クラウドモデルを再構築する,新しいロボットシステムを提案する。このシステムは3つのモジュールから構成される。 1 視覚慣性に基づくGPRデータ収集モジュールで、全方向ロボットの位置情報をGPR計測にタグ付けする。 2) 生のgpr b-scan画像をオブジェクトモデルの断面に解釈するためのディープニューラルネットワーク(dnn)マイグレーションモジュール 3)DNNベースの3D再構成モジュール、すなわちGPRNetは、細かな3Dポイントクラウドを持つ地下ユーティリティモデルを生成する。本稿では,本手法を定量的・定性的に検証し,パイプ状ユーティリティの濃密かつ完全点クラウドモデル,すなわちgpr生データ不完全性および各種ノイズの少ない入力に基づいて生成する手法について検証する。実験の結果, 合成データとフィールドテストデータにより, 本手法の有効性がさらに向上した。

Ground Penetrating Radar (GPR) is one of the most important non-destructive evaluation (NDE) instruments to detect and locate underground objects (i.e., rebars, utility pipes). Many previous researches focus on GPR image-based feature detection only, and none can process sparse GPR measurements to successfully reconstruct a very fine and detailed 3D model of underground objects for better visualization. To address this problem, this paper presents a novel robotic system to collect GPR data, localize the underground utilities, and reconstruct the underground objects' dense point cloud model. This system is composed of three modules: 1) visual-inertial-based GPR data collection module, which tags the GPR measurements with positioning information provided by an omnidirectional robot; 2) a deep neural network (DNN) migration module to interpret the raw GPR B-scan image into a cross-section of object model; 3) a DNN-based 3D reconstruction module, i.e., GPRNet, to generate underground utility model with the fine 3D point cloud. In this paper, both the quantitative and qualitative experiment results verify our method that can generate a dense and complete point cloud model of pipe-shaped utilities based on a sparse input, i.e., GPR raw data incompleteness and various noise. The experiment results on synthetic data and field test data further support the effectiveness of our approach.

翻訳日:2022-09-29 13:01:00 公開日:2021-05-18

# CODER:用語正規化のための言語間医療用語埋め込みの知識注入

CODER: Knowledge infused cross-lingual medical term embedding for term normalization ( http://arxiv.org/abs/2011.02947v3 )

ライセンス: Link先を確認

Zheng Yuan and Zhengyun Zhao and Haixia Sun and Jiao Li and Fei Wang and Sheng Yu

(参考訳) 本稿では, 言語間医療用語表現のための知識グラフを用いた比較学習コーダを提案する。 CODERは医療用語の正規化のために設計されており、同じまたは類似の医療概念を言語間サポートで表す異なる用語のクローズドベクター表現を提供する。統合医療言語システム (unified medical language system) と呼ばれる医学知識グラフ (kg) 上の対比学習を通じてコーダを訓練し, kg から用語と関係の3重項を用いて類似度を計算する。関係性のあるトレーニングは、医療知識を埋め込みに注入し、より優れた機械学習機能を提供することを目指している。我々は,ゼロショット項正規化,意味的類似性,関係分類ベンチマークにおけるコーダの評価を行い,コーダアウトが様々な最先端の生物医学用語埋め込み,概念埋め込み,文脈埋め込みを行うことを示した。私たちのコードとモデルはhttps://github.com/ganjinzero/coderで利用可能です。

This paper proposes CODER: contrastive learning on knowledge graphs for cross-lingual medical term representation. CODER is designed for medical term normalization by providing close vector representations for different terms that represent the same or similar medical concepts with cross-lingual support. We train CODER via contrastive learning on a medical knowledge graph (KG) named the Unified Medical Language System, where similarities are calculated utilizing both terms and relation triplets from KG. Training with relations injects medical knowledge into embeddings and aims to provide potentially better machine learning features. We evaluate CODER in zero-shot term normalization, semantic similarity, and relation classification benchmarks, which show that CODERoutperforms various state-of-the-art biomedical word embedding, concept embeddings, and contextual embeddings. Our codes and models are available at https://github.com/GanjinZero/CODER.

翻訳日:2022-09-29 11:57:13 公開日:2021-05-18

# Margin-dynamic-softmax Lossによる深いクロスモーダルハッシュ

Deep Cross-modal Hashing via Margin-dynamic-softmax Loss ( http://arxiv.org/abs/2011.03451v2 )

ライセンス: Link先を確認

Rong-Cheng Tu, Xian-Ling Mao, Rongxin Tu, Binbin Bian, Wei Wei, Heyan Huang

(参考訳) クロスモーダル検索作業における高い検索効率と低ストレージコストのため,クロスモーダルハッシュ法が注目されている。教師付きクロスモーダルハッシュ法では,データポイントのラベルに十分に含まれている意味情報を学習ハッシュコードに保存させる方法が検索性能向上の鍵となる。したがって、ほとんど全ての教師付きクロスモーダルハッシュ手法は、通常、ハッシュモデルの学習を完全にまたは部分的に導くためにラベル情報を持つデータポイント間の類似性を定義することに依存する。しかし、データポイント間の定義された類似性は、部分的にデータポイントのラベル情報を取り込み、豊富な意味情報を見逃し、検索性能のさらなる向上を妨げる。そこで,本研究では,データポイント間の類似性を定義せずに,新しいクロスモーダルハッシュ法を提案し,それをDCHML(textit{Margin-dynamic-softmax Loss})と呼ぶ。具体的には、dchmlはまずプロキシハッシュネットワークを訓練し、データセットの各カテゴリ情報をプロキシハッシュコードと呼ばれるセマンティック識別ハッシュコードに変換する。各プロキシハッシュコードは、対応するカテゴリのセマンティック情報を適切に保存することができる。次に、モダリティ固有のハッシュネットワークのトレーニングプロセスを監督するためにデータポイント間の類似性を定義することなく、プロキシハッシュコードを教師付き情報として直接利用する新しい \textit{margin-dynamic-softmax loss} を提案する。最後に、新しい \textit{margin-dynamic-softmax loss} を最小化することで、モダリティ固有のハッシュネットワークを訓練して、クロスモーダル類似性と豊富な意味情報を同時に保存できるハッシュコードを生成することができる。

Due to their high retrieval efficiency and low storage cost for cross-modal search task, cross-modal hashing methods have attracted considerable attention. For the supervised cross-modal hashing methods, how to make the learned hash codes preserve semantic information sufficiently contained in the label of datapoints is the key to further enhance the retrieval performance. Hence, almost all supervised cross-modal hashing methods usually depends on defining a similarity between datapoints with the label information to guide the hashing model learning fully or partly. However, the defined similarity between datapoints can only capture the label information of datapoints partially and misses abundant semantic information, then hinders the further improvement of retrieval performance. Thus, in this paper, different from previous works, we propose a novel cross-modal hashing method without defining the similarity between datapoints, called Deep Cross-modal Hashing via \textit{Margin-dynamic-softmax Loss} (DCHML). Specifically, DCHML first trains a proxy hashing network to transform each category information of a dataset into a semantic discriminative hash code, called proxy hash code. Each proxy hash code can preserve the semantic information of its corresponding category well. Next, without defining the similarity between datapoints to supervise the training process of the modality-specific hashing networks , we propose a novel \textit{margin-dynamic-softmax loss} to directly utilize the proxy hashing codes as supervised information. Finally, by minimizing the novel \textit{margin-dynamic-softmax loss}, the modality-specific hashing networks can be trained to generate hash codes which can simultaneously preserve the cross-modal similarity and abundant semantic information well.

翻訳日:2022-09-29 05:43:26 公開日:2021-05-18

# PairRE: ペア関係ベクトルによる知識グラフ埋め込み

PairRE: Knowledge Graph Embeddings via Paired Relation Vectors ( http://arxiv.org/abs/2011.03798v3 )

ライセンス: Link先を確認

Linlin Chao, Jianshan He, Taifeng Wang, Wei Chu

(参考訳) N-to-1, 1-to-N, N-to-Nなどの複雑な関係を扱う能力と、対称性や反対称性などの様々な関係パターンを符号化する能力である。しかし、既存の手法ではこれら2つの問題を同時に解くことができず、結果が不十分である。この問題を軽減するために,各関係表現に対してペアベクトルを持つモデルであるペアレを提案する。ペアベクトルは、損失関数のマージンの適応的な調整を複素関係に適合させることができる。加えて、PairREは3つの重要な関係パターン、対称性/反対称性、逆および合成を符号化することができる。関係表現に関する単純な制約が与えられた場合、PairREはさらにサブリレーションをエンコードできる。リンク予測ベンチマークの実験は、ペアリングの鍵となる機能を示す。さらに、我々は2つの知識グラフデータセットに新しい最先端のOpen Graphベンチマークを設定した。

Distance based knowledge graph embedding methods show promising results on link prediction task, on which two topics have been widely studied: one is the ability to handle complex relations, such as N-to-1, 1-to-N and N-to-N, the other is to encode various relation patterns, such as symmetry/antisymmetry. However, the existing methods fail to solve these two problems at the same time, which leads to unsatisfactory results. To mitigate this problem, we propose PairRE, a model with paired vectors for each relation representation. The paired vectors enable an adaptive adjustment of the margin in loss function to fit for complex relations. Besides, PairRE is capable of encoding three important relation patterns, symmetry/antisymmetry, inverse and composition. Given simple constraints on relation representations, PairRE can encode subrelation further. Experiments on link prediction benchmarks demonstrate the proposed key capabilities of PairRE. Moreover, We set a new state-of-the-art on two knowledge graph datasets of the challenging Open Graph Benchmark.

翻訳日:2022-09-28 22:05:59 公開日:2021-05-18

# poisson multi-bernoulli mixture filterによる点と拡張目標の共存

A Poisson multi-Bernoulli mixture filter for coexisting point and extended targets ( http://arxiv.org/abs/2011.04464v2 )

ライセンス: Link先を確認

\'Angel F. Garc\'ia-Fern\'andez, Jason L. Williams, Lennart Svensson, Yuxuan Xia

(参考訳) 本稿では,Poisson Multi-Bernoulli Mixing (PMBM) フィルタを提案する。 PMBMフィルタは、データアソシエーションの確率情報に基づいて、マルチターゲットフィルタリング後部を計算するための再帰と、単一ターゲット予測と更新を提供する。本稿では,まずPMBMフィルタを一般化された測定モデルに適用し,点と拡張対象から導出する測定値を含む手法を提案する。次に,ポイントと拡張対象の両方に対応し,ポイントターゲットに対するガウス密度と拡張ターゲットに対するガンマガウス逆ウィッシュアート密度を伝播するフィルタリング再帰を導出する単一ターゲット空間を提案する。また,PMBMフィルタの計算効率のよい近似法として,ポアソン・マルチベルヌーリフィルタ(PMB)を開発した。結果のフィルタは数値シミュレーションによって解析される。

This paper proposes a Poisson multi-Bernoulli mixture (PMBM) filter for coexisting point and extended targets, i.e., for scenarios where there may be simultaneous point and extended targets. The PMBM filter provides a recursion to compute the multi-target filtering posterior based on probabilistic information on data associations, and single-target predictions and updates. In this paper, we first derive the PMBM filter update for a generalised measurement model, which can include measurements originated from point and extended targets. Second, we propose a single-target space that accommodates both point and extended targets and derive the filtering recursion that propagates Gaussian densities for point targets and gamma Gaussian inverse Wishart densities for extended targets. As a computationally efficient approximation of the PMBM filter, we also develop a Poisson multi-Bernoulli (PMB) filter for coexisting point and extended targets. The resulting filters are analysed via numerical simulations.

翻訳日:2022-09-28 02:28:10 公開日:2021-05-18

# 深部ニューラルネットワークと校正データを用いた広視野小開口望遠鏡の点展開関数推定

Point Spread Function Estimation for Wide Field Small Aperture Telescopes with Deep Neural Networks and Calibration Data ( http://arxiv.org/abs/2011.10243v2 )

ライセンス: Link先を確認

Peng Jia, Xuebo Wu, Zhengyang Li, Bo Li, Weihua Wang, Qiang Liu, Adam Popowicz

(参考訳) 点拡散関数(PSF)は望遠鏡の状態を反映し、PSFベースのアストロメトリー、測光、画像復元などのデータ処理手法の発展に重要な役割を果たしている。しかし、広視野小型開口望遠鏡(WFSAT)では、光学系によって誘導される収差が非常に複雑であり、恒星画像の信号対雑音比が低すぎるため、視野全体の位置でPSFを推定するのは困難である。本稿では,より深いニューラルネットワーク(DNN)に基づくPSFモデリング法を開発し,そのPSF推定への応用を示す。望遠鏡アライメントとテストの段階では、光学素子を工学的許容範囲(傾きとまともさ)で修正することで、システムキャリブレーションデータを収集する。次に、これらのデータを用いてDNN(Tel-Net)を訓練する。訓練後、Tel-Netは、複数の離散サンプリングされた星画像から任意の視野でPSFを推定できる。本手法の性能評価にはシミュレーションデータと実験データの両方を用いた。その結果,Tel-Netはどの状態でもFoVの任意の位置でもWFSATのPSFを再構築できることがわかった。比較した古典的手法である逆距離重み (IDW) の補間結果よりも, はるかに精度が高い。提案手法は,PSFの強い事前情報を必要とするWFSATのためのディープニューラルネットワークに基づくデータ処理手法の開発の基礎となる。

The point spread function (PSF) reflects states of a telescope and plays an important role in development of data processing methods, such as PSF based astrometry, photometry and image restoration. However, for wide field small aperture telescopes (WFSATs), estimating PSF in any position of the whole field of view is hard, because aberrations induced by the optical system are quite complex and the signal to noise ratio of star images is often too low for PSF estimation. In this paper, we further develop our deep neural network (DNN) based PSF modelling method and show its applications in PSF estimation. During the telescope alignment and testing stage, our method collects system calibration data through modification of optical elements within engineering tolerances (tilting and decentering). Then we use these data to train a DNN (Tel--Net). After training, the Tel--Net can estimate PSF in any field of view from several discretely sampled star images. We use both simulated and experimental data to test performance of our method. The results show that the Tel--Net can successfully reconstruct PSFs of WFSATs of any states and in any positions of the FoV. Its results are significantly more precise than results obtained by the compared classic method - Inverse Distance Weight (IDW) interpolation. Our method provides foundations for developing of deep neural network based data processing methods for WFSATs, which require strong prior information of PSFs.

翻訳日:2022-09-23 06:52:40 公開日:2021-05-18

# 深部領域一般化のためのバッチ正規化埋め込み

Batch Normalization Embeddings for Deep Domain Generalization ( http://arxiv.org/abs/2011.12672v3 )

ライセンス: Link先を確認

Mattia Segu, Alessio Tonioni, Federico Tombari

(参考訳) ドメインの一般化は、異なるドメインと見えないドメインで堅牢に実行されるように機械学習モデルをトレーニングすることを目的としている。最近では、複数のデータセットを使用してモデルをトレーニングし、ドメイン不変の機能を抽出している。まず、アドホックなバッチ正規化レイヤを使用してドメイン依存表現を明示的にトレーニングし、独立したドメインの統計を収集します。そこで我々は,これらの統計データを用いて,距離関数を用いて領域へのメンバシップを計測できる共有潜在空間の領域をマッピングする。テスト時には、未知の領域から同じ空間にサンプルを投影し、既知の領域の線形結合としてそれらの領域の特性を推論する。トレーニングとテスト時に同じマッピング戦略を適用し、潜在表現と強力で軽量なアンサンブルモデルの両方を学習します。一般的なドメイン一般化ベンチマーク(pacs、office-31、office-caltech)では、現在の最先端技術よりも分類精度が大幅に向上している。

Domain generalization aims at training machine learning models to perform robustly across different and unseen domains. Several recent methods use multiple datasets to train models to extract domain-invariant features, hoping to generalize to unseen domains. Instead, first we explicitly train domain-dependant representations by using ad-hoc batch normalization layers to collect independent domain's statistics. Then, we propose to use these statistics to map domains in a shared latent space, where membership to a domain can be measured by means of a distance function. At test time, we project samples from an unknown domain into the same space and infer properties of their domain as a linear combination of the known ones. We apply the same mapping strategy at training and test time, learning both a latent representation and a powerful but lightweight ensemble model. We show a significant increase in classification accuracy over current state-of-the-art techniques on popular domain generalization benchmarks: PACS, Office-31 and Office-Caltech.

翻訳日:2022-09-21 02:11:20 公開日:2021-05-18

# 3DSNet: 教師なし形状の3Dスタイル転送

3DSNet: Unsupervised Shape-to-Shape 3D Style Transfer ( http://arxiv.org/abs/2011.13388v4 )

ライセンス: Link先を確認

Mattia Segu, Margarita Grinvald, Roland Siegwart, Federico Tombari

(参考訳) あるイメージから別のイメージへスタイルを転送することは、コンピュータビジョンにおいて広く研究されている課題である。しかし、3d設定でのスタイル転送は、ほとんど未解決の問題である。そこで本研究では,不整合コンテンツとスタイル表現に基づく3次元オブジェクト間のスタイル伝達のための学習ベースアプローチを提案する。提案手法は, 点雲とメッシュの2つの形状を合成し, ソースとターゲットの3dモデルの内容とスタイルを組み合わせて, ソースの内容を保持しながら, ターゲットのスタイルに類似した新たな形状を生成する。さらに,提案手法を拡張して,選択した領域のマルチモーダル分布を暗黙的に学習する。学習した分布からスタイルコードをサンプリングすることで、モデルが入力形状に表現できるスタイルの種類を増加させます。実験により,多くのベンチマークにおいて提案手法の有効性が検証された。私たちのフレームワークの実装は受け入れ次第リリースします。

Transferring the style from one image onto another is a popular and widely studied task in computer vision. Yet, style transfer in the 3D setting remains a largely unexplored problem. To our knowledge, we propose the first learning-based approach for style transfer between 3D objects based on disentangled content and style representations. The proposed method can synthesize new 3D shapes both in the form of point clouds and meshes, combining the content and style of a source and target 3D model to generate a novel shape that resembles in style the target while retaining the source content. Furthermore, we extend our technique to implicitly learn the multimodal style distribution of the chosen domains. By sampling style codes from the learned distributions, we increase the variety of styles that our model can confer to an input shape. Experimental results validate the effectiveness of the proposed 3D style transfer method on a number of benchmarks. The implementation of our framework will be released upon acceptance.

翻訳日:2022-09-20 08:28:14 公開日:2021-05-18

# SemSegLoss:セマンティックセグメンテーションのための損失関数のpythonパッケージ

SemSegLoss: A python package of loss functions for semantic segmentation ( http://arxiv.org/abs/2106.05844v1 )

ライセンス: Link先を確認

Shruti Jadon

(参考訳) Image Segmentationは、自動疾患検出から自動運転車まで、幅広い用途があるため、活発な研究分野である。近年、様々な研究論文がバイアスデータ、スパースセグメンテーション、不均衡データセットの場合に使用される異なる損失関数を提案している。本稿では,画像セグメンテーションに広く用いられているよく知られた損失関数のいくつかからなるピソンパッケージであるSemSegLossを紹介する。研究者が新規な損失関数の開発を支援し、様々なアプリケーションのためのモデルアーキテクチャに関する広範な実験を行うために開発された。提案パッケージの使いやすさと柔軟性により、開発時間を短縮し、セマンティックセグメンテーションのための機械学習モデルの評価戦略が強化された。さらに、イメージセグメンテーションを使用するアプリケーションは、関数の一般性のためにSemSegLossを使用することができる。この幅広い応用は、あらゆる産業におけるAIの発展と成長につながるだろう。

Image Segmentation has been an active field of research as it has a wide range of applications, ranging from automated disease detection to self-driving cars. In recent years, various research papers proposed different loss functions used in case of biased data, sparse segmentation, and unbalanced dataset. In this paper, we introduce SemSegLoss, a python package consisting of some of the well-known loss functions widely used for image segmentation. It is developed with the intent to help researchers in the development of novel loss functions and perform an extensive set of experiments on model architectures for various applications. The ease-of-use and flexibility of the presented package have allowed reducing the development time and increased evaluation strategies of machine learning models for semantic segmentation. Furthermore, different applications that use image segmentation can use SemSegLoss because of the generality of its functions. This wide range of applications will lead to the development and growth of AI across all industries.

翻訳日:2021-06-13 13:56:50 公開日:2021-05-18

# (参考訳) 言語間低リソース音声認識のためのアダプタの活用

Exploiting Adapters for Cross-lingual Low-resource Speech Recognition ( http://arxiv.org/abs/2105.11905v1 )

ライセンス: CC BY 4.0

Wenxin Hou, Han Zhu, Yidong Wang, Jindong Wang, Tao Qin, Renjun Xu, Takahiro Shinozaki

(参考訳) 言語間適応は、複数のリッチリソース言語を利用して低リソースターゲット言語のためのモデルを構築する問題を解決することを目的としている。低リソース言語は訓練データに制限があるため、音声認識モデルは容易に過度に適合する。本稿では,パラメータ効率のよい言語間音声適応のための複数のアダプタの性能について検討する。アダプタを暗黙的に活用するこれまでのMetaAdapterに基づいて,アダプタから知識を明示的に学習するSimAdapterと呼ばれる新しいアルゴリズムを提案する。 metaadapterはメタラーニングを利用して、トレーニングデータからテスト言語に一般的な知識を転送します。 SimAdapterは、アダプタを使って微調整中にソース言語とターゲット言語の類似性を学ぶことを目的としている。我々はCommon Voiceデータセットで5つの低リソース言語について広範な実験を行った。その結果、メタアダプタとシムアダプタはWERを2.98%、2.55%減らすことができ、トレーニング可能なパラメータは2.5%と15.5%に留まった。さらに,これら2つのアルゴリズムを最大3.55%のwar削減で性能向上のために統合可能であることを示した。

Cross-lingual speech adaptation aims to solve the problem of leveraging multiple rich-resource languages to build models for a low-resource target language. Since the low-resource language has limited training data, speech recognition models can easily overfit. In this paper, we propose to use adapters to investigate the performance of multiple adapters for parameter-efficient cross-lingual speech adaptation. Based on our previous MetaAdapter that implicitly leverages adapters, we propose a novel algorithms called SimAdapter for explicitly learning knowledge from adapters. Our algorithm leverages adapters which can be easily integrated into the Transformer structure.MetaAdapter leverages meta-learning to transfer the general knowledge from training data to the test language. SimAdapter aims to learn the similarities between the source and target languages during fine-tuning using the adapters. We conduct extensive experiments on five-low-resource languages in Common Voice dataset. Results demonstrate that our MetaAdapter and SimAdapter methods can reduce WER by 2.98% and 2.55% with only 2.5% and 15.5% of trainable parameters compared to the strong full-model fine-tuning baseline. Moreover, we also show that these two novel algorithms can be integrated for better performance with up to 3.55% relative WER reduction.

翻訳日:2021-06-06 10:29:23 公開日:2021-05-18

# (参考訳) 高次微分方程式と非線形微分方程式を解くための効率的かつ効率的な方法:比列ネット

An Effective and Efficient Method to Solve the High-Order and the Non-Linear Ordinary Differential Equations: the Ratio Net ( http://arxiv.org/abs/2105.11309v1 )

ライセンス: CC BY 4.0

Chen-Xin Qin, Ru-Hao Liu, Mao-Cai Li, Chi-Chun Zhou, and Yi-Liua

(参考訳) 高次および非線形常微分方程式を解く効率的かつ効率的な方法が提供される。その方法は比率ネットに基づいている。本手法を多項式法や多層パーセプトロンネットワーク法などの既存手法と比較することにより,比重ネットが良好な結果を与え,高い効率を示すことを示す。

An effective and efficient method that solves the high-order and the non-linear ordinary differential equations is provided. The method is based on the ratio net. By comparing the method with existing methods such as the polynomial based method and the multilayer perceptron network based method, we show that the ratio net gives good results and has higher efficiency.

翻訳日:2021-06-06 10:07:52 公開日:2021-05-18

# (参考訳) アプリケーションモデルの進化に関する一般的な理論 -- フルバージョン

A General Theory for the Evolution of Application Models -- Full version ( http://arxiv.org/abs/2105.11308v1 )

ライセンス: CC BY 4.0

H. A. Proper and Th. P. van der Weide

(参考訳) 本稿では,情報システムの発展に焦点をあてる。まず、進化の概念の軽視が提供され、そのような進化の一般理論への最初の試みとなる。この理論は、概念レベルでの基盤となる情報構造、一方での進化、他方で情報構造とその集団に関する操作の説明と意味論を区別する。この理論の主な問題は、オブジェクトの型付け、型関連性、オブジェクトの識別である。これらの概念の観点から、進化の健全性に関するいくつかの公理を提案する。この一般的な理論では、基礎となるデータモデルはパラメータであり、オブジェクト・ロール・モデリングやオブジェクト指向技術を含む幅広いモデリング手法に適用できる。

In this article we focus on evolving information systems. First a delimitation of the concept of evolution is provided, resulting in a first attempt to a general theory for such evolutions. The theory makes a distinction between the underlying information structure at the conceptual level, its evolution on the one hand, and the description and semantics of operations on the information structure and its population on the other hand. Main issues within this theory are object typing, type relatedness and identification of objects. In terms of these concepts, we propose some axioms on the well-formedness of evolution. In this general theory, the underlying data model is a parameter, making the theory applicable for a wide range of modelling techniques, including object-role modelling and object oriented techniques.

翻訳日:2021-06-06 09:59:24 公開日:2021-05-18

# 単純なリッジペナルティで公平を達成する

Achieving Fairness with a Simple Ridge Penalty ( http://arxiv.org/abs/2105.13817v1 )

ライセンス: Link先を確認

Marco Scutari and Manuel Proissl

(参考訳) ユーザ定義の公正度に基づく線形回帰モデルの推定は、2次制約を持つ非凸2次プログラミング最適化問題を解くことで達成できる。本研究では,尾根ペナルティによってユーザ定義の公平度を強制する,このタスクに対する代替的で柔軟なアプローチを提案する。提案手法は, より直感的に解釈できる回帰係数の推定値を生成すること, 数学的に単純で, 部分的に閉じた解を持つこと, 線形回帰を超えて拡張しやすいこと, の3つの制限に対処する。両手法を5つの異なるデータセットで実証的に評価し,提案手法が適合性の向上と予測精度の向上をもたらすとともに,所望の公平性レベルを達成するのに等しく有効であることを見出した。さらに,非凸2次アプローチの当初の実験的評価におけるバイアスの源泉を明らかにするとともに,提案手法を広範囲なモデルに拡張する方法について論じる。

Estimating a fair linear regression model subject to a user-defined level of fairness can be achieved by solving a non-convex quadratic programming optimisation problem with quadratic constraints. In this work we propose an alternative, more flexible approach to this task that enforces a user-defined level of fairness by means of a ridge penalty. Our proposal addresses three limitations of the former approach: it produces regression coefficient estimates that are more intuitive to interpret; it is mathematically simpler, with a solution that is partly in closed form; and it is easier to extend beyond linear regression. We evaluate both approaches empirically on five different data sets, and we find that our proposal provides better goodness of fit and better predictive accuracy while being equally effective at achieving the desired fairness level. In addition we highlight a source of bias in the original experimental evaluation of the non-convex quadratic approach, and we discuss how our proposal can be extended to a wide range of models.

翻訳日:2021-06-06 08:51:07 公開日:2021-05-18

# データからのダイナミクス学習のためのエネルギー保存ニューラルネットワークのベンチマーク

Benchmarking Energy-Conserving Neural Networks for Learning Dynamics from Data ( http://arxiv.org/abs/2012.02334v4 )

ライセンス: Link先を確認

Yaofeng Desmond Zhong, Biswadip Dey, Amit Chakraborty

(参考訳) ここ数年、深層学習フレームワークに物理学に基づく帰納的バイアスを導入することへの関心が高まっている。特に、観測された時系列データからダイナミクスを学習するためにニューラルネットワークを使用しながら、エネルギー保存を強制する方法を模索する文献が増えている。本研究では,HNN,LNN,DeLaN,SymanODEN,CHNN,CLNNなど10種類のエネルギー保存型ニューラルネットワークモデルについて検討した。これらのモデルの背後にある理論をコンパクトに導出し、それらの類似性と相違を説明する。性能は4つの物理系で比較される。エネルギーベースコントローラの設計にこれらのエネルギー保存モデルを活用する可能性について指摘する。

The last few years have witnessed an increased interest in incorporating physics-informed inductive bias in deep learning frameworks. In particular, a growing volume of literature has been exploring ways to enforce energy conservation while using neural networks for learning dynamics from observed time-series data. In this work, we survey ten recently proposed energy-conserving neural network models, including HNN, LNN, DeLaN, SymODEN, CHNN, CLNN and their variants. We provide a compact derivation of the theory behind these models and explain their similarities and differences. Their performance are compared in 4 physical systems. We point out the possibility of leveraging some of these energy-conserving models to design energy-based controllers.

翻訳日:2021-05-23 15:01:30 公開日:2021-05-18

# (参考訳) 確率論的モデルによる薬物発見予測の不確実性の定量化

Quantifying sources of uncertainty in drug discovery predictions with probabilistic models ( http://arxiv.org/abs/2105.09474v1 )

ライセンス: CC BY-SA 4.0

Stanley E. Lazic, Dominic P. Williams

(参考訳) 予測の不確実性を知ることは、高価な投資判断を行う場合や患者の安全性が最重要となる場合に重要であるが、薬物発見における機械学習(ml)モデルは、一般的には単一の最良の見積もりを提供し、すべての不確実性源を無視する。したがって、これらのモデルからの予測は自信過剰であり、失敗する運命にある化合物がさらに開発されると、患者をリスクと廃棄物にすることができる。確率的予測モデル(PPM)は、データとモデルの両方に不確実性を取り入れ、予測の不確実性を表す予測値の分布を返す。 PPMは、いつ予測が不確実であるかをユーザーに知らせるだけでなく、これらのモデルからの直感的なアウトプットによって、コミュニケーションのリスクと意思決定がより簡単になる。多くの一般的な機械学習メソッドは、PPMまたはベイジアンアナログを持ち、PPMを現在のワークフローに簡単に適合させることができる。我々は毒性予測を例に挙げるが、薬物発見に使用される全ての予測モデルにも同様の原理を適用する。不確実性を無視した結果や、不確実性の原因となるPPMについても述べられている。我々は議論を広く非数学的な聴衆に公開することを目指している。方程式は数学的読者向けに具体化するために提供され(しかし、理解を失うことなくスキップできる)、計算研究者はコードを利用できる(https://github.com/stanlazic/ml_uncertainty_quantification)。

Knowing the uncertainty in a prediction is critical when making expensive investment decisions and when patient safety is paramount, but machine learning (ML) models in drug discovery typically provide only a single best estimate and ignore all sources of uncertainty. Predictions from these models may therefore be over-confident, which can put patients at risk and waste resources when compounds that are destined to fail are further developed. Probabilistic predictive models (PPMs) can incorporate uncertainty in both the data and model, and return a distribution of predicted values that represents the uncertainty in the prediction. PPMs not only let users know when predictions are uncertain, but the intuitive output from these models makes communicating risk easier and decision making better. Many popular machine learning methods have a PPM or Bayesian analogue, making PPMs easy to fit into current workflows. We use toxicity prediction as a running example, but the same principles apply for all prediction models used in drug discovery. The consequences of ignoring uncertainty and how PPMs account for uncertainty are also described. We aim to make the discussion accessible to a broad non-mathematical audience. Equations are provided to make ideas concrete for mathematical readers (but can be skipped without loss of understanding) and code is available for computational researchers (https://github.com/stanlazic/ML_uncertainty_quantification).

翻訳日:2021-05-22 01:53:30 公開日:2021-05-18

# (参考訳) 構造力学と振動の解法と反転のための深層学習

Deep learning for solution and inversion of structural mechanics and vibrations ( http://arxiv.org/abs/2105.09477v1 )

ライセンス: CC BY 4.0

Ehsan Haghighat, Ali Can Bekar, Erdogan Madenci, Ruben Juanes

(参考訳) ディープラーニングはここ数年でもっとも人気のある機械学習手法だ。本章では,構造力学および振動問題に対する深層学習と物理インフォームドニューラルネットワークの適用について述べる。演示問題は、データのデノイズ化、時間依存の常微分方程式と偏微分方程式の解、与えられたデータに対するシステムの応答を特徴づけることである。

Deep learning has been the most popular machine learning method in the last few years. In this chapter, we present the application of deep learning and physics-informed neural networks concerning structural mechanics and vibration problems. Demonstration problems involve de-noising data, solution to time-dependent ordinary and partial differential equations, and characterizing the system's response for a given data.

翻訳日:2021-05-22 01:52:22 公開日:2021-05-18

# (参考訳) 光gbmに基づく海洋水中の支配波周期の予測

A LightGBM based Forecasting of Dominant Wave Periods in Oceanic Waters ( http://arxiv.org/abs/2105.08721v1 )

ライセンス: CC BY 4.0

Pujan Pokhrel, Elias Ioup, Md Tamjidul Hoque, Mahdi Abdelguerfi and Julian Simeonov

(参考訳) 本稿では,海洋水中の優占波周期を予測するための光勾配ブースティング(lightgbm)を提案する。まず,CDIPブイから収集したデータを用いて,様々なデータフィルタリング手法を適用する。データフィルタリングにより、トレーニングと検証のために高品質なデータセットを得ることができる。次に, 波高, 周期, 歪, 曲率などの波面特性と, ブイの湿度, 圧力, 気温などの大気特性を抽出する。その後、hvブロッククロスバリデーション方式を用いてLightGBMとExtra Treesを使用するアルゴリズムを訓練し、最大30日間の波浪期間を予測する。 lightgbm の r2 スコアは 0.94, 0.94, 0.94 で、1 日先、15 日先、15 日先、予測 30 日先である。同様に、エクストラツリー(ET)は1日先、15日前、30日前、R2スコアが0.88、0.86、0.85である。テストデータセットの場合、lightgbmのr2スコアは 0.94, 0.94, 0.94で、1日前、15日前、30日前である。 ET の R2 スコアは 0.88, 0.86, 0.85 であり、1 日先、15 日前、30 日先、予測されている。同様のR2スコアとテストデータセットは、本論文で開発された機械学習モデルが堅牢であることを示している。 LightGBM アルゴリズムはテスト対象のウィンドウに対して ET よりも優れており、最終アルゴリズムとして扱われる。予測地平線が大きくなるにつれて,両手法の性能は著しく低下しない。同様に,提案手法は,本論文に含まれる数値的アプローチよりも優れている。 1日間の予測のために、提案アルゴリズムはsi, bias, cc, rmseを0.09, 0.00, 0.97, 1.78とし、欧州中距離気象予報センター(ecmwf)モデルの0.268, 0.40, 0.63, 2.18と比較した。

In this paper, we propose a Light Gradient Boosting (LightGBM) to forecast dominant wave periods in oceanic waters. First, we use the data collected from CDIP buoys and apply various data filtering methods. The data filtering methods allow us to obtain a high-quality dataset for training and validation purposes. We then extract various wave-based features like wave heights, periods, skewness, kurtosis, etc., and atmospheric features like humidity, pressure, and air temperature for the buoys. Afterward, we train algorithms that use LightGBM and Extra Trees through a hv-block cross-validation scheme to forecast dominant wave periods for up to 30 days ahead. LightGBM has the R2 score of 0.94, 0.94, and 0.94 for 1-day ahead, 15-day ahead, and 30-day ahead prediction. Similarly, Extra Trees (ET) has an R2 score of 0.88, 0.86, and 0.85 for 1-day ahead, 15-day ahead, and 30 day ahead prediction. In case of the test dataset, LightGBM has R2 score of 0.94, 0.94, and 0.94 for 1-day ahead, 15-day ahead and 30-day ahead prediction. ET has R2 score of 0.88, 0.86, and 0.85 for 1-day ahead, 15-day ahead, and 30-day ahead prediction. A similar R2 score for both training and the test dataset suggests that the machine learning models developed in this paper are robust. Since the LightGBM algorithm outperforms ET for all the windows tested, it is taken as the final algorithm. Note that the performance of both methods does not decrease significantly as the forecast horizon increases. Likewise, the proposed method outperforms the numerical approaches included in this paper in the test dataset. For 1 day ahead prediction, the proposed algorithm has SI, Bias, CC, and RMSE of 0.09, 0.00, 0.97, and 1.78 compared to 0.268, 0.40, 0.63, and 2.18 for the European Centre for Medium-range Weather Forecasts (ECMWF) model, which outperforms all the other methods in the test dataset.

翻訳日:2021-05-21 01:34:55 公開日:2021-05-18

# (参考訳) コンテキストアウェアセキュリティ監視のための知識グラフ上の機械学習

Machine learning on knowledge graphs for context-aware security monitoring ( http://arxiv.org/abs/2105.08741v1 )

ライセンス: CC BY 4.0

Josep Soler Garrido, Dominik Dold, Johannes Frank

(参考訳) 監視ツールが生成するデータ量の増加や、攻撃者が活動を隠す際の洗練度の高さから、機械学習技術は侵入検出の文脈で注目を集めている。しかし、既存の手法は、生成されたアラートの量と関連性の観点から、しばしば重要な制限を示す。近年、知識グラフはサイバーセキュリティ分野の応用を見つけており、人間の理解可能な語彙を使って複数のドメインからのデータをシームレスに統合する能力によって、これらの欠点のいくつかを緩和する可能性を示している。産業システムにおける異常な活動を評価するためのリンク予測手法を実験的に評価し, 侵入検知のための知識グラフへの機械学習の適用について検討する。初期教師なし訓練の後,提案手法は様々なシナリオにおいて直感的によく校正され,解釈可能な警告を生成することを示し,侵入検出目的の知識グラフに対するリレーショナル機械学習の潜在的メリットを示唆している。

Machine learning techniques are gaining attention in the context of intrusion detection due to the increasing amounts of data generated by monitoring tools, as well as the sophistication displayed by attackers in hiding their activity. However, existing methods often exhibit important limitations in terms of the quantity and relevance of the generated alerts. Recently, knowledge graphs are finding application in the cybersecurity domain, showing the potential to alleviate some of these drawbacks thanks to their ability to seamlessly integrate data from multiple domains using human-understandable vocabularies. We discuss the application of machine learning on knowledge graphs for intrusion detection and experimentally evaluate a link-prediction method for scoring anomalous activity in industrial systems. After initial unsupervised training, the proposed method is shown to produce intuitively well-calibrated and interpretable alerts in a diverse range of scenarios, hinting at the potential benefits of relational machine learning on knowledge graphs for intrusion detection purposes.

翻訳日:2021-05-21 01:20:37 公開日:2021-05-18

# (参考訳) 限られた露出とほぼ確実性で安全に行動することを学ぶ

Learning to Act Safely with Limited Exposure and Almost Sure Certainty ( http://arxiv.org/abs/2105.08748v1 )

ライセンス: CC BY 4.0

Agustin Castellano, Hancheng Min, Juan Bazerque, Enrique Mallada

(参考訳) 本研究の目的は,未知の環境での安全行動の学習を,確率が保証されても,最適性,安全でない事象への曝露レベル,安全でない事象の最大検出時間とのトレードオフを行ない,無拘束の探索試験を必要とせずに達成できる,という概念を提唱することにある。この概念を2つの相補的な設定で説明する。本稿では,まず標準的マルチアームバンディット問題に着目し,不確実性の存在下での学習安全性の本質的なトレードオフについて検討する。十分な探索に関する軽度な仮定の下で、予測された)有限個のラウンドで全ての安全でないマシンを確実に検出するアルゴリズムを提供する。この分析はまた、環境を確保するのに必要なラウンド数と安全なマシンを捨てる確率とのトレードオフも明らかにしている。次に、ほぼ確実に制約のあるマルコフ決定プロセス(mdp)のための最適なポリシーを見つける問題を考える。その結果、(作用)値関数は、報酬プロセスとは独立に実現可能なポリシーを識別できるバリアベースの分解を満足していることが示される。この分解を用いて、有限個のステップでそのような安全でない状態-作用対を識別するバリア学習アルゴリズムを開発した。我々の分析は、安全でない行動を検出するために必要なMDPのタイムラグと、安全でない事象への暴露のレベルとのトレードオフをさらに強調している。シミュレーションは、上記のトレードオフをさらに説明し、安全性の制約が学習プロセスのさらなるスピードアップにつながることを示唆する。

This paper aims to put forward the concept that learning to take safe actions in unknown environments, even with probability one guarantees, can be achieved without the need for an unbounded number of exploratory trials, provided that one is willing to navigate trade-offs between optimality, level of exposure to unsafe events, and the maximum detection time of unsafe actions. We illustrate this concept in two complementary settings. We first focus on the canonical multi-armed bandit problem and seek to study the intrinsic trade-offs of learning safety in the presence of uncertainty. Under mild assumptions on sufficient exploration, we provide an algorithm that provably detects all unsafe machines in an (expected) finite number of rounds. The analysis also unveils a trade-off between the number of rounds needed to secure the environment and the probability of discarding safe machines. We then consider the problem of finding optimal policies for a Markov Decision Process (MDP) with almost sure constraints. We show that the (action) value function satisfies a barrier-based decomposition which allows for the identification of feasible policies independently of the reward process. Using this decomposition, we develop a Barrier-learning algorithm, that identifies such unsafe state-action pairs in a finite expected number of steps. Our analysis further highlights a trade-off between the time lag for the underlying MDP necessary to detect unsafe actions, and the level of exposure to unsafe events. Simulations corroborate our theoretical findings, further illustrating the aforementioned trade-offs, and suggesting that safety constraints can further speed up the learning process.

翻訳日:2021-05-21 01:11:26 公開日:2021-05-18

# (参考訳) 細粒度視覚分類のための自己教師あり学習

Self-Supervised Learning for Fine-Grained Visual Categorization ( http://arxiv.org/abs/2105.08788v1 )

ライセンス: CC BY 4.0

Muhammad Maaz, Hanoona Abdul Rasheed, Dhanalaxmi Gaddam

(参考訳) 自己教師付き学習(SSL)の最近の研究は、分類タスクの画像から有用な意味表現を学習する能力を示している。本研究では,FGVCにおけるSSLの有用性について検討した。 FGVCは、一般的なカテゴリ内で視覚的に類似したサブカテゴリのオブジェクトを区別することを目的としている。データセット内の小さなクラス間、しかし大きなクラス内バリエーションは、難しいタスクになります。このようなきめ細かいデータに対するアノテートラベルの制限はSSLの必要性を助長し、追加のアノテーションのコストを伴わずに学習を促進することができる。 cub-200-2011 データセットではトレーニング中のランダム作物増量と試験中の中心作物増量を利用して 86.36 % のtop-1 分類精度を達成している。本研究では,FGVCにおける各種プリテキストタスク,特に回転,プリテキスト不変表現学習(PIRL),デコンストラクションと構築学習(DCL)の有用性について検討する。補助的なタスクとしての回転は、グローバルな特徴を学習するモデルを促進し、微妙な詳細に焦点を絞ることから切り離す。ジグソーパッチを使用するPIRLは、差別的な地域に集中しようとするが、それらを正確にローカライズするのに苦労する。 DCLは局所的な識別特徴の学習に役立ち、87.41 %$ top-1 の精度でベースラインを上回ります。デコンストラクション学習はモデルを局所的なオブジェクト部分に集中させ、レコンストラクション学習は部分間の相関を学習するのに役立つ。我々の発見を推論するための広範な実験を行う。私たちのコードはhttps://github.com/mmaaz60/ssl_for_fgvcで利用可能です。

Recent research in self-supervised learning (SSL) has shown its capability in learning useful semantic representations from images for classification tasks. Through our work, we study the usefulness of SSL for Fine-Grained Visual Categorization (FGVC). FGVC aims to distinguish objects of visually similar sub categories within a general category. The small inter-class, but large intra-class variations within the dataset makes it a challenging task. The limited availability of annotated labels for such a fine-grained data encourages the need for SSL, where additional supervision can boost learning without the cost of extra annotations. Our baseline achieves $86.36\%$ top-1 classification accuracy on CUB-200-2011 dataset by utilizing random crop augmentation during training and center crop augmentation during testing. In this work, we explore the usefulness of various pretext tasks, specifically, rotation, pretext invariant representation learning (PIRL), and deconstruction and construction learning (DCL) for FGVC. Rotation as an auxiliary task promotes the model to learn global features, and diverts it from focusing on the subtle details. PIRL that uses jigsaw patches attempts to focus on discriminative local regions, but struggles to accurately localize them. DCL helps in learning local discriminating features and outperforms the baseline by achieving $87.41\%$ top-1 accuracy. The deconstruction learning forces the model to focus on local object parts, while reconstruction learning helps in learning the correlation between the parts. We perform extensive experiments to reason our findings. Our code is available at https://github.com/mmaaz60/ssl_for_fgvc.

翻訳日:2021-05-21 00:36:04 公開日:2021-05-18

# (参考訳) Corelated Adversarial Joint Disrepancy Adaptation Network

Correlated Adversarial Joint Discrepancy Adaptation Network ( http://arxiv.org/abs/2105.08808v1 )

ライセンス: CC0 1.0

Youshan Zhang and Brian D. Davison

(参考訳) ドメイン適応は、あるドメインから別のドメインに知識を移す際にドメインシフトの問題を軽減することを目的としている。しかし、既存の作品の多くは、クラスラベルを考慮せずに限界的な特徴を抽出することに依存している。さらに、対象のドメインラベルを使ってパラメータをチューニングしながら、そのモデルをいわゆる教師なしドメイン適応(unsupervised domain adapt)と呼ぶメソッドもある。これらの問題に対処するために,2つの領域の合同不一致を最小限に抑え,相関ラベルを用いたパラメータの調整による競合性能を実現する,correlationd adversarial joint discrepancy adaptation network (cajnet) と呼ばれる新しい手法を提案する。ジョイント特徴を訓練することで、2つの領域間の限界分布と条件分布を調整できる。さらに,対象領域の強力な指標である確率に基づくtop-$\mathcal{k}$ correlationd label (\mathcal{k}$-label)を導入する。ベンチマークデータセットに対する大規模な実験により、最先端の分類精度が大幅に向上した。

Domain adaptation aims to mitigate the domain shift problem when transferring knowledge from one domain into another similar but different domain. However, most existing works rely on extracting marginal features without considering class labels. Moreover, some methods name their model as so-called unsupervised domain adaptation while tuning the parameters using the target domain label. To address these issues, we propose a novel approach called correlated adversarial joint discrepancy adaptation network (CAJNet), which minimizes the joint discrepancy of two domains and achieves competitive performance with tuning parameters using the correlated label. By training the joint features, we can align the marginal and conditional distributions between the two domains. In addition, we introduce a probability-based top-$\mathcal{K}$ correlated label ($\mathcal{K}$-label), which is a powerful indicator of the target domain and effective metric to tune parameters to aid predictions. Extensive experiments on benchmark datasets demonstrate significant improvements in classification accuracy over the state of the art.

翻訳日:2021-05-21 00:20:44 公開日:2021-05-18

# (参考訳) GloVe ワード埋め込みと補助語彙資源を用いた消費者健康語彙の充実のための自動化手法

An Automated Method to Enrich Consumer Health Vocabularies Using GloVe Word Embeddings and An Auxiliary Lexical Resource ( http://arxiv.org/abs/2105.08812v1 )

ライセンス: CC BY 4.0

Mohammed Ibrahim, Susan Gauch, Omar Salman, Mohammed Alqahatani

(参考訳) 背景: 明快な言語は、任意の当事者間のコミュニケーションを容易にする。在職者は、ドメインに共通する専門用語を理解できないため、専門家とのコミュニケーションが困難になる可能性がある。医療分野では、病状や治療の理解が不十分な、医学用語に精通する素人を見つけることは稀である。このギャップを埋めるために、いくつかの専門用語とオントロジーが作成され、平凡な医学用語を専門的な医学用語にマッピングする。目的: 提示された語彙の多くは手動または半自動で構築され、時間と人的労力に大きな投資を必要とし、結果としてこれらの語彙の成長が遅くなる。本稿では,任意の領域の語彙に適用できるという利点を持つ,在職者の語彙を豊かにするための自動手法を提案する。方法: 完全に自動化されたアプローチでは、マシンラーニング、特にglove(global vectors for word embeddeds)を使用して、ソーシャルメディアのヘルスケアプラットフォームから収集したコーパスに基づいて、consumer health vocabularies(chv)を拡張し、拡張します。提案手法は,WordNetのオントロジーから同義語や偽名を取り込むことにより,CHVをさらに改善する。 The basic GloVe and our novel algorithm in Using WordNet using two laymen datasets from the National Library of Medicine (NLM), Open-Access Consumer Health Vocabulary (OAC CHV) and MedlinePlus Healthcare Vocabulary。結果は、GloVeが48.44%のFスコアで新しいレイメン語を見つけることができたことを示している。さらに, 強化グローブアプローチは, 平均fスコアが61%, 相対的に25%向上したベーシックグローブよりも優れていた。さらに、強化されたGloVeはP<.001の2つの基底真理データセットに対して統計的に有意であった。

Background: Clear language makes communication easier between any two parties. A layman may have difficulty communicating with a professional due to not understanding the specialized terms common to the domain. In healthcare, it is rare to find a layman knowledgeable in medical terminology which can lead to poor understanding of their condition and/or treatment. To bridge this gap, several professional vocabularies and ontologies have been created to map laymen medical terms to professional medical terms and vice versa. Objective: Many of the presented vocabularies are built manually or semi-automatically requiring large investments of time and human effort and consequently the slow growth of these vocabularies. In this paper, we present an automatic method to enrich laymen's vocabularies that has the benefit of being able to be applied to vocabularies in any domain. Methods: Our entirely automatic approach uses machine learning, specifically Global Vectors for Word Embeddings (GloVe), on a corpus collected from a social media healthcare platform to extend and enhance consumer health vocabularies (CHV). Our approach further improves the CHV by incorporating synonyms and hyponyms from the WordNet ontology. The basic GloVe and our novel algorithms incorporating WordNet were evaluated using two laymen datasets from the National Library of Medicine (NLM), Open-Access Consumer Health Vocabulary (OAC CHV) and MedlinePlus Healthcare Vocabulary. Results: The results show that GloVe was able to find new laymen terms with an F-score of 48.44%. Furthermore, our enhanced GloVe approach outperformed basic GloVe with an average F-score of 61%, a relative improvement of 25%. Furthermore, the enhanced GloVe showed a statistical significance over the two ground truth datasets with P<.001.

翻訳日:2021-05-20 23:55:12 公開日:2021-05-18

# (参考訳) リモート生理計測予測による映像系列からの非接触痛認識

Non-contact Pain Recognition from Video Sequences with Remote Physiological Measurements Prediction ( http://arxiv.org/abs/2105.08822v1 )

ライセンス: CC BY 4.0

Ruijing Yang, Ziyu Guan, Zitong Yu, Guoying Zhao, Xiaoyi Feng, Jinye Peng

(参考訳) 自動鎮痛は診断と治療において最重要である。既存の作品は、顔の外観の変化の評価、生理的な手がかりの活用、マルチモーダルな方法でそれらを融合させるという3つのカテゴリに分類される。しかし,(1)外見の変化は主観的痛み認知を妨げる主観的要因の影響を受けやすい。また,表情に基づくアプローチでは,表現のモデル化に重要な長期的空間的-時間的依存性を無視し,(2)不便で不快な人体にセンサを装着することで生理学的手がかりを得る。本稿では,出現変化と生理的手がかりの両方を非接触的にエンコードして痛み認識を行うマルチタスク学習フレームワークを提案する。このフレームワークは、学習された外観表現に対する注意機構を通じて局所的および長期的依存性の両方を捉えることができ、補助タスクでビデオから復元された生理的手がかり(remote photoplethysmography, rppg)によりさらに強化される。このフレームワークはrPPGにより強化された時空間注意ネットワーク(rSTAN)と呼ばれ、一般に利用可能な痛みデータベース上での非接触痛認識の最先端性能を確立することができる。これはrppg予測を非接触自動痛み認識の補助タスクとして使用できることを示す。

Automatic pain recognition is paramount for medical diagnosis and treatment. The existing works fall into three categories: assessing facial appearance changes, exploiting physiological cues, or fusing them in a multi-modal manner. However, (1) appearance changes are easily affected by subjective factors which impedes objective pain recognition. Besides, the appearance-based approaches ignore long-range spatial-temporal dependencies that are important for modeling expressions over time; (2) the physiological cues are obtained by attaching sensors on human body, which is inconvenient and uncomfortable. In this paper, we present a novel multi-task learning framework which encodes both appearance changes and physiological cues in a non-contact manner for pain recognition. The framework is able to capture both local and long-range dependencies via the proposed attention mechanism for the learned appearance representations, which are further enriched by temporally attended physiological cues (remote photoplethysmography, rPPG) that are recovered from videos in the auxiliary task. This framework is dubbed rPPG-enriched Spatio-Temporal Attention Network (rSTAN) and allows us to establish the state-of-the-art performance of non-contact pain recognition on publicly available pain databases. It demonstrates that rPPG predictions can be used as an auxiliary task to facilitate non-contact automatic pain recognition.

翻訳日:2021-05-20 23:37:53 公開日:2021-05-18

# (参考訳) Fusion-DHL:WiFi, IMU, Floorplan Fusion for Dense History of Locations in Indoor Environments

Fusion-DHL: WiFi, IMU, and Floorplan Fusion for Dense History of Locations in Indoor Environments ( http://arxiv.org/abs/2105.08837v1 )

ライセンス: CC BY 4.0

Sachini Herath, Saghar Irandoust, Bowen Chen, Yiming Qian, Pyojin Kim, Yasutaka Furukawa

(参考訳) 本稿では,WiFi,IMU,フロアプラン情報を融合して屋内環境における正確な位置履歴を推定するマルチモーダルセンサ融合アルゴリズムを提案する。このアルゴリズムは,1)IMUセンサデータから相対的な運動軌跡を推定する慣性ナビゲーションアルゴリズム,2)位置制約を取得し,その軌跡をジオローカライズする業界におけるWiFiベースのローカライゼーションAPI,3)フロアプランと整合する位置履歴を洗練するための畳み込みニューラルネットワークを使用する。 4つの大学ビルと3つのショッピングモールで、wi-fi、imu、フロアプランデータを使った新しいデータセットを構築するためのデータ取得アプリを開発した。定性的かつ定量的な評価により,提案システムは現在の標準よりも2倍の精度と数桁の高密度な位置履歴を生成でき,エネルギー消費は最小限であることが示された。私たちはコード、データ、モデルを公開します。

The paper proposes a multi-modal sensor fusion algorithm that fuses WiFi, IMU, and floorplan information to infer an accurate and dense location history in indoor environments. The algorithm uses 1) an inertial navigation algorithm to estimate a relative motion trajectory from IMU sensor data; 2) a WiFi-based localization API in industry to obtain positional constraints and geo-localize the trajectory; and 3) a convolutional neural network to refine the location history to be consistent with the floorplan. We have developed a data acquisition app to build a new dataset with WiFi, IMU, and floorplan data with ground-truth positions at 4 university buildings and 3 shopping malls. Our qualitative and quantitative evaluations demonstrate that the proposed system is able to produce twice as accurate and a few orders of magnitude denser location history than the current standard, while requiring minimal additional energy consumption. We will publicly share our code, data and models.

翻訳日:2021-05-20 23:22:55 公開日:2021-05-18

# (参考訳) シーケンスからシーケンスタスクへの表現学習:マルチフィルタガウス混合オートエンコーダ

Representation Learning in Sequence to Sequence Tasks: Multi-filter Gaussian Mixture Autoencoder ( http://arxiv.org/abs/2105.08840v1 )

ライセンス: CC BY 4.0

Yunhao Yang, Zhaokun Xue

(参考訳) 文の不均一性は、機械翻訳のようなシーケンスタスクに連続して存在する。大きく異なる意味や文法構造を持つ文は、ネットワークを訓練しながら収束の困難を増す可能性がある。本稿では,シーケンスタスクにおける不均一性を解決するためのモデルを提案する。 Multi-filter Gaussian Mixture Autoencoder (MGMAE) はオートエンコーダを用いて入力の表現を学習する。表現はエンコーダからの出力であり、その次元がエンコーダの隠れた次元である潜在空間にある。潜在空間におけるトレーニングデータの表現はガウス混合の訓練に使用される。潜在空間表現はガウス分布のいくつかの混合に分割される。フィルタ(デコーダ)は、具体的にはガウス分布の1つに適合するように調整される。各ガウシアンはこのガウシアン内の不均一性の原因となるように1つのフィルターに対応している。これにより、トレーニングデータの均一性を解消できる。ジオクエリデータセットと英語とフランス語の翻訳について比較実験を行った。実験の結果,従来のエンコーダ・デコーダモデルと比較すると,機械翻訳や質問応答といったシーケンスタスクの処理性能が向上することがわかった。

Heterogeneity of sentences exists in sequence to sequence tasks such as machine translation. Sentences with largely varied meanings or grammatical structures may increase the difficulty of convergence while training the network. In this paper, we introduce a model to resolve the heterogeneity in the sequence to sequence task. The Multi-filter Gaussian Mixture Autoencoder (MGMAE) utilizes an autoencoder to learn the representations of the inputs. The representations are the outputs from the encoder, lying in the latent space whose dimension is the hidden dimension of the encoder. The representations of training data in the latent space are used to train Gaussian mixtures. The latent space representations are divided into several mixtures of Gaussian distributions. A filter (decoder) is tuned to fit the data in one of the Gaussian distributions specifically. Each Gaussian is corresponding to one filter so that the filter is responsible for the heterogeneity within this Gaussian. Thus the heterogeneity of the training data can be resolved. Comparative experiments are conducted on the Geo-query dataset and English-French translation. Our experiments show that compares to the traditional encoder-decoder model, this network achieves better performance on sequence to sequence tasks such as machine translation and question answering.

翻訳日:2021-05-20 23:09:47 公開日:2021-05-18

# (参考訳) Gym-ANM:研究開発における電力系統管理のための強化学習を活用したオープンソースソフトウェア

Gym-ANM: Open-source software to leverage reinforcement learning for power system management in research and education ( http://arxiv.org/abs/2105.08846v1 )

ライセンス: CC BY 4.0

Robin Henry and Damien Ernst

(参考訳) Gym-ANMは、電気ネットワークにおけるアクティブネットワーク管理(ANM)タスクをモデル化する強化学習(RL)環境の設計を容易にするPythonパッケージである。ここでは、新しい環境の実装方法と、既存の環境と相互作用するコードを書く方法を説明する。また、ANM6-Easyは、一般的なANM課題を強調するために設計された環境である。最後に,sm-anmが科学コミュニティに与える影響について,研究と教育の両面で検討する。このパッケージは、将来のエネルギーシステムを制御するアルゴリズムの探索において、電力システムとRLコミュニティの協力を促進することを願っている。

Gym-ANM is a Python package that facilitates the design of reinforcement learning (RL) environments that model active network management (ANM) tasks in electricity networks. Here, we describe how to implement new environments and how to write code to interact with pre-existing ones. We also provide an overview of ANM6-Easy, an environment designed to highlight common ANM challenges. Finally, we discuss the potential impact of Gym-ANM on the scientific community, both in terms of research and education. We hope this package will facilitate collaboration between the power system and RL communities in the search for algorithms to control future energy systems.

翻訳日:2021-05-20 23:02:30 公開日:2021-05-18

# (参考訳) AI教育における構造的不平等の克服

Confronting Structural Inequities in AI for Education ( http://arxiv.org/abs/2105.08847v1 )

ライセンス: CC BY 4.0

Michael Madaio, Su Lin Blodgett, Elijah Mayfield, Ezekiel Dixon-Rom\'an

(参考訳) 教育技術と、それらが展開される教育制度は、何が重要なのか、学習者がどのように学ぶべきかについて、特にイデオロギーを実践する。人工知能技術(教育など)が辺境化社会に不平等な結果をもたらしたため、AIシステムの異なる影響を評価し緩和するための様々なアプローチが開発されている。しかし,本稿では,AIモデルの性能格差に基づく公平性評価の主流パラダイムが,教育用AIシステム(re)が生み出す構造的不平等に直面するには不十分である,と論じる。我々は、批判理論と黒人フェミニスト奨学金によって知らされる構造的不正のレンズを描き、広く研究され広く研究されている教育AIシステムのカテゴリを批判的に尋問し、どのように教育AI技術が、モデルの性能に関わらず、構造的不正と不平等の歴史的正当性に束縛されているかを実証する。私たちは、教育AI研究のより公平な未来に向けて、代替のビジョンに近づきます。

Educational technologies, and the systems of schooling in which they are deployed, enact particular ideologies about what is important to know and how learners should learn. As artificial intelligence technologies -- in education and beyond -- have led to inequitable outcomes for marginalized communities, various approaches have been developed to evaluate and mitigate AI systems' disparate impact. However, we argue in this paper that the dominant paradigm of evaluating fairness on the basis of performance disparities in AI models is inadequate for confronting the structural inequities that educational AI systems (re)produce. We draw on a lens of structural injustice informed by critical theory and Black feminist scholarship to critically interrogate several widely-studied and widely-adopted categories of educational AI systems and demonstrate how educational AI technologies are bound up in and reproduce historical legacies of structural injustice and inequity, regardless of the parity of their models' performance. We close with alternative visions for a more equitable future for educational AI research.

翻訳日:2021-05-20 22:54:38 公開日:2021-05-18

# (参考訳) 効果的な注意は解釈可能性に光を当てる

Effective Attention Sheds Light On Interpretability ( http://arxiv.org/abs/2105.08855v1 )

ライセンス: CC BY 4.0

Kaiser Sun and Ana Marasovi\'c

(参考訳) 変圧器自己注意サブレイヤの注意行列は、2つの成分に確実に分解することができ、その1つ(有効注意)のみがモデル出力に寄与する。これにより、効果的な注意の可視化が標準的な注意の解釈と異なる結論を与えるかどうかを問うことができる。グルータスクとbertのサブセットを使用して、2つのアテンション行列を比較する解析を行い、それらの解釈が異なることを示す。効果的な注意力は、セパレータトークンのような言語モデリング事前訓練に関連する特徴とは無関係であり、エンドタスクを解くためにモデルが捉えた言語的特徴を説明する可能性がある。この違いを考慮に入れると,設計によって出力されるモデルとより関連があるため,トランスフォーマーの挙動の研究に効果的に注意を払うことを推奨する。

An attention matrix of a transformer self-attention sublayer can provably be decomposed into two components and only one of them (effective attention) contributes to the model output. This leads us to ask whether visualizing effective attention gives different conclusions than interpretation of standard attention. Using a subset of the GLUE tasks and BERT, we carry out an analysis to compare the two attention matrices, and show that their interpretations differ. Effective attention is less associated with the features related to the language modeling pretraining such as the separator token, and it has more potential to illustrate linguistic features captured by the model for solving the end-task. Given the found differences, we recommend using effective attention for studying a transformer's behavior since it is more pertinent to the model output by design.

翻訳日:2021-05-20 22:17:38 公開日:2021-05-18

# Value Functionは必要なものすべて: ハイドプラットフォームのための統一学習フレームワーク

Value Function is All You Need: A Unified Learning Framework for Ride Hailing Platforms ( http://arxiv.org/abs/2105.08791v1 )

ライセンス: Link先を確認

Xiaocheng Tang, Fan Zhang, Zhiwei (Tony) Qin, Yansheng Wang, Dingyuan Shi, Bingchen Song, Yongxin Tong, Hongtu Zhu, Jieping Ye

(参考訳) DiDi、Uber、Lyftなどの大型配車プラットフォームは、都市内の数万台の車両を1日中数百万の乗車要求に接続し、注文の発送と車両配置のタスクを通じて、交通効率を向上させるための素晴らしい約束を提供する。しかし、既存の研究では2つのタスクが単純化されており、これら2つの間の複雑な相互作用、供給と需要のリアルタイムな変動、そして問題の大規模な性質による必要な調整にほとんど対応していない。本稿では,両タスクに取り組むための統合価値ベース動的学習フレームワーク(v1d3)を提案する。フレームワークの中心にはグローバルな共有バリュー関数があり、リアルタイムプラットフォームトランザクションから生成されたオンラインエクスペリエンスを使用して継続的に更新される。サンプル効率とロバスト性を改善するために,高速オンライン学習と,豊富な履歴ドライバ軌道データを活用する大規模なオフライン学習手法を組み合わせた,新しい定期的なアンサンブル手法を提案する。これにより、提案するフレームワークは、非常にダイナミックな環境に迅速に適応し、繰り返しパターンに頑健に一般化し、管理車両の人口間の暗黙的な調整を促進することができる。実世界のデータセットに基づく広範な実験では、両タスクで最近提案された他の方法よりも大幅に改善されている。特に、v1d3は、kdd cup 2020 rlコンペティションにおけるディスパッチとリプレースの両方のトラックの勝者を上回り、ドライバー総収入とユーザエクスペリエンス関連の指標の両方を改善する最新結果を達成している。

Large ride-hailing platforms, such as DiDi, Uber and Lyft, connect tens of thousands of vehicles in a city to millions of ride demands throughout the day, providing great promises for improving transportation efficiency through the tasks of order dispatching and vehicle repositioning. Existing studies, however, usually consider the two tasks in simplified settings that hardly address the complex interactions between the two, the real-time fluctuations between supply and demand, and the necessary coordinations due to the large-scale nature of the problem. In this paper we propose a unified value-based dynamic learning framework (V1D3) for tackling both tasks. At the center of the framework is a globally shared value function that is updated continuously using online experiences generated from real-time platform transactions. To improve the sample-efficiency and the robustness, we further propose a novel periodic ensemble method combining the fast online learning with a large-scale offline training scheme that leverages the abundant historical driver trajectory data. This allows the proposed framework to adapt quickly to the highly dynamic environment, to generalize robustly to recurrent patterns and to drive implicit coordinations among the population of managed vehicles. Extensive experiments based on real-world datasets show considerably improvements over other recently proposed methods on both tasks. Particularly, V1D3 outperforms the first prize winners of both dispatching and repositioning tracks in the KDD Cup 2020 RL competition, achieving state-of-the-art results on improving both total driver income and user experience related metrics.

翻訳日:2021-05-20 14:01:32 公開日:2021-05-18

# Pathdreamer: 室内ナビゲーションのための世界モデル

Pathdreamer: A World Model for Indoor Navigation ( http://arxiv.org/abs/2105.08756v1 )

ライセンス: Link先を確認

Jing Yu Koh, Honglak Lee, Yinfei Yang, Jason Baldridge, Peter Anderson

(参考訳) 不慣れな建物をナビゲートする人々は、無数の視覚的、空間的、セマンティックな手がかりを利用して、ナビゲーション目標を効率的に達成します。同様の能力を持つ計算エージェントの装備に向けて,新しい屋内環境を探索するエージェントの視覚的世界モデルPathdreamerを紹介した。ひとつ以上の視覚的な観察から、pathdreamerは、訓練中に見えない建物において、訪問されていない視点に対して、おそらく高解像度の360度視覚観察(rgb、セマンティックセグメンテーション、深さ)を生成する。不確実性の高い地域では(例えば) 隅々を予測し、目に見えない部屋の内容を想像すると、Pathdreamerは多様なシーンを予測でき、エージェントは与えられた軌道に対して複数の現実的な結果をサンプリングすることができる。 Pathdreamerは視覚・言語ナビゲーション(VLN)の下流タスクにおいて、人間の環境に関する有用な視覚的・空間的・意味的な知識を符号化する。具体的には、Pathdreamerの今後の計画が、環境の観測されていない部分からの実際の観測に先んじることの利点の半分をもたらすことを示す。 pathdreamerは、特定のオブジェクトやvlnへのナビゲートなど、具体化されたナビゲーションタスクに挑戦するためのモデルベースのアプローチのアンロックを支援することを願っている。

People navigating in unfamiliar buildings take advantage of myriad visual, spatial and semantic cues to efficiently achieve their navigation goals. Towards equipping computational agents with similar capabilities, we introduce Pathdreamer, a visual world model for agents navigating in novel indoor environments. Given one or more previous visual observations, Pathdreamer generates plausible high-resolution 360 visual observations (RGB, semantic segmentation and depth) for viewpoints that have not been visited, in buildings not seen during training. In regions of high uncertainty (e.g. predicting around corners, imagining the contents of an unseen room), Pathdreamer can predict diverse scenes, allowing an agent to sample multiple realistic outcomes for a given trajectory. We demonstrate that Pathdreamer encodes useful and accessible visual, spatial and semantic knowledge about human environments by using it in the downstream task of Vision-and-Language Navigation (VLN). Specifically, we show that planning ahead with Pathdreamer brings about half the benefit of looking ahead at actual observations from unobserved parts of the environment. We hope that Pathdreamer will help unlock model-based approaches to challenging embodied navigation tasks such as navigating to specified objects and VLN.

翻訳日:2021-05-20 14:00:14 公開日:2021-05-18

# 異常検出のためのマスク付きコントラスト学習

Masked Contrastive Learning for Anomaly Detection ( http://arxiv.org/abs/2105.08793v1 )

ライセンス: Link先を確認

Hyunsoo Cho, Jinseok Seol, Sang-goo Lee

(参考訳) 異常検出は、安全クリティカルなソフトウェアシステムにおける基本的な側面の一つであるが、長年の課題である。複雑化を緩和し、効率性を示すために多くの作品が提案されている。特に,ラベルを付加せずに多彩な表現を学習できることから,自己指導型学習手法が関心を喚起している。自己指導型学習戦術の中で、コントラスト学習は、異常検出を含む様々な分野において、その優位性を検証するための特定の枠組みである。しかし、対照的な学習の主な目的は、ラベルなしでタスクに依存しない特徴を学ぶことである。本稿では,マスク付きコントラスト学習という,タスク固有のコントラスト学習の変種を提案する。さらに,補助的な自己監督タスクを通じて学習した能力を活用することで,パフォーマンスをさらに向上する,自己組織化推論と呼ばれる新しい推論手法を提案する。モデルを組み合わせることで、さまざまなベンチマークデータセットにおいて、従来の最先端手法よりも大きなマージンを達成できます。

Detecting anomalies is one fundamental aspect of a safety-critical software system, however, it remains a long-standing problem. Numerous branches of works have been proposed to alleviate the complication and have demonstrated their efficiencies. In particular, self-supervised learning based methods are spurring interest due to their capability of learning diverse representations without additional labels. Among self-supervised learning tactics, contrastive learning is one specific framework validating their superiority in various fields, including anomaly detection. However, the primary objective of contrastive learning is to learn task-agnostic features without any labels, which is not entirely suited to discern anomalies. In this paper, we propose a task-specific variant of contrastive learning named masked contrastive learning, which is more befitted for anomaly detection. Moreover, we propose a new inference method dubbed self-ensemble inference that further boosts performance by leveraging the ability learned through auxiliary self-supervision tasks. By combining our models, we can outperform previous state-of-the-art methods by a significant margin on various benchmark datasets.

翻訳日:2021-05-20 13:59:50 公開日:2021-05-18

# LCP-RIT at SemEval-2021 Task 1: Exploring Linguistic Features for Lexical Complexity Prediction

LCP-RIT at SemEval-2021 Task 1: Exploring Linguistic Features for Lexical Complexity Prediction ( http://arxiv.org/abs/2105.08780v1 )

ライセンス: Link先を確認

Abhinandan Desai and Kai North and Marcos Zampieri and Christopher M. Homan

(参考訳) 本稿では,チームLCP-RITによるSemEval-2021 Task 1: Lexical Complexity Prediction (LCP)の提出について述べる。タスクオーガナイザは、コンプレックスの拡張バージョン(shardlow et al., 2020)を参加者に提供した。コンプレックスは英語のマルチドメインデータセットで、コンテキスト内の単語が5点のlikertスケールを使用して複雑さに対して注釈付けされたものだ。我々のシステムはロジスティック回帰と幅広い言語的特徴(例)を用いる。心理言語学的な特徴、n-gram、単語頻度、posタグ) このデータセットにおける単一単語の複雑さを予測する。言語特性の違いが分類性能に与える影響を分析し,平均絶対誤差,平均二乗誤差,ピアソン相関,スピアマン相関の観点から評価した。

This paper describes team LCP-RIT's submission to the SemEval-2021 Task 1: Lexical Complexity Prediction (LCP). The task organizers provided participants with an augmented version of CompLex (Shardlow et al., 2020), an English multi-domain dataset in which words in context were annotated with respect to their complexity using a five point Likert scale. Our system uses logistic regression and a wide range of linguistic features (e.g. psycholinguistic features, n-grams, word frequency, POS tags) to predict the complexity of single words in this dataset. We analyze the impact of different linguistic features in the classification performance and we evaluate the results in terms of mean absolute error, mean squared error, Pearson correlation, and Spearman correlation.

翻訳日:2021-05-20 13:54:33 公開日:2021-05-18

# 合成符号ミキシングによる機械翻訳への英語テキスト変換器の探索

Exploring Text-to-Text Transformers for English to Hinglish Machine Translation with Synthetic Code-Mixing ( http://arxiv.org/abs/2105.08807v1 )

ライセンス: Link先を確認

Ganesh Jawahar, El Moatez Billah Nagoudi, Muhammad Abdul-Mageed, Laks V.S. Lakshmanan

(参考訳) 単言語対とコード混合言語対の翻訳問題に焦点をあてたモデルについて述べる。具体的には、モノリンガルな英語のテキストをHinglish(コードミキシングされたヒンディー語と英語)に変換する幅広いモデルを提供しています。最近の事前学習された言語モデルの成功を考えると、我々は2つのトランスフォーマベースのエンコーダ-デコーダモデル(すなわちmt5とmbart)の有用性をテストし、両方がうまく機能するようにした。また,コード混合のための学習データのpaucityを考慮し,バイリンガル分散表現からコード混合テキストを生成するための依存性のない手法を提案し,言語モデルの性能向上に活用する。特に、この追加データを用いて、まず合成データ上で言語モデルを微調整し、次にゴールドコード混合データを用いて、カリキュラム学習アプローチを採用する。単純ではあるが,本手法は様々な条件下で,いくつかの標準手法(逆変換法,同値制約理論に基づく方法)と競合する(場合によってはさらに優れている)ことが判明した。本研究は,mT5モデルをカリキュラム学習手順に従って微調整し,最高の翻訳性能(12.67BLEU)を達成することを示す。私たちのモデルは、英語と英語の公式共有タスク全体のランキングで第一位です。

We describe models focused at the understudied problem of translating between monolingual and code-mixed language pairs. More specifically, we offer a wide range of models that convert monolingual English text into Hinglish (code-mixed Hindi and English). Given the recent success of pretrained language models, we also test the utility of two recent Transformer-based encoder-decoder models (i.e., mT5 and mBART) on the task finding both to work well. Given the paucity of training data for code-mixing, we also propose a dependency-free method for generating code-mixed texts from bilingual distributed representations that we exploit for improving language model performance. In particular, armed with this additional data, we adopt a curriculum learning approach where we first finetune the language models on synthetic data then on gold code-mixed data. We find that, although simple, our synthetic code-mixing method is competitive with (and in some cases is even superior to) several standard methods (backtranslation, method based on equivalence constraint theory) under a diverse set of conditions. Our work shows that the mT5 model, finetuned following the curriculum learning procedure, achieves best translation performance (12.67 BLEU). Our models place first in the overall ranking of the English-Hinglish official shared task.

翻訳日:2021-05-20 13:54:17 公開日:2021-05-18

# 限られたデータからの顔認識における画像強調の有効性の分析

Analyzing the effectiveness of image augmentations for face recognition from limited data ( http://arxiv.org/abs/2105.08796v1 )

ライセンス: Link先を確認

Aleksei Zhuchkov

(参考訳) 本研究は,限られたデータから顔認識問題に対する画像強調の効率を解析する。拡張のための基本的な操作,生成方法,およびそれらの組み合わせを検討した。以上の結果より, 顔認証システムの品質は向上し, 生成的アプローチと基本手法の組み合わせは, 他の試験手法よりも優れていたことが示唆された。

This work presents an analysis of the efficiency of image augmentations for the face recognition problem from limited data. We considered basic manipulations, generative methods, and their combinations for augmentations. Our results show that augmentations, in general, can considerably improve the quality of face recognition systems and the combination of generative and basic approaches performs better than the other tested techniques.

翻訳日:2021-05-20 13:50:03 公開日:2021-05-18

# クロスアクションアテンションを用いたマルチパーソン極端運動予測

Multi-Person Extreme Motion Prediction with Cross-Interaction Attention ( http://arxiv.org/abs/2105.08825v1 )

ライセンス: Link先を確認

Wen Guo, Xiaoyu Bie, Xavier Alameda-Pineda, Francesc Moreno

(参考訳) 人間の動き予測は、過去の3D骨格の連続から将来の人間のポーズを予測することを目的としている。この問題は近年注目されているが、ほとんどの場合単独の人間に対処されている。本稿では,人間による協調作業を含む新しい視点から,この問題を考察する。本システムでは,2人の対話者を対象とした2つの過去の骨格列を入力とし,それぞれの動作を予測することを目的とする。本研究では,両者の歴史的情報を活用し,その空間的・時間的距離に拘わらず,自己ポーズと他者のポーズ間の相互依存を予測できる新たな相互行為注意機構を考案する。このような対話的な状況をトレーニングするデータセットが存在しないため、アクロバティックを行うプロのダンサーによる新しいラボベースの個人インタラクションデータセットであるExPI(Extreme Pose Interaction)をキャプチャした。 ExPIには、30kフレームと60kインスタンスの115のシーケンスと、アノテーション付きの3Dボディポーズと形状が含まれている。このデータセット上でのクロスインタラクションネットワークを徹底的に評価し、短期予測と長期予測の両方において、各人が独立的に推論するベースラインを一貫して上回っています。私たちは、データセットとトレイン/テストの分割を共同でリリースして、このトピックに関する将来の研究を促進する予定です。

Human motion prediction aims to forecast future human poses given a sequence of past 3D skeletons. While this problem has recently received increasing attention, it has mostly been tackled for single humans in isolation. In this paper we explore this problem from a novel perspective, involving humans performing collaborative tasks. We assume that the input of our system are two sequences of past skeletons for two interacting persons, and we aim to predict the future motion for each of them. For this purpose, we devise a novel cross interaction attention mechanism that exploits historical information of both persons and learns to predict cross dependencies between self poses and the poses of the other person in spite of their spatial or temporal distance. Since no dataset to train such interactive situations is available, we have captured ExPI (Extreme Pose Interaction), a new lab-based person interaction dataset of professional dancers performing acrobatics. ExPI contains 115 sequences with 30k frames and 60k instances with annotated 3D body poses and shapes. We thoroughly evaluate our cross-interaction network on this dataset and show that both in short-term and long-term predictions, it consistently outperforms baselines that independently reason for each person. We plan to release our code jointly with the dataset and the train/test splits to spur future research on the topic.

翻訳日:2021-05-20 13:49:57 公開日:2021-05-18

# 確率ネットワークとキューにおける学習と情報

Learning and Information in Stochastic Networks and Queues ( http://arxiv.org/abs/2105.08769v1 )

ライセンス: Link先を確認

Neil Walton, Kuang Xu

(参考訳) 待ち行列システムの安定性と最適化における情報と学習の役割を概観する。近年,意思決定における情報の役割の増大に支えられた待ち行列システムに,教師あり学習,盗賊学習,強化学習の技法が応用されている。待ち行列システムへのこれらの領域の適用を合理化するための観測結果と新たな結果を提案する。我々は、MaxWeight と BackPressure ポリシーが Blackwell の Approachability Theorem の応用であることを証明する。これは待ち行列理論の結果と逆学習を結びつける。次に,サービスパラメータ推定のための統計的学習の要件について論じる。例として、サービス分類にパーセプトロンアルゴリズムを適用する場合、キューサイズの後悔がいかに制限されるかを示す。次に,意思決定における状態情報の役割について述べる。ここでは, てんかん情報(不確定なパラメータの情報)と失語症情報(不確定な状態の情報)の役割を対比する。最後に,強化学習と待ち行列理論の最近の進歩を概観し,現在の研究課題について考察する。

We review the role of information and learning in the stability and optimization of queueing systems. In recent years, techniques from supervised learning, bandit learning and reinforcement learning have been applied to queueing systems supported by increasing role of information in decision making. We present observations and new results that help rationalize the application of these areas to queueing systems. We prove that the MaxWeight and BackPressure policies are an application of Blackwell's Approachability Theorem. This connects queueing theoretic results with adversarial learning. We then discuss the requirements of statistical learning for service parameter estimation. As an example, we show how queue size regret can be bounded when applying a perceptron algorithm to classify service. Next, we discuss the role of state information in improved decision making. Here we contrast the roles of epistemic information (information on uncertain parameters) and aleatoric information (information on an uncertain state). Finally we review recent advances in the theory of reinforcement learning and queueing, as well as, provide discussion on current research challenges.

翻訳日:2021-05-20 13:45:58 公開日:2021-05-18

# タスク非定常性追跡によるメタ強化学習

Meta-Reinforcement Learning by Tracking Task Non-stationarity ( http://arxiv.org/abs/2105.08834v1 )

ライセンス: Link先を確認

Riccardo Poiani, Andrea Tirinzoni, Marcello Restelli

(参考訳) 多くの現実世界のドメインは、エージェントの目標と環境力学に影響を与える構造化された非定常性の対象である。メタ強化学習(rl)は、関連するタスクに迅速に適応するトレーニングエージェントに成功している。しかし、非定常領域のための既存のメタRLアルゴリズムのほとんどは、タスク生成プロセスに強い仮定を行うか、トレーニング時にサンプリングを必要とする。本稿では,タスクの時間的進化を明示的に追跡することで,将来に向けて最適化する新しいアルゴリズム(TRIO)を提案する。トレーニング時にTRIOは、経験サンプルから潜伏パラメータを素早く識別する変分モジュールを学習する。このモジュールは、タスクの不確実性を考慮した最適探索ポリシーと共同で学習される。テスト時にTRIOは、オンラインの潜在パラメータの進化を追跡し、将来のタスクに対する不確実性を減らし、メタ学習ポリシーによる迅速な適応を得る。既存のほとんどの方法とは異なり、トリオはマルコフのタスク進化過程を仮定せず、訓練時の非定常性に関する情報を必要とせず、環境における複雑な変化を捉えている。シミュレーション問題に対するアルゴリズムの評価を行い,競合ベースラインよりも優れていることを示す。

Many real-world domains are subject to a structured non-stationarity which affects the agent's goals and the environmental dynamics. Meta-reinforcement learning (RL) has been shown successful for training agents that quickly adapt to related tasks. However, most of the existing meta-RL algorithms for non-stationary domains either make strong assumptions on the task generation process or require sampling from it at training time. In this paper, we propose a novel algorithm (TRIO) that optimizes for the future by explicitly tracking the task evolution through time. At training time, TRIO learns a variational module to quickly identify latent parameters from experience samples. This module is learned jointly with an optimal exploration policy that takes task uncertainty into account. At test time, TRIO tracks the evolution of the latent parameters online, hence reducing the uncertainty over future tasks and obtaining fast adaptation through the meta-learned policy. Unlike most existing methods, TRIO does not assume Markovian task-evolution processes, it does not require information about the non-stationarity at training time, and it captures complex changes undergoing in the environment. We evaluate our algorithm on different simulated problems and show it outperforms competitive baselines.

翻訳日:2021-05-20 13:45:44 公開日:2021-05-18

# コンフォーマルヒストグラム回帰

Conformal histogram regression ( http://arxiv.org/abs/2105.08747v1 )

ライセンス: Link先を確認

Matteo Sesia, Yaniv Romano

(参考訳) 本稿では,スキューデータに自動的に適応可能な非パラメトリック回帰の予測間隔を計算するためのコンフォメーション手法を提案する。ブラックボックス機械学習アルゴリズムを用いて、ヒストグラムを用いて結果の条件分布を推定し、その出力を近似条件付きの最短予測間隔に変換する。結果として得られる予測間隔は有限サンプルにおいて限界範囲を持つことが証明され、ブラックボックスモデルが一致する場合に条件範囲と最適長さを漸近的に達成する。シミュレーションおよび実データを用いた数値実験により、共形量子化回帰やその他の分布共形予測手法を含む最先端の代替手法と比較して性能が向上した。

This paper develops a conformal method to compute prediction intervals for non-parametric regression that can automatically adapt to skewed data. Leveraging black-box machine learning algorithms to estimate the conditional distribution of the outcome using histograms, it translates their output into the shortest prediction intervals with approximate conditional coverage. The resulting prediction intervals provably have marginal coverage in finite samples, while asymptotically achieving conditional coverage and optimal length if the black-box model is consistent. Numerical experiments with simulated and real data demonstrate improved performance compared to state-of-the-art alternatives, including conformalized quantile regression and other distributional conformal prediction approaches.

翻訳日:2021-05-20 13:43:07 公開日:2021-05-18

# RecPipe: 推奨品質とパフォーマンスを両立させる共設計モデルとハードウェア

RecPipe: Co-designing Models and Hardware to Jointly Optimize Recommendation Quality and Performance ( http://arxiv.org/abs/2105.08820v1 )

ライセンス: Link先を確認

Udit Gupta, Samuel Hsia, Jeff (Jun) Zhang, Mark Wilkening, Javin Pombra, Hsien-Hsin S. Lee, Gu-Yeon Wei, Carole-Jean Wu, David Brooks

(参考訳) ディープラーニングレコメンデーションシステムは、厳格なテールレイテンシターゲットと高いシステム負荷の下で高品質でパーソナライズされたコンテンツを提供する必要がある。本稿では,推薦品質と推論性能を協調的に最適化するRecPipeを提案する。 central to recpipeは、計算の複雑さを減らし、異なる並列処理の機会を露出しながら、品質を維持するために、レコメンデーションモデルを多段階パイプラインに分解する。 recpipeは、多段階のレコメンデーションエンジンを、コモディティで異種プラットフォーム(cpuやgpuなど)にマッピングする推論スケジューラを実装している。そこで我々は,品質,テールレイテンシ,システムスループットを共同で最適化するカスタムアクセラレータrecpipeaccel(rpaccel)を設計した。 RPAc-celはRecPipeを通じてオープンされた異なるデザイン空間を利用するように設計されている。特にRPAccelは、サブバッチでクエリをパイプラインレコメンデーションステージに処理し、デュアルな静的および動的埋め込みキャッシュ、トップkフィルタリングユニットのセット、再構成可能なsystolic配列を実装している。先行技術とアイソクオリティに比較して、RPAccelはレイテンシとスループットを3倍と6倍改善することを示した。

Deep learning recommendation systems must provide high quality, personalized content under strict tail-latency targets and high system loads. This paper presents RecPipe, a system to jointly optimize recommendation quality and inference performance. Central to RecPipe is decomposing recommendation models into multi-stage pipelines to maintain quality while reducing compute complexity and exposing distinct parallelism opportunities. RecPipe implements an inference scheduler to map multi-stage recommendation engines onto commodity, heterogeneous platforms (e.g., CPUs, GPUs).While the hardware-aware scheduling improves ranking efficiency, the commodity platforms suffer from many limitations requiring specialized hardware. Thus, we design RecPipeAccel (RPAccel), a custom accelerator that jointly optimizes quality, tail-latency, and system throughput. RPAc-cel is designed specifically to exploit the distinct design space opened via RecPipe. In particular, RPAccel processes queries in sub-batches to pipeline recommendation stages, implements dual static and dynamic embedding caches, a set of top-k filtering units, and a reconfigurable systolic array. Com-pared to prior-art and at iso-quality, we demonstrate that RPAccel improves latency and throughput by 3x and 6x.

翻訳日:2021-05-20 13:42:55 公開日:2021-05-18

# 加速流からの最適化アルゴリズムへの縮約理論のアプローチ

A Contraction Theory Approach to Optimization Algorithms from Acceleration Flows ( http://arxiv.org/abs/2105.08832v1 )

ライセンス: Link先を確認

Pedro Cisneros-Velarde, Francesco Bullo

(参考訳) 最近では、関連する最適化フローの離散化、すなわち軌道が関連する最適化問題を解く微分方程式(ODE)システムから最適化アルゴリズムの設計に焦点が当てられている。このような設計アプローチは、適切なODEを設計し、識別するための原則化された方法論を見つける方法という、重要な問題を引き起こします。本稿では, この問題の解法として, 縮約理論を用いた解法を提案する。まず、縮退理論が暗黙的かつ明示的なオイラー積分法の安定性をいかに保証するかを説明する一般的な数学的結果を紹介する。そこで我々は,ODEの新しいシステム,すなわち Accelerated-Contracting-Nesterov フロー,およびそれを確立するための収縮理論を,指数収束率を持つ最適化フローとして提案し,その関連する最適化アルゴリズムの線形収束率を即時確立する。この流れの単純明示的なオイラー離散化はネステロフ加速度法に対応する。最後に,時間変動最適化問題に対する最適化アルゴリズムの設計において,このアプローチが性能保証にどのようにつながるかを示す。

Much recent interest has focused on the design of optimization algorithms from the discretization of an associated optimization flow, i.e., a system of differential equations (ODEs) whose trajectories solve an associated optimization problem. Such a design approach poses an important problem: how to find a principled methodology to design and discretize appropriate ODEs. This paper aims to provide a solution to this problem through the use of contraction theory. We first introduce general mathematical results that explain how contraction theory guarantees the stability of the implicit and explicit Euler integration methods. Then, we propose a novel system of ODEs, namely the Accelerated-Contracting-Nesterov flow, and use contraction theory to establish it is an optimization flow with exponential convergence rate, from which the linear convergence rate of its associated optimization algorithm is immediately established. Remarkably, a simple explicit Euler discretization of this flow corresponds to the Nesterov acceleration method. Finally, we present how our approach leads to performance guarantees in the design of optimization algorithms for time-varying optimization problems.

翻訳日:2021-05-20 13:42:32 公開日:2021-05-18

# ソーシャルメディアにおける画像人気予測のためのマルチモーダルディープラーニングフレームワーク

Multimodal Deep Learning Framework for Image Popularity Prediction on Social Media ( http://arxiv.org/abs/2105.08809v1 )

ライセンス: Link先を確認

Fatma S. Abousaleh, Wen-Huang Cheng, Neng-Hao Yu, and Yu Tsao

(参考訳) 何十億枚もの写真が、様々な種類のソーシャルネットワークを通じて毎日ウェブにアップロードされる。これらの画像の中には何百万ものビューを受け取り人気を得るものもあれば、全く気づかないものもある。これは、ソーシャルメディアで画像人気を予測するという問題を引き起こす。画像の人気は、視覚コンテンツ、美的品質、ユーザ、ポストメタデータ、時間など、いくつかの要因に影響される可能性がある。したがって、これら全ての要因を考慮することは、画像の人気を正確に予測するのに不可欠である。さらに,予測モデルの効率性も重要な役割を担っている。本研究では,様々なモダリティからの情報を利用するマルチモーダル学習と,様々な分野における畳み込みニューラルネットワーク(CNN)の現在の成功を動機として,様々な種類の視覚的特徴と社会的特徴を統合ネットワークモデルに組み込むことで,投稿画像の人気を予測する深層学習モデル(VSCNN)を提案する。 VSCNNはまず、2つの個別CNNを利用して入力された視覚的特徴と社会的特徴から高レベル表現を抽出することを学ぶ。これら2つのネットワークの出力をジョイントネットワークに融合し、出力層における人気スコアを推定する。 Flickrに投稿された約432K画像のデータセットを広範囲に実験することにより,提案手法の性能を評価する。シミュレーションの結果、提案したVSCNNモデルは、それぞれ平均絶対誤差と平均二乗誤差の2.33%、7.59%、14.16%以上の相対的な改善により、最先端モデルよりも大幅に優れていることが示された。

Billions of photos are uploaded to the web daily through various types of social networks. Some of these images receive millions of views and become popular, whereas others remain completely unnoticed. This raises the problem of predicting image popularity on social media. The popularity of an image can be affected by several factors, such as visual content, aesthetic quality, user, post metadata, and time. Thus, considering all these factors is essential for accurately predicting image popularity. In addition, the efficiency of the predictive model also plays a crucial role. In this study, motivated by multimodal learning, which uses information from various modalities, and the current success of convolutional neural networks (CNNs) in various fields, we propose a deep learning model, called visual-social convolutional neural network (VSCNN), which predicts the popularity of a posted image by incorporating various types of visual and social features into a unified network model. VSCNN first learns to extract high-level representations from the input visual and social features by utilizing two individual CNNs. The outputs of these two networks are then fused into a joint network to estimate the popularity score in the output layer. We assess the performance of the proposed method by conducting extensive experiments on a dataset of approximately 432K images posted on Flickr. The simulation results demonstrate that the proposed VSCNN model significantly outperforms state-of-the-art models, with a relative improvement of greater than 2.33%, 7.59%, and 14.16% in terms of Spearman's Rho, mean absolute error, and mean squared error, respectively.

翻訳日:2021-05-20 13:42:00 公開日:2021-05-18

# スパース・スパイキング勾配降下

Sparse Spiking Gradient Descent ( http://arxiv.org/abs/2105.08810v1 )

ライセンス: Link先を確認

Nicolas Perez-Nieves and Dan F.M. Goodman

(参考訳) 低エネルギー消費のため、ニューロモルフィックコンピューティングデバイスにスパイキングニューラルネットワーク(SNN)をエミュレートすることへの関心が高まっている。近年の進歩により、SNNをトレーニングすることで、従来のニューラルネットワーク(ANN)と精度で競合し始めることができると同時に、ニューロモルフィックハードウェア上での動作時のエネルギー効率も向上している。しかし、SNNのトレーニングプロセスは、SNNの時空間的疎結合性を生かしていないANN向けに開発された高密度テンソル操作に基づいている。本稿では,現在の art 法と同等かそれ以上の精度を実現しつつ,より高速かつメモリ効率を向上できる最初のスパース snn バックプロパゲーションアルゴリズムを提案する。提案手法は,70倍までの後方通過速度を達成し,精度を損なうことなく,最大40%のメモリ効率を向上できる実データ(Fashion-MNIST,Neuromophic-MNIST,Spike Heidelberg Digits)に対して有効性を示す。

There is an increasing interest in emulating Spiking Neural Networks (SNNs) on neuromorphic computing devices due to their low energy consumption. Recent advances have allowed training SNNs to a point where they start to compete with traditional Artificial Neural Networks (ANNs) in terms of accuracy, while at the same time being energy efficient when run on neuromorphic hardware. However, the process of training SNNs is still based on dense tensor operations originally developed for ANNs which do not leverage the spatiotemporally sparse nature of SNNs. We present here the first sparse SNN backpropagation algorithm which achieves the same or better accuracy as current state of the art methods while being significantly faster and more memory efficient. We show the effectiveness of our method on real datasets of varying complexity (Fashion-MNIST, Neuromophic-MNIST and Spiking Heidelberg Digits) achieving a speedup in the backward pass of up to 70x, and 40% more memory efficient, without losing accuracy.

翻訳日:2021-05-20 13:38:39 公開日:2021-05-18

# rx-anon -- 修正モンドリアンアルゴリズムに基づく異種データの復号化に関する新しいアプローチ

rx-anon -- A Novel Approach on the De-Identification of Heterogeneous Data based on a Modified Mondrian Algorithm ( http://arxiv.org/abs/2105.08842v1 )

ライセンス: Link先を確認

Fabian Singhofer, Aygul Garifullina, Mathias Kern, Ansgar Scherp

(参考訳) データ匿名化の伝統的なアプローチは、関係データとテキストデータとは独立に考える。本稿では,関係属性とテキスト属性からなる異種半構造化文書の匿名化手法であるrx-anonを提案する。テキストから抽出したセンシティブな用語を構造化データにマップする。これにより、k匿名性のような概念を使って、異種データ入力の結合されたプライバシー保護バージョンを生成することができます。我々は,異種データを一貫して匿名化するために,冗長な機密情報の概念を導入する。非構造化テキストデータと構造化データ属性との匿名化の影響を制御するために,修正されたパラメータ付きmondrianアルゴリズムを導入する。パラメータ $\lambda$ は、匿名化プロセス中に関係属性とテキスト属性に異なる重みを与えることができる。本手法は,リレーショナルデータとテキストデータの共同匿名化の問題に適応した正規化確実性ペナルティスコアを用いて,実世界の2つのデータセットを用いて評価する。提案手法は,モンドリアン分割の制御にチューニングパラメータを用いることで情報損失を低減できることを示すとともに,関係属性やセンシティブな用語のk匿名性を保証する。 rx-anonはフレームワークアプローチであるため、他の匿名化アルゴリズム、プライバシモデル、テキスト類似度メトリクスによって再利用および拡張することができる。

Traditional approaches for data anonymization consider relational data and textual data independently. We propose rx-anon, an anonymization approach for heterogeneous semi-structured documents composed of relational and textual attributes. We map sensitive terms extracted from the text to the structured data. This allows us to use concepts like k-anonymity to generate a joined, privacy-preserved version of the heterogeneous data input. We introduce the concept of redundant sensitive information to consistently anonymize the heterogeneous data. To control the influence of anonymization over unstructured textual data versus structured data attributes, we introduce a modified, parameterized Mondrian algorithm. The parameter $\lambda$ allows to give different weight on the relational and textual attributes during the anonymization process. We evaluate our approach with two real-world datasets using a Normalized Certainty Penalty score, adapted to the problem of jointly anonymizing relational and textual data. The results show that our approach is capable of reducing information loss by using the tuning parameter to control the Mondrian partitioning while guaranteeing k-anonymity for relational attributes as well as for sensitive terms. As rx-anon is a framework approach, it can be reused and extended by other anonymization algorithms, privacy models, and textual similarity metrics.

翻訳日:2021-05-20 13:38:19 公開日:2021-05-18

# 神経正準変換を伴う有限温度における相互作用フェルミオンのab-initio研究

Ab-initio study of interacting fermions at finite temperature with neural canonical transformation ( http://arxiv.org/abs/2105.08644v1 )

ライセンス: Link先を確認

Hao Xie, Linfeng Zhang, Lei Wang

(参考訳) 連続体における相互作用するフェルミオンの熱的性質に対する変動密度行列アプローチを提案する。変分密度行列は、離散確率モデルとともに置換同変多体ユニタリ変換によってパラメトリゼーションされる。ユニタリ変換は、フェルミオン座標の流れを介して相関効果を組み込んだ神経正準変換の量子対として実装される。最初の応用として、フェルミ液体からウィグナー分子への相互作用が引き起こされる2次元量子ドット中の電子を研究する。本手法は,フェルミオンサイン問題により従来の量子モンテカルロ法が深刻な困難に直面する低温状態において,正確な結果を与える。このアプローチは、さらなる拡張のために一般的かつ柔軟であり、従って超低温量子ガス、凝縮物質、暖かい高密度物質物理学の文脈で強相関フェルミオンに関する新しい物理結果を提供するという約束を持っている。

We present a variational density matrix approach to the thermal properties of interacting fermions in the continuum. The variational density matrix is parametrized by a permutation equivariant many-body unitary transformation together with a discrete probabilistic model. The unitary transformation is implemented as a quantum counterpart of neural canonical transformation, which incorporates correlation effects via a flow of fermion coordinates. As the first application, we study electrons in a two-dimensional quantum dot with an interaction-induced crossover from Fermi liquid to Wigner molecule. The present approach provides accurate results in the low-temperature regime, where conventional quantum Monte Carlo methods face severe difficulties due to the fermion sign problem. The approach is general and flexible for further extensions, thus holds the promise to deliver new physical results on strongly correlated fermions in the context of ultracold quantum gases, condensed matter, and warm dense matter physics.

翻訳日:2021-05-20 13:36:46 公開日:2021-05-18

# 共通認知モデルの予測処理実装に向けて

Towards a Predictive Processing Implementation of the Common Model of Cognition ( http://arxiv.org/abs/2105.07308v2 )

ライセンス: Link先を確認

Alexander Ororbia, M. A. Kelly

(参考訳) 本稿では,強力な,かつ単純なニューラルモデルから構築した認知的アーキテクチャを提案する。具体的には、ニューラル生成符号化とホログラフィック連想記憶に基づく認知の共通モデルの実装について述べる。提案システムは,多様なタスクから継続的に学習するエージェントを開発するための基盤となり,既存の認知アーキテクチャよりも大規模で人的パフォーマンスをモデル化する。

In this article, we present a cognitive architecture that is built from powerful yet simple neural models. Specifically, we describe an implementation of the common model of cognition grounded in neural generative coding and holographic associative memory. The proposed system creates the groundwork for developing agents that learn continually from diverse tasks as well as model human performance at larger scales than what is possible with existant cognitive architectures.

翻訳日:2021-05-20 11:27:30 公開日:2021-05-18

# 十分な校正

Calibrating sufficiently ( http://arxiv.org/abs/2105.07283v2 )

ライセンス: Link先を確認

Dirk Tasche

(参考訳) 確率的分類器を訓練して校正する場合、キャリブレーション損失のいわゆるグループ損失成分を容易に見逃すことができる。グルーピングロス(grouping loss)とは、観測可能な情報と実際に校正訓練で活用された情報との間のギャップを指す。グループ化損失とsufficiencyの概念との関係について検討し,sufficiencyの有用な基準としてコモノトニック性を特定する。 langford & zadrozny (2005) の探索還元アプローチを再検討し、グループ化損失を減らす確率的分類器の推定子を生成することを発見した。最後に,確率的分類器の訓練と「不十分」校正を支援するツールとして,ブライア曲線について考察する。

When probabilistic classifiers are trained and calibrated, the so-called grouping loss component of the calibration loss can easily be overlooked. Grouping loss refers to the gap between observable information and information actually exploited in the calibration exercise. We investigate the relation between grouping loss and the concept of sufficiency, identifying comonotonicity as a useful criterion for sufficiency. We revisit the probing reduction approach of Langford & Zadrozny (2005) and find that it produces an estimator of probabilistic classifiers that reduces grouping loss. Finally, we discuss Brier curves as tools to support training and 'sufficient' calibration of probabilistic classifiers.

翻訳日:2021-05-20 11:27:26 公開日:2021-05-18

# (参考訳) Recursive Hierarchy-Interactive Attention and Entity-Order Perception による遠隔監視型関係抽出

Distantly Supervised Relation Extraction via Recursive Hierarchy-Interactive Attention and Entity-Order Perception ( http://arxiv.org/abs/2105.08213v1 )

ライセンス: CC BY 4.0

Ridong Han, Tao Peng, Jiayu Han, Lin Yue, Hai Cui, Lu Liu

(参考訳) 最近,遠隔教師付き関係抽出が注目されている。しかし、ほとんど全ての先行作品は、文中で2つの実体の出現順序が意味論の理解に寄与しているという事実を無視している。さらに、関係階層を活用しているが、関係レベル間のヒューリスティックな効果を十分に活用していない。本稿では,関係関係の階層構造を用いて,関係レベル間の対話的情報をモデル化し,より長期的関係を扱う新しい階層型階層型対話型注意ネットワーク(RHIA)を設計する。再帰的構造において階層的関係連鎖に沿った関係強化文表現を生成する。さらに、文エンコーダがよりエンティティの外観情報を保持できるように、Entity-Order Perception (EOP)と呼ばれる新たな訓練目標を導入する。人気のNew York Times(NYT)データセットに関する実体実験が実施されている。従来のベースラインと比較して,p-r曲線,auc,top-n精度,その他の評価指標を用いて最先端のパフォーマンスを実現する。

Distantly supervised relation extraction has drawn significant attention recently. However, almost all prior works ignore the fact that, in a sentence, the appearance order of two entities contributes to the understanding of its semantics. Furthermore, they leverage relation hierarchies but don't fully exploit the heuristic effect between relation levels, i.e., higher-level relations can give useful information to the lower ones. In this paper, we design a novel Recursive Hierarchy-Interactive Attention network (RHIA), which uses the hierarchical structure of the relation to model the interactive information between the relation levels to further handle long-tail relations. It generates relation-augmented sentence representations along hierarchical relation chains in a recursive structure. Besides, we introduce a newfangled training objective, called Entity-Order Perception (EOP), to make the sentence encoder retain more entity appearance information. Substantial experiments on the popular New York Times (NYT) dataset are conducted. Compared to prior baselines, our approach achieves state-of-the-art performance in terms of precision-recall (P-R) curves, AUC, Top-N precision and other evaluation metrics.

翻訳日:2021-05-20 01:08:28 公開日:2021-05-18

# (参考訳) The Commodities News Corpus: 良質なコモディティニュースのためのリソース

The Commodities News Corpus: A Resource forUnderstanding Commodity News Better ( http://arxiv.org/abs/2105.08214v1 )

ライセンス: CC BY 4.0

Meisin Lee, Lay Ki Soon, Eu-gene Siew

(参考訳) コモディティ・ニュースは、最近の商品価格運動の要約や、ムーブメントに繋がった注目すべき出来事など、豊富な情報を含んでいる。イベント抽出を通じて、商品ニュースから抽出した有用な情報は、商品価格予測に使用できる商品価格運動と商品間の因果関係のマイニングに極めて有用である。今後の研究を容易にするために、以下の情報と注釈付きデータセットを紹介する。 (i) エンティティ(nomi-nalとnamedの両方)、 (ii) イベント(trigger words and argument role)、 (iii) イベントメタデータ(modality, polarity and intensity)、 (iv) イベント-イベント関係。

Commodity News contains a wealth of information such as sum-mary of the recent commodity price movement and notable events that led tothe movement. Through event extraction, useful information extracted fromcommodity news is extremely useful in mining for causal relation betweenevents and commodity price movement, which can be used for commodity priceprediction. To facilitate the future research, we introduce a new dataset withthe following information identified and annotated: (i) entities (both nomi-nal and named), (ii) events (trigger words and argument roles), (iii) eventmetadata: modality, polarity and intensity and (iv) event-event relations.

翻訳日:2021-05-20 00:53:47 公開日:2021-05-18

# (参考訳) ヘイスタックにおける針の発見:共同検出・追跡による4Kビデオにおけるニーフライング物体検出

Finding a Needle in a Haystack: Tiny Flying Object Detection in 4K Videos using a Joint Detection-and-Tracking Approach ( http://arxiv.org/abs/2105.08253v1 )

ライセンス: CC BY 4.0

Ryota Yoshihashi, Rei Kawakami, Shaodi You, Tu Tuan Trinh, Makoto Iida, Takeshi Naemura

(参考訳) 高解像度ビデオで小さな物体を検出することは、視覚情報がほとんど信頼できないため難しい。特に、課題には、非常に低い解像度のオブジェクト、圧縮によるmpegアーティファクト、多くのハードネガティブな検索領域が含まれる。追跡は、信頼性の低い外観と信頼性の低い動作推定のため等しく困難である。幸いなことに、この2つの困難なタスクを組み合わせることで、相互に利益が得られます。そこで,本論文では,単一,トレーニング可能な,エンドツーエンドのネットワークを通じて学習した多フレーム表現を用いて,検出と追跡を共同で行うリカレント相関ネットワークというニューラルネットワークモデルを提案する。このフレームワークは、長期にわたる畳み込みメモリネットワークを利用して、検出のための情報的外観変化を学習し、学習された表現は、その性能を高めるために追跡において共有される。鳥や無人航空機などの小型飛行物体を含むシーンの画像を含むデータセットを用いた実験において、提案手法は、深部単一フレーム検出器や既存のモーションベース検出器に対する検出性能の一貫した改善をもたらした。さらに,鳥画像データセットのトラッカとして評価された場合,ネットワークは最先端の汎用オブジェクトトラッカと同様に動作する。

Detecting tiny objects in a high-resolution video is challenging because the visual information is little and unreliable. Specifically, the challenge includes very low resolution of the objects, MPEG artifacts due to compression and a large searching area with many hard negatives. Tracking is equally difficult because of the unreliable appearance, and the unreliable motion estimation. Luckily, we found that by combining this two challenging tasks together, there will be mutual benefits. Following the idea, in this paper, we present a neural network model called the Recurrent Correlational Network, where detection and tracking are jointly performed over a multi-frame representation learned through a single, trainable, and end-to-end network. The framework exploits a convolutional long short-term memory network for learning informative appearance changes for detection, while the learned representation is shared in tracking for enhancing its performance. In experiments with datasets containing images of scenes with small flying objects, such as birds and unmanned aerial vehicles, the proposed method yielded consistent improvements in detection performance over deep single-frame detectors and existing motion-based detectors. Furthermore, our network performs as well as state-of-the-art generic object trackers when it was evaluated as a tracker on a bird image dataset.

翻訳日:2021-05-20 00:40:09 公開日:2021-05-18

# (参考訳) ログロススコアからのラベル推論攻撃

Label Inference Attacks from Log-loss Scores ( http://arxiv.org/abs/2105.08266v1 )

ライセンス: CC BY 4.0

Abhinav Aggarwal, Shiva Prasad Kasiviswanathan, Zekun Xu, Oluwaseyi Feyisetan, Nathanael Teissier

(参考訳) ログロス(クロスエントロピー損失とも呼ばれる)メトリックは、分類アルゴリズムのパフォーマンスを評価するために機械学習アプリケーションに広く使われている。本稿では,データセットのラベルを単一の(あるいは複数)log-lossスコアから推測する問題を,他のデータにアクセスせずに検討する。驚くべきことに、任意の有限個のラベルクラスに対して、任意の精度演算が可能であれば、注意深く構築された単一の予測ベクトルのログロススコアからデータセットのラベルを正確に推測できることが示されている。さらに,log-lossスコアにノイズを加えたり,演算精度が制限されたりしても成功するラベル推論アルゴリズム(attacks)を提案する。私たちのアルゴリズムはすべて数論と組合せ論のアイデアに依存しており、モデルトレーニングは必要ありません。実際のデータセット上で実験的なシミュレーションを行い、実際の攻撃の容易さを実証した。

Log-loss (also known as cross-entropy loss) metric is ubiquitously used across machine learning applications to assess the performance of classification algorithms. In this paper, we investigate the problem of inferring the labels of a dataset from single (or multiple) log-loss score(s), without any other access to the dataset. Surprisingly, we show that for any finite number of label classes, it is possible to accurately infer the labels of the dataset from the reported log-loss score of a single carefully constructed prediction vector if we allow arbitrary precision arithmetic. Additionally, we present label inference algorithms (attacks) that succeed even under addition of noise to the log-loss scores and under limited precision arithmetic. All our algorithms rely on ideas from number theory and combinatorics and require no model training. We run experimental simulations on some real datasets to demonstrate the ease of running these attacks in practice.

翻訳日:2021-05-20 00:16:38 公開日:2021-05-18

# (参考訳) echocp : コントラスト経胸腔的心エコー図を用いた卵管診断のための心エコー図データセット

EchoCP: An Echocardiography Dataset in Contrast Transthoracic Echocardiography for Patent Foramen Ovale Diagnosis ( http://arxiv.org/abs/2105.08267v1 )

ライセンス: CC BY 4.0

Tianchen Wang, Zhihe Li, Meiping Huang, Jian Zhuang, Shanshan Bi, Jiawei Zhang, Yiyu Shi, Hongwen Fei, Xiaowei Xu

(参考訳) 特許前卵胞(英: patent foramen ovale, pfo)は、心房中隔の反上部に位置する中隔、霊長体、中隔の間の潜在的分離である。 PFOは、米国で5番目に多い死因である暗号化的脳卒中を引き起こす主要な要因の1つである。 PFO診断では, 造影心エコー法(cTTE)が, 他と比べ, より堅牢な方法として好まれる。しかし,心エコービデオのソノグラフィーが手作業で行うため,cTTEによる現在のPFO診断は極めて遅い。現在、コミュニティでこの重要なトピックのための公開データセットはありません。本稿では, PFO 診断をターゲットとした, cTTE における最初の心エコー画像データセットとして EchoCP を提案する。 EchoCPは、安静とValsalva操作ビデオの両方を持つ30の患者で構成される。さらに, 術中心室セグメンテーション法に基づくpfo診断のベースライン自動決定法を確立し, 平均平均diceスコア 0.89 を得たが, 改善の余地は多く, pfo診断の精度は 0.70/0.67 に留まった。挑戦的なEchoCPデータセットがさらなる研究を刺激し、複数のドメインに影響を及ぼす革新的で汎用的なソリューションにつながることを期待しています。データセットがリリースされます。

Patent foramen ovale (PFO) is a potential separation between the septum, primum and septum secundum located in the anterosuperior portion of the atrial septum. PFO is one of the main factors causing cryptogenic stroke which is the fifth leading cause of death in the United States. For PFO diagnosis, contrast transthoracic echocardiography (cTTE) is preferred as being a more robust method compared with others. However, the current PFO diagnosis through cTTE is extremely slow as it is proceeded manually by sonographers on echocardiography videos. Currently there is no publicly available dataset for this important topic in the community. In this paper, we present EchoCP, as the first echocardiography dataset in cTTE targeting PFO diagnosis. EchoCP consists of 30 patients with both rest and Valsalva maneuver videos which covers various PFO grades. We further establish an automated baseline method for PFO diagnosis based on the state-of-the-art cardiac chamber segmentation technique, which achieves 0.89 average mean Dice score, but only 0.70/0.67 mean accuracies for PFO diagnosis, leaving large room for improvement. We hope that the challenging EchoCP dataset can stimulate further research and lead to innovative and generic solutions that would have an impact in multiple domains. Our dataset is released.

翻訳日:2021-05-19 23:52:10 公開日:2021-05-18

# (参考訳) 知識伝達による運転支援物体検出の検討

Exploring Driving-aware Salient Object Detection via Knowledge Transfer ( http://arxiv.org/abs/2105.08286v1 )

ライセンス: CC BY 4.0

Jinming Su, Changqun Xia, and Jia Li

(参考訳) 近年,ニューラルネットワークの急速な発展に伴い,sod(general salient object detection)が大きな進歩を遂げている。しかし、タスク固有のデータセットがないため、タスク対応SODの研究はほとんど行われていない。本稿では,有向物体の画素レベルのマスクが注釈付けされた運転タスク指向のデータセットを構築する。一般的なSODデータセットと比較すると、クロスドメインの知識差とタスク固有のシーンギャップは、運転時の健全な物体に焦点を合わせる2つの主な課題であることがわかった。これらの知見に触発されて,知識伝達畳み込みニューラルネットワークを用いた運転タスク認識型SODのベースラインモデルを提案した。このネットワークでは,知識差を補うために,注意に基づく知識伝達モジュールを構築する。さらに、複雑なタスク固有のシーンにおけるオブジェクトの詳細な特徴復号を行うために、効率的な境界認識機能復号モジュールを導入する。ネットワーク全体は知識伝達と機能デコードモジュールを漸進的に統合する。実験により,提案したデータセットは非常に困難であることが示され,提案手法は,タスク認識型SODの開発を容易にする12の最先端メソッドよりも優れていた。

Recently, general salient object detection (SOD) has made great progress with the rapid development of deep neural networks. However, task-aware SOD has hardly been studied due to the lack of task-specific datasets. In this paper, we construct a driving task-oriented dataset where pixel-level masks of salient objects have been annotated. Comparing with general SOD datasets, we find that the cross-domain knowledge difference and task-specific scene gap are two main challenges to focus the salient objects when driving. Inspired by these findings, we proposed a baseline model for the driving task-aware SOD via a knowledge transfer convolutional neural network. In this network, we construct an attentionbased knowledge transfer module to make up the knowledge difference. In addition, an efficient boundary-aware feature decoding module is introduced to perform fine feature decoding for objects in the complex task-specific scenes. The whole network integrates the knowledge transfer and feature decoding modules in a progressive manner. Experiments show that the proposed dataset is very challenging, and the proposed method outperforms 12 state-of-the-art methods on the dataset, which facilitates the development of task-aware SOD.

翻訳日:2021-05-19 23:49:41 公開日:2021-05-18

# (参考訳) ソーシャルネットワーク上のカスケード予測のための独立非対称埋め込みモデル

Independent Asymmetric Embedding Model for Cascade Prediction on Social Network ( http://arxiv.org/abs/2105.08291v1 )

ライセンス: CC BY 4.0

Wenjin Xie and Xiaomeng Wang and Tao Jia

(参考訳) ソーシャルネットワーク上での情報拡散の予測は,マーケティングや世論管理において極めて重要な意味を持つ。カスケード予測は、メッセージをソーシャルネットワークに再投稿する可能性のある個人を予測することを目的としている。ある種類の手法は、人口統計学的、構造的、時間的特徴を予測に利用するか、特定の情報拡散モデルに明示的に依存する。他のモデルは完全にデータ駆動であり、グローバルネットワーク構造を必要としない。そこで,ネットワーク埋め込みに基づく大規模拡散予測モデルを提案する。これらのモデルは、ユーザをカスケード情報を使用して潜在空間に埋め込むが、埋め込み時のユーザ間の介入に対する考慮が欠如している。本稿では,カスケード予測のための社会的埋め込み学習のための独立な非対称埋め込み法を提案する。既存の手法と異なり、各個体を1つの潜伏影響空間と複数の潜伏感受性空間に埋め込む。さらに,提案手法は,カスケード内のユーザ組み合わせの共起制御を捕捉し,計算効率を向上する。実世界のデータセット上で行った広範な実験の結果は、予測精度とコスト効率の両方を検証できた。

The prediction for information diffusion on social networks has great practical significance in marketing and public opinion control. Cascade prediction aims to predict the individuals who will potentially repost the message on the social network. One kind of methods either exploit demographical, structural, and temporal features for prediction, or explicitly rely on particular information diffusion models. The other kind of models are fully data-driven and do not require a global network structure. Thus massive diffusion prediction models based on network embedding are proposed. These models embed the users into the latent space using their cascade information, but are lack of consideration for the intervene among users when embedding. In this paper, we propose an independent asymmetric embedding method to learn social embedding for cascade prediction. Different from existing methods, our method embeds each individual into one latent influence space and multiple latent susceptibility spaces. Furthermore, our method captures the co-occurrence regulation of user combination in cascades to improve the calculating effectiveness. The results of extensive experiments conducted on real-world datasets verify both the predictive accuracy and cost-effectiveness of our approach.

翻訳日:2021-05-19 23:40:31 公開日:2021-05-18

# (参考訳) 自己申告症状は毎日のcovid-19感染者を予測できるか?

Can Self Reported Symptoms Predict Daily COVID-19 Cases? ( http://arxiv.org/abs/2105.08321v1 )

ライセンス: CC BY 4.0

Parth Patwa and Viswanatha Reddy and Rohan Sukumaran and Sethuraman TV and Eptehal Nashnoush and Sheshank Shankar and Rishemjit Kaur and Abhishek Singh and Ramesh Raskar

(参考訳) 新型コロナウイルスのパンデミックが世界中の生活や経済に影響を及ぼし、多くの死者を出した。ワクチン接種は重要な介入であるが、ロールアウトは世界中で遅く、不平等である。そのため、大規模な検査はウイルスをモニターし、封じ込めるための重要な方法の1つとして残っている。大規模なテストは高価で厳しいです。したがって、ケース数を見積もる別の方法が必要である。オンライン調査は、パンデミック中のデータ収集に有効な方法であることが示されている。本研究では,自己申告症状を用いて新型コロナウイルスの感染率を推定する機械学習モデルを開発した。最善のモデルでは、1州あたり平均絶対誤差(mae)が226.30(maeは27.09%)と予測され、自己報告症状を用いて実際の感染者数を予測する可能性が示された。モデルは、状態レベルでトレーニングされるローカルモデルと、すべての州で集約された複合データに基づいてトレーニングされる単一のグローバルモデルという、2つのレベルのデータ粒度で開発されている。その結果,グローバルモデルとは対照的に,局所モデルに対する誤差が低かった。また、最も重要な症状(機能)は、状態によって大きく異なることも示している。この研究は、クラウドソーシングデータに基づいて開発されたモデルは、オンラインプラットフォームを介してキュレーションされ、既存の疫学的監視インフラを費用対効果で補完できることを示した。

The COVID-19 pandemic has impacted lives and economies across the globe, leading to many deaths. While vaccination is an important intervention, its roll-out is slow and unequal across the globe. Therefore, extensive testing still remains one of the key methods to monitor and contain the virus. Testing on a large scale is expensive and arduous. Hence, we need alternate methods to estimate the number of cases. Online surveys have been shown to be an effective method for data collection amidst the pandemic. In this work, we develop machine learning models to estimate the prevalence of COVID-19 using self-reported symptoms. Our best model predicts the daily cases with a mean absolute error (MAE) of 226.30 (normalized MAE of 27.09%) per state, which demonstrates the possibility of predicting the actual number of confirmed cases by utilizing self-reported symptoms. The models are developed at two levels of data granularity - local models, which are trained at the state level, and a single global model which is trained on the combined data aggregated across all states. Our results indicate a lower error on the local models as opposed to the global model. In addition, we also show that the most important symptoms (features) vary considerably from state to state. This work demonstrates that the models developed on crowd-sourced data, curated via online platforms, can complement the existing epidemiological surveillance infrastructure in a cost-effective manner.

翻訳日:2021-05-19 23:29:16 公開日:2021-05-18

# (参考訳) ELdrオントロジーに基づくアクティブラーニングの概念と接続型クエリ

Actively Learning Concepts and Conjunctive Queries under ELdr-Ontologies ( http://arxiv.org/abs/2105.08326v1 )

ライセンス: CC BY 4.0

Maurice Funk, Jean Christoph Jung, Carsten Lutz

(参考訳) 本稿では,学習アルゴリズムがオラクル(ドメインエキスパートなど)を対話的にクエリすることのできる,Angluinのアクティブラーニングフレームワークにおいて,記述論理ELdrで定式化されたオントロジーの存在下で概念やクエリを学習する問題を考察する。 1) el-concepts, (2) symmetry-free eli-concepts, (3) chordal, symmetry-free, そしてbounded arityである結合クエリ(cqs)である。いずれの場合も、学習者は、ABoxesと同値クエリに基づいて、そのクラスから与えられた概念/クエリがターゲットと同等であるかどうかを問うオラクルメンバーシップクエリにポーズすることができる。 (3) における有界アリティに対する制限は、同値クエリで非制限な CQ が認められると取り除かれる。また,EL-concepts は ELI-ontology の存在下で学習可能な多項式クエリではないことを示す。

We consider the problem to learn a concept or a query in the presence of an ontology formulated in the description logic ELdr, in Angluin's framework of active learning that allows the learning algorithm to interactively query an oracle (such as a domain expert). We show that the following can be learned in polynomial time: (1) EL-concepts, (2) symmetry-free ELI-concepts, and (3) conjunctive queries (CQs) that are chordal, symmetry-free, and of bounded arity. In all cases, the learner can pose to the oracle membership queries based on ABoxes and equivalence queries that ask whether a given concept/query from the considered class is equivalent to the target. The restriction to bounded arity in (3) can be removed when we admit unrestricted CQs in equivalence queries. We also show that EL-concepts are not polynomial query learnable in the presence of ELI-ontologies.

翻訳日:2021-05-19 23:15:01 公開日:2021-05-18

# (参考訳) 凸クラスタリングソリューションについて

On Convex Clustering Solutions ( http://arxiv.org/abs/2105.08348v1 )

ライセンス: CC BY 4.0

Canh Hao Nguyen, Hiroshi Mamitsuka

(参考訳) 凸クラスタリング(convex clustering)は、その凸定式化による効率や最適性などの優れた特性を持つ魅力的なクラスタリングアルゴリズムである。 k平均クラスタリングと凝集クラスタリングの両方を一般化すると考えられている。しかし、凸クラスタリングがこれらのアルゴリズムの望ましい特性を維持するかどうかは不明である。一般的な期待としては、凸クラスタリングは非凸クラスタのような難しいクラスタタイプを学ぶことができる。現在の凸クラスタリングの理解は、十分に分離されたクラスタ上の一貫性結果のみに限られている。我々はその解に対する新しい理解を示す。凸クラスタリングは凸クラスタのみを学習できることを証明する。すると、クラスターは大きなギャップを持つ有界球を持つことを示す。さらに、ソリューション、正規化ハイパーパラメータ、クラスタ化可能なケース、一貫性を特徴付ける。

Convex clustering is an attractive clustering algorithm with favorable properties such as efficiency and optimality owing to its convex formulation. It is thought to generalize both k-means clustering and agglomerative clustering. However, it is not known whether convex clustering preserves desirable properties of these algorithms. A common expectation is that convex clustering may learn difficult cluster types such as non-convex ones. Current understanding of convex clustering is limited to only consistency results on well-separated clusters. We show new understanding of its solutions. We prove that convex clustering can only learn convex clusters. We then show that the clusters have disjoint bounding balls with significant gaps. We further characterize the solutions, regularization hyperparameters, inclusterable cases and consistency.

翻訳日:2021-05-19 21:34:15 公開日:2021-05-18

# (参考訳) 深層強化学習を用いたオンラインマルチモーダル交通計画

Online Multimodal Transportation Planning using Deep Reinforcement Learning ( http://arxiv.org/abs/2105.08374v1 )

ライセンス: CC BY 4.0

Amirreza Farahani, Laura Genga and Remco Dijkman

(参考訳) 本稿では,トラックにコンテナを割り当てたり,目的地に輸送する列車にコンテナを割り当てるマルチモーダル輸送計画問題を解決するための深層強化学習手法を提案する。従来の計画手法は"オフライン"(すなわち、輸送開始前にコンテナのバッチを決定する)で動作するが、提案されたアプローチは"オンライン"であり、輸送が実行される間、個々のコンテナに対して決定を下すことができる。オンライン交通計画は、当初の交通計画に影響を及ぼす可能性のある予期せぬ出来事に効果的に対応し、企業による輸送コストの削減を支援する。提案したDeep Reinforcement Learningアルゴリズムで異なるコンテナ選択ヒューリスティックスを実装し,ロジスティクス企業における実ケーススタディに基づいて,現実的なシナリオをシミュレートしたデータを用いて,各ヒューリスティックのパフォーマンスを評価した。実験の結果,提案手法はコンテナ割り当ての効果的なパターンを学習できることがわかった。競争相手の総輸送コストと車両使用能力の面では20.48%から55.32%、キャパシティは7.51%から20.54%に向上した。さらに,オフライン環境で整数線形計画解法で生成した最適解の容量は,コストで2.7%,オフラインで0.72%であった。

In this paper we propose a Deep Reinforcement Learning approach to solve a multimodal transportation planning problem, in which containers must be assigned to a truck or to trains that will transport them to their destination. While traditional planning methods work "offline" (i.e., they take decisions for a batch of containers before the transportation starts), the proposed approach is "online", in that it can take decisions for individual containers, while transportation is being executed. Planning transportation online helps to effectively respond to unforeseen events that may affect the original transportation plan, thus supporting companies in lowering transportation costs. We implemented different container selection heuristics within the proposed Deep Reinforcement Learning algorithm and we evaluated its performance for each heuristic using data that simulate a realistic scenario, designed on the basis of a real case study at a logistics company. The experimental results revealed that the proposed method was able to learn effective patterns of container assignment. It outperformed tested competitors in terms of total transportation costs and utilization of train capacity by 20.48% to 55.32% for the cost and by 7.51% to 20.54% for the capacity. Furthermore, it obtained results within 2.7% for the cost and 0.72% for the capacity of the optimal solution generated by an Integer Linear Programming solver in an offline setting.

翻訳日:2021-05-19 21:13:03 公開日:2021-05-18

# (参考訳) i2c2w:正確なシーン認識のための画像から文字への変換器

I2C2W: Image-to-Character-to-Word Transformers for Accurate Scene Text Recognition ( http://arxiv.org/abs/2105.08383v1 )

ライセンス: CC0 1.0

Chuhui Xue, Shijian Lu, Song Bai, Wenqing Zhang, Changhu Wang

(参考訳) 自然言語処理の進歩を利用して、最近のシーンのテキスト認識者はエンコーダ-デコーダアーキテクチャを採用しており、テキストイメージはまず代表的特徴に変換され、その後 ‘direct decoding’ を介して文字のシーケンスに変換される。しかし、シーンテキスト画像は複雑な背景や幾何歪みなどの様々な音源の豊かなノイズに悩まされ、デコーダを混乱させ、ノイズの多いデコード時間ステップで視覚的特徴の不正なアライメントにつながる。本稿では,シーンの様々なノイズに対して正確かつ耐性のある新しいシーンテキスト認識装置I2C2Wを提案する。 i2c2wはイメージ・ツー・キャラクタモジュール(i2c)と文字・ワードモジュール(c2w)から構成される。 i2cは文字を検出し、単語内の相対位置を予測する。時間ステップの制限なしに、異なる視覚的特徴のアライメントに基づいて、不正かつ冗長な文字を含む全ての文字を検出する。検出された文字を入力として、C2Wは文字の意味とその位置から学習し、不正かつ冗長な検出をフィルタリングし、最終的な単語認識を生成する。 7つの公開データセットに対する大規模な実験は、I2C2Wが優れた認識性能を達成し、不規則なシーンテキストデータセットに対して大きなマージンで最先端のパフォーマンスを達成していることを示している。

Leveraging the advances of natural language processing, most recent scene text recognizers adopt an encoder-decoder architecture where text images are first converted to representative features and then a sequence of characters via `direct decoding'. However, scene text images suffer from rich noises of different sources such as complex background and geometric distortions which often confuse the decoder and lead to incorrect alignment of visual features at noisy decoding time steps. This paper presents I2C2W, a novel scene text recognizer that is accurate and tolerant to various noises in scenes. I2C2W consists of an image-to-character module (I2C) and a character-to-word module (C2W) which are complementary and can be trained end-to-end. I2C detects characters and predicts their relative positions in a word. It strives to detect all possible characters including incorrect and redundant ones based on different alignments of visual features without the restriction of time steps. Taking the detected characters as input, C2W learns from character semantics and their positions to filter out incorrect and redundant detection and produce the final word recognition. Extensive experiments over seven public datasets show that I2C2W achieves superior recognition performances and outperforms the state-of-the-art by large margins on challenging irregular scene text datasets.

翻訳日:2021-05-19 20:58:29 公開日:2021-05-18

# (参考訳) 道路網における小型物体の検出の改善

Improved detection of small objects in road network sequences ( http://arxiv.org/abs/2105.08416v1 )

ライセンス: CC BY 4.0

Iv\'an Garc\'ia, Rafael Marcos Luque and Ezequiel L\'opez

(参考訳) 現在の道路ネットワークにある膨大な数の既存のIPカメラは、取得したデータを利用してビデオを分析し、重要なイベントを検出する機会である。この目的のためには、数年前まで古典的な人工視覚技術を用いて行われた作業である移動車を検出する必要がある。現在、深層学習ネットワークによって大幅に改善されている。それでも、オブジェクト検出はコンピュータビジョンにおける主要なオープン問題の1つと考えられている。現在のシナリオは絶えず進化しており、この分野を改善しようとする新しいモデルやテクニックが現れています。特に、道路シーンに登場する車両に主に対応している小型物体の検出に関して、新たな問題や欠点が現れる。これらのことは、小さな元素の低い検出率を改善する新しいソリューションが不可欠であることを意味している。様々な研究ラインの中で、この研究は小さな物体の検出に焦点を当てている。特に,本提案では,ビデオ監視カメラで撮影した画像からの車両検出を目的とした。本研究では,畳み込みニューラルネットワーク \emph{(CNN)} による検出に基づく超解像プロセスを適用することで,小型物体の検出を行う新しい手法を提案する。ニューラルネットワークは、画像の解像度を向上させるプロセスと統合され、オブジェクト検出性能が向上する。この手法は,様々なスケールの要素を含む一組のトラヒック画像に対して,モデルにより得られた検出結果に応じて効率性をテストすることで,幅広い状況において良好な結果が得られることを示す。

The vast number of existing IP cameras in current road networks is an opportunity to take advantage of the captured data and analyze the video and detect any significant events. For this purpose, it is necessary to detect moving vehicles, a task that was carried out using classical artificial vision techniques until a few years ago. Nowadays, significant improvements have been obtained by deep learning networks. Still, object detection is considered one of the leading open issues within computer vision. The current scenario is constantly evolving, and new models and techniques are appearing trying to improve this field. In particular, new problems and drawbacks appear regarding detecting small objects, which correspond mainly to the vehicles that appear in the road scenes. All this means that new solutions that try to improve the low detection rate of small elements are essential. Among the different emerging research lines, this work focuses on the detection of small objects. In particular, our proposal aims to vehicle detection from images captured by video surveillance cameras. In this work, we propose a new procedure for detecting small-scale objects by applying super-resolution processes based on detections performed by convolutional neural networks \emph{(CNN)}. The neural network is integrated with processes that are in charge of increasing the resolution of the images to improve the object detection performance. This solution has been tested for a set of traffic images containing elements of different scales to test the efficiency according to the detections obtained by the model, thus demonstrating that our proposal achieves good results in a wide range of situations.

翻訳日:2021-05-19 20:44:02 公開日:2021-05-18

# (参考訳) 深層学習概念に基づくPtychographyのパラメータ改善手法

A parameter refinement method for Ptychography based on Deep Learning concepts ( http://arxiv.org/abs/2105.08058v1 )

ライセンス: CC BY 4.0

Francesco Guzzi, George Kourousias, Fulvio Bill\`e, Roberto Pugliese, Alessandra Gianoncelli and Sergio Carrato

(参考訳) x-ray ptychography(x-ray ptychography)は、生物およびナノテクノロジーの標本の詳細な定量的イメージングを提供する高度な計算顕微鏡技術である。しかし, 伝播距離, 位置誤差, 部分コヒーレンスにおける粗いパラメータは, 実験の有効性をしばしば低下させる。本研究では,これらのアクタを正式に導入し,最適化問題としての再構築全体を解決した。最新のDeep Learningフレームワークは、セットアップの不整合を自律的に補正するために使用され、ポチコグラフィーの再構築の質が向上する。自動的なプロシージャは信頼性のある分析の時間を短縮するために非常に重要であり、この種の顕微鏡を使用するすべての分野に重大な影響を及ぼす。ソフトウェアフレームワークであるSciComPtyにアルゴリズムを実装し、オープンソースとしてリリースしました。我々は,elettra シンクロトロン施設のツインミックビームラインで取得した合成データセットと実データの両方でシステムをテストした。

X-ray Ptychography is an advanced computational microscopy technique which is delivering exceptionally detailed quantitative imaging of biological and nanotechnology specimens. However coarse parametrisation in propagation distance, position errors and partial coherence frequently menaces the experiment viability. In this work we formally introduced these actors, solving the whole reconstruction as an optimisation problem. A modern Deep Learning framework is used to correct autonomously the setup incoherences, thus improving the quality of a ptychography reconstruction. Automatic procedures are indeed crucial to reduce the time for a reliable analysis, which has a significant impact on all the fields that use this kind of microscopy. We implemented our algorithm in our software framework, SciComPty, releasing it as open-source. We tested our system on both synthetic datasets and also on real data acquired at the TwinMic beamline of the Elettra synchrotron facility.

翻訳日:2021-05-19 20:31:52 公開日:2021-05-18

# (参考訳) DRILL:不均衡生涯学習のための動的表現

DRILL: Dynamic Representations for Imbalanced Lifelong Learning ( http://arxiv.org/abs/2105.08445v1 )

ライセンス: CC BY 4.0

Kyra Ahrens, Fares Abawi, Stefan Wermter

(参考訳) 継続的あるいは生涯的学習は、機械学習、特に自然言語処理(NLP)における長年にわたる課題である。 bertのような最先端の言語モデルは、マルチタスク学習シナリオにおける優れたパフォーマンスのために、この分野で新たな時代を迎えてきたが、データ分布のシフトを伴う連続したデータストリームに晒された時に忘れられてしまう。本稿では,オープンドメインテキスト分類のための新しい連続学習アーキテクチャDRILLを紹介する。 DRILLは生物学的にインスパイアされた自己組織化ニューラルアーキテクチャを利用して、BERTから潜在言語表現をタスクインクリメンタルに選択的にゲートする。本実験では,DRILLがタスク境界に関する事前知識を必要とせず,非定常的な非定常データの現実的なシナリオにおいて,現在の手法よりも優れていることを示す。我々の知る限りでは、DRILLはNLPのオープンドメイン生涯学習に自己組織化ニューラルアーキテクチャを使用した最初の種類のものだ。

Continual or lifelong learning has been a long-standing challenge in machine learning to date, especially in natural language processing (NLP). Although state-of-the-art language models such as BERT have ushered in a new era in this field due to their outstanding performance in multitask learning scenarios, they suffer from forgetting when being exposed to a continuous stream of data with shifting data distributions. In this paper, we introduce DRILL, a novel continual learning architecture for open-domain text classification. DRILL leverages a biologically inspired self-organizing neural architecture to selectively gate latent language representations from BERT in a task-incremental manner. We demonstrate in our experiments that DRILL outperforms current methods in a realistic scenario of imbalanced, non-stationary data without prior knowledge about task boundaries. To the best of our knowledge, DRILL is the first of its kind to use a self-organizing neural architecture for open-domain lifelong learning in NLP.

翻訳日:2021-05-19 20:16:59 公開日:2021-05-18

# (参考訳) 局所制御された距離ベクトル流を用いた深部能動輪郭

Deep Active Contours Using Locally Controlled Distance Vector Flow ( http://arxiv.org/abs/2105.08447v1 )

ライセンス: CC BY 4.0

Parastoo Akbari, Atefeh Ziaei, and Hamed Azarnoush

(参考訳) ACM(Active Contours Model)はコンピュータビジョンや画像処理に広く使われている。近年、畳み込みニューラルネットワーク(CNN)は、エネルギー関数と初期化のパラメータへのACMの依存に伴う制限を取り除くために、輪郭の進化と画像セグメント化の過程において、ユーザに代わってアクティブな輪郭と組み合わせられている。しかし、それ以前の作業は、ここで対処する自動初期化を目標としなかった。手動初期化に加えて、現在のメソッドは初期位置に対して高い感度を持ち、境界を正確に定義できない。エネルギー関数パラメータの割り当て問題に加えて,手動初期化や捕捉範囲の不足,境界への収束性の低下といった問題に対処する完全自動画像分割手法を提案する。 2つのcnnを訓練し, 有効輪郭重み付けパラメータを予測し, 距離変換(dt)と初期化円を抽出するための基底真理マスクを生成する。距離変換は、画像の各ピクセルから境界上の最も近い点へ向けてベクトル場を形成するために使用され、その大きさはユークリッド距離写像と等しい。本研究では, ビルディングインスタンスセグメンテーションデータセットであるVayhingenとBingハッシュと, INBreastとDDSM-BCRPの2つのマンモグラフィ画像データセットを含む4つの公開データセットについて評価を行った。今回のアプローチは,平均交点オーバーユニオン(miou)の0.59 ans 2.39パーセント,境界f-score(boundf)の7.38および8.62パーセントという,vaihingenとbing hutsデータセットの最新の研究を上回っている。 INBreast と DDSM-BCRP データセットのDice similarity coefficient は94.23% と 90.89% である。

Active contours Model (ACM) has been extensively used in computer vision and image processing. In recent studies, Convolutional Neural Networks (CNNs) have been combined with active contours replacing the user in the process of contour evolution and image segmentation to eliminate limitations associated with ACM's dependence on parameters of the energy functional and initialization. However, prior works did not aim for automatic initialization which is addressed here. In addition to manual initialization, current methods are highly sensitive to initial location and fail to delineate borders accurately. We propose a fully automatic image segmentation method to address problems of manual initialization, insufficient capture range, and poor convergence to boundaries, in addition to the problem of assignment of energy functional parameters. We train two CNNs, which predict active contour weighting parameters and generate a ground truth mask to extract Distance Transform (DT) and an initialization circle. Distance transform is used to form a vector field pointing from each pixel of the image towards the closest point on the boundary, the size of which is equal to the Euclidean distance map. We evaluate our method on four publicly available datasets including two building instance segmentation datasets, Vaihingen and Bing huts, and two mammography image datasets, INBreast and DDSM-BCRP. Our approach outperforms latest research by 0.59 ans 2.39 percent in mean Intersection-over-Union (mIoU), 7.38 and 8.62 percent in Boundary F-score (BoundF) for Vaihingen and Bing huts datasets, respectively. Dice similarity coefficient for the INBreast and DDSM-BCRP datasets is 94.23% and 90.89%, respectively indicating our method is comparable to state-of-the-art frameworks.

翻訳日:2021-05-19 20:07:23 公開日:2021-05-18

# (参考訳) 顔アンチスプーフィングのための教師なし複合ドメイン適応

Unsupervised Compound Domain Adaptation for Face Anti-Spoofing ( http://arxiv.org/abs/2105.08463v1 )

ライセンス: CC BY 4.0

Ankush Panwar, Pratyush Singh, Suman Saha, Danda Pani Paudel and Luc Van Gool

(参考訳) 現実の環境での顔認証システムを堅牢なものにすることを目的とした顔認証の課題に対処する。モデルがトレーニングされたラベル付きソースドメインと比較して,ライブ対スプーフ顔画像の検出状況は,対象領域で大きく異なる場合がある。このような違いは、新規で未知のスプーフタイプ、照明条件、背景などによって引き起こされる可能性がある。これらの差異は対象を複合ドメインとし、教師なし複ドメイン適応の問題を提起する。本稿では,本研究で初めて,対スプーフィング作業における複合ドメイン仮定の有効性を実証する。そこで本研究では,ソースモデルを対象ドメインに適応させるメモリ拡張手法を提案する。カリキュラム学習とドメインに依存しないソースネットワークトレーニングアプローチを用いることで、適応プロセスをさらに改善する。提案手法は,複数の新しいスプーフ型からなる複合ターゲットドメインに適応する。複数のベンチマークデータセットに対する実験により,提案手法が最先端よりも優れていることを示す。

We address the problem of face anti-spoofing which aims to make the face verification systems robust in the real world settings. The context of detecting live vs. spoofed face images may differ significantly in the target domain, when compared to that of labeled source domain where the model is trained. Such difference may be caused due to new and unknown spoof types, illumination conditions, scene backgrounds, among many others. These varieties of differences make the target a compound domain, thus calling for the problem of the unsupervised compound domain adaptation. We demonstrate the effectiveness of the compound domain assumption for the task of face anti-spoofing, for the first time in this work. To this end, we propose a memory augmentation method for adapting the source model to the target domain in a domain aware manner. The adaptation process is further improved by using the curriculum learning and the domain agnostic source network training approaches. The proposed method successfully adapts to the compound target domain consisting multiple new spoof types. Our experiments on multiple benchmark datasets demonstrate the superiority of the proposed method over the state-of-the-art.

翻訳日:2021-05-19 19:57:13 公開日:2021-05-18

# (参考訳) ビデオポリープセグメンテーションのためのプログレッシブノーマライズドセルフアテンションネットワーク

Progressively Normalized Self-Attention Network for Video Polyp Segmentation ( http://arxiv.org/abs/2105.08468v1 )

ライセンス: CC BY 4.0

Ge-Peng Ji, Yu-Cheng Chou, Deng-Ping Fan, Geng Chen, Huazhu Fu, Debesh Jha, and Ling Shao

(参考訳) 既存のビデオポリプセグメンテーション(VPS)モデルは、通常、特徴を抽出するために畳み込みニューラルネットワーク(CNN)を使用する。しかし、cnnは、その限られた受容領域のため、連続するビデオフレームの全体的時間的および空間的情報を十分に活用できないため、偽陽性のセグメンテーション結果が得られる。本稿では,RTX 2080 GPU上でのリアルタイム速度 (~140fps) のポリプビデオから表現を効率よく学習し,後処理を行わない新しい PNS-Net (Progressively Normalized Self-attention Network) を提案する。当社のPNS-Netは,再帰性とCNNを完全に装備する,基本的正規化自己注意ブロックのみをベースとしています。 VPSデータセットに挑戦する実験は、提案されたNS-Netが最先端のパフォーマンスを達成することを示す。また,チャネル分割,ソフトアテンション,プログレッシブ学習戦略の有効性を検討するために,広範囲な実験を行った。 PNS-Netは、異なる設定でうまく機能し、VPSタスクに対する有望なソリューションになります。

Existing video polyp segmentation (VPS) models typically employ convolutional neural networks (CNNs) to extract features. However, due to their limited receptive fields, CNNs can not fully exploit the global temporal and spatial information in successive video frames, resulting in false-positive segmentation results. In this paper, we propose the novel PNS-Net (Progressively Normalized Self-attention Network), which can efficiently learn representations from polyp videos with real-time speed (~140fps) on a single RTX 2080 GPU and no post-processing. Our PNS-Net is based solely on a basic normalized self-attention block, equipping with recurrence and CNNs entirely. Experiments on challenging VPS datasets demonstrate that the proposed PNS-Net achieves state-of-the-art performance. We also conduct extensive experiments to study the effectiveness of the channel split, soft-attention, and progressive learning strategy. We find that our PNS-Net works well under different settings, making it a promising solution to the VPS task.

翻訳日:2021-05-19 19:42:13 公開日:2021-05-18

# (参考訳) ベイズ型プレイヤーモデリングによる高速ゲームコンテンツ適応

Fast Game Content Adaptation Through Bayesian-based Player Modelling ( http://arxiv.org/abs/2105.08484v1 )

ライセンス: CC BY 4.0

Miguel Gonz\'alez-Duque, Rasmus Berg Palm and Sebastian Risi

(参考訳) ゲーム(および多くのユーザ向けシステム)では、ユーザの好みや経験にコンテンツを適用することが重要な課題である。本稿では,動的難易度調整(DDA)の文脈において,この目標を実現する新しい手法を提案する。ここでの目的は、ゲームの内容がプレイヤーのスキルレベルに常に適応し、難しすぎるか難しすぎる状態を避けることで、プレイヤーをエンゲージさせることである。 DDAの現在のシステムは、高価なデータマイニングや、特定のドメイン用に設計された手作りのルールに依存しており、通常はプレイヤーをフローに維持するために適応し、デザイナーが意図的に簡単または困難であるコンテンツを提示する余地は残っていない。本稿では,領域に依存せず,特定の困難を対象とするベイズ最適化ベースのddaシステムを提案する。我々はこのフレームワークをパズルゲームSudokuと単純なRogueライクゲームという2つの異なる領域に展開する。獲得関数の最適化を変更することで,5回未満の反復(sudoku)と15回の反復(simple roguelike)で,異なるスキルレベルを持つプレイヤーに対して,難易度の高いパズルを提示できる。これらの結果は、様々な領域におけるコンテンツ適応の有望な代替案を指している。

In games (as well as many user-facing systems), adapting content to user's preferences and experience is an important challenge. This paper explores a novel method to realize this goal in the context of dynamic difficulty adjustment (DDA). Here the aim is to constantly adapt the content of a game to the skill level of the player, keeping them engaged by avoiding states that are either too difficult or too easy. Current systems for DDA rely on expensive data mining, or on hand-crafted rules designed for particular domains, and usually adapts to keep players in the flow, leaving no room for the designer to present content that is purposefully easy or difficult. This paper presents a Bayesian Optimization-based system for DDA that is agnostic to the domain and that can target particular difficulties. We deploy this framework in two different domains: the puzzle game Sudoku, and a simple Roguelike game. By modifying the acquisition function's optimization, we are reliably able to present a puzzle with a bespoke difficulty for players with different skill levels in less than five iterations (for Sudoku) and fifteen iterations (for the simple Roguelike), significantly outperforming simpler heuristics for difficulty adjustment in said domains, with the added benefit of maintaining a model of the user. These results point towards a promising alternative for content adaption in a variety of different domains.

翻訳日:2021-05-19 19:30:56 公開日:2021-05-18

# (参考訳) ターゲットディスプレイ広告におけるマルチタスク学習によるオーディエンス多段階変換の系列依存性のモデル化

Modeling the Sequential Dependence among Audience Multi-step Conversions with Multi-task Learning in Targeted Display Advertising ( http://arxiv.org/abs/2105.08489v1 )

ライセンス: CC BY 4.0

Dongbo Xi, Zhen Chen, Peng Yan, Yinger Zhang, Yongchun Zhu, Fuzhen Zhuang, Yu Chen

(参考訳) 現実世界のほとんどのオンラインアプリケーション(eコマースや金融など)では、顧客獲得は通常、オーディエンスの多段階変換プロセスである。例えば、インプレッション->クリック->購入プロセスは通常、Eコマースプラットフォームのオーディエンスによって実行される。しかし、従来の広告よりも金融広告(クレジットカード広告など)で顧客を獲得することは困難である。一方、オーディエンスマルチステップ変換パスは長くなる。一方、正のフィードバックは段階的にスパーサー(クラス不均衡)であり、活性化の遅延による最終正のフィードバックを得ることは困難である。マルチタスク学習は、この方向の典型的なソリューションです。この方向にかなりのマルチタスクの取り組みがなされているが、長年にわたる課題は、エンド・ツー・エンドの変換を改善するために、オーディエンス間の長いパスのシーケンシャルな依存を明示的にモデル化する方法である。本稿では,適応型情報転送(ait)モジュールを用いて,オーディエンス間の逐次依存性をモデル化する適応型情報転送マルチタスク(aitm)フレームワークを提案する。 AITモジュールは、異なる変換段階で転送する情報の種類と量を適応的に学習することができる。さらに、損失関数に振舞い期待キャリブレータを組み合わせることで、AITMフレームワークはより正確なエンドツーエンド変換識別を得ることができる。提案するフレームワークはMeituanアプリにデプロイされ、Meituan Co-Branded Credit Cardsのエンドツーエンドの変換レートの高いバナーをリアルタイムでユーザに提示する。産業用および公共用両方の実世界のデータセットのオフライン実験結果から,提案したフレームワークは最先端のベースラインに比べて性能が著しく向上していることが明らかとなった。

In most real-world large-scale online applications (e.g., e-commerce or finance), customer acquisition is usually a multi-step conversion process of audiences. For example, an impression->click->purchase process is usually performed of audiences for e-commerce platforms. However, it is more difficult to acquire customers in financial advertising (e.g., credit card advertising) than in traditional advertising. On the one hand, the audience multi-step conversion path is longer. On the other hand, the positive feedback is sparser (class imbalance) step by step, and it is difficult to obtain the final positive feedback due to the delayed feedback of activation. Multi-task learning is a typical solution in this direction. While considerable multi-task efforts have been made in this direction, a long-standing challenge is how to explicitly model the long-path sequential dependence among audience multi-step conversions for improving the end-to-end conversion. In this paper, we propose an Adaptive Information Transfer Multi-task (AITM) framework, which models the sequential dependence among audience multi-step conversions via the Adaptive Information Transfer (AIT) module. The AIT module can adaptively learn what and how much information to transfer for different conversion stages. Besides, by combining the Behavioral Expectation Calibrator in the loss function, the AITM framework can yield more accurate end-to-end conversion identification. The proposed framework is deployed in Meituan app, which utilizes it to real-timely show a banner to the audience with a high end-to-end conversion rate for Meituan Co-Branded Credit Cards. Offline experimental results on both industrial and public real-world datasets clearly demonstrate that the proposed framework achieves significantly better performance compared with state-of-the-art baselines.

翻訳日:2021-05-19 19:15:08 公開日:2021-05-18

# (参考訳) 画素レベルでのリモートセンシング画像のマルチビューコントラスト符号化

Multi-view Contrastive Coding of Remote Sensing Images at Pixel-level ( http://arxiv.org/abs/2105.08501v1 )

ライセンス: CC BY 4.0

Yuxing Chen

(参考訳) 我々の惑星は複数のセンサー(マルチスペクトル、ライダー、SARなど)と異なる時間で衛星によって観測される。多視点観察は、一つのものよりも補完的な情報をもたらす。あるいは、幾何やセマンティクスなど、異なるビュー間で共有される共通の機能もある。近年,マルチビューリモートセンシング画像のアライメントと,ビュー不変因子のモデル化による単一センサ画像の特徴表現の改善のために,コントラスト学習手法が提案されている。しかし、これらの手法は、事前に定義されたタスクの事前学習、あるいは画像レベルの分類のみに焦点を当てている。さらに、これらの手法は不確実性推定の研究を欠いている。本研究では,この制限を克服するために,ラベルのないマルチビュー設定に基づく画素単位のコントラスト的アプローチを提案する。これは、特徴アライメントにおける対照的な損失と、多視点画像間の均一性によって達成される。このアプローチでは, 擬似媒介ResUnetを用いて, シフトした正の対から特徴を整列させ, 超球上の特徴の誘導分布を均一化することを目的とした表現を学習する。マルチビューリモートセンシング画像の学習特徴を、線形プロトコルの評価と教師なしの変更検出タスクに基づいて評価する。提案手法を動作させるアプローチの重要な特性を分析し,シフト等分散の要求が提案手法の成功を保証し,表現の不確実性の推定が性能向上につながることを見出した。さらに、マルチビューコントラスト学習の性能は、異なるセンサの選択によって影響を受ける。その結果,最先端のマルチビューコントラスト法よりも効率と精度が向上した。

Our planet is viewed by satellites through multiple sensors (e.g., multi-spectral, Lidar and SAR) and at different times. Multi-view observations bring us complementary information than the single one. Alternatively, there are common features shared between different views, such as geometry and semantics. Recently, contrastive learning methods have been proposed for the alignment of multi-view remote sensing images and improving the feature representation of single sensor images by modeling view-invariant factors. However, these methods are based on the pretraining of the predefined tasks or just focus on image-level classification. Moreover, these methods lack research on uncertainty estimation. In this work, a pixel-wise contrastive approach based on an unlabeled multi-view setting is proposed to overcome this limitation. This is achieved by the use of contrastive loss in the feature alignment and uniformity between multi-view images. In this approach, a pseudo-Siamese ResUnet is trained to learn a representation that aims to align features from the shifted positive pairs and uniform the induced distribution of the features on the hypersphere. The learned features of multi-view remote sensing images are evaluated on a liner protocol evaluation and an unsupervised change detection task. We analyze key properties of the approach that make it work, finding that the requirement of shift equivariance ensured the success of the proposed approach and the uncertainty estimation of representations leads to performance improvements. Moreover, the performance of multi-view contrastive learning is affected by the choice of different sensors. Results demonstrate both improvements in efficiency and accuracy over the state-of-the-art multi-view contrastive methods.

翻訳日:2021-05-19 18:59:26 公開日:2021-05-18

# (参考訳) ニューラルマシン翻訳における最小ベイズリスク復号の特性の理解

Understanding the Properties of Minimum Bayes Risk Decoding in Neural Machine Translation ( http://arxiv.org/abs/2105.08504v1 )

ライセンス: CC BY 4.0

Mathias M\"uller and Rico Sennrich

(参考訳) ニューラルマシン翻訳(nmt)は現在、短すぎる翻訳や頻繁な単語の過剰生成といったバイアスを示しており、トレーニングデータやドメインシフトのノイズをコピーするロバスト性が乏しい。最近の研究はこれらの欠点をビーム探索(nmtのデファクト標準推論アルゴリズム)と結びつけており、eikema & aziz (2020) は最小ベイズリスク(mbr)をバイアスのないサンプルにデコードすることを提案している。本稿では,これまでに報告された多数のバイアスとビームサーチの故障事例に対するmbr復号の特性について実験的に検討する。 MBRは、実用関数として使用されるMT測定値から、長さとトークンの周波数バイアスがまだ残っているが、トレーニングデータやドメインシフトのコピーノイズに対する堅牢性も向上している。

Neural Machine Translation (NMT) currently exhibits biases such as producing translations that are too short and overgenerating frequent words, and shows poor robustness to copy noise in training data or domain shift. Recent work has tied these shortcomings to beam search -- the de facto standard inference algorithm in NMT -- and Eikema & Aziz (2020) propose to use Minimum Bayes Risk (MBR) decoding on unbiased samples instead. In this paper, we empirically investigate the properties of MBR decoding on a number of previously reported biases and failure cases of beam search. We find that MBR still exhibits a length and token frequency bias, owing to the MT metrics used as utility functions, but that MBR also increases robustness against copy noise in the training data and domain shift.

翻訳日:2021-05-19 18:46:57 公開日:2021-05-18

# (参考訳) Transformers \`a Grande Vitesse

Transformers \`a Grande Vitesse ( http://arxiv.org/abs/2105.08526v1 )

ライセンス: CC BY-SA 4.0

Farid Arthaud, Guillaume Lecoeur, Alban Pierre

(参考訳) 堅牢な走行時間予測は、交通インフラ、特に交通規制と乗客満足度の両方に大きな影響を与える鉄道網の管理において最も重要なものである。我々は,鉄道網全体の規模で鉄道区間を走行する列車の走行時間を予測することを目的として,理論的循環計画に対する列車の遅延を推定する。鉄道会社内の既存の実装は、列車の遅延が残りの旅行の間一定であるように近似して機能する。列車の遅延の進行を予測することは、主要な道路交通予測問題と異なり、列車の間隔、駅の混雑、不均一な車両など、いくつかの難解な現象を含むため、ユニークな難題である。まず,フランス国鉄の遅延伝播現象の実証的証拠を提示し,列車間の相互作用によって遅延が増幅されることを示した。次に, 変圧器アーキテクチャと事前学習した組込みを用いた新しい手法を提案し, 鉄道網全体のスケールで列車の遅延をリアルタイムに並列に予測する手法を提案する(ピーク時3k以上の列車は平均70分間隔で予測を行う)。提案手法は,現在使われている,実験的な予測手法と比較して,実世界のデータに対して非常に肯定的な結果をもたらす。私たちの仕事は、フランスの鉄道会社sncfによる旅客情報システムの実装の初期段階にあり、交通規制決定を支援するツールとして候補となっている。

Robust travel time predictions are of prime importance in managing any transportation infrastructure, and particularly in rail networks where they have major impacts both on traffic regulation and passenger satisfaction. We aim at predicting the travel time of trains on rail sections at the scale of an entire rail network in real-time, by estimating trains' delays relative to a theoretical circulation plan. Existing implementations within railway companies generally work using the approximation that a train's delay will stay constant for the rest of its trip. Predicting the evolution of a given train's delay is a uniquely hard problem, distinct from mainstream road traffic forecasting problems, since it involves several hard-to-model phenomena: train spacing, station congestion and heterogeneous rolling stock among others. We first offer empirical evidence of the previously unexplored phenomenon of delay propagation in the French National Railway Network, leading to delays being amplified by interactions between trains. We then contribute a novel technique using the transformer architecture and pre-trained embeddings to make real-time massively parallel predictions for train delays at the scale of the whole rail network (over 3k trains at peak hours, making predictions at an average horizon of 70 minutes). Our approach yields very positive results on real-world data when compared to currently-used and experimental prediction techniques. Our work is in the early stages of implementation for industrial use at the French railway company SNCF for passenger information systems, and a contender as a tool to aid traffic regulation decisions.

翻訳日:2021-05-19 18:35:22 公開日:2021-05-18

# (参考訳) VASS到達性問題に対するアッカーマン下界の改善

Improved Ackermannian lower bound for the VASS reachability problem ( http://arxiv.org/abs/2105.08551v1 )

ライセンス: CC BY 4.0

S{\l}awomir Lasota

(参考訳) このドラフトは、最近 Czerwi\'nski と Orlikowski によって発表された状態を持つベクトル加算系(VASS)における到達可能性問題に対するアッカーマン下界のフォローアップである。独立して、同じ結果がlerouxによって発表されたが、かなり異なる証拠がある。 czerwi\'nski と orlikowski は、次元 6k$ で $f_k$-hardness、次元 4k+9$ で leroux を証明しているが、単純化された構成により、既に次元 3k+2$ で $f_k$-hardness が得られる。

This draft is a follow-up of the Ackermannian lower bound for the reachability problem in vector addition systems with states (VASS), recently announced by Czerwi\'nski and Orlikowski. Independently, the same result has been announced by Leroux, but with a significantly different proof. We provide a simplification of the former construction, thus improving the lower bound for VASS in fixed dimension: while Czerwi\'nski and Orlikowski prove $F_k$-hardness in dimension $6k$, and Leroux in dimension $4k+9$, the simplified construction yields $F_k$-hardness already in dimension $3k+2$.

翻訳日:2021-05-19 18:16:10 公開日:2021-05-18

# (参考訳) 複雑な3次元環境におけるキュラス探索のための$\beta$-VAE符号化

Fixed $\beta$-VAE Encoding for Curious Exploration in Complex 3D Environments ( http://arxiv.org/abs/2105.08568v1 )

ライセンス: CC BY 4.0

Auguste Lehuger, Matthew Crosby

(参考訳) 好奇心は、環境報酬を内在的な報酬で増やす一般的な方法であり、探索を促進し、スパース報酬設定において特に有用である。キュリオシティは次の状態予測誤差を用いて計算されるため、使用する状態エンコーディングの種類は性能に大きな影響を与える。ランダムな特徴と逆動的特徴は、Atariや他の主に2D環境の以前の結果に基づいて、VAEよりも一般的に好まれる。しかし、VAEと異なり、最適な行動のための十分な情報をエンコードしていないため、環境が複雑化するにつれて、ますます重要になる。本稿では,3D物理環境であるAnimal-AIを用いて,固定された$\beta$-VAEエンコーディングを好奇心で効果的に利用できることを示す。これをカリキュラム学習と組み合わせて、未解決の探索集約的なデトラウトタスクを解き、次の最良エンコーディングに対してトレーニングカリキュラムのサンプル効率を22倍に向上させる。また、atariのブレイクアウトの結果は、ランダムな機能や逆ダイナミクス機能よりも優れたエンコーディングで一致しています。

Curiosity is a general method for augmenting an environment reward with an intrinsic reward, which encourages exploration and is especially useful in sparse reward settings. As curiosity is calculated using next state prediction error, the type of state encoding used has a large impact on performance. Random features and inverse-dynamics features are generally preferred over VAEs based on previous results from Atari and other mostly 2D environments. However, unlike VAEs, they may not encode sufficient information for optimal behaviour, which becomes increasingly important as environments become more complex. In this paper, we use the sparse reward 3D physics environment Animal-AI, to demonstrate how a fixed $\beta$-VAE encoding can be used effectively with curiosity. We combine this with curriculum learning to solve the previously unsolved exploration intensive detour tasks while achieving 22\% gain in sample efficiency on the training curriculum against the next best encoding. We also corroborate the results on Atari Breakout, with our custom encoding outperforming random features and inverse-dynamics features.

翻訳日:2021-05-19 18:04:46 公開日:2021-05-18

# (参考訳) 画像キャプションのための因果干渉によるマルチタスク学習

Dependent Multi-Task Learning with Causal Intervention for Image Captioning ( http://arxiv.org/abs/2105.08573v1 )

ライセンス: CC BY 4.0

Wenqing Chen, Jidong Tian, Caoyun Fan, Hao He, and Yaohui Jin

(参考訳) 画像キャプションの最近の研究は、主に抽出列生成のパラダイムに従い、オブジェクトベースの特徴列を事前抽出し、単一のシーケンス対シーケンスタスクとして画像キャプションを定式化する。 1) モデルが矛盾する事実を生成する内容の不整合,2) モデルが重要な情報の一部を見逃すような情報がない,という2つの問題を発見した。因果的な観点からすると、モデルが視覚的特徴と特定の表現(例えば「長い髪」と「女性」の視覚的特徴)の間の散発的な統計的相関を捉えたからである。本稿では,因果介入(dmtci)を用いた依存型マルチタスク学習フレームワークを提案する。まず、中間タスク、カテゴリの袋生成、最終タスクの前に、画像キャプションを伴います。中間タスクは、モデルが視覚的特徴をよりよく理解し、コンテンツ一貫性の問題を軽減するのに役立つ。次に、Pearlのdo-calculusをモデルに適用し、視覚的特徴と可能共同創設者とのリンクを遮断し、モデルが因果的視覚的特徴にフォーカスできるようにする。特に、高周波の概念セットは、実際の共同設立者が連続空間で推測されるプロキシ共同設立者と見なされる。最後に,マルチエージェント強化学習(marl)戦略を用いてエンドツーエンドトレーニングを可能にし,タスク間エラーの蓄積を低減する。実験により,本モデルがベースラインモデルより優れ,最先端モデルと競合する性能が得られた。

Recent work for image captioning mainly followed an extract-then-generate paradigm, pre-extracting a sequence of object-based features and then formulating image captioning as a single sequence-to-sequence task. Although promising, we observed two problems in generated captions: 1) content inconsistency where models would generate contradicting facts; 2) not informative enough where models would miss parts of important information. From a causal perspective, the reason is that models have captured spurious statistical correlations between visual features and certain expressions (e.g., visual features of "long hair" and "woman"). In this paper, we propose a dependent multi-task learning framework with the causal intervention (DMTCI). Firstly, we involve an intermediate task, bag-of-categories generation, before the final task, image captioning. The intermediate task would help the model better understand the visual features and thus alleviate the content inconsistency problem. Secondly, we apply Pearl's do-calculus on the model, cutting off the link between the visual features and possible confounders and thus letting models focus on the causal visual features. Specifically, the high-frequency concept set is considered as the proxy confounders where the real confounders are inferred in the continuous space. Finally, we use a multi-agent reinforcement learning (MARL) strategy to enable end-to-end training and reduce the inter-task error accumulations. The extensive experiments show that our model outperforms the baseline models and achieves competitive performance with state-of-the-art models.

翻訳日:2021-05-19 17:49:29 公開日:2021-05-18

# (参考訳) 海洋水中の有意な波高の予測

Forecasting Significant Wave Heights in Oceanic Waters ( http://arxiv.org/abs/2105.08583v1 )

ライセンス: CC BY 4.0

Pujan Pokhrel, Elias Ioup, Md Tamjidul Hoque, Mahdi Abdelguerfi, Julian Simeonov

(参考訳) 本稿では,海洋水中の波高を推定するための余分木(et)アルゴリズムに基づく機械学習手法を提案する。点計測を行うCDIPブイから複数の特徴を導出するため,まず様々なパラメータを解析し,30分間隔で予測する。提案アルゴリズムは、それぞれ1日前の予測でScatter Index (SI), Bias, correlation Coefficient, Root Mean Squared Error (RMSE) が0.130,-0.002, 0.97, 0.14であり、テストデータセット上で14日前の予測では0.110,-0.001, 0.98, 0.122である。他の最先端の手法では120時間前にしか予測できないが、さらに14日延長する。この14日間の制限は予測限界ではないが、実験のセットアップによって生じる。提案手法は,スペクトル特性,hvブロッククロスバリデーション,厳密QC基準を含む。提案アルゴリズムは,1日先進予測のための有意な波高予測に一般的に使用される最先端手法よりも,はるかに優れた性能を示す。さらに, 数値計算法と比較して, 提案手法の性能が向上し, 海洋水中の波高を早期に予測できる長周期に拡張できることを示した。

This paper proposes a machine learning method based on the Extra Trees (ET) algorithm for forecasting Significant Wave Heights in oceanic waters. To derive multiple features from the CDIP buoys, which make point measurements, we first nowcast various parameters and then forecast them at 30-min intervals. The proposed algorithm has Scatter Index (SI), Bias, Correlation Coefficient, Root Mean Squared Error (RMSE) of 0.130, -0.002, 0.97, and 0.14, respectively, for one day ahead prediction and 0.110, -0.001, 0.98, and 0.122, respectively, for 14-day ahead prediction on the testing dataset. While other state-of-the-art methods can only forecast up to 120 hours ahead, we extend it further to 14 days. This 14-day limit is not the forecasting limit, but it arises due to our experiment's setup. Our proposed setup includes spectral features, hv-block cross-validation, and stringent QC criteria. The proposed algorithm performs significantly better than the state-of-the-art methods commonly used for significant wave height forecasting for one-day ahead prediction. Moreover, the improved performance of the proposed machine learning method compared to the numerical methods, shows that this performance can be extended to even longer time periods allowing for early prediction of significant wave heights in oceanic waters.

翻訳日:2021-05-19 17:32:17 公開日:2021-05-18

# (参考訳) テキスト分類のための自己解釈型畳み込みニューラルネットワーク

Self-interpretable Convolutional Neural Networks for Text Classification ( http://arxiv.org/abs/2105.08589v1 )

ライセンス: CC BY 4.0

Wei Zhao, Rahul Singh, Tarun Joshi, Agus Sudjianto, Vijayan N. Nair

(参考訳) 自然言語処理(NLP)のディープラーニングモデルは本質的に複雑であり、本質的にはブラックボックスと見なされることが多い。本稿では,relu-dnnに固有な局所線形モデルを用いて,テキスト分類問題に対する畳み込みニューラルネットワークの解釈手法を開発した。 CNNモデルは、畳み込み層に埋め込み、最大プールを用いてそれらをフィルタリングし、分類のためにReLU-DNNを使用して最適化する。全体的な自己解釈モデルを得るために、ReLU DNNからの局所線形モデルのシステムは、最大プールフィルタを通して適切なn-gramにマッピングされる。実験データセットを用いた結果から,提案手法は,より複雑なcnnモデルに対して,自己解釈可能で同等の性能を持つ並列モデルを生成することが示された。また,畳み込み層と分類層の複雑さがモデル性能に与える影響についても検討した。

Deep learning models for natural language processing (NLP) are inherently complex and often viewed as black box in nature. This paper develops an approach for interpreting convolutional neural networks for text classification problems by exploiting the local-linear models inherent in ReLU-DNNs. The CNN model combines the word embedding through convolutional layers, filters them using max-pooling, and optimizes using a ReLU-DNN for classification. To get an overall self-interpretable model, the system of local linear models from the ReLU DNN are mapped back through the max-pool filter to the appropriate n-grams. Our results on experimental datasets demonstrate that our proposed technique produce parsimonious models that are self-interpretable and have comparable performance with respect to a more complex CNN model. We also study the impact of the complexity of the convolutional layers and the classification layers on the model performance.

翻訳日:2021-05-19 17:18:17 公開日:2021-05-18

# (参考訳) WOVe:GloVeワード埋め込みに単語順序を組み込む

WOVe: Incorporating Word Order in GloVe Word Embeddings ( http://arxiv.org/abs/2105.08597v1 )

ライセンス: CC BY 4.0

Mohammed Ibrahim, Susan Gauch, Tyler Gerth, Brandon Cox

(参考訳) 単語ベクトル表現は、構造化されていないテキストから有用な情報を抽出する新しい機会を開く。単語をベクトルとして定義することで、機械学習アルゴリズムがテキストを理解して情報を抽出することが容易になった。ワードベクトル表現は、単語同義語、単語類似、構文解析など、多くのアプリケーションで使われている。 GloVeは、単語コンテキストと行列ベクトル化に基づいて、エフェクティブなベクトル学習アルゴリズムである。従来のベクトル学習アルゴリズムを改善する。しかし、グローブモデルは文脈の中で単語が現れる順序を明示的に考慮しない。本稿では,グローブワード埋め込みに単語順序を組み込む複数の手法を提案する。実験の結果, 単語順ベクトル(WOVe)の単語埋め込みは, アナログ補完と単語類似性の自然なランゲージタスクにおいて, 未修正のGloVeよりも優れていることがわかった。単語類似性タスクでは、直接結合性がわずかに優れており、平均的なランクが2%上昇している。しかし、GloVeのベースラインでは単語類似タスクが大幅に改善され、平均36.34%の精度が向上した。

Word vector representations open up new opportunities to extract useful information from unstructured text. Defining a word as a vector made it easy for the machine learning algorithms to understand a text and extract information from. Word vector representations have been used in many applications such word synonyms, word analogy, syntactic parsing, and many others. GloVe, based on word contexts and matrix vectorization, is an ef-fective vector-learning algorithm. It improves on previous vector-learning algorithms. However, the GloVe model fails to explicitly consider the order in which words appear within their contexts. In this paper, multiple methods of incorporating word order in GloVe word embeddings are proposed. Experimental results show that our Word Order Vector (WOVe) word embeddings approach outperforms unmodified GloVe on the natural lan-guage tasks of analogy completion and word similarity. WOVe with direct concatenation slightly outperformed GloVe on the word similarity task, increasing average rank by 2%. However, it greatly improved on the GloVe baseline on a word analogy task, achieving an average 36.34% improvement in accuracy.

翻訳日:2021-05-19 17:08:16 公開日:2021-05-18

# (参考訳) 弾性部分マッチングを用いた機能データの形状解析

Shape Analysis of Functional Data with Elastic Partial Matching ( http://arxiv.org/abs/2105.08604v1 )

ライセンス: CC BY 4.0

Darshan Bryner and Anuj Srivastava

(参考訳) 弾性リーマン計量は、関数および曲線形状データの統計処理に過去に成功している。しかし、この使用法には重要な制限が課されており、関数の境界は固定され、一致すると仮定されている。未整合境界を示す機能データは、通常、異なる地理的領域に関連するCOVID-19感染率曲線などの変動進化率を持つ力学系から生じる。この場合、そのようなデータをスライディングバウンダリでモデル化し、部分マッチングを使用する方がより自然である。本稿では,位相可変性と不確定境界下での関数の部分マッチング,比較,クラスタリングを可能にする包括的リーマンフレームワークを開発した。我々は,(1)時変群と時変群の合同作用を形成すること,(2)この共同作用に不変な計量を導入し,弾性的部分マッチングへの勾配に基づくアプローチを可能にすること,(3)計量特性を損なうことなく,両者の相対的影響を制御できる修正を提示すること,により過去の作業を拡張した。このフレームワークは、COVID-19レートカーブの登録とクラスタリング、必須パターンの特定、ミスマッチエラーの最小化、以前の方法と比較してクラスタ内のばらつきの低減のために説明されている。

Elastic Riemannian metrics have been used successfully in the past for statistical treatments of functional and curve shape data. However, this usage has suffered from an important restriction: the function boundaries are assumed fixed and matched. Functional data exhibiting unmatched boundaries typically arise from dynamical systems with variable evolution rates such as COVID-19 infection rate curves associated with different geographical regions. In this case, it is more natural to model such data with sliding boundaries and use partial matching, i.e., only a part of a function is matched to another function. Here, we develop a comprehensive Riemannian framework that allows for partial matching, comparing, and clustering of functions under both phase variability and uncertain boundaries. We extend past work by: (1) Forming a joint action of the time-warping and time-scaling groups; (2) Introducing a metric that is invariant to this joint action, allowing for a gradient-based approach to elastic partial matching; and (3) Presenting a modification that, while losing the metric property, allows one to control relative influence of the two groups. This framework is illustrated for registering and clustering shapes of COVID-19 rate curves, identifying essential patterns, minimizing mismatch errors, and reducing variability within clusters compared to previous methods.

翻訳日:2021-05-19 17:00:33 公開日:2021-05-18

# (参考訳) ベイズニューラルネットワークによる逆例の検出

Detecting Adversarial Examples with Bayesian Neural Network ( http://arxiv.org/abs/2105.08620v1 )

ライセンス: CC BY 4.0

Yao Li, Tongyi Tang, Cho-Jui Hsieh, Thomas C. M. Lee

(参考訳) ディープニューラルネットワーク(Deep Neural Network, DNN)は、人間の自然なイメージと区別できないまま、DNNを騙すために慎重に作られた例である。本稿では,ランダム成分が予測器の滑らかさを向上し,ディープニューラルネットワークの出力分布をシミュレートしやすくするという観測結果に動機づけられた,逆行例を検出する新しい枠組みを提案する。そこで本研究では,バテクタに略されるベイズ型逆例検出器を提案し,逆例検出の性能を向上させる。具体的には,実例と逆例の隠れ層出力の分布差について検討し,ベイズニューラルネットワーク(bnn)のランダム性を用いて隠れ層出力分布をシミュレートし,分布分散を利用して逆例を検出することを提案する。 bnnの利点は、ランダム成分を持たないニューラルネットワークはそのような特性を持たないが、出力が確率的であることである。ポピュラーアタックに対するいくつかのベンチマークデータセットでの実証結果から、提案するバテクタは、敵対的な例検出において最先端の検出器よりも優れていることが分かる。

Deep neural networks (DNNs) are vulnerable against adversarial examples, i.e., examples that are carefully crafted to fool the DNNs while being indistinguishable from the natural images to humans. In this paper, we propose a new framework to detect adversarial examples motivated by the observations that random components can improve the smoothness of predictors and make it easier to simulate output distribution of deep neural network. With these observations, we propose a novel Bayesian adversarial example detector, short for BATector, to improve the performance of adversarial example detection. In specific, we study the distributional difference of hidden layer output between natural and adversarial examples, and propose to use the randomness of Bayesian neural network (BNN) to simulate hidden layer output distribution and leverage the distribution dispersion to detect adversarial examples. The advantage of BNN is that the output is stochastic while neural network without random components do not have such characteristics. Empirical results on several benchmark datasets against popular attacks show that, the proposed BATector outperforms the state-of-the-art detectors in adversarial example detection.

翻訳日:2021-05-19 16:32:16 公開日:2021-05-18

# (参考訳) Zorro: グラフニューラルネットワークにおける妥当性,スパース,安定した説明

Zorro: Valid, Sparse, and Stable Explanations in Graph Neural Networks ( http://arxiv.org/abs/2105.08621v1 )

ライセンス: CC BY 4.0

Thorben Funke, Megha Khosla, Avishek Anand

(参考訳) グラフニューラルネットワークの普及と応用により、GNNモデルの判断を解釈し理解するためのいくつかの提案がなされている。 GNNモデルの説明は他の入力設定と原理的に異なる。グラフ構造で接続された特徴やその他の関連インスタンスを入力する決定を属性とすることが重要である。我々は,gnnモデルが生成するラベル分布と説明との相互情報を最大化する先行説明生成手法が制限的であることを見出した。具体的には、既存のアプローチでは、予測、スパース、あるいは入力摂動に頑健な説明を強制しない。本稿では,GNNにおける説明手法が従うべき基本原理を概説し,説明の有効性の尺度として計量忠実度を導入する。本稿では、簡単な組合せ法を用いて忠実度を最適化する速度歪み理論の原理に基づく新しいアプローチZorroを提案する。実データと合成データセットの大規模な実験により、Zorroは既存のGNNの説明手法よりもスペーサー、安定、忠実な説明を生み出すことが明らかになった。

With the ever-increasing popularity and applications of graph neural networks, several proposals have been made to interpret and understand the decisions of a GNN model. Explanations for a GNN model differ in principle from other input settings. It is important to attribute the decision to input features and other related instances connected by the graph structure. We find that the previous explanation generation approaches that maximize the mutual information between the label distribution produced by the GNN model and the explanation to be restrictive. Specifically, existing approaches do not enforce explanations to be predictive, sparse, or robust to input perturbations. In this paper, we lay down some of the fundamental principles that an explanation method for GNNs should follow and introduce a metric fidelity as a measure of the explanation's effectiveness. We propose a novel approach Zorro based on the principles from rate-distortion theory that uses a simple combinatorial procedure to optimize for fidelity. Extensive experiments on real and synthetic datasets reveal that Zorro produces sparser, stable, and more faithful explanations than existing GNN explanation approaches.

翻訳日:2021-05-19 16:18:36 公開日:2021-05-18

# (参考訳) 相関構造を用いた抽象画像の美的評価

Assessing aesthetics of generated abstract images using correlation structure ( http://arxiv.org/abs/2105.08635v1 )

ライセンス: CC BY 4.0

Sina Khajehabdollahi, Georg Martius, Anna Levina

(参考訳) 自然画像や人間の選択画像から偏りなく抽象美的画像を生成することができるか? 美的イメージは相関関数に含まれているか? 本稿では,これらの質問に対する回答について述べる。ランダム重みと異なるアーキテクチャを持つ合成パターン生成ネットワークを用いて画像を生成する。ランダムに選択された重みであっても、相関関数はネットワークアーキテクチャによって決定される。制御された実験では、人間はすべての生成された画像の大規模なデータセットから美的イメージを抽出した。統計的解析により、相関関数は美的画像では確かに異なることが分かる。

Can we generate abstract aesthetic images without bias from natural or human selected image corpi? Are aesthetic images singled out in their correlation functions? In this paper we give answers to these and more questions. We generate images using compositional pattern-producing networks with random weights and varying architecture. We demonstrate that even with the randomly selected weights the correlation functions remain largely determined by the network architecture. In a controlled experiment, human subjects picked aesthetic images out of a large dataset of all generated images. Statistical analysis reveals that the correlation function is indeed different for aesthetic images.

翻訳日:2021-05-19 15:54:24 公開日:2021-05-18

# (参考訳) CoTexT: Code-Text Transformerによるマルチタスク学習

CoTexT: Multi-task Learning with Code-Text Transformer ( http://arxiv.org/abs/2105.08645v1 )

ライセンス: CC BY 4.0

Long Phan, Hieu Tran, Daniel Le, Hieu Nguyen, James Anibal, Alec Peltekian, and Yanfang Ye

(参考訳) マルチタスク学習を通じて自然言語(NL)とプログラミング言語(PL)の代表的な文脈を学習するトランスフォーマーベースのアーキテクチャエンコーダデコーダモデルであるCoTexTを提案する。 CoTexTは、コード要約/文書化、コード生成、欠陥検出、コードデバッギングなど、下流のNL-PLタスクをサポートする汎用的な理解とコードテキスト生成を学ぶために、大規模なプログラミング言語コーパスに基づいて、自己管理型で事前訓練されている。我々は、CoTexTを利用可能なPLコーパスの異なる組み合わせで訓練する。これは、"bimodal"データと"unimodal"データの両方で、後者は、入力シーケンス内の自然文と対応するコードスニペットの組み合わせであり、後者は単なるコードスニペットである。マルチタスク学習のCoTexTをCodeXGLUE上で生成・分類タスクで評価し,すべての下流タスクで最先端を実現する。

We present CoTexT, a transformer-based architecture encoder-decoder pre-trained model that learns the representative context between natural language (NL) and programming language (PL) through multi-task learning. CoTexT is pre-trained, in self-supervised fashion, based on large programming language corpus to learn general-purpose understanding and code-text generation supporting downstream NL-PL task such as code summarizing/documentation, code generation, defect detection, code debugging, etc. We train CoTexT on different combination of available PL corpus including both "bimodal" and "unimodal" data where the former is the combinations of both natural texts and their corresponding code snippets in an input sequence and the latter is merely code snippets. We evaluate multi-task learning CoTexT on different generation and classification tasks on CodeXGLUE and it achieves state-of-the-art on all downstream tasks.

翻訳日:2021-05-19 15:44:29 公開日:2021-05-18

# (参考訳) IntFormer: Transformerアーキテクチャの助けを借りて歩行者の意図を予測する

IntFormer: Predicting pedestrian intention with the aid of the Transformer architecture ( http://arxiv.org/abs/2105.08647v1 )

ライセンス: CC BY-SA 4.0

J. Lorenzo, I. Parra and M. A. Sotelo

(参考訳) 歩行者の横断行動を理解することは、インテリジェントな車両開発において重要な目標であり、セキュリティと交通の流れの改善につながる。本稿では,IntFormerという手法を開発した。これはトランスフォーマーアーキテクチャとrubiksnetと呼ばれる新しい畳み込みビデオ分類モデルに基づいている。最近のベンチマークでの評価手順に従うと、我々のモデルは高い性能(約40$ seq)で最先端の結果に達することを示す。 1秒あたり)とサイズ(8\times $smaller than the best performing model)で、リアルタイム使用に適している。また、各入力特徴についても検討し、Ego-vehicleの速度が最も重要な変数であることを発見した。

Understanding pedestrian crossing behavior is an essential goal in intelligent vehicle development, leading to an improvement in their security and traffic flow. In this paper, we developed a method called IntFormer. It is based on transformer architecture and a novel convolutional video classification model called RubiksNet. Following the evaluation procedure in a recent benchmark, we show that our model reaches state-of-the-art results with good performance ($\approx 40$ seq. per second) and size ($8\times $smaller than the best performing model), making it suitable for real-time usage. We also explore each of the input features, finding that ego-vehicle speed is the most important variable, possibly due to the similarity in crossing cases in PIE dataset.

翻訳日:2021-05-19 15:33:27 公開日:2021-05-18

# (参考訳) スケーラブルコンテンツに基づくビジュアルメディア検索のためのマルチモーダルディープラーニングフレームワーク

A multimodal deep learning framework for scalable content based visual media retrieval ( http://arxiv.org/abs/2105.08665v1 )

ライセンス: CC BY 4.0

Ambareesh Ravi, Amith Nandakumar

(参考訳) 本稿では,画像と映像の両方に対して協調的に動作可能な深層学習の力を活用し,コンテンツベースビジュアルメディア検索システムのための新しい,効率的,モジュール性,スケーラブルなフレームワークを提案し,検索のための効率的な比較・フィルタリング指標を提案する。提案手法を従来の手法と比較し,提案手法の有効性と効率性,検索アーキテクチャの能力をさらに高める可能性のある改善を実証する。

We propose a novel, efficient, modular and scalable framework for content based visual media retrieval systems by leveraging the power of Deep Learning which is flexible to work both for images and videos conjointly and we also introduce an efficient comparison and filtering metric for retrieval. We put forward our findings from critical performance tests comparing our method to the predominant conventional approach to demonstrate the feasibility and efficiency of the proposed solution with best practices, possible improvements that may further augment the ability of retrieval architectures.

翻訳日:2021-05-19 15:24:19 公開日:2021-05-18

# (参考訳) 数値エッジ属性を用いた知識グラフからの埋め込み学習

Learning Embeddings from Knowledge Graphs With Numeric Edge Attributes ( http://arxiv.org/abs/2105.08683v1 )

ライセンス: CC BY-SA 4.0

Sumit Pai, Luca Costabello

(参考訳) 知識グラフのエッジに関連する数値は、遺伝的データからソーシャルネットワークまで、多くのシナリオにおいて不確実性、エッジの重要性、さらには帯域外知識を表すために使われてきた。それにもかかわらず、従来の知識グラフ埋め込みモデルは、予測力を損なうような情報をキャプチャするように設計されていない。本稿では,従来の知識グラフ埋め込みアーキテクチャのスコアリング層に,数値エッジ属性を注入する新しい手法を提案する。数値知識グラフの公開実験により,本手法は従来の数値知識ベースラインよりも,最近のukgeモデルよりも優れていることが示された。

Numeric values associated to edges of a knowledge graph have been used to represent uncertainty, edge importance, and even out-of-band knowledge in a growing number of scenarios, ranging from genetic data to social networks. Nevertheless, traditional knowledge graph embedding models are not designed to capture such information, to the detriment of predictive power. We propose a novel method that injects numeric edge attributes into the scoring layer of a traditional knowledge graph embedding architecture. Experiments with publicly available numeric-enriched knowledge graphs show that our method outperforms traditional numeric-unaware baselines as well as the recent UKGE model.

翻訳日:2021-05-19 15:10:58 公開日:2021-05-18

# (参考訳) 動的チーム構成のためのコーチプレイヤマルチエージェント強化学習

Coach-Player Multi-Agent Reinforcement Learning for Dynamic Team Composition ( http://arxiv.org/abs/2105.08692v1 )

ライセンス: CC BY 4.0

Bo Liu, Qiang Liu, Peter Stone, Animesh Garg, Yuke Zhu and Animashree Anandkumar

(参考訳) 現実世界のマルチエージェントシステムでは、異なる能力を持つエージェントがチーム全体の目標を変更することなく参加または離脱することができる。このようなダイナミックな構成でチームをコーディネートすることは難しい。この問題に対処するためのコーチ・プレイヤ・フレームワークであるCOPAを提案する。コーチは環境をグローバルに把握し、個々の戦略を分散することで、部分的な視点しか持たないプレイヤーをコーディネートしていると仮定する。具体的には,1) コーチと選手の双方に注意機構を導入し,2) 学習の規則化のための変動目標を提案し,3) コーチが選手といつコミュニケーションをするかを決めるための適応的なコミュニケーション手法を設計する。本手法は,資源収集タスク,救助ゲーム,およびStarCraftマイクロマネジメントタスクにおいて検証する。新しいチーム構成にゼロショットの一般化を実証する。本手法は,全プレイヤーが環境をフルに把握できる環境よりも,同等あるいは優れた性能を実現する。また,適応的なコミュニケーション戦略を用いることで,コーチが13%の時間でコミュニケーションを行う場合でも,パフォーマンスは高いままである。

In real-world multiagent systems, agents with different capabilities may join or leave without altering the team's overarching goals. Coordinating teams with such dynamic composition is challenging: the optimal team strategy varies with the composition. We propose COPA, a coach-player framework to tackle this problem. We assume the coach has a global view of the environment and coordinates the players, who only have partial views, by distributing individual strategies. Specifically, we 1) adopt the attention mechanism for both the coach and the players; 2) propose a variational objective to regularize learning; and 3) design an adaptive communication method to let the coach decide when to communicate with the players. We validate our methods on a resource collection task, a rescue game, and the StarCraft micromanagement tasks. We demonstrate zero-shot generalization to new team compositions. Our method achieves comparable or better performance than the setting where all players have a full view of the environment. Moreover, we see that the performance remains high even when the coach communicates as little as 13% of the time using the adaptive communication strategy.

翻訳日:2021-05-19 14:58:53 公開日:2021-05-18

# (参考訳) Manifold-Aware Wasserstein GAN を用いた人間の動作予測

Human Motion Prediction Using Manifold-Aware Wasserstein GAN ( http://arxiv.org/abs/2105.08715v1 )

ライセンス: CC BY 4.0

Baptiste Chopin, Naima Otberdout, Mohamed Daoudi, Angela Bartolo

(参考訳) ヒューマンモーション予測は、事前ポーズシーケンスが与えられた将来の人間のポーズを予測することを目的としている。予測運動の不連続性と長期地平線の性能劣化は、現在も文献で直面する主な課題である。本研究では,人間の動きのコンパクトな表現を用いて,これらの課題に対処する。具体的には、3次元人間のポーズの時間的進化を軌跡としてモデル化し、人間の動きを球面多様体上の単一点にマッピングする。これらの非ユークリッド表現を学ぶために、異なる損失を通じて人間の運動の時間的および空間的依存性を捉える多様体認識ワッサースタイン生成逆モデルを構築する。大規模な実験により、我々のアプローチはCMU MoCapとHuman 3.6Mデータセットの最先端よりも優れていることが示された。定性的結果は予測運動の滑らかさを示す。

Human motion prediction aims to forecast future human poses given a prior pose sequence. The discontinuity of the predicted motion and the performance deterioration in long-term horizons are still the main challenges encountered in current literature. In this work, we tackle these issues by using a compact manifold-valued representation of human motion. Specifically, we model the temporal evolution of the 3D human poses as trajectory, what allows us to map human motions to single points on a sphere manifold. To learn these non-Euclidean representations, we build a manifold-aware Wasserstein generative adversarial model that captures the temporal and spatial dependencies of human motion through different losses. Extensive experiments show that our approach outperforms the state-of-the-art on CMU MoCap and Human 3.6M datasets. Our qualitative results show the smoothness of the predicted motions.

翻訳日:2021-05-19 14:33:48 公開日:2021-05-18

# 交代最小化による線形メタラーニング

Sample Efficient Linear Meta-Learning by Alternating Minimization ( http://arxiv.org/abs/2105.08306v1 )

ライセンス: Link先を確認

Kiran Koshy Thekumparampil, Prateek Jain, Praneeth Netrapalli, Sewoong Oh

(参考訳) メタラーニングは、与えられたタスクセットから知識を合成して利用し、非常に小さなデータを使って新しいタスクを迅速に学習する。低次元部分空間にある線形回帰タスクのメタラーニングは、この領域で広く研究されている基本的な問題である。しかし、既存の結果は、非常に最適な推定誤差を保証するか、タスク毎に$\Omega(d)$サンプルを必要とする($d$はデータ次元である)。本研究では,低次元部分空間と回帰器を交互に学習する簡易交互最小化法(MLLAM)について検討する。定数部分空間次元 mllam は、タスクごとに$\omega(\log d)$ のサンプルしか必要とせず、ほぼ最適な推定誤差が得られる。しかし、タスク毎に必要なサンプル数はタスク数で対数的に増加する。低雑音環境下でのこの対策として,タスク毎のサンプル数が任意に多数存在する場合でも,MLLAMと同じ強い統計的保証を保証するタスクサブセット選択方式を提案する。

Meta-learning synthesizes and leverages the knowledge from a given set of tasks to rapidly learn new tasks using very little data. Meta-learning of linear regression tasks, where the regressors lie in a low-dimensional subspace, is an extensively-studied fundamental problem in this domain. However, existing results either guarantee highly suboptimal estimation errors, or require $\Omega(d)$ samples per task (where $d$ is the data dimensionality) thus providing little gain over separately learning each task. In this work, we study a simple alternating minimization method (MLLAM), which alternately learns the low-dimensional subspace and the regressors. We show that, for a constant subspace dimension MLLAM obtains nearly-optimal estimation error, despite requiring only $\Omega(\log d)$ samples per task. However, the number of samples required per task grows logarithmically with the number of tasks. To remedy this in the low-noise regime, we propose a novel task subset selection scheme that ensures the same strong statistical guarantee as MLLAM, even with bounded number of samples per task for arbitrarily large number of tasks.

翻訳日:2021-05-19 14:18:14 公開日:2021-05-18

# データレス知識蒸留におけるコントラストモデルインバージョン

Contrastive Model Inversion for Data-Free Knowledge Distillation ( http://arxiv.org/abs/2105.08584v1 )

ライセンス: Link先を確認

Gongfan Fang, Jie Song, Xinchao Wang, Chengchao Shen, Xingen Wang, Mingli Song

(参考訳) トレーニング済みのモデルからトレーニングデータを復元することを目的としているモデル反転は、最近実現可能であることが証明された。しかし, 既存の逆転法では, 合成されたインスタンスは互いに非常によく似ており, 知識蒸留などの下流タスクに限定的な有効性を示すモード崩壊問題に悩まされることが多い。本稿では,データ多様性を最適化可能な目的として明示的にモデル化し,モード崩壊問題を緩和するContrastive Model Inversion~(CMI)を提案する。我々の主な観察では、同じ量のデータの制約の下では、高いデータの多様性は、通常より強いインスタンス識別を示す。この目的のために、我々はCMIにおいて、前回のバッチで既に合成されたものと区別できるように合成インスタンスを奨励する対照的な学習目標を紹介した。 CIFAR-10, CIFAR-100, Tiny-ImageNetの事前学習モデル実験により, CMIは芸術的状況よりも視覚的に可視なインスタンスを生成するだけでなく, 生成したデータを知識蒸留に使用する場合, 極めて優れた性能が得られることが示された。コードは \url{https://github.com/zju-vipa/DataFree} で入手できる。

Model inversion, whose goal is to recover training data from a pre-trained model, has been recently proved feasible. However, existing inversion methods usually suffer from the mode collapse problem, where the synthesized instances are highly similar to each other and thus show limited effectiveness for downstream tasks, such as knowledge distillation. In this paper, we propose Contrastive Model Inversion~(CMI), where the data diversity is explicitly modeled as an optimizable objective, to alleviate the mode collapse issue. Our main observation is that, under the constraint of the same amount of data, higher data diversity usually indicates stronger instance discrimination. To this end, we introduce in CMI a contrastive learning objective that encourages the synthesizing instances to be distinguishable from the already synthesized ones in previous batches. Experiments of pre-trained models on CIFAR-10, CIFAR-100, and Tiny-ImageNet demonstrate that CMI not only generates more visually plausible instances than the state of the arts, but also achieves significantly superior performance when the generated data are used for knowledge distillation. Code is available at \url{https://github.com/zju-vipa/DataFree}.

翻訳日:2021-05-19 14:17:37 公開日:2021-05-18

# 線形複雑度を有する変圧器の相対位置符号化

Relative Positional Encoding for Transformers with Linear Complexity ( http://arxiv.org/abs/2105.08399v1 )

ライセンス: Link先を確認

Antoine Liutkus, Ond\v{r}ej C\'ifka, Shih-Lun Wu, Umut \c{S}im\c{s}ekli, Yi-Hsuan Yang, Ga\"el Richard

(参考訳) トランスフォーマーモデルの最近の進歩は、線形空間と時間複雑さのために、前例のないシーケンス長を許容している。一方、相対位置符号化 (relative positional encoding, rpe) は古典的トランスフォーマーにとって有益であり、推論のための絶対位置ではなくラグを利用する。しかし、最近のトランスフォーマーの線形変種には RPE が利用できないのは、注意行列の明示的な計算を必要とするためである。本稿では,このギャップを埋めて,古典的な付加形(正弦波)PEの代替として使用でき,RPEのように確実に振る舞うPEを生成する方法として,確率的位置エンコーディングを提案する。主な理論的貢献は、位置符号化と相関したガウス過程の相互共分散構造を関連付けることである。本稿では,Long-Range Arenaベンチマークと音楽生成におけるアプローチの性能について述べる。

Recent advances in Transformer models allow for unprecedented sequence lengths, due to linear space and time complexity. In the meantime, relative positional encoding (RPE) was proposed as beneficial for classical Transformers and consists in exploiting lags instead of absolute positions for inference. Still, RPE is not available for the recent linear-variants of the Transformer, because it requires the explicit computation of the attention matrix, which is precisely what is avoided by such methods. In this paper, we bridge this gap and present Stochastic Positional Encoding as a way to generate PE that can be used as a replacement to the classical additive (sinusoidal) PE and provably behaves like RPE. The main theoretical contribution is to make a connection between positional encoding and cross-covariance structures of correlated Gaussian processes. We illustrate the performance of our approach on the Long-Range Arena benchmark and on music generation.

翻訳日:2021-05-19 14:17:15 公開日:2021-05-18

# 事前と後方のパラメトリゼーション不変な解釈

Parametrization invariant interpretation of priors and posteriors ( http://arxiv.org/abs/2105.08304v1 )

ライセンス: Link先を確認

Jesus Cerquides

(参考訳) 本稿では、リーマン多様体上の確率を利用して、ベイズ予想における事前および後続の解釈を再考する。主マインドシフトは、「事前分布が我々のモデルのパラメータ上で確率分布を確立する」という考えから「事前分布が確率分布を超えた確率分布を確立する」という考えに移行することである。そのため、我々の確率モデルがフィッシャー計量を持つリーマン多様体であると仮定する。この考え方の下では、確率分布上の任意の分布は「内在的」であり、つまり多様体に選択される特定のパラメトリゼーションに不変である。我々はベルヌーイ分布の多様体上の分布の単純な解析を通じてアイデアを例示する。最大アフター推定の最大の欠点の1つは、それらはパラメトリゼーションに依存することである。ここで開発された理解に基づき、パラメトリゼーションとは独立な最大アフター推定を定義することができる。

In this paper we leverage on probability over Riemannian manifolds to rethink the interpretation of priors and posteriors in Bayesian inference. The main mindshift is to move away from the idea that "a prior distribution establishes a probability distribution over the parameters of our model" to the idea that "a prior distribution establishes a probability distribution over probability distributions". To do that we assume that our probabilistic model is a Riemannian manifold with the Fisher metric. Under this mindset, any distribution over probability distributions should be "intrinsic", that is, invariant to the specific parametrization which is selected for the manifold. We exemplify our ideas through a simple analysis of distributions over the manifold of Bernoulli distributions. One of the major shortcomings of maximum a posteriori estimates is that they depend on the parametrization. Based on the understanding developed here, we can define the maximum a posteriori estimate which is independent of the parametrization.

翻訳日:2021-05-19 14:16:59 公開日:2021-05-18

# スタイル誘導型プランニングによるスタイリズドストーリー生成

Stylized Story Generation with Style-Guided Planning ( http://arxiv.org/abs/2105.08625v1 )

ライセンス: Link先を確認

Xiangzhe Kong, Jialiang Huang, Ziquan Tung, Jian Guan and Minlie Huang

(参考訳) 現在のストーリーテリングシステムは、ナレーションスタイルを考慮せずにコヒーレントなプロットでストーリーを生成することに焦点を当てている。そこで,本稿では,先進的な文脈を与えられたスペクティブスタイルで物語を生成する新しいタスク,スタイル化されたストーリージェネレーションを提案する。この問題に対処するために,まず文体化されたキーワードを計画し,そのキーワードの誘導で全ストーリーを生成する新しい生成モデルを提案する。さらに、生成したストーリーと特定スタイルの整合性を評価するために、2つの自動メトリクスを提案する。実験では、ROCStoriesデータセット(Mostafazadeh et al., 2016)に基づいて、当社のモデルが制御可能であることを実証した。本研究は,今後の研究におけるスタイリズドストーリー生成の展望を示す。

Current storytelling systems focus more ongenerating stories with coherent plots regard-less of the narration style, which is impor-tant for controllable text generation. There-fore, we propose a new task, stylized story gen-eration, namely generating stories with speci-fied style given a leading context. To tacklethe problem, we propose a novel generationmodel that first plans the stylized keywordsand then generates the whole story with theguidance of the keywords. Besides, we pro-pose two automatic metrics to evaluate theconsistency between the generated story andthe specified style. Experiments demonstratesthat our model can controllably generateemo-tion-driven orevent-driven stories based onthe ROCStories dataset (Mostafazadeh et al.,2016). Our study presents insights for stylizedstory generation in further research.

翻訳日:2021-05-19 14:16:45 公開日:2021-05-18

# ビデオグラウンドのためのシーケンスマッチングを用いた並列アテンションネットワーク

Parallel Attention Network with Sequence Matching for Video Grounding ( http://arxiv.org/abs/2105.08481v1 )

ライセンス: Link先を確認

Hao Zhang, Aixin Sun, Wei Jing, Liangli Zhen, Joey Tianyi Zhou, Rick Siow Mong Goh

(参考訳) ビデオグラウンディングは、意味的に言語クエリに対応する時間モーメントを検索することを目的としている。本研究では,マルチモーダル表現学習とターゲットモーメント境界予測という課題に対処するために,シーケンスマッチングを用いた並列注意ネットワーク(SeqPAN)を提案する。我々は,ビデオとテキスト間の自己モダルコンテキストとクロスモダル注意情報を効果的に捉えるために,自己誘導型並列アテンションモジュールを設計した。自然言語処理におけるシーケンスラベリングタスクにインスパイアされた我々は、真理モーメントを開始、内部、終了領域に分割した。次に,領域ラベルを用いた開始/終了境界予測を導くシーケンスマッチング戦略を提案する。 3つのデータセットの実験結果は、SeqPANが最先端の手法よりも優れていることを示している。さらに、自己誘導並列注意モジュールとシーケンスマッチングモジュールの有効性を検証する。

Given a video, video grounding aims to retrieve a temporal moment that semantically corresponds to a language query. In this work, we propose a Parallel Attention Network with Sequence matching (SeqPAN) to address the challenges in this task: multi-modal representation learning, and target moment boundary prediction. We design a self-guided parallel attention module to effectively capture self-modal contexts and cross-modal attentive information between video and text. Inspired by sequence labeling tasks in natural language processing, we split the ground truth moment into begin, inside, and end regions. We then propose a sequence matching strategy to guide start/end boundary predictions using region labels. Experimental results on three datasets show that SeqPAN is superior to state-of-the-art methods. Furthermore, the effectiveness of the self-guided parallel attention module and the sequence matching module is verified.

翻訳日:2021-05-19 14:16:30 公開日:2021-05-18

# 適応型ビデオ圧縮センシングのための強化学習

Reinforcement Learning for Adaptive Video Compressive Sensing ( http://arxiv.org/abs/2105.08205v1 )

ライセンス: Link先を確認

Sidi Lu, Xin Yuan, Aggelos K Katsaggelos, Weisong Shi

(参考訳) 映像圧縮センシングに強化学習を適用し,圧縮比を適応させる。具体的には、低速カメラを用いて高速映像を撮影するビデオスナップショット圧縮画像(SCI)について、スナップショット計測から複数の(B)ビデオフレームを再構成できると考えられる。前回の研究では、異なる場面でビデオSCIシステムにBを適応する方法が研究の欠如となっている。本稿では,強化学習(RL)を用いて,このギャップを埋める。再構成のための様々な畳み込みニューラルネットワークと同様にrlモデルが学習され、ビデオsciシステムの適応的センシングを実現する。さらに、再構成のないビデオSCI測定を直接使用したオブジェクト検出ネットワークの性能を用いて、RLに基づく適応的なビデオ圧縮センシングを行う。したがって,提案手法は低コストかつリアルタイムに実現可能である。我々の研究は、ビデオSCIの実際の応用に向けて一歩前進する。

We apply reinforcement learning to video compressive sensing to adapt the compression ratio. Specifically, video snapshot compressive imaging (SCI), which captures high-speed video using a low-speed camera is considered in this work, in which multiple (B) video frames can be reconstructed from a snapshot measurement. One research gap in previous studies is how to adapt B in the video SCI system for different scenes. In this paper, we fill this gap utilizing reinforcement learning (RL). An RL model, as well as various convolutional neural networks for reconstruction, are learned to achieve adaptive sensing of video SCI systems. Furthermore, the performance of an object detection network using directly the video SCI measurements without reconstruction is also used to perform RL-based adaptive video compressive sensing. Our proposed adaptive SCI method can thus be implemented in low cost and real time. Our work takes the technology one step further towards real applications of video SCI.

翻訳日:2021-05-19 14:15:20 公開日:2021-05-18

# NExT-QA: 時間的行動の説明に対する質問のNext Phase

NExT-QA:Next Phase of Question-Answering to Explaining Temporal Actions ( http://arxiv.org/abs/2105.08276v1 )

ライセンス: Link先を確認

Junbin Xiao, Xindi Shang, Angela Yao and Tat-Seng Chua

(参考訳) ビデオ質問応答(VideoQA)ベンチマークであるNExT-QAを導入し,映像理解の促進と時間的行動の説明を行う。本データセットに基づいて,因果行動推論,時間的行動推論,共通場面理解を対象とする複数選択およびオープンエンドQAタスクを設定した。ベースラインの広範囲な解析とビデオQA手法の確立により, 浅いシーン記述では高い性能を示すが, 因果的・時間的行動推論では弱いことがわかった。さらに, 複数選択QAに適応したモデルでは, 解の一般化に苦慮している。これにより、これらのモデルが改善の可能性を推論し強調する能力に疑問が持ち上がっている。 NExT-QAが次世代のVQA研究を指導し、表面的なシーン記述を超えて、ビデオのより深い理解へと進むことを願っている。 (データセットと関連するリソースはhttps://github.com/doc-doc/NExT-QA.git)。

We introduce NExT-QA, a rigorously designed video question answering (VideoQA) benchmark to advance video understanding from describing to explaining the temporal actions. Based on the dataset, we set up multi-choice and open-ended QA tasks targeting causal action reasoning, temporal action reasoning, and common scene comprehension. Through extensive analysis of baselines and established VideoQA techniques, we find that top-performing methods excel at shallow scene descriptions but are weak in causal and temporal action reasoning. Furthermore, the models that are effective on multi-choice QA, when adapted to open-ended QA, still struggle in generalizing the answers. This raises doubt on the ability of these models to reason and highlights possibilities for improvement. With detailed results for different question types and heuristic observations for future works, we hope NExT-QA will guide the next generation of VQA research to go beyond superficial scene description towards a deeper understanding of videos. (The dataset and related resources are available at https://github.com/doc-doc/NExT-QA.git)

翻訳日:2021-05-19 14:15:08 公開日:2021-05-18

# スパースアクションタスクのためのsparsity prior regularized q-learning

Sparsity Prior Regularized Q-learning for Sparse Action Tasks ( http://arxiv.org/abs/2105.08666v1 )

ライセンス: Link先を確認

Jing-Cheng Pang, Tian Xu, Sheng-Yi Jiang, Yu-Ren Liu, Yang Yu

(参考訳) 多くの意思決定タスクにおいて、特定のアクションは、銃術の「火」や株式取引の「買い」など、その頻度や総量によって制限される。我々はそのような行動を「スパースアクション」と呼ぶ。スパースアクションは、しばしば優れたパフォーマンスを達成する上で重要な役割を果たす。しかしながら、emph{classical bellman update} によって推定されるそれらのq値は、通常、標本のスパース性のため、大きな推定誤差を被る。 emph{greedy} のポリシーは、バイアス付き Q-函数によって大きく誤解される可能性があり、スパース作用を積極的に行い、大きな準最適をもたらす。本稿では,sparseアクションに低い確率を割り当てる参照分布を構築し,その参照分布に明示的な制約を持つ正規化対象を提案する。さらに、正規化ベルマン演算子と正規化最適ポリシーを導出し、エラーの伝播を遅くし、エージェントがよりスパースアクションを取るよう誘導する。実験の結果,本手法は,典型的なスパース動作タスクにおける最先端性能を実現する。

In many decision-making tasks, some specific actions are limited in their frequency or total amounts, such as "fire" in the gunfight game and "buy/sell" in the stock trading. We name such actions as "sparse action". Sparse action often plays a crucial role in achieving good performance. However, their Q-values, estimated by \emph{classical Bellman update}, usually suffer from a large estimation error due to the sparsity of their samples. The \emph{greedy} policy could be greatly misled by the biased Q-function and takes sparse action aggressively, which leads to a huge sub-optimality. This paper constructs a reference distribution that assigns a low probability to sparse action and proposes a regularized objective with an explicit constraint to the reference distribution. Furthermore, we derive a regularized Bellman operator and a regularized optimal policy that can slow down the propagation of error and guide the agent to take sparse action more carefully. The experiment results demonstrate that our method achieves state-of-the-art performance on typical sparse action tasks.

翻訳日:2021-05-19 14:14:33 公開日:2021-05-18

# 逐次独立メカニズムの高速・低速学習

Fast and Slow Learning of Recurrent Independent Mechanisms ( http://arxiv.org/abs/2105.08710v1 )

ライセンス: Link先を確認

Kanika Madan, Rosemary Nan Ke, Anirudh Goyal, Bernhard Bernhard Sch\"olkopf, Yoshua Bengio

(参考訳) 知識を交換可能な部品に分解することは、分布の変化がある場合に一般化の利点を約束する。環境と相互作用する学習エージェントは、既存の知識の新たな組み合わせを必要とする状況に直面しやすい。このような知識の分解は、分布外変化を体系的に一般化できる上で特に重要であると仮定する。そこで本研究では,エージェントが必要とする知識の一部と報酬関数が定常的であり,タスク間で再利用可能な,特定のトレーニングフレームワークを提案する。注意機構は、どのモジュールを現在のタスクに適応できるかを動的に選択し、選択したモジュールのパラメータは、学習者が経験する変化に直面すると迅速に変更でき、一方で注意機構のパラメータは安定してゆっくりと変化するメタパラメータとして動作する。我々は,注意のボトルネックを通じて相互に疎通するモジュール群が捉えた知識の断片に着目した。画像レベルの入力を伴う部分的に観測されたグリッドの世界におけるナビゲーションを含む強化学習装置において,提案方式のモジュール的側面をメタラーニングすることで,より高速な適応を実現することができる。また,パラメータとメタパラメータの役割を逆転させることは,動的に選択されたモジュールを高速に適応するための特別な役割を示唆する。

Decomposing knowledge into interchangeable pieces promises a generalization advantage when there are changes in distribution. A learning agent interacting with its environment is likely to be faced with situations requiring novel combinations of existing pieces of knowledge. We hypothesize that such a decomposition of knowledge is particularly relevant for being able to generalize in a systematic manner to out-of-distribution changes. To study these ideas, we propose a particular training framework in which we assume that the pieces of knowledge an agent needs and its reward function are stationary and can be re-used across tasks. An attention mechanism dynamically selects which modules can be adapted to the current task, and the parameters of the selected modules are allowed to change quickly as the learner is confronted with variations in what it experiences, while the parameters of the attention mechanisms act as stable, slowly changing, meta-parameters. We focus on pieces of knowledge captured by an ensemble of modules sparsely communicating with each other via a bottleneck of attention. We find that meta-learning the modular aspects of the proposed system greatly helps in achieving faster adaptation in a reinforcement learning setup involving navigation in a partially observed grid world with image-level input. We also find that reversing the role of parameters and meta-parameters does not work nearly as well, suggesting a particular role for fast adaptation of the dynamically selected modules.

翻訳日:2021-05-19 14:14:13 公開日:2021-05-18

# 不均一文脈における分布ロバスト学習

Distributionally Robust Learning in Heterogeneous Contexts ( http://arxiv.org/abs/2105.08532v1 )

ライセンス: Link先を確認

Muhammad Osama, Dave Zachariah, Petre Stoica

(参考訳) 本研究では,異なる文脈で得られた学習データから学習する際の問題点について考察する。我々は,超過リスクに着目した分散ロバストな手法を開発し,従来の超保守的ミニマックスアプローチよりもパフォーマンスとロバスト性のトレードオフをより適切なものにする。提案手法は計算可能であり,統計的保証を提供する。実データと合成データの両方を用いてその性能を示す。

We consider the problem of learning from training data obtained in different contexts, where the test data is subject to distributional shifts. We develop a distributionally robust method that focuses on excess risks and achieves a more appropriate trade-off between performance and robustness than the conventional and overly conservative minimax approach. The proposed method is computationally feasible and provides statistical guarantees. We demonstrate its performance using both real and synthetic data.

翻訳日:2021-05-19 14:13:51 公開日:2021-05-18

# sparta:空間的注意と敵対的ロバストな活性化

Sparta: Spatially Attentive and Adversarially Robust Activation ( http://arxiv.org/abs/2105.08269v1 )

ライセンス: Link先を確認

Qing Guo, Felix Juefei-Xu, Changqing Zhou, Yang Liu, Song Wang

(参考訳) 敵対的トレーニング(AT)は、深層畳み込みニューラルネットワーク(CNN)の堅牢性を改善する最も効果的な方法の1つである。一般的なネットワークトレーニングと同じように、atの有効性は基本的なネットワークコンポーネントの設計に依存する。本稿では,AT における強靭性 CNN における基本 ReLU 活性化成分の役割について,詳細な研究を行う。 ReLUアクティベーションの空間的共有性および入力非依存性により、CNNは標準的あるいは逆的トレーニングによるホワイトボックス攻撃に対してより堅牢であることがわかった。この問題に対処するため、我々はReLUを新しいSpartaアクティベーション関数(Spatially Attentive and Adversarially Robust Activation)に拡張し、CNNがより高いロバスト性、すなわち、敵の事例におけるエラー率、そしてより高い精度、すなわちクリーンな例におけるエラー率、すなわち既存の最先端(SOTA)アクティベーション関数よりも高い精度を実現する。さらに, Sparta と SOTA 活性化関数の関係について検討し, 本手法の利点について考察した。包括的な実験により,提案手法が優れたクロスcnnおよびクロスデータセット転送性を示すことがわかった。前者の場合、1つのCNN(例えばResNet-18)に対して逆向きに訓練されたSparta関数を固定し、他のCNN(例えばResNet-34)をトレーニングするために直接使用することができる。後者では、あるデータセット(例えば、CIFAR-10)でトレーニングされたSparta関数を使用して、別のデータセット(例えば、SVHN)で敵対的に堅牢なCNNをトレーニングすることができる。どちらの場合も、SpartaはバニラReLUよりも堅牢性が高く、提案手法の柔軟性と汎用性を検証する。

Adversarial training (AT) is one of the most effective ways for improving the robustness of deep convolution neural networks (CNNs). Just like common network training, the effectiveness of AT relies on the design of basic network components. In this paper, we conduct an in-depth study on the role of the basic ReLU activation component in AT for robust CNNs. We find that the spatially-shared and input-independent properties of ReLU activation make CNNs less robust to white-box adversarial attacks with either standard or adversarial training. To address this problem, we extend ReLU to a novel Sparta activation function (Spatially attentive and Adversarially Robust Activation), which enables CNNs to achieve both higher robustness, i.e., lower error rate on adversarial examples, and higher accuracy, i.e., lower error rate on clean examples, than the existing state-of-the-art (SOTA) activation functions. We further study the relationship between Sparta and the SOTA activation functions, providing more insights about the advantages of our method. With comprehensive experiments, we also find that the proposed method exhibits superior cross-CNN and cross-dataset transferability. For the former, the adversarially trained Sparta function for one CNN (e.g., ResNet-18) can be fixed and directly used to train another adversarially robust CNN (e.g., ResNet-34). For the latter, the Sparta function trained on one dataset (e.g., CIFAR-10) can be employed to train adversarially robust CNNs on another dataset (e.g., SVHN). In both cases, Sparta leads to CNNs with higher robustness than the vanilla ReLU, verifying the flexibility and versatility of the proposed method.

翻訳日:2021-05-19 14:13:44 公開日:2021-05-18

# 理論誘導残差ネットワークによる経路学習

Learning to Route via Theory-Guided Residual Network ( http://arxiv.org/abs/2105.08279v1 )

ライセンス: Link先を確認

Chang Liu, Guanjie Zheng, Zhenhui Li

(参考訳) 交通量と関連する問題は、常に近代都市に対する懸念であった。深層学習と強化学習の助けを借りて、スマート交通信号制御システムやタクシー配車システムなど、これらの交通問題を解決するための様々な政策を提案してきた。実際の都市で直接適用すると実際のコストがかかるため、人々は通常、都市シミュレーターでこれらのポリシーを検証する。しかし, 都市シミュレータで検証されたこれらの政策は, シミュレータが現実と大きく異なる場合, 実際の都市で失敗する可能性がある。この問題に取り組むためには,実際の交通シミュレーションシステムを構築する必要がある。そこで本研究では,交通シミュレータにおいて最も重要な部分の一つである人間のルーティングモデルを学習することを提案する。この問題には2つの大きな課題がある。第一に、人間の経路決定は、共通時間と距離要素以外の複数の要因によって決定される。第2に,現行のルートデータは通常,プライバシとデバイス可用性の問題から,車両のごく一部をカバーする。これらの問題に対処するために、理論的部分は人間の経路決定の一般的な原則(例えば、最速経路)を強調し、残余部分は乾燥可能な条件設定(例えば、ローカル道路やハイウェイ)を捉えることができる理論誘導残差ネットワークモデルを提案する。理論部分は、訓練に必要なデータを必要としない従来の最短経路アルゴリズムから成り立っているため、残余のネットワークは限られたデータから人間のルーティングモデルを学習することができる。我々は複数の実世界のデータセットに対して広範囲に実験を行い、特に小さなデータを用いて、モデルの優れた性能を示す。さらに、ケーススタディを通じて、私たちのモデルが実際のルートを回復する上で優れている理由も示しています。

The heavy traffic and related issues have always been concerns for modern cities. With the help of deep learning and reinforcement learning, people have proposed various policies to solve these traffic-related problems, such as smart traffic signal control systems and taxi dispatching systems. People usually validate these policies in a city simulator, since directly applying them in the real city introduces real cost. However, these policies validated in the city simulator may fail in the real city if the simulator is significantly different from the real world. To tackle this problem, we need to build a real-like traffic simulation system. Therefore, in this paper, we propose to learn the human routing model, which is one of the most essential part in the traffic simulator. This problem has two major challenges. First, human routing decisions are determined by multiple factors, besides the common time and distance factor. Second, current historical routes data usually covers just a small portion of vehicles, due to privacy and device availability issues. To address these problems, we propose a theory-guided residual network model, where the theoretical part can emphasize the general principles for human routing decisions (e.g., fastest route), and the residual part can capture drivable condition preferences (e.g., local road or highway). Since the theoretical part is composed of traditional shortest path algorithms that do not need data to train, our residual network can learn human routing models from limited data. We have conducted extensive experiments on multiple real-world datasets to show the superior performance of our model, especially with small data. Besides, we have also illustrated why our model is better at recovering real routes through case studies.

翻訳日:2021-05-19 14:12:21 公開日:2021-05-18

# ゼロショットレコメンダシステム

Zero-Shot Recommender Systems ( http://arxiv.org/abs/2105.08318v1 )

ライセンス: Link先を確認

Hao Ding, Yifei Ma, Anoop Deoras, Yuyang Wang, Hao Wang

(参考訳) 推薦システム(RS)の性能は、利用可能なトレーニングデータの量に大きく依存する。これはアーリーステージの製品にニワトリの問題を生じさせ、そのデータ量は彼らのRSの性能に依存する。一方、ゼロショット学習は、古いデータセットから全く新しいデータセットへのある程度の一般化を約束する。本稿では,RSにおけるゼロショット学習の可能性を検討する。我々は、ZESRecと呼ばれるアルゴリズムを開発し、古いデータセットでトレーニングし、重複するユーザも重複するアイテムも存在しない新しいデータセットに一般化する。カテゴリー的な項目インデックス、すなわち項目idとは異なり、zesrecは項目の自然言語記述(または記述埋め込み)を連続的なインデックスとして使用するため、自然に見えない項目に一般化する。ユーザの観点からは、zesrecはアイテムとのインタラクションを使用してユーザを表現するためにシーケンシャルなrsの最近の進歩をベースにしている。 2組の現実世界のRSデータセットを調査し、ZESRecがこのようなゼロショット設定でレコメンデーションをうまく実現できることを示し、データスカーススタートアップやアーリーステージ製品におけるチキンとエッグの問題を解決する新たな機会を開く。

Performance of recommender systems (RS) relies heavily on the amount of training data available. This poses a chicken-and-egg problem for early-stage products, whose amount of data, in turn, relies on the performance of their RS. On the other hand, zero-shot learning promises some degree of generalization from an old dataset to an entirely new dataset. In this paper, we explore the possibility of zero-shot learning in RS. We develop an algorithm, dubbed ZEro-Shot Recommenders (ZESRec), that is trained on an old dataset and generalize to a new one where there are neither overlapping users nor overlapping items, a setting that contrasts typical cross-domain RS that has either overlapping users or items. Different from categorical item indices, i.e., item ID, in previous methods, ZESRec uses items' natural-language descriptions (or description embeddings) as their continuous indices, and therefore naturally generalize to any unseen items. In terms of users, ZESRec builds upon recent advances on sequential RS to represent users using their interactions with items, thereby generalizing to unseen users as well. We study two pairs of real-world RS datasets and demonstrate that ZESRec can successfully enable recommendations in such a zero-shot setting, opening up new opportunities for resolving the chicken-and-egg problem for data-scarce startups or early-stage products.

翻訳日:2021-05-19 14:11:57 公開日:2021-05-18

# 経時的医療記録の類似性尺度としての多変量抽象型動的時間ワープ法の実装と評価

Implementation and Evaluation of a Multivariate Abstraction-Based, Interval-Based Dynamic Time-Warping Method as a Similarity Measure for Longitudinal Medical Records ( http://arxiv.org/abs/2105.08450v1 )

ライセンス: Link先を確認

Yuval Shahar and Matan Lion

(参考訳) A)インターバルベース表現(iRep): [1] 生のタイムスタンプデータをインターバルベース抽象化に抽象化する、[2] 比較周期スコピング、[3] 抽象インターバルを与えられた時間粒度に分割する、(B) インターバルベースマッチング(iMatch): 変更されたDTWを使用してパーティショニングされた抽象概念レコードにマッチする、インターバルベースの動的タイムワープ(iDTW)に拡張した。ドメイン知識を使って、医療記録の生データ(4つから5つの関連する概念のうち最大3つの概念)を2つのインターバルタイプに抽象化しました。 low, High) と Gradient の抽象化(例) Incrasing, Decrasing)。すべての一次元(状態または勾配)または多次元(状態と勾配)の抽象組み合わせを作成しました。課題: 自己骨髄移植または同種骨髄移植を161例, B型肝炎を125例, C型肝炎を125例, 来年のマイクロアルブミン尿症またはマクロアルブミン尿症を151例とした。 k-nearest-neighbors majority, k=1 to sqrt(n), n = set sizeを用いた。 23400(オンコロジー)、19,800(肝炎)、7,128(糖尿病)の10倍のクロスバリデーション実験を行った。測度:曲線の下の領域(auc)、最適なユーデン指数。 Paired t-tests compared result vectors for equivalent configurations than a test variable, to determine a significant mean accuracy difference (P<0.05。抽象化を用いた平均分類と予測は,生のタイムスタンプデータのみを使用するよりも有意に良好であった。各ドメインにおいて、少なくとも1つの抽象化の組み合わせは、生のデータを使用するよりも大幅にパフォーマンスが向上した。特徴数の増加、多次元抽象化の使用によりパフォーマンスが向上した。生のデータと異なり、最適性能はk=5で、抽象化を用いて達成されることが多い。

We extended dynamic time warping (DTW) into interval-based dynamic time warping (iDTW), including (A) interval-based representation (iRep): [1] abstracting raw, time-stamped data into interval-based abstractions, [2] comparison-period scoping, [3] partitioning abstract intervals into a given temporal granularity; (B) interval-based matching (iMatch): matching partitioned, abstract-concepts records, using a modified DTW. Using domain knowledge, we abstracted the raw data of medical records, for up to three concepts out of four or five relevant concepts, into two interval types: State abstractions (e.g. LOW, HIGH) and Gradient abstractions (e.g. INCREASING, DECREASING). We created all uni-dimensional (State or Gradient) or multi-dimensional (State and Gradient) abstraction combinations. Tasks: Classifying 161 oncology patients records as autologous or allogenic bone-marrow transplantation; classifying 125 hepatitis patients records as B or C hepatitis; predicting micro- or macro-albuminuria in the next year for 151 Type 2 diabetes patients. We used a k-Nearest-Neighbors majority, k=1 to SQRT(N), N = set size. 50,328 10-fold cross-validation experiments were performed: 23,400 (Oncology), 19,800 (Hepatitis), 7,128 (Diabetes). Measures: Area Under the Curve (AUC), optimal Youden's Index. Paired t-tests compared result vectors for equivalent configurations other than a tested variable, to determine a significant mean accuracy difference (P<0.05). Mean classification and prediction using abstractions was significantly better than using only raw time-stamped data. In each domain, at least one abstraction combination led to a significantly better performance than using raw data. Increasing feature number, and using multi-dimensional abstractions, enhanced performance. Unlike when using raw data, optimal performance was often reached with k=5, using abstractions.

翻訳日:2021-05-19 14:11:34 公開日:2021-05-18

# 適応型ABAC政策学習 : 強化学習アプローチ

Adaptive ABAC Policy Learning: A Reinforcement Learning Approach ( http://arxiv.org/abs/2105.08587v1 )

ライセンス: Link先を確認

Leila Karimi, Mai Abdelhakim, James Joshi

(参考訳) コンピュータシステムの急速な進歩により、より効率的かつ効率的なアクセス制御(AC)アプローチへの需要が高まっている。近年、ABAC(Atribute Based Access Control)アプローチは、このような複雑なコンピューティング環境のACニーズを満たす上で有望であることが示されている。 abacモデルは、システム内のエンティティの属性と認可ポリシーに基づく要求者へのアクセスを許可するが、その汎用性と柔軟性はより高いコストを伴う。さらに、組織システムの複雑さの増大とリソースへの連合的なアクセスの必要性により、ACの執行と管理がより困難になる。本稿では,認証管理タスクを自動化するための適応型ABACポリシー学習手法を提案する。 abacポリシー学習を強化学習問題としてモデル化する。特に,承認エンジンがフィードバック制御ループを介してABACモデルを適応させるコンテキスト的盗聴システムを提案する。学習過程を高速化するために,属性値階層に基づく学習モデルと計画手法を初期化する4つの手法を提案する。実例として,ホームIoT環境のための適応型ABACポリシー学習モデルの開発に注力する。提案手法を実データおよび合成データに対して評価する。評価において、完全なデータセットとスパースデータセットの両方を考慮する。実験結果から,提案手法は,多くのシナリオにおける教師付き学習に基づくものと同等の性能を達成し,いくつかの状況でそれを上回る結果が得られた。

With rapid advances in computing systems, there is an increasing demand for more effective and efficient access control (AC) approaches. Recently, Attribute Based Access Control (ABAC) approaches have been shown to be promising in fulfilling the AC needs of such emerging complex computing environments. An ABAC model grants access to a requester based on attributes of entities in a system and an authorization policy; however, its generality and flexibility come with a higher cost. Further, increasing complexities of organizational systems and the need for federated accesses to their resources make the task of AC enforcement and management much more challenging. In this paper, we propose an adaptive ABAC policy learning approach to automate the authorization management task. We model ABAC policy learning as a reinforcement learning problem. In particular, we propose a contextual bandit system, in which an authorization engine adapts an ABAC model through a feedback control loop; it relies on interacting with users/administrators of the system to receive their feedback that assists the model in making authorization decisions. We propose four methods for initializing the learning model and a planning approach based on attribute value hierarchy to accelerate the learning process. We focus on developing an adaptive ABAC policy learning model for a home IoT environment as a running example. We evaluate our proposed approach over real and synthetic data. We consider both complete and sparse datasets in our evaluations. Our experimental results show that the proposed approach achieves performance that is comparable to ones based on supervised learning in many scenarios and even outperforms them in several situations.

翻訳日:2021-05-19 14:10:57 公開日:2021-05-18

# 分散マルチロボットサブモジュラー動作選択のためのグラフニューラルネットワーク

Graph Neural Networks for Decentralized Multi-Robot Submodular Action Selection ( http://arxiv.org/abs/2105.08601v1 )

ライセンス: Link先を確認

Lifeng Zhou, Vishnu D. Sharma, Qingbiao Li, Amanda Prorok, Alejandro Ribeiro, Vijay Kumar

(参考訳) 本稿では,分散化サブモジュラー最大化のための学習ベースアプローチを開発する。ロボットが行動プリミティブなどのアクションを共同で選択し、ローカルコミュニケーションのみによるチームサブモジュラー目標を最大化するためのアプリケーションに注目します。このようなアプリケーションは、エリアカバレッジのためのマルチロボットモーションプランニング、環境探索、ターゲット追跡など、大規模なマルチロボット協調に不可欠である。しかし、現在の分散化部分モジュラー最大化アルゴリズムは、ロボット間通信の仮定を必要とするか、あるいはいくつかの準最適保証を失う。本研究では,分散通信を用いた大規模部分モジュラー最大化に向けた汎用学習アーキテクチャを提案する。特に、我々の学習アーキテクチャはグラフニューラルネットワーク(GNN)を利用して、ロボットの局所的な相互作用を捉え、ロボットの分散意思決定を学ぶ。専門的なソリューションを模倣して学習モデルを訓練し,局所的な観察とコミュニケーションのみを含む分散的な行動選択モデルを実装した。我々は,大規模ロボットネットワークを用いたアクティブターゲットカバレッジのシナリオにおいて,GNNベースの学習手法の性能を示す。シミュレーションの結果,我々のアプローチは,エキスパートアルゴリズムのカバレッジ性能にほぼ匹敵するものの,30体以上のロボットで複数の注文を高速に実行することがわかった。また,従来は認識されていなかったシナリオ,例えば,より大きな環境やロボットのネットワークなどにおいて,我々のアプローチの一般化能力を示す。

In this paper, we develop a learning-based approach for decentralized submodular maximization. We focus on applications where robots are required to jointly select actions, e.g., motion primitives, to maximize team submodular objectives with local communications only. Such applications are essential for large-scale multi-robot coordination such as multi-robot motion planning for area coverage, environment exploration, and target tracking. But the current decentralized submodular maximization algorithms either require assumptions on the inter-robot communication or lose some suboptimal guarantees. In this work, we propose a general-purpose learning architecture towards submodular maximization at scale, with decentralized communications. Particularly, our learning architecture leverages a graph neural network (GNN) to capture local interactions of the robots and learns decentralized decision-making for the robots. We train the learning model by imitating an expert solution and implement the resulting model for decentralized action selection involving local observations and communications only. We demonstrate the performance of our GNN-based learning approach in a scenario of active target coverage with large networks of robots. The simulation results show our approach nearly matches the coverage performance of the expert algorithm, and yet runs several orders faster with more than 30 robots. The results also exhibit our approach's generalization capability in previously unseen scenarios, e.g., larger environments and larger networks of robots.

翻訳日:2021-05-19 14:10:37 公開日:2021-05-18

# ASM2TV: 適応型セミスーパービジョンマルチタスクマルチビュー学習フレームワーク

ASM2TV: An Adaptive Semi-Supervised Multi-Task Multi-View Learning Framework ( http://arxiv.org/abs/2105.08643v1 )

ライセンス: Link先を確認

Zekai Chen, Maiwang Shi, Xiao Zhang, Haochao Ying

(参考訳) IoTにおけるヒューマンアクティビティ認識(HAR)のような現実のシナリオの多くは、マルチタスクのマルチビュー学習問題として形式化することができる。各タスクは、複数のソースから収集された複数の共有機能ビューで構成される。最近のアプローチの共通点は、共通知識を明らかにするために、タスクをまたいだ各ビューに対して、最初のフェーズで典型的なハード/ソフトの共有戦略を個別に採用することである。一方、タスク間の複数のビューは、実用的な状況下で相互に関連している可能性がある。一方で、ラベル付きデータが少ない場合、教師付きメソッドは不十分かもしれない。これらの課題に対処するために,準教師付きマルチタスク多視点学習のための新しいフレームワーク ASM2TV を提案する。本稿では,任意のタスクに対して最も望ましい候補共有ブロックを適応的に選択する,学習可能なタスクビュー対応共有ポリシであるゲーティングコントロールポリシーを提案する。重要な点として,本提案手法は大量の未ラベルの断片化時系列をフル活用し,広範囲のアプリケーションに対応する汎用的なフレームワークである。さまざまな主題やソースから収集された2つの多様な実世界のHARベンチマークデータセットの実験は、我々のフレームワークが他の最先端技術よりも優れていることを示している。

Many real-world scenarios, such as human activity recognition (HAR) in IoT, can be formalized as a multi-task multi-view learning problem. Each specific task consists of multiple shared feature views collected from multiple sources, either homogeneous or heterogeneous. Common among recent approaches is to employ a typical hard/soft sharing strategy at the initial phase separately for each view across tasks to uncover common knowledge, underlying the assumption that all views are conditionally independent. On the one hand, multiple views across tasks possibly relate to each other under practical situations. On the other hand, supervised methods might be insufficient when labeled data is scarce. To tackle these challenges, we introduce a novel framework ASM2TV for semi-supervised multi-task multi-view learning. We present a new perspective named gating control policy, a learnable task-view-interacted sharing policy that adaptively selects the most desirable candidate shared block for any view across any task, which uncovers more fine-grained task-view-interacted relatedness and improves inference efficiency. Significantly, our proposed gathering consistency adaption procedure takes full advantage of large amounts of unlabeled fragmented time-series, making it a general framework that accommodates a wide range of applications. Experiments on two diverse real-world HAR benchmark datasets collected from various subjects and sources demonstrate our framework's superiority over other state-of-the-arts.

翻訳日:2021-05-19 14:10:16 公開日:2021-05-18

# DCAP:ユーザ応答予測のためのディープクロス注意製品ネットワーク

DCAP: Deep Cross Attentional Product Network for User Response Prediction ( http://arxiv.org/abs/2105.08649v1 )

ライセンス: Link先を確認

Zekai Chen, Fangtian Zhong, Zhumin Chen, Xiao Zhang, Robert Pless, Xiuzhen Cheng

(参考訳) ユーザが広告をクリックしたりアイテムを購入したりといった特定のコンテキストで事前定義されたポジティブな応答を提供する確率を予測することを目的としたユーザ応答予測は、オンライン広告やレコメンデーションシステム、検索ランキングといった多くの産業アプリケーションにとって不可欠である。しかし、これらのタスクで収集されたデータの高次元性と超広さのため、クロス機能は必然的に高価である。ユーザ応答の予測に関する以前の研究では、機能ベクタを2次あるいは高次クロス機能を明示的に、あるいは暗黙的にモデル化するために、機能ベクターを拡張して機能インタラクションを利用した。しかし、これらの既存手法は、モデルアーキテクチャの制限のために十分なクロスフィーチャを学習しなかったり、同じ重みを持つ全ての高階特徴相互作用をモデル化することによって妨げられる。この研究は、新しいアーキテクチャであるDeep Cross Attentional Product Network (DCAP)を提案することで、このギャップを埋めることを目的としている。さらに、マルチヘッドアテンションメカニズムとProduct Neural Network(PNN)にインスパイアされた各ネットワーク層における異なるクロス機能の重要性を区別し、実践者がより詳細なユーザ行動分析を行うことを可能にする。さらに,提案モデルは容易に実装でき,並行して訓練できる。実世界の3つのデータセットに関する総合的な実験を行う。その結果,提案モデルDCAPは最先端モデルと比較して優れた予測性能が得られることが示された。

User response prediction, which aims to predict the probability that a user will provide a predefined positive response in a given context such as clicking on an ad or purchasing an item, is crucial to many industrial applications such as online advertising, recommender systems, and search ranking. However, due to the high dimensionality and super sparsity of the data collected in these tasks, handcrafting cross features is inevitably time expensive. Prior studies in predicting user response leveraged the feature interactions by enhancing feature vectors with products of features to model second-order or high-order cross features, either explicitly or implicitly. Nevertheless, these existing methods can be hindered by not learning sufficient cross features due to model architecture limitations or modeling all high-order feature interactions with equal weights. This work aims to fill this gap by proposing a novel architecture Deep Cross Attentional Product Network (DCAP), which keeps cross network's benefits in modeling high-order feature interactions explicitly at the vector-wise level. Beyond that, it can differentiate the importance of different cross features in each network layer inspired by the multi-head attention mechanism and Product Neural Network (PNN), allowing practitioners to perform a more in-depth analysis of user behaviors. Additionally, our proposed model can be easily implemented and train in parallel. We conduct comprehensive experiments on three real-world datasets. The results have robustly demonstrated that our proposed model DCAP achieves superior prediction performance compared with the state-of-the-art models.

翻訳日:2021-05-19 14:09:54 公開日:2021-05-18

# 破損測定による低ランク行列回復問題に対するシャープ制限等尺特性境界

Sharp Restricted Isometry Property Bounds for Low-rank Matrix Recovery Problems with Corrupted Measurements ( http://arxiv.org/abs/2105.08232v1 )

ライセンス: Link先を確認

Ziye Ma, Yingjie Bi, Javad Lavaei, Somayeh Sojoudi

(参考訳) 本稿では,雑音による線形測定による一般的な低ランク行列回復問題について検討する。本研究の目的は,局所探索手法の制限等尺性(RIP)の条件が,誤差の少ない基底真理を見つけることができるかを理解することである。非凸問題の景観を解析することにより、まず、RIP定数が1/2より小さいという仮定の下で、任意の局所最小化器と基底真理の間の最大距離に関する大域的保証を提案する。ノイズの強度が減少するにつれて、この距離がゼロに縮まることを示す。我々の新しい保証は、RIP定数の点で鋭く、既存の結果よりもはるかに強い。次に、任意の RIP 定数を持つ問題に対する局所的な保証を示し、任意の局所最小化器は基底的真理にかなり近いか、それから遠く離れていることを示す。これらの結果から,問題の雑音強度とRIP定数が,真の解に対する局所最小値の位置に与える影響が示された。

In this paper, we study a general low-rank matrix recovery problem with linear measurements corrupted by some noise. The objective is to understand under what conditions on the restricted isometry property (RIP) of the problem local search methods can find the ground truth with a small error. By analyzing the landscape of the non-convex problem, we first propose a global guarantee on the maximum distance between an arbitrary local minimizer and the ground truth under the assumption that the RIP constant is smaller than 1/2. We show that this distance shrinks to zero as the intensity of the noise reduces. Our new guarantee is sharp in terms of the RIP constant and is much stronger than the existing results. We then present a local guarantee for problems with an arbitrary RIP constant, which states that any local minimizer is either considerably close to the ground truth or far away from it. The developed results demonstrate how the noise intensity and the RIP constant of the problem affect the locations of the local minima relative to the true solution.

翻訳日:2021-05-19 14:09:16 公開日:2021-05-18

# oneshot differentially top-k selection

Oneshot Differentially Private Top-k Selection ( http://arxiv.org/abs/2105.08233v1 )

ライセンス: Link先を確認

Gang Qiao, Weijie J. Su, Li Zhang

(参考訳) プライバシリークのないトップ$1の要素を効率的かつ正確に選択できることは、さまざまなデータ分析タスクの不可欠なコンポーネントであり、大きな注目を集めている。本稿では,上位k$問題に対する高速かつ低歪みかつ微分プライベートなプリミティブである「textit{oneshot mechanism}」を紹介する。文献の既存手法と比較すると,本アルゴリズムは数にLaplaceノイズを付加し,高額なノイズ数とその推定値を一括してリリースすることにより,有効性を保ちながら計算コストを大幅に削減する。このメカニズムのプライバシーの証明は、独立した理論的関心を持つ新しい結合技術に依存している。最後に,複数仮説検定とペア比較によるランク付けにワンショット機構を適用し,その差分プライベートな結果を得る。

Being able to efficiently and accurately select the top-$k$ elements without privacy leakage is an integral component of various data analysis tasks and has gained significant attention. In this paper, we introduce the \textit{oneshot mechanism}, a fast, low-distortion, and differentially private primitive for the top-$k$ problem. Compared with existing approaches in the literature, our algorithm adds Laplace noise to the counts and releases the top-$k$ noisy counts and their estimates in a oneshot fashion, thereby substantially reducing the computational cost while maintaining satisfying utility. Our proof of privacy for this mechanism relies on a novel coupling technique that is of independent theoretical interest. Finally, we apply the oneshot mechanism to multiple hypothesis testing and ranking from pairwise comparisons and thus obtain their differentially private counterparts.

翻訳日:2021-05-19 14:09:00 公開日:2021-05-18

# 局所感性ハッシュによる線形最小二乗値反復

Sublinear Least-Squares Value Iteration via Locality Sensitive Hashing ( http://arxiv.org/abs/2105.08285v1 )

ライセンス: Link先を確認

Anshumali Shrivastava, Zhao Song, Zhaozhuo Xu

(参考訳) 本稿では,動作数において実行時の複雑性を部分線形に有する,最初の証明可能な最小二乗値反復 (lsvi) アルゴリズムを提案する。本稿では,最大内部積探索問題として値反復値関数推定法を定式化し,局所性に敏感なハッシュ (LSH) [Indyk and Motwani STOC'98, Andoni and Razenshteyn STOC'15, Andoni, Laarhoven, Razenshteyn and Waingarten SODA'17] 型データ構造を提案する。さらに, 近似最大内積探索理論と強化学習の後悔分析との関係を明らかにした。我々は、近似係数を選択することで、我々のSublinear LSVIアルゴリズムが元のLSVIアルゴリズムと同じ後悔を保ちつつ、実行時の複雑さをアクションの数でサブリニアに減らすことを証明した。私たちの知る限りでは、これはlshと強化学習を組み合わせることで、証明可能な改善をもたらす最初の仕事です。データ構造と反復アルゴリズムを組み合わせた新しい手法が、コスト削減と最適化のさらなる研究の扉を開くことを願っている。

We present the first provable Least-Squares Value Iteration (LSVI) algorithms that have runtime complexity sublinear in the number of actions. We formulate the value function estimation procedure in value iteration as an approximate maximum inner product search problem and propose a locality sensitive hashing (LSH) [Indyk and Motwani STOC'98, Andoni and Razenshteyn STOC'15, Andoni, Laarhoven, Razenshteyn and Waingarten SODA'17] type data structure to solve this problem with sublinear time complexity. Moreover, we build the connections between the theory of approximate maximum inner product search and the regret analysis of reinforcement learning. We prove that, with our choice of approximation factor, our Sublinear LSVI algorithms maintain the same regret as the original LSVI algorithms while reducing the runtime complexity to sublinear in the number of actions. To the best of our knowledge, this is the first work that combines LSH with reinforcement learning resulting in provable improvements. We hope that our novel way of combining data-structures and iterative algorithm will open the door for further study into cost reduction in optimization.

翻訳日:2021-05-19 14:08:46 公開日:2021-05-18

# 密度に基づく原子表現の最適ラジアル基底

Optimal radial basis for density-based atomic representations ( http://arxiv.org/abs/2105.08717v1 )

ライセンス: Link先を確認

Alexander Goscinski, F\'elix Musil, Sergey Pozdnyakov, and Michele Ceriotti

(参考訳) 原子スケールでの物質の特性をターゲットにしたほぼ全ての機械学習アルゴリズムの入力は、デカルト原子座標のリストをより対称な表現に変換することを含む。これらの最も一般的な表現の多くは、原子密度の対称性相関の拡張と見なすことができ、主に基底の選択によって異なる。ここでは、データセットの構造的多様性を最も効率的に表現するために選択された適応的で最適な数値基底を構築する方法について論じる。トレーニングデータセットごとに、この最適なベースはユニークで、スプラインで近似することで、プリミティブベースに関して追加コストなしで計算することができる。この構成は、正確で計算効率の良い表現をもたらし、分子と凝縮相の両方の機械学習モデルを含む例を示す。

The input of almost every machine learning algorithm targeting the properties of matter at the atomic scale involves a transformation of the list of Cartesian atomic coordinates into a more symmetric representation. Many of these most popular representations can be seen as an expansion of the symmetrized correlations of the atom density, and differ mainly by the choice of basis. Here we discuss how to build an adaptive, optimal numerical basis that is chosen to represent most efficiently the structural diversity of the dataset at hand. For each training dataset, this optimal basis is unique, and can be computed at no additional cost with respect to the primitive basis by approximating it with splines. We demonstrate that this construction yields representations that are accurate and computationally efficient, presenting examples that involve both molecular and condensed-phase machine-learning models.

翻訳日:2021-05-19 14:08:21 公開日:2021-05-18

# UncertaintyFuseNet: uncertainty-aware Hierarchical Feature Fusion with Ensemble Monte Carlo Dropout for COVID-19 Detection

UncertaintyFuseNet: Robust Uncertainty-aware Hierarchical Feature Fusion with Ensemble Monte Carlo Dropout for COVID-19 Detection ( http://arxiv.org/abs/2105.08590v1 )

ライセンス: Link先を確認

Moloud Abdar, Soorena Salari, Sina Qahremani, Hak-Keung Lam, Fakhri Karray, Sadiq Hussain, Abbas Khosravi, U. Rajendra Acharya, Saeid Nahavandi

(参考訳) 新型コロナウイルス(Coronavirus disease 2019)は1億5100万人以上に感染し、現在まで世界中で約317万人が死亡している。新型コロナウイルス(covid-19)の急速な拡大は、人間の生命と健康を脅かし続けている。そのため,CTとX線データセットを用いて,新型コロナウイルスと他の疾患を正確に区別できる機械学習と深層学習を基盤としたCADシステムの開発が不可欠であり,最優先事項である。 CT画像とX線画像のどちらを用いた以前の研究と異なり、実装に十分なサンプルが得られたデータ型を両方使用した。一方で、この広汎性ウイルスの極度の感受性のため、モデル不確実性は考慮されるべきであるが、ほとんどの研究はそれを見落としている。そこで我々は,不確実性モジュールであるEnsemble Monte Carlo (EMC)ドロップアウトからなる,$UncertaintyFuseNet$という新しい強力な融合モデルを提案する。以上の結果から,CTスキャンとX線データを用いたCOVID-19検出のための融合の有用性が示唆された。また、提案する$uncertaintyfusenet$モデルはノイズに対してかなり頑健で、未発見のデータでもうまく動作します。この研究のソースコードとモデルは、https://github.com/moloud 1987/uncertaintyfusenet-for-covid-19-classificationで入手できる。

The COVID-19 (Coronavirus disease 2019) has infected more than 151 million people and caused approximately 3.17 million deaths around the world up to the present. The rapid spread of COVID-19 is continuing to threaten human's life and health. Therefore, the development of computer-aided detection (CAD) systems based on machine and deep learning methods which are able to accurately differentiate COVID-19 from other diseases using chest computed tomography (CT) and X-Ray datasets is essential and of immediate priority. Different from most of the previous studies which used either one of CT or X-ray images, we employed both data types with sufficient samples in implementation. On the other hand, due to the extreme sensitivity of this pervasive virus, model uncertainty should be considered, while most previous studies have overlooked it. Therefore, we propose a novel powerful fusion model named $UncertaintyFuseNet$ that consists of an uncertainty module: Ensemble Monte Carlo (EMC) dropout. The obtained results prove the effectiveness of our proposed fusion for COVID-19 detection using CT scan and X-Ray datasets. Also, our proposed $UncertaintyFuseNet$ model is significantly robust to noise and performs well with the previously unseen data. The source codes and models of this study are available at: https://github.com/moloud1987/UncertaintyFuseNet-for-COVID-19-Classification.

翻訳日:2021-05-19 14:06:09 公開日:2021-05-18

# SAIL-VOS 3D:映像データからのオブジェクト検出と3Dメッシュ再構成のための合成データセットとベースライン

SAIL-VOS 3D: A Synthetic Dataset and Baselines for Object Detection and 3D Mesh Reconstruction from Video Data ( http://arxiv.org/abs/2105.08612v1 )

ライセンス: Link先を確認

Yuan-Ting Hu, Jiahong Wang, Raymond A. Yeh, Alexander G. Schwing

(参考訳) 映像データからオブジェクトの詳細な3D情報を抽出することは、全体像理解の重要な目標である。最近の手法では、単一の画像からオブジェクトのメッシュを再構築する場合に印象的な結果が得られたが、オブジェクトの一部が観測できないため、結果が曖昧なままであることが多い。さらに、メッシュ再構築のための既存の画像ベースのデータセットは、時間情報を統合するモデルの研究を許可しません。 SAIL-VOS 3D:SAIL-VOSを拡張したフレーム単位のメッシュアノテーションを備えた合成ビデオデータセット。また,時間モデルによる映像データから3次元メッシュを再構成するための最初のベースラインを開発した。提案するベースラインがSAIL-VOS 3DとPix3Dに対して有効であることを示し,時間的情報により復元精度が向上することを示した。リソースと追加情報はhttp://sailvos.web.illinois.eduで入手できる。

Extracting detailed 3D information of objects from video data is an important goal for holistic scene understanding. While recent methods have shown impressive results when reconstructing meshes of objects from a single image, results often remain ambiguous as part of the object is unobserved. Moreover, existing image-based datasets for mesh reconstruction don't permit to study models which integrate temporal information. To alleviate both concerns we present SAIL-VOS 3D: a synthetic video dataset with frame-by-frame mesh annotations which extends SAIL-VOS. We also develop first baselines for reconstruction of 3D meshes from video data via temporal models. We demonstrate efficacy of the proposed baseline on SAIL-VOS 3D and Pix3D, showing that temporal information improves reconstruction quality. Resources and additional information are available at http://sailvos.web.illinois.edu.

翻訳日:2021-05-19 14:05:46 公開日:2021-05-18

# Image Cropping on Twitter: Fairness Metrics, their limitation, and the importance of Representation, Design, and Agency

Image Cropping on Twitter: Fairness Metrics, their Limitations, and the Importance of Representation, Design, and Agency ( http://arxiv.org/abs/2105.08667v1 )

ライセンス: Link先を確認

Kyra Yee, Uthaipon Tantipongpipat, Shubhanshu Mishra

(参考訳) twitterは機械学習を使って画像を収穫する。 2020年秋、twitterのユーザーは、自動トリッピングシステムが浅黒い肌の個人よりも明るい肌を好むことを懸念し、またこのシステムが頭の代わりに女性の身体をトリッピングすることを好んでいることを懸念した。これらの懸念に対処するために,形式化されたグループフェアネスメトリクスを用いて広範な分析を行う。作付けにおける系統的な相違は,最も顕著な点に基づく作付けが,その相違を増幅するという事実を含む寄与要因を同定する。しかし, 自動収穫における表現的害のリスクを捉えるには, 形式化された公正度指標と定量分析が不十分であることを示す。ユーザエージェンシーをよりよく保存するソリューションとして,サリエンシに基づく収穫の除去を提案する。表現的危害に関する懸念に十分対処できる新しいソリューションを開発するために、我々の批判は、人間中心の設計を含む量的および質的手法の組み合わせを動機付ける。

Twitter uses machine learning to crop images, where crops are centered around the part predicted to be the most salient. In fall 2020, Twitter users raised concerns that the automated image cropping system on Twitter favored light-skinned over dark-skinned individuals, as well as concerns that the system favored cropping woman's bodies instead of their heads. In order to address these concerns, we conduct an extensive analysis using formalized group fairness metrics. We find systematic disparities in cropping and identify contributing factors, including the fact that the cropping based on the single most salient point can amplify the disparities. However, we demonstrate that formalized fairness metrics and quantitative analysis on their own are insufficient for capturing the risk of representational harm in automatic cropping. We suggest the removal of saliency-based cropping in favor of a solution that better preserves user agency. For developing a new solution that sufficiently address concerns related to representational harm, our critique motivates a combination of quantitative and qualitative methods that include human-centered design.

翻訳日:2021-05-19 14:05:00 公開日:2021-05-18

# 勾配で戦うグラデーション:敵の攻撃に対する動的防御

Fighting Gradients with Gradients: Dynamic Defenses against Adversarial Attacks ( http://arxiv.org/abs/2105.08714v1 )

ライセンス: Link先を確認

Dequan Wang, An Ju, Evan Shelhamer, David Wagner, Trevor Darrell

(参考訳) 敵の攻撃は防御を破るためにモデルに最適化する。既存の防御は静的であり、攻撃が変化しても一度トレーニングされたままである。モデルは反撃し、テスト時に攻撃に対して防御を最適化すべきである。防御エントロピー最小化(dent)により,テスト中にモデルと入力に適応する動的防御を提案する。 dentは既存のモデルとの互換性と列車時の防御のために、トレーニングではなくテストを変更する。 Dentは、CIFAR-10/100およびImageNetに対する、敵に訓練された防御と名指しで訓練されたモデルの堅牢性を改善する。特にdentは、cifar-10のオートアタックに対して、$\epsilon_\infty$ = 8/255で絶対的に20ポイント以上の防御を強化している。

Adversarial attacks optimize against models to defeat defenses. Existing defenses are static, and stay the same once trained, even while attacks change. We argue that models should fight back, and optimize their defenses against attacks at test time. We propose dynamic defenses, to adapt the model and input during testing, by defensive entropy minimization (dent). Dent alters testing, but not training, for compatibility with existing models and train-time defenses. Dent improves the robustness of adversarially-trained defenses and nominally-trained models against white-box, black-box, and adaptive attacks on CIFAR-10/100 and ImageNet. In particular, dent boosts state-of-the-art defenses by 20+ points absolute against AutoAttack on CIFAR-10 at $\epsilon_\infty$ = 8/255.

翻訳日:2021-05-19 14:04:39 公開日:2021-05-18

# データ次元にパラメータ化されたreluネットワークトレーニングの計算複雑性

The Computational Complexity of ReLU Network Training Parameterized by Data Dimensionality ( http://arxiv.org/abs/2105.08675v1 )

ライセンス: Link先を確認

Vincent Froese, Christoph Hertrich, Rolf Niedermeier

(参考訳) 線形整列ユニット(ReLU)を用いた単純なニューラルネットワークの学習における計算複雑性の理解が近年,集中的な研究の対象となっている。そこで本論文では,2層reluネットワークの各種損失関数に対するパラメータ化複雑性に関するいくつかの結果について述べる。他のパラメータに関する簡単な議論の後、トレーニングデータの次元$d$が計算複雑性に与える影響を分析することに重点を置いている。パラメータ $d$ に対する W[1]-hardness という観点でランニングタイムの下界を提供し、既知のブルートフォース戦略が本質的に最適であることを証明する(指数時間仮説を仮定する)。これまでの研究と比較すると、結果は幅広い損失関数に対して、すべての$p\in[0,\infty]$に対して$\ell^p$-lossを含む。特に、定数$d$と凸損失関数の既知の多項式時間アルゴリズムを、より一般的な損失関数のクラスに拡張し、これらの場合もランニングタイムの下限と一致する。

Understanding the computational complexity of training simple neural networks with rectified linear units (ReLUs) has recently been a subject of intensive research. Closing gaps and complementing results from the literature, we present several results on the parameterized complexity of training two-layer ReLU networks with respect to various loss functions. After a brief discussion of other parameters, we focus on analyzing the influence of the dimension $d$ of the training data on the computational complexity. We provide running time lower bounds in terms of W[1]-hardness for parameter $d$ and prove that known brute-force strategies are essentially optimal (assuming the Exponential Time Hypothesis). In comparison with previous work, our results hold for a broad(er) range of loss functions, including $\ell^p$-loss for all $p\in[0,\infty]$. In particular, we extend a known polynomial-time algorithm for constant $d$ and convex loss functions to a more general class of loss functions, matching our running time lower bounds also in these cases.

翻訳日:2021-05-19 14:04:26 公開日:2021-05-18

# ハイパーグラフによる高次相互作用の非パラメトリックモデリング

Nonparametric Modeling of Higher-Order Interactions via Hypergraphons ( http://arxiv.org/abs/2105.08678v1 )

ライセンス: Link先を確認

Krishnakumar Balasubramanian

(参考訳) 大規模ハイパーグラフの限界であるハイパーグラフを用いた高次相互作用のモデル化における統計的およびアルゴリズム的側面について検討する。ハイパーグラフはモデリングの観点からは非常に強力であるが、実際に効率的に推定できる制限された単純なリプシッツハイパーグラフ(SLH)のクラスを考える。また、SLHのクラスに最適である推定器の収束率も提供する。理論を裏付けるシミュレーション結果が提供される。

We study statistical and algorithmic aspects of using hypergraphons, that are limits of large hypergraphs, for modeling higher-order interactions. Although hypergraphons are extremely powerful from a modeling perspective, we consider a restricted class of Simple Lipschitz Hypergraphons (SLH), that are amenable to practically efficient estimation. We also provide rates of convergence for our estimator that are optimal for the class of SLH. Simulation results are provided to corroborate the theory.

翻訳日:2021-05-19 14:04:08 公開日:2021-05-18

# LEWIS: 教師なしテキストスタイル転送のためのLevenshtein編集

LEWIS: Levenshtein Editing for Unsupervised Text Style Transfer ( http://arxiv.org/abs/2105.08206v1 )

ライセンス: Link先を確認

Machel Reid and Victor Zhong

(参考訳) 多くのタイプのテキストスタイル転送は、小さな正確な編集(例えば)だけで実現できる。気持ちの移り変わりはひどい時間だったのに... 素晴らしい時間を過ごしたのに) 本稿では,Levenshtein編集操作を用いてテキストを変換するスタイル転送のための粗大なエディタを提案する。挿入、置換、削除)。従来の単一スパン編集法とは異なり,本手法はソーステキスト中の複数のスパンを同時に編集する。並列スタイルのテキストペア(例)なしでトレーニングするペアの +/sentiment 文) では、教師なしのデータ合成手順を提案する。まず、スタイル分類子に注意を向けて、テキストをスタイル非依存のテンプレートに変換する。そしてテンプレートのスロットを微調整された事前学習された言語モデルで埋めます。提案手法は感情(yelp, amazon)と礼儀正しい(polite)トランスファー(polite)において,既存の生成および編集スタイルトランスファー手法を上回っている。特にマルチスパン編集はシングルスパン編集よりも高い性能と多様な出力を実現する。さらに,教師なしデータ合成における従来の手法と比較して,高品質な並列スタイルペアが得られ,モデル性能が向上する。

Many types of text style transfer can be achieved with only small, precise edits (e.g. sentiment transfer from I had a terrible time... to I had a great time...). We propose a coarse-to-fine editor for style transfer that transforms text using Levenshtein edit operations (e.g. insert, replace, delete). Unlike prior single-span edit methods, our method concurrently edits multiple spans in the source text. To train without parallel style text pairs (e.g. pairs of +/- sentiment statements), we propose an unsupervised data synthesis procedure. We first convert text to style-agnostic templates using style classifier attention (e.g. I had a SLOT time...), then fill in slots in these templates using fine-tuned pretrained language models. Our method outperforms existing generation and editing style transfer methods on sentiment (Yelp, Amazon) and politeness (Polite) transfer. In particular, multi-span editing achieves higher performance and more diverse output than single-span editing. Moreover, compared to previous methods on unsupervised data synthesis, our method results in higher quality parallel style pairs and improves model performance.

翻訳日:2021-05-19 14:03:42 公開日:2021-05-18

# BookSum: 長文ナラティブ要約のためのデータセットのコレクション

BookSum: A Collection of Datasets for Long-form Narrative Summarization ( http://arxiv.org/abs/2105.08209v1 )

ライセンス: Link先を確認

Wojciech Kry\'sci\'nski, Nazneen Rajani, Divyansh Agarwal, Caiming Xiong, Dragomir Radev

(参考訳) 利用可能なテキスト要約データセットの大部分は、長期因果関係や時間依存がなく、強いレイアウトやスタイルバイアスを含む短い形式のソースドキュメントを含んでいる。関連性はあるものの、このようなデータセットは将来のテキスト要約システムに限定的な課題をもたらすだろう。長文要約のためのデータセットの集合であるBookSumを導入することで,これらの問題に対処する。私たちのデータセットは、小説、戯曲、物語などの文学領域のソースドキュメントをカバーしており、難易度の増加の3つのレベル(段落、章、書籍レベル)において、高度に抽象的な人間による要約を含んでいます。データセットのドメインと構造は、非常に長いドキュメントの処理、非自明な因果関係と時間的依存関係、リッチな談話構造など、要約システムに固有の課題をもたらします。今後の作業を容易にするため、データセットのベースラインとして、複数の抽出および抽象的な要約モデルを訓練し、評価した。

The majority of available text summarization datasets include short-form source documents that lack long-range causal and temporal dependencies, and often contain strong layout and stylistic biases. While relevant, such datasets will offer limited challenges for future generations of text summarization systems. We address these issues by introducing BookSum, a collection of datasets for long-form narrative summarization. Our dataset covers source documents from the literature domain, such as novels, plays and stories, and includes highly abstractive, human written summaries on three levels of granularity of increasing difficulty: paragraph-, chapter-, and book-level. The domain and structure of our dataset poses a unique set of challenges for summarization systems, which include: processing very long documents, non-trivial causal and temporal dependencies, and rich discourse structures. To facilitate future work, we trained and evaluated multiple extractive and abstractive summarization models as baselines for our dataset.

翻訳日:2021-05-19 14:03:23 公開日:2021-05-18

# 感情誘発機:デュアルジェネレータに基づく会話生成を誘発する感情

Emotion Eliciting Machine: Emotion Eliciting Conversation Generation based on Dual Generator ( http://arxiv.org/abs/2105.08251v1 )

ライセンス: Link先を確認

Hao Jiang, Yutao Zhu, Xinyu Zhang, Zhicheng Dou, Pan Du, Te Pi, Yantao Jia

(参考訳) 近年、感情的なチャットボットの構築に大きな進歩が見られた。チャットボットが与えられた感情で応答を生成するための素晴らしい方法が提案されている。しかし,会話中のユーザの感情変化は十分に検討されていない。本研究では,人間と機械の会話において,ユーザのポジティブな感情を誘発する応答を生成することを目的としたポジティブ感情誘発問題について検討する。この問題に対処するために,弱い教師付き感情除去機械(EEM)を提案する。具体的には,まず,事前学習した感情分類器に基づいて,ユーザの感情状態変化の弱いラベルを変換で収集する。次に,会話におけるユーザの感情状態の変化に基づいて,正と負の両方の応答生成をモデル化する二重エンコーダデコーダ構造を提案する。二重構造の上に感情誘発因子を導入し、感情誘発時の反応に対する肯定的および否定的な感情的影響のバランスをとる。この要因はまた、感情誘発のきめ細かい制御方法を提供する。大規模な実世界のデータセットによる実験結果から、EEMは肯定的な感情誘発反応の生成において既存のモデルよりも優れていた。

Recent years have witnessed great progress on building emotional chatbots. Tremendous methods have been proposed for chatbots to generate responses with given emotions. However, the emotion changes of the user during the conversation has not been fully explored. In this work, we study the problem of positive emotion elicitation, which aims to generate responses that can elicit positive emotion of the user, in human-machine conversation. We propose a weakly supervised Emotion Eliciting Machine (EEM) to address this problem. Specifically, we first collect weak labels of user emotion status changes in a conversion based on a pre-trained emotion classifier. Then we propose a dual encoder-decoder structure to model the generation of responses in both positive and negative side based on the changes of the user's emotion status in the conversation. An emotion eliciting factor is introduced on top of the dual structure to balance the positive and negative emotional impacts on the generated response during emotion elicitation. The factor also provides a fine-grained controlling manner for emotion elicitation. Experimental results on a large real-world dataset show that EEM outperforms the existing models in generating responses with positive emotion elicitation.

翻訳日:2021-05-19 14:03:04 公開日:2021-05-18

# KECRS:知識に富んだ会話レコメンデーションシステムを目指して

KECRS: Towards Knowledge-Enriched Conversational Recommendation System ( http://arxiv.org/abs/2105.08261v1 )

ライセンス: Link先を確認

Tong Zhang, Yong Liu, Peixiang Zhong, Chen Zhang, Hao Wang, Chunyan Miao

(参考訳) チャットベースの会話レコメンデーションシステム(CRS)は、自然言語による対話を通じて、ユーザにアイテムレコメンデーションを提供する。ユーザの意図をよりよく理解するために、外部知識グラフ(KG)がチャットベースのCRSに導入されている。しかし、既存のチップチャットベースのCRSは、通常反復的なアイテムレコメンデーションを生成し、KGからの知識をCRSに適切に注入して情報的応答を生成することはできない。これらの問題を解決するため、まず、推奨項目を新規かつ潜在的に興味のあるものにするために、会話レコメンデーションタスクを再構成する。そこで我々はKECRS(Knowledge-Enriched Conversational Recommendation System)を提案する。特に,バグ・オブ・エンティティ(boe)損失と輸液損失を発達させ,より多様で有益な応答を生成するために,kg と crs との統合性が向上した。 BOE損失は、CRSに人書きの発話とKGから学ぶための追加の監視信号を提供する。注入損失は、この2つの埋め込みにおける同じ単語の距離を最小化することにより、単語埋め込みとエンティティ埋め込みの間のギャップを埋める。さらに、高品質なKG, \ie The Movie Domain Knowledge Graph (TMDKG)を構築することで、研究の促進を図る。大規模データセットによる実験結果から,KECRSは推奨精度と応答生成品質の両方の観点から,最先端のチャットベースのCRSよりも優れていた。

The chit-chat-based conversational recommendation systems (CRS) provide item recommendations to users through natural language interactions. To better understand user's intentions, external knowledge graphs (KG) have been introduced into chit-chat-based CRS. However, existing chit-chat-based CRS usually generate repetitive item recommendations, and they cannot properly infuse knowledge from KG into CRS to generate informative responses. To remedy these issues, we first reformulate the conversational recommendation task to highlight that the recommended items should be new and possibly interested by users. Then, we propose the Knowledge-Enriched Conversational Recommendation System (KECRS). Specifically, we develop the Bag-of-Entity (BOE) loss and the infusion loss to better integrate KG with CRS for generating more diverse and informative responses. BOE loss provides an additional supervision signal to guide CRS to learn from both human-written utterances and KG. Infusion loss bridges the gap between the word embeddings and entity embeddings by minimizing distances of the same words in these two embeddings. Moreover, we facilitate our study by constructing a high-quality KG, \ie The Movie Domain Knowledge Graph (TMDKG). Experimental results on a large-scale dataset demonstrate that KECRS outperforms state-of-the-art chit-chat-based CRS, in terms of both recommendation accuracy and response generation quality.

翻訳日:2021-05-19 14:02:50 公開日:2021-05-18

# CoMAE:共感応答生成のための多要素階層フレームワーク

CoMAE: A Multi-factor Hierarchical Framework for Empathetic Response Generation ( http://arxiv.org/abs/2105.08316v1 )

ライセンス: Link先を確認

Chujie Zheng, Yong Liu, Wei Chen, Yongcai Leng and Minlie Huang

(参考訳) オープンドメインダイアログシステムの成功には共感の能力が不可欠である。多次元性の性質から,コミュニケーション機構や対話行動,感情など,共感表現に関連するさまざまな要因が存在する。しかしながら、既存の共感的応答生成法は、通常、1つの共感因子のみを考慮するか、異なる要因間の階層的関係を無視し、共感モデリングの弱い能力をもたらす。本稿では,共感表現の3つの重要な要素を階層的にモデル化した,共感応答生成のための多要素階層型フレームワークCoMAEを提案する。実験により,我々のCoMAEモデルが従来の方法よりも共感的な反応を生成できることが示された。また,実生活コーパスにおける経験的分析と広範な実験を通して,異なる要因の階層的モデリングの重要性を強調する。私たちのコードと使用済みデータはhttps://github.com/chujiezheng/comae.comから入手できます。

The capacity of empathy is crucial to the success of open-domain dialog systems. Due to its nature of multi-dimensionality, there are various factors that relate to empathy expression, such as communication mechanism, dialog act and emotion. However, existing methods for empathetic response generation usually either consider only one empathy factor or ignore the hierarchical relationships between different factors, leading to a weak ability of empathy modeling. In this paper, we propose a multi-factor hierarchical framework, CoMAE, for empathetic response generation, which models the above three key factors of empathy expression in a hierarchical way. We show experimentally that our CoMAE-based model can generate more empathetic responses than previous methods. We also highlight the importance of hierarchical modeling of different factors through both the empirical analysis on a real-life corpus and the extensive experiments. Our codes and used data are available at https://github.com/chujiezheng/CoMAE.

翻訳日:2021-05-19 14:02:23 公開日:2021-05-18

# エンティティ型制約付き関係分類

Relation Classification with Entity Type Restriction ( http://arxiv.org/abs/2105.08393v1 )

ライセンス: Link先を確認

Shengfei Lyu, Huanhuan Chen

(参考訳) 関係分類は文中の2つの実体間の関係を予測することを目的としている。既存の方法は、すべての関係を文中の2つの実体の候補関係とみなす。これらの方法は、エンティティタイプによる候補関係の制限を無視し、いくつかの不適切な関係が候補関係となる。本稿では,関係性を制限するためにエンティティタイプを利用した関係分類法であるRelation Classification with ENtity Type Regulation (RECENT)を提案する。特に、関係型とエンティティ型の相互制約を形式化し、関係分類に導入する。さらに、提案するパラダイムであるRECENTはモデルに依存しない。それぞれ2つの代表モデルGCNとSpanBERTに基づいて、RECENT_GCNとRECENT_SpanBERTをトレーニングする。標準データセットの実験結果は、RECENTがGCNとSpanBERTのパフォーマンスをそれぞれ6.9ポイント、F1が4.4ポイント改善したことを示している。特にRECENT_SpanBERTはTACREDで新しい最先端を実現している。

Relation classification aims to predict a relation between two entities in a sentence. The existing methods regard all relations as the candidate relations for the two entities in a sentence. These methods neglect the restrictions on candidate relations by entity types, which leads to some inappropriate relations being candidate relations. In this paper, we propose a novel paradigm, RElation Classification with ENtity Type restriction (RECENT), which exploits entity types to restrict candidate relations. Specially, the mutual restrictions of relations and entity types are formalized and introduced into relation classification. Besides, the proposed paradigm, RECENT, is model-agnostic. Based on two representative models GCN and SpanBERT respectively, RECENT_GCN and RECENT_SpanBERT are trained in RECENT. Experimental results on a standard dataset indicate that RECENT improves the performance of GCN and SpanBERT by 6.9 and 4.4 F1 points, respectively. Especially, RECENT_SpanBERT achieves a new state-of-the-art on TACRED.

翻訳日:2021-05-19 14:02:08 公開日:2021-05-18

# 付加的な構成性を再考する: 単語埋め込みによるAND, OR, NOT操作

Revisiting Additive Compositionality: AND, OR and NOT Operations with Word Embeddings ( http://arxiv.org/abs/2105.08585v1 )

ライセンス: Link先を確認

Masahiro Naito, Sho Yokoi, Geewook Kim, Hidetoshi Shimodaira

(参考訳) word2vec や glove のような典型的な単語埋め込みメソッドは、その意味を埋め込み(付加的合成性)を付加することで構成できるという特性を持つことはよく知られている。加法構成性を説明するためにいくつかの理論が提案されているが、以下の疑問は未解決である: (q1) これらの理論の仮定は、実際的な単語埋め込みには当てはまらない。 (q2) 通常の加法構成性は、単語の意味の操作や操作と見なすことができるが、埋め込みによって他の演算がどのように計算されるかはよく分かっていない。我々は,周波数重み付けセンタリングの考え方によってこの問題に対処した。本稿では, (q1) に対する回答として, 実用的な単語埋め込みと付加合成性理論とのギャップを橋渡しする後処理法を提案する。また、(Q2)への応答として単語埋め込みの線形操作により、意味のORまたはNOTを取る方法を提供する。さらに,本手法の処理後処理(トップ100における3.5倍の精度向上)により,通常の加法的構成性であるAND操作の精度が向上し,ORおよびNOT操作が正しく行えることを実験的に確認した。

It is well-known that typical word embedding methods such as Word2Vec and GloVe have the property that the meaning can be composed by adding up the embeddings (additive compositionality). Several theories have been proposed to explain additive compositionality, but the following questions remain unanswered: (Q1) The assumptions of those theories do not hold for the practical word embedding. (Q2) Ordinary additive compositionality can be seen as an AND operation of word meanings, but it is not well understood how other operations, such as OR and NOT, can be computed by the embeddings. We address these issues by the idea of frequency-weighted centering at its core. This paper proposes a post-processing method for bridging the gap between practical word embedding and the assumption of theory about additive compositionality as an answer to (Q1). It also gives a method for taking OR or NOT of the meaning by linear operation of word embedding as an answer to (Q2). Moreover, we confirm experimentally that the accuracy of AND operation, i.e., the ordinary additive compositionality, can be improved by our post-processing method (3.5x improvement in top-100 accuracy) and that OR and NOT operations can be performed correctly.

翻訳日:2021-05-19 14:01:55 公開日:2021-05-18

# PoBRL:Blending Reinforcement Learning Policiesによる多文書要約の最適化

PoBRL: Optimizing Multi-Document Summarization by Blending Reinforcement Learning Policies ( http://arxiv.org/abs/2105.08244v1 )

ライセンス: Link先を確認

Andy Su, Difei Su, John M.Mulvey, H.Vincent Poor

(参考訳) 多文書要約を解くための新しい強化学習フレームワークPoBRLを提案する。 PoBRLは、高品質な要約に必要な3つの目的、すなわち重要性、妥当性、長さを共同で最適化する。我々の戦略は、この多目的最適化を、強化学習によって個別に解ける様々なサブプロブレムに分解する。 PoBRLを利用して、学習した各ポリシーをブレンドして、元の入力の簡潔で完全な表現である要約を生成する。実験結果から,複数のマルチドキュメントデータセットにおける最先端の性能を示す。また,本手法が高品質な出力を生成することを示す。

We propose a novel reinforcement learning based framework PoBRL for solving multi-document summarization. PoBRL jointly optimizes over the following three objectives necessary for a high-quality summary: importance, relevance, and length. Our strategy decouples this multi-objective optimization into different subproblems that can be solved individually by reinforcement learning. Utilizing PoBRL, we then blend each learned policies together to produce a summary that is a concise and complete representation of the original input. Our empirical analysis shows state-of-the-art performance on several multi-document datasets. Human evaluation also shows that our method produces high-quality output.

翻訳日:2021-05-19 14:01:35 公開日:2021-05-18

# E-Commerce Fresh Retailのマークダウン: 対実予測と多機能最適化アプローチ

Markdowns in E-Commerce Fresh Retail: A Counterfactual Prediction and Multi-Period Optimization Approach ( http://arxiv.org/abs/2105.08313v1 )

ライセンス: Link先を確認

Junhao Hua, Ling Yan, Huan Xu, Cheng Yang

(参考訳) 本稿では,大量の観測トランザクションデータを活用することで,非現実的予測と多周期価格最適化からなる,マークダウンのための新しいデータ駆動型かつ解釈可能な価格設定手法を提案する。まず, 準パラメトリック構造モデルを構築し, 個々の価格弾性を学習し, 反事実需要を予測する。この半パラメトリックモデルは、非パラメトリック機械学習モデルの予測可能性と経済モデルの解釈可能性の両方を活用する。第2に,有限販売地平線上での消耗品全体の利益を最大化する多周期動的価格アルゴリズムを提案する。決定論的需要を用いる従来のアプローチとは異なり、予測プロセスに必然的にランダム性を持つため、反事実的需要の不確かさをモデル化する。確率モデルに基づいてマルコフ決定プロセスによる逐次価格戦略を導出し,それを解決するための2段階のアルゴリズムを設計する。提案アルゴリズムは非常に効率的である。指数関数から多項式への時間の複雑さを減少させる。実験の結果,我々の価格アルゴリズムの利点が示され,提案したフレームワークは有名なeコマースの新鮮小売シナリオであるFreshippoにうまく展開されている。

In this paper, by leveraging abundant observational transaction data, we propose a novel data-driven and interpretable pricing approach for markdowns, consisting of counterfactual prediction and multi-period price optimization. Firstly, we build a semi-parametric structural model to learn individual price elasticity and predict counterfactual demand. This semi-parametric model takes advantage of both the predictability of nonparametric machine learning model and the interpretability of economic model. Secondly, we propose a multi-period dynamic pricing algorithm to maximize the overall profit of a perishable product over its finite selling horizon. Different with the traditional approaches that use the deterministic demand, we model the uncertainty of counterfactual demand since it inevitably has randomness in the prediction process. Based on the stochastic model, we derive a sequential pricing strategy by Markov decision process, and design a two-stage algorithm to solve it. The proposed algorithm is very efficient. It reduces the time complexity from exponential to polynomial. Experimental results show the advantages of our pricing algorithm, and the proposed framework has been successfully deployed to the well-known e-commerce fresh retail scenario - Freshippo.

翻訳日:2021-05-19 14:01:28 公開日:2021-05-18

# SATによるハイブリッドシステムの再構成

Reconfiguring Hybrid Systems Using SAT ( http://arxiv.org/abs/2105.08398v1 )

ライセンス: Link先を確認

Kaja Balzereit and Oliver Niggemann

(参考訳) リコンフィグレーションは、システム目標に再び到達できるように、システム構成を自動的に適応することで、障害からシステムを取り戻すことを目的としている。古典的なアプローチは通常、対応するリカバリアクションを手動で定義する事前定義された障害セットを使用する。これは頻繁な変更によって特徴づけられる現代のハイブリッドシステムでは不可能である。代わりに、AIベースのアプローチは、非デフォルトシステムのモデルを活用し、有効な振る舞いを再び確立する再構成操作のセットを検索する必要がある。この研究は、3つの主要な課題を解決する新しいアルゴリズムを提示している。欠陥の振る舞いをモデル化する必要はありません (ii)主に連続系変数や制御信号の数が多いため、もともと大きすぎる探索空間を識別し、縮小する。 3) 命題論理にはSATソルバを用いており, 第一に妥当性という二項の概念を定義している。第二に、検索自体を実装する -- 任意のソリューションを素早く識別するために最適なソリューションを犠牲にする。この手法はプロセス工学シミュレーションシステム上で障害を再構成できることが示されている。

Reconfiguration aims at recovering a system from a fault by automatically adapting the system configuration, such that the system goal can be reached again. Classical approaches typically use a set of pre-defined faults for which corresponding recovery actions are defined manually. This is not possible for modern hybrid systems which are characterized by frequent changes. Instead, AI-based approaches are needed which leverage on a model of the non-faulty system and which search for a set of reconfiguration operations which will establish a valid behavior again. This work presents a novel algorithm which solves three main challenges: (i) Only a model of the non-faulty system is needed, i.e. the faulty behavior does not need to be modeled. (ii) It discretizes and reduces the search space which originally is too large -- mainly due to the high number of continuous system variables and control signals. (iii) It uses a SAT solver for propositional logic for two purposes: First, it defines the binary concept of validity. Second, it implements the search itself -- sacrificing the optimal solution for a quick identification of an arbitrary solution. It is shown that the approach is able to reconfigure faults on simulated process engineering systems.

翻訳日:2021-05-19 14:01:10 公開日:2021-05-18

# CFR-MIX: Combinatorial Action Spaceによる不完全な情報集約型ゲームの解決

CFR-MIX: Solving Imperfect Information Extensive-Form Games with Combinatorial Action Space ( http://arxiv.org/abs/2105.08440v1 )

ライセンス: Link先を確認

Shuxin Li, Youzhi Zhang, Xinrun Wang, Wanqi Xue, Bo An

(参考訳) 多くの現実世界のシナリオでは、エージェントのチームが互いに調整し、対戦相手と競う。このタイプのゲーム解決の課題は、チームの共同アクションスペースがエージェント数で指数関数的に増大し、既存のアルゴリズム、例えば、反事実後悔最小化(cfr)の非効率化につながることである。そこで本研究では,CFRの新しいフレームワークであるCFR-MIXを提案する。まず,各エージェントの個別戦略を用いた共同行動戦略と,エージェント間の協調を維持するための一貫性関係を示す新しい戦略表現を提案する。 cfrフレームワークの下で個々の戦略との均衡を計算するために,戦略間の一貫性関係を累積後悔値間の一貫性関係に変換する。さらに, 累積的後悔値に対する新しい分解法を提案し, 累積的後悔値間の整合性関係を保証する。最後に, 混合層を用いた新しいアルゴリズムCFR-MIXを導入し, 個別動作の累積後悔値の非線形結合として, 共同動作の累積後悔値を推定する。実験の結果,CFR-MIXは様々なゲームにおいて既存のアルゴリズムよりも優れていた。

In many real-world scenarios, a team of agents coordinate with each other to compete against an opponent. The challenge of solving this type of game is that the team's joint action space grows exponentially with the number of agents, which results in the inefficiency of the existing algorithms, e.g., Counterfactual Regret Minimization (CFR). To address this problem, we propose a new framework of CFR: CFR-MIX. Firstly, we propose a new strategy representation that represents a joint action strategy using individual strategies of all agents and a consistency relationship to maintain the cooperation between agents. To compute the equilibrium with individual strategies under the CFR framework, we transform the consistency relationship between strategies to the consistency relationship between the cumulative regret values. Furthermore, we propose a novel decomposition method over cumulative regret values to guarantee the consistency relationship between the cumulative regret values. Finally, we introduce our new algorithm CFR-MIX which employs a mixing layer to estimate cumulative regret values of joint actions as a non-linear combination of cumulative regret values of individual actions. Experimental results show that CFR-MIX outperforms existing algorithms on various games significantly.

翻訳日:2021-05-19 14:00:55 公開日:2021-05-18

# N-ary Relational Factsのリンク予測:グラフに基づくアプローチ

Link Prediction on N-ary Relational Facts: A Graph-based Approach ( http://arxiv.org/abs/2105.08476v1 )

ライセンス: Link先を確認

Quan Wang, Haifeng Wang, Yajuan Lyu, Yong Zhu

(参考訳) 知識グラフ(KG)のリンク予測は重要な研究トピックである。それまでの研究は主に二項関係に焦点をあて、現実世界のKGではユビキタスだが、高次関係にはあまり注意を払わなかった。本稿では,n-項関係の事実に対するリンク予測を考察し,この課題に対するグラフベースアプローチを提案する。我々のアプローチの鍵は、事実の n-項構造を小さな不均一なグラフとして表現し、エッジバイアスの完全な接続された注意でこのグラフをモデル化することです。完全接続された注意は普遍的な頂点間相互作用を捉える一方、エッジアウェアの注意バイアスによりグラフ構造とその不均一性を特に符号化する。この方法では、我々のアプローチは、各n-ary事実におけるグローバルとローカルの依存関係を完全にモデル化します。広範な評価は、我々のアプローチの有効性と優位性を検証する。さまざまなn-aryリレーショナルベンチマークにおいて,現在の最先端よりも実質的に,一貫して優れたパフォーマンスを実現しています。私たちのコードは公開されています。

Link prediction on knowledge graphs (KGs) is a key research topic. Previous work mainly focused on binary relations, paying less attention to higher-arity relations although they are ubiquitous in real-world KGs. This paper considers link prediction upon n-ary relational facts and proposes a graph-based approach to this task. The key to our approach is to represent the n-ary structure of a fact as a small heterogeneous graph, and model this graph with edge-biased fully-connected attention. The fully-connected attention captures universal inter-vertex interactions, while with edge-aware attentive biases to particularly encode the graph structure and its heterogeneity. In this fashion, our approach fully models global and local dependencies in each n-ary fact, and hence can more effectively capture associations therein. Extensive evaluation verifies the effectiveness and superiority of our approach. It performs substantially and consistently better than current state-of-the-art across a variety of n-ary relational benchmarks. Our code is publicly available.

翻訳日:2021-05-19 14:00:39 公開日:2021-05-18

# DACBench:動的アルゴリズム構成のためのベンチマークライブラリ

DACBench: A Benchmark Library for Dynamic Algorithm Configuration ( http://arxiv.org/abs/2105.08541v1 )

ライセンス: Link先を確認

Theresa Eimer, Andr\'e Biedenkapp, Maximilian Reimer, Steven Adriaensen, Frank Hutter, Marius Lindauer

(参考訳) Dynamic Algorithm Configuration (DAC)は、ターゲットアルゴリズムのハイパーパラメータを動的に制御してパフォーマンスを向上させることを目的としている。いくつかの理論的および実証的な結果は、進化計算、AI計画、ディープラーニングのような領域におけるハイパーパラメータを動的に制御する利点を示している。しかし、これらの結果の複製やDACの新しい手法の研究は、既存のベンチマークがしばしば同一インタフェースと互換性がないため困難である。ベンチマークの容易化とDACの研究を目的として,AIドメインから既存のDACベンチマークを収集,標準化するベンチマークライブラリであるDACBenchと,新たなベンチマーク用のテンプレートを提案する。 dacbenchの設計には, (i) 柔軟性, (ii) 再現性, (iii) 拡張性, (iv) 自動ドキュメンテーションと可視化といった重要なデシデラタに注目した。 DACの可能性,適用性,課題を示すために,6つの初期ベンチマークの集合が,いくつかの難易度でどのように比較されるかを検討する。

Dynamic Algorithm Configuration (DAC) aims to dynamically control a target algorithm's hyperparameters in order to improve its performance. Several theoretical and empirical results have demonstrated the benefits of dynamically controlling hyperparameters in domains like evolutionary computation, AI Planning or deep learning. Replicating these results, as well as studying new methods for DAC, however, is difficult since existing benchmarks are often specialized and incompatible with the same interfaces. To facilitate benchmarking and thus research on DAC, we propose DACBench, a benchmark library that seeks to collect and standardize existing DAC benchmarks from different AI domains, as well as provide a template for new ones. For the design of DACBench, we focused on important desiderata, such as (i) flexibility, (ii) reproducibility, (iii) extensibility and (iv) automatic documentation and visualization. To show the potential, broad applicability and challenges of DAC, we explore how a set of six initial benchmarks compare in several dimensions of difficulty.

翻訳日:2021-05-19 14:00:22 公開日:2021-05-18

# 野生の単一ビューの地球中心ポス

Single View Geocentric Pose in the Wild ( http://arxiv.org/abs/2105.08229v1 )

ライセンス: Link先を確認

Gordon Christie, Kevin Foster, Shea Hagstrom, Gregory D. Hager, Myron Z. Brown

(参考訳) セマンティックマッピング、マップアライメント、変化検出などの地球観測タスクの現在の方法は、ほぼナディア画像に依存しているが、自然災害のような動的な世界イベントに対応する最初の画像は斜めであることが多い。これらの課題は、観測対象視差により斜め画像にとってはるかに困難である。近年、衛星画像に登録された空中ライダーによる訓練により、地上の高さと重力に対する方向を規定した地中心のポーズを復元することに成功した。本稿では,アフィン不変性を利用した新しい課題のモデルを提案する。また,本手法を現実のアプリケーションに適用する上で,現実的な課題にも対処する。私たちのデータとコードは公開されています。

Current methods for Earth observation tasks such as semantic mapping, map alignment, and change detection rely on near-nadir images; however, often the first available images in response to dynamic world events such as natural disasters are oblique. These tasks are much more difficult for oblique images due to observed object parallax. There has been recent success in learning to regress geocentric pose, defined as height above ground and orientation with respect to gravity, by training with airborne lidar registered to satellite images. We present a model for this novel task that exploits affine invariance properties to outperform state of the art performance by a wide margin. We also address practical issues required to deploy this method in the wild for real-world applications. Our data and code are publicly available.

翻訳日:2021-05-19 13:59:43 公開日:2021-05-18

# 教師なしスケッチに基づく画像検索に向けて

Towards Unsupervised Sketch-based Image Retrieval ( http://arxiv.org/abs/2105.08237v1 )

ライセンス: Link先を確認

Conghui Hu, Yongxin Yang, Yunpeng Li, Timothy M. Hospedales, Yi-Zhe Song

(参考訳) 現在の教師付きスケッチベース画像検索(SBIR)は優れた性能を発揮する。しかし、データ収集とラベリングのコストは、実際のアプリケーションの実用的なデプロイに対する難解な障壁を伴います。本稿では,従来訓練に必要であったラベル付けコスト(カテゴリアノテーションとスケッチ写真ペアリング)を取り除くための教師なしsbirの最初の試みについて述べる。既存の単一ドメインの教師なし表現学習手法は、この問題のユニークなクロスドメイン性(スケッチとフォト)のため、このアプリケーションでは性能が悪い。そこで我々は,教師なし表現学習とスケッチ写真領域アライメントを同時に行う新しい枠組みを提案する。技術的には、これは関節分布最適輸送(JDOT)を利用して表現学習中に異なる領域からのデータを整列させ、トレーニング可能なクラスタプロトタイプと機能記憶バンクで拡張し、スケーラビリティと効率をさらに向上させます。広範な実験により,新しい教師なし設定では優れた性能を達成し,ゼロショット設定では最先端よりも優れた性能を示すことができた。

Current supervised sketch-based image retrieval (SBIR) methods achieve excellent performance. However, the cost of data collection and labeling imposes an intractable barrier to practical deployment of real applications. In this paper, we present the first attempt at unsupervised SBIR to remove the labeling cost (category annotations and sketch-photo pairings) that is conventionally needed for training. Existing single-domain unsupervised representation learning methods perform poorly in this application, due to the unique cross-domain (sketch and photo) nature of the problem. We therefore introduce a novel framework that simultaneously performs unsupervised representation learning and sketch-photo domain alignment. Technically this is underpinned by exploiting joint distribution optimal transport (JDOT) to align data from different domains during representation learning, which we extend with trainable cluster prototypes and feature memory banks to further improve scalability and efficacy. Extensive experiments show that our framework achieves excellent performance in the new unsupervised setting, and performs comparably or better than state-of-the-art in the zero-shot setting.

翻訳日:2021-05-19 13:59:31 公開日:2021-05-18

# セルフポイントフロー:最適移動とランダム歩行を伴うポイントクラウドからの自己教師付きシーンフロー推定

Self-Point-Flow: Self-Supervised Scene Flow Estimation from Point Clouds with Optimal Transport and Random Walk ( http://arxiv.org/abs/2105.08248v1 )

ライセンス: Link先を確認

Ruibo Li, Guosheng Lin, Lihua Xie

(参考訳) 注釈付きシーンフローデータの不足により,ポイントクラウドにおける自己教師ありシーンフロー学習が注目されている。自己監督的な方法では、2点雲間の対応性を確立することが効果的なアプローチである。従来の手法では、3次元点座標上の距離のみを考慮し、(1)色や表面の正常といった他の識別的指標を見落とし、(2)マッチングが制約のない状況で操作され、複数の点が同じ対応点に終止符を打つため、しばしばサブパー性能を生成する。この問題に対処するため、このマッチングタスクを最適な輸送問題として定式化する。出力最適割り当て行列を用いて擬似基底真理の生成を導くことができる。この最適輸送法では,複数の記述子を考慮した輸送コストを設計し,質量等式制約による1対1のマッチングを奨励する。また、各点にグラフを構築することにより、擬似ラベルの局所的一貫性を促進するランダムウォークモジュールを導入する。 FlyingThings3D と KITTI の総合的な実験により,本手法が自己教師付き学習手法の最先端性能を実現することを示す。我々の自己指導手法は、訓練に基礎的な真実の流れを必要としないが、教師付き学習手法と同等に機能する。

Due to the scarcity of annotated scene flow data, self-supervised scene flow learning in point clouds has attracted increasing attention. In the self-supervised manner, establishing correspondences between two point clouds to approximate scene flow is an effective approach. Previous methods often obtain correspondences by applying point-wise matching that only takes the distance on 3D point coordinates into account, introducing two critical issues: (1) it overlooks other discriminative measures, such as color and surface normal, which often bring fruitful clues for accurate matching; and (2) it often generates sub-par performance, as the matching is operated in an unconstrained situation, where multiple points can be ended up with the same corresponding point. To address the issues, we formulate this matching task as an optimal transport problem. The output optimal assignment matrix can be utilized to guide the generation of pseudo ground truth. In this optimal transport, we design the transport cost by considering multiple descriptors and encourage one-to-one matching by mass equality constraints. Also, constructing a graph on the points, a random walk module is introduced to encourage the local consistency of the pseudo labels. Comprehensive experiments on FlyingThings3D and KITTI show that our method achieves state-of-the-art performance among self-supervised learning methods. Our self-supervised method even performs on par with some supervised learning approaches, although we do not need any ground truth flow for training.

翻訳日:2021-05-19 13:59:12 公開日:2021-05-18

# 知識蒸留とクロスモーダルマッチングの併用による弱教師付き密集ビデオキャプション

Weakly Supervised Dense Video Captioning via Jointly Usage of Knowledge Distillation and Cross-modal Matching ( http://arxiv.org/abs/2105.08252v1 )

ライセンス: Link先を確認

Bofeng Wu, Guocheng Niu, Jun Yu, Xinyan Xiao, Jian Zhang and Hua Wu

(参考訳) 本稿では,ペアワイズなイベントセンテンスアノテーションを使わずに動画キャプション(dvc)を行う手法を提案する。まず,関連する課題から抽出した知識を用いて,高品質なイベント提案を生成する。次に,提案文と文のセマンティックマッチングを構築するために,典型的にクロスモーダル検索タスクに適用されるコントラッシブ・ロスとサイクル・一貫性・ロスを取り入れ,最終的にキャプション生成モジュールのトレーニングに使用される。また、アノテート画像に基づく事前学習によりマッチングモジュールのパラメータを初期化し、マッチング性能を向上させる。 activitynet-captionデータセットに関する広範な実験は、蒸留に基づくイベント提案生成と、弱い教師付きdvcとのクロスモーダル検索に基づく意味マッチングの意義を明らかにし、この手法が既存の最先端手法に優れていることを示す。

This paper proposes an approach to Dense Video Captioning (DVC) without pairwise event-sentence annotation. First, we adopt the knowledge distilled from relevant and well solved tasks to generate high-quality event proposals. Then we incorporate contrastive loss and cycle-consistency loss typically applied to cross-modal retrieval tasks to build semantic matching between the proposals and sentences, which are eventually used to train the caption generation module. In addition, the parameters of matching module are initialized via pre-training based on annotated images to improve the matching performance. Extensive experiments on ActivityNet-Caption dataset reveal the significance of distillation-based event proposal generation and cross-modal retrieval-based semantic matching to weakly supervised DVC, and demonstrate the superiority of our method to existing state-of-the-art methods.

翻訳日:2021-05-19 13:58:47 公開日:2021-05-18

# Exemplar-based Open-Set Panoptic Segmentation Network

Exemplar-Based Open-Set Panoptic Segmentation Network ( http://arxiv.org/abs/2105.08336v1 )

ライセンス: Link先を確認

Jaedong Hwang, Seoung Wug Oh, Joon-Young Lee, Bohyung Han

(参考訳) 我々は、panoptic segmentationをopen-worldに拡張し、open-set panoptic segmentation~(ops)タスクを導入する。このタスクは、既知のクラスだけでなく、トレーニング中に認識されていない未知のクラスに対しても、panopticのセグメンテーションを実行する必要がある。タスクの実践的課題を調査し,既存のデータセットであるCOCO上にベンチマークを構築する。さらに,実証理論にインスパイアされた,新しいオープン・セット・パノプティブ・セグメンテーション・ネットワーク~(EOPSN)を提案する。提案手法は,クラスタ化によって識別され,疑似グラウンドルーツとして使用されるexemplarsに基づく新しいクラスを識別する。各クラスのサイズは、クラスに関連する既存のクラスと類似性に基づいて、新しい例をマイニングすることによって増加する。提案するベンチマークでeopsnを評価し,提案の有効性を実証する。私たちの仕事の第一の目的は、オープンワールドのシナリオにおける認識にコミュニティの注意を引き付けることです。我々のアルゴリズムの実装は、プロジェクトのWebページで利用可能である。

We extend panoptic segmentation to the open-world and introduce an open-set panoptic segmentation~(OPS) task. This task requires performing panoptic segmentation for not only \known classes but also \unknown ones that have not been acknowledged during training. We investigate the practical challenges of the task and construct a benchmark on top of an existing dataset, COCO. In addition, we propose a novel exemplar-based open-set panoptic segmentation network~(EOPSN) inspired by exemplar theory. Our approach identifies a new class based on exemplars, which are identified by clustering and employed as pseudo-ground-truths. The size of each class increases by mining new exemplars based on the similarities to the existing ones associated with the class. We evaluate EOPSN on the proposed benchmark and demonstrate the effectiveness of our proposals. The primary goal of our work is to draw the attention of the community to the recognition in the open-world scenarios. The implementation of our algorithm is available on the project webpage: https://cv.snu.ac.kr/research/EOPSN.

翻訳日:2021-05-19 13:58:29 公開日:2021-05-18

# 小型非均質データセットからの手術ロボット動作の教師なし同定

Unsupervised identification of surgical robotic actions from small non homogeneous datasets ( http://arxiv.org/abs/2105.08488v1 )

ライセンス: Link先を確認

Daniele Meli, Paolo Fiorini

(参考訳) ロボット支援手術は確立された臨床実践である。研修生のパフォーマンス評価や、自律的な実行とモニタリングのための手術プロセスモデリングなど、さまざまな用途において外科的アクションの自動識別が必要である。しかし,手術が複雑で長い場合,手作業で記録に注釈を付ける重荷がかかるため,指導された行動同定は不可能である。さらに、手術手順の実施例が記録されることも少なくない。本稿では,da vinci research kitで実施した標準手術訓練課題であるリングトランスファーにおいて,手術動作の教師なし識別のための新しいアルゴリズムを提案する。非常に限られた実行データセットから自動的にキネマティックおよびセマンティックな視覚的特徴を抽出することにより、同様のアプリケーションで最先端の結果を大幅に上回り、ノイズやショートアクション、非均一なワークフローの存在下でもセグメンテーション(88%対82%のマッチングスコア)とクラスタリング(67%対54%のF1スコア)の品質を向上させることができる。非反復的なアクションシーケンス標準商用仕様のハードウェア上の完全なアクション識別は、単一の実行のために1 s未満で実行される。

Robot-assisted surgery is an established clinical practice. The automatic identification of surgical actions is needed for a range of applications, including performance assessment of trainees and surgical process modeling for autonomous execution and monitoring. However, supervised action identification is not feasible, due to the burden of manually annotating recordings of potentially complex and long surgical executions. Moreover, often few example executions of a surgical procedure can be recorded. This paper proposes a novel algorithm for unsupervised identification of surgical actions in a standard surgical training task, the ring transfer, executed with da Vinci Research Kit. Exploiting kinematic and semantic visual features automatically extracted from a very limited dataset of executions, we are able to significantly outperform the state-of-the-art results for a similar application, improving the quality of segmentation (88% vs. 82% matching score) and clustering (67% vs. 54% F1-score) even in the presence of noise, short actions and non homogeneous workflows, i.e. non repetitive action sequences. Full action identification on hardware with standard commercial specifications is performed in less than 1 s for single execution.

翻訳日:2021-05-19 13:58:16 公開日:2021-05-18

# 高速かつ効率的なシーンテキスト認識のための視覚変換器

Vision Transformer for Fast and Efficient Scene Text Recognition ( http://arxiv.org/abs/2105.08582v1 )

ライセンス: Link先を確認

Rowel Atienza

(参考訳) Scene Text Recognition (STR) は、コンピュータがオブジェクトラベル、道路標識、指示書などの自然なシーンでテキストを読むことを可能にする。 STRは、どのオブジェクトを選択するか、どの方向に進むか、次のアクションのステップは何かといった、マシンが情報的な決定を行うのを助ける。 STRの研究の本体では、常に認識精度に焦点が当てられている。速度と計算効率にはあまり重点が置かれておらず、特にエネルギー制約のあるモバイルマシンでも同様に重要である。本稿では、計算およびパラメータ効率のよい視覚変換器(ViT)上に構築された単純な単一ステージモデルアーキテクチャを持つSTRであるViTSTRを提案する。 TRBAのような、84.3%の精度の強力なベースライン法では、私たちの小さなViTSTRは、パラメータの43.4%と42.2%のFLOPSを使用して、2.4倍の速度で82.6%(データ拡張で84.2%)の競争精度を達成する。 ViTSTRの小さなバージョンは80.3%の精度(データ拡張で82.1%)、2.5倍の速度で、パラメータの10.9%と11.9%のFLOPSしか必要としない。データ拡張では、我々のベースViTSTRはTRBAの精度85.2%(拡張なしで83.7%)を2.3倍に向上するが、73.2%以上のパラメータと61.5%以上のFLOPSを必要とする。トレードオフに関して言えば、ほぼ全てのViTSTR構成は、精度、速度、計算効率を同時に最大化するために、フロンティア付近にある。

Scene text recognition (STR) enables computers to read text in natural scenes such as object labels, road signs and instructions. STR helps machines perform informed decisions such as what object to pick, which direction to go, and what is the next step of action. In the body of work on STR, the focus has always been on recognition accuracy. There is little emphasis placed on speed and computational efficiency which are equally important especially for energy-constrained mobile machines. In this paper we propose ViTSTR, an STR with a simple single stage model architecture built on a compute and parameter efficient vision transformer (ViT). On a comparable strong baseline method such as TRBA with accuracy of 84.3%, our small ViTSTR achieves a competitive accuracy of 82.6% (84.2% with data augmentation) at 2.4x speed up, using only 43.4% of the number of parameters and 42.2% FLOPS. The tiny version of ViTSTR achieves 80.3% accuracy (82.1% with data augmentation), at 2.5x the speed, requiring only 10.9% of the number of parameters and 11.9% FLOPS. With data augmentation, our base ViTSTR outperforms TRBA at 85.2% accuracy (83.7% without augmentation) at 2.3x the speed but requires 73.2% more parameters and 61.5% more FLOPS. In terms of trade-offs, nearly all ViTSTR configurations are at or near the frontiers to maximize accuracy, speed and computational efficiency all at the same time.

翻訳日:2021-05-19 13:57:54 公開日:2021-05-18

# 圧縮特徴リプレイによるオンライン連続学習のためのACAE-REMIND

ACAE-REMIND for Online Continual Learning with Compressed Feature Replay ( http://arxiv.org/abs/2105.08595v1 )

ライセンス: Link先を確認

Kai Wang, Luis Herranz, Joost van de Weijer

(参考訳) オンライン連続学習は、学習者が一度だけデータを考えることができる、複数の異なるタスクから、非IIDデータストリームから学習することを目的としている。通常、メソッドは制限されたバッファを使用して、ストリームにいくつかのイメージを保存することができる。近年,画像の中間層表現が保存(あるいは生成)される機能リプレイは,メモリの削減を図りながら,画像リプレイよりも優れた結果をもたらすことが判明した。量子化された例はメモリ使用量をさらに削減できる。しかし、これらの方法の欠点は、固定された(あるいは非常に非推移的な)バックボーンネットワークを使用することである。これは、全てのタスクを区別できる表現の学習を著しく制限する。この問題を解決するために,中間層で高い圧縮率で特徴再生を行うための補助分類器自動エンコーダ (ACAE) モジュールを提案する。画像あたりのメモリフットプリントの削減により、リプレイ用に多くの例を節約できます。実験では、オンライン連続学習環境下でタスク非依存評価を行い、ImageNet-Subset、CIFAR100、CIFAR10データセット上で最先端のパフォーマンスを得る。

Online continual learning aims to learn from a non-IID stream of data from a number of different tasks, where the learner is only allowed to consider data once. Methods are typically allowed to use a limited buffer to store some of the images in the stream. Recently, it was found that feature replay, where an intermediate layer representation of the image is stored (or generated) leads to superior results than image replay, while requiring less memory. Quantized exemplars can further reduce the memory usage. However, a drawback of these methods is that they use a fixed (or very intransigent) backbone network. This significantly limits the learning of representations that can discriminate between all tasks. To address this problem, we propose an auxiliary classifier auto-encoder (ACAE) module for feature replay at intermediate layers with high compression rates. The reduced memory footprint per image allows us to save more exemplars for replay. In our experiments, we conduct task-agnostic evaluation under online continual learning setting and get state-of-the-art performance on ImageNet-Subset, CIFAR100 and CIFAR10 dataset.

翻訳日:2021-05-19 13:57:25 公開日:2021-05-18

# 都市交通現場におけるセマンティック・コンシスタント・レアルドメイン適応のためのコンテンツディスタングル

Content Disentanglement for Semantically Consistent Synthetic-to-RealDomain Adaptation in Urban Traffic Scenes ( http://arxiv.org/abs/2105.08704v1 )

ライセンス: Link先を確認

Mert Keser, Artem Savkin, Federico Tombari

(参考訳) 合成データ生成は、自動運転における新しい交通シナリオを生成するための魅力的なアプローチである。しかし、合成データのみに訓練されたディープラーニング技術は、実データ上でのテスト時に劇的なパフォーマンス低下に遭遇する。このような性能低下は、一般に、実データと合成データの間の領域ギャップに起因する。上記の領域ギャップを軽減するために、ドメイン適応法が適用されている。これらの手法は視覚的に魅力的な結果をもたらすが、翻訳されたサンプルは通常意味的不一致をもたらす。本研究では,合成データと実データ間の意味的に一貫したドメイン適応を可能にする,教師なしのエンドツーエンドドメイン適応ネットワークアーキテクチャを提案する。セマンティックセグメンテーションの下流タスクにおけるアーキテクチャを評価し,最先端手法と比較して優れた性能が得られることを示す。

Synthetic data generation is an appealing approach to generate novel traffic scenarios in autonomous driving. However, deep learning techniques trained solely on synthetic data encounter dramatic performance drops when they are tested on real data. Such performance drop is commonly attributed to the domain gap between real and synthetic data. Domain adaptation methods have been applied to mitigate the aforementioned domain gap. These methods achieve visually appealing results, but the translated samples usually introduce semantic inconsistencies. In this work, we propose a new, unsupervised, end-to-end domain adaptation network architecture that enables semantically consistent domain adaptation between synthetic and real data. We evaluate our architecture on the downstream task of semantic segmentation and show that our method achieves superior performance compared to the state-of-the-art methods.

翻訳日:2021-05-19 13:57:07 公開日:2021-05-18

# 検索エンジンの魔法:検索エンジンとの会話を通して情報にアクセスする

Wizard of Search Engine: Access to Information Through Conversations with Search Engines ( http://arxiv.org/abs/2105.08301v1 )

ライセンス: Link先を確認

Pengjie Ren, Zhongkun Liu, Xiaomeng Song, Hongtao Tian, Zhumin Chen, Zhaochun Ren and Maarten de Rijke

(参考訳) 会話情報探索(CIS)は、人々を情報に結びつける上でますます重要な役割を担っている。適切なリソースが不足しているため、CISに関する以前の研究は理論・概念的枠組み、実験室ベースのユーザー研究、あるいはCISの特定の側面(例えば、質問を明確にすること)の研究に限られている。本研究では,3つの側面からCISの研究を促進するために努力する。 1) 意図検出(ID)、キーフレーズ抽出(KE)、行動予測(AP)、クエリ選択(QS)、通過選択(PS)、応答生成(RG)の6つのサブタスクでCIS用のパイプラインを定式化する。 2) cisのすべての側面に関する包括的かつ詳細な調査を可能にする,ウィザード・オブ・検索エンジン(wise)と呼ばれるベンチマークデータセットをリリースする。 (3)6つのサブタスクを共同で個別にトレーニングし、評価できるニューラルネットワークを設計し、利用可能なデータを完全に活用することでWISEの要求を大規模に削減できる事前訓練/微調整学習方式を考案する。 WISE統計に基づくCISの有用な特徴について報告する。また、いくつかの指標で示されるような効果的なCISを実現するために、最良のモデル変種が可能であることを示す。我々は、この重要な研究方向性のさらなる改善を計測し、将来の研究を促進するためのデータセット、コード、および評価スクリプトをリリースする。

Conversational information seeking (CIS) is playing an increasingly important role in connecting people to information. Due to the lack of suitable resource, previous studies on CIS are limited to the study of theoretical/conceptual frameworks, laboratory-based user studies, or a particular aspect of CIS (e.g., asking clarifying questions). In this work, we make efforts to facilitate research on CIS from three aspects. (1) We formulate a pipeline for CIS with six sub-tasks: intent detection (ID), keyphrase extraction (KE), action prediction (AP), query selection (QS), passage selection (PS), and response generation (RG). (2) We release a benchmark dataset, called wizard of search engine (WISE), which allows for comprehensive and in-depth research on all aspects of CIS. (3) We design a neural architecture capable of training and evaluating both jointly and separately on the six sub-tasks, and devise a pre-train/fine-tune learning scheme, that can reduce the requirements of WISE in scale by making full use of available data. We report some useful characteristics of CIS based on statistics of WISE. We also show that our best performing model variant isable to achieve effective CIS as indicated by several metrics. We release the dataset, the code, as well as the evaluation scripts to facilitate future research by measuring further improvements in this important research direction.

翻訳日:2021-05-19 13:56:56 公開日:2021-05-18

# インドの政治危機中、Twitter上でのインフルエンサーの偏光

Divided We Rule: Influencer Polarization on Twitter During Political Crises in India ( http://arxiv.org/abs/2105.08361v1 )

ライセンス: Link先を確認

Saloni Dash, Dibyendu Mishra, Gazal Shekhawat, Joyojeet Pal

(参考訳) インフルエンサーは、ソーシャルメディアにおける情報伝達の性質とネットワークの鍵となる。インフルエンサーは、問題への関与を通じて政治的談話において特に重要であり、彼らの正当性は、オンライン操作によってのみまたは部分的に導かれるか、あるいは芸能人、ジャーナリストなどの専門知識のオフライン領域を持つ。インフルエンサーの政治的関与と極性の定量化には、インドの政治危機における6kインフルエンサーと26kインドの政治家のツイートをエンコードするために、googleのuniversal sentence encoder(use)を使用します。次に、リツイートグラフとともに、政治的問題に関して、インフルエンサーの姿勢と極性を計算するのに役立つツイート埋め込みに基づいて、インフルエンサーの集合ベクトル表現を得る。新型コロナウイルス(COVID-19)では、政府側にインフルエンサーが集まっている一方で、市民権、カシミールの州昇格、農民の抗議に関する他の3つの論争的な問題について、主に政府主導のファンアカウントであり、現職の地位を拡大している。本手法は、現在のインドにおける政治的分裂の洞察を提供するとともに、他の文脈におけるインフルエンサーや偏極を研究する手段を提供する。

Influencers are key to the nature and networks of information propagation on social media. Influencers are particularly important in political discourse through their engagement with issues, and may derive their legitimacy either solely or partly through online operation, or have an offline sphere of expertise such as entertainers, journalists etc. To quantify influencers' political engagement and polarity, we use Google's Universal Sentence Encoder (USE) to encode the tweets of 6k influencers and 26k Indian politicians during political crises in India. We then obtain aggregate vector representations of the influencers based on their tweet embeddings, which alongside retweet graphs help compute the stance and polarity of these influencers with respect to the political issues. We find that while on COVID-19 there is a confluence of influencers on the side of the government, on three other contentious issues around citizenship, Kashmir's statehood, and farmers' protests, it is mainly government-aligned fan accounts that amplify the incumbent's positions. We propose that this method offers insight into the political schisms in present-day India, but also offers a means to study influencers and polarization in other contexts.

翻訳日:2021-05-19 13:56:31 公開日:2021-05-18

# エンティティベースのクエリ解釈

Entity-Based Query Interpretation ( http://arxiv.org/abs/2105.08581v1 )

ライセンス: Link先を確認

Vaibhav Kasturia, Marcel Gohsen, Matthias Hagen

(参考訳) パリ・ヒルトン(paris hilton)は、有名人の最新のニュースを見つけたり、パリで特定のホテルを見つけたりすることを目的としているのだろうか? そして、世界の20以上の「パリ」の中で、どちらがそうですか。本稿では,エンティティベースのクエリ解釈を導出することで,このあいまいさを解消することを提案する。あるクエリに対して,クエリの適切な部分を,背景知識ベースで意味的に互換性のあるエンティティにリンクすること。提案手法は, 検索応答時間が数百ミリ秒を超えるべきではないため, 有効性だけでなく, 効率性にも焦点をあてるものである。提案手法では,クエリセグメンテーションを前処理ステップとして,有望なセグメントベースの「骨格」を見つけることを提案する。これらの骨格は、包含されたセグメントを知識ベースからエンティティにリンクし、最後のステップで解釈をランク付けすることで「解釈」に拡張される。 2,800のクエリをコーパスで比較した結果,これまでで最も有効なクエリエンティティリンク手法よりも,実行時の解釈精度を向上するアプローチが示された。

Web search queries can be rather ambiguous: Is "paris hilton" meant to find the latest news on the celebrity or to find a specific hotel in Paris? And in which of the worldwide more than 20 "Parises"? We propose to solve this ambiguity problem by deriving entity-based query interpretations: given some query, the task is to link suitable parts of the query to semantically compatible entities in a background knowledge base. Our suggested approach to identify the most reasonable interpretations of a query based on the contained entities focuses on effectiveness but also on efficiency since web search response times should not exceed some hundreds of milliseconds. In our approach, we propose to use query segmentation as a pre-processing step that finds promising segment-based "skeletons". These skeletons are then enhanced to "interpretations" by linking the contained segments to entities from a knowledge base and then ranking the interpretations in a final step. An experimental comparison on a corpus of 2,800 queries shows our approach to have a better interpretation accuracy at a better run time than the previously most effective query entity linking methods.

翻訳日:2021-05-19 13:56:10 公開日:2021-05-18

# 残留ネットワークと埋め込み利用:グラフ畳み込みネットワークを用いたノード分類の新手法

Residual Network and Embedding Usage: New Tricks of Node Classification with Graph Convolutional Networks ( http://arxiv.org/abs/2105.08330v1 )

ライセンス: Link先を確認

Huixuan Chi, Yuying Wang, Qinfen Hao, Hong Xia

(参考訳) グラフ畳み込みネットワーク(GCN)とその後の変種は、グラフ上のタスク、特にノード分類タスクを解決するために提案されている。しかし文献では、ほとんどのトリックやテクニックが実装の詳細として言及されているか、ソースコードでしか見えない。本稿ではまず,GCNのミニバッチトレーニングで使用される既存の効果的なトリックについて要約する。これに基づいて,gcn_resフレームワークと組込み使用法という2つの新しい手法が,異なるデータセットにおけるベースラインのテスト精度を向上させるために,残差ネットワークと事前学習された組込みを活用することで提案されている。 Open Graph Benchmark (OGB) の実験では、これらの手法を組み合わせることで、様々なGCNのテスト精度が1.21%〜2.84%向上した。実装はhttps://github.com/ytchx 1999/PyG-OGB-Tricks.comで公開しています。

Graph Convolutional Networks (GCNs) and subsequent variants have been proposed to solve tasks on graphs, especially node classification tasks. In the literature, however, most tricks or techniques are either briefly mentioned as implementation details or only visible in source code. In this paper, we first summarize some existing effective tricks used in GCNs mini-batch training. Based on this, two novel tricks named GCN_res Framework and Embedding Usage are proposed by leveraging residual network and pre-trained embedding to improve baseline's test accuracy in different datasets. Experiments on Open Graph Benchmark (OGB) show that, by combining these techniques, the test accuracy of various GCNs increases by 1.21%~2.84%. We open source our implementation at https://github.com/ytchx1999/PyG-OGB-Tricks.

翻訳日:2021-05-19 13:55:33 公開日:2021-05-18

# グラフニューラルネットワークを用いた時系列異常検出のためのスタックングVAE

Stacking VAE with Graph Neural Networks for Effective and Interpretable Time Series Anomaly Detection ( http://arxiv.org/abs/2105.08397v1 )

ライセンス: Link先を確認

Wenkai Li, Wenbo Hu, Ning Chen, Cheng Feng

(参考訳) 実世界の保守アプリケーションにおいて、深層生成モデルは、複数のセンサから収集された時系列信号からエンティティの異常事象を検出する上で有望な性能を示した。それにもかかわらず、このようなモデルを時系列異常検出に活用するための2つの重要な課題を概説する:1)効率的かつ効率的な再構成モデルの開発、2)多変量時系列データチャネル間の類似性と相互関係構造を利用する。これらの課題に対処するため,本稿では,グラフニューラルネットワークを用いた重畳変動自動エンコーダ(VAE)モデルを提案する。具体的には,チャネル間の類似性を持つ多変量時系列データに対して,重み共有方式を用いた積み重ねブロックワイズ再構築フレームワークを提案する。さらに,グラフ学習モジュールを用いて疎隣接行列を学習し,時系列データチャネル間の安定な相互関係構造情報を明示的に把握し,系列パターンの解釈可能な再構成を行う。実験結果から,提案モデルが3つの公開データセットに対して高いベースラインを達成し,その一方でトレーニング効率の維持が図られた。さらに,本モデルで学習した直感的な安定構造は,検出結果の解釈可能性を大幅に向上させることを示した。

In real-world maintenance applications, deep generative models have shown promising performance in detecting anomalous events of entities from time-series signals collected from multiple sensors. Nevertheless, we outline two important challenges of leveraging such models for times-series anomaly detection: 1) developing effective and efficient reconstruction models and 2) exploiting the similarity and interrelation structures among the multivariate time series data channels. To address these challenges, in this paper we propose a stacking variational auto-encoder (VAE) model with graph neural networks for the effective and interpretable time-series anomaly detection. Specifically, we propose a stacking block-wise reconstruction framework with a weight-sharing scheme for the multivariate time series data with similarities among channels. Moreover, with a graph learning module, our model learns a sparse adjacency matrix to explicitly capture the stable interrelation structure information among multiple time series data channels for interpretable reconstruction of series patterns. Experimental results show that our proposed model outperforms the strong baselines on three public datasets with considerable improvements and meanwhile still maintains the training efficiency. Furthermore, we demonstrate that the intuitive stable structure learned by our model significantly improves the interpretability of our detection results.

翻訳日:2021-05-19 13:55:18 公開日:2021-05-18

# 深層学習によるアルツハイマー病診断の自動評価

Automatic Assessment of Alzheimer's Disease Diagnosis Based on Deep Learning Techniques ( http://arxiv.org/abs/2105.08446v1 )

ライセンス: Link先を確認

Alejandro Puente-Castro, Enrique Fernandez-Blanco, Alejandro Pazos, Cristian R. Munteanu

(参考訳) 早期発見はアルツハイマー病(AD)の進行を防ぐために重要である。したがって、専門家はできるだけ早く予防治療を開始することができる。彼らはADの早期かつ最も検出が難しい診断において、迅速かつ正確な評価を要求する。本研究の主な目的は、一般的には使われない矢状磁気共鳴画像(MRI)における疾患の存在を自動的に検出するシステムを開発することである。 ADNIデータセットとOASISデータセットの矢状MRIが採用された。より正確な結果を得るために,Transfer Learning (TL) 技術を用いて実験を行った。第一に、ADとそのステージに関する損傷は、矢状MRIにおいて区別でき、第二に、矢状MRIを用いたDLモデルを用いて得られた結果は、水平平面MRIを用いた最先端のMRIと類似している。矢状面MRIは一般的には使われていないが、この研究は、少なくとも、ADを早期に同定する他の平面からのMRIと同じくらい効果があることを証明した。これはさらなる研究の道を開くかもしれない。最後に、ある分野において、データセットの例を得るのは非常に高価であることに留意する必要がある。本研究は,これらの分野でDLモデルを構築できることを実証する一方,TLは少ない例でタスクを完了するための必須のツールである。

Early detection is crucial to prevent the progression of Alzheimer's disease (AD). Thus, specialists can begin preventive treatment as soon as possible. They demand fast and precise assessment in the diagnosis of AD in the earliest and hardest to detect stages. The main objective of this work is to develop a system that automatically detects the presence of the disease in sagittal magnetic resonance images (MRI), which are not generally used. Sagittal MRIs from ADNI and OASIS data sets were employed. Experiments were conducted using Transfer Learning (TL) techniques in order to achieve more accurate results. There are two main conclusions to be drawn from this work: first, the damages related to AD and its stages can be distinguished in sagittal MRI and, second, the results obtained using DL models with sagittal MRIs are similar to the state-of-the-art, which uses the horizontal-plane MRI. Although sagittal-plane MRIs are not commonly used, this work proved that they were, at least, as effective as MRI from other planes at identifying AD in early stages. This could pave the way for further research. Finally, one should bear in mind that in certain fields, obtaining the examples for a data set can be very expensive. This study proved that DL models could be built in these fields, whereas TL is an essential tool for completing the task with fewer examples.

翻訳日:2021-05-19 13:55:01 公開日:2021-05-18

# 不平等な長期水道需要予測

Univariate Long-Term Municipal Water Demand Forecasting ( http://arxiv.org/abs/2105.08486v1 )

ライセンス: Link先を確認

Blake VanBerlo, Matthew A.S. Ross, Daniel Hsia

(参考訳) 本研究は,カナダのロンドンにおける都市全体の水消費のモデル化について述べる。線形回帰やfacebookの預言法,リカレントニューラルネットワーク,畳み込みニューラルネットワークなど,水消費を伴う不定時系列予測のタスクに対して,複数のモデリング手法が評価された。命題は選択のモデルとして同定され、5倍のクロス検証で平均2.51%の絶対誤差を達成した。預言者はまた、水需要管理の利害関係者にとって価値のある他の利点があることが判明した。提案手法の実装は,他の自治体でも適用可能であるため,オープンソース化されている。

This study describes an investigation into the modelling of citywide water consumption in London, Canada. Multiple modelling techniques were evaluated for the task of univariate time series forecasting with water consumption, including linear regression, Facebook's Prophet method, recurrent neural networks, and convolutional neural networks. Prophet was identified as the model of choice, having achieved a mean absolute percentage error of 2.51%, averaged across a 5-fold cross validation. Prophet was also found to have other advantages deemed valuable to water demand management stakeholders, including inherent interpretability and graceful handling of missing data. The implementation for the methods described in this paper has been open sourced, as they may be adaptable by other municipalities.

翻訳日:2021-05-19 13:54:41 公開日:2021-05-18

# multi-aspect temporal network embedded: a mixed of hawkes process view

Multi-Aspect Temporal Network Embedding: A Mixture of Hawkes Process View ( http://arxiv.org/abs/2105.08566v1 )

ライセンス: Link先を確認

Yutian Chang and Guannan Liu and Yuan Zuo and Junjie Wu

(参考訳) 近年,ネットワーク組込みの研究が盛んに行われている。現存する研究は、ネットワーク構造の本質的なダイナミクスを明らかにする重要な情報として地区形成を取り上げ、近隣の歴史的影響を捉えるために、時間的エッジ形成シーケンスの符号化を提案した。しかし,本稿では,エッジの形成は時間的影響を含む様々な要因に起因しうると論じる。実のところ、異なるノードの側面は、識別された隣人の形成を駆動し、時間的スコープを超えながら関連するマルチスペクトル埋め込みを生み出します。そこで本研究では,ホークスに基づく時間的ネットワーク埋め込み(mhne)モデルを用いて,ネットワークのアスペクト駆動近傍形成を捉える手法を提案する。 MHNEでは、複数アスペクトの埋め込みをホークス過程の混合にエンコードし、励起効果と潜伏面をモデル化する利点を得る。具体的には、履歴イベントの励起効果を考慮して異なる重みを割り当てるためにグラフアテンション機構を使用し、一方、gumbel-softmaxはアスペクト上の分布を導出するために接続される。 8つの異なる時間ネットワークに関する大規模な実験は、MHNEによって得られたマルチアスペクト埋め込みの性能を最先端の手法と比較した。

Recent years have witnessed the tremendous research interests in network embedding. Extant works have taken the neighborhood formation as the critical information to reveal the inherent dynamics of network structures, and suggested encoding temporal edge formation sequences to capture the historical influences of neighbors. In this paper, however, we argue that the edge formation can be attributed to a variety of driving factors including the temporal influence, which is better referred to as multiple aspects. As a matter of fact, different node aspects can drive the formation of distinctive neighbors, giving birth to the multi-aspect embedding that relates to but goes beyond a temporal scope. Along this vein, we propose a Mixture of Hawkes-based Temporal Network Embeddings (MHNE) model to capture the aspect-driven neighborhood formation of networks. In MHNE, we encode the multi-aspect embeddings into the mixture of Hawkes processes to gain the advantages in modeling the excitation effects and the latent aspects. Specifically, a graph attention mechanism is used to assign different weights to account for the excitation effects of history events, while a Gumbel-Softmax is plugged in to derive the distribution over the aspects. Extensive experiments on 8 different temporal networks have demonstrated the great performance of the multi-aspect embeddings obtained by MHNE in comparison with the state-of-the-art methods.

翻訳日:2021-05-19 13:54:29 公開日:2021-05-18

# ニューラルネットワークを用いたPDE制約モデル:最適化と大域収束

PDE-constrained Models with Neural Network Terms: Optimization and Global Convergence ( http://arxiv.org/abs/2105.08633v1 )

ライセンス: Link先を確認

Justin Sirignano, Jonathan MacArt, Konstantinos Spiliopoulos

(参考訳) 近年、深層学習を用いて、科学と工学における偏微分方程式(pde)モデルを開発した。 PDEの機能形式はニューラルネットワークによって決定され、ニューラルネットワークパラメータは利用可能なデータに校正される。 PDEを最適化することで、組み込みニューラルネットワークの校正を行うことができる。これらの応用に動機づけられ,ニューラルネットワークを用いた線形楕円型pdesの最適化を厳格に検討した。 PDEのニューラルネットワークパラメータは勾配降下を用いて最適化され、その勾配は隣接PDEを用いて評価される。パラメータの数が大きくなると、PDE と随伴する PDE は非局所 PDE 系に収束する。この制限付きPDEシステムを用いて、最適化中にニューラルネットワーク-PDEのグローバル最小値への収束を証明できる。極限PDEシステムは、固有値が正であるが任意に小さい非局所線型作用素を含む。固有値に対するスペクトルギャップの欠如は、大域収束証明の主要な課題である。結合型PDEと随伴型PDEシステムのスペクトル分解の注意深い解析が必要である。最後に, ニューラルネットワークをレイノルズ平均化 Navier-Stokes (RANS) 方程式の閉包モデルとして機能させる流体力学への応用のために, ニューラルネットワークモデルをトレーニングする。 RANSニューラルネットワークモデルは、乱流チャネルフローの複数のデータセットに基づいてトレーニングされ、Reynoldsの異なる数値でサンプル外評価される。

Recent research has used deep learning to develop partial differential equation (PDE) models in science and engineering. The functional form of the PDE is determined by a neural network, and the neural network parameters are calibrated to available data. Calibration of the embedded neural network can be performed by optimizing over the PDE. Motivated by these applications, we rigorously study the optimization of a class of linear elliptic PDEs with neural network terms. The neural network parameters in the PDE are optimized using gradient descent, where the gradient is evaluated using an adjoint PDE. As the number of parameters become large, the PDE and adjoint PDE converge to a non-local PDE system. Using this limit PDE system, we are able to prove convergence of the neural network-PDE to a global minimum during the optimization. The limit PDE system contains a non-local linear operator whose eigenvalues are positive but become arbitrarily small. The lack of a spectral gap for the eigenvalues poses the main challenge for the global convergence proof. Careful analysis of the spectral decomposition of the coupled PDE and adjoint PDE system is required. Finally, we use this adjoint method to train a neural network model for an application in fluid mechanics, in which the neural network functions as a closure model for the Reynolds-averaged Navier-Stokes (RANS) equations. The RANS neural network model is trained on several datasets for turbulent channel flow and is evaluated out-of-sample at different Reynolds numbers.

翻訳日:2021-05-19 13:54:08 公開日:2021-05-18

# 賭けによる予測アルゴリズムの強化

Enhancement of prediction algorithms by betting ( http://arxiv.org/abs/2105.08669v1 )

ライセンス: Link先を確認

Vladimir Vovk

(参考訳) 本稿では,確率的予測アルゴリズムの品質向上のための手法を提案する。これは、最近開発された共形テストmartingalesの成功に触発されている。

This note proposes a procedure for enhancing the quality of probabilistic prediction algorithms via betting against their predictions. It is inspired by the success of the conformal test martingales that have been developed recently.

翻訳日:2021-05-19 13:53:47 公開日:2021-05-18

# AIと共有の繁栄

AI and Shared Prosperity ( http://arxiv.org/abs/2105.08475v1 )

ライセンス: Link先を確認

Katya Klinova and Anton Korinek

(参考訳) 人間の労働を自動化するAIの今後の進歩は、労働市場や不平等に深刻な影響を及ぼす可能性がある。本稿では、生産性の向上が社会を豊かにすると同時に、労働需要の増加に寄与することを考慮して、労働市場における特定のタイプのaiシステムの影響を分析する枠組みを提案する。この分析により、倫理に配慮した企業は、aiシステムの作成や展開を可能にし、研究者や政策立案者は、労働市場や不平等に対する彼らの行動の影響を考慮に入れ、aiの進歩を、共通の繁栄と全人類の包括的経済未来を促進する方向に導くことができる。

Future advances in AI that automate away human labor may have stark implications for labor markets and inequality. This paper proposes a framework to analyze the effects of specific types of AI systems on the labor market, based on how much labor demand they will create versus displace, while taking into account that productivity gains also make society wealthier and thereby contribute to additional labor demand. This analysis enables ethically-minded companies creating or deploying AI systems as well as researchers and policymakers to take into account the effects of their actions on labor markets and inequality, and therefore to steer progress in AI in a direction that advances shared prosperity and an inclusive economic future for all of humanity.

翻訳日:2021-05-19 13:53:29 公開日:2021-05-18

# 自分の寝室を飾る: 生成的広告ネットワークによるローカルな画像生成

Decorating Your Own Bedroom: Locally Controlling Image Generation with Generative Adversarial Networks ( http://arxiv.org/abs/2105.08222v1 )

ライセンス: Link先を確認

Chen Zhang, Yinghao Xu, Yujun Shen

(参考訳) GAN(Generative Adversarial Networks)は高品質な画像の合成に成功している。しかし、十分に訓練されたGANモデルの生成過程を制御し、出力イメージをカスタマイズする方法は、明らかにされていない。 GANで使用される入力潜時符号の変調は、出力画像の変動係数を合理的に変更できることが最近発見されたが、そのような操作は通常、画像全体を変更するために現れる。本研究では,出力画像のローカル編集をサポートするためのloganと呼ばれる効果的な手法を提案する。具体的には,コンテンツ変調とスタイル変調の2つの演算子を優先マスクとともに導入し,中間生成特性の正確な制御を容易にする。寝室の合成を例にとれば、部屋内の個々のオブジェクトをシームレスに削除、挿入、シフト、回転することが可能です。さらに, 部屋を完全に取り除き, 家具やスタイルをカスタマイズして再調合することができる。実験結果から,多目的画像編集のための事前学習されたGANの画像生成を操る大きな可能性を示した。

Generative Adversarial Networks (GANs) have made great success in synthesizing high-quality images. However, how to steer the generation process of a well-trained GAN model and customize the output image is much less explored. It has been recently found that modulating the input latent code used in GANs can reasonably alter some variation factors in the output image, but such manipulation usually presents to change the entire image as a whole. In this work, we propose an effective approach, termed as LoGAN, to support local editing of the output image. Concretely, we introduce two operators, i.e., content modulation and style modulation, together with a priority mask to facilitate the precise control of the intermediate generative features. Taking bedroom synthesis as an instance, we are able to seamlessly remove, insert, shift, and rotate the individual objects inside a room. Furthermore, our method can completely clear out a room and then refurnish it with customized furniture and styles. Experimental results show the great potentials of steering the image generation of pre-trained GANs for versatile image editing.

翻訳日:2021-05-19 13:53:17 公開日:2021-05-18

# chexnetで事前学習したresnet50を用いたコロナ病に対応するx線画像の分類

Transfer learning approach to Classify the X-ray image that corresponds to corona disease Using ResNet50 pretrained by ChexNet ( http://arxiv.org/abs/2105.08382v1 )

ライセンス: Link先を確認

Mahyar Bolhassani

(参考訳) コロナウイルスは世界中の人々に悪影響を及ぼした。コビッド19ウイルスと肺炎やインフルエンザなどの他の呼吸器疾患との間には共通の症状がある。したがって、迅速な診断は患者を救うだけでなく、感染拡大を防ぐためにも重要である。最も頼りになる診断方法の1つは、肺のx線像である。深層学習アプローチの助けを借りて、深層モデルに影響のある肺の状態を学ぶように教えることができる。したがって、新しいサンプルをCovid19感染患者であるかどうかの分類が可能である。このプロジェクトでは、imagenetデータセットとchexnetデータセットで事前トレーニングされたresnet50に基づく深いモデルをトレーニングします。 kaggle が導入した非バランスな coronahack 胸部 x-ray データセットに基づいて,バイナリ分類とマルチクラス分類の両方を適用した。また,焦点損失とクロスエントロピー損失を用いた場合の比較を行った。

Coronavirus adversely has affected people worldwide. There are common symptoms between the Covid19 virus disease and other respiratory diseases like pneumonia or Influenza. Therefore, diagnosing it fast is crucial not only to save patients but also to prevent it from spreading. One of the most reliant methods of diagnosis is through X-ray images of a lung. With the help of deep learning approaches, we can teach the deep model to learn the condition of an affected lung. Therefore, it can classify the new sample as if it is a Covid19 infected patient or not. In this project, we train a deep model based on ResNet50 pretrained by ImageNet dataset and CheXNet dataset. Based on the imbalanced CoronaHack Chest X-Ray dataset introducing by Kaggle we applied both binary and multi-class classification. Also, we compare the results when using Focal loss and Cross entropy loss.

翻訳日:2021-05-19 13:52:59 公開日:2021-05-18

# 固定フロップ数でのハイパーネットワークのオーバーパラメトリゼーションによる高速ニューラルネットワークの高速化

Overparametrization of HyperNetworks at Fixed FLOP-Count Enables Fast Neural Image Enhancement ( http://arxiv.org/abs/2105.08470v1 )

ライセンス: Link先を確認

Lorenz K. Muller

(参考訳) 深層畳み込みニューラルネットワークは、小型のモバイルカメラセンサーで撮影された画像を強化し、分解、分解、超高解像度化といったタスクに優れる。しかし、モバイルデバイスで実際に使用する場合、これらのネットワークはFLOPを多用し、畳み込み層のFLOPを削減し、パラメータ数を減少させる。これは、過大なパラメータを持つニューラルネットワークがしばしば最も一般化しているという最近の発見から、問題となっている。本稿では,フロップと標準畳み込みのパラメータの固定比率を破るためにハイパーネットワークの利用を提案する。これにより、ZRR(Zurich RAW-to-DSLR)データセットにおけるSSIMおよびMS-SSIMの従来の最先端アーキテクチャを10倍に削減できる。 zrr では、より大きな画像限界において、固定フロップ数における「二重増分」挙動と一致する一般化曲線をさらに観察する。最後に、既存のネットワーク(VDN)に同じ手法を適用することで、スマートフォン画像デノイングデータセット(SIDD)の忠実さを維持しながら計算コストを削減できることを示す。キー関数のコードは、appendixで与えられる。

Deep convolutional neural networks can enhance images taken with small mobile camera sensors and excel at tasks like demoisaicing, denoising and super-resolution. However, for practical use on mobile devices these networks often require too many FLOPs and reducing the FLOPs of a convolution layer, also reduces its parameter count. This is problematic in view of the recent finding that heavily over-parameterized neural networks are often the ones that generalize best. In this paper we propose to use HyperNetworks to break the fixed ratio of FLOPs to parameters of standard convolutions. This allows us to exceed previous state-of-the-art architectures in SSIM and MS-SSIM on the Zurich RAW- to-DSLR (ZRR) data-set at > 10x reduced FLOP-count. On ZRR we further observe generalization curves consistent with 'double-descent' behavior at fixed FLOP-count, in the large image limit. Finally we demonstrate the same technique can be applied to an existing network (VDN) to reduce its computational cost while maintaining fidelity on the Smartphone Image Denoising Dataset (SIDD). Code for key functions is given in the appendix.

翻訳日:2021-05-19 13:52:48 公開日:2021-05-18

# 平均的マルチエージェント強化学習のための置換不変ポリシー最適化:原則的アプローチ

Permutation Invariant Policy Optimization for Mean-Field Multi-Agent Reinforcement Learning: A Principled Approach ( http://arxiv.org/abs/2105.08268v1 )

ライセンス: Link先を確認

Yan Li, Lingxiao Wang, Jiachen Yang, Ethan Wang, Zhaoran Wang, Tuo Zhao, Hongyuan Zha

(参考訳) 多エージェント強化学習(MARL)は, エージェントの数が指数関数的に増加するにつれて, より多くのエージェントの存在下でより困難になる。このようなスケール上の課題に対処するために,置換不変性を持つ協調的marl問題のクラスを同定し,平均場マルコフ決定過程(mdp)として定式化する。そこで,置換不変なアクター批判型ニューラルアーキテクチャのコアとなる平均場近似ポリシ最適化(MF-PPO)アルゴリズムを提案する。我々は,MF-PPOが収束のサブ線形速度で世界的最適政策を達成することを証明した。さらに、サンプルの複雑さはエージェントの数に依存しない。マルチエージェント粒子環境(MPE)における数値実験により,MF-PPOの理論的利点を検証する。特に、置換不変ニューラルアーキテクチャによって引き起こされる帰納バイアスにより、MF-PPOは、その一般化性能の鍵となる、より少ないモデルパラメータで既存の競合より優れていることを示す。

Multi-agent reinforcement learning (MARL) becomes more challenging in the presence of more agents, as the capacity of the joint state and action spaces grows exponentially in the number of agents. To address such a challenge of scale, we identify a class of cooperative MARL problems with permutation invariance, and formulate it as a mean-field Markov decision processes (MDP). To exploit the permutation invariance therein, we propose the mean-field proximal policy optimization (MF-PPO) algorithm, at the core of which is a permutation-invariant actor-critic neural architecture. We prove that MF-PPO attains the globally optimal policy at a sublinear rate of convergence. Moreover, its sample complexity is independent of the number of agents. We validate the theoretical advantages of MF-PPO with numerical experiments in the multi-agent particle environment (MPE). In particular, we show that the inductive bias introduced by the permutation-invariant neural architecture enables MF-PPO to outperform existing competitors with a smaller number of model parameters, which is the key to its generalization performance.

翻訳日:2021-05-19 13:52:17 公開日:2021-05-18

# ModelPS: スケールでトレーニング済みモデルを編集するためのインタラクティブで協調的なプラットフォーム

ModelPS: An Interactive and Collaborative Platform for Editing Pre-trained Models at Scale ( http://arxiv.org/abs/2105.08275v1 )

ライセンス: Link先を確認

Yuanming Li, Huaizheng Zhang, Shanshan Jiang, Fan Yang, Yonggang Wen and Yong Luo

(参考訳) AIエンジニアリングは、さまざまなバックグラウンドを持つソフトウェア開発者の間でディープニューラルネットワーク(DNN)モデルを民主化するための重要な規律として登場した。特に、デプロイメント段階でこれらのDNNモデルを変更することは、非常に難しい課題です。本研究では,協調的なdnnモデル編集とインテリジェントなモデル提供を実現するために,ローコードソリューションであるmodelps("model photoshop"の頭字語)を提案し,開発する。 modelpsソリューションは2つのトランスフォーメーションフィーチャを具体化している: 1) 開発者チームがdnnモデルをローコード形式で画像で共有および編集するためのユーザフレンドリーなwebインターフェース、2) 開発者が所定のデプロイメント要件や制約のためにモデル編集設定をカスタマイズするのを支援するバックエンドのmodel genieエンジン。広範にわたるディープラーニング(DL)モデルを用いたケーススタディでは,生産性の向上により,開発オーバーヘッドと通信オーバーヘッドの両方を大幅に削減できることが示された。コードはGitHubでオープンソースパッケージとしてリリースされた。

AI engineering has emerged as a crucial discipline to democratize deep neural network (DNN) models among software developers with a diverse background. In particular, altering these DNN models in the deployment stage posits a tremendous challenge. In this research, we propose and develop a low-code solution, ModelPS (an acronym for "Model Photoshop"), to enable and empower collaborative DNN model editing and intelligent model serving. The ModelPS solution embodies two transformative features: 1) a user-friendly web interface for a developer team to share and edit DNN models pictorially, in a low-code fashion, and 2) a model genie engine in the backend to aid developers in customizing model editing configurations for given deployment requirements or constraints. Our case studies with a wide range of deep learning (DL) models show that the system can tremendously reduce both development and communication overheads with improved productivity. The code has been released as an open-source package at GitHub.

翻訳日:2021-05-19 13:51:58 公開日:2021-05-18

# DRIVE: 1ビット分散平均推定

DRIVE: One-bit Distributed Mean Estimation ( http://arxiv.org/abs/2105.08339v1 )

ライセンス: Link先を確認

Shay Vargaftik, Ran Ben Basat, Amit Portnoy, Gal Mendelson, Yaniv Ben-Itzhak, Michael Mitzenmacher

(参考訳) 我々は、$n$クライアントが$d(1+o(1))$ビットのみを使用して$d$-dimensional real-valuedベクターを送信する問題を考える。このような圧縮問題は、フェデレートされた分散学習や、他の領域でも発生する。従来の圧縮アルゴリズムを精度と計算効率で上回る、新しい数学的結果を提供し、それに対応する新しいアルゴリズムを導出する。本手法は,分散学習タスクとフェデレーション学習タスクの集合において,様々なデータセットを用いて評価し,その状態に対して一貫した改善を示す。

We consider the problem where $n$ clients transmit $d$-dimensional real-valued vectors using only $d(1+o(1))$ bits each, in a manner that allows a receiver to approximately reconstruct their mean. Such compression problems arise in federated and distributed learning, as well as in other domains. We provide novel mathematical results and derive corresponding new algorithms that outperform previous compression algorithms in accuracy and computational efficiency. We evaluate our methods on a collection of distributed and federated learning tasks, using a variety of datasets, and show a consistent improvement over the state of the art.

翻訳日:2021-05-19 13:51:39 公開日:2021-05-18

# Euler-Maruyamaスキームを模倣したニューラルネットワークによる確率力学系の学習

Learning stochastic dynamical systems with neural networks mimicking the Euler-Maruyama scheme ( http://arxiv.org/abs/2105.08449v1 )

ライセンス: Link先を確認

Noura Dridi, Lucas Drumetz, Ronan Fablet

(参考訳) 確率微分方程式(SDE)は力学系の最も重要な表現の一つである。これらは、システムの決定論的構成要素と、ランダムな未知の要因を表す確率的要素を含む能力で特筆される。しかし、これはSDEの学習を通常の微分方程式(ODE)よりもはるかに困難にする。本稿では、SDEのパラメータをSDE統合スキームを組み込んだニューラルネットワークで表現するデータ駆動手法を提案する。損失関数は最大確率基準に基づいており、1つのマルコフ・ガウスの仮定に従っている。このアルゴリズムは、幾何学的ブラウン運動とロレンツ-63モデルの確率バージョンに適用される。後者は、状態に依存する確率的なコンポーネントが存在するため、特に対処が難しい。アルゴリズムの性能は異なるシミュレーション結果を用いて検証される。さらに,非線型ドリフト推定に用いる基準勾配マッチング法と,確率項を考慮しないニューラルネットワークに基づく手法との比較を行った。

Stochastic differential equations (SDEs) are one of the most important representations of dynamical systems. They are notable for the ability to include a deterministic component of the system and a stochastic one to represent random unknown factors. However, this makes learning SDEs much more challenging than ordinary differential equations (ODEs). In this paper, we propose a data driven approach where parameters of the SDE are represented by a neural network with a built-in SDE integration scheme. The loss function is based on a maximum likelihood criterion, under order one Markov Gaussian assumptions. The algorithm is applied to the geometric brownian motion and a stochastic version of the Lorenz-63 model. The latter is particularly hard to handle due to the presence of a stochastic component that depends on the state. The algorithm performance is attested using different simulations results. Besides, comparisons are performed with the reference gradient matching method used for non linear drift estimation, and a neural networks-based method, that does not consider the stochastic term.

翻訳日:2021-05-19 13:51:28 公開日:2021-05-18

# 6GネットワークのためのAI-Native Network Slicing

AI-Native Network Slicing for 6G Networks ( http://arxiv.org/abs/2105.08576v1 )

ライセンス: Link先を確認

Wen Wu, Conghao Zhou, Mushu Li, Huaqing Wu, Haibo Zhou, Ning Zhang, Xuemin (Sherman) Shen, Weihua Zhuang

(参考訳) 第5世代(5G)ネットワークのグローバル展開では、5Gを超えて第6世代(6G)ネットワークを想定する必要がある。 6Gネットワークには、宇宙空間の統合ネットワーク、高度なネットワーク仮想化、ユビキタスインテリジェンスが期待されている。本稿では、インテリジェントなネットワーク管理を促進し、新興AIサービスをサポートするために、6Gネットワークのための人工知能(AI)ネイティブネットワークスライシングアーキテクチャを提案する。 AIは、提案されたネットワークスライシングアーキテクチャで構築されており、AIとネットワークスライシングのシナジーを可能にする。 AIソリューションは、インテリジェントネットワーク管理、すなわちスライシングのためのAIを促進するために、ネットワークスライシングのライフサイクル全体について調査されている。さらに、スライスインスタンスを構築し、効率的なリソース管理、すなわちAIスライスを実行することによって、新興AIサービスをサポートするために、ネットワークスライシングアプローチについて議論する。最後に、ケーススタディを示し、6GでAIネイティブネットワークスライシングに不可欠なオープンリサーチ問題について議論する。

With the global roll-out of the fifth generation (5G) networks, it is necessary to look beyond 5G and envision the sixth generation (6G) networks. The 6G networks are expected to have space-air-ground integrated networking, advanced network virtualization, and ubiquitous intelligence. This article proposes an artificial intelligence (AI)-native network slicing architecture for 6G networks to facilitate intelligent network management and support emerging AI services. AI is built in the proposed network slicing architecture to enable the synergy of AI and network slicing. AI solutions are investigated for the entire lifecycle of network slicing to facilitate intelligent network management, i.e., AI for slicing. Furthermore, network slicing approaches are discussed to support emerging AI services by constructing slice instances and performing efficient resource management, i.e., slicing for AI. Finally, a case study is presented, followed by a discussion of open research issues that are essential for AI-native network slicing in 6G.

翻訳日:2021-05-19 13:50:51 公開日:2021-05-18

# インスタンス標的中毒における学習と認定

Learning and Certification under Instance-targeted Poisoning ( http://arxiv.org/abs/2105.08709v1 )

ライセンス: Link先を確認

Ji Gao, Amin Karbasi, Mohammad Mahmoody

(参考訳) 本稿では,特定のターゲットインスタンスで学習者を騙すことを目標として,学習セットのごく一部を変更する可能性がある,インスタンス標的中毒攻撃下でのpac学習可能性と認定について検討する。最初のコントリビューションは、様々な設定で問題を形式化し、学習者のランダムさや敵の攻撃がそれに依存するかどうかといった微妙な側面を明確に議論することである。敵の予算がサンプルの複雑さに比例してスケールすると、PAC学習性と認定が達成可能であることを示す。対照的に、敵の予算がサンプルの複雑さと線形に増加すると、敵は期待される0-1の損失を1に引き上げる可能性がある。さらに,同じ攻撃モデルを用いて分布特異的pac学習に結果を拡張し,ガウス分布下での半空間学習において,認証による適切な学習が可能であることを示す。最後に,実データ集合上のk近傍のロバスト性,ロジスティック回帰,多層パーセプトロン,畳み込みニューラルネットワークを実験的に検討し,ターゲットポジショニング攻撃に対して検証する。我々の実験結果によると、多くのモデル、特に最先端のニューラルネットワークは、これらの強力な攻撃に対して脆弱である。興味深いことに、標準精度の高いメソッドは、インスタンスターゲットの毒殺攻撃に対してより脆弱である可能性がある。

In this paper, we study PAC learnability and certification under instance-targeted poisoning attacks, where the adversary may change a fraction of the training set with the goal of fooling the learner at a specific target instance. Our first contribution is to formalize the problem in various settings, and explicitly discussing subtle aspects such as learner's randomness and whether (or not) adversary's attack can depend on it. We show that when the budget of the adversary scales sublinearly with the sample complexity, PAC learnability and certification are achievable. In contrast, when the adversary's budget grows linearly with the sample complexity, the adversary can potentially drive up the expected 0-1 loss to one. We further extend our results to distribution-specific PAC learning in the same attack model and show that proper learning with certification is possible for learning halfspaces under Gaussian distribution. Finally, we empirically study the robustness of K nearest neighbour, logistic regression, multi-layer perceptron, and convolutional neural network on real data sets, and test them against targeted-poisoning attacks. Our experimental results show that many models, especially state-of-the-art neural networks, are indeed vulnerable to these strong attacks. Interestingly, we observe that methods with high standard accuracy might be more vulnerable to instance-targeted poisoning attacks.

翻訳日:2021-05-19 13:50:34 公開日:2021-05-18

# BBE:エージェント・ベース・モデリングによるゲーム内ベッティング取引所の組織動態のシミュレーション

BBE: Simulating the Microstructural Dynamics of an In-Play Betting Exchange via Agent-Based Modelling ( http://arxiv.org/abs/2105.08310v1 )

ライセンス: Link先を確認

Dave Cliff

(参考訳) I describe the rationale for, and design of, an agent-based simulation model of a contemporary online sports-betting exchange: such exchanges, closely related to the exchange mechanisms at the heart of major financial markets, have revolutionized the gambling industry in the past 20 years, but gathering sufficiently large quantities of rich and temporally high-resolution data from real exchanges - i.e., the sort of data that is needed in large quantities for Deep Learning - is often very expensive, and sometimes simply impossible; this creates a need for a plausibly realistic synthetic data generator, which is what this simulation now provides. The simulator, named the "Bristol Betting Exchange" (BBE), is intended as a common platform, a data-source and experimental test-bed, for researchers studying the application of AI and machine learning (ML) techniques to issues arising in betting exchanges; and, as far as I have been able to determine, BBE is the first of its kind: a free open-source agent-based simulation model consisting not only of a sports-betting exchange, but also a minimal simulation model of racetrack sporting events (e.g., horse-races or car-races) about which bets may be made, and a population of simulated bettors who each form their own private evaluation of odds and place bets on the exchange before and - cruciallyduring the race itself (i.e., so-called "in-play" betting) and whose betting opinions change second-by-second as each race event unfolds. BBEは、AI/MLと高度なデータ分析技術の適用を通じて、スポーツイベントに賭けるための収益戦略の自動発見や改善のための大規模な高解像度データセットの生成を可能にする概念実証システムとして提供される。本稿は,関連文献の広範な調査を行い,bbeの動機と設計を説明し,簡単な図示的結果を示す。

I describe the rationale for, and design of, an agent-based simulation model of a contemporary online sports-betting exchange: such exchanges, closely related to the exchange mechanisms at the heart of major financial markets, have revolutionized the gambling industry in the past 20 years, but gathering sufficiently large quantities of rich and temporally high-resolution data from real exchanges - i.e., the sort of data that is needed in large quantities for Deep Learning - is often very expensive, and sometimes simply impossible; this creates a need for a plausibly realistic synthetic data generator, which is what this simulation now provides. The simulator, named the "Bristol Betting Exchange" (BBE), is intended as a common platform, a data-source and experimental test-bed, for researchers studying the application of AI and machine learning (ML) techniques to issues arising in betting exchanges; and, as far as I have been able to determine, BBE is the first of its kind: a free open-source agent-based simulation model consisting not only of a sports-betting exchange, but also a minimal simulation model of racetrack sporting events (e.g., horse-races or car-races) about which bets may be made, and a population of simulated bettors who each form their own private evaluation of odds and place bets on the exchange before and - crucially - during the race itself (i.e., so-called "in-play" betting) and whose betting opinions change second-by-second as each race event unfolds. BBE is offered as a proof-of-concept system that enables the generation of large high-resolution data-sets for automated discovery or improvement of profitable strategies for betting on sporting events via the application of AI/ML and advanced data analytics techniques. This paper offers an extensive survey of relevant literature and explains the motivation and design of BBE, and presents brief illustrative results.

翻訳日:2021-05-19 13:50:11 公開日:2021-05-18

# 二重交換モデルにおける停止相分離:機械学習による大規模シミュレーション

Arrested phase separation in double-exchange models: machine-learning enabled large-scale simulation ( http://arxiv.org/abs/2105.08221v1 )

ライセンス: Link先を確認

Puhan Zhang, Gia-Wei Chern

(参考訳) 本稿では,小型完全対角化解から学習したディープラーニングニューラルネットワークポテンシャルに基づいて,シングルバンド二重交換モデルにおける電子相分離の大規模動的シミュレーションを提案する。ドープ孔を半充填した絶縁背景から分離し, 興味深い相関誘起凍結挙動を明らかにする。ホールの凝集は、電荷キャリアと局所磁気モーメントの結合による強磁性クラスターの形成によって安定化されるが、この安定化はまた、反強磁性スピン-スピン相関が背景で十分に発達しているときにホールの凝縮電位を生成する。自走ブラックホールの運動量が劇的に減少すると、強磁性クラスターのさらなる成長が妨げられ、位相分離が達成される。磁気抵抗効果を示す材料における相分離ダイナミクスの意義について考察した。

We present large-scale dynamical simulations of electronic phase separation in the single-band double-exchange model based on deep-learning neural-network potentials trained from small-size exact diagonalization solutions. We uncover an intriguing correlation-induced freezing behavior as doped holes are segregated from half-filled insulating background during equilibration. While the aggregation of holes is stabilized by the formation of ferromagnetic clusters through Hund's coupling between charge carriers and local magnetic moments, this stabilization also creates confining potentials for holes when antiferromagnetic spin-spin correlation is well developed in the background. The dramatically reduced mobility of the self-trapped holes prematurely disrupts further growth of the ferromagnetic clusters, leading to an arrested phase separation. Implications of our findings for phase separation dynamics in materials that exhibit colossal magnetoresistance effect are discussed.

翻訳日:2021-05-19 13:49:30 公開日:2021-05-18

# 重み共有ディープニューラルネットワークを用いた多核ジオメトリの電子シュレーディンガー方程式の解法

Solving the electronic Schr\"odinger equation for multiple nuclear geometries with weight-sharing deep neural networks ( http://arxiv.org/abs/2105.08351v1 )

ライセンス: Link先を確認

Michael Scherbela, Rafael Reisenhofer, Leon Gerard, Philipp Marquetand and Philipp Grohs

(参考訳) schr\"odinger方程式の正確な数値解は量子化学において極めて重要である。しかし、現在の高精度手法の計算コストは相互作用する粒子の数に比例する。モンテカルロ法と教師なしニューラルネットワークのトレーニングを組み合わせることで、この設定における次元性の呪いを克服し、計算コストを適度にスケーリングして個々の分子の正確な波動関数を得るための有望なアプローチが提案されている。これらの手法は現在、分子の測度に関して波動関数が示す規則性を利用していない。近年のDeep Transfer Learningの機械翻訳やコンピュータビジョンタスクへの応用に着想を得て、ニューラルネットワークベースのモデルを異なる分子ジオメトリに最適化する際に、重み付け制約を導入することで、この規則性を活用する。すなわち、ニューラルネットワークモデルにおける重みの最大95%が、実際には様々な分子ジオメトリにわたって等しいように最適化プロセスを制限する。この手法は、同じ分子の核ジオメトリの集合を等級で考える場合の最適化を加速し、異なる分子にまたがって高い精度をもたらす事前学習されたニューラルネットワークの波動関数への有望な道を開くことを見出した。

Accurate numerical solutions for the Schr\"odinger equation are of utmost importance in quantum chemistry. However, the computational cost of current high-accuracy methods scales poorly with the number of interacting particles. Combining Monte Carlo methods with unsupervised training of neural networks has recently been proposed as a promising approach to overcome the curse of dimensionality in this setting and to obtain accurate wavefunctions for individual molecules at a moderately scaling computational cost. These methods currently do not exploit the regularity exhibited by wavefunctions with respect to their molecular geometries. Inspired by recent successful applications of deep transfer learning in machine translation and computer vision tasks, we attempt to leverage this regularity by introducing a weight-sharing constraint when optimizing neural network-based models for different molecular geometries. That is, we restrict the optimization process such that up to 95 percent of weights in a neural network model are in fact equal across varying molecular geometries. We find that this technique can accelerate optimization when considering sets of nuclear geometries of the same molecule by an order of magnitude and that it opens a promising route towards pre-trained neural network wavefunctions that yield high accuracy even across different molecules.

翻訳日:2021-05-19 13:49:16 公開日:2021-05-18

# 領域制約のロバスト性について

On the Robustness of Domain Constraints ( http://arxiv.org/abs/2105.08619v1 )

ライセンス: Link先を確認

Ryan Sheatsley and Blaine Hoak and Eric Pauley and Yohan Beugin and Michael J. Weisman and Patrick McDaniel

(参考訳) 機械学習は、モデルのパフォーマンスを損なうように設計された逆例入力に対して脆弱である。しかし、逆例がモデル化された領域における現実的な入力を表すかどうかは不明である。ネットワークやフィッシングのような様々なドメインは、敵が攻撃を実現するために満たさなければならない特徴(敵固有の目標に加えて)の間のドメイン制約と複雑な関係を持つ。本稿では,ドメイン制約が敵の能力を制限する方法と,現実的な(制約に従順な)例を作成するために敵の戦略をどのように適用できるかを検討する。そこで本研究では,データからドメイン制約を学習する手法を開発し,学習した制約を敵対的工法に組み込む方法を示す。ネットワーク侵入とフィッシングデータセットにおける我々のアプローチの有効性を評価し,(1)最先端の工法アルゴリズムが生成する敵例の最大82%がドメイン制約に違反し,(2)ドメイン制約は敵例に対して堅牢であり,制約を強制するとモデル精度が最大34%向上することを示した。我々は、ドメイン制約を満たすために入力を変更する必要があるだけでなく、これらの制約が有効な敵例の生成をより困難にしていることを観察する。

Machine learning is vulnerable to adversarial examples-inputs designed to cause models to perform poorly. However, it is unclear if adversarial examples represent realistic inputs in the modeled domains. Diverse domains such as networks and phishing have domain constraints-complex relationships between features that an adversary must satisfy for an attack to be realized (in addition to any adversary-specific goals). In this paper, we explore how domain constraints limit adversarial capabilities and how adversaries can adapt their strategies to create realistic (constraint-compliant) examples. In this, we develop techniques to learn domain constraints from data, and show how the learned constraints can be integrated into the adversarial crafting process. We evaluate the efficacy of our approach in network intrusion and phishing datasets and find: (1) up to 82% of adversarial examples produced by state-of-the-art crafting algorithms violate domain constraints, (2) domain constraints are robust to adversarial examples; enforcing constraints yields an increase in model accuracy by up to 34%. We observe not only that adversaries must alter inputs to satisfy domain constraints, but that these constraints make the generation of valid adversarial examples far more challenging.

翻訳日:2021-05-19 13:48:34 公開日:2021-05-18

# DID-eFed: 分散アイデンティティを備えたサービスとしてのフェデレーション学習の実現

DID-eFed: Facilitating Federated Learning as a Service withDecentralized Identities ( http://arxiv.org/abs/2105.08671v1 )

ライセンス: Link先を確認

Jiahui Geng, Neel Kanwal, Martin Gilje Jaatun, Chunming Rong

(参考訳) 私たちはビッグデータの時代に入り、人工知能応用の繁栄の「燃料」と考えられている。 eu一般データ保護規則(gdpr)の制定は、ビッグデータにおける個人のプライバシーに関する懸念を引き起こす。フェデレートラーニング(FL)は、ユーザプライバシとデータの機密性要件に準拠したまま、複数のパーティ間で共有される高性能モデルを構築するのに役立つ機能的なソリューションとして現れます。 FLは、実アプリケーションで集中的に研究され、使用されているが、関心のあるサードパーティへのFLaaS(Federated Learning as a Service)としての展望と応用に関する研究は、まだ限られている。本稿では,分散ID(DID)とスマートコントラクトによってFLが促進されるFLaaSシステム,DID-eFedを提案する。 didは当社のシステムにおいて、より柔軟で信頼性の高い分散アクセス管理を可能にします。 DID-eFedが病院や研究機関のFLaaSを可能にするシナリオについて述べる。

We have entered the era of big data, and it is considered to be the "fuel" for the flourishing of artificial intelligence applications. The enactment of the EU General Data Protection Regulation (GDPR) raises concerns about individuals' privacy in big data. Federated learning (FL) emerges as a functional solution that can help build high-performance models shared among multiple parties while still complying with user privacy and data confidentiality requirements. Although FL has been intensively studied and used in real applications, there is still limited research related to its prospects and applications as a FLaaS (Federated Learning as a Service) to interested 3rd parties. In this paper, we present a FLaaS system: DID-eFed, where FL is facilitated by decentralized identities (DID) and a smart contract. DID enables a more flexible and credible decentralized access management in our system, while the smart contract offers a frictionless and less error-prone process. We describe particularly the scenario where our DID-eFed enables the FLaaS among hospitals and research institutions.

翻訳日:2021-05-19 13:48:14 公開日:2021-05-18

# (参考訳) slgpt: transfer learningを使用してsimulinkモデルファイルを直接生成し、simulinkツールチェーンのバグを見つける

SLGPT: Using Transfer Learning to Directly Generate Simulink Model Files and Find Bugs in the Simulink Toolchain ( http://arxiv.org/abs/2105.07465v2 )

ライセンス: CC BY 4.0

Sohil Lal Shrestha and Christoph Csallner

(参考訳) Simulinkのような商用サイバー物理システム(CPS)開発ツールのバグを見つけることは、コードベースに数百万行のコードが含まれており、完全な形式言語仕様が利用できないため難しい。ディープラーニング技術は、サンプルモデルからそのような言語仕様を学ぶことを約束する一方で、ディープラーニングは、うまく機能するために多数のトレーニングデータが必要です。 SLGPTは、転送学習を用いて、大規模なトレーニングデータに基づいて事前学習された強力な生成事前学習トランスフォーマ2(GPT-2)モデルを活用することでこの問題に対処する。 SLGPTは、オープンソースリポジトリから抽出されたランダムに生成されたモデルとモデルの両方でGPT-2をSimulinkに適合させる。 SLGPTは、最も近い競合であるDeepFuzzSLよりもオープンソースモデルに近いSimulinkモデルを作成し、DeepFuzzSLが発見したSimulink開発ツールチェーンのスーパーセットを発見した。

Finding bugs in a commercial cyber-physical system (CPS) development tool such as Simulink is hard as its codebase contains millions of lines of code and complete formal language specifications are not available. While deep learning techniques promise to learn such language specifications from sample models, deep learning needs a large number of training data to work well. SLGPT addresses this problem by using transfer learning to leverage the powerful Generative Pre-trained Transformer 2 (GPT-2) model, which has been pre-trained on a large set of training data. SLGPT adapts GPT-2 to Simulink with both randomly generated models and models mined from open-source repositories. SLGPT produced Simulink models that are both more similar to open-source models than its closest competitor, DeepFuzzSL, and found a super-set of the Simulink development toolchain bugs found by DeepFuzzSL.

翻訳日:2021-05-19 12:10:06 公開日:2021-05-18

# (参考訳) マルチモーダル深層ニューラルネットワークにおける説明可能性の検討

A Review on Explainability in Multimodal Deep Neural Nets ( http://arxiv.org/abs/2105.07878v2 )

ライセンス: CC BY 4.0

Gargi Joshi, Rahee Walambe, Ketan Kotecha

(参考訳) ディープニューラルネットワークを利用した人工知能技術は、コンピュータビジョンアプリケーションや自然言語処理タスクなど、いくつかのアプリケーション領域で大きな成功を収めています。人間レベルのパフォーマンスを上回ることで、言語、視覚、感覚、テキストの異なるモダリティが正確な予測と識別において重要な役割を果たすアプリケーションの研究が促進された。深層学習モデルを用いたマルチモーダル融合法が文献で提案されている。その優れた性能にもかかわらず、深層ニューラルネットワークの複雑で不透明でブラックボックスな性質は、社会的受容と使用性を制限する。これにより、モデル解釈可能性と説明可能性の探求が生まれ、さらにマルチモーダルAIメソッドを含む複雑なタスクにもたらされた。本稿では,マルチモーダル深層ニューラルネットワーク,特に視覚と言語タスクにおける説明可能性に関する包括的な調査と解説を行うため,本論文を概説する。本稿では,マルチモーダルaiとその汎用ドメインへの応用に関するいくつかの話題を取り上げ,その意義,データセット,手法と技法の基本構成要素,課題,応用,今後のトレンドについて述べる。

Artificial Intelligence techniques powered by deep neural nets have achieved much success in several application domains, most significantly and notably in the Computer Vision applications and Natural Language Processing tasks. Surpassing human-level performance propelled the research in the applications where different modalities amongst language, vision, sensory, text play an important role in accurate predictions and identification. Several multimodal fusion methods employing deep learning models are proposed in the literature. Despite their outstanding performance, the complex, opaque and black-box nature of the deep neural nets limits their social acceptance and usability. This has given rise to the quest for model interpretability and explainability, more so in the complex tasks involving multimodal AI methods. This paper extensively reviews the present literature to present a comprehensive survey and commentary on the explainability in multimodal deep neural nets, especially for the vision and language tasks. Several topics on multimodal AI and its applications for generic domains have been covered in this paper, including the significance, datasets, fundamental building blocks of the methods and techniques, challenges, applications, and future trends in this domain

翻訳日:2021-05-19 12:00:27 公開日:2021-05-18

# ワールドワイドロードシーン画像におけるポットホールの自動捕捉学習

Learning to Automatically Catch Potholes in Worldwide Road Scene Images ( http://arxiv.org/abs/2105.07986v2 )

ライセンス: Link先を確認

J. Javier Yebes, David Montero, Ignacio Arriola

(参考訳) 世界中の舗装道路に存在するいくつかの道路の危険の中で、ポットホールは最も厄介なものの1つであり、メンテナンスコストも高い。技術や研究の進展により、これらの危険を自動的に検出することへの関心が高まっている。我々の研究は、現実世界の道路シーンの画像から抜け穴を検出するという課題に取り組みました。主な斬新さは、最新のAIの進歩を応用して、穴の視覚的外観を学ぶことにある。私たちはpotholeアノテーションで画像の大規模なデータセットを構築しました。彼らは、様々な環境条件下で異なるカメラ、車両、視点で撮影された世界中の異なる都市の道路シーンを含んでいた。次に,高速なr-cnnとssd深層ニューラルネットワークに基づく4種類の物体検出モデルを微調整した。車両に埋め込むことができるGPGPU機能を備えたNvidia DrivePX2プラットフォーム上で,高い平均精度を達成し,ポットホール検出器を試験した。さらに、AUTOPILOT H2020プロジェクトの一環として、検出されたポットホールを所定のIoTプラットフォームに通知するために、実際の車両にデプロイされた。

Among several road hazards that are present in any paved way in the world, potholes are one of the most annoying and also involving higher maintenance costs. There exists an increasing interest on the automated detection of these hazards enabled by technological and research progress. Our research work tackled the challenge of pothole detection from images of real world road scenes. The main novelty resides on the application of the latest progress in AI to learn the visual appearance of potholes. We built a large dataset of images with pothole annotations. They contained road scenes from different cities in the world, taken with different cameras, vehicles and viewpoints under varied environmental conditions. Then, we fine-tuned four different object detection models based on Faster R-CNN and SSD deep neural networks. We achieved high average precision and the pothole detector was tested on the Nvidia DrivePX2 platform with GPGPU capability, which can be embedded on vehicles. Moreover, it was deployed on a real vehicle to notify the detected potholes to a given IoT platform as part of AUTOPILOT H2020 project.

翻訳日:2021-05-19 11:14:16 公開日:2021-05-18

# 異常検出におけるadversarial discriminative transferの重要性

Importance Weighted Adversarial Discriminative Transfer for Anomaly Detection ( http://arxiv.org/abs/2105.06649v2 )

ライセンス: Link先を確認

Cangning Fan, Fangyi Zhang, Peng Liu, Xiuyu Sun, Hao Li, Ting Xiao, Wei Zhao, Xianglong Tang

(参考訳) 異常検出のための以前の転送方法は、一般的にソースまたはターゲットドメインのラベル付きデータの可用性を前提としている。しかし、大規模なラベル付きデータが高価すぎる多くの実アプリケーションでは、そのような仮定は有効ではない。そこで本稿では,対象ドメインにラベル付き正規/異常データがなく,関連するソースドメインからの正規データのみが存在するケースにおいて,異常検出知識を教師なしで転送するための重み付き対向オートエンコーダ方式を提案する。具体的には、ソース領域とターゲット領域の両方で正規データの分布を調整することを学習するが、ターゲット領域における異常データの分布は変わらない。このようにして、対象領域内の正常データと異常データの分布との間に明らかなギャップが生じ、ドメイン内の異常検出を可能にする。複数の合成データセットに対する大規模な実験とUCSDベンチマークにより,本手法の有効性が示された。コードはhttps://github.com/fancangning/anomaly_detection_transferで入手できる。

Previous transfer methods for anomaly detection generally assume the availability of labeled data in source or target domains. However, such an assumption is not valid in most real applications where large-scale labeled data are too expensive. Therefore, this paper proposes an importance weighted adversarial autoencoder-based method to transfer anomaly detection knowledge in an unsupervised manner, particularly for a rarely studied scenario where a target domain has no labeled normal/abnormal data while only normal data from a related source domain exist. Specifically, the method learns to align the distributions of normal data in both source and target domains, but leave the distribution of abnormal data in the target domain unchanged. In this way, an obvious gap can be produced between the distributions of normal and abnormal data in the target domain, therefore enabling the anomaly detection in the domain. Extensive experiments on multiple synthetic datasets and the UCSD benchmark demonstrate the effectiveness of our approach. The code is available at https://github.com/fancangning/anomaly_detection_transfer.

翻訳日:2021-05-19 11:14:02 公開日:2021-05-18

# 視覚トランスフォーマーは堅牢な学習者です

Vision Transformers are Robust Learners ( http://arxiv.org/abs/2105.07581v2 )

ライセンス: Link先を確認

Sayak Paul and Pin-Yu Chen

(参考訳) 複数の自己注意層で構成されたトランスフォーマーは、さまざまなデータモダリティに適用可能な汎用的な学習プリミティブに対して、パラメータ効率を向上して最先端のSOTA(State-of-the-art)標準精度を達成するコンピュータビジョンの最近のブレークスルーを含む、強い約束を持っている。セルフアテンションは入力データ内に存在する異なるコンポーネントを体系的に整列させるのに役立つため、モデルロバスト性ベンチマークでその性能を調査する根拠を残している。本研究では,視覚トランスフォーマ (vit) の共通の腐敗や摂動, 分布シフト, 自然逆流に対するロバスト性について検討する。 vitモデルとsoma畳み込みニューラルネットワーク(cnns)の総合的な性能比較を行うために,ロバスト分類に関する6種類の画像ネットデータセットを用いた。 6つの体系的に設計された実験を通して、ViTsが実際により堅牢な学習者である理由を説明するために、定量的および定性的な指標の両方を提供する分析を行う。例えば、より少ないパラメータと類似したデータセットと事前トレーニングの組み合わせで、ViTはImageNet-Aで28.10%の精度を提供する。画像マスキング,フーリエスペクトル感度および離散コサインエネルギースペクトルへの拡散に関する解析により,ViTの強靭性向上に寄与する興味深い性質が明らかになった。実験を再現するためのコードは以下の通りである。

Transformers, composed of multiple self-attention layers, hold strong promises toward a generic learning primitive applicable to different data modalities, including the recent breakthroughs in computer vision achieving state-of-the-art (SOTA) standard accuracy with better parameter efficiency. Since self-attention helps a model systematically align different components present inside the input data, it leaves grounds to investigate its performance under model robustness benchmarks. In this work, we study the robustness of the Vision Transformer (ViT) against common corruptions and perturbations, distribution shifts, and natural adversarial examples. We use six different diverse ImageNet datasets concerning robust classification to conduct a comprehensive performance comparison of ViT models and SOTA convolutional neural networks (CNNs), Big-Transfer. Through a series of six systematically designed experiments, we then present analyses that provide both quantitative and qualitative indications to explain why ViTs are indeed more robust learners. For example, with fewer parameters and similar dataset and pre-training combinations, ViT gives a top-1 accuracy of 28.10% on ImageNet-A which is 4.3x higher than a comparable variant of BiT. Our analyses on image masking, Fourier spectrum sensitivity, and spread on discrete cosine energy spectrum reveal intriguing properties of ViT attributing to improved robustness. Code for reproducing our experiments is available here: https://git.io/J3VO0.

翻訳日:2021-05-19 11:13:46 公開日:2021-05-18

# 高次元クラスタデータ可視化のためのt-SNEの理論基礎

Theoretical Foundations of t-SNE for Visualizing High-Dimensional Clustered Data ( http://arxiv.org/abs/2105.07536v2 )

ライセンス: Link先を確認

T. Tony Cai and Rong Ma

(参考訳) 本研究では,一般的な非線形次元低減・データ可視化手法であるt-distributed stochastic neighbor embedded(t-sne)の理論的基礎について検討する。勾配降下法に基づく t-SNE の解析のための新しい理論的枠組みを提案する。 t-SNEの初期の誇張段階において、基礎となるグラフであるラプラシアンに基づく電力反復に対する漸近的同値性を示し、その制限挙動を特徴づけ、ラプラシアンスペクトルクラスタリングとの深い関係、および暗黙の正則化として早期停止を含む基本原理を明らかにする。結果は,このような計算戦略の固有機構と経験的利点を説明する。 t-SNEの埋め込み段階では, 繰り返しを通して低次元写像の運動特性を特徴づけ, クラスタ間反発と低次元写像の拡張挙動を特徴とする増幅位相を同定する。一般的な理論では、クラスタ化されたデータを視覚化するためのt-SNEの高速収束率と例外的な経験的性能を説明し、t-SNE出力の解釈をもたらし、様々なアプリケーションでチューニングパラメータを選択するための理論的ガイダンスを提供する。

This study investigates the theoretical foundations of t-distributed stochastic neighbor embedding (t-SNE), a popular nonlinear dimension reduction and data visualization method. A novel theoretical framework for the analysis of t-SNE based on the gradient descent approach is presented. For the early exaggeration stage of t-SNE, we show its asymptotic equivalence to a power iteration based on the underlying graph Laplacian, characterize its limiting behavior, and uncover its deep connection to Laplacian spectral clustering, and fundamental principles including early stopping as implicit regularization. The results explain the intrinsic mechanism and the empirical benefits of such a computational strategy. For the embedding stage of t-SNE, we characterize the kinematics of the low-dimensional map throughout the iterations, and identify an amplification phase, featuring the intercluster repulsion and the expansive behavior of the low-dimensional map. The general theory explains the fast convergence rate and the exceptional empirical performance of t-SNE for visualizing clustered data, brings forth the interpretations of the t-SNE output, and provides theoretical guidance for selecting tuning parameters in various applications.

翻訳日:2021-05-19 11:13:22 公開日:2021-05-18

# エビデンス理論における近似エントロピーに基づく基本確率割当て積分の不確かさの測定

Uncertainty Measurement of Basic Probability Assignment Integrity Based on Approximate Entropy in Evidence Theory ( http://arxiv.org/abs/2105.07382v2 )

ライセンス: Link先を確認

Tianxiang Zhan, Yuanpeng He, Hanwen Li, Fuyuan Xiao

(参考訳) 証拠理論は、確率の延長は未知や不正確な情報にうまく対処できるというものである。不確かさの測定は証拠理論と確率理論の両方において重要な役割を果たす。近似エントロピー (ApEn) は、複素系の不規則性を記述するためにピンカスによって提案されている。時系列が不規則であればあるほど、近似エントロピーは大きくなる。ネットワークのApEnは、ネットワークが新しいノードを生成する能力、または未発見ノードの可能性を表す。ネットワーク特性と基本確率割当(BPA)の関連付けにより、完全性に関するBPAの不確実性の尺度を得ることができる。論文の主な貢献は、基本確率割り当ての完全性を定義することであり、BPAの近似エントロピーは、BPAの完全性の不確実性を測定するために提案される。提案手法は,証拠理論におけるBPAの不確実性を計算するための論理ネットワーク構造に基づく。提案手法に基づく不確実性は,BPAの完全性の不確実性を表し,BPAの信頼性の同定に寄与する。

Evidence theory is that the extension of probability can better deal with unknowns and inaccurate information. Uncertainty measurement plays a vital role in both evidence theory and probability theory. Approximate Entropy (ApEn) is proposed by Pincus to describe the irregularities of complex systems. The more irregular the time series, the greater the approximate entropy. The ApEn of the network represents the ability of a network to generate new nodes, or the possibility of undiscovered nodes. Through the association of network characteristics and basic probability assignment (BPA) , a measure of the uncertainty of BPA regarding completeness can be obtained. The main contribution of paper is to define the integrity of the basic probability assignment then the approximate entropy of the BPA is proposed to measure the uncertainty of the integrity of the BPA. The proposed method is based on the logical network structure to calculate the uncertainty of BPA in evidence theory. The uncertainty based on the proposed method represents the uncertainty of integrity of BPA and contributes to the identification of the credibility of BPA.

翻訳日:2021-05-19 11:13:02 公開日:2021-05-18

PDF登録状況（公開日: 20210518）