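Fugu-MT: arXiv Paper Translations

This site provides Japanese translations of arXiv papers that are 30 pages or shorter and released under a Creative Commons license (CC 0, CC BY, CC BY-SA). For papers whose body is not under a CC license, and for papers that are too long, only the metadata is translated. (arXiv metadata is CC 0.) The translations are licensed under CC BY-SA 4.0. Translation is performed with Fugu-Machine Translator.

The operator of this site does not guarantee the quality of this site (including all information and translations) and accepts no responsibility for any consequences arising from the use of this site (including all information and translations).

These are papers with a publication date of 2021-02-01.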

Title	Authors	Abstract	Publication date / Translation date
# Traffic Signal Optimization on a Square Lattice with Quantum Annealing ( http://arxiv.org/abs/2003.07527v2 ) License: see link	Daisuke Inoue, Akihisa Okada, Tadayoshi Matsumori, Kazuyuki Aihara, Hiroaki Yoshida	The spread of intelligent transportation systems in urban cities has caused heavy computational loads, requiring a novel architecture for managing large-scale traffic. In this study, we develop a method for globally controlling traffic signals arranged on a square lattice by means of a quantum annealing machine, namely the D-Wave quantum annealer. We first formulate a signal optimization problem that minimizes the imbalance of traffic flows in two orthogonal directions. Then we reformulate this problem as an Ising Hamiltonian, which is fully compatible with quantum annealers. The new control method is compared with a conventional local control method for a large 50-by-50 city, and the results exhibit the superiority of our global control method in suppressing traffic imbalance over wide parameter ranges. Furthermore, the solutions to the global control method obtained with the quantum annealing machine are better than those obtained with conventional simulated annealing. In addition, we prove analytically that the local and the global control methods converge at the limit where cars have equal probabilities for turning and going straight. These results are verified with numerical experiments.	Translated: 2023-05-28 22:18:58 Published: 2021-02-01
# Security proof of practical quantum key distribution with detection-efficiency mismatch ( http://arxiv.org/abs/2004.04383v2 ) License: see link	Yanbao Zhang, Patrick J. Coles, Adam Winick, Jie Lin, and Norbert Lutkenhaus	Quantum key distribution (QKD) protocols with threshold detectors are driving high-performance QKD demonstrations. The corresponding security proofs usually assume that all physical detectors have the same detection efficiency. However, the efficiencies of the detectors used in practice might show a mismatch depending on the manufacturing and setup of these detectors. A mismatch can also be induced as the different spatial-temporal modes of an incoming signal might couple differently to a detector. Here we develop a method that allows us to provide security proofs without the usual assumption. Our method can take the detection-efficiency mismatch into account without having to restrict the attack strategy of the adversary. In particular, we do not rely on any photon-number cut-off of incoming signals, so that our security proof is directly applicable to practical situations. We illustrate our method for a receiver that is designed for polarization encoding and is sensitive to a number of spatial-temporal modes. In our detector model, the absence of quantum interference between any pair of spatial-temporal modes is assumed. For a QKD protocol with this detector model, we can perform a security proof with characterized efficiency mismatch and without a photon-number cut-off assumption. Our method also shows that in the absence of efficiency mismatch in our detector model, the key rate increases if the loss due to detection inefficiency is assumed to be outside of the adversary's control, as compared to the view where for a security proof this loss is attributed to the action of the adversary.	Translated: 2023-05-25 08:52:35 Published: 2021-02-01
# Efficient Algorithms for Approximating Quantum Partition Functions ( http://arxiv.org/abs/2004.11568v2 ) License: see link	Ryan L. Mann, Tyler Helmuth	We establish a polynomial-time approximation algorithm for partition functions of quantum spin models at high temperature. Our algorithm is based on the quantum cluster expansion of Netočný and Redig and the cluster expansion approach to designing algorithms due to Helmuth, Perkins, and Regts. Similar results have previously been obtained by related methods, and our main contribution is a simple and slightly sharper analysis for the case of pairwise interactions on bounded-degree graphs.	Translated: 2023-05-22 06:22:25 Published: 2021-02-01
# Unextendible product bases, bound entangled states, and the range criterion ( http://arxiv.org/abs/2005.02108v3 ) License: see link	Pratapaditya Bej, Saronath Halder	An unextendible product basis (UPB) is a set of orthogonal product states which span a subspace of a given Hilbert space while the complementary subspace contains no product state. These product bases are useful to produce bound entangled (BE) states. In this work we consider reducible and irreducible UPBs of maximum size, which can produce BE states of minimum rank. From a reducible UPB, it is possible to eliminate one or more states locally, keeping the post-measurement states orthogonal. On the other hand, for an irreducible UPB, the above is not possible. In particular, the UPBs of the present size are important as they might be useful to produce BE states, having ranks of the widest variety, which satisfy the range criterion. Here we talk about such BE states. We also provide other types of BE states and analyze certain properties of the states. Some of the present BE states are associated with the tile structures. Furthermore, we provide different UPBs corresponding to the present BE states of minimum rank and discuss important properties of the UPBs.	Translated: 2023-05-21 03:01:06 Published: 2021-02-01
# Cost-optimal single-qubit gate synthesis in the Clifford hierarchy ( http://arxiv.org/abs/2005.05581v3 ) License: see link	Gary J. Mooney, Charles D. Hill and Lloyd C. L. Hollenberg	For universal quantum computation, a major challenge to overcome for practical implementation is the large amount of resources required for fault-tolerant quantum information processing. An important aspect is implementing arbitrary unitary operators built from logical gates within the quantum error correction code. A synthesis algorithm can be used to approximate any unitary gate up to arbitrary precision by assembling sequences of logical gates chosen from a small set of universal gates that are fault-tolerantly performable while encoded in a quantum error-correction code. However, current procedures do not yet support individual assignment of base gate costs and many do not support extended sets of universal base gates. We analysed cost-optimal sequences using an exhaustive search based on Dijkstra's pathfinding algorithm for the canonical Clifford+$T$ set of base gates and compared them to when additionally including $Z$-rotations from higher orders of the Clifford hierarchy. Two approaches of assigning base gate costs were used. First, costs were reduced to $T$-counts by recursively applying a $Z$-rotation catalyst circuit. Second, costs were assigned as the average numbers of raw (i.e. physical level) magic states required to directly distil and implement the gates fault-tolerantly. We found that the average sequence cost decreases by up to $54\pm 3\%$ when using the $Z$-rotation catalyst circuit approach and by up to $33\pm 2\%$ when using the magic state distillation approach. In addition, we investigated observed limitations of certain assignments of base gate costs by developing an analytic model to estimate the proportion of sets of $Z$-rotation gates from higher orders of the Clifford hierarchy that are found within sequences approximating random target gates.	Translated: 2023-05-20 11:59:50 Published: 2021-02-01
# Entanglement of Local Operators and the Butterfly Effect ( http://arxiv.org/abs/2005.14243v2 ) License: see link	Jonah Kudler-Flam, Masahiro Nozaki, Shinsei Ryu, Mao Tian Tan	We study the robustness of quantum and classical information to perturbations implemented by local operator insertions. We do this by computing multipartite entanglement measures in the Hilbert space of local operators in the Heisenberg picture. The sensitivity to initial conditions that we explore is an illuminating manifestation of the butterfly effect in quantum many-body systems. We derive a "membrane theory" in Haar random unitary circuits to compute the mutual information, logarithmic negativity, and reflected entropy in the local operator state by mapping to a classical statistical mechanics problem and find that any local operator insertion delocalizes information as fast as is allowed by causality. Identical behavior is found for conformal field theories admitting holographic duals where the bulk geometry is described by the eternal black hole with a local object situated at the horizon. In contrast to these maximal scramblers, only an $O(1)$ amount of information is found to be delocalized by local operators in integrable systems such as free fermions and Clifford circuits.	Translated: 2023-05-18 02:40:13 Published: 2021-02-01
# Improving the Security of "Measurement-Device-Independent Quantum Communication without Encryption" ( http://arxiv.org/abs/2006.05263v2 ) License: see link	Nayana Das and Goutam Paul	In 2018, Niu et al. proposed a measurement-device-independent quantum secure direct communication protocol using Einstein-Podolsky-Rosen pairs and generalized it to a quantum dialogue protocol (Niu et al., Science Bulletin 63.20, 2018). By analyzing these protocols we find some security issues in both of them. In this work, we show that neither protocol is secure against information leakage, and a third party can get half of the secret information without any active attack. We also propose suitable modifications of these protocols to improve the security.	Translated: 2023-05-16 04:57:19 Published: 2021-02-01
# Experimental Quantum Communication Enhancement by Superposing Trajectories ( http://arxiv.org/abs/2007.05005v2 ) License: see link	Giulia Rubino, Lee A. Rozema, Daniel Ebler, Hlér Kristjánsson, Sina Salek, Philippe Allard Guérin, Alastair A. Abbott, Cyril Branciard, Časlav Brukner, Giulio Chiribella, Philip Walther	In quantum communication networks, wires represent well-defined trajectories along which quantum systems are transmitted. In spite of this, trajectories can be used as a quantum control to govern the order of different noisy communication channels, and such a control has been shown to enable the transmission of information even when quantum communication protocols through well-defined trajectories fail. This result has motivated further investigations on the role of the superposition of trajectories in enhancing communication, which revealed that the use of quantum control of parallel communication channels, or of channels in series with quantum-controlled operations, can also lead to communication advantages. Building upon these findings, here we experimentally and numerically compare different ways in which two trajectories through a pair of noisy channels can be superposed. We observe that, within the framework of quantum interferometry, the use of channels in series with quantum-controlled operations generally yields the largest advantages. Our results help clarify the nature of these advantages in experimental quantum-optical scenarios, and showcase the benefit of an extension of the quantum communication paradigm in which both the information exchanged and the trajectory of the information carriers are quantum.	Translated: 2023-05-10 21:17:01 Published: 2021-02-01
# Entanglement on curved hypersurfaces: A field-discretizer approach ( http://arxiv.org/abs/2007.09657v3 ) License: see link	Tal Schwartzman and Benni Reznik (School of Physics and Astronomy, Tel-Aviv University, Tel Aviv, Israel)	We propose a covariant scheme for measuring entanglement on general hypersurfaces in relativistic quantum field theory. For that, we introduce an auxiliary relativistic field, 'the discretizer', that by locally interacting with the field along a hypersurface, fully swaps the field's and discretizer's states. It is shown that the discretizer can be used to effectively cut off the field's infinities, in a covariant fashion, and without having to introduce a spatial lattice. This, in turn, provides us with an efficient way to evaluate entanglement between arbitrary regions on any hypersurface. As examples, we study the entanglement between complementary and separated regions in 1+1 dimensions, for flat hypersurfaces in Minkowski space, for curved hypersurfaces in Milne space, and for regions on hypersurfaces approaching null-surfaces. Our results show that the entanglement between regions on arbitrary hypersurfaces in 1+1 dimensions depends only on the space-time endpoints of the regions, and not on the shape of the interior. Our results corroborate and extend previous results for flat hypersurfaces.	Translated: 2023-05-09 01:16:31 Published: 2021-02-01
# The Entropic Dynamics of Spin ( http://arxiv.org/abs/2007.15719v2 ) License: see link	Ariel Caticha and Nicholas Carrara	In the Entropic Dynamics (ED) approach the essence of quantum theory lies in its probabilistic nature while the Hilbert space structure plays a secondary and ultimately optional role. The dynamics of probability distributions is driven by the maximization of an entropy subject to constraints that carry the relevant physical information -- directionality, correlations, gauge interactions, etc. The challenge is to identify those constraints and to establish a criterion for how the constraints themselves are updated. In this paper the ED framework is extended to describe a spin-1/2 point particle. In ED spin is neither modelled as a rotating body, nor through the motion of a point particle; it is an epistemic property of the wave function. The constraint that reflects the peculiar rotational properties of spin is most effectively expressed in the language of geometric algebra. The updating of all constraints is carried out in a way that stresses the central importance of symmetry principles. First we identify the appropriate symplectic and metric structures in the phase space of probabilities, their conjugate momenta, and the spin variables. This construction yields a derivation of the Fubini-Study metric for a spin-1/2 particle which highlights its deep connection to information geometry. Then we construct an ED that preserves both the symplectic structure (a Hamiltonian flow) and the metric structure (a Killing flow). We show that generic Hamiltonian-Killing flows are linear in the wave function. Imposing further that the Hamiltonian be the generator of an entropic evolution in time leads to an entropic dynamics described by the Pauli equation. We conclude with a discussion of the new interpretation of the formalism which yields a physical picture that is significantly different from that provided by other interpretations.	Translated: 2023-05-07 18:12:03 Published: 2021-02-01
# Entanglement view of dynamical quantum phase transitions ( http://arxiv.org/abs/2008.04894v2 ) License: see link	Stefano De Nicola, Alexios A. Michailidis, Maksym Serbyn	The analogy between an equilibrium partition function and the return probability in many-body unitary dynamics has led to the concept of dynamical quantum phase transition (DQPT). DQPTs are defined by non-analyticities in the return amplitude and are present in many models. In some cases DQPTs can be related to equilibrium concepts such as order parameters, yet their universal description is an open question. In this work we provide first steps towards a classification of DQPTs by using a matrix product state description of unitary dynamics in the thermodynamic limit. This allows us to distinguish the two limiting cases of precession and entanglement DQPTs, which are illustrated using an analytical description in the quantum Ising model. While precession DQPTs are characterized by a large entanglement gap and are semiclassical in their nature, entanglement DQPTs occur near avoided crossings in the entanglement spectrum and can be distinguished by a complex pattern of non-local correlations. We demonstrate the existence of precession and entanglement DQPTs beyond the Ising model, discuss observables that can distinguish them, and relate their interplay to complex DQPT phenomenology.	Translated: 2023-05-06 13:50:14 Published: 2021-02-01
# Generalized hydrodynamics study of the one-dimensional Hubbard model: Stationary clogging and proportionality of spin, charge, and energy currents ( http://arxiv.org/abs/2008.06522v2 ) License: see link	Yuji Nozawa, Hirokazu Tsunetsugu	In our previous work [Y. Nozawa and H. Tsunetsugu, Phys. Rev. B 101, 035121 (2020)], we studied quench dynamics in the one-dimensional Hubbard model based on the generalized hydrodynamics theory for a partitioning protocol and showed the presence of a clogging phenomenon. Clogging is a phenomenon in which vanishing charge current coexists with nonzero energy current, and was found when the protocol uses the initial condition that the left half of the system is prepared to be half filling at high temperatures with the right half being empty. Clogging occurs at all the sites in the left half and lasts for a time proportional to its distance from the connection point. In this paper, we use various different initial conditions and discuss two issues. The first issue is the possibility of clogging in a stationary state. When the electron density in the right half is initially set nonzero, we found that the left half-filled part expands for various sets of parameters in the initial condition. This means that the clogging phenomenon occurs at all the sites in the long-time stationary state, and we also discuss its origin. In addition, stationary clogging is accompanied by a back current, namely, particle density current flows towards the high-density region. We also found that spin clogging occurs for some initial conditions, i.e., the vanishing spin current coexists with nonzero energy current. The second issue is the proportionality of spin and charge currents. We have found two spatio-temporal regions where the current ratio is fixed to a nonzero constant. We numerically studied how the current ratio depends on various initial conditions. We also studied the ratio of charge and energy currents.	Translated: 2023-05-06 06:51:24 Published: 2021-02-01
# Real-time simulation of (2+1)-dimensional lattice gauge theory on qubits ( http://arxiv.org/abs/2008.11395v3 ) License: see link	Arata Yamamoto	We study the quantum simulation of Z2 lattice gauge theory in 2+1 dimensions. The dual variable formulation, the so-called Wegner duality, is utilized for reducing redundant gauge degrees of freedom. The problem of artificial charge unconservation is resolved for any charge distribution. As a demonstration, we simulate the real-time evolution of the system with two static electric charges, i.e., with two temporal Wilson lines. Some results obtained by the simulator (with no hardware noise) and the real device (with sizable hardware noise) of a quantum computer are shown.	Translated: 2023-05-04 21:49:33 Published: 2021-02-01
# Novel formulation of Hamilton-Jacobi equation for higher derivative theory and quantum mechanical correspondence ( http://arxiv.org/abs/2009.03200v2 ) License: see link	Zhi-Qiang Guo	For higher derivative theories, using the approach of Caratheodory's equivalent Lagrangian, we show that there exist novel formulations of Hamilton-Jacobi equations, which are different from the formulations derived from Hamilton's canonical approach. The quantum mechanical correspondences of these novel Hamilton-Jacobi equations lead to nonlinear quantum mechanics, which seems able to avoid the unbounded negative energy problem in the quantum mechanics of higher derivative theories.	Translated: 2023-05-03 07:23:56 Published: 2021-02-01
# Near- and long-term quantum algorithmic approaches for vibrational spectroscopy ( http://arxiv.org/abs/2009.05066v2 ) License: see link	Nicolas P. D. Sawaya, Francesco Paesani, Daniel P. Tabor	Determining the vibrational structure of a molecule is central to fundamental applications in several areas, from atmospheric science to catalysis, fuel combustion modeling, biochemical imaging, and astrochemistry. However, when significant anharmonicity and mode coupling are present, the problem is classically intractable for a molecule of just a few atoms. Here, we outline a set of quantum algorithms for solving the molecular vibrational structure problem for both near- and long-term quantum computers. There are previously unaddressed characteristics of this problem which require approaches distinct from most instances of the commonly studied quantum simulation of electronic structure: many eigenstates are often desired, states of interest are often far from the ground state (requiring methods for "zooming in" to some energy window), and transition amplitudes with respect to a non-unitary Hermitian operator must be calculated. We address these hurdles and consider problem instances of four molecular vibrational Hamiltonians. Finally and most importantly, we give analytical and numerical results which suggest that, to a given energy precision, a vibrational problem instance will be simulatable on a quantum computer before an electronic structure problem instance. These results imply that more focus in the quantum information community ought to shift toward scientifically and industrially important quantum vibrational problems.	Translated: 2023-05-03 00:39:13 Published: 2021-02-01
# Enhancement of quantum synchronization via continuous measurement and feedback control ( http://arxiv.org/abs/2009.05468v2 ) License: see link	Yuzuru Kato, Hiroya Nakao	We study synchronization of a quantum van der Pol oscillator with a harmonic drive and demonstrate that quantum synchronization can be enhanced by performing continuous homodyne measurement on an additional bath linearly coupled to the oscillator and applying feedback control to the oscillator. The phase coherence of the oscillator is increased by reducing quantum fluctuations via the continuous measurement, whereas the measurement backaction inevitably induces fluctuations around the phase-locking point. We propose a simple feedback policy for suppressing measurement-induced fluctuations by adjusting the frequency of the harmonic drive, which results in enhancement of quantum synchronization. We further demonstrate that the maximum enhancement of quantum synchronization is achieved by performing quantum measurement on the quadrature angle at which the phase diffusion of the oscillator is the largest and the maximal information of the oscillator phase is extracted.	Translated: 2023-05-02 22:29:16 Published: 2021-02-01
# Optimal State Transfer and Entanglement Generation in Power-law Interacting Systems ( http://arxiv.org/abs/2010.02930v2 ) License: see link	Minh C. Tran, Abhinav Deshpande, Andrew Y. Guo, Andrew Lucas, Alexey V. Gorshkov	We present an optimal protocol for encoding an unknown qubit state into a multiqubit Greenberger-Horne-Zeilinger-like state and, consequently, transferring quantum information in large systems exhibiting power-law ($1/r^\alpha$) interactions. For all power-law exponents $\alpha$ between $d$ and $2d+1$, where $d$ is the dimension of the system, the protocol yields a polynomial speedup for $\alpha>2d$ and a superpolynomial speedup for $\alpha\leq 2d$, compared to the state of the art. For all $\alpha>d$, the protocol saturates the Lieb-Robinson bounds (up to subpolynomial corrections), thereby establishing the optimality of the protocol and the tightness of the bounds in this regime. The protocol has a wide range of applications, including in quantum sensing, quantum computing, and preparation of topologically ordered states. In addition, the protocol provides a lower bound on the gate count in digital simulations of power-law interacting systems.	Translated: 2023-04-29 20:14:17 Published: 2021-02-01
# Apps Against the Spread: Privacy Implications and User Acceptance of COVID-19-Related Smartphone Apps on Three Continents ( http://arxiv.org/abs/2010.14245v2 ) License: see link	Christine Utz, Steffen Becker, Theodor Schnitzler, Florian M. Farke, Franziska Herbert, Leonie Schaewitz, Martin Degeling, Markus Dürmuth	The COVID-19 pandemic has fueled the development of smartphone applications to assist disease management. Many "corona apps" require widespread adoption to be effective, which has sparked public debates about the privacy, security, and societal implications of government-backed health applications. We conducted a representative online study in Germany (n = 1,003), the US (n = 1,003), and China (n = 1,019) to investigate user acceptance of corona apps, using a vignette design based on the contextual integrity framework. We explored apps for contact tracing, symptom checks, quarantine enforcement, health certificates, and mere information. Our results provide insights into data processing practices that foster adoption and reveal significant differences between countries, with user acceptance being highest in China and lowest in the US. Chinese participants prefer the collection of personalized data, while German and US participants favor anonymity. Across countries, contact tracing is viewed more positively than quarantine enforcement, and technical malfunctions negatively impact user acceptance.	Translated: 2023-04-27 08:41:23 Published: 2021-02-01
# 非多重性自由群に対する文字ランダム化ベンチマークと部分空間,リーク,マッチゲートランダム化ベンチマークへの応用 Character randomized benchmarking for non-multiplicity-free groups with applications to subspace, leakage, and matchgate randomized benchmarking ( http://arxiv.org/abs/2011.00007v2 ) ライセンス: Link先を確認	Jahan Claes, Eleanor Rieffel, Zhihui Wang	(参考訳) ランダム化ベンチマーク(RB)は実験量子ゲートの誤差率を決定する強力な手法である。しかし、伝統的なRBは、クリフォード群(Clifford group)のようなゲートセットに制限されている。最近導入されたキャラクタRBは、表現理論の技法を用いてより一般的なゲートをベンチマークすることができるが、この手法は「多重性のない」グループにしか適用されていない。本稿では,非多重性自由群を明示的に扱うために,原文字RBの導出を拡張し,いくつかの応用を導出する。まず、最近導入された部分空間RBの厳密なバージョンを導出し、SWAPの下で対称な1ビットと2ビットのゲートの集合を特徴付ける。次に,より一般的なゲート群に適用可能な新しいリークrbプロトコルを開発した。最後に、マッチゲート群に対するスケーラブルなRBプロトコルを導出するが、クリフォード群のような群はユニバーサルではないが、1つの追加ゲートを追加することで普遍となる。この例は、スケーラブルな非クリフォードRBプロトコルの数少ない例の1つである。これら3つの場合において、既存の理論と比較して、我々の手法は類似の資源を必要とするが、より正確なゲート忠実度推定を提供するか、より一般的なゲート群に適用する。結論として,マルチプライシティフリーキャラクタrbを用いてスケーラブルなrbプロトコルと特定のゲートを特徴付ける手法の新しいクラスを開発する可能性,課題について考察する。 Randomized benchmarking (RB) is a powerful method for determining the error rate of experimental quantum gates. Traditional RB, however, is restricted to gatesets, such as the Clifford group, that form a unitary 2-design. The recently introduced character RB can benchmark more general gates using techniques from representation theory; up to now, however, this method has only been applied to "multiplicity-free" groups, a mathematical restriction on these groups. In this paper, we extend the original character RB derivation to explicitly treat non-multiplicity-free groups, and derive several applications. First, we derive a rigorous version of the recently introduced subspace RB, which seeks to characterize a set of one- and two-qubit gates that are symmetric under SWAP. Second, we develop a new leakage RB protocol that applies to more general groups of gates. Finally, we derive a scalable RB protocol for the matchgate group, a group that like the Clifford group is non-universal but becomes universal with the addition of one additional gate. 
This example provides one of the few examples of a scalable non-Clifford RB protocol. In all three cases, compared to existing theories, our method requires similar resources, but either provides a more accurate estimate of gate fidelity, or applies to a more general group of gates. In conclusion, we discuss the potential, and challenges, of using non-multiplicity-free character RB to develop new classes of scalable RB protocols and methods of characterizing specific gates.	翻訳日:2023-04-26 07:42:02 公開日:2021-02-01
# パラメトリックアンプキャビティ内の原子への同型性によるJaynes-Cummings-Rabiモデルのスペクトルの探索 Probing the spectrum of the Jaynes-Cummings-Rabi model by its isomorphism to an atom inside a parametric amplifier cavity ( http://arxiv.org/abs/2011.04143v2 ) ライセンス: Link先を確認	R. Gutiérrez-Jáuregui and G. S. Agarwal	(参考訳) キャビティ量子電磁力学のJaynes-Cummings-Rabiモデルが、パラメトリック増幅器キャビティ内の量子ビットのハミルトニアンへの同型によってどのように実現されるかを示す。この実現により、量子ビットと閾値以下で動作するパラメトリック発振器を含むパラメトリックアンプキャビティにプローブを印加することで、Rabiモデルの完全なスペクトルを観測する道が開かれる。同型の重要な帰結は、実際の周波数がデチューニングに置き換えられ、超強結合領域への到達が可能になることである。この領域では、プローブされたスペクトルが、基底状態と第一励起状態の間の遷移に由来する狭い共鳴ピークを示すことがわかった。これらの状態の厳密な形はエネルギー交差点で与えられ、その後数値的に拡張される。交差点では、固有状態は場と原子のエンタングル状態であり、場はスクイーズド猫状態にある。 We show how the Jaynes--Cummings--Rabi model of cavity quantum electrodynamics can be realized via an isomorphism to the Hamiltonian of a qubit inside a parametric amplifier cavity. This realization clears the way to observe the full spectrum of the Rabi model via a probe applied to a parametric amplifier cavity containing a qubit and a parametric oscillator operating below threshold. An important outcome of the isomorphism is that the actual frequencies are replaced by detunings which make it feasible to reach the ultra-strong coupling regime. We find that inside this regime the probed spectrum displays a narrow resonance peak that is traced back to the transition between ground and first excited states. The exact form of these states is given at an energy crossing and then extended numerically. At the crossing, the eigenstates are entangled states of field and atom where the field is found inside squeezed cat states.	翻訳日:2023-04-24 21:33:48 公開日:2021-02-01
# フォトニック結晶キャビティの1つのホール内に配置した人工原子に基づくハイブリッド量子フォトニクス Hybrid quantum photonics based on artificial atoms placed inside one hole of a photonic crystal cavity ( http://arxiv.org/abs/2012.11503v3 ) ライセンス: Link先を確認	Konstantin G. Fehler, Lukas Antoniuk, Niklas Lettner, Anna P. Ovvyan, Richard Waltrich, Nico Gruhler, Valery A. Davydov, Viatcheslav N. Agafonov, Wolfram H. P. Pernice, Alexander Kubanek	(参考訳) スピンベースの量子フォトニクスは、分散量子コンピューティングと量子ネットワークを実現することを約束する。性能は効率のよいエンタングルメント分配に依存し、その効率は共振器量子電磁力学によって高めることができる。中心的な課題は、大きなスピン光子結合率と高い動作帯域を持つコンパクトデバイスの開発である。フォトニック結晶キャビティは強い場の閉じ込めを実現するが、モード場の最大値における原子系の正確な位置決めに高い要求を課す。ダイヤモンドのカラーセンター、特に負電荷のシリコン空孔中心は有望な原子様系として現れた。大きなスペクトル安定性と長寿命の核スピンメモリへのアクセスにより、メモリ強化量子通信を含む量子ネットワークノードの初等的な実証が可能となった。ハイブリッドアプローチでは、SiV$^-$含有ナノダイヤモンドを1次元の自立型Si$_3$N$_4$系フォトニック結晶キャビティの1つのホール内に決定論的に配置し、個々の光学遷移をキャビティモードにコヒーレントに結合する。我々は2モード合成、導波、パーセル増強、共振器共鳴チューニングを利用して光物質結合を最適化する。結果として生じる光子フラックスは、自由空間に比べて14倍以上増加する。これに対応して寿命が460 ps以下に短縮されることで、潜在的な動作帯域幅はGHzレートを超える。本研究は、ナノダイヤモンド中のSiV$^-$中心を用いたハイブリッド量子フォトニクスに基づく量子ネットワークノードの実現に向けた重要なステップを示す。 Spin-based quantum photonics promise to realize distributed quantum computing and quantum networks. The performance depends on efficient entanglement distribution, where the efficiency can be boosted by means of cavity quantum electrodynamics. The central challenge is the development of compact devices with large spin-photon coupling rates and high operation bandwidth. Photonic crystal cavities comprise strong field confinement but put high demands on accurate positioning of an atomic system in the mode field maximum. Color centers in diamond, and in particular the negatively-charged Silicon-Vacancy center, emerged as promising atom-like systems. Large spectral stability and access to long-lived, nuclear spin memories enabled elementary demonstrations of quantum network nodes including memory-enhanced quantum communication. In a hybrid approach, we deterministically place SiV$^-$-containing nanodiamonds inside one hole of a one-dimensional, free-standing, Si$_3$N$_4$-based photonic crystal cavity and coherently couple individual optical transitions to the cavity mode. We optimize the light-matter coupling by utilizing two-mode composition, waveguiding, Purcell-enhancement and cavity resonance tuning. The resulting photon flux is increased by more than a factor of 14 as compared to free-space. The corresponding lifetime shortening to below 460 ps puts the potential operation bandwidth beyond GHz rates. Our results mark an important step to realize quantum network nodes based on hybrid quantum photonics with SiV$^-$ centers in nanodiamonds.	翻訳日:2023-04-20 00:17:18 公開日:2021-02-01
# 実験データによる位相相転移の教師なし機械学習 Unsupervised machine learning of topological phase transitions from experimental data ( http://arxiv.org/abs/2101.05712v2 ) ライセンス: Link先を確認	Niklas K\"aming, Anna Dawid, Korbinian Kottmann, Maciej Lewenstein, Klaus Sengstock, Alexandre Dauphin, Christof Weitenberg	(参考訳) 相転移の同定は、量子多体物理学における重要な課題の1つである。近年、機械学習手法は、ノイズや不完全なデータから、順序パラメータの知識がなくても位相境界をローカライズする代替手法であることが示されている。ここでは,超低温原子からの実験データに対して異常検出や影響関数を含む,教師なしの機械学習手法を適用する。このようにして、Haldaneモデルの位相位相図は、完全に偏りのない方法で得られる。本研究では, 有限温度実験データとFloquet システムのデータに対して, 単一マイクロモーション位相に後処理した場合に適用可能であることを示す。我々の研究は、複雑な多体系における新しいエキゾチック位相の教師なし検出のためのベンチマークを提供する。 Identifying phase transitions is one of the key challenges in quantum many-body physics. Recently, machine learning methods have been shown to be an alternative way of localising phase boundaries also from noisy and imperfect data and without the knowledge of the order parameter. Here we apply different unsupervised machine learning techniques including anomaly detection and influence functions to experimental data from ultracold atoms. In this way we obtain the topological phase diagram of the Haldane model in a completely unbiased fashion. We show that the methods can successfully be applied to experimental data at finite temperature and to data of Floquet systems, when postprocessing the data to a single micromotion phase. Our work provides a benchmark for unsupervised detection of new exotic phases in complex many-body systems.	翻訳日:2023-04-15 05:05:20 公開日:2021-02-01
# 非可換調和振動子とランダウ問題の分析スペクトルの同型 Isomorphism of Analytical Spectrum between Noncommutative Harmonic Oscillator and Landau Problem ( http://arxiv.org/abs/2101.05929v2 ) ライセンス: Link先を確認	M.N. Nazmi M. Rusli, Nurisya M. Shah, Hishamuddin Zainuddin and Chan Kar Tim	(参考訳) 非可換等方振動子のハミルトニアンとランダウ問題の比較は、これらの2つのモデルが区別できない特定の条件を研究するために分析される。対称および2つのランダウゲージにおけるランダウ問題のエネルギー固有値と固有状態を解析的に評価する。非可換等方調和振動子のハミルトニアンは、可換座標空間におけるboppシフトを用いて得られる。その結果、2つの系は、両ゲージの選択に対して$n_{r}$と$m_{l}$と$qb = eb > 0$の類似の値に同型であることが示された。しかし、非可換発振器が空間的自由度を1つ失わなければならないランダウゲージにはさらなる要件がある。また、ハミルトン群が互いに整合性を持つためには、因子$\zeta$でパラメタ化する必要がある。次に波動関数と確率密度関数をプロットし、出現する振る舞いを説明する。最後に、非可換性または磁場が固有状態および同型系の確率分布に及ぼす影響を示す。 The comparison of the Hamiltonians of the noncommutative isotropic harmonic oscillator and Landau problem are analysed to study the specific conditions under which these two models are indistinguishable. The energy eigenvalues and eigenstates of Landau problem in symmetric and two Landau gauges are evaluated analytically. The Hamiltonian of a noncommutative isotropic harmonic oscillator is found by using Bopp's shift in commutative coordinate space. The result shows that the two systems are isomorphic up to the similar values of $n_{r}$ and $m_{l}$ and $qB = eB > 0$ for both gauge choices. However, there is an additional requirement for Landau gauge where the noncommutative oscillator has to lose one spatial degree of freedom. It also needs to be parametrized by a factor $\zeta$ for their Hamiltonians to be consistent with each other. The wavefunctions and probability density functions are then plotted and the behaviour that emerges is explained. Finally, the effects of noncommutativity or magnetic field on the eigenstates and their probability distribution of the isomorphic system are shown.	翻訳日:2023-04-15 03:08:26 公開日:2021-02-01
# デジタル量子コンピュータによる量子材料シミュレーション Simulating Quantum Materials with Digital Quantum Computers ( http://arxiv.org/abs/2101.08836v2 ) ライセンス: Link先を確認	Lindsay Bassman, Miroslav Urbanek, Mekena Metcalf, Jonathan Carter, Alexander F. Kemper, Wibe de Jong	(参考訳) 量子材料は幅広いエキゾチックな現象と実用的な性質を示す。これらの材料をより深く理解することで、量子領域の基本物理学に関する深い洞察と、エンターテイメント、医療、持続可能性のための先進技術を提供することができる。デジタル量子コンピュータ(DQC)の出現は、古典的コンピュータでは引き起こせない量子シミュレーションを効率的に行うことができ、量子物質の顕著で直感に反する振る舞いをテストし分析するための、有望な道筋を提供する。これらの新しいツールを備えた多様な領域の科学者は、物理量子の優位性(量子コンピュータを使って、どんな古典的コンピュータでも実行できない計算で新しい物理学を学ぶ)を達成するために競い合っている。したがって、このレビューの目的は、物理科学の科学者がアクセス可能なこの目標に向けての進捗の概要を提供することである。まず、利用可能な技術とアルゴリズムをレビューし、量子コンピュータ上で材料を表現する無数の方法を詳細に説明する。次に、現在利用可能なDQCで成功したシミュレーションを紹介し、この初期段階の技術で研究できる静的および動的特性の多様性を強調します。最後に、材料問題をDQCにマッピングする方法の2つの例を紹介します。このレビューは、ドメインエキスパートの分野における進歩の組織的な概要と、DQCの量子材料に関する独自のシミュレーションの開始に関心のある分野の科学者へのアクセシビリティな紹介として役立てられることを願っている。 Quantum materials exhibit a wide array of exotic phenomena and practically useful properties. A better understanding of these materials can provide deeper insights into fundamental physics in the quantum realm as well as advance technology for entertainment, healthcare, and sustainability. The emergence of digital quantum computers (DQCs), which can efficiently perform quantum simulations that are otherwise intractable on classical computers, provides a promising path forward for testing and analyzing the remarkable, and often counter-intuitive, behavior of quantum materials. Equipped with these new tools, scientists from diverse domains are racing towards achieving physical quantum advantage (i.e., using a quantum computer to learn new physics with a computation that cannot feasibly be run on any classical computer). The aim of this review, therefore, is to provide a summary of progress made towards this goal that is accessible to scientists across the physical sciences. We will first review the available technology and algorithms, and detail the myriad ways to represent materials on quantum computers. 
Next, we will showcase the simulations that have been successfully performed on currently available DQCs, emphasizing the variety of properties, both static and dynamic, that can be studied with this nascent technology. Finally, we work through two examples of how to map a materials problem onto a DQC, with full code included in the Supplementary Material. It is our hope that this review can serve as an organized overview of progress in the field for domain experts and an accessible introduction to scientists in related fields interested in beginning to perform their own simulations of quantum materials on DQCs.	翻訳日:2023-04-14 08:30:00 公開日:2021-02-01
# 円偏光レーザー場の存在下での弾性電子-陽子散乱 Elastic electron-proton scattering in the presence of a circularly polarized laser field ( http://arxiv.org/abs/2102.00722v1 ) ライセンス: Link先を確認	I Dahiri, M Jakha, S Mouslih, B Manaut and S Taj	(参考訳) 近年のレーザー技術の進歩により、非常に強力なレーザー場における基本的なレーザー支援過程の研究が重要になっている。本研究では、レーザー支援量子電磁力学(QED)の枠組みにおいて、円偏光の強い電磁場の存在下での電子-陽子散乱を考察した。まず、陽子のドレッシングは考慮せず、電子の相対論的ドレッシングのみを考慮に入れた過程について考察する。次に、陽子ドレッシングの効果を探求するために、電子と陽子の両方の相対論的ドレッシングを完全に考慮し、ディラック・ヴォルコフ関数を用いてそれらを記述する。両方の場合における微分断面積 (DCS) の解析式は摂動理論の最低次で導かれる。その結果、レーザー場によりDCSが顕著に減少する。陽子ドレッシングの効果は$10^{10}~\text{V/cm}$以上のレーザー場強度で現れ始め、従って考慮する必要があることがわかった。レーザー場の強度と周波数がDCSに及ぼす影響を報告する。モット散乱およびレーザーのない場合の結果との比較も含む。 Owing to recent advances in laser technology, it has become important to investigate fundamental laser-assisted processes in very powerful laser fields. In the present work and within the framework of laser-assisted quantum electrodynamics (QED), electron-proton scattering was considered in the presence of a strong electromagnetic field of circular polarization. First, we present a study of the process where we only take into account the relativistic dressing of the electron without the proton. Then, in order to explore the effect of the proton dressing, we fully consider the relativistic dressing of the electron and the proton together and describe them by using Dirac-Volkov functions. The analytical expression for the differential cross section (DCS) in both cases is derived at lowest-order of perturbation theory. As a result, the DCS is notably reduced by the laser field. It is found that the effect of proton dressing begins to appear at laser field strengths greater than or equal to $10^{10}~\text{V/cm}$ and it therefore must be taken into account. The influence of the laser field strength and frequency on the DCS is reported. A comparison with the Mott scattering and the laser-free results is also included.	翻訳日:2023-04-13 03:14:50 公開日:2021-02-01
# 量子暗号経済: 量子技術の進化のためのブロックチェーン予測市場 Quantum crypto-economics: Blockchain prediction markets for the evolution of quantum technology ( http://arxiv.org/abs/2102.00659v1 ) ライセンス: Link先を確認	Peter P. Rohde, Vijay Mohan, Sinclair Davidson, Chris Berg, Darcy Allen, Gavin K. Brennen, Jason Potts	(参考訳) 現在進行中の最も重要な技術進歩の2つは、量子技術の出現と、グローバル金融システムの暗号資産への移行、特にブロックチェーンベースの暗号通貨とスマートコントラクトである。しかし、いずれにせよ、量子技術はブロックチェーンの暗号基盤を直接侵害する能力を持つので、両者の間には重要な相互作用がある。我々は、量子リスクプレミアムの価格を含む様々なシナリオで、量子障害の金融モデルを構築することで、この複雑な相互作用を探求する。これを量子暗号経済と呼ぶ。 Two of the most important technological advancements currently underway are the advent of quantum technologies, and the transitioning of global financial systems towards cryptographic assets, notably blockchain-based cryptocurrencies and smart contracts. There is, however, an important interplay between the two, given that, in due course, quantum technology will have the ability to directly compromise the cryptographic foundations of blockchain. We explore this complex interplay by building financial models for quantum failure in various scenarios, including pricing quantum risk premiums. We call this quantum crypto-economics.	翻訳日:2023-04-13 03:14:08 公開日:2021-02-01
# マヨラナフェルミオンゲートを用いたフェルミオン系の量子演算とプロセストモグラフィー Quantum operation of fermionic systems and process tomography using Majorana fermion gates ( http://arxiv.org/abs/2102.00620v1 ) ライセンス: Link先を確認	Gang Zhang, Mingxia Huo and Ying Li	(参考訳) 量子トモグラフィーは、量子演算のキャラクタリゼーションにとって重要なツールである。本稿では,フェルミオン系における量子トモグラフィーの枠組みについて述べる。量子ビット系と比較すると、フェルミオンは超選択則に従い、これがフェルミオン系の状態、過程、測定に制約を課す。その結果、フェルミオンモードの部分集合に作用する操作は部分的にしか再構築できず、完全な再構成には部分集合に加えて少なくとも1つの補助フェルミオンモードが常に必要となる。また,マヨラナフェルミオン量子コンピュータにおけるゲートに基づく完全再構成のためのプロトコルを報告する。これには情報完全な状態準備と測定を実現するための一連の回路が含まれる。 Quantum tomography is an important tool for the characterisation of quantum operations. In this paper, we present a framework of quantum tomography in fermionic systems. Compared with qubit systems, fermions obey the superselection rule, which sets constraints on states, processes and measurements in a fermionic system. As a result, we can only partly reconstruct an operation that acts on a subset of fermion modes, and the full reconstruction always requires at least one ancillary fermion mode in addition to the subset. We also report a protocol for the full reconstruction based on gates in Majorana fermion quantum computer, including a set of circuits for realising the informationally-complete state preparation and measurement.	翻訳日:2023-04-13 03:13:30 公開日:2021-02-01
# 3レベル設定を超えた非断熱ホロノミック量子計算の実現 Realizing nonadiabatic holonomic quantum computation beyond the three-level setting ( http://arxiv.org/abs/2102.00603v1 ) ライセンス: Link先を確認	G. F. Xu, P. Z. Zhao, Erik Sjöqvist, D. M. Tong	(参考訳) 非断熱ホロノミック量子計算(NHQC)は、誤差耐性ゲートを実装する方法を提供し、近年注目されている。提案されて以来、3レベルの$\Lambda$型システムがNHQCの典型的なビルディングブロックとなり、これらのシステムに基づいて多くのNHQCスキームが開発されている。本稿では,標準的な3レベル設定を超えたNHQCの実現について検討する。我々の提案の中心となる考え方は、ビルディングブロックシステムのヒルベルト空間を拡大し、純粋にホロノミックな進化を保証するために二部グラフ構造を持たせることで、NHQCを改善することである。提案手法は,従来のキュービットベースのNHQCの所要時間を効率よく短縮するだけでなく,quditベースのNHQCの実装も提供する。したがって本提案は、効率の良い量子情報プロセッサの物理的実現に大きく貢献し得るNHQCのさらなる発展を与える。 Nonadiabatic holonomic quantum computation (NHQC) provides a method to implement error resilient gates and has attracted considerable attention recently. Since it was proposed, three-level $\Lambda$ systems have become the typical building block for NHQC and a number of NHQC schemes have been developed based on such systems. In this paper, we investigate the realization of NHQC beyond the standard three-level setting. The central idea of our proposal is to improve NHQC by enlarging the Hilbert space of the building block system and letting it have a bipartite graph structure in order to ensure purely holonomic evolution. Our proposal not only improves conventional qubit-based NHQC by efficiently reducing its duration, but also provides implementations of qudit-based NHQC. Therefore, our proposal provides a further development of NHQC that can contribute significantly to the physical realization of efficient quantum information processors.	翻訳日:2023-04-13 03:12:57 公開日:2021-02-01
# 2レベル系によるベリー曲率による量子状態進化の追跡 Tracking quantum state evolution by the Berry curvature with a two-level system ( http://arxiv.org/abs/2102.00808v1 ) ライセンス: Link先を確認	Ze-Lin Zhang, Ping Xu, Zhen-Biao Yang	(参考訳) 駆動する2レベル系のハミルトニアンの制御パラメータにまたがる2種類の位相構造(球面とトーラス)について検討し,その構造と系の力学との関係について考察した。本稿では, 動的応答法によって得られたベリー曲率について考察し, ベリー曲率を積分して探索したガッピング領域を含む物理および可観測多様体を示し, ベリー曲率を抽出してシステムの状態変化を追跡・操作できることを示す。 We investigate two kinds of topological structures (sphere and torus) spanned by the controlled parameters of a driven two-level system's Hamiltonian, and consider the connection between the structures and the system's dynamics. We discuss the Berry curvature obtained through the dynamical response method, show the certain physical and observable manifolds including the gapped region probed by integrating the Berry curvature, and demonstrate the system's state evolution can be tracked and manipulated by extracting the Berry curvature.	翻訳日:2023-04-13 03:07:15 公開日:2021-02-01
# 単一InAs/GaAs量子ドットにおける長寿命発光ダイナミクスの解析 Analysis of Emission Dynamics of a Long Lifetime in Single InAs/GaAs Quantum Dots ( http://arxiv.org/abs/2102.00791v1 ) ライセンス: Link先を確認	Junhui Huang, Hao Chen, Zhiyao Zhuo, Jian Wang, Shulun Li, Kun Ding, Haiqiao Ni, Zhichuan Niu, Desheng Jiang, Xiuming Dou, and Baoquan Sun	(参考訳) 濡れ層 (WL) に長寿命の準安定状態が存在する単一InAs/GaAs量子ドット (QD) 試料では、非単一指数関数的な減衰を示す非常に長寿命の発光が報告されている [ACS Photonics 2020, 7, 3228-3235]。本稿では、発光減衰曲線をシミュレートする新しい3準位モデルを提案する。このモデルでは、準安定状態の励起子が拡散してQDに捕獲され、QD内で蛍光を放出すると仮定することで、引き伸ばされた指数関数に類似した減衰式 $I(t)=At^{\beta-1}e^{-(rt)^{\beta}}$ が導かれる。この式は長寿命の減衰曲線をよく記述でき、平均寿命は解析的に $\langle\tau\rangle=\frac{1}{r}\Gamma\left(\frac{1}{\beta}+1\right)$ と表される($\Gamma$ はガンマ関数)。さらに、提案する3準位モデルに基づき、測定された $g^2(t)$ 曲線によく適合する2次自己相関関数 $g^2(t)$ の式も得られた。 A very long lifetime emission with non-single exponential decay characteristic has been reported for single InAs/GaAs quantum dot (QD) samples, in which there exists a long-lived metastable state in the wetting layer (WL) [ACS Photonics 2020, 7, 3228-3235]. In this article we have proposed a new three-level model to simulate the emission decay curve. In this model, assuming that the excitons in metastable state will diffuse and be trapped by QDs, and then emit fluorescence in QDs, a stretched-like exponential decay formula is derived as $I(t)=At^{\beta-1}e^{-(rt)^{\beta}}$, which can well describe the long lifetime decay curve with an analytical expression of average lifetime $\langle\tau\rangle=\frac{1}{r}\Gamma\left(\frac{1}{\beta}+1\right)$, where $\Gamma$ is the Gamma function. Furthermore, based on the proposed three-level model, an expression of the second-order auto-correlation function $g^2(t)$ which can well fit the measured $g^2(t)$ curve is also obtained.	翻訳日:2023-04-13 03:07:04 公開日:2021-02-01
# 識別可能な粒子に対する一夫一婦制による識別不能粒子の絡み合いの最大違反 Maximum Violation of Monogamy of Entanglement for Indistinguishable Particles by Measures that are Monogamous for Distinguishable Particles ( http://arxiv.org/abs/2102.00780v1 ) ライセンス: Link先を確認	Goutam Paul, Soumya Das and Anindya Banerji	(参考訳) 量子物理学の2つの重要な結果は、複製不可能 (no-cloning) 定理とエンタングルメントのモノガミー (monogamy of entanglement) である。前者は任意の未知の量子状態の独立かつ同一のコピーの作成を禁止し、後者は複数の量子系間での量子エンタングルメントの共有可能性を制限する。識別可能な粒子の場合、これらの結果の一方は他方を含意する。本論文では、識別不可能な粒子(各粒子を個別に指定できない)からなる量子ビット系において、識別可能な粒子に対してはモノガミーを満たす測度により、エンタングルメントのモノガミーの最大の破れが可能であることを示す。この結果を導くために、各自由度が他の自由度とエンタングルし得る空間的位置に対応する、識別不可能な粒子に対する自由度のトレースアウト則を定式化する。この結果は、複製不可能定理と矛盾することなく、識別不可能な粒子に対する量子エンタングルメントの共有可能性の制限を取り除く。 Two important results of quantum physics are the no-cloning theorem and the monogamy of entanglement. The former forbids the creation of an independent and identical copy of an arbitrary unknown quantum state and the latter restricts the shareability of quantum entanglement among multiple quantum systems. For distinguishable particles, one of these results implies the other. In this Letter, we show that in qubit systems with indistinguishable particles (where each particle cannot be addressed individually), a maximum violation of the monogamy of entanglement is possible by the measures that are monogamous for distinguishable particles. To derive this result, we formulate the degree of freedom trace-out rule for indistinguishable particles corresponding to a spatial location where each degree of freedom might be entangled with the other degrees of freedom. Our result removes the restriction on the shareability of quantum entanglement for indistinguishable particles, without contradicting the no-cloning theorem.	翻訳日:2023-04-13 03:06:38 公開日:2021-02-01
# Bosonic Indistinguishability-Dependent Contextuality Bosonic Indistinguishability-Dependent Contextuality ( http://arxiv.org/abs/2102.00746v1 ) ライセンス: Link先を確認	Ali Asadian and Adán Cabello	(参考訳) 最大文脈性とボソンの識別不可能性を結び付ける量子文脈性の一形態を明らかにする。これは、クラウザー=ホーン=シモニー=ホルトのベル不等式に関する最大非局所性が最大エンタングルメントと結び付いているのと同様である。これまでのフォトニック文脈性とは異なり、この形態は識別不可能性と高次干渉に依存するため、古典光ではシミュレートできない。ボソン系の理想測定は、補助量子ビットとの分散結合によって行うことができる。これにより、各測定の終了を任意に遅らせ、高次元の文脈的相関を対象とすることが可能になる。これらは既存のプラットフォームでは実現できない特徴である。 We uncover a form of quantum contextuality that connects maximal contextuality to boson indistinguishability in a similar way to how maximal nonlocality with respect to the Clauser-Horne-Shimony-Holt Bell inequality is connected to maximal entanglement. Unlike previous forms of photonic contextuality, this form cannot be simulated with classical light, as it relies on indistinguishability and higher-order interference. Ideal measurements on the bosonic system can be performed by means of dispersive coupling with an ancillary qubit. This allows us to delay at will the ending of each measurement and to target high-dimensional contextual correlations, features that cannot be achieved with existing platforms.	翻訳日:2023-04-13 03:05:42 公開日:2021-02-01
# 2方向古典通信を用いた実用的な量子鍵分布のための構成可能セキュリティ Composable security for practical quantum key distribution with two way classical communication ( http://arxiv.org/abs/2102.00739v1 ) ライセンス: Link先を確認	Cong Jiang, Xiao-Long Hu, Zong-wen Yu and Xiang-bin Wang	(参考訳) 本稿では、sending-or-not-sendingツインフィールドプロトコルについて、2方向古典通信(TWCC)による誤り除去を伴う量子鍵分布(QKD)の有限鍵効果を厳密に計算する手法を提案する。TWCCのない通常のQKDとは異なり、ここでは各2ビットランダム群がタグ付きとなるかタグなしとなるかの確率は独立ではない。我々は、全てのビットが独立かつ同一である仮想的なビット集合を想定することで、この問題を厳密に解決する。独立かつ同一のビットからなるこの仮想集合から出発して得られる結果と、独立でないビットからなる実際の集合から出発して得られる結果との関係を示す。明示的な公式により、計算に単純にチェルノフ境界を適用するだけで正しい鍵レートが得られるが、失敗確率はわずかに変化することを示す。 We present methods to strictly calculate the finite-key effects in quantum key distribution (QKD) with error rejection through two-way classical communication (TWCC) for the sending-or-not-sending twin-field protocol. Unlike the normal QKD without TWCC, here the probability of tagging or untagging for each two-bit random group is not independent. We rigorously solve this problem by imagining a virtual set of bits where every bit is independent and identical. We show the relationship between the outcome starting from this imagined set containing independent and identical bits and the outcome starting with the real set of non-independent bits. With explicit formulas, we show that simply applying the Chernoff bound in the calculation gives the correct key rate, but the failure probability changes slightly.	翻訳日:2023-04-13 03:05:11 公開日:2021-02-01
# スピン群上の自由フェルミオンの有限次元系と拡散過程 Finite dimensional systems of free Fermions and diffusion processes on Spin groups ( http://arxiv.org/abs/2102.01000v1 ) ライセンス: Link先を確認	Luigi M. Borasi	(参考訳) この記事では有限次元のフェルミオン(Fermion)を扱う。ここでフェルミオンとは、有限次元複素空間のベクトルを、その空間上の外積代数に埋め込んだものを意味する。これらのフェルミオンはスピンを持たないが、特徴的な反可換性を持つ。リー群 $\mathrm{Spin}(2n+1)$ 上の不変複素ベクトル場をフェルミオン生成および消滅作用素に関連付ける。これらのベクトル場はリー代数 $\mathfrak{so}(2n+1)$ の正則表現の複素化の元である。したがって、それらは正準反交換関係を満たさないが、$L^2(\mathrm{Spin}(2n+1))$ の適切な部分空間に射影すると、これらの関係が満たされる。生成・消滅作用素についての対称正定値二次形式を用いて、このフェルミオン系の自由時間発展を定義する。(不変)ベクトル場によるフェルミオン生成・消滅作用素の実現により、この時間発展を正の自己共役作用素によって解釈できる。この作用素は、確率拡散過程を生成する2階作用素と、その2階作用素と強可換な1階複素作用素との和である。確率論的解釈は、2階作用素に付随する拡散過程に関するファインマン・カッツ型の公式によって与えられる。 In this article we are concerned with finite dimensional Fermions, by which we mean vectors in a finite dimensional complex space embedded in the exterior algebra over itself. These Fermions are spinless but possess the characterizing anticommutativity property. We associate invariant complex vector fields on the Lie group $\mathrm{Spin}(2n+1)$ to the Fermionic creation and annihilation operators. These vector fields are elements of the complexification of the regular representation of the Lie algebra $\mathfrak{so}(2n+1)$. As such, they do not satisfy the canonical anticommutation relations, however, once they have been projected onto an appropriate subspace of $L^2(\mathrm{Spin}(2n+1))$, these relations are satisfied. We define a free time evolution of this system of Fermions in terms of a symmetric positive-definite quadratic form in the creation-annihilation operators. The realization of Fermionic creation and annihilation operators brought by the (invariant) vector fields allows us to interpret this time evolution in terms of a positive selfadjoint operator which is the sum of a second order operator, which generates a stochastic diffusion process, and a first order complex operator, which strongly commutes with the second order operator. A probabilistic interpretation is given in terms of a Feynman-Kac like formula with respect to the diffusion process associated with the second order operator.	翻訳日:2023-04-13 02:57:14 公開日:2021-02-01
# マルチループ原子サニャック干渉計 Multi-loop atomic Sagnac interferometry ( http://arxiv.org/abs/2102.00991v1 ) ライセンス: Link先を確認	Christian Schubert, Sven Abend, Matthias Gersemann, Martina Gebbe, Dennis Schlippert, Peter Berg, Ernst M. Rasel	(参考訳) 光および物質波干渉計の回転に対する感度は、サニャック効果に基づいており、干渉計で囲まれた面積とともに増大する。光の場合、複数のファイバーループを形成することでこの面積を拡大できるが、物質波干渉計でこれに相当するものを実現することは依然として実験的な課題である。光パルスによって形成されるスケーラブルな面積を有するマルチループ原子干渉計の概念を提案する。提案手法は、地球回転モニタリングに必要な長期安定性と組み合わせて、1 sで$2\cdot10^{-11}$ rad/sという高い感度を提供する。 The sensitivity of light and matter-wave interferometers to rotations is based on the Sagnac effect and increases with the area enclosed by the interferometer. In the case of light, the latter can be enlarged by forming multiple fibre loops, whereas the equivalent for matter-wave interferometers remains an experimental challenge. We present a concept for a multi-loop atom interferometer with a scalable area formed by light pulses. Our method will offer sensitivities as high as $2\cdot10^{-11}$ rad/s at 1 s in combination with the respective long-term stability as required for Earth rotation monitoring.	翻訳日:2023-04-13 02:56:38 公開日:2021-02-01
# 進化的多目的最適化における大規模候補解集合からの高速グリーディサブセット選択 Fast Greedy Subset Selection from Large Candidate Solution Sets in Evolutionary Multi-objective Optimization ( http://arxiv.org/abs/2102.00941v1 ) ライセンス: Link先を確認	Weiyu Chen, Hisao Ishibuchi, and Ke Shang	(参考訳) サブセット選択は進化的多目的最適化(EMO)の分野において興味深く重要なトピックである。特に、非有界な外部アーカイブを持つEMOアルゴリズムでは、サブセット選択は、最終結果としてあらかじめ指定された数のソリューションを選択するために必須な後処理手順である。本稿では,超体積,IGD,IGD+インジケータのグリーディ部分選択の効率について論じる。グリーディアルゴリズムは通常、サブセット選択を効率的に処理する。しかし、多数のソリューションが与えられると(例えば、無制限の外部アーカイブにおける数万のソリューションからのサブセット選択など)、それらはしばしば時間がかかります。我々の考えは、超体積指標で知られている部分モジュラー特性を用いて効率を向上させることである。まず、IGDとIGD+の指標も準モジュラであることを示す。次に,サブモジュラー特性に基づき,各指標に対する効率的なグリーディ包含アルゴリズムを提案する。次に,提案アルゴリズムが標準部分集合選択アルゴリズムよりもはるかに高速であることを示す計算実験を行った。 Subset selection is an interesting and important topic in the field of evolutionary multi-objective optimization (EMO). Especially, in an EMO algorithm with an unbounded external archive, subset selection is an essential post-processing procedure to select a pre-specified number of solutions as the final result. In this paper, we discuss the efficiency of greedy subset selection for the hypervolume, IGD and IGD+ indicators. Greedy algorithms usually efficiently handle subset selection. However, when a large number of solutions are given (e.g., subset selection from tens of thousands of solutions in an unbounded external archive), they often become time-consuming. Our idea is to use the submodular property, which is known for the hypervolume indicator, to improve their efficiency. First, we prove that the IGD and IGD+ indicators are also submodular. Next, based on the submodular property, we propose an efficient greedy inclusion algorithm for each indicator. Then, we demonstrate through computational experiments that the proposed algorithms are much faster than the standard greedy subset selection algorithms.	翻訳日:2023-04-13 02:56:20 公開日:2021-02-01
# 曲面上のスピン零中性および荷電粒子に対するエルミートハミルトニアンの構成 : 物理的アプローチ Constructing Hermitian Hamiltonians for spin zero neutral and charged particles on a curved surface : physical approach ( http://arxiv.org/abs/2102.00896v1 ) ライセンス: Link先を確認	M.S.Shikakhwa and N.Chair	(参考訳) 表面を囲む層の厚さをゼロにすることで表面にピン留めされたスピン零粒子の表面ハミルトニアンを構築する。これを達成するための新しいアプローチは、表面上の成分と表面への正規成分が別々にエルミートである3D運動量作用素の式から始めることである。運動エネルギー作用素の通常の部分は、この場合エルミート作用素である。この演算子を落として層の厚さをゼロにすると、予想される幾何学的ポテンシャル項を含むエルミート曲面ハミルトニアンが自動的に得られる。電磁場中の中性粒子と荷電粒子の両方に対するハミルトニアンが構成される。エルミート曲面と正規モーメントが通常の正規運動量作用素と曲面運動量作用素を対称性付けると自動的に現れることを示す。このアプローチは、幾何学的ポテンシャルが表面運動量作用素に追加されてエルミートを表わす用語に由来することを明らかにしている; この用語自体は、曲線座標における微分運動量作用素の対称性と順序付けから生じる。本稿では, この手法とJenssen, Koppe, Costa (いわゆるThin-Layer Quantization (TLQ)) の類似したアプローチとの関係について検討する。ここで導入された波動関数の臨界変換は、実際に層の厚さをゼロにする前(著者らによって明確に述べられてはいないが)、表面と正常な運動エネルギー演算子ヘルミティアンのそれぞれをそれ自体で表現する。 The surface Hamiltonian for a spin zero particle that is pinned to a surface by letting the thickness of a layer surrounding the surface go to zero -- assuming a strong normal force -- is constructed. The new approach we follow to achieve this is to start with an expression for the 3D momentum operators whose components along the surface and the normal to the surface are separately Hermitian. The normal part of the kinetic energy operator is a Hermitian operator in this case. When this operator is dropped and the thickness of the layer is set to zero, one automatically gets the Hermitian surface Hamiltonian that contains the geometric potential term as expected. Hamiltonians for both a neutral and a charged particle in an electromagnetic field are constructed. We show that a Hermitian surface and normal momenta emerge automatically once one symmetrizes the usual normal and surface momentum operators. 
The present approach makes it manifest that the geometrical potential originates from the term that is added to the surface momentum operator to render it Hermitian; this term itself emerges from symmetrization/ordering of differential momentum operators in curvilinear coordinates. We investigate the connection between this approach and the similar approach of Jenssen and Koppe and Costa (the so-called Thin-Layer Quantization (TLQ)). We note that the critical transformation of the wavefunction introduced there before taking the thickness of the layer to zero actually renders each of the surface and normal kinetic energy operators Hermitian by itself, although the authors do not state this explicitly; this is just what our approach does from the outset.	翻訳日:2023-04-13 02:55:44 公開日:2021-02-01
# 癌治療用ナノキャリアの自動発見のための進化計算プラットフォーム Evolutionary computational platform for the automatic discovery of nanocarriers for cancer treatment ( http://arxiv.org/abs/2102.00879v1 ) ライセンス: Link先を確認	Namid Stillman, Igor Balaz, Antisthenis Tsompanas, Marina Kovacevic, Sepinoud Azimi, Sebastien Lafond, Andrew Adamatzky, Sabine Hauert	(参考訳) ナノメディシンの進化のためのEVONANOプラットフォームと抗がん剤への応用について述べる。 EVONANOは腫瘍を成長させ、代表シナリオを抽出し、これらのシナリオを通してナノ粒子輸送をシミュレートし、ナノ粒子分布を予測するシミュレータを含む。ナノ粒子の設計は機械学習を用いて最適化され、最も効果的な抗がん治療を効率的に見つける。我々は,ナノ粒子の性質を最適化する2つの例と,がん細胞を腫瘍環境下で選択的に殺傷する治療法を実演した。 We present the EVONANO platform for the evolution of nanomedicines with application to anti-cancer treatments. EVONANO includes a simulator to grow tumours, extract representative scenarios, and then simulate nanoparticle transport through these scenarios to predict nanoparticle distribution. The nanoparticle designs are optimised using machine learning to efficiently find the most effective anti-cancer treatments. We demonstrate our platform with two examples optimising the properties of nanoparticles and treatment to selectively kill cancer cells over a range of tumour environments.	翻訳日:2023-04-13 02:54:55 公開日:2021-02-01
# ポラリトン超流動における量子化された渦とダークソリトンの自発的生成、強化伝播、光インプリンティング:量子乱流の制御に向けて Spontaneous generation, enhanced propagation and optical imprinting of quantized vortices and dark solitons in a polariton superfluid: towards the control of quantum turbulence ( http://arxiv.org/abs/2102.01075v1 ) ライセンス: Link先を確認	Anne Maitre, Ferdinand Claude, Giovani Lerario, Serguei Koniakhin, Simon Pigeon, Dmitry Solnyshkov, Guillaume Malpuech, Quentin Glorieux, Elisabeth Giacobino and Alberto Bramati	(参考訳) 共振ポンピングされたポラリトン超流動体では、偏光子系の安定性に基づいた新しい状態が探索され、偏光子流体のマクロ距離への伝播が促進された。この手法は全光学インプリント法とともに、量子化された渦や暗いソリトンのような様々なトポロジカル励起の生成と制御を可能にした。新しい実験スキームの柔軟性とスケーラビリティは、光の発散性量子流体における量子乱流の体系的研究への道を開く。本稿では,安定度向上のための基本原理とインプリント技術について概説し,本研究の成果と今後の展望について考察する。 In resonantly pumped polariton superfluids we recently explored a new regime based on the bistability of the polariton system to enhance the propagation of polariton fluids up to macroscopic distances. This technique together with an all-optical imprinting method allowed the generation and control of various topological excitations such as quantized vortices and dark solitons. The flexibility and scalability of the new experimental scheme opens the way to the systematic study of quantum turbulence in driven dissipative quantum fluids of light. In this article we review the basic working principles of the bistability enhanced propagation and of the imprinting technique and we discuss the main achieved results as well as the most promising future research directions.	翻訳日:2023-04-13 02:48:00 公開日:2021-02-01
# 実験室における量子重力--大きさと可逆ワームホールによるテレポーテーション(ii) Quantum Gravity in the Lab: Teleportation by Size and Traversable Wormholes, Part II ( http://arxiv.org/abs/2102.01064v1 ) ライセンス: Link先を確認	Sepehr Nezami, Henry W. Lin, Adam R. Brown, Hrant Gharibyan, Stefan Leichenauer, Grant Salton, Leonard Susskind, Brian Swingle, Michael Walter	(参考訳) [1]では、量子デバイスを用いて量子重力をシミュレートする方法を説明し、サイズによるテレポーテーションとサイズワインディングの現象を具体的な提案を行った。ここでは、「実験室における量子重力」の意味と、大きさの曲がり角が重力物理学やワームホールにどのように結びつくのかを詳しく説明します。完全大きさの巻線は演算子の大きさの波動関数の顕著できめ細かな特性であり、この性質がほぼAdS_2バルクの量子系に対して成り立つことを示す。次に, sachdev-ye-kitaevモデル, ランダム行列, スピン鎖の3つの系におけるテレポーテーションを詳細に検討し, 近距離量子デバイスにおいてこれらの現象を実現するための展望について考察した。 In [1] we discussed how quantum gravity may be simulated using quantum devices and gave a specific proposal -- teleportation by size and the phenomenon of size-winding. Here we elaborate on what it means to do 'Quantum Gravity in the Lab' and how size-winding connects to bulk gravitational physics and traversable wormholes. Perfect size-winding is a remarkable, fine-grained property of the size wavefunction of an operator; we show from a bulk calculation that this property must hold for quantum systems with a nearly-AdS_2 bulk. We then examine in detail teleportation by size in three systems: the Sachdev-Ye-Kitaev model, random matrices, and spin chains, and discuss prospects for realizing these phenomena in near-term quantum devices.	翻訳日:2023-04-13 02:47:40 公開日:2021-02-01
# 相対論的量子力学における時間とエネルギーの第二量子化 Second quantization of time and energy in Relativistic Quantum Mechanics ( http://arxiv.org/abs/2102.01042v1 ) ライセンス: Link先を確認	M. Bauer and C.A. Aguillón	(参考訳) ローレンツ不変性とボルン相反不変性に基づいて、特殊相対性理論(sr)の正準量子化は、ディラックのハミルトニアンの存在と、パウリの反対を回避した自己随伴時間演算子の存在の統一的な起源であることが示されている。このように、このアプローチは運動量とエネルギーの足場における空間と時間の扱いを量子力学 (Quantum Mechanics, QM) に復元する。時間作用素場の第二量子化は、ディラック・ハミルトン場のステップバイステップに従う。これは、量子場理論(QFT)におけるエネルギー量子と似た方法で、時間量子の概念を導入する。初期の関係は、フェシュバッハの原子核反応の統一理論に十分見られる。コールド原子系におけるフェシュバッハ共鳴やボース=アインシュタイン凝縮、量子重力における時間の問題など、現在の発展にその関連性が指摘されている。 Based on Lorentz invariance and Born reciprocity invariance, the canonical quantization of Special Relativity (SR) has been shown to provide a unified origin for the existence of Dirac's Hamiltonian and a self-adjoint time operator that circumvents Pauli's objection. As such, this approach restores to Quantum Mechanics (QM) the treatment of space and time on an equivalent footing as that of momentum and energy. Second quantization of the time operator field follows step by step that of the Dirac Hamiltonian field. It introduces the concept of time quanta, in a similar way to the energy quanta in Quantum Field Theory (QFT). An early connection is found already in Feshbach's unified theory of nuclear reactions. Its possible relevance in current developments such as Feshbach resonances in the fields of cold atom systems, of Bose-Einstein condensates and in the problem of time in Quantum Gravity is noted.	翻訳日:2023-04-13 02:47:01 公開日:2021-02-01
# 量子情報処理と量子センシングのための2モード圧縮状態の重ね合わせ Superposition of two-mode squeezed states for quantum information processing and quantum sensing ( http://arxiv.org/abs/2102.01032v1 ) ライセンス: Link先を確認	Fernando R. Cardoso, Daniel Z. Rossatto, Gabriel P. L. M. Fernandes, Gerard Higgins and Celso J. Villas-Boas	(参考訳) 量子情報処理や量子センシングに応用可能な2モード圧縮状態(TMSS)の重ね合わせについて検討する。まず、各モードの統計や2つのモード間の絡み合いの程度など、これらの非古典的状態のいくつかの性質について検討する。ここで述べたように、2モードのJaynes-Cummingsと反Jaynes-Cummings相互作用を2つのモードとスピン-$\tfrac{1}{2}$粒子の系で誘導することで、我々が考える状態を作ることができる。 2つのTMSSを重畳して2つの高調波発振器を作成した場合、位相空間におけるモードの任意の変位を検出するために、各単モード状態が有利に利用できることを示す。この還元状態のウィグナー関数は位相空間原点を中心とする対称ピークを示し、平均光子数の増加と同時に両二次においてより狭くなるという便利な特異性を持つ。この狭いピークは我々の量子センサーのポインタとして利用することができ、その位置は発振器による変位を示す位相空間にある。 We investigate superpositions of two-mode squeezed states (TMSSs), which have potential applications to quantum information processing and quantum sensing. First, we study some properties of these nonclassical states such as the statistics of each mode and the degree of entanglement between the two modes, which can be higher than that of a TMSS with the same degree of squeezing. The states we consider can be prepared by inducing two-mode Jaynes-Cummings and anti-Jaynes-Cummings interactions in a system of two modes and a spin-$\tfrac{1}{2}$ particle, for instance in the trapped ion domain, as described here. We show that when two harmonic oscillators are prepared in a superposition of two TMSSs, each reduced single-mode state can be advantageously employed to sense arbitrary displacements of the mode in phase space. The Wigner function of this reduced state exhibits a symmetrical peak centered at the phase-space origin, which has the convenient peculiarity of getting narrower in both quadratures simultaneously as the average photon number increases. This narrow peak can be used as the pointer of our quantum sensor, with its position in phase space indicating the displacement undergone by the oscillator.	翻訳日:2023-04-13 02:46:45 公開日:2021-02-01
# 古典的な影で揺れる量子 Quantum scrambling with classical shadows ( http://arxiv.org/abs/2102.01008v1 ) ライセンス: Link先を確認	Roy J. Garcia and You Zhou and Arthur Jaffe	(参考訳) 量子力学は基本的な関心事であり、量子情報処理に影響を及ぼす。 4点の時間外相関器(OTOC)は、伝統的に多体動学の量子情報の量子化に用いられている。 OTOCの異常な時間秩序のため、その測定は困難である。本稿では,早期スクランブル動作を明らかにするための高点OTOCを提案し,影推定法を用いて高点OTOCを測定するためのプロトコルを提案する。このプロトコルは、時間反転進化と補助制御の必要性を回避する。それらは、単一量子ビットの読み出しを持つ短期量子デバイスで実装することができる。 Quantum dynamics is of fundamental interest and has implications in quantum information processing. The four-point out-of-time-ordered correlator (OTOC) is traditionally used to quantify quantum information scrambling under many-body dynamics. Due to the OTOC's unusual time ordering, its measurement is challenging. We propose higher-point OTOCs to reveal early-time scrambling behavior, and present protocols to measure any higher-point OTOC using the shadow estimation method. The protocols circumvent the need for time-reversal evolution and ancillary control. They can be implemented in near-term quantum devices with single-qubit readout.	翻訳日:2023-04-13 02:46:02 公開日:2021-02-01
# 専用量子プロセッサ設計 Special-Purpose Quantum Processor Design ( http://arxiv.org/abs/2102.01228v1 ) ライセンス: Link先を確認	Bin-Han Lu, Yu-Chun Wu, Wei-Cheng Kong, Qi Zhou, and Guo-Ping Guo	(参考訳) 量子ビットの完全接続は、ほとんどの量子アルゴリズムにおいて必要であり、ノイズ中間スケール量子プロセッサに直接実装することは困難である。しかし、未結合キュービット間の2量子ゲートを可能にするスワップゲートの挿入は計算結果の忠実度を著しく低下させる。そこで本研究では,異なる量子アルゴリズムに適した構造を設計できる特殊目的量子プロセッサ設計法を提案する。提案手法は,プロセッサ構造を二次元格子グラフから一般平面グラフに拡張し,量子アルゴリズムの論理量子ビットと物理制約との間の2量子ゲート分布に応じて物理カプラを配置する。実験の結果, 設計手法は他の手法と比較して, 2キュービットゲートあたりの余剰スワップゲートの数を平均104.2%削減できることがわかった。また, 深さとキュービット数の増加に伴い, 他の手法に対する本手法のアドバンテージはより明確になる。その結果,本手法は計算結果の忠実性向上に競争力があり,技術的条件下で量子優位を示す可能性が示唆された。 Full connectivity of qubits is necessary for most quantum algorithms, which is difficult to directly implement on Noisy Intermediate-Scale Quantum processors. However, inserting swap gates to enable two-qubit gates between uncoupled qubits significantly decreases the fidelity of the computation result. To this end, we propose a Special-Purpose Quantum Processor Design method that can design suitable structures for different quantum algorithms. Our method extends the processor structure from a two-dimensional lattice graph to a general planar graph and arranges the physical couplers according to the two-qubit gate distribution between the logical qubits of the quantum algorithm and the physical constraints. Experimental results show that our design methodology, compared with other methods, could reduce the number of extra swap gates per two-qubit gate by at least 104.2% on average. Also, our method's advantage over other methods becomes more obvious as the depth and qubit number increase. The result reveals that our method is competitive in improving computation result fidelity and it has the potential to demonstrate quantum advantage under the technical conditions.	翻訳日:2023-04-13 02:38:25 公開日:2021-02-01
# OceanおよびMukaiソルバを用いた電子構造QUBOのサンプリング Sampling electronic structure QUBOs with Ocean and Mukai solvers ( http://arxiv.org/abs/2102.01225v1 ) ライセンス: Link先を確認	Alexander Teplukhin (1), Brian K. Kendrick (1), Susan M. Mniszewski (2), Sergei Tretiak (1) and Pavel A. Dub (3) ((1) Theoretical Division, Los Alamos National Laboratory, (2) Computer, Computational and Statistical Sciences Division, Los Alamos National Laboratory, (3) Chemistry Division, Los Alamos National Laboratory)	(参考訳) 最も先進的なD-Wave Advantage量子アニーラーは5000以上の量子ビットを持つが、各量子ビットは少数の近傍にしか接続されていない。したがって、完全連結グラフの実装は、量子ビット数の桁違いの減少をもたらす。量子ビット数の減少を補うためには、qbsolvのような特殊なヒューリスティックなソフトウェアに頼る必要がある。本研究では,D-Wave Oceanツールの一部であるオープンソースのqbsolvとQuantum Computing Inc.(QCI)の新しいMukai QUBOソルバの2つの実装の性能を比較した。この比較は電子構造問題を解くために行われ、古典的モード(タブサーチ技術)で実装される。量子アニーラー固有解法(Quantum Annealer Eigensolver)は、電子構造固有値-固有ベクトル方程式を現代の量子アニーラー上で解ける問題の種類にマッピングするために用いられる。本研究で行ったすべての計算,すなわち基底状態と励起状態の計算の両方において,Mukai QUBOソルバはOcean qbsolvよりも優れていた。この研究は、現代の量子アニーラーの利用を支援するソフトウェアの開発を刺激する。 The most advanced D-Wave Advantage quantum annealer has 5000+ qubits; however, every qubit is connected to a small number of neighbors. As such, implementation of a fully-connected graph results in an order of magnitude reduction in qubit count. To compensate for the reduced number of qubits, one has to rely on special heuristic software such as qbsolv, the purpose of which is to decompose a large problem into smaller pieces that fit onto a quantum annealer. In this work, we compare the performance of two implementations of such software: the original open-source qbsolv which is a part of the D-Wave Ocean tools and a new Mukai QUBO solver from Quantum Computing Inc. (QCI). The comparison is done for solving the electronic structure problem and is implemented in a classical mode (Tabu search techniques). The Quantum Annealer Eigensolver is used to map the electronic structure eigenvalue-eigenvector equation to a type of problem solvable on modern quantum annealers. 
We find that the Mukai QUBO solver outperforms the Ocean qbsolv for all calculations done in the present work, both the ground and excited state calculations. This work stimulates the development of software to assist in the utilization of modern quantum annealers.	翻訳日:2023-04-13 02:38:09 公開日:2021-02-01
# big geosocial data analyticsを用いた大規模イベントにおける集団行動の理解 Understanding collective human movement dynamics during large-scale events using big geosocial data analytics ( http://arxiv.org/abs/2102.01175v1 ) ライセンス: Link先を確認	Junchuan Fan, Kathleen Stewart	(参考訳) 情報通信技術の急速な進歩に伴い、多くの研究者は、大規模な自然または社会的な出来事に対応するために、個人データベンダーの代替データソースを採用する。ジオリファレンスされたつぶやきのような大きなジオソーシャルデータは、現実世界のイベントが起こっているときに公開され、動的に進化しているため、人口のリアルタイムな感情や反応を捉えやすい。しかし、正確な位置情報は都市人口中心への偏りや偏りが少ない。本研究では,公開されたジオリファレンスツイートから大規模イベントに応答して,人間の動きのダイナミクスを抽出するための大規模ジオソーシャルデータ分析フレームワークを開発した。このフレームワークは、ジオレファレンスツイートのデータ不足を軽減するために、よりターゲット的な方法でデータを収集する2段階のデータ収集モジュールを含む。また、異なる空間スケールでジオレファレンス情報を融合するために、可変帯域カーネル密度推定(VB-KDE)アプローチを採用し、ジオレファレンスツイートに含まれる人間の動きの信号をさらに増強した。ジオレファレンスされたツイートのサンプリングバイアスを補正するため、人口別に異なる空間単位(例えば、郡、州)のツイート数を調整した。提案する分析フレームワークの性能を実証するため,米国全土で発生した天文学的イベント,すなわち2017年グレートアメリカン・エクリプスを事例として選択し,このイベントに対する人間の運動動態について検討した。しかし、この分析枠組みはハリケーンや地震のような他の種類の大規模イベントにも容易に適用できる。 With the rapid advancement of information and communication technologies, many researchers have adopted alternative data sources from private data vendors to study human movement dynamics in response to large-scale natural or societal events. Big geosocial data such as georeferenced tweets are publicly available and dynamically evolving as real-world events are happening, making it more likely to capture the real-time sentiments and responses of populations. However, precisely-geolocated geosocial data is scarce and biased toward urban population centers. In this research, we developed a big geosocial data analytical framework for extracting human movement dynamics in response to large-scale events from publicly available georeferenced tweets. 
The framework includes a two-stage data collection module that collects data in a more targeted fashion in order to mitigate the data scarcity issue of georeferenced tweets; in addition, a variable bandwidth kernel density estimation (VB-KDE) approach was adopted to fuse georeference information at different spatial scales, further augmenting the signals of human movement dynamics contained in georeferenced tweets. To correct for the sampling bias of georeferenced tweets, we adjusted the number of tweets for different spatial units (e.g., county, state) by population. To demonstrate the performance of the proposed analytic framework, we chose an astronomical event that occurred nationwide across the United States, i.e., the 2017 Great American Eclipse, as an example event and studied the human movement dynamics in response to this event. However, this analytic framework can easily be applied to other types of large-scale events such as hurricanes or earthquakes.	翻訳日:2023-04-13 02:37:23 公開日:2021-02-01
# ソーダライムガラス中のna/kイオン交換法により作製した光方向カプラによる量子プロジェクタ Quantum projectors implemented with optical directional couplers fabricated by Na/K ion-exchange in soda-lime glass ( http://arxiv.org/abs/2102.01169v1 ) ライセンス: Link先を確認	Xesús Prieto-Blanco, Carlos Montero-Orille, Jesús Liñares, Héctor González-Núñez and Daniel Balado	(参考訳) イオン交換Na/Kプロセスで作製した集積光指向性カプラにより実装された量子プロジェクタの理論的および実験的研究を行った。 2x2方向結合器を連結したデバイスに関する理論的考察を行い、n-次元量子射影計測の実行能力と1-qudit状態の生成について述べる。これらの装置の基本単位は2x2方向結合器であるので、このような結合器の製造と光学パラメータの間の経験的関係を光学的特徴付けにより得るための実験的研究を行う。同様に、2次元の量子プロジェクターは、X(対角)およびY(円)基底の状態に対して射影の測定値が得られるように示される。 We present a preliminary theoretical and experimental study of quantum projectors implemented by integrated optical directional couplers fabricated by ion-exchange Na/K processes in soda-lime glass. Theoretical considerations about devices formed by concatenated 2x2 directional couplers are presented in order to show their capabilities for implementing N-dimensional quantum projective measurements, and concomitantly the production of 1-qudit states. Since the fundamental unit of these devices are 2x2 directional couplers, we present an experimental study for obtaining, by an optical characterization, empirical relationships between fabrication and optical parameters of such couplers. Likewise, a two-dimensional quantum projector is demonstrated in such a way that projective measurements are obtained for the states of X (diagonal) and Y (circular) bases.	翻訳日:2023-04-13 02:36:54 公開日:2021-02-01
# DisQ: OpenPulseを用いたIBM量子コンピュータの新しい量子出力状態分類法 DisQ: A Novel Quantum Output State Classification Method on IBM Quantum Computers using OpenPulse ( http://arxiv.org/abs/2102.01153v1 ) ライセンス: Link先を確認	Tirthak Patel and Devesh Tiwari	(参考訳) 超伝導量子コンピューティング技術は、新しい計算可能性の時代を幕開けた。量子技術の改善と、誤差率を低減した量子アルゴリズムを効率的に実行するソフトウェアスタックの構築に向けた研究が盛んに行われているが、誤差率の低減を目的とした量子出力状態の定義と分類の最適化への取り組みはまだ限られている。そこで本研究では,NISQデバイス上での量子プログラムの誤り率を低減する量子出力状態分類手法であるDisQを提案する。 Superconducting quantum computing technology has ushered in a new era of computational possibilities. While a considerable research effort has been geared toward improving the quantum technology and building the software stack to efficiently execute quantum algorithms with reduced error rate, effort toward optimizing how quantum output states are defined and classified for the purpose of reducing the error rate is still limited. To this end, this paper proposes DisQ, a quantum output state classification approach which reduces error rates of quantum programs on NISQ devices.	翻訳日:2023-04-13 02:36:40 公開日:2021-02-01
# 量子軌道上のKusuoka測度のエルゴード性 Ergodicity of Kusuoka measures on quantum trajectories ( http://arxiv.org/abs/2102.01140v1 ) ライセンス: Link先を確認	Anna Szczepanek	(参考訳) 1989年、Kusuokaは行列の積を用いて定義されるシフト空間上の確率測度の研究を開始した。特に、このような測度がエルゴード的であるための十分条件を導いており、これらの測度はそれ以来Kusuoka測度と呼ばれている。我々は、ユニタリ発展する量子系の繰り返し測定が、測定結果の列の空間上にKusuoka測度を生成することを観察する。測定がスケールされた射影からなる場合、Kusuokaのエルゴード性の十分条件は大幅に単純化できることを示す。次に、測定が一様にスケールされた rank-1 射影(つまり rank-1 POVM である)またはちょうど 2 つの射影(そのうちの 1 つは rank-1 である)からなる場合、この条件はエルゴード性の必要条件でもあることを証明する。後者の種類の測定では、任意の結果列が系から出力される確率がその逆順の列と等しいという意味で、Kusuoka測度が可逆的であることも示す。 In 1989 Kusuoka started the study of probability measures on the shift space that are defined with the help of products of matrices. In particular, he derived a sufficient condition for the ergodicity of such measures, which have since been referred to as Kusuoka measures. We observe that repeated measurements of a unitarily evolving quantum system generate a Kusuoka measure on the space of sequences of measurement outcomes. We show that if the measurement consists of scaled projections, then Kusuoka's sufficient ergodicity condition can be significantly simplified. We then prove that this condition is also necessary for ergodicity if the measurement consists of uniformly scaled rank-1 projections (i.e., it is a rank-1 POVM), or of exactly two projections, one of which is rank-1. For the latter class of measurements we also show that the Kusuoka measure is reversible in the sense that every string of outcomes has the same probability of being emitted by the system as its reverse.	翻訳日:2023-04-13 02:36:30 公開日:2021-02-01
# 量子チェシャー猫のgrinとsnarlによって選択される経路の遅延選択 Delayed choice of paths selected by grin and snarl of quantum Cheshire Cat ( http://arxiv.org/abs/2001.00669v2 ) ライセンス: Link先を確認	Debmalya Das and Ujjwal Sen	(参考訳) いわゆる量子チェシャー・キャット(Quantum Cheshire Cat)は、猫と同一視される光子と、その猫のにやにや笑い(grin)と同一視される偏光の成分が分離されるシナリオである。我々は、光子の偏光の2つの直交成分を平均として分離するために同じ技術が使用できることを観察する。我々は、理解の容易さのために、光子のこれらの偏光成分を猫のにやにや笑い(grin)とうなり声(snarl)と呼ぶ。また,マッハツェンダー干渉計の2つの腕における光子の入射偏光を同時に調整する思考実験を提示する。光子偏光の2つの特定の選択において、2つの成分の存在は2つの腕の中で反転する。このgrinとsnarlの反転は、偏光成分がチューナーと相互作用する前、すなわち各成分がどちらの腕にあるべきかの選択が行われる前に起こる。 The so-called quantum Cheshire Cat is a scenario where a photon, identified with a cat, and a component of its polarization, identified with the grin of that cat, are separated. We observe that the same techniques can be used to separate two orthogonal components of polarization of a photon, on average. We identify these polarization components of the photon as the grin and snarl of the cat for ease of comprehension. A gedanken experiment is presented in which we simultaneously tune the input polarizations of the photon in the two arms of a Mach-Zehnder interferometer. It is noted that for two particular choices of photon polarization, the presence of the two components gets reversed in the two arms. This reversal of the grin and the snarl occurs before the polarization components even interact with the tuners, i.e., before the choice of which arm each should be in is made.	翻訳日:2023-01-16 04:31:07 公開日:2021-02-01
# RDAnet:合成開口レーダ画像形成のためのディープラーニングに基づくアプローチ RDAnet: A Deep Learning Based Approach for Synthetic Aperture Radar Image Formation ( http://arxiv.org/abs/2001.08202v2 ) ライセンス: Link先を確認	Andrew Rittenbach (1) and John Paul Walters (1) ((1) University of Southern California Information Sciences Institute, Arlington VA)	(参考訳) SAR(Synthetic Aperture Radar)イメージングシステムは、衛星などの移動物体からレーダー信号を関心の対象に向けて放射することによって動作する。反射レーダエコーを受信し、後に画像形成アルゴリズムによってSAR画像を形成する。分類や自動目標認識などのコンピュータビジョンタスクにおいて,SAR画像を使用することに大きな関心がある。しかし今日では、SARアプリケーションは複数の操作で構成されている:画像形成と画像処理である。本研究では,sar処理パイプラインを統合することで,画像形成と画像処理タスクの両方を実行するディープニューラルネットワークを訓練する。その結果,従来のアルゴリズムと同等の画質のSAR画像を精度良く出力できることが示唆された。この研究は、実データを使用した統合ニューラルネットワークベースのSAR処理パイプラインの最初の実演であると考えています。 Synthetic Aperture Radar (SAR) imaging systems operate by emitting radar signals from a moving object, such as a satellite, towards the target of interest. Reflected radar echoes are received and later used by image formation algorithms to form a SAR image. There is great interest in using SAR images in computer vision tasks such as classification or automatic target recognition. Today, however, SAR applications consist of multiple operations: image formation followed by image processing. In this work, we train a deep neural network that performs both the image formation and image processing tasks, integrating the SAR processing pipeline. Results show that our integrated pipeline can output accurately classified SAR imagery with image quality comparable to those formed using a traditional algorithm. We believe that this work is the first demonstration of an integrated neural network based SAR processing pipeline using real data.	翻訳日:2023-01-07 18:58:34 公開日:2021-02-01
# レジームスイッチングバンド Regime Switching Bandits ( http://arxiv.org/abs/2001.09390v3 ) ライセンス: Link先を確認	Xiang Zhou, Yi Xiong, Ningyuan Chen, Xuefeng Gao	(参考訳) 報酬がレジームスイッチングを示すマルチアームバンディット問題について検討する。特に、すべての腕から生成されるランダム報酬の分布は、有限状態マルコフ連鎖としてモデル化された共通の状態によって変調される。エージェントは基底状態を観察しず、遷移行列と報酬分布を学習しなければならない。本稿では,隠れマルコフモデルに対するスペクトル手法推定,部分的に観測可能なマルコフ決定過程における信念誤差制御,オンライン学習のための高信頼度手法に基づく学習アルゴリズムを提案する。また、t$が学習の地平線である学習アルゴリズムに対して、上限値の$o(t^{2/3}\sqrt{\log t})$を確立する。最後に,学習アルゴリズムの性能を実証する概念実証実験を行った。 We study a multi-armed bandit problem where the rewards exhibit regime switching. Specifically, the distributions of the random rewards generated from all arms are modulated by a common underlying state modeled as a finite-state Markov chain. The agent does not observe the underlying state and has to learn the transition matrix and the reward distributions. We propose a learning algorithm for this problem, building on spectral method-of-moments estimations for hidden Markov models, belief error control in partially observable Markov decision processes and upper-confidence-bound methods for online learning. We also establish an upper bound $O(T^{2/3}\sqrt{\log T})$ for the proposed learning algorithm where $T$ is the learning horizon. Finally, we conduct proof-of-concept experiments to illustrate the performance of the learning algorithm.	翻訳日:2023-01-06 19:17:35 公開日:2021-02-01
# 非凸最適化のための局所条件下における確率勾配ハミルトンモンテカルロの非漸近解析 Nonasymptotic analysis of Stochastic Gradient Hamiltonian Monte Carlo under local conditions for nonconvex optimization ( http://arxiv.org/abs/2002.05465v3 ) ライセンス: Link先を確認	Ömer Deniz Akyildiz, Sotirios Sabanis	(参考訳) 確率勾配ハミルトニアンモンテカルロ (SGHMC) がWasserstein-2 距離において目標測度に収束することの非漸近解析を、log-concavityを仮定することなく提供する。本分析では,SGHMCの局所的な条件下での重要な理論的特性を定量化し,従来の結果を著しく改善する。特に、目標とSGHMCの法則の間のWasserstein-2距離がアルゴリズムのステップサイズによって一様に制御されていることを証明し、SGHMCがイテレーション数で一様に高精度な結果を提供できることを示す。この分析により,局所条件下での非凸最適化問題に対する非漸近的境界を求めることができ,SGHMCは非凸最適化器と見なすと,既知の最良の速度で大域的最小値に収束する。この結果を用いて,スケーラブルベイズ推定に対する非漸近的境界と非漸近的な一般化境界を求める。 We provide a nonasymptotic analysis of the convergence of the stochastic gradient Hamiltonian Monte Carlo (SGHMC) to a target measure in Wasserstein-2 distance without assuming log-concavity. Our analysis quantifies key theoretical properties of the SGHMC as a sampler under local conditions, which significantly improves on previous results. In particular, we prove that the Wasserstein-2 distance between the target and the law of the SGHMC is uniformly controlled by the step-size of the algorithm, and therefore demonstrate that the SGHMC can provide high-precision results uniformly in the number of iterations. The analysis also allows us to obtain nonasymptotic bounds for nonconvex optimization problems under local conditions and implies that the SGHMC, when viewed as a nonconvex optimizer, converges to a global minimum with the best known rates. We apply our results to obtain nonasymptotic bounds for scalable Bayesian inference and nonasymptotic generalization bounds.	翻訳日:2023-01-01 13:39:12 公開日:2021-02-01
# 知識追跡のための適切なクエリ、キー、価値計算を目指して Towards an Appropriate Query, Key, and Value Computation for Knowledge Tracing ( http://arxiv.org/abs/2002.07033v5 ) ライセンス: Link先を確認	Youngduck Choi, Youngnam Lee, Junghyun Cho, Jineon Baek, Byungsoo Kim, Yeongmin Cha, Dongmin Shin, Chan Bae, Jaewe Heo	(参考訳) 知識追跡は、学習活動を通じて学生の知識をモデル化する行為であり、コンピュータ支援教育の分野で広く研究されている問題である。注意機構を持つモデルはベイズ知識の追跡や協調フィルタリングといった従来のアプローチを上回っているが、それらは2つの制限を共有している。まず、モデルは浅い注意層に依存し、時間とともにエクササイズとレスポンスの間の複雑な関係を捉えない。第二に、知識追跡のための自己注意層に対するクエリ、キー、値の組み合わせは、広範囲に調査されていない。エクササイズとインタラクション(エクササイズ-レスポンスペア)をクエリとして使用する通常のプラクティスには,それぞれ経験的サポートが欠けている。本稿では,知識追跡のための新しいトランスフォーマーモデルであるSAINT:Separated Self-AttentIve Neural Knowledge Tracingを提案する。 SAINTはエンコーダ・デコーダ構造を持ち、エクササイズとレスポンスの埋め込みシーケンスはそれぞれエンコーダとデコーダを別々に入力し、注意層を複数回重ねることができる。私たちの知識を最大限活用するために、これは、エクササイズとレスポンスを別々に適用する、知識トレースのためのエンコーダ・デコーダモデルを提案する最初の作業である。大規模知識追跡データセットにおける経験的評価から,SAINTは知識追跡における最先端のパフォーマンスを達成し,AUCを1.8%改善した。 Knowledge tracing, the act of modeling a student's knowledge through learning activities, is an extensively studied problem in the field of computer-aided education. Although models with attention mechanism have outperformed traditional approaches such as Bayesian knowledge tracing and collaborative filtering, they share two limitations. Firstly, the models rely on shallow attention layers and fail to capture complex relations among exercises and responses over time. Secondly, different combinations of queries, keys and values for the self-attention layer for knowledge tracing were not extensively explored. Usual practice of using exercises and interactions (exercise-response pairs) as queries and keys/values respectively lacks empirical support. In this paper, we propose a novel Transformer based model for knowledge tracing, SAINT: Separated Self-AttentIve Neural Knowledge Tracing. 
SAINT has an encoder-decoder structure in which the exercise and response embedding sequences enter the encoder and the decoder, respectively, which allows attention layers to be stacked multiple times. To the best of our knowledge, this is the first work to suggest an encoder-decoder model for knowledge tracing that applies deep self-attentive layers to exercises and responses separately. The empirical evaluations on a large-scale knowledge tracing dataset show that SAINT achieves state-of-the-art performance in knowledge tracing, improving AUC by 1.8% compared to the current state-of-the-art models.	翻訳日:2023-01-01 04:22:27 公開日:2021-02-01
# 信頼に基づく協調フィルタリングのためのグラフ埋め込みの実証比較 Empirical Comparison of Graph Embeddings for Trust-Based Collaborative Filtering ( http://arxiv.org/abs/2003.13345v2 ) ライセンス: Link先を確認	Tomislav Duricic, Hussain Hussain, Emanuel Lacic, Dominik Kowald, Denis Helic, Elisabeth Lex	(参考訳) 本研究では,信頼に基づく協調フィルタリングのための潜在ユーザ表現を生成するためのグラフ埋め込みの有用性について検討する。コールドスタート設定では、公開されている3つのデータセットに基づいて、4つのメソッドファミリーからのアプローチを評価する。 (i)因子化に基づく (ii)ランダムウォークベース。 (iii)深層学習ベース、及び (iv)大規模情報ネットワーク埋め込み(line)アプローチ。 4つのファミリーで、ランダムウォークに基づくアプローチは、常に最高の精度を達成する。さらに、非常に斬新で多様なレコメンデーションも生み出す。さらに,信頼度に基づく協調フィルタリングにおけるグラフ埋め込みの利用は,ユーザカバレッジを著しく向上させることを示す。 In this work, we study the utility of graph embeddings to generate latent user representations for trust-based collaborative filtering. In a cold-start setting, on three publicly available datasets, we evaluate approaches from four method families: (i) factorization-based, (ii) random walk-based, (iii) deep learning-based, and (iv) the Large-scale Information Network Embedding (LINE) approach. We find that across the four families, random-walk-based approaches consistently achieve the best accuracy. Besides, they result in highly novel and diverse recommendations. Furthermore, our results show that the use of graph embeddings in trust-based collaborative filtering significantly improves user coverage.	翻訳日:2022-12-18 06:32:52 公開日:2021-02-01
# EfficientPS: Efficient Panoptic Segmentation ( http://arxiv.org/abs/2004.02307v3 ) License: check the linked page	Rohit Mohan, Abhinav Valada	Understanding the scene in which an autonomous robot operates is critical for its competent functioning. Such scene comprehension necessitates recognizing instances of traffic participants along with general scene semantics, which can be effectively addressed by the panoptic segmentation task. In this paper, we introduce the Efficient Panoptic Segmentation (EfficientPS) architecture, which consists of a shared backbone that efficiently encodes and fuses semantically rich multi-scale features. We incorporate a new semantic head that aggregates fine and contextual features coherently and a new variant of Mask R-CNN as the instance head. We also propose a novel panoptic fusion module that congruously integrates the output logits from both heads of our EfficientPS architecture to yield the final panoptic segmentation output. Additionally, we introduce the KITTI panoptic segmentation dataset, which contains panoptic annotations for the popular and challenging KITTI benchmark. Extensive evaluations on Cityscapes, KITTI, Mapillary Vistas and Indian Driving Dataset demonstrate that our proposed architecture consistently sets a new state of the art on all four benchmarks while being the most efficient and fastest panoptic segmentation architecture to date.	Translated: 2022-12-16 12:35:55 Published: 2021-02-01
# Evaluating Online Continual Learning with CALM ( http://arxiv.org/abs/2004.03340v2 ) License: check the linked page	Germán Kruszewski, Ionut-Teodor Sorodoc, Tomas Mikolov	Online Continual Learning (OCL) studies learning over a continuous data stream without observing any single example more than once, a setting that is closer to the experience of humans and of systems that must learn "in the wild". Yet commonly available benchmarks are far from these real-world conditions, because they explicitly signal different tasks, lack latent similarity structure, or assume temporal independence between different examples. Here, we propose a new benchmark for OCL based on language modelling in which the input alternates between different languages and domains without any explicit delimitation. Additionally, we propose new metrics to study catastrophic forgetting in this setting and evaluate multiple baseline models based on compositions of experts. Finally, we introduce a simple gating technique that learns the latent similarities between different inputs, improving the performance of a Products of Experts model.	Translated: 2022-12-15 22:25:55 Published: 2021-02-01
# Happy Are Those Who Grade without Seeing: A Multi-Task Learning Approach to Grade Essays Using Gaze Behaviour ( http://arxiv.org/abs/2005.12078v2 ) License: check the linked page	Sandeep Mathias, Rudra Murthy, Diptesh Kanojia, Abhijit Mishra, Pushpak Bhattacharyya	The gaze behaviour of a reader is helpful in solving several NLP tasks such as automatic essay grading. However, collecting gaze behaviour from readers is costly in terms of time and money. In this paper, we propose a way to improve automatic essay grading using gaze behaviour, which is learnt at run time using a multi-task learning framework. To demonstrate the efficacy of this multi-task learning based approach to automatic essay grading, we collect gaze behaviour for 48 essays across 4 essay sets, and learn gaze behaviour for the rest of the essays, numbering over 7000. Using the learnt gaze behaviour, we achieve a statistically significant improvement in performance over the state-of-the-art system for the essay sets where we have gaze data. We also achieve a statistically significant improvement for 4 other essay sets, numbering about 6000 essays, where we have no gaze behaviour data available. Our approach establishes that learning gaze behaviour improves automatic essay grading.	Translated: 2022-11-29 05:56:35 Published: 2021-02-01
# Robust Baggage Detection and Classification Based on Local Tri-directional Pattern ( http://arxiv.org/abs/2006.07345v3 ) License: check the linked page	Shahbano, Muhammad Abdullah and Kashif Inayat	In recent decades, automatic video surveillance systems have gained significant importance in the computer vision community. The crucial objective of surveillance is monitoring and security in public places. In the traditional Local Binary Pattern, the feature description is somewhat inaccurate and the feature size is large. To overcome these shortcomings, we propose an algorithm for detecting whether a person is carrying baggage. The Local tri-directional pattern (LtriDP) descriptor is used to extract features of different human body parts, including the head, trunk, and limbs. A support vector machine is then trained and evaluated on the extracted features. Experimental results on the INRIA and MSMT17 V1 datasets show that LtriDP outperforms several state-of-the-art feature descriptors and validate its effectiveness.	Translated: 2022-11-22 04:36:05 Published: 2021-02-01
# Reinforcement Learning for Mean Field Games with Strategic Complementarities ( http://arxiv.org/abs/2006.11683v3 ) License: check the linked page	Kiyeob Lee, Desik Rengarajan, Dileep Kalathil, Srinivas Shakkottai	Mean Field Games (MFG) are a class of games with a very large number of agents, for which the standard equilibrium concept is the Mean Field Equilibrium (MFE). Algorithms for learning MFE in dynamic MFGs are unknown in general. Our focus is on an important subclass that possesses a monotonicity property called Strategic Complementarities (MFG-SC). We introduce a natural refinement to the equilibrium concept that we call Trembling-Hand-Perfect MFE (T-MFE), which allows agents to employ a measure of randomization while accounting for the impact of such randomization on their payoffs. We propose a simple algorithm for computing T-MFE under a known model. We also introduce a model-free and a model-based approach to learning T-MFE and provide sample complexities of both algorithms. We also develop a fully online learning scheme that obviates the need for a simulator. Finally, we empirically evaluate the performance of the proposed algorithms via examples motivated by real-world applications.	Translated: 2022-11-18 12:40:43 Published: 2021-02-01
# High-recall causal discovery for autocorrelated time series with latent confounders ( http://arxiv.org/abs/2007.01884v3 ) License: check the linked page	Andreas Gerhardus and Jakob Runge	We present a new method for linear and nonlinear, lagged and contemporaneous constraint-based causal discovery from observational time series in the presence of latent confounders. We show that existing causal discovery methods such as FCI and its variants suffer from low recall in the autocorrelated time series case, and we identify the low effect size of conditional independence tests as the main reason. Information-theoretical arguments show that effect size can often be increased if causal parents are included in the conditioning sets. To identify parents early on, we suggest an iterative procedure that utilizes novel orientation rules to determine ancestral relationships already during the edge removal phase. We prove that the method is order-independent, and sound and complete in the oracle case. Extensive simulation studies for different numbers of variables, time lags, sample sizes, and further cases demonstrate that our method indeed achieves much higher recall than existing methods for the case of autocorrelated continuous variables while keeping false positives at the desired level. This performance gain grows with stronger autocorrelation. At https://github.com/jakobrunge/tigramite we provide Python code for all methods involved in the simulation studies.	Translated: 2022-11-14 05:00:58 Published: 2021-02-01
# Medical Instrument Detection in Ultrasound-Guided Interventions: A Review ( http://arxiv.org/abs/2007.04807v2 ) License: check the linked page	Hongxu Yang, Caifeng Shan, Alexander F. Kolen, Peter H. N. de With	Medical instrument detection is essential for computer-assisted interventions because it helps surgeons find the instrument efficiently and with better interpretation, which leads to a better outcome. This article reviews medical instrument detection methods in ultrasound-guided interventions. First, we present a comprehensive review of instrument detection methodologies, which include traditional non-data-driven methods and data-driven methods. The non-data-driven methods were extensively studied before the era of machine learning, that is, before data-driven approaches. We discuss the main clinical applications of medical instrument detection in ultrasound, including anesthesia, biopsy, prostate brachytherapy, and cardiac catheterization, which were validated on clinical datasets. Finally, we select several principal publications to summarize the key issues and potential research directions for the computer-assisted intervention community.	Translated: 2022-11-12 05:27:53 Published: 2021-02-01
# Consensus-Aware Visual-Semantic Embedding for Image-Text Matching ( http://arxiv.org/abs/2007.08883v2 ) License: check the linked page	Haoran Wang, Ying Zhang, Zhong Ji, Yanwei Pang, Lin Ma	Image-text matching plays a central role in bridging vision and language. Most existing approaches rely only on the image-text instance pair to learn their representations, thereby exploiting their matching relationships and making the corresponding alignments. Such approaches exploit only the superficial associations contained in the instance pairwise data, with no consideration of any external commonsense knowledge, which may hinder their ability to reason about the higher-level relationships between image and text. In this paper, we propose a Consensus-aware Visual-Semantic Embedding (CVSE) model to incorporate the consensus information, namely the commonsense knowledge shared between both modalities, into image-text matching. Specifically, the consensus information is exploited by computing the statistical co-occurrence correlations between the semantic concepts from the image captioning corpus and deploying the constructed concept correlation graph to yield the consensus-aware concept (CAC) representations. Afterwards, CVSE learns the associations and alignments between image and text based on the exploited consensus as well as the instance-level representations for both modalities. Extensive experiments conducted on two public datasets verify that the exploited consensus makes significant contributions to constructing more meaningful visual-semantic embeddings, with performance superior to the state-of-the-art approaches on the bidirectional image and text retrieval task. The code for this paper is available at: https://github.com/BruceW91/CVSE.	Translated: 2022-11-09 14:05:35 Published: 2021-02-01
# SummEval: Re-evaluating Summarization Evaluation ( http://arxiv.org/abs/2007.12626v4 ) License: check the linked page	Alexander R. Fabbri, Wojciech Kryściński, Bryan McCann, Caiming Xiong, Richard Socher, Dragomir Radev	The scarcity of comprehensive up-to-date studies on evaluation metrics for text summarization and the lack of consensus regarding evaluation protocols continue to inhibit progress. We address the existing shortcomings of summarization evaluation methods along five dimensions: 1) we re-evaluate 14 automatic evaluation metrics in a comprehensive and consistent fashion using neural summarization model outputs along with expert and crowd-sourced human annotations, 2) we consistently benchmark 23 recent summarization models using the aforementioned automatic evaluation metrics, 3) we assemble the largest collection of summaries generated by models trained on the CNN/DailyMail news dataset and share it in a unified format, 4) we implement and share a toolkit that provides an extensible and unified API for evaluating summarization models across a broad range of automatic metrics, 5) we assemble and share the largest and most diverse, in terms of model types, collection of human judgments of model-generated summaries on the CNN/DailyMail dataset, annotated by both expert judges and crowd-sourced workers. We hope that this work will help promote a more complete evaluation protocol for text summarization as well as advance research in developing evaluation metrics that better correlate with human judgments.	Translated: 2022-11-07 06:39:58 Published: 2021-02-01
# K-Shot Contrastive Learning of Visual Features with Multiple Instance Augmentations ( http://arxiv.org/abs/2007.13310v2 ) License: check the linked page	Haohang Xu, Hongkai Xiong, Guo-Jun Qi	In this paper, we propose $K$-Shot Contrastive Learning (KSCL) of visual features, which applies multiple augmentations to investigate the sample variations within individual instances. It aims to combine the advantages of inter-instance discrimination, by learning discriminative features to distinguish between different instances, with those of intra-instance variations, by matching queries against the variants of augmented samples over instances. In particular, for each instance, it constructs an instance subspace to model how the significant factors of variation in $K$-shot augmentations can be combined to form the variants of augmentations. Given a query, the most relevant variant of instances is then retrieved by projecting the query onto their subspaces to predict the positive instance class. This generalizes existing contrastive learning, which can be viewed as a special one-shot case. An eigenvalue decomposition is performed to configure instance subspaces, and the embedding network can be trained end-to-end through the differentiable subspace configuration. Experimental results demonstrate that the proposed $K$-shot contrastive learning achieves performance superior to state-of-the-art unsupervised methods.	Translated: 2022-11-06 08:37:29 Published: 2021-02-01
# VPC-Net: Completion of 3D Vehicles from MLS Point Clouds ( http://arxiv.org/abs/2008.03404v2 ) License: check the linked page	Yan Xia, Yusheng Xu, Cheng Wang, Uwe Stilla	As a dynamic and essential component in the road environment of urban scenarios, vehicles are the most popular targets of investigation. To monitor their behavior and extract their geometric characteristics, accurate and instant measurement of vehicles plays a vital role in the traffic and transportation fields. Point clouds acquired from mobile laser scanning (MLS) systems deliver 3D information of road scenes with unprecedented detail. They have proven to be an adequate data source in the fields of intelligent transportation and autonomous driving, especially for extracting vehicles. However, 3D point clouds of vehicles acquired from MLS systems are inevitably incomplete due to object occlusion or self-occlusion. To tackle this problem, we propose a neural network, named Vehicle Points Completion-Net (VPC-Net), that synthesizes complete, dense, and uniform point clouds for vehicles from MLS data. In this network, we introduce a new encoder module to extract global features from the input instance, consisting of a spatial transformer network and a point feature enhancement layer. Moreover, a new refiner module is presented to preserve the vehicle details from the inputs and refine the complete outputs with fine-grained information. Given sparse and partial point clouds as inputs, the network can generate complete and realistic vehicle structures and keep the fine-grained details from the partial inputs. We evaluated the proposed VPC-Net in different experiments using synthetic and real-scan datasets and applied the results to 3D vehicle monitoring tasks. Quantitative and qualitative experiments demonstrate the promising performance of the proposed VPC-Net and show state-of-the-art results.	Translated: 2022-11-01 11:55:18 Published: 2021-02-01
# Homotopic Gradients of Generative Density Priors for MR Image Reconstruction ( http://arxiv.org/abs/2008.06284v2 ) License: check the linked page	Cong Quan, Jinjie Zhou, Yuanzheng Zhu, Yang Chen, Shanshan Wang, Dong Liang, Qiegen Liu	Deep learning, particularly generative modeling, has recently demonstrated tremendous potential to significantly speed up image reconstruction with reduced measurements. Unlike existing generative models, which often optimize the density priors, in this work we take advantage of denoising score matching and propose homotopic gradients of generative density priors (HGGDP) for magnetic resonance imaging (MRI) reconstruction. More precisely, to tackle the low-dimensional manifold and low data density region issues in the generative density prior, we estimate the target gradients in a higher-dimensional space. We train a more powerful noise conditional score network by forming a high-dimensional tensor as the network input at the training phase. More artificial noise is also injected in the embedding space. At the reconstruction stage, a homotopy method is employed to pursue the density prior, so as to boost the reconstruction performance. Experimental results indicate the remarkable performance of HGGDP in terms of high reconstruction accuracy; with only 10% of the k-space data, it can still generate high-quality images as effectively as standard MRI reconstruction with fully sampled data.	Translated: 2022-10-30 17:45:54 Published: 2021-02-01
# Stabilizing Invertible Neural Networks Using Mixture Models ( http://arxiv.org/abs/2009.02994v2 ) License: check the linked page	Paul Hagemann and Sebastian Neumayer	In this paper, we analyze the properties of invertible neural networks, which provide a way of solving inverse problems. Our main focus lies on investigating and controlling the Lipschitz constants of the corresponding inverse networks. Without such control, numerical simulations are prone to errors and not much is gained over traditional approaches. Fortunately, our analysis indicates that changing the latent distribution from a standard normal one to a Gaussian mixture model resolves the issue of exploding Lipschitz constants. Indeed, numerical simulations confirm that this modification leads to significantly improved sampling quality in multimodal applications.	Translated: 2022-10-21 02:39:58 Published: 2021-02-01
# Grounded Adaptation for Zero-shot Executable Semantic Parsing ( http://arxiv.org/abs/2009.07396v3 ) License: check the linked page	Victor Zhong, Mike Lewis, Sida I. Wang, Luke Zettlemoyer	We propose Grounded Adaptation for Zero-shot Executable Semantic Parsing (GAZP) to adapt an existing semantic parser to new environments (e.g. new database schemas). GAZP combines a forward semantic parser with a backward utterance generator to synthesize data (e.g. utterances and SQL queries) in the new environment, then selects cycle-consistent examples to adapt the parser. Unlike data augmentation, which typically synthesizes unverified examples in the training environment, GAZP synthesizes examples in the new environment whose input-output consistency is verified. On the Spider, Sparc, and CoSQL zero-shot semantic parsing tasks, GAZP improves the logical form and execution accuracy of the baseline parser. Our analyses show that GAZP outperforms data augmentation in the training environment, that performance increases with the amount of GAZP-synthesized data, and that cycle-consistency is central to successful adaptation.	Translated: 2022-10-17 22:43:22 Published: 2021-02-01
# Real-Time Optimization Meets Bayesian Optimization and Derivative-Free Optimization: A Tale of Modifier Adaptation ( http://arxiv.org/abs/2009.08819v2 ) License: check the linked page	Ehecatl Antonio del Rio-Chanona and Panagiotis Petsagkourakis and Eric Bradford and Jose Eduardo Alves Graciano and Benoit Chachuat	This paper investigates a new class of modifier-adaptation schemes to overcome plant-model mismatch in real-time optimization of uncertain processes. The main contribution lies in the integration of concepts from the areas of Bayesian optimization and derivative-free optimization. The proposed schemes embed a physical model and rely on trust-region ideas to minimize risk during the exploration, while employing Gaussian process regression to capture the plant-model mismatch in a non-parametric way and drive the exploration by means of acquisition functions. The benefits of using an acquisition function, knowing the process noise level, or specifying a nominal process model are illustrated on numerical case studies, including a semi-batch photobioreactor optimization problem.	Translated: 2022-10-17 03:44:20 Published: 2021-02-01
# DocuBot : Generating financial reports using natural language interactions ( http://arxiv.org/abs/2010.01169v2 ) License: check the linked page	Vineeth Ravi, Selim Amrouni, Andrea Stefanucci, Armineh Nourbakhsh, Prashant Reddy, Manuela Veloso	The financial services industry perpetually processes an overwhelming amount of complex data. Digital reports are often created based on tedious manual analysis as well as visualization of the underlying trends and characteristics of data. Often, the accruing costs of human computation errors in creating these reports are very high. We present DocuBot, a novel AI-powered virtual assistant for creating and modifying content in digital documents by modeling natural language interactions as "skills" and using them to transform underlying data. DocuBot can agglomerate saved skills for reuse, enabling humans to automatically generate recurrent reports. DocuBot can also continuously learn domain-specific and user-specific vocabulary by interacting with the user. We present evidence that DocuBot adds value to the financial industry and demonstrate its impact with experiments involving real and simulated users tasked with creating PowerPoint presentations.	Translated: 2022-10-12 01:44:12 Published: 2021-02-01
# Enhancing Fine-grained Sentiment Classification Exploiting Local Context Embedding ( http://arxiv.org/abs/2010.00767v3 ) License: check the linked page	Heng Yang, Biqing Zeng	Target-oriented sentiment classification is a fine-grained natural language processing task that analyzes the sentiment polarity of targets. To improve the performance of sentiment classification, many approaches have proposed various attention mechanisms to capture the important context words of a target. However, previous approaches have ignored the significant relatedness between a target's sentiment and its local context. This paper proposes a local context-aware network (LCA-Net), equipped with a local context embedding and a local context prediction loss, to strengthen the model by emphasizing the sentiment information of the local context. The experimental results on three common datasets show that the local context-aware network outperforms existing approaches in extracting local context features. In addition, the local context-aware framework is easy to adapt to many models and has the potential to improve other target-level tasks.	Translated: 2022-10-12 01:34:17 Published: 2021-02-01
# Guided Curriculum Learning for Walking Over Complex Terrain ( http://arxiv.org/abs/2010.03848v2 ) License: check the linked page	Brendan Tidd, Nicolas Hudson, Akansel Cosgun	Reliable bipedal walking over complex terrain is a challenging problem, and using a curriculum can help learning. Curriculum learning is the idea of starting with an achievable version of a task and increasing the difficulty as success criteria are met. We propose a 3-stage curriculum to train Deep Reinforcement Learning policies for bipedal walking over various challenging terrains. In the first stage, the agent starts on an easy terrain and the terrain difficulty is gradually increased, while forces derived from a target policy are applied to the robot joints and the base. In the second stage, the guiding forces are gradually reduced to zero. Finally, in the third stage, random perturbations with increasing magnitude are applied to the robot base, so that the robustness of the policies is improved. In simulation experiments, we show that our approach is effective in learning separate walking policies for five terrain types: flat, hurdles, gaps, stairs, and steps. Moreover, we demonstrate that in the absence of human demonstrations, a simple hand-designed walking trajectory is a sufficient prior to learn to traverse complex terrain types. In ablation studies, we show that removing any one of the three stages of the curriculum degrades the learning performance.	Translated: 2022-10-09 11:39:11 Published: 2021-02-01
# A Transformer Based Pitch Sequence Autoencoder with MIDI Augmentation ( http://arxiv.org/abs/2010.07758v3 ) License: check the linked page	Mingshuo Ding, Yinghao Ma	Despite recent achievements of deep learning algorithms for automatic music generation, few approaches have been proposed to evaluate whether a single-track music excerpt was composed by automatons or by Homo sapiens. To tackle this problem, we apply a masked language model based on ALBERT to composer classification. The aim is to obtain a model that can estimate the probability that a MIDI clip was composed, conditioned on the auto-generation hypothesis, and that is trained with only AI-composed single-track MIDI. In this paper, the number of parameters is reduced, and two data augmentation methods are proposed together with a refined loss function to prevent overfitting. The experimental results show that our model ranks $3^{rd}$ among the $7$ teams in the CSMT (2020) data challenge. Furthermore, this method could be extended to other music information retrieval tasks that are based on a small dataset.	Translated: 2022-10-07 05:38:55 Published: 2021-02-01
# ファネル構造を用いた意思決定問題--メールマーケティングキャンペーンへの応用によるマルチタスク学習アプローチ Decision Making Problems with Funnel Structure: A Multi-Task Learning Approach with Application to Email Marketing Campaigns ( http://arxiv.org/abs/2010.08048v2 ) ライセンス: Link先を確認	Ziping Xu, Amirhossein Meisami, Ambuj Tewari	(参考訳) 本稿では,ファネル構造を用いた意思決定問題について考察する。マーケティング分野においてよく知られた概念であるファネル構造は、浅いものよりも深い層からの観測がはるかに少ない層状に、意思決定者が環境と相互作用するシステムにおいて発生する。例えば、eメールマーケティングキャンペーンアプリケーションでは、レイヤはオープン、クリック、購入のイベントに対応しています。 ClickからPurchaseへの変換は、メールのリンクがクリックされない限り購入はできないため、非常にまれにしか起こらない。我々は,この困難な意思決定問題をファネル構造を持つコンテキストバンディットとして定式化し,深い層からの十分な観察の欠如を軽減するマルチタスク学習アルゴリズムを開発した。我々は予測誤差とアルゴリズムの後悔の両方を分析した。我々は単純なシミュレーションにより予測誤差の理論を検証する。シミュレーション環境と、大手メールマーケティング企業の実世界データに基づく環境の両方における実験により,従来の手法に比べてアルゴリズムが大幅に改善することが示された。 This paper studies the decision making problem with Funnel Structure. Funnel structure, a well-known concept in the marketing field, occurs in those systems where the decision maker interacts with the environment in a layered manner receiving far fewer observations from deep layers than shallow ones. For example, in the email marketing campaign application, the layers correspond to Open, Click and Purchase events. Conversions from Click to Purchase happen very infrequently because a purchase cannot be made unless the link in an email is clicked on. We formulate this challenging decision making problem as a contextual bandit with funnel structure and develop a multi-task learning algorithm that mitigates the lack of sufficient observations from deeper layers. We analyze both the prediction error and the regret of our algorithms. We verify our theory on prediction errors through a simple simulation. Experiments on both a simulated environment and an environment based on real-world data from a major email marketing company show that our algorithms offer significant improvement over previous methods.	翻訳日:2022-10-07 03:25:46 公開日:2021-02-01
# SAINT+:EDNetの精度予測のための時間的特徴の統合 SAINT+: Integrating Temporal Features for EdNet Correctness Prediction ( http://arxiv.org/abs/2010.12042v2 ) ライセンス: Link先を確認	Dongmin Shin, Yugeun Shim, Hangyeol Yu, Seewoo Lee, Byungsoo Kim, Youngduck Choi	(参考訳) 本稿では,学習者情報と運動情報とを別々に処理する,トランスフォーマーに基づく知識追跡モデルSAINTの後継であるSAINT+を提案する。 SAINTのアーキテクチャに従って、SAINT+はエンコーダ・デコーダ構造を持ち、エンコーダは運動埋め込みのストリームに自己アテンション層を適用し、デコーダは、応答埋め込みとエンコーダ出力のストリームに自己アテンション層とエンコーダ・アテンション層を交互に適用する。さらに、SAINT+は2つの時間的特徴埋め込みを反応埋め込みに組み込んでおり、時間経過、生徒が答えるのに要する時間、学習活動間の時間間隔であるラグ時間である。教育領域で最大の公開ベンチマークデータセットであるEdNetにおけるSAINT+の有効性を実証的に評価した。実験結果から,SAINT+は,EdNetデータセットの現在最先端モデルであるSAINTと比較して,レシーバ動作特性曲線下での領域の1.25%の改善により,知識追跡における最先端性を実現していることがわかった。 We propose SAINT+, a successor of SAINT which is a Transformer based knowledge tracing model that separately processes exercise information and student response information. Following the architecture of SAINT, SAINT+ has an encoder-decoder structure where the encoder applies self-attention layers to a stream of exercise embeddings, and the decoder alternately applies self-attention layers and encoder-decoder attention layers to streams of response embeddings and encoder output. Moreover, SAINT+ incorporates two temporal feature embeddings into the response embeddings: elapsed time, the time taken for a student to answer, and lag time, the time interval between adjacent learning activities. We empirically evaluate the effectiveness of SAINT+ on EdNet, the largest publicly available benchmark dataset in the education domain. Experimental results show that SAINT+ achieves state-of-the-art performance in knowledge tracing with an improvement of 1.25% in area under receiver operating characteristic curve compared to SAINT, the current state-of-the-art model in EdNet dataset.	翻訳日:2022-10-05 21:23:50 公開日:2021-02-01
# 動的環境における後悔最適制御 Regret-optimal control in dynamic environments ( http://arxiv.org/abs/2010.10473v2 ) ライセンス: Link先を確認	Gautam Goel, Babak Hassibi	(参考訳) 後悔最小化の観点から線形時間変動力学系の制御を考察する。この領域における多くの先行研究とは違って、特定の種類のコントローラにおいて最高の固定コントローラではなく、後方視(動的後悔)で選択された制御アクションの最良の動的シーケンスに対する後悔を最小限に抑えるオンラインコントローラを設計する問題に焦点を当てる(静的後悔)。この定式化は、環境が経時的に変化し、単一のコントローラが時間軸全体にわたって優れたパフォーマンスを達成できない場合に魅力的である。我々は,$H_{\infty}$制御への新たな帰着を通じて後悔最適制御器の状態空間構造を導出し,乱れのエネルギーの観点から,その後悔に基づく厳密なデータ依存境界を提示する。この結果は,制御器が将来の乱れを予測できるモデル予測設定や,固定遅延後のシステムダイナミクスにのみ影響する設定に容易に拡張できる。後悔最適制御器が,確率的および敵対的環境にわたって $H_2$ 最適制御器と $H_{\infty}$ 最適制御器の性能の間を補間することを示す数値実験を示す。 We consider control in linear time-varying dynamical systems from the perspective of regret minimization. Unlike most prior work in this area, we focus on the problem of designing an online controller which minimizes regret against the best dynamic sequence of control actions selected in hindsight (dynamic regret), instead of the best fixed controller in some specific class of controllers (static regret). This formulation is attractive when the environment changes over time and no single controller achieves good performance over the entire time horizon. We derive the state-space structure of the regret-optimal controller via a novel reduction to $H_{\infty}$ control and present a tight data-dependent bound on its regret in terms of the energy of the disturbance. Our results easily extend to the model-predictive setting where the controller can anticipate future disturbances and to settings where the controller only affects the system dynamics after a fixed delay. We present numerical experiments which show that our regret-optimal controller interpolates between the performance of the $H_2$-optimal and $H_{\infty}$-optimal controllers across stochastic and adversarial environments.	翻訳日:2022-10-05 08:05:44 公開日:2021-02-01
# 医用画像のクロスモーダル情報最大化:CMIM Cross-Modal Information Maximization for Medical Imaging: CMIM ( http://arxiv.org/abs/2010.10593v3 ) ライセンス: Link先を確認	Tristan Sylvain, Francis Dutil, Tess Berthier, Lisa Di Jorio, Margaux Luck, Devon Hjelm, Yoshua Bengio	(参考訳) 病院では、患者が行っている異なる医用画像検査(CTスキャン、MRI、PET、超音波など)や関連する放射線検査など、異なるモードで同じ情報を利用できる特定の情報システムにデータがサイロ化される。これは、テスト時に常に利用できないかもしれない同じ情報の複数のビューを列車で取得し、使用するためのユニークな機会を提供する。本稿では, 相互情報最大化の最近の進歩を用いて, モダリティ低下に弾力性のあるマルチモーダル入力の良質な表現を学習することにより, 利用可能なデータを最大限に活用する革新的な枠組みを提案する。列車時間におけるクロスモーダル情報の最大化により、医療画像分類とセグメンテーションという2つの異なる設定で、最先端のベースラインを上回ります。特に本手法は,弱いモダリティの推論時間性能に大きな影響を与えることが示されている。 In hospitals, data are siloed to specific information systems that make the same information available under different modalities such as the different medical imaging exams the patient undergoes (CT scans, MRI, PET, Ultrasound, etc.) and their associated radiology reports. This offers unique opportunities to obtain and use at train-time those multiple views of the same information that might not always be available at test-time. In this paper, we propose an innovative framework that makes the most of available data by learning good representations of a multi-modal input that are resilient to modality dropping at test-time, using recent advances in mutual information maximization. By maximizing cross-modal information at train time, we are able to outperform several state-of-the-art baselines in two different settings, medical image classification, and segmentation. In particular, our method is shown to have a strong impact on the inference-time performance of weaker modalities.	翻訳日:2022-10-05 06:11:23 公開日:2021-02-01
# 一般化された連続ゼロショット学習 Generalized Continual Zero-Shot Learning ( http://arxiv.org/abs/2011.08508v3 ) ライセンス: Link先を確認	Chandan Gautam, Sethupathy Parameswaran, Ashish Mishra, Suresh Sundaram	(参考訳) 最近、ゼロショット学習(ZSL)がエキサイティングなトピックとして登場し、多くの注目を集めた。 ZSLは、既知のクラスからの知識をクラス記述に基づいて未知のクラスに移すことで、未知のクラスを分類することを目指している。有望なパフォーマンスを示しているにもかかわらず、ZSLのアプローチは、すべての既知クラスからのトレーニングサンプルがトレーニング中に利用可能であると仮定しており、これは現実的には実現困難である。そこで本研究では,タスクの形式でクラスが順次到着するZSL(Continual ZSL, CZSL)のより汎用的で実用的な設定を提案し,過去の経験を生かして変化する環境から積極的に学習する。さらに、信頼性を高めるために、トレーニングプロセス中にタスクアイデンティティが明らかにされるが、テスト中には明らかにされない、単一ヘッド連続学習環境のCZSLを開発する。破滅的な忘却と非妥協性(intransigence)を避けるため,我々は知識蒸留と,それ以前のタスクからの少数のサンプルの保存と再生を,小さなエピソードメモリを用いて行う。我々は,5つのZSLベンチマークデータセット上で,連続学習の2つの異なる設定のためのベースラインを開発し,一般化されたCZSLを評価する。さらに、CZSLは2種類の変分オートエンコーダに対して開発され、分類のために2種類の特徴を生成する。 (i)出力空間に生成した特徴と (ii)潜在空間に生成した識別的特徴である。実験結果から, 単一ヘッドCZSLはより一般化可能で, 実用に適していることが明らかとなった。 Recently, zero-shot learning (ZSL) emerged as an exciting topic and attracted a lot of attention. ZSL aims to classify unseen classes by transferring the knowledge from seen classes to unseen classes based on the class description. Despite showing promising performance, ZSL approaches assume that the training samples from all seen classes are available during the training, which is practically not feasible. To address this issue, we propose a more generalized and practical setup for ZSL, i.e., continual ZSL (CZSL), where classes arrive sequentially in the form of a task and it actively learns from the changing environment by leveraging the past experience. Further, to enhance the reliability, we develop CZSL for a single head continual learning setting where task identity is revealed during the training process but not during the testing. To avoid catastrophic forgetting and intransigence, we use knowledge distillation and store and replay a few samples from previous tasks using a small episodic memory. 
We develop baselines and evaluate generalized CZSL on five ZSL benchmark datasets for two different settings of continual learning: with and without class incremental. Moreover, CZSL is developed for two types of variational autoencoders, which generate two types of features for classification: (i) generated features at output space and (ii) generated discriminative features at the latent space. The experimental results clearly indicate that single-head CZSL is more generalizable and suitable for practical applications.	翻訳日:2022-09-24 16:46:58 公開日:2021-02-01
# 多特徴融合ディープネットワークによるロバスト超解像深度イメージング Robust super-resolution depth imaging via a multi-feature fusion deep network ( http://arxiv.org/abs/2011.11444v2 ) ライセンス: Link先を確認	Alice Ruget, Stephen McLaughlin, Robert K. Henderson, Istvan Gyongy, Abderrahim Halimi and Jonathan Leach	(参考訳) 3次元イメージングは、深度を記録する必要がある画像アプリケーションにおいて重要な役割を果たす。深度イメージングを利用するアプリケーションの数は急速に増えており、例えば自動運転車やスマートフォンカメラのオートフォーカスアシストなどがある。単一光子感度検出器(SPAD)アレイによる光検出と測距(LIDAR)は、高フレームレートで深度画像の取得を可能にする新興技術である。しかし、この技術の空間分解能は、通常、従来のカメラで記録された強度画像と比較して低い。本研究では,SPADカメラからの奥行き画像のネイティブ解像度を高めるために,カメラのヒストグラムデータから抽出できる複数の特徴を活かしたディープネットワークを構築した。ネットワークはデュアルモードで動作するSPADカメラ用に設計されており、高フレームレートで低解像度の深度画像と高解像度の強度画像を交互に撮影するため、強度画像を得るための追加のセンサを必要としない。ネットワークは、深度のアップサンプリングを導くために、強度画像と、ダウンサンプリングしたヒストグラムから抽出された複数の特徴を使用する。我々のネットワークは、幅広い信号対雑音比と光子レベルにまたがる画像分解能の大幅な向上と画像デノイジングを提供する。ネットワークを様々な3Dデータに適用し,デノイジングと4倍の深度解像度向上を実証する。 Three-dimensional imaging plays an important role in imaging applications where it is necessary to record depth. The number of applications that use depth imaging is increasing rapidly, and examples include self-driving autonomous vehicles and auto-focus assist on smartphone cameras. Light detection and ranging (LIDAR) via single-photon sensitive detector (SPAD) arrays is an emerging technology that enables the acquisition of depth images at high frame rates. However, the spatial resolution of this technology is typically low in comparison to the intensity images recorded by conventional cameras. To increase the native resolution of depth images from a SPAD camera, we develop a deep network built specifically to take advantage of the multiple features that can be extracted from a camera's histogram data. The network is designed for a SPAD camera operating in a dual-mode such that it captures alternate low resolution depth and high resolution intensity images at high frame rates, thus the system does not require any additional sensor to provide intensity images. 
The network then uses the intensity images and multiple features extracted from downsampled histograms to guide the upsampling of the depth. Our network provides significant image resolution enhancement and image denoising across a wide range of signal-to-noise ratios and photon levels. We apply the network to a range of 3D data, demonstrating denoising and a four-fold resolution enhancement of depth.	翻訳日:2022-09-23 06:24:23 公開日:2021-02-01
# (参考訳) 改良された電波銀河分類のためのアテンションゲーティング Attention-gating for improved radio galaxy classification ( http://arxiv.org/abs/2012.01248v2 ) ライセンス: CC BY 4.0	Micah Bowles, Anna M. M. Scaife, Fiona Porter, Hongming Tang, David J. Bastien	(参考訳) 本研究では,畳み込みニューラルネットワークを用いた電波銀河の分類技術として注目される。この分野では、次の最小のCNNアプリケーションよりも50%以上少ないパラメータを使用しながら、従来の分類器と同等のアテンションベースモデルを提案する。注意図作成に使用される正規化と集約法の選択が個々のモデルの出力にどのように影響するかを定量的に示し、その結果の注意マップを用いて、モデルによる分類選択を解釈できることを示す。我々は,本モデルで同定された有能な領域が,有能な人間分類器が同等の分類を行う領域とよく一致していることを観察した。正規化とアグリゲーションの選択は個々のモデルの性能にはほとんど影響しないが、それぞれの注意マップの解釈可能性に大きな影響を与え、天文学者が電波源を目で分類する方法とよく一致したモデルを選択することで、より効果的な方法でモデルを利用できることを示す。 In this work we introduce attention as a state of the art mechanism for classification of radio galaxies using convolutional neural networks. We present an attention-based model that performs on par with previous classifiers while using more than 50% fewer parameters than the next smallest classic CNN application in this field. We demonstrate quantitatively how the selection of normalisation and aggregation methods used in attention-gating can affect the output of individual models, and show that the resulting attention maps can be used to interpret the classification choices made by the model. We observe that the salient regions identified by the our model align well with the regions an expert human classifier would attend to make equivalent classifications. We show that while the selection of normalisation and aggregation may only minimally affect the performance of individual models, it can significantly affect the interpretability of the respective attention maps and by selecting a model which aligns well with how astronomers classify radio sources by eye, a user can employ the model in a more effective manner.	翻訳日:2021-05-30 09:05:53 公開日:2021-02-01
# (参考訳) 非凸最適化のための縮小半径によるブロック座標降下の収束 Convergence of block coordinate descent with diminishing radius for nonconvex optimization ( http://arxiv.org/abs/2012.03503v2 ) ライセンス: CC BY 4.0	Hanbaek Lyu	(参考訳) ブロック座標降下(英: Block coordinate descent、BCD)は、非凸最適化のための単純な反復アルゴリズムであり、各ブロック座標の目的関数を逐次最小化し、他の座標を固定する。我々はブロックワイズ凸と微分可能な目的関数の定常点に収束することを保証した bcd のバージョンを提案する。さらに、$n$ が反復数を表す順序 $\log n/\sqrt{n}$ の最適な収束率を得る。鍵となる考え方は、減少する半径内でパラメータ探索を制限し、反復体の安定性を促進させ、そのような補助的制約が限界で消えることを示すことである。応用として、再構成誤差の定常点に収束する非負のCPテンソル因子化のための修正された最小二乗アルゴリズムを、収束率のベストケースで同じ境界で提供する。また,合成データと実世界のデータの両方を用いて実験を行った。 Block coordinate descent (BCD), also known as nonlinear Gauss-Seidel, is a simple iterative algorithm for nonconvex optimization that sequentially minimizes the objective function in each block coordinate while the other coordinates are held fixed. We propose a version of BCD that is guaranteed to converge to the stationary points of block-wise convex and differentiable objective functions under constraints. Furthermore, we obtain a best-case rate of convergence of order $\log n/\sqrt{n}$, where $n$ denotes the number of iterations. A key idea is to restrict the parameter search within a diminishing radius to promote stability of iterates, and then to show that such auxiliary constraints vanish in the limit. As an application, we provide a modified alternating least squares algorithm for nonnegative CP tensor factorization that converges to the stationary points of the reconstruction error with the same bound on the best-case rate of convergence. We also experimentally validate our results with both synthetic and real-world data.	翻訳日:2021-05-21 04:49:45 公開日:2021-02-01
# MHT-X:アルゴリズムXを用いたオフライン多重仮説追跡 MHT-X: Offline Multiple Hypothesis Tracking with Algorithm X ( http://arxiv.org/abs/2101.05202v2 ) ライセンス: Link先を確認	Peteris Zvejnieks, Mihails Birjukovs, Martins Klevs, Megumi Akashi, Sven Eckert, Andris Jakovics	(参考訳) Pythonを用いて最適な相関探索のためのアルゴリズムXを用いたオフライン多重仮説追跡の効率的で汎用的な実装を開発した。このコードは、オンライン処理を必要としない科学アプリケーションを対象としている。有向グラフフレームワークが使われ、時間窓幅が漸進的に増加する複数のスキャンが最大確率軌道のためのエッジ構築に使用される。現在のバージョンのコードは多相流体力学への応用のために開発された。気泡と粒子追跡は、物体の動きを解消し、マージし、分割することができる。対象特性の統計関数に変換される弱い質量と運動量保存則を用いて、実現可能な対象関係と軌道グラフエッジの確率を決定する。符号は n 次元運動と互換性があり、任意のトラックオブジェクト特性を持つ。このフレームワークは、現在使われているヒューリスティックを、問題に対してより適切なものに置き換えることで、現在のアプリケーションを超えて容易に拡張できる。コードはオープンソースで、今後も開発が続けられる。 An efficient and versatile implementation of offline multiple hypothesis tracking with Algorithm X for optimal association search was developed using Python. The code is intended for scientific applications that do not require online processing. Directed graph framework is used and multiple scans with progressively increasing time window width are used for edge construction for maximum likelihood trajectories. The current version of the code was developed for applications in multiphase hydrodynamics, e.g. bubble and particle tracking, and is capable of resolving object motion, merges and splits. Feasible object associations and trajectory graph edge likelihoods are determined using weak mass and momentum conservation laws translated to statistical functions for object properties. The code is compatible with n-dimensional motion with arbitrarily many tracked object properties. This framework is easily extendable beyond the present application by replacing the currently used heuristics with ones more appropriate for the problem at hand. The code is open-source and will be continuously developed further.	翻訳日:2021-05-02 07:15:22 公開日:2021-02-01
# 機械学習を用いた大気イメージングアセンブリのマルチチャネル自動校正 Multi-Channel Auto-Calibration for the Atmospheric Imaging Assembly using Machine Learning ( http://arxiv.org/abs/2012.14023v4 ) ライセンス: Link先を確認	Luiz F. G. dos Santos, Souvik Bose, Valentina Salvatelli, Brad Neuberg, Mark C. M. Cheung, Miho Janvier, Meng Jin, Yarin Gal, Paul Boerner, and Atılım Güneş Baydin	(参考訳) 太陽活動は、惑星間媒質や地球上の宇宙天気に影響を与える重要な役割を担っている。ヘリオフィジカルス宇宙ミッションに搭載されたリモートセンシング機器は、その磁場の測定と多層多熱・動的太陽大気からの光放射を通じて太陽の活動に関する情報のプールを提供する。宇宙からの極端紫外線(euv)波長の観測は、太陽の外層、すなわち色球とコロナの微妙な性質を理解するのに役立つ。残念ながら、NASAのソーラー・ダイナミクス・オブザーバ(SDO)に搭載されている大気イメージング・アセンブリ(AIA)のような機器は、時間依存性の劣化に悩まされ、感度が低下する。現在のキャリブレーション技術は周期的な観測ロケットに依存しており、これは低頻度で、深宇宙ミッションでは実現不可能である。畳み込みニューラルネットワーク(CNN)に基づく別のキャリブレーション手法を提案する。分析にはSDO-AIAデータを用いる。以上の結果から,CNNをベースとしたモデルでは,ロケット実験の結果をある程度の精度で総合的に再現することが可能であることが示唆された。さらに、標準の「アストロノマー法」ベースラインモデルとの比較により、CNNアプローチがこのベースラインを著しく上回ることを示した。提案手法は,EUV機器を校正し,異なるEUVチャネル間のチャネル間関係の理解を深めるための新しい手法の枠組みを確立するものである。 Solar activity plays a quintessential role in influencing the interplanetary medium and space-weather around the Earth. Remote sensing instruments onboard heliophysics space missions provide a pool of information about the Sun's activity via the measurement of its magnetic field and the emission of light from the multi-layered, multi-thermal, and dynamic solar atmosphere. Extreme UV (EUV) wavelength observations from space help in understanding the subtleties of the outer layers of the Sun, namely the chromosphere and the corona. Unfortunately, such instruments, like the Atmospheric Imaging Assembly (AIA) onboard NASA's Solar Dynamics Observatory (SDO), suffer from time-dependent degradation, reducing their sensitivity. Current state-of-the-art calibration techniques rely on periodic sounding rockets, which can be infrequent and rather unfeasible for deep-space missions. We present an alternative calibration approach based on convolutional neural networks (CNNs). We use SDO-AIA data for our analysis. 
Our results show that CNN-based models could comprehensively reproduce the sounding rocket experiments' outcomes within a reasonable degree of accuracy, indicating that they perform as well as the current techniques. Furthermore, a comparison with a standard "astronomer's technique" baseline model reveals that the CNN approach significantly outperforms this baseline. Our approach establishes the framework for a novel technique to calibrate EUV instruments and advance our understanding of the cross-channel relation between different EUV channels.	翻訳日:2021-04-24 20:07:36 公開日:2021-02-01
# ワーストケース比較による区間群フェアネスの特性評価 Characterizing Intersectional Group Fairness with Worst-Case Comparisons ( http://arxiv.org/abs/2101.01673v3 ) ライセンス: Link先を確認	Avijit Ghosh, Lea Genuit, Mary Reagan	(参考訳) 機械学習または人工知能アルゴリズムは、社会における既存の偏見を模倣し増幅する傾向にあるため、近年かなり精査されている。これはニッチだが成長する仕事の体となり、これらのバイアスを特定し、修正しようとする。これらのアルゴリズムをより公平にするための第一歩は、不公平さを測定するメトリクスを設計することです。この分野での既存の仕事の多くは、公正(保護されたグループと保護されていないグループ)と政治的に定義されたカテゴリー(人種または性別)の両立観を扱う。このような分類は交叉性の重要なニュアンスを見逃す - バイアスは、異なるカテゴリのメンバシップを結合するサブグループで増幅されることが多い。本稿では,交差点のレンズ下でのフェアネス指標の考察,交差点のフェアネスにおける既存作業の特定,既存のグループフェアネス指標の定義を拡張して交差点を包含する単純なケース比較手法の提案,そして,現代文脈における交差点フェアネスを扱うための社会的・法的・政治的枠組みの完成について論じる。 Machine Learning or Artificial Intelligence algorithms have gained considerable scrutiny in recent times owing to their propensity towards imitating and amplifying existing prejudices in society. This has led to a niche but growing body of work that identifies and attempts to fix these biases. A first step towards making these algorithms more fair is designing metrics that measure unfairness. Most existing work in this field deals with either a binary view of fairness (protected vs. unprotected groups) or politically defined categories (race or gender). Such categorization misses the important nuance of intersectionality - biases can often be amplified in subgroups that combine membership from different categories, especially if such a subgroup is particularly underrepresented in historical platforms of opportunity. In this paper, we discuss why fairness metrics need to be looked at under the lens of intersectionality, identify existing work in intersectional fairness, suggest a simple worst case comparison method to expand the definitions of existing group fairness metrics to incorporate intersectionality, and finally conclude with the social, legal and political framework to handle intersectional fairness in the modern context.	翻訳日:2021-04-11 11:45:11 公開日:2021-02-01
# (参考訳) VIPPrint: 合成顔画像検出とソースリンクのためのプリント画像とスキャン画像の大規模データセット VIPPrint: A Large Scale Dataset of Printed and Scanned Images for Synthetic Face Images Detection and Source Linking ( http://arxiv.org/abs/2102.06792v1 ) ライセンス: CC BY-SA 4.0	Anselmo Ferreira, Ehsan Nowroozi and Mauro Barni	(参考訳) 印刷された画像やスキャンされた画像に対して有意義な法医学的分析を行う可能性は、多くのアプリケーションにおいて大きな役割を果たす。まず第一に、印刷された文書は、しばしばテロリスト計画、児童ポルノ写真、さらには偽のパッケージといった犯罪行為と関連付けられている。さらに、印刷や走査は、画像が印刷されスキャンされた後に、通常、操作された画像や合成画像に見られるアーティファクトがなくなるため、画像操作の痕跡や画像の合成特性を隠すために用いられる。この領域の研究を妨げる問題は、アルゴリズムの開発とベンチマークに使用される大規模な参照データセットの欠如である。本稿では,本課題に動機づけられ,多数の合成画像と自然画像からなる新しいデータセットを提案する。データセットの画像解析に係わる問題点を明らかにするために,複数のプリンタの属性法を比較した広範な実験を行った。また,自然顔画像と合成顔画像とを区別する最新の手法が,印刷やスキャン画像に適用しても失敗することを確認した。新たなデータセットが利用可能となり,予備実験が実施されれば,この領域におけるさらなる研究の動機と促進が期待できる。 The possibility of carrying out a meaningful forensics analysis on printed and scanned images plays a major role in many applications. First of all, printed documents are often associated with criminal activities, such as terrorist plans, child pornography pictures, and even fake packages. Additionally, printing and scanning can be used to hide the traces of image manipulation or the synthetic nature of images, since the artifacts commonly found in manipulated and synthetic images are gone after the images are printed and scanned. A problem hindering research in this area is the lack of large scale reference datasets to be used for algorithm development and benchmarking. Motivated by this issue, we present a new dataset composed of a large number of synthetic and natural printed face images. To highlight the difficulties associated with the analysis of the images of the dataset, we carried out an extensive set of experiments comparing several printer attribution methods. We also verified that state-of-the-art methods to distinguish natural and synthetic face images fail when applied to print and scanned images. 
We envision that the availability of the new dataset and the preliminary experiments we carried out will motivate and facilitate further research in this area.	翻訳日:2021-04-06 08:53:53 公開日:2021-02-01
# 層ベース複合評価ブートストラップ Layer-based Composite Reputation Bootstrapping ( http://arxiv.org/abs/2102.09951v1 ) ライセンス: Link先を確認	Sajib Mistry, Athman Bouguettaya, Lie Qu	(参考訳) 複合サービスのための新しい汎用評価ブートストラップフレームワークを提案する。複数の評判関連の指標がレイヤベースのフレームワークで検討され、コンポーネントサービスの評判を暗黙的に反映する。コンポーネントサービスの将来のパフォーマンスに対する指標の重要性は、修正されたランダムフォレストアルゴリズムを用いて学習される。本研究では,複合サービスの評価とコンポーネントサービスの評価指標の相関関係を明らかにするために,トポロジー対応フォレスト深層ニューラルネットワーク(fdnn)を提案する。トレーニングされたfDNNモデルは、信頼性の高い新しいコンポジットサービスの評判を予測する。実世界のデータセットを用いた実験により,提案手法の有効性が証明された。 We propose a novel generic reputation bootstrapping framework for composite services. Multiple reputation-related indicators are considered in a layer-based framework to implicitly reflect the reputation of the component services. The importance of an indicator on the future performance of a component service is learned using a modified Random Forest algorithm. We propose a topology-aware Forest Deep Neural Network (fDNN) to find the correlations between the reputation of a composite service and reputation indicators of component services. The trained fDNN model predicts the reputation of a new composite service with the confidence value. Experimental results with real-world dataset prove the efficiency of the proposed approach.	翻訳日:2021-04-05 00:23:55 公開日:2021-02-01
# (参考訳) 人工知能における量子数学 Quantum Mathematics in Artificial Intelligence ( http://arxiv.org/abs/2101.04255v3 ) ライセンス: CC BY-SA 4.0	Dominic Widdows and Kirsty Kitto and Trevor Cohen	(参考訳) 2010年以降の10年間、人工知能の成功はコンピュータ科学と技術の最前線にあり、ベクトル空間モデルは人工知能の最前線における位置を固めてきた。同時に、量子コンピュータはより強力になり、主要な進歩の発表が頻繁にニュースに取り上げられている。これらの領域の根底にある数学的手法は、しばしば実現されるよりも多くの共通点がある。ベクトル空間は1930年代に量子力学の公理的中心に位置づけられ、この採用はベクトル空間の線型幾何学から論理と確率を導出するための重要な動機となった。粒子間の量子的相互作用はテンソル積を用いてモデル化される。本稿では、人工知能(AI)、特に自動推論や自然言語処理(NLP)における利用例を含む、これらの一般的な数学分野について述べる。議論される技法には、ベクトル空間、スカラー積、部分空間と含意、直交射影と否定、双対ベクトル、密度行列、正作用素、テンソル積が含まれる。アプリケーション領域には、情報検索、分類と含意、単語センスと曖昧さのモデル化、知識ベースにおける推論、意味合成が含まれる。これらのアプローチのいくつかは量子ハードウェアに実装できる可能性がある。この実装の実践的なステップの多くは初期段階にあり、すでに実現しているものもある。一般的な数学的ツールのいくつかを説明することは、aiと量子コンピューティングの両方の研究者がこれらの重複をさらに活用し、途中で新しい方向を認識し探索するのに役立つ。 In the decade since 2010, successes in artificial intelligence have been at the forefront of computer science and technology, and vector space models have solidified a position at the forefront of artificial intelligence. At the same time, quantum computers have become much more powerful, and announcements of major advances are frequently in the news. The mathematical techniques underlying both these areas have more in common than is sometimes realized. Vector spaces took a position at the axiomatic heart of quantum mechanics in the 1930s, and this adoption was a key motivation for the derivation of logic and probability from the linear geometry of vector spaces. Quantum interactions between particles are modelled using the tensor product, which is also used to express objects and operations in artificial neural networks. This paper describes some of these common mathematical areas, including examples of how they are used in artificial intelligence (AI), particularly in automated reasoning and natural language processing (NLP). 
Techniques discussed include vector spaces, scalar products, subspaces and implication, orthogonal projection and negation, dual vectors, density matrices, positive operators, and tensor products. Application areas include information retrieval, categorization and implication, modelling word-senses and disambiguation, inference in knowledge bases, and semantic composition. Some of these approaches can potentially be implemented on quantum hardware. Many of the practical steps in this implementation are in early stages, and some are already realized. Explaining some of the common mathematical tools can help researchers in both AI and quantum computing further exploit these overlaps, recognizing and exploring new directions along the way.	翻訳日:2021-04-04 13:47:37 公開日:2021-02-01
# ECCV-TAO-2020の第1位: トラッキング対象の検出と表現 1st Place Solution to ECCV-TAO-2020: Detect and Represent Any Object for Tracking ( http://arxiv.org/abs/2101.08040v2 ) ライセンス: Link先を確認	Fei Du, Bo Xu, Jiasheng Tang, Yuqi Zhang, Fan Wang, and Hao Li	(参考訳) 従来のトラッキング・バイ・検出パラダイムをこのトラッキング・ア・オブジェクト・タスクに拡張する。固体検出結果はまずTAOデータセットから抽出される。いくつかの最先端技術、例えば BAlanced-Group Softmax (BAGS) や DetectoRS は検出中に統合される。そして,特徴学習ネットワークのトレーニングにより,あらゆる対象を表す出現特徴を学習した。検出と特徴表現を改善するために,いくつかのモデルを組み立てる。最も類似した外観機能を持つ単純なリンク戦略と、トラックレットレベルのポストアソシエーションモジュールが最終的に最終追跡結果を生成するために適用される。この方法は、challenge webサイトに AOA として提出される。コードはhttps://github.com/feiaxyt/Winner_ECCV20_TAOで入手できる。 We extend the classical tracking-by-detection paradigm to this tracking-any-object task. Solid detection results are first extracted from TAO dataset. Some state-of-the-art techniques like BAlanced-Group Softmax (BAGS) and DetectoRS are integrated during detection. Then we learned appearance features to represent any object by training feature learning networks. We ensemble several models for improving detection and feature representation. Simple linking strategies with most similar appearance features and tracklet-level post association module are finally applied to generate final tracking results. Our method is submitted as AOA on the challenge website. Code is available at https://github.com/feiaxyt/Winner_ECCV20_TAO.	翻訳日:2021-03-22 01:23:44 公開日:2021-02-01
# (参考訳) 構造的関係推論を用いたCross Chest Graphによる疾患診断 Cross Chest Graph for Disease Diagnosis with Structural Relational Reasoning ( http://arxiv.org/abs/2101.08992v2 ) ライセンス: CC BY 4.0	Gangming Zhao, Baolian Qi, Jinpeng Li	(参考訳) X線画像のコンピュータ診断において位置病変は重要である。しかし、ボックスレベルのアノテーションは時間と労力を要する。病変を正確に特定する方法は少ないが、注意すべきアノテーションがなくても、緊急の問題だ。弱い教師付きメソッドでこの問題にアプローチする作業がいくつかあるが、パフォーマンスは改善される必要がある。 1つの障害は、一般に弱教師付き手法は、高構造特性のようなX線像の特性を考慮できなかったことである。そこで我々は,医師のトレーニングや意思決定プロセスを模倣して自動病変検出の性能を向上させるクロスケストグラフ(CCG)を提案する。 CCGは、構造情報を利用して異なる領域を観察する医師の習慣をシミュレートすることで、異なる解剖学的領域間の画像内関係をモデル化する。一方、画像間の関係は知識分析モジュールによってモデル化され、複数の画像を比較する医師の習慣をシミュレートする。画像内および画像間情報を統合されたエンドツーエンドフレームワークに統合する。 The NIH Chest-14 database (112,120 frontal-view X-ray images with 14 disease) での実験結果から,本手法は医療分野の専門知識を吸収することにより,病変の局所化を弱め,最先端の性能を達成することを示した。 Locating lesions is important in the computer-aided diagnosis of X-ray images. However, box-level annotation is time-consuming and laborious. How to locate lesions accurately with few, or even without careful annotations is an urgent problem. Although several works have approached this problem with weakly-supervised methods, the performance needs to be improved. One obstacle is that general weakly-supervised methods have failed to consider the characteristics of X-ray images, such as the highly-structural attribute. We therefore propose the Cross-chest Graph (CCG), which improves the performance of automatic lesion detection by imitating doctor's training and decision-making process. CCG models the intra-image relationship between different anatomical areas by leveraging the structural information to simulate the doctor's habit of observing different areas. Meanwhile, the relationship between any pair of images is modeled by a knowledge-reasoning module to simulate the doctor's habit of comparing multiple images. We integrate intra-image and inter-image information into a unified end-to-end framework. 
Experimental results on the NIH Chest-14 database (112,120 frontal-view X-ray images with 14 diseases) demonstrate that the proposed method achieves state-of-the-art performance in weakly-supervised localization of lesions by absorbing professional knowledge in the medical field.	翻訳日:2021-03-21 02:19:40 公開日:2021-02-01
# ニューラルネットワークポテンシャルのアクティブラーニングを可能にする不確実性に対する敵対的攻撃 Adversarial Attacks on Uncertainty Enable Active Learning for Neural Network Potentials ( http://arxiv.org/abs/2101.11588v2 ) ライセンス: Link先を確認	Daniel Schwalbe-Koda, Aik Rui Tan, Rafael Gómez-Bombarelli	(参考訳) ニューラルネットワーク(NN)ベースの原子間電位は、電子構造法の精度でポテンシャルエネルギー表面を迅速に予測する。しかし、NN予測は十分に学習された訓練領域内でのみ信頼性があり、外挿時の未知の挙動を持つ。 NN委員会による不確実性定量化は、予測信頼度が低いドメインを特定するが、NNポテンシャルをトレーニングするための設定空間を徹底的に探索するには、しばしば遅い原子論シミュレーションが必要である。ここでは,新しい分子ジオメトリをサンプリングしNNポテンシャルをブートストラップするために,微分可能な不確実性指標を用いた敵対的攻撃を用いる。アクティブ学習ループと組み合わせることで、NNポテンシャルの外挿能力は、少数の追加サンプルで元のトレーニングデータを超えて改善されます。このフレームワークは複数の例で実証され、関連するジオメトリに関する広範な事前データなしで、運動障壁と集合変数のより良いサンプリングにつながります。敵攻撃は、位相空間のサンプリングとNNポテンシャルのブートストラップを同時に行い、その堅牢性を高め、ポテンシャルエネルギー景観のより高速で正確な予測を可能にする新しい方法である。 Neural network (NN)-based interatomic potentials provide fast prediction of potential energy surfaces with the accuracy of electronic structure methods. However, NN predictions are only reliable within well-learned training domains, with unknown behavior when extrapolating. Uncertainty quantification through NN committees identifies domains with low prediction confidence, but thoroughly exploring the configuration space for training NN potentials often requires slow atomistic simulations. Here, we employ adversarial attacks with a differentiable uncertainty metric to sample new molecular geometries and bootstrap NN potentials. In combination with an active learning loop, the extrapolation power of NN potentials is improved beyond the original training data with few additional samples. The framework is demonstrated on multiple examples, leading to better sampling of kinetic barriers and collective variables without extensive prior data on the relevant geometries. Adversarial attacks are new ways to simultaneously sample the phase space and bootstrap NN potentials, increasing their robustness and enabling a faster, accurate prediction of potential energy landscapes.	翻訳日:2021-03-13 19:32:38 公開日:2021-02-01
# (参考訳) バッテリーの健康状態推定のための機械学習パイプライン Machine learning pipeline for battery state of health estimation ( http://arxiv.org/abs/2102.00837v1 ) ライセンス: CC BY 4.0	Darius Roman, Saurabh Saxena, Valentin Robu, Michael Pecht and David Flynn	(参考訳) リチウムイオン電池は、携帯用電子機器から電気自動車に至るまで、現代の用途に広く普及している。用途にかかわらず、オンボードコンピュータによる電池の健全性状態(SOH)の信頼性の高いリアルタイム推定は、電池の安全な運用に不可欠であり、最終的には資産の完全性を保護する。本稿では、さまざまな条件下でサイクル試験した179セルを対象に、電池の健全性の指標である容量フェードを推定する機械学習パイプラインを設計し、評価する。パイプラインは、2つのパラメトリックアルゴリズムと2つのノンパラメトリックアルゴリズムを用いて、信頼区間付きで電池SOHを推定する。充電電圧曲線と電流曲線のセグメントを用いて、パイプラインは30個の特徴量を設計し、自動特徴選択を行い、アルゴリズムを校正する。急速充電プロトコルで運用されるセルに適用した場合、最良のモデルは二乗平均平方根パーセント誤差0.45%を達成する。この研究は、電池SOH推定のためのスケーラブルなデータ駆動モデルの設計に関する知見を提供し、予測に付随する信頼区間の価値を強調する。パイプライン手法は実験データと機械学習モデリングを組み合わせたものであり、SOHのリアルタイム推定を必要とする他の重要部品にも一般化できる。 Lithium-ion batteries are ubiquitous in modern-day applications ranging from portable electronics to electric vehicles. Irrespective of the application, reliable real-time estimation of battery state of health (SOH) by on-board computers is crucial to the safe operation of the battery, ultimately safeguarding asset integrity. In this paper, we design and evaluate a machine learning pipeline for estimation of battery capacity fade - a metric of battery health - on 179 cells cycled under various conditions. The pipeline estimates battery SOH with an associated confidence interval by using two parametric and two non-parametric algorithms. Using segments of charge voltage and current curves, the pipeline engineers 30 features, performs automatic feature selection and calibrates the algorithms. When deployed on cells operated under the fast-charging protocol, the best model achieves a root mean squared percent error of 0.45%. This work provides insights into the design of scalable data-driven models for battery SOH estimation, emphasising the value of confidence bounds around the prediction. 
The pipeline methodology combines experimental data with machine learning modelling and can be generalized to other critical components that require real-time estimation of SOH.	翻訳日:2021-02-05 11:26:17 公開日:2021-02-01
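The error metric quoted in the abstract above, root mean squared percent error, can be illustrated with a short sketch. The abstract does not spell out the formula, so the definition below (relative error per sample, squared, averaged, square-rooted, times 100) is an assumption based on common usage, and the function name `rmspe` and the sample capacities are hypothetical.

```python
import math


def rmspe(actual, predicted):
    """Root mean squared percent error, in percent.

    Assumed definition: 100 * sqrt(mean(((a - p) / a) ** 2)).
    """
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_relative_errors = [
        ((a - p) / a) ** 2 for a, p in zip(actual, predicted)
    ]
    return 100.0 * math.sqrt(
        sum(squared_relative_errors) / len(squared_relative_errors)
    )


# Hypothetical normalized capacities: each prediction is off by 1% of the true value.
true_capacity = [1.0, 0.9]
estimated_capacity = [0.99, 0.909]
error_percent = rmspe(true_capacity, estimated_capacity)
```

With every prediction off by one percent of the true value, the metric evaluates to approximately 1, which makes the 0.45% figure in the abstract easier to interpret.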
# (参考訳) 視覚偽物のための階層的変分オートエンコーダ Hierarchical Variational Autoencoder for Visual Counterfactuals ( http://arxiv.org/abs/2102.00854v1 ) ライセンス: CC BY 4.0	Nicolas Vercheval, Aleksandra Pizurica	(参考訳) 条件変分自動エンコーダ(VAE)は、説明可能な人工知能(XAI)ツールとして注目されている。潜在空間の符号は、反事実を生み出す理論的に正しい方法を提供する。ターゲットとするセマンティック機能への介入による変更。実画像に適用するには、階層CVAEのようなより複雑なモデルが必要です。これは、ナイーブコンディショニングがもはや有効ではないという課題を伴う。本稿では, 後方効果の緩和が反ファクトの達成につながることを示すとともに, アプリケーション内の分類器を視覚的に監査する手法として, VAEX を階層型VAE として導入する。 Conditional Variational Auto Encoders (VAE) are gathering significant attention as an Explainable Artificial Intelligence (XAI) tool. The codes in the latent space provide a theoretically sound way to produce counterfactuals, i.e. alterations resulting from an intervention on a targeted semantic feature. To be applied on real images more complex models are needed, such as Hierarchical CVAE. This comes with a challenge as the naive conditioning is no longer effective. In this paper we show how relaxing the effect of the posterior leads to successful counterfactuals and we introduce VAEX an Hierarchical VAE designed for this approach that can visually audit a classifier in applications.	翻訳日:2021-02-05 11:23:27 公開日:2021-02-01
# (参考訳) 負の学習率を持つメタラーニング Meta-learning with negative learning rates ( http://arxiv.org/abs/2102.00940v1 ) ライセンス: CC BY 4.0	Alberto Bernacchia	(参考訳) ディープラーニングモデルは、うまく機能するために大量のデータを必要とします。対象タスクにデータが不足すると、同様のタスクのトレーニングで得られた知識を転送して、ターゲットをすばやく学習できます。成功しているアプローチはメタラーニング(メタラーニング)、あるいは、学習が外ループで表されるタスクの分布を学習し、勾配降下の内側ループで学習する学習である。しかし、最近の多くの実証研究では、内部ループは不要であり、より単純なモデルは等しく、あるいはより良く機能すると主張している。内部ループの学習速度の関数としてのmamlの性能について検討し,学習速度がゼロである場合,内部ループが存在しないことを示唆する。ランダム行列理論と線形モデルの厳密解を用いて、過剰パラメータモデルを用いた混合線形回帰および非線形回帰に適用するmamlの検定損失に対する代数的表現を計算する。意外なことに、適応のための最適学習率が正である一方で、トレーニングのための最適学習率が常に負であることは、これまで考えられなかった設定である。したがって、最近の研究が示唆しているように、学習率をゼロにすることでパフォーマンスが向上するだけでなく、学習率を負の値に下げることでさらに向上させることができる。これらの結果は,メタラーニングがどのような状況で最善かを明らかにするのに役立つ。 Deep learning models require a large amount of data to perform well. When data is scarce for a target task, we can transfer the knowledge gained by training on similar tasks to quickly learn the target. A successful approach is meta-learning, or learning to learn a distribution of tasks, where learning is represented by an outer loop, and to learn by an inner loop of gradient descent. However, a number of recent empirical studies argue that the inner loop is unnecessary and more simple models work equally well or even better. We study the performance of MAML as a function of the learning rate of the inner loop, where zero learning rate implies that there is no inner loop. Using random matrix theory and exact solutions of linear models, we calculate an algebraic expression for the test loss of MAML applied to mixed linear regression and nonlinear regression with overparameterized models. Surprisingly, while the optimal learning rate for adaptation is positive, we find that the optimal learning rate for training is always negative, a setting that has never been considered before. 
Therefore, not only does the performance increase by decreasing the learning rate to zero, as suggested by recent work, but it can be increased even further by decreasing the learning rate to negative values. These results help clarify under what circumstances meta-learning performs best.	翻訳日:2021-02-05 11:13:41 公開日:2021-02-01
# (参考訳) 心電図の逆問題に対する基礎関数に基づくデータ駆動学習 Basis Function Based Data Driven Learning for the Inverse Problem of Electrocardiography ( http://arxiv.org/abs/2102.00570v1 ) ライセンス: CC BY 4.0	Tommy Peng, Avinash Malik, Laura Bear, Mark L. Trew	(参考訳) 目的: 体表面電位(BSP)から心表面電位(HSP)を予測するニューラルネットワーク手法を提案する。この手法は、ガウス3D(G3D)基底関数分解を用いて、心電図の従来の逆問題を回帰問題として再構成する。方法: HSPはG3D基底関数を用いて生成し、境界要素フォワードモデルに通して対応するBSPを得た。生成したBSP(入力)とHSP(出力)をニューラルネットワークの訓練に使用し、その後、様々な合成HSPおよび分解した実世界のHSPの予測に使用した。結果: フィッティングしたG3D基底関数パラメータは、実世界の左室ペーシング記録を二乗平均平方根パーセント誤差(RMSE)$1.34 \pm 1.30$%で正確に再構成できる。基底データで訓練したニューラルネットワークは、G3D基底関数で合成したデータをRMSE $8.46 \pm 1.55$%で、実世界データのG3D表現をRMSE $18.5 \pm 5.25$%で予測できた。予測した時系列から生成した活性化マップは、実際の左室ペーシング記録から生成したものと比較して、RMSEが17.0%、平均絶対差が$10.3 \pm 10.8$ msであった。結論: ガウス基底関数に基づくデータ駆動モデルは、心電図の逆問題を回帰問題として再構成することに成功し、ガウスデータのみで訓練した場合でも、実世界の記録に対して有望な時系列と活性化マップの予測を生成する。意義: ニューラルネットワークが予測するHSPを用いて活性化マップを作成し、臨床評価における心機能障害の特定に役立てることができる。 Objective: This paper proposes a neural network approach for predicting heart surface potentials (HSPs) from body surface potentials (BSPs), which reframes the traditional inverse problem of electrocardiography into a regression problem through the use of Gaussian 3D (G3D) basis function decomposition. Methods: HSPs were generated using G3D basis functions and passed through a boundary element forward model to obtain corresponding BSPs. The generated BSPs (input) and HSPs (output) were used to train a neural network, which was then used to predict a variety of synthesized and decomposed real-world HSPs. Results: Fitted G3D basis function parameters can accurately reconstruct the real-world left ventricular paced recording with percent root mean squared error (RMSE) of $1.34 \pm 1.30$%. The basis data trained neural network was able to predict G3D basis function synthesized data with RMSE of $8.46 \pm 1.55$%, and G3D representation of real-world data with RMSE of $18.5 \pm 5.25$%. 
The activation map produced from the predicted time series had an RMSE of 17.0% and a mean absolute difference of $10.3 \pm 10.8$ ms compared with that produced from the actual left ventricular paced recording. Conclusion: A data-driven model based on Gaussian basis functions, which reframes the inverse problem of electrocardiography as a regression problem, is successful and produces promising time series and activation map predictions of real-world recordings even when trained only on Gaussian data. Significance: The HSPs predicted by the neural network can be used to create activation maps to identify cardiac dysfunctions during clinical assessment.	翻訳日:2021-02-05 08:19:11 公開日:2021-02-01
# (参考訳) 畳み込みニューラルネットワークを用いたパーキンソン病歩行解析のための時空間床反力解析 Spatiotemporal Ground Reaction Force Analysis using Convolutional Neural Networks to Analyze Parkinsonian Gait ( http://arxiv.org/abs/2102.00628v1 ) ライセンス: CC0 1.0	Musthaq Ahamed, P.D.S.H. Gunawardane, Nimali T. Medagedara	(参考訳) パーキンソン病(PD)は高齢者に多く見られる不治の病気であり、生活の質を大幅に低下させる。PDは主に歩行パターンに影響を与え、歩行を正常な状態から障害のある状態へと徐々に変化させる。PDの早期診断は治療に重要であり、歩行パターン解析はPDの診断手法として用いられる。本稿では、PDに関連する歩行パターンの変化を識別するための重要なパラメータとして、生の時空間床反力(GRF)を特定した。GRFの変化は、前処理、変換、認識、性能評価を通じて、畳み込みニューラルネットワークを用いて識別される。提案アルゴリズムは、PDの重症度を識別し、パーキンソン病の歩行と健康な歩行を区別することができる。この技術は自動意思決定プロセスにおいて97%の精度を示している。 Parkinson's disease (PD) is a non-curable disease that is commonly found among elders and greatly reduces their quality of life. PD primarily affects the gait pattern and slowly changes the walking gait from normality to disability. The early diagnosis of PD is important for treatment, and gait pattern analysis is used as a technique to diagnose PD. The present paper has identified the raw spatiotemporal ground reaction force (GRF) as a key parameter to identify the changes in human gait patterns associated with PD. The changes in GRF are identified using a convolutional neural network through pre-processing, conversion, recognition, and performance evaluation. The proposed algorithm is capable of identifying the severity of PD and distinguishing the parkinsonian gait from the healthy gait. The technique has shown 97% accuracy in the automatic decision-making process.	翻訳日:2021-02-05 08:04:00 公開日:2021-02-01
# (参考訳) 生存データの機械学習モデルを用いた説明変数に関連する危険率の計算 Computing the Hazard Ratios Associated with Explanatory Variables Using Machine Learning Models of Survival Data ( http://arxiv.org/abs/2102.00637v1 ) ライセンス: CC BY 4.0	Sameer Sundrani and James Lu	(参考訳) 目的: Cox Proportional Hazards (CoxPH) モデルの生存データへの適用, および Hazard Ratio (HR) の導出が良好に確立されている。木をベースとした非線形機械学習(ML)モデルが生存分析に適用されているが、これらのモデルから説明変数に関連付けられたHRを計算するための方法論は存在しない。予測に対する説明変数の寄与を定量化する局所的正確で一貫性のある手法であるShapley additive explanation (SHAP)値を用いて,木ベースのMLモデルからHRを計算する新しい方法を提案する。方法: 大腸癌、乳癌、膵臓癌の患者から得られた3組の生存データを用いて、CoxPHの性能を最先端のMLモデルであるXGBoostと比較した。 XGBoostモデルから説明変数のHRを計算するために、SHAP値は指数化され、2つのサブグループの平均の比率が計算された。信頼区間は、トレーニングデータをブートストラップし、MLモデルを1000回生成することで計算された。 3つのデータセット全体で、すべての説明変数のHRを体系的に比較した。 PythonとRのオープンソースライブラリが分析に使用された。結果: 大腸癌群と乳癌群では, CoxPH と XGBoost のパフォーマンスは同等であり, HR の整合性は良好であった。 Pan-cancerデータセットでは、ほとんどの変数の一致を示しましたが、CoxPHとXGBoostの結果の間の2つの説明変数の反対の発見も示しました。その後のKaplan-MeierプロットはXGBoostモデルの発見を支持した。結論: MLモデルからのHRの導出は,複雑な生存データセットからの危険因子の同定を改善し,臨床試験の結果を予測するのに役立つ。 Purpose: The application of Cox Proportional Hazards (CoxPH) models to survival data and the derivation of Hazard Ratio (HR) is well established. While nonlinear, tree-based Machine Learning (ML) models have been developed and applied to the survival analysis, no methodology exists for computing HRs associated with explanatory variables from such models. We describe a novel way to compute HRs from tree-based ML models using the Shapley additive explanation (SHAP) values, which is a locally accurate and consistent methodology to quantify explanatory variables' contribution to predictions. Methods: We used three sets of publicly available survival data consisting of patients with colon, breast or pan cancer and compared the performance of CoxPH to the state-of-art ML model, XGBoost. To compute the HR for explanatory variables from the XGBoost model, the SHAP values were exponentiated and the ratio of the means over the two subgroups calculated. 
The confidence interval was computed via bootstrapping the training data and generating the ML model 1000 times. Across the three data sets, we systematically compared HRs for all explanatory variables. Open-source libraries in Python and R were used in the analyses. Results: For the colon and breast cancer data sets, the performance of CoxPH and XGBoost were comparable and we showed good consistency in the computed HRs. In the pan-cancer dataset, we showed agreement in most variables but also an opposite finding in two of the explanatory variables between the CoxPH and XGBoost result. Subsequent Kaplan-Meier plots supported the finding of the XGBoost model. Conclusion: Enabling the derivation of HR from ML models can help to improve the identification of risk factors from complex survival datasets and enhance the prediction of clinical trial outcomes.	翻訳日:2021-02-05 07:51:53 公開日:2021-02-01
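The hazard-ratio step described in the abstract above (exponentiate the SHAP values, then take the ratio of the subgroup means) can be sketched as follows. This is only an illustration under stated assumptions: the SHAP values are taken as already computed on a log-hazard scale for one binary explanatory variable, the function name `hazard_ratio_from_shap` and the sample numbers are hypothetical, and the bootstrap over retrained models that the authors use for confidence intervals is not reproduced.

```python
import math


def hazard_ratio_from_shap(shap_values, in_group):
    """Ratio of mean exp(SHAP) between the two subgroups of one variable.

    shap_values: per-patient SHAP values for a single explanatory variable
                 (assumed to be on the log-hazard scale).
    in_group:    per-patient booleans marking the subgroup of interest.
    """
    exposed = [math.exp(s) for s, g in zip(shap_values, in_group) if g]
    reference = [math.exp(s) for s, g in zip(shap_values, in_group) if not g]
    if not exposed or not reference:
        raise ValueError("both subgroups must be non-empty")
    mean_exposed = sum(exposed) / len(exposed)
    mean_reference = sum(reference) / len(reference)
    return mean_exposed / mean_reference


# Hypothetical data: two patients in the subgroup of interest, two in the reference group.
shap_values = [math.log(2.0), math.log(2.0), 0.0, 0.0]
in_group = [True, True, False, False]
hr = hazard_ratio_from_shap(shap_values, in_group)
```

In this toy input the subgroup of interest contributes twice the hazard of the reference group, so the ratio comes out at approximately 2. In practice the SHAP values would come from an explainer applied to the fitted survival model, which is outside the scope of this sketch.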
# (参考訳) Webインテリジェンスアプリケーションのためのドメイン特化モデルの自動拡張 Automatic Expansion of Domain-Specific Affective Models for Web Intelligence Applications ( http://arxiv.org/abs/2102.00827v1 ) ライセンス: CC BY 4.0	Albert Weichselbraun, Jakob Steixner, Adrian M.P. Bra\c{s}oveanu, Arno Scharl, Max G\"obel and Lyndon J. B. Nixon	(参考訳) 知覚コンピューティングは、ポジティブな感情とネガティブな感情を区別するための極性(Polarity)、人間の感情の表現を捉えるためのよりニュアンスなモデル(nuanced model)など、様々な複雑さの明確に定義された感情モデルに依存している。コミュニケーションの成功を測定するために使用されると、高度な機械学習アプローチと組み合わせた最もきめ細かい感情モデルでさえ、組織の戦略的なポジショニング目標を完全に捉えることはできません。このような目標は、しばしば標準化された感情モデルから逸脱する。喜びや信頼といった特定の感情は、一般的に望ましいブランドの関連を表すが、マーケティング専門家によって定式化された特定のコミュニケーション目標はしばしばそのような標準的な次元を超えている。例えば、テレビ番組のブランドマネージャーは、恐れや悲しみが観客に望まれる感情であると考えるかもしれません。本稿では、ナレッジグラフで利用可能な共通知識と共通知識を言語モデルや感情推論と組み合わせ、カバレッジと一貫性を改善し、感情のドメイン固有の解釈をサポートする、感情モデルのための拡張技術を紹介します。広範な評価は、異なる拡張技術のパフォーマンスを比較します:(i) 再訪された感情の砂時計モデルに基づいて定量的評価し、手動でコンパイルされた金標準データを使用して、複数の感情カテゴリをカバーする複雑なモデルのパフォーマンスを評価し、(ii) テレビ番組ブランドのためのドメイン固有の感情モデルの定性評価。これらの評価の結果,導入技術は様々な組込みモデルと事前学習モデルをサポートしていることが示された。論文は、このアプローチをモデルリソースが乏しい他のシナリオに適用することに関する議論で締めくくられている。 Sentic computing relies on well-defined affective models of different complexity - polarity to distinguish positive and negative sentiment, for example, or more nuanced models to capture expressions of human emotions. When used to measure communication success, even the most granular affective model combined with sophisticated machine learning approaches may not fully capture an organisation's strategic positioning goals. Such goals often deviate from the assumptions of standardised affective models. While certain emotions such as Joy and Trust typically represent desirable brand associations, specific communication goals formulated by marketing professionals often go beyond such standard dimensions. For instance, the brand manager of a television show may consider fear or sadness to be desired emotions for its audience. 
This article introduces expansion techniques for affective models, combining common and commonsense knowledge available in knowledge graphs with language models and affective reasoning, improving coverage and consistency as well as supporting domain-specific interpretations of emotions. An extensive evaluation compares the performance of different expansion techniques: (i) a quantitative evaluation based on the revisited Hourglass of Emotions model to assess performance on complex models that cover multiple affective categories, using manually compiled gold standard data, and (ii) a qualitative evaluation of a domain-specific affective model for television programme brands. The results of these evaluations demonstrate that the introduced techniques support a variety of embeddings and pre-trained models. The paper concludes with a discussion on applying this approach to other scenarios where affective model resources are scarce.	翻訳日:2021-02-05 07:44:24 公開日:2021-02-01
# (参考訳) 分散フェデレーション学習はモデルとデータのプライバシを保存する Decentralized Federated Learning Preserves Model and Data Privacy ( http://arxiv.org/abs/2102.00880v1 ) ライセンス: CC BY 4.0	Thorsten Wittkopp and Alexander Acker	(参考訳) ITシステムの複雑さが増す中、障害発生時の運用を支援するソリューションが必要である。そのため、AIOps(Artificial Intelligence for System Operations)は、学術と産業の両方でますます注目を集めている研究分野である。この領域の主要な問題の1つは、適切にラベル付けされたデータにアクセスできないことであり、これは主に法的保護規制や産業上の機密性によるものである。この問題を緩和する手法は連合学習の分野に由来し、トレーニングデータへの直接アクセスを必要としない。従来のアプローチでは、中央インスタンスがすべてのモデルパラメータを周期的に集約することでモデルの同期を行う。しかし、機密知識やトレーニングデータがモデルから再構築されうるため、訓練済みモデルを公開できないシナリオは多い。さらに、中央インスタンスは信頼される必要があり、単一障害点でもある。そこで我々は、訓練済みモデル間の知識共有を可能にする完全分散型の手法を提案する。元のトレーニングデータもモデルパラメータも送信する必要はない。この概念は、モデルに割り当てられる教師と生徒の役割に基づいており、生徒は合成的に生成した入力データに対する教師の出力で訓練される。ログ異常検出のケーススタディを実施した。その結果、教師の出力で訓練した未学習の生徒モデルが、教師と同等のF1スコアに達することが示された。さらに、本手法により、異なる訓練データサブセットで訓練された複数のモデルを同期できることを示す。 The increasing complexity of IT systems requires solutions that support operations in case of failure. Therefore, Artificial Intelligence for System Operations (AIOps) is a field of research that is receiving increasing attention, both in academia and industry. One of the major issues of this area is the lack of access to adequately labeled data, which is mainly due to legal protection regulations or industrial confidentiality. Methods to mitigate this stem from the area of federated learning, whereby no direct access to training data is required. Original approaches utilize a central instance to perform the model synchronization by periodical aggregation of all model parameters. However, there are many scenarios where trained models cannot be published since either confidential knowledge or training data could be reconstructed from them. Furthermore, the central instance needs to be trusted and is a single point of failure. As a solution, we propose a fully decentralized approach, which allows knowledge to be shared between trained models. 
Neither original training data nor model parameters need to be transmitted. The concept relies on teacher and student roles that are assigned to the models, whereby students are trained on the output of their teachers via synthetically generated input data. We conduct a case study on log anomaly detection. The results show that an untrained student model, trained on the teacher's output, reaches F1-scores comparable to the teacher's. In addition, we demonstrate that our method allows the synchronization of several models trained on distinct training data subsets.	翻訳日:2021-02-05 07:19:05 公開日:2021-02-01
# (参考訳) RNA二次構造体の神経表現と生成 Neural representation and generation for RNA secondary structures ( http://arxiv.org/abs/2102.00925v1 ) ライセンス: CC BY 4.0	Zichao Yan, William L. Hamilton and Mathieu Blanchette	(参考訳) 本研究は, 細胞活性や機能に影響を及ぼす複雑な構造を組み込むことができる遺伝子マクロ分子の一種であるRNAの生成と設計に関するものである。大規模で複雑な生物学的構造の設計は、計算薬物発見の重要かつ未承認の側面を表すグラフベースの深層生成モデリング技術に拍車をかけた。本研究では、異なるRNA構造モダリティの表現と生成の原理を検討し、これらの分子構造とそれらの配列を有意義な潜在空間に融合して生成するための柔軟な枠組みを提案する。 RNA分子構造を深く理解した当社の高度な符号化・復号法は、分子グラフとジャンクションツリー階層上で動作し、RNA構造規則性や折り畳み機構に関する強い誘導バイアスを統合し、生成したRNAの構造的妥当性、安定性、多様性を実現します。また,タンパク質との相互作用に関して,RNA分子埋め込みの潜伏空間を適切に整理し,この潜伏領域を探索し,新たなRNA分子を探索する目的の最適化も行っている。 Our work is concerned with the generation and targeted design of RNA, a type of genetic macromolecule that can adopt complex structures which influence their cellular activities and functions. The design of large scale and complex biological structures spurs dedicated graph-based deep generative modeling techniques, which represents a key but underappreciated aspect of computational drug discovery. In this work, we investigate the principles behind representing and generating different RNA structural modalities, and propose a flexible framework to jointly embed and generate these molecular structures along with their sequence in a meaningful latent space. Equipped with a deep understanding of RNA molecular structures, our most sophisticated encoding and decoding methods operate on the molecular graph as well as the junction tree hierarchy, integrating strong inductive bias about RNA structural regularity and folding mechanism such that high structural validity, stability and diversity of generated RNAs are achieved. Also, we seek to adequately organize the latent space of RNA molecular embeddings with regard to the interaction with proteins, and targeted optimization is used to navigate in this latent space to search for desired novel RNA molecules.	翻訳日:2021-02-05 07:08:21 公開日:2021-02-01
# (参考訳) マルチタスクガウスプロセスマルチオブジェクト自己アテンションネットワークを用いたCOVID-19患者における機械的換気のリアルタイム予測 Real-time Prediction for Mechanical Ventilation in COVID-19 Patients using A Multi-task Gaussian Process Multi-objective Self-attention Network ( http://arxiv.org/abs/2102.01147v1 ) ライセンス: CC BY 4.0	Kai Zhang, Siddharth Karanth, Bela Patel, Robert Murphy, Xiaoqian Jiang	(参考訳) 本研究では,院内感染者が機械的換気を必要とする確率を予測できる堅牢なインタイム予測器を提案する。 COVID-19患者のリスク予測の課題は、臨床設定で観察された患者のバイタルとラボの大きな変動と不規則なサンプリングにあります。既存の手法は時間依存的な機能の複雑なダイナミクスを扱うのに強い制限があり、情報を失う要約統計による時間的データの単純化や、より堅牢な結果をもたらすオーバーエンジニアリング機能などである。個別の患者に対して機械的換気を行うリスクのダイナミクスを追従するデータの不規則なサンプリング率を扱うための,新しいリアルタイムリスク軌跡予測モデルを提案する。このモデルは、観測値を用いたマルチタスクガウス過程を取り入れ、後継の多変条件確率を学習し、統一された時間グリッド上の欠落値を推定する。時間的インデュートデータは、予測タスクのために多目的セルフアテンションネットワークに供給される。リアルタイム予測を行うための新しい位置符号化層を提案し,ネットワークに追加した。位置層は、患者全体の病院滞在中に、各ユーザー定義の時点にリスクスコアを出力する。予測タスクを多目的学習フレームワークに設定し、すべての時点におけるリスクスコアを完全に最適化し、リスクスコアの軌道予測に堅牢性と一貫性を付加する。また,全国の病院内患者を対象とした大規模データベースを用いた実験により,auc(受信者動作特性曲線下の地域)とauprc(精密リコール曲線下の地域)のパフォーマンス指標,特に入院後の早期におけるパフォーマンスの向上が示された。 We propose a robust in-time predictor for in-hospital COVID-19 patient's probability of requiring mechanical ventilation. A challenge in the risk prediction for COVID-19 patients lies in the great variability and irregular sampling of patient's vitals and labs observed in the clinical setting. Existing methods have strong limitations in handling time-dependent features' complex dynamics, either oversimplifying temporal data with summary statistics that lose information or over-engineering features that lead to less robust outcomes. We propose a novel in-time risk trajectory predictive model to handle the irregular sampling rate in the data, which follows the dynamics of risk of performing mechanical ventilation for individual patients. The model incorporates the Multi-task Gaussian Process using observed values to learn the posterior joint multi-variant conditional probability and infer the missing values on a unified time grid. 
The temporal imputed data is fed into a multi-objective self-attention network for the prediction task. A novel positional encoding layer is proposed and added to the network for producing in-time predictions. The positional layer outputs a risk score at each user-defined time point during the entire hospital stay of an inpatient. We frame the prediction task into a multi-objective learning framework, and the risk scores at all time points are optimized altogether, which adds robustness and consistency to the risk score trajectory prediction. Our experimental evaluation on a large database with nationwide in-hospital patients with COVID-19 also demonstrates that it improved the state-of-the-art performance in terms of AUC (Area Under the receiver operating characteristic Curve) and AUPRC (Area Under the Precision-Recall Curve) performance metrics, especially at early times after hospital admission.	翻訳日:2021-02-05 06:37:35 公開日:2021-02-01
# (参考訳) 組合せ交換プロトコルの論理的表現のための一般的な枠組み A General Framework for the Logical Representation of Combinatorial Exchange Protocols ( http://arxiv.org/abs/2102.02061v1 ) ライセンス: CC BY 4.0	Munyque Mittelmann, Sylvain Bouveret, Laurent Perrussel	(参考訳) 本論文の目的は,組合せ交換を規定する規則を表現・推論するための枠組みを提案することである。このようなフレームワークは、自動トランザクションのための広く使用されるメカニズムであるオークションに基づいてデジタルマーケットプレイスを構築したい限り、最初は関心があります。コンビネーション取引所はオークションの最も一般的なケースであり、コンビネーションとコンビネーションのバリエーションを混ぜている:エージェントは商品のバンドルを取引しようとしている。したがって、フレームワークは2つの要件を満たすべきである: (i) 入札者が商品の組み合わせで入札を表現できるようにすべきであり、(ii) 特定の市場、すなわち法的入札、割り当ておよび支払いルールを管理するルールを記述することを許可すべきである。そこで我々は、ゲーム記述言語の精神の中で論理言語を定義する: Combinatorial Exchange Description Languageは、論理フレームワークにおける組合せ交換を記述するための最初の言語である。コントリビューションは2つある: まず、異なる種類のプロトコルを表現して一般的な次元を記述し、次に、この機械処理可能な言語におけるオークション特性の推論方法を示す。 The goal of this paper is to propose a framework for representing and reasoning about the rules governing a combinatorial exchange. Such a framework is at first interest as long as we want to build up digital marketplaces based on auction, a widely used mechanism for automated transactions. Combinatorial exchange is the most general case of auctions, mixing the double and combinatorial variants: agents bid to trade bundles of goods. Hence the framework should fulfill two requirements: (i) it should enable bidders to express their bids on combinations of goods and (ii) it should allow describing the rules governing some market, namely the legal bids, the allocation and payment rules. To do so, we define a logical language in the spirit of the Game Description Language: the Combinatorial Exchange Description Language is the first language for describing combinatorial exchange in a logical framework. The contribution is two-fold: first, we illustrate the general dimension by representing different kinds of protocols, and second, we show how to reason about auction properties in this machine-processable language.	翻訳日:2021-02-04 22:19:57 公開日:2021-02-01
# クラスター分析とコミュニティ検出の評価のための外部対策の評価と比較 Characterizing and comparing external measures for the assessment of cluster analysis and community detection ( http://arxiv.org/abs/2102.00708v1 ) ライセンス: Link先を確認	Nejat Arinik (LIA), Vincent Labatut, Rosa Figueiredo	(参考訳) クラスタ分析とグラフ分割の文脈では、同じセットの2つのパーティションを比較するために、文献で多くの外部評価手段が提案されている。これにより、与えられた状況に対して最も適切な尺度を選択することがエンドユーザの課題となる。しかし、この問題は文献では見過ごされている。従来の研究者が一貫して使用し始めたためだけに、研究者は伝統に従い、彼らの分野の標準的な尺度を使用する傾向があります。本研究では,この問題を解決するための新しい経験的評価フレームワークを提案し,エンドユーザーがアプリケーションに適した尺度を選択するのを支援する。候補測度の集まりでは、まず、事前に定義されたパラメトリック分割変換のセットを適用して得られるパーティションの生成データセットに対してそれらの振る舞いを計算して記述する。第2に,このフレームワークは回帰分析を行い,パラメータや変換の影響を受ける指標を特徴付ける。これにより、測定方法の説明と比較が可能となる。私たちのアプローチは特定の測度やアプリケーションに縛られませんので、どんな状況にも適用できます。我々は,本手法を標準尺度の選定に適用し,その妥当性を説明し,具体的ユースケースを2つに分けて実施する方法を示す。 In the context of cluster analysis and graph partitioning, many external evaluation measures have been proposed in the literature to compare two partitions of the same set. This makes the task of selecting the most appropriate measure for a given situation a challenge for the end user. However, this issue is overlooked in the literature. Researchers tend to follow tradition and use the standard measures of their field, although they often became standard only because previous researchers started consistently using them. In this work, we propose a new empirical evaluation framework to solve this issue, and help the end user selecting an appropriate measure for their application. For a collection of candidate measures, it first consists in describing their behavior by computing them for a generated dataset of partitions, obtained by applying a set of predefined parametric partition transformations. Second, our framework performs a regression analysis to characterize the measures in terms of how they are affected by these parameters and transformations. This allows both describing and comparing the measures. 
Our approach is not tied to any specific measure or application, so it can be applied to any situation. We illustrate its relevance by applying it to a selection of standard measures, and show how it can be put in practice through two concrete use cases.	翻訳日:2021-02-04 17:14:45 公開日:2021-02-01
# (参考訳) インド古典音楽における感情分類のためのニューラルネットワークアーキテクチャ Neural Network architectures to classify emotions in Indian Classical Music ( http://arxiv.org/abs/2102.00616v1 ) ライセンス: CC BY 4.0	Uddalok Sarkar, Sayan Nag, Medha Basu, Archi Banerjee, Shankha Sanyal, Ranjan Sengupta, Dipak Ghosh	(参考訳) 音楽はしばしば感情の言語と見なされる。音楽が人間の感情を引き出すことは古くから知られており、引き起こされる感情の種類に基づいて音楽を分類することは、非常に興味深い研究テーマである。インド古典音楽(ICM)によって引き起こされる感情を分類する場合、ICMに固有の曖昧さのため、課題はさらに困難になる。1つの演奏が聴衆に様々な感情的反応を引き起こしうることは、ICMの演奏の性質に内在している。ディープラーニング分野の急速な進歩により、音楽感情認識(MER)タスクはますます重要かつ堅牢になっており、最も困難なテストケースの1つ、すなわちICMが引き起こす感情の分類に適用できる。本稿では、JUMusEmoDBという新しいデータセットを提案する。現在400のオーディオクリップ(それぞれ30秒)からなり、200クリップは喜びの感情に、残りの200クリップは悲しみの感情に対応する。教師付き分類のために、2000個のサブクリップ(各クリップを5つのサブクリップに分割)に対応する音楽スペクトログラムに、既存の4つのディープ畳み込みニューラルネットワーク(CNN)ベースのアーキテクチャ(resnet18, mobilenet v2.0, squeezenet v1.0, vgg16)を適用した。スペクトログラムは時間領域と周波数領域の両方の情報を含む。初期の結果は非常に有望であり、このアーキテクチャを使ってデータセットのベースライン値を設定することを目指している。インド古典音楽の豊富なコーパスを用いたこの種のCNNベースの分類アルゴリズムは、世界的に見てもユニークであり、他の音楽のモダリティにも再現可能である。このデータセットはまだ開発中であり、他の感情的特徴を含むデータも追加する予定である。近いうちにデータセットを一般公開する予定である。 Music is often considered as the language of emotions. It has long been known to elicit emotions in human beings, and thus categorizing music based on the type of emotions it induces in human beings is a very intriguing topic of research. When the task is to classify emotions elicited by Indian Classical Music (ICM), it becomes much more challenging because of the inherent ambiguity associated with ICM. The fact that a single musical performance can evoke a variety of emotional responses in the audience is implicit to the nature of ICM renditions. With the rapid advancements in the field of Deep Learning, this Music Emotion Recognition (MER) task is becoming more and more relevant and robust, and hence can be applied to one of the most challenging test cases, i.e. classifying emotions elicited from ICM. 
In this paper we present a new dataset called JUMusEmoDB which presently has 400 audio clips (30 seconds each) where 200 clips correspond to happy emotions and the remaining 200 clips correspond to sad emotion. For supervised classification purposes, we have used 4 existing deep Convolutional Neural Network (CNN) based architectures (resnet18, mobilenet v2.0, squeezenet v1.0 and vgg16) on corresponding music spectrograms of the 2000 sub-clips (where every clip was segmented into 5 sub-clips of about 5 seconds each) which contain both time as well as frequency domain information. The initial results are quite inspiring, and we look forward to setting the baseline values for the dataset using this architecture. This type of CNN based classification algorithm using a rich corpus of Indian Classical Music is unique even in the global perspective and can be replicated in other modalities of music also. This dataset is still under development and we plan to include more data containing other emotional features as well. We plan to make the dataset publicly available soon.	翻訳日:2021-02-04 15:18:50 公開日:2021-02-01
# (参考訳) 量子インスパイアされた適応ブースティング Quantum Inspired Adaptive Boosting ( http://arxiv.org/abs/2102.00949v1 ) ライセンス: CC BY 4.0	Bálint Daróczy, Katalin Friedl, László Kabódi, Attila Pereszlényi, Dániel Szabó	(参考訳) Schuld と Petruccione [arXiv:1704.02146v1] の量子アンサンブルに基づく分類アルゴリズムに基づいて、この量子アンサンブル法が古典アルゴリズムよりも有利でないことを示す等価な古典アルゴリズムを考案した。基本的には、それらのアルゴリズムを、同等の古典的なバージョンを思いつくまで単純化する。古典的なアルゴリズムの1つは極めて単純で、各入力を分類するために一定時間実行される。さらに,本論文の主な貢献として,量子アンサンブル法と適応的なブースティングを組み合わせた手法を提案する。アルゴリズムはテストされ、公開データセット上のAdaBoostアルゴリズムに匹敵することがわかった。 Building on the quantum ensemble based classifier algorithm of Schuld and Petruccione [arXiv:1704.02146v1], we devise equivalent classical algorithms which show that this quantum ensemble method has no advantage over classical algorithms. Essentially, we simplify their algorithm until it is intuitive to come up with an equivalent classical version. One of the classical algorithms is extremely simple and runs in constant time for each input to be classified. We further develop the idea and, as the main contribution of the paper, we propose methods inspired by combining the quantum ensemble method with adaptive boosting. The algorithms were tested and found to be comparable to the AdaBoost algorithm on publicly available data sets.	翻訳日:2021-02-04 15:11:47 公開日:2021-02-01
# (参考訳) 一般化キタエフハニカム磁石の機械学習相図 Machine-Learned Phase Diagrams of Generalized Kitaev Honeycomb Magnets ( http://arxiv.org/abs/2102.01103v1 ) ライセンス: CC BY 4.0	Nihal Rao, Ke Liu, Marc Machaczek, Lode Pollet	(参考訳) 我々は、最近開発された解釈可能で教師なしの機械学習手法であるテンソルカーネルサポートベクトルマシン(TK-SVM)を用いて、ハニカム格子上の一般化されたハイゼンベルク-キタエフ-$\Gamma$$J$-$K$-$\Gamma$)モデルの低温古典位相図を調査する。以前の量子および古典研究で報告された再生相とは別に、私たちのマシンはネストされたZigzag-stripyの順序を見つけ出し、最近特定された調節された$S_3 \times Z_3$相の堅牢性を確立します。結果は、$J$, $K$, $\Gamma$の3つの主要な交換相互作用にまたがる制限されたパラメータ空間において、代表的なキタエフ物質$\alpha$-${\rm RuCl}_3$は、単純な強磁性体を含むいくつかの相の界面に近く、従来の$S_3 \times Z_3$とネストされたジグザグ・ストリーピー磁石を含む。ジグザグ順序は有限 $\Gamma^{\prime}$ および/または $J_3$ 項によって安定化されるが、4つの磁気順序は特に $\Gamma^{\prime}$ が反強磁性であれば競合する。 We use a recently developed interpretable and unsupervised machine-learning method, the tensorial kernel support vector machine (TK-SVM), to investigate the low-temperature classical phase diagram of a generalized Heisenberg-Kitaev-$\Gamma$ ($J$-$K$-$\Gamma$) model on a honeycomb lattice. Aside from reproducing phases reported by previous quantum and classical studies, our machine finds a hitherto missed nested zigzag-stripy order and establishes the robustness of a recently identified modulated $S_3 \times Z_3$ phase, which emerges through the competition between the Kitaev and $\Gamma$ spin liquids, against Heisenberg interactions. The results imply that, in the restricted parameter space spanned by the three primary exchange interactions -- $J$, $K$, and $\Gamma$, the representative Kitaev material $\alpha$-${\rm RuCl}_3$ lies close to the interface of several phases, including a simple ferromagnet, and the unconventional $S_3 \times Z_3$ and nested zigzag-stripy magnets. A zigzag order is stabilized by a finite $\Gamma^{\prime}$ and/or $J_3$ term, whereas the four magnetic orders may compete in particular if $\Gamma^{\prime}$ is anti-ferromagnetic.	翻訳日:2021-02-04 12:47:14 公開日:2021-02-01
# 早期アルツハイマー病検出のための反応時間に基づく分類 Classifications based on response times for detecting early-stage Alzheimer's disease ( http://arxiv.org/abs/2102.00738v1 ) ライセンス: Link先を確認	Alain Petrowski (TSP, RS2M)	(参考訳) 紹介:本論文は, 早期アルツハイマー病(ES-AD)患者と健常者(HC)患者を手書き・手書き作業記録を用いたデータセットから高精度に検出する方法を主に記述する。方法:提案手法は被験者の応答時間を用いる。タスクの最適なサブセットは、最初にグリッド検索に関連付けられた「サポートベクターマシン」(SVM)で選択されます。タスク持続時間の空間で定義されるガウス分布の混合は、SVMの結果を再現し、説明するために使用される。最後に、驚くほどシンプルで効率的なアドホック分類アルゴリズムがガウス混合物から導かれる。結果:本論文で示したソリューションは、手書きと描画タスクからHC/ES-ADを分類する技術の状態の最良の結果の2倍または4倍の誤差を減少させる。議論: 最高のsvm学習モデルは、この分類で高い精度に達するが、その学習能力が大きすぎて、データセットの小さなサイズに関する過度なリスクが確実である。提案するアドホック分類アルゴリズムは、3つの実パラメータを最適化するだけでよい。したがって、優れた一般化能力の恩恵を受けるべきである。 Introduction: This paper mainly describes a way to detect with high accuracy patients with early-stage Alzheimer's disease (ES-AD) versus healthy control (HC) subjects, from datasets built with handwriting and drawing task records. Method: The proposed approach uses subject's response times. An optimal subset of tasks is first selected with a "Support Vector Machine" (SVM) associated with a grid search. Mixtures of Gaussian distributions defined in the space of task durations are then used to reproduce and explain the results of the SVM. Finally, a surprisingly simple and efficient ad hoc classification algorithm is deduced from the Gaussian mixtures. Results: The solution presented in this paper makes two or even four times fewer errors than the best results of the state of the art concerning the classification HC/ES-AD from handwriting and drawing tasks. Discussion: The best SVM learning model reaches a high accuracy for this classification but its learning capacity is too large to ensure a low overfitting risk regarding the small size of the dataset. The proposed ad hoc classification algorithm only requires to optimize three real-parameters. It should therefore benefit from a good generalization ability.	翻訳日:2021-02-04 10:18:26 公開日:2021-02-01
# 教師なしリアルタイム構造健康モニタリングのためのシステム信頼性に基づくGANと一級共同ガウス分布のマルチアンサンブル System-reliability based multi-ensemble of GAN and one-class joint Gaussian distributions for unsupervised real-time structural health monitoring ( http://arxiv.org/abs/2102.01158v1 ) ライセンス: Link先を確認	Mohammad Hesam Soleimani-Babakamali, Reza Sepasdar, Kourosh Nasrollahzadeh, and Rodrigo Sarlo	(参考訳) 監視されていない健康モニタリングは、過去10年間で最も実用的なリアルタイム構造健康モニタリング(SHM)アプローチとして多くの注目を集めています。文献で提案された監視されていない技術の中には、堅牢でリアルタイムの健康監視の障害がまだあります。これらの障壁には、特徴抽出ステップの次元的削減からの情報の損失、それらのステップのケース依存性、動的クラスタリングの欠如、ユーザ定義パラメータに対する検出結果の感度が含まれる。本研究では,ケース依存抽出方式を使わずに,低次元と高次元を混合した非監視のリアルタイムSHM法を提案する。両機能は、GAN(Generative Adversarial Networks)と1-class Joint Gaussian Distribution Model (1-CG)のマルチアンサンブルのトレーニングに使用される。 GANと1-CGモデルの検出スコアに基づく極限状態関数のノベルティ検出システムを構築する。これらの極限状態関数(検出しきい値)の抵抗は、信頼性に基づく解析を通じてモンテカルロヒストグラムサンプリングを用いて、GAN生成データオブジェクトでユーザ定義パラメータに調整される。チューニングは、リアルタイムSHMでこれらのパラメータを選択するルールがないため、このメソッドをユーザー定義パラメータに堅牢にします。提案されたノベルティ検出フレームワークは、Yellow Frame(20ダメージクラス)とZ24 Bridge(15ダメージクラス)の2つの標準SHMデータセットに適用される。すべての異なる損傷カテゴリは、ユーザー定義パラメータの初期選択に対する低感度で識別され、動的なベースラインアプローチと静的なベースラインアプローチの両方を導入しました。 Unsupervised health monitoring has gained much attention in the last decade as the most practical real-time structural health monitoring (SHM) approach. Among the proposed unsupervised techniques in the literature, there are still obstacles to robust and real-time health monitoring. These barriers include loss of information from dimensionality reduction in feature extraction steps, case-dependency of those steps, lack of a dynamic clustering, and detection results' sensitivity to user-defined parameters. This study introduces an unsupervised real-time SHM method with a mixture of low- and high-dimensional features without a case-dependent extraction scheme. Both features are used to train multi-ensembles of Generative Adversarial Networks (GAN) and one-class joint Gaussian distribution models (1-CG). 
A novelty detection system of limit-state functions based on GAN and 1-CG models' detection scores is constructed. The resistance of those limit-state functions (detection thresholds) is tuned to user-defined parameters with the GAN-generated data objects by employing the Monte Carlo histogram sampling through a reliability-based analysis. The tuning makes the method robust to user-defined parameters, which is crucial as there is no rule for selecting those parameters in real-time SHM. The proposed novelty detection framework is applied to two standard SHM datasets to illustrate its generalizability: Yellow Frame (twenty damage classes) and Z24 Bridge (fifteen damage classes). All damage categories are identified, with low sensitivity to the initial choice of user-defined parameters, by both the dynamic and static baseline approaches introduced here, with few or no false alarms.	翻訳日:2021-02-04 10:17:47 公開日:2021-02-01
# 呪いか償還か? データ不均一性がフェデレーション学習のロバスト性に与える影響 Curse or Redemption? How Data Heterogeneity Affects the Robustness of Federated Learning ( http://arxiv.org/abs/2102.00655v1 ) ライセンス: Link先を確認	Syed Zawad, Ahsan Ali, Pin-Yu Chen, Ali Anwar, Yi Zhou, Nathalie Baracaldo, Yuan Tian, Feng Yan	(参考訳) データの不均一性は、フェデレートラーニングにおける重要な特徴の1つとして認識されているが、しばしば敵対的攻撃に対する堅牢性のレンズで見過ごされる。本論文では,合成およびLEAFベンチマークを用いた包括的な実験を通じて,フェデレーション学習におけるバックドア攻撃に対する影響を特徴づけ,理解することに焦点を当てる。実験結果から,データの不均一性は攻撃の有効性の主要な要因であり,攻撃の効率が低下し,効果的な攻撃戦略の設計が困難となり,攻撃結果も予測不能となるため,バックドア攻撃に対する防御の欠如となる可能性が示唆された。しかし,さらなる調査により,クライアント側バックドアのタイミングを単に調整するだけで,攻撃効果を著しく向上できるため,データの不均一性は償還よりも呪いに近いことが判明した。さらに重要なのは、データの異質性は、攻撃者が自分自身を偽装し、馬鹿げた機能ベースの防衛に活用することができる良性クライアントのローカルトレーニングでオーバーフィットをもたらす可能性があります。また、攻撃データ分布を調整することで効果的な攻撃戦略を作成できる。最後に,データの不均一性によってもたらされる呪いを守る可能性について論じる。大規模な実験と分析から得られた成果と教訓は、堅牢な連合学習手法とシステムを設計するための新たな洞察を提供する。 Data heterogeneity has been identified as one of the key features in federated learning but is often overlooked through the lens of robustness to adversarial attacks. This paper focuses on characterizing and understanding its impact on backdooring attacks in federated learning through comprehensive experiments using synthetic and the LEAF benchmarks. The initial impression driven by our experimental results suggests that data heterogeneity is the dominant factor in the effectiveness of attacks and it may be a redemption for defending against backdooring as it makes the attack less efficient, more challenging to design effective attack strategies, and the attack result also becomes less predictable. However, with further investigations, we found data heterogeneity is more of a curse than a redemption as the attack effectiveness can be significantly boosted by simply adjusting the client-side backdooring timing. More importantly, data heterogeneity may result in overfitting at the local training of benign clients, which can be utilized by attackers to disguise themselves and fool skewed-feature based defenses. 
In addition, effective attack strategies can be made by adjusting the attack data distribution. Finally, we discuss the potential directions of defending against the curses brought by data heterogeneity. The results and lessons learned from our extensive experiments and analysis offer new insights for designing robust federated learning methods and systems.	翻訳日:2021-02-04 10:10:57 公開日:2021-02-01
# 化学空間探索のためのディープニューラルネットワークを用いた遺伝的アルゴリズムの再現性に関する研究 A reproducibility study of "Augmenting Genetic Algorithms with Deep Neural Networks for Exploring the Chemical Space" ( http://arxiv.org/abs/2102.00700v1 ) ライセンス: Link先を確認	Kevin Maik Jablonka, Fergus Mcilwaine, Susana Garcia, Berend Smit, Brian Yoo	(参考訳) Nigamらは、SELFIES表現を利用した遺伝的アルゴリズム(GA)を報告し、生成された分子の多様性を改善するために適応的でニューラルネットワークに基づくペナルティを提案する。この論文の主な主張は、このGAは他の生成技術(罰則化されたlogPによって測定される)を上回っ、ニューラルネットワークベースの適応ペナルティが生成された分子の多様性を増加させることである。本研究では,それらの主張の再現性を検討した。全体としては、SELFIESベースのGAを用いて同等の結果を再現することができたが、ほとんどは(容易に最適化可能な)フィットネス機能の欠如(すなわち、長い硫黄を含む鎖を生成する)を利用していた。さらに, 判別器を用いて, 分子の発生を基準セットに類似するものに偏見を与えることができることも再現した。さらに,多様性の進化を定量化し,いくつかのハイパーパラメータの影響を理解し,適応的ペナルティの改善を提案する。 Nigam et al. reported a genetic algorithm (GA) utilizing the SELFIES representation and also proposed an adaptive, neural network-based penalty that is supposed to improve the diversity of the generated molecules. The main claims of the paper are that this GA outperforms other generative techniques (as measured by the penalized logP) and that a neural network-based adaptive penalty increases the diversity of the generated molecules. In this work, we investigated the reproducibility of their claims. Overall, we were able to reproduce comparable results using the SELFIES-based GA, but mostly by exploiting deficiencies of the (easily optimizable) fitness function (i.e., generating long, sulfur-containing chains). We also reproduced the finding that the discriminator can be used to bias the generation of molecules to ones that are similar to the reference set. In addition, we attempted to quantify the evolution of the diversity, understand the influence of some hyperparameters, and propose improvements to the adaptive penalty.	翻訳日:2021-02-04 10:10:15 公開日:2021-02-01
# 説明可能なランドスケープ解析に向けて:BBOB関数の極端特徴選択 Towards Explainable Exploratory Landscape Analysis: Extreme Feature Selection for Classifying BBOB Functions ( http://arxiv.org/abs/2102.00736v1 ) ライセンス: Link先を確認	Quentin Renau, Johann Dreo, Carola Doerr and Benjamin Doerr	(参考訳) 最近の機械学習(ML)の進歩により、最適化ヒューリスティックスの自動設計が現在、進化計算(EC)を揺るがしている。もっとも適したヒューリスティックを選ぶための手書きのガイドラインの設計がこの分野の研究活動を支配してきたのに対し、自動訓練されたヒューリスティックは、よく研究された最適化タスクにおいても、人間由来の選択肢よりも優れていた。したがって、MLベースのECはもはや未来的なビジョンではありませんが、コミュニティの不可欠な部分になっています。 MLベースのヒューリスティックがしばしば直面する重要な批判は、将来の開発を妨げる可能性のある説明可能性の潜在的な不足である。これは特に探索的ランドスケープ分析(ELA)に基づいてアルゴリズムのパフォーマンスを外挿する教師付き学習技術に当てはまります。このようなアプリケーションでは、特定のアルゴリズム選択または構成タスクの基礎となるモデルを構築するために多数の問題機能を使用することは珍しくありません。この作業の目標は、この多数の機能が本当に必要かどうかを分析することです。 BBOBテスト関数をテストベッドとして分類することで、驚くほど少数の機能(通常は4つ未満)が、98%の精度を達成するのに十分であることを示す。興味深いことに、このしきい値を満たすのに必要な機能の数は問題次元とともに減少する。分類精度は,複数のインスタンスがトレーニングやテストに関与している設定に転移することを示す。しかし, 離間ワンインスタンスアウト設定では分類精度が著しく低下し, 特徴の変換不変性が決定的な成功要因となる。 Facilitated by the recent advances of Machine Learning (ML), the automated design of optimization heuristics is currently shaking up evolutionary computation (EC). Where the design of hand-picked guidelines for choosing a most suitable heuristic has long dominated research activities in the field, automatically trained heuristics are now seen to outperform human-derived choices even for well-researched optimization tasks. ML-based EC is therefore not any more a futuristic vision, but has become an integral part of our community. A key criticism that ML-based heuristics are often faced with is their potential lack of explainability, which may hinder future developments. This applies in particular to supervised learning techniques which extrapolate algorithms' performance based on exploratory landscape analysis (ELA). In such applications, it is not uncommon to use dozens of problem features to build the models underlying the specific algorithm selection or configuration task. 
Our goal in this work is to analyze whether this many features are indeed needed. Using the classification of the BBOB test functions as testbed, we show that a surprisingly small number of features -- often fewer than four -- can suffice to achieve 98% accuracy. Interestingly, the number of features required to meet this threshold is found to decrease with the problem dimension. We show that the classification accuracy transfers to settings in which several instances are involved in training and testing. In the leave-one-instance-out setting, however, classification accuracy drops significantly, and the transformation-invariance of the features becomes a decisive success factor.	翻訳日:2021-02-04 10:09:37 公開日:2021-02-01
# SGDのための無痛ステップサイズ適応 Painless step size adaptation for SGD ( http://arxiv.org/abs/2102.00853v1 ) ライセンス: Link先を確認	Ilona Kulikovskikh and Tarzan Legović	(参考訳) 収束と一般化は、ニューラルネットワークのパフォーマンスの2つの重要な側面である。別々に解析すると、これらの性質は矛盾する結果をもたらす可能性がある。収束率の最適化は高速なトレーニングをもたらすが、最良の一般化誤差を保証しない。対立を避けるため、最近の研究では、オプティマイザに適度に大きなステップサイズを採用することを提案しているが、パフォーマンスに付加価値は未定である。テストの収束と一般化の改善を明示的に制御する4つの構成でLIGHT関数を提案します。 1) ニューラルネットワークの安定性を保証せずに、収束性と一般化の両方を改善すること、2) 過剰なパラメータ化を必要とせずに、より信頼性が高く説明可能なネットワークアーキテクチャを構築すること。私たちはそれを「痛みのない」ステップサイズの適応と呼びます。 Convergence and generalization are two crucial aspects of performance in neural networks. When analyzed separately, these properties may lead to contradictory results. Optimizing a convergence rate yields fast training, but does not guarantee the best generalization error. To avoid the conflict, recent studies suggest adopting a moderately large step size for optimizers, but the added value on the performance remains unclear. We propose the LIGHT function with four configurations that explicitly regulate the improvement in convergence and generalization on testing. This contribution makes it possible to: 1) improve both convergence and generalization of neural networks with no need to guarantee their stability; 2) build more reliable and explainable network architectures with no need for overparameterization. We refer to it as "painless" step size adaptation.	翻訳日:2021-02-04 10:08:51 公開日:2021-02-01
# 不可能なチューニングが可能になった:新しいエキスパートアルゴリズムとその応用 Impossible Tuning Made Possible: A New Expert Algorithm and Its Applications ( http://arxiv.org/abs/2102.01046v1 ) ライセンス: Link先を確認	Liyu Chen, Haipeng Luo, Chen-Yu Wei	(参考訳) 我々は、古典的専門家問題の長年にわたる「不可能なチューニング」問題を解決し、実際、後悔の$O\left(\sqrt{(\ln d)\sum_t \ell_{t,i}^2}\right)$と同時に、$T$-round $d$-expert problemにおけるすべてのエキスパート$i$に対して$\ell_{t,i}$は、ラウンド$t$におけるエキスパート$i$の損失であることを示す。本アルゴリズムは、補正項と重み付きエントロピー正則化器を備えたミラー降下フレームワークに基づいている。自然だが、アルゴリズムはこれまで研究されておらず、慎重に分析する必要がある。また、学習者が受信する任意の予測ベクトル $m_t$ に対して $o\left(\sqrt{(\ln d)\sum_t (\ell_{t,i}-m_{t,i})^2}\right)$ を一般化し、異なる $m_t$ を選択して既存の結果を復元または改善する。さらに,同じフレームワークを使用して,基本アルゴリズムの集合を結合し,オーバーヘッドの少ない最善のアルゴリズムを学習するマスタアルゴリズムを作成する。マスターの新たな保証により、専門家問題とより一般的なオンライン線形最適化の両方に対して、多くの新しい結果が得られます。 We resolve the long-standing "impossible tuning" issue for the classic expert problem and show that, it is in fact possible to achieve regret $O\left(\sqrt{(\ln d)\sum_t \ell_{t,i}^2}\right)$ simultaneously for all expert $i$ in a $T$-round $d$-expert problem where $\ell_{t,i}$ is the loss for expert $i$ in round $t$. Our algorithm is based on the Mirror Descent framework with a correction term and a weighted entropy regularizer. While natural, the algorithm has not been studied before and requires a careful analysis. We also generalize the bound to $O\left(\sqrt{(\ln d)\sum_t (\ell_{t,i}-m_{t,i})^2}\right)$ for any prediction vector $m_t$ that the learner receives, and recover or improve many existing results by choosing different $m_t$. Furthermore, we use the same framework to create a master algorithm that combines a set of base algorithms and learns the best one with little overhead. The new guarantee of our master allows us to derive many new results for both the expert problem and more generally Online Linear Optimization.	翻訳日:2021-02-04 10:08:20 公開日:2021-02-01
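For orientation, the expert problem in the entry above can be illustrated with the classical Hedge (exponential weights) algorithm. This is a minimal sketch of that textbook baseline, not the paper's algorithm: it has no correction term and no weighted entropy regularizer, and the function name `hedge`, the fixed learning rate `eta`, and the toy loss sequence are assumptions made for illustration.

```python
import math


def hedge(loss_rounds, eta):
    """Classical Hedge over d experts (illustrative baseline only).

    loss_rounds: list of T rounds, each a list of d losses in [0, 1].
    eta: fixed learning rate (an assumption; the paper's tuning differs).
    Returns (expected loss of the learner, cumulative loss per expert).
    """
    d = len(loss_rounds[0])
    cumulative = [0.0] * d
    learner_loss = 0.0
    for losses in loss_rounds:
        # Weights are proportional to exp(-eta * cumulative loss);
        # subtracting the minimum keeps the exponentials well scaled.
        base = min(cumulative)
        weights = [math.exp(-eta * (c - base)) for c in cumulative]
        total = sum(weights)
        probs = [w / total for w in weights]
        # The learner suffers the expected loss under its distribution.
        learner_loss += sum(p * l for p, l in zip(probs, losses))
        cumulative = [c + l for c, l in zip(cumulative, losses)]
    return learner_loss, cumulative


if __name__ == "__main__":
    T, d = 50, 2
    rounds = [[0.0, 1.0] for _ in range(T)]  # hypothetical toy losses
    eta = math.sqrt(8 * math.log(d) / T)
    loss, cum = hedge(rounds, eta)
    print("regret vs. best expert:", loss - min(cum))
```

With losses in [0, 1] and `eta = sqrt(8 ln d / T)`, Hedge's regret against every expert is at most `sqrt(T ln(d) / 2)`. That bound depends on $T$ rather than on the expert-specific sum $\sum_t \ell_{t,i}^2$ that the abstract targets.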
# 旅行行動予測における数百の機械学習分類器と離散選択モデルの比較:実証的ベンチマーク Comparing hundreds of machine learning classifiers and discrete choice models in predicting travel behavior: an empirical benchmark ( http://arxiv.org/abs/2102.01130v1 ) ライセンス: Link先を確認	Shenhao Wang, Baichuan Mo, Stephane Hess, Jinhua Zhao	(参考訳) 研究者は、旅行行動を予測するために機械学習(ML)分類器と離散選択モデル(DCM)を比較してきたが、発見の一般化はデータ、文脈、著者の専門知識によって制限されている。本研究は、高度に構造化された方法で数百のMLおよびDCM分類器を比較して、一般化可能な経験的ベンチマークを提供することを目的とする。実験では,12のモデルファミリーから105のMLとDCMの分類器,3つのデータセット,3つのサンプルサイズ,3つのアウトプットを含む4つの超次元にまたがって予測精度と計算コストを評価した。この実験設計は6,970の実験につながり、35の以前の研究から136の実験ポイントのメタデータセットと関連付けられている。この研究は、旅行行動予測のための分類器の最も包括的でほぼ完全な比較です。その結果,アンサンブル法とディープニューラルネットワークが最も高い予測性能が得られるが,計算コストは比較的高いことがわかった。ランダムフォレストは最も計算効率が良く、予測と計算のバランスをとる。離散選択モデルは、上位ML分類器よりもわずか3～4パーセント低い精度を提供するが、より長い計算時間を持ち、大きなサンプルサイズ、高い入力次元、シミュレーションベースの推定で計算不可能になる。 MLおよびDCM分類器の相対的なランキングは非常に安定しており、予測精度と計算時間の絶対値は大きな変動を有する。本稿では, 深層ニューラルネットワーク, モデルアンサンブル, ランダム森林を, 将来の旅行行動予測のベースラインモデルとして用いることを提案する。選択モデリングのために、DCMコミュニティは、ビッグデータのコンテキストでDCMを広く採用できるように、適合モデルから計算効率の改善にもっと注意を向けるべきです。 Researchers have compared machine learning (ML) classifiers and discrete choice models (DCMs) in predicting travel behavior, but the generalizability of the findings is limited by the specifics of data, contexts, and authors' expertise. This study seeks to provide a generalizable empirical benchmark by comparing hundreds of ML and DCM classifiers in a highly structured manner. The experiments evaluate both prediction accuracy and computational cost by spanning four hyper-dimensions, including 105 ML and DCM classifiers from 12 model families, 3 datasets, 3 sample sizes, and 3 outputs. This experimental design leads to an immense number of 6,970 experiments, which are corroborated with a meta dataset of 136 experiment points from 35 previous studies. This study is hitherto the most comprehensive and almost exhaustive comparison of the classifiers for travel behavioral prediction. 
We found that the ensemble methods and deep neural networks achieve the highest predictive performance, but at a relatively high computational cost. Random forests are the most computationally efficient, balancing between prediction and computation. While discrete choice models offer accuracy only 3-4 percentage points lower than the top ML classifiers, they have much longer computational time and become computationally impossible with large sample size, high input dimensions, or simulation-based estimation. The relative ranking of the ML and DCM classifiers is highly stable, while the absolute values of the prediction accuracy and computational time have large variations. Overall, this paper suggests using deep neural networks, model ensembles, and random forests as baseline models for future travel behavior prediction. For choice modeling, the DCM community should shift more attention from fitting models to improving computational efficiency, so that the DCMs can be widely adopted in the big data context.	翻訳日:2021-02-04 10:07:30 公開日:2021-02-01
# 攻撃下の未知力学系に対する動的カモフラージュによるセキュアな学習制御戦略 A Secure Learning Control Strategy via Dynamic Camouflaging for Unknown Dynamical Systems under Attacks ( http://arxiv.org/abs/2102.00573v1 ) ライセンス: Link先を確認	Sayak Mukherjee, Veronica Adetola	(参考訳) 本稿では、盗聴や隠蔽攻撃などの構成攻撃を受ける未知の線形時間不変サイバー物理システム(CPS)に対するセキュア強化学習(RL)に基づく制御手法を提案する。設計者が線形二次制御器(LQR)を学ぶために行う学習の探索段階で攻撃者が動的モデルについて学習する攻撃シナリオを検討し、その後、我々は二重学習ベースの制御と攻撃(DLCA)フレームワークと呼ばれる動的システムへの隠れた攻撃を実行するためにそのような情報を使用します。本研究では,動的システムの最適制御器を学習し,同時に,攻撃者によるシステムダイナミクスの推定に十分な誤情報を注入することができる,動的迷彩に基づく攻撃回復力強化学習アルゴリズム(ARRL)を提案する。このアルゴリズムには、コンセンサスマルチエージェントシステムとベンチマーク電力グリッドモデルに関する理論的保証と広範な数値実験が伴っている。 This paper presents a secure reinforcement learning (RL) based control method for unknown linear time-invariant cyber-physical systems (CPSs) that are subjected to compositional attacks such as eavesdropping and covert attack. We consider the attack scenario where the attacker learns about the dynamic model during the exploration phase of the learning conducted by the designer to learn a linear quadratic regulator (LQR), and thereafter uses such information to conduct a covert attack on the dynamic system, which we refer to as the doubly learning-based control and attack (DLCA) framework. We propose a dynamic camouflaging based attack-resilient reinforcement learning (ARRL) algorithm which can learn the desired optimal controller for the dynamic system, and at the same time, can inject sufficient misinformation in the estimation of system dynamics by the attacker. The algorithm is accompanied by theoretical guarantees and extensive numerical experiments on a consensus multi-agent system and on a benchmark power grid model.	翻訳日:2021-02-04 09:56:00 公開日:2021-02-01
# 確率的オンライン凸最適化 : 確率時系列予測への応用 Stochastic Online Convex Optimization; Application to probabilistic time series forecasting ( http://arxiv.org/abs/2102.00729v1 ) ライセンス: Link先を確認	Olivier Wintenberger (LPSM UMR 8001)	(参考訳) オンラインアルゴリズムの確率的後悔境界は、通常「オンラインからバッチ」変換に由来する。この推論を逆にして,確率的凸最適化問題に適用可能な「バッチからオンラインへの変換」により,確率的exp-concavity条件下で解析を開始する。非凸損失関数の確率の高い高速確率的後悔境界を得る。このアプローチに基づき、非定常非有界時系列の予測と確率予測方法を提供します。 Stochastic regret bounds for online algorithms are usually derived from an "online to batch" conversion. Inverting the reasoning, we start our analysis with a "batch to online" conversion that applies in any Stochastic Online Convex Optimization problem under a stochastic exp-concavity condition. We obtain fast rate stochastic regret bounds with high probability for non-convex loss functions. Based on this approach, we provide prediction and probabilistic forecasting methods for non-stationary unbounded time series.	翻訳日:2021-02-04 09:55:23 公開日:2021-02-01
# 粒子加速器における時系列の分類と予測の新しい手法 A Novel Approach for Classification and Forecasting of Time Series in Particle Accelerators ( http://arxiv.org/abs/2102.00786v1 ) ライセンス: Link先を確認	Sichen Li, Mélissa Zacharias, Jochem Snuverink, Jaime Coello de Portugal, Fernando Perez-Cruz, Davide Reggiani and Andreas Adelmann	(参考訳) 粒子加速器のビーム遮断(インターロック)は、必要な安全対策にもかかわらず、突然の運用変更とビーム時間の相当な損失をもたらす。インタロック現象を予測し,高出力陽子加速器複合体のビーム損失を低減するために,新しい時系列分類手法を適用した。多変量時系列のウィンドウのバイナリ分類によって予測を行う。時系列は、時系列の内部構造を捕捉するだけでなく、画像分類技術の進歩を利用する畳み込みニューラルネットワークによって分類される再発プロットに変換される。最良のインターロック対安定分類器はROC曲線下面積が$0.71 \pm 0.01$に達し、ランダムフォレストモデルの$0.65 \pm 0.01$を上回り、インターロックあたりのビーム時間損失を$0.5 \pm 0.2$秒削減できる可能性がある。 The beam interruptions (interlocks) of particle accelerators, despite being necessary safety measures, lead to abrupt operational changes and a substantial loss of beam time. A novel time series classification approach is applied to decrease beam time loss in the High Intensity Proton Accelerator complex by forecasting interlock events. The forecasting is performed through binary classification of windows of multivariate time series. The time series are transformed into Recurrence Plots which are then classified by a Convolutional Neural Network, which not only captures the inner structure of the time series but also utilizes the advances of image classification techniques. Our best performing interlock-to-stable classifier reaches an Area under the ROC Curve value of $0.71 \pm 0.01$ compared to $0.65 \pm 0.01$ of a Random Forest model, and it can potentially reduce the beam time loss by $0.5 \pm 0.2$ seconds per interlock.	翻訳日:2021-02-04 09:54:55 公開日:2021-02-01
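The recurrence-plot step mentioned in the entry above can be sketched as follows. This is a minimal illustration of a thresholded recurrence matrix, where `R[i][j] = 1` when the states at time steps `i` and `j` lie within distance `eps`. The function name `recurrence_plot`, the Euclidean distance, and the threshold `eps` are assumptions, since the abstract does not say how the plots are built.

```python
import math


def recurrence_plot(window, eps):
    """Binary recurrence matrix of one time-series window (a sketch).

    window: sequence of T samples; each sample is a number (univariate)
            or a list/tuple of numbers (multivariate).
    eps:    hypothetical distance threshold for calling two states "recurrent".
    Returns a T x T list of lists with entries 0 or 1.
    """
    # Normalise every sample to a tuple so math.dist works in both cases.
    points = [tuple(p) if isinstance(p, (list, tuple)) else (p,) for p in window]
    n = len(points)
    return [
        [1 if math.dist(points[i], points[j]) <= eps else 0 for j in range(n)]
        for i in range(n)
    ]


if __name__ == "__main__":
    # Toy univariate window; real inputs would be multivariate sensor windows.
    for row in recurrence_plot([0.0, 1.0, 0.1, 0.9], eps=0.2):
        print(row)
```

The result is a square, symmetric 0/1 matrix that can be handled like an image, which is what makes it a natural input for a convolutional classifier.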
# 低リソース音声認識のためのコントラスト表現のスケーリングについて On Scaling Contrastive Representations for Low-Resource Speech Recognition ( http://arxiv.org/abs/2102.00850v1 ) ライセンス: Link先を確認	Lasse Borgholt, Tycho Max Sylvester Tax, Jakob Drachmann Havtorn, Lars Maaløe, Christian Igel	(参考訳) コントラスト学習による自己教師型学習の最近の進歩は,ラベル付きデータの10分以内で,競争的音声認識システムを学ぶことができることを示している。しかし、これらのシステムは事前学習を必要とするため計算コストが高く、さらに大きなパラメータ空間で微調整を行う。計算要求の高いwav2vec 2.0フレームワークの固定表現に関する最先端の音声認識を訓練することにより、微調整のないシステムの性能を検討する。パフォーマンスは微調整なしで低下し、極端な低リソース設定では、wav2vec 2.0は前バージョンより劣っている。また、wav2vec 2.0表現は低次元部分空間に存在し、表現の特徴の相関が自動音声認識器の訓練を安定化させる。最後に、パフォーマンスを継続的に改善するオリジナルのwav2vecフレームワークの双方向拡張を提案する。 Recent advances in self-supervised learning through contrastive training have shown that it is possible to learn a competitive speech recognition system with as little as 10 minutes of labeled data. However, these systems are computationally expensive since they require pre-training followed by fine-tuning in a large parameter space. We explore the performance of such systems without fine-tuning by training a state-of-the-art speech recognizer on the fixed representations from the computationally demanding wav2vec 2.0 framework. We find performance to decrease without fine-tuning and, in the extreme low-resource setting, wav2vec 2.0 is inferior to its predecessor. In addition, we find that wav2vec 2.0 representations live in a low dimensional subspace and that decorrelating the features of the representations can stabilize training of the automatic speech recognizer. Finally, we propose a bidirectional extension to the original wav2vec framework that consistently improves performance.	翻訳日:2021-02-04 09:54:17 公開日:2021-02-01
# 実例からの線形時間公式の学習の複雑さ The Complexity of Learning Linear Temporal Formulas from Examples ( http://arxiv.org/abs/2102.00876v1 ) ライセンス: Link先を確認	Nathanaël Fijalkow and Guillaume Lagarde	(参考訳) 本稿では、例から線形時間論理(LTL)式を学習する計算の複雑さの研究を開始する。我々はLTLのフラグメントに対する近似アルゴリズムを構築し、硬さを証明し、特に、次の演算子と接続子のみを含むフラグメントの近似の厳密な境界を求め、多くのフラグメントに対するNP完全性の結果を証明する。 In this paper we initiate the study of the computational complexity of learning linear temporal logic (LTL) formulas from examples. We construct approximation algorithms for fragments of LTL and prove hardness results; in particular we obtain tight bounds for approximation of the fragment containing only the next operator and conjunctions, and prove NP-completeness results for many fragments.	翻訳日:2021-02-04 09:53:42 公開日:2021-02-01
# 確率的テイラー展開とフィルタリングおよび微分方程式への応用 A Probabilistic Taylor Expansion with Applications in Filtering and Differential Equations ( http://arxiv.org/abs/2102.00877v1 ) ライセンス: Link先を確認	Toni Karvonen, Jon Cockayne, Filip Tronarp, Simo Särkkä	(参考訳) 我々は、後進平均が特定のデータ選択に対して、任意の順序の切り詰められたテイラー展開を複製するガウス過程のクラスを研究する。データは、拡張点における微分評価から成り、以前の共分散カーネルはテイラー核のクラスに属しており、特定の電源系列形式で記述することができる。これにより、1次および2次テイラー展開を利用する様々なアルゴリズムの不確かさを統計的にモデル化することができる。このガウス過程モデルの有用性を実証するために、非線形状態推定のための古典的拡張カルマンフィルタの新しい確率バージョンと、通常の微分方程式を解くオイラー法を導入する。 We study a class of Gaussian processes for which the posterior mean, for a particular choice of data, replicates a truncated Taylor expansion of any order. The data consists of derivative evaluations at the expansion point and the prior covariance kernel belongs to the class of Taylor kernels, which can be written in a certain power series form. This permits statistical modelling of the uncertainty in a variety of algorithms that exploit first and second order Taylor expansions. To demonstrate the utility of this Gaussian process model we introduce new probabilistic versions of the classical extended Kalman filter for non-linear state estimation and the Euler method for solving ordinary differential equations.	翻訳日:2021-02-04 09:53:14 公開日:2021-02-01
# 深部ニューラルネットワーク推論パイプラインのForensicability Forensicability of Deep Neural Network Inference Pipelines ( http://arxiv.org/abs/2102.00921v1 ) ライセンス: Link先を確認	Alexander Schlögl, Tobias Kupek, Rainer Böhme	(参考訳) 観測可能な出力における特性数値偏差をトレースすることにより,機械学習パイプラインの実行環境の特性を推定する手法を提案する。ローカルおよびクラウドホストマシン上で得られた一連の概念実証実験の結果は、ディープニューラルネットワーク予測を生成するために使用されるハードウェアプラットフォームの識別など、法医学的応用の可能性をもたらす。最後に,予測ラベルのみを用いて機械を識別するために,数値偏差を増幅する境界サンプルを導入する。 We propose methods to infer properties of the execution environment of machine learning pipelines by tracing characteristic numerical deviations in observable outputs. Results from a series of proof-of-concept experiments obtained on local and cloud-hosted machines give rise to possible forensic applications, such as the identification of the hardware platform used to produce deep neural network predictions. Finally, we introduce boundary samples that amplify the numerical deviations in order to distinguish machines by their predicted label only.	翻訳日:2021-02-04 09:52:44 公開日:2021-02-01
# 深層音楽情報ダイナミクス Deep Music Information Dynamics ( http://arxiv.org/abs/2102.01133v1 ) ライセンス: Link先を確認	Shlomo Dubnov	(参考訳) 音楽は、時間内に組織された複雑な同時イベントからなる。本稿では,音楽データそのものに由来する高い速度情報ダイナミクスとは対照的に,思考過程のダイナミクスを捉えることを想定した,低速な潜在表現ストリームである2つの並列ストリームを組み合わせた,深層音楽情報ダイナミクスと呼ばれる新しい枠組みを提案する。我々は,人間認知の速度ゆがみ理論に動機づけられ,リスナーの心に存在する想像上の予測と音楽面自体の情報ダイナミクスの関係を探究する枠組みを提案する。このモデルはシンボリック(midi)データの場合、音響面の計算には多くの層が必要であり、楽器の特性や表現力の強い反射を捉えることができる。数学的枠組みは、まず音楽観測の高速表現を確立し、ビットアロケーション法を使用して並列低レートデータストリームに還元する変動符号化に基づいています。ここで考慮される複合損失は、各ストリームの時間発展の観点での情報レートと、ハイレート表現とローレート表現の間の相互情報で測定されたエンコーディングの忠実性の両方を含む。論文で提示したシミュレーションでは,音楽表面の潜時・虚数・副次的側面を定量的かつ計算的に抽出可能な方法で近似することができる。本論文では,時間に基づく音楽生成モデルの解析と設計において,圧縮と予測のトレードオフが重要な要素であることを示唆する計算ツールのセットについて論じる。 Music comprises a set of complex simultaneous events organized in time. In this paper we introduce a novel framework that we call Deep Musical Information Dynamics, which combines two parallel streams - a low rate latent representation stream that is assumed to capture the dynamics of a thought process contrasted with a higher rate information dynamics derived from the musical data itself. Motivated by rate-distortion theories of human cognition we propose a framework for exploring possible relations between imaginary anticipations existing in the listener's mind and information dynamics of the musical surface itself. This model is demonstrated for the case of symbolic (MIDI) data, as accounting for acoustic surface would require many more layers to capture instrument properties and performance expressive inflections. The mathematical framework is based on variational encoding that first establishes a high rate representation of the musical observations, which is then reduced using a bit-allocation method into a parallel low rate data stream. The combined loss considered here includes both the information rate in terms of time evolution for each stream, and the fidelity of encoding measured in terms of mutual information between the high and low rate representations. 
In the simulations presented in the paper we are able to juxtapose aspects of latent/imaginary surprisal versus surprisal of the music surface in a manner that is quantifiable and computationally tractable. The set of computational tools is discussed in the paper, suggesting that a trade-off between compression and prediction is an important factor in the analysis and design of time-based music generative models.	翻訳日:2021-02-04 09:52:15 公開日:2021-02-01
# Gene Mover's Distance:Optimal Transportによる単一細胞類似性 The Gene Mover's Distance: Single-cell similarity via Optimal Transport ( http://arxiv.org/abs/2102.01218v1 ) ライセンス: Link先を確認	Riccardo Bellazzi and Andrea Codegoni and Stefano Gualandi and Giovanna Nicora and Eleonora Vercesi	(参考訳) 本稿では, 単細胞RNAシークエンシングにより得られた遺伝子発現プロファイルに基づいて, 一対の細胞間の類似性の尺度であるGene Mover's Distanceを紹介する。提案する距離の基本的な考え方は、単一細胞の遺伝子発現配列を離散的確率測度として解釈することである。したがって、2つのセル間の距離は、対応する2つの離散測度間の最適輸送問題を解くことで計算される。最適輸送モデルでは、一対の遺伝子間の距離を測定するために2種類のコスト関数を用いる。最初のコスト関数は、遺伝子を高次元ベクターにマッピングするために使用されるgen2vecと呼ばれる遺伝子埋め込みを利用する:遺伝子から他のベクターへ遺伝子発現の質量の単位を移動させるコストは、対応する埋め込みベクター間のユークリッド距離に設定される。第2のコスト関数はペアの遺伝子間のピアソン距離に基づいている。両方のコスト関数では、2つの遺伝子が相関するほど、その距離は低くなります。我々は、遺伝子ムーバーの距離を利用して、その状態とタイプに応じて細胞を分類する2つの分類問題を解く。新しいメトリックの影響を評価するために、異なる距離を使用して$ k$-Nearest Neighbor分類器のパフォーマンスを比較します。計算結果から、遺伝子ムーバーの距離は、文献で使われている最先端距離と競合していることが示された。 This paper introduces the Gene Mover's Distance, a measure of similarity between a pair of cells based on their gene expression profiles obtained via single-cell RNA sequencing. The underlying idea of the proposed distance is to interpret the gene expression array of a single cell as a discrete probability measure. The distance between two cells is hence computed by solving an Optimal Transport problem between the two corresponding discrete measures. In the Optimal Transport model, we use two types of cost function for measuring the distance between a pair of genes. The first cost function exploits a gene embedding, called gene2vec, which is used to map each gene to a high dimensional vector: the cost of moving a unit of mass of gene expression from a gene to another is set to the Euclidean distance between the corresponding embedded vectors. The second cost function is based on a Pearson distance among pairs of genes. In both cost functions, the more two genes are correlated, the lower is their distance. 
We exploit the Gene Mover's Distance to solve two classification problems: the classification of cells according to their condition and according to their type. To assess the impact of our new metric, we compare the performances of a $k$-Nearest Neighbor classifier using different distances. The computational results show that the Gene Mover's Distance is competitive with the state-of-the-art distances used in the literature.	翻訳日:2021-02-04 09:51:31 公開日:2021-02-01
# (参考訳) 量子フェア機械学習 Quantum Fair Machine Learning ( http://arxiv.org/abs/2102.00753v1 ) ライセンス: CC BY 4.0	Elija Perrier	(参考訳) 本稿では,量子フェア機械学習の分野について紹介する。古典的および量子的フェアマシンラーニングアルゴリズムの違いと類似性の比較分析を行い、量子計算のユニークな特徴が、量子アルゴリズムが公平性制約の対象となる場合の尺度、メトリクス、修復戦略をどのように変更するかを特定します。本稿では、グローバー探索アルゴリズムを用いて、量子アルゴリズムに課される統計パリティ制約を満たすことにより、量子フェア機械学習の最初の結果を示す。我々は、$\epsilon$-tolerance内でそのような統計パリティを達成するために必要なイテレーションの低いバウンドを提供する。正準リプシッツ条件の個々の公正度基準を量子メトリクスを用いて量子設定に拡張する。量子情報処理と量子データに関わる機械学習コンテキストにおける公平性の典型的な尺度の結果を検討する。最後に, 計算機科学, 倫理学, 量子計算分野の研究者に新たな関心を寄せるオープン質問と研究プログラムを提案する。 In this paper, we inaugurate the field of quantum fair machine learning. We undertake a comparative analysis of differences and similarities between classical and quantum fair machine learning algorithms, specifying how the unique features of quantum computation alter measures, metrics and remediation strategies when quantum algorithms are subject to fairness constraints. We present the first results in quantum fair machine learning by demonstrating the use of Grover's search algorithm to satisfy statistical parity constraints imposed on quantum algorithms. We provide lower-bounds on iterations needed to achieve such statistical parity within $\epsilon$-tolerance. We extend canonical Lipschitz-conditioned individual fairness criteria to the quantum setting using quantum metrics. We examine the consequences for typical measures of fairness in machine learning context when quantum information processing and quantum data are involved. Finally, we propose open questions and research programmes for this new field of interest to researchers in computer science, ethics and quantum computation.	翻訳日:2021-02-04 09:48:02 公開日:2021-02-01
# (参考訳) 第4回複雑系のスマートシミュレーションとモデリングに関する国際ワークショップ The 4th International Workshop on Smart Simulation and Modelling for Complex Systems ( http://arxiv.org/abs/2102.01190v1 ) ライセンス: CC BY 4.0	Xing Su, Yan Kong, Weihua Li	(参考訳) コンピュータベースのモデリングとシミュレーションは、物理学、天体物理学、化学、生物学、経済学、工学、社会科学など、さまざまな分野のシステムを理解するための有用なツールとなっている。複雑なシステムは、多数の相互作用するコンポーネント(エージェント、プロセスなど)で特徴付けられる。 ) は非線型かつ自己組織的である。複雑なシステムは、システムコンポーネント間の複雑な関係、リソースの分散特徴、および環境のダイナミクスのために、従来の計算アプローチを用いてシミュレーションやモデル化が難しい。一方、マルチエージェントシステムなどのスマートシステムは、複雑なシステムのモデリングとシミュレーションにおける利点と大きな可能性を実証しています。 Computer-based modelling and simulation have become useful tools to facilitate humans to understand systems in different domains, such as physics, astrophysics, chemistry, biology, economics, engineering and social science. A complex system is featured with a large number of interacting components (agents, processes, etc.), whose aggregate activities are nonlinear and self-organized. Complex systems are hard to be simulated or modelled by using traditional computational approaches due to complex relationships among system components, distributed features of resources, and dynamics of environments. Meanwhile, smart systems such as multi-agent systems have demonstrated advantages and great potentials in modelling and simulating complex systems.	翻訳日:2021-02-04 05:00:14 公開日:2021-02-01
# (参考訳) 確率的サブモジュラカバーのタイトバウンド A Tight Bound for Stochastic Submodular Cover ( http://arxiv.org/abs/2102.01149v1 ) ライセンス: CC BY 4.0	Lisa Hellerstein, Devorah Kletenik and Srinivasan Parthasarathy	(参考訳) Golovin and Krause (2011) の適応的貪欲 (Adaptive Greedy) アルゴリズムが、確率的劣モジュラ被覆 (Stochastic Submodular Cover) に対して $(\ln (Q/\eta)+1)$ の近似境界を達成することを示す。ここで $Q$ は「目標値」、$\eta$ は 1 つのアイテムがもたらしうる効用の非ゼロの限界増加量の最小値である。 (整数値の効用関数の場合は $H(Q)$ の境界を示す。ここで $H(Q)$ は $Q$ 番目の調和数である。) この境界は Golovin と Krause によって論文の原版で主張されたが、その証明は後に Nan and Saligrama (2017) によって誤りであることが示された。その後の Golovin and Krause (2017) の修正された証明は、$(\ln(Q/\eta) + 1)^2$ の二次境界を与える。この問題に対する他の既存の境界は、関連する問題に関する Im et al. (2016) の研究から導かれる $56(\ln(Q/\eta) + 1)$ と、Deshpande et al. (2016) および Hellerstein and Kletenik (2018) による $k(\ln (Q/\eta)+1)$ である。ここで $k$ は状態数である。我々の境界は、古典的な集合被覆 (Set Cover) 問題に対する貪欲アルゴリズムのよく知られた $(\ln~m + 1)$ 近似境界を一般化する。ここで $m$ は基底集合の大きさである。 We show that the Adaptive Greedy algorithm of Golovin and Krause (2011) achieves an approximation bound of $(\ln (Q/\eta)+1)$ for Stochastic Submodular Cover: here $Q$ is the "goal value" and $\eta$ is the smallest non-zero marginal increase in utility deliverable by an item. (For integer-valued utility functions, we show a bound of $H(Q)$, where $H(Q)$ is the $Q^{th}$ Harmonic number.) Although this bound was claimed by Golovin and Krause in the original version of their paper, the proof was later shown to be incorrect by Nan and Saligrama (2017). The subsequent corrected proof of Golovin and Krause (2017) gives a quadratic bound of $(\ln(Q/\eta) + 1)^2$. Other previous bounds for the problem are $56(\ln(Q/\eta) + 1)$, implied by work of Im et al. (2016) on a related problem, and $k(\ln (Q/\eta)+1)$, due to Deshpande et al. (2016) and Hellerstein and Kletenik (2018), where $k$ is the number of states. Our bound generalizes the well-known $(\ln~m + 1)$ approximation bound on the greedy algorithm for the classical Set Cover problem, where $m$ is the size of the ground set.	翻訳日:2021-02-04 04:13:11 公開日:2021-02-01
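As a point of reference for the classical result that this abstract says its bound generalizes, the following is a minimal sketch of the greedy rule for deterministic Set Cover, plus a helper for the harmonic number $H(Q)$. It is not the Adaptive Greedy algorithm for the stochastic setting, and the function names `greedy_set_cover` and `harmonic` are assumptions made for illustration.

```python
import math


def greedy_set_cover(universe, subsets):
    """Classical greedy Set Cover: repeatedly pick the subset that covers the
    most still-uncovered elements. Returns the indices of the chosen subsets.

    `subsets` is a list of Python sets; ties are broken by lowest index.
    """
    uncovered = set(universe)
    chosen = []
    while uncovered:
        best = max(range(len(subsets)),
                   key=lambda i: len(uncovered & subsets[i]))
        gain = uncovered & subsets[best]
        if not gain:
            raise ValueError("the given subsets do not cover the universe")
        chosen.append(best)
        uncovered -= gain
    return chosen


def harmonic(q):
    """Return the q-th harmonic number H(q) = 1 + 1/2 + ... + 1/q."""
    return sum(1.0 / i for i in range(1, q + 1))


if __name__ == "__main__":
    subsets = [{1, 2, 3, 4}, {1, 2, 5}, {3, 4, 6}, {5, 6}]
    print(greedy_set_cover(range(1, 7), subsets))
    # H(q) never exceeds ln(q) + 1, which is why the two bounds are comparable.
    print(harmonic(4), math.log(4) + 1)
```

The harmonic helper is included only to make the relationship between the $H(Q)$ and $(\ln Q + 1)$ forms of the bound concrete for small integers.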
# (参考訳) インストゥルメンタル変数アプローチとベイズ非パラメトリック機械学習による因果推論 Causal Inference with the Instrumental Variable Approach and Bayesian Nonparametric Machine Learning ( http://arxiv.org/abs/2102.01199v1 ) ライセンス: CC BY-SA 4.0	Robert E. McCulloch, Rodney A. Sparapani, Brent R. Logan and Purushottam W. Laud	(参考訳) インスツルメンタル変数モデルで推論するための新しいフレキシブルなフレームワークを提供する。線形仕様を使用するのではなく、Bayesian Additive Regression Trees (BART)による機械学習を用いて、楽器や他の説明変数の効果を特徴付ける関数を推定する。誤差項とその分布はディリクレプロセス混合物を用いて推定される。シミュレーションおよび実例は、真の函数が線型であるとき、ほとんど失われないことを示している。しかし、非線形性が存在する場合、手動チューニングをほとんど行わずに劇的な改善が得られる。 We provide a new flexible framework for inference with the instrumental variable model. Rather than using linear specifications, functions characterizing the effects of instruments and other explanatory variables are estimated using machine learning via Bayesian Additive Regression Trees (BART). Error terms and their distribution are inferred using Dirichlet Process mixtures. Simulated and real examples show that when the true functions are linear, little is lost. But when nonlinearities are present, dramatic improvements are obtained with virtually no manual tuning.	翻訳日:2021-02-04 01:11:49 公開日:2021-02-01
# (参考訳) 説明可能な人工知能を用いた急性中毒の診断 Diagnosis of Acute Poisoning Using Explainable Artificial Intelligence ( http://arxiv.org/abs/2102.01116v1 ) ライセンス: CC BY 4.0	Michael Chary, Ed W Boyer, Michele M Burns	(参考訳) 医療毒性学 (medical toxicology) は、過剰摂取、投薬ミス、サソリの刺傷など、物質の毒性作用を治療する臨床専門分野である。毒性学の知識と研究の量は、他の医学分野と同様に、個々の臨床医が完全に習得し、最新の状態を維持する能力を上回っている。医療毒性学への機械学習技術の適用は、初期治療の決定がしばしば少数のテキストデータに基づいており、事前知識に大きく依存するため、困難である。 ML技術は、医師にとって透明な形で知識を表現しないことが多く、ユーザビリティへの障壁を生じさせる。ルールベースのシステムと決定木学習はより透明なアプローチであるが、しばしば一般化が不十分で、実装と維持には専門家のキュレーションが必要である。そこで我々は,医療毒性学者の知識基盤の一部を表す確率論的論理ネットワークを構築した。本手法は臨床医の知識表現と臨床意思決定を透過的に模倣する。 Takと呼ばれるこのソフトウェアは、簡単なケースと中程度の難易度のケースでは人間と同等の性能を示すが、難しい臨床ケースでは人間に劣る。 Takは決定木分類器をあらゆる難易度で上回っている。確率論理は、許容可能なレベルの性能を達成できれば、医療での使用がより受け入れられやすいかもしれない説明可能な人工知能の1つの形態を提供する。 Medical toxicology is the clinical specialty that treats the toxic effects of substances, be it an overdose, a medication error, or a scorpion sting. The volume of toxicological knowledge and research has, as with other medical specialties, outstripped the ability of the individual clinician to entirely master and stay current with it. The application of machine learning techniques to medical toxicology is challenging because initial treatment decisions are often based on a few pieces of textual data and rely heavily on prior knowledge. ML techniques often do not represent knowledge in a way that is transparent for the physician, raising barriers to usability. Rule-based systems and decision tree learning are more transparent approaches, but often generalize poorly and require expert curation to implement and maintain. Here, we construct a probabilistic logic network to represent a portion of the knowledge base of a medical toxicologist. Our approach transparently mimics the knowledge representation and clinical decision-making of practicing clinicians. The software, dubbed Tak, performs comparably to humans on straightforward cases and intermediate difficulty cases, but is outperformed by humans on challenging clinical cases. Tak outperforms a decision tree classifier at all levels of difficulty. Probabilistic logic provides one form of explainable artificial intelligence that may be more acceptable for use in healthcare, if it can achieve acceptable levels of performance.	翻訳日:2021-02-04 00:12:07 公開日:2021-02-01
# (参考訳) ビデオ記憶性予測のためのマルチモーダルアンサンブルモデル Multi-modal Ensemble Models for Predicting Video Memorability ( http://arxiv.org/abs/2102.01173v1 ) ライセンス: CC BY 4.0	Tony Zhao, Irving Fang, Jeffrey Kim, Gerald Friedland	(参考訳) メディアの記憶可能性のモデリングは、機械学習の分野で一貫した課題である。 MediaEval2020のPredicting Media Memorabilityタスクは、このトピックに対処する同様の課題の中で最新のベンチマークです。課題の以前のイテレーションで開発された技術に基づいて,抽出した映像,画像,テキスト,音声特徴を用いてアンサンブル手法を開発した。本研究は,メディアの記憶可能性を予測するための特徴として,抽出音声埋め込みの有効性と高一般化性を紹介する。 Modeling media memorability has been a consistent challenge in the field of machine learning. The Predicting Media Memorability task in MediaEval2020 is the latest benchmark among similar challenges addressing this topic. Building upon techniques developed in previous iterations of the challenge, we developed ensemble methods with the use of extracted video, image, text, and audio features. Critically, in this work we introduce and demonstrate the efficacy and high generalizability of extracted audio embeddings as a feature for the task of predicting media memorability.	翻訳日:2021-02-04 00:00:27 公開日:2021-02-01
# (参考訳) toon2real:漫画画像をリアルな画像に翻訳する toon2real: Translating Cartoon Images to Realistic Images ( http://arxiv.org/abs/2102.01143v1 ) ライセンス: CC BY 4.0	K. M. Arefeen Sultan, Mohammad Imrul Jubair, MD. Nahidul Islam, Sayed Hossain Khan	(参考訳) 画像から画像への変換に関しては、GAN(Generative Adversarial Networks)は教師なしデータセットでも大きな成功を収めている。本研究では,GANを用いた漫画画像から写真実写画像への翻訳を目的とする。このタスクを実行するためにいくつかの最先端モデルを適用するが、高品質な翻訳には失敗する。これら2つのドメイン間の浅い差がこの問題を引き起こすのを観察する。そこで本研究では,漫画領域からフォトリアリスティック領域への画像翻訳のためのCycleGANモデルに基づく手法を提案する。モデルを効率よくするために、我々のモデルに安定性を加えたスペクトル正規化を実装した。実験の結果を実証し,提案手法が他の最先端技術であるUNITと比較して最も低いFrechet Inception Distanceスコアと優れた結果を得たことを示す。 In terms of Image-to-image translation, Generative Adversarial Networks (GANs) has achieved great success even when it is used in the unsupervised dataset. In this work, we aim to translate cartoon images to photo-realistic images using GAN. We apply several state-of-the-art models to perform this task; however, they fail to perform good quality translations. We observe that the shallow difference between these two domains causes this issue. Based on this idea, we propose a method based on CycleGAN model for image translation from cartoon domain to photo-realistic domain. To make our model efficient, we implemented Spectral Normalization which added stability in our model. We demonstrate our experimental results and show that our proposed model has achieved the lowest Frechet Inception Distance score and better results compared to another state-of-the-art technique, UNIT.	翻訳日:2021-02-03 21:22:16 公開日:2021-02-01
# (参考訳) Image Domain DEEP-SLRによる並列MRデータの再構成とセグメント化 Reconstruction and Segmentation of Parallel MR Data using Image Domain DEEP-SLR ( http://arxiv.org/abs/2102.01172v1 ) ライセンス: CC BY 4.0	Aniket Pramanik, Mathews Jacob	(参考訳) この研究の主な焦点は、並列MRI(PMRI)脳データの共同再構成と分割のための新しいフレームワークである。画像領域深層ネットワークの導入により,PMRIデータのキャリブレーションレスリカバリを実現した。提案されたアプローチは, CLEAR [6] を含む非補正 PMRI 回復のための局所低ランクアプローチの深層学習 (DL) に基づく一般化である。画像領域アプローチは、k空間ベースのアプローチと比較して、余分な消滅関係を利用するため、性能改善が期待できる。アーティファクトのアンサンプリングによるセグメンテーションエラーを最小限に抑えるため,提案手法をセグメンテーションネットワークと組み合わせ,エンドツーエンドでトレーニングした。この手法は、セグメンテーションエラーの低減に加えて、オーバーフィットの低減による再構築性能の向上も実現し、再構成された画像は、独立して訓練された再構築ネットワークよりもぼやけやシャープなエッジを減少させる。 The main focus of this work is a novel framework for the joint reconstruction and segmentation of parallel MRI (PMRI) brain data. We introduce an image domain deep network for calibrationless recovery of undersampled PMRI data. The proposed approach is the deep-learning (DL) based generalization of local low-rank based approaches for uncalibrated PMRI recovery including CLEAR [6]. Since the image domain approach exploits additional annihilation relations compared to k-space based approaches, we expect it to offer improved performance. To minimize segmentation errors resulting from undersampling artifacts, we combined the proposed scheme with a segmentation network and trained it in an end-to-end fashion. In addition to reducing segmentation errors, this approach also offers improved reconstruction performance by reducing overfitting; the reconstructed images exhibit reduced blurring and sharper edges than independently trained reconstruction network.	翻訳日:2021-02-03 17:52:54 公開日:2021-02-01
# 生音声から生成した音声言語モデリング Generative Spoken Language Modeling from Raw Audio ( http://arxiv.org/abs/2102.01192v1 ) ライセンス: Link先を確認	Kushal Lakhotia, Evgeny Kharitonov, Wei-Ning Hsu, Yossi Adi, Adam Polyak, Benjamin Bolte, Tu-Anh Nguyen, Jade Copet, Alexei Baevski, Adelrahman Mohamed, Emmanuel Dupoux	(参考訳) ジェネレーティブ・スピーカ言語モデリングは、(テキストやラベルなしで)生の音声のみから言語の音響的および言語的特性を共同で学習することを含む。音声合成(システム自身の音声を用いて音声入力を繰り返す)と音声生成(音声プロンプトで条件付きまたは無条件で新規音声出力を生成する)の2つのタスクにおいて、生成した出力を音響的および言語的品質で自動評価する指標を導入し、これらの指標を人間の判断で検証する。本研究では,離散音声エンコーダ(離散,低ビットレート,擬似テキスト単位)と生成言語モデル(擬似テキスト単位で学習)と音声デコーダ(擬似テキストから波形を生成する)からなるベースラインシステムをテストする。 3つの最先端の教師なし音声符号化(contrastive prediction coding (cpc), wav2vec 2.0, hubert)と離散単位数(50, 100, 200)を比較し,教師なしメトリクス(ゼロショットプローブタスク)で測定した学習単位の品質に依存するかを検討した。私たちは評価スタックとベースラインモデルをオープンソース化します。 Generative spoken language modeling involves learning jointly the acoustic and linguistic characteristics of a language from raw audio only (without text or labels). We introduce metrics to automatically evaluate the generated output in terms of acoustic and linguistic quality in two associated end-to-end tasks, respectively: speech resynthesis (repeating the speech input using the system's own voice), and speech generation (producing novel speech outputs conditional on a spoken prompt, or unconditionally), and validate these metrics with human judgment. We test baseline systems consisting of a discrete speech encoder (returning discrete, low bitrate, pseudo-text units), a generative language model (trained on pseudo-text units), and a speech decoder (generating a waveform from pseudo-text). By comparing three state-of-the-art unsupervised speech encoders (Contrastive Predictive Coding (CPC), wav2vec 2.0, HuBERT), and varying the number of discrete units (50, 100, 200), we investigate how the generative performance depends on the quality of the learned units as measured by unsupervised metrics (zero-shot probe tasks). We will open source our evaluation stack and baseline models.	翻訳日:2021-02-03 16:56:07 公開日:2021-02-01
# 大規模多目的質問回答データによる自己学習機械の読み書き Self-Teaching Machines to Read and Comprehend with Large-Scale Multi-Subject Question Answering Data ( http://arxiv.org/abs/2102.01226v1 ) ライセンス: Link先を確認	Dian Yu, Kai Sun, Dong Yu, Claire Cardie	(参考訳) この領域の最近の研究にもかかわらず、対象領域の質問応答データが機械読解(MRC)タスクに有用かどうかはまだ不明である。本稿では,この問題について考察する。大規模多目的多目的質問答えデータセットであるExamQAを収集し,Web検索エンジンが返送する不完全でノイズの多いスニペットを各問合せインスタンスのコンテキストとして使用し,弱ラベルのMRCインスタンスに変換する。次に,生成した弱ラベルMRCインスタンスを,ターゲットMRCタスクを改善するための自己学習パラダイムを提案する。実験結果から,マルチチョイスMRCデータセットC^3では5.1%の精度向上が可能であり,本フレームワークの有効性と,機械学習理解のための大規模質問応答データの有効性が示された。 In spite of much recent research in the area, it is still unclear whether subject-area question-answering data is useful for machine reading comprehension (MRC) tasks. In this paper, we investigate this question. We collect a large-scale multi-subject multiple-choice question-answering dataset, ExamQA, and use incomplete and noisy snippets returned by a web search engine as the relevant context for each question-answering instance to convert it into a weakly-labeled MRC instance. We then propose a self-teaching paradigm to better use the generated weakly-labeled MRC instances to improve a target MRC task. Experimental results show that we can obtain an improvement of 5.1% in accuracy on a multiple-choice MRC dataset, C^3, demonstrating the effectiveness of our framework and the usefulness of large-scale subject-area question-answering data for machine reading comprehension.	翻訳日:2021-02-03 16:55:20 公開日:2021-02-01
# RectiNet-v2: ドキュメントイメージのデワーピングのためのスタックネットワークアーキテクチャ RectiNet-v2: A stacked network architecture for document image dewarping ( http://arxiv.org/abs/2102.01120v1 ) ライセンス: Link先を確認	Hmrishav Bandyopadhyay, Tanmoy Dasgupta, Nibaran Das, Mita Nasipuri	(参考訳) モバイルとハンドヘルドカメラの登場により、ドキュメントイメージはほぼすべての領域に浸透しています。これらの画像のデワーピングは、文書認識アルゴリズムによって理解できるように、視点の歪みや折り畳みを取り除くために不可欠です。そこで本研究では,入力として使用する歪文書から歪みのない文書画像を生成可能な,エンドツーエンドCNNアーキテクチャを提案する。自然データの不足を補うために合成シミュレーションされた歪んだ文書画像上でこのモデルを訓練する。本手法は, 共有重み付きバイフラクテッドデコーダを用いたグリッド座標の混入防止, U-Net スキップ接続における残存ネットワークによるモデル内の異なる受容フィールドからのデータフロー, およびゲートネットワークを用いた文書画像の構造と線レベルの詳細のモデルフォーカス支援において斬新な手法である。本手法は,この領域のベンチマークであるDocUNetデータセット上で評価し,最新の手法に匹敵する結果を得る。 With the advent of mobile and hand-held cameras, document images have found their way into almost every domain. Dewarping of these images for the removal of perspective distortions and folds is essential so that they can be understood by document recognition algorithms. For this, we propose an end-to-end CNN architecture that can produce distortion free document images from warped documents it takes as input. We train this model on warped document images simulated synthetically to compensate for lack of enough natural data. Our method is novel in the use of a bifurcated decoder with shared weights to prevent intermingling of grid coordinates, in the use of residual networks in the U-Net skip connections to allow flow of data from different receptive fields in the model, and in the use of a gated network to help the model focus on structure and line level detail of the document image. We evaluate our method on the DocUNet dataset, a benchmark in this domain, and obtain results comparable to state-of-the-art methods.	翻訳日:2021-02-03 16:51:36 公開日:2021-02-01
# 随伴剛体変換ネットワーク:3次元形状の自己監督アライメント Adjoint Rigid Transform Network: Self-supervised Alignment of 3D Shapes ( http://arxiv.org/abs/2102.01161v1 ) ライセンス: Link先を確認	Keyang Zhou, Bharat Lal Bhatnagar, Bernt Schiele, Gerard Pons-Moll	(参考訳) 3Dデータ(ポイントクラウド、メッシュ)のほとんどの学習方法は、データが正常な向きに慎重に整列されていない場合に、大幅なパフォーマンス低下を被る。異なるソースから収集された現実世界の3Dデータをアライメントすることは簡単ではなく、手動の介入が必要です。本論文では,既存の3Dネットワークと統合して,形状の再構築,非剛体登録,潜在非絡み合いなどのタスクにおける性能を大幅に向上させるニューラルネットワークであるAdjoint Rigid Transform (ART) Networkを提案する。 ARTは、多くのタスクに不可欠な正準方向への入力形状の回転を学習します。 artは入力形状に回転同分散制約を課すことでこれを達成する。注目すべき結果は、自己スーパービジョンだけで、artは剛体オブジェクトと非剛体オブジェクトの両方のユニークな標準指向を見つけることができ、下流のタスクパフォーマンスが著しく向上する。さらなる研究のために、コードと事前トレーニングモデルをリリースします。 Most learning methods for 3D data (point clouds, meshes) suffer significant performance drops when the data is not carefully aligned to a canonical orientation. Aligning real world 3D data collected from different sources is non-trivial and requires manual intervention. In this paper, we propose the Adjoint Rigid Transform (ART) Network, a neural module which can be integrated with existing 3D networks to significantly boost their performance in tasks such as shape reconstruction, non-rigid registration, and latent disentanglement. ART learns to rotate input shapes to a canonical orientation that is crucial for a lot of tasks. ART achieves this by imposing rotation equivariance constraint on input shapes. The remarkable result is that with only self-supervision, ART can discover a unique canonical orientation for both rigid and nonrigid objects, which leads to a notable boost in downstream task performance. We will release our code and pre-trained models for further research.	翻訳日:2021-02-03 16:50:56 公開日:2021-02-01
# 編集を楽しむ: 潜在空間ナビゲーションによる画像編集のための制御可能なgan Enjoy Your Editing: Controllable GANs for Image Editing via Latent Space Navigation ( http://arxiv.org/abs/2102.01187v1 ) ライセンス: Link先を確認	Peiye Zhuang, Oluwasanmi Koyejo, Alexander G. Schwing	(参考訳) 制御可能なセマンティック画像編集により、ユーザーはクリック数が少なく画像属性全体を変更できます。例えば、夏のシーンは冬に撮影されたように徐々に見えます。このタスクの古典的なアプローチは、GAN(Generative Adversarial Net)を使用して、潜在空間と適切な潜在空間変換を学ぶ。しかし、現在のアプローチはしばしば、絡み合った属性編集、グローバルなイメージアイデンティティの変更、および写真リアリズムの減少に苦しんでいます。これらの懸念に対処するために,複数の属性変換を同時に学習し,属性回帰を変換関数のトレーニングに統合し,画像のアイデンティティとフォトリアリズムの維持を促進するコンテンツ損失と敵対的損失を適用する。質的評価を主とした先行作業とは異なり、制御可能な編集性能を測定するための定量的評価戦略を提案します。本モデルでは,画像の同一性やリアリズムを保ちながら,単一属性と複数属性の編集をよりよく制御することができる。実画像と合成画像の両方に対して実験結果を提供し,本モデルがターゲット画像操作の最先端性能を達成することを強調した。 Controllable semantic image editing enables a user to change entire image attributes with few clicks, e.g., gradually making a summer scene look like it was taken in winter. Classic approaches for this task use a Generative Adversarial Net (GAN) to learn a latent space and suitable latent-space transformations. However, current approaches often suffer from attribute edits that are entangled, global image identity changes, and diminished photo-realism. To address these concerns, we learn multiple attribute transformations simultaneously, we integrate attribute regression into the training of transformation functions, apply a content loss and an adversarial loss that encourage the maintenance of image identity and photo-realism. We propose quantitative evaluation strategies for measuring controllable editing performance, unlike prior work which primarily focuses on qualitative evaluation. Our model permits better control for both single- and multiple-attribute editing, while also preserving image identity and realism during transformation. We provide empirical results for both real and synthetic images, highlighting that our model achieves state-of-the-art performance for targeted image manipulation.	翻訳日:2021-02-03 16:50:19 公開日:2021-02-01
# 単眼直接視覚オドメトリーにおける特徴量に基づく再局在の密結合 Tight-Integration of Feature-Based Relocalization in Monocular Direct Visual Odometry ( http://arxiv.org/abs/2102.01191v1 ) ライセンス: Link先を確認	Mariia Gladkova, Rui Wang, Niclas Zeller, and Daniel Cremers	(参考訳) 本稿では,地図ベースの再局在化をオンラインの直接視覚オドメトリに統合するフレームワークを提案する。直接手法の地図に基づく再局在化を実現するため,画像特徴を直接スパースオドメトリー(DSO)に統合し,オンライン視覚計測(VO)と以前に構築された地図を関連づけるために特徴マッチングに依存する。再ローカライゼーションのポーズの統合は3倍である。まず、ポーズ先行として扱われ、フロントエンド追跡のダイレクトイメージアライメントに密に統合される。第2に、バックエンドバンドル調整に密に統合される。オンライン融合モジュールは、相対的なVOポーズとグローバルな再ローカライズポーズをポーズグラフに組み合わせ、キーフレームをスムースかつグローバルに正確なポーズで推定する。本手法は2つのマルチウェザーデータセットで評価し,手作業と学習の異なる特徴を統合し,カメラ追跡精度の向上が期待できることを示す。 In this paper we propose a framework for integrating map-based relocalization into online direct visual odometry. To achieve map-based relocalization for direct methods, we integrate image features into Direct Sparse Odometry (DSO) and rely on feature matching to associate online visual odometry (VO) with a previously built map. The integration of the relocalization poses is threefold. Firstly, they are treated as pose priors and tightly integrated into the direct image alignment of the front-end tracking. Secondly, they are also tightly integrated into the back-end bundle adjustment. An online fusion module is further proposed to combine relative VO poses and global relocalization poses in a pose graph to estimate keyframe-wise smooth and globally accurate poses. We evaluate our method on two multi-weather datasets showing the benefits of integrating different handcrafted and learned features and demonstrating promising improvements on camera tracking accuracy.	翻訳日:2021-02-03 16:49:40 公開日:2021-02-01
# スロットアテンションを有する文字系列から有意義な単位を誘導する Inducing Meaningful Units from Character Sequences with Slot Attention ( http://arxiv.org/abs/2102.01223v1 ) ライセンス: Link先を確認	Melika Behjati and James Henderson	(参考訳) 文字は意味を伝えないが、文字の配列はそうである。抽象的意味保持単位を一連の文字で学習するための教師なし分布法を提案する。このモデルはシーケンスをセグメンテーションする代わりに、最近提案されたスロットアテンションと呼ばれる画像のオブジェクト発見のためのアーキテクチャを用いて、シーケンス内の"オブジェクト"の連続的な表現を検出する。我々は、異なる言語でモデルを訓練し、取得した表現の品質を分類器で評価する。我々の実験は、より高い抽象レベルで意味を捉える能力において有望な結果を示す。 Characters do not convey meaning, but sequences of characters do. We propose an unsupervised distributional method to learn the abstract meaning-bearing units in a sequence of characters. Rather than segmenting the sequence, this model discovers continuous representations of the "objects" in the sequence, using a recently proposed architecture for object discovery in images called Slot Attention. We train our model on different languages and evaluate the quality of the obtained representations with probing classifiers. Our experiments show promising results in the ability of our units to capture meaning at a higher level of abstraction.	翻訳日:2021-02-03 16:42:31 公開日:2021-02-01
# GraphDF:分子グラフ生成のための離散フローモデル GraphDF: A Discrete Flow Model for Molecular Graph Generation ( http://arxiv.org/abs/2102.01189v1 ) ライセンス: Link先を確認	Youzhi Luo, Keqiang Yan, Shuiwang Ji	(参考訳) 深層モデルを用いた分子グラフ生成の問題点を考察する。グラフは離散的であるが、既存のほとんどのメソッドは連続潜伏変数を使用し、離散グラフ構造の不正確なモデリングをもたらす。本稿では,正規化フロー法に基づく分子グラフ生成のための新しい離散潜在変数モデルであるGraphDFを提案する。 graphdfは、離散的潜在変数をグラフノードとエッジにマッピングするために、可逆モジュロシフト変換を使用する。離散潜在変数を用いることで計算コストを削減し、復号化の負の効果を排除できることを示す。包括的実験により,graphdfはランダム生成,プロパティ最適化,制約付き最適化タスクにおいて,先行手法よりも優れていた。 We consider the problem of molecular graph generation using deep models. While graphs are discrete, most existing methods use continuous latent variables, resulting in inaccurate modeling of discrete graph structures. In this work, we propose GraphDF, a novel discrete latent variable model for molecular graph generation based on normalizing flow methods. GraphDF uses invertible modulo shift transforms to map discrete latent variables to graph nodes and edges. We show that the use of discrete latent variables reduces computational costs and eliminates the negative effect of dequantization. Comprehensive experimental results show that GraphDF outperforms prior methods on random generation, property optimization, and constrained optimization tasks.	翻訳日:2021-02-03 16:42:00 公開日:2021-02-01
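The "invertible modulo shift transform" mentioned in this abstract can be illustrated with a small sketch: shifting a discrete value modulo the number of categories is exactly undone by shifting back. The function names are hypothetical, and the shift is passed in as a plain integer here; how the model itself produces that shift is not specified in the abstract, so it is left out as an assumption.

```python
def modulo_shift(z, shift, num_categories):
    """Map a discrete latent value z in {0, ..., num_categories - 1} to a
    category by adding an integer shift modulo the number of categories."""
    return (z + shift) % num_categories


def inverse_modulo_shift(x, shift, num_categories):
    """Undo modulo_shift: recover the latent value z from the category x.

    Python's % yields a non-negative result for a positive modulus, so this
    also works when x - shift is negative.
    """
    return (x - shift) % num_categories


if __name__ == "__main__":
    k = 4  # assumed number of categories, e.g. hypothetical atom types
    z = 3
    x = modulo_shift(z, shift=2, num_categories=k)
    print(x, inverse_modulo_shift(x, shift=2, num_categories=k))
```

Because the map is a bijection on a finite set, no continuous relaxation of the discrete values is involved in this sketch.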
# 並列ウェーブネットを用いたUniversal Neural Vocoding Universal Neural Vocoding with Parallel WaveNet ( http://arxiv.org/abs/2102.01106v1 ) ライセンス: Link先を確認	Yunlong Jiao, Adam Gabrys, Georgi Tinchev, Bartosz Putrycz, Daniel Korzekwa, Viacheslav Klimkov	(参考訳) 本稿では,Parallel WaveNetに基づくユニバーサルニューラルボコーダと,Audio Encoderと呼ばれる追加の条件付けネットワークを提案する。本ユニバーサルボコーダは、幅広いユースケースでリアルタイムの高品質な音声合成を提供する。年齢と性別の異なる43人の社内話者(20言語、17の固有スタイル)でテストし、そのうち7つの声と5つのスタイルは学習時に使用していない。提案するユニバーサルボコーダは,全体として話者依存型ボコーダを有意に上回ることを示す。また,提案するボコーダは,自然性と普遍性の観点から,既存の複数のニューラルボコーダアーキテクチャよりも優れていることを示す。これらの結果は、300以上のオープンソース音声でさらにテストした場合も一貫している。 We present a universal neural vocoder based on Parallel WaveNet, with an additional conditioning network called Audio Encoder. Our universal vocoder offers real-time high-quality speech synthesis on a wide range of use cases. We tested it on 43 internal speakers of diverse age and gender, speaking 20 languages in 17 unique styles, of which 7 voices and 5 styles were not exposed during training. We show that the proposed universal vocoder significantly outperforms speaker-dependent vocoders overall. We also show that the proposed vocoder outperforms several existing neural vocoder architectures in terms of naturalness and universality. These findings are consistent when we further test on more than 300 open-source voices.	翻訳日:2021-02-03 16:36:56 公開日:2021-02-01
# BERT-based Label & Instance Embeddings による遠隔監視型関係抽出の改善 Improving Distantly-Supervised Relation Extraction through BERT-based Label & Instance Embeddings ( http://arxiv.org/abs/2102.01156v1 ) ライセンス: Link先を確認	Despina Christou, Grigorios Tsoumakas	(参考訳) 遠隔教師付き関係抽出(RE)は,REを大規模コーパスに拡張する有効な方法であるが,ノイズラベルに悩まされている。既存のアプローチは、マルチインスタンス学習と追加情報の提供を通じてノイズを緩和しようとするが、主に出現頻度の高い関係のみを認識し、ロングテールの関係を見落としている。 REDSandT(Relation Extraction with Distant Supervision and Transformers)は、BERTの事前訓練モデルとラベルとエンティティの関係をそれぞれ活用し、情報量の多いインスタンス埋め込みとラベル埋め込みを通じてより広い関係の集合を捉える、遠隔教師付きトランスフォーマーベースの新しいRE手法である。エンティティペアを接続するサブツリーとエンティティの型を含む構造化された入力でBERTを微調整することで、REDSandTが関係を表すトークンのみに注目するように誘導する。抽出した情報ベクトルを用いてラベル埋め込みを形づくり、さらにノイズを低減するためにインスタンス上の注意機構としても使用する。最後に、関係埋め込みとインスタンス埋め込みを連結することで文を表現する。 NYT-10データセットの実験では、REDSandTはより広い関係の集合をより高い信頼度で捉え、最先端のAUC(0.424)を達成している。 Distantly-supervised relation extraction (RE) is an effective method to scale RE to large corpora but suffers from noisy labels. Existing approaches try to alleviate noise through multi-instance learning and by providing additional information, but manage to recognize mainly the top frequent relations, neglecting those in the long-tail. We propose REDSandT (Relation Extraction with Distant Supervision and Transformers), a novel distantly-supervised transformer-based RE method, that manages to capture a wider set of relations through highly informative instance and label embeddings for RE, by exploiting BERT's pre-trained model, and the relationship between labels and entities, respectively. We guide REDSandT to focus solely on relational tokens by fine-tuning BERT on a structured input, including the sub-tree connecting an entity pair and the entities' types. Using the extracted informative vectors, we shape label embeddings, which we also use as attention mechanism over instances to further reduce noise. Finally, we represent sentences by concatenating relation and instance embeddings. Experiments in the NYT-10 dataset show that REDSandT captures a broader set of relations with higher confidence, achieving state-of-the-art AUC (0.424).	翻訳日:2021-02-03 16:36:25 公開日:2021-02-01
# SGDがGDよりも一般化(正規化は役に立たない) SGD Generalizes Better Than GD (And Regularization Doesn't Help) ( http://arxiv.org/abs/2102.01117v1 ) ライセンス: Link先を確認	Idan Amir, Tomer Koren, Roi Livni	(参考訳) 基本的な確率凸最適化モデルにおいて、確率勾配降下(SGD)とフルバッチ勾配降下(GD)の一般化性能の間の新たな分離結果を与える。 SGD の場合、$O(1/\epsilon^2)$ 回の反復で $\epsilon$ の過剰期待リスクを持つ解が得られることはよく知られているが、同じステップ数では GD が過学習し、$\Omega(1)$ の一般化誤差を持つ解を出力しうることを示す。さらに,GDがSGDの一般化性能に並ぶには実際に $\Omega(1/\epsilon^4)$ 回の反復が必要であることを示す。これは Bassily et al. (2020) の最近の研究によりタイトでもある。さらに,GDによって最小化される経験的リスクを正則化しても上記の結果は本質的に変わらないことを議論し,安定性,暗黙的バイアス,一般化における学習アルゴリズムの役割について再検討する。 We give a new separation result between the generalization performance of stochastic gradient descent (SGD) and of full-batch gradient descent (GD) in the fundamental stochastic convex optimization model. While for SGD it is well-known that $O(1/\epsilon^2)$ iterations suffice for obtaining a solution with $\epsilon$ excess expected risk, we show that with the same number of steps GD may overfit and emit a solution with $\Omega(1)$ generalization error. Moreover, we show that in fact $\Omega(1/\epsilon^4)$ iterations are necessary for GD to match the generalization performance of SGD, which is also tight due to recent work by Bassily et al. (2020). We further discuss how regularizing the empirical risk minimized by GD essentially does not change the above result, and revisit the concepts of stability, implicit bias and the role of the learning algorithm in generalization.	翻訳日:2021-02-03 16:34:19 公開日:2021-02-01
# 単一プロップによる頑健なニューラルネットワークの高速学習 Fast Training of Provably Robust Neural Networks by SingleProp ( http://arxiv.org/abs/2102.01208v1 ) ライセンス: Link先を確認	Akhilan Boopathy, Tsui-Wei Weng, Sijia Liu, Pin-Yu Chen, Gaoyuan Zhang, Luca Daniel	(参考訳) 最近の研究は、認証された保証を伴う敵の攻撃からニューラルネットワークを守るいくつかの方法を開発した。しかし、これらの技術は、訓練中に認証を使用することで計算コストがかかる。既存の認定防御よりも効率的で、ネットワークを介して1つの追加の前方伝播を必要とする新しい正規化器を開発し、同様の認定精度でネットワークを訓練することができます。 mnist と cifar-10 の実験を通じて,最先端の認証防御と比較して,トレーニング速度と同等の認定精度が向上することを示す。 Recent works have developed several methods of defending neural networks against adversarial attacks with certified guarantees. However, these techniques can be computationally costly due to the use of certification during training. We develop a new regularizer that is both more efficient than existing certified defenses, requiring only one additional forward propagation through a network, and can be used to train networks with similar certified accuracy. Through experiments on MNIST and CIFAR-10 we demonstrate improvements in training speed and comparable certified accuracy compared to state-of-the-art certified defenses.	翻訳日:2021-02-03 16:33:37 公開日:2021-02-01
# 線形ペイオフのための二重ロバストトンプソンサンプリング Doubly Robust Thompson Sampling for linear payoffs ( http://arxiv.org/abs/2102.01229v1 ) ライセンス: Link先を確認	Wonyoung Kim, Gi-soo Kim, Myunghee Cho Paik	(参考訳) バンディット問題の難しい側面は、確率的な報酬が選択されたアームについてのみ観測され、他のアームの報酬は欠測のままとなることである。アームの選択は過去のコンテキストと報酬の組に依存するため、選択されたアームのコンテキストには相関が生じ、解析が困難になる。本論文では,欠測データの文献で用いられるDR手法をTSに応用した,Doubly Robust (DR) Thompson Sampling (TS) という新しい多腕コンテキストバンディットアルゴリズムを提案する。提案アルゴリズムは TS の境界を $\sqrt{d}$ 倍改善する。ここで $d$ はコンテキストの次元である。提案手法の利点は,選択されたか否かにかかわらずすべてのコンテキストデータを使用することであり,これにより TS の理論解析で用いられる不飽和アームの技術的定義を回避できる。実験的研究は TS に対する提案アルゴリズムの優位性を示す。 A challenging aspect of the bandit problem is that a stochastic reward is observed only for the chosen arm and the rewards of other arms remain missing. Since the arm choice depends on the past context and reward pairs, the contexts of chosen arms suffer from correlation and render the analysis difficult. We propose a novel multi-armed contextual bandit algorithm called Doubly Robust (DR) Thompson Sampling (TS) that applies the DR technique used in missing data literature to TS. The proposed algorithm improves the bound of TS by a factor of $\sqrt{d}$, where $d$ is the dimension of the context. A benefit of the proposed method is that it uses all the context data, chosen or not chosen, thus allowing to circumvent the technical definition of unsaturated arms used in theoretical analysis of TS. Empirical studies show the advantage of the proposed algorithm over TS.	翻訳日:2021-02-03 16:33:05 公開日:2021-02-01
# スタックオーバーフローからのクエリログに基づく効率的な検索のための自動クエリリフォーマレーション Automated Query Reformulation for Efficient Search based on Query Logs From Stack Overflow ( http://arxiv.org/abs/2102.00826v1 ) ライセンス: Link先を確認	Kaibo Cao (1), Chunyang Chen (2), Sebastian Baltes (3), Christoph Treude (3), Xiang Chen (4) ((1) Software Institute, Nanjing University, China, (2) Faculty of Information Technology, Monash University, Australia, (3) School of Computer Science, University of Adelaide, Australia, (4) School of Information Science and Technology, Nantong University, China)	(参考訳) プログラミングのQ&Aサイトとして人気があるStack Overflowは、開発者にとって宝物だ。しかしながら、Stack Overflowの質問や回答の量によって、開発者が探している情報を効率的に見つけることが難しくなる。検索結果の貧弱化につながる2つのギャップは、ユーザの意図とテキストクエリの間のギャップ、クエリとポストコンテンツの間の意味的ギャップである。そのため開発者は、ミススペルされた単語を訂正し、特定のプログラミング言語やプラットフォームに制限を加えるなどして、クエリを常に修正する必要がある。クエリの改定は、特に初心者にとっては面倒であるので、ディープラーニングに基づくソフトウェア固有の自動クエリ改定手法を提案する。 Stack Overflowが提供するクエリログを用いて,元のクエリとそれに対応する改定後のクエリを含む大規模クエリ改定コーパスを構築する。提案手法では,ユーザの元のクエリが与えられたときに,改定クエリの候補を自動的に生成するトランスフォーマーモデルを訓練する。評価の結果、我々のアプローチは5つの最先端ベースラインを上回り、$\mathit{ExactMatch}$で5.6%から33.5%、$\mathit{GLEU}$で4.8%から14.4%向上した。 As a popular Q&A site for programming, Stack Overflow is a treasure for developers. However, the amount of questions and answers on Stack Overflow make it difficult for developers to efficiently locate the information they are looking for. There are two gaps leading to poor search results: the gap between the user's intention and the textual query, and the semantic gap between the query and the post content. Therefore, developers have to constantly reformulate their queries by correcting misspelled words, adding limitations to certain programming languages or platforms, etc. As query reformulation is tedious for developers, especially for novices, we propose an automated software-specific query reformulation approach based on deep learning. With query logs provided by Stack Overflow, we construct a large-scale query reformulation corpus, including the original queries and corresponding reformulated ones. Our approach trains a Transformer model that can automatically generate candidate reformulated queries when given the user's original query. The evaluation results show that our approach outperforms five state-of-the-art baselines, and achieves a 5.6% to 33.5% boost in terms of $\mathit{ExactMatch}$ and a 4.8% to 14.4% boost in terms of $\mathit{GLEU}$.	翻訳日:2021-02-03 16:29:42 公開日:2021-02-01
# サイエンス・コンスピレーション・ビデオの視覚的フレイミング: 機械学習とコミュニケーション理論の統合による色と明度の利用に関する研究 Visual Framing of Science Conspiracy Videos: Integrating Machine Learning with Communication Theories to Study the Use of Color and Brightness ( http://arxiv.org/abs/2102.01163v1 ) ライセンス: Link先を確認	Kaiping Chen, Sang Jung Kim, Sebastian Raschka, Qiantong Gao	(参考訳) 近年、インターネット上の科学陰謀ビデオの爆発が目撃され、科学認識論と科学の一般理解に挑戦している。学者たちは、不確実性や恐怖といった陰謀のメッセージで使用される説得技術について調べ始めたが、視覚的物語についてはほとんど理解されていない。本稿では、陰謀と反陰謀のビデオから数百万フレームを解析し、陰謀ビデオにおける視覚的フレーミングの理解のギャップを計算手法を用いて解決する。陰謀ビデオは色のばらつきや明るさが低い傾向にあり、特にサムネイルやビデオの序盤で顕著だった。本論文は,ソーシャルメディア上での陰謀を識別するために,研究者がテキスト的および視覚的な特徴を統合する方法を示し,デジタル時代の視覚操作研究に関心のある研究者にとっての計算モデリングの意義について論じる。 Recent years have witnessed an explosion of science conspiracy videos on the Internet, challenging science epistemology and public understanding of science. Scholars have started to examine the persuasion techniques used in conspiracy messages, such as uncertainty and fear, yet little is understood about the visual narratives, especially how visual narratives differ in videos that debunk conspiracies versus those that propagate conspiracies. This paper addresses this gap in understanding visual framing in conspiracy videos by analyzing millions of frames from conspiracy and counter-conspiracy YouTube videos using computational methods. We found that conspiracy videos tended to use lower color variance and brightness, especially in thumbnails and earlier parts of the videos. This paper also demonstrates how researchers can integrate textual and visual features for identifying conspiracies on social media and discusses the implications of computational modeling for scholars interested in studying visual manipulation in the digital era.	翻訳日:2021-02-03 16:15:44 公開日:2021-02-01
# (参考訳) 「大麻にかかわるうつ病ですか。」「限定監督による実体・関係抽出のための知識融合モデル」 "Is depression related to cannabis?": A knowledge-infused model for Entity and Relation Extraction with Limited Supervision ( http://arxiv.org/abs/2102.01222v1 ) ライセンス: CC BY 4.0	Kaushik Roy, Usha Lokala, Vedant Khandelwal, and Amit Sheth	(参考訳) 精神の健康を改善するための大麻の使用の利点を強く宣伝し、大麻の合法化が立法府の優先事項である。しかし、予備的な科学的研究は、大麻と精神の健康の改善を決定づけるものではない。本研究では、大麻の個人的使用を含む標的ソーシャルメディアコーパスにおける大麻の抑うつと消費の関係を検討し、その潜在的な精神的健康上の利益を導き出そうとする。ドメインの専門家がアノテートした3つのカテゴリ(理由、効果、中毒)に関連付けられたツイートを使用します。最先端の自然言語処理技術は、大麻のフレーズとうつ病指標の間のこれらの関係の抽出に不足します。本研究は,精神疾患の診断・統計マニュアルを付加した依存症用薬物乱用オントロジーを精神保健に応用し,その限界に対処することを目的とする。ドメインエキスパートの時間が限られているためアノテーションが不足しているため、広範囲のコーパスで訓練されたGPT-3とともに教師付きコントラスト学習を使用して、限られた監督下でもパフォーマンスの向上を実現している。実験の結果,本手法は最先端の関係抽出装置よりも大麻-うつ病関係を有意に抽出できることが判明した。良質なアノテーションは、科学コミュニティが大麻とうつ病の関連性をよりよく理解するために使用できる学習表現を使用して、最近傍アプローチで提供することができる。 With strong marketing advocacy of the benefits of cannabis use for improved mental health, cannabis legalization is a priority among legislators. However, preliminary scientific research does not conclusively associate cannabis with improved mental health. In this study, we explore the relationship between depression and consumption of cannabis in a targeted social media corpus involving personal use of cannabis with the intent to derive its potential mental health benefit. We use tweets that contain an association among three categories annotated by domain experts - Reason, Effect, and Addiction. The state-of-the-art Natural Language Processing techniques fall short in extracting these relationships between cannabis phrases and the depression indicators. We seek to address the limitation by using domain knowledge; specifically, the Drug Abuse Ontology for addiction augmented with Diagnostic and Statistical Manual of Mental Disorders lexicons for mental health. 
Because annotations are scarce due to the limited availability of the domain experts' time, we use supervised contrastive learning in conjunction with GPT-3 trained on a vast corpus to achieve improved performance even with limited supervision. Experimental results show that our method extracts cannabis-depression relationships significantly better than the state-of-the-art relation extractor. Using the learned representations, a nearest-neighbor approach can provide high-quality annotations that the scientific community can use to better understand the association between cannabis and depression.	翻訳日:2021-02-03 16:12:59 公開日:2021-02-01
# (参考訳) グラフ畳み込みニューラルネットワークによる汎用OCRパラグラフの同定 General-Purpose OCR Paragraph Identification by Graph Convolutional Neural Networks ( http://arxiv.org/abs/2101.12741v2 ) ライセンス: CC BY 4.0	Renshen Wang, Yasuhisa Fujii and Ashok C. Popat	(参考訳) パラグラフはドキュメントエンティティの重要なクラスです。 OCRテキストボックスに適用した空間グラフ畳み込みニューラルネットワーク(GCN)による段落識別のための新しい手法を提案する。行分割と行クラスタリングという2つのステップを実行して、OCR結果の行から段落を抽出します。各ステップはバウンディングボックスから構築されたβ-スケルトングラフを使用し、グラフエッジはグラフ畳み込み操作の効率的なサポートを提供する。純粋なレイアウト入力機能のみにより、GCNモデルのサイズはR-CNNベースのモデルと比較して3〜4桁小さく、PubLayNetや他のデータセットで同等以上の精度を達成しています。さらに、GCNモデルは、合成トレーニングデータから実世界画像への良好な一般化と、可変文書スタイルに対する良好な適応性を示す。 Paragraphs are an important class of document entities. We propose a new approach for paragraph identification by spatial graph convolutional neural networks (GCN) applied on OCR text boxes. Two steps, namely line splitting and line clustering, are performed to extract paragraphs from the lines in OCR results. Each step uses a beta-skeleton graph constructed from bounding boxes, where the graph edges provide efficient support for graph convolution operations. With only pure layout input features, the GCN model size is 3~4 orders of magnitude smaller compared to R-CNN based models, while achieving comparable or better accuracies on PubLayNet and other datasets. Furthermore, the GCN models show good generalization from synthetic training data to real-world images, and good adaptivity for variable document styles.	翻訳日:2021-02-03 13:29:22 公開日:2021-02-01
# (参考訳) ファーストパーソンビデオからのコンタクト表現による予測アクション Forecasting Action through Contact Representations from First Person Video ( http://arxiv.org/abs/2102.00649v1 ) ライセンス: CC BY 4.0	Eadom Dessalene, Chinmaya Devaraj, Michael Maynord, Cornelia Fermuller, and Yiannis Aloimonos	(参考訳) 手操作を含む人間の行動は、手対象の接触の作成と破壊に基づいて構成され、行動の人間の視覚的理解は、認知科学の先駆的な研究によって実証されるように、接触の予測に依存している。これから着想を得て,接触を中心とした表現とモデルを紹介し,行動予測と予測に使用する。 EPIC Kitchensデータセットのサブセットをアノテートして、ハンドとオブジェクト間の接触時間、ハンドとオブジェクトのセグメンテーションを含むようにします。これらのアノテーションを使って予測モジュール、接触予測マップを生成するモジュール、そして次のアクティブオブジェクトセグメンテーションを訓練します。予測モジュールの上に、アクション予測と予測のためのフレームワークであるEgocentric Object Manipulation Graphs (Ego-OMG)を適用します。 Ego-OMGは、接触線型行動状態間のグラフモデリング遷移を使用して、より長期の時間的意味関係をモデル化する。 ego-omg内の予測モジュールの使用は、最先端の結果を生成し、epic kitchens action anticipation challengeのunseenおよびseetテストセットでそれぞれ1位と2位を達成し、epic kitchens上でのアクション予測とアクション予測のタスクに関する最先端の結果を得る。我々は,予測モジュールの特性に関するアブレーション研究を行い,その有用性を評価する。 Human actions involving hand manipulations are structured according to the making and breaking of hand-object contact, and human visual understanding of action is reliant on anticipation of contact as is demonstrated by pioneering work in cognitive science. Taking inspiration from this, we introduce representations and models centered on contact, which we then use in action prediction and anticipation. We annotate a subset of the EPIC Kitchens dataset to include time-to-contact between hands and objects, as well as segmentations of hands and objects. Using these annotations we train the Anticipation Module, a module producing Contact Anticipation Maps and Next Active Object Segmentations - novel low-level representations providing temporal and spatial characteristics of anticipated near future action. On top of the Anticipation Module we apply Egocentric Object Manipulation Graphs (Ego-OMG), a framework for action anticipation and prediction. 
Ego-OMG models longer term temporal semantic relations through the use of a graph modeling transitions between contact delineated action states. Use of the Anticipation Module within Ego-OMG produces state-of-the-art results, achieving 1st and 2nd place on the unseen and seen test sets, respectively, of the EPIC Kitchens Action Anticipation Challenge, and achieving state-of-the-art results on the tasks of action anticipation and action prediction over EPIC Kitchens. We perform ablation studies over characteristics of the Anticipation Module to evaluate their utility.	翻訳日:2021-02-03 08:25:19 公開日:2021-02-01
# (参考訳) 顔について:顔認識評価に関する調査 About Face: A Survey of Facial Recognition Evaluation ( http://arxiv.org/abs/2102.00813v1 ) ライセンス: CC BY 4.0	Inioluwa Deborah Raji, Genevieve Fried	(参考訳) 1976年から2019年にかけて、さまざまなソース、人口、状況から1700万以上の被験者の1億4500万枚の画像から構築された100以上の顔データセットを調査した。歴史的調査によると、これらのデータセットは、政治的モチベーションの変化、技術的能力、そして現在の規範によって形作られています。このような影響が特定のプラクティスをマスクする方法(その一部は実際に有害であるか、あるいはそれ以外は問題)を議論し、現実世界のテクノロジーの機能をより明確に理解するために、そのような詳細の明示的なコミュニケーションのケースを作ります。 We survey over 100 face datasets constructed between 1976 and 2019, comprising 145 million images of over 17 million subjects from a range of sources, demographics and conditions. Our historical survey reveals that these datasets are contextually informed, shaped by changes in political motivations, technological capability and current norms. We discuss how such influences mask specific practices (some of which may actually be harmful or otherwise problematic) and make a case for the explicit communication of such details in order to establish a more grounded understanding of the technology's function in the real world.	翻訳日:2021-02-03 08:01:49 公開日:2021-02-01
# (参考訳) ビデオキャプションのためのセマンティックグループネットワーク Semantic Grouping Network for Video Captioning ( http://arxiv.org/abs/2102.00831v1 ) ライセンス: CC BY 4.0	Hobin Ryu, Sunghun Kang, Haeyong Kang, and Chang D. Yoo	(参考訳) 本論文では,(1)部分的に符号化されたキャプションの単語フレーズを区別してビデオフレームをグループ化しようとするセマンティックグループネットワーク(Semantic Grouping Network, SGN)と呼ばれるビデオキャプション生成ネットワークを検討し,(2)セマンティックアライメント群を復号して次の単語を予測する。連続するフレームがユニークな情報を提供する可能性は低いため、以前の手法は入力ビデオのみに基づいて繰り返し情報を破棄またはマージすることに重点を置いていた。 SGNは、部分的にデコードされたキャプションの最も識別された単語フレーズをキャプチャするアルゴリズムと、関連するビデオフレームに各フレーズを関連付けるマッピングを学習する。従来の手法とは対照的に、復号された単語からの連続的なフィードバックにより、SGNは部分的に復号されたキャプションに対応するビデオ表現を動的に更新することができる。さらに、マニュアルアノテーションなしで単語句とビデオフレームの正確な整合を容易にするために、コントラストの注意損失が提案される。 SGNは、MSVDおよびMSR-VTTデータセット上のCIDEr-Dスコアの2.1%pおよび2.4%pのマージンでランナーアップ方法を上回ることにより、最新のパフォーマンスを実現します。広範な実験は、SGNの有効性と解釈可能性を示しています。 This paper considers a video caption generating network referred to as Semantic Grouping Network (SGN) that attempts (1) to group video frames with discriminating word phrases of partially decoded caption and then (2) to decode those semantically aligned groups in predicting the next word. As consecutive frames are not likely to provide unique information, prior methods have focused on discarding or merging repetitive information based only on the input video. The SGN learns an algorithm to capture the most discriminating word phrases of the partially decoded caption and a mapping that associates each phrase to the relevant video frames - establishing this mapping allows semantically related frames to be clustered, which reduces redundancy. In contrast to the prior methods, the continuous feedback from decoded words enables the SGN to dynamically update the video representation that adapts to the partially decoded caption. Furthermore, a contrastive attention loss is proposed to facilitate accurate alignment between a word phrase and video frames without manual annotations. 
The SGN achieves state-of-the-art performances by outperforming runner-up methods by a margin of 2.1%p and 2.4%p in a CIDEr-D score on MSVD and MSR-VTT datasets, respectively. Extensive experiments demonstrate the effectiveness and interpretability of the SGN.	翻訳日:2021-02-03 08:00:59 公開日:2021-02-01
# (参考訳) 構造予測における超高速速度 Super fast rates in structured prediction ( http://arxiv.org/abs/2102.00760v1 ) ライセンス: CC BY 4.0	Vivien Cabannes and Alessandro Rudi and Francis Bach	(参考訳) 分類のような離散的教師付き学習問題は、回帰に類似した連続的な代理問題を導入することでしばしば取り組まれる。サロゲート誤差による推定と解の間の元の誤差の境界は、連続インスタンスに対して既に示されている収束率で離散的な問題を内包する。しかし、現在のアプローチでは、連続的な問題が連続的な値を予測するとき、離散的な問題は本質的に離散的な出力を予測しているという事実を活用できない。本稿では、一般的な構造化された予測問題についてこの問題に取り組み、過度のリスクに対する収束率が$n^{-1}$よりも速く、$n$が観測数であり、最も強い仮定で指数関数的なレートも含む「超高速」率への道を開く。まず,近接近傍に基づく予測器について説明を行い,構造化予測の枠組み内の任意の離散問題に対してバイナリ分類で知られている確率を一般化する。次に,$n^{-1/4}$の既知の速度を任意に高速化するカーネルリッジ回帰を,問題の硬さを特徴付けるパラメータによって検討し,スムーズな仮定の下で,次元性の呪いを回避できるようにする。 Discrete supervised learning problems such as classification are often tackled by introducing a continuous surrogate problem akin to regression. Bounding the original error, between estimate and solution, by the surrogate error endows discrete problems with convergence rates already shown for continuous instances. Yet, current approaches do not leverage the fact that discrete problems are essentially predicting a discrete output when continuous problems are predicting a continuous value. In this paper, we tackle this issue for general structured prediction problems, opening the way to "super fast" rates, that is, convergence rates for the excess risk faster than $n^{-1}$, where $n$ is the number of observations, and even exponential rates under the strongest assumptions. We first illustrate it for predictors based on nearest neighbors, generalizing rates known for binary classification to any discrete problem within the framework of structured prediction. We then consider kernel ridge regression where we improve known rates in $n^{-1/4}$ to arbitrarily fast rates, depending on a parameter characterizing the hardness of the problem, thus allowing, under smoothness assumptions, to bypass the curse of dimensionality.	翻訳日:2021-02-03 07:00:59 公開日:2021-02-01
# (参考訳) 歴史的コーポラの神経OCRポストホック補正 Neural OCR Post-Hoc Correction of Historical Corpora ( http://arxiv.org/abs/2102.00583v1 ) ライセンス: CC BY-SA 4.0	Lijun Lyu, Maria Koutraki, Martin Krickl, Besnik Fetahu	(参考訳) 光文字認識(ocr)は歴史的コレクションへのより深いアクセスに不可欠である。 OCRは、文字、単語、または単語分割の転写エラーの主源として、正書法の変化、書体、言語進化(新しい文字、単語スペルなど)を考慮する必要がある。歴史的印刷物のデジタルコーパスでは、スキャン品質の低下と言語標準化の欠如によりエラーはさらに悪化します。 OCRポストホック補正のタスクでは、OCR転写エラーを補正するために、リカレント(RNN)とディープ畳み込みネットワーク(ConvNet)を組み合わせたニューラルアプローチを提案します。文字レベルでは、誤りを柔軟に捉え、新しい注意機構に基づいて補正された出力を復号する。入力と出力の類似性を考慮し,モデルの補正動作に報酬を与える新たな損失関数を提案する。ドイツ語での履歴書コーパスの評価は、私たちのモデルが多様なOCR転写エラーをキャプチャし、単語誤り率を32.3%以上89%削減できることを示しています。 Optical character recognition (OCR) is crucial for a deeper access to historical collections. OCR needs to account for orthographic variations, typefaces, or language evolution (i.e., new letters, word spellings), as the main source of character, word, or word segmentation transcription errors. For digital corpora of historical prints, the errors are further exacerbated due to low scan quality and lack of language standardization. For the task of OCR post-hoc correction, we propose a neural approach based on a combination of recurrent (RNN) and deep convolutional network (ConvNet) to correct OCR transcription errors. At character level we flexibly capture errors, and decode the corrected output based on a novel attention mechanism. Accounting for the input and output similarity, we propose a new loss function that rewards the model's correcting behavior. Evaluation on a historical book corpus in German language shows that our models are robust in capturing diverse OCR transcription errors and reduce the word error rate of 32.3% by more than 89%.	翻訳日:2021-02-03 05:29:39 公開日:2021-02-01
# (参考訳) 科学論文におけるマルチレベルヘッダ数値表のメトリック型同定 Metric-Type Identification for Multi-Level Header Numerical Tables in Scientific Papers ( http://arxiv.org/abs/2102.00819v1 ) ライセンス: CC BY 4.0	Lya Hulliyyatus Suadaa, Hidetaka Kamigaito, Manabu Okumura, Hiroya Takamura	(参考訳) 数値表は科学論文に実験結果を示すために広く使われている。テーブル理解のためには、テーブル内の数値を識別するためにメトリクス型が不可欠です。本稿では,新しい情報抽出タスク,多レベルヘッダ数値表からのメトリックタイプ識別,ヘッダテーブル,キャプション,メトリックタイプからなる科学論文から抽出したデータセットを提案する。そこで我々は,ポインタ生成モデルとBERTモデルを用いた2つの共同学習型ニューラル分類と生成方式を提案する。その結果, 共同モデルは, ヘッド内とヘッド外の両方のメトリック型識別問題に対処できることが示された。 Numerical tables are widely used to present experimental results in scientific papers. For table understanding, a metric-type is essential to discriminate numbers in the tables. We introduce a new information extraction task, metric-type identification from multi-level header numerical tables, and provide a dataset extracted from scientific papers consisting of header tables, captions, and metric-types. We then propose two joint-learning neural classification and generation schemes featuring pointer-generator-based and BERT-based models. Our results show that the joint models can handle both in-header and out-of-header metric-type identification problems.	翻訳日:2021-02-03 05:12:38 公開日:2021-02-01
# (参考訳) 多言語lama:多言語事前学習言語モデルにおける知識の検討 Multilingual LAMA: Investigating Knowledge in Multilingual Pretrained Language Models ( http://arxiv.org/abs/2102.00894v1 ) ライセンス: CC BY 4.0	Nora Kassner, Philipp Dufter, Hinrich Schütze	(参考訳) 近年,単言語英語モデルが知識ベースとして利用できることが判明している。構造知識ベースクエリの代わりに,「パリは[MASK]の首都」などのマスキング文がプローブとして使用される。確立されたベンチマークTRExとGoogleREを53言語に翻訳する。 mBERTを使って3つの質問を調査する。 i) mBERTは多言語知識ベースとして使用できるか? ほとんどの先行研究は英語のみを扱っている。複数の言語に研究を拡張することは、多様性とアクセシビリティにとって重要である。 (ii) 知識ベース言語非依存としてのmBERTのパフォーマンスは、言語によって異なりますか? (iii) 多言語モデルはより多くのテキストで訓練される。例えば、mBERTは104のウィキペディアで訓練される。 mBERTはこれをより良いパフォーマンスに活用できますか? 知識ベースとしてmBERTを使用することで、言語間でパフォーマンスが変化し、言語間で予測をプールすることでパフォーマンスが向上します。逆に、mBERTは言語バイアスを示す。例えば、イタリア語で問い合わせた場合、イタリアを起源の国と予測する傾向があります。 Recently, it has been found that monolingual English language models can be used as knowledge bases. Instead of structural knowledge base queries, masked sentences such as "Paris is the capital of [MASK]" are used as probes. We translate the established benchmarks TREx and GoogleRE into 53 languages. Working with mBERT, we investigate three questions. (i) Can mBERT be used as a multilingual knowledge base? Most prior work only considers English. Extending research to multiple languages is important for diversity and accessibility. (ii) Is mBERT's performance as knowledge base language-independent or does it vary from language to language? (iii) A multilingual model is trained on more text, e.g., mBERT is trained on 104 Wikipedias. Can mBERT leverage this for better performance? We find that using mBERT as a knowledge base yields varying performance across languages and pooling predictions across languages improves performance. Conversely, mBERT exhibits a language bias; e.g., when queried in Italian, it tends to predict Italy as the country of origin.	翻訳日:2021-02-03 04:49:14 公開日:2021-02-01
# (参考訳) ニュース記事の抗議数:データセットと半自動化データ収集パイプライン Counting Protests in News Articles: A Dataset and Semi-Automated Data Collection Pipeline ( http://arxiv.org/abs/2102.00917v1 ) ライセンス: CC BY 4.0	Tommy Leung, L. Nathan Perkins	(参考訳) 2017年1月から2021年1月にかけて、米国の何千もの地元ニュースソースが、公民権、移民、銃、環境などに関する42,000以上の抗議を報告した。抗議を毎日報告する地元のジャーナリストの膨大な数を考えると、これらの出来事を構造化されたデータとして抽出して時間的および地理的傾向を理解することで、市民の意思決定が促進されます。しかし、ニュース記事からイベントを抽出するタスクは、ドメイン検出、スロットフィリング、コアファレンス解決の分野で、NLPコミュニティによく知られた課題を提示します。ニュース記事から構造化されたデータを抽出するリソースを改善するために、我々の貢献は3つある。 1) 2017年1月から2021年1月までに米国で報告された42,347件の抗議イベントに対応する、ニュース記事のURL、日付、場所、群衆規模の推定値、および494個の離散的な記述タグからなる手動ラベル付きデータセットを公開する。 2) データセットを構成する144,568件の英語記事の発見、分類、レビューに用いた半自動データ収集パイプラインを説明する。 3) 段落や文などの構文構造に基づいてニュース記事を処理し、報告された抗議イベントの数を数えることの有用性を示す、LSTM低次元分類器をベンチマークする。 Between January 2017 and January 2021, thousands of local news sources in the United States reported on over 42,000 protests about topics such as civil rights, immigration, guns, and the environment. Given the vast number of local journalists that report on protests daily, extracting these events as structured data to understand temporal and geographic trends can empower civic decision-making. However, the task of extracting events from news articles presents well-known challenges to the NLP community in the fields of domain detection, slot filling, and coreference resolution. To help improve the resources available for extracting structured data from news stories, our contribution is three-fold. 
We 1) release a manually labeled dataset of news article URLs, dates, locations, crowd size estimates, and 494 discrete descriptive tags corresponding to 42,347 reported protest events in the United States between January 2017 and January 2021; 2) describe the semi-automated data collection pipeline used to discover, sort, and review the 144,568 English articles that comprise the dataset; and 3) benchmark a long short-term memory (LSTM) low-dimensional classifier that demonstrates the utility of processing news articles based on syntactic structures, such as paragraphs and sentences, to count the number of reported protest events.	翻訳日:2021-02-03 04:33:32 公開日:2021-02-01
# (参考訳) 再帰的KMeansとDijkstraアルゴリズムによるCVRPの解法 Using Recursive KMeans and Dijkstra Algorithm to Solve CVRP ( http://arxiv.org/abs/2102.00567v1 ) ライセンス: CC BY 4.0	Hassan Moussa	(参考訳) 容量制約付き車両ルーティング問題(CVRP)は、今日最も一般的な最適化問題のひとつです。 The capacitated vehicle routing problem (CVRP) is one of the most common optimization problems today.	翻訳日:2021-02-03 03:59:43 公開日:2021-02-01
# (参考訳) 分割関数のサンプリングと複雑性 Sampling and Complexity of Partition Function ( http://arxiv.org/abs/2102.00855v1 ) ライセンス: CC0 1.0	Chuyu Xiong	(参考訳) 数分割問題はよく知られた問題であり、21 Karp のNP完全問題 \cite{karp} の1つである。分割関数は、数範囲が制限された数分割問題と等価なブール関数である。数値分割問題と分割関数の計算複雑性を理解することは極めて重要かつ困難である。このような問題には、新しいツールとメソッド \cite{aaronson} が必要だと推測される。汎用学習マシン \cite{paper5, paper8} に関する最近の研究で、我々は極端に適合するツール、適切なサンプリングセット、パラメータ付きブール関数(試行錯誤方式で使用される)を開発した。これらのツールがパーティション関数に適用できることが分かりました。本稿では,パーティション関数のセットアップ,パーティション関数のプロパティ,使用するツールについて論じる。このアプローチは、分割関数の計算複雑性の低い境界と、数分割問題の計算複雑性の低い境界が問題の大きさに指数関数的であることを証明します。これは次のように意味する: {\bf P} $\ne$ {\bf NP} \cite{cook}。 The number partition problem is a well-known problem, which is one of 21 Karp's NP-complete problems \cite{karp}. Partition function is a boolean function that is equivalent to the number partition problem with number range restricted. To understand the computational complexity of the number partition problem and partition function is quite important and hard. People speculate that we need new tools and methods \cite{aaronson} for such problem. In our recent research on universal learning machine \cite{paper5, paper8}, we developed some tools, namely, fitting extremum, proper sampling set, boolean function with parameters (used in trial-and-error fashion). We found that these tools could be applied to the partition function. In this article, we discuss the set up of the partition function, properties of the partition function, and the tools to be used. This approach leads us to prove that the lower bound of the computational complexity of partition function, as well as the lower bound of the computational complexity of the number partition problem, is exponential to the size of problem. This implies: {\bf P} $\ne$ {\bf NP} \cite{cook}.	翻訳日:2021-02-03 03:55:37 公開日:2021-02-01
# (参考訳) Box Re-Ranking: ドメイン適応ペデストリアン検出のための教師なし偽陽性抑制 Box Re-Ranking: Unsupervised False Positive Suppression for Domain Adaptive Pedestrian Detection ( http://arxiv.org/abs/2102.00595v1 ) ライセンス: CC BY 4.0	Weijie Chen and Yilu Guo and Shicai Yang and Zhaoyang Li and Zhenxin Ma and Binbin Chen and Long Zhao and Di Xie and Shiliang Pu and Yueting Zhuang	(参考訳) 偽陽性は、ドメイン適応型歩行者検出におけるドメインシフトの診断によってもたらされる最も深刻な問題の1つです。しかし、各ボックスを無数のターゲットドメインにラベル付けすることは不可能です。したがって、各対象領域における偽陽性を教師なしの方法で抑制することに注意を向ける。本稿では,オブジェクト検出タスクをポジティブボックスとネガティブボックスのランキングタスクに革新的にモデル化し,偽陽性抑圧問題をエレガントにボックス再ランク問題に変換することにより,手動のアノテーションを使わずに解決できるようにする。ボックスの再ランク付け時に付随する問題は、チェリーピッキングにラベル付きバリデーションデータが利用できないことである。本研究は,正の正の変わらずを検出することを目的として,自己監督評価指標であるボックス数アライメントを提案し,最適化されたモデルがキャパシティの劣化を防ぐ。クロスドメイン歩行者検出データセットを用いて大規模な実験を行い,提案手法の有効性を実証した。さらに、2つの一般教師なしドメイン適応オブジェクト検出ベンチマークへの拡張は、他の最先端技術に対する当社の優位性もサポートする。 False positive is one of the most serious problems brought by agnostic domain shift in domain adaptive pedestrian detection. However, it is impossible to label each box in countless target domains. Therefore, it yields our attention to suppress false positive in each target domain in an unsupervised way. In this paper, we model an object detection task into a ranking task among positive and negative boxes innovatively, and thus transform a false positive suppression problem into a box re-ranking problem elegantly, which makes it feasible to solve without manual annotation. An attached problem during box re-ranking appears that no labeled validation data is available for cherrypicking. Considering we aim to keep the detection of true positive unchanged, we propose box number alignment, a self-supervised evaluation metric, to prevent the optimized model from capacity degeneration. Extensive experiments conducted on cross-domain pedestrian detection datasets have demonstrated the effectiveness of our proposed framework. 
Furthermore, the extension to two general unsupervised domain adaptive object detection benchmarks also supports the superiority of our approach over other state-of-the-art methods.	翻訳日:2021-02-03 02:14:02 公開日:2021-02-01
# (参考訳) ライン描画による顔写真とスケッチの橋渡し Bridging Unpaired Facial Photos And Sketches By Line-drawings ( http://arxiv.org/abs/2102.00635v1 ) ライセンス: CC BY 4.0	Fei Gao, Meimei Shang, Xiang Li, Jingjie Zhu, Lingna Dai	(参考訳) 本論文では,不対データを用いて顔スケッチ合成モデルを学習する新しい手法を提案する。私たちの主なアイデアは、写真ドメイン $\mathcal{X}$ とスケッチドメイン $\mathcal{Y}$ を線引きドメイン $\mathcal{Z}$ を使ってブリッジすることです。特に,画像とスケッチの両方を,ニューラルスタイルの転送手法を用いて線画にマッピングする。すなわち $F: \mathcal{X}/\mathcal{Y} \mapsto \mathcal{Z}$ である。その結果、 \textit{pseudo paired data} $(\mathcal{Z}, \mathcal{Y})$ を得ることができ、マッピング $G:\mathcal{Z} \mapsto \mathcal{Y}$ を教師あり学習方法で学習することができる。推論段階では、顔写真が与えられたら、まずラインドローイングに転送し、次に$G \circ F$でスケッチに転送できます。さらに,異なるタイプのストロークを生成するための新しいストローク損失を提案する。 sRenderと呼ばれる私たちの方法は、人間のアーティストのレンダリングプロセスとよく一致します。実験結果は、sRenderがマルチスタイルのスケッチを生成し、既存の不対画像から画像への変換方法を大幅に上回ることを実証した。 In this paper, we propose a novel method to learn face sketch synthesis models by using unpaired data. Our main idea is bridging the photo domain $\mathcal{X}$ and the sketch domain $\mathcal{Y}$ by using the line-drawing domain $\mathcal{Z}$. Specifically, we map both photos and sketches to line-drawings by using a neural style transfer method, i.e., $F: \mathcal{X}/\mathcal{Y} \mapsto \mathcal{Z}$. Consequently, we obtain \textit{pseudo paired data} $(\mathcal{Z}, \mathcal{Y})$, and can learn the mapping $G:\mathcal{Z} \mapsto \mathcal{Y}$ in a supervised learning manner. In the inference stage, given a facial photo, we can first transfer it to a line-drawing and then to a sketch by $G \circ F$. Additionally, we propose a novel stroke loss for generating different types of strokes. Our method, termed sRender, accords well with human artists' rendering process. Experimental results demonstrate that sRender can generate multi-style sketches, and significantly outperforms existing unpaired image-to-image translation methods.	翻訳日:2021-02-03 02:00:49 公開日:2021-02-01
# (参考訳) 映像からの自己監督等変性シーン合成 Self-Supervised Equivariant Scene Synthesis from Video ( http://arxiv.org/abs/2102.00863v1 ) ライセンス: CC BY 4.0	Cinjon Resnick, Or Litany, Cosmas Heiß, Hugo Larochelle, Joan Bruna, Kyunghyun Cho	(参考訳) 本研究では,背景,キャラクタ,アニメーションに自動的に区切られた映像からシーン表現を学習するための自己教師付きフレームワークを提案する。本手法は,フレーム間の変換に対して等変性を持ち,背景が同じ変換に対して一定であることに着目した。トレーニング後、画像エンコーディングをリアルタイムで操作して、非表示のコンポーネントの組み合わせを作成できます。私たちが知る限り、我々は、解釈可能な背景、キャラクタ、アニメーションの教師なし抽出と合成を行う最初の方法である。我々は,背景付きmnistの移動,2次元ビデオゲームスプライト,ファッションモデリングという3つのデータセットで結果を示す。 We propose a self-supervised framework to learn scene representations from video that are automatically delineated into background, characters, and their animations. Our method capitalizes on moving characters being equivariant with respect to their transformation across frames and the background being constant with respect to that same transformation. After training, we can manipulate image encodings in real time to create unseen combinations of the delineated components. As far as we know, we are the first method to perform unsupervised extraction and synthesis of interpretable background, character, and animation. We demonstrate results on three datasets: Moving MNIST with backgrounds, 2D video game sprites, and Fashion Modeling.	翻訳日:2021-02-03 01:37:04 公開日:2021-02-01
# (参考訳) 3次元ニューロン分割のための連続リカレントニューラルネットワーク Consistent Recurrent Neural Networks for 3D Neuron Segmentation ( http://arxiv.org/abs/2102.01021v1 ) ライセンス: CC BY 4.0	Felix Gonda, Donglai Wei, Hanspeter Pfister	(参考訳) 時空間整合性のある画像中の各物体の2次元マスクを逐次生成するニューロンの3次元再構成のための再帰的ネットワークを提案する。ネットワークは2つの部分で一貫性をモデル化する: (i) 局所性により、非排他的および時間的に隣接したオブジェクト関係と双方向の繰り返しを探索することができる。 (ii) 非ローカルで、スキップ接続で時間領域内の長距離オブジェクト関係を探索することができる。提案するネットワークは、入力画像からオブジェクトマスクのシーケンスまで、エンドツーエンドでトレーニング可能であり、オブジェクト境界に依存する手法と比較して、その出力は後処理を必要としない。本手法は, SNEMI3Dチャレンジにおいて, ニューロンセグメンテーションの3つのベンチマークを用いて評価し, 最新の性能を達成した。 We present a recurrent network for the 3D reconstruction of neurons that sequentially generates binary masks for every object in an image with spatio-temporal consistency. Our network models consistency in two parts: (i) local, which allows exploring non-occluding and temporally-adjacent object relationships with bi-directional recurrence. (ii) non-local, which allows exploring long-range object relationships in the temporal domain with skip connections. Our proposed network is end-to-end trainable from an input image to a sequence of object masks, and, compared to methods relying on object boundaries, its output does not require post-processing. We evaluate our method on three benchmarks for neuron segmentation and achieved state-of-the-art performance on the SNEMI3D challenge.	翻訳日:2021-02-03 01:23:32 公開日:2021-02-01
# (参考訳) seq2seq学習を用いたテキスト対ハッシュタグ生成 Text-to-hashtag Generation using Seq2seq Learning ( http://arxiv.org/abs/2102.00904v1 ) ライセンス: CC BY 4.0	Augusto Camargo, Wesley Carvalho, Felipe Peressim	(参考訳) 本論文では、BiLSTMとBERTに基づくモデルがブラジルのポルトガル語でハッシュタグを生成し、Eコマースのウェブサイトで使用できるかどうかを検討した。我々はEコマースレビューのコーパスと商品のタイトルを入力として処理し、ハッシュタグを出力として生成した。 NIST,BLEU,METEOR,クラウドソーシングスコアの4つの定量値を用いて評価を行った。 Word Cloudは定性メトリックとして使用された。自動評価指標(NIST、BLEU、METEOR)はいずれも低い結果だったが、クラウドソーシングによる評価は高いスコアを示した。我々は、ニューラルネットワークによって生成されたテキストが、Eコマースのウェブサイトで製品のハッシュタグとして使われることを非常に約束していると結論付けた。この作業のコードはhttps://github.com/augustocamargo/text-to-hashtagで入手できる。 In this paper, we study whether models based on BiLSTM and BERT can generate hashtags in Brazilian Portuguese that can be used on Ecommerce websites. We processed a corpus of Ecommerce reviews and product titles as inputs and generated hashtags as outputs. We evaluate the results using four quantitative metrics: NIST, BLEU, METEOR and a crowdsourced score. A word cloud was used as a qualitative metric. Although all the automatically computed metrics (NIST, BLEU and METEOR) showed poor results, the crowdsourced scores were very good. We conclude that the texts generated by the neural networks are very promising for use as product hashtags on Ecommerce websites [1]. The code for this work is available at https://github.com/augustocamargo/text-to-hashtag	翻訳日:2021-02-03 00:49:21 公開日:2021-02-01
# (参考訳) オフライン強化学習のための近世界ベンチマーク Near Real-World Benchmarks for Offline Reinforcement Learning ( http://arxiv.org/abs/2102.00714v1 ) ライセンス: CC BY 4.0	Rongjun Qin, Songyi Gao, Xingyuan Zhang, Zhen Xu, Shengkai Huang, Zewen Li, Weinan Zhang, Yang Yu	(参考訳) オフライン強化学習(rl)は、トレーニング中の環境との余分なインタラクションなしに、収集したデータのバッチから最適なポリシーを学ぶことを目的としている。オフラインRLは環境における有害な実行を緩和しようとするため、RLアプリケーションの範囲を大きく広げることになる。しかし、現在のオフラインRLベンチマークは一般的に大きな現実的なギャップがある。それらは、非常に探索的なポリシーによって収集された大きなデータセットを含み、訓練されたポリシーは、環境内で直接評価されます。一方、現実の状況では、高度に探索的なポリシーを実行することは、システムの安全性を確保するために禁止され、データは一般的に非常に制限され、トレーニングされたポリシーは、デプロイ前に適切に検証されるべきである。本稿では,近世界のベンチマークであるNewRLについて述べる。 NewRLには、ポリシー検証のために制御されたサイズと追加のテストデータセットを備えたさまざまなドメインのデータセットが含まれています。既存のオフラインRLアルゴリズムをNewRL上で評価する。実験では、データセット報酬の代わりに、ポリシーのパフォーマンスも行動ポリシーの決定論的なバージョンと比較されるべきであると主張します。決定論的行動ポリシーは実際のシナリオのベースラインであるため、データセットはパフォーマンスを低下させる可能性のあるアクション摂動で収集されることが多い。実験結果から,テスト済みのオフラインRLアルゴリズムは,上記の多くのデータセットに対する決定論的ポリシと競合するだけであり,オフラインポリシ評価がほとんど役に立たないことが示された。 NewRL スーツは http://polixir.ai/research/newrl で見ることができる。この研究が研究に光を当て、現実世界のシステムにRLをデプロイする際にもっと注目されることを願っています。 Offline reinforcement learning (RL) aims at learning an optimal policy from a batch of collected data, without extra interactions with the environment during training. Offline RL attempts to alleviate the hazardous executions in environments, thus it will greatly broaden the scope of RL applications. However, current offline RL benchmarks commonly have a large reality gap. They involve large datasets collected by highly exploratory policies, and a trained policy is directly evaluated in the environment. Meanwhile, in real-world situations, running a highly exploratory policy is prohibited to ensure system safety, the data is commonly very limited, and a trained policy should be well validated before deployment. In this paper, we present a suite of near real-world benchmarks, NewRL. NewRL contains datasets from various domains with controlled sizes and extra test datasets for the purpose of policy validation. 
We then evaluate existing offline RL algorithms on NewRL. In the experiments, we argue that the performance of a policy should also be compared with the deterministic version of the behavior policy, instead of the dataset reward, because the deterministic behavior policy is the baseline in real scenarios, while the dataset is often collected with action perturbations that can degrade performance. The empirical results demonstrate that the tested offline RL algorithms are only competitive with the above deterministic policy on many datasets, and that offline policy evaluation hardly helps. The NewRL suite can be found at http://polixir.ai/research/newrl. We hope this work will shed some light on research and draw more attention to the deployment of RL in real-world systems.	翻訳日:2021-02-03 00:03:49 公開日:2021-02-01
# (参考訳) マルチエージェントDeep Reinforcement Learningを用いたmmWave MU-MISOシステムのハイブリッドビームフォーミング Hybrid Beamforming for mmWave MU-MISO Systems Exploiting Multi-agent Deep Reinforcement Learning ( http://arxiv.org/abs/2102.00735v1 ) ライセンス: CC BY 4.0	Qisheng Wang, Xiao Li, Shi Jin, and Yijiain Chen	(参考訳) 本書では、ミリ波(mmWave)マルチユーザ(MU)マルチインプットシングル出力(MISO)システムのための深層補強学習(DRL)に基づくハイブリッドビームフォーミングについて検討する。 DRLの探索効率問題を解決するためにマルチエージェントDRL法を提案する。提案手法では,優先されたリプレイバッファとより情報的な報酬を適用し,コンバージェンスを高速化する。シミュレーションの結果,提案アーキテクチャはベンチマークよりもスペクトル効率が高く,時間消費の少ないため,実用化に適していることがわかった。 In this letter, we investigate the hybrid beamforming based on deep reinforcement learning (DRL) for millimeter Wave (mmWave) multi-user (MU) multiple-input-single-output (MISO) system. A multi-agent DRL method is proposed to solve the exploration efficiency problem in DRL. In the proposed method, prioritized replay buffer and more informative reward are applied to accelerate the convergence. Simulation results show that the proposed architecture achieves higher spectral efficiency and less time consumption than the benchmarks, thus is more suitable for practical applications.	翻訳日:2021-02-02 22:12:05 公開日:2021-02-01
# (参考訳) 知識蒸留のためのソフトラベルの再考:バイアス分散トレードオフの視点 Rethinking Soft Labels for Knowledge Distillation: A Bias-Variance Tradeoff Perspective ( http://arxiv.org/abs/2102.00650v1 ) ライセンス: CC BY 4.0	Helong Zhou, Liangchen Song, Jiajie Chen, Ye Zhou, Guoli Wang, Junsong Yuan, Qian Zhang	(参考訳) 知識蒸留は、よく訓練されたネットワークまたはそれらのアンサンブルを利用して、学生ネットワークのトレーニングを指導するための効果的なアプローチである。教師ネットワークからの出力は、新しいネットワークのトレーニングを監督するためのソフトラベルとして使用される。最近の研究では、ソフトラベルの興味をそそる性質が示され、ラベルをソフトにすることは学生ネットワークにとって良い正規化となる。統計的学習の観点から、正規化はばらつきを減らすことを目指していますが、ソフトラベルによるトレーニングではバイアスとばらつきの変化が明確ではありません。本稿では,ソフトラベル蒸留によるバイアス分散トレードオフについて検討する。具体的には、トレーニング中のバイアス分散トレードオフがサンプルごとに異なることを観察する。さらに、同じ蒸留温度設定下では、蒸留性能がいくつかの特定のサンプルの数に負の関連していることを観察します。しかし, 正則化試料を完全にろ過しても蒸留性能は低下する。私たちの発見は、ネットワークがサンプルワイズバイアス分散トレードオフを適応的に処理するのに役立つ、新しい重み付きソフトラベルを提案しました。本手法の有効性を検証するための標準評価ベンチマーク実験を行った。コードは \url{https://github.com/bellymonster/Weighted-Soft-Label-Distillation} で入手できます。 Knowledge distillation is an effective approach to leverage a well-trained network or an ensemble of them, named as the teacher, to guide the training of a student network. The outputs from the teacher network are used as soft labels for supervising the training of a new network. Recent studies \citep{muller2019does,yuan2020revisiting} revealed an intriguing property of the soft labels that making labels soft serves as a good regularization to the student network. From the perspective of statistical learning, regularization aims to reduce the variance, however how bias and variance change is not clear for training with soft labels. In this paper, we investigate the bias-variance tradeoff brought by distillation with soft labels. Specifically, we observe that during training the bias-variance tradeoff varies sample-wisely. 
Further, under the same distillation temperature setting, we observe that the distillation performance is negatively associated with the number of some specific samples, which we name regularization samples because they increase bias and decrease variance. Nevertheless, we empirically find that completely filtering out regularization samples also deteriorates distillation performance. Our discoveries inspired us to propose novel weighted soft labels to help the network adaptively handle the sample-wise bias-variance tradeoff. Experiments on standard evaluation benchmarks validate the effectiveness of our method. Our code is available at https://github.com/bellymonster/Weighted-Soft-Label-Distillation.	翻訳日:2021-02-02 20:37:44 公開日:2021-02-01
# (参考訳) 学習水型脱感作表現による水中画像強調 Underwater Image Enhancement via Learning Water Type Desensitized Representations ( http://arxiv.org/abs/2102.00676v1 ) ライセンス: CC BY 4.0	Zhenqi Fu, Xiaopeng Lin, Wu Wang, Yue Huang, and Xinghao Ding	(参考訳) 水中での応用では、光吸収と散乱の影響は画像劣化をもたらす。さらに、複雑で変更可能なイメージング環境は、水タイプの多様性に対処するための普遍的な強化ソリューションを提供することを困難にします。本稿では,これらの課題に対処するため,SCNetと呼ばれる新しい水中画像強調(UIE)フレームワークを提案する。 SCNetは、水型脱感作機能を学ぶ重要なアイデアで、空間とチャネルの両方の寸法にわたる正規化スキームに基づいています。劣化の多様性は画素間の強い相関に主に根ざしており、ミニバッチにおける各インスタンスの空間的次元にわたるアクティベーションの非相関化にホワイトニングを適用する。また,チャネル間のアクティベーションの最初の2つのモーメントを標準化し再注入することで,チャネルワイズ相関を解消する。空間的およびチャネル次元の正規化スキームは、U-Netの各スケールで実行され、マルチスケール表現を得る。このような潜時符号化により、デコーダはクリーン信号を容易に再構成でき、水による歪みタイプの影響を受けない。 2つの実世界のUIEデータセットによる実験結果から,提案手法は多様な水型で画像の強化に成功し,視覚的品質改善の競争性能が向上することが示された。 For underwater applications, the effects of light absorption and scattering result in image degradation. Moreover, the complex and changeable imaging environment makes it difficult to provide a universal enhancement solution to cope with the diversity of water types. In this letter, we present a novel underwater image enhancement (UIE) framework termed SCNet to address the above issues. SCNet is based on normalization schemes across both spatial and channel dimensions with the key idea of learning water type desensitized features. Considering that the diversity of degradation is mainly rooted in the strong correlation among pixels, we apply whitening to decorrelate activations across spatial dimensions for each instance in a mini-batch. We also eliminate channel-wise correlation by standardizing and re-injecting the first two moments of the activations across channels. The normalization schemes of spatial and channel dimensions are performed at each scale of the U-Net to obtain multi-scale representations. With such latent encodings, the decoder can easily reconstruct the clean signal, unaffected by the distortion types caused by the water. 
Experimental results on two real-world UIE datasets show that the proposed approach can successfully enhance images with diverse water types, and achieves competitive performance in visual quality improvement.	翻訳日:2021-02-02 20:21:38 公開日:2021-02-01
# (参考訳) 天空画像からの深層学習照度予測モデルのベンチマーク -詳細な分析- Benchmarking of Deep Learning Irradiance Forecasting Models from Sky Images -- an in-depth Analysis ( http://arxiv.org/abs/2102.00721v1 ) ライセンス: CC BY 4.0	Quentin Paletta, Guillaume Arbod and Joan Lasenby	(参考訳) スマートグリッド、発電所の運用、ハイブリッドシステム管理、エネルギー取引など多くの産業応用は、ソーラーパネルからの断続的なエネルギー生産に対応するため、短期的な太陽予報の改善の恩恵を受ける可能性がある。しかし、現在の雲を空からモデル化するアプローチでは、雲の空間的配置、時間的ダイナミクス、太陽放射との物理的相互作用に関する精度が不足している。大規模データセットの増加によって、これらの制限に対処するためにデータ駆動メソッドが開発され、有望な結果が得られた。本研究では、半球空画像と外生変数のシーケンスから太陽光照射を予測するために訓練された4つのDeep Learningアーキテクチャを比較した。各モデルの相対的なパフォーマンスを評価するために、スマート永続化モデルに基づく予測スキルメトリックと、ランプと時間の歪みメトリックを使用しました。その結果、天空画像列の時空間的側面のエンコーディングは、試験年度の予測スキルが20.4%に達したことにより、予測を大幅に改善した。しかし、実験データに基づいて、Deep Learningモデルは共通の設定で「非常にスマートな永続化モデル」として振る舞う傾向にあり、持続モデルと時間的に一致し、最もペナリングなエラーを緩和する傾向にあると結論付けている。したがって、スカイカメラで捉えられたにもかかわらず、モデルはしばしば太陽を遮る雲のような大きな照度変化を引き起こす基本的な事象を見逃す。反応性から予測性まで、このアプローチの放射能予測への移行に貢献できることを願っています。 A number of industrial applications, such as smart grids, power plant operation, hybrid system management or energy trading, could benefit from improved short-term solar forecasting, addressing the intermittent energy production from solar panels. However, current approaches to modelling the cloud cover dynamics from sky images still lack precision regarding the spatial configuration of clouds, their temporal dynamics and physical interactions with solar radiation. Benefiting from a growing number of large datasets, data driven methods are being developed to address these limitations with promising results. In this study, we compare four commonly used Deep Learning architectures trained to forecast solar irradiance from sequences of hemispherical sky images and exogenous variables. To assess the relative performance of each model, we used the Forecast Skill metric based on the smart persistence model, as well as ramp and time distortion metrics. 
The results show that encoding spatiotemporal aspects of the sequence of sky images greatly improved the predictions with 10 min ahead Forecast Skill reaching 20.4% on the test year. However, based on the experimental data, we conclude that, with a common setup, Deep Learning models tend to behave just as a 'very smart persistence model', temporally aligned with the persistence model while mitigating its most penalising errors. Thus, despite being captured by the sky cameras, models often miss fundamental events causing large irradiance changes such as clouds obscuring the sun. We hope that our work will contribute to a shift of this approach to irradiance forecasting, from reactive to anticipatory.	翻訳日:2021-02-02 20:12:08 公開日:2021-02-01
# (参考訳) 分類マージンによる騒音ラベルのコンバット学習 Learning to Combat Noisy Labels via Classification Margins ( http://arxiv.org/abs/2102.00751v1 ) ライセンス: CC BY 4.0	Jason Z. Lin and Jelena Bradic	(参考訳) ノイズの多いラベルでトレーニングされたディープニューラルネットワークは、ノイズの多いものからクリーンなインスタンスを識別する能力が急速に失われることが知られている。早期学習フェーズが終了した後、ネットワークは騒々しいインスタンスを記憶し、一般化パフォーマンスの低下につながります。この問題を解決するため、マーベル(MARgins Via Early Learning)を提案し、分類のマージンの画期的な歴史を維持しながら、あらゆるインスタンスの「適合性」を追跡します。連続する負のマージンに基づいて、重みをゼロにすることで、疑わしいノイズを排除した。さらに、MARVEL+のアップウェイトは、ネットワークが分類境界のよりニュアンスな表現を学習できるようにする。合成ラベルノイズを用いたベンチマーク実験の結果,MARVELは非対称雑音下でのマージンが著しく大きいため,他のベースラインよりも高い性能を示した。 A deep neural network trained on noisy labels is known to quickly lose its power to discriminate clean instances from noisy ones. After the early learning phase has ended, the network memorizes the noisy instances, which leads to a degradation in generalization performance. To resolve this issue, we propose MARVEL (MARgins Via Early Learning), where we track the goodness of "fit" for every instance by maintaining an epoch-history of its classification margins. Based on consecutive negative margins, we discard suspected noisy instances by zeroing out their weights. In addition, MARVEL+ upweights arduous instances enabling the network to learn a more nuanced representation of the classification boundary. Experimental results on benchmark datasets with synthetic label noise show that MARVEL outperforms other baselines consistently across different noise levels, with a significantly larger margin under asymmetric noise.	翻訳日:2021-02-02 20:08:58 公開日:2021-02-01
# (参考訳) Zen-NAS: 高性能画像認識のためのゼロショットNAS Zen-NAS: A Zero-Shot NAS for High-Performance Deep Image Recognition ( http://arxiv.org/abs/2102.01063v1 ) ライセンス: CC BY 4.0	Ming Lin, Pichao Wang, Zhenhong Sun, Hesen Chen, Xiuyu Sun, Qi Qian, Hao Li, Rong Jin	(参考訳) Neural Architecture Search (NAS) の重要なコンポーネントは、クエリされたアーキテクチャの精度を主張する精度予測器である。高品質な精度予測器を構築するために、従来のNASアルゴリズムは大量のアーキテクチャや大きなスーパーネットを訓練する。このステップは数百から数千のGPU日を消費し、総検索コストの大部分を占める。そこで本研究では,精度予測器をZen-scoreという新しいモデル複雑性指標に置き換えることを提案する。モデルの精度を予測する代わりに、Zen-scoreはパラメータを訓練せずにネットワークのモデルの複雑さを直接主張します。これは、ネットワークのモデル複雑さがターゲットデータセットの精度と正に相関していることを示すディープラーニング理論の最近の進歩にインスパイアされている。 Zen-score の計算はランダムなガウス入力を用いたランダム初期化ネットワークによる数回の前方推論しか行わない。これは、Vanilla Convolutional Neural Networks(VCN-networks)または互換の亜種に適用でき、現実世界のアプリケーションで人気のあるネットワークの大部分をカバーする。 Zen-scoreとEvolutionary Algorithmを組み合わせると、Zen-NASという新しいZero-Shot NASアルゴリズムが得られる。 CIFAR10/CIFAR100とImageNetについて広範な実験を行った。要約すると、Zen-NASは半GPU日(12GPU時間)未満で高性能アーキテクチャを設計することができる。結果、ZenNetsという名前のネットワークは、ImageNet上で最大83.0%のtop-1精度を達成する。同じまたはより良い精度のEfficientNets-B3/B5と比較して、ZenNetsはNVIDIA V100で最大5.6倍、NVIDIA T4で11倍、Google Pixel2で2.6倍高速であり、FLOPsは50%少ない。ソースコードと事前トレーニング済みモデルはhttps://github.com/idstcv/ZenNASでリリースしています。 A key component in Neural Architecture Search (NAS) is an accuracy predictor which asserts the accuracy of a queried architecture. To build a high quality accuracy predictor, conventional NAS algorithms rely on training a mass of architectures or a big supernet. This step often consumes hundreds to thousands of GPU days, dominating the total search cost. To address this issue, we propose to replace the accuracy predictor with a novel model-complexity index named Zen-score. Instead of predicting model accuracy, Zen-score directly asserts the model complexity of a network without training its parameters. This is inspired by recent advances in deep learning theories which show that model complexity of a network positively correlates to its accuracy on the target dataset. 
The computation of Zen-score only takes a few forward inferences through a randomly initialized network using random Gaussian input. It is applicable to any Vanilla Convolutional Neural Networks (VCN-networks) or compatible variants, covering a majority of networks popular in real-world applications. When combining Zen-score with an Evolutionary Algorithm, we obtain a novel Zero-Shot NAS algorithm named Zen-NAS. We conduct extensive experiments on CIFAR10/CIFAR100 and ImageNet. In summary, Zen-NAS is able to design high performance architectures in less than half a GPU day (12 GPU hours). The resultant networks, named ZenNets, achieve up to 83.0% top-1 accuracy on ImageNet. Compared to EfficientNets-B3/B5 of the same or better accuracies, ZenNets are up to 5.6 times faster on NVIDIA V100, 11 times faster on NVIDIA T4, 2.6 times faster on Google Pixel2, and use 50% fewer FLOPs. Our source code and pre-trained models are released on https://github.com/idstcv/ZenNAS.	翻訳日:2021-02-02 19:52:26 公開日:2021-02-01
# (参考訳) 行列因子化のリーマン的展望 Riemannian Perspective on Matrix Factorization ( http://arxiv.org/abs/2102.00937v1 ) ライセンス: CC BY 4.0	Kwangjun Ahn, Felipe Suarez	(参考訳) リーマン幾何学による行列完備に対する非凸行列分解法の研究を行う。グラスマン多様体上の最適化定式化に基づき、部分空間間の主角の概念に基づいて風景を特徴づける。完全に観察された場合、我々は、コストが測地的に凸である領域と、すべての臨界点が厳密なサドルである領域が存在することを示した。本研究では, 部分観察例を経験的に検討した。 We study the non-convex matrix factorization approach to matrix completion via Riemannian geometry. Based on an optimization formulation over a Grassmannian manifold, we characterize the landscape based on the notion of principal angles between subspaces. For the fully observed case, our results show that there is a region in which the cost is geodesically convex, and outside of which all critical points are strictly saddle. We empirically study the partially observed case based on our findings.	翻訳日:2021-02-02 19:18:30 公開日:2021-02-01
# (参考訳) ConvNets for Counting: Object Detection of Transient Phenomena in Steelpan Drums ConvNets for Counting: Object Detection of Transient Phenomena in Steelpan Drums ( http://arxiv.org/abs/2102.00632v1 ) ライセンス: CC BY 4.0	Scott H. Hawley and Andrew C. Morrison	(参考訳) 電子スペックルパターン干渉計(ESPI)で照らされたカリブ海のスチールパンドラムの過渡振動の高速ビデオ記録のフレームに見る楕円反ノード領域の干渉縞を数えるために、畳み込みニューラルネットワークで構築された物体検出器を訓練する。本モデルで提案するアノテーション「SPNet」は,交感神経振動モードの発達を追跡することで,ドラムの時間依存行動の理解に寄与することを目的としている。このシステムは、Zooniverse Steelpan vibrations Projectから得られたクラウドソーシングされた人間の注釈付き画像のデータセットで訓練される。また,人間のアノテート画像が比較的少ないため,視覚特性が実際の画像と一致した合成画像のコーパスを生成的逆ネットワークを用いて学習し,スタイル転送を行う。何千ものラベルのないビデオフレームの注釈を予測するためにモデルを適用することで、同じドラムストライクのオーディオ記録と一致する特徴を追跡し、振動を測定することができる。 1つの驚くべき結果として、機械注釈付きビデオフレームは、オーディオ録音におけるそのような遷移に大きく先行する第1と第2の高調波の遷移を明らかにする。本稿では,主にモデルの開発について述べるので,さらなる応用が期待できる。 We train an object detector built from convolutional neural networks to count interference fringes in elliptical antinode regions visible in frames of high-speed video recordings of transient oscillations in Caribbean steelpan drums illuminated by electronic speckle pattern interferometry (ESPI). The annotations provided by our model, "SPNet" are intended to contribute to the understanding of time-dependent behavior in such drums by tracking the development of sympathetic vibration modes. The system is trained on a dataset of crowdsourced human-annotated images obtained from the Zooniverse Steelpan Vibrations Project. Due to the relatively small number of human-annotated images, we also train on a large corpus of synthetic images whose visual properties have been matched to those of the real images by using a Generative Adversarial Network to perform style transfer. Applying the model to predict annotations of thousands of unlabeled video frames, we can track features and measure oscillations consistent with audio recordings of the same drum strikes. 
One surprising result is that the machine-annotated video frames reveal transitions between the first and second harmonics of drum notes that significantly precede such transitions present in the audio recordings. As this paper primarily concerns the development of the model, deeper physical insights await its further application.	翻訳日:2021-02-02 17:46:33 公開日:2021-02-01
# (参考訳) Densely Connected Recurrent Residual (Dense R2UNet) Convolutional Neural Network for Segmentation of Lung CT Images Densely Connected Recurrent Residual (Dense R2UNet) Convolutional Neural Network for Segmentation of Lung CT Images ( http://arxiv.org/abs/2102.00663v1 ) ライセンス: CC BY 4.0	Kaushik Dutta	(参考訳) ディープラーニングネットワークは、セマンティックセグメンテーションにおいて最先端の性能を提供するものとして確立されている。これらの技術は特に医療分野の検出、セグメンテーション、分類に広く適用されている。 U-Netベースのアーキテクチャは、このアプリケーションで特に人気がある。本稿では、U-Netモデルアーキテクチャに基づくRecurrent CNN, Residual Network, Dense Convolutional Networkの合成であるDense Recurrent Residual Convolutional Neural Network(Dense R2U CNN)について述べる。残留ユニットはより深いネットワークを訓練するのを助け、密な繰り返し層はセグメンテーションに必要な機能伝搬を強化する。ベンチマークLung Lesionデータセットでテストされた提案モデルは、同等のモデルよりもセグメンテーションタスクのパフォーマンスが向上した。 Deep Learning networks have established themselves as providing state-of-the-art performance for semantic segmentation. These techniques are widely applied specifically to medical detection, segmentation and classification. U-Net based architectures have become particularly popular for this application. In this paper we present the Dense Recurrent Residual Convolutional Neural Network (Dense R2U CNN), which is a synthesis of Recurrent CNN, Residual Network and Dense Convolutional Network based on the U-Net model architecture. The residual unit helps train deeper networks, while the dense recurrent layers enhance the feature propagation needed for segmentation. The proposed model, tested on the benchmark Lung Lesion dataset, showed better performance on segmentation tasks than its equivalent models.	翻訳日:2021-02-02 17:21:11 公開日:2021-02-01
# (参考訳) 高速ラジアルマルチコイル2次元シネmr画像再構成のためのエンド・ツー・エンド訓練型反復ネットワークアーキテクチャ An End-To-End-Trainable Iterative Network Architecture for Accelerated Radial Multi-Coil 2D Cine MR Image Reconstruction ( http://arxiv.org/abs/2102.00783v1 ) ライセンス: CC BY 4.0	Andreas Kofler, Markus Haltmeier, Tobias Schaeffter and Christoph Kolbitsch	(参考訳) 目的: 学習反復スキームに類似した反復畳み込みニューラルネットワーク (CNN) は, 画像再構成問題に対して, 様々な画像モダリティをまたがって常に最先端の結果をもたらすことが示されている。しかし、これらの手法はアーキテクチャのフォワードモデルを含むため、比較的小さな再構成問題や計算コストの低い演算子の問題に適用性に制限されることが多い。その結果, 動的非デカルトマルチコイル再構成問題には適用されていない。方法: 本研究では,複数の受信コイルを有する加速型2次元ラジアルシネMRIの画像再構成のためのCNN-Architectureを提案する。このネットワークは、計算で軽量なCNNコンポーネントと、効率的なトレーニング戦略を使用してエンドツーエンドで共同トレーニングできるその後の共役グラデーション(CG)方法に基づいています。提案した訓練戦略を検討し,学習と非学習の正規化手法を用いて,他のよく知られた再建手法と比較した。結果: 提案手法は非学習正規化に基づく他の手法よりも優れていた。さらに、3D U-Netを用いたCNNベースの手法と適応辞書学習を用いた手法とを類似または良好に行う。また,反復のみを用いてネットワークをトレーニングしても,テスト時間内にネットワークの長さを増加させ,結果をさらに改善できることを実証的に実証する。結論: エンドツーエンドのトレーニングは、再構成ネットワークのトレーニング可能なパラメータの数を大幅に削減し、安定化します。さらに、テスト時にネットワークの長さを変更することができるため、CNNブロックの複雑さと各CGブロックの反復数との間に妥協を見つける必要性は無関係になります。 Purpose: Iterative Convolutional Neural Networks (CNNs) which resemble unrolled learned iterative schemes have been shown to consistently deliver state-of-the-art results for image reconstruction problems across different imaging modalities. However, because these methods include the forward model in the architecture, their applicability is often restricted to either relatively small reconstruction problems or to problems with operators which are computationally cheap to compute. As a consequence, they have so far not been applied to dynamic non-Cartesian multi-coil reconstruction problems. Methods: In this work, we propose a CNN-architecture for image reconstruction of accelerated 2D radial cine MRI with multiple receiver coils. The network is based on a computationally light CNN-component and a subsequent conjugate gradient (CG) method which can be jointly trained end-to-end using an efficient training strategy. 
We investigate the proposed training strategy and compare our method to other well-known reconstruction techniques with learned and non-learned regularization methods. Results: Our proposed method outperforms all other methods based on non-learned regularization. Further, it performs similarly to or better than a CNN-based method employing a 3D U-Net and a method using adaptive dictionary learning. In addition, we empirically demonstrate that even by training the network with only iteration, it is possible to increase the length of the network at test time and further improve the results. Conclusions: End-to-end training makes it possible to greatly reduce the number of trainable parameters of the reconstruction network and to stabilize it. Further, because it is possible to change the length of the network at test time, the need to find a compromise between the complexity of the CNN-block and the number of iterations in each CG-block becomes irrelevant.	翻訳日:2021-02-02 17:14:59 公開日:2021-02-01
# (参考訳) 重症デング患者の肺超音波ビデオにおけるB線の自動検出 Automatic Detection of B-lines in Lung Ultrasound Videos From Severe Dengue Patients ( http://arxiv.org/abs/2102.01059v1 ) ライセンス: CC BY 4.0	Hamideh Kerdegari, Phung Tran Huy Nhat, Angela McBride, VITAL Consortium, Reza Razavi, Nguyen Van Hao, Louise Thwaites, Sophie Yacoub, Alberto Gomez	(参考訳) 肺超音波(LUS)イメージングは、様々な疾患による肺への液漏れによるB線アーチファクトの存在を含む肺の異常を評価するために用いられる。しかし、これらのアーティファクトの手動検出は困難です。本論文では,弱ラベルを用いた深層ニューラルネットワークを用いて,LUS動画中のB線を自動的に検出・局在化するための新しい手法を提案する。そのために、畳み込みニューラルネットワーク(CNN)と、長期の短期メモリ(LSTM)ネットワークと時間的注意メカニズムを組み合わせています。 4つの異なるモデルが60人の患者のデータを用いて比較される。その結果,F1スコア0.81で1秒間クリップがB線を含むか否かを判断し,87.5%の精度でB線で代表フレームを抽出できることがわかった。 Lung ultrasound (LUS) imaging is used to assess lung abnormalities, including the presence of B-line artefacts due to fluid leakage into the lungs caused by a variety of diseases. However, manual detection of these artefacts is challenging. In this paper, we propose a novel methodology to automatically detect and localize B-lines in LUS videos using deep neural networks trained with weak labels. To this end, we combine a convolutional neural network (CNN) with a long short-term memory (LSTM) network and a temporal attention mechanism. Four different models are compared using data from 60 patients. Results show that our best model can determine whether one-second clips contain B-lines or not with an F1 score of 0.81, and extracts a representative frame with B-lines with an accuracy of 87.5%.	翻訳日:2021-02-02 17:13:41 公開日:2021-02-01
# GTAE:言語制約付きテキストスタイル転送のためのグラフトランスフォーマーベースのオートエンコーダ GTAE: Graph-Transformer based Auto-Encoders for Linguistic-Constrained Text Style Transfer ( http://arxiv.org/abs/2102.00769v1 ) ライセンス: Link先を確認	Yukai Shi, Sen Zhang, Chenxing Zhou, Xiaodan Liang, Xiaojun Yang, Liang Lin	(参考訳) 非並列テキストスタイル転送は近年研究の関心を集めている。エンコーダデコーダフレームワークに基づいてスタイルを転送することに成功したにもかかわらず、現在のアプローチは、主に大きな制約のないモデル空間または潜在的な埋め込みスペース上の単純すぎる仮定のために、元の文の内容とロジックを保存する能力がまだ欠けています。言語自体が特定の文法を持つ人間のインテリジェントな産物であり、その性質によってルールベースのモデル空間が制限されているため、この問題を緩和するためには、深いニューラルネットワークのモデル容量を人間の言語規則から本質的なモデル制約と照合する必要がある。そこで本稿では,グラフ変換器を用いたオートエンコーダ(GTAE)という手法を提案する。文を言語グラフとしてモデル化し,特徴抽出とスタイル転送をグラフレベルで行うことで,原文の内容と言語構造を最大に保持する。 3つの非並列テキストスタイルの転送タスクの定量的実験結果から,本モデルはコンテンツ保存における最先端の手法よりも優れており,転送精度と文自然性に匹敵する性能が得られた。 Non-parallel text style transfer has attracted increasing research interests in recent years. Despite successes in transferring the style based on the encoder-decoder framework, current approaches still lack the ability to preserve the content and even logic of original sentences, mainly due to the large unconstrained model space or too simplified assumptions on latent embedding space. Since language itself is an intelligent product of humans with certain grammars and has a limited rule-based model space by its nature, relieving this problem requires reconciling the model capacity of deep neural networks with the intrinsic model constraints from human linguistic rules. To this end, we propose a method called Graph Transformer based Auto Encoder (GTAE), which models a sentence as a linguistic graph and performs feature extraction and style transfer at the graph level, to maximally retain the content and the linguistic structure of original sentences. Quantitative experiment results on three non-parallel text style transfer tasks show that our model outperforms state-of-the-art methods in content preservation, while achieving comparable performance on transfer accuracy and sentence naturalness.	翻訳日:2021-02-02 17:04:34 公開日:2021-02-01
# 半教師付き学習による中国語の多音字曖昧性解消 Polyphone Disambiguition in Mandarin Chinese with Semi-Supervised Learning ( http://arxiv.org/abs/2102.00621v1 ) ライセンス: Link先を確認	Yi Shi and Congyi Wang and Yu Chen and Bin Wang	(参考訳) 漢字の大部分は単音であり、発音は独特であり、チェックテーブルで簡単に発音することができる。それらに対して、ポリフォニック文字は複数の発音を持つ。中国語話者に関連する言語計算タスクを実行するには、その文脈に応じて、各ポリフォンの正しい発音を特定する必要があります。この処理はPolyphone Disambiguationと呼ばれ、中国のテキスト音声(TTS)システムのGrapheme-to-phoneme(G2P)変換ステップにおける重要な手順である。この問題は知識ベースのアプローチと学習ベースのアプローチの両方でよく研究されているが、公開データセットの欠如や、ポリフォンに関する複雑な言語現象のため、依然として難しい。本稿では,無ラベルテキストデータを利用する可能性のある中国語ポリホン不曖昧化のための半教師付き学習(ssl)フレームワークを提案する。エントロピー-thresholding やlexicon-based labeling など,様々なプロキシラベリング戦略の効果を検討する。アーキテクチャに関しては、Electraの事前トレーニングされたモデルとConvolution BLSTMレイヤーを組み合わせて、タスクを微調整します。定性的および定量的実験により,マンダリン中国語多音不明瞭度における最先端性能が得られた。さらに,ポリホンの曖昧化タスクに特化した新しいデータセットを公開し,さらなる研究を促進する。 The majority of Chinese characters are monophonic, i.e., their pronunciations are unique and thus can be induced easily using a check table. As for their counterparts, polyphonic characters have more than one pronunciation. To perform linguistic computation tasks related to spoken Mandarin Chinese, the correct pronunciation for each polyphone must be identified among several candidates according to its context. This process is called Polyphone Disambiguation, a key procedure in the Grapheme-to-phoneme (G2P) conversion step of a Chinese text-to-speech (TTS) system. The problem is well explored with both knowledge-based and learning-based approaches, yet it remains challenging due to the lack of publicly available datasets and the complex linguistic phenomena surrounding polyphones. In this paper, we propose a novel semi-supervised learning (SSL) framework for Mandarin Chinese polyphone disambiguation that can potentially leverage unlimited unlabeled text data. We explore the effect of various proxy labeling strategies including entropy-thresholding and lexicon-based labeling. 
As for the architecture, a pre-trained Electra model is combined with Convolution BLSTM layers and fine-tuned on our task. Qualitative and quantitative experiments demonstrate that our method achieves state-of-the-art performance in Mandarin Chinese polyphone disambiguation. In addition, we publish a novel dataset specifically for the polyphone disambiguation task to promote further research.	翻訳日:2021-02-02 17:02:16 公開日:2021-02-01
# 用語定義からの常識知識マイニング Commonsense Knowledge Mining from Term Definitions ( http://arxiv.org/abs/2102.00651v1 ) ライセンス: Link先を確認	Zhicheng Liang and Deborah L. McGuinness	(参考訳) commonsenseの知識は、質問応答や自然言語理解など、さまざまな応用分野に有益であることが証明されている。以前の研究では、現在のcommonsense知識グラフをカバーするために、テキストから自動的に3倍のcommonsense知識を収集することを検討した。辞書用語定義をインプットとして,コモンセンスの知識トリプルをマイニングする機械学習手法をいくつか検討し,その初期評価を行った。まず,テキストから部分音声タグパターンを用いて3つの候補を抽出し,既存の3つのモデルの性能を比較した。私たちの実験では、用語定義には意味関係に対する正当かつ新しいコモンセンスの知識トリプルが含まれており、また既存のトリプルスコアリングモデルを使用する際の課題も示している。 Commonsense knowledge has proven to be beneficial to a variety of application areas, including question answering and natural language understanding. Previous work explored collecting commonsense knowledge triples automatically from text to increase the coverage of current commonsense knowledge graphs. We investigate a few machine learning approaches to mining commonsense knowledge triples using dictionary term definitions as inputs and provide some initial evaluation of the results. We start from extracting candidate triples using part-of-speech tag patterns from text, and then compare the performance of three existing models for triple scoring. Our experiments show that term definitions contain some valid and novel commonsense knowledge triples for some semantic relations, and also indicate some challenges with using existing triple scoring models.	翻訳日:2021-02-02 17:01:35 公開日:2021-02-01
# 多くの手が軽い仕事をする: 自動スコアのエッセイにエッセイの跡を使用する Many Hands Make Light Work: Using Essay Traits to Automatically Score Essays ( http://arxiv.org/abs/2102.00781v1 ) ライセンス: Link先を確認	Rahul Kumar, Sandeep Mathias, Sriparna Saha, Pushpak Bhattacharyya	(参考訳) aeg(automatic essay grading)の分野におけるほとんどの研究は、エッセイの総合的なスコア付けに向けられているが、個々のエッセイの特徴をスコアリングする作業も行われている。本論文では,マルチタスク学習(MTL)手法を用いてエッセイを体系的に採点する方法について述べる。ここでは,エッセイを体系的に採点することが主課題であり,エッセイ特性を採点することが補助課題である。 LSTMとBiLSTMの両方を用いて,STL(Single-task Learning)アプローチとの比較を行った。また,補助作業の結果を他のaegシステムで実施したタスクと比較した。異なる種類のエッセイにどの特性が最適かを調べるために、エッセイのそれぞれの特徴に対してアブレーションテストを実施します。また、各システムのランタイムとトレーニングパラメータの数を報告します。 MTLをベースとしたBiLSTMシステムは,エッセイ特性の評価だけでなく,エッセイ特性の評価にも有効であることがわかった。 Most research in the area of automatic essay grading (AEG) is geared towards scoring the essay holistically while there has also been some work done on scoring individual essay traits. In this paper, we describe a way to score essays holistically using a multi-task learning (MTL) approach, where scoring the essay holistically is the primary task, and scoring the essay traits is the auxiliary task. We compare our results with a single-task learning (STL) approach, using both LSTMs and BiLSTMs. We also compare our results of the auxiliary task with such tasks done in other AEG systems. To find out which traits work best for different types of essays, we conduct ablation tests for each of the essay traits. We also report the runtime and number of training parameters for each system. We find that MTL-based BiLSTM system gives the best results for scoring the essay holistically, as well as performing well on scoring the essay traits.	翻訳日:2021-02-02 17:01:01 公開日:2021-02-01
# マルチファセットプロトタイプを用いたフェーショット画像分類 Few-shot Image Classification with Multi-Facet Prototypes ( http://arxiv.org/abs/2102.00801v1 ) ライセンス: Link先を確認	Kun Yan, Zied Bouraoui, Ping Wang, Shoaib Jameel, Steven Schockaert	(参考訳) 少数ショット学習(FSL)の目的は、少数のトレーニング例から画像カテゴリの認識方法を学ぶことである。中心となる課題は、利用可能なトレーニングサンプルは通常、考慮されたカテゴリの最も特徴的な視覚特徴を決定するために不十分であることだ。この課題に対処するため、これらの視覚的特徴をファセットに整理し、同じ種類の機能を直感的にグループ化する(例)。形状、色、または質感に関連する機能)。これは, (i) 各ファセットの重要性がカテゴリごとに異なる, (ii) カテゴリ名の事前学習された埋め込みからファセットの重要性を予測することができる,という仮定に基づく。特に,あるカテゴリの集合に対して,予測されたフェーレット重み付けに依存する適応的類似度尺度を提案する。この測度は、既存のメトリックベースメソッドの幅広い配列と組み合わせて使用できる。 miniImageNet と CUB の実験により,我々の手法は計量ベース FSL の最先端性の向上を図っている。 The aim of few-shot learning (FSL) is to learn how to recognize image categories from a small number of training examples. A central challenge is that the available training examples are normally insufficient to determine which visual features are most characteristic of the considered categories. To address this challenge, we organize these visual features into facets, which intuitively group features of the same kind (e.g. features that are relevant to shape, color, or texture). This is motivated from the assumption that (i) the importance of each facet differs from category to category and (ii) it is possible to predict facet importance from a pre-trained embedding of the category names. In particular, we propose an adaptive similarity measure, relying on predicted facet importance weights for a given set of categories. This measure can be used in combination with a wide array of existing metric-based methods. Experiments on miniImageNet and CUB show that our approach improves the state-of-the-art in metric-based FSL.	翻訳日:2021-02-02 16:57:36 公開日:2021-02-01
# Bellman Eluder Dimension: RL問題の新しいリッチクラスとサンプル効率の高いアルゴリズム Bellman Eluder Dimension: New Rich Classes of RL Problems, and Sample-Efficient Algorithms ( http://arxiv.org/abs/2102.00815v1 ) ライセンス: Link先を確認	Chi Jin, Qinghua Liu, Sobhan Miryoosefi	(参考訳) サンプル効率の学習を促進する最小限の構造的仮定を見つけることは、強化学習(RL)において最も重要な研究方向の1つである。本稿では,新しい複雑性尺度であるbellman eluder(be)次元を導入することで,この基本的な問題に対する理解を深める。我々は,低BE次元のRL問題の族が極めて豊富であることを示し,これは表付きMDP,線形MDP,反応性PMDP,低ベルマンランク問題,低エルダー次元問題など,既存のトラクタブルRL問題の大部分を仮定している。本稿ではさらに,新しい最適化に基づくアルゴリズム -- ゴルフ,および仮説除去に基づくアルゴリズム -- olive (jiang et alで提案) を再分析する。 (2017)). 両アルゴリズムは、全ての関連するパラメータの多項式である多数のサンプルにおいて低BE次元問題の準最適ポリシを学習するが、状態-作用空間のサイズには依存しないことを示す。我々の後悔とサンプルの複雑さの結果は、BE次元の低いいくつかのよく知られたサブクラスに対して、最良の既存の結果と一致または改善する。 Finding the minimal structural assumptions that empower sample-efficient learning is one of the most important research directions in Reinforcement Learning (RL). This paper advances our understanding of this fundamental question by introducing a new complexity measure -- Bellman Eluder (BE) dimension. We show that the family of RL problems of low BE dimension is remarkably rich, which subsumes a vast majority of existing tractable RL problems including but not limited to tabular MDPs, linear MDPs, reactive POMDPs, low Bellman rank problems as well as low Eluder dimension problems. This paper further designs a new optimization-based algorithm -- GOLF, and reanalyzes a hypothesis elimination-based algorithm -- OLIVE (proposed in Jiang et al. (2017)). We prove that both algorithms learn the near-optimal policies of low BE dimension problems in a number of samples that is polynomial in all relevant parameters, but independent of the size of state-action space. Our regret and sample complexity results match or improve the best existing results for several well-known subclasses of low BE dimension problems.	翻訳日:2021-02-02 16:54:57 公開日:2021-02-01
# 畳み込みLSTMの時空間気象予測と注意機構 Spatio-temporal Weather Forecasting and Attention Mechanism on Convolutional LSTMs ( http://arxiv.org/abs/2102.00696v1 ) ライセンス: Link先を確認	Selim Furkan Tekin, Oguzhan Karaahmetoglu, Fatih Ilhan, Ismail Balaban and Suleyman Serdar Kozat	(参考訳) 高解像度物理モデル上での数値天気予報はスーパーコンピュータ上での計算時間を消費する。ディープラーニングと機械学習の手法の予測への応用は、この領域で新しいソリューションを明らかにした。本稿では,入力気象データと観測データの両方を用いて,新たな深層学習アーキテクチャを提供することで,高分解能の数値気象データを予測する。問題を時空間予測として定式化する。本モデルは,畳み込み型長期記憶と,エンコーダ・デコーダ構造を持つ畳み込み型ニューラルネットワークユニットから構成される。注意とコンテキストマッチング機構により、短期的なパフォーマンスと解釈性を向上させます。我々は,高スケール,実時間,ベンチマーク数値気象データ,era5時間毎の圧力レベルに関する実験を行い,温度を予測した。その結果,入力系列の異なる部分に着目した注意行列と空間的相関と時間的相関が有意な改善を示した。本モデルは,ConvLSTM予測ネットワークやU-Netなど,ベースラインモデルの中で最高の検証とテストスコアを得る。我々は定性的かつ定量的な結果を提供し、平均2度の誤差で3時間の周波数で10の時間ステップを予測した。当社のコードとデータは公開されています。 Numerical weather forecasting on high-resolution physical models consume hours of computations on supercomputers. Application of deep learning and machine learning methods in forecasting revealed new solutions in this area. In this paper, we forecast high-resolution numeric weather data using both input weather data and observations by providing a novel deep learning architecture. We formulate the problem as spatio-temporal prediction. Our model is composed of Convolutional Long-short Term Memory, and Convolutional Neural Network units with encoder-decoder structure. We enhance the short-long term performance and interpretability with an attention and a context matcher mechanism. We perform experiments on high-scale, real-life, benchmark numerical weather dataset, ERA5 hourly data on pressure levels, and forecast the temperature. The results show significant improvements in capturing both spatial and temporal correlations with attention matrices focusing on different parts of the input series. Our model obtains the best validation and the best test score among the baseline models, including ConvLSTM forecasting network and U-Net. 
We provide qualitative and quantitative results and show that our model forecasts 10 time steps with 3 hour frequency with an average of 2 degrees error. Our code and the data are publicly available.	翻訳日:2021-02-02 16:53:08 公開日:2021-02-01
# ノックオフによる反実生成 Counterfactual Generation with Knockoffs ( http://arxiv.org/abs/2102.00951v1 ) ライセンス: Link先を確認	Oana-Iuliana Popescu, Maha Shadaydeh, Joachim Denzler	(参考訳) ディープニューラルネットワークの決定の人間の解釈性は、特にそれが人間の生活に直接影響を及ぼす領域において重要である。既に訓練済みのニューラルネットワークの因果的説明は、入力特徴の摂動と、摂動後の分類器の結果の変化による重要性の寄与によって生成される。摂動は、ヒューリスティックまたは生成的インフィル方式で特徴を置き換えることによって行うことができる。インフィル機能の選択は、アーティファクトの数、すなわち偽陽性アトリビューションに大きく影響します。ヒューリスティックな手法は、摂動後の画像が元のデータ分布に遠く及ばないため、偽陽性のアーティファクトをもたらす。生成的インフィルングメソッドは、元のデータ分布を尊重するインフィルング値を生成することによってアーティファクトを削減する。しかし,現在のインフィル法では,インフィル値と元のデータとの相関が高いため,偽陰性も増大する可能性がある。本稿では,2015年にBarber と Cand\`es が,制御可能な擬似発見率を持つ変数選択ツールとして開発した,統計的に座屈した Knockoffs フレームワークを組み込むことにより,この問題を軽減することを提案する。ノックオフは、元のデータから可能な限りデコレーションに関連する統計的にnull-variablesであり、基礎となるデータ分布を変更することなく元のデータと交換することができる。異なるインフィル方式の比較は、インフィルディングとノックオフは説明のコンパクト性を維持しつつ、より因果的な意味で説明を明らかにすることができることを示している。 Human interpretability of deep neural networks' decisions is crucial, especially in domains where these directly affect human lives. Counterfactual explanations of already trained neural networks can be generated by perturbing input features and attributing importance according to the change in the classifier's outcome after perturbation. Perturbation can be done by replacing features using heuristic or generative in-filling methods. The choice of in-filling function significantly impacts the number of artifacts, i.e., false-positive attributions. Heuristic methods result in false-positive artifacts because the image after the perturbation is far from the original data distribution. Generative in-filling methods reduce artifacts by producing in-filling values that respect the original data distribution. However, current generative in-filling methods may also increase false-negatives due to the high correlation of in-filling values with the original data. 
In this paper, we propose to alleviate this by generating in-fillings with the statistically-grounded Knockoffs framework, which was developed by Barber and Candès in 2015 as a tool for variable selection with controllable false discovery rate. Knockoffs are statistically null-variables as decorrelated as possible from the original data, which can be swapped with the originals without changing the underlying data distribution. A comparison of different in-filling methods indicates that in-filling with knockoffs can reveal explanations in a more causal sense while still maintaining the compactness of the explanations.	翻訳日:2021-02-02 16:52:28 公開日:2021-02-01
# 適応基底分解による単画像非一様Blurカーネル推定 Single Image Non-uniform Blur Kernel Estimation via Adaptive Basis Decomposition ( http://arxiv.org/abs/2102.01026v1 ) ライセンス: Link先を確認	Guillermo Carbajal, Patricia Vitoria, Mauricio Delbracio, Pablo Musé, José Lezama	(参考訳) カメラの揺動や物体の動きによる動きのぼやけを特徴付けることは、画像復元にとって重要な課題である。近年、写真における動きのぼやけの除去は、ぼやけた画像から鋭い画像へ直接マッピングするように訓練されたディープラーニングベースの手法によって、目覚ましい進歩を遂げている。一方, 動きのぼかしの特徴付けはあまり注目されておらず, モデルに基づく復元手法の進歩はデータ駆動のエンド・ツー・エンド・アプローチに遅れをとっている。本稿では,高密度な非一様運動ボケ推定のための一般非パラメトリックモデルを提案する。ぼやけた画像が与えられたとき、適応基底カーネルの集合とピクセルレベルでの混合係数を推定し、動きのぼやけのピクセルごとのマップを生成する。このリッチだが効率的な劣化過程のフォワードモデルにより、逆問題の解決に既存のツールを活用することができる。提案手法は,既存の不均一な動きのぼかし推定の限界を克服し,モデルベースとデータ駆動アプローチのギャップを埋めることに寄与することを示す。 Characterizing and removing motion blur caused by camera shake or object motion remains an important task for image restoration. In recent years, removal of motion blur in photographs has seen impressive progress in the hands of deep learning-based methods, trained to map directly from blurry to sharp images. Characterization of motion blur, on the other hand, has received less attention and progress in model-based methods for restoration lags behind that of data-driven end-to-end approaches. In this paper, we propose a general, non-parametric model for dense non-uniform motion blur estimation. Given a blurry image, we estimate a set of adaptive basis kernels as well as the mixing coefficients at pixel level, producing a per-pixel map of motion blur. This rich but efficient forward model of the degradation process allows the utilization of existing tools for solving inverse problems. We show that our method overcomes the limitations of existing non-uniform motion blur estimation and that it contributes to bridging the gap between model-based and data-driven approaches for deblurring real photographs.	翻訳日:2021-02-02 16:51:43 公開日:2021-02-01
# Harrington Yowlumne Narrative Corpus The Harrington Yowlumne Narrative Corpus ( http://arxiv.org/abs/2102.00610v1 ) ライセンス: Link先を確認	Nathan M. White and Timothy Henry-Rodriguez	(参考訳) マイノリティ言語は、特に技術分野において、開発に十分な資源を欠いている。同様に、スミソニアン研究所のJ・P・ハリントン・ペーパーズ・コレクションは、手書きで非組織化されたフォーマットのために、コミュニティメンバーや研究者が実際にアクセスすることは困難である。我々の現在の研究は、この公に利用できながら問題のある素材の一部を、自然言語処理で実際に利用できるものにすることを目指している。ここでは、1910年から1925年の間、カリフォルニア州カーン郡のティンリウ牧場のTejoneño Yowlumneコミュニティに由来する20の物語テキストのコーパスであるHarrington Yowlumne Narrative Corpusを紹介します。テキストをデジタルで書き起こし、これらのテキストでゴールド標準のレキセメベースの正規化テキストを提供する。さらに、テキスト全体には、10,721語のゴールド標準の正規化語と対応付けられた67,835文字の書き起こし文字が含まれる。 Minority languages continue to lack adequate resources for their development, especially in the technological domain. Likewise, the J.P. Harrington Papers collection at the Smithsonian Institution is difficult to access in practical terms for community members and researchers due to its handwritten and disorganized format. Our current work seeks to make a portion of this publicly-available yet problematic material practically accessible for natural language processing use. Here, we present the Harrington Yowlumne Narrative Corpus, a corpus of 20 narrative texts that derive from the Tejoneño Yowlumne community of the Tinliw rancheria in Kern County, California between 1910 and 1925. We digitally transcribe the texts and provide gold-standard aligned lexeme-based normalized text with these texts. Altogether, the text contains 67,835 transcribed characters aligned with 10,721 gold standard text-normalized words.	翻訳日:2021-02-02 16:49:15 公開日:2021-02-01
# 回答選択のための階層的ランキング Hierarchical Ranking for Answer Selection ( http://arxiv.org/abs/2102.00677v1 ) ライセンス: Link先を確認	Hang Gao, Mengting Hu, Renhong Cheng, Tiegang Gao	(参考訳) 回答の選択は、与えられた質問に対する候補回答のプールから正の回答を選択するタスクです。本稿では,階層的ランキングという,解答選択のための新しい戦略を提案する。我々は,ポイントレベルのランキング,ペアレベルのランキング,リストレベルのランキングの3つの階層を導入する。候補者の回答をランキングするのと同じ目標を達成するために、異なる視点からの監督情報を使用して最適化目標を策定します。したがって、3つのレベルは関連しており、互いに促進することができる。我々は,多タスク学習(mtl)戦略に基づくスキーム,ランキング統合(ri)スキーム,プログレッシブランキング統合(pri)スキームという,階層的ランキングを共同で適用するための3つのスキームを検討した。 WikiQA と TREC-QA の2つの公開データセットによる実験結果から,提案した階層的ランキングが有効であることを示す。 TREC-QAとWikiQAの両方で最新の(非BERT)パフォーマンスを実現します。 Answer selection is a task to choose the positive answers from a pool of candidate answers for a given question. In this paper, we propose a novel strategy for answer selection, called hierarchical ranking. We introduce three levels of ranking: point-level ranking, pair-level ranking, and list-level ranking. They formulate their optimization objectives by employing supervisory information from different perspectives to achieve the same goal of ranking candidate answers. Therefore, the three levels of ranking are related and they can promote each other. We take the well-performed compare-aggregate model as the backbone and explore three schemes to implement the idea of applying the hierarchical rankings jointly: the scheme under the Multi-Task Learning (MTL) strategy, the Ranking Integration (RI) scheme, and the Progressive Ranking Integration (PRI) scheme. Experimental results on two public datasets, WikiQA and TREC-QA, demonstrate that the proposed hierarchical ranking is effective. Our method achieves state-of-the-art (non-BERT) performance on both TREC-QA and WikiQA.	翻訳日:2021-02-02 16:48:36 公開日:2021-02-01
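The three ranking levels named in the abstract above can be illustrated with one loss per level. This is only a sketch under stated assumptions: the abstract does not give the loss forms, so binary cross-entropy (point level), a margin hinge (pair level), and a softmax cross-entropy (list level) are common stand-ins chosen here, and the weighted sum in `hierarchical_loss` is a hypothetical joint objective in the spirit of the MTL scheme, not the paper's RI or PRI schemes.

```python
import math


def point_loss(scores, labels):
    """Point level: binary cross-entropy on each candidate independently.

    Uses the numerically stable logits form: max(s, 0) - s*y + log(1 + e^-|s|).
    """
    total = 0.0
    for s, y in zip(scores, labels):
        total += max(s, 0.0) - s * y + math.log1p(math.exp(-abs(s)))
    return total / len(scores)


def pair_loss(scores, labels, margin=1.0):
    """Pair level: hinge loss over every (positive, negative) answer pair."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    pairs = [(p, n) for p in pos for n in neg]
    if not pairs:
        return 0.0
    return sum(max(0.0, margin - (p - n)) for p, n in pairs) / len(pairs)


def list_loss(scores, labels):
    """List level: cross-entropy between softmax(scores) and normalized labels."""
    n_pos = sum(labels)
    if n_pos == 0:
        return 0.0
    top = max(scores)
    log_z = top + math.log(sum(math.exp(s - top) for s in scores))
    return -sum((y / n_pos) * (s - log_z) for s, y in zip(scores, labels))


def hierarchical_loss(scores, labels, weights=(1.0, 1.0, 1.0)):
    """Hypothetical joint objective: weighted sum of the three levels."""
    w_point, w_pair, w_list = weights
    return (
        w_point * point_loss(scores, labels)
        + w_pair * pair_loss(scores, labels)
        + w_list * list_loss(scores, labels)
    )
```

All three terms look at the same candidate scores from a different angle (each answer alone, answers in pairs, the whole candidate list), which is why the abstract describes the levels as related and mutually supportive.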
# イディオムコーポラ建設のためのガミファイドクラウドソーシング Gamified Crowdsourcing for Idiom Corpora Construction ( http://arxiv.org/abs/2102.00881v1 ) ライセンス: Link先を確認	Gülşen Eryiğit, Ali Şentaş, Johanna Monti	(参考訳) 慣用的な表現を学ぶことは、その予測不可能な意味のために第二言語学習の最も困難な段階の1つと見なされます。同様の状況は、機械翻訳や構文解析などの自然言語処理アプリケーション内での識別にも当てはまる。高品質の使用サンプルの欠如は、人間だけでなく人工知能システムにとってもこの課題を悪化させます。本稿では,慣用的・非慣用的な使用例を提供し,他のプレイヤーのエントリーを評価しながら,互いに競合するネイティブスピーカーのための非同期マルチプレイヤーゲームとして,メッセージングボットを設計する。古典的なクラウドプロセッシングアノテーションの分野への取り組みとは対照的に,文献の中では初めて,クラウドクリエイティング(crowdcreating)とクラウドレーティング(crowdrating)のアプローチが実装され,イディオムコーパス構築のためにテストされている。このアプローチは言語に依存しず、フィールドの従来のデータ準備技術と比較して2つの言語で評価されます。群衆の反応は、異なる動機づけの手段(すなわち、ゲーミフィケーションと金銭的報酬)で監視される。その結果, 提案手法は対象資料の収集に有効であり, 露骨なクラウドソーシング手法であるにもかかわらず, 観客を楽しませ, 有用であることがわかった。このアプローチは、第二言語学習教材として使用する異なる自然言語のためのイディオムコーパスの構築、教師付きイディオム識別システムのためのトレーニングデータ、辞書研究のためのサンプルをスピードアップする可能性があることが示されている。 Learning idiomatic expressions is seen as one of the most challenging stages in second language learning because of their unpredictable meaning. A similar situation holds for their identification within natural language processing applications such as machine translation and parsing. The lack of high-quality usage samples exacerbates this challenge not only for humans but also for artificial intelligence systems. This article introduces a gamified crowdsourcing approach for collecting language learning materials for idiomatic expressions; a messaging bot is designed as an asynchronous multiplayer game for native speakers who compete with each other while providing idiomatic and nonidiomatic usage examples and rating other players' entries. As opposed to classical crowdprocessing annotation efforts in the field, for the first time in the literature, a crowdcreating & crowdrating approach is implemented and tested for idiom corpora construction. The approach is language independent and evaluated on two languages in comparison to traditional data preparation techniques in the field.
The reaction of the crowd is monitored under different motivational means (namely, gamification affordances and monetary rewards). The results reveal that the proposed approach is powerful in collecting the targeted materials, and although being an explicit crowdsourcing approach, it is found entertaining and useful by the crowd. The approach has been shown to have the potential to speed up the construction of idiom corpora for different natural languages to be used as second language learning material, training data for supervised idiom identification systems, or samples for lexicographic studies.	翻訳日:2021-02-02 16:47:13 公開日:2021-02-01
# 学習済み言語モデルにおける一貫性の測定と改善 Measuring and Improving Consistency in Pretrained Language Models ( http://arxiv.org/abs/2102.01017v1 ) ライセンス: Link先を確認	Yanai Elazar, Nora Kassner, Shauli Ravfogel, Abhilasha Ravichander, Eduard Hovy, Hinrich Schütze, Yoav Goldberg	(参考訳) モデルの一貫性、すなわち、その入力における意味保存的な変化の下での振る舞いの不変性は、自然言語処理において非常に望ましい特性である。本稿では, 事前学習型言語モデル(PLM)は, 事実的知識に対して一貫性があるか? この目的のために私たちは、clozeスタイルのクエリ英語パラフレーズの高品質なリソースであるpararelを作成します。総計328のパラフレーズがあり、38の関係がある。 ParaRel を用いて、我々が実験したすべての PLM の整合性は貧弱であるが、関係のばらつきは高い。 plm の表現空間の解析は,構造が貧弱であり,現在,知識を堅牢に表現するのに適していないことを示唆する。最後に,モデルの一貫性を向上させる手法を提案し,その効果を実験的に実証する。 Consistency of a model -- that is, the invariance of its behavior under meaning-preserving alternations in its input -- is a highly desirable property in natural language processing. In this paper we study the question: Are Pretrained Language Models (PLMs) consistent with respect to factual knowledge? To this end, we create ParaRel, a high-quality resource of cloze-style query English paraphrases. It contains a total of 328 paraphrases for thirty-eight relations. Using ParaRel, we show that the consistency of all PLMs we experiment with is poor -- though with high variance between relations. Our analysis of the representational spaces of PLMs suggests that they have a poor structure and are currently not suitable for representing knowledge in a robust way. Finally, we propose a method for improving model consistency and experimentally demonstrate its effectiveness.	翻訳日:2021-02-02 16:46:28 公開日:2021-02-01
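One simple way to make the notion of consistency in the abstract above concrete is to count how often a model gives the same answer to two paraphrases of the same query. The sketch below does exactly that; the metric definition, the function names, and the input layout are assumptions for illustration, since the abstract does not spell out how the paper computes its score.

```python
from itertools import combinations


def pairwise_consistency(predictions):
    """Fraction of paraphrase pairs that receive the same prediction.

    predictions: list of model outputs, one per paraphrase of a single query.
    A list with fewer than two entries is treated as trivially consistent.
    """
    pairs = list(combinations(predictions, 2))
    if not pairs:
        return 1.0
    agreeing = sum(1 for a, b in pairs if a == b)
    return agreeing / len(pairs)


def mean_consistency(predictions_by_fact):
    """Average pairwise consistency over facts.

    predictions_by_fact: dict mapping a (hypothetical) fact id to the list of
    predictions obtained from its paraphrased cloze queries.
    """
    values = [pairwise_consistency(p) for p in predictions_by_fact.values()]
    return sum(values) / len(values)
```

Note that this measures agreement only: a model that answers every paraphrase with the same wrong entity is still fully consistent under this definition, so accuracy has to be reported separately.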
# SJ_AJ@DravidianLangTech-EACL2021: 攻撃言語識別のための多言語BERTモデルのタスク適応事前訓練 SJ_AJ@DravidianLangTech-EACL2021: Task-Adaptive Pre-Training of Multilingual BERT models for Offensive Language Identification ( http://arxiv.org/abs/2102.01051v1 ) ライセンス: Link先を確認	Sai Muralidhar Jayanthi, Akshat Gupta	(参考訳) 本稿では,ドラビダ語における攻撃的言語識別に関するEACL 2021-Shared Taskを提案する。私たちの最終システムはmBERTとXLM-RoBERTaモデルのアンサンブルであり、マスク付き言語モデリング目的の多言語BERTモデルのタスク適応事前トレーニングを利用しています。私たちのシステムは、カンナダで1位、マラヤラムで2位、タミルで3位にランクされました。 In this paper we present our submission for the EACL 2021-Shared Task on Offensive Language Identification in Dravidian languages. Our final system is an ensemble of mBERT and XLM-RoBERTa models which leverage task-adaptive pre-training of multilingual BERT models with a masked language modeling objective. Our system was ranked 1st for Kannada, 2nd for Malayalam and 3rd for Tamil.	翻訳日:2021-02-02 16:45:56 公開日:2021-02-01
# 小さくて合成的なベンチマークは、モデリングのイノベーションを駆動できますか? 質問応答モデリング手法のふりかえり的研究 Can Small and Synthetic Benchmarks Drive Modeling Innovation? A Retrospective Study of Question Answering Modeling Approaches ( http://arxiv.org/abs/2102.01065v1 ) ライセンス: Link先を確認	Nelson F. Liu and Tony Lee and Robin Jia and Percy Liang	(参考訳) データセットは、正確でデプロイ可能なシステムのトレーニングのためのリソースであるだけでなく、新しいモデリングアプローチを開発するためのベンチマークでもある。正確なシステムのトレーニングには大規模で自然なデータセットが必要ですが、モデリングの革新を促進するには必要でしょうか? 例えば、人気のあるsunity question answering benchmarkは、新しいモデリングアプローチの開発につながったが、シンセサイザーや小さなベンチマークが同様のイノベーションに繋がる可能性がある。この反現実的な質問は答えられないが、我々は必要条件、すなわちベンチマークがSQuAD上で行った発見を再カプセル化できる能力について研究することができる。我々は20のSQuADモデリングアプローチの振り返り調査を行い、32の既存および合成ベンチマークがSQuADとどのように一致しているかを調査する。我々は,SQuADに類似しないが,SQuADとの精度が高く,SQuADの歴史的モデリング改善を反映するためには,自然性やサイズは必要ないことを実証した,小型でターゲットの合成ベンチマークを慎重に構築する。この結果から,小型かつ慎重に設計された合成ベンチマークが新たなモデリング手法の開発に有用である可能性が示唆された。 Datasets are not only resources for training accurate, deployable systems, but are also benchmarks for developing new modeling approaches. While large, natural datasets are necessary for training accurate systems, are they necessary for driving modeling innovation? For example, while the popular SQuAD question answering benchmark has driven the development of new modeling approaches, could synthetic or smaller benchmarks have led to similar innovations? This counterfactual question is impossible to answer, but we can study a necessary condition: the ability for a benchmark to recapitulate findings made on SQuAD. We conduct a retrospective study of 20 SQuAD modeling approaches, investigating how well 32 existing and synthesized benchmarks concur with SQuAD -- i.e., do they rank the approaches similarly? We carefully construct small, targeted synthetic benchmarks that do not resemble natural language, yet have high concurrence with SQuAD, demonstrating that naturalness and size are not necessary for reflecting historical modeling improvements on SQuAD. 
Our results raise the intriguing possibility that small and carefully designed synthetic benchmarks may be useful for driving the development of new modeling approaches.	翻訳日:2021-02-02 16:45:27 公開日:2021-02-01
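The question "do they rank the approaches similarly?" in the abstract above can be illustrated with a rank correlation between the scores that the same modeling approaches obtain on two benchmarks. The sketch below uses Kendall's tau (the tau-a variant, with ties counted as neither concordant nor discordant); that choice of statistic is an assumption made here, as the abstract does not name the measure used in the paper.

```python
from itertools import combinations


def kendall_tau(scores_a, scores_b):
    """Kendall rank correlation (tau-a) between two equal-length score lists.

    scores_a[i] and scores_b[i] are the scores of approach i on benchmark A
    and benchmark B. Returns a value in [-1, 1]; 1 means identical ranking.
    """
    if len(scores_a) != len(scores_b):
        raise ValueError("score lists must have the same length")
    n = len(scores_a)
    if n < 2:
        raise ValueError("need at least two approaches to compare rankings")
    concordant = 0
    discordant = 0
    for i, j in combinations(range(n), 2):
        sign = (scores_a[i] - scores_a[j]) * (scores_b[i] - scores_b[j])
        if sign > 0:
            concordant += 1
        elif sign < 0:
            discordant += 1
    return (concordant - discordant) / (n * (n - 1) // 2)
```

A value near 1 for a small synthetic benchmark against a large natural one would indicate that improvements measured on the small benchmark tend to carry over in rank order, which is the kind of concurrence the study examines.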
# Piagetの認知発達理論に触発された解釈型強化学習 Interpretable Reinforcement Learning Inspired by Piaget's Theory of Cognitive Development ( http://arxiv.org/abs/2102.00572v1 ) ライセンス: Link先を確認	Aref Hakimzadeh, Yanbo Xue, and Peyman Setoodeh	(参考訳) 人間レベルの認知能力を持つロボットを設計するための取り組みは、学習機械の異なるカテゴリに導かれた。スキナーの理論によれば、強化学習(rl)は人間の直観と認知において重要な役割を果たす。ディープRLアルゴリズムを含む最先端の手法の大部分は、コネクティストの視点に強く影響されます。このようなアルゴリズムは、他の分野における心の理論や学習の恩恵を受けることができる。本稿では、思考仮説言語(LOTH)、スクリプト理論、およびピアジェットの認知発達理論などの理論が相補的なアプローチを提供し、RL分野を豊かにするという考えを楽しませる。この考え方に続いて、生産性、体系性、推論コヒーレンスの概念を支持するピアジェットのスキーマ理論に対して、接続論とは対照的に、一般的な計算ビルディングブロックが提案される。提案手法の抽象化はシステム自体に完全に依存しており、事前定義されたアーキテクチャによって外部的に制約されない。プロセス全体は、Neisserの知覚サイクルモデルと一致する。 3つの典型的な制御問題に対する実験と行動解析により,提案手法の解釈可能性とその競合性が,最先端アルゴリズムと比較して確認された。したがって、提案フレームワークは、人工知能システムにおいて人間のような認知を実現するためのステップとみなすことができる。 Endeavors for designing robots with human-level cognitive abilities have led to different categories of learning machines. According to Skinner's theory, reinforcement learning (RL) plays a key role in human intuition and cognition. Majority of the state-of-the-art methods including deep RL algorithms are strongly influenced by the connectionist viewpoint. Such algorithms can significantly benefit from theories of mind and learning in other disciplines. This paper entertains the idea that theories such as language of thought hypothesis (LOTH), script theory, and Piaget's cognitive development theory provide complementary approaches, which will enrich the RL field. Following this line of thinking, a general computational building block is proposed for Piaget's schema theory that supports the notions of productivity, systematicity, and inferential coherence as described by Fodor in contrast with the connectionism theory. Abstraction in the proposed method is completely upon the system itself and is not externally constrained by any predefined architecture. The whole process matches the Neisser's perceptual cycle model. 
Performed experiments on three typical control problems followed by behavioral analysis confirm the interpretability of the proposed method and its competitiveness compared to the state-of-the-art algorithms. Hence, the proposed framework can be viewed as a step towards achieving human-like cognition in artificial intelligent systems.	翻訳日:2021-02-02 16:42:55 公開日:2021-02-01
# 自動運転技術における計画・責任・安全の制御可能性 The Controllability of Planning, Responsibility, and Security in Automatic Driving Technology ( http://arxiv.org/abs/2102.00617v1 ) ライセンス: Link先を確認	Dan Wan and Hao Zhan	(参考訳) 自動走行技術は常に安定して制御可能な状態にあり、具体的には、制御可能な計画、制御可能な責任、制御可能な情報に分けられることを期待しています。この制御性が損なわれると、トロリージレンマ、責任帰属、情報漏洩、セキュリティなどの問題が発生します。本稿では,これら3つの問題を別々に論じ,誤解を明確にする。 People hope automated driving technology is always in a stable and controllable state; specifically, it can be divided into controllable planning, controllable responsibility, and controllable information. When this controllability is undermined, it brings about the problems, e.g., trolley dilemma, responsibility attribution, information leakage, and security. This article discusses these three types of issues separately and clarifies the misunderstandings.	翻訳日:2021-02-02 16:42:16 公開日:2021-02-01
# 画像のテクスト記述から空間関係を推測する Inferring spatial relations from textual descriptions of images ( http://arxiv.org/abs/2102.00997v1 ) ライセンス: Link先を確認	Aitzol Elu, Gorka Azkune, Oier Lopez de Lacalle, Ignacio Arganda-Carreras, Aitor Soroa, Eneko Agirre	(参考訳) テキスト記述から画像を生成するには、あるレベルの言語理解と、記述される物理的実体の空間的関係に関する常識知識が必要である。本研究では,テキストに基づくシーン構成における重要なステップであるエンティティ間の空間的関係を推測することに焦点を当てた。具体的には、対象への言及と対象の境界ボックスの位置とサイズを含むキャプションを考えると、キャプションで言及された対象の位置と大きさを予測することが私たちの目標です。以前の作業ではキャプションのテキスト情報ではなく、対象と対象の間の手動で提供された関係保持を使用していました。実際に使用される評価データセットには、手動で注釈付けされたオントロジ三脚が含まれているが、キャプションがないため、運動は非現実的で、手動ステップが必要であり、システムはキャプション内のリッチな情報を活用できなかった。本稿では, キャプションの全文と, キャプションからの空間的関係推論を直接評価できるMS-COCOから派生したデータセットであるRelations in Captions (REC-COCO) を用いたシステムを提案する。実験の結果,(1)字幕から直接対象物の大きさや位置を推測することが可能であり,(2)完全テキストを用いることで,手作業による注釈付き関係を用いた場合よりも,対象物の位置を推定できることがわかった。我々の研究は、キャプションを付与したシステムにおいて、最終的な画像を生成するために、どのエンティティとそれぞれの場所とサイズを表現する必要があるかを決定する方法である。 Generating an image from its textual description requires both a certain level of language understanding and common sense knowledge about the spatial relations of the physical entities being described. In this work, we focus on inferring the spatial relation between entities, a key step in the process of composing scenes based on text. More specifically, given a caption containing a mention to a subject and the location and size of the bounding box of that subject, our goal is to predict the location and size of an object mentioned in the caption. Previous work did not use the caption text information, but a manually provided relation holding between the subject and the object. In fact, the used evaluation datasets contain manually annotated ontological triplets but no captions, making the exercise unrealistic: a manual step was required; and systems did not leverage the richer information in captions. 
Here we present a system that uses the full caption, and Relations in Captions (REC-COCO), a dataset derived from MS-COCO which allows to evaluate spatial relation inference from captions directly. Our experiments show that: (1) it is possible to infer the size and location of an object with respect to a given subject directly from the caption; (2) the use of full text allows to place the object better than using a manually annotated relation. Our work paves the way for systems that, given a caption, decide which entities need to be depicted and their respective location and sizes, in order to then generate the final image.	翻訳日:2021-02-02 16:41:09 公開日:2021-02-01
# RoutingGAN:Disentangled Learningによるルーティング年齢の進行と回帰 RoutingGAN: Routing Age Progression and Regression with Disentangled Learning ( http://arxiv.org/abs/2102.00601v1 ) ライセンス: Link先を確認	Zhizhong Huang and Junping Zhang and Hongming Shan	(参考訳) 本稿では,これらの欠陥に対処するために,GAN (RoutingGAN)に基づくドロップアウト方式を導入して,高レベルの意味的特徴空間における異なる効果を導出する。具体的には、まず、入力面から年齢不変な特徴を外し、その後、他の出力を落として、異なる年齢グループに畳み込みフィルタを割り当てる残差ルータによって、その特徴に徐々に効果を付加する。その結果,提案するルーティングガンは,コンボリューションフィルタを一部共有することで,単一のモデルで同時に様々な効果を学習することができる。 2つのベンチマークデータセットの実験結果は、定性的かつ定量的に既存の手法よりも優れた性能を示した。 Although impressive results have been achieved for age progression and regression, there remain two major issues in generative adversarial networks (GANs)-based methods: 1) conditional GANs (cGANs)-based methods can learn various effects between any two age groups in a single model, but are insufficient to characterize some specific patterns due to completely shared convolution filters; and 2) GANs-based methods can, by utilizing several models to learn effects independently, learn some specific patterns; however, they are cumbersome and require age labels in advance. To address these deficiencies and have the best of both worlds, this paper introduces a dropout-like method based on GAN (RoutingGAN) to route different effects in a high-level semantic feature space.
Specifically, we first disentangle the age-invariant features from the input face, and then gradually add the effects to the features by residual routers that assign the convolution filters to different age groups by dropping out the outputs of others. As a result, the proposed RoutingGAN can simultaneously learn various effects in a single model, with convolution filters being shared in part to learn some specific effects. Experimental results on two benchmarked datasets demonstrate superior performance over existing methods both qualitatively and quantitatively.	翻訳日:2021-02-02 16:31:32 公開日:2021-02-01
# エンドツーエンド食品画像解析システム An End-to-End Food Image Analysis System ( http://arxiv.org/abs/2102.00645v1 ) ライセンス: Link先を確認	Jiangpeng He, Runyu Mao, Zeman Shao, Janine L. Wright, Deborah A. Kerr, Carol J. Boushey and Fengqing Zhu	(参考訳) 現代の深層学習技術は、食品認識や食品部分サイズ推定などの画像に基づく食事評価の進歩を可能にしている。食品の種類や消費量に関する貴重な情報は、多くの慢性疾患の予防に不可欠である。しかし、既存の画像に基づく食品分析の方法はエンドツーエンドでも、複数のタスク(認識や部分推定など)を一緒に処理することができず、現実のアプリケーションに適用することは困難である。本稿では,食品の局所化,分類,部分サイズ推定を融合した画像ベース食品分析フレームワークを提案する。提案手法はエンド・ツー・エンド,すなわち複数の食品を含む任意の食品画像であり,本システムでは各食品を対応する食品の種類と部分サイズでローカライズすることができる。また、条件付きGANで得られた食品エネルギー分布マップを局在化して4チャンネルRGB分布画像を生成することにより、単一食品部分推定を改善します。栄養摂食調査から収集した実生活食品画像データセットを用いて、エンドツーエンドの枠組みを評価した。 Modern deep learning techniques have enabled advances in image-based dietary assessment such as food recognition and food portion size estimation. Valuable information on the types of foods and the amount consumed are crucial for prevention of many chronic diseases. However, existing methods for automated image-based food analysis are neither end-to-end nor are capable of processing multiple tasks (e.g., recognition and portion estimation) together, making it difficult to apply to real life applications. In this paper, we propose an image-based food analysis framework that integrates food localization, classification and portion size estimation. Our proposed framework is end-to-end, i.e., the input can be an arbitrary food image containing multiple food items and our system can localize each single food item with its corresponding predicted food type and portion size. We also improve the single food portion estimation by consolidating localization results with a food energy distribution map obtained by conditional GAN to generate a four-channel RGB-Distribution image. Our end-to-end framework is evaluated on a real life food image dataset collected from a nutrition feeding study.	翻訳日:2021-02-02 16:30:53 公開日:2021-02-01
# 部分適応・関係注意モジュールによるNIR-to-VIS顔認識 A NIR-to-VIS face recognition via part adaptive and relation attention module ( http://arxiv.org/abs/2102.00689v1 ) ライセンス: Link先を確認	Rushuang Xu, MyeongAh Cho, and Sangyoun Lee	(参考訳) 顔認識アプリケーションシナリオでは,近赤外線(nir)監視カメラによる夜間の撮影など,さまざまな状況で撮影された顔画像を処理する必要がある。 NIRと可視光(VIS)の照度差は、顔画像間のドメインギャップを引き起こし、ポーズと感情の変動も顔のマッチングをより困難にします。ヘテロジニアス顔認識(hfr)はドメイン間の不一致が困難であり、顔部関係情報などのドメイン不変な特徴の抽出に多くの研究が集中している。しかし、ポーズ変動が発生した場合、顔成分位置が変化し、異なる部分関係が抽出される。本稿では,セマンティックマスクを用いて得られた顔の部位を抽出し,それぞれの特徴を用いた関係モデリングを行う部分関係アテンションモジュールを提案する。さらに,各部位の適応重みを用いた成分適応三重項損失関数を提案し,各領域やポーズに関係なくクラス内同一性を低減する。最後に,CASIA NIR-VIS 2.0の性能向上を図り,大きなポーズと感情の変化を伴うBUAA-VisNirにおいて優れた結果が得られることを示す。 In the face recognition application scenario, we need to process facial images captured in various conditions, such as at night by near-infrared (NIR) surveillance cameras. The illumination difference between NIR and visible-light (VIS) causes a domain gap between facial images, and the variations in pose and emotion also make facial matching more difficult. Heterogeneous face recognition (HFR) has difficulties in domain discrepancy, and many studies have focused on extracting domain-invariant features, such as facial part relational information. However, when pose variation occurs, the facial component position changes, and a different part relation is extracted. In this paper, we propose a part relation attention module that crops facial parts obtained through a semantic mask and performs relational modeling using each of these representative features. Furthermore, we suggest component adaptive triplet loss function using adaptive weights for each part to reduce the intra-class identity regardless of the domain as well as pose. Finally, our method exhibits a performance improvement in the CASIA NIR-VIS 2.0 and achieves superior result in the BUAA-VisNir with large pose and emotion variations.	翻訳日:2021-02-02 16:30:16 公開日:2021-02-01
# オーロラガード:モバイル照明システムを介して信頼できる顔アンチスプーフィング Aurora Guard: Reliable Face Anti-Spoofing via Mobile Lighting System ( http://arxiv.org/abs/2102.00713v1 ) ライセンス: Link先を確認	Jian Zhang, Ying Tai, Taiping Yao, Jia Meng, Shouhong Ding, Chengjie Wang, Jilin Li, Feiyue Huang, Rongrong Ji	(参考訳) モバイル端末での顔認証は様々なシナリオで広く適用されている。瞬き目や微妙な表情など、最先端の顔認証/検証システムの信頼性は高まっているが、紙写真やデジタルビデオの高解像度レンダリングリプレイに対する反偽造は、オープンな問題として残る。本稿では,オーロラガード(Aurora Guard, AG)と呼ばれる簡易かつ効果的な顔保護システムを提案する。提案システムは,まず光反射解析を用いて法線キューを抽出し,さらに,本質的な深度と物質マップを精度良く復元する,エンドツーエンドのマルチタスク畳み込みニューラルネットワーク(CNN)と,回帰枝の光CAPTCHA検査機構を併用してシステムの信頼性を向上する。公開Replay-AttackおよびCASIAデータセットに関する実験は、提案手法が最先端のものよりも優れていることを実証している。また,12,000の実写および多様なspoofingサンプルを含む大規模データセットについても広範な実験を行い,本手法の汎用性をさらに検証した。 Face authentication on the mobile end has been widely applied in various scenarios. Despite the increasing reliability of cutting-edge face authentication/verification systems to variations like blinking eye and subtle facial expression, anti-spoofing against high-resolution rendering replay of paper photos or digital videos remains an open problem. In this paper, we propose a simple yet effective face anti-spoofing system, termed Aurora Guard (AG). Our system first extracts the normal cues via light reflection analysis, and then adopts an end-to-end trainable multi-task Convolutional Neural Network (CNN) to accurately recover subjects' intrinsic depth and material map to assist liveness classification, along with the light CAPTCHA checking mechanism in the regression branch to further improve the system reliability. Experiments on public Replay-Attack and CASIA datasets demonstrate the merits of our proposed method over the state-of-the-arts. We also conduct extensive experiments on a large-scale dataset containing 12,000 live and diverse spoofing samples, which further validates the generalization ability of our method in the wild.	翻訳日:2021-02-02 16:29:34 公開日:2021-02-01
# ビデオトランスフォーマネットワーク Video Transformer Network ( http://arxiv.org/abs/2102.00719v1 ) ライセンス: Link先を確認	Daniel Neimark, Omri Bar, Maya Zohar, Dotan Asselmann	(参考訳) 本稿では,ビデオ認識のためのトランスフォーマーベースのフレームワークであるVTNを提案する。近年の視覚変換器の発展に触発されて,3D ConvNet に依存した映像行動認識の標準手法を廃止し,映像シーケンス情報全体への参加による行動分類手法を導入する。われわれのアプローチは汎用的で、任意の2次元空間ネットワーク上に構築されている。ウォールランタイムの面では、$16.1\times$高速にトレーニングし、推論中に$5.1\times$高速で実行し、他の最先端のメソッドと比較して競合精度を維持している。 1回のエンドツーエンドパスでビデオ全体を解析できるが、GFLOPsは$1.5\times$少ない。我々は、Kinetics-400の競合結果を報告し、VTN特性のアブレーション研究と精度と推論速度のトレードオフを提示する。私たちのアプローチが新しいベースラインとなり、ビデオ認識領域における新しい研究ラインを開始することを願っています。コードとモデルは近く提供される。 This paper presents VTN, a transformer-based framework for video recognition. Inspired by recent developments in vision transformers, we ditch the standard approach in video action recognition that relies on 3D ConvNets and introduce a method that classifies actions by attending to the entire video sequence information. Our approach is generic and builds on top of any given 2D spatial network. In terms of wall runtime, it trains $16.1\times$ faster and runs $5.1\times$ faster during inference while maintaining competitive accuracy compared to other state-of-the-art methods. It enables whole video analysis, via a single end-to-end pass, while requiring $1.5\times$ fewer GFLOPs. We report competitive results on Kinetics-400 and present an ablation study of VTN properties and the trade-off between accuracy and inference speed. We hope our approach will serve as a new baseline and start a fresh line of research in the video recognition domain. Code and models will be available soon.	翻訳日:2021-02-02 16:28:38 公開日:2021-02-01
# Landmark Breaker: ランドマークの抽出を乱してDeepFakeを妨害する Landmark Breaker: Obstructing DeepFake By Disturbing Landmark Extraction ( http://arxiv.org/abs/2102.00798v1 ) ライセンス: Link先を確認	Pu Sun, Yuezun Li, Honggang Qi and Siwei Lyu	(参考訳) 最近のDeep Neural Networks(DNN)の開発により、AI合成顔のリアリズムが大幅に向上し、最も注目すべき例はDeepFakesです。ディープフェイク技術は、同じ顔属性を保持しながら、他の被験者の顔から対象の顔を合成することができる。ソーシャルメディアのポータル(Facebook、Instagramなど)が急速に増加し、こうした現実的な偽の顔はインターネット上で急速に広まり、社会に悪影響を及ぼした。本稿では,顔のランドマーク抽出を阻害する最初の専用手法であるランドマークブレーカーについて説明し,DeepFakeビデオ生成の妨害に応用し,DeepFake品質を低下させるために,顔のランドマーク抽出が入力面のアライメントに影響を与える可能性があることを動機とする。本手法は逆摂動を用いて達成する。 DeepFake生成後にのみ動作する検出方法と比較して、Landmark BreakerはDeepFake生成を防ぐために一歩前進する。最近のceleb-dfデータセットを用いた3つの最先端顔ランドマーク抽出装置について実験を行った。 The recent development of Deep Neural Networks (DNN) has significantly increased the realism of AI-synthesized faces, with the most notable examples being the DeepFakes. The DeepFake technology can synthesize a face of target subject from a face of another subject, while retaining the same face attributes. With the rapidly increased social media portals (Facebook, Instagram, etc), these realistic fake faces rapidly spread through the Internet, causing a broad negative impact to the society. In this paper, we describe Landmark Breaker, the first dedicated method to disrupt facial landmark extraction, and apply it to the obstruction of the generation of DeepFake videos. Our motivation is that disrupting the facial landmark extraction can affect the alignment of input face so as to degrade the DeepFake quality. Our method is achieved using adversarial perturbations. Compared to the detection methods that only work after DeepFake generation, Landmark Breaker goes one step ahead to prevent DeepFake generation. The experiments are conducted on three state-of-the-art facial landmark extractors using the recent Celeb-DF dataset.	翻訳日:2021-02-02 16:28:01 公開日:2021-02-01
# マンモグラムにおけるセグメンティングマイクロ石灰化とその応用 Segmenting Microcalcifications in Mammograms and its Applications ( http://arxiv.org/abs/2102.00811v1 ) ライセンス: Link先を確認	Roee Zamir and Shai Bagon and David Samocha and Yael Yagil and Ronen Basri and Miri Sklair-Levy and Meirav Galun	(参考訳) 微小石灰化は、乳房の軟組織背景に明るい白い斑点としてマンモグラムに現れるカルシウムの小さな堆積物です。微小石灰化は非浸潤性乳管癌(Ductal Carcinoma in Situ)の特異な徴候となりうるため,診断と検診にはその正確な検出が不可欠である。マンモグラム中のこれらの小さなカルシウム残基を手動で検出することは、専門家の放射線科医にとっても、時間がかかり誤りも生じやすい。マイクロ石灰化の検出とセグメント化のための既存のコンピュータ化アルゴリズムは、高い偽陽性率に苦しむ傾向にあり、広く使われることを妨げている。本稿では,深層学習を用いた正確な石灰化セグメンテーション法を提案する。トレーニングフェーズで難しいピクセルに焦点を当てる戦略を提案することで、偽陽性率を低く抑えるという課題に特に対処する。さらに,正確なセグメンテーションにより,マイクロ石灰化のクラスターに関する有意義な統計情報を抽出することができる。 Microcalcifications are small deposits of calcium that appear in mammograms as bright white specks on the soft tissue background of the breast. Microcalcifications may be a unique indication for Ductal Carcinoma in Situ breast cancer, and therefore their accurate detection is crucial for diagnosis and screening. Manual detection of these tiny calcium residues in mammograms is both time-consuming and error-prone, even for expert radiologists, since these microcalcifications are small and can be easily missed. Existing computerized algorithms for detecting and segmenting microcalcifications tend to suffer from a high false-positive rate, hindering their widespread use. In this paper, we propose an accurate calcification segmentation method using deep learning. We specifically address the challenge of keeping the false positive rate low by suggesting a strategy for focusing on the hard pixels in the training phase. Furthermore, our accurate segmentation enables extracting meaningful statistics on clusters of microcalcifications.	翻訳日:2021-02-02 16:27:21 公開日:2021-02-01
# カーネル化散乱ヒストグラム空間上の核距離による動的テクスチャ認識 Dynamic Texture Recognition via Nuclear Distances on Kernelized Scattering Histogram Spaces ( http://arxiv.org/abs/2102.00841v1 ) ライセンス: Link先を確認	Alexander Sagel, Julian Wörmann, Hao Shen	(参考訳) 距離に基づく動的テクスチャ認識は,映像データの検索からセグメンテーションまで,マルチメディア処理における重要な研究分野である。動的テクスチャの最も特徴的な特徴が個々のフレームの出現であるという予想に基づいて, 散乱変換を用いて計算したフレーム単位の特徴ベクトルのカーネル化空間として動的テクスチャを記述することを提案する。これらの空間を基底不変計量と組み合わせることで、最近傍分類では競争力のある結果を、最近傍クラス中心分類では最先端の結果をもたらす枠組みを得る。 Distance-based dynamic texture recognition is an important research field in multimedia processing with applications ranging from retrieval to segmentation of video data. Based on the conjecture that the most distinctive characteristic of a dynamic texture is the appearance of its individual frames, this work proposes to describe dynamic textures as kernelized spaces of frame-wise feature vectors computed using the Scattering transform. By combining these spaces with a basis-invariant metric, we get a framework that produces competitive results for nearest neighbor classification and state-of-the-art results for nearest class center classification.	翻訳日:2021-02-02 16:26:45 公開日:2021-02-01
# 大語彙物体検出器の評価:悪魔は細部にある Evaluating Large-Vocabulary Object Detectors: The Devil is in the Details ( http://arxiv.org/abs/2102.01066v1 ) ライセンス: Link先を確認	Achal Dave, Piotr Dollár, Deva Ramanan, Alexander Kirillov, Ross Girshick	(参考訳) 設計上、オブジェクト検出のための平均精度(ap)は、すべてのクラスを独立して扱うことを目的としている。一方、これは全てのクラスを扱い、まれに頻繁に、等しく扱うのが望ましい。一方、現実世界のユースケースにおいて重要な特性であるカテゴリ間信頼度校正を無視する。残念なことに、不均衡で大語彙のデータセットでは、APのデフォルト実装はカテゴリに依存していないし、適切に校正された検出器を直接報酬しない。実際、既定の実装では、単純で非合理的な再ランクポリシーがAPを大きなマージンで改善できるゲーム可能なメトリックが生成される。これらの制限に対処するために、2つの補完指標を紹介します。まず、既定のAP実装に簡単な修正を加え、本来意図されていたようなカテゴリ間で真に独立であることを保証する。最近の大語彙検出の進歩をベンチマークし、報告された多くの成果が、新しいクラス毎の独立評価の下での改善に繋がらないことを見出し、最近の改善は、クロスカテゴリランキングへの変更を解釈するのが困難であることを示唆する。カテゴリ間ランキングを確実にベンチマークすることの重要性を考えると、カテゴリ間ランキングを直接比較することで、適切に校正された検出器に報酬を与えるAP(AP-pool)のプール版を考える。最後に、キャリブレーションの古典的アプローチを再検討し、明示的に校正する検出器がAPプールの最先端を1.7ポイント改善することを発見した。 By design, average precision (AP) for object detection aims to treat all classes independently: AP is computed independently per category and averaged. On the one hand, this is desirable as it treats all classes, rare to frequent, equally. On the other hand, it ignores cross-category confidence calibration, a key property in real-world use cases. Unfortunately, we find that on imbalanced, large-vocabulary datasets, the default implementation of AP is neither category independent, nor does it directly reward properly calibrated detectors. In fact, we show that the default implementation produces a gameable metric, where a simple, nonsensical re-ranking policy can improve AP by a large margin. To address these limitations, we introduce two complementary metrics. First, we present a simple fix to the default AP implementation, ensuring that it is truly independent across categories as originally intended. 
We benchmark recent advances in large-vocabulary detection and find that many reported gains do not translate to improvements under our new per-class independent evaluation, suggesting recent improvements may arise from difficult to interpret changes to cross-category rankings. Given the importance of reliably benchmarking cross-category rankings, we consider a pooled version of AP (AP-pool) that rewards properly calibrated detectors by directly comparing cross-category rankings. Finally, we revisit classical approaches for calibration and find that explicitly calibrating detectors improves state-of-the-art on AP-pool by 1.7 points.	翻訳日:2021-02-02 16:26:15 公開日:2021-02-01
# Phoneme-BERT: Phoneme Sequence と ASR Transcript の合同言語モデリング Phoneme-BERT: Joint Language Modelling of Phoneme Sequence and ASR Transcript ( http://arxiv.org/abs/2102.00804v1 ) ライセンス: Link先を確認	Mukuntha Narayanan Sundararaman, Ayush Kumar, Jithendra Vepa	(参考訳) 近年,asrシステムの発話認識能力が大幅に向上している。しかし、翻訳されたテキストで置換と削除のエラーが流行している、騒々しいドメイン外のデータにとって、まだ難しい作業です。これらのエラーは下流タスクのパフォーマンスを著しく低下させる。本研究では,ASRの誤りに頑健な音素認識表現を学習するために,音素シーケンスとASR書き起こしを用いた共同言語モデルを学習するPhonemeBERTと呼ばれるBERTスタイルの言語モデルを提案する。 PhonemeBERTは、音素シーケンスを付加的な機能として使用する下流タスクや、音素情報を利用せずに下流タスク用のASR-transcriptしか持たない低リソース設定でも使用できることを示しています。我々は3つのベンチマークデータセット(Stanford Sentiment Treebank, TREC, ATIS)に対して,それぞれ感情,質問,意図の分類タスクに対してノイズの多いデータを生成することで,我々のアプローチを広範囲に評価した。提案手法の結果は,各データセットにおける最先端のベースラインを総合的に上回ります。 Recent years have witnessed significant improvement in ASR systems to recognize spoken utterances. However, it is still a challenging task for noisy and out-of-domain data, where substitution and deletion errors are prevalent in the transcribed text. These errors significantly degrade the performance of downstream tasks. In this work, we propose a BERT-style language model, referred to as PhonemeBERT, that learns a joint language model with phoneme sequence and ASR transcript to learn phonetic-aware representations that are robust to ASR errors. We show that PhonemeBERT can be used on downstream tasks using phoneme sequences as additional features, and also in low-resource setup where we only have ASR-transcripts for the downstream tasks with no phoneme information available. We evaluate our approach extensively by generating noisy data for three benchmark datasets - Stanford Sentiment Treebank, TREC and ATIS for sentiment, question and intent classification tasks respectively. The results of the proposed approach beats the state-of-the-art baselines comprehensively on each dataset.	翻訳日:2021-02-02 16:21:37 公開日:2021-02-01
# 潜在空間における対人訓練のスピードアップに向けて Towards Speeding up Adversarial Training in Latent Spaces ( http://arxiv.org/abs/2102.00662v1 ) ライセンス: Link先を確認	Yaguan Qian, Qiqi Shao, Tengteng Yao, Bin Wang, Shaoning Zeng, Zhaoquan Gu and Wassim Swaileh	(参考訳) 敵対的な訓練は、敵対的な例から守る最も効果的な方法です。しかし,既存の対人訓練手法では,入力空間における対人的な例を生成する必要があるため,時間消費の主部分を占めている。学習過程を高速化するため,本研究では,実例を生成する必要のない新しい学習手法を提案する。クリーンな例は、自身のクラス以外のどのクラスよりも2番目に大きなロジットコンポーネントを持つクラスの決定境界に近いことに気付きます。したがって、ロジットに摂動を加えて内在的敵例(EAEs)を生成することで、トレーニングプロセスを高速化するために勾配を計算することを避けることができる。我々はさらに多様体の理論によってEAEの存在についての深い洞察を得る。付加的な摂動が制約の範囲内にあることを保証するため、統計分布を用いてシード例を選択してEAEを作成する。 CIFAR-10 と ImageNet で大規模な実験を行い,現状の "Free" と "Fast" の手法と比較して,EAE の対人訓練はトレーニング時間を短縮するだけでなく,モデルの堅牢性も向上することを示した。さらに,EAE対人訓練は,既存の方法に比べてクリーンサンプルの精度にはほとんど影響を与えない。 Adversarial training is widely considered as the most effective way to defend against adversarial examples. However, existing adversarial training methods consume unbearable time cost, since they need to generate adversarial examples in the input space, which accounts for the main part of total time-consuming. For speeding up the training process, we propose a novel adversarial training method that does not need to generate real adversarial examples. We notice that a clean example is closer to the decision boundary of the class with the second largest logit component than any other class besides its own class. Thus, by adding perturbations to logits to generate Endogenous Adversarial Examples (EAEs) -- adversarial examples in the latent space, it can avoid calculating gradients to speed up the training process. We further gain a deep insight into the existence of EAEs by the theory of manifold. To guarantee the added perturbation is within the range of constraint, we use statistical distributions to select seed examples to craft EAEs. 
Extensive experiments are conducted on CIFAR-10 and ImageNet, and the results show that, compared with state-of-the-art "Free" and "Fast" methods, our EAE adversarial training not only shortens the training time, but also enhances the robustness of the model. Moreover, the EAE adversarial training has less impact on the accuracy of clean examples than the existing methods.	翻訳日:2021-02-02 16:17:38 公開日:2021-02-01
# グラフ畳み込みネットワークと交差する自律ナビゲーションと自動運転車の条件模倣学習 Autonomous Navigation through intersections with Graph ConvolutionalNetworks and Conditional Imitation Learning for Self-driving Cars ( http://arxiv.org/abs/2102.00675v1 ) ライセンス: Link先を確認	Xiaodong Mei, Yuxiang Sun, Yuying Chen, Congcong Liu, Ming Liu	(参考訳) 自動運転では、多くの交通参加者が移動する信号のない交差点を通るナビゲーションは難しい作業です。そこで本研究では,ナビゲーションポリシー学習のための新しい分岐ネットワークG-CILを提案する。具体的には,グラフ構造データなどの動的環境を第一に表現し,エッジ定義の効果的な戦略を提案する。次に、グラフ畳み込みニューラルネットワークを知覚モジュールとして、環境から大域的および幾何学的特徴をキャプチャする。安全かつ効率的なナビゲーションポリシを生成するために,条件付き模倣学習アルゴリズムを組み込んで,専門家によるデモンストレーションから直接運転行動を学習する。提案するネットワークは,複数の周辺車両を処理でき,与えられた高レベルコマンド(例えば,左折してグローバル目標へ)に応じて最適な制御動作(ステアリング角度やスロットルなど)を生成することができる。信号のない交差点と様々な交通密度の評価は、我々のエンドツーエンドのトレーニング可能なニューラルネットワークが、より高い成功率と短いナビゲーション時間でベースラインを上回っていることを示している。 In autonomous driving, navigation through unsignaled intersections with many traffic participants moving around is a challenging task. To provide a solution to this problem, we propose a novel branched network G-CIL for the navigation policy learning. Specifically, we firstly represent such dynamic environments as graph-structured data and propose an effective strategy for edge definition to aggregate surrounding information for the ego-vehicle. Then graph convolutional neural networks are used as the perception module to capture global and geometric features from the environment. To generate safe and efficient navigation policy, we further incorporate it with conditional imitation learning algorithm, to learn driving behaviors directly from expert demonstrations. Our proposed network is capable of handling a varying number of surrounding vehicles and generating optimal control actions (e.g., steering angle and throttle) according to the given high-level commands (e.g., turn left towards the global goal). 
Evaluations on unsignaled intersections with various traffic densities demonstrate that our end-to-end trainable neural network outperforms the baselines with higher success rate and shorter navigation time.	翻訳日:2021-02-02 16:16:57 公開日:2021-02-01
# 不確実性監視によるディープラーニングシステムのフェールセーフ実行 Fail-Safe Execution of Deep Learning based Systems through Uncertainty Monitoring ( http://arxiv.org/abs/2102.00902v1 ) ライセンス: Link先を確認	Michael Weiss and Paolo Tonella	(参考訳) 現代のソフトウェアシステムは、画像、ビデオ、自然言語テキスト、音声信号などの複雑な非構造化入力を処理する際にDeep Neural Networks (DNN) に依存している。このような入力空間の難解な大きさ、学習アルゴリズムの本質的な制限、およびいくつかの入力に対する予測の曖昧さが提供され、DNNの予測が常に正しいという保証はない。フェイルセーフディープラーニングベースシステム(DLS)は、DNN障害をスーパーバイザによって処理する装備の1つであり、信頼すべきでない予測を認識し、DLSを安全な状態にする治癒手順を活性化することができる。本稿では,DNN不確実性推定器を用いてこのようなスーパーバイザを実装する手法を提案する。まず、DNNの不確実性を測定するための既存のアプローチの利点と欠点を議論し、そのようなアプローチに依存するスーパーバイザーの実証的評価のための新しいメトリクスを提案します。次に、公開ツールUNCERTAINTY-WIZARDについて述べ、通常のtf.keras DNNに対する不確実性を透過的に推定する。最後に,このアプローチを実証的に検証するために,4つの異なる課題について実施した大規模研究について検討し,dlsのフェールセーフ実行の不確実性を監視するソフトウェア技術者への指導として,教訓を報告する。 Modern software systems rely on Deep Neural Networks (DNN) when processing complex, unstructured inputs, such as images, videos, natural language texts or audio signals. Provided the intractably large size of such input spaces, the intrinsic limitations of learning algorithms, and the ambiguity about the expected predictions for some of the inputs, not only there is no guarantee that DNN's predictions are always correct, but rather developers must safely assume a low, though not negligible, error probability. A fail-safe Deep Learning based System (DLS) is one equipped to handle DNN faults by means of a supervisor, capable of recognizing predictions that should not be trusted and that should activate a healing procedure bringing the DLS to a safe state. In this paper, we propose an approach to use DNN uncertainty estimators to implement such a supervisor. We first discuss the advantages and disadvantages of existing approaches to measure uncertainty for DNNs and propose novel metrics for the empirical assessment of the supervisor that rely on such approaches. We then describe our publicly available tool UNCERTAINTY-WIZARD, which allows transparent estimation of uncertainty for regular tf.keras DNNs. 
Lastly, we discuss a large-scale study conducted on four different subjects to empirically validate the approach, reporting the lessons-learned as guidance for software engineers who intend to monitor uncertainty for fail-safe execution of DLS.	翻訳日:2021-02-02 16:16:18 公開日:2021-02-01
# 分布型モンテカルロ木探索によるリスク認識と多目的意思決定 Risk Aware and Multi-Objective Decision Making with Distributional Monte Carlo Tree Search ( http://arxiv.org/abs/2102.00966v1 ) ライセンス: Link先を確認	Conor F. Hayes, Mathieu Reymond, Diederik M. Roijers, Enda Howley, Patrick Mannion	(参考訳) 多くのリスク認識および多目的強化学習設定において、ユーザの有用性はポリシーの単一実行から導かれる。これらの設定では、平均的な将来のリターンに基づいた決定は適切ではない。例えば、医療現場では、患者は病気を治療する機会を1つだけ持つことができる。決定を行う場合、期待されるリターン(強化学習では値として知られています)は、決定が持つ可能性のある有害あるいはポジティブな結果の範囲を考慮できないのです。我々の重要な洞察は、エージェントが決定時に要求する重要な情報を表現するために、期待される未来よりも分布を使うべきだということです。本論文では,個々の政策実行から得られる様々なリターンの有用性について,後方分布を学習するアルゴリズムである分散モンテカルロ木探索を提案する。さらに,本アルゴリズムは,期待値の効用に対する多目的強化学習において,最先端の手法よりも優れていた。 In many risk-aware and multi-objective reinforcement learning settings, the utility of the user is derived from the single execution of a policy. In these settings, making decisions based on the average future returns is not suitable. For example, in a medical setting a patient may only have one opportunity to treat their illness. When making a decision, just the expected return -- known in reinforcement learning as the value -- cannot account for the potential range of adverse or positive outcomes a decision may have. Our key insight is that we should use the distribution over expected future returns differently to represent the critical information that the agent requires at decision time. In this paper, we propose Distributional Monte Carlo Tree Search, an algorithm that learns a posterior distribution over the utility of the different possible returns attainable from individual policy executions, resulting in good policies for both risk-aware and multi-objective settings. Moreover, our algorithm outperforms the state-of-the-art in multi-objective reinforcement learning for the expected utility of the returns.	翻訳日:2021-02-02 16:15:31 公開日:2021-02-01
# 術後前立腺癌に対するctvコントーリングにおける医師スタイルの変化のドシメトリー効果:深層学習に基づくシミュレーション研究 Dosimetric impact of physician style variations in contouring CTV for post-operative prostate cancer: A deep learning based simulation study ( http://arxiv.org/abs/2102.01006v1 ) ライセンス: Link先を確認	Anjali Balagopal, Dan Nguyen, Maryam Mashayekhi, Howard Morgan, Aurelie Garant, Neil Desai, Raquibul Hannan, Mu-Han Lin, Steve Jiang	(参考訳) 腫瘍のセグメンテーションでは、オブザーバ間変異が重要な問題であることが認識されている。これは、臨床標的量(CTV)セグメント化、特に術後設定において、総腫瘍が存在しない場合にさらに重要である。このシナリオでは、CTVは解剖学的に確立された構造ではなく、医師が使用する臨床ガイドライン、腫瘍の制御と毒性のトレードオフ、経験、トレーニングの背景などに基づいて決定するものである。この結果、医師間のオブザーバ間の変動性が高まる。オブザーバ間の変動性は問題視されているが、各患者に複数のctvパターンがないため、線量計画に必要なかなりの時間を要するため、その線量測定結果はまだ不明である。本研究では,これらのスタイリスティックな変化が臓器リスク(oar)線量に与える影響を,深層学習による臨床ワークフローのシミュレーションにより解析する。ある医師が以前に治療した患者に対しては、DLベースのツールを使用して、他の医師がCTVをどのように輪郭化し、対応する線量分布がこの患者にどのように見えるかをシミュレートします。複数の医師のスタイルをシミュレートするために、既存の社内ctvセグメンテーションモデルを使用し、医師のスタイルを認識できるセグメンテーションを生成する。対応する線量分布は、すべての構造に平均して、試験データ上の処方用量の3%以内の線量を予測することができる、別の社内ディープラーニングツールを使用して予測される。各検査患者に対して,4種類の異なる医師型ctvが検討され,4種類の線量分布が解析された。 OAR線量測定値を比較すると、医師スタイルの変動は臓器に異なる線量を与えても、最大線量点を除くすべての重要な線量測定値が臨床的に許容される限界内にあることを示している。 In tumor segmentation, inter-observer variation is acknowledged to be a significant problem. This is even more significant in clinical target volume (CTV) segmentation, specifically, in post-operative settings, where a gross tumor does not exist. In this scenario, CTV is not an anatomically established structure but rather one determined by the physician based on the clinical guideline used, the preferred trade off between tumor control and toxicity, their experience, training background etc... This results in high inter-observer variability between physicians. Inter-observer variability has been considered an issue, however its dosimetric consequence is still unclear, due to the absence of multiple physician CTV contours for each patient and the significant amount of time required for dose planning. 
In this study, we analyze the impact that these physician stylistic variations have on organs-at-risk (OAR) dose by simulating the clinical workflow using deep learning. For a given patient previously treated by one physician, we use DL-based tools to simulate how other physicians would contour the CTV and how the corresponding dose distributions would look for this patient. To simulate multiple physician styles, we use a previously developed in-house CTV segmentation model that can produce physician style-aware segmentations. The corresponding dose distribution is predicted using another in-house deep learning tool, which, averaging across all structures, is capable of predicting dose within 3% of the prescription dose on the test data. For every test patient, four different physician-style CTVs are considered and four different dose distributions are analyzed. OAR dose metrics are compared, showing that even though physician style variations result in organs getting different doses, all the important dose metrics except Maximum Dose point are within the clinically acceptable limit.	翻訳日:2021-02-02 16:14:55 公開日:2021-02-01
# 大規模言語モデルの微調整のためのスケーリングフェデレーション学習 Scaling Federated Learning for Fine-tuning of Large Language Models ( http://arxiv.org/abs/2102.00875v1 ) ライセンス: Link先を確認	Agrin Hilmkil and Sebastian Callh and Matteo Barbieri and Leon René Sütfeld and Edvin Listo Zec and Olof Mogren	(参考訳) Federated Learning(FL)は分散コンピューティングと分散データに対する有望なアプローチであり、法的なフレームワークに対するプライバシーとコンプライアンスのレベルを提供します。これにより、FLは消費者およびヘルスケアアプリケーションの両方に魅力的になります。この領域は積極的に検討されているが、より大きな言語モデルの文脈でflを調査した研究はほとんどなく、タスク、アーキテクチャ、クライアントの数、その他の関連する要因間での堅牢性に関する包括的なレビューが欠けている。本稿では,共用学習環境におけるトランスフォーマティブ言語モデルの微調整について検討する。我々は,感情分析や著者識別などのテキスト分類タスクにおいて,さまざまなサイズのBERT変異(BERT, ALBERT, DistilBERT)を評価する。フェデレーション平均設定におけるタスクパフォーマンスに対する分散計算の影響を評価するために、32までのクライアント数を広範囲に監視します。実験結果から, 評価モデルの大規模化は, 一般にフェデレーショントレーニングを禁止していないことが示唆されるが, 異なるモデルがフェデレーション平均化を様々な程度に扱うことが判明した。特にDistilBERTは、より多くのクライアントと大幅に遅く収束し、いくつかの状況下では、チャンスレベルのパフォーマンスに崩壊します。この問題を調査することは、将来の研究に興味深い視点をもたらす。 Federated learning (FL) is a promising approach to distributed compute, as well as distributed data, and provides a level of privacy and compliance to legal frameworks. This makes FL attractive for both consumer and healthcare applications. While the area is actively being explored, few studies have examined FL in the context of larger language models and there is a lack of comprehensive reviews of robustness across tasks, architectures, numbers of clients, and other relevant factors. In this paper, we explore the fine-tuning of Transformer-based language models in a federated learning setting. We evaluate three popular BERT-variants of different sizes (BERT, ALBERT, and DistilBERT) on a number of text classification tasks such as sentiment analysis and author identification. We perform an extensive sweep over the number of clients, ranging up to 32, to evaluate the impact of distributed compute on task performance in the federated averaging setting. 
While our findings suggest that the large sizes of the evaluated models are not generally prohibitive to federated training, we found that the different models handle federated averaging to a varying degree. Most notably, DistilBERT converges significantly slower with larger numbers of clients, and under some circumstances, even collapses to chance level performance. Investigating this issue presents an interesting perspective for future research.	翻訳日:2021-02-02 16:14:02 公開日:2021-02-01
# End2End音響とセマンティックトランスダクション End2End Acoustic to Semantic Transduction ( http://arxiv.org/abs/2102.01013v1 ) ライセンス: Link先を確認	Valentin Pelloin, Nathalie Camelin, Antoine Laurent, Renato De Mori, Antoine Caubrière, Yannick Estève, Sylvain Meignier	(参考訳) 本稿では,注意機構を用いた新しいエンドツーエンドシーケンス・ツー・シーケンス音声言語理解モデルを提案する。意味的内容を仮説化するために、コンテキスト音響特徴を確実に選択する。アコースティックスパンからすべての発音された単語や概念を抽出できる初期アーキテクチャを設計、試験する。浅い融合言語モデルでは、このシステムはフランスのMEDIAコーパスにおける13.6のコンセプトエラーレート(CER)と18.5のコンセプト値エラーレート(CVER)に達し、最先端技術と比較して絶対2.8ポイントの削減を実現している。そこで,概念とその価値を仮説化するモデルを提案する。この変換は、新しいタイプのコンテキストなしで15.4 CERと21.6 CVERに達する。 In this paper, we propose a novel end-to-end sequence-to-sequence spoken language understanding model using an attention mechanism. It reliably selects contextual acoustic features in order to hypothesize semantic contents. An initial architecture capable of extracting all pronounced words and concepts from acoustic spans is designed and tested. With a shallow fusion language model, this system reaches a 13.6 concept error rate (CER) and an 18.5 concept value error rate (CVER) on the French MEDIA corpus, achieving an absolute 2.8 points reduction compared to the state-of-the-art. Then, an original model is proposed for hypothesizing concepts and their values. This transduction reaches a 15.4 CER and a 21.6 CVER without any new type of context.	翻訳日:2021-02-02 16:13:18 公開日:2021-02-01
# 時系列回帰と予測のためのニューラルネットワークにおける自己相関誤差の調整 Adjusting for Autocorrelated Errors in Neural Networks for Time Series Regression and Forecasting ( http://arxiv.org/abs/2101.12578v2 ) ライセンス: Link先を確認	Fan-Keng Sun and Christopher I. Lang and Duane S. Boning	(参考訳) 多くの場合、既知のパラメトリックモデル構造を用いて時系列データの高精度なモデルを生成することは困難である。これに対し、ニューラルネットワークを用いて時系列を概ねモデル化する研究が増えている。時系列でニューラルネットワークをトレーニングする一般的な前提は、異なる時間ステップでのエラーは非相関であるということである。しかし、データの時間性のため、多くのケースでエラーは自己相関しており、そのような最尤推定は不正確になる。本稿では,自己相関係数をモデルパラメータと協調して学習し,自己相関誤差に適応することを提案する。時系列回帰の場合, 大規模実験では, 特に自己相関が強い場合に, Prais-Winsten法を上回っていることが示された。さらに,本手法を時系列予測に拡張し,様々な最先端モデルで適用する。実世界のデータセットの広範囲にわたる結果から,本手法はほぼすべてのケースで性能が向上することが示された。 In many cases, it is difficult to generate highly accurate models for time series data using a known parametric model structure. In response, an increasing body of research focuses on using neural networks to model time series approximately. A common assumption in training neural networks on time series is that the errors at different time steps are uncorrelated. However, due to the temporality of the data, errors are actually autocorrelated in many cases, which makes such maximum likelihood estimation inaccurate. In this paper, we propose to learn the autocorrelation coefficient jointly with the model parameters in order to adjust for autocorrelated errors. For time series regression, large-scale experiments indicate that our method outperforms the Prais-Winsten method, especially when the autocorrelation is strong. Furthermore, we broaden our method to time series forecasting and apply it with various state-of-the-art models. Results across a wide range of real-world datasets show that our method enhances performance in almost all cases.	翻訳日:2021-02-02 16:12:45 公開日:2021-02-01
# 対称正定値行列多様体上の確率的学習ベクトル量子化 Probabilistic Learning Vector Quantization on Manifold of Symmetric Positive Definite Matrices ( http://arxiv.org/abs/2102.00667v1 ) ライセンス: Link先を確認	Fengzhen Tang, Haifeng Feng, Peter Tino, Bailu Si, Daxiong Ji	(参考訳) 本稿では,確率論的学習ベクトル量子化の枠組みにおける多様体値データの新しい分類法を開発する。多くの分類シナリオにおいて、データは本質的に曲線リーマン多様体上に存在する点である対称正定値行列によって自然に表現することができる。リーマン多様体の非ユークリッド幾何学のために、伝統的なユークリッド機械学習アルゴリズムはそのようなデータに悪い結果をもたらす。本稿では,リーマン自然計量(アフィン不変計量)を備えた対称正定行列の多様体上に存在するデータ点の確率的学習ベクトル量子化アルゴリズムを一般化する。誘導されたリーマン距離を利用して、確率学習リーマン空間量子化アルゴリズムを導出し、リーマン勾配降下による学習規則を得る。合成データ,画像データ,運動画像脳波データに関する実証的研究は,提案手法の優れた性能を示す。 In this paper, we develop a new classification method for manifold-valued data in the framework of probabilistic learning vector quantization. In many classification scenarios, the data can be naturally represented by symmetric positive definite matrices, which are inherently points that live on a curved Riemannian manifold. Due to the non-Euclidean geometry of Riemannian manifolds, traditional Euclidean machine learning algorithms yield poor results on such data. In this paper, we generalize the probabilistic learning vector quantization algorithm for data points living on the manifold of symmetric positive definite matrices equipped with Riemannian natural metric (affine-invariant metric). By exploiting the induced Riemannian distance, we derive the probabilistic learning Riemannian space quantization algorithm, obtaining the learning rule through Riemannian gradient descent. Empirical investigations on synthetic data, image data , and motor imagery EEG data demonstrate the superior performance of the proposed method.	翻訳日:2021-02-02 16:06:51 公開日:2021-02-01
# Surrogate Set Classification による複数非ラベルデータセットのバイナリ分類 Binary Classification from Multiple Unlabeled Datasets via Surrogate Set Classification ( http://arxiv.org/abs/2102.00678v1 ) ライセンス: Link先を確認	Shida Lei, Nan Lu, Gang Niu, Issei Sato, Masashi Sugiyama	(参考訳) 高アノテーションコストに対処するために,弱い教師データのみから分類器を訓練することが近年注目されている。様々なアプローチの中で、完全に教師なしの分類からの監督を強化することは有望な方向であり、通常はクラス優先を唯一の監督として採用し、ラベルなし(u)データセットからバイナリ分類器を訓練する。既存のリスク一貫性メソッドは理論的には高い柔軟性を持つが、2つのUセットからのみ学ぶことができる。本稿では,mU集合から$m\ge2$に対して二進分類を行う新しい手法を提案する。本研究の目的は,各観測データから u セットが描画されるかを予測することを目的とした,surrogate set classification (ssc) と呼ばれる補助的分類課題を検討することである。 SSCは標準(マルチクラス)の分類法で解決でき、SSCの解を用いて、ある線形フラクタル変換によって最終二項分類器を得る。我々は,この手法を柔軟かつ効率的なエンドツーエンドのディープラーニングフレームワークで構築し,分類器一貫性を証明した。実験により,提案手法が最先端手法よりも優れていることを示す。 To cope with high annotation costs, training a classifier only from weakly supervised data has attracted a great deal of attention these days. Among various approaches, strengthening supervision from completely unsupervised classification is a promising direction, which typically employs class priors as the only supervision and trains a binary classifier from unlabeled (U) datasets. While existing risk-consistent methods are theoretically grounded with high flexibility, they can learn only from two U sets. In this paper, we propose a new approach for binary classification from m U-sets for $m\ge2$. Our key idea is to consider an auxiliary classification task called surrogate set classification (SSC), which is aimed at predicting from which U set each observed data is drawn. SSC can be solved by a standard (multi-class) classification method, and we use the SSC solution to obtain the final binary classifier through a certain linear-fractional transformation. We built our method in a flexible and efficient end-to-end deep learning framework and prove it to be classifier-consistent. Through experiments, we demonstrate the superiority of our proposed method over state-of-the-art methods.	翻訳日:2021-02-02 16:06:20 公開日:2021-02-01
# VAEにおけるクラス関連およびクラス非依存因子の半監督的解離 Semi-Supervised Disentanglement of Class-Related and Class-Independent Factors in VAE ( http://arxiv.org/abs/2102.00892v1 ) ライセンス: Link先を確認	Sina Hajimiri, Aryo Lotfi, Mahdieh Soleymani Baghshah	(参考訳) 近年,不整合表現を学習するための変分オートエンコーダのフレームワークの拡張が注目されている。そこで我々は,データ変動のクラス関連要因とクラス非依存要因を分離できるフレームワークを提案する。本フレームワークは,データからクラス関連因子を抽出するプロセスを改善するために,潜在空間における注意機構を用いる。また,混合モデルを学習可能な事前分布として活用し,目的関数にbhattacharyya係数を組み込んで重なり合う混合を防止することで,データ分布の多様性を扱う。我々のモデルエンコーダは、表現の解釈性を改善するために、ラベル付きデータの少ない半教師付き方式でさらに訓練されている。実験により,本フレームワークはクラスやクラスに依存しない変動要因を分離し,解釈可能な特徴を学習することを示した。さらに,各データセットの定量的,定性的な結果を用いて,モデルの性能を実証する。 In recent years, extending variational autoencoder's framework to learn disentangled representations has received much attention. We address this problem by proposing a framework capable of disentangling class-related and class-independent factors of variation in data. Our framework employs an attention mechanism in its latent space in order to improve the process of extracting class-related factors from data. We also deal with the multimodality of data distribution by utilizing mixture models as learnable prior distributions, as well as incorporating the Bhattacharyya coefficient in the objective function to prevent highly overlapping mixtures. Our model's encoder is further trained in a semi-supervised manner, with a small fraction of labeled data, to improve representations' interpretability. Experiments show that our framework disentangles class-related and class-independent factors of variation and learns interpretable features. Moreover, we demonstrate our model's performance with quantitative and qualitative results on various datasets.	翻訳日:2021-02-02 16:05:39 公開日:2021-02-01
# 全最小二乗位相検索 Total least squares phase retrieval ( http://arxiv.org/abs/2102.00927v1 ) ライセンス: Link先を確認	Sidharth Gupta and Ivan Dokmani\'c	(参考訳) 本稿では,検出ベクトルの誤差による位相探索問題に対処する。最近の位相探索法は最小二乗法(LS)の定式化に基づいており、2次測定の誤差を仮定している。このアプローチを拡張し、オペレータエラーの線形逆問題に精通した総最小二乗(TLS)フレームワークを採用することで、センシングベクターのエラーを処理する。本稿では, 位相探索問題の勾配降下と特異な幾何学を用いて, 単純かつ効率的なTLS解を得る方法を示す。さらに、我々はソリューションエラーを計算することを可能にするセンシングベクターと測定に関してTLSおよびLSソリューションの勾配を導き出します。これらのエラー式を分析することで、各メソッドがいつうまく機能すべきかを決定します。シミュレーションを行い,本手法の利点を実証し,解析結果の検証を行う。さらに,検出ベクトルと測定誤差を自然に含む実光ハードウェア上で位相探索実験を行うことにより,本手法の有効性を実証する。 We address the phase retrieval problem with errors in the sensing vectors. A number of recent methods for phase retrieval are based on least squares (LS) formulations which assume errors in the quadratic measurements. We extend this approach to handle errors in the sensing vectors by adopting the total least squares (TLS) framework familiar from linear inverse problems with operator errors. We show how gradient descent and the peculiar geometry of the phase retrieval problem can be used to obtain a simple and efficient TLS solution. Additionally, we derive the gradients of the TLS and LS solutions with respect to the sensing vectors and measurements which enables us to calculate the solution errors. By analyzing these error expressions we determine when each method should perform well. We run simulations to demonstrate the benefits of our method and verify the analysis. We further demonstrate the effectiveness of our approach by performing phase retrieval experiments on real optical hardware which naturally contains sensing vector and measurement errors.	翻訳日:2021-02-02 16:05:03 公開日:2021-02-01
# 確率的勾配Descenceのための情報理論一般化境界 Information-Theoretic Generalization Bounds for Stochastic Gradient Descent ( http://arxiv.org/abs/2102.00931v1 ) ライセンス: Link先を確認	Gergely Neu	(参考訳) 一般的な非凸損失関数を最適化するための確率勾配勾配法の一般化特性について検討する。我々の主な貢献は,sgdで計算されたイテレートの経路に沿って評価された確率勾配の局所統計に依存する一般化誤差の上限を提供することである。我々の境界が依存する重要な要因は、勾配のばらつき(データ分布に関する)と、SGD経路に沿った目的関数の局所的滑らかさ、最終的な出力に対する摂動に対する損失関数の感度である。当社の重要な技術ツールは、以前にSGDのランダム化変種を分析するために使用される情報理論一般化境界と、反復の摂動解析を組み合わせることです。 We study the generalization properties of the popular stochastic gradient descent method for optimizing general non-convex loss functions. Our main contribution is providing upper bounds on the generalization error that depend on local statistics of the stochastic gradients evaluated along the path of iterates calculated by SGD. The key factors our bounds depend on are the variance of the gradients (with respect to the data distribution) and the local smoothness of the objective function along the SGD path, and the sensitivity of the loss function to perturbations to the final output. Our key technical tool is combining the information-theoretic generalization bounds previously used for analyzing randomized variants of SGD with a perturbation analysis of the iterates.	翻訳日:2021-02-02 16:04:29 公開日:2021-02-01
# 時間論理仕様を用いたマルチエージェント強化学習 Multi-Agent Reinforcement Learning with Temporal Logic Specifications ( http://arxiv.org/abs/2102.00582v1 ) ライセンス: Link先を確認	Lewis Hammond and Alessandro Abate and Julian Gutierrez and Michael Wooldridge	(参考訳) 本稿では,未知の環境におけるエージェント群による時間論理仕様を満たす学習の問題について検討し,確率的行動を示す可能性がある。学習の観点からは、これらの仕様はタスクや目的をキャプチャするリッチな形式言語を提供する一方で、ロジックや自動検証の観点からは、学習機能の導入によって、大規模で統計的で未知の環境での実用的な応用が可能になる。しかし、この領域の既存の仕事は限られています。完全な線形時間論理や正当性を保証するフレームワークのうち、これまでのすべてのメソッドでは、単一の時間論理仕様と単一のエージェントのみを考慮する。この制限を克服するために、時間論理仕様のための最初のマルチエージェント強化学習技術を開発しました。関数近似を用いても,主アルゴリズムであるALMANAC(Automaton/Logic Multi-Agent Natural Actor-Critic)の正確性と収束性を保証する。理論的結果とともに,予備実験のセットを通じて,本手法の適用性をさらに実証する。 In this paper, we study the problem of learning to satisfy temporal logic specifications with a group of agents in an unknown environment, which may exhibit probabilistic behaviour. From a learning perspective these specifications provide a rich formal language with which to capture tasks or objectives, while from a logic and automated verification perspective the introduction of learning capabilities allows for practical applications in large, stochastic, unknown environments. The existing work in this area is, however, limited. Of the frameworks that consider full linear temporal logic or have correctness guarantees, all methods thus far consider only the case of a single temporal logic specification and a single agent. In order to overcome this limitation, we develop the first multi-agent reinforcement learning technique for temporal logic specifications, which is also novel in its ability to handle multiple specifications. We provide correctness and convergence guarantees for our main algorithm - ALMANAC (Automaton/Logic Multi-Agent Natural Actor-Critic) - even when using function approximation. Alongside our theoretical results, we further demonstrate the applicability of our technique via a set of preliminary experiments.	翻訳日:2021-02-02 16:02:48 公開日:2021-02-01
# AIのモラル責任に関する人間の認識:AIによるベイル意思決定のケーススタディ Human Perceptions on Moral Responsibility of AI: A Case Study in AI-Assisted Bail Decision-Making ( http://arxiv.org/abs/2102.00625v1 ) ライセンス: Link先を確認	Gabriel Lima, Nina Grgi\'c-Hla\v{c}a, Meeyoung Cha	(参考訳) 自律人工知能(AI)システムの行動に対する責任をどう捉えるかは、人文科学や社会科学の分野で広く議論されている。この研究では、AIと人間のエージェントに関する8つの異なる道徳的責任の概念の人々の認識を保釈意思決定の文脈で測定する2つの実験(それぞれ$N$=200)を提示する。実生活に適応したヴィグネットを用いて、我々の実験では、AIエージェントは因果責任を持ち、人間エージェントと同じような責任を負っている。しかし、これらのエージェントの倫理的責任の認識には有意義な違いがあり、人間のエージェントはaiエージェントよりも現代的で先見的な責任の考え方の方が高いと説明されていた。また、AIと人間の意思決定者とアドバイザーの両方が、その性質に関わらず、自分の決定を正当化することを期待していることもわかりました。本稿は、ハイテイクシナリオにおける説明可能なAIの必要性など、これらの発見のポリシーとHCIの影響について論じる。 How to attribute responsibility for autonomous artificial intelligence (AI) systems' actions has been widely debated across the humanities and social science disciplines. This work presents two experiments ($N$=200 each) that measure people's perceptions of eight different notions of moral responsibility concerning AI and human agents in the context of bail decision-making. Using real-life adapted vignettes, our experiments show that AI agents are held causally responsible and blamed similarly to human agents for an identical task. However, there was a meaningful difference in how people perceived these agents' moral responsibility; human agents were ascribed to a higher degree of present-looking and forward-looking notions of responsibility than AI agents. We also found that people expect both AI and human decision-makers and advisors to justify their decisions regardless of their nature. We discuss policy and HCI implications of these findings, such as the need for explainable AI in high-stakes scenarios.	翻訳日:2021-02-02 16:02:11 公開日:2021-02-01
# DRLDO: メタモルフィックマルウェアに対する防御のための新しいDRLベースのD-Obfuscation System DRLDO: A novel DRL based De-ObfuscationSystem for Defense against Metamorphic Malware ( http://arxiv.org/abs/2102.00898v1 ) ライセンス: Link先を確認	Mohit Sewak and Sanjay K. Sahay and Hemant Rathore	(参考訳) 本論文では,オプコードレベルでのメタモルフィックおよび難読化マルウェアを正規化し,高度なメタモルフィック・デ・難読化防御システムを構築するための新しいメカニズムを提案する。深層強化学習に基づくde-obfuscatorのためのdrldoと呼ぶ。 DRLDOをサブコンポーネントとして含むことで、既存の侵入検知システムは、既存のマルウェアの難読化および変成変種からの「ゼロデイ」攻撃に対する防御能力を増強することができる。これは、高度なDRLを使用してオペコードレベルまで難読化をインテリジェントかつ自動的に正規化するシステムがないだけでなく、DRLDOシステムが既存のIDSに一切変更を課さないために重要なものとなっている。 DRLDOシステムはIDSの分類器を難読化されたサンプルを含む新しいデータセットで再訓練する義務も負いません。したがって、DRLDOは既存のIDSデプロイメントに容易に再適合できる。我々は,複数の世代のマルウェアを含む標準データセットから得られたマルウェアサンプルから発生する難読化に対する複数同時攻撃に対して,システムの設計,開発,および評価を行う実験を行った。実験の結果、DRLDOは、既存の訓練済みマルウェア分類器によってマルウェアの検出不能な難解な変種を検出可能にすることが証明された。検出確率はカットオフマークを大きく上回り,分類器では0.6に上昇し,難読化マルウェアを曖昧に検出した。さらに、DRLDOが生成した難読化変種は、ベースマルウェアと非常に高い相関(0.99)を達成した。この観察は、DRLDOシステムが実際に難読化を学習しており、簡単なトリックを悪用していないことを実証する。 In this paper, we propose a novel mechanism to normalize metamorphic and obfuscated malware down at the opcode level and hence create an advanced metamorphic malware de-obfuscation and defense system. We name this system DRLDO, for Deep Reinforcement Learning based De-Obfuscator. With the inclusion of the DRLDO as a sub-component, an existing Intrusion Detection System could be augmented with defensive capabilities against 'zero-day' attacks from obfuscated and metamorphic variants of existing malware. This gains importance, not only because there exists no system to date that uses advanced DRL to intelligently and automatically normalize obfuscation down even to the opcode level, but also because the DRLDO system does not mandate any changes to the existing IDS. The DRLDO system does not even mandate the IDS' classifier to be retrained with any new dataset containing obfuscated samples. Hence DRLDO could be easily retrofitted into any existing IDS deployment. 
We designed and developed the system and evaluated it against multiple simultaneous attacks, using obfuscations generated from malware samples in a standardized dataset that contains multiple generations of malware. Experimental results show that DRLDO made otherwise undetectable obfuscated variants of the malware detectable by an existing pre-trained malware classifier. The detection probability was raised to 0.6, well above the cut-off mark, so that the classifier could detect the obfuscated malware unambiguously. Further, the de-obfuscated variants generated by DRLDO achieved a very high correlation (0.99) with the base malware. This observation indicates that the DRLDO system is actually learning to de-obfuscate rather than exploiting a trivial trick.	翻訳日:2021-02-02 16:01:33 公開日:2021-02-01
# ARMプロセッサ上のML演算子のキャッシュ境界の理解 Understanding Cache Boundness of ML Operators on ARM Processors ( http://arxiv.org/abs/2102.00932v1 ) ライセンス: Link先を確認	Bernhard Klein and Christoph Gratl and Manfred M\"ucke and Holger Fr\"oning	(参考訳) TVMのような機械学習コンパイラは、組み込みCPUに高速で柔軟なデプロイを可能にする。これにより、ML圧縮技術で一般的な非標準演算子の使用が可能になる。しかし、適切なソリューションを設計するには、mlワークロードにおける典型的な計算インテンシーオペレータの制限を理解する必要がある。これは、組み込みARMプロセッサの基本ハードウェア制限と比較して、TVMで生成された高密度および畳み込み演算子の最初の詳細分析です。これにより、TVMとopenBLASで作成された計算ピーク性能、理論と測定値、および実世界の最先端結果のギャップが説明できる。代わりに、単精度一般行列乗算(GEMM)と畳み込みがL1キャッシュ可読帯域でバインドされていることがわかる。 8ビットおよびビットシリアル量子化演算子の探索は、キャッシュバウンド浮動小数点演算子と比較して、量子化が関連するスピードアップを達成するために使用できることを示した。しかし、量子化演算子の性能はデータレイアウトとビットパッキングの相互作用に大きく依存する。 Machine Learning compilers like TVM allow a fast and flexible deployment on embedded CPUs. This enables the use of non-standard operators, which are common in ML compression techniques. However, it is necessary to understand the limitations of typical compute-intense operators in ML workloads to design a proper solution. This is the first in-detail analysis of dense and convolution operators, generated with TVM, that compares to the fundamental hardware limits of embedded ARM processors. Thereby it explains the gap between computational peak performance, theoretical and measured, and real-world state-of-the-art results, created with TVM and openBLAS. Instead, one can see that single-precision general matrix multiply (GEMM) and convolutions are bound by L1-cache-read bandwidth. Explorations of 8-bit and bit-serial quantized operators show that quantization can be used to achieve relevant speedups compared to cache-bound floating-point operators. However, the performance of quantized operators highly depends on the interaction between data layout and bit packing.	翻訳日:2021-02-02 16:00:46 公開日:2021-02-01
# ハイブリッド情報駆動マルチエージェント強化学習 Hybrid Information-driven Multi-agent Reinforcement Learning ( http://arxiv.org/abs/2102.01004v1 ) ライセンス: Link先を確認	William A. Dawson, Ruben Glatt, Edward Rusu, Braden C. Soper, Ryan A. Goldhahn	(参考訳) 情報理論センサ管理手法は、マルチエージェントシステムの最適制御を考える場合の状態推定問題に対する理想的な解決策であるが、大規模分散マルチエージェントシステムで典型的な限られた計算資源を考えると、大きな状態空間では計算集約的すぎる。強化学習(RL)は、分散エージェントの多くのシステムに固有のリソース制約を考慮して、分散最適制御問題に対する近似ソリューションを見つけることができる有望な代替手段です。しかし、特に州空間の大部分でエージェントがほとんどフィードバックを受けていない低情報環境では、rlトレーニングは禁止的に非効率である。本稿では,情報理論モデルをヒューリスティックとして活用し,エージェントが大きなスパース状態空間をナビゲートするのを支援する,情報駆動型マルチエージェント強化学習(marl)手法を提案する。本稿では,この目的に向けた取り組みについて述べる。予備的な知見から,このようなアプローチは,単純なベースラインメトリクスよりもスパース状態空間を探索する上で,およそ3桁の効率性を持つエージェントのシステムをもたらす可能性が示唆された。作業はまだ初期段階ですが、将来の研究に有望な方向性を提供します。 Information theoretic sensor management approaches are an ideal solution to state estimation problems when considering the optimal control of multi-agent systems, however they are too computationally intensive for large state spaces, especially when considering the limited computational resources typical of large-scale distributed multi-agent systems. Reinforcement learning (RL) is a promising alternative which can find approximate solutions to distributed optimal control problems that take into account the resource constraints inherent in many systems of distributed agents. However, the RL training can be prohibitively inefficient, especially in low-information environments where agents receive little to no feedback in large portions of the state space. We propose a hybrid information-driven multi-agent reinforcement learning (MARL) approach that utilizes information theoretic models as heuristics to help the agents navigate large sparse state spaces, coupled with information based rewards in an RL framework to learn higher-level policies. This paper presents our ongoing work towards this objective. 
Our preliminary findings show that such an approach can result in a system of agents that are approximately three orders of magnitude more efficient at exploring a sparse state space than naive baseline metrics. While the work is still in its early stages, it provides a promising direction for future research.	翻訳日:2021-02-02 16:00:07 公開日:2021-02-01
# 低線量x線ct用深部高分解能ネットワーク Deep High-Resolution Network for Low Dose X-ray CT Denoising ( http://arxiv.org/abs/2102.00599v1 ) ライセンス: Link先を確認	Ti Bai, Dan Nguyen, Biling Wang and Steve Jiang	(参考訳) 低線量CT (LDCT) は, 患者に対する放射線量が少ないため, 臨床的に望ましい。しかし、LDCT画像の品質は、必然的に強い量子ノイズのため、しばしば準最適である。コンピュータビジョンにおける未熟な成功にインスパイアされたディープラーニング(DL)ベースの技術は、LDCTのノイズ除去に使用されています。 DLモデルの有望なノイズ除去能力にもかかわらず、DLデノ化画像の分解能は損なわれ、臨床価値は低下している。本研究では,この問題の軽減を目的とした高分解能ネットワーク(HRNet)の導入により,より効率的なデノイザーを開発した。 hrnetはサブネットワークの複数のブランチで構成され、後に融合されるマルチスケールな特徴を抽出するため、生成された特徴の品質が大幅に向上し、ノイズ除去性能が向上する。実験結果から, HRNetをベースとしたデノイザは, ノイズ抑制能力に比較して, 優れた画像分解能保持能の点で, ベンチマークしたUNetベースのデノイザよりも優れた性能を示した。 root-mean-squared-errors (RMSE)/structure similarity index (SSIM)により、HRNetベースの denoiser は 113.80/0.550 (LDCT) から 55.24/0.745 (HRNet) に値を改善することができ、UNetベースの denoiser の 59.87/0.712 と比較できる。 Low Dose Computed Tomography (LDCT) is clinically desirable due to the reduced radiation to patients. However, the quality of LDCT images is often sub-optimal because of the inevitable strong quantum noise. Inspired by their unprecedent success in computer vision, deep learning (DL)-based techniques have been used for LDCT denoising. Despite the promising noise removal ability of DL models, people have observed that the resolution of the DL-denoised images is compromised, decreasing their clinical value. Aiming at relieving this problem, in this work, we developed a more effective denoiser by introducing a high-resolution network (HRNet). Since HRNet consists of multiple branches of subnetworks to extract multiscale features which are later fused together, the quality of the generated features can be substantially enhanced, leading to improved denoising performance. Experimental results demonstrated that the introduced HRNet-based denoiser outperforms the benchmarked UNet-based denoiser in terms of superior image resolution preservation ability while comparable, if not better, noise suppression ability. 
In terms of root-mean-squared error (RMSE) / structural similarity index (SSIM), the HRNet-based denoiser improves the values from 113.80/0.550 (LDCT) to 55.24/0.745, compared with 59.87/0.712 for the UNet-based denoiser.	翻訳日:2021-02-02 15:50:21 公開日:2021-02-01
# デッドビット問題からディープハッシュを救う Rescuing Deep Hashing from Dead Bits Problem ( http://arxiv.org/abs/2102.00648v1 ) ライセンス: Link先を確認	Shu Zhao, Dayan Wu, Yucan Zhou, Bo Li and Weiping Wang	(参考訳) ディープハッシュ法は大規模画像検索において高い検索精度と効率を示す。離散ハッシュビットの最適化は、常にディープハッシュ方式に重点を置いている。これらの方法の一般的な戦略は、例えばアクティベーション関数を採用することである。 $\operatorname{sigmoid}(\cdot)$または$\operatorname{tanh}(\cdot)$は、近似離散値への量子化損失を最小限に抑える。しかし、このパラダイムは、ますます多くのハッシュビットを活性化関数の間違った飽和領域に閉じ込め、決して逃がすことはないかもしれない。この問題を "Dead Bits Problem~(DBP)" と呼ぶ。さらに、既存の量子化損失もDBPを増大させます。本稿では,DBPを緩和するアクティベーション関数の前に作用する,単純だが効果的な勾配増幅器を提案する。さらに、DBPをさらに軽減するためにエラー認識量子化損失を考案する。 2つの画像の類似性に基づいて量子化損失の負の効果を回避する。提案する勾配増幅器と誤り認識量子化損失は、様々なディープハッシュ法と互換性がある。 3つのデータセットの実験結果は、提案した勾配増幅器の効率と誤り認識量子化損失を示す。 Deep hashing methods have shown great retrieval accuracy and efficiency in large-scale image retrieval. How to optimize discrete hash bits is always the focus in deep hashing methods. A common strategy in these methods is to adopt an activation function, e.g. $\operatorname{sigmoid}(\cdot)$ or $\operatorname{tanh}(\cdot)$, and minimize a quantization loss to approximate discrete values. However, this paradigm may make more and more hash bits stuck into the wrong saturated area of the activation functions and never escaped. We call this problem "Dead Bits Problem~(DBP)". Besides, the existing quantization loss will aggravate DBP as well. In this paper, we propose a simple but effective gradient amplifier which acts before activation functions to alleviate DBP. Moreover, we devise an error-aware quantization loss to further alleviate DBP. It avoids the negative effect of quantization loss based on the similarity between two images. The proposed gradient amplifier and error-aware quantization loss are compatible with a variety of deep hashing methods. Experimental results on three datasets demonstrate the efficiency of the proposed gradient amplifier and the error-aware quantization loss.	翻訳日:2021-02-02 15:49:34 公開日:2021-02-01
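The "Dead Bits Problem" described above can be illustrated with a few lines of arithmetic. The sketch assumes a `tanh` activation and a quantization penalty of the form `(|tanh(z)| - 1)**2`, which is a common choice but not necessarily the loss used in the paper. The proposed gradient amplifier itself is not specified in the abstract and is not reproduced here.

```python
import math


def tanh_grad(z):
    """Derivative of tanh at pre-activation z; every upstream gradient is scaled by it."""
    h = math.tanh(z)
    return 1.0 - h * h


def quantization_loss_grad(z):
    """d/dz of (|tanh(z)| - 1)**2, an assumed quantization penalty."""
    h = math.tanh(z)
    sign = 1.0 if h >= 0 else -1.0
    return 2.0 * (abs(h) - 1.0) * sign * (1.0 - h * h)


# A bit whose correct value is +1 but whose pre-activation is negative.
for z in (-0.5, -6.0):
    print(f"z={z:+.1f}  tanh'={tanh_grad(z):.2e}  dLq/dz={quantization_loss_grad(z):+.2e}")
```

For negative `z` the penalty's gradient is positive, so a gradient-descent step pushes `z` further toward the wrong side, and once `z` is deep in the saturated region the factor `tanh'(z)` shrinks any corrective signal from the similarity loss to almost nothing.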
# 深層学習によるSentinel-1 SAR画像のスペックル低減のための複数時間情報公開 Exploiting multi-temporal information for improved speckle reduction of Sentinel-1 SAR images by deep learning ( http://arxiv.org/abs/2102.00682v1 ) ライセンス: Link先を確認	Emanuele Dalsasso, In\`es Meraoumia, Lo\"ic Denis, Florence Tupin	(参考訳) 深層学習によるSAR振幅画像のスペックル低減効果は前例がない。 SAR画像の多時間スタックの広範な利用は、さらにデノナイジングの品質を向上させることができる。本稿では,時間的情報を深部ニューラルネットワークに統合し,スペックル抑制を柔軟かつ効率的に行う方法を提案する。アーカイブは、SAR画像の長い時系列へのアクセスを提供し、そこから複数の時間平均をほとんどスペックル変動を伴わずに計算することができる。提案手法は,この多時間平均と特定の日付の画像を比画像の形で結合し,最新のニューラルネットワークを用いてこの比画像のスペックルを除去する。この単純な戦略は、マルチテンポラル平均を知らずに元の画像をフィルタリングするよりも顕著な改善をもたらすことが示される。 Deep learning approaches show unprecedented results for speckle reduction in SAR amplitude images. The wide availability of multi-temporal stacks of SAR images can improve even further the quality of denoising. In this paper, we propose a flexible yet efficient way to integrate temporal information into a deep neural network for speckle suppression. Archives provide access to long time-series of SAR images, from which multi-temporal averages can be computed with virtually no remaining speckle fluctuations. The proposed method combines this multi-temporal average and the image at a given date in the form of a ratio image and uses a state-of-the-art neural network to remove the speckle in this ratio image. This simple strategy is shown to offer a noticeable improvement compared to filtering the original image without knowledge of the multi-temporal average.	翻訳日:2021-02-02 15:48:59 公開日:2021-02-01
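The ratio-image construction mentioned above can be sketched on toy data. The example makes several assumptions that go beyond the abstract: images are small nested lists, speckle is modeled as unit-mean multiplicative exponential noise on intensity, and the multi-temporal average is a plain arithmetic mean. The neural network that filters the ratio image is not shown.

```python
import random


def temporal_mean(stack):
    """Pixel-wise arithmetic mean over a list of equally sized images (lists of rows)."""
    n_dates = len(stack)
    rows, cols = len(stack[0]), len(stack[0][0])
    return [
        [sum(img[r][c] for img in stack) / n_dates for c in range(cols)]
        for r in range(rows)
    ]


def ratio_image(image, mean_image, eps=1e-12):
    """Pixel-wise ratio between one date and the multi-temporal mean."""
    return [
        [pix / (avg + eps) for pix, avg in zip(img_row, avg_row)]
        for img_row, avg_row in zip(image, mean_image)
    ]


# Toy stack: constant reflectivity 4.0 corrupted by unit-mean multiplicative speckle.
rng = random.Random(0)
stack = [
    [[4.0 * rng.expovariate(1.0) for _ in range(8)] for _ in range(8)]
    for _ in range(50)
]
mean_img = temporal_mean(stack)
ratio = ratio_image(stack[0], mean_img)
```

In a pipeline of this kind, one would presumably filter `ratio` and multiply the result back by `mean_img` to obtain the restored image for that date; that last step is an assumption, not something the abstract states.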
# 自動運転のための地上認識モノラル3次元物体検出 Ground-aware Monocular 3D Object Detection for Autonomous Driving ( http://arxiv.org/abs/2102.00690v1 ) ライセンス: Link先を確認	Yuxuan Liu, Yuan Yixuan, Ming Liu	(参考訳) 単一のRGBカメラで環境中の物体の3D位置と方向を推定することは、低コストの都市自動運転と移動ロボットにとって重要で困難な作業です。既存のアルゴリズムのほとんどは、2D-3D対応における幾何学的制約に基づいており、これは一般的な6Dオブジェクトのポーズ推定に由来する。まず、地上の飛行機が運転シーンで3D検出で深度推論のさらなる手がかりを提供する方法を特定します。この観測に基づいて、3Dアンカーの処理を改善し、深層学習の枠組みにおいて、そのようなアプリケーション固有の先行を十分に活用する新しいニューラルネットワークモジュールを導入する。最後に,提案する3次元物体検出モジュールを組み込んだ効率的なニューラルネットワークを提案する。さらに,単眼深度予測用に設計されたニューラルネットワークを用いて,提案モジュールのパワーを検証した。提案した2つのネットワークは,KITTIの3次元オブジェクト検出と深度予測のベンチマークでそれぞれ最先端の性能を達成している。コードはhttps://www.github.com/Owen-Liuyuxuan/visualDet3Dで公開される。 Estimating the 3D position and orientation of objects in the environment with a single RGB camera is a critical and challenging task for low-cost urban autonomous driving and mobile robots. Most of the existing algorithms are based on the geometric constraints in 2D-3D correspondence, which stems from generic 6D object pose estimation. We first identify how the ground plane provides additional clues in depth reasoning in 3D detection in driving scenes. Based on this observation, we then improve the processing of 3D anchors and introduce a novel neural network module to fully utilize such application-specific priors in the framework of deep learning. Finally, we introduce an efficient neural network embedded with the proposed module for 3D object detection. We further verify the power of the proposed module with a neural network designed for monocular depth prediction. The two proposed networks achieve state-of-the-art performances on the KITTI 3D object detection and depth prediction benchmarks, respectively. The code will be published in https://www.github.com/Owen-Liuyuxuan/visualDet3D	翻訳日:2021-02-02 15:48:26 公開日:2021-02-01
# 深層学習によるSentinel-1 GRD画像の抽出と狭川セグメンテーションへの応用 Despeckling Sentinel-1 GRD images by deep learning and application to narrow river segmentation ( http://arxiv.org/abs/2102.00692v1 ) ライセンス: Link先を確認	Nicolas Gasnier, Emanuele Dalsasso, Lo\"ic Denis, Florence Tupin	(参考訳) 本稿では,最近提案されたSAR2SAR(自己監督型トレーニング戦略)に基づく,Sentinel-1 GRD画像の非スペックリング手法を提案する。 Sentinel 1 GRD画像のコレクション上のディープニューラルネットワークのトレーニングは、スペックルの空間変動空間相関に堅牢な脱スペックリングアルゴリズムにつながります。劣化した画像は狭い川のような構造物の検出を改善する。我々は,外因性情報に基づく検出器と線形特徴検出器を適用し,デスペックリングニューラルネットワークにより予め処理された画像に対して処理チェーンを適用する場合,河川のセグメンテーションが良好であることを示す。 This paper presents a despeckling method for Sentinel-1 GRD images based on the recently proposed framework "SAR2SAR": a self-supervised training strategy. Training the deep neural network on collections of Sentinel 1 GRD images leads to a despeckling algorithm that is robust to space-variant spatial correlations of speckle. Despeckled images improve the detection of structures like narrow rivers. We apply a detector based on exogenous information and a linear features detector and show that rivers are better segmented when the processing chain is applied to images pre-processed by our despeckling neural network.	翻訳日:2021-02-02 15:47:50 公開日:2021-02-01
# 粒子イメージング検出器のためのスケーラブル, エンドツーエンド, 深層学習に基づくデータ再構築チェーン Scalable, End-to-End, Deep-Learning-Based Data Reconstruction Chain for Particle Imaging Detectors ( http://arxiv.org/abs/2102.01033v1 ) ライセンス: Link先を確認	Francois Drielsma, Kazuhiro Terao, Laura Domin\'e, Dae Heun Koh	(参考訳) コンピュータビジョン(CV)と機械学習(ML)の最近の進歩は、粒子イメージング検出器データの分析に新しいアプローチを動機づけています。孤立CVタスクに取り組む従来の取り組みとは違って,ニュートリノ物理の強度フロンティアにおける高精度撮像技術であるLiquid Argon Time Projection Chambers (LArTPCs) のための,エンドツーエンドのMLベースのデータ再構成チェーンを導入する。このチェーンは、スパース畳み込みニューラルネットワークを用いたボクセルレベルの特徴抽出とグラフニューラルネットワークを用いた粒子超構造形成を組み合わせたマルチタスクネットワークカスケードである。各アルゴリズムは物理による誘導バイアスを組み込んでおり、その集団階層は因果構造を強制するために使用される。出力は、高レベルの物理推論に使用できるイベントの包括的な説明です。このチェーンはエンドツーエンドで最適化可能であり、時間を要する手動のソフトウェア調整は不要である。また、Deep Underground Neutrino Experimentの3D画像LArTPCで期待される数十の高エネルギーニュートリノ相互作用のこれまでにない蓄積を処理する最初の実装です。チェーン全体がトレーニングされ、そのパフォーマンスはオープンシミュレーションデータセットを使用して各ステップで評価される。 Recent inroads in Computer Vision (CV) and Machine Learning (ML) have motivated a new approach to the analysis of particle imaging detector data. Unlike previous efforts which tackled isolated CV tasks, this paper introduces an end-to-end, ML-based data reconstruction chain for Liquid Argon Time Projection Chambers (LArTPCs), the state-of-the-art in precision imaging at the intensity frontier of neutrino physics. The chain is a multi-task network cascade which combines voxel-level feature extraction using Sparse Convolutional Neural Networks and particle superstructure formation using Graph Neural Networks. Each algorithm incorporates physics-informed inductive biases, while their collective hierarchy is used to enforce a causal structure. The output is a comprehensive description of an event that may be used for high-level physics inference. The chain is end-to-end optimizable, eliminating the need for time-intensive manual software adjustments. 
It is also the first implementation to handle the unprecedented pile-up of dozens of high energy neutrino interactions, expected in the 3D-imaging LArTPC of the Deep Underground Neutrino Experiment. The chain is trained as a whole and its performance is assessed at each step using an open simulated data set.	翻訳日:2021-02-02 15:47:19 公開日:2021-02-01
# 雑音二元系ニューラルネットワークにおける情報収縮とその意義 Information contraction in noisy binary neural networks and its implications ( http://arxiv.org/abs/2101.11750v2 ) ライセンス: Link先を確認	Chuteng Zhou, Quntao Zhuang, Matthew Mattina, Paul N. Whatmough	(参考訳) ニューラルネットワークは、大規模画像分類、オブジェクト検出、自然言語処理タスクにおいて最先端のパフォーマンスを達成する機械学習モデルとして重要になっている。本稿では、各ニューロンが不正確な出力を生じる確率がゼロでないノイズの多いバイナリニューラルネットワークについて検討する。これらの騒がしいモデルは、生物学的、物理的、電子的な文脈から生じ、物理的世界に関連する重要な種類のモデルを構成する。直感的には、そのようなシステムのニューロン数は、同じレベルの表現力と計算信頼性を維持しながらノイズを補うために増加する必要がある。私たちの重要な発見は、ノイズの多いニューラルネットワークの必要な数のニューロンの境界が低くなっていることです。この下限を証明するために、我々は情報理論のアプローチを採用し、二進対称チャネルに対するエバンス・シュルマンの結果を一般チャネルに一般化するだけでなく、ネットワークにおけるエンドツーエンドの情報収縮を推定する際のタイツネスを大幅に改善する、新しい強データ処理不等式(SDPI)を得る。我々のSDPIは、ニューラルネットワークやセルオートマトンなど、さまざまな情報処理システムに適用できる。ノイズのないニューラルネットワークに対する理解とは大きく異なるノイズの多いニューラルネットワークに対して,SDPIを雑音の多いバイナリニューラルネットワークに適用し,その鍵となる下位境界を求め,その影響をネットワークの深さ幅トレードオフに適用することを提案する。さらに、SDPIを適用してフォールトトレラント細胞オートマトンを研究し、エラー訂正オーバーヘッドと緩和時間の境界を得る。本稿では,情報理論のレンズを通して,雑音情報処理システムの新たな理解を提供する。 Neural networks have gained importance as the machine learning models that achieve state-of-the-art performance on large-scale image classification, object detection and natural language processing tasks. In this paper, we consider noisy binary neural networks, where each neuron has a non-zero probability of producing an incorrect output. These noisy models may arise from biological, physical and electronic contexts and constitute an important class of models that are relevant to the physical world. Intuitively, the number of neurons in such systems has to grow to compensate for the noise while maintaining the same level of expressive power and computation reliability. Our key finding is a lower bound for the required number of neurons in noisy neural networks, which is first of its kind. 
To prove this lower bound, we take an information-theoretic approach and obtain a novel strong data processing inequality (SDPI), which not only generalizes the Evans-Schulman results for binary symmetric channels to general channels, but also drastically improves the tightness when applied to estimate end-to-end information contraction in networks. Our SDPI can be applied to various information processing systems, including neural networks and cellular automata. Applying the SDPI to noisy binary neural networks, we obtain our key lower bound and investigate its implications for network depth-width trade-offs. Our results suggest a depth-width trade-off for noisy neural networks that is very different from the established understanding of noiseless neural networks. Furthermore, we apply the SDPI to study fault-tolerant cellular automata and obtain bounds on the error correction overheads and the relaxation time. This paper offers a new understanding of noisy information processing systems through the lens of information theory.	翻訳日:2021-02-02 15:46:35 公開日:2021-02-01
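For readers unfamiliar with the Evans-Schulman result mentioned above, the sketch below checks it numerically in the simplest setting: a bit `X` is sent through a binary symmetric channel BSC(`delta`) to give `Y`, and `Y` is sent through BSC(`eps`) to give `Z`. The classical statement is that `I(X;Z) <= (1 - 2*eps)**2 * I(X;Y)`. This only illustrates the binary symmetric case, not the generalized SDPI derived in the paper, and the variable names are chosen for the example.

```python
import math


def h2(p):
    """Binary entropy in bits."""
    if p in (0.0, 1.0):
        return 0.0
    return -p * math.log2(p) - (1.0 - p) * math.log2(1.0 - p)


def flip(p, eps):
    """Probability of a 1 after sending a Bernoulli(p) bit through BSC(eps)."""
    return p * (1.0 - eps) + (1.0 - p) * eps


def mi_through_bsc(p, crossover):
    """I(X; Y) in bits for X ~ Bernoulli(p) and Y = X passed through BSC(crossover)."""
    return h2(flip(p, crossover)) - h2(crossover)


p, delta, eps = 0.3, 0.1, 0.2
i_xy = mi_through_bsc(p, delta)
# Two cascaded BSCs act like a single BSC with the composed crossover probability.
i_xz = mi_through_bsc(p, flip(delta, eps))
bound = (1.0 - 2.0 * eps) ** 2 * i_xy
print(f"I(X;Y)={i_xy:.4f}  I(X;Z)={i_xz:.4f}  bound={bound:.4f}")
```

Because every noisy layer multiplies the available information by a factor below one, the information that reaches the output decays geometrically with depth, which is the intuition behind needing more neurons to compensate for noise.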
# 一般化非定常バンディット Generalized non-stationary bandits ( http://arxiv.org/abs/2102.00725v1 ) ライセンス: Link先を確認	Anne Gael Manegueu, Alexandra Carpentier and Yi Yu	(参考訳) 本稿では,スイッチングバンドイット問題を一般化する非定常確率バンドイット問題について検討する。スイッチングバンドイット問題(\textbf{Case a})に加えて、我々は3つの具体的な例に興味を持っている: (\textbf{b}) 腕の手段は局所多項式であり、 (\textbf{c}) 腕の手段は局所的に滑らかであり、 (\textbf{d}) 腕の隙間は束縛された数の屈曲点を持ち、そこでは最も高い腕の平均は短い範囲であまり変化しない。これらの3つの設定は非常に異なるが、共通する点がある: (i) ギャップの対数の同様の大きさのレベル集合の数を制御でき、 (ii) 最高平均は急な変更の数に制限があり、それ以外は変化が限られている。この一般的な設定では、特に4つの問題 (a)-(d) を効率的かつ統一的に解く1つのアルゴリズムを提案する。 In this paper, we study a non-stationary stochastic bandit problem, which generalizes the switching bandit problem. On top of the switching bandit problem (\textbf{Case a}), we are interested in three concrete examples: (\textbf{b}) the means of the arms are local polynomials, (\textbf{c}) the means of the arms are locally smooth, and (\textbf{d}) the gaps of the arms have a bounded number of inflexion points and where the highest arm mean cannot vary too much in a short range. These three settings are very different, but have in common the following: (i) the number of similarly-sized level sets of the logarithm of the gaps can be controlled, and (ii) the highest mean has a limited number of abrupt changes, and otherwise has limited variations. We propose a single algorithm in this general setting, that in particular solves in an efficient and unified way the four problems (a)-(d) mentioned.	翻訳日:2021-02-02 15:40:13 公開日:2021-02-01
# CTスキャンによるCOVID-19診断のためのフェーショット学習 Few-shot Learning for CT Scan based COVID-19 Diagnosis ( http://arxiv.org/abs/2102.00596v1 ) ライセンス: Link先を確認	Yifan Jiang, Han Chen, David K. Han, Hanseok Ko	(参考訳) 2019年コロナウイルス(COVID-19)は、188の国と地域で4000万人以上の人々が感染する国際懸念の公衆衛生緊急事態です。胸部CT(Chest Computed Tomography)イメージング技術は、高い診断精度と堅牢性により、新型コロナウイルスの大量検査に不可欠な方法となっています。近年,深層学習は医用画像の自動スクリーニングに有効なツールとなり,新型コロナウイルスの診断にも利用されている。しかし、COVID-19に関連する高い感染リスクは、収集されたラベル付きデータの相対的な滞留をもたらし、そのような方法のパフォーマンスを制限します。さらに、CT画像を正確にラベル付けするには、放射線医の専門知識が必要です。以上の課題に対処するために,少量のラベル付きCTスキャンが利用可能である場合にのみ効果的に機能する,教師付きドメイン適応型COVID-19 CT診断法を提案する。本提案手法は、ラベル付きデータのばらつきを補うために、大量の合成COVID-19 CT画像を利用して、ソースドメイン(合成データ)からターゲットドメイン(実データ)までのネットワークをクロスドメイントレーニング機構で調整する。実験の結果, 新型ct画像診断による診断作業において, 最先端のパフォーマンスが得られた。 Coronavirus disease 2019 (COVID-19) is a Public Health Emergency of International Concern infecting more than 40 million people across 188 countries and territories. Chest computed tomography (CT) imaging technique benefits from its high diagnostic accuracy and robustness, it has become an indispensable way for COVID-19 mass testing. Recently, deep learning approaches have become an effective tool for automatic screening of medical images, and it is also being considered for COVID-19 diagnosis. However, the high infection risk involved with COVID-19 leads to relative sparseness of collected labeled data limiting the performance of such methodologies. Moreover, accurately labeling CT images require expertise of radiologists making the process expensive and time-consuming. In order to tackle the above issues, we propose a supervised domain adaption based COVID-19 CT diagnostic method which can perform effectively when only a small samples of labeled CT scans are available. 
To compensate for the sparseness of labeled data, the proposed method utilizes a large number of synthetic COVID-19 CT images and adjusts the networks from the source domain (synthetic data) to the target domain (real data) with a cross-domain training mechanism. Experimental results show that the proposed method achieves state-of-the-art performance on few-shot, CT-imaging-based COVID-19 diagnostic tasks.	翻訳日:2021-02-02 15:34:41 公開日:2021-02-01
# CRPS学習 CRPS Learning ( http://arxiv.org/abs/2102.00968v1 ) ライセンス: Link先を確認	Jonathan Berrisch, Florian Ziel	(参考訳) 組み合わせと集約技術は予測精度を大幅に向上させることができる。これはまた、完全な予測分布が組み合わさった確率予測手法にも当てはまる。ベイズモデル平均化(BMA)のような時間変化および適応的な重み付けスキームはいくつか存在する。しかし、異なる予測器の性能は時間とともに異なるだけでなく、分布の一部にも異なる可能性がある。したがって、分布の中心ではより正確なものがあり、他のものは分布の尾部を予測するのに優れている。その結果、時間と分布の異なる性能を考慮に入れた新たな重み付け手法が導入された。本稿では,連続ランク付き確率スコア(crps)に対して最適化するポイントワイズオンラインアグリゲーションアルゴリズムについて検討する。完全適応的ベルンシュタインオンラインアグリゲーション(BOA)法の理論的性質を解析した後,ポイントワイズCRPS学習のためのスムースな手順を導入する。特性はシミュレーション研究によって確認され、議論されます。さらに, 炭素市場に関する予測研究において, その性能について概説する。詳細は、欧州の排出量許容価格の分布を予測する。 Combination and aggregation techniques can improve forecast accuracy substantially. This also holds for probabilistic forecasting methods where full predictive distributions are combined. There are several time-varying and adaptive weighting schemes like Bayesian model averaging (BMA). However, the performance of different forecasters may vary not only over time but also in parts of the distribution. So one may be more accurate in the center of the distributions, and other ones perform better in predicting the distribution's tails. Consequently, we introduce a new weighting procedure that considers both varying performance across time and the distribution. We discuss pointwise online aggregation algorithms that optimize with respect to the continuous ranked probability score (CRPS). After analyzing the theoretical properties of a fully adaptive Bernstein online aggregation (BOA) method, we introduce smoothing procedures for pointwise CRPS learning. The properties are confirmed and discussed using simulation studies. Additionally, we illustrate the performance in a forecasting study for carbon markets. In detail, we predict the distribution of European emission allowance prices.	翻訳日:2021-02-02 15:32:01 公開日:2021-02-01
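As background for the entry above, the sketch below shows two standard building blocks: a sample-based estimate of the continuous ranked probability score (CRPS) and the pinball (quantile) loss, whose average over quantile levels is proportional to the CRPS. The `combine_quantiles` helper is a hypothetical illustration of pointwise combination, with a separate weight vector for each quantile level. It is not the authors' BOA-based algorithm, and nothing here enforces that the combined quantiles stay ordered.

```python
def crps_ensemble(samples, y):
    """Sample-based CRPS estimate: E|X - y| - 0.5 * E|X - X'|."""
    n = len(samples)
    term_obs = sum(abs(x - y) for x in samples) / n
    term_spread = sum(abs(a - b) for a in samples for b in samples) / (n * n)
    return term_obs - 0.5 * term_spread


def pinball(q, y, tau):
    """Pinball loss of the quantile forecast q at level tau for observation y."""
    return (tau - (1.0 if y < q else 0.0)) * (y - q)


def combine_quantiles(expert_quantiles, weights):
    """Pointwise combination of quantile forecasts.

    expert_quantiles[k][i] is expert k's forecast at quantile level i;
    weights[i][k] is the weight given to expert k at that level.
    """
    return [
        sum(w * expert_quantiles[k][i] for k, w in enumerate(level_weights))
        for i, level_weights in enumerate(weights)
    ]


# Expert 0 is trusted in the lower tail, expert 1 in the upper tail.
experts = [[1.0, 2.0, 3.0], [3.0, 4.0, 5.0]]
weights = [[1.0, 0.0], [0.5, 0.5], [0.0, 1.0]]
print(combine_quantiles(experts, weights))
```

Letting the weights vary by quantile level is what allows one forecaster to dominate in the center of the distribution and another in the tails, as the abstract describes.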
# (参考訳) Twice Mixing: 水中画像強調のためのランク学習に基づく品質評価手法 Twice Mixing: A Rank Learning based Quality Assessment Approach for Underwater Image Enhancement ( http://arxiv.org/abs/2102.00670v1 ) ライセンス: CC BY 4.0	Zhenqi Fu, Xueyang Fu, Yue Huang, and Xinghao Ding	(参考訳) 水中画像の品質を向上させるために、過去数年間にさまざまな種類の水中画像強化(UIE)オペレータが提案されています。しかし、効果的な客観的評価方法の欠如はUIE技術のさらなる発展を制限します。本稿では,新しいランク学習による無基準品質評価法を提案する。 2回混合と呼ばれるこのアプローチは、高品質な画像と低品質の画像を混ぜることで、中間品質の画像が生成されるという観察によって動機付けられたものです。典型的な混合アルゴリズムは、与えられた入力データのペアを線形に補間する。しかし,人間の視覚系は画像処理において一様でなく非線形である。そこで,これらの混合画像と,それらの絶対スコアを線形結合で計算した深層ニューラルネットワークを直接学習する代わりに,シアムネットワークを訓練し,それらの品質ランキングを学ぶことを提案する。 2回混合は精巧に定式化された自己スーパービジョン機構に基づいて訓練される。具体的には、各イテレーションの前に、仮想画像の生成とネットワークトレーニングの誘導の両方に使用される2つの混合比をランダムに生成する。テストフェーズでは、ネットワークの単一のブランチを抽出し、異なるUIE出力の品質ランキングを予測します。我々は,合成データと実世界のデータセットの両方について広範な実験を行う。実験の結果,提案手法が従来の手法を大きく上回ることがわかった。 To improve the quality of underwater images, various kinds of underwater image enhancement (UIE) operators have been proposed during the past few years. However, the lack of effective objective evaluation methods limits the further development of UIE techniques. In this paper, we propose a novel rank learning guided no-reference quality assessment method for UIE. Our approach, termed Twice Mixing, is motivated by the observation that a mid-quality image can be generated by mixing a high-quality image with its low-quality version. Typical mixup algorithms linearly interpolate a given pair of input data. However, the human visual system is non-uniformity and non-linear in processing images. Therefore, instead of directly training a deep neural network based on the mixed images and their absolute scores calculated by linear combinations, we propose to train a Siamese Network to learn their quality rankings. Twice Mixing is trained based on an elaborately formulated self-supervision mechanism. 
Specifically, before each iteration, we randomly generate two mixing ratios which will be employed for both generating virtual images and guiding the network training. In the test phase, a single branch of the network is extracted to predict the quality rankings of different UIE outputs. We conduct extensive experiments on both synthetic and real-world datasets. Experimental results demonstrate that our approach outperforms the previous methods significantly.	翻訳日:2021-02-02 15:31:16 公開日:2021-02-01
# (参考訳) 明示的共通知識を用いた事前配置問題の再検討 Revisiting the Prepositional-Phrase Attachment Problem Using Explicit Commonsense Knowledge ( http://arxiv.org/abs/2102.00924v1 ) ライセンス: CC BY 4.0	Yida Xin, Henry Lieberman and Peter Chin	(参考訳) PP(Prepositional-phrase)アタッチメントの曖昧さを解決するという課題を再考する。現在提案されている解はルールベースであり、明示的な文法規則はあいまいさの解決方法を指示する; あるいは、ラベル付き例のコーパスから決定が学習される統計的手法である。明示的なコモンセンス知識ベースは、適切なアタッチメント決定を行う上で必須の要素となる。 Patch-Commと呼ばれるモジュールを実装し、さまざまな従来のパーサーがアタッチメントの決定を行えるようにしました。 Commonsense KBが直接的な回答を提供しない場合には、一部のNLPシステムが語彙外単語を処理するのと同様の方法で「知識外ベース」アサーションを推論するより一般的なシステムに戻ります。以上の結果から,コモンセンス知識ベースアプローチは,ルールベースと統計技術の統合により,両世界のベストを発揮できることが示唆された。 AIにおける説明可能性の重要性がますます認識される中、NLP開発者はシステムの振る舞いをよりよく理解し、エンドユーザとの自然な対話を促進することができる。 We revisit the challenging problem of resolving prepositional-phrase (PP) attachment ambiguity. To date, proposed solutions are either rule-based, where explicit grammar rules direct how to resolve ambiguities; or statistical, where the decision is learned from a corpus of labeled examples. We argue that explicit commonsense knowledge bases can provide an essential ingredient for making good attachment decisions. We implemented a module, named Patch-Comm, that can be used by a variety of conventional parsers, to make attachment decisions. Where the commonsense KB does not provide direct answers, we fall back on a more general system that infers "out-of-knowledge-base" assertions in a manner similar to the way some NLP systems handle out-of-vocabulary words. Our results suggest that the commonsense knowledge-based approach can provide the best of both worlds, integrating rule-based and statistical techniques. As the field is increasingly coming to recognize the importance of explainability in AI, a commonsense approach can enable NLP developers to better understand the behavior of systems, and facilitate natural dialogues with end users.	翻訳日:2021-02-02 15:13:54 公開日:2021-02-01

# 量子アニーリングによる正方形格子の信号最適化

Traffic Signal Optimization on a Square Lattice with Quantum Annealing ( http://arxiv.org/abs/2003.07527v2 )

ライセンス: Link先を確認

Daisuke Inoue, Akihisa Okada, Tadayoshi Matsumori, Kazuyuki Aihara, Hiroaki Yoshida

(参考訳) 都市部におけるインテリジェント交通システムの普及は計算負荷の増大を引き起こし、大規模交通を管理するための新しいアーキテクチャを必要としている。本研究では,量子アニーリングマシンであるD-Wave量子アニーラーを用いて,正方形格子上に配置した交通信号を大域的に制御する手法を開発した。まず2つの直交方向における交通流の不均衡を最小限に抑える信号最適化問題を定式化する。次に、この問題を量子アニーラーと完全に互換性のあるイジングハミルトニアンとして再定式化する。新たな制御法を50×50の大規模都市において従来の局所制御法と比較し, 広いパラメータ範囲で交通不均衡を抑制する上で大域的制御法が優れていることを示す。さらに, 量子アニーリングマシンで得られた大域的制御法の解は, 従来のシミュレーテッドアニーリングで得られた解よりも優れている。加えて, 車両の旋回と直進の確率が等しい極限において局所制御法と大域的制御法が一致することを解析的に証明した。これらの結果は数値実験によって検証される。

The spread of intelligent transportation systems in urban cities has caused heavy computational loads, requiring a novel architecture for managing large-scale traffic. In this study, we develop a method for globally controlling traffic signals arranged on a square lattice by means of a quantum annealing machine, namely the D-Wave quantum annealer. We first formulate a signal optimization problem that minimizes the imbalance of traffic flows in two orthogonal directions. Then we reformulate this problem as an Ising Hamiltonian, which is fully compatible with quantum annealers. The new control method is compared with a conventional local control method for a large 50-by-50 city, and the results exhibit the superiority of our global control method in suppressing traffic imbalance over wide parameter ranges. Furthermore, the solutions to the global control method obtained with the quantum annealing machine are better than those obtained with conventional simulated annealing. In addition, we prove analytically that the local and the global control methods converge at the limit where cars have equal probabilities for turning and going straight. These results are verified with numerical experiments.

翻訳日:2023-05-28 22:18:58 公開日:2021-02-01
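
The abstract above says the signal optimization problem is reformulated as an Ising Hamiltonian. As a rough illustration of what minimizing such a Hamiltonian means, the following sketch evaluates a generic Ising energy and finds its ground state by brute force. The couplings, fields, and the 2x2 lattice are hypothetical placeholders chosen for this example; they are not the Hamiltonian derived in the paper, and a quantum annealer or simulated annealing would replace the exhaustive search at realistic sizes.

```python
import itertools

def ising_energy(spins, couplings, fields):
    """H(s) = -sum_{(i,j)} J_ij s_i s_j - sum_i h_i s_i, with s_i in {-1, +1}."""
    pair_term = sum(J * spins[i] * spins[j] for (i, j), J in couplings.items())
    field_term = sum(h * s for h, s in zip(fields, spins))
    return -pair_term - field_term

def brute_force_ground_state(n, couplings, fields):
    """Enumerate all 2**n spin configurations (only feasible for tiny n)."""
    best = min(itertools.product((-1, 1), repeat=n),
               key=lambda s: ising_energy(s, couplings, fields))
    return best, ising_energy(best, couplings, fields)

# Hypothetical 2x2 lattice of signals, indexed 0..3 row by row.
# Each spin stands for one signal state (e.g. +1 = north-south green).
couplings = {(0, 1): 1.0, (2, 3): 1.0, (0, 2): 1.0, (1, 3): 1.0}
fields = [0.5, 0.0, 0.0, 0.0]
state, energy = brute_force_ground_state(4, couplings, fields)
print(state, energy)
```

With these illustrative ferromagnetic couplings, the small field on signal 0 breaks the tie between the two aligned configurations.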

# 検出効率ミスマッチを用いた実用的な量子鍵分布のセキュリティ証明

Security proof of practical quantum key distribution with detection-efficiency mismatch ( http://arxiv.org/abs/2004.04383v2 )

ライセンス: Link先を確認

Yanbao Zhang, Patrick J. Coles, Adam Winick, Jie Lin, and Norbert Lutkenhaus

(参考訳) しきい値検出器を用いた量子鍵分布(QKD)プロトコルは、高性能QKD実証を駆動している。対応するセキュリティ証明は通常、すべての物理検出器が同じ検出効率を持つと仮定する。しかし、実際に使用される検出器の効率は、これらの検出器の製造とセットアップによってミスマッチを示す可能性がある。ミスマッチは、受信信号の異なる空間-時間モードが検出器と異なる結合を持つ可能性があるためにも引き起こされる。本稿では,通常の仮定を伴わずにセキュリティ証明を提供する手法を開発した。本手法は,敵の攻撃戦略を制限することなく,検出効率のミスマッチを考慮に入れることができる。特に、我々のセキュリティ証明が実際の状況に直接適用されるように、入ってくる信号の光子数のカットオフは一切頼らない。本稿では,偏光符号化用に設計され,複数の時空間モードに敏感な受信機について述べる。検出器モデルでは、任意の空間-時間モード間の量子干渉の欠如を仮定する。この検出器モデルを用いたQKDプロトコルでは、効率のミスマッチを特徴とし、光子数のカットオフ仮定なしでセキュリティ証明を行うことができる。また, 本手法では, 検知モデルの効率的ミスマッチがなければ, 検出非効率による損失が敵の制御の外にあると仮定した場合に, 鍵レートが増加することを示した。

Quantum key distribution (QKD) protocols with threshold detectors are driving high-performance QKD demonstrations. The corresponding security proofs usually assume that all physical detectors have the same detection efficiency. However, the efficiencies of the detectors used in practice might show a mismatch depending on the manufacturing and setup of these detectors. A mismatch can also be induced as the different spatial-temporal modes of an incoming signal might couple differently to a detector. Here we develop a method that allows us to provide security proofs without the usual assumption. Our method can take the detection-efficiency mismatch into account without having to restrict the attack strategy of the adversary. In particular, we do not rely on any photon-number cut-off of incoming signals, so that our security proof is directly applicable to practical situations. We illustrate our method for a receiver that is designed for polarization encoding and is sensitive to a number of spatial-temporal modes. In our detector model, the absence of quantum interference between any pair of spatial-temporal modes is assumed. For a QKD protocol with this detector model, we can perform a security proof with characterized efficiency mismatch and without photon-number cut-off assumption. Our method also shows that in the absence of efficiency mismatch in our detector model, the key rate increases if the loss due to detection inefficiency is assumed to be outside of the adversary's control, as compared to the view where for a security proof this loss is attributed to the action of the adversary.

翻訳日:2023-05-25 08:52:35 公開日:2021-02-01

# 量子分割関数近似のための効率的なアルゴリズム

Efficient Algorithms for Approximating Quantum Partition Functions ( http://arxiv.org/abs/2004.11568v2 )

ライセンス: Link先を確認

Ryan L. Mann, Tyler Helmuth

(参考訳) 高温における量子スピンモデルの分配関数に対する多項式時間近似アルゴリズムを確立する。このアルゴリズムは、NetočnýとRedigの量子クラスター展開と、Helmuth, Perkins, Regtsによるクラスター展開に基づくアルゴリズム設計手法に基づいている。同様の結果は関連する手法によって以前にも得られており、有界次数グラフ上のペアワイズ相互作用の場合の単純かつわずかにシャープな解析が主な貢献である。

We establish a polynomial-time approximation algorithm for partition functions of quantum spin models at high temperature. Our algorithm is based on the quantum cluster expansion of Netočný and Redig and the cluster expansion approach to designing algorithms due to Helmuth, Perkins, and Regts. Similar results have previously been obtained by related methods, and our main contribution is a simple and slightly sharper analysis for the case of pairwise interactions on bounded-degree graphs.

翻訳日:2023-05-22 06:22:25 公開日:2021-02-01

# 拡張不可能な積基底、有界絡み状態、および範囲基準

Unextendible product bases, bound entangled states, and the range criterion ( http://arxiv.org/abs/2005.02108v3 )

ライセンス: Link先を確認

Pratapaditya Bej, Saronath Halder

(参考訳) 非拡張積基底 (unextendible product basis, UPB) は、与えられたヒルベルト空間の部分空間を張る直交積状態の集合であり、その相補部分空間は積状態を含まない。これらの積基底は有界絡み状態(BE)を生成するのに有用である。本研究では、最小ランクのBE状態を生成することができる最大サイズの可約および既約UPBを考える。可約なUPBからは、測定後の状態の直交性を保ったまま、1つ以上の状態を局所的に除去することができる。一方、既約UPBの場合、これは不可能である。特に、現在のサイズのUPBは、範囲基準を満たす多様なランクのBE状態を生成するのに役立つ可能性があるため、重要である。ここではそのようなBE状態について述べる。また、他の種類のBE状態を提供し、状態の特定の特性を分析する。現在のBE状態のいくつかはタイル構造と関連している。さらに, 最小ランクのBE状態に対応する異なるUPBを提供し, UPBの重要な性質について議論する。

An unextendible product basis (UPB) is a set of orthogonal product states which span a subspace of a given Hilbert space while the complementary subspace contains no product state. These product bases are useful to produce bound entangled (BE) states. In this work we consider reducible and irreducible UPBs of maximum size, which can produce BE states of minimum rank. From a reducible UPB, it is possible to eliminate one or more states locally, keeping the post-measurement states orthogonal. On the other hand, for an irreducible UPB, the above is not possible. Particularly, the UPBs of the present size are important as they might be useful to produce BE states, having ranks of the widest variety, which satisfy the range criterion. Here we talk about such BE states. We also provide other types of BE states and analyze certain properties of the states. Some of the present BE states are associated with the tile structures. Furthermore, we provide different UPBs corresponding to the present BE states of minimum rank and discuss important properties of the UPBs.

翻訳日:2023-05-21 03:01:06 公開日:2021-02-01
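
The definition of a UPB in the abstract above starts from a set of mutually orthogonal product states. As a small sanity check of that first property, the sketch below builds the well-known five-state "Tiles" set in a 3x3-dimensional system (used here only as a familiar example, not as one of the bases constructed in this paper) and verifies pairwise orthogonality. It works with unnormalized integer vectors and does not test unextendibility, which would require showing that no product state lies in the complementary subspace.

```python
import itertools

def tensor(a, b):
    """Kronecker product of two real vectors given as lists."""
    return [x * y for x in a for y in b]

def inner(u, v):
    """Real inner product of two vectors of equal length."""
    return sum(x * y for x, y in zip(u, v))

# Unnormalized local vectors in the basis |0>, |1>, |2>.
e0, e1, e2 = [1, 0, 0], [0, 1, 0], [0, 0, 1]
d01 = [1, -1, 0]   # |0> - |1>
d12 = [0, 1, -1]   # |1> - |2>
ones = [1, 1, 1]   # |0> + |1> + |2>

# The five "Tiles" product states (normalization omitted).
tiles = [
    tensor(e0, d01),
    tensor(e2, d12),
    tensor(d01, e2),
    tensor(d12, e0),
    tensor(ones, ones),
]

pairs_orthogonal = all(inner(u, v) == 0
                       for u, v in itertools.combinations(tiles, 2))
print(pairs_orthogonal)
```

Five orthogonal states in a nine-dimensional space leave a four-dimensional complementary subspace, which is where the associated bound entangled state is supported.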

# クリフォード階層におけるコスト最適単一量子ゲート合成

Cost-optimal single-qubit gate synthesis in the Clifford hierarchy ( http://arxiv.org/abs/2005.05581v3 )

ライセンス: Link先を確認

Gary J. Mooney, Charles D. Hill and Lloyd C. L. Hollenberg

(参考訳) 普遍的な量子計算の実用的な実装に向けて克服すべき大きな課題は、フォールトトレラントな量子情報処理に必要な大量のリソースである。重要な側面は、量子誤り訂正符号内の論理ゲートから構築された任意のユニタリ演算子を実装することである。合成アルゴリズムは、量子誤り訂正符号で符号化されながらフォールトトレラントに実行可能な小さなユニバーサルゲートのセットから選択された論理ゲートのシーケンスを組み立てることで、任意の精度までユニタリゲートを近似することができる。しかし、現在の手順はまだ基本ゲートコストの個別割り当てをサポートしておらず、多くはユニバーサルベースゲートの拡張セットをサポートしていない。基本ゲートの正準的なClifford+$T$集合に対して、Dijkstraの経路探索アルゴリズムに基づく全数探索を用いてコスト最適なシーケンスを解析し、クリフォード階層のより高い次数からの$Z$回転を追加した場合と比較した。基本ゲートコストの割り当てには2つのアプローチを用いた。第一に、$Z$回転触媒回路を再帰的に適用することにより、コストを$T$カウントに換算した。第二に、ゲートを直接蒸留してフォールトトレラントに実装するのに必要な生の(すなわち物理レベルの)マジック状態の平均数としてコストを割り当てた。その結果、平均シーケンスコストは、$Z$回転触媒回路のアプローチでは最大$54\pm 3\%$、マジック状態蒸留のアプローチでは最大$33\pm 2\%$減少することがわかった。さらに、ランダムなターゲットゲートを近似するシーケンス内に現れるクリフォード階層の高次からの$Z$回転ゲートの集合の割合を推定する解析モデルを開発することにより、基本ゲートコストの特定の割り当てで観測された限界について検討した。

For universal quantum computation, a major challenge to overcome for practical implementation is the large amount of resources required for fault-tolerant quantum information processing. An important aspect is implementing arbitrary unitary operators built from logical gates within the quantum error correction code. A synthesis algorithm can be used to approximate any unitary gate up to arbitrary precision by assembling sequences of logical gates chosen from a small set of universal gates that are fault-tolerantly performable while encoded in a quantum error-correction code. However, current procedures do not yet support individual assignment of base gate costs and many do not support extended sets of universal base gates. We analysed cost-optimal sequences using an exhaustive search based on Dijkstra's pathfinding algorithm for the canonical Clifford+$T$ set of base gates and compared them to when additionally including $Z$-rotations from higher orders of the Clifford hierarchy. Two approaches of assigning base gate costs were used. First, costs were reduced to $T$-counts by recursively applying a $Z$-rotation catalyst circuit. Second, costs were assigned as the average numbers of raw (i.e. physical level) magic states required to directly distil and implement the gates fault-tolerantly. We found that the average sequence cost decreases by up to $54\pm 3\%$ when using the $Z$-rotation catalyst circuit approach and by up to $33\pm 2 \%$ when using the magic state distillation approach. In addition, we investigated observed limitations of certain assignments of base gate costs by developing an analytic model to estimate the proportion of sets of $Z$-rotation gates from higher orders of the Clifford hierarchy that are found within sequences approximating random target gates.

翻訳日:2023-05-20 11:59:50 公開日:2021-02-01
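
The abstract above describes a Dijkstra-based search for cost-optimal gate sequences under assigned base gate costs. The toy sketch below shows the idea on a deliberately tiny problem: only diagonal $Z$-rotations by multiples of $\pi/4$ are considered, so a "state" is just an integer k mod 8 and applying a gate adds to k. The gate set and the cost assignment (T costs 1, the Clifford gates S and Z cost 0, i.e. a T-count) are assumptions made for this example; the paper's search runs over full single-qubit unitaries with its own cost models.

```python
import heapq

def cheapest_sequence(target, gates, modulus=8):
    """Dijkstra over Z-rotation angles k*pi/4 (k mod modulus), starting at k = 0.

    gates maps a gate name to (angle step, cost). Returns (total cost, sequence).
    """
    best = {0: (0, [])}
    queue = [(0, 0, [])]
    while queue:
        cost, k, seq = heapq.heappop(queue)
        if k == target:
            return cost, seq
        if cost > best[k][0]:
            continue  # stale queue entry
        for name, (step, gate_cost) in gates.items():
            nk, ncost = (k + step) % modulus, cost + gate_cost
            if nk not in best or ncost < best[nk][0]:
                best[nk] = (ncost, seq + [name])
                heapq.heappush(queue, (ncost, nk, seq + [name]))
    return None

# Illustrative cost model: count only T gates.
gates = {"T": (1, 1), "S": (2, 0), "Z": (4, 0)}
cost, seq = cheapest_sequence(7, gates)
print(cost, seq)
```

Changing the numbers in `gates` is the toy analogue of assigning individual base gate costs: the same search then returns sequences that are optimal for the new cost model.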

# ローカルオペレータの絡み合いと蝶効果

Entanglement of Local Operators and the Butterfly Effect ( http://arxiv.org/abs/2005.14243v2 )

ライセンス: Link先を確認

Jonah Kudler-Flam, Masahiro Nozaki, Shinsei Ryu, Mao Tian Tan

(参考訳) 局所演算子挿入による摂動に対する量子情報と古典情報の堅牢性について検討する。これは、ハイゼンベルク描像における局所演算子のヒルベルト空間で多部エンタングルメント測度を計算することで行う。ここで探る初期条件に対する鋭敏性は、量子多体系における蝶効果(バタフライ効果)の明快な現れである。ハールランダムユニタリ回路において、古典的な統計力学問題に写像することで、局所演算子状態における相互情報量、対数ネガティビティ、および反射エントロピーを計算する「膜理論」を導出し、任意の局所演算子挿入が因果律の許す限り速く情報を非局在化することを示す。ホログラフィック双対を持つ共形場理論でも同一の振る舞いが見られ、そのバルク幾何学は地平線上に局所的な物体が置かれた永遠のブラックホールによって記述される。これらの最大スクランブラとは対照的に、自由フェルミオンやクリフォード回路のような可積分系において局所演算子によって非局在化されるのは、$O(1)$の情報量のみである。

We study the robustness of quantum and classical information to perturbations implemented by local operator insertions. We do this by computing multipartite entanglement measures in the Hilbert space of local operators in the Heisenberg picture. The sensitivity to initial conditions that we explore is an illuminating manifestation of the butterfly effect in quantum many-body systems. We derive a "membrane theory" in Haar random unitary circuits to compute the mutual information, logarithmic negativity, and reflected entropy in the local operator state by mapping to a classical statistical mechanics problem and find that any local operator insertion delocalizes information as fast as is allowed by causality. Identical behavior is found for conformal field theories admitting holographic duals where the bulk geometry is described by the eternal black hole with a local object situated at the horizon. In contrast to these maximal scramblers, only an $O(1)$ amount of information is found to be delocalized by local operators in integrable systems such as free fermions and Clifford circuits.

翻訳日:2023-05-18 02:40:13 公開日:2021-02-01

# 「暗号化を伴わない計測デバイス非依存量子通信」のセキュリティ向上

Improving the Security of "Measurement-Device-Independent Quantum Communication without Encryption" ( http://arxiv.org/abs/2006.05263v2 )

ライセンス: Link先を確認

Nayana Das and Goutam Paul

(参考訳) 2018年、NiuらはEinstein-Podolsky-Rosen対を用いた測定デバイス非依存の量子セキュアな直接通信プロトコルを提案し、それを量子対話プロトコルに一般化した(Niu et al., Science Bulletin 63.20, 2018)。これらのプロトコルを分析することで、両方のプロトコルでいくつかのセキュリティ問題を見つけます。本研究では,双方のプロトコルが情報漏洩に対して安全でないこと,第三者がアクティブな攻撃を伴わずに秘密情報の半分を取得できることを示す。また,セキュリティ向上のために,これらのプロトコルの適切な修正も提案する。

Recently in 2018, Niu et al. proposed a measurement-device-independent quantum secure direct communication protocol using Einstein-Podolsky-Rosen pairs and generalized it to a quantum dialogue protocol (Niu et al., Science bulletin 63.20, 2018). By analyzing these protocols we find some security issues in both these protocols. In this work, we show that both the protocols are not secure against information leakage, and a third party can get half of the secret information without any active attack. We also propose suitable modifications of these protocols to improve the security.

翻訳日:2023-05-16 04:57:19 公開日:2021-02-01

# スーパーポジング軌道による量子通信の実験的促進

Experimental Quantum Communication Enhancement by Superposing Trajectories ( http://arxiv.org/abs/2007.05005v2 )

ライセンス: Link先を確認

Giulia Rubino, Lee A. Rozema, Daniel Ebler, Hlér Kristjánsson, Sina Salek, Philippe Allard Guérin, Alastair A. Abbott, Cyril Branciard, Časlav Brukner, Giulio Chiribella, Philip Walther

(参考訳) 量子通信ネットワークでは、ワイヤは量子系が送信される、明確に定義された軌道を表す。それにもかかわらず、軌道は異なるノイズの通信チャネルの順序を制御する量子制御として使用することができ、量子通信プロトコルが明確に定義された軌道を介して失敗した場合でも、そのような制御は情報の伝達を可能にすることが示されている。この結果は、通信の強化における軌道の重ね合わせの役割に関するさらなる研究の動機となり、並列通信チャネルの量子制御や、量子制御操作を伴う直列のチャネルの使用も通信の利点につながる可能性があることを明らかにした。そこで本研究では, この結果に基づいて, 2つの軌跡の重ね合わせを行う方法について実験および数値的に比較する。我々は、量子干渉法(quantum interferometry)の枠組みの中で、量子制御操作を伴う直列チャネルの使用が一般に最大の利点をもたらすことを観察する。本研究は,実験的な量子光学シナリオにおけるこれらの利点の性質を明らかにすることに貢献し,情報交換と情報キャリアの軌道が量子である量子通信パラダイムの拡張の利点を示す。

In quantum communication networks, wires represent well-defined trajectories along which quantum systems are transmitted. In spite of this, trajectories can be used as a quantum control to govern the order of different noisy communication channels, and such a control has been shown to enable the transmission of information even when quantum communication protocols through well-defined trajectories fail. This result has motivated further investigations on the role of the superposition of trajectories in enhancing communication, which revealed that the use of quantum control of parallel communication channels, or of channels in series with quantum-controlled operations, can also lead to communication advantages. Building upon these findings, here we experimentally and numerically compare different ways in which two trajectories through a pair of noisy channels can be superposed. We observe that, within the framework of quantum interferometry, the use of channels in series with quantum-controlled operations generally yields the largest advantages. Our results contribute to clarify the nature of these advantages in experimental quantum-optical scenarios, and showcase the benefit of an extension of the quantum communication paradigm in which both the information exchanged and the trajectory of the information carriers are quantum.

翻訳日:2023-05-10 21:17:01 公開日:2021-02-01

# 曲面超曲面上の絡み合い--場分解器アプローチ

Entanglement on curved hypersurfaces: A field-discretizer approach ( http://arxiv.org/abs/2007.09657v3 )

ライセンス: Link先を確認

Tal Schwartzman and Benni Reznik (School of Physics and Astronomy, Tel-Aviv University, Tel Aviv, Israel)

(参考訳) 相対論的場の量子論における一般超曲面上の絡み合いを測定するための共変スキームを提案する。そのため、超曲面に沿った場と局所的に相互作用することで、場の状態と離散化子の状態を完全に交換する補助相対論的場「離散化器」を導入する。離散化器は、空間格子を導入することなく、共変方式で、フィールドの無限大を効果的に切断することができる。これは、任意の超曲面上の任意の領域間の絡み合いを評価する効率的な方法を提供する。例えば、1+1次元の相補的領域と分離された領域の絡み合い、ミンコフスキー空間の平坦な超曲面、ミルン空間の曲面超曲面、およびヌル曲面に近づく超曲面上の領域について検討する。その結果, 1+1次元の任意の超曲面上の領域間の絡み合いは, 内部の形状ではなく, 領域の時空の終端に依存することがわかった。本研究の結果は, 平坦な超曲面に対して, 従来の結果と相関し, 拡張するものである。

We propose a covariant scheme for measuring entanglement on general hypersurfaces in relativistic quantum field theory. For that, we introduce an auxiliary relativistic field, 'the discretizer', that by locally interacting with the field along a hypersurface, fully swaps the field's and discretizer's states. It is shown, that the discretizer can be used to effectively cut-off the field's infinities, in a covariant fashion, and without having to introduce a spatial lattice. This, in turn, provides us an efficient way to evaluate entanglement between arbitrary regions on any hypersurface. As examples, we study the entanglement between complementary and separated regions in 1+1 dimensions, for flat hypersurfaces in Minkowski space, for curved hypersurfaces in Milne space, and for regions on hypersurfaces approaching null-surfaces. Our results show that the entanglement between regions on arbitrary hypersurfaces in 1+1 dimensions depends only on the space-time endpoints of the regions, and not on the shape of the interior. Our results corroborate and extend previous results for flat hypersurfaces.

翻訳日:2023-05-09 01:16:31 公開日:2021-02-01

# スピンのエントロピーダイナミクス

The Entropic Dynamics of Spin ( http://arxiv.org/abs/2007.15719v2 )

ライセンス: Link先を確認

Ariel Caticha and Nicholas Carrara

(参考訳) エントロピック・ダイナミクス(ED)のアプローチでは、量子論の本質はその確率論的性質にあるが、ヒルベルト空間構造は二次的かつ究極的には任意の役割を果たす。確率分布のダイナミクスは、エントロピーの最大化によって、関連する物理的情報(方向性、相関、ゲージ相互作用など)を運ぶ制約によって引き起こされる。課題は、これらの制約を特定し、制約自体の更新方法の基準を確立することです。本稿では、EDフレームワークを拡張し、スピン1/2点粒子を記述する。 EDスピンは回転体としてモデル化されず、また点粒子の運動によってもモデル化されておらず、波動関数のエピステミック特性である。スピンの特異な回転特性を反映する制約は、幾何学代数の言語で最も効果的に表現される。すべての制約の更新は、対称性原理の中心的な重要性を強調する方法で行われる。まず、確率の位相空間、それらの共役モーメント、スピン変数における適切なシンプレクティックおよび計量構造を特定する。この構成は、情報幾何との深い関係を強調するスピン1/2粒子のフビニ・スタディ計量の導出となる。次に、シンプレクティック構造(ハミルトンフロー)と計量構造(キリングフロー)の両方を保存するEDを構築する。一般ハミルトン・キリング流は波動関数において線形であることを示す。さらに、ハミルトニアンが時間におけるエントロピー発展の生成元であることは、パウリ方程式によって記述されたエントロピーダイナミクスに繋がる。我々は、他の解釈によって提供されるものとは大きく異なる物理図形を生み出す形式主義の新たな解釈について議論した。

In the Entropic Dynamics (ED) approach the essence of quantum theory lies in its probabilistic nature while the Hilbert space structure plays a secondary and ultimately optional role. The dynamics of probability distributions is driven by the maximization of an entropy subject to constraints that carry the relevant physical information -- directionality, correlations, gauge interactions, etc. The challenge is to identify those constraints and to establish a criterion for how the constraints themselves are updated. In this paper the ED framework is extended to describe a spin-1/2 point particle. In ED spin is neither modelled as a rotating body, nor through the motion of a point particle; it is an epistemic property of the wave function. The constraint that reflects the peculiar rotational properties of spin is most effectively expressed in the language of geometric algebra. The updating of all constraints is carried out in a way that stresses the central importance of symmetry principles. First we identify the appropriate symplectic and metric structures in the phase space of probabilities, their conjugate momenta, and the spin variables. This construction yields a derivation of the Fubini-Study metric for a spin-1/2 particle which highlights its deep connection to information geometry. Then we construct an ED that preserves both the symplectic structure (a Hamiltonian flow) and the metric structure (a Killing flow). We show that generic Hamiltonian-Killing flows are linear in the wave function. Imposing further that the Hamiltonian be the generator of an entropic evolution in time leads to an entropic dynamics described by the Pauli equation. We conclude with a discussion of the new interpretation of the formalism which yields a physical picture that is significantly different from that provided by other interpretations.

翻訳日:2023-05-07 18:12:03 公開日:2021-02-01

# 動的量子相転移の絡み合いビュー

Entanglement view of dynamical quantum phase transitions ( http://arxiv.org/abs/2008.04894v2 )

ライセンス: Link先を確認

Stefano De Nicola, Alexios A. Michailidis, Maksym Serbyn

(参考訳) 平衡分配関数と多体ユニタリダイナミクスの戻り確率の類似性は、動的量子相転移(DQPT)の概念をもたらした。DQPTは、戻り振幅の非解析性によって定義され、多くのモデルに存在する。場合によっては、DQPTは秩序パラメータのような平衡概念と関連付けられるが、それらの普遍的な記述は未解決問題である。本研究では,熱力学的極限におけるユニタリダイナミクスの行列積状態記述を用いて,DQPTの分類に向けた第一歩を示す。これにより,歳差DQPTと絡み合いDQPTという2つの極限ケースを区別することができ,量子イジングモデルにおける解析的記述を用いてこれらを例示する。歳差DQPTは大きな絡み合いギャップによって特徴づけられ、本質的に半古典的であるが、絡み合いDQPTは絡み合いスペクトルの擬交差付近で発生し、非局所相関の複雑なパターンで区別できる。本稿では,イジングモデル以外でも歳差DQPTと絡み合いDQPTが存在することを示し,両者を区別できる観測量について議論し,それらの相互作用を複雑なDQPT現象と関連付ける。

The analogy between an equilibrium partition function and the return probability in many-body unitary dynamics has led to the concept of dynamical quantum phase transition (DQPT). DQPTs are defined by non-analyticities in the return amplitude and are present in many models. In some cases DQPTs can be related to equilibrium concepts such as order parameters, yet their universal description is an open question. In this work we provide first steps towards a classification of DQPTs by using a matrix product state description of unitary dynamics in the thermodynamic limit. This allows us to distinguish the two limiting cases of precession and entanglement DQPTs, which are illustrated using an analytical description in the quantum Ising model. While precession DQPTs are characterized by a large entanglement gap and are semiclassical in their nature, entanglement DQPTs occur near avoided crossings in the entanglement spectrum and can be distinguished by a complex pattern of non-local correlations. We demonstrate the existence of precession and entanglement DQPTs beyond the Ising model, discuss observables that can distinguish them and relate their interplay to complex DQPT phenomenology.

翻訳日:2023-05-06 13:50:14 公開日:2021-02-01
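
The analogy mentioned in the abstract above, between an equilibrium partition function and the return probability, can be written out with the standard textbook definitions. The notation below ($|\psi_0\rangle$ for the initial state, $N$ for the system size, $\lambda(t)$ for the rate function) is chosen for this note and is not taken from the paper.

```latex
\begin{align}
  Z(\beta) &= \operatorname{Tr}\, e^{-\beta H},
  & f(\beta) &= -\lim_{N\to\infty} \frac{1}{\beta N} \ln Z(\beta), \\
  G(t) &= \langle \psi_0 | e^{-iHt} | \psi_0 \rangle,
  & \lambda(t) &= -\lim_{N\to\infty} \frac{1}{N} \ln |G(t)|^2 .
\end{align}
```

In this picture, an equilibrium phase transition is a non-analyticity of the free energy density $f$ as a function of $\beta$, while a DQPT is a non-analyticity of $\lambda(t)$ at a critical time.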

# 1次元ハバード模型の一般流体力学による研究:スピン、電荷、エネルギー電流の定常凝縮と比例性

Generalized hydrodynamics study of the one-dimensional Hubbard model: Stationary clogging and proportionality of spin, charge, and energy currents ( http://arxiv.org/abs/2008.06522v2 )

ライセンス: Link先を確認

Yuji Nozawa, Hirokazu Tsunetsugu

(参考訳) 前回の研究 [Y. Nozawa and H. Tsunetsugu, Phys. Rev. B 101, 035121 (2020)] では, 分割プロトコルに対する一般化流体力学理論に基づいて一次元ハバード模型のクエンチダイナミクスを研究し, クロッギング(詰まり)現象の存在を示した。クロッギングは、消失する電荷電流が非ゼロのエネルギー電流と共存する現象であり、系の左半分を高温のハーフフィリング状態に準備し、右半分を空とする初期条件を用いたときに発見された。クロッギングは左半分のすべてのサイトで起こり、接続点からの距離に比例する時間だけ持続する。本稿では,様々な初期条件を用いて2つの問題を論じる。第一の問題は、定常状態におけるクロッギングの可能性である。右半分の電子密度を初期に非ゼロに設定すると、初期条件の様々なパラメータの組に対して左側のハーフフィリング部分が拡大することが分かった。これは, クロッギング現象が長時間定常状態においてすべてのサイトで発生することを意味し, その起源についても考察する。さらに、定常クロッギングは逆流、すなわち高密度領域に向かって流れる粒子密度電流を伴う。また、いくつかの初期条件ではスピンクロッギング、すなわち消失するスピン電流と非ゼロのエネルギー電流の共存が起こることも見出した。第二の問題はスピン電流と電荷電流の比例性である。電流比が非ゼロ定数に固定される2つの時空間領域を発見した。我々は,電流比が様々な初期条件にどう依存するかを数値的に検討した。また,電荷電流とエネルギー電流の比についても検討した。

In our previous work [Y. Nozawa and H. Tsunetsugu, Phys. Rev. B 101, 035121 (2020)], we studied quench dynamics in the one-dimensional Hubbard model based on the generalized hydrodynamics theory for a partitioning protocol and showed the presence of a clogging phenomenon. Clogging is a phenomenon in which a vanishing charge current coexists with a nonzero energy current, and was found when the protocol uses the initial condition that the left half of the system is prepared to be half filling at high temperatures with the right half being empty. Clogging occurs at all the sites in the left half and lasts for a time proportional to its distance from the connection point. In this paper, we use various different initial conditions and discuss two issues. The first issue is the possibility of clogging in a stationary state. When the electron density in the right half is initially set nonzero, we found that the left half-filled part expands for various sets of parameters in the initial condition. This means that the clogging phenomenon occurs at all the sites in the long-time stationary state, and we also discuss its origin. In addition, stationary clogging is accompanied by a back current, namely, particle density current flows towards the high-density region. We also found spin clogging occurs for some initial conditions, i.e., the vanishing spin current coexists with nonzero energy current. The second issue is the proportionality of spin and charge currents. We have found two spatio-temporal regions where the current ratio is fixed to a nonzero constant. We numerically studied how the current ratio depends on various initial conditions. We also studied the ratio of charge and energy currents.

翻訳日:2023-05-06 06:51:24 公開日:2021-02-01

# 量子ビット上の(2+1)次元格子ゲージ理論のリアルタイムシミュレーション

Real-time simulation of (2+1)-dimensional lattice gauge theory on qubits ( http://arxiv.org/abs/2008.11395v3 )

ライセンス: Link先を確認

Arata Yamamoto

(参考訳) 2+1次元におけるZ2格子ゲージ理論の量子シミュレーションについて検討する。双対変数の定式化、いわゆるウェグナー双対性は、冗長なゲージ自由度を減らすために用いられる。人工的な電荷非保存の問題は任意の電荷分布に対して解決される。実演として,2つの静電荷,すなわち2本の時間方向ウィルソン線を持つ系のリアルタイム発展をシミュレートする。量子コンピュータのシミュレータ(ハードウェアノイズなし)と実機(かなりのハードウェアノイズあり)によって得られたいくつかの結果を示す。

We study the quantum simulation of Z2 lattice gauge theory in 2+1 dimensions. The dual variable formulation, the so-called Wegner duality, is utilized for reducing redundant gauge degrees of freedom. The problem of artificial charge unconservation is resolved for any charge distribution. As a demonstration, we simulate the real-time evolution of the system with two static electric charges, i.e., with two temporal Wilson lines. Some results obtained by the simulator (with no hardware noise) and the real device (with sizable hardware noise) of a quantum computer are shown.

翻訳日:2023-05-04 21:49:33 公開日:2021-02-01

# 高次微分理論と量子力学的対応のためのハミルトン・ヤコビ方程式の新しい定式化

Novel formulation of Hamilton-Jacobi equation for higher derivative theory and quantum mechanical correspondence ( http://arxiv.org/abs/2009.03200v2 )

ライセンス: Link先を確認

Zhi-Qiang Guo

(参考訳) 高次微分理論では、カラテオドリーの等価ラグランジアン(英語版)のアプローチを用いて、ハミルトン・ヤコビ方程式の新しい定式化が存在することを示し、これはハミルトンの標準的アプローチから導かれる定式化とは異なる。これらの新しいハミルトン・ヤコビ方程式の量子力学的対応は、高次微分理論の量子力学における非有界な負エネルギー問題を避けることができる非線形量子力学へと導かれる。

For higher derivative theories, using the approach of Caratheodory's equivalent Lagrangian, we show that there exist novel formulations of Hamilton-Jacobi equations, which are different from the formulations derived from Hamilton's canonical approach. The quantum mechanical correspondences of these novel Hamilton-Jacobi equations lead to nonlinear quantum mechanics, which seem able to avoid the unbounded negative energy problem in the quantum mechanics of higher derivative theories.

翻訳日:2023-05-03 07:23:56 公開日:2021-02-01

# 振動分光法の近・長期量子アルゴリズムによるアプローチ

Near- and long-term quantum algorithmic approaches for vibrational spectroscopy ( http://arxiv.org/abs/2009.05066v2 )

ライセンス: Link先を確認

Nicolas P. D. Sawaya, Francesco Paesani, Daniel P. Tabor

(参考訳) 分子の振動構造を決定することは、大気科学から触媒、燃料燃焼モデリング、生化学イメージング、宇宙化学まで、いくつかの分野における基礎的な応用の中心である。しかし、顕著な非調和性やモード結合が存在する場合、この問題はわずか数原子の分子でも古典計算では扱えなくなる。本稿では、近未来および長期の量子コンピュータにおいて分子振動構造問題を解くための一連の量子アルゴリズムについて概説する。この問題には、広く研究されている電子構造の量子シミュレーションの多くの事例とは異なるアプローチを必要とする、これまで扱われてこなかった特徴がある。すなわち、多くの固有状態がしばしば必要とされること、興味のある状態が基底状態から遠いことが多いこと(あるエネルギーウィンドウに「ズームイン」する方法が必要)、そして非ユニタリなエルミート作用素に関する遷移振幅を計算しなければならないことである。これらのハードルに対処し、4つの分子振動ハミルトニアンの問題インスタンスを考察する。最後に、そして最も重要なこととして, 与えられたエネルギー精度に対して, 振動問題インスタンスが電子構造問題インスタンスよりも先に量子コンピュータ上でシミュレート可能になることを示唆する解析的および数値的な結果を与える。これらの結果は、量子情報コミュニティが科学的および産業的に重要な量子振動問題により重点を移すべきであることを示唆している。

Determining the vibrational structure of a molecule is central to fundamental applications in several areas, from atmospheric science to catalysis, fuel combustion modeling, biochemical imaging, and astrochemistry. However, when significant anharmonicity and mode coupling are present, the problem is classically intractable for a molecule of just a few atoms. Here, we outline a set of quantum algorithms for solving the molecular vibrational structure problem for both near- and long-term quantum computers. There are previously unaddressed characteristics of this problem which require approaches distinct from most instances of the commonly studied quantum simulation of electronic structure: many eigenstates are often desired, states of interest are often far from the ground state (requiring methods for "zooming in" to some energy window), and transition amplitudes with respect to a non-unitary Hermitian operator must be calculated. We address these hurdles and consider problem instances of four molecular vibrational Hamiltonians. Finally and most importantly, we give analytical and numerical results which suggest that, to a given energy precision, a vibrational problem instance will be simulatable on a quantum computer before an electronic structure problem instance. These results imply that more focus in the quantum information community ought to shift toward scientifically and industrially important quantum vibrational problems.

翻訳日:2023-05-03 00:39:13 公開日:2021-02-01

# 連続計測とフィードバック制御による量子同期の増強

Enhancement of quantum synchronization via continuous measurement and feedback control ( http://arxiv.org/abs/2009.05468v2 )

ライセンス: Link先を確認

Yuzuru Kato, Hiroya Nakao

(参考訳) 本研究では,高調波駆動による量子ファンデルポル発振器の同期について検討し,発振器に線形に結合した追加浴槽上で連続ホモダイン測定を行い,発振器にフィードバック制御を適用することにより,量子同期を向上できることを実証した。連続測定により量子揺らぎを減少させることで発振子の位相コヒーレンスを増大させる一方、測定バックアクションは位相同期点周辺の揺らぎを必然的に誘発する。本研究では,高調波駆動の周波数を調整し,測定誘起変動を抑制するための簡単なフィードバックポリシーを提案する。さらに、発振子の位相拡散が最大であり、発振子の位相の最大情報を抽出した二次角度で量子計測を行うことにより、量子同期の最大拡張を実現することを実証する。

We study synchronization of a quantum van der Pol oscillator with a harmonic drive and demonstrate that quantum synchronization can be enhanced by performing continuous homodyne measurement on an additional bath linearly coupled to the oscillator and applying feedback control to the oscillator. The phase coherence of the oscillator is increased by reducing quantum fluctuations via the continuous measurement, whereas the measurement backaction inevitably induces fluctuations around the phase-locking point. We propose a simple feedback policy for suppressing measurement-induced fluctuations by adjusting the frequency of the harmonic drive, which results in enhancement of quantum synchronization. We further demonstrate that the maximum enhancement of quantum synchronization is achieved by performing quantum measurement on the quadrature angle at which the phase diffusion of the oscillator is the largest and the maximal information of the oscillator phase is extracted.

翻訳日:2023-05-02 22:29:16 公開日:2021-02-01

# パワーロー相互作用系における最適状態伝達と絡み合い生成

Optimal State Transfer and Entanglement Generation in Power-law Interacting Systems ( http://arxiv.org/abs/2010.02930v2 )

ライセンス: Link先を確認

Minh C. Tran, Abhinav Deshpande, Andrew Y. Guo, Andrew Lucas, Alexey V. Gorshkov

(参考訳) 未知の量子ビット状態を多量子ビットのグリーンバーガー・ホーン・ツァイリンガー様状態に符号化し、その結果としてパワーロー ($1/r^\alpha$) 相互作用を示す大規模システムにおいて量子情報を転送するための最適なプロトコルを示す。$d$ をシステムの次元として、$d$ と $2d+1$ の間のすべてのパワーロー指数 $\alpha$ に対して、このプロトコルは従来の最良の手法と比較して、$\alpha>2d$ では多項式スピードアップ、$\alpha\leq 2d$ では超多項式スピードアップをもたらす。すべての $\alpha>d$ に対して、このプロトコルは Lieb-Robinson 境界を(準多項式補正を除いて)飽和させ、プロトコルの最適性とこの領域における境界の厳密性を確立する。このプロトコルは、量子センシング、量子コンピューティング、トポロジカル秩序状態の準備など、幅広い応用がある。さらに、このプロトコルは、パワーロー相互作用系のデジタルシミュレーションにおけるゲート数の下限を与える。

We present an optimal protocol for encoding an unknown qubit state into a multiqubit Greenberger-Horne-Zeilinger-like state and, consequently, transferring quantum information in large systems exhibiting power-law ($1/r^\alpha$) interactions. For all power-law exponents $\alpha$ between $d$ and $2d+1$, where $d$ is the dimension of the system, the protocol yields a polynomial speedup for $\alpha>2d$ and a superpolynomial speedup for $\alpha\leq 2d$, compared to the state of the art. For all $\alpha>d$, the protocol saturates the Lieb-Robinson bounds (up to subpolynomial corrections), thereby establishing the optimality of the protocol and the tightness of the bounds in this regime. The protocol has a wide range of applications, including in quantum sensing, quantum computing, and preparation of topologically ordered states. In addition, the protocol provides a lower bound on the gate count in digital simulations of power-law interacting systems.

翻訳日:2023-04-29 20:14:17 公開日:2021-02-01

# 新型コロナウイルス(COVID-19)関連スマートフォンアプリのプライバシー問題とユーザ受け入れ

Apps Against the Spread: Privacy Implications and User Acceptance of COVID-19-Related Smartphone Apps on Three Continents ( http://arxiv.org/abs/2010.14245v2 )

ライセンス: Link先を確認

Christine Utz, Steffen Becker, Theodor Schnitzler, Florian M. Farke, Franziska Herbert, Leonie Schaewitz, Martin Degeling, Markus Dürmuth

(参考訳) 新型コロナウイルス(COVID-19)のパンデミックにより、スマートフォンアプリケーションの開発が加速している。多くの"コロナアプリ"が広く採用されることが求められており、政府の支援する健康アプリケーションに対するプライバシー、セキュリティ、社会的影響に関する公の議論を引き起こしている。我々はドイツ(n = 1,003)、米国(n = 1,003)、中国(n = 1,019)の代表的なオンライン調査を行い、コンテキスト整合性フレームワークに基づいたヴィグネットデザインを用いてコロナアプリのユーザ受け入れを調査した。われわれはコンタクトトレース、症状チェック、検疫、健康診断、単なる情報のためのアプリを調査した。以上の結果から,中国ではユーザ受け入れが最も多く,米国ではユーザ受け入れが低い国間で,採用を促進するデータ処理プラクティスの洞察が得られた。中国の参加者はパーソナライズされたデータの収集を好み、ドイツとアメリカの参加者は匿名性を好む。国全体では、接触追跡は検疫機関よりも肯定的に見られ、技術的な不具合はユーザーの受け入れに悪影響を及ぼす。

The COVID-19 pandemic has fueled the development of smartphone applications to assist disease management. Many "corona apps" require widespread adoption to be effective, which has sparked public debates about the privacy, security, and societal implications of government-backed health applications. We conducted a representative online study in Germany (n = 1,003), the US (n = 1,003), and China (n = 1,019) to investigate user acceptance of corona apps, using a vignette design based on the contextual integrity framework. We explored apps for contact tracing, symptom checks, quarantine enforcement, health certificates, and mere information. Our results provide insights into data processing practices that foster adoption and reveal significant differences between countries, with user acceptance being highest in China and lowest in the US. Chinese participants prefer the collection of personalized data, while German and US participants favor anonymity. Across countries, contact tracing is viewed more positively than quarantine enforcement, and technical malfunctions negatively impact user acceptance.

翻訳日:2023-04-27 08:41:23 公開日:2021-02-01

# 非多重性自由群に対する文字ランダム化ベンチマークと部分空間,リーク,マッチゲートランダム化ベンチマークへの応用

Character randomized benchmarking for non-multiplicity-free groups with applications to subspace, leakage, and matchgate randomized benchmarking ( http://arxiv.org/abs/2011.00007v2 )

ライセンス: Link先を確認

Jahan Claes, Eleanor Rieffel, Zhihui Wang

(参考訳) ランダム化ベンチマーク(RB)は実験量子ゲートの誤差率を決定する強力な手法である。しかし、従来のRBは、クリフォード群のようにユニタリ2-デザインをなすゲートセットに制限されている。最近導入されたキャラクタRBは、表現理論の技法を用いてより一般的なゲートをベンチマークできるが、これまでこの手法は「多重性のない(multiplicity-free)」群、すなわち数学的な制約を満たす群にしか適用されていなかった。本稿では、非多重性自由群を明示的に扱えるようにキャラクタRBの元の導出を拡張し、いくつかの応用を導出する。まず、最近導入された部分空間RBの厳密なバージョンを導出する。部分空間RBは、SWAPの下で対称な1量子ビットおよび2量子ビットゲートの集合を特徴付けることを目的とする。次に、より一般的なゲート群に適用可能な新しいリークRBプロトコルを開発する。最後に、マッチゲート群に対するスケーラブルなRBプロトコルを導出する。マッチゲート群は、クリフォード群と同様にユニバーサルではないが、ゲートを1つ追加することでユニバーサルになる群である。これは、スケーラブルな非クリフォードRBプロトコルの数少ない例の1つである。これら3つの場合すべてにおいて、既存の理論と比較して、我々の手法は同程度の資源を必要とする一方で、より正確なゲート忠実度の推定を与えるか、より一般的なゲート群に適用できる。最後に、非多重性自由キャラクタRBを用いて、新しいクラスのスケーラブルなRBプロトコルや特定のゲートを特徴付ける手法を開発する可能性と課題について考察する。

Randomized benchmarking (RB) is a powerful method for determining the error rate of experimental quantum gates. Traditional RB, however, is restricted to gatesets, such as the Clifford group, that form a unitary 2-design. The recently introduced character RB can benchmark more general gates using techniques from representation theory; up to now, however, this method has only been applied to "multiplicity-free" groups, a mathematical restriction on these groups. In this paper, we extend the original character RB derivation to explicitly treat non-multiplicity-free groups, and derive several applications. First, we derive a rigorous version of the recently introduced subspace RB, which seeks to characterize a set of one- and two-qubit gates that are symmetric under SWAP. Second, we develop a new leakage RB protocol that applies to more general groups of gates. Finally, we derive a scalable RB protocol for the matchgate group, a group that like the Clifford group is non-universal but becomes universal with the addition of one additional gate. This example provides one of the few examples of a scalable non-Clifford RB protocol. In all three cases, compared to existing theories, our method requires similar resources, but either provides a more accurate estimate of gate fidelity, or applies to a more general group of gates. In conclusion, we discuss the potential, and challenges, of using non-multiplicity-free character RB to develop new classes of scalable RB protocols and methods of characterizing specific gates.

翻訳日:2023-04-26 07:42:02 公開日:2021-02-01

# パラメトリックアンプキャビティ内の原子への同型性によるJaynes-Cummings-Rabiモデルのスペクトルの探索

Probing the spectrum of the Jaynes-Cummings-Rabi model by its isomorphism to an atom inside a parametric amplifier cavity ( http://arxiv.org/abs/2011.04143v2 )

ライセンス: Link先を確認

R. Gutiérrez-Jáuregui and G. S. Agarwal

(参考訳) キャビティ量子電磁力学のJaynes-Cummings-Rabiモデルが、パラメトリック増幅器キャビティ内の量子ビットのハミルトニアンへの同型性によってどのように実現できるかを示す。この実現により、量子ビットとしきい値以下で動作するパラメトリック発振器を含むパラメトリック増幅器キャビティに印加したプローブを通じて、ラビモデルの完全なスペクトルを観測する道が開かれる。同型性の重要な帰結は、実際の周波数がデチューニングに置き換えられ、超強結合領域への到達が現実的になることである。この領域では、プローブされたスペクトルは、基底状態と第一励起状態の間の遷移に由来する狭い共鳴ピークを示す。これらの状態の厳密な形はエネルギー交差点で与えられ、その後数値的に拡張される。交差点では、固有状態は場と原子のエンタングル状態であり、場はスクイーズされた猫状態にある。

We show how the Jaynes-Cummings-Rabi model of cavity quantum electrodynamics can be realized via an isomorphism to the Hamiltonian of a qubit inside a parametric amplifier cavity. This realization clears the way to observe the full spectrum of the Rabi model via a probe applied to a parametric amplifier cavity containing a qubit and a parametric oscillator operating below threshold. An important outcome of the isomorphism is that the actual frequencies are replaced by detunings, which makes it feasible to reach the ultra-strong coupling regime. We find that inside this regime the probed spectrum displays a narrow resonance peak that is traced back to the transition between the ground and first excited states. The exact form of these states is given at an energy crossing and then extended numerically. At the crossing, the eigenstates are entangled states of field and atom where the field is found inside squeezed cat states.

翻訳日:2023-04-24 21:33:48 公開日:2021-02-01

# フォトニック結晶キャビティの1つのホール内に配置した人工原子に基づくハイブリッド量子フォトニクス

Hybrid quantum photonics based on artificial atoms placed inside one hole of a photonic crystal cavity ( http://arxiv.org/abs/2012.11503v3 )

ライセンス: Link先を確認

Konstantin G. Fehler, Lukas Antoniuk, Niklas Lettner, Anna P. Ovvyan, Richard Waltrich, Nico Gruhler, Valery A. Davydov, Viatcheslav N. Agafonov, Wolfram H. P. Pernice, Alexander Kubanek

(参考訳) スピンベースの量子フォトニクスは、分散量子コンピューティングと量子ネットワークの実現を約束する。その性能は効率的なエンタングルメント分配に依存し、効率は共振器量子電磁力学によって高めることができる。中心的な課題は、大きなスピン-光子結合率と高い動作帯域幅を持つコンパクトなデバイスの開発である。フォトニック結晶キャビティは強い場の閉じ込めを実現するが、モード場の最大値に原子系を正確に配置することを強く要求する。ダイヤモンド中のカラーセンター、特に負に帯電したシリコン空孔中心は、有望な原子様の系として登場した。高いスペクトル安定性と長寿命の核スピンメモリへのアクセスにより、メモリ強化量子通信を含む量子ネットワークノードの基礎的な実証が可能となった。ハイブリッドアプローチにおいて、我々はSiV$^-$を含むナノダイヤモンドを、1次元の自立型Si$_3$N$_4$ベースフォトニック結晶キャビティの1つのホール内に決定論的に配置し、個々の光学遷移をキャビティモードにコヒーレントに結合させる。2モード合成、導波、パーセル増強、キャビティ共鳴チューニングを利用して光-物質結合を最適化する。その結果、光子フラックスは自由空間と比べて14倍以上に増加する。対応して寿命が460 ps未満に短縮されることで、潜在的な動作帯域幅はGHzレートを超える。本研究は、ナノダイヤモンド中のSiV$^-$中心を用いたハイブリッド量子フォトニクスに基づく量子ネットワークノードの実現に向けた重要な一歩である。

Spin-based quantum photonics promises to realize distributed quantum computing and quantum networks. The performance depends on efficient entanglement distribution, where the efficiency can be boosted by means of cavity quantum electrodynamics. The central challenge is the development of compact devices with large spin-photon coupling rates and high operation bandwidth. Photonic crystal cavities provide strong field confinement but put high demands on accurate positioning of an atomic system in the mode field maximum. Color centers in diamond, and in particular the negatively charged silicon-vacancy center, have emerged as promising atom-like systems. Large spectral stability and access to long-lived nuclear spin memories enabled elementary demonstrations of quantum network nodes, including memory-enhanced quantum communication. In a hybrid approach, we deterministically place SiV$^-$-containing nanodiamonds inside one hole of a one-dimensional, free-standing, Si$_3$N$_4$-based photonic crystal cavity and coherently couple individual optical transitions to the cavity mode. We optimize the light-matter coupling by utilizing two-mode composition, waveguiding, Purcell enhancement and cavity resonance tuning. The resulting photon flux is increased by more than a factor of 14 compared to free space. The corresponding lifetime shortening to below 460 ps puts the potential operation bandwidth beyond GHz rates. Our results mark an important step toward realizing quantum network nodes based on hybrid quantum photonics with SiV$^-$ centers in nanodiamonds.

翻訳日:2023-04-20 00:17:18 公開日:2021-02-01

# 実験データによる位相相転移の教師なし機械学習

Unsupervised machine learning of topological phase transitions from experimental data ( http://arxiv.org/abs/2101.05712v2 )

ライセンス: Link先を確認

Niklas Käming, Anna Dawid, Korbinian Kottmann, Maciej Lewenstein, Klaus Sengstock, Alexandre Dauphin, Christof Weitenberg

(参考訳) 相転移の同定は、量子多体物理学における重要な課題の1つである。近年、機械学習手法は、ノイズや不完全なデータから、順序パラメータの知識がなくても位相境界をローカライズする代替手法であることが示されている。ここでは,超低温原子からの実験データに対して異常検出や影響関数を含む,教師なしの機械学習手法を適用する。このようにして、Haldaneモデルの位相位相図は、完全に偏りのない方法で得られる。本研究では, 有限温度実験データとFloquet システムのデータに対して, 単一マイクロモーション位相に後処理した場合に適用可能であることを示す。我々の研究は、複雑な多体系における新しいエキゾチック位相の教師なし検出のためのベンチマークを提供する。

Identifying phase transitions is one of the key challenges in quantum many-body physics. Recently, machine learning methods have been shown to be an alternative way of localising phase boundaries also from noisy and imperfect data and without the knowledge of the order parameter. Here we apply different unsupervised machine learning techniques including anomaly detection and influence functions to experimental data from ultracold atoms. In this way we obtain the topological phase diagram of the Haldane model in a completely unbiased fashion. We show that the methods can successfully be applied to experimental data at finite temperature and to data of Floquet systems, when postprocessing the data to a single micromotion phase. Our work provides a benchmark for unsupervised detection of new exotic phases in complex many-body systems.

翻訳日:2023-04-15 05:05:20 公開日:2021-02-01

# 非可換調和振動子とランダウ問題の分析スペクトルの同型

Isomorphism of Analytical Spectrum between Noncommutative Harmonic Oscillator and Landau Problem ( http://arxiv.org/abs/2101.05929v2 )

ライセンス: Link先を確認

M.N. Nazmi M. Rusli, Nurisya M. Shah, Hishamuddin Zainuddin and Chan Kar Tim

(参考訳) 非可換等方調和振動子とランダウ問題のハミルトニアンを比較し、これら2つのモデルが区別できなくなる具体的な条件を調べる。対称ゲージおよび2つのランダウゲージにおけるランダウ問題のエネルギー固有値と固有状態を解析的に評価する。非可換等方調和振動子のハミルトニアンは、可換座標空間におけるBoppシフトを用いて得られる。その結果、両方のゲージ選択に対して、2つの系は $n_{r}$ と $m_{l}$ の値が同様であり $qB = eB > 0$ である範囲で同型であることが示される。ただし、ランダウゲージでは、非可換振動子が空間的自由度を1つ失わなければならないという追加の要件がある。また、両者のハミルトニアンが互いに整合するためには、因子 $\zeta$ でパラメータ化する必要がある。次に波動関数と確率密度関数をプロットし、現れる振る舞いを説明する。最後に、非可換性または磁場が同型系の固有状態とその確率分布に及ぼす影響を示す。

The Hamiltonians of the noncommutative isotropic harmonic oscillator and the Landau problem are compared to study the specific conditions under which these two models are indistinguishable. The energy eigenvalues and eigenstates of the Landau problem in the symmetric gauge and the two Landau gauges are evaluated analytically. The Hamiltonian of a noncommutative isotropic harmonic oscillator is found by using Bopp's shift in commutative coordinate space. The result shows that the two systems are isomorphic up to the similar values of $n_{r}$ and $m_{l}$ and $qB = eB > 0$ for both gauge choices. However, there is an additional requirement for the Landau gauge, where the noncommutative oscillator has to lose one spatial degree of freedom. It also needs to be parametrized by a factor $\zeta$ for their Hamiltonians to be consistent with each other. The wavefunctions and probability density functions are then plotted and the behaviour that emerges is explained. Finally, the effects of noncommutativity or magnetic field on the eigenstates of the isomorphic system and their probability distributions are shown.

翻訳日:2023-04-15 03:08:26 公開日:2021-02-01

# デジタル量子コンピュータによる量子材料シミュレーション

Simulating Quantum Materials with Digital Quantum Computers ( http://arxiv.org/abs/2101.08836v2 )

ライセンス: Link先を確認

Lindsay Bassman, Miroslav Urbanek, Mekena Metcalf, Jonathan Carter, Alexander F. Kemper, Wibe de Jong

(参考訳) 量子材料は幅広いエキゾチックな現象と実用的な性質を示す。これらの材料をより深く理解することで、量子領域の基本物理学に関する深い洞察と、エンターテイメント、医療、持続可能性のための先進技術を提供することができる。デジタル量子コンピュータ(DQC)の出現は、古典的コンピュータでは引き起こせない量子シミュレーションを効率的に行うことができ、量子物質の顕著で直感に反する振る舞いをテストし分析するための、有望な道筋を提供する。これらの新しいツールを備えた多様な領域の科学者は、物理量子の優位性(量子コンピュータを使って、どんな古典的コンピュータでも実行できない計算で新しい物理学を学ぶ)を達成するために競い合っている。したがって、このレビューの目的は、物理科学の科学者がアクセス可能なこの目標に向けての進捗の概要を提供することである。まず、利用可能な技術とアルゴリズムをレビューし、量子コンピュータ上で材料を表現する無数の方法を詳細に説明する。次に、現在利用可能なDQCで成功したシミュレーションを紹介し、この初期段階の技術で研究できる静的および動的特性の多様性を強調します。最後に、材料問題をDQCにマッピングする方法の2つの例を紹介します。このレビューは、ドメインエキスパートの分野における進歩の組織的な概要と、DQCの量子材料に関する独自のシミュレーションの開始に関心のある分野の科学者へのアクセシビリティな紹介として役立てられることを願っている。

Quantum materials exhibit a wide array of exotic phenomena and practically useful properties. A better understanding of these materials can provide deeper insights into fundamental physics in the quantum realm as well as advance technology for entertainment, healthcare, and sustainability. The emergence of digital quantum computers (DQCs), which can efficiently perform quantum simulations that are otherwise intractable on classical computers, provides a promising path forward for testing and analyzing the remarkable, and often counter-intuitive, behavior of quantum materials. Equipped with these new tools, scientists from diverse domains are racing towards achieving physical quantum advantage (i.e., using a quantum computer to learn new physics with a computation that cannot feasibly be run on any classical computer). The aim of this review, therefore, is to provide a summary of progress made towards this goal that is accessible to scientists across the physical sciences. We will first review the available technology and algorithms, and detail the myriad ways to represent materials on quantum computers. Next, we will showcase the simulations that have been successfully performed on currently available DQCs, emphasizing the variety of properties, both static and dynamic, that can be studied with this nascent technology. Finally, we work through two examples of how to map a materials problem onto a DQC, with full code included in the Supplementary Material. It is our hope that this review can serve as an organized overview of progress in the field for domain experts and an accessible introduction to scientists in related fields interested in beginning to perform their own simulations of quantum materials on DQCs.

翻訳日:2023-04-14 08:30:00 公開日:2021-02-01

# 円偏光レーザー場の存在下での弾性電子-陽子散乱

Elastic electron-proton scattering in the presence of a circularly polarized laser field ( http://arxiv.org/abs/2102.00722v1 )

ライセンス: Link先を確認

I Dahiri, M Jakha, S Mouslih, B Manaut and S Taj

(参考訳) 近年のレーザー技術の進歩により、非常に強力なレーザー場における基本的なレーザー支援過程の研究が重要になっている。本研究では、レーザー支援量子電磁力学(QED)の枠組みにおいて、円偏光の強い電磁場の存在下での電子-陽子散乱を考察した。まず、陽子のドレッシングを考えず、電子の相対論的ドレッシングのみを考慮した過程を調べる。次に、陽子ドレッシングの効果を調べるために、電子と陽子の両方の相対論的ドレッシングを完全に考慮し、ディラック・ヴォルコフ関数を用いてそれらを記述する。両方の場合の微分断面積(DCS)の解析式を摂動論の最低次で導出する。その結果、レーザー場によってDCSは顕著に減少する。陽子ドレッシングの効果は $10^{10}~\text{V/cm}$ 以上のレーザー場強度で現れ始めるため、考慮する必要があることがわかった。レーザー場の強度と周波数がDCSに及ぼす影響を報告する。モット散乱およびレーザーなしの結果との比較も含む。

Owing to recent advances in laser technology, it has become important to investigate fundamental laser-assisted processes in very powerful laser fields. In the present work and within the framework of laser-assisted quantum electrodynamics (QED), electron-proton scattering was considered in the presence of a strong electromagnetic field of circular polarization. First, we present a study of the process where we only take into account the relativistic dressing of the electron without the proton. Then, in order to explore the effect of the proton dressing, we fully consider the relativistic dressing of the electron and the proton together and describe them by using Dirac-Volkov functions. The analytical expression for the differential cross section (DCS) in both cases is derived at lowest-order of perturbation theory. As a result, the DCS is notably reduced by the laser field. It is found that the effect of proton dressing begins to appear at laser field strengths greater than or equal to $10^{10}~\text{V/cm}$ and it therefore must be taken into account. The influence of the laser field strength and frequency on the DCS is reported. A comparison with the Mott scattering and the laser-free results is also included.

翻訳日:2023-04-13 03:14:50 公開日:2021-02-01

# 量子暗号経済: 量子技術の進化のためのブロックチェーン予測市場

Quantum crypto-economics: Blockchain prediction markets for the evolution of quantum technology ( http://arxiv.org/abs/2102.00659v1 )

ライセンス: Link先を確認

Peter P. Rohde, Vijay Mohan, Sinclair Davidson, Chris Berg, Darcy Allen, Gavin K. Brennen, Jason Potts

(参考訳) 現在進行中の最も重要な技術進歩の2つは、量子技術の出現と、グローバル金融システムの暗号資産への移行、特にブロックチェーンベースの暗号通貨とスマートコントラクトである。しかし、いずれにせよ、量子技術はブロックチェーンの暗号基盤を直接侵害する能力を持つので、両者の間には重要な相互作用がある。我々は、量子リスクプレミアムの価格を含む様々なシナリオで、量子障害の金融モデルを構築することで、この複雑な相互作用を探求する。これを量子暗号経済と呼ぶ。

Two of the most important technological advancements currently underway are the advent of quantum technologies, and the transitioning of global financial systems towards cryptographic assets, notably blockchain-based cryptocurrencies and smart contracts. There is, however, an important interplay between the two, given that, in due course, quantum technology will have the ability to directly compromise the cryptographic foundations of blockchain. We explore this complex interplay by building financial models for quantum failure in various scenarios, including pricing quantum risk premiums. We call this quantum crypto-economics.

翻訳日:2023-04-13 03:14:08 公開日:2021-02-01

# マヨラナフェルミオンゲートを用いたフェルミオン系の量子演算とプロセストモグラフィー

Quantum operation of fermionic systems and process tomography using Majorana fermion gates ( http://arxiv.org/abs/2102.00620v1 )

ライセンス: Link先を確認

Gang Zhang, Mingxia Huo and Ying Li

(参考訳) 量子トモグラフィーは、量子演算のキャラクタリゼーションにとって重要なツールである。本稿では,フェルミオン系における量子トモグラフィーの枠組みについて述べる。量子ビット系と比較すると、フェルミオンは超選択則に従い、これがフェルミオン系の状態、過程、測定に制約を課す。その結果、フェルミオンモードの部分集合に作用する操作は部分的にしか再構成できず、完全な再構成にはその部分集合に加えて少なくとも1つの補助フェルミオンモードが常に必要となる。また,マヨラナフェルミオン量子コンピュータのゲートに基づく完全再構成のためのプロトコルを報告する。これには、情報完全な状態準備と測定を実現するための一連の回路が含まれる。

Quantum tomography is an important tool for the characterisation of quantum operations. In this paper, we present a framework of quantum tomography in fermionic systems. Compared with qubit systems, fermions obey the superselection rule, which sets constraints on states, processes and measurements in a fermionic system. As a result, we can only partly reconstruct an operation that acts on a subset of fermion modes, and the full reconstruction always requires at least one ancillary fermion mode in addition to the subset. We also report a protocol for the full reconstruction based on gates in Majorana fermion quantum computer, including a set of circuits for realising the informationally-complete state preparation and measurement.

翻訳日:2023-04-13 03:13:30 公開日:2021-02-01

# 3レベル設定を超えた非断熱ホロノミック量子計算の実現

Realizing nonadiabatic holonomic quantum computation beyond the three-level setting ( http://arxiv.org/abs/2102.00603v1 )

ライセンス: Link先を確認

G. F. Xu, P. Z. Zhao, Erik Sjöqvist, D. M. Tong

(参考訳) 非断熱ホロノミック量子計算(NHQC)は、誤差耐性のあるゲートを実装する方法を提供し、近年大きな注目を集めている。提案されて以来、3準位 $\Lambda$ 系が NHQC の典型的なビルディングブロックとなり、こうした系に基づいて多くの NHQC スキームが開発されてきた。本稿では、標準的な3準位設定を超えた NHQC の実現について検討する。我々の提案の中心となる考え方は、ビルディングブロック系のヒルベルト空間を拡大し、純粋にホロノミックな発展を保証するために二部グラフ構造を持たせることで、NHQC を改善することである。提案手法は、従来の量子ビットベースの NHQC の所要時間を効率的に短縮して改善するだけでなく、qudit ベースの NHQC の実装も提供する。したがって本提案は、効率的な量子情報プロセッサの物理的実現に大きく貢献しうる NHQC のさらなる発展を与える。

Nonadiabatic holonomic quantum computation (NHQC) provides a method to implement error-resilient gates and has attracted considerable attention recently. Since it was proposed, three-level $\Lambda$ systems have become the typical building block for NHQC and a number of NHQC schemes have been developed based on such systems. In this paper, we investigate the realization of NHQC beyond the standard three-level setting. The central idea of our proposal is to improve NHQC by enlarging the Hilbert space of the building block system and letting it have a bipartite graph structure in order to ensure purely holonomic evolution. Our proposal not only improves conventional qubit-based NHQC by efficiently reducing its duration, but also provides implementations of qudit-based NHQC. Therefore, our proposal provides a further development of NHQC that can contribute significantly to the physical realization of efficient quantum information processors.

翻訳日:2023-04-13 03:12:57 公開日:2021-02-01

# 2レベル系によるベリー曲率による量子状態進化の追跡

Tracking quantum state evolution by the Berry curvature with a two-level system ( http://arxiv.org/abs/2102.00808v1 )

ライセンス: Link先を確認

Ze-Lin Zhang, Ping Xu, Zhen-Biao Yang

(参考訳) 駆動する2レベル系のハミルトニアンの制御パラメータにまたがる2種類の位相構造(球面とトーラス)について検討し,その構造と系の力学との関係について考察した。本稿では, 動的応答法によって得られたベリー曲率について考察し, ベリー曲率を積分して探索したガッピング領域を含む物理および可観測多様体を示し, ベリー曲率を抽出してシステムの状態変化を追跡・操作できることを示す。

We investigate two kinds of topological structures (a sphere and a torus) spanned by the control parameters of a driven two-level system's Hamiltonian, and consider the connection between these structures and the system's dynamics. We discuss the Berry curvature obtained through the dynamical response method, show certain physical and observable manifolds, including the gapped region, probed by integrating the Berry curvature, and demonstrate that the system's state evolution can be tracked and manipulated by extracting the Berry curvature.

翻訳日:2023-04-13 03:07:15 公開日:2021-02-01

# 単一InAs/GaAs量子ドットにおける長寿命発光ダイナミクスの解析

Analysis of Emission Dynamics of a Long Lifetime in Single InAs/GaAs Quantum Dots ( http://arxiv.org/abs/2102.00791v1 )

ライセンス: Link先を確認

Junhui Huang, Hao Chen, Zhiyao Zhuo, Jian Wang, Shulun Li, Kun Ding, Haiqiao Ni, Zhichuan Niu, Desheng Jiang, Xiuming Dou, and Baoquan Sun

(参考訳) 単一InAs/GaAs量子ドット (QD) 試料では、非単一指数関数的な減衰特性を持つ非常に長寿命の発光が報告されており、そこでは湿潤層 (WL) に長寿命の準安定状態が存在する [ACS Photonics 2020,7,3228-3235]。本稿では、発光減衰曲線をシミュレートする新しい3準位モデルを提案する。このモデルでは、準安定状態の励起子が拡散してQDに捕獲され、その後QDで蛍光を放出すると仮定することで、伸長指数関数型の減衰式 $I(t)=At^{\beta-1}e^{-(rt)^{\beta}}$ が導かれる。この式は長寿命の減衰曲線をよく記述でき、平均寿命は $\langle\tau\rangle=\frac{1}{r}\Gamma\left(\frac{1}{\beta}+1\right)$ という解析式で与えられる($\Gamma$ はガンマ関数)。さらに、提案する3準位モデルに基づき、測定した $g^2(t)$ 曲線によく適合する2次自己相関関数 $g^2(t)$ の式も得られた。

A very long lifetime emission with a non-single-exponential decay characteristic has been reported for single InAs/GaAs quantum dot (QD) samples, in which there exists a long-lived metastable state in the wetting layer (WL) [ACS Photonics 2020,7,3228-3235]. In this article we propose a new three-level model to simulate the emission decay curve. In this model, assuming that the excitons in the metastable state diffuse, are trapped by QDs, and then emit fluorescence in the QDs, a stretched-like exponential decay formula is derived as $I(t)=At^{\beta-1}e^{-(rt)^{\beta}}$, which describes the long lifetime decay curve well, with an analytical expression for the average lifetime $\langle\tau\rangle=\frac{1}{r}\Gamma\left(\frac{1}{\beta}+1\right)$, where $\Gamma$ is the Gamma function. Furthermore, based on the proposed three-level model, an expression for the second-order autocorrelation function $g^2(t)$ that fits the measured $g^2(t)$ curve well is also obtained.
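The average-lifetime expression quoted above can be checked numerically. The following is a minimal sketch, not the authors' code: it assumes that "average lifetime" means the intensity-weighted mean time $\int t\,I(t)\,dt / \int I(t)\,dt$, and the function names and the parameter values for $r$ and $\beta$ are illustrative only.

```python
import math


def mean_lifetime_closed_form(r, beta):
    """Closed form quoted in the abstract: <tau> = Gamma(1/beta + 1) / r."""
    return math.gamma(1.0 / beta + 1.0) / r


def mean_lifetime_numeric(r, beta, n=200_000):
    """Intensity-weighted mean time, computed by midpoint sampling of quantiles.

    Normalizing I(t) = A t^(beta-1) exp(-(r t)^beta) gives the cumulative
    distribution F(t) = 1 - exp(-(r t)^beta), so the quantile function is
    t(q) = (-ln(1 - q))^(1/beta) / r. Averaging t(q) over evenly spaced q
    approximates the mean of the distribution.
    """
    total = 0.0
    for i in range(n):
        q = (i + 0.5) / n
        total += (-math.log1p(-q)) ** (1.0 / beta) / r
    return total / n


if __name__ == "__main__":
    # Illustrative (r, beta) pairs; beta = 1 recovers a plain exponential decay.
    for r, beta in [(1.0, 1.0), (0.5, 0.7), (2.0, 0.5)]:
        print(r, beta, mean_lifetime_closed_form(r, beta), mean_lifetime_numeric(r, beta))
```

For $\beta = 1$ the formula reduces to the familiar single-exponential lifetime $1/r$, which is a quick sanity check on the reading of the expression as $\frac{1}{r}\Gamma(1/\beta+1)$.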

翻訳日:2023-04-13 03:07:04 公開日:2021-02-01

# 識別可能な粒子に対する一夫一婦制による識別不能粒子の絡み合いの最大違反

Maximum Violation of Monogamy of Entanglement for Indistinguishable Particles by Measures that are Monogamous for Distinguishable Particles ( http://arxiv.org/abs/2102.00780v1 )

ライセンス: Link先を確認

Goutam Paul, Soumya Das and Anindya Banerji

(参考訳) 量子物理学の2つの重要な結果は、複製不可能定理(no-cloning theorem)とエンタングルメントのモノガミー(monogamy of entanglement)である。前者は任意の未知の量子状態の独立かつ同一のコピーの作成を禁止し、後者は複数の量子系間でのエンタングルメントの共有可能性を制限する。識別可能な粒子の場合、これらの結果の一方は他方を含意する。本レターでは、識別不能粒子(各粒子を個別に扱えない)を持つ量子ビット系において、識別可能な粒子に対してはモノガマスである測度によって、エンタングルメントのモノガミーの最大の破れが可能であることを示す。この結果を導くために、各自由度が他の自由度とエンタングルしうる空間的位置に対応する、識別不能粒子に対する自由度のトレースアウト規則を定式化する。この結果は、複製不可能定理と矛盾することなく、識別不能粒子に対するエンタングルメントの共有可能性の制限を取り除く。

Two important results of quantum physics are the no-cloning theorem and the monogamy of entanglement. The former forbids the creation of an independent and identical copy of an arbitrary unknown quantum state, and the latter restricts the shareability of quantum entanglement among multiple quantum systems. For distinguishable particles, one of these results implies the other. In this Letter, we show that in qubit systems with indistinguishable particles (where each particle cannot be addressed individually), a maximum violation of the monogamy of entanglement is possible by measures that are monogamous for distinguishable particles. To derive this result, we formulate the degree-of-freedom trace-out rule for indistinguishable particles corresponding to a spatial location where each degree of freedom might be entangled with the other degrees of freedom. Our result removes the restriction on the shareability of quantum entanglement for indistinguishable particles, without contradicting the no-cloning theorem.

翻訳日:2023-04-13 03:06:38 公開日:2021-02-01

# Bosonic Indistinguishability-Dependent Contextuality

Bosonic Indistinguishability-Dependent Contextuality ( http://arxiv.org/abs/2102.00746v1 )

ライセンス: Link先を確認

Ali Asadian and Adán Cabello

(参考訳) 我々は、最大文脈性をボソンの識別不能性に結び付ける量子文脈性の一形式を明らかにする。これは、Clauser-Horne-Shimony-Holt型ベル不等式に関する最大非局所性が最大エンタングルメントに結び付いているのと同様である。これまでのフォトニック文脈性とは異なり、この形式は識別不能性と高次干渉に依存するため、古典光ではシミュレートできない。ボソン系に対する理想的な測定は、補助量子ビットとの分散結合によって行うことができる。これにより、各測定の終了を任意に遅らせることや、高次元の文脈的相関を対象とすることが可能になる。これらは既存のプラットフォームでは実現できない特徴である。

We uncover a form of quantum contextuality that connects maximal contextuality to boson indistinguishability in a similar way to how maximal nonlocality with respect to the Clauser-Horne-Shimony-Holt Bell inequality is connected to maximal entanglement. Unlike previous forms of photonic contextuality, this form cannot be simulated with classical light, as it relies on indistinguishability and higher-order interference. Ideal measurements on the bosonic system can be performed by means of dispersive coupling with an ancillary qubit. This allows us to delay at will the end of each measurement and to target high-dimensional contextual correlations, features that cannot be achieved with existing platforms.

翻訳日:2023-04-13 03:05:42 公開日:2021-02-01

# 2方向古典通信を用いた実用的な量子鍵分布のための構成可能セキュリティ

Composable security for practical quantum key distribution with two way classical communication ( http://arxiv.org/abs/2102.00739v1 )

ライセンス: Link先を確認

Cong Jiang, Xiao-Long Hu, Zong-wen Yu and Xiang-bin Wang

(参考訳) 本稿では、送信/非送信(sending-or-not-sending)ツインフィールドプロトコルに対して、双方向古典通信(TWCC)による誤り除去を伴う量子鍵分配(QKD)の有限鍵効果を厳密に計算する手法を提案する。TWCCのない通常のQKDとは異なり、ここでは各2ビットランダム群のタグ付きまたはタグなしの確率は独立ではない。我々は、すべてのビットが独立かつ同一である仮想的なビット集合を想定することで、この問題を厳密に解決する。独立かつ同一のビットからなるこの仮想的な集合から出発した結果と、独立でないビットからなる実際の集合から出発した結果との関係を示す。明示的な公式により、計算にチェルノフ境界を単純に適用するだけで正しい鍵レートが得られるが、失敗確率はわずかに変化することを示す。

We present methods to strictly calculate the finite-key effects in quantum key distribution (QKD) with error rejection through two-way classical communication (TWCC) for the sending-or-not-sending twin-field protocol. Unlike normal QKD without TWCC, here the probabilities of tagging or untagging for each two-bit random group are not independent. We rigorously solve this problem by imagining a virtual set of bits where every bit is independent and identical. We show the relationship between the outcome starting from this imagined set of independent and identical bits and the outcome starting from the real set of non-independent bits. With explicit formulas, we show that simply applying the Chernoff bound in the calculation gives the correct key rate, but the failure probability changes slightly.

翻訳日:2023-04-13 03:05:11 公開日:2021-02-01

# スピン群上の自由フェルミオンの有限次元系と拡散過程

Finite dimensional systems of free Fermions and diffusion processes on Spin groups ( http://arxiv.org/abs/2102.01000v1 )

ライセンス: Link先を確認

Luigi M. Borasi

(参考訳) 本稿では有限次元のフェルミオンを扱う。ここでフェルミオンとは、自身の上の外積代数に埋め込まれた有限次元複素空間のベクトルを意味する。これらのフェルミオンはスピンを持たないが、特徴的な反可換性を持つ。リー群 $\mathrm{Spin}(2n+1)$ 上の不変複素ベクトル場をフェルミオンの生成・消滅作用素に対応させる。これらのベクトル場はリー代数 $\mathfrak{so}(2n+1)$ の正則表現の複素化の元である。そのため正準反交換関係を満たさないが、$L^2(\mathrm{Spin}(2n+1))$ の適切な部分空間に射影すると、これらの関係が満たされる。生成・消滅作用素の対称正定値二次形式を用いて、このフェルミオン系の自由時間発展を定義する。(不変)ベクトル場によるフェルミオンの生成・消滅作用素の実現により、この時間発展は、確率的拡散過程を生成する2階作用素と、この2階作用素と強可換な1階複素作用素との和である正の自己共役作用素によって解釈できる。確率論的解釈は、2階作用素に付随する拡散過程に関するファインマン・カッツ型の公式によって与えられる。

In this article we are concerned with finite dimensional Fermions, by which we mean vectors in a finite dimensional complex space embedded in the exterior algebra over itself. These Fermions are spinless but possess the characterizing anticommutativity property. We associate invariant complex vector fields on the Lie group $\mathrm{Spin}(2n+1)$ to the Fermionic creation and annihilation operators. These vector fields are elements of the complexification of the regular representation of the Lie algebra $\mathfrak{so}(2n+1)$. As such, they do not satisfy the canonical anticommutation relations, however, once they have been projected onto an appropriate subspace of $L^2(\mathrm{Spin}(2n+1))$, these relations are satisfied. We define a free time evolution of this system of Fermions in terms of a symmetric positive-definite quadratic form in the creation-annihilation operators. The realization of Fermionic creation and annihilation operators brought by the (invariant) vector fields allows us to interpret this time evolution in terms of a positive selfadjoint operator which is the sum of a second order operator, which generates a stochastic diffusion process, and a first order complex operator, which strongly commutes with the second order operator. A probabilistic interpretation is given in terms of a Feynman-Kac like formula with respect to the diffusion process associated with the second order operator.

翻訳日:2023-04-13 02:57:14 公開日:2021-02-01

# マルチループ原子サニャック干渉計

Multi-loop atomic Sagnac interferometry ( http://arxiv.org/abs/2102.00991v1 )

ライセンス: Link先を確認

Christian Schubert, Sven Abend, Matthias Gersemann, Martina Gebbe, Dennis Schlippert, Peter Berg, Ernst M. Rasel

(参考訳) 光および物質波干渉計の回転に対する感度はSagnac効果に基づいており、干渉計が囲む面積とともに増加する。光の場合、この面積は複数のファイバーループを形成することで拡大できるが、物質波干渉計でこれに相当するものを実現することは依然として実験上の課題である。本稿では、光パルスによって形成されるスケーラブルな面積を持つマルチループ原子干渉計の概念を提案する。提案手法は、地球回転モニタリングに必要な長期安定性と組み合わせて、1 sで最大 $2\cdot10^{-11}$ rad/s の感度を提供する。

The sensitivity of light and matter-wave interferometers to rotations is based on the Sagnac effect and increases with the area enclosed by the interferometer. In the case of light, the latter can be enlarged by forming multiple fibre loops, whereas the equivalent for matter-wave interferometers remains an experimental challenge. We present a concept for a multi-loop atom interferometer with a scalable area formed by light pulses. Our method will offer sensitivities as high as $2\cdot10^{-11}$ rad/s at 1 s in combination with the respective long-term stability as required for Earth rotation monitoring.

翻訳日:2023-04-13 02:56:38 公開日:2021-02-01

# 進化的多目的最適化における大規模候補解集合からの高速グリーディサブセット選択

Fast Greedy Subset Selection from Large Candidate Solution Sets in Evolutionary Multi-objective Optimization ( http://arxiv.org/abs/2102.00941v1 )

ライセンス: Link先を確認

Weiyu Chen, Hisao Ishibuchi, and Ke Shang

(参考訳) 部分集合選択は進化的多目的最適化(EMO)の分野において興味深く重要なトピックである。特に、非有界な外部アーカイブを持つEMOアルゴリズムでは、部分集合選択は、最終結果としてあらかじめ指定された数の解を選択するために必須の後処理手順である。本稿では、超体積、IGD、IGD+の各指標に対するグリーディ部分集合選択の効率について論じる。グリーディアルゴリズムは通常、部分集合選択を効率的に処理する。しかし、多数の解が与えられると(例えば、非有界な外部アーカイブ中の数万の解からの部分集合選択)、しばしば時間がかかる。我々のアイデアは、超体積指標で知られている劣モジュラ性を用いて効率を向上させることである。まず、IGDとIGD+の指標も劣モジュラであることを証明する。次に、劣モジュラ性に基づき、各指標に対する効率的なグリーディ包含アルゴリズムを提案する。最後に、提案アルゴリズムが標準的なグリーディ部分集合選択アルゴリズムよりもはるかに高速であることを計算実験によって示す。

Subset selection is an interesting and important topic in the field of evolutionary multi-objective optimization (EMO). Especially, in an EMO algorithm with an unbounded external archive, subset selection is an essential post-processing procedure to select a pre-specified number of solutions as the final result. In this paper, we discuss the efficiency of greedy subset selection for the hypervolume, IGD and IGD+ indicators. Greedy algorithms usually efficiently handle subset selection. However, when a large number of solutions are given (e.g., subset selection from tens of thousands of solutions in an unbounded external archive), they often become time-consuming. Our idea is to use the submodular property, which is known for the hypervolume indicator, to improve their efficiency. First, we prove that the IGD and IGD+ indicators are also submodular. Next, based on the submodular property, we propose an efficient greedy inclusion algorithm for each indicator. Then, we demonstrate through computational experiments that the proposed algorithms are much faster than the standard greedy subset selection algorithms.

翻訳日:2023-04-13 02:56:20 公開日:2021-02-01
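
The abstract above relies on submodularity to speed up greedy inclusion. A minimal sketch of that general idea follows, using the classic "lazy greedy" trick on an IGD-style objective. This is not the authors' algorithm: the function name `lazy_greedy_igd`, the Euclidean distance, and the constant `big` (a stand-in for the undefined IGD of an empty set) are all assumptions made for illustration.

```python
import heapq
import math


def euclid(a, b):
    """Euclidean distance between two points given as tuples."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def lazy_greedy_igd(candidates, reference, k):
    """Pick up to k candidate indices that greedily reduce the summed
    distance from each reference point to its nearest selected candidate.

    Lazy evaluation: by submodularity, a candidate's marginal gain can only
    shrink as the selected set grows, so a stale gain is a valid upper bound
    and most candidates never need to be re-evaluated.
    """
    # dist[i][j]: distance from reference point j to candidate i
    dist = [[euclid(c, r) for r in reference] for c in candidates]
    # Assumption: a finite placeholder for "no solution selected yet".
    big = max(max(row) for row in dist) + 1.0
    nearest = [big] * len(reference)

    def gain(i):
        # Total reduction in nearest-distance if candidate i were added.
        return sum(max(0.0, nearest[j] - dist[i][j])
                   for j in range(len(reference)))

    # Heap entries: (negative gain, candidate index, round of evaluation).
    heap = [(-gain(i), i, 0) for i in range(len(candidates))]
    heapq.heapify(heap)
    selected = []
    while heap and len(selected) < k:
        _, i, stamp = heapq.heappop(heap)
        if stamp == len(selected):
            # Gain is up to date and still the largest: select it.
            selected.append(i)
            nearest = [min(nearest[j], dist[i][j])
                       for j in range(len(reference))]
        else:
            # Stale upper bound: recompute and push back.
            heapq.heappush(heap, (-gain(i), i, len(selected)))
    return selected


if __name__ == "__main__":
    cands = [(0.0, 0.0), (0.1, 0.0), (1.0, 1.0), (5.0, 5.0)]
    refs = [(0.0, 0.0), (1.0, 1.0)]
    print(lazy_greedy_igd(cands, refs, 2))
```

The result is the same set a plain greedy pass would return (up to ties); the saving comes from skipping re-evaluation of candidates whose old gain is already below the current best.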

# 曲面上のスピン零中性および荷電粒子に対するエルミートハミルトニアンの構成 : 物理的アプローチ

Constructing Hermitian Hamiltonians for spin zero neutral and charged particles on a curved surface : physical approach ( http://arxiv.org/abs/2102.00896v1 )

ライセンス: Link先を確認

M.S.Shikakhwa and N.Chair

(参考訳) 強い法線方向の力を仮定し、曲面を囲む層の厚さをゼロにすることで曲面に束縛されたスピン零粒子の曲面ハミルトニアンを構成する。これを達成するための新しいアプローチは、曲面に沿った成分と曲面の法線方向の成分がそれぞれ別々にエルミートである3次元運動量作用素の式から出発することである。この場合、運動エネルギー作用素の法線部分はエルミート作用素である。この作用素を落として層の厚さをゼロにすると、期待どおり幾何学的ポテンシャル項を含むエルミートな曲面ハミルトニアンが自動的に得られる。電磁場中の中性粒子と荷電粒子の両方に対するハミルトニアンを構成する。通常の法線運動量作用素と曲面運動量作用素を対称化すると、エルミートな曲面運動量と法線運動量が自動的に現れることを示す。このアプローチは、幾何学的ポテンシャルが、曲面運動量作用素をエルミートにするために加えられる項に由来することを明らかにする。この項自体は、曲線座標における微分運動量作用素の対称化・順序付けから生じる。さらに、本アプローチと、Jenssen、Koppe、Costaによる類似のアプローチ(いわゆるThin-Layer Quantization (TLQ))との関係を検討する。そこで導入された、層の厚さをゼロにする前の波動関数の重要な変換は、著者らによって明示的に述べられてはいないものの、実際には曲面および法線方向の運動エネルギー作用素のそれぞれを単独でエルミートにしており、これはまさに本アプローチが最初から行っていることである。

The surface Hamiltonian for a spin zero particle that is pinned to a surface by letting the thickness of a layer surrounding the surface go to zero -- assuming a strong normal force -- is constructed. The new approach we follow to achieve this is to start with an expression for the 3D momentum operators whose components along the surface and the normal to the surface are separately Hermitian. The normal part of the kinetic energy operator is a Hermitian operator in this case. When this operator is dropped and the thickness of the layer is set to zero, one automatically gets the Hermitian surface Hamiltonian that contains the geometric potential term as expected. Hamiltonians for both a neutral and a charged particle in an electromagnetic field are constructed. We show that a Hermitian surface and normal momenta emerge automatically once one symmetrizes the usual normal and surface momentum operators. The present approach makes it manifest that the geometrical potential originates from the term that is added to the surface momentum operator to render it Hermitian; this term itself emerges from symmetrization/ordering of differential momentum operators in curvilinear coordinates. We investigate the connection between this approach and the similar approach of Jenssen and Koppe and Costa (the so-called Thin-Layer Quantization (TLQ)). We note that the critical transformation of the wavefunction introduced there before taking the thickness of the layer to zero actually -- while not explicitly stated by the authors -- renders each of the surface and normal kinetic energy operators Hermitian by itself, which is just what our approach does from the onset.

翻訳日:2023-04-13 02:55:44 公開日:2021-02-01

# 癌治療用ナノキャリアの自動発見のための進化計算プラットフォーム

Evolutionary computational platform for the automatic discovery of nanocarriers for cancer treatment ( http://arxiv.org/abs/2102.00879v1 )

ライセンス: Link先を確認

Namid Stillman, Igor Balaz, Antisthenis Tsompanas, Marina Kovacevic, Sepinoud Azimi, Sebastien Lafond, Andrew Adamatzky, Sabine Hauert

(参考訳) ナノメディシンの進化のためのEVONANOプラットフォームと、その抗がん治療への応用について述べる。EVONANOは、腫瘍を成長させ、代表的なシナリオを抽出し、これらのシナリオを通るナノ粒子輸送をシミュレートしてナノ粒子分布を予測するシミュレータを含む。ナノ粒子の設計は機械学習を用いて最適化され、最も効果的な抗がん治療を効率的に見つける。様々な腫瘍環境においてがん細胞を選択的に殺傷するように、ナノ粒子の特性と治療法を最適化する2つの例によって本プラットフォームを実演する。

We present the EVONANO platform for the evolution of nanomedicines with application to anti-cancer treatments. EVONANO includes a simulator to grow tumours, extract representative scenarios, and then simulate nanoparticle transport through these scenarios to predict nanoparticle distribution. The nanoparticle designs are optimised using machine learning to efficiently find the most effective anti-cancer treatments. We demonstrate our platform with two examples optimising the properties of nanoparticles and treatment to selectively kill cancer cells over a range of tumour environments.

翻訳日:2023-04-13 02:54:55 公開日:2021-02-01

# ポラリトン超流動における量子化された渦とダークソリトンの自発的生成、強化伝播、光インプリンティング:量子乱流の制御に向けて

Spontaneous generation, enhanced propagation and optical imprinting of quantized vortices and dark solitons in a polariton superfluid: towards the control of quantum turbulence ( http://arxiv.org/abs/2102.01075v1 )

ライセンス: Link先を確認

Anne Maitre, Ferdinand Claude, Giovani Lerario, Serguei Koniakhin, Simon Pigeon, Dmitry Solnyshkov, Guillaume Malpuech, Quentin Glorieux, Elisabeth Giacobino and Alberto Bramati

(参考訳) 共鳴励起されたポラリトン超流動において、我々は最近、ポラリトン系の双安定性に基づく新しい領域を探索し、ポラリトン流体の伝播を巨視的な距離まで強化した。この手法は全光学的インプリント法とともに、量子化された渦やダークソリトンのような様々なトポロジカル励起の生成と制御を可能にした。新しい実験スキームの柔軟性とスケーラビリティは、駆動散逸型の光の量子流体における量子乱流の体系的研究への道を開く。本稿では、双安定性によって強化された伝播とインプリント技術の基本的な動作原理を概説し、これまでに得られた主な成果と最も有望な今後の研究の方向性について議論する。

In resonantly pumped polariton superfluids we recently explored a new regime based on the bistability of the polariton system to enhance the propagation of polariton fluids up to macroscopic distances. This technique together with an all-optical imprinting method allowed the generation and control of various topological excitations such as quantized vortices and dark solitons. The flexibility and scalability of the new experimental scheme opens the way to the systematic study of quantum turbulence in driven dissipative quantum fluids of light. In this article we review the basic working principles of the bistability enhanced propagation and of the imprinting technique and we discuss the main achieved results as well as the most promising future research directions.

翻訳日:2023-04-13 02:48:00 公開日:2021-02-01

# 実験室における量子重力: サイズによるテレポーテーションと通過可能なワームホール (パートII)

Quantum Gravity in the Lab: Teleportation by Size and Traversable Wormholes, Part II ( http://arxiv.org/abs/2102.01064v1 )

ライセンス: Link先を確認

Sepehr Nezami, Henry W. Lin, Adam R. Brown, Hrant Gharibyan, Stefan Leichenauer, Grant Salton, Leonard Susskind, Brian Swingle, Michael Walter

(参考訳) [1]では、量子デバイスを用いて量子重力をシミュレートする方法を議論し、サイズによるテレポーテーションとサイズワインディングの現象という具体的な提案を行った。ここでは、「実験室における量子重力」を行うことの意味と、サイズワインディングがバルクの重力物理や通過可能なワームホールとどのように結びつくのかを詳しく説明する。完全なサイズワインディングは、演算子のサイズ波動関数の注目すべききめ細かな性質であり、ほぼAdS_2のバルクを持つ量子系に対してこの性質が成り立たなければならないことをバルクの計算から示す。次に、Sachdev-Ye-Kitaevモデル、ランダム行列、スピン鎖の3つの系におけるサイズによるテレポーテーションを詳細に検討し、近い将来の量子デバイスでこれらの現象を実現する見通しについて議論する。

In [1] we discussed how quantum gravity may be simulated using quantum devices and gave a specific proposal -- teleportation by size and the phenomenon of size-winding. Here we elaborate on what it means to do 'Quantum Gravity in the Lab' and how size-winding connects to bulk gravitational physics and traversable wormholes. Perfect size-winding is a remarkable, fine-grained property of the size wavefunction of an operator; we show from a bulk calculation that this property must hold for quantum systems with a nearly-AdS_2 bulk. We then examine in detail teleportation by size in three systems: the Sachdev-Ye-Kitaev model, random matrices, and spin chains, and discuss prospects for realizing these phenomena in near-term quantum devices.

翻訳日:2023-04-13 02:47:40 公開日:2021-02-01

# 相対論的量子力学における時間とエネルギーの第二量子化

Second quantization of time and energy in Relativistic Quantum Mechanics ( http://arxiv.org/abs/2102.01042v1 )

ライセンス: Link先を確認

M. Bauer and C.A. Aguill\'on

(参考訳) ローレンツ不変性とボルンの相反性不変性に基づく特殊相対性理論(SR)の正準量子化は、ディラックのハミルトニアンの存在と、パウリの反論を回避する自己共役な時間作用素の存在に統一的な起源を与えることが示されている。このように、このアプローチは、運動量とエネルギーの場合と同等の立場での空間と時間の扱いを量子力学(QM)に取り戻す。時間作用素場の第二量子化は、ディラック・ハミルトニアン場の第二量子化に一歩ずつ従う。これは、場の量子論(QFT)におけるエネルギー量子と同様の形で、時間量子の概念を導入する。初期の関連は、フェッシュバッハの核反応の統一理論にすでに見られる。冷却原子系やボース=アインシュタイン凝縮の分野におけるフェッシュバッハ共鳴、量子重力における時間の問題など、現在の発展との関連の可能性を指摘する。

Based on Lorentz invariance and Born reciprocity invariance, the canonical quantization of Special Relativity (SR) has been shown to provide a unified origin for the existence of Dirac's Hamiltonian and a self adjoint time operator that circumvents Pauli's objection. As such, this approach restores to Quantum Mechanics (QM) the treatment of space and time on an equivalent footing as that of momentum and energy. Second quantization of the time operator field follows step by step that of the Dirac Hamiltonian field. It introduces the concept of time quanta, in a similar way to the energy quanta in Quantum Field Theory (QFT). An early connection is found already in Feshbach's unified theory of nuclear reactions. Its possible relevance in current developments such as Feshbach resonances in the fields of cold atom systems, of Bose-Einstein condensates and in the problem of time in Quantum Gravity is noted.

翻訳日:2023-04-13 02:47:01 公開日:2021-02-01

# 量子情報処理と量子センシングのための2モード圧縮状態の重ね合わせ

Superposition of two-mode squeezed states for quantum information processing and quantum sensing ( http://arxiv.org/abs/2102.01032v1 )

ライセンス: Link先を確認

Fernando R. Cardoso, Daniel Z. Rossatto, Gabriel P. L. M. Fernandes, Gerard Higgins and Celso J. Villas-Boas

(参考訳) 量子情報処理や量子センシングに応用可能な2モード圧縮状態(TMSS)の重ね合わせについて検討する。まず、各モードの統計や2つのモード間のエンタングルメントの度合いなど、これらの非古典的状態のいくつかの性質を調べる。このエンタングルメントの度合いは、同じ圧縮度を持つTMSSよりも高くなり得る。本稿で述べるように、ここで考える状態は、例えばトラップイオン系において、2つのモードとスピン-$\tfrac{1}{2}$粒子からなる系に2モードのJaynes-Cummings相互作用および反Jaynes-Cummings相互作用を誘起することで生成できる。2つの調和振動子が2つのTMSSの重ね合わせに準備された場合、縮約された各単一モード状態を、位相空間におけるモードの任意の変位の検出に有利に利用できることを示す。この縮約状態のウィグナー関数は位相空間の原点を中心とする対称なピークを示し、平均光子数の増加とともに両方の直交位相で同時に狭くなるという便利な特性を持つ。この狭いピークは量子センサーのポインタとして利用でき、位相空間におけるその位置が振動子の受けた変位を示す。

We investigate superpositions of two-mode squeezed states (TMSSs), which have potential applications to quantum information processing and quantum sensing. Firstly we study some properties of these nonclassical states such as the statistics of each mode and the degree of entanglement between the two modes, which can be higher than that of a TMSS with the same degree of squeezing. The states we consider can be prepared by inducing two-mode Jaynes-Cummings and anti-Jaynes-Cummings interactions in a system of two modes and a spin-$\tfrac{1}{2}$ particle, for instance in the trapped ion domain, as described here. We show that when two harmonic oscillators are prepared in a superposition of two TMSSs, each reduced single-mode state can be advantageously employed to sense arbitrary displacements of the mode in phase space. The Wigner function of this reduced state exhibits a symmetrical peak centered at the phase-space origin, which has the convenient peculiarity of getting narrower in both quadratures simultaneously as the average photon number increases. This narrow peak can be used as the pointer of our quantum sensor, with its position in phase space indicating the displacement undergone by the oscillator.

翻訳日:2023-04-13 02:46:45 公開日:2021-02-01

# 古典シャドウを用いた量子スクランブリング

Quantum scrambling with classical shadows ( http://arxiv.org/abs/2102.01008v1 )

ライセンス: Link先を確認

Roy J. Garcia and You Zhou and Arthur Jaffe

(参考訳) 量子ダイナミクスは基礎的な関心の対象であり、量子情報処理にも影響を及ぼす。4点の非時間順序相関関数(OTOC)は、多体ダイナミクスのもとでの量子情報スクランブリングを定量化するために伝統的に用いられている。OTOCは時間順序が特殊であるため、その測定は困難である。本稿では、初期時刻のスクランブリング挙動を明らかにするための高次のOTOCを提案し、シャドウ推定法を用いて任意の高次OTOCを測定するプロトコルを提示する。これらのプロトコルは、時間反転発展や補助系の制御を必要としない。単一量子ビットの読み出しを備えた近い将来の量子デバイスで実装できる。

Quantum dynamics is of fundamental interest and has implications in quantum information processing. The four-point out-of-time-ordered correlator (OTOC) is traditionally used to quantify quantum information scrambling under many-body dynamics. Due to the OTOC's unusual time ordering, its measurement is challenging. We propose higher-point OTOCs to reveal early-time scrambling behavior, and present protocols to measure any higher-point OTOC using the shadow estimation method. The protocols circumvent the need for time-reversal evolution and ancillary control. They can be implemented in near-term quantum devices with single-qubit readout.

翻訳日:2023-04-13 02:46:02 公開日:2021-02-01

# 専用量子プロセッサ設計

Special-Purpose Quantum Processor Design ( http://arxiv.org/abs/2102.01228v1 )

ライセンス: Link先を確認

Bin-Han Lu, Yu-Chun Wu, Wei-Cheng Kong, Qi Zhou, and Guo-Ping Guo

(参考訳) 量子ビットの完全接続はほとんどの量子アルゴリズムで必要とされるが、NISQ(Noisy Intermediate-Scale Quantum)プロセッサ上に直接実装することは困難である。しかし、結合されていない量子ビット間の2量子ビットゲートを可能にするためにスワップゲートを挿入すると、計算結果の忠実度が著しく低下する。そこで本研究では、異なる量子アルゴリズムに適した構造を設計できる専用量子プロセッサ設計法を提案する。提案手法は、プロセッサ構造を2次元格子グラフから一般の平面グラフに拡張し、量子アルゴリズムの論理量子ビット間の2量子ビットゲートの分布と物理的制約に応じて物理カプラを配置する。実験の結果、提案する設計手法は他の手法と比較して、2量子ビットゲートあたりの余分なスワップゲートの数を平均で少なくとも104.2%削減できることが示された。また、深さと量子ビット数の増加に伴い、他の手法に対する本手法の優位性はより明確になる。この結果は、本手法が計算結果の忠実度の向上において競争力を持ち、技術的条件下で量子優位性を示す可能性があることを示している。

Full connectivity of qubits is necessary for most quantum algorithms, which is difficult to directly implement on Noisy Intermediate-Scale Quantum processors. However, inserting swap gates to enable the two-qubit gates between uncoupled qubits significantly decreases the computation result fidelity. To this end, we propose a Special-Purpose Quantum Processor Design method that can design suitable structures for different quantum algorithms. Our method extends the processor structure from two-dimensional lattice graph to general planar graph and arranges the physical couplers according to the two-qubit gate distribution between the logical qubits of the quantum algorithm and the physical constraints. Experimental results show that our design methodology, compared with other methods, could reduce the number of extra swap gates per two-qubit gate by at least 104.2% on average. Also, our method's advantage over other methods becomes more obvious as the depth and qubit number increase. The result reveals that our method is competitive in improving computation result fidelity and it has the potential to demonstrate quantum advantage under the technical conditions.

翻訳日:2023-04-13 02:38:25 公開日:2021-02-01

# OceanおよびMukaiソルバを用いた電子構造QUBOのサンプリング

Sampling electronic structure QUBOs with Ocean and Mukai solvers ( http://arxiv.org/abs/2102.01225v1 )

ライセンス: Link先を確認

Alexander Teplukhin (1), Brian K. Kendrick (1), Susan M. Mniszewski (2), Sergei Tretiak (1) and Pavel A. Dub (3) ((1) Theoretical Division, Los Alamos National Laboratory, (2) Computer, Computational and Statistical Sciences Division, Los Alamos National Laboratory, (3) Chemistry Division, Los Alamos National Laboratory)

(参考訳) 最先端のD-Wave Advantage量子アニーラーは5000以上の量子ビットを持つが、各量子ビットは少数の近傍にしか接続されていない。そのため、完全連結グラフを実装すると、利用できる量子ビット数は1桁減少する。量子ビット数の減少を補うには、qbsolvのような特殊なヒューリスティックソフトウェアに頼る必要がある。その目的は、大きな問題を量子アニーラーに収まる小さな部分に分解することである。本研究では、そのようなソフトウェアの2つの実装、すなわちD-Wave Oceanツールの一部であるオリジナルのオープンソースqbsolvと、Quantum Computing Inc.(QCI)の新しいMukai QUBOソルバの性能を比較する。比較は電子構造問題の求解について行い、古典モード(タブーサーチ手法)で実施する。Quantum Annealer Eigensolverを用いて、電子構造の固有値・固有ベクトル方程式を、現代の量子アニーラーで解ける種類の問題にマッピングする。本研究で行ったすべての計算(基底状態と励起状態の両方)において、Mukai QUBOソルバはOcean qbsolvを上回った。この研究は、現代の量子アニーラーの利用を支援するソフトウェアの開発を促進する。

The most advanced D-Wave Advantage quantum annealer has 5000+ qubits, however, every qubit is connected to a small number of neighbors. As such, implementation of a fully-connected graph results in an order of magnitude reduction in qubit count. To compensate for the reduced number of qubits, one has to rely on special heuristic software such as qbsolv, the purpose of which is to decompose a large problem into smaller pieces that fit onto a quantum annealer. In this work, we compare the performance of two implementations of such software: the original open-source qbsolv which is a part of the D-Wave Ocean tools and a new Mukai QUBO solver from Quantum Computing Inc. (QCI). The comparison is done for solving the electronic structure problem and is implemented in a classical mode (Tabu search techniques). The Quantum Annealer Eigensolver is used to map the electronic structure eigenvalue-eigenvector equation to a type of problem solvable on modern quantum annealers. We find that the Mukai QUBO solver outperforms the Ocean qbsolv for all calculations done in the present work, both the ground and excited state calculations. This work stimulates the development of software to assist in the utilization of modern quantum annealers.

翻訳日:2023-04-13 02:38:09 公開日:2021-02-01
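
For readers unfamiliar with the problem class named in the abstract above: a QUBO asks for a binary vector $x$ that minimizes $\sum_{i \le j} Q_{ij} x_i x_j$. The sketch below only illustrates that objective with an exhaustive search on a toy instance. It does not use or imitate the qbsolv or Mukai APIs, and the dictionary layout of `Q` and the function names are assumptions chosen for this example.

```python
from itertools import product


def qubo_energy(Q, x):
    """Energy of binary assignment x under QUBO coefficients Q.

    Q maps index pairs (i, j) to coefficients; diagonal entries (i, i)
    act as linear terms because x_i * x_i == x_i for binary x_i.
    """
    return sum(coeff * x[i] * x[j] for (i, j), coeff in Q.items())


def brute_force_qubo(Q, n):
    """Exhaustively search all 2**n assignments (feasible only for tiny n)."""
    best = min(product((0, 1), repeat=n), key=lambda x: qubo_energy(Q, x))
    return best, qubo_energy(Q, best)


if __name__ == "__main__":
    # Toy instance: reward each variable, penalize adjacent pairs.
    Q = {(0, 0): -1.0, (1, 1): -1.0, (2, 2): -1.0,
         (0, 1): 2.0, (1, 2): 2.0}
    print(brute_force_qubo(Q, 3))
```

Exhaustive search scales as $2^n$, which is why heuristic decomposing solvers of the kind compared in the paper are needed for realistic problem sizes.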

# ビッグジオソーシャルデータ分析を用いた大規模イベント時の集団的な人間移動ダイナミクスの理解

Understanding collective human movement dynamics during large-scale events using big geosocial data analytics ( http://arxiv.org/abs/2102.01175v1 )

ライセンス: Link先を確認

Junchuan Fan, Kathleen Stewart

(参考訳) 情報通信技術の急速な進歩に伴い、多くの研究者が、大規模な自然的・社会的イベントに対する人間の移動ダイナミクスを研究するために、民間データベンダーの代替データソースを採用している。ジオリファレンスされたツイートのようなビッグジオソーシャルデータは公開されており、現実世界のイベントの進行とともに動的に変化するため、人々のリアルタイムの感情や反応を捉えやすい。しかし、正確な位置情報を持つジオソーシャルデータは希少であり、都市の人口中心に偏っている。本研究では、公開されているジオリファレンスされたツイートから、大規模イベントに対する人間の移動ダイナミクスを抽出するためのビッグジオソーシャルデータ分析フレームワークを開発した。このフレームワークは、ジオリファレンスされたツイートのデータ不足を軽減するために、より的を絞った方法でデータを収集する2段階のデータ収集モジュールを含む。さらに、異なる空間スケールのジオリファレンス情報を融合するために可変帯域幅カーネル密度推定(VB-KDE)を採用し、ジオリファレンスされたツイートに含まれる人間の移動ダイナミクスの信号をさらに増強した。ジオリファレンスされたツイートのサンプリングバイアスを補正するため、空間単位(例えば、郡、州)ごとのツイート数を人口によって調整した。提案する分析フレームワークの性能を実証するため、米国全土で発生した天文イベントである2017年のGreat American Eclipseを事例として選択し、このイベントに対する人間の移動ダイナミクスを調べた。ただし、この分析フレームワークはハリケーンや地震のような他の種類の大規模イベントにも容易に適用できる。

With the rapid advancement of information and communication technologies, many researchers have adopted alternative data sources from private data vendors to study human movement dynamics in response to large-scale natural or societal events. Big geosocial data such as georeferenced tweets are publicly available and dynamically evolving as real-world events are happening, making it more likely to capture the real-time sentiments and responses of populations. However, precisely-geolocated geosocial data is scarce and biased toward urban population centers. In this research, we developed a big geosocial data analytical framework for extracting human movement dynamics in response to large-scale events from publicly available georeferenced tweets. The framework includes a two-stage data collection module that collects data in a more targeted fashion in order to mitigate the data scarcity issue of georeferenced tweets; in addition, a variable bandwidth kernel density estimation (VB-KDE) approach was adopted to fuse georeference information at different spatial scales, further augmenting the signals of human movement dynamics contained in georeferenced tweets. To correct for the sampling bias of georeferenced tweets, we adjusted the number of tweets for different spatial units (e.g., county, state) by population. To demonstrate the performance of the proposed analytic framework, we chose an astronomical event that occurred nationwide across the United States, i.e., the 2017 Great American Eclipse, as an example event and studied the human movement dynamics in response to this event. However, this analytic framework can easily be applied to other types of large-scale events such as hurricanes or earthquakes.

翻訳日:2023-04-13 02:37:23 公開日:2021-02-01

# ソーダライムガラス中のNa/Kイオン交換により作製した光方向性結合器で実装した量子プロジェクタ

Quantum projectors implemented with optical directional couplers fabricated by Na/K ion-exchange in soda-lime glass ( http://arxiv.org/abs/2102.01169v1 )

ライセンス: Link先を確認

Xes\'us Prieto-Blanco, Carlos Montero-Orille, Jes\'us Li\~nares, H\'ector Gonz\'alez-N\'u\~nez and Daniel Balado

(参考訳) ソーダライムガラス中のNa/Kイオン交換プロセスで作製した集積光方向性結合器によって実装される量子プロジェクタについて、予備的な理論的・実験的研究を行う。2x2方向性結合器を連結して構成されるデバイスに関する理論的考察を示し、N次元の量子射影測定の実装と、それに伴う1-qudit状態の生成が可能であることを示す。これらのデバイスの基本単位は2x2方向性結合器であるため、光学的特性評価によって、このような結合器の作製パラメータと光学パラメータの間の経験的関係を得るための実験的研究を示す。同様に、2次元の量子プロジェクタを実証し、X(対角)基底およびY(円)基底の状態に対する射影測定が得られることを示す。

We present a preliminary theoretical and experimental study of quantum projectors implemented by integrated optical directional couplers fabricated by ion-exchange Na/K processes in soda-lime glass. Theoretical considerations about devices formed by concatenated 2x2 directional couplers are presented in order to show their capabilities for implementing N-dimensional quantum projective measurements, and concomitantly the production of 1-qudit states. Since the fundamental unit of these devices are 2x2 directional couplers, we present an experimental study for obtaining, by an optical characterization, empiric relationships between fabrication and optical parameters of such couplers. Likewise, a two-dimensional quantum projector is demonstrated in such a way that projective measurements are obtained for the states of X (diagonal) and Y (circular) bases.

翻訳日:2023-04-13 02:36:54 公開日:2021-02-01

# DisQ: OpenPulseを用いたIBM量子コンピュータの新しい量子出力状態分類法

DisQ: A Novel Quantum Output State Classification Method on IBM Quantum Computers using OpenPulse ( http://arxiv.org/abs/2102.01153v1 )

ライセンス: Link先を確認

Tirthak Patel and Devesh Tiwari

(参考訳) 超伝導量子コンピューティング技術は、新しい計算可能性の時代を幕開けた。量子技術の改善と、誤差率を低減した量子アルゴリズムを効率的に実行するソフトウェアスタックの構築に向けた研究が盛んに行われているが、誤差率の低減を目的とした量子出力状態の定義と分類の最適化への取り組みはまだ限られている。そこで本研究では,NISQデバイス上での量子プログラムの誤り率を低減する量子出力状態分類手法であるDisQを提案する。

Superconducting quantum computing technology has ushered in a new era of computational possibilities. While a considerable research effort has been geared toward improving the quantum technology and building the software stack to efficiently execute quantum algorithms with reduced error rate, effort toward optimizing how quantum output states are defined and classified for the purpose of reducing the error rate is still limited. To this end, this paper proposes DisQ, a quantum output state classification approach which reduces error rates of quantum programs on NISQ devices.

翻訳日:2023-04-13 02:36:40 公開日:2021-02-01

# 量子軌道上のKusuoka測度のエルゴード性

Ergodicity of Kusuoka measures on quantum trajectories ( http://arxiv.org/abs/2102.01140v1 )

ライセンス: Link先を確認

Anna Szczepanek

(参考訳) 1989年、Kusuokaは行列の積を用いて定義されるシフト空間上の確率測度の研究を開始した。特に、彼はそのような測度のエルゴード性に対する十分条件を導いており、これらの測度はそれ以来Kusuoka測度と呼ばれている。我々は、ユニタリ発展する量子系の繰り返し測定が、測定結果の列の空間上にKusuoka測度を生成することを観察する。測定がスケールされた射影からなる場合、Kusuokaのエルゴード性の十分条件は大幅に単純化できることを示す。さらに、測定が一様にスケールされたランク1射影からなる場合(すなわちランク1 POVMである場合)、またはちょうど2つの射影からなり、そのうちの1つがランク1である場合には、この条件がエルゴード性に対して必要でもあることを証明する。後者の種類の測定については、すべての結果列が、その逆順の列と同じ確率で系から放出されるという意味で、Kusuoka測度が可逆であることも示す。

In 1989 Kusuoka started the study of probability measures on the shift space that are defined with the help of products of matrices. In particular, he derived a sufficient condition for the ergodicity of such measures, which have since been referred to as Kusuoka measures. We observe that repeated measurements of a unitarily evolving quantum system generate a Kusuoka measure on the space of sequences of measurement outcomes. We show that if the measurement consists of scaled projections, then Kusuoka's sufficient ergodicity condition can be significantly simplified. We then prove that this condition is also necessary for ergodicity if the measurement consists of uniformly scaled rank-1 projections (i.e., it is a rank-1 POVM), or of exactly two projections, one of which is rank-1. For the latter class of measurements we also show that the Kusuoka measure is reversible in the sense that every string of outcomes has the same probability of being emitted by the system as its reverse.

翻訳日:2023-04-13 02:36:30 公開日:2021-02-01

# 量子チェシャ猫のにやにや笑いとうなり声によって選ばれる経路の遅延選択

Delayed choice of paths selected by grin and snarl of quantum Cheshire Cat ( http://arxiv.org/abs/2001.00669v2 )

ライセンス: Link先を確認

Debmalya Das and Ujjwal Sen

(参考訳) いわゆる量子チェシャ猫(Quantum Cheshire Cat)は、猫と同一視される光子と、その猫のにやにや笑い(grin)と同一視される偏光の一成分とが分離されるシナリオである。我々は、同じ技術を用いて、光子の偏光の2つの直交成分を平均的に分離できることを観察する。理解を容易にするため、光子のこれらの偏光成分を猫のにやにや笑い(grin)とうなり声(snarl)と同一視する。マッハ・ツェンダー干渉計の2つの腕における光子の入力偏光を同時に調整する思考実験を提示する。光子偏光の2つの特定の選択に対して、2つの成分の存在する腕が互いに入れ替わることがわかる。このにやにや笑いとうなり声の入れ替わりは、偏光成分がチューナーと相互作用する前、すなわち各成分がどちらの腕にあるべきかの選択が行われる前に起こる。

The so-called quantum Cheshire Cat is a scenario where a photon, identified with a cat, and a component of its polarization, identified with the grin of that cat, are separated. We observe that the same techniques can be used to separate two orthogonal components of polarization of a photon, on an average. We identify these polarization components of the photon as the grin and snarl of the cat for ease of comprehension. A gedanken experiment is presented in which we simultaneously tune the input polarizations of the photon in the two arms of a Mach-Zehnder interferometer. It is noted that for two particular choices of photon polarization, the presence of the two components gets reversed in the two arms. This reversal of the grin and the snarl occurs before the polarization components even interact with the tuners, i.e., before the choice of which arm each should be in is made.

翻訳日:2023-01-16 04:31:07 公開日:2021-02-01

# RDAnet:合成開口レーダ画像形成のためのディープラーニングに基づくアプローチ

RDAnet: A Deep Learning Based Approach for Synthetic Aperture Radar Image Formation ( http://arxiv.org/abs/2001.08202v2 )

ライセンス: Link先を確認

Andrew Rittenbach (1) and John Paul Walters (1) ((1) University of Southern California Information Sciences Institute, Arlington VA)

(参考訳) SAR(Synthetic Aperture Radar)イメージングシステムは、衛星などの移動物体から関心対象に向けてレーダー信号を放射することで動作する。反射したレーダーエコーを受信し、後に画像形成アルゴリズムによってSAR画像を形成する。分類や自動目標認識などのコンピュータビジョンタスクにSAR画像を用いることには大きな関心が寄せられている。しかし現在のSARアプリケーションは、画像形成とそれに続く画像処理という複数の操作で構成されている。本研究では、画像形成と画像処理の両方のタスクを実行するディープニューラルネットワークを訓練し、SAR処理パイプラインを統合する。その結果、統合されたパイプラインは、従来のアルゴリズムで形成した画像と同等の画質で、正確に分類されたSAR画像を出力できることが示された。我々は、本研究が実データを用いた統合型ニューラルネットワークベースのSAR処理パイプラインの最初の実証であると考えている。

Synthetic Aperture Radar (SAR) imaging systems operate by emitting radar signals from a moving object, such as a satellite, towards the target of interest. Reflected radar echoes are received and later used by image formation algorithms to form a SAR image. There is great interest in using SAR images in computer vision tasks such as classification or automatic target recognition. Today, however, SAR applications consist of multiple operations: image formation followed by image processing. In this work, we train a deep neural network that performs both the image formation and image processing tasks, integrating the SAR processing pipeline. Results show that our integrated pipeline can output accurately classified SAR imagery with image quality comparable to those formed using a traditional algorithm. We believe that this work is the first demonstration of an integrated neural network based SAR processing pipeline using real data.

翻訳日:2023-01-07 18:58:34 公開日:2021-02-01

# レジームスイッチング・バンディット

Regime Switching Bandits ( http://arxiv.org/abs/2001.09390v3 )

ライセンス: Link先を確認

Xiang Zhou, Yi Xiong, Ningyuan Chen, Xuefeng Gao

(参考訳) 報酬がレジームスイッチングを示す多腕バンディット問題について検討する。具体的には、すべての腕から生成されるランダム報酬の分布が、有限状態マルコフ連鎖としてモデル化された共通の潜在状態によって変調される。エージェントは潜在状態を観測できず、遷移行列と報酬分布を学習しなければならない。本稿では、隠れマルコフモデルに対するスペクトル的モーメント法による推定、部分観測マルコフ決定過程における信念誤差の制御、およびオンライン学習のための上側信頼限界(UCB)法に基づく学習アルゴリズムを提案する。また、提案する学習アルゴリズムに対して上界 $O(T^{2/3}\sqrt{\log T})$ を確立する。ここで $T$ は学習期間である。最後に、学習アルゴリズムの性能を示す概念実証実験を行う。

We study a multi-armed bandit problem where the rewards exhibit regime switching. Specifically, the distributions of the random rewards generated from all arms are modulated by a common underlying state modeled as a finite-state Markov chain. The agent does not observe the underlying state and has to learn the transition matrix and the reward distributions. We propose a learning algorithm for this problem, building on spectral method-of-moments estimations for hidden Markov models, belief error control in partially observable Markov decision processes and upper-confidence-bound methods for online learning. We also establish an upper bound $O(T^{2/3}\sqrt{\log T})$ for the proposed learning algorithm where $T$ is the learning horizon. Finally, we conduct proof-of-concept experiments to illustrate the performance of the learning algorithm.

翻訳日:2023-01-06 19:17:35 公開日:2021-02-01
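
The abstract above describes an agent that cannot see the Markov regime and must reason through a belief over it. The sketch below shows only the standard hidden-Markov filtering step such a belief would follow, and it is not the paper's learning algorithm. It assumes Bernoulli rewards and assumes the transition matrix `P` and mean rewards `mu` are already known, whereas the paper estimates them; all names are hypothetical.

```python
def update_belief(belief, P, mu, arm, reward):
    """One step of the hidden-Markov filter for a regime-switching bandit.

    belief : list of probabilities over hidden states before the pull
    P      : P[s][t] = probability of moving from state s to state t
    mu     : mu[s][arm] = Bernoulli mean reward of `arm` in state s
    arm    : index of the arm that was pulled
    reward : observed reward, 0 or 1
    Returns the belief over the hidden state at the next time step.
    """
    n = len(belief)
    # Bayes correction: weight each state by the likelihood of the reward.
    like = [mu[s][arm] if reward == 1 else 1.0 - mu[s][arm] for s in range(n)]
    post = [b * l for b, l in zip(belief, like)]
    z = sum(post)  # assumed non-zero, i.e. the observation is possible
    post = [p / z for p in post]
    # Prediction: propagate the posterior through the Markov chain.
    return [sum(post[s] * P[s][t] for s in range(n)) for t in range(n)]


if __name__ == "__main__":
    P = [[0.9, 0.1], [0.2, 0.8]]
    mu = [[0.9, 0.1], [0.1, 0.9]]  # arm 0 is good in state 0, arm 1 in state 1
    print(update_belief([0.5, 0.5], P, mu, arm=0, reward=1))
```

A reward of 1 from arm 0 shifts the belief toward state 0, where that arm pays well; the transition step then spreads some of that mass back toward state 1.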

# 非凸最適化のための局所条件下における確率勾配ハミルトニアンモンテカルロの非漸近解析

Nonasymptotic analysis of Stochastic Gradient Hamiltonian Monte Carlo under local conditions for nonconvex optimization ( http://arxiv.org/abs/2002.05465v3 )

ライセンス: Link先を確認

\"Omer Deniz Akyildiz, Sotirios Sabanis

(参考訳) 確率勾配ハミルトニアンモンテカルロ(SGHMC)のWasserstein-2距離における目標測度への収束について、対数凹性を仮定しない非漸近解析を与える。本解析は、局所条件下でのサンプラーとしてのSGHMCの重要な理論的性質を定量化し、従来の結果を大幅に改善する。特に、目標測度とSGHMCの法則との間のWasserstein-2距離がアルゴリズムのステップサイズによって一様に制御されることを証明し、SGHMCが反復回数に関して一様に高精度な結果を与え得ることを示す。この解析により、局所条件下での非凸最適化問題に対する非漸近的な上界も得られ、SGHMCを非凸最適化手法と見なした場合、既知の最良のレートで大域的最小値に収束することが導かれる。これらの結果を応用して、スケーラブルなベイズ推論に対する非漸近的な上界と、非漸近的な汎化上界を得る。

We provide a nonasymptotic analysis of the convergence of the stochastic gradient Hamiltonian Monte Carlo (SGHMC) to a target measure in Wasserstein-2 distance without assuming log-concavity. Our analysis quantifies key theoretical properties of the SGHMC as a sampler under local conditions which significantly improves the findings of previous results. In particular, we prove that the Wasserstein-2 distance between the target and the law of the SGHMC is uniformly controlled by the step-size of the algorithm, therefore demonstrate that the SGHMC can provide high-precision results uniformly in the number of iterations. The analysis also allows us to obtain nonasymptotic bounds for nonconvex optimization problems under local conditions and implies that the SGHMC, when viewed as a nonconvex optimizer, converges to a global minimum with the best known rates. We apply our results to obtain nonasymptotic bounds for scalable Bayesian inference and nonasymptotic generalization bounds.

翻訳日:2023-01-01 13:39:12 公開日:2021-02-01

# 知識追跡のための適切なクエリ、キー、価値計算を目指して

Towards an Appropriate Query, Key, and Value Computation for Knowledge Tracing ( http://arxiv.org/abs/2002.07033v5 )

ライセンス: Link先を確認

Youngduck Choi, Youngnam Lee, Junghyun Cho, Jineon Baek, Byungsoo Kim, Yeongmin Cha, Dongmin Shin, Chan Bae, Jaewe Heo

(参考訳) 知識追跡は、学習活動を通じて学生の知識をモデル化する行為であり、コンピュータ支援教育の分野で広く研究されている問題である。注意機構を持つモデルはベイズ知識追跡や協調フィルタリングといった従来のアプローチを上回っているが、2つの制限を共有している。第一に、これらのモデルは浅い注意層に依存しており、演習と応答の間の経時的で複雑な関係を捉えられない。第二に、知識追跡のための自己注意層におけるクエリ、キー、バリューの様々な組み合わせは、十分に調査されていない。演習をクエリ、インタラクション(演習と応答のペア)をキーおよびバリューとして用いる通常の方法には、経験的な裏付けが欠けている。本稿では、知識追跡のための新しいTransformerベースのモデルであるSAINT: Separated Self-AttentIve Neural Knowledge Tracingを提案する。SAINTはエンコーダ・デコーダ構造を持ち、演習と応答の埋め込み系列がそれぞれエンコーダとデコーダに別々に入力されるため、注意層を複数回積み重ねることができる。我々の知る限り、これは、深い自己注意層を演習と応答に別々に適用する、知識追跡のためのエンコーダ・デコーダモデルを提案した最初の研究である。大規模な知識追跡データセットでの経験的評価により、SAINTは知識追跡において最先端の性能を達成し、現在の最先端モデルと比較してAUCを1.8%改善することが示された。

Knowledge tracing, the act of modeling a student's knowledge through learning activities, is an extensively studied problem in the field of computer-aided education. Although models with attention mechanism have outperformed traditional approaches such as Bayesian knowledge tracing and collaborative filtering, they share two limitations. Firstly, the models rely on shallow attention layers and fail to capture complex relations among exercises and responses over time. Secondly, different combinations of queries, keys and values for the self-attention layer for knowledge tracing were not extensively explored. Usual practice of using exercises and interactions (exercise-response pairs) as queries and keys/values respectively lacks empirical support. In this paper, we propose a novel Transformer based model for knowledge tracing, SAINT: Separated Self-AttentIve Neural Knowledge Tracing. SAINT has an encoder-decoder structure where exercise and response embedding sequence separately enter the encoder and the decoder respectively, which allows to stack attention layers multiple times. To the best of our knowledge, this is the first work to suggest an encoder-decoder model for knowledge tracing that applies deep self-attentive layers to exercises and responses separately. The empirical evaluations on a large-scale knowledge tracing dataset show that SAINT achieves the state-of-the-art performance in knowledge tracing with the improvement of AUC by 1.8% compared to the current state-of-the-art models.

翻訳日:2023-01-01 04:22:27 公開日:2021-02-01

# Empirical Comparison of Graph Embeddings for Trust-Based Collaborative Filtering ( http://arxiv.org/abs/2003.13345v2 )

License: see the linked page

Tomislav Duricic, Hussain Hussain, Emanuel Lacic, Dominik Kowald, Denis Helic, Elisabeth Lex

In this work, we study the utility of graph embeddings for generating latent user representations for trust-based collaborative filtering. In a cold-start setting, on three publicly available datasets, we evaluate approaches from four method families: (i) factorization-based, (ii) random-walk-based, (iii) deep-learning-based, and (iv) the Large-scale Information Network Embedding (LINE) approach. We find that, across the four families, random-walk-based approaches consistently achieve the best accuracy. They also yield highly novel and diverse recommendations. Furthermore, our results show that the use of graph embeddings in trust-based collaborative filtering significantly improves user coverage.

Translated: 2022-12-18 06:32:52 Published: 2021-02-01

# EfficientPS: Efficient Panoptic Segmentation ( http://arxiv.org/abs/2004.02307v3 )

License: see the linked page

Rohit Mohan, Abhinav Valada

Understanding the scene in which an autonomous robot operates is critical for its competent functioning. Such scene comprehension requires recognizing instances of traffic participants along with general scene semantics, which can be effectively addressed by the panoptic segmentation task. In this paper, we introduce the Efficient Panoptic Segmentation (EfficientPS) architecture, which consists of a shared backbone that efficiently encodes and fuses semantically rich multi-scale features. We incorporate a new semantic head that aggregates fine and contextual features coherently and a new variant of Mask R-CNN as the instance head. We also propose a novel panoptic fusion module that congruously integrates the output logits from both heads of our EfficientPS architecture to yield the final panoptic segmentation output. Additionally, we introduce the KITTI panoptic segmentation dataset, which contains panoptic annotations for the popular and challenging KITTI benchmark. Extensive evaluations on Cityscapes, KITTI, Mapillary Vistas, and the Indian Driving Dataset demonstrate that our proposed architecture consistently sets a new state of the art on all four of these benchmarks while being the most efficient and fastest panoptic segmentation architecture to date.

Translated: 2022-12-16 12:35:55 Published: 2021-02-01

# Evaluating Online Continual Learning with CALM ( http://arxiv.org/abs/2004.03340v2 )

License: see the linked page

Germán Kruszewski, Ionut-Teodor Sorodoc, Tomas Mikolov

Online Continual Learning (OCL) studies learning over a continuous data stream without observing any single example more than once, a setting that is closer to the experience of humans and of systems that must learn "on-the-wild". Yet commonly available benchmarks are far from these real-world conditions, because they explicitly signal different tasks, lack latent similarity structure, or assume temporal independence between different examples. Here, we propose a new benchmark for OCL based on language modelling in which the input alternates between different languages and domains without any explicit delimitation. Additionally, we propose new metrics to study catastrophic forgetting in this setting and evaluate multiple baseline models based on compositions of experts. Finally, we introduce a simple gating technique that learns the latent similarities between different inputs, improving the performance of a Products of Experts model.

Translated: 2022-12-15 22:25:55 Published: 2021-02-01

# Happy Are Those Who Grade without Seeing: A Multi-Task Learning Approach to Grade Essays Using Gaze Behaviour ( http://arxiv.org/abs/2005.12078v2 )

License: see the linked page

Sandeep Mathias, Rudra Murthy, Diptesh Kanojia, Abhijit Mishra, Pushpak Bhattacharyya

The gaze behaviour of a reader is helpful in solving several NLP tasks such as automatic essay grading. However, collecting gaze behaviour from readers is costly in terms of time and money. In this paper, we propose a way to improve automatic essay grading using gaze behaviour, which is learnt at run time using a multi-task learning framework. To demonstrate the efficacy of this multi-task-learning-based approach to automatic essay grading, we collect gaze behaviour for 48 essays across 4 essay sets, and learn gaze behaviour for the rest of the essays, numbering over 7000 essays. Using the learnt gaze behaviour, we achieve a statistically significant improvement in performance over the state-of-the-art system for the essay sets where we have gaze data. We also achieve a statistically significant improvement for 4 other essay sets, numbering about 6000 essays, for which no gaze behaviour data is available. Our approach establishes that learning gaze behaviour improves automatic essay grading.

Translated: 2022-11-29 05:56:35 Published: 2021-02-01

# Robust Baggage Detection and Classification Based on Local Tri-directional Pattern ( http://arxiv.org/abs/2006.07345v3 )

License: see the linked page

Shahbano, Muhammad Abdullah and Kashif Inayat

In recent decades, automatic video surveillance systems have gained significant importance in the computer vision community. The crucial objective of surveillance is monitoring and security in public places. In the traditional Local Binary Pattern, the feature description is somewhat inaccurate and the feature size is large. To overcome these shortcomings, we propose an algorithm for detecting people with or without baggage. The Local Tri-directional Pattern (LtriDP) descriptor is used to extract features of different human body parts, including the head, trunk, and limbs. A support vector machine is then trained and evaluated on the extracted features. Experimental results on the INRIA and MSMT17 V1 datasets show that LtriDP outperforms several state-of-the-art feature descriptors and validate its effectiveness.

Translated: 2022-11-22 04:36:05 Published: 2021-02-01

# Reinforcement Learning for Mean Field Games with Strategic Complementarities ( http://arxiv.org/abs/2006.11683v3 )

License: see the linked page

Kiyeob Lee, Desik Rengarajan, Dileep Kalathil, Srinivas Shakkottai

Mean Field Games (MFG) are a class of games with a very large number of agents, for which the standard equilibrium concept is the Mean Field Equilibrium (MFE). Algorithms for learning MFE in dynamic MFGs are unknown in general. Our focus is on an important subclass that possesses a monotonicity property called Strategic Complementarities (MFG-SC). We introduce a natural refinement of the equilibrium concept that we call Trembling-Hand-Perfect MFE (T-MFE), which allows agents to employ a measure of randomization while accounting for the impact of such randomization on their payoffs. We propose a simple algorithm for computing T-MFE under a known model. We also introduce a model-free and a model-based approach to learning T-MFE and provide the sample complexities of both algorithms. We also develop a fully online learning scheme that obviates the need for a simulator. Finally, we empirically evaluate the performance of the proposed algorithms via examples motivated by real-world applications.

Translated: 2022-11-18 12:40:43 Published: 2021-02-01

# High-recall causal discovery for autocorrelated time series with latent confounders ( http://arxiv.org/abs/2007.01884v3 )

License: see the linked page

Andreas Gerhardus and Jakob Runge

We present a new method for linear and nonlinear, lagged and contemporaneous constraint-based causal discovery from observational time series in the presence of latent confounders. We show that existing causal discovery methods such as FCI and its variants suffer from low recall in the autocorrelated time series case and identify the low effect size of conditional independence tests as the main reason. Information-theoretical arguments show that effect size can often be increased if causal parents are included in the conditioning sets. To identify parents early on, we suggest an iterative procedure that utilizes novel orientation rules to determine ancestral relationships already during the edge removal phase. We prove that the method is order-independent, and sound and complete in the oracle case. Extensive simulation studies for different numbers of variables, time lags, sample sizes, and further cases demonstrate that our method indeed achieves much higher recall than existing methods for the case of autocorrelated continuous variables while keeping false positives at the desired level. This performance gain grows with stronger autocorrelation. At https://github.com/jakobrunge/tigramite we provide Python code for all methods involved in the simulation studies.

Translated: 2022-11-14 05:00:58 Published: 2021-02-01

# Medical Instrument Detection in Ultrasound-Guided Interventions: A Review ( http://arxiv.org/abs/2007.04807v2 )

License: see the linked page

Hongxu Yang, Caifeng Shan, Alexander F. Kolen, Peter H. N. de With

Medical instrument detection is essential for computer-assisted interventions, since it helps surgeons locate the instrument efficiently and interpret the image better, which leads to better outcomes. This article reviews medical instrument detection methods in ultrasound-guided interventions. First, we present a comprehensive review of instrument detection methodologies, which include traditional non-data-driven methods and data-driven methods. The non-data-driven methods were extensively studied prior to the era of machine learning, i.e., of data-driven approaches. We then discuss the main clinical applications of medical instrument detection in ultrasound, including anesthesia, biopsy, prostate brachytherapy, and cardiac catheterization, which were validated on clinical datasets. Finally, we select several principal publications to summarize the key issues and potential research directions for the computer-assisted intervention community.

Translated: 2022-11-12 05:27:53 Published: 2021-02-01

# Consensus-Aware Visual-Semantic Embedding for Image-Text Matching ( http://arxiv.org/abs/2007.08883v2 )

License: see the linked page

Haoran Wang, Ying Zhang, Zhong Ji, Yanwei Pang, Lin Ma

Image-text matching plays a central role in bridging vision and language. Most existing approaches rely only on the image-text instance pair to learn their representations, thereby exploiting their matching relationships and making the corresponding alignments. Such approaches exploit only the superficial associations contained in the instance-pair data, with no consideration of any external commonsense knowledge, which may hinder their ability to reason about the higher-level relationships between image and text. In this paper, we propose a Consensus-aware Visual-Semantic Embedding (CVSE) model to incorporate the consensus information, namely the commonsense knowledge shared between both modalities, into image-text matching. Specifically, the consensus information is exploited by computing the statistical co-occurrence correlations between the semantic concepts from the image captioning corpus and deploying the constructed concept correlation graph to yield the consensus-aware concept (CAC) representations. Afterwards, CVSE learns the associations and alignments between image and text based on the exploited consensus as well as the instance-level representations for both modalities. Extensive experiments conducted on two public datasets verify that the exploited consensus contributes significantly to constructing more meaningful visual-semantic embeddings, yielding superior performance over the state-of-the-art approaches on the bidirectional image and text retrieval task. The code for this paper is available at: https://github.com/BruceW91/CVSE.

Translated: 2022-11-09 14:05:35 Published: 2021-02-01

# SummEval: Re-evaluating Summarization Evaluation ( http://arxiv.org/abs/2007.12626v4 )

License: see the linked page

Alexander R. Fabbri, Wojciech Kryściński, Bryan McCann, Caiming Xiong, Richard Socher, Dragomir Radev

The scarcity of comprehensive up-to-date studies on evaluation metrics for text summarization and the lack of consensus regarding evaluation protocols continue to inhibit progress. We address the existing shortcomings of summarization evaluation methods along five dimensions: 1) we re-evaluate 14 automatic evaluation metrics in a comprehensive and consistent fashion using neural summarization model outputs along with expert and crowd-sourced human annotations, 2) we consistently benchmark 23 recent summarization models using the aforementioned automatic evaluation metrics, 3) we assemble the largest collection of summaries generated by models trained on the CNN/DailyMail news dataset and share it in a unified format, 4) we implement and share a toolkit that provides an extensible and unified API for evaluating summarization models across a broad range of automatic metrics, 5) we assemble and share the largest and most diverse collection, in terms of model types, of human judgments of model-generated summaries on the CNN/DailyMail dataset, annotated by both expert judges and crowd-sourced workers. We hope that this work will help promote a more complete evaluation protocol for text summarization as well as advance research in developing evaluation metrics that better correlate with human judgments.

Translated: 2022-11-07 06:39:58 Published: 2021-02-01
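
The abstract above asks for evaluation metrics that "better correlate with human judgments". One common way to quantify such agreement is a rank correlation. The sketch below computes Kendall's tau (the simple tau-a variant, without tie correction) between an automatic metric and human ratings. The choice of Kendall's tau and the toy scores are assumptions for illustration and are not taken from the SummEval toolkit.

```python
def kendall_tau(xs, ys):
    """Kendall's tau-a: (concordant - discordant) / number of pairs."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equally long sequences with at least 2 items")
    n = len(xs)
    concordant = 0
    discordant = 0
    for i in range(n):
        for j in range(i + 1, n):
            # A pair is concordant when both sequences order it the same way.
            sign = (xs[i] - xs[j]) * (ys[i] - ys[j])
            if sign > 0:
                concordant += 1
            elif sign < 0:
                discordant += 1
    return (concordant - discordant) / (n * (n - 1) / 2)


# Hypothetical scores for four summaries (made-up numbers).
metric_scores = [0.2, 0.5, 0.4, 0.9]
human_ratings = [1, 2, 3, 5]
tau = kendall_tau(metric_scores, human_ratings)
```

Here five of the six summary pairs are ordered the same way by the metric and by the human ratings and one pair is ordered oppositely, so `tau` is (5 - 1) / 6, roughly 0.67.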

# K-Shot Contrastive Learning of Visual Features with Multiple Instance Augmentations ( http://arxiv.org/abs/2007.13310v2 )

License: see the linked page

Haohang Xu, Hongkai Xiong, Guo-Jun Qi

In this paper, we propose $K$-Shot Contrastive Learning (KSCL) of visual features, which applies multiple augmentations to investigate the sample variations within individual instances. It aims to combine the advantages of inter-instance discrimination, by learning discriminative features to distinguish between different instances, with those of intra-instance variations, by matching queries against the variants of augmented samples over instances. In particular, for each instance, it constructs an instance subspace to model how the significant factors of variation in $K$-shot augmentations can be combined to form the variants of augmentations. Given a query, the most relevant variant of instances is then retrieved by projecting the query onto their subspaces to predict the positive instance class. This generalizes existing contrastive learning, which can be viewed as a special one-shot case. An eigenvalue decomposition is performed to configure the instance subspaces, and the embedding network can be trained end-to-end through the differentiable subspace configuration. Experimental results demonstrate that the proposed $K$-shot contrastive learning achieves performance superior to that of the state-of-the-art unsupervised methods.

Translated: 2022-11-06 08:37:29 Published: 2021-02-01

# VPC-Net: Completion of 3D Vehicles from MLS Point Clouds ( http://arxiv.org/abs/2008.03404v2 )

License: see the linked page

Yan Xia, Yusheng Xu, Cheng Wang, Uwe Stilla

As a dynamic and essential component of the road environment in urban scenarios, vehicles are the most popular investigation targets. To monitor their behavior and extract their geometric characteristics, accurate and instant measurement of vehicles plays a vital role in the traffic and transportation fields. Point clouds acquired from mobile laser scanning (MLS) systems deliver 3D information on road scenes with unprecedented detail. They have proven to be an adequate data source in the fields of intelligent transportation and autonomous driving, especially for extracting vehicles. However, 3D point clouds of vehicles acquired from MLS systems are inevitably incomplete due to object occlusion or self-occlusion. To tackle this problem, we propose a neural network, named Vehicle Points Completion-Net (VPC-Net), that synthesizes complete, dense, and uniform point clouds for vehicles from MLS data. In this network, we introduce a new encoder module to extract global features from the input instance, consisting of a spatial transformer network and a point feature enhancement layer. Moreover, a new refiner module is presented to preserve the vehicle details from the inputs and refine the complete outputs with fine-grained information. Given sparse and partial point clouds as inputs, the network can generate complete and realistic vehicle structures and keep the fine-grained details from the partial inputs. We evaluated the proposed VPC-Net in different experiments using synthetic and real-scan datasets and applied the results to 3D vehicle monitoring tasks. Quantitative and qualitative experiments demonstrate the promising performance of the proposed VPC-Net and show state-of-the-art results.

Translated: 2022-11-01 11:55:18 Published: 2021-02-01

# Homotopic Gradients of Generative Density Priors for MR Image Reconstruction ( http://arxiv.org/abs/2008.06284v2 )

License: see the linked page

Cong Quan, Jinjie Zhou, Yuanzheng Zhu, Yang Chen, Shanshan Wang, Dong Liang, Qiegen Liu

Deep learning, particularly generative modeling, has recently demonstrated tremendous potential to significantly speed up image reconstruction from reduced measurements. Unlike existing generative models, which often optimize the density priors, this work takes advantage of denoising score matching and proposes homotopic gradients of generative density priors (HGGDP) for magnetic resonance imaging (MRI) reconstruction. More precisely, to tackle the low-dimensional manifold and low data density region issues in the generative density prior, we estimate the target gradients in a higher-dimensional space. We train a more powerful noise conditional score network by forming a high-dimensional tensor as the network input at the training phase. More artificial noise is also injected in the embedding space. At the reconstruction stage, a homotopy method is employed to pursue the density prior, so as to boost the reconstruction performance. Experimental results indicate the remarkable performance of HGGDP in terms of reconstruction accuracy: with only 10% of the k-space data, it can still generate images of a quality comparable to standard MRI reconstruction from fully sampled data.

Translated: 2022-10-30 17:45:54 Published: 2021-02-01

# Stabilizing Invertible Neural Networks Using Mixture Models ( http://arxiv.org/abs/2009.02994v2 )

License: see the linked page

Paul Hagemann and Sebastian Neumayer

In this paper, we analyze the properties of invertible neural networks, which provide a way of solving inverse problems. Our main focus lies on investigating and controlling the Lipschitz constants of the corresponding inverse networks. Without such control, numerical simulations are prone to errors and little is gained over traditional approaches. Fortunately, our analysis indicates that changing the latent distribution from a standard normal one to a Gaussian mixture model resolves the issue of exploding Lipschitz constants. Indeed, numerical simulations confirm that this modification leads to significantly improved sampling quality in multimodal applications.

Translated: 2022-10-21 02:39:58 Published: 2021-02-01
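
The abstract above replaces the standard normal latent distribution with a Gaussian mixture. The sketch below only illustrates that one ingredient, drawing latent samples from each of the two distributions in one dimension. The mixture parameters (`means`, `stds`, `weights`) are made-up values, and the code does not reproduce the paper's networks or its Lipschitz analysis.

```python
import random


def sample_standard_normal(n, rng):
    # Latent samples from the usual N(0, 1) choice.
    return [rng.gauss(0.0, 1.0) for _ in range(n)]


def sample_gaussian_mixture(n, means, stds, weights, rng):
    # Latent samples from a Gaussian mixture: pick a component by weight,
    # then draw from that component's normal distribution.
    samples = []
    for _ in range(n):
        k = rng.choices(range(len(means)), weights=weights)[0]
        samples.append(rng.gauss(means[k], stds[k]))
    return samples


rng = random.Random(0)
means, stds, weights = [-4.0, 4.0], [0.5, 0.5], [0.5, 0.5]  # illustrative values
z_unimodal = sample_standard_normal(1000, rng)
z_mixture = sample_gaussian_mixture(1000, means, stds, weights, rng)

# Count mixture samples lying within 2.0 of one of the component means.
near_a_mode = sum(min(abs(z - m) for m in means) < 2.0 for z in z_mixture)
```

With a multimodal latent such as `z_mixture`, the latent samples already cluster around separate modes, which is the intuition for why a multimodal target may be easier to reach; whether this tames the Lipschitz constant in a given network is the subject of the paper's analysis, not of this toy example.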

# Grounded Adaptation for Zero-shot Executable Semantic Parsing ( http://arxiv.org/abs/2009.07396v3 )

License: see the linked page

Victor Zhong, Mike Lewis, Sida I. Wang, Luke Zettlemoyer

We propose Grounded Adaptation for Zero-shot Executable Semantic Parsing (GAZP) to adapt an existing semantic parser to new environments (e.g. new database schemas). GAZP combines a forward semantic parser with a backward utterance generator to synthesize data (e.g. utterances and SQL queries) in the new environment, then selects cycle-consistent examples to adapt the parser. Unlike data augmentation, which typically synthesizes unverified examples in the training environment, GAZP synthesizes examples in the new environment whose input-output consistency is verified. On the Spider, Sparc, and CoSQL zero-shot semantic parsing tasks, GAZP improves the logical form and execution accuracy of the baseline parser. Our analyses show that GAZP outperforms data augmentation in the training environment, that performance increases with the amount of GAZP-synthesized data, and that cycle-consistency is central to successful adaptation.

Translated: 2022-10-17 22:43:22 Published: 2021-02-01
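
The abstract above describes selecting cycle-consistent examples. A minimal sketch of one plausible reading follows: generate an utterance from a query with the backward generator, parse it with the forward parser, and keep the pair only if the parse reproduces the original query. The functions `generate_utterance` and `parse` are hypothetical lookup tables standing in for learned models, and the direction of the cycle is an assumption rather than a detail confirmed by the abstract.

```python
def generate_utterance(query):
    # Hypothetical backward generator: query -> utterance.
    templates = {
        "SELECT name FROM singer": "list singer names",
        "SELECT count(*) FROM singer": "how many singers are there",
        # Deliberately wrong generation, to show what the filter removes.
        "SELECT age FROM singer": "list singer names",
    }
    return templates[query]


def parse(utterance):
    # Hypothetical forward parser: utterance -> query (None if unknown).
    rules = {
        "list singer names": "SELECT name FROM singer",
        "how many singers are there": "SELECT count(*) FROM singer",
    }
    return rules.get(utterance)


def cycle_consistent_pairs(queries):
    # Keep (utterance, query) only when parsing the generated utterance
    # gives back exactly the query it was generated from.
    kept = []
    for query in queries:
        utterance = generate_utterance(query)
        if parse(utterance) == query:
            kept.append((utterance, query))
    return kept


sampled_queries = [
    "SELECT name FROM singer",
    "SELECT count(*) FROM singer",
    "SELECT age FROM singer",
]
kept = cycle_consistent_pairs(sampled_queries)
```

The third query is dropped because its generated utterance parses to a different query; the two surviving pairs are the kind of verified data that would then be used to adapt the parser.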

# Real-Time Optimization Meets Bayesian Optimization and Derivative-Free Optimization: A Tale of Modifier Adaptation ( http://arxiv.org/abs/2009.08819v2 )

License: see the linked page

Ehecatl Antonio del Rio-Chanona and Panagiotis Petsagkourakis and Eric Bradford and Jose Eduardo Alves Graciano and Benoit Chachuat

This paper investigates a new class of modifier-adaptation schemes to overcome plant-model mismatch in real-time optimization of uncertain processes. The main contribution lies in the integration of concepts from the areas of Bayesian optimization and derivative-free optimization. The proposed schemes embed a physical model and rely on trust-region ideas to minimize risk during exploration, while employing Gaussian process regression to capture the plant-model mismatch in a non-parametric way and driving the exploration by means of acquisition functions. The benefits of using an acquisition function, knowing the process noise level, or specifying a nominal process model are illustrated in numerical case studies, including a semi-batch photobioreactor optimization problem.

Translated: 2022-10-17 03:44:20 Published: 2021-02-01

# DocuBot: Generating financial reports using natural language interactions ( http://arxiv.org/abs/2010.01169v2 )

License: see the linked page

Vineeth Ravi, Selim Amrouni, Andrea Stefanucci, Armineh Nourbakhsh, Prashant Reddy, Manuela Veloso

The financial services industry perpetually processes an overwhelming amount of complex data. Digital reports are often created based on tedious manual analysis as well as visualization of the underlying trends and characteristics of the data. Often, the accruing costs of human computation errors in creating these reports are very high. We present DocuBot, a novel AI-powered virtual assistant for creating and modifying content in digital documents by modeling natural language interactions as "skills" and using them to transform the underlying data. DocuBot can agglomerate saved skills for reuse, enabling humans to generate recurrent reports automatically. DocuBot can also continuously learn domain-specific and user-specific vocabulary by interacting with the user. We present evidence that DocuBot adds value to the financial industry and demonstrate its impact with experiments involving real and simulated users tasked with creating PowerPoint presentations.

Translated: 2022-10-12 01:44:12 Published: 2021-02-01

# 局所文脈埋め込みによるきめ細かな感性分類の強化

Enhancing Fine-grained Sentiment Classification Exploiting Local Context Embedding ( http://arxiv.org/abs/2010.00767v3 )

ライセンス: Link先を確認

Heng Yang, Biqing Zeng

(参考訳) ターゲット指向感情分類は、ターゲットの感情極性を分析するための自然言語処理のきめ細かいタスクである。感情分類の性能を向上させるために、多くのアプローチがターゲットの重要な文脈単語を捉えるために様々な注意メカニズムを提案した。しかし,従来のアプローチでは,対象の感情とその局所的文脈の有意な関連性は無視されていた。本稿では,ローカルコンテキスト埋め込みとローカルコンテキスト予測損失を備えたローカルコンテキスト認識ネットワーク(LCA-Net)を提案する。 3つの共通データセットにおける実験結果は、ローカルコンテキスト認識ネットワークが、ローカルコンテキスト特徴抽出において既存のアプローチよりも優れていることを示している。さらに、ローカルコンテキスト認識フレームワークは多くのモデルに適応しやすく、他のターゲットレベルのタスクを改善する可能性がある。

Target-oriented sentiment classification is a fine-grained natural language processing task that analyzes the sentiment polarity of targets. To improve the performance of sentiment classification, many approaches have proposed various attention mechanisms to capture the important context words of a target. However, previous approaches ignored the significant relatedness of a target's sentiment and its local context. This paper proposes a local context-aware network (LCA-Net), equipped with the local context embedding and local context prediction loss, to strengthen the model by emphasizing the sentiment information of the local context. The experimental results on three common datasets show that the local context-aware network outperforms existing approaches in extracting local context features. Besides, the local context-aware framework is easy to adapt to many models, with the potential to improve other target-level tasks.

翻訳日:2022-10-12 01:34:17 公開日:2021-02-01

# 複雑な地形を歩むための指導カリキュラム学習

Guided Curriculum Learning for Walking Over Complex Terrain ( http://arxiv.org/abs/2010.03848v2 )

ライセンス: Link先を確認

Brendan Tidd, Nicolas Hudson, Akansel Cosgun

(参考訳) 複雑な地形の上を歩くという信頼性の高い二足歩行は難しい問題だ。カリキュラム学習とは、タスクの達成可能なバージョンから始めて、成功基準が満たされるにつれて難易度を高めるという考え方である。本稿では,二足歩行のための深層強化学習政策を学習するための3段階カリキュラムを提案する。第1段階では、エージェントは容易な地形上で開始され、徐々に地形の難しさが増し、目標方針から導出される力がロボット関節およびベースに適用される。第2段階では、誘導力は徐々にゼロに減少する。最後に、第3段階では、ロボットベースに大きさが大きくなるランダムな摂動が適用され、ポリシーの堅牢性が改善される。シミュレーション実験では, 平面, ハードル, 隙間, 階段, 階段の5種類の地形に対して, 歩行方針の学習に有効であることを示した。さらに,人間による実演の欠如により,複雑な地形を横断することを学ぶには,簡単な手で設計した歩行路が十分であることを示す。アブレーション研究において,カリキュラムの3段階のいずれかを選択すると,学習性能が低下することが示された。

Reliable bipedal walking over complex terrain is a challenging problem, and using a curriculum can help learning. Curriculum learning is the idea of starting with an achievable version of a task and increasing the difficulty as success criteria are met. We propose a 3-stage curriculum to train Deep Reinforcement Learning policies for bipedal walking over various challenging terrains. In the first stage, the agent starts on an easy terrain and the terrain difficulty is gradually increased, while forces derived from a target policy are applied to the robot joints and the base. In the second stage, the guiding forces are gradually reduced to zero. Finally, in the third stage, random perturbations with increasing magnitude are applied to the robot base, so that the robustness of the policies is improved. In simulation experiments, we show that our approach is effective in learning walking policies, separate from each other, for five terrain types: flat, hurdles, gaps, stairs, and steps. Moreover, we demonstrate that in the absence of human demonstrations, a simple hand-designed walking trajectory is a sufficient prior to learn to traverse complex terrain types. In ablation studies, we show that taking out any one of the three stages of the curriculum degrades the learning performance.
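
The three stages described above can be sketched as a small scheduler. This is a minimal illustration only, not the authors' implementation: the `CurriculumState` fields, the `advance` function, the normalised 0 to 1 scales, and the fixed `step` size are all assumptions introduced here.

```python
from dataclasses import dataclass


@dataclass
class CurriculumState:
    # All names and scales below are hypothetical, chosen for illustration.
    stage: int = 1                   # 1: terrain, 2: guide forces, 3: perturbations
    terrain_difficulty: float = 0.0  # 0 = easiest terrain, 1 = hardest
    guide_scale: float = 1.0         # multiplier on the guiding forces
    perturbation: float = 0.0        # magnitude of random pushes on the base


def advance(state: CurriculumState, success: bool, step: float = 0.25) -> CurriculumState:
    """Move the curriculum one notch forward when the success criterion is met."""
    if not success:
        return state  # stay at the current difficulty until the agent succeeds
    if state.stage == 1:
        # Stage 1: raise terrain difficulty while guiding forces stay fully on.
        state.terrain_difficulty = min(1.0, state.terrain_difficulty + step)
        if state.terrain_difficulty >= 1.0:
            state.stage = 2
    elif state.stage == 2:
        # Stage 2: fade the guiding forces out to zero.
        state.guide_scale = max(0.0, state.guide_scale - step)
        if state.guide_scale <= 0.0:
            state.stage = 3
    else:
        # Stage 3: grow the random perturbations applied to the robot base.
        state.perturbation = min(1.0, state.perturbation + step)
    return state
```

In a training loop, `advance` would be called after each evaluation window, with `success` computed from whatever criterion the experimenter chooses.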

翻訳日:2022-10-09 11:39:11 公開日:2021-02-01

# MIDI拡張を用いた変圧器型ピッチシーケンスオートエンコーダ

A Transformer Based Pitch Sequence Autoencoder with MIDI Augmentation ( http://arxiv.org/abs/2010.07758v3 )

ライセンス: Link先を確認

Mingshuo Ding, Yinghao Ma

(参考訳) 近年のディープラーニング自動音楽生成アルゴリズムの成果にもかかわらず、シングルトラック音楽の抜粋がオートマトンやホモ・サピエンスによって構成されているかどうかを評価するためのアプローチはほとんど提案されていない。この問題に対処するために、ALBERTに基づくマスク付き言語モデルを作曲家分類に適用する。目的は、MIDIクリップが自動生成仮説に基づいて構成される可能性を示し、AIで構成されたシングルトラックMIDIのみを用いてトレーニングするモデルを得ることである。本稿では,パラメータの量を削減し,データ拡張に関する2つの手法と,オーバーフィッティングを防止するための洗練された損失関数を提案する。実験結果は,CSMT(2020)のデータチャレンジにおける7チーム中,我々のモデルが3位であることを示している。さらに、このインスピレーション手法は、小さなデータセットに基づく他の音楽情報検索タスクにも適用することができる。

Despite recent achievements of deep learning automatic music generation algorithms, few approaches have been proposed to evaluate whether a single-track music excerpt is composed by automatons or Homo sapiens. To tackle this problem, we apply a masked language model based on ALBERT for composer classification. The aim is to obtain a model that can suggest the probability that a MIDI clip was composed, conditioned on the auto-generation hypothesis, and which is trained with only AI-composed single-track MIDI. In this paper, the number of parameters is reduced, and two data augmentation methods are proposed, as well as a refined loss function to prevent overfitting. The experimental results show that our model ranks 3rd among the 7 teams in the data challenge in CSMT (2020). Furthermore, this inspiring method could be extended to other music information retrieval tasks that are based on small datasets.

翻訳日:2022-10-07 05:38:55 公開日:2021-02-01

# ファネル構造を用いた意思決定問題--メールマーケティングキャンペーンへの応用によるマルチタスク学習アプローチ

Decision Making Problems with Funnel Structure: A Multi-Task Learning Approach with Application to Email Marketing Campaigns ( http://arxiv.org/abs/2010.08048v2 )

ライセンス: Link先を確認

Ziping Xu, Amirhossein Meisami, Ambuj Tewari

(参考訳) 本稿では,ファンネル構造を用いた意思決定問題について考察する。マーケティング分野においてよく知られた概念であるファンネル構造は、浅いものよりも深い層からの観測がはるかに少ない層状に、意思決定者が環境と相互作用するシステムにおいて発生する。例えば、eメールマーケティングキャンペーンアプリケーションでは、レイヤはオープン、クリック、購入のイベントに対応しています。 ClickからPurchaseへの変換は、メールのリンクがクリックされない限り購入はできないため、非常に頻繁に行われる。我々は,この困難な意思決定問題をファンネル構造を持つコンテキストバンディットとして定式化し,深層層からの十分な観察の欠如を軽減するマルチタスク学習アルゴリズムを開発した。我々は予測誤差とアルゴリズムの後悔の両方を分析した。我々は単純なシミュレーションにより予測誤差の理論を検証する。メールマーケティング企業による実世界データに基づくシミュレーション環境と実環境の両方における実験により,従来の手法に比べてアルゴリズムが大幅に改善することが示された。

This paper studies the decision making problem with Funnel Structure. Funnel structure, a well-known concept in the marketing field, occurs in those systems where the decision maker interacts with the environment in a layered manner receiving far fewer observations from deep layers than shallow ones. For example, in the email marketing campaign application, the layers correspond to Open, Click and Purchase events. Conversions from Click to Purchase happen very infrequently because a purchase cannot be made unless the link in an email is clicked on. We formulate this challenging decision making problem as a contextual bandit with funnel structure and develop a multi-task learning algorithm that mitigates the lack of sufficient observations from deeper layers. We analyze both the prediction error and the regret of our algorithms. We verify our theory on prediction errors through a simple simulation. Experiments on both a simulated environment and an environment based on real-world data from a major email marketing company show that our algorithms offer significant improvement over previous methods.

翻訳日:2022-10-07 03:25:46 公開日:2021-02-01

# SAINT+:EDNetの精度予測のための時間的特徴の統合

SAINT+: Integrating Temporal Features for EdNet Correctness Prediction ( http://arxiv.org/abs/2010.12042v2 )

ライセンス: Link先を確認

Dongmin Shin, Yugeun Shim, Hangyeol Yu, Seewoo Lee, Byungsoo Kim, Youngduck Choi

(参考訳) 本稿では,学習者情報と運動情報とを別々に処理する,トランスフォーマーに基づく知識追跡モデルSAINTの後継であるSAINT+を提案する。 SAINTのアーキテクチャに従って、SAINT+はエンコーダ・デコーダ構造を持ち、エンコーダは運動埋め込みのストリームに自己アテンション層を適用し、デコーダは、応答埋め込みとエンコーダ出力のストリームに自己アテンション層とエンコーダ・アテンション層を交互に適用する。さらに、SAINT+は2つの時間的特徴埋め込みを反応埋め込みに組み込んでおり、時間経過、生徒が答えるのに要する時間、学習活動間の時間間隔であるラグ時間である。教育領域で最大の公開ベンチマークデータセットであるEdNetにおけるSAINT+の有効性を実証的に評価した。実験結果から,SAINT+は,EdNetデータセットの現在最先端モデルであるSAINTと比較して,レシーバ動作特性曲線下での領域の1.25%の改善により,知識追跡における最先端性を実現していることがわかった。

We propose SAINT+, the successor of SAINT, a Transformer-based knowledge tracing model that separately processes exercise information and student response information. Following the architecture of SAINT, SAINT+ has an encoder-decoder structure where the encoder applies self-attention layers to a stream of exercise embeddings, and the decoder alternately applies self-attention layers and encoder-decoder attention layers to streams of response embeddings and encoder output. Moreover, SAINT+ incorporates two temporal feature embeddings into the response embeddings: elapsed time, the time taken for a student to answer, and lag time, the time interval between adjacent learning activities. We empirically evaluate the effectiveness of SAINT+ on EdNet, the largest publicly available benchmark dataset in the education domain. Experimental results show that SAINT+ achieves state-of-the-art performance in knowledge tracing with an improvement of 1.25% in area under the receiver operating characteristic curve compared to SAINT, the current state-of-the-art model on the EdNet dataset.
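
The two temporal features named above can be illustrated with a short sketch. It assumes each interaction is logged as a `(start_time, submit_time)` pair in seconds and takes lag time as the gap between the previous submission and the next start; the abstract does not fix either detail, and the function name `temporal_features` is hypothetical.

```python
def temporal_features(interactions):
    """Return (elapsed_time, lag_time) per interaction.

    `interactions` is a list of (start_time, submit_time) pairs in seconds,
    sorted by start_time. This log format is an assumption for illustration.
    """
    features = []
    prev_submit = None
    for start, submit in interactions:
        elapsed = submit - start  # time the student took to answer
        # Gap since the previous activity ended; 0.0 for the first interaction.
        lag = 0.0 if prev_submit is None else start - prev_submit
        features.append((elapsed, lag))
        prev_submit = submit
    return features
```

In a model such as the one described, these raw values would then be embedded and added to the response embeddings; that step is omitted here.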

翻訳日:2022-10-05 21:23:50 公開日:2021-02-01

# 動的環境における後悔最適制御

Regret-optimal control in dynamic environments ( http://arxiv.org/abs/2010.10473v2 )

ライセンス: Link先を確認

Gautam Goel, Babak Hassibi

(参考訳) 後悔最小化の観点から線形時間変動力学系の制御を考察する。この領域における多くの先行研究とは違って、特定の種類のコントローラにおいて最高の固定コントローラではなく、後方視(動的後悔)で選択された制御アクションの最良の動的シーケンスに対する後悔を最小限に抑えるオンラインコントローラを設計する問題に焦点を当てる(静的後悔)。この定式化は、環境が経時的に変化しても魅力的であり、単一のコントローラが時間軸全体にわたって優れたパフォーマンスを達成することはない。我々は,新たなH_{\infty}$制御による後悔最適制御系の状態空間構造を導出し,乱れのエネルギーの観点から,その後悔に基づく厳密なデータ依存境界を提示する。この結果は,制御器が将来の乱れを予測できるモデル予測設定や,固定遅延後のシステムダイナミクスにのみ影響する設定に容易に拡張できる。そこで本研究では,確率的および敵対的環境における$h_2$-optimal と $h_{\infty}$-optimal controller の性能を相互に補間する数値実験を行った。

We consider control in linear time-varying dynamical systems from the perspective of regret minimization. Unlike most prior work in this area, we focus on the problem of designing an online controller which minimizes regret against the best dynamic sequence of control actions selected in hindsight (dynamic regret), instead of the best fixed controller in some specific class of controllers (static regret). This formulation is attractive when the environment changes over time and no single controller achieves good performance over the entire time horizon. We derive the state-space structure of the regret-optimal controller via a novel reduction to $H_{\infty}$ control and present a tight data-dependent bound on its regret in terms of the energy of the disturbance. Our results easily extend to the model-predictive setting where the controller can anticipate future disturbances and to settings where the controller only affects the system dynamics after a fixed delay. We present numerical experiments which show that our regret-optimal controller interpolates between the performance of the $H_2$-optimal and $H_{\infty}$-optimal controllers across stochastic and adversarial environments.

翻訳日:2022-10-05 08:05:44 公開日:2021-02-01

# 医用画像のクロスモーダル情報最大化:CMIM

Cross-Modal Information Maximization for Medical Imaging: CMIM ( http://arxiv.org/abs/2010.10593v3 )

ライセンス: Link先を確認

Tristan Sylvain, Francis Dutil, Tess Berthier, Lisa Di Jorio, Margaux Luck, Devon Hjelm, Yoshua Bengio

(参考訳) 病院では、患者が行っている異なる医用画像検査(CTスキャン、MRI、PET、超音波など)や関連する放射線検査など、異なるモードで同じ情報を利用できる特定の情報システムにデータがサイロ化される。これは、テスト時に常に利用できないかもしれない同じ情報の複数のビューを列車で取得し、使用するためのユニークな機会を提供する。本稿では, 相互情報最大化の最近の進歩を用いて, モダリティ低下に弾力性のあるマルチモーダル入力の良質な表現を学習することにより, 利用可能なデータを最大限に活用する革新的な枠組みを提案する。列車時間におけるクロスモーダル情報の最大化により、医療画像分類とセグメンテーションという2つの異なる設定で、最先端のベースラインを上回ります。特に本手法は,弱いモダリティの推論時間性能に大きな影響を与えることが示されている。

In hospitals, data are siloed to specific information systems that make the same information available under different modalities such as the different medical imaging exams the patient undergoes (CT scans, MRI, PET, Ultrasound, etc.) and their associated radiology reports. This offers unique opportunities to obtain and use at train-time those multiple views of the same information that might not always be available at test-time. In this paper, we propose an innovative framework that makes the most of available data by learning good representations of a multi-modal input that are resilient to modality dropping at test-time, using recent advances in mutual information maximization. By maximizing cross-modal information at train time, we are able to outperform several state-of-the-art baselines in two different settings, medical image classification, and segmentation. In particular, our method is shown to have a strong impact on the inference-time performance of weaker modalities.

翻訳日:2022-10-05 06:11:23 公開日:2021-02-01

# 一般化された連続ゼロショット学習

Generalized Continual Zero-Shot Learning ( http://arxiv.org/abs/2011.08508v3 )

ライセンス: Link先を確認

Chandan Gautam, Sethupathy Parameswaran, Ashish Mishra, Suresh Sundaram

(参考訳) 最近、ゼロショット学習(ZSL)がエキサイティングなトピックとして登場し、多くの注目を集めた。 zslは、見知らぬクラスからの知識をクラス記述に基づいて見知らぬクラスに移すことで、見当たらないクラスを分類することを目指している。有望なパフォーマンスを示しているにもかかわらず、ZSLのアプローチは、すべてのクラスからのトレーニングサンプルがトレーニング中に利用可能であると仮定している。そこで本研究では,タスクの形式でクラスが順次到着するZSL(Continuousal ZSL, CZSL)のより汎用的で実用的な設定を提案し,過去の経験を生かして変化する環境から積極的に学習する。さらに、信頼性を高めるために、トレーニングプロセス中にタスクアイデンティティが明らかにされるが、テスト中には明らかにされない、単一ヘッド連続学習環境のCZSLを開発する。破滅的な忘れと不透明を避けるため,我々は知識蒸留と,それ以前のタスクからのサンプルの保存と再生を,小さなエピソードメモリを用いて行う。我々は,5つのzslベンチマークデータセット上で,連続学習の2つの異なる設定のためのベースラインを開発し,一般化されたczslを評価する。さらに、czslは2種類の変分オートエンコーダに対して開発され、分類のために2種類の特徴を生成する。 (i)出力空間に生成した特徴と (ii)潜在空間における識別的特徴を生じる。実験結果から, 単一頭部CZSLはより一般化可能で, 実用に適していることが明らかとなった。

Recently, zero-shot learning (ZSL) emerged as an exciting topic and attracted a lot of attention. ZSL aims to classify unseen classes by transferring the knowledge from seen classes to unseen classes based on the class description. Despite showing promising performance, ZSL approaches assume that the training samples from all seen classes are available during training, which is practically not feasible. To address this issue, we propose a more generalized and practical setup for ZSL, i.e., continual ZSL (CZSL), where classes arrive sequentially in the form of tasks and the model actively learns from the changing environment by leveraging past experience. Further, to enhance reliability, we develop CZSL for a single-head continual learning setting where task identity is revealed during the training process but not during testing. To avoid catastrophic forgetting and intransigence, we use knowledge distillation and store and replay a few samples from previous tasks using a small episodic memory. We develop baselines and evaluate generalized CZSL on five ZSL benchmark datasets for two different settings of continual learning: with and without class incremental. Moreover, CZSL is developed for two types of variational autoencoders, which generate two types of features for classification: (i) generated features at output space and (ii) generated discriminative features at the latent space. The experimental results clearly indicate that the single-head CZSL is more generalizable and suitable for practical applications.

翻訳日:2022-09-24 16:46:58 公開日:2021-02-01

# 多機能核融合深部ネットワークによるロバスト超解像深度イメージング

Robust super-resolution depth imaging via a multi-feature fusion deep network ( http://arxiv.org/abs/2011.11444v2 )

ライセンス: Link先を確認

Alice Ruget, Stephen McLaughlin, Robert K. Henderson, Istvan Gyongy, Abderrahim Halimi and Jonathan Leach

(参考訳) 3次元イメージングは、深度を記録する必要がある画像アプリケーションにおいて重要な役割を果たす。深度イメージングを利用するアプリケーションの数は急速に増えており、例えば自動運転車やスマートフォンカメラのオートフォーカスアシストなどがある。単一光子感度検出器(SPAD)アレイによる光検出・測光(LIDAR)は、高フレームレートで深度画像の取得を可能にする新興技術である。しかし、この技術の空間分解能は、通常、従来のカメラで記録された強度画像と比較して低い。本研究では,SPADカメラからの奥行き画像のネイティブ解像度を高めるために,カメラのヒストグラムデータから抽出できる複数の特徴を活かしたディープネットワークを構築した。ネットワークはデュアルモードで動作するSPADカメラ用に設計されており、高フレームレートで低解像度深度と高解像度の高解像度の画像を交互に撮影する。ネットワークは、深度の上昇を導くために、下地ヒストグラムから抽出された強度画像と複数の特徴を使用する。我々のネットワークは、幅広い信号対雑音比と光子レベルにまたがる画像分解能の大幅な向上と画像デノイングを提供する。ネットワークを様々な3Dデータに適用し,デノナイジングと4倍の解像度の深度向上を実証する。

Three-dimensional imaging plays an important role in imaging applications where it is necessary to record depth. The number of applications that use depth imaging is increasing rapidly, and examples include self-driving autonomous vehicles and auto-focus assist on smartphone cameras. Light detection and ranging (LIDAR) via single-photon sensitive detector (SPAD) arrays is an emerging technology that enables the acquisition of depth images at high frame rates. However, the spatial resolution of this technology is typically low in comparison to the intensity images recorded by conventional cameras. To increase the native resolution of depth images from a SPAD camera, we develop a deep network built specifically to take advantage of the multiple features that can be extracted from a camera's histogram data. The network is designed for a SPAD camera operating in a dual mode such that it captures alternate low-resolution depth and high-resolution intensity images at high frame rates; thus, the system does not require any additional sensor to provide intensity images. The network then uses the intensity images and multiple features extracted from downsampled histograms to guide the upsampling of the depth. Our network provides significant image resolution enhancement and image denoising across a wide range of signal-to-noise ratios and photon levels. We apply the network to a range of 3D data, demonstrating denoising and a four-fold resolution enhancement of depth.

翻訳日:2022-09-23 06:24:23 公開日:2021-02-01

# (参考訳) 改良された電波銀河分類のためのアテンションゲーティング

Attention-gating for improved radio galaxy classification ( http://arxiv.org/abs/2012.01248v2 )

ライセンス: CC BY 4.0

Micah Bowles, Anna M. M. Scaife, Fiona Porter, Hongming Tang, David J. Bastien

(参考訳) 本研究では,畳み込みニューラルネットワークを用いた電波銀河の分類技術として注目される。この分野では、次の最小のCNNアプリケーションよりも50%以上少ないパラメータを使用しながら、従来の分類器と同等のアテンションベースモデルを提案する。注意図作成に使用される正規化と集約法の選択が個々のモデルの出力にどのように影響するかを定量的に示し、その結果の注意マップを用いて、モデルによる分類選択を解釈できることを示す。我々は,本モデルで同定された有能な領域が,有能な人間分類器が同等の分類を行う領域とよく一致していることを観察した。正規化とアグリゲーションの選択は個々のモデルの性能にはほとんど影響しないが、それぞれの注意マップの解釈可能性に大きな影響を与え、天文学者が電波源を目で分類する方法とよく一致したモデルを選択することで、より効果的な方法でモデルを利用できることを示す。

In this work we introduce attention as a state-of-the-art mechanism for classification of radio galaxies using convolutional neural networks. We present an attention-based model that performs on par with previous classifiers while using more than 50% fewer parameters than the next smallest classic CNN application in this field. We demonstrate quantitatively how the selection of normalisation and aggregation methods used in attention-gating can affect the output of individual models, and show that the resulting attention maps can be used to interpret the classification choices made by the model. We observe that the salient regions identified by our model align well with the regions an expert human classifier would attend to in order to make equivalent classifications. We show that while the selection of normalisation and aggregation may only minimally affect the performance of individual models, it can significantly affect the interpretability of the respective attention maps, and that by selecting a model which aligns well with how astronomers classify radio sources by eye, a user can employ the model in a more effective manner.

翻訳日:2021-05-30 09:05:53 公開日:2021-02-01

# (参考訳) 非凸最適化のための縮小半径によるブロック座標降下の収束

Convergence of block coordinate descent with diminishing radius for nonconvex optimization ( http://arxiv.org/abs/2012.03503v2 )

ライセンス: CC BY 4.0

Hanbaek Lyu

(参考訳) ブロック座標降下(英: Block coordinate descent、BCD)は、非凸最適化のための単純な反復アルゴリズムであり、各ブロック座標の目的関数を逐次最小化し、他の座標を固定する。我々はブロックワイズ凸と微分可能な目的関数の定常点に収束することを保証した bcd のバージョンを提案する。さらに、$n$ が反復数を表す順序 $\log n/\sqrt{n}$ の最適な収束率を得る。鍵となる考え方は、減少する半径内でパラメータ探索を制限し、反復体の安定性を促進させ、そのような補助的制約が限界で消えることを示すことである。応用として、再構成誤差の定常点に収束する非負のCPテンソル因子化のための修正された最小二乗アルゴリズムを、収束率のベストケースで同じ境界で提供する。また,合成データと実世界のデータの両方を用いて実験を行った。

Block coordinate descent (BCD), also known as nonlinear Gauss-Seidel, is a simple iterative algorithm for nonconvex optimization that sequentially minimizes the objective function in each block coordinate while the other coordinates are held fixed. We propose a version of BCD that is guaranteed to converge to the stationary points of block-wise convex and differentiable objective functions under constraints. Furthermore, we obtain a best-case rate of convergence of order $\log n/\sqrt{n}$, where $n$ denotes the number of iterations. A key idea is to restrict the parameter search within a diminishing radius to promote stability of iterates, and then to show that such auxiliary constraints vanish in the limit. As an application, we provide a modified alternating least squares algorithm for nonnegative CP tensor factorization that converges to the stationary points of the reconstruction error with the same bound on the best-case rate of convergence. We also experimentally validate our results with both synthetic and real-world data.
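
The key idea of restricting each block update to a shrinking radius can be sketched on a toy two-block problem. Everything below is an assumption made for illustration: the objective `f`, the radius schedule `r_n = 1/n`, and the function names are not taken from the paper, whose exact conditions on the radii are not stated in the abstract.

```python
def f(x, y):
    # Toy objective: nonconvex jointly, but convex in x for fixed y and vice versa.
    return (x * y - 1.0) ** 2 + 0.1 * (x * x + y * y)


def block_min(other):
    # Exact minimiser of f over one coordinate with the other held fixed,
    # obtained by setting the partial derivative to zero.
    return other / (other * other + 0.1)


def clip(value, low, high):
    return max(low, min(high, value))


def bcd_diminishing_radius(x, y, iters=200):
    """Block coordinate descent where step n may move each block by at most r_n."""
    for n in range(1, iters + 1):
        r = 1.0 / n  # assumed schedule, chosen only for this sketch
        # For a one-dimensional convex block problem, the minimiser over the
        # interval [x - r, x + r] is the unconstrained minimiser clipped to it.
        x = clip(block_min(y), x - r, x + r)
        y = clip(block_min(x), y - r, y + r)
    return x, y
```

Starting from `(2.0, 0.5)`, the iterates approach the stationary point `x = y = sqrt(0.9)` of this toy objective, and the radius constraint stops binding after the first few iterations.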

翻訳日:2021-05-21 04:49:45 公開日:2021-02-01

# MHT-X:アルゴリズムXを用いたオフライン多重仮説追跡

MHT-X: Offline Multiple Hypothesis Tracking with Algorithm X ( http://arxiv.org/abs/2101.05202v2 )

ライセンス: Link先を確認

Peteris Zvejnieks, Mihails Birjukovs, Martins Klevs, Megumi Akashi, Sven Eckert, Andris Jakovics

(参考訳) Pythonを用いて最適な相関探索のためのアルゴリズムXを用いたオフライン多重仮説追跡の効率的で汎用的な実装を開発した。このコードは、オンライン処理を必要としない科学アプリケーションを対象としている。有向グラフフレームワークが使われ、時間窓幅が漸進的に増加する複数のスキャンが最大確率軌道のためのエッジ構築に使用される。現在のバージョンのコードは多相流体力学への応用のために開発された。気泡と粒子追跡は、物体の動きを解消し、マージし、分割することができる。対象特性の統計関数に変換される弱い質量と運動量保存則を用いて、実現可能な対象関係と軌道グラフエッジの確率を決定する。符号は n 次元運動と互換性があり、任意のトラックオブジェクト特性を持つ。このフレームワークは、現在使われているヒューリスティックを、問題に対してより適切なものに置き換えることで、現在のアプリケーションを超えて容易に拡張できる。コードはオープンソースで、今後も開発が続けられる。

An efficient and versatile implementation of offline multiple hypothesis tracking with Algorithm X for optimal association search was developed using Python. The code is intended for scientific applications that do not require online processing. A directed graph framework is used, and multiple scans with progressively increasing time window width are used for edge construction for maximum likelihood trajectories. The current version of the code was developed for applications in multiphase hydrodynamics, e.g. bubble and particle tracking, and is capable of resolving object motion, merges and splits. Feasible object associations and trajectory graph edge likelihoods are determined using weak mass and momentum conservation laws translated to statistical functions for object properties. The code is compatible with n-dimensional motion with arbitrarily many tracked object properties. This framework is easily extendable beyond the present application by replacing the currently used heuristics with ones more appropriate for the problem at hand. The code is open-source and will be continuously developed further.
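
For readers unfamiliar with Algorithm X, the sketch below shows the generic exact-cover search it performs, written as plain recursion without the dancing-links data structure. It is not the MHT-X code: how that project encodes candidate associations as an exact-cover problem is not described in the abstract, and the function name `exact_covers` is hypothetical.

```python
def exact_covers(universe, subsets):
    """Yield every list of subset names covering each element exactly once.

    `universe` is a set of elements; `subsets` maps a name to a set of
    elements drawn from `universe`.
    """
    if not universe:
        yield []  # nothing left to cover: one complete solution
        return
    # Branch on the element that the fewest remaining subsets can cover.
    target = min(universe, key=lambda e: sum(e in s for s in subsets.values()))
    for name, items in subsets.items():
        if target not in items:
            continue
        # Choosing `name` rules out every subset that overlaps it.
        remaining = {n: s for n, s in subsets.items() if not (s & items)}
        for rest in exact_covers(universe - items, remaining):
            yield [name] + rest
```

A tracking application would typically score each returned cover, for example by likelihood, and keep the best one; that scoring step is outside this sketch.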

翻訳日:2021-05-02 07:15:22 公開日:2021-02-01

# 機械学習を用いた大気イメージングアセンブリのマルチチャネル自動校正

Multi-Channel Auto-Calibration for the Atmospheric Imaging Assembly using Machine Learning ( http://arxiv.org/abs/2012.14023v4 )

ライセンス: Link先を確認

Luiz F. G. dos Santos, Souvik Bose, Valentina Salvatelli, Brad Neuberg, Mark C. M. Cheung, Miho Janvier, Meng Jin, Yarin Gal, Paul Boerner, and Atılım Güneş Baydin

(参考訳) 太陽活動は、惑星間媒質や地球上の宇宙天気に影響を与える重要な役割を担っている。ヘリオフィジカルス宇宙ミッションに搭載されたリモートセンシング機器は、その磁場の測定と多層多熱・動的太陽大気からの光放射を通じて太陽の活動に関する情報のプールを提供する。宇宙からの極端紫外線(euv)波長の観測は、太陽の外層、すなわち色球とコロナの微妙な性質を理解するのに役立つ。残念ながら、NASAのソーラー・ダイナミクス・オブザーバ(SDO)に搭載されている大気イメージング・アセンブリ(AIA)のような機器は、時間依存性の劣化に悩まされ、感度が低下する。現在のキャリブレーション技術は周期的な観測ロケットに依存しており、これは低頻度で、深宇宙ミッションでは実現不可能である。畳み込みニューラルネットワーク(CNN)に基づく別のキャリブレーション手法を提案する。分析にはSDO-AIAデータを用いる。以上の結果から,CNNをベースとしたモデルでは,ロケット実験の結果をある程度の精度で総合的に再現することが可能であることが示唆された。さらに、標準の「アストロノマー法」ベースラインモデルとの比較により、CNNアプローチがこのベースラインを著しく上回ることを示した。提案手法は,EUV機器を校正し,異なるEUVチャネル間のチャネル間関係の理解を深めるための新しい手法の枠組みを確立するものである。

Solar activity plays a quintessential role in influencing the interplanetary medium and space-weather around the Earth. Remote sensing instruments onboard heliophysics space missions provide a pool of information about the Sun's activity via the measurement of its magnetic field and the emission of light from the multi-layered, multi-thermal, and dynamic solar atmosphere. Extreme UV (EUV) wavelength observations from space help in understanding the subtleties of the outer layers of the Sun, namely the chromosphere and the corona. Unfortunately, such instruments, like the Atmospheric Imaging Assembly (AIA) onboard NASA's Solar Dynamics Observatory (SDO), suffer from time-dependent degradation, reducing their sensitivity. Current state-of-the-art calibration techniques rely on periodic sounding rockets, which can be infrequent and rather unfeasible for deep-space missions. We present an alternative calibration approach based on convolutional neural networks (CNNs). We use SDO-AIA data for our analysis. Our results show that CNN-based models could comprehensively reproduce the sounding rocket experiments' outcomes within a reasonable degree of accuracy, indicating that they perform equally well compared with the current techniques. Furthermore, a comparison with a standard "astronomer's technique" baseline model reveals that the CNN approach significantly outperforms this baseline. Our approach establishes the framework for a novel technique to calibrate EUV instruments and advance our understanding of the cross-channel relation between different EUV channels.

翻訳日:2021-04-24 20:07:36 公開日:2021-02-01

# ワーストケース比較による区間群フェアネスの特性評価

Characterizing Intersectional Group Fairness with Worst-Case Comparisons ( http://arxiv.org/abs/2101.01673v3 )

ライセンス: Link先を確認

Avijit Ghosh, Lea Genuit, Mary Reagan

(参考訳) 機械学習または人工知能アルゴリズムは、社会における既存の偏見を模倣し増幅する傾向にあるため、近年かなり精査されている。これはニッチだが成長する仕事の体となり、これらのバイアスを特定し、修正しようとする。これらのアルゴリズムをより公平にするための第一歩は、不公平さを測定するメトリクスを設計することです。この分野での既存の仕事の多くは、公正(保護されたグループと保護されていないグループ)と政治的に定義されたカテゴリー(人種または性別)の両立観を扱う。このような分類は交叉性の重要なニュアンスを見逃す - バイアスは、異なるカテゴリのメンバシップを結合するサブグループで増幅されることが多い。本稿では,交差点のレンズ下でのフェアネス指標の考察,交差点のフェアネスにおける既存作業の特定,既存のグループフェアネス指標の定義を拡張して交差点を包含する単純なケース比較手法の提案,そして,現代文脈における交差点フェアネスを扱うための社会的・法的・政治的枠組みの完成について論じる。

Machine Learning or Artificial Intelligence algorithms have gained considerable scrutiny in recent times owing to their propensity towards imitating and amplifying existing prejudices in society. This has led to a niche but growing body of work that identifies and attempts to fix these biases. A first step towards making these algorithms more fair is designing metrics that measure unfairness. Most existing work in this field deals with either a binary view of fairness (protected vs. unprotected groups) or politically defined categories (race or gender). Such categorization misses the important nuance of intersectionality - biases can often be amplified in subgroups that combine membership from different categories, especially if such a subgroup is particularly underrepresented in historical platforms of opportunity. In this paper, we discuss why fairness metrics need to be looked at under the lens of intersectionality, identify existing work in intersectional fairness, suggest a simple worst case comparison method to expand the definitions of existing group fairness metrics to incorporate intersectionality, and finally conclude with the social, legal and political framework to handle intersectional fairness in the modern context.
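
One way to read the worst-case comparison idea is to compute a group fairness statistic for every intersectional subgroup and compare the worst-off subgroup with the best-off one. The sketch below does this for positive-prediction rates (demographic parity). The min/max ratio, the record format, and the function name `worst_case_parity_ratio` are assumptions made here; the paper's exact definitions are not given in the abstract.

```python
from collections import defaultdict


def worst_case_parity_ratio(records):
    """Return (min_rate / max_rate, per-subgroup rates).

    `records` is an iterable of (group, prediction) pairs, where `group` is a
    tuple of attribute values such as ("F", "A") and `prediction` is 0 or 1.
    A ratio of 1.0 means all subgroups receive positives at the same rate.
    """
    totals = defaultdict(int)
    positives = defaultdict(int)
    for group, prediction in records:
        totals[group] += 1
        positives[group] += prediction
    rates = {g: positives[g] / totals[g] for g in totals}
    best = max(rates.values())
    worst = min(rates.values())
    # If no subgroup receives any positive prediction, treat the rates as equal.
    ratio = worst / best if best > 0 else 1.0
    return ratio, rates
```

Because the subgroup key is a tuple, adding another protected attribute only changes how the records are built, not the function itself.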

翻訳日:2021-04-11 11:45:11 公開日:2021-02-01

# (参考訳) VIPPrint: 合成顔画像検出とソースリンクのためのプリント画像とスキャン画像の大規模データセット

VIPPrint: A Large Scale Dataset of Printed and Scanned Images for Synthetic Face Images Detection and Source Linking ( http://arxiv.org/abs/2102.06792v1 )

ライセンス: CC BY-SA 4.0

Anselmo Ferreira, Ehsan Nowroozi and Mauro Barni

(参考訳) 印刷された画像やスキャンされた画像に対して有意義な法医学的分析を行う可能性は、多くのアプリケーションにおいて大きな役割を果たす。まず第一に、印刷された文書は、しばしばテロリスト計画、児童ポルノ写真、さらには偽のパッケージといった犯罪行為と関連付けられている。さらに、印刷や走査は、画像が印刷されスキャンされた後に、通常、操作された画像や合成画像に見られるアーティファクトがなくなるため、画像操作の痕跡や画像の合成特性を隠すために用いられる。この領域の研究を妨げる問題は、アルゴリズムの開発とベンチマークに使用される大規模な参照データセットの欠如である。本稿では,本課題に動機づけられ,多数の合成画像と自然画像からなる新しいデータセットを提案する。データセットの画像解析に係わる問題点を明らかにするために,複数のプリンタの属性法を比較した広範な実験を行った。また,自然顔画像と合成顔画像とを区別する最新の手法が,印刷やスキャン画像に適用しても失敗することを確認した。新たなデータセットが利用可能となり,予備実験が実施されれば,この領域におけるさらなる研究の動機と促進が期待できる。

The possibility of carrying out a meaningful forensics analysis on printed and scanned images plays a major role in many applications. First of all, printed documents are often associated with criminal activities, such as terrorist plans, child pornography pictures, and even fake packages. Additionally, printing and scanning can be used to hide the traces of image manipulation or the synthetic nature of images, since the artifacts commonly found in manipulated and synthetic images are gone after the images are printed and scanned. A problem hindering research in this area is the lack of large scale reference datasets to be used for algorithm development and benchmarking. Motivated by this issue, we present a new dataset composed of a large number of synthetic and natural printed face images. To highlight the difficulties associated with the analysis of the images of the dataset, we carried out an extensive set of experiments comparing several printer attribution methods. We also verified that state-of-the-art methods to distinguish natural and synthetic face images fail when applied to printed and scanned images. We envision that the availability of the new dataset and the preliminary experiments we carried out will motivate and facilitate further research in this area.

翻訳日:2021-04-06 08:53:53 公開日:2021-02-01

# 層ベース複合評価ブートストラップ

Layer-based Composite Reputation Bootstrapping ( http://arxiv.org/abs/2102.09951v1 )

ライセンス: Link先を確認

Sajib Mistry, Athman Bouguettaya, Lie Qu

(参考訳) 複合サービスのための新しい汎用評価ブートストラップフレームワークを提案する。複数の評判関連の指標がレイヤベースのフレームワークで検討され、コンポーネントサービスの評判を暗黙的に反映する。コンポーネントサービスの将来のパフォーマンスに対する指標の重要性は、修正されたランダムフォレストアルゴリズムを用いて学習される。本研究では,複合サービスの評価とコンポーネントサービスの評価指標の相関関係を明らかにするために,トポロジー対応フォレスト深層ニューラルネットワーク(fdnn)を提案する。トレーニングされたfDNNモデルは、信頼性の高い新しいコンポジットサービスの評判を予測する。実世界のデータセットを用いた実験により,提案手法の有効性が証明された。

We propose a novel generic reputation bootstrapping framework for composite services. Multiple reputation-related indicators are considered in a layer-based framework to implicitly reflect the reputation of the component services. The importance of an indicator on the future performance of a component service is learned using a modified Random Forest algorithm. We propose a topology-aware Forest Deep Neural Network (fDNN) to find the correlations between the reputation of a composite service and reputation indicators of component services. The trained fDNN model predicts the reputation of a new composite service together with a confidence value. Experimental results with a real-world dataset demonstrate the efficiency of the proposed approach.

翻訳日:2021-04-05 00:23:55 公開日:2021-02-01

# (参考訳) 人工知能における量子数学

Quantum Mathematics in Artificial Intelligence ( http://arxiv.org/abs/2101.04255v3 )

ライセンス: CC BY-SA 4.0

Dominic Widdows and Kirsty Kitto and Trevor Cohen

(参考訳) 2010年以降の10年間、人工知能の成功はコンピュータ科学と技術の最前線にあり、ベクトル空間モデルは人工知能の最前線における位置を固めてきた。同時に、量子コンピュータはより強力になり、主要な進歩の発表が頻繁にニュースに取り上げられている。これらの領域の根底にある数学的手法は、しばしば実現されるよりも多くの共通点がある。ベクトル空間は1930年代に量子力学の公理的中心に位置づけられ、この採用はベクトル空間の線型幾何学から論理と確率を導出するための重要な動機となった。粒子間の量子的相互作用はテンソル積を用いてモデル化される。本稿では、人工知能(AI)、特に自動推論や自然言語処理(NLP)における利用例を含む、これらの一般的な数学分野について述べる。議論される技法には、ベクトル空間、スカラー積、部分空間と含意、直交射影と否定、双対ベクトル、密度行列、正作用素、テンソル積が含まれる。アプリケーション領域には、情報検索、分類と含意、単語センスと曖昧さのモデル化、知識ベースにおける推論、意味合成が含まれる。これらのアプローチのいくつかは量子ハードウェアに実装できる可能性がある。この実装の実践的なステップの多くは初期段階にあり、すでに実現しているものもある。一般的な数学的ツールのいくつかを説明することは、aiと量子コンピューティングの両方の研究者がこれらの重複をさらに活用し、途中で新しい方向を認識し探索するのに役立つ。

In the decade since 2010, successes in artificial intelligence have been at the forefront of computer science and technology, and vector space models have solidified a position at the forefront of artificial intelligence. At the same time, quantum computers have become much more powerful, and announcements of major advances are frequently in the news. The mathematical techniques underlying both these areas have more in common than is sometimes realized. Vector spaces took a position at the axiomatic heart of quantum mechanics in the 1930s, and this adoption was a key motivation for the derivation of logic and probability from the linear geometry of vector spaces. Quantum interactions between particles are modelled using the tensor product, which is also used to express objects and operations in artificial neural networks. This paper describes some of these common mathematical areas, including examples of how they are used in artificial intelligence (AI), particularly in automated reasoning and natural language processing (NLP). Techniques discussed include vector spaces, scalar products, subspaces and implication, orthogonal projection and negation, dual vectors, density matrices, positive operators, and tensor products. Application areas include information retrieval, categorization and implication, modelling word-senses and disambiguation, inference in knowledge bases, and semantic composition. Some of these approaches can potentially be implemented on quantum hardware. Many of the practical steps in this implementation are in early stages, and some are already realized. Explaining some of the common mathematical tools can help researchers in both AI and quantum computing further exploit these overlaps, recognizing and exploring new directions along the way.
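
Orthogonal projection and negation, two of the techniques listed above, fit in a few lines. The sketch below removes from a vector `a` its component along `b`, which is one common way to model "a NOT b" in vector-space semantics. The function names and the plain-list representation are illustrative choices, not taken from the paper.

```python
def dot(u, v):
    """Scalar product of two equal-length vectors given as lists of floats."""
    return sum(ui * vi for ui, vi in zip(u, v))


def negate(a, b):
    """Return the part of `a` orthogonal to `b`, i.e. a - (a.b / b.b) b.

    `b` must be non-zero. The result has zero scalar product with `b`,
    so it retains nothing of the direction being negated.
    """
    scale = dot(a, b) / dot(b, b)
    return [ai - scale * bi for ai, bi in zip(a, b)]
```

In an information-retrieval setting, `a` might be a query term vector and `b` the vector of an unwanted sense; documents are then ranked against `negate(a, b)` instead of `a`.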

翻訳日:2021-04-04 13:47:37 公開日:2021-02-01
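
One of the techniques listed in the abstract above, negation via orthogonal projection, can be illustrated with a small sketch. The vectors and the helper names `dot` and `negate` are hypothetical choices for illustration and are not taken from the paper: `negate(a, b)` removes from `a` its component along `b`, so the result is orthogonal to the unwanted direction.

```python
def dot(u, v):
    """Scalar product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def negate(a, b):
    """Return 'a NOT b': a minus its orthogonal projection onto b."""
    scale = dot(a, b) / dot(b, b)
    return [x - scale * y for x, y in zip(a, b)]


# Toy 3-dimensional example (values are made up for illustration).
query = [1.0, 1.0, 0.0]      # a query mixing two meanings
unwanted = [0.0, 1.0, 0.0]   # the meaning to exclude
result = negate(query, unwanted)
```

After the projection is subtracted, `dot(result, unwanted)` is zero, which is the geometric sense in which the unwanted meaning has been removed.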

# ECCV-TAO-2020の第1位: トラッキング対象の検出と表現

1st Place Solution to ECCV-TAO-2020: Detect and Represent Any Object for Tracking ( http://arxiv.org/abs/2101.08040v2 )

ライセンス: Link先を確認

Fei Du, Bo Xu, Jiasheng Tang, Yuqi Zhang, Fan Wang, and Hao Li

(参考訳) 従来のトラッキング・バイ・検出パラダイムをこのトラッキング・ア・オブジェクト・タスクに拡張する。固体検出結果はまずTAOデータセットから抽出される。いくつかの最先端技術、例えば \textbf{ba}lanced-\textbf{g}roup \textbf{s}oftmax (\textbf{bags}\cite{li2020overcoming})や検出器s\cite{qiao2020detector}は検出中に統合される。そして,特徴学習ネットワークのトレーニングにより,あらゆる対象を表す出現特徴を学習した。検出と特徴表現を改善するために,いくつかのモデルを組み立てる。最も類似した外観機能を持つ単純なリンク戦略と、トラックレットレベルのポストアソシエーションモジュールが最終的に最終追跡結果を生成するために適用される。この方法は、challenge webサイトに \textbf{aoa}として提出される。コードはhttps://github.com/feiaxyt/winner_eccv20_taoで入手できる。

We extend the classical tracking-by-detection paradigm to this tracking-any-object task. Solid detection results are first extracted from the TAO dataset. Some state-of-the-art techniques like \textbf{BA}lanced-\textbf{G}roup \textbf{S}oftmax (\textbf{BAGS}\cite{li2020overcoming}) and DetectoRS\cite{qiao2020detectors} are integrated during detection. Then we learn appearance features to represent any object by training feature learning networks. We ensemble several models to improve detection and feature representation. Simple linking strategies based on the most similar appearance features and a tracklet-level post-association module are finally applied to generate the final tracking results. Our method is submitted as \textbf{AOA} on the challenge website. Code is available at https://github.com/feiaxyt/Winner_ECCV20_TAO.

翻訳日:2021-03-22 01:23:44 公開日:2021-02-01
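
The idea of linking detections by their most similar appearance features can be sketched as follows. This is a minimal greedy illustration under stated assumptions, not the authors' implementation (see their repository for that): the function names `cosine` and `link`, the threshold value, and the toy feature vectors are all hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two feature vectors."""
    num = sum(a * b for a, b in zip(u, v))
    den = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return num / den if den else 0.0


def link(tracks, detections, threshold=0.5):
    """Greedily assign each detection to the most similar unused track.

    tracks: dict mapping track id -> last appearance feature.
    detections: list of appearance features for the current frame.
    Returns the list of track ids, one per detection; a detection with no
    match above `threshold` starts a new track.
    """
    assigned, used = [], set()
    next_id = max(tracks, default=-1) + 1
    for feat in detections:
        best_id, best_sim = None, threshold
        for tid, tfeat in tracks.items():
            if tid in used:
                continue
            sim = cosine(feat, tfeat)
            if sim > best_sim:
                best_id, best_sim = tid, sim
        if best_id is None:          # nothing similar enough: open a new track
            best_id = next_id
            next_id += 1
        used.add(best_id)
        assigned.append(best_id)
    for tid, feat in zip(assigned, detections):
        tracks[tid] = feat           # update each track with its newest feature
    return assigned


tracks = {0: [1.0, 0.0], 1: [0.0, 1.0]}
ids = link(tracks, [[0.1, 0.9], [0.9, 0.1], [-1.0, -1.0]])
```

A greedy pass is the simplest choice here; an optimal one-to-one assignment would be a natural alternative when several detections compete for the same track.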

# (参考訳) 構造的関係推論を用いたCross Chest Graphによる疾患診断

Cross Chest Graph for Disease Diagnosis with Structural Relational Reasoning ( http://arxiv.org/abs/2101.08992v2 )

ライセンス: CC BY 4.0

Gangming Zhao, Baolian Qi, Jinpeng Li

(参考訳) X線画像のコンピュータ診断において位置病変は重要である。しかし、ボックスレベルのアノテーションは時間と労力を要する。病変を正確に特定する方法は少ないが、注意すべきアノテーションがなくても、緊急の問題だ。弱い教師付きメソッドでこの問題にアプローチする作業がいくつかあるが、パフォーマンスは改善される必要がある。 1つの障害は、一般に弱教師付き手法は、高構造特性のようなX線像の特性を考慮できなかったことである。そこで我々は,医師のトレーニングや意思決定プロセスを模倣して自動病変検出の性能を向上させるクロスケストグラフ(CCG)を提案する。 CCGは、構造情報を利用して異なる領域を観察する医師の習慣をシミュレートすることで、異なる解剖学的領域間の画像内関係をモデル化する。一方、画像間の関係は知識分析モジュールによってモデル化され、複数の画像を比較する医師の習慣をシミュレートする。画像内および画像間情報を統合されたエンドツーエンドフレームワークに統合する。 The NIH Chest-14 database (112,120 frontal-view X-ray images with 14 disease) での実験結果から,本手法は医療分野の専門知識を吸収することにより,病変の局所化を弱め,最先端の性能を達成することを示した。

Locating lesions is important in the computer-aided diagnosis of X-ray images. However, box-level annotation is time-consuming and laborious. How to locate lesions accurately with few, or even without, careful annotations is an urgent problem. Although several works have approached this problem with weakly-supervised methods, the performance needs to be improved. One obstacle is that general weakly-supervised methods have failed to consider the characteristics of X-ray images, such as their highly structured nature. We therefore propose the Cross-chest Graph (CCG), which improves the performance of automatic lesion detection by imitating doctors' training and decision-making processes. CCG models the intra-image relationship between different anatomical areas by leveraging the structural information to simulate a doctor's habit of observing different areas. Meanwhile, the relationship between any pair of images is modeled by a knowledge-reasoning module to simulate a doctor's habit of comparing multiple images. We integrate intra-image and inter-image information into a unified end-to-end framework. Experimental results on the NIH Chest-14 database (112,120 frontal-view X-ray images with 14 diseases) demonstrate that the proposed method achieves state-of-the-art performance in weakly-supervised localization of lesions by absorbing professional knowledge in the medical field.

翻訳日:2021-03-21 02:19:40 公開日:2021-02-01

# ニューラルネットワークポテンシャルのアクティブラーニングを可能にする不確実性に対する敵対的攻撃

Adversarial Attacks on Uncertainty Enable Active Learning for Neural Network Potentials ( http://arxiv.org/abs/2101.11588v2 )

ライセンス: Link先を確認

Daniel Schwalbe-Koda, Aik Rui Tan, Rafael Gómez-Bombarelli

(参考訳) ニューラルネットワーク(NN)ベースの原子間電位は、電子構造法の精度でポテンシャルエネルギー表面を迅速に予測する。しかし、NN予測は十分に学習された訓練領域内でのみ信頼性があり、外挿時の未知の挙動を持つ。 NN委員会による不確実性定量化は、予測信頼度が低いドメインを特定するが、NNポテンシャルをトレーニングするための設定空間を徹底的に探索するには、しばしば遅い原子論シミュレーションが必要である。ここでは,新しい分子ジオメトリとブートストラップnnポテンシャルをサンプリングするために,異種不確実性指標を用いた敵対的攻撃を用いる。アクティブ学習ループと組み合わせることで、NNポテンシャルの補間力は、追加のサンプルが少ない元のトレーニングデータを超えて改善されます。このフレームワークは複数の例で実証され、関連するジオメトリに関する広範な事前データなしで、運動障壁と集合変数のより良いサンプリングにつながります。敵攻撃は、位相空間とブートストラップNN電位を同時にサンプリングし、その堅牢性を高め、ポテンシャルエネルギー景観のより高速で正確な予測を可能にする新しい方法である。

Neural network (NN)-based interatomic potentials provide fast prediction of potential energy surfaces with the accuracy of electronic structure methods. However, NN predictions are only reliable within well-learned training domains, with unknown behavior when extrapolating. Uncertainty quantification through NN committees identifies domains with low prediction confidence, but thoroughly exploring the configuration space for training NN potentials often requires slow atomistic simulations. Here, we employ adversarial attacks with a differentiable uncertainty metric to sample new molecular geometries and bootstrap NN potentials. In combination with an active learning loop, the extrapolation power of NN potentials is improved beyond the original training data with few additional samples. The framework is demonstrated on multiple examples, leading to better sampling of kinetic barriers and collective variables without extensive prior data on the relevant geometries. Adversarial attacks are new ways to simultaneously sample the phase space and bootstrap NN potentials, increasing their robustness and enabling faster, accurate prediction of potential energy landscapes.

翻訳日:2021-03-13 19:32:38 公開日:2021-02-01
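
The core idea of attacking the uncertainty, that is, moving an input in the direction that increases committee disagreement, can be shown on a toy problem. The sketch below is an assumption-laden illustration and not the paper's method: the committee is three hypothetical one-dimensional linear models, the uncertainty is the plain variance of their predictions, and the gradient is taken by finite differences rather than by automatic differentiation.

```python
def committee_variance(x, slopes):
    """Variance of the predictions of a toy committee f_i(x) = slope_i * x."""
    preds = [s * x for s in slopes]
    mean = sum(preds) / len(preds)
    return sum((p - mean) ** 2 for p in preds) / len(preds)


def attack(x0, slopes, step=0.1, iters=20, eps=1e-4):
    """Gradient ascent on the committee variance, starting from x0."""
    x = x0
    for _ in range(iters):
        # Central finite difference stands in for a differentiable metric.
        grad = (committee_variance(x + eps, slopes)
                - committee_variance(x - eps, slopes)) / (2 * eps)
        x += step * grad
    return x


slopes = [0.9, 1.0, 1.1]           # hypothetical committee members
x_start = 1.0
x_adv = attack(x_start, slopes)    # input pushed toward higher disagreement
```

In an active learning loop, points such as `x_adv` would then be labeled with the reference method and added to the training set.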

# (参考訳) バッテリーの健康状態推定のための機械学習パイプライン

Machine learning pipeline for battery state of health estimation ( http://arxiv.org/abs/2102.00837v1 )

ライセンス: CC BY 4.0

Darius Roman, Saurabh Saxena, Valentin Robu, Michael Pecht and David Flynn

(参考訳) リチウムイオン電池は携帯用電子工学から電気自動車まで及ぶ現代適用でユビキタスです。アプリケーションに関係なく、オンボードコンピュータによるバッテリーの状態(SOH)の信頼性の高いリアルタイム推定は、バッテリーの安全な操作に不可欠であり、最終的に資産の完全性を保護します。本稿では,各種条件下での179セル上での電池容量フェード(バッテリヘルスの指標)推定のための機械学習パイプラインの設計と評価を行う。パイプラインは、2つのパラメトリックおよび2つの非パラメトリックアルゴリズムを用いて、関連する信頼区間で電池SOHを推定する。チャージ電圧と電流曲線のセグメントを使用して、パイプラインエンジニア30は自動的な特徴選択を行い、アルゴリズムを校正する。高速チャージプロトコルの下で動作しているセルにデプロイすると、最良のモデルは根平均二乗誤差 0.45\% を達成する。この研究は、バッテリーSOH推定のためのスケーラブルなデータ駆動モデルの設計に関する洞察を提供し、予測に関する信頼性境界の価値を強調します。パイプライン手法は、実験データと機械学習モデリングを組み合わせることで、SOHのリアルタイム推定を必要とする他の重要なコンポーネントに一般化することができる。

Lithium-ion batteries are ubiquitous in modern-day applications ranging from portable electronics to electric vehicles. Irrespective of the application, reliable real-time estimation of battery state of health (SOH) by on-board computers is crucial to the safe operation of the battery, ultimately safeguarding asset integrity. In this paper, we design and evaluate a machine learning pipeline for estimation of battery capacity fade - a metric of battery health - on 179 cells cycled under various conditions. The pipeline estimates battery SOH with an associated confidence interval by using two parametric and two non-parametric algorithms. Using segments of charge voltage and current curves, the pipeline engineers 30 features, performs automatic feature selection and calibrates the algorithms. When deployed on cells operated under the fast-charging protocol, the best model achieves a root mean squared percent error of 0.45%. This work provides insights into the design of scalable data-driven models for battery SOH estimation, emphasising the value of confidence bounds around the prediction. The pipeline methodology combines experimental data with machine learning modelling and can be generalized to other critical components that require real-time estimation of SOH.

翻訳日:2021-02-05 11:26:17 公開日:2021-02-01
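
The error metric quoted in the abstract above, root mean squared percent error, can be written out as a short function. The definition used here (relative error in percent, squared, averaged, then square-rooted) is an assumption about how the metric is computed, and the capacity values are made up for illustration.

```python
import math


def rmspe(y_true, y_pred):
    """Root mean squared percent error, assuming non-zero true values."""
    terms = [((p - t) / t * 100.0) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(terms) / len(terms))


# Hypothetical capacities (arbitrary units): each prediction is off by 1%.
true_capacity = [100.0, 100.0]
predicted_capacity = [101.0, 99.0]
error = rmspe(true_capacity, predicted_capacity)
```

Because the error is relative to the true value, the metric stays comparable across cells with different nominal capacities.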

# (参考訳) 視覚偽物のための階層的変分オートエンコーダ

Hierarchical Variational Autoencoder for Visual Counterfactuals ( http://arxiv.org/abs/2102.00854v1 )

ライセンス: CC BY 4.0

Nicolas Vercheval, Aleksandra Pizurica

(参考訳) 条件変分自動エンコーダ(VAE)は、説明可能な人工知能(XAI)ツールとして注目されている。潜在空間の符号は、反事実を生み出す理論的に正しい方法を提供する。ターゲットとするセマンティック機能への介入による変更。実画像に適用するには、階層CVAEのようなより複雑なモデルが必要です。これは、ナイーブコンディショニングがもはや有効ではないという課題を伴う。本稿では, 後方効果の緩和が反ファクトの達成につながることを示すとともに, アプリケーション内の分類器を視覚的に監査する手法として, VAEX を階層型VAE として導入する。

Conditional Variational Autoencoders (CVAE) are gathering significant attention as an Explainable Artificial Intelligence (XAI) tool. The codes in the latent space provide a theoretically sound way to produce counterfactuals, i.e. alterations resulting from an intervention on a targeted semantic feature. To be applied to real images, more complex models are needed, such as Hierarchical CVAE. This comes with a challenge, as naive conditioning is no longer effective. In this paper we show how relaxing the effect of the posterior leads to successful counterfactuals, and we introduce VAEX, a Hierarchical VAE designed for this approach that can visually audit a classifier in applications.

翻訳日:2021-02-05 11:23:27 公開日:2021-02-01

# (参考訳) 負の学習率を持つメタラーニング

Meta-learning with negative learning rates ( http://arxiv.org/abs/2102.00940v1 )

ライセンス: CC BY 4.0

Alberto Bernacchia

(参考訳) ディープラーニングモデルは、うまく機能するために大量のデータを必要とします。対象タスクにデータが不足すると、同様のタスクのトレーニングで得られた知識を転送して、ターゲットをすばやく学習できます。成功しているアプローチはメタラーニング(メタラーニング)、あるいは、学習が外ループで表されるタスクの分布を学習し、勾配降下の内側ループで学習する学習である。しかし、最近の多くの実証研究では、内部ループは不要であり、より単純なモデルは等しく、あるいはより良く機能すると主張している。内部ループの学習速度の関数としてのmamlの性能について検討し,学習速度がゼロである場合,内部ループが存在しないことを示唆する。ランダム行列理論と線形モデルの厳密解を用いて、過剰パラメータモデルを用いた混合線形回帰および非線形回帰に適用するmamlの検定損失に対する代数的表現を計算する。意外なことに、適応のための最適学習率が正である一方で、トレーニングのための最適学習率が常に負であることは、これまで考えられなかった設定である。したがって、最近の研究が示唆しているように、学習率をゼロにすることでパフォーマンスが向上するだけでなく、学習率を負の値に下げることでさらに向上させることができる。これらの結果は,メタラーニングがどのような状況で最善かを明らかにするのに役立つ。

Deep learning models require a large amount of data to perform well. When data is scarce for a target task, we can transfer the knowledge gained by training on similar tasks to quickly learn the target. A successful approach is meta-learning, or learning to learn a distribution of tasks, in which "learning" is represented by an outer loop and "to learn" by an inner loop of gradient descent. However, a number of recent empirical studies argue that the inner loop is unnecessary and that simpler models work equally well or even better. We study the performance of MAML as a function of the learning rate of the inner loop, where a zero learning rate implies that there is no inner loop. Using random matrix theory and exact solutions of linear models, we calculate an algebraic expression for the test loss of MAML applied to mixed linear regression and nonlinear regression with overparameterized models. Surprisingly, while the optimal learning rate for adaptation is positive, we find that the optimal learning rate for training is always negative, a setting that has never been considered before. Therefore, not only does the performance increase by decreasing the learning rate to zero, as suggested by recent work, but it can be increased even further by decreasing the learning rate to negative values. These results help clarify under what circumstances meta-learning performs best.

翻訳日:2021-02-05 11:13:41 公開日:2021-02-01
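
What a signed inner-loop learning rate means can be made concrete with a single gradient step on a one-parameter linear regression task. This sketch only illustrates the mechanics of the inner loop; it does not reproduce the paper's analysis or its result, and the function name `inner_step` and the toy task are hypothetical.

```python
def inner_step(w, xs, ys, lr):
    """One inner-loop gradient step on the mean squared error of y = w * x."""
    grad = sum(2.0 * (w * x - y) * x for x, y in zip(xs, ys)) / len(xs)
    return w - lr * grad


# Toy task whose true parameter is w = 2 (values chosen for illustration).
xs, ys = [1.0, 2.0], [2.0, 4.0]
w_meta = 1.0                                   # meta-learned initialization

w_no_loop = inner_step(w_meta, xs, ys, 0.0)    # zero rate: no inner loop
w_positive = inner_step(w_meta, xs, ys, 0.1)   # moves toward the task optimum
w_negative = inner_step(w_meta, xs, ys, -0.1)  # moves away from it
```

With a zero rate the parameter is returned unchanged, which is the sense in which a zero learning rate removes the inner loop; a negative rate steps up the task loss instead of down.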

# (参考訳) 心電図の逆問題に対する基礎関数に基づくデータ駆動学習

Basis Function Based Data Driven Learning for the Inverse Problem of Electrocardiography ( http://arxiv.org/abs/2102.00570v1 )

ライセンス: CC BY 4.0

Tommy Peng, Avinash Malik, Laura Bear, Mark L. Trew

(参考訳) 目的: ガウス3D(G3D)基底関数分解法を用いて, 心電図の従来の逆問題から回帰問題へと再構成する, 体表面電位(BSP)から心表面電位(HSP)を予測するニューラルネットワーク手法を提案する。方法: HSPはG3D基底関数を用いて生成され,境界要素フォワードモデルを通過して対応するBSPを得る。生成されたBSP(インプット)とHSP(アウトプット)はニューラルネットワークの訓練に使用され、その後様々な合成および分解された実世界のHSPを予測するために使用された。結果:g3d基底関数パラメータは実世界の左室ペース記録を正確に再現でき、根平均二乗誤差 (rmse) は1.34 \pm 1.30$%である。基礎データ訓練ニューラルネットワークは、RMSEが$8.46 \pm 1.55$%、およびRMSEが$18.5 \pm 5.25$%である実世界のデータのG3D表現でG3D基底関数合成データを予測できた。予測時間系列から生成された活性化マップは、実際の左室ペース記録から生成されたものと比較して、RMSEは17.0%であり、絶対差は10.3 pm 10.8$msである。結論: ガウス基底関数に基づく回帰問題として心電図の逆問題を再計算するデータ駆動モデルが成功し, ガウスデータのみを用いて訓練した場合でも実世界の記録の有望な時系列と活性化マップ予測を生成する。意義:ニューラルネットワークによって予測されるHSPを使用して、臨床評価中に心機能障害を識別する活性化マップを作成することができる。

Objective: This paper proposes a neural network approach for predicting heart surface potentials (HSPs) from body surface potentials (BSPs), which reframes the traditional inverse problem of electrocardiography into a regression problem through the use of Gaussian 3D (G3D) basis function decomposition. Methods: HSPs were generated using G3D basis functions and passed through a boundary element forward model to obtain corresponding BSPs. The generated BSPs (input) and HSPs (output) were used to train a neural network, which was then used to predict a variety of synthesized and decomposed real-world HSPs. Results: Fitted G3D basis function parameters can accurately reconstruct the real-world left ventricular paced recording with a percent root mean squared error (RMSE) of $1.34 \pm 1.30$%. The neural network trained on basis data was able to predict G3D basis function synthesized data with an RMSE of $8.46 \pm 1.55$%, and the G3D representation of real-world data with an RMSE of $18.5 \pm 5.25$%. The activation map produced from the predicted time series had an RMSE of 17.0% and a mean absolute difference of $10.3 \pm 10.8$ ms when compared to that produced from the actual left ventricular paced recording. Conclusion: A Gaussian basis function based data driven model for reframing the inverse problem of electrocardiography as a regression problem is successful and produces promising time series and activation map predictions of real-world recordings even when trained only on Gaussian data. Significance: The HSPs predicted by the neural network can be used to create activation maps to identify cardiac dysfunctions during clinical assessment.

翻訳日:2021-02-05 08:19:11 公開日:2021-02-01

# (参考訳) 畳み込みニューラルネットワークを用いたパーキンソント歩行解析のための時空間反応力解析

Spatiotemporal Ground Reaction Force Analysis using Convolutional Neural Networks to Analyze Parkinsonian Gait ( http://arxiv.org/abs/2102.00628v1 )

ライセンス: CC0 1.0

Musthaq Ahamed, P.D.S.H. Gunawardane, Nimali T. Medagedara

(参考訳) パーキンソン病(英: Parkinson's disease, PD)は、高齢者の生活の質を大幅に低下させる不治の病気である。 PDは主に歩行パターンに影響を与え、歩行を正常から障害へと徐々に変化させる。 PDの早期診断は治療に重要であり,歩行パターン解析はPDの診断手法として用いられる。本稿では,PDに関連する歩行パターンの変化を識別するための指標として,生時空間反応力(GRF)を同定した。 GRFの変化は、前処理、変換、認識、性能評価を通じて畳み込みニューラルネットワークを用いて識別される。提案アルゴリズムは,pdの重症度を同定し,パーキンソン病の歩行と健康な歩行を区別することができる。この技術は自動意思決定プロセスにおいて97%の精度を示している。

Parkinson's disease (PD) is an incurable disease that is commonly found among the elderly and greatly reduces their quality of life. PD primarily affects the gait pattern and slowly changes the walking gait from normal to impaired. Early diagnosis of PD is important for treatment, and gait pattern analysis is used as a technique to diagnose PD. The present paper identifies the raw spatiotemporal ground reaction force (GRF) as a key parameter for detecting the changes in human gait patterns associated with PD. The changes in GRF are identified using a convolutional neural network through pre-processing, conversion, recognition, and performance evaluation. The proposed algorithm is capable of identifying the severity of PD and distinguishing the parkinsonian gait from the healthy gait. The technique has shown 97% accuracy in the automatic decision-making process.

翻訳日:2021-02-05 08:04:00 公開日:2021-02-01

# (参考訳) 生存データの機械学習モデルを用いた説明変数に関連する危険率の計算

Computing the Hazard Ratios Associated with Explanatory Variables Using Machine Learning Models of Survival Data ( http://arxiv.org/abs/2102.00637v1 )

ライセンス: CC BY 4.0

Sameer Sundrani and James Lu

(参考訳) 目的: Cox Proportional Hazards (CoxPH) モデルの生存データへの適用, および Hazard Ratio (HR) の導出が良好に確立されている。木をベースとした非線形機械学習(ML)モデルが生存分析に適用されているが、これらのモデルから説明変数に関連付けられたHRを計算するための方法論は存在しない。予測に対する説明変数の寄与を定量化する局所的正確で一貫性のある手法であるShapley additive explanation (SHAP)値を用いて,木ベースのMLモデルからHRを計算する新しい方法を提案する。方法: 大腸癌、乳癌、膵臓癌の患者から得られた3組の生存データを用いて、CoxPHの性能を最先端のMLモデルであるXGBoostと比較した。 XGBoostモデルから説明変数のHRを計算するために、SHAP値は指数化され、2つのサブグループの平均の比率が計算された。信頼区間は、トレーニングデータをブートストラップし、MLモデルを1000回生成することで計算された。 3つのデータセット全体で、すべての説明変数のHRを体系的に比較した。 PythonとRのオープンソースライブラリが分析に使用された。結果: 大腸癌群と乳癌群では, CoxPH と XGBoost のパフォーマンスは同等であり, HR の整合性は良好であった。 Pan-cancerデータセットでは、ほとんどの変数の一致を示しましたが、CoxPHとXGBoostの結果の間の2つの説明変数の反対の発見も示しました。その後のKaplan-MeierプロットはXGBoostモデルの発見を支持した。結論: MLモデルからのHRの導出は,複雑な生存データセットからの危険因子の同定を改善し,臨床試験の結果を予測するのに役立つ。

Purpose: The application of Cox Proportional Hazards (CoxPH) models to survival data and the derivation of the Hazard Ratio (HR) are well established. While nonlinear, tree-based Machine Learning (ML) models have been developed and applied to survival analysis, no methodology exists for computing HRs associated with explanatory variables from such models. We describe a novel way to compute HRs from tree-based ML models using Shapley additive explanation (SHAP) values, a locally accurate and consistent methodology for quantifying explanatory variables' contributions to predictions. Methods: We used three sets of publicly available survival data consisting of patients with colon, breast or pan cancer and compared the performance of CoxPH to the state-of-the-art ML model, XGBoost. To compute the HR for explanatory variables from the XGBoost model, the SHAP values were exponentiated and the ratio of the means over the two subgroups was calculated. The confidence interval was computed by bootstrapping the training data and generating the ML model 1000 times. Across the three data sets, we systematically compared HRs for all explanatory variables. Open-source libraries in Python and R were used in the analyses. Results: For the colon and breast cancer data sets, the performance of CoxPH and XGBoost was comparable and we showed good consistency in the computed HRs. In the pan-cancer data set, we showed agreement in most variables but also opposite findings for two of the explanatory variables between the CoxPH and XGBoost results. Subsequent Kaplan-Meier plots supported the findings of the XGBoost model. Conclusion: Enabling the derivation of HRs from ML models can help to improve the identification of risk factors from complex survival data sets and enhance the prediction of clinical trial outcomes.

翻訳日:2021-02-05 07:51:53 公開日:2021-02-01
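
The HR computation described in the Methods part of the abstract above (exponentiate the SHAP values, then take the ratio of the subgroup means) can be sketched in a few lines. The SHAP values below are made up; in practice they would come from an explainer applied to a fitted survival model, a step this sketch deliberately leaves out. The function name `hazard_ratio_from_shap` is hypothetical.

```python
import math


def hazard_ratio_from_shap(shap_values, in_group):
    """Ratio of mean exponentiated SHAP values between two subgroups.

    shap_values: per-patient SHAP values of one explanatory variable.
    in_group: per-patient booleans, True for the subgroup in the numerator.
    """
    exposed = [math.exp(s) for s, g in zip(shap_values, in_group) if g]
    reference = [math.exp(s) for s, g in zip(shap_values, in_group) if not g]
    return (sum(exposed) / len(exposed)) / (sum(reference) / len(reference))


# Hypothetical values: the variable adds log(2) to the log-hazard when present.
shap_values = [math.log(2.0), math.log(2.0), 0.0, 0.0]
in_group = [True, True, False, False]
hr = hazard_ratio_from_shap(shap_values, in_group)
```

A confidence interval in the spirit of the abstract would repeat this calculation over models refitted on bootstrap resamples of the training data.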

# (参考訳) Webインテリジェンスアプリケーションのためのドメイン特化モデルの自動拡張

Automatic Expansion of Domain-Specific Affective Models for Web Intelligence Applications ( http://arxiv.org/abs/2102.00827v1 )

ライセンス: CC BY 4.0

Albert Weichselbraun, Jakob Steixner, Adrian M.P. Brașoveanu, Arno Scharl, Max Göbel and Lyndon J. B. Nixon

(参考訳) 知覚コンピューティングは、ポジティブな感情とネガティブな感情を区別するための極性(Polarity)、人間の感情の表現を捉えるためのよりニュアンスなモデル(nuanced model)など、様々な複雑さの明確に定義された感情モデルに依存している。コミュニケーションの成功を測定するために使用されると、高度な機械学習アプローチと組み合わせた最もきめ細かい感情モデルでさえ、組織の戦略的なポジショニング目標を完全に捉えることはできません。このような目標は、しばしば標準化された感情モデルから逸脱する。喜びや信頼といった特定の感情は、一般的に望ましいブランドの関連を表すが、マーケティング専門家によって定式化された特定のコミュニケーション目標はしばしばそのような標準的な次元を超えている。例えば、テレビ番組のブランドマネージャーは、恐れや悲しみが観客に望まれる感情であると考えるかもしれません。本稿では、ナレッジグラフで利用可能な共通知識と共通知識を言語モデルや感情推論と組み合わせ、カバレッジと一貫性を改善し、感情のドメイン固有の解釈をサポートする、感情モデルのための拡張技術を紹介します。広範な評価は、異なる拡張技術のパフォーマンスを比較します:(i) 再訪された感情の砂時計モデルに基づいて定量的評価し、手動でコンパイルされた金標準データを使用して、複数の感情カテゴリをカバーする複雑なモデルのパフォーマンスを評価し、(ii) テレビ番組ブランドのためのドメイン固有の感情モデルの定性評価。これらの評価の結果,導入技術は様々な組込みモデルと事前学習モデルをサポートしていることが示された。論文は、このアプローチをモデルリソースが乏しい他のシナリオに適用することに関する議論で締めくくられている。

Sentic computing relies on well-defined affective models of different complexity - polarity to distinguish positive and negative sentiment, for example, or more nuanced models to capture expressions of human emotions. When used to measure communication success, even the most granular affective model combined with sophisticated machine learning approaches may not fully capture an organisation's strategic positioning goals. Such goals often deviate from the assumptions of standardised affective models. While certain emotions such as Joy and Trust typically represent desirable brand associations, specific communication goals formulated by marketing professionals often go beyond such standard dimensions. For instance, the brand manager of a television show may consider fear or sadness to be desired emotions for its audience. This article introduces expansion techniques for affective models, combining common and commonsense knowledge available in knowledge graphs with language models and affective reasoning, improving coverage and consistency as well as supporting domain-specific interpretations of emotions. An extensive evaluation compares the performance of different expansion techniques: (i) a quantitative evaluation based on the revisited Hourglass of Emotions model to assess performance on complex models that cover multiple affective categories, using manually compiled gold standard data, and (ii) a qualitative evaluation of a domain-specific affective model for television programme brands. The results of these evaluations demonstrate that the introduced techniques support a variety of embeddings and pre-trained models. The paper concludes with a discussion on applying this approach to other scenarios where affective model resources are scarce.

翻訳日:2021-02-05 07:44:24 公開日:2021-02-01

# (参考訳) 分散フェデレーション学習はモデルとデータのプライバシを保存する

Decentralized Federated Learning Preserves Model and Data Privacy ( http://arxiv.org/abs/2102.00880v1 )

ライセンス: CC BY 4.0

Thorsten Wittkopp and Alexander Acker

(参考訳) ITシステムの複雑さが増す中、障害発生時の運用をサポートするソリューションが必要です。したがって、AIOps(Artificial Intelligence for System Operations)は、学術と産業の両方において、ますます注目されつつある研究分野である。この領域の主要な問題の1つは、適切なラベル付きデータへのアクセスがないことです。これは、主に法的保護規制または産業機密性によるものです。この混乱を連合学習の領域から緩和する方法は、トレーニングデータに直接アクセスする必要がない。オリジナルアプローチでは、すべてのモデルパラメータの周期的集約によるモデル同期を実行するために、中央インスタンスを使用する。しかし、その機密知識やトレーニングデータを再構築することができるため、訓練されたモデルを公開できないシナリオはたくさんあります。さらに、中央のインスタンスは信頼される必要があり、単一障害点である。そこで,我々は,学習モデル間の知識共有を可能にする完全分散手法を提案する。オリジナルのトレーニングデータもモデルパラメータも送信する必要はない。この概念は、モデルに割り当てられた教師と学生の役割に依存しており、生徒は合成された入力データを通じて教師の出力に基づいて訓練される。ログ異常検出のケーススタディを実施している。その結果,教師が学習した未学習学生モデルが,教師と同等のF1スコアに達することがわかった。さらに,本手法は,異なる訓練データサブセット上で訓練された複数のモデルの同期を可能にすることを示す。

The increasing complexity of IT systems requires solutions that support operations in case of failure. Therefore, Artificial Intelligence for System Operations (AIOps) is a field of research that is receiving increasing attention, both in academia and industry. One of the major issues in this area is the lack of access to adequately labeled data, which is mainly due to legal protection regulations or industrial confidentiality. Methods to mitigate this stem from the area of federated learning, whereby no direct access to training data is required. Original approaches utilize a central instance to perform the model synchronization by periodic aggregation of all model parameters. However, there are many scenarios where trained models cannot be published, since either confidential knowledge or training data could be reconstructed from them. Furthermore, the central instance needs to be trusted and is a single point of failure. As a solution, we propose a fully decentralized approach, which allows knowledge to be shared between trained models. Neither original training data nor model parameters need to be transmitted. The concept relies on teacher and student roles that are assigned to the models, whereby students are trained on the output of their teachers via synthetically generated input data. We conduct a case study on log anomaly detection. The results show that an untrained student model, trained on the teacher's output, reaches F1-scores comparable to those of the teacher. In addition, we demonstrate that our method allows the synchronization of several models trained on different distinct training data subsets.

翻訳日:2021-02-05 07:19:05 公開日:2021-02-01
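
The teacher and student roles described above can be illustrated with a toy regression problem in which the student only ever sees the teacher's outputs on synthetic inputs. Everything concrete here is an assumption made for illustration: the teacher is a fixed linear function standing in for a trained model, the synthetic inputs are uniform random numbers, and the student is fitted by plain gradient descent.

```python
import random


def teacher(x):
    """Stand-in for a trained teacher model; its parameters are never shared."""
    return 3.0 * x + 1.0


def train_student(teacher_fn, n_samples=50, steps=500, lr=0.1, seed=0):
    """Fit a linear student y = w * x + b to teacher outputs on synthetic inputs."""
    rng = random.Random(seed)
    xs = [rng.uniform(-1.0, 1.0) for _ in range(n_samples)]  # synthetic inputs
    ys = [teacher_fn(x) for x in xs]                         # teacher outputs only
    w, b = 0.0, 0.0
    for _ in range(steps):
        grad_w = sum(2.0 * (w * x + b - y) * x for x, y in zip(xs, ys)) / n_samples
        grad_b = sum(2.0 * (w * x + b - y) for x, y in zip(xs, ys)) / n_samples
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b


student_w, student_b = train_student(teacher)
```

Only the teacher's input-output behavior crosses the boundary between the two models, which is why neither the original training data nor the teacher's parameters have to be transmitted.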

# (参考訳) RNA二次構造体の神経表現と生成

Neural representation and generation for RNA secondary structures ( http://arxiv.org/abs/2102.00925v1 )

ライセンス: CC BY 4.0

Zichao Yan, William L. Hamilton and Mathieu Blanchette

(参考訳) 本研究は, 細胞活性や機能に影響を及ぼす複雑な構造を組み込むことができる遺伝子マクロ分子の一種であるRNAの生成と設計に関するものである。大規模で複雑な生物学的構造の設計は、計算薬物発見の重要かつ未承認の側面を表すグラフベースの深層生成モデリング技術に拍車をかけた。本研究では、異なるRNA構造モダリティの表現と生成の原理を検討し、これらの分子構造とそれらの配列を有意義な潜在空間に融合して生成するための柔軟な枠組みを提案する。 RNA分子構造を深く理解した当社の高度な符号化・復号法は、分子グラフとジャンクションツリー階層上で動作し、RNA構造規則性や折り畳み機構に関する強い誘導バイアスを統合し、生成したRNAの構造的妥当性、安定性、多様性を実現します。また,タンパク質との相互作用に関して,RNA分子埋め込みの潜伏空間を適切に整理し,この潜伏領域を探索し,新たなRNA分子を探索する目的の最適化も行っている。

Our work is concerned with the generation and targeted design of RNA, a type of genetic macromolecule that can adopt complex structures which influence its cellular activities and functions. The design of large-scale and complex biological structures spurs dedicated graph-based deep generative modeling techniques, which represent a key but underappreciated aspect of computational drug discovery. In this work, we investigate the principles behind representing and generating different RNA structural modalities, and propose a flexible framework to jointly embed and generate these molecular structures along with their sequences in a meaningful latent space. Equipped with a deep understanding of RNA molecular structures, our most sophisticated encoding and decoding methods operate on the molecular graph as well as the junction tree hierarchy, integrating strong inductive bias about RNA structural regularity and folding mechanisms such that high structural validity, stability and diversity of generated RNAs are achieved. Also, we seek to adequately organize the latent space of RNA molecular embeddings with regard to the interaction with proteins, and targeted optimization is used to navigate this latent space in search of desired novel RNA molecules.

翻訳日:2021-02-05 07:08:21 公開日:2021-02-01

# (参考訳) マルチタスクガウスプロセスマルチオブジェクト自己アテンションネットワークを用いたCOVID-19患者における機械的換気のリアルタイム予測

Real-time Prediction for Mechanical Ventilation in COVID-19 Patients using A Multi-task Gaussian Process Multi-objective Self-attention Network ( http://arxiv.org/abs/2102.01147v1 )

ライセンス: CC BY 4.0

Kai Zhang, Siddharth Karanth, Bela Patel, Robert Murphy, Xiaoqian Jiang

(参考訳) 本研究では,院内感染者が機械的換気を必要とする確率を予測できる堅牢なインタイム予測器を提案する。 COVID-19患者のリスク予測の課題は、臨床設定で観察された患者のバイタルとラボの大きな変動と不規則なサンプリングにあります。既存の手法は時間依存的な機能の複雑なダイナミクスを扱うのに強い制限があり、情報を失う要約統計による時間的データの単純化や、より堅牢な結果をもたらすオーバーエンジニアリング機能などである。個別の患者に対して機械的換気を行うリスクのダイナミクスを追従するデータの不規則なサンプリング率を扱うための,新しいリアルタイムリスク軌跡予測モデルを提案する。このモデルは、観測値を用いたマルチタスクガウス過程を取り入れ、後継の多変条件確率を学習し、統一された時間グリッド上の欠落値を推定する。時間的インデュートデータは、予測タスクのために多目的セルフアテンションネットワークに供給される。リアルタイム予測を行うための新しい位置符号化層を提案し,ネットワークに追加した。位置層は、患者全体の病院滞在中に、各ユーザー定義の時点にリスクスコアを出力する。予測タスクを多目的学習フレームワークに設定し、すべての時点におけるリスクスコアを完全に最適化し、リスクスコアの軌道予測に堅牢性と一貫性を付加する。また,全国の病院内患者を対象とした大規模データベースを用いた実験により,auc(受信者動作特性曲線下の地域)とauprc(精密リコール曲線下の地域)のパフォーマンス指標,特に入院後の早期におけるパフォーマンスの向上が示された。

We propose a robust in-time predictor of the probability that an in-hospital COVID-19 patient will require mechanical ventilation. A challenge in risk prediction for COVID-19 patients lies in the great variability and irregular sampling of patients' vitals and labs observed in the clinical setting. Existing methods have strong limitations in handling the complex dynamics of time-dependent features, either oversimplifying temporal data with summary statistics that lose information or over-engineering features that lead to less robust outcomes. We propose a novel in-time risk trajectory predictive model that handles the irregular sampling rate in the data and follows the dynamics of the risk of mechanical ventilation for individual patients. The model incorporates a Multi-task Gaussian Process that uses observed values to learn the posterior joint multivariate conditional probability and infer the missing values on a unified time grid. The temporally imputed data are fed into a multi-objective self-attention network for the prediction task. A novel positional encoding layer is proposed and added to the network for producing in-time predictions. The positional layer outputs a risk score at each user-defined time point during the entire hospital stay of an inpatient. We frame the prediction task as a multi-objective learning problem in which the risk scores at all time points are optimized together, which adds robustness and consistency to the risk score trajectory prediction. Our experimental evaluation on a large database of nationwide in-hospital patients with COVID-19 also demonstrates that the model improves on state-of-the-art performance in terms of AUC (Area Under the receiver operating characteristic Curve) and AUPRC (Area Under the Precision-Recall Curve), especially at early times after hospital admission.

翻訳日:2021-02-05 06:37:35 公開日:2021-02-01

# (参考訳) 組合せ交換プロトコルの論理的表現のための一般的な枠組み

A General Framework for the Logical Representation of Combinatorial Exchange Protocols ( http://arxiv.org/abs/2102.02061v1 )

ライセンス: CC BY 4.0

Munyque Mittelmann, Sylvain Bouveret, Laurent Perrussel

(参考訳) 本論文の目的は,組合せ交換を規定する規則を表現・推論するための枠組みを提案することである。このようなフレームワークは、自動トランザクションのための広く使用されるメカニズムであるオークションに基づいてデジタルマーケットプレイスを構築したい限り、最初は関心があります。コンビネーション取引所はオークションの最も一般的なケースであり、コンビネーションとコンビネーションのバリエーションを混ぜている:エージェントは商品のバンドルを取引しようとしている。したがって、フレームワークは2つの要件を満たすべきである: (i) 入札者が商品の組み合わせで入札を表現できるようにすべきであり、(ii) 特定の市場、すなわち法的入札、割り当ておよび支払いルールを管理するルールを記述することを許可すべきである。そこで我々は、ゲーム記述言語の精神の中で論理言語を定義する: Combinatorial Exchange Description Languageは、論理フレームワークにおける組合せ交換を記述するための最初の言語である。コントリビューションは2つある: まず、異なる種類のプロトコルを表現して一般的な次元を記述し、次に、この機械処理可能な言語におけるオークション特性の推論方法を示す。

The goal of this paper is to propose a framework for representing and reasoning about the rules governing a combinatorial exchange. Such a framework is of primary interest whenever we want to build digital marketplaces based on auctions, a widely used mechanism for automated transactions. A combinatorial exchange is the most general case of an auction, mixing the double and combinatorial variants: agents bid to trade bundles of goods. Hence the framework should fulfill two requirements: (i) it should enable bidders to express their bids on combinations of goods and (ii) it should allow describing the rules governing a given market, namely the legal bids and the allocation and payment rules. To do so, we define a logical language in the spirit of the Game Description Language: the Combinatorial Exchange Description Language is the first language for describing combinatorial exchanges in a logical framework. The contribution is two-fold: first, we illustrate its generality by representing different kinds of protocols, and second, we show how to reason about auction properties in this machine-processable language.

翻訳日:2021-02-04 22:19:57 公開日:2021-02-01

# クラスター分析とコミュニティ検出の評価のための外部対策の評価と比較

Characterizing and comparing external measures for the assessment of cluster analysis and community detection ( http://arxiv.org/abs/2102.00708v1 )

ライセンス: Link先を確認

Nejat Arinik (LIA), Vincent Labatut, Rosa Figueiredo

(参考訳) クラスタ分析とグラフ分割の文脈では、同じセットの2つのパーティションを比較するために、文献で多くの外部評価手段が提案されている。これにより、与えられた状況に対して最も適切な尺度を選択することがエンドユーザの課題となる。しかし、この問題は文献では見過ごされている。従来の研究者が一貫して使用し始めたためだけに、研究者は伝統に従い、彼らの分野の標準的な尺度を使用する傾向があります。本研究では,この問題を解決するための新しい経験的評価フレームワークを提案し,エンドユーザーがアプリケーションに適した尺度を選択するのを支援する。候補測度の集まりでは、まず、事前に定義されたパラメトリック分割変換のセットを適用して得られるパーティションの生成データセットに対してそれらの振る舞いを計算して記述する。第2に,このフレームワークは回帰分析を行い,パラメータや変換の影響を受ける指標を特徴付ける。これにより、測定方法の説明と比較が可能となる。私たちのアプローチは特定の測度やアプリケーションに縛られませんので、どんな状況にも適用できます。我々は,本手法を標準尺度の選定に適用し,その妥当性を説明し,具体的ユースケースを2つに分けて実施する方法を示す。

In the context of cluster analysis and graph partitioning, many external evaluation measures have been proposed in the literature to compare two partitions of the same set. This makes the task of selecting the most appropriate measure for a given situation a challenge for the end user. However, this issue is overlooked in the literature. Researchers tend to follow tradition and use the standard measures of their field, although they often became standard only because previous researchers started consistently using them. In this work, we propose a new empirical evaluation framework to solve this issue and help the end user select an appropriate measure for their application. For a collection of candidate measures, it first describes their behavior by computing them for a generated dataset of partitions, obtained by applying a set of predefined parametric partition transformations. Second, our framework performs a regression analysis to characterize the measures in terms of how they are affected by these parameters and transformations. This allows both describing and comparing the measures. Our approach is not tied to any specific measure or application, so it can be applied to any situation. We illustrate its relevance by applying it to a selection of standard measures, and show how it can be put into practice through two concrete use cases.

翻訳日:2021-02-04 17:14:45 公開日:2021-02-01
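
As a rough illustration of what an external measure for comparing two partitions of the same set looks like, the sketch below computes the Rand index in plain Python. The choice of the Rand index, the function name `rand_index`, and the toy label lists are assumptions made for illustration; the abstract above does not say which measures the authors evaluate, and the "perturbed" partition is only a loose stand-in for their parametric transformations.

```python
from itertools import combinations


def rand_index(labels_a, labels_b):
    """Fraction of element pairs on which two partitions agree.

    A pair counts as an agreement when both partitions put the two
    elements together, or both put them apart.
    """
    if len(labels_a) != len(labels_b):
        raise ValueError("both partitions must label the same elements")
    pairs = list(combinations(range(len(labels_a)), 2))
    if not pairs:
        raise ValueError("need at least two elements")
    agreements = 0
    for i, j in pairs:
        same_in_a = labels_a[i] == labels_a[j]
        same_in_b = labels_b[i] == labels_b[j]
        if same_in_a == same_in_b:
            agreements += 1
    return agreements / len(pairs)


# Hypothetical toy partitions of six elements.
reference = [0, 0, 0, 1, 1, 1]
relabelled = [1, 1, 1, 0, 0, 0]  # same partition, different label names
perturbed = [0, 0, 1, 1, 1, 1]   # one element moved to the other cluster

print(rand_index(reference, relabelled))
print(rand_index(reference, perturbed))
```

Tracking how such a score changes as a partition is perturbed more and more strongly is, in spirit, the kind of behavior the described framework sets out to characterize.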

# (参考訳) インド古典音楽における感情分類のためのニューラルネットワークアーキテクチャ

Neural Network architectures to classify emotions in Indian Classical Music ( http://arxiv.org/abs/2102.00616v1 )

ライセンス: CC BY 4.0

Uddalok Sarkar, Sayan Nag, Medha Basu, Archi Banerjee, Shankha Sanyal, Ranjan Sengupta, Dipak Ghosh

(参考訳) 音楽はしばしば感情の言語と見なされる。音楽は古くから人間の感情を引き出すことが知られており、誘発する感情のタイプに基づいて音楽を分類することは、非常に興味深い研究トピックである。インド古典音楽(ICM)によって引き起こされる感情を分類する作業になると、ICMに固有の曖昧さのため、さらに困難になる。1つの演奏が聴衆に様々な感情的反応を呼び起こしうるという事実は、ICMの演奏の性質に内在している。ディープラーニングの分野での急速な進歩により、この音楽感情認識(MER)タスクはますます関連性が高く堅牢になりつつあり、最も困難なテストケースの1つであるICMが引き起こす感情の分類にも適用できる。本稿では,200クリップが幸せな感情に対応し,残りの200クリップが悲しい感情に対応する,現時点で400のオーディオクリップ(それぞれ30秒)を持つJUMusEmoDBという新しいデータセットを提案する。教師付き分類のために、時間領域情報と周波数領域情報の両方を含む2000個のサブクリップ(各クリップをそれぞれ約5秒の5つのサブクリップに分割)の音楽スペクトログラムに対して、既存の4つのディープ畳み込みニューラルネットワーク(CNN)ベースのアーキテクチャ(resnet18, mobilenet v2.0, squeezenet v1.0, vgg16)を使用した。最初の結果は非常に有望であり、このアーキテクチャを使ってデータセットのベースライン値を設定することを楽しみにしています。インド古典音楽の豊富なコーパスを用いたこの種のCNNに基づく分類アルゴリズムは,グローバルな視点でもユニークであり,他の音楽のモダリティにおいても再現可能である。このデータセットはまだ開発中であり、他の感情的特徴を含むデータも追加する予定です。近いうちにデータセットを一般公開する予定です。

Music is often considered the language of emotions. It has long been known to elicit emotions in human beings, and thus categorizing music based on the type of emotions it induces in human beings is a very intriguing topic of research. When the task is to classify emotions elicited by Indian Classical Music (ICM), it becomes much more challenging because of the inherent ambiguity associated with ICM. The fact that a single musical performance can evoke a variety of emotional responses in the audience is implicit in the nature of ICM renditions. With the rapid advancements in the field of Deep Learning, this Music Emotion Recognition (MER) task is becoming more and more relevant and robust, and hence can be applied to one of the most challenging test cases, i.e., classifying emotions elicited from ICM. In this paper we present a new dataset called JUMusEmoDB which presently has 400 audio clips (30 seconds each), where 200 clips correspond to happy emotions and the remaining 200 clips correspond to sad emotions. For supervised classification purposes, we have used 4 existing deep Convolutional Neural Network (CNN) based architectures (resnet18, mobilenet v2.0, squeezenet v1.0 and vgg16) on corresponding music spectrograms of the 2000 sub-clips (where every clip was segmented into 5 sub-clips of about 5 seconds each), which contain both time and frequency domain information. The initial results are quite inspiring, and we look forward to setting the baseline values for the dataset using this architecture. This type of CNN based classification algorithm using a rich corpus of Indian Classical Music is unique even from a global perspective and can be replicated in other modalities of music as well. This dataset is still under development and we plan to include more data containing other emotional features as well. We plan to make the dataset publicly available soon.

翻訳日:2021-02-04 15:18:50 公開日:2021-02-01

# (参考訳) 量子インスパイアされた適応ブースティング

Quantum Inspired Adaptive Boosting ( http://arxiv.org/abs/2102.00949v1 )

ライセンス: CC BY 4.0

Bálint Daróczy, Katalin Friedl, László Kabódi, Attila Pereszlényi, Dániel Szabó

(参考訳) Schuld と Petruccione [arXiv:1704.02146v1] の量子アンサンブルに基づく分類アルゴリズムに基づいて、この量子アンサンブル法が古典アルゴリズムよりも有利でないことを示す等価な古典アルゴリズムを考案した。基本的には、それらのアルゴリズムを、同等の古典的なバージョンを思いつくまで単純化する。古典的なアルゴリズムの1つは極めて単純で、各入力を分類するために一定時間実行される。さらに,本論文の主な貢献として,量子アンサンブル法と適応的なブースティングを組み合わせた手法を提案する。アルゴリズムはテストされ、公開データセット上のAdaBoostアルゴリズムに匹敵することがわかった。

Building on the quantum ensemble based classifier algorithm of Schuld and Petruccione [arXiv:1704.02146v1], we devise equivalent classical algorithms which show that this quantum ensemble method does not have an advantage over classical algorithms. Essentially, we simplify their algorithm until it is intuitive to come up with an equivalent classical version. One of the classical algorithms is extremely simple and runs in constant time for each input to be classified. We further develop the idea and, as the main contribution of the paper, we propose methods inspired by combining the quantum ensemble method with adaptive boosting. The algorithms were tested and found to be comparable to the AdaBoost algorithm on publicly available data sets.

翻訳日:2021-02-04 15:11:47 公開日:2021-02-01

# (参考訳) 一般化キタエフハニカム磁石の機械学習相図

Machine-Learned Phase Diagrams of Generalized Kitaev Honeycomb Magnets ( http://arxiv.org/abs/2102.01103v1 )

ライセンス: CC BY 4.0

Nihal Rao, Ke Liu, Marc Machaczek, Lode Pollet

(参考訳) 我々は、最近開発された解釈可能で教師なしの機械学習手法であるテンソルカーネルサポートベクトルマシン(TK-SVM)を用いて、ハニカム格子上の一般化されたハイゼンベルク-キタエフ-$\Gamma$ ($J$-$K$-$\Gamma$) モデルの低温古典位相図を調査する。以前の量子および古典研究で報告された相を再現することに加えて、私たちのマシンはこれまで見逃されていたネストされたジグザグ・ストライピー秩序を見つけ出し、キタエフスピン液体と$\Gamma$スピン液体の競合によって現れる、最近特定された変調$S_3 \times Z_3$相がハイゼンベルク相互作用に対して堅牢であることを確立する。結果は、$J$, $K$, $\Gamma$の3つの主要な交換相互作用にまたがる制限されたパラメータ空間において、代表的なキタエフ物質$\alpha$-${\rm RuCl}_3$が、単純な強磁性体、非従来型の$S_3 \times Z_3$磁性体、ネストされたジグザグ・ストライピー磁性体を含むいくつかの相の界面に近いことを示唆している。ジグザグ秩序は有限の $\Gamma^{\prime}$ および/または $J_3$ 項によって安定化されるが、特に $\Gamma^{\prime}$ が反強磁性の場合には4つの磁気秩序が競合しうる。

We use a recently developed interpretable and unsupervised machine-learning method, the tensorial kernel support vector machine (TK-SVM), to investigate the low-temperature classical phase diagram of a generalized Heisenberg-Kitaev-$\Gamma$ ($J$-$K$-$\Gamma$) model on a honeycomb lattice. Aside from reproducing phases reported by previous quantum and classical studies, our machine finds a hitherto missed nested zigzag-stripy order and establishes the robustness of a recently identified modulated $S_3 \times Z_3$ phase, which emerges through the competition between the Kitaev and $\Gamma$ spin liquids, against Heisenberg interactions. The results imply that, in the restricted parameter space spanned by the three primary exchange interactions -- $J$, $K$, and $\Gamma$, the representative Kitaev material $\alpha$-${\rm RuCl}_3$ lies close to the interface of several phases, including a simple ferromagnet, and the unconventional $S_3 \times Z_3$ and nested zigzag-stripy magnets. A zigzag order is stabilized by a finite $\Gamma^{\prime}$ and/or $J_3$ term, whereas the four magnetic orders may compete in particular if $\Gamma^{\prime}$ is anti-ferromagnetic.

翻訳日:2021-02-04 12:47:14 公開日:2021-02-01

# 早期アルツハイマー病検出のための反応時間に基づく分類

Classifications based on response times for detecting early-stage Alzheimer's disease ( http://arxiv.org/abs/2102.00738v1 )

ライセンス: Link先を確認

Alain Petrowski (TSP, RS2M)

(参考訳) 序論:本論文は, 手書き・描画タスクの記録から構築したデータセットを用いて,早期アルツハイマー病(ES-AD)患者と健常対照(HC)被験者を高精度に判別する方法を主に記述する。方法:提案手法は被験者の応答時間を用いる。タスクの最適なサブセットは、まずグリッド検索と組み合わせた「サポートベクターマシン」(SVM)で選択される。次に、タスク持続時間の空間で定義されるガウス分布の混合を用いて、SVMの結果を再現し、説明する。最後に、驚くほどシンプルで効率的なアドホック分類アルゴリズムがガウス混合から導かれる。結果:本論文で示した解法は、手書き・描画タスクからのHC/ES-AD分類に関する最先端の最良の結果と比べて、誤りが2分の1、さらには4分の1になる。議論: 最良のSVM学習モデルはこの分類で高い精度に達するが、データセットのサイズが小さいことを考えると、その学習能力は大きすぎて、過学習のリスクが低いことを保証できない。提案するアドホック分類アルゴリズムは、3つの実数パラメータを最適化するだけでよい。したがって、優れた一般化能力の恩恵を受けるはずである。

Introduction: This paper mainly describes a way to distinguish, with high accuracy, patients with early-stage Alzheimer's disease (ES-AD) from healthy control (HC) subjects, using datasets built with handwriting and drawing task records. Method: The proposed approach uses the subjects' response times. An optimal subset of tasks is first selected with a "Support Vector Machine" (SVM) associated with a grid search. Mixtures of Gaussian distributions defined in the space of task durations are then used to reproduce and explain the results of the SVM. Finally, a surprisingly simple and efficient ad hoc classification algorithm is deduced from the Gaussian mixtures. Results: The solution presented in this paper makes two or even four times fewer errors than the best results of the state of the art concerning the classification HC/ES-AD from handwriting and drawing tasks. Discussion: The best SVM learning model reaches a high accuracy for this classification, but its learning capacity is too large to ensure a low overfitting risk given the small size of the dataset. The proposed ad hoc classification algorithm only requires optimizing three real-valued parameters. It should therefore benefit from a good generalization ability.

翻訳日:2021-02-04 10:18:26 公開日:2021-02-01

# 教師なしリアルタイム構造健康モニタリングのためのシステム信頼性に基づくGANと一級共同ガウス分布のマルチアンサンブル

System-reliability based multi-ensemble of GAN and one-class joint Gaussian distributions for unsupervised real-time structural health monitoring ( http://arxiv.org/abs/2102.01158v1 )

ライセンス: Link先を確認

Mohammad Hesam Soleimani-Babakamali, Reza Sepasdar, Kourosh Nasrollahzadeh, and Rodrigo Sarlo

(参考訳) 教師なしの健全性モニタリングは、最も実用的なリアルタイム構造ヘルスモニタリング(SHM)アプローチとして、過去10年間で多くの注目を集めています。文献で提案された教師なし手法には、堅牢でリアルタイムなヘルスモニタリングに対する障害がまだあります。これらの障壁には、特徴抽出ステップの次元削減による情報の損失、それらのステップのケース依存性、動的クラスタリングの欠如、ユーザ定義パラメータに対する検出結果の感度が含まれる。本研究では,ケース依存の抽出方式を使わずに,低次元と高次元の特徴を混合した教師なしのリアルタイムSHM法を提案する。両方の特徴は、GAN(Generative Adversarial Networks)と1クラス結合ガウス分布モデル(1-CG)のマルチアンサンブルのトレーニングに使用される。GANと1-CGモデルの検出スコアに基づく極限状態関数による新規性検出システムを構築する。これらの極限状態関数の抵抗(検出しきい値)は、信頼性に基づく解析を通じてモンテカルロヒストグラムサンプリングを用い、GAN生成データオブジェクトによってユーザ定義パラメータに合わせて調整される。リアルタイムSHMではこれらのパラメータを選択するルールがないため、この調整によって手法がユーザ定義パラメータに対して堅牢になることは重要です。提案された新規性検出フレームワークは、その一般化可能性を示すために、Yellow Frame(20の損傷クラス)とZ24 Bridge(15の損傷クラス)の2つの標準SHMデータセットに適用される。導入した動的ベースラインと静的ベースラインの両方のアプローチにおいて、すべての異なる損傷カテゴリが、ユーザ定義パラメータの初期選択に対する低い感度で、誤警報がほとんどまたは全くない状態で識別された。

Unsupervised health monitoring has gained much attention in the last decade as the most practical real-time structural health monitoring (SHM) approach. Among the proposed unsupervised techniques in the literature, there are still obstacles to robust and real-time health monitoring. These barriers include loss of information from dimensionality reduction in feature extraction steps, case-dependency of those steps, lack of dynamic clustering, and the sensitivity of detection results to user-defined parameters. This study introduces an unsupervised real-time SHM method with a mixture of low- and high-dimensional features without a case-dependent extraction scheme. Both features are used to train multi-ensembles of Generative Adversarial Networks (GAN) and one-class joint Gaussian distribution models (1-CG). A novelty detection system of limit-state functions based on the detection scores of the GAN and 1-CG models is constructed. The resistance of those limit-state functions (detection thresholds) is tuned to user-defined parameters with the GAN-generated data objects by employing Monte Carlo histogram sampling through a reliability-based analysis. The tuning makes the method robust to user-defined parameters, which is crucial as there is no rule for selecting those parameters in real-time SHM. The proposed novelty detection framework is applied to two standard SHM datasets to illustrate its generalizability: Yellow Frame (twenty damage classes) and Z24 Bridge (fifteen damage classes). All different damage categories are identified with low sensitivity to the initial choice of user-defined parameters, with both the dynamic and static baseline approaches introduced, and with few or no false alarms.

翻訳日:2021-02-04 10:17:47 公開日:2021-02-01

# 呪いか償還か? データ不均一性がフェデレーション学習のロバスト性に与える影響

Curse or Redemption? How Data Heterogeneity Affects the Robustness of Federated Learning ( http://arxiv.org/abs/2102.00655v1 )

ライセンス: Link先を確認

Syed Zawad, Ahsan Ali, Pin-Yu Chen, Ali Anwar, Yi Zhou, Nathalie Baracaldo, Yuan Tian, Feng Yan

(参考訳) データの不均一性は、フェデレートラーニングにおける重要な特徴の1つとして認識されているが、敵対的攻撃に対する堅牢性という観点では見過ごされがちである。本論文では,合成ベンチマークおよびLEAFベンチマークを用いた包括的な実験を通じて,フェデレーション学習におけるバックドア攻撃に対するデータ不均一性の影響を特徴づけ,理解することに焦点を当てる。実験結果から得られる最初の印象では,データの不均一性は攻撃の有効性の主要な要因であり,攻撃の効率が低下し,効果的な攻撃戦略の設計が困難となり,攻撃結果も予測しにくくなるため,バックドア攻撃に対する防御にとって償還となる可能性が示唆された。しかし,さらなる調査により,クライアント側バックドアのタイミングを単に調整するだけで攻撃効果を著しく向上できるため,データの不均一性は償還よりも呪いに近いことが判明した。さらに重要なのは、データの不均一性が良性クライアントのローカルトレーニングで過学習をもたらす可能性があり、攻撃者がこれを利用して自分自身を偽装し、特徴の偏りに基づく防御を欺くことができる点です。また、攻撃データ分布を調整することで効果的な攻撃戦略を作成できる。最後に,データの不均一性によってもたらされる呪いを防ぐための方向性について論じる。大規模な実験と分析から得られた成果と教訓は、堅牢な連合学習手法とシステムを設計するための新たな洞察を提供する。

Data heterogeneity has been identified as one of the key features in federated learning but is often overlooked through the lens of robustness to adversarial attacks. This paper focuses on characterizing and understanding its impact on backdooring attacks in federated learning through comprehensive experiments using synthetic and the LEAF benchmarks. The initial impression driven by our experimental results suggests that data heterogeneity is the dominant factor in the effectiveness of attacks and it may be a redemption for defending against backdooring, as it makes the attack less efficient, makes effective attack strategies more challenging to design, and makes the attack result less predictable. However, with further investigations, we found data heterogeneity is more of a curse than a redemption as the attack effectiveness can be significantly boosted by simply adjusting the client-side backdooring timing. More importantly, data heterogeneity may result in overfitting at the local training of benign clients, which can be utilized by attackers to disguise themselves and fool skewed-feature based defenses. In addition, effective attack strategies can be made by adjusting attack data distribution. Finally, we discuss the potential directions of defending against the curses brought by data heterogeneity. The results and lessons learned from our extensive experiments and analysis offer new insights for designing robust federated learning methods and systems.

翻訳日:2021-02-04 10:10:57 公開日:2021-02-01

# 化学空間探索のためのディープニューラルネットワークを用いた遺伝的アルゴリズムの再現性に関する研究

A reproducibility study of "Augmenting Genetic Algorithms with Deep Neural Networks for Exploring the Chemical Space" ( http://arxiv.org/abs/2102.00700v1 )

ライセンス: Link先を確認

Kevin Maik Jablonka, Fergus Mcilwaine, Susana Garcia, Berend Smit, Brian Yoo

(参考訳) Nigamらは、SELFIES表現を利用した遺伝的アルゴリズム(GA)を報告し、生成された分子の多様性を改善するとされる、ニューラルネットワークに基づく適応的なペナルティも提案した。この論文の主な主張は、このGAが(ペナルティ付きlogPで測定して)他の生成技術を上回ること、およびニューラルネットワークベースの適応ペナルティが生成された分子の多様性を増加させることである。本研究では,それらの主張の再現性を検討した。全体としては、SELFIESベースのGAを用いて同等の結果を再現することができたが、そのほとんどは(容易に最適化可能な)適応度関数の欠陥を利用したもの(すなわち、硫黄を含む長い鎖の生成)であった。さらに, 判別器を用いて, 分子の生成を参照セットに類似するものへ偏らせることができることも再現した。加えて,多様性の進化を定量化し,いくつかのハイパーパラメータの影響を理解することを試み,適応的ペナルティの改善を提案する。

Nigam et al. reported a genetic algorithm (GA) utilizing the SELFIES representation and also proposed an adaptive, neural network-based penalty that is supposed to improve the diversity of the generated molecules. The main claims of the paper are that this GA outperforms other generative techniques (as measured by the penalized logP) and that a neural network-based adaptive penalty increases the diversity of the generated molecules. In this work, we investigated the reproducibility of their claims. Overall, we were able to reproduce comparable results using the SELFIES-based GA, but mostly by exploiting deficiencies of the (easily optimizable) fitness function (i.e., generating long, sulfur-containing chains). In addition, we reproduced the finding that the discriminator can be used to bias the generation of molecules towards ones that are similar to the reference set. We also attempted to quantify the evolution of the diversity, understand the influence of some hyperparameters, and propose improvements to the adaptive penalty.

翻訳日:2021-02-04 10:10:15 公開日:2021-02-01

# 説明可能なランドスケープ解析に向けて:BBOB関数の極端特徴選択

Towards Explainable Exploratory Landscape Analysis: Extreme Feature Selection for Classifying BBOB Functions ( http://arxiv.org/abs/2102.00736v1 )

ライセンス: Link先を確認

Quentin Renau, Johann Dreo, Carola Doerr and Benjamin Doerr

(参考訳) 最近の機械学習(ML)の進歩により、最適化ヒューリスティックスの自動設計が現在、進化計算(EC)を揺るがしている。もっとも適したヒューリスティックを選ぶための人手によるガイドラインの設計が長らくこの分野の研究活動を支配してきたのに対し、自動訓練されたヒューリスティックは、よく研究された最適化タスクにおいても、人間由来の選択肢よりも優れることが見られるようになった。したがって、MLベースのECはもはや未来的なビジョンではなく、コミュニティの不可欠な部分になっています。MLベースのヒューリスティックがしばしば直面する重要な批判は、将来の開発を妨げる可能性のある説明可能性の潜在的な不足である。これは特に、探索的ランドスケープ分析(ELA)に基づいてアルゴリズムのパフォーマンスを外挿する教師付き学習技術に当てはまります。このようなアプリケーションでは、特定のアルゴリズム選択または構成タスクの基礎となるモデルを構築するために数十の問題特徴を使用することは珍しくありません。この研究の目標は、これほど多くの特徴が本当に必要かどうかを分析することです。BBOBテスト関数の分類をテストベッドとして、驚くほど少数の特徴(多くの場合4つ未満)で98%の精度を達成するのに十分であることを示す。興味深いことに、このしきい値を満たすのに必要な特徴の数は問題次元とともに減少する。この分類精度は,複数のインスタンスがトレーニングやテストに関与している設定にも転移することを示す。しかし, leave-one-instance-outの設定では分類精度が著しく低下し, 特徴の変換不変性が決定的な成功要因となる。

Facilitated by the recent advances of Machine Learning (ML), the automated design of optimization heuristics is currently shaking up evolutionary computation (EC). Where the design of hand-picked guidelines for choosing the most suitable heuristic has long dominated research activities in the field, automatically trained heuristics are now seen to outperform human-derived choices even for well-researched optimization tasks. ML-based EC is therefore no longer a futuristic vision, but has become an integral part of our community. A key criticism that ML-based heuristics are often faced with is their potential lack of explainability, which may hinder future developments. This applies in particular to supervised learning techniques which extrapolate algorithms' performance based on exploratory landscape analysis (ELA). In such applications, it is not uncommon to use dozens of problem features to build the models underlying the specific algorithm selection or configuration task. Our goal in this work is to analyze whether this many features are indeed needed. Using the classification of the BBOB test functions as testbed, we show that a surprisingly small number of features -- often fewer than four -- can suffice to achieve a 98\% accuracy. Interestingly, the number of features required to meet this threshold is found to decrease with the problem dimension. We show that the classification accuracy transfers to settings in which several instances are involved in training and testing. In the leave-one-instance-out setting, however, classification accuracy drops significantly, and the transformation-invariance of the features becomes a decisive success factor.

翻訳日:2021-02-04 10:09:37 公開日:2021-02-01

# SGDのための無痛ステップサイズ適応

Painless step size adaptation for SGD ( http://arxiv.org/abs/2102.00853v1 )

ライセンス: Link先を確認

Ilona Kulikovskikh and Tarzan Legović

(参考訳) 収束と一般化は、ニューラルネットワークのパフォーマンスの2つの重要な側面である。別々に解析すると、これらの性質は矛盾する結果をもたらす可能性がある。収束率の最適化は高速なトレーニングをもたらすが、最良の一般化誤差を保証しない。対立を避けるため、最近の研究では、オプティマイザに適度に大きなステップサイズを採用することを提案しているが、パフォーマンスに対する付加価値は明らかでない。我々は、収束とテスト時の一般化の改善を明示的に制御する4つの構成を持つLIGHT関数を提案します。この貢献により、1) 安定性を保証する必要なしに、ニューラルネットワークの収束性と一般化の両方を改善すること、2) 過剰なパラメータ化を必要とせずに、より信頼性が高く説明可能なネットワークアーキテクチャを構築することが可能になる。私たちはそれを「痛みのない」ステップサイズの適応と呼びます。

Convergence and generalization are two crucial aspects of performance in neural networks. When analyzed separately, these properties may lead to contradictory results. Optimizing a convergence rate yields fast training, but does not guarantee the best generalization error. To avoid the conflict, recent studies suggest adopting a moderately large step size for optimizers, but the added value on the performance remains unclear. We propose the LIGHT function with four configurations which explicitly regulate an improvement in convergence and generalization on testing. This contribution makes it possible to: 1) improve both convergence and generalization of neural networks with no need to guarantee their stability; 2) build more reliable and explainable network architectures with no need for overparameterization. We refer to it as "painless" step size adaptation.

翻訳日:2021-02-04 10:08:51 公開日:2021-02-01

# 不可能なチューニングが可能になった:新しいエキスパートアルゴリズムとその応用

Impossible Tuning Made Possible: A New Expert Algorithm and Its Applications ( http://arxiv.org/abs/2102.01046v1 )

ライセンス: Link先を確認

Liyu Chen, Haipeng Luo, Chen-Yu Wei

(参考訳) 我々は、古典的エキスパート問題の長年にわたる「不可能なチューニング」問題を解決し、$T$ラウンド$d$エキスパート問題において、すべてのエキスパート$i$に対して同時にリグレット$O\left(\sqrt{(\ln d)\sum_t \ell_{t,i}^2}\right)$を達成することが実際に可能であることを示す。ここで$\ell_{t,i}$は、ラウンド$t$におけるエキスパート$i$の損失である。本アルゴリズムは、補正項と重み付きエントロピー正則化器を備えたミラー降下フレームワークに基づいている。自然ではあるが、このアルゴリズムはこれまで研究されておらず、慎重な分析を必要とする。また、学習者が受信する任意の予測ベクトル $m_t$ に対して、境界を $O\left(\sqrt{(\ln d)\sum_t (\ell_{t,i}-m_{t,i})^2}\right)$ に一般化し、異なる $m_t$ を選択することで多くの既存の結果を復元または改善する。さらに,同じフレームワークを使用して,基本アルゴリズムの集合を結合し,オーバーヘッドの少ない最善のアルゴリズムを学習するマスタアルゴリズムを作成する。マスターの新たな保証により、エキスパート問題とより一般的なオンライン線形最適化の両方に対して、多くの新しい結果が得られます。

We resolve the long-standing "impossible tuning" issue for the classic expert problem and show that it is in fact possible to achieve regret $O\left(\sqrt{(\ln d)\sum_t \ell_{t,i}^2}\right)$ simultaneously for all experts $i$ in a $T$-round $d$-expert problem, where $\ell_{t,i}$ is the loss for expert $i$ in round $t$. Our algorithm is based on the Mirror Descent framework with a correction term and a weighted entropy regularizer. While natural, the algorithm has not been studied before and requires a careful analysis. We also generalize the bound to $O\left(\sqrt{(\ln d)\sum_t (\ell_{t,i}-m_{t,i})^2}\right)$ for any prediction vector $m_t$ that the learner receives, and recover or improve many existing results by choosing different $m_t$. Furthermore, we use the same framework to create a master algorithm that combines a set of base algorithms and learns the best one with little overhead. The new guarantee of our master allows us to derive many new results for both the expert problem and more generally Online Linear Optimization.

翻訳日:2021-02-04 10:08:20 公開日:2021-02-01
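
To make the expert-problem setting concrete, here is a minimal sketch of the classic Hedge (exponential weights) baseline together with a regret computation. This is not the algorithm of the paper above, which uses Mirror Descent with a correction term and a weighted entropy regularizer; Hedge with a fixed learning rate only gives the worst-case $\sqrt{(T \ln d)/2}$ bound, not the expert-dependent bound stated in the abstract. The loss sequence, the function name `hedge`, and the learning-rate choice are illustrative assumptions.

```python
import math


def hedge(losses, eta):
    """Classic Hedge / exponential weights (a baseline, not the paper's method).

    losses[t][i] is the loss of expert i in round t, assumed to lie in [0, 1].
    Returns the learner's cumulative expected loss and every expert's
    cumulative loss.
    """
    d = len(losses[0])
    cumulative = [0.0] * d
    learner_loss = 0.0
    for round_losses in losses:
        # Weights are proportional to exp(-eta * cumulative loss);
        # subtracting the minimum keeps the exponentials well scaled.
        base = min(cumulative)
        weights = [math.exp(-eta * (c - base)) for c in cumulative]
        total = sum(weights)
        probs = [w / total for w in weights]
        learner_loss += sum(p * l for p, l in zip(probs, round_losses))
        cumulative = [c + l for c, l in zip(cumulative, round_losses)]
    return learner_loss, cumulative


# Hypothetical loss sequence: expert 0 is perfect, the others alternate 0/1.
T, d = 200, 3
losses = [[0.0 if i == 0 else float((t + i) % 2) for i in range(d)]
          for t in range(T)]

eta = math.sqrt(8 * math.log(d) / T)  # standard tuning for losses in [0, 1]
learner_loss, expert_losses = hedge(losses, eta)
regret = learner_loss - min(expert_losses)
print(regret, math.sqrt(T * math.log(d) / 2))
```

The fixed `eta` here is tuned to the horizon $T$ only, which is exactly the kind of one-size-fits-all tuning that the expert-dependent bounds in the abstract improve upon.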

# 旅行行動予測における数百の機械学習分類器と離散選択モデルの比較:実証的ベンチマーク

Comparing hundreds of machine learning classifiers and discrete choice models in predicting travel behavior: an empirical benchmark ( http://arxiv.org/abs/2102.01130v1 )

ライセンス: Link先を確認

Shenhao Wang, Baichuan Mo, Stephane Hess, Jinhua Zhao

(参考訳) 研究者は、旅行行動を予測するために機械学習(ML)分類器と離散選択モデル(DCM)を比較してきたが、発見の一般化はデータ、文脈、著者の専門知識によって制限されている。本研究は、高度に構造化された方法で数百のMLおよびDCM分類器を比較して、一般化可能な経験的ベンチマークを提供することを目的とする。実験では,12のモデルファミリーから105のMLとDCMの分類器,3つのデータセット,3つのサンプルサイズ,3つのアウトプットを含む4つの超次元にまたがって予測精度と計算コストを評価した。この実験設計は6,970の実験につながり、35の以前の研究から136の実験ポイントのメタデータセットと関連付けられている。この研究は、旅行行動予測のための分類器の最も包括的でほぼ完全な比較です。その結果,アンサンブル法とディープニューラルネットワークが最も高い予測性能が得られるが,計算コストは比較的高いことがわかった。ランダムフォレストは最も計算効率が良く、予測と計算のバランスをとる。離散選択モデルは、上位ML分類器よりもわずか3～4パーセント低い精度を提供するが、より長い計算時間を持ち、大きなサンプルサイズ、高い入力次元、シミュレーションベースの推定で計算不可能になる。 MLおよびDCM分類器の相対的なランキングは非常に安定しており、予測精度と計算時間の絶対値は大きな変動を有する。本稿では, 深層ニューラルネットワーク, モデルアンサンブル, ランダム森林を, 将来の旅行行動予測のベースラインモデルとして用いることを提案する。選択モデリングのために、DCMコミュニティは、ビッグデータのコンテキストでDCMを広く採用できるように、適合モデルから計算効率の改善にもっと注意を向けるべきです。

Researchers have compared machine learning (ML) classifiers and discrete choice models (DCMs) in predicting travel behavior, but the generalizability of the findings is limited by the specifics of data, contexts, and authors' expertise. This study seeks to provide a generalizable empirical benchmark by comparing hundreds of ML and DCM classifiers in a highly structured manner. The experiments evaluate both prediction accuracy and computational cost by spanning four hyper-dimensions, including 105 ML and DCM classifiers from 12 model families, 3 datasets, 3 sample sizes, and 3 outputs. This experimental design leads to an immense number of 6,970 experiments, which are corroborated with a meta dataset of 136 experiment points from 35 previous studies. This study is hitherto the most comprehensive and almost exhaustive comparison of the classifiers for travel behavioral prediction. We found that the ensemble methods and deep neural networks achieve the highest predictive performance, but at a relatively high computational cost. Random forests are the most computationally efficient, balancing between prediction and computation. While discrete choice models offer accuracy only 3-4 percentage points lower than the top ML classifiers, they have much longer computational time and become computationally impossible with large sample size, high input dimensions, or simulation-based estimation. The relative ranking of the ML and DCM classifiers is highly stable, while the absolute values of the prediction accuracy and computational time have large variations. Overall, this paper suggests using deep neural networks, model ensembles, and random forests as baseline models for future travel behavior prediction. For choice modeling, the DCM community should shift more attention from fitting models to improving computational efficiency, so that the DCMs can be widely adopted in the big data context.

翻訳日:2021-02-04 10:07:30 公開日:2021-02-01

# 攻撃下の未知力学系に対する動的カモフラージュによるセキュアな学習制御戦略

A Secure Learning Control Strategy via Dynamic Camouflaging for Unknown Dynamical Systems under Attacks ( http://arxiv.org/abs/2102.00573v1 )

ライセンス: Link先を確認

Sayak Mukherjee, Veronica Adetola

(参考訳) 本稿では、盗聴や隠蔽攻撃などの構成攻撃を受ける未知の線形時間不変サイバー物理システム(CPS)に対するセキュア強化学習(RL)に基づく制御手法を提案する。設計者が線形二次制御器(LQR)を学ぶために行う学習の探索段階で攻撃者が動的モデルについて学習する攻撃シナリオを検討し、その後、我々は二重学習ベースの制御と攻撃(DLCA)フレームワークと呼ばれる動的システムへの隠れた攻撃を実行するためにそのような情報を使用します。本研究では,動的システムの最適制御器を学習し,同時に,攻撃者によるシステムダイナミクスの推定に十分な誤情報を注入することができる,動的迷彩に基づく攻撃回復力強化学習アルゴリズム(ARRL)を提案する。このアルゴリズムには、コンセンサスマルチエージェントシステムとベンチマーク電力グリッドモデルに関する理論的保証と広範な数値実験が伴っている。

This paper presents a secure reinforcement learning (RL) based control method for unknown linear time-invariant cyber-physical systems (CPSs) that are subjected to compositional attacks such as eavesdropping and covert attack. We consider the attack scenario where the attacker learns about the dynamic model during the exploration phase of the learning conducted by the designer to learn a linear quadratic regulator (LQR), and thereafter uses such information to conduct a covert attack on the dynamic system, which we refer to as the doubly learning-based control and attack (DLCA) framework. We propose a dynamic camouflaging based attack-resilient reinforcement learning (ARRL) algorithm which can learn the desired optimal controller for the dynamic system and, at the same time, can inject sufficient misinformation into the attacker's estimation of the system dynamics. The algorithm is accompanied by theoretical guarantees and extensive numerical experiments on a consensus multi-agent system and on a benchmark power grid model.

翻訳日:2021-02-04 09:56:00 公開日:2021-02-01

# 確率的オンライン凸最適化 : 確率時系列予測への応用

Stochastic Online Convex Optimization; Application to probabilistic time series forecasting ( http://arxiv.org/abs/2102.00729v1 )

ライセンス: Link先を確認

Olivier Wintenberger (LPSM UMR 8001)

(参考訳) オンラインアルゴリズムの確率的後悔境界は、通常「オンラインからバッチ」変換に由来する。この推論を逆にして,確率的凸最適化問題に適用可能な「バッチからオンラインへの変換」により,確率的exp-concavity条件下で解析を開始する。非凸損失関数の確率の高い高速確率的後悔境界を得る。このアプローチに基づき、非定常非有界時系列の予測と確率予測方法を提供します。

Stochastic regret bounds for online algorithms are usually derived from an "online to batch" conversion. Inverting the reasoning, we start our analysis with a "batch to online" conversion that applies in any Stochastic Online Convex Optimization problem under a stochastic exp-concavity condition. We obtain fast rate stochastic regret bounds with high probability for non-convex loss functions. Based on this approach, we provide prediction and probabilistic forecasting methods for non-stationary unbounded time series.

翻訳日:2021-02-04 09:55:23 公開日:2021-02-01

# 粒子加速器における時系列の分類と予測の新しい手法

A Novel Approach for Classification and Forecasting of Time Series in Particle Accelerators ( http://arxiv.org/abs/2102.00786v1 )

ライセンス: Link先を確認

Sichen Li, Mélissa Zacharias, Jochem Snuverink, Jaime Coello de Portugal, Fernando Perez-Cruz, Davide Reggiani and Andreas Adelmann

(参考訳) 粒子加速器のビーム遮断(インターロック)は、必要な安全対策であるにもかかわらず、突然の運用変更とビーム時間の相当な損失をもたらす。インターロック現象を予測することで高出力陽子加速器複合体のビーム時間損失を低減するために,新しい時系列分類手法を適用した。予測は、多変量時系列のウィンドウの二値分類によって行う。時系列はリカレンスプロットに変換され、畳み込みニューラルネットワークによって分類される。これにより、時系列の内部構造を捉えるだけでなく、画像分類技術の進歩も利用できる。最も性能の高いインターロック対安定分類器は、ランダムフォレストモデルの$0.65 \pm 0.01$に対してROC曲線下面積$0.71 \pm 0.01$に達し、インターロック毎のビーム時間損失を$0.5 \pm 0.2$秒削減できる可能性がある。

The beam interruptions (interlocks) of particle accelerators, despite being necessary safety measures, lead to abrupt operational changes and a substantial loss of beam time. A novel time series classification approach is applied to decrease beam time loss in the High Intensity Proton Accelerator complex by forecasting interlock events. The forecasting is performed through binary classification of windows of multivariate time series. The time series are transformed into Recurrence Plots which are then classified by a Convolutional Neural Network, which not only captures the inner structure of the time series but also utilizes the advances of image classification techniques. Our best performing interlock-to-stable classifier reaches an Area under the ROC Curve value of $0.71 \pm 0.01$ compared to $0.65 \pm 0.01$ of a Random Forest model, and it can potentially reduce the beam time loss by $0.5 \pm 0.2$ seconds per interlock.

翻訳日:2021-02-04 09:54:55 公開日:2021-02-01
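
The transformation of a multivariate time-series window into a recurrence plot, mentioned in the abstract above, can be sketched as follows. This is a minimal thresholded version in plain Python; the Euclidean distance, the threshold `epsilon`, and the toy two-channel signal are assumptions made for illustration and are not taken from the paper.

```python
import math


def recurrence_plot(window, epsilon):
    """Binary recurrence matrix of a multivariate time-series window.

    window is a sequence of equally sized tuples (one tuple per time step).
    R[i][j] is 1 when the states at times i and j are closer than epsilon
    in Euclidean distance, otherwise 0.
    """
    n = len(window)
    return [[1 if math.dist(window[i], window[j]) <= epsilon else 0
             for j in range(n)]
            for i in range(n)]


# Hypothetical two-channel signal with period 4 (points on the unit circle).
window = [(0.0, 1.0), (1.0, 0.0), (0.0, -1.0), (-1.0, 0.0)] * 2
plot = recurrence_plot(window, epsilon=0.1)

for row in plot:
    print("".join("#" if cell else "." for cell in row))
```

The resulting matrix can be treated as an image, which is what makes it a natural input for a convolutional classifier; periodic signals show up as diagonal lines parallel to the main diagonal.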

# 低リソース音声認識のためのコントラスト表現のスケーリングについて

On Scaling Contrastive Representations for Low-Resource Speech Recognition ( http://arxiv.org/abs/2102.00850v1 )

ライセンス: Link先を確認

Lasse Borgholt, Tycho Max Sylvester Tax, Jakob Drachmann Havtorn, Lars Maaløe, Christian Igel

(参考訳) コントラスト学習による自己教師型学習の最近の進歩は,わずか10分のラベル付きデータで,競争力のある音声認識システムを学習できることを示している。しかし、これらのシステムは事前学習に加えて大きなパラメータ空間での微調整を必要とするため、計算コストが高い。計算要求の高いwav2vec 2.0フレームワークの固定表現上で最先端の音声認識器を訓練することにより、微調整のないシステムの性能を検討する。パフォーマンスは微調整なしでは低下し、極端な低リソース設定では、wav2vec 2.0は前バージョンより劣ることがわかった。また、wav2vec 2.0表現は低次元部分空間に存在し、表現の特徴を無相関化することで自動音声認識器の訓練を安定化できることがわかった。最後に、パフォーマンスを一貫して改善する、オリジナルのwav2vecフレームワークの双方向拡張を提案する。

Recent advances in self-supervised learning through contrastive training have shown that it is possible to learn a competitive speech recognition system with as little as 10 minutes of labeled data. However, these systems are computationally expensive since they require pre-training followed by fine-tuning in a large parameter space. We explore the performance of such systems without fine-tuning by training a state-of-the-art speech recognizer on the fixed representations from the computationally demanding wav2vec 2.0 framework. We find performance to decrease without fine-tuning and, in the extreme low-resource setting, wav2vec 2.0 is inferior to its predecessor. In addition, we find that wav2vec 2.0 representations live in a low dimensional subspace and that decorrelating the features of the representations can stabilize training of the automatic speech recognizer. Finally, we propose a bidirectional extension to the original wav2vec framework that consistently improves performance.

翻訳日:2021-02-04 09:54:17 公開日:2021-02-01

# 実例からの線形時間公式の学習の複雑さ

The Complexity of Learning Linear Temporal Formulas from Examples ( http://arxiv.org/abs/2102.00876v1 )

ライセンス: Link先を確認

Nathanaël Fijalkow and Guillaume Lagarde

(参考訳) 本稿では、例から線形時相論理(LTL)式を学習する問題の計算複雑性の研究を開始する。我々はLTLのフラグメントに対する近似アルゴリズムを構築し、困難性の結果を証明する。特に、next演算子と連言のみを含むフラグメントの近似について厳密な境界を求め、多くのフラグメントに対するNP完全性の結果を証明する。

In this paper we initiate the study of the computational complexity of learning linear temporal logic (LTL) formulas from examples. We construct approximation algorithms for fragments of LTL and prove hardness results; in particular we obtain tight bounds for approximation of the fragment containing only the next operator and conjunctions, and prove NP-completeness results for many fragments.

翻訳日:2021-02-04 09:53:42 公開日:2021-02-01
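
To illustrate the fragment with only the next operator and conjunctions, the sketch below evaluates such formulas on finite traces and checks whether a candidate formula separates positive from negative examples. The tuple encoding of formulas, the finite-trace semantics with a strong next operator, and the example traces are assumptions made for illustration; the abstract does not spell out the paper's exact definitions.

```python
def holds(formula, trace, position=0):
    """Evaluate a formula built from atoms, 'and', and 'next' on a finite trace.

    A trace is a list of sets of atomic propositions, one set per time step.
    Assumed semantics: 'next' is false when there is no next position.
    """
    kind = formula[0]
    if kind == "atom":
        return position < len(trace) and formula[1] in trace[position]
    if kind == "and":
        return (holds(formula[1], trace, position)
                and holds(formula[2], trace, position))
    if kind == "next":
        return (position + 1 < len(trace)
                and holds(formula[1], trace, position + 1))
    raise ValueError(f"unknown operator: {kind}")


def separates(formula, positives, negatives):
    """True if the formula holds on every positive and on no negative trace."""
    return (all(holds(formula, t) for t in positives)
            and not any(holds(formula, t) for t in negatives))


# Candidate formula: p AND X q (hypothetical example).
candidate = ("and", ("atom", "p"), ("next", ("atom", "q")))
positives = [[{"p"}, {"q"}], [{"p", "q"}, {"q"}, set()]]
negatives = [[{"p"}, {"p"}], [{"q"}, {"q"}]]

print(separates(candidate, positives, negatives))
print(separates(("atom", "p"), positives, negatives))
```

Checking a given formula is cheap; the hardness discussed in the abstract concerns finding a (small) separating formula in the first place.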

# 確率的テイラー展開とフィルタリングおよび微分方程式への応用

A Probabilistic Taylor Expansion with Applications in Filtering and Differential Equations ( http://arxiv.org/abs/2102.00877v1 )

ライセンス: Link先を確認

Toni Karvonen, Jon Cockayne, Filip Tronarp, Simo Särkkä

(参考訳) 我々は、特定のデータ選択に対して、事後平均が任意の次数の打ち切りテイラー展開を再現するガウス過程のクラスを研究する。データは展開点における微分の評価値から成り、事前共分散カーネルは、特定のべき級数形式で記述できるテイラーカーネルのクラスに属する。これにより、1次および2次のテイラー展開を利用する様々なアルゴリズムにおける不確かさを統計的にモデル化することができる。このガウス過程モデルの有用性を実証するために、非線形状態推定のための古典的な拡張カルマンフィルタと、常微分方程式を解くオイラー法の新しい確率的バージョンを導入する。

We study a class of Gaussian processes for which the posterior mean, for a particular choice of data, replicates a truncated Taylor expansion of any order. The data consists of derivative evaluations at the expansion point and the prior covariance kernel belongs to the class of Taylor kernels, which can be written in a certain power series form. This permits statistical modelling of the uncertainty in a variety of algorithms that exploit first and second order Taylor expansions. To demonstrate the utility of this Gaussian process model we introduce new probabilistic versions of the classical extended Kalman filter for non-linear state estimation and the Euler method for solving ordinary differential equations.

翻訳日:2021-02-04 09:53:14 公開日:2021-02-01
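
For orientation, here is a minimal sketch of the classical (deterministic) Euler method that the abstract refers to, in which each step is a first-order Taylor expansion around the current point. It is not the probabilistic version proposed in the paper; the function name `euler` and the test problem are assumptions chosen for illustration.

```python
import math

def euler(f, t0, y0, h, steps):
    """Classical explicit Euler: y_{n+1} = y_n + h * f(t_n, y_n)."""
    t, y = t0, y0
    for _ in range(steps):
        y = y + h * f(t, y)  # first-order Taylor step around t
        t = t + h
    return y

# Test problem (assumed for illustration): y' = y, y(0) = 1, exact solution exp(t).
approx = euler(lambda t, y: y, 0.0, 1.0, 0.001, 1000)
error = abs(approx - math.exp(1.0))
print(approx, error)
```

The truncation error visible in `error` is the kind of quantity a probabilistic formulation would aim to describe with an uncertainty estimate rather than leave implicit.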

# 深部ニューラルネットワーク推論パイプラインのForensicability

Forensicability of Deep Neural Network Inference Pipelines ( http://arxiv.org/abs/2102.00921v1 )

ライセンス: Link先を確認

Alexander Schlögl, Tobias Kupek, Rainer Böhme

(参考訳) 観測可能な出力における特性数値偏差をトレースすることにより,機械学習パイプラインの実行環境の特性を推定する手法を提案する。ローカルおよびクラウドホストマシン上で得られた一連の概念実証実験の結果は、ディープニューラルネットワーク予測を生成するために使用されるハードウェアプラットフォームの識別など、法医学的応用の可能性をもたらす。最後に,予測ラベルのみを用いて機械を識別するために,数値偏差を増幅する境界サンプルを導入する。

We propose methods to infer properties of the execution environment of machine learning pipelines by tracing characteristic numerical deviations in observable outputs. Results from a series of proof-of-concept experiments obtained on local and cloud-hosted machines give rise to possible forensic applications, such as the identification of the hardware platform used to produce deep neural network predictions. Finally, we introduce boundary samples that amplify the numerical deviations in order to distinguish machines by their predicted label only.

翻訳日:2021-02-04 09:52:44 公開日:2021-02-01
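
One generic source of small numerical deviations between execution environments is that floating-point addition is not associative, so the same sum evaluated in a different order can differ in its last bits. The sketch below shows this effect in isolation; whether it is the mechanism studied in the paper is not stated in the abstract, so treat it as an illustrative assumption.

```python
import math

values = [0.1, 0.2, 0.3]

# The same three numbers, summed in two different orders.
left_to_right = (values[0] + values[1]) + values[2]
right_to_left = values[0] + (values[1] + values[2])

# The results differ in the last bits, yet are equal within tolerance.
print(left_to_right == right_to_left)
print(math.isclose(left_to_right, right_to_left))
```

A deviation this small rarely changes a predicted label, which is presumably why inputs that sit close to a decision boundary are useful for amplifying it.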

# 深層音楽情報ダイナミクス

Deep Music Information Dynamics ( http://arxiv.org/abs/2102.01133v1 )

ライセンス: Link先を確認

Shlomo Dubnov

(参考訳) 音楽は、時間の中で組織化された複雑な同時的事象の集合からなる。本稿では、深層音楽情報ダイナミクス(Deep Musical Information Dynamics)と呼ぶ新しい枠組みを導入する。この枠組みは2つの並列ストリームを組み合わせるもので、思考過程のダイナミクスを捉えると仮定した低レートの潜在表現ストリームを、音楽データそのものに由来する高レートの情報ダイナミクスと対比させる。我々は、人間の認知に関するレート歪み理論に動機づけられ、リスナーの心の中に存在する想像上の予期と、音楽表層そのものの情報ダイナミクスとの間にありうる関係を探究する枠組みを提案する。音響的な表層を扱うには、楽器の特性や演奏表現の抑揚を捉えるためにさらに多くの層が必要となるため、このモデルはシンボリック(MIDI)データの場合について実証する。数学的枠組みは変分符号化に基づいており、まず音楽観測の高レート表現を確立し、次にビット割り当て法を用いてそれを並列の低レートデータストリームに縮約する。ここで考慮する複合損失は、各ストリームの時間発展の観点での情報レートと、高レート表現と低レート表現の間の相互情報量で測ったエンコーディングの忠実度の両方を含む。本論文で提示するシミュレーションでは、潜在的・想像上のサプライザルと音楽表層のサプライザルの諸側面を、定量化可能かつ計算上扱いやすい形で対置できる。本論文では一連の計算ツールについて論じ、圧縮と予測のトレードオフが、時間に基づく音楽生成モデルの解析と設計において重要な要素であることを示唆する。

Music comprises a set of complex simultaneous events organized in time. In this paper we introduce a novel framework that we call Deep Musical Information Dynamics, which combines two parallel streams - a low rate latent representation stream that is assumed to capture the dynamics of a thought process contrasted with a higher rate information dynamics derived from the musical data itself. Motivated by rate-distortion theories of human cognition we propose a framework for exploring possible relations between imaginary anticipations existing in the listener's mind and information dynamics of the musical surface itself. This model is demonstrated for the case of symbolic (MIDI) data, as accounting for acoustic surface would require many more layers to capture instrument properties and performance expressive inflections. The mathematical framework is based on variational encoding that first establishes a high rate representation of the musical observations, which is then reduced using a bit-allocation method into a parallel low rate data stream. The combined loss considered here includes both the information rate in terms of time evolution for each stream, and the fidelity of encoding measured in terms of mutual information between the high and low rate representations. In the simulations presented in the paper we are able to juxtapose aspects of latent/imaginary surprisal versus surprisal of the music surface in a manner that is quantifiable and computationally tractable. The set of computational tools is discussed in the paper, suggesting that a trade-off between compression and prediction is an important factor in the analysis and design of time-based music generative models.

翻訳日:2021-02-04 09:52:15 公開日:2021-02-01

# Gene Mover's Distance:Optimal Transportによる単一細胞類似性

The Gene Mover's Distance: Single-cell similarity via Optimal Transport ( http://arxiv.org/abs/2102.01218v1 )

ライセンス: Link先を確認

Riccardo Bellazzi and Andrea Codegoni and Stefano Gualandi and Giovanna Nicora and Eleonora Vercesi

(参考訳) 本稿では、単一細胞RNAシークエンシングにより得られた遺伝子発現プロファイルに基づく、一対の細胞間の類似性の尺度であるGene Mover's Distanceを紹介する。提案する距離の基本的な考え方は、単一細胞の遺伝子発現配列を離散確率測度として解釈することである。したがって、2つの細胞間の距離は、対応する2つの離散測度間の最適輸送問題を解くことで計算される。最適輸送モデルでは、一対の遺伝子間の距離を測るために2種類のコスト関数を用いる。第1のコスト関数は、各遺伝子を高次元ベクトルにマッピングするgene2vecと呼ばれる遺伝子埋め込みを利用する。すなわち、ある遺伝子から別の遺伝子へ遺伝子発現の単位質量を移動させるコストを、対応する埋め込みベクトル間のユークリッド距離に設定する。第2のコスト関数は、遺伝子ペア間のピアソン距離に基づいている。どちらのコスト関数でも、2つの遺伝子の相関が強いほど、その距離は小さくなる。我々はGene Mover's Distanceを利用して、細胞をその状態に応じて分類する問題と、その種類に応じて分類する問題という2つの分類問題を解く。新しい距離の影響を評価するために、異なる距離を用いた$k$-Nearest Neighbor分類器の性能を比較する。計算結果から、Gene Mover's Distanceは文献で使われている最先端の距離と競合することが示された。

This paper introduces the Gene Mover's Distance, a measure of similarity between a pair of cells based on their gene expression profiles obtained via single-cell RNA sequencing. The underlying idea of the proposed distance is to interpret the gene expression array of a single cell as a discrete probability measure. The distance between two cells is hence computed by solving an Optimal Transport problem between the two corresponding discrete measures. In the Optimal Transport model, we use two types of cost function for measuring the distance between a pair of genes. The first cost function exploits a gene embedding, called gene2vec, which is used to map each gene to a high dimensional vector: the cost of moving a unit of mass of gene expression from a gene to another is set to the Euclidean distance between the corresponding embedded vectors. The second cost function is based on a Pearson distance among pairs of genes. In both cost functions, the more two genes are correlated, the lower is their distance. We exploit the Gene Mover's Distance to solve two classification problems: the classification of cells according to their condition and according to their type. To assess the impact of our new metric, we compare the performances of a $k$-Nearest Neighbor classifier using different distances. The computational results show that the Gene Mover's Distance is competitive with the state-of-the-art distances used in the literature.

翻訳日:2021-02-04 09:51:31 公開日:2021-02-01
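
The two ingredients named in the abstract can be sketched in a few lines: turning an expression array into a discrete probability measure, and building a gene-to-gene cost from correlation. The sketch assumes that "Pearson distance" means one minus the Pearson correlation, which the abstract does not spell out; the function names and the toy numbers are hypothetical. Solving the transport problem itself is left to a linear-programming or optimal-transport solver and is not shown.

```python
import math

def to_measure(expression):
    """Normalize non-negative expression values so that they sum to 1."""
    total = sum(expression)
    return [x / total for x in expression]

def pearson_distance(u, v):
    """Assumed definition: 1 - Pearson correlation of two gene profiles."""
    n = len(u)
    mean_u, mean_v = sum(u) / n, sum(v) / n
    cov = sum((a - mean_u) * (b - mean_v) for a, b in zip(u, v))
    norm = math.sqrt(sum((a - mean_u) ** 2 for a in u)
                     * sum((b - mean_v) ** 2 for b in v))
    return 1.0 - cov / norm

# Hypothetical expression of one cell over four genes.
cell = to_measure([3, 1, 0, 4])

# Hypothetical profiles of three genes measured across four cells.
gene_a = [1, 2, 3, 4]
gene_b = [2, 4, 6, 8]   # perfectly correlated with gene_a
gene_c = [4, 3, 2, 1]   # perfectly anti-correlated with gene_a

print(cell)
print(pearson_distance(gene_a, gene_b), pearson_distance(gene_a, gene_c))
```

Under this definition, correlated genes are cheap to move mass between and anti-correlated genes are expensive, which matches the abstract's statement that more strongly correlated genes have a lower distance.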

# (参考訳) 量子フェア機械学習

Quantum Fair Machine Learning ( http://arxiv.org/abs/2102.00753v1 )

ライセンス: CC BY 4.0

Elija Perrier

(参考訳) 本稿では,量子フェア機械学習の分野を創始する。古典的および量子的なフェア機械学習アルゴリズムの相違点と類似点を比較分析し、量子アルゴリズムが公平性制約の対象となる場合に、量子計算に固有の特徴が尺度、メトリクス、是正戦略をどのように変えるかを明らかにする。グローバーの探索アルゴリズムを用いて量子アルゴリズムに課された統計的パリティ制約を満たせることを示し、量子フェア機械学習における最初の結果を提示する。また、$\epsilon$の許容誤差内でそのような統計的パリティを達成するために必要な反復回数の下界を与える。リプシッツ条件に基づく標準的な個人公平性基準を、量子メトリクスを用いて量子設定に拡張する。量子情報処理と量子データが関与する場合に、機械学習における典型的な公平性尺度がどのような影響を受けるかを検討する。最後に、計算機科学、倫理学、量子計算の研究者の関心を引くこの新しい分野について、未解決問題と研究プログラムを提案する。

In this paper, we inaugurate the field of quantum fair machine learning. We undertake a comparative analysis of differences and similarities between classical and quantum fair machine learning algorithms, specifying how the unique features of quantum computation alter measures, metrics and remediation strategies when quantum algorithms are subject to fairness constraints. We present the first results in quantum fair machine learning by demonstrating the use of Grover's search algorithm to satisfy statistical parity constraints imposed on quantum algorithms. We provide lower-bounds on iterations needed to achieve such statistical parity within $\epsilon$-tolerance. We extend canonical Lipschitz-conditioned individual fairness criteria to the quantum setting using quantum metrics. We examine the consequences for typical measures of fairness in machine learning context when quantum information processing and quantum data are involved. Finally, we propose open questions and research programmes for this new field of interest to researchers in computer science, ethics and quantum computation.

翻訳日:2021-02-04 09:48:02 公開日:2021-02-01

# (参考訳) 第4回複雑系のスマートシミュレーションとモデリングに関する国際ワークショップ

The 4th International Workshop on Smart Simulation and Modelling for Complex Systems ( http://arxiv.org/abs/2102.01190v1 )

ライセンス: CC BY 4.0

Xing Su, Yan Kong, Weihua Li

(参考訳) コンピュータベースのモデリングとシミュレーションは、物理学、天体物理学、化学、生物学、経済学、工学、社会科学など、さまざまな分野のシステムを人間が理解するのを助ける有用なツールとなっている。複雑系は、多数の相互作用するコンポーネント(エージェント、プロセスなど)を特徴とし、その集合的な活動は非線形かつ自己組織的である。複雑系は、システムコンポーネント間の複雑な関係、リソースの分散性、および環境のダイナミクスのために、従来の計算アプローチではシミュレーションやモデル化が難しい。一方、マルチエージェントシステムなどのスマートシステムは、複雑系のモデリングとシミュレーションにおいて利点と大きな可能性を示している。

Computer-based modelling and simulation have become useful tools to facilitate humans to understand systems in different domains, such as physics, astrophysics, chemistry, biology, economics, engineering and social science. A complex system is featured with a large number of interacting components (agents, processes, etc.), whose aggregate activities are nonlinear and self-organized. Complex systems are hard to be simulated or modelled by using traditional computational approaches due to complex relationships among system components, distributed features of resources, and dynamics of environments. Meanwhile, smart systems such as multi-agent systems have demonstrated advantages and great potentials in modelling and simulating complex systems.

翻訳日:2021-02-04 05:00:14 公開日:2021-02-01

# (参考訳) 確率的サブモジュラカバーのタイトバウンド

A Tight Bound for Stochastic Submodular Cover ( http://arxiv.org/abs/2102.01149v1 )

ライセンス: CC BY 4.0

Lisa Hellerstein, Devorah Kletenik and Srinivasan Parthasarathy

(参考訳) Golovin and Krause (2011) の適応的貪欲(Adaptive Greedy)アルゴリズムが、確率的劣モジュラ被覆(Stochastic Submodular Cover)に対して $(\ln (Q/\eta)+1)$ の近似境界を達成することを示す。ここで $Q$ は「目標値」、$\eta$ は1つのアイテムがもたらしうる効用の非ゼロの限界増加量の最小値である。(整数値の効用関数の場合は $H(Q)$ の境界を示す。ここで $H(Q)$ は $Q$ 番目の調和数である。) この境界は Golovin と Krause によって論文の原版で主張されたが、その証明は後に Nan and Saligrama (2017) によって誤りであることが示された。その後の Golovin and Krause (2017) による修正された証明は、$(\ln(Q/\eta) + 1)^2$ という2乗の境界を与える。この問題に対するこれまでの他の境界としては、関連する問題に関する Im et al. (2016) の研究から導かれる $56(\ln(Q/\eta) + 1)$ と、Deshpande et al. (2016) および Hellerstein and Kletenik (2018) による $k(\ln (Q/\eta)+1)$ がある。ここで $k$ は状態数である。我々の境界は、古典的な集合被覆(Set Cover)問題に対する貪欲アルゴリズムのよく知られた $(\ln~m + 1)$ 近似境界を一般化するものであり、ここで $m$ は台集合の大きさである。

We show that the Adaptive Greedy algorithm of Golovin and Krause (2011) achieves an approximation bound of $(\ln (Q/\eta)+1)$ for Stochastic Submodular Cover: here $Q$ is the "goal value" and $\eta$ is the smallest non-zero marginal increase in utility deliverable by an item. (For integer-valued utility functions, we show a bound of $H(Q)$, where $H(Q)$ is the $Q^{th}$ Harmonic number.) Although this bound was claimed by Golovin and Krause in the original version of their paper, the proof was later shown to be incorrect by Nan and Saligrama (2017). The subsequent corrected proof of Golovin and Krause (2017) gives a quadratic bound of $(\ln(Q/\eta) + 1)^2$. Other previous bounds for the problem are $56(\ln(Q/\eta) + 1)$, implied by work of Im et al. (2016) on a related problem, and $k(\ln (Q/\eta)+1)$, due to Deshpande et al. (2016) and Hellerstein and Kletenik (2018), where $k$ is the number of states. Our bound generalizes the well-known $(\ln~m + 1)$ approximation bound on the greedy algorithm for the classical Set Cover problem, where $m$ is the size of the ground set.

翻訳日:2021-02-04 04:13:11 公開日:2021-02-01
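
To get a feel for how the bounds quoted in the abstract compare, the sketch below simply evaluates each expression for one parameter setting. The values of `Q`, `eta`, and `k` are hypothetical, and the dictionary labels are shorthand introduced here; only the formulas themselves come from the abstract.

```python
import math

def harmonic(q):
    """H(q) = 1 + 1/2 + ... + 1/q, the q-th harmonic number."""
    return sum(1.0 / i for i in range(1, q + 1))

def bounds(Q, eta, k):
    """Evaluate the approximation bounds listed in the abstract."""
    base = math.log(Q / eta) + 1
    return {
        "tight bound (this paper)": base,
        "quadratic bound (2017 proof)": base ** 2,
        "bound implied by Im et al.": 56 * base,
        "k-state bound": k * base,
    }

# Hypothetical parameters: goal value 100, smallest gain 1, 3 states.
Q, eta, k = 100, 1, 3
for name, value in bounds(Q, eta, k).items():
    print(f"{name}: {value:.3f}")
print(f"H(Q) for integer utilities: {harmonic(Q):.3f}")
```

Because $H(Q) \le \ln Q + 1$ for every positive integer $Q$, the integer-valued case is never worse than the general bound with $\eta = 1$.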

# (参考訳) 操作変数法とベイズ非パラメトリック機械学習による因果推論

Causal Inference with the Instrumental Variable Approach and Bayesian Nonparametric Machine Learning ( http://arxiv.org/abs/2102.01199v1 )

ライセンス: CC BY-SA 4.0

Robert E. McCulloch, Rodney A. Sparapani, Brent R. Logan and Purushottam W. Laud

(参考訳) 操作変数モデルによる推論のための新しい柔軟な枠組みを提供する。線形仕様を用いる代わりに、操作変数やその他の説明変数の効果を特徴付ける関数を、Bayesian Additive Regression Trees (BART) による機械学習を用いて推定する。誤差項とその分布は、ディリクレ過程混合を用いて推論する。シミュレーションと実例から、真の関数が線形である場合には失うものがほとんどないことが示される。一方、非線形性が存在する場合には、手動チューニングをほとんど行わずに劇的な改善が得られる。

We provide a new flexible framework for inference with the instrumental variable model. Rather than using linear specifications, functions characterizing the effects of instruments and other explanatory variables are estimated using machine learning via Bayesian Additive Regression Trees (BART). Error terms and their distribution are inferred using Dirichlet Process mixtures. Simulated and real examples show that when the true functions are linear, little is lost. But when nonlinearities are present, dramatic improvements are obtained with virtually no manual tuning.

翻訳日:2021-02-04 01:11:49 公開日:2021-02-01
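
The "linear specification" that the abstract moves away from can be sketched in its simplest form: with a single instrument `z`, the classical instrumental variable estimate of the effect of `x` on `y` is the ratio cov(z, y) / cov(z, x). This is the textbook linear estimator, not the BART-based method of the paper; the data and function names below are hypothetical, built so that an unobserved confounder `u` biases ordinary least squares while the instrument recovers the true slope of 2.

```python
def cov(a, b):
    """Population covariance of two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    return sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b)) / n

def iv_estimate(z, x, y):
    """Single-instrument linear IV estimate: cov(z, y) / cov(z, x)."""
    return cov(z, y) / cov(z, x)

def ols_estimate(x, y):
    """Ordinary least squares slope, biased when x is confounded."""
    return cov(x, y) / cov(x, x)

# Hypothetical data: z is the instrument, u an unobserved confounder
# that is uncorrelated with z by construction.
z = [0, 1, 0, 1, 0, 1, 0, 1]
u = [1, 1, -1, -1, 1, 1, -1, -1]
x = [zi + ui for zi, ui in zip(z, u)]          # treatment depends on z and u
y = [2 * xi + 3 * ui for xi, ui in zip(x, u)]  # true effect of x on y is 2

print(iv_estimate(z, x, y), ols_estimate(x, y))
```

In the paper's setting the linear relationships between `z`, `x`, and `y` are replaced by flexible functions estimated from data, but the role of the instrument is the same.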

# (参考訳) 説明可能な人工知能を用いた急性中毒の診断

Diagnosis of Acute Poisoning Using Explainable Artificial Intelligence ( http://arxiv.org/abs/2102.01116v1 )

ライセンス: CC BY 4.0

Michael Chary, Ed W Boyer, Michele M Burns

(参考訳) 臨床中毒学(medical toxicology)は、過剰摂取、投薬ミス、サソリ刺傷など、物質の毒性作用を治療する臨床専門分野である。中毒学の知識と研究の量は、他の医学専門分野と同様に、個々の臨床医がそれを完全に習得し最新の状態を保つ能力を上回っている。初期治療の決定は少数のテキストデータに基づくことが多く、事前知識に大きく依存するため、臨床中毒学への機械学習(ML)技術の適用は困難である。ML技術は、医師にとって透明な形で知識を表現しないことが多く、使いやすさの障壁となっている。ルールベースのシステムと決定木学習はより透明なアプローチであるが、一般化が不十分なことが多く、実装と維持には専門家によるキュレーションが必要である。そこで我々は、臨床中毒学者の知識ベースの一部を表現する確率論理ネットワークを構築した。本手法は、臨床医の知識表現と臨床的意思決定を透明な形で模倣する。Takと名付けたこのソフトウェアは、単純な症例と中程度の難易度の症例では人間と同等の性能を示すが、難しい臨床症例では人間に劣る。Takは、すべての難易度において決定木分類器を上回る。確率論理は説明可能な人工知能の一形態であり、許容できる水準の性能を達成できれば、医療での利用がより受け入れられやすい可能性がある。

Medical toxicology is the clinical specialty that treats the toxic effects of substances, be it an overdose, a medication error, or a scorpion sting. The volume of toxicological knowledge and research has, as with other medical specialties, outstripped the ability of the individual clinician to entirely master and stay current with it. The application of machine learning techniques to medical toxicology is challenging because initial treatment decisions are often based on a few pieces of textual data and rely heavily on prior knowledge. ML techniques often do not represent knowledge in a way that is transparent for the physician, raising barriers to usability. Rule-based systems and decision tree learning are more transparent approaches, but often generalize poorly and require expert curation to implement and maintain. Here, we construct a probabilistic logic network to represent a portion of the knowledge base of a medical toxicologist. Our approach transparently mimics the knowledge representation and clinical decision-making of practicing clinicians. The software, dubbed Tak, performs comparably to humans on straightforward cases and intermediate difficulty cases, but is outperformed by humans on challenging clinical cases. Tak outperforms a decision tree classifier at all levels of difficulty. Probabilistic logic provides one form of explainable artificial intelligence that may be more acceptable for use in healthcare, if it can achieve acceptable levels of performance.

翻訳日:2021-02-04 00:12:07 公開日:2021-02-01

# (参考訳) ビデオ記憶性予測のためのマルチモーダルアンサンブルモデル

Multi-modal Ensemble Models for Predicting Video Memorability ( http://arxiv.org/abs/2102.01173v1 )

ライセンス: CC BY 4.0

Tony Zhao, Irving Fang, Jeffrey Kim, Gerald Friedland

(参考訳) メディアの記憶可能性のモデリングは、機械学習の分野で一貫した課題である。 MediaEval2020のPredicting Media Memorabilityタスクは、このトピックに対処する同様の課題の中で最新のベンチマークです。課題の以前のイテレーションで開発された技術に基づいて,抽出した映像,画像,テキスト,音声特徴を用いてアンサンブル手法を開発した。本研究は,メディアの記憶可能性を予測するための特徴として,抽出音声埋め込みの有効性と高一般化性を紹介する。

Modeling media memorability has been a consistent challenge in the field of machine learning. The Predicting Media Memorability task in MediaEval2020 is the latest benchmark among similar challenges addressing this topic. Building upon techniques developed in previous iterations of the challenge, we developed ensemble methods with the use of extracted video, image, text, and audio features. Critically, in this work we introduce and demonstrate the efficacy and high generalizability of extracted audio embeddings as a feature for the task of predicting media memorability.

翻訳日:2021-02-04 00:00:27 公開日:2021-02-01

# (参考訳) toon2real:漫画画像をリアルな画像に翻訳する

toon2real: Translating Cartoon Images to Realistic Images ( http://arxiv.org/abs/2102.01143v1 )

ライセンス: CC BY 4.0

K. M. Arefeen Sultan, Mohammad Imrul Jubair, MD. Nahidul Islam, Sayed Hossain Khan

(参考訳) 画像から画像への変換に関しては、GAN(Generative Adversarial Networks)は教師なしデータセットでも大きな成功を収めている。本研究では,GANを用いた漫画画像から写真実写画像への翻訳を目的とする。このタスクを実行するためにいくつかの最先端モデルを適用するが、高品質な翻訳には失敗する。これら2つのドメイン間の浅い差がこの問題を引き起こすのを観察する。そこで本研究では,漫画領域からフォトリアリスティック領域への画像翻訳のためのCycleGANモデルに基づく手法を提案する。モデルを効率よくするために、我々のモデルに安定性を加えたスペクトル正規化を実装した。実験の結果を実証し,提案手法が他の最先端技術であるUNITと比較して最も低いFrechet Inception Distanceスコアと優れた結果を得たことを示す。

In image-to-image translation, Generative Adversarial Networks (GANs) have achieved great success even when used on unsupervised datasets. In this work, we aim to translate cartoon images to photo-realistic images using a GAN. We apply several state-of-the-art models to perform this task; however, they fail to produce good-quality translations. We observe that the shallow difference between these two domains causes this issue. Based on this idea, we propose a method based on the CycleGAN model for image translation from the cartoon domain to the photo-realistic domain. To make our model efficient, we implemented Spectral Normalization, which added stability to our model. We demonstrate our experimental results and show that our proposed model achieves the lowest Frechet Inception Distance score and better results compared to another state-of-the-art technique, UNIT.

翻訳日:2021-02-03 21:22:16 公開日:2021-02-01

# (参考訳) Image Domain DEEP-SLRによる並列MRデータの再構成とセグメント化

Reconstruction and Segmentation of Parallel MR Data using Image Domain DEEP-SLR ( http://arxiv.org/abs/2102.01172v1 )

ライセンス: CC BY 4.0

Aniket Pramanik, Mathews Jacob

(参考訳) 本研究の主眼は、パラレルMRI(PMRI)脳データの再構成とセグメンテーションを同時に行う新しい枠組みである。アンダーサンプリングされたPMRIデータをキャリブレーションなしで復元するための画像領域の深層ネットワークを導入する。提案するアプローチは、CLEAR [6] を含む、非較正PMRI復元のための局所低ランクに基づくアプローチの、深層学習(DL)に基づく一般化である。画像領域のアプローチは、k空間に基づくアプローチと比べて追加の消滅関係(annihilation relations)を利用するため、性能の向上が期待できる。アンダーサンプリングによるアーティファクトに起因するセグメンテーション誤差を最小限に抑えるため、提案手法をセグメンテーションネットワークと組み合わせ、エンドツーエンドで訓練した。この手法は、セグメンテーション誤差を低減するだけでなく、過学習を抑えることで再構成性能も向上させる。再構成された画像は、独立に訓練された再構成ネットワークと比べて、ぼやけが少なくエッジがより鮮明である。

The main focus of this work is a novel framework for the joint reconstruction and segmentation of parallel MRI (PMRI) brain data. We introduce an image domain deep network for calibrationless recovery of undersampled PMRI data. The proposed approach is the deep-learning (DL) based generalization of local low-rank based approaches for uncalibrated PMRI recovery including CLEAR [6]. Since the image domain approach exploits additional annihilation relations compared to k-space based approaches, we expect it to offer improved performance. To minimize segmentation errors resulting from undersampling artifacts, we combined the proposed scheme with a segmentation network and trained it in an end-to-end fashion. In addition to reducing segmentation errors, this approach also offers improved reconstruction performance by reducing overfitting; the reconstructed images exhibit reduced blurring and sharper edges than independently trained reconstruction network.

翻訳日:2021-02-03 17:52:54 公開日:2021-02-01

# 生音声からの生成的音声言語モデリング

Generative Spoken Language Modeling from Raw Audio ( http://arxiv.org/abs/2102.01192v1 )

ライセンス: Link先を確認

Kushal Lakhotia, Evgeny Kharitonov, Wei-Ning Hsu, Yossi Adi, Adam Polyak, Benjamin Bolte, Tu-Anh Nguyen, Jade Copet, Alexei Baevski, Abdelrahman Mohamed, Emmanuel Dupoux

(参考訳) 生成的音声言語モデリングとは、(テキストやラベルを使わず)生の音声のみから、言語の音響的特性と言語的特性を同時に学習することである。我々は、関連する2つのエンドツーエンドタスク、すなわち音声再合成(システム自身の声で入力音声を繰り返す)と音声生成(音声プロンプトを条件として、または無条件で新規の音声出力を生成する)において、生成された出力を音響的品質と言語的品質の観点でそれぞれ自動評価する指標を導入し、これらの指標を人間の判断によって検証する。離散音声エンコーダ(離散的で低ビットレートの擬似テキスト単位を返す)、生成言語モデル(擬似テキスト単位で学習する)、音声デコーダ(擬似テキストから波形を生成する)からなるベースラインシステムをテストする。3つの最先端の教師なし音声エンコーダ(Contrastive Predictive Coding (CPC)、wav2vec 2.0、HuBERT)を比較し、離散単位の数(50、100、200)を変化させることで、生成性能が、教師なし指標(ゼロショットのプローブタスク)で測定した学習済み単位の品質にどのように依存するかを調べる。評価スタックとベースラインモデルはオープンソース化する予定である。

Generative spoken language modeling involves learning jointly the acoustic and linguistic characteristics of a language from raw audio only (without text or labels). We introduce metrics to automatically evaluate the generated output in terms of acoustic and linguistic quality in two associated end-to-end tasks, respectively: speech resynthesis (repeating the speech input using the system's own voice), and speech generation (producing novel speech outputs conditional on a spoken prompt, or unconditionally), and validate these metrics with human judgment. We test baseline systems consisting of a discrete speech encoder (returning discrete, low bitrate, pseudo-text units), a generative language model (trained on pseudo-text units), and a speech decoder (generating a waveform from pseudo-text). By comparing three state-of-the-art unsupervised speech encoders (Contrastive Predictive Coding (CPC), wav2vec 2.0, HuBERT), and varying the number of discrete units (50, 100, 200), we investigate how the generative performance depends on the quality of the learned units as measured by unsupervised metrics (zero-shot probe tasks). We will open source our evaluation stack and baseline models.

翻訳日:2021-02-03 16:56:07 公開日:2021-02-01

# 大規模多科目質問応答データを用いた機械読解のセルフティーチング

Self-Teaching Machines to Read and Comprehend with Large-Scale Multi-Subject Question Answering Data ( http://arxiv.org/abs/2102.01226v1 )

ライセンス: Link先を確認

Dian Yu, Kai Sun, Dong Yu, Claire Cardie

(参考訳) この領域では近年多くの研究が行われているにもかかわらず、科目分野の質問応答データが機械読解(MRC)タスクに有用かどうかはまだ明らかでない。本稿では、この問題を検討する。大規模な多科目・多肢選択式の質問応答データセットであるExamQAを収集し、Web検索エンジンが返す不完全でノイズの多いスニペットを各質問応答インスタンスの関連コンテキストとして用いることで、弱ラベル付きのMRCインスタンスに変換する。次に、生成した弱ラベル付きMRCインスタンスをより有効に活用してターゲットのMRCタスクを改善するための、セルフティーチングのパラダイムを提案する。実験結果から、多肢選択式MRCデータセットであるC^3において精度が5.1%向上することが示され、本フレームワークの有効性と、機械読解における大規模な科目分野の質問応答データの有用性が実証された。

In spite of much recent research in the area, it is still unclear whether subject-area question-answering data is useful for machine reading comprehension (MRC) tasks. In this paper, we investigate this question. We collect a large-scale multi-subject multiple-choice question-answering dataset, ExamQA, and use incomplete and noisy snippets returned by a web search engine as the relevant context for each question-answering instance to convert it into a weakly-labeled MRC instance. We then propose a self-teaching paradigm to better use the generated weakly-labeled MRC instances to improve a target MRC task. Experimental results show that we can obtain an improvement of 5.1% in accuracy on a multiple-choice MRC dataset, C^3, demonstrating the effectiveness of our framework and the usefulness of large-scale subject-area question-answering data for machine reading comprehension.

翻訳日:2021-02-03 16:55:20 公開日:2021-02-01

# RectiNet-v2: ドキュメントイメージのデワーピングのためのスタックネットワークアーキテクチャ

RectiNet-v2: A stacked network architecture for document image dewarping ( http://arxiv.org/abs/2102.01120v1 )

ライセンス: Link先を確認

Hmrishav Bandyopadhyay, Tanmoy Dasgupta, Nibaran Das, Mita Nasipuri

(参考訳) モバイルとハンドヘルドカメラの登場により、ドキュメントイメージはほぼすべての領域に浸透しています。これらの画像のデワーピングは、文書認識アルゴリズムによって理解できるように、視点の歪みや折り畳みを取り除くために不可欠です。そこで本研究では,入力として使用する歪文書から歪みのない文書画像を生成可能な,エンドツーエンドCNNアーキテクチャを提案する。自然データの不足を補うために合成シミュレーションされた歪んだ文書画像上でこのモデルを訓練する。本手法は, 共有重み付きバイフラクテッドデコーダを用いたグリッド座標の混入防止, U-Net スキップ接続における残存ネットワークによるモデル内の異なる受容フィールドからのデータフロー, およびゲートネットワークを用いた文書画像の構造と線レベルの詳細のモデルフォーカス支援において斬新な手法である。本手法は,この領域のベンチマークであるDocUNetデータセット上で評価し,最新の手法に匹敵する結果を得る。

With the advent of mobile and hand-held cameras, document images have found their way into almost every domain. Dewarping of these images for the removal of perspective distortions and folds is essential so that they can be understood by document recognition algorithms. For this, we propose an end-to-end CNN architecture that can produce distortion free document images from warped documents it takes as input. We train this model on warped document images simulated synthetically to compensate for lack of enough natural data. Our method is novel in the use of a bifurcated decoder with shared weights to prevent intermingling of grid coordinates, in the use of residual networks in the U-Net skip connections to allow flow of data from different receptive fields in the model, and in the use of a gated network to help the model focus on structure and line level detail of the document image. We evaluate our method on the DocUNet dataset, a benchmark in this domain, and obtain results comparable to state-of-the-art methods.

翻訳日:2021-02-03 16:51:36 公開日:2021-02-01

# 随伴剛体変換ネットワーク:3次元形状の自己監督アライメント

Adjoint Rigid Transform Network: Self-supervised Alignment of 3D Shapes ( http://arxiv.org/abs/2102.01161v1 )

ライセンス: Link先を確認

Keyang Zhou, Bharat Lal Bhatnagar, Bernt Schiele, Gerard Pons-Moll

(参考訳) 3Dデータ(ポイントクラウド、メッシュ)のほとんどの学習方法は、データが正常な向きに慎重に整列されていない場合に、大幅なパフォーマンス低下を被る。異なるソースから収集された現実世界の3Dデータをアライメントすることは簡単ではなく、手動の介入が必要です。本論文では,既存の3Dネットワークと統合して,形状の再構築,非剛体登録,潜在非絡み合いなどのタスクにおける性能を大幅に向上させるニューラルネットワークであるAdjoint Rigid Transform (ART) Networkを提案する。 ARTは、多くのタスクに不可欠な正準方向への入力形状の回転を学習します。 artは入力形状に回転同分散制約を課すことでこれを達成する。注目すべき結果は、自己スーパービジョンだけで、artは剛体オブジェクトと非剛体オブジェクトの両方のユニークな標準指向を見つけることができ、下流のタスクパフォーマンスが著しく向上する。さらなる研究のために、コードと事前トレーニングモデルをリリースします。

Most learning methods for 3D data (point clouds, meshes) suffer significant performance drops when the data is not carefully aligned to a canonical orientation. Aligning real world 3D data collected from different sources is non-trivial and requires manual intervention. In this paper, we propose the Adjoint Rigid Transform (ART) Network, a neural module which can be integrated with existing 3D networks to significantly boost their performance in tasks such as shape reconstruction, non-rigid registration, and latent disentanglement. ART learns to rotate input shapes to a canonical orientation that is crucial for a lot of tasks. ART achieves this by imposing rotation equivariance constraint on input shapes. The remarkable result is that with only self-supervision, ART can discover a unique canonical orientation for both rigid and nonrigid objects, which leads to a notable boost in downstream task performance. We will release our code and pre-trained models for further research.

翻訳日:2021-02-03 16:50:56 公開日:2021-02-01

# 編集を楽しむ: 潜在空間ナビゲーションによる画像編集のための制御可能なgan

Enjoy Your Editing: Controllable GANs for Image Editing via Latent Space Navigation ( http://arxiv.org/abs/2102.01187v1 )

ライセンス: Link先を確認

Peiye Zhuang, Oluwasanmi Koyejo, Alexander G. Schwing

(参考訳) 制御可能なセマンティック画像編集により、ユーザーはクリック数が少なく画像属性全体を変更できます。例えば、夏のシーンは冬に撮影されたように徐々に見えます。このタスクの古典的なアプローチは、GAN(Generative Adversarial Net)を使用して、潜在空間と適切な潜在空間変換を学ぶ。しかし、現在のアプローチはしばしば、絡み合った属性編集、グローバルなイメージアイデンティティの変更、および写真リアリズムの減少に苦しんでいます。これらの懸念に対処するために,複数の属性変換を同時に学習し,属性回帰を変換関数のトレーニングに統合し,画像のアイデンティティとフォトリアリズムの維持を促進するコンテンツ損失と敵対的損失を適用する。質的評価を主とした先行作業とは異なり、制御可能な編集性能を測定するための定量的評価戦略を提案します。本モデルでは,画像の同一性やリアリズムを保ちながら,単一属性と複数属性の編集をよりよく制御することができる。実画像と合成画像の両方に対して実験結果を提供し,本モデルがターゲット画像操作の最先端性能を達成することを強調した。

Controllable semantic image editing enables a user to change entire image attributes with few clicks, e.g., gradually making a summer scene look like it was taken in winter. Classic approaches for this task use a Generative Adversarial Net (GAN) to learn a latent space and suitable latent-space transformations. However, current approaches often suffer from attribute edits that are entangled, global image identity changes, and diminished photo-realism. To address these concerns, we learn multiple attribute transformations simultaneously, we integrate attribute regression into the training of transformation functions, apply a content loss and an adversarial loss that encourage the maintenance of image identity and photo-realism. We propose quantitative evaluation strategies for measuring controllable editing performance, unlike prior work which primarily focuses on qualitative evaluation. Our model permits better control for both single- and multiple-attribute editing, while also preserving image identity and realism during transformation. We provide empirical results for both real and synthetic images, highlighting that our model achieves state-of-the-art performance for targeted image manipulation.

翻訳日:2021-02-03 16:50:19 公開日:2021-02-01

# 単眼直接視覚オドメトリーにおける特徴量に基づく再局在の密結合

Tight-Integration of Feature-Based Relocalization in Monocular Direct Visual Odometry ( http://arxiv.org/abs/2102.01191v1 )

ライセンス: Link先を確認

Mariia Gladkova, Rui Wang, Niclas Zeller, and Daniel Cremers

(参考訳) 本稿では、地図ベースの再局在化(relocalization)をオンラインの直接法視覚オドメトリに統合する枠組みを提案する。直接法で地図ベースの再局在化を実現するために、画像特徴をDirect Sparse Odometry (DSO) に統合し、特徴マッチングによってオンラインの視覚オドメトリ(VO)を事前に構築した地図と関連付ける。再局在化で得たポーズの統合は3つの要素からなる。第1に、それらをポーズの事前情報として扱い、フロントエンド追跡の直接画像アライメントに密に統合する。第2に、バックエンドのバンドル調整にも密に統合する。さらに、相対的なVOポーズとグローバルな再局在化ポーズをポーズグラフ上で組み合わせ、滑らかでグローバルに正確なポーズをキーフレームごとに推定するオンライン融合モジュールを提案する。本手法を2つのマルチウェザーデータセットで評価し、手作りの特徴と学習された特徴をそれぞれ統合する利点を示すとともに、カメラ追跡精度の有望な改善を実証する。

In this paper we propose a framework for integrating map-based relocalization into online direct visual odometry. To achieve map-based relocalization for direct methods, we integrate image features into Direct Sparse Odometry (DSO) and rely on feature matching to associate online visual odometry (VO) with a previously built map. The integration of the relocalization poses is threefold. Firstly, they are treated as pose priors and tightly integrated into the direct image alignment of the front-end tracking. Secondly, they are also tightly integrated into the back-end bundle adjustment. An online fusion module is further proposed to combine relative VO poses and global relocalization poses in a pose graph to estimate keyframe-wise smooth and globally accurate poses. We evaluate our method on two multi-weather datasets showing the benefits of integrating different handcrafted and learned features and demonstrating promising improvements on camera tracking accuracy.

翻訳日:2021-02-03 16:49:40 公開日:2021-02-01

# スロットアテンションを有する文字系列から有意義な単位を誘導する

Inducing Meaningful Units from Character Sequences with Slot Attention ( http://arxiv.org/abs/2102.01223v1 )

ライセンス: Link先を確認

Melika Behjati and James Henderson

(参考訳) 文字は意味を伝えないが、文字の配列はそうである。抽象的意味保持単位を一連の文字で学習するための教師なし分布法を提案する。このモデルはシーケンスをセグメンテーションする代わりに、最近提案されたスロットアテンションと呼ばれる画像のオブジェクト発見のためのアーキテクチャを用いて、シーケンス内の"オブジェクト"の連続的な表現を検出する。我々は、異なる言語でモデルを訓練し、取得した表現の品質を分類器で評価する。我々の実験は、より高い抽象レベルで意味を捉える能力において有望な結果を示す。

Characters do not convey meaning, but sequences of characters do. We propose an unsupervised distributional method to learn the abstract meaning-bearing units in a sequence of characters. Rather than segmenting the sequence, this model discovers continuous representations of the "objects" in the sequence, using a recently proposed architecture for object discovery in images called Slot Attention. We train our model on different languages and evaluate the quality of the obtained representations with probing classifiers. Our experiments show promising results in the ability of our units to capture meaning at a higher level of abstraction.

翻訳日:2021-02-03 16:42:31 公開日:2021-02-01

# GraphDF:分子グラフ生成のための離散フローモデル

GraphDF: A Discrete Flow Model for Molecular Graph Generation ( http://arxiv.org/abs/2102.01189v1 )

ライセンス: Link先を確認

Youzhi Luo, Keqiang Yan, Shuiwang Ji

(参考訳) 深層モデルを用いた分子グラフ生成の問題を考察する。グラフは離散的であるが、既存手法の多くは連続潜在変数を用いており、離散的なグラフ構造のモデリングが不正確になる。本研究では、正規化フロー法に基づく分子グラフ生成のための新しい離散潜在変数モデルであるGraphDFを提案する。GraphDFは、可逆なモジュロシフト変換を用いて、離散潜在変数をグラフのノードとエッジにマッピングする。離散潜在変数を用いることで、計算コストを削減し、逆量子化(dequantization)の悪影響を排除できることを示す。包括的な実験の結果、GraphDFはランダム生成、プロパティ最適化、制約付き最適化の各タスクにおいて、先行手法を上回ることが示された。

We consider the problem of molecular graph generation using deep models. While graphs are discrete, most existing methods use continuous latent variables, resulting in inaccurate modeling of discrete graph structures. In this work, we propose GraphDF, a novel discrete latent variable model for molecular graph generation based on normalizing flow methods. GraphDF uses invertible modulo shift transforms to map discrete latent variables to graph nodes and edges. We show that the use of discrete latent variables reduces computational costs and eliminates the negative effect of dequantization. Comprehensive experimental results show that GraphDF outperforms prior methods on random generation, property optimization, and constrained optimization tasks.

翻訳日:2021-02-03 16:42:00 公開日:2021-02-01
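
The invertible modulo shift mentioned in the abstract can be illustrated with a minimal sketch: a discrete value in {0, ..., k-1} is shifted by an integer and wrapped around, and the inverse subtracts the same shift. The function names and the choice of `k = 4` categories are assumptions for illustration; in a flow model the shift would presumably be predicted from context rather than fixed, which this sketch does not attempt.

```python
def modulo_shift(x, mu, k):
    """Forward transform on a category x in {0, ..., k-1}."""
    return (x + mu) % k

def inverse_modulo_shift(z, mu, k):
    """Inverse transform: recovers x from z given the same shift mu."""
    return (z - mu) % k

k = 4    # hypothetical number of categories (e.g. edge types)
mu = 3   # hypothetical integer shift

forward = [modulo_shift(x, mu, k) for x in range(k)]
recovered = [inverse_modulo_shift(z, mu, k) for z in forward]
print(forward, recovered)
```

Because the map is a permutation of the `k` categories, no continuous relaxation of the discrete values is needed, which is the property the abstract contrasts with dequantization.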

# 並列ウェーブネットを用いたUniversal Neural Vocoding

Universal Neural Vocoding with Parallel WaveNet ( http://arxiv.org/abs/2102.01106v1 )

ライセンス: Link先を確認

Yunlong Jiao, Adam Gabrys, Georgi Tinchev, Bartosz Putrycz, Daniel Korzekwa, Viacheslav Klimkov

(参考訳) 本稿では、Parallel WaveNetに基づき、Audio Encoderと呼ぶ条件付けネットワークを追加したユニバーサルニューラルボコーダを提案する。我々のユニバーサルボコーダは、幅広いユースケースでリアルタイムかつ高品質な音声合成を提供する。年齢や性別が多様な43人の内部話者を対象に、20言語、17種類のスタイルでテストした。そのうち7つの声と5つのスタイルは、トレーニング時には含まれていなかった。提案するユニバーサルボコーダは、全体として話者依存型ボコーダを有意に上回ることを示す。また、提案するボコーダは、自然性と普遍性の観点で、いくつかの既存のニューラルボコーダアーキテクチャを上回ることを示す。これらの結果は、300以上のオープンソースの声でさらにテストした場合でも一貫している。

We present a universal neural vocoder based on Parallel WaveNet, with an additional conditioning network called Audio Encoder. Our universal vocoder offers real-time high-quality speech synthesis on a wide range of use cases. We tested it on 43 internal speakers of diverse age and gender, speaking 20 languages in 17 unique styles, of which 7 voices and 5 styles were not exposed during training. We show that the proposed universal vocoder significantly outperforms speaker-dependent vocoders overall. We also show that the proposed vocoder outperforms several existing neural vocoder architectures in terms of naturalness and universality. These findings are consistent when we further test on more than 300 open-source voices.

翻訳日:2021-02-03 16:36:56 公開日:2021-02-01

# BERT-based Label & Instance Embeddings による遠隔監視型関係抽出の改善

Improving Distantly-Supervised Relation Extraction through BERT-based Label & Instance Embeddings ( http://arxiv.org/abs/2102.01156v1 )

ライセンス: Link先を確認

Despina Christou, Grigorios Tsoumakas

(参考訳) 遠隔教師付き関係抽出(RE)は,REを大規模コーパスに拡張する有効な方法であるが,ノイズラベルに悩まされている。既存のアプローチは、マルチインスタンス学習と追加情報の提供を通じてノイズを緩和しようとしますが、主にトップの頻繁な関係を認識し、長期にわたってそれらを無視します。 REDSandT(Relation Extraction with Distant Supervision and Transformers)は、BERTの事前訓練モデルとラベルとエンティティの関係をそれぞれ活用し、高度に有益なインスタンスとREのラベル埋め込みを通じてより広い関係セットをキャプチャする、遠隔監視トランスフォーマーベースの新しいREメソッドである。エンティティペアとエンティティの型を接続するサブツリーを含む構造化された入力にBERTを微調整することで、ReDSandTはリレーショナルトークンのみにフォーカスするように誘導する。抽出した情報ベクトルを用いてラベル埋め込みを形づくり、さらにノイズを低減するためにインスタンス上の注意機構として使用する。最後に、関係とインスタンス埋め込みを結合することで文を表現する。 NYT-10データセットの実験では、REDSandTはより幅広い信頼関係を捉え、最先端のAUC(0.424)を達成している。

Distantly-supervised relation extraction (RE) is an effective method to scale RE to large corpora but suffers from noisy labels. Existing approaches try to alleviate noise through multi-instance learning and by providing additional information, but manage to recognize mainly the most frequent relations, neglecting those in the long tail. We propose REDSandT (Relation Extraction with Distant Supervision and Transformers), a novel distantly-supervised transformer-based RE method that captures a wider set of relations through highly informative instance and label embeddings for RE, by exploiting BERT's pre-trained model and the relationship between labels and entities, respectively. We guide REDSandT to focus solely on relational tokens by fine-tuning BERT on a structured input that includes the sub-tree connecting an entity pair and the entities' types. Using the extracted informative vectors, we shape label embeddings, which we also use as an attention mechanism over instances to further reduce noise. Finally, we represent sentences by concatenating relation and instance embeddings. Experiments on the NYT-10 dataset show that REDSandT captures a broader set of relations with higher confidence, achieving state-of-the-art AUC (0.424).

翻訳日:2021-02-03 16:36:25 公開日:2021-02-01

# SGDがGDよりも一般化(正規化は役に立たない)

SGD Generalizes Better Than GD (And Regularization Doesn't Help) ( http://arxiv.org/abs/2102.01117v1 )

ライセンス: Link先を確認

Idan Amir, Tomer Koren, Roi Livni

(参考訳) 基本確率凸最適化モデルにおける確率勾配降下(SGD)とフルバッチ勾配降下(GD)の一般化性能との間に新たな分離結果を与える。 SGD の場合、$O(1/\epsilon^2)$ 回の反復で $\epsilon$ の過剰期待リスクを持つ解を得るのに十分であることが知られているが、同じステップ数で GD が過学習し、$\Omega(1)$ の一般化誤差を持つ解を出力しうることを示す。さらに、GD が SGD の一般化性能に並ぶためには実のところ $\Omega(1/\epsilon^4)$ 回の反復が必要であることを示し、これは Bassily et al. (2020) の最近の研究によりタイトである。さらに,GDによって最小化される経験的リスクを正則化しても上記の結果は本質的に変わらないことを議論し,安定性,暗黙的バイアス,一般化における学習アルゴリズムの役割という概念を再検討する。

We give a new separation result between the generalization performance of stochastic gradient descent (SGD) and of full-batch gradient descent (GD) in the fundamental stochastic convex optimization model. While for SGD it is well-known that $O(1/\epsilon^2)$ iterations suffice for obtaining a solution with $\epsilon$ excess expected risk, we show that with the same number of steps GD may overfit and emit a solution with $\Omega(1)$ generalization error. Moreover, we show that in fact $\Omega(1/\epsilon^4)$ iterations are necessary for GD to match the generalization performance of SGD, which is also tight due to recent work by Bassily et al. (2020). We further discuss how regularizing the empirical risk minimized by GD essentially does not change the above result, and revisit the concepts of stability, implicit bias and the role of the learning algorithm in generalization.
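
As a rough worked example of the gap stated in the abstract, the two iteration counts can be compared at a concrete accuracy. The symbols $T_{\mathrm{SGD}}$ and $T_{\mathrm{GD}}$ are notation introduced here for illustration, and constants hidden by the asymptotic notation are ignored.

```latex
T_{\mathrm{SGD}}(\epsilon) = O\!\left(\epsilon^{-2}\right), \qquad
T_{\mathrm{GD}}(\epsilon) = \Omega\!\left(\epsilon^{-4}\right), \qquad
\epsilon = 0.1 \;\Rightarrow\; \epsilon^{-2} = 10^{2}, \quad \epsilon^{-4} = 10^{4}.
```

Under these assumptions, reaching excess risk $0.1$ takes on the order of a hundred SGD steps, while GD needs on the order of ten thousand iterations to match it.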

翻訳日:2021-02-03 16:34:19 公開日:2021-02-01

# 単一プロップによる頑健なニューラルネットワークの高速学習

Fast Training of Provably Robust Neural Networks by SingleProp ( http://arxiv.org/abs/2102.01208v1 )

ライセンス: Link先を確認

Akhilan Boopathy, Tsui-Wei Weng, Sijia Liu, Pin-Yu Chen, Gaoyuan Zhang, Luca Daniel

(参考訳) 最近の研究は、認証された保証を伴う敵の攻撃からニューラルネットワークを守るいくつかの方法を開発した。しかし、これらの技術は、訓練中に認証を使用することで計算コストがかかる。既存の認定防御よりも効率的で、ネットワークを介して1つの追加の前方伝播を必要とする新しい正規化器を開発し、同様の認定精度でネットワークを訓練することができます。 mnist と cifar-10 の実験を通じて,最先端の認証防御と比較して,トレーニング速度と同等の認定精度が向上することを示す。

Recent works have developed several methods of defending neural networks against adversarial attacks with certified guarantees. However, these techniques can be computationally costly due to the use of certification during training. We develop a new regularizer that is both more efficient than existing certified defenses, requiring only one additional forward propagation through a network, and can be used to train networks with similar certified accuracy. Through experiments on MNIST and CIFAR-10 we demonstrate improvements in training speed and comparable certified accuracy compared to state-of-the-art certified defenses.

翻訳日:2021-02-03 16:33:37 公開日:2021-02-01

# 線形ペイオフのための二重ロバストトンプソンサンプリング

Doubly Robust Thompson Sampling for linear payoffs ( http://arxiv.org/abs/2102.01229v1 )

ライセンス: Link先を確認

Wonyoung Kim, Gi-soo Kim, Myunghee Cho Paik

(参考訳) バンドイット問題における挑戦的な側面は、選択された腕のみに確率的な報酬が観察され、他の腕の報酬が失われることである。アームの選択は過去のコンテキストと報酬ペアに依存するため、選択されたアームのコンテキストは相関に苦しめられ、分析が困難になる。本論文では,データ文献の欠落に用いるDR手法をTSに応用した,Dubly Robust (DR) Thompson Sampling (TS) という新しいマルチアームコンテキストバンディットアルゴリズムを提案する。提案されたアルゴリズムは、$d$ が文脈の次元である$\sqrt{d}$ の係数によって ts の境界を改善する。提案手法の利点は,ts の理論的解析に使用される不飽和アームの技術的定義を回避できるため,選択または選択しないすべてのコンテキストデータを使用することである。経験的研究はTSよりも提案されたアルゴリズムの利点を示す。

A challenging aspect of the bandit problem is that a stochastic reward is observed only for the chosen arm and the rewards of other arms remain missing. Since the arm choice depends on the past context and reward pairs, the contexts of chosen arms suffer from correlation and render the analysis difficult. We propose a novel multi-armed contextual bandit algorithm called Doubly Robust (DR) Thompson Sampling (TS) that applies the DR technique used in the missing-data literature to TS. The proposed algorithm improves the bound of TS by a factor of $\sqrt{d}$, where $d$ is the dimension of the context. A benefit of the proposed method is that it uses all the context data, chosen or not chosen, which makes it possible to circumvent the technical definition of unsaturated arms used in the theoretical analysis of TS. Empirical studies show the advantage of the proposed algorithm over TS.
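
The doubly robust idea of filling in the missing rewards of unchosen arms can be sketched as follows. This is the generic DR estimator from the missing-data literature, written under the assumption that a model prediction and a selection probability are available for every arm; the exact estimator in the paper may differ, and all names here are hypothetical.

```python
def dr_pseudo_rewards(predicted, chosen, reward, probs):
    """Return a doubly robust pseudo-reward for every arm.

    predicted[i]: model-imputed reward for arm i (assumed given)
    chosen:       index of the arm that was actually pulled
    reward:       observed reward of the chosen arm
    probs[i]:     probability with which arm i was selected (assumed known and > 0)
    """
    pseudo = []
    for i, (m, p) in enumerate(zip(predicted, probs)):
        if i == chosen:
            # Observed arm: correct the imputation by the inverse-probability-weighted residual.
            pseudo.append(m + (reward - m) / p)
        else:
            # Unobserved arm: fall back to the imputed value.
            pseudo.append(m)
    return pseudo


if __name__ == "__main__":
    print(dr_pseudo_rewards([0.2, 0.5], chosen=1, reward=1.0, probs=[0.5, 0.5]))
```

Every arm receives a value, so the contexts of unchosen arms can enter the estimation step, which matches the benefit the abstract describes.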

翻訳日:2021-02-03 16:33:05 公開日:2021-02-01

# スタックオーバーフローからのクエリログに基づく効率的な検索のための自動クエリリフォーマレーション

Automated Query Reformulation for Efficient Search based on Query Logs From Stack Overflow ( http://arxiv.org/abs/2102.00826v1 )

ライセンス: Link先を確認

Kaibo Cao (1), Chunyang Chen (2), Sebastian Baltes (3), Christoph Treude (3), Xiang Chen (4) ((1) Software Institute, Nanjing University, China, (2) Faculty of Information Technology, Monash University, Australia, (3) School of Computer Science, University of Adelaide, Australia, (4) School of Information Science and Technology, Nantong University, China)

(参考訳) プログラミングのQ&Aサイトとして人気があるStack Overflowは、開発者にとって宝物だ。しかしながら、Stack Overflowの質問や回答の量によって、開発者が探している情報を効率的に見つけることが難しくなる。検索結果の貧弱化につながる2つのギャップは、ユーザの意図とテキストクエリの間のギャップ、クエリとポストコンテンツの間の意味的ギャップである。そのため開発者は、ミススペルされた単語を訂正し、特定のプログラミング言語やプラットフォームに制限を加えることで、クエリを常に修正する必要がある。クエリの修正は、特に初心者にとっては面倒であるので、ディープラーニングに基づくソフトウェア固有の自動クエリ修正手法を提案する。 Stack Overflowが提供するクエリログを用いて,元のクエリとそれに対応する修正後のクエリを含む大規模クエリ修正コーパスを構築する。提案手法では,ユーザが元のクエリを入力した場合に,修正クエリの候補を自動的に生成するTransformerモデルを訓練する。評価の結果、我々のアプローチは5つの最先端ベースラインを上回り、$\mathit{ExactMatch}$で5.6%から33.5%、$\mathit{GLEU}$で4.8%から14.4%向上した。

As a popular Q&A site for programming, Stack Overflow is a treasure for developers. However, the number of questions and answers on Stack Overflow makes it difficult for developers to efficiently locate the information they are looking for. There are two gaps leading to poor search results: the gap between the user's intention and the textual query, and the semantic gap between the query and the post content. Therefore, developers have to constantly reformulate their queries by correcting misspelled words, adding limitations to certain programming languages or platforms, etc. As query reformulation is tedious for developers, especially for novices, we propose an automated software-specific query reformulation approach based on deep learning. With query logs provided by Stack Overflow, we construct a large-scale query reformulation corpus, including the original queries and corresponding reformulated ones. Our approach trains a Transformer model that can automatically generate candidate reformulated queries when given the user's original query. The evaluation results show that our approach outperforms five state-of-the-art baselines, and achieves a 5.6% to 33.5% boost in terms of $\mathit{ExactMatch}$ and a 4.8% to 14.4% boost in terms of $\mathit{GLEU}$.
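
An exact-match style metric for generated reformulations can be sketched as below. The normalization (trimming whitespace, lowercasing) and the single-candidate setting are assumptions for illustration; the paper's precise definition of $\mathit{ExactMatch}$ may differ.

```python
def exact_match(predictions, references):
    """Fraction of predicted queries that equal the reference reformulation after light normalization."""
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have the same length")
    if not references:
        raise ValueError("at least one query pair is required")
    hits = sum(p.strip().lower() == r.strip().lower()
               for p, r in zip(predictions, references))
    return hits / len(references)


if __name__ == "__main__":
    preds = ["python sort list", "java hashmap"]
    refs = ["Python sort list", "java hash map"]
    print(exact_match(preds, refs))
```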

翻訳日:2021-02-03 16:29:42 公開日:2021-02-01

# サイエンス・コンスピレーション・ビデオの視覚的フレイミング: 機械学習とコミュニケーション理論の統合による色と明度の利用に関する研究

Visual Framing of Science Conspiracy Videos: Integrating Machine Learning with Communication Theories to Study the Use of Color and Brightness ( http://arxiv.org/abs/2102.01163v1 )

ライセンス: Link先を確認

Kaiping Chen, Sang Jung Kim, Sebastian Raschka, Qiantong Gao

(参考訳) 近年、インターネット上の科学陰謀ビデオの爆発が目撃され、科学認識論と科学の一般理解に挑戦している。学者たちは、不確実性や恐怖といった陰謀のメッセージで使用される説得技術について調べ始めたが、視覚的物語についてはほとんど理解されていない。本稿では、陰謀と反陰謀のビデオから数百万フレームを解析し、陰謀ビデオにおける視覚的フレーミングの理解のギャップを計算手法を用いて解決する。共謀ビデオは色のばらつきや明るさが低い傾向にあり、特にサムネイルや初期のビデオでは顕著だった。本論文は,ソーシャルメディア上での共謀を識別するために,研究者がテキスト的および視覚的な特徴を統合する方法を示し,デジタル時代の視覚操作研究に関心のある研究者にとっての計算モデリングの意義について論じる。

Recent years have witnessed an explosion of science conspiracy videos on the Internet, challenging science epistemology and public understanding of science. Scholars have started to examine the persuasion techniques used in conspiracy messages, such as uncertainty and fear, yet little is understood about the visual narratives, especially how visual narratives differ in videos that debunk conspiracies versus those that propagate conspiracies. This paper addresses this gap in understanding visual framing in conspiracy videos by analyzing millions of frames from conspiracy and counter-conspiracy YouTube videos using computational methods. We found that conspiracy videos tended to use lower color variance and brightness, especially in thumbnails and earlier parts of the videos. This paper also demonstrates how researchers can integrate textual and visual features for identifying conspiracies on social media and discusses the implications of computational modeling for scholars interested in studying visual manipulation in the digital era.
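
Frame-level brightness and color variance can be computed along the following lines. The definitions used here (mean of the RGB channels for brightness, per-channel population variance averaged over channels for color variance) are assumptions chosen for illustration and are not necessarily the features used in the paper.

```python
from statistics import mean, pvariance


def frame_stats(pixels):
    """Return (brightness, color_variance) for a frame given as a list of (r, g, b) tuples in 0..255."""
    # Brightness: average intensity over all pixels and channels.
    brightness = mean((r + g + b) / 3 for r, g, b in pixels)
    # Color variance: population variance of each channel, averaged over the three channels.
    color_variance = mean(pvariance(channel) for channel in zip(*pixels))
    return brightness, color_variance


if __name__ == "__main__":
    dark_flat_frame = [(10, 20, 30)] * 4
    high_contrast_frame = [(0, 0, 0), (255, 255, 255)]
    print(frame_stats(dark_flat_frame))
    print(frame_stats(high_contrast_frame))
```

Statistics like these could then be compared between thumbnails and later frames of a video.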

翻訳日:2021-02-03 16:15:44 公開日:2021-02-01

# (参考訳) 「大麻にかかわるうつ病ですか。」「限定監督による実体・関係抽出のための知識融合モデル」

"Is depression related to cannabis?": A knowledge-infused model for Entity and Relation Extraction with Limited Supervision ( http://arxiv.org/abs/2102.01222v1 )

ライセンス: CC BY 4.0

Kaushik Roy, Usha Lokala, Vedant Khandelwal, and Amit Sheth

(参考訳) 精神の健康を改善するための大麻の使用の利点が強く宣伝されており、大麻の合法化は立法者の間で優先事項となっている。しかし、予備的な科学的研究は、大麻と精神の健康の改善を決定的に関連づけてはいない。本研究では、精神的健康上の利益を得る目的での大麻の個人的使用を含む標的ソーシャルメディアコーパスにおいて、うつ病と大麻消費の関係を検討する。ドメインの専門家がアノテートした3つのカテゴリ(理由、効果、中毒)の関連を含むツイートを使用する。最先端の自然言語処理技術は、大麻に関するフレーズとうつ病指標の間のこれらの関係の抽出において不十分である。本研究は、ドメイン知識、具体的には精神疾患の診断・統計マニュアルの語彙で拡張した依存症用の薬物乱用オントロジーを用いて、この限界に対処することを目指す。ドメインエキスパートの時間が限られているためアノテーションが不足していることから、広範囲のコーパスで訓練されたGPT-3とともに教師付きコントラスト学習を使用して、限られた監督下でもパフォーマンスの向上を実現する。実験の結果,本手法は最先端の関係抽出器よりも大麻とうつ病の関係を有意に良く抽出できることが示された。学習した表現を用いた最近傍法により高品質なアノテーションを提供でき、科学コミュニティが大麻とうつ病の関連性をよりよく理解するために利用できる。

With strong marketing advocacy of the benefits of cannabis use for improved mental health, cannabis legalization is a priority among legislators. However, preliminary scientific research does not conclusively associate cannabis with improved mental health. In this study, we explore the relationship between depression and consumption of cannabis in a targeted social media corpus involving personal use of cannabis with the intent to derive its potential mental health benefit. We use tweets that contain an association among three categories annotated by domain experts: Reason, Effect, and Addiction. State-of-the-art Natural Language Processing techniques fall short in extracting these relationships between cannabis phrases and the depression indicators. We seek to address this limitation by using domain knowledge; specifically, the Drug Abuse Ontology for addiction augmented with Diagnostic and Statistical Manual of Mental Disorders lexicons for mental health. Because annotations are scarce due to the limited availability of the domain experts' time, we use supervised contrastive learning in conjunction with GPT-3 trained on a vast corpus to achieve improved performance even with limited supervision. Experimental results show that our method extracts cannabis-depression relationships significantly better than the state-of-the-art relation extractor. High-quality annotations can be provided by a nearest neighbor approach over the learned representations, and these annotations can be used by the scientific community to better understand the association between cannabis and depression.

翻訳日:2021-02-03 16:12:59 公開日:2021-02-01

# (参考訳) グラフ畳み込みニューラルネットワークによる汎用OCRパラグラフの同定

General-Purpose OCR Paragraph Identification by Graph Convolutional Neural Networks ( http://arxiv.org/abs/2101.12741v2 )

ライセンス: CC BY 4.0

Renshen Wang, Yasuhisa Fujii and Ashok C. Popat

(参考訳) パラグラフはドキュメントエンティティの重要なクラスです。 OCRテキストボックスに適用した空間グラフ畳み込みニューラルネットワーク(GCN)による段落識別のための新しい手法を提案する。行分割と行クラスタリングという2つのステップを実行して、OCR結果の行から段落を抽出します。各ステップはバウンディングボックスから構築されたβ-スケルトングラフを使用し、グラフエッジはグラフ畳み込み操作の効率的なサポートを提供する。純粋なレイアウト入力機能のみにより、GCNモデルのサイズはR-CNNベースのモデルと比較して3〜4桁小さく、PubLayNetや他のデータセットで同等以上の精度を達成しています。さらに、GCNモデルは、合成トレーニングデータから実世界画像への良好な一般化と、可変文書スタイルに対する良好な適応性を示す。

Paragraphs are an important class of document entities. We propose a new approach for paragraph identification by spatial graph convolutional neural networks (GCN) applied on OCR text boxes. Two steps, namely line splitting and line clustering, are performed to extract paragraphs from the lines in OCR results. Each step uses a beta-skeleton graph constructed from bounding boxes, where the graph edges provide efficient support for graph convolution operations. With only pure layout input features, the GCN model size is 3~4 orders of magnitude smaller compared to R-CNN based models, while achieving comparable or better accuracies on PubLayNet and other datasets. Furthermore, the GCN models show good generalization from synthetic training data to real-world images, and good adaptivity for variable document styles.
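
A beta-skeleton with beta = 1 is the Gabriel graph, which gives a compact way to illustrate how such a neighborhood graph is built. The sketch below works on points (for instance, box centers), which is a simplifying assumption; the paper constructs its graph from bounding boxes, and that construction is not reproduced here.

```python
from itertools import combinations


def gabriel_edges(points):
    """Return index pairs (i, j) such that no other point lies strictly inside the circle with diameter points[i]-points[j]."""
    edges = []
    for i, j in combinations(range(len(points)), 2):
        (x1, y1), (x2, y2) = points[i], points[j]
        cx, cy = (x1 + x2) / 2, (y1 + y2) / 2           # circle center: midpoint of the pair
        r2 = ((x1 - x2) ** 2 + (y1 - y2) ** 2) / 4      # squared radius
        blocked = any((px - cx) ** 2 + (py - cy) ** 2 < r2
                      for k, (px, py) in enumerate(points) if k not in (i, j))
        if not blocked:
            edges.append((i, j))
    return edges


if __name__ == "__main__":
    # Three hypothetical text-line centers; the middle one blocks the long edge between the outer two.
    print(gabriel_edges([(0, 0), (2, 0), (1, 0.1)]))
```

The resulting edges connect only nearby elements, which keeps the graph sparse for graph convolution. This brute-force version is cubic in the number of points and is meant only to show the definition.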

翻訳日:2021-02-03 13:29:22 公開日:2021-02-01

# (参考訳) ファーストパーソンビデオからのコンタクト表現による予測アクション

Forecasting Action through Contact Representations from First Person Video ( http://arxiv.org/abs/2102.00649v1 )

ライセンス: CC BY 4.0

Eadom Dessalene, Chinmaya Devaraj, Michael Maynord, Cornelia Fermuller, and Yiannis Aloimonos

(参考訳) 手操作を含む人間の行動は、手対象の接触の作成と破壊に基づいて構成され、行動の人間の視覚的理解は、認知科学の先駆的な研究によって実証されるように、接触の予測に依存している。これから着想を得て,接触を中心とした表現とモデルを紹介し,行動予測と予測に使用する。 EPIC Kitchensデータセットのサブセットをアノテートして、ハンドとオブジェクト間の接触時間、ハンドとオブジェクトのセグメンテーションを含むようにします。これらのアノテーションを使って予測モジュール、接触予測マップを生成するモジュール、そして次のアクティブオブジェクトセグメンテーションを訓練します。予測モジュールの上に、アクション予測と予測のためのフレームワークであるEgocentric Object Manipulation Graphs (Ego-OMG)を適用します。 Ego-OMGは、接触線型行動状態間のグラフモデリング遷移を使用して、より長期の時間的意味関係をモデル化する。 ego-omg内の予測モジュールの使用は、最先端の結果を生成し、epic kitchens action anticipation challengeのunseenおよびseetテストセットでそれぞれ1位と2位を達成し、epic kitchens上でのアクション予測とアクション予測のタスクに関する最先端の結果を得る。我々は,予測モジュールの特性に関するアブレーション研究を行い,その有用性を評価する。

Human actions involving hand manipulations are structured according to the making and breaking of hand-object contact, and human visual understanding of action is reliant on anticipation of contact as is demonstrated by pioneering work in cognitive science. Taking inspiration from this, we introduce representations and models centered on contact, which we then use in action prediction and anticipation. We annotate a subset of the EPIC Kitchens dataset to include time-to-contact between hands and objects, as well as segmentations of hands and objects. Using these annotations we train the Anticipation Module, a module producing Contact Anticipation Maps and Next Active Object Segmentations - novel low-level representations providing temporal and spatial characteristics of anticipated near future action. On top of the Anticipation Module we apply Egocentric Object Manipulation Graphs (Ego-OMG), a framework for action anticipation and prediction. Ego-OMG models longer term temporal semantic relations through the use of a graph modeling transitions between contact delineated action states. Use of the Anticipation Module within Ego-OMG produces state-of-the-art results, achieving 1st and 2nd place on the unseen and seen test sets, respectively, of the EPIC Kitchens Action Anticipation Challenge, and achieving state-of-the-art results on the tasks of action anticipation and action prediction over EPIC Kitchens. We perform ablation studies over characteristics of the Anticipation Module to evaluate their utility.

翻訳日:2021-02-03 08:25:19 公開日:2021-02-01

# (参考訳) 顔について:顔認識評価に関する調査

About Face: A Survey of Facial Recognition Evaluation ( http://arxiv.org/abs/2102.00813v1 )

ライセンス: CC BY 4.0

Inioluwa Deborah Raji, Genevieve Fried

(参考訳) 1976年から2019年にかけて、さまざまなソース、人口、状況から1700万以上の被験者の1億4500万枚の画像から構築された100以上の顔データセットを調査した。歴史的調査によると、これらのデータセットは、政治的モチベーションの変化、技術的能力、そして現在の規範によって形作られています。このような影響が特定のプラクティスをマスクする方法(その一部は実際に有害であるか、あるいはそれ以外は問題)を議論し、現実世界のテクノロジーの機能をより明確に理解するために、そのような詳細の明示的なコミュニケーションのケースを作ります。

We survey over 100 face datasets constructed between 1976 and 2019, comprising 145 million images of over 17 million subjects from a range of sources, demographics and conditions. Our historical survey reveals that these datasets are contextually informed, shaped by changes in political motivations, technological capability and current norms. We discuss how such influences mask specific practices (some of which may actually be harmful or otherwise problematic) and make a case for the explicit communication of such details in order to establish a more grounded understanding of the technology's function in the real world.

翻訳日:2021-02-03 08:01:49 公開日:2021-02-01

# (参考訳) ビデオキャプションのためのセマンティックグループネットワーク

Semantic Grouping Network for Video Captioning ( http://arxiv.org/abs/2102.00831v1 )

ライセンス: CC BY 4.0

Hobin Ryu, Sunghun Kang, Haeyong Kang, and Chang D. Yoo

(参考訳) 本論文では,(1)部分的に符号化されたキャプションの単語フレーズを区別してビデオフレームをグループ化しようとするセマンティックグループネットワーク(Semantic Grouping Network, SGN)と呼ばれるビデオキャプション生成ネットワークを検討し,(2)セマンティックアライメント群を復号して次の単語を予測する。連続するフレームがユニークな情報を提供する可能性は低いため、以前の手法は入力ビデオのみに基づいて繰り返し情報を破棄またはマージすることに重点を置いていた。 SGNは、部分的にデコードされたキャプションの最も識別された単語フレーズをキャプチャするアルゴリズムと、関連するビデオフレームに各フレーズを関連付けるマッピングを学習する。従来の手法とは対照的に、復号された単語からの連続的なフィードバックにより、SGNは部分的に復号されたキャプションに対応するビデオ表現を動的に更新することができる。さらに、マニュアルアノテーションなしで単語句とビデオフレームの正確な整合を容易にするために、コントラストの注意損失が提案される。 SGNは、MSVDおよびMSR-VTTデータセット上のCIDEr-Dスコアの2.1%pおよび2.4%pのマージンでランナーアップ方法を上回ることにより、最新のパフォーマンスを実現します。広範な実験は、SGNの有効性と解釈可能性を示しています。

This paper considers a video caption generating network referred to as Semantic Grouping Network (SGN) that attempts (1) to group video frames with discriminating word phrases of partially decoded caption and then (2) to decode those semantically aligned groups in predicting the next word. As consecutive frames are not likely to provide unique information, prior methods have focused on discarding or merging repetitive information based only on the input video. The SGN learns an algorithm to capture the most discriminating word phrases of the partially decoded caption and a mapping that associates each phrase to the relevant video frames - establishing this mapping allows semantically related frames to be clustered, which reduces redundancy. In contrast to the prior methods, the continuous feedback from decoded words enables the SGN to dynamically update the video representation that adapts to the partially decoded caption. Furthermore, a contrastive attention loss is proposed to facilitate accurate alignment between a word phrase and video frames without manual annotations. The SGN achieves state-of-the-art performances by outperforming runner-up methods by a margin of 2.1%p and 2.4%p in a CIDEr-D score on MSVD and MSR-VTT datasets, respectively. Extensive experiments demonstrate the effectiveness and interpretability of the SGN.

翻訳日:2021-02-03 08:00:59 公開日:2021-02-01

# (参考訳) 構造予測における超高速速度

Super fast rates in structured prediction ( http://arxiv.org/abs/2102.00760v1 )

ライセンス: CC BY 4.0

Vivien Cabannes and Alessandro Rudi and Francis Bach

(参考訳) 分類のような離散的教師付き学習問題は、回帰に類似した連続的な代理問題を導入することでしばしば取り組まれる。推定と解の間の元の誤差を代理誤差で抑えることにより、連続的な問題に対して既に示されている収束率が離散的な問題にも与えられる。しかし、現在のアプローチでは、連続的な問題が連続的な値を予測するのに対し、離散的な問題は本質的に離散的な出力を予測しているという事実を活用できていない。本稿では、一般的な構造化予測問題についてこの問題に取り組み、「超高速」率、すなわち過剰リスクの収束率が $n^{-1}$ よりも速い($n$ は観測数)収束率への道を開く。最も強い仮定の下では指数関数的な収束率も得られる。まず,最近傍に基づく予測器について説明を行い,バイナリ分類で知られている収束率を構造化予測の枠組み内の任意の離散問題に一般化する。次に,カーネルリッジ回帰を検討し,$n^{-1/4}$ の既知の収束率を,問題の難しさを特徴付けるパラメータに応じて任意に速い収束率へと改善する。これにより,滑らかさの仮定の下で次元の呪いを回避できる。

Discrete supervised learning problems such as classification are often tackled by introducing a continuous surrogate problem akin to regression. Bounding the original error, between estimate and solution, by the surrogate error endows discrete problems with convergence rates already shown for continuous instances. Yet, current approaches do not leverage the fact that discrete problems are essentially predicting a discrete output when continuous problems are predicting a continuous value. In this paper, we tackle this issue for general structured prediction problems, opening the way to "super fast" rates, that is, convergence rates for the excess risk faster than $n^{-1}$, where $n$ is the number of observations, with even exponential rates with the strongest assumptions. We first illustrate it for predictors based on nearest neighbors, generalizing rates known for binary classification to any discrete problem within the framework of structured prediction. We then consider kernel ridge regression where we improve known rates in $n^{-1/4}$ to arbitrarily fast rates, depending on a parameter characterizing the hardness of the problem, thus allowing, under smoothness assumptions, to bypass the curse of dimensionality.

翻訳日:2021-02-03 07:00:59 公開日:2021-02-01

# (参考訳) 歴史的コーポラの神経OCRポストホック補正

Neural OCR Post-Hoc Correction of Historical Corpora ( http://arxiv.org/abs/2102.00583v1 )

ライセンス: CC BY-SA 4.0

Lijun Lyu, Maria Koutraki, Martin Krickl, Besnik Fetahu

(参考訳) 光文字認識(OCR)は歴史的コレクションへのより深いアクセスに不可欠である。 OCRは、文字、単語、または単語分割の転写エラーの主な原因として、正書法の変化、書体、言語進化(新しい文字、単語スペルなど)を考慮する必要がある。歴史的印刷物のデジタルコーパスでは、スキャン品質の低さと言語標準化の欠如によりエラーはさらに悪化する。 OCRポストホック補正のタスクに対して、OCR転写エラーを補正するために、リカレント(RNN)とディープ畳み込みネットワーク(ConvNet)を組み合わせたニューラルアプローチを提案する。文字レベルで誤りを柔軟に捉え、新しい注意機構に基づいて補正された出力を復号する。入力と出力の類似性を考慮し,モデルの補正動作に報酬を与える新たな損失関数を提案する。ドイツ語の歴史的書籍コーパスでの評価は、我々のモデルが多様なOCR転写エラーを頑健に捉え、32.3%の単語誤り率を89%以上削減できることを示している。

Optical character recognition (OCR) is crucial for a deeper access to historical collections. OCR needs to account for orthographic variations, typefaces, or language evolution (i.e., new letters, word spellings), as the main source of character, word, or word segmentation transcription errors. For digital corpora of historical prints, the errors are further exacerbated due to low scan quality and lack of language standardization. For the task of OCR post-hoc correction, we propose a neural approach based on a combination of recurrent (RNN) and deep convolutional network (ConvNet) to correct OCR transcription errors. At character level we flexibly capture errors, and decode the corrected output based on a novel attention mechanism. Accounting for the input and output similarity, we propose a new loss function that rewards the model's correcting behavior. Evaluation on a historical book corpus in German language shows that our models are robust in capturing diverse OCR transcription errors and reduce the word error rate of 32.3% by more than 89%.
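
The word error rate reported in the abstract is conventionally computed as word-level edit distance divided by the reference length. The following is a minimal sketch of that standard definition; whitespace tokenization is an assumption, and the example strings are made up.

```python
def word_error_rate(reference, hypothesis):
    """Word-level Levenshtein distance between hypothesis and reference, divided by the reference length."""
    ref, hyp = reference.split(), hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    prev = list(range(len(hyp) + 1))  # distances from the empty reference prefix
    for i, r in enumerate(ref, 1):
        cur = [i]
        for j, h in enumerate(hyp, 1):
            cur.append(min(prev[j] + 1,              # deletion of a reference word
                           cur[j - 1] + 1,           # insertion of a hypothesis word
                           prev[j - 1] + (r != h)))  # substitution or match
        prev = cur
    return prev[-1] / len(ref)


if __name__ == "__main__":
    print(word_error_rate("der alte mann", "der alle mann"))
```

As a point of arithmetic, a relative reduction of 89% applied to a word error rate of 32.3% leaves 32.3% × 0.11 ≈ 3.6%, so "more than 89%" corresponds to a residual word error rate below roughly 3.6%.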

翻訳日:2021-02-03 05:29:39 公開日:2021-02-01

# (参考訳) 科学論文におけるマルチレベルヘッダ数値表のメトリック型同定

Metric-Type Identification for Multi-Level Header Numerical Tables in Scientific Papers ( http://arxiv.org/abs/2102.00819v1 )

ライセンス: CC BY 4.0

Lya Hulliyyatus Suadaa, Hidetaka Kamigaito, Manabu Okumura, Hiroya Takamura

(参考訳) 数値表は科学論文に実験結果を示すために広く使われている。テーブル理解のためには、テーブル内の数値を識別するためにメトリクス型が不可欠です。本稿では,新しい情報抽出タスク,多レベルヘッダ数値表からのメトリックタイプ識別,ヘッダテーブル,キャプション,メトリックタイプからなる科学論文から抽出したデータセットを提案する。そこで我々は,ポインタ生成モデルとBERTモデルを用いた2つの共同学習型ニューラル分類と生成方式を提案する。その結果, 共同モデルは, ヘッド内とヘッド外の両方のメトリック型識別問題に対処できることが示された。

Numerical tables are widely used to present experimental results in scientific papers. For table understanding, a metric-type is essential to discriminate numbers in the tables. We introduce a new information extraction task, metric-type identification from multi-level header numerical tables, and provide a dataset extracted from scientific papers consisting of header tables, captions, and metric-types. We then propose two joint-learning neural classification and generation schemes featuring pointer-generator-based and BERT-based models. Our results show that the joint models can handle both in-header and out-of-header metric-type identification problems.

翻訳日:2021-02-03 05:12:38 公開日:2021-02-01

# (参考訳) 多言語lama:多言語事前学習言語モデルにおける知識の検討

Multilingual LAMA: Investigating Knowledge in Multilingual Pretrained Language Models ( http://arxiv.org/abs/2102.00894v1 )

ライセンス: CC BY 4.0

Nora Kassner, Philipp Dufter, Hinrich Schütze

(参考訳) 近年,単言語英語モデルが知識ベースとして利用できることが判明している。構造知識ベースクエリの代わりに,「パリは[MASK]の首都」などのマスキング文がプローブとして使用される。確立されたベンチマークTRExとGoogleREを53言語に翻訳する。 mBERTを使って3つの質問を調査する。 i) mBERTは多言語知識ベースとして使用できるか? ほとんどの先行研究は英語のみを扱っている。複数の言語に研究を拡張することは、多様性とアクセシビリティにとって重要である。 (ii) 知識ベース言語非依存としてのmBERTのパフォーマンスは、言語によって異なりますか? (iii) 多言語モデルはより多くのテキストで訓練される。例えば、mBERTは104のウィキペディアで訓練される。 mBERTはこれをより良いパフォーマンスに活用できますか? 知識ベースとしてmBERTを使用することで、言語間でパフォーマンスが変化し、言語間で予測をプールすることでパフォーマンスが向上します。逆に、mBERTは言語バイアスを示す。例えば、イタリア語で問い合わせた場合、イタリアを起源の国と予測する傾向があります。

Recently, it has been found that monolingual English language models can be used as knowledge bases. Instead of structural knowledge base queries, masked sentences such as "Paris is the capital of [MASK]" are used as probes. We translate the established benchmarks TREx and GoogleRE into 53 languages. Working with mBERT, we investigate three questions. (i) Can mBERT be used as a multilingual knowledge base? Most prior work only considers English. Extending research to multiple languages is important for diversity and accessibility. (ii) Is mBERT's performance as knowledge base language-independent or does it vary from language to language? (iii) A multilingual model is trained on more text, e.g., mBERT is trained on 104 Wikipedias. Can mBERT leverage this for better performance? We find that using mBERT as a knowledge base yields varying performance across languages and pooling predictions across languages improves performance. Conversely, mBERT exhibits a language bias; e.g., when queried in Italian, it tends to predict Italy as the country of origin.
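
Pooling predictions across languages can be sketched as a simple majority vote. This assumes each language's top prediction has already been mapped to a shared, language-independent entity label; both that mapping and the voting rule are assumptions for illustration, not necessarily the pooling used in the paper.

```python
from collections import Counter


def pool_predictions(per_language):
    """Return the answer predicted by the largest number of languages.

    per_language: dict mapping a language code to that language's top prediction,
                  already normalized to a shared entity label (assumed).
    Ties are broken in favor of the answer that appears first in the dict.
    """
    if not per_language:
        raise ValueError("at least one prediction is required")
    counts = Counter(per_language.values())
    return counts.most_common(1)[0][0]


if __name__ == "__main__":
    # Hypothetical predictions for a country-of-origin query.
    print(pool_predictions({"en": "France", "de": "France", "it": "Italy"}))
```

A vote of this kind can outweigh a single language's bias, such as the Italian query favoring Italy in the abstract's example.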

翻訳日:2021-02-03 04:49:14 公開日:2021-02-01

# (参考訳) ニュース記事の抗議数:データセットと半自動化データ収集パイプライン

Counting Protests in News Articles: A Dataset and Semi-Automated Data Collection Pipeline ( http://arxiv.org/abs/2102.00917v1 )

ライセンス: CC BY 4.0

Tommy Leung, L. Nathan Perkins

(参考訳) 2017年1月から2021年1月にかけて、米国の何千もの地元ニュースソースが、公民権、移民、銃、環境などに関する42,000以上の抗議を報告した。抗議を毎日報告する地元のジャーナリストの膨大な数を考えると、これらの出来事を構造化されたデータとして抽出して時間的および地理的傾向を理解することは、市民の意思決定を後押ししうる。しかし、ニュース記事からイベントを抽出するタスクは、ドメイン検出、スロットフィリング、共参照解決の分野で、NLPコミュニティによく知られた課題を提示する。ニュース記事から構造化されたデータを抽出するためのリソースを改善するために、我々の貢献は3つある。 1) 2017年1月から2021年1月までに米国で報告された42,347件の抗議イベントに対応する、ニュース記事のURL、日付、場所、群衆規模の推定値、および494個の離散的な記述タグからなる手動ラベル付きデータセットを公開する。 2) データセットを構成する144,568件の英語記事の発見、分類、レビューに用いた半自動データ収集パイプラインを説明する。 3) 段落や文などの構文構造に基づいてニュース記事を処理することの有用性を示す、長・短期記憶(LSTM)低次元分類器を、報告された抗議イベント数のカウントについてベンチマークする。

Between January 2017 and January 2021, thousands of local news sources in the United States reported on over 42,000 protests about topics such as civil rights, immigration, guns, and the environment. Given the vast number of local journalists that report on protests daily, extracting these events as structured data to understand temporal and geographic trends can empower civic decision-making. However, the task of extracting events from news articles presents well-known challenges to the NLP community in the fields of domain detection, slot filling, and coreference resolution. To help improve the resources available for extracting structured data from news stories, our contribution is three-fold. We 1) release a manually labeled dataset of news article URLs, dates, locations, crowd size estimates, and 494 discrete descriptive tags corresponding to 42,347 reported protest events in the United States between January 2017 and January 2021; 2) describe the semi-automated data collection pipeline used to discover, sort, and review the 144,568 English articles that comprise the dataset; and 3) benchmark a long short-term memory (LSTM) low-dimensional classifier that demonstrates the utility of processing news articles based on syntactic structures, such as paragraphs and sentences, to count the number of reported protest events.

翻訳日:2021-02-03 04:33:32 公開日:2021-02-01

# (参考訳) 再帰的KMeansとDijkstraアルゴリズムによるCVRPの解法

Using Recursive KMeans and Dijkstra Algorithm to Solve CVRP ( http://arxiv.org/abs/2102.00567v1 )

ライセンス: CC BY 4.0

Hassan Moussa

(参考訳) キャパシタ付き車両ルーティング問題(CVRP)は、今日の最も一般的な最適化問題のひとつです。

The capacitated vehicle routing problem (CVRP) is one of the most common optimization problems today.
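
The abstract is truncated and does not say how the method combines its components, so the following only illustrates Dijkstra's algorithm, which the title names, as a standard shortest-path routine on a small road graph. The graph, node names, and weights are hypothetical.

```python
import heapq


def dijkstra(graph, source):
    """Shortest-path distances from source in a graph with non-negative edge weights.

    graph: dict mapping a node to a list of (neighbor, weight) pairs.
    """
    dist = {source: 0}
    heap = [(0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue  # stale heap entry; a shorter path to u was already found
        for v, w in graph.get(u, []):
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist


if __name__ == "__main__":
    roads = {"depot": [("a", 4), ("b", 1)], "b": [("a", 2)], "a": []}
    print(dijkstra(roads, "depot"))
```

In a routing setting, distances like these could supply the travel costs between a depot and customers; how the paper uses them is not stated in the available text.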

翻訳日:2021-02-03 03:59:43 公開日:2021-02-01

# (参考訳) 分割関数のサンプリングと複雑性

Sampling and Complexity of Partition Function ( http://arxiv.org/abs/2102.00855v1 )

ライセンス: CC0 1.0

Chuyu Xiong

(参考訳) 数分割問題はよく知られた問題であり、Karp の21のNP完全問題 \cite{karp} の1つである。分割関数は、数の範囲が制限された数分割問題と等価なブール関数である。数分割問題と分割関数の計算複雑性を理解することは極めて重要かつ困難である。このような問題には、新しいツールと手法 \cite{aaronson} が必要だと推測されている。汎用学習マシン \cite{paper5, paper8} に関する最近の研究で、我々はフィッティング極値、適切なサンプリング集合、パラメータ付きブール関数(試行錯誤方式で使用される)といったツールを開発した。これらのツールが分割関数に適用できることが分かった。本稿では,分割関数の設定,分割関数の性質,使用するツールについて論じる。このアプローチにより、分割関数の計算複雑性の下界と、数分割問題の計算複雑性の下界が、問題の大きさに対して指数関数的であることを証明する。これは $\mathbf{P} \ne \mathbf{NP}$ \cite{cook} を意味する。

The number partition problem is a well-known problem and one of Karp's 21 NP-complete problems \cite{karp}. The partition function is a Boolean function that is equivalent to the number partition problem with a restricted number range. Understanding the computational complexity of the number partition problem and of the partition function is both important and hard. It has been speculated that new tools and methods are needed for such problems \cite{aaronson}. In our recent research on the universal learning machine \cite{paper5, paper8}, we developed several tools, namely fitting extremum, proper sampling set, and Boolean functions with parameters (used in a trial-and-error fashion). We found that these tools can be applied to the partition function. In this article, we discuss the setup of the partition function, the properties of the partition function, and the tools to be used. This approach leads us to prove that the lower bound of the computational complexity of the partition function, as well as the lower bound of the computational complexity of the number partition problem, is exponential in the size of the problem. This implies $\mathbf{P} \ne \mathbf{NP}$ \cite{cook}.
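
For readers unfamiliar with the number partition problem, a brute-force decision procedure makes the definition concrete. This sketch only states the problem in code; it enumerates all subsets and says nothing about the lower-bound argument the abstract claims. The function name is an assumption.

```python
from itertools import combinations


def has_equal_partition(numbers):
    """Decide whether the multiset of integers can be split into two parts with equal sums (brute force)."""
    total = sum(numbers)
    if total % 2:
        return False  # an odd total can never be split evenly
    target = total // 2
    indices = range(len(numbers))
    # Try every subset; the complement of a subset summing to target also sums to target.
    return any(sum(numbers[i] for i in subset) == target
               for size in range(len(numbers) + 1)
               for subset in combinations(indices, size))


if __name__ == "__main__":
    print(has_equal_partition([3, 1, 1, 2, 2, 1]))
    print(has_equal_partition([1, 1, 4]))
```

The enumeration visits up to $2^n$ subsets for $n$ numbers, which is why it is usable only for very small inputs.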

翻訳日:2021-02-03 03:55:37 公開日:2021-02-01

# (参考訳) Box Re-Ranking: ドメイン適応ペデストリアン検出のための教師なし偽陽性抑制

Box Re-Ranking: Unsupervised False Positive Suppression for Domain Adaptive Pedestrian Detection ( http://arxiv.org/abs/2102.00595v1 )

ライセンス: CC BY 4.0

Weijie Chen and Yilu Guo and Shicai Yang and Zhaoyang Li and Zhenxin Ma and Binbin Chen and Long Zhao and Di Xie and Shiliang Pu and Yueting Zhuang

(参考訳) 偽陽性は、ドメイン適応型歩行者検出におけるドメインシフトの診断によってもたらされる最も深刻な問題の1つです。しかし、各ボックスを無数のターゲットドメインにラベル付けすることは不可能です。したがって、各対象領域における偽陽性を教師なしの方法で抑制することに注意を向ける。本稿では,オブジェクト検出タスクをポジティブボックスとネガティブボックスのランキングタスクに革新的にモデル化し,偽陽性抑圧問題をエレガントにボックス再ランク問題に変換することにより,手動のアノテーションを使わずに解決できるようにする。ボックスの再ランク付け時に付随する問題は、チェリーピッキングにラベル付きバリデーションデータが利用できないことである。本研究は,正の正の変わらずを検出することを目的として,自己監督評価指標であるボックス数アライメントを提案し,最適化されたモデルがキャパシティの劣化を防ぐ。クロスドメイン歩行者検出データセットを用いて大規模な実験を行い,提案手法の有効性を実証した。さらに、2つの一般教師なしドメイン適応オブジェクト検出ベンチマークへの拡張は、他の最先端技術に対する当社の優位性もサポートする。

False positives are one of the most serious problems caused by agnostic domain shift in domain adaptive pedestrian detection. However, it is impossible to label every box in countless target domains. We therefore turn our attention to suppressing false positives in each target domain in an unsupervised way. In this paper, we model the object detection task as a ranking task among positive and negative boxes, and thus transform the false positive suppression problem into a box re-ranking problem, which makes it feasible to solve without manual annotation. A problem that arises during box re-ranking is that no labeled validation data is available for cherry-picking. Since we aim to keep the detection of true positives unchanged, we propose box number alignment, a self-supervised evaluation metric, to prevent the optimized model from capacity degeneration. Extensive experiments conducted on cross-domain pedestrian detection datasets demonstrate the effectiveness of our proposed framework. Furthermore, the extension to two general unsupervised domain adaptive object detection benchmarks also supports its superiority over other state-of-the-art methods.

翻訳日:2021-02-03 02:14:02 公開日:2021-02-01

# (参考訳) ライン描画による顔写真とスケッチの橋渡し

Bridging Unpaired Facial Photos And Sketches By Line-drawings ( http://arxiv.org/abs/2102.00635v1 )

ライセンス: CC BY 4.0

Fei Gao, Meimei Shang, Xiang Li, Jingjie Zhu, Lingna Dai

(参考訳) 本論文では,不対データを用いて顔スケッチ合成モデルを学習する新しい手法を提案する。私たちの主なアイデアは、写真ドメイン $\mathcal{X}$ とスケッチドメイン $\mathcal{Y}$ を線画ドメイン $\mathcal{Z}$ を使ってブリッジすることです。具体的には,写真とスケッチの両方を,ニューラルスタイル変換手法を用いて線画にマッピングする。すなわち $F: \mathcal{X}/\mathcal{Y} \mapsto \mathcal{Z}$ である。その結果、 \textit{pseudo paired data} $(\mathcal{Z}, \mathcal{Y})$ を得ることができ、マッピング $G:\mathcal{Z} \mapsto \mathcal{Y}$ を教師あり学習方法で学習することができる。推論段階では、顔写真が与えられたら、まず線画に変換し、次に $G \circ F$ でスケッチに変換できます。さらに,異なるタイプのストロークを生成するための新しいストローク損失を提案する。 sRenderと呼ばれる私たちの方法は、人間のアーティストのレンダリングプロセスとよく一致します。実験結果は、sRenderがマルチスタイルのスケッチを生成し、既存の不対画像から画像への変換方法を大幅に上回ることを実証した。

In this paper, we propose a novel method to learn face sketch synthesis models by using unpaired data. Our main idea is bridging the photo domain $\mathcal{X}$ and the sketch domain $\mathcal{Y}$ by using the line-drawing domain $\mathcal{Z}$. Specifically, we map both photos and sketches to line-drawings by using a neural style transfer method, i.e., $F: \mathcal{X}/\mathcal{Y} \mapsto \mathcal{Z}$. Consequently, we obtain \textit{pseudo paired data} $(\mathcal{Z}, \mathcal{Y})$, and can learn the mapping $G:\mathcal{Z} \mapsto \mathcal{Y}$ in a supervised learning manner. In the inference stage, given a facial photo, we can first transfer it to a line-drawing and then to a sketch by $G \circ F$. Additionally, we propose a novel stroke loss for generating different types of strokes. Our method, termed sRender, accords well with human artists' rendering process. Experimental results demonstrate that sRender can generate multi-style sketches, and significantly outperforms existing unpaired image-to-image translation methods.

翻訳日:2021-02-03 02:00:49 公開日:2021-02-01

# (参考訳) 映像からの自己監督等変性シーン合成

Self-Supervised Equivariant Scene Synthesis from Video ( http://arxiv.org/abs/2102.00863v1 )

ライセンス: CC BY 4.0

Cinjon Resnick, Or Litany, Cosmas Hei{\ss}, Hugo Larochelle, Joan Bruna, Kyunghyun Cho

(参考訳) 本研究では,背景,キャラクタ,アニメーションに自動的に区切られた映像からシーン表現を学習するための自己教師付きフレームワークを提案する。本手法は,フレーム間の変換に対して等変性を持ち,背景が同じ変換に対して一定であることに着目した。トレーニング後、画像エンコーディングをリアルタイムで操作して、非表示のコンポーネントの組み合わせを作成できます。私たちが知る限り、我々は、解釈可能な背景、キャラクタ、アニメーションの教師なし抽出と合成を行う最初の方法である。我々は,背景付きmnistの移動,2次元ビデオゲームスプライト,ファッションモデリングという3つのデータセットで結果を示す。

We propose a self-supervised framework to learn scene representations from video that are automatically delineated into background, characters, and their animations. Our method capitalizes on moving characters being equivariant with respect to their transformation across frames and the background being constant with respect to that same transformation. After training, we can manipulate image encodings in real time to create unseen combinations of the delineated components. As far as we know, we are the first method to perform unsupervised extraction and synthesis of interpretable background, character, and animation. We demonstrate results on three datasets: Moving MNIST with backgrounds, 2D video game sprites, and Fashion Modeling.

翻訳日:2021-02-03 01:37:04 公開日:2021-02-01

# (参考訳) 3次元ニューロン分割のための連続リカレントニューラルネットワーク

Consistent Recurrent Neural Networks for 3D Neuron Segmentation ( http://arxiv.org/abs/2102.01021v1 )

ライセンス: CC BY 4.0

Felix Gonda, Donglai Wei, Hanspeter Pfister

(参考訳) 時空間整合性のある画像中の各物体の2次元マスクを逐次生成するニューロンの3次元再構成のための再帰的ネットワークを提案する。ネットワークは2つの部分で一貫性をモデル化する: (i) 局所性により、非排他的および時間的に隣接したオブジェクト関係と双方向の繰り返しを探索することができる。 (ii) 非ローカルで、スキップ接続で時間領域内の長距離オブジェクト関係を探索することができる。提案するネットワークは、入力画像からオブジェクトマスクのシーケンスまで、エンドツーエンドでトレーニング可能であり、オブジェクト境界に依存する手法と比較して、その出力は後処理を必要としない。本手法は, SNEMI3Dチャレンジにおいて, ニューロンセグメンテーションの3つのベンチマークを用いて評価し, 最新の性能を達成した。

We present a recurrent network for the 3D reconstruction of neurons that sequentially generates binary masks for every object in an image with spatio-temporal consistency. Our network models consistency in two parts: (i) local, which allows exploring non-occluding and temporally adjacent object relationships with bi-directional recurrence, and (ii) non-local, which allows exploring long-range object relationships in the temporal domain with skip connections. Our proposed network is end-to-end trainable from an input image to a sequence of object masks, and, compared to methods relying on object boundaries, its output does not require post-processing. We evaluate our method on three benchmarks for neuron segmentation and achieve state-of-the-art performance on the SNEMI3D challenge.

翻訳日:2021-02-03 01:23:32 公開日:2021-02-01

# (参考訳) seq2seq学習を用いたテキスト対ハッシュ生成

Text-to-hashtag Generation using Seq2seq Learning ( http://arxiv.org/abs/2102.00904v1 )

ライセンス: CC BY 4.0

Augusto Camargo, Wesley Carvalho, Felipe Peressim

(参考訳) 本論文では、BiLSTMとBERTに基づくモデルがブラジルのポルトガル語でハッシュタグを生成し、Eコマースのウェブサイトで使用できるかどうかを検討した。我々はEコマースレビューのコーパスと商品のタイトルを入力として処理し、ハッシュタグを出力として生成した。 NIST,BLEU,METEOR,クラウドソーシングスコアの4つの定量値を用いて評価を行った。 Word Cloudは定性メトリックとして使用された。すべてのコンピュータ測定値(NIST、BLEU、METEOR)が悪い結果を示したのに加えて、クラウドソースは素晴らしいスコアを示した。我々は、ニューラルネットワークによって生成されたテキストが、Eコマースのウェブサイトで製品のハッシュタグとして使われることを非常に約束していると結論付けた。この作業のコードはhttps://github.com/augustocamargo/text-to-hashtagで入手できる。

In this paper, we study whether models based on BiLSTM and BERT can generate hashtags in Brazilian Portuguese that can be used on e-commerce websites. We processed a corpus of e-commerce reviews and product titles as inputs and generated hashtags as outputs. We evaluated the results using four quantitative metrics: NIST, BLEU, METEOR and a crowdsourced score. A word cloud was used as a qualitative metric. Although all automatically computed metrics (NIST, BLEU and METEOR) showed poor results, the crowdsourced score was very high. We conclude that the texts generated by the neural networks are very promising for use as product hashtags on e-commerce websites [1]. The code for this work is available at https://github.com/augustocamargo/text-to-hashtag

翻訳日:2021-02-03 00:49:21 公開日:2021-02-01

# (参考訳) オフライン強化学習のための近世界ベンチマーク

Near Real-World Benchmarks for Offline Reinforcement Learning ( http://arxiv.org/abs/2102.00714v1 )

ライセンス: CC BY 4.0

Rongjun Qin, Songyi Gao, Xingyuan Zhang, Zhen Xu, Shengkai Huang, Zewen Li, Weinan Zhang, Yang Yu

(参考訳) オフライン強化学習(rl)は、トレーニング中の環境との余分なインタラクションなしに、収集したデータのバッチから最適なポリシーを学ぶことを目的としている。オフラインRLは環境における有害な実行を緩和しようとするため、RLアプリケーションの範囲を大きく広げることになる。しかし、現在のオフラインRLベンチマークは一般的に大きな現実的なギャップがある。それらは、非常に探索的なポリシーによって収集された大きなデータセットを含み、訓練されたポリシーは、環境内で直接評価されます。一方、現実の状況では、高度に探索的なポリシーを実行することは、システムの安全性を確保するために禁止され、データは一般的に非常に制限され、トレーニングされたポリシーは、デプロイ前に適切に検証されるべきである。本稿では,近世界のベンチマークであるNewRLについて述べる。 NewRLには、ポリシー検証のために制御されたサイズと追加のテストデータセットを備えたさまざまなドメインのデータセットが含まれています。既存のオフラインRLアルゴリズムをNewRL上で評価する。実験では、データセット報酬の代わりに、ポリシーのパフォーマンスも行動ポリシーの決定論的なバージョンと比較されるべきであると主張します。決定論的行動ポリシーは実際のシナリオのベースラインであるため、データセットはパフォーマンスを低下させる可能性のあるアクション摂動で収集されることが多い。実験結果から,テスト済みのオフラインRLアルゴリズムは,上記の多くのデータセットに対する決定論的ポリシと競合するだけであり,オフラインポリシ評価がほとんど役に立たないことが示された。 NewRL スーツは http://polixir.ai/research/newrl で見ることができる。この研究が研究に光を当て、現実世界のシステムにRLをデプロイする際にもっと注目されることを願っています。

Offline reinforcement learning (RL) aims at learning an optimal policy from a batch of collected data, without extra interactions with the environment during training. Offline RL attempts to alleviate hazardous executions in environments, and thus it will greatly broaden the scope of RL applications. However, current offline RL benchmarks commonly have a large reality gap. They involve large datasets collected by highly exploratory policies, and a trained policy is directly evaluated in the environment. Meanwhile, in real-world situations, running a highly exploratory policy is prohibited to ensure system safety, the data is commonly very limited, and a trained policy should be well validated before deployment. In this paper, we present a suite of near real-world benchmarks, NewRL. NewRL contains datasets from various domains with controlled sizes and extra test datasets for the purpose of policy validation. We then evaluate existing offline RL algorithms on NewRL. In the experiments, we argue that the performance of a policy should also be compared with the deterministic version of the behavior policy, instead of the dataset reward, because the deterministic behavior policy is the baseline in real scenarios, while the dataset is often collected with action perturbations that can degrade the performance. The empirical results demonstrate that the tested offline RL algorithms are only competitive with the above deterministic policy on many datasets, and that offline policy evaluation hardly helps. The NewRL suite can be found at http://polixir.ai/research/newrl. We hope this work will shed some light on research and draw more attention when deploying RL in real-world systems.

翻訳日:2021-02-03 00:03:49 公開日:2021-02-01

# (参考訳) マルチエージェントDeep Reinforcement Learningを用いたmmWave MU-MISOシステムのハイブリッドビームフォーミング

Hybrid Beamforming for mmWave MU-MISO Systems Exploiting Multi-agent Deep Reinforcement Learning ( http://arxiv.org/abs/2102.00735v1 )

ライセンス: CC BY 4.0

Qisheng Wang, Xiao Li, Shi Jin, and Yijiain Chen

(参考訳) 本書では、ミリ波(mmWave)マルチユーザ(MU)マルチインプットシングル出力(MISO)システムのための深層補強学習(DRL)に基づくハイブリッドビームフォーミングについて検討する。 DRLの探索効率問題を解決するためにマルチエージェントDRL法を提案する。提案手法では,優先されたリプレイバッファとより情報的な報酬を適用し,コンバージェンスを高速化する。シミュレーションの結果,提案アーキテクチャはベンチマークよりもスペクトル効率が高く,時間消費の少ないため,実用化に適していることがわかった。

In this letter, we investigate hybrid beamforming based on deep reinforcement learning (DRL) for millimeter-wave (mmWave) multi-user (MU) multiple-input single-output (MISO) systems. A multi-agent DRL method is proposed to solve the exploration efficiency problem in DRL. In the proposed method, a prioritized replay buffer and a more informative reward are applied to accelerate convergence. Simulation results show that the proposed architecture achieves higher spectral efficiency and lower time consumption than the benchmarks, and is thus more suitable for practical applications.

翻訳日:2021-02-02 22:12:05 公開日:2021-02-01

# (参考訳) 知識蒸留のためのソフトラベルの再考:バイアス分散トレードオフの視点

Rethinking Soft Labels for Knowledge Distillation: A Bias-Variance Tradeoff Perspective ( http://arxiv.org/abs/2102.00650v1 )

ライセンス: CC BY 4.0

Helong Zhou, Liangchen Song, Jiajie Chen, Ye Zhou, Guoli Wang, Junsong Yuan, Qian Zhang

(参考訳) 知識蒸留は、よく訓練されたネットワークまたはそれらのアンサンブルを利用して、学生ネットワークのトレーニングを指導するための効果的なアプローチである。教師ネットワークからの出力は、新しいネットワークのトレーニングを監督するためのソフトラベルとして使用される。最近の研究では、ソフトラベルの興味をそそる性質が示され、ラベルをソフトにすることは学生ネットワークにとって良い正規化となる。統計的学習の観点から、正規化はばらつきを減らすことを目指していますが、ソフトラベルによるトレーニングではバイアスとばらつきの変化が明確ではありません。本稿では,ソフトラベル蒸留によるバイアス分散トレードオフについて検討する。具体的には、トレーニング中のバイアス分散トレードオフがサンプルごとに異なることを観察する。さらに、同じ蒸留温度設定下では、蒸留性能がいくつかの特定のサンプルの数に負の関連していることを観察します。しかし, 正則化試料を完全にろ過しても蒸留性能は低下する。私たちの発見は、ネットワークがサンプルワイズバイアス分散トレードオフを適応的に処理するのに役立つ、新しい重み付きソフトラベルを提案しました。本手法の有効性を検証するための標準評価ベンチマーク実験を行った。コードは \url{https://github.com/bellymonster/Weighted-Soft-Label-Distillation} で入手できます。

Knowledge distillation is an effective approach to leverage a well-trained network or an ensemble of them, referred to as the teacher, to guide the training of a student network. The outputs from the teacher network are used as soft labels for supervising the training of a new network. Recent studies \citep{muller2019does,yuan2020revisiting} revealed an intriguing property of soft labels: making labels soft serves as a good regularization for the student network. From the perspective of statistical learning, regularization aims to reduce the variance; however, how bias and variance change when training with soft labels is not clear. In this paper, we investigate the bias-variance tradeoff brought by distillation with soft labels. Specifically, we observe that during training the bias-variance tradeoff varies sample-wise. Further, under the same distillation temperature setting, we observe that the distillation performance is negatively associated with the number of some specific samples, which we call regularization samples because they increase bias and decrease variance. Nevertheless, we empirically find that completely filtering out regularization samples also deteriorates distillation performance. These findings inspired us to propose weighted soft labels, which help the network adaptively handle the sample-wise bias-variance tradeoff. Experiments on standard evaluation benchmarks validate the effectiveness of our method. Our code is available at \url{https://github.com/bellymonster/Weighted-Soft-Label-Distillation}.
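As a rough illustration of distillation with soft labels and per-sample weights, the sketch below computes a temperature-softened KL divergence between teacher and student outputs and averages it with sample weights. The abstract does not specify how the weights are derived, so `weights` is a hypothetical placeholder here, and the temperature of 4.0, the function names, and the omission of any temperature-squared scaling are assumptions rather than the authors' settings.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax of `logits` divided by `temperature`."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def soft_label_loss(student_logits, teacher_logits, temperature=4.0):
    """KL(teacher || student) on temperature-softened distributions, one sample."""
    p = softmax(teacher_logits, temperature)  # soft labels from the teacher
    q = softmax(student_logits, temperature)  # student prediction
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def weighted_distillation_loss(batch, weights, temperature=4.0):
    """Weighted mean of per-sample losses.

    `batch` is a list of (student_logits, teacher_logits) pairs and
    `weights` holds one hypothetical non-negative weight per sample.
    """
    weighted = sum(
        w * soft_label_loss(s, t, temperature) for (s, t), w in zip(batch, weights)
    )
    return weighted / sum(weights)
```

Setting every weight to 1 recovers the plain mean of the per-sample soft-label losses, which makes it easy to compare a weighting scheme against unweighted distillation.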

翻訳日:2021-02-02 20:37:44 公開日:2021-02-01

# (参考訳) 学習水型脱感作表現による水中画像強調

Underwater Image Enhancement via Learning Water Type Desensitized Representations ( http://arxiv.org/abs/2102.00676v1 )

ライセンス: CC BY 4.0

Zhenqi Fu, Xiaopeng Lin, Wu Wang, Yue Huang, and Xinghao Ding

(参考訳) 水中での応用では、光吸収と散乱の影響は画像劣化をもたらす。さらに、複雑で変更可能なイメージング環境は、水タイプの多様性に対処するための普遍的な強化ソリューションを提供することを困難にします。本稿では,これらの課題に対処するため,SCNetと呼ばれる新しい水中画像強調(UIE)フレームワークを提案する。 SCNetは、水型脱感作機能を学ぶ重要なアイデアで、空間とチャネルの両方の寸法にわたる正規化スキームに基づいています。劣化の多様性は画素間の強い相関に主に根ざしており、ミニバッチにおける各インスタンスの空間的次元にわたるアクティベーションの非相関化にホワイトニングを適用する。また,チャネル間のアクティベーションの最初の2つのモーメントを標準化し再注入することで,チャネルワイズ相関を解消する。空間的およびチャネル次元の正規化スキームは、U-Netの各スケールで実行され、マルチスケール表現を得る。このような潜時符号化により、デコーダはクリーン信号を容易に再構成でき、水による歪みタイプの影響を受けない。 2つの実世界のUIEデータセットによる実験結果から,提案手法は多様な水型で画像の強化に成功し,視覚的品質改善の競争性能が向上することが示された。

For underwater applications, the effects of light absorption and scattering result in image degradation. Moreover, the complex and changeable imaging environment makes it difficult to provide a universal enhancement solution to cope with the diversity of water types. In this letter, we present a novel underwater image enhancement (UIE) framework termed SCNet to address the above issues. SCNet is based on normalization schemes across both spatial and channel dimensions with the key idea of learning water type desensitized features. Considering that the diversity of degradation is mainly rooted in the strong correlation among pixels, we apply whitening to de-correlate activations across spatial dimensions for each instance in a mini-batch. We also eliminate channel-wise correlation by standardizing and re-injecting the first two moments of the activations across channels. The normalization schemes of spatial and channel dimensions are performed at each scale of the U-Net to obtain multi-scale representations. With such latent encodings, the decoder can easily reconstruct the clean signal, unaffected by the distortion types caused by the water. Experimental results on two real-world UIE datasets show that the proposed approach can successfully enhance images with diverse water types, and achieves competitive performance in visual quality improvement.

翻訳日:2021-02-02 20:21:38 公開日:2021-02-01

# (参考訳) 天空画像からの深層学習照度予測モデルのベンチマーク -詳細な分析-

Benchmarking of Deep Learning Irradiance Forecasting Models from Sky Images -- an in-depth Analysis ( http://arxiv.org/abs/2102.00721v1 )

ライセンス: CC BY 4.0

Quentin Paletta, Guillaume Arbod and Joan Lasenby

(参考訳) スマートグリッド、発電所の運用、ハイブリッドシステム管理、エネルギー取引など多くの産業応用は、ソーラーパネルからの断続的なエネルギー生産に対応するため、短期的な太陽予報の改善の恩恵を受ける可能性がある。しかし、現在の雲を空からモデル化するアプローチでは、雲の空間的配置、時間的ダイナミクス、太陽放射との物理的相互作用に関する精度が不足している。大規模データセットの増加によって、これらの制限に対処するためにデータ駆動メソッドが開発され、有望な結果が得られた。本研究では、半球空画像と外生変数のシーケンスから太陽光照射を予測するために訓練された4つのDeep Learningアーキテクチャを比較した。各モデルの相対的なパフォーマンスを評価するために、スマート永続化モデルに基づく予測スキルメトリックと、ランプと時間の歪みメトリックを使用しました。その結果、天空画像列の時空間的側面のエンコーディングは、試験年度の予測スキルが20.4%に達したことにより、予測を大幅に改善した。しかし、実験データに基づいて、Deep Learningモデルは共通の設定で「非常にスマートな永続化モデル」として振る舞う傾向にあり、持続モデルと時間的に一致し、最もペナリングなエラーを緩和する傾向にあると結論付けている。したがって、スカイカメラで捉えられたにもかかわらず、モデルはしばしば太陽を遮る雲のような大きな照度変化を引き起こす基本的な事象を見逃す。反応性から予測性まで、このアプローチの放射能予測への移行に貢献できることを願っています。

A number of industrial applications, such as smart grids, power plant operation, hybrid system management or energy trading, could benefit from improved short-term solar forecasting, addressing the intermittent energy production from solar panels. However, current approaches to modelling the cloud cover dynamics from sky images still lack precision regarding the spatial configuration of clouds, their temporal dynamics and physical interactions with solar radiation. Benefiting from a growing number of large datasets, data driven methods are being developed to address these limitations with promising results. In this study, we compare four commonly used Deep Learning architectures trained to forecast solar irradiance from sequences of hemispherical sky images and exogenous variables. To assess the relative performance of each model, we used the Forecast Skill metric based on the smart persistence model, as well as ramp and time distortion metrics. The results show that encoding spatiotemporal aspects of the sequence of sky images greatly improved the predictions with 10 min ahead Forecast Skill reaching 20.4% on the test year. However, based on the experimental data, we conclude that, with a common setup, Deep Learning models tend to behave just as a `very smart persistence model', temporally aligned with the persistence model while mitigating its most penalising errors. Thus, despite being captured by the sky cameras, models often miss fundamental events causing large irradiance changes such as clouds obscuring the sun. We hope that our work will contribute to a shift of this approach to irradiance forecasting, from reactive to anticipatory.
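The Forecast Skill mentioned above compares a model's error with that of a reference forecast. The abstract does not state the formula, so the following is a minimal sketch assuming the commonly used definition, one minus the ratio of the model RMSE to the reference RMSE. The smart persistence forecasts are taken as given inputs, and all names and numbers are illustrative.

```python
import math


def rmse(predictions, observations):
    """Root mean squared error between two equally long sequences."""
    squared = [(p - o) ** 2 for p, o in zip(predictions, observations)]
    return math.sqrt(sum(squared) / len(squared))


def forecast_skill(model_pred, reference_pred, observations):
    """Skill of a model relative to a reference forecast.

    Assumed definition: 1 - RMSE(model) / RMSE(reference).
    0 means no better than the reference, 1 means a perfect forecast,
    and negative values mean the model is worse than the reference.
    """
    return 1.0 - rmse(model_pred, observations) / rmse(reference_pred, observations)


if __name__ == "__main__":
    observed = [100.0, 200.0, 300.0]         # made-up irradiance values
    persistence = [110.0, 190.0, 310.0]      # made-up reference forecasts
    model = [108.0, 192.0, 308.0]            # made-up model forecasts
    print(forecast_skill(model, persistence, observed))
```

Under this definition, the 20.4% figure quoted in the abstract would correspond to a model error roughly one fifth lower than that of the smart persistence reference.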

翻訳日:2021-02-02 20:12:08 公開日:2021-02-01

# (参考訳) 分類マージンによる騒音ラベルのコンバット学習

Learning to Combat Noisy Labels via Classification Margins ( http://arxiv.org/abs/2102.00751v1 )

ライセンス: CC BY 4.0

Jason Z. Lin and Jelena Bradic

(参考訳) ノイズの多いラベルでトレーニングされたディープニューラルネットワークは、ノイズの多いものからクリーンなインスタンスを識別する能力が急速に失われることが知られている。早期学習フェーズが終了した後、ネットワークは騒々しいインスタンスを記憶し、一般化パフォーマンスの低下につながります。この問題を解決するため、マーベル(MARgins Via Early Learning)を提案し、分類のマージンの画期的な歴史を維持しながら、あらゆるインスタンスの「適合性」を追跡します。連続する負のマージンに基づいて、重みをゼロにすることで、疑わしいノイズを排除した。さらに、MARVEL+のアップウェイトは、ネットワークが分類境界のよりニュアンスな表現を学習できるようにする。合成ラベルノイズを用いたベンチマーク実験の結果,MARVELは非対称雑音下でのマージンが著しく大きいため,他のベースラインよりも高い性能を示した。

A deep neural network trained on noisy labels is known to quickly lose its power to discriminate clean instances from noisy ones. After the early learning phase has ended, the network memorizes the noisy instances, which leads to a degradation in generalization performance. To resolve this issue, we propose MARVEL (MARgins Via Early Learning), where we track the goodness of "fit" for every instance by maintaining an epoch-history of its classification margins. Based on consecutive negative margins, we discard suspected noisy instances by zeroing out their weights. In addition, MARVEL+ upweights arduous instances enabling the network to learn a more nuanced representation of the classification boundary. Experimental results on benchmark datasets with synthetic label noise show that MARVEL outperforms other baselines consistently across different noise levels, with a significantly larger margin under asymmetric noise.
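The margin-tracking idea can be sketched as follows. This is an illustrative reading of the abstract, not the authors' implementation: the margin is assumed to be the logit of the labeled class minus the largest other logit, and the `patience` of 3 consecutive negative margins is a hypothetical threshold.

```python
def classification_margin(logits, label):
    """Logit of the labeled class minus the largest logit of any other class."""
    others = [z for i, z in enumerate(logits) if i != label]
    return logits[label] - max(others)


def update_weights(margin_history, patience=3):
    """Return one weight per instance from its per-epoch margin history.

    An instance gets weight 0.0 when its last `patience` margins are all
    negative (suspected noisy label), and 1.0 otherwise.
    """
    weights = []
    for history in margin_history:
        recent = history[-patience:]
        suspected_noisy = len(recent) == patience and all(m < 0 for m in recent)
        weights.append(0.0 if suspected_noisy else 1.0)
    return weights


if __name__ == "__main__":
    history = [
        [0.3, -0.1, -0.2, -0.4],   # negative for the last three epochs
        [-0.5, 0.2, -0.1, -0.3],   # only two consecutive negatives
    ]
    print(update_weights(history))
```

In a training loop, the returned weights would multiply each instance's loss term. Whether a discarded instance can later be restored is not stated in the abstract and is left open here.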

翻訳日:2021-02-02 20:08:58 公開日:2021-02-01

# (参考訳) Zen-NAS: 高性能ディープ画像認識のためのゼロショットNAS

Zen-NAS: A Zero-Shot NAS for High-Performance Deep Image Recognition ( http://arxiv.org/abs/2102.01063v1 )

ライセンス: CC BY 4.0

Ming Lin, Pichao Wang, Zhenhong Sun, Hesen Chen, Xiuyu Sun, Qi Qian, Hao Li, Rong Jin

(参考訳) Neural Architecture Search (NAS) の重要なコンポーネントは、クエリされたアーキテクチャの精度を主張する精度予測器である。高品質な精度予測器を構築するために、従来のNASアルゴリズムは大量のアーキテクチャや大きなスーパーネットを訓練する。このステップは数百から数千のGPU日を消費することが多く、総検索コストの大部分を占める。そこで本研究では,精度予測器をZen-scoreという新しいモデル複雑性指標に置き換えることを提案する。モデルの精度を予測する代わりに、Zen-scoreはパラメータを訓練せずにネットワークのモデルの複雑さを直接主張します。これは、ネットワークのモデル複雑さがターゲットデータセットの精度と正に相関していることを示すディープラーニング理論の最近の進歩にインスパイアされている。 Zen-score の計算は、ランダムなガウス入力を用いてランダムに初期化されたネットワークで数回の順伝播を行うだけでよい。これは、Vanilla Convolutional Neural Networks(VCN-networks)または互換の亜種に適用でき、現実世界のアプリケーションで人気のあるネットワークの大部分をカバーする。 Zen-scoreと進化的アルゴリズムを組み合わせると、Zen-NASという新しいZero-Shot NASアルゴリズムが得られる。 CIFAR10/CIFAR100とImageNetについて広範な実験を行った。要約すると、Zen-NASは半GPU日(12GPU時間)未満で高性能アーキテクチャを設計することができる。結果として得られるZenNetsという名前のネットワークは、ImageNet上で最大 $83.0\%$ のtop-1精度を達成する。同等以上の精度のEfficientNets-B3/B5と比較して、ZenNetsはNVIDIA V100で最大 $5.6$ 倍、NVIDIA T4で $11$ 倍、Google Pixel2で $2.6$ 倍高速であり、FLOPsは $50\%$ 少ない。ソースコードと事前トレーニング済みモデルはhttps://github.com/idstcv/zennasでリリースしています。

A key component in Neural Architecture Search (NAS) is an accuracy predictor which asserts the accuracy of a queried architecture. To build a high quality accuracy predictor, conventional NAS algorithms rely on training a mass of architectures or a big supernet. This step often consumes hundreds to thousands of GPU days, dominating the total search cost. To address this issue, we propose to replace the accuracy predictor with a novel model-complexity index named Zen-score. Instead of predicting model accuracy, Zen-score directly asserts the model complexity of a network without training its parameters. This is inspired by recent advances in deep learning theories which show that the model complexity of a network positively correlates with its accuracy on the target dataset. The computation of Zen-score only takes a few forward inferences through a randomly initialized network using random Gaussian input. It is applicable to any Vanilla Convolutional Neural Networks (VCN-networks) or compatible variants, covering a majority of networks popular in real-world applications. When combining Zen-score with an Evolutionary Algorithm, we obtain a novel Zero-Shot NAS algorithm named Zen-NAS. We conduct extensive experiments on CIFAR10/CIFAR100 and ImageNet. In summary, Zen-NAS is able to design high performance architectures in less than half a GPU day (12 GPU hours). The resultant networks, named ZenNets, achieve up to $83.0\%$ top-1 accuracy on ImageNet. Compared to EfficientNets-B3/B5 of the same or better accuracies, ZenNets are up to $5.6$ times faster on NVIDIA V100, $11$ times faster on NVIDIA T4, $2.6$ times faster on Google Pixel2 and use $50\%$ fewer FLOPs. Our source code and pre-trained models are released on https://github.com/idstcv/ZenNAS.

翻訳日:2021-02-02 19:52:26 公開日:2021-02-01

# (参考訳) 行列因子化のリーマン的展望

Riemannian Perspective on Matrix Factorization ( http://arxiv.org/abs/2102.00937v1 )

ライセンス: CC BY 4.0

Kwangjun Ahn, Felipe Suarez

(参考訳) リーマン幾何学による行列完備に対する非凸行列分解法の研究を行う。グラスマン多様体上の最適化定式化に基づき、部分空間間の主角の概念に基づいて風景を特徴づける。完全に観察された場合、我々は、コストが測地的に凸である領域と、すべての臨界点が厳密なサドルである領域が存在することを示した。本研究では, 部分観察例を経験的に検討した。

We study the non-convex matrix factorization approach to matrix completion via Riemannian geometry. Based on an optimization formulation over a Grassmannian manifold, we characterize the landscape based on the notion of principal angles between subspaces. For the fully observed case, our results show that there is a region in which the cost is geodesically convex, and outside of which all critical points are strictly saddle. We empirically study the partially observed case based on our findings.

翻訳日:2021-02-02 19:18:30 公開日:2021-02-01

# (参考訳) ConvNets for Counting: Object Detection of Transient Phenomena in Steelpan Drums

ConvNets for Counting: Object Detection of Transient Phenomena in Steelpan Drums ( http://arxiv.org/abs/2102.00632v1 )

ライセンス: CC BY 4.0

Scott H. Hawley and Andrew C. Morrison

(参考訳) 電子スペックルパターン干渉計(ESPI)で照らされたカリブ海のスチールパンドラムの過渡振動の高速ビデオ記録のフレームに見る楕円反ノード領域の干渉縞を数えるために、畳み込みニューラルネットワークで構築された物体検出器を訓練する。本モデルで提案するアノテーション「SPNet」は,交感神経振動モードの発達を追跡することで,ドラムの時間依存行動の理解に寄与することを目的としている。このシステムは、Zooniverse Steelpan vibrations Projectから得られたクラウドソーシングされた人間の注釈付き画像のデータセットで訓練される。また,人間のアノテート画像が比較的少ないため,視覚特性が実際の画像と一致した合成画像のコーパスを生成的逆ネットワークを用いて学習し,スタイル転送を行う。何千ものラベルのないビデオフレームの注釈を予測するためにモデルを適用することで、同じドラムストライクのオーディオ記録と一致する特徴を追跡し、振動を測定することができる。 1つの驚くべき結果として、機械注釈付きビデオフレームは、オーディオ録音におけるそのような遷移に大きく先行する第1と第2の高調波の遷移を明らかにする。本稿では,主にモデルの開発について述べるので,さらなる応用が期待できる。

We train an object detector built from convolutional neural networks to count interference fringes in elliptical antinode regions visible in frames of high-speed video recordings of transient oscillations in Caribbean steelpan drums illuminated by electronic speckle pattern interferometry (ESPI). The annotations provided by our model, "SPNet", are intended to contribute to the understanding of time-dependent behavior in such drums by tracking the development of sympathetic vibration modes. The system is trained on a dataset of crowdsourced human-annotated images obtained from the Zooniverse Steelpan Vibrations Project. Due to the relatively small number of human-annotated images, we also train on a large corpus of synthetic images whose visual properties have been matched to those of the real images by using a Generative Adversarial Network to perform style transfer. Applying the model to predict annotations of thousands of unlabeled video frames, we can track features and measure oscillations consistent with audio recordings of the same drum strikes. One surprising result is that the machine-annotated video frames reveal transitions between the first and second harmonics of drum notes that significantly precede such transitions present in the audio recordings. As this paper primarily concerns the development of the model, deeper physical insights await its further application.

翻訳日:2021-02-02 17:46:33 公開日:2021-02-01

# (参考訳) Densely Connected Recurrent Residual (Dense R2UNet) Convolutional Neural Network for Segmentation of Lung CT Images

Densely Connected Recurrent Residual (Dense R2UNet) Convolutional Neural Network for Segmentation of Lung CT Images ( http://arxiv.org/abs/2102.00663v1 )

ライセンス: CC BY 4.0

Kaushik Dutta

(参考訳) ディープラーニングネットワークは、セマンティックセグメンテーションのためのアートパフォーマンスの状態を提供するものとして確立されている。これらの技術は医学の検出、区分および分類に特に適用されます。 U-Netベースのアーキテクチャの出現は、このアプリケーションで特に人気がある。本稿では、U-Netモデルアーキテクチャに基づくRecurrent CNN, Residual Network, Dense Convolutional Networkの合成であるDense Recurrent Residual Convolutional Neural Network(Dense R2U CNN)について述べる。残留ユニットはより深いネットワークを訓練するのを助け、密な繰り返し層はセグメンテーションに必要な機能伝搬を強化する。ベンチマークLung Lesionデータセットでテストされた提案モデルは、同等のモデルよりもセグメンテーションタスクのパフォーマンスが向上した。

Deep learning networks have established themselves as providing state-of-the-art performance for semantic segmentation. These techniques are widely applied to medical detection, segmentation and classification. U-Net based architectures have become particularly popular for this application. In this paper we present the Dense Recurrent Residual Convolutional Neural Network (Dense R2U CNN), which is a synthesis of the Recurrent CNN, Residual Network and Dense Convolutional Network based on the U-Net model architecture. The residual unit helps train deeper networks, while the dense recurrent layers enhance the feature propagation needed for segmentation. The proposed model, tested on the benchmark Lung Lesion dataset, showed better performance on segmentation tasks than its equivalent models.

翻訳日:2021-02-02 17:21:11 公開日:2021-02-01

# (参考訳) 高速ラジアルマルチコイル2次元シネmr画像再構成のためのエンド・ツー・エンド訓練型反復ネットワークアーキテクチャ

An End-To-End-Trainable Iterative Network Architecture for Accelerated Radial Multi-Coil 2D Cine MR Image Reconstruction ( http://arxiv.org/abs/2102.00783v1 )

ライセンス: CC BY 4.0

Andreas Kofler, Markus Haltmeier, Tobias Schaeffter and Christoph Kolbitsch

(参考訳) 目的: アンロールされた学習反復スキームに類似した反復畳み込みニューラルネットワーク (CNN) は, 画像再構成問題に対して, 様々な画像モダリティにわたって常に最先端の結果をもたらすことが示されている。しかし、これらの手法はアーキテクチャにフォワードモデルを含むため、比較的小さな再構成問題や計算コストの低い演算子の問題に適用が制限されることが多い。その結果, 動的非デカルトマルチコイル再構成問題にはこれまで適用されていない。方法: 本研究では,複数の受信コイルを有する加速型2次元ラジアルシネMRIの画像再構成のためのCNNアーキテクチャを提案する。このネットワークは、計算的に軽量なCNNコンポーネントと、それに続く共役勾配(CG)法に基づいており、効率的なトレーニング戦略を使用してエンドツーエンドで共同トレーニングできる。提案した訓練戦略を検討し,学習型および非学習型の正則化を用いた他のよく知られた再構成手法と比較した。結果: 提案手法は非学習型正則化に基づく他のすべての手法よりも優れていた。さらに、3D U-Netを用いたCNNベースの手法および適応辞書学習を用いた手法と同等以上の性能を示す。また,反復のみを用いてネットワークをトレーニングしても,テスト時にネットワークの長さを増加させ,結果をさらに改善できることを実証的に示す。結論: エンドツーエンドのトレーニングにより、再構成ネットワークのトレーニング可能なパラメータの数を大幅に削減し、ネットワークを安定化できる。さらに、テスト時にネットワークの長さを変更することができるため、CNNブロックの複雑さと各CGブロックの反復数との間で妥協点を見つける必要がなくなる。

Purpose: Iterative Convolutional Neural Networks (CNNs) which resemble unrolled learned iterative schemes have been shown to consistently deliver state-of-the-art results for image reconstruction problems across different imaging modalities. However, because these methods include the forward model in the architecture, their applicability is often restricted to either relatively small reconstruction problems or to problems with operators which are computationally cheap to compute. As a consequence, they have so far not been applied to dynamic non-Cartesian multi-coil reconstruction problems. Methods: In this work, we propose a CNN architecture for image reconstruction of accelerated 2D radial cine MRI with multiple receiver coils. The network is based on a computationally light CNN component and a subsequent conjugate gradient (CG) method which can be jointly trained end-to-end using an efficient training strategy. We investigate the proposed training strategy and compare our method to other well-known reconstruction techniques with learned and non-learned regularization methods. Results: Our proposed method outperforms all other methods based on non-learned regularization. Further, it performs similarly to or better than a CNN-based method employing a 3D U-Net and a method using adaptive dictionary learning. In addition, we empirically demonstrate that even by training the network with only iteration, it is possible to increase the length of the network at test time and further improve the results. Conclusions: End-to-end training makes it possible to greatly reduce the number of trainable parameters of the reconstruction network and to stabilize it. Further, because it is possible to change the length of the network at test time, the need to find a compromise between the complexity of the CNN-block and the number of iterations in each CG-block becomes irrelevant.

翻訳日:2021-02-02 17:14:59 公開日:2021-02-01

# (参考訳) 重症デング患者の肺超音波ビデオにおけるB線の自動検出

Automatic Detection of B-lines in Lung Ultrasound Videos From Severe Dengue Patients ( http://arxiv.org/abs/2102.01059v1 )

ライセンス: CC BY 4.0

Hamideh Kerdegari, Phung Tran Huy Nhat, Angela McBride, VITAL Consortium, Reza Razavi, Nguyen Van Hao, Louise Thwaites, Sophie Yacoub, Alberto Gomez

(参考訳) 肺超音波(LUS)イメージングは、様々な疾患による肺への液漏れによるB線アーチファクトの存在を含む肺の異常を評価するために用いられる。しかし、これらのアーティファクトの手動検出は困難です。本論文では,弱ラベルを用いた深層ニューラルネットワークを用いて,LUS動画中のB線を自動的に検出・局在化するための新しい手法を提案する。そのために、畳み込みニューラルネットワーク(CNN)と、長期の短期メモリ(LSTM)ネットワークと時間的注意メカニズムを組み合わせています。 4つの異なるモデルが60人の患者のデータを用いて比較される。その結果,F1スコア0.81で1秒間クリップがB線を含むか否かを判断し,87.5%の精度でB線で代表フレームを抽出できることがわかった。

Lung ultrasound (LUS) imaging is used to assess lung abnormalities, including the presence of B-line artefacts due to fluid leakage into the lungs caused by a variety of diseases. However, manual detection of these artefacts is challenging. In this paper, we propose a novel methodology to automatically detect and localize B-lines in LUS videos using deep neural networks trained with weak labels. To this end, we combine a convolutional neural network (CNN) with a long short-term memory (LSTM) network and a temporal attention mechanism. Four different models are compared using data from 60 patients. Results show that our best model can determine whether one-second clips contain B-lines or not with an F1 score of 0.81, and extracts a representative frame with B-lines with an accuracy of 87.5%.

翻訳日:2021-02-02 17:13:41 公開日:2021-02-01
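
To illustrate how a temporal attention mechanism can both summarise a clip and point to one frame, here is a small sketch. It assumes per-frame feature vectors are already available (for example from a CNN and LSTM) and that the representative frame is simply the one with the largest attention weight; the scoring vector `w` and the function name are hypothetical and not taken from the paper.

```python
import numpy as np

def temporal_attention(frame_features, w):
    """Pool frame features with softmax attention.

    frame_features: array of shape (T, D), one row per video frame.
    w: array of shape (D,), a hypothetical learned scoring vector.
    Returns (clip_feature, weights, index_of_most_attended_frame).
    """
    scores = frame_features @ w
    scores = scores - scores.max()               # numerical stability
    weights = np.exp(scores) / np.exp(scores).sum()
    clip_feature = weights @ frame_features      # weighted average over time
    return clip_feature, weights, int(np.argmax(weights))
```

The clip-level feature could then feed a binary classifier, which is one way clip-level (weak) labels might suffice for training.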

# GTAE:言語制約付きテキストスタイル転送のためのグラフトランスフォーマーベースのオートエンコーダ

GTAE: Graph-Transformer based Auto-Encoders for Linguistic-Constrained Text Style Transfer ( http://arxiv.org/abs/2102.00769v1 )

ライセンス: Link先を確認

Yukai Shi, Sen Zhang, Chenxing Zhou, Xiaodan Liang, Xiaojun Yang, Liang Lin

(参考訳) 非並列テキストスタイル転送は近年研究の関心を集めている。エンコーダデコーダフレームワークに基づいてスタイルを転送することに成功したにもかかわらず、現在のアプローチは、主に大きな制約のないモデル空間または潜在的な埋め込みスペース上の単純すぎる仮定のために、元の文の内容とロジックを保存する能力がまだ欠けています。言語自体が特定の文法を持つ人間のインテリジェントな産物であり、その性質によってルールベースのモデル空間が制限されているため、この問題を緩和するためには、深いニューラルネットワークのモデル容量を人間の言語規則から本質的なモデル制約と照合する必要がある。そこで本稿では,グラフ変換器を用いたオートエンコーダ(GTAE)という手法を提案する。文を言語グラフとしてモデル化し,特徴抽出とスタイル転送をグラフレベルで行うことで,原文の内容と言語構造を最大に保持する。 3つの非並列テキストスタイルの転送タスクの定量的実験結果から,本モデルはコンテンツ保存における最先端の手法よりも優れており,転送精度と文自然性に匹敵する性能が得られた。

Non-parallel text style transfer has attracted increasing research interest in recent years. Despite successes in transferring the style based on the encoder-decoder framework, current approaches still lack the ability to preserve the content and even the logic of original sentences, mainly due to the large unconstrained model space or overly simplified assumptions on the latent embedding space. Since language itself is an intelligent product of humans with certain grammars and has a limited rule-based model space by its nature, relieving this problem requires reconciling the model capacity of deep neural networks with the intrinsic model constraints from human linguistic rules. To this end, we propose a method called Graph Transformer based Auto Encoder (GTAE), which models a sentence as a linguistic graph and performs feature extraction and style transfer at the graph level, to maximally retain the content and the linguistic structure of original sentences. Quantitative experiment results on three non-parallel text style transfer tasks show that our model outperforms state-of-the-art methods in content preservation, while achieving comparable performance on transfer accuracy and sentence naturalness.

翻訳日:2021-02-02 17:04:34 公開日:2021-02-01

# 半指導学習による中国語の多音障害

Polyphone Disambiguition in Mandarin Chinese with Semi-Supervised Learning ( http://arxiv.org/abs/2102.00621v1 )

ライセンス: Link先を確認

Yi Shi and Congyi Wang and Yu Chen and Bin Wang

(参考訳) 漢字の大部分は単音であり、発音は独特であり、チェックテーブルで簡単に発音することができる。それらに対して、ポリフォニック文字は複数の発音を持つ。中国語話者に関連する言語計算タスクを実行するには、その文脈に応じて、各ポリフォンの正しい発音を特定する必要があります。この処理はPolyphone Disambiguationと呼ばれ、中国のテキスト音声(TTS)システムのGrapheme-to-phoneme(G2P)変換ステップにおける重要な手順である。この問題は知識ベースのアプローチと学習ベースのアプローチの両方でよく研究されているが、公開データセットの欠如や、ポリフォンに関する複雑な言語現象のため、依然として難しい。本稿では,無ラベルテキストデータを利用する可能性のある中国語ポリホン不曖昧化のための半教師付き学習(ssl)フレームワークを提案する。エントロピー-thresholding やlexicon-based labeling など,様々なプロキシラベリング戦略の効果を検討する。アーキテクチャに関しては、Electraの事前トレーニングされたモデルとConvolution BLSTMレイヤーを組み合わせて、タスクを微調整します。定性的および定量的実験により,マンダリン中国語多音不明瞭度における最先端性能が得られた。さらに,ポリホンの曖昧化タスクに特化した新しいデータセットを公開し,さらなる研究を促進する。

The majority of Chinese characters are monophonic, i.e., their pronunciations are unique and thus can be determined easily using a lookup table. Their counterparts, polyphonic characters, have more than one pronunciation. To perform linguistic computation tasks related to spoken Mandarin Chinese, the correct pronunciation for each polyphone must be identified among several candidates according to its context. This process is called Polyphone Disambiguation, a key procedure in the Grapheme-to-phoneme (G2P) conversion step of a Chinese text-to-speech (TTS) system. The problem is well explored with both knowledge-based and learning-based approaches, yet it remains challenging due to the lack of publicly available datasets and the complex linguistic phenomena concerning polyphones. In this paper, we propose a novel semi-supervised learning (SSL) framework for Mandarin Chinese polyphone disambiguation that can potentially leverage unlimited unlabeled text data. We explore the effect of various proxy labeling strategies including entropy-thresholding and lexicon-based labeling. As for the architecture, a pre-trained model of Electra is combined with Convolution BLSTM layers to fine-tune on our task. Qualitative and quantitative experiments demonstrate that our method achieves state-of-the-art performance in Mandarin Chinese polyphone disambiguation. In addition, we publish a novel dataset specifically for the polyphone disambiguation task to promote further research.

翻訳日:2021-02-02 17:02:16 公開日:2021-02-01
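
Entropy-thresholding, one of the proxy labeling strategies named above, can be sketched as follows. This is an illustration under assumptions, not the paper's procedure: a model's predicted distribution over candidate pronunciations is accepted as a pseudo-label only when its entropy is low. The threshold value and the function names are hypothetical.

```python
import math

def entropy(probs):
    """Shannon entropy (natural log) of a probability distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0)

def proxy_label(probs, candidates, max_entropy=0.3):
    """Return a pseudo-label when the prediction is confident, else None.

    probs: predicted probabilities, aligned with `candidates`.
    max_entropy: hypothetical threshold; lower values accept fewer samples.
    """
    if entropy(probs) > max_entropy:
        return None                      # too uncertain: leave unlabeled
    best = max(range(len(probs)), key=probs.__getitem__)
    return candidates[best]

# The character 乐 can be read "le4" or "yue4"; the probabilities are made up.
print(proxy_label([0.98, 0.02], ["le4", "yue4"]))
print(proxy_label([0.50, 0.50], ["le4", "yue4"]))
```

Samples that receive a pseudo-label could then be added to the training set for the next round, which is the usual pattern in self-training.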

# 用語定義からの常識知識マイニング

Commonsense Knowledge Mining from Term Definitions ( http://arxiv.org/abs/2102.00651v1 )

ライセンス: Link先を確認

Zhicheng Liang and Deborah L. McGuinness

(参考訳) commonsenseの知識は、質問応答や自然言語理解など、さまざまな応用分野に有益であることが証明されている。以前の研究では、現在のcommonsense知識グラフをカバーするために、テキストから自動的に3倍のcommonsense知識を収集することを検討した。辞書用語定義をインプットとして,コモンセンスの知識トリプルをマイニングする機械学習手法をいくつか検討し,その初期評価を行った。まず,テキストから部分音声タグパターンを用いて3つの候補を抽出し,既存の3つのモデルの性能を比較した。私たちの実験では、用語定義には意味関係に対する正当かつ新しいコモンセンスの知識トリプルが含まれており、また既存のトリプルスコアリングモデルを使用する際の課題も示している。

Commonsense knowledge has proven to be beneficial to a variety of application areas, including question answering and natural language understanding. Previous work explored collecting commonsense knowledge triples automatically from text to increase the coverage of current commonsense knowledge graphs. We investigate a few machine learning approaches to mining commonsense knowledge triples using dictionary term definitions as inputs and provide some initial evaluation of the results. We start from extracting candidate triples using part-of-speech tag patterns from text, and then compare the performance of three existing models for triple scoring. Our experiments show that term definitions contain some valid and novel commonsense knowledge triples for some semantic relations, and also indicate some challenges with using existing triple scoring models.

翻訳日:2021-02-02 17:01:35 公開日:2021-02-01

# 多くの手が軽い仕事をする: 自動スコアのエッセイにエッセイの跡を使用する

Many Hands Make Light Work: Using Essay Traits to Automatically Score Essays ( http://arxiv.org/abs/2102.00781v1 )

ライセンス: Link先を確認

Rahul Kumar, Sandeep Mathias, Sriparna Saha, Pushpak Bhattacharyya

(参考訳) aeg(automatic essay grading)の分野におけるほとんどの研究は、エッセイの総合的なスコア付けに向けられているが、個々のエッセイの特徴をスコアリングする作業も行われている。本論文では,マルチタスク学習(MTL)手法を用いてエッセイを体系的に採点する方法について述べる。ここでは,エッセイを体系的に採点することが主課題であり,エッセイ特性を採点することが補助課題である。 LSTMとBiLSTMの両方を用いて,STL(Single-task Learning)アプローチとの比較を行った。また,補助作業の結果を他のaegシステムで実施したタスクと比較した。異なる種類のエッセイにどの特性が最適かを調べるために、エッセイのそれぞれの特徴に対してアブレーションテストを実施します。また、各システムのランタイムとトレーニングパラメータの数を報告します。 MTLをベースとしたBiLSTMシステムは,エッセイ特性の評価だけでなく,エッセイ特性の評価にも有効であることがわかった。

Most research in the area of automatic essay grading (AEG) is geared towards scoring the essay holistically, while there has also been some work done on scoring individual essay traits. In this paper, we describe a way to score essays holistically using a multi-task learning (MTL) approach, where scoring the essay holistically is the primary task, and scoring the essay traits is the auxiliary task. We compare our results with a single-task learning (STL) approach, using both LSTMs and BiLSTMs. We also compare our results of the auxiliary task with such tasks done in other AEG systems. To find out which traits work best for different types of essays, we conduct ablation tests for each of the essay traits. We also report the runtime and number of training parameters for each system. We find that the MTL-based BiLSTM system gives the best results for scoring the essay holistically, as well as performing well on scoring the essay traits.

翻訳日:2021-02-02 17:01:01 公開日:2021-02-01

# マルチファセットプロトタイプを用いたフェーショット画像分類

Few-shot Image Classification with Multi-Facet Prototypes ( http://arxiv.org/abs/2102.00801v1 )

ライセンス: Link先を確認

Kun Yan, Zied Bouraoui, Ping Wang, Shoaib Jameel, Steven Schockaert

(参考訳) 少数ショット学習(FSL)の目的は、少数のトレーニング例から画像カテゴリの認識方法を学ぶことである。中心となる課題は、利用可能なトレーニングサンプルは通常、考慮されたカテゴリの最も特徴的な視覚特徴を決定するために不十分であることだ。この課題に対処するため、これらの視覚的特徴をファセットに整理し、同じ種類の機能を直感的にグループ化する(例)。形状、色、または質感に関連する機能)。これは, (i) 各ファセットの重要性がカテゴリごとに異なる, (ii) カテゴリ名の事前学習された埋め込みからファセットの重要性を予測することができる,という仮定に基づく。特に,あるカテゴリの集合に対して,予測されたフェーレット重み付けに依存する適応的類似度尺度を提案する。この測度は、既存のメトリックベースメソッドの幅広い配列と組み合わせて使用できる。 miniImageNet と CUB の実験により,我々の手法は計量ベース FSL の最先端性の向上を図っている。

The aim of few-shot learning (FSL) is to learn how to recognize image categories from a small number of training examples. A central challenge is that the available training examples are normally insufficient to determine which visual features are most characteristic of the considered categories. To address this challenge, we organize these visual features into facets, which intuitively group features of the same kind (e.g. features that are relevant to shape, color, or texture). This is motivated by the assumptions that (i) the importance of each facet differs from category to category and (ii) it is possible to predict facet importance from a pre-trained embedding of the category names. In particular, we propose an adaptive similarity measure, relying on predicted facet importance weights for a given set of categories. This measure can be used in combination with a wide array of existing metric-based methods. Experiments on miniImageNet and CUB show that our approach improves the state-of-the-art in metric-based FSL.

翻訳日:2021-02-02 16:57:36 公開日:2021-02-01
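
A facet-weighted similarity of the kind described above could look like the sketch below. It assumes, purely for illustration, that each facet corresponds to a contiguous slice of the feature vector and that per-facet similarity is cosine similarity; the paper may define facets and similarities differently, and all names here are hypothetical.

```python
import numpy as np

def facet_similarity(query, prototype, facet_slices, facet_weights):
    """Weighted sum of per-facet cosine similarities.

    facet_slices: list of slice objects, one per facet (assumed layout).
    facet_weights: importance of each facet for the current categories.
    """
    total = 0.0
    for sl, weight in zip(facet_slices, facet_weights):
        q, p = query[sl], prototype[sl]
        cos = float(q @ p) / (np.linalg.norm(q) * np.linalg.norm(p) + 1e-12)
        total += weight * cos
    return total
```

With weights predicted per set of categories, the same query and prototype can score differently depending on which facets matter for the task at hand.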

# Bellman Eluder Dimension: RL問題の新しいリッチクラスとサンプル効率の高いアルゴリズム

Bellman Eluder Dimension: New Rich Classes of RL Problems, and Sample-Efficient Algorithms ( http://arxiv.org/abs/2102.00815v1 )

ライセンス: Link先を確認

Chi Jin, Qinghua Liu, Sobhan Miryoosefi

(参考訳) サンプル効率の学習を促進する最小限の構造的仮定を見つけることは、強化学習(RL)において最も重要な研究方向の1つである。本稿では,新しい複雑性尺度であるbellman eluder(be)次元を導入することで,この基本的な問題に対する理解を深める。我々は,低BE次元のRL問題の族が極めて豊富であることを示し,これは表付きMDP,線形MDP,反応性PMDP,低ベルマンランク問題,低エルダー次元問題など,既存のトラクタブルRL問題の大部分を仮定している。本稿ではさらに,新しい最適化に基づくアルゴリズム -- ゴルフ,および仮説除去に基づくアルゴリズム -- olive (jiang et alで提案) を再分析する。 (2017)). 両アルゴリズムは、全ての関連するパラメータの多項式である多数のサンプルにおいて低BE次元問題の準最適ポリシを学習するが、状態-作用空間のサイズには依存しないことを示す。我々の後悔とサンプルの複雑さの結果は、BE次元の低いいくつかのよく知られたサブクラスに対して、最良の既存の結果と一致または改善する。

Finding the minimal structural assumptions that empower sample-efficient learning is one of the most important research directions in Reinforcement Learning (RL). This paper advances our understanding of this fundamental question by introducing a new complexity measure -- Bellman Eluder (BE) dimension. We show that the family of RL problems of low BE dimension is remarkably rich, which subsumes a vast majority of existing tractable RL problems including but not limited to tabular MDPs, linear MDPs, reactive POMDPs, low Bellman rank problems as well as low Eluder dimension problems. This paper further designs a new optimization-based algorithm -- GOLF, and reanalyzes a hypothesis elimination-based algorithm -- OLIVE (proposed in Jiang et al. (2017)). We prove that both algorithms learn the near-optimal policies of low BE dimension problems in a number of samples that is polynomial in all relevant parameters, but independent of the size of state-action space. Our regret and sample complexity results match or improve the best existing results for several well-known subclasses of low BE dimension problems.

翻訳日:2021-02-02 16:54:57 公開日:2021-02-01

# 畳み込みLSTMの時空間気象予測と注意機構

Spatio-temporal Weather Forecasting and Attention Mechanism on Convolutional LSTMs ( http://arxiv.org/abs/2102.00696v1 )

ライセンス: Link先を確認

Selim Furkan Tekin, Oguzhan Karaahmetoglu, Fatih Ilhan, Ismail Balaban and Suleyman Serdar Kozat

(参考訳) 高解像度物理モデル上での数値天気予報はスーパーコンピュータ上での計算時間を消費する。ディープラーニングと機械学習の手法の予測への応用は、この領域で新しいソリューションを明らかにした。本稿では,入力気象データと観測データの両方を用いて,新たな深層学習アーキテクチャを提供することで,高分解能の数値気象データを予測する。問題を時空間予測として定式化する。本モデルは,畳み込み型長期記憶と,エンコーダ・デコーダ構造を持つ畳み込み型ニューラルネットワークユニットから構成される。注意とコンテキストマッチング機構により、短期的なパフォーマンスと解釈性を向上させます。我々は,高スケール,実時間,ベンチマーク数値気象データ,era5時間毎の圧力レベルに関する実験を行い,温度を予測した。その結果,入力系列の異なる部分に着目した注意行列と空間的相関と時間的相関が有意な改善を示した。本モデルは,ConvLSTM予測ネットワークやU-Netなど,ベースラインモデルの中で最高の検証とテストスコアを得る。我々は定性的かつ定量的な結果を提供し、平均2度の誤差で3時間の周波数で10の時間ステップを予測した。当社のコードとデータは公開されています。

Numerical weather forecasting on high-resolution physical models consumes hours of computation on supercomputers. The application of deep learning and machine learning methods to forecasting has revealed new solutions in this area. In this paper, we forecast high-resolution numeric weather data using both input weather data and observations by providing a novel deep learning architecture. We formulate the problem as spatio-temporal prediction. Our model is composed of Convolutional Long Short-Term Memory and Convolutional Neural Network units with an encoder-decoder structure. We enhance the short- and long-term performance and interpretability with an attention mechanism and a context matcher mechanism. We perform experiments on a large-scale, real-life benchmark numerical weather dataset, ERA5 hourly data on pressure levels, and forecast the temperature. The results show significant improvements in capturing both spatial and temporal correlations, with attention matrices focusing on different parts of the input series. Our model obtains the best validation and test scores among the baseline models, including the ConvLSTM forecasting network and U-Net. We provide qualitative and quantitative results and show that our model forecasts 10 time steps at a 3-hour frequency with an average error of 2 degrees. Our code and the data are publicly available.

翻訳日:2021-02-02 16:53:08 公開日:2021-02-01

# ノックオフによる反実生成

Counterfactual Generation with Knockoffs ( http://arxiv.org/abs/2102.00951v1 )

ライセンス: Link先を確認

Oana-Iuliana Popescu, Maha Shadaydeh, Joachim Denzler

(参考訳) ディープニューラルネットワークの決定の人間の解釈性は、特にそれが人間の生活に直接影響を及ぼす領域において重要である。既に訓練済みのニューラルネットワークの因果的説明は、入力特徴の摂動と、摂動後の分類器の結果の変化による重要性の寄与によって生成される。摂動は、ヒューリスティックまたは生成的インフィル方式で特徴を置き換えることによって行うことができる。インフィル機能の選択は、アーティファクトの数、すなわち偽陽性アトリビューションに大きく影響します。ヒューリスティックな手法は、摂動後の画像が元のデータ分布に遠く及ばないため、偽陽性のアーティファクトをもたらす。生成的インフィルングメソッドは、元のデータ分布を尊重するインフィルング値を生成することによってアーティファクトを削減する。しかし,現在のインフィル法では,インフィル値と元のデータとの相関が高いため,偽陰性も増大する可能性がある。本稿では,2015年にBarber と Cand\`es が,制御可能な擬似発見率を持つ変数選択ツールとして開発した,統計的に座屈した Knockoffs フレームワークを組み込むことにより,この問題を軽減することを提案する。ノックオフは、元のデータから可能な限りデコレーションに関連する統計的にnull-variablesであり、基礎となるデータ分布を変更することなく元のデータと交換することができる。異なるインフィル方式の比較は、インフィルディングとノックオフは説明のコンパクト性を維持しつつ、より因果的な意味で説明を明らかにすることができることを示している。

Human interpretability of deep neural networks' decisions is crucial, especially in domains where these directly affect human lives. Counterfactual explanations of already trained neural networks can be generated by perturbing input features and attributing importance according to the change in the classifier's outcome after perturbation. Perturbation can be done by replacing features using heuristic or generative in-filling methods. The choice of in-filling function significantly impacts the number of artifacts, i.e., false-positive attributions. Heuristic methods result in false-positive artifacts because the image after the perturbation is far from the original data distribution. Generative in-filling methods reduce artifacts by producing in-filling values that respect the original data distribution. However, current generative in-filling methods may also increase false-negatives due to the high correlation of in-filling values with the original data. In this paper, we propose to alleviate this by generating in-fillings with the statistically-grounded Knockoffs framework, which was developed by Barber and Candès in 2015 as a tool for variable selection with controllable false discovery rate. Knockoffs are statistically null-variables as decorrelated as possible from the original data, which can be swapped with the originals without changing the underlying data distribution. A comparison of different in-filling methods indicates that in-filling with knockoffs can reveal explanations in a more causal sense while still maintaining the compactness of the explanations.

翻訳日:2021-02-02 16:52:28 公開日:2021-02-01

# 適応基底分解による単画像非一様Blurカーネル推定

Single Image Non-uniform Blur Kernel Estimation via Adaptive Basis Decomposition ( http://arxiv.org/abs/2102.01026v1 )

ライセンス: Link先を確認

Guillermo Carbajal, Patricia Vitoria, Mauricio Delbracio, Pablo Musé, José Lezama

(参考訳) カメラの揺動や物体の動きによる動きのぼやけを特徴付けることは、画像復元にとって重要な課題である。近年、写真における動きのぼやけの除去は、ぼやけた画像から鋭い画像へ直接マッピングするように訓練されたディープラーニングベースの手法によって、目覚ましい進歩を遂げている。一方, 動きのぼかしのキャラクタリゼーションは, データ駆動のエンド・ツー・エンド・エンド・アプローチに先立って, モデルに基づくラグの復元手法が進歩している。本稿では,高密度な非一様運動ボケ推定のための一般非パラメトリックモデルを提案する。ぼやけた画像が与えられたとき、適応基底カーネルの集合とピクセルレベルでの混合係数を推定し、動きのぼやきのピクセルごとのマップを生成する。このリッチだが効率的な劣化過程のフォワードモデルにより、逆問題の解決に既存のツールを活用することができる。提案手法は,既存の不均一な動きのぼかし推定の限界を克服し,モデルベースとデータ駆動アプローチのギャップを埋めることに寄与することを示す。

Characterizing and removing motion blur caused by camera shake or object motion remains an important task for image restoration. In recent years, removal of motion blur in photographs has seen impressive progress in the hands of deep learning-based methods, trained to map directly from blurry to sharp images. Characterization of motion blur, on the other hand, has received less attention and progress in model-based methods for restoration lags behind that of data-driven end-to-end approaches. In this paper, we propose a general, non-parametric model for dense non-uniform motion blur estimation. Given a blurry image, we estimate a set of adaptive basis kernels as well as the mixing coefficients at pixel level, producing a per-pixel map of motion blur. This rich but efficient forward model of the degradation process allows the utilization of existing tools for solving inverse problems. We show that our method overcomes the limitations of existing non-uniform motion blur estimation and that it contributes to bridging the gap between model-based and data-driven approaches for deblurring real photographs.

翻訳日:2021-02-02 16:51:43 公開日:2021-02-01
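
The per-pixel blur map described above, built from a set of basis kernels and pixel-level mixing coefficients, amounts to a linear combination at every pixel. The sketch below shows only that combination step; the array shapes and the function name are assumptions made for illustration, and how the basis and coefficients are estimated is not shown.

```python
import numpy as np

def per_pixel_kernels(basis, mixing):
    """Combine basis kernels into one blur kernel per pixel.

    basis:  (B, k, k) array of B basis kernels.
    mixing: (H, W, B) array of mixing coefficients for each pixel.
    Returns an (H, W, k, k) array: kernel[h, w] = sum_b mixing[h, w, b] * basis[b].
    """
    return np.einsum("hwb,bkl->hwkl", mixing, basis)
```

If every basis kernel sums to one and the mixing coefficients at a pixel sum to one, the resulting per-pixel kernel also sums to one, so overall image brightness is preserved.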

# Harrington Yowlumne Narrative Corpus

The Harrington Yowlumne Narrative Corpus ( http://arxiv.org/abs/2102.00610v1 )

ライセンス: Link先を確認

Nathan M. White and Timothy Henry-Rodriguez

(参考訳) マイノリティ言語は、特に技術分野において、開発に十分な資源を欠いている。同様に、スミソニアン研究所のJ・P・ハリントン・ペーパーズ・コレクションは、手書きで非組織化されたフォーマットのために、コミュニティメンバーや研究者が実際にアクセスすることは困難である。我々の現在の研究は、この公に利用できながら問題のある素材の一部を、自然言語処理で実際に利用できるものにすることを目指している。ここでは、1910年から1925年の間、カリフォルニア州カーン郡のティンリウ牧場のテホネ・ル～ノ・ヨーラムヌコミュニティに由来する20の物語テキストのコーパスであるHarrington Yowlumne Narrative Corpusを紹介します。テキストをデジタルで書き起こし、これらのテキストでゴールド標準のレキセメベースの正規化テキストを提供する。さらに、67,835文字の文字が10,721文字の標準テキスト正規化語と一致する。

Minority languages continue to lack adequate resources for their development, especially in the technological domain. Likewise, the J.P. Harrington Papers collection at the Smithsonian Institution is difficult to access in practical terms for community members and researchers due to its handwritten and disorganized format. Our current work seeks to make a portion of this publicly available yet problematic material practically accessible for natural language processing use. Here, we present the Harrington Yowlumne Narrative Corpus, a corpus of 20 narrative texts that derive from the Tejoneño Yowlumne community of the Tinliw rancheria in Kern County, California between 1910 and 1925. We digitally transcribe the texts and provide gold-standard aligned lexeme-based normalized text with these texts. Altogether, the text contains 67,835 transcribed characters aligned with 10,721 gold standard text-normalized words.

翻訳日:2021-02-02 16:49:15 公開日:2021-02-01

# 回答選択のための階層的ランキング

Hierarchical Ranking for Answer Selection ( http://arxiv.org/abs/2102.00677v1 )

ライセンス: Link先を確認

Hang Gao, Mengting Hu, Renhong Cheng, Tiegang Gao

(参考訳) 回答の選択は、与えられた質問に対する候補回答のプールから正の回答を選択するタスクです。本稿では,階層的ランキングという,解答選択のための新しい戦略を提案する。我々は,ポイントレベルのランキング,ペアレベルのランキング,リストレベルのランキングの3つの階層を導入する。候補者の回答をランキングするのと同じ目標を達成するために、異なる視点からの監督情報を使用して最適化目標を策定します。したがって、3つのレベルは関連しており、互いに促進することができる。我々は,多タスク学習(mtl)戦略に基づくスキーム,ランキング統合(ri)スキーム,プログレッシブランキング統合(pri)スキームという,階層的ランキングを共同で適用するための3つのスキームを検討した。 WikiQA と TREC-QA の2つの公開データセットによる実験結果から,提案した階層的ランキングが有効であることを示す。 TREC-QAとWikiQAの両方で最新の(非BERT)パフォーマンスを実現します。

Answer selection is the task of choosing the positive answers from a pool of candidate answers for a given question. In this paper, we propose a novel strategy for answer selection, called hierarchical ranking. We introduce three levels of ranking: point-level ranking, pair-level ranking, and list-level ranking. They formulate their optimization objectives by employing supervisory information from different perspectives to achieve the same goal of ranking candidate answers. Therefore, the three levels of ranking are related and they can promote each other. We take the well-performing compare-aggregate model as the backbone and explore three schemes to implement the idea of applying the hierarchical rankings jointly: the scheme under the Multi-Task Learning (MTL) strategy, the Ranking Integration (RI) scheme, and the Progressive Ranking Integration (PRI) scheme. Experimental results on two public datasets, WikiQA and TREC-QA, demonstrate that the proposed hierarchical ranking is effective. Our method achieves state-of-the-art (non-BERT) performance on both TREC-QA and WikiQA.

翻訳日:2021-02-02 16:48:36 公開日:2021-02-01
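
The three ranking levels named above differ in what supervision they look at: one candidate, a pair of candidates, or the whole candidate list. The sketch below uses common textbook losses for each level (binary cross-entropy, pairwise hinge, and softmax cross-entropy); these specific forms are assumptions for illustration and may not match the objectives used in the paper.

```python
import numpy as np

def point_loss(scores, labels):
    """Point level: binary cross-entropy on each candidate independently."""
    p = 1.0 / (1.0 + np.exp(-scores))
    return float(-np.mean(labels * np.log(p) + (1 - labels) * np.log(1 - p)))

def pair_loss(scores, labels, margin=1.0):
    """Pair level: hinge loss over every (positive, negative) pair.

    Assumes at least one positive and one negative candidate.
    """
    pos, neg = scores[labels == 1], scores[labels == 0]
    diffs = margin - (pos[:, None] - neg[None, :])
    return float(np.maximum(diffs, 0.0).mean())

def list_loss(scores, labels):
    """List level: cross-entropy between softmax(scores) and normalised labels."""
    z = scores - scores.max()
    log_softmax = z - np.log(np.exp(z).sum())
    target = labels / labels.sum()
    return float(-(target * log_softmax).sum())
```

A multi-task variant could, for instance, minimise a weighted sum of the three losses over shared model parameters; the weighting would be a further design choice.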

# イディオムコーポラ建設のためのガミファイドクラウドソーシング

Gamified Crowdsourcing for Idiom Corpora Construction ( http://arxiv.org/abs/2102.00881v1 )

ライセンス: Link先を確認

Gülşen Eryiğit, Ali Şentaş, Johanna Monti

(参考訳) 慣用的な表現を学ぶことは、その予測不可能な意味のために第二言語学習の最も困難な段階の1つと見なされます。同様の状況は、機械翻訳や構文解析などの自然言語処理アプリケーション内での識別にも当てはまる。高品質の使用サンプルの欠如は、人間だけでなく人工知能システムにとってもこの課題を悪化させます。本稿では,慣用的・非慣用的な使用例を提供し,他のプレイヤーのエントリーを評価しながら,互いに競合するネイティブスピーカーのための非同期マルチプレイヤーゲームとして,メッセージングボットを設計する。古典的なクラウドプロセッシングアノテーションの分野への取り組みとは対照的に,文献の中では初めて,クラウドプロセッシングとクラウドプロセッシングのアプローチが実装され,イディオムコーパス構築のためにテストされている。このアプローチは言語に依存しず、フィールドの従来のデータ準備技術と比較して2つの言語で評価されます。群衆の反応は、異なる動機づけの手段(すなわち、ゲーミフィケーションと金銭的報酬)で監視される。その結果, 提案手法は対象資料の収集に有効であり, 露骨なクラウドソーシング手法であるにもかかわらず, 観客を楽しませ, 有用であることがわかった。このアプローチは、第二言語学習教材として使用する異なる自然言語のためのイディオムコーパスの構築、教師付きイディオム識別システムのためのトレーニングデータ、辞書研究のためのサンプルをスピードアップする可能性があることが示されている。

Learning idiomatic expressions is seen as one of the most challenging stages in second language learning because of their unpredictable meaning. A similar situation holds for their identification within natural language processing applications such as machine translation and parsing. The lack of high-quality usage samples exacerbates this challenge not only for humans but also for artificial intelligence systems. This article introduces a gamified crowdsourcing approach for collecting language learning materials for idiomatic expressions; a messaging bot is designed as an asynchronous multiplayer game for native speakers who compete with each other while providing idiomatic and nonidiomatic usage examples and rating other players' entries. As opposed to classical crowdprocessing annotation efforts in the field, for the first time in the literature, a crowdcreating & crowdrating approach is implemented and tested for idiom corpora construction. The approach is language independent and evaluated on two languages in comparison to traditional data preparation techniques in the field. The reaction of the crowd is monitored under different motivational means (namely, gamification affordances and monetary rewards). The results reveal that the proposed approach is powerful in collecting the targeted materials, and although being an explicit crowdsourcing approach, it is found entertaining and useful by the crowd. The approach has been shown to have the potential to speed up the construction of idiom corpora for different natural languages to be used as second language learning material, training data for supervised idiom identification systems, or samples for lexicographic studies.

翻訳日:2021-02-02 16:47:13 公開日:2021-02-01

# 学習済み言語モデルにおける一貫性の測定と改善

Measuring and Improving Consistency in Pretrained Language Models ( http://arxiv.org/abs/2102.01017v1 )

ライセンス: Link先を確認

Yanai Elazar, Nora Kassner, Shauli Ravfogel, Abhilasha Ravichander, Eduard Hovy, Hinrich Schütze, Yoav Goldberg

(参考訳) モデルの一貫性、すなわち、その入力における意味保存的な変化の下での振る舞いの不変性は、自然言語処理において非常に望ましい特性である。本稿では, 事前学習型言語モデル(PLM)は, 事実的知識に対して一貫性があるか? この目的のために私たちは、clozeスタイルのクエリ英語パラフレーズの高品質なリソースであるpararelを作成します。総計328のパラフレーズがあり、38の関係がある。 ParaRel を用いて、我々が実験したすべての PLM の整合性は貧弱であるが、関係のばらつきは高い。 plm の表現空間の解析は,構造が貧弱であり,現在,知識を堅牢に表現するのに適していないことを示唆する。最後に,モデルの一貫性を向上させる手法を提案し,その効果を実験的に実証する。

Consistency of a model -- that is, the invariance of its behavior under meaning-preserving alterations in its input -- is a highly desirable property in natural language processing. In this paper we study the question: Are Pretrained Language Models (PLMs) consistent with respect to factual knowledge? To this end, we create ParaRel, a high-quality resource of cloze-style query English paraphrases. It contains a total of 328 paraphrases for 38 relations. Using ParaRel, we show that the consistency of all PLMs we experiment with is poor -- though with high variance between relations. Our analysis of the representational spaces of PLMs suggests that they have a poor structure and are currently not suitable for representing knowledge in a robust way. Finally, we propose a method for improving model consistency and experimentally demonstrate its effectiveness.

翻訳日:2021-02-02 16:46:28 公開日:2021-02-01
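
One simple way to put a number on the consistency notion above is to count how often a model gives the same answer to two paraphrases of the same query. The sketch below does exactly that; treating consistency as pairwise agreement is an assumption made here for illustration, and the example predictions are invented.

```python
from itertools import combinations

def consistency(predictions):
    """Share of paraphrase pairs for which the model gives the same answer.

    predictions: the model's answers to different paraphrases of one query.
    """
    pairs = list(combinations(predictions, 2))
    if not pairs:
        return 1.0                       # a single paraphrase is trivially consistent
    return sum(a == b for a, b in pairs) / len(pairs)

# Invented answers to three paraphrases of the same cloze query.
print(consistency(["Paris", "Paris", "Lyon"]))
```

Note that this measure ignores whether the answers are correct: a model can be fully consistent and still wrong.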

# SJ_AJ@DravidianLangTech-EACL2021: 攻撃言語識別のための多言語BERTモデルのタスク適応事前訓練

SJ_AJ@DravidianLangTech-EACL2021: Task-Adaptive Pre-Training of Multilingual BERT models for Offensive Language Identification ( http://arxiv.org/abs/2102.01051v1 )

ライセンス: Link先を確認

Sai Muralidhar Jayanthi, Akshat Gupta

(参考訳) 本稿では,ドラビダ語における攻撃的言語識別に関するEACL 2021-Shared Taskを提案する。私たちの最終システムはmBERTとXLM-RoBERTaモデルのアンサンブルであり、マスク付き言語モデリング目的の多言語BERTモデルのタスク適応事前トレーニングを利用しています。私たちのシステムは、カンナダで1位、マラヤラムで2位、タミルで3位にランクされました。

In this paper we present our submission for the EACL 2021-Shared Task on Offensive Language Identification in Dravidian languages. Our final system is an ensemble of mBERT and XLM-RoBERTa models which leverage task-adaptive pre-training of multilingual BERT models with a masked language modeling objective. Our system was ranked 1st for Kannada, 2nd for Malayalam and 3rd for Tamil.

翻訳日:2021-02-02 16:45:56 公開日:2021-02-01

# 小さくて合成的なベンチマークは、モデリングのイノベーションを駆動できますか? 質問応答モデリング手法のふりかえり的研究

Can Small and Synthetic Benchmarks Drive Modeling Innovation? A Retrospective Study of Question Answering Modeling Approaches ( http://arxiv.org/abs/2102.01065v1 )

ライセンス: Link先を確認

Nelson F. Liu and Tony Lee and Robin Jia and Percy Liang

(参考訳) データセットは、正確でデプロイ可能なシステムのトレーニングのためのリソースであるだけでなく、新しいモデリングアプローチを開発するためのベンチマークでもある。正確なシステムのトレーニングには大規模で自然なデータセットが必要ですが、モデリングの革新を促進するには必要でしょうか? 例えば、人気のあるsunity question answering benchmarkは、新しいモデリングアプローチの開発につながったが、シンセサイザーや小さなベンチマークが同様のイノベーションに繋がる可能性がある。この反現実的な質問は答えられないが、我々は必要条件、すなわちベンチマークがSQuAD上で行った発見を再カプセル化できる能力について研究することができる。我々は20のSQuADモデリングアプローチの振り返り調査を行い、32の既存および合成ベンチマークがSQuADとどのように一致しているかを調査する。我々は,SQuADに類似しないが,SQuADとの精度が高く,SQuADの歴史的モデリング改善を反映するためには,自然性やサイズは必要ないことを実証した,小型でターゲットの合成ベンチマークを慎重に構築する。この結果から,小型かつ慎重に設計された合成ベンチマークが新たなモデリング手法の開発に有用である可能性が示唆された。

Datasets are not only resources for training accurate, deployable systems, but are also benchmarks for developing new modeling approaches. While large, natural datasets are necessary for training accurate systems, are they necessary for driving modeling innovation? For example, while the popular SQuAD question answering benchmark has driven the development of new modeling approaches, could synthetic or smaller benchmarks have led to similar innovations? This counterfactual question is impossible to answer, but we can study a necessary condition: the ability for a benchmark to recapitulate findings made on SQuAD. We conduct a retrospective study of 20 SQuAD modeling approaches, investigating how well 32 existing and synthesized benchmarks concur with SQuAD -- i.e., do they rank the approaches similarly? We carefully construct small, targeted synthetic benchmarks that do not resemble natural language, yet have high concurrence with SQuAD, demonstrating that naturalness and size are not necessary for reflecting historical modeling improvements on SQuAD. Our results raise the intriguing possibility that small and carefully designed synthetic benchmarks may be useful for driving the development of new modeling approaches.

翻訳日:2021-02-02 16:45:27 公開日:2021-02-01
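
The question of whether two benchmarks "rank the approaches similarly" can be made concrete with a rank correlation. The sketch below computes Kendall's tau (the tau-a variant, without tie correction) between scores of the same approaches on two benchmarks; the choice of this particular coefficient is an assumption for illustration, and the scores in the example are invented.

```python
from itertools import combinations

def kendall_tau(xs, ys):
    """Kendall's tau-a between two score lists for the same approaches."""
    concordant = discordant = 0
    for i, j in combinations(range(len(xs)), 2):
        s = (xs[i] - xs[j]) * (ys[i] - ys[j])
        if s > 0:
            concordant += 1              # both benchmarks order the pair the same way
        elif s < 0:
            discordant += 1              # the benchmarks disagree on this pair
    n_pairs = len(xs) * (len(xs) - 1) // 2
    return (concordant - discordant) / n_pairs

# Invented accuracies of three approaches on a reference and a small benchmark.
reference_scores = [60.0, 70.0, 80.0]
synthetic_scores = [0.31, 0.52, 0.47]
print(kendall_tau(reference_scores, synthetic_scores))
```

A value near 1 would mean the small benchmark orders the approaches almost as the reference does, while a value near 0 would mean it carries little information about that ordering.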

# Piagetの認知発達理論に触発された解釈型強化学習

Interpretable Reinforcement Learning Inspired by Piaget's Theory of Cognitive Development ( http://arxiv.org/abs/2102.00572v1 )

ライセンス: Link先を確認

Aref Hakimzadeh, Yanbo Xue, and Peyman Setoodeh

(参考訳) 人間レベルの認知能力を持つロボットを設計するための取り組みは、学習機械の異なるカテゴリに導かれた。スキナーの理論によれば、強化学習(rl)は人間の直観と認知において重要な役割を果たす。ディープRLアルゴリズムを含む最先端の手法の大部分は、コネクティストの視点に強く影響されます。このようなアルゴリズムは、他の分野における心の理論や学習の恩恵を受けることができる。本稿では、思考仮説言語(LOTH)、スクリプト理論、およびピアジェットの認知発達理論などの理論が相補的なアプローチを提供し、RL分野を豊かにするという考えを楽しませる。この考え方に続いて、生産性、体系性、推論コヒーレンスの概念を支持するピアジェットのスキーマ理論に対して、接続論とは対照的に、一般的な計算ビルディングブロックが提案される。提案手法の抽象化はシステム自体に完全に依存しており、事前定義されたアーキテクチャによって外部的に制約されない。プロセス全体は、Neisserの知覚サイクルモデルと一致する。 3つの典型的な制御問題に対する実験と行動解析により,提案手法の解釈可能性とその競合性が,最先端アルゴリズムと比較して確認された。したがって、提案フレームワークは、人工知能システムにおいて人間のような認知を実現するためのステップとみなすことができる。

Endeavors to design robots with human-level cognitive abilities have led to different categories of learning machines. According to Skinner's theory, reinforcement learning (RL) plays a key role in human intuition and cognition. The majority of state-of-the-art methods, including deep RL algorithms, are strongly influenced by the connectionist viewpoint. Such algorithms can significantly benefit from theories of mind and learning in other disciplines. This paper entertains the idea that theories such as the language of thought hypothesis (LOTH), script theory, and Piaget's cognitive development theory provide complementary approaches, which will enrich the RL field. Following this line of thinking, a general computational building block is proposed for Piaget's schema theory that supports the notions of productivity, systematicity, and inferential coherence as described by Fodor, in contrast with connectionism. Abstraction in the proposed method is left completely to the system itself and is not externally constrained by any predefined architecture. The whole process matches Neisser's perceptual cycle model. Experiments performed on three typical control problems, followed by behavioral analysis, confirm the interpretability of the proposed method and its competitiveness compared to state-of-the-art algorithms. Hence, the proposed framework can be viewed as a step towards achieving human-like cognition in artificial intelligent systems.

翻訳日:2021-02-02 16:42:55 公開日:2021-02-01

# 自動運転技術における計画・責任・安全の制御可能性

The Controllability of Planning, Responsibility, and Security in Automatic Driving Technology ( http://arxiv.org/abs/2102.00617v1 )

ライセンス: Link先を確認

Dan Wan and Hao Zhan

(参考訳) 自動走行技術は常に安定して制御可能な状態にあり、具体的には、制御可能な計画、制御可能な責任、制御可能な情報に分けられることを期待しています。この制御性が損なわれると、トロリージレンマ、責任帰属、情報漏洩、セキュリティなどの問題が発生します。本稿では,これら3つの問題を別々に論じ,誤解を明確にする。

People hope that automated driving technology is always in a stable and controllable state; specifically, this controllability can be divided into controllable planning, controllable responsibility, and controllable information. When this controllability is undermined, problems arise, e.g., the trolley dilemma, responsibility attribution, information leakage, and security. This article discusses these three types of issues separately and clarifies the misunderstandings.

Translated: 2021-02-02 16:42:16 Published: 2021-02-01

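# Inferring spatial relations from textual descriptions of images

Inferring spatial relations from textual descriptions of images ( http://arxiv.org/abs/2102.00997v1 )

License: see the linked page

Aitzol Elu, Gorka Azkune, Oier Lopez de Lacalle, Ignacio Arganda-Carreras, Aitor Soroa, Eneko Agirre

(Reference translation) Generating an image from a textual description requires a certain level of language understanding as well as common-sense knowledge about the spatial relations of the physical entities being described. In this work, we focus on inferring the spatial relations between entities, a key step in text-based scene composition. Specifically, given a caption that contains a mention of a subject together with the location and size of the subject's bounding box, our goal is to predict the location and size of the object mentioned in the caption. Previous work used a manually provided relation holding between the subject and the object rather than the textual information in the caption. In fact, the evaluation datasets used contain manually annotated ontological triplets but no captions, which makes the exercise unrealistic: a manual step is required, and systems could not exploit the rich information in captions. Here we present a system that uses the full caption text, together with Relations in Captions (REC-COCO), a dataset derived from MS-COCO that makes it possible to evaluate spatial relation inference from captions directly. Our experiments show that (1) the size and location of an object can be inferred directly from the caption, and (2) using the full text places the object better than using manually annotated relations. Our work paves the way for systems that, given a caption, decide which entities need to be depicted, along with their respective locations and sizes, in order to generate the final image.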

Generating an image from its textual description requires both a certain level of language understanding and common sense knowledge about the spatial relations of the physical entities being described. In this work, we focus on inferring the spatial relation between entities, a key step in the process of composing scenes based on text. More specifically, given a caption containing a mention of a subject and the location and size of the bounding box of that subject, our goal is to predict the location and size of an object mentioned in the caption. Previous work did not use the caption text information, but a manually provided relation holding between the subject and the object. In fact, the evaluation datasets used contain manually annotated ontological triplets but no captions, making the exercise unrealistic: a manual step was required, and systems did not leverage the richer information in captions. Here we present a system that uses the full caption, and Relations in Captions (REC-COCO), a dataset derived from MS-COCO which allows spatial relation inference to be evaluated from captions directly. Our experiments show that: (1) it is possible to infer the size and location of an object with respect to a given subject directly from the caption; (2) the use of full text places the object better than using a manually annotated relation. Our work paves the way for systems that, given a caption, decide which entities need to be depicted and their respective locations and sizes, in order to then generate the final image.

Translated: 2021-02-02 16:41:09 Published: 2021-02-01

# RoutingGAN: Routing Age Progression and Regression with Disentangled Learning

RoutingGAN: Routing Age Progression and Regression with Disentangled Learning ( http://arxiv.org/abs/2102.00601v1 )

License: see the linked page

Zhizhong Huang and Junping Zhang and Hongming Shan

(Reference translation) Although impressive results have been achieved for age progression and regression, there remain two major issues in generative adversarial networks (GANs)-based methods: 1) conditional GANs (cGANs)-based methods can learn various effects between any two age groups in a single model, but are insufficient to characterize some specific patterns due to completely shared convolution filters; and 2) GANs-based methods can, by utilizing several models to learn effects independently, learn some specific patterns; however, they are cumbersome and require age labels in advance. To address these deficiencies, this paper introduces a dropout-like method based on GAN (RoutingGAN) to route different effects in a high-level semantic feature space. Specifically, we first disentangle the age-invariant features from the input face, and then gradually add the effects to the features with residual routers that assign convolution filters to different age groups by dropping out the outputs of the others. As a result, the proposed RoutingGAN can learn various effects simultaneously in a single model by partially sharing convolution filters. Experimental results on two benchmark datasets show performance superior to existing methods, both qualitatively and quantitatively.

Although impressive results have been achieved for age progression and regression, there remain two major issues in generative adversarial networks (GANs)-based methods: 1) conditional GANs (cGANs)-based methods can learn various effects between any two age groups in a single model, but are insufficient to characterize some specific patterns due to completely shared convolution filters; and 2) GANs-based methods can, by utilizing several models to learn effects independently, learn some specific patterns; however, they are cumbersome and require age labels in advance. To address these deficiencies and have the best of both worlds, this paper introduces a dropout-like method based on GAN (RoutingGAN) to route different effects in a high-level semantic feature space. Specifically, we first disentangle the age-invariant features from the input face, and then gradually add the effects to the features by residual routers that assign the convolution filters to different age groups by dropping out the outputs of others. As a result, the proposed RoutingGAN can simultaneously learn various effects in a single model, with convolution filters being shared in part to learn some specific effects. Experimental results on two benchmark datasets demonstrate superior performance over existing methods both qualitatively and quantitatively.

Translated: 2021-02-02 16:31:32 Published: 2021-02-01

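# An End-to-End Food Image Analysis System

An End-to-End Food Image Analysis System ( http://arxiv.org/abs/2102.00645v1 )

License: see the linked page

Jiangpeng He, Runyu Mao, Zeman Shao, Janine L. Wright, Deborah A. Kerr, Carol J. Boushey and Fengqing Zhu

(Reference translation) Modern deep learning techniques have enabled advances in image-based dietary assessment, such as food recognition and food portion size estimation. Valuable information on the types of food and the amounts consumed is essential for preventing many chronic diseases. However, existing methods for image-based food analysis are neither end-to-end nor able to handle multiple tasks (such as recognition and portion estimation) together, which makes them difficult to apply in real-world applications. In this paper, we propose an image-based food analysis framework that integrates food localization, classification, and portion size estimation. The proposed framework is end-to-end: the input can be an arbitrary food image containing multiple food items, and the system localizes each food item with its corresponding food type and portion size. We also improve single-food portion estimation by combining localization results with a food energy distribution map obtained by a conditional GAN to generate a four-channel RGB-Distribution image. We evaluated the end-to-end framework on a real-life food image dataset collected from a nutrition feeding study.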

Modern deep learning techniques have enabled advances in image-based dietary assessment such as food recognition and food portion size estimation. Valuable information on the types of foods and the amount consumed is crucial for prevention of many chronic diseases. However, existing methods for automated image-based food analysis are neither end-to-end nor capable of processing multiple tasks (e.g., recognition and portion estimation) together, making them difficult to apply to real-life applications. In this paper, we propose an image-based food analysis framework that integrates food localization, classification and portion size estimation. Our proposed framework is end-to-end, i.e., the input can be an arbitrary food image containing multiple food items and our system can localize each single food item with its corresponding predicted food type and portion size. We also improve the single food portion estimation by consolidating localization results with a food energy distribution map obtained by conditional GAN to generate a four-channel RGB-Distribution image. Our end-to-end framework is evaluated on a real-life food image dataset collected from a nutrition feeding study.

Translated: 2021-02-02 16:30:53 Published: 2021-02-01

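# A NIR-to-VIS face recognition via part adaptive and relation attention module

A NIR-to-VIS face recognition via part adaptive and relation attention module ( http://arxiv.org/abs/2102.00689v1 )

License: see the linked page

Rushuang Xu, MyeongAh Cho, and Sangyoun Lee

(Reference translation) In face recognition application scenarios, we need to process facial images captured under various conditions, such as at night by near-infrared (NIR) surveillance cameras. The illumination difference between NIR and visible light (VIS) causes a domain gap between facial images, and variations in pose and emotion make face matching even more difficult. Heterogeneous face recognition (HFR) is difficult because of the domain discrepancy, and many studies have focused on extracting domain-invariant features such as facial part relational information. However, when pose variation occurs, the positions of the facial components change and a different part relation is extracted. In this paper, we propose a part relation attention module that crops the facial parts obtained with a semantic mask and performs relational modeling using each of their features. Furthermore, we propose a component adaptive triplet loss function that uses adaptive weights for each part to reduce the intra-class identity regardless of domain or pose. Finally, we show a performance improvement on CASIA NIR-VIS 2.0 and superior results on BUAA-VisNir, which has large pose and emotion variations.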

In the face recognition application scenario, we need to process facial images captured in various conditions, such as at night by near-infrared (NIR) surveillance cameras. The illumination difference between NIR and visible-light (VIS) causes a domain gap between facial images, and the variations in pose and emotion also make facial matching more difficult. Heterogeneous face recognition (HFR) struggles with domain discrepancy, and many studies have focused on extracting domain-invariant features, such as facial part relational information. However, when pose variation occurs, the facial component position changes, and a different part relation is extracted. In this paper, we propose a part relation attention module that crops facial parts obtained through a semantic mask and performs relational modeling using each of these representative features. Furthermore, we propose a component adaptive triplet loss function that uses adaptive weights for each part to reduce the intra-class identity regardless of the domain as well as pose. Finally, our method exhibits a performance improvement on CASIA NIR-VIS 2.0 and achieves superior results on BUAA-VisNir, which has large pose and emotion variations.

Translated: 2021-02-02 16:30:16 Published: 2021-02-01

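# Aurora Guard: Reliable Face Anti-Spoofing via Mobile Lighting System

Aurora Guard: Reliable Face Anti-Spoofing via Mobile Lighting System ( http://arxiv.org/abs/2102.00713v1 )

License: see the linked page

Jian Zhang, Ying Tai, Taiping Yao, Jia Meng, Shouhong Ding, Chengjie Wang, Jilin Li, Feiyue Huang, Rongrong Ji

(Reference translation) Face authentication on mobile devices has been widely applied in various scenarios. Although cutting-edge face authentication/verification systems are increasingly reliable against variations such as blinking eyes and subtle facial expressions, anti-spoofing against high-resolution rendered replays of paper photos or digital videos remains an open problem. In this paper, we propose a simple yet effective face anti-spoofing system called Aurora Guard (AG). The proposed system first extracts normal cues via light reflection analysis, and then uses an end-to-end multi-task convolutional neural network (CNN) to accurately recover the intrinsic depth and material maps, together with a light CAPTCHA checking mechanism in the regression branch to improve the reliability of the system. Experiments on the public Replay-Attack and CASIA datasets demonstrate that the proposed method outperforms state-of-the-art methods. We also conducted extensive experiments on a large-scale dataset containing live and diverse spoofing samples, further validating the generality of the method.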

Face authentication on mobile devices has been widely applied in various scenarios. Despite the increasing reliability of cutting-edge face authentication/verification systems to variations like blinking eyes and subtle facial expressions, anti-spoofing against high-resolution rendering replay of paper photos or digital videos remains an open problem. In this paper, we propose a simple yet effective face anti-spoofing system, termed Aurora Guard (AG). Our system first extracts the normal cues via light reflection analysis, and then adopts an end-to-end trainable multi-task Convolutional Neural Network (CNN) to accurately recover subjects' intrinsic depth and material map to assist liveness classification, along with the light CAPTCHA checking mechanism in the regression branch to further improve the system reliability. Experiments on public Replay-Attack and CASIA datasets demonstrate the merits of our proposed method over the state of the art. We also conduct extensive experiments on a large-scale dataset containing 12,000 live and diverse spoofing samples, which further validates the generalization ability of our method in the wild.

Translated: 2021-02-02 16:29:34 Published: 2021-02-01

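# Video Transformer Network

Video Transformer Network ( http://arxiv.org/abs/2102.00719v1 )

License: see the linked page

Daniel Neimark, Omri Bar, Maya Zohar, Dotan Asselmann

(Reference translation) This paper proposes VTN, a transformer-based framework for video recognition. Inspired by recent developments in vision transformers, we abandon the standard approach to video action recognition that relies on 3D ConvNets and introduce a method that classifies actions by attending to the entire video sequence information. Our approach is generic and builds on top of any 2D spatial network. In terms of wall runtime, it trains 16.1× faster and runs 5.1× faster during inference while maintaining competitive accuracy compared with other state-of-the-art methods. It can analyze an entire video in a single end-to-end pass while requiring 1.5× fewer GFLOPs. We report competitive results on Kinetics-400 and present an ablation study of VTN properties and the trade-off between accuracy and inference speed. We hope our approach will serve as a new baseline and start a new line of research in the video recognition domain. Code and models will be provided soon.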

This paper presents VTN, a transformer-based framework for video recognition. Inspired by recent developments in vision transformers, we ditch the standard approach in video action recognition that relies on 3D ConvNets and introduce a method that classifies actions by attending to the entire video sequence information. Our approach is generic and builds on top of any given 2D spatial network. In terms of wall runtime, it trains 16.1× faster and runs 5.1× faster during inference while maintaining competitive accuracy compared to other state-of-the-art methods. It enables whole video analysis, via a single end-to-end pass, while requiring 1.5× fewer GFLOPs. We report competitive results on Kinetics-400 and present an ablation study of VTN properties and the trade-off between accuracy and inference speed. We hope our approach will serve as a new baseline and start a fresh line of research in the video recognition domain. Code and models will be available soon.

Translated: 2021-02-02 16:28:38 Published: 2021-02-01

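# Landmark Breaker: Obstructing DeepFake By Disturbing Landmark Extraction

Landmark Breaker: Obstructing DeepFake By Disturbing Landmark Extraction ( http://arxiv.org/abs/2102.00798v1 )

License: see the linked page

Pu Sun, Yuezun Li, Honggang Qi and Siwei Lyu

(Reference translation) Recent developments in Deep Neural Networks (DNN) have greatly increased the realism of AI-synthesized faces, the most notable examples being DeepFakes. DeepFake technology can synthesize the face of a target subject from the face of another subject while retaining the same facial attributes. With the rapid growth of social media portals (Facebook, Instagram, etc.), such realistic fake faces have spread rapidly across the Internet, with a negative impact on society. In this paper, we describe Landmark Breaker, the first dedicated method to disrupt facial landmark extraction, and apply it to obstructing the generation of DeepFake videos. Our motivation is that disrupting facial landmark extraction can affect the alignment of the input face and thereby degrade DeepFake quality. Our method is achieved using adversarial perturbations. Compared with detection methods that only work after DeepFake generation, Landmark Breaker goes one step further and prevents DeepFake generation. We conducted experiments on three state-of-the-art facial landmark extractors using the recent Celeb-DF dataset.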

The recent development of Deep Neural Networks (DNN) has significantly increased the realism of AI-synthesized faces, with the most notable examples being the DeepFakes. The DeepFake technology can synthesize the face of a target subject from the face of another subject, while retaining the same face attributes. With the rapid growth of social media portals (Facebook, Instagram, etc.), these realistic fake faces rapidly spread through the Internet, causing a broad negative impact on society. In this paper, we describe Landmark Breaker, the first dedicated method to disrupt facial landmark extraction, and apply it to the obstruction of the generation of DeepFake videos. Our motivation is that disrupting the facial landmark extraction can affect the alignment of the input face so as to degrade the DeepFake quality. Our method is achieved using adversarial perturbations. Compared to the detection methods that only work after DeepFake generation, Landmark Breaker goes one step ahead to prevent DeepFake generation. The experiments are conducted on three state-of-the-art facial landmark extractors using the recent Celeb-DF dataset.

Translated: 2021-02-02 16:28:01 Published: 2021-02-01

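# Segmenting Microcalcifications in Mammograms and its Applications

Segmenting Microcalcifications in Mammograms and its Applications ( http://arxiv.org/abs/2102.00811v1 )

License: see the linked page

Roee Zamir and Shai Bagon and David Samocha and Yael Yagil and Ronen Basri and Miri Sklair-Levy and Meirav Galun

(Reference translation) Microcalcifications are small deposits of calcium that appear in mammograms as bright white specks against the soft tissue background of the breast. Microcalcifications can be a unique sign of ductal carcinoma in situ breast cancer, so their accurate detection is essential for diagnosis and screening. Manually detecting these tiny calcium residues in mammograms is time-consuming and error-prone, even for expert radiologists. Existing computerized algorithms for detecting and segmenting microcalcifications tend to suffer from a high false-positive rate, which prevents their widespread use. In this paper, we propose an accurate calcification segmentation method using deep learning. We specifically address the challenge of keeping the false-positive rate low by proposing a strategy that focuses on hard pixels in the training phase. Furthermore, meaningful statistics can be extracted on clusters of microcalcifications.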

Microcalcifications are small deposits of calcium that appear in mammograms as bright white specks on the soft tissue background of the breast. Microcalcifications may be a unique indication for Ductal Carcinoma in Situ breast cancer, and therefore their accurate detection is crucial for diagnosis and screening. Manual detection of these tiny calcium residues in mammograms is both time-consuming and error-prone, even for expert radiologists, since these microcalcifications are small and can be easily missed. Existing computerized algorithms for detecting and segmenting microcalcifications tend to suffer from a high false-positive rate, hindering their widespread use. In this paper, we propose an accurate calcification segmentation method using deep learning. We specifically address the challenge of keeping the false-positive rate low by suggesting a strategy for focusing on the hard pixels in the training phase. Furthermore, our accurate segmentation enables extracting meaningful statistics on clusters of microcalcifications.

Translated: 2021-02-02 16:27:21 Published: 2021-02-01

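# Dynamic Texture Recognition via Nuclear Distances on Kernelized Scattering Histogram Spaces

Dynamic Texture Recognition via Nuclear Distances on Kernelized Scattering Histogram Spaces ( http://arxiv.org/abs/2102.00841v1 )

License: see the linked page

Alexander Sagel, Julian Wörmann, Hao Shen

(Reference translation) Distance-based dynamic texture recognition is an important research field in multimedia processing, with applications ranging from retrieval to segmentation of video data. Based on the conjecture that the most distinctive characteristic of a dynamic texture is the appearance of its individual frames, we propose to describe dynamic textures as kernelized spaces of frame-wise feature vectors computed using the scattering transform. Combining these spaces with a basis-invariant metric yields a framework that produces competitive results for nearest neighbor classification and state-of-the-art results for nearest class center classification.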

Distance-based dynamic texture recognition is an important research field in multimedia processing with applications ranging from retrieval to segmentation of video data. Based on the conjecture that the most distinctive characteristic of a dynamic texture is the appearance of its individual frames, this work proposes to describe dynamic textures as kernelized spaces of frame-wise feature vectors computed using the Scattering transform. By combining these spaces with a basis-invariant metric, we get a framework that produces competitive results for nearest neighbor classification and state-of-the-art results for nearest class center classification.

Translated: 2021-02-02 16:26:45 Published: 2021-02-01

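# Evaluating Large-Vocabulary Object Detectors: The Devil is in the Details

Evaluating Large-Vocabulary Object Detectors: The Devil is in the Details ( http://arxiv.org/abs/2102.01066v1 )

License: see the linked page

Achal Dave, Piotr Dollár, Deva Ramanan, Alexander Kirillov, Ross Girshick

(Reference translation) By design, average precision (AP) for object detection aims to treat all classes independently. On the one hand, this is desirable because it treats all classes, rare to frequent, equally. On the other hand, it ignores cross-category confidence calibration, a key property in real-world use cases. Unfortunately, on imbalanced, large-vocabulary datasets, the default implementation of AP is neither category independent nor does it directly reward properly calibrated detectors. In fact, the default implementation yields a gameable metric in which a simple, nonsensical re-ranking policy can improve AP by a large margin. To address these limitations, we introduce two complementary metrics. First, we make a simple fix to the default AP implementation, ensuring that it is truly independent across categories as originally intended. We benchmark recent advances in large-vocabulary detection and find that many reported gains do not translate into improvements under the new per-class independent evaluation, suggesting that recent improvements may arise from hard-to-interpret changes to cross-category rankings. Given the importance of reliably benchmarking cross-category rankings, we consider a pooled version of AP (AP-pool) that rewards properly calibrated detectors by directly comparing cross-category rankings. Finally, we revisit classical approaches to calibration and find that explicitly calibrating detectors improves the state of the art on AP-pool by 1.7 points.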

By design, average precision (AP) for object detection aims to treat all classes independently: AP is computed independently per category and averaged. On the one hand, this is desirable as it treats all classes, rare to frequent, equally. On the other hand, it ignores cross-category confidence calibration, a key property in real-world use cases. Unfortunately, we find that on imbalanced, large-vocabulary datasets, the default implementation of AP is neither category independent, nor does it directly reward properly calibrated detectors. In fact, we show that the default implementation produces a gameable metric, where a simple, nonsensical re-ranking policy can improve AP by a large margin. To address these limitations, we introduce two complementary metrics. First, we present a simple fix to the default AP implementation, ensuring that it is truly independent across categories as originally intended. We benchmark recent advances in large-vocabulary detection and find that many reported gains do not translate to improvements under our new per-class independent evaluation, suggesting recent improvements may arise from difficult to interpret changes to cross-category rankings. Given the importance of reliably benchmarking cross-category rankings, we consider a pooled version of AP (AP-pool) that rewards properly calibrated detectors by directly comparing cross-category rankings. Finally, we revisit classical approaches for calibration and find that explicitly calibrating detectors improves state-of-the-art on AP-pool by 1.7 points.
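A minimal sketch of the distinction the abstract draws between per-category AP and a pooled AP. It assumes a simplified AP (mean precision at each true-positive rank, with matching to ground truth already done) rather than the COCO or LVIS evaluation code, and the function names and toy detections are hypothetical. It only illustrates why per-category ranking cannot see cross-category calibration; it does not reproduce the paper's fix or its AP-pool implementation.

```python
from collections import defaultdict


def average_precision(scored, num_positives):
    """Mean of the precision values measured at each true-positive rank.

    `scored` is a list of (confidence, is_true_positive) pairs.
    """
    if num_positives == 0:
        return 0.0
    ranked = sorted(scored, key=lambda pair: pair[0], reverse=True)
    hits, precision_sum = 0, 0.0
    for rank, (_, is_tp) in enumerate(ranked, start=1):
        if is_tp:
            hits += 1
            precision_sum += hits / rank
    return precision_sum / num_positives


def per_category_ap(detections, positives):
    """Rank detections within each class, then average the per-class APs."""
    by_class = defaultdict(list)
    for cls, score, is_tp in detections:
        by_class[cls].append((score, is_tp))
    aps = [average_precision(by_class[c], n) for c, n in positives.items()]
    return sum(aps) / len(aps)


def pooled_ap(detections, positives):
    """Rank all detections together, so scores are compared across classes."""
    pooled = [(score, is_tp) for _, score, is_tp in detections]
    return average_precision(pooled, sum(positives.values()))


# Toy data (hypothetical): (class, confidence, is_true_positive)
detections = [
    ("cat", 0.90, True), ("cat", 0.80, True), ("cat", 0.30, False),
    ("dog", 0.95, False), ("dog", 0.60, True),
]
positives = {"cat": 2, "dog": 1}

# Rescaling one class's scores keeps that class's internal ranking intact.
rescaled = [(c, s * 0.4 if c == "dog" else s, tp) for c, s, tp in detections]

print(per_category_ap(detections, positives), per_category_ap(rescaled, positives))
print(pooled_ap(detections, positives), pooled_ap(rescaled, positives))
```

In this toy case, scaling the "dog" scores leaves per-category AP at 0.75, while the pooled value moves from about 0.64 to about 0.87, since only the pooled ranking compares confidences across classes.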

Translated: 2021-02-02 16:26:15 Published: 2021-02-01

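# Phoneme-BERT: Joint Language Modelling of Phoneme Sequence and ASR Transcript

Phoneme-BERT: Joint Language Modelling of Phoneme Sequence and ASR Transcript ( http://arxiv.org/abs/2102.00804v1 )

License: see the linked page

Mukuntha Narayanan Sundararaman, Ayush Kumar, Jithendra Vepa

(Reference translation) In recent years, the ability of ASR systems to recognize spoken utterances has improved significantly. However, it remains a challenging task for noisy and out-of-domain data, where substitution and deletion errors are prevalent in the transcribed text. These errors significantly degrade the performance of downstream tasks. In this work, we propose a BERT-style language model called PhonemeBERT that learns a joint language model with phoneme sequences and ASR transcripts in order to learn phonetic-aware representations that are robust to ASR errors. We show that PhonemeBERT can be used on downstream tasks that use phoneme sequences as additional features, and also in low-resource settings where only ASR transcripts are available for the downstream task and no phoneme information is used. We evaluate our approach extensively by generating noisy data for three benchmark datasets (Stanford Sentiment Treebank, TREC, and ATIS) for sentiment, question, and intent classification tasks, respectively. The proposed approach comprehensively outperforms state-of-the-art baselines on each dataset.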

Recent years have witnessed significant improvement in the ability of ASR systems to recognize spoken utterances. However, it is still a challenging task for noisy and out-of-domain data, where substitution and deletion errors are prevalent in the transcribed text. These errors significantly degrade the performance of downstream tasks. In this work, we propose a BERT-style language model, referred to as PhonemeBERT, that learns a joint language model with phoneme sequence and ASR transcript to learn phonetic-aware representations that are robust to ASR errors. We show that PhonemeBERT can be used on downstream tasks using phoneme sequences as additional features, and also in a low-resource setup where we only have ASR transcripts for the downstream tasks with no phoneme information available. We evaluate our approach extensively by generating noisy data for three benchmark datasets (Stanford Sentiment Treebank, TREC and ATIS) for sentiment, question and intent classification tasks respectively. The proposed approach comprehensively beats the state-of-the-art baselines on each dataset.

Translated: 2021-02-02 16:21:37 Published: 2021-02-01

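# Towards Speeding up Adversarial Training in Latent Spaces

Towards Speeding up Adversarial Training in Latent Spaces ( http://arxiv.org/abs/2102.00662v1 )

License: see the linked page

Yaguan Qian, Qiqi Shao, Tengteng Yao, Bin Wang, Shaoning Zeng, Zhaoquan Gu and Wassim Swaileh

(Reference translation) Adversarial training is the most effective way to defend against adversarial examples. However, existing adversarial training methods need to generate adversarial examples in the input space, which accounts for the main part of the time consumed. To speed up the training process, we propose a novel training method that does not need to generate real adversarial examples. We observe that a clean example is closer to the decision boundary of the class with the second largest logit component than to that of any other class besides its own. Thus, by adding perturbations to logits to generate Endogenous Adversarial Examples (EAEs), we can avoid computing gradients and speed up the training process. We further gain a deep insight into the existence of EAEs through the theory of manifolds. To guarantee that the added perturbation is within the constraint range, we use statistical distributions to select seed examples for crafting EAEs. Extensive experiments on CIFAR-10 and ImageNet show that, compared with the state-of-the-art "Free" and "Fast" methods, EAE adversarial training not only shortens the training time but also improves the robustness of the model. Moreover, EAE adversarial training has little impact on the accuracy on clean examples compared with existing methods.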

Adversarial training is widely considered the most effective way to defend against adversarial examples. However, existing adversarial training methods incur a prohibitive time cost, since they need to generate adversarial examples in the input space, which accounts for most of the total training time. To speed up the training process, we propose a novel adversarial training method that does not need to generate real adversarial examples. We notice that a clean example is closer to the decision boundary of the class with the second largest logit component than to that of any other class besides its own class. Thus, by adding perturbations to logits to generate Endogenous Adversarial Examples (EAEs), i.e., adversarial examples in the latent space, we can avoid calculating gradients and speed up the training process. We further gain a deep insight into the existence of EAEs through manifold theory. To guarantee that the added perturbation is within the range of the constraint, we use statistical distributions to select seed examples to craft EAEs. Extensive experiments are conducted on CIFAR-10 and ImageNet, and the results show that, compared with the state-of-the-art "Free" and "Fast" methods, our EAE adversarial training not only shortens the training time, but also enhances the robustness of the model. Moreover, EAE adversarial training has less impact on the accuracy on clean examples than the existing methods.
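A rough sketch of the idea of perturbing logits instead of inputs, so that no input-space gradient is needed. This is an illustration under stated assumptions, not the paper's EAE construction: the perturbation rule (lower the true-class logit, raise the runner-up logit by the same `epsilon`), the function names, and the toy logits are all hypothetical, and the seed selection via statistical distributions described in the abstract is omitted.

```python
import math


def cross_entropy(logits, label):
    """Softmax cross-entropy for one example, computed stably."""
    peak = max(logits)
    log_sum_exp = peak + math.log(sum(math.exp(z - peak) for z in logits))
    return log_sum_exp - logits[label]


def perturb_logits(logits, true_label, epsilon):
    """Hypothetical logit-space perturbation (not the paper's exact rule).

    Moves the example toward its nearest rival: the true-class logit is
    lowered and the largest non-true logit is raised, each by `epsilon`.
    """
    rivals = [k for k in range(len(logits)) if k != true_label]
    runner_up = max(rivals, key=lambda k: logits[k])
    perturbed = list(logits)
    perturbed[true_label] -= epsilon
    perturbed[runner_up] += epsilon
    return perturbed, runner_up


logits = [2.0, 1.5, -0.3]  # toy logits; class 0 is assumed to be the true label
perturbed, rival = perturb_logits(logits, true_label=0, epsilon=0.3)

clean_loss = cross_entropy(logits, 0)
adversarial_loss = cross_entropy(perturbed, 0)
print(rival, clean_loss, adversarial_loss)
```

Training on the loss of the perturbed logits would then play the role that the loss on input-space adversarial examples plays in standard adversarial training, at the cost of a few arithmetic operations per example.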

Translated: 2021-02-02 16:17:38 Published: 2021-02-01

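# Autonomous Navigation through intersections with Graph Convolutional Networks and Conditional Imitation Learning for Self-driving Cars

Autonomous Navigation through intersections with Graph Convolutional Networks and Conditional Imitation Learning for Self-driving Cars ( http://arxiv.org/abs/2102.00675v1 )

License: see the linked page

Xiaodong Mei, Yuxiang Sun, Yuying Chen, Congcong Liu, Ming Liu

(Reference translation) In autonomous driving, navigating through unsignaled intersections with many moving traffic participants is a challenging task. To address this, we propose G-CIL, a novel branched network for navigation policy learning. Specifically, we first represent such dynamic environments as graph-structured data and propose an effective strategy for edge definition. Graph convolutional neural networks are then used as the perception module to capture global and geometric features from the environment. To generate a safe and efficient navigation policy, we incorporate a conditional imitation learning algorithm to learn driving behaviors directly from expert demonstrations. The proposed network can handle multiple surrounding vehicles and generate optimal control actions (e.g., steering angle and throttle) according to given high-level commands (e.g., turn left towards the global goal). Evaluations on unsignaled intersections with various traffic densities show that our end-to-end trainable neural network outperforms the baselines with a higher success rate and shorter navigation time.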

In autonomous driving, navigation through unsignaled intersections with many traffic participants moving around is a challenging task. To provide a solution to this problem, we propose a novel branched network G-CIL for navigation policy learning. Specifically, we first represent such dynamic environments as graph-structured data and propose an effective strategy for edge definition to aggregate surrounding information for the ego-vehicle. Then graph convolutional neural networks are used as the perception module to capture global and geometric features from the environment. To generate a safe and efficient navigation policy, we further combine it with a conditional imitation learning algorithm to learn driving behaviors directly from expert demonstrations. Our proposed network is capable of handling a varying number of surrounding vehicles and generating optimal control actions (e.g., steering angle and throttle) according to the given high-level commands (e.g., turn left towards the global goal). Evaluations on unsignaled intersections with various traffic densities demonstrate that our end-to-end trainable neural network outperforms the baselines with a higher success rate and shorter navigation time.

Translated: 2021-02-02 16:16:57 Published: 2021-02-01

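# Fail-Safe Execution of Deep Learning based Systems through Uncertainty Monitoring

Fail-Safe Execution of Deep Learning based Systems through Uncertainty Monitoring ( http://arxiv.org/abs/2102.00902v1 )

License: see the linked page

Michael Weiss and Paolo Tonella

(Reference translation) Modern software systems rely on Deep Neural Networks (DNN) when processing complex, unstructured inputs such as images, videos, natural language text, and audio signals. Given the intractably large size of such input spaces, the intrinsic limitations of learning algorithms, and the ambiguity of the expected predictions for some inputs, there is no guarantee that a DNN's predictions are always correct. A fail-safe Deep Learning based System (DLS) is one equipped to handle DNN faults by means of a supervisor that can recognize predictions that should not be trusted and activate a healing procedure that brings the DLS to a safe state. In this paper, we propose an approach that uses DNN uncertainty estimators to implement such a supervisor. We first discuss the advantages and disadvantages of existing approaches to measuring uncertainty for DNNs and propose novel metrics for the empirical assessment of supervisors that rely on such approaches. We then describe our publicly available tool UNCERTAINTY-WIZARD, which transparently estimates uncertainty for regular tf.keras DNNs. Finally, we discuss a large-scale study conducted on four different subjects to empirically validate the approach, and report the lessons learned as guidance for software engineers who want to monitor uncertainty for fail-safe execution of DLS.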

Modern software systems rely on Deep Neural Networks (DNN) when processing complex, unstructured inputs, such as images, videos, natural language texts or audio signals. Given the intractably large size of such input spaces, the intrinsic limitations of learning algorithms, and the ambiguity about the expected predictions for some of the inputs, not only is there no guarantee that a DNN's predictions are always correct, but developers must also safely assume a low, though not negligible, error probability. A fail-safe Deep Learning based System (DLS) is one equipped to handle DNN faults by means of a supervisor, capable of recognizing predictions that should not be trusted and that should activate a healing procedure bringing the DLS to a safe state. In this paper, we propose an approach to use DNN uncertainty estimators to implement such a supervisor. We first discuss the advantages and disadvantages of existing approaches to measure uncertainty for DNNs and propose novel metrics for the empirical assessment of the supervisor that rely on such approaches. We then describe our publicly available tool UNCERTAINTY-WIZARD, which allows transparent estimation of uncertainty for regular tf.keras DNNs. Lastly, we discuss a large-scale study conducted on four different subjects to empirically validate the approach, reporting the lessons learned as guidance for software engineers who intend to monitor uncertainty for fail-safe execution of DLS.
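A minimal sketch of what an uncertainty-based supervisor could look like. It assumes that several stochastic forward passes (for example, with dropout kept active) are already available as lists of class probabilities, uses predictive entropy as the uncertainty measure, and rejects predictions above a fixed threshold. The function names, the threshold, and the toy probabilities are hypothetical; this does not use or mirror the UNCERTAINTY-WIZARD API.

```python
import math


def mean_prediction(samples):
    """Average class probabilities over several stochastic forward passes."""
    count = len(samples)
    return [sum(s[k] for s in samples) / count for k in range(len(samples[0]))]


def predictive_entropy(probabilities):
    """Entropy in nats of a class distribution; higher means more uncertain."""
    return -sum(p * math.log(p) for p in probabilities if p > 0.0)


def supervise(samples, threshold):
    """Accept the prediction only if its uncertainty is below `threshold`."""
    probabilities = mean_prediction(samples)
    uncertainty = predictive_entropy(probabilities)
    if uncertainty > threshold:
        # A real system would trigger its healing procedure here.
        return {"accepted": False, "label": None, "uncertainty": uncertainty}
    label = max(range(len(probabilities)), key=lambda k: probabilities[k])
    return {"accepted": True, "label": label, "uncertainty": uncertainty}


# Toy softmax outputs from two stochastic passes each (hypothetical values).
confident = [[0.97, 0.02, 0.01], [0.95, 0.03, 0.02]]
ambiguous = [[0.50, 0.40, 0.10], [0.30, 0.60, 0.10]]

print(supervise(confident, threshold=0.5))
print(supervise(ambiguous, threshold=0.5))
```

The threshold is the main design choice: lowering it rejects more wrong predictions but also more correct ones, which is the kind of trade-off a supervisor assessment metric has to capture.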

Translated: 2021-02-02 16:16:18 Published: 2021-02-01

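# Risk Aware and Multi-Objective Decision Making with Distributional Monte Carlo Tree Search

Risk Aware and Multi-Objective Decision Making with Distributional Monte Carlo Tree Search ( http://arxiv.org/abs/2102.00966v1 )

License: see the linked page

Conor F. Hayes, Mathieu Reymond, Diederik M. Roijers, Enda Howley, Patrick Mannion

(Reference translation) In many risk-aware and multi-objective reinforcement learning settings, the utility of the user is derived from a single execution of a policy. In these settings, making decisions based on the average future return is not suitable. For example, in a medical setting a patient may have only one opportunity to treat their illness. When making a decision, the expected return (known in reinforcement learning as the value) cannot account for the range of adverse or positive outcomes a decision may have. Our key insight is that we should use the distribution over expected future returns to represent the critical information the agent requires at decision time. In this paper, we propose Distributional Monte Carlo Tree Search, an algorithm that learns a posterior distribution over the utility of the different returns attainable from individual policy executions. Moreover, our algorithm outperforms the state of the art in multi-objective reinforcement learning for the expected utility of the returns.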

In many risk-aware and multi-objective reinforcement learning settings, the utility of the user is derived from the single execution of a policy. In these settings, making decisions based on the average future returns is not suitable. For example, in a medical setting a patient may only have one opportunity to treat their illness. When making a decision, just the expected return -- known in reinforcement learning as the value -- cannot account for the potential range of adverse or positive outcomes a decision may have. Our key insight is that we should use the distribution over expected future returns differently to represent the critical information that the agent requires at decision time. In this paper, we propose Distributional Monte Carlo Tree Search, an algorithm that learns a posterior distribution over the utility of the different possible returns attainable from individual policy executions, resulting in good policies for both risk-aware and multi-objective settings. Moreover, our algorithm outperforms the state-of-the-art in multi-objective reinforcement learning for the expected utility of the returns.
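A small worked example of why the expected return can mislead when utility comes from a single policy execution. The two return distributions and the loss-averse utility function are hypothetical, and the sketch only contrasts the two decision criteria; it does not implement the tree search or the posterior learning described in the abstract.

```python
def expected_return(distribution):
    """Mean return of a {return: probability} distribution."""
    return sum(prob * ret for ret, prob in distribution.items())


def expected_utility(distribution, utility):
    """Mean of the utility applied to each possible single-execution return."""
    return sum(prob * utility(ret) for ret, prob in distribution.items())


def loss_averse(ret):
    """Hypothetical nonlinear utility: losses weigh three times as much as gains."""
    return ret if ret >= 0 else 3.0 * ret


# Hypothetical return distributions of two policies.
policies = {
    "risky": {20.0: 0.5, -10.0: 0.5},
    "safe": {3.0: 1.0},
}

best_by_return = max(policies, key=lambda name: expected_return(policies[name]))
best_by_utility = max(
    policies, key=lambda name: expected_utility(policies[name], loss_averse)
)
print(best_by_return, best_by_utility)
```

Here the risky policy has the higher expected return (5 versus 3), yet its expected utility under the loss-averse function is -5 versus 3 for the safe policy, so the two criteria pick different policies.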

Translated: 2021-02-02 16:15:31 Published: 2021-02-01

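# Dosimetric impact of physician style variations in contouring CTV for post-operative prostate cancer: A deep learning based simulation study

Dosimetric impact of physician style variations in contouring CTV for post-operative prostate cancer: A deep learning based simulation study ( http://arxiv.org/abs/2102.01006v1 )

License: see the linked page

Anjali Balagopal, Dan Nguyen, Maryam Mashayekhi, Howard Morgan, Aurelie Garant, Neil Desai, Raquibul Hannan, Mu-Han Lin, Steve Jiang

(Reference translation) In tumor segmentation, inter-observer variation is recognized as a significant problem. It is even more significant in clinical target volume (CTV) segmentation, particularly in post-operative settings where no gross tumor exists. In this scenario, the CTV is not an anatomically established structure but one determined by the physician based on the clinical guideline used, the preferred trade-off between tumor control and toxicity, experience, training background, and so on. This results in high inter-observer variability between physicians. Although inter-observer variability is regarded as a problem, its dosimetric consequence is still unclear because multiple physician CTV contours are not available for each patient and dose planning takes a significant amount of time. In this study, we analyze the impact of these stylistic variations on organs-at-risk (OAR) dose by simulating the clinical workflow using deep learning. For a patient previously treated by one physician, we use DL-based tools to simulate how other physicians would contour the CTV and what the corresponding dose distributions would look like for this patient. To simulate multiple physician styles, we use an existing in-house CTV segmentation model that produces physician style-aware segmentations. The corresponding dose distribution is predicted using another in-house deep learning tool that, averaged across all structures, predicts dose within 3% of the prescription dose on the test data. For each test patient, four different physician-style CTVs are considered and four dose distributions are analyzed. Comparing OAR dose metrics shows that, although physician style variations give organs different doses, all important dose metrics except the maximum dose point are within clinically acceptable limits.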

In tumor segmentation, inter-observer variation is acknowledged to be a significant problem. This is even more significant in clinical target volume (CTV) segmentation, specifically in post-operative settings, where a gross tumor does not exist. In this scenario, the CTV is not an anatomically established structure but rather one determined by the physician based on the clinical guideline used, the preferred trade-off between tumor control and toxicity, their experience, training background, and so on. This results in high inter-observer variability between physicians. Inter-observer variability has been considered an issue; however, its dosimetric consequence is still unclear, due to the absence of multiple physician CTV contours for each patient and the significant amount of time required for dose planning. In this study, we analyze the impact that these physician stylistic variations have on organs-at-risk (OAR) dose by simulating the clinical workflow using deep learning. For a given patient previously treated by one physician, we use DL-based tools to simulate how other physicians would contour the CTV and how the corresponding dose distributions would look for this patient. To simulate multiple physician styles, we use a previously developed in-house CTV segmentation model that can produce physician style-aware segmentations. The corresponding dose distribution is predicted using another in-house deep learning tool, which, averaging across all structures, is capable of predicting dose within 3% of the prescription dose on the test data. For every test patient, four different physician-style CTVs are considered and four different dose distributions are analyzed. OAR dose metrics are compared, showing that even though physician style variations result in organs getting different doses, all the important dose metrics except the maximum dose point are within the clinically acceptable limit.

翻訳日:2021-02-02 16:14:55 公開日:2021-02-01

# 大規模言語モデルの微調整のためのスケーリングフェデレーション学習

Scaling Federated Learning for Fine-tuning of Large Language Models ( http://arxiv.org/abs/2102.00875v1 )

ライセンス: Link先を確認

Agrin Hilmkil and Sebastian Callh and Matteo Barbieri and Leon Ren\'e S\"utfeld and Edvin Listo Zec and Olof Mogren

(参考訳) Federated Learning(FL)は分散コンピューティングと分散データに対する有望なアプローチであり、法的なフレームワークに対するプライバシーとコンプライアンスのレベルを提供します。これにより、FLは消費者およびヘルスケアアプリケーションの両方に魅力的になります。この領域は積極的に検討されているが、より大きな言語モデルの文脈でflを調査した研究はほとんどなく、タスク、アーキテクチャ、クライアントの数、その他の関連する要因間での堅牢性に関する包括的なレビューが欠けている。本稿では,共用学習環境におけるトランスフォーマティブ言語モデルの微調整について検討する。我々は,感情分析や著者識別などのテキスト分類タスクにおいて,さまざまなサイズのBERT変異(BERT, ALBERT, DistilBERT)を評価する。フェデレーション平均設定におけるタスクパフォーマンスに対する分散計算の影響を評価するために、32までのクライアント数を広範囲に監視します。実験結果から, 評価モデルの大規模化は, 一般にフェデレーショントレーニングを禁止していないことが示唆されるが, 異なるモデルがフェデレーション平均化を様々な程度に扱うことが判明した。特にDistilBERTは、より多くのクライアントと大幅に遅く収束し、いくつかの状況下では、チャンスレベルのパフォーマンスに崩壊します。この問題を調査することは、将来の研究に興味深い視点をもたらす。

Federated learning (FL) is a promising approach to distributed compute, as well as distributed data, and provides a level of privacy and compliance with legal frameworks. This makes FL attractive for both consumer and healthcare applications. While the area is actively being explored, few studies have examined FL in the context of larger language models and there is a lack of comprehensive reviews of robustness across tasks, architectures, numbers of clients, and other relevant factors. In this paper, we explore the fine-tuning of Transformer-based language models in a federated learning setting. We evaluate three popular BERT variants of different sizes (BERT, ALBERT, and DistilBERT) on a number of text classification tasks such as sentiment analysis and author identification. We perform an extensive sweep over the number of clients, ranging up to 32, to evaluate the impact of distributed compute on task performance in the federated averaging setting. While our findings suggest that the large sizes of the evaluated models are not generally prohibitive to federated training, we found that the different models handle federated averaging to a varying degree. Most notably, DistilBERT converges significantly more slowly with larger numbers of clients and, under some circumstances, even collapses to chance-level performance. Investigating this issue presents an interesting perspective for future research.

翻訳日:2021-02-02 16:14:02 公開日:2021-02-01
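
The federated averaging setting mentioned in the abstract above can be illustrated with a minimal sketch of the server-side aggregation step. This is not the authors' code: the function name `federated_average`, the flat-list representation of model parameters, and the example numbers are assumptions made for illustration. Each client's parameters are weighted by its share of the total training examples.

```python
def federated_average(client_weights, client_sizes):
    """Weighted average of client parameter vectors (FedAvg-style aggregation).

    client_weights: list of equally long lists of floats, one per client
                    (a hypothetical flattened view of each client's model).
    client_sizes:   number of local training examples held by each client.
    """
    if not client_weights or len(client_weights) != len(client_sizes):
        raise ValueError("need at least one client and one size per client")
    total = sum(client_sizes)
    averaged = [0.0] * len(client_weights[0])
    for weights, size in zip(client_weights, client_sizes):
        share = size / total  # this client's fraction of all examples
        for i, value in enumerate(weights):
            averaged[i] += share * value
    return averaged


# Two hypothetical clients; the second holds three times as much data.
global_weights = federated_average([[1.0, 2.0], [3.0, 4.0]], [1, 3])
print(global_weights)
```

In a full training loop this step would alternate with local fine-tuning on each client; the sketch covers only the averaging.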

# End2End音響とセマンティックトランスダクション

End2End Acoustic to Semantic Transduction ( http://arxiv.org/abs/2102.01013v1 )

ライセンス: Link先を確認

Valentin Pelloin, Nathalie Camelin, Antoine Laurent, Renato De Mori, Antoine Caubri\`ere, Yannick Est\`eve, Sylvain Meignier

(参考訳) 本稿では,注意機構を用いた新しいエンドツーエンドシーケンス・ツー・シーケンス音声言語理解モデルを提案する。意味的内容を仮説化するために、コンテキスト音響特徴を確実に選択する。アコースティックスパンからすべての発音された単語や概念を抽出できる初期アーキテクチャを設計、試験する。浅い融合言語モデルでは、このシステムはフランスのMEDIAコーパスにおける13.6のコンセプトエラーレート(CER)と18.5のコンセプト値エラーレート(CVER)に達し、最先端技術と比較して絶対2.8ポイントの削減を実現している。そこで,概念とその価値を仮説化するモデルを提案する。この変換は、新しいタイプのコンテキストなしで15.4 CERと21.6 CVERに達する。

In this paper, we propose a novel end-to-end sequence-to-sequence spoken language understanding model using an attention mechanism. It reliably selects contextual acoustic features in order to hypothesize semantic contents. An initial architecture capable of extracting all pronounced words and concepts from acoustic spans is designed and tested. With a shallow fusion language model, this system reaches a 13.6 concept error rate (CER) and an 18.5 concept value error rate (CVER) on the French MEDIA corpus, achieving an absolute reduction of 2.8 points compared to the state of the art. Then, an original model is proposed for hypothesizing concepts and their values. This transduction reaches a 15.4 CER and a 21.6 CVER without any new type of context.

翻訳日:2021-02-02 16:13:18 公開日:2021-02-01
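
Error rates such as the CER quoted above are commonly computed like a word error rate: the edit distance between the reference and hypothesized label sequences, divided by the reference length. The sketch below assumes that convention; the abstract does not spell out the exact scoring protocol, and the concept labels in the example are hypothetical.

```python
def edit_distance(reference, hypothesis):
    """Levenshtein distance between two sequences of labels."""
    previous = list(range(len(hypothesis) + 1))
    for i, ref_item in enumerate(reference, start=1):
        current = [i]
        for j, hyp_item in enumerate(hypothesis, start=1):
            substitution = previous[j - 1] + (0 if ref_item == hyp_item else 1)
            deletion = previous[j] + 1       # reference item missing from hypothesis
            insertion = current[j - 1] + 1   # extra item in hypothesis
            current.append(min(substitution, deletion, insertion))
        previous = current
    return previous[-1]


def error_rate(reference, hypothesis):
    """Percentage of edit operations relative to the reference length."""
    if not reference:
        raise ValueError("reference must not be empty")
    return 100.0 * edit_distance(reference, hypothesis) / len(reference)


# Hypothetical concept sequences: one reference concept is missed.
reference = ["command", "date", "city", "price"]
hypothesis = ["command", "city", "price"]
print(error_rate(reference, hypothesis))
```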

# 時系列回帰と予測のためのニューラルネットワークにおける自己相関誤差の調整

Adjusting for Autocorrelated Errors in Neural Networks for Time Series Regression and Forecasting ( http://arxiv.org/abs/2101.12578v2 )

ライセンス: Link先を確認

Fan-Keng Sun and Christopher I. Lang and Duane S. Boning

(参考訳) 多くの場合、既知のパラメトリックモデル構造を用いて時系列データの高精度なモデルを生成することは困難である。これに対し、ニューラルネットワークを用いて時系列を概ねモデル化する研究が増えている。時系列でニューラルネットワークをトレーニングする一般的な前提は、異なる時間ステップでのエラーは非相関であるということである。しかし、データの時間性のため、多くのケースでエラーは自己相関しており、そのような最大推定は不正確である。本稿では,自己相関係数をモデルパラメータと協調して学習し,自己相関誤差に適応することを提案する。時系列回帰の場合, 大規模実験では, 特に自己相関が強い場合に, プライス-ウィンステン法を上回っていることが示された。さらに,本手法を時系列予測に拡張し,様々な最先端モデルで適用する。実世界のデータセットの広範囲にわたる結果から,本手法はほぼすべてのケースで性能が向上することが示された。

In many cases, it is difficult to generate highly accurate models for time series data using a known parametric model structure. In response, an increasing body of research focuses on using neural networks to model time series approximately. A common assumption in training neural networks on time series is that the errors at different time steps are uncorrelated. However, due to the temporality of the data, errors are actually autocorrelated in many cases, which makes such maximum likelihood estimation inaccurate. In this paper, we propose to learn the autocorrelation coefficient jointly with the model parameters in order to adjust for autocorrelated errors. For time series regression, large-scale experiments indicate that our method outperforms the Prais-Winsten method, especially when the autocorrelation is strong. Furthermore, we broaden our method to time series forecasting and apply it with various state-of-the-art models. Results across a wide range of real-world datasets show that our method enhances performance in almost all cases.

翻訳日:2021-02-02 16:12:45 公開日:2021-02-01
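
One way to read the adjustment described above is as a first-order (AR(1)) correction of the residuals: instead of penalizing the raw errors e_t, one penalizes e_t - rho * e_{t-1}, where rho is the autocorrelation coefficient. The sketch below only evaluates such a loss for a given rho. Treating rho as a trainable parameter optimized together with the network, as the abstract describes, is left out, and the function name and data are illustrative assumptions.

```python
def adjusted_mse(targets, predictions, rho):
    """Mean squared error of AR(1)-adjusted residuals e_t - rho * e_{t-1}."""
    if len(targets) != len(predictions) or len(targets) < 2:
        raise ValueError("need two or more aligned time steps")
    errors = [y - p for y, p in zip(targets, predictions)]
    adjusted = [errors[t] - rho * errors[t - 1] for t in range(1, len(errors))]
    return sum(a * a for a in adjusted) / len(adjusted)


# Hypothetical series whose errors follow e_t = 0.5 * e_{t-1} exactly.
targets = [1.0, 0.5, 0.25, 0.125]
predictions = [0.0, 0.0, 0.0, 0.0]
print(adjusted_mse(targets, predictions, rho=0.0))  # plain MSE on steps 1..3
print(adjusted_mse(targets, predictions, rho=0.5))  # correlation fully removed
```

With rho = 0 the loss reduces to the ordinary squared error on all but the first step, which corresponds to the uncorrelated-errors assumption the abstract criticizes.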

# 対称正定値行列多様体上の確率的学習ベクトル量子化

Probabilistic Learning Vector Quantization on Manifold of Symmetric Positive Definite Matrices ( http://arxiv.org/abs/2102.00667v1 )

ライセンス: Link先を確認

Fengzhen Tang, Haifeng Feng, Peter Tino, Bailu Si, Daxiong Ji

(参考訳) 本稿では,確率論的学習ベクトル量子化の枠組みにおける多様体値データの新しい分類法を開発する。多くの分類シナリオにおいて、データは本質的に曲線リーマン多様体上に存在する点である対称正定値行列によって自然に表現することができる。リーマン多様体の非ユークリッド幾何学のために、伝統的なユークリッド機械学習アルゴリズムはそのようなデータに悪い結果をもたらす。本稿では,リーマン自然計量(アフィン不変計量)を備えた対称正定行列の多様体上に存在するデータ点の確率的学習ベクトル量子化アルゴリズムを一般化する。誘導されたリーマン距離を利用して、確率学習リーマン空間量子化アルゴリズムを導出し、リーマン勾配降下による学習規則を得る。合成データ,画像データ,運動画像脳波データに関する実証的研究は,提案手法の優れた性能を示す。

In this paper, we develop a new classification method for manifold-valued data in the framework of probabilistic learning vector quantization. In many classification scenarios, the data can be naturally represented by symmetric positive definite matrices, which are inherently points that live on a curved Riemannian manifold. Due to the non-Euclidean geometry of Riemannian manifolds, traditional Euclidean machine learning algorithms yield poor results on such data. In this paper, we generalize the probabilistic learning vector quantization algorithm for data points living on the manifold of symmetric positive definite matrices equipped with the Riemannian natural metric (affine-invariant metric). By exploiting the induced Riemannian distance, we derive the probabilistic learning Riemannian space quantization algorithm, obtaining the learning rule through Riemannian gradient descent. Empirical investigations on synthetic data, image data, and motor imagery EEG data demonstrate the superior performance of the proposed method.

翻訳日:2021-02-02 16:06:51 公開日:2021-02-01
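
The Riemannian distance induced by the affine-invariant metric has a standard closed form: d(A, B) is the square root of the sum of squared logarithms of the eigenvalues of A^{-1}B. The sketch below evaluates it for 2x2 symmetric positive definite matrices using only the standard library. The restriction to 2x2 inputs and the function name are simplifications for illustration and are not taken from the paper.

```python
import math


def airm_distance_2x2(a, b):
    """Affine-invariant Riemannian distance between two 2x2 SPD matrices.

    Uses d(A, B) = sqrt(sum_i log(lambda_i)^2), where lambda_i are the
    eigenvalues of A^{-1} B (equal to those of A^{-1/2} B A^{-1/2}).
    Inputs are nested lists [[a11, a12], [a21, a22]]; SPD is assumed, not checked.
    """
    (a11, a12), (a21, a22) = a
    (b11, b12), (b21, b22) = b
    det_a = a11 * a22 - a12 * a21
    # M = A^{-1} B, written out with the 2x2 inverse formula.
    m11 = (a22 * b11 - a12 * b21) / det_a
    m12 = (a22 * b12 - a12 * b22) / det_a
    m21 = (-a21 * b11 + a11 * b21) / det_a
    m22 = (-a21 * b12 + a11 * b22) / det_a
    # Eigenvalues of M from its trace and determinant (real and positive for SPD inputs).
    half_trace = (m11 + m22) / 2.0
    det_m = m11 * m22 - m12 * m21
    root = math.sqrt(max(half_trace * half_trace - det_m, 0.0))
    eigenvalues = (half_trace + root, half_trace - root)
    return math.sqrt(sum(math.log(lam) ** 2 for lam in eigenvalues))


identity = [[1.0, 0.0], [0.0, 1.0]]
a = [[2.0, 1.0], [1.0, 2.0]]
b = [[3.0, 0.0], [0.0, 1.0]]
print(airm_distance_2x2(a, a))         # a point is at distance zero from itself
print(airm_distance_2x2(a, b))
print(airm_distance_2x2(identity, b))
```

For larger matrices the same formula applies, but the eigenvalues would normally come from a generalized eigenvalue solver in a numerical library.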

# Surrogate Set Classification による複数非ラベルデータセットのバイナリ分類

Binary Classification from Multiple Unlabeled Datasets via Surrogate Set Classification ( http://arxiv.org/abs/2102.00678v1 )

ライセンス: Link先を確認

Shida Lei, Nan Lu, Gang Niu, Issei Sato, Masashi Sugiyama

(参考訳) 高アノテーションコストに対処するために,弱い教師データのみから分類器を訓練することが近年注目されている。様々なアプローチの中で、完全に教師なしの分類からの監督を強化することは有望な方向であり、通常はクラス優先を唯一の監督として採用し、ラベルなし(u)データセットからバイナリ分類器を訓練する。既存のリスク一貫性メソッドは理論的には高い柔軟性を持つが、2つのUセットからのみ学ぶことができる。本稿では,mU集合から$m\ge2$に対して二進分類を行う新しい手法を提案する。本研究の目的は,各観測データから u セットが描画されるかを予測することを目的とした,surrogate set classification (ssc) と呼ばれる補助的分類課題を検討することである。 SSCは標準(マルチクラス)の分類法で解決でき、SSCの解を用いて、ある線形フラクタル変換によって最終二項分類器を得る。我々は,この手法を柔軟かつ効率的なエンドツーエンドのディープラーニングフレームワークで構築し,分類器一貫性を証明した。実験により,提案手法が最先端手法よりも優れていることを示す。

To cope with high annotation costs, training a classifier only from weakly supervised data has attracted a great deal of attention these days. Among various approaches, strengthening supervision from completely unsupervised classification is a promising direction, which typically employs class priors as the only supervision and trains a binary classifier from unlabeled (U) datasets. While existing risk-consistent methods are theoretically grounded with high flexibility, they can learn only from two U sets. In this paper, we propose a new approach for binary classification from $m$ U sets for $m\ge2$. Our key idea is to consider an auxiliary classification task called surrogate set classification (SSC), which is aimed at predicting from which U set each observed data point is drawn. SSC can be solved by a standard (multi-class) classification method, and we use the SSC solution to obtain the final binary classifier through a certain linear-fractional transformation. We build our method in a flexible and efficient end-to-end deep learning framework and prove it to be classifier-consistent. Through experiments, we demonstrate the superiority of our proposed method over state-of-the-art methods.

翻訳日:2021-02-02 16:06:20 公開日:2021-02-01

# VAEにおけるクラス関連およびクラス非依存因子の半監督的解離

Semi-Supervised Disentanglement of Class-Related and Class-Independent Factors in VAE ( http://arxiv.org/abs/2102.00892v1 )

ライセンス: Link先を確認

Sina Hajimiri, Aryo Lotfi, Mahdieh Soleymani Baghshah

(参考訳) 近年,不整合表現を学習するための変分オートエンコーダのフレームワークの拡張が注目されている。そこで我々は,データ変動のクラス関連要因とクラス非依存要因を分離できるフレームワークを提案する。本フレームワークは,データからクラス関連因子を抽出するプロセスを改善するために,潜在空間における注意機構を用いる。また,混合モデルを学習可能な事前分布として活用し,目的関数にbhattacharyya係数を組み込んで重なり合う混合を防止することで,データ分布の多様性を扱う。我々のモデルエンコーダは、表現の解釈性を改善するために、ラベル付きデータの少ない半教師付き方式でさらに訓練されている。実験により,本フレームワークはクラスやクラスに依存しない変動要因を分離し,解釈可能な特徴を学習することを示した。さらに,各データセットの定量的,定性的な結果を用いて,モデルの性能を実証する。

In recent years, extending variational autoencoder's framework to learn disentangled representations has received much attention. We address this problem by proposing a framework capable of disentangling class-related and class-independent factors of variation in data. Our framework employs an attention mechanism in its latent space in order to improve the process of extracting class-related factors from data. We also deal with the multimodality of data distribution by utilizing mixture models as learnable prior distributions, as well as incorporating the Bhattacharyya coefficient in the objective function to prevent highly overlapping mixtures. Our model's encoder is further trained in a semi-supervised manner, with a small fraction of labeled data, to improve representations' interpretability. Experiments show that our framework disentangles class-related and class-independent factors of variation and learns interpretable features. Moreover, we demonstrate our model's performance with quantitative and qualitative results on various datasets.

翻訳日:2021-02-02 16:05:39 公開日:2021-02-01
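
The Bhattacharyya coefficient mentioned above measures the overlap between two distributions (1 for identical distributions, approaching 0 for well separated ones). For two univariate Gaussians it has a standard closed form, sketched below. How the paper evaluates it for its mixture priors and how the term enters the objective are not specified in the abstract, so the univariate setting and the function name are assumptions for illustration.

```python
import math


def bhattacharyya_coefficient(mu1, sigma1, mu2, sigma2):
    """Overlap between N(mu1, sigma1^2) and N(mu2, sigma2^2), in (0, 1]."""
    variance_sum = sigma1 ** 2 + sigma2 ** 2
    scale = math.sqrt(2.0 * sigma1 * sigma2 / variance_sum)
    return scale * math.exp(-((mu1 - mu2) ** 2) / (4.0 * variance_sum))


print(bhattacharyya_coefficient(0.0, 1.0, 0.0, 1.0))   # identical components
print(bhattacharyya_coefficient(0.0, 1.0, 10.0, 1.0))  # well separated components
```

Penalizing this quantity between pairs of mixture components would push them apart, which matches the stated goal of preventing highly overlapping mixtures.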

# 全最小二乗位相検索

Total least squares phase retrieval ( http://arxiv.org/abs/2102.00927v1 )

ライセンス: Link先を確認

Sidharth Gupta and Ivan Dokmani\'c

(参考訳) 本稿では,検出ベクトルの誤差による位相探索問題に対処する。最近の位相探索法は最小二乗法(LS)の定式化に基づいており、2次測定の誤差を仮定している。このアプローチを拡張し、オペレータエラーの線形逆問題に精通した総最小二乗(TLS)フレームワークを採用することで、センシングベクターのエラーを処理する。本稿では, 位相探索問題の勾配降下と特異な幾何学を用いて, 単純かつ効率的なTLS解を得る方法を示す。さらに、我々はソリューションエラーを計算することを可能にするセンシングベクターと測定に関してTLSおよびLSソリューションの勾配を導き出します。これらのエラー式を分析することで、各メソッドがいつうまく機能すべきかを決定します。シミュレーションを行い,本手法の利点を実証し,解析結果の検証を行う。さらに,検出ベクトルと測定誤差を自然に含む実光ハードウェア上で位相探索実験を行うことにより,本手法の有効性を実証する。

We address the phase retrieval problem with errors in the sensing vectors. A number of recent methods for phase retrieval are based on least squares (LS) formulations which assume errors in the quadratic measurements. We extend this approach to handle errors in the sensing vectors by adopting the total least squares (TLS) framework familiar from linear inverse problems with operator errors. We show how gradient descent and the peculiar geometry of the phase retrieval problem can be used to obtain a simple and efficient TLS solution. Additionally, we derive the gradients of the TLS and LS solutions with respect to the sensing vectors and measurements which enables us to calculate the solution errors. By analyzing these error expressions we determine when each method should perform well. We run simulations to demonstrate the benefits of our method and verify the analysis. We further demonstrate the effectiveness of our approach by performing phase retrieval experiments on real optical hardware which naturally contains sensing vector and measurement errors.

翻訳日:2021-02-02 16:05:03 公開日:2021-02-01

# 確率的勾配降下法のための情報理論的一般化境界

Information-Theoretic Generalization Bounds for Stochastic Gradient Descent ( http://arxiv.org/abs/2102.00931v1 )

ライセンス: Link先を確認

Gergely Neu

(参考訳) 一般的な非凸損失関数を最適化するための確率勾配勾配法の一般化特性について検討する。我々の主な貢献は,sgdで計算されたイテレートの経路に沿って評価された確率勾配の局所統計に依存する一般化誤差の上限を提供することである。我々の境界が依存する重要な要因は、勾配のばらつき(データ分布に関する)と、SGD経路に沿った目的関数の局所的滑らかさ、最終的な出力に対する摂動に対する損失関数の感度である。当社の重要な技術ツールは、以前にSGDのランダム化変種を分析するために使用される情報理論一般化境界と、反復の摂動解析を組み合わせることです。

We study the generalization properties of the popular stochastic gradient descent method for optimizing general non-convex loss functions. Our main contribution is providing upper bounds on the generalization error that depend on local statistics of the stochastic gradients evaluated along the path of iterates calculated by SGD. The key factors our bounds depend on are the variance of the gradients (with respect to the data distribution), the local smoothness of the objective function along the SGD path, and the sensitivity of the loss function to perturbations to the final output. Our key technical tool is combining the information-theoretic generalization bounds previously used for analyzing randomized variants of SGD with a perturbation analysis of the iterates.

翻訳日:2021-02-02 16:04:29 公開日:2021-02-01

# 時間論理仕様を用いたマルチエージェント強化学習

Multi-Agent Reinforcement Learning with Temporal Logic Specifications ( http://arxiv.org/abs/2102.00582v1 )

ライセンス: Link先を確認

Lewis Hammond and Alessandro Abate and Julian Gutierrez and Michael Wooldridge

(参考訳) 本稿では,未知の環境におけるエージェント群による時間論理仕様を満たす学習の問題について検討し,確率的行動を示す可能性がある。学習の観点からは、これらの仕様はタスクや目的をキャプチャするリッチな形式言語を提供する一方で、ロジックや自動検証の観点からは、学習機能の導入によって、大規模で統計的で未知の環境での実用的な応用が可能になる。しかし、この領域の既存の仕事は限られています。完全な線形時間論理や正当性を保証するフレームワークのうち、これまでのすべてのメソッドでは、単一の時間論理仕様と単一のエージェントのみを考慮する。この制限を克服するために、時間論理仕様のための最初のマルチエージェント強化学習技術を開発しました。関数近似を用いても,主アルゴリズムであるALMANAC(Automaton/Logic Multi-Agent Natural Actor-Critic)の正確性と収束性を保証する。理論的結果とともに,予備実験のセットを通じて,本手法の適用性をさらに実証する。

In this paper, we study the problem of learning to satisfy temporal logic specifications with a group of agents in an unknown environment, which may exhibit probabilistic behaviour. From a learning perspective these specifications provide a rich formal language with which to capture tasks or objectives, while from a logic and automated verification perspective the introduction of learning capabilities allows for practical applications in large, stochastic, unknown environments. The existing work in this area is, however, limited. Of the frameworks that consider full linear temporal logic or have correctness guarantees, all methods thus far consider only the case of a single temporal logic specification and a single agent. In order to overcome this limitation, we develop the first multi-agent reinforcement learning technique for temporal logic specifications, which is also novel in its ability to handle multiple specifications. We provide correctness and convergence guarantees for our main algorithm - ALMANAC (Automaton/Logic Multi-Agent Natural Actor-Critic) - even when using function approximation. Alongside our theoretical results, we further demonstrate the applicability of our technique via a set of preliminary experiments.

翻訳日:2021-02-02 16:02:48 公開日:2021-02-01

# AIのモラル責任に関する人間の認識:AIによるベイル意思決定のケーススタディ

Human Perceptions on Moral Responsibility of AI: A Case Study in AI-Assisted Bail Decision-Making ( http://arxiv.org/abs/2102.00625v1 )

ライセンス: Link先を確認

Gabriel Lima, Nina Grgi\'c-Hla\v{c}a, Meeyoung Cha

(参考訳) 自律人工知能(AI)システムの行動に対する責任をどう捉えるかは、人文科学や社会科学の分野で広く議論されている。この研究では、AIと人間のエージェントに関する8つの異なる道徳的責任の概念の人々の認識を保釈意思決定の文脈で測定する2つの実験(それぞれ$N$=200)を提示する。実生活に適応したヴィグネットを用いて、我々の実験では、AIエージェントは因果責任を持ち、人間エージェントと同じような責任を負っている。しかし、これらのエージェントの倫理的責任の認識には有意義な違いがあり、人間のエージェントはaiエージェントよりも現代的で先見的な責任の考え方の方が高いと説明されていた。また、AIと人間の意思決定者とアドバイザーの両方が、その性質に関わらず、自分の決定を正当化することを期待していることもわかりました。本稿は、ハイテイクシナリオにおける説明可能なAIの必要性など、これらの発見のポリシーとHCIの影響について論じる。

How to attribute responsibility for autonomous artificial intelligence (AI) systems' actions has been widely debated across the humanities and social science disciplines. This work presents two experiments ($N$=200 each) that measure people's perceptions of eight different notions of moral responsibility concerning AI and human agents in the context of bail decision-making. Using real-life adapted vignettes, our experiments show that AI agents are held causally responsible and blamed similarly to human agents for an identical task. However, there was a meaningful difference in how people perceived these agents' moral responsibility; human agents were ascribed a higher degree of present-looking and forward-looking notions of responsibility than AI agents. We also found that people expect both AI and human decision-makers and advisors to justify their decisions regardless of their nature. We discuss policy and HCI implications of these findings, such as the need for explainable AI in high-stakes scenarios.

翻訳日:2021-02-02 16:02:11 公開日:2021-02-01

# DRLDO: メタモルフィックマルウェアに対する防御のための新しいDRLベースのDe-Obfuscation System

DRLDO: A novel DRL based De-Obfuscation System for Defense against Metamorphic Malware ( http://arxiv.org/abs/2102.00898v1 )

ライセンス: Link先を確認

Mohit Sewak and Sanjay K. Sahay and Hemant Rathore

(参考訳) 本論文では,オプコードレベルでのメタモルフィックおよび難読化マルウェアを正規化し,高度なメタモルフィック・デ・難読化防御システムを構築するための新しいメカニズムを提案する。深層強化学習に基づくde-obfuscatorのためのdrldoと呼ぶ。 DRLDOをサブコンポーネントとして含むことで、既存の侵入検知システムは、既存のマルウェアの難読化および変成変種からの「ゼロデイ」攻撃に対する防御能力を増強することができる。これは、高度なDRLを使用してオペコードレベルまで難読化をインテリジェントかつ自動的に正規化するシステムがないだけでなく、DRLDOシステムが既存のIDSに一切変更を課さないために重要なものとなっている。 DRLDOシステムはIDSの分類器を難読化されたサンプルを含む新しいデータセットで再訓練する義務も負いません。したがって、DRLDOは既存のIDSデプロイメントに容易に再適合できる。我々は,複数の世代のマルウェアを含む標準データセットから得られたマルウェアサンプルから発生する難読化に対する複数同時攻撃に対して,システムの設計,開発,および評価を行う実験を行った。実験の結果、DRLDOは、既存の訓練済みマルウェア分類器によってマルウェアの検出不能な難解な変種を検出可能にすることが証明された。検出確率はカットオフマークを大きく上回り,分類器では0.6に上昇し,難読化マルウェアを曖昧に検出した。さらに、DRLDOが生成した難読化変種は、ベースマルウェアと非常に高い相関(0.99)を達成した。この観察は、DRLDOシステムが実際に難読化を学習しており、簡単なトリックを悪用していないことを実証する。

In this paper, we propose a novel mechanism to normalize metamorphic and obfuscated malware down to the opcode level and hence create an advanced metamorphic malware de-obfuscation and defense system. We name this system DRLDO, for Deep Reinforcement Learning based De-Obfuscator. With the inclusion of DRLDO as a sub-component, an existing Intrusion Detection System (IDS) could be augmented with defensive capabilities against 'zero-day' attacks from obfuscated and metamorphic variants of existing malware. This gains importance not only because there exists no system to date that uses advanced DRL to intelligently and automatically normalize obfuscation down even to the opcode level, but also because the DRLDO system does not mandate any changes to the existing IDS. The DRLDO system does not even mandate that the IDS' classifier be retrained with any new dataset containing obfuscated samples. Hence DRLDO could be easily retrofitted into any existing IDS deployment. We designed, developed, and conducted experiments on the system to evaluate it against multiple simultaneous attacks from obfuscations generated from malware samples from a standardized dataset that contains multiple generations of malware. Experimental results prove that DRLDO was able to successfully make the otherwise undetectable obfuscated variants of the malware detectable by an existing pre-trained malware classifier. The detection probability was raised well above the cut-off mark to 0.6 for the classifier to detect the obfuscated malware unambiguously. Further, the de-obfuscated variants generated by DRLDO achieved a very high correlation (of 0.99) with the base malware. This observation validates that the DRLDO system is actually learning to de-obfuscate and not exploiting a trivial trick.

翻訳日:2021-02-02 16:01:33 公開日:2021-02-01

# ARMプロセッサ上のML演算子のキャッシュ境界の理解

Understanding Cache Boundness of ML Operators on ARM Processors ( http://arxiv.org/abs/2102.00932v1 )

ライセンス: Link先を確認

Bernhard Klein and Christoph Gratl and Manfred M\"ucke and Holger Fr\"oning

(参考訳) TVMのような機械学習コンパイラは、組み込みCPUに高速で柔軟なデプロイを可能にする。これにより、ML圧縮技術で一般的な非標準演算子の使用が可能になる。しかし、適切なソリューションを設計するには、mlワークロードにおける典型的な計算インテンシーオペレータの制限を理解する必要がある。これは、組み込みARMプロセッサの基本ハードウェア制限と比較して、TVMで生成された高密度および畳み込み演算子の最初の詳細分析です。これにより、TVMとopenBLASで作成された計算ピーク性能、理論と測定値、および実世界の最先端結果のギャップが説明できる。代わりに、単精度一般行列乗算(GEMM)と畳み込みがL1キャッシュ可読帯域でバインドされていることがわかる。 8ビットおよびビットシリアル量子化演算子の探索は、キャッシュバウンド浮動小数点演算子と比較して、量子化が関連するスピードアップを達成するために使用できることを示した。しかし、量子化演算子の性能はデータレイアウトとビットパッキングの相互作用に大きく依存する。

Machine Learning compilers like TVM allow a fast and flexible deployment on embedded CPUs. This enables the use of non-standard operators, which are common in ML compression techniques. However, it is necessary to understand the limitations of typical compute-intense operators in ML workloads to design a proper solution. This is the first detailed analysis of dense and convolution operators, generated with TVM, that compares to the fundamental hardware limits of embedded ARM processors. It thereby explains the gap between computational peak performance, theoretical and measured, and real-world state-of-the-art results, created with TVM and OpenBLAS. Instead, one can see that single-precision general matrix multiply (GEMM) and convolutions are bound by L1-cache-read bandwidth. Explorations of 8-bit and bit-serial quantized operators show that quantization can be used to achieve relevant speedups compared to cache-bound floating-point operators. However, the performance of quantized operators highly depends on the interaction between data layout and bit packing.

翻訳日:2021-02-02 16:00:46 公開日:2021-02-01
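
Whether an operator is compute-bound or bandwidth-bound can be estimated with a roofline-style calculation: attainable throughput is the smaller of the compute peak and the product of bandwidth and arithmetic intensity (FLOPs per byte moved). The sketch below is a simplified model of that idea, not the measurement methodology of the paper. The traffic estimate counts each matrix only once, and the peak and bandwidth figures in the example are hypothetical.

```python
def gemm_arithmetic_intensity(m, n, k, bytes_per_element=4):
    """FLOPs per byte for C[m,n] = A[m,k] @ B[k,n], each matrix moved once."""
    flops = 2 * m * n * k  # one multiply and one add per inner-product term
    bytes_moved = bytes_per_element * (m * k + k * n + m * n)
    return flops / bytes_moved


def attainable_gflops(peak_gflops, bandwidth_gb_per_s, intensity):
    """Roofline estimate: the lower of the compute roof and the bandwidth roof."""
    return min(peak_gflops, bandwidth_gb_per_s * intensity)


# Hypothetical core: 16 GFLOP/s peak, 8 GB/s of usable bandwidth.
intensity = gemm_arithmetic_intensity(2, 2, 2)
print(intensity)
print(attainable_gflops(16.0, 8.0, intensity))  # limited by bandwidth
print(attainable_gflops(16.0, 8.0, 64.0))       # limited by the compute peak
```

Applying the same estimate at the level of a specific cache means substituting that level's bandwidth and the traffic the kernel actually generates there.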

# ハイブリッド情報駆動マルチエージェント強化学習

Hybrid Information-driven Multi-agent Reinforcement Learning ( http://arxiv.org/abs/2102.01004v1 )

ライセンス: Link先を確認

William A. Dawson, Ruben Glatt, Edward Rusu, Braden C. Soper, Ryan A. Goldhahn

(参考訳) 情報理論センサ管理手法は、マルチエージェントシステムの最適制御を考える場合の状態推定問題に対する理想的な解決策であるが、大規模分散マルチエージェントシステムで典型的な限られた計算資源を考えると、大きな状態空間では計算集約的すぎる。強化学習(RL)は、分散エージェントの多くのシステムに固有のリソース制約を考慮して、分散最適制御問題に対する近似ソリューションを見つけることができる有望な代替手段です。しかし、特に州空間の大部分でエージェントがほとんどフィードバックを受けていない低情報環境では、rlトレーニングは禁止的に非効率である。本稿では,情報理論モデルをヒューリスティックとして活用し,エージェントが大きなスパース状態空間をナビゲートするのを支援する,情報駆動型マルチエージェント強化学習(marl)手法を提案する。本稿では,この目的に向けた取り組みについて述べる。予備的な知見から,このようなアプローチは,単純なベースラインメトリクスよりもスパース状態空間を探索する上で,およそ3桁の効率性を持つエージェントのシステムをもたらす可能性が示唆された。作業はまだ初期段階ですが、将来の研究に有望な方向性を提供します。

Information theoretic sensor management approaches are an ideal solution to state estimation problems when considering the optimal control of multi-agent systems; however, they are too computationally intensive for large state spaces, especially when considering the limited computational resources typical of large-scale distributed multi-agent systems. Reinforcement learning (RL) is a promising alternative which can find approximate solutions to distributed optimal control problems that take into account the resource constraints inherent in many systems of distributed agents. However, the RL training can be prohibitively inefficient, especially in low-information environments where agents receive little to no feedback in large portions of the state space. We propose a hybrid information-driven multi-agent reinforcement learning (MARL) approach that utilizes information theoretic models as heuristics to help the agents navigate large sparse state spaces, coupled with information-based rewards in an RL framework to learn higher-level policies. This paper presents our ongoing work towards this objective. Our preliminary findings show that such an approach can result in a system of agents that are approximately three orders of magnitude more efficient at exploring a sparse state space than naive baseline metrics. While the work is still in its early stages, it provides a promising direction for future research.

翻訳日:2021-02-02 16:00:07 公開日:2021-02-01

# 低線量x線ct用深部高分解能ネットワーク

Deep High-Resolution Network for Low Dose X-ray CT Denoising ( http://arxiv.org/abs/2102.00599v1 )

ライセンス: Link先を確認

Ti Bai, Dan Nguyen, Biling Wang and Steve Jiang

(参考訳) 低線量CT (LDCT) は, 患者に対する放射線量が少ないため, 臨床的に望ましい。しかし、LDCT画像の品質は、必然的に強い量子ノイズのため、しばしば準最適である。コンピュータビジョンにおける未熟な成功にインスパイアされたディープラーニング(DL)ベースの技術は、LDCTのノイズ除去に使用されています。 DLモデルの有望なノイズ除去能力にもかかわらず、DLデノ化画像の分解能は損なわれ、臨床価値は低下している。本研究では,この問題の軽減を目的とした高分解能ネットワーク(HRNet)の導入により,より効率的なデノイザーを開発した。 hrnetはサブネットワークの複数のブランチで構成され、後に融合されるマルチスケールな特徴を抽出するため、生成された特徴の品質が大幅に向上し、ノイズ除去性能が向上する。実験結果から, HRNetをベースとしたデノイザは, ノイズ抑制能力に比較して, 優れた画像分解能保持能の点で, ベンチマークしたUNetベースのデノイザよりも優れた性能を示した。 root-mean-squared-errors (RMSE)/structure similarity index (SSIM)により、HRNetベースの denoiser は 113.80/0.550 (LDCT) から 55.24/0.745 (HRNet) に値を改善することができ、UNetベースの denoiser の 59.87/0.712 と比較できる。

Low Dose Computed Tomography (LDCT) is clinically desirable due to the reduced radiation to patients. However, the quality of LDCT images is often sub-optimal because of the inevitable strong quantum noise. Inspired by their unprecedented success in computer vision, deep learning (DL)-based techniques have been used for LDCT denoising. Despite the promising noise removal ability of DL models, people have observed that the resolution of the DL-denoised images is compromised, decreasing their clinical value. Aiming at relieving this problem, in this work, we developed a more effective denoiser by introducing a high-resolution network (HRNet). Since HRNet consists of multiple branches of subnetworks to extract multiscale features which are later fused together, the quality of the generated features can be substantially enhanced, leading to improved denoising performance. Experimental results demonstrated that the introduced HRNet-based denoiser outperforms the benchmarked UNet-based denoiser in terms of image resolution preservation while offering comparable, if not better, noise suppression. Quantitative metrics in terms of root-mean-squared error (RMSE) and structural similarity index (SSIM) showed that the HRNet-based denoiser can improve the values from 113.80/0.550 (LDCT) to 55.24/0.745 (HRNet), in comparison to 59.87/0.712 for the UNet-based denoiser.

翻訳日:2021-02-02 15:50:21 公開日:2021-02-01

# デッドビット問題からディープハッシュを救う

Rescuing Deep Hashing from Dead Bits Problem ( http://arxiv.org/abs/2102.00648v1 )

ライセンス: Link先を確認

Shu Zhao, Dayan Wu, Yucan Zhou, Bo Li and Weiping Wang

(参考訳) ディープハッシュ法は大規模画像検索において高い検索精度と効率を示す。離散ハッシュビットの最適化は、常にディープハッシュ方式に重点を置いている。これらの方法の一般的な戦略は、例えばアクティベーション関数を採用することである。 $\operatorname{sigmoid}(\cdot)$または$\operatorname{tanh}(\cdot)$は、近似離散値への量子化損失を最小限に抑える。しかし、このパラダイムは、ますます多くのハッシュビットを活性化関数の間違った飽和領域に閉じ込め、決して逃がすことはないかもしれない。この問題を "Dead Bits Problem~(DBP)" と呼ぶ。さらに、既存の量子化損失もDBPを増大させます。本稿では,DBPを緩和するアクティベーション関数の前に作用する,単純だが効果的な勾配増幅器を提案する。さらに、DBPをさらに軽減するためにエラー認識量子化損失を考案する。 2つの画像の類似性に基づいて量子化損失の負の効果を回避する。提案する勾配増幅器と誤り認識量子化損失は、様々なディープハッシュ法と互換性がある。 3つのデータセットの実験結果は、提案した勾配増幅器の効率と誤り認識量子化損失を示す。

Deep hashing methods have shown great retrieval accuracy and efficiency in large-scale image retrieval. How to optimize discrete hash bits is always the focus in deep hashing methods. A common strategy in these methods is to adopt an activation function, e.g. $\operatorname{sigmoid}(\cdot)$ or $\operatorname{tanh}(\cdot)$, and minimize a quantization loss to approximate discrete values. However, this paradigm may leave more and more hash bits stuck in the wrong saturated area of the activation functions, from which they never escape. We call this problem the "Dead Bits Problem (DBP)". Besides, the existing quantization loss aggravates DBP as well. In this paper, we propose a simple but effective gradient amplifier which acts before activation functions to alleviate DBP. Moreover, we devise an error-aware quantization loss to further alleviate DBP. It avoids the negative effect of quantization loss based on the similarity between two images. The proposed gradient amplifier and error-aware quantization loss are compatible with a variety of deep hashing methods. Experimental results on three datasets demonstrate the efficiency of the proposed gradient amplifier and the error-aware quantization loss.

翻訳日:2021-02-02 15:49:34 公開日:2021-02-01
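
The saturation effect behind the Dead Bits Problem can be seen from the derivative of tanh alone. If a bit's pre-activation sits deep in the wrong saturated region, the factor 1 - tanh(z)^2 is close to zero, so almost no gradient reaches the pre-activation regardless of how wrong the bit is. The sketch below only illustrates this effect with a squared loss on a single bit. It does not implement the paper's gradient amplifier or its error-aware loss, whose exact forms are not given in the abstract.

```python
import math


def bit_loss_gradient(z, target):
    """Gradient of (tanh(z) - target)^2 with respect to the pre-activation z."""
    activation = math.tanh(z)
    # Chain rule: d/dz tanh(z) = 1 - tanh(z)^2, which vanishes when |z| is large.
    return 2.0 * (activation - target) * (1.0 - activation ** 2)


# The bit should be +1 in both cases but currently has a negative pre-activation.
print(bit_loss_gradient(-0.5, 1.0))  # mildly wrong: sizeable corrective gradient
print(bit_loss_gradient(-6.0, 1.0))  # saturated on the wrong side: almost no gradient
```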

# 深層学習によるSentinel-1 SAR画像のスペックル低減のための複数時間情報公開

Exploiting multi-temporal information for improved speckle reduction of Sentinel-1 SAR images by deep learning ( http://arxiv.org/abs/2102.00682v1 )

ライセンス: Link先を確認

Emanuele Dalsasso, In\`es Meraoumia, Lo\"ic Denis, Florence Tupin

(参考訳) 深層学習によるSAR振幅画像のスペックル低減効果は前例がない。 SAR画像の多時間スタックの広範な利用は、さらにデノナイジングの品質を向上させることができる。本稿では,時間的情報を深部ニューラルネットワークに統合し,スペックル抑制を柔軟かつ効率的に行う方法を提案する。アーカイブは、SAR画像の長い時系列へのアクセスを提供し、そこから複数の時間平均をほとんどスペックル変動を伴わずに計算することができる。提案手法は,この多時間平均と特定の日付の画像を比画像の形で結合し,最新のニューラルネットワークを用いてこの比画像のスペックルを除去する。この単純な戦略は、マルチテンポラル平均を知らずに元の画像をフィルタリングするよりも顕著な改善をもたらすことが示される。

Deep learning approaches show unprecedented results for speckle reduction in SAR amplitude images. The wide availability of multi-temporal stacks of SAR images can improve even further the quality of denoising. In this paper, we propose a flexible yet efficient way to integrate temporal information into a deep neural network for speckle suppression. Archives provide access to long time-series of SAR images, from which multi-temporal averages can be computed with virtually no remaining speckle fluctuations. The proposed method combines this multi-temporal average and the image at a given date in the form of a ratio image and uses a state-of-the-art neural network to remove the speckle in this ratio image. This simple strategy is shown to offer a noticeable improvement compared to filtering the original image without knowledge of the multi-temporal average.

翻訳日:2021-02-02 15:48:59 公開日:2021-02-01
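
The ratio-image strategy described above can be sketched in three steps: divide the image at the date of interest by the multi-temporal average, remove speckle from the ratio, then map the result back. The final multiplication by the average is an assumption about how the filtered ratio is converted into an image, and `denoise_ratio` is a hypothetical placeholder for the neural network. Images are nested lists of positive floats to keep the sketch free of dependencies.

```python
def despeckle_with_temporal_mean(image, temporal_mean, denoise_ratio, eps=1e-12):
    """Filter a SAR image through its ratio to a multi-temporal average.

    image, temporal_mean: equally shaped nested lists of positive floats.
    denoise_ratio: callable mapping a ratio image to a filtered ratio image
                   (stands in for the despeckling network).
    """
    ratio = [
        [pixel / max(average, eps) for pixel, average in zip(row, mean_row)]
        for row, mean_row in zip(image, temporal_mean)
    ]
    clean_ratio = denoise_ratio(ratio)
    # Assumed recombination step: scale the filtered ratio by the average.
    return [
        [value * average for value, average in zip(row, mean_row)]
        for row, mean_row in zip(clean_ratio, temporal_mean)
    ]


image = [[2.0, 4.0], [8.0, 1.0]]
temporal_mean = [[2.0, 2.0], [4.0, 0.5]]

# A placeholder "denoiser" that flattens the ratio to 1 returns the temporal mean.
flatten = lambda ratio: [[1.0 for _ in row] for row in ratio]
print(despeckle_with_temporal_mean(image, temporal_mean, flatten))
```

Because the average carries most of the stable scene content, the ratio image mainly contains speckle and genuine temporal changes, which is presumably what makes it easier to filter.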

# 自動運転のための地上認識モノラル3次元物体検出

Ground-aware Monocular 3D Object Detection for Autonomous Driving ( http://arxiv.org/abs/2102.00690v1 )

ライセンス: Link先を確認

Yuxuan Liu, Yuan Yixuan, Ming Liu

(参考訳) 単一のRGBカメラで環境中の物体の3D位置と方向を推定することは、低コストの都市自動運転と移動ロボットにとって重要で困難な作業です。既存のアルゴリズムのほとんどは、2D-3D対応における幾何学的制約に基づいており、これは一般的な6Dオブジェクトのポーズ推定に由来する。まず、地上の飛行機が運転シーンで3D検出で深度推論のさらなる手がかりを提供する方法を特定します。この観測に基づいて、3Dアンカーの処理を改善し、深層学習の枠組みにおいて、そのようなアプリケーション固有の先行を十分に活用する新しいニューラルネットワークモジュールを導入する。最後に,提案する3次元物体検出モジュールを組み込んだ効率的なニューラルネットワークを提案する。さらに,単眼深度予測用に設計されたニューラルネットワークを用いて,提案モジュールのパワーを検証した。提案した2つのネットワークは,KITTIの3次元オブジェクト検出と深度予測のベンチマークでそれぞれ最先端の性能を達成している。コードはhttps://www.github.com/Owen-Liuyuxuan/visualDet3Dで公開される。

Estimating the 3D position and orientation of objects in the environment with a single RGB camera is a critical and challenging task for low-cost urban autonomous driving and mobile robots. Most of the existing algorithms are based on the geometric constraints in 2D-3D correspondence, which stems from generic 6D object pose estimation. We first identify how the ground plane provides additional clues in depth reasoning in 3D detection in driving scenes. Based on this observation, we then improve the processing of 3D anchors and introduce a novel neural network module to fully utilize such application-specific priors in the framework of deep learning. Finally, we introduce an efficient neural network embedded with the proposed module for 3D object detection. We further verify the power of the proposed module with a neural network designed for monocular depth prediction. The two proposed networks achieve state-of-the-art performances on the KITTI 3D object detection and depth prediction benchmarks, respectively. The code will be published at https://www.github.com/Owen-Liuyuxuan/visualDet3D

翻訳日:2021-02-02 15:48:26 公開日:2021-02-01

# 深層学習によるSentinel-1 GRD画像の抽出と狭川セグメンテーションへの応用

Despeckling Sentinel-1 GRD images by deep learning and application to narrow river segmentation ( http://arxiv.org/abs/2102.00692v1 )

ライセンス: Link先を確認

Nicolas Gasnier, Emanuele Dalsasso, Loïc Denis, Florence Tupin

(参考訳) 本稿では,最近提案されたSAR2SAR(自己監督型トレーニング戦略)に基づく,Sentinel-1 GRD画像の非スペックリング手法を提案する。 Sentinel 1 GRD画像のコレクション上のディープニューラルネットワークのトレーニングは、スペックルの空間変動空間相関に堅牢な脱スペックリングアルゴリズムにつながります。劣化した画像は狭い川のような構造物の検出を改善する。我々は,外因性情報に基づく検出器と線形特徴検出器を適用し,デスペックリングニューラルネットワークにより予め処理された画像に対して処理チェーンを適用する場合,河川のセグメンテーションが良好であることを示す。

This paper presents a despeckling method for Sentinel-1 GRD images based on the recently proposed framework "SAR2SAR": a self-supervised training strategy. Training the deep neural network on collections of Sentinel-1 GRD images leads to a despeckling algorithm that is robust to space-variant spatial correlations of speckle. Despeckled images improve the detection of structures like narrow rivers. We apply a detector based on exogenous information and a linear-feature detector and show that rivers are better segmented when the processing chain is applied to images pre-processed by our despeckling neural network.

翻訳日:2021-02-02 15:47:50 公開日:2021-02-01

# 粒子イメージング検出器のためのスケーラブル, エンドツーエンド, 深層学習に基づくデータ再構築チェーン

Scalable, End-to-End, Deep-Learning-Based Data Reconstruction Chain for Particle Imaging Detectors ( http://arxiv.org/abs/2102.01033v1 )

ライセンス: Link先を確認

Francois Drielsma, Kazuhiro Terao, Laura Dominé, Dae Heun Koh

(参考訳) コンピュータビジョン(CV)と機械学習(ML)の最近の進歩は、粒子イメージング検出器データの分析に新しいアプローチを動機づけています。孤立CVタスクに取り組む従来の取り組みとは違って,ニュートリノ物理の強度フロンティアにおける高精度撮像技術であるLiquid Argon Time Projection Chambers (LArTPCs) のための,エンドツーエンドのMLベースのデータ再構成チェーンを導入する。このチェーンは、スパース畳み込みニューラルネットワークを用いたボクセルレベルの特徴抽出とグラフニューラルネットワークを用いた粒子超構造形成を組み合わせたマルチタスクネットワークカスケードである。各アルゴリズムは物理による誘導バイアスを組み込んでおり、その集団階層は因果構造を強制するために使用される。出力は、高レベルの物理推論に使用できるイベントの包括的な説明です。このチェーンはエンドツーエンドで最適化可能であり、時間を要する手動のソフトウェア調整は不要である。また、Deep Underground Neutrino Experimentの3D画像LArTPCで期待される数十の高エネルギーニュートリノ相互作用のこれまでにない蓄積を処理する最初の実装です。チェーン全体がトレーニングされ、そのパフォーマンスはオープンシミュレーションデータセットを使用して各ステップで評価される。

Recent inroads in Computer Vision (CV) and Machine Learning (ML) have motivated a new approach to the analysis of particle imaging detector data. Unlike previous efforts which tackled isolated CV tasks, this paper introduces an end-to-end, ML-based data reconstruction chain for Liquid Argon Time Projection Chambers (LArTPCs), the state-of-the-art in precision imaging at the intensity frontier of neutrino physics. The chain is a multi-task network cascade which combines voxel-level feature extraction using Sparse Convolutional Neural Networks and particle superstructure formation using Graph Neural Networks. Each algorithm incorporates physics-informed inductive biases, while their collective hierarchy is used to enforce a causal structure. The output is a comprehensive description of an event that may be used for high-level physics inference. The chain is end-to-end optimizable, eliminating the need for time-intensive manual software adjustments. It is also the first implementation to handle the unprecedented pile-up of dozens of high energy neutrino interactions, expected in the 3D-imaging LArTPC of the Deep Underground Neutrino Experiment. The chain is trained as a whole and its performance is assessed at each step using an open simulated data set.

翻訳日:2021-02-02 15:47:19 公開日:2021-02-01

# 雑音二元系ニューラルネットワークにおける情報収縮とその意義

Information contraction in noisy binary neural networks and its implications ( http://arxiv.org/abs/2101.11750v2 )

ライセンス: Link先を確認

Chuteng Zhou, Quntao Zhuang, Matthew Mattina, Paul N. Whatmough

(参考訳) ニューラルネットワークは、大規模画像分類、オブジェクト検出、自然言語処理タスクにおいて最先端のパフォーマンスを達成する機械学習モデルとして重要になっている。本稿では、各ニューロンが不正確な出力を生じる確率がゼロでないノイズの多いバイナリニューラルネットワークについて検討する。これらの騒がしいモデルは、生物学的、物理的、電子的な文脈から生じ、物理的世界に関連する重要な種類のモデルを構成する。直感的には、そのようなシステムのニューロン数は、同じレベルの表現力と計算信頼性を維持しながらノイズを補うために増加する必要がある。私たちの重要な発見は、ノイズの多いニューラルネットワークの必要な数のニューロンの境界が低くなっていることです。この下限を証明するために、我々は情報理論のアプローチを採用し、二進対称チャネルに対するエバンス・シュルマンの結果を一般チャネルに一般化するだけでなく、ネットワークにおけるエンドツーエンドの情報収縮を推定する際のタイツネスを大幅に改善する、新しい強データ処理不等式(SDPI)を得る。我々のSDPIは、ニューラルネットワークやセルオートマトンなど、さまざまな情報処理システムに適用できる。ノイズのないニューラルネットワークに対する理解とは大きく異なるノイズの多いニューラルネットワークに対して,SDPIを雑音の多いバイナリニューラルネットワークに適用し,その鍵となる下位境界を求め,その影響をネットワークの深さ幅トレードオフに適用することを提案する。さらに、SDPIを適用してフォールトトレラント細胞オートマトンを研究し、エラー訂正オーバーヘッドと緩和時間の境界を得る。本稿では,情報理論のレンズを通して,雑音情報処理システムの新たな理解を提供する。

Neural networks have gained importance as the machine learning models that achieve state-of-the-art performance on large-scale image classification, object detection and natural language processing tasks. In this paper, we consider noisy binary neural networks, where each neuron has a non-zero probability of producing an incorrect output. These noisy models may arise from biological, physical and electronic contexts and constitute an important class of models that are relevant to the physical world. Intuitively, the number of neurons in such systems has to grow to compensate for the noise while maintaining the same level of expressive power and computation reliability. Our key finding is a lower bound for the required number of neurons in noisy neural networks, which is the first of its kind. To prove this lower bound, we take an information-theoretic approach and obtain a novel strong data processing inequality (SDPI), which not only generalizes the Evans-Schulman results for binary symmetric channels to general channels, but also improves the tightness drastically when applied to estimate end-to-end information contraction in networks. Our SDPI can be applied to various information processing systems, including neural networks and cellular automata. Applying the SDPI in noisy binary neural networks, we obtain our key lower bound and investigate its implications on network depth-width trade-offs. Our results suggest a depth-width trade-off for noisy neural networks that is very different from the established understanding regarding noiseless neural networks. Furthermore, we apply the SDPI to study fault-tolerant cellular automata and obtain bounds on the error correction overheads and the relaxation time. This paper offers new understanding of noisy information processing systems through the lens of information theory.
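
As background for the information contraction mentioned above, the following sketch numerically checks the classical bound for binary symmetric channels (BSCs): passing a signal through a BSC with crossover probability `eps` shrinks mutual information by at least a factor of `(1 - 2*eps)**2`. It does not reproduce the paper's new SDPI; the helper names and the choice of a uniform binary input are assumptions made for illustration.

```python
import math


def binary_entropy(p):
    """Binary entropy in bits, with the convention 0 * log(0) = 0."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)


def bsc_compose(a, b):
    """Crossover probability of two cascaded binary symmetric channels."""
    return a * (1 - b) + b * (1 - a)


def mutual_information_uniform(crossover):
    """I(X;Y) in bits for a uniform bit X sent through a BSC."""
    return 1.0 - binary_entropy(crossover)


def contraction_check(delta, eps):
    """Return (I(X;Y), I(X;Z), bound) for the chain X -> BSC(delta) -> Y -> BSC(eps) -> Z."""
    i_xy = mutual_information_uniform(delta)
    i_xz = mutual_information_uniform(bsc_compose(delta, eps))
    bound = (1 - 2 * eps) ** 2 * i_xy
    return i_xy, i_xz, bound
```

Calling `contraction_check(0.1, 0.1)` returns three values for which `I(X;Z)` stays below the bound. Repeating the noisy step layer after layer compounds the factor, which gives an intuition for why noisy networks need more neurons to preserve information.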

翻訳日:2021-02-02 15:46:35 公開日:2021-02-01

# 一般化非定常バンディット

Generalized non-stationary bandits ( http://arxiv.org/abs/2102.00725v1 )

ライセンス: Link先を確認

Anne Gael Manegueu, Alexandra Carpentier and Yi Yu

(参考訳) 本稿では,スイッチングバンドイット問題を一般化する非定常確率バンドイット問題について検討する。スイッチングバンドイット問題(Case a)に加えて、我々は3つの具体的な例に興味を持っている: (b) 腕の手段は局所多項式であり、 (c) 腕の手段は局所的に滑らかであり、 (d) 腕の隙間は束縛された数の屈曲点を持ち、そこでは最も高い腕の平均は短い範囲であまり変化しない。これらの3つの設定は非常に異なるが、共通する点がある: (i) ギャップの対数の同様の大きさのレベル集合の数を制御でき、 (ii) 最高平均は急な変更の数に制限があり、それ以外は変化が限られている。この一般的な設定では、特に4つの問題 (a)-(d) を効率的かつ統一的に解く1つのアルゴリズムを提案する。

In this paper, we study a non-stationary stochastic bandit problem, which generalizes the switching bandit problem. On top of the switching bandit problem (Case a), we are interested in three concrete examples: (b) the means of the arms are local polynomials, (c) the means of the arms are locally smooth, and (d) the gaps of the arms have a bounded number of inflexion points and the highest arm mean cannot vary too much over a short range. These three settings are very different, but have the following in common: (i) the number of similarly-sized level sets of the logarithm of the gaps can be controlled, and (ii) the highest mean has a limited number of abrupt changes, and otherwise has limited variations. We propose a single algorithm in this general setting that, in particular, solves the four problems (a)-(d) in an efficient and unified way.

翻訳日:2021-02-02 15:40:13 公開日:2021-02-01

# CTスキャンによるCOVID-19診断のためのフェーショット学習

Few-shot Learning for CT Scan based COVID-19 Diagnosis ( http://arxiv.org/abs/2102.00596v1 )

ライセンス: Link先を確認

Yifan Jiang, Han Chen, David K. Han, Hanseok Ko

(参考訳) 2019年コロナウイルス(COVID-19)は、188の国と地域で4000万人以上の人々が感染する国際懸念の公衆衛生緊急事態です。胸部CT(Chest Computed Tomography)イメージング技術は、高い診断精度と堅牢性により、新型コロナウイルスの大量検査に不可欠な方法となっています。近年,深層学習は医用画像の自動スクリーニングに有効なツールとなり,新型コロナウイルスの診断にも利用されている。しかし、COVID-19に関連する高い感染リスクは、収集されたラベル付きデータの相対的な滞留をもたらし、そのような方法のパフォーマンスを制限します。さらに、CT画像を正確にラベル付けするには、放射線医の専門知識が必要です。以上の課題に対処するために,少量のラベル付きCTスキャンが利用可能である場合にのみ効果的に機能する,教師付きドメイン適応型COVID-19 CT診断法を提案する。本提案手法は、ラベル付きデータのばらつきを補うために、大量の合成COVID-19 CT画像を利用して、ソースドメイン(合成データ)からターゲットドメイン(実データ)までのネットワークをクロスドメイントレーニング機構で調整する。実験の結果, 新型ct画像診断による診断作業において, 最先端のパフォーマンスが得られた。

Coronavirus disease 2019 (COVID-19) is a Public Health Emergency of International Concern infecting more than 40 million people across 188 countries and territories. Owing to its high diagnostic accuracy and robustness, chest computed tomography (CT) imaging has become an indispensable tool for COVID-19 mass testing. Recently, deep learning approaches have become an effective tool for automatic screening of medical images, and they are also being considered for COVID-19 diagnosis. However, the high infection risk involved with COVID-19 leads to a relative sparseness of collected labeled data, which limits the performance of such methodologies. Moreover, accurately labeling CT images requires the expertise of radiologists, making the process expensive and time-consuming. In order to tackle the above issues, we propose a supervised domain adaptation based COVID-19 CT diagnostic method which can perform effectively when only a small sample of labeled CT scans is available. To compensate for the sparseness of labeled data, the proposed method utilizes a large amount of synthetic COVID-19 CT images and adjusts the networks from the source domain (synthetic data) to the target domain (real data) with a cross-domain training mechanism. Experimental results show that the proposed method achieves state-of-the-art performance on few-shot COVID-19 CT imaging based diagnostic tasks.

翻訳日:2021-02-02 15:34:41 公開日:2021-02-01

# CRPS学習

CRPS Learning ( http://arxiv.org/abs/2102.00968v1 )

ライセンス: Link先を確認

Jonathan Berrisch, Florian Ziel

(参考訳) 組み合わせと集約技術は予測精度を大幅に向上させることができる。これはまた、完全な予測分布が組み合わさった確率予測手法にも当てはまる。ベイズモデル平均化(BMA)のような時間変化および適応的な重み付けスキームはいくつか存在する。しかし、異なる予測器の性能は時間とともに異なるだけでなく、分布の一部にも異なる可能性がある。したがって、分布の中心ではより正確なものがあり、他のものは分布の尾部を予測するのに優れている。その結果、時間と分布の異なる性能を考慮に入れた新たな重み付け手法が導入された。本稿では,連続ランク付き確率スコア(crps)に対して最適化するポイントワイズオンラインアグリゲーションアルゴリズムについて検討する。完全適応的ベルンシュタインオンラインアグリゲーション(BOA)法の理論的性質を解析した後,ポイントワイズCRPS学習のためのスムースな手順を導入する。特性はシミュレーション研究によって確認され、議論されます。さらに, 炭素市場に関する予測研究において, その性能について概説する。詳細は、欧州の排出量許容価格の分布を予測する。

Combination and aggregation techniques can improve forecast accuracy substantially. This also holds for probabilistic forecasting methods where full predictive distributions are combined. There are several time-varying and adaptive weighting schemes like Bayesian model averaging (BMA). However, the performance of different forecasters may vary not only over time but also across parts of the distribution. Thus, one forecaster may be more accurate in the center of the distribution, while others perform better in predicting the distribution's tails. Consequently, we introduce a new weighting procedure that considers varying performance across both time and the distribution. We discuss pointwise online aggregation algorithms that optimize with respect to the continuous ranked probability score (CRPS). After analyzing the theoretical properties of a fully adaptive Bernstein online aggregation (BOA) method, we introduce smoothing procedures for pointwise CRPS learning. The properties are confirmed and discussed using simulation studies. Additionally, we illustrate the performance in a forecasting study for carbon markets. Specifically, we predict the distribution of European emission allowance prices.
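
The idea of pointwise weighting can be illustrated with the quantile representation of the CRPS, which equals twice the pinball loss integrated over all quantile levels. The sketch below is an assumption-laden illustration rather than the paper's method: it approximates the CRPS on a fixed grid of levels and combines expert quantile forecasts with weights that may differ per level. The BOA weight updates and the smoothing procedures are not implemented, and all function names are hypothetical.

```python
from statistics import NormalDist


def pinball_loss(quantile_forecast, y, p):
    """Pinball (quantile) loss of a forecast for level p given observation y."""
    diff = y - quantile_forecast
    return p * diff if diff >= 0 else (p - 1) * diff


def crps_from_quantiles(quantile_fn, y, levels):
    """Approximate the CRPS as twice the mean pinball loss over a grid of levels."""
    return 2 * sum(pinball_loss(quantile_fn(p), y, p) for p in levels) / len(levels)


def combine_pointwise(expert_quantile_fns, weights_fn):
    """Combine expert quantile functions with weights that depend on the level p."""
    def combined(p):
        weights = weights_fn(p)
        return sum(w * q(p) for w, q in zip(weights, expert_quantile_fns))
    return combined


# Midpoint grid of quantile levels, kept strictly inside (0, 1).
levels = [(i + 0.5) / 1000 for i in range(1000)]
```

A weight function such as `lambda p: (1.0, 0.0) if 0.25 <= p <= 0.75 else (0.0, 1.0)` would trust the first expert in the center of the distribution and the second in the tails, which is the kind of behavior a single scalar weight per expert cannot express.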

翻訳日:2021-02-02 15:32:01 公開日:2021-02-01

# (参考訳) Twice Mixing: 水中画像強調のためのランク学習に基づく品質評価手法

Twice Mixing: A Rank Learning based Quality Assessment Approach for Underwater Image Enhancement ( http://arxiv.org/abs/2102.00670v1 )

ライセンス: CC BY 4.0

Zhenqi Fu, Xueyang Fu, Yue Huang, and Xinghao Ding

(参考訳) 水中画像の品質を向上させるために、過去数年間にさまざまな種類の水中画像強化(UIE)オペレータが提案されています。しかし、効果的な客観的評価方法の欠如はUIE技術のさらなる発展を制限します。本稿では,新しいランク学習による無基準品質評価法を提案する。 2回混合と呼ばれるこのアプローチは、高品質な画像と低品質の画像を混ぜることで、中間品質の画像が生成されるという観察によって動機付けられたものです。典型的な混合アルゴリズムは、与えられた入力データのペアを線形に補間する。しかし,人間の視覚系は画像処理において一様でなく非線形である。そこで,これらの混合画像と,それらの絶対スコアを線形結合で計算した深層ニューラルネットワークを直接学習する代わりに,シアムネットワークを訓練し,それらの品質ランキングを学ぶことを提案する。 2回混合は精巧に定式化された自己スーパービジョン機構に基づいて訓練される。具体的には、各イテレーションの前に、仮想画像の生成とネットワークトレーニングの誘導の両方に使用される2つの混合比をランダムに生成する。テストフェーズでは、ネットワークの単一のブランチを抽出し、異なるUIE出力の品質ランキングを予測します。我々は,合成データと実世界のデータセットの両方について広範な実験を行う。実験の結果,提案手法が従来の手法を大きく上回ることがわかった。

To improve the quality of underwater images, various kinds of underwater image enhancement (UIE) operators have been proposed during the past few years. However, the lack of effective objective evaluation methods limits the further development of UIE techniques. In this paper, we propose a novel rank learning guided no-reference quality assessment method for UIE. Our approach, termed Twice Mixing, is motivated by the observation that a mid-quality image can be generated by mixing a high-quality image with its low-quality version. Typical mixup algorithms linearly interpolate a given pair of input data. However, the human visual system is non-uniform and non-linear in processing images. Therefore, instead of directly training a deep neural network based on the mixed images and their absolute scores calculated by linear combinations, we propose to train a Siamese Network to learn their quality rankings. Twice Mixing is trained based on an elaborately formulated self-supervision mechanism. Specifically, before each iteration, we randomly generate two mixing ratios which will be employed for both generating virtual images and guiding the network training. In the test phase, a single branch of the network is extracted to predict the quality rankings of different UIE outputs. We conduct extensive experiments on both synthetic and real-world datasets. Experimental results demonstrate that our approach outperforms the previous methods significantly.
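
The self-supervision step described above can be sketched roughly as follows. This is a simplified illustration, not the authors' training code: images are nested lists, the ranking label is assumed to follow directly from which mixing ratio is larger, and the hinge-style margin ranking loss with `margin=0.5` is an assumed choice for the Siamese objective.

```python
import random


def mix_images(high, low, ratio):
    """Pixel-wise convex combination; `ratio` is the weight on the high-quality image."""
    return [[ratio * h + (1 - ratio) * l for h, l in zip(high_row, low_row)]
            for high_row, low_row in zip(high, low)]


def make_ranked_pair(high, low, rng):
    """Draw two mixing ratios and return both mixed images plus a ranking label.

    The label is +1 if the first image carries more high-quality content,
    otherwise -1 (assumed labeling rule).
    """
    r1, r2 = rng.random(), rng.random()
    label = 1 if r1 > r2 else -1
    return mix_images(high, low, r1), mix_images(high, low, r2), label


def margin_ranking_loss(score_a, score_b, label, margin=0.5):
    """Hinge loss that is zero once the scores are ordered as `label` by at least `margin`."""
    return max(0.0, -label * (score_a - score_b) + margin)
```

In a training loop, both mixed images would pass through the two weight-sharing branches of the network, and the resulting scores would be fed to `margin_ranking_loss` together with the label. Only the ordering of the two ratios is used, so the network never has to regress an absolute quality score.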

翻訳日:2021-02-02 15:31:16 公開日:2021-02-01

# (参考訳) 明示的共通知識を用いた事前配置問題の再検討

Revisiting the Prepositional-Phrase Attachment Problem Using Explicit Commonsense Knowledge ( http://arxiv.org/abs/2102.00924v1 )

ライセンス: CC BY 4.0

Yida Xin, Henry Lieberman and Peter Chin

(参考訳) PP(Prepositional-phrase)アタッチメントの曖昧さを解決するという課題を再考する。現在提案されている解はルールベースであり、明示的な文法規則はあいまいさの解決方法を指示する; あるいは、ラベル付き例のコーパスから決定が学習される統計的手法である。明示的なコモンセンス知識ベースは、適切なアタッチメント決定を行う上で必須の要素となる。 Patch-Commと呼ばれるモジュールを実装し、さまざまな従来のパーサーがアタッチメントの決定を行えるようにしました。 Commonsense KBが直接的な回答を提供しない場合には、一部のNLPシステムが語彙外単語を処理するのと同様の方法で「知識外ベース」アサーションを推論するより一般的なシステムに戻ります。以上の結果から,コモンセンス知識ベースアプローチは,ルールベースと統計技術の統合により,両世界のベストを発揮できることが示唆された。 AIにおける説明可能性の重要性がますます認識される中、NLP開発者はシステムの振る舞いをよりよく理解し、エンドユーザとの自然な対話を促進することができる。

We revisit the challenging problem of resolving prepositional-phrase (PP) attachment ambiguity. To date, proposed solutions are either rule-based, where explicit grammar rules direct how to resolve ambiguities; or statistical, where the decision is learned from a corpus of labeled examples. We argue that explicit commonsense knowledge bases can provide an essential ingredient for making good attachment decisions. We implemented a module, named Patch-Comm, that can be used by a variety of conventional parsers, to make attachment decisions. Where the commonsense KB does not provide direct answers, we fall back on a more general system that infers "out-of-knowledge-base" assertions in a manner similar to the way some NLP systems handle out-of-vocabulary words. Our results suggest that the commonsense knowledge-based approach can provide the best of both worlds, integrating rule-based and statistical techniques. As the field is increasingly coming to recognize the importance of explainability in AI, a commonsense approach can enable NLP developers to better understand the behavior of systems, and facilitate natural dialogues with end users.

翻訳日:2021-02-02 15:13:54 公開日:2021-02-01

PDF登録状況（公開日: 20210201）