Fugu-MT: arxivの論文翻訳

このサイトではarxivの論文のうち、30ページ以下でCreative Commonsライセンス（CC 0, CC BY, CC BY-SA）の論文を日本語訳しています。本文がCCでない論文、長すぎる論文はメタデータのみを翻訳しています。（arxivのメタデータは CC 0です。）翻訳文のライセンスはCC BY-SA 4.0です。翻訳にはFugu-Machine Translatorを利用しています。

本サイトの運営者は本サイト（すべての情報・翻訳含む）の品質を保証せず、本サイト（すべての情報・翻訳含む）を使用して発生したあらゆる結果について一切の責任を負いません。

公開日が20201127となっている論文です。

Title	Authors	Abstract	論文公表日・翻訳日
# ブール関数のための可変量子ニューラルネットワーク Tunable Quantum Neural Networks for Boolean Functions ( http://arxiv.org/abs/2003.14122v2 ) ライセンス: Link先を確認	Viet Pham Ngoc and Herbert Wiklicky	(参考訳) 本稿では,量子ニューラルネットワークに対する新しいアプローチを提案する。我々の多層アーキテクチャは、古典的ニューラルネットワークの特徴である非線形活性化関数をエミュレートする計測の使用を避ける。それにもかかわらず、提案したアーキテクチャは、Boolean関数を学習することができる。この能力は、ブール関数と多制御NOTゲートからなる特定の量子回路の間に存在する対応から生じる。この対応は代数正規形式と呼ばれる関数の多項式表現によって構築される。この構成を用いて、任意のブール関数を学習するためにゲートをチューニングできるジェネリック量子回路のアイデアを導入する。学習課題を実行するために,測定の欠如を利用したアルゴリズムを考案した。長さ$n$の全てのバイナリ入力の重ね合わせを提示すると、ネットワークは少なくとも$n+1$の更新でターゲット関数を学習できる。 In this paper we propose a new approach to quantum neural networks. Our multi-layer architecture avoids the use of measurements that usually emulate the non-linear activation functions which are characteristic of the classical neural networks. Despite this, our proposed architecture is still able to learn any Boolean function. This ability arises from the correspondence that exists between a Boolean function and a particular quantum circuit made out of multi-controlled NOT gates. This correspondence is built via a polynomial representation of the function called the algebraic normal form. We use this construction to introduce the idea of a generic quantum circuit whose gates can be tuned to learn any Boolean functions. In order to perform the learning task, we have devised an algorithm that leverages the absence of measurements. When presented with a superposition of all the binary inputs of length $n$, the network can learn the target function in at most $n+1$ updates.	翻訳日:2023-05-27 07:51:34 公開日:2020-11-27
# ランダム場ハイゼンベルクスピン鎖における多体局在転移における鎖破壊とコステリッツ-チューレススケーリング Chain breaking and Kosterlitz-Thouless scaling at the many-body localization transition in the random field Heisenberg spin chain ( http://arxiv.org/abs/2004.02861v3 ) ライセンス: Link先を確認	Nicolas Laflorencie, Gabriel Lemari\'e, Nicolas Mac\'e	(参考訳) 多体局在化(MBL)遷移の微妙さを理解するための非常に理論的な努力にもかかわらず、多くの疑問が、特にその重要な性質に関して開かれている。ここでは、1次元のMBLが熱力学限界の連鎖破壊を引き起こすスピン凍結機構を伴っているという重要な観察を行う。解析的および数値的手法を用いて,これらの鎖切断は典型的な局在長を直接観測し,mbl遷移におけるスケーリング特性は表現論的再正規化群アプローチによって予測されたkosterlitz-thoulessシナリオと一致することを示した。 Despite tremendous theoretical efforts to understand subtleties of the many-body localization (MBL) transition, many questions remain open, in particular concerning its critical properties. Here we make the key observation that MBL in one dimension is accompanied by a spin freezing mechanism which causes chain breakings in the thermodynamic limit. Using analytical and numerical approaches, we show that such chain breakings directly probe the typical localization length, and that their scaling properties at the MBL transition agree with the Kosterlitz-Thouless scenario predicted by phenomenological renormalization group approaches.	翻訳日:2023-05-26 06:16:11 公開日:2020-11-27
# アレイホモダイニングによる2つの非コヒーレント源のサブレイリー分解 Sub-Rayleigh resolution of two incoherent sources by array homodyning ( http://arxiv.org/abs/2005.08693v2 ) ライセンス: Link先を確認	Chandan Datta, Marcin Jarzyna, Yink Loong Len, Karol {\L}ukanowski, Jan Ko{\l}ody\'nski, Konrad Banaszek	(参考訳) 画像平面の空間強度分布の測定に基づく従来の非干渉イメージングは、レイリー回折基準で記述された分解能ハードルに直面する。ここでは、アレイホモダイン検出により測定された2つの不整点間の距離をレイリー限界以下で十分高い信号-雑音比で推定できるというフィッシャー情報の概念を用いて理論的に実証する。この能力は、コヒーレント検出技術を用いて取得した個々の検出器画素間の空間コヒーレンス情報の可用性に起因する。サブレイリー領域で達成可能な精度の簡易な解析近似について述べる。さらに,モンテカルロシミュレーションデータに対して推定アルゴリズムを提案し,適用した。 Conventional incoherent imaging based on measuring the spatial intensity distribution in the image plane faces the resolution hurdle described by the Rayleigh diffraction criterion. Here, we demonstrate theoretically using the concept of the Fisher information that quadrature statistics measured by means of array homodyne detection enables estimation of the distance between two incoherent point sources well below the Rayleigh limit for sufficiently high signal-to-noise ratio. This capability is attributed to the availability of spatial coherence information between individual detector pixels acquired using the coherent detection technique. A simple analytical approximation for the precision attainable in the sub-Rayleigh region is presented. Furthermore, an estimation algorithm is proposed and applied to Monte Carlo simulated data.	翻訳日:2023-05-19 11:24:39 公開日:2020-11-27
# 環境を介する量子電池の帯電過程 Environment-mediated charging process of quantum batteries ( http://arxiv.org/abs/2005.12823v3 ) ライセンス: Link先を確認	F. T. Tabesh, F. H. Kamin and S. Salimi	(参考訳) 共有散逸環境を介するオープン量子電池の充電過程を, 2つの異なるシナリオで検討した。最初のケースでは、非マルコフ環境の存在下で量子チャージャー・バッテリモデルを考える。バッテリーは強結合状態で適切に充電できるが、外部の電力や充電器との直接のやりとり、すなわちワイヤレスライクな充電は発生しない。環境はバッテリの充電において大きな役割を果たすが、これは弱い結合状態では起こらない。第2のシナリオでは、マルコフ力学の存在下での2量子ビット系を考慮した量子電池の充電過程に対する個人および集団自発放出率の影響を示す。その結果, オープンバッテリは, アンダーダムや強い外部磁場を用いることで, マルコフ力学において良好に充電できることを示した。また、サブラジアント状態と中間状態を考慮した頑健な電池も提示する。さらに,最初のシナリオでエルゴトロピーを探索するための実験的なセットアップを提案する。 We study the charging process of open quantum batteries mediated by a common dissipative environment in two different scenarios. In the first case, we consider a quantum charger-battery model in the presence of a non-Markovian environment. Where the battery can be properly charged in a strong coupling regime, without any external power and any direct interaction with the charger, i.e., a wireless-like charging happens. The environment plays a major role in the charging of the battery, while this does not happen in a weak coupling regime. In the second scenario, we show the effect of individual and collective spontaneous emission rates on the charging process of quantum batteries by considering a two-qubit system in the presence of Markovian dynamics. Our results demonstrate that open batteries can be satisfactorily charged in Markovian dynamics by employing an underdamped regime and/or strong external fields. We also present a robust battery by taking into account subradiant states and an intermediate regime. Moreover, we propose an experimental setup to explore the ergotropy in the first scenario.	翻訳日:2023-05-18 07:33:42 公開日:2020-11-27
# スマートなコネクテッド・コミュニティが新型コロナウイルスの感染拡大と闘う Future Smart Connected Communities to Fight COVID-19 Outbreak ( http://arxiv.org/abs/2007.10477v2 ) ライセンス: Link先を確認	Deepti Gupta, Smriti Bhatt, Maanak Gupta, and Ali Saman Tosun	(参考訳) IoT(Internet of Things)はこの10年間で急速に成長し、さまざまなアプリケーションをサポートする幅広いデバイスを提供する次元と複雑さの面で開発を続けている。ユビキタスインターネット、コネクテッドセンサーとアクチュエータ、ネットワークと通信技術、人工知能(AI)によって、スマートサイバー物理システム(CPS)は、日々の生活で人間にサービスを提供する。しかし、新型コロナウイルス(COVID-19)の感染拡大により、現在の技術展開の限界が明らかになった。 IoTおよびスマートコネクテッドテクノロジとデータ駆動アプリケーションとの併用は、疾患の予防、継続的な監視、緩和だけでなく、ガイドライン、ルール、政府命令の即時実施において重要な役割を果たす可能性がある。本稿では、インテリジェントなモニタリング、プロアクティブな予防とコントロール、COVID-19の緩和のためのIoT対応エコシステムを構想する。我々は、E-health、スマートホーム、スマートサプライチェーン管理、スマートローカリティ、スマートシティなど、さまざまなスマートインフラストラクチャのための異なるアーキテクチャ、アプリケーション、技術システムを提案し、同様のアウトブレイクを管理し緩和する将来のコネクテッドコミュニティを開発する。さらに、これらのスマートなコミュニティやインフラを、これらのアウトブレイクと闘い、準備するために、今後の方向性とともに研究課題を提示する。 Internet of Things (IoT) has grown rapidly in the last decade and continue to develop in terms of dimension and complexity offering wide range of devices to support diverse set of applications. With ubiquitous Internet, connected sensors and actuators, networking and communication technology, and artificial intelligence (AI), smart cyber-physical systems (CPS) provide services rendering assistance to humans in their daily lives. However, the recent outbreak of COVID-19 (also known as coronavirus) pandemic has exposed and highlighted the limitations of current technological deployments to curtail this disease. IoT and smart connected technologies together with data-driven applications can play a crucial role not only in prevention, continuous monitoring, and mitigation of the disease, but also enable prompt enforcement of guidelines, rules and government orders to contain such future outbreaks. In this paper, we envision an IoT-enabled ecosystem for intelligent monitoring, pro-active prevention and control, and mitigation of COVID-19. We propose different architectures, applications and technology systems for various smart infrastructures including E-health, smart home, smart supply chain management, smart locality, and smart city, to develop future connected communities to manage and mitigate similar outbreaks. Furthermore, we present research challenges together with future directions to enable and develop these smart communities and infrastructures to fight and prepare against such outbreaks.	翻訳日:2023-05-08 22:58:20 公開日:2020-11-27
# 2次元正方格子における双極子ボソンのスタッガー状超流動相 Staggered superfluid phases of dipolar bosons in two-dimensional square lattices ( http://arxiv.org/abs/2008.00870v2 ) ライセンス: Link先を確認	Kuldeep Suthar, Rebecca Kraus, Hrushikesh Sable, Dilip Angom, Giovanna Morigi, and Jakub Zakrzewski	(参考訳) 二次元正方格子における超低温ボソンの量子基底状態の研究を行った。ボゾンは反発性双極子相互作用とs波散乱を介して相互作用する。この力学は双極子相互作用による相関ホッピングを含む拡張ボース・ハバードモデルによって記述され、係数は現実的なパラメータを持つワニエ展開を用いて第2量子化ハミルトニアンから得られる。相関ホッピング項の係数が負であり, 単粒子効果によるトンネル作用を阻害できる状態において, Gutzwiller ansatz を用いて位相図を決定する。この干渉は運動エネルギーの消滅時に停滞した超流動相と超固体相を生じさせ, 位相が圧縮不能な有限運動エネルギーにおけるパラメータ領域を同定する。得られた位相図をクラスタ・グッツウィラー法とDMRGを用いて一次元で得られた結果と比較した。 We study the quantum ground state of ultracold bosons in a two-dimensional square lattice. The bosons interact via the repulsive dipolar interactions and s-wave scattering. The dynamics is described by the extended Bose-Hubbard model including correlated hopping due to the dipolar interactions, the coefficients are found from the second quantized Hamiltonian using the Wannier expansion with realistic parameters. We determine the phase diagram using the Gutzwiller ansatz in the regime where the coefficients of the correlated hopping terms are negative and can interfere with the tunneling due to single-particle effects. We show that this interference gives rise to staggered superfluid and supersolid phases at vanishing kinetic energy, while we identify parameter regions at finite kinetic energy where the phases are incompressible. We compare the results with the phase diagram obtained with the cluster Gutzwiller approach and with the results found in one dimension using DMRG.	翻訳日:2023-05-07 06:36:36 公開日:2020-11-27
# 二重光モードを有する光学系におけるコヒーレントノイズキャンセリング Coherent noise cancellation in optomechanical system with double optical modes ( http://arxiv.org/abs/2009.04706v3 ) ライセンス: Link先を確認	Jiashun Yan and Jun Jing	(参考訳) コヒーレント量子ノイズキャンセレーション(cqnc)戦略は、標準量子限界を破る超感度メトロロジープロトコルを促進するために、単一モード光機械システムにおいて実行されてきた。 CQNCの鍵となる考え方は、放射圧と駆動から生じるバックアクションノイズは、光学モードをほぼ共振アシラリーモードに結合することでオフセットできるということである。本研究では,cqnc下での連続的な弱力センシングを,異なる周波数とメカニカルモードの2つの光モードからなる2重モード光機械システムで開発する。特に、高周波光モードを駆動し、低周波モードを探索し、プローブモードを補助モードに結合することにより、非対称な処理の下で、従来のcqncセンシングに類似させることができる。現在のCQNC戦略は、制約駆動力(ルース・ハーウィッツ基準)と有効正の機械減衰(安定光ばね条件)の両方に関して同時に二重モード系を安定化させることが重要である。さらに、CQNC戦略の非自明な拡張(シングルモード版からダブルモード版まで)の下でプローブモードとアシラリーモードの結合を利用することにより、回転波項と反回転項はそれぞれシステムの安定性とノイズキャンセリングに責任があることが判明した。現実の状況では, 中間に膜, ねじれたキャビティに基づく弱トーク検出器を配置した3部式光機械装置を用いて, 本方式を実践できる。 The coherent quantum noise cancellation (CQNC) strategy has been performed in the single-mode optomechanical systems to promote an ultra-sensitive metrology protocol to break the standard quantum limit. The key idea of CQNC is that the backaction noises arising from radiation pressure and driving can be offset by coupling the optical mode to a near-resonant ancillary mode. In this work, a continuous weak-force sensing under CQNC is developed in a double-mode optomechanical system consisted of two optical modes with distinct frequencies and a mechanical mode. In particular, under the asymmetrical treatment by driving the higher-frequency optical mode, probing the lower-frequency one, and coupling the probe mode to the ancillary mode, our configuration can be used to resemble the conventional CQNC sensing. It is more important to find that the current CQNC strategy simultaneously stabilizes the double-mode system with respect to both the constrained driving power (the Routh-Hurwitz criterion) and the effective positive mechanical damping (the stable optical-spring condition). Moreover, through exploiting the coupling between the probe mode and the ancillary mode under this nontrivial extension of the CQNC strategy (from the single-mode version to the double-mode one), the rotating-wave term and the counter-rotating term are found to be responsible to the system stability and the noise cancellation, respectively. In realistic situations, our scheme can be practiced in a tripartite optomechanical setup with a membrane in the middle and a twisted-cavity-based weak-torque detector.	翻訳日:2023-05-03 00:56:56 公開日:2020-11-27
# 時間制御弱障害におけるボース・アインシュタイン凝縮変形の非平衡進化 Non-equilibrium evolution of Bose-Einstein condensate deformation in temporally controlled weak disorder ( http://arxiv.org/abs/2009.10477v2 ) ライセンス: Link先を確認	Milan Radonji\'c and Axel Pelster	(参考訳) 弱障害電位のオン・オフが、障害誘発凝縮変形の出現によって初期平衡ボース・アインシュタイン凝縮の定常状態にどのように影響するかを考慮し、汚泥問題に対する摂動平均場アプローチの時間依存的拡張を考える。その結果, 定常凝縮変形は, 実際に障害の断熱スイッチに対応する平衡部分の和であり, 後者が特定の駆動プロトコルに依存する動的誘導部分であることがわかった。その後、障害がオフになれば、結果として生じる凝縮変形は、平衡部が消滅する間、長期限界における追加の動的誘起部分を取得する。また,不均一な捕捉凝縮物に対する適切な一般化を示す。その結果, 縮合変形は時間的に制御された弱障害におけるボース気体の定常状態の一般非平衡性の指標であることが示された。 We consider a time-dependent extension of a perturbative mean-field approach to the dirty boson problem by considering how switching on and off a weak disorder potential affects the stationary state of an initially equilibrated Bose-Einstein condensate by the emergence of a disorder-induced condensate deformation. We find that in the switch on scenario the stationary condensate deformation turns out to be a sum of an equilibrium part, that actually corresponds to adiabatic switching on the disorder, and a dynamically-induced part, where the latter depends on the particular driving protocol. If the disorder is switched off afterwards, the resulting condensate deformation acquires an additional dynamically-induced part in the long-time limit, while the equilibrium part vanishes. We also present an appropriate generalization to inhomogeneous trapped condensates. Our results demonstrate that the condensate deformation represents an indicator of the generically non-equilibrium nature of steady states of a Bose gas in a temporally controlled weak disorder.	翻訳日:2023-05-01 07:06:06 公開日:2020-11-27
# エンジニアリング純粋に非線形な結合とクォートン Engineering Purely Nonlinear Coupling with the Quarton ( http://arxiv.org/abs/2010.09959v2 ) ライセンス: Link先を確認	Yufeng Ye, Kaidong Peng, Mahdi Naghiloo, Gregory Cunningham, and Kevin P. O'Brien	(参考訳) 超伝導量子ビットと光子の強い非線形結合は、量子情報処理の重要な構成要素である。ジョセフソンの非線形性の摂動性のため、線形カップリングは分散状態において近似的な非線形カップリングにしばしば用いられる。しかし、この分散結合は弱く、基礎となる線形結合は局所モードを混合し、例えば不必要な自己結合をフォトンモードに分配する。ここでは、クォートンを用いて2つの線形分離されたトランモン量子ビット間の純粋に非線形結合を得る。クォートンのゼロ$\phi^2$ポテンシャルは、既存のスキームに比べて桁違いに強い巨大なギガヘルツレベルのクロスカーを可能にし、クォートンの正の$\phi^4$ポテンシャルはクォービットの負のセルフカーをキャンセルして共振器にすることができる。量子ビット光子、量子ビット光子、光子光子の間のこの巨大なクロスカーは、単一マイクロ波光子検出やボソニック符号の実装のような応用に最適である。 Strong nonlinear coupling of superconducting qubits and/or photons is a critical building block for quantum information processing. Due to the perturbative nature of the Josephson nonlinearity, linear coupling is often used in the dispersive regime to approximate nonlinear coupling. However, this dispersive coupling is weak and the underlying linear coupling mixes the local modes which, for example, distributes unwanted self-Kerr to photon modes. Here, we use the quarton to yield purely nonlinear coupling between two linearly decoupled transmon qubits. The quarton's zero $\phi^2$ potential enables a giant gigahertz-level cross-Kerr which is an order of magnitude stronger compared to existing schemes, and the quarton's positive $\phi^4$ potential can cancel the negative self-Kerr of qubits to linearize them into resonators. This giant cross-Kerr between bare modes of qubit-qubit, qubit-photon, and even photon-photon is ideal for applications such as single microwave photon detection and implementation of bosonic codes.	翻訳日:2023-04-28 05:50:14 公開日:2020-11-27
# マルチサイドバンドRABBITTスキームにおける遷移相の分解 Decomposition of the transition phase in multi-sideband RABBITT schemes ( http://arxiv.org/abs/2011.02989v2 ) ライセンス: Link先を確認	Divya Bharti, David Atri-Schuller, Gavin Menning, Kathryn R. Hamilton, Robert Moshammer, Thomas Pfeifer, Nicolas Douguet, Klaus Bartschat, Anne Harth	(参考訳) 2光子遷移(rabbitt)の干渉によるアト秒ビーティングの再構成は、光イオン化過程における原子遷移元素の相を決定するのに使用できる技術である。従来のRABBITTスキームでは、いわゆる漸近近似(asymptotic approximation)は、測定された位相を、単光子イオン化過程と連続体-連続体(cc)相に連結されたウィグナー相の和とみなす。本稿では,漸近近似をマルチサイドバンドRABBITTスキームに拡張する可能性を検討する。この近似からの予測は、原子水素の時間依存シュル=オディンガー方程式の解法に基づいて、 {\displaystyle {\it ab initio} 計算によって得られた結果と比較される。 Reconstruction of Attosecond Beating By Interference of Two-photon Transitions (RABBITT) is a technique that can be used to determine the phases of atomic transition elements in photoionization processes. In the traditional RABBITT scheme, the so-called "asymptotic approximation" considers the measured phase as a sum of the Wigner phase linked to a single-photon ionization process and the continuum-continuum (cc) phase associated with further single-photon transitions in the continuum. In this paper, we explore the possibility of extending the asymptotic approximation to multi-sideband RABBITT schemes. The predictions from this approximation are then compared with results obtained by an {\it ab initio} calculation based on solving the time-dependent Schr\"odinger equation for atomic hydrogen.	翻訳日:2023-04-25 05:17:02 公開日:2020-11-27
# 量子ニューラルネットワークの記憶容量と学習能力 Storage capacity and learning capability of quantum neural networks ( http://arxiv.org/abs/2011.06113v2 ) ライセンス: Link先を確認	Maciej Lewenstein, Aikaterini Gratsea, Andreu Riera-Campeny, Albert Aloy, Valentin Kasper, Anna Sanpera	(参考訳) 我々は、完全正のトレース保存(cptp)写像として記述される量子ニューラルネットワーク(qnns)の記憶容量を調べ、n$-次元ヒルベルト空間に作用する。我々はQNNが最大$N$の線形独立な純状態を保存することを実証し、対応する写像の構造を提供する。古典的なホップフィールドネットワークの記憶容量はニューロンの数に線形にスケールするが、qnnは指数関数的に独立な状態の数を格納できることを示した。我々はGardnerプログラムを用いることで、CPTPマップの相対体積をM$の定常状態で推定する。体積は$M$で指数関数的に減少し、$M\geq N+1$で0に縮まる。本研究の結果は、混合状態を格納したQNNとフィードフォワードQNNの入力出力関係に一般化される。提案手法は,QNNの記憶特性と入力出力状態の量子特性を関連付ける経路を開く。この論文はPeter Wittekの思い出に捧げられている。 We study the storage capacity of quantum neural networks (QNNs) described as completely positive trace preserving (CPTP) maps, which act on an $N$-dimensional Hilbert space. We demonstrate that QNNs can store up to $N$ linearly independent pure states and provide the structure of the corresponding maps. While the storage capacity of a classical Hopfield network scales linearly with the number of neurons, we show that QNNs can store an exponential number of linearly independent states. We estimate, employing the Gardner program, the relative volume of CPTP maps with $M$ stationary states. The volume decreases exponentially with $M$ and shrinks to zero for $M\geq N+1$. We generalize our results to QNNs storing mixed states as well as input-output relations for feed-forward QNNs. Our approach opens the path to relate storage properties of QNNs to the quantum properties of the input-output states. This paper is dedicated to the memory of Peter Wittek.	翻訳日:2023-04-24 11:33:25 公開日:2020-11-27
# 空洞中の原子の集合的自己トッピング Collective self-trapping of atoms in a cavity ( http://arxiv.org/abs/2011.10440v2 ) ライセンス: Link先を確認	A. Dombi, T. W. Clark, F. I. B. Williams, F. Jessen, J. Fort\'agh, D. Nagy, A. Vukics, P. Domokos	(参考訳) 本研究では,高精細キャビティの動的結合モードを用いて,寒冷原子雲の光双極子トラップを実験的に実証する。トラップは原子の集合的な作用を必要とすること、すなわち1つの原子は同じレーザー駆動条件下では閉じ込められないことを示す。原子はモードの周波数を共鳴に近づけることで、キャビティに閉じ込めるために必要な光の強度を与える。トラップ光モードにおける原子のバックアクションは、トラップの非指数的崩壊によっても現れる。 We experimentally demonstrate optical dipole trapping of a cloud of cold atoms by means of a dynamically coupled mode of a high-finesse cavity. We show that the trap requires a collective action of the atoms, i.e. a single atom would not be trapped under the same laser drive conditions. The atoms pull the frequency of the mode closer to resonance, thereby allowing the necessary light intensity for trapping into the cavity. The back-action of the atoms on the trapping light mode is also manifested by the non-exponential collapse of the trap.	翻訳日:2023-04-23 14:54:37 公開日:2020-11-27
# 乗り合い運転者の殺人事件が南京における乗り合い利用者の意志推定に及ぼす影響 Influence of Murder Incident of Ride-hailing Drivers on Ride-hailing User's Consuming Willingness in Nanchang ( http://arxiv.org/abs/2011.11384v2 ) ライセンス: Link先を確認	Guangxin He, Shenghuan Yang, Miaomiao Lei, Xing Wu, Yixin Sun, Yimeng Dang	(参考訳) 2018年の中国における配車ドライバーの殺人事件が頻発したため、配車会社はこのような事故の防止と乗客の安全確保のために一連の措置を講じた。本研究は,殺人事件後の配車アプリの使用意欲と安全確保に対するユーザの態度を調査した。ライドシェアリングドライバーの殺人事件は、人々のライドシェアリングアプリの使用に重大な影響を及ぼすことがわかった。女性利用者の有意感は「心理的害」など男性利用者の0.633倍であり, 女性利用者の間ではより明らかであった。最後に,配車アプリの効率には満足するが,安全性や信頼性には満足せず,重要であると考えられた。 Due to the frequent murder incidents of ride-hailing drivers in China in 2018, ride-hailing companies took a series of measures to prevent such incidents and ensure ride-hailing passengers' safety. This study investigated users' willingness to use ride-hailing apps after murder incidents and users' attitudes toward Safety Rectification. We found that murder incidents of ride-hailing drivers had a significant adverse impact on people's usage of ride-hailing apps. Female users' consuming willingness was 0.633 times that of male users, such as" psychological harm" was more evident among females, and Safety Rectification had a calming effect for some users. Finally, we found that people were satisfied with ride-hailing apps' efficiency, but were not satisfied with safety and reliability, considered them important; female users were more concerned about the security than male users.	翻訳日:2023-04-23 14:45:12 公開日:2020-11-27
# wi-fiのセキュリティとプライバシーの強化は、電波指紋の難読化による Stay Connected, Leave no Trace: Enhancing Security and Privacy in WiFi via Obfuscating Radiometric Fingerprints ( http://arxiv.org/abs/2011.12644v2 ) ライセンス: Link先を確認	Luis F. Abanto-Leon and Andreas Baeuml and Gek Hong (Allyson) Sim and Matthias Hollick and Arash Asadi	(参考訳) WiFiチップセットの固有のハードウェア欠陥は、送信された信号に現れ、ユニークなラジオメトリック指紋をもたらす。この指紋は、セキュリティを強化するための認証手段として使用できる。実際、近年の研究では、市販のデバイスに容易に実装できる実用的な指紋認証ソリューションが提案されている。本稿では,これらの解が偽装攻撃に対して非常に脆弱であることを解析的かつ実験的に証明する。また、このようなユニークなデバイスベースの署名は、ユーザーデバイスを追跡することによってプライバシーを侵害するために悪用されることも示しており、現在、ユーザーはデバイスをオフにする以外にそのようなプライバシー攻撃を防ぐ手段を持っていない。 RF-Veilは,不正行為に対して堅牢であるだけでなく,送信機の無線指紋を非正規受信機に隠蔽することでユーザのプライバシーを保護する。具体的には、送信信号に位相誤差のランダム化パターンを導入し、受信側だけが送信元の指紋を抽出できるようにした。一連の実験と分析において, 統計的攻撃に内在的ランダム化を採用する脆弱性を明らかにし, 対策を導入する。最後に,RF-Veilがユーザプライバシ保護とセキュリティ向上に有効であることを示す。さらに,提案手法はRF-Veilを使用しない他のデバイスとの通信を可能にする。 The intrinsic hardware imperfection of WiFi chipsets manifests itself in the transmitted signal, leading to a unique radiometric fingerprint. This fingerprint can be used as an additional means of authentication to enhance security. In fact, recent works propose practical fingerprinting solutions that can be readily implemented in commercial-off-the-shelf devices. In this paper, we prove analytically and experimentally that these solutions are highly vulnerable to impersonation attacks. We also demonstrate that such a unique device-based signature can be abused to violate privacy by tracking the user device, and, as of today, users do not have any means to prevent such privacy attacks other than turning off the device. We propose RF-Veil, a radiometric fingerprinting solution that not only is robust against impersonation attacks but also protects user privacy by obfuscating the radiometric fingerprint of the transmitter for non-legitimate receivers. Specifically, we introduce a randomized pattern of phase errors to the transmitted signal such that only the intended receiver can extract the original fingerprint of the transmitter. In a series of experiments and analyses, we expose the vulnerability of adopting naive randomization to statistical attacks and introduce countermeasures. Finally, we show the efficacy of RF-Veil experimentally in protecting user privacy and enhancing security. More importantly, our proposed solution allows communicating with other devices, which do not employ RF-Veil.	翻訳日:2023-04-23 00:56:15 公開日:2020-11-27
# 線形量子チャネルによる2励起ルーティング Two-excitation routing via linear quantum channels ( http://arxiv.org/abs/2011.13711v1 ) ライセンス: Link先を確認	Tony John George Apollaro and Wayne Jordan Chetcuti	(参考訳) ネットワーク内の異なるノード間で量子情報をルーティングすることは、量子インターネットの基本的な前提条件である。シングルキュービットルーティングは概ね解決されているが、多くのキュービットルーティングプロトコルは、これまで深く研究されていない。 arXiv:1911.12211における多重励振転送プロトコルに基づいて、複数の受動ブロックが線形連鎖に結合されたネットワーク上の2励振ルーティングプロトコルに摂動伝達スキームを適用する。我々は、受信機とチェーン間の切替可能な結合と永久結合の両方に対処する。このプロトコルはフェミオンネットワーク上で効率の良い2励振ルーティングを可能にするが、スピン=$\frac{1}{2}$ネットワークの場合、ネットワークの限られた領域だけが高品質なルーティングに適している。 Routing quantum information among different nodes in a network is a fundamental prerequisite for a quantum internet. While single-qubit routing has been largely addressed, many-qubit routing protocols have not been intensively investigated so far. Building on the many-excitation transfer protocol in arXiv:1911.12211, we apply the perturbative transfer scheme to a two-excitation routing protocol on a network where multiple two-receivers block are coupled to a linear chain. We address both the case of switchable and permanent couplings between the receivers and the chain. We find that the protocol allows for efficient two-excitation routing on a fermionic network, although for a spin-$\frac{1}{2}$ network only a limited region of the network is suitable for high-quality routing.	翻訳日:2023-04-22 20:48:22 公開日:2020-11-27
# 非可換平面における角運動量量子逆流 Angular momentum quantum backflow in the noncommutative plane ( http://arxiv.org/abs/2011.13644v1 ) ライセンス: Link先を確認	V. D. Paccoia, O. Panella and P. Roy	(参考訳) 非可換平面における量子バックフロー問題を研究する。特に,非可換運動量演算子と振動子相互作用のない荷電粒子について検討し,各ケースにおける角運動量逆流と,それらの差異について検討した。また、角運動量逆流の発生に関連する確率を提案し、その確率が物理パラメータ、すなわち磁場に依存するかどうかを調べる。 We study the quantum backflow problem in the noncommutative plane. In particular, we have considered a charged particle with and without an oscillator interaction with noncommuting momentum operators and examined angular momentum backflow in each case and how they differ from each other. We also propose a probability associated with the occurence of angular momentum backflow and investigate whether or not the probability depends on a physical parameter, namely the magnetic field.	翻訳日:2023-04-22 20:48:07 公開日:2020-11-27
# 市民集団における人間計算:知識管理ソリューションフレームワーク Human Computations in Citizen Crowds: A Knowledge Management Solution Framework ( http://arxiv.org/abs/2011.13638v1 ) ライセンス: Link先を確認	Nadeem Kafi, Zubair Ahmed Shaikh, and Muhammad Shahid Shaikh	(参考訳) KG(知識世代)と理解は伝統的に人間中心の活動であった。 KE (Knowledge Engineering) と KM (Knowledge Management) は、2つの異なる平面上の人間の知識を増強しようと試みている。しかし、どちらもコンピュータ中心である。クラウドソーシングhc(human computations)は最近、人間の認識とメモリを利用して、特定のタスクに関する多様な知識ストリームを生成する。文学は、市民の群衆のためのKMフレームワークについてはほとんど研究せず、様々な分野の人間からインプットを集め、タスクや知識カテゴリに関する知識を組織化し、コンピュータ中心の活動として新しい知識を再現する。本稿では,知識の生成,知識へのフィードバック,学習環境におけるその知識の結果を記録することを目的とした,ExamCheckという簡単なソリューションを実装したフレームワークの構築の試みを行う。 hcに基づく我々のソリューションは、構造化kmフレームワークが参加者自身にとって重要なコンテキストで複雑な問題に対処することができることを示している。 KG (Knowledge Generation) and understanding have traditionally been a Human-centric activity. KE (Knowledge Engineering) and KM (Knowledge Management) have tried to augment human knowledge on two separate planes: the first deals with machine interpretation of knowledge while the later explore interactions in human networks for KG and understanding. However, both remain computer-centric. Crowdsourced HC (Human Computations) have recently utilized human cognition and memory to generate diverse knowledge streams on specific tasks, which are mostly easy for humans to solve but remain challenging for machine algorithms. Literature shows little work on KM frameworks for citizen crowds, which gather input from the diverse category of Humans, organize that knowledge concerning tasks and knowledge categories and recreate new knowledge as a computer-centric activity. In this paper, we present an attempt to create a framework by implementing a simple solution, called ExamCheck, to focus on the generation of knowledge, feedback on that knowledge and recording the results of that knowledge in academic settings. Our solution, based on HC, shows that a structured KM framework can address a complex problem in a context that is important for participants themselves.	翻訳日:2023-04-22 20:48:00 公開日:2020-11-27
# 超対称性、半有界状態、および放牧入射反射 Supersymmetry, half-bound states, and grazing incidence reflection ( http://arxiv.org/abs/2011.13621v1 ) ライセンス: Link先を確認	D. A. Patient and S. A. R. Horsley	(参考訳) 平面媒質への入射時の電磁波は、ポテンシャル井戸に衝突するゼロエネルギー量子粒子と類似している。この極限波は通常完全に反射される。ここでは「半境界状態」の光学的類似性をサポートする誘電体プロファイルを探索し、放牧入射時の反射をゼロにする。これらのプロファイルを得るには、超対称量子力学とヘルムホルツ方程式の直接反転という、2つの異なる理論的アプローチを用いる。 Electromagnetic waves at grazing incidence onto a planar medium are analogous to zero energy quantum particles incident onto a potential well. In this limit waves are typically completely reflected. Here we explore dielectric profiles supporting optical analogues of `half-bound states', allowing for zero reflection at grazing incidence. To obtain these profiles we use two different theoretical approaches: supersymmetric quantum mechanics, and direct inversion of the Helmholtz equation.	翻訳日:2023-04-22 20:47:40 公開日:2020-11-27
# 異点を有する機械共振器のエネルギーレベル誘導と耐熱冷却 Energy-level-attraction and heating-resistant-cooling of mechanical resonators with exceptional points ( http://arxiv.org/abs/2011.13587v1 ) ライセンス: Link先を確認	Cheng Jiang, Yu-Long Liu, Mika A. Sillanp\"a\"a	(参考訳) 合成フォノニックゲージ場における機械共振器のエネルギー準位発展と基底状態冷却について検討した。可変ゲージ位相は、マルチモード光機械系における$\mathcal{pt}$- と anti-$\mathcal{pt}$-symmetric mechanical couplings の位相差によって媒介される。透過スペクトルは、ゲージ位相を変調して非対称なファノ線形状または二重光学的に誘起される透明性を示す。さらに、機械的結合が継続的に増大しても固有値が崩壊して縮退する。このような直感的エネルギー誘引は、反交差ではなく、$\mathcal{PT}$-と$-\mathcal{PT}$-対称結合の間の破壊的干渉に起因する。機械的固有値がピークに対応するキャビティ出力パワースペクトルにおいて,エネルギートラクションとそれに伴う例外点(EP)がより直感的に観測できることがわかった。機械冷却の場合、これらのEPでは平均フォノン占有数が最小となる。特にフォノン輸送は非相反し、EPでは理想的には一方向になる。最後に、ゲージ場を媒介とした非相反フォノン輸送に基づく耐熱性地中冷却を提案する。マクロメカニカル共振器の量子状態に向けて、ほとんどのオプトメカニカルシステムは本質的に空洞や機械的加熱によって制限される。我々の研究により、熱エネルギー移動はゲージ位相を調整し、悪名高い加熱限界を乗り越えるための有望な経路をサポートすることでブロックできることが判明した。 We study the energy-level evolution and ground-state cooling of mechanical resonators under a synthetic phononic gauge field. The tunable gauge phase is mediated by the phase difference between the $\mathcal{PT}$- and anti-$\mathcal{PT}$-symmetric mechanical couplings in a multimode optomechanical system. The transmission spectrum then exhibits the asymmetric Fano line shape or double optomechanically induced transparency by modulating the gauge phase. Moreover, the eigenvalues will collapse and become degenerate although the mechanical coupling is continuously increased. Such counterintuitive energy-attraction, instead of anti-crossing, attributes to destructive interferences between $\mathcal{PT}$- and anti-$\mathcal{PT}$-symmetric couplings. We find that the energy-attraction, as well as the accompanied exceptional points (EPs), can be more intuitively observed in the cavity output power spectrum where the mechanical eigenvalues correspond to the peaks. For mechanical cooling, the average phonon occupation number becomes minimum at these EPs. Especially, phonon transport becomes nonreciprocal and even ideally unidirectional at the EPs. Finally, we propose a heating-resistant ground-state cooling based on the nonreciprocal phonon transport, which is mediated by the gauge field. Towards the quantum regime of macroscopic mechanical resonators, most optomechanical systems are ultimately limited by their intrinsic cavity or mechanical heating. Our work revealed that the thermal energy transfer can be blocked by tuning the gauge phase, which supports a promising route to overpass the notorious heating limitations.	翻訳日:2023-04-22 20:46:59 公開日:2020-11-27
# ソーシャルメディアデータ、衛星画像、地理空間情報を用いた解釈可能な貧困マッピング Interpretable Poverty Mapping using Social Media Data, Satellite Images, and Geospatial Information ( http://arxiv.org/abs/2011.13563v1 ) ライセンス: Link先を確認	Chiara Ledesma, Oshean Lee Garonita, Lorenzo Jaime Flores, Isabelle Tingzon, and Danielle Dalisay	(参考訳) 人道的組織が貧困緩和のための脆弱な地域を特定するためには、正確できめ細かい最新の貧困データへのアクセスが不可欠である。近年、コンピュータビジョンと衛星画像の組み合わせによる貧困評価が成功しているが、ブラックボックスモデルと組み合わせた高解像度画像を取得するコストは、多くの開発組織にとって大きな障壁となる。本研究では,機械学習と,ソーシャルメディアデータ,低解像度衛星画像,ボランティア地理情報など,容易にアクセス可能なデータソースを用いて,貧困推定のための解釈可能かつ費用効率の高い手法を提案する。提案手法を用いて,フィリピンの資産推定におけるR^2$0.66を衛星画像を用いた0.63に対して達成した。最後に、機能の重要性分析を使用して、グローバルとローカルの両方で最も貢献度の高い機能を特定し、意思決定者が貧困に関する深い洞察を得る手助けをします。 Access to accurate, granular, and up-to-date poverty data is essential for humanitarian organizations to identify vulnerable areas for poverty alleviation efforts. Recent works have shown success in combining computer vision and satellite imagery for poverty estimation; however, the cost of acquiring high-resolution images coupled with black box models can be a barrier to adoption for many development organizations. In this study, we present a interpretable and cost-efficient approach to poverty estimation using machine learning and readily accessible data sources including social media data, low-resolution satellite images, and volunteered geographic information. Using our method, we achieve an $R^2$ of 0.66 for wealth estimation in the Philippines, compared to 0.63 using satellite imagery. Finally, we use feature importance analysis to identify the highest contributing features both globally and locally to help decision makers gain deeper insights into poverty.	翻訳日:2023-04-22 20:46:36 公開日:2020-11-27
# 都市Twitterネットワークとコミュニティ:アテネのマイクロブログを事例として Urban Twitter Networks and Communities: A Case Study of Microblogging in Athens ( http://arxiv.org/abs/2011.13785v1 ) ライセンス: Link先を確認	Tasos Spiliotopoulos, Ian Oakley	(参考訳) 本稿では,都市レベルのハッシュタグを用いたTwitterユーザによるコミュニティについて検討する。特に,ギリシャのアテネ市における,関連するTwitterハッシュタグデータの解析と可視化によって実証されたネットワークの視点を提供し,この地理的局所ネットワークのマイクロブロッギングの実践に関する概観と深い洞察を提供する。さらなる分析から、このネットワークのメンバーによって定義されたtwitterコミュニティは、現実のコミュニティの強い兆候を示していることが示唆された。 This paper examines the community formed by the Twitter users that used a city-level hashtag. In particular, we provide a network perspective of the city of Athens, Greece, as demonstrated by the analysis and visualization of the relevant Twitter hashtag data, in order to present both an overview and deeper insights at the microblogging practices of this geographic local network. Further analysis suggests that the Twitter community defined by the members of the network shows strong signs of a real-life community.	翻訳日:2023-04-22 20:39:03 公開日:2020-11-27
# 量子アシスト量子制御のためのアルゴリズムプリミティブ Algorithmic Primitives for Quantum-Assisted Quantum Control ( http://arxiv.org/abs/2011.13777v1 ) ライセンス: Link先を確認	Guru-Vamsi Policharla and Sai Vinjanampathy	(参考訳) NISQデバイスに実装可能な様々な量子支援量子制御アルゴリズムを構築するために,オーバーラップと遷移行列時系列を評価するための2つの原始的アルゴリズムについて論じる。従来の手法と異なり, 断層計測を回避し, 単一の量子ビット計測のみに依存する。トロッタライズと測定誤差から発生する合成アルゴリズムとノイズ源の回路複雑性を解析した。 We discuss two primitive algorithms to evaluate overlaps and transition matrix time series, which are used to construct a variety of quantum-assisted quantum control algorithms implementable on NISQ devices. Unlike previous approaches, our method bypasses tomographically complete measurements and instead relies solely on single qubit measurements. We analyse circuit complexity of composed algorithms and sources of noise arising from Trotterization and measurement errors.	翻訳日:2023-04-22 20:38:54 公開日:2020-11-27
# 駆動散逸ボソニック場の変動解析 Variational analysis of driven-dissipative bosonic fields ( http://arxiv.org/abs/2011.13746v1 ) ライセンス: Link先を確認	Tim Pistorius and Hendrik Weimer	(参考訳) 本稿では,任意の大きな占有数を持つ駆動分散ボソニック場に対する量子マスター方程式の変分解析を行う手法を提案する。我々のアプローチは、密度行列のP表現と開量子系の変分原理を組み合わせたものである。提案手法を,波動関数モンテカルロシミュレーションとJaynes-Cummingsモデルに対するMaxwell-Bloch方程式の解との比較により評価した。さらに,キャビティフィールドにおけるRydberg分極を記述するモデルについて検討し,異なるモード間の相関を記述するために,変分パラマタの追加を導入する。 We present a method to perform a variational analysis of the quantum master equation for driven-disspative bosonic fields with arbitrary large occupation numbers. Our approach combines the P representation of the density matrix and the variational principle for open quantum system. We benchmark the method by comparing it to wave-function Monte-Carlo simulations and the solution of the Maxwell-Bloch equation for the Jaynes-Cummings model. Furthermore, we study a model describing Rydberg polaritons in a cavity field and introduce an additional set of variational paramaters to describe correlations between different modes.	翻訳日:2023-04-22 20:37:36 公開日:2020-11-27
# 量子力学の多世界解釈--パラドックス的考察 Many-Worlds Interpretation of Quantum Mechanics: A Paradoxical Picture ( http://arxiv.org/abs/2011.13928v1 ) ライセンス: Link先を確認	Amir Abbass Varshovi	(参考訳) 量子力学の多世界解釈(MWI)は、解釈における(半)決定論的並列世界の現実に基づく前例のない存在論的視点から研究される。不確実性原理のおかげで、宇宙の正しいオントロジーを特定する一貫した方法が存在しないことが示され、そのため、MWIは我々の住む世界は非現実であると主張する固有の矛盾の対象となっている。 The many-worlds interpretation (MWI) of quantum mechanics is studied from an unprecedented ontological perspective based on the reality of (semi-) deterministic parallel worlds in the interpretation. It is demonstrated that with thanks to the uncertainty principle there would be no consistent way to specify the correct ontology of the Universe, hence the MWI is subject to an inherent contradiction which claims that the world where we live in is unreal.	翻訳日:2023-04-22 20:30:29 公開日:2020-11-27
# 非巻きフェルミオンSPT相:超対称性拡張 Unwinding Fermionic SPT Phases: Supersymmetry Extension ( http://arxiv.org/abs/2011.13921v1 ) ライセンス: Link先を確認	Abhishodh Prakash, Juven Wang	(参考訳) We show how 1+1-dimensional fermionic symmetry-protected topological states (SPTs, i.e. nontrivial short-range entangled gapped phases of quantum matter whose boundary exhibits 't Hooft anomaly and whose bulk cannot be deformed into a trivial tensor product state under finite-depth local unitary transformations only in the presence of global symmetries), indeed can be unwound to a trivial state by enlarging the Hilbert space via adding extra degrees of freedom and suitably extending the global symmetries. 境界上の拡張射影的大域対称性は、特定の意味で超対称性(すなわち、フェルミオン数パリティ$(-1)^F$と可換でない群要素を含む)となり、反単位時間反転対称性は分数化される。これはまた、群拡大の観点で適切な超対称性拡張により、ある種の異種なフェルミオン異常(例えば、時間反転や反射対称性における「パリティ」異常)を上昇および除去できることを意味する。 1+1dMajorana fermion chain の多層構造について、Sachdev-Ye-Kitaev (SYK) 相互作用によるモデル、超対称性で保護された固有フェルミオン性ギャップレス SPT 、コボルディズム理論による高次時空次元への一般化の明確な例を考察する。 We show how 1+1-dimensional fermionic symmetry-protected topological states (SPTs, i.e. nontrivial short-range entangled gapped phases of quantum matter whose boundary exhibits 't Hooft anomaly and whose bulk cannot be deformed into a trivial tensor product state under finite-depth local unitary transformations only in the presence of global symmetries), indeed can be unwound to a trivial state by enlarging the Hilbert space via adding extra degrees of freedom and suitably extending the global symmetries. The extended projective global symmetry on the boundary can become supersymmetric in a specific sense, i.e., it contains group elements that do not commute with the fermion number parity $(-1)^F$, while the anti-unitary time-reversal symmetry becomes fractionalized. This also means we can uplift and remove certain exotic fermionic anomalies (e.g., "parity" anomaly in time-reversal or reflection symmetry) via appropriate supersymmetry extensions in terms of group extensions. We work out explicit examples for multi-layers of 1+1d Majorana fermion chains, then comment on models with Sachdev-Ye-Kitaev (SYK) interactions, intrinsic fermionic gapless SPTs protected by supersymmetry, and generalizations to higher spacetime dimensions via a cobordism theory.	翻訳日:2023-04-22 20:30:23 公開日:2020-11-27
# コヒーレント状態重ね合わせ、絡み合いおよびゲージ/重力対応 Coherent state superpositions, entanglement and gauge/gravity correspondence ( http://arxiv.org/abs/2011.13919v1 ) ライセンス: Link先を確認	Hai Lin, Yuwei Zhu	(参考訳) 我々はゲージ/重力対応の文脈において,多重力状態のコヒーレント状態と巨大重力状態のコヒーレント状態という2種類のコヒーレント状態に注目した。我々は位相シフト演算子とそのコヒーレント状態の重ね合わせに対する作用を便利に利用する。 N$状態のシュロディンガー猫状態は、一列のヤングテーブルー状態に近づき、それらの間の忠実度は、N$で漸近的に1に達する。これらの状態の量子フィッシャー情報は、基底状態の励起エネルギーの分散に比例し、位相空間の角方向における状態の局在性を特徴づける。気泡広告における位相空間平面の異なる領域を用いて,重力自由度間の相関と絡み合いを解析した。位相空間平面における2つの絡み合った環間の相関は、2つの環の間の環状の面積に関係している。また、2種類のノイズコヒーレント状態も解析し、これはノイズレス限界における純コヒーレント状態と大きなノイズリミットにおける最大混合状態との間の補間状態と見なすことができる。 We focus on two types of coherent states, the coherent states of multi graviton states and the coherent states of giant graviton states, in the context of gauge/gravity correspondence. We conveniently use a phase shift operator and its actions on the superpositions of these coherent states. We find $N$-state Schrodinger cat states which approach the one-row Young tableau states, with fidelity between them asymptotically reaches 1 at large $N$. The quantum Fisher information of these states is proportional to the variance of the excitation energy of the underlying states, and characterizes the localizability of the states in the angular direction in the phase space. We analyze the correlation and entanglement between gravitational degrees of freedom using different regions of the phase space plane in bubbling AdS. The correlation between two entangled rings in the phase space plane is related to the area of the annulus between the two rings. We also analyze two types of noisy coherent states, which can be viewed as interpolated states that interpolate between a pure coherent state in the noiseless limit and a maximally mixed state in the large noise limit.	翻訳日:2023-04-22 20:29:58 公開日:2020-11-27
# ガリウムヒ素スピン量子ビットのロードマップ Roadmap for gallium arsenide spin qubits ( http://arxiv.org/abs/2011.13907v1 ) ライセンス: Link先を確認	Ferdinand Kuemmeth and Hendrik Bluhm	(参考訳) 窒化ガリウム(GaAs)のゲート定義量子ドットは、作製の比較的単純さと、単一の伝導帯谷、小さな有効質量、安定なドーパントのような好ましい電子特性のために、スピン量子ビット装置のパイオニアとして広く用いられている。 GaAsスピン量子ビットは、多くの研究室で容易に生成され、現在、絡み合い、量子非破壊測定、自動チューニング、マルチドットアレイ、コヒーレント交換結合、テレポーテーションなど様々な用途で研究されている。多くの注目が他の材料にシフトしているにもかかわらず、GaAsデバイスは概念実証量子情報処理や固体実験の原動力となるだろう。 Gate-defined quantum dots in gallium arsenide (GaAs) have been used extensively for pioneering spin qubit devices due to the relative simplicity of fabrication and favourable electronic properties such as a single conduction band valley, a small effective mass, and stable dopants. GaAs spin qubits are readily produced in many labs and are currently studied for various applications, including entanglement, quantum non-demolition measurements, automatic tuning, multi-dot arrays, coherent exchange coupling, and teleportation. Even while much attention is shifting to other materials, GaAs devices will likely remain a workhorse for proof-of-concept quantum information processing and solid-state experiments.	翻訳日:2023-04-22 20:29:39 公開日:2020-11-27
# 超伝導量子プロセッサ上のスターク多体局在 Stark many-body localization on a superconducting quantum processor ( http://arxiv.org/abs/2011.13895v1 ) ライセンス: Link先を確認	Qiujiang Guo, Chen Cheng, Hekang Li, Shibo Xu, Pengfei Zhang, Zhen Wang, Chao Song, Wuxin Liu, Wenhui Ren, Hang Dong, Rubem Mondaini, and H. Wang	(参考訳) 量子エミュレータは、チューナビリティと制御の程度が大きいため、密閉された量子多体系の微細な側面を観察することができる。後者はMulti-body Localization(MBL)現象と呼ばれ、局所情報の保存と遅い絡み合い成長によって動的に識別される非エルゴード的行動を記述する。ここでは,オンサイト・エネルギ・ランドスケープが乱れず,直線的に変化し,スタークmblをエミュレートする場合に,この現象学の正確な観察を行う。そこで我々は,32個の超伝導量子ビットからなる量子デバイスを構築し,非可積分スピンモデルの緩和ダイナミクスを忠実に再現する。本研究は, 古典的計算機における厳密なシミュレーションによって達成できる範囲を超過し, 量子アドバンテージの開始を示唆し, 量子計算を平衡多体問題を解くための資源として用いる方法を示す。 Quantum emulators, owing to their large degree of tunability and control, allow the observation of fine aspects of closed quantum many-body systems, as either the regime where thermalization takes place or when it is halted by the presence of disorder. The latter, dubbed many-body localization (MBL) phenomenon, describes the non-ergodic behavior that is dynamically identified by the preservation of local information and slow entanglement growth. Here, we provide a precise observation of this same phenomenology in the case the onsite energy landscape is not disordered, but rather linearly varied, emulating the Stark MBL. To this end, we construct a quantum device composed of thirty-two superconducting qubits, faithfully reproducing the relaxation dynamics of a non-integrable spin model. Our results describe the real-time evolution at sizes that surpass what is currently attainable by exact simulations in classical computers, signaling the onset of quantum advantage, thus bridging the way for quantum computation as a resource for solving out-of-equilibrium many-body problems.	翻訳日:2023-04-22 20:29:26 公開日:2020-11-27
# post or tweet: facebookとtwitterの利用に関する調査から学んだこと Post or Tweet: Lessons from a Study of Facebook and Twitter Usage ( http://arxiv.org/abs/2011.13802v1 ) ライセンス: Link先を確認	Tasos Spiliotopoulos, Ian Oakley	(参考訳) このワークショップでは、FacebookとTwitterという、おそらく最も人気のある2つのソーシャルネットワークサイトについて、現在進行中の混合調査についてレポートする。この研究の目的は、参加者のモチベーションに関する調査データとAPI抽出を通じて収集された利用データを組み合わせることで、ソーシャルメディアの選択とクロスプラットフォーム利用のニュアンスに光を当てることである。本研究のセットアップについて述べるとともに,参加者の募集やデータ収集,利用データの扱いと次元化,サイト間の利用データの比較などに関する課題と洞察に焦点をあてる。 This workshop paper reports on an ongoing mixed-methods study on the two arguably most popular social network sites, Facebook and Twitter, for the same users. The overarching goal of the study is to shed light into the nuances of social media selection and cross-platform use by combining survey data about participants' motivations with usage data collected via API extraction. We describe the set-up of the study and focus our discussion on the challenges and insights relating to participant recruiting and data collection, handling and dimensionalizing usage data, and comparing usage data across sites.	翻訳日:2023-04-22 20:28:29 公開日:2020-11-27
# ブラックローン問題:サブグループ差別と戦うための分配的ロバストな公平性 Black Loans Matter: Distributionally Robust Fairness for Fighting Subgroup Discrimination ( http://arxiv.org/abs/2012.01193v1 ) ライセンス: Link先を確認	Mark Weber, Mikhail Yurochkin, Sherif Botros, Vanio Markov	(参考訳) 今日の貸付におけるアルゴリズム的公正性は、保護されたグループ間の統計的公平性を監視するためのグループフェアネス指標に依存している。このアプローチは代理業者によるサブグループ差別に対して脆弱であり、貸し手に対する法的・評判的な損害と、借り手に対する不正な不公平な結果という重大なリスクを負っている。実用的な課題は保護されたグループの多くの組み合わせと部分集合から生じる。我々は、米国における歴史的および残酷な人種差別の背景から、この問題を動機付け、利用可能なすべてのトレーニングデータを汚染し、アルゴリタミズムバイアスに対する公衆の感受性を高める。本稿では,貸付における公正性に関する規制コンプライアンスプロトコルを概観し,その限界について述べる。本稿では,個別のフェアネス法とそれに対応するフェアネス学習アルゴリズムの最近の発展から,既存のグループフェアネス要件を順守しつつ,サブグループ識別に対処するソリューションを提案する。 Algorithmic fairness in lending today relies on group fairness metrics for monitoring statistical parity across protected groups. This approach is vulnerable to subgroup discrimination by proxy, carrying significant risks of legal and reputational damage for lenders and blatantly unfair outcomes for borrowers. Practical challenges arise from the many possible combinations and subsets of protected groups. We motivate this problem against the backdrop of historical and residual racism in the United States polluting all available training data and raising public sensitivity to algorithimic bias. We review the current regulatory compliance protocols for fairness in lending and discuss their limitations relative to the contributions state-of-the-art fairness methods may afford. We propose a solution for addressing subgroup discrimination, while adhering to existing group fairness requirements, from recent developments in individual fairness methods and corresponding fair metric learning algorithms.	翻訳日:2023-04-22 20:21:36 公開日:2020-11-27
# 量子場理論のハイゼンベルク像における干渉、現実とフェルミオンの局所的要素 Interference in the Heisenberg Picture of Quantum Field Theory, Local Elements of Reality and Fermions ( http://arxiv.org/abs/2011.14003v1 ) ライセンス: Link先を確認	Chiara Marletto, Nicetu Tibau Vidal, Vlatko Vedral	(参考訳) ハイゼンベルク図を用いたマッハ・ツェンダー干渉計における単一光子の量子干渉について述べる。我々の目的は、記述が古典的な電磁場の場合と同様に局所的であることを示すことであり、唯一の違いは、電場と磁場が量子の場合、作用素(量子可観測器)であることである。次に,単電子マッハ・ツェンダー干渉計を考察し,この場合のハイゼンベルク像の適切な処理について説明する。興味深いことに、パリティ超選択則は光子と異なる電子の扱いを強いる。現在の演算子のような異なるフェルミオンモードの局所量子オブザーバブルのみを用いるモデルは相取得を記述することができる。ハイゼンベルク図で定式化された量子電磁力学の局所的な定式化の中で、この局所解析をフェルミオン場とボゾン場にどのように拡張するかについて議論する。 We describe the quantum interference of a single photon in the Mach-Zehnder interferometer using the Heisenberg picture. Our purpose is to show that the description is local just like in the case of the classical electromagnetic field, the only difference being that the electric and the magnetic fields are, in the quantum case, operators (quantum observables). We then consider a single-electron Mach-Zehnder interferometer and explain what the appropriate Heisenberg picture treatment is in this case. Interestingly, the parity superselection rule forces us to treat the electron differently to the photon. A model using only local quantum observables of different fermionic modes, such as the current operator, is nevertheless still viable to describe phase acquisition. We discuss how to extend this local analysis to coupled fermionic and bosonic fields within the same local formalism of quantum electrodynamics as formulated in the Heisenberg picture.	翻訳日:2023-04-22 20:21:15 公開日:2020-11-27
# 高次高調波分光における赤外単一サイクルパルス誘起高エネルギープラトー Infrared single-cycle pulse induced high-energy plateaus in high-order harmonic spectroscopy ( http://arxiv.org/abs/2011.13995v1 ) ライセンス: Link先を確認	Abdelmalek Taoutioui and Hicham Agueny	(参考訳) 新たな実験[例えば \textit{Z]によって動機付けられる。 nie et al。 Nat! フォトンスペクトル領域 5 - 14 $\mu m$ において赤外(ir)単サイクルパルスを生成する際、理論上、高次高調波発生(hhg)過程を制御するためのそれらの役割は、強烈な近赤外(nir)多サイクルパルス(\lambda$ = 1.27$\mu m$)によって引き起こされる。このシナリオは、時間依存シュリンガー方程式の数値シミュレーションにより、水素原子のプロトタイプとして実証される。特に、結合パルスは偶数次高調波を発生させ、最も重要なことは高エネルギープラトーを発生させることであり、高調波遮断はNIRパルス単独の場合と比較して3倍の係数で拡張されることを示す。出現した高エネルギー高原は、nir磁場中を移動中の単サイクル電界からイオン化電子への膨大な運動量移動の結果、高運動量電子再結合をもたらすと理解されている。また、赤外場誘起電子変位効果による放出電子の方向制御におけるir単サイクル場の役割を明らかにした。さらに, 2つのパルス間の相対的キャリア・エンベロープ位相と波長を変化させることで, 出現した高原を制御できることを示した。そこで本研究では,IR単サイクル高調波分光法による時間分解電子回折の新しい視点を開拓した。 Motivated by the emerging experiments [e.g. \textit{Z. Nie et al. Nat. Photon. \textbf{12}, 489 (2018)}] on producing infrared (IR) single cycle pulses in the spectral region 5 - 14 $\mu m$, we theoretically investigate their role for controlling high-order harmonic generation (HHG) process induced by an intense near-infrared (NIR) multi-cycle pulse ($\lambda$ = 1.27 $\mu m$). The scenario is demonstrated for a prototype of the hydrogen atom by numerical simulations of the time-dependent Schr\"odinger equation. In particular, we show that the combined pulses allow one to generate even-order harmonics and most importantly to produce high-energy plateaus and that the harmonic cutoff is extended by a factor of 3 compared to the case with the NIR pulse alone. The emerged high-energy plateaus is understood as a result of a vast momentum transfer from the single-cycle field to the ionized electrons while travelling in the NIR field, and thus leading to high-momentum electron recollisions. We also identify the role of the IR single-cycle field for controlling the directionality of the emitted electrons via the IR-field induced electron displacement effect. We further show that the emerged plateaus can be controlled by varying the relative carrier-envelope phase between the two pulses as well as their wavelengths. Thus, our findings open up new perspectives for time-resolved electron diffraction using an IR single-cycle field-assisted high-harmonic spectroscopy.	翻訳日:2023-04-22 20:20:44 公開日:2020-11-27
# 多ビット系冷却用量子冷凍機 Few-qubit quantum refrigerator for cooling a multi-qubit system ( http://arxiv.org/abs/2011.13973v1 ) ライセンス: Link先を確認	Onat Ar{\i}soy and \"Ozg\"ur E. M\"ustecapl{\i}o\u{g}lu	(参考訳) 相互作用するマルチキュービット系を冷却する小型量子冷蔵庫として,数量子システムを提案する。具体的には、量子冷凍機としてスピンスターモデルと呼ばれる、中央量子ビットをn$ ancilla qubitsに結合する。まず, 量子ビット間の相互作用が縦型および強磁性イジングモデル形式である場合, 中心量子ビットは環境よりも低温であることを示す。その後、より冷たい中心量子ビットは、一般的な量子多ビット系を冷却するために、量子冷蔵庫の冷媒界面として使用されることが提案されている。 n$ と qubit-qubit の相互作用強度で制御できる運転コストと冷却効率を考慮して,簡単な冷凍サイクルについて検討した。また、達成可能な温度の限界が設定される。このような数量子ビットのコンパクトな量子冷蔵庫は、量子技術の応用の次元を減らし、全量子システムに容易に統合でき、量子コンピューティングや熱デバイスのスピードとパワーを高めることができる。 We propose to use a few-qubit system as a compact quantum refrigerator for cooling an interacting multi-qubit system. We specifically consider a central qubit coupled to $N$ ancilla qubits in a so-called spin-star model as our quantum refrigerator. We first show that if the interaction between the qubits is of the longitudinal and ferromagnetic Ising model form, the central qubit is colder than the environment. The colder central qubit is then proposed to be used as the refrigerant interface of the quantum refrigerator to cool down general quantum many-qubit systems. We discuss a simple refrigeration cycle, considering the operation cost and cooling efficiency, which can be controlled by $N$ and the qubit-qubit interaction strength. Besides, bounds on the achievable temperature are established. Such few-qubit compact quantum refrigerators can be significant to reduce dimensions of quantum technology applications, can be easy to integrate into all-qubit systems, and can increase the speed and power of quantum computing and thermal devices.	翻訳日:2023-04-22 20:20:15 公開日:2020-11-27
# 中心スピンモデルにおける相関R'enyiエントロピーの実験的検出 Experimental Detection of the Correlation R\'enyi Entropy in the Central Spin Model ( http://arxiv.org/abs/2011.13948v1 ) ライセンス: Link先を確認	Mohamad Niknam, Lea F. Santos, David G. Cory	(参考訳) 量子ビット間の相関の体積を定量化するエントロピーを実験的に提案する。この実験は、中心スピン結合からなるほぼ孤立した量子系上で行われ、当初は他の15個のスピンとは無関係であった。スピンスピン相互作用のため、情報は中心スピンから周囲のスピンへと流れ、時間とともに成長するマルチスピン相関のクラスターを形成する。我々は、マルチスピン相関の振幅を直接測定し、R'enyiエントロピーと呼ばれる相関の進化を計算する核磁気共鳴実験を設計する。このエントロピーは、絡み合いエントロピーの平衡後でも成長し続ける。また,R'enyiエントロピーの平衡の飽和点と時間スケールがシステムサイズにどのように依存するかを解析した。 We propose and experimentally measure an entropy that quantifies the volume of correlations among qubits. The experiment is carried out on a nearly isolated quantum system composed of a central spin coupled and initially uncorrelated with 15 other spins. Due to the spin-spin interactions, information flows from the central spin to the surrounding ones forming clusters of multi-spin correlations that grow in time. We design a nuclear magnetic resonance experiment that directly measures the amplitudes of the multi-spin correlations and use them to compute the evolution of what we call correlation R\'enyi entropy. This entropy keeps growing even after the equilibration of the entanglement entropy. We also analyze how the saturation point and the timescale for the equilibration of the correlation R\'enyi entropy depend on the system size.	翻訳日:2023-04-22 20:19:07 公開日:2020-11-27
# ハールランダム州のマナ Mana in Haar-random states ( http://arxiv.org/abs/2011.13937v1 ) ライセンス: Link先を確認	Christopher David White and Justin H. Wilson	(参考訳) Mana は状態を生成するのに必要な非クリフォードリソースの量を測るもので、$\ell$ qudits 上の混合状態のマナは $\le \frac 1 2 (\ell \ln d - S_2)$; $S_2$ 状態の第2の Renyi エントロピーによって束縛される。ハールランダムな純および混合状態のマナを計算し、そのマナがヒルベルト空間次元においてほぼ対数(英語版)(logarithmic)であることを見つける:つまり、クウディ次元におけるクウディッツ数と対数(英語版)(logarithmic in qudit dimension)を広範囲に含む。特に、最大エントロピーに満たない状態の平均 mana は、その最大値の$\ln \pi/2$ に満たない。すると、この結果と近似的にt$-designsの最近の研究を結びつけて、manaは微分可能ではないので、非cliffordリソースの有用な尺度であると指摘します。 Mana is a measure of the amount of non-Clifford resources required to create a state; the mana of a mixed state on $\ell$ qudits bounded by $\le \frac 1 2 (\ell \ln d - S_2)$; $S_2$ the state's second Renyi entropy. We compute the mana of Haar-random pure and mixed states and find that the mana is nearly logarithmic in Hilbert space dimension: that is, extensive in number of qudits and logarithmic in qudit dimension. In particular, the average mana of states with less-than-maximal entropy falls short of that maximum by $\ln \pi/2$. We then connect this result to recent work on near-Clifford approximate $t$-designs; in doing so we point out that mana is a useful measure of non-Clifford resources precisely because it is not differentiable.	翻訳日:2023-04-22 20:18:24 公開日:2020-11-27
# 可制御長における抽象的要約のための解釈可能な多面的注意 Interpretable Multi-Headed Attention for Abstractive Summarization at Controllable Lengths ( http://arxiv.org/abs/2002.07845v2 ) ライセンス: Link先を確認	Ritesh Sarkhel, Moniba Keymanesh, Arnab Nandi, Srinivasan Parthasarathy	(参考訳) 制御可能な長さでの抽象的要約は自然言語処理において難しい課題である。トレーニングデータが限られているドメインや、サマリの長さが事前に分かっていないようなシナリオでは、さらに難しくなります。同時に、機械が生成した要約を信頼することに関して、人間の理解可能な言葉で要約がどのように構築されたかを説明することが重要かもしれない。本稿では,テキスト文書の要約を制御可能な長さで構築するための教師あり手法であるMulti-level Summarizer (MLS)を提案する。本手法のキーイネーバは,時間ステップ独立なセマンティクスカーネルの配列を用いて,入力文書上のアテンション分布を計算するマルチヘッドアテンション機構である。各カーネルは、人間の解釈可能な構文またはセマンティックプロパティを最適化する。英語における2つの低リソースデータセットの発掘実験により、MLSはMETEORスコアの14.70%まで強力なベースラインを上回ります。要約の人間による評価は、文書の重要概念を様々な予算で捉えることを示唆している。 Abstractive summarization at controllable lengths is a challenging task in natural language processing. It is even more challenging for domains where limited training data is available or scenarios in which the length of the summary is not known beforehand. At the same time, when it comes to trusting machine-generated summaries, explaining how a summary was constructed in human-understandable terms may be critical. We propose Multi-level Summarizer (MLS), a supervised method to construct abstractive summaries of a text document at controllable lengths. The key enabler of our method is an interpretable multi-headed attention mechanism that computes attention distribution over an input document using an array of timestep independent semantic kernels. Each kernel optimizes a human-interpretable syntactic or semantic property. Exhaustive experiments on two low-resource datasets in the English language show that MLS outperforms strong baselines by up to 14.70% in the METEOR score. Human evaluation of the summaries also suggests that they capture the key concepts of the document at various length-budgets.	翻訳日:2022-12-30 20:10:37 公開日:2020-11-27
# leafgan: 実用的植物病診断のためのデータ拡張法 LeafGAN: An Effective Data Augmentation Method for Practical Plant Disease Diagnosis ( http://arxiv.org/abs/2002.10100v2 ) ライセンス: Link先を確認	Quan Huu Cap, Hiroyuki Uga, Satoshi Kagiwada, and Hitoshi Iyatomi	(参考訳) 植物病の自動診断のための多くの応用が深層学習技術の成功に基づいて開発されている。しかし、これらのアプリケーションはしばしば過剰フィッティングに苦しめられ、新しい環境からテストデータセットで使用すると診断性能が劇的に低下する。本稿では,独自の注意機構を持つ新しい画像から画像への翻訳システムであるleafganを提案する。 leafganは、植物病診断の性能を向上させるためのデータ拡張ツールとして、健康な画像から変換することで、さまざまな疾患画像を生成する。注意機構により,本モデルでは,様々な背景を持つ画像から関連領域のみを変換し,トレーニング画像の汎用性を高めることができる。 5級キュウリ病分類の実験では、バニラ型サイクロンによるデータ増強は、一般化の改善に役立てられず、つまり、疾患診断性能はベースラインからわずか0.7%向上した。一方、LeafGANは診断性能を7.4%向上させた。また、LeafGANが生成した画像は、Vanilla CycleGANが生成した画像よりも品質が高く、より説得力が高いことも確認した。コードは、https://github.com/IyatomiLab/LeafGAN.comで公開されている。 Many applications for the automated diagnosis of plant disease have been developed based on the success of deep learning techniques. However, these applications often suffer from overfitting, and the diagnostic performance is drastically decreased when used on test datasets from new environments. In this paper, we propose LeafGAN, a novel image-to-image translation system with own attention mechanism. LeafGAN generates a wide variety of diseased images via transformation from healthy images, as a data augmentation tool for improving the performance of plant disease diagnosis. Thanks to its own attention mechanism, our model can transform only relevant areas from images with a variety of backgrounds, thus enriching the versatility of the training images. Experiments with five-class cucumber disease classification show that data augmentation with vanilla CycleGAN cannot help to improve the generalization, i.e., disease diagnostic performance increased by only 0.7% from the baseline. In contrast, LeafGAN boosted the diagnostic performance by 7.4%. We also visually confirmed the generated images by our LeafGAN were much better quality and more convincing than those generated by vanilla CycleGAN. The code is available publicly at: https://github.com/IyatomiLab/LeafGAN.	翻訳日:2022-12-29 04:07:12 公開日:2020-11-27
# ベルヌーイ分布とカテゴリー分布の有限混合に対する平均場ゲームモデル A Mean Field Games model for finite mixtures of Bernoulli and Categorical distributions ( http://arxiv.org/abs/2004.08119v2 ) ライセンス: Link先を確認	Laura Aquilanti, Simone Cacace, Fabio Camilli and Raul De Maio	(参考訳) 有限混合モデルは、例えばデータクラスタリングにおいて、データの統計解析において重要なツールである。混合モデルの最適パラメータは、通常、期待最大化アルゴリズムによってログ類似汎関数を最大化することで計算される。本研究では,無限個のエージェントを持つ微分ゲームのクラスである平均場ゲームの理論に基づく代替手法を提案する。有限状態空間の多乗平均場ゲームシステムの解は、ベルヌーイ混合物に対する対数類似汎関数の臨界点を特徴づける。このアプローチは、カテゴリ分布の混合モデルに一般化される。したがって、Mean Field Gamesアプローチは混合モデルのパラメータを計算する方法を提供し、クラスタ解析の標準的な例にその適用例を示す。 Finite mixture models are an important tool in the statistical analysis of data, for example in data clustering. The optimal parameters of a mixture model are usually computed by maximizing the log-likelihood functional via the Expectation-Maximization algorithm. We propose an alternative approach based on the theory of Mean Field Games, a class of differential games with an infinite number of agents. We show that the solution of a finite state space multi-population Mean Field Games system characterizes the critical points of the log-likelihood functional for a Bernoulli mixture. The approach is then generalized to mixture models of categorical distributions. Hence, the Mean Field Games approach provides a method to compute the parameters of the mixture model, and we show its application to some standard examples in cluster analysis.	翻訳日:2022-12-12 12:57:53 公開日:2020-11-27
# 関連する歩行によるグラフニューラルネットワークの高次説明 Higher-Order Explanations of Graph Neural Networks via Relevant Walks ( http://arxiv.org/abs/2006.03589v3 ) ライセンス: Link先を確認	Thomas Schnake, Oliver Eberle, Jonas Lederer, Shinichi Nakajima, Kristof T. Sch\"utt, Klaus-Robert M\"uller, Gr\'egoire Montavon	(参考訳) グラフニューラルネットワーク(GNN)は、グラフ構造化データを予測するための一般的なアプローチである。 GNNは入力グラフをニューラルネットワーク構造にしっかりと絡み合わせるため、一般的な説明可能なAIアプローチは適用できない。これまでのところ、gnnはユーザーのためにブラックボックスのままだった。本稿では,GNNが高次展開を用いて自然に説明できることを示す。実際には,各ステップにおいて,レイヤワイド関連伝搬 (LRP) などの既存手法を適用可能なネスト属性方式を用いて,そのような説明を抽出することができる。出力は、予測に関係のある入力グラフへのウォークの集合である。我々は,GNN-LRPによって表現される新しい説明法を,広範囲のグラフニューラルネットワークに適用し,テキストデータの感情分析,量子化学における構造的優位性関係,画像分類に関する実用的な知見を抽出する。 Graph Neural Networks (GNNs) are a popular approach for predicting graph structured data. As GNNs tightly entangle the input graph into the neural network structure, common explainable AI approaches are not applicable. To a large extent, GNNs have remained black-boxes for the user so far. In this paper, we show that GNNs can in fact be naturally explained using higher-order expansions, i.e. by identifying groups of edges that jointly contribute to the prediction. Practically, we find that such explanations can be extracted using a nested attribution scheme, where existing techniques such as layer-wise relevance propagation (LRP) can be applied at each step. The output is a collection of walks into the input graph that are relevant for the prediction. Our novel explanation method, which we denote by GNN-LRP, is applicable to a broad range of graph neural networks and lets us extract practically relevant insights on sentiment analysis of text data, structure-property relationships in quantum chemistry, and image classification.	翻訳日:2022-11-25 02:34:04 公開日:2020-11-27
# 球運動ダイナミクス:正規化、重減少、SGDによるニューラルネットワークの学習ダイナミクス Spherical Motion Dynamics: Learning Dynamics of Neural Network with Normalization, Weight Decay, and SGD ( http://arxiv.org/abs/2006.08419v4 ) ライセンス: Link先を確認	Ruosi Wan, Zhanxing Zhu, Xiangyu Zhang, Jian Sun	(参考訳) 本研究では、ニューラルネットワークの正規化、重崩壊(WD)、SGD(運動量)による学習のダイナミクスを包括的に明らかにし、Spherical Motion Dynamics (SMD) と名付けた。ほとんどの関連研究は、ウェイトノルムが変化しない「平衡」条件における「効果的な学習率」に焦点を当ててSMDを研究する。しかし、なぜSMDで平衡状態に到達できるかという彼らの議論は、欠如しているか、より説得力がない。本研究は平衡状態の原因を直接調査することでsmdを調査する。具体的には 1) SMDにおける平衡状態につながる仮定を導入し, 重みノルムが与えられた仮定と線形速度で収束できることを証明する。 2) SMDにおけるニューラルネットワークの進化を測定するために, 効果的な学習率の代替として「角更新」を提案し, 角更新が線形速度で理論値に収束することを示す。 3)ImageNet や MSCOCO など様々なコンピュータビジョンタスクにおける仮定と理論的結果の検証を行う。実験結果から, 理論的結果は経験的観察とよく一致した。 In this work, we comprehensively reveal the learning dynamics of neural network with normalization, weight decay (WD), and SGD (with momentum), named as Spherical Motion Dynamics (SMD). Most related works study SMD by focusing on "effective learning rate" in "equilibrium" condition, where weight norm remains unchanged. However, their discussions on why equilibrium condition can be reached in SMD is either absent or less convincing. Our work investigates SMD by directly exploring the cause of equilibrium condition. Specifically, 1) we introduce the assumptions that can lead to equilibrium condition in SMD, and prove that weight norm can converge at linear rate with given assumptions; 2) we propose "angular update" as a substitute for effective learning rate to measure the evolving of neural network in SMD, and prove angular update can also converge to its theoretical value at linear rate; 3) we verify our assumptions and theoretical results on various computer vision tasks including ImageNet and MSCOCO with standard settings. Experiment results show our theoretical findings agree well with empirical observations.	翻訳日:2022-11-21 02:31:47 公開日:2020-11-27
# 進化的貯水池計算ネットワークにおける機能分化 Functional differentiations in evolutionary reservoir computing networks ( http://arxiv.org/abs/2006.11507v2 ) ライセンス: Link先を確認	Yutaka Yamaguti and Ichiro Tsuda	(参考訳) ニューロンの機能的分化を示す拡張型貯水池コンピュータを提案する。本発明の貯水池コンピュータは,進化力学を用いて内部貯水池の変更を可能にするために開発され,これを進化貯水池コンピュータと呼ぶ。入力情報に応じて特異性を示す神経ユニットを開発するためには、内部ダイナミクスを制御し、ダイナミックスを拡張した後の収縮ダイナミクスを生成する必要がある。拡張ダイナミクスは入力情報の差を拡大するが、縮小ダイナミクスは入力情報のクラスターの形成に寄与し、複数のアトラクタを生成する。両方のダイナミクスの同時出現はカオスの存在を示している。対照的に、有限時間間隔におけるこれらのダイナミクスのシーケンシャルな出現は機能的な分化を引き起こす可能性がある。本稿では,進化的貯水池コンピュータにおいて,特定のニューロン単位がどのように得られるかを示す。 We propose an extended reservoir computer that shows the functional differentiation of neurons. The reservoir computer is developed to enable changing of the internal reservoir using evolutionary dynamics, and we call it an evolutionary reservoir computer. To develop neuronal units to show specificity, depending on the input information, the internal dynamics should be controlled to produce contracting dynamics after expanding dynamics. Expanding dynamics magnifies the difference of input information, while contracting dynamics contributes to forming clusters of input information, thereby producing multiple attractors. The simultaneous appearance of both dynamics indicates the existence of chaos. In contrast, sequential appearance of these dynamics during finite time intervals may induce functional differentiations. In this paper, we show how specific neuronal units are yielded in the evolutionary reservoir computer.	翻訳日:2022-11-18 22:46:04 公開日:2020-11-27
# 分散ネットワーク上のFew-shot学習のためのグラフプロトタイプネットワーク Graph Prototypical Networks for Few-shot Learning on Attributed Networks ( http://arxiv.org/abs/2006.12739v3 ) ライセンス: Link先を確認	Kaize Ding, Jianling Wang, Jundong Li, Kai Shu, Chenghao Liu, Huan Liu	(参考訳) 現在、名指しネットワークは、ソーシャルネットワーク分析、金融不正検出、薬物発見など、無数のハイインパクトなアプリケーションで広く使われている。属性ネットワークにおける中心的な分析課題として,ノード分類が研究コミュニティで注目されている。実世界の属性ネットワークでは、ノードクラスの大部分は限定されたラベル付きインスタンスのみを含み、ロングテールノードクラス分布を描画する。既存のノード分類アルゴリズムは、 \textit{few-shot}ノードクラスを処理できない。治療として、数発の学習が研究コミュニティで注目を集めている。しかし、ノードの分類は、以下の質問に答える必要があるため、依然として困難な問題である。 (i)数ショットノード分類のための属性ネットワークからメタ知識を抽出する方法 (ii)ロバストで効果的なモデルを構築するために各ラベル付きインスタンスのインフォメーションを識別するにはどうすればよいか? 本稿では,これらの質問に答えるために,グラフメタラーニングフレームワークであるgraph prototypical networks (gpn)を提案する。実テスト環境を模倣する半教師付きノード分類タスクのプールを構築することにより、GPNは属性ネットワーク上で \textit{meta-learning} を実行し、ターゲット分類タスクを扱うための非常に一般化可能なモデルを導出することができる。大規模な実験では、GPNが数発のノード分類において優れていることを示した。 Attributed networks nowadays are ubiquitous in a myriad of high-impact applications, such as social network analysis, financial fraud detection, and drug discovery. As a central analytical task on attributed networks, node classification has received much attention in the research community. In real-world attributed networks, a large portion of node classes only contain limited labeled instances, rendering a long-tail node class distribution. Existing node classification algorithms are unequipped to handle the \textit{few-shot} node classes. As a remedy, few-shot learning has attracted a surge of attention in the research community. Yet, few-shot node classification remains a challenging problem as we need to address the following questions: (i) How to extract meta-knowledge from an attributed network for few-shot node classification? (ii) How to identify the informativeness of each labeled instance for building a robust and effective model? To answer these questions, in this paper, we propose a graph meta-learning framework -- Graph Prototypical Networks (GPN). By constructing a pool of semi-supervised node classification tasks to mimic the real test environment, GPN is able to perform \textit{meta-learning} on an attributed network and derive a highly generalizable model for handling the target classification task. Extensive experiments demonstrate the superior capability of GPN in few-shot node classification.	翻訳日:2022-11-17 22:35:15 公開日:2020-11-27
# ゴールコンディション付き階層型予測器を用いた長期視覚計画 Long-Horizon Visual Planning with Goal-Conditioned Hierarchical Predictors ( http://arxiv.org/abs/2006.13205v2 ) ライセンス: Link先を確認	Karl Pertsch, Oleh Rybkin, Frederik Ebert, Chelsea Finn, Dinesh Jayaraman, Sergey Levine	(参考訳) 未来に予測し、計画する能力は、世界で行動するエージェントにとって基本である。遠方の目標を達成するために,まず目標に向けて粗い計画を考案し,さらに詳細を徐々に記入する軌道を複数の時間スケールで予測する。対照的に、視覚的予測と計画のための現在の学習アプローチは、(1)目標情報を考慮せずに予測し、(2)最高の時間分解能では、一度に1ステップずつ、長い水平タスクで失敗する。本研究では,これらの制約を克服可能な視覚的予測と計画のためのフレームワークを提案する。まず,目標に向かって予測する問題を定式化し,それに対応する潜在空間目標条件予測器(gcps)を提案する。 GCPは、目標に達する軌道のみに検索スペースを制約することで、計画の効率を大幅に改善する。さらに,2つの観測から観測結果を予測し,軌道の各部分を再帰的に分割することにより,gcpを階層モデルとして自然に定式化することができることを示す。この分割・分割戦略は, 長期予測に有効であり, 粗大から細かな方法で軌道を最適化する効率的な階層計画アルゴリズムを設計できる。目標条件と階層予測の両方を使用することで、GCPは以前よりもはるかに長い視野で視覚的な計画タスクを解決できることを示す。 The ability to predict and plan into the future is fundamental for agents acting in the world. To reach a faraway goal, we predict trajectories at multiple timescales, first devising a coarse plan towards the goal and then gradually filling in details. In contrast, current learning approaches for visual prediction and planning fail on long-horizon tasks as they generate predictions (1) without considering goal information, and (2) at the finest temporal resolution, one step at a time. In this work we propose a framework for visual prediction and planning that is able to overcome both of these limitations. First, we formulate the problem of predicting towards a goal and propose the corresponding class of latent space goal-conditioned predictors (GCPs). GCPs significantly improve planning efficiency by constraining the search space to only those trajectories that reach the goal. Further, we show how GCPs can be naturally formulated as hierarchical models that, given two observations, predict an observation between them, and by recursively subdividing each part of the trajectory generate complete sequences. This divide-and-conquer strategy is effective at long-term prediction, and enables us to design an effective hierarchical planning algorithm that optimizes trajectories in a coarse-to-fine manner. We show that by using both goal-conditioning and hierarchical prediction, GCPs enable us to solve visual planning tasks with much longer horizon than previously possible.	翻訳日:2022-11-17 21:25:09 公開日:2020-11-27
# 3Dの対人ロボットは人間をクローズできるか? Can 3D Adversarial Logos Cloak Humans? ( http://arxiv.org/abs/2006.14655v2 ) ライセンス: Link先を確認	Yi Wang, Jingyang Zhou, Tianlong Chen, Sijia Liu, Shiyu Chang, Chandrajit Bajaj, Zhangyang Wang	(参考訳) 敵の攻撃の傾向により、研究者たちは2Dシーンで訓練された物体探知機を騙そうと試みている。それらの多くは、現実世界で使われる可能性のある新たな攻撃形態として、画像に敵のパッチ(例えばロゴ)を付加することが挙げられる。それにもかかわらず、3dレンダリングビューからの敵意攻撃についてはあまり知られていない。本稿では, 2次元テクスチャ画像から任意の形状のロゴを構築し, ロゴ変換と呼ばれるテクスチャマッピングを用いて, この画像を3次元逆ロゴにマッピングする。結果として得られる3dの敵のロゴは、その形状と位置を容易に操作できる敵のテクスチャと見なされる。これは、コンピュータグラフィックス合成画像のための広告訓練の汎用性を大きく広げる。従来の敵対的パッチとは対照的に、この新しい攻撃形態は3Dオブジェクトの世界にマッピングされ、異なるレンダリングによって2D画像領域にバックプロパゲートされる。加えて、既存の敵のパッチとは異なり、我々の新しい3d敵ロゴは、モデル回転の下で堅牢に最先端の深層物体検出器を騙すように示されています。私たちのコードはhttps://github.com/tamu-vita/3d_adversarial_logoで利用可能です。 With the trend of adversarial attacks, researchers attempt to fool trained object detectors in 2D scenes. Among many of them, an intriguing new form of attack with potential real-world usage is to append adversarial patches (e.g. logos) to images. Nevertheless, much less have we known about adversarial attacks from 3D rendering views, which is essential for the attack to be persistently strong in the physical world. This paper presents a new 3D adversarial logo attack: we construct an arbitrary shape logo from a 2D texture image and map this image into a 3D adversarial logo via a texture mapping called logo transformation. The resulting 3D adversarial logo is then viewed as an adversarial texture enabling easy manipulation of its shape and position. This greatly extends the versatility of adversarial training for computer graphics synthesized imagery. Contrary to the traditional adversarial patch, this new form of attack is mapped into the 3D object world and back-propagates to the 2D image domain through differentiable rendering. In addition, and unlike existing adversarial patches, our new 3D adversarial logo is shown to fool state-of-the-art deep object detectors robustly under model rotations, leading to one step further for realistic attacks in the physical world. Our codes are available at https://github.com/TAMU-VITA/3D_Adversarial_Logo.	翻訳日:2022-11-17 02:45:09 公開日:2020-11-27
# 効率的なモバイルネットワーク設計のためのボトルネック構造再考 Rethinking Bottleneck Structure for Efficient Mobile Network Design ( http://arxiv.org/abs/2007.02269v4 ) ライセンス: Link先を確認	Zhou Daquan, Qibin Hou, Yunpeng Chen, Jiashi Feng, Shuicheng Yan	(参考訳) 倒立残差ブロックは最近,モバイルネットワークのアーキテクチャ設計を支配している。これは2つの設計ルールを導入することで、古典的な残留ボトルネックを変化させる。本稿では,このような設計変更の必要性を再検討し,情報損失や勾配混乱のリスクをもたらす可能性を見いだす。そこで我々は,その構造を反転させ,より高次元でのアイデンティティマッピングと空間変換を行い,情報損失と勾配混乱を効果的に緩和する,サンドグラスブロックと呼ばれる新しいボトルネック設計を提案する。大規模な実験は、一般的な信念とは異なり、そのようなボトルネック構造がモバイルネットワークの反転構造よりも有益であることを示した。 ImageNet分類では、パラメータや計算量を増やすことなく、逆残差ブロックを砂時計ブロックに置き換えることによって、MobileNetV2よりも1.7%以上精度を向上することができる。 Pascal VOC 2007 テストセットでは、対象検出において 0.9% mAP も改善されている。さらに,ニューラルネットワーク探索法dartsの探索空間に追加することにより,サンドグラスブロックの有効性をさらに検証する。 25%のパラメータ削減により、従来のdartsモデルよりも分類精度が0.13%向上した。コードは、https://github.com/zhoudaquan/rethinking_bottleneck_design.comで参照できる。 The inverted residual block is dominating architecture design for mobile networks recently. It changes the classic residual bottleneck by introducing two design rules: learning inverted residuals and using linear bottlenecks. In this paper, we rethink the necessity of such design changes and find it may bring risks of information loss and gradient confusion. We thus propose to flip the structure and present a novel bottleneck design, called the sandglass block, that performs identity mapping and spatial transformation at higher dimensions and thus alleviates information loss and gradient confusion effectively. Extensive experiments demonstrate that, different from the common belief, such bottleneck structure is more beneficial than the inverted ones for mobile networks. In ImageNet classification, by simply replacing the inverted residual block with our sandglass block without increasing parameters and computation, the classification accuracy can be improved by more than 1.7% over MobileNetV2. On Pascal VOC 2007 test set, we observe that there is also 0.9% mAP improvement in object detection. We further verify the effectiveness of the sandglass block by adding it into the search space of neural architecture search method DARTS. With 25% parameter reduction, the classification accuracy is improved by 0.13% over previous DARTS models. Code can be found at: https://github.com/zhoudaquan/rethinking_bottleneck_design.	翻訳日:2022-11-13 08:22:50 公開日:2020-11-27
# 単変量行列分布アルゴリズムは, 誤認と転移をよく表す The Univariate Marginal Distribution Algorithm Copes Well With Deception and Epistasis ( http://arxiv.org/abs/2007.08277v2 ) ライセンス: Link先を確認	Benjamin Doerr and Martin S. Krejca	(参考訳) 最近の研究で、Lehre and Nguyen (FOGA 2019) は、認知学習ブロック(DLB)問題を最適化するために、一変量境界分布アルゴリズム (UMDA) が親集団サイズで指数関数的な時間を必要とすることを示した。彼らはこの結果から、単変量EDAは偽りやてんかんに苦慮していると結論づけた。本研究では,この否定的な発見は,UMDAのパラメータを不運に選択することに起因することを示す。集団の大きさが遺伝的ドリフトを防げるほど大きく選択されると、umdaは最大$\lambda(\frac{n}{2} + 2 e \ln n)$の適合評価で高い確率でdlb問題を最適化する。子孫サイズの$\lambda$ of order $n \log n$は遺伝的ドリフトを防ぐことができるので、UMDAは$O(n^2 \log n)$フィットネス評価でDLB問題を解決することができる。対照的に、従来の進化的アルゴリズムでは、$o(n^3)$ よりも実行時間が保証されない(${(1+1)$ ea には厳密であることが証明されている)ため、umda は欺きとエピスタティスに対処できることを示唆している。より広い視点から見れば、UMDAは進化的アルゴリズムよりも局所最適に対処できることが示され、この結果は以前、コンパクトな遺伝的アルゴリズムでのみ知られていた。 Lehre と Nguyen の下位境界とともに、私たちの結果は、遺伝的ドリフトによる政権での EDA の実行が、劇的なパフォーマンス損失をもたらすことを厳格に証明した。 In their recent work, Lehre and Nguyen (FOGA 2019) show that the univariate marginal distribution algorithm (UMDA) needs time exponential in the parent populations size to optimize the DeceptiveLeadingBlocks (DLB) problem. They conclude from this result that univariate EDAs have difficulties with deception and epistasis. In this work, we show that this negative finding is caused by an unfortunate choice of the parameters of the UMDA. When the population sizes are chosen large enough to prevent genetic drift, then the UMDA optimizes the DLB problem with high probability with at most $\lambda(\frac{n}{2} + 2 e \ln n)$ fitness evaluations. Since an offspring population size $\lambda$ of order $n \log n$ can prevent genetic drift, the UMDA can solve the DLB problem with $O(n^2 \log n)$ fitness evaluations. In contrast, for classic evolutionary algorithms no better run time guarantee than $O(n^3)$ is known (which we prove to be tight for the ${(1+1)}$ EA), so our result rather suggests that the UMDA can cope well with deception and epistatis. From a broader perspective, our result shows that the UMDA can cope better with local optima than evolutionary algorithms; such a result was previously known only for the compact genetic algorithm. Together with the lower bound of Lehre and Nguyen, our result for the first time rigorously proves that running EDAs in the regime with genetic drift can lead to drastic performance losses.	翻訳日:2022-11-09 22:41:50 公開日:2020-11-27
# ヒト乳癌研究を支援する犬乳癌の完全な注釈付き全画像データセット A completely annotated whole slide image dataset of canine breast cancer to aid human breast cancer research ( http://arxiv.org/abs/2008.10244v2 ) ライセンス: Link先を確認	Marc Aubreville, Christof A. Bertram, Taryn A. Donovan, Christian Marzahl, Andreas Maier, and Robert Klopfleisch	(参考訳) 犬乳腺癌(CMC)はヒト乳癌の病理発生のモデルとして用いられており,腫瘍悪性度の評価には同様の段階が一般的である。この階調スキームの重要な構成要素は、ミトティックフィギュア(MF)の密度である。現在公開されているヒト乳癌のデータセットは、スライド画像全体の小さなサブセット(WSI)に対してのみアノテーションを提供する。 MFに完全アノテートされたCMCの21 WSIのデータセットを提案する。このために、病理学者は、潜在的なMFと似た外観の構造物の全てのWSIをスクリーニングした。第2の専門家はブラインドにラベルを割り当て、第3の専門家は最終ラベルを割り当てた。さらに,機械学習を用いて未検出のmfを同定した。最後に,アノテーションの一貫性を高めるために,表現学習と二次元投影を行った。我々のデータセットは 13,907 mf と 36,379 hard negative からなる。テストセットでは平均0.791score,ヒト乳癌データセットでは0.696scoreであった。 Canine mammary carcinoma (CMC) has been used as a model to investigate the pathogenesis of human breast cancer and the same grading scheme is commonly used to assess tumor malignancy in both. One key component of this grading scheme is the density of mitotic figures (MF). Current publicly available datasets on human breast cancer only provide annotations for small subsets of whole slide images (WSIs). We present a novel dataset of 21 WSIs of CMC completely annotated for MF. For this, a pathologist screened all WSIs for potential MF and structures with a similar appearance. A second expert blindly assigned labels, and for non-matching labels, a third expert assigned the final labels. Additionally, we used machine learning to identify previously undetected MF. Finally, we performed representation learning and two-dimensional projection to further increase the consistency of the annotations. Our dataset consists of 13,907 MF and 36,379 hard negatives. We achieved a mean F1-score of 0.791 on the test set and of up to 0.696 on a human breast cancer dataset.	翻訳日:2022-10-25 09:07:05 公開日:2020-11-27
# データ価格に関する調査:経済学からデータ科学へ A Survey on Data Pricing: from Economics to Data Science ( http://arxiv.org/abs/2009.04462v2 ) ライセンス: Link先を確認	Jian Pei	(参考訳) データの価値は低い。データの価値を客観的、体系的、定量的に評価するにはどうすればよいのか? 価格データ(一般に情報財)は、経済学、マーケティング、電子商取引、データ管理、データマイニング、機械学習など、分散した分野や原則で研究され、実践されてきた。本稿では,この重要な方向性について,学際的かつ総合的に概観する。データ価格の背景にある様々なモチベーションを調べ、データ価格の経済性を理解し、一連の基本原則に従って価格モデルの開発と進化をレビューする。デジタル製品とデータ製品の両方について論じる。また,今後の課題や方向性についても検討する。 Data are invaluable. How can we assess the value of data objectively, systematically and quantitatively? Pricing data, or information goods in general, has been studied and practiced in dispersed areas and principles, such as economics, marketing, electronic commerce, data management, data mining and machine learning. In this article, we present a unified, interdisciplinary and comprehensive overview of this important direction. We examine various motivations behind data pricing, understand the economics of data pricing and review the development and evolution of pricing models according to a series of fundamental principles. We discuss both digital products and data products. We also consider a series of challenges and directions for future work.	翻訳日:2022-10-20 09:04:24 公開日:2020-11-27
# 適切な場所に木を植える:アルゴリズム融合による樹木栽培に適した場所の推薦 Planting trees at the right places: Recommending suitable sites for growing trees using algorithm fusion ( http://arxiv.org/abs/2009.08002v2 ) ライセンス: Link先を確認	Pushpendra Rana and Lav R Varshney	(参考訳) 大規模植林は炭素削減のための低コストの自然ソリューションとして提案されてきたが、特に発展途上国ではプランテーションの場が貧弱なため妨げられている。サイト選択を支援するため,物理に基づく伝統的な林業科学知識と機械学習を組み合わせたアルゴリズム融合に基づくePSAレコメンデーションシステムを開発した。 ePSAは、森林地帯内のブランクパッチを識別し、木の成長ポテンシャルに基づいて各パッチをランク付けすることで、森林範囲の役員を支援する。実験, ユーザスタディ, 展開の結果は, 北インド以北における炭素削減のための自然環境ソリューションとして, 樹木プランテーションの長期的成功を形作る上で, 推奨システムの有用性を特徴づけている。 Large-scale planting of trees has been proposed as a low-cost natural solution for carbon mitigation, but is hampered by poor selection of plantation sites, especially in developing countries. To aid in site selection, we develop the ePSA (e-Plantation Site Assistant) recommendation system based on algorithm fusion that combines physics-based/traditional forestry science knowledge with machine learning. ePSA assists forest range officers by identifying blank patches inside forest areas and ranking each such patch based on their tree growth potential. Experiments, user studies, and deployment results characterize the utility of the recommender system in shaping the long-term success of tree plantations as a nature climate solution for carbon mitigation in northern India and beyond.	翻訳日:2022-10-17 11:55:32 公開日:2020-11-27
# 画素からの連続制御における視覚的一般化の測定 Measuring Visual Generalization in Continuous Control from Pixels ( http://arxiv.org/abs/2010.06740v2 ) ライセンス: Link先を確認	Jake Grigsby, Yanjun Qi	(参考訳) 自己教師付き学習とデータ拡張は、連続制御タスクにおける状態と画像に基づく強化学習エージェントのパフォーマンスギャップを著しく減らした。しかし、現在の技術が現実世界の環境に要求される様々な視覚的条件に直面することができるかどうかはまだ不明である。本稿では,既存の連続制御領域にグラフィカルな多様性を加えることで,エージェントの視覚的一般化を検証できる挑戦的なベンチマークを提案する。実験結果から,現在の手法では様々な視覚変化の一般化が困難であり,これらのタスクを困難にさせる変動の具体的要因について検討した。データ拡張技術は自己教師あり学習手法より優れており、より重要な画像変換によってより優れた視覚的一般化が実現されていることが分かりました。 Self-supervised learning and data augmentation have significantly reduced the performance gap between state and image-based reinforcement learning agents in continuous control tasks. However, it is still unclear whether current techniques can face a variety of visual conditions required by real-world environments. We propose a challenging benchmark that tests agents' visual generalization by adding graphical variety to existing continuous control domains. Our empirical analysis shows that current methods struggle to generalize across a diverse set of visual changes, and we examine the specific factors of variation that make these tasks difficult. We find that data augmentation techniques outperform self-supervised learning approaches and that more significant image transformations provide better visual generalization \footnote{The benchmark and our augmented actor-critic implementation are open-sourced @ https://github.com/QData/dmc_remastered)	翻訳日:2022-10-07 22:36:49 公開日:2020-11-27
# メタグラディエントD4PGによるバランシング制約とリワード Balancing Constraints and Rewards with Meta-Gradient D4PG ( http://arxiv.org/abs/2010.06324v2 ) ライセンス: Link先を確認	Dan A. Calian and Daniel J. Mankowitz and Tom Zahavy and Zhongwen Xu and Junhyuk Oh and Nir Levine and Timothy Mann	(参考訳) 現実世界のアプリケーションを解決するためにRLエージェントを配置するには、複雑なシステムの制約を満たす必要があることが多い。しばしば制約しきい値は、システムの複雑な性質や、オフラインでしきい値を検証することができない(例えば、シミュレータや合理的なオフライン評価手順は存在しない)ために誤って設定される。これにより、制約に違反することなくタスクを解決できない解が得られる。しかし、現実の多くのケースでは制約違反は望ましくないが、それらは破滅的なものではなく、ソフト制約されたRLアプローチの必要性を動機付けている。本稿では,制約違反の最小化と期待リターンとの良好なトレードオフを見つけるために,メタグラディエンスを利用するソフトコンストレートrl手法を提案する。このアプローチの有効性は、4つの異なる MuJoCo ドメインのベースラインを一貫して上回ることを示すことで実証する。 Deploying Reinforcement Learning (RL) agents to solve real-world applications often requires satisfying complex system constraints. Often the constraint thresholds are incorrectly set due to the complex nature of a system or the inability to verify the thresholds offline (e.g, no simulator or reasonable offline evaluation procedure exists). This results in solutions where a task cannot be solved without violating the constraints. However, in many real-world cases, constraint violations are undesirable yet they are not catastrophic, motivating the need for soft-constrained RL approaches. We present a soft-constrained RL approach that utilizes meta-gradients to find a good trade-off between expected return and minimizing constraint violations. We demonstrate the effectiveness of this approach by showing that it consistently outperforms the baselines across four different MuJoCo domains.	翻訳日:2022-10-07 22:26:11 公開日:2020-11-27
# D2RL:強化学習における深度アーキテクチャ D2RL: Deep Dense Architectures in Reinforcement Learning ( http://arxiv.org/abs/2010.09163v2 ) ライセンス: Link先を確認	Samarth Sinha, Homanga Bharadhwaj, Aravind Srinivas, Animesh Garg	(参考訳) ディープラーニングアーキテクチャの改善は、コンピュータビジョンや自然言語処理における教師付きおよび教師なし学習の状況を改善する上で重要な役割を担っているが、強化学習のためのニューラルネットワークアーキテクチャの選択は、いまだに未熟である。コンピュータビジョンと生成モデルにおけるアーキテクチャ選択の成功からインスピレーションを得て,様々なロボット学習ベンチマーク環境において,より深いネットワークと高密度接続を用いて強化学習を行う。以上の結果から,現状の手法は,密接な接続や深いネットワーク,操作やロコモーションタスクのスイートを通じて,固有認識と画像に基づく観察の両方において有益であることが判明した。私たちの成果が強力なベースラインとして機能し、強化学習のためのニューラルネットワークアーキテクチャに関するさらなる研究の動機になることを期待しています。コード付きプロジェクトのwebサイトは、このリンクhttps://sites.google.com/view/d2rl/home.comにある。 While improvements in deep learning architectures have played a crucial role in improving the state of supervised and unsupervised learning in computer vision and natural language processing, neural network architecture choices for reinforcement learning remain relatively under-explored. We take inspiration from successful architectural choices in computer vision and generative modelling, and investigate the use of deeper networks and dense connections for reinforcement learning on a variety of simulated robotic learning benchmark environments. Our findings reveal that current methods benefit significantly from dense connections and deeper networks, across a suite of manipulation and locomotion tasks, for both proprioceptive and image-based observations. We hope that our results can serve as a strong baseline and further motivate future research into neural network architectures for reinforcement learning. The project website with code is at this link https://sites.google.com/view/d2rl/home.	翻訳日:2022-10-05 22:24:32 公開日:2020-11-27
# 脳腫瘍分離のためのコンテキスト認識3D UNet Context Aware 3D UNet for Brain Tumor Segmentation ( http://arxiv.org/abs/2010.13082v2 ) ライセンス: Link先を確認	Parvez Ahmad, Saqib Qamar, Linlin Shen, Adnan Saeed	(参考訳) 深部畳み込みニューラルネットワーク(CNN)は医用画像解析において顕著な性能を発揮する。 UNetは、脳腫瘍セグメンテーションを含む医療画像タスクのための3D CNNアーキテクチャのパフォーマンスの主要なソースである。 UNetアーキテクチャのスキップ接続は、エンコーダとデコーダの経路から特徴を結合し、画像データから複数のコンテキスト情報を抽出する。マルチスケールの特徴は、脳腫瘍のセグメンテーションにおいて重要な役割を果たす。しかし、機能の使用制限により、セグメンテーションのためのUNetアプローチのパフォーマンスが低下する可能性がある。本稿では,脳腫瘍分割のためのunetアーキテクチャの改良を提案する。提案アーキテクチャでは,エンコーダとデコーダの経路に密結合したブロックを用いて,特徴再利用性の概念から複数のコンテキスト情報を抽出する。さらに、異なるカーネルサイズの特徴をマージすることで、ローカルおよびグローバル情報を抽出するために、残差インセプションブロック(RIB)が使用される。提案アーキテクチャをbrats(multi-modal brain tumor segmentation challenge) 2020年テストデータセットで検証した。全腫瘍(wt)、腫瘍コア(tc)、増強腫瘍(et)のdice(dsc)スコアはそれぞれ89.12%、84.74%、79.12%である。 Deep convolutional neural network (CNN) achieves remarkable performance for medical image analysis. UNet is the primary source in the performance of 3D CNN architectures for medical imaging tasks, including brain tumor segmentation. The skip connection in the UNet architecture concatenates features from both encoder and decoder paths to extract multi-contextual information from image data. The multi-scaled features play an essential role in brain tumor segmentation. However, the limited use of features can degrade the performance of the UNet approach for segmentation. In this paper, we propose a modified UNet architecture for brain tumor segmentation. In the proposed architecture, we used densely connected blocks in both encoder and decoder paths to extract multi-contextual information from the concept of feature reusability. In addition, residual-inception blocks (RIB) are used to extract the local and global information by merging features of different kernel sizes. We validate the proposed architecture on the multi-modal brain tumor segmentation challenge (BRATS) 2020 testing dataset. The dice (DSC) scores of the whole tumor (WT), tumor core (TC), and enhancing tumor (ET) are 89.12%, 84.74%, and 79.12%, respectively.	翻訳日:2022-10-03 04:56:45 公開日:2020-11-27
# 自己集合型3次元U-netニューラルネットワークによる脳腫瘍セグメント形成 : BraTS 2020チャレンジソリューション Brain tumor segmentation with self-ensembled, deeply-supervised 3D U-net neural networks: a BraTS 2020 challenge solution ( http://arxiv.org/abs/2011.01045v2 ) ライセンス: Link先を確認	Theophraste Henry, Alexandre Carre, Marvin Lerousseau, Theo Estienne, Charlotte Robert, Nikos Paragios, Eric Deutsch	(参考訳) 脳腫瘍のセグメンテーションは患者の疾患管理にとって重要な課題である。このタスクの自動化と標準化のために,我々は,マルチモーダル脳腫瘍分割課題(brats,multimodal brain tumor segmentation challenge,brats)2020のトレーニングデータセットに基づいて,深層監視と確率的重み平均化を中心に,複数のu-netライクなニューラルネットワークをトレーニングした。 2つの異なる訓練パイプラインからモデルの2つの独立したアンサンブルを訓練し、それぞれに脳腫瘍のセグメンテーションマップを作成した。これらの2つのラベルマップは、特定の腫瘍部分領域に対するそれぞれのアンサンブルのパフォーマンスを考慮してマージされた。試験時間増加を伴うオンライン検証データセットの性能は,0.81,0.91,0.85,Hausdorff(95%),20.6,4,3,5.7mm,全腫瘍コア,腫瘍コアの順であった。同様に、私たちのソリューションはDiceの0.79、0.89、0.84、最終テストデータセットでHausdorff(95%)の20.4、6.7、19.5mmを達成し、トップ10チームの中でランク付けしました。より複雑なトレーニングスキームとニューラルネットワークアーキテクチャを、トレーニング時間を大幅に増加させるコストで、大幅なパフォーマンス向上なしに調査した。以上より,各腫瘍亜領域の成績は良好で,バランスが良好であった。私たちのソリューションはhttps://github.com/lescientifik/open_brats2020でオープンソースです。 Brain tumor segmentation is a critical task for patient's disease management. In order to automate and standardize this task, we trained multiple U-net like neural networks, mainly with deep supervision and stochastic weight averaging, on the Multimodal Brain Tumor Segmentation Challenge (BraTS) 2020 training dataset. Two independent ensembles of models from two different training pipelines were trained, and each produced a brain tumor segmentation map. These two labelmaps per patient were then merged, taking into account the performance of each ensemble for specific tumor subregions. Our performance on the online validation dataset with test time augmentation were as follows: Dice of 0.81, 0.91 and 0.85; Hausdorff (95%) of 20.6, 4,3, 5.7 mm for the enhancing tumor, whole tumor and tumor core, respectively. Similarly, our solution achieved a Dice of 0.79, 0.89 and 0.84, as well as Hausdorff (95%) of 20.4, 6.7 and 19.5mm on the final test dataset, ranking us among the top ten teams. More complicated training schemes and neural network architectures were investigated without significant performance gain at the cost of greatly increased training time. Overall, our approach yielded good and balanced performance for each tumor subregion. Our solution is open sourced at https://github.com/lescientifik/open_brats2020.	翻訳日:2022-10-01 17:19:29 公開日:2020-11-27
# 歪み画像復元のための生成的・識別的学習 Generative and Discriminative Learning for Distorted Image Restoration ( http://arxiv.org/abs/2011.05784v3 ) ライセンス: Link先を確認	Yi Gu, Yuting Gao, Jie Li, Chentao Wu, Weijia Jia	(参考訳) Liquifyは画像編集の一般的な技術であり、画像の歪みに使用できる。歪み変動の不確実性のため, 液状化フィルタによる歪み画像の復元は難しい課題である。画像を効率よく編集するには、歪んだ画像を自動的に復元することが期待される。本稿では、歪み画像の適切な歪みと完了を求めることで特徴付けられる歪み画像復元を目的とする。既存の手法は、自然現象によって生じる特定の規則的変形を解決するためのハードウェアアシストや幾何学原理に重点を置いているが、この課題における人工歪みの不規則性や不確実性には対処できない。そこで本研究では,深層ニューラルネットワークに基づく新しい生成・判別学習法を提案し,様々な再構成マッピングを学習し,複雑で高次元のデータを表現する。この方法は、タスクを整流段階と精練段階とに分解する。第1段階生成ネットワークは、歪み画像から補正画像へのマッピングを予測する。第二段階生成ネットワークはさらに知覚品質を最適化する。このタスクを探索するデータセットやベンチマークがないため、CelebAデータセットに基づいた前方歪みマッピングにより、Distorted Face Dataset(DFD)を作成します。提案ベンチマークの広範な実験評価を行い,本手法が画像復元に有効な方法であることを実証した。 Liquify is a common technique for image editing, which can be used for image distortion. Due to the uncertainty in the distortion variation, restoring distorted images caused by liquify filter is a challenging task. To edit images in an efficient way, distorted images are expected to be restored automatically. This paper aims at the distorted image restoration, which is characterized by seeking the appropriate warping and completion of a distorted image. Existing methods focus on the hardware assistance or the geometric principle to solve the specific regular deformation caused by natural phenomena, but they cannot handle the irregularity and uncertainty of artificial distortion in this task. To address this issue, we propose a novel generative and discriminative learning method based on deep neural networks, which can learn various reconstruction mappings and represent complex and high-dimensional data. This method decomposes the task into a rectification stage and a refinement stage. The first stage generative network predicts the mapping from the distorted images to the rectified ones. The second stage generative network then further optimizes the perceptual quality. Since there is no available dataset or benchmark to explore this task, we create a Distorted Face Dataset (DFD) by forward distortion mapping based on CelebA dataset. Extensive experimental evaluation on the proposed benchmark and the application demonstrates that our method is an effective way for distorted image restoration.	翻訳日:2022-09-27 00:51:59 公開日:2020-11-27
# 安全合成サンス仕様 Safety Synthesis Sans Specification ( http://arxiv.org/abs/2011.07630v2 ) ライセンス: Link先を確認	Roderick Bloem and Hana Chockler and Masoud Ebrahimi and Dana Fisman and Heinz Riener	(参考訳) 我々は、メンバシップクエリと推測クエリを使用して、競合する可能性のあるトランスデューサを含むターゲット言語である$u$から、transducer ${s}$を学習する問題を定義する。要件は${S}$の言語が$U$のサブセットであることである。ハードウェアおよびソフトウェア検証の多くの状況において、これは自然な問題である、と私たちは主張する。本稿では,この問題に対する学習アルゴリズムを考案し,その時間と問合せの複雑さが,対象言語のランク,非互換性尺度,与えられた反例の最大長に関して多項式であることを示す。本稿では,プロトタイプによる実験について報告する。 We define the problem of learning a transducer ${S}$ from a target language $U$ containing possibly conflicting transducers, using membership queries and conjecture queries. The requirement is that the language of ${S}$ be a subset of $U$. We argue that this is a natural question in many situations in hardware and software verification. We devise a learning algorithm for this problem and show that its time and query complexity is polynomial with respect to the rank of the target language, its incompatibility measure, and the maximal length of a given counterexample. We report on experiments conducted with a prototype implementation.	翻訳日:2022-09-25 07:50:44 公開日:2020-11-27
# ニューラルネットのハイパーパラメータ最適化に対する集団ベースハイブリッドアプローチ A Population-based Hybrid Approach to Hyperparameter Optimization for Neural Networks ( http://arxiv.org/abs/2011.11062v2 ) ライセンス: Link先を確認	Marcello Serqueira, Pedro Gonz\'alez, Eduardo Bezerra	(参考訳) 近年、大量のデータが生成され、コンピュータの電力は増え続けている。このシナリオは、人工ニューラルネットワークへの関心の復活につながった。効果的なニューラルネットワークモデルのトレーニングにおける大きな課題のひとつは、使用するハイパーパラメータの適切な組み合わせを見つけることだ。実際、ハイパーパラメータ空間を探索するための適切なアプローチの選択は、結果のニューラルネットワークモデルの精度に直接影響する。ハイパーパラメータ最適化の一般的なアプローチは、グリッド探索、ランダム探索、ベイズ最適化である。また、CMA-ESのような人口ベースの方法もある。本稿では,ハイパーパラメータ最適化のための新しい集団ベースアプローチであるHBRKGAを提案する。 HBRKGAは、Biased Random Key Genetic AlgorithmとRandom Walk技術を組み合わせて、ハイパーパラメータ空間を効率的に探索するハイブリッドアプローチである。提案手法の有効性を評価するため、8つの異なるデータセットに関するいくつかの計算実験を行った。その結果、HBRKGAは8つのデータセットのうち6つで(予測品質の観点から)ベースラインメソッドよりも優れたハイパーパラメータ構成を見出すことができた。 In recent years, large amounts of data have been generated, and computer power has kept growing. This scenario has led to a resurgence in the interest in artificial neural networks. One of the main challenges in training effective neural network models is finding the right combination of hyperparameters to be used. Indeed, the choice of an adequate approach to search the hyperparameter space directly influences the accuracy of the resulting neural network model. Common approaches for hyperparameter optimization are Grid Search, Random Search, and Bayesian Optimization. There are also population-based methods such as CMA-ES. In this paper, we present HBRKGA, a new population-based approach for hyperparameter optimization. HBRKGA is a hybrid approach that combines the Biased Random Key Genetic Algorithm with a Random Walk technique to search the hyperparameter space efficiently. Several computational experiments on eight different datasets were performed to assess the effectiveness of the proposed approach. Results showed that HBRKGA could find hyperparameter configurations that outperformed (in terms of predictive quality) the baseline methods in six out of eight datasets while showing a reasonable execution time.	翻訳日:2022-09-22 12:08:21 公開日:2020-11-27
# シリコンナノパターンデジタルメタマテリアルによる超コンパクト集積フォトニクスを実現する機械学習 Machine Learning enables Ultra-Compact Integrated Photonics through Silicon-Nanopattern Digital Metamaterials ( http://arxiv.org/abs/2011.11754v2 ) ライセンス: Link先を確認	Sourangsu Banerji, Apratim Majumder, Alex Hamrick, Rajesh Menon, and Berardi Sensale-Rodriguez	(参考訳) 本研究では,有限差分時間領域(FDTD)モデリングと組み合わせた機械学習アルゴリズムを用いて設計した3つの超コンパクト集積フォトニクスデバイスを実演する。デザインドメインを"バイナリピクセル"にデジタイズすることで、これらのデジタルメタマテリアルも容易に製造できる。様々なデバイス(ビームスプリッターと導波路の屈曲)を提示することにより、我々のアプローチの一般性を示す。エリアのフットプリントが${\lambda_0}^2$より小さいので、私たちのデザインは報告されている中では最小です。我々の手法は、機械学習とデジタルメタマテリアルを組み合わせることで、超コンパクトで製造可能なデバイスを可能にし、新しい「フォトニクスムーアの法則」を推進できる。 In this work, we demonstrate three ultra-compact integrated-photonics devices, which are designed via a machine-learning algorithm coupled with finite-difference time-domain (FDTD) modeling. Through digitizing the design domain into "binary pixels" these digital metamaterials are readily manufacturable as well. By showing a variety of devices (beamsplitters and waveguide bends), we showcase the generality of our approach. With an area footprint smaller than ${\lambda_0}^2$, our designs are amongst the smallest reported to-date. Our method combines machine learning with digital metamaterials to enable ultra-compact, manufacturable devices, which could power a new "Photonics Moore's Law."	翻訳日:2022-09-22 03:30:07 公開日:2020-11-27
# 顔動作単位検出のための差分注意マップを用いた計算効率の高い深層ニューラルネットワーク Computational efficient deep neural network with difference attention maps for facial action unit detection ( http://arxiv.org/abs/2011.12082v2 ) ライセンス: Link先を確認	Jing Chen, Chenhui Wang, Kejun Wang, Meichen Liu	(参考訳) 本稿では、差分画像に基づく計算効率のよいエンドツーエンドトレーニング深層ニューラルネットワーク(cednn)モデルと空間注意マップを提案する。まず、画像処理により差分画像を生成する。次に、空間的注意マップとして使用される異なるしきい値を用いて、差分画像の5つのバイナリ画像を得る。モデルの複雑さを減らすためにグループ畳み込みを使用します。 skip connectionと$\text{1}\times \text{1}$ convolutionは、ネットワークモデルが深くない場合でも優れたパフォーマンスを保証するために使用される。入力として、各ブロックの入力に空間注意マップを選択的に供給することができる。フィーチャーマップは、ターゲットタスクにもっと関係のある部分にフォーカスする傾向があります。さらに、異なる数のAUを訓練するために分類器のパラメータを調整する必要がある。計算量を増やすことなく、さまざまなデータセットに容易に拡張できる。多くの実験結果から、提案したCEDNNは明らかにdisFA+およびCK+データセットの従来のディープラーニング手法よりも優れていることが示されている。空間的注意マップを付加すると、最も先進的なAU検出法よりも優れた結果が得られる。同時に、ネットワークの規模が小さく、走行速度が速く、実験機器の要件も低い。 In this paper, we propose a computational efficient end-to-end training deep neural network (CEDNN) model and spatial attention maps based on difference images. Firstly, the difference image is generated by image processing. Then five binary images of difference images are obtained using different thresholds, which are used as spatial attention maps. We use group convolution to reduce model complexity. Skip connection and $\text{1}\times \text{1}$ convolution are used to ensure good performance even if the network model is not deep. As an input, spatial attention map can be selectively fed into the input of each block. The feature maps tend to focus on the parts that are related to the target task better. In addition, we only need to adjust the parameters of classifier to train different numbers of AU. It can be easily extended to varying datasets without increasing too much computation. A large number of experimental results show that the proposed CEDNN is obviously better than the traditional deep learning method on DISFA+ and CK+ datasets. After adding spatial attention maps, the result is better than the most advanced AU detection method. At the same time, the scale of the network is small, the running speed is fast, and the requirement for experimental equipment is low.	翻訳日:2022-09-21 13:38:20 公開日:2020-11-27
# 物理に着想を得た圧縮センシングのためのスパース構造学習 Learning sparse structures for physics-inspired compressed sensing ( http://arxiv.org/abs/2011.12831v2 ) ライセンス: Link先を確認	Cl\'ement Dorffer, Thomas Paviet-Salomon, Gilles Le Chenadec and Ang\'elique Dr\'emeau	(参考訳) 水中音響学では、浅い水環境は低周波源を考えるときにモード分散導波路として機能する。この文脈では、伝播信号は少数のモーダル成分の和として記述され、それぞれが自身の波動数に従って伝播する。これらの波数の推定は、伝播環境や放出源を理解する上で重要な関心事である。この問題を解決するために、我々は最近ベイズ的アプローチを提案している。広帯域ソースを扱う場合、波数を一方の周波数からもう一方の周波数にリンクする特定の依存性を統合することで、このモデルをさらに改善することができる。そこで本稿では,汎用構造スパルサリティ情報モデルとして活用される制限付きボルツマンマシンを用いた新しい手法を提案する。このモデルはディープベイズネットワークから派生したもので、よく知られた、証明されたアルゴリズムを用いて、物理的に現実的なシミュレーションデータで効率的に学習することができる。 In underwater acoustics, shallow water environments act as modal dispersive waveguides when considering low-frequency sources. In this context, propagating signals can be described as a sum of few modal components, each of them propagating according to its own wavenumber. Estimating these wavenumbers is of key interest to understand the propagating environment as well as the emitting source. To solve this problem, we proposed recently a Bayesian approach exploiting a sparsity-inforcing prior. When dealing with broadband sources, this model can be further improved by integrating the particular dependence linking the wavenumbers from one frequency to the other. In this contribution, we propose to resort to a new approach relying on a restricted Boltzmann machine, exploited as a generic structured sparsity-inforcing model. This model, derived from deep Bayesian networks, can indeed be efficiently learned on physically realistic simulated data using well-known and proven algorithms.	翻訳日:2022-09-21 04:03:50 公開日:2020-11-27
# 複数の文脈依存タスクの自律学習 Autonomous learning of multiple, context-dependent tasks ( http://arxiv.org/abs/2011.13847v1 ) ライセンス: Link先を確認	Vieri Giuliano Santucci and Davide Montella and Bruno Castro da Silva and Gianluca Baldassarre	(参考訳) 強化学習システムで複数のタスクを自律的に学習する問題に直面している場合、研究者は通常、タスクごとにひとつのパラメトリドポリシーだけで解決できるソリューションに焦点を当てる。しかし、異なるコンテキストを示す複雑な環境では、同じタスクは解決すべき異なるスキルセットを必要とするかもしれない。これらの状況は2つの課題をもたらします (a)異なる方針を必要とする異なる文脈を認識すること b) 新しい発見されたコンテキストにおいて、同じタスクを達成するためのポリシーをすばやく学習する。この2つの課題は、エージェントが与えられた環境で達成される可能性のある目標を自律的に発見し、それを達成するためのモータースキルを学ぶ、オープンエンドの学習フレームワークに直面する場合、さらに困難である。本稿では,2つの課題を統合的に解決するオープンエンド学習ロボットアーキテクチャC-GRAILを提案する。特に、アーキテクチャは、与えられた目標に対する期待性能の低下に基づいて、新しい関連するコンテストを検出し、無関係なコンペを無視することができる。さらに、アーキテクチャは、既に取得したポリシーから知識をインポートする転送学習を利用して、新しいコンテキストのポリシーをすばやく学習することができる。このアーキテクチャは、いくつかの異なる障害物を発生させる複数の障害物の存在下で、自律的に対象物に到達することを学習するロボットを含むシミュレーションロボット環境でテストされる。提案したアーキテクチャは、提案した自律的文脈発見および伝達学習機構を使用しない他のモデルよりも優れている。 When facing the problem of autonomously learning multiple tasks with reinforcement learning systems, researchers typically focus on solutions where just one parametrised policy per task is sufficient to solve them. However, in complex environments presenting different contexts, the same task might need a set of different skills to be solved. These situations pose two challenges: (a) to recognise the different contexts that need different policies; (b) quickly learn the policies to accomplish the same tasks in the new discovered contexts. These two challenges are even harder if faced within an open-ended learning framework where an agent has to autonomously discover the goals that it might accomplish in a given environment, and also to learn the motor skills to accomplish them. We propose a novel open-ended learning robot architecture, C-GRAIL, that solves the two challenges in an integrated fashion. In particular, the architecture is able to detect new relevant contests, and ignore irrelevant ones, on the basis of the decrease of the expected performance for a given goal. Moreover, the architecture can quickly learn the policies for the new contexts by exploiting transfer learning importing knowledge from already acquired policies. The architecture is tested in a simulated robotic environment involving a robot that autonomously learns to reach relevant target objects in the presence of multiple obstacles generating several different obstacles. The proposed architecture outperforms other models not using the proposed autonomous context-discovery and transfer-learning mechanisms.	翻訳日:2022-09-20 02:57:09 公開日:2020-11-27
# 人中心データセット作成のための倫理的ハイライト An Ethical Highlighter for People-Centric Dataset Creation ( http://arxiv.org/abs/2011.13583v1 ) ライセンス: Link先を確認	Margot Hanley, Apoorv Khandelwal, Hadar Averbuch-Elor, Noah Snavely and Helen Nissenbaum	(参考訳) 人々のコンピュータビジョンデータセットから生じる重要な倫理的な懸念が注目されており、結果として多くのデータセットが削除されている。人中心データセットの学術的ニーズを満たすため、既存のデータセットの倫理的評価をガイドし、ミスステップを避けるために将来のデータセット作成者を支援するための分析フレームワークを提案する。我々の研究は、先行研究のレビューと分析によって知らされ、そのような倫理的課題が生じる場所を強調します。 Important ethical concerns arising from computer vision datasets of people have been receiving significant attention, and a number of datasets have been withdrawn as a result. To meet the academic need for people-centric datasets, we propose an analytical framework to guide ethical evaluation of existing datasets and to serve future dataset creators in avoiding missteps. Our work is informed by a review and analysis of prior works and highlights where such ethical challenges arise.	翻訳日:2022-09-20 02:56:09 公開日:2020-11-27
# 深さ2 reluネットワークのタイトな硬さ評価 Tight Hardness Results for Training Depth-2 ReLU Networks ( http://arxiv.org/abs/2011.13550v1 ) ライセンス: Link先を確認	Surbhi Goel, Adam Klivans, Pasin Manurangsi, Daniel Reichman	(参考訳) ReLU活性化関数を用いた深度2ニューラルネットのトレーニングにおいて,これらのネットワークは単にReLUの重み付き和(負の係数を含むかもしれない)であることを示す。我々の目標は、与えられたトレーニングセットに対する平方損失を最小限に抑えるディープ2ニューラルネットワークを出力することです。この問題は1つのReLUを持つネットワークに対して既にNPハードであることが証明されている。また、2乗誤差を最小化($k>1$)する重み付き和$k$ReLUを、実現可能な設定(ラベルが未知の深さ-2 ReLUネットワークと整合である場合)で出力するNP硬度も証明する。また、所望の加算誤差$\epsilon$という観点で、実行時の下限を得ることができる。下界を得るには、Gap Exponential Time hypothesis (Gap-ETH) を用いるとともに、既知のDensest $\kappa$-Subgraph 問題を半周期時間で近似する難しさに関する新たな仮説を用いる(これらの仮説は異なる下界の証明に別々に使用される)。例えば、妥当な硬さ仮定の下では、最適なReLUを見つけるための適切な学習アルゴリズムは、1/\epsilon^2$で指数関数的に実行しなければならない。 ReLUを不適切に学習する以前の研究(Goel et al., COLT'17)とともに、これはReLUを学習するための適切なアルゴリズムと不適切なアルゴリズムを最初に分離することを意味する。また,有界重みを持つrelusの深さ2ネットワークを適切に学習することで,実現可能かつ不可知な環境での学習に必要な実行時間(worst-case)の上限を新たに与える問題についても検討した。ランニングタイム上の上限は、基本的に$\epsilon$への依存性の観点から下限と一致する。 We prove several hardness results for training depth-2 neural networks with the ReLU activation function; these networks are simply weighted sums (that may include negative coefficients) of ReLUs. Our goal is to output a depth-2 neural network that minimizes the square loss with respect to a given training set. We prove that this problem is NP-hard already for a network with a single ReLU. We also prove NP-hardness for outputting a weighted sum of $k$ ReLUs minimizing the squared error (for $k>1$) even in the realizable setting (i.e., when the labels are consistent with an unknown depth-2 ReLU network). We are also able to obtain lower bounds on the running time in terms of the desired additive error $\epsilon$. To obtain our lower bounds, we use the Gap Exponential Time Hypothesis (Gap-ETH) as well as a new hypothesis regarding the hardness of approximating the well known Densest $\kappa$-Subgraph problem in subexponential time (these hypotheses are used separately in proving different lower bounds). For example, we prove that under reasonable hardness assumptions, any proper learning algorithm for finding the best fitting ReLU must run in time exponential in $1/\epsilon^2$. Together with a previous work regarding improperly learning a ReLU (Goel et al., COLT'17), this implies the first separation between proper and improper algorithms for learning a ReLU. We also study the problem of properly learning a depth-2 network of ReLUs with bounded weights giving new (worst-case) upper bounds on the running time needed to learn such networks both in the realizable and agnostic settings. Our upper bounds on the running time essentially matches our lower bounds in terms of the dependency on $\epsilon$.	翻訳日:2022-09-20 02:56:01 公開日:2020-11-27
# swyftによるシミュレーション効率のよい後方後方推定: 貴重な時間を無駄にしない Simulation-efficient marginal posterior estimation with swyft: stop wasting your precious time ( http://arxiv.org/abs/2011.13951v1 ) ライセンス: Link先を確認	Benjamin Kurt Miller, Alex Cole, Gilles Louppe, Christoph Weniger	(参考訳) アルゴリズムを紹介します (a)ネスト型神経電位-証拠比推定、及び b) パラメータの非均一なポアソン点プロセスキャッシュとそれに対応するシミュレーションによるシミュレーションの再利用。これらのアルゴリズムが組み合わさって、縁および関節後部の自動的および極めてシミュレーターな推定を可能にする。アルゴリズムは物理学や天文学の幅広い問題に適用でき、通常、従来の確率に基づくサンプリング法よりもはるかに優れたシミュレータ効率を提供する。提案手法は確率自由推論の一例であり, トラクタブルな確率関数を提供しないシミュレータにも適用可能である。シミュレータの実行は決して拒否されず、将来の分析で自動的に再利用できる。機能的なプロトタイプ実装として、オープンソースのソフトウェアパッケージswyftを提供しています。 We present algorithms (a) for nested neural likelihood-to-evidence ratio estimation, and (b) for simulation reuse via an inhomogeneous Poisson point process cache of parameters and corresponding simulations. Together, these algorithms enable automatic and extremely simulator efficient estimation of marginal and joint posteriors. The algorithms are applicable to a wide range of physics and astronomy problems and typically offer an order of magnitude better simulator efficiency than traditional likelihood-based sampling methods. Our approach is an example of likelihood-free inference, thus it is also applicable to simulators which do not offer a tractable likelihood function. Simulator runs are never rejected and can be automatically reused in future analysis. As functional prototype implementation we provide the open-source software package swyft.	翻訳日:2022-09-20 02:54:56 公開日:2020-11-27
# 学際モデル構築のための方法論--モデルから調査へ- A methodology for co-constructing an interdisciplinary model: from model to survey, from survey to model ( http://arxiv.org/abs/2011.13604v1 ) ライセンス: Link先を確認	Elise Beck, Julie Dugdale, Carole Adam, Christelle Ga\"idatzis, Julius Ba\~ngate	(参考訳) コンピュータ科学と社会科学はどのように協力して共通のモデルを構築するべきか? モデリングに本当に役立つデータを集めるには、どうすればよいのか? ターゲットモデルに合わせた調査をどのように設計すればよいのか? 本稿では,多分野研究プロジェクトの枠組みにおけるこれらの重要な疑問に答えることを目的とする。本研究は,地震発生直後の人間の行動のモデル化に応用され,複数の分野が関与する場合のモデル構築の課題について述べる。本研究の主な貢献は,多分野対話専用のツールの提案である。また、関連する異なる分野によって実行される豊かな知的過程の反射的分析を提案する。最後に、人類学者との協働から、多分野のプロセスの補完的な視点が与えられる。 How should computer science and social science collaborate to build a common model? How should they proceed to gather data that is really useful to the modelling? How can they design a survey that is tailored to the target model? This paper aims to answer those crucial questions in the framework of a multidisciplinary research project. This research addresses the issue of co-constructing a model when several disciplines are involved, and is applied to modelling human behaviour immediately after an earthquake. The main contribution of the work is to propose a tool dedicated to multidisciplinary dialogue. It also proposes a reflexive analysis of the enriching intellectual process carried out by the different disciplines involved. Finally, from working with an anthropologist, a complementary view of the multidisciplinary process is given.	翻訳日:2022-09-20 02:49:53 公開日:2020-11-27
# 関数と非パラメトリック関数に対する後部分布の収束率とベイズ推定器の等価性 Equivalence of Convergence Rates of Posterior Distributions and Bayes Estimators for Functions and Nonparametric Functionals ( http://arxiv.org/abs/2011.13967v1 ) ライセンス: Link先を確認	Zejian Liu and Meng Li	(参考訳) 非パラメトリック回帰におけるガウス過程に先行するベイズ法の後方収縮率と微分作用素に対するプラグイン特性について検討した。核の一般クラスに対しては、回帰関数とその微分の後方測度の収束率を定め、それらはいずれも、あるクラスにおける関数の対数係数まで最適である。本計算により,回帰関数とその導関数の速度最適推定はハイパーパラメータの選択と同一であり,ベイズ法が導関数の次数に著しく適応し,実数値関数を関数関数へ拡張する一般化プラグイン特性を享受できることを示した。これにより, 有限サンプル性能をシミュレーションにより評価した回帰関数とその導関数を, 実質的に簡易に推定できる。この証明は,任意の条件下でベイズ推定器の収束率に対して,後方分布(つまり後方収縮率)の収束率と逆の収束率とが一致することを示す。この同値性はガウス過程の一般クラスを持ち、回帰関数とその微分汎函数を$L_2$と$L_{\infty}$ノルムの下でカバーする。ベイズ系と非ベイズ系でこれら2つの基本的な大きなサンプル特性を結合するのに加えて、そのような同値性は非パラメトリック点推定器の収束率を計算することによって、新しいルーチンで後部収縮率を確立することができる。我々の議論の中核は、カーネルリッジ回帰と等価カーネル技術のための演算子理論フレームワークである。我々は、非パラメトリック点推定器の収束率と、独立な興味を持つかもしれない同値理論を確立する上で重要な急激な非漸近境界を導出する。 We study the posterior contraction rates of a Bayesian method with Gaussian process priors in nonparametric regression and its plug-in property for differential operators. For a general class of kernels, we establish convergence rates of the posterior measure of the regression function and its derivatives, which are both minimax optimal up to a logarithmic factor for functions in certain classes. Our calculation shows that the rate-optimal estimation of the regression function and its derivatives share the same choice of hyperparameter, indicating that the Bayes procedure remarkably adapts to the order of derivatives and enjoys a generalized plug-in property that extends real-valued functionals to function-valued functionals. This leads to a practically simple method for estimating the regression function and its derivatives, whose finite sample performance is assessed using simulations. Our proof shows that, under certain conditions, to any convergence rate of Bayes estimators there corresponds the same convergence rate of the posterior distributions (i.e., posterior contraction rate), and vice versa. This equivalence holds for a general class of Gaussian processes and covers the regression function and its derivative functionals, under both the $L_2$ and $L_{\infty}$ norms. In addition to connecting these two fundamental large sample properties in Bayesian and non-Bayesian regimes, such equivalence enables a new routine to establish posterior contraction rates by calculating convergence rates of nonparametric point estimators. At the core of our argument is an operator-theoretic framework for kernel ridge regression and equivalent kernel techniques. We derive a range of sharp non-asymptotic bounds that are pivotal in establishing convergence rates of nonparametric point estimators and the equivalence theory, which may be of independent interest.	翻訳日:2022-09-20 02:49:44 公開日:2020-11-27
# 深層強化学習による時変グラフの効率的な情報拡散 Efficient Information Diffusion in Time-Varying Graphs through Deep Reinforcement Learning ( http://arxiv.org/abs/2011.13518v1 ) ライセンス: Link先を確認	Matheus R. F. Mendon\c{c}a, Andr\'e M. S. Barreto, Artur Ziviani	(参考訳) 時間変化グラフによる効率的な情報拡散のためのネットワークシード(TVG)は多くの実世界のアプリケーションにおいて難しい課題である。この時空間影響の最大化問題をモデル化する方法はいくつかあるが、最終的な目標は、ノードが拡散過程を開始する最良の瞬間を決定することである。本稿では,各ノードの時間的挙動と接続パターンを学習し,TVGを介して拡散を開始するための最良の瞬間を予測できる,強化学習とグラフ埋め込みを併用したモデルであるSpatio-Temporal Influence Maximization~(STIM)を提案する。また,TVGの確率拡散過程をシミュレートする学習用人工TVGも開発し,STIMネットワークは非決定論的環境においても効率的なポリシーを学習可能であることを示した。 STIMは現実世界のTVGで評価され、ノードを通して情報を効率的に伝達する。最後に、STIMモデルが$O(\|E\|)$の時間複雑性を持つことを示す。そこでSTIMは,TVGにおける効率的な情報拡散のための新しい手法を提案する。 Network seeding for efficient information diffusion over time-varying graphs~(TVGs) is a challenging task with many real-world applications. There are several ways to model this spatio-temporal influence maximization problem, but the ultimate goal is to determine the best moment for a node to start the diffusion process. In this context, we propose Spatio-Temporal Influence Maximization~(STIM), a model trained with Reinforcement Learning and Graph Embedding over a set of artificial TVGs that is capable of learning the temporal behavior and connectivity pattern of each node, allowing it to predict the best moment to start a diffusion through the TVG. We also develop a special set of artificial TVGs used for training that simulate a stochastic diffusion process in TVGs, showing that the STIM network can learn an efficient policy even over a non-deterministic environment. STIM is also evaluated with a real-world TVG, where it also manages to efficiently propagate information through the nodes. Finally, we also show that the STIM model has a time complexity of $O(\|E\|)$. STIM, therefore, presents a novel approach for efficient information diffusion in TVGs, being highly versatile, where one can change the goal of the model by simply changing the adopted reward function.	翻訳日:2022-09-20 02:49:12 公開日:2020-11-27
# net2:プレプレースメントネット長推定用にカスタマイズしたグラフアテンションネットワーク手法 Net2: A Graph Attention Network Method Customized for Pre-Placement Net Length Estimation ( http://arxiv.org/abs/2011.13522v1 ) ライセンス: Link先を確認	Zhiyao Xie, Rongjian Liang, Xiaoqing Xu, Jiang Hu, Yixiao Duan, Yiran Chen	(参考訳) net lengthは、標準のデジタルデザインフローの様々な段階にわたってタイミングとパワーを最適化するための重要なプロキシメトリックである。しかし、ネット長情報の大多数はセル配置まで利用できないため、論理合成のような配置前の設計段階でネット長の最適化を明示的に検討することは重要な課題である。この研究は、セル配置前の個々のネット長を推定するために、Net2と呼ばれるカスタマイズを伴うグラフ注意ネットワーク手法を提案することで、この問題に対処する。精度指向バージョンであるNet2aは、長いネットと長いクリティカルパスの両方を識別する以前のいくつかの研究よりも約15%精度が向上している。高速バージョンであるNet2fは、配置よりも1000倍以上高速だが、さまざまな精度のメトリクスで、これまでの作業や他のニューラルネットワーク技術よりも優れている。 Net length is a key proxy metric for optimizing timing and power across various stages of a standard digital design flow. However, the bulk of net length information is not available until cell placement, and hence it is a significant challenge to explicitly consider net length optimization in design stages prior to placement, such as logic synthesis. This work addresses this challenge by proposing a graph attention network method with customization, called Net2, to estimate individual net length before cell placement. Its accuracy-oriented version Net2a achieves about 15% better accuracy than several previous works in identifying both long nets and long critical paths. Its fast version Net2f is more than 1000 times faster than placement while still outperforms previous works and other neural network techniques in terms of various accuracy metrics.	翻訳日:2022-09-20 02:48:53 公開日:2020-11-27
# 回転等価階層型ニューラルネットワークを用いたタンパク質モデル品質評価 Protein model quality assessment using rotation-equivariant, hierarchical neural networks ( http://arxiv.org/abs/2011.13557v1 ) ライセンス: Link先を確認	Stephan Eismann, Patricia Suriana, Bowen Jing, Raphael J.L. Townshend, Ron O. Dror	(参考訳) タンパク質は三次元(3d)構造に依存するミニチュアマシンである。この構造を計算的に決定することは未解決の大きな課題である。主なボトルネックは、モデル品質評価の課題である、候補の大きなプールの中で最も正確な構造モデルを選択することである。本稿では,タンパク質モデルの品質を評価するための新しい深層学習手法を提案する。我々のネットワークは、異なるレベルの構造解像度で原子構造と回転同変の畳み込みをポイントベースで表現する。これらの組み合わせにより、ネットワークはタンパク質構造全体からエンドツーエンドを学べる。近年のCASP(盲目予測コミュニティ実験)におけるタンパク質モデルの評価結果について報告する。特に注目すべきは、我々の手法は物理に着想を得たエネルギー用語を使用しず、複数のタンパク質の配列アライメントのような追加情報(個々のタンパク質モデルの原子構造以外の)を利用できないことである。 Proteins are miniature machines whose function depends on their three-dimensional (3D) structure. Determining this structure computationally remains an unsolved grand challenge. A major bottleneck involves selecting the most accurate structural model among a large pool of candidates, a task addressed in model quality assessment. Here, we present a novel deep learning approach to assess the quality of a protein model. Our network builds on a point-based representation of the atomic structure and rotation-equivariant convolutions at different levels of structural resolution. These combined aspects allow the network to learn end-to-end from entire protein structures. Our method achieves state-of-the-art results in scoring protein models submitted to recent rounds of CASP, a blind prediction community experiment. Particularly striking is that our method does not use physics-inspired energy terms and does not rely on the availability of additional information (beyond the atomic structure of the individual protein model), such as sequence alignments of multiple proteins.	翻訳日:2022-09-20 02:48:41 公開日:2020-11-27
# 新しい近似に基づく固有値補正自然勾配 Eigenvalue-corrected Natural Gradient Based on a New Approximation ( http://arxiv.org/abs/2011.13609v1 ) ライセンス: Link先を確認	Kai-Xin Gao, Xiao-Lei Liu, Zheng-Hai Huang, Min Wang, Shuangling Wang, Zidong Wang, Dachuan Xu, Fan Yu	(参考訳) ディープニューラルネットワーク(DNN)のトレーニングに2次最適化手法を用いると、多くの研究者が惹きつけている。最近提案されたEigenvalue-corrected Kronecker Factorization (EKFAC) (George et al., 2018) は、自然勾配の更新を対角法として解釈し、Kronecker-factored eigenbasisにおける不正確な再スケーリング係数を補正する。 Gao et al. (2020) は自然勾配に対する新たな近似を考察し、フィッシャー情報行列 (FIM) を2つの行列のクロネッカー積によって乗算された定数に近似し、近似の前と後のトレースを等しく保つ。本研究では,これら2つの手法の考え方を組み合わせて,Trace-restricted Eigenvalue-corrected Kronecker Factorization (TEKFAC)を提案する。提案手法は, kronecker-factored eigenbasis における不正確な再スケーリング係数を補正するだけでなく, gao et al. (2020) で提案した新しい近似法と有効減衰法を考察する。また、クロネッカー分解近似の差と関係についても論じる。実験により,本手法は複数のDNNにおいて,Adam,EKFAC,TKFAC等の運動量でSGDより優れていた。 Using second-order optimization methods for training deep neural networks (DNNs) has attracted many researchers. A recently proposed method, Eigenvalue-corrected Kronecker Factorization (EKFAC) (George et al., 2018), proposes an interpretation of viewing natural gradient update as a diagonal method, and corrects the inaccurate re-scaling factor in the Kronecker-factored eigenbasis. Gao et al. (2020) considers a new approximation to the natural gradient, which approximates the Fisher information matrix (FIM) to a constant multiplied by the Kronecker product of two matrices and keeps the trace equal before and after the approximation. In this work, we combine the ideas of these two methods and propose Trace-restricted Eigenvalue-corrected Kronecker Factorization (TEKFAC). The proposed method not only corrects the inexact re-scaling factor under the Kronecker-factored eigenbasis, but also considers the new approximation method and the effective damping technique proposed in Gao et al. (2020). We also discuss the differences and relationships among the Kronecker-factored approximations. Empirically, our method outperforms SGD with momentum, Adam, EKFAC and TKFAC on several DNNs.	翻訳日:2022-09-20 02:48:28 公開日:2020-11-27
# 強化学習に基づく無人航空機の協調経路とエネルギー最適化 Reinforcement Learning-based Joint Path and Energy Optimization of Cellular-Connected Unmanned Aerial Vehicles ( http://arxiv.org/abs/2011.13744v1 ) ライセンス: Link先を確認	Arash Hooshmand	(参考訳) 無人航空機(UAV)は最近かなりの研究関心を集めている。特にモノのインターネットの世界では、インターネット接続のUAVが大きな需要の1つだ。さらに、エネルギー制約、すなわちバッテリー制限は、その用途を制限することができるuavのボトルネックである。我々はエネルギー問題に対処し解決しようと試みる。そこで, 電力ステーション (PS) を装備した特定の位置で充電することで, UAVがバッテリ範囲よりもはるかに広い範囲で経路を計画できる, セル接続型UAVの経路計画法を提案する。例えば、エア・トゥ・エア(A2A)とエア・トゥ・グラウンド(A2G)の干渉や、UAVの軌道最適化に余分な制約を課す必要のない接続性のためである。飛行禁止区域は避けるべき非実用領域を決定する。バッテリーの充電を考慮し、長いミッションでUAVの問題を解決するため、我々は典型的な短距離経路プランナーを階層的に拡張するために強化学習(RL)を用いてきた。この問題は、広範囲を飛行するUAVに対してシミュレートされ、Qラーニングアルゴリズムにより、UAVが最適な経路と充電ポリシーを見つけることができる。 Unmanned Aerial Vehicles (UAVs) have attracted considerable research interest recently. Especially when it comes to the realm of Internet of Things, the UAVs with Internet connectivity are one of the main demands. Furthermore, the energy constraint i.e. battery limit is a bottle-neck of the UAVs that can limit their applications. We try to address and solve the energy problem. Therefore, a path planning method for a cellular-connected UAV is proposed that will enable the UAV to plan its path in an area much larger than its battery range by getting recharged in certain positions equipped with power stations (PSs). In addition to the energy constraint, there are also no-fly zones; for example, due to Air to Air (A2A) and Air to Ground (A2G) interference or for lack of necessary connectivity that impose extra constraints in the trajectory optimization of the UAV. No-fly zones determine the infeasible areas that should be avoided. We have used a reinforcement learning (RL) hierarchically to extend typical short-range path planners to consider battery recharge and solve the problem of UAVs in long missions. The problem is simulated for the UAV that flies over a large area, and Q-learning algorithm could enable the UAV to find the optimal path and recharge policy.	翻訳日:2022-09-20 02:48:04 公開日:2020-11-27
# CASTELO: Clustered Atom Subtypes aidEd Lead Optimization -- 機械学習と分子モデリングを組み合わせた手法 CASTELO: Clustered Atom Subtypes aidEd Lead Optimization -- a combined machine learning and molecular modeling method ( http://arxiv.org/abs/2011.13788v1 ) ライセンス: Link先を確認	Leili Zhang, Giacomo Domeniconi, Chih-Chieh Yang, Seung-gu Kang, Ruhong Zhou, Guojing Cong	(参考訳) 薬物の発見は、前臨床研究と臨床試験の2つの大きなステップからなる多段階のプロセスである。その段階の中で、リード最適化は前臨床予算の半分以上を簡単に消費する。本稿では,リード最適化ワークフローを自動化した機械学習と分子モデリングを組み合わせたアプローチを提案する。初期データ収集は物理に基づく分子動力学(MD)シミュレーションによって達成される。シミュレーションから抽出した予備特徴として接触行列を算出する。シミュレーションからの時間情報を活用するために,時間的ダイナミズム表現を用いた接触行列データを強化し,教師なし畳み込み型変分オートエンコーダ(cvae)を用いてモデル化した。最後に,従来のクラスタリング法とCVAEに基づくクラスタリング法を比較し,分子下構造をランク付けし,リード最適化の可能性を提案する。構造-活性関係データベースは必要とせず,薬剤の有効性を改善するための薬剤修飾ホットスポットに対する新たなヒントを提供する。我々のワークフローは、従来の労働集約的なプロセスと比較して、数ヶ月から数日のリード最適化のターンアラウンド時間を短縮できる可能性があり、医療研究者にとって貴重なツールになる可能性がある。 Drug discovery is a multi-stage process that comprises two costly major steps: pre-clinical research and clinical trials. Among its stages, lead optimization easily consumes more than half of the pre-clinical budget. We propose a combined machine learning and molecular modeling approach that automates lead optimization workflow \textit{in silico}. The initial data collection is achieved with physics-based molecular dynamics (MD) simulation. Contact matrices are calculated as the preliminary features extracted from the simulations. To take advantage of the temporal information from the simulations, we enhanced contact matrices data with temporal dynamism representation, which are then modeled with unsupervised convolutional variational autoencoder (CVAE). Finally, conventional clustering method and CVAE-based clustering method are compared with metrics to rank the submolecular structures and propose potential candidates for lead optimization. With no need for extensive structure-activity relationship database, our method provides new hints for drug modification hotspots which can be used to improve drug efficacy. Our workflow can potentially reduce the lead optimization turnaround time from months/years to days compared with the conventional labor-intensive process and thus can potentially become a valuable tool for medical researchers.	翻訳日:2022-09-20 02:47:19 公開日:2020-11-27
# 強化学習のためのベンチマークフレームワークの調査 A survey of benchmarking frameworks for reinforcement learning ( http://arxiv.org/abs/2011.13577v1 ) ライセンス: Link先を確認	Belinda Stapelberg and Katherine M. Malan	(参考訳) 強化学習は最近、機械学習コミュニティで注目を集めている。新しい手法が絶えず開発され、強化学習問題を解決する多くのアプローチがある。強化学習を用いた問題解決には,克服すべき課題がいろいろある。この分野の進歩を保証するため、ベンチマークは新しいアルゴリズムのテストや他のアプローチとの比較に重要である。したがって、公正な比較のための結果の再現性は、改善を正確に判断する上で不可欠である。本稿では、強化学習ベンチマークへの様々な貢献の概要と、強化学習が直面する課題に研究者がどう対処できるかについて論じる。論文の中では最もよく使われ、近年でも研究が進められている。本稿では,ベンチマークを用いた実装,タスク,アルゴリズム実装の面での貢献について述べる。この調査は、利用可能な幅広い強化学習ベンチマークタスクに注意を向け、標準化された方法で研究を奨励することを目的としている。さらに、この調査は、新しい強化学習アルゴリズムの開発とテストに使用できる様々なタスクに慣れていない研究者の概観として機能する。 Reinforcement learning has recently experienced increased prominence in the machine learning community. There are many approaches to solving reinforcement learning problems with new techniques developed constantly. When solving problems using reinforcement learning, there are various difficult challenges to overcome. To ensure progress in the field, benchmarks are important for testing new algorithms and comparing with other approaches. The reproducibility of results for fair comparison is therefore vital in ensuring that improvements are accurately judged. This paper provides an overview of different contributions to reinforcement learning benchmarking and discusses how they can assist researchers to address the challenges facing reinforcement learning. The contributions discussed are the most used and recent in the literature. The paper discusses the contributions in terms of implementation, tasks and provided algorithm implementations with benchmarks. The survey aims to bring attention to the wide range of reinforcement learning benchmarking tasks available and to encourage research to take place in a standardised manner. Additionally, this survey acts as an overview for researchers not familiar with the different tasks that can be used to develop and test new reinforcement learning algorithms.	翻訳日:2022-09-20 02:41:23 公開日:2020-11-27
# 信頼率クリッピングを用いた層別適応率法の改良 Improving Layer-wise Adaptive Rate Methods using Trust Ratio Clipping ( http://arxiv.org/abs/2011.13584v1 ) ライセンス: Link先を確認	Jeffrey Fong, Siwei Chen, Kaiqi Chen	(参考訳) 大きなバッチでニューラルネットワークをトレーニングすることは、ディープラーニングにとって基本的な重要性である。大規模なバッチトレーニングは、トレーニング時間を大幅に削減するが、精度を維持するのに困難である。最近の研究は、信頼率を用いた適応層別最適化を通じてこの問題に取り組むためにlarsやlambといった最適化手法を推し進めている。一般的な手法ではあるが、これらの手法は依然として不安定で極端な信頼率に悩まされており、性能が低下している。本稿では,その大きさを安定させ,極端な値を防止するため,信頼率クリッピングを用いたラムの新規変種であるlambcを提案する。 imagenetやcifar-10などの画像分類タスクについて実験を行い,各バッチサイズで有望な改善が得られた。 Training neural networks with large batch is of fundamental significance to deep learning. Large batch training remarkably reduces the amount of training time but has difficulties in maintaining accuracy. Recent works have put forward optimization methods such as LARS and LAMB to tackle this issue through adaptive layer-wise optimization using trust ratios. Though prevailing, such methods are observed to still suffer from unstable and extreme trust ratios which degrades performance. In this paper, we propose a new variant of LAMB, called LAMBC, which employs trust ratio clipping to stabilize its magnitude and prevent extreme values. We conducted experiments on image classification tasks such as ImageNet and CIFAR-10 and our empirical results demonstrate promising improvements across different batch sizes.	翻訳日:2022-09-20 02:41:08 公開日:2020-11-27
# 活性化拡散に基づくニューラルダイナミックモデルと認知操作のためのマイクロエクスラレーション A Neural Dynamic Model based on Activation Diffusion and a Micro-Explanation for Cognitive Operations ( http://arxiv.org/abs/2012.00104v1 ) ライセンス: Link先を確認	Hui Wei	(参考訳) 記憶の神経機構は、人工知能における表現の問題と非常に密接に関係している。本稿では,脳内のニューロンのネットワークをシミュレートする計算モデルを提案し,その処理方法について述べる。このモデルは神経情報処理の形態学的および電気生理学的特性を指し、ニューロンが発射シーケンスを符号化しているという仮定に基づいている。ネットワーク構造, 異なる段階における神経エンコーディング機能, 記憶における刺激の表現, 記憶を形成するアルゴリズムなどが提示された。また、学習の安定性と記憶能力のリコール率も分析した。神経のダイナミックなプロセスが後継として、情報が表現され、処理されるニューロンレベルかつコヒーレントな形式を実現するため、推論、問題解決、パターン認識、自然言語処理、学習など、人工知能のさまざまな分野の検証が容易になる。知的行動において起こる認知的操作の過程は一貫した表現を持ち、計算神経科学の観点からモデル化される。したがって、ニューロンのダイナミクスは、マイクロレベルで認知アーキテクチャの統一モデルによって、異なる知的行動の内部メカニズムを説明することができる。 The neural mechanism of memory has a very close relation with the problem of representation in artificial intelligence. In this paper a computational model was proposed to simulate the network of neurons in brain and how they process information. The model refers to morphological and electrophysiological characteristics of neural information processing, and is based on the assumption that neurons encode their firing sequence. The network structure, functions for neural encoding at different stages, the representation of stimuli in memory, and an algorithm to form a memory were presented. It also analyzed the stability and recall rate for learning and the capacity of memory. Because neural dynamic processes, one succeeding another, achieve a neuron-level and coherent form by which information is represented and processed, it may facilitate examination of various branches of Artificial Intelligence, such as inference, problem solving, pattern recognition, natural language processing and learning. The processes of cognitive manipulation occurring in intelligent behavior have a consistent representation while all being modeled from the perspective of computational neuroscience. Thus, the dynamics of neurons make it possible to explain the inner mechanisms of different intelligent behaviors by a unified model of cognitive architecture at a micro-level.	翻訳日:2022-09-20 02:39:45 公開日:2020-11-27
# 動的時間カメラとライダーの融合に基づく非協力環境におけるロバストなuavの自律着陸 Robust Autonomous Landing of UAV in Non-Cooperative Environments based on Dynamic Time Camera-LiDAR Fusion ( http://arxiv.org/abs/2011.13761v1 ) ライセンス: Link先を確認	Lyujie Chen, Xiaming Yuan, Yao Xiao, Yiding Zhang and Jihong Zhu	(参考訳) 非協力的な環境で安全な着陸場所を選択することは、UAVの完全な自律化に向けた重要なステップである。しかし、既存の手法は一般化能力の貧弱さと頑健さという共通の問題がある。未知の環境でのパフォーマンスは著しく低下し、エラーを自己検出して修正することはできない。本論文では,低コストLiDARと双眼カメラを備えたUAVシステムを構築し,平地と安全地を検知して非協調環境における自律着陸を実現する。我々は,LiDARの非繰り返し走査と高いFOVカバレッジ特性を利用して,動的時間深度補完アルゴリズムを考案した。提案した深度マップの自己評価手法と合わせて,推定フェーズにおけるLiDAR蓄積時間を動的に選択し,正確な予測結果が得られた。深さマップに基づいて、斜面、粗さ、安全領域の大きさなどの高レベルな地形情報を導出する。我々は,様々な未知の環境において,広範囲にわたる自律着陸実験を実施し,モデルが精度と速度を適応的にバランスさせ,uavが安全な着陸地点をロバストに選択できることを確認した。 Selecting safe landing sites in non-cooperative environments is a key step towards the full autonomy of UAVs. However, the existing methods have the common problems of poor generalization ability and robustness. Their performance in unknown environments is significantly degraded and the error cannot be self-detected and corrected. In this paper, we construct a UAV system equipped with low-cost LiDAR and binocular cameras to realize autonomous landing in non-cooperative environments by detecting the flat and safe ground area. Taking advantage of the non-repetitive scanning and high FOV coverage characteristics of LiDAR, we come up with a dynamic time depth completion algorithm. In conjunction with the proposed self-evaluation method of the depth map, our model can dynamically select the LiDAR accumulation time at the inference phase to ensure an accurate prediction result. Based on the depth map, the high-level terrain information such as slope, roughness, and the size of the safe area are derived. We have conducted extensive autonomous landing experiments in a variety of familiar or completely unknown environments, verifying that our model can adaptively balance the accuracy and speed, and the UAV can robustly select a safe landing site.	翻訳日:2022-09-20 02:38:37 公開日:2020-11-27
# 視覚的ローカライゼーションのための効率的なシーン圧縮 Efficient Scene Compression for Visual-based Localization ( http://arxiv.org/abs/2011.13894v1 ) ライセンス: Link先を確認	Marcela Mera-Trujillo, Benjamin Smith, Victor Fragoso	(参考訳) 3D再構成やシーン表現に関してカメラのポーズを推定することは、多くの複合現実とロボティクスアプリケーションにとって重要なステップである。現在利用可能な膨大なデータを考えると、多くのアプリケーションは効率的に動作するストレージや帯域幅を制限している。これらの制約を満たすため、多くのアプリケーションは3Dポイントの数を減らしてシーン表現を圧縮する。最先端の手法はk$-coverベースのアルゴリズムを使ってシーンを圧縮するが、それらは遅くてチューニングが難しい。速度の向上とパラメータチューニングの容易化を目的として,制約付き二次プログラム(qp)を用いてシーン表現を圧縮する新しい手法を提案する。このQPは1クラスのサポートベクトルマシンに似ているため、逐次最小最適化の変種を導出して解決する。提案手法では,支援ベクトルに対応する点を,シーンを表す点のサブセットとして用いる。また,本手法を高速に収束させる効率的な初期化手法を提案する。公開データセットを用いた実験により,提案手法はシーン表現を高速に圧縮し,正確なポーズ推定を行うことを示す。 Estimating the pose of a camera with respect to a 3D reconstruction or scene representation is a crucial step for many mixed reality and robotics applications. Given the vast amount of available data nowadays, many applications constrain storage and/or bandwidth to work efficiently. To satisfy these constraints, many applications compress a scene representation by reducing its number of 3D points. While state-of-the-art methods use $K$-cover-based algorithms to compress a scene, they are slow and hard to tune. To enhance speed and facilitate parameter tuning, this work introduces a novel approach that compresses a scene representation by means of a constrained quadratic program (QP). Because this QP resembles a one-class support vector machine, we derive a variant of the sequential minimal optimization to solve it. Our approach uses the points corresponding to the support vectors as the subset of points to represent a scene. We also present an efficient initialization method that allows our method to converge quickly. Our experiments on publicly available datasets show that our approach compresses a scene representation quickly while delivering accurate pose estimates.	翻訳日:2022-09-20 02:32:39 公開日:2020-11-27
# D-NeRF:ダイナミックシーンのためのニューラルラジアンス場 D-NeRF: Neural Radiance Fields for Dynamic Scenes ( http://arxiv.org/abs/2011.13961v1 ) ライセンス: Link先を確認	Albert Pumarola, Enric Corona, Gerard Pons-Moll, Francesc Moreno-Noguer	(参考訳) 機械学習と幾何学的推論を組み合わせたニューラルレンダリング技術は、スパース画像からシーンの新たなビューを合成するための最も有望なアプローチの1つである。このうちニューラル放射場(NeRF)は、深層ネットワークを訓練して5次元入力座標(空間的位置と視野方向を表す)を体積密度とビュー依存放射輝度にマッピングするものである。しかし、生成した画像に対して前例のないレベルの光リアリズムを実現するにもかかわらず、NeRFは静止シーンのみに適用でき、同じ空間位置を異なる画像から検索することができる。本稿では,神経放射野をダイナミックドメインに拡張する手法であるd-nerfについて紹介する。この手法により,シーンの周囲を移動する \emph{single} カメラから,剛体および非剛体運動下での新たな物体像の再構成とレンダリングが可能となる。この目的のために、時間はシステムへの追加入力として考慮し、学習プロセスを、シーンを標準空間にエンコードする段階と、この標準表現を特定の時間で変形シーンにマッピングする段階の2つの主要な段階に分割する。両方のマッピングは、完全に接続されたネットワークを使って同時に学習される。ネットワークがトレーニングされると、D-NeRFは新しい画像をレンダリングし、カメラビューと時間変数の両方を制御し、オブジェクトの動きを制御する。我々は,剛体・調音・非剛体動作下での物体のシーンに対するアプローチの有効性を実証した。コード、モデルウェイト、動的シーンデータセットがリリースされる。 Neural rendering techniques combining machine learning with geometric reasoning have arisen as one of the most promising approaches for synthesizing novel views of a scene from a sparse set of images. Among these, stands out the Neural radiance fields (NeRF), which trains a deep network to map 5D input coordinates (representing spatial location and viewing direction) into a volume density and view-dependent emitted radiance. However, despite achieving an unprecedented level of photorealism on the generated images, NeRF is only applicable to static scenes, where the same spatial location can be queried from different images. In this paper we introduce D-NeRF, a method that extends neural radiance fields to a dynamic domain, allowing to reconstruct and render novel images of objects under rigid and non-rigid motions from a \emph{single} camera moving around the scene. For this purpose we consider time as an additional input to the system, and split the learning process in two main stages: one that encodes the scene into a canonical space and another that maps this canonical representation into the deformed scene at a particular time. Both mappings are simultaneously learned using fully-connected networks. Once the networks are trained, D-NeRF can render novel images, controlling both the camera view and the time variable, and thus, the object movement. We demonstrate the effectiveness of our approach on scenes with objects under rigid, articulated and non-rigid motions. Code, model weights and the dynamic scenes dataset will be released.	翻訳日:2022-09-20 02:31:59 公開日:2020-11-27
# テキストセグメンテーションを再考する:新しいデータセットとテキスト特異的リファインメントアプローチ Rethinking Text Segmentation: A Novel Dataset and A Text-Specific Refinement Approach ( http://arxiv.org/abs/2011.14021v1 ) ライセンス: Link先を確認	Xingqian Xu, Zhifei Zhang, Zhaowen Wang, Brian Price, Zhonghao Wang, Humphrey Shi	(参考訳) テキストセグメンテーションは、テキストスタイル転送やシーンテキストの削除など、現実世界の多くのテキスト関連タスクにおいて必須条件である。しかし、高品質なデータセットや専用の調査が欠如しているため、この重要な前提条件は多くの研究において前提として残されており、現在の研究でほとんど見落とされてきた。このギャップを埋めるため、私たちはtextsegという、単語と文字を境界とする多角形、マスク、文字転写の6種類のアノテーションを備えた、大規模な詳細な注釈付きテキストデータセットを提案しました。また,従来のセグメンテーションモデルに負担を課すような,非凸境界や多様なテクスチャなど,テキストのユニークな特性に適応する新たなテキストセグメンテーション手法であるtextfine network(texrnet)についても紹介する。 texrnetでは、重要な機能プールや注意に基づく類似性チェックなど、このような課題に対処するために、テキスト固有のネットワーク設計を提案します。また,テキストセグメンテーションの大幅な改善を示すtrimapとdiscriminatorの損失についても紹介する。 TextSegデータセットと既存のデータセットの両方で大規模な実験が行われます。 texrnetは、他の最先端セグメンテーション手法と比較して、テキストセグメンテーション性能を2%近く向上させる。データセットとコードはhttps://github.com/SHI-Labs/Rethinking-Text-Segmentationで公開されます。 Text segmentation is a prerequisite in many real-world text-related tasks, e.g., text style transfer, and scene text removal. However, facing the lack of high-quality datasets and dedicated investigations, this critical prerequisite has been left as an assumption in many works, and has been largely overlooked by current research. To bridge this gap, we proposed TextSeg, a large-scale fine-annotated text dataset with six types of annotations: word- and character-wise bounding polygons, masks and transcriptions. We also introduce Text Refinement Network (TexRNet), a novel text segmentation approach that adapts to the unique properties of text, e.g. non-convex boundary, diverse texture, etc., which often impose burdens on traditional segmentation models. In our TexRNet, we propose text specific network designs to address such challenges, including key features pooling and attention-based similarity checking. We also introduce trimap and discriminator losses that show significant improvement on text segmentation. Extensive experiments are carried out on both our TextSeg dataset and other existing datasets. We demonstrate that TexRNet consistently improves text segmentation performance by nearly 2% compared to other state-of-the-art segmentation methods. Our dataset and code will be made available at https://github.com/SHI-Labs/Rethinking-Text-Segmentation.	翻訳日:2022-09-20 02:31:23 公開日:2020-11-27
# ニュースメディアにおけるジェンダーベースの暴力の影響を学習する機械学習 Machine Learning to study the impact of gender-based violence in the news media ( http://arxiv.org/abs/2012.07490v1 ) ライセンス: Link先を確認	Hugo J. Bello, Nora Palomar, Elisa Gallego, Lourdes Jim\'enez Navascu\'es and Celia Lozano	(参考訳) 未だにタブー的な話題だが、性別に基づく暴力(GBV)は被害者の健康、尊厳、安全、自治を損なう。この種の暴力を発生または維持するために多くの要因が研究されているが、メディアの影響はまだ不明である。ここでは、このニュースの効果をGBVで説明するために機械学習ツールを使用します。ニューラルネットワークにニュースを供給することにより、各記事に関連するトピック情報を復元することができる。以上の結果から,GBVニュースと公衆の意識,メディア性GBV症例の影響,GBVニュースの本質的なテーマ的関係が示唆された。使用中のニューラルモデルを簡単に調整できるので、他のメディアソースやトピックにもアプローチを拡張できます。 While it remains a taboo topic, gender-based violence (GBV) undermines the health, dignity, security and autonomy of its victims. Many factors have been studied to generate or maintain this kind of violence, however, the influence of the media is still uncertain. Here, we use Machine Learning tools to extrapolate the effect of the news in GBV. By feeding neural networks with news, the topic information associated with each article can be recovered. Our findings show a relationship between GBV news and public awareness, the effect of mediatic GBV cases, and the intrinsic thematic relationship of GBV news. Because the used neural model can be easily adjusted, this also allows us to extend our approach to other media sources or topics	翻訳日:2022-09-20 02:30:47 公開日:2020-11-27
# 深層学習における不確実性の再考:ロバスト性の改善について Rethinking Uncertainty in Deep Learning: Whether and How it Improves Robustness ( http://arxiv.org/abs/2011.13538v1 ) ライセンス: Link先を確認	Yilun Jin, Lixin Fan, Kam Woh Ng, Ce Ju, Qiang Yang	(参考訳) ディープ・ニューラル・ネットワーク(DNN)は、多くの治療法が提案される敵の攻撃に苦しむことが知られている。敵対的訓練(adversarial training, at)は最も強固な防御とされるが、クリーンな例と、より大きな摂動による攻撃のような他の種類の攻撃の両方において、パフォーマンスの低下に苦しむ。一方、エントロピー最大化(EntM)やラベル平滑化(LS)といった不確実な出力を奨励する正規化器は、クリーンな例の精度を維持し、弱い攻撃下での性能を向上させることができるが、強力な攻撃に対して防御する能力は疑わしい。本稿では,entmやlsを含む不確実性促進規則化剤を,敵対学習の分野で再検討する。 EntM と LS だけで小さな摂動下でのみ堅牢性が得られることを示す。反対に,不確実性促進調整器は原則的に補完し,クリーンな例と様々な攻撃,特に大きな摂動を伴う攻撃の両方において,一貫して性能を向上させる。さらに、不確実性促進正則化器がジャコビアン行列$\nabla_X f(X;\theta)$の観点からATの性能を高め、EntMが事実上ヤコビアン行列のノルムを縮小し、ロバスト性を促進することを明らかにする。 Deep neural networks (DNNs) are known to be prone to adversarial attacks, for which many remedies are proposed. While adversarial training (AT) is regarded as the most robust defense, it suffers from poor performance both on clean examples and under other types of attacks, e.g. attacks with larger perturbations. Meanwhile, regularizers that encourage uncertain outputs, such as entropy maximization (EntM) and label smoothing (LS) can maintain accuracy on clean examples and improve performance under weak attacks, yet their ability to defend against strong attacks is still in doubt. In this paper, we revisit uncertainty promotion regularizers, including EntM and LS, in the field of adversarial learning. We show that EntM and LS alone provide robustness only under small perturbations. Contrarily, we show that uncertainty promotion regularizers complement AT in a principled manner, consistently improving performance on both clean examples and under various attacks, especially attacks with large perturbations. We further analyze how uncertainty promotion regularizers enhance the performance of AT from the perspective of Jacobian matrices $\nabla_X f(X;\theta)$, and find out that EntM effectively shrinks the norm of Jacobian matrices and hence promotes robustness.	翻訳日:2022-09-20 02:30:13 公開日:2020-11-27
# 相互関係推論による自己教師付き時系列表現学習 Self-Supervised Time Series Representation Learning by Inter-Intra Relational Reasoning ( http://arxiv.org/abs/2011.13548v1 ) ライセンス: Link先を確認	Haoyi Fan, Fengbin Zhang, Yue Gao	(参考訳) 自己教師付き学習は、ラベルのないデータから有用な表現を抽出することで、多くの領域において優れたパフォーマンスを達成する。しかし,従来の自己教師あり手法の多くはサンプル間構造の探索に主眼を置いているが,時系列データにとって重要な時間内構造への取り組みは少ない。本稿では,自己監督型時系列表現学習フレームワークであるSelfTimeについて,時系列のサンプル間関係と時間内関係を探索し,ラベルなし時系列の基盤となる構造特徴を学習する。具体的には,まず所定のアンカー試料の正および負のサンプルをサンプリングし,このアンカーから時間片をサンプリングすることで時間内関係を生成する。そして、サンプル関係に基づいて、2つの別々の関係推論ヘッドを組み合わせた共有特徴抽出バックボーンを用いて、サンプル対のサンプル間関係推論の関係を定量化し、時間内関係推論のためのタイムピースペアの関係を定量化する。最後に、関係推論ヘッドの監督の下で、時系列の有用な表現をバックボーンから抽出する。時系列分類タスクのための実世界の時系列データセットの実験結果から,提案手法の有効性が示された。コードとデータはhttps://haoyfan.github.io/で公開されている。 Self-supervised learning achieves superior performance in many domains by extracting useful representations from the unlabeled data. However, most of traditional self-supervised methods mainly focus on exploring the inter-sample structure while less efforts have been concentrated on the underlying intra-temporal structure, which is important for time series data. In this paper, we present SelfTime: a general self-supervised time series representation learning framework, by exploring the inter-sample relation and intra-temporal relation of time series to learn the underlying structure feature on the unlabeled time series. Specifically, we first generate the inter-sample relation by sampling positive and negative samples of a given anchor sample, and intra-temporal relation by sampling time pieces from this anchor. Then, based on the sampled relation, a shared feature extraction backbone combined with two separate relation reasoning heads are employed to quantify the relationships of the sample pairs for inter-sample relation reasoning, and the relationships of the time piece pairs for intra-temporal relation reasoning, respectively. Finally, the useful representations of time series are extracted from the backbone under the supervision of relation reasoning heads. Experimental results on multiple real-world time series datasets for time series classification task demonstrate the effectiveness of the proposed method. Code and data are publicly available at https://haoyfan.github.io/.	翻訳日:2022-09-20 02:29:43 公開日:2020-11-27
# MEBOW: 野生における体配向の単分子的推定 MEBOW: Monocular Estimation of Body Orientation In the Wild ( http://arxiv.org/abs/2011.13688v1 ) ライセンス: Link先を確認	Chenyan Wu, Yukun Chen, Jiajia Luo, Che-Chun Su, Anuja Dawane, Bikramjot Hanzra, Zhuo Deng, Bilan Liu, James Wang, Cheng-Hao Kuo	(参考訳) 身体の向きの推定は、ロボット工学や自律運転を含む多くのアプリケーションにおいて重要な視覚的手がかりを提供する。特に3次元ポーズ推定が画像分解能の低下、咬合、身体部位の識別が困難である場合には望ましい。そこで本研究では,広視野画像からの方向推定のための大規模データセットであるCOCO-MEBOW(Monocular Estimation of Body Orientation in the Wild)を提案する。 COCOデータセットからの55K画像内の約130K人の身体の向き付けラベルは、効率的で高精度なアノテーションパイプラインを使用して収集されている。また、データセットのメリットも検証しました。まず,本データセットは人体方向推定モデルの性能と頑健性を大幅に向上させることができることを示す。さらに,3次元ポーズラベル,2次元ポーズラベル,我々の身体指向ラベルを共同訓練に用いる3次元ポーズ推定のための新しい3次元ソースソリューションを提案する。本モデルは,3次元ポーズラベルと2次元ポーズラベルのみを用いた単眼3次元ポーズ推定において,最先端のデュアルソースソリューションよりも優れる。これは、3次元のポーズ推定においてmebowの重要な利点であり、特に3次元のポーズではボディオリエンテーションに対する個人ごとのラベリングコストがはるかに低いため魅力的である。この研究は、人間の行動を理解することに関わる現実的な課題に対処する上で、MEBOWの高い可能性を示している。この研究の詳細はhttps://chenyanwu.github.io/MEBOW/.comで確認できる。 Body orientation estimation provides crucial visual cues in many applications, including robotics and autonomous driving. It is particularly desirable when 3-D pose estimation is difficult to infer due to poor image resolution, occlusion or indistinguishable body parts. We present COCO-MEBOW (Monocular Estimation of Body Orientation in the Wild), a new large-scale dataset for orientation estimation from a single in-the-wild image. The body-orientation labels for around 130K human bodies within 55K images from the COCO dataset have been collected using an efficient and high-precision annotation pipeline. We also validated the benefits of the dataset. First, we show that our dataset can substantially improve the performance and the robustness of a human body orientation estimation model, the development of which was previously limited by the scale and diversity of the available training data. Additionally, we present a novel triple-source solution for 3-D human pose estimation, where 3-D pose labels, 2-D pose labels, and our body-orientation labels are all used in joint training. Our model significantly outperforms state-of-the-art dual-source solutions for monocular 3-D human pose estimation, where training only uses 3-D pose labels and 2-D pose labels. This substantiates an important advantage of MEBOW for 3-D human pose estimation, which is particularly appealing because the per-instance labeling cost for body orientations is far less than that for 3-D poses. The work demonstrates high potential of MEBOW in addressing real-world challenges involving understanding human behaviors. Further information of this work is available at https://chenyanwu.github.io/MEBOW/.	翻訳日:2022-09-20 02:23:53 公開日:2020-11-27
# 高分解能乳房イメージングのための軽量U-Net Lightweight U-Net for High-Resolution Breast Imaging ( http://arxiv.org/abs/2011.13698v1 ) ライセンス: Link先を確認	Mickael Tardy, Diana Mateus	(参考訳) 乳癌検診における悪性度検出における完全畳み込みニューラルネットワークの検討を行った。我々は,ネットワークの精度と計算複雑性との間の許容範囲の妥協を求める教師付きセグメンテーションタスクに取り組んでいる。 We study the fully convolutional neural networks in the context of malignancy detection for breast cancer screening. We work on a supervised segmentation task looking for an acceptable compromise between the precision of the network and the computational complexity.	翻訳日:2022-09-20 02:23:02 公開日:2020-11-27
# 3D Invisible Cloak 3D Invisible Cloak ( http://arxiv.org/abs/2011.13705v1 ) ライセンス: Link先を確認	Mingfu Xue, Can He, Zhiyu Wu, Jian Wang, Zhe Liu, Weiqiang Liu	(参考訳) 本稿では,実世界の人検知器に対する新たな物理的ステルス攻撃を提案する。提案手法では, 敵のパッチを生成し, 実際の衣服に印刷することにより, 3次元の目立たないマントを作製する。クロークを身に着けている人は、人検知器の検知を回避し、ステルスを達成できる。 3次元の物理的制約(ラジアン、シワ、オクルージョン、アングルなど)が人的ステルス攻撃に与える影響を考察し、3次元の見えないクロークを生成するための3次元変換を提案する。我々は、現実の服に敵のパッチを印刷することで、難易度と複雑な3D物理シナリオの下で3D空間でステルス攻撃を行う。従来の3次元変換は、最適化プロセス中にパッチ上で実行される。さらに, 最適3次元目視クロークの生成方法について検討した。具体的には、特定の形状や色の入力画像を選択して最適な3d目に見えないクロークを生成する方法を検討する。また、物体検出器を他の物体と誤認させることに成功し、完全に姿を消す方法、すなわち物体として検出されない方法も検討する。最後に,デジタルドメインと物理世界での提案する攻撃の性能を体系的に評価するための体系的評価フレームワークを提案する。様々な屋内・屋外の物理的シナリオにおける実験結果から,提案手法は複雑で困難な物理的条件下であっても頑健で有効であることが明らかとなった。デジタルドメイン(inriaデータセット)のアタック成功率は86.56%であり、物理的世界における静的および動的ステルスアタックパフォーマンスは、それぞれ100%と77%であり、既存の作業よりもはるかに優れている。 In this paper, we propose a novel physical stealth attack against the person detectors in real world. The proposed method generates an adversarial patch, and prints it on real clothes to make a three dimensional (3D) invisible cloak. Anyone wearing the cloak can evade the detection of person detectors and achieve stealth. We consider the impacts of those 3D physical constraints (i.e., radian, wrinkle, occlusion, angle, etc.) on person stealth attacks, and propose 3D transformations to generate 3D invisible cloak. We launch the person stealth attacks in 3D physical space instead of 2D plane by printing the adversarial patches on real clothes under challenging and complex 3D physical scenarios. The conventional and 3D transformations are performed on the patch during its optimization process. Further, we study how to generate the optimal 3D invisible cloak. Specifically, we explore how to choose input images with specific shapes and colors to generate the optimal 3D invisible cloak. Besides, after successfully making the object detector misjudge the person as other objects, we explore how to make a person completely disappeared, i.e., the person will not be detected as any objects. Finally, we present a systematic evaluation framework to methodically evaluate the performance of the proposed attack in digital domain and physical world. Experimental results in various indoor and outdoor physical scenarios show that, the proposed person stealth attack method is robust and effective even under those complex and challenging physical conditions, such as the cloak is wrinkled, obscured, curved, and from different angles. The attack success rate in digital domain (Inria data set) is 86.56%, while the static and dynamic stealth attack performance in physical world is 100% and 77%, respectively, which are significantly better than existing works.	翻訳日:2022-09-20 02:22:58 公開日:2020-11-27
# 非教師者再識別のための非対称分岐による教師学生ネットワークの多様性向上 Enhancing Diversity in Teacher-Student Networks via Asymmetric branches for Unsupervised Person Re-identification ( http://arxiv.org/abs/2011.13776v1 ) ライセンス: Link先を確認	Hao Chen, Benoit Lagadec, Francois Bremond	(参考訳) unsupervised person re-identification (re-id)の目的は、労働集約的なアイデンティティアノテーションなしで差別的特徴を学ぶことである。 state-of-the-art unsupervised re-idメソッドは、ターゲットドメイン内の未ラベル画像に擬似ラベルを割り当て、ノイズの多い擬似ラベルから学習する。最近導入された平均教師モデルはラベルノイズを緩和する有望な方法である。しかし、訓練期間中、自己学習型教師学生ネットワークはすぐにコンセンサスに収束し、局所的な最小限に繋がる。ニューラルネットワーク内で非対称構造を用いてこの問題に対処する可能性を探る。まず, 特徴を異なる方法で抽出するために非対称分岐が提案され, 出現特徴の多様性が向上した。そこで,提案したクロスブランチ・インスペクションにより,一方の分枝が他方の分枝から監督を受け,異なる知識を伝達し,教師と学生のネットワーク間の重みの多様性を高める。拡張実験により,提案手法は,教師なし領域適応と教師なしRe-IDタスクの両方において,従来よりも大幅に性能が向上することが示された。 The objective of unsupervised person re-identification (Re-ID) is to learn discriminative features without labor-intensive identity annotations. State-of-the-art unsupervised Re-ID methods assign pseudo labels to unlabeled images in the target domain and learn from these noisy pseudo labels. Recently introduced Mean Teacher Model is a promising way to mitigate the label noise. However, during the training, self-ensembled teacher-student networks quickly converge to a consensus which leads to a local minimum. We explore the possibility of using an asymmetric structure inside neural network to address this problem. First, asymmetric branches are proposed to extract features in different manners, which enhances the feature diversity in appearance signatures. Then, our proposed cross-branch supervision allows one branch to get supervision from the other branch, which transfers distinct knowledge and enhances the weight diversity between teacher and student networks. Extensive experiments show that our proposed method can significantly surpass the performance of previous work on both unsupervised domain adaptation and fully unsupervised Re-ID tasks.	翻訳日:2022-09-20 02:22:29 公開日:2020-11-27
# 点雲の3次元意味セグメンテーションのための距離特徴密度を持つ球面補間畳み込みネットワーク Spherical Interpolated Convolutional Network with Distance-Feature Density for 3D Semantic Segmentation of Point Clouds ( http://arxiv.org/abs/2011.13784v1 ) ライセンス: Link先を確認	Guangming Wang, Yehui Yang, Huixin Zhang, Zhe Liu, and Hesheng Wang	(参考訳) 点雲の意味的セグメンテーションは、ロボットにとって環境認識の重要な部分である。しかし,点雲の非構造性から,従来の3次元畳み込みカーネルを直接採用して生の3次元点雲から特徴を抽出することは困難である。本稿では,従来の格子状3次元畳み込み演算子に代わる球面補間畳み込み演算子を提案する。新たに提案する特徴抽出演算子により,ネットワークの精度が向上し,ネットワークのパラメータが低減される。さらに,距離を補間重みとして,点雲補間法の欠陥を分析し,距離と特徴相関を組み合わせることにより,自己学習した距離特徴密度を提案する。提案手法は球状補間畳み込みネットワークの特徴抽出をより合理的かつ効果的に行う。提案ネットワークの有効性をポイントクラウドの3次元意味セグメンテーションタスクで実証した。実験の結果,提案手法はScanNetデータセットとParis-Lille-3Dデータセットで良好な性能を示すことがわかった。 The semantic segmentation of point clouds is an important part of the environment perception for robots. However, it is difficult to directly adopt the traditional 3D convolution kernel to extract features from raw 3D point clouds because of the unstructured property of point clouds. In this paper, a spherical interpolated convolution operator is proposed to replace the traditional grid-shaped 3D convolution operator. This newly proposed feature extraction operator improves the accuracy of the network and reduces the parameters of the network. In addition, this paper analyzes the defect of point cloud interpolation methods based on the distance as the interpolation weight and proposes the self-learned distance-feature density by combining the distance and the feature correlation. The proposed method makes the feature extraction of spherical interpolated convolution network more rational and effective. The effectiveness of the proposed network is demonstrated on the 3D semantic segmentation task of point clouds. Experiments show that the proposed method achieves good performance on the ScanNet dataset and Paris-Lille-3D dataset.	翻訳日:2022-09-20 02:22:08 公開日:2020-11-27
# 四点制約を用いた一般化ポス・アンド・スケール推定 Generalized Pose-and-Scale Estimation using 4-Point Congruence Constraints ( http://arxiv.org/abs/2011.13817v1 ) ライセンス: Link先を確認	Victor Fragoso, Sudipta Sinha	(参考訳) 一般化カメラの絶対的なポーズを4つの3次元点-線対から未知の内部スケールで計算する新しい方法gP4Pcを提案する。多くのポーズ・アンド・スケール法とは異なり、gP4Pcは未知の類似性変換に関連する4点の2つの集合によって定義される形状の合同から生じる制約に基づいている。問題に対する新しいパラメトリゼーションを選択することにより、4つのスカラー変数の2次方程式の系を導出する。変数は、カメラセンターからの光線に沿って3dポイントの距離を表す。このシステムをGroebnerベースベースの自動多項式解法で解いた後、効率的な3Dポイントアライメント法を用いて類似性変換を計算する。また,計算的に非常に効率的で,既存の解法よりも約3倍高速であるコプラナー点の場合には,解法の特殊変種も提案する。実データと合成データを用いた実験により, RANSACフレームワーク内で使用した場合, gP4Pcは, 競合する数値安定性, 精度, 騒音に対する頑健性を実現しつつ, 総走行時間において最速の手法であることが示された。 We present gP4Pc, a new method for computing the absolute pose of a generalized camera with unknown internal scale from four corresponding 3D point-and-ray pairs. Unlike most pose-and-scale methods, gP4Pc is based on constraints arising from the congruence of shapes defined by two sets of four points related by an unknown similarity transformation. By choosing a novel parametrization for the problem, we derive a system of four quadratic equations in four scalar variables. The variables represent the distances of 3D points along the rays from the camera centers. After solving this system via Groebner basis-based automatic polynomial solvers, we compute the similarity transformation using an efficient 3D point-point alignment method. We also propose a specialized variant of our solver for the case of coplanar points, which is computationally very efficient and about 3x faster than the fastest existing solver. Our experiments on real and synthetic datasets, demonstrate that gP4Pc is among the fastest methods in terms of total running time when used within a RANSAC framework, while achieving competitive numerical stability, accuracy, and robustness to noise.	翻訳日:2022-09-20 02:21:22 公開日:2020-11-27
# 自律運転における3次元lidarに基づく映像物体検出のための時間チャネルトランスフォーマ Temporal-Channel Transformer for 3D Lidar-Based Video Object Detection in Autonomous Driving ( http://arxiv.org/abs/2011.13628v1 ) ライセンス: Link先を確認	Zhenxun Yuan, Xiao Song, Lei Bai, Wengang Zhou, Zhe Wang, Wanli Ouyang	(参考訳) 業界における自動運転の強い需要は、3Dオブジェクト検出への強い関心をもたらし、多くの優れた3Dオブジェクト検出アルゴリズムを生み出した。しかし、ほとんどのアルゴリズムは単一フレームのデータのみをモデル化し、データのシーケンスの時間的情報を無視している。本研究では,lidarデータから映像物体を検出するための空間-時間領域とチャネル領域の関係をモデル化する,temporal-channel transformerと呼ばれる新しいトランスを提案する。このトランスの特別な設計として、エンコーダにエンコードされる情報は、デコーダのものと異なる、すなわち、エンコーダは、複数のフレームの時間的チャネル情報をエンコードし、デコーダは、現在のフレームの空間的チャネル情報をボクセル的にデコードする。具体的には、トランスの時間チャネルエンコーダは、異なるチャネルやフレームの特徴間の相関を利用して、異なるチャネルやフレームの情報をエンコードするように設計されている。一方、変圧器の空間デコーダは、現在のフレームの各位置の情報を復号する。検出ヘッドで物体検出を行う前に、ゲート機構を配置して現在のフレームの特徴を再検討し、アップサンプリング処理とともに対象フレームの表現を反復的に洗練することにより、対象情報を無関係にフィルタリングする。実験の結果,nuscenesベンチマークでグリッドvoxelを用いた3次元物体検出の最先端性能が得られた。 The strong demand of autonomous driving in the industry has lead to strong interest in 3D object detection and resulted in many excellent 3D object detection algorithms. However, the vast majority of algorithms only model single-frame data, ignoring the temporal information of the sequence of data. In this work, we propose a new transformer, called Temporal-Channel Transformer, to model the spatial-temporal domain and channel domain relationships for video object detecting from Lidar data. As a special design of this transformer, the information encoded in the encoder is different from that in the decoder, i.e. the encoder encodes temporal-channel information of multiple frames while the decoder decodes the spatial-channel information for the current frame in a voxel-wise manner. Specifically, the temporal-channel encoder of the transformer is designed to encode the information of different channels and frames by utilizing the correlation among features from different channels and frames. On the other hand, the spatial decoder of the transformer will decode the information for each location of the current frame. Before conducting the object detection with detection head, the gate mechanism is deployed for re-calibrating the features of current frame, which filters out the object irrelevant information by repetitively refine the representation of target frame along with the up-sampling process. Experimental results show that we achieve the state-of-the-art performance in grid voxel-based 3D object detection on the nuScenes benchmark.	翻訳日:2022-09-20 02:13:03 公開日:2020-11-27
# インスタンスワイズ3次元再構成のためのディスクリプタフリーマルチビュー領域マッチング Descriptor-Free Multi-View Region Matching for Instance-Wise 3D Reconstruction ( http://arxiv.org/abs/2011.13649v1 ) ライセンス: Link先を確認	Takuma Doi, Fumio Okura, Toshiki Nagahara, Yasuyuki Matsushita, Yasushi Yagi	(参考訳) 本稿では,テクスチャや形状記述子マッチングに頼らずに,インスタンスセグメンテーションのマルチビュー拡張を提案する。マルチビューインスタンスのセグメンテーションは、テクスチャや形状記述子を使ったマルチビューマッチングが難しいため、繰り返しのテクスチャや形、例えば植物葉を持つシーンでは困難になる。そこで本研究では,特徴記述子に依存しないエピポーラ幾何学に基づく多視点領域マッチング手法を提案する。さらに, エピポーラ領域マッチングは, 容易にインスタンスセグメンテーションに統合でき, 3次元再構成に有効であることを示す。実験により,マルチビューインスタンスマッチングと3次元再構成の精度が,ベースライン法と比較して向上した。 This paper proposes a multi-view extension of instance segmentation without relying on texture or shape descriptor matching. Multi-view instance segmentation becomes challenging for scenes with repetitive textures and shapes, e.g., plant leaves, due to the difficulty of multi-view matching using texture or shape descriptors. To this end, we propose a multi-view region matching method based on epipolar geometry, which does not rely on any feature descriptors. We further show that the epipolar region matching can be easily integrated into instance segmentation and effective for instance-wise 3D reconstruction. Experiments demonstrate the improved accuracy of multi-view instance matching and the 3D reconstruction compared to the baseline methods.	翻訳日:2022-09-20 02:12:39 公開日:2020-11-27
# 点群における実時間物体認識とポーズ推定 Towards real-time object recognition and pose estimation in point clouds ( http://arxiv.org/abs/2011.13669v1 ) ライセンス: Link先を確認	Marlon Marcon, Olga Regina Pereira Bellon and Luciano Silva	(参考訳) 物体認識と6次元ポーズ推定はコンピュータビジョン応用において非常に難しい課題である。このようなタスクの効率性にも拘わらず、標準メソッドはリアルタイム処理速度に遠く及ばない。本稿では,オブジェクトの細かな6DoFのポーズを,リアルタイムに現実的なシナリオに適用する新しいパイプラインを提案する。私たちは提案を3つに分けた。まず、Color機能分類では、ImageNetでトレーニングされたトレーニング済みのCNNカラー機能を使用してオブジェクト検出を行う。特徴ベース登録モジュールは粗いポーズ推定を行い、最後に細調整ステップはICPベースの密登録を行う。提案手法は,rgb-dシーンのデータセット上で約83\%の精度を実現する。処理時間については、オブジェクト検出タスクをフレーム処理速度最大90FPSで行い、フル実行戦略において、ポーズ推定を約14FPSで行う。我々は,提案のモジュール性により,必要時にのみ実行可能とし,マルチタスクの状況でもリアルタイム処理をアンロックするスケジュール実行を行うことができることを議論した。 Object recognition and 6DoF pose estimation are quite challenging tasks in computer vision applications. Despite efficiency in such tasks, standard methods deliver far from real-time processing rates. This paper presents a novel pipeline to estimate a fine 6DoF pose of objects, applied to realistic scenarios in real-time. We split our proposal into three main parts. Firstly, a Color feature classification leverages the use of pre-trained CNN color features trained on the ImageNet for object detection. A Feature-based registration module conducts a coarse pose estimation, and finally, a Fine-adjustment step performs an ICP-based dense registration. Our proposal achieves, in the best case, an accuracy performance of almost 83\% on the RGB-D Scenes dataset. Regarding processing time, the object detection task is done at a frame processing rate up to 90 FPS, and the pose estimation at almost 14 FPS in a full execution strategy. We discuss that due to the proposal's modularity, we could let the full execution occurs only when necessary and perform a scheduled execution that unlocks real-time processing, even for multitask situations.	翻訳日:2022-09-20 02:12:10 公開日:2020-11-27
# Progressively Stacking 2.0: BERTトレーニングスピードアップのための多段階階層トレーニング手法 Progressively Stacking 2.0: A Multi-stage Layerwise Training Method for BERT Training Speedup ( http://arxiv.org/abs/2011.13635v1 ) ライセンス: Link先を確認	Cheng Yang, Shengnan Wang, Chao Yang, Yuechuan Li, Ru He, Jingqiao Zhang	(参考訳) BERTのような事前訓練された言語モデルは、多くの自然言語処理タスクにおいて大幅な精度向上を実現している。その有効性にもかかわらず、膨大な数のパラメータがBERTモデルのトレーニングを非常に困難にしている。本稿では,BERTのトレーニング時間を削減するため,効率的な多段階階層トレーニング(MSLT)手法を提案する。トレーニングプロセス全体をいくつかの段階に分割する。トレーニングは、少数のエンコーダ層しか持たない小さなモデルから始まり、新しいエンコーダ層を追加することで、徐々にモデルの深さを増加させます。それぞれの段階で、新たに追加されるエンコーダ層のトップ(出力層の近くに)のみをトレーニングします。以前の段階でトレーニングされた他のレイヤのパラメータは、現在の段階では更新されない。 BERTトレーニングでは、特に後方の計算時間が勾配同期のための通信時間を含む分散トレーニング環境では、後方の計算の方が前方の計算よりもはるかに時間がかかる。提案されたトレーニング戦略では、上位層のみが後方計算に参加し、ほとんどの層は前方計算にのみ参加する。これにより、計算効率と通信効率が大幅に向上する。実験の結果,本手法は性能低下を伴わずに110%以上のトレーニングスピードアップを達成できることがわかった。 Pre-trained language models, such as BERT, have achieved significant accuracy gain in many natural language processing tasks. Despite its effectiveness, the huge number of parameters makes training a BERT model computationally very challenging. In this paper, we propose an efficient multi-stage layerwise training (MSLT) approach to reduce the training time of BERT. We decompose the whole training process into several stages. The training is started from a small model with only a few encoder layers and we gradually increase the depth of the model by adding new encoder layers. At each stage, we only train the top (near the output layer) few encoder layers which are newly added. The parameters of the other layers which have been trained in the previous stages will not be updated in the current stage. In BERT training, the backward computation is much more time-consuming than the forward computation, especially in the distributed training setting in which the backward computation time further includes the communication time for gradient synchronization. In the proposed training strategy, only top few layers participate in backward computation, while most layers only participate in forward computation. Hence both the computation and communication efficiencies are greatly improved. Experimental results show that the proposed method can achieve more than 110% training speedup without significant performance degradation.	翻訳日:2022-09-20 02:05:49 公開日:2020-11-27
# ニューロモルフィックハードウェア制約緩和のためのスパイクニューラルネットワークのコンパイル Compiling Spiking Neural Networks to Mitigate Neuromorphic Hardware Constraints ( http://arxiv.org/abs/2011.13965v1 ) ライセンス: Link先を確認	Adarsha Balaji and Anup Das	(参考訳) spiking neural networks (snns) は,<resource}-および<power}-constrained platform上で時空間パターン認識を行う効率的な計算モデルである。ニューロモルフィックハードウェア上で実行されるSNNは、これらのプラットフォームのエネルギー消費をさらに削減することができる。モデルサイズと複雑さの増大に伴い、SNNベースのアプリケーションをタイルベースのニューロモルフィックハードウェアにマッピングすることはますます困難になっている。これは神経シナプスコア(viz. a crossbar)がシナプス後ニューロンごとに一定の数のシナプス前接続しか持たないという制限に起因する。ニューロンごとに多くのニューロンとシナプス前接続を持つ複雑なsnnベースのモデルでは、(1)トレーニング後にクロスバーリソースに適合するために接続を刈り取る必要があるため、モデル品質の低下、例えば正確性、(2)ニューロンとシナプスをハードウェアの神経-シナプスコアに分割して配置する必要があるため、レイテンシとエネルギー消費の増加につながる可能性がある。本研究では,(1)複数のシナプス前接続を有するニューロン機能を,複数の均質な神経単位に分解し,クロスバーの利用を著しく改善し,全てのシナプス前接続を保持させる新しいアンロール法と,(2)エネルギー消費とスパイクレイテンシを最小化することを目的としたニューロモルフィックハードウェア上にsnsをマッピングする新しい手法であるspinsmapを提案する。 Spiking Neural Networks (SNNs) are efficient computation models to perform spatio-temporal pattern recognition on {resource}- and {power}-constrained platforms. SNNs executed on neuromorphic hardware can further reduce energy consumption of these platforms. With increasing model size and complexity, mapping SNN-based applications to tile-based neuromorphic hardware is becoming increasingly challenging. This is attributed to the limitations of neuro-synaptic cores, viz. a crossbar, to accommodate only a fixed number of pre-synaptic connections per post-synaptic neuron. For complex SNN-based models that have many neurons and pre-synaptic connections per neuron, (1) connections may need to be pruned after training to fit onto the crossbar resources, leading to a loss in model quality, e.g., accuracy, and (2) the neurons and synapses need to be partitioned and placed on the neuro-sypatic cores of the hardware, which could lead to increased latency and energy consumption. In this work, we propose (1) a novel unrolling technique that decomposes a neuron function with many pre-synaptic connections into a sequence of homogeneous neural units to significantly improve the crossbar utilization and retain all pre-synaptic connections, and (2) SpiNeMap, a novel methodology to map SNNs on neuromorphic hardware with an aim to minimize energy consumption and spike latency.	翻訳日:2022-09-20 02:05:11 公開日:2020-11-27
# 信念エントロピーに基づく区間値信頼構造の組み合わせ Combination of interval-valued belief structures based on belief entropy ( http://arxiv.org/abs/2011.13636v1 ) ライセンス: Link先を確認	Miao Qin, Yongchuan Tang	(参考訳) 本稿では,デンプスター・シェーファー証拠理論の枠組みにおける区間値信念構造の組み合わせと正規化の問題について検討する。既存のアプローチをレビューし、徹底的に分析する。従来のアプローチの利点と欠点を述べる。不確実性尺度に基づく新しい最適性アプローチが開発され、区間値の信念構造を結合する問題は、基本確率代入の組み合わせに縮退する。提案手法の合理性を示す数値的な例を示す。 This paper investigates the issues of combination and normalization of interval-valued belief structures within the framework of Dempster-Shafer theory of evidence. Existing approaches are reviewed and thoroughly analyzed. The advantages and drawbacks of previous approach are presented. A new optimality approach based on uncertainty measure is developed, where the problem of combining interval-valued belief structures degenerates into combining basic probability assignments. Numerical examples are provided to illustrate the rationality of the proposed approach.	翻訳日:2022-09-20 02:04:16 公開日:2020-11-27
# 人間のvrデモによる操作行動の構造的・意味的モデルの自動獲得 Automated acquisition of structured, semantic models of manipulation activities from human VR demonstration ( http://arxiv.org/abs/2011.13689v1 ) ライセンス: Link先を確認	Andrei Haidu and Michael Beetz	(参考訳) 本稿では,仮想環境から,人間の動作,ロボットの理解,日常的な活動の収集とアノテートが可能なシステムを提案する。人間の動きは、人工のバーチャルリアリティーデバイスと眼球追跡機能を使ってシミュレーションされた世界にマッピングされる。仮想世界のすべての相互作用は物理的にシミュレートされ、運動とその効果は現実世界と密接に関連している。アクティビティ実行中、サブシンボリックデータロガーは、オフラインシーンの再現と再生を可能にするために、フレーム単位の環境と人間の視線を記録する。物理エンジンと組み合わせて、オンラインモニター(記号データロガー)は(様々な文法を用いて)解析し、シミュレートされた世界におけるイベント、アクション、およびそれらの影響を記録する。 In this paper we present a system capable of collecting and annotating, human performed, robot understandable, everyday activities from virtual environments. The human movements are mapped in the simulated world using off-the-shelf virtual reality devices with full body, and eye tracking capabilities. All the interactions in the virtual world are physically simulated, thus movements and their effects are closely relatable to the real world. During the activity execution, a subsymbolic data logger is recording the environment and the human gaze on a per-frame basis, enabling offline scene reproduction and replays. Coupled with the physics engine, online monitors (symbolic data loggers) are parsing (using various grammars) and recording events, actions, and their effects in the simulated world.	翻訳日:2022-09-20 02:04:08 公開日:2020-11-27
# 近似知識コンパイルのための下限 Lower Bounds for Approximate Knowledge Compilation ( http://arxiv.org/abs/2011.13721v1 ) ライセンス: Link先を確認	Alexis de Colnet and Stefan Mengel	(参考訳) 知識コンパイルは、異なる表現言語の簡潔性と効率のトレードオフを研究する。多くの言語では、表現サイズには強い下限が知られているが、最近の研究は、いくつかの言語では、近似コンパイルを用いてこれらの境界をバイパスできることを示している。その考え方は、エラーの数をコントロールすることができる知識の近似をコンパイルすることである。効率的なモデルカウントと確率的推論をサポートするため,確率的推論などの文脈に適したコンパイル言語d-dnnf(decomposable negation normal form)の回路に焦点を当てた。さらに、d-DNNF には、近似に緩和することで回避できるような、既知のサイズの低い境界が存在する。本稿では,従来研究されてきた弱い近似と,近年のアルゴリズム的な結果に用いられている強い近似という,近似の2つの概念を定式化する。次に、d-DNNFによる近似の下位境界を示し、文献の正の結果を補完する。 Knowledge compilation studies the trade-off between succinctness and efficiency of different representation languages. For many languages, there are known strong lower bounds on the representation size, but recent work shows that, for some languages, one can bypass these bounds using approximate compilation. The idea is to compile an approximation of the knowledge for which the number of errors can be controlled. We focus on circuits in deterministic decomposable negation normal form (d-DNNF), a compilation language suitable in contexts such as probabilistic reasoning, as it supports efficient model counting and probabilistic inference. Moreover, there are known size lower bounds for d-DNNF which by relaxing to approximation one might be able to avoid. In this paper we formalize two notions of approximation: weak approximation which has been studied before in the decision diagram literature and strong approximation which has been used in recent algorithmic results. We then show lower bounds for approximation by d-DNNF, complementing the positive results from the literature.	翻訳日:2022-09-20 02:03:53 公開日:2020-11-27
# 協調作業における人間の反応・行動・嗜好調査 Investigating Human Response, Behaviour, and Preference in Joint-Task Interaction ( http://arxiv.org/abs/2011.14016v1 ) ライセンス: Link先を確認	Alan Lindsay, Bart Craenen, Sara Dalzel-Job, Robin L. Hill, Ronald P. A. Petrick	(参考訳) 人間の相互作用は、非言語的手がかりを含む幅広い信号に依存する。効果的な説明可能計画(XAIP)エージェントを開発するためには,これらの通信チャネルの範囲と有用性を理解することが重要である。我々の出発点は、共同作業相互作用と認知科学研究の既存の成果である。私たちの意図は、これらのレッスンは、ユーザの感情的尺度(つまり、ユーザの感情的状態を計画モデルに明示的に組み込む)を含む、ユーザの反応に応じて振る舞いを条件付けている、計画手法の使用を含むインタラクションエージェントの設計を通知できることです。我々は計画に基づくエージェントの動作と共同作業の相互作用の交差点でいくつかの概念を特定し、これらを用いて2つのエージェントを設計した。我々はこれらのエージェントと相互作用する人間の行動と反応を調べる実験を設計した。本稿では,デザインされた研究と,検討中の重要な疑問について述べる。また,シミュレーションユーザに対する2つのエージェントの挙動を実証分析により検討した。 Human interaction relies on a wide range of signals, including non-verbal cues. In order to develop effective Explainable Planning (XAIP) agents it is important that we understand the range and utility of these communication channels. Our starting point is existing results from joint task interaction and their study in cognitive science. Our intention is that these lessons can inform the design of interaction agents -- including those using planning techniques -- whose behaviour is conditioned on the user's response, including affective measures of the user (i.e., explicitly incorporating the user's affective state within the planning model). We have identified several concepts at the intersection of plan-based agent behaviour and joint task interaction and have used these to design two agents: one reactive and the other partially predictive. We have designed an experiment in order to examine human behaviour and response as they interact with these agents. In this paper we present the designed study and the key questions that are being investigated. We also present the results from an empirical analysis where we examined the behaviour of the two agents for simulated users.	翻訳日:2022-09-20 02:03:14 公開日:2020-11-27
# 多クラス分類のための深層建築の不確実性駆動アンサンブル胸部X線画像におけるCOVID-19診断への応用 Uncertainty-driven ensembles of deep architectures for multiclass classification. Application to COVID-19 diagnosis in chest X-ray images ( http://arxiv.org/abs/2011.14894v1 ) ライセンス: Link先を確認	Juan E. Arco, A. Ortiz, J.Ramirez, F.J. Martinez-Murcia, Yu-Dong Zhang, Juan M. Gorriz	(参考訳) 呼吸器疾患は毎年何百万人もの人を殺す。これらの病理の診断は、手動で時間を要するプロセスであり、サーバー間の変動、診断と治療の遅延がある。最近の新型コロナウイルス(COVID-19)パンデミックは、肺炎の診断を自動化するためのシステム開発の必要性を示す一方で、畳み込みニューラルネットワーク(CNN)は、医療画像の自動分類に優れた選択肢であることが証明されている。しかし、この文脈で信頼度分類を提供する必要性を考えると、モデルの予測の信頼性を定量化することが重要である。本研究では,ベイズ深層学習に基づく多段階アンサンブル分類システムを提案し,各分類決定の不確かさを定量化しながら性能を最大化する。このツールは、予測の不確実性に応じて結果を重み付けすることで、異なるアーキテクチャから抽出した情報を組み合わせる。ベイズネットワークの性能は、コントロール対細菌性肺炎、ウイルス性肺炎対covid-19肺炎の4つの病因を同時に区別する実際のシナリオで評価される。 3段階決定木を用いて4級分類を3つの二分分類に分割し、98.06%の精度を与え、最近の文献で得られた結果を上回った。この高い性能を得るのに必要な前処理の削減は、予測の信頼性に関する情報に加えて、臨床医の助けとなるシステムの適用性を示すものである。 Respiratory diseases kill million of people each year. Diagnosis of these pathologies is a manual, time-consuming process that has inter and intra-observer variability, delaying diagnosis and treatment. The recent COVID-19 pandemic has demonstrated the need of developing systems to automatize the diagnosis of pneumonia, whilst Convolutional Neural Network (CNNs) have proved to be an excellent option for the automatic classification of medical images. However, given the need of providing a confidence classification in this context it is crucial to quantify the reliability of the model's predictions. In this work, we propose a multi-level ensemble classification system based on a Bayesian Deep Learning approach in order to maximize performance while quantifying the uncertainty of each classification decision. This tool combines the information extracted from different architectures by weighting their results according to the uncertainty of their predictions. Performance of the Bayesian network is evaluated in a real scenario where simultaneously differentiating between four different pathologies: control vs bacterial pneumonia vs viral pneumonia vs COVID-19 pneumonia. A three-level decision tree is employed to divide the 4-class classification into three binary classifications, yielding an accuracy of 98.06% and overcoming the results obtained by recent literature. The reduced preprocessing needed for obtaining this high performance, in addition to the information provided about the reliability of the predictions evidence the applicability of the system to be used as an aid for clinicians.	翻訳日:2022-09-20 01:57:13 公開日:2020-11-27
# 神経多様体としての2次元フレーム非視覚空間の表現とその情報幾何解釈 Representation of 2D frame less visual space as a neural manifold and its information geometric interpretation ( http://arxiv.org/abs/2011.13585v1 ) ライセンス: Link先を確認	Debasis Mazumdar	(参考訳) 情報幾何学のフレームワークにおけるニューラル多様体としての2次元フレームの表現とモデリングについて述べる。視覚空間の双曲性の起源は神経科学の証拠を用いて研究されている。そこで本研究では,ヒト脳における空間情報の処理,特に距離の推定,幾何学曲線の知覚等を,フィッシャー・ラオ計量を用いたパラメトリック確率空間でモデル化できることを提案する。空間のコンパクト性、凸性、微分性は解析され、ブセマンが提唱した G 空間の公理に従うことが分かる。さらに、これは定数負曲率の斉次リーマン空間と考えることができる。したがって、空間が測地線を生じさせることが保証される。多くの視覚現象を表す測地学の計算機シミュレーションを行い、視覚空間の双曲構造を提唱する。シミュレーション結果と公開実験データの比較を行った。 Representation of 2D frame less visual space as neural manifold and its modelling in the frame work of information geometry is presented. Origin of hyperbolic nature of the visual space is investigated using evidences from neuroscience. Based on the results we propose that the processing of spatial information, particularly estimation of distance, perceiving geometrical curves etc. in the human brain can be modeled in a parametric probability space endowed with Fisher-Rao metric. Compactness, convexity and differentiability of the space is analysed and found that they obey the axioms of G space, proposed by Busemann. Further it is shown that it can be considered as a homogeneous Riemannian space of constant negative curvature. It is therefore ensured that the space yields geodesics into it. Computer simulation of geodesics representing a number of visual phenomena and advocating the hyperbolic structure of visual space is carried out. Comparison of the simulated results with the published experimental data is presented.	翻訳日:2022-09-20 01:56:50 公開日:2020-11-27
# モジュール型深層強化学習と政策伝達による適応型自動化 Adaptable Automation with Modular Deep Reinforcement Learning and Policy Transfer ( http://arxiv.org/abs/2012.01934v1 ) ライセンス: Link先を確認	Zohreh Raziei, Mohsen Moghaddam	(参考訳) 深層強化学習(rl)の最近の進歩は、機械が所定のタスクを実行するための最適なポリシーを自律的に学習できるインテリジェントオートメーションにとって、前例のない機会を生み出した。しかし、現在のディープrlアルゴリズムは、主に狭い範囲のタスクに特化しており、サンプル非効率であり、十分な安定性を欠いているため、産業的な採用を妨げている。本稿では,タスクのモジュール化と伝達学習の概念に基づいて,ハイパーアクタソフトアクタクリティカル(HASAC)RLフレームワークを開発し,テストすることによって,この制限に対処する。 HASACの目標は、エージェントが学習したタスクのポリシーを「ハイパーアクター」を介して新しいタスクに転送することで、新しいタスクへの適応性を高めることである。 HASACフレームワークは、新しい仮想ロボット操作ベンチマークであるMeta-Worldでテストされている。数値実験により、HASACは、報酬値、成功率、タスク完了時間の観点から、最先端の深部RLアルゴリズムよりも優れた性能を示す。 Recent advances in deep Reinforcement Learning (RL) have created unprecedented opportunities for intelligent automation, where a machine can autonomously learn an optimal policy for performing a given task. However, current deep RL algorithms predominantly specialize in a narrow range of tasks, are sample inefficient, and lack sufficient stability, which in turn hinder their industrial adoption. This article tackles this limitation by developing and testing a Hyper-Actor Soft Actor-Critic (HASAC) RL framework based on the notions of task modularization and transfer learning. The goal of the proposed HASAC is to enhance the adaptability of an agent to new tasks by transferring the learned policies of former tasks to the new task via a "hyper-actor". The HASAC framework is tested on a new virtual robotic manipulation benchmark, Meta-World. Numerical experiments show superior performance by HASAC over state-of-the-art deep RL algorithms in terms of reward value, success rate, and task completion time.	翻訳日:2022-09-20 01:56:36 公開日:2020-11-27
# 実体埋め込みベクトルを用いたハイブリッドガウス過程モデルを用いた細胞間知識伝達 Knowledge transfer across cell lines using Hybrid Gaussian Process models with entity embedding vectors ( http://arxiv.org/abs/2011.13863v1 ) ライセンス: Link先を確認	Clemens Hutter, Moritz von Stosch, Mariano Nicolas Cruz Bournazou, Alessandro Butt\'e	(参考訳) 現在までに生化学プロセスを開発するために多くの実験が行われている。生成されたデータは一度だけ使用され、開発のための決定を下す。既に開発されたプロセスのデータを利用して、新しいプロセスの予測を行い、必要な実験の数を大幅に削減できるだろうか。異なる製品のプロセスは振る舞いの違いを示し、通常、サブセットのみが同じように振る舞う。したがって、複数の製品にまたがるプロセスデータに対する効果的な学習には、製品のアイデンティティを合理的に表現する必要がある。ガウス過程回帰モデルへの入力となるベクトルを埋め込み、積の同一性(圏的特徴)を表現することを提案する。組込みベクトルがプロセスデータからどのように学習できるかを示し、製品類似性の概念を解釈可能であることを示す。性能改善は、シミュレーションされたクロスプロダクト学習タスクにおける従来のワンホット符号化と比較される。総じて、提案手法はウェットラブ実験において有意な減少をもたらす可能性がある。 To date, a large number of experiments are performed to develop a biochemical process. The generated data is used only once, to take decisions for development. Could we exploit data of already developed processes to make predictions for a novel process, we could significantly reduce the number of experiments needed. Processes for different products exhibit differences in behaviour, typically only a subset behave similar. Therefore, effective learning on multiple product spanning process data requires a sensible representation of the product identity. We propose to represent the product identity (a categorical feature) by embedding vectors that serve as input to a Gaussian Process regression model. We demonstrate how the embedding vectors can be learned from process data and show that they capture an interpretable notion of product similarity. The improvement in performance is compared to traditional one-hot encoding on a simulated cross product learning task. All in all, the proposed method could render possible significant reductions in wet-lab experiments.	翻訳日:2022-09-20 01:56:09 公開日:2020-11-27
# 地形モデルを用いたマラリアベクター飼育地の検出 Detection of Malaria Vector Breeding Habitats using Topographic Models ( http://arxiv.org/abs/2011.13714v1 ) ライセンス: Link先を確認	Aishwarya Jadhav	(参考訳) マラリアベクターの繁殖地として機能する停滞した水域の処理は、ほとんどのマラリア除去運動の基本的なステップである。しかし、大規模な水域の特定は高価であり、労働集約的で時間を要するため、資源が限られている国では困難である。水体を効率的に発見できる実用的なモデルは、現場労働者がスキャンする必要がある領域を大幅に減らし、限られた資源をターゲットにすることができる。そこで本研究では,可能でグローバルで高解像度なDEMデータに基づく実用的な地形モデルを提案する。ガーナのオプアシ地域を調査し,様々な地形特性が異なる水域に与える影響を調査し,水生生物形成に大きな影響を及ぼす特徴を明らかにする。複数のモデルの有効性をさらに評価する。我々の最良モデルは、衛星画像データを利用し、異なる設定で堅牢性を示すものでさえも、小さな水面の検出に地形変数を用いた以前の試みよりも著しく優れている。 Treatment of stagnant water bodies that act as a breeding site for malarial vectors is a fundamental step in most malaria elimination campaigns. However, identification of such water bodies over large areas is expensive, labour-intensive and time-consuming and hence, challenging in countries with limited resources. Practical models that can efficiently locate water bodies can target the limited resources by greatly reducing the area that needs to be scanned by the field workers. To this end, we propose a practical topographic model based on easily available, global, high-resolution DEM data to predict locations of potential vector-breeding water sites. We surveyed the Obuasi region of Ghana to assess the impact of various topographic features on different types of water bodies and uncover the features that significantly influence the formation of aquatic habitats. We further evaluate the effectiveness of multiple models. Our best model significantly outperforms earlier attempts that employ topographic variables for detection of small water sites, even the ones that utilize additional satellite imagery data and demonstrates robustness across different settings.	翻訳日:2022-09-20 01:55:51 公開日:2020-11-27
# 医用ハイパースペクトル画像解析における深層学習の動向 Trends in deep learning for medical hyperspectral image analysis ( http://arxiv.org/abs/2011.13974v1 ) ライセンス: Link先を確認	Uzair Khan, Paheding Sidike, Colin Elkin and Vijay Devabhaktuni	(参考訳) 深層学習のアルゴリズムは、過去10年間にいくつかの分野の関心を集め、医療用ハイパースペクトルイメージングは特に有望な分野である。以上より,医用ハイパースペクトル画像における深層学習の実施を論じるレビュー論文は存在せず,このレビュー論文が目指すのは,現在,深層学習を利用して医用ハイパースペクトル画像の効果的な分析を行う出版物を調べることである。本稿では,深層学習のブーム以来実施されてきた医療用ハイパースペクトル画像解析に関係し,適用可能な深層学習概念について論じる。本研究は, 医用ハイパースペクトル画像解析において, 深層学習を用いた分類, 分割, 検出について検討する。最後に、この規律に関連する現状と今後の課題と、その試みを克服するための取り組みについて論じる。 Deep learning algorithms have seen acute growth of interest in their applications throughout several fields of interest in the last decade, with medical hyperspectral imaging being a particularly promising domain. So far, to the best of our knowledge, there is no review paper that discusses the implementation of deep learning for medical hyperspectral imaging, which is what this review paper aims to accomplish by examining publications that currently utilize deep learning to perform effective analysis of medical hyperspectral imagery. This paper discusses deep learning concepts that are relevant and applicable to medical hyperspectral imaging analysis, several of which have been implemented since the boom in deep learning. This will comprise of reviewing the use of deep learning for classification, segmentation, and detection in order to investigate the analysis of medical hyperspectral imaging. Lastly, we discuss the current and future challenges pertaining to this discipline and the possible efforts to overcome such trials.	翻訳日:2022-09-20 01:55:35 公開日:2020-11-27
# 早期アルツハイマー病検出のためのMRI画像解析法 MRI Images Analysis Method for Early Stage Alzheimer's Disease Detection ( http://arxiv.org/abs/2012.00830v1 ) ライセンス: Link先を確認	Achraf Ben Miled, Taoufik Yeferny, and Amira ben Rabeh	(参考訳) アルツハイマー病(英: Alzheimer disease)は、記憶や認知機能を変える神経変性疾患である。この疾患の早期診断は、ミルド認知障害(MCI: Mild Cognitive Impairment)と呼ばれる予備段階の検出によって、依然として困難な問題である。本稿では,MCI 段階におけるアルツハイマー病を検出するために,MRI 画像から最も顕著な特徴を自動的に抽出する,事前学習ネットワーク AlexNet を実装した強力な分類アーキテクチャを提案する。 oasisデータベース脳の大規模データベースを用いて,提案手法を評価した。脳の様々な部分(前頭、矢状、軸)が用いられた。健常者210名とMRI210名を用いた96.83%の精度を実現した。 Alzheimer's disease is a neurogenerative disease that alters memories, cognitive functions leading to death. Early diagnosis of the disease, by detection of the preliminary stage, called Mild Cognitive Impairment (MCI), remains a challenging issue. In this respect, we introduce, in this paper, a powerful classification architecture that implements the pre-trained network AlexNet to automatically extract the most prominent features from Magnetic Resonance Imaging (MRI) images in order to detect the Alzheimer's disease at the MCI stage. The proposed method is evaluated using a big database from OASIS Database Brain. Various sections of the brain: frontal, sagittal and axial were used. The proposed method achieved 96.83% accuracy by using 420 subjects: 210 Normal and 210 MRI	翻訳日:2022-09-20 01:55:19 公開日:2020-11-27
# 弱いラベルを持つ脳動脈瘤分類のための解剖学的インフォームド3D CNN An anatomically-informed 3D CNN for brain aneurysm classification with weak labels ( http://arxiv.org/abs/2012.08645v1 ) ライセンス: Link先を確認	Tommaso Di Noto, Guillaume Marie, S\'ebastien Tourbier, Yasser Alem\'an-G\'omez, Guillaume Saliou, Meritxell Bach Cuadra, Patric Hagmann, Jonas Richiardi	(参考訳) 医療画像における検出タスクを実行するための一般的なアプローチは、初期セグメンテーションに依存することである。しかし、このアプローチは、医療専門家が描くのに反復的かつ時間のかかるvoxel-wiseアノテーションに強く依存している。ボクセルのマスクに代わる興味深い選択肢は、いわゆる「弱」ラベルである。これらは粗いアノテーションか、より正確ではないが、作成が著しく高速な過大なアノテーションである。本研究は,脳動脈瘤検出の課題を,教師付きセグメンテーション法やボクセルワイドデラインを用いた関連研究とは対照的に,弱いラベルを用いたパッチワイドバイナリ分類として扱う。我々のアプローチは、ほとんどの焦点疾患と同様に、異常なパッチ(大動脈瘤を含む)は異常のないものよりも多く、通常2つのクラスは異なる空間分布を持つという、データセット作成の非自明な課題に起因している。そこで本研究では,マルチスケール・マルチインプット・3D畳み込みニューラルネットワーク(CNN)を用いて,非バランスで空間的に歪んだデータセットの頻繁なシナリオに対処する。今回我々は,tof-mra(time-of-flight magnetic resonance angiography)を施行した214名 (83名, 131名) の脳動脈瘤の111例を経験した。我々は,ネットワークの難易度が増大する負のパッチサンプリングに対する2つの戦略を比較し,この選択が結果にどのように影響するかを示す。付加された空間情報が性能向上に寄与するかどうかを評価するために, 解剖学的にインフォームドされたCNNと, ベースライン, 空間非依存のCNNを比較した。容器のような負のパッチを含むより現実的で挑戦的なシナリオを考えると、前者は最も高い分類結果(精度$\simeq$95\%, AUROC$\simeq$0.95, AUPR$\simeq$0.71)を得た。 A commonly adopted approach to carry out detection tasks in medical imaging is to rely on an initial segmentation. However, this approach strongly depends on voxel-wise annotations which are repetitive and time-consuming to draw for medical experts. An interesting alternative to voxel-wise masks are so-called "weak" labels: these can either be coarse or oversized annotations that are less precise, but noticeably faster to create. In this work, we address the task of brain aneurysm detection as a patch-wise binary classification with weak labels, in contrast to related studies that rather use supervised segmentation methods and voxel-wise delineations. Our approach comes with the non-trivial challenge of the data set creation: as for most focal diseases, anomalous patches (with aneurysm) are outnumbered by those showing no anomaly, and the two classes usually have different spatial distributions. To tackle this frequent scenario of inherently imbalanced, spatially skewed data sets, we propose a novel, anatomically-driven approach by using a multi-scale and multi-input 3D Convolutional Neural Network (CNN). We apply our model to 214 subjects (83 patients, 131 controls) who underwent Time-Of-Flight Magnetic Resonance Angiography (TOF-MRA) and presented a total of 111 unruptured cerebral aneurysms. We compare two strategies for negative patch sampling that have an increasing level of difficulty for the network and we show how this choice can strongly affect the results. To assess whether the added spatial information helps improving performances, we compare our anatomically-informed CNN with a baseline, spatially-agnostic CNN. When considering the more realistic and challenging scenario including vessel-like negative patches, the former model attains the highest classification results (accuracy$\simeq$95\%, AUROC$\simeq$0.95, AUPR$\simeq$0.71), thus outperforming the baseline.	翻訳日:2022-09-20 01:55:09 公開日:2020-11-27
# ドメイン適応因果性エンコーダ Domain Adaptative Causality Encoder ( http://arxiv.org/abs/2011.13549v1 ) ライセンス: Link先を確認	Farhad Moghimifar, Gholamreza Haffari, Mahsa Baktashmotlagh	(参考訳) 個々のイベント間の低レベル関係の抽出を主眼とする現在のアプローチは,公開ラベル付きデータの不足によって制限されている。したがって、学習時にラベル付きデータが存在しない分布が異なる領域に適用した場合、結果のモデルは不十分である。この制限を克服するため,本論文では,依存木の特徴と逆学習を活用し,適応因果関係同定と局所化の課題に対処する。適応(adaptive)という用語は、トレーニングとテストのデータが2つの分散的なデータセットから来ているため、私たちの知る限りでは、この作業に対処するのは初めてです。さらに,テキスト中のすべての種類の因果関係を統合する新しい因果関係データセットであるmedcausを提案する。 4つの異なるベンチマーク因果関係データセットを用いた実験により,テキストからの因果関係の同定と局所化のタスクにおいて,既存基準よりも最大7%改善したアプローチが優れていることを示す。 Current approaches which are mainly based on the extraction of low-level relations among individual events are limited by the shortage of publicly available labelled data. Therefore, the resulting models perform poorly when applied to a distributionally different domain for which labelled data did not exist at the time of training. To overcome this limitation, in this paper, we leverage the characteristics of dependency trees and adversarial learning to address the tasks of adaptive causality identification and localisation. The term adaptive is used since the training and test data come from two distributionally different datasets, which to the best of our knowledge, this work is the first to address. Moreover, we present a new causality dataset, namely MedCaus, which integrates all types of causality in the text. Our experiments on four different benchmark causality datasets demonstrate the superiority of our approach over the existing baselines, by up to 7% improvement, on the tasks of identification and localisation of the causal relations from the text.	翻訳日:2022-09-20 01:54:31 公開日:2020-11-27
# 対話型文表現学習に基づく中国語医学質問応答照合 Chinese Medical Question Answer Matching Based on Interactive Sentence Representation Learning ( http://arxiv.org/abs/2011.13573v1 ) ライセンス: Link先を確認	Xiongtao Cui and Jungang Han	(参考訳) 中国の医学質問応答マッチングは、英語のオープンドメイン質問応答マッチングよりも難しい。深層学習法は質問応答マッチングの性能向上に優れてきたが,これらの手法は文内の意味情報のみに焦点をあてるが,質問と回答間の意味関係は無視し,結果として性能に欠陥が生じる。本稿では,この問題に取り組むために,対話型文表現学習モデルの設計を行う。本稿では,中国語の医学的問答マッチングに適応し,異なるニューラルネットワークの構造の利点を活かし,文内の深い意味情報を抽出し,問答間の意味関係を抽出し,多スケールcnnsネットワークやbigruネットワークと組み合わせ,ニューラルネットワークの異なる構造を活用し,文表現における意味的特徴を学習するクロスクロスバートネットワークを提案する。 cMedQA V2.0とcMedQA V1.0データセットの実験により、我々のモデルは、中国の医学的質問応答マッチングの既存の最先端モデルよりも大幅に優れていることが示された。 Chinese medical question-answer matching is more challenging than the open-domain question answer matching in English. Even though the deep learning method has performed well in improving the performance of question answer matching, these methods only focus on the semantic information inside sentences, while ignoring the semantic association between questions and answers, thus resulting in performance deficits. In this paper, we design a series of interactive sentence representation learning models to tackle this problem. To better adapt to Chinese medical question-answer matching and take the advantages of different neural network structures, we propose the Crossed BERT network to extract the deep semantic information inside the sentence and the semantic association between question and answer, and then combine with the multi-scale CNNs network or BiGRU network to take the advantage of different structure of neural networks to learn more semantic features into the sentence representation. The experiments on the cMedQA V2.0 and cMedQA V1.0 dataset show that our model significantly outperforms all the existing state-of-the-art models of Chinese medical question answer matching.	翻訳日:2022-09-20 01:54:14 公開日:2020-11-27
# 深い直交線形ネットワークは浅い Deep orthogonal linear networks are shallow ( http://arxiv.org/abs/2011.13831v1 ) ライセンス: Link先を確認	Pierre Ablin	(参考訳) 直交行列の積からなる深い直交線形ネットワークをトレーニングする際の問題を考える。リーマン勾配降下を伴う重みの訓練は、勾配降下による因子化全体の訓練と等価であることを示す。つまり、この設定では、過パラメータ化と暗黙のバイアスが全く影響しない:そのような深層で過パラメータ化されたネットワークのトレーニングは、一層浅層ネットワークのトレーニングと完全に等価である。 We consider the problem of training a deep orthogonal linear network, which consists of a product of orthogonal matrices, with no non-linearity in-between. We show that training the weights with Riemannian gradient descent is equivalent to training the whole factorization by gradient descent. This means that there is no effect of overparametrization and implicit bias at all in this setting: training such a deep, overparametrized, network is perfectly equivalent to training a one-layer shallow network.	翻訳日:2022-09-20 01:47:55 公開日:2020-11-27
# マルチナリー制限ボルツマン機の搬送損失機能とカラー画像生成 Tractable loss function and color image generation of multinary restricted Boltzmann machine ( http://arxiv.org/abs/2011.13509v1 ) ライセンス: Link先を確認	Juno Hwang and Wonseok Hwang and Junghyo Jo	(参考訳) 制限ボルツマンマシン(RBM)は統計力学の概念に基づく代表的な生成モデルである。解釈可能性の強い利点にもかかわらず、バックプロパゲーションの非利用性は、他の生成モデルよりも競争力を低下させる。ここで、二元および多元 rbms の可微分損失関数を導出する。次に,色付き顔画像を生成することで,学習性と性能を示す。 The restricted Boltzmann machine (RBM) is a representative generative model based on the concept of statistical mechanics. In spite of the strong merit of interpretability, unavailability of backpropagation makes it less competitive than other generative models. Here we derive differentiable loss functions for both binary and multinary RBMs. Then we demonstrate their learnability and performance by generating colored face images.	翻訳日:2022-09-20 01:47:46 公開日:2020-11-27
# 物理世界におけるディープラーニング顔認識に対するロバストな攻撃 Robust Attacks on Deep Learning Face Recognition in the Physical World ( http://arxiv.org/abs/2011.13526v1 ) ライセンス: Link先を確認	Meng Shen, Hao Yu, Liehuang Zhu, Ke Xu, Qi Li, Xiaojiang Du	(参考訳) ディープニューラルネットワーク(DNN)は、顔認識(FR)システムでますます使われている。しかし、最近の研究では、DNNは敵対的な例に弱いことが示されており、物理的世界ではDNNを使用してFRシステムを誤解させる可能性がある。これらのシステムに対する既存の攻撃は、単にデジタル世界で働く摂動を生成するか、摂動を生成するためにカスタマイズされた機器に依存するかのいずれかであり、様々な物理的環境では堅牢ではない。本稿では、敵ステッカーを使ってFRシステムを騙す物理世界攻撃であるFaceAdvを提案する。主にステッカージェネレータとトランスで構成されており、前者は複数の異なる形状のステッカーを製作でき、後者のトランスフォーマーは人間の顔にステッカーをデジタルに取り付け、ステッカーの有効性を向上させるためにジェネレータにフィードバックを提供することを目的としている。本研究では,3種類のFRシステム(ArcFace,CosFace,FaceNet)に対するFaceAdvの有効性を評価するための広範囲な実験を行った。その結果、faceadvは最先端の攻撃と比べて、ドッジと偽装の両方の成功率を大幅に向上できることがわかった。また,FaceAdvの堅牢性を示すため,包括的評価を行った。 Deep neural networks (DNNs) have been increasingly used in face recognition (FR) systems. Recent studies, however, show that DNNs are vulnerable to adversarial examples, which can potentially mislead the FR systems using DNNs in the physical world. Existing attacks on these systems either generate perturbations working merely in the digital world, or rely on customized equipments to generate perturbations and are not robust in varying physical environments. In this paper, we propose FaceAdv, a physical-world attack that crafts adversarial stickers to deceive FR systems. It mainly consists of a sticker generator and a transformer, where the former can craft several stickers with different shapes and the latter transformer aims to digitally attach stickers to human faces and provide feedbacks to the generator to improve the effectiveness of stickers. We conduct extensive experiments to evaluate the effectiveness of FaceAdv on attacking 3 typical FR systems (i.e., ArcFace, CosFace and FaceNet). The results show that compared with a state-of-the-art attack, FaceAdv can significantly improve success rate of both dodging and impersonating attacks. We also conduct comprehensive evaluations to demonstrate the robustness of FaceAdv.	翻訳日:2022-09-20 01:47:40 公開日:2020-11-27
# SocialGuard: ソーシャルイメージの逆例に基づくプライバシ保護技術 SocialGuard: An Adversarial Example Based Privacy-Preserving Technique for Social Images ( http://arxiv.org/abs/2011.13560v1 ) ライセンス: Link先を確認	Mingfu Xue, Shichang Sun, Zhiyu Wu, Can He, Jian Wang, Weiqiang Liu	(参考訳) さまざまなソーシャルプラットフォームの人気は、人々が日常的な写真をオンラインで共有するきっかけとなった。しかし、このようなオンライン写真共有行動によって、望ましくないプライバシーリークが発生する。 advanced deep neural network (dnn)ベースのオブジェクト検出器は、共有写真に露出したユーザーの個人情報を容易に盗むことができる。本稿では,対象物検知器によるプライバシ盗難に対するソーシャルイメージの新たな逆例に基づくプライバシ保存手法を提案する。具体的には,2種類の敵対的ソーシャル画像を作成するためのオブジェクト消去アルゴリズムを開発した。ソーシャルイメージ内のすべてのオブジェクトを、オブジェクト検出器によって検出されるのを防ぎ、一方は、カスタマイズされた機密オブジェクトを、オブジェクト検出器によって不正に分類することができる。 Object Disappearance Algorithmは、クリーンな社会的イメージに摂動を構築する。摂動を注入した後、社会的イメージは容易に物体検出器を騙すことができるが、その視覚品質は劣化しない。提案手法の有効性を評価するために,プライバシ保存成功率とプライバシリーク率の2つの指標を用いる。実験の結果,提案手法は社会的画像のプライバシーを効果的に保護できることがわかった。 ms-cocoおよびpascal voc 2007データセットにおける提案手法のプライバシ保護成功率は、それぞれ96.1%、99.3%であり、これら2つのデータセットのプライバシリーク率は0.57%、0.07%である。さらに,既存の画像処理手法(低輝度,ノイズ,ぼかし,モザイク,jpeg圧縮)と比較して,提案手法はプライバシ保護と画像品質の維持において,はるかに優れた性能を実現することができる。 The popularity of various social platforms has prompted more people to share their routine photos online. However, undesirable privacy leakages occur due to such online photo sharing behaviors. Advanced deep neural network (DNN) based object detectors can easily steal users' personal information exposed in shared photos. In this paper, we propose a novel adversarial example based privacy-preserving technique for social images against object detectors based privacy stealing. Specifically, we develop an Object Disappearance Algorithm to craft two kinds of adversarial social images. One can hide all objects in the social images from being detected by an object detector, and the other can make the customized sensitive objects be incorrectly classified by the object detector. The Object Disappearance Algorithm constructs perturbation on a clean social image. After being injected with the perturbation, the social image can easily fool the object detector, while its visual quality will not be degraded. We use two metrics, privacy-preserving success rate and privacy leakage rate, to evaluate the effectiveness of the proposed method. Experimental results show that, the proposed method can effectively protect the privacy of social images. The privacy-preserving success rates of the proposed method on MS-COCO and PASCAL VOC 2007 datasets are high up to 96.1% and 99.3%, respectively, and the privacy leakage rates on these two datasets are as low as 0.57% and 0.07%, respectively. In addition, compared with existing image processing methods (low brightness, noise, blur, mosaic and JPEG compression), the proposed method can achieve much better performance in privacy protection and image visual quality maintenance.	翻訳日:2022-09-20 01:46:50 公開日:2020-11-27
# road scene graph: インテリジェントな車両のためのセマンティックグラフベースのシーン表現データセット Road Scene Graph: A Semantic Graph-Based Scene Representation Dataset for Intelligent Vehicles ( http://arxiv.org/abs/2011.13588v1 ) ライセンス: Link先を確認	Yafu Tian, Alexander Carballo, Ruifeng Li and Kazuya Takeda	(参考訳) リッチセマンティック情報抽出は、次世代のインテリジェント車において重要な役割を果たす。現在,6次元ポーズ検出や道路シーンセマンティックセグメンテーションなどの基本的な応用に焦点を当てた研究が数多く行われている。これにより、これらのデータを組織化し、どのように活用するかを考える素晴らしい機会が得られます。本稿では,車載用特別シーングラフである道路シーングラフを提案する。古典的なデータ表現とは異なり、このグラフはオブジェクトの提案だけでなく、ペア関係も提供する。トポロジグラフにまとめることで、これらのデータは説明可能で、完全に接続可能であり、GCN(Graph Convolutional Networks)で簡単に処理できる。ここでは,基本グラフ予測モデルを含む道路シーングラフデータセットを用いて道路のシーングラフを適用する。本研究は,提案モデルを用いた実験評価も含む。 Rich semantic information extraction plays a vital role on next-generation intelligent vehicles. Currently there is great amount of research focusing on fundamental applications such as 6D pose detection, road scene semantic segmentation, etc. And this provides us a great opportunity to think about how shall these data be organized and exploited. In this paper we propose road scene graph,a special scene-graph for intelligent vehicles. Different to classical data representation, this graph provides not only object proposals but also their pair-wise relationships. By organizing them in a topological graph, these data are explainable, fully-connected, and could be easily processed by GCNs (Graph Convolutional Networks). Here we apply scene graph on roads using our Road Scene Graph dataset, including the basic graph prediction model. This work also includes experimental evaluations using the proposed model.	翻訳日:2022-09-20 01:46:23 公開日:2020-11-27
# GANの学習性に影響を与える特性に関する研究 A study of traits that affect learnability in GANs ( http://arxiv.org/abs/2011.13728v1 ) ライセンス: Link先を確認	Niladri Shekhar Dutt, Sunil Patel	(参考訳) generative adversarial networks gansは2つのニューラルネットワークを使用するアルゴリズムアーキテクチャであり、対向するニューラルネットワークを使用して、実際のデータに渡される新しい合成データインスタンスを考案する。 GANのトレーニングは難しい問題であり、ハイパーパラメータチューニングやアーキテクチャエンジニアリングといった高度なテクニックを適用する必要があります。多くの異なる損失、正規化と正規化スキーム、ネットワークアーキテクチャは、異なるタイプのデータセットに対するこの問題を解決するために提案されている。実験的な観察を理解し、簡単な理論を導出する必要がある。本稿では,パラメータ化合成データセットを用いて実験実験を行い,学習性に影響を与える特性について検討する。 Generative Adversarial Networks GANs are algorithmic architectures that use two neural networks, pitting one against the opposite so as to come up with new, synthetic instances of data that can pass for real data. Training a GAN is a challenging problem which requires us to apply advanced techniques like hyperparameter tuning, architecture engineering etc. Many different losses, regularization and normalization schemes, network architectures have been proposed to solve this challenging problem for different types of datasets. It becomes necessary to understand the experimental observations and deduce a simple theory for it. In this paper, we perform empirical experiments using parameterized synthetic datasets to probe what traits affect learnability.	翻訳日:2022-09-20 01:46:10 公開日:2020-11-27
# TaylorGAN: サンプル効率の良い自然言語生成のための隣り合わせのポリシー更新 TaylorGAN: Neighbor-Augmented Policy Update for Sample-Efficient Natural Language Generation ( http://arxiv.org/abs/2011.13527v1 ) ライセンス: Link先を確認	Chun-Hsing Lin, Siang-Ruei Wu, Hung-Yi Lee, Yun-Nung Chen	(参考訳) ReINFORCEのようなスコア関数ベースの自然言語生成(NLG)アプローチは、サンプル効率の低下や不安定性の訓練に悩まされている。これは主に離散空間サンプリングの非微分的性質のためであり、これらの方法は判別器をブラックボックスとして扱い、勾配情報を無視しなければならない。サンプル効率の向上とREINFORCEのばらつきの低減を目的として,オフ・ポリシー更新と1次テイラー展開による勾配推定を向上する新しいアプローチTaylorGANを提案する。このアプローチにより、NLGモデルをスクラッチからより小さなバッチサイズでトレーニングすることが可能になります -- 最大限の事前トレーニングをすることなく、品質と多様性の複数の指標において、既存のGANベースのメソッドよりも優れています。ソースコードとデータはhttps://github.com/miulab/taylorganで入手できる。 Score function-based natural language generation (NLG) approaches such as REINFORCE, in general, suffer from low sample efficiency and training instability problems. This is mainly due to the non-differentiable nature of the discrete space sampling and thus these methods have to treat the discriminator as a black box and ignore the gradient information. To improve the sample efficiency and reduce the variance of REINFORCE, we propose a novel approach, TaylorGAN, which augments the gradient estimation by off-policy update and the first-order Taylor expansion. This approach enables us to train NLG models from scratch with smaller batch size -- without maximum likelihood pre-training, and outperforms existing GAN-based methods on multiple metrics of quality and diversity. The source code and data are available at https://github.com/MiuLab/TaylorGAN	翻訳日:2022-09-20 01:39:45 公開日:2020-11-27
# 長尾関係抽出のためのラベルなしテキストからの学習関係プロトタイプ Learning Relation Prototype from Unlabeled Texts for Long-tail Relation Extraction ( http://arxiv.org/abs/2011.13574v1 ) ライセンス: Link先を確認	Yixin Cao, Jun Kuang, Ming Gao, Aoying Zhou, Yonggang Wen, Tat-Seng Chua	(参考訳) 関係抽出(re)は、テキストからエンティティ関係を抽出することによって、知識グラフ(kg)を完成させるための重要なステップである。トレーニングデータは主にいくつかのタイプの関係に集中しており、残りのタイプの関係に対する十分なアノテーションが欠如している。本稿では,関係型から知識を十分な学習データで伝達することで,テキストのラベルのない関係のプロトタイプを学習し,長い関係の抽出を容易にする方法を提案する。我々は,関係の意味と伝達学習の近さを反映した実体間の暗黙的要因として関係プロトタイプを学習する。具体的には、テキストから共起グラフを構築し、埋め込み学習のための一階と二階の両方のエンティティをキャプチャする。これに基づいて、ほぼ任意のREフレームワークに容易に適用可能な、プロトタイプに対応するエンティティペアからの距離をさらに最適化する。そこで我々は、New York TimesとGoogle Distant Supervisionという2つの公開データセットで大規模な実験を行い、8つの最先端ベースラインと比較し、提案モデルは大幅な改善(平均4.1% F1)を達成した。長期関係に関するさらなる議論は、学習された関係プロトタイプの有効性を示す。さらに,様々な成分の影響を解明するためのアブレーション研究を行い,これを4つの基本関係抽出モデルに適用して一般化能力を検証した。コードは後でリリースします。 Relation Extraction (RE) is a vital step to complete Knowledge Graph (KG) by extracting entity relations from texts.However, it usually suffers from the long-tail issue. The training data mainly concentrates on a few types of relations, leading to the lackof sufficient annotations for the remaining types of relations. In this paper, we propose a general approach to learn relation prototypesfrom unlabeled texts, to facilitate the long-tail relation extraction by transferring knowledge from the relation types with sufficient trainingdata. We learn relation prototypes as an implicit factor between entities, which reflects the meanings of relations as well as theirproximities for transfer learning. Specifically, we construct a co-occurrence graph from texts, and capture both first-order andsecond-order entity proximities for embedding learning. Based on this, we further optimize the distance from entity pairs tocorresponding prototypes, which can be easily adapted to almost arbitrary RE frameworks. Thus, the learning of infrequent or evenunseen relation types will benefit from semantically proximate relations through pairs of entities and large-scale textual information.We have conducted extensive experiments on two publicly available datasets: New York Times and Google Distant Supervision.Compared with eight state-of-the-art baselines, our proposed model achieves significant improvements (4.1% F1 on average). Furtherresults on long-tail relations demonstrate the effectiveness of the learned relation prototypes. We further conduct an ablation study toinvestigate the impacts of varying components, and apply it to four basic relation extraction models to verify the generalization ability.Finally, we analyze several example cases to give intuitive impressions as qualitative analysis. Our codes will be released later.	翻訳日:2022-09-20 01:39:31 公開日:2020-11-27
# Reflective-Net: 説明から学ぶ Reflective-Net: Learning from Explanations ( http://arxiv.org/abs/2011.13986v1 ) ライセンス: Link先を確認	Johannes Schneider and Michalis Vlachos	(参考訳) 人間は、迅速で直感的な決定をするだけでなく、自己表現、すなわち自己説明し、他人の説明から効率的に学ぶ能力を持っている。この研究は、既存の説明法、すなわちGrad-CAMに基づいて生成された説明に乗じて、このプロセスを模倣する最初のステップを提供する。従来のラベル付きデータと組み合わせた説明から学ぶことは、精度とトレーニング時間の観点から分類の大幅な改善をもたらす。 Humans possess a remarkable capability to make fast, intuitive decisions, but also to self-reflect, i.e., to explain to oneself, and to efficiently learn from explanations by others. This work provides the first steps toward mimicking this process by capitalizing on the explanations generated based on existing explanation methods, i.e. Grad-CAM. Learning from explanations combined with conventional labeled data yields significant improvements for classification in terms of accuracy and training time.	翻訳日:2022-09-20 01:37:40 公開日:2020-11-27
# ドメイン知識を使って機械に自己説明を教える Teaching the Machine to Explain Itself using Domain Knowledge ( http://arxiv.org/abs/2012.01932v1 ) ライセンス: Link先を確認	Vladimir Balayan, Pedro Saleiro, Catarina Bel\'em, Ludwig Krippahl and Pedro Bizarro	(参考訳) 機械学習(ml)は、人間がより良く、より速い決定を下すのを助けるためにますます使われています。しかし、非技術者はモデル予測の背後にある理論的根拠を理解するのに苦労し、アルゴリズムによる意思決定システムの信頼を妨げた。 AIの説明可能性に関する重要な研究は、説明方法を開発することによってAIシステムの信頼を取り戻す試みであるが、大きなブレークスルーはない。同時に、一般的な説明法(例えば LIME や SHAP)は、非データ科学者のペルソナを理解するのが非常に難しい説明を生成する。これを解決するために、意思決定タスクとドメイン知識を伝える関連する説明を共同で学習するニューラルネットワークベースのフレームワークJOELを提案する。 JOELは、専門家自身の推論に非常によく似た、モデルの予測に関する高いレベルの洞察を提供する、深い技術的ML知識の欠如を持つ、ループ内のドメインエキスパートに合わせたものだ。さらに、認定専門家のプールからドメインからのフィードバックを収集し、モデル(人間の教え)を改善するために使用することで、シームレスでより適切な説明を促進します。最後に,従来の専門家システムとドメイン分類体系のセマンティックマッピングを用いてブートストラップトレーニングセットを自動的に注釈付けし,概念に基づく人間のアノテーションの欠如を克服する。実世界の不正検出データセット上でJOELを実証的に検証する。 JOELはブートストラップデータセットから説明を一般化できることを示す。さらに, 人間の指導により, 説明文の予測精度を約$13.57\%$向上できることを示した。 Machine Learning (ML) has been increasingly used to aid humans to make better and faster decisions. However, non-technical humans-in-the-loop struggle to comprehend the rationale behind model predictions, hindering trust in algorithmic decision-making systems. Considerable research work on AI explainability attempts to win back trust in AI systems by developing explanation methods but there is still no major breakthrough. At the same time, popular explanation methods (e.g., LIME, and SHAP) produce explanations that are very hard to understand for non-data scientist persona. To address this, we present JOEL, a neural network-based framework to jointly learn a decision-making task and associated explanations that convey domain knowledge. JOEL is tailored to human-in-the-loop domain experts that lack deep technical ML knowledge, providing high-level insights about the model's predictions that very much resemble the experts' own reasoning. Moreover, we collect the domain feedback from a pool of certified experts and use it to ameliorate the model (human teaching), hence promoting seamless and better suited explanations. Lastly, we resort to semantic mappings between legacy expert systems and domain taxonomies to automatically annotate a bootstrap training set, overcoming the absence of concept-based human annotations. We validate JOEL empirically on a real-world fraud detection dataset. We show that JOEL can generalize the explanations from the bootstrap dataset. Furthermore, obtained results indicate that human teaching can further improve the explanations prediction quality by approximately $13.57\%$.	翻訳日:2022-09-20 01:37:32 公開日:2020-11-27
# すべての企業がその構造を所有:グラフニューラルネットワークによる企業クレジットレーティング Every Corporation Owns Its Structure: Corporate Credit Ratings via Graph Neural Networks ( http://arxiv.org/abs/2012.01933v1 ) ライセンス: Link先を確認	Bojing Feng, Haonan Xu, Wenfang Xue and Bindang Xue	(参考訳) 信用格付けは、投資におけるリスクと信頼性のレベルを反映し、金融リスクにおいて重要な役割を果たす、企業に関連する信用リスクの分析である。企業信用格付けを扱うためにベクトル空間に基づく機械学習とディープラーニング技術を実装する多くの研究が登場している。近年,ローン保証ネットワークなどの企業間の関係を考慮すると,グラフニューラルネットワークの出現に伴い,グラフベースモデルもいくつか適用されている。しかし、これらの既存のモデルは企業間のネットワークを構築し、内部の機能的相互作用を考慮に入れない。本稿では,このような問題を解決するために,グラフニューラルネットワークを用いた企業信用評価モデルCCR-GNNを提案する。まず、各企業の個々のグラフをセルフアウト製品に基づいて構築し、gnnを使用して、ローカル情報とグローバル情報の両方を含む機能インタラクションを明示的にモデル化します。中国の上場企業評価データセットで実施された大規模な実験は、CCR-GNNが最先端の手法を一貫して上回っていることを証明している。 Credit rating is an analysis of the credit risks associated with a corporation, which reflects the level of the riskiness and reliability in investing, and plays a vital role in financial risk. There have emerged many studies that implement machine learning and deep learning techniques which are based on vector space to deal with corporate credit rating. Recently, considering the relations among enterprises such as loan guarantee network, some graph-based models are applied in this field with the advent of graph neural networks. But these existing models build networks between corporations without taking the internal feature interactions into account. In this paper, to overcome such problems, we propose a novel model, Corporate Credit Rating via Graph Neural Networks, CCR-GNN for brevity. We firstly construct individual graphs for each corporation based on self-outer product and then use GNN to model the feature interaction explicitly, which includes both local and global information. Extensive experiments conducted on the Chinese public-listed corporate rating dataset, prove that CCR-GNN outperforms the state-of-the-art methods consistently.	翻訳日:2022-09-20 01:37:07 公開日:2020-11-27
# バイナリラテントを用いた変分オートエンコーダの直接進化最適化 Direct Evolutionary Optimization of Variational Autoencoders With Binary Latents ( http://arxiv.org/abs/2011.13704v1 ) ライセンス: Link先を確認	Enrico Guiraud, Jakob Drefs, J\"org L\"ucke	(参考訳) 離散潜在変数は実世界のデータにとって重要であると考えられており、離散潜在変数を持つ変分オートエンコーダ(VAE)の研究の動機となっている。しかし、この場合、標準的なVAEトレーニングは不可能であり、従来のような個別のVAEを訓練するために、個別の分散を操作するための異なる戦略を動機付けている。ここでは、符号化モデルに直接離散最適化を適用することにより、潜伏者の離散性を完全に維持できるかどうかを問う。この手法は, サイドステッピングサンプリング近似, 再パラメータ化トリック, 償却により, 標準的なVAEトレーニングから強く逸脱している。離散最適化は、進化的アルゴリズムと連動して、切断後段を用いた変分設定で実現される。バイナリラテントを持つVAEに対して、(A)ネットワーク重みに対する勾配上昇にそのような離散的変動法がどのように結びついているか、および(B)デコーダがトレーニングのために遅延状態を選択する方法を示す。従来の償却トレーニングはより効率的で、大きなニューラルネットワークに適用できる。しかし、より小さなネットワークを用いることで、数百の潜伏者に対して効率よく分散最適化を行うことができる。さらに重要なのは,直接最適化の有効性が,‘ゼロショット’学習において極めて競争力が高いことだ。大規模な教師付きネットワークとは対照的に、hereが調査したvaes canは、クリーンなデータや大きな画像データセットのトレーニングの事前のトレーニングなしに、1つのイメージをデノーズする。より一般に,vaeの訓練はサンプリングに基づく近似と再パラメータ化を伴わずに可能であり,一般にvae訓練の解析には興味深いものと考えられる。ゼロショット' 設定では、直接最適化され、さらに、VAE は非生成的アプローチによって以前より優れていた。 Discrete latent variables are considered important for real world data, which has motivated research on Variational Autoencoders (VAEs) with discrete latents. However, standard VAE-training is not possible in this case, which has motivated different strategies to manipulate discrete distributions in order to train discrete VAEs similarly to conventional ones. Here we ask if it is also possible to keep the discrete nature of the latents fully intact by applying a direct discrete optimization for the encoding model. The approach is consequently strongly diverting from standard VAE-training by sidestepping sampling approximation, reparameterization trick and amortization. Discrete optimization is realized in a variational setting using truncated posteriors in conjunction with evolutionary algorithms. For VAEs with binary latents, we (A) show how such a discrete variational method ties into gradient ascent for network weights, and (B) how the decoder is used to select latent states for training. Conventional amortized training is more efficient and applicable to large neural networks. However, using smaller networks, we here find direct discrete optimization to be efficiently scalable to hundreds of latents. More importantly, we find the effectiveness of direct optimization to be highly competitive in `zero-shot' learning. In contrast to large supervised networks, the here investigated VAEs can, e.g., denoise a single image without previous training on clean data and/or training on large image datasets. More generally, the studied approach shows that training of VAEs is indeed possible without sampling-based approximation and reparameterization, which may be interesting for the analysis of VAE-training in general. For `zero-shot' settings a direct optimization, furthermore, makes VAEs competitive where they have previously been outperformed by non-generative approaches.	翻訳日:2022-09-20 01:36:49 公開日:2020-11-27
# 深層ニューラルネットワークにおける畳み込み層の不確かさに関する研究 A Study on the Uncertainty of Convolutional Layers in Deep Neural Networks ( http://arxiv.org/abs/2011.13719v1 ) ライセンス: Link先を確認	Haojing Shen, Sihong Chen, Ran Wang	(参考訳) 本稿では,ニューラルネットワーク構造,すなわちLeNetにおける畳み込み層の接続重みに存在するMin-Max特性を示す。具体的には、Min-Max特性は、LeNetの後方伝播ベースのトレーニングの間、畳み込み層の重みが間隔の中心から遠ざかる、すなわち最小限に減少するか、最大まで増加することを意味する。不確実性の観点から、Min-Max特性が畳み込みの簡易な定式化によってモデルパラメータのファジィを最小化することを示す。実験により、Min-Max特性を持つモデルが強い対向性を持つことが確認され、この特性は損失関数の設計に組み込むことができる。本稿では,レネ構造の畳み込み層における不確かさの変化傾向を指摘し,畳み込みの解釈可能性について考察する。 This paper shows a Min-Max property existing in the connection weights of the convolutional layers in a neural network structure, i.e., the LeNet. Specifically, the Min-Max property means that, during the back propagation-based training for LeNet, the weights of the convolutional layers will become far away from their centers of intervals, i.e., decreasing to their minimum or increasing to their maximum. From the perspective of uncertainty, we demonstrate that the Min-Max property corresponds to minimizing the fuzziness of the model parameters through a simplified formulation of convolution. It is experimentally confirmed that the model with the Min-Max property has a stronger adversarial robustness, thus this property can be incorporated into the design of loss function. This paper points out a changing tendency of uncertainty in the convolutional layers of LeNet structure, and gives some insights to the interpretability of convolution.	翻訳日:2022-09-20 01:36:18 公開日:2020-11-27
# 条件付き無依存画素合成による画像生成 Image Generators with Conditionally-Independent Pixel Synthesis ( http://arxiv.org/abs/2011.13775v1 ) ライセンス: Link先を確認	Ivan Anokhin, Kirill Demochkin, Taras Khakhulin, Gleb Sterkin, Victor Lempitsky, Denis Korzhenkov	(参考訳) 既存の画像生成ネットワークは空間的畳み込みに大きく依存しており、オプションで画像の粗大な合成を徐々に行うことができる。本稿では,各画素における色値を,ランダム潜時ベクトルの値と,その画素の座標から独立に計算する,画像生成のための新しいアーキテクチャを提案する。合成中にピクセル間で情報を伝達する空間畳み込みや類似の操作は関与しない。本研究では, 逆方向の学習において, このようなジェネレータのモデリング能力を解析し, 新しいジェネレータを観察して, 最先端の畳み込みジェネレータに類似した生成品質を実現する。また,新しいアーキテクチャに特有の興味深い特性についても検討した。 Existing image generator networks rely heavily on spatial convolutions and, optionally, self-attention blocks in order to gradually synthesize images in a coarse-to-fine manner. Here, we present a new architecture for image generators, where the color value at each pixel is computed independently given the value of a random latent vector and the coordinate of that pixel. No spatial convolutions or similar operations that propagate information across pixels are involved during the synthesis. We analyze the modeling capabilities of such generators when trained in an adversarial fashion, and observe the new generators to achieve similar generation quality to state-of-the-art convolutional generators. We also investigate several interesting properties unique to the new architecture.	翻訳日:2022-09-20 01:30:14 公開日:2020-11-27
# 期待改善最大化によるcnnのアクティブラーニング Active Learning in CNNs via Expected Improvement Maximization ( http://arxiv.org/abs/2011.14015v1 ) ライセンス: Link先を確認	Udai G. Nagpal, David A Knowles	(参考訳) convolutional neural networks(cnns)などのディープラーニングモデルは、コンピュータビジョンや最近では計算生物学など、さまざまな領域において高いレベルの有効性を示している。しかし、効果的なモデルのトレーニングには、しばしば大規模なデータセットを組み立てたり、ラベル付けする必要がある。プールベースのアクティブラーニング技術は、これらの問題を軽減し、限られたデータで訓練されたモデルを利用して、学習プロセスを高速化するために、未ラベルのデータポイントをプールから選択的にクエリする。本稿では,提案する「Dropout-based expecteded IMprOvementS」(DEIMOS)について述べる。提案フレームワークは,モデル不確実性を捉える予測共分散行列の維持と,この行列を動的に更新することにより,バッチモード設定における多様な点のバッチを生成する。アクティブラーニングの結果,DIMOSはコンピュータビジョンやゲノミクスから取られた複数の回帰・分類タスクにおいて,既存のベースラインよりも優れていた。 Deep learning models such as Convolutional Neural Networks (CNNs) have demonstrated high levels of effectiveness in a variety of domains, including computer vision and more recently, computational biology. However, training effective models often requires assembling and/or labeling large datasets, which may be prohibitively time-consuming or costly. Pool-based active learning techniques have the potential to mitigate these issues, leveraging models trained on limited data to selectively query unlabeled data points from a pool in an attempt to expedite the learning process. Here we present "Dropout-based Expected IMprOvementS" (DEIMOS), a flexible and computationally-efficient approach to active learning that queries points that are expected to maximize the model's improvement across a representative sample of points. The proposed framework enables us to maintain a prediction covariance matrix capturing model uncertainty, and to dynamically update this matrix in order to generate diverse batches of points in the batch-mode setting. Our active learning results demonstrate that DEIMOS outperforms several existing baselines across multiple regression and classification tasks taken from computer vision and genomics.	翻訳日:2022-09-20 01:30:01 公開日:2020-11-27
# 変圧器を用いた一般マルチラベル画像分類 General Multi-label Image Classification with Transformers ( http://arxiv.org/abs/2011.14027v1 ) ライセンス: Link先を確認	Jack Lanchantin, Tianlu Wang, Vicente Ordonez, Yanjun Qi	(参考訳) マルチラベル画像分類は、画像に存在するオブジェクト、属性、その他のエンティティに対応するラベルの集合を予測するタスクである。本研究では,多ラベル画像分類のための一般的なフレームワークである分類変換器(C-Tran)を提案する。我々のアプローチは、マスク付きラベルの入力セットと畳み込みニューラルネットワークの視覚的特徴を与えられたターゲットラベルのセットを予測するために訓練されたTransformerエンコーダで構成されている。本手法の重要な要素はラベルマスクのトレーニング目的であり、トレーニング中にラベルの状態を正、負、未知と表現するために三元符号化方式を用いる。我々のモデルは、COCOやVisual Genomeのような挑戦的なデータセットに対する最先端のパフォーマンスを示す。さらに,トレーニング中のラベルの不確かさを明示的に表現するモデルであるため,推論中に部分的あるいは余分なラベルアノテーションを用いた画像に対して,よりよい結果が得られることがより一般的である。この追加機能は、COCO、Visual Genome、News500、CUBイメージデータセットで実証する。 Multi-label image classification is the task of predicting a set of labels corresponding to objects, attributes or other entities present in an image. In this work we propose the Classification Transformer (C-Tran), a general framework for multi-label image classification that leverages Transformers to exploit the complex dependencies among visual features and labels. Our approach consists of a Transformer encoder trained to predict a set of target labels given an input set of masked labels, and visual features from a convolutional neural network. A key ingredient of our method is a label mask training objective that uses a ternary encoding scheme to represent the state of the labels as positive, negative, or unknown during training. Our model shows state-of-the-art performance on challenging datasets such as COCO and Visual Genome. Moreover, because our model explicitly represents the uncertainty of labels during training, it is more general by allowing us to produce improved results for images with partial or extra label annotations during inference. We demonstrate this additional capability in the COCO, Visual Genome, News500, and CUB image datasets.	翻訳日:2022-09-20 01:29:39 公開日:2020-11-27
# 深部強化学習を用いたヒューマノイドサッカーロボットのリアルタイムアクティブビジョン Real-time Active Vision for a Humanoid Soccer Robot Using Deep Reinforcement Learning ( http://arxiv.org/abs/2011.13851v1 ) ライセンス: Link先を確認	Soheil Khatibi, Meisam Teimouri, Mahdi Rezaei	(参考訳) 本稿では,人間型サッカーロボットのための深層強化学習手法を用いたアクティブビジョン手法を提案する。提案手法はロボットの視点を適応的に最適化し,ボールの視点を保ちながら自己局所化のための最も有用なランドマークを得る。アクティブビジョンは、限られた視野を持つヒューマノイド意思決定ロボットにとって重要である。能動視覚問題に対処するために、自己局在モデルの精度に大きく依存する確率論的エントロピーに基づくいくつかのアプローチが提案されている。しかし,本研究では,この問題をエピソディクス強化学習問題として定式化し,深層q学習法を用いて解く。提案するネットワークでは,ロボットの頭部を最高の視点に向けて移動させるために,カメラの生画像のみを必要とする。このモデルは、最高の視点を達成する上で、非常に競争力のある80%の成功率を示します。提案手法をwebotsシミュレータでシミュレーションしたヒューマノイドロボットに実装した。評価と実験結果から,提案手法は自己局所誤差の高い場合において,RoboCupコンテキストにおいてエントロピーに基づく手法よりも優れていることが示された。 In this paper, we present an active vision method using a deep reinforcement learning approach for a humanoid soccer-playing robot. The proposed method adaptively optimises the viewpoint of the robot to acquire the most useful landmarks for self-localisation while keeping the ball into its viewpoint. Active vision is critical for humanoid decision-maker robots with a limited field of view. To deal with an active vision problem, several probabilistic entropy-based approaches have previously been proposed which are highly dependent on the accuracy of the self-localisation model. However, in this research, we formulate the problem as an episodic reinforcement learning problem and employ a Deep Q-learning method to solve it. The proposed network only requires the raw images of the camera to move the robot's head toward the best viewpoint. The model shows a very competitive rate of 80% success rate in achieving the best viewpoint. We implemented the proposed method on a humanoid robot simulated in Webots simulator. Our evaluations and experimental results show that the proposed method outperforms the entropy-based methods in the RoboCup context, in cases with high self-localisation errors.	翻訳日:2022-09-20 01:29:21 公開日:2020-11-27
# エンティティの協調抽出と情報冗長性除去との関係 Joint Extraction of Entity and Relation with Information Redundancy Elimination ( http://arxiv.org/abs/2011.13565v1 ) ライセンス: Link先を確認	Yuanhao Shen and Jungang Han	(参考訳) 冗長な情報とエンティティと関係抽出モデルの重複関係の問題を解決するために,共同抽出モデルを提案する。このモデルは、関係のない冗長な情報を生成することなく、複数の関連エンティティを直接抽出することができる。また,エンコーダ-LSTMと呼ばれる再帰型ニューラルネットワークを提案し,文をモデル化する再帰型ユニットの能力を高める。具体的には、名前付きエンティティ認識サブモジュールは、事前訓練された言語モデルとLSTMデコーダ層で構成され、エンコーダ-LSTMネットワークを使用して関連するエンティティペア間の順序関係をモデル化するエンティティペア抽出サブモジュールと、注意機構を含む関係分類サブモジュールである。本モデルの有効性を評価するために, adeおよびconll04の公開データセットについて実験を行った。提案手法は,エンティティと関係抽出のタスクにおいて良好な性能を示し,冗長な情報の量を大幅に削減できることを示す。 To solve the problem of redundant information and overlapping relations of the entity and relation extraction model, we propose a joint extraction model. This model can directly extract multiple pairs of related entities without generating unrelated redundant information. We also propose a recurrent neural network named Encoder-LSTM that enhances the ability of recurrent units to model sentences. Specifically, the joint model includes three sub-modules: the Named Entity Recognition sub-module consisted of a pre-trained language model and an LSTM decoder layer, the Entity Pair Extraction sub-module which uses Encoder-LSTM network to model the order relationship between related entity pairs, and the Relation Classification sub-module including Attention mechanism. We conducted experiments on the public datasets ADE and CoNLL04 to evaluate the effectiveness of our model. The results show that the proposed model achieves good performance in the task of entity and relation extraction and can greatly reduce the amount of redundant information.	翻訳日:2022-09-20 01:29:03 公開日:2020-11-27
# センサネットワーク上の分散変分ベイズアルゴリズム Distributed Variational Bayesian Algorithms Over Sensor Networks ( http://arxiv.org/abs/2011.13600v1 ) ライセンス: Link先を確認	Junhao Hua, Chunguang Li	(参考訳) センサネットワークのコンテキストにおけるベイズフレームワークの分散推論/推定は、その幅広い適用性のために最近注目を集めている。変分ベイズアルゴリズム(英: variational bayesian algorithm)は、ベイズ推論で生じる難解な積分を近似する手法である。本稿では,非常に一般的な共役指数モデルに適用可能な一般ベイズ推論問題に対する2つの分散vbアルゴリズムを提案する。最初のアプローチでは、各ノードにおける大域的自然パラメータは、近似空間のリーマン幾何学を利用する確率的自然勾配を用いて最適化され、続いて隣人と協調するための情報拡散ステップが与えられる。第2の方法では、分散推定のための制約付き最適化定式化を自然パラメータ空間に確立し、乗算器の交互方向法(admm)により解く。次に,提案手法の有効性を評価するために,ベイズ混合モデルの分散推論・推定の応用について述べる。合成データと実データの両方のシミュレーションにより、提案アルゴリズムは優れた性能を持つことが示され、これは核融合センターで利用可能な全データに依存するvbアルゴリズムにほぼ匹敵する。 Distributed inference/estimation in Bayesian framework in the context of sensor networks has recently received much attention due to its broad applicability. The variational Bayesian (VB) algorithm is a technique for approximating intractable integrals arising in Bayesian inference. In this paper, we propose two novel distributed VB algorithms for general Bayesian inference problem, which can be applied to a very general class of conjugate-exponential models. In the first approach, the global natural parameters at each node are optimized using a stochastic natural gradient that utilizes the Riemannian geometry of the approximation space, followed by an information diffusion step for cooperation with the neighbors. In the second method, a constrained optimization formulation for distributed estimation is established in natural parameter space and solved by alternating direction method of multipliers (ADMM). An application of the distributed inference/estimation of a Bayesian Gaussian mixture model is then presented, to evaluate the effectiveness of the proposed algorithms. Simulations on both synthetic and real datasets demonstrate that the proposed algorithms have excellent performance, which are almost as good as the corresponding centralized VB algorithm relying on all data available in a fusion center.	翻訳日:2022-09-20 01:28:17 公開日:2020-11-27
# 教師の繰り返し強制と再重み付けによるマルチタスクmrイメージング Multi-task MR Imaging with Iterative Teacher Forcing and Re-weighted Deep Learning ( http://arxiv.org/abs/2011.13614v1 ) ライセンス: Link先を確認	Kehan Qi, Yu Gong, Xinfeng Liu, Xin Liu, Hairong Zheng, Shanshan Wang	(参考訳) 磁気共鳴(MR)再構成によるノイズ、アーティファクト、情報の喪失は、下流アプリケーションの最終性能を損なう可能性がある。本稿では,既存のビッグデータから事前知識を学習するマルチタスク深層学習手法を開発し,これらの知識を用いて,アンサンプリングk空間データからのmr再構成とセグメンテーションの同時支援を行う。マルチタスク深層学習フレームワークは,動的再重み付き損失制約 (DRLC) の下で設計した反復型教師強制スキーム (ITFS) によって統合・訓練された2つのネットワークサブモジュールを備える。 ITFSは、完全にサンプル化されたデータをトレーニングプロセスに注入することで、エラーの蓄積を避けるように設計されている。マルチタスクの精度を共プロパントするために,リコンストラクションとセグメンテーションサブモジュールからの貢献を動的にバランスさせるdrlcを提案する。提案手法は,2つのオープンデータセットと1つのin vivo内データセットを用いて評価し,6つの最先端手法と比較した。提案手法は,同時的かつ正確なMR再構成とセグメンテーションの促進機能を有することを示す。 Noises, artifacts, and loss of information caused by the magnetic resonance (MR) reconstruction may compromise the final performance of the downstream applications. In this paper, we develop a re-weighted multi-task deep learning method to learn prior knowledge from the existing big dataset and then utilize them to assist simultaneous MR reconstruction and segmentation from the under-sampled k-space data. The multi-task deep learning framework is equipped with two network sub-modules, which are integrated and trained by our designed iterative teacher forcing scheme (ITFS) under the dynamic re-weighted loss constraint (DRLC). The ITFS is designed to avoid error accumulation by injecting the fully-sampled data into the training process. The DRLC is proposed to dynamically balance the contributions from the reconstruction and segmentation sub-modules so as to co-prompt the multi-task accuracy. The proposed method has been evaluated on two open datasets and one in vivo in-house dataset and compared to six state-of-the-art methods. Results show that the proposed method possesses encouraging capabilities for simultaneous and accurate MR reconstruction and segmentation.	翻訳日:2022-09-20 01:27:58 公開日:2020-11-27
# Manifold Disentanglement を用いた医用画像翻訳の操作 Manipulating Medical Image Translation with Manifold Disentanglement ( http://arxiv.org/abs/2011.13615v1 ) ライセンス: Link先を確認	Siyu Liu, Jason A. Dowling, Craig Engstrom, Peter B. Greer, Stuart Crozier, Shekhar S. Chandra	(参考訳) 医用画像変換(ctからmrへ)は、i)ドメイン不変特徴の忠実な翻訳(解剖学的構造の形状情報など)、ii)ターゲット領域特徴の現実的な合成(mrにおける組織出現など)を必要とするため、難しい課題である。本研究では,この2つの特徴を明示的にモデル化する新しい画像翻訳フレームワークであるmdgan(mandular disentanglement generative adversarial network)を提案する。完全畳み込み生成器を使用してドメイン不変な特徴をモデル化し、スタイルコードを使用して対象領域の特徴を多様体として別々にモデル化する。この設計は、ドメイン不変の機能とドメイン固有の機能を明確に切り離し、双方を個別に制御することを目的としている。画像変換処理はスタイライゼーションタスクとして定式化され、入力は学習多様体からサンプリングされたスタイルコードに基づいて、様々なターゲットドメインイメージに「スタイライゼーション」(翻訳)される。 MDGANをマルチモーダルな医用画像変換のためにテストし、この多様体上に2つのドメイン固有の多様体クラスタを作成し、セグメント化マップを擬似CTと擬似MR画像に変換する。 MR多様体クラスタを横切る経路をトラバースすることで、入力から形状情報を保持しながら目標出力を操作可能であることを示す。 Medical image translation (e.g. CT to MR) is a challenging task as it requires I) faithful translation of domain-invariant features (e.g. shape information of anatomical structures) and II) realistic synthesis of target-domain features (e.g. tissue appearance in MR). In this work, we propose Manifold Disentanglement Generative Adversarial Network (MDGAN), a novel image translation framework that explicitly models these two types of features. It employs a fully convolutional generator to model domain-invariant features, and it uses style codes to separately model target-domain features as a manifold. This design aims to explicitly disentangle domain-invariant features and domain-specific features while gaining individual control of both. The image translation process is formulated as a stylisation task, where the input is "stylised" (translated) into diverse target-domain images based on style codes sampled from the learnt manifold. We test MDGAN for multi-modal medical image translation, where we create two domain-specific manifold clusters on the manifold to translate segmentation maps into pseudo-CT and pseudo-MR images, respectively. We show that by traversing a path across the MR manifold cluster, the target output can be manipulated while still retaining the shape information from the input.	翻訳日:2022-09-20 01:27:38 公開日:2020-11-27
# ほとんど訓練のない多目的ニューラルアーキテクチャ探索 Multi-objective Neural Architecture Search with Almost No Training ( http://arxiv.org/abs/2011.13591v1 ) ライセンス: Link先を確認	Shengran Hu, Ran Cheng, Cheng He, Zhichao Lu	(参考訳) 近年、ニューラルアーキテクチャサーチ(NAS)は、学術と産業の両方から注目を集めている。印象的な実験結果の安定した流れにもかかわらず、既存のNASアルゴリズムのほとんどは、確率勾配降下(SGD)トレーニングのコストのかかる反復のために計算的に実行を禁止している。本研究では,ネットワークアーキテクチャの性能を迅速に評価するために,ランダムウェイト評価(rwe)と呼ばれる効果的な代替案を提案する。最後の線形分類層をトレーニングすることによって、rweはアーキテクチャを評価する計算コストを数時間から秒に短縮する。進化的多目的アルゴリズムに統合されると、rweは1つのgpuカードで2時間未満の検索でcifar-10で最先端のパフォーマンスを持つ一連の効率的なアーキテクチャを得る。 imagenetに対するランク次相関と転送学習実験に関するアブレーション研究は、rweの有効性をさらに検証した。 In the recent past, neural architecture search (NAS) has attracted increasing attention from both academia and industries. Despite the steady stream of impressive empirical results, most existing NAS algorithms are computationally prohibitive to execute due to the costly iterations of stochastic gradient descent (SGD) training. In this work, we propose an effective alternative, dubbed Random-Weight Evaluation (RWE), to rapidly estimate the performance of network architectures. By just training the last linear classification layer, RWE reduces the computational cost of evaluating an architecture from hours to seconds. When integrated within an evolutionary multi-objective algorithm, RWE obtains a set of efficient architectures with state-of-the-art performance on CIFAR-10 with less than two hours' searching on a single GPU card. Ablation studies on rank-order correlations and transfer learning experiments to ImageNet have further validated the effectiveness of RWE.	翻訳日:2022-09-20 01:20:17 公開日:2020-11-27
# 勾配の多様性と不確かさに基づくシーケンスラベリングのための深層能動的学習 Deep Active Learning for Sequence Labeling Based on Diversity and Uncertainty in Gradient ( http://arxiv.org/abs/2011.13570v1 ) ライセンス: Link先を確認	Yekyung Kim	(参考訳) 近年,自然言語処理タスクのアクティブラーニング(al)によるデータ依存の軽減が研究されている。しかし、クエリ選択においては、ほとんどの研究は、主に不確実性に基づくサンプリングに依存しており、一般にラベルなしデータの構造情報を活用していない。これにより、バッチアクティブな学習設定におけるサンプリングバイアスが発生し、同時に複数のサンプルを選択する。本研究では,シーケンスラベリングタスクに不確実性と多様性の両方を組み込んだ場合,アクティブラーニングを用いてラベル付きトレーニングデータの量を削減できることを実証する。我々は,複数のタスク,データセット,モデルにまたがる勾配埋め込みアプローチにおいて,重み付けされた多様性を選択することでシーケンスベースアプローチの効果を検討した。 Recently, several studies have investigated active learning (AL) for natural language processing tasks to alleviate data dependency. However, for query selection, most of these studies mainly rely on uncertainty-based sampling, which generally does not exploit the structural information of the unlabeled data. This leads to a sampling bias in the batch active learning setting, which selects several samples at once. In this work, we demonstrate that the amount of labeled training data can be reduced using active learning when it incorporates both uncertainty and diversity in the sequence labeling task. We examined the effects of our sequence-based approach by selecting weighted diverse in the gradient embedding approach across multiple tasks, datasets, models, and consistently outperform classic uncertainty-based sampling and diversity-based sampling.	翻訳日:2022-09-20 01:20:06 公開日:2020-11-27
# ナラティブ知識グラフにおける関係クラスタリング Relation Clustering in Narrative Knowledge Graphs ( http://arxiv.org/abs/2011.13647v1 ) ライセンス: Link先を確認	Simone Mellace, K Vani, Alessandro Antonucci	(参考訳) 小説や短編などの文学的文章を扱う場合、ナレッジグラフの形で構造化された情報の抽出は、小説の登場人物に対応するエンティティとそれらに関する監督された情報を集めるための適切なハードルとの間の膨大な関係によって妨げられる可能性がある。原文のリレーショナル文は(SBERTと)組み込まれ、意味論的に類似した関係をまとめるためにクラスタ化される。同じクラスタ内のすべての文は最終的に(BARTで)要約され、要約から抽出された記述ラベルが抽出される。予備テストでは、このようなクラスタリングが類似した関係をうまく検出でき、半教師付きアプローチのための貴重な前処理を提供することが示された。 When coping with literary texts such as novels or short stories, the extraction of structured information in the form of a knowledge graph might be hindered by the huge number of possible relations between the entities corresponding to the characters in the novel and the consequent hurdles in gathering supervised information about them. Such issue is addressed here as an unsupervised task empowered by transformers: relational sentences in the original text are embedded (with SBERT) and clustered in order to merge together semantically similar relations. All the sentences in the same cluster are finally summarized (with BART) and a descriptive label extracted from the summary. Preliminary tests show that such clustering might successfully detect similar relations, and provide a valuable preprocessing for semi-supervised approaches.	翻訳日:2022-09-20 01:19:52 公開日:2020-11-27
# 逐次混合によるリカレントニューラルネットワークの正規化 Regularizing Recurrent Neural Networks via Sequence Mixup ( http://arxiv.org/abs/2012.07527v1 ) ライセンス: Link先を確認	Armin Karamzade, Amir Najafi and Seyed Abolfazl Motahari	(参考訳) 本稿では,入力混合(Zhang et al., 2017)とマニフォールド混合(Verma et al., 2018)という,フィードフォワードニューラルネットワークにもともと提案されていた有名な正規化手法を,リカレントニューラルネットワーク(RNN)の領域に拡張する。提案手法は実装が容易で計算量も少ないが,様々なタスクにおいて単純なニューラルアーキテクチャの性能を活用している。我々は、実世界のデータセットに関するいくつかの実験を通して、我々の主張を検証するとともに、提案手法の性質と潜在的影響をさらに調査するための漸近的な理論的分析を提供する。 CoNLL-2003データ(Sang and De Meulder, 2003)上で, BiLSTM-CRFモデル(Huang et al., 2015)を名前付きエンティティ認識タスクに適用することにより,テストステージにおけるF-1スコアを改善し,損失を大幅に低減した。 In this paper, we extend a class of celebrated regularization techniques originally proposed for feed-forward neural networks, namely Input Mixup (Zhang et al., 2017) and Manifold Mixup (Verma et al., 2018), to the realm of Recurrent Neural Networks (RNN). Our proposed methods are easy to implement and have a low computational complexity, while leverage the performance of simple neural architectures in a variety of tasks. We have validated our claims through several experiments on real-world datasets, and also provide an asymptotic theoretical analysis to further investigate the properties and potential impacts of our proposed techniques. Applying sequence mixup to BiLSTM-CRF model (Huang et al., 2015) to Named Entity Recognition task on CoNLL-2003 data (Sang and De Meulder, 2003) has improved the F-1 score on the test stage and reduced the loss, considerably.	翻訳日:2022-09-20 01:19:20 公開日:2020-11-27
# デモとラベルなし体験によるオフライン学習 Offline Learning from Demonstrations and Unlabeled Experience ( http://arxiv.org/abs/2011.13885v1 ) ライセンス: Link先を確認	Konrad Zolna, Alexander Novikov, Ksenia Konyushkova, Caglar Gulcehre, Ziyu Wang, Yusuf Aytar, Misha Denil, Nando de Freitas, Scott Reed	(参考訳) 行動クローニング(BC)は、専門家によるデモンストレーションに関する教師あり学習によって、報酬なしでポリシーをオフラインでトレーニングできるため、ロボット学習において実用的であることが多い。しかし、bcは、私たちがラベルのない経験と呼ぶもの、すなわち、報酬のアノテーションなしで、混合品質と未知の品質のデータを有効に活用しません。このラベルのないデータは、人間の遠隔操作、スクリプト化されたポリシー、および同じロボット上の他のエージェントなど、さまざまなソースによって生成される。このラベルのない体験を利用できるデータ駆動型オフラインロボット学習に向けて、Offline Reinforced Imitation Learning (ORIL)を紹介する。 ORILはまず、実証者や未ラベルの軌跡からの観察を対比して報酬関数を学び、次にすべてのデータを学習報酬で注釈付けし、最後にオフラインの強化学習を通じてエージェントを訓練する。各種の連続制御およびロボット操作タスクのシミュレーションにより、ORILはラベルなし体験を効果的に活用することにより、同等のBCエージェントよりも一貫して優れていることを示す。 Behavior cloning (BC) is often practical for robot learning because it allows a policy to be trained offline without rewards, by supervised learning on expert demonstrations. However, BC does not effectively leverage what we will refer to as unlabeled experience: data of mixed and unknown quality without reward annotations. This unlabeled data can be generated by a variety of sources such as human teleoperation, scripted policies and other agents on the same robot. Towards data-driven offline robot learning that can use this unlabeled experience, we introduce Offline Reinforced Imitation Learning (ORIL). ORIL first learns a reward function by contrasting observations from demonstrator and unlabeled trajectories, then annotates all data with the learned reward, and finally trains an agent via offline reinforcement learning. Across a diverse set of continuous control and simulated robotic manipulation tasks, we show that ORIL consistently outperforms comparable BC agents by effectively leveraging unlabeled experience.	翻訳日:2022-09-20 01:18:58 公開日:2020-11-27

Title

Authors

Abstract

論文公表日・翻訳日

# ブール関数のための可変量子ニューラルネットワーク

Tunable Quantum Neural Networks for Boolean Functions ( http://arxiv.org/abs/2003.14122v2 )

ライセンス: Link先を確認

Viet Pham Ngoc and Herbert Wiklicky

(参考訳) 本稿では,量子ニューラルネットワークに対する新しいアプローチを提案する。我々の多層アーキテクチャは、古典的ニューラルネットワークの特徴である非線形活性化関数をエミュレートする計測の使用を避ける。それにもかかわらず、提案したアーキテクチャは、Boolean関数を学習することができる。この能力は、ブール関数と多制御NOTゲートからなる特定の量子回路の間に存在する対応から生じる。この対応は代数正規形式と呼ばれる関数の多項式表現によって構築される。この構成を用いて、任意のブール関数を学習するためにゲートをチューニングできるジェネリック量子回路のアイデアを導入する。学習課題を実行するために,測定の欠如を利用したアルゴリズムを考案した。長さ$n$の全てのバイナリ入力の重ね合わせを提示すると、ネットワークは少なくとも$n+1$の更新でターゲット関数を学習できる。

In this paper we propose a new approach to quantum neural networks. Our multi-layer architecture avoids the use of measurements that usually emulate the non-linear activation functions which are characteristic of the classical neural networks. Despite this, our proposed architecture is still able to learn any Boolean function. This ability arises from the correspondence that exists between a Boolean function and a particular quantum circuit made out of multi-controlled NOT gates. This correspondence is built via a polynomial representation of the function called the algebraic normal form. We use this construction to introduce the idea of a generic quantum circuit whose gates can be tuned to learn any Boolean functions. In order to perform the learning task, we have devised an algorithm that leverages the absence of measurements. When presented with a superposition of all the binary inputs of length $n$, the network can learn the target function in at most $n+1$ updates.

翻訳日:2023-05-27 07:51:34 公開日:2020-11-27

# ランダム場ハイゼンベルクスピン鎖における多体局在転移における鎖破壊とコステリッツ-チューレススケーリング

Chain breaking and Kosterlitz-Thouless scaling at the many-body localization transition in the random field Heisenberg spin chain ( http://arxiv.org/abs/2004.02861v3 )

ライセンス: Link先を確認

Nicolas Laflorencie, Gabriel Lemari\'e, Nicolas Mac\'e

(参考訳) 多体局在化(MBL)遷移の微妙さを理解するための非常に理論的な努力にもかかわらず、多くの疑問が、特にその重要な性質に関して開かれている。ここでは、1次元のMBLが熱力学限界の連鎖破壊を引き起こすスピン凍結機構を伴っているという重要な観察を行う。解析的および数値的手法を用いて,これらの鎖切断は典型的な局在長を直接観測し,mbl遷移におけるスケーリング特性は表現論的再正規化群アプローチによって予測されたkosterlitz-thoulessシナリオと一致することを示した。

Despite tremendous theoretical efforts to understand subtleties of the many-body localization (MBL) transition, many questions remain open, in particular concerning its critical properties. Here we make the key observation that MBL in one dimension is accompanied by a spin freezing mechanism which causes chain breakings in the thermodynamic limit. Using analytical and numerical approaches, we show that such chain breakings directly probe the typical localization length, and that their scaling properties at the MBL transition agree with the Kosterlitz-Thouless scenario predicted by phenomenological renormalization group approaches.

翻訳日:2023-05-26 06:16:11 公開日:2020-11-27

# アレイホモダイニングによる2つの非コヒーレント源のサブレイリー分解

Sub-Rayleigh resolution of two incoherent sources by array homodyning ( http://arxiv.org/abs/2005.08693v2 )

ライセンス: Link先を確認

Chandan Datta, Marcin Jarzyna, Yink Loong Len, Karol {\L}ukanowski, Jan Ko{\l}ody\'nski, Konrad Banaszek

(参考訳) 画像平面の空間強度分布の測定に基づく従来の非干渉イメージングは、レイリー回折基準で記述された分解能ハードルに直面する。ここでは、アレイホモダイン検出により測定された2つの不整点間の距離をレイリー限界以下で十分高い信号-雑音比で推定できるというフィッシャー情報の概念を用いて理論的に実証する。この能力は、コヒーレント検出技術を用いて取得した個々の検出器画素間の空間コヒーレンス情報の可用性に起因する。サブレイリー領域で達成可能な精度の簡易な解析近似について述べる。さらに,モンテカルロシミュレーションデータに対して推定アルゴリズムを提案し,適用した。

Conventional incoherent imaging based on measuring the spatial intensity distribution in the image plane faces the resolution hurdle described by the Rayleigh diffraction criterion. Here, we demonstrate theoretically using the concept of the Fisher information that quadrature statistics measured by means of array homodyne detection enables estimation of the distance between two incoherent point sources well below the Rayleigh limit for sufficiently high signal-to-noise ratio. This capability is attributed to the availability of spatial coherence information between individual detector pixels acquired using the coherent detection technique. A simple analytical approximation for the precision attainable in the sub-Rayleigh region is presented. Furthermore, an estimation algorithm is proposed and applied to Monte Carlo simulated data.

翻訳日:2023-05-19 11:24:39 公開日:2020-11-27

# 環境を介する量子電池の帯電過程

Environment-mediated charging process of quantum batteries ( http://arxiv.org/abs/2005.12823v3 )

ライセンス: Link先を確認

F. T. Tabesh, F. H. Kamin and S. Salimi

(参考訳) 共有散逸環境を介するオープン量子電池の充電過程を, 2つの異なるシナリオで検討した。最初のケースでは、非マルコフ環境の存在下で量子チャージャー・バッテリモデルを考える。バッテリーは強結合状態で適切に充電できるが、外部の電力や充電器との直接のやりとり、すなわちワイヤレスライクな充電は発生しない。環境はバッテリの充電において大きな役割を果たすが、これは弱い結合状態では起こらない。第2のシナリオでは、マルコフ力学の存在下での2量子ビット系を考慮した量子電池の充電過程に対する個人および集団自発放出率の影響を示す。その結果, オープンバッテリは, アンダーダムや強い外部磁場を用いることで, マルコフ力学において良好に充電できることを示した。また、サブラジアント状態と中間状態を考慮した頑健な電池も提示する。さらに,最初のシナリオでエルゴトロピーを探索するための実験的なセットアップを提案する。

We study the charging process of open quantum batteries mediated by a common dissipative environment in two different scenarios. In the first case, we consider a quantum charger-battery model in the presence of a non-Markovian environment. Where the battery can be properly charged in a strong coupling regime, without any external power and any direct interaction with the charger, i.e., a wireless-like charging happens. The environment plays a major role in the charging of the battery, while this does not happen in a weak coupling regime. In the second scenario, we show the effect of individual and collective spontaneous emission rates on the charging process of quantum batteries by considering a two-qubit system in the presence of Markovian dynamics. Our results demonstrate that open batteries can be satisfactorily charged in Markovian dynamics by employing an underdamped regime and/or strong external fields. We also present a robust battery by taking into account subradiant states and an intermediate regime. Moreover, we propose an experimental setup to explore the ergotropy in the first scenario.

翻訳日:2023-05-18 07:33:42 公開日:2020-11-27

# スマートなコネクテッド・コミュニティが新型コロナウイルスの感染拡大と闘う

Future Smart Connected Communities to Fight COVID-19 Outbreak ( http://arxiv.org/abs/2007.10477v2 )

ライセンス: Link先を確認

Deepti Gupta, Smriti Bhatt, Maanak Gupta, and Ali Saman Tosun

(参考訳) IoT(Internet of Things)はこの10年間で急速に成長し、さまざまなアプリケーションをサポートする幅広いデバイスを提供する次元と複雑さの面で開発を続けている。ユビキタスインターネット、コネクテッドセンサーとアクチュエータ、ネットワークと通信技術、人工知能(AI)によって、スマートサイバー物理システム(CPS)は、日々の生活で人間にサービスを提供する。しかし、新型コロナウイルス(COVID-19)の感染拡大により、現在の技術展開の限界が明らかになった。 IoTおよびスマートコネクテッドテクノロジとデータ駆動アプリケーションとの併用は、疾患の予防、継続的な監視、緩和だけでなく、ガイドライン、ルール、政府命令の即時実施において重要な役割を果たす可能性がある。本稿では、インテリジェントなモニタリング、プロアクティブな予防とコントロール、COVID-19の緩和のためのIoT対応エコシステムを構想する。我々は、E-health、スマートホーム、スマートサプライチェーン管理、スマートローカリティ、スマートシティなど、さまざまなスマートインフラストラクチャのための異なるアーキテクチャ、アプリケーション、技術システムを提案し、同様のアウトブレイクを管理し緩和する将来のコネクテッドコミュニティを開発する。さらに、これらのスマートなコミュニティやインフラを、これらのアウトブレイクと闘い、準備するために、今後の方向性とともに研究課題を提示する。

Internet of Things (IoT) has grown rapidly in the last decade and continue to develop in terms of dimension and complexity offering wide range of devices to support diverse set of applications. With ubiquitous Internet, connected sensors and actuators, networking and communication technology, and artificial intelligence (AI), smart cyber-physical systems (CPS) provide services rendering assistance to humans in their daily lives. However, the recent outbreak of COVID-19 (also known as coronavirus) pandemic has exposed and highlighted the limitations of current technological deployments to curtail this disease. IoT and smart connected technologies together with data-driven applications can play a crucial role not only in prevention, continuous monitoring, and mitigation of the disease, but also enable prompt enforcement of guidelines, rules and government orders to contain such future outbreaks. In this paper, we envision an IoT-enabled ecosystem for intelligent monitoring, pro-active prevention and control, and mitigation of COVID-19. We propose different architectures, applications and technology systems for various smart infrastructures including E-health, smart home, smart supply chain management, smart locality, and smart city, to develop future connected communities to manage and mitigate similar outbreaks. Furthermore, we present research challenges together with future directions to enable and develop these smart communities and infrastructures to fight and prepare against such outbreaks.

翻訳日:2023-05-08 22:58:20 公開日:2020-11-27

# 2次元正方格子における双極子ボソンのスタッガー状超流動相

Staggered superfluid phases of dipolar bosons in two-dimensional square lattices ( http://arxiv.org/abs/2008.00870v2 )

ライセンス: Link先を確認

Kuldeep Suthar, Rebecca Kraus, Hrushikesh Sable, Dilip Angom, Giovanna Morigi, and Jakub Zakrzewski

(参考訳) 二次元正方格子における超低温ボソンの量子基底状態の研究を行った。ボゾンは反発性双極子相互作用とs波散乱を介して相互作用する。この力学は双極子相互作用による相関ホッピングを含む拡張ボース・ハバードモデルによって記述され、係数は現実的なパラメータを持つワニエ展開を用いて第2量子化ハミルトニアンから得られる。相関ホッピング項の係数が負であり, 単粒子効果によるトンネル作用を阻害できる状態において, Gutzwiller ansatz を用いて位相図を決定する。この干渉は運動エネルギーの消滅時に停滞した超流動相と超固体相を生じさせ, 位相が圧縮不能な有限運動エネルギーにおけるパラメータ領域を同定する。得られた位相図をクラスタ・グッツウィラー法とDMRGを用いて一次元で得られた結果と比較した。

We study the quantum ground state of ultracold bosons in a two-dimensional square lattice. The bosons interact via the repulsive dipolar interactions and s-wave scattering. The dynamics is described by the extended Bose-Hubbard model including correlated hopping due to the dipolar interactions, the coefficients are found from the second quantized Hamiltonian using the Wannier expansion with realistic parameters. We determine the phase diagram using the Gutzwiller ansatz in the regime where the coefficients of the correlated hopping terms are negative and can interfere with the tunneling due to single-particle effects. We show that this interference gives rise to staggered superfluid and supersolid phases at vanishing kinetic energy, while we identify parameter regions at finite kinetic energy where the phases are incompressible. We compare the results with the phase diagram obtained with the cluster Gutzwiller approach and with the results found in one dimension using DMRG.

翻訳日:2023-05-07 06:36:36 公開日:2020-11-27

# 二重光モードを有する光学系におけるコヒーレントノイズキャンセリング

Coherent noise cancellation in optomechanical system with double optical modes ( http://arxiv.org/abs/2009.04706v3 )

ライセンス: Link先を確認

Jiashun Yan and Jun Jing

(参考訳) コヒーレント量子ノイズキャンセレーション(cqnc)戦略は、標準量子限界を破る超感度メトロロジープロトコルを促進するために、単一モード光機械システムにおいて実行されてきた。 CQNCの鍵となる考え方は、放射圧と駆動から生じるバックアクションノイズは、光学モードをほぼ共振アシラリーモードに結合することでオフセットできるということである。本研究では,cqnc下での連続的な弱力センシングを,異なる周波数とメカニカルモードの2つの光モードからなる2重モード光機械システムで開発する。特に、高周波光モードを駆動し、低周波モードを探索し、プローブモードを補助モードに結合することにより、非対称な処理の下で、従来のcqncセンシングに類似させることができる。現在のCQNC戦略は、制約駆動力(ルース・ハーウィッツ基準)と有効正の機械減衰(安定光ばね条件)の両方に関して同時に二重モード系を安定化させることが重要である。さらに、CQNC戦略の非自明な拡張(シングルモード版からダブルモード版まで)の下でプローブモードとアシラリーモードの結合を利用することにより、回転波項と反回転項はそれぞれシステムの安定性とノイズキャンセリングに責任があることが判明した。現実の状況では, 中間に膜, ねじれたキャビティに基づく弱トーク検出器を配置した3部式光機械装置を用いて, 本方式を実践できる。

The coherent quantum noise cancellation (CQNC) strategy has been performed in the single-mode optomechanical systems to promote an ultra-sensitive metrology protocol to break the standard quantum limit. The key idea of CQNC is that the backaction noises arising from radiation pressure and driving can be offset by coupling the optical mode to a near-resonant ancillary mode. In this work, a continuous weak-force sensing under CQNC is developed in a double-mode optomechanical system consisted of two optical modes with distinct frequencies and a mechanical mode. In particular, under the asymmetrical treatment by driving the higher-frequency optical mode, probing the lower-frequency one, and coupling the probe mode to the ancillary mode, our configuration can be used to resemble the conventional CQNC sensing. It is more important to find that the current CQNC strategy simultaneously stabilizes the double-mode system with respect to both the constrained driving power (the Routh-Hurwitz criterion) and the effective positive mechanical damping (the stable optical-spring condition). Moreover, through exploiting the coupling between the probe mode and the ancillary mode under this nontrivial extension of the CQNC strategy (from the single-mode version to the double-mode one), the rotating-wave term and the counter-rotating term are found to be responsible to the system stability and the noise cancellation, respectively. In realistic situations, our scheme can be practiced in a tripartite optomechanical setup with a membrane in the middle and a twisted-cavity-based weak-torque detector.

翻訳日:2023-05-03 00:56:56 公開日:2020-11-27

# 時間制御弱障害におけるボース・アインシュタイン凝縮変形の非平衡進化

Non-equilibrium evolution of Bose-Einstein condensate deformation in temporally controlled weak disorder ( http://arxiv.org/abs/2009.10477v2 )

ライセンス: Link先を確認

Milan Radonji\'c and Axel Pelster

(参考訳) 弱障害電位のオン・オフが、障害誘発凝縮変形の出現によって初期平衡ボース・アインシュタイン凝縮の定常状態にどのように影響するかを考慮し、汚泥問題に対する摂動平均場アプローチの時間依存的拡張を考える。その結果, 定常凝縮変形は, 実際に障害の断熱スイッチに対応する平衡部分の和であり, 後者が特定の駆動プロトコルに依存する動的誘導部分であることがわかった。その後、障害がオフになれば、結果として生じる凝縮変形は、平衡部が消滅する間、長期限界における追加の動的誘起部分を取得する。また,不均一な捕捉凝縮物に対する適切な一般化を示す。その結果, 縮合変形は時間的に制御された弱障害におけるボース気体の定常状態の一般非平衡性の指標であることが示された。

We consider a time-dependent extension of a perturbative mean-field approach to the dirty boson problem by considering how switching on and off a weak disorder potential affects the stationary state of an initially equilibrated Bose-Einstein condensate by the emergence of a disorder-induced condensate deformation. We find that in the switch on scenario the stationary condensate deformation turns out to be a sum of an equilibrium part, that actually corresponds to adiabatic switching on the disorder, and a dynamically-induced part, where the latter depends on the particular driving protocol. If the disorder is switched off afterwards, the resulting condensate deformation acquires an additional dynamically-induced part in the long-time limit, while the equilibrium part vanishes. We also present an appropriate generalization to inhomogeneous trapped condensates. Our results demonstrate that the condensate deformation represents an indicator of the generically non-equilibrium nature of steady states of a Bose gas in a temporally controlled weak disorder.

翻訳日:2023-05-01 07:06:06 公開日:2020-11-27

# エンジニアリング純粋に非線形な結合とクォートン

Engineering Purely Nonlinear Coupling with the Quarton ( http://arxiv.org/abs/2010.09959v2 )

ライセンス: Link先を確認

Yufeng Ye, Kaidong Peng, Mahdi Naghiloo, Gregory Cunningham, and Kevin P. O'Brien

(参考訳) 超伝導量子ビットと光子の強い非線形結合は、量子情報処理の重要な構成要素である。ジョセフソンの非線形性の摂動性のため、線形カップリングは分散状態において近似的な非線形カップリングにしばしば用いられる。しかし、この分散結合は弱く、基礎となる線形結合は局所モードを混合し、例えば不必要な自己結合をフォトンモードに分配する。ここでは、クォートンを用いて2つの線形分離されたトランモン量子ビット間の純粋に非線形結合を得る。クォートンのゼロ$\phi^2$ポテンシャルは、既存のスキームに比べて桁違いに強い巨大なギガヘルツレベルのクロスカーを可能にし、クォートンの正の$\phi^4$ポテンシャルはクォービットの負のセルフカーをキャンセルして共振器にすることができる。量子ビット光子、量子ビット光子、光子光子の間のこの巨大なクロスカーは、単一マイクロ波光子検出やボソニック符号の実装のような応用に最適である。

Strong nonlinear coupling of superconducting qubits and/or photons is a critical building block for quantum information processing. Due to the perturbative nature of the Josephson nonlinearity, linear coupling is often used in the dispersive regime to approximate nonlinear coupling. However, this dispersive coupling is weak and the underlying linear coupling mixes the local modes which, for example, distributes unwanted self-Kerr to photon modes. Here, we use the quarton to yield purely nonlinear coupling between two linearly decoupled transmon qubits. The quarton's zero $\phi^2$ potential enables a giant gigahertz-level cross-Kerr which is an order of magnitude stronger compared to existing schemes, and the quarton's positive $\phi^4$ potential can cancel the negative self-Kerr of qubits to linearize them into resonators. This giant cross-Kerr between bare modes of qubit-qubit, qubit-photon, and even photon-photon is ideal for applications such as single microwave photon detection and implementation of bosonic codes.

翻訳日:2023-04-28 05:50:14 公開日:2020-11-27

# マルチサイドバンドRABBITTスキームにおける遷移相の分解

Decomposition of the transition phase in multi-sideband RABBITT schemes ( http://arxiv.org/abs/2011.02989v2 )

ライセンス: Link先を確認

Divya Bharti, David Atri-Schuller, Gavin Menning, Kathryn R. Hamilton, Robert Moshammer, Thomas Pfeifer, Nicolas Douguet, Klaus Bartschat, Anne Harth

(参考訳) 2光子遷移(rabbitt)の干渉によるアト秒ビーティングの再構成は、光イオン化過程における原子遷移元素の相を決定するのに使用できる技術である。従来のRABBITTスキームでは、いわゆる漸近近似(asymptotic approximation)は、測定された位相を、単光子イオン化過程と連続体-連続体(cc)相に連結されたウィグナー相の和とみなす。本稿では,漸近近似をマルチサイドバンドRABBITTスキームに拡張する可能性を検討する。この近似からの予測は、原子水素の時間依存シュル=オディンガー方程式の解法に基づいて、 {\displaystyle {\it ab initio} 計算によって得られた結果と比較される。

Reconstruction of Attosecond Beating By Interference of Two-photon Transitions (RABBITT) is a technique that can be used to determine the phases of atomic transition elements in photoionization processes. In the traditional RABBITT scheme, the so-called "asymptotic approximation" considers the measured phase as a sum of the Wigner phase linked to a single-photon ionization process and the continuum-continuum (cc) phase associated with further single-photon transitions in the continuum. In this paper, we explore the possibility of extending the asymptotic approximation to multi-sideband RABBITT schemes. The predictions from this approximation are then compared with results obtained by an {\it ab initio} calculation based on solving the time-dependent Schr\"odinger equation for atomic hydrogen.

翻訳日:2023-04-25 05:17:02 公開日:2020-11-27

# 量子ニューラルネットワークの記憶容量と学習能力

Storage capacity and learning capability of quantum neural networks ( http://arxiv.org/abs/2011.06113v2 )

ライセンス: Link先を確認

Maciej Lewenstein, Aikaterini Gratsea, Andreu Riera-Campeny, Albert Aloy, Valentin Kasper, Anna Sanpera

(参考訳) 我々は、完全正のトレース保存(cptp)写像として記述される量子ニューラルネットワーク(qnns)の記憶容量を調べ、n$-次元ヒルベルト空間に作用する。我々はQNNが最大$N$の線形独立な純状態を保存することを実証し、対応する写像の構造を提供する。古典的なホップフィールドネットワークの記憶容量はニューロンの数に線形にスケールするが、qnnは指数関数的に独立な状態の数を格納できることを示した。我々はGardnerプログラムを用いることで、CPTPマップの相対体積をM$の定常状態で推定する。体積は$M$で指数関数的に減少し、$M\geq N+1$で0に縮まる。本研究の結果は、混合状態を格納したQNNとフィードフォワードQNNの入力出力関係に一般化される。提案手法は,QNNの記憶特性と入力出力状態の量子特性を関連付ける経路を開く。この論文はPeter Wittekの思い出に捧げられている。

We study the storage capacity of quantum neural networks (QNNs) described as completely positive trace preserving (CPTP) maps, which act on an $N$-dimensional Hilbert space. We demonstrate that QNNs can store up to $N$ linearly independent pure states and provide the structure of the corresponding maps. While the storage capacity of a classical Hopfield network scales linearly with the number of neurons, we show that QNNs can store an exponential number of linearly independent states. We estimate, employing the Gardner program, the relative volume of CPTP maps with $M$ stationary states. The volume decreases exponentially with $M$ and shrinks to zero for $M\geq N+1$. We generalize our results to QNNs storing mixed states as well as input-output relations for feed-forward QNNs. Our approach opens the path to relate storage properties of QNNs to the quantum properties of the input-output states. This paper is dedicated to the memory of Peter Wittek.

翻訳日:2023-04-24 11:33:25 公開日:2020-11-27

# 空洞中の原子の集合的自己トッピング

Collective self-trapping of atoms in a cavity ( http://arxiv.org/abs/2011.10440v2 )

ライセンス: Link先を確認

A. Dombi, T. W. Clark, F. I. B. Williams, F. Jessen, J. Fort\'agh, D. Nagy, A. Vukics, P. Domokos

(参考訳) 本研究では,高精細キャビティの動的結合モードを用いて,寒冷原子雲の光双極子トラップを実験的に実証する。トラップは原子の集合的な作用を必要とすること、すなわち1つの原子は同じレーザー駆動条件下では閉じ込められないことを示す。原子はモードの周波数を共鳴に近づけることで、キャビティに閉じ込めるために必要な光の強度を与える。トラップ光モードにおける原子のバックアクションは、トラップの非指数的崩壊によっても現れる。

We experimentally demonstrate optical dipole trapping of a cloud of cold atoms by means of a dynamically coupled mode of a high-finesse cavity. We show that the trap requires a collective action of the atoms, i.e. a single atom would not be trapped under the same laser drive conditions. The atoms pull the frequency of the mode closer to resonance, thereby allowing the necessary light intensity for trapping into the cavity. The back-action of the atoms on the trapping light mode is also manifested by the non-exponential collapse of the trap.

翻訳日:2023-04-23 14:54:37 公開日:2020-11-27

# 乗り合い運転者の殺人事件が南京における乗り合い利用者の意志推定に及ぼす影響

Influence of Murder Incident of Ride-hailing Drivers on Ride-hailing User's Consuming Willingness in Nanchang ( http://arxiv.org/abs/2011.11384v2 )

ライセンス: Link先を確認

Guangxin He, Shenghuan Yang, Miaomiao Lei, Xing Wu, Yixin Sun, Yimeng Dang

(参考訳) 2018年の中国における配車ドライバーの殺人事件が頻発したため、配車会社はこのような事故の防止と乗客の安全確保のために一連の措置を講じた。本研究は,殺人事件後の配車アプリの使用意欲と安全確保に対するユーザの態度を調査した。ライドシェアリングドライバーの殺人事件は、人々のライドシェアリングアプリの使用に重大な影響を及ぼすことがわかった。女性利用者の有意感は「心理的害」など男性利用者の0.633倍であり, 女性利用者の間ではより明らかであった。最後に,配車アプリの効率には満足するが,安全性や信頼性には満足せず,重要であると考えられた。

Due to the frequent murder incidents of ride-hailing drivers in China in 2018, ride-hailing companies took a series of measures to prevent such incidents and ensure ride-hailing passengers' safety. This study investigated users' willingness to use ride-hailing apps after murder incidents and users' attitudes toward Safety Rectification. We found that murder incidents of ride-hailing drivers had a significant adverse impact on people's usage of ride-hailing apps. Female users' consuming willingness was 0.633 times that of male users, such as" psychological harm" was more evident among females, and Safety Rectification had a calming effect for some users. Finally, we found that people were satisfied with ride-hailing apps' efficiency, but were not satisfied with safety and reliability, considered them important; female users were more concerned about the security than male users.

翻訳日:2023-04-23 14:45:12 公開日:2020-11-27

# wi-fiのセキュリティとプライバシーの強化は、電波指紋の難読化による

Stay Connected, Leave no Trace: Enhancing Security and Privacy in WiFi via Obfuscating Radiometric Fingerprints ( http://arxiv.org/abs/2011.12644v2 )

ライセンス: Link先を確認

Luis F. Abanto-Leon and Andreas Baeuml and Gek Hong (Allyson) Sim and Matthias Hollick and Arash Asadi

(参考訳) WiFiチップセットの固有のハードウェア欠陥は、送信された信号に現れ、ユニークなラジオメトリック指紋をもたらす。この指紋は、セキュリティを強化するための認証手段として使用できる。実際、近年の研究では、市販のデバイスに容易に実装できる実用的な指紋認証ソリューションが提案されている。本稿では,これらの解が偽装攻撃に対して非常に脆弱であることを解析的かつ実験的に証明する。また、このようなユニークなデバイスベースの署名は、ユーザーデバイスを追跡することによってプライバシーを侵害するために悪用されることも示しており、現在、ユーザーはデバイスをオフにする以外にそのようなプライバシー攻撃を防ぐ手段を持っていない。 RF-Veilは,不正行為に対して堅牢であるだけでなく,送信機の無線指紋を非正規受信機に隠蔽することでユーザのプライバシーを保護する。具体的には、送信信号に位相誤差のランダム化パターンを導入し、受信側だけが送信元の指紋を抽出できるようにした。一連の実験と分析において, 統計的攻撃に内在的ランダム化を採用する脆弱性を明らかにし, 対策を導入する。最後に,RF-Veilがユーザプライバシ保護とセキュリティ向上に有効であることを示す。さらに,提案手法はRF-Veilを使用しない他のデバイスとの通信を可能にする。

The intrinsic hardware imperfection of WiFi chipsets manifests itself in the transmitted signal, leading to a unique radiometric fingerprint. This fingerprint can be used as an additional means of authentication to enhance security. In fact, recent works propose practical fingerprinting solutions that can be readily implemented in commercial-off-the-shelf devices. In this paper, we prove analytically and experimentally that these solutions are highly vulnerable to impersonation attacks. We also demonstrate that such a unique device-based signature can be abused to violate privacy by tracking the user device, and, as of today, users do not have any means to prevent such privacy attacks other than turning off the device. We propose RF-Veil, a radiometric fingerprinting solution that not only is robust against impersonation attacks but also protects user privacy by obfuscating the radiometric fingerprint of the transmitter for non-legitimate receivers. Specifically, we introduce a randomized pattern of phase errors to the transmitted signal such that only the intended receiver can extract the original fingerprint of the transmitter. In a series of experiments and analyses, we expose the vulnerability of adopting naive randomization to statistical attacks and introduce countermeasures. Finally, we show the efficacy of RF-Veil experimentally in protecting user privacy and enhancing security. More importantly, our proposed solution allows communicating with other devices, which do not employ RF-Veil.

翻訳日:2023-04-23 00:56:15 公開日:2020-11-27

# 線形量子チャネルによる2励起ルーティング

Two-excitation routing via linear quantum channels ( http://arxiv.org/abs/2011.13711v1 )

ライセンス: Link先を確認

Tony John George Apollaro and Wayne Jordan Chetcuti

(参考訳) ネットワーク内の異なるノード間で量子情報をルーティングすることは、量子インターネットの基本的な前提条件である。シングルキュービットルーティングは概ね解決されているが、多くのキュービットルーティングプロトコルは、これまで深く研究されていない。 arXiv:1911.12211における多重励振転送プロトコルに基づいて、複数の受動ブロックが線形連鎖に結合されたネットワーク上の2励振ルーティングプロトコルに摂動伝達スキームを適用する。我々は、受信機とチェーン間の切替可能な結合と永久結合の両方に対処する。このプロトコルはフェミオンネットワーク上で効率の良い2励振ルーティングを可能にするが、スピン=$\frac{1}{2}$ネットワークの場合、ネットワークの限られた領域だけが高品質なルーティングに適している。

Routing quantum information among different nodes in a network is a fundamental prerequisite for a quantum internet. While single-qubit routing has been largely addressed, many-qubit routing protocols have not been intensively investigated so far. Building on the many-excitation transfer protocol in arXiv:1911.12211, we apply the perturbative transfer scheme to a two-excitation routing protocol on a network where multiple two-receivers block are coupled to a linear chain. We address both the case of switchable and permanent couplings between the receivers and the chain. We find that the protocol allows for efficient two-excitation routing on a fermionic network, although for a spin-$\frac{1}{2}$ network only a limited region of the network is suitable for high-quality routing.

翻訳日:2023-04-22 20:48:22 公開日:2020-11-27

# 非可換平面における角運動量量子逆流

Angular momentum quantum backflow in the noncommutative plane ( http://arxiv.org/abs/2011.13644v1 )

ライセンス: Link先を確認

V. D. Paccoia, O. Panella and P. Roy

(参考訳) 非可換平面における量子バックフロー問題を研究する。特に,非可換運動量演算子と振動子相互作用のない荷電粒子について検討し,各ケースにおける角運動量逆流と,それらの差異について検討した。また、角運動量逆流の発生に関連する確率を提案し、その確率が物理パラメータ、すなわち磁場に依存するかどうかを調べる。

We study the quantum backflow problem in the noncommutative plane. In particular, we have considered a charged particle with and without an oscillator interaction with noncommuting momentum operators and examined angular momentum backflow in each case and how they differ from each other. We also propose a probability associated with the occurence of angular momentum backflow and investigate whether or not the probability depends on a physical parameter, namely the magnetic field.

翻訳日:2023-04-22 20:48:07 公開日:2020-11-27

# 市民集団における人間計算:知識管理ソリューションフレームワーク

Human Computations in Citizen Crowds: A Knowledge Management Solution Framework ( http://arxiv.org/abs/2011.13638v1 )

ライセンス: Link先を確認

Nadeem Kafi, Zubair Ahmed Shaikh, and Muhammad Shahid Shaikh

(参考訳) KG(知識世代)と理解は伝統的に人間中心の活動であった。 KE (Knowledge Engineering) と KM (Knowledge Management) は、2つの異なる平面上の人間の知識を増強しようと試みている。しかし、どちらもコンピュータ中心である。クラウドソーシングhc(human computations)は最近、人間の認識とメモリを利用して、特定のタスクに関する多様な知識ストリームを生成する。文学は、市民の群衆のためのKMフレームワークについてはほとんど研究せず、様々な分野の人間からインプットを集め、タスクや知識カテゴリに関する知識を組織化し、コンピュータ中心の活動として新しい知識を再現する。本稿では,知識の生成,知識へのフィードバック,学習環境におけるその知識の結果を記録することを目的とした,ExamCheckという簡単なソリューションを実装したフレームワークの構築の試みを行う。 hcに基づく我々のソリューションは、構造化kmフレームワークが参加者自身にとって重要なコンテキストで複雑な問題に対処することができることを示している。

KG (Knowledge Generation) and understanding have traditionally been a Human-centric activity. KE (Knowledge Engineering) and KM (Knowledge Management) have tried to augment human knowledge on two separate planes: the first deals with machine interpretation of knowledge while the later explore interactions in human networks for KG and understanding. However, both remain computer-centric. Crowdsourced HC (Human Computations) have recently utilized human cognition and memory to generate diverse knowledge streams on specific tasks, which are mostly easy for humans to solve but remain challenging for machine algorithms. Literature shows little work on KM frameworks for citizen crowds, which gather input from the diverse category of Humans, organize that knowledge concerning tasks and knowledge categories and recreate new knowledge as a computer-centric activity. In this paper, we present an attempt to create a framework by implementing a simple solution, called ExamCheck, to focus on the generation of knowledge, feedback on that knowledge and recording the results of that knowledge in academic settings. Our solution, based on HC, shows that a structured KM framework can address a complex problem in a context that is important for participants themselves.

翻訳日:2023-04-22 20:48:00 公開日:2020-11-27

# 超対称性、半有界状態、および放牧入射反射

Supersymmetry, half-bound states, and grazing incidence reflection ( http://arxiv.org/abs/2011.13621v1 )

ライセンス: Link先を確認

D. A. Patient and S. A. R. Horsley

(参考訳) 平面媒質への入射時の電磁波は、ポテンシャル井戸に衝突するゼロエネルギー量子粒子と類似している。この極限波は通常完全に反射される。ここでは「半境界状態」の光学的類似性をサポートする誘電体プロファイルを探索し、放牧入射時の反射をゼロにする。これらのプロファイルを得るには、超対称量子力学とヘルムホルツ方程式の直接反転という、2つの異なる理論的アプローチを用いる。

Electromagnetic waves at grazing incidence onto a planar medium are analogous to zero energy quantum particles incident onto a potential well. In this limit waves are typically completely reflected. Here we explore dielectric profiles supporting optical analogues of `half-bound states', allowing for zero reflection at grazing incidence. To obtain these profiles we use two different theoretical approaches: supersymmetric quantum mechanics, and direct inversion of the Helmholtz equation.

翻訳日:2023-04-22 20:47:40 公開日:2020-11-27

# 異点を有する機械共振器のエネルギーレベル誘導と耐熱冷却

Energy-level-attraction and heating-resistant-cooling of mechanical resonators with exceptional points ( http://arxiv.org/abs/2011.13587v1 )

ライセンス: Link先を確認

Cheng Jiang, Yu-Long Liu, Mika A. Sillanp\"a\"a

(参考訳) 合成フォノニックゲージ場における機械共振器のエネルギー準位発展と基底状態冷却について検討した。可変ゲージ位相は、マルチモード光機械系における$\mathcal{pt}$- と anti-$\mathcal{pt}$-symmetric mechanical couplings の位相差によって媒介される。透過スペクトルは、ゲージ位相を変調して非対称なファノ線形状または二重光学的に誘起される透明性を示す。さらに、機械的結合が継続的に増大しても固有値が崩壊して縮退する。このような直感的エネルギー誘引は、反交差ではなく、$\mathcal{PT}$-と$-\mathcal{PT}$-対称結合の間の破壊的干渉に起因する。機械的固有値がピークに対応するキャビティ出力パワースペクトルにおいて,エネルギートラクションとそれに伴う例外点(EP)がより直感的に観測できることがわかった。機械冷却の場合、これらのEPでは平均フォノン占有数が最小となる。特にフォノン輸送は非相反し、EPでは理想的には一方向になる。最後に、ゲージ場を媒介とした非相反フォノン輸送に基づく耐熱性地中冷却を提案する。マクロメカニカル共振器の量子状態に向けて、ほとんどのオプトメカニカルシステムは本質的に空洞や機械的加熱によって制限される。我々の研究により、熱エネルギー移動はゲージ位相を調整し、悪名高い加熱限界を乗り越えるための有望な経路をサポートすることでブロックできることが判明した。

We study the energy-level evolution and ground-state cooling of mechanical resonators under a synthetic phononic gauge field. The tunable gauge phase is mediated by the phase difference between the $\mathcal{PT}$- and anti-$\mathcal{PT}$-symmetric mechanical couplings in a multimode optomechanical system. The transmission spectrum then exhibits the asymmetric Fano line shape or double optomechanically induced transparency by modulating the gauge phase. Moreover, the eigenvalues will collapse and become degenerate although the mechanical coupling is continuously increased. Such counterintuitive energy-attraction, instead of anti-crossing, attributes to destructive interferences between $\mathcal{PT}$- and anti-$\mathcal{PT}$-symmetric couplings. We find that the energy-attraction, as well as the accompanied exceptional points (EPs), can be more intuitively observed in the cavity output power spectrum where the mechanical eigenvalues correspond to the peaks. For mechanical cooling, the average phonon occupation number becomes minimum at these EPs. Especially, phonon transport becomes nonreciprocal and even ideally unidirectional at the EPs. Finally, we propose a heating-resistant ground-state cooling based on the nonreciprocal phonon transport, which is mediated by the gauge field. Towards the quantum regime of macroscopic mechanical resonators, most optomechanical systems are ultimately limited by their intrinsic cavity or mechanical heating. Our work revealed that the thermal energy transfer can be blocked by tuning the gauge phase, which supports a promising route to overpass the notorious heating limitations.

翻訳日:2023-04-22 20:46:59 公開日:2020-11-27

# ソーシャルメディアデータ、衛星画像、地理空間情報を用いた解釈可能な貧困マッピング

Interpretable Poverty Mapping using Social Media Data, Satellite Images, and Geospatial Information ( http://arxiv.org/abs/2011.13563v1 )

ライセンス: Link先を確認

Chiara Ledesma, Oshean Lee Garonita, Lorenzo Jaime Flores, Isabelle Tingzon, and Danielle Dalisay

(参考訳) 人道的組織が貧困緩和のための脆弱な地域を特定するためには、正確できめ細かい最新の貧困データへのアクセスが不可欠である。近年、コンピュータビジョンと衛星画像の組み合わせによる貧困評価が成功しているが、ブラックボックスモデルと組み合わせた高解像度画像を取得するコストは、多くの開発組織にとって大きな障壁となる。本研究では,機械学習と,ソーシャルメディアデータ,低解像度衛星画像,ボランティア地理情報など,容易にアクセス可能なデータソースを用いて,貧困推定のための解釈可能かつ費用効率の高い手法を提案する。提案手法を用いて,フィリピンの資産推定におけるR^2$0.66を衛星画像を用いた0.63に対して達成した。最後に、機能の重要性分析を使用して、グローバルとローカルの両方で最も貢献度の高い機能を特定し、意思決定者が貧困に関する深い洞察を得る手助けをします。

Access to accurate, granular, and up-to-date poverty data is essential for humanitarian organizations to identify vulnerable areas for poverty alleviation efforts. Recent works have shown success in combining computer vision and satellite imagery for poverty estimation; however, the cost of acquiring high-resolution images coupled with black box models can be a barrier to adoption for many development organizations. In this study, we present a interpretable and cost-efficient approach to poverty estimation using machine learning and readily accessible data sources including social media data, low-resolution satellite images, and volunteered geographic information. Using our method, we achieve an $R^2$ of 0.66 for wealth estimation in the Philippines, compared to 0.63 using satellite imagery. Finally, we use feature importance analysis to identify the highest contributing features both globally and locally to help decision makers gain deeper insights into poverty.

翻訳日:2023-04-22 20:46:36 公開日:2020-11-27

# 都市Twitterネットワークとコミュニティ:アテネのマイクロブログを事例として

Urban Twitter Networks and Communities: A Case Study of Microblogging in Athens ( http://arxiv.org/abs/2011.13785v1 )

ライセンス: Link先を確認

Tasos Spiliotopoulos, Ian Oakley

(参考訳) 本稿では,都市レベルのハッシュタグを用いたTwitterユーザによるコミュニティについて検討する。特に,ギリシャのアテネ市における,関連するTwitterハッシュタグデータの解析と可視化によって実証されたネットワークの視点を提供し,この地理的局所ネットワークのマイクロブロッギングの実践に関する概観と深い洞察を提供する。さらなる分析から、このネットワークのメンバーによって定義されたtwitterコミュニティは、現実のコミュニティの強い兆候を示していることが示唆された。

This paper examines the community formed by the Twitter users that used a city-level hashtag. In particular, we provide a network perspective of the city of Athens, Greece, as demonstrated by the analysis and visualization of the relevant Twitter hashtag data, in order to present both an overview and deeper insights at the microblogging practices of this geographic local network. Further analysis suggests that the Twitter community defined by the members of the network shows strong signs of a real-life community.

翻訳日:2023-04-22 20:39:03 公開日:2020-11-27

# 量子アシスト量子制御のためのアルゴリズムプリミティブ

Algorithmic Primitives for Quantum-Assisted Quantum Control ( http://arxiv.org/abs/2011.13777v1 )

ライセンス: Link先を確認

Guru-Vamsi Policharla and Sai Vinjanampathy

(参考訳) NISQデバイスに実装可能な様々な量子支援量子制御アルゴリズムを構築するために,オーバーラップと遷移行列時系列を評価するための2つの原始的アルゴリズムについて論じる。従来の手法と異なり, 断層計測を回避し, 単一の量子ビット計測のみに依存する。トロッタライズと測定誤差から発生する合成アルゴリズムとノイズ源の回路複雑性を解析した。

We discuss two primitive algorithms to evaluate overlaps and transition matrix time series, which are used to construct a variety of quantum-assisted quantum control algorithms implementable on NISQ devices. Unlike previous approaches, our method bypasses tomographically complete measurements and instead relies solely on single qubit measurements. We analyse circuit complexity of composed algorithms and sources of noise arising from Trotterization and measurement errors.

翻訳日:2023-04-22 20:38:54 公開日:2020-11-27

# 駆動散逸ボソニック場の変動解析

Variational analysis of driven-dissipative bosonic fields ( http://arxiv.org/abs/2011.13746v1 )

ライセンス: Link先を確認

Tim Pistorius and Hendrik Weimer

(参考訳) 本稿では,任意の大きな占有数を持つ駆動分散ボソニック場に対する量子マスター方程式の変分解析を行う手法を提案する。我々のアプローチは、密度行列のP表現と開量子系の変分原理を組み合わせたものである。提案手法を,波動関数モンテカルロシミュレーションとJaynes-Cummingsモデルに対するMaxwell-Bloch方程式の解との比較により評価した。さらに,キャビティフィールドにおけるRydberg分極を記述するモデルについて検討し,異なるモード間の相関を記述するために,変分パラマタの追加を導入する。

We present a method to perform a variational analysis of the quantum master equation for driven-disspative bosonic fields with arbitrary large occupation numbers. Our approach combines the P representation of the density matrix and the variational principle for open quantum system. We benchmark the method by comparing it to wave-function Monte-Carlo simulations and the solution of the Maxwell-Bloch equation for the Jaynes-Cummings model. Furthermore, we study a model describing Rydberg polaritons in a cavity field and introduce an additional set of variational paramaters to describe correlations between different modes.

翻訳日:2023-04-22 20:37:36 公開日:2020-11-27

# 量子力学の多世界解釈--パラドックス的考察

Many-Worlds Interpretation of Quantum Mechanics: A Paradoxical Picture ( http://arxiv.org/abs/2011.13928v1 )

ライセンス: Link先を確認

Amir Abbass Varshovi

(参考訳) 量子力学の多世界解釈(MWI)は、解釈における(半)決定論的並列世界の現実に基づく前例のない存在論的視点から研究される。不確実性原理のおかげで、宇宙の正しいオントロジーを特定する一貫した方法が存在しないことが示され、そのため、MWIは我々の住む世界は非現実であると主張する固有の矛盾の対象となっている。

The many-worlds interpretation (MWI) of quantum mechanics is studied from an unprecedented ontological perspective based on the reality of (semi-) deterministic parallel worlds in the interpretation. It is demonstrated that with thanks to the uncertainty principle there would be no consistent way to specify the correct ontology of the Universe, hence the MWI is subject to an inherent contradiction which claims that the world where we live in is unreal.

翻訳日:2023-04-22 20:30:29 公開日:2020-11-27

# 非巻きフェルミオンSPT相:超対称性拡張

Unwinding Fermionic SPT Phases: Supersymmetry Extension ( http://arxiv.org/abs/2011.13921v1 )

ライセンス: Link先を確認

Abhishodh Prakash, Juven Wang

(参考訳) We show how 1+1-dimensional fermionic symmetry-protected topological states (SPTs, i.e. nontrivial short-range entangled gapped phases of quantum matter whose boundary exhibits 't Hooft anomaly and whose bulk cannot be deformed into a trivial tensor product state under finite-depth local unitary transformations only in the presence of global symmetries), indeed can be unwound to a trivial state by enlarging the Hilbert space via adding extra degrees of freedom and suitably extending the global symmetries. 境界上の拡張射影的大域対称性は、特定の意味で超対称性(すなわち、フェルミオン数パリティ$(-1)^F$と可換でない群要素を含む)となり、反単位時間反転対称性は分数化される。これはまた、群拡大の観点で適切な超対称性拡張により、ある種の異種なフェルミオン異常(例えば、時間反転や反射対称性における「パリティ」異常)を上昇および除去できることを意味する。 1+1dMajorana fermion chain の多層構造について、Sachdev-Ye-Kitaev (SYK) 相互作用によるモデル、超対称性で保護された固有フェルミオン性ギャップレス SPT 、コボルディズム理論による高次時空次元への一般化の明確な例を考察する。

We show how 1+1-dimensional fermionic symmetry-protected topological states (SPTs, i.e. nontrivial short-range entangled gapped phases of quantum matter whose boundary exhibits 't Hooft anomaly and whose bulk cannot be deformed into a trivial tensor product state under finite-depth local unitary transformations only in the presence of global symmetries), indeed can be unwound to a trivial state by enlarging the Hilbert space via adding extra degrees of freedom and suitably extending the global symmetries. The extended projective global symmetry on the boundary can become supersymmetric in a specific sense, i.e., it contains group elements that do not commute with the fermion number parity $(-1)^F$, while the anti-unitary time-reversal symmetry becomes fractionalized. This also means we can uplift and remove certain exotic fermionic anomalies (e.g., "parity" anomaly in time-reversal or reflection symmetry) via appropriate supersymmetry extensions in terms of group extensions. We work out explicit examples for multi-layers of 1+1d Majorana fermion chains, then comment on models with Sachdev-Ye-Kitaev (SYK) interactions, intrinsic fermionic gapless SPTs protected by supersymmetry, and generalizations to higher spacetime dimensions via a cobordism theory.

翻訳日:2023-04-22 20:30:23 公開日:2020-11-27

# コヒーレント状態重ね合わせ、絡み合いおよびゲージ/重力対応

Coherent state superpositions, entanglement and gauge/gravity correspondence ( http://arxiv.org/abs/2011.13919v1 )

ライセンス: Link先を確認

Hai Lin, Yuwei Zhu

(参考訳) 我々はゲージ/重力対応の文脈において,多重力状態のコヒーレント状態と巨大重力状態のコヒーレント状態という2種類のコヒーレント状態に注目した。我々は位相シフト演算子とそのコヒーレント状態の重ね合わせに対する作用を便利に利用する。 N$状態のシュロディンガー猫状態は、一列のヤングテーブルー状態に近づき、それらの間の忠実度は、N$で漸近的に1に達する。これらの状態の量子フィッシャー情報は、基底状態の励起エネルギーの分散に比例し、位相空間の角方向における状態の局在性を特徴づける。気泡広告における位相空間平面の異なる領域を用いて,重力自由度間の相関と絡み合いを解析した。位相空間平面における2つの絡み合った環間の相関は、2つの環の間の環状の面積に関係している。また、2種類のノイズコヒーレント状態も解析し、これはノイズレス限界における純コヒーレント状態と大きなノイズリミットにおける最大混合状態との間の補間状態と見なすことができる。

We focus on two types of coherent states, the coherent states of multi graviton states and the coherent states of giant graviton states, in the context of gauge/gravity correspondence. We conveniently use a phase shift operator and its actions on the superpositions of these coherent states. We find $N$-state Schrodinger cat states which approach the one-row Young tableau states, with fidelity between them asymptotically reaches 1 at large $N$. The quantum Fisher information of these states is proportional to the variance of the excitation energy of the underlying states, and characterizes the localizability of the states in the angular direction in the phase space. We analyze the correlation and entanglement between gravitational degrees of freedom using different regions of the phase space plane in bubbling AdS. The correlation between two entangled rings in the phase space plane is related to the area of the annulus between the two rings. We also analyze two types of noisy coherent states, which can be viewed as interpolated states that interpolate between a pure coherent state in the noiseless limit and a maximally mixed state in the large noise limit.

翻訳日:2023-04-22 20:29:58 公開日:2020-11-27

# ガリウムヒ素スピン量子ビットのロードマップ

Roadmap for gallium arsenide spin qubits ( http://arxiv.org/abs/2011.13907v1 )

ライセンス: Link先を確認

Ferdinand Kuemmeth and Hendrik Bluhm

(参考訳) 窒化ガリウム(GaAs)のゲート定義量子ドットは、作製の比較的単純さと、単一の伝導帯谷、小さな有効質量、安定なドーパントのような好ましい電子特性のために、スピン量子ビット装置のパイオニアとして広く用いられている。 GaAsスピン量子ビットは、多くの研究室で容易に生成され、現在、絡み合い、量子非破壊測定、自動チューニング、マルチドットアレイ、コヒーレント交換結合、テレポーテーションなど様々な用途で研究されている。多くの注目が他の材料にシフトしているにもかかわらず、GaAsデバイスは概念実証量子情報処理や固体実験の原動力となるだろう。

Gate-defined quantum dots in gallium arsenide (GaAs) have been used extensively for pioneering spin qubit devices due to the relative simplicity of fabrication and favourable electronic properties such as a single conduction band valley, a small effective mass, and stable dopants. GaAs spin qubits are readily produced in many labs and are currently studied for various applications, including entanglement, quantum non-demolition measurements, automatic tuning, multi-dot arrays, coherent exchange coupling, and teleportation. Even while much attention is shifting to other materials, GaAs devices will likely remain a workhorse for proof-of-concept quantum information processing and solid-state experiments.

翻訳日:2023-04-22 20:29:39 公開日:2020-11-27

# 超伝導量子プロセッサ上のスターク多体局在

Stark many-body localization on a superconducting quantum processor ( http://arxiv.org/abs/2011.13895v1 )

ライセンス: Link先を確認

Qiujiang Guo, Chen Cheng, Hekang Li, Shibo Xu, Pengfei Zhang, Zhen Wang, Chao Song, Wuxin Liu, Wenhui Ren, Hang Dong, Rubem Mondaini, and H. Wang

(参考訳) 量子エミュレータは、チューナビリティと制御の程度が大きいため、密閉された量子多体系の微細な側面を観察することができる。後者はMulti-body Localization(MBL)現象と呼ばれ、局所情報の保存と遅い絡み合い成長によって動的に識別される非エルゴード的行動を記述する。ここでは,オンサイト・エネルギ・ランドスケープが乱れず,直線的に変化し,スタークmblをエミュレートする場合に,この現象学の正確な観察を行う。そこで我々は,32個の超伝導量子ビットからなる量子デバイスを構築し,非可積分スピンモデルの緩和ダイナミクスを忠実に再現する。本研究は, 古典的計算機における厳密なシミュレーションによって達成できる範囲を超過し, 量子アドバンテージの開始を示唆し, 量子計算を平衡多体問題を解くための資源として用いる方法を示す。

Quantum emulators, owing to their large degree of tunability and control, allow the observation of fine aspects of closed quantum many-body systems, as either the regime where thermalization takes place or when it is halted by the presence of disorder. The latter, dubbed many-body localization (MBL) phenomenon, describes the non-ergodic behavior that is dynamically identified by the preservation of local information and slow entanglement growth. Here, we provide a precise observation of this same phenomenology in the case the onsite energy landscape is not disordered, but rather linearly varied, emulating the Stark MBL. To this end, we construct a quantum device composed of thirty-two superconducting qubits, faithfully reproducing the relaxation dynamics of a non-integrable spin model. Our results describe the real-time evolution at sizes that surpass what is currently attainable by exact simulations in classical computers, signaling the onset of quantum advantage, thus bridging the way for quantum computation as a resource for solving out-of-equilibrium many-body problems.

翻訳日:2023-04-22 20:29:26 公開日:2020-11-27

# post or tweet: facebookとtwitterの利用に関する調査から学んだこと

Post or Tweet: Lessons from a Study of Facebook and Twitter Usage ( http://arxiv.org/abs/2011.13802v1 )

ライセンス: Link先を確認

Tasos Spiliotopoulos, Ian Oakley

(参考訳) このワークショップでは、FacebookとTwitterという、おそらく最も人気のある2つのソーシャルネットワークサイトについて、現在進行中の混合調査についてレポートする。この研究の目的は、参加者のモチベーションに関する調査データとAPI抽出を通じて収集された利用データを組み合わせることで、ソーシャルメディアの選択とクロスプラットフォーム利用のニュアンスに光を当てることである。本研究のセットアップについて述べるとともに,参加者の募集やデータ収集,利用データの扱いと次元化,サイト間の利用データの比較などに関する課題と洞察に焦点をあてる。

This workshop paper reports on an ongoing mixed-methods study on the two arguably most popular social network sites, Facebook and Twitter, for the same users. The overarching goal of the study is to shed light into the nuances of social media selection and cross-platform use by combining survey data about participants' motivations with usage data collected via API extraction. We describe the set-up of the study and focus our discussion on the challenges and insights relating to participant recruiting and data collection, handling and dimensionalizing usage data, and comparing usage data across sites.

翻訳日:2023-04-22 20:28:29 公開日:2020-11-27

# ブラックローン問題:サブグループ差別と戦うための分配的ロバストな公平性

Black Loans Matter: Distributionally Robust Fairness for Fighting Subgroup Discrimination ( http://arxiv.org/abs/2012.01193v1 )

ライセンス: Link先を確認

Mark Weber, Mikhail Yurochkin, Sherif Botros, Vanio Markov

(参考訳) 今日の貸付におけるアルゴリズム的公正性は、保護されたグループ間の統計的公平性を監視するためのグループフェアネス指標に依存している。このアプローチは代理業者によるサブグループ差別に対して脆弱であり、貸し手に対する法的・評判的な損害と、借り手に対する不正な不公平な結果という重大なリスクを負っている。実用的な課題は保護されたグループの多くの組み合わせと部分集合から生じる。我々は、米国における歴史的および残酷な人種差別の背景から、この問題を動機付け、利用可能なすべてのトレーニングデータを汚染し、アルゴリタミズムバイアスに対する公衆の感受性を高める。本稿では,貸付における公正性に関する規制コンプライアンスプロトコルを概観し,その限界について述べる。本稿では,個別のフェアネス法とそれに対応するフェアネス学習アルゴリズムの最近の発展から,既存のグループフェアネス要件を順守しつつ,サブグループ識別に対処するソリューションを提案する。

Algorithmic fairness in lending today relies on group fairness metrics for monitoring statistical parity across protected groups. This approach is vulnerable to subgroup discrimination by proxy, carrying significant risks of legal and reputational damage for lenders and blatantly unfair outcomes for borrowers. Practical challenges arise from the many possible combinations and subsets of protected groups. We motivate this problem against the backdrop of historical and residual racism in the United States polluting all available training data and raising public sensitivity to algorithimic bias. We review the current regulatory compliance protocols for fairness in lending and discuss their limitations relative to the contributions state-of-the-art fairness methods may afford. We propose a solution for addressing subgroup discrimination, while adhering to existing group fairness requirements, from recent developments in individual fairness methods and corresponding fair metric learning algorithms.

翻訳日:2023-04-22 20:21:36 公開日:2020-11-27

# 量子場理論のハイゼンベルク像における干渉、現実とフェルミオンの局所的要素

Interference in the Heisenberg Picture of Quantum Field Theory, Local Elements of Reality and Fermions ( http://arxiv.org/abs/2011.14003v1 )

ライセンス: Link先を確認

Chiara Marletto, Nicetu Tibau Vidal, Vlatko Vedral

(参考訳) ハイゼンベルク図を用いたマッハ・ツェンダー干渉計における単一光子の量子干渉について述べる。我々の目的は、記述が古典的な電磁場の場合と同様に局所的であることを示すことであり、唯一の違いは、電場と磁場が量子の場合、作用素(量子可観測器)であることである。次に,単電子マッハ・ツェンダー干渉計を考察し,この場合のハイゼンベルク像の適切な処理について説明する。興味深いことに、パリティ超選択則は光子と異なる電子の扱いを強いる。現在の演算子のような異なるフェルミオンモードの局所量子オブザーバブルのみを用いるモデルは相取得を記述することができる。ハイゼンベルク図で定式化された量子電磁力学の局所的な定式化の中で、この局所解析をフェルミオン場とボゾン場にどのように拡張するかについて議論する。

We describe the quantum interference of a single photon in the Mach-Zehnder interferometer using the Heisenberg picture. Our purpose is to show that the description is local just like in the case of the classical electromagnetic field, the only difference being that the electric and the magnetic fields are, in the quantum case, operators (quantum observables). We then consider a single-electron Mach-Zehnder interferometer and explain what the appropriate Heisenberg picture treatment is in this case. Interestingly, the parity superselection rule forces us to treat the electron differently to the photon. A model using only local quantum observables of different fermionic modes, such as the current operator, is nevertheless still viable to describe phase acquisition. We discuss how to extend this local analysis to coupled fermionic and bosonic fields within the same local formalism of quantum electrodynamics as formulated in the Heisenberg picture.

翻訳日:2023-04-22 20:21:15 公開日:2020-11-27

# 高次高調波分光における赤外単一サイクルパルス誘起高エネルギープラトー

Infrared single-cycle pulse induced high-energy plateaus in high-order harmonic spectroscopy ( http://arxiv.org/abs/2011.13995v1 )

ライセンス: Link先を確認

Abdelmalek Taoutioui and Hicham Agueny

(参考訳) 新たな実験[例えば \textit{Z]によって動機付けられる。 nie et al。 Nat! フォトンスペクトル領域 5 - 14 $\mu m$ において赤外(ir)単サイクルパルスを生成する際、理論上、高次高調波発生(hhg)過程を制御するためのそれらの役割は、強烈な近赤外(nir)多サイクルパルス(\lambda$ = 1.27$\mu m$)によって引き起こされる。このシナリオは、時間依存シュリンガー方程式の数値シミュレーションにより、水素原子のプロトタイプとして実証される。特に、結合パルスは偶数次高調波を発生させ、最も重要なことは高エネルギープラトーを発生させることであり、高調波遮断はNIRパルス単独の場合と比較して3倍の係数で拡張されることを示す。出現した高エネルギー高原は、nir磁場中を移動中の単サイクル電界からイオン化電子への膨大な運動量移動の結果、高運動量電子再結合をもたらすと理解されている。また、赤外場誘起電子変位効果による放出電子の方向制御におけるir単サイクル場の役割を明らかにした。さらに, 2つのパルス間の相対的キャリア・エンベロープ位相と波長を変化させることで, 出現した高原を制御できることを示した。そこで本研究では,IR単サイクル高調波分光法による時間分解電子回折の新しい視点を開拓した。

Motivated by the emerging experiments [e.g. \textit{Z. Nie et al. Nat. Photon. \textbf{12}, 489 (2018)}] on producing infrared (IR) single cycle pulses in the spectral region 5 - 14 $\mu m$, we theoretically investigate their role for controlling high-order harmonic generation (HHG) process induced by an intense near-infrared (NIR) multi-cycle pulse ($\lambda$ = 1.27 $\mu m$). The scenario is demonstrated for a prototype of the hydrogen atom by numerical simulations of the time-dependent Schr\"odinger equation. In particular, we show that the combined pulses allow one to generate even-order harmonics and most importantly to produce high-energy plateaus and that the harmonic cutoff is extended by a factor of 3 compared to the case with the NIR pulse alone. The emerged high-energy plateaus is understood as a result of a vast momentum transfer from the single-cycle field to the ionized electrons while travelling in the NIR field, and thus leading to high-momentum electron recollisions. We also identify the role of the IR single-cycle field for controlling the directionality of the emitted electrons via the IR-field induced electron displacement effect. We further show that the emerged plateaus can be controlled by varying the relative carrier-envelope phase between the two pulses as well as their wavelengths. Thus, our findings open up new perspectives for time-resolved electron diffraction using an IR single-cycle field-assisted high-harmonic spectroscopy.

翻訳日:2023-04-22 20:20:44 公開日:2020-11-27

# 多ビット系冷却用量子冷凍機

Few-qubit quantum refrigerator for cooling a multi-qubit system ( http://arxiv.org/abs/2011.13973v1 )

ライセンス: Link先を確認

Onat Ar{\i}soy and \"Ozg\"ur E. M\"ustecapl{\i}o\u{g}lu

(参考訳) 相互作用するマルチキュービット系を冷却する小型量子冷蔵庫として,数量子システムを提案する。具体的には、量子冷凍機としてスピンスターモデルと呼ばれる、中央量子ビットをn$ ancilla qubitsに結合する。まず, 量子ビット間の相互作用が縦型および強磁性イジングモデル形式である場合, 中心量子ビットは環境よりも低温であることを示す。その後、より冷たい中心量子ビットは、一般的な量子多ビット系を冷却するために、量子冷蔵庫の冷媒界面として使用されることが提案されている。 n$ と qubit-qubit の相互作用強度で制御できる運転コストと冷却効率を考慮して,簡単な冷凍サイクルについて検討した。また、達成可能な温度の限界が設定される。このような数量子ビットのコンパクトな量子冷蔵庫は、量子技術の応用の次元を減らし、全量子システムに容易に統合でき、量子コンピューティングや熱デバイスのスピードとパワーを高めることができる。

We propose to use a few-qubit system as a compact quantum refrigerator for cooling an interacting multi-qubit system. We specifically consider a central qubit coupled to $N$ ancilla qubits in a so-called spin-star model as our quantum refrigerator. We first show that if the interaction between the qubits is of the longitudinal and ferromagnetic Ising model form, the central qubit is colder than the environment. The colder central qubit is then proposed to be used as the refrigerant interface of the quantum refrigerator to cool down general quantum many-qubit systems. We discuss a simple refrigeration cycle, considering the operation cost and cooling efficiency, which can be controlled by $N$ and the qubit-qubit interaction strength. Besides, bounds on the achievable temperature are established. Such few-qubit compact quantum refrigerators can be significant to reduce dimensions of quantum technology applications, can be easy to integrate into all-qubit systems, and can increase the speed and power of quantum computing and thermal devices.

翻訳日:2023-04-22 20:20:15 公開日:2020-11-27

# 中心スピンモデルにおける相関R'enyiエントロピーの実験的検出

Experimental Detection of the Correlation R\'enyi Entropy in the Central Spin Model ( http://arxiv.org/abs/2011.13948v1 )

ライセンス: Link先を確認

Mohamad Niknam, Lea F. Santos, David G. Cory

(参考訳) 量子ビット間の相関の体積を定量化するエントロピーを実験的に提案する。この実験は、中心スピン結合からなるほぼ孤立した量子系上で行われ、当初は他の15個のスピンとは無関係であった。スピンスピン相互作用のため、情報は中心スピンから周囲のスピンへと流れ、時間とともに成長するマルチスピン相関のクラスターを形成する。我々は、マルチスピン相関の振幅を直接測定し、R'enyiエントロピーと呼ばれる相関の進化を計算する核磁気共鳴実験を設計する。このエントロピーは、絡み合いエントロピーの平衡後でも成長し続ける。また,R'enyiエントロピーの平衡の飽和点と時間スケールがシステムサイズにどのように依存するかを解析した。

We propose and experimentally measure an entropy that quantifies the volume of correlations among qubits. The experiment is carried out on a nearly isolated quantum system composed of a central spin coupled and initially uncorrelated with 15 other spins. Due to the spin-spin interactions, information flows from the central spin to the surrounding ones forming clusters of multi-spin correlations that grow in time. We design a nuclear magnetic resonance experiment that directly measures the amplitudes of the multi-spin correlations and use them to compute the evolution of what we call correlation R\'enyi entropy. This entropy keeps growing even after the equilibration of the entanglement entropy. We also analyze how the saturation point and the timescale for the equilibration of the correlation R\'enyi entropy depend on the system size.

翻訳日:2023-04-22 20:19:07 公開日:2020-11-27

# ハールランダム州のマナ

Mana in Haar-random states ( http://arxiv.org/abs/2011.13937v1 )

ライセンス: Link先を確認

Christopher David White and Justin H. Wilson

(参考訳) Mana は状態を生成するのに必要な非クリフォードリソースの量を測るもので、$\ell$ qudits 上の混合状態のマナは $\le \frac 1 2 (\ell \ln d - S_2)$; $S_2$ 状態の第2の Renyi エントロピーによって束縛される。ハールランダムな純および混合状態のマナを計算し、そのマナがヒルベルト空間次元においてほぼ対数(英語版)(logarithmic)であることを見つける:つまり、クウディ次元におけるクウディッツ数と対数(英語版)(logarithmic in qudit dimension)を広範囲に含む。特に、最大エントロピーに満たない状態の平均 mana は、その最大値の$\ln \pi/2$ に満たない。すると、この結果と近似的にt$-designsの最近の研究を結びつけて、manaは微分可能ではないので、非cliffordリソースの有用な尺度であると指摘します。

Mana is a measure of the amount of non-Clifford resources required to create a state; the mana of a mixed state on $\ell$ qudits bounded by $\le \frac 1 2 (\ell \ln d - S_2)$; $S_2$ the state's second Renyi entropy. We compute the mana of Haar-random pure and mixed states and find that the mana is nearly logarithmic in Hilbert space dimension: that is, extensive in number of qudits and logarithmic in qudit dimension. In particular, the average mana of states with less-than-maximal entropy falls short of that maximum by $\ln \pi/2$. We then connect this result to recent work on near-Clifford approximate $t$-designs; in doing so we point out that mana is a useful measure of non-Clifford resources precisely because it is not differentiable.

翻訳日:2023-04-22 20:18:24 公開日:2020-11-27

# 可制御長における抽象的要約のための解釈可能な多面的注意

Interpretable Multi-Headed Attention for Abstractive Summarization at Controllable Lengths ( http://arxiv.org/abs/2002.07845v2 )

ライセンス: Link先を確認

Ritesh Sarkhel, Moniba Keymanesh, Arnab Nandi, Srinivasan Parthasarathy

(参考訳) 制御可能な長さでの抽象的要約は自然言語処理において難しい課題である。トレーニングデータが限られているドメインや、サマリの長さが事前に分かっていないようなシナリオでは、さらに難しくなります。同時に、機械が生成した要約を信頼することに関して、人間の理解可能な言葉で要約がどのように構築されたかを説明することが重要かもしれない。本稿では,テキスト文書の要約を制御可能な長さで構築するための教師あり手法であるMulti-level Summarizer (MLS)を提案する。本手法のキーイネーバは,時間ステップ独立なセマンティクスカーネルの配列を用いて,入力文書上のアテンション分布を計算するマルチヘッドアテンション機構である。各カーネルは、人間の解釈可能な構文またはセマンティックプロパティを最適化する。英語における2つの低リソースデータセットの発掘実験により、MLSはMETEORスコアの14.70%まで強力なベースラインを上回ります。要約の人間による評価は、文書の重要概念を様々な予算で捉えることを示唆している。

Abstractive summarization at controllable lengths is a challenging task in natural language processing. It is even more challenging for domains where limited training data is available or scenarios in which the length of the summary is not known beforehand. At the same time, when it comes to trusting machine-generated summaries, explaining how a summary was constructed in human-understandable terms may be critical. We propose Multi-level Summarizer (MLS), a supervised method to construct abstractive summaries of a text document at controllable lengths. The key enabler of our method is an interpretable multi-headed attention mechanism that computes attention distribution over an input document using an array of timestep independent semantic kernels. Each kernel optimizes a human-interpretable syntactic or semantic property. Exhaustive experiments on two low-resource datasets in the English language show that MLS outperforms strong baselines by up to 14.70% in the METEOR score. Human evaluation of the summaries also suggests that they capture the key concepts of the document at various length-budgets.

翻訳日:2022-12-30 20:10:37 公開日:2020-11-27

# leafgan: 実用的植物病診断のためのデータ拡張法

LeafGAN: An Effective Data Augmentation Method for Practical Plant Disease Diagnosis ( http://arxiv.org/abs/2002.10100v2 )

ライセンス: Link先を確認

Quan Huu Cap, Hiroyuki Uga, Satoshi Kagiwada, and Hitoshi Iyatomi

(参考訳) 植物病の自動診断のための多くの応用が深層学習技術の成功に基づいて開発されている。しかし、これらのアプリケーションはしばしば過剰フィッティングに苦しめられ、新しい環境からテストデータセットで使用すると診断性能が劇的に低下する。本稿では,独自の注意機構を持つ新しい画像から画像への翻訳システムであるleafganを提案する。 leafganは、植物病診断の性能を向上させるためのデータ拡張ツールとして、健康な画像から変換することで、さまざまな疾患画像を生成する。注意機構により,本モデルでは,様々な背景を持つ画像から関連領域のみを変換し,トレーニング画像の汎用性を高めることができる。 5級キュウリ病分類の実験では、バニラ型サイクロンによるデータ増強は、一般化の改善に役立てられず、つまり、疾患診断性能はベースラインからわずか0.7%向上した。一方、LeafGANは診断性能を7.4%向上させた。また、LeafGANが生成した画像は、Vanilla CycleGANが生成した画像よりも品質が高く、より説得力が高いことも確認した。コードは、https://github.com/IyatomiLab/LeafGAN.comで公開されている。

Many applications for the automated diagnosis of plant disease have been developed based on the success of deep learning techniques. However, these applications often suffer from overfitting, and the diagnostic performance is drastically decreased when used on test datasets from new environments. In this paper, we propose LeafGAN, a novel image-to-image translation system with own attention mechanism. LeafGAN generates a wide variety of diseased images via transformation from healthy images, as a data augmentation tool for improving the performance of plant disease diagnosis. Thanks to its own attention mechanism, our model can transform only relevant areas from images with a variety of backgrounds, thus enriching the versatility of the training images. Experiments with five-class cucumber disease classification show that data augmentation with vanilla CycleGAN cannot help to improve the generalization, i.e., disease diagnostic performance increased by only 0.7% from the baseline. In contrast, LeafGAN boosted the diagnostic performance by 7.4%. We also visually confirmed the generated images by our LeafGAN were much better quality and more convincing than those generated by vanilla CycleGAN. The code is available publicly at: https://github.com/IyatomiLab/LeafGAN.

翻訳日:2022-12-29 04:07:12 公開日:2020-11-27

# ベルヌーイ分布とカテゴリー分布の有限混合に対する平均場ゲームモデル

A Mean Field Games model for finite mixtures of Bernoulli and Categorical distributions ( http://arxiv.org/abs/2004.08119v2 )

ライセンス: Link先を確認

Laura Aquilanti, Simone Cacace, Fabio Camilli and Raul De Maio

(参考訳) 有限混合モデルは、例えばデータクラスタリングにおいて、データの統計解析において重要なツールである。混合モデルの最適パラメータは、通常、期待最大化アルゴリズムによってログ類似汎関数を最大化することで計算される。本研究では,無限個のエージェントを持つ微分ゲームのクラスである平均場ゲームの理論に基づく代替手法を提案する。有限状態空間の多乗平均場ゲームシステムの解は、ベルヌーイ混合物に対する対数類似汎関数の臨界点を特徴づける。このアプローチは、カテゴリ分布の混合モデルに一般化される。したがって、Mean Field Gamesアプローチは混合モデルのパラメータを計算する方法を提供し、クラスタ解析の標準的な例にその適用例を示す。

Finite mixture models are an important tool in the statistical analysis of data, for example in data clustering. The optimal parameters of a mixture model are usually computed by maximizing the log-likelihood functional via the Expectation-Maximization algorithm. We propose an alternative approach based on the theory of Mean Field Games, a class of differential games with an infinite number of agents. We show that the solution of a finite state space multi-population Mean Field Games system characterizes the critical points of the log-likelihood functional for a Bernoulli mixture. The approach is then generalized to mixture models of categorical distributions. Hence, the Mean Field Games approach provides a method to compute the parameters of the mixture model, and we show its application to some standard examples in cluster analysis.

翻訳日:2022-12-12 12:57:53 公開日:2020-11-27

# 関連する歩行によるグラフニューラルネットワークの高次説明

Higher-Order Explanations of Graph Neural Networks via Relevant Walks ( http://arxiv.org/abs/2006.03589v3 )

ライセンス: Link先を確認

Thomas Schnake, Oliver Eberle, Jonas Lederer, Shinichi Nakajima, Kristof T. Sch\"utt, Klaus-Robert M\"uller, Gr\'egoire Montavon

(参考訳) グラフニューラルネットワーク(GNN)は、グラフ構造化データを予測するための一般的なアプローチである。 GNNは入力グラフをニューラルネットワーク構造にしっかりと絡み合わせるため、一般的な説明可能なAIアプローチは適用できない。これまでのところ、gnnはユーザーのためにブラックボックスのままだった。本稿では,GNNが高次展開を用いて自然に説明できることを示す。実際には,各ステップにおいて,レイヤワイド関連伝搬 (LRP) などの既存手法を適用可能なネスト属性方式を用いて,そのような説明を抽出することができる。出力は、予測に関係のある入力グラフへのウォークの集合である。我々は,GNN-LRPによって表現される新しい説明法を,広範囲のグラフニューラルネットワークに適用し,テキストデータの感情分析,量子化学における構造的優位性関係,画像分類に関する実用的な知見を抽出する。

Graph Neural Networks (GNNs) are a popular approach for predicting graph structured data. As GNNs tightly entangle the input graph into the neural network structure, common explainable AI approaches are not applicable. To a large extent, GNNs have remained black-boxes for the user so far. In this paper, we show that GNNs can in fact be naturally explained using higher-order expansions, i.e. by identifying groups of edges that jointly contribute to the prediction. Practically, we find that such explanations can be extracted using a nested attribution scheme, where existing techniques such as layer-wise relevance propagation (LRP) can be applied at each step. The output is a collection of walks into the input graph that are relevant for the prediction. Our novel explanation method, which we denote by GNN-LRP, is applicable to a broad range of graph neural networks and lets us extract practically relevant insights on sentiment analysis of text data, structure-property relationships in quantum chemistry, and image classification.

翻訳日:2022-11-25 02:34:04 公開日:2020-11-27

# 球運動ダイナミクス:正規化、重減少、SGDによるニューラルネットワークの学習ダイナミクス

Spherical Motion Dynamics: Learning Dynamics of Neural Network with Normalization, Weight Decay, and SGD ( http://arxiv.org/abs/2006.08419v4 )

ライセンス: Link先を確認

Ruosi Wan, Zhanxing Zhu, Xiangyu Zhang, Jian Sun

(参考訳) 本研究では、ニューラルネットワークの正規化、重崩壊(WD)、SGD(運動量)による学習のダイナミクスを包括的に明らかにし、Spherical Motion Dynamics (SMD) と名付けた。ほとんどの関連研究は、ウェイトノルムが変化しない「平衡」条件における「効果的な学習率」に焦点を当ててSMDを研究する。しかし、なぜSMDで平衡状態に到達できるかという彼らの議論は、欠如しているか、より説得力がない。本研究は平衡状態の原因を直接調査することでsmdを調査する。具体的には 1) SMDにおける平衡状態につながる仮定を導入し, 重みノルムが与えられた仮定と線形速度で収束できることを証明する。 2) SMDにおけるニューラルネットワークの進化を測定するために, 効果的な学習率の代替として「角更新」を提案し, 角更新が線形速度で理論値に収束することを示す。 3)ImageNet や MSCOCO など様々なコンピュータビジョンタスクにおける仮定と理論的結果の検証を行う。実験結果から, 理論的結果は経験的観察とよく一致した。

In this work, we comprehensively reveal the learning dynamics of neural network with normalization, weight decay (WD), and SGD (with momentum), named as Spherical Motion Dynamics (SMD). Most related works study SMD by focusing on "effective learning rate" in "equilibrium" condition, where weight norm remains unchanged. However, their discussions on why equilibrium condition can be reached in SMD is either absent or less convincing. Our work investigates SMD by directly exploring the cause of equilibrium condition. Specifically, 1) we introduce the assumptions that can lead to equilibrium condition in SMD, and prove that weight norm can converge at linear rate with given assumptions; 2) we propose "angular update" as a substitute for effective learning rate to measure the evolving of neural network in SMD, and prove angular update can also converge to its theoretical value at linear rate; 3) we verify our assumptions and theoretical results on various computer vision tasks including ImageNet and MSCOCO with standard settings. Experiment results show our theoretical findings agree well with empirical observations.

翻訳日:2022-11-21 02:31:47 公開日:2020-11-27

# 進化的貯水池計算ネットワークにおける機能分化

Functional differentiations in evolutionary reservoir computing networks ( http://arxiv.org/abs/2006.11507v2 )

ライセンス: Link先を確認

Yutaka Yamaguti and Ichiro Tsuda

(参考訳) ニューロンの機能的分化を示す拡張型貯水池コンピュータを提案する。本発明の貯水池コンピュータは,進化力学を用いて内部貯水池の変更を可能にするために開発され,これを進化貯水池コンピュータと呼ぶ。入力情報に応じて特異性を示す神経ユニットを開発するためには、内部ダイナミクスを制御し、ダイナミックスを拡張した後の収縮ダイナミクスを生成する必要がある。拡張ダイナミクスは入力情報の差を拡大するが、縮小ダイナミクスは入力情報のクラスターの形成に寄与し、複数のアトラクタを生成する。両方のダイナミクスの同時出現はカオスの存在を示している。対照的に、有限時間間隔におけるこれらのダイナミクスのシーケンシャルな出現は機能的な分化を引き起こす可能性がある。本稿では,進化的貯水池コンピュータにおいて,特定のニューロン単位がどのように得られるかを示す。

We propose an extended reservoir computer that shows the functional differentiation of neurons. The reservoir computer is developed to enable changing of the internal reservoir using evolutionary dynamics, and we call it an evolutionary reservoir computer. To develop neuronal units to show specificity, depending on the input information, the internal dynamics should be controlled to produce contracting dynamics after expanding dynamics. Expanding dynamics magnifies the difference of input information, while contracting dynamics contributes to forming clusters of input information, thereby producing multiple attractors. The simultaneous appearance of both dynamics indicates the existence of chaos. In contrast, sequential appearance of these dynamics during finite time intervals may induce functional differentiations. In this paper, we show how specific neuronal units are yielded in the evolutionary reservoir computer.

翻訳日:2022-11-18 22:46:04 公開日:2020-11-27

# 分散ネットワーク上のFew-shot学習のためのグラフプロトタイプネットワーク

Graph Prototypical Networks for Few-shot Learning on Attributed Networks ( http://arxiv.org/abs/2006.12739v3 )

ライセンス: Link先を確認

Kaize Ding, Jianling Wang, Jundong Li, Kai Shu, Chenghao Liu, Huan Liu

(参考訳) 現在、名指しネットワークは、ソーシャルネットワーク分析、金融不正検出、薬物発見など、無数のハイインパクトなアプリケーションで広く使われている。属性ネットワークにおける中心的な分析課題として,ノード分類が研究コミュニティで注目されている。実世界の属性ネットワークでは、ノードクラスの大部分は限定されたラベル付きインスタンスのみを含み、ロングテールノードクラス分布を描画する。既存のノード分類アルゴリズムは、 \textit{few-shot}ノードクラスを処理できない。治療として、数発の学習が研究コミュニティで注目を集めている。しかし、ノードの分類は、以下の質問に答える必要があるため、依然として困難な問題である。 (i)数ショットノード分類のための属性ネットワークからメタ知識を抽出する方法 (ii)ロバストで効果的なモデルを構築するために各ラベル付きインスタンスのインフォメーションを識別するにはどうすればよいか? 本稿では,これらの質問に答えるために,グラフメタラーニングフレームワークであるgraph prototypical networks (gpn)を提案する。実テスト環境を模倣する半教師付きノード分類タスクのプールを構築することにより、GPNは属性ネットワーク上で \textit{meta-learning} を実行し、ターゲット分類タスクを扱うための非常に一般化可能なモデルを導出することができる。大規模な実験では、GPNが数発のノード分類において優れていることを示した。

Attributed networks nowadays are ubiquitous in a myriad of high-impact applications, such as social network analysis, financial fraud detection, and drug discovery. As a central analytical task on attributed networks, node classification has received much attention in the research community. In real-world attributed networks, a large portion of node classes only contain limited labeled instances, rendering a long-tail node class distribution. Existing node classification algorithms are unequipped to handle the \textit{few-shot} node classes. As a remedy, few-shot learning has attracted a surge of attention in the research community. Yet, few-shot node classification remains a challenging problem as we need to address the following questions: (i) How to extract meta-knowledge from an attributed network for few-shot node classification? (ii) How to identify the informativeness of each labeled instance for building a robust and effective model? To answer these questions, in this paper, we propose a graph meta-learning framework -- Graph Prototypical Networks (GPN). By constructing a pool of semi-supervised node classification tasks to mimic the real test environment, GPN is able to perform \textit{meta-learning} on an attributed network and derive a highly generalizable model for handling the target classification task. Extensive experiments demonstrate the superior capability of GPN in few-shot node classification.

翻訳日:2022-11-17 22:35:15 公開日:2020-11-27

# ゴールコンディション付き階層型予測器を用いた長期視覚計画

Long-Horizon Visual Planning with Goal-Conditioned Hierarchical Predictors ( http://arxiv.org/abs/2006.13205v2 )

ライセンス: Link先を確認

Karl Pertsch, Oleh Rybkin, Frederik Ebert, Chelsea Finn, Dinesh Jayaraman, Sergey Levine

(参考訳) 未来に予測し、計画する能力は、世界で行動するエージェントにとって基本である。遠方の目標を達成するために,まず目標に向けて粗い計画を考案し,さらに詳細を徐々に記入する軌道を複数の時間スケールで予測する。対照的に、視覚的予測と計画のための現在の学習アプローチは、(1)目標情報を考慮せずに予測し、(2)最高の時間分解能では、一度に1ステップずつ、長い水平タスクで失敗する。本研究では,これらの制約を克服可能な視覚的予測と計画のためのフレームワークを提案する。まず,目標に向かって予測する問題を定式化し,それに対応する潜在空間目標条件予測器(gcps)を提案する。 GCPは、目標に達する軌道のみに検索スペースを制約することで、計画の効率を大幅に改善する。さらに,2つの観測から観測結果を予測し,軌道の各部分を再帰的に分割することにより,gcpを階層モデルとして自然に定式化することができることを示す。この分割・分割戦略は, 長期予測に有効であり, 粗大から細かな方法で軌道を最適化する効率的な階層計画アルゴリズムを設計できる。目標条件と階層予測の両方を使用することで、GCPは以前よりもはるかに長い視野で視覚的な計画タスクを解決できることを示す。

The ability to predict and plan into the future is fundamental for agents acting in the world. To reach a faraway goal, we predict trajectories at multiple timescales, first devising a coarse plan towards the goal and then gradually filling in details. In contrast, current learning approaches for visual prediction and planning fail on long-horizon tasks as they generate predictions (1) without considering goal information, and (2) at the finest temporal resolution, one step at a time. In this work we propose a framework for visual prediction and planning that is able to overcome both of these limitations. First, we formulate the problem of predicting towards a goal and propose the corresponding class of latent space goal-conditioned predictors (GCPs). GCPs significantly improve planning efficiency by constraining the search space to only those trajectories that reach the goal. Further, we show how GCPs can be naturally formulated as hierarchical models that, given two observations, predict an observation between them, and by recursively subdividing each part of the trajectory generate complete sequences. This divide-and-conquer strategy is effective at long-term prediction, and enables us to design an effective hierarchical planning algorithm that optimizes trajectories in a coarse-to-fine manner. We show that by using both goal-conditioning and hierarchical prediction, GCPs enable us to solve visual planning tasks with much longer horizon than previously possible.

翻訳日:2022-11-17 21:25:09 公開日:2020-11-27

# 3Dの対人ロボットは人間をクローズできるか?

Can 3D Adversarial Logos Cloak Humans? ( http://arxiv.org/abs/2006.14655v2 )

ライセンス: Link先を確認

Yi Wang, Jingyang Zhou, Tianlong Chen, Sijia Liu, Shiyu Chang, Chandrajit Bajaj, Zhangyang Wang

(参考訳) 敵の攻撃の傾向により、研究者たちは2Dシーンで訓練された物体探知機を騙そうと試みている。それらの多くは、現実世界で使われる可能性のある新たな攻撃形態として、画像に敵のパッチ(例えばロゴ)を付加することが挙げられる。それにもかかわらず、3dレンダリングビューからの敵意攻撃についてはあまり知られていない。本稿では, 2次元テクスチャ画像から任意の形状のロゴを構築し, ロゴ変換と呼ばれるテクスチャマッピングを用いて, この画像を3次元逆ロゴにマッピングする。結果として得られる3dの敵のロゴは、その形状と位置を容易に操作できる敵のテクスチャと見なされる。これは、コンピュータグラフィックス合成画像のための広告訓練の汎用性を大きく広げる。従来の敵対的パッチとは対照的に、この新しい攻撃形態は3Dオブジェクトの世界にマッピングされ、異なるレンダリングによって2D画像領域にバックプロパゲートされる。加えて、既存の敵のパッチとは異なり、我々の新しい3d敵ロゴは、モデル回転の下で堅牢に最先端の深層物体検出器を騙すように示されています。私たちのコードはhttps://github.com/tamu-vita/3d_adversarial_logoで利用可能です。

With the trend of adversarial attacks, researchers attempt to fool trained object detectors in 2D scenes. Among many of them, an intriguing new form of attack with potential real-world usage is to append adversarial patches (e.g. logos) to images. Nevertheless, much less have we known about adversarial attacks from 3D rendering views, which is essential for the attack to be persistently strong in the physical world. This paper presents a new 3D adversarial logo attack: we construct an arbitrary shape logo from a 2D texture image and map this image into a 3D adversarial logo via a texture mapping called logo transformation. The resulting 3D adversarial logo is then viewed as an adversarial texture enabling easy manipulation of its shape and position. This greatly extends the versatility of adversarial training for computer graphics synthesized imagery. Contrary to the traditional adversarial patch, this new form of attack is mapped into the 3D object world and back-propagates to the 2D image domain through differentiable rendering. In addition, and unlike existing adversarial patches, our new 3D adversarial logo is shown to fool state-of-the-art deep object detectors robustly under model rotations, leading to one step further for realistic attacks in the physical world. Our codes are available at https://github.com/TAMU-VITA/3D_Adversarial_Logo.

翻訳日:2022-11-17 02:45:09 公開日:2020-11-27

# 効率的なモバイルネットワーク設計のためのボトルネック構造再考

Rethinking Bottleneck Structure for Efficient Mobile Network Design ( http://arxiv.org/abs/2007.02269v4 )

ライセンス: Link先を確認

Zhou Daquan, Qibin Hou, Yunpeng Chen, Jiashi Feng, Shuicheng Yan

(参考訳) 倒立残差ブロックは最近,モバイルネットワークのアーキテクチャ設計を支配している。これは2つの設計ルールを導入することで、古典的な残留ボトルネックを変化させる。本稿では,このような設計変更の必要性を再検討し,情報損失や勾配混乱のリスクをもたらす可能性を見いだす。そこで我々は,その構造を反転させ,より高次元でのアイデンティティマッピングと空間変換を行い,情報損失と勾配混乱を効果的に緩和する,サンドグラスブロックと呼ばれる新しいボトルネック設計を提案する。大規模な実験は、一般的な信念とは異なり、そのようなボトルネック構造がモバイルネットワークの反転構造よりも有益であることを示した。 ImageNet分類では、パラメータや計算量を増やすことなく、逆残差ブロックを砂時計ブロックに置き換えることによって、MobileNetV2よりも1.7%以上精度を向上することができる。 Pascal VOC 2007 テストセットでは、対象検出において 0.9% mAP も改善されている。さらに,ニューラルネットワーク探索法dartsの探索空間に追加することにより,サンドグラスブロックの有効性をさらに検証する。 25%のパラメータ削減により、従来のdartsモデルよりも分類精度が0.13%向上した。コードは、https://github.com/zhoudaquan/rethinking_bottleneck_design.comで参照できる。

The inverted residual block is dominating architecture design for mobile networks recently. It changes the classic residual bottleneck by introducing two design rules: learning inverted residuals and using linear bottlenecks. In this paper, we rethink the necessity of such design changes and find it may bring risks of information loss and gradient confusion. We thus propose to flip the structure and present a novel bottleneck design, called the sandglass block, that performs identity mapping and spatial transformation at higher dimensions and thus alleviates information loss and gradient confusion effectively. Extensive experiments demonstrate that, different from the common belief, such bottleneck structure is more beneficial than the inverted ones for mobile networks. In ImageNet classification, by simply replacing the inverted residual block with our sandglass block without increasing parameters and computation, the classification accuracy can be improved by more than 1.7% over MobileNetV2. On Pascal VOC 2007 test set, we observe that there is also 0.9% mAP improvement in object detection. We further verify the effectiveness of the sandglass block by adding it into the search space of neural architecture search method DARTS. With 25% parameter reduction, the classification accuracy is improved by 0.13% over previous DARTS models. Code can be found at: https://github.com/zhoudaquan/rethinking_bottleneck_design.

翻訳日:2022-11-13 08:22:50 公開日:2020-11-27

# 単変量行列分布アルゴリズムは, 誤認と転移をよく表す

The Univariate Marginal Distribution Algorithm Copes Well With Deception and Epistasis ( http://arxiv.org/abs/2007.08277v2 )

ライセンス: Link先を確認

Benjamin Doerr and Martin S. Krejca

(参考訳) 最近の研究で、Lehre and Nguyen (FOGA 2019) は、認知学習ブロック(DLB)問題を最適化するために、一変量境界分布アルゴリズム (UMDA) が親集団サイズで指数関数的な時間を必要とすることを示した。彼らはこの結果から、単変量EDAは偽りやてんかんに苦慮していると結論づけた。本研究では,この否定的な発見は,UMDAのパラメータを不運に選択することに起因することを示す。集団の大きさが遺伝的ドリフトを防げるほど大きく選択されると、umdaは最大$\lambda(\frac{n}{2} + 2 e \ln n)$の適合評価で高い確率でdlb問題を最適化する。子孫サイズの$\lambda$ of order $n \log n$は遺伝的ドリフトを防ぐことができるので、UMDAは$O(n^2 \log n)$フィットネス評価でDLB問題を解決することができる。対照的に、従来の進化的アルゴリズムでは、$o(n^3)$ よりも実行時間が保証されない(${(1+1)$ ea には厳密であることが証明されている)ため、umda は欺きとエピスタティスに対処できることを示唆している。より広い視点から見れば、UMDAは進化的アルゴリズムよりも局所最適に対処できることが示され、この結果は以前、コンパクトな遺伝的アルゴリズムでのみ知られていた。 Lehre と Nguyen の下位境界とともに、私たちの結果は、遺伝的ドリフトによる政権での EDA の実行が、劇的なパフォーマンス損失をもたらすことを厳格に証明した。

In their recent work, Lehre and Nguyen (FOGA 2019) show that the univariate marginal distribution algorithm (UMDA) needs time exponential in the parent populations size to optimize the DeceptiveLeadingBlocks (DLB) problem. They conclude from this result that univariate EDAs have difficulties with deception and epistasis. In this work, we show that this negative finding is caused by an unfortunate choice of the parameters of the UMDA. When the population sizes are chosen large enough to prevent genetic drift, then the UMDA optimizes the DLB problem with high probability with at most $\lambda(\frac{n}{2} + 2 e \ln n)$ fitness evaluations. Since an offspring population size $\lambda$ of order $n \log n$ can prevent genetic drift, the UMDA can solve the DLB problem with $O(n^2 \log n)$ fitness evaluations. In contrast, for classic evolutionary algorithms no better run time guarantee than $O(n^3)$ is known (which we prove to be tight for the ${(1+1)}$ EA), so our result rather suggests that the UMDA can cope well with deception and epistatis. From a broader perspective, our result shows that the UMDA can cope better with local optima than evolutionary algorithms; such a result was previously known only for the compact genetic algorithm. Together with the lower bound of Lehre and Nguyen, our result for the first time rigorously proves that running EDAs in the regime with genetic drift can lead to drastic performance losses.

翻訳日:2022-11-09 22:41:50 公開日:2020-11-27

# ヒト乳癌研究を支援する犬乳癌の完全な注釈付き全画像データセット

A completely annotated whole slide image dataset of canine breast cancer to aid human breast cancer research ( http://arxiv.org/abs/2008.10244v2 )

ライセンス: Link先を確認

Marc Aubreville, Christof A. Bertram, Taryn A. Donovan, Christian Marzahl, Andreas Maier, and Robert Klopfleisch

(参考訳) 犬乳腺癌(CMC)はヒト乳癌の病理発生のモデルとして用いられており,腫瘍悪性度の評価には同様の段階が一般的である。この階調スキームの重要な構成要素は、ミトティックフィギュア(MF)の密度である。現在公開されているヒト乳癌のデータセットは、スライド画像全体の小さなサブセット(WSI)に対してのみアノテーションを提供する。 MFに完全アノテートされたCMCの21 WSIのデータセットを提案する。このために、病理学者は、潜在的なMFと似た外観の構造物の全てのWSIをスクリーニングした。第2の専門家はブラインドにラベルを割り当て、第3の専門家は最終ラベルを割り当てた。さらに,機械学習を用いて未検出のmfを同定した。最後に,アノテーションの一貫性を高めるために,表現学習と二次元投影を行った。我々のデータセットは 13,907 mf と 36,379 hard negative からなる。テストセットでは平均0.791score,ヒト乳癌データセットでは0.696scoreであった。

Canine mammary carcinoma (CMC) has been used as a model to investigate the pathogenesis of human breast cancer and the same grading scheme is commonly used to assess tumor malignancy in both. One key component of this grading scheme is the density of mitotic figures (MF). Current publicly available datasets on human breast cancer only provide annotations for small subsets of whole slide images (WSIs). We present a novel dataset of 21 WSIs of CMC completely annotated for MF. For this, a pathologist screened all WSIs for potential MF and structures with a similar appearance. A second expert blindly assigned labels, and for non-matching labels, a third expert assigned the final labels. Additionally, we used machine learning to identify previously undetected MF. Finally, we performed representation learning and two-dimensional projection to further increase the consistency of the annotations. Our dataset consists of 13,907 MF and 36,379 hard negatives. We achieved a mean F1-score of 0.791 on the test set and of up to 0.696 on a human breast cancer dataset.

翻訳日:2022-10-25 09:07:05 公開日:2020-11-27

# データ価格に関する調査:経済学からデータ科学へ

A Survey on Data Pricing: from Economics to Data Science ( http://arxiv.org/abs/2009.04462v2 )

ライセンス: Link先を確認

Jian Pei

(参考訳) データの価値は低い。データの価値を客観的、体系的、定量的に評価するにはどうすればよいのか? 価格データ(一般に情報財)は、経済学、マーケティング、電子商取引、データ管理、データマイニング、機械学習など、分散した分野や原則で研究され、実践されてきた。本稿では,この重要な方向性について,学際的かつ総合的に概観する。データ価格の背景にある様々なモチベーションを調べ、データ価格の経済性を理解し、一連の基本原則に従って価格モデルの開発と進化をレビューする。デジタル製品とデータ製品の両方について論じる。また,今後の課題や方向性についても検討する。

Data are invaluable. How can we assess the value of data objectively, systematically and quantitatively? Pricing data, or information goods in general, has been studied and practiced in dispersed areas and principles, such as economics, marketing, electronic commerce, data management, data mining and machine learning. In this article, we present a unified, interdisciplinary and comprehensive overview of this important direction. We examine various motivations behind data pricing, understand the economics of data pricing and review the development and evolution of pricing models according to a series of fundamental principles. We discuss both digital products and data products. We also consider a series of challenges and directions for future work.

翻訳日:2022-10-20 09:04:24 公開日:2020-11-27

# 適切な場所に木を植える:アルゴリズム融合による樹木栽培に適した場所の推薦

Planting trees at the right places: Recommending suitable sites for growing trees using algorithm fusion ( http://arxiv.org/abs/2009.08002v2 )

ライセンス: Link先を確認

Pushpendra Rana and Lav R Varshney

(参考訳) 大規模植林は炭素削減のための低コストの自然ソリューションとして提案されてきたが、特に発展途上国ではプランテーションの場が貧弱なため妨げられている。サイト選択を支援するため,物理に基づく伝統的な林業科学知識と機械学習を組み合わせたアルゴリズム融合に基づくePSAレコメンデーションシステムを開発した。 ePSAは、森林地帯内のブランクパッチを識別し、木の成長ポテンシャルに基づいて各パッチをランク付けすることで、森林範囲の役員を支援する。実験, ユーザスタディ, 展開の結果は, 北インド以北における炭素削減のための自然環境ソリューションとして, 樹木プランテーションの長期的成功を形作る上で, 推奨システムの有用性を特徴づけている。

Large-scale planting of trees has been proposed as a low-cost natural solution for carbon mitigation, but is hampered by poor selection of plantation sites, especially in developing countries. To aid in site selection, we develop the ePSA (e-Plantation Site Assistant) recommendation system based on algorithm fusion that combines physics-based/traditional forestry science knowledge with machine learning. ePSA assists forest range officers by identifying blank patches inside forest areas and ranking each such patch based on their tree growth potential. Experiments, user studies, and deployment results characterize the utility of the recommender system in shaping the long-term success of tree plantations as a nature climate solution for carbon mitigation in northern India and beyond.

翻訳日:2022-10-17 11:55:32 公開日:2020-11-27

# 画素からの連続制御における視覚的一般化の測定

Measuring Visual Generalization in Continuous Control from Pixels ( http://arxiv.org/abs/2010.06740v2 )

ライセンス: Link先を確認

Jake Grigsby, Yanjun Qi

(参考訳) 自己教師付き学習とデータ拡張は、連続制御タスクにおける状態と画像に基づく強化学習エージェントのパフォーマンスギャップを著しく減らした。しかし、現在の技術が現実世界の環境に要求される様々な視覚的条件に直面することができるかどうかはまだ不明である。本稿では,既存の連続制御領域にグラフィカルな多様性を加えることで,エージェントの視覚的一般化を検証できる挑戦的なベンチマークを提案する。実験結果から,現在の手法では様々な視覚変化の一般化が困難であり,これらのタスクを困難にさせる変動の具体的要因について検討した。データ拡張技術は自己教師あり学習手法より優れており、より重要な画像変換によってより優れた視覚的一般化が実現されていることが分かりました。

Self-supervised learning and data augmentation have significantly reduced the performance gap between state and image-based reinforcement learning agents in continuous control tasks. However, it is still unclear whether current techniques can face a variety of visual conditions required by real-world environments. We propose a challenging benchmark that tests agents' visual generalization by adding graphical variety to existing continuous control domains. Our empirical analysis shows that current methods struggle to generalize across a diverse set of visual changes, and we examine the specific factors of variation that make these tasks difficult. We find that data augmentation techniques outperform self-supervised learning approaches and that more significant image transformations provide better visual generalization \footnote{The benchmark and our augmented actor-critic implementation are open-sourced @ https://github.com/QData/dmc_remastered)

翻訳日:2022-10-07 22:36:49 公開日:2020-11-27

# メタグラディエントD4PGによるバランシング制約とリワード

Balancing Constraints and Rewards with Meta-Gradient D4PG ( http://arxiv.org/abs/2010.06324v2 )

ライセンス: Link先を確認

Dan A. Calian and Daniel J. Mankowitz and Tom Zahavy and Zhongwen Xu and Junhyuk Oh and Nir Levine and Timothy Mann

(参考訳) 現実世界のアプリケーションを解決するためにRLエージェントを配置するには、複雑なシステムの制約を満たす必要があることが多い。しばしば制約しきい値は、システムの複雑な性質や、オフラインでしきい値を検証することができない(例えば、シミュレータや合理的なオフライン評価手順は存在しない)ために誤って設定される。これにより、制約に違反することなくタスクを解決できない解が得られる。しかし、現実の多くのケースでは制約違反は望ましくないが、それらは破滅的なものではなく、ソフト制約されたRLアプローチの必要性を動機付けている。本稿では,制約違反の最小化と期待リターンとの良好なトレードオフを見つけるために,メタグラディエンスを利用するソフトコンストレートrl手法を提案する。このアプローチの有効性は、4つの異なる MuJoCo ドメインのベースラインを一貫して上回ることを示すことで実証する。

Deploying Reinforcement Learning (RL) agents to solve real-world applications often requires satisfying complex system constraints. Often the constraint thresholds are incorrectly set due to the complex nature of a system or the inability to verify the thresholds offline (e.g, no simulator or reasonable offline evaluation procedure exists). This results in solutions where a task cannot be solved without violating the constraints. However, in many real-world cases, constraint violations are undesirable yet they are not catastrophic, motivating the need for soft-constrained RL approaches. We present a soft-constrained RL approach that utilizes meta-gradients to find a good trade-off between expected return and minimizing constraint violations. We demonstrate the effectiveness of this approach by showing that it consistently outperforms the baselines across four different MuJoCo domains.

翻訳日:2022-10-07 22:26:11 公開日:2020-11-27

# D2RL:強化学習における深度アーキテクチャ

D2RL: Deep Dense Architectures in Reinforcement Learning ( http://arxiv.org/abs/2010.09163v2 )

ライセンス: Link先を確認

Samarth Sinha, Homanga Bharadhwaj, Aravind Srinivas, Animesh Garg

(参考訳) ディープラーニングアーキテクチャの改善は、コンピュータビジョンや自然言語処理における教師付きおよび教師なし学習の状況を改善する上で重要な役割を担っているが、強化学習のためのニューラルネットワークアーキテクチャの選択は、いまだに未熟である。コンピュータビジョンと生成モデルにおけるアーキテクチャ選択の成功からインスピレーションを得て,様々なロボット学習ベンチマーク環境において,より深いネットワークと高密度接続を用いて強化学習を行う。以上の結果から,現状の手法は,密接な接続や深いネットワーク,操作やロコモーションタスクのスイートを通じて,固有認識と画像に基づく観察の両方において有益であることが判明した。私たちの成果が強力なベースラインとして機能し、強化学習のためのニューラルネットワークアーキテクチャに関するさらなる研究の動機になることを期待しています。コード付きプロジェクトのwebサイトは、このリンクhttps://sites.google.com/view/d2rl/home.comにある。

While improvements in deep learning architectures have played a crucial role in improving the state of supervised and unsupervised learning in computer vision and natural language processing, neural network architecture choices for reinforcement learning remain relatively under-explored. We take inspiration from successful architectural choices in computer vision and generative modelling, and investigate the use of deeper networks and dense connections for reinforcement learning on a variety of simulated robotic learning benchmark environments. Our findings reveal that current methods benefit significantly from dense connections and deeper networks, across a suite of manipulation and locomotion tasks, for both proprioceptive and image-based observations. We hope that our results can serve as a strong baseline and further motivate future research into neural network architectures for reinforcement learning. The project website with code is at this link https://sites.google.com/view/d2rl/home.

翻訳日:2022-10-05 22:24:32 公開日:2020-11-27

# 脳腫瘍分離のためのコンテキスト認識3D UNet

Context Aware 3D UNet for Brain Tumor Segmentation ( http://arxiv.org/abs/2010.13082v2 )

ライセンス: Link先を確認

Parvez Ahmad, Saqib Qamar, Linlin Shen, Adnan Saeed

(参考訳) 深部畳み込みニューラルネットワーク(CNN)は医用画像解析において顕著な性能を発揮する。 UNetは、脳腫瘍セグメンテーションを含む医療画像タスクのための3D CNNアーキテクチャのパフォーマンスの主要なソースである。 UNetアーキテクチャのスキップ接続は、エンコーダとデコーダの経路から特徴を結合し、画像データから複数のコンテキスト情報を抽出する。マルチスケールの特徴は、脳腫瘍のセグメンテーションにおいて重要な役割を果たす。しかし、機能の使用制限により、セグメンテーションのためのUNetアプローチのパフォーマンスが低下する可能性がある。本稿では,脳腫瘍分割のためのunetアーキテクチャの改良を提案する。提案アーキテクチャでは,エンコーダとデコーダの経路に密結合したブロックを用いて,特徴再利用性の概念から複数のコンテキスト情報を抽出する。さらに、異なるカーネルサイズの特徴をマージすることで、ローカルおよびグローバル情報を抽出するために、残差インセプションブロック(RIB)が使用される。提案アーキテクチャをbrats(multi-modal brain tumor segmentation challenge) 2020年テストデータセットで検証した。全腫瘍(wt)、腫瘍コア(tc)、増強腫瘍(et)のdice(dsc)スコアはそれぞれ89.12%、84.74%、79.12%である。

Deep convolutional neural network (CNN) achieves remarkable performance for medical image analysis. UNet is the primary source in the performance of 3D CNN architectures for medical imaging tasks, including brain tumor segmentation. The skip connection in the UNet architecture concatenates features from both encoder and decoder paths to extract multi-contextual information from image data. The multi-scaled features play an essential role in brain tumor segmentation. However, the limited use of features can degrade the performance of the UNet approach for segmentation. In this paper, we propose a modified UNet architecture for brain tumor segmentation. In the proposed architecture, we used densely connected blocks in both encoder and decoder paths to extract multi-contextual information from the concept of feature reusability. In addition, residual-inception blocks (RIB) are used to extract the local and global information by merging features of different kernel sizes. We validate the proposed architecture on the multi-modal brain tumor segmentation challenge (BRATS) 2020 testing dataset. The dice (DSC) scores of the whole tumor (WT), tumor core (TC), and enhancing tumor (ET) are 89.12%, 84.74%, and 79.12%, respectively.

翻訳日:2022-10-03 04:56:45 公開日:2020-11-27

# 自己集合型3次元U-netニューラルネットワークによる脳腫瘍セグメント形成 : BraTS 2020チャレンジソリューション

Brain tumor segmentation with self-ensembled, deeply-supervised 3D U-net neural networks: a BraTS 2020 challenge solution ( http://arxiv.org/abs/2011.01045v2 )

ライセンス: Link先を確認

Theophraste Henry, Alexandre Carre, Marvin Lerousseau, Theo Estienne, Charlotte Robert, Nikos Paragios, Eric Deutsch

(参考訳) 脳腫瘍のセグメンテーションは患者の疾患管理にとって重要な課題である。このタスクの自動化と標準化のために,我々は,マルチモーダル脳腫瘍分割課題(brats,multimodal brain tumor segmentation challenge,brats)2020のトレーニングデータセットに基づいて,深層監視と確率的重み平均化を中心に,複数のu-netライクなニューラルネットワークをトレーニングした。 2つの異なる訓練パイプラインからモデルの2つの独立したアンサンブルを訓練し、それぞれに脳腫瘍のセグメンテーションマップを作成した。これらの2つのラベルマップは、特定の腫瘍部分領域に対するそれぞれのアンサンブルのパフォーマンスを考慮してマージされた。試験時間増加を伴うオンライン検証データセットの性能は,0.81,0.91,0.85,Hausdorff(95%),20.6,4,3,5.7mm,全腫瘍コア,腫瘍コアの順であった。同様に、私たちのソリューションはDiceの0.79、0.89、0.84、最終テストデータセットでHausdorff(95%)の20.4、6.7、19.5mmを達成し、トップ10チームの中でランク付けしました。より複雑なトレーニングスキームとニューラルネットワークアーキテクチャを、トレーニング時間を大幅に増加させるコストで、大幅なパフォーマンス向上なしに調査した。以上より,各腫瘍亜領域の成績は良好で,バランスが良好であった。私たちのソリューションはhttps://github.com/lescientifik/open_brats2020でオープンソースです。

Brain tumor segmentation is a critical task for patient's disease management. In order to automate and standardize this task, we trained multiple U-net like neural networks, mainly with deep supervision and stochastic weight averaging, on the Multimodal Brain Tumor Segmentation Challenge (BraTS) 2020 training dataset. Two independent ensembles of models from two different training pipelines were trained, and each produced a brain tumor segmentation map. These two labelmaps per patient were then merged, taking into account the performance of each ensemble for specific tumor subregions. Our performance on the online validation dataset with test time augmentation were as follows: Dice of 0.81, 0.91 and 0.85; Hausdorff (95%) of 20.6, 4,3, 5.7 mm for the enhancing tumor, whole tumor and tumor core, respectively. Similarly, our solution achieved a Dice of 0.79, 0.89 and 0.84, as well as Hausdorff (95%) of 20.4, 6.7 and 19.5mm on the final test dataset, ranking us among the top ten teams. More complicated training schemes and neural network architectures were investigated without significant performance gain at the cost of greatly increased training time. Overall, our approach yielded good and balanced performance for each tumor subregion. Our solution is open sourced at https://github.com/lescientifik/open_brats2020.

翻訳日:2022-10-01 17:19:29 公開日:2020-11-27

# 歪み画像復元のための生成的・識別的学習

Generative and Discriminative Learning for Distorted Image Restoration ( http://arxiv.org/abs/2011.05784v3 )

ライセンス: Link先を確認

Yi Gu, Yuting Gao, Jie Li, Chentao Wu, Weijia Jia

(参考訳) Liquifyは画像編集の一般的な技術であり、画像の歪みに使用できる。歪み変動の不確実性のため, 液状化フィルタによる歪み画像の復元は難しい課題である。画像を効率よく編集するには、歪んだ画像を自動的に復元することが期待される。本稿では、歪み画像の適切な歪みと完了を求めることで特徴付けられる歪み画像復元を目的とする。既存の手法は、自然現象によって生じる特定の規則的変形を解決するためのハードウェアアシストや幾何学原理に重点を置いているが、この課題における人工歪みの不規則性や不確実性には対処できない。そこで本研究では,深層ニューラルネットワークに基づく新しい生成・判別学習法を提案し,様々な再構成マッピングを学習し,複雑で高次元のデータを表現する。この方法は、タスクを整流段階と精練段階とに分解する。第1段階生成ネットワークは、歪み画像から補正画像へのマッピングを予測する。第二段階生成ネットワークはさらに知覚品質を最適化する。このタスクを探索するデータセットやベンチマークがないため、CelebAデータセットに基づいた前方歪みマッピングにより、Distorted Face Dataset(DFD)を作成します。提案ベンチマークの広範な実験評価を行い,本手法が画像復元に有効な方法であることを実証した。

Liquify is a common technique for image editing, which can be used for image distortion. Due to the uncertainty in the distortion variation, restoring distorted images caused by liquify filter is a challenging task. To edit images in an efficient way, distorted images are expected to be restored automatically. This paper aims at the distorted image restoration, which is characterized by seeking the appropriate warping and completion of a distorted image. Existing methods focus on the hardware assistance or the geometric principle to solve the specific regular deformation caused by natural phenomena, but they cannot handle the irregularity and uncertainty of artificial distortion in this task. To address this issue, we propose a novel generative and discriminative learning method based on deep neural networks, which can learn various reconstruction mappings and represent complex and high-dimensional data. This method decomposes the task into a rectification stage and a refinement stage. The first stage generative network predicts the mapping from the distorted images to the rectified ones. The second stage generative network then further optimizes the perceptual quality. Since there is no available dataset or benchmark to explore this task, we create a Distorted Face Dataset (DFD) by forward distortion mapping based on CelebA dataset. Extensive experimental evaluation on the proposed benchmark and the application demonstrates that our method is an effective way for distorted image restoration.

翻訳日:2022-09-27 00:51:59 公開日:2020-11-27

# 安全合成サンス仕様

Safety Synthesis Sans Specification ( http://arxiv.org/abs/2011.07630v2 )

ライセンス: Link先を確認

Roderick Bloem and Hana Chockler and Masoud Ebrahimi and Dana Fisman and Heinz Riener

(参考訳) 我々は、メンバシップクエリと推測クエリを使用して、競合する可能性のあるトランスデューサを含むターゲット言語である$u$から、transducer ${s}$を学習する問題を定義する。要件は${S}$の言語が$U$のサブセットであることである。ハードウェアおよびソフトウェア検証の多くの状況において、これは自然な問題である、と私たちは主張する。本稿では,この問題に対する学習アルゴリズムを考案し,その時間と問合せの複雑さが,対象言語のランク,非互換性尺度,与えられた反例の最大長に関して多項式であることを示す。本稿では,プロトタイプによる実験について報告する。

We define the problem of learning a transducer ${S}$ from a target language $U$ containing possibly conflicting transducers, using membership queries and conjecture queries. The requirement is that the language of ${S}$ be a subset of $U$. We argue that this is a natural question in many situations in hardware and software verification. We devise a learning algorithm for this problem and show that its time and query complexity is polynomial with respect to the rank of the target language, its incompatibility measure, and the maximal length of a given counterexample. We report on experiments conducted with a prototype implementation.

翻訳日:2022-09-25 07:50:44 公開日:2020-11-27

# ニューラルネットのハイパーパラメータ最適化に対する集団ベースハイブリッドアプローチ

A Population-based Hybrid Approach to Hyperparameter Optimization for Neural Networks ( http://arxiv.org/abs/2011.11062v2 )

ライセンス: Link先を確認

Marcello Serqueira, Pedro Gonz\'alez, Eduardo Bezerra

(参考訳) 近年、大量のデータが生成され、コンピュータの電力は増え続けている。このシナリオは、人工ニューラルネットワークへの関心の復活につながった。効果的なニューラルネットワークモデルのトレーニングにおける大きな課題のひとつは、使用するハイパーパラメータの適切な組み合わせを見つけることだ。実際、ハイパーパラメータ空間を探索するための適切なアプローチの選択は、結果のニューラルネットワークモデルの精度に直接影響する。ハイパーパラメータ最適化の一般的なアプローチは、グリッド探索、ランダム探索、ベイズ最適化である。また、CMA-ESのような人口ベースの方法もある。本稿では,ハイパーパラメータ最適化のための新しい集団ベースアプローチであるHBRKGAを提案する。 HBRKGAは、Biased Random Key Genetic AlgorithmとRandom Walk技術を組み合わせて、ハイパーパラメータ空間を効率的に探索するハイブリッドアプローチである。提案手法の有効性を評価するため、8つの異なるデータセットに関するいくつかの計算実験を行った。その結果、HBRKGAは8つのデータセットのうち6つで(予測品質の観点から)ベースラインメソッドよりも優れたハイパーパラメータ構成を見出すことができた。

In recent years, large amounts of data have been generated, and computer power has kept growing. This scenario has led to a resurgence in the interest in artificial neural networks. One of the main challenges in training effective neural network models is finding the right combination of hyperparameters to be used. Indeed, the choice of an adequate approach to search the hyperparameter space directly influences the accuracy of the resulting neural network model. Common approaches for hyperparameter optimization are Grid Search, Random Search, and Bayesian Optimization. There are also population-based methods such as CMA-ES. In this paper, we present HBRKGA, a new population-based approach for hyperparameter optimization. HBRKGA is a hybrid approach that combines the Biased Random Key Genetic Algorithm with a Random Walk technique to search the hyperparameter space efficiently. Several computational experiments on eight different datasets were performed to assess the effectiveness of the proposed approach. Results showed that HBRKGA could find hyperparameter configurations that outperformed (in terms of predictive quality) the baseline methods in six out of eight datasets while showing a reasonable execution time.

翻訳日:2022-09-22 12:08:21 公開日:2020-11-27

# シリコンナノパターンデジタルメタマテリアルによる超コンパクト集積フォトニクスを実現する機械学習

Machine Learning enables Ultra-Compact Integrated Photonics through Silicon-Nanopattern Digital Metamaterials ( http://arxiv.org/abs/2011.11754v2 )

ライセンス: Link先を確認

Sourangsu Banerji, Apratim Majumder, Alex Hamrick, Rajesh Menon, and Berardi Sensale-Rodriguez

(参考訳) 本研究では,有限差分時間領域(FDTD)モデリングと組み合わせた機械学習アルゴリズムを用いて設計した3つの超コンパクト集積フォトニクスデバイスを実演する。デザインドメインを"バイナリピクセル"にデジタイズすることで、これらのデジタルメタマテリアルも容易に製造できる。様々なデバイス(ビームスプリッターと導波路の屈曲)を提示することにより、我々のアプローチの一般性を示す。エリアのフットプリントが${\lambda_0}^2$より小さいので、私たちのデザインは報告されている中では最小です。我々の手法は、機械学習とデジタルメタマテリアルを組み合わせることで、超コンパクトで製造可能なデバイスを可能にし、新しい「フォトニクスムーアの法則」を推進できる。

In this work, we demonstrate three ultra-compact integrated-photonics devices, which are designed via a machine-learning algorithm coupled with finite-difference time-domain (FDTD) modeling. Through digitizing the design domain into "binary pixels" these digital metamaterials are readily manufacturable as well. By showing a variety of devices (beamsplitters and waveguide bends), we showcase the generality of our approach. With an area footprint smaller than ${\lambda_0}^2$, our designs are amongst the smallest reported to-date. Our method combines machine learning with digital metamaterials to enable ultra-compact, manufacturable devices, which could power a new "Photonics Moore's Law."

翻訳日:2022-09-22 03:30:07 公開日:2020-11-27

# 顔動作単位検出のための差分注意マップを用いた計算効率の高い深層ニューラルネットワーク

Computational efficient deep neural network with difference attention maps for facial action unit detection ( http://arxiv.org/abs/2011.12082v2 )

ライセンス: Link先を確認

Jing Chen, Chenhui Wang, Kejun Wang, Meichen Liu

(参考訳) 本稿では、差分画像に基づく計算効率のよいエンドツーエンドトレーニング深層ニューラルネットワーク(cednn)モデルと空間注意マップを提案する。まず、画像処理により差分画像を生成する。次に、空間的注意マップとして使用される異なるしきい値を用いて、差分画像の5つのバイナリ画像を得る。モデルの複雑さを減らすためにグループ畳み込みを使用します。 skip connectionと$\text{1}\times \text{1}$ convolutionは、ネットワークモデルが深くない場合でも優れたパフォーマンスを保証するために使用される。入力として、各ブロックの入力に空間注意マップを選択的に供給することができる。フィーチャーマップは、ターゲットタスクにもっと関係のある部分にフォーカスする傾向があります。さらに、異なる数のAUを訓練するために分類器のパラメータを調整する必要がある。計算量を増やすことなく、さまざまなデータセットに容易に拡張できる。多くの実験結果から、提案したCEDNNは明らかにdisFA+およびCK+データセットの従来のディープラーニング手法よりも優れていることが示されている。空間的注意マップを付加すると、最も先進的なAU検出法よりも優れた結果が得られる。同時に、ネットワークの規模が小さく、走行速度が速く、実験機器の要件も低い。

In this paper, we propose a computational efficient end-to-end training deep neural network (CEDNN) model and spatial attention maps based on difference images. Firstly, the difference image is generated by image processing. Then five binary images of difference images are obtained using different thresholds, which are used as spatial attention maps. We use group convolution to reduce model complexity. Skip connection and $\text{1}\times \text{1}$ convolution are used to ensure good performance even if the network model is not deep. As an input, spatial attention map can be selectively fed into the input of each block. The feature maps tend to focus on the parts that are related to the target task better. In addition, we only need to adjust the parameters of classifier to train different numbers of AU. It can be easily extended to varying datasets without increasing too much computation. A large number of experimental results show that the proposed CEDNN is obviously better than the traditional deep learning method on DISFA+ and CK+ datasets. After adding spatial attention maps, the result is better than the most advanced AU detection method. At the same time, the scale of the network is small, the running speed is fast, and the requirement for experimental equipment is low.

翻訳日:2022-09-21 13:38:20 公開日:2020-11-27

# 物理に着想を得た圧縮センシングのためのスパース構造学習

Learning sparse structures for physics-inspired compressed sensing ( http://arxiv.org/abs/2011.12831v2 )

ライセンス: Link先を確認

Cl\'ement Dorffer, Thomas Paviet-Salomon, Gilles Le Chenadec and Ang\'elique Dr\'emeau

(参考訳) 水中音響学では、浅い水環境は低周波源を考えるときにモード分散導波路として機能する。この文脈では、伝播信号は少数のモーダル成分の和として記述され、それぞれが自身の波動数に従って伝播する。これらの波数の推定は、伝播環境や放出源を理解する上で重要な関心事である。この問題を解決するために、我々は最近ベイズ的アプローチを提案している。広帯域ソースを扱う場合、波数を一方の周波数からもう一方の周波数にリンクする特定の依存性を統合することで、このモデルをさらに改善することができる。そこで本稿では,汎用構造スパルサリティ情報モデルとして活用される制限付きボルツマンマシンを用いた新しい手法を提案する。このモデルはディープベイズネットワークから派生したもので、よく知られた、証明されたアルゴリズムを用いて、物理的に現実的なシミュレーションデータで効率的に学習することができる。

In underwater acoustics, shallow water environments act as modal dispersive waveguides when considering low-frequency sources. In this context, propagating signals can be described as a sum of few modal components, each of them propagating according to its own wavenumber. Estimating these wavenumbers is of key interest to understand the propagating environment as well as the emitting source. To solve this problem, we proposed recently a Bayesian approach exploiting a sparsity-inforcing prior. When dealing with broadband sources, this model can be further improved by integrating the particular dependence linking the wavenumbers from one frequency to the other. In this contribution, we propose to resort to a new approach relying on a restricted Boltzmann machine, exploited as a generic structured sparsity-inforcing model. This model, derived from deep Bayesian networks, can indeed be efficiently learned on physically realistic simulated data using well-known and proven algorithms.

翻訳日:2022-09-21 04:03:50 公開日:2020-11-27

# 複数の文脈依存タスクの自律学習

Autonomous learning of multiple, context-dependent tasks ( http://arxiv.org/abs/2011.13847v1 )

ライセンス: Link先を確認

Vieri Giuliano Santucci and Davide Montella and Bruno Castro da Silva and Gianluca Baldassarre

(参考訳) 強化学習システムで複数のタスクを自律的に学習する問題に直面している場合、研究者は通常、タスクごとにひとつのパラメトリドポリシーだけで解決できるソリューションに焦点を当てる。しかし、異なるコンテキストを示す複雑な環境では、同じタスクは解決すべき異なるスキルセットを必要とするかもしれない。これらの状況は2つの課題をもたらします (a)異なる方針を必要とする異なる文脈を認識すること b) 新しい発見されたコンテキストにおいて、同じタスクを達成するためのポリシーをすばやく学習する。この2つの課題は、エージェントが与えられた環境で達成される可能性のある目標を自律的に発見し、それを達成するためのモータースキルを学ぶ、オープンエンドの学習フレームワークに直面する場合、さらに困難である。本稿では,2つの課題を統合的に解決するオープンエンド学習ロボットアーキテクチャC-GRAILを提案する。特に、アーキテクチャは、与えられた目標に対する期待性能の低下に基づいて、新しい関連するコンテストを検出し、無関係なコンペを無視することができる。さらに、アーキテクチャは、既に取得したポリシーから知識をインポートする転送学習を利用して、新しいコンテキストのポリシーをすばやく学習することができる。このアーキテクチャは、いくつかの異なる障害物を発生させる複数の障害物の存在下で、自律的に対象物に到達することを学習するロボットを含むシミュレーションロボット環境でテストされる。提案したアーキテクチャは、提案した自律的文脈発見および伝達学習機構を使用しない他のモデルよりも優れている。

When facing the problem of autonomously learning multiple tasks with reinforcement learning systems, researchers typically focus on solutions where just one parametrised policy per task is sufficient to solve them. However, in complex environments presenting different contexts, the same task might need a set of different skills to be solved. These situations pose two challenges: (a) to recognise the different contexts that need different policies; (b) quickly learn the policies to accomplish the same tasks in the new discovered contexts. These two challenges are even harder if faced within an open-ended learning framework where an agent has to autonomously discover the goals that it might accomplish in a given environment, and also to learn the motor skills to accomplish them. We propose a novel open-ended learning robot architecture, C-GRAIL, that solves the two challenges in an integrated fashion. In particular, the architecture is able to detect new relevant contests, and ignore irrelevant ones, on the basis of the decrease of the expected performance for a given goal. Moreover, the architecture can quickly learn the policies for the new contexts by exploiting transfer learning importing knowledge from already acquired policies. The architecture is tested in a simulated robotic environment involving a robot that autonomously learns to reach relevant target objects in the presence of multiple obstacles generating several different obstacles. The proposed architecture outperforms other models not using the proposed autonomous context-discovery and transfer-learning mechanisms.

翻訳日:2022-09-20 02:57:09 公開日:2020-11-27

# 人中心データセット作成のための倫理的ハイライト

An Ethical Highlighter for People-Centric Dataset Creation ( http://arxiv.org/abs/2011.13583v1 )

ライセンス: Link先を確認

Margot Hanley, Apoorv Khandelwal, Hadar Averbuch-Elor, Noah Snavely and Helen Nissenbaum

(参考訳) 人々のコンピュータビジョンデータセットから生じる重要な倫理的な懸念が注目されており、結果として多くのデータセットが削除されている。人中心データセットの学術的ニーズを満たすため、既存のデータセットの倫理的評価をガイドし、ミスステップを避けるために将来のデータセット作成者を支援するための分析フレームワークを提案する。我々の研究は、先行研究のレビューと分析によって知らされ、そのような倫理的課題が生じる場所を強調します。

Important ethical concerns arising from computer vision datasets of people have been receiving significant attention, and a number of datasets have been withdrawn as a result. To meet the academic need for people-centric datasets, we propose an analytical framework to guide ethical evaluation of existing datasets and to serve future dataset creators in avoiding missteps. Our work is informed by a review and analysis of prior works and highlights where such ethical challenges arise.

翻訳日:2022-09-20 02:56:09 公開日:2020-11-27

# 深さ2 reluネットワークのタイトな硬さ評価

Tight Hardness Results for Training Depth-2 ReLU Networks ( http://arxiv.org/abs/2011.13550v1 )

ライセンス: Link先を確認

Surbhi Goel, Adam Klivans, Pasin Manurangsi, Daniel Reichman

(参考訳) ReLU活性化関数を用いた深度2ニューラルネットのトレーニングにおいて,これらのネットワークは単にReLUの重み付き和(負の係数を含むかもしれない)であることを示す。我々の目標は、与えられたトレーニングセットに対する平方損失を最小限に抑えるディープ2ニューラルネットワークを出力することです。この問題は1つのReLUを持つネットワークに対して既にNPハードであることが証明されている。また、2乗誤差を最小化($k>1$)する重み付き和$k$ReLUを、実現可能な設定(ラベルが未知の深さ-2 ReLUネットワークと整合である場合)で出力するNP硬度も証明する。また、所望の加算誤差$\epsilon$という観点で、実行時の下限を得ることができる。下界を得るには、Gap Exponential Time hypothesis (Gap-ETH) を用いるとともに、既知のDensest $\kappa$-Subgraph 問題を半周期時間で近似する難しさに関する新たな仮説を用いる(これらの仮説は異なる下界の証明に別々に使用される)。例えば、妥当な硬さ仮定の下では、最適なReLUを見つけるための適切な学習アルゴリズムは、1/\epsilon^2$で指数関数的に実行しなければならない。 ReLUを不適切に学習する以前の研究(Goel et al., COLT'17)とともに、これはReLUを学習するための適切なアルゴリズムと不適切なアルゴリズムを最初に分離することを意味する。また,有界重みを持つrelusの深さ2ネットワークを適切に学習することで,実現可能かつ不可知な環境での学習に必要な実行時間(worst-case)の上限を新たに与える問題についても検討した。ランニングタイム上の上限は、基本的に$\epsilon$への依存性の観点から下限と一致する。

We prove several hardness results for training depth-2 neural networks with the ReLU activation function; these networks are simply weighted sums (that may include negative coefficients) of ReLUs. Our goal is to output a depth-2 neural network that minimizes the square loss with respect to a given training set. We prove that this problem is NP-hard already for a network with a single ReLU. We also prove NP-hardness for outputting a weighted sum of $k$ ReLUs minimizing the squared error (for $k>1$) even in the realizable setting (i.e., when the labels are consistent with an unknown depth-2 ReLU network). We are also able to obtain lower bounds on the running time in terms of the desired additive error $\epsilon$. To obtain our lower bounds, we use the Gap Exponential Time Hypothesis (Gap-ETH) as well as a new hypothesis regarding the hardness of approximating the well known Densest $\kappa$-Subgraph problem in subexponential time (these hypotheses are used separately in proving different lower bounds). For example, we prove that under reasonable hardness assumptions, any proper learning algorithm for finding the best fitting ReLU must run in time exponential in $1/\epsilon^2$. Together with a previous work regarding improperly learning a ReLU (Goel et al., COLT'17), this implies the first separation between proper and improper algorithms for learning a ReLU. We also study the problem of properly learning a depth-2 network of ReLUs with bounded weights giving new (worst-case) upper bounds on the running time needed to learn such networks both in the realizable and agnostic settings. Our upper bounds on the running time essentially matches our lower bounds in terms of the dependency on $\epsilon$.

翻訳日:2022-09-20 02:56:01 公開日:2020-11-27

# swyftによるシミュレーション効率のよい後方後方推定: 貴重な時間を無駄にしない

Simulation-efficient marginal posterior estimation with swyft: stop wasting your precious time ( http://arxiv.org/abs/2011.13951v1 )

ライセンス: Link先を確認

Benjamin Kurt Miller, Alex Cole, Gilles Louppe, Christoph Weniger

(参考訳) アルゴリズムを紹介します (a)ネスト型神経電位-証拠比推定、及び b) パラメータの非均一なポアソン点プロセスキャッシュとそれに対応するシミュレーションによるシミュレーションの再利用。これらのアルゴリズムが組み合わさって、縁および関節後部の自動的および極めてシミュレーターな推定を可能にする。アルゴリズムは物理学や天文学の幅広い問題に適用でき、通常、従来の確率に基づくサンプリング法よりもはるかに優れたシミュレータ効率を提供する。提案手法は確率自由推論の一例であり, トラクタブルな確率関数を提供しないシミュレータにも適用可能である。シミュレータの実行は決して拒否されず、将来の分析で自動的に再利用できる。機能的なプロトタイプ実装として、オープンソースのソフトウェアパッケージswyftを提供しています。

We present algorithms (a) for nested neural likelihood-to-evidence ratio estimation, and (b) for simulation reuse via an inhomogeneous Poisson point process cache of parameters and corresponding simulations. Together, these algorithms enable automatic and extremely simulator efficient estimation of marginal and joint posteriors. The algorithms are applicable to a wide range of physics and astronomy problems and typically offer an order of magnitude better simulator efficiency than traditional likelihood-based sampling methods. Our approach is an example of likelihood-free inference, thus it is also applicable to simulators which do not offer a tractable likelihood function. Simulator runs are never rejected and can be automatically reused in future analysis. As functional prototype implementation we provide the open-source software package swyft.

翻訳日:2022-09-20 02:54:56 公開日:2020-11-27

# 学際モデル構築のための方法論--モデルから調査へ-

A methodology for co-constructing an interdisciplinary model: from model to survey, from survey to model ( http://arxiv.org/abs/2011.13604v1 )

ライセンス: Link先を確認

Elise Beck, Julie Dugdale, Carole Adam, Christelle Ga\"idatzis, Julius Ba\~ngate

(参考訳) コンピュータ科学と社会科学はどのように協力して共通のモデルを構築するべきか? モデリングに本当に役立つデータを集めるには、どうすればよいのか? ターゲットモデルに合わせた調査をどのように設計すればよいのか? 本稿では,多分野研究プロジェクトの枠組みにおけるこれらの重要な疑問に答えることを目的とする。本研究は,地震発生直後の人間の行動のモデル化に応用され,複数の分野が関与する場合のモデル構築の課題について述べる。本研究の主な貢献は,多分野対話専用のツールの提案である。また、関連する異なる分野によって実行される豊かな知的過程の反射的分析を提案する。最後に、人類学者との協働から、多分野のプロセスの補完的な視点が与えられる。

How should computer science and social science collaborate to build a common model? How should they proceed to gather data that is really useful to the modelling? How can they design a survey that is tailored to the target model? This paper aims to answer those crucial questions in the framework of a multidisciplinary research project. This research addresses the issue of co-constructing a model when several disciplines are involved, and is applied to modelling human behaviour immediately after an earthquake. The main contribution of the work is to propose a tool dedicated to multidisciplinary dialogue. It also proposes a reflexive analysis of the enriching intellectual process carried out by the different disciplines involved. Finally, from working with an anthropologist, a complementary view of the multidisciplinary process is given.

翻訳日:2022-09-20 02:49:53 公開日:2020-11-27

# 関数と非パラメトリック関数に対する後部分布の収束率とベイズ推定器の等価性

Equivalence of Convergence Rates of Posterior Distributions and Bayes Estimators for Functions and Nonparametric Functionals ( http://arxiv.org/abs/2011.13967v1 )

ライセンス: Link先を確認

Zejian Liu and Meng Li

(参考訳) 非パラメトリック回帰におけるガウス過程に先行するベイズ法の後方収縮率と微分作用素に対するプラグイン特性について検討した。核の一般クラスに対しては、回帰関数とその微分の後方測度の収束率を定め、それらはいずれも、あるクラスにおける関数の対数係数まで最適である。本計算により,回帰関数とその導関数の速度最適推定はハイパーパラメータの選択と同一であり,ベイズ法が導関数の次数に著しく適応し,実数値関数を関数関数へ拡張する一般化プラグイン特性を享受できることを示した。これにより, 有限サンプル性能をシミュレーションにより評価した回帰関数とその導関数を, 実質的に簡易に推定できる。この証明は,任意の条件下でベイズ推定器の収束率に対して,後方分布(つまり後方収縮率)の収束率と逆の収束率とが一致することを示す。この同値性はガウス過程の一般クラスを持ち、回帰関数とその微分汎函数を$L_2$と$L_{\infty}$ノルムの下でカバーする。ベイズ系と非ベイズ系でこれら2つの基本的な大きなサンプル特性を結合するのに加えて、そのような同値性は非パラメトリック点推定器の収束率を計算することによって、新しいルーチンで後部収縮率を確立することができる。我々の議論の中核は、カーネルリッジ回帰と等価カーネル技術のための演算子理論フレームワークである。我々は、非パラメトリック点推定器の収束率と、独立な興味を持つかもしれない同値理論を確立する上で重要な急激な非漸近境界を導出する。

We study the posterior contraction rates of a Bayesian method with Gaussian process priors in nonparametric regression and its plug-in property for differential operators. For a general class of kernels, we establish convergence rates of the posterior measure of the regression function and its derivatives, which are both minimax optimal up to a logarithmic factor for functions in certain classes. Our calculation shows that the rate-optimal estimation of the regression function and its derivatives share the same choice of hyperparameter, indicating that the Bayes procedure remarkably adapts to the order of derivatives and enjoys a generalized plug-in property that extends real-valued functionals to function-valued functionals. This leads to a practically simple method for estimating the regression function and its derivatives, whose finite sample performance is assessed using simulations. Our proof shows that, under certain conditions, to any convergence rate of Bayes estimators there corresponds the same convergence rate of the posterior distributions (i.e., posterior contraction rate), and vice versa. This equivalence holds for a general class of Gaussian processes and covers the regression function and its derivative functionals, under both the $L_2$ and $L_{\infty}$ norms. In addition to connecting these two fundamental large sample properties in Bayesian and non-Bayesian regimes, such equivalence enables a new routine to establish posterior contraction rates by calculating convergence rates of nonparametric point estimators. At the core of our argument is an operator-theoretic framework for kernel ridge regression and equivalent kernel techniques. We derive a range of sharp non-asymptotic bounds that are pivotal in establishing convergence rates of nonparametric point estimators and the equivalence theory, which may be of independent interest.

翻訳日:2022-09-20 02:49:44 公開日:2020-11-27

# 深層強化学習による時変グラフの効率的な情報拡散

Efficient Information Diffusion in Time-Varying Graphs through Deep Reinforcement Learning ( http://arxiv.org/abs/2011.13518v1 )

ライセンス: Link先を確認

Matheus R. F. Mendon\c{c}a, Andr\'e M. S. Barreto, Artur Ziviani

(参考訳) 時間変化グラフによる効率的な情報拡散のためのネットワークシード(TVG)は多くの実世界のアプリケーションにおいて難しい課題である。この時空間影響の最大化問題をモデル化する方法はいくつかあるが、最終的な目標は、ノードが拡散過程を開始する最良の瞬間を決定することである。本稿では,各ノードの時間的挙動と接続パターンを学習し,TVGを介して拡散を開始するための最良の瞬間を予測できる,強化学習とグラフ埋め込みを併用したモデルであるSpatio-Temporal Influence Maximization~(STIM)を提案する。また,TVGの確率拡散過程をシミュレートする学習用人工TVGも開発し,STIMネットワークは非決定論的環境においても効率的なポリシーを学習可能であることを示した。 STIMは現実世界のTVGで評価され、ノードを通して情報を効率的に伝達する。最後に、STIMモデルが$O(|E|)$の時間複雑性を持つことを示す。そこでSTIMは,TVGにおける効率的な情報拡散のための新しい手法を提案する。

Network seeding for efficient information diffusion over time-varying graphs~(TVGs) is a challenging task with many real-world applications. There are several ways to model this spatio-temporal influence maximization problem, but the ultimate goal is to determine the best moment for a node to start the diffusion process. In this context, we propose Spatio-Temporal Influence Maximization~(STIM), a model trained with Reinforcement Learning and Graph Embedding over a set of artificial TVGs that is capable of learning the temporal behavior and connectivity pattern of each node, allowing it to predict the best moment to start a diffusion through the TVG. We also develop a special set of artificial TVGs used for training that simulate a stochastic diffusion process in TVGs, showing that the STIM network can learn an efficient policy even over a non-deterministic environment. STIM is also evaluated with a real-world TVG, where it also manages to efficiently propagate information through the nodes. Finally, we also show that the STIM model has a time complexity of $O(|E|)$. STIM, therefore, presents a novel approach for efficient information diffusion in TVGs, being highly versatile, where one can change the goal of the model by simply changing the adopted reward function.

翻訳日:2022-09-20 02:49:12 公開日:2020-11-27

# net2:プレプレースメントネット長推定用にカスタマイズしたグラフアテンションネットワーク手法

Net2: A Graph Attention Network Method Customized for Pre-Placement Net Length Estimation ( http://arxiv.org/abs/2011.13522v1 )

ライセンス: Link先を確認

Zhiyao Xie, Rongjian Liang, Xiaoqing Xu, Jiang Hu, Yixiao Duan, Yiran Chen

(参考訳) net lengthは、標準のデジタルデザインフローの様々な段階にわたってタイミングとパワーを最適化するための重要なプロキシメトリックである。しかし、ネット長情報の大多数はセル配置まで利用できないため、論理合成のような配置前の設計段階でネット長の最適化を明示的に検討することは重要な課題である。この研究は、セル配置前の個々のネット長を推定するために、Net2と呼ばれるカスタマイズを伴うグラフ注意ネットワーク手法を提案することで、この問題に対処する。精度指向バージョンであるNet2aは、長いネットと長いクリティカルパスの両方を識別する以前のいくつかの研究よりも約15%精度が向上している。高速バージョンであるNet2fは、配置よりも1000倍以上高速だが、さまざまな精度のメトリクスで、これまでの作業や他のニューラルネットワーク技術よりも優れている。

Net length is a key proxy metric for optimizing timing and power across various stages of a standard digital design flow. However, the bulk of net length information is not available until cell placement, and hence it is a significant challenge to explicitly consider net length optimization in design stages prior to placement, such as logic synthesis. This work addresses this challenge by proposing a graph attention network method with customization, called Net2, to estimate individual net length before cell placement. Its accuracy-oriented version Net2a achieves about 15% better accuracy than several previous works in identifying both long nets and long critical paths. Its fast version Net2f is more than 1000 times faster than placement while still outperforms previous works and other neural network techniques in terms of various accuracy metrics.

翻訳日:2022-09-20 02:48:53 公開日:2020-11-27

# 回転等価階層型ニューラルネットワークを用いたタンパク質モデル品質評価

Protein model quality assessment using rotation-equivariant, hierarchical neural networks ( http://arxiv.org/abs/2011.13557v1 )

ライセンス: Link先を確認

Stephan Eismann, Patricia Suriana, Bowen Jing, Raphael J.L. Townshend, Ron O. Dror

(参考訳) タンパク質は三次元(3d)構造に依存するミニチュアマシンである。この構造を計算的に決定することは未解決の大きな課題である。主なボトルネックは、モデル品質評価の課題である、候補の大きなプールの中で最も正確な構造モデルを選択することである。本稿では,タンパク質モデルの品質を評価するための新しい深層学習手法を提案する。我々のネットワークは、異なるレベルの構造解像度で原子構造と回転同変の畳み込みをポイントベースで表現する。これらの組み合わせにより、ネットワークはタンパク質構造全体からエンドツーエンドを学べる。近年のCASP(盲目予測コミュニティ実験)におけるタンパク質モデルの評価結果について報告する。特に注目すべきは、我々の手法は物理に着想を得たエネルギー用語を使用しず、複数のタンパク質の配列アライメントのような追加情報(個々のタンパク質モデルの原子構造以外の)を利用できないことである。

Proteins are miniature machines whose function depends on their three-dimensional (3D) structure. Determining this structure computationally remains an unsolved grand challenge. A major bottleneck involves selecting the most accurate structural model among a large pool of candidates, a task addressed in model quality assessment. Here, we present a novel deep learning approach to assess the quality of a protein model. Our network builds on a point-based representation of the atomic structure and rotation-equivariant convolutions at different levels of structural resolution. These combined aspects allow the network to learn end-to-end from entire protein structures. Our method achieves state-of-the-art results in scoring protein models submitted to recent rounds of CASP, a blind prediction community experiment. Particularly striking is that our method does not use physics-inspired energy terms and does not rely on the availability of additional information (beyond the atomic structure of the individual protein model), such as sequence alignments of multiple proteins.

翻訳日:2022-09-20 02:48:41 公開日:2020-11-27

# 新しい近似に基づく固有値補正自然勾配

Eigenvalue-corrected Natural Gradient Based on a New Approximation ( http://arxiv.org/abs/2011.13609v1 )

ライセンス: Link先を確認

Kai-Xin Gao, Xiao-Lei Liu, Zheng-Hai Huang, Min Wang, Shuangling Wang, Zidong Wang, Dachuan Xu, Fan Yu

(参考訳) ディープニューラルネットワーク(DNN)のトレーニングに2次最適化手法を用いると、多くの研究者が惹きつけている。最近提案されたEigenvalue-corrected Kronecker Factorization (EKFAC) (George et al., 2018) は、自然勾配の更新を対角法として解釈し、Kronecker-factored eigenbasisにおける不正確な再スケーリング係数を補正する。 Gao et al. (2020) は自然勾配に対する新たな近似を考察し、フィッシャー情報行列 (FIM) を2つの行列のクロネッカー積によって乗算された定数に近似し、近似の前と後のトレースを等しく保つ。本研究では,これら2つの手法の考え方を組み合わせて,Trace-restricted Eigenvalue-corrected Kronecker Factorization (TEKFAC)を提案する。提案手法は, kronecker-factored eigenbasis における不正確な再スケーリング係数を補正するだけでなく, gao et al. (2020) で提案した新しい近似法と有効減衰法を考察する。また、クロネッカー分解近似の差と関係についても論じる。実験により,本手法は複数のDNNにおいて,Adam,EKFAC,TKFAC等の運動量でSGDより優れていた。

Using second-order optimization methods for training deep neural networks (DNNs) has attracted many researchers. A recently proposed method, Eigenvalue-corrected Kronecker Factorization (EKFAC) (George et al., 2018), proposes an interpretation of viewing natural gradient update as a diagonal method, and corrects the inaccurate re-scaling factor in the Kronecker-factored eigenbasis. Gao et al. (2020) considers a new approximation to the natural gradient, which approximates the Fisher information matrix (FIM) to a constant multiplied by the Kronecker product of two matrices and keeps the trace equal before and after the approximation. In this work, we combine the ideas of these two methods and propose Trace-restricted Eigenvalue-corrected Kronecker Factorization (TEKFAC). The proposed method not only corrects the inexact re-scaling factor under the Kronecker-factored eigenbasis, but also considers the new approximation method and the effective damping technique proposed in Gao et al. (2020). We also discuss the differences and relationships among the Kronecker-factored approximations. Empirically, our method outperforms SGD with momentum, Adam, EKFAC and TKFAC on several DNNs.

翻訳日:2022-09-20 02:48:28 公開日:2020-11-27

# 強化学習に基づく無人航空機の協調経路とエネルギー最適化

Reinforcement Learning-based Joint Path and Energy Optimization of Cellular-Connected Unmanned Aerial Vehicles ( http://arxiv.org/abs/2011.13744v1 )

ライセンス: Link先を確認

Arash Hooshmand

(参考訳) 無人航空機(UAV)は最近かなりの研究関心を集めている。特にモノのインターネットの世界では、インターネット接続のUAVが大きな需要の1つだ。さらに、エネルギー制約、すなわちバッテリー制限は、その用途を制限することができるuavのボトルネックである。我々はエネルギー問題に対処し解決しようと試みる。そこで, 電力ステーション (PS) を装備した特定の位置で充電することで, UAVがバッテリ範囲よりもはるかに広い範囲で経路を計画できる, セル接続型UAVの経路計画法を提案する。例えば、エア・トゥ・エア(A2A)とエア・トゥ・グラウンド(A2G)の干渉や、UAVの軌道最適化に余分な制約を課す必要のない接続性のためである。飛行禁止区域は避けるべき非実用領域を決定する。バッテリーの充電を考慮し、長いミッションでUAVの問題を解決するため、我々は典型的な短距離経路プランナーを階層的に拡張するために強化学習(RL)を用いてきた。この問題は、広範囲を飛行するUAVに対してシミュレートされ、Qラーニングアルゴリズムにより、UAVが最適な経路と充電ポリシーを見つけることができる。

Unmanned Aerial Vehicles (UAVs) have attracted considerable research interest recently. Especially when it comes to the realm of Internet of Things, the UAVs with Internet connectivity are one of the main demands. Furthermore, the energy constraint i.e. battery limit is a bottle-neck of the UAVs that can limit their applications. We try to address and solve the energy problem. Therefore, a path planning method for a cellular-connected UAV is proposed that will enable the UAV to plan its path in an area much larger than its battery range by getting recharged in certain positions equipped with power stations (PSs). In addition to the energy constraint, there are also no-fly zones; for example, due to Air to Air (A2A) and Air to Ground (A2G) interference or for lack of necessary connectivity that impose extra constraints in the trajectory optimization of the UAV. No-fly zones determine the infeasible areas that should be avoided. We have used a reinforcement learning (RL) hierarchically to extend typical short-range path planners to consider battery recharge and solve the problem of UAVs in long missions. The problem is simulated for the UAV that flies over a large area, and Q-learning algorithm could enable the UAV to find the optimal path and recharge policy.

翻訳日:2022-09-20 02:48:04 公開日:2020-11-27

# CASTELO: Clustered Atom Subtypes aidEd Lead Optimization -- 機械学習と分子モデリングを組み合わせた手法

CASTELO: Clustered Atom Subtypes aidEd Lead Optimization -- a combined machine learning and molecular modeling method ( http://arxiv.org/abs/2011.13788v1 )

ライセンス: Link先を確認

Leili Zhang, Giacomo Domeniconi, Chih-Chieh Yang, Seung-gu Kang, Ruhong Zhou, Guojing Cong

(参考訳) 薬物の発見は、前臨床研究と臨床試験の2つの大きなステップからなる多段階のプロセスである。その段階の中で、リード最適化は前臨床予算の半分以上を簡単に消費する。本稿では,リード最適化ワークフローを自動化した機械学習と分子モデリングを組み合わせたアプローチを提案する。初期データ収集は物理に基づく分子動力学(MD)シミュレーションによって達成される。シミュレーションから抽出した予備特徴として接触行列を算出する。シミュレーションからの時間情報を活用するために,時間的ダイナミズム表現を用いた接触行列データを強化し,教師なし畳み込み型変分オートエンコーダ(cvae)を用いてモデル化した。最後に,従来のクラスタリング法とCVAEに基づくクラスタリング法を比較し,分子下構造をランク付けし,リード最適化の可能性を提案する。構造-活性関係データベースは必要とせず,薬剤の有効性を改善するための薬剤修飾ホットスポットに対する新たなヒントを提供する。我々のワークフローは、従来の労働集約的なプロセスと比較して、数ヶ月から数日のリード最適化のターンアラウンド時間を短縮できる可能性があり、医療研究者にとって貴重なツールになる可能性がある。

Drug discovery is a multi-stage process that comprises two costly major steps: pre-clinical research and clinical trials. Among its stages, lead optimization easily consumes more than half of the pre-clinical budget. We propose a combined machine learning and molecular modeling approach that automates lead optimization workflow \textit{in silico}. The initial data collection is achieved with physics-based molecular dynamics (MD) simulation. Contact matrices are calculated as the preliminary features extracted from the simulations. To take advantage of the temporal information from the simulations, we enhanced contact matrices data with temporal dynamism representation, which are then modeled with unsupervised convolutional variational autoencoder (CVAE). Finally, conventional clustering method and CVAE-based clustering method are compared with metrics to rank the submolecular structures and propose potential candidates for lead optimization. With no need for extensive structure-activity relationship database, our method provides new hints for drug modification hotspots which can be used to improve drug efficacy. Our workflow can potentially reduce the lead optimization turnaround time from months/years to days compared with the conventional labor-intensive process and thus can potentially become a valuable tool for medical researchers.

翻訳日:2022-09-20 02:47:19 公開日:2020-11-27

# 強化学習のためのベンチマークフレームワークの調査

A survey of benchmarking frameworks for reinforcement learning ( http://arxiv.org/abs/2011.13577v1 )

ライセンス: Link先を確認

Belinda Stapelberg and Katherine M. Malan

(参考訳) 強化学習は最近、機械学習コミュニティで注目を集めている。新しい手法が絶えず開発され、強化学習問題を解決する多くのアプローチがある。強化学習を用いた問題解決には,克服すべき課題がいろいろある。この分野の進歩を保証するため、ベンチマークは新しいアルゴリズムのテストや他のアプローチとの比較に重要である。したがって、公正な比較のための結果の再現性は、改善を正確に判断する上で不可欠である。本稿では、強化学習ベンチマークへの様々な貢献の概要と、強化学習が直面する課題に研究者がどう対処できるかについて論じる。論文の中では最もよく使われ、近年でも研究が進められている。本稿では,ベンチマークを用いた実装,タスク,アルゴリズム実装の面での貢献について述べる。この調査は、利用可能な幅広い強化学習ベンチマークタスクに注意を向け、標準化された方法で研究を奨励することを目的としている。さらに、この調査は、新しい強化学習アルゴリズムの開発とテストに使用できる様々なタスクに慣れていない研究者の概観として機能する。

Reinforcement learning has recently experienced increased prominence in the machine learning community. There are many approaches to solving reinforcement learning problems with new techniques developed constantly. When solving problems using reinforcement learning, there are various difficult challenges to overcome. To ensure progress in the field, benchmarks are important for testing new algorithms and comparing with other approaches. The reproducibility of results for fair comparison is therefore vital in ensuring that improvements are accurately judged. This paper provides an overview of different contributions to reinforcement learning benchmarking and discusses how they can assist researchers to address the challenges facing reinforcement learning. The contributions discussed are the most used and recent in the literature. The paper discusses the contributions in terms of implementation, tasks and provided algorithm implementations with benchmarks. The survey aims to bring attention to the wide range of reinforcement learning benchmarking tasks available and to encourage research to take place in a standardised manner. Additionally, this survey acts as an overview for researchers not familiar with the different tasks that can be used to develop and test new reinforcement learning algorithms.

翻訳日:2022-09-20 02:41:23 公開日:2020-11-27

# 信頼率クリッピングを用いた層別適応率法の改良

Improving Layer-wise Adaptive Rate Methods using Trust Ratio Clipping ( http://arxiv.org/abs/2011.13584v1 )

ライセンス: Link先を確認

Jeffrey Fong, Siwei Chen, Kaiqi Chen

(参考訳) 大きなバッチでニューラルネットワークをトレーニングすることは、ディープラーニングにとって基本的な重要性である。大規模なバッチトレーニングは、トレーニング時間を大幅に削減するが、精度を維持するのに困難である。最近の研究は、信頼率を用いた適応層別最適化を通じてこの問題に取り組むためにlarsやlambといった最適化手法を推し進めている。一般的な手法ではあるが、これらの手法は依然として不安定で極端な信頼率に悩まされており、性能が低下している。本稿では,その大きさを安定させ,極端な値を防止するため,信頼率クリッピングを用いたラムの新規変種であるlambcを提案する。 imagenetやcifar-10などの画像分類タスクについて実験を行い,各バッチサイズで有望な改善が得られた。

Training neural networks with large batch is of fundamental significance to deep learning. Large batch training remarkably reduces the amount of training time but has difficulties in maintaining accuracy. Recent works have put forward optimization methods such as LARS and LAMB to tackle this issue through adaptive layer-wise optimization using trust ratios. Though prevailing, such methods are observed to still suffer from unstable and extreme trust ratios which degrades performance. In this paper, we propose a new variant of LAMB, called LAMBC, which employs trust ratio clipping to stabilize its magnitude and prevent extreme values. We conducted experiments on image classification tasks such as ImageNet and CIFAR-10 and our empirical results demonstrate promising improvements across different batch sizes.

翻訳日:2022-09-20 02:41:08 公開日:2020-11-27

# 活性化拡散に基づくニューラルダイナミックモデルと認知操作のためのマイクロエクスラレーション

A Neural Dynamic Model based on Activation Diffusion and a Micro-Explanation for Cognitive Operations ( http://arxiv.org/abs/2012.00104v1 )

ライセンス: Link先を確認

Hui Wei

(参考訳) 記憶の神経機構は、人工知能における表現の問題と非常に密接に関係している。本稿では,脳内のニューロンのネットワークをシミュレートする計算モデルを提案し,その処理方法について述べる。このモデルは神経情報処理の形態学的および電気生理学的特性を指し、ニューロンが発射シーケンスを符号化しているという仮定に基づいている。ネットワーク構造, 異なる段階における神経エンコーディング機能, 記憶における刺激の表現, 記憶を形成するアルゴリズムなどが提示された。また、学習の安定性と記憶能力のリコール率も分析した。神経のダイナミックなプロセスが後継として、情報が表現され、処理されるニューロンレベルかつコヒーレントな形式を実現するため、推論、問題解決、パターン認識、自然言語処理、学習など、人工知能のさまざまな分野の検証が容易になる。知的行動において起こる認知的操作の過程は一貫した表現を持ち、計算神経科学の観点からモデル化される。したがって、ニューロンのダイナミクスは、マイクロレベルで認知アーキテクチャの統一モデルによって、異なる知的行動の内部メカニズムを説明することができる。

The neural mechanism of memory has a very close relation with the problem of representation in artificial intelligence. In this paper a computational model was proposed to simulate the network of neurons in brain and how they process information. The model refers to morphological and electrophysiological characteristics of neural information processing, and is based on the assumption that neurons encode their firing sequence. The network structure, functions for neural encoding at different stages, the representation of stimuli in memory, and an algorithm to form a memory were presented. It also analyzed the stability and recall rate for learning and the capacity of memory. Because neural dynamic processes, one succeeding another, achieve a neuron-level and coherent form by which information is represented and processed, it may facilitate examination of various branches of Artificial Intelligence, such as inference, problem solving, pattern recognition, natural language processing and learning. The processes of cognitive manipulation occurring in intelligent behavior have a consistent representation while all being modeled from the perspective of computational neuroscience. Thus, the dynamics of neurons make it possible to explain the inner mechanisms of different intelligent behaviors by a unified model of cognitive architecture at a micro-level.

翻訳日:2022-09-20 02:39:45 公開日:2020-11-27

# 動的時間カメラとライダーの融合に基づく非協力環境におけるロバストなuavの自律着陸

Robust Autonomous Landing of UAV in Non-Cooperative Environments based on Dynamic Time Camera-LiDAR Fusion ( http://arxiv.org/abs/2011.13761v1 )

ライセンス: Link先を確認

Lyujie Chen, Xiaming Yuan, Yao Xiao, Yiding Zhang and Jihong Zhu

(参考訳) 非協力的な環境で安全な着陸場所を選択することは、UAVの完全な自律化に向けた重要なステップである。しかし、既存の手法は一般化能力の貧弱さと頑健さという共通の問題がある。未知の環境でのパフォーマンスは著しく低下し、エラーを自己検出して修正することはできない。本論文では,低コストLiDARと双眼カメラを備えたUAVシステムを構築し,平地と安全地を検知して非協調環境における自律着陸を実現する。我々は,LiDARの非繰り返し走査と高いFOVカバレッジ特性を利用して,動的時間深度補完アルゴリズムを考案した。提案した深度マップの自己評価手法と合わせて,推定フェーズにおけるLiDAR蓄積時間を動的に選択し,正確な予測結果が得られた。深さマップに基づいて、斜面、粗さ、安全領域の大きさなどの高レベルな地形情報を導出する。我々は,様々な未知の環境において,広範囲にわたる自律着陸実験を実施し,モデルが精度と速度を適応的にバランスさせ,uavが安全な着陸地点をロバストに選択できることを確認した。

Selecting safe landing sites in non-cooperative environments is a key step towards the full autonomy of UAVs. However, the existing methods have the common problems of poor generalization ability and robustness. Their performance in unknown environments is significantly degraded and the error cannot be self-detected and corrected. In this paper, we construct a UAV system equipped with low-cost LiDAR and binocular cameras to realize autonomous landing in non-cooperative environments by detecting the flat and safe ground area. Taking advantage of the non-repetitive scanning and high FOV coverage characteristics of LiDAR, we come up with a dynamic time depth completion algorithm. In conjunction with the proposed self-evaluation method of the depth map, our model can dynamically select the LiDAR accumulation time at the inference phase to ensure an accurate prediction result. Based on the depth map, the high-level terrain information such as slope, roughness, and the size of the safe area are derived. We have conducted extensive autonomous landing experiments in a variety of familiar or completely unknown environments, verifying that our model can adaptively balance the accuracy and speed, and the UAV can robustly select a safe landing site.

翻訳日:2022-09-20 02:38:37 公開日:2020-11-27

# 視覚的ローカライゼーションのための効率的なシーン圧縮

Efficient Scene Compression for Visual-based Localization ( http://arxiv.org/abs/2011.13894v1 )

ライセンス: Link先を確認

Marcela Mera-Trujillo, Benjamin Smith, Victor Fragoso

(参考訳) 3D再構成やシーン表現に関してカメラのポーズを推定することは、多くの複合現実とロボティクスアプリケーションにとって重要なステップである。現在利用可能な膨大なデータを考えると、多くのアプリケーションは効率的に動作するストレージや帯域幅を制限している。これらの制約を満たすため、多くのアプリケーションは3Dポイントの数を減らしてシーン表現を圧縮する。最先端の手法はk$-coverベースのアルゴリズムを使ってシーンを圧縮するが、それらは遅くてチューニングが難しい。速度の向上とパラメータチューニングの容易化を目的として,制約付き二次プログラム(qp)を用いてシーン表現を圧縮する新しい手法を提案する。このQPは1クラスのサポートベクトルマシンに似ているため、逐次最小最適化の変種を導出して解決する。提案手法では,支援ベクトルに対応する点を,シーンを表す点のサブセットとして用いる。また,本手法を高速に収束させる効率的な初期化手法を提案する。公開データセットを用いた実験により,提案手法はシーン表現を高速に圧縮し,正確なポーズ推定を行うことを示す。

Estimating the pose of a camera with respect to a 3D reconstruction or scene representation is a crucial step for many mixed reality and robotics applications. Given the vast amount of available data nowadays, many applications constrain storage and/or bandwidth to work efficiently. To satisfy these constraints, many applications compress a scene representation by reducing its number of 3D points. While state-of-the-art methods use $K$-cover-based algorithms to compress a scene, they are slow and hard to tune. To enhance speed and facilitate parameter tuning, this work introduces a novel approach that compresses a scene representation by means of a constrained quadratic program (QP). Because this QP resembles a one-class support vector machine, we derive a variant of the sequential minimal optimization to solve it. Our approach uses the points corresponding to the support vectors as the subset of points to represent a scene. We also present an efficient initialization method that allows our method to converge quickly. Our experiments on publicly available datasets show that our approach compresses a scene representation quickly while delivering accurate pose estimates.

翻訳日:2022-09-20 02:32:39 公開日:2020-11-27

# D-NeRF:ダイナミックシーンのためのニューラルラジアンス場

D-NeRF: Neural Radiance Fields for Dynamic Scenes ( http://arxiv.org/abs/2011.13961v1 )

ライセンス: Link先を確認

Albert Pumarola, Enric Corona, Gerard Pons-Moll, Francesc Moreno-Noguer

(参考訳) 機械学習と幾何学的推論を組み合わせたニューラルレンダリング技術は、スパース画像からシーンの新たなビューを合成するための最も有望なアプローチの1つである。このうちニューラル放射場(NeRF)は、深層ネットワークを訓練して5次元入力座標(空間的位置と視野方向を表す)を体積密度とビュー依存放射輝度にマッピングするものである。しかし、生成した画像に対して前例のないレベルの光リアリズムを実現するにもかかわらず、NeRFは静止シーンのみに適用でき、同じ空間位置を異なる画像から検索することができる。本稿では,神経放射野をダイナミックドメインに拡張する手法であるd-nerfについて紹介する。この手法により,シーンの周囲を移動する \emph{single} カメラから,剛体および非剛体運動下での新たな物体像の再構成とレンダリングが可能となる。この目的のために、時間はシステムへの追加入力として考慮し、学習プロセスを、シーンを標準空間にエンコードする段階と、この標準表現を特定の時間で変形シーンにマッピングする段階の2つの主要な段階に分割する。両方のマッピングは、完全に接続されたネットワークを使って同時に学習される。ネットワークがトレーニングされると、D-NeRFは新しい画像をレンダリングし、カメラビューと時間変数の両方を制御し、オブジェクトの動きを制御する。我々は,剛体・調音・非剛体動作下での物体のシーンに対するアプローチの有効性を実証した。コード、モデルウェイト、動的シーンデータセットがリリースされる。

Neural rendering techniques combining machine learning with geometric reasoning have arisen as one of the most promising approaches for synthesizing novel views of a scene from a sparse set of images. Among these, stands out the Neural radiance fields (NeRF), which trains a deep network to map 5D input coordinates (representing spatial location and viewing direction) into a volume density and view-dependent emitted radiance. However, despite achieving an unprecedented level of photorealism on the generated images, NeRF is only applicable to static scenes, where the same spatial location can be queried from different images. In this paper we introduce D-NeRF, a method that extends neural radiance fields to a dynamic domain, allowing to reconstruct and render novel images of objects under rigid and non-rigid motions from a \emph{single} camera moving around the scene. For this purpose we consider time as an additional input to the system, and split the learning process in two main stages: one that encodes the scene into a canonical space and another that maps this canonical representation into the deformed scene at a particular time. Both mappings are simultaneously learned using fully-connected networks. Once the networks are trained, D-NeRF can render novel images, controlling both the camera view and the time variable, and thus, the object movement. We demonstrate the effectiveness of our approach on scenes with objects under rigid, articulated and non-rigid motions. Code, model weights and the dynamic scenes dataset will be released.

翻訳日:2022-09-20 02:31:59 公開日:2020-11-27

# テキストセグメンテーションを再考する:新しいデータセットとテキスト特異的リファインメントアプローチ

Rethinking Text Segmentation: A Novel Dataset and A Text-Specific Refinement Approach ( http://arxiv.org/abs/2011.14021v1 )

ライセンス: Link先を確認

Xingqian Xu, Zhifei Zhang, Zhaowen Wang, Brian Price, Zhonghao Wang, Humphrey Shi

(参考訳) テキストセグメンテーションは、テキストスタイル転送やシーンテキストの削除など、現実世界の多くのテキスト関連タスクにおいて必須条件である。しかし、高品質なデータセットや専用の調査が欠如しているため、この重要な前提条件は多くの研究において前提として残されており、現在の研究でほとんど見落とされてきた。このギャップを埋めるため、私たちはtextsegという、単語と文字を境界とする多角形、マスク、文字転写の6種類のアノテーションを備えた、大規模な詳細な注釈付きテキストデータセットを提案しました。また,従来のセグメンテーションモデルに負担を課すような,非凸境界や多様なテクスチャなど,テキストのユニークな特性に適応する新たなテキストセグメンテーション手法であるtextfine network(texrnet)についても紹介する。 texrnetでは、重要な機能プールや注意に基づく類似性チェックなど、このような課題に対処するために、テキスト固有のネットワーク設計を提案します。また,テキストセグメンテーションの大幅な改善を示すtrimapとdiscriminatorの損失についても紹介する。 TextSegデータセットと既存のデータセットの両方で大規模な実験が行われます。 texrnetは、他の最先端セグメンテーション手法と比較して、テキストセグメンテーション性能を2%近く向上させる。データセットとコードはhttps://github.com/SHI-Labs/Rethinking-Text-Segmentationで公開されます。

Text segmentation is a prerequisite in many real-world text-related tasks, e.g., text style transfer, and scene text removal. However, facing the lack of high-quality datasets and dedicated investigations, this critical prerequisite has been left as an assumption in many works, and has been largely overlooked by current research. To bridge this gap, we proposed TextSeg, a large-scale fine-annotated text dataset with six types of annotations: word- and character-wise bounding polygons, masks and transcriptions. We also introduce Text Refinement Network (TexRNet), a novel text segmentation approach that adapts to the unique properties of text, e.g. non-convex boundary, diverse texture, etc., which often impose burdens on traditional segmentation models. In our TexRNet, we propose text specific network designs to address such challenges, including key features pooling and attention-based similarity checking. We also introduce trimap and discriminator losses that show significant improvement on text segmentation. Extensive experiments are carried out on both our TextSeg dataset and other existing datasets. We demonstrate that TexRNet consistently improves text segmentation performance by nearly 2% compared to other state-of-the-art segmentation methods. Our dataset and code will be made available at https://github.com/SHI-Labs/Rethinking-Text-Segmentation.

翻訳日:2022-09-20 02:31:23 公開日:2020-11-27

# ニュースメディアにおけるジェンダーベースの暴力の影響を学習する機械学習

Machine Learning to study the impact of gender-based violence in the news media ( http://arxiv.org/abs/2012.07490v1 )

ライセンス: Link先を確認

Hugo J. Bello, Nora Palomar, Elisa Gallego, Lourdes Jim\'enez Navascu\'es and Celia Lozano

(参考訳) 未だにタブー的な話題だが、性別に基づく暴力(GBV)は被害者の健康、尊厳、安全、自治を損なう。この種の暴力を発生または維持するために多くの要因が研究されているが、メディアの影響はまだ不明である。ここでは、このニュースの効果をGBVで説明するために機械学習ツールを使用します。ニューラルネットワークにニュースを供給することにより、各記事に関連するトピック情報を復元することができる。以上の結果から,GBVニュースと公衆の意識,メディア性GBV症例の影響,GBVニュースの本質的なテーマ的関係が示唆された。使用中のニューラルモデルを簡単に調整できるので、他のメディアソースやトピックにもアプローチを拡張できます。

While it remains a taboo topic, gender-based violence (GBV) undermines the health, dignity, security and autonomy of its victims. Many factors have been studied to generate or maintain this kind of violence, however, the influence of the media is still uncertain. Here, we use Machine Learning tools to extrapolate the effect of the news in GBV. By feeding neural networks with news, the topic information associated with each article can be recovered. Our findings show a relationship between GBV news and public awareness, the effect of mediatic GBV cases, and the intrinsic thematic relationship of GBV news. Because the used neural model can be easily adjusted, this also allows us to extend our approach to other media sources or topics

翻訳日:2022-09-20 02:30:47 公開日:2020-11-27

# 深層学習における不確実性の再考:ロバスト性の改善について

Rethinking Uncertainty in Deep Learning: Whether and How it Improves Robustness ( http://arxiv.org/abs/2011.13538v1 )

ライセンス: Link先を確認

Yilun Jin, Lixin Fan, Kam Woh Ng, Ce Ju, Qiang Yang

(参考訳) ディープ・ニューラル・ネットワーク(DNN)は、多くの治療法が提案される敵の攻撃に苦しむことが知られている。敵対的訓練(adversarial training, at)は最も強固な防御とされるが、クリーンな例と、より大きな摂動による攻撃のような他の種類の攻撃の両方において、パフォーマンスの低下に苦しむ。一方、エントロピー最大化(EntM)やラベル平滑化(LS)といった不確実な出力を奨励する正規化器は、クリーンな例の精度を維持し、弱い攻撃下での性能を向上させることができるが、強力な攻撃に対して防御する能力は疑わしい。本稿では,entmやlsを含む不確実性促進規則化剤を,敵対学習の分野で再検討する。 EntM と LS だけで小さな摂動下でのみ堅牢性が得られることを示す。反対に,不確実性促進調整器は原則的に補完し,クリーンな例と様々な攻撃,特に大きな摂動を伴う攻撃の両方において,一貫して性能を向上させる。さらに、不確実性促進正則化器がジャコビアン行列$\nabla_X f(X;\theta)$の観点からATの性能を高め、EntMが事実上ヤコビアン行列のノルムを縮小し、ロバスト性を促進することを明らかにする。

Deep neural networks (DNNs) are known to be prone to adversarial attacks, for which many remedies are proposed. While adversarial training (AT) is regarded as the most robust defense, it suffers from poor performance both on clean examples and under other types of attacks, e.g. attacks with larger perturbations. Meanwhile, regularizers that encourage uncertain outputs, such as entropy maximization (EntM) and label smoothing (LS) can maintain accuracy on clean examples and improve performance under weak attacks, yet their ability to defend against strong attacks is still in doubt. In this paper, we revisit uncertainty promotion regularizers, including EntM and LS, in the field of adversarial learning. We show that EntM and LS alone provide robustness only under small perturbations. Contrarily, we show that uncertainty promotion regularizers complement AT in a principled manner, consistently improving performance on both clean examples and under various attacks, especially attacks with large perturbations. We further analyze how uncertainty promotion regularizers enhance the performance of AT from the perspective of Jacobian matrices $\nabla_X f(X;\theta)$, and find out that EntM effectively shrinks the norm of Jacobian matrices and hence promotes robustness.

翻訳日:2022-09-20 02:30:13 公開日:2020-11-27

# 相互関係推論による自己教師付き時系列表現学習

Self-Supervised Time Series Representation Learning by Inter-Intra Relational Reasoning ( http://arxiv.org/abs/2011.13548v1 )

ライセンス: Link先を確認

Haoyi Fan, Fengbin Zhang, Yue Gao

(参考訳) 自己教師付き学習は、ラベルのないデータから有用な表現を抽出することで、多くの領域において優れたパフォーマンスを達成する。しかし,従来の自己教師あり手法の多くはサンプル間構造の探索に主眼を置いているが,時系列データにとって重要な時間内構造への取り組みは少ない。本稿では,自己監督型時系列表現学習フレームワークであるSelfTimeについて,時系列のサンプル間関係と時間内関係を探索し,ラベルなし時系列の基盤となる構造特徴を学習する。具体的には,まず所定のアンカー試料の正および負のサンプルをサンプリングし,このアンカーから時間片をサンプリングすることで時間内関係を生成する。そして、サンプル関係に基づいて、2つの別々の関係推論ヘッドを組み合わせた共有特徴抽出バックボーンを用いて、サンプル対のサンプル間関係推論の関係を定量化し、時間内関係推論のためのタイムピースペアの関係を定量化する。最後に、関係推論ヘッドの監督の下で、時系列の有用な表現をバックボーンから抽出する。時系列分類タスクのための実世界の時系列データセットの実験結果から,提案手法の有効性が示された。コードとデータはhttps://haoyfan.github.io/で公開されている。

Self-supervised learning achieves superior performance in many domains by extracting useful representations from the unlabeled data. However, most of traditional self-supervised methods mainly focus on exploring the inter-sample structure while less efforts have been concentrated on the underlying intra-temporal structure, which is important for time series data. In this paper, we present SelfTime: a general self-supervised time series representation learning framework, by exploring the inter-sample relation and intra-temporal relation of time series to learn the underlying structure feature on the unlabeled time series. Specifically, we first generate the inter-sample relation by sampling positive and negative samples of a given anchor sample, and intra-temporal relation by sampling time pieces from this anchor. Then, based on the sampled relation, a shared feature extraction backbone combined with two separate relation reasoning heads are employed to quantify the relationships of the sample pairs for inter-sample relation reasoning, and the relationships of the time piece pairs for intra-temporal relation reasoning, respectively. Finally, the useful representations of time series are extracted from the backbone under the supervision of relation reasoning heads. Experimental results on multiple real-world time series datasets for time series classification task demonstrate the effectiveness of the proposed method. Code and data are publicly available at https://haoyfan.github.io/.

翻訳日:2022-09-20 02:29:43 公開日:2020-11-27

# MEBOW: 野生における体配向の単分子的推定

MEBOW: Monocular Estimation of Body Orientation In the Wild ( http://arxiv.org/abs/2011.13688v1 )

ライセンス: Link先を確認

Chenyan Wu, Yukun Chen, Jiajia Luo, Che-Chun Su, Anuja Dawane, Bikramjot Hanzra, Zhuo Deng, Bilan Liu, James Wang, Cheng-Hao Kuo

(参考訳) 身体の向きの推定は、ロボット工学や自律運転を含む多くのアプリケーションにおいて重要な視覚的手がかりを提供する。特に3次元ポーズ推定が画像分解能の低下、咬合、身体部位の識別が困難である場合には望ましい。そこで本研究では,広視野画像からの方向推定のための大規模データセットであるCOCO-MEBOW(Monocular Estimation of Body Orientation in the Wild)を提案する。 COCOデータセットからの55K画像内の約130K人の身体の向き付けラベルは、効率的で高精度なアノテーションパイプラインを使用して収集されている。また、データセットのメリットも検証しました。まず,本データセットは人体方向推定モデルの性能と頑健性を大幅に向上させることができることを示す。さらに,3次元ポーズラベル,2次元ポーズラベル,我々の身体指向ラベルを共同訓練に用いる3次元ポーズ推定のための新しい3次元ソースソリューションを提案する。本モデルは,3次元ポーズラベルと2次元ポーズラベルのみを用いた単眼3次元ポーズ推定において,最先端のデュアルソースソリューションよりも優れる。これは、3次元のポーズ推定においてmebowの重要な利点であり、特に3次元のポーズではボディオリエンテーションに対する個人ごとのラベリングコストがはるかに低いため魅力的である。この研究は、人間の行動を理解することに関わる現実的な課題に対処する上で、MEBOWの高い可能性を示している。この研究の詳細はhttps://chenyanwu.github.io/MEBOW/.comで確認できる。

Body orientation estimation provides crucial visual cues in many applications, including robotics and autonomous driving. It is particularly desirable when 3-D pose estimation is difficult to infer due to poor image resolution, occlusion or indistinguishable body parts. We present COCO-MEBOW (Monocular Estimation of Body Orientation in the Wild), a new large-scale dataset for orientation estimation from a single in-the-wild image. The body-orientation labels for around 130K human bodies within 55K images from the COCO dataset have been collected using an efficient and high-precision annotation pipeline. We also validated the benefits of the dataset. First, we show that our dataset can substantially improve the performance and the robustness of a human body orientation estimation model, the development of which was previously limited by the scale and diversity of the available training data. Additionally, we present a novel triple-source solution for 3-D human pose estimation, where 3-D pose labels, 2-D pose labels, and our body-orientation labels are all used in joint training. Our model significantly outperforms state-of-the-art dual-source solutions for monocular 3-D human pose estimation, where training only uses 3-D pose labels and 2-D pose labels. This substantiates an important advantage of MEBOW for 3-D human pose estimation, which is particularly appealing because the per-instance labeling cost for body orientations is far less than that for 3-D poses. The work demonstrates high potential of MEBOW in addressing real-world challenges involving understanding human behaviors. Further information of this work is available at https://chenyanwu.github.io/MEBOW/.

翻訳日:2022-09-20 02:23:53 公開日:2020-11-27

# 高分解能乳房イメージングのための軽量U-Net

Lightweight U-Net for High-Resolution Breast Imaging ( http://arxiv.org/abs/2011.13698v1 )

ライセンス: Link先を確認

Mickael Tardy, Diana Mateus

(参考訳) 乳癌検診における悪性度検出における完全畳み込みニューラルネットワークの検討を行った。我々は,ネットワークの精度と計算複雑性との間の許容範囲の妥協を求める教師付きセグメンテーションタスクに取り組んでいる。

We study the fully convolutional neural networks in the context of malignancy detection for breast cancer screening. We work on a supervised segmentation task looking for an acceptable compromise between the precision of the network and the computational complexity.

翻訳日:2022-09-20 02:23:02 公開日:2020-11-27

# 3D Invisible Cloak

3D Invisible Cloak ( http://arxiv.org/abs/2011.13705v1 )

ライセンス: Link先を確認

Mingfu Xue, Can He, Zhiyu Wu, Jian Wang, Zhe Liu, Weiqiang Liu

(参考訳) 本稿では,実世界の人検知器に対する新たな物理的ステルス攻撃を提案する。提案手法では, 敵のパッチを生成し, 実際の衣服に印刷することにより, 3次元の目立たないマントを作製する。クロークを身に着けている人は、人検知器の検知を回避し、ステルスを達成できる。 3次元の物理的制約(ラジアン、シワ、オクルージョン、アングルなど)が人的ステルス攻撃に与える影響を考察し、3次元の見えないクロークを生成するための3次元変換を提案する。我々は、現実の服に敵のパッチを印刷することで、難易度と複雑な3D物理シナリオの下で3D空間でステルス攻撃を行う。従来の3次元変換は、最適化プロセス中にパッチ上で実行される。さらに, 最適3次元目視クロークの生成方法について検討した。具体的には、特定の形状や色の入力画像を選択して最適な3d目に見えないクロークを生成する方法を検討する。また、物体検出器を他の物体と誤認させることに成功し、完全に姿を消す方法、すなわち物体として検出されない方法も検討する。最後に,デジタルドメインと物理世界での提案する攻撃の性能を体系的に評価するための体系的評価フレームワークを提案する。様々な屋内・屋外の物理的シナリオにおける実験結果から,提案手法は複雑で困難な物理的条件下であっても頑健で有効であることが明らかとなった。デジタルドメイン(inriaデータセット)のアタック成功率は86.56%であり、物理的世界における静的および動的ステルスアタックパフォーマンスは、それぞれ100%と77%であり、既存の作業よりもはるかに優れている。

In this paper, we propose a novel physical stealth attack against the person detectors in real world. The proposed method generates an adversarial patch, and prints it on real clothes to make a three dimensional (3D) invisible cloak. Anyone wearing the cloak can evade the detection of person detectors and achieve stealth. We consider the impacts of those 3D physical constraints (i.e., radian, wrinkle, occlusion, angle, etc.) on person stealth attacks, and propose 3D transformations to generate 3D invisible cloak. We launch the person stealth attacks in 3D physical space instead of 2D plane by printing the adversarial patches on real clothes under challenging and complex 3D physical scenarios. The conventional and 3D transformations are performed on the patch during its optimization process. Further, we study how to generate the optimal 3D invisible cloak. Specifically, we explore how to choose input images with specific shapes and colors to generate the optimal 3D invisible cloak. Besides, after successfully making the object detector misjudge the person as other objects, we explore how to make a person completely disappeared, i.e., the person will not be detected as any objects. Finally, we present a systematic evaluation framework to methodically evaluate the performance of the proposed attack in digital domain and physical world. Experimental results in various indoor and outdoor physical scenarios show that, the proposed person stealth attack method is robust and effective even under those complex and challenging physical conditions, such as the cloak is wrinkled, obscured, curved, and from different angles. The attack success rate in digital domain (Inria data set) is 86.56%, while the static and dynamic stealth attack performance in physical world is 100% and 77%, respectively, which are significantly better than existing works.

翻訳日:2022-09-20 02:22:58 公開日:2020-11-27

# 非教師者再識別のための非対称分岐による教師学生ネットワークの多様性向上

Enhancing Diversity in Teacher-Student Networks via Asymmetric branches for Unsupervised Person Re-identification ( http://arxiv.org/abs/2011.13776v1 )

ライセンス: Link先を確認

Hao Chen, Benoit Lagadec, Francois Bremond

(参考訳) unsupervised person re-identification (re-id)の目的は、労働集約的なアイデンティティアノテーションなしで差別的特徴を学ぶことである。 state-of-the-art unsupervised re-idメソッドは、ターゲットドメイン内の未ラベル画像に擬似ラベルを割り当て、ノイズの多い擬似ラベルから学習する。最近導入された平均教師モデルはラベルノイズを緩和する有望な方法である。しかし、訓練期間中、自己学習型教師学生ネットワークはすぐにコンセンサスに収束し、局所的な最小限に繋がる。ニューラルネットワーク内で非対称構造を用いてこの問題に対処する可能性を探る。まず, 特徴を異なる方法で抽出するために非対称分岐が提案され, 出現特徴の多様性が向上した。そこで,提案したクロスブランチ・インスペクションにより,一方の分枝が他方の分枝から監督を受け,異なる知識を伝達し,教師と学生のネットワーク間の重みの多様性を高める。拡張実験により,提案手法は,教師なし領域適応と教師なしRe-IDタスクの両方において,従来よりも大幅に性能が向上することが示された。

The objective of unsupervised person re-identification (Re-ID) is to learn discriminative features without labor-intensive identity annotations. State-of-the-art unsupervised Re-ID methods assign pseudo labels to unlabeled images in the target domain and learn from these noisy pseudo labels. Recently introduced Mean Teacher Model is a promising way to mitigate the label noise. However, during the training, self-ensembled teacher-student networks quickly converge to a consensus which leads to a local minimum. We explore the possibility of using an asymmetric structure inside neural network to address this problem. First, asymmetric branches are proposed to extract features in different manners, which enhances the feature diversity in appearance signatures. Then, our proposed cross-branch supervision allows one branch to get supervision from the other branch, which transfers distinct knowledge and enhances the weight diversity between teacher and student networks. Extensive experiments show that our proposed method can significantly surpass the performance of previous work on both unsupervised domain adaptation and fully unsupervised Re-ID tasks.

翻訳日:2022-09-20 02:22:29 公開日:2020-11-27

# 点雲の3次元意味セグメンテーションのための距離特徴密度を持つ球面補間畳み込みネットワーク

Spherical Interpolated Convolutional Network with Distance-Feature Density for 3D Semantic Segmentation of Point Clouds ( http://arxiv.org/abs/2011.13784v1 )

ライセンス: Link先を確認

Guangming Wang, Yehui Yang, Huixin Zhang, Zhe Liu, and Hesheng Wang

(参考訳) 点雲の意味的セグメンテーションは、ロボットにとって環境認識の重要な部分である。しかし,点雲の非構造性から,従来の3次元畳み込みカーネルを直接採用して生の3次元点雲から特徴を抽出することは困難である。本稿では,従来の格子状3次元畳み込み演算子に代わる球面補間畳み込み演算子を提案する。新たに提案する特徴抽出演算子により,ネットワークの精度が向上し,ネットワークのパラメータが低減される。さらに,距離を補間重みとして,点雲補間法の欠陥を分析し,距離と特徴相関を組み合わせることにより,自己学習した距離特徴密度を提案する。提案手法は球状補間畳み込みネットワークの特徴抽出をより合理的かつ効果的に行う。提案ネットワークの有効性をポイントクラウドの3次元意味セグメンテーションタスクで実証した。実験の結果,提案手法はScanNetデータセットとParis-Lille-3Dデータセットで良好な性能を示すことがわかった。

The semantic segmentation of point clouds is an important part of the environment perception for robots. However, it is difficult to directly adopt the traditional 3D convolution kernel to extract features from raw 3D point clouds because of the unstructured property of point clouds. In this paper, a spherical interpolated convolution operator is proposed to replace the traditional grid-shaped 3D convolution operator. This newly proposed feature extraction operator improves the accuracy of the network and reduces the parameters of the network. In addition, this paper analyzes the defect of point cloud interpolation methods based on the distance as the interpolation weight and proposes the self-learned distance-feature density by combining the distance and the feature correlation. The proposed method makes the feature extraction of spherical interpolated convolution network more rational and effective. The effectiveness of the proposed network is demonstrated on the 3D semantic segmentation task of point clouds. Experiments show that the proposed method achieves good performance on the ScanNet dataset and Paris-Lille-3D dataset.

翻訳日:2022-09-20 02:22:08 公開日:2020-11-27

# 四点制約を用いた一般化ポス・アンド・スケール推定

Generalized Pose-and-Scale Estimation using 4-Point Congruence Constraints ( http://arxiv.org/abs/2011.13817v1 )

ライセンス: Link先を確認

Victor Fragoso, Sudipta Sinha

(参考訳) 一般化カメラの絶対的なポーズを4つの3次元点-線対から未知の内部スケールで計算する新しい方法gP4Pcを提案する。多くのポーズ・アンド・スケール法とは異なり、gP4Pcは未知の類似性変換に関連する4点の2つの集合によって定義される形状の合同から生じる制約に基づいている。問題に対する新しいパラメトリゼーションを選択することにより、4つのスカラー変数の2次方程式の系を導出する。変数は、カメラセンターからの光線に沿って3dポイントの距離を表す。このシステムをGroebnerベースベースの自動多項式解法で解いた後、効率的な3Dポイントアライメント法を用いて類似性変換を計算する。また,計算的に非常に効率的で,既存の解法よりも約3倍高速であるコプラナー点の場合には,解法の特殊変種も提案する。実データと合成データを用いた実験により, RANSACフレームワーク内で使用した場合, gP4Pcは, 競合する数値安定性, 精度, 騒音に対する頑健性を実現しつつ, 総走行時間において最速の手法であることが示された。

We present gP4Pc, a new method for computing the absolute pose of a generalized camera with unknown internal scale from four corresponding 3D point-and-ray pairs. Unlike most pose-and-scale methods, gP4Pc is based on constraints arising from the congruence of shapes defined by two sets of four points related by an unknown similarity transformation. By choosing a novel parametrization for the problem, we derive a system of four quadratic equations in four scalar variables. The variables represent the distances of 3D points along the rays from the camera centers. After solving this system via Groebner basis-based automatic polynomial solvers, we compute the similarity transformation using an efficient 3D point-point alignment method. We also propose a specialized variant of our solver for the case of coplanar points, which is computationally very efficient and about 3x faster than the fastest existing solver. Our experiments on real and synthetic datasets, demonstrate that gP4Pc is among the fastest methods in terms of total running time when used within a RANSAC framework, while achieving competitive numerical stability, accuracy, and robustness to noise.

翻訳日:2022-09-20 02:21:22 公開日:2020-11-27

# 自律運転における3次元lidarに基づく映像物体検出のための時間チャネルトランスフォーマ

Temporal-Channel Transformer for 3D Lidar-Based Video Object Detection in Autonomous Driving ( http://arxiv.org/abs/2011.13628v1 )

ライセンス: Link先を確認

Zhenxun Yuan, Xiao Song, Lei Bai, Wengang Zhou, Zhe Wang, Wanli Ouyang

(参考訳) 業界における自動運転の強い需要は、3Dオブジェクト検出への強い関心をもたらし、多くの優れた3Dオブジェクト検出アルゴリズムを生み出した。しかし、ほとんどのアルゴリズムは単一フレームのデータのみをモデル化し、データのシーケンスの時間的情報を無視している。本研究では,lidarデータから映像物体を検出するための空間-時間領域とチャネル領域の関係をモデル化する,temporal-channel transformerと呼ばれる新しいトランスを提案する。このトランスの特別な設計として、エンコーダにエンコードされる情報は、デコーダのものと異なる、すなわち、エンコーダは、複数のフレームの時間的チャネル情報をエンコードし、デコーダは、現在のフレームの空間的チャネル情報をボクセル的にデコードする。具体的には、トランスの時間チャネルエンコーダは、異なるチャネルやフレームの特徴間の相関を利用して、異なるチャネルやフレームの情報をエンコードするように設計されている。一方、変圧器の空間デコーダは、現在のフレームの各位置の情報を復号する。検出ヘッドで物体検出を行う前に、ゲート機構を配置して現在のフレームの特徴を再検討し、アップサンプリング処理とともに対象フレームの表現を反復的に洗練することにより、対象情報を無関係にフィルタリングする。実験の結果,nuscenesベンチマークでグリッドvoxelを用いた3次元物体検出の最先端性能が得られた。

The strong demand of autonomous driving in the industry has lead to strong interest in 3D object detection and resulted in many excellent 3D object detection algorithms. However, the vast majority of algorithms only model single-frame data, ignoring the temporal information of the sequence of data. In this work, we propose a new transformer, called Temporal-Channel Transformer, to model the spatial-temporal domain and channel domain relationships for video object detecting from Lidar data. As a special design of this transformer, the information encoded in the encoder is different from that in the decoder, i.e. the encoder encodes temporal-channel information of multiple frames while the decoder decodes the spatial-channel information for the current frame in a voxel-wise manner. Specifically, the temporal-channel encoder of the transformer is designed to encode the information of different channels and frames by utilizing the correlation among features from different channels and frames. On the other hand, the spatial decoder of the transformer will decode the information for each location of the current frame. Before conducting the object detection with detection head, the gate mechanism is deployed for re-calibrating the features of current frame, which filters out the object irrelevant information by repetitively refine the representation of target frame along with the up-sampling process. Experimental results show that we achieve the state-of-the-art performance in grid voxel-based 3D object detection on the nuScenes benchmark.

翻訳日:2022-09-20 02:13:03 公開日:2020-11-27

# インスタンスワイズ3次元再構成のためのディスクリプタフリーマルチビュー領域マッチング

Descriptor-Free Multi-View Region Matching for Instance-Wise 3D Reconstruction ( http://arxiv.org/abs/2011.13649v1 )

ライセンス: Link先を確認

Takuma Doi, Fumio Okura, Toshiki Nagahara, Yasuyuki Matsushita, Yasushi Yagi

(参考訳) 本稿では,テクスチャや形状記述子マッチングに頼らずに,インスタンスセグメンテーションのマルチビュー拡張を提案する。マルチビューインスタンスのセグメンテーションは、テクスチャや形状記述子を使ったマルチビューマッチングが難しいため、繰り返しのテクスチャや形、例えば植物葉を持つシーンでは困難になる。そこで本研究では,特徴記述子に依存しないエピポーラ幾何学に基づく多視点領域マッチング手法を提案する。さらに, エピポーラ領域マッチングは, 容易にインスタンスセグメンテーションに統合でき, 3次元再構成に有効であることを示す。実験により,マルチビューインスタンスマッチングと3次元再構成の精度が,ベースライン法と比較して向上した。

This paper proposes a multi-view extension of instance segmentation without relying on texture or shape descriptor matching. Multi-view instance segmentation becomes challenging for scenes with repetitive textures and shapes, e.g., plant leaves, due to the difficulty of multi-view matching using texture or shape descriptors. To this end, we propose a multi-view region matching method based on epipolar geometry, which does not rely on any feature descriptors. We further show that the epipolar region matching can be easily integrated into instance segmentation and effective for instance-wise 3D reconstruction. Experiments demonstrate the improved accuracy of multi-view instance matching and the 3D reconstruction compared to the baseline methods.

翻訳日:2022-09-20 02:12:39 公開日:2020-11-27

# 点群における実時間物体認識とポーズ推定

Towards real-time object recognition and pose estimation in point clouds ( http://arxiv.org/abs/2011.13669v1 )

ライセンス: Link先を確認

Marlon Marcon, Olga Regina Pereira Bellon and Luciano Silva

(参考訳) 物体認識と6次元ポーズ推定はコンピュータビジョン応用において非常に難しい課題である。このようなタスクの効率性にも拘わらず、標準メソッドはリアルタイム処理速度に遠く及ばない。本稿では,オブジェクトの細かな6DoFのポーズを,リアルタイムに現実的なシナリオに適用する新しいパイプラインを提案する。私たちは提案を3つに分けた。まず、Color機能分類では、ImageNetでトレーニングされたトレーニング済みのCNNカラー機能を使用してオブジェクト検出を行う。特徴ベース登録モジュールは粗いポーズ推定を行い、最後に細調整ステップはICPベースの密登録を行う。提案手法は,rgb-dシーンのデータセット上で約83\%の精度を実現する。処理時間については、オブジェクト検出タスクをフレーム処理速度最大90FPSで行い、フル実行戦略において、ポーズ推定を約14FPSで行う。我々は,提案のモジュール性により,必要時にのみ実行可能とし,マルチタスクの状況でもリアルタイム処理をアンロックするスケジュール実行を行うことができることを議論した。

Object recognition and 6DoF pose estimation are quite challenging tasks in computer vision applications. Despite efficiency in such tasks, standard methods deliver far from real-time processing rates. This paper presents a novel pipeline to estimate a fine 6DoF pose of objects, applied to realistic scenarios in real-time. We split our proposal into three main parts. Firstly, a Color feature classification leverages the use of pre-trained CNN color features trained on the ImageNet for object detection. A Feature-based registration module conducts a coarse pose estimation, and finally, a Fine-adjustment step performs an ICP-based dense registration. Our proposal achieves, in the best case, an accuracy performance of almost 83\% on the RGB-D Scenes dataset. Regarding processing time, the object detection task is done at a frame processing rate up to 90 FPS, and the pose estimation at almost 14 FPS in a full execution strategy. We discuss that due to the proposal's modularity, we could let the full execution occurs only when necessary and perform a scheduled execution that unlocks real-time processing, even for multitask situations.

翻訳日:2022-09-20 02:12:10 公開日:2020-11-27

# Progressively Stacking 2.0: BERTトレーニングスピードアップのための多段階階層トレーニング手法

Progressively Stacking 2.0: A Multi-stage Layerwise Training Method for BERT Training Speedup ( http://arxiv.org/abs/2011.13635v1 )

ライセンス: Link先を確認

Cheng Yang, Shengnan Wang, Chao Yang, Yuechuan Li, Ru He, Jingqiao Zhang

(参考訳) BERTのような事前訓練された言語モデルは、多くの自然言語処理タスクにおいて大幅な精度向上を実現している。その有効性にもかかわらず、膨大な数のパラメータがBERTモデルのトレーニングを非常に困難にしている。本稿では,BERTのトレーニング時間を削減するため,効率的な多段階階層トレーニング(MSLT)手法を提案する。トレーニングプロセス全体をいくつかの段階に分割する。トレーニングは、少数のエンコーダ層しか持たない小さなモデルから始まり、新しいエンコーダ層を追加することで、徐々にモデルの深さを増加させます。それぞれの段階で、新たに追加されるエンコーダ層のトップ(出力層の近くに)のみをトレーニングします。以前の段階でトレーニングされた他のレイヤのパラメータは、現在の段階では更新されない。 BERTトレーニングでは、特に後方の計算時間が勾配同期のための通信時間を含む分散トレーニング環境では、後方の計算の方が前方の計算よりもはるかに時間がかかる。提案されたトレーニング戦略では、上位層のみが後方計算に参加し、ほとんどの層は前方計算にのみ参加する。これにより、計算効率と通信効率が大幅に向上する。実験の結果,本手法は性能低下を伴わずに110%以上のトレーニングスピードアップを達成できることがわかった。

Pre-trained language models, such as BERT, have achieved significant accuracy gain in many natural language processing tasks. Despite its effectiveness, the huge number of parameters makes training a BERT model computationally very challenging. In this paper, we propose an efficient multi-stage layerwise training (MSLT) approach to reduce the training time of BERT. We decompose the whole training process into several stages. The training is started from a small model with only a few encoder layers and we gradually increase the depth of the model by adding new encoder layers. At each stage, we only train the top (near the output layer) few encoder layers which are newly added. The parameters of the other layers which have been trained in the previous stages will not be updated in the current stage. In BERT training, the backward computation is much more time-consuming than the forward computation, especially in the distributed training setting in which the backward computation time further includes the communication time for gradient synchronization. In the proposed training strategy, only top few layers participate in backward computation, while most layers only participate in forward computation. Hence both the computation and communication efficiencies are greatly improved. Experimental results show that the proposed method can achieve more than 110% training speedup without significant performance degradation.

翻訳日:2022-09-20 02:05:49 公開日:2020-11-27

# ニューロモルフィックハードウェア制約緩和のためのスパイクニューラルネットワークのコンパイル

Compiling Spiking Neural Networks to Mitigate Neuromorphic Hardware Constraints ( http://arxiv.org/abs/2011.13965v1 )

ライセンス: Link先を確認

Adarsha Balaji and Anup Das

(参考訳) spiking neural networks (snns) は,<resource}-および<power}-constrained platform上で時空間パターン認識を行う効率的な計算モデルである。ニューロモルフィックハードウェア上で実行されるSNNは、これらのプラットフォームのエネルギー消費をさらに削減することができる。モデルサイズと複雑さの増大に伴い、SNNベースのアプリケーションをタイルベースのニューロモルフィックハードウェアにマッピングすることはますます困難になっている。これは神経シナプスコア(viz. a crossbar)がシナプス後ニューロンごとに一定の数のシナプス前接続しか持たないという制限に起因する。ニューロンごとに多くのニューロンとシナプス前接続を持つ複雑なsnnベースのモデルでは、(1)トレーニング後にクロスバーリソースに適合するために接続を刈り取る必要があるため、モデル品質の低下、例えば正確性、(2)ニューロンとシナプスをハードウェアの神経-シナプスコアに分割して配置する必要があるため、レイテンシとエネルギー消費の増加につながる可能性がある。本研究では,(1)複数のシナプス前接続を有するニューロン機能を,複数の均質な神経単位に分解し,クロスバーの利用を著しく改善し,全てのシナプス前接続を保持させる新しいアンロール法と,(2)エネルギー消費とスパイクレイテンシを最小化することを目的としたニューロモルフィックハードウェア上にsnsをマッピングする新しい手法であるspinsmapを提案する。

Spiking Neural Networks (SNNs) are efficient computation models to perform spatio-temporal pattern recognition on {resource}- and {power}-constrained platforms. SNNs executed on neuromorphic hardware can further reduce energy consumption of these platforms. With increasing model size and complexity, mapping SNN-based applications to tile-based neuromorphic hardware is becoming increasingly challenging. This is attributed to the limitations of neuro-synaptic cores, viz. a crossbar, to accommodate only a fixed number of pre-synaptic connections per post-synaptic neuron. For complex SNN-based models that have many neurons and pre-synaptic connections per neuron, (1) connections may need to be pruned after training to fit onto the crossbar resources, leading to a loss in model quality, e.g., accuracy, and (2) the neurons and synapses need to be partitioned and placed on the neuro-sypatic cores of the hardware, which could lead to increased latency and energy consumption. In this work, we propose (1) a novel unrolling technique that decomposes a neuron function with many pre-synaptic connections into a sequence of homogeneous neural units to significantly improve the crossbar utilization and retain all pre-synaptic connections, and (2) SpiNeMap, a novel methodology to map SNNs on neuromorphic hardware with an aim to minimize energy consumption and spike latency.

翻訳日:2022-09-20 02:05:11 公開日:2020-11-27

# 信念エントロピーに基づく区間値信頼構造の組み合わせ

Combination of interval-valued belief structures based on belief entropy ( http://arxiv.org/abs/2011.13636v1 )

ライセンス: Link先を確認

Miao Qin, Yongchuan Tang

(参考訳) 本稿では,デンプスター・シェーファー証拠理論の枠組みにおける区間値信念構造の組み合わせと正規化の問題について検討する。既存のアプローチをレビューし、徹底的に分析する。従来のアプローチの利点と欠点を述べる。不確実性尺度に基づく新しい最適性アプローチが開発され、区間値の信念構造を結合する問題は、基本確率代入の組み合わせに縮退する。提案手法の合理性を示す数値的な例を示す。

This paper investigates the issues of combination and normalization of interval-valued belief structures within the framework of Dempster-Shafer theory of evidence. Existing approaches are reviewed and thoroughly analyzed. The advantages and drawbacks of previous approach are presented. A new optimality approach based on uncertainty measure is developed, where the problem of combining interval-valued belief structures degenerates into combining basic probability assignments. Numerical examples are provided to illustrate the rationality of the proposed approach.

翻訳日:2022-09-20 02:04:16 公開日:2020-11-27

# 人間のvrデモによる操作行動の構造的・意味的モデルの自動獲得

Automated acquisition of structured, semantic models of manipulation activities from human VR demonstration ( http://arxiv.org/abs/2011.13689v1 )

ライセンス: Link先を確認

Andrei Haidu and Michael Beetz

(参考訳) 本稿では,仮想環境から,人間の動作,ロボットの理解,日常的な活動の収集とアノテートが可能なシステムを提案する。人間の動きは、人工のバーチャルリアリティーデバイスと眼球追跡機能を使ってシミュレーションされた世界にマッピングされる。仮想世界のすべての相互作用は物理的にシミュレートされ、運動とその効果は現実世界と密接に関連している。アクティビティ実行中、サブシンボリックデータロガーは、オフラインシーンの再現と再生を可能にするために、フレーム単位の環境と人間の視線を記録する。物理エンジンと組み合わせて、オンラインモニター(記号データロガー)は(様々な文法を用いて)解析し、シミュレートされた世界におけるイベント、アクション、およびそれらの影響を記録する。

In this paper we present a system capable of collecting and annotating, human performed, robot understandable, everyday activities from virtual environments. The human movements are mapped in the simulated world using off-the-shelf virtual reality devices with full body, and eye tracking capabilities. All the interactions in the virtual world are physically simulated, thus movements and their effects are closely relatable to the real world. During the activity execution, a subsymbolic data logger is recording the environment and the human gaze on a per-frame basis, enabling offline scene reproduction and replays. Coupled with the physics engine, online monitors (symbolic data loggers) are parsing (using various grammars) and recording events, actions, and their effects in the simulated world.

翻訳日:2022-09-20 02:04:08 公開日:2020-11-27

# 近似知識コンパイルのための下限

Lower Bounds for Approximate Knowledge Compilation ( http://arxiv.org/abs/2011.13721v1 )

ライセンス: Link先を確認

Alexis de Colnet and Stefan Mengel

(参考訳) 知識コンパイルは、異なる表現言語の簡潔性と効率のトレードオフを研究する。多くの言語では、表現サイズには強い下限が知られているが、最近の研究は、いくつかの言語では、近似コンパイルを用いてこれらの境界をバイパスできることを示している。その考え方は、エラーの数をコントロールすることができる知識の近似をコンパイルすることである。効率的なモデルカウントと確率的推論をサポートするため,確率的推論などの文脈に適したコンパイル言語d-dnnf(decomposable negation normal form)の回路に焦点を当てた。さらに、d-DNNF には、近似に緩和することで回避できるような、既知のサイズの低い境界が存在する。本稿では,従来研究されてきた弱い近似と,近年のアルゴリズム的な結果に用いられている強い近似という,近似の2つの概念を定式化する。次に、d-DNNFによる近似の下位境界を示し、文献の正の結果を補完する。

Knowledge compilation studies the trade-off between succinctness and efficiency of different representation languages. For many languages, there are known strong lower bounds on the representation size, but recent work shows that, for some languages, one can bypass these bounds using approximate compilation. The idea is to compile an approximation of the knowledge for which the number of errors can be controlled. We focus on circuits in deterministic decomposable negation normal form (d-DNNF), a compilation language suitable in contexts such as probabilistic reasoning, as it supports efficient model counting and probabilistic inference. Moreover, there are known size lower bounds for d-DNNF which by relaxing to approximation one might be able to avoid. In this paper we formalize two notions of approximation: weak approximation which has been studied before in the decision diagram literature and strong approximation which has been used in recent algorithmic results. We then show lower bounds for approximation by d-DNNF, complementing the positive results from the literature.

翻訳日:2022-09-20 02:03:53 公開日:2020-11-27

# 協調作業における人間の反応・行動・嗜好調査

Investigating Human Response, Behaviour, and Preference in Joint-Task Interaction ( http://arxiv.org/abs/2011.14016v1 )

ライセンス: Link先を確認

Alan Lindsay, Bart Craenen, Sara Dalzel-Job, Robin L. Hill, Ronald P. A. Petrick

(参考訳) 人間の相互作用は、非言語的手がかりを含む幅広い信号に依存する。効果的な説明可能計画(XAIP)エージェントを開発するためには,これらの通信チャネルの範囲と有用性を理解することが重要である。我々の出発点は、共同作業相互作用と認知科学研究の既存の成果である。私たちの意図は、これらのレッスンは、ユーザの感情的尺度(つまり、ユーザの感情的状態を計画モデルに明示的に組み込む)を含む、ユーザの反応に応じて振る舞いを条件付けている、計画手法の使用を含むインタラクションエージェントの設計を通知できることです。我々は計画に基づくエージェントの動作と共同作業の相互作用の交差点でいくつかの概念を特定し、これらを用いて2つのエージェントを設計した。我々はこれらのエージェントと相互作用する人間の行動と反応を調べる実験を設計した。本稿では,デザインされた研究と,検討中の重要な疑問について述べる。また,シミュレーションユーザに対する2つのエージェントの挙動を実証分析により検討した。

Human interaction relies on a wide range of signals, including non-verbal cues. In order to develop effective Explainable Planning (XAIP) agents it is important that we understand the range and utility of these communication channels. Our starting point is existing results from joint task interaction and their study in cognitive science. Our intention is that these lessons can inform the design of interaction agents -- including those using planning techniques -- whose behaviour is conditioned on the user's response, including affective measures of the user (i.e., explicitly incorporating the user's affective state within the planning model). We have identified several concepts at the intersection of plan-based agent behaviour and joint task interaction and have used these to design two agents: one reactive and the other partially predictive. We have designed an experiment in order to examine human behaviour and response as they interact with these agents. In this paper we present the designed study and the key questions that are being investigated. We also present the results from an empirical analysis where we examined the behaviour of the two agents for simulated users.

翻訳日:2022-09-20 02:03:14 公開日:2020-11-27

# 多クラス分類のための深層建築の不確実性駆動アンサンブル胸部X線画像におけるCOVID-19診断への応用

Uncertainty-driven ensembles of deep architectures for multiclass classification. Application to COVID-19 diagnosis in chest X-ray images ( http://arxiv.org/abs/2011.14894v1 )

ライセンス: Link先を確認

Juan E. Arco, A. Ortiz, J.Ramirez, F.J. Martinez-Murcia, Yu-Dong Zhang, Juan M. Gorriz

(参考訳) 呼吸器疾患は毎年何百万人もの人を殺す。これらの病理の診断は、手動で時間を要するプロセスであり、サーバー間の変動、診断と治療の遅延がある。最近の新型コロナウイルス(COVID-19)パンデミックは、肺炎の診断を自動化するためのシステム開発の必要性を示す一方で、畳み込みニューラルネットワーク(CNN)は、医療画像の自動分類に優れた選択肢であることが証明されている。しかし、この文脈で信頼度分類を提供する必要性を考えると、モデルの予測の信頼性を定量化することが重要である。本研究では,ベイズ深層学習に基づく多段階アンサンブル分類システムを提案し,各分類決定の不確かさを定量化しながら性能を最大化する。このツールは、予測の不確実性に応じて結果を重み付けすることで、異なるアーキテクチャから抽出した情報を組み合わせる。ベイズネットワークの性能は、コントロール対細菌性肺炎、ウイルス性肺炎対covid-19肺炎の4つの病因を同時に区別する実際のシナリオで評価される。 3段階決定木を用いて4級分類を3つの二分分類に分割し、98.06%の精度を与え、最近の文献で得られた結果を上回った。この高い性能を得るのに必要な前処理の削減は、予測の信頼性に関する情報に加えて、臨床医の助けとなるシステムの適用性を示すものである。

Respiratory diseases kill million of people each year. Diagnosis of these pathologies is a manual, time-consuming process that has inter and intra-observer variability, delaying diagnosis and treatment. The recent COVID-19 pandemic has demonstrated the need of developing systems to automatize the diagnosis of pneumonia, whilst Convolutional Neural Network (CNNs) have proved to be an excellent option for the automatic classification of medical images. However, given the need of providing a confidence classification in this context it is crucial to quantify the reliability of the model's predictions. In this work, we propose a multi-level ensemble classification system based on a Bayesian Deep Learning approach in order to maximize performance while quantifying the uncertainty of each classification decision. This tool combines the information extracted from different architectures by weighting their results according to the uncertainty of their predictions. Performance of the Bayesian network is evaluated in a real scenario where simultaneously differentiating between four different pathologies: control vs bacterial pneumonia vs viral pneumonia vs COVID-19 pneumonia. A three-level decision tree is employed to divide the 4-class classification into three binary classifications, yielding an accuracy of 98.06% and overcoming the results obtained by recent literature. The reduced preprocessing needed for obtaining this high performance, in addition to the information provided about the reliability of the predictions evidence the applicability of the system to be used as an aid for clinicians.

翻訳日:2022-09-20 01:57:13 公開日:2020-11-27

# 神経多様体としての2次元フレーム非視覚空間の表現とその情報幾何解釈

Representation of 2D frame less visual space as a neural manifold and its information geometric interpretation ( http://arxiv.org/abs/2011.13585v1 )

ライセンス: Link先を確認

Debasis Mazumdar

(参考訳) 情報幾何学のフレームワークにおけるニューラル多様体としての2次元フレームの表現とモデリングについて述べる。視覚空間の双曲性の起源は神経科学の証拠を用いて研究されている。そこで本研究では,ヒト脳における空間情報の処理,特に距離の推定,幾何学曲線の知覚等を,フィッシャー・ラオ計量を用いたパラメトリック確率空間でモデル化できることを提案する。空間のコンパクト性、凸性、微分性は解析され、ブセマンが提唱した G 空間の公理に従うことが分かる。さらに、これは定数負曲率の斉次リーマン空間と考えることができる。したがって、空間が測地線を生じさせることが保証される。多くの視覚現象を表す測地学の計算機シミュレーションを行い、視覚空間の双曲構造を提唱する。シミュレーション結果と公開実験データの比較を行った。

Representation of 2D frame less visual space as neural manifold and its modelling in the frame work of information geometry is presented. Origin of hyperbolic nature of the visual space is investigated using evidences from neuroscience. Based on the results we propose that the processing of spatial information, particularly estimation of distance, perceiving geometrical curves etc. in the human brain can be modeled in a parametric probability space endowed with Fisher-Rao metric. Compactness, convexity and differentiability of the space is analysed and found that they obey the axioms of G space, proposed by Busemann. Further it is shown that it can be considered as a homogeneous Riemannian space of constant negative curvature. It is therefore ensured that the space yields geodesics into it. Computer simulation of geodesics representing a number of visual phenomena and advocating the hyperbolic structure of visual space is carried out. Comparison of the simulated results with the published experimental data is presented.

翻訳日:2022-09-20 01:56:50 公開日:2020-11-27

# モジュール型深層強化学習と政策伝達による適応型自動化

Adaptable Automation with Modular Deep Reinforcement Learning and Policy Transfer ( http://arxiv.org/abs/2012.01934v1 )

ライセンス: Link先を確認

Zohreh Raziei, Mohsen Moghaddam

(参考訳) 深層強化学習(rl)の最近の進歩は、機械が所定のタスクを実行するための最適なポリシーを自律的に学習できるインテリジェントオートメーションにとって、前例のない機会を生み出した。しかし、現在のディープrlアルゴリズムは、主に狭い範囲のタスクに特化しており、サンプル非効率であり、十分な安定性を欠いているため、産業的な採用を妨げている。本稿では,タスクのモジュール化と伝達学習の概念に基づいて,ハイパーアクタソフトアクタクリティカル(HASAC)RLフレームワークを開発し,テストすることによって,この制限に対処する。 HASACの目標は、エージェントが学習したタスクのポリシーを「ハイパーアクター」を介して新しいタスクに転送することで、新しいタスクへの適応性を高めることである。 HASACフレームワークは、新しい仮想ロボット操作ベンチマークであるMeta-Worldでテストされている。数値実験により、HASACは、報酬値、成功率、タスク完了時間の観点から、最先端の深部RLアルゴリズムよりも優れた性能を示す。

Recent advances in deep Reinforcement Learning (RL) have created unprecedented opportunities for intelligent automation, where a machine can autonomously learn an optimal policy for performing a given task. However, current deep RL algorithms predominantly specialize in a narrow range of tasks, are sample inefficient, and lack sufficient stability, which in turn hinder their industrial adoption. This article tackles this limitation by developing and testing a Hyper-Actor Soft Actor-Critic (HASAC) RL framework based on the notions of task modularization and transfer learning. The goal of the proposed HASAC is to enhance the adaptability of an agent to new tasks by transferring the learned policies of former tasks to the new task via a "hyper-actor". The HASAC framework is tested on a new virtual robotic manipulation benchmark, Meta-World. Numerical experiments show superior performance by HASAC over state-of-the-art deep RL algorithms in terms of reward value, success rate, and task completion time.

翻訳日:2022-09-20 01:56:36 公開日:2020-11-27

# 実体埋め込みベクトルを用いたハイブリッドガウス過程モデルを用いた細胞間知識伝達

Knowledge transfer across cell lines using Hybrid Gaussian Process models with entity embedding vectors ( http://arxiv.org/abs/2011.13863v1 )

ライセンス: Link先を確認

Clemens Hutter, Moritz von Stosch, Mariano Nicolas Cruz Bournazou, Alessandro Butt\'e

(参考訳) 現在までに生化学プロセスを開発するために多くの実験が行われている。生成されたデータは一度だけ使用され、開発のための決定を下す。既に開発されたプロセスのデータを利用して、新しいプロセスの予測を行い、必要な実験の数を大幅に削減できるだろうか。異なる製品のプロセスは振る舞いの違いを示し、通常、サブセットのみが同じように振る舞う。したがって、複数の製品にまたがるプロセスデータに対する効果的な学習には、製品のアイデンティティを合理的に表現する必要がある。ガウス過程回帰モデルへの入力となるベクトルを埋め込み、積の同一性(圏的特徴)を表現することを提案する。組込みベクトルがプロセスデータからどのように学習できるかを示し、製品類似性の概念を解釈可能であることを示す。性能改善は、シミュレーションされたクロスプロダクト学習タスクにおける従来のワンホット符号化と比較される。総じて、提案手法はウェットラブ実験において有意な減少をもたらす可能性がある。

To date, a large number of experiments are performed to develop a biochemical process. The generated data is used only once, to take decisions for development. Could we exploit data of already developed processes to make predictions for a novel process, we could significantly reduce the number of experiments needed. Processes for different products exhibit differences in behaviour, typically only a subset behave similar. Therefore, effective learning on multiple product spanning process data requires a sensible representation of the product identity. We propose to represent the product identity (a categorical feature) by embedding vectors that serve as input to a Gaussian Process regression model. We demonstrate how the embedding vectors can be learned from process data and show that they capture an interpretable notion of product similarity. The improvement in performance is compared to traditional one-hot encoding on a simulated cross product learning task. All in all, the proposed method could render possible significant reductions in wet-lab experiments.

翻訳日:2022-09-20 01:56:09 公開日:2020-11-27

# 地形モデルを用いたマラリアベクター飼育地の検出

Detection of Malaria Vector Breeding Habitats using Topographic Models ( http://arxiv.org/abs/2011.13714v1 )

ライセンス: Link先を確認

Aishwarya Jadhav

(参考訳) マラリアベクターの繁殖地として機能する停滞した水域の処理は、ほとんどのマラリア除去運動の基本的なステップである。しかし、大規模な水域の特定は高価であり、労働集約的で時間を要するため、資源が限られている国では困難である。水体を効率的に発見できる実用的なモデルは、現場労働者がスキャンする必要がある領域を大幅に減らし、限られた資源をターゲットにすることができる。そこで本研究では,可能でグローバルで高解像度なDEMデータに基づく実用的な地形モデルを提案する。ガーナのオプアシ地域を調査し,様々な地形特性が異なる水域に与える影響を調査し,水生生物形成に大きな影響を及ぼす特徴を明らかにする。複数のモデルの有効性をさらに評価する。我々の最良モデルは、衛星画像データを利用し、異なる設定で堅牢性を示すものでさえも、小さな水面の検出に地形変数を用いた以前の試みよりも著しく優れている。

Treatment of stagnant water bodies that act as a breeding site for malarial vectors is a fundamental step in most malaria elimination campaigns. However, identification of such water bodies over large areas is expensive, labour-intensive and time-consuming and hence, challenging in countries with limited resources. Practical models that can efficiently locate water bodies can target the limited resources by greatly reducing the area that needs to be scanned by the field workers. To this end, we propose a practical topographic model based on easily available, global, high-resolution DEM data to predict locations of potential vector-breeding water sites. We surveyed the Obuasi region of Ghana to assess the impact of various topographic features on different types of water bodies and uncover the features that significantly influence the formation of aquatic habitats. We further evaluate the effectiveness of multiple models. Our best model significantly outperforms earlier attempts that employ topographic variables for detection of small water sites, even the ones that utilize additional satellite imagery data and demonstrates robustness across different settings.

翻訳日:2022-09-20 01:55:51 公開日:2020-11-27

# 医用ハイパースペクトル画像解析における深層学習の動向

Trends in deep learning for medical hyperspectral image analysis ( http://arxiv.org/abs/2011.13974v1 )

ライセンス: Link先を確認

Uzair Khan, Paheding Sidike, Colin Elkin and Vijay Devabhaktuni

(参考訳) 深層学習のアルゴリズムは、過去10年間にいくつかの分野の関心を集め、医療用ハイパースペクトルイメージングは特に有望な分野である。以上より,医用ハイパースペクトル画像における深層学習の実施を論じるレビュー論文は存在せず,このレビュー論文が目指すのは,現在,深層学習を利用して医用ハイパースペクトル画像の効果的な分析を行う出版物を調べることである。本稿では,深層学習のブーム以来実施されてきた医療用ハイパースペクトル画像解析に関係し,適用可能な深層学習概念について論じる。本研究は, 医用ハイパースペクトル画像解析において, 深層学習を用いた分類, 分割, 検出について検討する。最後に、この規律に関連する現状と今後の課題と、その試みを克服するための取り組みについて論じる。

Deep learning algorithms have seen acute growth of interest in their applications throughout several fields of interest in the last decade, with medical hyperspectral imaging being a particularly promising domain. So far, to the best of our knowledge, there is no review paper that discusses the implementation of deep learning for medical hyperspectral imaging, which is what this review paper aims to accomplish by examining publications that currently utilize deep learning to perform effective analysis of medical hyperspectral imagery. This paper discusses deep learning concepts that are relevant and applicable to medical hyperspectral imaging analysis, several of which have been implemented since the boom in deep learning. This will comprise of reviewing the use of deep learning for classification, segmentation, and detection in order to investigate the analysis of medical hyperspectral imaging. Lastly, we discuss the current and future challenges pertaining to this discipline and the possible efforts to overcome such trials.

翻訳日:2022-09-20 01:55:35 公開日:2020-11-27

# 早期アルツハイマー病検出のためのMRI画像解析法

MRI Images Analysis Method for Early Stage Alzheimer's Disease Detection ( http://arxiv.org/abs/2012.00830v1 )

ライセンス: Link先を確認

Achraf Ben Miled, Taoufik Yeferny, and Amira ben Rabeh

(参考訳) アルツハイマー病(英: Alzheimer disease)は、記憶や認知機能を変える神経変性疾患である。この疾患の早期診断は、ミルド認知障害(MCI: Mild Cognitive Impairment)と呼ばれる予備段階の検出によって、依然として困難な問題である。本稿では,MCI 段階におけるアルツハイマー病を検出するために,MRI 画像から最も顕著な特徴を自動的に抽出する,事前学習ネットワーク AlexNet を実装した強力な分類アーキテクチャを提案する。 oasisデータベース脳の大規模データベースを用いて,提案手法を評価した。脳の様々な部分(前頭、矢状、軸)が用いられた。健常者210名とMRI210名を用いた96.83%の精度を実現した。

Alzheimer's disease is a neurogenerative disease that alters memories, cognitive functions leading to death. Early diagnosis of the disease, by detection of the preliminary stage, called Mild Cognitive Impairment (MCI), remains a challenging issue. In this respect, we introduce, in this paper, a powerful classification architecture that implements the pre-trained network AlexNet to automatically extract the most prominent features from Magnetic Resonance Imaging (MRI) images in order to detect the Alzheimer's disease at the MCI stage. The proposed method is evaluated using a big database from OASIS Database Brain. Various sections of the brain: frontal, sagittal and axial were used. The proposed method achieved 96.83% accuracy by using 420 subjects: 210 Normal and 210 MRI

翻訳日:2022-09-20 01:55:19 公開日:2020-11-27

# 弱いラベルを持つ脳動脈瘤分類のための解剖学的インフォームド3D CNN

An anatomically-informed 3D CNN for brain aneurysm classification with weak labels ( http://arxiv.org/abs/2012.08645v1 )

ライセンス: Link先を確認

Tommaso Di Noto, Guillaume Marie, S\'ebastien Tourbier, Yasser Alem\'an-G\'omez, Guillaume Saliou, Meritxell Bach Cuadra, Patric Hagmann, Jonas Richiardi

(参考訳) 医療画像における検出タスクを実行するための一般的なアプローチは、初期セグメンテーションに依存することである。しかし、このアプローチは、医療専門家が描くのに反復的かつ時間のかかるvoxel-wiseアノテーションに強く依存している。ボクセルのマスクに代わる興味深い選択肢は、いわゆる「弱」ラベルである。これらは粗いアノテーションか、より正確ではないが、作成が著しく高速な過大なアノテーションである。本研究は,脳動脈瘤検出の課題を,教師付きセグメンテーション法やボクセルワイドデラインを用いた関連研究とは対照的に,弱いラベルを用いたパッチワイドバイナリ分類として扱う。我々のアプローチは、ほとんどの焦点疾患と同様に、異常なパッチ(大動脈瘤を含む)は異常のないものよりも多く、通常2つのクラスは異なる空間分布を持つという、データセット作成の非自明な課題に起因している。そこで本研究では,マルチスケール・マルチインプット・3D畳み込みニューラルネットワーク(CNN)を用いて,非バランスで空間的に歪んだデータセットの頻繁なシナリオに対処する。今回我々は,tof-mra(time-of-flight magnetic resonance angiography)を施行した214名 (83名, 131名) の脳動脈瘤の111例を経験した。我々は,ネットワークの難易度が増大する負のパッチサンプリングに対する2つの戦略を比較し,この選択が結果にどのように影響するかを示す。付加された空間情報が性能向上に寄与するかどうかを評価するために, 解剖学的にインフォームドされたCNNと, ベースライン, 空間非依存のCNNを比較した。容器のような負のパッチを含むより現実的で挑戦的なシナリオを考えると、前者は最も高い分類結果(精度$\simeq$95\%, AUROC$\simeq$0.95, AUPR$\simeq$0.71)を得た。

A commonly adopted approach to carry out detection tasks in medical imaging is to rely on an initial segmentation. However, this approach strongly depends on voxel-wise annotations which are repetitive and time-consuming to draw for medical experts. An interesting alternative to voxel-wise masks are so-called "weak" labels: these can either be coarse or oversized annotations that are less precise, but noticeably faster to create. In this work, we address the task of brain aneurysm detection as a patch-wise binary classification with weak labels, in contrast to related studies that rather use supervised segmentation methods and voxel-wise delineations. Our approach comes with the non-trivial challenge of the data set creation: as for most focal diseases, anomalous patches (with aneurysm) are outnumbered by those showing no anomaly, and the two classes usually have different spatial distributions. To tackle this frequent scenario of inherently imbalanced, spatially skewed data sets, we propose a novel, anatomically-driven approach by using a multi-scale and multi-input 3D Convolutional Neural Network (CNN). We apply our model to 214 subjects (83 patients, 131 controls) who underwent Time-Of-Flight Magnetic Resonance Angiography (TOF-MRA) and presented a total of 111 unruptured cerebral aneurysms. We compare two strategies for negative patch sampling that have an increasing level of difficulty for the network and we show how this choice can strongly affect the results. To assess whether the added spatial information helps improving performances, we compare our anatomically-informed CNN with a baseline, spatially-agnostic CNN. When considering the more realistic and challenging scenario including vessel-like negative patches, the former model attains the highest classification results (accuracy$\simeq$95\%, AUROC$\simeq$0.95, AUPR$\simeq$0.71), thus outperforming the baseline.

翻訳日:2022-09-20 01:55:09 公開日:2020-11-27

# ドメイン適応因果性エンコーダ

Domain Adaptative Causality Encoder ( http://arxiv.org/abs/2011.13549v1 )

ライセンス: Link先を確認

Farhad Moghimifar, Gholamreza Haffari, Mahsa Baktashmotlagh

(参考訳) 個々のイベント間の低レベル関係の抽出を主眼とする現在のアプローチは,公開ラベル付きデータの不足によって制限されている。したがって、学習時にラベル付きデータが存在しない分布が異なる領域に適用した場合、結果のモデルは不十分である。この制限を克服するため,本論文では,依存木の特徴と逆学習を活用し,適応因果関係同定と局所化の課題に対処する。適応(adaptive)という用語は、トレーニングとテストのデータが2つの分散的なデータセットから来ているため、私たちの知る限りでは、この作業に対処するのは初めてです。さらに,テキスト中のすべての種類の因果関係を統合する新しい因果関係データセットであるmedcausを提案する。 4つの異なるベンチマーク因果関係データセットを用いた実験により,テキストからの因果関係の同定と局所化のタスクにおいて,既存基準よりも最大7%改善したアプローチが優れていることを示す。

Current approaches which are mainly based on the extraction of low-level relations among individual events are limited by the shortage of publicly available labelled data. Therefore, the resulting models perform poorly when applied to a distributionally different domain for which labelled data did not exist at the time of training. To overcome this limitation, in this paper, we leverage the characteristics of dependency trees and adversarial learning to address the tasks of adaptive causality identification and localisation. The term adaptive is used since the training and test data come from two distributionally different datasets, which to the best of our knowledge, this work is the first to address. Moreover, we present a new causality dataset, namely MedCaus, which integrates all types of causality in the text. Our experiments on four different benchmark causality datasets demonstrate the superiority of our approach over the existing baselines, by up to 7% improvement, on the tasks of identification and localisation of the causal relations from the text.

翻訳日:2022-09-20 01:54:31 公開日:2020-11-27

# 対話型文表現学習に基づく中国語医学質問応答照合

Chinese Medical Question Answer Matching Based on Interactive Sentence Representation Learning ( http://arxiv.org/abs/2011.13573v1 )

ライセンス: Link先を確認

Xiongtao Cui and Jungang Han

(参考訳) 中国の医学質問応答マッチングは、英語のオープンドメイン質問応答マッチングよりも難しい。深層学習法は質問応答マッチングの性能向上に優れてきたが,これらの手法は文内の意味情報のみに焦点をあてるが,質問と回答間の意味関係は無視し,結果として性能に欠陥が生じる。本稿では,この問題に取り組むために,対話型文表現学習モデルの設計を行う。本稿では,中国語の医学的問答マッチングに適応し,異なるニューラルネットワークの構造の利点を活かし,文内の深い意味情報を抽出し,問答間の意味関係を抽出し,多スケールcnnsネットワークやbigruネットワークと組み合わせ,ニューラルネットワークの異なる構造を活用し,文表現における意味的特徴を学習するクロスクロスバートネットワークを提案する。 cMedQA V2.0とcMedQA V1.0データセットの実験により、我々のモデルは、中国の医学的質問応答マッチングの既存の最先端モデルよりも大幅に優れていることが示された。

Chinese medical question-answer matching is more challenging than the open-domain question answer matching in English. Even though the deep learning method has performed well in improving the performance of question answer matching, these methods only focus on the semantic information inside sentences, while ignoring the semantic association between questions and answers, thus resulting in performance deficits. In this paper, we design a series of interactive sentence representation learning models to tackle this problem. To better adapt to Chinese medical question-answer matching and take the advantages of different neural network structures, we propose the Crossed BERT network to extract the deep semantic information inside the sentence and the semantic association between question and answer, and then combine with the multi-scale CNNs network or BiGRU network to take the advantage of different structure of neural networks to learn more semantic features into the sentence representation. The experiments on the cMedQA V2.0 and cMedQA V1.0 dataset show that our model significantly outperforms all the existing state-of-the-art models of Chinese medical question answer matching.

翻訳日:2022-09-20 01:54:14 公開日:2020-11-27

# 深い直交線形ネットワークは浅い

Deep orthogonal linear networks are shallow ( http://arxiv.org/abs/2011.13831v1 )

ライセンス: Link先を確認

Pierre Ablin

(参考訳) 直交行列の積からなる深い直交線形ネットワークをトレーニングする際の問題を考える。リーマン勾配降下を伴う重みの訓練は、勾配降下による因子化全体の訓練と等価であることを示す。つまり、この設定では、過パラメータ化と暗黙のバイアスが全く影響しない:そのような深層で過パラメータ化されたネットワークのトレーニングは、一層浅層ネットワークのトレーニングと完全に等価である。

We consider the problem of training a deep orthogonal linear network, which consists of a product of orthogonal matrices, with no non-linearity in-between. We show that training the weights with Riemannian gradient descent is equivalent to training the whole factorization by gradient descent. This means that there is no effect of overparametrization and implicit bias at all in this setting: training such a deep, overparametrized, network is perfectly equivalent to training a one-layer shallow network.

翻訳日:2022-09-20 01:47:55 公開日:2020-11-27

# マルチナリー制限ボルツマン機の搬送損失機能とカラー画像生成

Tractable loss function and color image generation of multinary restricted Boltzmann machine ( http://arxiv.org/abs/2011.13509v1 )

ライセンス: Link先を確認

Juno Hwang and Wonseok Hwang and Junghyo Jo

(参考訳) 制限ボルツマンマシン(RBM)は統計力学の概念に基づく代表的な生成モデルである。解釈可能性の強い利点にもかかわらず、バックプロパゲーションの非利用性は、他の生成モデルよりも競争力を低下させる。ここで、二元および多元 rbms の可微分損失関数を導出する。次に,色付き顔画像を生成することで,学習性と性能を示す。

The restricted Boltzmann machine (RBM) is a representative generative model based on the concept of statistical mechanics. In spite of the strong merit of interpretability, unavailability of backpropagation makes it less competitive than other generative models. Here we derive differentiable loss functions for both binary and multinary RBMs. Then we demonstrate their learnability and performance by generating colored face images.

翻訳日:2022-09-20 01:47:46 公開日:2020-11-27

# 物理世界におけるディープラーニング顔認識に対するロバストな攻撃

Robust Attacks on Deep Learning Face Recognition in the Physical World ( http://arxiv.org/abs/2011.13526v1 )

ライセンス: Link先を確認

Meng Shen, Hao Yu, Liehuang Zhu, Ke Xu, Qi Li, Xiaojiang Du

(参考訳) ディープニューラルネットワーク(DNN)は、顔認識(FR)システムでますます使われている。しかし、最近の研究では、DNNは敵対的な例に弱いことが示されており、物理的世界ではDNNを使用してFRシステムを誤解させる可能性がある。これらのシステムに対する既存の攻撃は、単にデジタル世界で働く摂動を生成するか、摂動を生成するためにカスタマイズされた機器に依存するかのいずれかであり、様々な物理的環境では堅牢ではない。本稿では、敵ステッカーを使ってFRシステムを騙す物理世界攻撃であるFaceAdvを提案する。主にステッカージェネレータとトランスで構成されており、前者は複数の異なる形状のステッカーを製作でき、後者のトランスフォーマーは人間の顔にステッカーをデジタルに取り付け、ステッカーの有効性を向上させるためにジェネレータにフィードバックを提供することを目的としている。本研究では,3種類のFRシステム(ArcFace,CosFace,FaceNet)に対するFaceAdvの有効性を評価するための広範囲な実験を行った。その結果、faceadvは最先端の攻撃と比べて、ドッジと偽装の両方の成功率を大幅に向上できることがわかった。また,FaceAdvの堅牢性を示すため,包括的評価を行った。

Deep neural networks (DNNs) have been increasingly used in face recognition (FR) systems. Recent studies, however, show that DNNs are vulnerable to adversarial examples, which can potentially mislead the FR systems using DNNs in the physical world. Existing attacks on these systems either generate perturbations working merely in the digital world, or rely on customized equipments to generate perturbations and are not robust in varying physical environments. In this paper, we propose FaceAdv, a physical-world attack that crafts adversarial stickers to deceive FR systems. It mainly consists of a sticker generator and a transformer, where the former can craft several stickers with different shapes and the latter transformer aims to digitally attach stickers to human faces and provide feedbacks to the generator to improve the effectiveness of stickers. We conduct extensive experiments to evaluate the effectiveness of FaceAdv on attacking 3 typical FR systems (i.e., ArcFace, CosFace and FaceNet). The results show that compared with a state-of-the-art attack, FaceAdv can significantly improve success rate of both dodging and impersonating attacks. We also conduct comprehensive evaluations to demonstrate the robustness of FaceAdv.

翻訳日:2022-09-20 01:47:40 公開日:2020-11-27

# SocialGuard: ソーシャルイメージの逆例に基づくプライバシ保護技術

SocialGuard: An Adversarial Example Based Privacy-Preserving Technique for Social Images ( http://arxiv.org/abs/2011.13560v1 )

ライセンス: Link先を確認

Mingfu Xue, Shichang Sun, Zhiyu Wu, Can He, Jian Wang, Weiqiang Liu

(参考訳) さまざまなソーシャルプラットフォームの人気は、人々が日常的な写真をオンラインで共有するきっかけとなった。しかし、このようなオンライン写真共有行動によって、望ましくないプライバシーリークが発生する。 advanced deep neural network (dnn)ベースのオブジェクト検出器は、共有写真に露出したユーザーの個人情報を容易に盗むことができる。本稿では,対象物検知器によるプライバシ盗難に対するソーシャルイメージの新たな逆例に基づくプライバシ保存手法を提案する。具体的には,2種類の敵対的ソーシャル画像を作成するためのオブジェクト消去アルゴリズムを開発した。ソーシャルイメージ内のすべてのオブジェクトを、オブジェクト検出器によって検出されるのを防ぎ、一方は、カスタマイズされた機密オブジェクトを、オブジェクト検出器によって不正に分類することができる。 Object Disappearance Algorithmは、クリーンな社会的イメージに摂動を構築する。摂動を注入した後、社会的イメージは容易に物体検出器を騙すことができるが、その視覚品質は劣化しない。提案手法の有効性を評価するために,プライバシ保存成功率とプライバシリーク率の2つの指標を用いる。実験の結果,提案手法は社会的画像のプライバシーを効果的に保護できることがわかった。 ms-cocoおよびpascal voc 2007データセットにおける提案手法のプライバシ保護成功率は、それぞれ96.1%、99.3%であり、これら2つのデータセットのプライバシリーク率は0.57%、0.07%である。さらに,既存の画像処理手法(低輝度,ノイズ,ぼかし,モザイク,jpeg圧縮)と比較して,提案手法はプライバシ保護と画像品質の維持において,はるかに優れた性能を実現することができる。

The popularity of various social platforms has prompted more people to share their routine photos online. However, undesirable privacy leakages occur due to such online photo sharing behaviors. Advanced deep neural network (DNN) based object detectors can easily steal users' personal information exposed in shared photos. In this paper, we propose a novel adversarial example based privacy-preserving technique for social images against object detectors based privacy stealing. Specifically, we develop an Object Disappearance Algorithm to craft two kinds of adversarial social images. One can hide all objects in the social images from being detected by an object detector, and the other can make the customized sensitive objects be incorrectly classified by the object detector. The Object Disappearance Algorithm constructs perturbation on a clean social image. After being injected with the perturbation, the social image can easily fool the object detector, while its visual quality will not be degraded. We use two metrics, privacy-preserving success rate and privacy leakage rate, to evaluate the effectiveness of the proposed method. Experimental results show that, the proposed method can effectively protect the privacy of social images. The privacy-preserving success rates of the proposed method on MS-COCO and PASCAL VOC 2007 datasets are high up to 96.1% and 99.3%, respectively, and the privacy leakage rates on these two datasets are as low as 0.57% and 0.07%, respectively. In addition, compared with existing image processing methods (low brightness, noise, blur, mosaic and JPEG compression), the proposed method can achieve much better performance in privacy protection and image visual quality maintenance.

翻訳日:2022-09-20 01:46:50 公開日:2020-11-27

# road scene graph: インテリジェントな車両のためのセマンティックグラフベースのシーン表現データセット

Road Scene Graph: A Semantic Graph-Based Scene Representation Dataset for Intelligent Vehicles ( http://arxiv.org/abs/2011.13588v1 )

ライセンス: Link先を確認

Yafu Tian, Alexander Carballo, Ruifeng Li and Kazuya Takeda

(参考訳) リッチセマンティック情報抽出は、次世代のインテリジェント車において重要な役割を果たす。現在,6次元ポーズ検出や道路シーンセマンティックセグメンテーションなどの基本的な応用に焦点を当てた研究が数多く行われている。これにより、これらのデータを組織化し、どのように活用するかを考える素晴らしい機会が得られます。本稿では,車載用特別シーングラフである道路シーングラフを提案する。古典的なデータ表現とは異なり、このグラフはオブジェクトの提案だけでなく、ペア関係も提供する。トポロジグラフにまとめることで、これらのデータは説明可能で、完全に接続可能であり、GCN(Graph Convolutional Networks)で簡単に処理できる。ここでは,基本グラフ予測モデルを含む道路シーングラフデータセットを用いて道路のシーングラフを適用する。本研究は,提案モデルを用いた実験評価も含む。

Rich semantic information extraction plays a vital role on next-generation intelligent vehicles. Currently there is great amount of research focusing on fundamental applications such as 6D pose detection, road scene semantic segmentation, etc. And this provides us a great opportunity to think about how shall these data be organized and exploited. In this paper we propose road scene graph,a special scene-graph for intelligent vehicles. Different to classical data representation, this graph provides not only object proposals but also their pair-wise relationships. By organizing them in a topological graph, these data are explainable, fully-connected, and could be easily processed by GCNs (Graph Convolutional Networks). Here we apply scene graph on roads using our Road Scene Graph dataset, including the basic graph prediction model. This work also includes experimental evaluations using the proposed model.

翻訳日:2022-09-20 01:46:23 公開日:2020-11-27

# GANの学習性に影響を与える特性に関する研究

A study of traits that affect learnability in GANs ( http://arxiv.org/abs/2011.13728v1 )

ライセンス: Link先を確認

Niladri Shekhar Dutt, Sunil Patel

(参考訳) generative adversarial networks gansは2つのニューラルネットワークを使用するアルゴリズムアーキテクチャであり、対向するニューラルネットワークを使用して、実際のデータに渡される新しい合成データインスタンスを考案する。 GANのトレーニングは難しい問題であり、ハイパーパラメータチューニングやアーキテクチャエンジニアリングといった高度なテクニックを適用する必要があります。多くの異なる損失、正規化と正規化スキーム、ネットワークアーキテクチャは、異なるタイプのデータセットに対するこの問題を解決するために提案されている。実験的な観察を理解し、簡単な理論を導出する必要がある。本稿では,パラメータ化合成データセットを用いて実験実験を行い,学習性に影響を与える特性について検討する。

Generative Adversarial Networks GANs are algorithmic architectures that use two neural networks, pitting one against the opposite so as to come up with new, synthetic instances of data that can pass for real data. Training a GAN is a challenging problem which requires us to apply advanced techniques like hyperparameter tuning, architecture engineering etc. Many different losses, regularization and normalization schemes, network architectures have been proposed to solve this challenging problem for different types of datasets. It becomes necessary to understand the experimental observations and deduce a simple theory for it. In this paper, we perform empirical experiments using parameterized synthetic datasets to probe what traits affect learnability.

翻訳日:2022-09-20 01:46:10 公開日:2020-11-27

# TaylorGAN: サンプル効率の良い自然言語生成のための隣り合わせのポリシー更新

TaylorGAN: Neighbor-Augmented Policy Update for Sample-Efficient Natural Language Generation ( http://arxiv.org/abs/2011.13527v1 )

ライセンス: Link先を確認

Chun-Hsing Lin, Siang-Ruei Wu, Hung-Yi Lee, Yun-Nung Chen

(参考訳) ReINFORCEのようなスコア関数ベースの自然言語生成(NLG)アプローチは、サンプル効率の低下や不安定性の訓練に悩まされている。これは主に離散空間サンプリングの非微分的性質のためであり、これらの方法は判別器をブラックボックスとして扱い、勾配情報を無視しなければならない。サンプル効率の向上とREINFORCEのばらつきの低減を目的として,オフ・ポリシー更新と1次テイラー展開による勾配推定を向上する新しいアプローチTaylorGANを提案する。このアプローチにより、NLGモデルをスクラッチからより小さなバッチサイズでトレーニングすることが可能になります -- 最大限の事前トレーニングをすることなく、品質と多様性の複数の指標において、既存のGANベースのメソッドよりも優れています。ソースコードとデータはhttps://github.com/miulab/taylorganで入手できる。

Score function-based natural language generation (NLG) approaches such as REINFORCE, in general, suffer from low sample efficiency and training instability problems. This is mainly due to the non-differentiable nature of the discrete space sampling and thus these methods have to treat the discriminator as a black box and ignore the gradient information. To improve the sample efficiency and reduce the variance of REINFORCE, we propose a novel approach, TaylorGAN, which augments the gradient estimation by off-policy update and the first-order Taylor expansion. This approach enables us to train NLG models from scratch with smaller batch size -- without maximum likelihood pre-training, and outperforms existing GAN-based methods on multiple metrics of quality and diversity. The source code and data are available at https://github.com/MiuLab/TaylorGAN

翻訳日:2022-09-20 01:39:45 公開日:2020-11-27

# 長尾関係抽出のためのラベルなしテキストからの学習関係プロトタイプ

Learning Relation Prototype from Unlabeled Texts for Long-tail Relation Extraction ( http://arxiv.org/abs/2011.13574v1 )

ライセンス: Link先を確認

Yixin Cao, Jun Kuang, Ming Gao, Aoying Zhou, Yonggang Wen, Tat-Seng Chua

(参考訳) 関係抽出(re)は、テキストからエンティティ関係を抽出することによって、知識グラフ(kg)を完成させるための重要なステップである。トレーニングデータは主にいくつかのタイプの関係に集中しており、残りのタイプの関係に対する十分なアノテーションが欠如している。本稿では,関係型から知識を十分な学習データで伝達することで,テキストのラベルのない関係のプロトタイプを学習し,長い関係の抽出を容易にする方法を提案する。我々は,関係の意味と伝達学習の近さを反映した実体間の暗黙的要因として関係プロトタイプを学習する。具体的には、テキストから共起グラフを構築し、埋め込み学習のための一階と二階の両方のエンティティをキャプチャする。これに基づいて、ほぼ任意のREフレームワークに容易に適用可能な、プロトタイプに対応するエンティティペアからの距離をさらに最適化する。そこで我々は、New York TimesとGoogle Distant Supervisionという2つの公開データセットで大規模な実験を行い、8つの最先端ベースラインと比較し、提案モデルは大幅な改善(平均4.1% F1)を達成した。長期関係に関するさらなる議論は、学習された関係プロトタイプの有効性を示す。さらに,様々な成分の影響を解明するためのアブレーション研究を行い,これを4つの基本関係抽出モデルに適用して一般化能力を検証した。コードは後でリリースします。

Relation Extraction (RE) is a vital step to complete Knowledge Graph (KG) by extracting entity relations from texts.However, it usually suffers from the long-tail issue. The training data mainly concentrates on a few types of relations, leading to the lackof sufficient annotations for the remaining types of relations. In this paper, we propose a general approach to learn relation prototypesfrom unlabeled texts, to facilitate the long-tail relation extraction by transferring knowledge from the relation types with sufficient trainingdata. We learn relation prototypes as an implicit factor between entities, which reflects the meanings of relations as well as theirproximities for transfer learning. Specifically, we construct a co-occurrence graph from texts, and capture both first-order andsecond-order entity proximities for embedding learning. Based on this, we further optimize the distance from entity pairs tocorresponding prototypes, which can be easily adapted to almost arbitrary RE frameworks. Thus, the learning of infrequent or evenunseen relation types will benefit from semantically proximate relations through pairs of entities and large-scale textual information.We have conducted extensive experiments on two publicly available datasets: New York Times and Google Distant Supervision.Compared with eight state-of-the-art baselines, our proposed model achieves significant improvements (4.1% F1 on average). Furtherresults on long-tail relations demonstrate the effectiveness of the learned relation prototypes. We further conduct an ablation study toinvestigate the impacts of varying components, and apply it to four basic relation extraction models to verify the generalization ability.Finally, we analyze several example cases to give intuitive impressions as qualitative analysis. Our codes will be released later.

翻訳日:2022-09-20 01:39:31 公開日:2020-11-27

# Reflective-Net: 説明から学ぶ

Reflective-Net: Learning from Explanations ( http://arxiv.org/abs/2011.13986v1 )

ライセンス: Link先を確認

Johannes Schneider and Michalis Vlachos

(参考訳) 人間は、迅速で直感的な決定をするだけでなく、自己表現、すなわち自己説明し、他人の説明から効率的に学ぶ能力を持っている。この研究は、既存の説明法、すなわちGrad-CAMに基づいて生成された説明に乗じて、このプロセスを模倣する最初のステップを提供する。従来のラベル付きデータと組み合わせた説明から学ぶことは、精度とトレーニング時間の観点から分類の大幅な改善をもたらす。

Humans possess a remarkable capability to make fast, intuitive decisions, but also to self-reflect, i.e., to explain to oneself, and to efficiently learn from explanations by others. This work provides the first steps toward mimicking this process by capitalizing on the explanations generated based on existing explanation methods, i.e. Grad-CAM. Learning from explanations combined with conventional labeled data yields significant improvements for classification in terms of accuracy and training time.

翻訳日:2022-09-20 01:37:40 公開日:2020-11-27

# ドメイン知識を使って機械に自己説明を教える

Teaching the Machine to Explain Itself using Domain Knowledge ( http://arxiv.org/abs/2012.01932v1 )

ライセンス: Link先を確認

Vladimir Balayan, Pedro Saleiro, Catarina Bel\'em, Ludwig Krippahl and Pedro Bizarro

(参考訳) 機械学習(ml)は、人間がより良く、より速い決定を下すのを助けるためにますます使われています。しかし、非技術者はモデル予測の背後にある理論的根拠を理解するのに苦労し、アルゴリズムによる意思決定システムの信頼を妨げた。 AIの説明可能性に関する重要な研究は、説明方法を開発することによってAIシステムの信頼を取り戻す試みであるが、大きなブレークスルーはない。同時に、一般的な説明法(例えば LIME や SHAP)は、非データ科学者のペルソナを理解するのが非常に難しい説明を生成する。これを解決するために、意思決定タスクとドメイン知識を伝える関連する説明を共同で学習するニューラルネットワークベースのフレームワークJOELを提案する。 JOELは、専門家自身の推論に非常によく似た、モデルの予測に関する高いレベルの洞察を提供する、深い技術的ML知識の欠如を持つ、ループ内のドメインエキスパートに合わせたものだ。さらに、認定専門家のプールからドメインからのフィードバックを収集し、モデル(人間の教え)を改善するために使用することで、シームレスでより適切な説明を促進します。最後に,従来の専門家システムとドメイン分類体系のセマンティックマッピングを用いてブートストラップトレーニングセットを自動的に注釈付けし,概念に基づく人間のアノテーションの欠如を克服する。実世界の不正検出データセット上でJOELを実証的に検証する。 JOELはブートストラップデータセットから説明を一般化できることを示す。さらに, 人間の指導により, 説明文の予測精度を約$13.57\%$向上できることを示した。

Machine Learning (ML) has been increasingly used to aid humans to make better and faster decisions. However, non-technical humans-in-the-loop struggle to comprehend the rationale behind model predictions, hindering trust in algorithmic decision-making systems. Considerable research work on AI explainability attempts to win back trust in AI systems by developing explanation methods but there is still no major breakthrough. At the same time, popular explanation methods (e.g., LIME, and SHAP) produce explanations that are very hard to understand for non-data scientist persona. To address this, we present JOEL, a neural network-based framework to jointly learn a decision-making task and associated explanations that convey domain knowledge. JOEL is tailored to human-in-the-loop domain experts that lack deep technical ML knowledge, providing high-level insights about the model's predictions that very much resemble the experts' own reasoning. Moreover, we collect the domain feedback from a pool of certified experts and use it to ameliorate the model (human teaching), hence promoting seamless and better suited explanations. Lastly, we resort to semantic mappings between legacy expert systems and domain taxonomies to automatically annotate a bootstrap training set, overcoming the absence of concept-based human annotations. We validate JOEL empirically on a real-world fraud detection dataset. We show that JOEL can generalize the explanations from the bootstrap dataset. Furthermore, obtained results indicate that human teaching can further improve the explanations prediction quality by approximately $13.57\%$.

翻訳日:2022-09-20 01:37:32 公開日:2020-11-27

# すべての企業がその構造を所有:グラフニューラルネットワークによる企業クレジットレーティング

Every Corporation Owns Its Structure: Corporate Credit Ratings via Graph Neural Networks ( http://arxiv.org/abs/2012.01933v1 )

ライセンス: Link先を確認

Bojing Feng, Haonan Xu, Wenfang Xue and Bindang Xue

(参考訳) 信用格付けは、投資におけるリスクと信頼性のレベルを反映し、金融リスクにおいて重要な役割を果たす、企業に関連する信用リスクの分析である。企業信用格付けを扱うためにベクトル空間に基づく機械学習とディープラーニング技術を実装する多くの研究が登場している。近年,ローン保証ネットワークなどの企業間の関係を考慮すると,グラフニューラルネットワークの出現に伴い,グラフベースモデルもいくつか適用されている。しかし、これらの既存のモデルは企業間のネットワークを構築し、内部の機能的相互作用を考慮に入れない。本稿では,このような問題を解決するために,グラフニューラルネットワークを用いた企業信用評価モデルCCR-GNNを提案する。まず、各企業の個々のグラフをセルフアウト製品に基づいて構築し、gnnを使用して、ローカル情報とグローバル情報の両方を含む機能インタラクションを明示的にモデル化します。中国の上場企業評価データセットで実施された大規模な実験は、CCR-GNNが最先端の手法を一貫して上回っていることを証明している。

Credit rating is an analysis of the credit risks associated with a corporation, which reflects the level of the riskiness and reliability in investing, and plays a vital role in financial risk. There have emerged many studies that implement machine learning and deep learning techniques which are based on vector space to deal with corporate credit rating. Recently, considering the relations among enterprises such as loan guarantee network, some graph-based models are applied in this field with the advent of graph neural networks. But these existing models build networks between corporations without taking the internal feature interactions into account. In this paper, to overcome such problems, we propose a novel model, Corporate Credit Rating via Graph Neural Networks, CCR-GNN for brevity. We firstly construct individual graphs for each corporation based on self-outer product and then use GNN to model the feature interaction explicitly, which includes both local and global information. Extensive experiments conducted on the Chinese public-listed corporate rating dataset, prove that CCR-GNN outperforms the state-of-the-art methods consistently.

翻訳日:2022-09-20 01:37:07 公開日:2020-11-27

# バイナリラテントを用いた変分オートエンコーダの直接進化最適化

Direct Evolutionary Optimization of Variational Autoencoders With Binary Latents ( http://arxiv.org/abs/2011.13704v1 )

ライセンス: Link先を確認

Enrico Guiraud, Jakob Drefs, J\"org L\"ucke

(参考訳) 離散潜在変数は実世界のデータにとって重要であると考えられており、離散潜在変数を持つ変分オートエンコーダ(VAE)の研究の動機となっている。しかし、この場合、標準的なVAEトレーニングは不可能であり、従来のような個別のVAEを訓練するために、個別の分散を操作するための異なる戦略を動機付けている。ここでは、符号化モデルに直接離散最適化を適用することにより、潜伏者の離散性を完全に維持できるかどうかを問う。この手法は, サイドステッピングサンプリング近似, 再パラメータ化トリック, 償却により, 標準的なVAEトレーニングから強く逸脱している。離散最適化は、進化的アルゴリズムと連動して、切断後段を用いた変分設定で実現される。バイナリラテントを持つVAEに対して、(A)ネットワーク重みに対する勾配上昇にそのような離散的変動法がどのように結びついているか、および(B)デコーダがトレーニングのために遅延状態を選択する方法を示す。従来の償却トレーニングはより効率的で、大きなニューラルネットワークに適用できる。しかし、より小さなネットワークを用いることで、数百の潜伏者に対して効率よく分散最適化を行うことができる。さらに重要なのは,直接最適化の有効性が,‘ゼロショット’学習において極めて競争力が高いことだ。大規模な教師付きネットワークとは対照的に、hereが調査したvaes canは、クリーンなデータや大きな画像データセットのトレーニングの事前のトレーニングなしに、1つのイメージをデノーズする。より一般に,vaeの訓練はサンプリングに基づく近似と再パラメータ化を伴わずに可能であり,一般にvae訓練の解析には興味深いものと考えられる。ゼロショット' 設定では、直接最適化され、さらに、VAE は非生成的アプローチによって以前より優れていた。

Discrete latent variables are considered important for real world data, which has motivated research on Variational Autoencoders (VAEs) with discrete latents. However, standard VAE-training is not possible in this case, which has motivated different strategies to manipulate discrete distributions in order to train discrete VAEs similarly to conventional ones. Here we ask if it is also possible to keep the discrete nature of the latents fully intact by applying a direct discrete optimization for the encoding model. The approach is consequently strongly diverting from standard VAE-training by sidestepping sampling approximation, reparameterization trick and amortization. Discrete optimization is realized in a variational setting using truncated posteriors in conjunction with evolutionary algorithms. For VAEs with binary latents, we (A) show how such a discrete variational method ties into gradient ascent for network weights, and (B) how the decoder is used to select latent states for training. Conventional amortized training is more efficient and applicable to large neural networks. However, using smaller networks, we here find direct discrete optimization to be efficiently scalable to hundreds of latents. More importantly, we find the effectiveness of direct optimization to be highly competitive in `zero-shot' learning. In contrast to large supervised networks, the here investigated VAEs can, e.g., denoise a single image without previous training on clean data and/or training on large image datasets. More generally, the studied approach shows that training of VAEs is indeed possible without sampling-based approximation and reparameterization, which may be interesting for the analysis of VAE-training in general. For `zero-shot' settings a direct optimization, furthermore, makes VAEs competitive where they have previously been outperformed by non-generative approaches.

翻訳日:2022-09-20 01:36:49 公開日:2020-11-27

# 深層ニューラルネットワークにおける畳み込み層の不確かさに関する研究

A Study on the Uncertainty of Convolutional Layers in Deep Neural Networks ( http://arxiv.org/abs/2011.13719v1 )

ライセンス: Link先を確認

Haojing Shen, Sihong Chen, Ran Wang

(参考訳) 本稿では,ニューラルネットワーク構造,すなわちLeNetにおける畳み込み層の接続重みに存在するMin-Max特性を示す。具体的には、Min-Max特性は、LeNetの後方伝播ベースのトレーニングの間、畳み込み層の重みが間隔の中心から遠ざかる、すなわち最小限に減少するか、最大まで増加することを意味する。不確実性の観点から、Min-Max特性が畳み込みの簡易な定式化によってモデルパラメータのファジィを最小化することを示す。実験により、Min-Max特性を持つモデルが強い対向性を持つことが確認され、この特性は損失関数の設計に組み込むことができる。本稿では,レネ構造の畳み込み層における不確かさの変化傾向を指摘し,畳み込みの解釈可能性について考察する。

This paper shows a Min-Max property existing in the connection weights of the convolutional layers in a neural network structure, i.e., the LeNet. Specifically, the Min-Max property means that, during the back propagation-based training for LeNet, the weights of the convolutional layers will become far away from their centers of intervals, i.e., decreasing to their minimum or increasing to their maximum. From the perspective of uncertainty, we demonstrate that the Min-Max property corresponds to minimizing the fuzziness of the model parameters through a simplified formulation of convolution. It is experimentally confirmed that the model with the Min-Max property has a stronger adversarial robustness, thus this property can be incorporated into the design of loss function. This paper points out a changing tendency of uncertainty in the convolutional layers of LeNet structure, and gives some insights to the interpretability of convolution.

翻訳日:2022-09-20 01:36:18 公開日:2020-11-27

# 条件付き無依存画素合成による画像生成

Image Generators with Conditionally-Independent Pixel Synthesis ( http://arxiv.org/abs/2011.13775v1 )

ライセンス: Link先を確認

Ivan Anokhin, Kirill Demochkin, Taras Khakhulin, Gleb Sterkin, Victor Lempitsky, Denis Korzhenkov

(参考訳) 既存の画像生成ネットワークは空間的畳み込みに大きく依存しており、オプションで画像の粗大な合成を徐々に行うことができる。本稿では,各画素における色値を,ランダム潜時ベクトルの値と,その画素の座標から独立に計算する,画像生成のための新しいアーキテクチャを提案する。合成中にピクセル間で情報を伝達する空間畳み込みや類似の操作は関与しない。本研究では, 逆方向の学習において, このようなジェネレータのモデリング能力を解析し, 新しいジェネレータを観察して, 最先端の畳み込みジェネレータに類似した生成品質を実現する。また,新しいアーキテクチャに特有の興味深い特性についても検討した。

Existing image generator networks rely heavily on spatial convolutions and, optionally, self-attention blocks in order to gradually synthesize images in a coarse-to-fine manner. Here, we present a new architecture for image generators, where the color value at each pixel is computed independently given the value of a random latent vector and the coordinate of that pixel. No spatial convolutions or similar operations that propagate information across pixels are involved during the synthesis. We analyze the modeling capabilities of such generators when trained in an adversarial fashion, and observe the new generators to achieve similar generation quality to state-of-the-art convolutional generators. We also investigate several interesting properties unique to the new architecture.

翻訳日:2022-09-20 01:30:14 公開日:2020-11-27

# 期待改善最大化によるcnnのアクティブラーニング

Active Learning in CNNs via Expected Improvement Maximization ( http://arxiv.org/abs/2011.14015v1 )

ライセンス: Link先を確認

Udai G. Nagpal, David A Knowles

(参考訳) convolutional neural networks(cnns)などのディープラーニングモデルは、コンピュータビジョンや最近では計算生物学など、さまざまな領域において高いレベルの有効性を示している。しかし、効果的なモデルのトレーニングには、しばしば大規模なデータセットを組み立てたり、ラベル付けする必要がある。プールベースのアクティブラーニング技術は、これらの問題を軽減し、限られたデータで訓練されたモデルを利用して、学習プロセスを高速化するために、未ラベルのデータポイントをプールから選択的にクエリする。本稿では,提案する「Dropout-based expecteded IMprOvementS」(DEIMOS)について述べる。提案フレームワークは,モデル不確実性を捉える予測共分散行列の維持と,この行列を動的に更新することにより,バッチモード設定における多様な点のバッチを生成する。アクティブラーニングの結果,DIMOSはコンピュータビジョンやゲノミクスから取られた複数の回帰・分類タスクにおいて,既存のベースラインよりも優れていた。

Deep learning models such as Convolutional Neural Networks (CNNs) have demonstrated high levels of effectiveness in a variety of domains, including computer vision and more recently, computational biology. However, training effective models often requires assembling and/or labeling large datasets, which may be prohibitively time-consuming or costly. Pool-based active learning techniques have the potential to mitigate these issues, leveraging models trained on limited data to selectively query unlabeled data points from a pool in an attempt to expedite the learning process. Here we present "Dropout-based Expected IMprOvementS" (DEIMOS), a flexible and computationally-efficient approach to active learning that queries points that are expected to maximize the model's improvement across a representative sample of points. The proposed framework enables us to maintain a prediction covariance matrix capturing model uncertainty, and to dynamically update this matrix in order to generate diverse batches of points in the batch-mode setting. Our active learning results demonstrate that DEIMOS outperforms several existing baselines across multiple regression and classification tasks taken from computer vision and genomics.

翻訳日:2022-09-20 01:30:01 公開日:2020-11-27

# 変圧器を用いた一般マルチラベル画像分類

General Multi-label Image Classification with Transformers ( http://arxiv.org/abs/2011.14027v1 )

ライセンス: Link先を確認

Jack Lanchantin, Tianlu Wang, Vicente Ordonez, Yanjun Qi

(参考訳) マルチラベル画像分類は、画像に存在するオブジェクト、属性、その他のエンティティに対応するラベルの集合を予測するタスクである。本研究では,多ラベル画像分類のための一般的なフレームワークである分類変換器(C-Tran)を提案する。我々のアプローチは、マスク付きラベルの入力セットと畳み込みニューラルネットワークの視覚的特徴を与えられたターゲットラベルのセットを予測するために訓練されたTransformerエンコーダで構成されている。本手法の重要な要素はラベルマスクのトレーニング目的であり、トレーニング中にラベルの状態を正、負、未知と表現するために三元符号化方式を用いる。我々のモデルは、COCOやVisual Genomeのような挑戦的なデータセットに対する最先端のパフォーマンスを示す。さらに,トレーニング中のラベルの不確かさを明示的に表現するモデルであるため,推論中に部分的あるいは余分なラベルアノテーションを用いた画像に対して,よりよい結果が得られることがより一般的である。この追加機能は、COCO、Visual Genome、News500、CUBイメージデータセットで実証する。

Multi-label image classification is the task of predicting a set of labels corresponding to objects, attributes or other entities present in an image. In this work we propose the Classification Transformer (C-Tran), a general framework for multi-label image classification that leverages Transformers to exploit the complex dependencies among visual features and labels. Our approach consists of a Transformer encoder trained to predict a set of target labels given an input set of masked labels, and visual features from a convolutional neural network. A key ingredient of our method is a label mask training objective that uses a ternary encoding scheme to represent the state of the labels as positive, negative, or unknown during training. Our model shows state-of-the-art performance on challenging datasets such as COCO and Visual Genome. Moreover, because our model explicitly represents the uncertainty of labels during training, it is more general by allowing us to produce improved results for images with partial or extra label annotations during inference. We demonstrate this additional capability in the COCO, Visual Genome, News500, and CUB image datasets.

翻訳日:2022-09-20 01:29:39 公開日:2020-11-27

# 深部強化学習を用いたヒューマノイドサッカーロボットのリアルタイムアクティブビジョン

Real-time Active Vision for a Humanoid Soccer Robot Using Deep Reinforcement Learning ( http://arxiv.org/abs/2011.13851v1 )

ライセンス: Link先を確認

Soheil Khatibi, Meisam Teimouri, Mahdi Rezaei

(参考訳) 本稿では,人間型サッカーロボットのための深層強化学習手法を用いたアクティブビジョン手法を提案する。提案手法はロボットの視点を適応的に最適化し,ボールの視点を保ちながら自己局所化のための最も有用なランドマークを得る。アクティブビジョンは、限られた視野を持つヒューマノイド意思決定ロボットにとって重要である。能動視覚問題に対処するために、自己局在モデルの精度に大きく依存する確率論的エントロピーに基づくいくつかのアプローチが提案されている。しかし,本研究では,この問題をエピソディクス強化学習問題として定式化し,深層q学習法を用いて解く。提案するネットワークでは,ロボットの頭部を最高の視点に向けて移動させるために,カメラの生画像のみを必要とする。このモデルは、最高の視点を達成する上で、非常に競争力のある80%の成功率を示します。提案手法をwebotsシミュレータでシミュレーションしたヒューマノイドロボットに実装した。評価と実験結果から,提案手法は自己局所誤差の高い場合において,RoboCupコンテキストにおいてエントロピーに基づく手法よりも優れていることが示された。

In this paper, we present an active vision method using a deep reinforcement learning approach for a humanoid soccer-playing robot. The proposed method adaptively optimises the viewpoint of the robot to acquire the most useful landmarks for self-localisation while keeping the ball into its viewpoint. Active vision is critical for humanoid decision-maker robots with a limited field of view. To deal with an active vision problem, several probabilistic entropy-based approaches have previously been proposed which are highly dependent on the accuracy of the self-localisation model. However, in this research, we formulate the problem as an episodic reinforcement learning problem and employ a Deep Q-learning method to solve it. The proposed network only requires the raw images of the camera to move the robot's head toward the best viewpoint. The model shows a very competitive rate of 80% success rate in achieving the best viewpoint. We implemented the proposed method on a humanoid robot simulated in Webots simulator. Our evaluations and experimental results show that the proposed method outperforms the entropy-based methods in the RoboCup context, in cases with high self-localisation errors.

翻訳日:2022-09-20 01:29:21 公開日:2020-11-27

# エンティティの協調抽出と情報冗長性除去との関係

Joint Extraction of Entity and Relation with Information Redundancy Elimination ( http://arxiv.org/abs/2011.13565v1 )

ライセンス: Link先を確認

Yuanhao Shen and Jungang Han

(参考訳) 冗長な情報とエンティティと関係抽出モデルの重複関係の問題を解決するために,共同抽出モデルを提案する。このモデルは、関係のない冗長な情報を生成することなく、複数の関連エンティティを直接抽出することができる。また,エンコーダ-LSTMと呼ばれる再帰型ニューラルネットワークを提案し,文をモデル化する再帰型ユニットの能力を高める。具体的には、名前付きエンティティ認識サブモジュールは、事前訓練された言語モデルとLSTMデコーダ層で構成され、エンコーダ-LSTMネットワークを使用して関連するエンティティペア間の順序関係をモデル化するエンティティペア抽出サブモジュールと、注意機構を含む関係分類サブモジュールである。本モデルの有効性を評価するために, adeおよびconll04の公開データセットについて実験を行った。提案手法は,エンティティと関係抽出のタスクにおいて良好な性能を示し,冗長な情報の量を大幅に削減できることを示す。

To solve the problem of redundant information and overlapping relations of the entity and relation extraction model, we propose a joint extraction model. This model can directly extract multiple pairs of related entities without generating unrelated redundant information. We also propose a recurrent neural network named Encoder-LSTM that enhances the ability of recurrent units to model sentences. Specifically, the joint model includes three sub-modules: the Named Entity Recognition sub-module consisted of a pre-trained language model and an LSTM decoder layer, the Entity Pair Extraction sub-module which uses Encoder-LSTM network to model the order relationship between related entity pairs, and the Relation Classification sub-module including Attention mechanism. We conducted experiments on the public datasets ADE and CoNLL04 to evaluate the effectiveness of our model. The results show that the proposed model achieves good performance in the task of entity and relation extraction and can greatly reduce the amount of redundant information.

翻訳日:2022-09-20 01:29:03 公開日:2020-11-27

# センサネットワーク上の分散変分ベイズアルゴリズム

Distributed Variational Bayesian Algorithms Over Sensor Networks ( http://arxiv.org/abs/2011.13600v1 )

ライセンス: Link先を確認

Junhao Hua, Chunguang Li

(参考訳) センサネットワークのコンテキストにおけるベイズフレームワークの分散推論/推定は、その幅広い適用性のために最近注目を集めている。変分ベイズアルゴリズム(英: variational bayesian algorithm)は、ベイズ推論で生じる難解な積分を近似する手法である。本稿では,非常に一般的な共役指数モデルに適用可能な一般ベイズ推論問題に対する2つの分散vbアルゴリズムを提案する。最初のアプローチでは、各ノードにおける大域的自然パラメータは、近似空間のリーマン幾何学を利用する確率的自然勾配を用いて最適化され、続いて隣人と協調するための情報拡散ステップが与えられる。第2の方法では、分散推定のための制約付き最適化定式化を自然パラメータ空間に確立し、乗算器の交互方向法(admm)により解く。次に,提案手法の有効性を評価するために,ベイズ混合モデルの分散推論・推定の応用について述べる。合成データと実データの両方のシミュレーションにより、提案アルゴリズムは優れた性能を持つことが示され、これは核融合センターで利用可能な全データに依存するvbアルゴリズムにほぼ匹敵する。

Distributed inference/estimation in Bayesian framework in the context of sensor networks has recently received much attention due to its broad applicability. The variational Bayesian (VB) algorithm is a technique for approximating intractable integrals arising in Bayesian inference. In this paper, we propose two novel distributed VB algorithms for general Bayesian inference problem, which can be applied to a very general class of conjugate-exponential models. In the first approach, the global natural parameters at each node are optimized using a stochastic natural gradient that utilizes the Riemannian geometry of the approximation space, followed by an information diffusion step for cooperation with the neighbors. In the second method, a constrained optimization formulation for distributed estimation is established in natural parameter space and solved by alternating direction method of multipliers (ADMM). An application of the distributed inference/estimation of a Bayesian Gaussian mixture model is then presented, to evaluate the effectiveness of the proposed algorithms. Simulations on both synthetic and real datasets demonstrate that the proposed algorithms have excellent performance, which are almost as good as the corresponding centralized VB algorithm relying on all data available in a fusion center.

翻訳日:2022-09-20 01:28:17 公開日:2020-11-27

# 教師の繰り返し強制と再重み付けによるマルチタスクmrイメージング

Multi-task MR Imaging with Iterative Teacher Forcing and Re-weighted Deep Learning ( http://arxiv.org/abs/2011.13614v1 )

ライセンス: Link先を確認

Kehan Qi, Yu Gong, Xinfeng Liu, Xin Liu, Hairong Zheng, Shanshan Wang

(参考訳) 磁気共鳴(MR)再構成によるノイズ、アーティファクト、情報の喪失は、下流アプリケーションの最終性能を損なう可能性がある。本稿では,既存のビッグデータから事前知識を学習するマルチタスク深層学習手法を開発し,これらの知識を用いて,アンサンプリングk空間データからのmr再構成とセグメンテーションの同時支援を行う。マルチタスク深層学習フレームワークは,動的再重み付き損失制約 (DRLC) の下で設計した反復型教師強制スキーム (ITFS) によって統合・訓練された2つのネットワークサブモジュールを備える。 ITFSは、完全にサンプル化されたデータをトレーニングプロセスに注入することで、エラーの蓄積を避けるように設計されている。マルチタスクの精度を共プロパントするために,リコンストラクションとセグメンテーションサブモジュールからの貢献を動的にバランスさせるdrlcを提案する。提案手法は,2つのオープンデータセットと1つのin vivo内データセットを用いて評価し,6つの最先端手法と比較した。提案手法は,同時的かつ正確なMR再構成とセグメンテーションの促進機能を有することを示す。

Noises, artifacts, and loss of information caused by the magnetic resonance (MR) reconstruction may compromise the final performance of the downstream applications. In this paper, we develop a re-weighted multi-task deep learning method to learn prior knowledge from the existing big dataset and then utilize them to assist simultaneous MR reconstruction and segmentation from the under-sampled k-space data. The multi-task deep learning framework is equipped with two network sub-modules, which are integrated and trained by our designed iterative teacher forcing scheme (ITFS) under the dynamic re-weighted loss constraint (DRLC). The ITFS is designed to avoid error accumulation by injecting the fully-sampled data into the training process. The DRLC is proposed to dynamically balance the contributions from the reconstruction and segmentation sub-modules so as to co-prompt the multi-task accuracy. The proposed method has been evaluated on two open datasets and one in vivo in-house dataset and compared to six state-of-the-art methods. Results show that the proposed method possesses encouraging capabilities for simultaneous and accurate MR reconstruction and segmentation.

翻訳日:2022-09-20 01:27:58 公開日:2020-11-27

# Manifold Disentanglement を用いた医用画像翻訳の操作

Manipulating Medical Image Translation with Manifold Disentanglement ( http://arxiv.org/abs/2011.13615v1 )

ライセンス: Link先を確認

Siyu Liu, Jason A. Dowling, Craig Engstrom, Peter B. Greer, Stuart Crozier, Shekhar S. Chandra

(参考訳) 医用画像変換(ctからmrへ)は、i)ドメイン不変特徴の忠実な翻訳(解剖学的構造の形状情報など)、ii)ターゲット領域特徴の現実的な合成(mrにおける組織出現など)を必要とするため、難しい課題である。本研究では,この2つの特徴を明示的にモデル化する新しい画像翻訳フレームワークであるmdgan(mandular disentanglement generative adversarial network)を提案する。完全畳み込み生成器を使用してドメイン不変な特徴をモデル化し、スタイルコードを使用して対象領域の特徴を多様体として別々にモデル化する。この設計は、ドメイン不変の機能とドメイン固有の機能を明確に切り離し、双方を個別に制御することを目的としている。画像変換処理はスタイライゼーションタスクとして定式化され、入力は学習多様体からサンプリングされたスタイルコードに基づいて、様々なターゲットドメインイメージに「スタイライゼーション」(翻訳)される。 MDGANをマルチモーダルな医用画像変換のためにテストし、この多様体上に2つのドメイン固有の多様体クラスタを作成し、セグメント化マップを擬似CTと擬似MR画像に変換する。 MR多様体クラスタを横切る経路をトラバースすることで、入力から形状情報を保持しながら目標出力を操作可能であることを示す。

Medical image translation (e.g. CT to MR) is a challenging task as it requires I) faithful translation of domain-invariant features (e.g. shape information of anatomical structures) and II) realistic synthesis of target-domain features (e.g. tissue appearance in MR). In this work, we propose Manifold Disentanglement Generative Adversarial Network (MDGAN), a novel image translation framework that explicitly models these two types of features. It employs a fully convolutional generator to model domain-invariant features, and it uses style codes to separately model target-domain features as a manifold. This design aims to explicitly disentangle domain-invariant features and domain-specific features while gaining individual control of both. The image translation process is formulated as a stylisation task, where the input is "stylised" (translated) into diverse target-domain images based on style codes sampled from the learnt manifold. We test MDGAN for multi-modal medical image translation, where we create two domain-specific manifold clusters on the manifold to translate segmentation maps into pseudo-CT and pseudo-MR images, respectively. We show that by traversing a path across the MR manifold cluster, the target output can be manipulated while still retaining the shape information from the input.

翻訳日:2022-09-20 01:27:38 公開日:2020-11-27

# ほとんど訓練のない多目的ニューラルアーキテクチャ探索

Multi-objective Neural Architecture Search with Almost No Training ( http://arxiv.org/abs/2011.13591v1 )

ライセンス: Link先を確認

Shengran Hu, Ran Cheng, Cheng He, Zhichao Lu

(参考訳) 近年、ニューラルアーキテクチャサーチ(NAS)は、学術と産業の両方から注目を集めている。印象的な実験結果の安定した流れにもかかわらず、既存のNASアルゴリズムのほとんどは、確率勾配降下(SGD)トレーニングのコストのかかる反復のために計算的に実行を禁止している。本研究では,ネットワークアーキテクチャの性能を迅速に評価するために,ランダムウェイト評価(rwe)と呼ばれる効果的な代替案を提案する。最後の線形分類層をトレーニングすることによって、rweはアーキテクチャを評価する計算コストを数時間から秒に短縮する。進化的多目的アルゴリズムに統合されると、rweは1つのgpuカードで2時間未満の検索でcifar-10で最先端のパフォーマンスを持つ一連の効率的なアーキテクチャを得る。 imagenetに対するランク次相関と転送学習実験に関するアブレーション研究は、rweの有効性をさらに検証した。

In the recent past, neural architecture search (NAS) has attracted increasing attention from both academia and industries. Despite the steady stream of impressive empirical results, most existing NAS algorithms are computationally prohibitive to execute due to the costly iterations of stochastic gradient descent (SGD) training. In this work, we propose an effective alternative, dubbed Random-Weight Evaluation (RWE), to rapidly estimate the performance of network architectures. By just training the last linear classification layer, RWE reduces the computational cost of evaluating an architecture from hours to seconds. When integrated within an evolutionary multi-objective algorithm, RWE obtains a set of efficient architectures with state-of-the-art performance on CIFAR-10 with less than two hours' searching on a single GPU card. Ablation studies on rank-order correlations and transfer learning experiments to ImageNet have further validated the effectiveness of RWE.

翻訳日:2022-09-20 01:20:17 公開日:2020-11-27

# 勾配の多様性と不確かさに基づくシーケンスラベリングのための深層能動的学習

Deep Active Learning for Sequence Labeling Based on Diversity and Uncertainty in Gradient ( http://arxiv.org/abs/2011.13570v1 )

ライセンス: Link先を確認

Yekyung Kim

(参考訳) 近年,自然言語処理タスクのアクティブラーニング(al)によるデータ依存の軽減が研究されている。しかし、クエリ選択においては、ほとんどの研究は、主に不確実性に基づくサンプリングに依存しており、一般にラベルなしデータの構造情報を活用していない。これにより、バッチアクティブな学習設定におけるサンプリングバイアスが発生し、同時に複数のサンプルを選択する。本研究では,シーケンスラベリングタスクに不確実性と多様性の両方を組み込んだ場合,アクティブラーニングを用いてラベル付きトレーニングデータの量を削減できることを実証する。我々は,複数のタスク,データセット,モデルにまたがる勾配埋め込みアプローチにおいて,重み付けされた多様性を選択することでシーケンスベースアプローチの効果を検討した。

Recently, several studies have investigated active learning (AL) for natural language processing tasks to alleviate data dependency. However, for query selection, most of these studies mainly rely on uncertainty-based sampling, which generally does not exploit the structural information of the unlabeled data. This leads to a sampling bias in the batch active learning setting, which selects several samples at once. In this work, we demonstrate that the amount of labeled training data can be reduced using active learning when it incorporates both uncertainty and diversity in the sequence labeling task. We examined the effects of our sequence-based approach by selecting weighted diverse in the gradient embedding approach across multiple tasks, datasets, models, and consistently outperform classic uncertainty-based sampling and diversity-based sampling.

翻訳日:2022-09-20 01:20:06 公開日:2020-11-27

# ナラティブ知識グラフにおける関係クラスタリング

Relation Clustering in Narrative Knowledge Graphs ( http://arxiv.org/abs/2011.13647v1 )

ライセンス: Link先を確認

Simone Mellace, K Vani, Alessandro Antonucci

(参考訳) 小説や短編などの文学的文章を扱う場合、ナレッジグラフの形で構造化された情報の抽出は、小説の登場人物に対応するエンティティとそれらに関する監督された情報を集めるための適切なハードルとの間の膨大な関係によって妨げられる可能性がある。原文のリレーショナル文は(SBERTと)組み込まれ、意味論的に類似した関係をまとめるためにクラスタ化される。同じクラスタ内のすべての文は最終的に(BARTで)要約され、要約から抽出された記述ラベルが抽出される。予備テストでは、このようなクラスタリングが類似した関係をうまく検出でき、半教師付きアプローチのための貴重な前処理を提供することが示された。

When coping with literary texts such as novels or short stories, the extraction of structured information in the form of a knowledge graph might be hindered by the huge number of possible relations between the entities corresponding to the characters in the novel and the consequent hurdles in gathering supervised information about them. Such issue is addressed here as an unsupervised task empowered by transformers: relational sentences in the original text are embedded (with SBERT) and clustered in order to merge together semantically similar relations. All the sentences in the same cluster are finally summarized (with BART) and a descriptive label extracted from the summary. Preliminary tests show that such clustering might successfully detect similar relations, and provide a valuable preprocessing for semi-supervised approaches.

翻訳日:2022-09-20 01:19:52 公開日:2020-11-27

# 逐次混合によるリカレントニューラルネットワークの正規化

Regularizing Recurrent Neural Networks via Sequence Mixup ( http://arxiv.org/abs/2012.07527v1 )

ライセンス: Link先を確認

Armin Karamzade, Amir Najafi and Seyed Abolfazl Motahari

(参考訳) 本稿では,入力混合(Zhang et al., 2017)とマニフォールド混合(Verma et al., 2018)という,フィードフォワードニューラルネットワークにもともと提案されていた有名な正規化手法を,リカレントニューラルネットワーク(RNN)の領域に拡張する。提案手法は実装が容易で計算量も少ないが,様々なタスクにおいて単純なニューラルアーキテクチャの性能を活用している。我々は、実世界のデータセットに関するいくつかの実験を通して、我々の主張を検証するとともに、提案手法の性質と潜在的影響をさらに調査するための漸近的な理論的分析を提供する。 CoNLL-2003データ(Sang and De Meulder, 2003)上で, BiLSTM-CRFモデル(Huang et al., 2015)を名前付きエンティティ認識タスクに適用することにより,テストステージにおけるF-1スコアを改善し,損失を大幅に低減した。

In this paper, we extend a class of celebrated regularization techniques originally proposed for feed-forward neural networks, namely Input Mixup (Zhang et al., 2017) and Manifold Mixup (Verma et al., 2018), to the realm of Recurrent Neural Networks (RNN). Our proposed methods are easy to implement and have a low computational complexity, while leverage the performance of simple neural architectures in a variety of tasks. We have validated our claims through several experiments on real-world datasets, and also provide an asymptotic theoretical analysis to further investigate the properties and potential impacts of our proposed techniques. Applying sequence mixup to BiLSTM-CRF model (Huang et al., 2015) to Named Entity Recognition task on CoNLL-2003 data (Sang and De Meulder, 2003) has improved the F-1 score on the test stage and reduced the loss, considerably.

翻訳日:2022-09-20 01:19:20 公開日:2020-11-27

# デモとラベルなし体験によるオフライン学習

Offline Learning from Demonstrations and Unlabeled Experience ( http://arxiv.org/abs/2011.13885v1 )

ライセンス: Link先を確認

Konrad Zolna, Alexander Novikov, Ksenia Konyushkova, Caglar Gulcehre, Ziyu Wang, Yusuf Aytar, Misha Denil, Nando de Freitas, Scott Reed

(参考訳) 行動クローニング(BC)は、専門家によるデモンストレーションに関する教師あり学習によって、報酬なしでポリシーをオフラインでトレーニングできるため、ロボット学習において実用的であることが多い。しかし、bcは、私たちがラベルのない経験と呼ぶもの、すなわち、報酬のアノテーションなしで、混合品質と未知の品質のデータを有効に活用しません。このラベルのないデータは、人間の遠隔操作、スクリプト化されたポリシー、および同じロボット上の他のエージェントなど、さまざまなソースによって生成される。このラベルのない体験を利用できるデータ駆動型オフラインロボット学習に向けて、Offline Reinforced Imitation Learning (ORIL)を紹介する。 ORILはまず、実証者や未ラベルの軌跡からの観察を対比して報酬関数を学び、次にすべてのデータを学習報酬で注釈付けし、最後にオフラインの強化学習を通じてエージェントを訓練する。各種の連続制御およびロボット操作タスクのシミュレーションにより、ORILはラベルなし体験を効果的に活用することにより、同等のBCエージェントよりも一貫して優れていることを示す。

Behavior cloning (BC) is often practical for robot learning because it allows a policy to be trained offline without rewards, by supervised learning on expert demonstrations. However, BC does not effectively leverage what we will refer to as unlabeled experience: data of mixed and unknown quality without reward annotations. This unlabeled data can be generated by a variety of sources such as human teleoperation, scripted policies and other agents on the same robot. Towards data-driven offline robot learning that can use this unlabeled experience, we introduce Offline Reinforced Imitation Learning (ORIL). ORIL first learns a reward function by contrasting observations from demonstrator and unlabeled trajectories, then annotates all data with the learned reward, and finally trains an agent via offline reinforcement learning. Across a diverse set of continuous control and simulated robotic manipulation tasks, we show that ORIL consistently outperforms comparable BC agents by effectively leveraging unlabeled experience.

翻訳日:2022-09-20 01:18:58 公開日:2020-11-27

PDF登録状況（公開日: 20201127）