Fugu-MT: arXiv paper translations

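This site provides Japanese translations of arXiv papers that are 30 pages or fewer and released under a Creative Commons license (CC 0, CC BY, CC BY-SA). For papers whose body is not CC-licensed, or that are too long, only the metadata is translated. (arXiv metadata is CC 0.) The translations are licensed under CC BY-SA 4.0. Translation is performed with Fugu-Machine Translator.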

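The operator of this site does not guarantee the quality of the site (including all information and translations) and accepts no responsibility for any consequences arising from the use of the site (including all information and translations).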

Papers with a publication date of 2021-08-21.

Title	Authors	Abstract	Publication date / translation date
# Optimal resource cost for error mitigation ( http://arxiv.org/abs/2006.12509v3 ) License: see the linked page	Ryuji Takagi	One of the central problems for near-term quantum devices is to understand their ultimate potential and limitations. We address this problem in terms of quantum error mitigation by introducing a framework taking into account the full expressibility of near-term devices, in which the optimal resource cost for the probabilistic error cancellation method can be formalized. We provide a general methodology for evaluating the optimal cost by connecting it to a resource-theoretic quantifier defined with respect to the noisy operations that devices can implement. We employ our methods to estimate the optimal cost in mitigating a general class of noise, where we obtain an achievable cost that has a generic advantage over previous evaluations, as well as a fundamental lower bound applicable to a broad class of noisy implementable operations. We improve our bounds for several noise models, where we give the exact optimal costs for the depolarizing and dephasing noise, precisely characterizing the overhead cost while offering an operational meaning to the resource measure in terms of error mitigation. Our result particularly implies that the heuristic approach presented by Temme et al. [K. Temme, S. Bravyi, and J. M. Gambetta, Phys. Rev. Lett. 119, 180509 (2017)] is optimal even in our extended framework, putting fundamental limitations on the advantage provided by the extra degrees of freedom inherent in near-term devices for this noise model.	Translated: 2023-05-13 04:51:42 Published: 2021-08-21
# A Tensor Network based Decision Diagram for Representation of Quantum Circuits ( http://arxiv.org/abs/2009.02618v2 ) License: see the linked page	Xin Hong, Xiangzhen Zhou, Sanjiang Li, Yuan Feng, Mingsheng Ying	Tensor networks have been successfully applied in simulation of quantum physical systems for decades. Recently, they have also been employed in classical simulation of quantum computing, in particular, random quantum circuits. This paper proposes a decision diagram style data structure, called TDD (Tensor Decision Diagram), for more principled and convenient applications of tensor networks. This new data structure provides a compact and canonical representation for quantum circuits. By exploiting circuit partition, the TDD of a quantum circuit can be computed efficiently. Furthermore, we show that the operations of tensor networks essential in their applications (e.g., addition and contraction) can also be implemented efficiently in TDDs. A proof-of-concept implementation of TDDs is presented and its efficiency is evaluated on a set of benchmark quantum circuits. It is expected that TDDs will play an important role in various design automation tasks related to quantum circuits, including but not limited to equivalence checking, error detection, synthesis, simulation, and verification.	Translated: 2023-05-03 11:33:23 Published: 2021-08-21
# Quantum reference frame transformations as symmetries and the paradox of the third particle ( http://arxiv.org/abs/2011.01951v2 ) License: see the linked page	Marius Krumm, Philipp A. Hoehn, Markus P. Mueller	In a quantum world, reference frames are ultimately quantum systems too -- but what does it mean to "jump into the perspective of a quantum particle"? In this work, we show that quantum reference frame (QRF) transformations appear naturally as symmetries of simple physical systems. This allows us to rederive and generalize known QRF transformations within an alternative, operationally transparent framework, and to shed new light on their structure and interpretation. We give an explicit description of the observables that are measurable by agents constrained by such quantum symmetries, and apply our results to a puzzle known as the `paradox of the third particle'. We argue that it can be reduced to the question of how to relationally embed fewer into more particles, and give a thorough physical and algebraic analysis of this question. This leads us to a generalization of the partial trace (`relational trace') which arguably resolves the paradox, and it uncovers important structures of constraint quantization within a simple quantum information setting, such as relational observables which are key in this resolution. While we restrict our attention to finite Abelian groups for transparency and mathematical rigor, the intuitive physical appeal of our results makes us expect that they remain valid in more general situations.	Translated: 2023-04-25 11:39:38 Published: 2021-08-21
# Application of Quantum Machine Learning using the Quantum Variational Classifier Method to High Energy Physics Analysis at the LHC on IBM Quantum Computer Simulator and Hardware with 10 qubits ( http://arxiv.org/abs/2012.11560v2 ) License: see the linked page	Sau Lan Wu, Jay Chan, Wen Guan, Shaojun Sun, Alex Wang, Chen Zhou, Miron Livny, Federico Carminati, Alberto Di Meglio, Andy C. Y. Li, Joseph Lykken, Panagiotis Spentzouris, Samuel Yen-Chi Chen, Shinjae Yoo and Tzu-Chieh Wei	One of the major objectives of the experimental programs at the LHC is the discovery of new physics. This requires the identification of rare signals in immense backgrounds. Using machine learning algorithms greatly enhances our ability to achieve this objective. With the progress of quantum technologies, quantum machine learning could become a powerful tool for data analysis in high energy physics. In this study, using IBM gate-model quantum computing systems, we employ the quantum variational classifier method in two recent LHC flagship physics analyses: $t\bar{t}H$ (Higgs boson production in association with a top quark pair) and $H\rightarrow\mu^{+}\mu^{-}$ (Higgs boson decays to two muons, probing the Higgs boson couplings to second-generation fermions). We have obtained early results with 10 qubits on the IBM quantum simulator and the IBM quantum hardware. With small training samples of 100 events on the quantum simulator, the quantum variational classifier method performs similarly to classical algorithms such as SVM (support vector machine) and BDT (boosted decision tree), which are often employed in LHC physics analyses. On the quantum hardware, the quantum variational classifier method has shown promising discrimination power, comparable to that on the quantum simulator. This study demonstrates that quantum machine learning has the ability to differentiate between signal and background in realistic physics datasets. We foresee the usage of quantum machine learning in future high-luminosity LHC physics analyses, including measurements of the Higgs boson self-couplings and searches for dark matter.	Translated: 2023-04-20 00:19:10 Published: 2021-08-21
# On the Non-Uniqueness of Statistical Ensembles Defining a Density Operator and a Class of Mixed Quantum States with Integrable Wigner Distribution ( http://arxiv.org/abs/2103.05605v4 ) License: see the linked page	Charlyne de Gosson and Maurice de Gosson	It is standard to assume that the Wigner distribution of a mixed quantum state consisting of square-integrable functions is a quasi-probability distribution, that is, that its integral is one and that the marginal properties are satisfied. However, this is in general not true. We introduce a class of quantum states for which this property is satisfied; these states are dubbed "Feichtinger states" because they are defined in terms of a class of functional spaces (modulation spaces) introduced in the 1980s by H. Feichtinger. The properties of these states are studied, which gives us the opportunity to prove an extension to the general case of a result of Jaynes on the non-uniqueness of the statistical ensemble generating a density operator. As a bonus we obtain a result for convex sums of Wigner transforms.	Translated: 2023-04-08 15:52:22 Published: 2021-08-21
# Instantons and Berry's connections on quantum graph ( http://arxiv.org/abs/2104.02311v2 ) License: see the linked page	Tomonori Inoue, Makoto Sakamoto, Inori Ueba	In this paper, we study non-Abelian Berry's connections in the parameter space of boundary conditions for Dirac zero modes on quantum graphs. We apply the ADHM construction, which is the method for constructing Yang-Mills instanton solutions, to the Berry's connections. Then we find that the instanton configurations appear as the Berry's connections.	Translated: 2023-04-05 06:29:22 Published: 2021-08-21
# Deterministic quantum one-time pad via Fibonacci anyons ( http://arxiv.org/abs/2104.05911v2 ) License: see the linked page	Cheng-Qian Xu, D. L. Zhou	Anyonic states, whose topological robustness originates from the peculiar structure of their Hilbert space, have important applications in quantum computing and quantum communication. When an anyonic state is used as an information carrier of the deterministic quantum one-time pad (DQOTP), we find that the Fibonacci particle-antiparticle pair produced from vacuum can be used to asymptotically send $2\log_2 d_{\tau}$ bits of classical information ($d_{\tau}$ is the quantum dimension of a Fibonacci anyon $\tau$), which is equal to the anyonic mutual information of the pair. Furthermore, by studying the DQOTP via a parameterized state of six Fibonacci anyons with trivial total charge, we give the analytical results of the maximum number of messages that can be sent for different parameters, which is a step function with every step corresponding to a regular simplex from the viewpoint of geometry. The results for the maximum number of messages sent by the DQOTP can be explained by the anyonic accessible information.	Translated: 2023-04-03 23:49:34 Published: 2021-08-21
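The capacity quoted in the entry above (arXiv:2104.05911), $2\log_2 d_{\tau}$ bits, is easy to evaluate numerically. The sketch below assumes the standard value of the Fibonacci anyon's quantum dimension, the golden ratio $d_{\tau} = (1+\sqrt{5})/2$, which the abstract itself does not state; the variable names are illustrative only.

```python
import math

# Quantum dimension of the Fibonacci anyon tau (assumption: the golden ratio).
d_tau = (1 + math.sqrt(5)) / 2

# The golden ratio satisfies d^2 = d + 1, mirroring the fusion rule tau x tau = 1 + tau.
assert math.isclose(d_tau ** 2, d_tau + 1)

# Asymptotic classical capacity quoted in the abstract: 2 * log2(d_tau) bits per pair.
bits_per_pair = 2 * math.log2(d_tau)
print(f"d_tau = {d_tau:.6f}, capacity = {bits_per_pair:.4f} bits")
```

Under this assumption the pair carries roughly 1.39 bits of classical information.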
# Interplay between polarization and quantum correlations of confined polaritons ( http://arxiv.org/abs/2104.13541v2 ) License: see the linked page	Olivier Bleu, Jesper Levinsen, Meera M. Parish	We investigate polariton quantum correlations in a coherently driven box cavity in the low driving regime, with a particular focus on accounting for the polarization degree of freedom. The possibility of having different interaction strengths between co- and cross-circularly polarized polaritons as well as a realistic linear-polarization splitting allows one to model the system as two coupled nonlinear resonators with both self- and cross-Kerr-like nonlinearities, thus making our results potentially relevant for other experimental platforms. Within an effective wave-function approach, we obtain analytical expressions for the steady-state polarization-resolved polariton populations and second-order correlation functions, which agree very well with our numerical results obtained from a Lindblad master equation. Notably, we highlight that depending on the excitation polarization (circular or linear), both the unconventional (interference-mediated) and conventional (mediated by nonlinearities) antibunchings can be investigated in a single cavity. Moreover, using our results, we argue that recent experiments on confined fiber-cavity polaritons are likely to have probed a regime where the dominant interaction is between cross-polarized polaritons, which is characteristic of the polariton Feshbach resonance. We furthermore investigate the regime close to resonance using a two-channel model, and we show that systems with large biexciton binding energies, such as atomically thin semiconductors, are promising platforms for realizing strong polariton antibunching.	Translated: 2023-04-02 04:44:31 Published: 2021-08-21
# Gaussian quantum steering under the influence of a dilaton black hole ( http://arxiv.org/abs/2104.14738v2 ) License: see the linked page	Biwei Hu, Cuihong Wen, Jieci Wang, Jiliang Jing	We study the dynamics of Gaussian quantum steering in the background of a Garfinkle-Horowitz-Strominger dilaton black hole. It is found that the gravity induced by the dilaton field will destroy the quantum steerability between the inertial observer Alice and the observer Bob who hovers outside the event horizon, while it generates steering-type quantum correlations between the causally disconnected regions. Therefore, the observers can steer each other's state by local measurements even though they are separated by the event horizon. Unlike quantum entanglement in the dilaton spacetime, the quantum steering experiences catastrophic behaviors such as "sudden death" and "sudden birth" with increasing dilaton charge. In addition, the dilaton gravity destroys the symmetry of Gaussian steering and the latter is always asymmetric in the dilaton spacetime. Interestingly, the attainment of maximal steering asymmetry indicates the critical point between one-way and two-way steering for the two-mode Gaussian state in the dilaton spacetime.	Translated: 2023-04-02 00:01:38 Published: 2021-08-21
# Vacuum birefringence at x-ray free-electron lasers ( http://arxiv.org/abs/2105.13869v2 ) License: see the linked page	Felix Karbstein, Chantal Sundqvist, Kai S. Schulze, Ingo Uschmann, Holger Gies, Gerhard G. Paulus	We study the perspectives of measuring the phenomenon of vacuum birefringence predicted by quantum electrodynamics using an x-ray free-electron laser (XFEL) alone. We devise an experimental scheme allowing the XFEL beam to collide with itself under a finite angle, and thus act as both pump and probe field for the effect. The signature of vacuum birefringence is encoded in polarization-flipped signal photons to be detected with high-purity x-ray polarimetry. Our findings for idealized scenarios underline that the discovery potential of solely XFEL-based setups can be comparable to those involving optical high-intensity lasers. For currently achievable scenarios, we identify several key details of the x-ray optical ingredients that exert a strong influence on the magnitude of the desired signatures.	Translated: 2023-03-29 04:27:26 Published: 2021-08-21
# Macroscopic production of spin-polarized hydrogen atoms from the IR-excitation and photodissociation of molecular beams ( http://arxiv.org/abs/2108.06133v2 ) License: see the linked page	C. S. Kannis and J. Suarez and T. P. Rakitzis	We describe methods for the production of spin-polarized H and D atoms from the IR-excitation and photodissociation of molecular beams of HBr, HI, and ${\rm NH_{3}}$ isotopes, including optical excitation schemes with partial hyperfine resolution. We discuss the extent to which the production rates may approach the IR-laser production rates of ${\rm 10^{21}\, photons\, s^{-1}}$, and how the production rates of conventional methods of ${\sim}{\rm 10^{17} \, s^{-1}}$ may be surpassed significantly.	Translated: 2023-03-18 15:06:53 Published: 2021-08-21
# Ranked diffusion, delta Bose gas and Burgers equation ( http://arxiv.org/abs/2108.09515v1 ) License: see the linked page	Pierre Le Doussal	We study the diffusion of $N$ particles in one dimension interacting via a drift proportional to their rank. In the attractive case (self-gravitating gas) a mapping to the Lieb-Liniger quantum model allows one to obtain stationary time correlations, return probabilities and the decay rate to the stationary state. The rank field obeys a Burgers equation, which we analyze. It allows one to obtain the stationary density at large $N$ in an external potential $V(x)$ (in the repulsive case). In the attractive case the decay rate to the steady state is found to depend on the initial condition if its spatial decay is slow enough. Coulomb gas methods allow one to study the final equilibrium at large $N$.	Translated: 2023-03-17 21:10:15 Published: 2021-08-21
# Photon antibunching in a cavity-QED system with two Rydberg-Rydberg interaction atoms ( http://arxiv.org/abs/2108.09470v1 ) License: see the linked page	Tong Huang, Lei Tan	We propose how to achieve a strong photon antibunching effect in a cavity-QED system coupled with two Rydberg-Rydberg interaction atoms. By calculating the equal-time second-order correlation function $g^{(2)}(0)$, we find that the unconventional photon blockade and the conventional photon blockade appear in the atom-driven scheme, and they are both significantly affected by the Rydberg-Rydberg interaction. We also find that under appropriate parameters, the photon antibunching and the mean photon number can be significantly enhanced by combining the conventional photon blockade and the unconventional photon blockade. In the cavity-driven scheme, the existence of the Rydberg-Rydberg interaction severely destroys the photon antibunching under the unconventional photon blockade mechanism. These results will help to guide the implementation of the single photon emitter in the Rydberg atoms-cavity system.	Translated: 2023-03-17 21:10:04 Published: 2021-08-21
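The equal-time second-order correlation function used in the entry above (arXiv:2108.09470) can be illustrated without any cavity model. For a single mode it depends only on the photon-number distribution $p_n$, via $g^{(2)}(0) = \langle n(n-1)\rangle / \langle n\rangle^2$. The sketch below evaluates this for three textbook distributions; it is not the paper's Rydberg-cavity calculation, and the function name, the cutoff, and the chosen mean photon numbers are assumptions made for illustration.

```python
import math

def g2_zero(p):
    """Equal-time second-order correlation from a photon-number distribution p[n]."""
    mean_n = sum(n * pn for n, pn in enumerate(p))
    mean_nn1 = sum(n * (n - 1) * pn for n, pn in enumerate(p))
    return mean_nn1 / mean_n ** 2

cutoff = 60  # Fock-space truncation; ample for the mean photon number of 1 used below

single_photon = [0.0, 1.0]                                              # one-photon Fock state
coherent = [math.exp(-1.0) / math.factorial(n) for n in range(cutoff)]  # Poisson, mean 1
thermal = [0.5 ** (n + 1) for n in range(cutoff)]                       # thermal, mean 1

for name, dist in [("single photon", single_photon),
                   ("coherent", coherent),
                   ("thermal", thermal)]:
    print(f"{name}: g2(0) = {g2_zero(dist):.6f}")
```

Values below 1 indicate antibunching (the single-photon state gives 0), a coherent state gives 1, and thermal light gives 2.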
# Momentum considerations inside near-zero index materials ( http://arxiv.org/abs/2108.09450v1 ) License: see the linked page	Michaël Lobet and Iñigo Liberal and Larissa Vertchenko and Andrei Lavrinenko and Nader Engheta and Eric Mazur	Near-zero-index (NZI) materials, i.e. materials having a phase refractive index close to zero, are known to enhance or inhibit light-matter interactions. Most theoretical derivations of fundamental radiative processes rely on energetic considerations and detailed balance equations, but not on momentum considerations. Because momentum exchange should also be incorporated into theoretical models, we investigate momentum inside the three categories of NZI materials, i.e. inside epsilon-and-mu near-zero (EMNZ), epsilon-near-zero (ENZ) and mu-near-zero (MNZ) materials. In the context of the Abraham-Minkowski debate in dispersive materials, we show that the Minkowski-canonical momentum of light is zero inside all categories of NZI materials while the Abraham-kinetic momentum of light is zero in ENZ and MNZ materials but nonzero inside EMNZ materials. We theoretically demonstrate that momentum recoil, transfer momentum from the field to the atom and Doppler shift are inhibited in NZI materials. Fundamental radiative processes inhibition is also explained due to those momentum considerations inside three-dimensional NZI materials. Lastly, absence of diffraction pattern in slits experiments is seen as a consequence of zero Minkowski momentum. Those findings are appealing for a better understanding of fundamental light-matter interactions at the nanoscale as well as for lasing applications.	Translated: 2023-03-17 21:09:19 Published: 2021-08-21
# Virial ansätze for the Schrödinger Equation with a symmetric strictly convex potential. Part II ( http://arxiv.org/abs/2108.09427v1 ) License: see the linked page	S. P. Flego	A procedure was recently introduced in the literature to obtain ansätze, free of parameters, for the eigenfunctions of the time-independent Schrödinger equation with symmetric convex potential. In the present work, we test this technique in regard to $x^{2\kappa}$-type potentials. We study the behavior of the ansätze with respect to the degree of the potential and to the intervening coupling constant. Finally, we discuss how the results could be used to establish the upper bounds of the relative errors in situations where polynomial potentials intervene.	Translated: 2023-03-17 21:08:22 Published: 2021-08-21
# Mixed Reality using Illumination-aware Gradient Mixing in Surgical Telepresence: Enhanced Multi-layer Visualization ( http://arxiv.org/abs/2110.09318v1 ) License: see the linked page	Nirakar Puri, Abeer Alsadoon, P.W.C. Prasad, Nada Alsalami, Tarik A. Rashid	Background and aim: Surgical telepresence using augmented perception has been applied, but mixed reality is still being researched and is only theoretical. The aim of this work is to propose a solution to improve the visualization in the final merged video by producing globally consistent videos when the intensity of illumination in the input source and target video varies. Methodology: The proposed system uses an enhanced multi-layer visualization with illumination-aware gradient mixing using the Illumination Aware Video Composition algorithm. The Particle Swarm Optimization algorithm is used to find the best sample pair from the foreground and background regions and image pixel correlation to estimate the alpha matte. The Particle Swarm Optimization algorithm helps to get the original colour and depth of the unknown pixel in the unknown region. Result: Our results showed improved accuracy caused by reducing the mean squared error for selecting the best sample pair for the unknown region across 10 samples each of bowel, jaw and breast. The amount of this reduction is 16.48% from the state-of-the-art system. As a result, the visibility accuracy is improved from 89.4% to 97.7%, which helped to clear the hand vision even under differing light. Conclusion: Illumination effect and alpha pixel correlation improve the visualization accuracy, produce globally consistent composition results, and maintain the temporal coherency when compositing two videos with high and inverse illumination effect. In addition, this paper provides a solution for selecting the best sampling pair for the unknown region to obtain the original colour and depth.	Translated: 2023-03-17 21:03:09 Published: 2021-08-21
# Chaotic Fitness Dependent Optimizer for Planning and Engineering Design ( http://arxiv.org/abs/2110.08067v1 ) License: see the linked page	Hardi M. Mohammed, Tarik A. Rashid	Fitness Dependent Optimizer (FDO) is a recent metaheuristic algorithm that mimics the reproduction behavior of the bee swarm in finding better hives. This algorithm is similar to Particle Swarm Optimization (PSO) but it works differently. The algorithm is very powerful and has better results compared to other common metaheuristic algorithms. This paper aims at improving the performance of FDO; thus, chaotic theory is used inside FDO to propose Chaotic FDO (CFDO). Ten chaotic maps are used in the CFDO to consider which of them perform well in avoiding local optima and finding global optima. A new technique is used to keep the population within specific limits, since the FDO technique has a problem amending the population. The proposed CFDO is evaluated by using 10 benchmark functions from CEC2019. Finally, the results show that the ability of CFDO is improved. The Singer map has a great impact on improving CFDO while the Tent map is the worst. Results show that CFDO is superior to GA, FDO, and CSO. Both CEC2013 and CEC2005 are used to evaluate CFDO. Finally, the proposed CFDO is applied to classical engineering problems, such as pressure vessel design, and the result shows that CFDO can handle the problem better than WOA, GWO, FDO, and CGWO. Besides, CFDO is applied to solve the task assignment problem and then compared to the original FDO. The results prove that CFDO has better capability to solve the problem.	Translated: 2023-03-17 21:02:41 Published: 2021-08-21
# Searching for a practical evidence of the No Free Lunch theorems ( http://arxiv.org/abs/2109.13738v1 ) License: see the linked page	Mihai Oltean	According to the No Free Lunch (NFL) theorems all black-box algorithms perform equally well when compared over the entire set of optimization problems. An important problem related to NFL is finding a test problem for which a given algorithm is better than another given algorithm. Of high interest is finding a function for which Random Search is better than another standard evolutionary algorithm. In this paper, we propose an evolutionary approach for solving this problem: we will evolve test functions for which a given algorithm A is better than another given algorithm B. Two ways for representing the evolved functions are employed: as GP trees and as binary strings. Several numerical experiments involving NFL-style Evolutionary Algorithms for function optimization are performed. The results show the effectiveness of the proposed approach. Several test functions for which Random Search performs better than all other considered algorithms have been evolved.	Translated: 2023-03-17 21:02:16 Published: 2021-08-21
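The NFL statement in the entry above (arXiv:2109.13738) can be checked by brute force on a tiny search space. The sketch below is not the paper's evolutionary method: it simply enumerates every function from a four-point domain to three values and compares two deterministic, non-revisiting search strategies. The two strategies, the domain sizes, and all names are hypothetical choices made for illustration.

```python
from itertools import product

X = range(4)    # search space
Y = (0, 1, 2)   # possible objective values

def fixed_order(history):
    """Visit points in increasing order, ignoring what has been observed."""
    visited = {x for x, _ in history}
    return min(x for x in X if x not in visited)

def adaptive(history):
    """Start at the last point, then move according to the last observed value."""
    visited = {x for x, _ in history}
    remaining = [x for x in X if x not in visited]
    if not history:
        return max(remaining)
    last_value = history[-1][1]
    return max(remaining) if last_value >= 1 else min(remaining)

def best_after_k(algorithm, f, k):
    """Best objective value seen after k distinct evaluations of f."""
    history = []
    for _ in range(k):
        x = algorithm(history)
        history.append((x, f[x]))
    return max(value for _, value in history)

def total_performance(algorithm, k):
    """Sum of best-after-k over all |Y|^|X| = 81 functions, each stored as a tuple."""
    return sum(best_after_k(algorithm, f, k) for f in product(Y, repeat=len(X)))

for k in range(1, len(X) + 1):
    print(k, total_performance(fixed_order, k), total_performance(adaptive, k))
```

Summed over all 81 functions the two strategies tie at every budget `k`, even though on an individual function one may beat the other; finding such individual functions is the problem the abstract addresses.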
# 偶数パリティ問題に対する可逆回路の進化 Evolving reversible circuits for the even-parity problem ( http://arxiv.org/abs/2109.13355v1 ) ライセンス: Link先を確認	Mihai Oltean	(参考訳) 可逆計算とは基本的に、電力の少ない計算を意味する。標準バイナリゲートは通常可逆ではないので、フレドキンゲートを使用して可逆性を達成する。本稿では,可逆ディジタル回路の設計アルゴリズムについて述べる。このアルゴリズムは、個人を線形表現した遺伝的プログラミング変種であるMulti Expression Programming (MEP)に基づいている。偶数パリティ問題に対するディジタル回路について検討した。数値実験により、MEPに基づくアルゴリズムは、偶数-8パリティ問題に対する可逆的なディジタル回路を容易に設計できることが示されている。 Reversible computing basically means computation with little or no electrical power. Since the standard binary gates are not usually reversible, we use the Fredkin gate in order to achieve reversibility. An algorithm for designing reversible digital circuits is described in this paper. The algorithm is based on Multi Expression Programming (MEP), a Genetic Programming variant with a linear representation of individuals. The case of digital circuits for the even-parity problem is investigated. Numerical experiments show that the MEP-based algorithm is able to easily design reversible digital circuits for up to the even-8-parity problem.	翻訳日:2023-03-17 21:02:05 公開日:2021-08-21
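The Fredkin gate named in the abstract above can be illustrated with a short sketch. It assumes the standard textbook definition (a controlled swap on three bits). The names `fredkin` and `even_parity_2` are illustrative and not taken from the paper, and the two-gate parity circuit is hand-built for illustration; it is not a circuit evolved by MEP.

```python
from itertools import product


def fredkin(c, a, b):
    """Controlled swap: if control bit c is 1, swap a and b; c passes through unchanged."""
    return (c, b, a) if c else (c, a, b)


def even_parity_2(a, b):
    """Even-2-parity (1 when the number of 1 bits is even) built from two Fredkin gates.

    Hand-built example with constant ancilla inputs 0 and 1 (an assumption of this sketch).
    """
    # With constant inputs (0, 1), the gate outputs a copy of b and NOT b.
    _, b_copy, not_b = fredkin(b, 0, 1)
    # Third output is NOT b when a == 0 and b when a == 1, i.e. XNOR(a, b).
    _, _, parity = fredkin(a, b_copy, not_b)
    return parity


if __name__ == "__main__":
    for a, b in product((0, 1), repeat=2):
        print(a, b, even_parity_2(a, b))
```

The gate is its own inverse and permutes the eight possible input states, which is what makes it reversible.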
# 線形遺伝的プログラミングを用いた進化的アルゴリズム Evolving Evolutionary Algorithms using Linear Genetic Programming ( http://arxiv.org/abs/2109.13110v1 ) ライセンス: Link先を確認	Mihai Oltean	(参考訳) 本稿では,進化的アルゴリズムの新しいモデルを提案する。このモデルは線形遺伝的プログラミング(LGP)技術に基づいている。すべてのLGP染色体は、特定の問題を解決するのに使用されるEAをコードする。機能最適化のためのいくつかの進化的アルゴリズム、トラベリングセールスマン問題、および二次割当問題は、検討されたモデルを用いて進化する。数値実験により、進化的アルゴリズムは、いくつかのよく知られたベンチマーク問題に対する標準的なアプローチよりも、同様に、時には良く機能することが示された。 A new model for evolving Evolutionary Algorithms is proposed in this paper. The model is based on the Linear Genetic Programming (LGP) technique. Every LGP chromosome encodes an EA which is used for solving a particular problem. Several Evolutionary Algorithms for function optimization, the Traveling Salesman Problem, and the Quadratic Assignment Problem are evolved by using the considered model. Numerical experiments show that the evolved Evolutionary Algorithms perform similarly to, and sometimes even better than, standard approaches for several well-known benchmarking problems.	翻訳日:2023-03-17 21:01:56 公開日:2021-08-21
# Nimライクゲームにおける勝利戦略の展開 Evolving winning strategies for Nim-like games ( http://arxiv.org/abs/2109.13109v1 ) ライセンス: Link先を確認	Mihai Oltean	(参考訳) 本稿では,Nimライクゲームにおける勝利戦略を計算するための進化的アプローチを提案する。勝利戦略は、遺伝的プログラミング(GP)の高速かつ効率的な変種であるMEP(Multi Expression Programming)技術を用いて計算される。各プレイ戦略は、数学演算子(+, -, *, mod, div, and, or, xor, not など)とオペランド(現在のゲーム状態のエンコード)を含む数学的表現で表される。 Nimゲームの勝利戦略を計算するためのいくつかの数値実験を行う。勝利戦略の進化に必要な計算労力を報告する。その結果,提案手法はnim系ゲームにおける勝利戦略の計算に非常に適していることがわかった。 An evolutionary approach for computing the winning strategy for Nim-like games is proposed in this paper. The winning strategy is computed by using the Multi Expression Programming (MEP) technique - a fast and efficient variant of Genetic Programming (GP). Each play strategy is represented by a mathematical expression that contains mathematical operators (such as +, -, *, mod, div, and, or, xor, not) and operands (encoding the current game state). Several numerical experiments for computing the winning strategy for the Nim game are performed. The computational effort needed for evolving a winning strategy is reported. The results show that the proposed evolutionary approach is very suitable for computing the winning strategy for Nim-like games.	翻訳日:2023-03-17 21:01:50 publication placeholder
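For orientation, the kind of expression such a search would have to find for plain Nim can be sketched as follows. This is the classical nim-sum (bitwise XOR) strategy, not an expression evolved in the paper. It assumes normal play (the player who takes the last object wins), and the function names are illustrative.

```python
from functools import reduce
from operator import xor


def nim_sum(piles):
    """Bitwise XOR of all pile sizes; 0 means the player to move is losing under normal play."""
    return reduce(xor, piles, 0)


def winning_move(piles):
    """Return (pile_index, new_size) that leaves a nim-sum of 0, or None if no such move exists."""
    total = nim_sum(piles)
    if total == 0:
        return None  # every move hands the opponent a winning position
    for i, size in enumerate(piles):
        target = size ^ total
        if target < size:  # a legal move must shrink the pile
            return i, target
    return None


if __name__ == "__main__":
    print(winning_move([3, 4, 5]))
```

The expression uses only `xor` and a comparison, both of which are within the operator set listed in the abstract.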
# Knapsack問題に対するディジタル回路の進化 Evolving Digital Circuits for the Knapsack Problem ( http://arxiv.org/abs/2109.13107v1 ) ライセンス: Link先を確認	Mihai Oltean, Crina Groşan and Mihaela Oltean	(参考訳) マルチ表現プログラミング(Multi Expression Programming、MEP)は、線形染色体を用いた遺伝的プログラミングの亜種である。 MEPのユニークな特徴は、単一染色体における問題の複数の解をコードする能力である。本稿では,NP-Complete問題,knapsack (subset sum)問題に対して,ディジタル回路の進化にマルチ表現プログラミングを用いる。数値実験により,マルチ表現プログラミングは検討されたテスト問題に対して良好に機能することが示された。 Multi Expression Programming (MEP) is a Genetic Programming variant that uses linear chromosomes for solution encoding. A unique feature of MEP is its ability to encode multiple solutions of a problem in a single chromosome. In this paper we use Multi Expression Programming for evolving digital circuits for a well-known NP-Complete problem: the knapsack (subset sum) problem. Numerical experiments show that Multi Expression Programming performs well on the considered test problems.	翻訳日:2023-03-17 21:01:38 公開日:2021-08-21
# 量子場理論は余剰助力を保持する Quantum Field Theory Deserves Extra Help ( http://arxiv.org/abs/2108.10713v1 ) ライセンス: Link先を確認	John R. Klauder	(参考訳) 今日の量子場理論(QFT)は正準量子化(CQ)に依存しており、$\varphi^4_4$が「自由」な結果にしかならない。代替量子化法であるアフィン量子化(AQ)は、同じモデルに対して「自由でない」結果をもたらす。おそらく、CQにAQを加えることで、QFTにおける幅広い問題の量子化を改善することができる。 Today's quantum field theory (QFT) relies heavily on canonical quantization (CQ), which fails for $\varphi^4_4$ leading only to a "free" result. Affine quantization (AQ), an alternative quantization procedure, leads to a "non-free" result for the same model. Perhaps adding AQ to CQ can improve the quantization of a wide class of problems in QFT.	翻訳日:2023-03-17 21:01:32 公開日:2021-08-21
# ビームスプリッタに作用するコヒーレント対光子の決定論的量子相関 Deterministic quantum correlation between coherently paired photons acting on a beam splitter ( http://arxiv.org/abs/2108.09596v1 ) ライセンス: Link先を確認	B. S. Ham	(参考訳) 光子の粒子の性質に基づく量子技術は過去数十年にわたって進歩しており、エンタングルメントの基本的な量子特性は、ホン・ウー・マンデル型反相関とベル型非局所相関によって検証されてきた。近年、光子の波動特性に基づく相互排他的量子特徴が、謎の量子相関の基礎物理学を理解するために研究され、決定論的かつマクロな量子技術を生み出している。ここでは、ビームスプリッターに作用する対光子の量子的性質を研究し、相互のコヒーレンスが主要な役割を果たす。反相関に関する現在の一般的な理解とは異なり、対の光子間の二部結合は確率的あるいは後選択される必要はないが、量子力学に違反することなく位相ベース操作によって決定的かつマクロ的である。 Quantum technologies based on the particle nature of a photon have progressed over the last several decades, where the fundamental quantum features of entanglement have been tested by Hong-Ou-Mandel-type anticorrelation and Bell-type nonlocal correlation. Recently, mutually exclusive quantum features based on the wave nature of a photon have been investigated to understand the fundamental physics of mysterious quantum correlation, resulting in deterministic and macroscopic quantum technologies. Here, we study the quantum nature of paired photons acting on a beam splitter, where mutual coherence plays a major role. Unlike the current common understanding of anticorrelation, bipartite entanglement between paired photons does not have to be probabilistic or post-selected, but can be deterministic and even macroscopic via phase basis manipulation without violating quantum mechanics.	翻訳日:2023-03-17 21:01:23 公開日:2021-08-21
# 分解多目的進化最適化:最先端から将来の機会へ Decomposition Multi-Objective Evolutionary Optimization: From State-of-the-Art to Future Opportunities ( http://arxiv.org/abs/2108.09588v1 ) ライセンス: Link先を確認	Ke Li	(参考訳) 分解は、多目的最適化と多条件決定のための古典的な数学的プログラミングにおける主流のアプローチである。しかし、進化的多目的最適化の文脈では、分解(MOEA/D)に基づく多目的進化アルゴリズムの開発まで適切に研究されなかった。本稿では,moea/dの開発をその起源から現在の技術動向まで包括的に調査する。自己完結するために、初心者がmoea/dの動作メカニズムに素早く乗り出すのを助けるためのステップバイステップのチュートリアルから始めます。次に, 重みベクトル設定, サブプロブレム定式化, 選択機構, 再生演算子など, 基本設計要素に従ってMOEA/Dの選定を概観する。さらに,制約処理,計算コストの高い客観的関数,選好統合,実世界のアプリケーションについても概説する。最終段階では、今後の発展に向けた新たな方向性に光を当てました。 Decomposition has been the mainstream approach in classical mathematical programming for multi-objective optimization and multi-criterion decision-making. However, it was not properly studied in the context of evolutionary multi-objective optimization until the development of the multi-objective evolutionary algorithm based on decomposition (MOEA/D). In this article, we present a comprehensive survey of the development of MOEA/D from its origin to the current state-of-the-art approaches. In order to be self-contained, we start with a step-by-step tutorial that aims to help a novice quickly get onto the working mechanism of MOEA/D. Then, selected major developments of MOEA/D are reviewed according to its core design components including weight vector settings, sub-problem formulations, selection mechanisms and reproduction operators. Besides, we also overview some further developments for constraint handling, computationally expensive objective functions, preference incorporation, and real-world applications. In the final part, we shed some light on emerging directions for future developments.	翻訳日:2023-03-17 21:01:07 公開日:2021-08-21
# 適切かつ公平な説明 Adequate and fair explanations ( http://arxiv.org/abs/2001.07578v2 ) ライセンス: Link先を確認	Nicholas Asher, Soumya Paul, Chris Russell	(参考訳) 高度な機械学習ベースのシステムを説明することは、AIの基礎において重要な問題である。近年,様々な説明方法が提案されている。これらのアプローチは、局所的および人間の解釈可能な機械学習アルゴリズムの近似を提供するものと、決定の1つの側面を正確に特徴づける論理的アプローチの2つに大別できる。本稿では,厳密な論理的基礎を持つ第2学派に焦点をあてる。これらの厳密な方法には認識論的問題がある。これらは完全な説明を与えることができるが、そのような説明は人間が理解したり、読みやすい形で書き留めるには複雑すぎるかもしれない。解釈可能性には理解しやすい説明、人間が把握できる説明が必要である。しかし、十分に完全に理解可能な説明がまだ明確化する必要がある。ここでは、[Wachter et al., 2017]に倣って、対策の観点でこれを行う。反事実的な説明では、完全な説明を提供するために必要な多くの仮定は暗黙的に残される。そのため、反事実的説明は特定のデータポイントやサンプルの性質を利用しており、部分的説明と同様に局所的でもある。局所的な部分的な説明から完全な局所的な説明へと、そしてグローバルな説明へと移行する方法を探求する。しかし、アクセシビリティを維持するために、部分性の必要性を主張します。この偏りにより、有害または不公平なアルゴリズムに存在する明示的なバイアスを隠蔽することができる。我々は、完全な局所的な説明を提供する反事実の集合の構造を利用して、完全かつ公平な説明を提供することで、これらのバイアスをいかに容易に解明できるかを検討する。 Explaining sophisticated machine-learning based systems is an important issue at the foundations of AI. Recent efforts have shown various methods for providing explanations. These approaches can be broadly divided into two schools: those that provide a local and human interpreatable approximation of a machine learning algorithm, and logical approaches that exactly characterise one aspect of the decision. In this paper we focus upon the second school of exact explanations with a rigorous logical foundation. There is an epistemological problem with these exact methods. While they can furnish complete explanations, such explanations may be too complex for humans to understand or even to write down in human readable form. Interpretability requires epistemically accessible explanations, explanations humans can grasp. Yet what is a sufficiently complete epistemically accessible explanation still needs clarification. We do this here in terms of counterfactuals, following [Wachter et al., 2017]. With counterfactual explanations, many of the assumptions needed to provide a complete explanation are left implicit. 
To do so, counterfactual explanations exploit the properties of a particular data point or sample, and as such are also local as well as partial explanations. We explore how to move from local partial explanations to what we call complete local explanations and then to global ones. But to preserve accessibility we argue for the need for partiality. This partiality makes it possible to hide explicit biases present in the algorithm that may be injurious or unfair. We investigate how easy it is to uncover these biases in providing complete and fair explanations by exploiting the structure of the set of counterfactuals providing a complete local explanation.	翻訳日:2023-01-08 00:11:31 公開日:2021-08-21
# slice tuner: 正確かつ公平な機械学習モデルのための選択的データ取得フレームワーク Slice Tuner: A Selective Data Acquisition Framework for Accurate and Fair Machine Learning Models ( http://arxiv.org/abs/2003.04549v3 ) ライセンス: Link先を確認	Ki Hyun Tae, Steven Euijong Whang	(参考訳) 機械学習はSoftware 2.0の時代に民主化されていくにつれて、正確で公正なモデルを保証するのに十分なデータを獲得している。クラウドソーシングを含む最近の技術は、そのようなデータを集めるためのコスト効率の高い方法を提供する。しかし、データを取得することは必ずしも正確性と公平性を最適化するための効果的な戦略ではない。例えば、オンラインのapp storeに、特定のデータスライス(例えばアメリカの顧客)のための十分なトレーニングデータがあるが、他の顧客にとってはそうではない場合、より多くのアメリカの顧客データを取得することは、モデルのトレーニングに偏るだけだ。代わりに、選択的にデータを取得し、スライス毎の潜在的に異なる量のデータを取得し、スライス毎のモデル精度と公平性を最適化するSlice Tunerを提案する必要がある。この問題は、(アクティブな学習や弱い監督において)既存のデータをラベル付けすることとは異なる。中心となるSlice Tunerは、より多くのデータに対してモデル精度を見積もるスライスの学習曲線を維持し、凸最適化を使用して最高のデータ取得戦略を見つける。学習曲線を推定する主な課題は、十分なデータがなければ不正確な場合があり、一方のスライスで取得したデータが他者の学習曲線に影響を与えるスライス間に依存性がある場合である。より多くのデータを取得するにつれて、学習曲線を反復的かつ効率的に更新することで、これらの問題を解決する。我々は,クラウドソーシングを用いて実際のデータセット上でSlice Tunerを評価し,学習曲線を確実に推定できない場合でも,モデル精度と公平性の観点からSlice Tunerがベースラインを著しく上回ることを示す。 As machine learning becomes democratized in the era of Software 2.0, a serious bottleneck is acquiring enough data to ensure accurate and fair models. Recent techniques including crowdsourcing provide cost-effective ways to gather such data. However, simply acquiring data as much as possible is not necessarily an effective strategy for optimizing accuracy and fairness. For example, if an online app store has enough training data for certain slices of data (say American customers), but not for others, obtaining more American customer data will only bias the model training. Instead, we contend that one needs to selectively acquire data and propose Slice Tuner, which acquires possibly-different amounts of data per slice such that the model accuracy and fairness on all slices are optimized. This problem is different than labeling existing data (as in active learning or weak supervision) because the goal is obtaining the right amounts of new data. 
At its core, Slice Tuner maintains learning curves of slices that estimate the model accuracies given more data and uses convex optimization to find the best data acquisition strategy. The key challenges of estimating learning curves are that they may be inaccurate if there is not enough data, and there may be dependencies among slices where acquiring data for one slice influences the learning curves of others. We solve these issues by iteratively and efficiently updating the learning curves as more data is acquired. We evaluate Slice Tuner on real datasets using crowdsourcing for data acquisition and show that Slice Tuner significantly outperforms baselines in terms of model accuracy and fairness, even when the learning curves cannot be reliably estimated.	翻訳日:2022-12-24 20:35:38 公開日:2021-08-21
# マルチセンターフェデレーションラーニング Multi-Center Federated Learning ( http://arxiv.org/abs/2005.01026v2 ) ライセンス: Link先を確認	Ming Xie, Guodong Long, Tao Shen, Tianyi Zhou, Xianzhi Wang, Jing Jiang, Chengqi Zhang	(参考訳) フェデレートラーニングは,ユーザデータに直接アクセスする必要なく,大規模モデルを分散的にトレーニングする能力に大きな注目を集めている。ユーザのプライベートデータを集中収集から保護するのに役立つ。分散機械学習とは異なり、フェデレートされた学習は、スマートフォンなど、さまざまな現実世界のアプリケーションにおける異種ソースからの非IIDデータに取り組むことを目的としている。既存のフェデレーション学習のアプローチは通常、単一のグローバルモデルを採用して、データ分布のばらつきに関係なく、勾配を集約することで、すべてのユーザの共有知識をキャプチャする。しかし、ユーザ行動の多様性のため、異なるグローバルモデル(すなわちセンター)にユーザの勾配を割り当てることで、ユーザ間のデータ分散の不均一性をよりよく捉えることができる。本稿では,非IIDユーザデータから複数のグローバルモデルを学習し,同時にユーザとセンタの最適なマッチングを導出する,フェデレーション学習のための新しい多中心集約機構を提案する。確率的予測最大化(EM)アルゴリズムにより効率よく解けるような共同最適化として問題を定式化する。ベンチマークデータセットによる実験結果から,本手法はいくつかの一般的なフェデレーション学習法より優れていることが示された。 Federated learning has received great attention for its capability to train a large-scale model in a decentralized manner without needing to access user data directly. It helps protect the users' private data from centralized collecting. Unlike distributed machine learning, federated learning aims to tackle non-IID data from heterogeneous sources in various real-world applications, such as those on smartphones. Existing federated learning approaches usually adopt a single global model to capture the shared knowledge of all users by aggregating their gradients, regardless of the discrepancy between their data distributions. However, due to the diverse nature of user behaviors, assigning users' gradients to different global models (i.e., centers) can better capture the heterogeneity of data distributions across users. Our paper proposes a novel multi-center aggregation mechanism for federated learning, which learns multiple global models from the non-IID user data and simultaneously derives the optimal matching between users and centers. We formulate the problem as a joint optimization that can be efficiently solved by a stochastic expectation maximization (EM) algorithm. 
Our experimental results on benchmark datasets show that our method outperforms several popular federated learning methods.	翻訳日:2022-12-07 06:23:54 公開日:2021-08-21
# xgboostにおける障害時間モデルによる生存回帰 Survival regression with accelerated failure time model in XGBoost ( http://arxiv.org/abs/2006.04920v3 ) ライセンス: Link先を確認	Avinash Barnwal, Hyunsu Cho, Toby Dylan Hocking	(参考訳) サバイバルレグレッションはイベント時間と機能変数の関係を推定するために使用され、医療、マーケティング、リスク管理、セールス管理といったアプリケーションドメインにおいて重要である。 xgboost、scikit-learn、lightgbm、catboostなどのライブラリに実装された非線形木ベースの機械学習アルゴリズムは、線形モデルよりも正確であることが多い。しかし、既存のツリーベースモデルの最先端実装は、生き残り回帰を限定的にサポートしている。本研究では,xgboostにおけるアクセラレーション障害時間(aft)モデル学習のための損失関数を実装し,異なる種類のラベル検閲に対するサバイバルモデルのサポートを強化する。我々は,XGBoostにおけるAFTの有効性を,一般化性能とトレーニング速度の2点において実かつシミュレートされた実験で実証した。さらに,XGBoostにおけるNVIDIA GPUのサポートを活用し,マルチコアCPU上での大幅な高速化を実現している。我々の知る限り、我々の研究はNVIDIA GPUの処理能力を利用するAFTの最初の実装である。 1.2.0のリリースから、XGBoostパッケージはAFTモデルをネイティブにサポートした。 XGBoostにAFTを追加したことで、オープンソースコミュニティに大きな影響を与え、いくつかの統計パッケージがXGBoost AFTモデルを使用している。 Survival regression is used to estimate the relation between time-to-event and feature variables, and is important in application domains such as medicine, marketing, risk management and sales management. Nonlinear tree based machine learning algorithms as implemented in libraries such as XGBoost, scikit-learn, LightGBM, and CatBoost are often more accurate in practice than linear models. However, existing state-of-the-art implementations of tree-based models have offered limited support for survival regression. In this work, we implement loss functions for learning accelerated failure time (AFT) models in XGBoost, to increase the support for survival modeling for different kinds of label censoring. We demonstrate with real and simulated experiments the effectiveness of AFT in XGBoost with respect to a number of baselines, in two respects: generalization performance and training speed. Furthermore, we take advantage of the support for NVIDIA GPUs in XGBoost to achieve substantial speedup over multi-core CPUs. To our knowledge, our work is the first implementation of AFT that utilizes the processing power of NVIDIA GPUs. 
Starting from the 1.2.0 release, the XGBoost package natively supports the AFT model. The addition of AFT in XGBoost has had significant impact in the open source community, and a few statistics packages now utilize the XGBoost AFT model.	翻訳日:2022-11-24 01:08:59 公開日:2021-08-21
# 治療効果の異質性モデリングと上昇モデルの統合調査 A unified survey of treatment effect heterogeneity modeling and uplift modeling ( http://arxiv.org/abs/2007.12769v3 ) ライセンス: Link先を確認	Weijia Zhang, Jiuyong Li, Lin Liu	(参考訳) 科学研究の多くの分野における中心的な問題は、結果が作用によってどのように影響を受けるかを決定すること、または作用の効果を測定することである。近年,パーソナライズされた医療,社会科学,オンラインマーケティングといった研究分野から,個人特性の異なる不均一な治療効果条件付けの必要性が浮上している。このニーズに応えるため、異なるコミュニティの研究者と実践者が、治療効果ヘテロジニティ・モデリング・アプローチとアップリフト・モデリング・アプローチをそれぞれ取り入れてアルゴリズムを開発した。本稿では,これら2つの非連結で近縁なアプローチについて,潜在的結果の枠組みの下で統一的な調査を行う。次に,各手法の比較を容易にする統一表記法を用いて,既存の手法に関する構造化された調査を行う。次に、パーソナライズされたマーケティング、パーソナライズされた医療、社会研究における調査手法の主な応用について概説する。最後に、既存のソフトウェアパッケージを要約し、合成、半合成、実世界のデータセットにおけるメソッドの使用に基づく議論を行い、メソッドを選択するための一般的なガイドラインを提供する。 A central question in many fields of scientific research is to determine how an outcome would be affected by an action, or to measure the effect of an action (a.k.a treatment effect). In recent years, a need for estimating the heterogeneous treatment effects conditioning on the different characteristics of individuals has emerged from research fields such as personalized healthcare, social science, and online marketing. To meet the need, researchers and practitioners from different communities have developed algorithms by taking the treatment effect heterogeneity modeling approach and the uplift modeling approach, respectively. In this paper, we provide a unified survey of these two seemingly disconnected yet closely related approaches under the potential outcome framework. We then provide a structured survey of existing methods by emphasizing on their inherent connections with a set of unified notations to make comparisons of the different methods easy. We then review the main applications of the surveyed methods in personalized marketing, personalized medicine, and social studies. 
Finally, we summarize the existing software packages and present discussions based on the use of methods on synthetic, semi-synthetic and real world data sets and provide some general guidelines for choosing methods.	翻訳日:2022-11-10 14:24:09 公開日:2021-08-21
# コンピュータビジョンのための2次元推定を用いた3次元物体位置推定 3D Object Localization Using 2D Estimates for Computer Vision Applications ( http://arxiv.org/abs/2009.11446v2 ) ライセンス: Link先を確認	Taha Hasan Masood Siddique and Muhammad Usman	(参考訳) ポーズ推定とカメラキャリブレーションに基づく物体位置推定手法を提案する。対象物の2次元(2次元)画像を複数収集して3次元(3次元)座標を推定し、カメラの校正に利用する。レンズ歪みの除去、物体の大きさの計算、カメラの位置計算のための内因的および外因的パラメータを含む多くのパラメータ計算を含むキャリブレーションステップについて論じる。 2次元画像を用いて3次元ポーズを推定する変換戦略を示す。提案手法はMATLABに実装され,ポーズ推定とカメラキャリブレーションの両面で検証実験を行った。 A technique for object localization based on pose estimation and camera calibration is presented. The 3-dimensional (3D) coordinates are estimated by collecting multiple 2-dimensional (2D) images of the object and are utilized for the calibration of the camera. The calibration steps, which involve a number of parameter calculations including intrinsic and extrinsic parameters for the removal of lens distortion, computation of the object's size, and calculation of the camera's position, are discussed. A transformation strategy to estimate the 3D pose using the 2D images is presented. The proposed method is implemented in MATLAB and validation experiments are carried out for both pose estimation and camera calibration.	翻訳日:2022-10-15 05:07:07 公開日:2021-08-21
# CIMON: 高品質なハッシュコードを目指す CIMON: Towards High-quality Hash Codes ( http://arxiv.org/abs/2010.07804v4 ) ライセンス: Link先を確認	Xiao Luo, Daqing Wu, Zeyu Ma, Chong Chen, Minghua Deng, Jinwen Ma, Zhongming Jin, Jianqiang Huang and Xian-Sheng Hua	(参考訳) 近年、ハッシュは、そのストレージと計算効率のほぼ隣り合う探索に広く利用されている。教師なしハッシュ法の多くは、事前訓練されたモデルから局所的な意味的類似性構造を構築し、特徴空間において距離が小さい場合には各点対を扱い、イメージを意味的類似性保存ハッシュコードにマッピングすることを学ぶ。しかし、事前学習されたモデルの非効率表現能力のため、局所的な意味的類似性における多くの偽陽性と負が導入され、ハッシュコードの学習中にエラーの伝播が起こる。さらに、モデルのロバスト性を考慮する方法も少なく、それによってハッシュコードの不安定性が乱れてしまう。本稿では, Comprehensive sImilarity Mining and cOnsistency learNing (CIMON) という新しい手法を提案する。まず,グローバルリファインメントと類似度統計分布を用いて,信頼性の高い円滑な指導を行う。第2に、意味的および対比的一貫性学習は、乱れ不変と判別的ハッシュ符号の両方を導出するために導入される。いくつかのベンチマークデータセットの大規模な実験により,提案手法は検索性能とロバスト性の両方において,幅広い最先端手法より優れていることが示された。 Recently, hashing is widely used in approximate nearest neighbor search for its storage and computational efficiency. Most of the unsupervised hashing methods learn to map images into semantic similarity-preserving hash codes by constructing local semantic similarity structure from the pre-trained model as the guiding information, i.e., treating each point pair similar if their distance is small in feature space. However, due to the inefficient representation ability of the pre-trained model, many false positives and negatives in local semantic similarity will be introduced and lead to error propagation during the hash code learning. Moreover, few of the methods consider the robustness of models, which will cause instability of hash codes to disturbance. In this paper, we propose a new method named Comprehensive sImilarity Mining and cOnsistency learNing (CIMON). First, we use global refinement and similarity statistical distribution to obtain reliable and smooth guidance. 
Second, both semantic and contrastive consistency learning are introduced to derive both disturb-invariant and discriminative hash codes. Extensive experiments on several benchmark datasets show that the proposed method outperforms a wide range of state-of-the-art methods in both retrieval performance and robustness.	翻訳日:2022-10-07 04:18:59 公開日:2021-08-21
# (参考訳) 屋内位置決めシステムにおける教師なし移動検出 Unsupervised Movement Detection in Indoor Positioning Systems ( http://arxiv.org/abs/2109.10757v1 ) ライセンス: CC BY 4.0	Jonathan Flossdorf, Anne Meyer, Dmitri Artjuch, Jaques Schneider, Carsten Jentsch	(参考訳) 近年では製造工程における室内位置決めシステムの利用が盛んになっている。通常、製造ホールはセンサーの位置データを受信する衛星を備えており、部品、荷積み機、産業用トラックに固定することができる。これにより、例えば企業が検索の労力を減らし、個々のシステムプロセスの最適化が可能になる。本研究の文脈では,センサは移動時にのみ位置情報を送信する。しかし、周囲の要因が乱れるなど、様々な状況がデータ送信に好ましくない影響をしばしば与えている。これは、データ品質、エネルギー消費、システム全体の信頼性に悪影響を及ぼす。そこで本研究では,室内システムの騒音や測定誤差の影響を受けやすいため,好ましくない信号と実際の動きを区別することを目的としている。そこで,本課題に適した2つの非教師なし分類アルゴリズムを提案する。興味のある問題によっては、それらは距離ベースか時間ベースの基準に依存しており、すべての必須情報を利用することができる。さらに,両方の分類を結合し,それらを空間生産領域に集約する手法を提案する。これにより、位置データのみを用いて、下層のプロダクションホールの包括的なマップを生成することができる。基盤となる移動構造の分析と検出は別として、利用者は自身のシステムプロセスのより良い理解と、より効率的な位置決めシステムの使用につながる問題のあるシステム領域の検出から恩恵を受ける。全ての手法は教師なしの技術で構築されているため、実際は手動で適用でき、位置決めシステムの出力データ以上の情報を必要としない。 In recent years, the usage of indoor positioning systems for manufacturing processes became increasingly popular. Typically, the production hall is equipped with satellites which receive position data of sensors that can be pinned on components, load carriers or industrial trucks. This enables a company e.g. to reduce search efforts and to optimize individual system processes. In our research context, a sensor only sends position information when it is moved. However, various circumstances frequently affect that data is undesirably sent, e.g. due to disrupting factors nearby. This has a negative impact on the data quality, the energy consumption, and the reliability of the whole system. Motivated by this, we aim to distinguish between actual movements and signals that were undesirably sent which is in particular challenging due to the susceptibility of indoor systems in terms of noise and measuring errors. Therefore, we propose two novel unsupervised classification algorithms suitable for this task. 
Depending on the question of interest, they rely either on a distance-based or on a time-based criterion, which makes it possible to use all essential information. Furthermore, we propose an approach to combine both classifications and to aggregate them on spatial production areas. This enables us to generate a comprehensive map of the underlying production hall with the sole usage of the position data. Aside from the analysis and detection of the underlying movement structure, the user benefits from a better understanding of their own system processes and from the detection of problematic system areas, which leads to a more efficient usage of positioning systems. Since all our approaches are constructed with unsupervised techniques, they are handily applicable in practice and do not require more information than the output data of the positioning system.	翻訳日:2021-09-26 23:37:59 公開日:2021-08-21
# 深部畳み込みニューラルネットワークを高速化する数値精度に制限のある再構成可能なコプロセッサアーキテクチャ Reconfigurable co-processor architecture with limited numerical precision to accelerate deep convolutional neural networks ( http://arxiv.org/abs/2109.03040v1 ) ライセンス: Link先を確認	Sasindu Wijeratne, Sandaruwan Jayaweera, Mahesh Dananjaya, Ajith Pasqual	(参考訳) 畳み込みニューラルネットワーク(CNN)は、視覚システムやロボット工学などのディープラーニングアプリケーションで広く使われている。しかし、既存のソフトウェアソリューションは効率的ではない。そのため、多くのハードウェアアクセラレーターが実装の性能、パワー、資源利用を最適化する提案がなされている。既存のソリューションの中で、FPGA(Field Programmable Gate Array)ベースのアーキテクチャは、スケーラビリティと開発時間の最小化とともに、より良いコスト-エネルギーパフォーマンスのトレードオフを提供します。本稿では,CNNを高速化するモデル非依存の再構成可能コプロセッシングアーキテクチャを提案する。我々のアーキテクチャは、最大データ並列性を利用するためのキャッシュ技術と相互接続ネットワークを備えた並列Multiply and Accumulate (MAC)ユニットで構成されている。既存の解とは対照的に、算術表現や演算のための限定精度32bit Q-format固定点量子化を導入する。その結果,我々のアーキテクチャは,競争精度で資源利用の大幅な削減を実現した。さらに,協調処理ファブリックにアクセスして層間並列性を管理するアセンブリ型マイクロインストラクションを開発し,限られた資源を再利用した。最後に、Xilinx Virtex 7 FPGA上で最大9x9のカーネルサイズをテストし、3x3カーネルサイズで最大226.2 GOp/Sのスループットを実現した。 Convolutional Neural Networks (CNNs) are widely used in deep learning applications, e.g. visual systems, robotics etc. However, existing software solutions are not efficient. Therefore, many hardware accelerators have been proposed optimizing performance, power and resource utilization of the implementation. Amongst existing solutions, Field Programmable Gate Array (FPGA) based architecture provides better cost-energy-performance trade-offs as well as scalability and minimizing development time. In this paper, we present a model-independent reconfigurable co-processing architecture to accelerate CNNs. Our architecture consists of parallel Multiply and Accumulate (MAC) units with caching techniques and interconnection networks to exploit maximum data parallelism. In contrast to existing solutions, we introduce limited precision 32 bit Q-format fixed point quantization for arithmetic representations and operations. As a result, our architecture achieved significant reduction in resource utilization with competitive accuracy. 
Furthermore, we developed assembly-type microinstructions to access the co-processing fabric and manage layer-wise parallelism, thereby re-using limited resources. Finally, we have tested our architecture up to 9x9 kernel size on Xilinx Virtex 7 FPGA, achieving a throughput of up to 226.2 GOp/S for 3x3 kernel size.	翻訳日:2021-09-12 10:54:46 公開日:2021-08-21
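The fixed-point arithmetic mentioned in the abstract above can be sketched in software. The abstract only says "32 bit Q-format" and does not give the split between integer and fractional bits, so the Q16.16 split, the saturation behaviour, and all names below are assumptions made for illustration. They do not describe the authors' hardware.

```python
FRAC_BITS = 16  # assumed split (Q16.16); the abstract does not state it
WORD_BITS = 32
Q_MIN = -(1 << (WORD_BITS - 1))
Q_MAX = (1 << (WORD_BITS - 1)) - 1


def saturate(value):
    """Clamp an integer to the signed 32-bit range instead of wrapping around."""
    return max(Q_MIN, min(Q_MAX, value))


def to_q(x):
    """Quantize a float to a signed fixed-point integer with FRAC_BITS fractional bits."""
    return saturate(int(round(x * (1 << FRAC_BITS))))


def from_q(q):
    """Convert a fixed-point integer back to a float."""
    return q / (1 << FRAC_BITS)


def q_mul(a, b):
    """Fixed-point multiply: the raw product has 2*FRAC_BITS fractional bits, so shift back.

    Python's >> floors, so negative products round toward minus infinity.
    """
    return saturate((a * b) >> FRAC_BITS)


def q_mac(acc, a, b):
    """One multiply-and-accumulate step, the basic operation of a MAC unit."""
    return saturate(acc + q_mul(a, b))


if __name__ == "__main__":
    acc = to_q(0.25)
    acc = q_mac(acc, to_q(0.5), to_q(0.5))
    print(from_q(acc))
```

Saturating rather than wrapping on overflow is a common choice for accumulators, but whether the paper does so is not stated in the abstract.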
# (参考訳) ディープラーニングに基づく正規化(安定化)再構成アルゴリズム Regularizing (Stabilizing) Deep Learning Based Reconstruction Algorithms ( http://arxiv.org/abs/2108.13551v1 ) ライセンス: CC0 1.0	Abinash Nayak	(参考訳) 逆問題は不適切であり、それを有意義に解くには正規化法を使わなければならないことはよく知られている。伝統的に、一般的な正規化法はペナルティ化された変分アプローチである。近年、古典的正規化再構成アプローチは(深層学習に基づく)学習的再構成アルゴリズムによって非分類化されている。しかし、従来の正則化法とは異なり、安定性や正則化といった理論的な基盤は、そのような学習された再構成アルゴリズムでは不十分である。したがって、これらのアルゴリズムから得られた結果は、経験的に優れているが、学習プロセスから生じる特定の不安定性や(ハロゲン化)特徴を含むため、常に完全に信頼されるとは限らない。実際、このような学習アルゴリズムは、データ内の小さな(逆)ノイズに非常に影響を受けやすく、回収された解に深刻な不安定性をもたらすことが示されており、これは、不適切な(逆)問題の本質的な不安定性とは全く異なる可能性がある。しかし、古典正規化法はそのような(逆)ノイズをうまく処理することができ、安定した回復をもたらす。そこで我々は,このような(不安定な)学習的再構成手法を安定化し,対向雑音の存在下でも正規化解を回復するための一定の正規化手法を提案する。そのため、古典的な正規化の概念を拡張し、学習された再構成アルゴリズムに組み込む必要がある。また,最も一般的な学習再建アルゴリズムである学習後再構築と学習後再構築の2つを正規化するための正規化手法を提案する。 It's well-known that inverse problems are ill-posed and to solve them meaningfully one has to employ regularization methods. Traditionally, popular regularization methods have been the penalized Variational approaches. In recent years, the classical regularized-reconstruction approaches have been outclassed by the (deep-learning-based) learned reconstruction algorithms. However, unlike the traditional regularization methods, the theoretical underpinnings, such as stability and regularization, have been insufficient for such learned reconstruction algorithms. Hence, the results obtained from such algorithms, though empirically outstanding, can't always be completely trusted, as they may contain certain instabilities or (hallucinated) features arising from the learned process. In fact, it has been shown that such learning algorithms are very susceptible to small (adversarial) noises in the data and can lead to severe instabilities in the recovered solution, which can be quite different than the inherent instabilities of the ill-posed (inverse) problem. Whereas, the classical regularization methods can handle such (adversarial) noises very well and can produce stable recovery. 
Here, we try to present certain regularization methods to stabilize such (unstable) learned reconstruction methods and recover a regularized solution, even in the presence of adversarial noises. For this, we need to extend the classical notion of regularization and incorporate it in the learned reconstruction algorithms. We also present some regularization techniques to regularize two of the most popular learning reconstruction algorithms, the Learned Post-Processing Reconstruction and the Learned Unrolling Reconstruction.	翻訳日:2021-09-05 10:24:48 公開日:2021-08-21
# ディープラーニングを用いた認知症知識発見のための弾性ネット正規化の新しい解法 A Novel Solution of an Elastic Net Regularization for Dementia Knowledge Discovery using Deep Learning ( http://arxiv.org/abs/2109.00896v1 ) ライセンス: Link先を確認	Kshitiz Shrestha, Omar Hisham Alsadoon, Abeer Alsadoon, Tarik A. Rashid, Rasha S. Ali, P.W.C. Prasad, Oday D. Jerew	(参考訳) 背景と目的:MRIの正確な分類は、軽度認知障害(MCI)からアルツハイマー病(AD)への変換を正確に予測するために不可欠である。一方、ディープラーニングは認知症病の分類と予測に成功している。しかし,MRI画像分類の精度は低い。本稿では,特徴選択におけるElastic Net Regularizationを用いて,ディープラーニングアーキテクチャによる分類の精度を高め,処理時間を短縮することを目的とする。方法論:本システムは,弾性ネット正規化を用いた分類と予測の精度を高めるために,畳み込みニューラルネットワーク(cnn)から構成される。当初、MRI画像はCNNに入力され、プール層と交互に畳み込み層を通して機能を抽出し、それから完全に接続された層を通して抽出される。その後、抽出した特徴を原理成分分析(pca)と弾性ネット正規化により特徴選択を行う。最後に、選択した特徴を、MRI画像の分類のためのExtreme Machine Learning (EML)への入力として使用する。結果: 提案手法の精度は現在のシステムよりも優れていることが示された。さらに,提案手法では,分類精度を平均で5%向上させ,処理時間を平均で30秒から40秒短縮した。結論:提案システムは,MCIコンバータ/非コンバータ分類の精度と処理時間の改善に重点を置いている。 CNN、FreeSurfer、PCA、Elastic Net、Extreme Machine Learningを使った機能抽出、機能選択、分類で構成されている。最後に,本研究は弾性ネット正則化を用いて精度と処理時間を向上し,分類に重要な特徴を提供する。 Background and Aim: Accurate classification of Magnetic Resonance Images (MRI) is essential to accurately predict Mild Cognitive Impairment (MCI) to Alzheimer's Disease (AD) conversion. Meanwhile, deep learning has been successfully implemented to classify and predict dementia disease. However, the accuracy of MRI image classification is low. This paper aims to increase the accuracy and reduce the processing time of classification through Deep Learning Architecture by using Elastic Net Regularization in Feature Selection. Methodology: The proposed system consists of Convolutional Neural Network (CNN) to enhance the accuracy of classification and prediction by using Elastic Net Regularization. Initially, the MRI images are fed into CNN for features extraction through convolutional layers alternate with pooling layers, and then through a fully connected layer. 
After that, the extracted features are subjected to Principal Component Analysis (PCA) and Elastic Net Regularization for feature selection. Finally, the selected features are used as input to Extreme Machine Learning (EML) for the classification of MRI images. Results: The results show that the accuracy of the proposed solution is better than that of the current system. In addition, the proposed method improves classification accuracy by 5% on average and reduces processing time by 30 to 40 seconds on average. Conclusion: The proposed system focuses on improving the accuracy and processing time of MCI converter/non-converter classification. It consists of feature extraction, feature selection, and classification using CNN, FreeSurfer, PCA, Elastic Net, and Extreme Machine Learning. Finally, this study improves accuracy and processing time by using Elastic Net Regularization, which provides important selected features for classification.	翻訳日:2021-09-05 08:54:04 公開日:2021-08-21
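As a rough illustration of why elastic net regularization can act as a feature selector, the sketch below shows the generic elastic net penalty and the soft-thresholding update used in coordinate descent. It is not the authors' pipeline: the function names, the `lam` and `l1_ratio` parameters, and the assumption of standardized features are all illustrative choices made here.

```python
def elastic_net_penalty(weights, lam=0.1, l1_ratio=0.5):
    """Generic elastic net penalty: lam * (l1_ratio * L1 + 0.5 * (1 - l1_ratio) * L2^2)."""
    l1 = sum(abs(w) for w in weights)
    l2_squared = sum(w * w for w in weights)
    return lam * (l1_ratio * l1 + 0.5 * (1.0 - l1_ratio) * l2_squared)


def soft_threshold(value, threshold):
    """Proximal operator of the L1 term: shrinks toward zero and clips small values to exactly 0."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def elastic_net_update(rho, lam=0.1, l1_ratio=0.5):
    """One coordinate-descent update for a single weight.

    Assumes (hypothetically) that features are standardized and that `rho` is the
    correlation between the feature and the current partial residual.
    """
    return soft_threshold(rho, lam * l1_ratio) / (1.0 + lam * (1.0 - l1_ratio))


def selected_features(weights, tol=1e-12):
    """Indices of features whose weight survived the shrinkage."""
    return [i for i, w in enumerate(weights) if abs(w) > tol]
```

Weights whose correlation with the residual stays below `lam * l1_ratio` are set to exactly zero, so `selected_features` returns only the remaining indices. The L2 part shrinks the surviving weights without zeroing them.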
# 資源制約付きエッジコンピューティングシステムの最適化圧縮 Supervised Compression for Resource-constrained Edge Computing Systems ( http://arxiv.org/abs/2108.11898v1 ) ライセンス: Link先を確認	Yoshitomo Matsubara, Ruihan Yang, Marco Levorato, Stephan Mandt	(参考訳) スマートフォンやドローン、医療センサーなど、低消費電力のデバイスにディープラーニングアルゴリズムをデプロイすることに関心がある。しかし、フルスケールのディープニューラルネットワークはエネルギーとストレージの面で資源集約的すぎることが多い。そのため、データを圧縮して送信するエッジサーバでは、機械学習操作のバルク部が頻繁に実行される。しかし、データ(画像など)を圧縮すると、監視されたタスクとは無関係な情報を送信する。もうひとつの一般的なアプローチは、中間機能を圧縮しながらデバイスとサーバの間にディープネットワークを分割することである。しかし、これまでのところ、これらの分割コンピューティング戦略は、機能圧縮に対する非効率なアプローチのため、前述のナイーブなデータ圧縮ベースラインをわずかに上回っている。本稿では、知識蒸留とニューラルイメージ圧縮のアイデアを採用し、中間特徴表現をより効率的に圧縮する。教師モデルと生徒モデルを用いて,エントロピー符号化に先立って確率的ボトルネックと学習可能な圧縮手法を開発した。 3つのビジョンタスクにおいて,我々のアプローチを様々なニューラルイメージと特徴圧縮ベースラインと比較し,より小さなレイテンシを維持しながら,教師付きレートゆがみ性能を向上できることを見出した。さらに、学習した特徴表現が複数の下流タスクに役立てるように調整可能であることを示す。 There has been much interest in deploying deep learning algorithms on low-powered devices, including smartphones, drones, and medical sensors. However, full-scale deep neural networks are often too resource-intensive in terms of energy and storage. As a result, the bulk part of the machine learning operation is therefore often carried out on an edge server, where the data is compressed and transmitted. However, compressing data (such as images) leads to transmitting information irrelevant to the supervised task. Another popular approach is to split the deep network between the device and the server while compressing intermediate features. To date, however, such split computing strategies have barely outperformed the aforementioned naive data compression baselines due to their inefficient approaches to feature compression. This paper adopts ideas from knowledge distillation and neural image compression to compress intermediate feature representations more efficiently. Our supervised compression approach uses a teacher model and a student model with a stochastic bottleneck and learnable prior for entropy coding. 
We compare our approach to various neural image and feature compression baselines in three vision tasks and find that it achieves better supervised rate-distortion performance while also maintaining smaller end-to-end latency. We furthermore show that the learned feature representations can be tuned to serve multiple downstream tasks.	翻訳日:2021-08-29 12:13:24 公開日:2021-08-21
# Curricular SincNet:潜時空間におけるハードサンプル強調によるロバストディープ話者認識に向けて Curricular SincNet: Towards Robust Deep Speaker Recognition by Emphasizing Hard Samples in Latent Space ( http://arxiv.org/abs/2108.10714v1 ) ライセンス: Link先を確認	Labib Chowdhury, Mustafa Kamal, Najia Hasan and Nabeel Mohammed	(参考訳) ディープラーニングモデルは、話者認識などの生体認証システムにおいて、ますます好まれる選択肢となっている。ディープニューラルネットワークアーキテクチャであるSincNetは、音声信号を直接処理できるパラメータ化されたシンク関数のために、話者認識タスクで人気を博した。オリジナルのsincnetアーキテクチャはsoftmaxロスを使っているが、認識ベースのタスクには最適ではないかもしれない。このような損失関数はクラス間マージンを課したり、簡単なトレーニングサンプルと難しいトレーニングサンプルを区別したりしない。カリキュラム学習、特に角マージンに基づく損失を利用した学習は、顔認識などの他の生体計測応用において非常に成功した。このようなカリキュラム学習に基づくテクニックの利点は、クラス間マージンを課すだけでなく、簡単でハードなサンプルを考慮に入れることだ。本稿では,sincnetアーキテクチャを学習するためにsincnetモデルの改良版であるcurricular sincnet (cl-sincnet)を提案する。提案モデルは,データセット内およびデータセット間評価プロトコルを用いて,複数のデータセット上で評価される。どちらの設定でも、モデルは以前に公開された他の作業と競合する。データセット間テストの場合、SincNetや他の公開作業と比較すると、エラー率を4倍に減らして、全体的な結果が最も良い。 Deep learning models have become an increasingly preferred option for biometric recognition systems, such as speaker recognition. SincNet, a deep neural network architecture, gained popularity in speaker recognition tasks due to its parameterized sinc functions that allow it to work directly on the speech signal. The original SincNet architecture uses the softmax loss, which may not be the most suitable choice for recognition-based tasks. Such loss functions do not impose inter-class margins nor differentiate between easy and hard training samples. Curriculum learning, particularly those leveraging angular margin-based losses, has proven very successful in other biometric applications such as face recognition. The advantage of such a curriculum learning-based techniques is that it will impose inter-class margins as well as taking to account easy and hard samples. In this paper, we propose Curricular SincNet (CL-SincNet), an improved SincNet model where we use a curricular loss function to train the SincNet architecture. 
The proposed model is evaluated on multiple datasets using intra-dataset and inter-dataset evaluation protocols. In both settings, the model performs competitively with other previously published work. In the case of inter-dataset testing, it achieves the best overall results, with a 4% reduction in error rate compared to SincNet and other published work.	翻訳日:2021-08-25 14:05:55 公開日:2021-08-21
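The abstract above contrasts plain softmax loss with angular margin-based losses. The following minimal sketch shows a generic additive angular margin applied to the target-class logit. It is not the curricular loss used in CL-SincNet: the `margin` and `scale` values, the function names, and the clamping of the shifted angle at pi are simplifying assumptions made for illustration.

```python
import math


def softmax_cross_entropy(logits, target):
    """Cross-entropy of softmax(logits) against the target index (log-sum-exp form)."""
    peak = max(logits)
    log_sum = peak + math.log(sum(math.exp(z - peak) for z in logits))
    return log_sum - logits[target]


def angular_margin_logits(cosines, target, margin=0.5, scale=30.0):
    """Add an angular margin to the target class only, then rescale all logits.

    `cosines` are cosine similarities between an embedding and each class weight.
    """
    logits = []
    for index, cosine in enumerate(cosines):
        cosine = max(-1.0, min(1.0, cosine))  # guard acos against rounding error
        if index == target:
            theta = math.acos(cosine)
            # Simplification: clamp the shifted angle at pi so cos stays monotonic.
            cosine = math.cos(min(math.pi, theta + margin))
        logits.append(scale * cosine)
    return logits
```

Because the margin lowers only the target logit, the same sample yields a larger loss than under plain scaled softmax. This pushes the network to separate classes by at least the margin in angle.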
# (参考訳) 有能な物体検出のためのマルチスケールエッジベースU字型ネットワーク Multi-scale Edge-based U-shape Network for Salient Object Detection ( http://arxiv.org/abs/2108.09408v1 ) ライセンス: CC BY 4.0	Han Sun, Yetong Bian, Ningzhong Liu, Huiyu Zhou	(参考訳) ディープラーニングベースのサルエントオブジェクト検出手法は、大きな改善を達成している。しかし,不適切な特徴抽出と統合が主な原因である,ぼやけた境界や不正確な位置などの予測にはまだ問題が残っている。本稿では,様々な機能を異なるスケールで統合し,より優れた性能を実現するマルチスケールエッジベースu-shape network(meun)を提案する。境界予測に有用な情報を抽出するために、各デコーダユニットにU字形エッジネットワークモジュールを埋め込む。さらに、追加のダウンサンプリングモジュールは位置の不正確さを緩和する。 4つのベンチマークデータセットの実験結果から,提案手法の有効性と信頼性が示された。マルチスケールのエッジベースのu字型ネットワークは、15の最先端のオブジェクト検出方法と比べても優れている。 Deep-learning based salient object detection methods achieve great improvements. However, there are still problems existing in the predictions, such as blurry boundary and inaccurate location, which is mainly caused by inadequate feature extraction and integration. In this paper, we propose a Multi-scale Edge-based U-shape Network (MEUN) to integrate various features at different scales to achieve better performance. To extract more useful information for boundary prediction, U-shape Edge Network modules are embedded in each decoder units. Besides, the additional down-sampling module alleviates the location inaccuracy. Experimental results on four benchmark datasets demonstrate the validity and reliability of the proposed method. Multi-scale Edge based U-shape Network also shows its superiority when compared with 15 state-of-the-art salient object detection methods.	翻訳日:2021-08-25 10:04:30 公開日:2021-08-21
# (参考訳) 2020年米大統領選挙:Twitterで女性ユーザーと男性ユーザーの分析 2020 U.S. Presidential Election: Analysis of Female and Male Users on Twitter ( http://arxiv.org/abs/2108.09416v1 ) ライセンス: CC BY 4.0	Amir Karami, Spring B. Clark, Anderson Mackenzie, Dorathea Lee, Michael Zhu, Hannah R. Boyajieff, Bailey Goldschmidt	(参考訳) ソーシャルメディアは、選挙運動において、様々な問題について意見を表明するために一般に使用される。様々なソーシャルメディアチャンネルの中で、Twitterは研究者や政治家が経済や外交政策など幅広いトピックに関する世論を探るための効率的なプラットフォームを提供している。現在の文献は、主にユーザーの性別を考慮せずにツイートの内容を分析することに焦点を当てている。この研究は、大量のツイートを収集、分析し、計算、ヒューマンコーディング、統計分析を用いて、2020年のアメリカ合衆国大統領選挙中に投稿された30万以上のツイートのトピックを識別し、トピックの平均重量について女性と男性のユーザーを比較する。私たちの発見は、税や気候変動、新型コロナウイルス(covid-19)パンデミックなど、幅広いトピックに基づいています。トピックのうち,70%以上のトピックにおいて,女性ユーザと男性ユーザの間に有意な違いがある。本研究のアプローチは情報学,政治学,コミュニケーション学の分野での研究に役立ち,政治運動によって世論のジェンダーに基づく理解を得るのに有効である。 Social media is commonly used by the public during election campaigns to express their opinions regarding different issues. Among various social media channels, Twitter provides an efficient platform for researchers and politicians to explore public opinion regarding a wide range of topics such as economy and foreign policy. Current literature mainly focuses on analyzing the content of tweets without considering the gender of users. This research collects and analyzes a large number of tweets and uses computational, human coding, and statistical analyses to identify topics in more than 300,000 tweets posted during the 2020 U.S. presidential election and to compare female and male users regarding the average weight of the topics. Our findings are based upon a wide range of topics, such as tax, climate change, and the COVID-19 pandemic. Out of the topics, there exists a significant difference between female and male users for more than 70% of topics. Our research approach can inform studies in the areas of informatics, politics, and communication, and it can be used by political campaigns to obtain a gender-based understanding of public opinion.	翻訳日:2021-08-25 09:53:13 公開日:2021-08-21
# (参考訳) L3C-Stereo:ステレオ画像のロスレス圧縮 L3C-Stereo: Lossless Compression for Stereo Images ( http://arxiv.org/abs/2108.09422v1 ) ライセンス: CC BY 4.0	Zihao Huang, Zhe Sun, Feng Duan, Andrzej Cichocki, Peiying Ruan and Chao Li	(参考訳) 多数の自動運転タスクには高精細なステレオ画像が必要であり、大量のストレージスペースを必要とする。効率よく無損失圧縮を実行することが現実的な問題となっている。一般に、各画素の正確な確率推定を行うのは難しい。そこで本稿では, ワープモジュールと確率推定モジュールの2つの主要モジュールからなるマルチスケールロスレス圧縮モデルであるL3C-Stereoを提案する。ワープモジュールは、同じドメインからの2つのビュー特徴写像を利用して、適切なビューを再構成し、正しいビューの確率推定の信頼性を向上させるために使用される不均一マップを生成する。確率推定モジュールは、適応算術符号化のための画素単位のロジスティック混合分布を提供する。実験では,3つのデータセットすべてにおいて,手作り圧縮法と学習ベース法を上回った。そして, 最大偏差が圧縮効果を向上させることを示す。さらに,本モデルの圧縮特性により,後続のステレオタスクに対して許容される品質の差マップを自然に生成する。 A large number of autonomous driving tasks need high-definition stereo images, which requires a large amount of storage space. Efficiently executing lossless compression has become a practical problem. Commonly, it is hard to make accurate probability estimates for each pixel. To tackle this, we propose L3C-Stereo, a multi-scale lossless compression model consisting of two main modules: the warping module and the probability estimation module. The warping module takes advantage of two view feature maps from the same domain to generate a disparity map, which is used to reconstruct the right view so as to improve the confidence of the probability estimate of the right view. The probability estimation module provides pixel-wise logistic mixture distributions for adaptive arithmetic coding. In the experiments, our method outperforms the hand-crafted compression methods and the learning-based method on all three datasets used. Then, we show that a better maximum disparity can lead to a better compression effect. Furthermore, thanks to a compression property of our model, it naturally generates a disparity map of an acceptable quality for the subsequent stereo tasks.	翻訳日:2021-08-25 09:40:25 公開日:2021-08-21
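The probability estimation module above is described as producing pixel-wise logistic mixture distributions for arithmetic coding. The sketch below shows how a discretized logistic mixture can assign a probability to each 8-bit pixel value and how that translates into an ideal code length. It is a generic illustration, not the L3C-Stereo implementation: the function names, the 256-level assumption, and the small probability floor are choices made here.

```python
import math


def sigmoid(x):
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def discretized_logistic_pmf(value, mean, scale, levels=256):
    """Probability mass of an integer bin: CDF(value + 0.5) - CDF(value - 0.5).

    The first and last bins absorb the tails so the masses sum to 1.
    """
    upper = 1.0 if value == levels - 1 else sigmoid((value + 0.5 - mean) / scale)
    lower = 0.0 if value == 0 else sigmoid((value - 0.5 - mean) / scale)
    return upper - lower


def mixture_pmf(value, weights, means, scales, levels=256):
    """Weighted sum of discretized logistic components (weights should sum to 1)."""
    return sum(
        w * discretized_logistic_pmf(value, m, s, levels)
        for w, m, s in zip(weights, means, scales)
    )


def ideal_code_length_bits(pixels, weights, means, scales):
    """Sum of -log2 p(pixel): the length an ideal arithmetic coder would approach."""
    total = 0.0
    for pixel in pixels:
        probability = max(mixture_pmf(pixel, weights, means, scales), 1e-12)  # avoid log(0)
        total += -math.log2(probability)
    return total
```

The sharper and better centered the predicted distribution is around the true pixel value, the fewer bits the coder needs. This is why a more confident probability estimate improves the compression ratio.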
# (参考訳) 腫瘍内パーティショニングのための特徴表現を増強した適応的教師なし学習とグリオ芽腫の生存予測 Adaptive unsupervised learning with enhanced feature representation for intra-tumor partitioning and survival prediction for glioblastoma ( http://arxiv.org/abs/2108.09423v1 ) ライセンス: CC BY 4.0	Yifan Li, Chao Li, Yiran Wei, Stephen Price, Carola-Bibiane Schönlieb, Xi Chen	(参考訳) グリオ芽腫は局所的な微細構造と血管に非常に異質である。グリオブラスト腫の空間的多様性はより正確な治療につながる可能性がある。教師なし学習法では,Glioblastoma MRI由来の放射線学的特徴が腫瘍亜領域のセグメンテーションや生存予測に広く利用されている。しかし、アルゴリズムの結果の信頼性は、あいまいな中間過程と、クラスタリングアルゴリズムのランダム性、特に異種患者のデータによってもたらされる不安定性の両方によってしばしば問題となる。本稿では, 腫瘍内パーティショニングとグリオーマ生存予測のための適応型非教師なし学習手法を提案する。 K-meansのような教師なし学習アルゴリズムのクラスタリング安定性を向上させるために,新規かつ問題特異的な自動エンコーダ(FAE)を開発した。さらに、プロセス全体をベイズ最適化(BO)技法でモデル化し、ハイパーパラメータを適度な数ステップで適応的に最適化することができるようにした。その結果,提案手法はロバストで臨床的に関連するmriサブリージョンと統計的に有意な生存予測を生成できることがわかった。 Glioblastoma is profoundly heterogeneous in regional microstructure and vasculature. Characterizing the spatial heterogeneity of glioblastoma could lead to more precise treatment. With unsupervised learning techniques, glioblastoma MRI-derived radiomic features have been widely utilized for tumor sub-region segmentation and survival prediction. However, the reliability of algorithm outcomes is often challenged by both the ambiguous intermediate process and the instability introduced by the randomness of clustering algorithms, especially for data from heterogeneous patients. In this paper, we propose an adaptive unsupervised learning approach for efficient MRI intra-tumor partitioning and glioblastoma survival prediction. A novel and problem-specific Feature-enhanced Auto-Encoder (FAE) is developed to enhance the representation of pairwise clinical modalities and therefore improve clustering stability of unsupervised learning algorithms such as K-means. Moreover, the entire process is modelled by the Bayesian optimization (BO) technique with a custom loss function so that the hyper-parameters can be adaptively optimized in a reasonably small number of steps. 
The results demonstrate that the proposed approach can produce robust and clinically relevant MRI sub-regions and statistically significant survival predictions.	翻訳日:2021-08-25 09:20:31 公開日:2021-08-21
# (参考訳) ARAPReg:変形可能な形状発電機を学習する正規化損失の可能性 ARAPReg: An As-Rigid-As Possible Regularization Loss for Learning Deformable Shape Generators ( http://arxiv.org/abs/2108.09432v1 ) ライセンス: CC BY 4.0	Qixing Huang, Xiangru Huang, Bo Sun, Zaiwei Zhang, Junfeng Jiang and Chandrajit Bajaj	(参考訳) 本稿では,パラメトリック変形形状生成器の訓練のための教師なし損失について述べる。鍵となる考え方は、生成した形状間の局所剛性の保存を強制することである。本手法は,as-rigid-as possible (または arap) 変形エネルギーの近似に基づく。本稿では,ARAPエネルギーのヘシアンスペクトル分解による教師なし損失の展開について述べる。私たちの損失は、強固な規範を通してポーズと形の変化をうまく分離します。損失は単純な閉形式表現を許容する。訓練が容易で、可変オートエンコーダ(VAE)やオートデコーダ(AD)など、任意の標準世代モデルにプラグインすることができる。実験の結果,人間,動物,骨といった様々な形状カテゴリの公開ベンチマークデータセットにおいて,既存の形状生成アプローチをかなり上回っていることがわかった。 This paper introduces an unsupervised loss for training parametric deformation shape generators. The key idea is to enforce the preservation of local rigidity among the generated shapes. Our approach builds on an approximation of the as-rigid-as possible (or ARAP) deformation energy. We show how to develop the unsupervised loss via a spectral decomposition of the Hessian of the ARAP energy. Our loss nicely decouples pose and shape variations through a robust norm. The loss admits simple closed-form expressions. It is easy to train and can be plugged into any standard generation models, e.g., variational auto-encoder (VAE) and auto-decoder (AD). Experimental results show that our approach outperforms existing shape generation approaches considerably on public benchmark datasets of various shape categories such as human, animal and bone.	翻訳日:2021-08-25 09:07:38 公開日:2021-08-21
# (参考訳) deepedgebench: エッジデバイス上のディープニューラルネットワークのベンチマーク DeepEdgeBench: Benchmarking Deep Neural Networks on Edge Devices ( http://arxiv.org/abs/2108.09457v1 ) ライセンス: CC BY 4.0	Stephan Patrick Baller, Anshul Jindal, Mohak Chadha, Michael Gerndt	(参考訳) EdgeAI(Edgeコンピューティングベースの人工知能)は、厳しいレイテンシ要件を満たすために、多種多様な分散AIアプリケーションを扱うために、ここ数年、最も活発に研究されている。一方、多くの企業は、エッジコンピューティング環境で計算ノードとして機能するために、人気のRaspberry PiやNvidiaのJetson Nanoのような、フォームファクタ(消費電力とリソースの制限)の少ないエッジデバイスをリリースしている。エッジデバイスはコンピューティングのパワーとハードウェアのリソースで制限されているが、パフォーマンスを向上させるためにアクセラレーターによって駆動される。したがって、AIベースのDeep Neural Networksが限られたリソースを持つデバイス上でどのように機能するかは興味深い。本研究では,Asus Tinker Edge R, Raspberry Pi 4, Google Coral Dev Board, Nvidia Jetson Nano, そして1つのマイクロコントローラであるArduino Nano 33 BLEを,異なるディープラーニングモデルとフレームワーク上で,チップ上での4つのシステム(SoC)の推論時間と消費電力で比較した。また,装置の消費電力,推定時間,精度を計測し,他の機器に容易に拡張できる方法を提案する。我々の結果は、Tensorflowベースの量子化モデルでは、Google Coral Dev Boardが推論時間と消費電力の両方で最高のパフォーマンスを提供します。計算時間の少ない部分、すなわち、計算時間 MobileNetV2の29.3%以下では、Jetson Nanoは他のデバイスよりも高速に動作している。 EdgeAI (Edge computing based Artificial Intelligence) has been most actively researched for the last few years to handle variety of massively distributed AI applications to meet up the strict latency requirements. Meanwhile, many companies have released edge devices with smaller form factors (low power consumption and limited resources) like the popular Raspberry Pi and Nvidia's Jetson Nano for acting as compute nodes at the edge computing environments. Although the edge devices are limited in terms of computing power and hardware resources, they are powered by accelerators to enhance their performance behavior. Therefore, it is interesting to see how AI-based Deep Neural Networks perform on such devices with limited resources. 
In this work, we present and compare the performance, in terms of inference time and power consumption, of four Systems on a Chip (SoCs): Asus Tinker Edge R, Raspberry Pi 4, Google Coral Dev Board, Nvidia Jetson Nano, and one microcontroller: Arduino Nano 33 BLE, on different deep learning models and frameworks. We also provide a method for measuring power consumption, inference time and accuracy for the devices, which can be easily extended to other devices. Our results show that, for TensorFlow-based quantized models, the Google Coral Dev Board delivers the best performance, both for inference time and power consumption. For a low fraction of inference computation time, i.e. less than 29.3% of the time for MobileNetV2, the Jetson Nano performs faster than the other devices.	翻訳日:2021-08-25 08:44:45 公開日:2021-08-21
# (参考訳) 教師なしドメイン適応のためのロバスト組立ネットワーク Robust Ensembling Network for Unsupervised Domain Adaptation ( http://arxiv.org/abs/2108.09473v1 ) ライセンス: CC BY 4.0	Han Sun, Lei Lin, Ningzhong Liu, Huiyu Zhou	(参考訳) 近年,unsupervised domain adaptation (uda)問題に対処するために,転送可能なモデルを実現するための広範な研究が提案されている。その中でも最も一般的な手法は、ソースドメインとターゲットドメイン間の距離を短くする、逆領域適応法である。敵対的学習は非常に効果的であるが、ネットワークの不安定性と混乱したカテゴリ情報の欠点につながる。本稿では,情報伝達のためのグローバル情報学習にロバストな時間センシング教師ネットワークを適用した,udaのためのロバストセンシングネットワーク (ren) を提案する。具体的には、主に教師ネットワークと生徒ネットワークを含み、標準ドメイン適応トレーニングを行い、教師ネットワークの重みを更新する。さらに, 判別器の能力を向上させるために, 二重ネットワーク条件付き対向損失を提案する。最後に,学生ネットワークの基本能力を向上させるために,学生ネットワークと教師ネットワークの誤りのバランスをとるために,一貫性制約を利用する。いくつかのUDAデータセットに対する大規模な実験結果は、他の最先端UDAアルゴリズムと比較することにより、我々のモデルの有効性を実証した。 Recently, in order to address the unsupervised domain adaptation (UDA) problem, extensive studies have been proposed to achieve transferrable models. Among them, the most prevalent method is adversarial domain adaptation, which can shorten the distance between the source domain and the target domain. Although adversarial learning is very effective, it still leads to the instability of the network and the drawbacks of confusing category information. In this paper, we propose a Robust Ensembling Network (REN) for UDA, which applies a robust time ensembling teacher network to learn global information for domain transfer. Specifically, REN mainly includes a teacher network and a student network, which performs standard domain adaptation training and updates weights of the teacher network. In addition, we also propose a dual-network conditional adversarial loss to improve the ability of the discriminator. Finally, for the purpose of improving the basic ability of the student network, we utilize the consistency constraint to balance the error between the student network and the teacher network. Extensive experimental results on several UDA datasets have demonstrated the effectiveness of our model by comparing with other state-of-the-art UDA algorithms.	
翻訳日:2021-08-25 08:19:59 公開日:2021-08-21
# (参考訳) MimicBot:ImitationとReinforcement Learningを組み合わせてBot Bowlで優勝 MimicBot: Combining Imitation and Reinforcement Learning to win in Bot Bowl ( http://arxiv.org/abs/2108.09478v1 ) ライセンス: CC BY 4.0	Nicola Pezzotti	(参考訳) 本稿では,Bot Bowl IIIコンペティションに参加したFantasy Football AIでプレイするように訓練されたハイブリッドエージェントについて述べる。エージェントであるMimicBotは、特別に設計されたディープポリシーネットワークを使用して実装され、模倣と強化学習の組み合わせを使って訓練される。このような文脈で強化学習アプローチを用いた以前の試みは、いくつかの理由で失敗した。環境に内在するランダム性と、利用可能なアクションの数が大きくて不均一であるため、カリキュラム学習アプローチは、ランダムにプレイするエージェントを一貫して打ち負かせない。現在、機械学習のアプローチは、ゲーム上のドメイン知識を利用するスクリプトボットを打ち負かすことはできない。私たちのソリューションは、模倣学習とハイブリッド意思決定プロセスのおかげで、一貫してこのようなスクリプトエージェントを破ります。さらに,強化学習環境において,サンプル効率を劇的に向上させながら,より効率的にトレーニングする方法を考察した。 MimicBotはBot Bowl IIIコンペティションの勝者であり、現在最先端のソリューションである。 This paper describes a hybrid agent trained to play in Fantasy Football AI which participated in the Bot Bowl III competition. The agent, MimicBot, is implemented using a specifically designed deep policy network and trained using a combination of imitation and reinforcement learning. Previous attempts at using a reinforcement learning approach in this context failed for a number of reasons, e.g. due to the intrinsic randomness in the environment and the large and uneven number of actions available, with a curriculum learning approach failing to consistently beat a randomly playing agent. Currently no machine learning approach can beat a scripted bot which makes use of domain knowledge of the game. Our solution, thanks to imitation learning and a hybrid decision-making process, consistently beats such scripted agents. Moreover, we shed light on how to train more efficiently in a reinforcement learning setting while drastically increasing sample efficiency. MimicBot is the winner of the Bot Bowl III competition, and it is currently the state-of-the-art solution.	翻訳日:2021-08-25 08:08:10 公開日:2021-08-21
# (参考訳) Grid-VLP:ビジョンランゲージ事前トレーニングのためのグリッド機能の再検討 Grid-VLP: Revisiting Grid Features for Vision-Language Pre-training ( http://arxiv.org/abs/2108.09479v1 ) ライセンス: CC BY 4.0	Ming Yan, Haiyang Xu, Chenliang Li, Bin Bi, Junfeng Tian, Min Gui and Wei Wang	(参考訳) 視覚言語前訓練(vlp)に対する既存のアプローチは、境界ボックス(領域)に基づいた物体検出器に強く依存しており、最初に画像からサルエントオブジェクトを検出し、その後、トランスフォーマティブベースのモデルを使用してクロスモーダル融合を行う。優れた性能にもかかわらず、これらのアプローチは有効性と効率の両面で対象検出器の能力に縛られている。さらに、オブジェクト検出の存在はモデル設計に不必要な制約を課し、エンドツーエンドのトレーニングをサポートするのが難しくなる。本稿では,視覚言語事前学習のためのグリッドベースの畳み込み機能を再検討し,高価な地域関連ステップをスキップする。本稿では,グリッド機能と驚くほどうまく連携する,単純かつ効果的なグリッドベースVLP法を提案する。ドメイン内データセットのみを事前学習することにより,提案手法は,3つの視覚言語理解タスクにおいて,最も競争力のある領域ベースのVLP手法より優れている。本研究の成果は,視覚言語プレトレーニング技術の進歩に寄与し,より効果的かつ効率的なVLPに向けた新たな方向性を提供することを願っている。 Existing approaches to vision-language pre-training (VLP) heavily rely on an object detector based on bounding boxes (regions), where salient objects are first detected from images and then a Transformer-based model is used for cross-modal fusion. Despite their superior performance, these approaches are bounded by the capability of the object detector in terms of both effectiveness and efficiency. Besides, the presence of object detection imposes unnecessary constraints on model designs and makes it difficult to support end-to-end training. In this paper, we revisit grid-based convolutional features for vision-language pre-training, skipping the expensive region-related steps. We propose a simple yet effective grid-based VLP method that works surprisingly well with the grid features. By pre-training only with in-domain datasets, the proposed Grid-VLP method can outperform most competitive region-based VLP methods on three examined vision-language understanding tasks. We hope that our findings help to further advance the state of the art of vision-language pre-training, and provide a new direction towards effective and efficient VLP.	翻訳日:2021-08-25 07:50:01 公開日:2021-08-21
# (参考訳) yseop at finsim-3 shared task 2021: specializing financial domain learning with phrase representations Yseop at FinSim-3 Shared Task 2021: Specializing Financial Domain Learning with Phrase Representations ( http://arxiv.org/abs/2108.09485v1 ) ライセンス: CC BY 4.0	Hanna Abi Akl, Dominique Mariko, Hugues de Mazancourt	(参考訳) 本稿では,FinSim-3共有タスク2021:財務分野のセマンティック類似性を学ぶためのアプローチを提案する。この共有タスクの目的は、金融ドメインから与えられた用語のリストを、外部オントロジーにおいて最も関連するハイパーnym(またはトップレベル)概念に正しく分類することである。そこで,本研究では,カスタムコーパス上で事前学習した文-roberta(sroberta)埋め込みモデルと,ファストテキストモデルを用いて提案するベースライン単語埋め込み構造を改善し,分類性能を向上させる2つの文-sentence埋め込みモデルの評価を行った。両指標で総合2位、平均精度で0.917、平均ランクで1.141。 In this paper, we present our approaches for the FinSim-3 Shared Task 2021: Learning Semantic Similarities for the Financial Domain. The aim of this shared task is to correctly classify a list of given terms from the financial domain into the most relevant hypernym (or top-level) concept in an external ontology. For our system submission, we evaluate two methods: a Sentence-RoBERTa (SRoBERTa) embeddings model pre-trained on a custom corpus, and a dual word-sentence embeddings model that builds on the first method by improving the proposed baseline word embeddings construction using the FastText model to boost the classification performance. Our system ranks 2nd overall on both metrics, scoring 0.917 on Average Accuracy and 1.141 on Mean Rank.	翻訳日:2021-08-25 07:42:00 公開日:2021-08-21
# (参考訳) flikcer - リアルタイム輝度周波数解析によるオンラインてんかん原性視覚コンテンツを解決するためのchromeエクステンション Flikcer -- A Chrome Extension to Resolve Online Epileptogenic Visual Content with Real-Time Luminance Frequency Analysis ( http://arxiv.org/abs/2108.09491v1 ) ライセンス: CC BY 4.0	Jaisal Kothari, Ashay Srivastava	(参考訳) 映像コンテンツの輝度変動が速いか、あるいはてんかん原性視覚コンテンツと呼ばれる高コントラストの空間パターンが、感光性てんかんの視聴者に発作を誘発し、さらにこの疾患の影響を受けないユーザーに不快感を引き起こすこともある。 flikcerはwebサイトとchromeエクステンションという形で、ビデオのてんかん的なコンテンツを解決しようとするウェブアプリだ。これは発作の可能性のあるトリガーの数を提供する。また、これらのトリガーのタイムスタンプや、ビデオのより安全なバージョンも無料でダウンロードできる。アルゴリズムはpythonで書かれており、機械学習とコンピュータビジョンを使用している。このアルゴリズムの重要な側面は計算効率であり、利用者のリアルタイムな実装を可能にする。 Video content with fast luminance variations, or with spatial patterns of high contrast - referred to as epileptogenic visual content - may induce seizures on viewers with photosensitive epilepsy, and even cause discomfort in users not affected by this disease. Flikcer is a web app in the form of a website and chrome extension which aims to resolve epileptic content in videos. It provides the number of possible triggers for a seizure. It also provides the timestamps for these triggers along with a safer version of the video, free to download. The algorithm is written in Python and uses machine learning and computer vision. A key aspect of the algorithm is its computational efficiency, allowing real time implementation for public users.	翻訳日:2021-08-25 07:34:50 公開日:2021-08-21
# (参考訳) 文書アライメントのための多言語文類似度測定におけるメトリック学習 Metric Learning in Multilingual Sentence Similarity Measurement for Document Alignment ( http://arxiv.org/abs/2108.09495v1 ) ライセンス: CC BY 4.0	Charith Rajitha, Lakmali Piyarathne, Dilan Sachintha, Surangika Ranathunga	(参考訳) 多言語文表現に基づく文書アライメント技術は,最近,その成果が示された。しかし、これらの手法は教師なし距離測定技術に依存しており、手作業では微調整できない。本稿では,これらの教師なし距離測定手法の代わりに,タスク固有距離測定の導出にメトリックラーニングを用いる。これらの測定は教師あり、つまり距離測定メトリックは並列データセットを使って訓練される。 3つの異なる言語族に属する英語、シンハラ語、タミル語に属するデータセットを用いて、これらのタスク固有の教師付き距離学習メトリクスが、教師なし距離学習指標よりもドキュメントアライメントに優れていることを示す。 Document alignment techniques based on multilingual sentence representations have recently shown state of the art results. However, these techniques rely on unsupervised distance measurement techniques, which cannot be fined-tuned to the task at hand. In this paper, instead of these unsupervised distance measurement techniques, we employ Metric Learning to derive task-specific distance measurements. These measurements are supervised, meaning that the distance measurement metric is trained using a parallel dataset. Using a dataset belonging to English, Sinhala, and Tamil, which belong to three different language families, we show that these task-specific supervised distance learning metrics outperform their unsupervised counterparts, for document alignment.	翻訳日:2021-08-25 07:23:51 公開日:2021-08-21
# (参考訳) 文書間の関係抽出のための階層型エンティティグラフ畳み込みネットワーク A Hierarchical Entity Graph Convolutional Network for Relation Extraction across Documents ( http://arxiv.org/abs/2108.09505v1 ) ライセンス: CC0 1.0	Tapas Nayak and Hwee Tou Ng	(参考訳) 関係抽出のための遠方の教師付きデータセットは、主に文レベルの抽出に焦点を当てており、関係性が非常に少ない。本稿では,関係タプルの2つの実体が,共通実体の連鎖を介して連結された2つの異なる文書に現れるクロスドキュメント関係抽出を提案する。このアイデアに従い、各チェーンが正確に2つのドキュメントを含む2つのホップ関係抽出のためのデータセットを作成する。提案するデータセットは,公開可能な文レベルのデータセットよりも高い関係性をカバーする。また,この課題に対する階層型エンティティグラフ畳み込みネットワーク(HEGCN)モデルを提案する。 Distantly supervised datasets for relation extraction mostly focus on sentence-level extraction, and they cover very few relations. In this work, we propose cross-document relation extraction, where the two entities of a relation tuple appear in two different documents that are connected via a chain of common entities. Following this idea, we create a dataset for two-hop relation extraction, where each chain contains exactly two documents. Our proposed dataset covers a higher number of relations than the publicly available sentence-level datasets. We also propose a hierarchical entity graph convolutional network (HEGCN) model for this task that improves performance by 1.1% F1 score on our two-hop relation extraction dataset, compared to some strong neural baselines.	翻訳日:2021-08-25 07:14:28 公開日:2021-08-21
# (参考訳) テンソル場上の学習変換のための回転同変ニューラル演算子(例えば3次元画像とベクトル場) Rotationally Equivariant Neural Operators for Learning Transformations on Tensor Fields (eg 3D Images and Vector Fields) ( http://arxiv.org/abs/2108.09541v1 ) ライセンス: CC BY 4.0	Paul Shen, Michael Herbst, Venkat Viswanathan	(参考訳) テンソル場の集合間の変換および回転同変変換と同様に、学習分解不変量に対する同変ニューラルネットワークを導入する。入力と出力はスカラー場、ベクトル場、二階テンソル場、高階場の任意の混合を含むことができる。我々のテンソル場畳み込み層は任意の線型作用素をエミュレートし、そのインパルス応答やグリーン関数を畳み込み核として学習する。テンソル場注目層は局所テンソル積を介してペアワイズ場結合をエミュレートする。畳み込みとそれに付随する随伴体は実あるいはフーリエ空間に存在し、線形スケーリングが可能である。 E3NN, TBNN, FNOの概念を統一することにより, 工学および量子化学における幅広いPDEおよび力学系の予測性能が向上する。コードはJuliaにあり、著者からの要望に応じて入手できる。 We introduce equivariant neural operators for learning resolution invariant as well as translation and rotation equivariant transformations between sets of tensor fields. Input and output may contain arbitrary mixes of scalar fields, vector fields, second order tensor fields and higher order fields. Our tensor field convolution layers emulate any linear operator by learning its impulse response or Green's function as the convolution kernel. Our tensor field attention layers emulate pairwise field coupling via local tensor products. Convolutions and associated adjoints can be in real or Fourier space allowing for linear scaling. By unifying concepts from E3NN, TBNN and FNO, we achieve good predictive performance on a wide range of PDEs and dynamical systems in engineering and quantum chemistry. Code is in Julia and available upon request from authors.	翻訳日:2021-08-25 07:03:22 公開日:2021-08-21
# (参考訳) 時空間データマニフォールドの連成特性 Joint Characterization of Spatiotemporal Data Manifolds ( http://arxiv.org/abs/2108.09545v1 ) ライセンス: CC BY 4.0	Daniel Sousa and Christopher Small	(参考訳) 時空間(ST)画像データはますます一般的になり、しばしば高次元(高次元)である。 STデータのモデリングは、独立して相互作用するプロセスが多々存在するため、測定に寄与するかもしれないし、貢献しないかもしれない。キャラクタリゼーションは、生成過程とそのデータ表現に関する仮定の導出を支援することによって、モデリングの補完と見なすことができる。次元減少(DR)は、高次元信号の「次元の曲線」を緩和するためにしばしば実装される特徴である。長年にわたり、主成分(PC)と経験直交関数(EOF)分析は、DRおよびST分析に対する線形で可逆的なアプローチとして用いられてきた。近年、非線形drアルゴリズムのスイートが開発され、しばしば"manifold learning"と分類されている。ここでは、ラプラシアン固有写像 (LE) と t-分散確率的隣接埋め込み (t-SNE) の2つの非線形DRアプローチとともに、PC/EOFを用いたSTデータ多様体の合同特徴づけについて検討する。合成例から始まり,空間で約5桁,時間で2桁のstデータセットを大域的,地域的,フィールドスケールに展開し,これら3つのdrアプローチがst多様体トポロジーに関する補完的情報が得られることを示す。 PCs/EOFs による比較的拡散したTFS と比較して、非線形アプローチは、時間的終端部材 (LE) および/または時空間クラスタリング (t-SNE) におけるあいまいさを減少させたよりコンパクトな多様体を生成する。これらの特性は、LEやt-SNEよりも高い解釈可能性、計算要求の大幅な低減、PC/EOFの空間エイリアスに対する感度の低下によって補償される。総合的に考えると, 単一のアプローチだけで, 生成st過程をより深く把握できる3つの相補的なdrアプローチを用いた共同キャラクタリゼーションを見いだすことができる。 Spatiotemporal (ST) image data are increasingly common and often high-dimensional (high-D). Modeling ST data can be a challenge due to the plethora of independent and interacting processes which may or may not contribute to the measurements. Characterization can be considered the complement to modeling by helping guide assumptions about generative processes and their representation in the data. Dimensionality reduction (DR) is a frequently implemented type of characterization designed to mitigate the "curse of dimensionality" on high-D signals. For decades, Principal Component (PC) and Empirical Orthogonal Function (EOF) analysis has been used as a linear, invertible approach to DR and ST analysis. Recent years have seen the additional development of a suite of nonlinear DR algorithms, frequently categorized as "manifold learning". 
Here, we explore the idea of joint characterization of ST data manifolds using PCs/EOFs alongside two nonlinear DR approaches: Laplacian Eigenmaps (LE) and t-distributed stochastic neighbor embedding (t-SNE). Starting with a synthetic example and progressing to global, regional, and field scale ST datasets spanning roughly 5 orders of magnitude in space and 2 in time, we show these three DR approaches can yield complementary information about ST manifold topology. Compared to the relatively diffuse TFS produced by PCs/EOFs, the nonlinear approaches yield more compact manifolds with decreased ambiguity in temporal endmembers (LE) and/or in spatiotemporal clustering (t-SNE). These properties are compensated by the greater interpretability, significantly lower computational demand and diminished sensitivity to spatial aliasing for PCs/EOFs than LE or t-SNE. Taken together, we find joint characterization using the three complementary DR approaches capable of greater insight into generative ST processes than possible using any single approach alone.	翻訳日:2021-08-25 06:59:04 公開日:2021-08-21
# (参考訳) ピカチュウはどれくらいかわいいか? Pokémon 単語埋め込みを用いたデータからの Pokémon 特性の収集とランク付け How Cute is Pikachu? Gathering and Ranking Pokémon Properties from Data with Pokémon Word Embeddings ( http://arxiv.org/abs/2108.09546v1 ) ライセンス: CC BY 4.0	Mika Hämäläinen, Khalid Alnajjar and Niko Partanen	(参考訳) 我々は,151個のオリジナル Pokémon に対して,記述的な特性を自動的に得るための異なる方法を提案する。クロールしたPokémonコーパス上に複数の単語埋め込みモデルをトレーニングし、与えられたPokémonにどのような特徴があるかに基づいて、自動的に英語の形容詞をランク付けする。我々の実験に基づいて、事前訓練されたモデルを使用するよりも、ドメイン固有のデータでモデルをトレーニングする方がよい。 Word2Vecは、結果においてfastTextモデルよりもノイズが少ない。さらに、各Pokémonのプロパティのリストを自動的に拡張します。しかし、いずれの手法も完璧ではなく、異なるセマンティックモデルにはかなりのノイズがある。私たちのモデルはZenodoでリリースされました。 We present different methods for obtaining descriptive properties automatically for the 151 original Pokémon. We train several different word embeddings models on a crawled Pokémon corpus, and use them to rank automatically English adjectives based on how characteristic they are to a given Pokémon. Based on our experiments, it is better to train a model with domain specific data than to use a pretrained model. Word2Vec produces less noise in the results than fastText model. Furthermore, we expand the list of properties for each Pokémon automatically. However, none of the methods is spot on and there is a considerable amount of noise in the different semantic models. Our models have been released on Zenodo.	翻訳日:2021-08-25 06:43:56 公開日:2021-08-21
# (参考訳) 熱可視顔認証のための合成手法 A Synthesis-Based Approach for Thermal-to-Visible Face Verification ( http://arxiv.org/abs/2108.09558v1 ) ライセンス: CC BY 4.0	Neehar Peri, Joshua Gleason, Carlos D. Castillo, Thirimachos Bourlai, Vishal M. Patel, Rama Chellappa	(参考訳) 近年,検査官の認識性能に適合する可視分光顔認証システムが提案されている。しかし、このようなシステムは低照度や夜間では効果がない。体温を吸収する熱顔画像は、可視光スペクトルを効果的に増強し、照明が制限されたシーンで識別可能な顔の特徴を捉える。コストの増大と多様な熱スペクトルと可視スペクトルデータセットの取得の困難さから、アルゴリズムや低光度認識のための大規模ベンチマークは限られている。本稿では,ARL-VTFとTUFTSの両方のマルチスペクトル顔データに対して,最先端の性能を実現するアルゴリズムを提案する。さらに,マルチスペクトル顔合成と検証のためのラベル平滑化による顔アライメント,ピクセルレベル対応,アイデンティティ分類の影響について検討した。提案手法は広く適用可能であり,堅牢であり,かつ高い有効性を示す。また,提案手法は,プロファイル対フロント検証において,フェイスフロント化法を有意に上回っていることを示す。最後にmilab-vtf(b)を提案する。これは対のサーマルビデオと可視ビデオで構成される、挑戦的なマルチスペクトル顔データセットである。私たちの知る限りでは、400人の被験者による顔データとともに、このデータセットは、屋内および長距離の熱可視性顔画像の最も広範なコレクションである。最後に,MILAB-VTF(B)データセットに対して,エンドツーエンドのサーマル・トゥ・ザ・ヴィジュアブル・フェース・検証システムにより高い性能が得られることを示す。 In recent years, visible-spectrum face verification systems have been shown to match expert forensic examiner recognition performance. However, such systems are ineffective in low-light and nighttime conditions. Thermal face imagery, which captures body heat emissions, effectively augments the visible spectrum, capturing discriminative facial features in scenes with limited illumination. Due to the increased cost and difficulty of obtaining diverse, paired thermal and visible spectrum datasets, algorithms and large-scale benchmarks for low-light recognition are limited. This paper presents an algorithm that achieves state-of-the-art performance on both the ARL-VTF and TUFTS multi-spectral face datasets. Importantly, we study the impact of face alignment, pixel-level correspondence, and identity classification with label smoothing for multi-spectral face synthesis and verification. We show that our proposed method is widely applicable, robust, and highly effective. 
In addition, we show that the proposed method significantly outperforms face frontalization methods on profile-to-frontal verification. Finally, we present MILAB-VTF(B), a challenging multi-spectral face dataset that is composed of paired thermal and visible videos. To the best of our knowledge, with face data from 400 subjects, this dataset represents the most extensive collection of publicly available indoor and long-range outdoor thermal-visible face imagery. Lastly, we show that our end-to-end thermal-to-visible face verification system provides strong performance on the MILAB-VTF(B) dataset.	翻訳日:2021-08-25 06:34:29 公開日:2021-08-21
# (参考訳) 連続学習における主勾配方向と信頼貯留層サンプリング Principal Gradient Direction and Confidence Reservoir Sampling for Continual Learning ( http://arxiv.org/abs/2108.09592v1 ) ライセンス: CC BY 4.0	Zhiyi Chen and Tong Lin	(参考訳) タスクフリーオンライン連続学習は、非iidデータストリーム上の学習者の破滅的な忘却を緩和することを目的としている。 Experience Replay (ER) はSOTA連続学習法であり、他のリプレイ手法のバックボーンアルゴリズムとして広く使われている。しかし, ERのトレーニング戦略は, リプレイされた例を十分に活用するには単純すぎ, そのリザーバサンプリング戦略も最適ではない。本研究では,ERを特別な場合とみなすことのできる一般近位勾配フレームワークを提案する。さらに,主勾配方向(PGD)と信頼度リザーバサンプリング(CRS)の2つの改良点を提案する。主勾配方向において,過去の勾配の大きな寄与を表すだけでなく,現在の勾配に関する新たな知識も保持する目標勾配を最適化する。次に、保存されたサンプルの値を測定するマージンベースのメトリックに基づいて、より有益なメモリバッファを維持するための信頼度リザーバサンプリングを示す。実験により両方の改良の有効性が裏付けられ、提案アルゴリズムはSOTAのERベース手法であるMIR-replayの性能を一貫して向上させる。具体的には、4つのデータセットで平均精度を最大7.9%向上させ、忘却を最大15.4%低減する。 Task-free online continual learning aims to alleviate catastrophic forgetting of the learner on a non-iid data stream. Experience Replay (ER) is a SOTA continual learning method, which is broadly used as the backbone algorithm for other replay-based methods. However, the training strategy of ER is too simple to take full advantage of replayed examples and its reservoir sampling strategy is also suboptimal. In this work, we propose a general proximal gradient framework so that ER can be viewed as a special case. We further propose two improvements accordingly: Principal Gradient Direction (PGD) and Confidence Reservoir Sampling (CRS). In Principal Gradient Direction, we optimize a target gradient that not only represents the major contribution of past gradients, but also retains the new knowledge of the current gradient. We then present Confidence Reservoir Sampling for maintaining a more informative memory buffer based on a margin-based metric that measures the value of stored examples. Experiments substantiate the effectiveness of both our improvements and our new algorithm consistently boosts the performance of MIR-replay, a SOTA ER-based method: our algorithm increases the average accuracy up to 7.9% and reduces forgetting up to 15.4% on four datasets.	翻訳日:2021-08-25 06:20:02 公開日:2021-08-21
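The abstract above names reservoir sampling as the memory-buffer strategy of Experience Replay. As a rough illustration only, the following is a minimal sketch of plain (uniform) reservoir sampling. It is not the paper's Confidence Reservoir Sampling, whose margin-based metric is not specified in the abstract. The names `reservoir_update` and `fill_buffer` are hypothetical.

```python
import random


def reservoir_update(buffer, capacity, item, n_seen, rng=random):
    """Offer the n_seen-th stream item (1-indexed) to a fixed-size buffer.

    While the buffer is not full, the item is always stored. Afterwards it
    replaces a uniformly chosen slot with probability capacity / n_seen,
    which keeps every item seen so far equally likely to be in the buffer.
    """
    if len(buffer) < capacity:
        buffer.append(item)
    else:
        j = rng.randrange(n_seen)  # uniform integer in [0, n_seen)
        if j < capacity:
            buffer[j] = item
    return buffer


def fill_buffer(stream, capacity, seed=0):
    """Run reservoir sampling over a whole stream (hypothetical helper)."""
    rng = random.Random(seed)
    buffer = []
    for n_seen, item in enumerate(stream, start=1):
        reservoir_update(buffer, capacity, item, n_seen, rng)
    return buffer
```

A confidence-aware variant would presumably change which slot is overwritten instead of picking it uniformly, but that detail is an assumption and is left out here.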
# (参考訳) 長文音声対話のための階層的要約 Hierarchical Summarization for Longform Spoken Dialog ( http://arxiv.org/abs/2108.09597v1 ) ライセンス: CC BY 4.0	Daniel Li, Thomas Chen, Albert Tung, Lydia Chilton	(参考訳) 私たちは毎日会話に囲まれています。この媒体は、監査的に多様な情報ストリームを提供するが、体系的にダイアログを理解することは、しばしば非自明である。音声対話の広汎性にもかかわらず、自動音声理解と品質情報抽出は、特に文章の散文と比較した場合、著しく貧弱である。さらに、テキストを理解することに比べ、聴覚コミュニケーションは、話者の拡散、非公式な散文スタイル、構造の欠如など、多くの課題をもたらす。これらの懸念はすべて、ユーザが話し言葉のドメインを理解し、ナビゲートするのに役立つ、明確にカスタマイズされた対話システムの必要性を示しています。個々の自動音声認識(ASR)とテキスト要約法はすでに存在するが、それらは不完全な技術であり、ユーザ目的や意図、音声言語による合併症への対処も考慮していない。その結果、2段階のASRとテキスト要約パイプラインを設計し、これらの音声認識課題を解決するためのセマンティックセグメンテーションとマージアルゴリズムを提案する。本システムでは,ユーザが簡単にコンテンツを閲覧・ナビゲートできるだけでなく,これらの基盤技術におけるエラーからの回復も可能である。最後に,音声を素早くスキップし,ユーザの興味のある内容を識別するツールとして,階層的な要約のユーザの好みを強調するシステムの評価を行う。 Every day we are surrounded by spoken dialog. This medium delivers rich diverse streams of information auditorily; however, systematically understanding dialog can often be non-trivial. Despite the pervasiveness of spoken dialog, automated speech understanding and quality information extraction remains markedly poor, especially when compared to written prose. Furthermore, compared to understanding text, auditory communication poses many additional challenges such as speaker disfluencies, informal prose styles, and lack of structure. These concerns all demonstrate the need for a distinctly speech tailored interactive system to help users understand and navigate the spoken language domain. While individual automatic speech recognition (ASR) and text summarization methods already exist, they are imperfect technologies; neither consider user purpose and intent nor address spoken language induced complications. Consequently, we design a two stage ASR and text summarization pipeline and propose a set of semantic segmentation and merging algorithms to resolve these speech modeling challenges. Our system enables users to easily browse and navigate content as well as recover from errors in these underlying technologies. 
Finally, we present an evaluation of the system which highlights user preference for hierarchical summarization as a tool to quickly skim audio and identify content of interest to the user.	翻訳日:2021-08-25 06:10:24 公開日:2021-08-21
# (参考訳) SERF:log-Softplus ERrorActivation Functionを用いたディープニューラルネットワークのより良いトレーニングを目指して SERF: Towards better training of deep neural networks using log-Softplus ERror activation Function ( http://arxiv.org/abs/2108.09598v1 ) ライセンス: CC BY 4.0	Sayan Nag, Mayukh Bhattacharyya	(参考訳) アクティベーション機能は、トレーニングダイナミクスとニューラルネットワークのパフォーマンスを決定する上で重要な役割を果たす。シンプルで有効であるにもかかわらず広く採用されているアクティベーション関数 ReLU には、Dying ReLU 問題を含むいくつかの欠点がある。そこで本研究では,自然界において自己正規化され,非単調であるサーフと呼ばれる新しい活性化関数を提案する。 Mishと同様に、SerfもSwishファミリーに属している。コンピュータビジョン(画像分類とオブジェクト検出)と自然言語処理(機械翻訳、感情分類、マルチモーダル・エンテーメント)の様々な実験に基づいて、SerfはReLU(ベースライン)とSwishとMishを含む他のアクティベーション機能を大きく上回っており、より深いアーキテクチャに顕著な差がある。アブレーション研究により、serfベースのアーキテクチャは様々なシナリオにおいてswishやmishよりも優れた性能を示し、様々な深さ、複雑さ、最適化、学習率、バッチサイズ、初期化器、ドロップアウト率でserfの有効性と互換性を検証する。最後に,SwishとSerfの数学的関係について検討し,よりスムーズかつ高速に勾配を最適化する正規化効果を提供するSerfの第1微分のプレコンディショナー関数の影響を示す。 Activation functions play a pivotal role in determining the training dynamics and neural network performance. The widely adopted activation function ReLU despite being simple and effective has few disadvantages including the Dying ReLU problem. In order to tackle such problems, we propose a novel activation function called Serf which is self-regularized and nonmonotonic in nature. Like Mish, Serf also belongs to the Swish family of functions. Based on several experiments on computer vision (image classification and object detection) and natural language processing (machine translation, sentiment classification and multimodal entailment) tasks with different state-of-the-art architectures, it is observed that Serf vastly outperforms ReLU (baseline) and other activation functions including both Swish and Mish, with a markedly bigger margin on deeper architectures. 
Ablation studies further demonstrate that Serf based architectures perform better than those of Swish and Mish in varying scenarios, validating the effectiveness and compatibility of Serf with varying depth, complexity, optimizers, learning rates, batch sizes, initializers and dropout rates. Finally, we investigate the mathematical relation between Swish and Serf, thereby showing the impact of preconditioner function ingrained in the first derivative of Serf which provides a regularization effect making gradients smoother and optimization faster.	翻訳日:2021-08-25 05:43:59 公開日:2021-08-21
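The abstract does not spell out the formula for Serf. Reading the name "log-Softplus ERror activation Function" literally, a plausible form is f(x) = x · erf(ln(1 + e^x)). The sketch below assumes that form; treat it as an illustration to check against the paper, not as the authors' reference implementation. The helper names are hypothetical.

```python
import math


def softplus(x):
    """Numerically stable ln(1 + e^x)."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def serf(x):
    """Assumed form of Serf: x * erf(softplus(x))."""
    return x * math.erf(softplus(x))
```

Under this assumed form the function is close to the identity for large positive inputs, passes through zero at the origin, and dips slightly below zero for negative inputs before returning towards zero, which matches the "nonmonotonic" description in the abstract.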
# CushLEPOR: LABSE蒸留知識モデルを用いたカスタマイズhLEPORメトリクスによる人的判断との整合性向上 CushLEPOR: Customised hLEPOR Metric Using LABSE Distilled Knowledge Model to Improve Agreement with Human Judgements ( http://arxiv.org/abs/2108.09484v1 ) ライセンス: Link先を確認	Lifeng Han, Irina Sorokina, Gleb Erofeev, Serge Gladkoff	(参考訳) 人間の評価は常に高価で、研究者は自動メトリクスを信頼できない。そこで本稿では,事前学習型言語モデル(PLM)と限定された人間のラベル付きスコアの利点を生かして,従来のメトリクスをカスタマイズすることを提案する。まず、hLEPORのパラメータ要素を再導入し、次に、hLEPORのパラメータの重み付けを自動的にチューニングするPythonポータブルバージョンを開発しました。次に、LABSE蒸留知識モデルを用いて、cushLEPORが配置された正確なMT言語対に関する因子重みを自動的に最適化することにより、人間の判断とのメートル法合意を向上する、カスタマイズhLEPOR(cushLEPOR)を提案する。また、英語とドイツ語と中国語のペアにおけるMQMおよびpSQMフレームワークに基づく評価データに対して、cushLEPORを最適化する。実験の結果、CushLEPOR は LABSE のような PLM とのより優れた契約、MQM や pSQM などの人的評価に対するより良い合意、BLEU よりもはるかに優れたパフォーマンスをもたらすことが示されている(データは \url{https://github.com/poethan/cushLEPOR} で入手できる)。 Human evaluation has always been expensive while researchers struggle to trust the automatic metrics. To address this, we propose to customise traditional metrics by taking advantages of the pre-trained language models (PLMs) and the limited available human labelled scores. We first re-introduce the hLEPOR metric factors, followed by the Python portable version we developed which achieved the automatic tuning of the weighting parameters in hLEPOR metric. Then we present the customised hLEPOR (cushLEPOR) which uses LABSE distilled knowledge model to improve the metric agreement with human judgements by automatically optimised factor weights regarding the exact MT language pairs that cushLEPOR is deployed to. We also optimise cushLEPOR towards human evaluation data based on MQM and pSQM framework on English-German and Chinese-English language pairs. 
The experimental investigations show cushLEPOR boosts hLEPOR performances towards better agreements to PLMs like LABSE with much lower cost, and better agreements to human evaluations including MQM and pSQM scores, and yields much better performances than BLEU (data available at https://github.com/poethan/cushLEPOR).	翻訳日:2021-08-24 16:08:02 公開日:2021-08-21
# learn-explain-reinforce: counterfactual reasoningとアルツハイマー病診断モデル強化のための指導 Learn-Explain-Reinforce: Counterfactual Reasoning and Its Guidance to Reinforce an Alzheimer's Disease Diagnosis Model ( http://arxiv.org/abs/2108.09451v1 ) ライセンス: Link先を確認	Kwanseok Oh, Jee Seok Yoon, and Heung-Il Suk	(参考訳) 既存の疾患診断モデルの研究は、パフォーマンス改善のための診断モデル学習や、訓練された診断モデルの視覚的説明に焦点を当てている。本稿では、診断モデル学習、視覚的説明生成(説明単位)、視覚的説明によって導かれる訓練された診断モデル強化(強化単位)を統一する新しい学習説明強化(LEAR)フレームワークを提案する。視覚的説明のために、入力サンプルを目的のターゲットラベルとして識別するために変換する反ファクトマップを生成する。例えば、カウンターファクトマップは、通常の脳画像内で仮説上の異常を局在させ、アルツハイマー病(AD)と診断される可能性がある。我々は,対象課題に関するデータ駆動型およびモデル駆動型知識,すなわち構造的MRIを用いたAD診断が,訓練された診断モデルの一般化を強化する上で重要な情報源であると考えている。この目的のために,反事実マップの指導により注意に基づく特徴リファインメントモジュールを考案する。説明と補強は相互に行われ、反復的に操作できる。提案手法はadniデータセットの質的・定量的解析により検証された。その理解性と忠実さはアブレーション研究と既存手法との比較によって実証された。 Existing studies on disease diagnostic models focus either on diagnostic model learning for performance improvement or on the visual explanation of a trained diagnostic model. We propose a novel learn-explain-reinforce (LEAR) framework that unifies diagnostic model learning, visual explanation generation (explanation unit), and trained diagnostic model reinforcement (reinforcement unit) guided by the visual explanation. For the visual explanation, we generate a counterfactual map that transforms an input sample to be identified as an intended target label. For example, a counterfactual map can localize hypothetical abnormalities within a normal brain image that may cause it to be diagnosed with Alzheimer's disease (AD). We believe that the generated counterfactual maps represent data-driven and model-induced knowledge about a target task, i.e., AD diagnosis using structural MRI, which can be a vital source of information to reinforce the generalization of the trained diagnostic model. To this end, we devise an attention-based feature refinement module with the guidance of the counterfactual maps. 
The explanation and reinforcement units are reciprocal and can be operated iteratively. Our proposed approach was validated via qualitative and quantitative analysis on the ADNI dataset. Its comprehensibility and fidelity were demonstrated through ablation studies and comparisons with existing methods.	翻訳日:2021-08-24 16:04:33 公開日:2021-08-21
# 離散高次元データを用いたベイズネットワーク同定のためのスパース構造学習アルゴリズム A Sparse Structure Learning Algorithm for Bayesian Network Identification from Discrete High-Dimensional Data ( http://arxiv.org/abs/2108.09501v1 ) ライセンス: Link先を確認	Nazanin Shajoonnezhad, Amin Nikanjam	(参考訳) 本稿では,高次元離散データから疎構造ベイズネットワークを学習する問題に対処する。連続ベイズネットワークと比較すると、離散ベイズネットワークの学習は大きなパラメータ空間のため難しい問題である。連続ベイズネットワークの学習には多くのアプローチが開発されているが、離散的ネットワークに対するアプローチはほとんど提案されていない。本稿では,学習ベイズネットワークを最適化問題として扱い,空間性とDAG特性を同時に満足するスコア関数を提案する。また,スコア関数を最適化するためにブロック方向確率座標降下アルゴリズムを実装した。具体的には,アルゴリズムを高次元データで効率的に動作させるため,最適化アルゴリズムに分散低減法を用いる。提案手法は,よく知られたベンチマークネットワークからの合成データに適用できる。構築したネットワークの品質,スケーラビリティ,堅牢性を測定した。いくつかの競合手法と比較して,本アルゴリズムは評価指標において他のアルゴリズムよりも優れていた。 This paper addresses the problem of learning a sparse structure Bayesian network from high-dimensional discrete data. Compared to continuous Bayesian networks, learning a discrete Bayesian network is a challenging problem due to the large parameter space. Although many approaches have been developed for learning continuous Bayesian networks, few approaches have been proposed for the discrete ones. In this paper, we address learning Bayesian networks as an optimization problem and propose a score function that satisfies the sparsity and the DAG property simultaneously. Besides, we implement a block-wised stochastic coordinate descent algorithm to optimize the score function. Specifically, we use a variance reducing method in our optimization algorithm to make the algorithm work efficiently in high-dimensional data. The proposed approach is applied to synthetic data from well-known benchmark networks. The quality, scalability, and robustness of the constructed network are measured. Compared to some competitive approaches, the results reveal that our algorithm outperforms the others in evaluation metrics.	翻訳日:2021-08-24 16:02:25 公開日:2021-08-21
# 確率勾配の輝きのランダム性向上は一般化を改善するか? How Can Increased Randomness in Stochastic Gradient Descent Improve Generalization? ( http://arxiv.org/abs/2108.09507v1 ) ライセンス: Link先を確認	Arwen V. Bradley and Carlos Alberto Gomez-Uribe	(参考訳) 近年の研究では、確率勾配降下(SGD)における学習率の増加やミニバッチサイズの減少がテストセット性能を向上させることが報告されている。複数の局所ミニマを持つ損失関数を持つモデルでは、いくつかの条件下でこれを期待できる。我々の主な貢献は、一般化におけるSGD学習率とバッチサイズの役割を研究する物理の手法に着想を得た、近似的だが解析的なアプローチである。複数の最小値を持つ損失関数のトレーニングとテストデータ分布のシフトの下でテストセットのパフォーマンスを特徴付ける。このシフトは単にサンプリングによって起こりうるため、一般的には実践的な応用に現れる。その結果,局所的ミニマムの変化は曲率を上げることによってテスト性能を悪化させ,広義の局所的ミニマムの選択により一般化が向上することを示す。次に,SGDを専門とし,静止条件下でのテスト性能について検討する。 SGDの正確な定常分布を得ることは困難であるため、SGDのFokker-Planck近似を導出し、その定常分布を得る。このプロセスは, 最小バッチサイズで分割された学習速度が, 統計力学において温度に類似する役割を担っていることを示唆し, 定常分布を含むSGDは, 温度を一定に保った学習速度やバッチサイズの変化に大きく変化しないことを示唆している。また,SGD温度の上昇は局所最小値の選択を低曲率で促進し,より一般化できることを示す。我々は,SGDの温度不変性を示すCIFAR10の実験を行い,SGD温度が上昇するにつれて試験損失が向上し,この効果を駆動する際のサンプリングとドメインシフトの影響を定量化する。最後に,2つの局所最小値による簡易な損失に我々の理論がどのように適用されるかを示す合成実験を示す。 Recent works report that increasing the learning rate or decreasing the minibatch size in stochastic gradient descent (SGD) can improve test set performance. We argue this is expected under some conditions in models with a loss function with multiple local minima. Our main contribution is an approximate but analytical approach inspired by methods in Physics to study the role of the SGD learning rate and batch size in generalization. We characterize test set performance under a shift between the training and test data distributions for loss functions with multiple minima. The shift can simply be due to sampling, and is therefore typically present in practical applications. We show that the resulting shift in local minima worsens test performance by picking up curvature, implying that generalization improves by selecting wide and/or little-shifted local minima. We then specialize to SGD, and study its test performance under stationarity. 
Because obtaining the exact stationary distribution of SGD is intractable, we derive a Fokker-Planck approximation of SGD and obtain its stationary distribution instead. This process shows that the learning rate divided by the minibatch size plays a role analogous to temperature in statistical mechanics, and implies that SGD, including its stationary distribution, is largely invariant to changes in learning rate or batch size that leave its temperature constant. We show that increasing SGD temperature encourages the selection of local minima with lower curvature, and can enable better generalization. We provide experiments on CIFAR10 demonstrating the temperature invariance of SGD, improvement of the test loss as SGD temperature increases, and quantifying the impact of sampling versus domain shift in driving this effect. Finally, we present synthetic experiments showing how our theory applies in a simplified loss with two local minima.	翻訳日:2021-08-24 16:02:12 公開日:2021-08-21
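The abstract states that the learning rate divided by the minibatch size plays a role analogous to temperature. A minimal sketch of that bookkeeping follows; any proportionality constant from the paper's derivation is omitted, and the function names are hypothetical.

```python
def sgd_temperature(learning_rate, batch_size):
    """Temperature-like quantity: learning rate divided by minibatch size."""
    return learning_rate / batch_size


def learning_rate_for(temperature, batch_size):
    """Learning rate that keeps the temperature fixed at a new batch size."""
    return temperature * batch_size
```

For example, a learning rate of 0.1 at batch size 128 and a learning rate of 0.2 at batch size 256 give the same value, so under the abstract's claim the two runs would be expected to behave similarly at stationarity.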
# BoundaryNet: 半自動レイアウトアノテーションのための高速マーキング距離マップを備えた注意深いネットワーク BoundaryNet: An Attentive Deep Network with Fast Marching Distance Maps for Semi-automatic Layout Annotation ( http://arxiv.org/abs/2108.09433v1 ) ライセンス: Link先を確認	Abhishek Trivedi and Ravi Kiran Sarvadevabhatla	(参考訳) 画像領域の正確な境界アノテーションは、領域クラスセマンティクスに依存する下流アプリケーションにとって重要である。いくつかの文書コレクションは、アスペクト比の広い多クラス領域インスタンスと非常に不規則で重なり合う密集したレイアウトを含んでいる。完全自動境界推定手法は、データ集約的であり、可変サイズの画像を扱うことができず、上記の画像に対する準最適結果を生成する傾向がある。本稿では,高精度半自動レイアウトアノテーションのための新しいリサイズフリーアプローチであるバウンダリネットを提案する。可変サイズのユーザ選択領域は、最初に注目誘導スキップネットワークにより処理される。ネットワーク最適化は高速マーチング距離マップを介して導かれ、高品質な初期境界推定と関連する特徴表現を得る。これらの出力は、ハウスドルフ損失を用いて最適化された残差グラフ畳み込みネットワークによって処理され、最終的な領域境界を得る。挑戦的な画像原稿データセットの結果、BoundaryNetは強いベースラインを上回り、高品質なセマンティック領域境界を生成する。定性的には,スクリプトシステムとレイアウトの異なる複数の文書画像データセットを,追加の微調整なしで一般化する。 BoundaryNetを文書アノテーションシステムに統合し、手動や完全自動の代替品と比較して高いアノテーションスループットを提供することを示す。 Precise boundary annotations of image regions can be crucial for downstream applications which rely on region-class semantics. Some document collections contain densely laid out, highly irregular and overlapping multi-class region instances with large range in aspect ratio. Fully automatic boundary estimation approaches tend to be data intensive, cannot handle variable-sized images and produce sub-optimal results for aforementioned images. To address these issues, we propose BoundaryNet, a novel resizing-free approach for high-precision semi-automatic layout annotation. The variable-sized user selected region of interest is first processed by an attention-guided skip network. The network optimization is guided via Fast Marching distance maps to obtain a good quality initial boundary estimate and an associated feature representation. These outputs are processed by a Residual Graph Convolution Network optimized using Hausdorff loss to obtain the final region boundary. 
Results on a challenging image manuscript dataset demonstrate that BoundaryNet outperforms strong baselines and produces high-quality semantic region boundaries. Qualitatively, our approach generalizes across multiple document image datasets containing different script systems and layouts, all without additional fine-tuning. We integrate BoundaryNet into a document annotation system and show that it provides high annotation throughput compared to manual and fully automatic alternatives.	翻訳日:2021-08-24 16:00:48 公開日:2021-08-21
# Palmira: 手書き手書き文字のDenseとUneven LayoutのインスタンスセグメンテーションのためのDeep Deformable Network Palmira: A Deep Deformable Network for Instance Segmentation of Dense and Uneven Layouts in Handwritten Manuscripts ( http://arxiv.org/abs/2108.09436v1 ) ライセンス: Link先を確認	Prema Satish Sharan, Sowmya Aitha, Amandeep Kumar, Abhishek Trivedi, Aaron Augustine, Ravi Kiran Sarvadevabhatla	(参考訳) 手書きの文書は、しばしば濃密で不均一なレイアウトで特徴づけられる。進歩にもかかわらず、セマンティックレイアウトセグメンテーションのための標準的なディープネットワークベースのアプローチは、セマンティクス領域にまたがる複雑な変形に対して堅牢ではない。この現象は、特に低リソースのインディアムリーフ原稿ドメインで顕著である。この問題に対処するため、最初にindiscapes2を紹介します。indiscapes2は、セマンティックレイアウトアノテーションを備えた、インデックス原稿の新しい大規模多種多様なデータセットです。 Indiscapes2には4つの異なる歴史的コレクションの文書があり、前身であるIndiscapesよりも150%大きい。また,手書き原稿中の領域の頑健な変形対応インスタンスセグメンテーションのための,新しい深層ネットワークpalmiraを提案する。また、ハウスドルフ距離とその変種を境界対応性能尺度として報告する。実験によりPalmiraはロバストなレイアウトを提供し、強力なベースラインアプローチやアブレーティブなバリエーションよりも優れていることが示された。我々はまた、パルミラの一般化能力を示すために、アラビア語、東南アジア、ヘブライの歴史写本の質的な結果も含んでいる。 Handwritten documents are often characterized by dense and uneven layout. Despite advances, standard deep network based approaches for semantic layout segmentation are not robust to complex deformations seen across semantic regions. This phenomenon is especially pronounced for the low-resource Indic palm-leaf manuscript domain. To address the issue, we first introduce Indiscapes2, a new large-scale diverse dataset of Indic manuscripts with semantic layout annotations. Indiscapes2 contains documents from four different historical collections and is 150% larger than its predecessor, Indiscapes. We also propose a novel deep network Palmira for robust, deformation-aware instance segmentation of regions in handwritten manuscripts. We also report Hausdorff distance and its variants as a boundary-aware performance measure. Our experiments demonstrate that Palmira provides robust layouts, outperforms strong baseline approaches and ablative variants. 
We also include qualitative results on Arabic, South-East Asian and Hebrew historical manuscripts to showcase the generalization capability of Palmira.	翻訳日:2021-08-24 16:00:28 公開日:2021-08-21
# semifed:一貫性と擬似ラベル付き半教師付き連合学習 SemiFed: Semi-supervised Federated Learning with Consistency and Pseudo-Labeling ( http://arxiv.org/abs/2108.09412v1 ) ライセンス: Link先を確認	Haowen Lin, Jian Lou, Li Xiong, Cyrus Shahabi	(参考訳) フェデレートラーニングは、携帯電話や組織などの複数のクライアントが、ローカルデータのプライバシーを保護しながら、予測の共有モデルを共同で学習することを可能にする。しかし、フェデレーション学習の最近の研究と応用は、すべてのクライアントが完全なラベル付きデータを持っていると仮定している。本研究では、各クライアントのデータサンプルを部分的にラベル付けするクロスサイロ・フェデレーション学習の新しいシナリオに焦点を当てる。我々は,ラベル付きサンプルへのアクセスに制限があるにもかかわらず,大量のラベル付きデータを用いてモデルの精度を向上させる半教師付き学習手法のアイデアを借りる。半教師付き学習のための2つの支配的アプローチである一貫性の正規化と擬似ラベル付けを統一したsemifedと呼ばれる新しいフレームワークを提案する。 SemiFedはまず、一貫性の正則化を強制するために高度なデータ拡張技術を適用し、トレーニング中にモデルの予測を使用して擬似ラベルを生成する。 SemiFedはフェデレーションを利用して、あるイメージに対して、異なるクライアントから複数のモデルが高信頼の予測を生成し、同じラベルに同意した場合のみ、擬似ラベルを保持する。 2つの画像ベンチマークに関する広範囲実験により,不均質および異種データ分布設定における提案手法の有効性を実証した。 Federated learning enables multiple clients, such as mobile phones and organizations, to collaboratively learn a shared model for prediction while protecting local data privacy. However, most recent research and applications of federated learning assume that all clients have fully labeled data, which is impractical in real-world settings. In this work, we focus on a new scenario for cross-silo federated learning, where data samples of each client are partially labeled. We borrow ideas from semi-supervised learning methods where a large amount of unlabeled data is utilized to improve the model's accuracy despite limited access to labeled examples. We propose a new framework dubbed SemiFed that unifies two dominant approaches for semi-supervised learning: consistency regularization and pseudo-labeling. SemiFed first applies advanced data augmentation techniques to enforce consistency regularization and then generates pseudo-labels using the model's predictions during training. SemiFed takes advantage of the federation so that for a given image, the pseudo-label holds only if multiple models from different clients produce a high-confidence prediction and agree on the same label. 
Extensive experiments on two image benchmarks demonstrate the effectiveness of our approach under both homogeneous and heterogeneous data distribution settings.	翻訳日:2021-08-24 15:59:54 公開日:2021-08-21
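The pseudo-label rule in the abstract (keep a label only if multiple client models are confident and agree) can be sketched as below. The confidence threshold, the minimum number of agreeing clients, and the requirement that every confident client agrees are assumptions made for illustration; the abstract does not fix them. The function name is hypothetical.

```python
def agreed_pseudo_label(client_probs, threshold=0.95, min_agree=2):
    """Return a pseudo-label for one sample, or None if it is rejected.

    client_probs: one class-probability list per client model.
    A client votes only if its top probability reaches the threshold.
    The label is kept when at least min_agree clients vote and all
    votes name the same class (assumed reading of the abstract).
    """
    votes = []
    for probs in client_probs:
        label = max(range(len(probs)), key=probs.__getitem__)
        if probs[label] >= threshold:
            votes.append(label)
    if len(votes) >= min_agree and len(set(votes)) == 1:
        return votes[0]
    return None
```

Samples that return None would simply contribute no pseudo-label loss in that round.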
# 実証学習における「逆例」 "Adversarial Examples" for Proof-of-Learning ( http://arxiv.org/abs/2108.09454v1 ) ライセンス: Link先を確認	Rui Zhang, Jian Liu, Yuan Ding, Qingbiao Wu, and Kui Ren	(参考訳) S&P '21において、Jiaらはproof-of-learning (PoL) と呼ばれる新しい概念・機構を提案した。これは、証明者がトレーニング手順の完全性を証明することによって、機械学習モデルのオーナシップを実証することを可能にする。 PoLは、敵対者が、証明者が証明の生成に要したコスト(計算量と記憶量の両方)よりも低いコストで有効な証明を構築できないことを保証する。 PoL証明は、トレーニング中に記録された一連の中間モデルと、記録された各モデルを得るために使用される対応するデータポイントを含む。 Jiaらは、最終的なモデルとトレーニングデータセットを知るだけの敵は、正しいデータポイントを持つ中間モデルのセットを効率的に見つけることができないと主張した。しかし,本稿では,PoLが「逆例」に対して脆弱であることを示す。具体的には、敵対的な例を最適化するのと同様の方法で、任意に選んだデータポイントに所与のモデルを「生成」させることで、正しいデータポイントを持つ中間モデルを効率的に生成することができる。理論的にも経験的にも、証明者による証明よりもはるかに低コストで有効な証明を生成できることを示し、PoLを破ることに成功した。 In S&P '21, Jia et al. proposed a new concept/mechanism named proof-of-learning (PoL), which allows a prover to demonstrate ownership of a machine learning model by proving integrity of the training procedure. It guarantees that an adversary cannot construct a valid proof with less cost (in both computation and storage) than that made by the prover in generating the proof. A PoL proof includes a set of intermediate models recorded during training, together with the corresponding data points used to obtain each recorded model. Jia et al. claimed that an adversary merely knowing the final model and training dataset cannot efficiently find a set of intermediate models with correct data points. In this paper, however, we show that PoL is vulnerable to "adversarial examples"! Specifically, in a similar way as optimizing an adversarial example, we could make an arbitrarily-chosen data point "generate" a given model, hence efficiently generating intermediate models with correct data points. We demonstrate, both theoretically and empirically, that we are able to generate a valid proof with significantly less cost than generating a proof by the prover, thereby we successfully break PoL.	翻訳日:2021-08-24 15:58:51 公開日:2021-08-21
# 結晶構造相マッピングの自動化:ディープラーニングと制約推論を組み合わせる Automating Crystal-Structure Phase Mapping: Combining Deep Learning with Constraint Reasoning ( http://arxiv.org/abs/2108.09523v1 ) ライセンス: Link先を確認	Di Chen, Yiwei Bai, Sebastian Ament, Wenting Zhao, Dan Guevarra, Lan Zhou, Bart Selman, R. Bruce van Dover, John M. Gregoire, Carla P. Gomes	(参考訳) 結晶構造相マッピング(英: crystal-structure phase mapping)は、合成材料における結晶構造やその混合物の同定を必要とする、材料科学における中核的で長期にわたる挑戦である。材料科学の専門家は単純なシステムを解くことに長けているが、複雑なシステムを解くことはできない。ここでは結晶構造位相マッピングの自動化について述べる。我々は,教師なしパターンデミックス問題として位相マッピングを定式化し,深層推論ネットワーク(drnets)を用いてその解法を説明する。 DRNetは、科学的事前知識を組み込むための制約推論とディープラーニングを組み合わせることで、わずかな量の(ラベルのない)データしか必要としない。 DRNetは、制約推論をニューラルネットワーク最適化にシームレスに統合した結晶の混合物を管理する熱力学規則に関する豊富な事前知識を利用して、限られたデータを補償する。 DRNetは、事前知識ドメイン制約を符号化し、ニューラルネットワーク最適化に制約推論をシームレスに統合するための解釈可能な潜在空間で設計されている。 DRNetはかつての結晶構造相マッピングのアプローチを超越し、Bi-Cu-V酸化物相図を解き、太陽電池材料の発見を支援した。 Crystal-structure phase mapping is a core, long-standing challenge in materials science that requires identifying crystal structures, or mixtures thereof, in synthesized materials. Materials science experts excel at solving simple systems but cannot solve complex systems, creating a major bottleneck in high-throughput materials discovery. Herein we show how to automate crystal-structure phase mapping. We formulate phase mapping as an unsupervised pattern demixing problem and describe how to solve it using Deep Reasoning Networks (DRNets). DRNets combine deep learning with constraint reasoning for incorporating scientific prior knowledge and consequently require only a modest amount of (unlabeled) data. DRNets compensate for the limited data by exploiting and magnifying the rich prior knowledge about the thermodynamic rules governing the mixtures of crystals with constraint reasoning seamlessly integrated into neural network optimization. 
DRNets are designed with an interpretable latent space for encoding prior-knowledge domain constraints and seamlessly integrate constraint reasoning into neural network optimization. DRNets surpass previous approaches on crystal-structure phase mapping, unraveling the Bi-Cu-V oxide phase diagram, and aiding the discovery of solar-fuels materials.	翻訳日:2021-08-24 15:58:34 公開日:2021-08-21
# 多項式次数の多項式核の高速スケッチ Fast Sketching of Polynomial Kernels of Polynomial Degree ( http://arxiv.org/abs/2108.09420v1 ) ライセンス: Link先を確認	Zhao Song, David P. Woodruff, Zheng Yu, Lichen Zhang	(参考訳) カーネルメソッドは機械学習の基本であり、カーネル近似の高速アルゴリズムは機械学習における多くのコアタスクを直接高速化する。多項式核は、テイラー級数展開を通じて多項式核によって近似されることが多いため、特に重要である。最近の斜めスケッチ技術では、多項式核の指数関数から多項式への次数 q$ に対する実行時間の依存性が小さくなっており、これはガウス核にとって有用であり、q$ は多対数として選択できる。しかし、ニューラル・タンジェントやアークコサイン・カーネルのようなよりゆっくりと成長するカーネルの場合、$q$は多項式でなければならない。この実行時間を大幅に改善し、先行注文項の$q$への依存をなくすことにより、新たな不明瞭なスケッチを提示する。新しいサンプリングスキームと組み合わせることで、成長の遅いカーネルの大規模なファミリーを近似するための最速のアルゴリズムを与える。 Kernel methods are fundamental in machine learning, and faster algorithms for kernel approximation provide direct speedups for many core tasks in machine learning. The polynomial kernel is especially important as other kernels can often be approximated by the polynomial kernel via a Taylor series expansion. Recent techniques in oblivious sketching reduce the dependence in the running time on the degree $q$ of the polynomial kernel from exponential to polynomial, which is useful for the Gaussian kernel, for which $q$ can be chosen to be polylogarithmic. However, for more slowly growing kernels, such as the neural tangent and arc-cosine kernels, $q$ needs to be polynomial, and previous work incurs a polynomial factor slowdown in the running time. We give a new oblivious sketch which greatly improves upon this running time, by removing the dependence on $q$ in the leading order term. Combined with a novel sampling scheme, we give the fastest algorithms for approximating a large family of slow-growing kernels.	翻訳日:2021-08-24 15:56:17 公開日:2021-08-21
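The abstract notes that other kernels can often be approximated by polynomial kernels via a Taylor series expansion. The sketch below illustrates only that expansion for the Gaussian kernel exp(-||x - y||^2 / 2), using exp(<x, y>) ≈ Σ_{j=0}^{q} <x, y>^j / j!. It does not implement the paper's oblivious sketch or sampling scheme, and the function names are hypothetical.

```python
import math


def dot(x, y):
    """Inner product of two equal-length sequences."""
    return sum(a * b for a, b in zip(x, y))


def exp_kernel_taylor(x, y, q):
    """Degree-q Taylor approximation of exp(<x, y>) as a sum of polynomial kernels."""
    s = dot(x, y)
    return sum(s ** j / math.factorial(j) for j in range(q + 1))


def gaussian_kernel_taylor(x, y, q):
    """Approximate exp(-||x - y||^2 / 2) using the truncated series above.

    Uses the identity
    exp(-||x - y||^2 / 2) = exp(-||x||^2 / 2) * exp(-||y||^2 / 2) * exp(<x, y>).
    """
    scale = math.exp(-dot(x, x) / 2) * math.exp(-dot(y, y) / 2)
    return scale * exp_kernel_taylor(x, y, q)
```

The truncation error shrinks factorially in q when <x, y> is bounded, which is why a modest degree is enough for the Gaussian kernel, in line with the abstract's remark that q can be chosen polylogarithmic in that case.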
# 分離学習環境における逐次確率最適化 Sequential Stochastic Optimization in Separable Learning Environments ( http://arxiv.org/abs/2108.09585v1 ) ライセンス: Link先を確認	R. Reid Bishop and Chelsea C. White III	(参考訳) 我々は,様々な種類の教師付き学習概念を包含する不確実性の下での逐次的意思決定問題を考える。これらの問題は、完全に観察された状態過程と部分的に観測された変調過程を有し、状態過程は観察過程を通してのみ変調過程に影響され、観察過程は変調過程のみを観察し、変調過程は制御に外在する。我々は,この幅広い問題を部分観察マルコフ決定過程(pomdp)としてモデル化する。変調過程の信念関数は制御不変であり、状態過程の制御から変調過程の推定を分離する。 We consider a class of sequential decision-making problems under uncertainty that can encompass various types of supervised learning concepts. These problems have a completely observed state process and a partially observed modulation process, where the state process is affected by the modulation process only through an observation process, the observation process only observes the modulation process, and the modulation process is exogenous to control. We model this broad class of problems as a partially observed Markov decision process (POMDP). The belief function for the modulation process is control invariant, thus separating the estimation of the modulation process from the control of the state process. 
We call this specially structured POMDP the separable POMDP, or SEP-POMDP, and show it (i) can serve as a model for a broad class of application areas, e.g., inventory control, finance, healthcare systems, (ii) inherits value function and optimal policy structure from a set of completely observed MDPs, (iii) can serve as a bridge between classical models of sequential decision making under uncertainty having fully specified model artifacts and such models that are not fully specified and require the use of predictive methods from statistics and machine learning, and (iv) allows for specialized approximate solution procedures.	翻訳日:2021-08-24 15:55:58 公開日:2021-08-21
# Integer-arithmetic-only Certified Robustness for Quantized Neural Networks Integer-arithmetic-only Certified Robustness for Quantized Neural Networks ( http://arxiv.org/abs/2108.09413v1 ) ライセンス: Link先を確認	Haowen Lin, Jian Lou, Li Xiong and Cyrus Shahabi	(参考訳) 敵対的なデータ例は、機械学習とセキュリティコミュニティから大きな注目を集めている。反対例に取り組むための一連の研究は、理論的な堅牢性を保証するためのランダムな平滑化によって、堅牢性を保証する。しかし、そのような機構は通常、推論の計算に浮動小数点演算を使い、大きなメモリフットプリントと計算コストを犠牲にする。これらの防御モデルは、エッジデバイス上で効率的に動作したり、チューリングテンソルコアや整数専用ARMプロセッサのような整数専用論理ユニットにデプロイすることはできない。これらの課題を克服するために,任意の分類器を新しいスムーズな分類器に変換するために,量子化を用いた整数ランダム化平滑化手法を提案する。提案手法ではL2-ノルムの下で強靭性を保証する。提案手法は,2つの異なるデータセット(CIFAR-10とCaltech-101)上の汎用CPUおよびモバイルデバイス上で,浮動小数点演算によるロバストな手法に対して,同等の精度と4倍～5倍の高速化が得られることを示す。 Adversarial data examples have drawn significant attention from the machine learning and security communities. A line of work on tackling adversarial examples is certified robustness via randomized smoothing that can provide a theoretical robustness guarantee. However, such a mechanism usually uses floating-point arithmetic for calculations in inference and requires large memory footprints and daunting computational costs. These defensive models cannot run efficiently on edge devices nor be deployed on integer-only logical units such as Turing Tensor Cores or integer-only ARM processors. To overcome these challenges, we propose an integer randomized smoothing approach with quantization to convert any classifier into a new smoothed classifier, which uses integer-only arithmetic for certified robustness against adversarial perturbations. We prove a tight robustness guarantee under L2-norm for the proposed approach. We show our approach can obtain a comparable accuracy and 4x~5x speedup over floating-point arithmetic certified robust methods on general-purpose CPUs and mobile devices on two distinct datasets (CIFAR-10 and Caltech-101).	翻訳日:2021-08-24 15:53:55 公開日:2021-08-21
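For context, the classical floating-point form of randomized smoothing with Gaussian noise certifies an L2 radius of sigma * Phi^{-1}(p_A), where p_A is a lower bound on the probability that the smoothed classifier returns the top class. The sketch below shows that standard baseline only. It does not reproduce the integer-only bound proposed in the paper, and the function name and abstain convention are assumptions.

```python
from statistics import NormalDist


def certified_l2_radius(p_a_lower, sigma):
    """Classical Gaussian-smoothing certificate: sigma * Phi^{-1}(p_a_lower).

    p_a_lower: lower confidence bound on the top-class probability (0 < p < 1).
    sigma:     standard deviation of the Gaussian noise added to the input.
    Returns None (abstain) when the bound does not exceed 1/2.
    """
    if not 0.0 < p_a_lower < 1.0:
        raise ValueError("p_a_lower must lie strictly between 0 and 1")
    if p_a_lower <= 0.5:
        return None  # no non-trivial certificate is available
    return sigma * NormalDist().inv_cdf(p_a_lower)


if __name__ == "__main__":
    for p in (0.6, 0.9, 0.99):
        print(p, certified_l2_radius(p, sigma=0.5))
```

The inverse normal CDF is the floating-point step in this baseline; an integer-only pipeline such as the one described above would need a different way to evaluate or bound it.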
# 空間適応型特徴変換による可変レート深部画像圧縮 Variable-Rate Deep Image Compression through Spatially-Adaptive Feature Transform ( http://arxiv.org/abs/2108.09551v1 ) ライセンス: Link先を確認	Myungseo Song, Jinyoung Choi, Bohyung Han	(参考訳) 本研究では,空間特徴変換(SFT arXiv:1804.02815)に基づく多目的深部画像圧縮ネットワークを提案する。本モデルは,任意の画素単位の品質マップによって制御される単一モデルを用いて,幅広い圧縮率をカバーする。さらに,提案フレームワークでは,符号化ネットワークの目的タスクに特化して最適化された品質マップを効率的に推定することにより,様々なタスクに対するタスク認識画像圧縮を行うことができる。これは、個別のタスクの別々のモデルを学ぶことなく、事前訓練されたネットワークで可能だ。本アルゴリズムは,複数の異なるターゲットレートに対して別々に最適化された複数のモデルに基づくアプローチと比較して,優れたレートゆがみトレードオフを実現する。同じレベルの圧縮では、モデルトレーニングを伴わずにタスク認識品質マップ推定により、画像分類とテキスト領域の品質保存の性能を向上する。コードはプロジェクトのwebサイトで入手できる。 https://github.com/micmic123/qmapcompression We propose a versatile deep image compression network based on Spatial Feature Transform (SFT arXiv:1804.02815), which takes a source image and a corresponding quality map as inputs and produce a compressed image with variable rates. Our model covers a wide range of compression rates using a single model, which is controlled by arbitrary pixel-wise quality maps. In addition, the proposed framework allows us to perform task-aware image compressions for various tasks, e.g., classification, by efficiently estimating optimized quality maps specific to target tasks for our encoding network. This is even possible with a pretrained network without learning separate models for individual tasks. Our algorithm achieves outstanding rate-distortion trade-off compared to the approaches based on multiple models that are optimized separately for several different target rates. At the same level of compression, the proposed approach successfully improves performance on image classification and text region quality preservation via task-aware quality map estimation without additional model training. The code is available at the project website: https://github.com/micmic123/QmapCompression	翻訳日:2021-08-24 15:53:37 公開日:2021-08-21
# パーソナライズ・イン・ザ・ループ文書要約に向けて Towards Personalized and Human-in-the-Loop Document Summarization ( http://arxiv.org/abs/2108.09443v1 ) ライセンス: Link先を確認	Samira Ghodratnama	(参考訳) コンピュータデバイスのユビキタス化とインターネットの普及により、大量のデータが継続的に生成されている。したがって、与えられたトピックに関する利用可能な情報の量は、人間の処理能力をはるかに超え、情報過負荷と呼ばれるものを引き起こす。大量の情報を効率的に処理し,ユーザにとって重要な価値を持つコンテンツを生成するためには,情報の識別,統合,要約が必要である。データ要約は、関連する情報を収集し、より短いフォーマットに収集し、複雑な質問に答え、新しい洞察を得、概念境界を発見するのに役立つ。本論文は,新しい要約手法を用いて情報過負荷を軽減するための3つの課題に焦点を当てている。さらに、個人化された情報抽出を支援するために文書の分析を容易にする。この論文は、(i)文書要約における機能工学、(ii)従来の静的および非フレキシブルな要約、(iii)伝統的な総合的な要約アプローチ、(iv)参照要約の必要性の4つの領域に研究問題を分けている。 i)自動インテリジェント機能工学の獲得,ii)柔軟でインタラクティブな要約の実現,iii)知的でパーソナライズされた要約アプローチを活用した新しいアプローチを提案する。実験の結果,提案手法は他の最先端モデルと比較して有効性が証明された。さらに,ネットワークトラフィックデータ,ヘルスデータ,ビジネスプロセスデータの要約を通じて,異なるドメインにおける情報過負荷問題に対する解決策を提案する。 The ubiquitous availability of computing devices and the widespread use of the internet have generated a large amount of data continuously. Therefore, the amount of available information on any given topic is far beyond humans' processing capacity to properly process, causing what is known as information overload. To efficiently cope with large amounts of information and generate content with significant value to users, we require identifying, merging and summarising information. Data summaries can help gather related information and collect it into a shorter format that enables answering complicated questions, gaining new insight and discovering conceptual boundaries. This thesis focuses on three main challenges to alleviate information overload using novel summarisation techniques. It further intends to facilitate the analysis of documents to support personalised information extraction. This thesis separates the research issues into four areas, covering (i) feature engineering in document summarisation, (ii) traditional static and inflexible summaries, (iii) traditional generic summarisation approaches, and (iv) the need for reference summaries. 
We propose novel approaches to tackle these challenges by (i) enabling automatic intelligent feature engineering, (ii) enabling flexible and interactive summarisation, and (iii) utilising intelligent and personalised summarisation approaches. The experimental results prove the efficiency of the proposed approaches compared to other state-of-the-art models. We further propose solutions to the information overload problem in different domains through summarisation, covering network traffic data, health data and business process data.	翻訳日:2021-08-24 15:51:07 公開日:2021-08-21
# 介入を用いた自律エージェントの因果モデル学習 Learning Causal Models of Autonomous Agents using Interventions ( http://arxiv.org/abs/2108.09586v1 ) ライセンス: Link先を確認	Pulkit Verma, Siddharth Srivastava	(参考訳) aiシステムの広範な使用におけるいくつかの障害の1つは、そのようなシステムの安全で信頼性のある動作を保証することができる解釈可能性の要件の欠如である。我々はエージェントアセスメントモジュールの解析を拡張し、AIシステムがシミュレータでハイレベルな命令シーケンスを実行し、アクションのシーケンスの実行についてユーザクエリに回答できるようにする。このような原始的なクエリ応答能力は,ユーザの解釈可能なシステムの因果モデルを定常的,完全に可観測的,決定論的設定で効率的に導出するのに十分であることを示す。また、STRIPSのようなドメインの因果構造を捉える動的因果決定ネットワーク(DCDN)を導入する。クエリの異なるクラスの比較分析は、それらに答えるために必要な計算要件と、正しいモデルを学ぶためにそれらの応答を評価するのに必要な努力の観点からも示される。 One of the several obstacles in the widespread use of AI systems is the lack of requirements of interpretability that can enable a layperson to ensure the safe and reliable behavior of such systems. We extend the analysis of an agent assessment module that lets an AI system execute high-level instruction sequences in simulators and answer the user queries about its execution of sequences of actions. We show that such a primitive query-response capability is sufficient to efficiently derive a user-interpretable causal model of the system in stationary, fully observable, and deterministic settings. We also introduce dynamic causal decision networks (DCDNs) that capture the causal structure of STRIPS-like domains. A comparative analysis of different classes of queries is also presented in terms of the computational requirements needed to answer them and the efforts required to evaluate their responses to learn the correct model.	翻訳日:2021-08-24 15:50:45 公開日:2021-08-21
# 医療画像に対する教師なし局所識別 Unsupervised Local Discrimination for Medical Images ( http://arxiv.org/abs/2108.09440v1 ) ライセンス: Link先を確認	Huai Chen, Renzhen Wang, Jieyu Li, Qing Peng, Deyu Meng and Lisheng Wang	(参考訳) 対照的表現学習は、医療画像処理における高価な注釈データの需要を軽減する効果的な教師なし手法である。最近の研究は主に、グローバルな特徴を学習するためのケースワイドな識別に基づくが、局所的な詳細は無視され、小さな解剖学的構造、組織、病変の処理に応用が制限されている。そこで我々は,医療モデルを効果的に初期化するための局所的識別特徴を学習するための普遍的局所的判別枠組みを提案し,その実践的応用を体系的に検討する。具体的には、モダリティ内構造類似性の共通性、すなわち、それに基づく。類似した構造が同じモダリティイメージで共有され、体系的な局所的特徴学習フレームワークが提案されている。グローバル埋め込みに基づくインスタンス間比較を行う代わりに,画素間埋め込みを行い,パッチと領域間の類似度を測定することに焦点を当てた。より微細なコントラスト則により、学習表現はセグメンテーションタスクにおいてより一般化され、カラーファンダスと胸部x線中の12個の下流タスクのうち11個を勝ち取ることにより、広範な最先端手法よりも優れる。さらに、モダリティ間の形状類似性、すなわち、性質に基づく。構造は類似した形状を共有できるが、異なる医療形態では、領域判別に先立って、非教師なしセグメンテーションを実現するために、異質な形状を結合する。他のモードからの形状記述と領域識別による内部パターンの類似性のみに基づいて、セグメンテーションターゲットの実現可能性を示す。最後に,1ショットのランドマークの局所化を実現するために,中心感性平均化を導入することにより,パッチ識別のセンタ感性を高める。 Contrastive representation learning is an effective unsupervised method to alleviate the demand for expensive annotated data in medical image processing. Recent work is mainly based on instance-wise discrimination to learn global features while neglecting local details, which limits its application to processing tiny anatomical structures, tissues and lesions. Therefore, we propose a universal local discrimination framework to learn local discriminative features for effectively initializing medical models, and we systematically investigate its practical medical applications. Specifically, based on the common property of intra-modality structure similarity, i.e., similar structures are shared among images of the same modality, a systematic local feature learning framework is proposed. Instead of making instance-wise comparisons based on global embeddings, our method produces pixel-wise embeddings and focuses on measuring similarity among patches and regions.
The finer contrastive rule makes the learnt representation generalize better to segmentation tasks and outperforms a wide range of state-of-the-art methods, winning 11 of the 12 downstream tasks on color fundus and chest X-ray images. Furthermore, based on the property of inter-modality shape similarity, i.e., structures may share a similar shape across different medical modalities, we incorporate a cross-modality shape prior into region discrimination to realize unsupervised segmentation. This shows the feasibility of segmenting targets based only on shape descriptions from other modalities and the inner pattern similarity provided by region discrimination. Finally, we enhance the center-sensitive ability of patch discrimination by introducing center-sensitive averaging to realize one-shot landmark localization, which is an effective application of patch discrimination.	翻訳日:2021-08-24 15:49:28 公開日:2021-08-21
# Sugeno Fuzzy Integral Technique を用いた頚椎細胞画像分類のためのCNN分類器のアンサンブル Ensemble of CNN classifiers using Sugeno Fuzzy Integral Technique for Cervical Cytology Image Classification ( http://arxiv.org/abs/2108.09460v1 ) ライセンス: Link先を確認	Rohit Kundu, Hritam Basak, Akhil Koilada, Soham Chattopadhyay, Sukanta Chakraborty, Nibaran Das	(参考訳) 子宮頸がんは4番目に一般的ながんのカテゴリーであり、毎年50万人以上の女性に影響を与えている。早期診断は、がんの治療や治療にも役立つが、退屈で時間のかかる検査プロセスによって、集団検診は不可能である。病理学者の効率的かつ信頼性の高い検出を支援するため,本報告では,子宮頸癌の単一細胞およびスライド画像の分類を行うためのコンピュータ支援診断ツールを提案する。バイオメディカル画像分類のための自動検出ツールを開発する際の主な関心事は、公開データの可用性が低いことである。アンサンブル学習は、画像分類の一般的なアプローチであるが、分類器に事前決定された重みを活用する単純化されたアプローチは、満足して実行できない。本研究では,sugenoファジィ積分を用いて,インセプションv3,drknet-161,resnet-34の3つの学習モデルから決定スコアをアンサンブルする。提案するファジィ融合は,各サンプルに対する分類器の信頼度を考慮に入れ,各分類器に与える重要度を適応的に変化させ,各サンプルから供給される補完的情報を取り込み,分類性能を向上させる。提案手法は, mendeley liquid based cytology (lbc) dataset, sipakmed whole slide image (wsi) dataset, sipakmed single cell image (sci) datasetの3つの公開データセットにおいて評価され, 得られた結果は有望である。 GradCAMに基づく視覚表現と統計検査によるアプローチの分析と,文献における既存およびベースラインモデルとの比較は,アプローチの有効性を正当化する。 Cervical cancer is the fourth most common category of cancer, affecting more than 500,000 women annually, owing to the slow detection procedure. Early diagnosis can help in treating and even curing cancer, but the tedious, time-consuming testing process makes it impossible to conduct population-wise screening. To aid the pathologists in efficient and reliable detection, in this paper, we propose a fully automated computer-aided diagnosis tool for classifying single-cell and slide images of cervical cancer. The main concern in developing an automatic detection tool for biomedical image classification is the low availability of publicly accessible data. Ensemble Learning is a popular approach for image classification, but simplistic approaches that leverage pre-determined weights to classifiers fail to perform satisfactorily. 
In this research, we use the Sugeno Fuzzy Integral to ensemble the decision scores from three popular pretrained deep learning models, namely, Inception v3, DenseNet-161 and ResNet-34. The proposed Fuzzy fusion is capable of taking into consideration the confidence scores of the classifiers for each sample, and thus adaptively changing the importance given to each classifier, capturing the complementary information supplied by each, thus leading to superior classification performance. We evaluated the proposed method on three publicly available datasets, the Mendeley Liquid Based Cytology (LBC) dataset, the SIPaKMeD Whole Slide Image (WSI) dataset, and the SIPaKMeD Single Cell Image (SCI) dataset, and the results thus yielded are promising. Analysis of the approach using GradCAM-based visual representations and statistical tests, and comparison of the method with existing and baseline models in literature justify the efficacy of the approach.	翻訳日:2021-08-24 15:48:55 公開日:2021-08-21
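As a rough illustration of how a Sugeno fuzzy integral can fuse per-classifier scores, the sketch below builds a Sugeno lambda-measure from per-classifier densities and evaluates max_i min(h(x_(i)), g(A_i)) over scores sorted in descending order. The density values, the function names, and the bisection-based solver are assumptions made for illustration; the abstract does not specify how the authors set the fuzzy measure.

```python
def solve_lambda(densities):
    """Solve prod(1 + lam * g_i) = 1 + lam for the Sugeno lambda-measure.

    Assumes 0 < g_i < 1 and a small number of sources. Uses plain bisection.
    """
    total = sum(densities)
    if abs(total - 1.0) < 1e-6:
        return 0.0  # densities already sum to one: the measure is additive

    def f(lam):
        prod = 1.0
        for g in densities:
            prod *= 1.0 + lam * g
        return prod - (1.0 + lam)

    if total > 1.0:
        lo, hi = -1.0 + 1e-7, -1e-7  # the non-zero root lies in (-1, 0)
    else:
        lo, hi = 1e-7, 1.0           # the root is positive; widen until f changes sign
        while f(hi) < 0.0:
            hi *= 2.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if f(lo) * f(mid) <= 0.0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)


def sugeno_integral(scores, densities):
    """Sugeno integral of the scores with respect to the lambda-measure."""
    lam = solve_lambda(densities)
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    g_prev, best = 0.0, 0.0
    for i in order:
        # Measure of the set holding the sources seen so far (highest scores first).
        g_cur = densities[i] + g_prev + lam * densities[i] * g_prev
        best = max(best, min(scores[i], g_cur))
        g_prev = g_cur
    return best


def fuse(class_scores, densities):
    """class_scores[k][c] is classifier k's score for class c; returns (label, fused)."""
    n_classes = len(class_scores[0])
    fused = [
        sugeno_integral([s[c] for s in class_scores], densities)
        for c in range(n_classes)
    ]
    return max(range(n_classes), key=fused.__getitem__), fused


if __name__ == "__main__":
    # Hypothetical softmax outputs from three classifiers on a two-class problem.
    scores = [[0.7, 0.3], [0.6, 0.4], [0.2, 0.8]]
    print(fuse(scores, densities=[0.4, 0.4, 0.4]))
```

Unlike a fixed weighted average, the result depends on the ordering of the scores for each sample, so a classifier's effective influence changes from sample to sample.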
# マスキングによるエンド2エンドの顔認識 End2End Occluded Face Recognition by Masking Corrupted Features ( http://arxiv.org/abs/2108.09468v1 ) ライセンス: Link先を確認	Haibo Qiu, Dihong Gong, Zhifeng Li, Wei Liu, Dacheng Tao	(参考訳) 近年の深層畳み込みニューラルネットワークの進歩により、顔認識において大きな進歩が見られた。しかし、最先端の一般顔認識モデルは、現実のシナリオでよく見られるような、隠蔽された顔画像にうまく当てはまらない。潜在的な理由は、訓練用の大規模な隠蔽顔データがないことと、閉塞によって引き起こされる破損した特徴に対処するための特定の設計がないことである。本稿では,1つのエンドツーエンドのディープニューラルネットワークに基づいて,オクルージョンに頑健な新しい顔認識手法を提案する。私たちのアプローチは(オクルージョンマスクによる顔認識)、深層畳み込みニューラルネットワークから破損した特徴を発見し、動的に学習したマスクによってそれらをきれいにすることを学びます。さらに,大規模な隠蔽顔画像を構築し,効果的かつ効率的に訓練する。外部検出器に頼ってオクルージョンを発見する方法や、差別的でない浅いモデルを使う方法に比べれば、より単純だが強力である。 LFW、Megaface Challenge 1, RMF2、ARデータセットおよびその他の擬似隠蔽/マス付きデータセットの実験結果から、オクルージョン下での精度が劇的に向上し、一般的な顔認識でうまく一般化されることを確認した。 With the recent advancement of deep convolutional neural networks, significant progress has been made in general face recognition. However, the state-of-the-art general face recognition models do not generalize well to occluded face images, which are exactly the common cases in real-world scenarios. The potential reasons are the absences of large-scale occluded face data for training and specific designs for tackling corrupted features brought by occlusions. This paper presents a novel face recognition method that is robust to occlusions based on a single end-to-end deep neural network. Our approach, named FROM (Face Recognition with Occlusion Masks), learns to discover the corrupted features from the deep convolutional neural networks, and clean them by the dynamically learned masks. In addition, we construct massive occluded face images to train FROM effectively and efficiently. FROM is simple yet powerful compared to the existing methods that either rely on external detectors to discover the occlusions or employ shallow models which are less discriminative. 
Experimental results on the LFW, Megaface challenge 1, RMF2, AR dataset and other simulated occluded/masked datasets confirm that FROM dramatically improves the accuracy under occlusions, and generalizes well on general face recognition.	翻訳日:2021-08-24 15:48:21 公開日:2021-08-21
# 公共ウェブカメラからの3次元再構成 3D Reconstruction from public webcams ( http://arxiv.org/abs/2108.09476v1 ) ライセンス: Link先を確認	Tianyu Wu, Konrad Schindler and Cenek Albl	(参考訳) 本稿では,複数のウェブカメラで捉えたシーンの3次元形状を再構成する可能性を検討する。公開されているウェブカメラの数は増えており、毎日増えている。論理的な疑問が生まれます - この自由データソースは、余暇活動を超えた何かに使えるのでしょうか? 課題は、これらのカメラの内部、外部、または時間的なキャリブレーションがないことである。コンピュータビジョンの最近の進歩により、我々はカメラの校正に成功し、静的なシーンの3次元再構成を行い、移動物体の3次元軌跡を復元した。 In this paper, we investigate the possibility of reconstructing the 3D geometry of a scene captured by multiple webcams. The number of publicly accessible webcams is already large and it is growing every day. A logical question arises - can we use this free source of data for something beyond leisure activities? The challenge is that no internal, external, or temporal calibration of these cameras is available. We show that using recent advances in computer vision, we successfully calibrate the cameras, perform 3D reconstructions of the static scene and also recover the 3D trajectories of moving objects.	翻訳日:2021-08-24 15:48:00 公開日:2021-08-21
# MOTSynth: 合成データは歩行者の検知と追跡にどのように役立つか? MOTSynth: How Can Synthetic Data Help Pedestrian Detection and Tracking? ( http://arxiv.org/abs/2108.09518v1 ) ライセンス: Link先を確認	Matteo Fabbri, Guillem Braso, Gianluca Maugeri, Orcun Cetintas, Riccardo Gasparini, Aljosa Osep, Simone Calderara, Laura Leal-Taixe, Rita Cucchiara	(参考訳) ビデオ歩行者検出と追跡のためのディープラーニングに基づく手法は、優れたパフォーマンスを達成するために大量のトレーニングデータを必要とする。しかし、混み合った公共環境におけるデータ取得は、データプライバシの懸念を引き起こす - すべての参加者の明確な同意なしに、単にデータを記録して保存することは許されない。さらに、コンピュータビジョンアプリケーションに対するそのようなデータのアノテーションは通常、特にビデオ領域においてかなりの手作業を必要とする。非常に混み合ったシナリオにおける歩行者のラベル付けは、人間のアノテータであっても困難であり、トレーニングデータにエラーをもたらす可能性がある。本稿では,合成データのみを用いて多人数追跡の異なる側面を前進させる方法について検討する。この目的のために、レンダリングゲームエンジンを用いてオブジェクトの検出と追跡のための大規模で高度に多様な合成データセットMOTSynthを生成する。実験の結果,MOTSynthは,歩行者検出,再識別,セグメンテーション,追跡といったタスクの実際のデータを置き換えるために利用できることがわかった。 Deep learning-based methods for video pedestrian detection and tracking require large volumes of training data to achieve good performance. However, data acquisition in crowded public environments raises data privacy concerns -- we are not allowed to simply record and store data without the explicit consent of all participants. Furthermore, the annotation of such data for computer vision applications usually requires a substantial amount of manual effort, especially in the video domain. Labeling instances of pedestrians in highly crowded scenarios can be challenging even for human annotators and may introduce errors in the training data. In this paper, we study how we can advance different aspects of multi-person tracking using solely synthetic data. To this end, we generate MOTSynth, a large, highly diverse synthetic dataset for object detection and tracking using a rendering game engine. Our experiments show that MOTSynth can be used as a replacement for real data on tasks such as pedestrian detection, re-identification, segmentation, and tracking.	翻訳日:2021-08-24 15:47:48 公開日:2021-08-21
# vision transformer (vit) アーキテクチャを用いた建設監視自動化のための不均衡データセットの構築材料分類 Construction material classification on imbalanced datasets for construction monitoring automation using Vision Transformer (ViT) architecture ( http://arxiv.org/abs/2108.09527v1 ) ライセンス: Link先を確認	Maryam Soleymani, Mahdi Bonyani, Hadi Mahami, Farnad Nasirzadeh	(参考訳) 今日では、自動化は建設プロジェクトの生産性に大きな影響を与えるため、重要なトピックである。この産業における自動化の利用は、建設作業の効率、品質、安全性を著しく向上させるなど、大きな成果をもたらす。建設における自動化の範囲は幅広い段階を含み、建設プロジェクトを監視することは例外ではない。さらに、プロジェクト進捗の正確かつタイムリーな評価によって、マネージャはスケジュールからの逸脱を素早く識別し、必要なアクションを適切なタイミングで行うことができるので、プロジェクト管理において非常に重要です。この段階で最も重要なタスクの1つは、プロジェクト進捗を日々追跡することであり、それは非常に時間がかかり、労働集約的ですが、自動化によってこのタスクが促進され、加速されました。また、多くの危険なタスクのリスクを排除または少なくとも減らした。このようにして、建設自動化の最初のステップは、プロジェクト現場で使われている材料を自動的に検出することである。本稿では,視覚変換器(ViT)と呼ばれる新しいディープラーニングアーキテクチャを用いて,建設材料の検出と分類を行う。提案手法の適用性および性能を評価するため, 従来の論文で用いた構成材料ライブラリ (CML) と構築材料データセット (BMD) の3つの大きな不均衡なデータセットと, それらを組み合わせて作成した新しいデータセットを用いて, 実験を行った。得られた結果から,すべてのパラメータおよび材料カテゴリーで100%の精度が得られた。提案手法は, 異なる材料タイプを検出し, 分類するための新しいロバストなツールであると考えられる。 Nowadays, automation is a critical topic due to its significant impacts on the productivity of construction projects. Utilizing automation in this industry brings about great results, such as remarkable improvements in the efficiency, quality, and safety of construction activities. The scope of automation in construction includes a wide range of stages, and monitoring construction projects is no exception. Additionally, it is of great importance in project management since an accurate and timely assessment of project progress enables managers to quickly identify deviations from the schedule and take the required actions at the right time. In this stage, one of the most important tasks is to daily keep track of the project progress, which is very time-consuming and labor-intensive, but automation has facilitated and accelerated this task. It also eliminated or at least decreased the risk of many dangerous tasks. 
Thus, the first step of construction automation is to automatically detect the materials used on a project site. In this paper, a novel deep learning architecture called Vision Transformer (ViT) is used for detecting and classifying construction materials. To evaluate the applicability and performance of the proposed method, it is trained and tested on three large imbalanced datasets: Construction Material Library (CML) and Building Material Dataset (BMD), both used in previous papers, and a new dataset created by combining them. The results show an accuracy of 100 percent in all parameters and in each material category. It is believed that the proposed method provides a novel and robust tool for detecting and classifying different material types.	翻訳日:2021-08-24 15:47:32 公開日:2021-08-21
# SSR: シングルビュー2次元から3次元再構成のための半教師付きソフトラスタライザ SSR: Semi-supervised Soft Rasterizer for single-view 2D to 3D Reconstruction ( http://arxiv.org/abs/2108.09593v1 ) ライセンス: Link先を確認	Issam Laradji, Pau Rodr\'iguez, David Vazquez, Derek Nowrouzezahrai	(参考訳) 最近の研究は、弱い監督下でのオブジェクトメッシュの学習に大きな進歩をもたらした。ソフトラスタライズ法は2次元画像からの正確な3次元再構成を実現した。本研究では,このような3次元復元手法がラベルなし画像を活用することで,ラベリング作業をさらに削減する。これらのラベルのない画像の視点を得るために、2つの画像を入力として取り、同一の視点に対応するか否かを出力するSiameseネットワークを提案する。トレーニング中、クロスエントロピー損失を最小限に抑え、一対のイメージが同じ視点に属するか否かを予測する確率を最大化する。新しい画像の視点を得るために、トレーニングサンプルから得られた異なる視点と比較し、最も高い一致確率で視点を選択する。ラベル付けされていない画像に最も自信のある視点でラベル付けし、異なるラスタライズ層を持つディープネットワークを訓練する。実験の結果、2つのオブジェクトのみをラベル付けしても、未ラベルの例を利用する場合、ShapeNetのIoUは大幅に改善されることがわかった。コードはhttps://github.com/IssamLaradji/SSRで入手できる。 Recent work has made significant progress in learning object meshes with weak supervision. Soft Rasterization methods have achieved accurate 3D reconstruction from 2D images with viewpoint supervision only. In this work, we further reduce the labeling effort by allowing such 3D reconstruction methods leverage unlabeled images. In order to obtain the viewpoints for these unlabeled images, we propose to use a Siamese network that takes two images as input and outputs whether they correspond to the same viewpoint. During training, we minimize the cross entropy loss to maximize the probability of predicting whether a pair of images belong to the same viewpoint or not. To get the viewpoint of a new image, we compare it against different viewpoints obtained from the training samples and select the viewpoint with the highest matching probability. We finally label the unlabeled images with the most confident predicted viewpoint and train a deep network that has a differentiable rasterization layer. Our experiments show that even labeling only two objects yields significant improvement in IoU for ShapeNet when leveraging unlabeled examples. Code is available at https://github.com/IssamLaradji/SSR.	翻訳日:2021-08-24 15:47:05 公開日:2021-08-21
# フェアネスを考慮したオンラインメタラーニング Fairness-Aware Online Meta-learning ( http://arxiv.org/abs/2108.09435v1 ) ライセンス: Link先を確認	Chen Zhao, Feng Chen, Bhavani Thuraisingham	(参考訳) オンラインメタ学習(oml)は,タスクが次々に現れる逐次的な環境において,モデルパラメータ(あるいは学習の学習)よりも優れた優先順位を学習する。このようなテクニックは、人間の知性の重要な特徴である公平さで学習することの重要性を完全に無視する。 (2)オンライン・フェアネス・アウェア・ラーニングこの設定は、公平性が懸念される多くの分類問題を捉えている。しかし、タスク固有の適応なしにゼロショット一般化を達成することを目指している。これにより、モデルが新たに到着したデータに適応する能力が制限される。このような問題を克服し,そのギャップを埋めるために,本稿では,不公平防止の設定下にある新しいオンラインメタ学習アルゴリズムであるFFMLを提案する。 ffmlの重要な部分は、モデルの正確性と公平性にそれぞれ関連づけられたオンラインフェア分類モデルのプライマリパラメータとデュアルパラメータの優れた事前学習である。この問題は二値凸凹最適化の形で定式化されている。理論解析は、損失後悔と累積公正性制約の違反に対して、サブ線形上界を与える。実世界の3つのデータセットの分類にFFMLを適用することでFFMLの汎用性を実証し、公平性と分類精度のトレードオフに関する先行研究よりも大幅に改善したことを示す。 In contrast to offline working fashions, two research paradigms are devised for online learning: (1) Online Meta Learning (OML) learns good priors over model parameters (or learning to learn) in a sequential setting where tasks are revealed one after another. Although it provides a sub-linear regret bound, such techniques completely ignore the importance of learning with fairness which is a significant hallmark of human intelligence. (2) Online Fairness-Aware Learning. This setting captures many classification problems for which fairness is a concern. But it aims to attain zero-shot generalization without any task-specific adaptation. This therefore limits the capability of a model to adapt onto newly arrived data. To overcome such issues and bridge the gap, in this paper for the first time we proposed a novel online meta-learning algorithm, namely FFML, which is under the setting of unfairness prevention. The key part of FFML is to learn good priors of an online fair classification model's primal and dual parameters that are associated with the model's accuracy and fairness, respectively. The problem is formulated in the form of a bi-level convex-concave optimization. 
Theoretical analysis provides sub-linear upper bounds for loss regret and for the violation of cumulative fairness constraints. Our experiments demonstrate the versatility of FFML by applying it to classification on three real-world datasets and show substantial improvements over the best prior work on the tradeoff between fairness and classification accuracy.	翻訳日:2021-08-24 15:38:30 公開日:2021-08-21
# 交通事故検出のための不均衡時空間トラヒックフローデータの深い表現 Deep Representation of Imbalanced Spatio-temporal Traffic Flow Data for Traffic Accident Detection ( http://arxiv.org/abs/2108.09506v1 ) ライセンス: Link先を確認	Pouya Mehrannia, Shayan Shirahmad Gale Bagi, Behzad Moshiri, Otman Adam Al-Basir	(参考訳) 交通事故の自動検出は、交通、公共安全、経路計画の改善に重要な影響を及ぼす。事故発生から救助隊派遣までの時間の連続的な減少によって多くの命を救うことができ、またドライバーに代替ルートの選択を通知することで多くの走行時間を節約できる。この問題は、主に事故の稀さと環境の空間的不均一性のために困難である。本稿では,高速道路事故の自動検出のためのLong-Short Term Memory (LSTM) ネットワークを用いたループ検出データの深部表現について検討する。 LSTMベースのフレームワークは、データの次元を減らしながら、エンコードされた特徴空間におけるクラス分離性を高める。ミネソタ州ツインシティーズ・メトロ・フリーウェイズから収集された実事故およびループ検出器データを用いた実験により、lstmネットワークを用いた交通流データの深い表現は、高速道路事故を18分以内の真の正の率 0.71 と偽の正の率 0.25 で検出できる可能性が証明された。 Automatic detection of traffic accidents has a crucial effect on improving transportation, public safety, and path planning. Many lives can be saved by the consequent decrease in the time between when the accidents occur and when rescue teams are dispatched, and much travelling time can be saved by notifying drivers to select alternative routes. This problem is challenging mainly because of the rareness of accidents and spatial heterogeneity of the environment. This paper studies deep representation of loop detector data using Long-Short Term Memory (LSTM) network for automatic detection of freeway accidents. The LSTM-based framework increases class separability in the encoded feature space while reducing the dimension of data. Our experiments on real accident and loop detector data collected from the Twin Cities Metro freeways of Minnesota demonstrate that deep representation of traffic flow data using LSTM network has the potential to detect freeway accidents in less than 18 minutes with a true positive rate of 0.71 and a false positive rate of 0.25 which outperforms other competing methods in the same arrangement.	翻訳日:2021-08-24 15:38:10 公開日:2021-08-21
# DSP-SLAM: 深い形状を持つオブジェクト指向SLAM DSP-SLAM: Object Oriented SLAM with Deep Shape Priors ( http://arxiv.org/abs/2108.09481v1 ) ライセンス: Link先を確認	Jingwen Wang, Martin R\"unz, Lourdes Agapito	(参考訳) DSP-SLAMはオブジェクト指向SLAMシステムであり,前景オブジェクトのための高密度3次元モデルのリッチで高精度な関節マップを構築し,背景を表わすランドマークポイントを疎外する。 DSP-SLAMは特徴に基づくSLAMシステムによって再構成された3次元点雲を入力として、検出された物体の密な再構成でスパースマップを強化する能力を備える。オブジェクトはセマンティックなインスタンスセグメンテーションによって検出され、その形状とポーズはカテゴリ固有の深部形状の埋め込みを先行として、新しい2階最適化によって推定される。我々のオブジェクト認識バンドル調整は、ポーズグラフを構築し、カメラポーズ、オブジェクト位置、特徴点を共同で最適化する。 DSP-SLAMは、モノクロ、ステレオ、ステレオ+LiDARの3つの異なる入力モードで毎秒10フレームで動作する。本研究では,フリブルク・レッドウッド・osデータセットの単眼rgb配列とキッティオドメトリーデータセットのステレオ+ライダー配列のほぼフレームレートで動作するdsp-slamを,部分的観測からでも高品質な完全なオブジェクト再構成を実現するとともに,一貫したグローバルマップを維持しながら実証する。 KITTIデータセット上でのカメラトラッキングドリフトの低減と,近年の深部事前再構成手法によるオブジェクトのポーズと形状復元の改善を示す。 We propose DSP-SLAM, an object-oriented SLAM system that builds a rich and accurate joint map of dense 3D models for foreground objects, and sparse landmark points to represent the background. DSP-SLAM takes as input the 3D point cloud reconstructed by a feature-based SLAM system and equips it with the ability to enhance its sparse map with dense reconstructions of detected objects. Objects are detected via semantic instance segmentation, and their shape and pose is estimated using category-specific deep shape embeddings as priors, via a novel second order optimization. Our object-aware bundle adjustment builds a pose-graph to jointly optimize camera poses, object locations and feature points. DSP-SLAM can operate at 10 frames per second on 3 different input modalities: monocular, stereo, or stereo+LiDAR. We demonstrate DSP-SLAM operating at almost frame rate on monocular-RGB sequences from the Friburg and Redwood-OS datasets, and on stereo+LiDAR sequences on the KITTI odometry dataset showing that it achieves high-quality full object reconstructions, even from partial observations, while maintaining a consistent global map. 
Our evaluation shows improvements in object pose and shape reconstruction with respect to recent deep prior-based reconstruction methods and reductions in camera tracking drift on the KITTI dataset.	翻訳日:2021-08-24 15:33:14 公開日:2021-08-21
# LiDARパノプティブセグメンテーションにおける従来の点群クラスタリング手法の技術的検討と評価 A Technical Survey and Evaluation of Traditional Point Cloud Clustering Methods for LiDAR Panoptic Segmentation ( http://arxiv.org/abs/2108.09522v1 ) ライセンス: Link先を確認	Yiming Zhao, Xiao Zhang, Xinming Huang	(参考訳) LiDARのパノプティカルセグメンテーションは、自動運転のための新しい技術課題である。一般的なエンドツーエンドのディープラーニングソリューションとは対照的に、セマンティクス情報を抽出する既存のセマンティクスセグメンテーションネットワークと、各インスタンスオブジェクトを分割する従来のlidar point cloud clusterアルゴリズムとのハイブリッド手法を提案する。幾何学に基づく従来のクラスタリングアルゴリズムは、semantickittiデータセットのpanoptic segmentation leaderboard上で公開されたすべてのエンドツーエンドのディープラーニングソリューションの中で最先端のパフォーマンスを示すことによって考慮に値すると論じている。私たちの知る限り、我々はクラスタリングアルゴリズムでpoint cloud panopticセグメンテーションを試した最初の人物です。そこで本研究では,新しいモデルを開発する代わりに,4つの典型的なクラスタ手法を実装し,その性能をベンチマークで報告する。これら4つのクラスタメソッドは、リアルタイム実行速度が最も代表的である。本論文ではC++で実装し,既存のディープラーニングフレームワークとシームレスに統合するためのpython関数としてラップする。この問題に関心のあるピア研究者のためにコードを公開しています。 LiDAR panoptic segmentation is a newly proposed technical task for autonomous driving. In contrast to popular end-to-end deep learning solutions, we propose a hybrid method with an existing semantic segmentation network to extract semantic information and a traditional LiDAR point cloud cluster algorithm to split each instance object. We argue geometry-based traditional clustering algorithms are worth being considered by showing a state-of-the-art performance among all published end-to-end deep learning solutions on the panoptic segmentation leaderboard of the SemanticKITTI dataset. To our best knowledge, we are the first to attempt the point cloud panoptic segmentation with clustering algorithms. Therefore, instead of working on new models, we give a comprehensive technical survey in this paper by implementing four typical cluster methods and report their performances on the benchmark. Those four cluster methods are the most representative ones with real-time running speed. They are implemented with C++ in this paper and then wrapped as a python function for seamless integration with the existing deep learning frameworks. 
We release our code for peer researchers who might be interested in this problem.	翻訳日:2021-08-24 15:32:46 公開日:2021-08-21
# 医用画像分割のための深層学習法の系統的臨床評価--ラジオサージリー応用 Systematic Clinical Evaluation of A Deep Learning Method for Medical Image Segmentation: Radiosurgery Application ( http://arxiv.org/abs/2108.09535v1 ) ライセンス: Link先を確認	Boris Shirokikh, Alexandra Dalechina, Alexey Shevtsov, Egor Krivov, Valery Kostjuchenko, Amayak Durgaryan, Mikhail Galkin, Andrey Golanov and Mikhail Belyaev	(参考訳) 3次元医用画像分割作業において,Deep Learning(DL)手法を体系的に評価した。セグメンテーション法は, 放射線治療プロセスに統合され, 臨床ワークフローに直接影響を及ぼす。提案手法では,手動セグメンテーションの相対的な欠点,すなわち,高波長間コントゥーリング変動とコンチューリングプロセスの高時間消費に対処する。既存の評価に対する主な拡張は、他の医用画像分割タスクでさらに一般化できる、慎重に詳細な分析である。まず, レータ間検出契約の変更を解析する。セグメンテーションモデルは検出不一致の比率を0.162から0.085に減少させる(p < 0.05)。第2に,このモデルが表層ダイススコア0.845から0.871 (p < 0.05) に向上することを示す。第3に、モデルが1.6倍から2.0倍(p < 0.05)のデライン化過程を加速することを示す。最後に,評価バイアスを排除または推定するために臨床実験のセットアップを設計し,その結果の意義を保存した。臨床評価に加えて、3次元医用画像セグメンテーションのための効率的なdlベースモデル構築のための直感と実践的アイデアを要約する。 We systematically evaluate a Deep Learning (DL) method in a 3D medical image segmentation task. Our segmentation method is integrated into the radiosurgery treatment process and directly impacts the clinical workflow. With our method, we address the relative drawbacks of manual segmentation: high inter-rater contouring variability and high time consumption of the contouring process. The main extension over the existing evaluations is the careful and detailed analysis that could be further generalized on other medical image segmentation tasks. Firstly, we analyze the changes in the inter-rater detection agreement. We show that the segmentation model reduces the ratio of detection disagreements from 0.162 to 0.085 (p < 0.05). Secondly, we show that the model improves the inter-rater contouring agreement from 0.845 to 0.871 surface Dice Score (p < 0.05). Thirdly, we show that the model accelerates the delineation process in between 1.6 and 2.0 times (p < 0.05). Finally, we design the setup of the clinical experiment to either exclude or estimate the evaluation biases, thus preserve the significance of the results. 
Besides the clinical evaluation, we also summarize the intuitions and practical ideas for building an efficient DL-based model for 3D medical image segmentation.	翻訳日:2021-08-24 15:32:29 公開日:2021-08-21
# クロスアテンションディープネットワークを用いたマルチモーダル乳腺病変分類 Multimodal Breast Lesion Classification Using Cross-Attention Deep Networks ( http://arxiv.org/abs/2108.09591v1 ) ライセンス: Link先を確認	Hung Q. Vo, Pengyu Yuan, Tiancheng He, Stephen T.C. Wong, and Hien V. Nguyen	(参考訳) 正確な乳房病変リスク推定は、不要な生検を著しく減らし、医師が最適な治療計画を決定するのに役立つ。既存のコンピュータ支援システムのほとんどは乳腺病変を分類するためにマンモグラムの特徴のみに依存している。このアプローチは便利であるが、最適な性能を達成するために臨床報告で有用な情報を十分に活用していない。臨床特徴は、マンモグラム単独の場合と比較して乳房病変の分類を有意に改善するだろうか? 医療実践のばらつきに起因する臨床情報の欠落にどう対処するか? マンモグラムと臨床特徴を組み合わせる最善の方法は何か? これらの根本的な問題に対処するために体系的な研究が必要となる。本稿では, マンモグラムと分類学的臨床変数を組み合わせるために, 特徴連結, 交差注意, 共同注意に基づく複数のマルチモーダルディープネットワークについて検討する。提案するアーキテクチャにより,病変分類性能が著しく向上した(ROC曲線下平均面積は0.89から0.94)。また,臨床変数の欠如時にモデルを評価する。 Accurate breast lesion risk estimation can significantly reduce unnecessary biopsies and help doctors decide optimal treatment plans. Most existing computer-aided systems rely solely on mammogram features to classify breast lesions. While this approach is convenient, it does not fully exploit useful information in clinical reports to achieve the optimal performance. Would clinical features significantly improve breast lesion classification compared to using mammograms alone? How to handle missing clinical information caused by variation in medical practice? What is the best way to combine mammograms and clinical features? There is a compelling need for a systematic study to address these fundamental questions. This paper investigates several multimodal deep networks based on feature concatenation, cross-attention, and co-attention to combine mammograms and categorical clinical variables. We show that the proposed architectures significantly increase the lesion classification performance (average area under ROC curves from 0.89 to 0.94). We also evaluate the model when clinical variables are missing.	翻訳日:2021-08-24 15:32:10 公開日:2021-08-21
# サブ国家レベルの解像度でCOVID-19の今後の知見を可能にする汎用予測ソリューション A generalized forecasting solution to enable future insights of COVID-19 at sub-national level resolutions ( http://arxiv.org/abs/2108.09556v1 ) ライセンス: Link先を確認	Umar Marikkar, Harshana Weligampola, Rumali Perera, Jameel Hassan, Suren Sritharan, Gihan Jayatilaka, Roshan Godaliyadda, Vijitha Herath, Parakrama Ekanayake, Janaka Ekanayake, Anuruddhika Rathnayake, Samath Dharmaratne	(参考訳) 新型コロナウイルスは公衆衛生に大きな影響を与え続けている。この影響を最小限に抑えるため、政策立案者は、実際の脅威に対して不当に実施された場合、誤った脅威評価の結果、望ましくない社会経済的合併症を引き起こすような封じ込め措置を講じる。さらに、マクロレベルの意思決定や全国レベルの意思決定は、小さな地域での局所的な感受性を考慮できない。したがって、正確な予測を通じて、COVID-19の行動に関する洞察を提供する地域的な脅威アセスメントの必要性が生じる。本研究では、封じ込め措置を局所的に実施できるほど小さな地域におけるCOVID-19の日次新規感染者数を予測する予測ソリューションを提案する。文献に存在する3つの主な欠点、すなわち、小規模地域における一貫性のない検査パターンに起因する既存データの信頼性の低さ、未知の地域での症例予測に対する予測モデルの展開可能性の弱さ、COVID-19のエピカーブにおけるデータの不均衡性に起因するモデル学習のバイアスを対象とする。そこで本研究は,その地域の疫学的なダイナミクスに基づく決定論的エピカーブを平滑化するための最適化平滑化手法,特定の地域からのデータを用いてトレーニングされた長期記憶型予測モデル,履歴データを持たない地域におけるデプロイ可能性の最大化を目的とした多種多様なトレーニングセット,エピ曲線に見られるデータ不均衡を緩和するための学習中の適応損失関数の3つを特徴とする。提案する平滑化手法,一般化トレーニング戦略,適応損失関数は予測全体の精度を大きく向上させ,より局所的なマイクロレベルでの効率的な封じ込めが可能となった。 COVID-19 continues to cause a significant impact on public health. To minimize this impact, policy makers undertake containment measures that, however, when carried out disproportionately to the actual threat as a result of erroneous threat assessment, cause undesirable long-term socio-economic complications. In addition, macro-level or national level decision making fails to consider the localized sensitivities in small regions. 
Hence, the need arises for region-wise threat assessments that provide insights on the behaviour of COVID-19 through time, enabled through accurate forecasts. In this study, a forecasting solution is proposed, to predict daily new cases of COVID-19 in regions small enough where containment measures could be locally implemented, by targeting three main shortcomings that exist in literature; the unreliability of existing data caused by inconsistent testing patterns in smaller regions, weak deploy-ability of forecasting models towards predicting cases in previously unseen regions, and model training biases caused by the imbalanced nature of data in COVID-19 epi-curves. Hence, the contributions of this study are three-fold; an optimized smoothing technique to smoothen less deterministic epi-curves based on epidemiological dynamics of that region, a Long-Short-Term-Memory (LSTM) based forecasting model trained using data from select regions to create a representative and diverse training set that maximizes deploy-ability in regions with lack of historical data, and an adaptive loss function whilst training to mitigate the data imbalances seen in epi-curves. The proposed smoothing technique, the generalized training strategy and the adaptive loss function largely increased the overall accuracy of the forecast, which enables efficient containment measures at a more localized micro-level.	翻訳日:2021-08-24 15:28:58 公開日:2021-08-21
# プログラマブルfpgaベースのメモリコントローラ Programmable FPGA-based Memory Controller ( http://arxiv.org/abs/2108.09601v1 ) ライセンス: Link先を確認	Sasindu Wijeratne, Sanket Pattnaik, Zhiyu Chen, Rajgopal Kannan, Viktor Prasanna	(参考訳) DRAM技術の世代別改良にもかかわらず、メモリアクセスレイテンシは依然としてアプリケーションアクセラレーターの主要なボトルネックであり、主にターゲットアプリケーション、使用するアルゴリズム、アクセラレーターアーキテクチャのバリエーションを十分に考慮できないメモリインターフェースIPの制限のためである。本稿では,異なるアプリケーション用のメモリコントローラの開発に時間を要するため,利用可能なハードウェアリソース上で,異なるターゲットアプリケーション用に設定可能なモジュール型でプログラム可能なメモリコントローラを提案する。提案するメモリコントローラはバルクメモリ転送とともにキャッシュラインアクセスを効率的にサポートする。ユーザーはFPGA上の利用可能なロジックリソース、メモリアクセスパターン、および外部メモリ仕様に応じてコントローラを設定することができる。モジュール設計は、要求スケジューリング、内部キャッシュ、直接メモリアクセスを含む様々なメモリアクセス最適化技術をサポートする。これらの技術は、高い持続帯域幅を維持しながら、全体のレイテンシを低減することに寄与する。本研究では,最先端FPGA上に実装し,グラフ解析とディープラーニング処理という2つの広く研究されている領域を用いて性能評価を行う。商用メモリコントローラIPと比較して,CNNおよびGCNワークロードのメモリアクセス時間は最大58%向上した。 Even with generational improvements in DRAM technology, memory access latency still remains the major bottleneck for application accelerators, primarily due to limitations in memory interface IPs which cannot fully account for variations in target applications, the algorithms used, and accelerator architectures. Since developing memory controllers for different applications is time-consuming, this paper introduces a modular and programmable memory controller that can be configured for different target applications on available hardware resources. The proposed memory controller efficiently supports cache-line accesses along with bulk memory transfers. The user can configure the controller depending on the available logic resources on the FPGA, memory access pattern, and external memory specifications. The modular design supports various memory access optimization techniques including, request scheduling, internal caching, and direct memory access. These techniques contribute to reducing the overall latency while maintaining high sustained bandwidth. 
We implement the system on a state-of-the-art FPGA and evaluate its performance using two widely studied domains: graph analytics and deep learning workloads. We show improved overall memory access time up to 58% on CNN and GCN workloads compared with commercial memory controller IPs.	翻訳日:2021-08-24 15:28:28 公開日:2021-08-21
# 多様な時間スケールを用いた貯留層計算によるマルチスケールダイナミクスの予測 Reservoir Computing with Diverse Timescales for Prediction of Multiscale Dynamics ( http://arxiv.org/abs/2108.09446v1 ) ライセンス: Link先を確認	Gouhei Tanaka, Tadayoshi Matsumori, Hiroaki Yoshida, Kazuyuki Aihara	(参考訳) 機械学習のアプローチは最近、動的システムに対する物理的・数学的モデリングアプローチの代替または補助として活用されている。マルチスケールダイナミックスのモデリングと予測に特化した効率的な機械学習手法を開発するために,異種漏洩積分体ニューロンの繰り返しネットワークを用いて,様々な時間スケールの貯水池計算モデルを提案する。サブシステムダイナミクスの時間スケールに大きなギャップを含む高速でカオス的な動的システムの予測タスクにおいて,提案モデルが既存の標準モデルよりも高いポテンシャルを持ち,リーク率パラメータの最適化を必要とせずとも,標準モデルに匹敵する性能が得られることを実証する。解析の結果, モデル学習により, 対象動力学の各成分を生産するのに要する時間スケールが, 適切に柔軟に選択できることが判明した。 Machine learning approaches have recently been leveraged as a substitute or an aid for physical/mathematical modeling approaches to dynamical systems. To develop an efficient machine learning method dedicated to modeling and prediction of multiscale dynamics, we propose a reservoir computing model with diverse timescales by using a recurrent network of heterogeneous leaky integrator neurons. In prediction tasks with fast-slow chaotic dynamical systems including a large gap in timescales of their subsystems dynamics, we demonstrate that the proposed model has a higher potential than the existing standard model and yields a performance comparable to the best one of the standard model even without an optimization of the leak rate parameter. Our analysis reveals that the timescales required for producing each component of target dynamics are appropriately and flexibly selected from the reservoir dynamics by model training.	翻訳日:2021-08-24 15:27:12 公開日:2021-08-21
# グラフニューラルネットワークに対するハードラベルブラックボックスの逆攻撃 A Hard Label Black-box Adversarial Attack Against Graph Neural Networks ( http://arxiv.org/abs/2108.09513v1 ) ライセンス: Link先を確認	Jiaming Mu, Binghui Wang, Qi Li, Kun Sun, Mingwei Xu, Zhuotao Liu	(参考訳) グラフニューラルネットワーク(GNN)は,ノード分類やグラフ分類などの様々なグラフ構造関連タスクにおいて,最先端のパフォーマンスを実現している。しかし、GNNは敵の攻撃に弱い。既存の研究は主にノード分類のためのGNNに対する攻撃に焦点を当てているが、グラフ分類のためのGNNに対する攻撃は十分に研究されていない。本研究では,グラフ構造を摂動することで,グラフ分類のためのGNNに対する敵対攻撃を系統的に研究する。特に、攻撃者がターゲットGNNモデルについて知識がなく、ターゲットモデルに問い合わせることによって予測されたラベルしか取得できない、最も困難な攻撃であるハードラベルブラックボックス攻撃に注目する。この目的を達成するために、高い攻撃成功率を維持しながらグラフ内で摂動されるエッジの数を最小化する最適化問題として攻撃を定式化する。元の最適化問題は解くのが困難であるため、扱いやすい問題に緩和し、理論的収束保証のもとで解く。また、ターゲットGNNモデルに対するクエリ数を減少させるために、粗粒度探索アルゴリズムとクエリ効率勾配計算アルゴリズムを設計する。実世界の3つのデータセットに対する実験結果から,クエリや摂動の少ないグラフ分類において,GNNを効果的に攻撃できることが示された。また,本攻撃の有効性を2つの防御条件下で評価した。1つは高度に設計された逆グラフ検出器であり、もう1つはターゲットのGNNモデル自体が逆グラフ生成を防止する防御機能を備えていることである。実験の結果,このような防御効果は十分ではないことが明らかとなった。 Graph Neural Networks (GNNs) have achieved state-of-the-art performance in various graph structure related tasks such as node classification and graph classification. However, GNNs are vulnerable to adversarial attacks. Existing works mainly focus on attacking GNNs for node classification; nevertheless, the attacks against GNNs for graph classification have not been well explored. In this work, we conduct a systematic study on adversarial attacks against GNNs for graph classification via perturbing the graph structure. In particular, we focus on the most challenging attack, i.e., hard label black-box attack, where an attacker has no knowledge about the target GNN model and can only obtain predicted labels through querying the target model. To achieve this goal, we formulate our attack as an optimization problem, whose objective is to minimize the number of edges to be perturbed in a graph while maintaining the high attack success rate. 
The original optimization problem is intractable, so we relax it to a tractable one, which is solved with a theoretical convergence guarantee. We also design a coarse-grained searching algorithm and a query-efficient gradient computation algorithm to decrease the number of queries to the target GNN model. Our experimental results on three real-world datasets demonstrate that our attack can effectively attack representative GNNs for graph classification with fewer queries and perturbations. We also evaluate the effectiveness of our attack under two defenses: one is a well-designed adversarial graph detector and the other is that the target GNN model itself is equipped with a defense to prevent adversarial graph generation. Our experimental results show that such defenses are not effective enough, which highlights the need for more advanced defenses.	翻訳日:2021-08-24 15:26:58 公開日:2021-08-21
# 確率ベイズゲームにおける時間的自己プレイ Temporal Induced Self-Play for Stochastic Bayesian Games ( http://arxiv.org/abs/2108.09444v1 ) ライセンス: Link先を確認	Weizhe Chen, Zihan Zhou, Yi Wu, Fei Fang	(参考訳) ダイナミックゲームを解くための実践的な要件は、プレイヤーがいかなる決定点からでもうまくプレーすることを保証することである。この要件を満たすため、既存の取り組みは均衡改善に重点を置いているが、既存の技術のスケーラビリティと適用性は限られている。本稿では,任意の意思決定点から適切なパフォーマンスの戦略を見出すための新しい強化学習ベースフレームワークtispを提案する。 TISPは、信念空間表現、後方誘導、ポリシー学習、および非パラメトリック近似を使用する。 TISPを基盤として,政策段階のアルゴリズムであるTISP-PGを設計する。有限地平線を持つゼロサム一辺確率ベイズゲームにおいて、tispベースのアルゴリズムが近似完全ベイズ均衡を見つけることが証明される。セキュリティゲームやグリッドワールドゲームなど,TISPベースのアルゴリズムを多種多様なゲームでテストする。その結果,TISP-PGは既存の数学的プログラミング手法よりも拡張性が高く,他の学習手法よりも優れていた。 One practical requirement in solving dynamic games is to ensure that the players play well from any decision point onward. To satisfy this requirement, existing efforts focus on equilibrium refinement, but the scalability and applicability of existing techniques are limited. In this paper, we propose Temporal-Induced Self-Play (TISP), a novel reinforcement learning-based framework to find strategies with decent performances from any decision point onward. TISP uses belief-space representation, backward induction, policy learning, and non-parametric approximation. Building upon TISP, we design a policy-gradient-based algorithm TISP-PG. We prove that TISP-based algorithms can find approximate Perfect Bayesian Equilibrium in zero-sum one-sided stochastic Bayesian games with finite horizon. We test TISP-based algorithms in various games, including finitely repeated security games and a grid-world game. The results show that TISP-PG is more scalable than existing mathematical programming-based methods and significantly outperforms other learning-based methods.	翻訳日:2021-08-24 15:21:02 公開日:2021-08-21
# 欠損環境データの補完手法に関する計算的研究 A computational study on imputation methods for missing environmental data ( http://arxiv.org/abs/2108.09500v1 ) ライセンス: Link先を確認	Paul Dixneuf and Fausto Errico and Mathias Glaus	(参考訳) データベース形式でのデータ取得と記録は日常的な操作である。しかし、データ収集のプロセスは、不規則な状況に陥り、データが欠損したデータベースが発生する可能性がある。欠損エントリは分析効率を変化させ、その結果、関連する意思決定プロセスを変化させる。本稿では,自然環境に関する情報を収集するデータベースに焦点を当てる。記録された活動の幅広いスペクトルを考えると、これらのデータベースは典型的に混在している。したがって、この特性を考慮した欠損データ処理手法の性能を評価することは重要である。本稿では,いくつかの欠損データ補完手法の性能と,環境分野における欠損データ問題への応用について検討する。missForest(MF)法を、連鎖方程式による多変量補完(MICE)とk近傍法(KNN)という2つの補完手法と比較するための計算的研究を行った。さまざまなタイプの10の事前処理データセットでテストが行われた。その結果,MF の補完誤差は MICE と KNN より概ね優れており,混合型データベースでは性能差がより顕著で,MF は他の手法と比較して補完誤差を最大 150% 削減した。通常、KNNは最速の方法であった。 MFはケベックの排水処理プラントのパフォーマンスモニタリングのケーススタディにうまく適用された。本研究は, 欠損環境データに対処する上で, MFを補完法として用いることの意義を示すものである。 Data acquisition and recording in the form of databases are routine operations. The process of collecting data, however, may experience irregularities, resulting in databases with missing data. Missing entries might alter analysis efficiency and, consequently, the associated decision-making process. This paper focuses on databases collecting information related to the natural environment. Given the broad spectrum of recorded activities, these databases typically are of mixed nature. It is therefore relevant to evaluate the performance of missing data processing methods considering this characteristic. In this paper we investigate the performances of several missing data imputation methods and their application to the problem of missing data in environment. A computational study was performed to compare the method missForest (MF) with two other imputation methods, namely Multivariate Imputation by Chained Equations (MICE) and K-Nearest Neighbors (KNN). Tests were made on 10 pretreated datasets of various types. 
Results revealed that MF generally outperformed MICE and KNN in terms of imputation errors, with a more pronounced performance gap for mixed-type databases, where MF reduced the imputation error by up to 150% compared to the other methods. KNN was usually the fastest method. MF was then successfully applied to a case study on performance monitoring of Quebec wastewater treatment plants. We believe that the present study demonstrates the pertinence of using MF as an imputation method when dealing with missing environmental data.	翻訳日:2021-08-24 15:20:46 公開日:2021-08-21
# ソフトウェア工学における用語相互関係と動向 Term Interrelations and Trends in Software Engineering ( http://arxiv.org/abs/2108.09529v1 ) ライセンス: Link先を確認	Janusan Baskararajah and Lei Zhang and Andriy Miranskyy	(参考訳) ソフトウェアエンジニアリング(SE)コミュニティは多作であり、専門家が新しい論文の洪水に追随することも、初心者がこの分野に参入することも困難にしている。そこで我々は,SEコミュニティのテキストコーパスから用語とその相互関係を抽出し,用語の傾向を示すツールがコミュニティの役に立つと考えている。本稿では,単語埋め込み技術を用いたプロトタイプツールを構築する。我々は、SE Body of Knowledgeハンドブックと15,233の研究論文のタイトルと要約で埋め込みを訓練する。また、埋め込みのトレーニングの検証に必要なテストケースを作成する。本稿では,埋め込みが用語の要約や知識ベースの動向を明らかにするのに役立つことを示す代表的な例を示す。 The Software Engineering (SE) community is prolific, making it challenging for experts to keep up with the flood of new papers and for neophytes to enter the field. Therefore, we posit that the community may benefit from a tool extracting terms and their interrelations from the SE community's text corpus and showing terms' trends. In this paper, we build a prototyping tool using the word embedding technique. We train the embeddings on the SE Body of Knowledge handbook and 15,233 research papers' titles and abstracts. We also create test cases necessary for validation of the training of the embeddings. We provide representative examples showing that the embeddings may aid in summarizing terms and uncovering trends in the knowledge base.	翻訳日:2021-08-24 15:20:26 公開日:2021-08-21
# 時空間データ調音のための成長変換力学系の利用 Using growth transform dynamical systems for spatio-temporal data sonification ( http://arxiv.org/abs/2108.09537v1 ) ライセンス: Link先を確認	Oindrila Chatterjee, Shantanu Chakrabartty	(参考訳) 有意義な音声シグネチャで情報を符号化するソニフィケーションは、人間のループ内決定のための従来の可視化手法の強化や置き換えにいくつかの利点がある。文献で報告されている標準的な音素化手法は、(i)変数のサブセットのみを使用するか、(ii)データ上の学習タスクを最初に解決し、次いで、エンドユーザーが決定するために使用する音声波形に出力をマッピングする。本稿では, 複合成長変換力学系モデルを用いて, 学習(あるいはより一般的には最適化)と音化過程を統合した, 高次元データを音化するための新しい枠組みを提案する。本アルゴリズムは,学習課題や予測課題の根底にあるデータと最適化パラメータを入力として,ユーザが定義する心理音響パラメータと組み合わせる。その結果、高次元データの統計特性を符号化するだけでなく、最適化・学習プロセスの基盤となる複雑さを明らかにするバイノーラル音声シグネチャを出力する。合成データセットを用いた広範囲な実験とともに、小児のてんかん発作を検出する可能性を持つ脳波解析(eeg)の枠組みを実証する。 Sonification, or encoding information in meaningful audio signatures, has several advantages in augmenting or replacing traditional visualization methods for human-in-the-loop decision-making. Standard sonification methods reported in the literature involve either (i) using only a subset of the variables, or (ii) first solving a learning task on the data and then mapping the output to an audio waveform, which is utilized by the end-user to make a decision. This paper presents a novel framework for sonifying high-dimensional data using a complex growth transform dynamical system model where both the learning (or, more generally, optimization) and the sonification processes are integrated together. Our algorithm takes as input the data and optimization parameters underlying the learning or prediction task and combines it with the psychoacoustic parameters defined by the user. As a result, the proposed framework outputs binaural audio signatures that not only encode some statistical properties of the high-dimensional data but also reveal the underlying complexity of the optimization/learning process. Along with extensive experiments using synthetic datasets, we demonstrate the framework on sonifying Electro-encephalogram (EEG) data with the potential for detecting epileptic seizures in pediatric patients.	
翻訳日:2021-08-24 15:20:14 公開日:2021-08-21
# 多様な動作予測のための滑らかなポーズ列の生成 Generating Smooth Pose Sequences for Diverse Human Motion Prediction ( http://arxiv.org/abs/2108.08422v2 ) ライセンス: Link先を確認	Wei Mao, Miaomiao Liu, Mathieu Salzmann	(参考訳) 確率的動き予測の最近の進歩、すなわち、1つの過去のポーズシーケンスが与えられた複数の将来の人間の動きを予測することは、非常に多様な将来の動きを生み出し、いくつかの身体部分の運動を制御することさえもたらした。しかし、これを実現するためには、多様性のためのいくつかのマッピングと、制御可能な動き予測のための専用モデルを学ぶ必要がある。本稿では,多様かつ制御可能な動き予測のための統合型深層生成ネットワークを提案する。この目的のために、現実的な人間の動きは有効なポーズの滑らかなシーケンスで構成されており、限られたデータを考えると、ポーズの事前分布の学習は動きの事前分布の学習よりもずっと扱いやすいという直観を活用する。そこで我々は,各部位の動作を逐次予測するジェネレータを設計し,動作リアリズムを実現するために,関節角度の損失とともに正規化フローベースのポーズ事前分布を導入する。Human3.6MとHumanEva-Iの2つの標準ベンチマークデータセットでの実験により,サンプルの多様性と精度の両面で,我々のアプローチが最先端のベースラインより優れていることを示す。コードは https://github.com/wei-mao-2019/gsps で入手できる。 Recent progress in stochastic motion prediction, i.e., predicting multiple possible future human motions given a single past pose sequence, has led to producing truly diverse future motions and even providing control over the motion of some body parts. However, to achieve this, the state-of-the-art method requires learning several mappings for diversity and a dedicated model for controllable motion prediction. In this paper, we introduce a unified deep generative network for both diverse and controllable motion prediction. To this end, we leverage the intuition that realistic human motions consist of smooth sequences of valid poses, and that, given limited data, learning a pose prior is much more tractable than a motion one. We therefore design a generator that predicts the motion of different body parts sequentially, and introduce a normalizing flow based pose prior, together with a joint angle loss, to achieve motion realism. Our experiments on two standard benchmark datasets, Human3.6M and HumanEva-I, demonstrate that our approach outperforms the state-of-the-art baselines in terms of both sample diversity and accuracy. The code is available at https://github.com/wei-mao-2019/gsps	翻訳日:2021-08-24 11:29:39 公開日:2021-08-21
# 知識グラフを用いた質問応答のためのトップk演算子を用いた効率的な文脈化 Efficient Contextualization using Top-k Operators for Question Answering over Knowledge Graphs ( http://arxiv.org/abs/2108.08597v2 ) ライセンス: Link先を確認	Philipp Christmann, Rishiraj Saha Roy, Gerhard Weikum	(参考訳) 知識ベース(KB-QA)に関する複雑な疑問に答えるには、数百万のエンティティと数千の述語を含む何十億もの事実を含む膨大な入力データに直面する。効率性のために、QAシステムはまず、すべての回答と関連する手がかりを含む可能性のある事実の集合を特定することによって、回答検索空間を縮小する。最も一般的なテクニックは、名前付きエンティティ曖昧化(NED)システムを問題に適用し、曖昧なエンティティに対してKB事実を検索することである。本研究は,KB対応信号を用いて検索空間の無関係な部分を抽出する効率的なECQAを提案する。 ECQAは、語彙マッチング、質問への関連性、候補項目間のコヒーレンス、KBグラフの接続性といった信号を組み合わせたKB項目のスコア順リスト上のトップkクエリ処理に基づいている。最近の2つのQAベンチマークによる実験は、解答の有無、検索空間のサイズ、ランタイムに関して、最先端のベースラインよりもECQAの方が優れていることを示している。 Answering complex questions over knowledge bases (KB-QA) faces huge input data with billions of facts, involving millions of entities and thousands of predicates. For efficiency, QA systems first reduce the answer search space by identifying a set of facts that is likely to contain all answers and relevant cues. The most common technique is to apply named entity disambiguation (NED) systems to the question, and retrieve KB facts for the disambiguated entities. This work presents ECQA, an efficient method that prunes irrelevant parts of the search space using KB-aware signals. ECQA is based on top-k query processing over score-ordered lists of KB items that combine signals about lexical matching, relevance to the question, coherence among candidate items, and connectivity in the KB graph. Experiments with two recent QA benchmarks demonstrate the superiority of ECQA over state-of-the-art baselines with respect to answer presence, size of the search space, and runtimes.	翻訳日:2021-08-24 11:29:18 公開日:2021-08-21
# マルチセンターフェデレーションラーニング Multi-Center Federated Learning ( http://arxiv.org/abs/2108.08647v2 ) ライセンス: Link先を確認	Ming Xie, Guodong Long, Tao Shen, Tianyi Zhou, Xianzhi Wang, Jing Jiang, Chengqi Zhang	(参考訳) フェデレーション学習(federated learning, FL)は、分散学習におけるデータのプライバシを保護する。しかし、FLは実用的な設定、例えば異なるユーザに対する非IIDデータなどにおいて一般的に見られる異質性の存在下では脆弱である。既存のFLアプローチは通常、1つのグローバルモデルを更新して、データ分散間の不一致に関わらず、勾配を集約することで、すべてのユーザの共有知識をキャプチャする。対照的に、複数のグローバルモデルの混合は、FLの異なるグローバルモデル(すなわちセンター)にユーザーを割り当てる場合、様々なユーザー間の不均一性を捉えることができる。そこで本研究では,新しいマルチセンター集約機構を提案する。データから複数のグローバルモデルを学び、同時にユーザーとセンターの最適なマッチングを導き出す。次に、確率的期待値最大化(EM)アルゴリズムにより効率よく解ける二段階最適化問題として定式化する。 FLの複数のベンチマークデータセットに対する実験により,本手法はいくつかのFL競合より優れていることが示された。ソースコードはGitHubで公開されている。 Federated learning (FL) can protect data privacy in distributed learning since it merely collects local gradients from users without access to their data. However, FL is fragile in the presence of heterogeneity that is commonly encountered in practical settings, e.g., non-IID data over different users. Existing FL approaches usually update a single global model to capture the shared knowledge of all users by aggregating their gradients, regardless of the discrepancy between their data distributions. By comparison, a mixture of multiple global models could capture the heterogeneity across various users if assigning the users to different global models (i.e., centers) in FL. To this end, we propose a novel multi-center aggregation mechanism. It learns multiple global models from data, and simultaneously derives the optimal matching between users and centers. We then formulate it as a bi-level optimization problem that can be efficiently solved by a stochastic expectation maximization (EM) algorithm. Experiments on multiple benchmark datasets of FL show that our method outperforms several popular FL competitors. The source code is available on GitHub.	翻訳日:2021-08-24 11:28:37 公開日:2021-08-21
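
The Multi-Center Federated Learning entry above describes learning several global models (centers) while matching users to centers. A minimal sketch of that alternating idea follows, under loud assumptions: it uses a hard, k-means-style assignment by squared L2 distance between flattened model vectors instead of the stochastic EM the abstract mentions, it omits local training entirely, and every name in it (`assign_users`, `update_centers`, `multi_center_aggregate`) is hypothetical rather than taken from the authors' code.

```python
def assign_users(user_models, centers):
    """Assignment step: index of the nearest center (squared L2) for each user."""
    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return [min(range(len(centers)), key=lambda k: sq_dist(w, centers[k]))
            for w in user_models]


def update_centers(user_models, assignment, centers):
    """Update step: each center becomes the mean of the models assigned to it."""
    new_centers = []
    for k, old in enumerate(centers):
        members = [w for w, a in zip(user_models, assignment) if a == k]
        if not members:
            # An empty center stays where it was (an illustrative choice).
            new_centers.append(list(old))
            continue
        new_centers.append([sum(w[d] for w in members) / len(members)
                            for d in range(len(old))])
    return new_centers


def multi_center_aggregate(user_models, centers, rounds=10):
    """Alternate assignment and averaging for a fixed number of rounds."""
    for _ in range(rounds):
        assignment = assign_users(user_models, centers)
        centers = update_centers(user_models, assignment, centers)
    return centers, assign_users(user_models, centers)


if __name__ == "__main__":
    users = [[0.0, 0.1], [0.1, 0.0], [5.0, 5.1], [5.1, 5.0]]
    centers, assignment = multi_center_aggregate(users, [[0.0, 0.0], [5.0, 5.0]])
    print(assignment, centers)
```

In a real federated setting each user model would come from local training between rounds; here the vectors are fixed so that only the matching and aggregation steps are visible.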

# エラー軽減のための最適資源コスト

Optimal resource cost for error mitigation ( http://arxiv.org/abs/2006.12509v3 )

ライセンス: Link先を確認

Ryuji Takagi

(参考訳) 短期量子デバイスの中心的な問題の1つは、その究極のポテンシャルと限界を理解することである。本稿では,確率的エラーキャンセル法の最適資源コストを定式化できる,短期機器の完全な表現性を考慮したフレームワークを導入することで,量子エラー軽減の観点からこの問題に対処する。デバイスが実装可能なノイズのある操作に関して定義された資源理論的な量化子と接続することにより、最適なコストを評価するための一般的な方法を提案する。提案手法を用いて一般的なクラスのノイズを緩和するための最適コストを推定し,従来の評価よりも汎用的に有利な実現可能なコストと,多種多様な実装可能なノイズ操作に適用可能な基本下限を得る。いくつかのノイズモデルについて境界を改善し、脱分極ノイズと位相緩和ノイズに対する厳密な最適コストを与え、オーバーヘッドコストを正確に特徴付けるとともに、エラー軽減の観点から資源測度に操作的意味を与える。特に我々の結果は、Temmeら [K. Temme, S. Bravyi, and J. M. Gambetta, Phys. Rev. Lett. 119, 180509 (2017)] によるヒューリスティックなアプローチが我々の拡張フレームワークにおいても最適であることを示唆しており、このノイズモデルについて、近距離デバイスに固有の追加自由度によって得られる利点に根本的な制限を課す。

One of the central problems for near-term quantum devices is to understand their ultimate potential and limitations. We address this problem in terms of quantum error mitigation by introducing a framework taking into account the full expressibility of near-term devices, in which the optimal resource cost for the probabilistic error cancellation method can be formalized. We provide a general methodology for evaluating the optimal cost by connecting it to a resource-theoretic quantifier defined with respect to the noisy operations that devices can implement. We employ our methods to estimate the optimal cost in mitigating a general class of noise, where we obtain an achievable cost that has a generic advantage over previous evaluations, as well as a fundamental lower bound applicable to a broad class of noisy implementable operations. We improve our bounds for several noise models, where we give the exact optimal costs for the depolarizing and dephasing noise, precisely characterizing the overhead cost while offering an operational meaning to the resource measure in terms of error mitigation. Our result particularly implies that the heuristic approach presented by Temme et al. [K. Temme, S. Bravyi, and J. M. Gambetta, Phys. Rev. Lett. 119, 180509 (2017)] is optimal even in our extended framework, putting fundamental limitations on the advantage provided by the extra degrees of freedom inherent in near-term devices for this noise model.

翻訳日:2023-05-13 04:51:42 公開日:2021-08-21
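
The cost that the abstract above refers to can be illustrated in the simplest textbook setting of probabilistic error cancellation: the inverse of a noise channel is written as a signed (quasi-probability) mixture of operations, and the cost is the sum of the absolute weights. The sketch below is only that illustration, not the paper's optimization over all implementable noisy operations. Its assumptions: a single qubit, a Pauli noise channel, ideal Pauli operations as the only basis, and the parametrization ρ → (1-ε)ρ + ε·I/2 for depolarizing noise; all function names are hypothetical.

```python
# Rows/columns ordered I, X, Y, Z. Entry is +1 if the two Paulis commute, -1 otherwise.
SIGNS = [
    [1, 1, 1, 1],
    [1, 1, -1, -1],
    [1, -1, 1, -1],
    [1, -1, -1, 1],
]


def inverse_quasiprobs(probs):
    """Weights (I, X, Y, Z) of the inverse of a single-qubit Pauli channel.

    `probs` are the probabilities of applying I, X, Y, Z. The channel must be
    invertible, i.e. none of its Pauli eigenvalues may be zero.
    """
    # Eigenvalue of the channel on each Pauli operator.
    eigvals = [sum(SIGNS[q][p] * probs[p] for p in range(4)) for q in range(4)]
    # The inverse map has reciprocal eigenvalues; SIGNS * SIGNS = 4 * identity,
    # so the same sign matrix (divided by 4) transforms back to weights.
    return [sum(SIGNS[p][q] / eigvals[q] for q in range(4)) / 4 for p in range(4)]


def pec_cost(probs):
    """Sum of absolute quasi-probabilities for the Pauli-only decomposition."""
    return sum(abs(q) for q in inverse_quasiprobs(probs))


def depolarizing(eps):
    """Pauli probabilities of rho -> (1 - eps) * rho + eps * I / 2 (assumed form)."""
    return [1 - 3 * eps / 4, eps / 4, eps / 4, eps / 4]


if __name__ == "__main__":
    eps = 0.1
    print(inverse_quasiprobs(depolarizing(eps)))
    print(pec_cost(depolarizing(eps)), (1 + eps / 2) / (1 - eps))
```

For this parametrization the Pauli-only cost works out algebraically to (1 + ε/2)/(1 - ε), which the last line prints alongside the numerical value. In probabilistic error cancellation the number of samples needed for a fixed precision grows roughly with the square of this cost per noisy gate, which is why even small reductions matter.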

# テンソルネットワークを用いた量子回路表現のための決定図

A Tensor Network based Decision Diagram for Representation of Quantum Circuits ( http://arxiv.org/abs/2009.02618v2 )

ライセンス: Link先を確認

Xin Hong, Xiangzhen Zhou, Sanjiang Li, Yuan Feng, Mingsheng Ying

(参考訳) テンソルネットワークは、何十年もの間量子物理系のシミュレーションにうまく応用されてきた。近年、量子コンピューティング、特にランダム量子回路の古典的なシミュレーションにも採用されている。本稿では、テンソルネットワークのより原理的で便利な応用のために、TDD(Tensor Decision Diagram)と呼ばれる決定図式データ構造を提案する。この新しいデータ構造は、量子回路のコンパクトで標準的な表現を提供する。回路分割を利用することにより、量子回路のTDDを効率的に計算することができる。さらに、アプリケーションに不可欠なテンソルネットワーク(加算や収縮など)の操作もTDDで効率的に実装できることを示す。 TDDの概念実証実装を示し、その効率をベンチマーク量子回路のセットで評価する。 TDDは、等価性チェック、エラー検出、合成、シミュレーション、検証など、量子回路に関連するさまざまな設計自動化タスクにおいて重要な役割を果たすことが期待されている。

Tensor networks have been successfully applied in simulation of quantum physical systems for decades. Recently, they have also been employed in classical simulation of quantum computing, in particular, random quantum circuits. This paper proposes a decision diagram style data structure, called TDD (Tensor Decision Diagram), for more principled and convenient applications of tensor networks. This new data structure provides a compact and canonical representation for quantum circuits. By exploiting circuit partition, the TDD of a quantum circuit can be computed efficiently. Furthermore, we show that the operations of tensor networks essential in their applications (e.g., addition and contraction), can also be implemented efficiently in TDDs. A proof-of-concept implementation of TDDs is presented and its efficiency is evaluated on a set of benchmark quantum circuits. It is expected that TDDs will play an important role in various design automation tasks related to quantum circuits, including but not limited to equivalence checking, error detection, synthesis, simulation, and verification.

翻訳日:2023-05-03 11:33:23 公開日:2021-08-21
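
The contraction operation mentioned in the abstract above can be seen on a two-gate circuit. The sketch below contracts a tiny tensor network (a Hadamard followed by a CNOT, applied to |00>) by summing over the one shared index. It uses plain nested lists and a function as tensors; it is not a decision diagram and says nothing about how TDDs store or share nodes. The names `cnot` and `contract_bell_circuit` are illustrative.

```python
import math
from itertools import product

S = 1 / math.sqrt(2)
# Hadamard as a rank-2 tensor, indexed H[out][in].
H = [[S, S], [S, -S]]


def cnot(out_c, out_t, in_c, in_t):
    """CNOT as a rank-4 tensor: the control passes through, the target is XORed."""
    return 1.0 if (out_c == in_c and out_t == (in_t ^ in_c)) else 0.0


def contract_bell_circuit():
    """Contract |00> -> H on qubit 0 -> CNOT(0, 1) into output amplitudes."""
    amplitudes = {}
    for out_c, out_t in product((0, 1), repeat=2):
        total = 0.0
        # `m` is the internal index joining H's output to the CNOT control input.
        for m in (0, 1):
            # The input state is |00>, so H's input index and the CNOT target
            # input are both fixed to 0.
            total += cnot(out_c, out_t, m, 0) * H[m][0]
        amplitudes[(out_c, out_t)] = total
    return amplitudes


if __name__ == "__main__":
    for bits, amp in contract_bell_circuit().items():
        print(bits, amp)
```

Only the |00> and |11> amplitudes come out nonzero, as expected for a Bell state. A decision-diagram representation would aim to store such a tensor compactly by sharing repeated sub-structures instead of enumerating every index combination as this loop does.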

# 対称性としての量子参照フレーム変換と第三粒子のパラドックス

Quantum reference frame transformations as symmetries and the paradox of the third particle ( http://arxiv.org/abs/2011.01951v2 )

ライセンス: Link先を確認

Marius Krumm, Philipp A. Hoehn, Markus P. Mueller

(参考訳) 量子の世界では、参照フレームは究極的には量子システムでもある。しかし「量子粒子の視点に飛び込む」とは何を意味するのだろうか。本研究では,単純な物理系の対称性として,量子参照フレーム(QRF)変換が自然に現れることを示す。これにより、既知のQRF変換を、運用上透過的な別のフレームワーク内で再導出して一般化し、その構造と解釈に新たな光を当てることができる。このような量子対称性に制約されたエージェントによって測定可能な可観測量の明示的な記述を与え、その結果を「第3粒子のパラドックス」と呼ばれるパズルに適用する。このパラドックスは、より少ない粒子をより多くの粒子に関係的にどのように埋め込むかという問題に還元できると論じ、この問題の物理的および代数的解析を徹底的に行う。これにより、パラドックスを解決すると考えられる部分トレースの一般化(「関係的トレース」)が得られ、この解決において鍵となる関係的可観測量のような、制約量子化の重要な構造が単純な量子情報の設定の中で明らかになる。我々は、透明性と数学的厳密性のために有限アーベル群に注意を限定するが、結果の直感的な物理的魅力から、それらがより一般的な状況においても有効であると期待する。

In a quantum world, reference frames are ultimately quantum systems too -- but what does it mean to "jump into the perspective of a quantum particle"? In this work, we show that quantum reference frame (QRF) transformations appear naturally as symmetries of simple physical systems. This allows us to rederive and generalize known QRF transformations within an alternative, operationally transparent framework, and to shed new light on their structure and interpretation. We give an explicit description of the observables that are measurable by agents constrained by such quantum symmetries, and apply our results to a puzzle known as the `paradox of the third particle'. We argue that it can be reduced to the question of how to relationally embed fewer into more particles, and give a thorough physical and algebraic analysis of this question. This leads us to a generalization of the partial trace (`relational trace') which arguably resolves the paradox, and it uncovers important structures of constraint quantization within a simple quantum information setting, such as relational observables which are key in this resolution. While we restrict our attention to finite Abelian groups for transparency and mathematical rigor, the intuitive physical appeal of our results makes us expect that they remain valid in more general situations.

翻訳日:2023-04-25 11:39:38 公開日:2021-08-21

# 量子変分法を用いた量子機械学習のLHCにおける高エネルギー物理解析への応用 : IBM量子コンピュータシミュレータと10量子ビットハードウェア

Application of Quantum Machine Learning using the Quantum Variational Classifier Method to High Energy Physics Analysis at the LHC on IBM Quantum Computer Simulator and Hardware with 10 qubits ( http://arxiv.org/abs/2012.11560v2 )

ライセンス: Link先を確認

Sau Lan Wu, Jay Chan, Wen Guan, Shaojun Sun, Alex Wang, Chen Zhou, Miron Livny, Federico Carminati, Alberto Di Meglio, Andy C. Y. Li, Joseph Lykken, Panagiotis Spentzouris, Samuel Yen-Chi Chen, Shinjae Yoo and Tzu-Chieh Wei

(参考訳) lhcにおける実験プログラムの主な目的の1つは、新しい物理学の発見である。これは膨大な背景にある稀な信号の同定を必要とする。機械学習アルゴリズムを使用することで、この目的を達成する能力が大幅に向上します。量子技術の進歩により、量子機械学習は高エネルギー物理学におけるデータ分析の強力なツールとなりうる。本研究は,ibmのゲートモデル量子コンピューティングシステムを用いて,最近のlhcフラッグシップ物理解析において,量子変分分類法を適用した。 $t\bar{t}h$ (トップクォーク対に関連付けてボソン生成) と $h\rightarrow\mu^{+}\mu^{-}$ (ヒッグスボゾンが2つのミューオンに崩壊し,ヒッグスボソンカップリングを第2世代のフェルミオンに推定する) である。我々は、IBM量子シミュレータとIBM量子ハードウェアの10量子ビットによる初期の結果を得た。量子シミュレータ上の100の事象の小さなトレーニングサンプルを用いて、量子変分分類法は、LHC物理解析によく用いられるSVM(サポートベクトルマシン)やBDT(ブースト決定木)のような古典的なアルゴリズムと同様に動作する。量子ハードウェアでは、量子変分分類法が量子シミュレータに匹敵する有望な識別力を示している。この研究は、量子機械学習が現実的な物理データセットの信号と背景を区別できることを示した。我々は、将来の高輝度LHC物理解析における量子機械学習の利用を予測し、ヒッグス粒子自己結合の測定や暗黒物質の探索を含む。

One of the major objectives of the experimental programs at the LHC is the discovery of new physics. This requires the identification of rare signals in immense backgrounds. Using machine learning algorithms greatly enhances our ability to achieve this objective. With the progress of quantum technologies, quantum machine learning could become a powerful tool for data analysis in high energy physics. In this study, using IBM gate-model quantum computing systems, we employ the quantum variational classifier method in two recent LHC flagship physics analyses: $t\bar{t}H$ (Higgs boson production in association with a top quark pair) and $H\rightarrow\mu^{+}\mu^{-}$ (Higgs boson decays to two muons, probing the Higgs boson couplings to second-generation fermions). We have obtained early results with 10 qubits on the IBM quantum simulator and the IBM quantum hardware. With small training samples of 100 events on the quantum simulator, the quantum variational classifier method performs similarly to classical algorithms such as SVM (support vector machine) and BDT (boosted decision tree), which are often employed in LHC physics analyses. On the quantum hardware, the quantum variational classifier method has shown promising discrimination power, comparable to that on the quantum simulator. This study demonstrates that quantum machine learning has the ability to differentiate between signal and background in realistic physics datasets. We foresee the usage of quantum machine learning in future high-luminosity LHC physics analyses, including measurements of the Higgs boson self-couplings and searches for dark matter.

翻訳日:2023-04-20 00:19:10 公開日:2021-08-21

# 密度作用素と可積分ウィグナー分布を持つ混合量子状態のクラスを定義する統計的アンサンブルの不統一性について

On the Non-Uniqueness of Statistical Ensembles Defining a Density Operator and a Class of Mixed Quantum States with Integrable Wigner Distribution ( http://arxiv.org/abs/2103.05605v4 )

ライセンス: Link先を確認

Charlyne de Gosson and Maurice de Gosson

(参考訳) 正方積分可能な関数からなる混合量子状態のウィグナー分布は準確率分布であり、その積分は1であり、限界特性が満たされていると仮定するのは標準である。しかし、一般にはそうではない。この性質を満たす量子状態のクラスを導入し、これらの状態は1980年代にH. Feichtingerによって導入された函数空間(変調空間)のクラスで定義されるため、"Feichtinger state"と呼ばれる。これらの状態の性質を研究し、密度演算子を生成する統計アンサンブルの非特異性について、ジェインズの結果の一般の場合の拡張を証明する機会を与える。ボーナスとして、ウィグナー変換の凸和の結果を得る。

It is standard to assume that the Wigner distribution of a mixed quantum state consisting of square-integrable functions is a quasi-probability distribution, that is, that its integral is one and that the marginal properties are satisfied. However, this is not true in general. We introduce a class of quantum states for which this property is satisfied; these states are dubbed "Feichtinger states" because they are defined in terms of a class of function spaces (modulation spaces) introduced in the 1980s by H. Feichtinger. The properties of these states are studied, which gives us the opportunity to prove an extension to the general case of a result of Jaynes on the non-uniqueness of the statistical ensemble generating a density operator. As a bonus, we obtain a result for convex sums of Wigner transforms.

翻訳日:2023-04-08 15:52:22 公開日:2021-08-21

# 量子グラフ上のインスタントンとベリーの接続

Instantons and Berry's connections on quantum graph ( http://arxiv.org/abs/2104.02311v2 )

ライセンス: Link先を確認

Tomonori Inoue, Makoto Sakamoto, Inori Ueba

(参考訳) 本稿では,量子グラフ上のディラック零モードの境界条件のパラメータ空間における非可換ベリー接続について検討する。本稿では,Yang-Mills のインスタントソリューション構築手法である ADHM の構成をベリー接続に適用する。そして、instantonの設定がberryの接続として現れることがわかりました。

In this paper, we study non-Abelian Berry's connections in the parameter space of boundary conditions for Dirac zero modes on quantum graphs. We apply the ADHM construction, a method for constructing Yang-Mills instanton solutions, to these Berry's connections. We then find that instanton configurations appear as Berry's connections.

翻訳日:2023-04-05 06:29:22 公開日:2021-08-21

# フィボナッチアロンによる決定論的量子ワンタイムパッド

Deterministic quantum one-time pad via Fibonacci anyons ( http://arxiv.org/abs/2104.05911v2 )

ライセンス: Link先を確認

Cheng-Qian Xu, D. L. Zhou

(参考訳) トポロジカルに堅牢な任意の状態はヒルベルト空間の特異な構造に由来するが、量子コンピューティングや量子通信において重要な応用がある。決定論的量子ワンタイムパッド(DQOTP)の情報担体としてアノニック状態が用いられると、真空から生成するフィボナッチ粒子-反粒子対は、$2\log_2 d_{\tau}$ ビットの古典情報($d_{\tau}$ はフィボナッチエニオン $\tau$ の量子次元)を漸近的に送ることができる。さらに,自明な総電荷を持つ6個のフィボナッチ・エノンのパラメータ化状態を介してDQOTPを解析することにより,異なるパラメータに対して送信可能な最大メッセージ数の解析結果を与える。 DQOTPが送信するメッセージの最大数に関する結果は、任意のアクセス可能な情報によって説明できる。

Anyonic states, whose topological robustness originates from the peculiar structure of their Hilbert space, have important applications in quantum computing and quantum communication. When an anyonic state is used as an information carrier of the deterministic quantum one-time pad (DQOTP), we find that the Fibonacci particle-antiparticle pair produced from vacuum can be used to asymptotically send $2\log_2 d_{\tau}$ bits of classical information ($d_{\tau}$ is the quantum dimension of a Fibonacci anyon $\tau$), which equals the anyonic mutual information of the pair. Furthermore, by studying the DQOTP via a parameterized state of six Fibonacci anyons with trivial total charge, we give analytical results for the maximum number of messages that can be sent for different parameters, which is a step function with every step corresponding to a regular simplex from the viewpoint of geometry. The results for the maximum number of messages sent by the DQOTP can be explained by the anyonic accessible information.

翻訳日:2023-04-03 23:49:34 公開日:2021-08-21

# 閉じ込められた分極子の分極と量子相関の相互作用

Interplay between polarization and quantum correlations of confined polaritons ( http://arxiv.org/abs/2104.13541v2 )

ライセンス: Link先を確認

Olivier Bleu, Jesper Levinsen, Meera M. Parish

(参考訳) 低駆動領域におけるコヒーレント駆動箱空洞内の偏光子量子相関について検討し,特に偏光自由度について考察した。共振器と交叉偏光器の相互作用強度が異なる可能性や、現実的な線形偏光分割により、自振器とクロスカーライクな非線形性を持つ2つの結合非線形共振器としてシステムをモデル化できるため、他の実験プラットフォームに関係がある可能性がある。実効的な波動関数法では, 定常偏光分解型偏光子数と二階相関関数の解析式を求め, リンドブラッドマスター方程式から得られた数値結果とよく一致した。特に, 励起分極(円または直線)によっては, 従来型(干渉型)と従来型(非線形型)の両方の反束が単一の空洞内で調べられることを強調する。さらに本研究は, ファイバキャビティポラリトンを閉じ込めた最近の実験により, ポラリトンフェシバッハ共鳴の特徴であるクロスポーラライズポラリトン間の相互作用が支配的な構造である可能性が示唆された。さらに, 2チャネルモデルを用いて共鳴に近い状態について検討し, 強いポラリトンアンチバンチングを実現するための基盤として, 原子状薄膜半導体など, バイエクシトン結合エネルギーの大きいシステムが有望であることを示す。

We investigate polariton quantum correlations in a coherently driven box cavity in the low driving regime, with a particular focus on accounting for the polarization degree of freedom. The possibility of having different interaction strengths between co- and cross-circularly polarized polaritons as well as a realistic linear-polarization splitting allows one to model the system as two coupled nonlinear resonators with both self- and cross-Kerr-like nonlinearities, thus making our results potentially relevant for other experimental platforms. Within an effective wave-function approach, we obtain analytical expressions for the steady-state polarization-resolved polariton populations and second-order correlation functions, which agree very well with our numerical results obtained from a Lindblad master equation. Notably, we highlight that depending on the excitation polarization (circular or linear), both the unconventional (interference-mediated) and conventional (mediated by nonlinearities) antibunchings can be investigated in a single cavity. Moreover, using our results, we argue that recent experiments on confined fiber-cavity polaritons are likely to have probed a regime where the dominant interaction is between cross-polarized polaritons, which is characteristic of the polariton Feshbach resonance. We furthermore investigate the regime close to resonance using a two-channel model, and we show that systems with large biexciton binding energies, such as atomically thin semiconductors, are promising platforms for realizing strong polariton antibunching.

翻訳日:2023-04-02 04:44:31 公開日:2021-08-21

# ディラトンブラックホールの影響下でのガウス量子ステアリング

Gaussian quantum steering under the influence of a dilaton black hole ( http://arxiv.org/abs/2104.14738v2 )

ライセンス: Link先を確認

Biwei Hu, Cuihong Wen, Jieci Wang, Jiliang Jing

(参考訳) 我々はGarfinkle-Horowitz-Strominger ディラトンブラックホールの背景におけるガウス量子ステアリングのダイナミクスについて検討した。ディラトン場によって引き起こされる重力は、慣性観測者アリスと事象地平線外にホバリングする観測者ボブの間の量子ステアビリティを破壊し、因果非連結領域間のステアリング型量子相関を生成する。したがって、観測者は事象の地平線によって分離されたとしても、局所的な測定によってお互いの状態を制御することができる。ディラトン時空における量子エンタングルメントとは異なり、量子ステアリングは「沈む死」や「沈む誕生」のような破滅的な挙動を経験し、ディラトン電荷が増加する。さらに、ディラトン重力はガウスステアリングの対称性を破壊し、後者は常に拡張時空において非対称である。興味深いことに、最大ステアリング非対称性の達成はディラトン時空における2モードガウス状態の1方向と2方向ステアリングの臨界点を示している。

We study the dynamics of Gaussian quantum steering in the background of a Garfinkle-Horowitz-Strominger dilaton black hole. It is found that the gravity induced by the dilaton field destroys the quantum steerability between the inertial observer Alice and the observer Bob, who hovers outside the event horizon, while it generates steering-type quantum correlations between the causally disconnected regions. Therefore, the observers can steer each other's state by local measurements even though they are separated by the event horizon. Unlike quantum entanglement in the dilaton spacetime, the quantum steering experiences catastrophic behaviors such as "sudden death" and "sudden birth" with increasing dilaton charge. In addition, the dilaton gravity destroys the symmetry of Gaussian steering, and the latter is always asymmetric in the dilaton spacetime. Interestingly, the attainment of maximal steering asymmetry indicates the critical point between one-way and two-way steering for the two-mode Gaussian state in the dilaton spacetime.

翻訳日:2023-04-02 00:01:38 公開日:2021-08-21

# x線自由電子レーザーにおける真空複屈折

Vacuum birefringence at x-ray free-electron lasers ( http://arxiv.org/abs/2105.13869v2 )

ライセンス: Link先を確認

Felix Karbstein, Chantal Sundqvist, Kai S. Schulze, Ingo Uschmann, Holger Gies, Gerhard G. Paulus

(参考訳) x線自由電子レーザー(xfel)単独で量子電磁力学によって予測される真空複屈折現象の測定の展望について検討した。我々は、XFELビームを有限角度で衝突させる実験的なスキームを考案し、その効果のためにポンプとプローブ場の両方として機能する。真空複屈折のシグネチャは、高純度x線ポラリメトリーで検出される偏光偏向信号光子に符号化される。理想化されたシナリオに対する我々の発見は、XFELベースの装置のみの発見の可能性は、光高強度レーザーに匹敵する可能性があることを示している。現在達成可能なシナリオでは、所望のシグネチャの大きさに強い影響を与えるX線光学成分のいくつかの重要な詳細を特定する。

We study the prospects of measuring the phenomenon of vacuum birefringence predicted by quantum electrodynamics using an x-ray free-electron laser (XFEL) alone. We devise an experimental scheme allowing the XFEL beam to collide with itself at a finite angle, and thus act as both pump and probe field for the effect. The signature of vacuum birefringence is encoded in polarization-flipped signal photons to be detected with high-purity x-ray polarimetry. Our findings for idealized scenarios underline that the discovery potential of solely XFEL-based setups can be comparable to that of setups involving optical high-intensity lasers. For currently achievable scenarios, we identify several key details of the x-ray optical ingredients that exert a strong influence on the magnitude of the desired signatures.

翻訳日:2023-03-29 04:27:26 公開日:2021-08-21

# 分子ビームの赤外励起と光解離によるスピン偏光水素原子の生成

Macroscopic production of spin-polarized hydrogen atoms from the IR-excitation and photodissociation of molecular beams ( http://arxiv.org/abs/2108.06133v2 )

ライセンス: Link先を確認

C. S. Kannis and J. Suarez and T. P. Rakitzis

(参考訳) 本稿では, HBr, HI, ${\rm NH_{3}}$同位体のIR励起と光解離からスピン偏極H, D原子を生成する方法について述べる。我々は, 製造速度が${\rm 10^{21}\, photons\, s^{-1}}$ のirレーザー生産率に近づく程度, 従来の${\sim}{\rm 10^{17} \, s^{-1}}$ の生産速度が著しく超える可能性について考察した。

We describe methods for the production of spin-polarized H and D atoms from the IR-excitation and photodissociation of molecular beams of HBr, HI, and ${\rm NH_{3}}$ isotopes, including optical excitation schemes with partial hyperfine resolution. We discuss the extent to which the production rates may approach the IR-laser production rates of ${\rm 10^{21}\, photons\, s^{-1}}$, and how the production rates of conventional methods of ${\sim}{\rm 10^{17} \, s^{-1}}$ may be surpassed significantly.

翻訳日:2023-03-18 15:06:53 公開日:2021-08-21

# ランク付き拡散、デルタボースガスおよびバーガーズ方程式

Ranked diffusion, delta Bose gas and Burgers equation ( http://arxiv.org/abs/2108.09515v1 )

ライセンス: Link先を確認

Pierre Le Doussal

(参考訳) $N$粒子の拡散を,その階数に比例するドリフトによる1次元の相互作用で検討した。魅力的な場合(自己重力気体)では、リーブ・リニガー量子モデルへのマッピングにより、定常時間相関、戻り確率、定常状態への減衰率が得られる。ランクフィールドは、我々が解析したバーガーズ方程式に従う。これは、(反発の場合)外部ポテンシャル$V(x)$で大きめの$N$で定常密度を得ることを可能にする。魅力的な場合、定常状態への減衰速度は、その空間的減衰が十分に遅い場合の初期条件に依存することが分かる。クーロンガス法は、最終的な平衡を大きな$N$で研究することができる。

We study the diffusion of $N$ particles in one dimension interacting via a drift proportional to their rank. In the attractive case (self-gravitating gas), a mapping to the Lieb-Liniger quantum model allows one to obtain stationary time correlations, return probabilities and the decay rate to the stationary state. The rank field obeys a Burgers equation, which we analyze. It allows one to obtain the stationary density at large $N$ in an external potential $V(x)$ (in the repulsive case). In the attractive case, the decay rate to the steady state is found to depend on the initial condition if its spatial decay is slow enough. Coulomb gas methods allow one to study the final equilibrium at large $N$.

翻訳日:2023-03-17 21:10:15 公開日:2021-08-21

# 2つのrydberg-rydberg相互作用原子を有するキャビティqed系における光子反束

Photon antibunching in a cavity-QED system with two Rydberg-Rydberg interaction atoms ( http://arxiv.org/abs/2108.09470v1 )

ライセンス: Link先を確認

Tong Huang, Lei Tan

(参考訳) 本稿では,2つのRydberg-Rydberg相互作用原子と共役するキャビティ-QED系において,強い光子反バンチング効果を実現する方法を提案する。等時間2次相関関数g(2)(0)の計算により、非慣習的な光子封鎖と従来の光子封鎖が原子駆動のスキームに現れ、どちらもライドバーグ-リドバーグ相互作用の影響を強く受けていることがわかった。また, 適切なパラメータの下では, 従来の光子遮断と従来とは異なる光子遮断を組み合わせることで, 光子アンチバンチングと平均光子数を大幅に向上できることがわかった。キャビティ駆動方式では、rydberg-rydberg相互作用の存在は、非慣習的な光子封鎖機構の下で光子反束をひどく破壊する。これらの結果は、リドベルク原子空洞系における単一光子エミッタの実装を導くのに役立つ。

We propose a scheme for achieving strong photon antibunching in a cavity-QED system coupled to two atoms with Rydberg-Rydberg interaction. By calculating the equal-time second-order correlation function $g^{(2)}(0)$, we find that both the unconventional photon blockade and the conventional photon blockade appear in the atom-driven scheme, and that both are significantly affected by the Rydberg-Rydberg interaction. We also find that, under appropriate parameters, the photon antibunching and the mean photon number can be significantly enhanced by combining the conventional and the unconventional photon blockade. In the cavity-driven scheme, the Rydberg-Rydberg interaction severely degrades the photon antibunching under the unconventional photon blockade mechanism. These results will help to guide the implementation of single-photon emitters in Rydberg atom-cavity systems.

翻訳日:2023-03-17 21:10:04 公開日:2021-08-21
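
The abstract above characterizes antibunching through the equal-time second-order correlation function. As a minimal sketch of what that quantity measures, the snippet below evaluates $g^{(2)}(0) = \langle n(n-1)\rangle / \langle n\rangle^2$ from a single-mode photon-number distribution. The function names and the truncated Poisson helper are illustrative assumptions; this is not the cavity-QED model of the paper, which would require solving a master equation.

```python
import math


def g2_zero(photon_probs):
    """Equal-time second-order correlation g2(0) = <n(n-1)> / <n>^2.

    photon_probs[n] is the probability of finding n photons in the mode.
    """
    mean_n = sum(n * p for n, p in enumerate(photon_probs))
    if mean_n == 0:
        raise ValueError("g2(0) is undefined for the vacuum state")
    pair_term = sum(n * (n - 1) * p for n, p in enumerate(photon_probs))
    return pair_term / mean_n ** 2


def poisson(mean, cutoff=60):
    """Truncated Poisson distribution (photon statistics of a coherent state)."""
    return [math.exp(-mean) * mean ** n / math.factorial(n) for n in range(cutoff)]


# A single-photon state is perfectly antibunched, a coherent state is not.
single_photon = [0.0, 1.0]
print(g2_zero(single_photon))   # antibunched: g2(0) < 1
print(g2_zero(poisson(0.5)))    # coherent light: g2(0) close to 1
```

Values below 1 indicate antibunching, which is the regime the blockade mechanisms in the abstract aim for.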

# 近ゼロ指数材料におけるモーメントの考察

Momentum considerations inside near-zero index materials ( http://arxiv.org/abs/2108.09450v1 )

ライセンス: Link先を確認

Michaël Lobet and Iñigo Liberal and Larissa Vertchenko and Andrei Lavrinenko and Nader Engheta and Eric Mazur

(参考訳) nzi(near-zero-index)材料、すなわち0に近い位相屈折率を持つ材料は、光間相互作用を増強または阻害することが知られている。基本的な放射過程の理論的導出のほとんどはエネルギー的考察と詳細な平衡方程式に依存するが、運動量的考察には依存しない。運動量交換は理論モデルにも組み込む必要があるため、NZI物質の3つのカテゴリ、すなわち、epsilon-and-mu near-zero(EMNZ)、epsilon-near-zero(ENZ)、mu-near-zero(MNZ)内の運動量を調べる。分散材料におけるアブラハム・ミンコフスキーの議論の文脈において、光のミンコフスキーカノニカル運動量はNZIのすべてのカテゴリにおいてゼロであり、一方、光のエイブラハム運動量はENZおよびMNZの材料ではゼロであるが、EMNZの材料ではゼロである。理論上、nzi材料では、運動量後退、場から原子への移動運動量、ドップラーシフトが抑制されていることを実証する。基本放射過程の抑制は三次元nzi材料内部の運動量の考慮から説明される。最後に、スリッツ実験における回折パターンの欠如は、ミンコフスキー運動量ゼロの結果と見なされる。これらの発見は、ナノスケールでの基本的な光と物質との相互作用の理解を深めることと、発散の用途に訴求している。

Near-zero-index (NZI) materials, i.e. materials having a phase refractive index close to zero, are known to enhance or inhibit light-matter interactions. Most theoretical derivations of fundamental radiative processes rely on energetic considerations and detailed balance equations, but not on momentum considerations. Because momentum exchange should also be incorporated into theoretical models, we investigate momentum inside the three categories of NZI materials, i.e. inside epsilon-and-mu near-zero (EMNZ), epsilon-near-zero (ENZ) and mu-near-zero (MNZ) materials. In the context of the Abraham-Minkowski debate in dispersive materials, we show that the Minkowski canonical momentum of light is zero inside all categories of NZI materials, while the Abraham kinetic momentum of light is zero in ENZ and MNZ materials but nonzero inside EMNZ materials. We theoretically demonstrate that momentum recoil, momentum transfer from the field to the atom, and the Doppler shift are inhibited in NZI materials. The inhibition of fundamental radiative processes inside three-dimensional NZI materials is also explained by these momentum considerations. Lastly, the absence of a diffraction pattern in slit experiments is seen as a consequence of zero Minkowski momentum. These findings are appealing for a better understanding of fundamental light-matter interactions at the nanoscale as well as for lasing applications.

翻訳日:2023-03-17 21:09:19 公開日:2021-08-21

# 対称な厳密に凸ポテンシャルを持つSchrödinger方程式に対する virial ansätze 。第2部

Virial ansätze for the Schrödinger Equation with a symmetric strictly convex potential. Part II ( http://arxiv.org/abs/2108.09427v1 )

ライセンス: Link先を確認

S. P. Flego

(参考訳) 近年、対称凸ポテンシャルを持つ時間非依存Schrödinger方程式の固有関数に対してパラメータのないansätzeを得る手順が文献に紹介されている。本研究では,$x^{2\kappa}$-type ポテンシャルに関してこの手法を検証した。本研究では, 電位の程度と相互結合定数に関するansätzeの挙動について検討した。最後に,多項式ポテンシャルが絡み合う場合の相対誤差の上限の確立に,結果をどのように利用できるかについて議論する。

Recently, a procedure was introduced in the literature to obtain parameter-free ansätze for the eigenfunctions of the time-independent Schrödinger equation with a symmetric convex potential. In the present work, we test this technique on $x^{2\kappa}$-type potentials. We study the behavior of the ansätze with respect to the degree of the potential and to the intervening coupling constant. Finally, we discuss how the results could be used to establish upper bounds on the relative errors in situations where polynomial potentials intervene.

翻訳日:2023-03-17 21:08:22 公開日:2021-08-21

# 手術用テレプレゼンスにおける照明対応グラディエントミキシングを用いた混合現実感:多層可視化の強化

Mixed Reality using Illumination-aware Gradient Mixing in Surgical Telepresence: Enhanced Multi-layer Visualization ( http://arxiv.org/abs/2110.09318v1 )

ライセンス: Link先を確認

Nirakar Puri, Abeer Alsadoon, P.W.C. Prasad, Nada Alsalami, Tarik A. Rashid

(参考訳) 背景と目的: 拡張現実を用いた手術用テレプレゼンスが応用されているが, 混合現実は研究が続けられており, 理論上のみである。本研究の目的は,入力源および対象映像の照明強度が変化した場合に,グローバルに一貫した映像を生成することにより,最終的な統合映像の可視化を改善する方法を提案することである。方法論:本システムでは,照明認識型映像合成アルゴリズムを用いた照明認識勾配混合による拡張多層可視化を行う。 particle swarm optimizationアルゴリズムは、アルファマットを推定するために、前景と背景領域と画像画素相関から最適なサンプルペアを見つけるために使用される。 Particle Swarm Optimizationアルゴリズムは、未知の領域の未知のピクセルの色と深さを取得するのに役立つ。結果: 大腸, 顎, 乳房のサンプル10点につき, 未知領域のサンプルペアを選別する平均二乗誤差を減少させることにより, 精度が向上した。この削減の量は、state of art systemから16.48%である。その結果、視認性は89.4から97.7%に向上し、光の差でも手視をクリアすることができた。結論: 照明効果とアルファ画素相関は, 可視化精度を向上し, グローバルに一貫性のある合成結果を生成し, 高可逆照明効果の2つの映像を合成する際の時間的一貫性を維持する。さらに,本論文では,未知領域に対して最適なサンプリングペアを選択することで,原色と深度を求める方法を提案する。

Background and aim: Surgical telepresence using augmented perception has been applied, but mixed reality is still being researched and is only theoretical. The aim of this work is to propose a solution that improves the visualization in the final merged video by producing globally consistent videos when the illumination intensity of the input source and target videos varies. Methodology: The proposed system uses an enhanced multi-layer visualization with illumination-aware gradient mixing based on the Illumination Aware Video Composition algorithm. The Particle Swarm Optimization algorithm is used to find the best sample pair from the foreground and background regions, together with image pixel correlation, to estimate the alpha matte. The Particle Swarm Optimization algorithm also helps to recover the original colour and depth of an unknown pixel in the unknown region. Results: Accuracy improved because the mean squared error in selecting the best sample pair for the unknown region was reduced, measured on 10 samples each for bowel, jaw and breast. This reduction amounts to 16.48% relative to the state-of-the-art system. As a result, the visibility accuracy improved from 89.4% to 97.7%, which helped to keep the view of the hand clear even under differing light. Conclusion: The illumination effect and alpha pixel correlation improve the visualization accuracy, produce globally consistent composition results, and maintain temporal coherency when compositing two videos with high and inverse illumination effects. In addition, this paper provides a solution for selecting the best sampling pair for the unknown region to obtain the original colour and depth.

翻訳日:2023-03-17 21:03:09 公開日:2021-08-21
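
The methodology in the abstract above hinges on estimating an alpha matte from a foreground/background sample pair. The sketch below shows the standard compositing equation $I = \alpha F + (1-\alpha) B$ and the usual projection-based alpha estimate for one candidate pair. It is only an illustration under that textbook model: the function names are hypothetical, and the Particle Swarm Optimization search and illumination-aware mixing of the paper are not reproduced here.

```python
def composite(alpha, foreground, background):
    """Blend two RGB colours with the compositing equation I = a*F + (1-a)*B."""
    return tuple(alpha * f + (1.0 - alpha) * b
                 for f, b in zip(foreground, background))


def estimate_alpha(pixel, foreground, background):
    """Estimate alpha for one (foreground, background) sample pair.

    Projects (pixel - background) onto (foreground - background) and
    clamps the result to [0, 1].
    """
    diff = [f - b for f, b in zip(foreground, background)]
    denom = sum(d * d for d in diff)
    if denom == 0:
        raise ValueError("foreground and background samples are identical")
    alpha = sum((p - b) * d for p, b, d in zip(pixel, background, diff)) / denom
    return min(1.0, max(0.0, alpha))


# Round trip: blend two colours, then recover the blending factor.
fg, bg = (0.9, 0.2, 0.1), (0.1, 0.3, 0.8)
mixed = composite(0.25, fg, bg)
print(estimate_alpha(mixed, fg, bg))
```

A sample-pair search, whatever optimizer drives it, would score many such candidate pairs per unknown pixel and keep the one that best explains the observed colour.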

# 設計・設計におけるカオス性依存性最適化

Chaotic Fitness Dependent Optimizer for Planning and Engineering Design ( http://arxiv.org/abs/2110.08067v1 )

ライセンス: Link先を確認

Hardi M. Mohammed, Tarik A. Rashid

(参考訳) 適応依存オプティマイザ(fitness dependent optimizer, fdo)は、ミツバチの群れの繁殖行動を模倣した最近のメタヒューリスティックなアルゴリズムである。このアルゴリズムはParticle Swarm Optimization (PSO) に似ているが、動作は異なる。このアルゴリズムは非常に強力で、他の一般的なメタヒューリスティックアルゴリズムよりも優れた結果が得られる。本稿は,FDOの性能向上を目的としており,このカオス理論をFDOの内部で使用して,CFDO(Chaotic FDO)を提案する。 CFDOでは10のカオスマップを使用して、どの地図がうまく機能しているかを考察し、局所最適とグローバル最適の発見を避ける。 FDO技術は人口の修正に問題があるため、新しい技術は特定の制限で人口を遂行するために使用される。提案するCFDOは,CEC2019のベンチマーク関数10を用いて評価する。その結果,CFDOの能力は向上した。テントマップが最悪である間、シンガーマップはcfdoの改善に大きな影響を与えます。その結果,CFDOはGA,FDO,CSOよりも優れていることがわかった。 CEC2013とCEC2005はCFDOの評価に用いられる。最後に, CFDOは圧力容器設計などの古典的工学的問題に適用され, CFDOがWOA, GWO, FDO, CGWOよりも優れていることを示す。さらに、cfdoはタスク割り当て問題を解くために適用され、元のfdoと比較される。その結果、cfdoは問題を解決する能力がより優れていることが判明した。

Fitness Dependent Optimizer (FDO) is a recent metaheuristic algorithm that mimics the reproduction behavior of bee swarms in finding better hives. This algorithm is similar to Particle Swarm Optimization (PSO) but works differently. The algorithm is very powerful and produces better results than other common metaheuristic algorithms. This paper aims at improving the performance of FDO; to this end, chaos theory is used inside FDO to propose Chaotic FDO (CFDO). Ten chaotic maps are used in CFDO to determine which of them perform well in avoiding local optima and finding global optima. A new technique is used to keep the population within specified limits, since the original FDO technique has difficulty amending the population. The proposed CFDO is evaluated using 10 benchmark functions from CEC2019. The results show that the ability of CFDO is improved. The Singer map has a great impact on improving CFDO, while the Tent map is the worst. Results show that CFDO is superior to GA, FDO, and CSO. Both CEC2013 and CEC2005 are also used to evaluate CFDO. Finally, the proposed CFDO is applied to classical engineering problems, such as pressure vessel design, and the results show that CFDO can handle the problem better than WOA, GWO, FDO, and CGWO. In addition, CFDO is applied to the task assignment problem and compared to the original FDO. The results show that CFDO has a better capability to solve the problem.

翻訳日:2023-03-17 21:02:41 公開日:2021-08-21
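
The abstract above names the Tent and Singer maps among the ten chaotic maps tried in CFDO. As a rough sketch of how such maps can stand in for uniform random numbers, the snippet below iterates both maps in the forms commonly quoted in the chaotic-optimization literature. The exact parameterization used in the paper is not stated in the abstract, so the constants (peak 0.7, mu 1.07) and all function names here are assumptions.

```python
def tent_map(x, peak=0.7):
    """Skew tent map on [0, 1]; peak=0.7 is an assumed, commonly used value."""
    if x < peak:
        return x / peak
    return (1.0 - x) / (1.0 - peak)


def singer_map(x, mu=1.07):
    """Singer map in its commonly quoted polynomial form (mu is assumed)."""
    return mu * (7.86 * x - 23.31 * x ** 2 + 28.75 * x ** 3 - 13.302875 * x ** 4)


def chaotic_sequence(step, x0, length):
    """Iterate a one-dimensional map `step` from x0 and collect the orbit."""
    values, x = [], x0
    for _ in range(length):
        x = step(x)
        values.append(x)
    return values


def scale(value, lower, upper):
    """Map a chaotic value in [0, 1] onto a search-space interval."""
    return lower + value * (upper - lower)


# Use chaotic values instead of uniform random draws for a step size in [-1, 1].
steps = [scale(v, -1.0, 1.0) for v in chaotic_sequence(singer_map, 0.37, 5)]
print(steps)
```

In a chaotic metaheuristic, values like these typically replace calls to a pseudo-random generator in the position-update rule; where exactly CFDO injects them is not specified in the abstract.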

# no free lunch theorems の実際的証明を求めて

Searching for a practical evidence of the No Free Lunch theorems ( http://arxiv.org/abs/2109.13738v1 )

ライセンス: Link先を確認

Mihai Oltean

(参考訳) No Free Lunch (NFL) の定理によると、最適化問題全体と比較すると、すべてのブラックボックスアルゴリズムは等しくうまく機能する。 nflに関連する重要な問題は、あるアルゴリズムが別のアルゴリズムよりも優れているというテスト問題を見つけることである。興味深いのは、ランダム検索が他の標準進化アルゴリズムよりも優れている関数を見つけることである。本稿では,この問題を解決するための進化的アプローチを提案する。与えられたアルゴリズムaが他のアルゴリズムbよりも優れているようなテスト関数を進化させる。関数最適化のためのNFLスタイルの進化的アルゴリズムを含む数値実験を行った。その結果,提案手法の有効性が示された。ランダム検索が他の考慮されたアルゴリズムよりも優れているいくつかのテスト関数が進化してきた。

According to the No Free Lunch (NFL) theorems all black-box algorithms perform equally well when compared over the entire set of optimization problems. An important problem related to NFL is finding a test problem for which a given algorithm is better than another given algorithm. Of high interest is finding a function for which Random Search is better than another standard evolutionary algorithm. In this paper, we propose an evolutionary approach for solving this problem: we will evolve test functions for which a given algorithm A is better than another given algorithm B. Two ways for representing the evolved functions are employed: as GP trees and as binary strings. Several numerical experiments involving NFL-style Evolutionary Algorithms for function optimization are performed. The results show the effectiveness of the proposed approach. Several test functions for which Random Search performs better than all other considered algorithms have been evolved.

翻訳日:2023-03-17 21:02:16 公開日:2021-08-21
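
The NFL statement in the abstract above can be checked by brute force on a toy search space. The sketch below enumerates every function from a three-point domain to three values and compares a fixed-order search with a simple adaptive one; both are deterministic and never revisit a point. The two algorithms and all names are illustrative assumptions and are unrelated to the evolved test functions of the paper.

```python
from collections import Counter
from itertools import product

DOMAIN = (0, 1, 2)   # candidate solutions
VALUES = (0, 1, 2)   # possible objective values


def fixed_order(history):
    """Visit the points in the order 0, 1, 2 regardless of what was observed."""
    visited = {x for x, _ in history}
    return next(x for x in DOMAIN if x not in visited)


def adaptive(history):
    """Start in the middle, then pick the next point based on the last value."""
    if not history:
        return 1
    visited = {x for x, _ in history}
    candidates = [x for x in DOMAIN if x not in visited]
    last_value = history[-1][1]
    return candidates[-1] if last_value >= 1 else candidates[0]


def best_after(algorithm, f, budget):
    """Best objective value found after `budget` evaluations of f."""
    history = []
    for _ in range(budget):
        x = algorithm(history)
        history.append((x, f[x]))
    return max(y for _, y in history)


def performance_histogram(algorithm, budget):
    """Distribution of the best value found, over ALL functions on DOMAIN."""
    all_functions = product(VALUES, repeat=len(DOMAIN))
    return Counter(best_after(algorithm, f, budget) for f in all_functions)


for budget in (1, 2, 3):
    print(budget,
          performance_histogram(fixed_order, budget) ==
          performance_histogram(adaptive, budget))
```

The histograms coincide for every budget, so any advantage of one algorithm must come from restricting attention to a subset of functions, which is the kind of test problem the paper sets out to evolve.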

# 等間隔問題に対する可逆回路の進化

Evolving reversible circuits for the even-parity problem ( http://arxiv.org/abs/2109.13355v1 )

ライセンス: Link先を確認

Mihai Oltean

(参考訳) 可逆計算とは基本的に、電力の少ない計算を意味する。標準バイナリゲートは通常可逆ではないので、フレドキンゲートを使用して可逆性を達成する。本稿では,可逆ディジタル回路の設計アルゴリズムについて述べる。このアルゴリズムは、個人を線形表現した遺伝的プログラミング変種であるMulti Expression Programming (MEP)に基づいている。均等性問題に対するディジタル回路について検討した。数値実験により、MEPに基づくアルゴリズムは、偶数-8パリティ問題に対する可逆的なディジタル回路を容易に設計できることが示されている。

Reversible computing basically means computation with little or no electrical power. Since the standard binary gates are not usually reversible, we use the Fredkin gate to achieve reversibility. An algorithm for designing reversible digital circuits is described in this paper. The algorithm is based on Multi Expression Programming (MEP), a Genetic Programming variant with a linear representation of individuals. The case of digital circuits for the even-parity problem is investigated. Numerical experiments show that the MEP-based algorithm is able to easily design reversible digital circuits for up to the even-8-parity problem.

翻訳日:2023-03-17 21:02:05 公開日:2021-08-21
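
The abstract above builds its circuits from Fredkin gates. The following sketch models the Fredkin gate as a controlled swap and composes it, with constant inputs, into a hand-written even-parity function. It is only meant to show why the gate is reversible and universal enough for this task: the construction discards garbage outputs and reuses input bits freely, and it is not one of the circuits evolved by MEP in the paper. All function names are illustrative.

```python
from itertools import product


def fredkin(control, a, b):
    """Fredkin (controlled-swap) gate: swap a and b when control is 1."""
    return (control, b, a) if control else (control, a, b)


def not_gate(x):
    """NOT from one Fredkin gate with constants 0 and 1 (third output)."""
    return fredkin(x, 0, 1)[2]


def xor_gate(x, y):
    """XOR from Fredkin gates: select y or NOT y depending on x."""
    return fredkin(x, y, not_gate(y))[1]


def even_parity(bits):
    """Return 1 when an even number of input bits are set, else 0."""
    acc = 0
    for bit in bits:
        acc = xor_gate(acc, bit)
    return not_gate(acc)


# Reversibility: the gate is a bijection on 3-bit states and its own inverse.
states = list(product((0, 1), repeat=3))
print(len({fredkin(*s) for s in states}) == len(states))
print(all(fredkin(*fredkin(*s)) == s for s in states))
print(even_parity((1, 0, 1, 1, 0, 1, 0, 0)))
```

A fully reversible design would also have to carry the garbage outputs along, which is part of what makes the automated search in the paper non-trivial.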

# 線形遺伝的プログラミングを用いた進化的アルゴリズム

Evolving Evolutionary Algorithms using Linear Genetic Programming ( http://arxiv.org/abs/2109.13110v1 )

ライセンス: Link先を確認

Mihai Oltean

(参考訳) 本稿では,進化的アルゴリズムの新しいモデルを提案する。このモデルは線形遺伝的プログラミング(LGP)技術に基づいている。すべてのLGP染色体は、特定の問題を解決するのに使用されるEAをコードする。機能最適化のためのいくつかの進化的アルゴリズム、トラベリングセールスマン問題、および擬似アサインメント問題は、検討されたモデルを用いて進化する。数値実験により、進化的アルゴリズムは、いくつかのよく知られたベンチマーク問題に対する標準的なアプローチよりも、同様に、時には良く機能することが示された。

A new model for evolving Evolutionary Algorithms is proposed in this paper. The model is based on the Linear Genetic Programming (LGP) technique. Every LGP chromosome encodes an EA which is used for solving a particular problem. Several Evolutionary Algorithms for function optimization, the Traveling Salesman Problem, and the Quadratic Assignment Problem are evolved by using the considered model. Numerical experiments show that the evolved Evolutionary Algorithms perform similarly and sometimes even better than standard approaches for several well-known benchmarking problems.

翻訳日:2023-03-17 21:01:56 公開日:2021-08-21

# Nimライクゲームにおける勝利戦略の展開

Evolving winning strategies for Nim-like games ( http://arxiv.org/abs/2109.13109v1 )

ライセンス: Link先を確認

Mihai Oltean

(参考訳) 本稿では,Nimライクゲームにおける勝利戦略を計算するための進化的アプローチを提案する。勝利戦略は、遺伝的プログラミング(GP)の高速かつ効率的な変種であるMEP(Multi Expression Programming)技術を用いて計算される。各プレイ戦略は、数学演算子(+, -, *, mod, div, and , or, xor, not など)とオペランド(現在のゲーム状態のエンコード)を含む数学的表現で表される。 Nimゲームの勝利戦略を計算するためのいくつかの数値実験を行う。勝利戦略の進化に必要な計算労力を報告する。その結果,提案手法はnim系ゲームにおける勝利戦略の計算に非常に適していることがわかった。

An evolutionary approach for computing the winning strategy for Nim-like games is proposed in this paper. The winning strategy is computed by using the Multi Expression Programming (MEP) technique, a fast and efficient variant of Genetic Programming (GP). Each play strategy is represented by a mathematical expression that contains mathematical operators (such as +, -, *, mod, div, and, or, xor, not) and operands (encoding the current game state). Several numerical experiments for computing the winning strategy for the Nim game are performed. The computational effort needed for evolving a winning strategy is reported. The results show that the proposed evolutionary approach is very suitable for computing the winning strategy for Nim-like games.

翻訳日:2023-03-17 21:01:50 公開日:2021-08-21
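
For reference, the classical closed-form strategy for standard Nim (normal play, where the player taking the last object wins) is itself an expression over the xor operator listed in the abstract above. The sketch below implements that textbook nim-sum rule; it is not the expression evolved by MEP in the paper, and the function names are illustrative.

```python
from functools import reduce
from operator import xor


def nim_sum(heaps):
    """Bitwise xor of all heap sizes."""
    return reduce(xor, heaps, 0)


def winning_move(heaps):
    """Return (heap_index, new_size) leaving nim-sum 0, or None if none exists.

    Assumes normal play: the player who takes the last object wins.
    """
    total = nim_sum(heaps)
    if total == 0:
        return None  # every move hands the opponent a winning position
    for index, size in enumerate(heaps):
        target = size ^ total
        if target < size:
            return index, target
    return None  # unreachable when total != 0


print(winning_move([3, 4, 5]))  # (0, 1): reduce the first heap to 1
print(winning_move([1, 2, 3]))  # None: nim-sum is already 0
```

An evolved strategy can be validated against this rule by checking that every move it proposes leaves a position with nim-sum zero.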

# Knapsack問題に対するディジタル回路の進化

Evolving Digital Circuits for the Knapsack Problem ( http://arxiv.org/abs/2109.13107v1 )

ライセンス: Link先を確認

Mihai Oltean, Crina Groşan and Mihaela Oltean

(参考訳) マルチ表現プログラミング(Multi Expression Programming、MEP)は、線形染色体を用いた遺伝的プログラミングの亜種である。 MEPのユニークな特徴は、単一染色体における問題の複数の解をコードする能力である。本稿では,NP-Complete問題,knapsack (subset sum)問題に対して,ディジタル回路の進化にマルチ表現プログラミングを用いる。数値実験により,マルチ表現プログラミングは検討されたテスト問題に対して良好に機能することが示された。

Multi Expression Programming (MEP) is a Genetic Programming variant that uses linear chromosomes for solution encoding. A unique feature of MEP is its ability of encoding multiple solutions of a problem in a single chromosome. In this paper we use Multi Expression Programming for evolving digital circuits for a well-known NP-Complete problem: the knapsack (subset sum) problem. Numerical experiments show that Multi Expression Programming performs well on the considered test problems.

翻訳日:2023-03-17 21:01:38 公開日:2021-08-21
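
To make the target problem of the abstract above concrete, here is a small software baseline for the subset sum decision problem using the classical dynamic-programming idea of tracking reachable sums. It assumes non-negative integer inputs and is only a reference point: the paper evolves digital circuits for this problem, which this sketch does not attempt.

```python
def subset_sum(values, target):
    """Decide whether some subset of `values` adds up exactly to `target`.

    Assumes non-negative integers; runs in pseudo-polynomial time.
    """
    reachable = {0}  # sums that can be formed from the items seen so far
    for value in values:
        reachable |= {s + value for s in reachable if s + value <= target}
    return target in reachable


print(subset_sum([3, 34, 4, 12, 5, 2], 9))   # True, e.g. 4 + 5
print(subset_sum([3, 34, 4, 12, 5, 2], 30))  # False
```

A baseline like this can serve as an oracle when checking the outputs of a candidate circuit on small instances.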

# 量子場理論は余剰助力を保持する

Quantum Field Theory Deserves Extra Help ( http://arxiv.org/abs/2108.10713v1 )

ライセンス: Link先を確認

John R. Klauder

(参考訳) 今日の量子場理論(QFT)は正準量子化(CQ)に依存しており、$\varphi^4_4$が「自由」な結果にしかならない。代替量子化法であるアフィン量子化(AQ)は、同じモデルに対して「自由でない」結果をもたらす。おそらく、CQにAQを加えることで、QFTにおける幅広い問題の量子化を改善することができる。

Today's quantum field theory (QFT) relies heavily on canonical quantization (CQ), which fails for $\varphi^4_4$, leading only to a "free" result. Affine quantization (AQ), an alternative quantization procedure, leads to a "non-free" result for the same model. Perhaps adding AQ to CQ can improve the quantization of a wide class of problems in QFT.

翻訳日:2023-03-17 21:01:32 公開日:2021-08-21

# ビームスプリッタに作用するコヒーレント対光子の決定論的量子相関

Deterministic quantum correlation between coherently paired photons acting on a beam splitter ( http://arxiv.org/abs/2108.09596v1 )

ライセンス: Link先を確認

B. S. Ham

(参考訳) 光子の粒子の性質に基づく量子技術は過去数十年にわたって進歩しており、エンタングルメントの基本的な量子特性は、ホン・ウー・マンデル型反相関とベル型非局所相関によって検証されてきた。近年、光子の波動特性に基づく相互排他的量子特徴が、謎の量子相関の基礎物理学を理解するために研究され、決定論的かつマクロな量子技術を生み出している。ここでは、ビームスプリッターに作用する対光子の量子的性質を研究し、相互のコヒーレンスが主要な役割を果たす。反相関に関する現在の一般的な理解とは異なり、対の光子間の二部結合は確率的あるいは後選択される必要はないが、量子力学に違反することなく位相ベース操作によって決定的かつマクロ的である。

Quantum technologies based on the particle nature of a photon have progressed over the last several decades, and the fundamental quantum features of entanglement have been tested by Hong-Ou-Mandel-type anticorrelation and Bell-type nonlocal correlation. Recently, mutually exclusive quantum features based on the wave nature of a photon have been investigated to understand the fundamental physics of mysterious quantum correlation, resulting in deterministic and macroscopic quantum technologies. Here, we study the quantum nature of paired photons acting on a beam splitter, where mutual coherence plays a major role. Unlike the current common understanding of anticorrelation, bipartite entanglement between paired photons does not have to be probabilistic or post-selected, but can be deterministic and even macroscopic via phase basis manipulation without violating quantum mechanics.

翻訳日:2023-03-17 21:01:23 公開日:2021-08-21

# 分解多目的進化最適化:最先端から将来の機会へ

Decomposition Multi-Objective Evolutionary Optimization: From State-of-the-Art to Future Opportunities ( http://arxiv.org/abs/2108.09588v1 )

ライセンス: Link先を確認

Ke Li

(参考訳) 分解は、多目的最適化と多条件決定のための古典的な数学的プログラミングにおける主流のアプローチである。しかし、進化的多目的最適化の文脈では、分解(MOEA/D)に基づく多目的進化アルゴリズムの開発まで適切に研究されなかった。本稿では,moea/dの開発をその起源から現在の技術動向まで包括的に調査する。自己完結するために、初心者がmoea/dの動作メカニズムに素早く乗り出すのを助けるためのステップバイステップのチュートリアルから始めます。次に, 重みベクトル設定, サブプロブレム定式化, 選択機構, 再生演算子など, 基本設計要素に従ってMOEA/Dの選定を概観する。さらに,制約処理,計算コストの高い客観的関数,選好統合,実世界のアプリケーションについても概説する。最終段階では、今後の発展に向けた新たな方向性に光を当てました。

Decomposition has been the mainstream approach in classic mathematical programming for multi-objective optimization and multi-criterion decision-making. However, it was not properly studied in the context of evolutionary multi-objective optimization until the development of the multi-objective evolutionary algorithm based on decomposition (MOEA/D). In this article, we present a comprehensive survey of the development of MOEA/D from its origin to the current state-of-the-art approaches. In order to be self-contained, we start with a step-by-step tutorial that aims to help a novice quickly grasp the working mechanism of MOEA/D. Then, selected major developments of MOEA/D are reviewed according to its core design components including weight vector settings, sub-problem formulations, selection mechanisms and reproduction operators. Besides, we also overview some further developments for constraint handling, computationally expensive objective functions, preference incorporation, and real-world applications. In the final part, we shed some light on emerging directions for future developments.

翻訳日:2023-03-17 21:01:07 公開日:2021-08-21
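
The decomposition idea named in the abstract above (weight vectors that each define one scalar sub-problem) can be illustrated with a small sketch. The weighted Tchebycheff scalarization used here is one common sub-problem formulation; the candidate points, the ideal point, and the helper names are hypothetical and are not taken from the survey.

```python
def tchebycheff(objectives, weights, ideal):
    """Weighted Tchebycheff value of one objective vector (minimization)."""
    return max(w * abs(f - z) for f, w, z in zip(objectives, weights, ideal))


def uniform_weights(n):
    """n evenly spread weight vectors for a two-objective problem."""
    return [(i / (n - 1), 1 - i / (n - 1)) for i in range(n)]


# Hypothetical candidate solutions, given by their two objective values.
candidates = [(0.1, 0.9), (0.5, 0.5), (0.9, 0.1)]
ideal = (0.0, 0.0)  # assumed ideal point

# Each weight vector defines one sub-problem; keep the best candidate for it.
best = [min(candidates, key=lambda f: tchebycheff(f, w, ideal))
        for w in uniform_weights(3)]
```

Each sub-problem ends up preferring a different trade-off point, which is how a set of weight vectors spreads the search along the Pareto front.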

# 適切かつ公平な説明

Adequate and fair explanations ( http://arxiv.org/abs/2001.07578v2 )

ライセンス: Link先を確認

Nicholas Asher, Soumya Paul, Chris Russell

(参考訳) 高度な機械学習ベースのシステムを説明することは、AIの基礎において重要な問題である。近年,様々な説明方法が提案されている。これらのアプローチは、局所的および人間の解釈可能な機械学習アルゴリズムの近似を提供するものと、決定の1つの側面を正確に特徴づける論理的アプローチの2つに大別できる。本稿では,厳密な論理的基礎を持つ第2学派に焦点をあてる。これらの厳密な方法には認識論的問題がある。これらは完全な説明を与えることができるが、そのような説明は人間が理解したり、読みやすい形で書き留めるには複雑すぎるかもしれない。解釈可能性には理解しやすい説明、人間が把握できる説明が必要である。しかし、十分に完全に理解可能な説明がまだ明確化する必要がある。ここでは、[Wachter et al., 2017]に倣って、対策の観点でこれを行う。反事実的な説明では、完全な説明を提供するために必要な多くの仮定は暗黙的に残される。そのため、反事実的説明は特定のデータポイントやサンプルの性質を利用しており、部分的説明と同様に局所的でもある。局所的な部分的な説明から完全な局所的な説明へと、そしてグローバルな説明へと移行する方法を探求する。しかし、アクセシビリティを維持するために、部分性の必要性を主張します。この偏りにより、有害または不公平なアルゴリズムに存在する明示的なバイアスを隠蔽することができる。我々は、完全な局所的な説明を提供する反事実の集合の構造を利用して、完全かつ公平な説明を提供することで、これらのバイアスをいかに容易に解明できるかを検討する。

Explaining sophisticated machine-learning based systems is an important issue at the foundations of AI. Recent efforts have shown various methods for providing explanations. These approaches can be broadly divided into two schools: those that provide a local and human interpretable approximation of a machine learning algorithm, and logical approaches that exactly characterise one aspect of the decision. In this paper we focus upon the second school of exact explanations with a rigorous logical foundation. There is an epistemological problem with these exact methods. While they can furnish complete explanations, such explanations may be too complex for humans to understand or even to write down in human readable form. Interpretability requires epistemically accessible explanations, explanations humans can grasp. Yet what is a sufficiently complete epistemically accessible explanation still needs clarification. We do this here in terms of counterfactuals, following [Wachter et al., 2017]. With counterfactual explanations, many of the assumptions needed to provide a complete explanation are left implicit. To do so, counterfactual explanations exploit the properties of a particular data point or sample, and as such are also local as well as partial explanations. We explore how to move from local partial explanations to what we call complete local explanations and then to global ones. But to preserve accessibility we argue for the need for partiality. This partiality makes it possible to hide explicit biases present in the algorithm that may be injurious or unfair. We investigate how easy it is to uncover these biases in providing complete and fair explanations by exploiting the structure of the set of counterfactuals providing a complete local explanation.

翻訳日:2023-01-08 00:11:31 公開日:2021-08-21

# Slice Tuner: 正確かつ公平な機械学習モデルのための選択的データ取得フレームワーク

Slice Tuner: A Selective Data Acquisition Framework for Accurate and Fair Machine Learning Models ( http://arxiv.org/abs/2003.04549v3 )

ライセンス: Link先を確認

Ki Hyun Tae, Steven Euijong Whang

(参考訳) 機械学習はSoftware 2.0の時代に民主化されていくにつれて、正確で公正なモデルを保証するのに十分なデータを獲得している。クラウドソーシングを含む最近の技術は、そのようなデータを集めるためのコスト効率の高い方法を提供する。しかし、データを取得することは必ずしも正確性と公平性を最適化するための効果的な戦略ではない。例えば、オンラインのapp storeに、特定のデータスライス(例えばアメリカの顧客)のための十分なトレーニングデータがあるが、他の顧客にとってはそうではない場合、より多くのアメリカの顧客データを取得することは、モデルのトレーニングに偏るだけだ。代わりに、選択的にデータを取得し、スライス毎の潜在的に異なる量のデータを取得し、スライス毎のモデル精度と公平性を最適化するSlice Tunerを提案する必要がある。この問題は、(アクティブな学習や弱い監督において)既存のデータをラベル付けすることとは異なる。中心となるSlice Tunerは、より多くのデータに対してモデル精度を見積もるスライスの学習曲線を維持し、凸最適化を使用して最高のデータ取得戦略を見つける。学習曲線を推定する主な課題は、十分なデータがなければ不正確な場合があり、一方のスライスで取得したデータが他者の学習曲線に影響を与えるスライス間に依存性がある場合である。より多くのデータを取得するにつれて、学習曲線を反復的かつ効率的に更新することで、これらの問題を解決する。我々は,クラウドソーシングを用いて実際のデータセット上でSlice Tunerを評価し,学習曲線を確実に推定できない場合でも,モデル精度と公平性の観点からSlice Tunerがベースラインを著しく上回ることを示す。

As machine learning becomes democratized in the era of Software 2.0, a serious bottleneck is acquiring enough data to ensure accurate and fair models. Recent techniques including crowdsourcing provide cost-effective ways to gather such data. However, simply acquiring as much data as possible is not necessarily an effective strategy for optimizing accuracy and fairness. For example, if an online app store has enough training data for certain slices of data (say American customers), but not for others, obtaining more American customer data will only bias the model training. Instead, we contend that one needs to selectively acquire data and propose Slice Tuner, which acquires possibly-different amounts of data per slice such that the model accuracy and fairness on all slices are optimized. This problem is different from labeling existing data (as in active learning or weak supervision) because the goal is obtaining the right amounts of new data. At its core, Slice Tuner maintains learning curves of slices that estimate the model accuracies given more data and uses convex optimization to find the best data acquisition strategy. The key challenges of estimating learning curves are that they may be inaccurate if there is not enough data, and there may be dependencies among slices where acquiring data for one slice influences the learning curves of others. We solve these issues by iteratively and efficiently updating the learning curves as more data is acquired. We evaluate Slice Tuner on real datasets using crowdsourcing for data acquisition and show that Slice Tuner significantly outperforms baselines in terms of model accuracy and fairness, even when the learning curves cannot be reliably estimated.

翻訳日:2022-12-24 20:35:38 公開日:2021-08-21
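
The abstract above describes per-slice learning curves that drive a data acquisition plan. The following is a minimal sketch of that idea under two stated assumptions: each curve is modeled as a power law `loss = b * n**(-a)` (the abstract does not name a curve family), and a simple greedy allocation stands in for the convex optimization that the paper actually uses. All function names and numbers are hypothetical.

```python
import math


def fit_power_law(sizes, losses):
    """Fit loss ~ b * n**(-a) by least squares in log-log space; returns (a, b)."""
    xs = [math.log(n) for n in sizes]
    ys = [math.log(v) for v in losses]
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    return -slope, math.exp(my - slope * mx)


def allocate(curves, sizes, budget, step=10):
    """Greedy stand-in for the optimizer: repeatedly give `step` examples
    to the slice whose fitted curve predicts the largest loss reduction."""
    sizes = dict(sizes)              # work on a copy of the current slice sizes
    extra = {s: 0 for s in sizes}    # examples to acquire per slice

    def gain(s):
        a, b = curves[s]
        n = sizes[s]
        return b * n ** -a - b * (n + step) ** -a

    for _ in range(budget // step):
        s = max(sizes, key=gain)
        sizes[s] += step
        extra[s] += step
    return extra


# Hypothetical slices: one data-rich, one data-poor, with identical curves.
curves = {"us": (0.5, 2.0), "other": (0.5, 2.0)}
plan = allocate(curves, {"us": 1000, "other": 100}, budget=100)
```

With identical curves the whole budget goes to the smaller slice, matching the app store example in the abstract where more data for the already well-covered slice helps little.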

# マルチセンターフェデレーションラーニング

Multi-Center Federated Learning ( http://arxiv.org/abs/2005.01026v2 )

ライセンス: Link先を確認

Ming Xie, Guodong Long, Tao Shen, Tianyi Zhou, Xianzhi Wang, Jing Jiang, Chengqi Zhang

(参考訳) フェデレートラーニングは,ユーザデータに直接アクセスする必要なく,大規模モデルを分散的にトレーニングする能力に大きな注目を集めている。ユーザのプライベートデータを集中収集から保護するのに役立つ。分散機械学習とは異なり、フェデレートされた学習は、スマートフォンなど、さまざまな現実世界のアプリケーションにおける異種ソースからの非IIDデータに取り組むことを目的としている。既存のフェデレーション学習のアプローチは通常、単一のグローバルモデルを採用して、データ分布のばらつきに関係なく、勾配を集約することで、すべてのユーザの共有知識をキャプチャする。しかし、ユーザ行動の多様性のため、異なるグローバルモデル(すなわちセンター)にユーザの勾配を割り当てることで、ユーザ間のデータ分散の不均一性をよりよく捉えることができる。本稿では,非IIDユーザデータから複数のグローバルモデルを学習し,同時にユーザとセンタの最適なマッチングを導出する,フェデレーション学習のための新しい多中心集約機構を提案する。確率的予測最大化(EM)アルゴリズムにより効率よく解けるような共同最適化として問題を定式化する。ベンチマークデータセットによる実験結果から,本手法はいくつかの一般的なフェデレーション学習法より優れていることが示された。

Federated learning has received great attention for its capability to train a large-scale model in a decentralized manner without needing to access user data directly. It helps protect the users' private data from centralized collecting. Unlike distributed machine learning, federated learning aims to tackle non-IID data from heterogeneous sources in various real-world applications, such as those on smartphones. Existing federated learning approaches usually adopt a single global model to capture the shared knowledge of all users by aggregating their gradients, regardless of the discrepancy between their data distributions. However, due to the diverse nature of user behaviors, assigning users' gradients to different global models (i.e., centers) can better capture the heterogeneity of data distributions across users. Our paper proposes a novel multi-center aggregation mechanism for federated learning, which learns multiple global models from the non-IID user data and simultaneously derives the optimal matching between users and centers. We formulate the problem as a joint optimization that can be efficiently solved by a stochastic expectation maximization (EM) algorithm. Our experimental results on benchmark datasets show that our method outperforms several popular federated learning methods.

翻訳日:2022-12-07 06:23:54 公開日:2021-08-21
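
The alternating structure described in the abstract above (match users to centers, then update each center from its users) can be sketched as follows. This is a simplified hard-assignment variant on toy vectors, not the paper's stochastic EM procedure; the vectors, the squared-Euclidean distance, and the function names are assumptions made for illustration.

```python
def assign(users, centers):
    """E-step: index of the nearest center (squared Euclidean) for each user."""
    def dist(u, c):
        return sum((a - b) ** 2 for a, b in zip(u, c))
    return [min(range(len(centers)), key=lambda k: dist(u, centers[k]))
            for u in users]


def update(users, labels, centers):
    """M-step: each center becomes the mean of the user vectors assigned to it."""
    new_centers = []
    for k, old in enumerate(centers):
        members = [u for u, lab in zip(users, labels) if lab == k]
        if not members:
            new_centers.append(old)  # an empty center is left unchanged
            continue
        new_centers.append([sum(col) / len(members) for col in zip(*members)])
    return new_centers


# Hypothetical flattened model updates from four users forming two groups.
users = [[0.0, 0.1], [0.2, 0.0], [5.0, 5.1], [4.8, 5.0]]
centers = [[0.0, 0.0], [1.0, 1.0]]
for _ in range(5):
    labels = assign(users, centers)
    centers = update(users, labels, centers)
```

In a real federated setting the vectors would be locally trained model parameters or gradients, and only those, not the raw user data, would reach the server.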

# XGBoostにおける加速故障時間モデルによる生存回帰

Survival regression with accelerated failure time model in XGBoost ( http://arxiv.org/abs/2006.04920v3 )

ライセンス: Link先を確認

Avinash Barnwal, Hyunsu Cho, Toby Dylan Hocking

(参考訳) サバイバルレグレッションはイベント時間と機能変数の関係を推定するために使用され、医療、マーケティング、リスク管理、セールス管理といったアプリケーションドメインにおいて重要である。 xgboost、scikit-learn、lightgbm、catboostなどのライブラリに実装された非線形木ベースの機械学習アルゴリズムは、線形モデルよりも正確であることが多い。しかし、既存のツリーベースモデルの最先端実装は、生き残り回帰を限定的にサポートしている。本研究では,xgboostにおけるアクセラレーション障害時間(aft)モデル学習のための損失関数を実装し,異なる種類のラベル検閲に対するサバイバルモデルのサポートを強化する。我々は,XGBoostにおけるAFTの有効性を,一般化性能とトレーニング速度の2点において実かつシミュレートされた実験で実証した。さらに,XGBoostにおけるNVIDIA GPUのサポートを活用し,マルチコアCPU上での大幅な高速化を実現している。我々の知る限り、我々の研究はNVIDIA GPUの処理能力を利用するAFTの最初の実装である。 1.2.0のリリースから、XGBoostパッケージはAFTモデルをネイティブにサポートした。 XGBoostにAFTを追加したことで、オープンソースコミュニティに大きな影響を与え、いくつかの統計パッケージがXGBoost AFTモデルを使用している。

Survival regression is used to estimate the relation between time-to-event and feature variables, and is important in application domains such as medicine, marketing, risk management and sales management. Nonlinear tree based machine learning algorithms as implemented in libraries such as XGBoost, scikit-learn, LightGBM, and CatBoost are often more accurate in practice than linear models. However, existing state-of-the-art implementations of tree-based models have offered limited support for survival regression. In this work, we implement loss functions for learning accelerated failure time (AFT) models in XGBoost, to increase the support for survival modeling for different kinds of label censoring. We demonstrate with real and simulated experiments the effectiveness of AFT in XGBoost with respect to a number of baselines, in two respects: generalization performance and training speed. Furthermore, we take advantage of the support for NVIDIA GPUs in XGBoost to achieve substantial speedup over multi-core CPUs. To our knowledge, our work is the first implementation of AFT that utilizes the processing power of NVIDIA GPUs. Starting from the 1.2.0 release, the XGBoost package natively supports the AFT model. The addition of AFT in XGBoost has had significant impact in the open source community, and a few statistics packages now utilize the XGBoost AFT model.

翻訳日:2022-11-24 01:08:59 公開日:2021-08-21
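
To make the loss in the abstract above concrete, here is a minimal sketch of an AFT negative log-likelihood that handles uncensored, right-censored, and interval-censored labels. It assumes the model `ln T = pred + sigma * Z` with a standard normal `Z`; it is written from the textbook definition and is not XGBoost's own implementation. The function names are hypothetical.

```python
import math


def norm_pdf(z):
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def norm_cdf(z):
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def aft_nll(pred, lower, upper, sigma=1.0):
    """Negative log-likelihood of one label under ln T = pred + sigma * Z.

    lower == upper     -> uncensored event time
    upper == math.inf  -> right-censored at `lower`
    otherwise          -> interval-censored in [lower, upper]
    `lower` must be positive.
    """
    z_low = (math.log(lower) - pred) / sigma
    if lower == upper:
        # Density of T at the observed time (change of variables from ln T).
        return -math.log(norm_pdf(z_low) / (sigma * lower))
    if math.isinf(upper):
        cdf_up = 1.0
    else:
        cdf_up = norm_cdf((math.log(upper) - pred) / sigma)
    # Probability mass that the model places on the censoring interval.
    return -math.log(cdf_up - norm_cdf(z_low))
```

A gradient boosting library needs the first and second derivatives of this quantity with respect to `pred`; those are omitted here to keep the sketch short.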

# 治療効果の異質性モデリングと上昇モデルの統合調査

A unified survey of treatment effect heterogeneity modeling and uplift modeling ( http://arxiv.org/abs/2007.12769v3 )

ライセンス: Link先を確認

Weijia Zhang, Jiuyong Li, Lin Liu

(参考訳) 科学研究の多くの分野における中心的な問題は、結果が作用によってどのように影響を受けるかを決定すること、または作用の効果を測定することである。近年,パーソナライズされた医療,社会科学,オンラインマーケティングといった研究分野から,個人特性の異なる不均一な治療効果条件付けの必要性が浮上している。このニーズに応えるため、異なるコミュニティの研究者と実践者が、治療効果ヘテロジニティ・モデリング・アプローチとアップリフト・モデリング・アプローチをそれぞれ取り入れてアルゴリズムを開発した。本稿では,これら2つの非連結で近縁なアプローチについて,潜在的結果の枠組みの下で統一的な調査を行う。次に,各手法の比較を容易にする統一表記法を用いて,既存の手法に関する構造化された調査を行う。次に、パーソナライズされたマーケティング、パーソナライズされた医療、社会研究における調査手法の主な応用について概説する。最後に、既存のソフトウェアパッケージを要約し、合成、半合成、実世界のデータセットにおけるメソッドの使用に基づく議論を行い、メソッドを選択するための一般的なガイドラインを提供する。

A central question in many fields of scientific research is to determine how an outcome would be affected by an action, or to measure the effect of an action (a.k.a. treatment effect). In recent years, a need for estimating the heterogeneous treatment effects conditioning on the different characteristics of individuals has emerged from research fields such as personalized healthcare, social science, and online marketing. To meet the need, researchers and practitioners from different communities have developed algorithms by taking the treatment effect heterogeneity modeling approach and the uplift modeling approach, respectively. In this paper, we provide a unified survey of these two seemingly disconnected yet closely related approaches under the potential outcome framework. We then provide a structured survey of existing methods by emphasizing their inherent connections with a set of unified notations to make comparisons of the different methods easy. We then review the main applications of the surveyed methods in personalized marketing, personalized medicine, and social studies. Finally, we summarize the existing software packages and present discussions based on the use of methods on synthetic, semi-synthetic and real world data sets and provide some general guidelines for choosing methods.

翻訳日:2022-11-10 14:24:09 公開日:2021-08-21
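
The quantity surveyed in the abstract above, a treatment effect that varies with individual characteristics, can be illustrated with the simplest possible estimator: the difference in mean outcome between treated and control units inside each segment. This sketch assumes randomized treatment assignment and a single discrete covariate; the records and names are hypothetical, and the methods covered by the survey are considerably more general.

```python
from collections import defaultdict


def uplift_by_segment(records):
    """records: iterable of (segment, treated, outcome).

    Returns {segment: mean outcome of treated - mean outcome of control},
    skipping segments that lack either group.
    """
    stats = defaultdict(lambda: {"t_sum": 0.0, "t_n": 0, "c_sum": 0.0, "c_n": 0})
    for segment, treated, outcome in records:
        s = stats[segment]
        if treated:
            s["t_sum"] += outcome
            s["t_n"] += 1
        else:
            s["c_sum"] += outcome
            s["c_n"] += 1
    return {seg: s["t_sum"] / s["t_n"] - s["c_sum"] / s["c_n"]
            for seg, s in stats.items() if s["t_n"] and s["c_n"]}


# Hypothetical campaign data: (segment, received offer?, purchased?)
records = [
    ("young", 1, 1), ("young", 1, 0), ("young", 0, 0), ("young", 0, 0),
    ("old", 1, 1), ("old", 0, 1),
]
uplift = uplift_by_segment(records)
```

Here the offer appears to help only the "young" segment, which is the kind of heterogeneity both research communities aim to estimate.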

# コンピュータビジョンのための2次元推定を用いた3次元物体位置推定

3D Object Localization Using 2D Estimates for Computer Vision Applications ( http://arxiv.org/abs/2009.11446v2 )

ライセンス: Link先を確認

Taha Hasan Masood Siddique and Muhammad Usman

(参考訳) ポーズ推定とカメラキャリブレーションに基づく物体位置推定手法を提案する。対象物の2次元(2次元)画像を複数収集して3次元(3次元)座標を推定し、カメラの校正に利用する。レンズ歪みの除去、物体の大きさの計算、カメラの位置計算のための内因的および外因的パラメータを含む多くのパラメータ計算を含むキャリブレーションステップについて論じる。 2次元画像を用いて3次元ポーズを推定する変換戦略を示す。提案手法はMATLABに実装され,ポーズ推定とカメラキャリブレーションの両面で検証実験を行った。

A technique for object localization based on pose estimation and camera calibration is presented. The 3-dimensional (3D) coordinates are estimated by collecting multiple 2-dimensional (2D) images of the object and are utilized for the calibration of the camera. The calibration steps, which involve a number of parameter calculations including intrinsic and extrinsic parameters for the removal of lens distortion, computation of the object's size, and calculation of the camera's position, are discussed. A transformation strategy to estimate the 3D pose using the 2D images is presented. The proposed method is implemented in MATLAB and validation experiments are carried out for both pose estimation and camera calibration.

翻訳日:2022-10-15 05:07:07 公開日:2021-08-21

# CIMON: 高品質なハッシュコードを目指す

CIMON: Towards High-quality Hash Codes ( http://arxiv.org/abs/2010.07804v4 )

ライセンス: Link先を確認

Xiao Luo, Daqing Wu, Zeyu Ma, Chong Chen, Minghua Deng, Jinwen Ma, Zhongming Jin, Jianqiang Huang and Xian-Sheng Hua

(参考訳) 近年、ハッシュは、そのストレージと計算効率の高さから近似最近傍探索に広く利用されている。教師なしハッシュ法の多くは、事前訓練されたモデルから局所的な意味的類似性構造を構築し、特徴空間において距離が小さい各点対を類似として扱うことで、イメージを意味的類似性保存ハッシュコードにマッピングすることを学ぶ。しかし、事前学習されたモデルの非効率表現能力のため、局所的な意味的類似性における多くの偽陽性と偽陰性が導入され、ハッシュコードの学習中にエラーの伝播が起こる。さらに、モデルのロバスト性を考慮する方法も少なく、それによってハッシュコードが外乱に対して不安定になる。本稿では, Comprehensive sImilarity Mining and cOnsistency learNing (CIMON) という新しい手法を提案する。まず,グローバルリファインメントと類似度統計分布を用いて,信頼性の高い円滑な指導を行う。第2に、意味的および対比的一貫性学習は、乱れ不変と判別的ハッシュ符号の両方を導出するために導入される。いくつかのベンチマークデータセットの大規模な実験により,提案手法は検索性能とロバスト性の両方において,幅広い最先端手法より優れていることが示された。

Recently, hashing has been widely used in approximate nearest neighbor search for its storage and computational efficiency. Most of the unsupervised hashing methods learn to map images into semantic similarity-preserving hash codes by constructing local semantic similarity structure from the pre-trained model as the guiding information, i.e., treating each point pair as similar if their distance is small in feature space. However, due to the inefficient representation ability of the pre-trained model, many false positives and negatives in local semantic similarity will be introduced and lead to error propagation during the hash code learning. Moreover, few of the methods consider the robustness of models, which makes hash codes unstable under disturbance. In this paper, we propose a new method named Comprehensive sImilarity Mining and cOnsistency learNing (CIMON). First, we use global refinement and similarity statistical distribution to obtain reliable and smooth guidance. Second, both semantic and contrastive consistency learning are introduced to derive both disturbance-invariant and discriminative hash codes. Extensive experiments on several benchmark datasets show that the proposed method outperforms a wide range of state-of-the-art methods in both retrieval performance and robustness.

翻訳日:2022-10-07 04:18:59 公開日:2021-08-21

# (参考訳) 屋内位置決めシステムにおける教師なし移動検出

Unsupervised Movement Detection in Indoor Positioning Systems ( http://arxiv.org/abs/2109.10757v1 )

ライセンス: CC BY 4.0

Jonathan Flossdorf, Anne Meyer, Dmitri Artjuch, Jaques Schneider, Carsten Jentsch

(参考訳) 近年では製造工程における室内位置決めシステムの利用が盛んになっている。通常、製造ホールはセンサーの位置データを受信する衛星を備えており、部品、荷積み機、産業用トラックに固定することができる。これにより、例えば企業が検索の労力を減らし、個々のシステムプロセスの最適化が可能になる。本研究の文脈では,センサは移動時にのみ位置情報を送信する。しかし、周囲の要因が乱れるなど、様々な状況がデータ送信に好ましくない影響をしばしば与えている。これは、データ品質、エネルギー消費、システム全体の信頼性に悪影響を及ぼす。そこで本研究では,室内システムの騒音や測定誤差の影響を受けやすいため,好ましくない信号と実際の動きを区別することを目的としている。そこで,本課題に適した2つの非教師なし分類アルゴリズムを提案する。興味のある問題によっては、それらは距離ベースか時間ベースの基準に依存しており、すべての必須情報を利用することができる。さらに,両方の分類を結合し,それらを空間生産領域に集約する手法を提案する。これにより、位置データのみを用いて、下層のプロダクションホールの包括的なマップを生成することができる。基盤となる移動構造の分析と検出は別として、利用者は自身のシステムプロセスのより良い理解と、より効率的な位置決めシステムの使用につながる問題のあるシステム領域の検出から恩恵を受ける。全ての手法は教師なしの技術で構築されているため、実際は手動で適用でき、位置決めシステムの出力データ以上の情報を必要としない。

In recent years, the usage of indoor positioning systems for manufacturing processes has become increasingly popular. Typically, the production hall is equipped with satellites which receive position data of sensors that can be pinned on components, load carriers or industrial trucks. This enables a company e.g. to reduce search efforts and to optimize individual system processes. In our research context, a sensor only sends position information when it is moved. However, various circumstances frequently cause data to be sent undesirably, e.g. due to disrupting factors nearby. This has a negative impact on the data quality, the energy consumption, and the reliability of the whole system. Motivated by this, we aim to distinguish between actual movements and signals that were undesirably sent, which is particularly challenging due to the susceptibility of indoor systems to noise and measuring errors. Therefore, we propose two novel unsupervised classification algorithms suitable for this task. Depending on the question of interest, they rely either on a distance-based or on a time-based criterion, which makes it possible to use all essential information. Furthermore, we propose an approach to combine both classifications and to aggregate them on spatial production areas. This enables us to generate a comprehensive map of the underlying production hall with the sole usage of the position data. Aside from the analysis and detection of the underlying movement structure, the user benefits from a better understanding of their own system processes and from the detection of problematic system areas, which leads to a more efficient usage of positioning systems. Since all our approaches are constructed with unsupervised techniques, they are readily applicable in practice and do not require more information than the output data of the positioning system.

翻訳日:2021-09-26 23:37:59 公開日:2021-08-21

# 深部畳み込みニューラルネットワークを高速化する数値精度に制限のある再構成可能なコプロセッサアーキテクチャ

Reconfigurable co-processor architecture with limited numerical precision to accelerate deep convolutional neural networks ( http://arxiv.org/abs/2109.03040v1 )

ライセンス: Link先を確認

Sasindu Wijeratne, Sandaruwan Jayaweera, Mahesh Dananjaya, Ajith Pasqual

(参考訳) 畳み込みニューラルネットワーク(CNN)は、視覚システムやロボット工学などのディープラーニングアプリケーションで広く使われている。しかし、既存のソフトウェアソリューションは効率的ではない。そのため、多くのハードウェアアクセラレーターが実装の性能、パワー、資源利用を最適化する提案がなされている。既存のソリューションの中で、FPGA(Field Programmable Gate Array)ベースのアーキテクチャは、スケーラビリティと開発時間の最小化とともに、より良いコスト-エネルギーパフォーマンスのトレードオフを提供します。本稿では,CNNを高速化するモデル非依存の再構成可能コプロセッシングアーキテクチャを提案する。我々のアーキテクチャは、最大データ並列性を利用するためのキャッシュ技術と相互接続ネットワークを備えた並列Multiply and Accumulate (MAC)ユニットで構成されている。既存の解とは対照的に、算術表現や演算のための限定精度32bit Q-format固定点量子化を導入する。その結果,我々のアーキテクチャは,競争精度で資源利用の大幅な削減を実現した。さらに,協調処理ファブリックにアクセスして層間並列性を管理するアセンブリ型マイクロインストラクションを開発し,限られた資源を再利用した。最後に、Xilinx Virtex 7 FPGA上で最大9x9のカーネルサイズをテストし、3x3カーネルサイズで最大226.2 GOp/Sのスループットを実現した。

Convolutional Neural Networks (CNNs) are widely used in deep learning applications, e.g. visual systems and robotics. However, existing software solutions are not efficient. Therefore, many hardware accelerators have been proposed to optimize the performance, power and resource utilization of the implementation. Amongst existing solutions, Field Programmable Gate Array (FPGA) based architectures provide better cost-energy-performance trade-offs as well as scalability and minimal development time. In this paper, we present a model-independent reconfigurable co-processing architecture to accelerate CNNs. Our architecture consists of parallel Multiply and Accumulate (MAC) units with caching techniques and interconnection networks to exploit maximum data parallelism. In contrast to existing solutions, we introduce limited precision 32-bit Q-format fixed point quantization for arithmetic representations and operations. As a result, our architecture achieved a significant reduction in resource utilization with competitive accuracy. Furthermore, we developed assembly-type microinstructions to access the co-processing fabric and manage layer-wise parallelism, thereby re-using limited resources. Finally, we have tested our architecture up to 9x9 kernel size on Xilinx Virtex 7 FPGA, achieving a throughput of up to 226.2 GOp/S for 3x3 kernel size.

翻訳日:2021-09-12 10:54:46 公開日:2021-08-21
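
The Q-format arithmetic and MAC operation mentioned in the abstract above can be sketched in software as follows. The abstract only states "32 bit Q-format"; the Q16.16 split (`FRAC_BITS = 16`), the saturation behavior, and all function names are assumptions made for illustration, not details of the authors' hardware.

```python
FRAC_BITS = 16                      # assumed Q16.16 split of the 32-bit word
Q_MIN, Q_MAX = -(1 << 31), (1 << 31) - 1


def saturate(raw):
    """Clamp an integer to the signed 32-bit range."""
    return max(Q_MIN, min(Q_MAX, raw))


def to_q(x):
    """Convert a float to fixed point (round to nearest, then saturate)."""
    return saturate(int(round(x * (1 << FRAC_BITS))))


def from_q(q):
    """Convert a fixed-point integer back to a float."""
    return q / (1 << FRAC_BITS)


def q_mac(acc, a, b):
    """Multiply-accumulate: the double-width product is shifted back by
    FRAC_BITS before being added to the accumulator."""
    return saturate(acc + ((a * b) >> FRAC_BITS))


def q_dot(xs, ws):
    """Dot product of two float lists computed entirely in fixed point."""
    acc = 0
    for x, w in zip(xs, ws):
        acc = q_mac(acc, to_q(x), to_q(w))
    return from_q(acc)
```

A convolution window is one such dot product per output pixel, so a bank of parallel MAC units maps directly onto this loop.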

# (参考訳) ディープラーニングに基づく正規化(安定化)再構成アルゴリズム

Regularizing (Stabilizing) Deep Learning Based Reconstruction Algorithms ( http://arxiv.org/abs/2108.13551v1 )

ライセンス: CC0 1.0

Abinash Nayak

(参考訳) 逆問題は不適切であり、それを有意義に解くには正規化法を使わなければならないことはよく知られている。伝統的に、一般的な正規化法はペナルティ化された変分アプローチである。近年、古典的正規化再構成アプローチは(深層学習に基づく)学習的再構成アルゴリズムによって凌駕されている。しかし、従来の正則化法とは異なり、安定性や正則化といった理論的な基盤は、そのような学習された再構成アルゴリズムでは不十分である。したがって、これらのアルゴリズムから得られた結果は、経験的に優れているが、学習プロセスから生じる特定の不安定性や(幻覚的な)特徴を含むため、常に完全に信頼されるとは限らない。実際、このような学習アルゴリズムは、データ内の小さな(敵対的)ノイズに非常に影響を受けやすく、回収された解に深刻な不安定性をもたらすことが示されており、これは、不適切な(逆)問題の本質的な不安定性とは全く異なる可能性がある。一方、古典正規化法はそのような(敵対的)ノイズをうまく処理することができ、安定した回復をもたらす。そこで我々は,このような(不安定な)学習的再構成手法を安定化し,敵対的ノイズの存在下でも正規化解を回復するための一定の正規化手法を提案する。そのため、古典的な正規化の概念を拡張し、学習された再構成アルゴリズムに組み込む必要がある。また,最も一般的な学習再構成アルゴリズムである学習型後処理再構成(Learned Post-Processing Reconstruction)と学習型アンローリング再構成(Learned Unrolling Reconstruction)の2つを正規化するための正規化手法を提案する。

It's well-known that inverse problems are ill-posed and to solve them meaningfully one has to employ regularization methods. Traditionally, popular regularization methods have been the penalized variational approaches. In recent years, the classical regularized-reconstruction approaches have been outclassed by the (deep-learning-based) learned reconstruction algorithms. However, unlike the traditional regularization methods, the theoretical underpinnings, such as stability and regularization, have been insufficient for such learned reconstruction algorithms. Hence, the results obtained from such algorithms, though empirically outstanding, can't always be completely trusted, as they may contain certain instabilities or (hallucinated) features arising from the learned process. In fact, it has been shown that such learning algorithms are very susceptible to small (adversarial) noise in the data and can lead to severe instabilities in the recovered solution, which can be quite different from the inherent instabilities of the ill-posed (inverse) problem. By contrast, the classical regularization methods can handle such (adversarial) noise very well and can produce stable recovery. Here, we try to present certain regularization methods to stabilize such (unstable) learned reconstruction methods and recover a regularized solution, even in the presence of adversarial noise. For this, we need to extend the classical notion of regularization and incorporate it in the learned reconstruction algorithms. We also present some regularization techniques to regularize two of the most popular learning reconstruction algorithms, the Learned Post-Processing Reconstruction and the Learned Unrolling Reconstruction.

翻訳日:2021-09-05 10:24:48 公開日:2021-08-21
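
The penalized variational approach mentioned in the abstract above is commonly written as a Tikhonov-type minimization. The symbols below ($A$ for the forward operator, $y^\delta$ for the noisy data with noise level $\delta$, $\mathcal{R}$ for the penalty, $\alpha$ for the regularization parameter) are standard textbook notation assumed here for illustration; they are not taken from the paper.

```latex
x_\alpha^\delta \in \operatorname*{arg\,min}_{x} \; \| A x - y^\delta \|^2 + \alpha \, \mathcal{R}(x),
\qquad \alpha > 0, \quad \| y - y^\delta \| \le \delta .
```

Stability in this setting means that $x_\alpha^\delta$ depends continuously on $y^\delta$ for fixed $\alpha$, which is the property the abstract says learned reconstruction algorithms may lack.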

# ディープラーニングを用いた認知症知識発見のための弾性ネット正規化の新しい解法

A Novel Solution of an Elastic Net Regularization for Dementia Knowledge Discovery using Deep Learning ( http://arxiv.org/abs/2109.00896v1 )

ライセンス: Link先を確認

Kshitiz Shrestha, Omar Hisham Alsadoon, Abeer Alsadoon, Tarik A. Rashid, Rasha S. Ali, P.W.C. Prasad, Oday D. Jerew

(参考訳) 背景と目的:MRIの正確な分類は、軽度認知障害(MCI)からアルツハイマー病(AD)への変換を正確に予測するために不可欠である。一方、ディープラーニングは認知症病の分類と予測に成功している。しかし,MRI画像分類の精度は低い。本稿では,特徴選択におけるElastic Net Regularizationを用いて,ディープラーニングアーキテクチャによる分類の精度を高め,処理時間を短縮することを目的とする。方法論:本システムは,弾性ネット正規化を用いた分類と予測の精度を高めるために,畳み込みニューラルネットワーク(cnn)から構成される。当初、MRI画像はCNNに入力され、プール層と交互に畳み込み層を通して機能を抽出し、それから完全に接続された層を通して抽出される。その後、抽出した特徴を原理成分分析(pca)と弾性ネット正規化により特徴選択を行う。最後に、選択した特徴を、MRI画像の分類のためのExtreme Machine Learning (EML)への入力として使用する。結果: 提案手法の精度は現在のシステムよりも優れていることが示された。さらに,提案手法では,分類精度を平均で5%向上させ,処理時間を平均で30秒から40秒短縮した。結論:提案システムは,MCIコンバータ/非コンバータ分類の精度と処理時間の改善に重点を置いている。 CNN、FreeSurfer、PCA、Elastic Net、Extreme Machine Learningを使った機能抽出、機能選択、分類で構成されている。最後に,本研究は弾性ネット正則化を用いて精度と処理時間を向上し,分類に重要な特徴を提供する。

Background and Aim: Accurate classification of Magnetic Resonance Images (MRI) is essential to accurately predict Mild Cognitive Impairment (MCI) to Alzheimer's Disease (AD) conversion. Meanwhile, deep learning has been successfully implemented to classify and predict dementia disease. However, the accuracy of MRI image classification is low. This paper aims to increase the accuracy and reduce the processing time of classification through Deep Learning Architecture by using Elastic Net Regularization in Feature Selection. Methodology: The proposed system consists of a Convolutional Neural Network (CNN) to enhance the accuracy of classification and prediction by using Elastic Net Regularization. Initially, the MRI images are fed into the CNN for feature extraction through convolutional layers alternating with pooling layers, and then through a fully connected layer. After that, the extracted features are subjected to Principal Component Analysis (PCA) and Elastic Net Regularization for feature selection. Finally, the selected features are used as an input to Extreme Machine Learning (EML) for the classification of MRI images. Results: The result shows that the accuracy of the proposed solution is better than the current system. In addition to that, the proposed method has improved the classification accuracy by 5% on average and reduced the processing time by 30 ~ 40 seconds on average. Conclusion: The proposed system is focused on improving the accuracy and processing time of MCI converters/non-converters classification. It consists of feature extraction, feature selection, and classification using CNN, FreeSurfer, PCA, Elastic Net, Extreme Machine Learning. Finally, this study enhances the accuracy and the processing time by using Elastic Net Regularization, which provides important selected features for classification.

翻訳日:2021-09-05 08:54:04 公開日:2021-08-21
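
How elastic net regularization performs the feature selection mentioned in the abstract above can be sketched with its closed-form coordinate update for standardized features: the L1 part thresholds weak features to exactly zero and the L2 part shrinks the rest. The parameter names (`lam`, `alpha`), the correlation values, and the helper names are hypothetical; this is the generic textbook update, not the authors' pipeline.

```python
def soft_threshold(rho, t):
    """Shrink rho toward zero by t; values within [-t, t] become exactly 0."""
    if rho > t:
        return rho - t
    if rho < -t:
        return rho + t
    return 0.0


def elastic_net_coef(rho, lam, alpha):
    """Coordinate update for one standardized feature.

    rho   : correlation of the feature with the current residual
    lam   : overall penalty strength
    alpha : mix between L1 (alpha = 1) and L2 (alpha = 0)
    """
    return soft_threshold(rho, lam * alpha) / (1.0 + lam * (1.0 - alpha))


def select_features(rhos, lam, alpha):
    """Indices of features whose coefficient is not zeroed by the penalty."""
    return [i for i, rho in enumerate(rhos)
            if elastic_net_coef(rho, lam, alpha) != 0.0]


# Hypothetical feature-residual correlations for three extracted features.
selected = select_features([1.05, 0.03, -0.4], lam=0.1, alpha=0.5)
```

Only the features that survive the threshold would be passed on to the classifier, which is where the reduction in processing time would come from.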

# 資源制約付きエッジコンピューティングシステムのための教師あり圧縮

Supervised Compression for Resource-constrained Edge Computing Systems ( http://arxiv.org/abs/2108.11898v1 )

ライセンス: Link先を確認

Yoshitomo Matsubara, Ruihan Yang, Marco Levorato, Stephan Mandt

(参考訳) スマートフォンやドローン、医療センサーなど、低消費電力のデバイスにディープラーニングアルゴリズムをデプロイすることに関心がある。しかし、フルスケールのディープニューラルネットワークはエネルギーとストレージの面で資源集約的すぎることが多い。そのため、データを圧縮して送信するエッジサーバでは、機械学習操作のバルク部が頻繁に実行される。しかし、データ(画像など)を圧縮すると、監視されたタスクとは無関係な情報を送信する。もうひとつの一般的なアプローチは、中間機能を圧縮しながらデバイスとサーバの間にディープネットワークを分割することである。しかし、これまでのところ、これらの分割コンピューティング戦略は、機能圧縮に対する非効率なアプローチのため、前述のナイーブなデータ圧縮ベースラインをわずかに上回っている。本稿では、知識蒸留とニューラルイメージ圧縮のアイデアを採用し、中間特徴表現をより効率的に圧縮する。教師モデルと生徒モデルを用いて,エントロピー符号化に先立って確率的ボトルネックと学習可能な圧縮手法を開発した。 3つのビジョンタスクにおいて,我々のアプローチを様々なニューラルイメージと特徴圧縮ベースラインと比較し,より小さなレイテンシを維持しながら,教師付きレートゆがみ性能を向上できることを見出した。さらに、学習した特徴表現が複数の下流タスクに役立てるように調整可能であることを示す。

There has been much interest in deploying deep learning algorithms on low-powered devices, including smartphones, drones, and medical sensors. However, full-scale deep neural networks are often too resource-intensive in terms of energy and storage. As a result, the bulk of the machine learning operation is often carried out on an edge server, where the data is compressed and transmitted. However, compressing data (such as images) leads to transmitting information irrelevant to the supervised task. Another popular approach is to split the deep network between the device and the server while compressing intermediate features. To date, however, such split computing strategies have barely outperformed the aforementioned naive data compression baselines due to their inefficient approaches to feature compression. This paper adopts ideas from knowledge distillation and neural image compression to compress intermediate feature representations more efficiently. Our supervised compression approach uses a teacher model and a student model with a stochastic bottleneck and learnable prior for entropy coding. We compare our approach to various neural image and feature compression baselines in three vision tasks and find that it achieves better supervised rate-distortion performance while also maintaining smaller end-to-end latency. We furthermore show that the learned feature representations can be tuned to serve multiple downstream tasks.

翻訳日:2021-08-29 12:13:24 公開日:2021-08-21

# Curricular SincNet:潜時空間におけるハードサンプル強調によるロバストディープ話者認識に向けて

Curricular SincNet: Towards Robust Deep Speaker Recognition by Emphasizing Hard Samples in Latent Space ( http://arxiv.org/abs/2108.10714v1 )

ライセンス: Link先を確認

Labib Chowdhury, Mustafa Kamal, Najia Hasan and Nabeel Mohammed

(参考訳) ディープラーニングモデルは、話者認識などの生体認証システムにおいて、ますます好まれる選択肢となっている。ディープニューラルネットワークアーキテクチャであるSincNetは、音声信号を直接処理できるパラメータ化されたシンク関数のために、話者認識タスクで人気を博した。オリジナルのSincNetアーキテクチャはsoftmaxロスを使っているが、認識ベースのタスクには最適ではないかもしれない。このような損失関数はクラス間マージンを課したり、簡単なトレーニングサンプルと難しいトレーニングサンプルを区別したりしない。カリキュラム学習、特に角マージンに基づく損失を利用した学習は、顔認識などの他の生体計測応用において非常に成功した。このようなカリキュラム学習に基づくテクニックの利点は、クラス間マージンを課すだけでなく、簡単でハードなサンプルを考慮に入れることだ。本稿では,カリキュラム損失関数を用いてSincNetアーキテクチャを学習する、SincNetモデルの改良版であるCurricular SincNet (CL-SincNet)を提案する。提案モデルは,データセット内およびデータセット間評価プロトコルを用いて,複数のデータセット上で評価される。どちらの設定でも、モデルは以前に公開された他の作業と競合する。データセット間テストの場合、SincNetや他の公開作業と比較して、エラー率を4%削減し、全体的な結果が最も良い。

Deep learning models have become an increasingly preferred option for biometric recognition systems, such as speaker recognition. SincNet, a deep neural network architecture, gained popularity in speaker recognition tasks due to its parameterized sinc functions that allow it to work directly on the speech signal. The original SincNet architecture uses the softmax loss, which may not be the most suitable choice for recognition-based tasks. Such loss functions do not impose inter-class margins nor differentiate between easy and hard training samples. Curriculum learning, particularly approaches leveraging angular margin-based losses, has proven very successful in other biometric applications such as face recognition. The advantage of such curriculum learning-based techniques is that they impose inter-class margins as well as taking into account easy and hard samples. In this paper, we propose Curricular SincNet (CL-SincNet), an improved SincNet model where we use a curricular loss function to train the SincNet architecture. The proposed model is evaluated on multiple datasets using intra-dataset and inter-dataset evaluation protocols. In both settings, the model performs competitively with other previously published work. In the case of inter-dataset testing, it achieves the best overall results with a 4% reduction in error rate compared to SincNet and other published work.

翻訳日:2021-08-25 14:05:55 公開日:2021-08-21

# (参考訳) 顕著物体検出のためのマルチスケールエッジベースU字型ネットワーク

Multi-scale Edge-based U-shape Network for Salient Object Detection ( http://arxiv.org/abs/2108.09408v1 )

ライセンス: CC BY 4.0

Han Sun, Yetong Bian, Ningzhong Liu, Huiyu Zhou

(参考訳) ディープラーニングベースのサルエントオブジェクト検出手法は、大きな改善を達成している。しかし,不適切な特徴抽出と統合が主な原因である,ぼやけた境界や不正確な位置などの予測にはまだ問題が残っている。本稿では,様々な機能を異なるスケールで統合し,より優れた性能を実現するマルチスケールエッジベースu-shape network(meun)を提案する。境界予測に有用な情報を抽出するために、各デコーダユニットにU字形エッジネットワークモジュールを埋め込む。さらに、追加のダウンサンプリングモジュールは位置の不正確さを緩和する。 4つのベンチマークデータセットの実験結果から,提案手法の有効性と信頼性が示された。マルチスケールのエッジベースのu字型ネットワークは、15の最先端のオブジェクト検出方法と比べても優れている。

Deep-learning based salient object detection methods have achieved great improvements. However, the predictions still suffer from problems such as blurry boundaries and inaccurate locations, which are mainly caused by inadequate feature extraction and integration. In this paper, we propose a Multi-scale Edge-based U-shape Network (MEUN) to integrate various features at different scales to achieve better performance. To extract more useful information for boundary prediction, U-shape Edge Network modules are embedded in each decoder unit. Moreover, the additional down-sampling module alleviates the location inaccuracy. Experimental results on four benchmark datasets demonstrate the validity and reliability of the proposed method. The Multi-scale Edge-based U-shape Network also shows its superiority when compared with 15 state-of-the-art salient object detection methods.

翻訳日:2021-08-25 10:04:30 公開日:2021-08-21

# (参考訳) 2020年米大統領選挙:Twitterで女性ユーザーと男性ユーザーの分析

2020 U.S. Presidential Election: Analysis of Female and Male Users on Twitter ( http://arxiv.org/abs/2108.09416v1 )

ライセンス: CC BY 4.0

Amir Karami, Spring B. Clark, Anderson Mackenzie, Dorathea Lee, Michael Zhu, Hannah R. Boyajieff, Bailey Goldschmidt

(参考訳) ソーシャルメディアは、選挙運動において、様々な問題について意見を表明するために一般に使用される。様々なソーシャルメディアチャンネルの中で、Twitterは研究者や政治家が経済や外交政策など幅広いトピックに関する世論を探るための効率的なプラットフォームを提供している。現在の文献は、主にユーザーの性別を考慮せずにツイートの内容を分析することに焦点を当てている。この研究は、大量のツイートを収集、分析し、計算、ヒューマンコーディング、統計分析を用いて、2020年のアメリカ合衆国大統領選挙中に投稿された30万以上のツイートのトピックを識別し、トピックの平均重量について女性と男性のユーザーを比較する。私たちの発見は、税や気候変動、新型コロナウイルス(covid-19)パンデミックなど、幅広いトピックに基づいています。トピックのうち,70%以上のトピックにおいて,女性ユーザと男性ユーザの間に有意な違いがある。本研究のアプローチは情報学,政治学,コミュニケーション学の分野での研究に役立ち,政治運動によって世論のジェンダーに基づく理解を得るのに有効である。

Social media is commonly used by the public during election campaigns to express their opinions regarding different issues. Among various social media channels, Twitter provides an efficient platform for researchers and politicians to explore public opinion regarding a wide range of topics such as economy and foreign policy. Current literature mainly focuses on analyzing the content of tweets without considering the gender of users. This research collects and analyzes a large number of tweets and uses computational, human coding, and statistical analyses to identify topics in more than 300,000 tweets posted during the 2020 U.S. presidential election and to compare female and male users regarding the average weight of the topics. Our findings are based upon a wide range of topics, such as tax, climate change, and the COVID-19 pandemic. Out of the topics, there exists a significant difference between female and male users for more than 70% of topics. Our research approach can inform studies in the areas of informatics, politics, and communication, and it can be used by political campaigns to obtain a gender-based understanding of public opinion.

翻訳日:2021-08-25 09:53:13 公開日:2021-08-21

# (参考訳) L3C-Stereo:ステレオ画像のロスレス圧縮

L3C-Stereo: Lossless Compression for Stereo Images ( http://arxiv.org/abs/2108.09422v1 )

ライセンス: CC BY 4.0

Zihao Huang, Zhe Sun, Feng Duan, Andrzej Cichocki, Peiying Ruan and Chao Li

(参考訳) 多数の自動運転タスクには高精細なステレオ画像が必要であり、大量のストレージスペースを必要とする。効率よく無損失圧縮を実行することが現実的な問題となっている。一般に、各画素の正確な確率推定を行うのは難しい。そこで本稿では, ワープモジュールと確率推定モジュールの2つの主要モジュールからなるマルチスケールロスレス圧縮モデルであるL3C-Stereoを提案する。ワープモジュールは、同じドメインからの2つのビュー特徴写像を利用して、適切なビューを再構成し、正しいビューの確率推定の信頼性を向上させるために使用される不均一マップを生成する。確率推定モジュールは、適応算術符号化のための画素単位のロジスティック混合分布を提供する。実験では,3つのデータセットすべてにおいて,手作り圧縮法と学習ベース法を上回った。そして, 最大偏差が圧縮効果を向上させることを示す。さらに,本モデルの圧縮特性により,後続のステレオタスクに対して許容される品質の差マップを自然に生成する。

A large number of autonomous driving tasks need high-definition stereo images, which requires a large amount of storage space. Efficiently executing lossless compression has become a practical problem. Commonly, it is hard to make accurate probability estimates for each pixel. To tackle this, we propose L3C-Stereo, a multi-scale lossless compression model consisting of two main modules: the warping module and the probability estimation module. The warping module takes advantage of two view feature maps from the same domain to generate a disparity map, which is used to reconstruct the right view so as to improve the confidence of the probability estimate of the right view. The probability estimation module provides pixel-wise logistic mixture distributions for adaptive arithmetic coding. In the experiments, our method outperforms the hand-crafted compression methods and the learning-based method on all three datasets used. Then, we show that a better maximum disparity can lead to a better compression effect. Furthermore, thanks to a compression property of our model, it naturally generates a disparity map of an acceptable quality for the subsequent stereo tasks.
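
To make the role of a pixel-wise logistic mixture concrete, here is a minimal sketch of a discretized logistic mixture over 8-bit pixel values, together with the ideal code length an arithmetic coder would approach. It is an illustration under stated assumptions, not the L3C-Stereo implementation: the mixture parameters are hypothetical, and in a learned model they would be predicted per pixel by the network.

```python
import math

def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def logistic_mixture_pmf(weights, means, scales, levels=256):
    """Probability of each integer value 0..levels-1 under a logistic mixture.

    Each component's CDF is differenced over the bin [x - 0.5, x + 0.5];
    the first and last bins absorb the tails so the result sums to 1.
    """
    pmf = []
    for x in range(levels):
        p = 0.0
        for w, mu, s in zip(weights, means, scales):
            upper = 1.0 if x == levels - 1 else sigmoid((x + 0.5 - mu) / s)
            lower = 0.0 if x == 0 else sigmoid((x - 0.5 - mu) / s)
            p += w * (upper - lower)
        pmf.append(p)
    return pmf

def ideal_code_length_bits(pmf, symbol):
    """Bits an ideal entropy coder spends on `symbol` under `pmf`."""
    return -math.log2(pmf[symbol])

# Hypothetical two-component mixture for a single pixel.
pmf = logistic_mixture_pmf(weights=[0.7, 0.3], means=[100.0, 180.0], scales=[5.0, 10.0])
likely_bits = ideal_code_length_bits(pmf, 100)
unlikely_bits = ideal_code_length_bits(pmf, 10)
```

The sharper the predicted distribution around the true pixel value, the fewer bits the coder needs, which is why better probability estimates translate directly into better lossless compression.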

翻訳日:2021-08-25 09:40:25 公開日:2021-08-21

# (参考訳) 腫瘍内パーティショニングのための特徴表現を増強した適応的教師なし学習とグリオ芽腫の生存予測

Adaptive unsupervised learning with enhanced feature representation for intra-tumor partitioning and survival prediction for glioblastoma ( http://arxiv.org/abs/2108.09423v1 )

ライセンス: CC BY 4.0

Yifan Li, Chao Li, Yiran Wei, Stephen Price, Carola-Bibiane Schönlieb, Xi Chen

(参考訳) グリオ芽腫は局所的な微細構造と血管に非常に異質である。グリオブラスト腫の空間的多様性はより正確な治療につながる可能性がある。教師なし学習法では,Glioblastoma MRI由来の放射線学的特徴が腫瘍亜領域のセグメンテーションや生存予測に広く利用されている。しかし、アルゴリズムの結果の信頼性は、あいまいな中間過程と、クラスタリングアルゴリズムのランダム性、特に異種患者のデータによってもたらされる不安定性の両方によってしばしば問題となる。本稿では, 腫瘍内パーティショニングとグリオーマ生存予測のための適応型非教師なし学習手法を提案する。 K-meansのような教師なし学習アルゴリズムのクラスタリング安定性を向上させるために,新規かつ問題特異的な自動エンコーダ(FAE)を開発した。さらに、プロセス全体をベイズ最適化(BO)技法でモデル化し、ハイパーパラメータを適度な数ステップで適応的に最適化することができるようにした。その結果,提案手法はロバストで臨床的に関連するmriサブリージョンと統計的に有意な生存予測を生成できることがわかった。

Glioblastoma is profoundly heterogeneous in regional microstructure and vasculature. Characterizing the spatial heterogeneity of glioblastoma could lead to more precise treatment. With unsupervised learning techniques, glioblastoma MRI-derived radiomic features have been widely utilized for tumor sub-region segmentation and survival prediction. However, the reliability of algorithm outcomes is often challenged by both ambiguous intermediate processes and the instability introduced by the randomness of clustering algorithms, especially for data from heterogeneous patients. In this paper, we propose an adaptive unsupervised learning approach for efficient MRI intra-tumor partitioning and glioblastoma survival prediction. A novel and problem-specific Feature-enhanced Auto-Encoder (FAE) is developed to enhance the representation of pairwise clinical modalities and therefore improve the clustering stability of unsupervised learning algorithms such as K-means. Moreover, the entire process is modelled by the Bayesian optimization (BO) technique with a custom loss function so that the hyper-parameters can be adaptively optimized in reasonably few steps. The results demonstrate that the proposed approach can produce robust and clinically relevant MRI sub-regions and statistically significant survival predictions.

翻訳日:2021-08-25 09:20:31 公開日:2021-08-21

# (参考訳) ARAPReg: 変形可能な形状生成器を学習するためのAs-Rigid-As-Possible正則化損失

ARAPReg: An As-Rigid-As Possible Regularization Loss for Learning Deformable Shape Generators ( http://arxiv.org/abs/2108.09432v1 )

ライセンス: CC BY 4.0

Qixing Huang, Xiangru Huang, Bo Sun, Zaiwei Zhang, Junfeng Jiang and Chandrajit Bajaj

(参考訳) 本稿では,パラメトリック変形形状生成器の訓練のための教師なし損失について述べる。鍵となる考え方は、生成した形状間の局所剛性の保存を強制することである。本手法は,as-rigid-as possible (または arap) 変形エネルギーの近似に基づく。本稿では,ARAPエネルギーのヘシアンスペクトル分解による教師なし損失の展開について述べる。私たちの損失は、強固な規範を通してポーズと形の変化をうまく分離します。損失は単純な閉形式表現を許容する。訓練が容易で、可変オートエンコーダ(VAE)やオートデコーダ(AD)など、任意の標準世代モデルにプラグインすることができる。実験の結果,人間,動物,骨といった様々な形状カテゴリの公開ベンチマークデータセットにおいて,既存の形状生成アプローチをかなり上回っていることがわかった。

This paper introduces an unsupervised loss for training parametric deformation shape generators. The key idea is to enforce the preservation of local rigidity among the generated shapes. Our approach builds on an approximation of the as-rigid-as-possible (ARAP) deformation energy. We show how to develop the unsupervised loss via a spectral decomposition of the Hessian of the ARAP energy. Our loss nicely decouples pose and shape variations through a robust norm. The loss admits simple closed-form expressions. It is easy to train and can be plugged into any standard generative model, e.g., a variational auto-encoder (VAE) or an auto-decoder (AD). Experimental results show that our approach outperforms existing shape generation approaches considerably on public benchmark datasets of various shape categories such as human, animal and bone.

翻訳日:2021-08-25 09:07:38 公開日:2021-08-21

# (参考訳) deepedgebench: エッジデバイス上のディープニューラルネットワークのベンチマーク

DeepEdgeBench: Benchmarking Deep Neural Networks on Edge Devices ( http://arxiv.org/abs/2108.09457v1 )

ライセンス: CC BY 4.0

Stephan Patrick Baller, Anshul Jindal, Mohak Chadha, Michael Gerndt

(参考訳) EdgeAI(Edgeコンピューティングベースの人工知能)は、厳しいレイテンシ要件を満たすために、多種多様な分散AIアプリケーションを扱うために、ここ数年、最も活発に研究されている。一方、多くの企業は、エッジコンピューティング環境で計算ノードとして機能するために、人気のRaspberry PiやNvidiaのJetson Nanoのような、フォームファクタ(消費電力とリソースの制限)の少ないエッジデバイスをリリースしている。エッジデバイスはコンピューティングのパワーとハードウェアのリソースで制限されているが、パフォーマンスを向上させるためにアクセラレーターによって駆動される。したがって、AIベースのDeep Neural Networksが限られたリソースを持つデバイス上でどのように機能するかは興味深い。本研究では,Asus Tinker Edge R, Raspberry Pi 4, Google Coral Dev Board, Nvidia Jetson Nano, そして1つのマイクロコントローラであるArduino Nano 33 BLEを,異なるディープラーニングモデルとフレームワーク上で,チップ上での4つのシステム(SoC)の推論時間と消費電力で比較した。また,装置の消費電力,推定時間,精度を計測し,他の機器に容易に拡張できる方法を提案する。我々の結果は、Tensorflowベースの量子化モデルでは、Google Coral Dev Boardが推論時間と消費電力の両方で最高のパフォーマンスを提供します。計算時間の少ない部分、すなわち、計算時間 MobileNetV2の29.3%以下では、Jetson Nanoは他のデバイスよりも高速に動作している。

EdgeAI (Edge computing based Artificial Intelligence) has been actively researched over the last few years to handle a variety of massively distributed AI applications and meet strict latency requirements. Meanwhile, many companies have released edge devices with smaller form factors (low power consumption and limited resources), like the popular Raspberry Pi and Nvidia's Jetson Nano, to act as compute nodes in edge computing environments. Although the edge devices are limited in terms of computing power and hardware resources, they are powered by accelerators to enhance their performance. Therefore, it is interesting to see how AI-based Deep Neural Networks perform on such devices with limited resources. In this work, we present and compare the performance in terms of inference time and power consumption of four Systems on a Chip (SoCs): Asus Tinker Edge R, Raspberry Pi 4, Google Coral Dev Board, Nvidia Jetson Nano, and one microcontroller: Arduino Nano 33 BLE, on different deep learning models and frameworks. We also provide a method for measuring power consumption, inference time and accuracy for the devices, which can be easily extended to other devices. Our results showcase that, for TensorFlow-based quantized models, the Google Coral Dev Board delivers the best performance, both for inference time and power consumption. For a low fraction of inference computation time, i.e. less than 29.3% of the time for MobileNetV2, the Jetson Nano performs faster than the other devices.

翻訳日:2021-08-25 08:44:45 公開日:2021-08-21

# (参考訳) 教師なしドメイン適応のためのロバスト組立ネットワーク

Robust Ensembling Network for Unsupervised Domain Adaptation ( http://arxiv.org/abs/2108.09473v1 )

ライセンス: CC BY 4.0

Han Sun, Lei Lin, Ningzhong Liu, Huiyu Zhou

(参考訳) 近年,unsupervised domain adaptation (uda)問題に対処するために,転送可能なモデルを実現するための広範な研究が提案されている。その中でも最も一般的な手法は、ソースドメインとターゲットドメイン間の距離を短くする、逆領域適応法である。敵対的学習は非常に効果的であるが、ネットワークの不安定性と混乱したカテゴリ情報の欠点につながる。本稿では,情報伝達のためのグローバル情報学習にロバストな時間センシング教師ネットワークを適用した,udaのためのロバストセンシングネットワーク (ren) を提案する。具体的には、主に教師ネットワークと生徒ネットワークを含み、標準ドメイン適応トレーニングを行い、教師ネットワークの重みを更新する。さらに, 判別器の能力を向上させるために, 二重ネットワーク条件付き対向損失を提案する。最後に,学生ネットワークの基本能力を向上させるために,学生ネットワークと教師ネットワークの誤りのバランスをとるために,一貫性制約を利用する。いくつかのUDAデータセットに対する大規模な実験結果は、他の最先端UDAアルゴリズムと比較することにより、我々のモデルの有効性を実証した。

Recently, in order to address the unsupervised domain adaptation (UDA) problem, extensive studies have been proposed to achieve transferable models. Among them, the most prevalent method is adversarial domain adaptation, which can shorten the distance between the source domain and the target domain. Although adversarial learning is very effective, it still leads to the instability of the network and the drawbacks of confusing category information. In this paper, we propose a Robust Ensembling Network (REN) for UDA, which applies a robust time ensembling teacher network to learn global information for domain transfer. Specifically, REN mainly includes a teacher network and a student network; the student performs standard domain adaptation training and updates the weights of the teacher network. In addition, we also propose a dual-network conditional adversarial loss to improve the ability of the discriminator. Finally, for the purpose of improving the basic ability of the student network, we utilize the consistency constraint to balance the error between the student network and the teacher network. Extensive experimental results on several UDA datasets have demonstrated the effectiveness of our model in comparison with other state-of-the-art UDA algorithms.
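
The abstract does not spell out how the student updates the teacher. A common way to realise a time-ensembled teacher is an exponential moving average (EMA) of the student's weights, paired with a consistency penalty between the two networks' predictions. The sketch below assumes exactly that; the function names, the decay value and the toy weights are hypothetical and are not taken from the paper.

```python
def ema_update(teacher, student, decay=0.99):
    """Return new teacher weights: decay * teacher + (1 - decay) * student.

    Both arguments are dicts mapping a parameter name to a float (a stand-in
    for real tensors), so the teacher is a running average of past students.
    """
    return {name: decay * teacher[name] + (1.0 - decay) * student[name]
            for name in teacher}

def consistency_loss(student_probs, teacher_probs):
    """Mean squared difference between two predicted probability vectors."""
    diffs = [(s - t) ** 2 for s, t in zip(student_probs, teacher_probs)]
    return sum(diffs) / len(diffs)

# Toy run: the teacher drifts toward a fixed student over three updates.
teacher = {"w": 0.0}
student = {"w": 1.0}
for _ in range(3):
    teacher = ema_update(teacher, student, decay=0.5)

penalty = consistency_loss([0.9, 0.1], [0.7, 0.3])
```

Because the teacher averages over many training steps, its predictions fluctuate less than the student's, which is the usual motivation for using it as a stabilising target.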

翻訳日:2021-08-25 08:19:59 公開日:2021-08-21

# (参考訳) MimicBot:ImitationとReinforcement Learningを組み合わせてBot Bowlで優勝

MimicBot: Combining Imitation and Reinforcement Learning to win in Bot Bowl ( http://arxiv.org/abs/2108.09478v1 )

ライセンス: CC BY 4.0

Nicola Pezzotti

(参考訳) 本稿では,Bot Bowl IIIコンペティションに参加したFantasy Football AIでプレイするように訓練されたハイブリッドエージェントについて述べる。エージェントであるMimicBotは、特別に設計されたディープポリシーネットワークを使用して実装され、模倣と強化学習の組み合わせを使って訓練される。このような文脈で強化学習アプローチを用いた以前の試みは、いくつかの理由で失敗した。環境に内在するランダム性と、利用可能なアクションの数が大きくて不均一であるため、カリキュラム学習アプローチは、ランダムにプレイするエージェントを一貫して打ち負かせない。現在、機械学習のアプローチは、ゲーム上のドメイン知識を利用するスクリプトボットを打ち負かすことはできない。私たちのソリューションは、模倣学習とハイブリッド意思決定プロセスのおかげで、一貫してこのようなスクリプトエージェントを破ります。さらに,強化学習環境において,サンプル効率を劇的に向上させながら,より効率的にトレーニングする方法を考察した。 MimicBotはBot Bowl IIIコンペティションの勝者であり、現在最先端のソリューションである。

This paper describes a hybrid agent trained to play in Fantasy Football AI which participated in the Bot Bowl III competition. The agent, MimicBot, is implemented using a specifically designed deep policy network and trained using a combination of imitation and reinforcement learning. Previous attempts at using a reinforcement learning approach in this context failed for a number of reasons, e.g. due to the intrinsic randomness in the environment and the large and uneven number of actions available, with a curriculum learning approach failing to consistently beat a randomly playing agent. Currently no machine learning approach can beat a scripted bot which makes use of domain knowledge of the game. Our solution, thanks to imitation learning and a hybrid decision-making process, consistently beats such scripted agents. Moreover, we shed light on how to train more efficiently in a reinforcement learning setting while drastically increasing sample efficiency. MimicBot is the winner of the Bot Bowl III competition, and it is currently the state-of-the-art solution.

翻訳日:2021-08-25 08:08:10 公開日:2021-08-21

# (参考訳) Grid-VLP:ビジョンランゲージ事前トレーニングのためのグリッド機能の再検討

Grid-VLP: Revisiting Grid Features for Vision-Language Pre-training ( http://arxiv.org/abs/2108.09479v1 )

ライセンス: CC BY 4.0

Ming Yan, Haiyang Xu, Chenliang Li, Bin Bi, Junfeng Tian, Min Gui and Wei Wang

(参考訳) 視覚言語前訓練(vlp)に対する既存のアプローチは、境界ボックス(領域)に基づいた物体検出器に強く依存しており、最初に画像からサルエントオブジェクトを検出し、その後、トランスフォーマティブベースのモデルを使用してクロスモーダル融合を行う。優れた性能にもかかわらず、これらのアプローチは有効性と効率の両面で対象検出器の能力に縛られている。さらに、オブジェクト検出の存在はモデル設計に不必要な制約を課し、エンドツーエンドのトレーニングをサポートするのが難しくなる。本稿では,視覚言語事前学習のためのグリッドベースの畳み込み機能を再検討し,高価な地域関連ステップをスキップする。本稿では,グリッド機能と驚くほどうまく連携する,単純かつ効果的なグリッドベースVLP法を提案する。ドメイン内データセットのみを事前学習することにより,提案手法は,3つの視覚言語理解タスクにおいて,最も競争力のある領域ベースのVLP手法より優れている。本研究の成果は,視覚言語プレトレーニング技術の進歩に寄与し,より効果的かつ効率的なVLPに向けた新たな方向性を提供することを願っている。

Existing approaches to vision-language pre-training (VLP) heavily rely on an object detector based on bounding boxes (regions), where salient objects are first detected from images and then a Transformer-based model is used for cross-modal fusion. Despite their superior performance, these approaches are bounded by the capability of the object detector in terms of both effectiveness and efficiency. Besides, the presence of object detection imposes unnecessary constraints on model designs and makes it difficult to support end-to-end training. In this paper, we revisit grid-based convolutional features for vision-language pre-training, skipping the expensive region-related steps. We propose a simple yet effective grid-based VLP method that works surprisingly well with the grid features. By pre-training only with in-domain datasets, the proposed Grid-VLP method can outperform most competitive region-based VLP methods on three examined vision-language understanding tasks. We hope that our findings help to further advance the state of the art of vision-language pre-training, and provide a new direction towards effective and efficient VLP.

翻訳日:2021-08-25 07:50:01 公開日:2021-08-21

# (参考訳) yseop at finsim-3 shared task 2021: specializing financial domain learning with phrase representations

Yseop at FinSim-3 Shared Task 2021: Specializing Financial Domain Learning with Phrase Representations ( http://arxiv.org/abs/2108.09485v1 )

ライセンス: CC BY 4.0

Hanna Abi Akl, Dominique Mariko, Hugues de Mazancourt

(参考訳) 本稿では,FinSim-3共有タスク2021:財務分野のセマンティック類似性を学ぶためのアプローチを提案する。この共有タスクの目的は、金融ドメインから与えられた用語のリストを、外部オントロジーにおいて最も関連するハイパーnym(またはトップレベル)概念に正しく分類することである。そこで,本研究では,カスタムコーパス上で事前学習した文-roberta(sroberta)埋め込みモデルと,ファストテキストモデルを用いて提案するベースライン単語埋め込み構造を改善し,分類性能を向上させる2つの文-sentence埋め込みモデルの評価を行った。両指標で総合2位、平均精度で0.917、平均ランクで1.141。

In this paper, we present our approaches for the FinSim-3 Shared Task 2021: Learning Semantic Similarities for the Financial Domain. The aim of this shared task is to correctly classify a list of given terms from the financial domain into the most relevant hypernym (or top-level) concept in an external ontology. For our system submission, we evaluate two methods: a Sentence-RoBERTa (SRoBERTa) embeddings model pre-trained on a custom corpus, and a dual word-sentence embeddings model that builds on the first method by improving the proposed baseline word embeddings construction using the FastText model to boost the classification performance. Our system ranks 2nd overall on both metrics, scoring 0.917 on Average Accuracy and 1.141 on Mean Rank.
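
The two reported metrics can be sketched as follows, assuming each system output is a complete ranked list of candidate hypernyms that always contains the gold label. The shared task's official scorer may treat truncated lists differently, so this is an illustration only. The labels and predictions below are hypothetical.

```python
def average_accuracy(ranked_predictions, gold_labels):
    """Fraction of terms whose top-ranked label equals the gold label."""
    hits = sum(1 for ranked, gold in zip(ranked_predictions, gold_labels)
               if ranked[0] == gold)
    return hits / len(gold_labels)

def mean_rank(ranked_predictions, gold_labels):
    """Average 1-based position of the gold label in each ranked list."""
    ranks = [ranked.index(gold) + 1
             for ranked, gold in zip(ranked_predictions, gold_labels)]
    return sum(ranks) / len(ranks)

# Hypothetical ranked outputs for four financial terms.
predictions = [
    ["Bonds", "Funds", "Swap"],
    ["Swap", "Funds", "Bonds"],
    ["Funds", "Bonds", "Swap"],
    ["Swap", "Bonds", "Funds"],
]
gold = ["Bonds", "Funds", "Funds", "Swap"]

accuracy = average_accuracy(predictions, gold)  # 3 of 4 top-1 hits
rank = mean_rank(predictions, gold)             # ranks 1, 2, 1, 1
```

Accuracy rewards only exact top-1 hits, while mean rank also credits near misses, so a perfect system would score 1.0 on both, and a lower mean rank is better.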

翻訳日:2021-08-25 07:42:00 公開日:2021-08-21

# (参考訳) flikcer - リアルタイム輝度周波数解析によるオンラインてんかん原性視覚コンテンツを解決するためのchromeエクステンション

Flikcer -- A Chrome Extension to Resolve Online Epileptogenic Visual Content with Real-Time Luminance Frequency Analysis ( http://arxiv.org/abs/2108.09491v1 )

ライセンス: CC BY 4.0

Jaisal Kothari, Ashay Srivastava

(参考訳) 映像コンテンツの輝度変動が速いか、あるいはてんかん原性視覚コンテンツと呼ばれる高コントラストの空間パターンが、感光性てんかんの視聴者に発作を誘発し、さらにこの疾患の影響を受けないユーザーに不快感を引き起こすこともある。 flikcerはwebサイトとchromeエクステンションという形で、ビデオのてんかん的なコンテンツを解決しようとするウェブアプリだ。これは発作の可能性のあるトリガーの数を提供する。また、これらのトリガーのタイムスタンプや、ビデオのより安全なバージョンも無料でダウンロードできる。アルゴリズムはpythonで書かれており、機械学習とコンピュータビジョンを使用している。このアルゴリズムの重要な側面は計算効率であり、利用者のリアルタイムな実装を可能にする。

Video content with fast luminance variations, or with spatial patterns of high contrast (referred to as epileptogenic visual content), may induce seizures in viewers with photosensitive epilepsy, and even cause discomfort in users not affected by this disease. Flikcer is a web app in the form of a website and Chrome extension which aims to resolve epileptogenic content in videos. It provides the number of possible triggers for a seizure. It also provides the timestamps for these triggers along with a safer version of the video, free to download. The algorithm is written in Python and uses machine learning and computer vision. A key aspect of the algorithm is its computational efficiency, allowing real-time implementation for public users.

翻訳日:2021-08-25 07:34:50 公開日:2021-08-21

# (参考訳) 文書アライメントのための多言語文類似度測定におけるメトリック学習

Metric Learning in Multilingual Sentence Similarity Measurement for Document Alignment ( http://arxiv.org/abs/2108.09495v1 )

ライセンス: CC BY 4.0

Charith Rajitha, Lakmali Piyarathne, Dilan Sachintha, Surangika Ranathunga

(参考訳) 多言語文表現に基づく文書アライメント技術は,最近,その成果が示された。しかし、これらの手法は教師なし距離測定技術に依存しており、手作業では微調整できない。本稿では,これらの教師なし距離測定手法の代わりに,タスク固有距離測定の導出にメトリックラーニングを用いる。これらの測定は教師あり、つまり距離測定メトリックは並列データセットを使って訓練される。 3つの異なる言語族に属する英語、シンハラ語、タミル語に属するデータセットを用いて、これらのタスク固有の教師付き距離学習メトリクスが、教師なし距離学習指標よりもドキュメントアライメントに優れていることを示す。

Document alignment techniques based on multilingual sentence representations have recently shown state-of-the-art results. However, these techniques rely on unsupervised distance measurement techniques, which cannot be fine-tuned to the task at hand. In this paper, instead of these unsupervised distance measurement techniques, we employ Metric Learning to derive task-specific distance measurements. These measurements are supervised, meaning that the distance measurement metric is trained using a parallel dataset. Using a dataset covering English, Sinhala, and Tamil, which belong to three different language families, we show that these task-specific supervised distance learning metrics outperform their unsupervised counterparts for document alignment.

翻訳日:2021-08-25 07:23:51 公開日:2021-08-21

# (参考訳) 文書間の関係抽出のための階層型エンティティグラフ畳み込みネットワーク

A Hierarchical Entity Graph Convolutional Network for Relation Extraction across Documents ( http://arxiv.org/abs/2108.09505v1 )

ライセンス: CC0 1.0

Tapas Nayak and Hwee Tou Ng

(参考訳) 関係抽出のための遠方の教師付きデータセットは、主に文レベルの抽出に焦点を当てており、関係性が非常に少ない。本稿では,関係タプルの2つの実体が,共通実体の連鎖を介して連結された2つの異なる文書に現れるクロスドキュメント関係抽出を提案する。このアイデアに従い、各チェーンが正確に2つのドキュメントを含む2つのホップ関係抽出のためのデータセットを作成する。提案するデータセットは,公開可能な文レベルのデータセットよりも高い関係性をカバーする。また,この課題に対する階層型エンティティグラフ畳み込みネットワーク(HEGCN)モデルを提案する。

Distantly supervised datasets for relation extraction mostly focus on sentence-level extraction, and they cover very few relations. In this work, we propose cross-document relation extraction, where the two entities of a relation tuple appear in two different documents that are connected via a chain of common entities. Following this idea, we create a dataset for two-hop relation extraction, where each chain contains exactly two documents. Our proposed dataset covers a higher number of relations than the publicly available sentence-level datasets. We also propose a hierarchical entity graph convolutional network (HEGCN) model for this task that improves performance by 1.1% F1 score on our two-hop relation extraction dataset, compared to some strong neural baselines.

翻訳日:2021-08-25 07:14:28 公開日:2021-08-21

# (参考訳) テンソル場上の学習変換のための回転同変ニューラル演算子(例えば3次元画像とベクトル場)

Rotationally Equivariant Neural Operators for Learning Transformations on Tensor Fields (eg 3D Images and Vector Fields) ( http://arxiv.org/abs/2108.09541v1 )

ライセンス: CC BY 4.0

Paul Shen, Michael Herbst, Venkat Viswanathan

(参考訳) テンソル場の集合間の変換および回転同変変換と同様に、学習分解不変量に対する同変ニューラルネットワークを導入する。入力と出力はスカラー場、ベクトル場、二階テンソル場、高階場の任意の混合を含むことができる。我々のテンソル場畳み込み層は任意の線型作用素をエミュレートし、そのインパルス応答やグリーン関数を畳み込み核として学習する。テンソル場注目層は局所テンソル積を介してペアワイズ場結合をエミュレートする。畳み込みとそれに付随する随伴体は実あるいはフーリエ空間に存在し、線形スケーリングが可能である。 E3NN, TBNN, FNOの概念を統一することにより, 工学および量子化学における幅広いPDEおよび力学系の予測性能が向上する。コードはJuliaにあり、著者からの要望に応じて入手できる。

We introduce equivariant neural operators for learning resolution invariant as well as translation and rotation equivariant transformations between sets of tensor fields. Input and output may contain arbitrary mixes of scalar fields, vector fields, second order tensor fields and higher order fields. Our tensor field convolution layers emulate any linear operator by learning its impulse response or Green's function as the convolution kernel. Our tensor field attention layers emulate pairwise field coupling via local tensor products. Convolutions and associated adjoints can be in real or Fourier space allowing for linear scaling. By unifying concepts from E3NN, TBNN and FNO, we achieve good predictive performance on a wide range of PDEs and dynamical systems in engineering and quantum chemistry. Code is in Julia and available upon request from authors.

翻訳日:2021-08-25 07:03:22 公開日:2021-08-21

# (参考訳) 時空間データマニフォールドの連成特性

Joint Characterization of Spatiotemporal Data Manifolds ( http://arxiv.org/abs/2108.09545v1 )

ライセンス: CC BY 4.0

Daniel Sousa and Christopher Small

(参考訳) 時空間(ST)画像データはますます一般的になり、しばしば高次元(高次元)である。 STデータのモデリングは、独立して相互作用するプロセスが多々存在するため、測定に寄与するかもしれないし、貢献しないかもしれない。キャラクタリゼーションは、生成過程とそのデータ表現に関する仮定の導出を支援することによって、モデリングの補完と見なすことができる。次元減少(DR)は、高次元信号の「次元の曲線」を緩和するためにしばしば実装される特徴である。長年にわたり、主成分(PC)と経験直交関数(EOF)分析は、DRおよびST分析に対する線形で可逆的なアプローチとして用いられてきた。近年、非線形drアルゴリズムのスイートが開発され、しばしば"manifold learning"と分類されている。ここでは、ラプラシアン固有写像 (LE) と t-分散確率的隣接埋め込み (t-SNE) の2つの非線形DRアプローチとともに、PC/EOFを用いたSTデータ多様体の合同特徴づけについて検討する。合成例から始まり,空間で約5桁,時間で2桁のstデータセットを大域的,地域的,フィールドスケールに展開し,これら3つのdrアプローチがst多様体トポロジーに関する補完的情報が得られることを示す。 PCs/EOFs による比較的拡散したTFS と比較して、非線形アプローチは、時間的終端部材 (LE) および/または時空間クラスタリング (t-SNE) におけるあいまいさを減少させたよりコンパクトな多様体を生成する。これらの特性は、LEやt-SNEよりも高い解釈可能性、計算要求の大幅な低減、PC/EOFの空間エイリアスに対する感度の低下によって補償される。総合的に考えると, 単一のアプローチだけで, 生成st過程をより深く把握できる3つの相補的なdrアプローチを用いた共同キャラクタリゼーションを見いだすことができる。

Spatiotemporal (ST) image data are increasingly common and often high-dimensional (high-D). Modeling ST data can be a challenge due to the plethora of independent and interacting processes which may or may not contribute to the measurements. Characterization can be considered the complement to modeling by helping guide assumptions about generative processes and their representation in the data. Dimensionality reduction (DR) is a frequently implemented type of characterization designed to mitigate the "curse of dimensionality" on high-D signals. For decades, Principal Component (PC) and Empirical Orthogonal Function (EOF) analysis has been used as a linear, invertible approach to DR and ST analysis. Recent years have seen the additional development of a suite of nonlinear DR algorithms, frequently categorized as "manifold learning". Here, we explore the idea of joint characterization of ST data manifolds using PCs/EOFs alongside two nonlinear DR approaches: Laplacian Eigenmaps (LE) and t-distributed stochastic neighbor embedding (t-SNE). Starting with a synthetic example and progressing to global, regional, and field scale ST datasets spanning roughly 5 orders of magnitude in space and 2 in time, we show these three DR approaches can yield complementary information about ST manifold topology. Compared to the relatively diffuse TFS produced by PCs/EOFs, the nonlinear approaches yield more compact manifolds with decreased ambiguity in temporal endmembers (LE) and/or in spatiotemporal clustering (t-SNE). These properties are compensated by the greater interpretability, significantly lower computational demand and diminished sensitivity to spatial aliasing for PCs/EOFs than LE or t-SNE. Taken together, we find joint characterization using the three complementary DR approaches capable of greater insight into generative ST processes than possible using any single approach alone.

翻訳日:2021-08-25 06:59:04 公開日:2021-08-21

# (参考訳) ピカチュウはどのくらいかわいい? Pokémon 単語埋め込みによるデータからの Pokémon プロパティの収集とランク付け

How Cute is Pikachu? Gathering and Ranking Pokémon Properties from Data with Pokémon Word Embeddings ( http://arxiv.org/abs/2108.09546v1 )

ライセンス: CC BY 4.0

Mika Hämäläinen, Khalid Alnajjar and Niko Partanen

(参考訳) 我々は,151個のオリジナル Pokémon に対して,記述性を自動的に得るための異なる方法を提案する。クロールしたPokémonコーパス上に複数の単語埋め込みモデルをトレーニングし、与えられたPokémonにどのような特徴があるかに基づいて、自動的に英語の形容詞をランク付けする。我々の実験に基づいて、事前訓練されたモデルを使用するよりも、ドメイン固有のデータでモデルをトレーニングする方がよい。 Word2Vecは、結果においてfastTextモデルよりもノイズが少ない。さらに、各Pokémonのプロパティのリストを自動的に拡張します。しかし、いずれの手法も完全に正確ではなく、異なるセマンティックモデルにはかなりのノイズがある。私たちのモデルはZenodoでリリースされました。

We present different methods for automatically obtaining descriptive properties for the 151 original Pokémon. We train several different word embedding models on a crawled Pokémon corpus, and use them to automatically rank English adjectives based on how characteristic they are of a given Pokémon. Based on our experiments, it is better to train a model with domain-specific data than to use a pretrained model. Word2Vec produces less noise in the results than the fastText model. Furthermore, we expand the list of properties for each Pokémon automatically. However, none of the methods is spot on and there is a considerable amount of noise in the different semantic models. Our models have been released on Zenodo.
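
Ranking adjectives by how characteristic they are of a name can be sketched with cosine similarity between word vectors. This is a minimal illustration assuming cosine similarity is the scoring function, which the abstract does not state. The three-dimensional vectors below are made up for the example, whereas real vectors would come from a trained embedding model.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def rank_adjectives(target_vec, adjective_vecs):
    """Sort adjectives from most to least similar to the target vector."""
    return sorted(adjective_vecs,
                  key=lambda adj: cosine(target_vec, adjective_vecs[adj]),
                  reverse=True)

# Hypothetical toy embeddings; not taken from the released models.
pikachu = [0.9, 0.1, 0.2]
adjectives = {
    "cute": [0.8, 0.2, 0.1],
    "scary": [0.1, 0.9, 0.3],
    "yellow": [0.7, 0.0, 0.6],
}

ranking = rank_adjectives(pikachu, adjectives)
```

With real embeddings, the same ranking step is where the noise mentioned in the abstract shows up, since frequent but uninformative adjectives can sit close to many names.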

翻訳日:2021-08-25 06:43:56 公開日:2021-08-21

# (参考訳) 熱可視顔認証のための合成手法

A Synthesis-Based Approach for Thermal-to-Visible Face Verification ( http://arxiv.org/abs/2108.09558v1 )

ライセンス: CC BY 4.0

Neehar Peri, Joshua Gleason, Carlos D. Castillo, Thirimachos Bourlai, Vishal M. Patel, Rama Chellappa

(参考訳) 近年,検査官の認識性能に適合する可視分光顔認証システムが提案されている。しかし、このようなシステムは低照度や夜間では効果がない。体温を吸収する熱顔画像は、可視光スペクトルを効果的に増強し、照明が制限されたシーンで識別可能な顔の特徴を捉える。コストの増大と多様な熱スペクトルと可視スペクトルデータセットの取得の困難さから、アルゴリズムや低光度認識のための大規模ベンチマークは限られている。本稿では,ARL-VTFとTUFTSの両方のマルチスペクトル顔データに対して,最先端の性能を実現するアルゴリズムを提案する。さらに,マルチスペクトル顔合成と検証のためのラベル平滑化による顔アライメント,ピクセルレベル対応,アイデンティティ分類の影響について検討した。提案手法は広く適用可能であり,堅牢であり,かつ高い有効性を示す。また,提案手法は,プロファイル対フロント検証において,フェイスフロント化法を有意に上回っていることを示す。最後にmilab-vtf(b)を提案する。これは対のサーマルビデオと可視ビデオで構成される、挑戦的なマルチスペクトル顔データセットである。私たちの知る限りでは、400人の被験者による顔データとともに、このデータセットは、屋内および長距離の熱可視性顔画像の最も広範なコレクションである。最後に,MILAB-VTF(B)データセットに対して,エンドツーエンドのサーマル・トゥ・ザ・ヴィジュアブル・フェース・検証システムにより高い性能が得られることを示す。

In recent years, visible-spectrum face verification systems have been shown to match expert forensic examiner recognition performance. However, such systems are ineffective in low-light and nighttime conditions. Thermal face imagery, which captures body heat emissions, effectively augments the visible spectrum, capturing discriminative facial features in scenes with limited illumination. Due to the increased cost and difficulty of obtaining diverse, paired thermal and visible spectrum datasets, algorithms and large-scale benchmarks for low-light recognition are limited. This paper presents an algorithm that achieves state-of-the-art performance on both the ARL-VTF and TUFTS multi-spectral face datasets. Importantly, we study the impact of face alignment, pixel-level correspondence, and identity classification with label smoothing for multi-spectral face synthesis and verification. We show that our proposed method is widely applicable, robust, and highly effective. In addition, we show that the proposed method significantly outperforms face frontalization methods on profile-to-frontal verification. Finally, we present MILAB-VTF(B), a challenging multi-spectral face dataset that is composed of paired thermal and visible videos. To the best of our knowledge, with face data from 400 subjects, this dataset represents the most extensive collection of publicly available indoor and long-range outdoor thermal-visible face imagery. Lastly, we show that our end-to-end thermal-to-visible face verification system provides strong performance on the MILAB-VTF(B) dataset.

翻訳日:2021-08-25 06:34:29 公開日:2021-08-21

# (参考訳) 連続学習における主勾配方向と信頼貯留層サンプリング

Principal Gradient Direction and Confidence Reservoir Sampling for Continual Learning ( http://arxiv.org/abs/2108.09592v1 )

ライセンス: CC BY 4.0

Zhiyi Chen and Tong Lin

(参考訳) タスクフリーオンライン連続学習は、非IDデータストリーム上の学習者の破滅的な忘れを緩和することを目的としている。 Experience Replay (ER) はSOTA連続学習法であり、他のリプレイ手法のバックボーンアルゴリズムとして広く使われている。しかし, ERのトレーニング戦略は, リプレイされた例を十分に活用するには単純すぎるため, 貯水池のサンプリング戦略も最適ではない。本研究では,ERを特別な場合とみなすことのできる一般近位勾配フレームワークを提案する。さらに,主グラディエント方向(PGD)と信頼性貯留層サンプリング(CRS)の2つの改良点を提案する。主勾配方向において,過去の勾配の大きな寄与を表すだけでなく,現在の勾配に関する新たな知識も保持する目標勾配を最適化する。次に、保存されたサンプルの値を測定するマージンベースのメトリックに基づいて、より有益なメモリバッファを維持するための信頼度リザーバサンプリングを示す。このアルゴリズムは平均精度を7.9%まで向上させ、4つのデータセットで最大15.4%まで忘れてしまうという、soma erベースの手法であるmir-replayの性能を一貫して向上させる。

Task-free online continual learning aims to alleviate catastrophic forgetting of the learner on a non-iid data stream. Experience Replay (ER) is a SOTA continual learning method, which is broadly used as the backbone algorithm for other replay-based methods. However, the training strategy of ER is too simple to take full advantage of replayed examples, and its reservoir sampling strategy is also suboptimal. In this work, we propose a general proximal gradient framework in which ER can be viewed as a special case. We further propose two improvements accordingly: Principal Gradient Direction (PGD) and Confidence Reservoir Sampling (CRS). In Principal Gradient Direction, we optimize a target gradient that not only represents the major contribution of past gradients, but also retains the new knowledge of the current gradient. We then present Confidence Reservoir Sampling for maintaining a more informative memory buffer based on a margin-based metric that measures the value of stored examples. Experiments substantiate the effectiveness of both improvements, and our new algorithm consistently boosts the performance of MIR-replay, a SOTA ER-based method: it increases average accuracy by up to 7.9% and reduces forgetting by up to 15.4% on four datasets.
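For readers unfamiliar with the baseline, the reservoir sampling strategy that ER uses for its memory buffer can be sketched as follows. This is plain reservoir sampling only, not the proposed Confidence Reservoir Sampling (the abstract does not give the margin-based metric); the function and parameter names are illustrative assumptions.

```python
import random


def reservoir_update(buffer, item, n_seen, capacity, rng=random):
    """Offer one stream item to a fixed-size replay buffer.

    n_seen is the number of items offered before this one.
    Every item seen so far ends up in the buffer with equal probability.
    Returns the updated count of items seen.
    """
    if len(buffer) < capacity:
        # Buffer not full yet: always keep the item.
        buffer.append(item)
    else:
        # Pick a slot uniformly from [0, n_seen]; replace only if it
        # falls inside the buffer, i.e. with probability capacity/(n_seen+1).
        j = rng.randrange(n_seen + 1)
        if j < capacity:
            buffer[j] = item
    return n_seen + 1
```

A confidence-based variant would presumably replace the uniform choice of which stored example to evict with a choice driven by a per-example score; that part is left out here because the abstract does not specify it.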

翻訳日:2021-08-25 06:20:02 公開日:2021-08-21

# (参考訳) 長文音声対話のための階層的要約

Hierarchical Summarization for Longform Spoken Dialog ( http://arxiv.org/abs/2108.09597v1 )

ライセンス: CC BY 4.0

Daniel Li, Thomas Chen, Albert Tung, Lydia Chilton

(参考訳) 私たちは毎日会話に囲まれています。この媒体は、監査的に多様な情報ストリームを提供するが、体系的にダイアログを理解することは、しばしば非自明である。音声対話の広汎性にもかかわらず、自動音声理解と品質情報抽出は、特に文章の散文と比較した場合、著しく貧弱である。さらに、テキストを理解することに比べ、聴覚コミュニケーションは、話者の拡散、非公式な散文スタイル、構造の欠如など、多くの課題をもたらす。これらの懸念はすべて、ユーザが話し言葉のドメインを理解し、ナビゲートするのに役立つ、明確にカスタマイズされた対話システムの必要性を示しています。個々の自動音声認識(ASR)とテキスト要約法はすでに存在するが、それらは不完全な技術であり、ユーザ目的や意図、音声言語による合併症への対処も考慮していない。その結果、2段階のASRとテキスト要約パイプラインを設計し、これらの音声認識課題を解決するためのセマンティックセグメンテーションとマージアルゴリズムを提案する。本システムでは,ユーザが簡単にコンテンツを閲覧・ナビゲートできるだけでなく,これらの基盤技術におけるエラーからの回復も可能である。最後に,音声を素早くスキップし,ユーザの興味のある内容を識別するツールとして,階層的な要約のユーザの好みを強調するシステムの評価を行う。

Every day we are surrounded by spoken dialog. This medium delivers rich, diverse streams of information auditorily; however, systematically understanding dialog can often be non-trivial. Despite the pervasiveness of spoken dialog, automated speech understanding and quality information extraction remain markedly poor, especially when compared to written prose. Furthermore, compared to understanding text, auditory communication poses many additional challenges such as speaker disfluencies, informal prose styles, and lack of structure. These concerns all demonstrate the need for a distinctly speech-tailored interactive system to help users understand and navigate the spoken language domain. While individual automatic speech recognition (ASR) and text summarization methods already exist, they are imperfect technologies; they neither consider user purpose and intent nor address complications induced by spoken language. Consequently, we design a two-stage ASR and text summarization pipeline and propose a set of semantic segmentation and merging algorithms to resolve these speech modeling challenges. Our system enables users to easily browse and navigate content as well as recover from errors in these underlying technologies. Finally, we present an evaluation of the system which highlights user preference for hierarchical summarization as a tool to quickly skim audio and identify content of interest to the user.

翻訳日:2021-08-25 06:10:24 公開日:2021-08-21

# (参考訳) SERF: log-Softplus ERror activation Functionを用いたディープニューラルネットワークのより良いトレーニングを目指して

SERF: Towards better training of deep neural networks using log-Softplus ERror activation Function ( http://arxiv.org/abs/2108.09598v1 )

ライセンス: CC BY 4.0

Sayan Nag, Mayukh Bhattacharyya

(参考訳) アクティベーション機能は、トレーニングダイナミクスとニューラルネットワークのパフォーマンスを決定する上で重要な役割を果たす。シンプルで有効であるにもかかわらず広く採用されているアクティベーション関数 ReLU には、Dying ReLU 問題を含むいくつかの欠点がある。そこで本研究では,自然界において自己正規化され,非単調であるサーフと呼ばれる新しい活性化関数を提案する。 Mishと同様に、SerfもSwishファミリーに属している。コンピュータビジョン(画像分類とオブジェクト検出)と自然言語処理(機械翻訳、感情分類、マルチモーダル・エンテーメント)の様々な実験に基づいて、SerfはReLU(ベースライン)とSwishとMishを含む他のアクティベーション機能を大きく上回っており、より深いアーキテクチャに顕著な差がある。アブレーション研究により、serfベースのアーキテクチャは様々なシナリオにおいてswishやmishよりも優れた性能を示し、様々な深さ、複雑さ、最適化、学習率、バッチサイズ、初期化器、ドロップアウト率でserfの有効性と互換性を検証する。最後に,SwishとSerfの数学的関係について検討し,よりスムーズかつ高速に勾配を最適化する正規化効果を提供するSerfの第1微分のプレコンディショナー関数の影響を示す。

Activation functions play a pivotal role in determining the training dynamics and neural network performance. The widely adopted activation function ReLU, despite being simple and effective, has a few disadvantages, including the Dying ReLU problem. In order to tackle such problems, we propose a novel activation function called Serf, which is self-regularized and nonmonotonic in nature. Like Mish, Serf also belongs to the Swish family of functions. Based on several experiments on computer vision (image classification and object detection) and natural language processing (machine translation, sentiment classification and multimodal entailment) tasks with different state-of-the-art architectures, it is observed that Serf vastly outperforms ReLU (baseline) and other activation functions including both Swish and Mish, with a markedly bigger margin on deeper architectures. Ablation studies further demonstrate that Serf-based architectures perform better than those of Swish and Mish in varying scenarios, validating the effectiveness and compatibility of Serf with varying depth, complexity, optimizers, learning rates, batch sizes, initializers and dropout rates. Finally, we investigate the mathematical relation between Swish and Serf, thereby showing the impact of the preconditioner function ingrained in the first derivative of Serf, which provides a regularization effect making gradients smoother and optimization faster.
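The abstract does not spell out the formula, but the name "log-Softplus ERror activation Function" suggests f(x) = x · erf(ln(1 + e^x)). A minimal scalar sketch under that assumption, using only the Python standard library (the helper names are illustrative):

```python
import math


def softplus(x):
    """Numerically stable ln(1 + e^x)."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def serf(x):
    """Assumed form of Serf: x * erf(softplus(x))."""
    return x * math.erf(softplus(x))


def relu(x):
    """ReLU baseline for comparison."""
    return max(x, 0.0)
```

Under this form the function approaches the identity for large positive inputs, and for negative inputs it dips below zero before returning towards zero, which is the nonmonotonic behaviour the abstract mentions.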

翻訳日:2021-08-25 05:43:59 公開日:2021-08-21

# CushLEPOR: LABSE蒸留知識モデルを用いたカスタマイズhLEPORメトリクスによる人的判断との整合性向上

CushLEPOR: Customised hLEPOR Metric Using LABSE Distilled Knowledge Model to Improve Agreement with Human Judgements ( http://arxiv.org/abs/2108.09484v1 )

ライセンス: Link先を確認

Lifeng Han, Irina Sorokina, Gleb Erofeev, Serge Gladkoff

(参考訳) 人間の評価は常に高価で、研究者は自動メトリクスを信頼できない。そこで本稿では,事前学習型言語モデル(PLM)と限定された人間のラベル付きスコアの利点を生かして,従来のメトリクスをカスタマイズすることを提案する。まず、hLEPORのパラメータ要素を再導入し、次に、hLEPORのパラメータの重み付けを自動的にチューニングするPythonポータブルバージョンを開発しました。次に、LABSE蒸留知識モデルを用いて、cushLEPORが配置された正確なMT言語対に関する因子重みを自動的に最適化することにより、人間の判断とのメートル法合意を向上する、カスタマイズhLEPOR(cushLEPOR)を提案する。また、英語とドイツ語と中国語のペアにおけるMQMおよびpSQMフレームワークに基づく評価データに対して、cushLEPORを最適化する。実験の結果、CushLEPOR は LABSE のような PLM とのより優れた契約、MQM や pSQM などの人的評価に対するより良い合意、BLEU よりもはるかに優れたパフォーマンスをもたらすことが示されている(データは \url{https://github.com/poethan/cushLEPOR} で入手できる)。

Human evaluation has always been expensive, while researchers struggle to trust automatic metrics. To address this, we propose to customise traditional metrics by taking advantage of pre-trained language models (PLMs) and the limited available human-labelled scores. We first re-introduce the hLEPOR metric factors, followed by the portable Python version we developed, which achieves automatic tuning of the weighting parameters in the hLEPOR metric. Then we present the customised hLEPOR (cushLEPOR), which uses the LABSE distilled knowledge model to improve the metric's agreement with human judgements by automatically optimising factor weights for the exact MT language pairs that cushLEPOR is deployed to. We also optimise cushLEPOR towards human evaluation data based on the MQM and pSQM frameworks on English-German and Chinese-English language pairs. The experimental investigations show that cushLEPOR boosts hLEPOR performance towards better agreement with PLMs like LABSE at much lower cost, and better agreement with human evaluations including MQM and pSQM scores, and yields much better performance than BLEU (data available at \url{https://github.com/poethan/cushLEPOR}).

翻訳日:2021-08-24 16:08:02 公開日:2021-08-21

# learn-explain-reinforce: counterfactual reasoningとアルツハイマー病診断モデル強化のための指導

Learn-Explain-Reinforce: Counterfactual Reasoning and Its Guidance to Reinforce an Alzheimer's Disease Diagnosis Model ( http://arxiv.org/abs/2108.09451v1 )

ライセンス: Link先を確認

Kwanseok Oh, Jee Seok Yoon, and Heung-Il Suk

(参考訳) 既存の疾患診断モデルの研究は、パフォーマンス改善のための診断モデル学習や、訓練された診断モデルの視覚的説明に焦点を当てている。本稿では、診断モデル学習、視覚的説明生成(説明単位)、視覚的説明によって導かれる訓練された診断モデル強化(強化単位)を統一する新しい学習説明強化(LEAR)フレームワークを提案する。視覚的説明のために、入力サンプルを目的のターゲットラベルとして識別するために変換する反ファクトマップを生成する。例えば、カウンターファクトマップは、通常の脳画像内で仮説上の異常を局在させ、アルツハイマー病(AD)と診断される可能性がある。我々は,対象課題に関するデータ駆動型およびモデル駆動型知識,すなわち構造的MRIを用いたAD診断が,訓練された診断モデルの一般化を強化する上で重要な情報源であると考えている。この目的のために,反事実マップの指導により注意に基づく特徴リファインメントモジュールを考案する。説明と補強は相互に行われ、反復的に操作できる。提案手法はadniデータセットの質的・定量的解析により検証された。その理解性と忠実さはアブレーション研究と既存手法との比較によって実証された。

Existing studies on disease diagnostic models focus either on diagnostic model learning for performance improvement or on the visual explanation of a trained diagnostic model. We propose a novel learn-explain-reinforce (LEAR) framework that unifies diagnostic model learning, visual explanation generation (explanation unit), and trained diagnostic model reinforcement (reinforcement unit) guided by the visual explanation. For the visual explanation, we generate a counterfactual map that transforms an input sample to be identified as an intended target label. For example, a counterfactual map can localize hypothetical abnormalities within a normal brain image that may cause it to be diagnosed with Alzheimer's disease (AD). We believe that the generated counterfactual maps represent data-driven and model-induced knowledge about a target task, i.e., AD diagnosis using structural MRI, which can be a vital source of information to reinforce the generalization of the trained diagnostic model. To this end, we devise an attention-based feature refinement module with the guidance of the counterfactual maps. The explanation and reinforcement units are reciprocal and can be operated iteratively. Our proposed approach was validated via qualitative and quantitative analysis on the ADNI dataset. Its comprehensibility and fidelity were demonstrated through ablation studies and comparisons with existing methods.

翻訳日:2021-08-24 16:04:33 公開日:2021-08-21

# 離散高次元データを用いたベイズネットワーク同定のためのスパース構造学習アルゴリズム

A Sparse Structure Learning Algorithm for Bayesian Network Identification from Discrete High-Dimensional Data ( http://arxiv.org/abs/2108.09501v1 )

ライセンス: Link先を確認

Nazanin Shajoonnezhad, Amin Nikanjam

(参考訳) 本稿では,高次元離散データから疎構造ベイズネットワークを学習する問題に対処する。連続ベイズネットワークと比較すると、離散ベイズネットワークの学習は大きなパラメータ空間のため難しい問題である。連続ベイズネットワークの学習には多くのアプローチが開発されているが、離散的ネットワークに対するアプローチはほとんど提案されていない。本稿では,学習ベイズネットワークを最適化問題として扱い,空間性とDAG特性を同時に満足するスコア関数を提案する。また,スコア関数を最適化するためにブロック方向確率座標降下アルゴリズムを実装した。具体的には,アルゴリズムを高次元データで効率的に動作させるため,最適化アルゴリズムに分散低減法を用いる。提案手法は,よく知られたベンチマークネットワークからの合成データに適用できる。構築したネットワークの品質,スケーラビリティ,堅牢性を測定した。いくつかの競合手法と比較して,本アルゴリズムは評価指標において他のアルゴリズムよりも優れていた。

This paper addresses the problem of learning a sparse structure Bayesian network from high-dimensional discrete data. Compared to continuous Bayesian networks, learning a discrete Bayesian network is a challenging problem due to the large parameter space. Although many approaches have been developed for learning continuous Bayesian networks, few approaches have been proposed for the discrete ones. In this paper, we address learning Bayesian networks as an optimization problem and propose a score function that satisfies the sparsity and the DAG property simultaneously. In addition, we implement a block-wise stochastic coordinate descent algorithm to optimize the score function. Specifically, we use a variance reduction method in our optimization algorithm to make the algorithm work efficiently on high-dimensional data. The proposed approach is applied to synthetic data from well-known benchmark networks. The quality, scalability, and robustness of the constructed network are measured. Compared to some competitive approaches, the results reveal that our algorithm outperforms the others on the evaluation metrics.

翻訳日:2021-08-24 16:02:25 公開日:2021-08-21

# 確率的勾配降下法におけるランダム性の増大は、どのように汎化を改善しうるのか?

How Can Increased Randomness in Stochastic Gradient Descent Improve Generalization? ( http://arxiv.org/abs/2108.09507v1 )

ライセンス: Link先を確認

Arwen V. Bradley and Carlos Alberto Gomez-Uribe

(参考訳) 近年の研究では、確率勾配降下(SGD)における学習率の増加やミニバッチサイズの減少がテストセット性能を向上させることが報告されている。複数の局所ミニマを持つ損失関数を持つモデルでは、いくつかの条件下でこれを期待できる。我々の主な貢献は、一般化におけるSGD学習率とバッチサイズの役割を研究する物理の手法に着想を得た、近似的だが解析的なアプローチである。複数の最小値を持つ損失関数のトレーニングとテストデータ分布のシフトの下でテストセットのパフォーマンスを特徴付ける。このシフトは単にサンプリングによって起こりうるため、一般的には実践的な応用に現れる。その結果,局所的ミニマムの変化は曲率を上げることによってテスト性能を悪化させ,広義の局所的ミニマムの選択により一般化が向上することを示す。次に,SGDを専門とし,静止条件下でのテスト性能について検討する。 SGDの正確な定常分布を得ることは困難であるため、SGDのFokker-Planck近似を導出し、その定常分布を得る。このプロセスは, 最小バッチサイズで分割された学習速度が, 統計力学において温度に類似する役割を担っていることを示唆し, 定常分布を含むSGDは, 温度を一定に保った学習速度やバッチサイズの変化に大きく変化しないことを示唆している。また,SGD温度の上昇は局所最小値の選択を低曲率で促進し,より一般化できることを示す。我々は,SGDの温度不変性を示すCIFAR10の実験を行い,SGD温度が上昇するにつれて試験損失が向上し,この効果を駆動する際のサンプリングとドメインシフトの影響を定量化する。最後に,2つの局所最小値による簡易な損失に我々の理論がどのように適用されるかを示す合成実験を示す。

Recent works report that increasing the learning rate or decreasing the minibatch size in stochastic gradient descent (SGD) can improve test set performance. We argue this is expected under some conditions in models with a loss function with multiple local minima. Our main contribution is an approximate but analytical approach inspired by methods in Physics to study the role of the SGD learning rate and batch size in generalization. We characterize test set performance under a shift between the training and test data distributions for loss functions with multiple minima. The shift can simply be due to sampling, and is therefore typically present in practical applications. We show that the resulting shift in local minima worsens test performance by picking up curvature, implying that generalization improves by selecting wide and/or little-shifted local minima. We then specialize to SGD, and study its test performance under stationarity. Because obtaining the exact stationary distribution of SGD is intractable, we derive a Fokker-Planck approximation of SGD and obtain its stationary distribution instead. This process shows that the learning rate divided by the minibatch size plays a role analogous to temperature in statistical mechanics, and implies that SGD, including its stationary distribution, is largely invariant to changes in learning rate or batch size that leave its temperature constant. We show that increasing SGD temperature encourages the selection of local minima with lower curvature, and can enable better generalization. We provide experiments on CIFAR10 demonstrating the temperature invariance of SGD, improvement of the test loss as SGD temperature increases, and quantifying the impact of sampling versus domain shift in driving this effect. Finally, we present synthetic experiments showing how our theory applies in a simplified loss with two local minima.
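The "temperature" described above is the learning rate divided by the minibatch size. A small sketch of that ratio, and of how one might rescale the learning rate to keep it constant when the batch size changes, is given below. The function names are illustrative assumptions, and any constant factors from the Fokker-Planck derivation are omitted.

```python
def sgd_temperature(learning_rate, batch_size):
    """Temperature analogue from the abstract: learning rate / batch size."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    return learning_rate / batch_size


def lr_for_same_temperature(learning_rate, batch_size, new_batch_size):
    """Learning rate that keeps the temperature fixed at a new batch size."""
    return sgd_temperature(learning_rate, batch_size) * new_batch_size
```

For example, going from a batch size of 128 to 512 at a learning rate of 0.1 would call for a learning rate of about 0.4 to stay at the same temperature, which is the regime in which the abstract claims SGD behaves largely the same.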

翻訳日:2021-08-24 16:02:12 公開日:2021-08-21

# BoundaryNet: 半自動レイアウトアノテーションのためのFast Marching距離マップを用いた注意機構付き深層ネットワーク

BoundaryNet: An Attentive Deep Network with Fast Marching Distance Maps for Semi-automatic Layout Annotation ( http://arxiv.org/abs/2108.09433v1 )

ライセンス: Link先を確認

Abhishek Trivedi and Ravi Kiran Sarvadevabhatla

(参考訳) 画像領域の正確な境界アノテーションは、領域クラスセマンティクスに依存する下流アプリケーションにとって重要である。いくつかの文書コレクションは、アスペクト比の広い多クラス領域インスタンスと非常に不規則で重なり合う密集したレイアウトを含んでいる。完全自動境界推定手法は、データ集約的であり、可変サイズの画像を扱うことができず、上記の画像に対する準最適結果を生成する傾向がある。本稿では,高精度半自動レイアウトアノテーションのための新しいリサイズフリーアプローチであるバウンダリネットを提案する。可変サイズのユーザ選択領域は、最初に注目誘導スキップネットワークにより処理される。ネットワーク最適化は高速マーチング距離マップを介して導かれ、高品質な初期境界推定と関連する特徴表現を得る。これらの出力は、ハウスドルフ損失を用いて最適化された残差グラフ畳み込みネットワークによって処理され、最終的な領域境界を得る。挑戦的な画像原稿データセットの結果、BoundaryNetは強いベースラインを上回り、高品質なセマンティック領域境界を生成する。定性的には,スクリプトシステムとレイアウトの異なる複数の文書画像データセットを,追加の微調整なしで一般化する。 BoundaryNetを文書アノテーションシステムに統合し、手動や完全自動の代替品と比較して高いアノテーションスループットを提供することを示す。

Precise boundary annotations of image regions can be crucial for downstream applications which rely on region-class semantics. Some document collections contain densely laid out, highly irregular and overlapping multi-class region instances with large range in aspect ratio. Fully automatic boundary estimation approaches tend to be data intensive, cannot handle variable-sized images and produce sub-optimal results for aforementioned images. To address these issues, we propose BoundaryNet, a novel resizing-free approach for high-precision semi-automatic layout annotation. The variable-sized user selected region of interest is first processed by an attention-guided skip network. The network optimization is guided via Fast Marching distance maps to obtain a good quality initial boundary estimate and an associated feature representation. These outputs are processed by a Residual Graph Convolution Network optimized using Hausdorff loss to obtain the final region boundary. Results on a challenging image manuscript dataset demonstrate that BoundaryNet outperforms strong baselines and produces high-quality semantic region boundaries. Qualitatively, our approach generalizes across multiple document image datasets containing different script systems and layouts, all without additional fine-tuning. We integrate BoundaryNet into a document annotation system and show that it provides high annotation throughput compared to manual and fully automatic alternatives.
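The Hausdorff loss mentioned above builds on the Hausdorff distance between two boundaries. A minimal sketch of the plain (non-differentiable) symmetric Hausdorff distance between two point sets follows; the training loss used by BoundaryNet is not specified in the abstract, so this only illustrates the underlying measure.

```python
import math


def directed_hausdorff(a, b):
    """Largest distance from a point in a to its nearest point in b."""
    return max(min(math.dist(p, q) for q in b) for p in a)


def hausdorff(a, b):
    """Symmetric Hausdorff distance between two non-empty point sets."""
    return max(directed_hausdorff(a, b), directed_hausdorff(b, a))
```

The measure is driven by the single worst-matched boundary point, which is why it is sensitive to local boundary errors that area-overlap scores can hide.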

翻訳日:2021-08-24 16:00:48 公開日:2021-08-21

# Palmira: 手書き写本における密で不均一なレイアウトのインスタンスセグメンテーションのためのDeep Deformable Network

Palmira: A Deep Deformable Network for Instance Segmentation of Dense and Uneven Layouts in Handwritten Manuscripts ( http://arxiv.org/abs/2108.09436v1 )

ライセンス: Link先を確認

Prema Satish Sharan, Sowmya Aitha, Amandeep Kumar, Abhishek Trivedi, Aaron Augustine, Ravi Kiran Sarvadevabhatla

(参考訳) 手書きの文書は、しばしば濃密で不均一なレイアウトで特徴づけられる。進歩にもかかわらず、セマンティックレイアウトセグメンテーションのための標準的なディープネットワークベースのアプローチは、セマンティクス領域にまたがる複雑な変形に対して堅牢ではない。この現象は、特に低リソースのインディアムリーフ原稿ドメインで顕著である。この問題に対処するため、最初にindiscapes2を紹介します。indiscapes2は、セマンティックレイアウトアノテーションを備えた、インデックス原稿の新しい大規模多種多様なデータセットです。 Indiscapes2には4つの異なる歴史的コレクションの文書があり、前身であるIndiscapesよりも150%大きい。また,手書き原稿中の領域の頑健な変形対応インスタンスセグメンテーションのための,新しい深層ネットワークpalmiraを提案する。また、ハウスドルフ距離とその変種を境界対応性能尺度として報告する。実験によりPalmiraはロバストなレイアウトを提供し、強力なベースラインアプローチやアブレーティブなバリエーションよりも優れていることが示された。我々はまた、パルミラの一般化能力を示すために、アラビア語、東南アジア、ヘブライの歴史写本の質的な結果も含んでいる。

Handwritten documents are often characterized by dense and uneven layouts. Despite advances, standard deep-network-based approaches for semantic layout segmentation are not robust to complex deformations seen across semantic regions. This phenomenon is especially pronounced for the low-resource Indic palm-leaf manuscript domain. To address the issue, we first introduce Indiscapes2, a new large-scale diverse dataset of Indic manuscripts with semantic layout annotations. Indiscapes2 contains documents from four different historical collections and is 150% larger than its predecessor, Indiscapes. We also propose a novel deep network, Palmira, for robust, deformation-aware instance segmentation of regions in handwritten manuscripts. We also report Hausdorff distance and its variants as a boundary-aware performance measure. Our experiments demonstrate that Palmira provides robust layouts and outperforms strong baseline approaches and ablative variants. We also include qualitative results on Arabic, South-East Asian and Hebrew historical manuscripts to showcase the generalization capability of Palmira.

翻訳日:2021-08-24 16:00:28 公開日:2021-08-21

# semifed:一貫性と擬似ラベル付き半教師付き連合学習

SemiFed: Semi-supervised Federated Learning with Consistency and Pseudo-Labeling ( http://arxiv.org/abs/2108.09412v1 )

ライセンス: Link先を確認

Haowen Lin, Jian Lou, Li Xiong, Cyrus Shahabi

(参考訳) フェデレートラーニングは、携帯電話や組織などの複数のクライアントが、ローカルデータのプライバシーを保護しながら、予測の共有モデルを共同で学習することを可能にする。しかし、フェデレーション学習の最近の研究と応用は、すべてのクライアントが完全なラベル付きデータを持っていると仮定している。本研究では、各クライアントのデータサンプルを部分的にラベル付けするクロスサイロ・フェデレーション学習の新しいシナリオに焦点を当てる。我々は,ラベル付きサンプルへのアクセスに制限があるにもかかわらず,大量のラベル付きデータを用いてモデルの精度を向上させる半教師付き学習手法のアイデアを借りる。半教師付き学習のための2つの支配的アプローチである一貫性の正規化と擬似ラベル付けを統一したsemifedと呼ばれる新しいフレームワークを提案する。 SemiFedはまず、一貫性の正則化を強制するために高度なデータ拡張技術を適用し、トレーニング中にモデルの予測を使用して擬似ラベルを生成する。 SemiFedはフェデレーションを利用して、あるイメージに対して、異なるクライアントから複数のモデルが高信頼の予測を生成し、同じラベルに同意した場合のみ、擬似ラベルを保持する。 2つの画像ベンチマークに関する広範囲実験により,不均質および異種データ分布設定における提案手法の有効性を実証した。

Federated learning enables multiple clients, such as mobile phones and organizations, to collaboratively learn a shared model for prediction while protecting local data privacy. However, most recent research and applications of federated learning assume that all clients have fully labeled data, which is impractical in real-world settings. In this work, we focus on a new scenario for cross-silo federated learning, where data samples of each client are partially labeled. We borrow ideas from semi-supervised learning methods where a large amount of unlabeled data is utilized to improve the model's accuracy despite limited access to labeled examples. We propose a new framework dubbed SemiFed that unifies two dominant approaches for semi-supervised learning: consistency regularization and pseudo-labeling. SemiFed first applies advanced data augmentation techniques to enforce consistency regularization and then generates pseudo-labels using the model's predictions during training. SemiFed takes advantage of the federation so that for a given image, the pseudo-label holds only if multiple models from different clients produce a high-confidence prediction and agree on the same label. Extensive experiments on two image benchmarks demonstrate the effectiveness of our approach under both homogeneous and heterogeneous data distribution settings.
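The agreement rule for pseudo-labels can be sketched roughly as below. The abstract does not give the confidence threshold, the number of clients that must agree, or how non-confident clients are treated, so `threshold`, `min_agree`, and the "all confident clients must agree" rule are assumptions made for illustration.

```python
def federated_pseudo_label(client_probs, threshold=0.95, min_agree=2):
    """Return a pseudo-label for one sample, or None to discard it.

    client_probs: one class-probability list per client model.
    A label is kept only if at least `min_agree` clients predict with
    confidence >= threshold and all of those clients name the same class.
    """
    confident_labels = []
    for probs in client_probs:
        label = max(range(len(probs)), key=probs.__getitem__)
        if probs[label] >= threshold:
            confident_labels.append(label)
    if len(confident_labels) >= min_agree and len(set(confident_labels)) == 1:
        return confident_labels[0]
    return None
```

Requiring agreement across clients trades coverage for precision: fewer unlabeled samples receive a pseudo-label, but a single overconfident client model cannot introduce one on its own.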

翻訳日:2021-08-24 15:59:54 公開日:2021-08-21

# Proof-of-Learningに対する「敵対的サンプル」

"Adversarial Examples" for Proof-of-Learning ( http://arxiv.org/abs/2108.09454v1 )

ライセンス: Link先を確認

Rui Zhang, Jian Liu, Yuan Ding, Qingbiao Wu, and Kui Ren

(参考訳) S&P '21において、Jiaらはproof-of-learning (PoL)と呼ばれる新しい概念・機構を提案した。これは、証明者がトレーニング手順の完全性を証明することによって、機械学習モデルのオーナシップを実証することを可能にする。PoLは、証明者が証明の生成に要したコストよりも低いコスト(計算量と記憶量の両方)で、敵対者が有効な証明を構築できないことを保証する。 PoL証明は、トレーニング中に記録された一連の中間モデルと、記録された各モデルを得るために使用される対応するデータポイントを含む。Jiaらは、最終的なモデルとトレーニングデータセットを知るだけの敵対者は、正しいデータポイントを持つ中間モデルのセットを効率的に見つけることができないと主張した。しかし,本稿では,PoLが「敵対的サンプル」に対して脆弱であることを示す。具体的には、敵対的サンプルを最適化するのと同様の方法で、任意に選んだデータポイントに所与のモデルを「生成」させることができ、それによって正しいデータポイントを持つ中間モデルを効率的に生成できる。理論的にも経験的にも、証明者による証明の生成よりもはるかに低いコストで有効な証明を生成できることを示し、PoLを破ることに成功した。

In S&P '21, Jia et al. proposed a new concept/mechanism named proof-of-learning (PoL), which allows a prover to demonstrate ownership of a machine learning model by proving integrity of the training procedure. It guarantees that an adversary cannot construct a valid proof at lower cost (in both computation and storage) than that incurred by the prover in generating the proof. A PoL proof includes a set of intermediate models recorded during training, together with the corresponding data points used to obtain each recorded model. Jia et al. claimed that an adversary merely knowing the final model and training dataset cannot efficiently find a set of intermediate models with correct data points. In this paper, however, we show that PoL is vulnerable to "adversarial examples"! Specifically, in a similar way to optimizing an adversarial example, we can make an arbitrarily chosen data point "generate" a given model, hence efficiently generating intermediate models with correct data points. We demonstrate, both theoretically and empirically, that we are able to generate a valid proof at significantly lower cost than the prover incurs in generating a proof, thereby successfully breaking PoL.

翻訳日:2021-08-24 15:58:51 公開日:2021-08-21

# 結晶構造相マッピングの自動化:ディープラーニングと制約推論を組み合わせる

Automating Crystal-Structure Phase Mapping: Combining Deep Learning with Constraint Reasoning ( http://arxiv.org/abs/2108.09523v1 )

ライセンス: Link先を確認

Di Chen, Yiwei Bai, Sebastian Ament, Wenting Zhao, Dan Guevarra, Lan Zhou, Bart Selman, R. Bruce van Dover, John M. Gregoire, Carla P. Gomes

(参考訳) 結晶構造相マッピング(英: crystal-structure phase mapping)は、合成材料における結晶構造やその混合物の同定を必要とする、材料科学における中核的で長期にわたる挑戦である。材料科学の専門家は単純なシステムを解くことに長けているが、複雑なシステムを解くことはできない。ここでは結晶構造位相マッピングの自動化について述べる。我々は,教師なしパターンデミックス問題として位相マッピングを定式化し,深層推論ネットワーク(drnets)を用いてその解法を説明する。 DRNetは、科学的事前知識を組み込むための制約推論とディープラーニングを組み合わせることで、わずかな量の(ラベルのない)データしか必要としない。 DRNetは、制約推論をニューラルネットワーク最適化にシームレスに統合した結晶の混合物を管理する熱力学規則に関する豊富な事前知識を利用して、限られたデータを補償する。 DRNetは、事前知識ドメイン制約を符号化し、ニューラルネットワーク最適化に制約推論をシームレスに統合するための解釈可能な潜在空間で設計されている。 DRNetはかつての結晶構造相マッピングのアプローチを超越し、Bi-Cu-V酸化物相図を解き、太陽電池材料の発見を支援した。

Crystal-structure phase mapping is a core, long-standing challenge in materials science that requires identifying crystal structures, or mixtures thereof, in synthesized materials. Materials science experts excel at solving simple systems but cannot solve complex systems, creating a major bottleneck in high-throughput materials discovery. Herein we show how to automate crystal-structure phase mapping. We formulate phase mapping as an unsupervised pattern demixing problem and describe how to solve it using Deep Reasoning Networks (DRNets). DRNets combine deep learning with constraint reasoning for incorporating scientific prior knowledge and consequently require only a modest amount of (unlabeled) data. DRNets compensate for the limited data by exploiting and magnifying the rich prior knowledge about the thermodynamic rules governing the mixtures of crystals with constraint reasoning seamlessly integrated into neural network optimization. DRNets are designed with an interpretable latent space for encoding prior-knowledge domain constraints and seamlessly integrate constraint reasoning into neural network optimization. DRNets surpass previous approaches on crystal-structure phase mapping, unraveling the Bi-Cu-V oxide phase diagram, and aiding the discovery of solar-fuels materials.

翻訳日:2021-08-24 15:58:34 公開日:2021-08-21

# 多項式次数の多項式核の高速スケッチ

Fast Sketching of Polynomial Kernels of Polynomial Degree ( http://arxiv.org/abs/2108.09420v1 )

ライセンス: Link先を確認

Zhao Song, David P. Woodruff, Zheng Yu, Lichen Zhang

(参考訳) カーネルメソッドは機械学習の基本であり、カーネル近似の高速アルゴリズムは機械学習における多くのコアタスクを直接高速化する。多項式核は、テイラー級数展開を通じて多項式核によって近似されることが多いため、特に重要である。最近の斜めスケッチ技術では、多項式核の指数関数から多項式への次数 q$ に対する実行時間の依存性が小さくなっており、これはガウス核にとって有用であり、q$ は多対数として選択できる。しかし、ニューラル・タンジェントやアークコサイン・カーネルのようなよりゆっくりと成長するカーネルの場合、$q$は多項式でなければならない。この実行時間を大幅に改善し、先行注文項の$q$への依存をなくすことにより、新たな不明瞭なスケッチを提示する。新しいサンプリングスキームと組み合わせることで、成長の遅いカーネルの大規模なファミリーを近似するための最速のアルゴリズムを与える。

Kernel methods are fundamental in machine learning, and faster algorithms for kernel approximation provide direct speedups for many core tasks in machine learning. The polynomial kernel is especially important as other kernels can often be approximated by the polynomial kernel via a Taylor series expansion. Recent techniques in oblivious sketching reduce the dependence in the running time on the degree $q$ of the polynomial kernel from exponential to polynomial, which is useful for the Gaussian kernel, for which $q$ can be chosen to be polylogarithmic. However, for more slowly growing kernels, such as the neural tangent and arc-cosine kernels, $q$ needs to be polynomial, and previous work incurs a polynomial factor slowdown in the running time. We give a new oblivious sketch which greatly improves upon this running time, by removing the dependence on $q$ in the leading order term. Combined with a novel sampling scheme, we give the fastest algorithms for approximating a large family of slow-growing kernels.
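The remark that other kernels can be approximated by polynomial kernels via a Taylor expansion can be made concrete for the Gaussian kernel, using exp(-||x-y||²/2) = exp(-||x||²/2) · exp(-||y||²/2) · Σ_j ⟨x,y⟩^j / j!. The sketch below truncates the series at degree q and evaluates it directly; it does not implement the sketching algorithm of the paper, and the function names are illustrative.

```python
import math


def dot(x, y):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(x, y))


def gaussian_kernel(x, y):
    """Exact Gaussian kernel exp(-||x - y||^2 / 2)."""
    return math.exp(-0.5 * sum((a - b) ** 2 for a, b in zip(x, y)))


def gaussian_kernel_taylor(x, y, q):
    """Gaussian kernel approximated by polynomial kernels <x, y>^j, j = 0..q."""
    inner = dot(x, y)
    series = sum(inner ** j / math.factorial(j) for j in range(q + 1))
    return math.exp(-0.5 * dot(x, x)) * math.exp(-0.5 * dot(y, y)) * series
```

Each term ⟨x,y⟩^j is a degree-j polynomial kernel, so a fast sketch for polynomial kernels speeds up every term; how large q must be depends on how quickly the series for the target kernel converges.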

翻訳日:2021-08-24 15:56:17 公開日:2021-08-21

# 分離学習環境における逐次確率最適化

Sequential Stochastic Optimization in Separable Learning Environments ( http://arxiv.org/abs/2108.09585v1 )

ライセンス: Link先を確認

R. Reid Bishop and Chelsea C. White III

(参考訳) 我々は,様々な種類の教師付き学習概念を包含する不確実性の下での逐次的意思決定問題を考える。これらの問題は、完全に観察された状態過程と部分的に観測された変調過程を有し、状態過程は観察過程を通してのみ変調過程に影響され、観察過程は変調過程のみを観察し、変調過程は制御に外在する。我々は,この幅広い問題を部分観察マルコフ決定過程(POMDP)としてモデル化する。変調過程の信念関数は制御不変であり、状態過程の制御から変調過程の推定を分離する。この特殊な構造を持つPOMDPを分離可能POMDP（SEP-POMDP）と呼び、(i) 在庫管理、金融、医療システムなど幅広い応用分野のモデルとして利用できること、(ii) 完全に観測されるMDPの集合から価値関数と最適方策の構造を継承すること、(iii) モデル要素が完全に規定された不確実性下の逐次意思決定の古典的モデルと、完全には規定されておらず統計学や機械学習の予測手法を必要とするモデルとの橋渡しとなること、(iv) 専用の近似解法を可能にすることを示す。

We consider a class of sequential decision-making problems under uncertainty that can encompass various types of supervised learning concepts. These problems have a completely observed state process and a partially observed modulation process, where the state process is affected by the modulation process only through an observation process, the observation process only observes the modulation process, and the modulation process is exogenous to control. We model this broad class of problems as a partially observed Markov decision process (POMDP). The belief function for the modulation process is control invariant, thus separating the estimation of the modulation process from the control of the state process. We call this specially structured POMDP the separable POMDP, or SEP-POMDP, and show it (i) can serve as a model for a broad class of application areas, e.g., inventory control, finance, healthcare systems, (ii) inherits value function and optimal policy structure from a set of completely observed MDPs, (iii) can serve as a bridge between classical models of sequential decision making under uncertainty having fully specified model artifacts and such models that are not fully specified and require the use of predictive methods from statistics and machine learning, and (iv) allows for specialized approximate solution procedures.
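The control-invariant belief mentioned above can be illustrated with a standard Bayes filter over a finite modulation process: because the process is exogenous, the update uses only its transition matrix and the observation likelihoods, never the chosen control. The sketch assumes a finite state space and that the observation depends on the post-transition modulation state; both are assumptions for illustration, and the names are hypothetical.

```python
def belief_update(belief, transition, obs_likelihood):
    """One Bayes-filter step for the modulation process.

    belief[i]         : current probability of modulation state i
    transition[i][j]  : probability of moving from state i to state j
    obs_likelihood[j] : probability of the received observation in state j
    No control input appears, which is what makes the belief control-invariant.
    """
    n = len(belief)
    # Predict: push the belief through the exogenous transition matrix.
    predicted = [sum(belief[i] * transition[i][j] for i in range(n))
                 for j in range(n)]
    # Correct: weight by the likelihood of the observation, then normalize.
    unnormalized = [predicted[j] * obs_likelihood[j] for j in range(n)]
    total = sum(unnormalized)
    if total == 0.0:
        raise ValueError("observation has zero probability under the belief")
    return [u / total for u in unnormalized]
```

Since this recursion can be run without knowing the control, estimation of the modulation process can be carried out separately from the control of the state process, which is the separation the abstract describes.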

翻訳日:2021-08-24 15:55:58 公開日:2021-08-21

# Integer-arithmetic-only Certified Robustness for Quantized Neural Networks

Integer-arithmetic-only Certified Robustness for Quantized Neural Networks ( http://arxiv.org/abs/2108.09413v1 )

ライセンス: Link先を確認

Haowen Lin, Jian Lou, Li Xiong and Cyrus Shahabi

(参考訳) 敵対的なデータ例は、機械学習とセキュリティコミュニティから大きな注目を集めている。反対例に取り組むための一連の研究は、理論的な堅牢性を保証するためのランダムな平滑化によって、堅牢性を保証する。しかし、そのような機構は通常、推論の計算に浮動小数点演算を使い、大きなメモリフットプリントと計算コストを犠牲にする。これらの防御モデルは、エッジデバイス上で効率的に動作したり、チューリングテンソルコアや整数専用ARMプロセッサのような整数専用論理ユニットにデプロイすることはできない。これらの課題を克服するために,任意の分類器を新しいスムーズな分類器に変換するために,量子化を用いた整数ランダム化平滑化手法を提案する。提案手法ではL2-ノルムの下で強靭性を保証する。提案手法は,2つの異なるデータセット(CIFAR-10とCaltech-101)上の汎用CPUおよびモバイルデバイス上で,浮動小数点演算によるロバストな手法に対して,同等の精度と4倍～5倍の高速化が得られることを示す。

Adversarial data examples have drawn significant attention from the machine learning and security communities. A line of work on tackling adversarial examples is certified robustness via randomized smoothing that can provide a theoretical robustness guarantee. However, such a mechanism usually uses floating-point arithmetic for calculations in inference and requires large memory footprints and daunting computational costs. These defensive models cannot run efficiently on edge devices nor be deployed on integer-only logical units such as Turing Tensor Cores or integer-only ARM processors. To overcome these challenges, we propose an integer randomized smoothing approach with quantization to convert any classifier into a new smoothed classifier, which uses integer-only arithmetic for certified robustness against adversarial perturbations. We prove a tight robustness guarantee under L2-norm for the proposed approach. We show our approach can obtain a comparable accuracy and 4x~5x speedup over floating-point arithmetic certified robust methods on general-purpose CPUs and mobile devices on two distinct datasets (CIFAR-10 and Caltech-101).
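For context, the usual floating-point form of randomized smoothing certifies an L2 radius of σ · Φ⁻¹(p_A), where p_A is a lower confidence bound on the probability that the base classifier returns the top class under Gaussian noise of standard deviation σ. The sketch below shows that standard floating-point calculation only; the integer-only certificate proposed in the paper is not described in the abstract and is not reproduced here.

```python
from statistics import NormalDist


def certified_l2_radius(p_a_lower, sigma):
    """Standard randomized-smoothing L2 radius, or None to abstain.

    p_a_lower: lower confidence bound on the top-class probability
               under Gaussian input noise (must be below 1).
    sigma:     standard deviation of the smoothing noise.
    """
    if not 0.0 <= p_a_lower < 1.0:
        raise ValueError("p_a_lower must lie in [0, 1)")
    if p_a_lower <= 0.5:
        # The top class is not certified to win the majority vote.
        return None
    return sigma * NormalDist().inv_cdf(p_a_lower)
```

The inverse normal CDF is the floating-point step that an integer-only pipeline has to avoid or replace, which hints at why a separate robustness guarantee is needed for the quantized setting.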

翻訳日:2021-08-24 15:53:55 公開日:2021-08-21

# 空間適応型特徴変換による可変レート深部画像圧縮

Variable-Rate Deep Image Compression through Spatially-Adaptive Feature Transform ( http://arxiv.org/abs/2108.09551v1 )

ライセンス: Link先を確認

Myungseo Song, Jinyoung Choi, Bohyung Han

(参考訳) 本研究では,空間特徴変換(SFT arXiv:1804.02815)に基づく多目的深部画像圧縮ネットワークを提案する。本モデルは,任意の画素単位の品質マップによって制御される単一モデルを用いて,幅広い圧縮率をカバーする。さらに,提案フレームワークでは,符号化ネットワークの目的タスクに特化して最適化された品質マップを効率的に推定することにより,様々なタスクに対するタスク認識画像圧縮を行うことができる。これは、個別のタスクの別々のモデルを学ぶことなく、事前訓練されたネットワークで可能だ。本アルゴリズムは,複数の異なるターゲットレートに対して別々に最適化された複数のモデルに基づくアプローチと比較して,優れたレートゆがみトレードオフを実現する。同じレベルの圧縮では、モデルトレーニングを伴わずにタスク認識品質マップ推定により、画像分類とテキスト領域の品質保存の性能を向上する。コードはプロジェクトのwebサイトで入手できる。 https://github.com/micmic123/qmapcompression

We propose a versatile deep image compression network based on Spatial Feature Transform (SFT arXiv:1804.02815), which takes a source image and a corresponding quality map as inputs and produces a compressed image with variable rates. Our model covers a wide range of compression rates using a single model, which is controlled by arbitrary pixel-wise quality maps. In addition, the proposed framework allows us to perform task-aware image compression for various tasks, e.g., classification, by efficiently estimating optimized quality maps specific to target tasks for our encoding network. This is even possible with a pretrained network without learning separate models for individual tasks. Our algorithm achieves outstanding rate-distortion trade-off compared to the approaches based on multiple models that are optimized separately for several different target rates. At the same level of compression, the proposed approach successfully improves performance on image classification and text region quality preservation via task-aware quality map estimation without additional model training. The code is available at the project website: https://github.com/micmic123/QmapCompression

翻訳日:2021-08-24 15:53:37 公開日:2021-08-21
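
The quality-map conditioning described in the abstract above can be illustrated with a minimal sketch. In SFT (arXiv:1804.02815), a feature map is modulated by an element-wise affine transform whose scale and shift are predicted from a condition; here the condition is the pixel-wise quality map. The functions `toy_scale` and `toy_shift` are hypothetical stand-ins for the learned condition networks, and the single-channel list-of-lists layout is an assumption made only to keep the example dependency-free. It is not the authors' implementation.

```python
def sft_modulate(features, quality_map, scale_fn, shift_fn):
    """Apply an SFT-style element-wise affine transform: gamma(q) * f + beta(q)."""
    out = []
    for f_row, q_row in zip(features, quality_map):
        out.append([scale_fn(q) * f + shift_fn(q) for f, q in zip(f_row, q_row)])
    return out


# Hypothetical stand-ins for the learned networks that map the quality map
# to a per-pixel scale (gamma) and shift (beta).
def toy_scale(q):
    return 1.0 + q


def toy_shift(q):
    return 0.5 * q


features = [[1.0, 2.0], [3.0, 4.0]]      # one feature channel, 2x2
quality_map = [[0.0, 1.0], [0.5, 0.0]]   # pixel-wise quality in [0, 1]
modulated = sft_modulate(features, quality_map, toy_scale, toy_shift)
print(modulated)
```

Pixels with quality 0 pass through unchanged in this toy setup, while higher quality values scale and shift the feature more strongly, which is the mechanism that lets a single model cover several rates.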

# パーソナライズ・イン・ザ・ループ文書要約に向けて

Towards Personalized and Human-in-the-Loop Document Summarization ( http://arxiv.org/abs/2108.09443v1 )

ライセンス: Link先を確認

Samira Ghodratnama

(参考訳) コンピュータデバイスのユビキタス化とインターネットの普及により、大量のデータが継続的に生成されている。したがって、与えられたトピックに関する利用可能な情報の量は、人間の処理能力をはるかに超え、情報過負荷と呼ばれるものを引き起こす。大量の情報を効率的に処理し,ユーザにとって重要な価値を持つコンテンツを生成するためには,情報の識別,統合,要約が必要である。データ要約は、関連する情報を収集し、より短いフォーマットに収集し、複雑な質問に答え、新しい洞察を得、概念境界を発見するのに役立つ。本論文は,新しい要約手法を用いて情報過負荷を軽減するための3つの課題に焦点を当てている。さらに、個人化された情報抽出を支援するために文書の分析を容易にする。この論文は、(i)文書要約における機能工学、(ii)従来の静的および非フレキシブルな要約、(iii)伝統的な総合的な要約アプローチ、(iv)参照要約の必要性の4つの領域に研究問題を分けている。 i)自動インテリジェント機能工学の獲得,ii)柔軟でインタラクティブな要約の実現,iii)知的でパーソナライズされた要約アプローチを活用した新しいアプローチを提案する。実験の結果,提案手法は他の最先端モデルと比較して有効性が証明された。さらに,ネットワークトラフィックデータ,ヘルスデータ,ビジネスプロセスデータの要約を通じて,異なるドメインにおける情報過負荷問題に対する解決策を提案する。

The ubiquitous availability of computing devices and the widespread use of the internet continuously generate large amounts of data. Therefore, the amount of available information on any given topic is far beyond humans' capacity to process, causing what is known as information overload. To efficiently cope with large amounts of information and generate content with significant value to users, we require identifying, merging and summarising information. Data summaries can help gather related information and collect it into a shorter format that enables answering complicated questions, gaining new insight and discovering conceptual boundaries. This thesis focuses on three main challenges to alleviate information overload using novel summarisation techniques. It further intends to facilitate the analysis of documents to support personalised information extraction. This thesis separates the research issues into four areas, covering (i) feature engineering in document summarisation, (ii) traditional static and inflexible summaries, (iii) traditional generic summarisation approaches, and (iv) the need for reference summaries. We propose novel approaches to tackle these challenges by (i) enabling automatic intelligent feature engineering, (ii) enabling flexible and interactive summarisation, and (iii) utilising intelligent and personalised summarisation approaches. The experimental results prove the efficiency of the proposed approaches compared to other state-of-the-art models. We further propose solutions to the information overload problem in different domains through summarisation, covering network traffic data, health data and business process data.

翻訳日:2021-08-24 15:51:07 公開日:2021-08-21

# 介入を用いた自律エージェントの因果モデル学習

Learning Causal Models of Autonomous Agents using Interventions ( http://arxiv.org/abs/2108.09586v1 )

ライセンス: Link先を確認

Pulkit Verma, Siddharth Srivastava

(参考訳) aiシステムの広範な使用におけるいくつかの障害の1つは、そのようなシステムの安全で信頼性のある動作を保証することができる解釈可能性の要件の欠如である。我々はエージェントアセスメントモジュールの解析を拡張し、AIシステムがシミュレータでハイレベルな命令シーケンスを実行し、アクションのシーケンスの実行についてユーザクエリに回答できるようにする。このような原始的なクエリ応答能力は,ユーザの解釈可能なシステムの因果モデルを定常的,完全に可観測的,決定論的設定で効率的に導出するのに十分であることを示す。また、STRIPSのようなドメインの因果構造を捉える動的因果決定ネットワーク(DCDN)を導入する。クエリの異なるクラスの比較分析は、それらに答えるために必要な計算要件と、正しいモデルを学ぶためにそれらの応答を評価するのに必要な努力の観点からも示される。

One of the several obstacles in the widespread use of AI systems is the lack of requirements of interpretability that can enable a layperson to ensure the safe and reliable behavior of such systems. We extend the analysis of an agent assessment module that lets an AI system execute high-level instruction sequences in simulators and answer the user queries about its execution of sequences of actions. We show that such a primitive query-response capability is sufficient to efficiently derive a user-interpretable causal model of the system in stationary, fully observable, and deterministic settings. We also introduce dynamic causal decision networks (DCDNs) that capture the causal structure of STRIPS-like domains. A comparative analysis of different classes of queries is also presented in terms of the computational requirements needed to answer them and the efforts required to evaluate their responses to learn the correct model.

翻訳日:2021-08-24 15:50:45 公開日:2021-08-21

# 医療画像に対する教師なし局所識別

Unsupervised Local Discrimination for Medical Images ( http://arxiv.org/abs/2108.09440v1 )

ライセンス: Link先を確認

Huai Chen, Renzhen Wang, Jieyu Li, Qing Peng, Deyu Meng and Lisheng Wang

(参考訳) 対照的表現学習は、医療画像処理における高価な注釈データの需要を軽減する効果的な教師なし手法である。最近の研究は主に、グローバルな特徴を学習するためのインスタンス単位の識別に基づいており、局所的な詳細を無視しているため、小さな解剖学的構造、組織、病変の処理への応用が制限されている。そこで我々は,医療モデルを効果的に初期化するための局所的識別特徴を学習する普遍的な局所識別フレームワークを提案し,その実用的な医療応用を体系的に検討する。具体的には、モダリティ内構造類似性という共通の性質、すなわち同一モダリティの画像間で類似した構造が共有されるという性質に基づき、体系的な局所特徴学習フレームワークを提案する。グローバル埋め込みに基づくインスタンス間比較を行う代わりに,画素単位の埋め込みを行い,パッチと領域間の類似度を測定することに焦点を当てる。より細かいコントラスト則により、学習された表現はセグメンテーションタスクにおいてより一般化され、カラー眼底画像と胸部X線の12個の下流タスクのうち11個で最良の結果を得て、広範な最先端手法を上回る。さらに、モダリティ間の形状類似性、すなわち異なる医用モダリティであっても構造が類似した形状を共有しうるという性質に基づき、モダリティ横断の形状事前知識を領域識別に組み込むことで、教師なしセグメンテーションを実現する。これは、他のモダリティからの形状記述と領域識別が与える内部パターンの類似性のみに基づいて対象をセグメンテーションできる可能性を示す。最後に,中心感度平均化を導入することでパッチ識別の中心感度を高め,ワンショットのランドマーク位置推定を実現する。

Contrastive representation learning is an effective unsupervised method to alleviate the demand for expensive annotated data in medical image processing. Recent work is mainly based on instance-wise discrimination to learn global features, while neglecting local details, which limits its application in processing tiny anatomical structures, tissues and lesions. Therefore, we aim to propose a universal local discrimination framework to learn local discriminative features to effectively initialize medical models, and we systematically investigate its practical medical applications. Specifically, based on the common property of intra-modality structure similarity, i.e. similar structures are shared among the same modality images, a systematic local feature learning framework is proposed. Instead of making instance-wise comparisons based on global embedding, our method makes pixel-wise embedding and focuses on measuring similarity among patches and regions. The finer contrastive rule makes the learnt representation more generalized for segmentation tasks and outperforms extensive state-of-the-art methods by winning 11 out of all 12 downstream tasks in color fundus and chest X-ray. Furthermore, based on the property of inter-modality shape similarity, i.e. structures may share similar shape although in different medical modalities, we join the cross-modality shape prior with region discrimination to realize unsupervised segmentation. It shows the feasibility of segmenting a target based only on shape description from other modalities and inner pattern similarity provided by region discrimination. Finally, we enhance the center-sensitive ability of patch discrimination by introducing center-sensitive averaging to realize one-shot landmark localization, an effective application of patch discrimination.

翻訳日:2021-08-24 15:49:28 公開日:2021-08-21

# Sugeno Fuzzy Integral Technique を用いた頚椎細胞画像分類のためのCNN分類器のアンサンブル

Ensemble of CNN classifiers using Sugeno Fuzzy Integral Technique for Cervical Cytology Image Classification ( http://arxiv.org/abs/2108.09460v1 )

ライセンス: Link先を確認

Rohit Kundu, Hritam Basak, Akhil Koilada, Soham Chattopadhyay, Sukanta Chakraborty, Nibaran Das

(参考訳) 子宮頸がんは4番目に多いがんであり、検出手順が遅いために、毎年50万人以上の女性が罹患している。早期診断は、がんの治療、さらには治癒にも役立つが、退屈で時間のかかる検査プロセスのために、集団検診の実施は不可能になっている。病理学者による効率的かつ信頼性の高い検出を支援するため,本論文では,子宮頸がんの単一細胞画像およびスライド画像を分類する完全自動のコンピュータ支援診断ツールを提案する。バイオメディカル画像分類のための自動検出ツールを開発する際の主な懸念は、公開データが少ないことである。アンサンブル学習は画像分類の一般的なアプローチであるが、分類器に事前に決めた重みを用いる単純なアプローチでは十分な性能が得られない。本研究では,Sugenoファジィ積分を用いて,Inception v3,DenseNet-161,ResNet-34の3つの事前学習済み深層学習モデルの決定スコアをアンサンブルする。提案するファジィ融合は,各サンプルに対する分類器の信頼度スコアを考慮に入れ,各分類器に与える重要度を適応的に変化させることで,各分類器が与える補完的情報を取り込み,分類性能を向上させる。提案手法は,Mendeley Liquid Based Cytology (LBC) データセット,SIPaKMeD Whole Slide Image (WSI) データセット,SIPaKMeD Single Cell Image (SCI) データセットの3つの公開データセットで評価され,得られた結果は有望である。GradCAMに基づく視覚表現と統計検定による分析,および文献における既存モデルとベースラインモデルとの比較は,本手法の有効性を裏付ける。

Cervical cancer is the fourth most common category of cancer, affecting more than 500,000 women annually, owing to the slow detection procedure. Early diagnosis can help in treating and even curing cancer, but the tedious, time-consuming testing process makes it impossible to conduct population-wise screening. To aid the pathologists in efficient and reliable detection, in this paper, we propose a fully automated computer-aided diagnosis tool for classifying single-cell and slide images of cervical cancer. The main concern in developing an automatic detection tool for biomedical image classification is the low availability of publicly accessible data. Ensemble Learning is a popular approach for image classification, but simplistic approaches that leverage pre-determined weights to classifiers fail to perform satisfactorily. In this research, we use the Sugeno Fuzzy Integral to ensemble the decision scores from three popular pretrained deep learning models, namely, Inception v3, DenseNet-161 and ResNet-34. The proposed Fuzzy fusion is capable of taking into consideration the confidence scores of the classifiers for each sample, and thus adaptively changing the importance given to each classifier, capturing the complementary information supplied by each, thus leading to superior classification performance. We evaluated the proposed method on three publicly available datasets, the Mendeley Liquid Based Cytology (LBC) dataset, the SIPaKMeD Whole Slide Image (WSI) dataset, and the SIPaKMeD Single Cell Image (SCI) dataset, and the results thus yielded are promising. Analysis of the approach using GradCAM-based visual representations and statistical tests, and comparison of the method with existing and baseline models in literature justify the efficacy of the approach.

翻訳日:2021-08-24 15:48:55 公開日:2021-08-21
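
The fusion step named in the abstract above can be sketched as follows. This is a minimal illustration of a Sugeno fuzzy integral over classifier scores, assuming the fuzzy measure is a Sugeno lambda-measure built from per-classifier densities; the abstract does not state which fuzzy measure or which density values the authors use, so `densities` and the example scores below are hypothetical.

```python
def solve_lambda(densities):
    """Find lambda > -1 with prod(1 + lambda * g_i) = 1 + lambda (Sugeno lambda-measure)."""
    total = sum(densities)
    if abs(total - 1.0) < 1e-6:
        return 0.0  # densities already add up to 1: the measure is additive

    def f(lam):
        prod = 1.0
        for g in densities:
            prod *= 1.0 + lam * g
        return prod - (1.0 + lam)

    if total > 1.0:
        lo, hi = -1.0, -1e-9      # the root lies in (-1, 0)
    else:
        lo, hi = 1e-9, 1.0        # the root is positive: grow hi until f(hi) >= 0
        for _ in range(60):
            if f(hi) >= 0.0:
                break
            hi *= 2.0
    for _ in range(200):          # plain bisection on [lo, hi]
        mid = 0.5 * (lo + hi)
        if f(lo) * f(mid) <= 0.0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)


def sugeno_integral(scores, densities):
    """Sugeno integral: max over i of min(i-th largest score, measure of the top-i sources)."""
    lam = solve_lambda(densities)
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    g_cum, best = 0.0, 0.0
    for i in order:
        g_cum = g_cum + densities[i] + lam * g_cum * densities[i]
        best = max(best, min(scores[i], g_cum))
    return best


def fuse(class_scores, densities):
    """class_scores[k][c] is the score of classifier k for class c."""
    n_classes = len(class_scores[0])
    fused = [sugeno_integral([s[c] for s in class_scores], densities)
             for c in range(n_classes)]
    return max(range(n_classes), key=lambda c: fused[c]), fused


# Hypothetical scores from three classifiers for a two-class problem.
scores = [[0.7, 0.3], [0.6, 0.4], [0.2, 0.8]]
densities = [0.4, 0.35, 0.25]   # assumed importance of each classifier
label, fused = fuse(scores, densities)
print(label, fused)
```

Because the integral takes a minimum of each score with the measure of the sources that support it, the weight a classifier effectively receives depends on where its score ranks for the given sample, unlike a fixed weighted average.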

# マスキングによるエンド2エンドの顔認識

End2End Occluded Face Recognition by Masking Corrupted Features ( http://arxiv.org/abs/2108.09468v1 )

ライセンス: Link先を確認

Haibo Qiu, Dihong Gong, Zhifeng Li, Wei Liu, Dacheng Tao

(参考訳) 近年の深層畳み込みニューラルネットワークの進歩により、一般的な顔認識は大きく進歩した。しかし、最先端の一般顔認識モデルは、現実のシナリオでまさによく見られる、遮蔽された顔画像にはうまく一般化しない。考えられる理由は、訓練用の大規模な遮蔽顔データがないことと、遮蔽によって生じる破損した特徴に対処するための専用の設計がないことである。本稿では,単一のエンドツーエンドのディープニューラルネットワークに基づく,遮蔽に頑健な新しい顔認識手法を提案する。FROM (Face Recognition with Occlusion Masks) と名付けた我々のアプローチは、深層畳み込みニューラルネットワークから破損した特徴を発見し、動的に学習したマスクによってそれらを取り除くことを学習する。さらに,FROMを効果的かつ効率的に訓練するために,大量の遮蔽顔画像を構築する。FROMは、外部検出器に頼って遮蔽を発見する手法や、識別力の低い浅いモデルを用いる既存手法と比べて、単純でありながら強力である。LFW、Megaface Challenge 1、RMF2、ARデータセットおよびその他の模擬的な遮蔽・マスク付きデータセットでの実験結果から、FROMは遮蔽下での精度を劇的に向上させ、一般的な顔認識にもよく一般化することを確認した。

With the recent advancement of deep convolutional neural networks, significant progress has been made in general face recognition. However, the state-of-the-art general face recognition models do not generalize well to occluded face images, which are exactly the common cases in real-world scenarios. The potential reasons are the absences of large-scale occluded face data for training and specific designs for tackling corrupted features brought by occlusions. This paper presents a novel face recognition method that is robust to occlusions based on a single end-to-end deep neural network. Our approach, named FROM (Face Recognition with Occlusion Masks), learns to discover the corrupted features from the deep convolutional neural networks, and clean them by the dynamically learned masks. In addition, we construct massive occluded face images to train FROM effectively and efficiently. FROM is simple yet powerful compared to the existing methods that either rely on external detectors to discover the occlusions or employ shallow models which are less discriminative. Experimental results on the LFW, Megaface challenge 1, RMF2, AR dataset and other simulated occluded/masked datasets confirm that FROM dramatically improves the accuracy under occlusions, and generalizes well on general face recognition.

翻訳日:2021-08-24 15:48:21 公開日:2021-08-21

# 公共ウェブカメラからの3次元再構成

3D Reconstruction from public webcams ( http://arxiv.org/abs/2108.09476v1 )

ライセンス: Link先を確認

Tianyu Wu, Konrad Schindler and Cenek Albl

(参考訳) 本稿では,複数のウェブカメラで捉えたシーンの3次元形状を再構成する可能性を検討する。公開されているウェブカメラの数は増えており、毎日増えている。論理的な疑問が生まれます - この自由データソースは、余暇活動を超えた何かに使えるのでしょうか? 課題は、これらのカメラの内部、外部、または時間的なキャリブレーションがないことである。コンピュータビジョンの最近の進歩により、我々はカメラの校正に成功し、静的なシーンの3次元再構成を行い、移動物体の3次元軌跡を復元した。

In this paper, we investigate the possibility of reconstructing the 3D geometry of a scene captured by multiple webcams. The number of publicly accessible webcams is already large and it is growing every day. A logical question arises - can we use this free source of data for something beyond leisure activities? The challenge is that no internal, external, or temporal calibration of these cameras is available. We show that using recent advances in computer vision, we successfully calibrate the cameras, perform 3D reconstructions of the static scene and also recover the 3D trajectories of moving objects.

翻訳日:2021-08-24 15:48:00 公開日:2021-08-21

# MOTSynth: 合成データは歩行者の検知と追跡にどのように役立つか?

MOTSynth: How Can Synthetic Data Help Pedestrian Detection and Tracking? ( http://arxiv.org/abs/2108.09518v1 )

ライセンス: Link先を確認

Matteo Fabbri, Guillem Braso, Gianluca Maugeri, Orcun Cetintas, Riccardo Gasparini, Aljosa Osep, Simone Calderara, Laura Leal-Taixe, Rita Cucchiara

(参考訳) ビデオ歩行者検出と追跡のためのディープラーニングに基づく手法は、優れたパフォーマンスを達成するために大量のトレーニングデータを必要とする。しかし、混み合った公共環境におけるデータ取得は、データプライバシの懸念を引き起こす - すべての参加者の明確な同意なしに、単にデータを記録して保存することは許されない。さらに、コンピュータビジョンアプリケーションに対するそのようなデータのアノテーションは通常、特にビデオ領域においてかなりの手作業を必要とする。非常に混み合ったシナリオにおける歩行者のラベル付けは、人間のアノテータであっても困難であり、トレーニングデータにエラーをもたらす可能性がある。本稿では,合成データのみを用いて多人数追跡の異なる側面を前進させる方法について検討する。この目的のために、レンダリングゲームエンジンを用いてオブジェクトの検出と追跡のための大規模で高度に多様な合成データセットMOTSynthを生成する。実験の結果,MOTSynthは,歩行者検出,再識別,セグメンテーション,追跡といったタスクの実際のデータを置き換えるために利用できることがわかった。

Deep learning-based methods for video pedestrian detection and tracking require large volumes of training data to achieve good performance. However, data acquisition in crowded public environments raises data privacy concerns -- we are not allowed to simply record and store data without the explicit consent of all participants. Furthermore, the annotation of such data for computer vision applications usually requires a substantial amount of manual effort, especially in the video domain. Labeling instances of pedestrians in highly crowded scenarios can be challenging even for human annotators and may introduce errors in the training data. In this paper, we study how we can advance different aspects of multi-person tracking using solely synthetic data. To this end, we generate MOTSynth, a large, highly diverse synthetic dataset for object detection and tracking using a rendering game engine. Our experiments show that MOTSynth can be used as a replacement for real data on tasks such as pedestrian detection, re-identification, segmentation, and tracking.

翻訳日:2021-08-24 15:47:48 公開日:2021-08-21

# vision transformer (vit) アーキテクチャを用いた建設監視自動化のための不均衡データセットの構築材料分類

Construction material classification on imbalanced datasets for construction monitoring automation using Vision Transformer (ViT) architecture ( http://arxiv.org/abs/2108.09527v1 )

ライセンス: Link先を確認

Maryam Soleymani, Mahdi Bonyani, Hadi Mahami, Farnad Nasirzadeh

(参考訳) 今日では、自動化は建設プロジェクトの生産性に大きな影響を与えるため、重要なトピックである。この産業における自動化の利用は、建設作業の効率、品質、安全性を著しく向上させるなど、大きな成果をもたらす。建設における自動化の範囲は幅広い段階を含み、建設プロジェクトを監視することは例外ではない。さらに、プロジェクト進捗の正確かつタイムリーな評価によって、マネージャはスケジュールからの逸脱を素早く識別し、必要なアクションを適切なタイミングで行うことができるので、プロジェクト管理において非常に重要です。この段階で最も重要なタスクの1つは、プロジェクト進捗を日々追跡することであり、それは非常に時間がかかり、労働集約的ですが、自動化によってこのタスクが促進され、加速されました。また、多くの危険なタスクのリスクを排除または少なくとも減らした。このようにして、建設自動化の最初のステップは、プロジェクト現場で使われている材料を自動的に検出することである。本稿では,視覚変換器(ViT)と呼ばれる新しいディープラーニングアーキテクチャを用いて,建設材料の検出と分類を行う。提案手法の適用性および性能を評価するため, 従来の論文で用いた構成材料ライブラリ (CML) と構築材料データセット (BMD) の3つの大きな不均衡なデータセットと, それらを組み合わせて作成した新しいデータセットを用いて, 実験を行った。得られた結果から,すべてのパラメータおよび材料カテゴリーで100%の精度が得られた。提案手法は, 異なる材料タイプを検出し, 分類するための新しいロバストなツールであると考えられる。

Nowadays, automation is a critical topic due to its significant impacts on the productivity of construction projects. Utilizing automation in this industry brings about great results, such as remarkable improvements in the efficiency, quality, and safety of construction activities. The scope of automation in construction includes a wide range of stages, and monitoring construction projects is no exception. Additionally, it is of great importance in project management since an accurate and timely assessment of project progress enables managers to quickly identify deviations from the schedule and take the required actions at the right time. In this stage, one of the most important tasks is to daily keep track of the project progress, which is very time-consuming and labor-intensive, but automation has facilitated and accelerated this task. It also eliminated or at least decreased the risk of many dangerous tasks. In this way, the first step of construction automation is to detect used materials in a project site automatically. In this paper, a novel deep learning architecture is utilized, called Vision Transformer (ViT), for detecting and classifying construction materials. To evaluate the applicability and performance of the proposed method, it is trained and tested on three large imbalanced datasets, namely Construction Material Library (CML) and Building Material Dataset (BMD), used in the previous papers, as well as a new dataset created by combining them. The achieved results revealed an accuracy of 100 percent in all parameters and also in each material category. It is believed that the proposed method provides a novel and robust tool for detecting and classifying different material types.

翻訳日:2021-08-24 15:47:32 公開日:2021-08-21

# SSR: シングルビュー2次元から3次元再構成のための半教師付きソフトラスタライザ

SSR: Semi-supervised Soft Rasterizer for single-view 2D to 3D Reconstruction ( http://arxiv.org/abs/2108.09593v1 )

ライセンス: Link先を確認

Issam Laradji, Pau Rodr\'iguez, David Vazquez, Derek Nowrouzezahrai

(参考訳) 最近の研究は、弱い監督下でのオブジェクトメッシュの学習に大きな進歩をもたらした。ソフトラスタライズ法は、視点の教師信号のみを用いて2次元画像からの正確な3次元再構成を実現した。本研究では,このような3次元再構成手法がラベルなし画像を活用できるようにすることで,ラベル付けの労力をさらに削減する。これらのラベルなし画像の視点を得るために、2つの画像を入力として取り、それらが同一の視点に対応するか否かを出力するSiameseネットワークを提案する。トレーニング中は、クロスエントロピー損失を最小化し、一対の画像が同じ視点に属するか否かを正しく予測する確率を最大化する。新しい画像の視点を得るには、トレーニングサンプルから得られたさまざまな視点と比較し、一致確率が最も高い視点を選択する。最後に、ラベルなし画像に最も信頼度の高い予測視点をラベル付けし、微分可能なラスタライズ層を持つディープネットワークを訓練する。実験の結果、2つのオブジェクトのみをラベル付けした場合でも、ラベルなしの例を活用することでShapeNetのIoUが大幅に改善されることがわかった。コードはhttps://github.com/IssamLaradji/SSRで入手できる。

Recent work has made significant progress in learning object meshes with weak supervision. Soft Rasterization methods have achieved accurate 3D reconstruction from 2D images with viewpoint supervision only. In this work, we further reduce the labeling effort by allowing such 3D reconstruction methods to leverage unlabeled images. In order to obtain the viewpoints for these unlabeled images, we propose to use a Siamese network that takes two images as input and outputs whether they correspond to the same viewpoint. During training, we minimize the cross entropy loss to maximize the probability of predicting whether a pair of images belong to the same viewpoint or not. To get the viewpoint of a new image, we compare it against different viewpoints obtained from the training samples and select the viewpoint with the highest matching probability. We finally label the unlabeled images with the most confident predicted viewpoint and train a deep network that has a differentiable rasterization layer. Our experiments show that even labeling only two objects yields significant improvement in IoU for ShapeNet when leveraging unlabeled examples. Code is available at https://github.com/IssamLaradji/SSR.

翻訳日:2021-08-24 15:47:05 公開日:2021-08-21
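
The viewpoint selection step described in the abstract above can be sketched as a simple arg-max over matching probabilities. The callable `match_probability` is a hypothetical stand-in for the trained Siamese network, and the scalar "images" are toy data chosen only so the example runs without any dependencies. The released code at the URL above is the authoritative implementation.

```python
def select_viewpoint(image, labeled_samples, match_probability):
    """Return the viewpoint of the labeled sample that best matches `image`.

    labeled_samples: iterable of (reference_image, viewpoint) pairs.
    match_probability: callable giving P(same viewpoint) for two images.
    """
    best_viewpoint, best_prob = None, -1.0
    for ref_image, viewpoint in labeled_samples:
        p = match_probability(image, ref_image)
        if p > best_prob:
            best_viewpoint, best_prob = viewpoint, p
    return best_viewpoint, best_prob


# Toy stand-in for the Siamese network: "images" are scalars and the
# matching probability decays with their distance (assumption for illustration).
def toy_match_probability(a, b):
    return 1.0 / (1.0 + abs(a - b))


labeled = [(0.0, "front"), (1.0, "side"), (2.0, "back")]
viewpoint, prob = select_viewpoint(1.2, labeled, toy_match_probability)
print(viewpoint, round(prob, 3))
```

The returned probability can serve as the confidence used to decide which pseudo-labeled images to keep for training the reconstruction network.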

# フェアネスを考慮したオンラインメタラーニング

Fairness-Aware Online Meta-learning ( http://arxiv.org/abs/2108.09435v1 )

ライセンス: Link先を確認

Chen Zhao, Feng Chen, Bhavani Thuraisingham

(参考訳) オフラインの学習方式とは対照的に、オンライン学習には2つの研究パラダイムが考案されている。 (1)オンラインメタ学習(OML)は,タスクが次々に現れる逐次的な設定において,モデルパラメータに対する優れた事前分布を学習する(学習の仕方を学習する)。サブ線形の後悔境界を与えるものの、このような手法は、人間の知性の重要な特徴である公平さを伴う学習の重要性を完全に無視している。 (2)オンライン・フェアネス・アウェア・ラーニング。この設定は、公平性が懸念される多くの分類問題を捉えている。しかし、タスク固有の適応なしにゼロショット一般化を達成することを目指しており、モデルが新たに到着したデータに適応する能力が制限される。このような問題を克服し,そのギャップを埋めるために,本稿では,不公平防止の設定下にある新しいオンラインメタ学習アルゴリズムであるFFMLを初めて提案する。 FFMLの重要な部分は、モデルの正確性と公平性にそれぞれ関連づけられた、オンライン公平分類モデルの主パラメータと双対パラメータの優れた事前分布を学習することである。この問題は2レベルの凸凹最適化として定式化される。理論解析は、損失の後悔と累積公平性制約の違反に対してサブ線形の上界を与える。実世界の3つのデータセットの分類にFFMLを適用することでその汎用性を実証し、公平性と分類精度のトレードオフに関して、最良の先行研究よりも大幅な改善を示す。

In contrast to offline working fashions, two research paradigms are devised for online learning: (1) Online Meta Learning (OML) learns good priors over model parameters (or learning to learn) in a sequential setting where tasks are revealed one after another. Although it provides a sub-linear regret bound, such techniques completely ignore the importance of learning with fairness which is a significant hallmark of human intelligence. (2) Online Fairness-Aware Learning. This setting captures many classification problems for which fairness is a concern. But it aims to attain zero-shot generalization without any task-specific adaptation. This therefore limits the capability of a model to adapt to newly arrived data. To overcome such issues and bridge the gap, in this paper for the first time we propose a novel online meta-learning algorithm, namely FFML, which is under the setting of unfairness prevention. The key part of FFML is to learn good priors of an online fair classification model's primal and dual parameters that are associated with the model's accuracy and fairness, respectively. The problem is formulated in the form of a bi-level convex-concave optimization. Theoretical analysis provides sub-linear upper bounds for loss regret and for violation of cumulative fairness constraints. Our experiments demonstrate the versatility of FFML by applying it to classification on three real-world datasets and show substantial improvements over the best prior work on the tradeoff between fairness and classification accuracy.

翻訳日:2021-08-24 15:38:30 公開日:2021-08-21

# 交通事故検出のための不均衡時空間トラヒックフローデータの深い表現

Deep Representation of Imbalanced Spatio-temporal Traffic Flow Data for Traffic Accident Detection ( http://arxiv.org/abs/2108.09506v1 )

ライセンス: Link先を確認

Pouya Mehrannia, Shayan Shirahmad Gale Bagi, Behzad Moshiri, Otman Adam Al-Basir

(参考訳) 交通事故の自動検出は、交通、公共安全、経路計画の改善に重要な影響を及ぼす。事故発生から救助隊派遣までの時間の連続的な減少によって多くの命を救うことができ、またドライバーに代替ルートの選択を通知することで多くの走行時間を節約できる。この問題は、主に事故の稀さと環境の空間的不均一性のために困難である。本稿では,高速道路事故の自動検出のためのLong-Short Term Memory (LSTM) ネットワークを用いたループ検出データの深部表現について検討する。 LSTMベースのフレームワークは、データの次元を減らしながら、エンコードされた特徴空間におけるクラス分離性を高める。ミネソタ州ツインシティーズ・メトロ・フリーウェイズから収集された実事故およびループ検出器データを用いた実験により、lstmネットワークを用いた交通流データの深い表現は、高速道路事故を18分以内の真の正の率 0.71 と偽の正の率 0.25 で検出できる可能性が証明された。

Automatic detection of traffic accidents has a crucial effect on improving transportation, public safety, and path planning. Many lives can be saved by the consequent decrease in the time between when the accidents occur and when rescue teams are dispatched, and much travelling time can be saved by notifying drivers to select alternative routes. This problem is challenging mainly because of the rareness of accidents and spatial heterogeneity of the environment. This paper studies deep representation of loop detector data using Long-Short Term Memory (LSTM) network for automatic detection of freeway accidents. The LSTM-based framework increases class separability in the encoded feature space while reducing the dimension of data. Our experiments on real accident and loop detector data collected from the Twin Cities Metro freeways of Minnesota demonstrate that deep representation of traffic flow data using LSTM network has the potential to detect freeway accidents in less than 18 minutes with a true positive rate of 0.71 and a false positive rate of 0.25 which outperforms other competing methods in the same arrangement.

翻訳日:2021-08-24 15:38:10 公開日:2021-08-21

# DSP-SLAM: 深い形状を持つオブジェクト指向SLAM

DSP-SLAM: Object Oriented SLAM with Deep Shape Priors ( http://arxiv.org/abs/2108.09481v1 )

ライセンス: Link先を確認

Jingwen Wang, Martin R\"unz, Lourdes Agapito

(参考訳) 本稿では,前景オブジェクトの高密度3次元モデルと,背景を表す疎なランドマーク点とからなる,リッチで高精度な統合マップを構築するオブジェクト指向SLAMシステムDSP-SLAMを提案する。 DSP-SLAMは特徴ベースのSLAMシステムによって再構成された3次元点群を入力とし、検出された物体の密な再構成によってその疎なマップを強化する能力を与える。オブジェクトはセマンティックなインスタンスセグメンテーションによって検出され、その形状と姿勢は、カテゴリ固有の深層形状埋め込みを事前知識として、新しい2次最適化によって推定される。我々のオブジェクト認識バンドル調整は、ポーズグラフを構築し、カメラ姿勢、オブジェクト位置、特徴点を同時に最適化する。 DSP-SLAMは、単眼、ステレオ、ステレオ+LiDARの3つの異なる入力モダリティにおいて毎秒10フレームで動作する。FriburgおよびRedwood-OSデータセットの単眼RGBシーケンスと、KITTIオドメトリデータセットのステレオ+LiDARシーケンスにおいて、DSP-SLAMがほぼフレームレートで動作することを実証し、一貫したグローバルマップを維持しながら、部分的な観測からでも高品質で完全なオブジェクト再構成を達成することを示す。評価の結果、近年の深層事前知識に基づく再構成手法と比べてオブジェクトの姿勢と形状の再構成が改善し、KITTIデータセットにおいてカメラトラッキングのドリフトが低減することを示す。

We propose DSP-SLAM, an object-oriented SLAM system that builds a rich and accurate joint map of dense 3D models for foreground objects, and sparse landmark points to represent the background. DSP-SLAM takes as input the 3D point cloud reconstructed by a feature-based SLAM system and equips it with the ability to enhance its sparse map with dense reconstructions of detected objects. Objects are detected via semantic instance segmentation, and their shape and pose is estimated using category-specific deep shape embeddings as priors, via a novel second order optimization. Our object-aware bundle adjustment builds a pose-graph to jointly optimize camera poses, object locations and feature points. DSP-SLAM can operate at 10 frames per second on 3 different input modalities: monocular, stereo, or stereo+LiDAR. We demonstrate DSP-SLAM operating at almost frame rate on monocular-RGB sequences from the Friburg and Redwood-OS datasets, and on stereo+LiDAR sequences on the KITTI odometry dataset showing that it achieves high-quality full object reconstructions, even from partial observations, while maintaining a consistent global map. Our evaluation shows improvements in object pose and shape reconstruction with respect to recent deep prior-based reconstruction methods and reductions in camera tracking drift on the KITTI dataset.

翻訳日:2021-08-24 15:33:14 公開日:2021-08-21

# LiDARパノプティブセグメンテーションにおける従来の点群クラスタリング手法の技術的検討と評価

A Technical Survey and Evaluation of Traditional Point Cloud Clustering Methods for LiDAR Panoptic Segmentation ( http://arxiv.org/abs/2108.09522v1 )

ライセンス: Link先を確認

Yiming Zhao, Xiao Zhang, Xinming Huang

(参考訳) LiDARのパノプティカルセグメンテーションは、自動運転のための新しい技術課題である。一般的なエンドツーエンドのディープラーニングソリューションとは対照的に、セマンティクス情報を抽出する既存のセマンティクスセグメンテーションネットワークと、各インスタンスオブジェクトを分割する従来のlidar point cloud clusterアルゴリズムとのハイブリッド手法を提案する。幾何学に基づく従来のクラスタリングアルゴリズムは、semantickittiデータセットのpanoptic segmentation leaderboard上で公開されたすべてのエンドツーエンドのディープラーニングソリューションの中で最先端のパフォーマンスを示すことによって考慮に値すると論じている。私たちの知る限り、我々はクラスタリングアルゴリズムでpoint cloud panopticセグメンテーションを試した最初の人物です。そこで本研究では,新しいモデルを開発する代わりに,4つの典型的なクラスタ手法を実装し,その性能をベンチマークで報告する。これら4つのクラスタメソッドは、リアルタイム実行速度が最も代表的である。本論文ではC++で実装し,既存のディープラーニングフレームワークとシームレスに統合するためのpython関数としてラップする。この問題に関心のあるピア研究者のためにコードを公開しています。

LiDAR panoptic segmentation is a newly proposed technical task for autonomous driving. In contrast to popular end-to-end deep learning solutions, we propose a hybrid method with an existing semantic segmentation network to extract semantic information and a traditional LiDAR point cloud cluster algorithm to split each instance object. We argue geometry-based traditional clustering algorithms are worth being considered by showing a state-of-the-art performance among all published end-to-end deep learning solutions on the panoptic segmentation leaderboard of the SemanticKITTI dataset. To our best knowledge, we are the first to attempt the point cloud panoptic segmentation with clustering algorithms. Therefore, instead of working on new models, we give a comprehensive technical survey in this paper by implementing four typical cluster methods and report their performances on the benchmark. Those four cluster methods are the most representative ones with real-time running speed. They are implemented with C++ in this paper and then wrapped as a python function for seamless integration with the existing deep learning frameworks. We release our code for peer researchers who might be interested in this problem.

翻訳日:2021-08-24 15:32:46 公開日:2021-08-21
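
The hybrid pipeline described in the abstract above (semantic labels from a network, then a traditional clustering step to split instances) can be sketched as follows. The clustering used here is a plain Euclidean distance-threshold grouping, chosen for brevity; it is an assumption and not necessarily one of the four methods the paper benchmarks. The class names, the `radius` value, and the brute-force neighbour search are likewise illustrative only.

```python
from collections import deque


def sq_dist(p, q):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def euclidean_cluster(points, radius):
    """Give the same cluster id to points connected by hops of at most `radius`."""
    labels = [-1] * len(points)
    r2 = radius * radius
    next_id = 0
    for seed in range(len(points)):
        if labels[seed] != -1:
            continue
        labels[seed] = next_id
        queue = deque([seed])
        while queue:                      # breadth-first region growing
            i = queue.popleft()
            for j in range(len(points)):  # brute force; a k-d tree would be used at scale
                if labels[j] == -1 and sq_dist(points[i], points[j]) <= r2:
                    labels[j] = next_id
                    queue.append(j)
        next_id += 1
    return labels


def panoptic_ids(points, semantic_labels, thing_classes, radius):
    """Cluster only 'thing' points, class by class; 'stuff' points get None."""
    instance = [None] * len(points)
    for cls in sorted(thing_classes):
        idx = [i for i, s in enumerate(semantic_labels) if s == cls]
        local = euclidean_cluster([points[i] for i in idx], radius)
        for i, cid in zip(idx, local):
            instance[i] = (cls, cid)
    return instance


# Toy scene: two separated groups of "car" points and one "road" point.
points = [(0.0, 0.0, 0.0), (0.3, 0.0, 0.0), (5.0, 0.0, 0.0), (5.2, 0.1, 0.0), (2.0, 2.0, 0.0)]
semantic = ["car", "car", "car", "car", "road"]
print(panoptic_ids(points, semantic, {"car"}, radius=0.5))
```

Running the clustering per semantic class keeps points of different classes from merging into one instance even when they are spatially close.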

# 医用画像分割のための深層学習法の系統的臨床評価--ラジオサージリー応用

Systematic Clinical Evaluation of A Deep Learning Method for Medical Image Segmentation: Radiosurgery Application ( http://arxiv.org/abs/2108.09535v1 )

ライセンス: Link先を確認

Boris Shirokikh, Alexandra Dalechina, Alexey Shevtsov, Egor Krivov, Valery Kostjuchenko, Amayak Durgaryan, Mikhail Galkin, Andrey Golanov and Mikhail Belyaev

(参考訳) 3次元医用画像セグメンテーションタスクにおいて,深層学習(DL)手法を体系的に評価する。このセグメンテーション法は放射線手術の治療プロセスに統合されており,臨床ワークフローに直接影響を及ぼす。提案手法では,手動セグメンテーションの相対的な欠点,すなわち,評価者間の輪郭描出のばらつきの大きさと,輪郭描出プロセスにかかる時間の長さに対処する。既存の評価に対する主な拡張は、他の医用画像セグメンテーションタスクにも一般化しうる、慎重で詳細な分析である。第1に,評価者間の検出一致度の変化を解析する。セグメンテーションモデルは検出不一致の比率を0.162から0.085に減少させる(p < 0.05)。第2に,このモデルが評価者間の輪郭描出一致度を表面Diceスコアで0.845から0.871に向上させることを示す(p < 0.05)。第3に、モデルが輪郭描出プロセスを1.6倍から2.0倍に高速化することを示す(p < 0.05)。最後に,評価バイアスを排除または推定できるように臨床実験の設定を設計し,結果の有意性を保つ。臨床評価に加えて、3次元医用画像セグメンテーションのための効率的なDLベースモデルを構築するための直感と実践的アイデアも要約する。

We systematically evaluate a Deep Learning (DL) method in a 3D medical image segmentation task. Our segmentation method is integrated into the radiosurgery treatment process and directly impacts the clinical workflow. With our method, we address the relative drawbacks of manual segmentation: high inter-rater contouring variability and high time consumption of the contouring process. The main extension over the existing evaluations is the careful and detailed analysis that could be further generalized to other medical image segmentation tasks. Firstly, we analyze the changes in the inter-rater detection agreement. We show that the segmentation model reduces the ratio of detection disagreements from 0.162 to 0.085 (p < 0.05). Secondly, we show that the model improves the inter-rater contouring agreement from 0.845 to 0.871 surface Dice Score (p < 0.05). Thirdly, we show that the model accelerates the delineation process by a factor of 1.6 to 2.0 (p < 0.05). Finally, we design the setup of the clinical experiment to either exclude or estimate the evaluation biases, thus preserving the significance of the results. Besides the clinical evaluation, we also summarize the intuitions and practical ideas for building an efficient DL-based model for 3D medical image segmentation.

翻訳日:2021-08-24 15:32:29 公開日:2021-08-21

# クロスアテンションディープネットワークを用いたマルチモーダル乳腺病変分類

Multimodal Breast Lesion Classification Using Cross-Attention Deep Networks ( http://arxiv.org/abs/2108.09591v1 )

ライセンス: Link先を確認

Hung Q. Vo, Pengyu Yuan, Tiancheng He, Stephen T.C. Wong, and Hien V. Nguyen

(参考訳) 正確な乳房病変リスク推定は、不要な生検を著しく減らし、医師が最適な治療計画を決定するのに役立つ。既存のコンピュータ支援システムのほとんどは、乳腺病変を分類するためにマンモグラムの特徴のみに依存している。このアプローチは便利であるが、最適な性能を達成するために臨床報告内の有用な情報を十分に活用していない。臨床的特徴は、マンモグラム単独の場合と比べて乳房病変の分類を有意に改善するだろうか? 医療実践のばらつきによって生じる臨床情報の欠落にはどう対処すべきか? マンモグラムと臨床的特徴を組み合わせる最善の方法は何か? これらの根本的な問いに答えるための体系的な研究が強く求められている。本稿では, マンモグラムとカテゴリカルな臨床変数を組み合わせるために, 特徴連結, クロスアテンション, コアテンションに基づく複数のマルチモーダルディープネットワークについて検討する。提案するアーキテクチャにより,病変分類性能が著しく向上することを示す(ROC曲線下面積の平均が0.89から0.94に向上)。また,臨床変数が欠落している場合のモデルも評価する。

Accurate breast lesion risk estimation can significantly reduce unnecessary biopsies and help doctors decide optimal treatment plans. Most existing computer-aided systems rely solely on mammogram features to classify breast lesions. While this approach is convenient, it does not fully exploit useful information in clinical reports to achieve optimal performance. Would clinical features significantly improve breast lesion classification compared to using mammograms alone? How should missing clinical information caused by variation in medical practice be handled? What is the best way to combine mammograms and clinical features? There is a compelling need for a systematic study to address these fundamental questions. This paper investigates several multimodal deep networks based on feature concatenation, cross-attention, and co-attention to combine mammograms and categorical clinical variables. We show that the proposed architectures significantly increase the lesion classification performance (average area under ROC curves from 0.89 to 0.94). We also evaluate the model when clinical variables are missing.

翻訳日:2021-08-24 15:32:10 公開日:2021-08-21

# サブ国家レベルの解像度でCOVID-19の今後の知見を可能にする汎用予測ソリューション

A generalized forecasting solution to enable future insights of COVID-19 at sub-national level resolutions ( http://arxiv.org/abs/2108.09556v1 )

ライセンス: Link先を確認

Umar Marikkar, Harshana Weligampola, Rumali Perera, Jameel Hassan, Suren Sritharan, Gihan Jayatilaka, Roshan Godaliyadda, Vijitha Herath, Parakrama Ekanayake, Janaka Ekanayake, Anuruddhika Rathnayake, Samath Dharmaratne

(参考訳) 新型コロナウイルスは公衆衛生に大きな影響を与え続けている。この影響を最小限に抑えるため、政策立案者は封じ込め措置を講じるが、誤った脅威評価の結果として実際の脅威に対して不釣り合いに実施された場合、望ましくない長期的な社会経済的問題を引き起こす。さらに、マクロレベルや全国レベルの意思決定は、小さな地域での局所的な感受性を考慮できない。したがって、正確な予測を通じて、COVID-19の時間的な挙動に関する洞察を提供する地域ごとの脅威評価の必要性が生じる。本研究では、封じ込め措置を局所的に実施できるほど小さな地域におけるCOVID-19の日次新規感染者数を予測する予測ソリューションを提案し、文献に存在する3つの主な欠点、すなわち、小地域での一貫性のない検査パターンに起因する既存データの信頼性の低さ、未知の地域での症例予測に対する予測モデルのデプロイ可能性の低さ、およびCOVID-19のエピカーブにおけるデータの不均衡性に起因するモデル学習のバイアスに対処する。そこで本研究の貢献は次の3つである。その地域の疫学的なダイナミクスに基づいて決定論性の低いエピカーブを平滑化するための最適化された平滑化手法,履歴データが不足している地域でのデプロイ可能性を最大化する代表的かつ多様なトレーニングセットを構成するよう選択した地域のデータで学習した長短期記憶(LSTM)ベースの予測モデル,およびエピカーブに見られるデータ不均衡を緩和するための学習中の適応損失関数である。提案する平滑化手法,一般化トレーニング戦略,適応損失関数は予測全体の精度を大きく向上させ,より局所的なマイクロレベルでの効率的な封じ込め措置を可能にする。

COVID-19 continues to cause a significant impact on public health. To minimize this impact, policy makers undertake containment measures that, when carried out disproportionately to the actual threat as a result of erroneous threat assessment, cause undesirable long-term socio-economic complications. In addition, macro-level or national-level decision making fails to consider the localized sensitivities in small regions. Hence, the need arises for region-wise threat assessments that provide insights on the behaviour of COVID-19 through time, enabled through accurate forecasts. In this study, a forecasting solution is proposed to predict daily new cases of COVID-19 in regions small enough that containment measures could be locally implemented, by targeting three main shortcomings that exist in the literature: the unreliability of existing data caused by inconsistent testing patterns in smaller regions, weak deployability of forecasting models towards predicting cases in previously unseen regions, and model training biases caused by the imbalanced nature of data in COVID-19 epi-curves. Hence, the contributions of this study are three-fold: an optimized smoothing technique to smooth less deterministic epi-curves based on the epidemiological dynamics of that region, a Long-Short-Term-Memory (LSTM) based forecasting model trained using data from select regions to create a representative and diverse training set that maximizes deployability in regions lacking historical data, and an adaptive loss function used during training to mitigate the data imbalances seen in epi-curves. The proposed smoothing technique, the generalized training strategy and the adaptive loss function largely increased the overall accuracy of the forecast, which enables efficient containment measures at a more localized micro-level.

翻訳日:2021-08-24 15:28:58 公開日:2021-08-21

# プログラマブルFPGAベースのメモリコントローラ

Programmable FPGA-based Memory Controller ( http://arxiv.org/abs/2108.09601v1 )

ライセンス: Link先を確認

Sasindu Wijeratne, Sanket Pattnaik, Zhiyu Chen, Rajgopal Kannan, Viktor Prasanna

(参考訳) DRAM技術の世代別改良にもかかわらず、メモリアクセスレイテンシは依然としてアプリケーションアクセラレーターの主要なボトルネックであり、主にターゲットアプリケーション、使用するアルゴリズム、アクセラレーターアーキテクチャのバリエーションを十分に考慮できないメモリインターフェースIPの制限のためである。本稿では,異なるアプリケーション用のメモリコントローラの開発に時間を要するため,利用可能なハードウェアリソース上で,異なるターゲットアプリケーション用に設定可能なモジュール型でプログラム可能なメモリコントローラを提案する。提案するメモリコントローラはバルクメモリ転送とともにキャッシュラインアクセスを効率的にサポートする。ユーザーはFPGA上の利用可能なロジックリソース、メモリアクセスパターン、および外部メモリ仕様に応じてコントローラを設定することができる。モジュール設計は、要求スケジューリング、内部キャッシュ、直接メモリアクセスを含む様々なメモリアクセス最適化技術をサポートする。これらの技術は、高い持続帯域幅を維持しながら、全体のレイテンシを低減することに寄与する。本研究では,最先端FPGA上に実装し,グラフ解析とディープラーニング処理という2つの広く研究されている領域を用いて性能評価を行う。商用メモリコントローラIPと比較して,CNNおよびGCNワークロードのメモリアクセス時間は最大58%向上した。

Even with generational improvements in DRAM technology, memory access latency remains the major bottleneck for application accelerators, primarily due to limitations in memory interface IPs, which cannot fully account for variations in target applications, the algorithms used, and accelerator architectures. Since developing memory controllers for different applications is time-consuming, this paper introduces a modular and programmable memory controller that can be configured for different target applications on available hardware resources. The proposed memory controller efficiently supports cache-line accesses along with bulk memory transfers. The user can configure the controller depending on the available logic resources on the FPGA, memory access pattern, and external memory specifications. The modular design supports various memory access optimization techniques, including request scheduling, internal caching, and direct memory access. These techniques contribute to reducing the overall latency while maintaining high sustained bandwidth. We implement the system on a state-of-the-art FPGA and evaluate its performance using two widely studied domains: graph analytics and deep learning workloads. We show that overall memory access time improves by up to 58% on CNN and GCN workloads compared with commercial memory controller IPs.

翻訳日:2021-08-24 15:28:28 公開日:2021-08-21

# 多様な時間スケールを用いた貯留層計算によるマルチスケールダイナミクスの予測

Reservoir Computing with Diverse Timescales for Prediction of Multiscale Dynamics ( http://arxiv.org/abs/2108.09446v1 )

ライセンス: Link先を確認

Gouhei Tanaka, Tadayoshi Matsumori, Hiroaki Yoshida, Kazuyuki Aihara

(参考訳) 機械学習のアプローチは最近、動的システムに対する物理的・数学的モデリングアプローチの代替または補助として活用されている。マルチスケールダイナミクスのモデリングと予測に特化した効率的な機械学習手法を開発するために,不均一なリーキー積分器ニューロンのリカレントネットワークを用いて,多様な時間スケールを持つ貯留層計算モデルを提案する。サブシステムダイナミクスの時間スケールに大きなギャップを含む高速-低速カオス力学系の予測タスクにおいて,提案モデルが既存の標準モデルよりも高いポテンシャルを持ち,リーク率パラメータを最適化しなくても,標準モデルの最良の場合に匹敵する性能が得られることを実証する。解析の結果, 対象ダイナミクスの各成分を生成するのに必要な時間スケールが, モデル学習によって貯留層のダイナミクスから適切かつ柔軟に選択されることが判明した。

Machine learning approaches have recently been leveraged as a substitute or an aid for physical/mathematical modeling approaches to dynamical systems. To develop an efficient machine learning method dedicated to modeling and prediction of multiscale dynamics, we propose a reservoir computing model with diverse timescales by using a recurrent network of heterogeneous leaky integrator neurons. In prediction tasks with fast-slow chaotic dynamical systems including a large gap in the timescales of their subsystem dynamics, we demonstrate that the proposed model has a higher potential than the existing standard model and yields a performance comparable to the best performance of the standard model even without an optimization of the leak rate parameter. Our analysis reveals that the timescales required for producing each component of target dynamics are appropriately and flexibly selected from the reservoir dynamics by model training.
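
The abstract above describes a reservoir of leaky integrator neurons with heterogeneous timescales. A minimal sketch of that idea follows, assuming the common leaky echo-state update x ← (1 − a) ∘ x + a ∘ tanh(W_in u + W x) with one leak rate a per neuron. The function names, weight ranges, and the log-uniform leak-rate distribution are illustrative assumptions, not the authors' configuration, and no readout training is shown.

```python
import math
import random


def make_reservoir(n_neurons, n_inputs, seed=0):
    """Build random weights and per-neuron leak rates (all ranges are illustrative)."""
    rng = random.Random(seed)
    w_in = [[rng.uniform(-1.0, 1.0) for _ in range(n_inputs)] for _ in range(n_neurons)]
    # Small recurrent weights keep this toy reservoir well behaved.
    w = [[rng.uniform(-0.1, 0.1) for _ in range(n_neurons)] for _ in range(n_neurons)]
    # Heterogeneous leak rates, log-uniform between 0.01 and 1: one timescale per neuron.
    leak = [10 ** rng.uniform(-2.0, 0.0) for _ in range(n_neurons)]
    return w_in, w, leak


def step(state, u, w_in, w, leak):
    """One leaky-integrator update: x_i <- (1 - a_i) * x_i + a_i * tanh(pre_i)."""
    new_state = []
    for i, a in enumerate(leak):
        pre = sum(w_in[i][j] * u[j] for j in range(len(u)))
        pre += sum(w[i][k] * state[k] for k in range(len(state)))
        new_state.append((1.0 - a) * state[i] + a * math.tanh(pre))
    return new_state


def run(inputs, n_neurons=20, seed=0):
    """Drive the reservoir with a sequence of input vectors; return all states and leak rates."""
    w_in, w, leak = make_reservoir(n_neurons, len(inputs[0]), seed)
    state = [0.0] * n_neurons
    states = []
    for u in inputs:
        state = step(state, u, w_in, w, leak)
        states.append(state)
    return states, leak
```

Neurons with a small leak rate change slowly and track slow components of the input, while neurons with a leak rate near 1 follow fast components; a linear readout trained on the collected states could then weight whichever timescales the target requires.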

翻訳日:2021-08-24 15:27:12 公開日:2021-08-21

# グラフニューラルネットワークに対するハードラベルブラックボックスの逆攻撃

A Hard Label Black-box Adversarial Attack Against Graph Neural Networks ( http://arxiv.org/abs/2108.09513v1 )

ライセンス: Link先を確認

Jiaming Mu, Binghui Wang, Qi Li, Kun Sun, Mingwei Xu, Zhuotao Liu

(参考訳) グラフニューラルネットワーク(GNN)は,ノード分類やグラフ分類などの様々なグラフ構造関連タスクにおいて,最先端のパフォーマンスを実現している。しかし、GNNは敵対的攻撃に弱い。既存の研究は主にノード分類のためのGNNに対する攻撃に焦点を当てているが、グラフ分類のためのGNNに対する攻撃は十分に研究されていない。本研究では,グラフ構造を摂動することで,グラフ分類のためのGNNに対する敵対的攻撃を系統的に研究する。特に、最も困難な攻撃、すなわち、攻撃者がターゲットGNNモデルについて知識を持たず、ターゲットモデルに問い合わせることによって予測ラベルしか取得できないハードラベルブラックボックス攻撃に注目する。この目的を達成するために、高い攻撃成功率を維持しながらグラフ内で摂動されるエッジの数を最小化する最適化問題として攻撃を定式化する。元の最適化問題は解くことが困難であるため、これを扱いやすい問題に緩和し、理論的な収束保証のもとで解く。また、ターゲットGNNモデルに対するクエリ数を減少させるために、粗粒度探索アルゴリズムとクエリ効率の高い勾配計算アルゴリズムを設計する。実世界の3つのデータセットに対する実験結果から,本攻撃はより少ないクエリと摂動で,グラフ分類のための代表的なGNNを効果的に攻撃できることが示された。また,本攻撃の有効性を2つの防御の下で評価した。1つは適切に設計された敵対的グラフ検出器であり、もう1つはターゲットのGNNモデル自体が敵対的グラフの生成を防ぐ防御機能を備えている場合である。実験の結果,このような防御は十分に効果的ではなく,より高度な防御の必要性が浮き彫りになった。

Graph Neural Networks (GNNs) have achieved state-of-the-art performance in various graph structure related tasks such as node classification and graph classification. However, GNNs are vulnerable to adversarial attacks. Existing works mainly focus on attacking GNNs for node classification; nevertheless, the attacks against GNNs for graph classification have not been well explored. In this work, we conduct a systematic study on adversarial attacks against GNNs for graph classification via perturbing the graph structure. In particular, we focus on the most challenging attack, i.e., hard label black-box attack, where an attacker has no knowledge about the target GNN model and can only obtain predicted labels through querying the target model. To achieve this goal, we formulate our attack as an optimization problem, whose objective is to minimize the number of edges to be perturbed in a graph while maintaining a high attack success rate. The original optimization problem is intractable to solve, so we relax it to a tractable one, which is solved with a theoretical convergence guarantee. We also design a coarse-grained searching algorithm and a query-efficient gradient computation algorithm to decrease the number of queries to the target GNN model. Our experimental results on three real-world datasets demonstrate that our attack can effectively attack representative GNNs for graph classification with fewer queries and perturbations. We also evaluate the effectiveness of our attack under two defenses: one is a well-designed adversarial graph detector and the other is that the target GNN model itself is equipped with a defense to prevent adversarial graph generation. Our experimental results show that such defenses are not effective enough, which highlights the need for more advanced defenses.

翻訳日:2021-08-24 15:26:58 公開日:2021-08-21

# 確率ベイズゲームにおける時間的自己プレイ

Temporal Induced Self-Play for Stochastic Bayesian Games ( http://arxiv.org/abs/2108.09444v1 )

ライセンス: Link先を確認

Weizhe Chen, Zihan Zhou, Yi Wu, Fei Fang

(参考訳) ダイナミックゲームを解くための実践的な要件は、プレイヤーがいかなる決定点からでもうまくプレーすることを保証することである。この要件を満たすため、既存の取り組みは均衡精緻化に重点を置いているが、既存の技術のスケーラビリティと適用性は限られている。本稿では,任意の決定点以降で適切な性能を持つ戦略を見出すための新しい強化学習ベースのフレームワークTISP(Temporal-Induced Self-Play)を提案する。 TISPは、信念空間表現、後ろ向き帰納法、方策学習、および非パラメトリック近似を使用する。 TISPを基盤として,方策勾配ベースのアルゴリズムであるTISP-PGを設計する。有限ホライズンのゼロサム片側確率ベイズゲームにおいて、TISPベースのアルゴリズムが近似完全ベイズ均衡を見つけられることを証明する。有限回繰り返しセキュリティゲームやグリッドワールドゲームなど,様々なゲームでTISPベースのアルゴリズムをテストする。その結果,TISP-PGは既存の数理計画法ベースの手法よりもスケーラブルであり,他の学習ベースの手法を大幅に上回ることが示された。

One practical requirement in solving dynamic games is to ensure that the players play well from any decision point onward. To satisfy this requirement, existing efforts focus on equilibrium refinement, but the scalability and applicability of existing techniques are limited. In this paper, we propose Temporal-Induced Self-Play (TISP), a novel reinforcement learning-based framework to find strategies with decent performance from any decision point onward. TISP uses belief-space representation, backward induction, policy learning, and non-parametric approximation. Building upon TISP, we design a policy-gradient-based algorithm, TISP-PG. We prove that TISP-based algorithms can find an approximate Perfect Bayesian Equilibrium in zero-sum one-sided stochastic Bayesian games with finite horizon. We test TISP-based algorithms in various games, including finitely repeated security games and a grid-world game. The results show that TISP-PG is more scalable than existing mathematical programming-based methods and significantly outperforms other learning-based methods.

翻訳日:2021-08-24 15:21:02 公開日:2021-08-21

# 欠損環境データに対する補完手法の計算的研究

A computational study on imputation methods for missing environmental data ( http://arxiv.org/abs/2108.09500v1 )

ライセンス: Link先を確認

Paul Dixneuf and Fausto Errico and Mathias Glaus

(参考訳) データベース形式でのデータ取得と記録は日常的な操作である。しかし、データ収集のプロセスでは不規則な事象が生じ、欠損データを含むデータベースが発生する可能性がある。欠損エントリは分析効率を変化させ、その結果、関連する意思決定プロセスを変化させる可能性がある。本稿では,自然環境に関する情報を収集するデータベースに焦点を当てる。記録される活動が広範囲にわたるため、これらのデータベースは典型的には混合型である。したがって、この特性を考慮して欠損データ処理手法の性能を評価することは重要である。本稿では,いくつかの欠損データ補完手法の性能と,環境分野における欠損データ問題への適用について検討する。計算実験により,missForest (MF) を,連鎖方程式による多変量補完 (MICE) と k近傍法 (KNN) という2つの補完手法と比較した。さまざまなタイプの10個の前処理済みデータセットでテストを行った。その結果,補完誤差の点でMFは概してMICEとKNNより優れており,混合型データベースでは性能差がより顕著で,MFは他の手法と比較して補完誤差を最大150%削減した。通常、KNNが最速の手法であった。その後,MFはケベック州の排水処理プラントの性能モニタリングに関するケーススタディにうまく適用された。本研究は, 欠損環境データを扱う際に, MFを補完手法として用いることの妥当性を示すものと考える。

Data acquisition and recording in the form of databases are routine operations. The process of collecting data, however, may experience irregularities, resulting in databases with missing data. Missing entries might alter analysis efficiency and, consequently, the associated decision-making process. This paper focuses on databases collecting information related to the natural environment. Given the broad spectrum of recorded activities, these databases are typically of a mixed nature. It is therefore relevant to evaluate the performance of missing data processing methods considering this characteristic. In this paper we investigate the performance of several missing data imputation methods and their application to the problem of missing environmental data. A computational study was performed to compare the method missForest (MF) with two other imputation methods, namely Multivariate Imputation by Chained Equations (MICE) and K-Nearest Neighbors (KNN). Tests were made on 10 pretreated datasets of various types. Results revealed that MF generally outperformed MICE and KNN in terms of imputation errors, with a more pronounced performance gap for mixed-type databases, where MF reduced the imputation error by up to 150% when compared to the other methods. KNN was usually the fastest method. MF was then successfully applied to a case study on Quebec wastewater treatment plant performance monitoring. We believe that the present study demonstrates the pertinence of using MF as an imputation method when dealing with missing environmental data.
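
Of the three imputation methods compared above, K-Nearest Neighbors is the simplest to illustrate. The sketch below is a toy, numeric-only version written for this listing: it assumes missing entries are marked with `None`, measures distance over the columns two rows share, and fills a gap with the mean of the k nearest rows. The function name `knn_impute` and these choices are assumptions for illustration; the study's actual KNN configuration and its handling of mixed-type data are not described in the abstract.

```python
import math


def knn_impute(rows, k=2):
    """Fill None entries with the mean of that column over the k nearest rows.

    rows: list of equal-length lists of floats, with None marking a missing value.
    Distances use only the columns observed in both rows (root mean squared difference).
    """
    imputed = [list(r) for r in rows]  # work on a copy; the input is left untouched
    for i, row in enumerate(rows):
        for j, value in enumerate(row):
            if value is not None:
                continue
            candidates = []
            for other_index, other in enumerate(rows):
                if other_index == i or other[j] is None:
                    continue  # a neighbor must actually observe column j
                shared = [(a, b) for a, b in zip(row, other)
                          if a is not None and b is not None]
                if not shared:
                    continue  # no common columns, so no distance can be computed
                dist = math.sqrt(sum((a - b) ** 2 for a, b in shared) / len(shared))
                candidates.append((dist, other[j]))
            candidates.sort(key=lambda pair: pair[0])
            nearest = [v for _, v in candidates[:k]]
            if nearest:
                imputed[i][j] = sum(nearest) / len(nearest)
    return imputed
```

Entries with no usable neighbor stay `None`, which makes it easy to see where this toy version falls short.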

翻訳日:2021-08-24 15:20:46 公開日:2021-08-21

# ソフトウェア工学における用語相互関係と動向

Term Interrelations and Trends in Software Engineering ( http://arxiv.org/abs/2108.09529v1 )

ライセンス: Link先を確認

Janusan Baskararajah and Lei Zhang and Andriy Miranskyy

(参考訳) ソフトウェアエンジニアリング(SE)コミュニティは多作であり、専門家が新しい論文の洪水に追随することや、初学者がこの分野に参入することを困難にしている。そこで我々は,SEコミュニティのテキストコーパスから用語とその相互関係を抽出し,用語の傾向を示すツールがコミュニティの役に立つと考える。本稿では,単語埋め込み技術を用いたプロトタイプツールを構築する。SE Body of Knowledgeハンドブックと15,233件の研究論文のタイトルおよび要約を用いて埋め込みを訓練する。また、埋め込みの訓練の検証に必要なテストケースを作成する。埋め込みが用語の要約や知識ベースの傾向の発見に役立つ可能性を示す代表的な例を示す。

The Software Engineering (SE) community is prolific, making it challenging for experts to keep up with the flood of new papers and for neophytes to enter the field. Therefore, we posit that the community may benefit from a tool extracting terms and their interrelations from the SE community's text corpus and showing terms' trends. In this paper, we build a prototyping tool using the word embedding technique. We train the embeddings on the SE Body of Knowledge handbook and 15,233 research papers' titles and abstracts. We also create test cases necessary for validation of the training of the embeddings. We provide representative examples showing that the embeddings may aid in summarizing terms and uncovering trends in the knowledge base.

翻訳日:2021-08-24 15:20:26 公開日:2021-08-21

# 時空間データ調音のための成長変換力学系の利用

Using growth transform dynamical systems for spatio-temporal data sonification ( http://arxiv.org/abs/2108.09537v1 )

ライセンス: Link先を確認

Oindrila Chatterjee, Shantanu Chakrabartty

(参考訳) 有意義な音声シグネチャで情報を符号化するソニフィケーションは、人間のループ内決定のための従来の可視化手法の強化や置き換えにいくつかの利点がある。文献で報告されている標準的な音素化手法は、(i)変数のサブセットのみを使用するか、(ii)データ上の学習タスクを最初に解決し、次いで、エンドユーザーが決定するために使用する音声波形に出力をマッピングする。本稿では, 複合成長変換力学系モデルを用いて, 学習(あるいはより一般的には最適化)と音化過程を統合した, 高次元データを音化するための新しい枠組みを提案する。本アルゴリズムは,学習課題や予測課題の根底にあるデータと最適化パラメータを入力として,ユーザが定義する心理音響パラメータと組み合わせる。その結果、高次元データの統計特性を符号化するだけでなく、最適化・学習プロセスの基盤となる複雑さを明らかにするバイノーラル音声シグネチャを出力する。合成データセットを用いた広範囲な実験とともに、小児のてんかん発作を検出する可能性を持つ脳波解析(eeg)の枠組みを実証する。

Sonification, or encoding information in meaningful audio signatures, has several advantages in augmenting or replacing traditional visualization methods for human-in-the-loop decision-making. Standard sonification methods reported in the literature involve either (i) using only a subset of the variables, or (ii) first solving a learning task on the data and then mapping the output to an audio waveform, which is utilized by the end-user to make a decision. This paper presents a novel framework for sonifying high-dimensional data using a complex growth transform dynamical system model where both the learning (or, more generally, optimization) and the sonification processes are integrated together. Our algorithm takes as input the data and optimization parameters underlying the learning or prediction task and combines it with the psychoacoustic parameters defined by the user. As a result, the proposed framework outputs binaural audio signatures that not only encode some statistical properties of the high-dimensional data but also reveal the underlying complexity of the optimization/learning process. Along with extensive experiments using synthetic datasets, we demonstrate the framework on sonifying Electro-encephalogram (EEG) data with the potential for detecting epileptic seizures in pediatric patients.

翻訳日:2021-08-24 15:20:14 公開日:2021-08-21

# 多様な動作予測のための滑らかなポーズ列の生成

Generating Smooth Pose Sequences for Diverse Human Motion Prediction ( http://arxiv.org/abs/2108.08422v2 )

ライセンス: Link先を確認

Wei Mao, Miaomiao Liu, Mathieu Salzmann

(参考訳) 確率的動き予測の最近の進歩、すなわち、1つの過去のポーズシーケンスが与えられた複数の将来の人間の動きを予測することは、非常に多様な将来の動きを生み出し、いくつかの身体部分の運動を制御することさえもたらした。しかし、これを実現するためには、多様性のためのいくつかのマッピングと、制御可能な動き予測のための専用モデルを学ぶ必要がある。本稿では,多様かつ制御可能な動き予測のための統合型深層生成ネットワークを提案する。この目的のために、現実的な人間の動きは有効なポーズの滑らかなシーケンスで構成されており、限られたデータを考えると、ポーズの事前学習は動きよりもずっと扱いやすいという直観を活用できる。そこで我々は,各部位の動作を逐次予測するジェネレータを設計し,動作リアリズムを実現するために,関節角度の損失とともに正規化フローベースのポーズを導入し,サンプルの多様性と精度の両面で,我々のアプローチが最先端のベースラインより優れていることを示す。コードはhttps://github.com/wei-mao-2019/gspsで入手できる。

Recent progress in stochastic motion prediction, i.e., predicting multiple possible future human motions given a single past pose sequence, has led to producing truly diverse future motions and even providing control over the motion of some body parts. However, to achieve this, the state-of-the-art method requires learning several mappings for diversity and a dedicated model for controllable motion prediction. In this paper, we introduce a unified deep generative network for both diverse and controllable motion prediction. To this end, we leverage the intuition that realistic human motions consist of smooth sequences of valid poses, and that, given limited data, learning a pose prior is much more tractable than a motion one. We therefore design a generator that predicts the motion of different body parts sequentially, and introduce a normalizing flow based pose prior, together with a joint angle loss, to achieve motion realism. Our experiments on two standard benchmark datasets, Human3.6M and HumanEva-I, demonstrate that our approach outperforms the state-of-the-art baselines in terms of both sample diversity and accuracy. The code is available at https://github.com/wei-mao-2019/gsps

翻訳日:2021-08-24 11:29:39 公開日:2021-08-21

# 知識グラフを用いた質問応答のためのトップk演算子を用いた効率的な文脈化

Efficient Contextualization using Top-k Operators for Question Answering over Knowledge Graphs ( http://arxiv.org/abs/2108.08597v2 )

ライセンス: Link先を確認

Philipp Christmann, Rishiraj Saha Roy, Gerhard Weikum

(参考訳) 知識ベース(KB-QA)に関する複雑な疑問に答えるには、数百万のエンティティと数千の述語を含む何十億もの事実を含む膨大な入力データに直面する。効率性のために、QAシステムはまず、すべての回答と関連する手がかりを含む可能性のある事実の集合を特定することによって、回答検索空間を縮小する。最も一般的なテクニックは、名前付きエンティティ曖昧化(NED)システムを問題に適用し、曖昧なエンティティに対してKB事実を検索することである。本研究は,KB対応信号を用いて検索空間の無関係な部分を抽出する効率的なECQAを提案する。 ECQAは、語彙マッチング、質問への関連性、候補項目間のコヒーレンス、KBグラフの接続性といった信号を組み合わせたKB項目のスコア順リスト上のトップkクエリ処理に基づいている。最近の2つのQAベンチマークによる実験は、解答の有無、検索空間のサイズ、ランタイムに関して、最先端のベースラインよりもECQAの方が優れていることを示している。

Answering complex questions over knowledge bases (KB-QA) faces huge input data with billions of facts, involving millions of entities and thousands of predicates. For efficiency, QA systems first reduce the answer search space by identifying a set of facts that is likely to contain all answers and relevant cues. The most common technique is to apply named entity disambiguation (NED) systems to the question, and retrieve KB facts for the disambiguated entities. This work presents ECQA, an efficient method that prunes irrelevant parts of the search space using KB-aware signals. ECQA is based on top-k query processing over score-ordered lists of KB items that combine signals about lexical matching, relevance to the question, coherence among candidate items, and connectivity in the KB graph. Experiments with two recent QA benchmarks demonstrate the superiority of ECQA over state-of-the-art baselines with respect to answer presence, size of the search space, and runtimes.
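
The abstract above mentions top-k query processing over score-ordered lists. As a generic illustration of that family of techniques, here is a minimal sketch of a Fagin-style threshold algorithm. It assumes non-negative scores, sum aggregation, and a score of 0 for items absent from a list; the function name and list contents are hypothetical, and this is not presented as ECQA's actual operator.

```python
import heapq


def threshold_top_k(sorted_lists, k):
    """Return the k items with the highest summed score as (item, total) pairs.

    sorted_lists: list of lists of (item, score) pairs, each sorted by score, descending.
    Scanning stops early once no unseen item can still enter the top k.
    """
    lookup = [dict(lst) for lst in sorted_lists]  # random access into each list
    seen = {}
    max_depth = max((len(lst) for lst in sorted_lists), default=0)
    for depth in range(max_depth):
        threshold = 0.0  # best total score any still-unseen item could reach
        for lst in sorted_lists:
            if depth < len(lst):
                item, score = lst[depth]
                threshold += score
                if item not in seen:
                    seen[item] = sum(d.get(item, 0.0) for d in lookup)
        top = heapq.nlargest(k, seen.items(), key=lambda kv: kv[1])
        if len(top) == k and top[-1][1] >= threshold:
            return top  # early termination: the remaining list tails are never read
    return heapq.nlargest(k, seen.items(), key=lambda kv: kv[1])
```

For example, with one list ordered by a lexical-match signal and another by a relevance signal, the function can return the best candidates after reading only the first few entries of each list, which is the source of the efficiency gain such operators aim for.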

翻訳日:2021-08-24 11:29:18 公開日:2021-08-21

# マルチセンターフェデレーションラーニング

Multi-Center Federated Learning ( http://arxiv.org/abs/2108.08647v2 )

ライセンス: Link先を確認

Ming Xie, Guodong Long, Tao Shen, Tianyi Zhou, Xianzhi Wang, Jing Jiang, Chengqi Zhang

(参考訳) フェデレーション学習(federated learning, FL)は、ユーザのデータにアクセスせずにローカルな勾配のみを収集するため、分散学習におけるデータのプライバシを保護できる。しかし、FLは、実用的な設定で一般的に見られる不均一性(例えば、異なるユーザ間の非IIDデータ)の存在下では脆弱である。既存のFLアプローチは通常、データ分布間の不一致に関わらず、勾配を集約することで1つのグローバルモデルを更新し、すべてのユーザの共有知識を捉える。対照的に、複数のグローバルモデルの混合は、ユーザをFLの異なるグローバルモデル(すなわちセンター)に割り当てれば、様々なユーザ間の不均一性を捉えることができる。そこで本研究では,新しいマルチセンター集約機構を提案する。これはデータから複数のグローバルモデルを学習し、同時にユーザとセンターの最適なマッチングを導出する。次に、これを確率的期待値最大化(EM)アルゴリズムにより効率よく解ける二段階最適化問題として定式化する。FLの複数のベンチマークデータセットに対する実験により,本手法はいくつかの代表的なFL手法より優れていることが示された。ソースコードはGitHubで公開されている。

Federated learning (FL) can protect data privacy in distributed learning since it merely collects local gradients from users without access to their data. However, FL is fragile in the presence of heterogeneity that is commonly encountered in practical settings, e.g., non-IID data over different users. Existing FL approaches usually update a single global model to capture the shared knowledge of all users by aggregating their gradients, regardless of the discrepancy between their data distributions. By comparison, a mixture of multiple global models could capture the heterogeneity across various users if the users are assigned to different global models (i.e., centers) in FL. To this end, we propose a novel multi-center aggregation mechanism. It learns multiple global models from data, and simultaneously derives the optimal matching between users and centers. We then formulate it as a bi-level optimization problem that can be efficiently solved by a stochastic expectation maximization (EM) algorithm. Experiments on multiple benchmark datasets of FL show that our method outperforms several popular FL competitors. The source code is publicly available on GitHub.
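
The alternation between matching users to centers and updating the centers can be sketched as a hard-assignment, EM-style loop. The version below is a simplification made for illustration: each user's local model is a flat list of weights, the E-step assigns a user to the nearest center by squared Euclidean distance, and the M-step averages the members of each center. The function names and the distance choice are assumptions; the paper's stochastic EM procedure and local training steps are not reproduced here.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length weight vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def multi_center_aggregate(user_models, initial_centers, n_rounds=10):
    """Alternate between assigning users to centers and re-averaging each center.

    user_models: list of weight vectors, one per user (local training is not shown).
    initial_centers: list of starting weight vectors, one per global model.
    Returns (centers, assignment), where assignment[i] is the center index of user i.
    """
    centers = [list(c) for c in initial_centers]
    assignment = [0] * len(user_models)
    for _ in range(n_rounds):
        # E-step: match every user to the closest global model.
        for i, model in enumerate(user_models):
            distances = [squared_distance(model, center) for center in centers]
            assignment[i] = distances.index(min(distances))
        # M-step: each center becomes the average of the models assigned to it.
        for c in range(len(centers)):
            members = [m for m, a in zip(user_models, assignment) if a == c]
            if members:  # a center with no users keeps its previous weights
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return centers, assignment
```

With a single center this reduces to plain averaging of all user models, which is the single-global-model baseline the abstract contrasts against.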

翻訳日:2021-08-24 11:28:37 公開日:2021-08-21

PDF登録状況（公開日: 20210821）