Fugu-MT: Translations of arXiv Papers

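This site provides Japanese translations of arXiv papers that are 30 pages or fewer and released under a Creative Commons license (CC 0, CC BY, CC BY-SA). For papers whose body is not CC-licensed, or that are too long, only the metadata is translated. (arXiv metadata is CC 0.) The translations are licensed under CC BY-SA 4.0. Translation is performed with Fugu-Machine Translator.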

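The operator of this site does not guarantee the quality of this site (including all information and translations) and accepts no responsibility for any consequences arising from the use of this site (including all information and translations).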

The papers listed here have a publication date of 2020-07-02.

Title	Authors	Abstract	Publication date / Translation date
# 転置可能な転向例を用いた効率的な転向訓練 Efficient Adversarial Training with Transferable Adversarial Examples ( http://arxiv.org/abs/1912.11969v2 ) ライセンス: Link先を確認	Haizhong Zheng, Ziqi Zhang, Juncheng Gu, Honglak Lee, Atul Prakash	(参考訳) 対人訓練は、対人攻撃に対する分類モデルを保護する効果的な防御方法である。しかし、このアプローチの1つの制限は、トレーニング中に強い敵の例を生成するコストが高いため、桁違いのトレーニング時間を必要とすることである。本稿では,同一訓練過程において,隣接するエポックからのモデル間の移動性が高いこと,すなわち1つのエポックからの逆例が、その後のエポックにおいて逆であることを示す。この特性を生かして、訓練されたモデルの堅牢性を向上し、エポックを通じて敵の摂動を蓄積することにより、トレーニング効率を大幅に向上する新しい手法であるATTA(Adversarial Training with Transferable Adversarial Examples)を提案する。最先端の対人訓練法と比較すると、ATTAはCIFAR10で最大7.2%の精度を向上し、MNISTおよびCIFAR10データセットでのトレーニング時間を12～14倍短縮する。 Adversarial training is an effective defense method to protect classification models against adversarial attacks. However, one limitation of this approach is that it can require orders of magnitude additional training time due to high cost of generating strong adversarial examples during training. In this paper, we first show that there is high transferability between models from neighboring epochs in the same training process, i.e., adversarial examples from one epoch continue to be adversarial in subsequent epochs. Leveraging this property, we propose a novel method, Adversarial Training with Transferable Adversarial Examples (ATTA), that can enhance the robustness of trained models and greatly improve the training efficiency by accumulating adversarial perturbations through epochs. Compared to state-of-the-art adversarial training methods, ATTA enhances adversarial accuracy by up to 7.2% on CIFAR10 and requires 12~14x less training time on MNIST and CIFAR10 datasets with comparable model robustness.	翻訳日:2023-06-10 00:16:15 公開日:2020-07-02
# HOM干渉法による非局在周波数時間Schrödinger cat-like状態の生成 Producing delocalized frequency-time Schrödinger cat-like states with HOM interferometry ( http://arxiv.org/abs/2003.11486v2 ) ライセンス: Link先を確認	N. Fabre, J. Belhassen, A. Minneci, S. Felicetti, A. Keller, M.I. Amanti, F. Baboux, T. Coudreau, S. Ducci, P. Milman	(参考訳) 80年代後半、OuとMandelは、バランスの取れたビームスプリッター(Phys. Rev. Lett. 61, 54 (1988))に干渉した2つの光子の非時間分解偶然検出を行うことで、信号のビーティングを実験的に観察した。本研究では, 局所スペクトルフィルタリングにより生成した周波数 Schrödinger cat-like 状態のクロノサイクリックウィグナー分布の直接測定として, 本実験で観測されたフリンジパターンの新たな解釈を提案する。また,この解析に基づいて時間分解HOM実験を行い,その周波数状態を測定した。 In the late 80's, Ou and Mandel experimentally observed signal beatings by performing a non-time resolved coincidence detection of two photons having interfered in a balanced beam splitter [Phys. Rev. Lett. 61, 54 (1988)]. In this work, we provide a new interpretation of the fringe pattern observed in this experiment as the direct measurement of the chronocyclic Wigner distribution of a frequency Schrödinger cat-like state produced by local spectral filtering. Based on this analysis, we also study a time-resolved HOM experiment to measure such a frequency state.	翻訳日:2023-05-27 22:35:22 公開日:2020-07-02
# 11DWDMチャンネルを用いた500MHz帯連続可変QKDの分光形状 Spectrally-Shaped Continuous-Variable QKD Operating at 500 MHz Over an Optical Pipe Lit by 11 DWDM Channels ( http://arxiv.org/abs/2004.11962v2 ) ライセンス: Link先を確認	D. Milovančev, N. Vokić, F. Laudenbach, C. Pacher, H. Hübel, and B. Schrenk	(参考訳) スペクトル調整と量子レシーバ帯域の最適利用により、セキュアキーレート22Mb/sをサポートする高速CV-QKDを実証する。隣り合うキャリアグレードのCバンドチャネルが20nmしか持たない11の共存は10Mb/sである。 We demonstrate high-rate CV-QKD supporting a secure-key rate of 22Mb/s through spectral tailoring and optimal use of quantum receiver bandwidth. Co-existence with 11 adjacent carrier-grade C-band channels spaced by only 20nm is accomplished at >10Mb/s.	翻訳日:2023-05-22 05:53:08 公開日:2020-07-02
# 捕捉イオンの2次元マイクロトラップアレイにおける高速ゲートを用いたスケーラブル量子計算 Scalable quantum computation with fast gates in two-dimensional microtrap arrays of trapped ions ( http://arxiv.org/abs/2005.00367v2 ) ライセンス: Link先を確認	Zain Mehdi and Alexander K. Ratcliffe and Joseph J. Hope	(参考訳) 2次元マイクロトラップアーキテクチャにおけるイオン量子コンピューティングにおける高速パルス2量子ゲートの利用について理論的に検討する。一次元において、そのような高速ゲートは、近接する隣り合うときに最適であり、2次元幾何学への一般化を検討する。高速パルスゲートは、レーザー繰り返し速度を実験的に実証し、近傍トラップにおけるイオン間の高忠実な絡み合い動作をトラッピング期間よりも早く実施できることを実証する。特に、ゲート長を増加させることなく、数百イオンの大型配列でも高忠実度ゲートは実現可能であることが判明した。本提案の有用性を示すために,40モードフェルミ・ハバードモデルのディジタルシミュレーションにおけるゲートの適用について検討する。これはまた、任意のイオン対を接続するために必要な短いゲート鎖が、この幾何学を大規模計算に適していることを示す。 We theoretically investigate the use of fast pulsed two-qubit gates for trapped ion quantum computing in a two-dimensional microtrap architecture. In one dimension, such fast gates are optimal when employed between nearest neighbours, and we examine the generalisation to a two-dimensional geometry. We demonstrate that fast pulsed gates are capable of implementing high-fidelity entangling operations between ions in neighbouring traps faster than the trapping period, with experimentally demonstrated laser repetition rates. Notably, we find that without increasing the gate duration, high-fidelity gates are achievable even in large arrays with hundreds of ions. To demonstrate the usefulness of this proposal, we investigate the application of these gates to the digital simulation of a 40-mode Fermi-Hubbard model. This also demonstrates why shorter chains of gates required to connect arbitrary pairs of ions makes this geometry well suited for large-scale computation.	翻訳日:2023-05-21 15:00:43 公開日:2020-07-02
# (2+1)次元Duffin-Kemmer-Petiau発振子の外部磁場における分割周波数 Splitting frequency of the (2+1)-dimensional Duffin-Kemmer-Petiau oscillator in an external magnetic field ( http://arxiv.org/abs/2005.01228v2 ) ライセンス: Link先を確認	Ignacio S. Gomez, Esdras S. Santos, Olavo Abla	(参考訳) 本研究では,(2+1)次元DKP発振器をDKP場の4x4および6x6表現を用いて再検討し,文献的考察を行った。スピンプロジェクションによりDKP発振器の周波数が分裂し, 発振器, 外界, スピン間の相互作用として生じ, そこからエネルギーと固有関数が統一的に表現されることがわかった。磁場の特定の臨界値に対して、スピン突起−1,1の成分の振動はキャンセルされる。振動のキャンセルが発生すると位相遷移が報告されるベクトルセクタの正準アンサンブルの熱力学について検討する。熱力学的ポテンシャルは、磁場の反転の下で分配関数が対称な高温限界における漸近的な表現に急速に収束する。 We revisit the (2+1)-dimensional DKP oscillator in an external magnetic field by means of 4x4 and 6x6 representations of the DKP field, thus obtaining several cases studied in the literature. We find a splitting in the frequency of the DKP oscillator according to the spin projection, which arises from an interplay between the oscillator, the external field and the spin, and from which the energies and the eigenfunctions are expressed in a unified way. For certain critical values of the magnetic field the oscillation in the components of spin projections -1 and 1 is cancelled. We study the thermodynamics of the canonical ensemble of the vectorial sector, where a phase transition is reported when the cancellation of the oscillation occurs. Thermodynamic potentials converge rapidly to their asymptotic expressions in the high-temperature limit, with the partition function symmetric under the reversal of the magnetic field.	翻訳日:2023-05-21 05:29:51 公開日:2020-07-02
# ランダムホッピング項を持つ非エルミートSu-Schrieffer-Heegerモデルの固有値の統計的性質 Statistical properties of eigenvalues of the non-Hermitian Su-Schrieffer-Heeger model with random hopping terms ( http://arxiv.org/abs/2005.02705v2 ) ライセンス: Link先を確認	Ken Mochizuki, Naomichi Hatano, Joshua Feinberg, Hideaki Obuse	(参考訳) 仮想のオンサイトポテンシャルとランダムに分布するホッピング項を持つsu-schrieffer-heegerモデルの非エルミート版の固有値統計を考察する。我々は、ハミルトニアンの構造のため、固有値は、パリティと時間反転対称性がなくても、あるパラメータの範囲で純粋に実数であることを発見した。このように、純粋な実スペクトルの場合、レベル統計はガウス直交のアンサンブルのものである。これは、固有値が純粋に実数である非エルミート的ハミルトニアンが、元のハミルトニアンの対称性を継承するエルミート的ハミルトニアンに写像できることを明らかにする一般的な特徴である。スペクトルが虚数固有値を含む場合、状態(DOS)の密度は原点で消滅し、虚数軸上のスペクトルエッジで発散することを示す。我々は、DOSの発散は、カイラル対称な1次元エルミート系のダイソン特異点から発生し、ハーミート系と異なるDOSの同素体を解析的に導出したことを示す。 We explore the eigenvalue statistics of a non-Hermitian version of the Su-Schrieffer-Heeger model, with imaginary on-site potentials and randomly distributed hopping terms. We find that owing to the structure of the Hamiltonian, eigenvalues can be purely real in a certain range of parameters, even in the absence of parity and time-reversal symmetry. As it turns out, in this case of purely real spectrum, the level statistics is that of the Gaussian orthogonal ensemble. This demonstrates a general feature which we clarify that a non-Hermitian Hamiltonian whose eigenvalues are purely real can be mapped to a Hermitian Hamiltonian which inherits the symmetries of the original Hamiltonian. When the spectrum contains imaginary eigenvalues, we show that the density of states (DOS) vanishes at the origin and diverges at the spectral edges on the imaginary axis. We show that the divergence of the DOS originates from the Dyson singularity in chiral-symmetric one-dimensional Hermitian systems and derive analytically the asymptotes of the DOS which is different from that in Hermitian systems.	翻訳日:2023-05-21 00:48:54 公開日:2020-07-02
# ローレンツボソニック貯留層と有限相関時間をもつ確率環境における2レベル系の非マルコフ的デコヒーレンス Non-Markovian decoherence of a two-level system in a Lorentzian bosonic reservoir and a stochastic environment with finite correlation time ( http://arxiv.org/abs/2006.14055v2 ) ライセンス: Link先を確認	V. A. Mikhailov, N. V. Troshkin	(参考訳) 本稿では,外部の古典変動環境の影響下で,ボゾン浴における2レベル系の非マルコフ進化について検討する。浴槽との相互作用はローレンツスペクトル密度を持ち、変動する環境(確率場)はオルンシュタイン-ウレンベック過程の集合で表される。合成環境のそれぞれのサブ環境は、2レベル系の非マルコフ力学を誘導することができる。本研究では, 階層的運動方程式の数値的厳密な手法を用いて, 2段階系の定常状態, 密度行列の低減, 平衡放射スペクトルの周波数カットオフ, サブ環境の結合強度依存性について検討した。また,浴槽との相互作用に用いる回転波近似(RWA)の精度への影響について検討した。 In this paper we investigate the non-Markovian evolution of a two-level system (qubit) in a bosonic bath under the influence of an external classical fluctuating environment. The interaction with the bath has a Lorentzian spectral density, and the fluctuating environment (stochastic field) is represented by a set of Ornstein-Uhlenbeck processes. Each of the subenvironments of the composite environment is able to induce non-Markovian dynamics of the two-level system. By means of the numerically exact method of hierarchical equations of motion, we study the dependence of the steady states of the two-level system, the reduced density matrix evolution and the equilibrium emission spectra on the frequency cutoffs and coupling strengths of the subenvironments. Additionally, we investigate the impact of the rotating-wave approximation (RWA) used for the interaction with the bath on the accuracy of the results.	翻訳日:2023-05-12 22:07:18 公開日:2020-07-02
# ホルスタイン模型の高温における拡散とフラクタル幾何学の欠如 Absence of diffusion and fractal geometry in the Holstein model at high temperature ( http://arxiv.org/abs/2007.00817v1 ) ライセンス: Link先を確認	Chen-Yen Lai and S. A. Trugman	(参考訳) ホルスタインモデルにおいて, 分散のない光フォノンに結合した電子の高温でのダイナミクスについて検討した。電子は光フォノンによって繰り返し散乱されるため、従来の力学は拡散すると考えられている。しかし、半古典的な近似では、運動は拡散しない。 1次元では、電子は一定の方向に移動し、向きを変えない。 2次元では、電子は続いてフラクタル軌道を辿り続けます。これらの非標準ダイナミクスの側面は、電子とフォノンダイナミクスの完全な量子計算を含む、より正確な計算に保持される。 We investigate the dynamics of an electron coupled to dispersionless optical phonons in the Holstein model, at high temperatures. The dynamics is conventionally believed to be diffusive, as the electron is repeatedly scattered by optical phonons. In a semiclassical approximation, however, the motion is not diffusive. In one dimension, the electron moves in a constant direction and does not turn around. In two dimensions, the electron follows and then continues to retrace a fractal trajectory. Aspects of these nonstandard dynamics are retained in more accurate calculations, including a fully quantum calculation of the electron and phonon dynamics.	翻訳日:2023-05-11 21:00:09 公開日:2020-07-02
# 対称情報完全測定の化合物とその量子鍵分布への応用 Compounds of symmetric informationally complete measurements and their application in quantum key distribution ( http://arxiv.org/abs/2007.01007v1 ) ライセンス: Link先を確認	Armin Tavakoli, Ingemar Bengtsson, Nicolas Gisin and Joseph M. Renes	(参考訳) 対称情報完備測度(SIC)は、ヒルベルト空間におけるエレガントで、祝福され、広く有用な離散構造である。いくつかのSICで合成されたより洗練された離散構造を導入する。 SIC-化合物は、$d$次元ヒルベルト空間における$d^3$ベクトルの集合として定義され、$d$SICと$d^2$正規基底の2つの異なる方法で分割できる。先入観は$d>2$のときはありそうにないが、$d=4$の明示的な構成によって、驚くほど正に答える。驚くべきことに、sic-compoundは量子状態の識別によって明らかにされるように、相互に偏りのない基底との密接な関係を認める。基本的考察を超えて、これらのエキゾチックな性質を利用して量子鍵分布のプロトコルを構築し、一般的な盗聴攻撃の下でそのセキュリティを分析する。 sic-compound は 6 状態プロトコルの一般化が成功するのを防げるほど大きなエラーが存在する場合にセキュアな鍵生成を可能にする。 Symmetric informationally complete measurements (SICs) are elegant, celebrated and broadly useful discrete structures in Hilbert space. We introduce a more sophisticated discrete structure compounded by several SICs. A SIC-compound is defined to be a collection of $d^3$ vectors in $d$-dimensional Hilbert space that can be partitioned in two different ways: into $d$ SICs and into $d^2$ orthonormal bases. While a priori their existence may appear unlikely when $d>2$, we surprisingly answer it in the positive through an explicit construction for $d=4$. Remarkably this SIC-compound admits a close relation to mutually unbiased bases, as is revealed through quantum state discrimination. Going beyond fundamental considerations, we leverage these exotic properties to construct a protocol for quantum key distribution and analyze its security under general eavesdropping attacks. We show that SIC-compounds enable secure key generation in the presence of errors that are large enough to prevent the success of the generalisation of the six-state protocol.	翻訳日:2023-05-11 20:58:16 公開日:2020-07-02
# 量子コンピューティングデバイス上での量子アルゴリズムの実現 Realizing Quantum Algorithms on Real Quantum Computing Devices ( http://arxiv.org/abs/2007.01000v1 ) ライセンス: Link先を確認	Carmen G. Almudever, Lingling Lao, Robert Wille, Gian Giacomo Guerreschi	(参考訳) 量子コンピューティングは現在、学術的なアイデアから現実へと移行している。クラウド上の量子コンピューティングはすでに利用可能であり、世界中のユーザが実際の量子アルゴリズムを開発し実行することができる。しかし、Google、IBM、Rigetti、Intel、IonQ、Xanaduといった新技術に多大な投資をしている企業は、さまざまな技術的アプローチに従う。これにより、これまで利用可能な量子コンピューティングデバイスが大幅に異なる状況になった。それらは主に、キュービットの数と種類とそれらの間の接続性で異なる。そのため、所定の量子コンピューティングデバイス上で意図された量子機能を実現するための様々な方法が利用可能である。本稿では,この領域の紹介と概要を提供し,コンパイラ,マッパー,シンセサイザー,トランスパイラ,ルータなど,対応する手法について述べる。 Quantum computing is currently moving from an academic idea to a practical reality. Quantum computing in the cloud is already available and allows users from all over the world to develop and execute real quantum algorithms. However, companies which are heavily investing in this new technology such as Google, IBM, Rigetti, Intel, IonQ, and Xanadu follow diverse technological approaches. This led to a situation where we have substantially different quantum computing devices available thus far. They mostly differ in the number and kind of qubits and the connectivity between them. Because of that, various methods for realizing the intended quantum functionality on a given quantum computing device are available. This paper provides an introduction and overview into this domain and describes corresponding methods, also referred to as compilers, mappers, synthesizers, transpilers, or routers.	翻訳日:2023-05-11 20:57:47 公開日:2020-07-02
# オープン量子ランダムウォークから量子ウォークへのクロスオーバー A crossover between open quantum random walks to quantum walks ( http://arxiv.org/abs/2007.00940v1 ) ライセンス: Link先を確認	Norio Konno, Kaname Matsue, Etsuo Segawa	(参考訳) 我々は,開量子ランダムウォークとパラメータを持つ量子ウォークを連続的に接続する中間ウォークを提案する。$M\in \mathbb{N}$ がデコヒーレンス効果を制御し,$M=1$ の場合,そのウォークは開量子ランダムウォークと一致し,$M=\infty$ の場合は量子ウォークと一致する。我々は、$M=\infty$と$M=1$に対して$\mathbb{Z}$の通常の確率測度を回復する尺度を定義し、様々な正の値に対する数値シミュレーションを通して中間挙動を観察する。 $M=2$の場合、オープン量子ランダムウォークのパラメータの小さな隙間でも、量子ウォークの典型的な挙動が現れることを解析的に示す。より正確には、左右に弾道的に移動することと、この歩行器の局所化を同時に観察する。解析は、線型作用素に対する加藤の摂動理論に基づいている。この極限定理をより詳細に分析し、上記の3つのモードがガウス分布によって記述されていることを示す。 We propose an intermediate walk continuously connecting an open quantum random walk and a quantum walk, with a parameter $M\in \mathbb{N}$ controlling a decoherence effect; if $M=1$, the walk coincides with an open quantum random walk, while for $M=\infty$ it coincides with a quantum walk. We define a measure which recovers the usual probability measures on $\mathbb{Z}$ for $M=\infty$ and $M=1$, and we observe intermediate behavior through numerical simulations for various positive values of $M$. In the case $M=2$, we show analytically that typical behavior of quantum walks appears even in a small gap of the parameter from the open quantum random walk. More precisely, we observe ballistic motion towards both the left and right sides and localization of the walker simultaneously. The analysis is based on Kato's perturbation theory for linear operators. We further analyze this limit theorem in more detail and show that the above three modes are described by Gaussian distributions.	翻訳日:2023-05-11 20:57:15 公開日:2020-07-02
# 非平衡定常状態における開量子系のゆらぎ-散逸関係 Fluctuation-Dissipation Relation for Open Quantum Systems in Nonequilibrium Steady State ( http://arxiv.org/abs/2007.00906v1 ) ライセンス: Link先を確認	Jen-Tsung Hsiang and Bei-Lok Hu	(参考訳) 線形および非線形開量子系における揺動散逸関係(FDR)の性質と存在についての研究を続ける [1-3] ここでは、線形系が非平衡定常状態(NESS)にある場合について考察する。 2つの振動子(両端が短い高調波鎖として考えられている)のモデルにより、異なる温度の熱浴にそれぞれ接続されるので、浴との相互作用によって鎖が完全に緩和されたとき、1つの浴における鎖の散逸核のノイズ核と虚部を結ぶ関係は、平衡の場合において従来のfdrの形状を仮定しない。風呂の温度とオシレータ間結合強度の違いに依存する「バイアス電流」という用語も存在します。さらに,この用語は,NESSにおける2つの浴場間の定常的な熱流に関連していることを示す。熱間交換(浴槽と終端振動子の間)と熱内伝達(チェーン内)のリアルタイムな発展を知る能力と、システムのパラメータへの依存は量子化可能な制御と量子熱エンジンや熱デバイスの設計において可能性を与える。 Continuing our work on the nature and existence of fluctuation-dissipation relations (FDR) in linear and nonlinear open quantum systems [1-3], here we consider such relations when a linear system is in a nonequilibrium steady state (NESS). With the model of two-oscillators (considered as a short harmonic chain with the two ends) each connected to a thermal bath of different temperatures we find that when the chain is fully relaxed due to interaction with the baths, the relation that connects the noise kernel and the imaginary part of the dissipation kernel of the chain in one bath does not assume the conventional form for the FDR in equilibrium cases. There exists an additional term we call the `bias current' that depends on the difference of the bath's initial temperatures and the inter-oscillator coupling strength. We further show that this term is related to the steady heat flow between the two baths when the system is in NESS. The ability to know the real-time development of the inter-heat exchange (between the baths and the end-oscillators) and the intra-heat transfer (within the chain) and their dependence on the parameters in the system offers possibilities for quantifiable control and in the design of quantum heat engines or thermal devices.	翻訳日:2023-05-11 20:56:56 公開日:2020-07-02
# 量子古典対応を用いた偏光関連光子対源のスペクトルマッピング Spectral mapping of polarization-correlated photon-pair sources using quantum-classical correspondence ( http://arxiv.org/abs/2007.00880v1 ) ライセンス: Link先を確認	Hung-Pin Chung, Pawan Kumar, Kai Wang, Olivier Bernard, Chinmay Shirpurkar, Wen-Chiuan Su, Thomas Pertsch, Andrey A. Sukhorukov, Yen-Hung Chen and Frank Setzpfandt	(参考訳) 量子光子対光源の直接スペクトル特性は、通常、煩雑でコストがかかり時間を要する検出問題を伴う。本研究では,チタン拡散型周期分極型ニオブ酸リチウム(ti:ppln)導波路に基づいて,ii型位相整合型自発的パラメトリックダウンコンバージョン(spdc)源のスペクトル特性を実験的に評価した。生成したクロスポーラ化光子対のスペクトル情報のキャラクタリゼーションは、量子情報や通信を含むアプリケーションにおけるそのようなソースの使用において重要である。クロスポーラライズフォトンペア源の結合スペクトル強度は、古典的和周波発生(sfg)測定による量子古典対応を用いて完全に再構成できることを実証する。この手法は可視光に対するより複雑な検出システムを使用しており、(偏光相関)光ファイバー源の量子状態の迅速な監視と制御を可能にし、安定で高可用性の量子源の実現を容易にする。 Direct spectral characterization of a quantum photon-pair source usually involves cumbersome, costly, and time-consuming detection issues. In this study, we experimentally characterize the spectral properties of a type-II phase-matched spontaneous parametric down-conversion (SPDC) source based on a titanium-diffused periodically poled lithium niobate (Ti:PPLN) waveguide. The characterization of the spectral information of the generated cross-polarized photon pairs is of importance for the use of such sources in applications including quantum information and communication. We demonstrate that the joint spectral intensity of the cross-polarized photon-pair source can be fully reconstructed using the quantum-classical correspondence through classical sum-frequency generation (SFG) measurements. This technique, which uses a much less complex detection system for visible light, opens the possibility of fast monitoring and control of the quantum state of (polarization-correlated) photon-pair sources to facilitate the realization of a stable and high-usability quantum source.	翻訳日:2023-05-11 20:55:56 公開日:2020-07-02
# 実時間形式論における熱場理論:粒子崩壊の概念と応用 Thermal Field Theory in real-time formalism: concepts and applications for particle decays ( http://arxiv.org/abs/2007.01224v1 ) ライセンス: Link先を確認	Torbjörn Lundberg and Roman Pasechnik	(参考訳) 本総説は, 湯川型理論における熱場理論(TFT)の概念と重要な成果について, 詳細かつ包括的に考察したものである。まず、想像とリアルタイムの定式化におけるTFTの一般的な導入から始める。現象学的に関連した意味として, 実時間形式論の枠組みで計算されたいくつかの典型的な反応について, 文献に見られる想像上の結果と比較し, 熱減衰率の概観を示す。ここで考慮された過程は、中性(擬似)スカラーが2つの異なる(擬似)スカラーまたはフェルミオン-反フェルミオン対に崩壊する過程である。これらの過程は、化学ポテンシャルと最終状態の異なる種を含むように、以前の研究から拡張される。また,フェルミオン線からの(pseudo)スカラー放射についても考察した。これらの結果は、高温および密度の系における多くの現象学的応用に関連する粒子崩壊観測における熱的効果の重要性を示している。 This review represents a detailed and comprehensive discussion of the Thermal Field Theory (TFT) concepts and key results in Yukawa-type theories. We start with a general pedagogical introduction into the TFT in the imaginary- and real-time formulation. As phenomenologically relevant implications, we present a compendium of thermal decay rates for several typical reactions calculated within the framework of the real-time formalism and compared to the imaginary-time results found in the literature. Processes considered here are those of a neutral (pseudo)scalar decaying into two distinct (pseudo)scalars or into a fermion-antifermion pair. These processes are extended from earlier works to include chemical potentials and distinct species in the final state. In addition, a (pseudo)scalar emission off a fermion line is also discussed. These results demonstrate the importance of thermal effects in particle decay observables relevant in many phenomenological applications in systems at high temperatures and densities.	翻訳日:2023-05-11 20:47:45 公開日:2020-07-02
# 忠実性アプローチ:励起エネルギーの有限次元交差における誤解を招く役割 Fidelity approach: the misleading role of finite size level crossing of excited energies ( http://arxiv.org/abs/2007.01186v1 ) ライセンス: Link先を確認	Somayyeh Nemati, Fatemeh Khastehdel Fumani, Saeed Mahdavifar	(参考訳) ここで、量子忠実性は1次元スピン1/2量子イジングモデルの2つの量子相転移を真に識別できるが、その基底状態の位相図の解析には適さないことを示す。 Here, we show that, although quantum fidelity can truly identify two quantum phase transitions of a one-dimensional spin-1/2 quantum Ising model with competing nearest and next-nearest neighbour interactions in a transverse magnetic field, it may not be a suitable approach for analyzing its ground-state phase diagram.	翻訳日:2023-05-11 20:47:02 公開日:2020-07-02
# 量子重ね合わせを測る(または「観測できることを決定するのは理論のみ」)。 Measuring Quantum Superpositions (Or, "It is only the theory which decides what can be observed.") ( http://arxiv.org/abs/2007.01146v1 ) ライセンス: Link先を確認	Christian de Ronde	(参考訳) 本研究は、量子力学の基礎文献(QM)における「実験室で仮説が実際に観測されることはない」という正統的な主張に反論するものである。そのため、我々は有名な測度問題に対する批判的な分析を行うことから始めるが、これは「理論」の特定の理解の下で量子形式論を仮定する経験実証主義的要件の厳密な適用に起因していると論じる。この文脈において、投影仮定(または測定規則)のアドホックな導入は、観察が「常識」経験の自己明快なものであると仮定するナイーブな経験主義的な視点から来る必要条件として理解することができる。すると我々は、QMの2つの「非崩壊」解釈(つまり、モーダルと多くの世界)に注意を向ける。アインシュタインの主張に従えば、「何が観測できるのかを決める理論のみである」という主張に従い、理論的な前提から「観察」が導かれる「物理理論」の実在論的な表現的理解への回帰を提案する。この観点から、直感的(アンスカリヒト)な方法で量子現象を理解できるような、新しい非古典的な概念表現について議論する。量子重ね合わせを計測・観測するための一般的な物理条件について考察する。 In this work we attempt to confront the orthodox widespread claim present in the foundational literature of Quantum Mechanics (QM) according to which 'superpositions are never actually observed in the lab'. In order to do so, we begin by providing a critical analysis of the famous measurement problem which, we will argue, was originated by the strict application of the empirical-positivist requirements to subsume the quantum formalism under their specific understanding of 'theory'. In this context, the ad hoc introduction of the projection postulate (or measurement rule) can be understood as a necessary requirement coming from a naive empiricist standpoint which presupposes that observations are self evident givens of "common sense" experience --independent of metaphysical (categorical) presuppositions. We then turn our attention to two "non-collapse" interpretations of QM --namely, modal and many worlds-- which even though deny that the "collapse" is a real physical process anyhow retain the measurement rule as a necessary element of the theory. 
In contraposition, following Einstein's claim according to which "it is only the theory which decides what can be observed", we propose a return to the realist representational understanding of 'physical theories' in which 'observation' is considered as derived from theoretical presuppositions. It is from this standpoint that we discuss a new non-classical conceptual representation which allows us to understand quantum phenomena in an intuitive (anschaulich) manner. Leaving behind the projection postulate, we discuss the general physical conditions for measuring and observing quantum superpositions.	翻訳日:2023-05-11 20:46:56 公開日:2020-07-02
# プライバシーとセキュリティの脅威をビデオに拡大する Zooming Into Video Conferencing Privacy and Security Threats ( http://arxiv.org/abs/2007.01059v1 ) ライセンス: Link先を確認	Dima Kagan, Galit Fuhrmann Alpert, Michael Fire	(参考訳) 新型コロナウイルス(covid-19)パンデミック(covid-19)のパンデミックは、関連するソーシャルディスタンシングやシェルターインプレイス対策と共に、人々が互いにコミュニケーションする方法に劇的に影響を与え、人々が協力し、研究し、特別な機会を祝い、家族や友人と会うための新しい方法を見つけることを余儀なくされている。最も人気のあるソリューションの1つは、対面ミーティングを仮想ミーティングに置き換えるためにビデオ会議アプリケーションを使用することである。これにより、ビデオ会議ユーザーの数が前例のない増加となった。本研究では,仮想会議に参加することで危険にさらされる可能性のあるプライバシー問題について検討した。 web上に公開されている会議参加者のコラージュ画像から個人情報を抽出した。我々は、画像処理、テキスト認識ツール、およびソーシャルネットワーク分析を使用して、15,700コラージュ画像、142,000枚以上の参加者の顔画像のウェブクローリングデータセットを探索した。ビデオ会議のユーザは、セキュリティとプライバシーの脅威に悩まされている。以上の結果から,ビデオ会議の公開画像数千枚を集め,参加者の顔画像,年齢,性別,ユーザ名,時にはフルネームなどの個人情報を抽出することは比較的容易であることが示唆された。この種の抽出データは、オンラインと現実世界の両方で人々のセキュリティとプライバシを著しく、容易に危険に晒し、大人だけでなく、幼児や高齢者のような社会のより脆弱な部分にも影響を及ぼす。最後に, 顔画像データとソーシャルネットワークデータとの相互参照により, 参加者が気付いていないかもしれない追加のプライバシーリスクが生じる可能性があること, ビデオ会議の会議に現れるユーザを識別できること, ターゲット個人に関する情報の異なる情報源を悪意的に集約する可能性を示す。 The COVID-19 pandemic outbreak, with its related social distancing and shelter-in-place measures, has dramatically affected ways in which people communicate with each other, forcing people to find new ways to collaborate, study, celebrate special occasions, and meet with family and friends. One of the most popular solutions that have emerged is the use of video conferencing applications to replace face-to-face meetings with virtual meetings. This resulted in unprecedented growth in the number of video conferencing users. In this study, we explored privacy issues that may be at risk by attending virtual meetings. We extracted private information from collage images of meeting participants that are publicly posted on the Web. We used image processing, text recognition tools, as well as social network analysis to explore our web crawling curated dataset of over 15,700 collage images, and over 142,000 face images of meeting participants. 
We demonstrate that video conference users are facing prevalent security and privacy threats. Our results indicate that it is relatively easy to collect thousands of publicly available images of video conference meetings and extract personal information about the participants, including their face images, age, gender, usernames, and sometimes even full names. This type of extracted data can vastly and easily jeopardize people's security and privacy both online and in the real world, affecting not only adults but also more vulnerable segments of society, such as young children and older adults. Finally, we show that cross-referencing facial image data with social network data may expose participants to additional privacy risks they may not be aware of, and that it is possible to identify users who appear in several video conference meetings, thus providing a potential to maliciously aggregate different sources of information about a target individual.	翻訳日:2023-05-11 20:46:08 公開日:2020-07-02
# Rydberg Noisy-Dressingとソリトン分子および液滴準結晶の作製への応用 Rydberg Noisy-Dressing and applications in making soliton-molecules and droplet quasi-crystals ( http://arxiv.org/abs/2007.01039v1 ) ライセンス: Link先を確認	Mohammadsadegh Khazali	(参考訳) 現在の超低温原子と原子トラップの分野の進歩は、新しい制御可能な長距離相互作用を思い出させる。これらの相互作用は、実現可能な量子アルゴリズムの範囲を広げ、新しいタイプの量子問題に対する新しい制御機構を提供すると期待されている。本稿では,レーザーの線幅を操作することで,ライドバーグ型原子間の特別な原子間相互作用について述べる。この新しい相互作用は、高原とガウス峰を含むハイブリッド空間プロファイルを特徴としている。これらの相互作用要素の符号と強度の動的個別制御と組み合わせて、Rydberg noisy-dressing (RnD) スキームは量子技術のための貴重な相互作用ツールボックスを提供する。例えば、RnDの安定な3Dソリトン分子の合成や準周期性液滴結晶の形成への応用について論じる。 The current advances in the field of ultra-cold atoms and atomic traps recall new controllable long-range interactions. These interactions are expected to extend the range of realizable quantum algorithms as well as providing new control mechanisms for the new types of quantum matters. This article presents special inter-atomic interactions between Rydberg-dressed atoms by manipulating lasers' line-width. The new interaction features a hybrid spatial profile containing plateaus and Gaussian peaks. Combined with dynamic individual control over the sign and strength of these interaction elements, the Rydberg noisy-dressing (RnD) scheme provides a valuable interaction toolbox for quantum technology. As an example, RnD's application in making stable gigantic 3D soliton molecules and in the formation of quasi-periodic droplet-crystals are discussed.	翻訳日:2023-05-11 20:45:34 公開日:2020-07-02
# 有限時間量子カルノーエンジンにおける電力変動 Power fluctuations in a finite-time quantum Carnot engine ( http://arxiv.org/abs/2007.01034v1 ) ライセンス: Link先を確認	Tobias Denzler and Eric Lutz	(参考訳) 安定性は、変動する出力を持つ小型熱機械の重要な特性である。本稿では、縮退多レベル系に基づく有限時間量子カルノエンジンを考察し、その有限ヒルベルト空間構造が安定性に与える影響について考察する。我々は、特に、レベル縮退とレベル番号に関して、相対的な作業変動を最適化する。最適性能は、非退化二段エンジンや高調波発振器モータよりも優れている。本結果は,高性能で高安定性な循環型量子熱エンジンの実現方法を示す。 Stability is an important property of small thermal machines with fluctuating power output. We here consider a finite-time quantum Carnot engine based on a degenerate multilevel system and study the influence of its finite Hilbert space structure on its stability. We optimize in particular its relative work fluctuations with respect to level degeneracy and level number. We find that its optimal performance may surpass those of nondegenerate two-level engines or harmonic oscillator motors. Our results show how to realize high-performance, high-stability cyclic quantum heat engines.	翻訳日:2023-05-11 20:45:20 公開日:2020-07-02
# 温度依存カシミール力:繰り返し微妙な性質 Temperature dependent Casimir forces: recurring subtleties ( http://arxiv.org/abs/2007.01011v1 ) ライセンス: Link先を確認	L.R. Fisher, B.W. Ninham	(参考訳) 2つの理想導電面の間のカシミール力は、リフシッツによるより一般的な理論の特別な(ゼロ温度)極限である。温度依存理論は、導電性、誘電性、磁気媒体のための量子および古典的なゆらぎモードの相関を含む。表面が温度が異なる場合、これらのモードはカップリングスプリングとして作用し、真空でも熱エネルギーを熱源から冷却器に伝達する可能性があると仮定されている。最近の実験でこの予測が確認されたが、データは完全な温度依存理論ではなく、カシミールの最初の表現の予測と比較された。これは文献に共通する誤りである。もう一つの誤りは、実導電面(この場合は金)が理想から遠く離れており、最大25%の補正係数が必要であるという事実を無視することである。ここでは両補正について数値値を与える。これらは最近の実験の基本的な結論には影響しないと思われるが、キャシミール(リフシッツ)効果の拡張の解釈に注意が必要であるというメッセージは、幅広い科学的問題でますます現れつつある。 The Casimir force between two ideal conducting surfaces is a special (zero temperature) limit of a more general theory due to Lifshitz. The temperature dependent theory includes correlations in coupled quantum and classical fluctuation modes for conducting, dielectric and magnetic media. If the surfaces are at different temperatures, it has been postulated that these modes might act as a coupling spring, transferring thermal energy from the hotter to the colder even through a vacuum. Recent experiments have appeared to confirm this prediction, but the data were compared with the predictions of Casimir's original expression, rather than those of the full temperature-dependent theory. This is a common error in the literature. Another error is to ignore the fact that real conducting surfaces (gold in this case) can be far from ideal, and that a correction factor of up to 25% may be required. Here we give numerical values for both of these corrections. It appears that they may not affect the basic conclusions from recent experiments, but the take-home message is that care is needed in the interpretation of extensions of Casimir (Lifshitz) effects, which are increasingly emerging across a wide range of scientific problems.	翻訳日:2023-05-11 20:45:13 公開日:2020-07-02
# ループ量子宇宙論における準確率分布 Quasi-probability distributions in Loop Quantum Cosmology ( http://arxiv.org/abs/2007.01324v1 ) ライセンス: Link先を確認	Jasel Berra-Montiel, Alberto Molgado	(参考訳) 本稿では、ループ量子宇宙論プログラム(LQC)で最近開発されたウィグナー・ワイル形式を一般化するために、位相空間と対応するワイル量子化写像のパラメトリズド準確率分布の完全族について述べる。特に、実数直線のボーアコンパクト化に価値を持つ状態の準分布を、非可換量子作用素に対応する順序曖昧性を説明するパラメータによってラベル付けされるように定義する。したがって、パラメータ化された準確率分布の射影は任意の順序付け処方の下で不変な限界確率密度をもたらす。また、標準のSchrödinger表現とは対照的に、任意の文字に対して、準分布は順序とは独立に正の関数を決定することに注意する。さらに、LQG に対してパラメトリック順序付きワイル量子化写像を任意に実装することにより、標準、反標準およびワイル対称順序付けの関連事例をそれぞれ簡単な方法で復元することができる。我々は,LQCプログラムにおけるいくつかの基本的側面,特にコヒーレンス,圧縮状態,演算子の収束を量子光学および量子情報フレームワークで広く分析する上で,本研究の結果が有効であることを期待している。 In this paper, we introduce a complete family of parametrized quasi-probability distributions in phase space and their corresponding Weyl quantization maps with the aim to generalize the recently developed Wigner-Weyl formalism within the Loop Quantum Cosmology program (LQC). In particular, we intend to define those quasi-distributions for states valued on the Bohr compactification of the real line in such a way that they are labeled by a parameter that accounts for the ordering ambiguity corresponding to non-commutative quantum operators. Hence, we notice that the projections of the parametrized quasi-probability distributions result in marginal probability densities which are invariant under any ordering prescription. We also note that, in opposition to the standard Schrödinger representation, for an arbitrary character the quasi-distributions determine a positive function independently of the ordering. Further, by judiciously implementing a parametric-ordered Weyl quantization map for LQG, we are able to recover in a simple manner the relevant cases of the standard, anti-standard, and Weyl symmetric orderings, respectively. 
We expect that our results may serve to analyze several fundamental aspects within the LQC program, in particular those related to coherence, squeezed states, and the convergence of operators, as extensively analyzed in the quantum optics and quantum information frameworks.	翻訳日:2023-05-11 20:39:01 公開日:2020-07-02
# 共形場の理論は魔法です Conformal field theories are magical ( http://arxiv.org/abs/2007.01303v1 ) ライセンス: Link先を確認	Christopher David White and ChunJun Cao and Brian Swingle	(参考訳) マジック」とは、クリフォードゲートによって状態が近似できない程度である。我々は、魔法の尺度であるmanaを$\mathbb z_3$ pottsモデルの基底状態で研究し、多体物理学において広く有用な診断であると主張する。特に、$q = 3$ の基底状態はモデルの臨界点で大きなマナを持ち、このマナはシステムの相関に存在する。状態の mera 表現に基づく単純なテンソルカウント計算によって mana の形式を説明する。マナはあらゆる長さのスケールに存在するので、3状態ポッツモデル臨界点を記述する共形場理論は魔法であると結論付ける。これらの結果は,誤り訂正量子コンピュータ上でのポッツ基底状態の生成とAdS-CFTのテンソルネットワークモデルの制約を制御している。 "Magic" is the degree to which a state cannot be approximated by Clifford gates. We study mana, a measure of magic, in the ground state of the $\mathbb Z_3$ Potts model, and argue that it is a broadly useful diagnostic for many-body physics. In particular we find that the $q = 3$ ground state has large mana at the model's critical point, and that this mana resides in the system's correlations. We explain the form of the mana by a simple tensor-counting calculation based on a MERA representation of the state. Because mana is present at all length scales, we conclude that the conformal field theory describing the 3-state Potts model critical point is magical. These results control the difficulty of preparing the Potts ground state on an error-corrected quantum computer, and constrain tensor network models of AdS-CFT.	翻訳日:2023-05-11 20:38:11 公開日:2020-07-02
# 量子シングルトン境界を破る絡み合い支援量子通信 Entanglement-Assisted Quantum Communication Beating the Quantum Singleton Bound ( http://arxiv.org/abs/2007.01249v1 ) ライセンス: Link先を確認	Markus Grassl	(参考訳) Brun, Devetak, and Hsieh [Science 314, 436 (2006)] は、送信機と受信機の間の事前共有の絡み合いが、絡み合いの助けなしにスキームよりも優れたパラメータを持つ量子通信プロトコルを実現することを示した。その後、同じ著者が、それらによって提案された絡み合い支援量子エラー訂正符号のパラメータに関連するいわゆる量子シングルトン境界のバージョンを導出した。我々は,この境界を一定の範囲で破るパラメータを持つ新しい絡み合い支援量子通信方式を提案する。 Brun, Devetak, and Hsieh [Science 314, 436 (2006)] demonstrated that pre-shared entanglement between sender and receiver enables quantum communication protocols that have better parameters than schemes without the assistance of entanglement. Subsequently, the same authors derived a version of the so-called quantum Singleton bound that relates the parameters of the entanglement-assisted quantum-error correcting codes proposed by them. We present a new entanglement-assisted quantum communication scheme with parameters violating this bound in certain ranges.	翻訳日:2023-05-11 20:37:00 公開日:2020-07-02
# 計算機研究の評価と普及のための進化的手法 Evolving Methods for Evaluating and Disseminating Computing Research ( http://arxiv.org/abs/2007.01242v1 ) ライセンス: Link先を確認	Benjamin Zorn, Tom Conte, Keith Marzullo, and Suresh Venkatasubramanian	(参考訳) 社会と技術の動向は、コンピュータ研究の評価と普及の方法を大きく変えた。会議や雑誌など、従来のレビューや出版の場は、過去には効果的に機能していた。近年、トレンドは新しい機会を生み出しつつも、レビューと普及のプロセスに新たなプレッシャーをかけている。例えば、多くのカンファレンスでは応募者数が大幅に増加した。同様に、研究思想の普及はarXiv.orgやソーシャルメディアネットワークといった出版の場を通じて劇的に進んでいる。こうした傾向は新型コロナウイルスより以前からあったが、パンデミックは長期的変化を加速させる可能性がある。 1) コンピュータ研究に影響を及ぼす傾向は概ね肯定的であり, 研究プロセスの参加, 範囲, アクセシビリティ, 速度が向上している。 2) 審査プロセスの規模を拡大する方法, 結果の拡散や混乱を回避し, 公正性を確保し, プロセス自体への幅広い参加を確保すること等, プロセスの整合性確保に課題が残されている。これらの知見に基づいて,1) コンピュータ研究コミュニティの定期的なポーリングメンバー,例えば,プログラムや一般会議の椅子,ジャーナル編集者,著者,レビュアーなど,これらの問題をよりよく理解するために直面する特定の課題を特定することを推奨する。 2)コンピューティング研究協会などの影響力のある団体は,「コンピューティング研究企業の現状」レポートを定期的に発行し,コンピュータ研究企業に影響を与える,肯定的かつ否定的な傾向をコミュニティに報告している。 3)ソーシャルメディアやプレプリントアーカイブがコンピュータ研究に与える影響をより深く理解するために,より深い調査を行う。 Social and technical trends have significantly changed methods for evaluating and disseminating computing research. Traditional venues for reviewing and publishing, such as conferences and journals, worked effectively in the past. Recently, trends have created new opportunities but also put new pressures on the process of review and dissemination. For example, many conferences have seen large increases in the number of submissions. Likewise, dissemination of research ideas has become dramatically through publication venues such as arXiv.org and social media networks. While these trends predate COVID-19, the pandemic could accelerate longer term changes. Based on interviews with leading academics in computing research, our findings include: (1) Trends impacting computing research are largely positive and have increased the participation, scope, accessibility, and speed of the research process. 
(2) Challenges remain in securing the integrity of the process, including addressing ways to scale the review process, avoiding attempts to misinform or confuse the dissemination of results, and ensuring fairness and broad participation in the process itself. Based on these findings, we recommend: (1) Regularly polling members of the computing research community, including program and general conference chairs, journal editors, authors, reviewers, etc., to identify specific challenges they face to better understand these issues. (2) An influential body, such as the Computing Research Association, regularly issues a "State of the Computing Research Enterprise" report to update the community on trends, both positive and negative, impacting the computing research enterprise. (3) A deeper investigation, specifically to better understand the influence that social media and preprint archives have on computing research, is conducted.	翻訳日:2023-05-11 20:36:48 公開日:2020-07-02
# ドイツ消費者金融セクターにおけるフィンテックの破壊的可能性 - ブルーオーシャンシナリオか? The Disruptive Potential of FinTechs in the German Consumer Finance Sector -- A Blue Ocean Scenario? ( http://arxiv.org/abs/2007.03603v1 ) ライセンス: Link先を確認	Christian Wischnewski	(参考訳) この論文は、ブルーオーシャン戦略を基本戦略要素として、ドイツにおける比較的保守的な銀行部門と、ドイツにおける現在のフィンテック企業の市場シェアを評価するために定量的および質的要素の両方を用いて、ドイツ国民全体のリスク回避思想に当てはまるかどうかを分析、そして将来の発展についての潜在的な見通しを把握している。戦略的な枠組み、ドイツにおける銀行部門、フィンテック部門について文献レビューを行う。その後、銀行セクターが「赤い海」であるかどうか、フィンテック産業が「青い海」であるかどうかを、両セクターのケーススタディを用いて正式に検証する。ドイツにおける銀行顧客とそのフィンテック企業の利用に関する定量的分析は、オンライン調査を通じて行われ、その後、選ばれた参加者がインタビューを受け、さらなる洞察を得る。ドイツ銀行セクターのフルサイズとセグメントごとのトランザクションボリュームを反映した外挿指標とともに、調査結果のピボット分析とクロス集計とインタビュー結果を用いてデータ評価を行い、市場でのフィンテック利用の増加の兆候について検討した。将来の研究のアイデアが導出される場所からいくつかの制限にもかかわらず、その結果はドイツにおけるフィンテックの利用に関する現在のトレンドの概要を提供する。主な発見は、支払いソリューションの特筆すべき例外を除いて、ドイツはフィンテックに対して高い親和性を持っておらず、市場シェアと潜在力に乏しい金融サービス産業の副産物となっていることである。 Using the Blue Ocean strategy as an underlying strategic element, this dissertation analyses whether this statement holds true for the rather more conservative banking sector in Germany and the overall risk-averse mindset of the German population by using both quantitative and qualitative elements to assess the current market share of FinTech companies in the Federal Republic, as well as grasp a potential outlook on the future development. A literature review of the strategic framework, the banking sector in Germany and the FinTech sector is carried out accordingly. Subsequently, a formal verification as to whether the banking sector is a "Red Ocean" and if the FinTech industry is a "Blue Ocean" is carried out using case studies from both sectors. A quantitative analysis of banking customers in Germany and their use of FinTech companies is conducted by way of an online survey, with selected participants being interviewed thereafter to gain additional insights. 
Data are evaluated using pivot analysis and cross-tabulation of survey results and interview findings; in addition, indicators extrapolated to reflect the full size of the German banking sector and transactional volumes per segment are provided and examined for signs of elevated FinTech use in the market. Despite several limitations, from which ideas for future research are derived, the outcomes provide an overview of existing trends for the use of FinTechs in Germany. The main finding is that, with the notable exception of payment solutions, Germans do not have a high affinity towards FinTechs, rendering them a byproduct of the financial service industry, with limited market share and low potential.	翻訳日:2023-05-11 20:28:39 公開日:2020-07-02
# 初期の宇宙の量子的記述について On the quantum description of the early universe ( http://arxiv.org/abs/2007.03428v1 ) ライセンス: Link先を確認	Gabriel R. Bengochea	(参考訳) なぜ宇宙の起源を理解するのが面白いのか? 私たちの存在を含む今日の観察はすべて、その出来事から生まれました。我々はまだその起源を記述できる理論を持っていないが、宇宙の非常に初期の時代の研究は、今日最も成功した2つの物理理論、一般相対性理論と量子物理学の間のインターフェイスを分析する理想的な地形を含んでいる。しかし、この領域は、我々の理論的アイデアをテストするための多くの観測データを持っている。量子物理学の父であるニールス・ボーア(Niels Bohr)とヴェルナー・ハイゼンベルク(Werner Heisenberg)は、これらの言葉で説明できるいくつかの考えを共有した:「量子物理学は、観測者と観測者の間に線があり、したがって科学は観察されるものに限定されるべきである。我々は世界の完全で客観的で現実的な理論を諦めなければならない」。この記事はこれらのアイデアを周回し、今日、最近の研究から、宇宙論を通じて(少なくとも一部は)それらに挑戦し、初期の宇宙の量子的な記述を求める立場にあることを要約します。 Why is it interesting to try to understand the origin of the universe? Everything we observe today, including our existence, arose from that event. Although we still do not have a theory that allows us to describe the origin itself, the study of the very early era of the universe involves the ideal terrain to analyze the interface between two of today's most successful physical theories, General Relativity and Quantum physics. But it is also an area in which we have a large number of observational data to test our theoretical ideas. Two of the fathers of Quantum physics, Niels Bohr and Werner Heisenberg, shared some thoughts that could be described with these words: "Quantum physics tells us that there is a line between the observed and the observer, and therefore science should be limited to what is observed. We must give up a complete, objective and realistic theory of the world". This article will orbit around these ideas and summarizes how it is that today, from recent works, we are in a position to try to challenge them (at least in part) through cosmology, seeking the quantum description of the early universe.	翻訳日:2023-05-11 20:28:08 公開日:2020-07-02
# 時間依存型コイン演算子を用いた量子ウォークにおけるパロンドのパラドックス Genuine Parrondo's paradox in quantum walks with time-dependent coin operators ( http://arxiv.org/abs/2007.01437v1 ) ライセンス: Link先を確認	Marcelo A. Pires and Sílvio M. Duarte Queirós	(参考訳) 真のパロンドパラドックスが2状態の量子ウォークにおいて、実験的に複雑な高次元コインを使わずに現れることを示した。このような目的を達成するために、システムの空間的不変性を損なうことなく、時間依存のコイン演算子を用いる。 We show that a genuine Parrondo paradox can emerge in two-state quantum walks without resorting to experimentally intricate high-dimensional coins. To achieve this goal, we employ a time-dependent coin operator without breaking the spatial translation invariance of the system.	翻訳日:2023-05-11 20:27:47 公開日:2020-07-02
# dwave量子アニーラを用いた40株のポートフォリオ最適化 Portfolio Optimization of 40 Stocks Using the DWave Quantum Annealer ( http://arxiv.org/abs/2007.01430v1 ) ライセンス: Link先を確認	Jeffrey Cohen, Alex Khan, Clark Alexander	(参考訳) 我々は、株式の最適なセットを含む、米国上場の液体株式の宇宙からポートフォリオを構築するための量子コンピュータの使用について調査する。歴史的市場データから、D-Wave Systems Inc.の様々な問題定式化について考察する。 D-Wave 2000Q(TM)システム(後にDWaveと呼ばれる)は、マルコウィッツの定式化とシャープ比に基づく最適化されたポートフォリオ、単純化されたシカゴ量子比(CQR)、そして新しいシカゴ量子ネットスコア(CQNS)を見つける。まずこれを古典的に、次にDWaveの新しい手法でアプローチします。以上の結果から,米国株40株から魅力的なポートフォリオを選択できることがわかった。 We investigate the use of quantum computers for building a portfolio out of a universe of U.S. listed, liquid equities that contains an optimal set of stocks. Starting from historical market data, we look at various problem formulations on the D-Wave Systems Inc. D-Wave 2000Q(TM) System (hereafter called DWave) to find the optimal risk vs return portfolio; an optimized portfolio based on the Markowitz formulation and the Sharpe ratio, a simplified Chicago Quantum Ratio (CQR), then a new Chicago Quantum Net Score (CQNS). We approach this first classically, then by our new method on DWave. Our results show that practitioners can use a DWave to select attractive portfolios out of 40 U.S. liquid equities.	翻訳日:2023-05-11 20:27:41 公開日:2020-07-02
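As a purely classical point of reference for the Sharpe-ratio objective mentioned in the entry above, the following minimal sketch brute-forces the best equal-weighted subset of a tiny asset universe. Equal weighting, a zero risk-free rate, the function names, and the toy numbers are all assumptions made for illustration; this is not the authors' D-Wave formulation, and the CQR and CQNS scores are not reproduced here.

```python
from itertools import combinations
from math import sqrt


def sharpe_ratio(subset, mean_returns, cov, risk_free=0.0):
    """Sharpe ratio of an equal-weighted portfolio over the assets in `subset`."""
    w = 1.0 / len(subset)
    port_return = sum(w * mean_returns[i] for i in subset)
    port_var = sum(w * w * cov[i][j] for i in subset for j in subset)
    return (port_return - risk_free) / sqrt(port_var)


def best_subset(mean_returns, cov):
    """Exhaustive search over all non-empty subsets (only feasible for small universes)."""
    n = len(mean_returns)
    candidates = (c for k in range(1, n + 1) for c in combinations(range(n), k))
    return max(candidates, key=lambda c: sharpe_ratio(c, mean_returns, cov))


# Hypothetical toy data: three uncorrelated assets.
mean_returns = [0.10, 0.12, 0.07]
cov = [
    [0.04, 0.00, 0.00],
    [0.00, 0.09, 0.00],
    [0.00, 0.00, 0.01],
]
best = best_subset(mean_returns, cov)
print(best, round(sharpe_ratio(best, mean_returns, cov), 4))
```

The exhaustive search visits 2^N - 1 subsets, so it is only practical for very small N; for 40 stocks that is roughly 10^12 candidates, which is the kind of combinatorial growth that motivates heuristic or annealing-based approaches.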
# デザイン革新のためのクラウドファンディング: 重要な要因を持つ予測モデル Crowdfunding for Design Innovation: Prediction Model with Critical Factors ( http://arxiv.org/abs/2007.01404v1 ) ライセンス: Link先を確認	Chaoyang Song, Jianxi Luo, Katja Hölttä-Otto, Warren Seering, Kevin Otto	(参考訳) オンライン報酬ベースのクラウドファンディングキャンペーンは、要求を検証し、アーリーアダプターを発見し、革新的な製品の設計プロセスにおける学習とフィードバックを求める革新的なアプローチとして登場した。しかし、革新的な製品のためのクラウドファンディングキャンペーンは高い不確実性に直面しており、デザインの価値を満たすために成功率に苦しめられている。本稿では, クラウドファンディングキャンペーンのデザイナーやイノベーターを指導するために, クラウドファンディングの成功に重要な要因を持つ予測モデルを構築するためのデータ駆動手法を提案する。具体的には、Real-Win-Worthフレームワークの26の候補因子をフィルタリングし、段階的回帰によって重要な要素を特定し、クラウドファンディングの金額を予測する。予測モデルを導出し、3Dプリンターとスマートウォッチのキャンペーンデータから重要な要素をKickstarterとIndiegogoで特定する手法を実証する。重要な要因は、キャンペーンの発展を導くことができ、予測モデルは、革新的製品のクラウドファンディング成功の可能性を高めるために、文脈におけるイノベーションのクラウドファンディングの可能性を評価することができる。 Online reward-based crowdfunding campaigns have emerged as an innovative approach for validating demands, discovering early adopters, and seeking learning and feedback in the design processes of innovative products. However, crowdfunding campaigns for innovative products are faced with a high degree of uncertainty and suffer from meager success rates in fulfilling their value for design. To guide designers and innovators in crowdfunding campaigns, this paper presents a data-driven methodology to build a prediction model with critical factors for crowdfunding success, based on public online crowdfunding campaign data. Specifically, the methodology filters 26 candidate factors in the Real-Win-Worth framework and identifies the critical ones via step-wise regression to predict the amount of crowdfunding. We demonstrate the methodology by deriving prediction models and identifying essential factors from 3D printer and smartwatch campaign data on Kickstarter and Indiegogo. The critical factors can guide campaign developments, and the prediction model may evaluate the crowdfunding potential of innovations in context, to increase the chance of crowdfunding success of innovative products.	
翻訳日:2023-05-11 20:26:39 公開日:2020-07-02
# 複素最適化と統計的推測による高次元の量子精度限界における純量子状態の推定 Estimation of pure quantum states in high dimension at the limit of quantum accuracy through complex optimization and statistical inference ( http://arxiv.org/abs/2007.01398v1 ) ライセンス: Link先を確認	Leonardo Zambrano, Luciano Pereira, Sebastian Niklitschek, and Aldo Delgado	(参考訳) 量子トモグラフィーは、量子状態、プロセス、デバイスを評価するための重要なツールとなっている。これにより、より精度の高いトモグラフィー法が探索される。単一2次元量子系適応法の混合状態の場合, 林, ギル, マッサーによる理論的精度限界を達成するために, 最近導入されている。しかし、高次元量子状態の正確な推定は未だよく分かっていない。これは主に非互換な可観測性が存在するため、マルチパラメータ推定が困難である。本稿では,適応トモグラフィー法を示し,数回の反復を経て,高次元の純量子状態の推定精度に基礎的ギル・マッサール下限に漸近する数値シミュレーションを行った。この手法は複素数場における確率的最適化と統計的推論の組み合わせに基づいており、任意の混合状態トモグラフィー法の精度を上回っており、現在の実験能力で実証することができる。提案手法は量子力学の新しい発展につながる可能性がある。 Quantum tomography has become a key tool for the assessment of quantum states, processes, and devices. This drives the search for tomographic methods that achieve greater accuracy. In the case of mixed states of a single 2-dimensional quantum system adaptive methods have been recently introduced that achieve the theoretical accuracy limit deduced by Hayashi and Gill and Massar. However, accurate estimation of higher-dimensional quantum states remains poorly understood. This is mainly due to the existence of incompatible observables, which makes multiparameter estimation difficult. Here we present an adaptive tomographic method and show through numerical simulations that, after a few iterations, it is asymptotically approaching the fundamental Gill-Massar lower bound for the estimation accuracy of pure quantum states in high dimension. The method is based on a combination of stochastic optimization on the field of the complex numbers and statistical inference, exceeds the accuracy of any mixed-state tomographic method, and can be demonstrated with current experimental capabilities. The proposed method may lead to new developments in quantum metrology.	翻訳日:2023-05-11 20:26:21 公開日:2020-07-02
# 一般化 Aubry-André 格子における可変移動エッジの観測 Observation of tunable mobility edges in generalized Aubry-André lattices ( http://arxiv.org/abs/2007.01393v1 ) ライセンス: Link先を確認	Fangzhao Alex An, Karmela Padavić, Eric J. Meier, Suraj Hegde, Sriram Ganeshan, J.H. Pixley, Smitha Vishveshwara, and Bryce Gadway	(参考訳) レーザー結合原子運動量モードの合成格子を用いて,双対対称性によって保護される正確な移動性エッジを有する準周期的サイトエネルギー変調を持つ近距離結合モデル群を実験的に実現した。これらの一次元強結合モデルは、よく知られた Aubry-André (AA) モデルの一般化と見なすことができ、解析的モビリティエッジ関係を構成するエネルギー依存的な自己双対条件を持つ。このモデルシステムの最低および最高エネルギー固有状態とそれらの参加率の顕微鏡的測定を行うことにより、状態のエネルギー依存密度がモデルのチューニングパラメータによって変化するにつれて、移動度エッジの進化を追跡する。その結果、単粒子予測からの強い偏差が見られ、自己トラップによる最低エネルギー状態の局在化とスクリーニングによる最高エネルギー状態の局在の抑制の両方を引き起こす魅力的な相互作用と一致した。本研究は, 自己双対性誘導移動エッジにおける相互作用効果の定量的研究方法である。 Using synthetic lattices of laser-coupled atomic momentum modes, we experimentally realize a recently proposed family of nearest-neighbor tight-binding models having quasiperiodic site energy modulation that host an exact mobility edge protected by a duality symmetry. These one-dimensional tight-binding models can be viewed as a generalization of the well-known Aubry-André (AA) model, with an energy-dependent self duality condition that constitutes an analytical mobility edge relation. By adiabatically preparing the lowest and highest energy eigenstates of this model system and performing microscopic measurements of their participation ratio, we track the evolution of the mobility edge as the energy-dependent density of states is modified by the model's tuning parameter. Our results show strong deviations from single-particle predictions, consistent with attractive interactions causing both enhanced localization of the lowest energy state due to self-trapping and inhibited localization of the highest energy state due to screening. This study paves the way for quantitative studies of interaction effects on self duality induced mobility edges.	翻訳日:2023-05-11 20:26:04 公開日:2020-07-02
# 建物における微小気候のデータ駆動制御--イベントトリガー型強化学習アプローチ Data-driven control of micro-climate in buildings: an event-triggered reinforcement learning approach ( http://arxiv.org/abs/2001.10505v2 ) ライセンス: Link先を確認	Ashkan Haji Hosseinloo, Alexander Ryzhov, Aldo Bischi, Henni Ouerdane, Konstantin Turitsyn, Munther A. Dahleh	(参考訳) スマートな建物は、地球全体のエネルギー消費の約40%を占めるため、エネルギー効率が高く、持続可能で、より経済的な未来を形作る大きな可能性を秘めている。スマートビルの将来は、オンラインと継続的な方法で短期間で適切な制御方針を学ぶという重要な課題によって現在目立たれている、適応的意思決定と制御に感覚データを使用することにある。この課題に取り組むために,イベントが発生し,十分な情報が収集された場合に学習と制御の判断を行う,古典的な時間トリガーとは対照的なイベントトリガー型パラダイムを提案する。イベントは特定の設計条件によって特徴づけられ、例えば特定の状態閾値に達すると、条件が満たされたときに発生する。学習の時間と制御の決定を体系的に調整することにより、提案フレームワークは学習のばらつきを低減し、制御プロセスを改善することができる。変動時間状態遷移と意思決定を可能にする半マルコフ決定プロセスに基づいて,マイクロ気候制御問題を定式化する。拡張政策勾配定理と時間差法を用いて、建物内の微小気候のイベントトリガー制御のための2つの学習アルゴリズムを提案する。テストビルにおけるエネルギー消費と居住者の快適さを同時に最適化するスマート・ラーニング・サーモスタットの設計により,提案手法の有効性を示す。 Smart buildings have great potential for shaping an energy-efficient, sustainable, and more economic future for our planet as buildings account for approximately 40% of the global energy consumption. Future of the smart buildings lies in using sensory data for adaptive decision making and control that is currently gloomed by the key challenge of learning a good control policy in a short period of time in an online and continuing fashion. To tackle this challenge, an event-triggered -- as opposed to classic time-triggered -- paradigm, is proposed in which learning and control decisions are made when events occur and enough information is collected. Events are characterized by certain design conditions and they occur when the conditions are met, for instance, when a certain state threshold is reached. By systematically adjusting the time of learning and control decisions, the proposed framework can potentially reduce the variance in learning, and consequently, improve the control process. 
We formulate the micro-climate control problem based on semi-Markov decision processes that allow for variable-time state transitions and decision making. Using extended policy gradient theorems and temporal difference methods in a reinforcement learning set-up, we propose two learning algorithms for event-triggered control of micro-climate in buildings. We show the efficacy of our proposed approach by designing a smart learning thermostat that simultaneously optimizes energy consumption and occupants' comfort in a test building.	翻訳日:2023-01-06 03:06:54 公開日:2020-07-02
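To make the contrast between time-triggered and event-triggered decisions concrete, the sketch below marks only those time steps at which a temperature reading crosses the boundary of a comfort band. The band limits, the readings, and the function name are hypothetical, and the code shows the triggering rule only, not the learning algorithms described in the entry above.

```python
def event_times(temperatures, low, high):
    """Return the indices at which the reading enters or leaves the band [low, high].

    In an event-triggered scheme, a controller would act (and learn) only at
    these indices rather than at every sampling step.
    """
    events = []
    inside_prev = low <= temperatures[0] <= high
    for t, temp in enumerate(temperatures[1:], start=1):
        inside = low <= temp <= high
        if inside != inside_prev:
            events.append(t)
        inside_prev = inside
    return events


# Hypothetical indoor temperature readings in degrees Celsius.
readings = [21.0, 21.4, 22.1, 22.6, 22.3, 21.8, 20.9, 19.8, 20.2]
events = event_times(readings, low=20.0, high=22.5)
print(events)
```

With these readings, a time-triggered controller would make nine decisions, while the event-triggered rule fires only at the four threshold crossings, so the time between consecutive decisions varies, which is why a semi-Markov formulation is a natural fit.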
# 人物再同定:深層学習分類フレームワークの受容領域を暗黙的に定義する Person Re-identification: Implicitly Defining the Receptive Fields of Deep Learning Classification Frameworks ( http://arxiv.org/abs/2001.11267v4 ) ライセンス: Link先を確認	Ehsan Yaghoubi, Diana Borza, Aruna Kumar, Hugo Proença	(参考訳) ディープラーニング分類モデルの receptive fields は、正しい判断を提供する上で最も重要な入力データの領域を決定する。このような受容的フィールドを学ぶ一番の方法は、マスクされたデータに基づいてモデルをトレーニングすることであり、ネットワークが望ましくない領域を無視するのに役立つが、2つの大きな欠点がある。 1) しばしばエッジに敏感な意思決定プロセスをもたらします。 2) 推論フェーズの計算コストを大幅に増加させる。本稿では,ネットワーク決定に重要な/無関係な,交換セグメントからなる合成学習データを作成することにより,ネットワークの受容領域の推論を暗黙的に駆動する解について述べる。実際には,各学習インスタンスの前景/背景(重要でない)部分を区別するためにセグメンテーションモジュールを使用し,画像ペア間のセグメントをランダムにスワップし,クラスラベルを重要セグメントのラベルに排他的に一致させる。この戦略は典型的にネットワークを早期収束と適切な解へと駆り立てるが、そこではアイデンティティと乱雑な記述は相関しない。さらに、このデータ拡張ソリューションには様々な興味深い特性がある。 1) パラメータフリーである。 2) ラベル情報を完全に保存し,かつ 3) 典型的なデータ拡張技術と互換性がある。実証的検証では,提案手法の有効性を2つの異なる設定 (upper-body と full-body) で検証し,最先端技術に対する高い競争力のある結果が得られた。再現可能な研究パラダイムの下で、コードと経験的評価プロトコルは https://github.com/Ehsan-Yaghoubi/reid-strong-baseline で利用可能である。 The receptive fields of deep learning classification models determine the regions of the input data that have the most significance for providing correct decisions. The primary way to learn such receptive fields is to train the models upon masked data, which helps the networks to ignore any unwanted regions, but has two major drawbacks: 1) it often yields edge-sensitive decision processes; and 2) it considerably increases the computational cost of the inference phase. This paper describes a solution for implicitly driving the inference of the networks' receptive fields, by creating synthetic learning data composed of interchanged segments that should be a priori important/irrelevant for the network decision. 
In practice, we use a segmentation module to distinguish between the foreground (important)/background (irrelevant) parts of each learning instance, and randomly swap segments between image pairs, while keeping the class label exclusively consistent with the label of the deemed important segments. This strategy typically drives the networks to early convergence and appropriate solutions, where the identity and clutter descriptions are not correlated. Moreover, this data augmentation solution has various interesting properties: 1) it is parameter-free; 2) it fully preserves the label information; and, 3) it is compatible with the typical data augmentation techniques. In the empirical validation, we considered the person re-identification problem and evaluated the effectiveness of the proposed solution in the well-known Richly Annotated Pedestrian (RAP) dataset for two different settings (upper-body and full-body), observing highly competitive results over the state-of-the-art. Under a reproducible research paradigm, both the code and the empirical evaluation protocol are available at https://github.com/Ehsan-Yaghoubi/reid-strong-baseline .	翻訳日:2023-01-05 11:38:30 公開日:2020-07-02
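The segment-swapping idea described in the entry above can be sketched on toy data as follows. Images are represented as equal-sized 2D lists and foreground masks as 2D lists of booleans; these representations, the function names, and the sample values are assumptions for illustration, and the released code at the repository above should be treated as the reference implementation.

```python
def swap_background(img_a, mask_a, img_b):
    """Keep the foreground pixels of img_a (where mask_a is True) and take
    every other pixel from img_b. All inputs are equal-sized 2D lists."""
    return [
        [pa if m else pb for pa, m, pb in zip(row_a, row_m, row_b)]
        for row_a, row_m, row_b in zip(img_a, mask_a, img_b)
    ]


def make_synthetic_sample(sample_a, sample_b):
    """Each sample is (image, foreground_mask, label).

    The synthetic image combines A's foreground with B's background, and the
    label follows the foreground, i.e. the segment deemed important.
    """
    img_a, mask_a, label_a = sample_a
    img_b, _, _ = sample_b
    return swap_background(img_a, mask_a, img_b), label_a


# Hypothetical 2x2 "images" with scalar pixel values.
sample_a = ([[1, 1], [1, 1]], [[True, False], [False, True]], "id_A")
sample_b = ([[9, 9], [9, 9]], [[False, True], [True, False]], "id_B")
image, label = make_synthetic_sample(sample_a, sample_b)
print(image, label)
```

Because the label is tied to the foreground only, background content that co-occurs with an identity in the original data no longer predicts that identity in the synthetic data, which is the decorrelation effect the abstract describes.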
# ディープニューラルネットワークのベイズが本当に優れているのか? How Good is the Bayes Posterior in Deep Neural Networks Really? ( http://arxiv.org/abs/2002.02405v2 ) ライセンス: Link先を確認	Florian Wenzel, Kevin Roth, Bastiaan S. Veeling, Jakub Świątkowski, Linh Tran, Stephan Mandt, Jasper Snoek, Tim Salimans, Rodolphe Jenatton, Sebastian Nowozin	(参考訳) 過去5年間で、ベイジアンディープラーニングコミュニティは、ディープニューラルネットワークでベイジアン推論を可能にする、より正確で効率的な近似推論手順を開発してきた。しかし、このアルゴリズムの進歩と不確実性定量化の改善とサンプル効率の約束にもかかわらず、2020年初め現在、産業実践におけるベイズニューラルネットワークの公開デプロイは行われていない。本研究では,一般の深層ニューラルネットワークにおけるベイズ後部の理解に疑問を呈し,ベイズ後部による後部予測がSGDから得られた点推定を含む単純な手法と比較して系統的に悪い予測を行うことを示す。さらに,証拠を過大評価する"コールド後部"を用いることで,予測性能が大幅に向上することを示す。このような冷たい後部はベイズパラダイムから著しく逸脱するが、ベイズ深層学習論文ではヒューリスティックとしてよく使われている。寒冷な後部を説明できる仮説をいくつか提示し,実験を通じて仮説を評価した。我々の研究は、ベイズ深層学習における正確な後方近似の目的に疑問を呈している: 真のベイズ深層が貧弱なら、より正確な近似はどのように使われるのだろうか? 代わりに,寒冷後部の性能向上の原点を理解することに集中することが適当であると主張する。 During the past five years the Bayesian deep learning community has developed increasingly accurate and efficient approximate inference procedures that allow for Bayesian inference in deep neural networks. However, despite this algorithmic progress and the promise of improved uncertainty quantification and sample efficiency there are, as of early 2020, no publicized deployments of Bayesian neural networks in industrial practice. In this work we cast doubt on the current understanding of Bayes posteriors in popular deep neural networks: we demonstrate through careful MCMC sampling that the posterior predictive induced by the Bayes posterior yields systematically worse predictions compared to simpler methods including point estimates obtained from SGD. Furthermore, we demonstrate that predictive performance is improved significantly through the use of a "cold posterior" that overcounts evidence. Such cold posteriors sharply deviate from the Bayesian paradigm but are commonly used as a heuristic in Bayesian deep learning papers. 
We put forward several hypotheses that could explain cold posteriors and evaluate the hypotheses through experiments. Our work questions the goal of accurate posterior approximations in Bayesian deep learning: If the true Bayes posterior is poor, what is the use of more accurate approximations? Instead, we argue that it is timely to focus on understanding the origin of the improved performance of cold posteriors.	翻訳日:2023-01-03 10:11:09 公開日:2020-07-02
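For readers unfamiliar with the term, a "cold posterior" is usually written as a tempered version of the Bayes posterior. The abstract above gives no formula, so the notation below (posterior energy $U$, temperature $T$, dataset $\mathcal{D}$ of $n$ labeled pairs) is an assumed, commonly used convention rather than a quotation from the paper.

```latex
\[
p_T(\theta \mid \mathcal{D}) \;\propto\; \exp\!\left(-\frac{U(\theta)}{T}\right),
\qquad
U(\theta) \;=\; -\sum_{i=1}^{n} \log p(y_i \mid x_i, \theta) \;-\; \log p(\theta).
\]
```

Under this convention, $T = 1$ recovers the Bayes posterior, while $T < 1$ gives a cold posterior: raising the likelihood term to the power $1/T > 1$ behaves like observing each data point more than once, which is one way to read the phrase "overcounts evidence".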
# 解釈可能な構造を用いた多目的分子生成 Multi-Objective Molecule Generation using Interpretable Substructures ( http://arxiv.org/abs/2002.03244v3 ) ライセンス: Link先を確認	Wengong Jin, Regina Barzilay, Tommi Jaakkola	(参考訳) 創薬の目的は、特定の化学的性質のプロファイルを持つ新規化合物を見つけることである。生成的モデリングの観点では、複数の性質制約の交差する分子のサンプリングを学ぶことが目的である。プロパティの制約が多ければ,このタスクはますます難しくなります。我々は、分子の合理性と呼ばれる部分構造の語彙から分子を構成することによって、この複雑さを相殺することを提案する。これらの有理性は、分子から、興味のそれぞれの性質に責任を負う可能性のあるサブ構造として特定される。そして、グラフ生成モデルを用いて有理を全分子に拡張することを学ぶ。最終生成モデルでは、分子を複数の有理補体の混合物として構成し、この混合物は興味のある性質を保持するために微調整される。薬物設計タスクにおける本モデルの評価を行い, 生成化合物の精度, 多様性, 新規性の観点から, 最先端のベースラインに対する顕著な改善を示す。 Drug discovery aims to find novel compounds with specified chemical property profiles. In terms of generative modeling, the goal is to learn to sample molecules in the intersection of multiple property constraints. This task becomes increasingly challenging when there are many property constraints. We propose to offset this complexity by composing molecules from a vocabulary of substructures that we call molecular rationales. These rationales are identified from molecules as substructures that are likely responsible for each property of interest. We then learn to expand rationales into a full molecule using graph generative models. Our final generative model composes molecules as mixtures of multiple rationale completions, and this mixture is fine-tuned to preserve the properties of interest. We evaluate our model on various drug design tasks and demonstrate significant improvements over state-of-the-art baselines in terms of accuracy, diversity, and novelty of generated compounds.	翻訳日:2023-01-02 22:40:10 公開日:2020-07-02
# 解釈可能なAIの代替としての自己説明型AI Self-explaining AI as an alternative to interpretable AI ( http://arxiv.org/abs/2002.05149v6 ) ライセンス: Link先を確認	Daniel C. Elton	(参考訳) AIシステムによってなされる決定を説明する能力は、特に医療や自動運転車といった人間の生命が危険にさらされている領域において、特に注目されている。ディープニューラルネットワークの入出力関係を人間の理解可能なルールで近似することはしばしば可能であるが、二重降下現象の発見は、ディープニューラルネットワークが動作するメカニズムを正確に捉えていないことを示唆している。二重降下は、ディープニューラルネットワークが通常、いくつかの高レベルのルールを抽出するよりも、データポイント間のスムーズな補間によって動作することを示している。その結果、複雑な実世界のデータに基づいてトレーニングされたニューラルネットワークは、外挿を求めると本質的に解釈が難しく、失敗に陥りがちである。これらの問題にもかかわらず、どのようにAIを信頼できるかを示すために、自己説明型AIの概念を紹介します。自己説明型AIは、決定と説明の両方に対する信頼レベルとともに、各決定について人間に理解可能な説明を提供することができる。このアプローチが機能するためには、説明が実際に決定に関連し、理想的には説明にたどり着くメカニズムを捉えることが重要である。最後に、ディープラーニングベースのシステムには、適用性ドメイン分析のテクニックに基づいた「警告光」が含まれており、モデルにトレーニング配布外の外挿を依頼するとユーザーに警告することが重要であると論じる。この講演のビデオプレゼンテーションはhttps://www.youtube.com/watch? v=py7pvdcu7wy&。 The ability to explain decisions made by AI systems is highly sought after, especially in domains where human lives are at stake such as medicine or autonomous vehicles. While it is often possible to approximate the input-output relations of deep neural networks with a few human-understandable rules, the discovery of the double descent phenomena suggests that such approximations do not accurately capture the mechanism by which deep neural networks work. Double descent indicates that deep neural networks typically operate by smoothly interpolating between data points rather than by extracting a few high level rules. As a result, neural networks trained on complex real world data are inherently hard to interpret and prone to failure if asked to extrapolate. To show how we might be able to trust AI despite these problems we introduce the concept of self-explaining AI. Self-explaining AIs are capable of providing a human-understandable explanation of each decision along with confidence levels for both the decision and explanation. 
For this approach to work, it is important that the explanation actually be related to the decision, ideally capturing the mechanism used to arrive at the explanation. Finally, we argue it is important that deep learning based systems include a "warning light" based on techniques from applicability domain analysis to warn the user if a model is asked to extrapolate outside its training distribution. For a video presentation of this talk see https://www.youtube.com/watch?v=Py7PVdcu7WY& .	翻訳日:2023-01-01 18:43:27 公開日:2020-07-02
# 非凸構成最適化のための確率ガウスニュートンアルゴリズム Stochastic Gauss-Newton Algorithms for Nonconvex Compositional Optimization ( http://arxiv.org/abs/2002.07290v2 ) ライセンス: Link先を確認	Quoc Tran-Dinh and Nhan H. Pham and Lam M. Nguyen	(参考訳) 本研究では,非凸確率的構成最適化問題を解くための2つの新しい確率的ガウス・ニュートンアルゴリズムを開発した。標準仮定の下では期待と有限サムの設定の両方を考慮し、古典確率およびSARAH推定器を用いて関数値とジャコビアンを近似する。期待の場合、予測における定常点を達成するために$\mathcal{O}(\varepsilon^{-2})$ iteration-complexityを確立し、関数値とジャコビアンの両方に対する確率的オラクル呼び出しの総数を推定する($\varepsilon$は所望の精度である)。有限和の場合、$\mathcal{o}(\varepsilon^{-2})$イテレーション複雑度とオラクルの総呼び出しは高い確率で見積もる。我々の知る限り、確率的ガウス・ニュートン法のためにこのような大域的確率的オラクル複雑性が確立されたのはこれが初めてである。最後に,合成データと実データの両方について2つの数値例を用いて理論的結果を示す。 We develop two new stochastic Gauss-Newton algorithms for solving a class of non-convex stochastic compositional optimization problems frequently arising in practice. We consider both the expectation and finite-sum settings under standard assumptions, and use both classical stochastic and SARAH estimators for approximating function values and Jacobians. In the expectation case, we establish $\mathcal{O}(\varepsilon^{-2})$ iteration-complexity to achieve a stationary point in expectation and estimate the total number of stochastic oracle calls for both function value and its Jacobian, where $\varepsilon$ is a desired accuracy. In the finite sum case, we also estimate $\mathcal{O}(\varepsilon^{-2})$ iteration-complexity and the total oracle calls with high probability. To our best knowledge, this is the first time such global stochastic oracle complexity is established for stochastic Gauss-Newton methods. Finally, we illustrate our theoretical results via two numerical examples on both synthetic and real datasets.	翻訳日:2022-12-31 13:01:28 公開日:2020-07-02
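As background for the entry above, a generic Gauss-Newton (prox-linear) step for a compositional objective can be written as follows. The symbols $\phi$, $F$, $\mathbf{F}$, $M$, and the estimators $\tilde{F}_t$, $\tilde{J}_t$ are assumed here for illustration and do not appear in the abstract; the paper's exact update rule and estimator definitions may differ.

```latex
\[
\min_{x}\; \phi\big(F(x)\big),
\qquad
F(x) \;=\; \mathbb{E}_{\xi}\big[\mathbf{F}(x,\xi)\big],
\]
\[
x_{t+1} \;=\; \arg\min_{y}\;\Big\{ \phi\big(\tilde{F}_t + \tilde{J}_t\,(y - x_t)\big) \;+\; \tfrac{M}{2}\,\lVert y - x_t\rVert^{2} \Big\},
\]
```

Here $\tilde{F}_t \approx F(x_t)$ and $\tilde{J}_t \approx F'(x_t)$ stand for stochastic estimates of the inner function value and its Jacobian, and $M > 0$ is a regularization constant. The number of samples spent on forming these estimates is what the oracle-call counts in the abstract refer to.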
# Wavesplit:話者クラスタリングによるエンドツーエンド音声分離 Wavesplit: End-to-End Speech Separation by Speaker Clustering ( http://arxiv.org/abs/2002.08933v2 ) ライセンス: Link先を確認	Neil Zeghidour and David Grangier	(参考訳) エンド・ツー・エンドのソース分離システムwavesplitを紹介する。単一の混合から、モデルは各ソースの表現を推論し、推論された表現が与えられた各ソース信号を推定する。モデルは、生の波形から両方のタスクを共同で実行するように訓練される。 Wavesplitはクラスタリングを通じてソース表現のセットを推論し、分離の基本的な置換問題に対処する。音声分離では, 先行処理に比べて, 連続話者表現の方が, 長大かつ難解な録音をより堅牢に分離することができる。 Wavesplitは、2または3つの話者(WSJ0-2/3mix)の清潔な混合(WHAM/WHAMR)に対して、ノイズと残響設定(WHAM/WHAMR)を再定義する。また、最近のLibriMixデータセットに新しいベンチマークを設定しました。最後に,1回の腹部心電図から胎児と母体心拍数を分離することにより,Wavesplitは他の領域にも適用可能であることを示す。 We introduce Wavesplit, an end-to-end source separation system. From a single mixture, the model infers a representation for each source and then estimates each source signal given the inferred representations. The model is trained to jointly perform both tasks from the raw waveform. Wavesplit infers a set of source representations via clustering, which addresses the fundamental permutation problem of separation. For speech separation, our sequence-wide speaker representations provide a more robust separation of long, challenging recordings compared to prior work. Wavesplit redefines the state-of-the-art on clean mixtures of 2 or 3 speakers (WSJ0-2/3mix), as well as in noisy and reverberated settings (WHAM/WHAMR). We also set a new benchmark on the recent LibriMix dataset. Finally, we show that Wavesplit is also applicable to other domains, by separating fetal and maternal heart rates from a single abdominal electrocardiogram.	翻訳日:2022-12-30 06:32:48 公開日:2020-07-02
# 室内シーンの3次元認識 Indoor Scene Recognition in 3D ( http://arxiv.org/abs/2002.12819v2 ) ライセンス: Link先を確認	Shengyu Huang, Mikhail Usvyatsov and Konrad Schindler	(参考訳) どのような環境があるかを認識することは重要な認識課題である。例えば、屋内で動作しているロボットは、キッチン、廊下、寝室にいるかどうかを認識するのに役立ちます。既存のアプローチでは、2D画像や2.5Dレンジ画像に基づいてシーンを分類しようとする。本研究では,3dポイントクラウド(voxel)データからシーン認識を解析し,2d鳥眼の視点に基づく手法を大きく上回ることを示す。さらに,シーン認識の改善方法としてマルチタスク学習を提唱し,シーンタイプがシーン内のオブジェクトと高度に相関していることと,その意味的セグメンテーションを異なるオブジェクトクラスに分類することに着目した。一連のアブレーション研究において、成功したシーン認識は、特定のシーンタイプ(浴槽など)に固有の個々のオブジェクトの認識だけでなく、粗い3次元形状、色、オブジェクトカテゴリの(簡単な)分布など、いくつかの異なる手がかりに依存することを示した。さらに,室内のシーンを精度良く分類するのに,驚くほどスパースな3Dデータが十分であることを示す。 Recognising in what type of environment one is located is an important perception task. For instance, for a robot operating indoors it is helpful to know whether it is in a kitchen, a hallway or a bedroom. Existing approaches attempt to classify the scene based on 2D images or 2.5D range images. Here, we study scene recognition from 3D point cloud (or voxel) data, and show that it greatly outperforms methods based on 2D bird's-eye views. Moreover, we advocate multi-task learning as a way of improving scene recognition, building on the fact that the scene type is highly correlated with the objects in the scene, and therefore with its semantic segmentation into different object classes. In a series of ablation studies, we show that successful scene recognition is not just the recognition of individual objects unique to some scene type (such as a bathtub), but depends on several different cues, including coarse 3D geometry, colour, and the (implicit) distribution of object categories. Moreover, we demonstrate that surprisingly sparse 3D data is sufficient to classify indoor scenes with good accuracy.	翻訳日:2022-12-28 02:03:57 公開日:2020-07-02
# unblind your apps:ディープラーニングによるモバイルguiコンポーネントの自然言語ラベルの予測 Unblind Your Apps: Predicting Natural-Language Labels for Mobile GUI Components by Deep Learning ( http://arxiv.org/abs/2003.00380v2 ) ライセンス: Link先を確認	Jieshan Chen, Chunyang Chen, Zhenchang Xing, Xiwei Xu, Liming Zhu, Guoqiang Li, and Jinshui Wang	(参考訳) 世界保健機関(WHO)によると、世界中で約13億人が視覚障害を患っており、そのうち3600万人が盲目である。その障害のため、これらの少数派を社会に巻き込むことは難しい問題である。近年の携帯電話の普及は、視覚障害者が世界を理解するための情報やサービスにアクセスしやすくすることで、新しいソリューションを提供する。視覚障害のあるユーザは、モバイルオペレーティングシステムに埋め込まれたスクリーンリーダーを採用して、アプリ内の各画面のコンテンツを読み、ジェスチャーを使ってスマートフォンと対話することができる。しかし、スクリーンリーダーを使う前提は、開発者がアプリを開発する際に、画像ベースのコンポーネントに自然言語ラベルを追加する必要があることである。 10,408のAndroidアプリの分析によると、残念ながら77%以上のアプリがラベルの不足に悩まされている。これらの問題のほとんどは、マイノリティを考慮した開発者の認識と知識の欠如によって引き起こされる。また、開発者がラベルをUIコンポーネントに追加したいとしても、視覚的な問題がないため、簡潔で明確な説明が得られない可能性がある。これらの課題を克服するために、Google Playの大規模商用アプリから学習することで、画像ベースのボタンのラベルを自動的に予測するディープラーニングベースのモデル、LabelDroidを開発した。実験の結果,本モデルは正確な予測を行うことができ,生成ラベルは実際のandroid開発者よりも高品質であることが判明した。 According to the World Health Organization(WHO), it is estimated that approximately 1.3 billion people live with some forms of vision impairment globally, of whom 36 million are blind. Due to their disability, engaging these minority into the society is a challenging problem. The recent rise of smart mobile phones provides a new solution by enabling blind users' convenient access to the information and service for understanding the world. Users with vision impairment can adopt the screen reader embedded in the mobile operating systems to read the content of each screen within the app, and use gestures to interact with the phone. However, the prerequisite of using screen readers is that developers have to add natural-language labels to the image-based components when they are developing the app. Unfortunately, more than 77% apps have issues of missing labels, according to our analysis of 10,408 Android apps. Most of these issues are caused by developers' lack of awareness and knowledge in considering the minority. 
And even if developers want to add labels to UI components, they may not come up with concise and clear descriptions, as most of them have no vision impairment. To overcome these challenges, we develop a deep-learning-based model, called LabelDroid, to automatically predict the labels of image-based buttons by learning from large-scale commercial apps in Google Play. The experimental results show that our model can make accurate predictions and that the generated labels are of higher quality than those from real Android developers.	翻訳日:2022-12-27 13:20:37 公開日:2020-07-02
# FlashlightのCNNイメージ Flashlight CNN Image Denoising ( http://arxiv.org/abs/2003.00762v2 ) ライセンス: Link先を確認	Pham Huu Thanh Binh, Cristóvão Cruz, Karen Egiazarian	(参考訳) 本稿では,画像復調のためのディープニューラルネットワークを実装したFlashLight CNN (FLCNN) という学習手法を提案する。提案手法は深層残差ネットワークとインセプションネットワークに基づいており、付加的白色ガウス雑音(awgn)による灰色スケール画像の除去に残差ネットワークのみよりも多くのパラメータを活用できる。フラッシュライトcnnは、美術画像の表示方法の現況と定量的および視覚的に比較した場合の芸術性能の状態を実証する。 This paper proposes a learning-based denoising method called FlashLight CNN (FLCNN) that implements a deep neural network for image denoising. The proposed approach is based on deep residual networks and inception networks, and it is able to leverage many more parameters than residual networks alone for denoising grayscale images corrupted by additive white Gaussian noise (AWGN). FlashLight CNN demonstrates state-of-the-art performance when compared quantitatively and visually with the current state-of-the-art image denoising methods.	翻訳日:2022-12-27 05:50:31 公開日:2020-07-02
# オンラインシンクホーン:サンプルストリームからの最適な輸送距離 Online Sinkhorn: Optimal Transport distances from sample streams ( http://arxiv.org/abs/2003.01415v2 ) ライセンス: Link先を確認	Arthur Mensch (DMA), Gabriel Peyré (DMA)	(参考訳) 最適輸送(OT)距離は、MLタスクの損失関数として日常的に使用される。しかし、任意の(すなわち離散的ではない)確率分布間のot距離を計算することは未解決の問題である。本稿では,2つの任意分布間のエントロピー規則化OT距離の新しいオンライン推定器を提案する。両ディストリビューションからのサンプルストリームを使用して、輸送計画の非パラメトリック表現を反復的に強化する。従来のシンクホーンアルゴリズムと比較すると,本手法は各イテレーションで新たなサンプルを活用し,真の正規化ot距離の一貫した推定を可能にする。オンラインシンクホーンアルゴリズムの収束を理論的に解析し,イテレート列に対してほぼo(1/n)漸近的なサンプル複雑性を示す。本手法は合成1d〜10dデータおよび実3d形状データを用いて検証する。 Optimal Transport (OT) distances are now routinely used as loss functions in ML tasks. Yet, computing OT distances between arbitrary (i.e. not necessarily discrete) probability distributions remains an open problem. This paper introduces a new online estimator of entropy-regularized OT distances between two such arbitrary distributions. It uses streams of samples from both distributions to iteratively enrich a non-parametric representation of the transportation plan. Compared to the classic Sinkhorn algorithm, our method leverages new samples at each iteration, which enables a consistent estimation of the true regularized OT distance. We provide a theoretical analysis of the convergence of the online Sinkhorn algorithm, showing a nearly-O(1/n) asymptotic sample complexity for the iterate sequence. We validate our method on synthetic 1D to 10D data and on real 3D shape data.	翻訳日:2022-12-26 23:09:20 公開日:2020-07-02
# セルフ・アテンションに基づくメタエンベディング Meta-Embeddings Based On Self-Attention ( http://arxiv.org/abs/2003.01371v3 ) ライセンス: Link先を確認	Qichen Li, Yuanqing Lin, Luofeng Zhou, Jian Li	(参考訳) 言語モデリングにおけるパフォーマンス向上のためのメタ組込みの作成が近年注目されており、複数の個別に訓練された組込みの算術平均を連結あるいは単に計算する手法が有用であることが示されている。本稿では,自己保持機構,すなわちDuoに基づくメタ埋め込みモデルを提案する。 0.4M未満のパラメータで、Duoメカニズムは20NGのようなテキスト分類タスクで最先端の精度を達成する。さらに,機械翻訳のためのメタ埋め込みシークエンスモデルを提案する。これは我々の知る限り,複数の単語埋め込みに基づく最初の機械翻訳モデルである。さらに、我々のモデルは、よりよい結果を得るだけでなく、WMT 2014英語-フランス語翻訳タスクのような認識されたベンチマークにより早く収束するという点で、Transformerよりも優れていることが判明した。 Creating meta-embeddings for better performance in language modelling has received attention lately, and methods based on concatenating, or simply taking the arithmetic mean of, more than one separately trained embedding have been shown to be beneficial. In this paper, we devise a new meta-embedding model based on the self-attention mechanism, namely the Duo. With fewer than 0.4M parameters, the Duo mechanism achieves state-of-the-art accuracy in text classification tasks such as 20NG. Additionally, we propose a new meta-embedding sequence-to-sequence model for machine translation, which, to the best of our knowledge, is the first machine translation model based on more than one word embedding. Furthermore, our model turns out to outperform the Transformer, not only achieving better results but also converging faster on recognized benchmarks such as the WMT 2014 English-to-French translation task.	翻訳日:2022-12-26 22:43:14 公開日:2020-07-02
# DeepFakes進化: 顔面領域の解析とフェイク検出性能 DeepFakes Evolution: Analysis of Facial Regions and Fake Detection Performance ( http://arxiv.org/abs/2004.07532v2 ) ライセンス: Link先を確認	Ruben Tolosana, Sergio Romero-Tapiador, Julian Fierrez and Ruben Vera-Rodriguez	(参考訳) メディアの法医学は、DeepFakesに関する懸念が高まり、ここ数年で多くの注目を集めている。 UADFVやFaceForensics++といった第1世代のDeepFakeデータベースから、Celeb-DFやDFDCといった第2世代の最新のデータベースに至るまで、多くの視覚的改善が行われており、フェイクビデオはほぼ人間の目で区別できない。本研究では,第1世代および第2世代DeepFakeの顔領域と偽検出性能を総合的に分析した。実験フレームワークでは2つの異なる方法が検討されている。一伝統的に、文献に従つて、偽検出システムへの入力として顔全体を選択すること、及び二偽検出システムへの入力としての特定の顔領域の選択に基づく新しいアプローチ実験の結果,第2世代の最新のDeepFakeデータベースにおいて,最先端のフェイク検出によって達成された偽検出結果が,15%から30%の誤差率で検出された。これらの結果は、より洗練された偽検出器を開発するためのさらなる研究の必要性を述べている。 Media forensics has attracted a lot of attention in the last years in part due to the increasing concerns around DeepFakes. Since the initial DeepFake databases from the 1st generation such as UADFV and FaceForensics++ up to the latest databases of the 2nd generation such as Celeb-DF and DFDC, many visual improvements have been carried out, making fake videos almost indistinguishable to the human eye. This study provides an exhaustive analysis of both 1st and 2nd DeepFake generations in terms of facial regions and fake detection performance. Two different methods are considered in our experimental framework: i) the traditional one followed in the literature and based on selecting the entire face as input to the fake detection system, and ii) a novel approach based on the selection of specific facial regions as input to the fake detection system. Among all the findings resulting from our experiments, we highlight the poor fake detection results achieved even by the strongest state-of-the-art fake detectors in the latest DeepFake databases of the 2nd generation, with Equal Error Rate results ranging from 15% to 30%. These results remark the necessity of further research to develop more sophisticated fake detectors.	翻訳日:2022-12-12 22:03:58 公開日:2020-07-02
# 驚き最小化による強化学習一般化 Reinforcement Learning Generalization with Surprise Minimization ( http://arxiv.org/abs/2004.12399v2 ) ライセンス: Link先を確認	Jerry Zikun Chen	(参考訳) 一般化は、しばしば同じ決定論的ゲーム環境上で訓練され、テストされる深層強化学習アルゴリズムにとって難しい問題である。テスト環境が目に見えず摂動的だが、タスクの性質が変わらず、一般化のギャップが生じる。本研究では,一般化ベンチマークにおけるサプライズ最小化エージェントの提案と評価を行い,エントロピーと確率性が一定である手続き的ゲーム環境において,単純な密度モデルから得られる付加的な報酬がロバスト性を示すことを示す。 Generalization remains a challenging problem for deep reinforcement learning algorithms, which are often trained and tested on the same set of deterministic game environments. When test environments are unseen and perturbed but the nature of the task remains the same, generalization gaps can arise. In this work, we propose a surprise-minimizing agent and evaluate it on a generalization benchmark, showing that an additional reward learned from a simple density model can yield robustness in procedurally generated game environments that provide a constant source of entropy and stochasticity.	翻訳日:2022-12-09 12:59:30 公開日:2020-07-02
# グラフニューラルネットワークによる不正検出の不整合問題を軽減する Alleviating the Inconsistency Problem of Applying Graph Neural Network to Fraud Detection ( http://arxiv.org/abs/2005.00625v3 ) ライセンス: Link先を確認	Zhiwei Liu, Yingtong Dou, Philip S. Yu, Yutong Deng, Hao Peng	(参考訳) このグラフベースのモデルは、疑わしい詐欺をオンラインで検出するのに役立つ。グラフニューラルネットワーク(gnns)の開発により、先行研究は均質グラフまたはヘテロジニアスグラフのいずれかに基づく多くのgnnベースの不正検出フレームワークを提案している。これらの研究は、近隣の情報を集約してノードの埋め込みを学ぶことで既存のGNNフレームワークに従っている。しかし,不整合問題,すなわちコンテキスト不整合,特徴不整合,関係不整合などについてはほとんど調査されていない。 In this paper, we introduce these inconsistencies and design a new GNN framework, $\mathsf{GraphConsis}$, to tackle the inconsistency problem: (1) for the context inconsistency, we propose to combine the context embeddings with node features, (2) for the feature inconsistency, we design a consistency score to filter the inconsistent neighbors and generate the corresponding sampling probability, and (3) for the relation inconsistency, we learn relation attention weights associated with the sampled nodes. 4つのデータセットに関する実証分析は、不正検出タスクにおいて不整合問題は不可欠であることを示している。広範な実験は$\mathsf{GraphConsis}$の有効性を証明する。また,SOTAモデルを実装したGNNベースの不正検出ツールボックスもリリースした。コードはhttps://github.com/safe-graph/DGFraudで公開されている。 The graph-based model can help to detect suspicious fraud online. Owing to the development of Graph Neural Networks (GNNs), prior research work has proposed many GNN-based fraud detection frameworks based on either homogeneous graphs or heterogeneous graphs. These works follow the existing GNN framework by aggregating the neighboring information to learn the node embedding, which relies on the assumption that the neighbors share similar context, features, and relations. However, the inconsistency problem is hardly investigated, i.e., the context inconsistency, feature inconsistency, and relation inconsistency. 
In this paper, we introduce these inconsistencies and design a new GNN framework, $\mathsf{GraphConsis}$, to tackle the inconsistency problem: (1) for the context inconsistency, we propose to combine the context embeddings with node features, (2) for the feature inconsistency, we design a consistency score to filter the inconsistent neighbors and generate the corresponding sampling probability, and (3) for the relation inconsistency, we learn relation attention weights associated with the sampled nodes. Empirical analysis on four datasets indicates that the inconsistency problem is crucial in a fraud detection task. The extensive experiments prove the effectiveness of $\mathsf{GraphConsis}$. We also released a GNN-based fraud detection toolbox with implementations of SOTA models. The code is available at https://github.com/safe-graph/DGFraud.	翻訳日:2022-12-08 00:30:33 公開日:2020-07-02
# グラフ準同型畳み込み Graph Homomorphism Convolution ( http://arxiv.org/abs/2005.01214v2 ) ライセンス: Link先を確認	Hoang NT, Takanori Maehara	(参考訳) 本稿では,グラフ準同型の観点からのグラフ分類問題について考察する。我々は、$F$ から $G$ への準同型を考えるが、ここでは$G$ は興味のあるグラフ(例えば分子やソーシャルネットワーク)であり、$F$ はいくつかのグラフ(例えばパスや非同型木)に属する。グラフ準同型数は自然不変量(同型不変量および$\mathcal{F}$-invariant)埋め込み写像を提供し、グラフの分類に利用できることを示した。グラフ分類器の表現力について、$\mathcal{F}$-indistinguishable の概念を用いて、$\mathcal{F}$-invariant 関数を近似するグラフ準同型ベクトルの普遍性を証明する。実際、元が有界木幅を持つ$\mathcal{F}$を選択することで、準同型法は他の方法と比較して効率的であることを示す。 In this paper, we study the graph classification problem from the graph homomorphism perspective. We consider the homomorphisms from $F$ to $G$, where $G$ is a graph of interest (e.g. molecules or social networks) and $F$ belongs to some family of graphs (e.g. paths or non-isomorphic trees). We show that graph homomorphism numbers provide a natural invariant (isomorphism invariant and $\mathcal{F}$-invariant) embedding maps which can be used for graph classification. Viewing the expressive power of a graph classifier by the $\mathcal{F}$-indistinguishable concept, we prove the universality property of graph homomorphism vectors in approximating $\mathcal{F}$-invariant functions. In practice, by choosing $\mathcal{F}$ whose elements have bounded tree-width, we show that the homomorphism method is efficient compared with other methods.	翻訳日:2022-12-07 06:22:47 公開日:2020-07-02
# 分散不一致:無条件テキスト生成のための指標 Distributional Discrepancy: A Metric for Unconditional Text Generation ( http://arxiv.org/abs/2005.01282v2 ) ライセンス: Link先を確認	Ping Cai, Xingyuan Chen, Peng Jin, Hongjun Wang, Tianrui Li	(参考訳) 非条件テキスト生成の目的は、実際の文でモデルを訓練し、トレーニングデータと同じ品質と多様性の新規な文を生成することである。しかし、無条件テキスト生成法を比較するために異なる指標を用いる場合、矛盾した結論が導かれる。難点は、モデルを評価する際に、サンプルの多様性と品質の両方を同時に考慮すべきである。この問題を解決するために, 生成した訓練文と実際の訓練文の差異に基づいて, 新たな分布的不一致尺度(dd)を考案した。しかし、実際の文の分布が不可能であるため、DDを直接計算することはできない。そこで本研究では,ニューラルネットワークを用いたテキスト分類器の訓練によりDDを推定する手法を提案する。比較のために,既存の3つの指標,二言語評価アンダースタディ (bleu) と自己ブレイン,言語モデルスコアと逆言語モデルスコア,Fréchet埋め込み距離を用いて,長期記憶の2つの一般的な生成モデルと,構文と実データの両方における生成事前学習トランスフォーマ2の評価を行った。実験結果から,DDは既存の3つの指標よりも有意に優れていることがわかった。 The purpose of unconditional text generation is to train a model with real sentences, then generate novel sentences of the same quality and diversity as the training data. However, when different metrics are used for comparing the methods of unconditional text generation, contradictory conclusions are drawn. The difficulty is that both the diversity and quality of the sample should be considered simultaneously when the models are evaluated. To solve this problem, a novel metric of distributional discrepancy (DD) is designed to evaluate generators based on the discrepancy between the generated and real training sentences. However, the DD cannot be computed directly because the distribution of real sentences is unavailable. Thus, we propose a method for estimating the DD by training a neural-network-based text classifier. For comparison, three existing metrics, bilingual evaluation understudy (BLEU) versus self-BLEU, language model score versus reverse language model score, and Fréchet embedding distance, along with the proposed DD, are used to evaluate two popular generative models of long short-term memory and generative pretrained transformer 2 on both syntactic and real data. 
Experimental results show that DD is significantly better than the three existing metrics for ranking these generative models.	翻訳日:2022-12-07 00:48:04 公開日:2020-07-02
# MLSolv-A: Pairwise Atomistic Interactions による解答自由エネルギーの機械学習による予測 MLSolv-A: A Novel Machine Learning-Based Prediction of Solvation Free Energies from Pairwise Atomistic Interactions ( http://arxiv.org/abs/2005.06182v2 ) ライセンス: Link先を確認	Hyuntae Lim and YounJoon Jung	(参考訳) 機械学習とその応用の最近の進歩は、重要な化学特性のための多様な構造-プロパティ関係モデルの開発に結びついており、その1つが溶解自由エネルギーである。本稿では,一対の原子間相互作用から溶解エネルギーを計算するMLに基づく新しい解法モデルを提案する。 2つのエンコーディング関数は、与えられた化学構造から原子の特徴ベクトルを抽出し、2つの原子論的特徴の間の内積はそれらの相互作用を計算する。 6,493 の試験結果から, 溶媒非特異性によるトレーニングデータの拡大に優れた性能と伝達性を得た。相互作用マップの解析から,本モデルが解離エネルギーに対する群寄与を再現する大きな可能性が示唆され,このモデルが予測対象特性を提供するだけでなく,より詳細な物理化学的洞察を与えると考えている。 Recent advances in machine learning and their applications have led to the development of diverse structure-property relationship models for crucial chemical properties, and the solvation free energy is one of them. Here, we introduce a novel ML-based solvation model, which calculates the solvation energy from pairwise atomistic interactions. The novelty of the proposed model consists of a simple architecture: two encoding functions extract atomic feature vectors from the given chemical structure, while the inner product between two atomistic features calculates their interactions. The results on 6,493 experimental measurements achieve outstanding performance and transferability for enlarging training data due to its solvent-non-specific nature. Analysis of the interaction map shows there is a great potential that our model reproduces group contributions on the solvation energy, which makes us believe that the model not only provides the predicted target property but also gives us more detailed physicochemical insights.	翻訳日:2022-12-03 12:50:52 公開日:2020-07-02
# 頭部検出と追跡熱マップを用いた店内多人数追跡に向けて Towards in-store multi-person tracking using head detection and track heatmaps ( http://arxiv.org/abs/2005.08009v2 ) ライセンス: Link先を確認	Aibek Musaev, Jiangping Wang, Liang Zhu, Cheng Li, Yi Chen, Jialin Liu, Wanqi Zhang, Juan Mei, De Wang	(参考訳) コンピュータビジョンアルゴリズムは、技術革新を可能にするために、さまざまな産業で実装されている。本稿では,小売業におけるコンピュータビジョンに基づく顧客追跡の問題について検討する。この目的のために,スーパーマーケットにおける顧客行動の模倣を行うオフィス環境において,カメラから収集したデータセットを導入する。さらに,このデータセットを用いた頭部追跡モデルに基づく参加者追跡の例を示し,閉塞による誤りの最小化を図った。さらに,顧客とスタッフの行動パターンに基づいた認識モデルを提案する。モデルは24時間にわたってスーパーマーケットで収集された実世界のデータセットを用いて評価され、トレーニング中の98%の精度と評価時の93%の精度を達成している。 Computer vision algorithms are being implemented across a breadth of industries to enable technological innovations. In this paper, we study the problem of computer vision based customer tracking in retail industry. To this end, we introduce a dataset collected from a camera in an office environment where participants mimic various behaviors of customers in a supermarket. In addition, we describe an illustrative example of the use of this dataset for tracking participants based on a head tracking model in an effort to minimize errors due to occlusion. Furthermore, we propose a model for recognizing customers and staff based on their movement patterns. The model is evaluated using a real-world dataset collected in a supermarket over a 24-hour period that achieves 98% accuracy during training and 93% accuracy during evaluation.	翻訳日:2022-12-02 13:33:06 公開日:2020-07-02
# mixboard: 知識に富んだスタイリッシュな統合テキスト生成プラットフォーム MixingBoard: a Knowledgeable Stylized Integrated Text Generation Platform ( http://arxiv.org/abs/2005.08365v2 ) ライセンス: Link先を確認	Xiang Gao, Michel Galley, Bill Dolan	(参考訳) MixingBoardは、知識に基づくスタイル付きテキスト生成に焦点を当てた、デモを素早く構築するプラットフォームです。我々は、既存のテキスト生成アルゴリズムを共有コードベースに統合し、制約付き生成に以前のアルゴリズムをさらに適応させる。異なるモデルから利点を借用するため、トークン確率レベルから潜在空間レベルまで、クロスモデル統合のための戦略を実装します。外部知識へのインタフェースは、Webやドキュメントコレクションからオンザフライで関連する知識を取得するモジュールを介して提供される。ローカル開発用のユーザインターフェース、リモートWebページアクセス、RESTful APIが提供され、ユーザが自身のデモを簡単に構築できるようになる。 We present MixingBoard, a platform for quickly building demos with a focus on knowledge grounded stylized text generation. We unify existing text generation algorithms in a shared codebase and further adapt earlier algorithms for constrained generation. To borrow advantages from different models, we implement strategies for cross-model integration, from the token probability level to the latent space level. An interface to external knowledge is provided via a module that retrieves on-the-fly relevant knowledge from passages on the web or any document collection. A user interface for local development, remote webpage access, and a RESTful API are provided to make it simple for users to build their own demos.	翻訳日:2022-12-02 05:35:02 公開日:2020-07-02
# ExKMC: 説明可能な$k$-Meansクラスタの拡張 ExKMC: Expanding Explainable $k$-Means Clustering ( http://arxiv.org/abs/2006.02399v2 ) ライセンス: Link先を確認	Nave Frost, Michal Moshkovitz, Cyrus Rashtchian	(参考訳) 説明可能なAIの人気にもかかわらず、教師なし学習の効果的な方法には限界がある。説明可能性と精度のトレードオフに着目し,$k$-meansクラスタリングのアルゴリズムについて検討した。以前の作業の後、データセットを$k$クラスタに分割するために、小さな決定ツリーを使用します。これにより、各クラスタ割り当てを、単一機能しきい値の短いシーケンスで説明できる。大きな木はより正確なクラスタリングを生成するが、さらに複雑な説明を必要とする。フレキシビリティを実現するために、新しい説明可能な$k$-meansクラスタリングアルゴリズムであるExKMCを開発し、$k' \geq k$を加算し、$k'$の葉を持つ決定木を出力する。木を効率的に拡張し、葉に$k$クラスタの1つをラベル付けするために、新しいサロゲートコストを使用します。 k'$が増加するにつれて、サロゲートコストは増加せず、したがって説明可能性と精度を交換する。実験により,ExKMCが低コストのクラスタリングを実現し,標準的な決定木法と説明可能なクラスタリングのためのアルゴリズムの両方に優れることを確認した。 ExKMCの実装はhttps://github.com/navefr/ExKMCで公開されている。 Despite the popularity of explainable AI, there is limited work on effective methods for unsupervised learning. We study algorithms for $k$-means clustering, focusing on a trade-off between explainability and accuracy. Following prior work, we use a small decision tree to partition a dataset into $k$ clusters. This enables us to explain each cluster assignment by a short sequence of single-feature thresholds. While larger trees produce more accurate clusterings, they also require more complex explanations. To allow flexibility, we develop a new explainable $k$-means clustering algorithm, ExKMC, that takes an additional parameter $k' \geq k$ and outputs a decision tree with $k'$ leaves. We use a new surrogate cost to efficiently expand the tree and to label the leaves with one of $k$ clusters. We prove that as $k'$ increases, the surrogate cost is non-increasing, and hence, we trade explainability for accuracy. Empirically, we validate that ExKMC produces a low cost clustering, outperforming both standard decision tree methods and other algorithms for explainable clustering. Implementation of ExKMC available at https://github.com/navefr/ExKMC.	翻訳日:2022-11-25 17:46:50 公開日:2020-07-02
# 極小標本の解釈可能な時系列分類 Interpretable Time-series Classification on Few-shot Samples ( http://arxiv.org/abs/2006.02031v2 ) ライセンス: Link先を確認	Wensi Tang, Lu Liu, Guodong Long	(参考訳) 最近のマイナショット学習は、クラスやサンプルのない新しいタスクに素早く適応するために、事前のメタ知識を持つモデルをトレーニングすることに焦点を当てている。しかし,従来の時系列分類アルゴリズムでは,このシナリオに対処できない。既存の数発の学習手法は、画像やテキストデータに取り組むために提案されており、その多くは、解釈性に欠けるニューラルベースモデルである。本稿では,ニューラルネットワークに基づくモデルを学習するだけでなく,そのモデルを双対粒度から解釈する,数発時系列分類のための解釈可能なニューラルベースフレームワークである \textit{dual prototypical shapelet networks (dpsn)"を提案する。 1)代表時系列サンプルを用いたグローバル概観,及び 2)識別型シェープレットを用いた局所ハイライト。特に、生成された二重原型形状体は、クラス内の全てのサンプルの全体形状を主に示す代表サンプルと、異なるクラスを区別するために使用できる識別的部分長形状体から構成される。我々は,公開ベンチマークデータセットから18個の少数ショットtscデータセットを導出し,ベースラインとの比較により提案手法を評価した。 DPSNフレームワークは、特に限られた量のデータによるトレーニングにおいて、最先端の時系列分類方法より優れている。モデルの解釈能力を示すためにいくつかの事例研究がなされている。 Recent few-shot learning works focus on training a model with prior meta-knowledge to fast adapt to new tasks with unseen classes and samples. However, conventional time-series classification algorithms fail to tackle the few-shot scenario. Existing few-shot learning methods are proposed to tackle image or text data, and most of them are neural-based models that lack interpretability. This paper proposes an interpretable neural-based framework, namely \textit{Dual Prototypical Shapelet Networks (DPSN)} for few-shot time-series classification, which not only trains a neural network-based model but also interprets the model from dual granularity: 1) global overview using representative time series samples, and 2) local highlights using discriminative shapelets. In particular, the generated dual prototypical shapelets consist of representative samples that can mostly demonstrate the overall shapes of all samples in the class and discriminative partial-length shapelets that can be used to distinguish different classes. We have derived 18 few-shot TSC datasets from public benchmark datasets and evaluated the proposed method by comparing with baselines. 
The DPSN framework outperforms state-of-the-art time-series classification methods, especially when training with limited amounts of data. Several case studies are given to demonstrate the interpretability of our model.	翻訳日:2022-11-25 17:17:25 公開日:2020-07-02
# 低レベルシングルトンを越えた双方向プログラミングのための汎用一階アルゴリズムフレームワーク A Generic First-Order Algorithmic Framework for Bi-Level Programming Beyond Lower-Level Singleton ( http://arxiv.org/abs/2006.04045v2 ) ライセンス: Link先を確認	Risheng Liu, Pan Mu, Xiaoming Yuan, Shangzhi Zeng, Jin Zhang	(参考訳) 近年,二段階最適化問題の解法として,勾配に基づく一階法が開発されている。しかしながら、これらの既存のアプローチの理論的保証は、各固定された上層変数に対して、下層解がシングルトン(LLS)でなければならないという単純化に大きく依存している。本研究では,まず,LLS条件の無効化を示す反例を設計する。次に、楽観的な2レベル情報の観点からBLPを定式化し、階層的目的情報を集約することで、汎用的2レベル最適化のための柔軟でモジュール化されたアルゴリズムフレームワークであるBDA(Bilevel Descent Aggregation)を確立する。理論的には、LLS条件なしでBDAの収束を証明する新しい手法を導出する。我々の研究は、BDAが特定の一階計算モジュールの検証と互換性があることも示している。さらに、興味深い副産物として、従来の一階二階スキーム(LLS単純化)も改善する。特に、より弱い仮定で収束を確立する。広範にわたる実験により,提案するbdaの高パラメータ最適化やメタ学習など,さまざまなタスクに対する優越性が実証された。 In recent years, a variety of gradient-based first-order methods have been developed to solve bi-level optimization problems for learning applications. However, theoretical guarantees of these existing approaches heavily rely on the simplification that for each fixed upper-level variable, the lower-level solution must be a singleton (a.k.a. Lower-Level Singleton, LLS). In this work, we first design a counter-example to illustrate the invalidation of such an LLS condition. Then, by formulating BLPs from the optimistic bi-level viewpoint and aggregating hierarchical objective information, we establish Bi-level Descent Aggregation (BDA), a flexible and modularized algorithmic framework for generic bi-level optimization. Theoretically, we derive a new methodology to prove the convergence of BDA without the LLS condition. Our investigations also demonstrate that BDA is indeed compatible with a variety of particular first-order computation modules. Additionally, as an interesting byproduct, we also improve these conventional first-order bi-level schemes (under the LLS simplification). Particularly, we establish their convergence under weaker assumptions. 
Extensive experiments justify our theoretical results and demonstrate the superiority of the proposed BDA for different tasks, including hyper-parameter optimization and meta learning.	翻訳日:2022-11-24 07:09:37 公開日:2020-07-02
# WaveNODE:音声合成のための連続正規化フロー WaveNODE: A Continuous Normalizing Flow for Speech Synthesis ( http://arxiv.org/abs/2006.04598v4 ) ライセンス: Link先を確認	Hyeongju Kim, Hyeonseung Lee, Woo Hyun Kang, Sung Jun Cheon, Byoung Jin Choi, Nam Soo Kim	(参考訳) 近年,高忠実度波形をリアルタイムに生成するフローベース生成モデルが提案されている。しかし、これらのモデルは、よく訓練された教師ネットワークか、メモリ非効率な複数のフローステップを必要とする。本稿では,音声合成のための連続正規化フローを利用するWaveNODEという新しい生成モデルを提案する。従来のモデルとは異なり、WaveNODEはフロー操作に使用する関数に制約を課さないため、より柔軟で複雑な関数を使用することができる。さらに、WaveNODEは教師ネットワークや補助的損失項を必要とせずに、可能性の最大化に最適化することができる。本研究では,従来のフローベースボコーダに比べて少ないパラメータでウェーブヌードが同等の性能を発揮することを示す。 In recent years, various flow-based generative models have been proposed to generate high-fidelity waveforms in real-time. However, these models require either a well-trained teacher network or a number of flow steps making them memory-inefficient. In this paper, we propose a novel generative model called WaveNODE which exploits a continuous normalizing flow for speech synthesis. Unlike the conventional models, WaveNODE places no constraint on the function used for flow operation, thus allowing the usage of more flexible and complex functions. Moreover, WaveNODE can be optimized to maximize the likelihood without requiring any teacher network or auxiliary loss terms. We experimentally show that WaveNODE achieves comparable performance with fewer parameters compared to the conventional flow-based vocoders.	翻訳日:2022-11-24 01:17:25 公開日:2020-07-02
# 限定アノテーションによる有糸分裂検出 : 共同学習によるアプローチ Mitosis Detection Under Limited Annotation: A Joint Learning Approach ( http://arxiv.org/abs/2006.09772v2 ) ライセンス: Link先を確認	Pushpak Pati, Antonio Foncubierta-Rodriguez, Orcun Goksel, Maria Gabrani	(参考訳) 有糸分裂計数は乳癌における腫瘍増殖の重要な予後指標である。深層学習に基づくmitotic detectionは病理学者と同等だが、トレーニングには大きなラベル付きデータが必要である。本研究では,ソフトマックス損失によるクラスラベル情報と,距離メトリック学習によるサンプル間の空間分布情報を活用することで,mitosis検出の深部分類フレームワークを提案する。また,学習を促進するための情報的サンプルを着実に提供するための戦略についても検討する。提案手法の有効性は,ICPR 2012 およびAMIDA 2013 mitotic data による評価により確立された。本フレームワークは,トレーニングデータ全体の使用方法と比較して,少ないトレーニングデータによる検出を著しく改善し,同等あるいは優れたパフォーマンスを実現している。 Mitotic counting is a vital prognostic marker of tumor proliferation in breast cancer. Deep learning-based mitotic detection is on par with pathologists, but it requires large amounts of labeled data for training. We propose a deep classification framework for enhancing mitosis detection by leveraging class label information, via softmax loss, and spatial distribution information among samples, via distance metric learning. We also investigate strategies towards steadily providing informative samples to boost the learning. The efficacy of the proposed framework is established through evaluation on ICPR 2012 and AMIDA 2013 mitotic data. Our framework significantly improves the detection with small training data and achieves on par or superior performance compared to state-of-the-art methods when using the entire training data.	翻訳日:2022-11-19 20:35:35 公開日:2020-07-02
# GCC: Graph Contrastive Coding for Graph Neural Network Pre-Training ( http://arxiv.org/abs/2006.09963v3 ) License: see link	Jiezhong Qiu, Qibin Chen, Yuxiao Dong, Jing Zhang, Hongxia Yang, Ming Ding, Kuansan Wang, Jie Tang	Graph representation learning has emerged as a powerful technique for addressing real-world problems. Various downstream graph learning tasks have benefited from its recent developments, such as node classification, similarity search, and graph classification. However, prior work on graph representation learning focuses on domain-specific problems and trains a dedicated model for each graph dataset, which is usually non-transferable to out-of-domain data. Inspired by the recent advances in pre-training from natural language processing and computer vision, we design Graph Contrastive Coding (GCC) -- a self-supervised graph neural network pre-training framework -- to capture the universal network topological properties across multiple networks. We design GCC's pre-training task as subgraph instance discrimination in and across networks and leverage contrastive learning to empower graph neural networks to learn the intrinsic and transferable structural representations. We conduct extensive experiments on three graph learning tasks and ten graph datasets. The results show that GCC pre-trained on a collection of diverse datasets can achieve performance competitive with or better than its task-specific and trained-from-scratch counterparts. This suggests that the pre-training and fine-tuning paradigm presents great potential for graph representation learning.	Translated: 2022-11-19 20:27:25 Published: 2020-07-02
# An Overview on the Landscape of R Packages for Credit Scoring ( http://arxiv.org/abs/2006.11835v2 ) License: see link	Gero Szepannek	The credit scoring industry has a long tradition of using statistical tools for loan default probability prediction, and domain-specific standards were established long before the hype of machine learning. Although several commercial software companies offer specific solutions for credit scorecard modelling, explicit packages for this purpose have long been missing in R. In recent years this has changed, and several packages have been developed which are dedicated to credit scoring. The aim of this paper is to give a structured overview of these packages. This may guide users to select the appropriate functions for a desired purpose and will hopefully contribute to directing future development activities. The paper is guided by the chain of subsequent modelling steps that form the typical scorecard development process.	Translated: 2022-11-18 12:42:56 Published: 2020-07-02
# Practical applications of metric space magnitude and weighting vectors ( http://arxiv.org/abs/2006.14063v2 ) License: see link	Eric Bunch, Daniel Dickinson, Jeffery Kline, Glenn Fung	Metric space magnitude, an active subject of research in algebraic topology, originally arose in the context of biology, where it was used to represent the effective number of distinct species in an environment. In a more general setting, the magnitude of a metric space is a real number that aims to quantify the effective number of distinct points in the space. The contribution of each point to a metric space's global magnitude, which is encoded by the weighting vector, captures much of the underlying geometry of the original metric space. Surprisingly, when the metric space is Euclidean, the weighting vector also serves as an effective tool for boundary detection. This allows the weighting vector to serve as the foundation of novel algorithms for classic machine learning tasks such as classification, outlier detection and active learning. We demonstrate, using experiments and comparisons on classic benchmark datasets, the promise of the proposed magnitude and weighting vector-based approaches.	Translated: 2022-11-17 10:06:52 Published: 2020-07-02
# Ramanujan Bipartite Graph Products for Efficient Block Sparse Neural Networks ( http://arxiv.org/abs/2006.13486v2 ) License: see link	Dharma Teja Vooturi, Girish Varma, Kishore Kothapalli	Sparse neural networks are shown to give accurate predictions competitive with denser versions, while also minimizing the number of arithmetic operations performed. However, current hardware such as GPUs can only exploit structured sparsity patterns for better efficiency. Hence the run time of a sparse neural network may not correspond to the arithmetic operations required. In this work, we propose the RBGP (Ramanujan Bipartite Graph Product) framework for generating structured multi-level block sparse neural networks by using the theory of graph products. We also propose to use products of Ramanujan graphs, which give the best connectivity for a given level of sparsity. This essentially ensures that (i) the network has structured block sparsity for which runtime-efficient algorithms exist, (ii) the model gives high prediction accuracy, due to the better expressive power derived from the connectivity of the graph, and (iii) the graph data structure has a succinct representation that can be stored efficiently in memory. We use our framework to design a specific connectivity pattern called RBGP4 which makes efficient use of the memory hierarchy available on GPU. We benchmark our approach by experimenting on an image classification task over the CIFAR dataset using VGG19 and WideResnet-40-4 networks and achieve 5-9x and 2-5x runtime gains over unstructured and block sparsity patterns respectively, while achieving the same level of accuracy.	Translated: 2022-11-17 09:49:06 Published: 2020-07-02
# A Comparison of Uncertainty Estimation Approaches in Deep Learning Components for Autonomous Vehicle Applications ( http://arxiv.org/abs/2006.15172v2 ) License: see link	Fabio Arnez (1), Huascar Espinoza (1), Ansgar Radermacher (1) and François Terrier (1) ((1) CEA LIST)	A key factor for ensuring safety in Autonomous Vehicles (AVs) is to avoid any abnormal behaviors under undesirable and unpredicted circumstances. As AVs increasingly rely on Deep Neural Networks (DNNs) to perform safety-critical tasks, different methods for uncertainty quantification have recently been proposed to measure the inevitable source of errors in data and models. However, uncertainty quantification in DNNs is still a challenging task. These methods require a higher computational load, a higher memory footprint, and introduce extra latency, which can be prohibitive in safety-critical applications. In this paper, we provide a brief and comparative survey of methods for uncertainty quantification in DNNs along with existing metrics to evaluate uncertainty predictions. We are particularly interested in understanding the advantages and downsides of each method for specific AV tasks and types of uncertainty sources.	Translated: 2022-11-16 21:13:07 Published: 2020-07-02
# Deep Involutive Generative Models for Neural MCMC ( http://arxiv.org/abs/2006.15167v2 ) License: see link	Span Spanbauer, Cameron Freer, Vikash Mansinghka	We introduce deep involutive generative models, a new architecture for deep generative modeling, and use them to define Involutive Neural MCMC, a new approach to fast neural MCMC. An involutive generative model represents a probability kernel $G(\phi \mapsto \phi')$ as an involutive (i.e., self-inverting) deterministic function $f(\phi, \pi)$ on an enlarged state space containing auxiliary variables $\pi$. We show how to make these models volume preserving, and how to use deep volume-preserving involutive generative models to make valid Metropolis-Hastings updates based on an auxiliary variable scheme with an easy-to-calculate acceptance ratio. We prove that deep involutive generative models and their volume-preserving special case are universal approximators for probability kernels. This result implies that with enough network capacity and training time, they can be used to learn arbitrarily complex MCMC updates. We define a loss function and optimization algorithm for training parameters given simulated data. We also provide initial experiments showing that Involutive Neural MCMC can efficiently explore multi-modal distributions that are intractable for Hybrid Monte Carlo, and can converge faster than A-NICE-MC, a recently introduced neural MCMC technique.	Translated: 2022-11-16 20:37:55 Published: 2020-07-02
# Decorrelated Clustering with Data Selection Bias ( http://arxiv.org/abs/2006.15874v2 ) License: see link	Xiao Wang, Shaohua Fan, Kun Kuang, Chuan Shi, Jiawei Liu and Bai Wang	Most existing clustering algorithms are proposed without considering the selection bias in data. In many real applications, however, one cannot guarantee the data is unbiased. Selection bias might introduce unexpected correlations between features, and ignoring those unexpected correlations will hurt the performance of clustering algorithms. Therefore, how to remove those unexpected correlations induced by selection bias is extremely important yet largely unexplored for clustering. In this paper, we propose a novel Decorrelation regularized K-Means algorithm (DCKM) for clustering with data selection bias. Specifically, the decorrelation regularizer aims to learn the global sample weights which are capable of balancing the sample distribution, so as to remove unexpected correlations among features. Meanwhile, the learned weights are combined with k-means, which makes the reweighted k-means cluster on the inherent data distribution without unexpected correlation influence. Moreover, we derive the updating rules to effectively infer the parameters in DCKM. Extensive experimental results on real-world datasets demonstrate that our DCKM algorithm achieves significant performance gains, indicating the necessity of removing unexpected feature correlations induced by selection bias when clustering.	Translated: 2022-11-15 13:36:22 Published: 2020-07-02
# Unsupervised Deep Learning for Massive MIMO Hybrid Beamforming ( http://arxiv.org/abs/2007.00038v2 ) License: see link	Hamed Hojatian, Jeremy Nadal, Jean-Francois Frigon, Francois Leduc-Primeau	Hybrid beamforming is a promising technique to reduce the complexity and cost of massive multiple-input multiple-output (MIMO) systems while providing high data rates. However, the hybrid precoder design is a challenging task requiring channel state information (CSI) feedback and solving a complex optimization problem. This paper proposes a novel RSSI-based unsupervised deep learning method to design the hybrid beamforming in massive MIMO systems. Furthermore, we propose i) a method to design the synchronization signal (SS) in initial access (IA); and ii) a method to design the codebook for the analog precoder. We also evaluate the system performance through a realistic channel model in various scenarios. We show that the proposed method not only greatly increases the spectral efficiency, especially in frequency-division duplex (FDD) communication, by using partial CSI feedback, but also has near-optimal sum-rate and outperforms other state-of-the-art full-CSI solutions.	Translated: 2022-11-15 06:22:54 Published: 2020-07-02
# Mining Documentation to Extract Hyperparameter Schemas ( http://arxiv.org/abs/2006.16984v2 ) License: see link	Guillaume Baudart, Peter D. Kirchner, Martin Hirzel, Kiran Kate	AI automation tools need machine-readable hyperparameter schemas to define their search spaces. At the same time, AI libraries often come with good human-readable documentation. While such documentation contains most of the necessary information, it is unfortunately not ready for consumption by tools. This paper describes how to automatically mine Python docstrings in AI libraries to extract JSON Schemas for their hyperparameters. We evaluate our approach on 119 transformers and estimators from three different libraries and find that it is effective at extracting machine-readable schemas. Our vision is to reduce the burden of manually creating and maintaining such schemas for AI automation tools and to broaden the reach of automation to larger libraries and richer schemas.	Translated: 2022-11-15 05:19:56 Published: 2020-07-02
# Is Robustness To Transformations Driven by Invariant Neural Representations? ( http://arxiv.org/abs/2007.00112v2 ) License: see link	Syed Suleman Abbas Zaidi, Xavier Boix, Neeraj Prasad, Sharon Gilad-Gutnick, Shlomit Ben-Ami, Pawan Sinha	Deep Convolutional Neural Networks (DCNNs) have demonstrated impressive robustness to recognize objects under transformations (e.g. blur or noise) when these transformations are included in the training set. A hypothesis to explain such robustness is that DCNNs develop invariant neural representations that remain unaltered when the image is transformed. Yet, to what extent this hypothesis holds true is an outstanding question, as including transformations in the training set could lead to properties different from invariance, e.g. parts of the network could be specialized to recognize either transformed or non-transformed images. In this paper, we analyze the conditions under which invariance emerges. To do so, we leverage the fact that invariant representations facilitate robustness to transformations for object categories that are not seen transformed during training. Our results with state-of-the-art DCNNs indicate that invariant representations strengthen as the number of transformed categories in the training set is increased. This is much more prominent with local transformations such as blurring and high-pass filtering than with geometric transformations such as rotation and thinning, which entail changes in the spatial arrangement of the object. Our results contribute to a better understanding of invariant representations in deep learning, and the conditions under which invariance spontaneously emerges.	Translated: 2022-11-15 05:02:16 Published: 2020-07-02
# Learning to Read through Machine Teaching ( http://arxiv.org/abs/2006.16470v2 ) License: see link	Ayon Sen, Christopher R. Cox, Matthew Cooper Borkenhagen, Mark S. Seidenberg and Xiaojin Zhu	Learning to read words aloud is a major step towards becoming a reader. Many children struggle with the task because of the inconsistencies of English spelling-sound correspondences. Curricula vary enormously in how these patterns are taught. Children are nonetheless expected to master the system in limited time (by grade 4). We used a cognitively interesting neural network architecture to examine whether the sequence of learning trials could be structured to facilitate learning. This is a hard combinatorial optimization problem even for a modest number of learning trials (e.g., 10K). We show how this sequence optimization problem can be posed as optimizing over a time-varying distribution, i.e., defining probability distributions over words at different steps in training. We then use stochastic gradient descent to find an optimal time-varying distribution and a corresponding optimal training sequence. We observed significant improvement in generalization accuracy compared to baseline conditions (random sequences; sequences biased by word frequency). These findings suggest an approach to improving learning outcomes in domains where performance depends on the ability to generalize beyond limited training experience.	Translated: 2022-11-15 04:17:43 Published: 2020-07-02
# MLPs to Find Extrema of Functionals ( http://arxiv.org/abs/2007.00530v2 ) License: see link	Tao Liu	The multilayer perceptron (MLP) is a class of networks composed of multiple layers of perceptrons, and it is essentially a mathematical function. Based on MLPs, we develop a new numerical method to find the extrema of functionals. As demonstrations, we present our solutions in three physics scenarios. Ideally, the same method is applicable to any case where the objective curve/surface can be fitted by second-order differentiable functions. This method can also be extended to cases where there are a finite number of non-differentiable (but continuous) points/surfaces.	Translated: 2022-11-14 23:29:23 Published: 2020-07-02
# Medical idioms for clinical Bayesian network development ( http://arxiv.org/abs/2007.00364v2 ) License: see link	Evangelia Kyrimi, Mariana Raniere Neves, Scott McLachlan, Martin Neil, William Marsh, Norman Fenton	Bayesian Networks (BNs) are graphical probabilistic models that have proven popular in medical applications. While numerous medical BNs have been published, most are presented fait accompli without explanation of how the network structure was developed or justification of why it represents the correct structure for the given medical application. This means that the process of building medical BNs from experts is typically ad hoc and offers little opportunity for methodological improvement. This paper proposes generally applicable and reusable medical reasoning patterns to aid those developing medical BNs. The proposed method complements and extends the idiom-based approach introduced by Neil, Fenton, and Nielsen in 2000. We propose instances of their generic idioms that are specific to medical BNs. We refer to the proposed medical reasoning patterns as medical idioms. In addition, we extend the use of idioms to represent interventional and counterfactual reasoning. We believe that the proposed medical idioms are logical reasoning patterns that can be combined, reused and applied generically to help develop medical BNs. All proposed medical idioms have been illustrated using medical examples on coronary artery disease. The method has also been applied to other ongoing BNs being developed with medical experts. Finally, we show that applying the proposed medical idioms to published BN models results in models with a clearer structure.	Translated: 2022-11-14 23:12:06 Published: 2020-07-02
# Object Goal Navigation using Goal-Oriented Semantic Exploration ( http://arxiv.org/abs/2007.00643v2 ) License: see link	Devendra Singh Chaplot, Dhiraj Gandhi, Abhinav Gupta, Ruslan Salakhutdinov	This work studies the problem of object goal navigation, which involves navigating to an instance of the given object category in unseen environments. End-to-end learning-based navigation methods struggle at this task as they are ineffective at exploration and long-term planning. We propose a modular system called 'Goal-Oriented Semantic Exploration', which builds an episodic semantic map and uses it to explore the environment efficiently based on the goal object category. Empirical results in visually realistic simulation environments show that the proposed model outperforms a wide range of baselines including end-to-end learning-based methods as well as modular map-based methods, and it led to the winning entry of the CVPR-2020 Habitat ObjectNav Challenge. Ablation analysis indicates that the proposed model learns semantic priors of the relative arrangement of objects in a scene, and uses them to explore efficiently. The domain-agnostic module design allows us to transfer our model to a mobile robot platform and achieve similar performance for object goal navigation in the real world.	Translated: 2022-11-14 23:04:06 Published: 2020-07-02
# Unifying Model Explainability and Robustness via Machine-Checkable Concepts ( http://arxiv.org/abs/2007.00251v2 ) License: see link	Vedant Nanda, Till Speicher, John P. Dickerson, Krishna P. Gummadi, Muhammad Bilal Zafar	As deep neural networks (DNNs) get adopted in an ever-increasing number of applications, explainability has emerged as a crucial desideratum for these models. In many real-world tasks, one of the principal reasons for requiring explainability is to in turn assess prediction robustness, where predictions (i.e., class labels) that do not conform to their respective explanations (e.g., presence or absence of a concept in the input) are deemed to be unreliable. However, most, if not all, prior methods for checking explanation-conformity (e.g., LIME, TCAV, saliency maps) require significant manual intervention, which hinders their large-scale deployability. In this paper, we propose a robustness-assessment framework, at the core of which is the idea of using machine-checkable concepts. Our framework defines a large number of concepts that the DNN explanations could be based on and performs the explanation-conformity check at test time to assess prediction robustness. Both steps are executed in an automated manner without requiring any human intervention and are easily scaled to datasets with a very large number of classes. Experiments on real-world datasets and human surveys show that our framework is able to enhance prediction robustness significantly: the predictions marked to be robust by our framework have significantly higher accuracy and are more robust to adversarial perturbations.	Translated: 2022-11-14 22:36:37 Published: 2020-07-02
# Continuous-Time Bayesian Networks with Clocks ( http://arxiv.org/abs/2007.00347v2 ) License: see link	Nicolai Engelmann, Dominik Linzner, Heinz Koeppl	Structured stochastic processes evolving in continuous time present a widely adopted framework to model phenomena occurring in nature and engineering. However, such models are often chosen to satisfy the Markov property to maintain tractability. Among the more popular of such memoryless models are Continuous Time Bayesian Networks (CTBNs). In this work, we lift their restriction to exponential survival times to arbitrary distributions. Current extensions achieve this via auxiliary states, which hinder tractability. To avoid that, we introduce a set of node-wise clocks to construct a collection of graph-coupled semi-Markov chains. We provide algorithms for parameter and structure inference, which make use of local dependencies, and conduct experiments on synthetic data and a data set generated through a benchmark tool for gene regulatory networks. In doing so, we point out advantages compared to current CTBN extensions.	Translated: 2022-11-14 22:08:52 Published: 2020-07-02
# Drug discovery with explainable artificial intelligence ( http://arxiv.org/abs/2007.00523v2 ) License: see link	José Jiménez-Luna, Francesca Grisoni, Gisbert Schneider	Deep learning bears promise for drug discovery, including advanced image analysis, prediction of molecular structure and function, and automated generation of innovative chemical entities with bespoke properties. Despite the growing number of successful prospective applications, the underlying mathematical models often remain elusive to interpretation by the human mind. There is a demand for 'explainable' deep learning methods to address the need for a new narrative of the machine language of the molecular sciences. This review summarizes the most prominent algorithmic concepts of explainable artificial intelligence, and ventures a forecast of the future opportunities, potential applications, and remaining challenges.	Translated: 2022-11-14 21:42:29 Published: 2020-07-02
# Deep Interactive Learning: An Efficient Labeling Approach for Deep Learning-Based Osteosarcoma Treatment Response Assessment ( http://arxiv.org/abs/2007.01383v1 ) License: see link	David Joon Ho, Narasimhan P. Agaram, Peter J. Schueffler, Chad M. Vanderbilt, Marc-Henri Jean, Meera R. Hameed, Thomas J. Fuchs	Osteosarcoma is the most common malignant primary bone tumor. Standard treatment includes pre-operative chemotherapy followed by surgical resection. The response to treatment as measured by the ratio of necrotic tumor area to overall tumor area is a known prognostic factor for overall survival. This assessment is currently done manually by pathologists by looking at glass slides under the microscope, which may not be reproducible due to its subjective nature. Convolutional neural networks (CNNs) can be used for automated segmentation of viable and necrotic tumor on osteosarcoma whole slide images. One bottleneck for supervised learning is that large amounts of accurate annotations are required for training, which is a time-consuming and expensive process. In this paper, we describe Deep Interactive Learning (DIaL) as an efficient labeling approach for training CNNs. After an initial labeling step is done, annotators only need to correct mislabeled regions from previous segmentation predictions to improve the CNN model until satisfactory predictions are achieved. Our experiments show that our CNN model trained by only 7 hours of annotation using DIaL can successfully estimate ratios of necrosis within the expected inter-observer variation rate for a non-standardized manual surgical pathology task.	Translated: 2022-11-14 15:04:24 Published: 2020-07-02
# ADD:摩擦接触を有する多体系のための解析的に微分可能な動力学 ADD: Analytically Differentiable Dynamics for Multi-Body Systems with Frictional Contact ( http://arxiv.org/abs/2007.00987v1 ) ライセンス: Link先を確認	Moritz Geilinger, David Hahn, Jonas Zehnder, Moritz Bächer, Bernhard Thomaszewski, Stelian Coros	(参考訳) 本稿では,統一フレームワーク内で剛体および変形可能な物体の摩擦接触を処理可能な微分可能ダイナミクスソルバを提案する。法線方向および接線方向の接触力を原理的に平滑化(mollification)することにより, 摩擦接触の非スムース性に固有の主な困難を回避できる。我々は,この新しい接触モデルと完全陰的な時間積分を組み合わせることで,解析的に微分可能なロバストで効率的なダイナミクスソルバを得る。本定式化は,随伴感度解析と合わせて,シミュレーション精度と目的関数ランドスケープの滑らかさの間の適応的なトレードオフを伴う勾配に基づく最適化を実現する。我々は,剛体,粘弾性材料,結合多体系を含む一連のシミュレーション例について,本手法を徹底的に解析する。さらに,変形可能な物体のパラメータ推定,ロボット操作の動作計画,柔軟な歩行ロボットの軌道最適化,制御ポリシーの効率的な自己教師あり学習への微分可能シミュレータの応用について紹介する。 We present a differentiable dynamics solver that is able to handle frictional contact for rigid and deformable objects within a unified framework. Through a principled mollification of normal and tangential contact forces, our method circumvents the main difficulties inherent to the non-smooth nature of frictional contact. We combine this new contact model with fully-implicit time integration to obtain a robust and efficient dynamics solver that is analytically differentiable. In conjunction with adjoint sensitivity analysis, our formulation enables gradient-based optimization with adaptive trade-offs between simulation accuracy and smoothness of objective function landscapes. We thoroughly analyse our approach on a set of simulation examples involving rigid bodies, visco-elastic materials, and coupled multi-body systems. We furthermore showcase applications of our differentiable simulator to parameter estimation for deformable objects, motion planning for robotic manipulation, trajectory optimization for compliant walking robots, as well as efficient self-supervised learning of control policies.	翻訳日:2022-11-14 15:03:30 公開日:2020-07-02
# Channel Compression: CNNアーキテクチャにおけるチャネル間の情報冗長性の再考 Channel Compression: Rethinking Information Redundancy among Channels in CNN Architecture ( http://arxiv.org/abs/2007.01696v1 ) ライセンス: Link先を確認	Jinhua Liang, Tao Zhang, and Guoqing Feng	(参考訳) 組込みデバイスやモバイルアプリケーションへの需要により、モデル圧縮とアクセラレーションが注目を集めている。効率的な畳み込みニューラルネットワーク(cnns)の研究は、畳み込み計算を分解または最適化することで特徴冗長性を取り除くことを目的としている。本研究では,CNNアーキテクチャのチャネル間に特徴冗長性が存在すると仮定し,計算効率を高めるためのフリーウェイを提供する。チャネル圧縮を前提として,空間畳み込み,チャネルグループ化,プール操作の進展を受け入れるために,コンパクト畳み込みという新しい畳み込み構造を提案する。具体的には、奥行き分離可能な畳み込みとポイントワイドチャネル操作を利用して特徴を効率的に抽出する。学習可能な重みを通常導入する既存のチャネル圧縮法とは異なり、提案するコンパクト畳み込みは余分なパラメータなしで特徴冗長性を低減できる。ポイントワイズチャネル間操作により、コンパクト畳み込みは特徴写像のチャネル次元を暗黙的に絞り込む。ニューラルネットワークにおけるチャネル冗長性を低減するためのルールを検討するために、異なるポイントワイズチャネル間操作の比較を行う。さらに,音場分類,音事象検出,画像分類などの複数の課題に対処するために,コンパクトな畳み込みが拡張される。実験により, マルチメディアタスクにおいて, コンパクトな畳み込みは高い有効性を示すだけでなく, 並列計算によって効率よく実装できることを示した。 Model compression and acceleration are attracting increasing attentions due to the demand for embedded devices and mobile applications. Research on efficient convolutional neural networks (CNNs) aims at removing feature redundancy by decomposing or optimizing the convolutional calculation. In this work, feature redundancy is assumed to exist among channels in CNN architectures, which provides some leeway to boost calculation efficiency. Aiming at channel compression, a novel convolutional construction named compact convolution is proposed to embrace the progress in spatial convolution, channel grouping and pooling operation. Specifically, the depth-wise separable convolution and the point-wise interchannel operation are utilized to efficiently extract features. Different from the existing channel compression method which usually introduces considerable learnable weights, the proposed compact convolution can reduce feature redundancy with no extra parameters. With the point-wise interchannel operation, compact convolutions implicitly squeeze the channel dimension of feature maps. 
To explore the rules on reducing channel redundancy in neural networks, the comparison is made among different point-wise interchannel operations. Moreover, compact convolutions are extended to tackle with multiple tasks, such as acoustic scene classification, sound event detection and image classification. The extensive experiments demonstrate that our compact convolution not only exhibits high effectiveness in several multimedia tasks, but also can be efficiently implemented by benefiting from parallel computation.	翻訳日:2022-11-14 15:02:26 公開日:2020-07-02
# 不確実性下におけるデータ駆動肯定行動政策に向けて Towards Data-Driven Affirmative Action Policies under Uncertainty ( http://arxiv.org/abs/2007.01202v1 ) ライセンス: Link先を確認	Corinna Hertweck, Carlos Castillo, Michael Mathioudakis	(参考訳) 本稿では,大学進学者と大学プログラムの合致に等級と標準試験点を用いた中央システム下での大学入試について検討する。我々は、承認申請者数を過小評価グループから増やそうとする肯定的な行動方針を検討する。このような方針を申請期間の開始前に発表する必要があるため、各プログラムに応募する学生のスコア分布に不確実性がある。これは政策立案者にとって難しい課題となる。我々は,過去のデータに基づいてトレーニングされた予測モデルを用いて,これらのポリシーのパラメータを最適化する可能性を検討する。 In this paper, we study university admissions under a centralized system that uses grades and standardized test scores to match applicants to university programs. We consider affirmative action policies that seek to increase the number of admitted applicants from underrepresented groups. Since such a policy has to be announced before the start of the application period, there is uncertainty about the score distribution of the students applying to each program. This poses a difficult challenge for policy-makers. We explore the possibility of using a predictive model trained on historical data to help optimize the parameters of such policies.	翻訳日:2022-11-14 14:56:15 公開日:2020-07-02
# マイクロコントローラのための効率的なニューラルネットワーク配置 Efficient Neural Network Deployment for Microcontroller ( http://arxiv.org/abs/2007.01348v1 ) ライセンス: Link先を確認	Hasan Unlu	(参考訳) ニューラルネットワークのエッジコンピューティングは、特に低電力アプリケーションやオフラインデバイスで重要になっている。 TensorFlow LiteとPyTorch Mobileはこの目的でリリースされた。しかし、これらは主にモバイルデバイスをサポートしており、マイクロコントローラレベルにはまだ対応していない。マイクロコントローラのサポートは今まさに新興の分野である。プルーニング、バイナライゼーション、レイヤ操作(すなわちオペレータの並べ替え)など、ネットワークサイズと計算負荷を削減する手法は数多く存在する。本稿では,マイクロコントローラ向けの畳み込みニューラルネットワークの配置を検討し一般化する。その際、2次元畳み込みと全結合層においてメモリ節約と計算効率をもたらす2つの新しい最適化を提案する。1つ目は、ストライドがプーリングカーネルサイズ以上の場合に適用できるインプレース最大プーリングである。2つ目は、層間でピンポンバッファを使用してメモリ消費を大幅に削減することである。メモリの節約と性能は、ARM Cortex-M CPU用に開発されたCMSIS-NNフレームワークと比較される。最終的な目的は、学習済みのネットワーク重みを持つPyTorchモデルを入力とし、低メモリ(キロバイトレベル)で計算能力の限られたマイクロコントローラ向けに、C/C++で最適化された推論エンジン(順伝播)へ変換するツールを開発することである。 Edge computing for neural networks is getting important especially for low power applications and offline devices. TensorFlow Lite and PyTorch Mobile were released for this purpose. But they mainly support mobile devices instead of microcontroller level yet. Microcontroller support is an emerging area now. There are many approaches to reduce network size and compute load like pruning, binarization and layer manipulation i.e. operator reordering. This paper is going to explore and generalize convolution neural network deployment for microcontrollers with two novel optimization proposals offering memory saving and compute efficiency in 2D convolutions as well as fully connected layers. The first one is in-place max-pooling, if the stride is greater than or equal to pooling kernel size. The second optimization is to use ping-pong buffers between layers to reduce memory consumption significantly. The memory savings and performance will be compared with CMSIS-NN framework developed for ARM Cortex-M CPUs. 
The final purpose is to develop a tool that consumes a PyTorch model with trained network weights and turns it into an optimized inference engine (forward pass) in C/C++ for microcontrollers with low memory (kilobyte level) and limited computing capability.	翻訳日:2022-11-14 14:55:22 公開日:2020-07-02
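The in-place max-pooling idea mentioned in the abstract above can be illustrated with a minimal sketch. It assumes a single-channel, row-major buffer and writes the pooled values compactly at the front of the same buffer; the function name `max_pool2d_inplace` and this layout are illustrative assumptions, not the paper's actual implementation.

```python
def max_pool2d_inplace(buf, height, width, kernel, stride):
    """Max-pool a row-major height x width buffer, reusing `buf` for the output.

    Hypothetical helper for illustration. The pooled values end up in
    buf[0 : out_h * out_w]; the rest of the buffer is left as scratch.
    """
    if stride < kernel:
        # The abstract states the in-place variant for stride >= kernel size,
        # so this sketch enforces the same condition.
        raise ValueError("in-place pooling expects stride >= kernel")
    out_h = (height - kernel) // stride + 1
    out_w = (width - kernel) // stride + 1
    for i in range(out_h):
        for j in range(out_w):
            # Read the whole window before writing anything.
            best = buf[i * stride * width + j * stride]
            for di in range(kernel):
                row = (i * stride + di) * width
                for dj in range(kernel):
                    value = buf[row + j * stride + dj]
                    if value > best:
                        best = value
            # The write index never exceeds the index of any unread input.
            buf[i * out_w + j] = best
    return out_h, out_w


data = list(range(1, 17))  # 4 x 4 input holding 1..16
shape = max_pool2d_inplace(data, 4, 4, kernel=2, stride=2)
pooled = data[: shape[0] * shape[1]]  # [6, 8, 14, 16]
```

Because no second output buffer is allocated, the peak memory for this layer is just the input buffer, which is the kind of saving the abstract is after on kilobyte-scale devices.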
# WattScale: 大規模建物のエネルギー効率分析のためのデータ駆動型アプローチ WattScale: A Data-driven Approach for Energy Efficiency Analytics of Buildings at Scale ( http://arxiv.org/abs/2007.01382v1 ) ライセンス: Link先を確認	Srinivasan Iyengar, Stephen Lee, David Irwin, Prashant Shenoy, Benjamin Weil	(参考訳) ビルは現代社会の総エネルギーの40%以上を消費しており、そのエネルギー効率を向上させることでエネルギーフットプリントを大幅に削減できる。本稿では,都市や地域の大規模な建物群からエネルギー効率の最も低い建物を識別するためのデータ駆動型手法である WattScale を提案する。点推定を利用する最小二乗法のような従来の方法とは異なり、 WattScale は建物に影響を与えるパラメータの分布を推定することによって、日々のエネルギー消費における確率性を捉えるためにベイズ推定を用いる。さらに、所定の母集団内の類似した住宅と比較する。 WattScale はまた、エネルギー非効率の原因を特定するために障害検出アルゴリズムも組み込んでいる。我々は,異なる地理的位置から得られたグラウンドトゥルースデータを用いてアプローチを検証する。 WattScale には (i)個別、および (ii)地域ベースの2つの実行モードがあり,2つのケーススタディで示す。個別実行モードでは,1万棟以上の建物を有する都市において,建物の半数以上が何らかの点で非効率であることを示し,エネルギー改善対策に大きな可能性があることを示唆する。さらに, 非効率の推定原因として, 住宅のそれぞれ41%, 23.73%, 0.51%に建物外皮の不良, 暖房システムの故障, 冷却システムの故障がみられた。地域ベースの実行モードでは、代表的なエネルギーデータセットが最近利用可能になったため、WattScaleを米国内の何百万もの家庭に拡張可能であることを示す。 Buildings consume over 40% of the total energy in modern societies, and improving their energy efficiency can significantly reduce our energy footprint. In this paper, we present WattScale, a data-driven approach to identify the least energy-efficient buildings from a large population of buildings in a city or a region. Unlike previous methods such as least-squares that use point estimates, WattScale uses Bayesian inference to capture the stochasticity in the daily energy usage by estimating the distribution of parameters that affect a building. Further, it compares them with similar homes in a given population. WattScale also incorporates a fault detection algorithm to identify the underlying causes of energy inefficiency. We validate our approach using ground truth data from different geographical locations, which showcases its applicability in various settings. WattScale has two execution modes -- (i) individual, and (ii) region-based, which we highlight using two case studies. 
For the individual execution mode, we present results from a city containing >10,000 buildings and show that more than half of the buildings are inefficient in one way or another indicating a significant potential from energy improvement measures. Additionally, we provide probable cause of inefficiency and find that 41%, 23.73%, and 0.51% homes have poor building envelope, heating, and cooling system faults, respectively. For the region-based execution mode, we show that WattScale can be extended to millions of homes in the US due to the recent availability of representative energy datasets.	翻訳日:2022-11-14 14:55:03 公開日:2020-07-02
# ロバスト統計枠組みにおける正規フィルタリングに基づく表面雑音化 Surface Denoising based on Normal Filtering in a Robust Statistics Framework ( http://arxiv.org/abs/2007.00842v1 ) ライセンス: Link先を確認	Sunil Kumar Yadav and Martin Skrodzki and Eric Zimmermann and Konrad Polthier	(参考訳) 3Dスキャナーを用いた表面取得プロセスでは、ノイズは避けられず、幾何学処理の重要なステップは、これらのノイズ成分をこれらの表面から除去することである。除音処理(除音)は、まず表面の正常をフィルタリングし、その後にフィルターされた正常に応じて頂点位置を調整することで行うことができる。したがって、多くの解法アルゴリズムでは、ノイズのない正規分布の計算が鍵となる。ノイズ除去のための様々なフィルタが標準から導入され、外周に対するロバスト性や大きな雑音振幅といったフォーカスポイントが異なる。これらのフィルタは様々な面において良好に機能するが、それらの関係を確立するための統一的なフレームワークが欠落し、各手法の性能を超える理論的解析を提供する。本稿では,メッシュデノイジングの面正規化と点集合デノイジングの頂点正規化に広く使用されている多数の非線形フィルタの関係性を確立するための枠組みを提案する。 m-スモーザーを用いたロバストな統計推定と線形および非線形正規フィルタリングへの応用について述べる。これらの手法は拡散・バイラテラル・方向曲率に基づくアルゴリズムを含む異なる数学的理論に起源があるが、ロバストな誤差ノルムと対応する影響関数を用いてロバスト統計の統一的な枠組みに全ての手法が組み入れられることを実証する。この統一は、個々の方法とその相互関係をよりよく理解するのに役立つ。さらに、提案フレームワークは、既知のフィルタの利点を組み合わせ、利用可能なメソッドと比較するための新しいテクニックのためのプラットフォームを提供する。 During a surface acquisition process using 3D scanners, noise is inevitable and an important step in geometry processing is to remove these noise components from these surfaces (given as points-set or triangulated mesh). The noise-removal process (denoising) can be performed by filtering the surface normals first and by adjusting the vertex positions according to filtered normals afterwards. Therefore, in many available denoising algorithms, the computation of noise-free normals is a key factor. A variety of filters have been introduced for noise-removal from normals, with different focus points like robustness against outliers or large amplitude of noise. Although these filters are performing well in different aspects, a unified framework is missing to establish the relation between them and to provide a theoretical analysis beyond the performance of each method. In this paper, we introduce such a framework to establish relations between a number of widely-used nonlinear filters for face normals in mesh denoising and vertex normals in point set denoising. 
We cover robust statistical estimation with M-smoothers and their application to linear and non-linear normal filtering. Although these methods originate in different mathematical theories - which include diffusion-, bilateral-, and directional curvature-based algorithms - we demonstrate that all of them can be cast into a unified framework of robust statistics using robust error norms and their corresponding influence functions. This unification contributes to a better understanding of the individual methods and their relations with each other. Furthermore, the presented framework provides a platform for new techniques to combine the advantages of known filters and to compare them with available methods.	翻訳日:2022-11-14 14:54:12 公開日:2020-07-02
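As a rough illustration of robustly weighted normal filtering, the sketch below performs one smoothing step for a single normal. It assumes a Gaussian weight on the distance between normals, which is one common choice of robust weighting function; the function names and the `sigma` value are hypothetical and do not reproduce any specific filter from the paper.

```python
import math


def gaussian_weight(distance, sigma):
    """Weight that decays quickly for large deviations (outliers)."""
    return math.exp(-(distance * distance) / (2.0 * sigma * sigma))


def filter_normal(center, neighbors, sigma):
    """One robust averaging step for a unit normal.

    `center` and each entry of `neighbors` are 3-component unit vectors.
    Neighbors whose normals differ strongly from `center` get a small weight,
    so sharp features and outliers influence the result only slightly.
    """
    acc = [0.0, 0.0, 0.0]
    for normal in neighbors:
        weight = gaussian_weight(math.dist(center, normal), sigma)
        acc = [a + weight * c for a, c in zip(acc, normal)]
    length = math.sqrt(sum(a * a for a in acc))
    if length == 0.0:
        return list(center)  # degenerate case: keep the input normal
    return [a / length for a in acc]


smoothed = filter_normal(
    [0.0, 0.0, 1.0],
    [[0.0, 0.0, 1.0], [0.0, 0.0, 1.0], [1.0, 0.0, 0.0]],  # last one is an outlier
    sigma=0.3,
)
```

Swapping `gaussian_weight` for a different weighting function changes which filter this step resembles, which is the kind of relation between filters the abstract describes.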
# フィギュアスケートビデオにおけるハイライト検出のための瞬目確率の推定 Estimating Blink Probability for Highlight Detection in Figure Skating Videos ( http://arxiv.org/abs/2007.01089v1 ) ライセンス: Link先を確認	Tamami Nakano, Atsuya Sakata, Akihiro Kishimoto	(参考訳) スポーツビデオのハイライト検出は幅広い視聴者と商業的可能性を秘めている。したがって、人間の興味により適したハイライトシーンを時間的精度で検出することが不可欠である。注意グラフ作成中の瞬きを直感的に抑制し、ビデオの注目ブレークポイントで瞬きを同期的に生成するため、瞬き瞬き率を人的関心の高精度な時間指標として利用することができる。そこで本研究では,点滅率に基づく新しいハイライト自動検出手法を提案する。本手法は,1次元畳み込みネットワーク (1d-cnn) を訓練し,フィギュアスケートビデオの時空間的ポーズ特徴から各フレームの点滅率を評価する。実験の結果,ビデオクリップの94%で瞬き速度を推定し,ジャンプイベント周辺の瞬き速度の時間変化を高精度に予測できることがわかった。さらに,代表的な運動動作だけでなく,フィギュアスケートのパフォーマンスをキーフレームとして表現する特徴的な芸術的表現も検出する。このことは、ブランクレートに基づく教師あり学習アプローチにより、人間の感受性により近い精度のハイライト検出が可能になることを示唆している。 Highlight detection in sports videos has a broad viewership and huge commercial potential. It is thus imperative to detect highlight scenes more suitably for human interest with high temporal accuracy. Since people instinctively suppress blinks during attention-grabbing events and synchronously generate blinks at attention break points in videos, the instantaneous blink rate can be utilized as a highly accurate temporal indicator of human interest. Therefore, in this study, we propose a novel, automatic highlight detection method based on the blink rate. The method trains a one-dimensional convolution network (1D-CNN) to assess blink rates at each video frame from the spatio-temporal pose features of figure skating videos. Experiments show that the method successfully estimates the blink rate in 94% of the video clips and predicts the temporal change in the blink rate around a jump event with high accuracy. Moreover, the method detects not only the representative athletic action, but also the distinctive artistic expression of figure skating performance as key frames. This suggests that the blink-rate-based supervised learning approach enables high-accuracy highlight detection that more closely matches human sensibility.	翻訳日:2022-11-14 14:47:38 公開日:2020-07-02
# 学習可能な滑らかさ事前分布を持つ深層学習を用いた大域的最適表面セグメンテーション Globally Optimal Surface Segmentation using Deep Learning with Learnable Smoothness Priors ( http://arxiv.org/abs/2007.01217v1 ) ライセンス: Link先を確認	Leixin Zhou, Xiaodong Wu	(参考訳) 多くの医用画像解析アプリケーションにおいて, 自動表面セグメンテーションは重要かつ困難である。近年,様々な物体セグメンテーションタスクのための深層学習手法が開発されている。その多くは分類に基づくアプローチであり、例えばU-netは、各ボクセルがターゲットオブジェクトまたは背景である確率を予測する。これらの方法の1つの問題は、セグメンテーションされたオブジェクトに対するトポロジー保証の欠如であり、通常、オブジェクトの境界面を推定するには後処理が必要である。本稿では,エンドツーエンドの学習で表面セグメンテーション問題に取り組むために,畳み込みニューラルネットワーク(CNN)とそれに続く学習可能な表面平滑化ブロックに基づく新しいモデルを提案する。我々の知る限りでは、大域的最適性を持つ直接的な表面セグメンテーションのために、CNNとエンドツーエンドで滑らかさの事前分布を学習する最初の研究である。 Spectral Domain Optical Coherence Tomography (SD-OCT) の網膜層セグメンテーションと Intravascular Ultrasound (IVUS) の血管壁セグメンテーションで行った実験は、非常に有望な結果を示した。 Automated surface segmentation is important and challenging in many medical image analysis applications. Recent deep learning based methods have been developed for various object segmentation tasks. Most of them are a classification based approach, e.g. U-net, which predicts the probability of being target object or background for each voxel. One problem of those methods is lacking of topology guarantee for segmented objects, and usually post processing is needed to infer the boundary surface of the object. In this paper, a novel model based on convolutional neural network (CNN) followed by a learnable surface smoothing block is proposed to tackle the surface segmentation problem with end-to-end training. To the best of our knowledge, this is the first study to learn smoothness priors end-to-end with CNN for direct surface segmentation with global optimality. Experiments carried out on Spectral Domain Optical Coherence Tomography (SD-OCT) retinal layer segmentation and Intravascular Ultrasound (IVUS) vessel wall segmentation demonstrated very promising results.	翻訳日:2022-11-14 14:47:19 公開日:2020-07-02
# 検出対象物の量と品質を最大化するためのuavチームにおける監視位置の自律的・協調的設計 Autonomous and cooperative design of the monitor positions for a team of UAVs to maximize the quantity and quality of detected objects ( http://arxiv.org/abs/2007.01247v1 ) ライセンス: Link先を確認	Dimitrios I. Koutras, Athanasios Ch. Kapoutsis and Elias B. Kosmatopoulos	(参考訳) 本稿では,UAVの群れを未知の地形内に配置する問題に対処し,その全体的意識を最大化することを目的とした。状況認識は、UAVの視野の中で、ユニークな関心の対象の数と品質によって表現される。 YOLOv3と複製対象を識別するシステムを用いて、各UAVの構成に1つのスコアを割り当てた。そこで,UAVや環境のダイナミクスを考慮せずに,予め定義されたスコアを最適化できる新しいナビゲーションアルゴリズムを提案する。提案手法の基盤はブロック座標降下 (bcd) のアプローチと同じ収束特性を共有することである。提案手法の有効性と性能を,AirSimシミュレータ内の一連の実験を用いて評価した。実験により,提案した航法アルゴリズムは,UAVの群れを安定して「戦略的」な監視位置へ移動し,異なる数の群れに適応できることが示唆された。ソースコードはhttps://github.com/dimikout3/ConvCAOAirSimで入手できる。 This paper tackles the problem of positioning a swarm of UAVs inside a completely unknown terrain, having as objective to maximize the overall situational awareness. The situational awareness is expressed by the number and quality of unique objects of interest, inside the UAVs' fields of view. YOLOv3 and a system to identify duplicate objects of interest were employed to assign a single score to each UAVs' configuration. Then, a novel navigation algorithm, capable of optimizing the previously defined score, without taking into consideration the dynamics of either UAVs or environment, is proposed. A cornerstone of the proposed approach is that it shares the same convergence characteristics as the block coordinate descent (BCD) family of approaches. The effectiveness and performance of the proposed navigation scheme were evaluated utilizing a series of experiments inside the AirSim simulator. The experimental evaluation indicates that the proposed navigation algorithm was able to consistently navigate the swarm of UAVs to "strategic" monitoring positions and also adapt to the different number of swarm sizes. Source code is available at https://github.com/dimikout3/ConvCAOAirSim.	翻訳日:2022-11-14 14:47:00 公開日:2020-07-02
# 自動車組み立てラインの視覚検査のためのディープラーニングモデル Deep Learning Models for Visual Inspection on Automotive Assembling Line ( http://arxiv.org/abs/2007.01857v1 ) ライセンス: Link先を確認	Muriel Mazzetto and Marcelo Teixeira and Érick Oliveira Rodrigues and Dalcimar Casanova	(参考訳) 自動車製造組立タスクは、加工面上のスクラッチ識別、部品識別と選択など、製品品質とプロセス品質を保証する視覚検査に基づいて構築される。これらのタスクは、同じ製造ライン内で生産される複数の種類の車両と関連付けられる。視覚検査は基本的に人間主導だったが、コンピュータビジョンシステム(CVS)が提供する人工的な知覚によって補われた。関連性にもかかわらず、CVSの精度は、照明、囲い、画像取得の品質といった環境設定によって異なる。これらの問題はコストのかかる解決策を伴い、主に工場の運転サイクルタイムに支障をきたすコンピュータビジョンシステムによってもたらされる利点の一部をオーバーライドする。そこで,本稿では,製造環境に足跡をほとんど残さずに視覚検査作業を支援するための深層学習に基づく手法を提案し,cvsセットアップを容易にするエンドツーエンドツールとして探索する。提案手法は, 物体検出, 意味セグメンテーション, 異常検出のモデルに基づく実自動車組立ラインにおける4つの概念実証によって示される。 Automotive manufacturing assembly tasks are built upon visual inspections such as scratch identification on machined surfaces, part identification and selection, etc, which guarantee product and process quality. These tasks can be related to more than one type of vehicle that is produced within the same manufacturing line. Visual inspection was essentially human-led but has recently been supplemented by the artificial perception provided by computer vision systems (CVSs). Despite their relevance, the accuracy of CVSs varies accordingly to environmental settings such as lighting, enclosure and quality of image acquisition. These issues entail costly solutions and override part of the benefits introduced by computer vision systems, mainly when it interferes with the operating cycle time of the factory. In this sense, this paper proposes the use of deep learning-based methodologies to assist in visual inspection tasks while leaving very little footprints in the manufacturing environment and exploring it as an end-to-end tool to ease CVSs setup. The proposed approach is illustrated by four proofs of concept in a real automotive assembly line based on models for object detection, semantic segmentation, and anomaly detection.	翻訳日:2022-11-14 14:45:41 公開日:2020-07-02
# 多レベルグラフサーチによるサーマルグライダーのマルチエージェント計画 Multi-agent Planning for thermalling gliders using multi level graph-search ( http://arxiv.org/abs/2007.01334v1 ) ライセンス: Link先を確認	Muhammad Aneeq uz Zaman and Aamer Iqbal Bhatti	(参考訳) 本稿では,グライダー群における経路計画問題を解く。グライダーは、一連の関心点を訪問する任務を負う。グライダーの航続距離は限られているが、サーマルと呼ばれる特別な地点を訪れることで航続距離を拡大することができる。本稿で扱う問題は、グライダーが訪れる関心点の総数を最大化するような経路計画である。これをマルチエージェント問題(multi-agent problem)と呼ぶ。この問題は、まず複数の単一エージェント問題に分解することで解決される。単一エージェント問題では、一組の関心点が単一のグライダーに割り当てられる。この問題は、割り当てられた集合から訪問した関心点の数を最大化する経路を計画することで解決される。これは、以前の研究で示したように、均一コストグラフ探索によって実現される。するとマルチエージェント問題は、各グライダーに対する(関心点の)最適な割り当てを決定することに帰着する。この問題を解くために、先行研究で示したブルートフォース探索アプローチと、Branch&Bound型グラフ探索の2つの方法を示す。 Branch&Boundアプローチは、この論文の主な貢献である。この手法は最適であることが証明され、シミュレーションによりブルートフォース探索よりも高速であることが示されている。 This paper solves a path planning problem for a group of gliders. The gliders are tasked with visiting a set of interest points. The gliders have limited range but are able to increase their range by visiting special points called thermals. The problem addressed in this paper is of path planning for the gliders such that, the total number of interest points visited by the gliders is maximized. This is referred to as the multi-agent problem. The problem is solved by first decomposing it into several single-agent problems. In a single-agent problem a set of interest points are allocated to a single glider. This problem is solved by planning a path which maximizes the number of visited interest points from the allocated set. This is achieved through a uniform cost graph search, as shown in our earlier work. The multi-agent problem now consists of determining the best allocation (of interest points) for each glider. Two ways are presented of solving this problem, a brute force search approach as shown in earlier work and a Branch&Bound type graph search. The Branch&Bound approach is the main contribution of the paper. 
This approach is proven to be optimal and shown to be faster than the brute force search using simulations.	翻訳日:2022-11-14 14:45:26 公開日:2020-07-02
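The single-agent step in the abstract above relies on a uniform cost graph search. The following is a generic sketch of that search on a small weighted graph, assuming non-negative edge costs; the graph, node names, and the function `uniform_cost_search` are hypothetical, and the sketch does not model the paper's range limits, thermals, or interest-point objective.

```python
import heapq


def uniform_cost_search(graph, start, goal):
    """Return (cost, path) of a cheapest path from start to goal, or None.

    `graph` maps each node to a dict {neighbor: non-negative edge cost}.
    """
    frontier = [(0, start, [start])]  # priority queue ordered by path cost
    settled = {}                      # cheapest cost at which a node was expanded
    while frontier:
        cost, node, path = heapq.heappop(frontier)
        if node == goal:
            return cost, path
        if node in settled and settled[node] <= cost:
            continue  # already expanded via a path that was at least as cheap
        settled[node] = cost
        for neighbor, step in graph.get(node, {}).items():
            heapq.heappush(frontier, (cost + step, neighbor, path + [neighbor]))
    return None


example_graph = {
    "A": {"B": 1, "C": 4},
    "B": {"C": 1, "D": 5},
    "C": {"D": 1},
}
result = uniform_cost_search(example_graph, "A", "D")  # (3, ["A", "B", "C", "D"])
```

A branch-and-bound allocation layer, as described in the abstract, would call a single-agent planner like this one repeatedly while pruning allocations that cannot beat the best solution found so far.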
# Spores: E-Squadのためのステートレス予測オニオンルーティング Spores: Stateless Predictive Onion Routing for E-Squads ( http://arxiv.org/abs/2007.04766v1 ) ライセンス: Link先を確認	Daniel Bosk (KTH), Yérom-David Bromberg (WIDE, IRISA), Sonja Buchegger (KTH), Adrien Luxey (WIDE, IRISA), François Taïani (WIDE, IRISA)	(参考訳) 国家機関や企業による大衆監視は、今やよく知られている事実である。ジャーナリストや内部告発者には、調査のために世界的なスパイ行為を回避する手段がまだない。 Sporesでは、ジャーナリストとその情報源が、物理的に会う際に事後的なファイル交換を計画する方法を提案する。我々は1人当たりの個人機器の増加を利用して、軽量で堅牢で完全に匿名の分散型ファイル転送プロトコルをユーザ間で提供する。 Sporesは、我々の新しい概念であるe-squadに基づいている。すなわち、ゴシップ通信プロトコルによって知的になった個人のデバイス群は、ユーザに対してプライベートで信頼性の高いサービスを提供できる。人々のe-squadは、信頼性のあるルーティングを提供しながら、パーソナル機器に固有の信頼性の低さに耐えられる、新しいオニオンルーティングネットワークにフェデレートされる。 Sporesの性能は競争力があり、通信のプライバシ特性は最先端のオニオンルーティング戦略を上回っている。 Mass surveillance of the population by state agencies and corporate parties is now a well-known fact. Journalists and whistle-blowers still lack means to circumvent global spying for the sake of their investigations. With Spores, we propose a way for journalists and their sources to plan a posteriori file exchanges when they physically meet. We leverage on the multiplication of personal devices per capita to provide a lightweight, robust and fully anonymous decentralised file transfer protocol between users. Spores hinges on our novel concept of e-squads: one's personal devices, rendered intelligent by gossip communication protocols, can provide private and dependable services to their user. People's e-squads are federated into a novel onion routing network, able to withstand the inherent unreliability of personal appliances while providing reliable routing. Spores' performances are competitive, and its privacy properties of the communication outperform state of the art onion routing strategies.	翻訳日:2022-11-14 14:45:07 公開日:2020-07-02
# Lingua Francaとしての高階論理 - 論証的談話と深い論理的分析の統合 Higher-order Logic as Lingua Franca -- Integrating Argumentative Discourse and Deep Logical Analysis ( http://arxiv.org/abs/2007.01019v1 ) ライセンス: Link先を確認	David Fuenmayor and Christoph Benzmüller	(参考訳) 本稿では,従来の高階論理に対する最先端自動推論技術の適用による議論的言説の深い多元論理解析へのアプローチを提案する。表現性のおかげで、この論理は、形式化された引数(深い論理構造)と弁証的相互作用(攻撃と支援関係)の両方をエンコーディングできる一様な lingua franca の状態を採用することができる。気候工学に関する議論からの抜粋を分析して,これを説明する。もう一つの新しい貢献は、古典高階論理における非古典論理の浅い意味的埋め込み(sses)を特徴づけ、評価するための抽象的、言語理論的基礎の定義に関するものである。新たな視点は、論理と論理の組み合わせのセマンティックな埋め込みのより簡潔でエレガントな特徴づけを可能にし、いくつかの例で示される。 We present an approach towards the deep, pluralistic logical analysis of argumentative discourse that benefits from the application of state-of-the-art automated reasoning technology for classical higher-order logic. Thanks to its expressivity this logic can adopt the status of a uniform lingua franca allowing the encoding of both formalized arguments (their deep logical structure) and dialectical interactions (their attack and support relations). We illustrate this by analyzing an excerpt from an argumentative debate on climate engineering. Another, novel contribution concerns the definition of abstract, language-theoretical foundations for the characterization and assessment of shallow semantical embeddings (SSEs) of non-classical logics in classical higher-order logic, which constitute a pillar stone of our approach. The novel perspective we draw enables more concise and more elegant characterizations of semantical embeddings of logics and logic combinations, which is demonstrated with several examples.	翻訳日:2022-11-14 14:37:41 公開日:2020-07-02
# 人工的な愚かさ Artificial Stupidity ( http://arxiv.org/abs/2007.03616v1 ) ライセンス: Link先を確認	Michael Falk	(参考訳) AIに関する公的な議論は、AIが超人的になり人間のコントロールから逃れるという恐れ、すなわちフランケンシュタイン症候群に支配されている。超知能は確かに可能性の一つであるが、それが呼び起こす関心は、より差し迫った懸念、すなわち人工的な愚かさ(Artificial Stupidity, AS)の台頭から大衆の目をそらす可能性がある。この記事では、メアリー・シェリーの1818年の有名な小説におけるフランケンシュタイン症候群のルーツについて論じる。次に、人工エージェントの愚かさを分析する哲学的枠組みを提供し、現代のインテリジェントシステムが「判断の愚かさ」に苦しんでいるとみなせることを示す。最後に、ASの危険と利益を明らかにする代替の文学的伝統を特定する。エドマンド・スペンサー、ジョナサン・スウィフト、E・T・A・ホフマンの著作では、ASは人間のユーザを置き換え、抑圧し、誘惑する。より楽観的に、Joseph FurphyとLaurence Sterneは、地図やパイプとして人間の知性に役立つASを想像する。これらの作家は、現在のAI論争を駆動している神話に対する強力な対抗的物語を提供する。彼らは、例えばステレオタイプに訴えたり、私たちを現実から遠ざけたりすることによって、愚かな人工エージェントでさえ人間のコントロールを回避できる方法を特定する。そして彼らは、ますます自動化される社会における文学的想像力の継続的な重要性を強調する。 Public debate about AI is dominated by Frankenstein Syndrome, the fear that AI will become superhuman and escape human control. Although superintelligence is certainly a possibility, the interest it excites can distract the public from a more imminent concern: the rise of Artificial Stupidity (AS). This article discusses the roots of Frankenstein Syndrome in Mary Shelley's famous novel of 1818. It then provides a philosophical framework for analysing the stupidity of artificial agents, demonstrating that modern intelligent systems can be seen to suffer from 'stupidity of judgement'. Finally it identifies an alternative literary tradition that exposes the perils and benefits of AS. In the writings of Edmund Spenser, Jonathan Swift and E.T.A. Hoffmann, ASs replace, oppress or seduce their human users. More optimistically, Joseph Furphy and Laurence Sterne imagine ASs that can serve human intellect as maps or as pipes. These writers provide a strong counternarrative to the myths that currently drive the AI debate. They identify ways in which even stupid artificial agents can evade human control, for instance by appealing to stereotypes or distancing us from reality. And they underscore the continuing importance of the literary imagination in an increasingly automated society.	翻訳日:2022-11-14 14:37:24 公開日:2020-07-02
# 量子共通原因の存在における因果関係の定量化 Quantifying causal influences in the presence of a quantum common cause ( http://arxiv.org/abs/2007.01221v1 ) ライセンス: Link先を確認	Mariami Gachechiladze, Nikolai Miklin, and Rafael Chaves	(参考訳) 量子力学は自然の因果関係に対する直感に挑戦する。ライヒェンバッハの共通原因原理や地域実在論など、いくつかの基本的な概念を再考する必要がある。伝統的に、これはベルの不等式違反によって目撃される。しかし、ベルの不等式は、量子相関と因果理論の不整合性の唯一の符号か? この質問に動機づけられた一般の枠組みでは、2つの変数間の因果関係を、介入を必要とせず、また、共通の原因の古典的、量子的、あるいはポスト量子的性質を無視せずに推定することができる。特に、ベルの不等式を破ることが不可能な、最も単純な機器のシナリオを考えることで、全ての純粋な二部交絡状態が因果関係の古典的境界に反し、提案された問題に否定的に答え、量子論における因果関係の役割を探求する新たな場を開くことを示す。 Quantum mechanics challenges our intuition on the cause-effect relations in nature. Some fundamental concepts, including Reichenbach's common cause principle or the notion of local realism, have to be reconsidered. Traditionally, this is witnessed by the violation of a Bell inequality. But are Bell inequalities the only signature of the incompatibility between quantum correlations and causality theory? Motivated by this question we introduce a general framework able to estimate causal influences between two variables, without the need of interventions and irrespectively of the classical, quantum, or even post-quantum nature of a common cause. In particular, by considering the simplest instrumental scenario -- for which violation of Bell inequalities is not possible -- we show that every pure bipartite entangled state violates the classical bounds on causal influence, thus answering in negative to the posed question and opening a new venue to explore the role of causality within quantum theory.	翻訳日:2022-11-14 14:37:02 公開日:2020-07-02
# 時間領域におけるデータ拡張を用いた音声表現のコントラスト学習 Data Augmenting Contrastive Learning of Speech Representations in the Time Domain ( http://arxiv.org/abs/2007.00991v1 ) ライセンス: Link先を確認	Eugene Kharitonov and Morgane Rivière and Gabriel Synnaeve and Lior Wolf and Pierre-Emmanuel Mazaré and Matthijs Douze and Emmanuel Dupoux	(参考訳) 過去のセグメントに基づいて音声の将来のセグメントを予測するコントラスト予測符号化(CPC)は,音声信号の表現学習のための強力なアルゴリズムとして台頭している。しかし、教師なし評価ベンチマークでは、依然として他の手法よりも性能が劣る。ここでは、時間領域データ拡張ライブラリであるWavAugmentを紹介し、過去のセグメントに拡張を適用する方が一般的に効率的であり、他の方法よりも優れた性能をもたらすことを示す。ピッチ修正, 付加雑音, 残響の組合せにより, CPCの性能が大幅に向上し(相対的改善率18-22%), 600分の1のデータで基準となるLibri-lightの結果を上回ることがわかった。ドメイン外データセットを使用することで、時間領域データ拡張は、CPCをZero Speech Benchmark 2017の最先端技術と同等にすることができる。また,時間領域データ拡張は,下流の限定的教師あり音素分類タスクを一貫して相対的に12～15%改善することを示す。 Contrastive Predictive Coding (CPC), based on predicting future segments of speech based on past segments is emerging as a powerful algorithm for representation learning of speech signal. However, it still under-performs other methods on unsupervised evaluation benchmarks. Here, we introduce WavAugment, a time-domain data augmentation library and find that applying augmentation in the past is generally more efficient and yields better performances than other methods. We find that a combination of pitch modification, additive noise and reverberation substantially increase the performance of CPC (relative improvement of 18-22%), beating the reference Libri-light results with 600 times less data. Using an out-of-domain dataset, time-domain data augmentation can push CPC to be on par with the state of the art on the Zero Speech Benchmark 2017. We also show that time-domain data augmentation consistently improves downstream limited-supervision phoneme classification tasks by a factor of 12-15% relative.	翻訳日:2022-11-14 14:36:45 公開日:2020-07-02
# MRIスライス束からの胎児脳分節の非確実性ガイドによるインタラクティブリファインメント Uncertainty-Guided Efficient Interactive Refinement of Fetal Brain Segmentation from Stacks of MRI Slices ( http://arxiv.org/abs/2007.00833v1 ) ライセンス: Link先を確認	Guotai Wang, Michael Aertsen, Jan Deprest, Sebastien Ourselin, Tom Vercauteren, Shaoting Zhang	(参考訳) 動作補正と高分解能ボリューム再構成には, 胎児脳の運動崩壊した胎児MRIスライスからの分離が重要である。畳み込みニューラルネットワーク(CNN)は胎児の脳の自動分割に広く用いられているが、これらの結果はいまだに困難なスライスのためにインタラクティブな洗練の恩恵を受けている。インタラクティブリファインメントプロセスの効率を向上させるために,不確実性誘導型インタラクティブリファインメント(ugir)フレームワークを提案する。まず、グループ化された畳み込みに基づくCNNを提案し、単一の前方通過における不確実性推定を伴う複数の自動セグメンテーション予測を得る。また,最初のセグメンテーションとユーザインタラクションから洗練された結果を得るための新しい対話型レベル集合法を提案する。実験の結果,(1)提案するcnnは不確かさをリアルタイムで推定し,(2)インタラクティブなレベルセットは精度向上に効果的かつ効率的であり,(3)ugirはユーザインタラクションのガイドに不確実性を用いることで,効率の約30%向上した精度向上結果が得られることがわかった。私たちのコードはオンラインで入手できる。 Segmentation of the fetal brain from stacks of motion-corrupted fetal MRI slices is important for motion correction and high-resolution volume reconstruction. Although Convolutional Neural Networks (CNNs) have been widely used for automatic segmentation of the fetal brain, their results may still benefit from interactive refinement for challenging slices. To improve the efficiency of interactive refinement process, we propose an Uncertainty-Guided Interactive Refinement (UGIR) framework. We first propose a grouped convolution-based CNN to obtain multiple automatic segmentation predictions with uncertainty estimation in a single forward pass, then guide the user to provide interactions only in a subset of slices with the highest uncertainty. A novel interactive level set method is also proposed to obtain a refined result given the initial segmentation and user interactions. 
Experimental results show that: (1) our proposed CNN obtains uncertainty estimation in real time which correlates well with mis-segmentations, (2) the proposed interactive level set is effective and efficient for refinement, (3) UGIR obtains accurate refinement results with around 30% improvement of efficiency by using uncertainty to guide user interactions. Our code is available online.	翻訳日:2022-11-14 14:36:27 公開日:2020-07-02
# 非負行列分解に基づく画像解析 Image Analysis Based on Nonnegative/Binary Matrix Factorization ( http://arxiv.org/abs/2007.00889v1 ) ライセンス: Link先を確認	Hinako Asaoka and Kazue Kudo	(参考訳) 非負行列分解(NBMF)を用いることで、行列を非負行列と二項行列に分解することができる。 NBMFとFujitsu Digital Annealerを用いた顔画像の解析により,画像再構成と画像分類に成功している。 NBMFアルゴリズムは、非負行列分解(NMF)の収束に必要なものよりも少ないイテレーションで収束するが、どちらの手法も画像分類において比較可能である。 Using nonnegative/binary matrix factorization (NBMF), a matrix can be decomposed into a nonnegative matrix and a binary matrix. Our analysis of facial images, based on NBMF and using the Fujitsu Digital Annealer, leads to successful image reconstruction and image classification. The NBMF algorithm converges in fewer iterations than those required for the convergence of nonnegative matrix factorization (NMF), although both techniques perform comparably in image classification.	翻訳日:2022-11-14 14:35:33 公開日:2020-07-02
# 視床上腫瘍型分類のためのスペクトル-空間再帰的ネットワーク Spectral-Spatial Recurrent-Convolutional Networks for In-Vivo Hyperspectral Tumor Type Classification ( http://arxiv.org/abs/2007.01042v1 ) ライセンス: Link先を確認	Marcel Bengs, Nils Gessert, Wiebke Laffers, Dennis Eggert, Stephan Westermann, Nina A. Mueller, Andreas O. H. Gerstner, Christian Betz, Alexander Schlaefer	(参考訳) 癌組織の早期発見は長期生存に不可欠である。頭頸部領域では、典型的な診断は内視鏡的介入であり、医療専門家がRGBカメラ画像を用いて組織を手動で評価する。健康領域と腫瘍領域は一般に区別しやすいが、良性腫瘍と悪性腫瘍の鑑別は非常に困難である。診断には侵襲的生検と組織学的診断が必要である。また,腫瘍切除時には組織学的に腫瘍マージンを確認する必要がある。不要な組織切除を避けるため、非侵襲的な画像ベースの診断ツールは非常に有用である。近年, 深層学習と組み合わせたハイパースペクトルイメージングが提案され, 前生検体に対して有望な結果を示した。本研究では,高スペクトル画像と深層学習を用いた生体内腫瘍型分類の可能性を示す。我々は、従来のRGB画像と比較して、複数の超スペクトル帯域を用いることの価値を分析し、追加のスペクトル情報を利用する機械学習モデルの能力について検討する。本研究は,スペクトル集約と空間特徴学習のための再帰畳み込みモデルを用いたスペクトル処理と空間処理について考察する。我々の最良のモデルは76.3%のaucを達成し、これまでの従来および深層学習法を大きく上回っている。 Early detection of cancerous tissue is crucial for long-term patient survival. In the head and neck region, a typical diagnostic procedure is an endoscopic intervention where a medical expert manually assesses tissue using RGB camera images. While healthy and tumor regions are generally easier to distinguish, differentiating benign and malignant tumors is very challenging. This requires an invasive biopsy, followed by histological evaluation for diagnosis. Also, during tumor resection, tumor margins need to be verified by histological analysis. To avoid unnecessary tissue resection, a non-invasive, image-based diagnostic tool would be very valuable. Recently, hyperspectral imaging paired with deep learning has been proposed for this task, demonstrating promising results on ex-vivo specimens. In this work, we demonstrate the feasibility of in-vivo tumor type classification using hyperspectral imaging and deep learning. We analyze the value of using multiple hyperspectral bands compared to conventional RGB images and we study several machine learning models' ability to make use of the additional spectral information. 
Based on our insights, we address spectral and spatial processing using recurrent-convolutional models for effective spectral aggregating and spatial feature learning. Our best model achieves an AUC of 76.3%, significantly outperforming previous conventional and deep learning methods.	翻訳日:2022-11-14 14:35:24 公開日:2020-07-02
# OCTボリュームにおける物体位置推定のための4次元時空間畳み込みネットワーク 4D Spatio-Temporal Convolutional Networks for Object Position Estimation in OCT Volumes ( http://arxiv.org/abs/2007.01044v1 ) ライセンス: Link先を確認	Marcel Bengs, Nils Gessert, Alexander Schlaefer	(参考訳) オブジェクトの追跡とローカライズは、コンピュータ支援手術における中心的な問題である。光コヒーレンストモグラフィ(OCT)は、空間分解能と時間分解能が高いため、光学追跡システムとして用いられる。近年,3次元畳み込みニューラルネットワーク(CNN)は,一体積CT画像を用いたマーカーオブジェクトのポーズ推定に有望な性能を示した。このアプローチは空間情報にのみ依存するが、OCTはOCT画像ボリュームの時間的ストリームを高ボリュームレートで取得することを可能にする。本研究では、3次元CNNを4次元時空間CNNに体系的に拡張し、マーカーオブジェクト追跡に対する追加の時間情報の影響を評価する。その結果, OCTボリュームのストリームと4次元時空間畳み込みを用いた場合, 3次元CNNを用いた単一ボリューム処理と比較して平均絶対誤差が30%低いことがわかった。 Tracking and localizing objects is a central problem in computer-assisted surgery. Optical coherence tomography (OCT) can be employed as an optical tracking system, due to its high spatial and temporal resolution. Recently, 3D convolutional neural networks (CNNs) have shown promising performance for pose estimation of a marker object using single volumetric OCT images. While this approach relied on spatial information only, OCT allows for a temporal stream of OCT image volumes capturing the motion of an object at high volumes rates. In this work, we systematically extend 3D CNNs to 4D spatio-temporal CNNs to evaluate the impact of additional temporal information for marker object tracking. Across various architectures, our results demonstrate that using a stream of OCT volumes and employing 4D spatio-temporal convolutions leads to a 30% lower mean absolute error compared to single volume processing with 3D CNNs.	翻訳日:2022-11-14 14:35:06 公開日:2020-07-02
# 食事療法のための人体取得・可視化・測定のためのRGB-Dベースのフレームワーク RGB-D-based Framework to Acquire, Visualize and Measure the Human Body for Dietetic Treatments ( http://arxiv.org/abs/2007.00981v1 ) ライセンス: Link先を確認	Andrés Fuster-Guilló, Jorge Azorín-López, Marcelo Saval-Calvo, Juan Miguel Castillo-Zaragoza, Nahuel Garcia-DUrso, Robert B Fisher	(参考訳) 本研究の目的は,最先端のrgb-dセンサと仮想現実(vr)技術を用いた栄養栄養療法の改善である。近年の研究では、マルチメディア技術を用いて治療への付着性を改善することができる。しかし、この目的のために3dデータとvr技術を用いた研究はほとんどない。一方,食事療法中の患者の身体の3次元計測と経時的分析(4d)は困難である。この研究の主な貢献は、肥満治療に対する4次元体モデル可視化の効果を研究するための枠組みを提供することである。このシステムは、低コスト技術を用いて身体の完全な3dモデルを得ることができ、将来の簡単な移動を十分な精度と現実的な可視化で可能とし、肥満治療中の形状の進化(4d)の分析を可能にする。この3dボディモデルは、2dおよびvrデバイスを用いた肥満治療における可視化の効果を調べるために使用される。さらに,得られた3Dモデルを用いて体の測定を行う。合成オブジェクトと実オブジェクトの両方で測定値を得るための提案手法の精度について検討した。 This research aims to improve dietetic-nutritional treatment using state-of-the-art RGB-D sensors and virtual reality (VR) technology. Recent studies show that adherence to treatment can be improved using multimedia technologies. However, there are few studies using 3D data and VR technologies for this purpose. On the other hand, obtaining 3D measurements of the human body and analyzing them over time (4D) in patients undergoing dietary treatment is a challenging field. The main contribution of the work is to provide a framework to study the effect of 4D body model visualization on adherence to obesity treatment. The system can obtain a complete 3D model of a body using low-cost technology, allowing future straightforward transference with sufficient accuracy and realistic visualization, enabling the analysis of the evolution (4D) of the shape during the treatment of obesity. The 3D body models will be used for studying the effect of visualization on adherence to obesity treatment using 2D and VR devices. Moreover, we will use the acquired 3D models to obtain measurements of the body. 
An analysis of the accuracy of the proposed methods for obtaining measurements with both synthetic and real objects has been carried out.	翻訳日:2022-11-14 14:29:24 公開日:2020-07-02
# リアルタイム人間-ロボットインタラクションのための注意指向行動認識 Attention-Oriented Action Recognition for Real-Time Human-Robot Interaction ( http://arxiv.org/abs/2007.01065v1 ) ライセンス: Link先を確認	Ziyang Song, Ziyi Yin, Zejian Yuan, Chong Zhang, Wanchao Chi, Yonggen Ling, Shenghao Zhang	(参考訳) 行動認識タスクにおける顕著な進歩にもかかわらず、人間とロボットの相互作用に特化した行動認識では、多くの作業が行われていない。本稿では,インタラクションシナリオにおける行動認識タスクの特徴を深く検討し,リアルタイムインタラクションの必要性を満たすための注意指向マルチレベルネットワークフレームワークを提案する。具体的には、まず低解像度でシーン内のインタラクタに大まかに焦点を合わせ、高分解能で微細なポーズ推定を行うプリアテンションネットワークを用いる。他のコンパクトcnnは、抽出された骨格配列をアクション認識の入力として受け取り、局所空間-時間パターンとグローバル意味情報を効果的に捉えるための注意のようなメカニズムを利用する。このアプローチを評価するために,インタラクションシナリオにおける認識タスク用に,新たなアクションデータセットを構築した。モバイルコンピューティングプラットフォーム(Nvidia Jetson AGX Xavier)上でのデータセットと高効率(112fps/640 x 480 RGBD)の実験結果から,実時間人間ロボットインタラクションにおける動作認識に優れた適用性を示した。 Despite the notable progress made in action recognition tasks, not much work has been done in action recognition specifically for human-robot interaction. In this paper, we deeply explore the characteristics of the action recognition task in interaction scenarios and propose an attention-oriented multi-level network framework to meet the need for real-time interaction. Specifically, a Pre-Attention network is employed to roughly focus on the interactor in the scene at low resolution firstly and then perform fine-grained pose estimation at high resolution. The other compact CNN receives the extracted skeleton sequence as input for action recognition, utilizing attention-like mechanisms to capture local spatial-temporal patterns and global semantic information effectively. To evaluate our approach, we construct a new action dataset specially for the recognition task in interaction scenarios. Experimental results on our dataset and high efficiency (112 fps at 640 x 480 RGBD) on the mobile computing platform (Nvidia Jetson AGX Xavier) demonstrate excellent applicability of our method on action recognition in real-time human-robot interaction.	翻訳日:2022-11-14 14:28:50 公開日:2020-07-02
# 実行長圧縮文書を非圧縮で自動ページ分割 Automatic Page Segmentation Without Decompressing the Run-Length Compressed Text Documents ( http://arxiv.org/abs/2007.01142v1 ) ライセンス: Link先を確認	Mohammed Javed and P. Nagabhushan	(参考訳) ページ分割は複雑なレイアウトを持つ文書の自動分析において重要な段階であると考えられている。これは伝統的に圧縮されていない文書で行われてきたが、実際の文書のほとんどは保存と転送を効率よくすることを要求する圧縮形式で存在する。しかし,圧縮の段階を経ることなく,圧縮文書内で直接ページ分割を行うことは難しい課題である。本研究では,ccitt group-3圧縮テキスト文書のラン長データに直接ページ分割操作を行う可能性を示す。そのため、テキスト文書を列に分割する前に、各欄を段落に、各段落をテキスト行に、各行を単語に分割し、最後に各単語を文字に分割して、テキスト文書の事前処理を行う必要がある。プリプロセッシングステージは、通常のテキスト領域と反転したテキスト領域を識別し、反転したテキスト領域を通常のモードにトグルする。カラム分離開始の続編では,空白空間の漸進的同化の新たな戦略が垂直方向に実行され,関連するパラメータの自己推定が提案されている。これらのパラメータを用いた列セグメンテーションを実現する手法を考案した。次に、次に示すのは2段階の水平行分離プロセスで、各列を段落に分割し、次にテキストラインに分割する。そして、単語と文字の分離を完了させる2段階の縦列分離プロセスが存在する。 Page segmentation is considered to be the crucial stage for the automatic analysis of documents with complex layouts. This has traditionally been carried out in uncompressed documents, although most of the documents in real life exist in a compressed form warranted by the requirement to make storage and transfer efficient. However, carrying out page segmentation directly in compressed documents without going through the stage of decompression is a challenging goal. This research paper proposes demonstrating the possibility of carrying out a page segmentation operation directly in the run-length data of the CCITT Group-3 compressed text document, which could be single- or multi-columned and might even have some text regions in the inverted text color mode. Therefore, before carrying out the segmentation of the text document into columns, each column into paragraphs, each paragraph into text lines, each line into words, and, finally, each word into characters, a pre-processing of the text document needs to be carried out. The pre-processing stage identifies the normal text regions and inverted text regions, and the inverted text regions are toggled to the normal mode. 
In the sequel to initiate column separation, a new strategy of incremental assimilation of white space runs in the vertical direction and the auto-estimation of certain related parameters is proposed. A procedure to realize column-segmentation employing these extracted parameters has been devised. Subsequently, what follows first is a two-level horizontal row separation process, which segments every column into paragraphs, and in turn, into text-lines. Then, there is a two-level vertical column separation process, which completes the separation into words and characters.	翻訳日:2022-11-14 14:28:33 公開日:2020-07-02
# 階層型ニューラルネットワークを用いた低出力物体カウント Low-Power Object Counting with Hierarchical Neural Networks ( http://arxiv.org/abs/2007.01369v1 ) ライセンス: Link先を確認	Abhinav Goel, Caleb Tung, Sara Aghajanzadeh, Isha Ghodgaonkar, Shreya Ghosh, George K. Thiruvathukal, Yung-Hsiang Lu	(参考訳) ディープニューラルネットワーク(DNN)は、オブジェクトカウントなどの多くのコンピュータビジョンタスクにおいて最先端の精度を達成することができる。オブジェクトカウントはイメージとオブジェクトクエリの2つの入力を受け取り、クエリされたオブジェクトの発生回数を報告する。このようなタスクで高い精度を達成するために、dnnは数十億のオペレーションを必要とし、リソース制約のある低消費電力デバイスへのデプロイを困難にしている。以前の研究は、かなりの数のDNN操作が冗長であり、精度に影響を与えることなく排除できることを示している。これらの冗長性を低減するため,オブジェクトカウントのための階層型DNNアーキテクチャを提案する。このアーキテクチャは、RPN(Rerea Proposal Network)を使用して、クエリ対象を含む可能性のあるRerea-of-interest(RoI)を提案する。階層型分類器は、実際にクエリされたオブジェクトを含むRoIsを効率的に見つける。階層構造は視覚的に類似した対象カテゴリのグループを含む。階層の各ノードで小さなDNNを使用して、これらのグループを分類する。 RoIは階層分類器によって漸進的に処理される。 RoI のオブジェクトがクエリ対象と同じグループであれば、階層内の次の DNN は RoI をさらに処理し、そうでなければ RoI は破棄される。各画像を処理するためにいくつかの小さなdnnを使用することで、既存のオブジェクトカウンタと比較して、メモリ要求、推論時間、エネルギー消費、操作数を無視できる精度損失で削減できる。 Deep Neural Networks (DNNs) can achieve state-of-the-art accuracy in many computer vision tasks, such as object counting. Object counting takes two inputs: an image and an object query and reports the number of occurrences of the queried object. To achieve high accuracy on such tasks, DNNs require billions of operations, making them difficult to deploy on resource-constrained, low-power devices. Prior work shows that a significant number of DNN operations are redundant and can be eliminated without affecting the accuracy. To reduce these redundancies, we propose a hierarchical DNN architecture for object counting. This architecture uses a Region Proposal Network (RPN) to propose regions-of-interest (RoIs) that may contain the queried objects. A hierarchical classifier then efficiently finds the RoIs that actually contain the queried objects. The hierarchy contains groups of visually similar object categories. Small DNNs are used at each node of the hierarchy to classify between these groups. 
The RoIs are incrementally processed by the hierarchical classifier. If the object in an RoI is in the same group as the queried object, then the next DNN in the hierarchy processes the RoI further; otherwise, the RoI is discarded. By using a few small DNNs to process each image, this method reduces the memory requirement, inference time, energy consumption, and number of operations with negligible accuracy loss when compared with the existing object counters.	翻訳日:2022-11-14 14:26:44 公開日:2020-07-02
# D-NetPAD:説明可能で解釈可能なアイリス提示攻撃検出器 D-NetPAD: An Explainable and Interpretable Iris Presentation Attack Detector ( http://arxiv.org/abs/2007.01381v1 ) ライセンス: Link先を確認	Renu Sharma and Arun Ross	(参考訳) 虹彩認識システムは、相手が印刷された目、プラスチックの目、化粧品のコンタクトレンズなどの人工物を提示してシステムを回避する、提示攻撃(PA)に対して脆弱である。本研究では、DenseNet畳み込みニューラルネットワークアーキテクチャに基づくD-NetPADと呼ばれる有効で堅牢なアイリスPA検出器を提案する。 PAアーティファクト、センサー、データセット間の一般化性を示す。プロプライエタリデータセットと公開データセット(livdet-2017)で実施した実験では,提案手法の有効性が検証された。提案手法は,プロプライエタリなデータセットでは0.2%の誤検出率で98.58%の真の検出率を示し,LivDet-2017データセットでは最先端の手法よりも優れていた。 D-NetPADの性能を説明するため、t-SNEプロットとGrad-CAMを用いて中間特徴分布と固定熱マップを可視化する。さらに,ネットワークによって抽出される特徴の性質を説明するために周波数解析を行う。ソースコードとトレーニングされたモデルはhttps://github.com/iPRoBe-lab/D-NetPADで入手できる。 An iris recognition system is vulnerable to presentation attacks, or PAs, where an adversary presents artifacts such as printed eyes, plastic eyes, or cosmetic contact lenses to circumvent the system. In this work, we propose an effective and robust iris PA detector called D-NetPAD based on the DenseNet convolutional neural network architecture. It demonstrates generalizability across PA artifacts, sensors and datasets. Experiments conducted on a proprietary dataset and a publicly available dataset (LivDet-2017) substantiate the effectiveness of the proposed method for iris PA detection. The proposed method results in a true detection rate of 98.58% at a false detection rate of 0.2% on the proprietary dataset and outperforms state-of-the-art methods on the LivDet-2017 dataset. We visualize intermediate feature distributions and fixation heatmaps using t-SNE plots and Grad-CAM, respectively, in order to explain the performance of D-NetPAD. Further, we conduct a frequency analysis to explain the nature of features being extracted by the network. The source code and trained model are available at https://github.com/iPRoBe-lab/D-NetPAD.	翻訳日:2022-11-14 14:26:23 公開日:2020-07-02
# ラテン文字で書かれた南アジアの言語処理:Dakshinaデータセット Processing South Asian Languages Written in the Latin Script: the Dakshina Dataset ( http://arxiv.org/abs/2007.01176v1 ) ライセンス: Link先を確認	Brian Roark, Lawrence Wolf-Sonkin, Christo Kirov, Sabrina J. Mielke, Cibu Johny, Isin Demirsahin, Keith Hall	(参考訳) 本稿では,南アジア12言語を対象に,ラテン文字とネイティブ文字の両方のテキストからなる新しい資料であるdakshinaデータセットについて述べる。データセットは、各言語について、以下を含む。 1) 原本ウィキペディアテキスト 2) romanization lexicon,及び 3) 言語のネイティブスクリプトと基本ラテン文字の両方で、全文の並列データを生成する。各言語でwikipediaテキストの作成と選択に使用される方法、サンプルされた辞書に対する検証済みのローマ字化の収集、ネイティブスクリプトコレクションからの保持された文の手動ローマ字化を文書化する。さらに、単一単語の文字化、全文の文字化、ネイティブスクリプトとロマン化テキストの言語モデリングなど、データセットで可能ないくつかのタスクのベースライン結果も提供する。キーワード:ロマン化、翻訳、南アジア諸語 This paper describes the Dakshina dataset, a new resource consisting of text in both the Latin and native scripts for 12 South Asian languages. The dataset includes, for each language: 1) native script Wikipedia text; 2) a romanization lexicon; and 3) full sentence parallel data in both a native script of the language and the basic Latin alphabet. We document the methods used for preparation and selection of the Wikipedia text in each language; collection of attested romanizations for sampled lexicons; and manual romanization of held-out sentences from the native script collections. We additionally provide baseline results on several tasks made possible by the dataset, including single word transliteration, full sentence transliteration, and language modeling of native script and romanized text. Keywords: romanization, transliteration, South Asian languages	翻訳日:2022-11-14 14:20:40 公開日:2020-07-02
# テキスト分類のための新しいBGCapsule Network A Novel BGCapsule Network for Text Classification ( http://arxiv.org/abs/2007.04302v1 ) ライセンス: Link先を確認	Akhilesh Kumar Gangwar and Vadlamani Ravi	(参考訳) 感情分析、ニュース分類、複数ラベル分類、意見分類といったテキスト分類タスクは、現代のディープラーニングネットワークにおいても難しい問題である。近年,画像分類にはCapsule Networks (CapsNets) が提案されている。 CapsNets は Convolutional Neural Networks (CNNs) に対していくつかの利点があるが、テキスト領域での妥当性は調査されていない。本稿では,複数のテキスト分類タスクにおいて,双方向ゲート型再帰ユニット(bigru)のアンサンブルが先行するカプセルモデルであるbgcapsuleを提案する。主カプセル層に先行する特徴抽出層に対して,両方向GRUのアンサンブルを用いた。このハイブリッドアーキテクチャは、基本的な前処理ステップを実行した後、グラブに基づく埋め込み層、bigruベースのアンサンブル層、プライマリカプセル層、フラット層、完全に接続されたrelu層、そして完全に接続されたsoftmax層からなる。 bgcapsuleの有効性を評価するために,映画レビュー(mr imdb 2005),ag newsデータセット,dbpedia ontologyデータセット,yelp review full dataset,yelp review polarityデータセットを含む5つのベンチマークデータセット(10,000レコードから70万レコード)について,広範な実験を行った。これらのベンチマークは、ニュース分類、感情分析、マルチクラス分類、マルチラベル分類、意見分類など、いくつかのテキスト分類タスクをカバーする。提案するアーキテクチャ(bgcapsule)は,ポジティブ感情キーワードやネガティブ感情キーワードなどの外部言語知識を必要とせず,既存の手法と比較して精度が向上した。さらに、BGCapsuleは他の既存の技術よりも早く収束した。 Several text classification tasks such as sentiment analysis, news categorization, multi-label classification and opinion classification are challenging problems even for modern deep learning networks. Recently, Capsule Networks (CapsNets) are proposed for image classification. It has been shown that CapsNets have several advantages over Convolutional Neural Networks (CNNs), while their validity in the domain of text has been less explored. In this paper, we propose a novel hybrid architecture viz., BGCapsule, which is a Capsule model preceded by an ensemble of Bidirectional Gated Recurrent Units (BiGRU) for several text classification tasks. We employed an ensemble of Bidirectional GRUs for feature extraction layer preceding the primary capsule layer. 
The hybrid architecture, after performing basic pre-processing steps, consists of five layers: an embedding layer based on GloVe, a BiGRU based ensemble layer, a primary capsule layer, a flatten layer and fully connected ReLU layer followed by a fully connected softmax layer. In order to evaluate the effectiveness of BGCapsule, we conducted extensive experiments on five benchmark datasets (ranging from 10,000 records to 700,000 records) including Movie Review (MR Imdb 2005), AG News dataset, Dbpedia ontology dataset, Yelp Review Full dataset and Yelp review polarity dataset. These benchmarks cover several text classification tasks such as news categorization, sentiment analysis, multiclass classification, multi-label classification and opinion classification. We found that our proposed architecture (BGCapsule) achieves better accuracy compared to the existing methods without the help of any external linguistic knowledge such as positive sentiment keywords and negative sentiment keywords. Further, BGCapsule converged faster compared to other extant techniques.	翻訳日:2022-11-14 14:19:55 公開日:2020-07-02
# 移動背景に対する翻訳対象方向のデコードのためのショウジョウバエ運動視覚経路のモデル化 Modelling Drosophila Motion Vision Pathways for Decoding the Direction of Translating Objects Against Cluttered Moving Backgrounds ( http://arxiv.org/abs/2007.00886v1 ) ライセンス: Link先を確認	Qinbing Fu and Shigang Yue	(参考訳) 乱雑な動きの背景の前でオブジェクトを翻訳する方向を正しくかつ効率的にデコードすることは依然として難しい問題である。自然界において、軽量で低出力の飛行昆虫は、飛行中に高度に変動する環境において移動目標を検出するために運動視覚を適用し、運動知覚戦略を学ぶのに優れたパラダイムである。本稿では, ショウジョウバエの運動視覚経路を調査し, 最先端の生理学的研究に基づく計算モデルを提案する。提案する視覚系モデルでは,生物工学的ON・OFF経路,広視野水平感度(HS),垂直感度(VS)システムなどが特徴である。本研究の主な貢献は2つの側面である。 1) 本モデルでは, 方向選択性(DS) と方向応答性(DO) の両方の反応を, フィードフォワード方式で, 運動知覚神経回路の主特性として明らかにした。 2) 運動前フィルタリング機構とON経路およびOFF経路内の局所相関器のアンサンブルの組み合わせを含む時空間力学のモデル化により, 乱れの進行する背景の物体の翻訳に対して頑健な方向選択性を示し, 背景運動や乱れを効果的に抑制し, 動的応答を改善する。従って、対象の翻訳方向は、好ましい方向(PD)またはヌル方向(ND)翻訳を示す正または負の出力を持つHSおよびVSシステムのグローバル応答として復号される。実験では,提案したニューラルネットワークモデルの有効性を検証し,より高速な移動,高コントラスト,大規模ターゲットへの応答性を示す。 Decoding the direction of translating objects in front of cluttered moving backgrounds, accurately and efficiently, is still a challenging problem. In nature, lightweight and low-powered flying insects apply motion vision to detect a moving target in highly variable environments during flight, which are excellent paradigms to learn motion perception strategies. This paper investigates the fruit fly Drosophila motion vision pathways and presents computational modelling based on cutting-edge physiological research. The proposed visual system model features bio-plausible ON and OFF pathways, wide-field horizontal-sensitive (HS) and vertical-sensitive (VS) systems. 
The main contributions of this research are on two aspects: 1) the proposed model articulates the forming of both direction-selective (DS) and direction-opponent (DO) responses, revealed as principal features of motion perception neural circuits, in a feed-forward manner; 2) it also shows robust direction selectivity to translating objects in front of cluttered moving backgrounds, via the modelling of spatiotemporal dynamics including combination of motion pre-filtering mechanisms and ensembles of local correlators inside both the ON and OFF pathways, which works effectively to suppress irrelevant background motion or distractors, and to improve the dynamic response. Accordingly, the direction of translating objects is decoded as global responses of both the HS and VS systems with positive or negative output indicating preferred-direction (PD) or null-direction (ND) translation. The experiments have verified the effectiveness of the proposed neural system model, and demonstrated its responsive preference to faster-moving, higher-contrast and larger-size targets embedded in cluttered moving backgrounds.	翻訳日:2022-11-14 14:19:26 公開日:2020-07-02
# ビデオから道路のレイアウトを理解する Understanding Road Layout from Videos as a Whole ( http://arxiv.org/abs/2007.00822v1 ) ライセンス: Link先を確認	Buyu Liu, Bingbing Zhuang, Samuel Schulter, Pan Ji, Manmohan Chandraker	(参考訳) 本稿では,複雑な道路シーンのレイアウトをビデオシーケンスから推定する問題に対処する。この目的のために,道路属性予測問題として定式化し,その目的は各フレームの属性を正確かつ一貫して予測することである。先行研究とは対照的に,映像中のカメラの動きを活用すること,文脈的手がかりを含めること,長期的映像情報を取り入れることの3つの新しい側面を生かした。具体的には,ビデオの予測一貫性を強制するモデルを提案する。我々のモデルは1つのLSTMと1つの特徴変換モジュール(FTM)から構成される。前者は隠された状態との一貫性の制約を暗黙的に含み、後者はビデオに沿って情報を集約する際にカメラの動きを明示的に考慮する。さらに,道路参加者,例えばオブジェクトをモデルに組み込むことにより,文脈情報を組み込むことを提案する。ビデオシーケンス全体が利用可能になると、私たちのモデルは、例えば過去と将来のフレームからの情報など、ローカルとグローバルの両方の手がかりをエンコードすることもできます。2つのデータセットでの実験から、次のことが示された。 1) グローバルまたは文脈的手がかりのいずれかを組み込むことで、予測精度が向上し、両方の活用が最高のパフォーマンスをもたらす。 2) LSTMおよびFTMモジュールの導入により,ビデオの予測一貫性が向上する。 (3)提案手法はSOTAよりも大きなマージンで優れている。 In this paper, we address the problem of inferring the layout of complex road scenes from video sequences. To this end, we formulate it as a top-view road attributes prediction problem and our goal is to predict these attributes for each frame both accurately and consistently. In contrast to prior work, we exploit the following three novel aspects: leveraging camera motions in videos, including context cues, and incorporating long-term video information. Specifically, we introduce a model that aims to enforce prediction consistency in videos. Our model consists of one LSTM and one Feature Transform Module (FTM). The former implicitly incorporates the consistency constraint with its hidden states, and the latter explicitly takes the camera motion into consideration when aggregating information along videos. Moreover, we propose to incorporate context information by introducing road participants, e.g. objects, into our model. When the entire video sequence is available, our model is also able to encode both local and global cues, e.g. information from both past and future frames. 
Experiments on two data sets show that: (1) Incorporating either global or contextual cues improves the prediction accuracy and leveraging both gives the best performance. (2) Introducing the LSTM and FTM modules improves the prediction consistency in videos. (3) The proposed method outperforms the SOTA by a large margin.	翻訳日:2022-11-14 14:18:25 公開日:2020-07-02
# ACFD:非対称カートゥーン顔検出器 ACFD: Asymmetric Cartoon Face Detector ( http://arxiv.org/abs/2007.00899v1 ) ライセンス: Link先を確認	Bin Zhang, Jian Li, Yabiao Wang, Zhipeng Cui, Yili Xia, Chengjie Wang, Jilin Li, Feiyue Huang	(参考訳) カートゥーンの顔検出は、難しいシナリオが多いため、人間の顔検出よりも難しい作業である。本稿では, 顔内における大きな違いなど, マンガの顔の特徴に着目し, ACFD という非対称なマンガの顔検出手法を提案する。具体的には、いくつかの非対称な単発アグリゲーションモジュール(AOSA)、非対称な双方向特徴ピラミッドネットワーク(ABi-FPN)、動的アンカーマッチング戦略(DAM)、対応するマージン二分分類損失(MBC)からなる新しいバックボーンVoVNetV3である。特に、多様な受容場を持つ特徴を生成するために、VoVNetV3によりマルチスケールピラミッドの特徴を抽出し、いくつかの極端なポーズで顔を扱うためにABi-FPNによって同時に融合・拡張され、異なるアスペクト比を有する。さらに、DAMは顔ごとに十分な高品質のアンカーに適合し、MBCは差別の強い力である。これらのモジュールの有効性により,acfdは,モデルサイズ200mb,イメージあたりの推論時間50ms,トレーニング済みモデルなしで,2020 icartoon face challengeの検出トラックで第1位を達成した。 Cartoon face detection is a more challenging task than human face detection because many difficult scenarios are involved. Aiming at the characteristics of cartoon faces, such as huge differences within the intra-faces, in this paper, we propose an asymmetric cartoon face detector, named ACFD. Specifically, it consists of the following modules: a novel backbone VoVNetV3 comprised of several asymmetric one-shot aggregation modules (AOSA), asymmetric bi-directional feature pyramid network (ABi-FPN), dynamic anchor match strategy (DAM) and the corresponding margin binary classification loss (MBC). In particular, to generate features with diverse receptive fields, multi-scale pyramid features are extracted by VoVNetV3, and then fused and enhanced simultaneously by ABi-FPN to handle faces that are in extreme poses or have disparate aspect ratios. Besides, DAM is used to match enough high-quality anchors for each face, and MBC provides strong discriminative power. With the effectiveness of these modules, our ACFD achieves the 1st place on the detection track of 2020 iCartoon Face Challenge under the constraints of model size 200MB, inference time 50ms per image, and without any pretrained models.	翻訳日:2022-11-14 14:18:05 公開日:2020-07-02
# マルチソースドメイン適応におけるソース選択のためのカリキュラムマネージャ Curriculum Manager for Source Selection in Multi-Source Domain Adaptation ( http://arxiv.org/abs/2007.01261v1 ) ライセンス: Link先を確認	Luyu Yang, Yogesh Balaji, Ser-Nam Lim, Abhinav Shrivastava	(参考訳) マルチソース非教師付きドメイン適応の性能は、ラベル付きソースドメインサンプルからの転送の有効性に大きく依存する。本稿では,資源選択のためのカリキュラムマネージャ (CMSS) と呼ばれる,資源サンプルの動的カリキュラムを学習する逆エージェントを提案する。独立したネットワークモジュールであるCurriculum Managerは、トレーニング中のカリキュラムを常に更新し、どのドメインやサンプルがターゲットに合わせるのに適しているかを反復的に学習する。この背景にある直感は、Curriculum Managerが遅延ドメインの転送可能性を常に再測定し、ドメイン識別器のエラー率を逆向きに上昇させることである。 CMSSはドメインラベルの知識を一切必要としないが、よく知られた4つのベンチマークの他の手法よりもかなり優れている。また,提案手法に光を当てた解釈可能な結果も提供する。 The performance of Multi-Source Unsupervised Domain Adaptation depends significantly on the effectiveness of transfer from labeled source domain samples. In this paper, we proposed an adversarial agent that learns a dynamic curriculum for source samples, called Curriculum Manager for Source Selection (CMSS). The Curriculum Manager, an independent network module, constantly updates the curriculum during training, and iteratively learns which domains or samples are best suited for aligning to the target. The intuition behind this is to force the Curriculum Manager to constantly re-measure the transferability of latent domains over time to adversarially raise the error rate of the domain discriminator. CMSS does not require any knowledge of the domain labels, yet it outperforms other methods on four well-known benchmarks by significant margins. We also provide interpretable results that shed light on the proposed method.	翻訳日:2022-11-14 14:11:19 公開日:2020-07-02
# DATE:MPSoCを使用可能なDVFSにおける高温サイドチャネル攻撃に対する防御 DATE: Defense Against TEmperature Side-Channel Attacks in DVFS Enabled MPSoCs ( http://arxiv.org/abs/2007.01377v1 ) ライセンス: Link先を確認	Somdip Dey, Amit Kumar Singh, Xiaohang Wang, and Klaus Dieter McDonald-Maier	(参考訳) 組込みデバイスを日常的に利用することの絶え間ない増加を考えると、サイドチャネルはそのようなシステムにおける情報フロー制御とセキュリティの課題である。そのような重要なセキュリティ欠陥の1つは、温度側チャネル攻撃によって悪用され、そこでは、セキュリティ欠陥を推測するために、処理要素からの放熱と伝播が時間とともに観測される。提案手法であるdate: defense against temperature side-channel attackでは,温度側チャネル攻撃に対してより安全なシステムを実現するために,空間的および時間的温度勾配を低減し,同時にデバイスの寿命の信頼性を高める新しい手法を提案する。本稿では,コンピュータシステムに対する温度側チャネル攻撃に対するセキュリティを定量化できる新しい指標であるサーマル・セキュリティ・イン・マルチ・プロシーサ(TSMP)を導入し,DATEは最先端のアプリケーションに対して最大139.24%の安全性を示し,温度サイクルを67.42%削減した。 Given the constant rise in utilizing embedded devices in daily life, side channels remain a challenge to information flow control and security in such systems. One such important security flaw could be exploited through temperature side-channel attacks, where heat dissipation and propagation from the processing elements are observed over time in order to deduce security flaws. In our proposed methodology, DATE: Defense Against TEmperature side-channel attacks, we propose a novel approach of reducing spatial and temporal thermal gradient, which makes the system more secure against temperature side-channel attacks, and at the same time increases the reliability of the device in terms of lifespan. In this paper, we have also introduced a new metric, Thermal-Security-in-Multi-Processors (TSMP), which is capable of quantifying the security against temperature side-channel attacks on computing systems, and DATE is evaluated to be 139.24% more secure at the most for certain applications than the state-of-the-art, while reducing thermal cycle by 67.42% at the most.	翻訳日:2022-11-14 14:11:04 公開日:2020-07-02
# 畳み込みニューラルネットワークを用いたX線画像上のCOVID-19症例の自動検出 Automatic Detection of COVID-19 Cases on X-ray images Using Convolutional Neural Networks ( http://arxiv.org/abs/2007.05494v1 ) ライセンス: Link先を確認	Lucas P. Soares and Cesar P. Soares	(参考訳) ここ数カ月、世界は新型コロナウイルスの急速な進歩に驚いている。この病気に直面し、社会経済的影響を最小限に抑えるためには、監視と治療に加えて、診断が重要な手順である。しかし、この実現には遅れや実験室への限られたアクセスが妨げられ、ケーストリアージを行うための新たな戦略が要求される。このシナリオでは、胸部x線およびct画像に基づく診断プロセスを支援するオプションとして、ディープラーニングモデルが提案されている。そこで本研究では,深層学習による畳み込みニューラルネットワーク(cnn)を用いて,胸部画像から新型コロナウイルスの検出プロセスを自動化することを目的とした。この結果は、covid-19の他の種類の検出方法へのアクセスを拡大し、この病気を識別するプロセスをスピードアップに寄与する可能性がある。使用するすべてのデータベース、ビルドされたコード、およびモデルのトレーニングから得られた結果は、オープンアクセスで利用できる。この行動は、結果の改善に寄与し、その結果、新型コロナウイルスに直面する進歩に寄与するため、他の研究者によるこれらのモデルの強化への関与を促進する。 In recent months the world has been surprised by the rapid advance of COVID-19. In order to face this disease and minimize its socio-economic impacts, in addition to surveillance and treatment, diagnosis is a crucial procedure. However, the realization of this is hampered by the delay and the limited access to laboratory tests, demanding new strategies to carry out case triage. In this scenario, deep learning models are being proposed as a possible option to assist the diagnostic process based on chest X-ray and computed tomography images. Therefore, this research aims to automate the process of detecting COVID-19 cases from chest images, using convolutional neural networks (CNN) through deep learning techniques. The results can contribute to expand access to other forms of detection of COVID-19 and to speed up the process of identifying this disease. All databases used, the codes built, and the results obtained from the models' training are available for open access. This action facilitates the involvement of other researchers in enhancing these models since this can contribute to the improvement of results and, consequently, the progress in confronting COVID-19.	翻訳日:2022-11-14 14:10:04 公開日:2020-07-02
# ウェアラブル呼吸モニタリング:コンテキストとセンサバイオマーカーによる解釈可能な推論 Wearable Respiration Monitoring: Interpretable Inference with Context and Sensor Biomarkers ( http://arxiv.org/abs/2007.01413v1 ) ライセンス: Link先を確認	Ridwan Alam, David B. Peden, and John C. Lach	(参考訳) 呼吸速度(br)、微小換気(ve)、その他の呼吸パラメータは、喘息などの多くの急性疾患の患者をリアルタイムにモニターするのに必須である。呼吸測定のための臨床標準、すなわちスピロメトリは、継続的な使用には適さない。ウェアラブルは心電図や運動といった多くの生理的信号を追跡できるが、呼吸はできない。他のモダリティからの呼吸は活発な研究の領域となっている。本研究では,ウェアラブル心電図と手首運動信号から呼吸パラメータを推定する。本研究では,文脈条件付き推論モデル学習において,物理活動などの利用可能なコンテキスト情報を利用するモジュール型で一般化可能な分類回帰パイプラインを提案する。これらのモデルで使用するウェアラブルecgから形態素およびパワー領域の新しい特徴を抽出する。このパイプラインには探索的特徴選択法が組み込まれ、アプリケーション固有の解釈可能なバイオマーカーを発見する。 15項目のデータを用いて,提案したパイプラインの2つの実装(BRとVE)を評価する。各実装は、一般化線形モデル、ランダムフォレスト、サポートベクトルマシン、ガウス過程回帰、および近傍成分分析を文脈回帰モデルとして比較する。置換、正則化、関連性決定法は、ECGの特徴をランク付けし、モデルや活動間で堅牢なECGバイオマーカーを特定するために用いられる。この研究は、連続監視だけでなく、バイオマーカーによる予防対策の設計においてもウェアラブルセンサーの可能性を示している。 Breathing rate (BR), minute ventilation (VE), and other respiratory parameters are essential for real-time patient monitoring in many acute health conditions, such as asthma. The clinical standard for measuring respiration, namely Spirometry, is hardly suitable for continuous use. Wearables can track many physiological signals, like ECG and motion, yet not respiration. Deriving respiration from other modalities has become an area of active research. In this work, we infer respiratory parameters from wearable ECG and wrist motion signals. We propose a modular and generalizable classification-regression pipeline to utilize available context information, such as physical activity, in learning context-conditioned inference models. Morphological and power domain novel features from the wearable ECG are extracted to use with these models. Exploratory feature selection methods are incorporated in this pipeline to discover application-specific interpretable biomarkers. Using data from 15 subjects, we evaluate two implementations of the proposed pipeline: for inferring BR and VE. 
Each implementation compares generalized linear model, random forest, support vector machine, Gaussian process regression, and neighborhood component analysis as contextual regression models. Permutation, regularization, and relevance determination methods are used to rank the ECG features to identify robust ECG biomarkers across models and activities. This work demonstrates the potential of wearable sensors not only in continuous monitoring, but also in designing biomarker-driven preventive measures.	翻訳日:2022-11-14 14:09:09 公開日:2020-07-02
# 事実に基づくテキスト編集 Fact-based Text Editing ( http://arxiv.org/abs/2007.00916v1 ) ライセンス: Link先を確認	Hayate Iso, Chao Qiao, Hang Li	(参考訳) 本稿では,知識ベース(例えば,いくつかの三重項)における事実をよりよく記述するために,与えられた文書を改訂することを目的とした,新しいテキスト編集タスクである \textit{fact-based text editing} を提案する。真実を反映することはテキスト編集において一般的な要件であるため、このタスクは実用上重要である。まず、各インスタンスがドラフトテキスト、改訂テキスト、およびトリプルで表現された複数の事実からなる、事実ベースのテキスト編集の研究のためのデータセットを自動生成する手法を提案する。この手法を2つの公開テーブルツーテキストデータセットに適用し,それぞれ233kインスタンスと37kインスタンスからなる2つの新しいデータセットを得る。次に,バッファ,ストリーム,メモリを用いて与えられた事実を参照してドラフトテキストを編集する,事実ベースのテキスト編集のための新たなニューラルネットワークアーキテクチャ, \textsc{FactEditor}を提案する。この問題に対処するための簡単なアプローチは、エンコーダ-デコーダモデルを採用することである。この2つのデータセットの実験結果から, \textsc{FactEditor} は忠実度と流暢性の点でエンコーダ-デコーダのアプローチを上回ることが示された。結果はまた、\textsc{FactEditor} がエンコーダ-デコーダのアプローチよりも高速に推論を行うことを示している。 We propose a novel text editing task, referred to as \textit{fact-based text editing}, in which the goal is to revise a given document to better describe the facts in a knowledge base (e.g., several triples). The task is important in practice because reflecting the truth is a common requirement in text editing. First, we propose a method for automatically generating a dataset for research on fact-based text editing, where each instance consists of a draft text, a revised text, and several facts represented in triples. We apply the method into two public table-to-text datasets, obtaining two new datasets consisting of 233k and 37k instances, respectively. Next, we propose a new neural network architecture for fact-based text editing, called \textsc{FactEditor}, which edits a draft text by referring to given facts using a buffer, a stream, and a memory. A straightforward approach to address the problem would be to employ an encoder-decoder model. Our experimental results on the two datasets show that \textsc{FactEditor} outperforms the encoder-decoder approach in terms of fidelity and fluency. The results also show that \textsc{FactEditor} conducts inference faster than the encoder-decoder approach.	翻訳日:2022-11-14 14:08:49 公開日:2020-07-02
# IIE-NLP-NUT at SemEval-2020 Task 4: Guiding PLM with Prompt Template Reconstruction Strategy for ComVE IIE-NLP-NUT at SemEval-2020 Task 4: Guiding PLM with Prompt Template Reconstruction Strategy for ComVE ( http://arxiv.org/abs/2007.00924v1 ) ライセンス: Link先を確認	Luxi Xing, Yuqiang Xie, Yue Hu, Wei Peng	(参考訳) 本稿では,SemEval Task4: Commonsense Validation and Explanationの最初の2つのサブタスクについて紹介する。評価の意図を明確にし,選択のためのコントラスト情報を注入するために,プロンプトテンプレートを用いた入力再構成戦略を提案する。具体的には、サブタスクをマルチタスク質問応答形式に形式化し、プロンプトテンプレートで入力を構築し、サブタスクの結果として質問応答の最終的な予測を検討する。実験の結果,本手法はベースラインシステムと比較して高い性能を示した。最初の2つのサブタスクの2つの公式テストセットにおいて、96.4の精度と94.3の精度で第3位を確保した。 This paper introduces our systems for the first two subtasks of SemEval Task4: Commonsense Validation and Explanation. To clarify the intention for judgment and inject contrastive information for selection, we propose the input reconstruction strategy with prompt templates. Specifically, we formalize the subtasks into the multiple-choice question answering format and construct the input with the prompt templates, then, the final prediction of question answering is considered as the result of subtasks. Experimental results show that our approaches achieve significant performance compared with the baseline systems. Our approaches secure the third rank on both official test sets of the first two subtasks with an accuracy of 96.4 and an accuracy of 94.3 respectively.	翻訳日:2022-11-14 14:08:27 公開日:2020-07-02
# Project PIAF: ネイティブなフランス語質問回答データセットの構築 Project PIAF: Building a Native French Question-Answering Dataset ( http://arxiv.org/abs/2007.00968v1 ) ライセンス: Link先を確認	Rachel Keraron, Guillaume Lancrenon, Mathilde Bras, Frédéric Allary, Gilles Moyse, Thomas Scialom, Edmundo-Pavel Soriano-Morales, Jacopo Staiano	(参考訳) 非英語言語のデータの欠如,特に質問応答などの下流タスクの評価に動機づけられ,フランス語母語質問応答データセットを収集するための参加的取り組みを提案する。さらに,得られたデータと予備ベースラインとともに,収集作業用に開発したアノテーションツールについて記述し,公開する。 Motivated by the lack of data for non-English languages, in particular for the evaluation of downstream tasks such as Question Answering, we present a participatory effort to collect a native French Question Answering Dataset. Furthermore, we describe and publicly release the annotation tool developed for our collection effort, along with the data obtained and preliminary baselines.	翻訳日:2022-11-14 14:08:13 公開日:2020-07-02
# サロゲートスケッチとスケールド正規化による分散二階最適化のデバイアス化 Debiasing Distributed Second Order Optimization with Surrogate Sketching and Scaled Regularization ( http://arxiv.org/abs/2007.01327v1 ) ライセンス: Link先を確認	Michał Dereziński, Burak Bartan, Mert Pilanci and Michael W. Mahoney	(参考訳) 分散第2次最適化において、標準的な戦略は、データの小さなスケッチやバッチに基づいて、多くの局所的な見積もりを平均化することである。しかし、各マシンの局所的な推定値は、すべてのデータに対する完全な解と比較して偏りがあり、平均化の有効性を制限できる。本稿では,分散2次手法の収束率を理論的にも経験的にも改善し,局所的な推定値の偏りを解消する新しい手法を提案する。本手法は,(1)サロゲートスケッチと呼ぶものを得るための標準スケッチ技法の修正,(2)局所計算のためのグローバル正規化パラメータの注意深くスケーリングすること,の2つの新しい特徴を有する。我々のサロゲートスケッチは行列式点過程に基づいており、逆 Hessian の推定値のバイアスを正確に計算できる分布の族である。この計算に基づいて、最小化された対象が$l_2$-レギュラライズされたパラメータ$\lambda$で、個々のマシンがそれぞれサイズが$m$のスケッチを与えられたとき、バイアスを取り除くために、局所的な推定は$\lambda^{\prime}=\lambda\cdot(1-\frac{d_{\lambda}}{m})$で与えられる縮小された正規化パラメータを用いて計算されるべきであることを示した。ここで$d_{\lambda}$はヘッセ行列(二次問題の場合はデータ行列)の$\lambda$-有効次元である。 In distributed second order optimization, a standard strategy is to average many local estimates, each of which is based on a small sketch or batch of the data. However, the local estimates on each machine are typically biased, relative to the full solution on all of the data, and this can limit the effectiveness of averaging. Here, we introduce a new technique for debiasing the local estimates, which leads to both theoretical and empirical improvements in the convergence rate of distributed second order methods. Our technique has two novel components: (1) modifying standard sketching techniques to obtain what we call a surrogate sketch; and (2) carefully scaling the global regularization parameter for local computations. Our surrogate sketches are based on determinantal point processes, a family of distributions for which the bias of an estimate of the inverse Hessian can be computed exactly. 
Based on this computation, we show that when the objective being minimized is $l_2$-regularized with parameter $\lambda$ and individual machines are each given a sketch of size $m$, then to eliminate the bias, local estimates should be computed using a shrunk regularization parameter given by $\lambda^{\prime}=\lambda\cdot(1-\frac{d_{\lambda}}{m})$, where $d_{\lambda}$ is the $\lambda$-effective dimension of the Hessian (or, for quadratic problems, the data matrix).	翻訳日:2022-11-14 14:02:25 公開日:2020-07-02
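The shrunk regularization parameter in the abstract above can be illustrated with a small sketch. It assumes the standard definition of the $\lambda$-effective dimension, $d_{\lambda}=\sum_i \sigma_i/(\sigma_i+\lambda)$ over the Hessian eigenvalues $\sigma_i$, which the abstract does not spell out. The function names and the example values are hypothetical, and this is not the authors' implementation.

```python
def effective_dimension(hessian_eigenvalues, lam):
    """lambda-effective dimension, ASSUMING d_lambda = sum_i s_i / (s_i + lam)."""
    if lam <= 0:
        raise ValueError("lam must be positive")
    return sum(s / (s + lam) for s in hessian_eigenvalues)


def debiased_regularizer(lam, d_lam, sketch_size):
    """Local regularizer lambda' = lambda * (1 - d_lambda / m), as in the abstract."""
    # Guard added for this sketch (an assumption): keep lambda' strictly positive.
    if sketch_size <= d_lam:
        raise ValueError("sketch size m must exceed the effective dimension")
    return lam * (1.0 - d_lam / sketch_size)


# Hypothetical example: four unit eigenvalues, lambda = 1, sketch size m = 8.
d_lam = effective_dimension([1.0, 1.0, 1.0, 1.0], lam=1.0)
lam_local = debiased_regularizer(1.0, d_lam, sketch_size=8)
```

In this toy setting each eigenvalue contributes one half to the effective dimension, so every machine would use a regularizer somewhat smaller than the global $\lambda$.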
# グローバル・ランドスケープ・オブ・ニューラル・ネットワーク:概要 The Global Landscape of Neural Networks: An Overview ( http://arxiv.org/abs/2007.01429v1 ) ライセンス: Link先を確認	Ruoyu Sun, Dawei Li, Shiyu Liang, Tian Ding, R Srikant	(参考訳) ニューラルネットワークトレーニングにおける大きな懸念の1つは、関連する損失関数の非凸性が景観不良を引き起こす可能性があることである。最近のニューラルネットワークの成功は、その損失の状況がそれほど悪くはないことを示唆しているが、その状況についてどのような具体的な結果が得られているのだろうか? 本稿では,ニューラルネットワークのグローバルな展望に関する最近の知見と結果について概説する。まず、広いニューラルネットワークは特定の仮定の下で最適な局所最小値を持つ可能性があることを指摘した。第二に、"悪い盆地がない"などの広帯域ネットワークの幾何学的特性に関する厳密な結果と、最適化された局所最小値や無限小への経路を除去するいくつかの修正について論じる。第3に,実用ニューラルネットの景観の可視化と経験的探索について考察する。最後に,いくつかの収束結果と景観結果との関係について概説する。 One of the major concerns for neural network training is that the non-convexity of the associated loss functions may cause bad landscape. The recent success of neural networks suggests that their loss landscape is not too bad, but what specific results do we know about the landscape? In this article, we review recent findings and results on the global landscape of neural networks. First, we point out that wide neural nets may have sub-optimal local minima under certain assumptions. Second, we discuss a few rigorous results on the geometric properties of wide networks such as "no bad basin", and some modifications that eliminate sub-optimal local minima and/or decreasing paths to infinity. Third, we discuss visualization and empirical explorations of the landscape for practical neural nets. Finally, we briefly discuss some convergence results and their relation to landscape results.	翻訳日:2022-11-14 14:00:58 公開日:2020-07-02
# 多値量子論理を用いた深層学習における解釈可能性問題への取り組み Addressing the interpretability problem for deep learning using many valued quantum logic ( http://arxiv.org/abs/2007.01819v1 ) ライセンス: Link先を確認	Swapnil Nitin Shah	(参考訳) 深層学習モデルは様々な産業や科学的応用に広く利用されている。これらのモデルは近年でかなりの成功を収めてきたが、機械学習コミュニティにおけるそのようなシステムによる決定の背後にある理論的根拠の欠如がある。この解釈可能性の問題は、そのようなモデルの複雑さの増加によってさらに悪化する。本稿では,機械学習,量子計算,量子場理論といった概念を用いて,畳み込み型深層信念ネットワークと呼ばれる生成型深層学習モデルにおいて,量子論理系が自然にどのように出現するかを実証する。計算効率を損なうことなく、多くの価値ある量子論理系の解釈可能性を備えたディープラーニングモデルを構築するための堅牢な理論的枠組みを提供する。 Deep learning models are widely used for various industrial and scientific applications. Even though these models have achieved considerable success in recent years, there exists a lack of understanding of the rationale behind decisions made by such systems in the machine learning community. This problem of interpretability is further aggravated by the increasing complexity of such models. This paper utilizes concepts from machine learning, quantum computation and quantum field theory to demonstrate how a many valued quantum logic system naturally arises in a specific class of generative deep learning models called Convolutional Deep Belief Networks. It provides a robust theoretical framework for constructing deep learning models equipped with the interpretability of many valued quantum logic systems without compromising their computing efficiency.	翻訳日:2022-11-14 14:00:10 公開日:2020-07-02
# ウォーターアセット管理のための予測分析:機械学習と生存分析 Predictive Analytics for Water Asset Management: Machine Learning and Survival Analysis ( http://arxiv.org/abs/2007.03744v1 ) ライセンス: Link先を確認	Maryam Rahbaralam, David Modesto, Jaume Cardús, Amir Abdollahi, and Fernando M Cucchietti	(参考訳) 水資源管理の鍵となるのは, 水道管網のライフサイクルを通しての性能と優先資源の整備である。この重要なネットワークの改修は、一般的にパイプへの物理的アクセスの困難さや不可能さによって妨げられている。本研究では,水管故障予測のための統計的および機械学習フレームワークについて検討する。我々は,短期的予測と生存率分析のために古典的・現代的分類器を用い,より広い視点と長期予測を提供する。これらのモデルを豊かにするために,水分布領域の知識に基づく新しい予測器を導入し,近年のオーバーサンプリング手法を用いて,毎年観測される少数の障害から生じる高い不均衡を解消する。ケーススタディでは,スペイン・バルセロナの配水ネットワーク内の全管の故障記録を含むデータセットを用いて検討を行った。その結果, 管形状, 年齢, 材質, 土壌被覆など, 重要なリスク因子の影響が明らかとなり, 実用管理職がよりインフォームドな予測保守作業を行うのに役立つことがわかった。 Understanding performance and prioritizing resources for the maintenance of the drinking-water pipe network throughout its life-cycle is a key part of water asset management. Renovation of this vital network is generally hindered by the difficulty or impossibility to gain physical access to the pipes. We study a statistical and machine learning framework for the prediction of water pipe failures. We employ classical and modern classifiers for a short-term prediction and survival analysis to provide a broader perspective and long-term forecast, usually needed for the economic analysis of the renovation. To enrich these models, we introduce new predictors based on water distribution domain knowledge and employ a modern oversampling technique to remedy the high imbalance coming from the few failures observed each year. For our case study, we use a dataset containing the failure records of all pipes within the water distribution network in Barcelona, Spain. The results shed light on the effect of important risk factors, such as pipe geometry, age, material, and soil cover, among others, and can help utility managers conduct more informed predictive maintenance tasks.	翻訳日:2022-11-14 13:59:56 公開日:2020-07-02
# 深層学習を用いた衛星画像における鉱山および尾鉱ダムの検出 Mining and Tailings Dam Detection In Satellite Imagery Using Deep Learning ( http://arxiv.org/abs/2007.01076v1 ) ライセンス: Link先を確認	Remis Balaniuk and Olga Isupova and Steven Reece	(参考訳) この研究は、無料のクラウドコンピューティング、フリーのオープンソースソフトウェア、そしてディープラーニング手法の組み合わせを用いて、ブラジル全土における露天掘り鉱山と尾鉱ダムの自動識別と分類という、実際の大規模問題を分析する。公式に登録された鉱山やダムの場所はブラジル政府のオープンデータ資源から取得された。 Google Earth Engineプラットフォームで取得、処理されたMultispectral Sentinel-2衛星画像は、TensorFlow 2 APIとGoogle Colabプラットフォームを使用して、ディープニューラルネットワークのトレーニングとテストに使用された。完全畳み込みニューラルネットワークは、ブラジル領土の広い地域で未登録の鉱山や尾鉱ダムを探索するために、革新的な方法で使用された。このアプローチの有効性は、公式な採掘権を持たない263の鉱山の発見によって実証される。この探索的な研究は、社会的影響の高い低コストのデータサイエンスツールを構築するために、無料で利用できる一連の新しい技術の可能性を強調している。同時に、特に発展途上国において、人口と環境に高いリスクをもたらす違法な鉱業と尾鉱ダムの増殖という複雑で深刻な問題に対する現実的な解決策を議論し、提案する。コードは https://github.com/remis/mining-discovery-with-deep-learning で公開されている。 This work explores the combination of free cloud computing, free open-source software, and deep learning methods to analyse a real, large-scale problem: the automatic country-wide identification and classification of surface mines and mining tailings dams in Brazil. Locations of officially registered mines and dams were obtained from the Brazilian government open data resource. Multispectral Sentinel-2 satellite imagery, obtained and processed at the Google Earth Engine platform, was used to train and test deep neural networks using the TensorFlow 2 API and Google Colab platform. Fully Convolutional Neural Networks were used in an innovative way, to search for unregistered ore mines and tailing dams in large areas of the Brazilian territory. The efficacy of the approach is demonstrated by the discovery of 263 mines that do not have an official mining concession. This exploratory work highlights the potential of a set of new technologies, freely available, for the construction of low cost data science tools that have high social impact. 
At the same time, it discusses and seeks to suggest practical solutions for the complex and serious problem of illegal mining and the proliferation of tailings dams, which pose high risks to the population and the environment, especially in developing countries. Code is made publicly available at: https://github.com/remis/mining-discovery-with-deep-learning.	翻訳日:2022-11-14 13:59:38 公開日:2020-07-02
# 遺伝性疾患の予後予測のための半教師付きジェネレーショナル・アドバーサリーネットワーク A Semi-Supervised Generative Adversarial Network for Prediction of Genetic Disease Outcomes ( http://arxiv.org/abs/2007.01200v1 ) ライセンス: Link先を確認	Caio Davi and Ulisses Braga-Neto	(参考訳) ほとんどの病気にとって、ラベル付き遺伝データの大規模なデータベースの構築は費用と時間を要する作業である。この問題を解決するために、革新的なGANアーキテクチャに基づく半教師付きアプローチであるgGAN(genetic Generative Adversarial Networks)を導入し、少量のラベル付きデータと大量のラベルなしデータから始まる大規模な合成遺伝的データセットを作成する。我々の目標は、遺伝的プロファイルだけで、病気の重篤な形態を発達させる新しい個人の傾向を決定することである。提案モデルでは,異なるデータセットと個体群から得られた実際の遺伝データを用いて良好な結果を得た。提案モデルは自己認識可能であり,新たな遺伝的プロファイルがネットワークがトレーニングされたデータと十分に互換性があるかどうかを判定することができる。使用されるコードとデータセットは https://github.com/caio-davi/gGAN で見ることができる。 For most diseases, building large databases of labeled genetic data is an expensive and time-demanding task. To address this, we introduce genetic Generative Adversarial Networks (gGAN), a semi-supervised approach based on an innovative GAN architecture to create large synthetic genetic data sets starting with a small amount of labeled data and a large amount of unlabeled data. Our goal is to determine the propensity of a new individual to develop the severe form of the illness from their genetic profile alone. The proposed model achieved satisfactory results using real genetic data from different datasets and populations, in which the test populations may not have the same genetic profiles. The proposed model is self-aware and capable of determining whether a new genetic profile has enough compatibility with the data on which the network was trained and is thus suitable for prediction. The code and datasets used can be found at https://github.com/caio-davi/gGAN.	翻訳日:2022-11-14 13:51:59 公開日:2020-07-02
# 動的グラフのラプラシアン変化点検出 Laplacian Change Point Detection for Dynamic Graphs ( http://arxiv.org/abs/2007.01229v1 ) ライセンス: Link先を確認	Shenyang Huang, Yasmeen Hitti, Guillaume Rabusseau, Reihaneh Rabbany	(参考訳) 動的グラフと時間グラフは、時間とともにエンティティ間の複雑な関係をモデル化するために使用されるリッチなデータ構造である。特に、時間グラフにおける異常検出は、ネットワークシステムにおける侵入識別、生態系の乱れの検出、アウトブレイクの検出など、多くの現実世界の応用にとって重要である。本稿では,動的グラフにおける変化点検出に焦点をあて,この問題に関連する2つの主な課題に対処する。上記の課題を解決するために,各スナップショットにおけるグラフ構造のラプラシアン行列のスペクトルを用いて低次元埋め込みを求めるLaplacian Anomaly Detection (LAD)を提案する。 LADは2つのスライディングウィンドウを適用することで、短期および長期の依存関係を明示的にモデル化する。合成実験では、LADは最先端の手法よりも優れている。また, 本手法は, uciメッセージネットワーク, 上院共同支援ネットワーク, カナダ法案投票ネットワークの3つの実動的ネットワーク上で評価した。 3つのデータセットすべてにおいて,本手法は重要な実世界の事象に応じて異常な時点をより効果的に識別できることを実証する。 Dynamic and temporal graphs are rich data structures that are used to model complex relationships between entities over time. In particular, anomaly detection in temporal graphs is crucial for many real world applications such as intrusion identification in network systems, detection of ecosystem disturbances and detection of epidemic outbreaks. In this paper, we focus on change point detection in dynamic graphs and address two main challenges associated with this problem: I) how to compare graph snapshots across time, II) how to capture temporal dependencies. To solve the above challenges, we propose Laplacian Anomaly Detection (LAD) which uses the spectrum of the Laplacian matrix of the graph structure at each snapshot to obtain low dimensional embeddings. LAD explicitly models short term and long term dependencies by applying two sliding windows. In synthetic experiments, LAD outperforms the state-of-the-art method. We also evaluate our method on three real dynamic networks: UCI message network, US senate co-sponsorship network and Canadian bill voting network. In all three datasets, we demonstrate that our method can more effectively identify anomalous time points according to significant real world events.	翻訳日:2022-11-14 13:51:41 公開日:2020-07-02
# 地球観測におけるガウス過程の展望 A Perspective on Gaussian Processes for Earth Observation ( http://arxiv.org/abs/2007.01238v1 ) ライセンス: Link先を確認	Gustau Camps-Valls and Dino Sejdinovic and Jakob Runge and Markus Reichstein	(参考訳) 空中・衛星リモートセンシングとその場観測による地球観測(EO)は、地球をモニタリングする上で基本的な役割を果たす。過去10年間で、特に機械学習とガウス過程(GP)は、局所的およびグローバルなスケールで取得した画像から、時間分解された方法で生物地球物理変数を推定する際、顕著な結果を得た。 GPは正確な推定だけでなく、予測のための原理化された不確実性推定も提供し、異なるセンサーや時間的取得から得られるマルチモーダルデータを容易に取り扱えるようにし、物理的知識の導入を可能にし、不確実性定量化とエラー伝播の正式な処理を行う。前向きおよび逆モデリングの進歩にもかかわらず、GPモデルは、この視点の論文で改訂された重要な課題に直面する必要がある。 gpモデルは、信号特性を尊重し、物理の基本法則と一致し、純粋回帰から観測因果推論に移行するデータ駆動物理認識モデルへと進化するべきである。 Earth observation (EO) by airborne and satellite remote sensing and in-situ observations play a fundamental role in monitoring our planet. In the last decade, machine learning and Gaussian processes (GPs) in particular has attained outstanding results in the estimation of bio-geo-physical variables from the acquired images at local and global scales in a time-resolved manner. GPs provide not only accurate estimates but also principled uncertainty estimates for the predictions, can easily accommodate multimodal data coming from different sensors and from multitemporal acquisitions, allow the introduction of physical knowledge, and a formal treatment of uncertainty quantification and error propagation. Despite great advances in forward and inverse modelling, GP models still have to face important challenges that are revised in this perspective paper. GP models should evolve towards data-driven physics-aware models that respect signal characteristics, be consistent with elementary laws of physics, and move from pure regression to observational causal inference.	翻訳日:2022-11-14 13:51:12 公開日:2020-07-02
# ディープスパイクニューラルネットワークを用いたパターン認識のためのプログレッシブタンデム学習 Progressive Tandem Learning for Pattern Recognition with Deep Spiking Neural Networks ( http://arxiv.org/abs/2007.01204v1 ) ライセンス: Link先を確認	Jibin Wu, Chenglin Xu, Daquan Zhou, Haizhou Li, Kay Chen Tan	(参考訳) スパイキングニューラルネットワーク(snn)は、イベント駆動の性質とスパース通信のため、低レイテンシと高い計算効率のために、従来のニューラルネットワーク(anns)よりも明確なアドバンテージを示している。しかし、深層SNNの訓練は簡単ではない。本稿では,深層SNNのプログレッシブタンデム学習と呼ばれる,高速かつ効率的なパターン認識のための新しいANN-to-SNN変換およびレイヤワイズ学習フレームワークを提案する。離散表現空間におけるANNとSNNの等価性を研究することにより、スパイクカウントをフル活用してアナログニューロンの活性化値を近似するプリミティブネットワーク変換法が導入された。プリミティブなネットワーク変換から生じる近似誤差を補うために,適応型トレーニングスケジューラを用いたレイヤワイズ学習手法を導入し,ネットワーク重みを微調整する。プログレッシブタンデム学習フレームワークはまた、トレーニング中に、制限された重量精度やファンイン接続などのハードウェア制約を徐々に課すことができる。これらのSNNは、大規模オブジェクト認識、画像再構成、音声分離タスクにおいて顕著な分類と回帰能力を示し、同時に、他の最先端のSNN実装よりも、推論時間とシナプス操作を極端に削減する必要がある。そのため、限られた電力予算で、モバイルおよび組み込みデバイスに普及する無数の機会を開くことができる。 Spiking neural networks (SNNs) have shown clear advantages over traditional artificial neural networks (ANNs) for low latency and high computational efficiency, due to their event-driven nature and sparse communication. However, the training of deep SNNs is not straightforward. In this paper, we propose a novel ANN-to-SNN conversion and layer-wise learning framework for rapid and efficient pattern recognition, which is referred to as progressive tandem learning of deep SNNs. By studying the equivalence between ANNs and SNNs in the discrete representation space, a primitive network conversion method is introduced that takes full advantage of spike count to approximate the activation value of analog neurons. To compensate for the approximation errors arising from the primitive network conversion, we further introduce a layer-wise learning method with an adaptive training scheduler to fine-tune the network weights. The progressive tandem learning framework also allows hardware constraints, such as limited weight precision and fan-in connections, to be progressively imposed during training. 
The SNNs thus trained have demonstrated remarkable classification and regression capabilities on large-scale object recognition, image reconstruction, and speech separation tasks, while requiring at least an order of magnitude reduced inference time and synaptic operations than other state-of-the-art SNN implementations. It, therefore, opens up a myriad of opportunities for pervasive mobile and embedded devices with a limited power budget.	翻訳日:2022-11-14 13:44:50 公開日:2020-07-02
# 深層強化学習による人間中心協調ロボット Human-centered collaborative robots with deep reinforcement learning ( http://arxiv.org/abs/2007.01009v1 ) ライセンス: Link先を確認	Ali Ghadirzadeh, Xi Chen, Wenjie Yin, Zhengrong Yi, Mårten Björkman and Danica Kragic	(参考訳) 人中心協調システムのための強化学習に基づくフレームワークを提案する。フレームワークは積極的であり、タスク完了に要する時間を最小化することで、タイムリーなアクションの利点と不適切なアクションを取るリスクのバランスを取る。フレームワークは、認識の不確実性と意思決定を統合的に対処する教師なしの方法でエンドツーエンドに学習される。このフレームワークは、教師付き学習を用いて、知覚と意思決定システムが独立して学習される代替品と比較して、パッケージングの例題として、人間とロボットのパートナー間のより流動的な協調を提供する。提案手法の一番の利点は,動きデータの退屈なアノテーションを回避し,学習をオンラインで行うため,新たな人間パートナーやタスクへの迅速な適応を可能にすることである。 We present a reinforcement learning based framework for human-centered collaborative systems. The framework is proactive and balances the benefits of timely actions with the risk of taking improper actions by minimizing the total time spent to complete the task. The framework is learned end-to-end in an unsupervised fashion addressing the perception uncertainties and decision making in an integrated manner. The framework is shown to provide more fluent coordination between human and robot partners on an example task of packaging compared to alternatives for which perception and decision-making systems are learned independently, using supervised learning. The foremost benefit of the proposed approach is that it allows for fast adaptation to new human partners and tasks since tedious annotation of motion data is avoided and the learning is performed on-line.	翻訳日:2022-11-14 13:44:26 公開日:2020-07-02
# エンドツーエンド強化学習のための安全な探索 Verifiably Safe Exploration for End-to-End Reinforcement Learning ( http://arxiv.org/abs/2007.01223v1 ) ライセンス: Link先を確認	Nathan Hunt, Nathan Fulton, Sara Magliacane, Nghia Hoang, Subhro Das, Armando Solar-Lezama	(参考訳) 安全クリティカルな環境での深層強化学習の展開には、探索中に厳しい制約に従うアルゴリズムを開発する必要がある。本稿では,視覚的入力によるエンドツーエンドポリシーの形式的安全性制約の実施に向けた最初のアプローチを提案する。我々のアプローチは、ハイブリッド力学系におけるオブジェクト検出と自動推論の最近の進歩に基づいている。このアプローチは、ハード制約の存在下で安全に探索することの難しさを強調する新しいベンチマークで評価される。本ベンチマークは,安全学習のためのいくつかの問題集合を抽出し,安全制約に適合しない報奨信号などの課題を強調する。これらのベンチマーク問題に対して,本アルゴリズムは安全である限り最適化に競争力を維持しつつ,安全でない動作を完全に回避する。また, 安全制約の実施方法が, もともとの環境からすべての安全政策を守っていることも証明した。 Deploying deep reinforcement learning in safety-critical settings requires developing algorithms that obey hard constraints during exploration. This paper contributes a first approach toward enforcing formal safety constraints on end-to-end policies with visual inputs. Our approach draws on recent advances in object detection and automated reasoning for hybrid dynamical systems. The approach is evaluated on a novel benchmark that emphasizes the challenge of safely exploring in the presence of hard constraints. Our benchmark draws from several proposed problem sets for safe learning and includes problems that emphasize challenges such as reward signals that are not aligned with safety constraints. On each of these benchmark problems, our algorithm completely avoids unsafe behavior while remaining competitive at optimizing for as much reward as is safe. We also prove that our method of enforcing the safety constraints preserves all safe policies from the original environment.	翻訳日:2022-11-14 13:44:13 公開日:2020-07-02
# {\epsilon}-BMC: モデルフリー強化学習におけるepsilon-greedy探索へのベイズアンサンブルアプローチ {\epsilon}-BMC: A Bayesian Ensemble Approach to Epsilon-Greedy Exploration in Model-Free Reinforcement Learning ( http://arxiv.org/abs/2007.00869v1 ) ライセンス: Link先を確認	Michael Gimelfarb, Scott Sanner, Chi-Guhn Lee	(参考訳) 探索-活用トレードオフの解消は、強化学習(RL)アルゴリズムの設計と実装における根本的な問題である。本稿では,その単純さにもかかわらず最も頻繁に使われている探索形式の一つであるepsilon-greedy探索ポリシーを用いたモデルフリーRLに着目する。しかし、このポリシーの重要な制限は$\varepsilon$の仕様である。本稿では、$\varepsilon$をQ値関数の均一性の尺度とみなす新しいベイズ的視点を提供する。この新しい視点に基づき,Bayesian model combination(BMC)に基づく閉形式のベイズモデル更新を導入する。これにより,単調収束の保証のもと,環境からの経験を用いて定数時間で$\varepsilon$を適応させることができる。提案したアルゴリズムである$\varepsilon$-\texttt{BMC} は、異なる問題に対する探索と活用を効率よくバランスさせ、最良に調整された固定アニーリングスケジュールや、文献で提案された別のデータ依存の$\varepsilon$適応手法と同等以上の性能を示す。 Resolving the exploration-exploitation trade-off remains a fundamental problem in the design and implementation of reinforcement learning (RL) algorithms. In this paper, we focus on model-free RL using the epsilon-greedy exploration policy, which despite its simplicity, remains one of the most frequently used forms of exploration. However, a key limitation of this policy is the specification of $\varepsilon$. In this paper, we provide a novel Bayesian perspective of $\varepsilon$ as a measure of the uniformity of the Q-value function. We introduce a closed-form Bayesian model update based on Bayesian model combination (BMC), based on this new perspective, which allows us to adapt $\varepsilon$ using experiences from the environment in constant time with monotone convergence guarantees. We demonstrate that our proposed algorithm, $\varepsilon$-\texttt{BMC}, efficiently balances exploration and exploitation on different problems, performing comparably or outperforming the best tuned fixed annealing schedules and an alternative data-dependent $\varepsilon$ adaptation scheme proposed in the literature.	翻訳日:2022-11-14 13:43:21 publication placeholder
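For readers unfamiliar with the baseline policy the abstract above builds on, the following is a minimal sketch of plain epsilon-greedy action selection with a fixed $\varepsilon$. It only illustrates why $\varepsilon$ has to be specified by hand. It does not implement the Bayesian $\varepsilon$ update of the paper, and the function name and Q-values are hypothetical.

```python
import random


def epsilon_greedy(q_values, epsilon, rng=random):
    """Pick a uniformly random action with probability epsilon, else a greedy one."""
    if not 0.0 <= epsilon <= 1.0:
        raise ValueError("epsilon must lie in [0, 1]")
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))  # explore
    # exploit: index of the largest Q-value (first one on ties)
    return max(range(len(q_values)), key=lambda a: q_values[a])


# Hypothetical Q-values for three actions; epsilon = 0 always exploits.
greedy_action = epsilon_greedy([0.1, 0.9, 0.3], epsilon=0.0)
```

With `epsilon=0.0` the policy never explores, and with `epsilon=1.0` it ignores the Q-values entirely, so any fixed choice in between is a manual compromise.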
# ローカル更新手法における学習率の大幅な重要性について On the Outsized Importance of Learning Rates in Local Update Methods ( http://arxiv.org/abs/2007.00878v1 ) ライセンス: Link先を確認	Zachary Charles, Jakub Konečný	(参考訳) 我々は,多くのフェデレーション学習とメタ学習アルゴリズムを一般化する,局所的な更新手法と呼ばれるアルゴリズム群を研究する。二次目的の場合、局所更新法は、我々が正確に特徴付けるサーロゲート損失関数上で確率的勾配降下を行う。クライアント学習率の選択は、サロゲート損失の条件数と、サロゲート最小化関数と真の損失関数との距離を制御していることを示す。我々はこの理論を用いて、代理損失の条件数と真の損失関数との整合性の間のトレードオフを示すフェデレーション平均化のための新しい収束率を導出する。コミュニケーション制限のある環境では、適切な学習率チューニングが最適に近い行動に達するのに十分であることを実証する。また,学習速度チューニングの必要性を低減し,様々なタスクやデータセットにおける経験的性能を強調する,ローカル更新手法における学習速度の自動減衰の実用的な方法を提案する。 We study a family of algorithms, which we refer to as local update methods, that generalize many federated learning and meta-learning algorithms. We prove that for quadratic objectives, local update methods perform stochastic gradient descent on a surrogate loss function which we exactly characterize. We show that the choice of client learning rate controls the condition number of that surrogate loss, as well as the distance between the minimizers of the surrogate and true loss functions. We use this theory to derive novel convergence rates for federated averaging that showcase this trade-off between the condition number of the surrogate loss and its alignment with the true loss function. We validate our results empirically, showing that in communication-limited settings, proper learning rate tuning is often sufficient to reach near-optimal behavior. We also present a practical method for automatic learning rate decay in local update methods that helps reduce the need for learning rate tuning, and highlight its empirical performance on a variety of tasks and datasets.	翻訳日:2022-11-14 13:42:57 公開日:2020-07-02
# 動的リスク評価のための敵対的事例に対する深層学習防御 Deep Learning Defenses Against Adversarial Examples for Dynamic Risk Assessment ( http://arxiv.org/abs/2007.01017v1 ) ライセンス: Link先を確認	Xabier Echeberria-Barrio, Amaia Gil-Lerchundi, Ines Goicoechea-Telleria and Raul Orduna-Urrutia	(参考訳) Deep Neural Networksが最初に開発されたのは数十年前だが、コンピュータのパワー要件のために広く使われるようになったのは、最近になってからである。それ以来、多くの分野に適用され、広範囲の進歩を遂げてきた。さらに重要なことに、リスク管理が重要な医療手順や自動運転の意思決定など、重要な問題に利用されている。これらの分野における診断や意思決定の誤りは、重大な事故や死につながる可能性がある。この種のモデルを攻撃するのは容易であると繰り返し報告されているため、これは憂慮すべきことである。したがって、これらの攻撃はリスクを評価するために研究されなければならず、モデルをより堅牢にするために防御を開発する必要がある。このために最も広く知られた攻撃が選択され(敵対的攻撃)、それに対するいくつかの防御(敵対的訓練、次元削減、予測類似性)が実装された。得られた結果は、同様の精度を維持しながら、モデルをより堅牢にする。このアイデアは、乳がんデータセットとVGG16と高密度ニューラルネットワークモデルを使用して開発されたが、他の領域からのデータセットや、さまざまな畳み込みおよび高密度ニューラルネットワークモデルに適用することができる。 Deep Neural Networks were first developed decades ago, but it was not until recently that they started being extensively used, due to their computing power requirements. Since then, they are increasingly being applied to many fields and have undergone far-reaching advancements. More importantly, they have been utilized for critical matters, such as making decisions in healthcare procedures or autonomous driving, where risk management is crucial. Any mistakes in the diagnostics or decision-making in these fields could entail grave accidents, and even death. This is preoccupying, because it has been repeatedly reported that it is straightforward to attack this type of models. Thus, these attacks must be studied to be able to assess their risk, and defenses need to be developed to make models more robust. For this work, the most widely known attack was selected (adversarial attack) and several defenses were implemented against it (i.e. adversarial training, dimensionality reduction and prediction similarity). The obtained outcomes make the model more robust while keeping a similar accuracy. 
The idea was developed using a breast cancer dataset and a VGG16 and dense neural network model, but the solutions could be applied to datasets from other areas and different convolutional and dense deep neural network models.	翻訳日:2022-11-14 13:41:53 公開日:2020-07-02
# 損失領域の一般化を目指して In Search of Lost Domain Generalization ( http://arxiv.org/abs/2007.01434v1 ) ライセンス: Link先を確認	Ishaan Gulrajani, David Lopez-Paz	(参考訳) ドメイン一般化アルゴリズムの目標は、トレーニング中に見られるものと異なる分布を適切に予測することである。無数のドメイン一般化アルゴリズムが存在するが、実験条件における不整合(データセット、アーキテクチャ、モデル選択基準)は公正で現実的な比較が難しい。本稿では,ドメイン一般化アルゴリズムが現実的にどのように有用かを理解することに興味がある。最初のステップとして、モデル選択はドメインの一般化タスクにとって自明ではないことに気づく。先行研究とは対照的に、モデル選択戦略のない領域一般化アルゴリズムは不完全とみなすべきである。次に,7つのマルチドメインデータセット,9つのベースラインアルゴリズム,3つのモデル選択基準を含む,ドメイン一般化のためのテストベッドであるdomainbedを実装した。 DomainBedを使って広範な実験を行い、慎重に実装すると、実験的なリスク最小化がすべてのデータセットにおける最先端のパフォーマンスを示す。今後は、DomainBedのリリースと仲間の研究者の協力により、ドメインの一般化における再現性と厳密な研究の合理化を期待する。 The goal of domain generalization algorithms is to predict well on distributions different from those seen during training. While a myriad of domain generalization algorithms exist, inconsistencies in experimental conditions -- datasets, architectures, and model selection criteria -- render fair and realistic comparisons difficult. In this paper, we are interested in understanding how useful domain generalization algorithms are in realistic settings. As a first step, we realize that model selection is non-trivial for domain generalization tasks. Contrary to prior work, we argue that domain generalization algorithms without a model selection strategy should be regarded as incomplete. Next, we implement DomainBed, a testbed for domain generalization including seven multi-domain datasets, nine baseline algorithms, and three model selection criteria. We conduct extensive experiments using DomainBed and find that, when carefully implemented, empirical risk minimization shows state-of-the-art performance across all datasets. Looking forward, we hope that the release of DomainBed, along with contributions from fellow researchers, will streamline reproducible and rigorous research in domain generalization.	翻訳日:2022-11-14 13:35:02 公開日:2020-07-02
# 低光環境ニューラルサーベイランス Low-light Environment Neural Surveillance ( http://arxiv.org/abs/2007.00843v1 ) ライセンス: Link先を確認	Michael Potter (1), Henry Gridley (1), Noah Lichtenstein (1), Kevin Hines (1), John Nguyen (1), Jacob Walsh (1) ((1) Northeastern University)	(参考訳) 低照度環境における実時間犯罪検知のためのエンドツーエンドシステムの設計と実装を行う。反応するクローズド回路テレビとは異なり、低光環境ニューラルサーベイランスはリアルタイムの犯罪警報を提供する。システムは、光学フローネットワーク、空間的および時間的ネットワークによってリアルタイムで処理された低照度ビデオフィードと、射撃、暴行、盗難を識別するためのサポートベクトルマシンを使用する。私たちは、低光度アクション認識データセット、lens-4を作成します。 Amazon Web Services経由で設定されたIoTインフラストラクチャは、アクション認識用のカメラをホストするローカルボードからのメッセージを解釈し、クラウド内の結果を解析してメッセージを中継する。 20FPSで71.5%の精度を達成した。ユーザーインターフェースは、地元の当局が通知を受け取り、犯罪現場のビデオを見ることができるモバイルアプリである。市民は、法執行機関がユーザーの近づきに応じて犯罪警報をプッシュできる公開アプリを持っている。 We design and implement an end-to-end system for real-time crime detection in low-light environments. Unlike Closed-Circuit Television, which performs reactively, the Low-Light Environment Neural Surveillance provides real time crime alerts. The system uses a low-light video feed processed in real-time by an optical-flow network, spatial and temporal networks, and a Support Vector Machine to identify shootings, assaults, and thefts. We create a low-light action-recognition dataset, LENS-4, which will be publicly available. An IoT infrastructure set up via Amazon Web Services interprets messages from the local board hosting the camera for action recognition and parses the results in the cloud to relay messages. The system achieves 71.5% accuracy at 20 FPS. The user interface is a mobile app which allows local authorities to receive notifications and to view a video of the crime scene. Citizens have a public app which enables law enforcement to push crime alerts based on user proximity.	翻訳日:2022-11-14 13:34:43 公開日:2020-07-02
# ポイントクラウド解析における局所集約演算子について A Closer Look at Local Aggregation Operators in Point Cloud Analysis ( http://arxiv.org/abs/2007.01294v1 ) ライセンス: Link先を確認	Ze Liu and Han Hu and Yue Cao and Zheng Zhang and Xin Tong	(参考訳) ポイントクラウド処理のためのネットワークアーキテクチャの最近の進歩は、主にローカルアグリゲーション演算子の新しい設計に支えられている。しかし,これらの演算子がネットワーク性能に与える影響については,各ソリューションのネットワークアーキテクチャや実装の詳細が異なるため,慎重には検討されていない。一方、ほとんどの演算子は浅いアーキテクチャでのみ適用される。本稿では,代表的局所集合演算子を再検討し,その性能を同一の残差アーキテクチャを用いて検討する。これらの演算子の異なる設計にもかかわらず、これらの演算子は、同じネットワーク入力と特徴数の下で、驚くほど類似したコントリビューションを行い、その結果、標準ベンチマークにおける最先端の精度が得られる。この発見は、ポイントクラウド処理のための局所集約演算子の洗練された設計の必要性を再考するきっかけとなった。そこで本研究では,学習可能な重みを持たない単純な局所集約演算子,PosPooling(PosPool)を提案する。特に、ポスプール層を持つ単純なディープ残差ネットワークは、すべてのベンチマークで優れたパフォーマンスを達成し、挑戦的なpartnetデータセットの以前の方法よりも大きなマージン(7.4 miou)で優れている。コードはhttps://github.com/zeliu98/closerlook3dで公開されている。 Recent advances of network architecture for point cloud processing are mainly driven by new designs of local aggregation operators. However, the impact of these operators on network performance has not been carefully investigated, because the overall network architecture and implementation details differ in each solution. Meanwhile, most operators are only applied in shallow architectures. In this paper, we revisit the representative local aggregation operators and study their performance using the same deep residual architecture. Our investigation reveals that despite the different designs of these operators, all of these operators make surprisingly similar contributions to the network performance under the same network input and feature numbers and result in the state-of-the-art accuracy on standard benchmarks. This finding stimulates us to rethink the necessity of sophisticated design of local aggregation operators for point cloud processing. To this end, we propose a simple local aggregation operator without learnable weights, named Position Pooling (PosPool), which performs similarly or slightly better than existing sophisticated operators. 
In particular, a simple deep residual network with PosPool layers achieves outstanding performance on all benchmarks and outperforms the previous state-of-the-art methods on the challenging PartNet datasets by a large margin (7.4 mIoU). The code is publicly available at https://github.com/zeliu98/CloserLook3D	翻訳日:2022-11-14 13:33:50 公開日:2020-07-02
# 公平性を考慮したグラディングビデオインタビュー Grading video interviews with fairness considerations ( http://arxiv.org/abs/2007.05461v1 ) ライセンス: Link先を確認	Abhishek Singhania, Abhishek Unnam and Varun Aggarwal	(参考訳) 顔画像とビデオを用いて人間の感情や特徴を予測することには、かなりの関心が寄せられている。近年、このような研究は、ラベル付けの実践の貧弱さ、決定的でない予測結果、公平性に関する批判にさらされている。質問に対するビデオ応答に基づいて、候補者の社会的スキルを自動的に導き出すための慎重な手法を提案する。われわれは初めて、複数の民族を包含する複数の国のビデオデータを含む。また、ビデオは複数の人種的背景を持つ個人によって評価され、いくつかのベストプラクティスに従って、社会的スキルのコンセンサスと偏見のない測定を実現した。社会的スキルを予測するための2つの機械学習モデルを開発した。最初のモデルは、専門家のガイダンスを使って、もっとも因果的な特徴を使用する。後者はディープラーニングを使用し、データに存在する経験的相関のみに依存する。両モデルの誤差を比較し,モデルの特異性を検証し,推奨する。さらに,モデルの誤りを人種や性別別に検討することで公平性を分析する。候補者の面接結果の予測方法を決定することで,モデルの有用性を検証する。全体としてこの研究は、公平性と倫理的な配慮をしながら、ビデオインタビューのスコアリングに人工知能を使用するための強力なサポートを提供する。 There has been considerable interest in predicting human emotions and traits using facial images and videos. Lately, such work has come under criticism for poor labeling practices, inconclusive prediction results and fairness considerations. We present a careful methodology to automatically derive social skills of candidates based on their video response to interview questions. We, for the first time, include video data from multiple countries encompassing multiple ethnicities. Also, the videos were rated by individuals from multiple racial backgrounds, following several best practices, to achieve a consensus and unbiased measure of social skills. We develop two machine-learning models to predict social skills. The first model employs expert-guidance to use plausibly causal features. The second uses deep learning and depends solely on the empirical correlations present in the data. We compare errors of both these models, study the specificity of the models and make recommendations. We further analyze fairness by studying the errors of models by race and gender. We verify the usefulness of our models by determining how well they predict interview outcomes for candidates. 
Overall, the study provides strong support for using artificial intelligence for video interview scoring, while taking care of fairness and ethical considerations.	翻訳日:2022-11-14 13:33:27 公開日:2020-07-02
# VQAにおけるAI能力予測への説明の影響 The Impact of Explanations on AI Competency Prediction in VQA ( http://arxiv.org/abs/2007.00900v1 ) ライセンス: Link先を確認	Kamran Alipour, Arijit Ray, Xiao Lin, Jurgen P. Schulze, Yi Yao, Giedrius T. Burachas	(参考訳) 説明可能性(Explainability)は、AIシステムの信頼を構築する上で重要な要素のひとつだ。 AIを説明可能にしようとする多くの試みの中で、説明の効果を定量化することは、人間とAIの協調作業を実行する上での課題である。 AIの全体的な振る舞いを予測する能力以外に、多くのアプリケーションでは、タスクドメインのさまざまな側面において、AIエージェントの能力を理解する必要がある。本稿では,視覚的質問応答(VQA)タスクにおけるAIエージェント能力のユーザ精神モデルに対する説明の影響を評価する。実際のシステム性能とユーザランキングの相関関係に基づいて,ユーザの能力に対する理解度を定量化する。本稿では,空間的特徴とオブジェクト的特徴を併用し,BERT言語モデルを用いた説明可能なVQAシステムを提案する。それぞれのグループは、VQAモデルの能力を評価するための説明を1つしか見ていない。提案モデルは,ユーザのコンピテンシー知覚に対する説明の影響を調べるために,主観間実験によって評価される。 2つのVQAモデルの比較では、BERTに基づく説明とオブジェクト機能の使用により、モデルの能力に関するユーザの予測が改善されている。 Explainability is one of the key elements for building trust in AI systems. Among numerous attempts to make AI explainable, quantifying the effect of explanations remains a challenge in conducting human-AI collaborative tasks. Aside from the ability to predict the overall behavior of AI, in many applications, users need to understand an AI agent's competency in different aspects of the task domain. In this paper, we evaluate the impact of explanations on the user's mental model of AI agent competency within the task of visual question answering (VQA). We quantify users' understanding of competency, based on the correlation between the actual system performance and user rankings. We introduce an explainable VQA system that uses spatial and object features and is powered by the BERT language model. Each group of users sees only one kind of explanation to rank the competencies of the VQA model. The proposed model is evaluated through between-subject experiments to probe explanations' impact on the user's perception of competency. The comparison between two VQA models shows BERT based explanations and the use of object features improve the user's prediction of the model's competencies.	翻訳日:2022-11-14 13:33:09 公開日:2020-07-02
# 主成分分析による高次元ベイズ最適化 High Dimensional Bayesian Optimization Assisted by Principal Component Analysis ( http://arxiv.org/abs/2007.00925v1 ) ライセンス: Link先を確認	Elena Raponi, Hao Wang, Mariusz Bujny, Simonetta Boria and Carola Doerr	(参考訳) ベイジアン最適化(英: Bayesian Optimization, BO)は、自動機械学習や設計最適化など、様々な分野に適用された、代理支援のグローバル最適化手法である。いわゆる infill-criterion and gaussian process regression (gpr) に基づいて構築されたbo技法は、探索空間の次元が増加するにつれて計算の複雑さと収束率を阻害する。高次元最適化問題に対するBOのスケールアップは依然として難しい課題である。本稿では,PCAを主成分分析(PCA)と組み合わせることでBOのスケーラビリティに取り組み,新しいPCA-BOアルゴリズムを提案する。具体的には、pca手順は、実行中のすべての評価点から線形変換を学習し、評価点の変動性に応じて変換空間の次元を選択する。次に、GPRモデルを構築し、選択された次元にまたがる空間における補充基準について述べる。我々はCOCOベンチマークフレームワークによるマルチモーダル問題に対する経験的収束率とCPU時間の観点からPCA-BOの性能を評価する。実験の結果,PCA-BOは高次元問題におけるCPU時間を効果的に削減し,適切なグローバル構造を持つ問題に対する収束率を維持することができることがわかった。そのため、PCA-BOは、高次元数値最適化におけるBOアプローチの強みから恩恵を受ける新しい方法を開く収束率と計算効率の間の良好なトレードオフを提供する。 Bayesian Optimization (BO) is a surrogate-assisted global optimization technique that has been successfully applied in various fields, e.g., automated machine learning and design optimization. Built upon a so-called infill-criterion and Gaussian Process regression (GPR), the BO technique suffers from a substantial computational complexity and hampered convergence rate as the dimension of the search spaces increases. Scaling up BO for high-dimensional optimization problems remains a challenging task. In this paper, we propose to tackle the scalability of BO by hybridizing it with a Principal Component Analysis (PCA), resulting in a novel PCA-assisted BO (PCA-BO) algorithm. Specifically, the PCA procedure learns a linear transformation from all the evaluated points during the run and selects dimensions in the transformed space according to the variability of evaluated points. We then construct the GPR model, and the infill-criterion in the space spanned by the selected dimensions. 
We assess the performance of our PCA-BO in terms of the empirical convergence rate and CPU time on multi-modal problems from the COCO benchmark framework. The experimental results show that PCA-BO can effectively reduce the CPU time incurred on high-dimensional problems and maintain the convergence rate on problems with an adequate global structure. PCA-BO therefore provides a satisfactory trade-off between the convergence rate and computational efficiency, opening new ways to benefit from the strength of BO approaches in high-dimensional numerical optimization.	翻訳日:2022-11-14 13:32:54 公開日:2020-07-02
# 確率微分方程式を用いた非一様サンプリング時系列の精度評価 Accurate Characterization of Non-Uniformly Sampled Time Series using Stochastic Differential Equations ( http://arxiv.org/abs/2007.01073v1 ) ライセンス: Link先を確認	Stijn de Waele	(参考訳) 非一様サンプリングは、実験者が調査中のプロセスのサンプリング特性を完全に制御できない場合に発生する。さらに、ベイズ最適化や圧縮センシングなどのアルゴリズムにも意図的に導入されている。確率微分方程式(SDE)は、特にそのような時系列の2次モーメントを特徴づけるのに適している。自己回帰モデルからの漸進的推定と初期化に基づいて,確率の数値最適化のための新しい初期推定手法を提案する。さらに、SDE確率に基づく推定モデルの順序を減少させるために、純粋にデータ駆動方式としてモデルトランケーションを導入する。シミュレーション実験において,新しい推定器によって達成される精度が向上し,非一様サンプル時系列の特徴付けに遭遇する可能性のある課題をすべて網羅した。最後に,実験降雨変動データに新しい推定器を適用する。 Non-uniform sampling arises when an experimenter does not have full control over the sampling characteristics of the process under investigation. Moreover, it is introduced intentionally in algorithms such as Bayesian optimization and compressive sensing. We argue that Stochastic Differential Equations (SDEs) are especially well-suited for characterizing second order moments of such time series. We introduce new initial estimates for the numerical optimization of the likelihood, based on incremental estimation and initialization from autoregressive models. Furthermore, we introduce model truncation as a purely data-driven method to reduce the order of the estimated model based on the SDE likelihood. We show the increased accuracy achieved with the new estimator in simulation experiments, covering all challenging circumstances that may be encountered in characterizing a non-uniformly sampled time series. Finally, we apply the new estimator to experimental rainfall variability data.	翻訳日:2022-11-14 13:26:34 公開日:2020-07-02
# 推定逆傾向スコアを用いた個別化治療ルールの学習 Learning Individualized Treatment Rules with Estimated Translated Inverse Propensity Score ( http://arxiv.org/abs/2007.01083v1 ) ライセンス: Link先を確認	Zhiliang Wu, Yinchong Yang, Yunpu Ma, Yushan Liu, Rui Zhao, Michael Moor, Volker Tresp	(参考訳) ランダム化対照試験は、通常、患者サブグループに対する治療勧告を作成することを目的として、治療の有効性を分析する。電子健康記録の進歩に伴い,臨床実践において多種多様なデータが収集され,観察データに基づく治療・治療方針の評価が可能となった。本稿では,個別治療規則(ITR)の学習に焦点をあて,個々の患者により良い結果をもたらすと期待される治療方針を導出する。本フレームワークでは,ITRの学習を文脈的盗聴問題とみなし,治療方針の予測リスクを最小限に抑える。シミュレーション研究と実世界のデータセットに基づいて,提案フレームワークを用いて実験を行う。後者の場合, 静脈内 (IV) 液と血管圧薬 (VP) の投与に最適なITRを学習するために提案法を適用した。様々なオフライン評価手法に基づいて,本フレームワークから導出されたポリシーは,簡単な治療予測手法を含む,医師や他のベースラインと比較して優れた性能を示すことを示すことができる。長期的目標として,本方針はIVおよびVPの治験ガイドラインの改善につながる可能性がある。 Randomized controlled trials typically analyze the effectiveness of treatments with the goal of making treatment recommendations for patient subgroups. With the advance of electronic health records, a great variety of data has been collected in clinical practice, enabling the evaluation of treatments and treatment policies based on observational data. In this paper, we focus on learning individualized treatment rules (ITRs) to derive a treatment policy that is expected to generate a better outcome for an individual patient. In our framework, we cast ITRs learning as a contextual bandit problem and minimize the expected risk of the treatment policy. We conduct experiments with the proposed framework both in a simulation study and based on a real-world dataset. In the latter case, we apply our proposed method to learn the optimal ITRs for the administration of intravenous (IV) fluids and vasopressors (VP). Based on various offline evaluation methods, we could show that the policy derived in our framework demonstrates better performance compared to both the physicians and other baselines, including a simple treatment prediction approach. 
As a long-term goal, our derived policy might eventually lead to better clinical guidelines for the administration of IV and VP.	翻訳日:2022-11-14 13:26:22 公開日:2020-07-02
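The contextual-bandit view of ITR learning in the entry above can be illustrated with a small offline evaluation. The sketch below uses the plain inverse propensity score (IPS) estimator, not the translated IPS objective named in the paper's title; the record layout, the two toy policies, and all numbers are assumptions made purely for illustration.

```python
from typing import Callable, Sequence, Tuple

# One logged observation (hypothetical layout):
# (context, treatment given, observed outcome, probability with which
#  the logging policy chose that treatment).
Record = Tuple[float, int, float, float]


def ips_value(records: Sequence[Record], policy: Callable[[float], int]) -> float:
    """Plain IPS estimate of the average outcome obtained by following `policy`."""
    total = 0.0
    for context, treatment, outcome, propensity in records:
        # Only records where the candidate policy agrees with the logged
        # treatment contribute, re-weighted by 1 / propensity.
        if policy(context) == treatment:
            total += outcome / propensity
    return total / len(records)


# Toy log: treatments 0 and 1 were assigned uniformly at random (propensity 0.5).
LOG = [
    (0.2, 0, 1.0, 0.5),
    (0.2, 1, 0.0, 0.5),
    (0.8, 0, 0.0, 0.5),
    (0.8, 1, 1.0, 0.5),
]


def threshold_policy(context: float) -> int:
    """Treat only when the context value is high."""
    return 1 if context > 0.5 else 0


def always_treat(context: float) -> int:
    """Baseline that gives treatment 1 to everyone."""
    return 1


print(ips_value(LOG, threshold_policy), ips_value(LOG, always_treat))
```

On this toy log the context-dependent policy receives a higher estimated value than the always-treat baseline, which is the kind of comparison an offline evaluation of treatment policies relies on.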
# 強化二階ファジィ規則モデル構築における指数重み付きl_2正規化戦略 Exponentially Weighted l_2 Regularization Strategy in Constructing Reinforced Second-order Fuzzy Rule-based Model ( http://arxiv.org/abs/2007.01208v1 ) ライセンス: Link先を確認	Congcong Zhang, Sung-Kwun Oh, Witold Pedrycz, Zunwei Fu and Shanzhen Lu	(参考訳) 従来の高木スゲノカン(TSK)型ファジィモデルでは、定数関数や線形関数は通常ファジィ規則の連続部分として使用されるが、先行部分によって定義された局所領域内の振る舞いを効果的に記述することはできない。本稿では,この問題に対処するために理論的かつ実用的な設計手法を提案する。まず、情報顆粒化(fuzzy c-means)法を用いて、データの構造をキャプチャし、入力空間を部分空間に分割し、先行部を形成する。第2に、二次多項式(QP)が連続部分として用いられる。定数関数や線形関数と比較して、QPは入力変数と出力変数の関係を洗練することにより、局所領域(部分空間)内の入出力挙動を記述することができる。しかし、QPはモデルの近似能力を向上させることができるが、モデルの予測能力を低下させる可能性がある(例えば、過剰適合)。この問題に対処するために,調和解析で遭遇する重み関数理論に着想を得た指数重み法を提案する。具体的には, 2次ファジィ法則モデル (RSFRM) を適切に適合させるために, l2正則化 (l2) (指数重み付きl2, ewl_2) を具備した目標ペナルティ項として指数関数を用いる。通常の l2 と比較して el 2 の利点は、係数推定において異なる種類の多項式項を別々に同定し、ペナルティを課すことであり、その結果はオーバーフィッティングを緩和し、一般化能力の低下を防ぐだけでなく、モデルの予測ポテンシャルを効果的に放出する。 In the conventional Takagi-Sugeno-Kang (TSK)-type fuzzy models, constant or linear functions are usually utilized as the consequent parts of the fuzzy rules, but they cannot effectively describe the behavior within local regions defined by the antecedent parts. In this article, a theoretical and practical design methodology is developed to address this problem. First, the information granulation (Fuzzy C-Means) method is applied to capture the structure in the data and split the input space into subspaces, as well as form the antecedent parts. Second, the quadratic polynomials (QPs) are employed as the consequent parts. Compared with constant and linear functions, QPs can describe the input-output behavior within the local regions (subspaces) by refining the relationship between input and output variables. However, although QP can improve the approximation ability of the model, it could lead to the deterioration of the prediction ability of the model (e.g., overfitting). 
To handle this issue, we introduce an exponential weight approach inspired by the weight function theory encountered in harmonic analysis. More specifically, we adopt the exponential functions as the targeted penalty terms, which are equipped with l_2 regularization (i.e., exponentially weighted l_2, ewl_2) to match the proposed reinforced second-order fuzzy rule-based model (RSFRM) properly. The advantage of ewl_2 over ordinary l_2 lies in separately identifying and penalizing different types of polynomial terms in the coefficient estimation, and its results not only alleviate the overfitting and prevent the deterioration of generalization ability but also effectively release the prediction potential of the model.	翻訳日:2022-11-14 13:25:19 公開日:2020-07-02
# ニューラルネットワークのヌル空間解析による外乱検出 Outlier Detection through Null Space Analysis of Neural Networks ( http://arxiv.org/abs/2007.01263v1 ) ライセンス: Link先を確認	Matthew Cook, Alina Zare, Paul Gader	(参考訳) 多くの機械学習分類システムは能力の認知を欠いている。特に、多くのシステムは、異常値(例えば、トレーニングデータ分布で表現されていない、異なるサンプル)がシステムに提示されたときに識別する能力が欠如している。予期せぬデータに遭遇するとき、システムが合理的な方法で振る舞うのを助けることができるため、外れ値を検出する能力は実用上重要である。先行研究では, 分類モデルとは異なる処理パイプラインにおいて, 異常検出を行うのが一般的である。したがって、外れ値の検出と分類を組み込んだ完全なシステムでは、2つのモデルをトレーニングし、アプローチの全体的な複雑さを増大させる必要がある。本稿では,ヌル空間の概念を用いて,異常検出手法を分類に用いるニューラルネットワークに直接統合する。ニューラルネットワークのヌル空間解析(nusa)と呼ばれる手法は、データがネットワークを通過するときにヌル空間投影の大きさを計算し制御することで動作する。これらの投影を用いて、正常データと異常データとを区別できるスコアを計算できる。その結果,nusaで訓練されたネットワークは分類性能を維持しつつ,一般の異常検出アルゴリズムと同様の速度で異常値を検出することができた。 Many machine learning classification systems lack competency awareness. Specifically, many systems lack the ability to identify when outliers (e.g., samples that are distinct from and not represented in the training data distribution) are being presented to the system. The ability to detect outliers is of practical significance since it can help the system behave in a reasonable way when encountering unexpected data. In prior work, outlier detection is commonly carried out in a processing pipeline that is distinct from the classification model. Thus, for a complete system that incorporates outlier detection and classification, two models must be trained, increasing the overall complexity of the approach. In this paper we use the concept of the null space to integrate an outlier detection method directly into a neural network used for classification. Our method, called Null Space Analysis (NuSA) of neural networks, works by computing and controlling the magnitude of the null space projection as data is passed through a network. Using these projections, we can then calculate a score that can differentiate between normal and abnormal data. 
Results are shown that indicate networks trained with NuSA retain their classification performance while also being able to detect outliers at rates similar to commonly used outlier detection algorithms.	翻訳日:2022-11-14 13:24:49 公開日:2020-07-02
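The null space projection mentioned in the entry above can be made concrete for a single linear layer. The sketch below is a minimal illustration, assuming one weight matrix applied as W x and a simple norm ratio as the score; the function names and the ratio are hypothetical and are not taken from the paper, whose actual NuSA score may be defined differently.

```python
import math
from typing import List

Vector = List[float]


def dot(u: Vector, v: Vector) -> float:
    return sum(a * b for a, b in zip(u, v))


def remove_component(v: Vector, q: Vector) -> Vector:
    """Subtract from v its projection onto the unit vector q."""
    coeff = dot(v, q)
    return [vi - coeff * qi for vi, qi in zip(v, q)]


def row_space_basis(weights: List[Vector], tol: float = 1e-10) -> List[Vector]:
    """Orthonormal basis of the row space of `weights` (Gram-Schmidt)."""
    basis: List[Vector] = []
    for row in weights:
        residual = list(row)
        for q in basis:
            residual = remove_component(residual, q)
        norm = math.sqrt(dot(residual, residual))
        if norm > tol:  # skip rows that are linearly dependent on earlier ones
            basis.append([r / norm for r in residual])
    return basis


def null_space_ratio(weights: List[Vector], x: Vector) -> float:
    """Fraction of the norm of a non-zero input x lying in the null space of `weights`.

    The null space is the orthogonal complement of the row space, so the
    null space component is what remains after removing all row-space parts.
    """
    residual = list(x)
    for q in row_space_basis(weights):
        residual = remove_component(residual, q)
    return math.sqrt(dot(residual, residual)) / math.sqrt(dot(x, x))


# Toy layer mapping R^3 to R^2; it ignores the third input coordinate.
LAYER = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]

print(null_space_ratio(LAYER, [1.0, 2.0, 0.0]))  # input fully visible to the layer
print(null_space_ratio(LAYER, [3.0, 0.0, 4.0]))  # input mostly invisible to the layer
```

The part of an input that falls in the null space does not change the layer's output, so an input with a large ratio carries information the layer cannot see; that is the intuition behind using the projection magnitude as an outlier signal.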
# スカースデータを用いたスペクトルランク付け法 Spectral Methods for Ranking with Scarce Data ( http://arxiv.org/abs/2007.01346v1 ) ライセンス: Link先を確認	Umang Varma, Lalit Jain, Anna C. Gilbert	(参考訳) アイテムのペアの選好が与えられた場合、すべてのアイテムをランク付けするのが一般的なタスクである。例えば、ペアワイズ映画評価、ニューヨーカーの漫画キャプションコンテスト、その他多くの消費者選好課題がある。これらの設定が共通しているのは、データの不足(すべての項目を比較するのにコストがかかるかもしれない)とアイテムに関する追加の機能情報(映画ジャンル、監督、キャストなど)の2つだ。本稿では,人気でよく研究されているランクアグリゲーション手法であるRankCentralityを,いくつかの比較を考慮し,付加的な特徴情報を含むように修正する。この方法は少ない比較でも有意義なランキングを返す。拡散に基づく手法を用いて,実際に最先端の手法に勝る特徴情報を組み込む。また,様々なサンプリングスキームにおいて,RankCentralityに対するサンプル複雑性の改善も提供する。 Given a number of pairwise preferences of items, a common task is to rank all the items. Examples include pairwise movie ratings, New Yorker cartoon caption contests, and many other consumer preferences tasks. What these settings have in common is two-fold: a scarcity of data (it may be costly to get comparisons for all the pairs of items) and additional feature information about the items (e.g., movie genre, director, and cast). In this paper we modify a popular and well-studied rank aggregation method, RankCentrality, to account for few comparisons and to incorporate additional feature information. This method returns meaningful rankings even under scarce comparisons. Using diffusion based methods, we incorporate feature information that outperforms state-of-the-art methods in practice. We also provide improved sample complexity for RankCentrality in a variety of sampling schemes.	翻訳日:2022-11-14 13:23:49 publish:2020-07-02
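For readers unfamiliar with the base method this entry builds on, the following is a minimal sketch of standard RankCentrality: pairwise comparisons define a random walk that tends to move toward winners, and the stationary distribution of that walk gives the item scores. It does not include the paper's feature-based, diffusion-style modification; the function name, the dictionary layout for the comparison counts, and the toy data are assumptions for illustration.

```python
from typing import Dict, List, Tuple


def rank_centrality(
    n_items: int,
    wins: Dict[Tuple[int, int], int],
    iterations: int = 1000,
) -> List[float]:
    """Score items from pairwise comparisons.

    wins[(i, j)] is the number of comparisons in which item i beat item j.
    Assumes at least one pair was compared and that the comparison graph is connected.
    """
    transition = [[0.0] * n_items for _ in range(n_items)]
    degree = [0] * n_items
    for i in range(n_items):
        for j in range(n_items):
            if i == j:
                continue
            total = wins.get((i, j), 0) + wins.get((j, i), 0)
            if total > 0:
                degree[i] += 1
                # From i, move to j in proportion to how often j beat i.
                transition[i][j] = wins.get((j, i), 0) / total
    d_max = max(degree)
    for i in range(n_items):
        for j in range(n_items):
            transition[i][j] /= d_max
        # Self-loop absorbs the remaining probability mass (diagonal was 0).
        transition[i][i] = 1.0 - sum(transition[i])

    # Power iteration for the stationary distribution.
    scores = [1.0 / n_items] * n_items
    for _ in range(iterations):
        scores = [
            sum(scores[i] * transition[i][j] for i in range(n_items))
            for j in range(n_items)
        ]
    return scores


# Toy data: item 0 usually beats 1 and 2, and item 1 usually beats 2.
WINS = {
    (0, 1): 3, (1, 0): 1,
    (1, 2): 3, (2, 1): 1,
    (0, 2): 3, (2, 0): 1,
}
SCORES = rank_centrality(3, WINS)
print(SCORES)
```

With every pair compared, the scores sum to one and order the items as 0, 1, 2. With scarce data many pairs have no comparisons at all, which is the regime the entry's modified method targets.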
# BusTr:リアルタイム交通からバスの走行時間を予測する BusTr: Predicting Bus Travel Times from Real-Time Traffic ( http://arxiv.org/abs/2007.00882v1 ) ライセンス: Link先を確認	Richard Barnes and Senaka Buthpitiya and James Cook and Alex Fabrikant and Andrew Tomkins and Fangzhou Xu	(参考訳) 本稿では,道路交通予測をバス遅延予測に翻訳する機械学習モデルであるBusTrについて紹介する。我々のニューラルシーケンスモデルは、パフォーマンス(-30% MAPE)とトレーニング安定性の両方において、最先端のベースラインであるDeepTTEよりも改善されていることを実証する。また、より単純なモデルよりも大幅に一般化され、常に進化する世界に対応するために、縦方向のデータで評価される。 We present BusTr, a machine-learned model for translating road traffic forecasts into predictions of bus delays, used by Google Maps to serve the majority of the world's public transit systems where no official real-time bus tracking is provided. We demonstrate that our neural sequence model improves over DeepTTE, the state-of-the-art baseline, both in performance (-30% MAPE) and training stability. We also demonstrate significant generalization gains over simpler models, evaluated on longitudinal data to cope with a constantly evolving world.	翻訳日:2022-11-14 13:16:59 公開日:2020-07-02
# bosh:階層的サンプリングによるベイズ最適化 BOSH: Bayesian Optimization by Sampling Hierarchically ( http://arxiv.org/abs/2007.00939v1 ) ライセンス: Link先を確認	Henry B. Moss, David S. Leslie, Paul Rayson	(参考訳) クロス検証やシミュレーション最適化によるパラメータチューニングのような確率的評価を持つ関数に対するベイズ最適化(bo)の配置は、通常、目的関数の固定されたノイズ発生の平均を最適化する。しかし、この方法で真の目的関数を無視すると、誤った関数の高精度な最適化が見つかる。この問題を解決するために,階層型ガウス過程と情報理論フレームワークを組み合わせ,最適化が進むにつれて実現のプールを増大させる新しいBOルーチンである階層型ガウス法(BOSH)をサンプリングしてベイズ最適化を提案する。 BOSHは, ベンチマーク, シミュレーション最適化, 強化学習, ハイパーパラメータチューニングタスクにおいて, 標準BOよりも効率的で高精度な最適化を実現する。 Deployments of Bayesian Optimization (BO) for functions with stochastic evaluations, such as parameter tuning via cross validation and simulation optimization, typically optimize an average of a fixed set of noisy realizations of the objective function. However, disregarding the true objective function in this manner finds a high-precision optimum of the wrong function. To solve this problem, we propose Bayesian Optimization by Sampling Hierarchically (BOSH), a novel BO routine pairing a hierarchical Gaussian process with an information-theoretic framework to generate a growing pool of realizations as the optimization progresses. We demonstrate that BOSH provides more efficient and higher-precision optimization than standard BO across synthetic benchmarks, simulation optimization, reinforcement learning and hyper-parameter tuning tasks.	翻訳日:2022-11-14 13:16:26 公開日:2020-07-02
# リニアバンディットのための純粋探査のゲーム化 Gamification of Pure Exploration for Linear Bandits ( http://arxiv.org/abs/2007.00953v1 ) ライセンス: Link先を確認	Rémy Degenne, Pierre Ménard, Xuedong Shang, Michal Valko	(参考訳) 線形確率的包帯の文脈において、ベストアーム識別を含む活発な純粋探索環境について検討する。標準のマルチアームバンディットには漸近的に最適なアルゴリズムが存在するが、線形バンディットにおける最良アーム識別のためのアルゴリズムの存在は、それに対処するいくつかの試みにもかかわらず、解明されてきた。まず,G最適性,最適設計からの帰納的最適性,漸近的最適性など,線形の場合における最適性の異なる概念について,徹底的な比較と新たな知見を提供する。第2に,線形帯域における固定信頼純粋探索のための漸近最適化アルゴリズムを設計する。その結果、我々のアルゴリズムは、単純だが難解な例による落とし穴を自然に回避し、ほとんどの先行アルゴリズムを明示的に扱うように設計しなければならなかった。最後に、効率的な実装を伴うアプローチを提供することで、最適な設計問題を解決する必要性を回避します。 We investigate an active pure-exploration setting, that includes best-arm identification, in the context of linear stochastic bandits. While asymptotically optimal algorithms exist for standard multi-arm bandits, the existence of such algorithms for the best-arm identification in linear bandits has been elusive despite several attempts to address it. First, we provide a thorough comparison and new insight over different notions of optimality in the linear case, including G-optimality, transductive optimality from optimal experimental design and asymptotic optimality. Second, we design the first asymptotically optimal algorithm for fixed-confidence pure exploration in linear bandits. As a consequence, our algorithm naturally bypasses the pitfall caused by a simple but difficult instance, that most prior algorithms had to be engineered to deal with explicitly. Finally, we avoid the need to fully solve an optimal design problem by providing an approach that entails an efficient implementation.	翻訳日:2022-11-14 13:16:12 公開日:2020-07-02
# 確率帯域に対する構造適応アルゴリズム Structure Adaptive Algorithms for Stochastic Bandits ( http://arxiv.org/abs/2007.00969v1 ) ライセンス: Link先を確認	Rémy Degenne, Han Shao, Wouter M. Koolen	(参考訳) 本研究では,線形,ユニモーダル,スパースなど,腕の平均報酬が与えられた構造的制約を満たした幅広い階層構造的確率的多腕バンディット問題において,報酬の最大化について検討する。我々の目的は、柔軟で(異なる構造に容易に適応できる)、強力で(経験的かつ/または証明的にインスタンス依存の低い境界によく適合する)、かつ、円周計算の負担が小さい方法を開発することである。反復的鞍点解法を用いて,インスタンス依存下限から漸近的最適アルゴリズムを開発する。提案手法は,準最適性ギャップとその相互関係の推定から生じる大きな課題である,純粋探索のための最近の反復的手法を一般化するものである。それでも上述のデシダラタは達成できた。特に,本手法は,それまでの作業で用いたフルブルーのサドル点オラクルの計算コストを回避すると同時に,有限時間後悔境界を許容する。実験の結果,本手法は構造的仮定の活用に成功し,その後悔はバニラ UCB に匹敵することがわかった。 We study reward maximisation in a wide class of structured stochastic multi-armed bandit problems, where the mean rewards of arms satisfy some given structural constraints, e.g. linear, unimodal, sparse, etc. Our aim is to develop methods that are flexible (in that they easily adapt to different structures), powerful (in that they perform well empirically and/or provably match instance-dependent lower bounds) and efficient in that the per-round computational burden is small. We develop asymptotically optimal algorithms from instance-dependent lower-bounds using iterative saddle-point solvers. Our approach generalises recent iterative methods for pure exploration to reward maximisation, where a major challenge arises from the estimation of the sub-optimality gaps and their reciprocals. Still we manage to achieve all the above desiderata. Notably, our technique avoids the computational cost of the full-blown saddle point oracle employed by previous work, while at the same time enabling finite-time regret bounds. Our experiments reveal that our method successfully leverages the structural assumptions, while its regret is at worst comparable to that of vanilla UCB.	翻訳日:2022-11-14 13:15:56 公開日:2020-07-02
# Shapley値と条件推論木を用いた混合特徴付き予測モデルの記述 Explaining predictive models with mixed features using Shapley values and conditional inference trees ( http://arxiv.org/abs/2007.01027v1 ) ライセンス: Link先を確認	Annabelle Redelmeier, Martin Jullum, and Kjersti Aas	(参考訳) 複雑なブラックボックス機械学習モデルを説明することがますます重要になっている。このトピックに関する文献は拡大しているが、Shapleyの値は、あらゆる種類の機械学習モデルからの予測を説明するためのサウンドメソッドとして際立っている。予測説明のためのShapley値の当初の開発は、記述されている特徴が独立しているという仮定に依存していた。この方法論は、基礎となる連続分布で依存する特徴を説明するために拡張された。本稿では,条件付き推論木を用いた特徴の依存構造をモデル化し,混合特徴(連続的,離散的,順序的,類型的)に依存する特徴を説明する手法を提案する。提案手法は, 様々なシミュレーション研究において, 現在の業界標準に対して, 提案手法が他の手法よりも優れていることを実証する。最後に,本手法を2018 fico explainsable machine learning challengeで使用した実金融データセットに適用し,fico challenge recognition awardの受賞チームとの比較を行った。 It is becoming increasingly important to explain complex, black-box machine learning models. Although there is an expanding literature on this topic, Shapley values stand out as a sound method to explain predictions from any type of machine learning model. The original development of Shapley values for prediction explanation relied on the assumption that the features being described were independent. This methodology was then extended to explain dependent features with an underlying continuous distribution. In this paper, we propose a method to explain mixed (i.e. continuous, discrete, ordinal, and categorical) dependent features by modeling the dependence structure of the features using conditional inference trees. We demonstrate our proposed method against the current industry standards in various simulation studies and find that our method often outperforms the other approaches. Finally, we apply our method to a real financial data set used in the 2018 FICO Explainable Machine Learning Challenge and show how our explanations compare to the FICO challenge Recognition Award winning team.	翻訳日:2022-11-14 13:15:03 公開日:2020-07-02
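To make the Shapley-value explanation in the entry above concrete, the sketch below computes exact Shapley values for a small number of features from a user-supplied value function v(S), the expected prediction when only the features in S are known. The toy value function assumes a linear model with independent features, which is the simple setting the entry contrasts with; the paper's method would instead estimate v(S) for dependent mixed features with conditional inference trees. All names, coefficients, and feature values here are hypothetical.

```python
from itertools import combinations
from math import factorial
from typing import Callable, FrozenSet, List


def shapley_values(
    n_features: int,
    value: Callable[[FrozenSet[int]], float],
) -> List[float]:
    """Exact Shapley values; cost grows as 2**n_features, so keep n small."""
    phis = []
    for i in range(n_features):
        others = [j for j in range(n_features) if j != i]
        phi = 0.0
        for size in range(len(others) + 1):
            # Weight of a coalition of this size in the Shapley formula.
            weight = (
                factorial(size)
                * factorial(n_features - size - 1)
                / factorial(n_features)
            )
            for subset in combinations(others, size):
                coalition = frozenset(subset)
                # Marginal contribution of feature i to this coalition.
                phi += weight * (value(coalition | {i}) - value(coalition))
        phis.append(phi)
    return phis


# Toy linear model f(x) = sum(BETA[j] * x[j]) with independent features.
BETA = [2.0, -1.0, 0.5]
X = [1.0, 3.0, 4.0]      # instance being explained
MEAN = [0.0, 1.0, 2.0]   # feature means used for the unknown features


def value_independent(known: FrozenSet[int]) -> float:
    """Expected prediction when only the features in `known` are observed."""
    return sum(b * (X[j] if j in known else MEAN[j]) for j, b in enumerate(BETA))


PHI = shapley_values(3, value_independent)
print(PHI)
```

In this independent linear case each value reduces to BETA[j] * (X[j] - MEAN[j]), and the values add up to the gap between the prediction for X and the prediction at the feature means. Handling dependent features changes only how `value` is estimated, not the aggregation loop.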
# 信号伝播を超えて: ディープニューラルネットワークの初期化には特徴の多様性が必要か? Beyond Signal Propagation: Is Feature Diversity Necessary in Deep Neural Network Initialization? ( http://arxiv.org/abs/2007.01038v1 ) ライセンス: Link先を確認	Yaniv Blumenfeld, Dar Gilboa, Daniel Soudry	(参考訳) ディープニューラルネットワークは通常ランダムウェイトで初期化され、信号の伝搬と安定した勾配を促進するために分散が選択される。特徴の多様性はこれらの初期化の重要な性質であると考えられている。ほぼすべての重みを0に初期化することにより、同一の特徴を持つ深い畳み込みネットワークを構築する。アーキテクチャはまた、完全な信号伝搬と安定した勾配を可能にし、標準ベンチマークで高い精度を達成する。これは、ランダムで多様な初期化がニューラルネットワークのトレーニングに必要ではないことを示している。我々は、この現象を研究し、非決定論的である標準的なgpu操作が、トレーニングを可能にするのに十分な対称性の破れの源となることを見出します。 Deep neural networks are typically initialized with random weights, with variances chosen to facilitate signal propagation and stable gradients. It is also believed that diversity of features is an important property of these initializations. We construct a deep convolutional network with identical features by initializing almost all the weights to 0. The architecture also enables perfect signal propagation and stable gradients, and achieves high accuracy on standard benchmarks. This indicates that random, diverse initializations are not necessary for training neural networks. An essential element in training this network is a mechanism of symmetry breaking; we study this phenomenon and find that standard GPU operations, which are non-deterministic, can serve as a sufficient source of symmetry breaking to enable training.	翻訳日:2022-11-14 13:14:47 公開日:2020-07-02
# トランスフォーマーからの双方向エンコーダ表現(BERT): 感情分析のオデッセイ Bidirectional Encoder Representations from Transformers (BERT): A sentiment analysis odyssey ( http://arxiv.org/abs/2007.01127v1 ) ライセンス: Link先を確認	Shivaji Alaparthi (Data Scientist, CenturyLink, Bengaluru, India) and Manit Mishra (Associate Professor, International Management Institute Bhubaneswar, India)	(参考訳) 本研究の目的は,(1)Sent WordNetを用いた教師なし語彙ベースモデル,(2)ロジスティック回帰を用いた従来の教師付き機械学習モデル,(3)Long Short-Term Memory(LSTM)を用いた教師付きディープラーニングモデル,(4)Bidirectional Encoder Representations from Transformers(BERT)を用いた高度な教師付きディープラーニングモデル,の4つの異なる感情分析手法の相対的有効性を検討することである。我々は、インターネット映画データベース(IMDB)に投稿された5万件の映画レビューからなる公開ラベル付きコーパスを、Sent WordNetレキシコン、ロジスティック回帰、LSTM、BERTを用いた分析に使用する。最初の3モデルはCPUベースのシステムで、BERTはGPUベースのシステムで実行した。感情分類性能は,正解率,適合率,再現率,F1スコアに基づいて評価した。本研究は,(1)高度で広く使用されている4つの感情分析手法の相対的有効性,(2)テキストデータからの感情分析における事前学習済みの高度な教師付きディープラーニングモデルBERTの明白な優位性,という2つの重要な知見を示す。本研究は,分析業界の専門家とテキスト分析に携わる研究者に,最近開発されたBERTを含む主要な感情分析手法の分類性能の比較評価に関する知見を提供する。これは、事前学習済みの高度な教師付きディープラーニングモデルであるBERTを、LSTM、ロジスティック回帰、Sent WordNetという他の感情分析モデルと比較した最初の研究である。 The purpose of the study is to investigate the relative effectiveness of four different sentiment analysis techniques: (1) unsupervised lexicon-based model using Sent WordNet; (2) traditional supervised machine learning model using logistic regression; (3) supervised deep learning model using Long Short-Term Memory (LSTM); and, (4) advanced supervised deep learning models using Bidirectional Encoder Representations from Transformers (BERT). We use publicly available labeled corpora of 50,000 movie reviews originally posted on internet movie database (IMDB) for analysis using Sent WordNet lexicon, logistic regression, LSTM, and BERT. The first three models were run on CPU based system whereas BERT was run on GPU based system. The sentiment classification performance was evaluated based on accuracy, precision, recall, and F1 score. The study puts forth two key insights: (1) relative efficacy of four highly advanced and widely used sentiment analysis techniques; (2) undisputed superiority of pre-trained advanced supervised deep learning BERT model in sentiment analysis from text data. This study provides professionals in analytics industry and academicians working on text analysis key insight regarding comparative classification performance evaluation of key sentiment analysis techniques, including the recently developed BERT. This is the first research endeavor to compare the advanced pre-trained supervised deep learning model of BERT vis-\`a-vis other sentiment analysis models of LSTM, logistic regression, and Sent WordNet.	翻訳日:2022-11-14 13:08:38 公開日:2020-07-02
# PGD-UNet: 臓器と腫瘍の同時セグメンテーションのための位置誘導型変形可能ネットワーク PGD-UNet: A Position-Guided Deformable Network for Simultaneous Segmentation of Organs and Tumors ( http://arxiv.org/abs/2007.01001v1 ) ライセンス: Link先を確認	Ziqiang Li, Hong Pan, Yaping Zhu, A. K. Qin	(参考訳) 臓器と腫瘍の精密なセグメンテーションは臨床応用において重要な役割を担っている。臓器や腫瘍の形状が不規則で大きさも多様であること、そして関心解剖領域(AOI)と背景領域の間に著しいクラス不均衡があることから、これは困難な課題である。加えて、ほとんどの場合、腫瘍と正常臓器は医用画像上で重なり合うが、現在のアプローチでは腫瘍と臓器の両方を正確に描出することができない。このような課題に対処すべく,変形可能な畳み込みの空間的変形能力を利用して臓器と腫瘍の幾何学的変形に対応する位置誘導型変形可能UNet,PGD-UNetを提案する。位置情報はネットワークに明示的にエンコードされ、変形の能力を高める。また,従来の最大プーリング操作で失われる位置情報を保存する新しいプーリングモジュールを導入する。さらに、異なる構造間の境界が不明瞭であることやアノテーションの主観性のため、医用画像のセグメンテーションタスクではラベルが必ずしも正確ではない。これはラベルノイズによる学習済みネットワークの過学習を引き起こす可能性がある。この問題に対処するために,新たな損失関数を定式化し,潜在的なラベルノイズがトレーニングプロセスに与える影響を抑制する。本手法は,2つの難解なセグメンテーションタスクで評価し,両タスクにおいて非常に有望なセグメンテーション精度を得た。 Precise segmentation of organs and tumors plays a crucial role in clinical applications. It is a challenging task due to the irregular shapes and various sizes of organs and tumors as well as the significant class imbalance between the anatomy of interest (AOI) and the background region. In addition, in most situations, tumors and normal organs often overlap in medical images, but current approaches fail to delineate both tumors and organs accurately. To tackle such challenges, we propose a position-guided deformable UNet, namely PGD-UNet, which exploits the spatial deformation capabilities of deformable convolution to deal with the geometric transformation of both organs and tumors. Position information is explicitly encoded into the network to enhance the capabilities of deformation. Meanwhile, we introduce a new pooling module to preserve position information lost in conventional max-pooling operation. Besides, due to unclear boundaries between different structures as well as the subjectivity of annotations, labels are not necessarily accurate for medical image segmentation tasks. It may cause the overfitting of the trained network due to label noise. To address this issue, we formulate a novel loss function to suppress the influence of potential label noise on the training process. Our method was evaluated on two challenging segmentation tasks and achieved very promising segmentation accuracy in both tasks.	翻訳日:2022-11-14 13:08:08 公開日:2020-07-02
# nlnde:ロバストな薬理学的実体検出のための注意とノイズチャンネルによる神経配列タガーの増強 NLNDE: Enhancing Neural Sequence Taggers with Attention and Noisy Channel for Robust Pharmacological Entity Detection ( http://arxiv.org/abs/2007.01022v1 ) ライセンス: Link先を確認	Lukas Lange, Heike Adel, Jannik Str\"otgen	(参考訳) 名前付きエンティティ認識は、英語のニューステキストで広く研究されている。しかし、他のドメインや言語への移行は依然として難しい問題である。本稿では,BioNLP Open Shared Tasks 2019のPharmaCoNERコンペティションの最初のサブトラックに参加したシステムについて述べる。スペイン語のテキストにおける薬理学的エンティティ検出を目的としたこのタスクは、非標準ドメインと言語設定を提供する。しかし、言語やドメインの専門知識を必要としないアーキテクチャを提案する。タスクをシーケンスラベリングタスクとして扱い,注意に基づく埋め込み選択と自動アノテートデータのトレーニングを行い,システムの性能をさらに向上させる。提案システムは,特に異なる技術を組み合わせることで,有望な結果を達成し,競争において最大88.6%のF1に達する。 Named entity recognition has been extensively studied on English news texts. However, the transfer to other domains and languages is still a challenging problem. In this paper, we describe the system with which we participated in the first subtrack of the PharmaCoNER competition of the BioNLP Open Shared Tasks 2019. Aiming at pharmacological entity detection in Spanish texts, the task provides a non-standard domain and language setting. However, we propose an architecture that requires neither language nor domain expertise. We treat the task as a sequence labeling task and experiment with attention-based embedding selection and the training on automatically annotated data to further improve our system's performance. Our system achieves promising results, especially by combining the different techniques, and reaches up to 88.6% F1 in the competition.	翻訳日:2022-11-14 13:07:28 公開日:2020-07-02
# nlnde: スペイン語の医学文書の非識別方法 NLNDE: The Neither-Language-Nor-Domain-Experts' Way of Spanish Medical Document De-Identification ( http://arxiv.org/abs/2007.01030v1 ) ライセンス: Link先を確認	Lukas Lange, Heike Adel, Jannik Str\"otgen	(参考訳) 自然言語処理は、最近この分野で多くの研究を導いた医学領域において大きな可能性を秘めている。しかし、患者ノートや臨床試験などの医療文書の安全な処理の前提条件は、プライバシに敏感な情報の適切な特定である。本稿では,IberLEF 2019の医療文書匿名化タスクであるMEDDOCANコンペティションに参加したNLNDEシステムについて述べる。スペインのデータから保護された健康情報をシーケンスラベル問題として検出・分類し、ニューラルネットワークの異なる埋め込み方法を検討する。非標準言語とドメイン設定を扱うにもかかわらず、NLNDEシステムは競争において有望な結果を達成する。 Natural language processing has huge potential in the medical domain which recently led to a lot of research in this field. However, a prerequisite of secure processing of medical documents, e.g., patient notes and clinical trials, is the proper de-identification of privacy-sensitive information. In this paper, we describe our NLNDE system, with which we participated in the MEDDOCAN competition, the medical document anonymization task of IberLEF 2019. We address the task of detecting and classifying protected health information from Spanish data as a sequence-labeling problem and investigate different embedding methods for our neural network. Despite dealing in a non-standard language and domain setting, the NLNDE system achieves promising results in the competition.	翻訳日:2022-11-14 13:07:14 公開日:2020-07-02
# 不完全な情報と制約下での深層強化学習による検査・維持計画 Deep reinforcement learning driven inspection and maintenance planning under incomplete information and constraints ( http://arxiv.org/abs/2007.01380v1 ) ライセンス: Link先を確認	C.P. Andriotis, K.G. Papakonstantinou	(参考訳) 劣化する工学環境における長期的なリスクとコストを最小限に抑えるための検査と保守の方針の決定は、複雑な最適化問題を構成する。主な計算上の課題は、(i)成分数に伴う状態・行動集合の濃度の指数関数的拡大による次元の呪い、(ii)決定段階の数とともに指数関数的に成長する決定木に関連する歴史の呪い、(iii)環境に固有の確率性及び検査・監視計測の変動性により引き起こされる状態不確実性の存在、(iv)資源不足やその他の実現不可能または望ましくないシステム応答による、確率的な長期的制限に係る制約の存在、である。本研究は,制約付き部分観測マルコフ決定過程(POMDP)とマルチエージェント深層強化学習(DRL)の統合フレームワーク内で,これらの課題に対処する。POMDPは、確率的動的計画法とベイズ推論の原理を組み合わせることで、(ii)-(iii)に最適に対処する。マルチエージェントDRLは、深層関数パラメータ化と分散制御の仮定を通じて、(i)に対処する。課題(iv)は、ライフサイクルリスクに基づく制約と予算制限に重点を置き、適切な状態拡張とラグランジュ緩和を通じて処理される。基礎となるアルゴリズムのステップが示され、提案フレームワークは、最も資源とリスクを意識した方法で決定を行う必要がある場合に、確立されたポリシーベースラインを上回り、検査および介入行動の適切な処方を容易にすることがわかった。 Determination of inspection and maintenance policies for minimizing long-term risks and costs in deteriorating engineering environments constitutes a complex optimization problem. Major computational challenges include the (i) curse of dimensionality, due to exponential scaling of state/action set cardinalities with the number of components; (ii) curse of history, related to exponentially growing decision-trees with the number of decision-steps; (iii) presence of state uncertainties, induced by inherent environment stochasticity and variability of inspection/monitoring measurements; (iv) presence of constraints, pertaining to stochastic long-term limitations, due to resource scarcity and other infeasible/undesirable system responses. In this work, these challenges are addressed within a joint framework of constrained Partially Observable Markov Decision Processes (POMDP) and multi-agent Deep Reinforcement Learning (DRL). POMDPs optimally tackle (ii)-(iii), combining stochastic dynamic programming with Bayesian inference principles. Multi-agent DRL addresses (i), through deep function parametrizations and decentralized control assumptions. Challenge (iv) is herein handled through proper state augmentation and Lagrangian relaxation, with emphasis on life-cycle risk-based constraints and budget limitations. The underlying algorithmic steps are provided, and the proposed framework is found to outperform well-established policy baselines and facilitate adept prescription of inspection and intervention actions, in cases where decisions must be made in the most resource- and risk-aware manner.	翻訳日:2022-11-14 12:59:14 公開日:2020-07-02
# PerceptionGAN: 知覚理解を通じた、与えられたテキストからの実世界画像の構築 PerceptionGAN: Real-world Image Construction from Provided Text through Perceptual Understanding ( http://arxiv.org/abs/2007.00977v1 ) ライセンス: Link先を確認	Kanish Garg, Ajeet kumar Singh, Dorien Herremans, Brejesh Lall	(参考訳) 与えられた記述テキストから画像を生成することは、知覚情報(物体の形状、色、およびそれらの相互作用)を組み込みつつ、与えられたテキストとの高い関連性を確保することが難しいため、非常に難しい作業である。現在の方法では、まず、物体の形、色、オブジェクト間の相互作用が通常不規則である低解像度の初期画像を生成する。この初期画像はテキストの条件付けによって改善される。しかし,これらの手法は,初期生成画像の精細化においてテキスト表現を効率的に利用する問題に主に対処しているが,DM-GAN論文で指摘されているように,この精細化プロセスの成功は初期生成画像の品質に大きく依存する。そこで本研究では,識別器モジュールに知覚的理解を取り入れ,優れた初期画像を提供する手法を提案する。第1段階そのもので知覚情報を改善することで,最終生成画像が大幅に改善される。本稿では,新しいStackGANアーキテクチャにアプローチを適用した。そして、複数の段階で画像分布をモデル化しながら、初期画像に含まれる知覚情報が改善されることを示す。最後に,テキストで条件づけされた現実的な多色画像を生成した。これらの画像は、基本的な知覚情報の改善とともに高品質である。さらに重要なことに、提案手法は他の最先端テキストベース画像生成モデルのパイプラインに統合でき、初期低解像度画像を生成することができる。また,StackGANアーキテクチャにおけるジェネレータ-ディスクリミネータペアの第3段階の強化により,StackGANの精細化プロセスの改善にも取り組んだ。大規模だがスパースなデータセットMS COCOを用いた実験解析と最先端技術との比較により,提案手法の有効性がさらに検証された。 Generating an image from a provided descriptive text is quite a challenging task because of the difficulty in incorporating perceptual information (object shapes, colors, and their interactions) along with providing high relevancy related to the provided text. Current methods first generate an initial low-resolution image, which typically has irregular object shapes, colors, and interaction between objects. This initial image is then improved by conditioning on the text. However, these methods mainly address the problem of using text representation efficiently in the refinement of the initially generated image, while the success of this refinement process depends heavily on the quality of the initially generated image, as pointed out in the DM-GAN paper. Hence, we propose a method to provide good initialized images by incorporating perceptual understanding in the discriminator module. We improve the perceptual information at the first stage itself, which results in significant improvement in the final generated image. In this paper, we have applied our approach to the novel StackGAN architecture. We then show that the perceptual information included in the initial image is improved while modeling image distribution at multiple stages. Finally, we generated realistic multi-colored images conditioned by text. These images have good quality along with containing improved basic perceptual information. More importantly, the proposed method can be integrated into the pipeline of other state-of-the-art text-based-image-generation models to generate initial low-resolution images. We also worked on improving the refinement process in StackGAN by augmenting the third stage of the generator-discriminator pair in the StackGAN architecture. Our experimental analysis and comparison with the state-of-the-art on a large but sparse dataset MS COCO further validate the usefulness of our proposed approach.	翻訳日:2022-11-14 12:58:28 公開日:2020-07-02
# オブジェクト検出のための反復境界ボックスアノテーション Iterative Bounding Box Annotation for Object Detection ( http://arxiv.org/abs/2007.00961v1 ) ライセンス: Link先を確認	Bishwo Adhikari and Heikki Huttunen	(参考訳) デジタル画像におけるオブジェクト検出のための境界ボックスの手動アノテーションは退屈で、時間とリソースを消費する。本稿では,効率的なバウンディングボックスアノテーションのための半自動手法を提案する。この方法は、ラベル付き画像の小さなバッチにオブジェクト検出器を反復的に訓練し、次のバッチにバウンディングボックスを提案することを学習する。本稿では,人間の行動をシミュレーションし,アノテータにデータを提示する順序など,異なるイテレーション戦略を比較するための実験的なセットアップを提案する。提案手法を3つのデータセットを用いて実験し,人手による注記作業の75%を省くことにより,人手による注記作業を大幅に削減できることを示した。 Manual annotation of bounding boxes for object detection in digital images is tedious, and time and resource consuming. In this paper, we propose a semi-automatic method for efficient bounding box annotation. The method trains the object detector iteratively on small batches of labeled images and learns to propose bounding boxes for the next batch, after which the human annotator only needs to correct possible errors. We propose an experimental setup for simulating the human actions and use it for comparing different iteration strategies, such as the order in which the data is presented to the annotator. We experiment on our method with three datasets and show that it can reduce the human annotation effort significantly, saving up to 75% of total manual annotation work.	翻訳日:2022-11-14 12:57:59 公開日:2020-07-02
# 視覚的質問応答のためのシーングラフ推論 Scene Graph Reasoning for Visual Question Answering ( http://arxiv.org/abs/2007.01072v1 ) ライセンス: Link先を確認	Marcel Hildebrandt, Hang Li, Rajat Koner, Volker Tresp, Stephan G\"unnemann	(参考訳) 視覚的な質問応答は、画像に関する自由形式の質問に答えることに関するものである。問題に対する深い言語的理解と、画像に存在する様々なオブジェクトと関連付ける能力を必要とするため、これは野心的な課題であり、コンピュータビジョンと自然言語処理の両方の技法を必要とする。本研究では,シーン内に存在するオブジェクトとその意味的・空間的関係に基づいて,コンテキスト駆動の逐次推論を行うことによってタスクにアプローチする手法を提案する。最初のステップとして、画像内のオブジェクトとその属性とその相互関係を記述するシーングラフを導出する。強化エージェントは、抽出されたシーングラフを自律的にナビゲートして、回答を導出する基礎となるパスを生成する。我々は,手作業で収集したシーングラフを用いて,挑戦的なgqaデータセットを初めて実験的に検討した。 Visual question answering is concerned with answering free-form questions about an image. Since it requires a deep linguistic understanding of the question and the ability to associate it with various objects that are present in the image, it is an ambitious task and requires techniques from both computer vision and natural language processing. We propose a novel method that approaches the task by performing context-driven, sequential reasoning based on the objects and their semantic and spatial relationships present in the scene. As a first step, we derive a scene graph which describes the objects in the image, as well as their attributes and their mutual relationships. A reinforcement agent then learns to autonomously navigate over the extracted scene graph to generate paths, which are then the basis for deriving answers. We conduct a first experimental study on the challenging GQA dataset with manually curated scene graphs, where our method almost reaches the level of human performance.	翻訳日:2022-11-14 12:57:47 公開日:2020-07-02
# 深層マルチタスク学習と補助タスク学習の概観 A Brief Review of Deep Multi-task Learning and Auxiliary Task Learning ( http://arxiv.org/abs/2007.01126v1 ) ライセンス: Link先を確認	Partoo Vafaeikia, Khashayar Namdar, Farzad Khalvati	(参考訳) マルチタスク学習(mtl)は複数の学習タスクを同時に最適化し、共有情報を活用して各タスクの一般化とモデル予測を改善する。補助タスクをメインタスクに追加すれば、最終的にパフォーマンスが向上する。本稿では,近年のDeep Multi-task Learning (dMTL) アプローチについて簡単なレビューを行い,それに続いて,本タスクのモデルの性能向上のために,dMTLで使用できる有用な補助タスクを選択する方法について述べる。 Multi-task learning (MTL) optimizes several learning tasks simultaneously and leverages their shared information to improve generalization and the prediction of the model for each task. Auxiliary tasks can be added to the main task to ultimately boost the performance. In this paper, we provide a brief review on the recent deep multi-task learning (dMTL) approaches followed by methods on selecting useful auxiliary tasks that can be used in dMTL to improve the performance of the model for the main task.	翻訳日:2022-11-14 12:57:33 公開日:2020-07-02
# トレースノルム敵対的サンプル Trace-Norm Adversarial Examples ( http://arxiv.org/abs/2007.01855v1 ) ライセンス: Link先を確認	Ehsan Kazemi, Thomas Kerdreux and Liqiang Wang	(参考訳) ホワイトボックスの敵対的摂動は反復最適化アルゴリズムによって探索され、ほとんどの場合、元の画像の$l_p$近傍、いわゆる歪み集合上で敵対的損失を最小化する。敵対的探索を異なるノルムで制約すると、異なる構造を持つ敵対的サンプルが得られる。ここでは,構造を強調するアルゴリズムを用いていくつかの歪み集合について検討する。敵対的サンプルのためのこれらの新しい構造は、最適化においては広く普及しているものの、例えば、やはり$l_p$証明書しか提供しない敵対的ロバスト性の理論的証明にとっては課題となる。敵対的ロバスト性はまだ経験的な分野であるため、防御機構は異なる構造の攻撃に対しても合理的に評価されるべきである。さらに、これらの構造化された敵対的摂動は、知覚できないまま、あるいは画像の自然なわずかな歪みとして知覚される程度にとどまりながら、$l_p$の場合よりも大きな歪みを許容し得る。最後に、(局所的な)ぼやけなど、敵対的摂動の生成をある程度制御できる。 White box adversarial perturbations are sought via iterative optimization algorithms most often minimizing an adversarial loss on a $l_p$ neighborhood of the original image, the so-called distortion set. Constraining the adversarial search with different norms results in disparately structured adversarial examples. Here we explore several distortion sets with structure-enhancing algorithms. These new structures for adversarial examples, yet pervasive in optimization, are for instance a challenge for adversarial theoretical certification which again provides only $l_p$ certificates. Because adversarial robustness is still an empirical field, defense mechanisms should also reasonably be evaluated against differently structured attacks. Besides, these structured adversarial perturbations may allow for larger distortion sizes than their $l_p$ counterparts while remaining imperceptible or perceptible as natural slight distortions of the image. Finally, they allow some control over the generation of the adversarial perturbation, like (localized) blurriness.	翻訳日:2022-11-14 12:57:02 公開日:2020-07-02
# Decoder-free Robustness Disentanglement without (Additional) Supervision Decoder-free Robustness Disentanglement without (Additional) Supervision ( http://arxiv.org/abs/2007.01356v1 ) ライセンス: Link先を確認	Yifei Wang, Dan Peng, Furui Liu, Zhenguo Li, Zhitang Chen, Jiansheng Yang	(参考訳) 敵対的訓練(Adversarial Training, AT)は、入力からロバストな特徴のみを抽出することによって、機械学習モデルの敵対的脆弱性を軽減するために提案されているが、非ロバストだが有用な特徴を捨てるため、必然的に大幅な精度低下を招く。これが、ロバストな特徴と非ロバストな特徴の両方を保存し、分離表現学習(disentangled representation learning)でそれらを分離する動機となる。提案する敵対的非対称トレーニング(AAT)アルゴリズムは、ロバスト性に関する追加の教師信号なしに、ロバスト表現と非ロバスト表現を確実に分離することができる。実験結果から,本手法は2つの表現を組み合わせることで精度を保つだけでなく,従来の研究よりもはるかに良好な分離を実現することがわかった。 Adversarial Training (AT) is proposed to alleviate the adversarial vulnerability of machine learning models by extracting only robust features from the input, which, however, inevitably leads to severe accuracy reduction as it discards the non-robust yet useful features. This motivates us to preserve both robust and non-robust features and separate them with disentangled representation learning. Our proposed Adversarial Asymmetric Training (AAT) algorithm can reliably disentangle robust and non-robust representations without additional supervision on robustness. Empirical results show our method not only successfully preserves accuracy by combining two representations, but also achieves much better disentanglement than previous work.	翻訳日:2022-11-14 12:50:59 公開日:2020-07-02
# データサンプリングとマルチタスク最適化による新しいDNNトレーニングフレームワーク A Novel DNN Training Framework via Data Sampling and Multi-Task Optimization ( http://arxiv.org/abs/2007.01016v1 ) ライセンス: Link先を確認	Boyu Zhang, A. K. Qin, Hong Pan, Timos Sellis	(参考訳) 従来のDNNトレーニングパラダイムは通常、トレーニングに使用される注釈付きデータセット、すなわち総トレーニングセットを特定の方法で分割することで得られる、1つのトレーニングセットと1つの検証セットに依存している。トレーニングセットはモデルのトレーニングに使用され、検証セットは、過学習を避けるために、トレーニングの進行に伴って訓練中のモデルの汎化性能を推定するために使用される。このパラダイムには2つの大きな問題がある。第一に、検証セットは、テストデータとの潜在的なミスマッチのため、汎化性能の偏りのない推定をほとんど保証できない。第二に、DNNの訓練は複雑な最適化問題を解くことに相当し、劣った局所最適解に陥りやすいため、望ましくない訓練結果をもたらす。これらの課題に対処するために,我々は新しいDNNトレーニングフレームワークを提案する。このフレームワークは、ランダム分割により総トレーニングセットからトレーニングセットと検証セットのペアを複数生成し、一つのモデルトレーニングプロセスから得られた有用な知識(例えば、有望なネットワークパラメータ)をマルチタスク最適化によって他のモデルトレーニングプロセスに転送しながら、各ペア上で事前指定された構造のDNNモデルを訓練し、訓練されたすべてのモデルのうち、全ペアの検証セットにわたって総合的に最高の性能を持つモデルを出力する。この新フレームワークの特徴である知識転移機構は、モデルトレーニングプロセスが局所最適解から脱出するのを助けることでトレーニングの有効性を高めるだけでなく、他のモデルトレーニングプロセスから1つのモデルトレーニングプロセスに課される暗黙の正則化によって汎化性能も向上させることができる。提案するフレームワークを実装し,GPUクラスタ上で実装を並列化し,広く使用されているいくつかのDNNモデルのトレーニングに適用する。実験の結果,提案フレームワークは従来のトレーニングパラダイムよりも優れていることが示された。 Conventional DNN training paradigms typically rely on one training set and one validation set, obtained by partitioning an annotated dataset used for training, namely gross training set, in a certain way. The training set is used for training the model while the validation set is used to estimate the generalization performance of the trained model as the training proceeds to avoid over-fitting. There exist two major issues in this paradigm. Firstly, the validation set may hardly guarantee an unbiased estimate of generalization performance due to potential mismatching with test data. Secondly, training a DNN corresponds to solving a complex optimization problem, which is prone to getting trapped into inferior local optima and thus leads to undesired training results. To address these issues, we propose a novel DNN training framework. It generates multiple pairs of training and validation sets from the gross training set via random splitting, trains a DNN model of a pre-specified structure on each pair while making the useful knowledge (e.g., promising network parameters) obtained from one model training process to be transferred to other model training processes via multi-task optimization, and outputs the best, among all trained models, which has the overall best performance across the validation sets from all pairs. The knowledge transfer mechanism featured in this new framework can not only enhance training effectiveness by helping the model training process to escape from local optima but also improve on generalization performance via implicit regularization imposed on one model training process from other model training processes. We implement the proposed framework, parallelize the implementation on a GPU cluster, and apply it to train several widely used DNN models. Experimental results demonstrate the superiority of the proposed framework over the conventional training paradigm.	翻訳日:2022-11-14 12:50:12 公開日:2020-07-02
# オブジェクトやシーンを識別するように訓練されたCNNの隠れた層の中に、オブジェクト検出器はありますか? Are there any 'object detectors' in the hidden layers of CNNs trained to identify objects or scenes? ( http://arxiv.org/abs/2007.01062v1 ) ライセンス: Link先を確認	Ella M. Gale and Nicholas Martin and Ryan Blything and Anh Nguyen and Jeffrey S. Bowers	(参考訳) ニューラルネットワークの動作をよりよく理解するために、ユニット選択性を測定する様々な手法が開発されている。しかし、異なる尺度は選択性について異なる推定値を与えるため、選択的なオブジェクト表現が学習される条件や、これらの表現の機能的意義に関して異なる結論が導かれてきた。オブジェクト選択性をよりよく特徴付けるために,AlexNetの多数のユニットに対して様々な選択性尺度を比較した。比較した尺度は,局所主義的選択性,適合率,クラス条件付き平均活動選択性(CCMAS),ネットワーク分解(network dissection),活性化最大化(AM)画像の人間による解釈,標準的な信号検出尺度などである。異なる尺度はオブジェクト選択性について異なる推定値を与え、適合率とCCMASの尺度は誤解を招くほど高い推定値を与えることがわかった。実際、最も選択的なユニットでも、オブジェクト分類におけるヒット率が低いか誤警報率が高く(またはその両方であり)、オブジェクト検出器としては不十分であった。リカレントニューラルネットワークで報告された「おばあさん細胞」ユニットに少しでも近い選択性を持つユニットは見つからなかった。これらの結果を一般化するため,ImageNetまたはPlaces-365データセットで訓練されたVGG-16とGoogLeNetにおいて「オブジェクト検出器」と記述されてきたユニットの選択性尺度を比較した。ここでも、オブジェクト分類におけるヒット率の低さと誤警報率の高さが確認された。信号検出尺度は一般的な代替手法と比べて単一ユニット選択性の評価に優れており、画像分類の深層畳み込みネットワークは隠れ層においてオブジェクト検出器を学習しない、と結論づける。 Various methods of measuring unit selectivity have been developed with the aim of better understanding how neural networks work. But the different measures provide divergent estimates of selectivity, and this has led to different conclusions regarding the conditions in which selective object representations are learned and the functional relevance of these representations. In an attempt to better characterize object selectivity, we undertake a comparison of various selectivity measures on a large set of units in AlexNet, including localist selectivity, precision, class-conditional mean activity selectivity (CCMAS), network dissection, the human interpretation of activation maximization (AM) images, and standard signal-detection measures. We find that the different measures provide different estimates of object selectivity, with precision and CCMAS measures providing misleadingly high estimates. Indeed, the most selective units had a poor hit-rate or a high false-alarm rate (or both) in object classification, making them poor object detectors. We fail to find any units that are even remotely as selective as the 'grandmother cell' units reported in recurrent neural networks. In order to generalize these results, we compared selectivity measures on units in VGG-16 and GoogLeNet trained on the ImageNet or Places-365 datasets that have been described as 'object detectors'. Again, we find poor hit-rates and high false-alarm rates for object classification. We conclude that signal-detection measures provide a better assessment of single-unit selectivity compared to common alternative approaches, and that deep convolutional networks of image classification do not learn object detectors in their hidden layers.	翻訳日:2022-11-14 12:49:42 公開日:2020-07-02
# 専門家としての事実:記号的知識よりも適応可能で解釈可能なニューラルメモリ Facts as Experts: Adaptable and Interpretable Neural Memory over Symbolic Knowledge ( http://arxiv.org/abs/2007.00849v1 ) ライセンス: Link先を確認	Pat Verga, Haitian Sun, Livio Baldini Soares, William W. Cohen	(参考訳) 大規模言語モデルは現代のNLPモデリングの中核であり、膨大なコモンセンスと事実情報をエンコードすることが示されている。しかし、その知識はモデルの潜在パラメータ内にのみ存在し、検査や解釈にはアクセスできない。さらに悪いことに、トレーニングコーパスから記憶された事実情報は、世界が変化するにつれて、陳腐化する可能性が高い。パラメータとして格納された知識は、必然的にソース素材に固有のすべてのバイアスを示す。これらの問題に対処するため、記号的解釈可能な事実情報とサブシンボル的神経知識との明確なインターフェースを含むニューラル言語モデルを開発する。このモデルは2つの知識集約型質問応答タスクの性能を劇的に向上させる。さらに興味深いことに、モデルはシンボル表現を操作することで再トレーニングすることなく更新することができる。特にこのモデルは、以前のモデルでは不可能な方法で、新しい事実を追加し、既存の事実を上書きすることができます。 Massive language models are the core of modern NLP modeling and have been shown to encode impressive amounts of commonsense and factual information. However, that knowledge exists only within the latent parameters of the model, inaccessible to inspection and interpretation, and even worse, factual information memorized from the training corpora is likely to become stale as the world changes. Knowledge stored as parameters will also inevitably exhibit all of the biases inherent in the source materials. To address these problems, we develop a neural language model that includes an explicit interface between symbolically interpretable factual information and subsymbolic neural knowledge. We show that this model dramatically improves performance on two knowledge-intensive question-answering tasks. More interestingly, the model can be updated without re-training by manipulating its symbolic representations. In particular this model allows us to add new facts and overwrite existing ones in ways that are not possible for earlier models.	翻訳日:2022-11-14 12:48:48 公開日:2020-07-02
# より少ないことで達成できるのか? 有害コメント分類のためのデータ拡張の検討 Can We Achieve More with Less? Exploring Data Augmentation for Toxic Comment Classification ( http://arxiv.org/abs/2007.00875v1 ) ライセンス: Link先を確認	Chetanya Rastogi, Nikka Mofid, Fang-I Hsiao	(参考訳) 本稿では、機械学習における最大の制限の一つに対処する。具体的には、データ拡張技術と機械学習アルゴリズムの組み合わせを利用して、高精度な分類器を小さなデータセットから構築できるかどうかを検討する。本稿では,データ拡張(eda)とバックトランスレーション,およびロジスティック回帰(logistic regression),サポートベクターマシン(svm),双方向長短期記憶ネットワーク(bi-lstm)の3つの一般的な学習アルゴリズムについて実験を行う。実験のために、wikipedia toxic commentsデータセットを利用して、データ拡張の利点を探求する過程で、サイバーいじめやオンラインハラスメントに対抗するために、コメント中の有毒な発言を検出し分類するモデルを開発することができる。最終的に、データ拡張技術は分類器の性能を大幅に向上させ、NLP問題におけるデータの欠如に対処するための優れた戦略であることがわかった。 This paper tackles one of the greatest limitations in Machine Learning: Data Scarcity. Specifically, we explore whether high accuracy classifiers can be built from small datasets, utilizing a combination of data augmentation techniques and machine learning algorithms. In this paper, we experiment with Easy Data Augmentation (EDA) and Backtranslation, as well as with three popular learning algorithms, Logistic Regression, Support Vector Machine (SVM), and Bidirectional Long Short-Term Memory Network (Bi-LSTM). For our experimentation, we utilize the Wikipedia Toxic Comments dataset so that in the process of exploring the benefits of data augmentation, we can develop a model to detect and classify toxic speech in comments to help fight back against cyberbullying and online harassment. Ultimately, we found that data augmentation techniques can be used to significantly boost the performance of classifiers and are an excellent strategy to combat lack of data in NLP problems.	翻訳日:2022-11-14 12:48:33 公開日:2020-07-02
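
The toxic-comment entry above names Easy Data Augmentation (EDA) as one of its augmentation techniques. EDA is commonly described as four token-level operations: synonym replacement, random insertion, random swap, and random deletion. The sketch below illustrates only the last two, which need no thesaurus. It is a minimal illustration, not the authors' code: the function names, the default rates, and the alternation between the two operations are all assumptions made for this example.

```python
import random


def random_swap(words, n_swaps, rng):
    """Return a copy of `words` with `n_swaps` random position swaps applied."""
    out = list(words)
    if len(out) < 2:
        return out
    for _ in range(n_swaps):
        # Pick two distinct positions and exchange their tokens.
        i, j = rng.sample(range(len(out)), 2)
        out[i], out[j] = out[j], out[i]
    return out


def random_deletion(words, p, rng):
    """Drop each word independently with probability `p`, keeping at least one."""
    if not words:
        return []
    kept = [w for w in words if rng.random() >= p]
    # If every word was dropped, fall back to a single random word.
    return kept if kept else [rng.choice(words)]


def augment(sentence, n_aug=4, p_delete=0.1, n_swaps=1, seed=0):
    """Produce `n_aug` noisy variants of `sentence` (hypothetical helper)."""
    rng = random.Random(seed)  # seeded so runs are repeatable
    words = sentence.split()
    variants = []
    for k in range(n_aug):
        # Alternate between the two operations (an arbitrary choice here).
        if k % 2 == 0:
            new_words = random_swap(words, n_swaps, rng)
        else:
            new_words = random_deletion(words, p_delete, rng)
        variants.append(" ".join(new_words))
    return variants


if __name__ == "__main__":
    for variant in augment("this comment is perfectly polite and harmless"):
        print(variant)
```

Each variant would keep the label of the sentence it was derived from, so a small labeled set can be enlarged before training a classifier. Whether such noise helps on a given dataset is an empirical question.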

# HOM干渉法による非局在周波数時間Schr\"odinger cat-like状態の生成

Producing delocalized frequency-time Schr\"odinger cat-like states with HOM interferometry ( http://arxiv.org/abs/2003.11486v2 )

ライセンス: Link先を確認

N. Fabre, J. Belhassen, A. Minneci, S. Felicetti, A. Keller, M.I. Amanti, F. Baboux, T. Coudreau, S. Ducci, P. Milman

(参考訳) 1980年代後半、OuとMandelは、バランス型ビームスプリッターで干渉した2つの光子の非時間分解同時計数検出を行うことで、信号のうなりを実験的に観測した[Phys. Rev. Lett 61, 54 (1988)]。本研究では、この実験で観測された干渉縞パターンを、局所スペクトルフィルタリングにより生成された周波数Schr\"odinger cat-like状態のクロノサイクリックWigner分布の直接測定とみなす新たな解釈を提案する。また、この解析に基づき、このような周波数状態を測定するための時間分解HOM実験についても検討する。

In the late 80's, Ou and Mandel experimentally observed signal beatings by performing a non-time resolved coincidence detection of two photons having interfered in a balanced beam splitter [Phys. Rev. Lett 61, 54 (1988)]. In this work, we provide a new interpretation of the fringe pattern observed in this experiment as the direct measurement of the chronocyclic Wigner distribution of a frequency Schr\"odinger cat-like state produced by local spectral filtering. Based on this analysis, we also study a time-resolved HOM experiment to measure such a frequency state.

翻訳日:2023-05-27 22:35:22 公開日:2020-07-02

# 11本のDWDMチャネルで点灯された光パイプ上で500MHz動作するスペクトル整形連続変数QKD

Spectrally-Shaped Continuous-Variable QKD Operating at 500 MHz Over an Optical Pipe Lit by 11 DWDM Channels ( http://arxiv.org/abs/2004.11962v2 )

ライセンス: Link先を確認

D. Milovan\v{c}ev, N. Voki\'c, F. Laudenbach, C. Pacher, H. H\"ubel, and B. Schrenk

(参考訳) スペクトル整形と量子受信機帯域の最適利用により、22Mb/sのセキュア鍵レートをサポートする高レートCV-QKDを実証する。わずか20nmの間隔にある隣接する11本のキャリアグレードCバンドチャネルとの共存を、>10Mb/sで達成した。

We demonstrate high-rate CV-QKD supporting a secure-key rate of 22Mb/s through spectral tailoring and optimal use of quantum receiver bandwidth. Co-existence with 11 adjacent carrier-grade C-band channels spaced by only 20nm is accomplished at >10Mb/s.

翻訳日:2023-05-22 05:53:08 公開日:2020-07-02

# 捕捉イオンの2次元マイクロトラップアレイにおける高速ゲートを用いたスケーラブル量子計算

Scalable quantum computation with fast gates in two-dimensional microtrap arrays of trapped ions ( http://arxiv.org/abs/2005.00367v2 )

ライセンス: Link先を確認

Zain Mehdi and Alexander K. Ratcliffe and Joseph J. Hope

(参考訳) 2次元マイクロトラップアーキテクチャにおけるトラップイオン量子コンピューティングのための高速パルス2量子ビットゲートの利用について理論的に検討する。1次元では、そのような高速ゲートは最近接間で用いるときに最適であり、本研究では2次元配置への一般化を検討する。高速パルスゲートは、実験的に実証済みのレーザー繰り返し周波数で、隣接トラップのイオン間の高忠実度エンタングリング操作をトラップ周期よりも速く実行できることを示す。特に、ゲート時間を増加させることなく、数百イオンの大規模アレイでも高忠実度ゲートが実現可能であることがわかった。本提案の有用性を示すために、40モードFermi-Hubbardモデルのディジタルシミュレーションへのこれらのゲートの適用を検討する。これはまた、任意のイオン対を接続するために必要なゲートの連鎖が短くなることで、この配置が大規模計算に適したものとなる理由も示している。

We theoretically investigate the use of fast pulsed two-qubit gates for trapped ion quantum computing in a two-dimensional microtrap architecture. In one dimension, such fast gates are optimal when employed between nearest neighbours, and we examine the generalisation to a two-dimensional geometry. We demonstrate that fast pulsed gates are capable of implementing high-fidelity entangling operations between ions in neighbouring traps faster than the trapping period, with experimentally demonstrated laser repetition rates. Notably, we find that without increasing the gate duration, high-fidelity gates are achievable even in large arrays with hundreds of ions. To demonstrate the usefulness of this proposal, we investigate the application of these gates to the digital simulation of a 40-mode Fermi-Hubbard model. This also demonstrates why the shorter chains of gates required to connect arbitrary pairs of ions make this geometry well suited for large-scale computation.

翻訳日:2023-05-21 15:00:43 公開日:2020-07-02

# (2+1)次元Duffin-Kemmer-Petiau発振子の外部磁場における分割周波数

Splitting frequency of the (2+1)-dimensional Duffin-Kemmer-Petiau oscillator in an external magnetic field ( http://arxiv.org/abs/2005.01228v2 )

ライセンス: Link先を確認

Ignacio S. Gomez, Esdras S. Santos, Olavo Abla

(参考訳) 本研究では,(2+1)次元DKP発振器をDKP場の4x4および6x6表現を用いて再検討し,文献的考察を行った。スピンプロジェクションによりDKP発振器の周波数が分裂し, 発振器, 外界, スピン間の相互作用として生じ, そこからエネルギーと固有関数が統一的に表現されることがわかった。磁場の特定の臨界値に対して、スピン突起−1,1の成分の振動はキャンセルされる。振動のキャンセルが発生すると位相遷移が報告されるベクトルセクタの正準アンサンブルの熱力学について検討する。熱力学的ポテンシャルは、磁場の反転の下で分配関数が対称な高温限界における漸近的な表現に急速に収束する。

We revisit the (2+1)-dimensional DKP oscillator in an external magnetic field by means of 4x4 and 6x6 representations of the DKP field, thus obtaining several cases studied in the literature. We find a splitting in the frequency of the DKP oscillator according to the spin projection that arises as an interplay between the oscillator, the external field and the spin, from which the energies and the eigenfunctions are expressed in a unified way. For certain critical values of the magnetic field the oscillation in the components of spin projections -1 and 1 is cancelled. We study the thermodynamics of the canonical ensemble of the vectorial sector, where a phase transition is reported when the cancellation of the oscillation occurs. Thermodynamic potentials converge rapidly to their asymptotic expressions in the high-temperature limit, with the partition function symmetric under reversal of the magnetic field.

翻訳日:2023-05-21 05:29:51 公開日:2020-07-02

# ランダムホッピング項を持つ非エルミートSu-Schrieffer-Heegerモデルの固有値の統計的性質

Statistical properties of eigenvalues of the non-Hermitian Su-Schrieffer-Heeger model with random hopping terms ( http://arxiv.org/abs/2005.02705v2 )

ライセンス: Link先を確認

Ken Mochizuki, Naomichi Hatano, Joshua Feinberg, Hideaki Obuse

(参考訳) 仮想のオンサイトポテンシャルとランダムに分布するホッピング項を持つsu-schrieffer-heegerモデルの非エルミート版の固有値統計を考察する。我々は、ハミルトニアンの構造のため、固有値は、パリティと時間反転対称性がなくても、あるパラメータの範囲で純粋に実数であることを発見した。このように、純粋な実スペクトルの場合、レベル統計はガウス直交のアンサンブルのものである。これは、固有値が純粋に実数である非エルミート的ハミルトニアンが、元のハミルトニアンの対称性を継承するエルミート的ハミルトニアンに写像できることを明らかにする一般的な特徴である。スペクトルが虚数固有値を含む場合、状態(DOS)の密度は原点で消滅し、虚数軸上のスペクトルエッジで発散することを示す。我々は、DOSの発散は、カイラル対称な1次元エルミート系のダイソン特異点から発生し、ハーミート系と異なるDOSの同素体を解析的に導出したことを示す。

We explore the eigenvalue statistics of a non-Hermitian version of the Su-Schrieffer-Heeger model, with imaginary on-site potentials and randomly distributed hopping terms. We find that owing to the structure of the Hamiltonian, eigenvalues can be purely real in a certain range of parameters, even in the absence of parity and time-reversal symmetry. As it turns out, in this case of a purely real spectrum, the level statistics is that of the Gaussian orthogonal ensemble. This demonstrates a general feature, which we clarify: a non-Hermitian Hamiltonian whose eigenvalues are purely real can be mapped to a Hermitian Hamiltonian which inherits the symmetries of the original Hamiltonian. When the spectrum contains imaginary eigenvalues, we show that the density of states (DOS) vanishes at the origin and diverges at the spectral edges on the imaginary axis. We show that the divergence of the DOS originates from the Dyson singularity in chiral-symmetric one-dimensional Hermitian systems and derive analytically the asymptotes of the DOS, which are different from those in Hermitian systems.
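
As a rough numerical illustration of the reality condition described above, the following Python sketch builds one possible realization of such a model. The staggered form of the imaginary potential ($+i\gamma$ and $-i\gamma$ on alternating sites), the uniform hopping distribution, and the names `ssh_hamiltonian` and `spectrum_is_real` are assumptions made for this sketch, not details taken from the paper. Under that assumption the Hermitian hopping part $H_0$ anticommutes with the staggering matrix, so $H^2 = H_0^2 - \gamma^2$ and the eigenvalues are $\pm\sqrt{\varepsilon^2-\gamma^2}$, which are all real as long as $\gamma$ stays below the smallest $|\varepsilon|$ of $H_0$.

```python
import numpy as np

def ssh_hamiltonian(hoppings, gamma):
    """Open chain with nearest-neighbour hoppings and a staggered
    imaginary on-site potential +i*gamma, -i*gamma, ... (assumed form)."""
    n = len(hoppings) + 1
    h = np.zeros((n, n), dtype=complex)
    for j, t in enumerate(hoppings):
        h[j, j + 1] = t
        h[j + 1, j] = t
    signs = np.array([1 if j % 2 == 0 else -1 for j in range(n)])
    h += 1j * gamma * np.diag(signs)
    return h

def spectrum_is_real(h, tol=1e-8):
    """True if every eigenvalue has a negligible imaginary part."""
    return bool(np.all(np.abs(np.linalg.eigvals(h).imag) < tol))

# Random hoppings for a 20-site chain (distribution chosen for illustration).
rng = np.random.default_rng(0)
hoppings = rng.uniform(0.5, 1.5, size=19)

# Spectrum of the Hermitian hopping part and its smallest |eigenvalue|.
eps = np.linalg.eigvalsh(ssh_hamiltonian(hoppings, 0.0).real)
gap = np.min(np.abs(eps))

print("real spectrum for gamma = 0.5*gap:", spectrum_is_real(ssh_hamiltonian(hoppings, 0.5 * gap)))
print("real spectrum for gamma = 2.0*gap:", spectrum_is_real(ssh_hamiltonian(hoppings, 2.0 * gap)))
```

The sketch only probes whether the spectrum is real; studying level statistics such as the Gaussian orthogonal ensemble behaviour mentioned in the abstract would need many disorder realizations and an unfolding step, which are not attempted here.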

翻訳日:2023-05-21 00:48:54 公開日:2020-07-02

# ローレンツボソニック貯留層と有限相関時間をもつ確率環境における2レベル系の非マルコフ的デコヒーレンス

Non-Markovian decoherence of a two-level system in a Lorentzian bosonic reservoir and a stochastic environment with finite correlation time ( http://arxiv.org/abs/2006.14055v2 )

ライセンス: Link先を確認

V. A. Mikhailov, N. V. Troshkin

(参考訳) 本稿では,外部の古典変動環境の影響下で,ボゾン浴における2レベル系の非マルコフ進化について検討する。浴槽との相互作用はローレンツスペクトル密度を持ち、変動する環境(確率場)はオルンシュタイン-ウレンベック過程の集合で表される。合成環境のそれぞれのサブ環境は、2レベル系の非マルコフ力学を誘導することができる。本研究では, 階層的運動方程式の数値的厳密な手法を用いて, 2段階系の定常状態, 密度行列の低減, 平衡放射スペクトルの周波数カットオフ, サブ環境の結合強度依存性について検討した。また,浴槽との相互作用に用いる回転波近似(RWA)の精度への影響について検討した。

In this paper we investigate the non-Markovian evolution of a two-level system (qubit) in a bosonic bath under the influence of an external classical fluctuating environment. The interaction with the bath has a Lorentzian spectral density, and the fluctuating environment (stochastic field) is represented by a set of Ornstein-Uhlenbeck processes. Each of the subenvironments of the composite environment is able to induce non-Markovian dynamics of the two-level system. By means of the numerically exact method of hierarchical equations of motion, we study the dependence of the steady states of the two-level system, the reduced density matrix evolution and the equilibrium emission spectra on the frequency cutoffs and coupling strengths of the subenvironments. Additionally, we investigate the impact of the rotating-wave approximation (RWA) used for the interaction with the bath on the accuracy of the results.

翻訳日:2023-05-12 22:07:18 公開日:2020-07-02

# ホルスタイン模型の高温における拡散とフラクタル幾何学の欠如

Absence of diffusion and fractal geometry in the Holstein model at high temperature ( http://arxiv.org/abs/2007.00817v1 )

ライセンス: Link先を確認

Chen-Yen Lai and S. A. Trugman

(参考訳) ホルスタインモデルにおいて, 分散のない光フォノンに結合した電子の高温でのダイナミクスについて検討した。電子は光フォノンによって繰り返し散乱されるため、従来の力学は拡散すると考えられている。しかし、半古典的な近似では、運動は拡散しない。 1次元では、電子は一定の方向に移動し、向きを変えない。 2次元では、電子は続いてフラクタル軌道を辿り続けます。これらの非標準ダイナミクスの側面は、電子とフォノンダイナミクスの完全な量子計算を含む、より正確な計算に保持される。

We investigate the dynamics of an electron coupled to dispersionless optical phonons in the Holstein model, at high temperatures. The dynamics is conventionally believed to be diffusive, as the electron is repeatedly scattered by optical phonons. In a semiclassical approximation, however, the motion is not diffusive. In one dimension, the electron moves in a constant direction and does not turn around. In two dimensions, the electron follows and then continues to retrace a fractal trajectory. Aspects of these nonstandard dynamics are retained in more accurate calculations, including a fully quantum calculation of the electron and phonon dynamics.

翻訳日:2023-05-11 21:00:09 公開日:2020-07-02

# 対称情報完全測定の化合物とその量子鍵分布への応用

Compounds of symmetric informationally complete measurements and their application in quantum key distribution ( http://arxiv.org/abs/2007.01007v1 )

ライセンス: Link先を確認

Armin Tavakoli, Ingemar Bengtsson, Nicolas Gisin and Joseph M. Renes

(参考訳) 対称情報完備測度(SIC)は、ヒルベルト空間におけるエレガントで、祝福され、広く有用な離散構造である。いくつかのSICで合成されたより洗練された離散構造を導入する。 SIC-化合物は、$d$次元ヒルベルト空間における$d^3$ベクトルの集合として定義され、$d$SICと$d^2$正規基底の2つの異なる方法で分割できる。先入観は$d>2$のときはありそうにないが、$d=4$の明示的な構成によって、驚くほど正に答える。驚くべきことに、sic-compoundは量子状態の識別によって明らかにされるように、相互に偏りのない基底との密接な関係を認める。基本的考察を超えて、これらのエキゾチックな性質を利用して量子鍵分布のプロトコルを構築し、一般的な盗聴攻撃の下でそのセキュリティを分析する。 sic-compound は 6 状態プロトコルの一般化が成功するのを防げるほど大きなエラーが存在する場合にセキュアな鍵生成を可能にする。

Symmetric informationally complete measurements (SICs) are elegant, celebrated and broadly useful discrete structures in Hilbert space. We introduce a more sophisticated discrete structure compounded by several SICs. A SIC-compound is defined to be a collection of $d^3$ vectors in $d$-dimensional Hilbert space that can be partitioned in two different ways: into $d$ SICs and into $d^2$ orthonormal bases. While a priori their existence may appear unlikely when $d>2$, we surprisingly answer it in the positive through an explicit construction for $d=4$. Remarkably this SIC-compound admits a close relation to mutually unbiased bases, as is revealed through quantum state discrimination. Going beyond fundamental considerations, we leverage these exotic properties to construct a protocol for quantum key distribution and analyze its security under general eavesdropping attacks. We show that SIC-compounds enable secure key generation in the presence of errors that are large enough to prevent the success of the generalisation of the six-state protocol.
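
To make the building block concrete: a SIC in dimension $d$ is a set of $d^2$ unit vectors whose pairwise overlaps satisfy $|\langle\psi_i|\psi_j\rangle|^2 = 1/(d+1)$. The Python sketch below checks this property for the familiar qubit case ($d=2$), where the four Bloch vectors form a regular tetrahedron. It only illustrates the definition and is not the $d=4$ construction of the paper; the helper names `bloch_state` and `is_sic` are made up for this example.

```python
import numpy as np

def bloch_state(n):
    """Qubit state vector whose Bloch vector is the unit vector n."""
    x, y, z = n
    theta = np.arccos(z)
    phi = np.arctan2(y, x)
    return np.array([np.cos(theta / 2), np.exp(1j * phi) * np.sin(theta / 2)])

def is_sic(vectors, tol=1e-9):
    """Check the SIC condition: d^2 unit vectors with equal overlaps 1/(d+1)."""
    d = len(vectors[0])
    if len(vectors) != d * d:
        return False
    for i, u in enumerate(vectors):
        for j, v in enumerate(vectors):
            overlap = abs(np.vdot(u, v)) ** 2
            target = 1.0 if i == j else 1.0 / (d + 1)
            if abs(overlap - target) > tol:
                return False
    return True

# Vertices of a regular tetrahedron on the Bloch sphere.
tetrahedron = np.array([[1, 1, 1], [1, -1, -1], [-1, 1, -1], [-1, -1, 1]]) / np.sqrt(3)
qubit_sic = [bloch_state(n) for n in tetrahedron]

print("tetrahedron states form a SIC:", is_sic(qubit_sic))
```

The counting in the abstract follows the same pattern: $d$ SICs of $d^2$ vectors each, or $d^2$ orthonormal bases of $d$ vectors each, both give $d^3$ vectors in total.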

翻訳日:2023-05-11 20:58:16 公開日:2020-07-02

# 実量子コンピューティングデバイス上での量子アルゴリズムの実現

Realizing Quantum Algorithms on Real Quantum Computing Devices ( http://arxiv.org/abs/2007.01000v1 )

ライセンス: Link先を確認

Carmen G. Almudever, Lingling Lao, Robert Wille, Gian Giacomo Guerreschi

(参考訳) 量子コンピューティングは現在、学術的なアイデアから現実へと移行している。クラウド上の量子コンピューティングはすでに利用可能であり、世界中のユーザが実際の量子アルゴリズムを開発し実行することができる。しかし、Google、IBM、Rigetti、Intel、IonQ、Xanaduといった新技術に多大な投資をしている企業は、さまざまな技術的アプローチに従う。これにより、これまで利用可能な量子コンピューティングデバイスが大幅に異なる状況になった。それらは主に、キュービットの数と種類とそれらの間の接続性で異なる。そのため、所定の量子コンピューティングデバイス上で意図された量子機能を実現するための様々な方法が利用可能である。本稿では,この領域の紹介と概要を提供し,コンパイラ,マッパー,シンセサイザー,トランスパイラ,ルータなど,対応する手法について述べる。

Quantum computing is currently moving from an academic idea to a practical reality. Quantum computing in the cloud is already available and allows users from all over the world to develop and execute real quantum algorithms. However, companies which are heavily investing in this new technology, such as Google, IBM, Rigetti, Intel, IonQ, and Xanadu, follow diverse technological approaches. This has led to a situation where substantially different quantum computing devices are available thus far. They mostly differ in the number and kind of qubits and the connectivity between them. Because of that, various methods for realizing the intended quantum functionality on a given quantum computing device are available. This paper provides an introduction to and overview of this domain and describes the corresponding methods, also referred to as compilers, mappers, synthesizers, transpilers, or routers.

翻訳日:2023-05-11 20:57:47 公開日:2020-07-02

# オープン量子ランダムウォークから量子ウォークへのクロスオーバー

A crossover between open quantum random walks to quantum walks ( http://arxiv.org/abs/2007.00940v1 )

ライセンス: Link先を確認

Norio Konno, Kaname Matsue, Etsuo Segawa

(参考訳) 我々は,デコヒーレンス効果を制御するパラメータ $M\in \mathbb{N}$ を用いて,開量子ランダムウォークと量子ウォークを連続的に接続する中間ウォークを提案する。$M=1$ の場合,そのウォークは開量子ランダムウォークと一致し,$M=\infty$ の場合は量子ウォークと一致する。我々は、$M=\infty$と$M=1$に対して$\mathbb{Z}$上の通常の確率測度を回復する測度を定義し、様々な正の値$M$に対する数値シミュレーションを通して中間挙動を観察する。$M=2$の場合、開量子ランダムウォークからパラメータがわずかに離れただけでも、量子ウォークの典型的な挙動が現れることを解析的に示す。より正確には、左右への弾道的な移動と、このウォーカーの局在化を同時に観察する。解析は、線型作用素に対する加藤の摂動理論に基づいている。この極限定理をより詳細に分析し、上記の3つのモードがガウス分布によって記述されることを示す。

We propose an intermediate walk continuously connecting an open quantum random walk and a quantum walk, with a parameter $M\in \mathbb{N}$ controlling a decoherence effect; if $M=1$, the walk coincides with an open quantum random walk, while if $M=\infty$, the walk coincides with a quantum walk. We define a measure which recovers the usual probability measures on $\mathbb{Z}$ for $M=\infty$ and $M=1$, and we observe intermediate behavior through numerical simulations for various positive values of $M$. In the case $M=2$, we analytically show that typical behavior of quantum walks appears even for a small departure of the parameter from the open quantum random walk. More precisely, we simultaneously observe both ballistic motion towards the left and right sides and localization of the walker. The analysis is based on Kato's perturbation theory for linear operators. We further analyze this limit theorem in more detail and show that the above three modes are described by Gaussian distributions.
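
For readers unfamiliar with the ballistic behaviour referred to above, the following Python sketch simulates a standard Hadamard-coined quantum walk on $\mathbb{Z}$. This is a textbook example of a fully coherent walk, not the intermediate walk of the paper; the Hadamard coin, the symmetric initial coin state, and the function names `hadamard_walk` and `spread` are all choices made for this illustration. The point is only that the standard deviation of the position grows linearly with the number of steps, whereas a classical random walk spreads like the square root of the number of steps.

```python
import numpy as np

def hadamard_walk(steps):
    """Position distribution after `steps` steps of a Hadamard-coined walk on Z."""
    size = 2 * steps + 1                      # large enough that nothing wraps around
    psi = np.zeros((size, 2), dtype=complex)  # psi[x, c]: amplitude at site x, coin c
    psi[steps] = np.array([1, 1j]) / np.sqrt(2)   # start at the origin, symmetric coin state
    coin = np.array([[1, 1], [1, -1]]) / np.sqrt(2)
    for _ in range(steps):
        psi = psi @ coin.T                    # apply the coin at every site
        psi[:, 0] = np.roll(psi[:, 0], -1)    # coin component 0 moves one site left
        psi[:, 1] = np.roll(psi[:, 1], 1)     # coin component 1 moves one site right
    return (np.abs(psi) ** 2).sum(axis=1)

def spread(prob):
    """Standard deviation of the position for a distribution centred on the array."""
    x = np.arange(len(prob)) - len(prob) // 2
    mean = (x * prob).sum()
    return np.sqrt(((x - mean) ** 2 * prob).sum())

for t in (50, 100, 200):
    print(t, spread(hadamard_walk(t)), np.sqrt(t))
```

Doubling the number of steps roughly doubles the quantum spread, which is the ballistic signature; the last column shows the diffusive $\sqrt{t}$ scale for comparison.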

翻訳日:2023-05-11 20:57:15 公開日:2020-07-02

# 非平衡定常状態における開量子系のゆらぎ-散逸関係

Fluctuation-Dissipation Relation for Open Quantum Systems in Nonequilibrium Steady State ( http://arxiv.org/abs/2007.00906v1 )

ライセンス: Link先を確認

Jen-Tsung Hsiang and Bei-Lok Hu

(参考訳) 線形および非線形開量子系における揺動散逸関係(FDR)の性質と存在についての研究を続ける [1-3] ここでは、線形系が非平衡定常状態(NESS)にある場合について考察する。 2つの振動子(両端が短い高調波鎖として考えられている)のモデルにより、異なる温度の熱浴にそれぞれ接続されるので、浴との相互作用によって鎖が完全に緩和されたとき、1つの浴における鎖の散逸核のノイズ核と虚部を結ぶ関係は、平衡の場合において従来のfdrの形状を仮定しない。風呂の温度とオシレータ間結合強度の違いに依存する「バイアス電流」という用語も存在します。さらに,この用語は,NESSにおける2つの浴場間の定常的な熱流に関連していることを示す。熱間交換(浴槽と終端振動子の間)と熱内伝達(チェーン内)のリアルタイムな発展を知る能力と、システムのパラメータへの依存は量子化可能な制御と量子熱エンジンや熱デバイスの設計において可能性を与える。

Continuing our work on the nature and existence of fluctuation-dissipation relations (FDR) in linear and nonlinear open quantum systems [1-3], here we consider such relations when a linear system is in a nonequilibrium steady state (NESS). With a model of two oscillators (considered as a short harmonic chain with two ends), each connected to a thermal bath at a different temperature, we find that when the chain is fully relaxed due to interaction with the baths, the relation that connects the noise kernel and the imaginary part of the dissipation kernel of the chain in one bath does not assume the conventional form for the FDR in equilibrium cases. There exists an additional term, which we call the `bias current', that depends on the difference of the baths' initial temperatures and the inter-oscillator coupling strength. We further show that this term is related to the steady heat flow between the two baths when the system is in NESS. The ability to know the real-time development of the inter-heat exchange (between the baths and the end-oscillators) and the intra-heat transfer (within the chain) and their dependence on the parameters in the system offers possibilities for quantifiable control and in the design of quantum heat engines or thermal devices.

翻訳日:2023-05-11 20:56:56 公開日:2020-07-02

# 量子古典対応を用いた偏光関連光子対源のスペクトルマッピング

Spectral mapping of polarization-correlated photon-pair sources using quantum-classical correspondence ( http://arxiv.org/abs/2007.00880v1 )

ライセンス: Link先を確認

Hung-Pin Chung, Pawan Kumar, Kai Wang, Olivier Bernard, Chinmay Shirpurkar, Wen-Chiuan Su, Thomas Pertsch, Andrey A. Sukhorukov, Yen-Hung Chen and Frank Setzpfandt

(参考訳) 量子光子対光源の直接スペクトル特性は、通常、煩雑でコストがかかり時間を要する検出問題を伴う。本研究では,チタン拡散型周期分極型ニオブ酸リチウム(Ti:PPLN)導波路に基づく,II型位相整合型自発的パラメトリックダウンコンバージョン(SPDC)源のスペクトル特性を実験的に評価した。生成した直交偏光光子対のスペクトル情報のキャラクタリゼーションは、量子情報や通信を含むアプリケーションにおけるそのようなソースの使用において重要である。直交偏光光子対源の結合スペクトル強度は、古典的和周波発生(SFG)測定による量子古典対応を用いて完全に再構成できることを実証する。この手法は可視光に対するはるかに単純な検出システムを用いるため、(偏光相関)光子対源の量子状態の迅速な監視と制御が可能となり、安定で高可用性の量子源の実現が容易になる。

Direct spectral characterization of a quantum photon-pair source usually involves cumbersome, costly, and time-consuming detection issues. In this study, we experimentally characterize the spectral properties of a type-II phase-matched spontaneous parametric down-conversion (SPDC) source based on a titanium-diffused periodically poled lithium niobate (Ti:PPLN) waveguide. The characterization of the spectral information of the generated cross-polarized photon pairs is of importance for the use of such sources in applications including quantum information and communication. We demonstrate that the joint spectral intensity of the cross-polarized photon-pair source can be fully reconstructed using the quantum-classical correspondence through classical sum-frequency generation (SFG) measurements. This technique, which uses a much less complex detection system for visible light, opens the possibility of fast monitoring and control of the quantum state of (polarization-correlated) photon-pair sources to facilitate the realization of a stable and high-usability quantum source.

翻訳日:2023-05-11 20:55:56 公開日:2020-07-02

# 実時間形式論における熱場理論:粒子崩壊の概念と応用

Thermal Field Theory in real-time formalism: concepts and applications for particle decays ( http://arxiv.org/abs/2007.01224v1 )

ライセンス: Link先を確認

Torbjörn Lundberg and Roman Pasechnik

(参考訳) 本総説は, 湯川型理論における熱場理論(TFT)の概念と重要な成果について, 詳細かつ包括的に考察したものである。まず、想像とリアルタイムの定式化におけるTFTの一般的な導入から始める。現象学的に関連した意味として, 実時間形式論の枠組みで計算されたいくつかの典型的な反応について, 文献に見られる想像上の結果と比較し, 熱減衰率の概観を示す。ここで考慮された過程は、中性(擬似)スカラーが2つの異なる(擬似)スカラーまたはフェルミオン-反フェルミオン対に崩壊する過程である。これらの過程は、化学ポテンシャルと最終状態の異なる種を含むように、以前の研究から拡張される。また,フェルミオン線からの(pseudo)スカラー放射についても考察した。これらの結果は、高温および密度の系における多くの現象学的応用に関連する粒子崩壊観測における熱的効果の重要性を示している。

This review represents a detailed and comprehensive discussion of the Thermal Field Theory (TFT) concepts and key results in Yukawa-type theories. We start with a general pedagogical introduction into the TFT in the imaginary- and real-time formulation. As phenomenologically relevant implications, we present a compendium of thermal decay rates for several typical reactions calculated within the framework of the real-time formalism and compared to the imaginary-time results found in the literature. Processes considered here are those of a neutral (pseudo)scalar decaying into two distinct (pseudo)scalars or into a fermion-antifermion pair. These processes are extended from earlier works to include chemical potentials and distinct species in the final state. In addition, a (pseudo)scalar emission off a fermion line is also discussed. These results demonstrate the importance of thermal effects in particle decay observables relevant in many phenomenological applications in systems at high temperatures and densities.

翻訳日:2023-05-11 20:47:45 公開日:2020-07-02

# 忠実度アプローチ:励起エネルギーの有限サイズ準位交差が果たす誤解を招く役割

Fidelity approach: the misleading role of finite size level crossing of excited energies ( http://arxiv.org/abs/2007.01186v1 )

ライセンス: Link先を確認

Somayyeh Nemati, Fatemeh Khastehdel Fumani, Saeed Mahdavifar

(参考訳) ここでは、量子忠実度は、横磁場中で最近接および次近接相互作用が競合する1次元スピン1/2量子イジングモデルの2つの量子相転移を確かに識別できるものの、その基底状態の相図の解析には適さない可能性があることを示す。

Here, we show that, although quantum fidelity can truly identify two quantum phase transitions of a one-dimensional spin-1/2 quantum Ising model with competing nearest and next-nearest neighbour interactions in a transverse magnetic field, it may not be a suitable approach for analyzing its ground-state phase diagram.

翻訳日:2023-05-11 20:47:02 公開日:2020-07-02

# 量子重ね合わせを測る(または「観測できることを決定するのは理論のみ」)。

Measuring Quantum Superpositions (Or, "It is only the theory which decides what can be observed.") ( http://arxiv.org/abs/2007.01146v1 )

ライセンス: Link先を確認

Christian de Ronde

(参考訳) 本研究は、量子力学(QM)の基礎に関する文献に広く見られる「重ね合わせが実験室で実際に観測されることはない」という正統的な主張に反論するものである。そのため、我々は有名な測定問題に対する批判的な分析を行うことから始めるが、これは「理論」の特定の理解の下に量子形式論を包摂しようとする経験実証主義的要件の厳密な適用に起因していると論じる。この文脈において、射影仮説(または測定規則)のアドホックな導入は、観察が「常識」経験の自明な所与であると仮定するナイーブな経験主義的な視点から来る必要条件として理解することができる。次に我々は、QMの2つの「非崩壊」解釈(つまり、様相解釈と多世界解釈)に注意を向ける。これらは「崩壊」が実際の物理過程であることを否定しながらも、測定規則を理論の必要な要素として保持している。対照的に、「何が観測できるのかを決めるのは理論のみである」というアインシュタインの主張に従い、理論的な前提から「観察」が導かれる「物理理論」の実在論的な表現的理解への回帰を提案する。この観点から、直観的(anschaulich)な方法で量子現象を理解できるような、新しい非古典的な概念表現について議論する。射影仮説を捨て去り、量子重ね合わせを計測・観測するための一般的な物理条件について考察する。

In this work we attempt to confront the orthodox widespread claim present in the foundational literature of Quantum Mechanics (QM) according to which 'superpositions are never actually observed in the lab'. In order to do so, we begin by providing a critical analysis of the famous measurement problem which, we will argue, originated from the strict application of the empirical-positivist requirements to subsume the quantum formalism under their specific understanding of 'theory'. In this context, the ad hoc introduction of the projection postulate (or measurement rule) can be understood as a necessary requirement coming from a naive empiricist standpoint which presupposes that observations are self-evident givens of "common sense" experience --independent of metaphysical (categorical) presuppositions. We then turn our attention to two "non-collapse" interpretations of QM --namely, modal and many worlds-- which, even though they deny that the "collapse" is a real physical process, nevertheless retain the measurement rule as a necessary element of the theory. In contraposition, following Einstein's claim according to which "it is only the theory which decides what can be observed", we propose a return to the realist representational understanding of 'physical theories' in which 'observation' is considered as derived from theoretical presuppositions. It is from this standpoint that we discuss a new non-classical conceptual representation which allows us to understand quantum phenomena in an intuitive (anschaulich) manner. Leaving behind the projection postulate, we discuss the general physical conditions for measuring and observing quantum superpositions.

翻訳日:2023-05-11 20:46:56 公開日:2020-07-02

# ビデオ会議のプライバシーとセキュリティの脅威にズームインする

Zooming Into Video Conferencing Privacy and Security Threats ( http://arxiv.org/abs/2007.01059v1 )

ライセンス: Link先を確認

Dima Kagan, Galit Fuhrmann Alpert, Michael Fire

(参考訳) 新型コロナウイルス(covid-19)パンデミック(covid-19)のパンデミックは、関連するソーシャルディスタンシングやシェルターインプレイス対策と共に、人々が互いにコミュニケーションする方法に劇的に影響を与え、人々が協力し、研究し、特別な機会を祝い、家族や友人と会うための新しい方法を見つけることを余儀なくされている。最も人気のあるソリューションの1つは、対面ミーティングを仮想ミーティングに置き換えるためにビデオ会議アプリケーションを使用することである。これにより、ビデオ会議ユーザーの数が前例のない増加となった。本研究では,仮想会議に参加することで危険にさらされる可能性のあるプライバシー問題について検討した。 web上に公開されている会議参加者のコラージュ画像から個人情報を抽出した。我々は、画像処理、テキスト認識ツール、およびソーシャルネットワーク分析を使用して、15,700コラージュ画像、142,000枚以上の参加者の顔画像のウェブクローリングデータセットを探索した。ビデオ会議のユーザは、セキュリティとプライバシーの脅威に悩まされている。以上の結果から,ビデオ会議の公開画像数千枚を集め,参加者の顔画像,年齢,性別,ユーザ名,時にはフルネームなどの個人情報を抽出することは比較的容易であることが示唆された。この種の抽出データは、オンラインと現実世界の両方で人々のセキュリティとプライバシを著しく、容易に危険に晒し、大人だけでなく、幼児や高齢者のような社会のより脆弱な部分にも影響を及ぼす。最後に, 顔画像データとソーシャルネットワークデータとの相互参照により, 参加者が気付いていないかもしれない追加のプライバシーリスクが生じる可能性があること, ビデオ会議の会議に現れるユーザを識別できること, ターゲット個人に関する情報の異なる情報源を悪意的に集約する可能性を示す。

The COVID-19 pandemic outbreak, with its related social distancing and shelter-in-place measures, has dramatically affected ways in which people communicate with each other, forcing people to find new ways to collaborate, study, celebrate special occasions, and meet with family and friends. One of the most popular solutions that have emerged is the use of video conferencing applications to replace face-to-face meetings with virtual meetings. This resulted in unprecedented growth in the number of video conferencing users. In this study, we explored privacy issues that may be at risk by attending virtual meetings. We extracted private information from collage images of meeting participants that are publicly posted on the Web. We used image processing, text recognition tools, as well as social network analysis to explore our web crawling curated dataset of over 15,700 collage images, and over 142,000 face images of meeting participants. We demonstrate that video conference users are facing prevalent security and privacy threats. Our results indicate that it is relatively easy to collect thousands of publicly available images of video conference meetings and extract personal information about the participants, including their face images, age, gender, usernames, and sometimes even full names. This type of extracted data can vastly and easily jeopardize people's security and privacy both in the online and real-world, affecting not only adults but also more vulnerable segments of society, such as young children and older adults. Finally, we show that cross-referencing facial image data with social network data may put participants at additional privacy risks they may not be aware of and that it is possible to identify users that appear in several video conference meetings, thus providing a potential to maliciously aggregate different sources of information about a target individual.

翻訳日:2023-05-11 20:46:08 公開日:2020-07-02

# Rydberg Noisy-Dressingとソリトン分子および液滴準結晶の作製への応用

Rydberg Noisy-Dressing and applications in making soliton-molecules and droplet quasi-crystals ( http://arxiv.org/abs/2007.01039v1 )

ライセンス: Link先を確認

Mohammadsadegh Khazali

(参考訳) 現在の超低温原子と原子トラップの分野の進歩は、新しい制御可能な長距離相互作用を思い出させる。これらの相互作用は、実現可能な量子アルゴリズムの範囲を広げ、新しいタイプの量子問題に対する新しい制御機構を提供すると期待されている。本稿では,レーザーの線幅を操作することで,ライドバーグ型原子間の特別な原子間相互作用について述べる。この新しい相互作用は、高原とガウス峰を含むハイブリッド空間プロファイルを特徴としている。これらの相互作用要素の符号と強度の動的個別制御と組み合わせて、Rydberg noisy-dressing (RnD) スキームは量子技術のための貴重な相互作用ツールボックスを提供する。例えば、RnDの安定な3Dソリトン分子の合成や準周期性液滴結晶の形成への応用について論じる。

Current advances in the field of ultra-cold atoms and atomic traps call for new controllable long-range interactions. These interactions are expected to extend the range of realizable quantum algorithms as well as to provide new control mechanisms for new types of quantum matter. This article presents special inter-atomic interactions between Rydberg-dressed atoms obtained by manipulating the lasers' line-width. The new interaction features a hybrid spatial profile containing plateaus and Gaussian peaks. Combined with dynamic individual control over the sign and strength of these interaction elements, the Rydberg noisy-dressing (RnD) scheme provides a valuable interaction toolbox for quantum technology. As an example, RnD's application in making stable gigantic 3D soliton molecules and in the formation of quasi-periodic droplet crystals is discussed.

翻訳日:2023-05-11 20:45:34 公開日:2020-07-02

# 有限時間量子カルノーエンジンにおける電力変動

Power fluctuations in a finite-time quantum Carnot engine ( http://arxiv.org/abs/2007.01034v1 )

ライセンス: Link先を確認

Tobias Denzler and Eric Lutz

(参考訳) 安定性は、変動する出力を持つ小型熱機械の重要な特性である。本稿では、縮退多レベル系に基づく有限時間量子カルノエンジンを考察し、その有限ヒルベルト空間構造が安定性に与える影響について考察する。我々は、特に、レベル縮退とレベル番号に関して、相対的な作業変動を最適化する。最適性能は、非退化二段エンジンや高調波発振器モータよりも優れている。本結果は,高性能で高安定性な循環型量子熱エンジンの実現方法を示す。

Stability is an important property of small thermal machines with fluctuating power output. We here consider a finite-time quantum Carnot engine based on a degenerate multilevel system and study the influence of its finite Hilbert space structure on its stability. We optimize in particular its relative work fluctuations with respect to level degeneracy and level number. We find that its optimal performance may surpass that of nondegenerate two-level engines or harmonic oscillator motors. Our results show how to realize high-performance, high-stability cyclic quantum heat engines.

翻訳日:2023-05-11 20:45:20 公開日:2020-07-02

# 温度依存カシミール力:繰り返し微妙な性質

Temperature dependent Casimir forces: recurring subtleties ( http://arxiv.org/abs/2007.01011v1 )

ライセンス: Link先を確認

L.R. Fisher, B.W. Ninham

(参考訳) 2つの理想導電面の間のカシミール力は、リフシッツによるより一般的な理論の特別な(ゼロ温度)極限である。温度依存理論は、導電性、誘電性、磁気媒体のための量子および古典的なゆらぎモードの相関を含む。表面が温度が異なる場合、これらのモードはカップリングスプリングとして作用し、真空でも熱エネルギーを熱源から冷却器に伝達する可能性があると仮定されている。最近の実験でこの予測が確認されたが、データは完全な温度依存理論ではなく、カシミールの最初の表現の予測と比較された。これは文献に共通する誤りである。もう一つの誤りは、実導電面(この場合は金)が理想から遠く離れており、最大25%の補正係数が必要であるという事実を無視することである。ここでは両補正について数値値を与える。これらは最近の実験の基本的な結論には影響しないと思われるが、キャシミール(リフシッツ)効果の拡張の解釈に注意が必要であるというメッセージは、幅広い科学的問題でますます現れつつある。

The Casimir force between two ideal conducting surfaces is a special (zero temperature) limit of a more general theory due to Lifshitz. The temperature dependent theory includes correlations in coupled quantum and classical fluctuation modes for conducting, dielectric and magnetic media. If the surfaces are at different temperatures, it has been postulated that these modes might act as a coupling spring, transferring thermal energy from the hotter to the colder even through a vacuum. Recent experiments have appeared to confirm this prediction, but the data were compared with the predictions of Casimir's original expression, rather than those of the full temperature-dependent theory. This is a common error in the literature. Another error is to ignore the fact that real conducting surfaces (gold in this case) can be far from ideal, and that a correction factor of up to 25% may be required. Here we give numerical values for both of these corrections. It appears that they may not affect the basic conclusions from recent experiments, but the take-home message is that care is needed in the interpretation of extensions of Casimir (Lifshitz) effects, which are increasingly emerging across a wide range of scientific problems.

翻訳日:2023-05-11 20:45:13 公開日:2020-07-02

# ループ量子宇宙論における準確率分布

Quasi-probability distributions in Loop Quantum Cosmology ( http://arxiv.org/abs/2007.01324v1 )

ライセンス: Link先を確認

Jasel Berra-Montiel, Alberto Molgado

(参考訳) 本稿では、ループ量子宇宙論プログラム(LQC)で最近開発されたウィグナー・ワイル形式を一般化するために、位相空間と対応するワイル量子化写像のパラメトリズド準確率分布の完全族について述べる。特に、実数直線のボーアコンパクト化に値を持つ状態の準分布を、非可換量子作用素に対応する順序曖昧性を説明するパラメータによってラベル付けされるように定義する。したがって、パラメータ化された準確率分布の射影は任意の順序付け処方の下で不変な周辺確率密度をもたらす。また、標準のSchrödinger表現とは対照的に、任意の指標に対して、準分布は順序とは独立に正の関数を決定することに注意する。さらに、LQG に対してパラメトリック順序付きワイル量子化写像を適切に実装することにより、標準、反標準およびワイル対称順序付けの関連事例をそれぞれ簡単な方法で復元することができる。我々は,LQCプログラムにおけるいくつかの基本的側面,特にコヒーレンス,スクイーズド状態,演算子の収束に関するものを、量子光学および量子情報の枠組みで広く行われているように分析する上で,本研究の結果が役立つことを期待している。

In this paper, we introduce a complete family of parametrized quasi-probability distributions in phase space and their corresponding Weyl quantization maps with the aim to generalize the recently developed Wigner-Weyl formalism within the Loop Quantum Cosmology program (LQC). In particular, we intend to define those quasi-distributions for states valued on the Bohr compactification of the real line in such a way that they are labeled by a parameter that accounts for the ordering ambiguity corresponding to non-commutative quantum operators. Hence, we notice that the projections of the parametrized quasi-probability distributions result in marginal probability densities which are invariant under any ordering prescription. We also note that, in contrast to the standard Schrödinger representation, for an arbitrary character the quasi-distributions determine a positive function independently of the ordering. Further, by judiciously implementing a parametric-ordered Weyl quantization map for LQG, we are able to recover in a simple manner the relevant cases of the standard, anti-standard, and Weyl symmetric orderings, respectively. We expect that our results may serve to analyze several fundamental aspects within the LQC program, in particular those related to coherence, squeezed states, and the convergence of operators, as extensively analyzed in the quantum optics and quantum information frameworks.

翻訳日:2023-05-11 20:39:01 公開日:2020-07-02

# 共形場の理論は魔法です

Conformal field theories are magical ( http://arxiv.org/abs/2007.01303v1 )

ライセンス: Link先を確認

Christopher David White and ChunJun Cao and Brian Swingle

(参考訳) マジック」とは、クリフォードゲートによって状態が近似できない程度である。我々は、魔法の尺度であるmanaを$\mathbb z_3$ pottsモデルの基底状態で研究し、多体物理学において広く有用な診断であると主張する。特に、$q = 3$ の基底状態はモデルの臨界点で大きなマナを持ち、このマナはシステムの相関に存在する。状態の mera 表現に基づく単純なテンソルカウント計算によって mana の形式を説明する。マナはあらゆる長さのスケールに存在するので、3状態ポッツモデル臨界点を記述する共形場理論は魔法であると結論付ける。これらの結果は,誤り訂正量子コンピュータ上でのポッツ基底状態の生成とAdS-CFTのテンソルネットワークモデルの制約を制御している。

"Magic" is the degree to which a state cannot be approximated by Clifford gates. We study mana, a measure of magic, in the ground state of the $\mathbb Z_3$ Potts model, and argue that it is a broadly useful diagnostic for many-body physics. In particular we find that the $q = 3$ ground state has large mana at the model's critical point, and that this mana resides in the system's correlations. We explain the form of the mana by a simple tensor-counting calculation based on a MERA representation of the state. Because mana is present at all length scales, we conclude that the conformal field theory describing the 3-state Potts model critical point is magical. These results control the difficulty of preparing the Potts ground state on an error-corrected quantum computer, and constrain tensor network models of AdS-CFT.

翻訳日:2023-05-11 20:38:11 公開日:2020-07-02

# 量子シングルトン境界を破る絡み合い支援量子通信

Entanglement-Assisted Quantum Communication Beating the Quantum Singleton Bound ( http://arxiv.org/abs/2007.01249v1 )

ライセンス: Link先を確認

Markus Grassl

(参考訳) Brun, Devetak, and Hsieh [Science 314, 436 (2006)] は、送信機と受信機の間の事前共有の絡み合いが、絡み合いの助けなしにスキームよりも優れたパラメータを持つ量子通信プロトコルを実現することを示した。その後、同じ著者が、それらによって提案された絡み合い支援量子エラー訂正符号のパラメータに関連するいわゆる量子シングルトン境界のバージョンを導出した。我々は,この境界を一定の範囲で破るパラメータを持つ新しい絡み合い支援量子通信方式を提案する。

Brun, Devetak, and Hsieh [Science 314, 436 (2006)] demonstrated that pre-shared entanglement between sender and receiver enables quantum communication protocols that have better parameters than schemes without the assistance of entanglement. Subsequently, the same authors derived a version of the so-called quantum Singleton bound that relates the parameters of the entanglement-assisted quantum error-correcting codes proposed by them. We present a new entanglement-assisted quantum communication scheme with parameters violating this bound in certain ranges.
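
For orientation, the two bounds mentioned here are commonly quoted as $n-k \ge 2(d-1)$ for a standard $[[n,k,d]]$ code and $n+c-k \ge 2(d-1)$ for an entanglement-assisted $[[n,k,d;c]]$ code that consumes $c$ pre-shared entangled pairs. The Python sketch below simply encodes those inequalities. The function names are illustrative, and apart from the well-known five-qubit code $[[5,1,3]]$ the parameter sets are hypothetical inputs, not codes taken from the paper.

```python
def singleton_slack(n, k, d, c=0):
    """Return n + c - k - 2*(d - 1); a negative value violates the bound.

    With c = 0 this is the standard quantum Singleton bound; with c > 0 it is
    the entanglement-assisted form as commonly quoted (assumed here).
    """
    return n + c - k - 2 * (d - 1)

def satisfies_singleton(n, k, d, c=0):
    """True if the parameters respect the (entanglement-assisted) bound."""
    return singleton_slack(n, k, d, c) >= 0

# The five-qubit code [[5,1,3]] meets the standard bound with equality.
print(singleton_slack(5, 1, 3))
# Hypothetical parameters: distance 3 with n = 4, k = 1 needs at least one ebit.
print(satisfies_singleton(4, 1, 3), satisfies_singleton(4, 1, 3, c=1))
```

A scheme of the kind announced in the abstract would correspond to parameters for which `singleton_slack` is negative even with `c` included.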

翻訳日:2023-05-11 20:37:00 公開日:2020-07-02

# 計算機研究の評価と普及のための進化的手法

Evolving Methods for Evaluating and Disseminating Computing Research ( http://arxiv.org/abs/2007.01242v1 )

ライセンス: Link先を確認

Benjamin Zorn, Tom Conte, Keith Marzullo, and Suresh Venkatasubramanian

(参考訳) 社会と技術の動向は、コンピュータ研究の評価と普及の方法を大きく変えた。会議や雑誌など、従来のレビューや出版の場は、過去には効果的に機能していた。近年、トレンドは新しい機会を生み出しつつも、レビューと普及のプロセスに新たなプレッシャーをかけている。例えば、多くのカンファレンスでは応募者数が大幅に増加した。同様に、研究思想の普及はarXiv.orgやソーシャルメディアネットワークといった出版の場を通じて劇的に進んでいる。こうした傾向は新型コロナウイルスより以前からあったが、パンデミックは長期的変化を加速させる可能性がある。 1) コンピュータ研究に影響を及ぼす傾向は概ね肯定的であり, 研究プロセスの参加, 範囲, アクセシビリティ, 速度が向上している。 2) 審査プロセスの規模を拡大する方法, 結果の拡散や混乱を回避し, 公正性を確保し, プロセス自体への幅広い参加を確保すること等, プロセスの整合性確保に課題が残されている。これらの知見に基づいて,1) コンピュータ研究コミュニティの定期的なポーリングメンバー,例えば,プログラムや一般会議の椅子,ジャーナル編集者,著者,レビュアーなど,これらの問題をよりよく理解するために直面する特定の課題を特定することを推奨する。 2)コンピューティング研究協会などの影響力のある団体は,「コンピューティング研究企業の現状」レポートを定期的に発行し,コンピュータ研究企業に影響を与える,肯定的かつ否定的な傾向をコミュニティに報告している。 3)ソーシャルメディアやプレプリントアーカイブがコンピュータ研究に与える影響をより深く理解するために,より深い調査を行う。

Social and technical trends have significantly changed methods for evaluating and disseminating computing research. Traditional venues for reviewing and publishing, such as conferences and journals, worked effectively in the past. Recently, trends have created new opportunities but also put new pressures on the process of review and dissemination. For example, many conferences have seen large increases in the number of submissions. Likewise, dissemination of research ideas has changed dramatically through publication venues such as arXiv.org and social media networks. While these trends predate COVID-19, the pandemic could accelerate longer-term changes. Based on interviews with leading academics in computing research, our findings include: (1) Trends impacting computing research are largely positive and have increased the participation, scope, accessibility, and speed of the research process. (2) Challenges remain in securing the integrity of the process, including addressing ways to scale the review process, avoiding attempts to misinform or confuse the dissemination of results, and ensuring fairness and broad participation in the process itself. Based on these findings, we recommend: (1) Regularly polling members of the computing research community, including program and general conference chairs, journal editors, authors, reviewers, etc., to identify the specific challenges they face and to better understand these issues. (2) Having an influential body, such as the Computing Research Association, regularly issue a "State of the Computing Research Enterprise" report to update the community on trends, both positive and negative, impacting the computing research enterprise. (3) Conducting a deeper investigation, specifically to better understand the influence that social media and preprint archives have on computing research.

翻訳日:2023-05-11 20:36:48 公開日:2020-07-02

# ドイツ消費者金融セクターにおけるフィンテックの破壊的可能性 - ブルーオーシャンシナリオか?

The Disruptive Potential of FinTechs in the German Consumer Finance Sector -- A Blue Ocean Scenario? ( http://arxiv.org/abs/2007.03603v1 )

ライセンス: Link先を確認

Christian Wischnewski

(参考訳) この論文は、ブルーオーシャン戦略を基本戦略要素として、ドイツにおける比較的保守的な銀行部門と、ドイツにおける現在のフィンテック企業の市場シェアを評価するために定量的および質的要素の両方を用いて、ドイツ国民全体のリスク回避思想に当てはまるかどうかを分析、そして将来の発展についての潜在的な見通しを把握している。戦略的な枠組み、ドイツにおける銀行部門、フィンテック部門について文献レビューを行う。その後、銀行セクターが「赤い海」であるかどうか、フィンテック産業が「青い海」であるかどうかを、両セクターのケーススタディを用いて正式に検証する。ドイツにおける銀行顧客とそのフィンテック企業の利用に関する定量的分析は、オンライン調査を通じて行われ、その後、選ばれた参加者がインタビューを受け、さらなる洞察を得る。ドイツ銀行セクターのフルサイズとセグメントごとのトランザクションボリュームを反映した外挿指標とともに、調査結果のピボット分析とクロス集計とインタビュー結果を用いてデータ評価を行い、市場でのフィンテック利用の増加の兆候について検討した。将来の研究のアイデアが導出される場所からいくつかの制限にもかかわらず、その結果はドイツにおけるフィンテックの利用に関する現在のトレンドの概要を提供する。主な発見は、支払いソリューションの特筆すべき例外を除いて、ドイツはフィンテックに対して高い親和性を持っておらず、市場シェアと潜在力に乏しい金融サービス産業の副産物となっていることである。

Using the Blue Ocean strategy as an underlying strategic element, this dissertation analyses whether this statement holds true for the rather conservative banking sector in Germany and the overall risk-averse mindset of the German population, using both quantitative and qualitative elements to assess the current market share of FinTech companies in the Federal Republic and to outline a potential outlook on future development. A literature review of the strategic framework, the banking sector in Germany and the FinTech sector is carried out accordingly. Subsequently, a formal verification as to whether the banking sector is a "Red Ocean" and whether the FinTech industry is a "Blue Ocean" is carried out using case studies from both sectors. A quantitative analysis of banking customers in Germany and their use of FinTech companies is conducted by way of an online survey, with selected participants being interviewed thereafter to gain additional insights. Data are evaluated using pivot analysis and cross-tabulation of survey results and interview findings; extrapolated indicators reflecting the full size of the German banking sector and the transactional volumes per segment are provided and examined for signs of elevated FinTech use in the market. Despite several limitations, from which ideas for future research are derived, the outcomes provide an overview of existing trends for the use of FinTechs in Germany. The main finding is that, with the notable exception of payment solutions, Germans do not have a high affinity towards FinTechs, rendering them a byproduct of the financial service industry, with limited market share and low potential.

翻訳日:2023-05-11 20:28:39 公開日:2020-07-02

# 初期の宇宙の量子的記述について

On the quantum description of the early universe ( http://arxiv.org/abs/2007.03428v1 )

ライセンス: Link先を確認

Gabriel R. Bengochea

(参考訳) なぜ宇宙の起源を理解するのが面白いのか? 私たちの存在を含む今日の観察はすべて、その出来事から生まれました。我々はまだその起源を記述できる理論を持っていないが、宇宙の非常に初期の時代の研究は、今日最も成功した2つの物理理論、一般相対性理論と量子物理学の間のインターフェイスを分析する理想的な地形を含んでいる。しかし、この領域は、我々の理論的アイデアをテストするための多くの観測データを持っている。量子物理学の父であるニールス・ボーア(Niels Bohr)とヴェルナー・ハイゼンベルク(Werner Heisenberg)は、これらの言葉で説明できるいくつかの考えを共有した:「量子物理学は、観測者と観測者の間に線があり、したがって科学は観察されるものに限定されるべきである。我々は世界の完全で客観的で現実的な理論を諦めなければならない」。この記事はこれらのアイデアを周回し、今日、最近の研究から、宇宙論を通じて(少なくとも一部は)それらに挑戦し、初期の宇宙の量子的な記述を求める立場にあることを要約します。

Why is it interesting to try to understand the origin of the universe? Everything we observe today, including our existence, arose from that event. Although we still do not have a theory that allows us to describe the origin itself, the study of the very early era of the universe involves the ideal terrain to analyze the interface between two of today's most successful physical theories, General Relativity and Quantum physics. But it is also an area in which we have a large number of observational data to test our theoretical ideas. Two of the fathers of Quantum physics, Niels Bohr and Werner Heisenberg, shared some thoughts that could be described with these words: "Quantum physics tells us that there is a line between the observed and the observer, and therefore science should be limited to what is observed. We must give up a complete, objective and realistic theory of the world". This article will orbit around these ideas and summarizes how it is that today, from recent works, we are in a position to try to challenge them (at least in part) through cosmology, seeking the quantum description of the early universe.

翻訳日:2023-05-11 20:28:08 公開日:2020-07-02

# 時間依存型コイン演算子を用いた量子ウォークにおけるパロンドのパラドックス

Genuine Parrondo's paradox in quantum walks with time-dependent coin operators ( http://arxiv.org/abs/2007.01437v1 )

ライセンス: Link先を確認

Marcelo A. Pires and Sílvio M. Duarte Queirós

(参考訳) 真のパロンドパラドックスが2状態の量子ウォークにおいて、実験的に複雑な高次元コインを使わずに現れることを示した。このような目的を達成するために、システムの空間的不変性を損なうことなく、時間依存のコイン演算子を用いる。

We show that a genuine Parrondo paradox can emerge in two-state quantum walks without resorting to experimentally intricate high-dimensional coins. To achieve this goal, we employ a time-dependent coin operator without breaking the spatial translation invariance of the system.

翻訳日:2023-05-11 20:27:47 公開日:2020-07-02
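
To make the idea of a time-dependent coin concrete, here is a minimal sketch of a two-state discrete-time quantum walk on a line in which the coin angle may change from step to step while the same coin is applied at every site (so translation invariance is kept). The coin parametrisation, the alternating angle schedule and all function names are assumptions chosen for illustration; this does not reproduce the coin sequences or the paradox reported in the paper.

```python
import math


def coined_walk(steps, theta_of_t, coin0=(1 / math.sqrt(2), 1j / math.sqrt(2))):
    """Position probabilities (x = -steps..steps) after `steps` walk steps.

    theta_of_t(t) returns the coin angle used at step t; the coin matrix is
    [[cos, sin], [sin, -cos]], which is the Hadamard coin at theta = pi/4.
    coin0 is the initial coin state of a walker placed at the origin.
    """
    size = 2 * steps + 1
    amp = [[0j, 0j] for _ in range(size)]
    amp[steps] = [complex(coin0[0]), complex(coin0[1])]
    for t in range(steps):
        c, s = math.cos(theta_of_t(t)), math.sin(theta_of_t(t))
        new = [[0j, 0j] for _ in range(size)]
        for x, (a0, a1) in enumerate(amp):
            if a0 == 0 and a1 == 0:
                continue  # outside the region the walker can have reached
            new[x - 1][0] += c * a0 + s * a1  # coin state 0 moves left
            new[x + 1][1] += s * a0 - c * a1  # coin state 1 moves right
        amp = new
    return [abs(a0) ** 2 + abs(a1) ** 2 for a0, a1 in amp]


def mean_position(probs):
    """Expected position for probabilities indexed from -steps to +steps."""
    steps = (len(probs) - 1) // 2
    return sum((x - steps) * p for x, p in enumerate(probs))


def alternating(t):
    """Hypothetical schedule: alternate between two coin angles."""
    return math.pi / 4 if t % 2 == 0 else math.pi / 3


probs = coined_walk(50, alternating)
```

With the balanced default coin state the distribution is mirror-symmetric, which makes a convenient sanity check. To explore game-like behaviour one would pass a biased `coin0` such as `(1, 0)` and compare `mean_position` for each constant schedule against a mixed schedule.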

# dwave量子アニーラを用いた40株のポートフォリオ最適化

Portfolio Optimization of 40 Stocks Using the DWave Quantum Annealer ( http://arxiv.org/abs/2007.01430v1 )

ライセンス: Link先を確認

Jeffrey Cohen, Alex Khan, Clark Alexander

(参考訳) 我々は、株式の最適なセットを含む、米国上場の液体株式の宇宙からポートフォリオを構築するための量子コンピュータの使用について調査する。歴史的市場データから、D-Wave Systems Inc.の様々な問題定式化について考察する。 D-Wave 2000Q(TM)システム(後にDWaveと呼ばれる)は、マルコウィッツの定式化とシャープ比に基づく最適化されたポートフォリオ、単純化されたシカゴ量子比(CQR)、そして新しいシカゴ量子ネットスコア(CQNS)を見つける。まずこれを古典的に、次にDWaveの新しい手法でアプローチします。以上の結果から,米国株40株から魅力的なポートフォリオを選択できることがわかった。

We investigate the use of quantum computers for building a portfolio out of a universe of U.S. listed, liquid equities that contains an optimal set of stocks. Starting from historical market data, we look at various problem formulations on the D-Wave Systems Inc. D-Wave 2000Q(TM) System (hereafter called DWave) to find the optimal risk vs return portfolio; an optimized portfolio based on the Markowitz formulation and the Sharpe ratio, a simplified Chicago Quantum Ratio (CQR), then a new Chicago Quantum Net Score (CQNS). We approach this first classically, then by our new method on DWave. Our results show that practitioners can use a DWave to select attractive portfolios out of 40 U.S. liquid equities.

翻訳日:2023-05-11 20:27:41 公開日:2020-07-02
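
As a classical point of comparison for the Sharpe-ratio formulation named in the abstract, the sketch below brute-forces every subset of a tiny asset universe and scores each as an equal-weight portfolio. Equal weighting, the zero risk-free rate and the three-asset numbers are assumptions made purely for illustration (they are not market data), and the CQR and CQNS scores are not reproduced because the abstract does not define them.

```python
import math
from itertools import combinations


def sharpe_ratio(subset, mu, cov, risk_free=0.0):
    """Sharpe ratio of an equal-weight portfolio over the chosen asset indices."""
    w = 1.0 / len(subset)
    expected_return = sum(w * mu[i] for i in subset)
    variance = sum(w * w * cov[i][j] for i in subset for j in subset)
    return (expected_return - risk_free) / math.sqrt(variance)


def best_subset(mu, cov, risk_free=0.0):
    """Exhaustively search all non-empty subsets (feasible only for small n)."""
    n = len(mu)
    candidates = (s for r in range(1, n + 1) for s in combinations(range(n), r))
    return max(candidates, key=lambda s: sharpe_ratio(s, mu, cov, risk_free))


# Illustrative expected returns and covariance matrix for three assets.
mu = [0.10, 0.12, 0.08]
cov = [
    [0.04, 0.00, 0.00],
    [0.00, 0.09, 0.00],
    [0.00, 0.00, 0.01],
]
chosen = best_subset(mu, cov)
```

The search visits 2^n - 1 subsets, so it is already impractical for a 40-stock universe; that combinatorial growth is the motivation for handing the binary selection problem to an annealer.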

# デザイン革新のためのクラウドファンディング: 重要な要因を持つ予測モデル

Crowdfunding for Design Innovation: Prediction Model with Critical Factors ( http://arxiv.org/abs/2007.01404v1 )

ライセンス: Link先を確認

Chaoyang Song, Jianxi Luo, Katja Hölttä-Otto, Warren Seering, Kevin Otto

(参考訳) オンライン報酬ベースのクラウドファンディングキャンペーンは、要求を検証し、アーリーアダプターを発見し、革新的な製品の設計プロセスにおける学習とフィードバックを求める革新的なアプローチとして登場した。しかし、革新的な製品のためのクラウドファンディングキャンペーンは高い不確実性に直面しており、デザインの価値を満たすために成功率に苦しめられている。本稿では, クラウドファンディングキャンペーンのデザイナーやイノベーターを指導するために, クラウドファンディングの成功に重要な要因を持つ予測モデルを構築するためのデータ駆動手法を提案する。具体的には、Real-Win-Worthフレームワークの26の候補因子をフィルタリングし、段階的回帰によって重要な要素を特定し、クラウドファンディングの金額を予測する。予測モデルを導出し、3Dプリンターとスマートウォッチのキャンペーンデータから重要な要素をKickstarterとIndiegogoで特定する手法を実証する。重要な要因は、キャンペーンの発展を導くことができ、予測モデルは、革新的製品のクラウドファンディング成功の可能性を高めるために、文脈におけるイノベーションのクラウドファンディングの可能性を評価することができる。

Online reward-based crowdfunding campaigns have emerged as an innovative approach for validating demands, discovering early adopters, and seeking learning and feedback in the design processes of innovative products. However, crowdfunding campaigns for innovative products are faced with a high degree of uncertainty and suffer meager rates of success to fulfill their values for design. To guide designers and innovators for crowdfunding campaigns, this paper presents a data-driven methodology to build a prediction model with critical factors for crowdfunding success, based on public online crowdfunding campaign data. Specifically, the methodology filters 26 candidate factors in the Real-Win-Worth framework and identifies the critical ones via step-wise regression to predict the amount of crowdfunding. We demonstrate the methodology via deriving prediction models and identifying essential factors from 3D printer and smartwatch campaign data on Kickstarter and Indiegogo. The critical factors can guide campaign developments, and the prediction model may evaluate crowdfunding potential of innovations in contexts, to increase the chance of crowdfunding success of innovative products.

翻訳日:2023-05-11 20:26:39 公開日:2020-07-02

# 複素最適化と統計的推測による高次元の量子精度限界における純量子状態の推定

Estimation of pure quantum states in high dimension at the limit of quantum accuracy through complex optimization and statistical inference ( http://arxiv.org/abs/2007.01398v1 )

ライセンス: Link先を確認

Leonardo Zambrano, Luciano Pereira, Sebastian Niklitschek, and Aldo Delgado

(参考訳) 量子トモグラフィーは、量子状態、プロセス、デバイスを評価するための重要なツールとなっている。これにより、より精度の高いトモグラフィー法が探索される。単一2次元量子系適応法の混合状態の場合, 林, ギル, マッサーによる理論的精度限界を達成するために, 最近導入されている。しかし、高次元量子状態の正確な推定は未だよく分かっていない。これは主に非互換な可観測性が存在するため、マルチパラメータ推定が困難である。本稿では,適応トモグラフィー法を示し,数回の反復を経て,高次元の純量子状態の推定精度に基礎的ギル・マッサール下限に漸近する数値シミュレーションを行った。この手法は複素数場における確率的最適化と統計的推論の組み合わせに基づいており、任意の混合状態トモグラフィー法の精度を上回っており、現在の実験能力で実証することができる。提案手法は量子力学の新しい発展につながる可能性がある。

Quantum tomography has become a key tool for the assessment of quantum states, processes, and devices. This drives the search for tomographic methods that achieve greater accuracy. In the case of mixed states of a single 2-dimensional quantum system, adaptive methods have been recently introduced that achieve the theoretical accuracy limit deduced by Hayashi and by Gill and Massar. However, accurate estimation of higher-dimensional quantum states remains poorly understood. This is mainly due to the existence of incompatible observables, which makes multiparameter estimation difficult. Here we present an adaptive tomographic method and show through numerical simulations that, after a few iterations, it asymptotically approaches the fundamental Gill-Massar lower bound for the estimation accuracy of pure quantum states in high dimension. The method is based on a combination of stochastic optimization on the field of the complex numbers and statistical inference, exceeds the accuracy of any mixed-state tomographic method, and can be demonstrated with current experimental capabilities. The proposed method may lead to new developments in quantum metrology.

翻訳日:2023-05-11 20:26:21 公開日:2020-07-02

# 一般化Aubry-André格子における可変移動エッジの観測

Observation of tunable mobility edges in generalized Aubry-André lattices ( http://arxiv.org/abs/2007.01393v1 )

ライセンス: Link先を確認

Fangzhao Alex An, Karmela Padavić, Eric J. Meier, Suraj Hegde, Sriram Ganeshan, J.H. Pixley, Smitha Vishveshwara, and Bryce Gadway

(参考訳) レーザー結合原子運動量モードの合成格子を用いて,双対対称性によって保護される正確な移動性エッジを有する準周期的サイトエネルギー変調を持つ近距離結合モデル群を実験的に実現した。これらの一次元強結合モデルは、よく知られた Aubry-André (AA) モデルの一般化と見なすことができ、解析的モビリティエッジ関係を構成するエネルギー依存的な自己双対条件を持つ。このモデルシステムの最低および最高エネルギー固有状態とそれらの参加率の顕微鏡的測定を行うことにより、状態のエネルギー依存密度がモデルのチューニングパラメータによって変化するにつれて、移動度エッジの進化を追跡する。その結果、単粒子予測からの強い偏差が見られ、自己トラップによる最低エネルギー状態の局在化とスクリーニングによる最高エネルギー状態の局在の抑制の両方を引き起こす魅力的な相互作用と一致した。本研究は, 自己双対性誘導移動エッジにおける相互作用効果の定量的研究方法である。

Using synthetic lattices of laser-coupled atomic momentum modes, we experimentally realize a recently proposed family of nearest-neighbor tight-binding models having quasiperiodic site energy modulation that host an exact mobility edge protected by a duality symmetry. These one-dimensional tight-binding models can be viewed as a generalization of the well-known Aubry-André (AA) model, with an energy-dependent self duality condition that constitutes an analytical mobility edge relation. By adiabatically preparing the lowest and highest energy eigenstates of this model system and performing microscopic measurements of their participation ratio, we track the evolution of the mobility edge as the energy-dependent density of states is modified by the model's tuning parameter. Our results show strong deviations from single-particle predictions, consistent with attractive interactions causing both enhanced localization of the lowest energy state due to self-trapping and inhibited localization of the highest energy state due to screening. This study paves the way for quantitative studies of interaction effects on self duality induced mobility edges.

翻訳日:2023-05-11 20:26:04 公開日:2020-07-02
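
The participation ratio measured in the experiment can be illustrated with a few lines of code. The sketch assumes the common definition PR = 1 / sum_i |psi_i|^4 for a normalised state over lattice sites; the paper's exact normalisation may differ, and the function name is illustrative.

```python
def participation_ratio(amplitudes):
    """Participation ratio 1 / sum(p_i^2), with p_i the normalised site populations."""
    norm = sum(abs(a) ** 2 for a in amplitudes)
    populations = [abs(a) ** 2 / norm for a in amplitudes]
    return 1.0 / sum(p * p for p in populations)


# A state sitting on one site versus a state spread evenly over four sites.
localized = participation_ratio([1, 0, 0, 0])
extended = participation_ratio([1, 1, 1, 1])
```

Under this definition the value ranges from 1 for a state confined to a single site up to the number of sites for a uniformly spread state, which is why it serves as a localization diagnostic on either side of a mobility edge.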

# 建物における微小気候のデータ駆動制御--イベントトリガー型強化学習アプローチ

Data-driven control of micro-climate in buildings: an event-triggered reinforcement learning approach ( http://arxiv.org/abs/2001.10505v2 )

ライセンス: Link先を確認

Ashkan Haji Hosseinloo, Alexander Ryzhov, Aldo Bischi, Henni Ouerdane, Konstantin Turitsyn, Munther A. Dahleh

(参考訳) スマートな建物は、地球全体のエネルギー消費の約40%を占めるため、エネルギー効率が高く、持続可能で、より経済的な未来を形作る大きな可能性を秘めている。スマートビルの将来は、オンラインと継続的な方法で短期間で適切な制御方針を学ぶという重要な課題によって現在目立たれている、適応的意思決定と制御に感覚データを使用することにある。この課題に取り組むために,イベントが発生し,十分な情報が収集された場合に学習と制御の判断を行う,古典的な時間トリガーとは対照的なイベントトリガー型パラダイムを提案する。イベントは特定の設計条件によって特徴づけられ、例えば特定の状態閾値に達すると、条件が満たされたときに発生する。学習の時間と制御の決定を体系的に調整することにより、提案フレームワークは学習のばらつきを低減し、制御プロセスを改善することができる。変動時間状態遷移と意思決定を可能にする半マルコフ決定プロセスに基づいて,マイクロ気候制御問題を定式化する。拡張政策勾配定理と時間差法を用いて、建物内の微小気候のイベントトリガー制御のための2つの学習アルゴリズムを提案する。テストビルにおけるエネルギー消費と居住者の快適さを同時に最適化するスマート・ラーニング・サーモスタットの設計により,提案手法の有効性を示す。

Smart buildings have great potential for shaping an energy-efficient, sustainable, and more economic future for our planet as buildings account for approximately 40% of the global energy consumption. The future of smart buildings lies in using sensory data for adaptive decision making and control, which is currently overshadowed by the key challenge of learning a good control policy in a short period of time in an online and continuing fashion. To tackle this challenge, an event-triggered (as opposed to classic time-triggered) paradigm is proposed, in which learning and control decisions are made when events occur and enough information is collected. Events are characterized by certain design conditions and they occur when the conditions are met, for instance, when a certain state threshold is reached. By systematically adjusting the time of learning and control decisions, the proposed framework can potentially reduce the variance in learning, and consequently, improve the control process. We formulate the micro-climate control problem based on semi-Markov decision processes that allow for variable-time state transitions and decision making. Using extended policy gradient theorems and temporal difference methods in a reinforcement learning set-up, we propose two learning algorithms for event-triggered control of micro-climate in buildings. We show the efficacy of our proposed approach via designing a smart learning thermostat that simultaneously optimizes energy consumption and occupants' comfort in a test building.

翻訳日:2023-01-06 03:06:54 公開日:2020-07-02
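
The combination of event triggering and semi-Markov transitions described above can be sketched with a tabular temporal-difference update in which a transition may span several time steps. This is a generic SMDP TD(0) rule, not the two algorithms proposed in the paper; the comfort-band trigger, the state names and every parameter value are assumptions for illustration.

```python
def event_occurred(temperature, low=20.0, high=24.0):
    """Hypothetical trigger: fire when the temperature leaves the comfort band."""
    return temperature < low or temperature > high


def smdp_td_update(values, state, next_state, rewards, gamma=0.95, alpha=0.1):
    """TD(0) update for a transition that lasted len(rewards) time steps.

    rewards holds the per-step rewards collected between two events, so the
    bootstrap term is discounted by gamma ** tau rather than by gamma.
    """
    tau = len(rewards)
    discounted_reward = sum(gamma ** k * r for k, r in enumerate(rewards))
    target = discounted_reward + gamma ** tau * values[next_state]
    values[state] += alpha * (target - values[state])
    return values[state]


values = {"too_cold": 0.0, "comfortable": 1.0}
updated = smdp_td_update(
    values, "too_cold", "comfortable", rewards=[1.0, 1.0], gamma=0.5, alpha=0.1
)
```

Because updates happen only when `event_occurred` fires, the learner makes fewer and better-informed updates than a controller that acts on every clock tick, which is the intuition behind the variance reduction the abstract mentions.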

# 人物再同定:深層学習分類フレームワークの受容領域を暗黙的に定義する

Person Re-identification: Implicitly Defining the Receptive Fields of Deep Learning Classification Frameworks ( http://arxiv.org/abs/2001.11267v4 )

ライセンス: Link先を確認

Ehsan Yaghoubi, Diana Borza, Aruna Kumar, Hugo Proença

(参考訳) ディープラーニング分類モデルの receptive fields は、正しい判断を提供する上で最も重要な入力データの領域を決定する。このような受容的フィールドを学ぶ一番の方法は、マスクされたデータに基づいてモデルをトレーニングすることであり、ネットワークが望ましくない領域を無視するのに役立つが、2つの大きな欠点がある。 1) しばしばエッジに敏感な意思決定プロセスをもたらします。 2) 推論フェーズの計算コストを大幅に増加させる。本稿では,ネットワーク決定に重要な/無関係な,交換セグメントからなる合成学習データを作成することにより,ネットワークの受容領域の推論を暗黙的に駆動する解について述べる。実際には,各学習インスタンスの前景/背景(重要でない)部分を区別するためにセグメンテーションモジュールを使用し,画像ペア間のセグメントをランダムにスワップし,クラスラベルを重要セグメントのラベルに排他的に一致させる。この戦略は典型的にネットワークを早期収束と適切な解へと駆り立てるが、そこではアイデンティティと乱雑な記述は相関しない。さらに、このデータ拡張ソリューションには様々な興味深い特性がある。 1) パラメータフリーである。 2) ラベル情報を完全に保存し,かつ 3) 典型的なデータ拡張技術と互換性がある。実証的検証では,提案手法の有効性を2つの異なる設定 (upper-body と full-body) で検証し,最先端技術に対する高い競争力のある結果が得られた。再現可能な研究パラダイムの下で、コードと経験的評価プロトコルは https://github.com/Ehsan-Yaghoubi/reid-strong-baseline で利用可能である。

The receptive fields of deep learning classification models determine the regions of the input data that have the most significance for providing correct decisions. The primary way to learn such receptive fields is to train the models upon masked data, which helps the networks to ignore any unwanted regions, but has two major drawbacks: 1) it often yields edge-sensitive decision processes; and 2) augments the computational cost of the inference phase considerably. This paper describes a solution for implicitly driving the inference of the networks' receptive fields, by creating synthetic learning data composed of interchanged segments that should be a priori important/irrelevant for the network decision. In practice, we use a segmentation module to distinguish between the foreground (important)/background (irrelevant) parts of each learning instance, and randomly swap segments between image pairs, while keeping the class label exclusively consistent with the label of the deemed important segments. This strategy typically drives the networks to early convergence and appropriate solutions, where the identity and clutter descriptions are not correlated. Moreover, this data augmentation solution has various interesting properties: 1) it is parameter-free; 2) it fully preserves the label information; and, 3) it is compatible with the typical data augmentation techniques. In the empirical validation, we considered the person re-identification problem and evaluated the effectiveness of the proposed solution in the well-known Richly Annotated Pedestrian (RAP) dataset for two different settings (upper-body and full-body), observing highly competitive results over the state-of-the-art. Under a reproducible research paradigm, both the code and the empirical evaluation protocol are available at https://github.com/Ehsan-Yaghoubi/reid-strong-baseline.

翻訳日:2023-01-05 11:38:30 公開日:2020-07-02
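
The segment-swapping augmentation described above can be sketched on toy data: the foreground of one image is kept, the background comes from another image, and the label follows the foreground. Images are plain nested lists here and the mask is assumed to be given (the paper obtains it from a segmentation module); the function name and the toy label are illustrative, and this is not the code from the linked repository.

```python
def swap_background(img_a, mask_a, label_a, img_b):
    """Combine img_a's foreground (mask_a is True) with img_b's background.

    Returns the synthetic image and the label of the foreground instance,
    so the class label stays tied to the segment deemed important.
    """
    mixed = [
        [pa if keep else pb for pa, keep, pb in zip(row_a, row_m, row_b)]
        for row_a, row_m, row_b in zip(img_a, mask_a, img_b)
    ]
    return mixed, label_a


img_a = [[1, 2], [3, 4]]
mask_a = [[True, False], [False, True]]
img_b = [[9, 9], [9, 9]]
mixed, label = swap_background(img_a, mask_a, "person_7", img_b)
```

Since the function has no tunable parameters and never alters the label, it mirrors two of the properties listed in the abstract (parameter-free and label-preserving), and it can be chained with ordinary augmentations such as flips or crops.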

# ディープニューラルネットワークのベイズが本当に優れているのか?

How Good is the Bayes Posterior in Deep Neural Networks Really? ( http://arxiv.org/abs/2002.02405v2 )

ライセンス: Link先を確認

Florian Wenzel, Kevin Roth, Bastiaan S. Veeling, Jakub Świątkowski, Linh Tran, Stephan Mandt, Jasper Snoek, Tim Salimans, Rodolphe Jenatton, Sebastian Nowozin

(参考訳) 過去5年間で、ベイジアンディープラーニングコミュニティは、ディープニューラルネットワークでベイジアン推論を可能にする、より正確で効率的な近似推論手順を開発してきた。しかし、このアルゴリズムの進歩と不確実性定量化の改善とサンプル効率の約束にもかかわらず、2020年初め現在、産業実践におけるベイズニューラルネットワークの公開デプロイは行われていない。本研究では,一般の深層ニューラルネットワークにおけるベイズ後部の理解に疑問を呈し,ベイズ後部による後部予測がSGDから得られた点推定を含む単純な手法と比較して系統的に悪い予測を行うことを示す。さらに,証拠を過大評価する"コールド後部"を用いることで,予測性能が大幅に向上することを示す。このような冷たい後部はベイズパラダイムから著しく逸脱するが、ベイズ深層学習論文ではヒューリスティックとしてよく使われている。寒冷な後部を説明できる仮説をいくつか提示し,実験を通じて仮説を評価した。我々の研究は、ベイズ深層学習における正確な後方近似の目的に疑問を呈している: 真のベイズ深層が貧弱なら、より正確な近似はどのように使われるのだろうか? 代わりに,寒冷後部の性能向上の原点を理解することに集中することが適当であると主張する。

During the past five years the Bayesian deep learning community has developed increasingly accurate and efficient approximate inference procedures that allow for Bayesian inference in deep neural networks. However, despite this algorithmic progress and the promise of improved uncertainty quantification and sample efficiency there are, as of early 2020, no publicized deployments of Bayesian neural networks in industrial practice. In this work we cast doubt on the current understanding of Bayes posteriors in popular deep neural networks: we demonstrate through careful MCMC sampling that the posterior predictive induced by the Bayes posterior yields systematically worse predictions compared to simpler methods including point estimates obtained from SGD. Furthermore, we demonstrate that predictive performance is improved significantly through the use of a "cold posterior" that overcounts evidence. Such cold posteriors sharply deviate from the Bayesian paradigm but are commonly used as a heuristic in Bayesian deep learning papers. We put forward several hypotheses that could explain cold posteriors and evaluate the hypotheses through experiments. Our work questions the goal of accurate posterior approximations in Bayesian deep learning: If the true Bayes posterior is poor, what is the use of more accurate approximations? Instead, we argue that it is timely to focus on understanding the origin of the improved performance of cold posteriors.

翻訳日:2023-01-03 10:11:09 公開日:2020-07-02
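
A cold posterior is commonly written as a tempered density proportional to exp(-U(theta) / T) with temperature T < 1, where U is the negative log joint of data and parameters and T = 1 recovers the Bayes posterior. The toy sketch below evaluates such a density on a one-dimensional grid for a Gaussian energy, where tempering simply scales the variance by T. The grid bounds, the energy function and the names are assumptions for illustration; nothing here reproduces the MCMC experiments in the paper.

```python
import math


def tempered_moments(energy, temperature, lo=-10.0, hi=10.0, n=20001):
    """Mean and variance of the density proportional to exp(-energy / T) on a grid."""
    step = (hi - lo) / (n - 1)
    grid = [lo + i * step for i in range(n)]
    weights = [math.exp(-energy(theta) / temperature) for theta in grid]
    total = sum(weights)
    mean = sum(w * t for w, t in zip(weights, grid)) / total
    var = sum(w * (t - mean) ** 2 for w, t in zip(weights, grid)) / total
    return mean, var


def gaussian_energy(theta, m=1.0, v=1.0):
    """Negative log density (up to a constant) of a Gaussian with mean m, variance v."""
    return (theta - m) ** 2 / (2 * v)


bayes_mean, bayes_var = tempered_moments(gaussian_energy, temperature=1.0)
cold_mean, cold_var = tempered_moments(gaussian_energy, temperature=0.25)
```

In this conjugate toy case the cold posterior keeps the same mean but is four times sharper at T = 0.25, as if the evidence had been counted four times; that is the sense in which the abstract says a cold posterior overcounts evidence.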

# 解釈可能な構造を用いた多目的分子生成

Multi-Objective Molecule Generation using Interpretable Substructures ( http://arxiv.org/abs/2002.03244v3 )

ライセンス: Link先を確認

Wengong Jin, Regina Barzilay, Tommi Jaakkola

(参考訳) 創薬の目的は、特定の化学的性質のプロファイルを持つ新規化合物を見つけることである。生成的モデリングの観点では、複数の性質制約の交差する分子のサンプリングを学ぶことが目的である。プロパティの制約が多ければ,このタスクはますます難しくなります。我々は、分子の合理性と呼ばれる部分構造の語彙から分子を構成することによって、この複雑さを相殺することを提案する。これらの有理性は、分子から、興味のそれぞれの性質に責任を負う可能性のあるサブ構造として特定される。そして、グラフ生成モデルを用いて有理を全分子に拡張することを学ぶ。最終生成モデルでは、分子を複数の有理補体の混合物として構成し、この混合物は興味のある性質を保持するために微調整される。薬物設計タスクにおける本モデルの評価を行い, 生成化合物の精度, 多様性, 新規性の観点から, 最先端のベースラインに対する顕著な改善を示す。

Drug discovery aims to find novel compounds with specified chemical property profiles. In terms of generative modeling, the goal is to learn to sample molecules in the intersection of multiple property constraints. This task becomes increasingly challenging when there are many property constraints. We propose to offset this complexity by composing molecules from a vocabulary of substructures that we call molecular rationales. These rationales are identified from molecules as substructures that are likely responsible for each property of interest. We then learn to expand rationales into a full molecule using graph generative models. Our final generative model composes molecules as mixtures of multiple rationale completions, and this mixture is fine-tuned to preserve the properties of interest. We evaluate our model on various drug design tasks and demonstrate significant improvements over state-of-the-art baselines in terms of accuracy, diversity, and novelty of generated compounds.

翻訳日:2023-01-02 22:40:10 公開日:2020-07-02

# 解釈可能なAIの代替としての自己説明型AI

Self-explaining AI as an alternative to interpretable AI ( http://arxiv.org/abs/2002.05149v6 )

ライセンス: Link先を確認

Daniel C. Elton

(参考訳) AIシステムによってなされる決定を説明する能力は、特に医療や自動運転車といった人間の生命が危険にさらされている領域において、特に注目されている。ディープニューラルネットワークの入出力関係を人間の理解可能なルールで近似することはしばしば可能であるが、二重降下現象の発見は、ディープニューラルネットワークが動作するメカニズムを正確に捉えていないことを示唆している。二重降下は、ディープニューラルネットワークが通常、いくつかの高レベルのルールを抽出するよりも、データポイント間のスムーズな補間によって動作することを示している。その結果、複雑な実世界のデータに基づいてトレーニングされたニューラルネットワークは、外挿を求めると本質的に解釈が難しく、失敗に陥りがちである。これらの問題にもかかわらず、どのようにAIを信頼できるかを示すために、自己説明型AIの概念を紹介します。自己説明型AIは、決定と説明の両方に対する信頼レベルとともに、各決定について人間に理解可能な説明を提供することができる。このアプローチが機能するためには、説明が実際に決定に関連し、理想的には説明にたどり着くメカニズムを捉えることが重要である。最後に、ディープラーニングベースのシステムには、適用性ドメイン分析のテクニックに基づいた「警告光」が含まれており、モデルにトレーニング配布外の外挿を依頼するとユーザーに警告することが重要であると論じる。この講演のビデオプレゼンテーションは https://www.youtube.com/watch?v=Py7PVdcu7WY& を参照。

The ability to explain decisions made by AI systems is highly sought after, especially in domains where human lives are at stake such as medicine or autonomous vehicles. While it is often possible to approximate the input-output relations of deep neural networks with a few human-understandable rules, the discovery of the double descent phenomenon suggests that such approximations do not accurately capture the mechanism by which deep neural networks work. Double descent indicates that deep neural networks typically operate by smoothly interpolating between data points rather than by extracting a few high level rules. As a result, neural networks trained on complex real world data are inherently hard to interpret and prone to failure if asked to extrapolate. To show how we might be able to trust AI despite these problems we introduce the concept of self-explaining AI. Self-explaining AIs are capable of providing a human-understandable explanation of each decision along with confidence levels for both the decision and explanation. For this approach to work, it is important that the explanation actually be related to the decision, ideally capturing the mechanism used to arrive at the explanation. Finally, we argue it is important that deep learning based systems include a "warning light" based on techniques from applicability domain analysis to warn the user if a model is asked to extrapolate outside its training distribution. For a video presentation of this talk see https://www.youtube.com/watch?v=Py7PVdcu7WY& .

翻訳日:2023-01-01 18:43:27 公開日:2020-07-02
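
One very simple way to realise the "warning light" idea is a nearest-neighbour distance check against the training set. This is only one of many applicability-domain techniques and is not claimed to be the author's proposal; the Euclidean metric, the threshold value and the function name are assumptions for illustration.

```python
import math


def warning_light(x, training_points, threshold):
    """Return True when x lies farther than `threshold` from every training point."""
    nearest = min(math.dist(x, p) for p in training_points)
    return nearest > threshold


training_points = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
inside = warning_light((0.1, 0.1), training_points, threshold=0.5)
outside = warning_light((5.0, 5.0), training_points, threshold=0.5)
```

In practice the distance would more plausibly be computed in a learned feature space than on raw inputs, and the threshold would be calibrated on held-out data.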

# 非凸構成最適化のための確率ガウスニュートンアルゴリズム

Stochastic Gauss-Newton Algorithms for Nonconvex Compositional Optimization ( http://arxiv.org/abs/2002.07290v2 )

ライセンス: Link先を確認

Quoc Tran-Dinh and Nhan H. Pham and Lam M. Nguyen

(参考訳) 本研究では,非凸確率的構成最適化問題を解くための2つの新しい確率的ガウス・ニュートンアルゴリズムを開発した。標準仮定の下では期待と有限サムの設定の両方を考慮し、古典確率およびSARAH推定器を用いて関数値とジャコビアンを近似する。期待の場合、予測における定常点を達成するために$\mathcal{O}(\varepsilon^{-2})$ iteration-complexityを確立し、関数値とジャコビアンの両方に対する確率的オラクル呼び出しの総数を推定する($\varepsilon$は所望の精度である)。有限和の場合、$\mathcal{o}(\varepsilon^{-2})$イテレーション複雑度とオラクルの総呼び出しは高い確率で見積もる。我々の知る限り、確率的ガウス・ニュートン法のためにこのような大域的確率的オラクル複雑性が確立されたのはこれが初めてである。最後に,合成データと実データの両方について2つの数値例を用いて理論的結果を示す。

We develop two new stochastic Gauss-Newton algorithms for solving a class of non-convex stochastic compositional optimization problems frequently arising in practice. We consider both the expectation and finite-sum settings under standard assumptions, and use both classical stochastic and SARAH estimators for approximating function values and Jacobians. In the expectation case, we establish $\mathcal{O}(\varepsilon^{-2})$ iteration-complexity to achieve a stationary point in expectation and estimate the total number of stochastic oracle calls for both function value and its Jacobian, where $\varepsilon$ is a desired accuracy. In the finite sum case, we also estimate $\mathcal{O}(\varepsilon^{-2})$ iteration-complexity and the total oracle calls with high probability. To the best of our knowledge, this is the first time such global stochastic oracle complexity is established for stochastic Gauss-Newton methods. Finally, we illustrate our theoretical results via two numerical examples on both synthetic and real datasets.

翻訳日:2022-12-31 13:01:28 公開日:2020-07-02
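
To give a feel for a stochastic Gauss-Newton step, the following scalar sketch minimises 0.5 * F(x)^2 with F(x) = E[f(x, xi)], using a plain mini-batch average to estimate F and a damped Gauss-Newton update. It is a heavy simplification of the setting in the abstract: the outer function is fixed to a square, the Jacobian is taken as exact, no SARAH estimator is used, and the test problem, damping value and names are all assumptions.

```python
import random


def stochastic_gauss_newton(sample_f, jac, x0, batch=64, damping=1.0, iters=200, seed=0):
    """Damped Gauss-Newton iteration for 0.5 * (E[f(x, xi)])**2 with scalar x."""
    rng = random.Random(seed)
    x = x0
    for _ in range(iters):
        # Mini-batch estimate of the inner function value F(x).
        f_hat = sum(sample_f(x, rng) for _ in range(batch)) / batch
        j = jac(x)
        # Minimiser of the linearised model 0.5*(f_hat + j*dx)**2 + 0.5*damping*dx**2.
        x -= j * f_hat / (j * j + damping)
    return x


def noisy_residual(x, rng):
    """Hypothetical inner function: x**2 - 2 observed with Gaussian noise."""
    return x * x - 2.0 + rng.gauss(0.0, 0.1)


root = stochastic_gauss_newton(noisy_residual, lambda x: 2.0 * x, x0=3.0)
```

The iterate settles near the square root of 2, where the noise-free inner function vanishes; a larger batch or a variance-reduced estimator would shrink the remaining fluctuation around that point.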

# Wavesplit:話者クラスタリングによるエンドツーエンド音声分離

Wavesplit: End-to-End Speech Separation by Speaker Clustering ( http://arxiv.org/abs/2002.08933v2 )

ライセンス: Link先を確認

Neil Zeghidour and David Grangier

(参考訳) エンド・ツー・エンドのソース分離システムwavesplitを紹介する。単一の混合から、モデルは各ソースの表現を推論し、推論された表現が与えられた各ソース信号を推定する。モデルは、生の波形から両方のタスクを共同で実行するように訓練される。 Wavesplitはクラスタリングを通じてソース表現のセットを推論し、分離の基本的な置換問題に対処する。音声分離では, 先行処理に比べて, 連続話者表現の方が, 長大かつ難解な録音をより堅牢に分離することができる。 Wavesplitは、2または3つの話者(WSJ0-2/3mix)の清潔な混合(WHAM/WHAMR)に対して、ノイズと残響設定(WHAM/WHAMR)を再定義する。また、最近のLibriMixデータセットに新しいベンチマークを設定しました。最後に,1回の腹部心電図から胎児と母体心拍数を分離することにより,Wavesplitは他の領域にも適用可能であることを示す。

We introduce Wavesplit, an end-to-end source separation system. From a single mixture, the model infers a representation for each source and then estimates each source signal given the inferred representations. The model is trained to jointly perform both tasks from the raw waveform. Wavesplit infers a set of source representations via clustering, which addresses the fundamental permutation problem of separation. For speech separation, our sequence-wide speaker representations provide a more robust separation of long, challenging recordings compared to prior work. Wavesplit redefines the state-of-the-art on clean mixtures of 2 or 3 speakers (WSJ0-2/3mix), as well as in noisy and reverberated settings (WHAM/WHAMR). We also set a new benchmark on the recent LibriMix dataset. Finally, we show that Wavesplit is also applicable to other domains, by separating fetal and maternal heart rates from a single abdominal electrocardiogram.

翻訳日:2022-12-30 06:32:48 公開日:2020-07-02
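
The permutation problem mentioned in the abstract arises because a separation model may output the sources in any order, so a naive loss can penalise a perfect separation. A common way to score outputs regardless of order is a permutation-invariant error, sketched below on toy signals. Note that this illustrates the problem and a generic remedy; the abstract states that Wavesplit itself addresses it through clustering of source representations, which is not reproduced here.

```python
from itertools import permutations


def mse(a, b):
    """Mean squared error between two equal-length signals."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def permutation_invariant_mse(estimates, references):
    """Lowest average MSE over all assignments of estimates to references.

    Returns (loss, assignment) where assignment[i] is the index of the
    estimate matched to references[i].
    """
    n = len(references)

    def total(perm):
        return sum(mse(estimates[perm[i]], references[i]) for i in range(n))

    best = min(permutations(range(n)), key=total)
    return total(best) / n, best


references = [[1.0, 1.0, 1.0], [0.0, 0.0, 0.0]]
estimates = [[0.0, 0.0, 0.0], [1.0, 1.0, 1.0]]  # correct sources, swapped order
loss, assignment = permutation_invariant_mse(estimates, references)
```

Enumerating permutations costs n! evaluations, which is fine for two or three speakers but motivates alternatives when the number of sources grows.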

# 室内シーンの3次元認識

Indoor Scene Recognition in 3D ( http://arxiv.org/abs/2002.12819v2 )

ライセンス: Link先を確認

Shengyu Huang, Mikhail Usvyatsov and Konrad Schindler

(参考訳) どのような環境があるかを認識することは重要な認識課題である。例えば、屋内で動作しているロボットは、キッチン、廊下、寝室にいるかどうかを認識するのに役立ちます。既存のアプローチでは、2D画像や2.5Dレンジ画像に基づいてシーンを分類しようとする。本研究では,3dポイントクラウド(voxel)データからシーン認識を解析し,2d鳥眼の視点に基づく手法を大きく上回ることを示す。さらに,シーン認識の改善方法としてマルチタスク学習を提唱し,シーンタイプがシーン内のオブジェクトと高度に相関していることと,その意味的セグメンテーションを異なるオブジェクトクラスに分類することに着目した。一連のアブレーション研究において、成功したシーン認識は、特定のシーンタイプ(浴槽など)に固有の個々のオブジェクトの認識だけでなく、粗い3次元形状、色、オブジェクトカテゴリの(簡単な)分布など、いくつかの異なる手がかりに依存することを示した。さらに,室内のシーンを精度良く分類するのに,驚くほどスパースな3Dデータが十分であることを示す。

Recognising in what type of environment one is located is an important perception task. For instance, for a robot operating indoors it is helpful to be aware of whether it is in a kitchen, a hallway or a bedroom. Existing approaches attempt to classify the scene based on 2D images or 2.5D range images. Here, we study scene recognition from 3D point cloud (or voxel) data, and show that it greatly outperforms methods based on 2D birds-eye views. Moreover, we advocate multi-task learning as a way of improving scene recognition, building on the fact that the scene type is highly correlated with the objects in the scene, and therefore with its semantic segmentation into different object classes. In a series of ablation studies, we show that successful scene recognition is not just the recognition of individual objects unique to some scene type (such as a bathtub), but depends on several different cues, including coarse 3D geometry, colour, and the (implicit) distribution of object categories. Moreover, we demonstrate that surprisingly sparse 3D data is sufficient to classify indoor scenes with good accuracy.

翻訳日:2022-12-28 02:03:57 公開日:2020-07-02

# unblind your apps:ディープラーニングによるモバイルguiコンポーネントの自然言語ラベルの予測

Unblind Your Apps: Predicting Natural-Language Labels for Mobile GUI Components by Deep Learning ( http://arxiv.org/abs/2003.00380v2 )

ライセンス: Link先を確認

Jieshan Chen, Chunyang Chen, Zhenchang Xing, Xiwei Xu, Liming Zhu, Guoqiang Li, and Jinshui Wang

(参考訳) 世界保健機関(WHO)によると、世界中で約13億人が視覚障害を患っており、そのうち3600万人が盲目である。その障害のため、これらの少数派を社会に巻き込むことは難しい問題である。近年の携帯電話の普及は、視覚障害者が世界を理解するための情報やサービスにアクセスしやすくすることで、新しいソリューションを提供する。視覚障害のあるユーザは、モバイルオペレーティングシステムに埋め込まれたスクリーンリーダーを採用して、アプリ内の各画面のコンテンツを読み、ジェスチャーを使ってスマートフォンと対話することができる。しかし、スクリーンリーダーを使う前提は、開発者がアプリを開発する際に、画像ベースのコンポーネントに自然言語ラベルを追加する必要があることである。 10,408のAndroidアプリの分析によると、残念ながら77%以上のアプリがラベルの不足に悩まされている。これらの問題のほとんどは、マイノリティを考慮した開発者の認識と知識の欠如によって引き起こされる。また、開発者がラベルをUIコンポーネントに追加したいとしても、視覚的な問題がないため、簡潔で明確な説明が得られない可能性がある。これらの課題を克服するために、Google Playの大規模商用アプリから学習することで、画像ベースのボタンのラベルを自動的に予測するディープラーニングベースのモデル、LabelDroidを開発した。実験の結果,本モデルは正確な予測を行うことができ,生成ラベルは実際のandroid開発者よりも高品質であることが判明した。

According to the World Health Organization (WHO), approximately 1.3 billion people worldwide live with some form of vision impairment, of whom 36 million are blind. Because of their disability, including this minority in society is a challenging problem. The recent rise of smartphones provides a new solution by giving blind users convenient access to information and services for understanding the world. Users with vision impairment can use the screen reader embedded in the mobile operating system to read the content of each screen within an app, and use gestures to interact with the phone. However, screen readers only work if developers add natural-language labels to image-based components when developing the app. Unfortunately, more than 77% of apps have missing-label issues, according to our analysis of 10,408 Android apps. Most of these issues are caused by developers' lack of awareness and knowledge of this minority's needs. Even developers who want to add labels to UI components may not come up with concise and clear descriptions, as most of them have no visual impairment themselves. To overcome these challenges, we develop a deep-learning-based model, called LabelDroid, to automatically predict the labels of image-based buttons by learning from large-scale commercial apps on Google Play. The experimental results show that our model can make accurate predictions and that the generated labels are of higher quality than those written by real Android developers.

翻訳日:2022-12-27 13:20:37 公開日:2020-07-02

# Flashlight CNNによる画像ノイズ除去

Flashlight CNN Image Denoising ( http://arxiv.org/abs/2003.00762v2 )

ライセンス: Link先を確認

Pham Huu Thanh Binh, Cristóvão Cruz, Karen Egiazarian

(参考訳) 本稿では,画像ノイズ除去のためのディープニューラルネットワークを実装したFlashLight CNN (FLCNN) という学習ベースのノイズ除去手法を提案する。提案手法は深層残差ネットワークとインセプションネットワークに基づいており、加法性白色ガウス雑音(AWGN)で劣化したグレースケール画像のノイズ除去において、残差ネットワーク単体よりもはるかに多くのパラメータを活用できる。FlashLight CNNは、現在の最先端の画像ノイズ除去手法と定量的および視覚的に比較して、最先端の性能を示す。

This paper proposes a learning-based denoising method called FlashLight CNN (FLCNN) that implements a deep neural network for image denoising. The proposed approach is based on deep residual networks and inception networks, and it can leverage many more parameters than residual networks alone for denoising grayscale images corrupted by additive white Gaussian noise (AWGN). FlashLight CNN demonstrates state-of-the-art performance when compared quantitatively and visually with current state-of-the-art image denoising methods.

翻訳日:2022-12-27 05:50:31 公開日:2020-07-02

# オンラインシンクホーン:サンプルストリームからの最適な輸送距離

Online Sinkhorn: Optimal Transport distances from sample streams ( http://arxiv.org/abs/2003.01415v2 )

ライセンス: Link先を確認

Arthur Mensch (DMA), Gabriel Peyré (DMA)

(参考訳) 最適輸送(OT)距離は、MLタスクの損失関数として日常的に使用される。しかし、任意の(すなわち離散的ではない)確率分布間のot距離を計算することは未解決の問題である。本稿では,2つの任意分布間のエントロピー規則化OT距離の新しいオンライン推定器を提案する。両ディストリビューションからのサンプルストリームを使用して、輸送計画の非パラメトリック表現を反復的に強化する。従来のシンクホーンアルゴリズムと比較すると,本手法は各イテレーションで新たなサンプルを活用し,真の正規化ot距離の一貫した推定を可能にする。オンラインシンクホーンアルゴリズムの収束を理論的に解析し,イテレート列に対してほぼo(1/n)漸近的なサンプル複雑性を示す。本手法は合成1d〜10dデータおよび実3d形状データを用いて検証する。

Optimal Transport (OT) distances are now routinely used as loss functions in ML tasks. Yet, computing OT distances between arbitrary (i.e. not necessarily discrete) probability distributions remains an open problem. This paper introduces a new online estimator of entropy-regularized OT distances between two such arbitrary distributions. It uses streams of samples from both distributions to iteratively enrich a non-parametric representation of the transportation plan. Compared to the classic Sinkhorn algorithm, our method leverages new samples at each iteration, which enables a consistent estimation of the true regularized OT distance. We provide a theoretical analysis of the convergence of the online Sinkhorn algorithm, showing a nearly-O(1/n) asymptotic sample complexity for the iterate sequence. We validate our method on synthetic 1D to 10D data and on real 3D shape data.
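
For orientation, the classic (batch) Sinkhorn algorithm that the abstract uses as its point of comparison can be sketched as follows. This is a minimal illustration for two small discrete distributions, not the online estimator proposed in the paper; the sample points, the squared-distance cost, and the values of `eps` and `n_iters` are assumptions chosen for the example.

```python
import math

def sinkhorn(a, b, cost, eps=0.5, n_iters=500):
    """Classic Sinkhorn iterations for entropy-regularized OT between
    two discrete distributions a (length n) and b (length m)."""
    n, m = len(a), len(b)
    # Gibbs kernel K_ij = exp(-C_ij / eps)
    K = [[math.exp(-cost[i][j] / eps) for j in range(m)] for i in range(n)]
    u = [1.0] * n
    v = [1.0] * m
    for _ in range(n_iters):
        # Alternately rescale rows and columns so the plan matches both marginals.
        u = [a[i] / sum(K[i][j] * v[j] for j in range(m)) for i in range(n)]
        v = [b[j] / sum(K[i][j] * u[i] for i in range(n)) for j in range(m)]
    plan = [[u[i] * K[i][j] * v[j] for j in range(m)] for i in range(n)]
    transport_cost = sum(plan[i][j] * cost[i][j] for i in range(n) for j in range(m))
    return plan, transport_cost

# Hypothetical toy data: three source points and two target points on a line.
xs = [0.0, 1.0, 2.0]
ys = [0.5, 1.5]
a = [1 / 3, 1 / 3, 1 / 3]
b = [0.5, 0.5]
cost = [[(x - y) ** 2 for y in ys] for x in xs]
plan, value = sinkhorn(a, b, cost)
```

The batch version needs both distributions in full before it starts; the abstract's contribution is to replace this with an estimator that is refined as new samples stream in.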

翻訳日:2022-12-26 23:09:20 公開日:2020-07-02

# セルフ・アテンションに基づくメタエンベディング

Meta-Embeddings Based On Self-Attention ( http://arxiv.org/abs/2003.01371v3 )

ライセンス: Link先を確認

Qichen Li, Yuanqing Lin, Luofeng Zhou, Jian Li

(参考訳) 言語モデリングにおけるパフォーマンス向上のためのメタ組込みの作成が近年注目されており、複数の個別に訓練された組込みの算術平均を連結あるいは単に計算する手法が有用であることが示されている。本稿では,自己保持機構,すなわちDuoに基づくメタ埋め込みモデルを提案する。 0.4M未満のパラメータで、Duoメカニズムは20NGのようなテキスト分類タスクで最先端の精度を達成する。さらに,機械翻訳のためのメタ埋め込みシークエンスモデルを提案する。これは我々の知る限り,複数の単語埋め込みに基づく最初の機械翻訳モデルである。さらに、我々のモデルは、よりよい結果を得るだけでなく、WMT 2014英語-フランス語翻訳タスクのような認識されたベンチマークにより早く収束するという点で、Transformerよりも優れていることが判明した。

Creating meta-embeddings for better performance in language modelling has received attention lately, and methods that concatenate, or simply take the arithmetic mean of, several separately trained embeddings have been shown to be beneficial. In this paper, we devise a new meta-embedding model based on the self-attention mechanism, namely the Duo. With fewer than 0.4M parameters, the Duo mechanism achieves state-of-the-art accuracy in text classification tasks such as 20NG. Additionally, we propose a new meta-embedding sequence-to-sequence model for machine translation, which, to the best of our knowledge, is the first machine translation model based on more than one word embedding. Furthermore, our model outperforms the Transformer not only in achieving a better result but also in converging faster on recognized benchmarks, such as the WMT 2014 English-to-French translation task.
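
The two baseline strategies named in the abstract, concatenation and arithmetic mean, can be sketched in a few lines. This is only an illustration of those baselines, not of the Duo model; the vectors are made-up values, and zero-padding shorter vectors before averaging is an assumption made here so that embeddings of different sizes can be combined.

```python
def concat_meta_embedding(vectors):
    """Concatenate several source embeddings of the same word."""
    return [x for v in vectors for x in v]

def mean_meta_embedding(vectors):
    """Average source embeddings, zero-padding shorter ones (an assumption)."""
    dim = max(len(v) for v in vectors)
    padded = [list(v) + [0.0] * (dim - len(v)) for v in vectors]
    return [sum(col) / len(padded) for col in zip(*padded)]

# Hypothetical embeddings of one word from two separately trained sources.
word_a = [1.0, 2.0]
word_b = [3.0, 0.0, 6.0]
concat = concat_meta_embedding([word_a, word_b])  # 5-dimensional
mean = mean_meta_embedding([word_a, word_b])      # 3-dimensional
```

Concatenation keeps all information but grows the dimension with every source, while averaging keeps the dimension fixed at the cost of mixing the sources.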

翻訳日:2022-12-26 22:43:14 公開日:2020-07-02

# DeepFakes進化: 顔面領域の解析とフェイク検出性能

DeepFakes Evolution: Analysis of Facial Regions and Fake Detection Performance ( http://arxiv.org/abs/2004.07532v2 )

ライセンス: Link先を確認

Ruben Tolosana, Sergio Romero-Tapiador, Julian Fierrez and Ruben Vera-Rodriguez

(参考訳) メディアの法医学は、DeepFakesに関する懸念が高まり、ここ数年で多くの注目を集めている。 UADFVやFaceForensics++といった第1世代のDeepFakeデータベースから、Celeb-DFやDFDCといった第2世代の最新のデータベースに至るまで、多くの視覚的改善が行われており、フェイクビデオはほぼ人間の目で区別できない。本研究では,第1世代および第2世代DeepFakeの顔領域と偽検出性能を総合的に分析した。実験フレームワークでは2つの異なる方法が検討されている。一伝統的に、文献に従つて、偽検出システムへの入力として顔全体を選択すること、及び二偽検出システムへの入力としての特定の顔領域の選択に基づく新しいアプローチ実験の結果,第2世代の最新のDeepFakeデータベースにおいて,最先端のフェイク検出によって達成された偽検出結果が,15%から30%の誤差率で検出された。これらの結果は、より洗練された偽検出器を開発するためのさらなる研究の必要性を述べている。

Media forensics has attracted a lot of attention in recent years, in part due to the increasing concerns around DeepFakes. From the initial 1st-generation DeepFake databases such as UADFV and FaceForensics++ to the latest 2nd-generation databases such as Celeb-DF and DFDC, many visual improvements have been made, making fake videos almost indistinguishable to the human eye. This study provides an exhaustive analysis of both the 1st and 2nd DeepFake generations in terms of facial regions and fake detection performance. Two different methods are considered in our experimental framework: i) the traditional one followed in the literature, based on selecting the entire face as input to the fake detection system, and ii) a novel approach based on selecting specific facial regions as input to the fake detection system. Among all the findings resulting from our experiments, we highlight the poor fake detection results achieved even by the strongest state-of-the-art fake detectors on the latest 2nd-generation DeepFake databases, with Equal Error Rate results ranging from 15% to 30%. These results underline the need for further research to develop more sophisticated fake detectors.
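
The Equal Error Rate (EER) quoted in the abstract is the operating point at which the false acceptance rate equals the false rejection rate. A minimal sketch of how it can be approximated from detector scores is given below; the score convention (higher means "more likely fake"), the threshold sweep, and the toy scores are assumptions for illustration and are not taken from the paper.

```python
def equal_error_rate(real_scores, fake_scores):
    """Approximate the EER by sweeping thresholds over all observed scores.
    Convention assumed here: a higher score means 'more likely fake'."""
    thresholds = sorted(set(real_scores) | set(fake_scores))
    best_gap, eer = float("inf"), None
    for t in thresholds:
        # False acceptance: a fake video scores below the threshold.
        far = sum(s < t for s in fake_scores) / len(fake_scores)
        # False rejection: a real video scores at or above the threshold.
        frr = sum(s >= t for s in real_scores) / len(real_scores)
        gap = abs(far - frr)
        if gap < best_gap:
            best_gap, eer = gap, (far + frr) / 2
    return eer

# Hypothetical detector outputs.
real_scores = [0.1, 0.2, 0.3, 0.6]
fake_scores = [0.4, 0.7, 0.8, 0.9]
eer = equal_error_rate(real_scores, fake_scores)
```

With these toy scores one real and one fake video sit on the wrong side of the best threshold, so both error rates are one in four.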

翻訳日:2022-12-12 22:03:58 公開日:2020-07-02

# 驚き最小化による強化学習一般化

Reinforcement Learning Generalization with Surprise Minimization ( http://arxiv.org/abs/2004.12399v2 )

ライセンス: Link先を確認

Jerry Zikun Chen

(参考訳) 一般化は、しばしば同じ決定論的ゲーム環境上で訓練され、テストされる深層強化学習アルゴリズムにとって難しい問題である。テスト環境が目に見えず摂動的だが、タスクの性質が変わらず、一般化のギャップが生じる。本研究では,一般化ベンチマークにおけるサプライズ最小化エージェントの提案と評価を行い,エントロピーと確率性が一定である手続き的ゲーム環境において,単純な密度モデルから得られる付加的な報酬がロバスト性を示すことを示す。

Generalization remains a challenging problem for deep reinforcement learning algorithms, which are often trained and tested on the same set of deterministic game environments. When test environments are unseen and perturbed but the nature of the task remains the same, generalization gaps can arise. In this work, we propose and evaluate a surprise-minimizing agent on a generalization benchmark to show that an additional reward learned from a simple density model can provide robustness in procedurally generated game environments that supply a constant source of entropy and stochasticity.
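
One way to picture "an additional reward learned from a simple density model" is sketched below. The abstract does not specify the density model or how the bonus is combined with the task reward, so everything here is an assumption: a running diagonal Gaussian over visited states, a log-density bonus, and a weighting factor `beta`.

```python
import math

class RunningGaussian:
    """Diagonal Gaussian fitted online to visited states (Welford updates)."""

    def __init__(self, dim, min_var=1e-2):
        self.n = 0
        self.mean = [0.0] * dim
        self.m2 = [0.0] * dim      # running sum of squared deviations
        self.min_var = min_var     # variance floor to keep log_prob finite

    def update(self, state):
        self.n += 1
        for i, x in enumerate(state):
            delta = x - self.mean[i]
            self.mean[i] += delta / self.n
            self.m2[i] += delta * (x - self.mean[i])

    def log_prob(self, state):
        total = 0.0
        for i, x in enumerate(state):
            var = max(self.m2[i] / self.n, self.min_var) if self.n > 0 else 1.0
            total += -0.5 * (math.log(2 * math.pi * var) + (x - self.mean[i]) ** 2 / var)
        return total

def augmented_reward(task_reward, state, model, beta=0.1):
    """Task reward plus a bonus for unsurprising (high-density) states."""
    bonus = model.log_prob(state)
    model.update(state)
    return task_reward + beta * bonus

# Hypothetical 2-d states clustered around the origin.
model = RunningGaussian(dim=2)
for s in [[0.1, -0.1], [0.0, 0.0], [-0.1, 0.1]]:
    model.update(s)
familiar = model.log_prob([0.0, 0.0])
novel = model.log_prob([5.0, 5.0])
reward = augmented_reward(1.0, [0.0, 0.0], model, beta=0.1)
```

States resembling what the agent has already seen receive a higher bonus than novel ones, which is the sense in which such an agent minimizes surprise.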

翻訳日:2022-12-09 12:59:30 公開日:2020-07-02

# グラフニューラルネットワークを不正検出に適用する際の不整合問題を軽減する

Alleviating the Inconsistency Problem of Applying Graph Neural Network to Fraud Detection ( http://arxiv.org/abs/2005.00625v3 )

ライセンス: Link先を確認

Zhiwei Liu, Yingtong Dou, Philip S. Yu, Yutong Deng, Hao Peng

(参考訳) このグラフベースのモデルは、疑わしい詐欺をオンラインで検出するのに役立つ。グラフニューラルネットワーク(gnns)の開発により、先行研究は均質グラフまたはヘテロジニアスグラフのいずれかに基づく多くのgnnベースの不正検出フレームワークを提案している。これらの研究は、近隣の情報を集約してノードの埋め込みを学ぶことで既存のGNNフレームワークに従っている。しかし,不整合問題,すなわちコンテキスト不整合,特徴不整合,関係不整合などについてはほとんど調査されていない。 In this paper, we introduce these inconsistencies and design a new GNN framework, $\mathsf{GraphConsis}$, to tackle the inconsistency problem: (1) for the context inconsistency, we propose to combine the context embeddings with node features, (2) for the feature inconsistency, we design a consistency score to filter the inconsistent neighbors and generate corresponding sampling probability, and (3) for the relation inconsistency, we learn a relation attention weights associated with the sampled nodes. 4つのデータセットに関する実証分析は、不正検出タスクにおいて不整合問題は不可欠であることを示している。広範な実験は$\mathsf{GraphConsis}$の有効性を証明する。また,SOTAモデルを実装したGNNベースの不正検出ツールボックスもリリースした。コードはhttps://github.com/safe-graph/DGFraudで公開されている。

Graph-based models can help detect suspicious fraud online. Owing to the development of Graph Neural Networks (GNNs), prior research has proposed many GNN-based fraud detection frameworks based on either homogeneous or heterogeneous graphs. These works follow the existing GNN framework by aggregating neighboring information to learn the node embedding, which relies on the assumption that neighbors share similar context, features, and relations. However, the inconsistency problem, i.e., context inconsistency, feature inconsistency, and relation inconsistency, has hardly been investigated. In this paper, we introduce these inconsistencies and design a new GNN framework, $\mathsf{GraphConsis}$, to tackle the inconsistency problem: (1) for context inconsistency, we propose to combine the context embeddings with node features; (2) for feature inconsistency, we design a consistency score to filter inconsistent neighbors and generate the corresponding sampling probability; and (3) for relation inconsistency, we learn relation attention weights associated with the sampled nodes. Empirical analysis on four datasets indicates that the inconsistency problem is crucial in a fraud detection task. Extensive experiments demonstrate the effectiveness of $\mathsf{GraphConsis}$. We also released a GNN-based fraud detection toolbox with implementations of SOTA models. The code is available at https://github.com/safe-graph/DGFraud.

翻訳日:2022-12-08 00:30:33 公開日:2020-07-02

# グラフ準同型畳み込み

Graph Homomorphism Convolution ( http://arxiv.org/abs/2005.01214v2 )

ライセンス: Link先を確認

Hoang NT, Takanori Maehara

(参考訳) 本稿では,グラフ準同型の観点からのグラフ分類問題について考察する。我々は、$f$ から $g$ への準同型を考えるが、ここでは$g$ は興味のあるグラフ(例えば分子やソーシャルネットワーク)であり、$f$ はいくつかのグラフ(例えばパスや非同型木)に属する。グラフ準同型数は自然不変量(同型不変量および$\mathcal{f}$-invariant)埋め込み写像を提供し、グラフの分類に利用できることを示した。グラフ分類器の表現力について、$\mathcal{f}$-indistinguishable の概念を用いて、$\mathcal{f}$-invariant 関数を近似するグラフ準同型ベクトルの普遍性を証明する。実際、元が有界木幅を持つ$\mathcal{f}$を選択することで、準同型法は他の方法と比較して効率的であることを示す。

In this paper, we study the graph classification problem from the graph homomorphism perspective. We consider the homomorphisms from $F$ to $G$, where $G$ is a graph of interest (e.g. molecules or social networks) and $F$ belongs to some family of graphs $\mathcal{F}$ (e.g. paths or non-isomorphic trees). We show that graph homomorphism numbers provide natural invariant (isomorphism-invariant and $\mathcal{F}$-invariant) embedding maps that can be used for graph classification. Viewing the expressive power of a graph classifier through the concept of $\mathcal{F}$-indistinguishability, we prove the universality property of graph homomorphism vectors in approximating $\mathcal{F}$-invariant functions. In practice, by choosing $\mathcal{F}$ whose elements have bounded tree-width, we show that the homomorphism method is efficient compared with other methods.
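
As a small worked illustration of homomorphism numbers as graph features, the sketch below counts homomorphisms from paths and cycles into a graph $G$ given by its adjacency matrix. It relies on two standard facts: homomorphisms from a path with $k$ vertices correspond to walks of length $k-1$, and homomorphisms from a cycle of length $k$ correspond to closed walks of length $k$. The choice of paths and cycles as the family, the function names, and the toy graphs are assumptions for this example, not the paper's implementation.

```python
def mat_mul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def mat_pow(A, p):
    n = len(A)
    result = [[int(i == j) for j in range(n)] for i in range(n)]  # identity matrix
    for _ in range(p):
        result = mat_mul(result, A)
    return result

def hom_path(adj, n_vertices):
    """hom(P_k, G): number of walks in G that visit k vertices."""
    return sum(sum(row) for row in mat_pow(adj, n_vertices - 1))

def hom_cycle(adj, length):
    """hom(C_k, G): number of closed walks of length k, i.e. trace(A^k)."""
    P = mat_pow(adj, length)
    return sum(P[i][i] for i in range(len(adj)))

def hom_vector(adj, max_size=3):
    """Embedding of G: path counts for 1..max_size vertices, then cycle counts."""
    paths = [hom_path(adj, k) for k in range(1, max_size + 1)]
    cycles = [hom_cycle(adj, k) for k in range(3, max_size + 1)]
    return paths + cycles

triangle = [[0, 1, 1], [1, 0, 1], [1, 1, 0]]
path3 = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]
path3_relabeled = [[0, 1, 1], [1, 0, 0], [1, 0, 0]]  # same graph, vertices renamed

vec_triangle = hom_vector(triangle)
vec_path = hom_vector(path3)
vec_path_relabeled = hom_vector(path3_relabeled)
```

The two labelings of the three-vertex path give the same vector, which illustrates the isomorphism invariance mentioned in the abstract, while the triangle is separated from the path by every nontrivial count.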

翻訳日:2022-12-07 06:22:47 公開日:2020-07-02

# 分散不一致:無条件テキスト生成のための指標

Distributional Discrepancy: A Metric for Unconditional Text Generation ( http://arxiv.org/abs/2005.01282v2 )

ライセンス: Link先を確認

Ping Cai, Xingyuan Chen, Peng Jin, Hongjun Wang, Tianrui Li

(参考訳) 非条件テキスト生成の目的は、実際の文でモデルを訓練し、トレーニングデータと同じ品質と多様性の新規な文を生成することである。しかし、無条件テキスト生成法を比較するために異なる指標を用いる場合、矛盾した結論が導かれる。難点は、モデルを評価する際に、サンプルの多様性と品質の両方を同時に考慮すべきである点にある。この問題を解決するために, 生成文と実際の訓練文の差異に基づいて生成器を評価する, 新たな分布的不一致尺度(DD)を考案した。しかし、実際の文の分布が得られないため、DDを直接計算することはできない。そこで本研究では,ニューラルネットワークを用いたテキスト分類器の訓練によりDDを推定する手法を提案する。比較のために,既存の3つの指標,すなわちBLEU (bilingual evaluation understudy) とself-BLEU,言語モデルスコアと逆言語モデルスコア,Fréchet埋め込み距離を,提案するDDとともに用いて,長・短期記憶(LSTM)と生成事前学習トランスフォーマ2(GPT-2)という2つの一般的な生成モデルを,構文データと実データの両方で評価した。実験結果から,これらの生成モデルのランク付けにおいて,DDは既存の3つの指標よりも有意に優れていることがわかった。

The purpose of unconditional text generation is to train a model with real sentences, then generate novel sentences of the same quality and diversity as the training data. However, when different metrics are used for comparing the methods of unconditional text generation, contradictory conclusions are drawn. The difficulty is that both the diversity and quality of the sample should be considered simultaneously when the models are evaluated. To solve this problem, a novel metric of distributional discrepancy (DD) is designed to evaluate generators based on the discrepancy between the generated and real training sentences. However, the DD cannot be computed directly because the distribution of real sentences is unavailable. Thus, we propose a method for estimating the DD by training a neural-network-based text classifier. For comparison, three existing metrics, namely bilingual evaluation understudy (BLEU) versus self-BLEU, language model score versus reverse language model score, and Fréchet embedding distance, along with the proposed DD, are used to evaluate two popular generative models, long short-term memory and generative pretrained transformer 2, on both syntactic and real data. Experimental results show that DD is significantly better than the three existing metrics for ranking these generative models.

翻訳日:2022-12-07 00:48:04 公開日:2020-07-02

# MLSolv-A: 原子間ペア相互作用からの溶媒和自由エネルギーの機械学習による新しい予測

MLSolv-A: A Novel Machine Learning-Based Prediction of Solvation Free Energies from Pairwise Atomistic Interactions ( http://arxiv.org/abs/2005.06182v2 )

ライセンス: Link先を確認

Hyuntae Lim and YounJoon Jung

(参考訳) 機械学習とその応用の最近の進歩は、重要な化学特性のための多様な構造-物性相関モデルの開発に結びついており、その1つが溶媒和自由エネルギーである。本稿では,原子間のペア相互作用から溶媒和エネルギーを計算するMLに基づく新しい溶媒和モデルを提案する。提案モデルの新規性は単純なアーキテクチャにある。2つのエンコーディング関数は、与えられた化学構造から原子の特徴ベクトルを抽出し、2つの原子論的特徴の間の内積はそれらの相互作用を計算する。 6,493 件の実験測定値に対する結果から, 優れた性能と, 溶媒非特異性によるトレーニングデータの拡大に対する転移性が得られた。相互作用マップの解析から,本モデルが溶媒和エネルギーに対する群寄与を再現する大きな可能性が示唆され,このモデルが予測対象特性を提供するだけでなく,より詳細な物理化学的洞察を与えると考えている。

Recent advances in machine learning and their applications have led to the development of diverse structure-property relationship models for crucial chemical properties, and the solvation free energy is one of them. Here, we introduce a novel ML-based solvation model, which calculates the solvation energy from pairwise atomistic interactions. The novelty of the proposed model consists of a simple architecture: two encoding functions extract atomic feature vectors from the given chemical structure, while the inner product between two atomistic features calculates their interactions. Results on 6,493 experimental measurements show outstanding performance, as well as transferability for enlarging the training data owing to the model's solvent-non-specific nature. Analysis of the interaction map shows there is a great potential that our model reproduces group contributions on the solvation energy, which makes us believe that the model not only provides the predicted target property but also gives us more detailed physicochemical insights.
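
The inner-product architecture described in the abstract can be pictured with a small numerical sketch. The feature vectors below are made-up placeholders standing in for the outputs of the two encoding functions, and summing all pairwise interactions into one energy is an assumption made for this illustration; the abstract only states that pairwise interactions are computed by inner products.

```python
def interaction_map(solvent_feats, solute_feats):
    """Pairwise interactions: inner product of every solvent/solute atom pair."""
    return [[sum(p * q for p, q in zip(a, b)) for b in solute_feats]
            for a in solvent_feats]

def predicted_energy(imap):
    """Assumed readout: sum of all pairwise interaction terms."""
    return sum(sum(row) for row in imap)

# Hypothetical atomic feature vectors (two solvent atoms, three solute atoms).
solvent = [[1.0, 0.0], [0.5, 0.5]]
solute = [[0.2, -1.0], [-0.4, 0.0], [0.0, 0.3]]
imap = interaction_map(solvent, solute)
energy = predicted_energy(imap)
```

Because every entry of the interaction map belongs to one specific atom pair, the map can be inspected directly, which is the kind of analysis the abstract refers to when discussing group contributions.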

翻訳日:2022-12-03 12:50:52 公開日:2020-07-02

# 頭部検出と追跡熱マップを用いた店内多人数追跡に向けて

Towards in-store multi-person tracking using head detection and track heatmaps ( http://arxiv.org/abs/2005.08009v2 )

ライセンス: Link先を確認

Aibek Musaev, Jiangping Wang, Liang Zhu, Cheng Li, Yi Chen, Jialin Liu, Wanqi Zhang, Juan Mei, De Wang

(参考訳) コンピュータビジョンアルゴリズムは、技術革新を可能にするために、さまざまな産業で実装されている。本稿では,小売業におけるコンピュータビジョンに基づく顧客追跡の問題について検討する。この目的のために,スーパーマーケットにおける顧客行動の模倣を行うオフィス環境において,カメラから収集したデータセットを導入する。さらに,このデータセットを用いた頭部追跡モデルに基づく参加者追跡の例を示し,閉塞による誤りの最小化を図った。さらに,顧客とスタッフの行動パターンに基づいた認識モデルを提案する。モデルは24時間にわたってスーパーマーケットで収集された実世界のデータセットを用いて評価され、トレーニング中の98%の精度と評価時の93%の精度を達成している。

Computer vision algorithms are being implemented across a breadth of industries to enable technological innovations. In this paper, we study the problem of computer-vision-based customer tracking in the retail industry. To this end, we introduce a dataset collected from a camera in an office environment where participants mimic various behaviors of customers in a supermarket. In addition, we describe an illustrative example of the use of this dataset for tracking participants based on a head tracking model in an effort to minimize errors due to occlusion. Furthermore, we propose a model for recognizing customers and staff based on their movement patterns. The model is evaluated on a real-world dataset collected in a supermarket over a 24-hour period, achieving 98% accuracy during training and 93% accuracy during evaluation.

翻訳日:2022-12-02 13:33:06 公開日:2020-07-02

# MixingBoard: 知識に富んだスタイル付き統合テキスト生成プラットフォーム

MixingBoard: a Knowledgeable Stylized Integrated Text Generation Platform ( http://arxiv.org/abs/2005.08365v2 )

ライセンス: Link先を確認

Xiang Gao, Michel Galley, Bill Dolan

(参考訳) MixingBoardは、知識に基づくスタイル付きテキスト生成に焦点を当てた、デモを素早く構築するプラットフォームです。我々は、既存のテキスト生成アルゴリズムを共有コードベースに統合し、制約付き生成に以前のアルゴリズムをさらに適応させる。異なるモデルから利点を借用するため、トークン確率レベルから潜在空間レベルまで、クロスモデル統合のための戦略を実装します。外部知識へのインタフェースは、Webやドキュメントコレクションからオンザフライで関連する知識を取得するモジュールを介して提供される。ローカル開発用のユーザインターフェース、リモートWebページアクセス、RESTful APIが提供され、ユーザが自身のデモを簡単に構築できるようになる。

We present MixingBoard, a platform for quickly building demos with a focus on knowledge-grounded stylized text generation. We unify existing text generation algorithms in a shared codebase and further adapt earlier algorithms for constrained generation. To borrow advantages from different models, we implement strategies for cross-model integration, from the token probability level to the latent space level. An interface to external knowledge is provided via a module that retrieves relevant knowledge on the fly from passages on the web or any document collection. A user interface for local development, remote webpage access, and a RESTful API are provided to make it simple for users to build their own demos.

翻訳日:2022-12-02 05:35:02 公開日:2020-07-02

# ExKMC: 説明可能な$k$-Meansクラスタの拡張

ExKMC: Expanding Explainable $k$-Means Clustering ( http://arxiv.org/abs/2006.02399v2 )

ライセンス: Link先を確認

Nave Frost, Michal Moshkovitz, Cyrus Rashtchian

(参考訳) 説明可能なAIの人気にもかかわらず、教師なし学習の効果的な方法には限界がある。説明可能性と精度のトレードオフに着目し,$k$-meansクラスタリングのアルゴリズムについて検討した。以前の作業の後、データセットを$k$クラスタに分割するために、小さな決定ツリーを使用します。これにより、各クラスタ割り当てを、単一機能しきい値の短いシーケンスで説明できる。大きな木はより正確なクラスタリングを生成するが、さらに複雑な説明を必要とする。フレキシビリティを実現するために、新しい説明可能な$k$-meansクラスタリングアルゴリズムであるExKMCを開発し、$k' \geq k$を加算し、$k'$の葉を持つ決定木を出力する。木を効率的に拡張し、葉に$k$クラスタの1つをラベル付けするために、新しいサロゲートコストを使用します。 k'$が増加するにつれて、サロゲートコストは増加せず、したがって説明可能性と精度を交換する。実験により,ExKMCが低コストのクラスタリングを実現し,標準的な決定木法と説明可能なクラスタリングのためのアルゴリズムの両方に優れることを確認した。 ExKMCの実装はhttps://github.com/navefr/ExKMCで公開されている。

Despite the popularity of explainable AI, there is limited work on effective methods for unsupervised learning. We study algorithms for $k$-means clustering, focusing on a trade-off between explainability and accuracy. Following prior work, we use a small decision tree to partition a dataset into $k$ clusters. This enables us to explain each cluster assignment by a short sequence of single-feature thresholds. While larger trees produce more accurate clusterings, they also require more complex explanations. To allow flexibility, we develop a new explainable $k$-means clustering algorithm, ExKMC, that takes an additional parameter $k' \geq k$ and outputs a decision tree with $k'$ leaves. We use a new surrogate cost to efficiently expand the tree and to label the leaves with one of $k$ clusters. We prove that as $k'$ increases, the surrogate cost is non-increasing, and hence, we trade explainability for accuracy. Empirically, we validate that ExKMC produces a low cost clustering, outperforming both standard decision tree methods and other algorithms for explainable clustering. An implementation of ExKMC is available at https://github.com/navefr/ExKMC.
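
To make "a short sequence of single-feature thresholds" concrete, the sketch below assigns points to clusters with a small hand-built threshold tree and reports the path as the explanation. It does not implement ExKMC's tree expansion or its surrogate cost; the tree layout, the dictionary keys, and the data are hypothetical and serve only to show what such an explanation looks like and how the resulting $k$-means cost can be measured.

```python
def assign(tree, point, path=None):
    """Follow single-feature thresholds down to a leaf; return (cluster, explanation)."""
    path = [] if path is None else path
    if "cluster" in tree:
        return tree["cluster"], path
    f, t = tree["feature"], tree["threshold"]
    if point[f] <= t:
        return assign(tree["left"], point, path + [f"x[{f}] <= {t}"])
    return assign(tree["right"], point, path + [f"x[{f}] > {t}"])

def kmeans_cost(points, labels, k):
    """Sum of squared distances of each point to the mean of its cluster."""
    cost = 0.0
    for c in range(k):
        members = [p for p, l in zip(points, labels) if l == c]
        if not members:
            continue
        center = [sum(col) / len(members) for col in zip(*members)]
        cost += sum(sum((x - m) ** 2 for x, m in zip(p, center)) for p in members)
    return cost

# Hypothetical threshold tree with k = 3 leaves, one per cluster.
tree = {
    "feature": 0, "threshold": 2.0,
    "left": {"cluster": 0},
    "right": {
        "feature": 1, "threshold": 2.0,
        "left": {"cluster": 1},
        "right": {"cluster": 2},
    },
}
points = [(0, 0), (1, 0), (4, 0), (5, 1), (4, 5), (5, 5)]
labels = [assign(tree, p)[0] for p in points]
cluster, explanation = assign(tree, (5, 1))
cost = kmeans_cost(points, labels, k=3)
```

Allowing more leaves than clusters, as ExKMC does with $k' \geq k$, lengthens such explanations in exchange for a lower clustering cost.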

翻訳日:2022-11-25 17:46:50 公開日:2020-07-02

# 少数ショットサンプルにおける解釈可能な時系列分類

Interpretable Time-series Classification on Few-shot Samples ( http://arxiv.org/abs/2006.02031v2 )

ライセンス: Link先を確認

Wensi Tang, Lu Liu, Guodong Long

(参考訳) 最近の少数ショット学習は、未知のクラスやサンプルを含む新しいタスクに素早く適応するために、事前のメタ知識を持つモデルをトレーニングすることに焦点を当てている。しかし,従来の時系列分類アルゴリズムでは,少数ショットのシナリオに対処できない。既存の少数ショット学習手法は、画像やテキストデータに取り組むために提案されており、その多くは、解釈性に欠けるニューラルベースモデルである。本稿では,少数ショット時系列分類のための解釈可能なニューラルベースフレームワークであるDual Prototypical Shapelet Networks (DPSN) を提案する。DPSNは,ニューラルネットワークに基づくモデルを学習するだけでなく,そのモデルを次の2つの粒度から解釈する。 1)代表時系列サンプルを用いたグローバル概観,及び 2)識別的シェープレットを用いた局所ハイライト。特に、生成された二重プロトタイプシェープレットは、クラス内の全てのサンプルの全体形状を主に示す代表サンプルと、異なるクラスを区別するために使用できる識別的な部分長シェープレットから構成される。我々は,公開ベンチマークデータセットから18個の少数ショットTSCデータセットを導出し,ベースラインとの比較により提案手法を評価した。 DPSNフレームワークは、特に限られた量のデータによるトレーニングにおいて、最先端の時系列分類方法より優れている。モデルの解釈能力を示すためにいくつかの事例研究を示す。

Recent few-shot learning works focus on training a model with prior meta-knowledge to adapt quickly to new tasks with unseen classes and samples. However, conventional time-series classification algorithms fail to tackle the few-shot scenario. Existing few-shot learning methods are proposed to tackle image or text data, and most of them are neural-based models that lack interpretability. This paper proposes an interpretable neural-based framework, namely Dual Prototypical Shapelet Networks (DPSN), for few-shot time-series classification, which not only trains a neural network-based model but also interprets the model at two granularities: 1) a global overview using representative time-series samples, and 2) local highlights using discriminative shapelets. In particular, the generated dual prototypical shapelets consist of representative samples that largely capture the overall shapes of all samples in the class, and discriminative partial-length shapelets that can be used to distinguish different classes. We have derived 18 few-shot TSC datasets from public benchmark datasets and evaluated the proposed method by comparing it with baselines. The DPSN framework outperforms state-of-the-art time-series classification methods, especially when training with limited amounts of data. Several case studies are given to demonstrate the interpretability of our model.

翻訳日:2022-11-25 17:17:25 公開日:2020-07-02

# 下位レベルシングルトンを超えた二段階計画法のための汎用一階アルゴリズムフレームワーク

A Generic First-Order Algorithmic Framework for Bi-Level Programming Beyond Lower-Level Singleton ( http://arxiv.org/abs/2006.04045v2 )

ライセンス: Link先を確認

Risheng Liu, Pan Mu, Xiaoming Yuan, Shangzhi Zeng, Jin Zhang

(参考訳) 近年,学習応用のための二段階最適化問題の解法として,勾配に基づく一階法が数多く開発されている。しかしながら、これらの既存のアプローチの理論的保証は、各固定された上層変数に対して、下層解がシングルトン(Lower-Level Singleton, LLS)でなければならないという単純化に大きく依存している。本研究では,まず,LLS条件が成り立たない場合を示す反例を設計する。次に、楽観的な二段階最適化の観点からBLPを定式化し、階層的目的情報を集約することで、汎用的な二段階最適化のための柔軟でモジュール化されたアルゴリズムフレームワークであるBDA(Bi-level Descent Aggregation)を確立する。理論的には、LLS条件なしでBDAの収束を証明する新しい手法を導出する。我々の研究は、BDAが多様な一階計算モジュールと互換性があることも示している。さらに、興味深い副産物として、従来の一階二段階スキーム(LLS単純化の下)も改善する。特に、より弱い仮定で収束を確立する。広範にわたる実験により,理論的結果が裏付けられ,ハイパーパラメータ最適化やメタ学習など,さまざまなタスクに対する提案するBDAの優越性が実証された。

In recent years, a variety of gradient-based first-order methods have been developed to solve bi-level optimization problems for learning applications. However, theoretical guarantees of these existing approaches heavily rely on the simplification that for each fixed upper-level variable, the lower-level solution must be a singleton (a.k.a. Lower-Level Singleton, LLS). In this work, we first design a counter-example to illustrate that this LLS condition can fail. Then, by formulating BLPs from the viewpoint of optimistic bi-level optimization and aggregating hierarchical objective information, we establish Bi-level Descent Aggregation (BDA), a flexible and modularized algorithmic framework for generic bi-level optimization. Theoretically, we derive a new methodology to prove the convergence of BDA without the LLS condition. Our investigations also demonstrate that BDA is indeed compatible with a variety of particular first-order computation modules. Additionally, as an interesting byproduct, we also improve the conventional first-order bi-level schemes (under the LLS simplification). In particular, we establish their convergence under weaker assumptions. Extensive experiments justify our theoretical results and demonstrate the superiority of the proposed BDA for different tasks, including hyper-parameter optimization and meta learning.

翻訳日:2022-11-24 07:09:37 公開日:2020-07-02

# WaveNODE:音声合成のための連続正規化フロー

WaveNODE: A Continuous Normalizing Flow for Speech Synthesis ( http://arxiv.org/abs/2006.04598v4 )

ライセンス: Link先を確認

Hyeongju Kim, Hyeonseung Lee, Woo Hyun Kang, Sung Jun Cheon, Byoung Jin Choi, Nam Soo Kim

(参考訳) 近年,高忠実度波形をリアルタイムに生成するフローベース生成モデルが提案されている。しかし、これらのモデルは、よく訓練された教師ネットワークか、メモリ非効率な複数のフローステップを必要とする。本稿では,音声合成のための連続正規化フローを利用するWaveNODEという新しい生成モデルを提案する。従来のモデルとは異なり、WaveNODEはフロー操作に使用する関数に制約を課さないため、より柔軟で複雑な関数を使用することができる。さらに、WaveNODEは教師ネットワークや補助的損失項を必要とせずに、可能性の最大化に最適化することができる。本研究では,従来のフローベースボコーダに比べて少ないパラメータでウェーブヌードが同等の性能を発揮することを示す。

In recent years, various flow-based generative models have been proposed to generate high-fidelity waveforms in real-time. However, these models require either a well-trained teacher network or a number of flow steps making them memory-inefficient. In this paper, we propose a novel generative model called WaveNODE which exploits a continuous normalizing flow for speech synthesis. Unlike the conventional models, WaveNODE places no constraint on the function used for flow operation, thus allowing the usage of more flexible and complex functions. Moreover, WaveNODE can be optimized to maximize the likelihood without requiring any teacher network or auxiliary loss terms. We experimentally show that WaveNODE achieves comparable performance with fewer parameters compared to the conventional flow-based vocoders.

翻訳日:2022-11-24 01:17:25 公開日:2020-07-02

# 限定アノテーション下での有糸分裂検出: 共同学習によるアプローチ

Mitosis Detection Under Limited Annotation: A Joint Learning Approach ( http://arxiv.org/abs/2006.09772v2 )

ライセンス: Link先を確認

Pushpak Pati, Antonio Foncubierta-Rodriguez, Orcun Goksel, Maria Gabrani

(参考訳) 有糸分裂計数は乳癌における腫瘍増殖の重要な予後指標である。深層学習に基づくmitotic detectionは病理学者と同等だが、トレーニングには大きなラベル付きデータが必要である。本研究では,ソフトマックス損失によるクラスラベル情報と,距離メトリック学習によるサンプル間の空間分布情報を活用することで,mitosis検出の深部分類フレームワークを提案する。また,学習を促進するための情報的サンプルを着実に提供するための戦略についても検討する。提案手法の有効性は,ICPR 2012 およびAMIDA 2013 mitotic data による評価により確立された。本フレームワークは,トレーニングデータ全体の使用方法と比較して,少ないトレーニングデータによる検出を著しく改善し,同等あるいは優れたパフォーマンスを実現している。

Mitotic counting is a vital prognostic marker of tumor proliferation in breast cancer. Deep learning-based mitotic detection is on par with pathologists, but it requires large labeled data for training. We propose a deep classification framework for enhancing mitosis detection by leveraging class label information, via softmax loss, and spatial distribution information among samples, via distance metric learning. We also investigate strategies for steadily providing informative samples to boost the learning. The efficacy of the proposed framework is established through evaluation on ICPR 2012 and AMIDA 2013 mitotic data. Our framework significantly improves detection with small training data and achieves on-par or superior performance compared to state-of-the-art methods when using the entire training data.

翻訳日:2022-11-19 20:35:35 公開日:2020-07-02

# GCC: グラフニューラルネットワーク事前トレーニングのためのグラフコントラスト符号化

GCC: Graph Contrastive Coding for Graph Neural Network Pre-Training ( http://arxiv.org/abs/2006.09963v3 )

ライセンス: Link先を確認

Jiezhong Qiu, Qibin Chen, Yuxiao Dong, Jing Zhang, Hongxia Yang, Ming Ding, Kuansan Wang, Jie Tang

(参考訳) グラフ表現学習は現実世界の問題に対処する強力な手法として登場した。ダウンストリームグラフ学習タスクは、ノード分類、類似性探索、グラフ分類などの最近の発展の恩恵を受けている。しかしながら、グラフ表現学習における先行技術は、ドメイン固有の問題に焦点を当て、各グラフデータセットの専用モデルをトレーニングする。自然言語処理とコンピュータビジョンからの事前学習の最近の進歩に触発されて、我々はグラフコントラストコーディング (gcc) -- 自己教師付きグラフニューラルネットワーク事前学習フレームワーク -- を設計、複数のネットワークにまたがるユニバーサルネットワークトポロジー特性をキャプチャする。我々はgccの事前学習タスクを,ネットワーク内およびネットワーク間におけるサブグラフインスタンス識別として設計し,グラフニューラルネットワークに内在的かつ転送可能な構造表現を学習させるためのコントラスト学習を利用する。 3つのグラフ学習タスクと10のグラフデータセットに関する広範な実験を行う。その結果,多種多様なデータセットの集合を事前学習したgccは,そのタスク固有かつスクラッチからトレーニングされたデータに対して,競争力やパフォーマンスの向上が期待できることがわかった。このことは、事前学習と微調整のパラダイムがグラフ表現学習に大きな可能性を示唆している。

Graph representation learning has emerged as a powerful technique for addressing real-world problems. Various downstream graph learning tasks have benefited from its recent developments, such as node classification, similarity search, and graph classification. However, prior art in graph representation learning focuses on domain-specific problems and trains a dedicated model for each graph dataset, which is usually non-transferable to out-of-domain data. Inspired by the recent advances in pre-training from natural language processing and computer vision, we design Graph Contrastive Coding (GCC) -- a self-supervised graph neural network pre-training framework -- to capture the universal network topological properties across multiple networks. We design GCC's pre-training task as subgraph instance discrimination in and across networks and leverage contrastive learning to empower graph neural networks to learn the intrinsic and transferable structural representations. We conduct extensive experiments on three graph learning tasks and ten graph datasets. The results show that GCC pre-trained on a collection of diverse datasets can achieve competitive or better performance compared with its task-specific, trained-from-scratch counterparts. This suggests that the pre-training and fine-tuning paradigm presents great potential for graph representation learning.
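
Instance discrimination with contrastive learning is commonly trained with an InfoNCE-style loss, in which a query embedding must pick out its positive key among a set of negatives. The sketch below shows that loss on plain Python lists. Treating InfoNCE as the objective, the temperature value, and the toy embeddings are assumptions for illustration; the abstract itself only names contrastive learning and subgraph instance discrimination, and in GCC the embeddings would come from a graph neural network encoder applied to sampled subgraphs.

```python
import math

def info_nce(query, keys, positive_index, temperature=0.07):
    """InfoNCE loss: negative log-softmax of the positive key's similarity."""
    logits = [sum(q * k for q, k in zip(query, key)) / temperature for key in keys]
    m = max(logits)  # subtract the maximum for numerical stability
    log_sum_exp = m + math.log(sum(math.exp(l - m) for l in logits))
    return log_sum_exp - logits[positive_index]

# Hypothetical unit-length embeddings of one query subgraph and three candidate keys.
query = [1.0, 0.0]
keys = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]
loss_matched = info_nce(query, keys, positive_index=0)     # positive is the similar key
loss_mismatched = info_nce(query, keys, positive_index=2)  # positive is the opposite key
```

The loss is close to zero when the positive key is the one most similar to the query and grows large otherwise, which pushes the encoder to place two views of the same subgraph close together.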

翻訳日:2022-11-19 20:27:25 公開日:2020-07-02

# クレジット・スコーリングのためのRパッケージのランドスケープに関する概観

An Overview on the Landscape of R Packages for Credit Scoring ( http://arxiv.org/abs/2006.11835v2 )

ライセンス: Link先を確認

Gero Szepannek

(参考訳) 信用スコア業界は、ローンのデフォルト確率予測に統計ツールを使用するという長い伝統があり、マシンラーニングの誇大宣伝よりずっと前にドメイン固有の標準が確立されている。いくつかの商用ソフトウェア会社は、Rの明示的なパッケージでクレジットカードをモデリングするための特定のソリューションを提供しているが、この目的のために長い間失われてきた。近年は変更され、クレジットスコアリングに特化したパッケージがいくつか開発されている。本論文の目的は,これらのパッケージの概観を構造化することである。これによってユーザは、希望する目的のために適切な機能を選択することができ、さらに将来の開発活動の指揮に貢献することが望まれる。この論文は、典型的なスコアカード開発プロセスを形成するための、その後のモデリングステップの連鎖によって導かれる。

The credit scoring industry has a long tradition of using statistical tools for loan default probability prediction, and domain-specific standards were established long before the hype of machine learning. Although several commercial software companies offer specific solutions for credit scorecard modelling, in R explicit packages for this purpose have long been missing. In recent years this has changed, and several packages dedicated to credit scoring have been developed. The aim of this paper is to give a structured overview of these packages. This may guide users in selecting the appropriate functions for a desired purpose and will hopefully contribute to directing future development activities. The paper is structured along the chain of subsequent modelling steps that form the typical scorecard development process.

翻訳日:2022-11-18 12:42:56 公開日:2020-07-02

# 計量空間等級と重みベクトルの実用的応用

Practical applications of metric space magnitude and weighting vectors ( http://arxiv.org/abs/2006.14063v2 )

ライセンス: Link先を確認

Eric Bunch, Daniel Dickinson, Jeffery Kline, Glenn Fung

(参考訳) 代数的トポロジーの研究の活発な主題である計量空間等級は、もともと生物学の文脈で発生し、そこでは環境における異なる種の有効数を表すために用いられた。より一般的な設定では、計量空間の大きさは、空間内の異なる点の有効数の定量化を目的とした実数である。計量空間のグローバル等級への各点の寄与は、元の計量空間の基盤となる幾何の多くを捉えている。驚くべきことに、計量空間がユークリッド空間であるとき、重み付けベクトルは境界検出の有効なツールでもある。これにより、重み付けベクトルは、分類、外れ値検出、アクティブラーニングといった古典的な機械学習タスクのための新しいアルゴリズムの基礎となる。古典的なベンチマークデータセットの実験と比較を用いて、提案した大きさの約束とベクトルベースのアプローチを重み付けする。

Metric space magnitude, an active subject of research in algebraic topology, originally arose in the context of biology, where it was used to represent the effective number of distinct species in an environment. In a more general setting, the magnitude of a metric space is a real number that aims to quantify the effective number of distinct points in the space. The contribution of each point to a metric space's global magnitude, which is encoded by the weighting vector, captures much of the underlying geometry of the original metric space. Surprisingly, when the metric space is Euclidean, the weighting vector also serves as an effective tool for boundary detection. This allows the weighting vector to serve as the foundation of novel algorithms for classic machine learning tasks such as classification, outlier detection and active learning. We demonstrate, using experiments and comparisons on classic benchmark datasets, the promise of the proposed magnitude and weighting vector-based approaches.
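
For a finite metric space, the weighting vector has a compact standard definition: with similarity matrix $Z_{ij} = e^{-d(x_i, x_j)}$, the weighting $w$ solves $Zw = \mathbf{1}$, and the magnitude is $\sum_i w_i$. The sketch below computes both for a handful of points on a line, using plain Gaussian elimination to stay dependency-free. The point set and function name are assumptions chosen for the example; the abstract does not prescribe an implementation.

```python
import math

def weighting_vector(points):
    """Solve Z w = 1 with Z_ij = exp(-d(x_i, x_j)) by Gaussian elimination."""
    n = len(points)
    # Augmented matrix [Z | 1]
    M = [[math.exp(-math.dist(p, q)) for q in points] + [1.0] for p in points]
    for col in range(n):
        # Partial pivoting for numerical stability.
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    # Back substitution.
    w = [0.0] * n
    for i in reversed(range(n)):
        w[i] = (M[i][n] - sum(M[i][j] * w[j] for j in range(i + 1, n))) / M[i][i]
    return w

# Five equally spaced points on a line (hypothetical toy data).
points = [(0.0,), (1.0,), (2.0,), (3.0,), (4.0,)]
w = weighting_vector(points)
magnitude = sum(w)
```

In this toy case the two endpoints receive larger weights than the interior points, a small instance of the boundary-detection behaviour the abstract describes for Euclidean data.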

翻訳日:2022-11-17 10:06:52 公開日:2020-07-02

# 効率的なブロックスパースニューラルネットワークのためのラマヌジャン二部グラフ製品

Ramanujan Bipartite Graph Products for Efficient Block Sparse Neural Networks ( http://arxiv.org/abs/2006.13486v2 )

ライセンス: Link先を確認

Dharma Teja Vooturi, Girish Varma, Kishore Kothapalli

(参考訳) スパースニューラルネットワークは、より高密度なバージョンと競合する正確な予測を与えるとともに、実行された算術演算数を最小化する。しかし、GPUのような現在のハードウェアは、より効率良く構造化されたスパーシティパターンしか利用できない。したがって、スパースニューラルネットワークの実行時間は、必要な演算処理に対応しない可能性がある。本研究では,階層型マルチレベルブロックスパースニューラルネットワークをグラフ積の理論を用いて生成するRBGP(Ramanujan Bipartite Graph Product)フレームワークを提案する。ラマヌジャングラフの積も提案するが、これは与えられた範囲で最高の接続性を与える。これは本質的に、i) ネットワークが、実行時効率のよいアルゴリズムが存在する構造化ブロックスパーシティを持つこと、ii) グラフの接続性に起因する表現力の向上により、モデルの予測精度が向上すること、iii) グラフデータ構造が、メモリに効率的に格納できる簡潔な表現を有すること、を保証する。このフレームワークを使用してRBGP4と呼ばれる特定の接続パターンを設計し、GPUで利用可能なメモリ階層を効率的に利用します。我々は、VGG19とWideResnet-40-4ネットワークを用いて、CIFARデータセット上の画像分類タスクを実験し、非構造パターンとブロック間隔パターンに対して、それぞれ5-9xと2-5xのランタイムゲインを達成するとともに、同じレベルの精度を実現した。

Sparse neural networks have been shown to give accurate predictions competitive with denser versions, while also minimizing the number of arithmetic operations performed. However, current hardware such as GPUs can only exploit structured sparsity patterns for better efficiency. Hence the run time of a sparse neural network may not correspond to the arithmetic operations required. In this work, we propose the RBGP (Ramanujan Bipartite Graph Product) framework for generating structured multi-level block sparse neural networks using the theory of graph products. We also propose to use products of Ramanujan graphs, which give the best connectivity for a given level of sparsity. This essentially ensures that i) the network has structured block sparsity, for which runtime-efficient algorithms exist; ii) the model gives high prediction accuracy, due to the better expressive power derived from the connectivity of the graph; and iii) the graph data structure has a succinct representation that can be stored efficiently in memory. We use our framework to design a specific connectivity pattern called RBGP4, which makes efficient use of the memory hierarchy available on GPUs. We benchmark our approach on image classification over the CIFAR dataset using VGG19 and WideResnet-40-4 networks and achieve 5-9x and 2-5x runtime gains over unstructured and block sparsity patterns respectively, while achieving the same level of accuracy.
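
To make the idea of a block-sparse connectivity pattern built from a graph product more concrete, here is a small sketch. It assumes that the product of two bipartite graphs can be represented by the Kronecker product of their biadjacency matrices; that reading, the helper names, and the two tiny factor graphs are assumptions for illustration. The factors below are simple regular bipartite graphs, not actual Ramanujan constructions, and this is not the RBGP4 pattern.

```python
def kron(a, b):
    """Kronecker product of two 0/1 matrices given as lists of rows."""
    return [[x * y for x in row_a for y in row_b] for row_a in a for row_b in b]


def density(mask):
    """Fraction of nonzero entries in a 0/1 mask."""
    total = sum(len(row) for row in mask)
    return sum(sum(row) for row in mask) / total


# Coarse factor: 4x4 biadjacency matrix of a 2-regular bipartite graph (circulant).
coarse = [[1 if (j - i) % 4 in (0, 1) else 0 for j in range(4)] for i in range(4)]
# Fine factor: a fully connected 2x2 block.
fine = [[1, 1], [1, 1]]

# The product is an 8x8 weight mask made of dense 2x2 blocks,
# with the block layout dictated by the coarse graph.
mask = kron(coarse, fine)
```

Only the two small factors need to be stored to describe the full mask, which is one way to read the "succinct representation" claim in the abstract.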

翻訳日:2022-11-17 09:49:06 公開日:2020-07-02

# 自動運転車用ディープラーニングコンポーネントにおける不確実性推定手法の比較

A Comparison of Uncertainty Estimation Approaches in Deep Learning Components for Autonomous Vehicle Applications ( http://arxiv.org/abs/2006.15172v2 )

ライセンス: Link先を確認

Fabio Arnez (1), Huascar Espinoza (1), Ansgar Radermacher (1) and François Terrier (1) ((1) CEA LIST)

(参考訳) 自動運転車(AV)の安全性を確保する重要な要因は、望ましくない、予測できない状況下での異常行動を避けることである。 AVは安全クリティカルなタスクを実行するためにディープニューラルネットワーク(DNN)にますます依存しているため、データやモデルの必然的なエラーの原因を測定するために、不確実性定量化のためのさまざまな方法が最近提案されている。しかし、DNNにおける不確実性定量化は依然として難しい課題である。これらの手法は高い計算負荷と高いメモリフットプリントを必要とし、安全性が重要なアプリケーションでは禁止される余分なレイテンシをもたらす。本稿では,DNNにおける不確実性定量化手法と,不確実性予測を評価するための既存の指標について,簡潔かつ比較検討する。特に、特定のavタスクや不確実性ソースのタイプに対する各メソッドの利点と欠点を理解することに関心があります。

A key factor for ensuring safety in Autonomous Vehicles (AVs) is to avoid any abnormal behaviors under undesirable and unpredicted circumstances. As AVs increasingly rely on Deep Neural Networks (DNNs) to perform safety-critical tasks, different methods for uncertainty quantification have recently been proposed to measure the inevitable source of errors in data and models. However, uncertainty quantification in DNNs is still a challenging task. These methods require a higher computational load, a higher memory footprint, and introduce extra latency, which can be prohibitive in safety-critical applications. In this paper, we provide a brief and comparative survey of methods for uncertainty quantification in DNNs along with existing metrics to evaluate uncertainty predictions. We are particularly interested in understanding the advantages and downsides of each method for specific AV tasks and types of uncertainty sources.

翻訳日:2022-11-16 21:13:07 公開日:2020-07-02

# ニューラルmcmcのための深部インボリューティブ生成モデル

Deep Involutive Generative Models for Neural MCMC ( http://arxiv.org/abs/2006.15167v2 )

ライセンス: Link先を確認

Span Spanbauer, Cameron Freer, Vikash Mansinghka

(参考訳) Involutive Neural MCMC(Involutive Neural MCMC)を高速なニューラルMCMCの新しいアプローチとして定義するために,Deep Involutive Generative ModelとDeep Generative Modelingの新しいアーキテクチャを導入している。帰納的生成モデル (involutive generative model) は、確率核 $g(\phi \mapsto \phi')$ を、補助変数 $\pi$ を含む拡大状態空間上の帰納的決定関数 $f(\phi, \pi)$ として表現する。そこで本研究では,これらのモデルのボリューム保存方法と,さらに深いボリューム保存型インボラティブ生成モデルを用いて,適切なメトロポリス・ハスティング更新を行う方法を示す。深部インボリューティブ生成モデルとその体積保存特例が確率核の普遍近似であることを示す。これにより、十分なネットワーク容量とトレーニング時間があれば、任意の複雑なMCMC更新を学習することができる。シミュレーションデータを用いた学習パラメータの損失関数と最適化アルゴリズムを定義する。また, ハイブリッドモンテカルロでは難解なマルチモーダル分布を効率的に探索し, 最近導入したニューラルmcmc技術であるa-nice-mcよりも高速に収束できることを示す実験を行った。

We introduce deep involutive generative models, a new architecture for deep generative modeling, and use them to define Involutive Neural MCMC, a new approach to fast neural MCMC. An involutive generative model represents a probability kernel $G(\phi \mapsto \phi')$ as an involutive (i.e., self-inverting) deterministic function $f(\phi, \pi)$ on an enlarged state space containing auxiliary variables $\pi$. We show how to make these models volume preserving, and how to use deep volume-preserving involutive generative models to make valid Metropolis-Hastings updates based on an auxiliary variable scheme with an easy-to-calculate acceptance ratio. We prove that deep involutive generative models and their volume-preserving special case are universal approximators for probability kernels. This result implies that with enough network capacity and training time, they can be used to learn arbitrarily complex MCMC updates. We define a loss function and optimization algorithm for training parameters given simulated data. We also provide initial experiments showing that Involutive Neural MCMC can efficiently explore multi-modal distributions that are intractable for Hybrid Monte Carlo, and can converge faster than A-NICE-MC, a recently introduced neural MCMC technique.
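
The Metropolis-Hastings update described above can be sketched in a few lines. This is a toy illustration, not the paper's learned model: the involution is the hand-written, volume-preserving map (phi, pi) -> (phi + pi, -pi) rather than a deep network, and the target density, auxiliary distribution, and function names are all assumptions. For a volume-preserving involution, the acceptance ratio reduces to p(phi') q(pi') / (p(phi) q(pi)).

```python
import math
import random


def log_target(phi):
    """Unnormalised log-density of a two-mode mixture (unit Gaussians at -3 and +3)."""
    a = -0.5 * (phi + 3.0) ** 2
    b = -0.5 * (phi - 3.0) ** 2
    m = max(a, b)
    return m + math.log(math.exp(a - m) + math.exp(b - m))


def log_aux(pi, sigma):
    """Unnormalised log-density of the auxiliary variable, pi ~ N(0, sigma^2)."""
    return -0.5 * (pi / sigma) ** 2


def involution(phi, pi):
    """Self-inverse, volume-preserving map on the enlarged state (phi, pi)."""
    return phi + pi, -pi


def involutive_mh_step(phi, rng, sigma=2.0):
    pi = rng.gauss(0.0, sigma)               # sample the auxiliary variable
    phi_new, pi_new = involution(phi, pi)    # deterministic proposal
    log_ratio = (log_target(phi_new) + log_aux(pi_new, sigma)
                 - log_target(phi) - log_aux(pi, sigma))
    if rng.random() < math.exp(min(0.0, log_ratio)):
        return phi_new
    return phi


def run_chain(steps, seed=0):
    rng = random.Random(seed)
    phi, samples = 0.0, []
    for _ in range(steps):
        phi = involutive_mh_step(phi, rng)
        samples.append(phi)
    return samples
```

With this particular involution the update is just random-walk Metropolis; the approach in the abstract replaces the fixed map with a trained involutive network.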

翻訳日:2022-11-16 20:37:55 公開日:2020-07-02

# データ選択バイアスによるdecorrelated clustering

Decorrelated Clustering with Data Selection Bias ( http://arxiv.org/abs/2006.15874v2 )

ライセンス: Link先を確認

Xiao Wang, Shaohua Fan, Kun Kuang, Chuan Shi, Jiawei Liu and Bai Wang

(参考訳) 既存のクラスタリングアルゴリズムのほとんどは、データの選択バイアスを考慮せずに提案されている。しかし、実際の多くのアプリケーションでは、データが偏りがないことを保証できない。選択バイアスは、機能間の予期せぬ相関とこれらの予期せぬ相関を無視して、クラスタリングアルゴリズムのパフォーマンスを損なう可能性がある。したがって、選択バイアスによって引き起こされる予期せぬ相関をいかに取り除くかは極めて重要であるが、クラスタリングに関してほとんど検討されていない。本稿では,データ選択バイアスを伴うクラスタリングのためのデコリレーション正規化K-Meansアルゴリズム(DCKM)を提案する。具体的には、デコリレーション・レギュレータは、サンプル分布のバランスをとることができるグローバルなサンプル重量を学習し、特徴間の予期せぬ相関を取り除くことを目的としている。一方,学習重みはk-meansと組み合わされ,k-meansクラスタは予期しない相関の影響を伴わずに固有データ分布上に重み付けされる。さらに、DCKMのパラメータを効果的に推測する更新ルールを導出する。実世界のデータセットに対する広範囲な実験結果から,dckmアルゴリズムは有意な性能向上を達成でき,クラスタリング時に選択バイアスによって引き起こされる予期せぬ特徴相関を取り除く必要性が示された。

Most existing clustering algorithms are proposed without considering selection bias in the data. In many real applications, however, one cannot guarantee that the data is unbiased. Selection bias might introduce unexpected correlations between features, and ignoring those unexpected correlations will hurt the performance of clustering algorithms. Therefore, how to remove the unexpected correlations induced by selection bias is extremely important yet largely unexplored for clustering. In this paper, we propose a novel Decorrelation regularized K-Means algorithm (DCKM) for clustering with data selection bias. Specifically, the decorrelation regularizer aims to learn global sample weights that are capable of balancing the sample distribution, so as to remove unexpected correlations among features. Meanwhile, the learned weights are combined with k-means, which makes the reweighted k-means cluster on the inherent data distribution without the influence of unexpected correlations. Moreover, we derive the updating rules to effectively infer the parameters in DCKM. Extensive experimental results on real-world datasets demonstrate that our DCKM algorithm achieves significant performance gains, indicating the necessity of removing unexpected feature correlations induced by selection bias when clustering.
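
As a rough illustration of the "reweighted k-means" half of this idea, the sketch below runs Lloyd-style iterations in which each sample contributes to its cluster centre in proportion to a given weight. It assumes the sample weights are already known and fixed; learning them with the decorrelation regularizer, which is the core of DCKM, is not shown. The function name and signature are hypothetical.

```python
import math


def weighted_kmeans(points, weights, centers, iters=20):
    """Lloyd iterations where each point pulls its centre in proportion to its weight."""
    centers = [list(c) for c in centers]
    assign = [0] * len(points)
    for _ in range(iters):
        # Assignment step: nearest centre in Euclidean distance.
        assign = [
            min(range(len(centers)), key=lambda k: math.dist(p, centers[k]))
            for p in points
        ]
        # Update step: weighted mean of the members of each cluster.
        for k in range(len(centers)):
            members = [i for i, a in enumerate(assign) if a == k]
            total = sum(weights[i] for i in members)
            if total > 0:
                dim = len(centers[k])
                centers[k] = [
                    sum(weights[i] * points[i][d] for i in members) / total
                    for d in range(dim)
                ]
    return centers, assign
```

For example, with points `[(0.0,), (2.0,), (10.0,)]`, weights `[1.0, 3.0, 1.0]` and initial centres `[(0.0,), (10.0,)]`, the first centre settles at the weighted mean 1.5 rather than the unweighted mean 1.0.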

翻訳日:2022-11-15 13:36:22 公開日:2020-07-02

# 大規模MIMOハイブリッドビームフォーミングのための教師なし深層学習

Unsupervised Deep Learning for Massive MIMO Hybrid Beamforming ( http://arxiv.org/abs/2007.00038v2 )

ライセンス: Link先を確認

Hamed Hojatian, Jeremy Nadal, Jean-Francois Frigon, Francois Leduc-Primeau

(参考訳) ハイブリッドビームフォーミング(hybrid beamforming)は、大量の複数入力多重出力(MIMO)システムの複雑さとコストを低減し、高いデータレートを提供する、有望な技術である。しかし、ハイブリッドプリコーダの設計は、チャネル状態情報(CSI)のフィードバックと複雑な最適化問題の解決を必要とする課題である。本稿では,大規模MIMOシステムにおけるハイブリッドビームフォーミングを設計するためのRSSIに基づく非教師なしディープラーニング手法を提案する。さらに、i) 初期アクセス(IA)における同期信号(SS)を設計する方法、および ii) アナログプリコーダのコードブックを設計する方法を提案する。また,様々なシナリオにおいて,現実的なチャネルモデルを用いてシステム性能を評価する。提案手法は, 周波数分割二重化(FDD)通信において, 部分CSIフィードバックによるスペクトル効率を大幅に向上させるだけでなく, ほぼ最適の和率を持ち, 最先端の全CSIソリューションよりも優れることを示す。

Hybrid beamforming is a promising technique to reduce the complexity and cost of massive multiple-input multiple-output (MIMO) systems while providing high data rates. However, hybrid precoder design is a challenging task that requires channel state information (CSI) feedback and solving a complex optimization problem. This paper proposes a novel RSSI-based unsupervised deep learning method to design the hybrid beamforming in massive MIMO systems. Furthermore, we propose i) a method to design the synchronization signal (SS) in initial access (IA); and ii) a method to design the codebook for the analog precoder. We also evaluate the system performance through a realistic channel model in various scenarios. We show that the proposed method not only greatly increases the spectral efficiency, especially in frequency-division duplex (FDD) communication, by using partial CSI feedback, but also achieves a near-optimal sum-rate and outperforms other state-of-the-art full-CSI solutions.

翻訳日:2022-11-15 06:22:54 公開日:2020-07-02

# ハイパーパラメータスキーマ抽出のためのマイニングドキュメント

Mining Documentation to Extract Hyperparameter Schemas ( http://arxiv.org/abs/2006.16984v2 )

ライセンス: Link先を確認

Guillaume Baudart, Peter D. Kirchner, Martin Hirzel, Kiran Kate

(参考訳) ai自動化ツールは、検索空間を定義するために機械可読なハイパーパラメータスキーマを必要とする。同時に、AIライブラリには、優れた人間可読性ドキュメントが付属することが多い。このようなドキュメントには必要な情報の大半が含まれているが、残念ながらツールを使う準備ができていない。本稿では,aiライブラリ内のpython docstringを自動マイニングしてハイパーパラメータ用のjsonスキーマを抽出する方法について述べる。 3つの異なるライブラリから119個のトランスフォーマーと推定器のアプローチを評価し,機械可読スキーマの抽出に有効であることを確認した。私たちのビジョンは、AI自動化ツール用のこのようなスキーマを手作業で作成およびメンテナンスし、より大きなライブラリやよりリッチなスキーマに自動化の範囲を広げることです。

AI automation tools need machine-readable hyperparameter schemas to define their search spaces. At the same time, AI libraries often come with good human-readable documentation. While such documentation contains most of the necessary information, it is unfortunately not ready to be consumed by tools. This paper describes how to automatically mine Python docstrings in AI libraries to extract JSON Schemas for their hyperparameters. We evaluate our approach on 119 transformers and estimators from three different libraries and find that it is effective at extracting machine-readable schemas. Our vision is to reduce the burden of manually creating and maintaining such schemas for AI automation tools and to broaden the reach of automation to larger libraries and richer schemas.
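
To give a feel for what mining a docstring into a schema involves, here is a deliberately small sketch. It is not the authors' tool: it assumes a numpy-style `Parameters` section whose entries look like `name : type, default=value`, handles only four primitive types, and uses a made-up docstring for a hypothetical estimator.

```python
import json
import re

# Mapping from Python type names (as written in docstrings) to JSON Schema types.
TYPE_MAP = {"int": "integer", "float": "number", "bool": "boolean", "str": "string"}
PARAM_RE = re.compile(r"^(\w+) : (\w+)(?:, default=(\S+))?$")


def parse_default(text, py_type):
    """Convert the textual default into a JSON-compatible Python value."""
    if py_type == "bool":
        return text == "True"
    if py_type == "int":
        return int(text)
    if py_type == "float":
        return float(text)
    return text.strip("'\"")


def docstring_to_schema(doc):
    """Build a JSON Schema object from simple 'name : type, default=value' lines."""
    properties = {}
    for line in doc.splitlines():
        match = PARAM_RE.match(line.strip())
        if not match:
            continue  # section headers and description lines are skipped
        name, py_type, default = match.groups()
        if py_type not in TYPE_MAP:
            continue  # unions, enums, ranges etc. are out of scope for this sketch
        prop = {"type": TYPE_MAP[py_type]}
        if default is not None:
            prop["default"] = parse_default(default, py_type)
        properties[name] = prop
    return {"type": "object", "properties": properties}


# Hypothetical docstring, written only for this example.
EXAMPLE_DOC = """
    Parameters
    ----------
    n_estimators : int, default=100
        Number of trees.
    learning_rate : float, default=0.1
        Step size.
    warm_start : bool, default=False
        Reuse the previous solution.
"""

schema = docstring_to_schema(EXAMPLE_DOC)
schema_text = json.dumps(schema, indent=2)
```

Real docstrings also describe unions, enumerations, value ranges, and constraints in free text, which is where most of the difficulty lies and which this sketch ignores.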

翻訳日:2022-11-15 05:19:56 公開日:2020-07-02

# 不変な神経表現によって駆動される変換に対するロバスト性は?

Is Robustness To Transformations Driven by Invariant Neural Representations? ( http://arxiv.org/abs/2007.00112v2 )

ライセンス: Link先を確認

Syed Suleman Abbas Zaidi, Xavier Boix, Neeraj Prasad, Sharon Gilad-Gutnick, Shlomit Ben-Ami, Pawan Sinha

(参考訳) 深層畳み込みニューラルネットワーク(DCNN)は、これらの変換がトレーニングセットに含まれる場合、変換中のオブジェクト(例えば、ぼやけやノイズ)を認識するための印象的な堅牢性を示している。このようなロバスト性を説明する仮説は、dcnnが画像が変換された後も不変な神経表現を発達させることである。しかし、この仮説がどの程度真であるかは、顕著な疑問であり、トレーニングセットに変換を含めると、ネットワークの一部が変換された画像または非変換画像を認識するのに特化できるなど、不変性とは異なる性質をもたらす可能性がある。本稿では,不均一が生じている条件を解析する。そのため、不変表現は、トレーニング中に変換されないオブジェクトカテゴリの変換に対して堅牢性を促進する。最新のdcnnを用いた結果から,トレーニングセット内の変換されたカテゴリ数の増加に伴い,不変表現が強化されることが示された。これは、物体の空間配置の変化を伴う回転や薄型化のような幾何学的変換と比較して、ぼやけやハイパスフィルタリングのような局所変換においてより顕著である。本研究は,深層学習における不変表現と不変表現が自然に出現する条件の理解を深める。

Deep Convolutional Neural Networks (DCNNs) have demonstrated impressive robustness in recognizing objects under transformations (e.g., blur or noise) when these transformations are included in the training set. A hypothesis to explain such robustness is that DCNNs develop invariant neural representations that remain unaltered when the image is transformed. Yet, to what extent this hypothesis holds true is an outstanding question, as including transformations in the training set could lead to properties different from invariance, e.g., parts of the network could be specialized to recognize either transformed or non-transformed images. In this paper, we analyze the conditions under which invariance emerges. To do so, we leverage the fact that invariant representations facilitate robustness to transformations for object categories that are not seen transformed during training. Our results with state-of-the-art DCNNs indicate that invariant representations strengthen as the number of transformed categories in the training set is increased. This is much more prominent with local transformations such as blurring and high-pass filtering than with geometric transformations such as rotation and thinning, which entail changes in the spatial arrangement of the object. Our results contribute to a better understanding of invariant representations in deep learning, and the conditions under which invariance spontaneously emerges.

翻訳日:2022-11-15 05:02:16 公開日:2020-07-02

# 機械教育を通して読むことを学ぶ

Learning to Read through Machine Teaching ( http://arxiv.org/abs/2006.16470v2 )

ライセンス: Link先を確認

Ayon Sen, Christopher R. Cox, Matthew Cooper Borkenhagen, Mark S. Seidenberg and Xiaojin Zhu

(参考訳) 単語を読むことを学ぶことは、読者になるための大きな一歩だ。多くの子供たちは、英語の綴りと音の対応の不一致のためにこの課題に苦しむ。カリキュラムは、これらのパターンの教え方によって大きく異なる。それにもかかわらず、子どもたちは限られた時間(4年生)でシステムをマスターすることが期待されている。認知的に興味深いニューラルネットワークアーキテクチャを用いて、学習試行のシーケンスが学習を容易にするために構成されるかどうかを検証した。これはわずかな数(例えば10k)の学習試行でも難しい組合せ最適化問題である。本稿では,この系列最適化問題を,時間変化分布の最適化,すなわち,異なるステップにおける単語に対する確率分布の定義として提案する。次に、確率勾配降下法を用いて最適な時間変化分布と対応する最適トレーニングシーケンスを求める。基本条件 (ランダムシーケンス, 単語頻度に偏ったシーケンス) と比較して, 一般化精度は有意に向上した。これらの結果は,限られた学習経験を超えて,パフォーマンスが一般化する能力に依存する領域における学習成果の改善へのアプローチを示唆している。

Learning to read words aloud is a major step towards becoming a reader. Many children struggle with the task because of the inconsistencies of English spelling-sound correspondences. Curricula vary enormously in how these patterns are taught. Children are nonetheless expected to master the system in limited time (by grade 4). We used a cognitively interesting neural network architecture to examine whether the sequence of learning trials could be structured to facilitate learning. This is a hard combinatorial optimization problem even for a modest number of learning trials (e.g., 10K). We show how this sequence optimization problem can be posed as optimizing over a time-varying distribution, i.e., defining probability distributions over words at different steps in training. We then use stochastic gradient descent to find an optimal time-varying distribution and a corresponding optimal training sequence. We observed significant improvement in generalization accuracy compared to baseline conditions (random sequences; sequences biased by word frequency). These findings suggest an approach to improving learning outcomes in domains where performance depends on the ability to generalize beyond limited training experience.

翻訳日:2022-11-15 04:17:43 公開日:2020-07-02

# 汎関数の極値を求めるためのMLP

MLPs to Find Extrema of Functionals ( http://arxiv.org/abs/2007.00530v2 )

ライセンス: Link先を確認

Tao Liu

(参考訳) 多層パーセプトロン(MLP)は、複数のパーセプトロンからなるネットワークのクラスであり、本質的には数学的機能である。 MLPに基づいて,関数の極限を求めるための新しい数値法を開発した。実演として,3つの物理場面で解法を提示する。理想的には、目的曲線/曲面が二階微分可能関数に適合できる場合にも同様の方法が適用できる。この方法は、有限個の非微分可能(しかし連続)点/曲面が存在する場合にも拡張することができる。

The multilayer perceptron (MLP) is a class of networks composed of multiple layers of perceptrons, and it is essentially a mathematical function. Based on MLPs, we develop a new numerical method to find the extrema of functionals. As demonstrations, we present our solutions in three physics scenarios. Ideally, the same method is applicable to any case where the objective curve/surface can be fitted by second-order differentiable functions. This method can also be extended to cases where there are a finite number of non-differentiable (but continuous) points/surfaces.

翻訳日:2022-11-14 23:29:23 公開日:2020-07-02

# 臨床ベイズネットワーク開発のための医用イディオム

Medical idioms for clinical Bayesian network development ( http://arxiv.org/abs/2007.00364v2 )

ライセンス: Link先を確認

Evangelia Kyrimi, Mariana Raniere Neves, Scott McLachlan, Martin Neil, William Marsh, Norman Fenton

(参考訳) ベイズネットワーク(英: Bayesian Networks, BN)は、医学的応用で広く利用されているグラフィカル確率モデルである。多くの医療用bnsが出版されているが、ネットワーク構造がどのように開発されたかの説明や、それが与えられた医療用途の正しい構造を表す理由の正当化なしでfait accompliが提示されている。これは、専門家から医療BNを構築するプロセスは、一般的にアドホックであり、方法論的改善の機会はほとんどないことを意味する。本稿では,医療BNの発達を支援するために,広く応用され,再利用可能な医療推論パターンを提案する。提案手法は2000年にNeil, Fenton, Nielsenによって導入されたイディオムに基づくアプローチを補完し拡張する。医学的なBNに特有な一般的なイディオムの例を提案する。提案する医学的推論パターンを医学的イディオムと呼ぶ。さらに,介入的および反事実的推論を表現するため,イディオムの使用を拡大する。提案する医用イディオムは論理的推論パターンであり,医療用BNの開発に有効であると考えられる。冠状動脈疾患の医学的例を用いて, 提案したすべての医学的イディオムを概説した。この方法は、医療専門家と共に開発中の他のBNにも適用されている。最後に,提案した医療用イディオムをBNモデルに適用すると,より明確な構造を持つモデルが得られることを示す。

Bayesian Networks (BNs) are graphical probabilistic models that have proven popular in medical applications. While numerous medical BNs have been published, most are presented fait accompli without explanation of how the network structure was developed or justification of why it represents the correct structure for the given medical application. This means that the process of building medical BNs from experts is typically ad hoc and offers little opportunity for methodological improvement. This paper proposes generally applicable and reusable medical reasoning patterns to aid those developing medical BNs. The proposed method complements and extends the idiom-based approach introduced by Neil, Fenton, and Nielsen in 2000. We propose instances of their generic idioms that are specific to medical BNs. We refer to the proposed medical reasoning patterns as medical idioms. In addition, we extend the use of idioms to represent interventional and counterfactual reasoning. We believe that the proposed medical idioms are logical reasoning patterns that can be combined, reused and applied generically to help develop medical BNs. All proposed medical idioms have been illustrated using medical examples on coronary artery disease. The method has also been applied to other ongoing BNs being developed with medical experts. Finally, we show that applying the proposed medical idioms to published BN models results in models with a clearer structure.

翻訳日:2022-11-14 23:12:06 公開日:2020-07-02

# Goal-Oriented Semantic Exploration を用いたオブジェクトゴールナビゲーション

Object Goal Navigation using Goal-Oriented Semantic Exploration ( http://arxiv.org/abs/2007.00643v2 )

ライセンス: Link先を確認

Devendra Singh Chaplot, Dhiraj Gandhi, Abhinav Gupta, Ruslan Salakhutdinov

(参考訳) 本研究は,未確認環境における対象カテゴリーのインスタンスにナビゲートするオブジェクトゴールナビゲーションの問題を研究する。エンドツーエンドの学習ベースのナビゲーション手法は、探索や長期計画に効果がないため、このタスクで苦労しています。本稿では,エピソディック意味マップを構築し,目標対象のカテゴリに基づいて効率的に環境探索を行う「goal-oriented semantic exploration」というモジュールシステムを提案する。視覚的に現実的なシミュレーション環境における実証的な結果から,提案手法は,モジュール型マップベースの手法と同様に,エンドツーエンドの学習手法を含む幅広いベースラインを上回り,CVPR-2020 Habitat ObjectNav Challengeの勝利につながった。アブレーション解析により,提案モデルがシーン内のオブジェクトの相対的な配置のセマンティック先行を学習し,それらを効率的に探索することを示す。ドメインに依存しないモジュール設計により、我々のモデルを移動ロボットプラットフォームに転送し、現実世界でのオブジェクトゴールナビゲーションと同様のパフォーマンスを達成することができる。

This work studies the problem of object goal navigation, which involves navigating to an instance of the given object category in unseen environments. End-to-end learning-based navigation methods struggle at this task as they are ineffective at exploration and long-term planning. We propose a modular system called 'Goal-Oriented Semantic Exploration', which builds an episodic semantic map and uses it to explore the environment efficiently based on the goal object category. Empirical results in visually realistic simulation environments show that the proposed model outperforms a wide range of baselines, including end-to-end learning-based methods as well as modular map-based methods, and led to the winning entry of the CVPR-2020 Habitat ObjectNav Challenge. Ablation analysis indicates that the proposed model learns semantic priors of the relative arrangement of objects in a scene, and uses them to explore efficiently. The domain-agnostic module design allows us to transfer our model to a mobile robot platform and achieve similar performance for object goal navigation in the real world.

翻訳日:2022-11-14 23:04:06 公開日:2020-07-02

# マシンチェッカブル概念によるモデル説明可能性とロバスト性の統合

Unifying Model Explainability and Robustness via Machine-Checkable Concepts ( http://arxiv.org/abs/2007.00251v2 )

ライセンス: Link先を確認

Vedant Nanda, Till Speicher, John P. Dickerson, Krishna P. Gummadi, Muhammad Bilal Zafar

(参考訳) 深層ニューラルネットワーク(DNN)がますます増加するアプリケーションに採用されるにつれて、これらのモデルにとって説明可能性は重要なデシプラタムとして現れてきた。多くの実世界のタスクにおいて、説明可能性を必要とする主な理由の1つは、それぞれの説明(例えば、入力における概念の有無)に従わない予測(クラスラベル)が信頼できないと判断されるような、予測堅牢性を評価することである。しかし、すべてではないにしても、説明整合性(例えば、LIME, TCAV, saliency map)をチェックするための事前の手法は、大規模なデプロイを妨げている。本稿では,機械チェック可能な概念を用いたロバスト性評価フレームワークを提案する。我々のフレームワークは、DNNの説明をベースとした多数の概念を定義し、テスト時に説明整合性チェックを行い、予測の堅牢性を評価する。両方のステップは、人間の介入なしに自動化された方法で実行され、非常に多くのクラスを持つデータセットに簡単にスケールできます。実世界のデータセットと人的調査による実験により、我々のフレームワークは予測のロバスト性を大幅に向上させることができることが分かりました。

As deep neural networks (DNNs) get adopted in an ever-increasing number of applications, explainability has emerged as a crucial desideratum for these models. In many real-world tasks, one of the principal reasons for requiring explainability is to in turn assess prediction robustness, where predictions (i.e., class labels) that do not conform to their respective explanations (e.g., presence or absence of a concept in the input) are deemed to be unreliable. However, most, if not all, prior methods for checking explanation-conformity (e.g., LIME, TCAV, saliency maps) require significant manual intervention, which hinders their large-scale deployability. In this paper, we propose a robustness-assessment framework, at the core of which is the idea of using machine-checkable concepts. Our framework defines a large number of concepts that the DNN explanations could be based on and performs the explanation-conformity check at test time to assess prediction robustness. Both steps are executed in an automated manner without requiring any human intervention and are easily scaled to datasets with a very large number of classes. Experiments on real-world datasets and human surveys show that our framework is able to enhance prediction robustness significantly: the predictions marked to be robust by our framework have significantly higher accuracy and are more robust to adversarial perturbations.

翻訳日:2022-11-14 22:36:37 公開日:2020-07-02

# クロック付き連続時間ベイズネットワーク

Continuous-Time Bayesian Networks with Clocks ( http://arxiv.org/abs/2007.00347v2 )

ライセンス: Link先を確認

Nicolai Engelmann, Dominik Linzner, Heinz Koeppl

(参考訳) 連続的に進化する構造化確率過程は、自然と工学で生じる現象をモデル化するための広く採用された枠組みを示す。しかし、そのようなモデルはしばしば、トラクタビリティを維持するためにマルコフ特性を満たすために選択される。このようなメモリレスモデルでよく使われるのは、Continuous Time Bayesian Networks (CTBN) である。本研究では,指数的生存時間に対する制限を任意の分布に引き上げる。現在の拡張は、トラクタビリティを妨げる補助状態を通じてこれを達成している。そこで我々は,グラフ結合型半マルコフ連鎖の集合を構成するノードワイズクロックの集合を導入する。本稿では,遺伝子制御ネットワークのベンチマークツールを用いて生成したデータと,局所的な依存関係を利用して,合成データに対する実験を行うパラメータと構造推論のアルゴリズムを提案する。これにより,現在のCTBN拡張と比較して利点が指摘される。

Structured stochastic processes evolving in continuous time present a widely adopted framework to model phenomena occurring in nature and engineering. However, such models are often chosen to satisfy the Markov property to maintain tractability. One of the more popular of such memoryless models is the Continuous Time Bayesian Network (CTBN). In this work, we lift its restriction to exponential survival times and allow arbitrary distributions. Current extensions achieve this via auxiliary states, which hinder tractability. To avoid that, we introduce a set of node-wise clocks to construct a collection of graph-coupled semi-Markov chains. We provide algorithms for parameter and structure inference, which make use of local dependencies, and conduct experiments on synthetic data and a dataset generated through a benchmark tool for gene regulatory networks. In doing so, we point out advantages compared to current CTBN extensions.
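
The difference between exponential and general survival times can be illustrated with a single two-state node. The sketch below samples a semi-Markov trajectory whose holding times are Weibull-distributed; it is an assumption-laden toy (one node, no graph coupling, no inference), and the function name and parameters are made up for illustration. With `shape=1.0` the Weibull distribution reduces to the exponential one, which recovers the memoryless (Markov) case.

```python
import random


def sample_semi_markov(t_end, rng, scale=1.0, shape=2.0):
    """Sample (time, state) jump points of a two-state chain with Weibull holding times."""
    t, state = 0.0, 0
    path = [(t, state)]
    while True:
        # The node's clock restarts after every transition; the time until
        # the next one is drawn from a non-exponential distribution.
        t += rng.weibullvariate(scale, shape)
        if t >= t_end:
            break
        state = 1 - state
        path.append((t, state))
    return path


trajectory = sample_semi_markov(10.0, random.Random(0))
```

In the model from the abstract, each node of the network would carry such a clock and its holding-time distribution would additionally depend on the states of its parent nodes.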

翻訳日:2022-11-14 22:08:52 公開日:2020-07-02

# 説明可能な人工知能を用いた薬物発見

Drug discovery with explainable artificial intelligence ( http://arxiv.org/abs/2007.00523v2 )

ライセンス: Link先を確認

José Jiménez-Luna, Francesca Grisoni, Gisbert Schneider

(参考訳) 深層学習は、先進的な画像解析、分子構造と機能の予測、および、bespoke特性を持つ革新的な化学物質の自動生成を含む薬物発見を約束する。将来的な応用が増えているにもかかわらず、基礎となる数学的モデルはしばしば人間の心によって解釈される。分子科学の機械言語の新たな物語の必要性に対処するために、「説明可能な」深層学習法が求められている。このレビューは、説明可能な人工知能の最も顕著なアルゴリズム概念を要約し、将来の機会、潜在的な応用、そして残る課題を予測する。

Deep learning bears promise for drug discovery, including advanced image analysis, prediction of molecular structure and function, and automated generation of innovative chemical entities with bespoke properties. Despite the growing number of successful prospective applications, the underlying mathematical models often remain elusive to interpretation by the human mind. There is a demand for 'explainable' deep learning methods to address the need for a new narrative of the machine language of the molecular sciences. This review summarizes the most prominent algorithmic concepts of explainable artificial intelligence, and dares a forecast of the future opportunities, potential applications, and remaining challenges.

翻訳日:2022-11-14 21:42:29 公開日:2020-07-02

# deep interactive learning: 深層学習に基づく骨肉腫治療反応評価のための効率的なラベリングアプローチ

Deep Interactive Learning: An Efficient Labeling Approach for Deep Learning-Based Osteosarcoma Treatment Response Assessment ( http://arxiv.org/abs/2007.01383v1 )

ライセンス: Link先を確認

David Joon Ho, Narasimhan P. Agaram, Peter J. Schueffler, Chad M. Vanderbilt, Marc-Henri Jean, Meera R. Hameed, Thomas J. Fuchs

(参考訳) 骨肉腫は最も一般的な悪性原発性骨腫瘍である。標準治療は術前化学療法と外科的切除を含む。腫瘍面積に対する壊死性腫瘍面積の比率による治療に対する反応は、全身生存の予後因子として知られている。この評価は現在、顕微鏡の下でガラスのスライドを観察することで、その主観的な性質のために再現できない可能性がある。畳み込みニューラルネットワーク(cnns)は、骨肉腫全体の画像上の生存性腫瘍と壊死性腫瘍の自動分割に使用できる。教師あり学習のボトルネックの1つは、時間とコストのかかるプロセスであるトレーニングに大量の正確なアノテーションが必要とされることである。本稿では,cnnを学習するための効率的なラベリング手法として,dial(deep interactive learning)について述べる。最初のラベリングステップが完了すると、アノテータは、十分な予測が達成されるまでcnnモデルを改善するために、以前のセグメンテーション予測から誤ってラベルされた領域を修正するだけでよい。以上の結果より,DIaLを用いた7時間アノテーションでトレーニングしたCNNモデルでは,非標準化手技の術式変化率の予測値内の壊死率を推定することができた。

Osteosarcoma is the most common malignant primary bone tumor. Standard treatment includes pre-operative chemotherapy followed by surgical resection. The response to treatment, as measured by the ratio of necrotic tumor area to overall tumor area, is a known prognostic factor for overall survival. This assessment is currently done manually by pathologists looking at glass slides under the microscope, which may not be reproducible due to its subjective nature. Convolutional neural networks (CNNs) can be used for automated segmentation of viable and necrotic tumor on osteosarcoma whole slide images. One bottleneck for supervised learning is that large amounts of accurate annotations are required for training, which is a time-consuming and expensive process. In this paper, we describe Deep Interactive Learning (DIaL) as an efficient labeling approach for training CNNs. After an initial labeling step is done, annotators only need to correct mislabeled regions from previous segmentation predictions to improve the CNN model until satisfactory predictions are achieved. Our experiments show that our CNN model trained with only 7 hours of annotation using DIaL can successfully estimate ratios of necrosis within the expected inter-observer variation rate for a non-standardized manual surgical pathology task.
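
The treatment-response measure mentioned above, the ratio of necrotic tumor area to overall tumor area, is straightforward to compute once a segmentation is available. The sketch below assumes a per-pixel label map and treats overall tumor area as viable plus necrotic pixels; the label codes and function name are hypothetical, and no part of the CNN or the DIaL annotation loop is shown.

```python
# Hypothetical label codes for a segmentation map; 0 means non-tumor tissue.
VIABLE, NECROTIC = 1, 2


def necrosis_ratio(mask):
    """Return necrotic area / (viable + necrotic area) for a 2D label map."""
    necrotic = sum(row.count(NECROTIC) for row in mask)
    viable = sum(row.count(VIABLE) for row in mask)
    tumor = necrotic + viable
    if tumor == 0:
        raise ValueError("mask contains no tumor pixels")
    return necrotic / tumor


# Toy 3x3 prediction: three viable and three necrotic pixels.
toy_mask = [
    [0, 1, 1],
    [2, 2, 2],
    [0, 0, 1],
]
toy_ratio = necrosis_ratio(toy_mask)
```

On whole slide images the same count would be accumulated over all tiles of a case before taking the ratio.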

翻訳日:2022-11-14 15:04:24 公開日:2020-07-02

# ADD:摩擦接触を有する多体系の解析微分力学

ADD: Analytically Differentiable Dynamics for Multi-Body Systems with Frictional Contact ( http://arxiv.org/abs/2007.00987v1 )

ライセンス: Link先を確認

Moritz Geilinger, David Hahn, Jonas Zehnder, Moritz Bächer, Bernhard Thomaszewski, Stelian Coros

(参考訳) 本稿では,統一フレームワーク内で剛体および変形可能な物体の摩擦接触を処理可能な微分可能ダイナミクスソルバを提案する。正常接触力と接点接触力の原理的モーリフィケーションにより, 摩擦接触の非スムース性に固有の主な困難を回避できる。我々は,この新しい接触モデルと完全簡易な時間統合を組み合わせることで,解析的に微分可能なロバストで効率的なダイナミクスソルバを得る。本定式化は,隣接感度解析と合わせて,シミュレーション精度と目的関数ランドスケープの滑らかさの相違を考慮した勾配に基づく最適化を実現する。我々は,剛体,粘弾性材料,結合多体系を含む一連のシミュレーション例について,本手法を徹底的に解析する。さらに,変形可能な物体のパラメータ推定,ロボット操作の動作計画,協調歩行ロボットの軌道最適化,制御ポリシーの効率的な自己教師あり学習への微分シミュレータの応用について紹介する。

We present a differentiable dynamics solver that is able to handle frictional contact for rigid and deformable objects within a unified framework. Through a principled mollification of normal and tangential contact forces, our method circumvents the main difficulties inherent to the non-smooth nature of frictional contact. We combine this new contact model with fully-implicit time integration to obtain a robust and efficient dynamics solver that is analytically differentiable. In conjunction with adjoint sensitivity analysis, our formulation enables gradient-based optimization with adaptive trade-offs between simulation accuracy and smoothness of objective function landscapes. We thoroughly analyse our approach on a set of simulation examples involving rigid bodies, visco-elastic materials, and coupled multi-body systems. We furthermore showcase applications of our differentiable simulator to parameter estimation for deformable objects, motion planning for robotic manipulation, trajectory optimization for compliant walking robots, as well as efficient self-supervised learning of control policies.

翻訳日:2022-11-14 15:03:30 公開日:2020-07-02

# Channel Compression: CNNアーキテクチャにおけるチャネル間の情報冗長性の再考

Channel Compression: Rethinking Information Redundancy among Channels in CNN Architecture ( http://arxiv.org/abs/2007.01696v1 )

ライセンス: Link先を確認

Jinhua Liang, Tao Zhang, and Guoqing Feng

(参考訳) 組込みデバイスやモバイルアプリケーションへの需要により、モデル圧縮とアクセラレーションが注目を集めている。効率的な畳み込みニューラルネットワーク(cnns)の研究は、畳み込み計算を分解または最適化することで特徴冗長性を取り除くことを目的としている。本研究では,CNNアーキテクチャのチャネル間に特徴冗長性が存在すると仮定し,計算効率を高めるためのフリーウェイを提供する。チャネル圧縮を前提として,空間畳み込み,チャネルグループ化,プール操作の進展を受け入れるために,コンパクト畳み込みという新しい畳み込み構造を提案する。具体的には、奥行き分離可能な畳み込みとポイントワイドチャネル操作を利用して特徴を効率的に抽出する。学習可能な重みを通常導入する既存のチャネル圧縮法とは異なり、提案するコンパクト畳み込みは余分なパラメータなしで特徴冗長性を低減できる。ポイントワイズチャネル間操作により、コンパクト畳み込みは特徴写像のチャネル次元を暗黙的に絞り込む。ニューラルネットワークにおけるチャネル冗長性を低減するためのルールを検討するために、異なるポイントワイズチャネル間操作の比較を行う。さらに,音場分類,音事象検出,画像分類などの複数の課題に対処するために,コンパクトな畳み込みが拡張される。実験により, マルチメディアタスクにおいて, コンパクトな畳み込みは高い有効性を示すだけでなく, 並列計算によって効率よく実装できることを示した。

Model compression and acceleration are attracting increasing attention due to the demand for embedded devices and mobile applications. Research on efficient convolutional neural networks (CNNs) aims at removing feature redundancy by decomposing or optimizing the convolutional calculation. In this work, feature redundancy is assumed to exist among channels in CNN architectures, which provides some leeway to boost calculation efficiency. Aiming at channel compression, a novel convolutional construction named compact convolution is proposed to embrace the progress in spatial convolution, channel grouping and pooling operation. Specifically, the depth-wise separable convolution and the point-wise interchannel operation are utilized to efficiently extract features. Unlike existing channel compression methods, which usually introduce considerable learnable weights, the proposed compact convolution can reduce feature redundancy with no extra parameters. With the point-wise interchannel operation, compact convolutions implicitly squeeze the channel dimension of feature maps. To explore the rules for reducing channel redundancy in neural networks, a comparison is made among different point-wise interchannel operations. Moreover, compact convolutions are extended to tackle multiple tasks, such as acoustic scene classification, sound event detection and image classification. Extensive experiments demonstrate that our compact convolution not only exhibits high effectiveness in several multimedia tasks, but can also be efficiently implemented by benefiting from parallel computation.

翻訳日:2022-11-14 15:02:26 公開日:2020-07-02
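The abstract above describes a point-wise interchannel operation that squeezes the channel dimension without learnable weights. The following is a minimal sketch of that general idea, not the authors' implementation: the function name `squeeze_channels`, the grouping of adjacent channels, and the choice of `max` or `mean` as the reduction are all assumptions made for illustration.

```python
from typing import Callable, List, Sequence

# feature_map[channel][position]: spatial positions are flattened for brevity.
FeatureMap = List[List[float]]


def mean(values: Sequence[float]) -> float:
    """Arithmetic mean, used as one possible parameter-free reduction."""
    return sum(values) / len(values)


def squeeze_channels(
    feature_map: FeatureMap,
    group_size: int,
    reducer: Callable[[Sequence[float]], float] = max,
) -> FeatureMap:
    """Reduce each group of `group_size` adjacent channels to one channel.

    The reduction is applied independently at every spatial position
    (point-wise) and introduces no learnable weights.
    """
    if not feature_map:
        raise ValueError("feature_map must contain at least one channel")
    n_channels = len(feature_map)
    if group_size <= 0 or n_channels % group_size != 0:
        raise ValueError("channel count must be a multiple of group_size")
    n_positions = len(feature_map[0])
    squeezed: FeatureMap = []
    for start in range(0, n_channels, group_size):
        group = feature_map[start:start + group_size]
        squeezed.append(
            [reducer([channel[p] for channel in group]) for p in range(n_positions)]
        )
    return squeezed
```

Calling `squeeze_channels(fmap, 2, max)` on a four-channel map returns two channels, each holding the position-wise maximum of one channel pair. Swapping in `mean` gives an averaging variant, which is one way to compare different interchannel operations.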

# Towards Data-Driven Affirmative Action Policies under Uncertainty ( http://arxiv.org/abs/2007.01202v1 )

License: see the linked page

Corinna Hertweck, Carlos Castillo, Michael Mathioudakis

In this paper, we study university admissions under a centralized system that uses grades and standardized test scores to match applicants to university programs. We consider affirmative action policies that seek to increase the number of admitted applicants from underrepresented groups. Since such a policy has to be announced before the start of the application period, there is uncertainty about the score distribution of the students applying to each program. This poses a difficult challenge for policy-makers. We explore the possibility of using a predictive model trained on historical data to help optimize the parameters of such policies.

Translated: 2022-11-14 14:56:15 Published: 2020-07-02

# Efficient Neural Network Deployment for Microcontroller ( http://arxiv.org/abs/2007.01348v1 )

License: see the linked page

Hasan Unlu

Edge computing for neural networks is becoming important, especially for low-power applications and offline devices. TensorFlow Lite and PyTorch Mobile were released for this purpose, but they mainly support mobile devices rather than microcontrollers. Microcontroller support is an emerging area. There are many approaches to reducing network size and compute load, such as pruning, binarization and layer manipulation (i.e., operator reordering). This paper explores and generalizes convolutional neural network deployment for microcontrollers with two novel optimization proposals offering memory savings and compute efficiency in 2D convolutions as well as fully connected layers. The first is in-place max-pooling, applicable if the stride is greater than or equal to the pooling kernel size. The second is to use ping-pong buffers between layers to reduce memory consumption significantly. The memory savings and performance will be compared with the CMSIS-NN framework developed for ARM Cortex-M CPUs. The final goal is to develop a tool that consumes a PyTorch model with trained network weights and turns it into an optimized inference engine (forward pass) in C/C++ for microcontrollers with low memory (kilobyte level) and limited computing capability.

Translated: 2022-11-14 14:55:22 Published: 2020-07-02
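The two optimizations named in the abstract above, in-place max-pooling and ping-pong buffers, can be illustrated with a small sketch. This is not the paper's C/C++ engine. It is a Python illustration under assumed conventions: a single-channel, row-major activation buffer, and hypothetical helper names (`max_pool_in_place`, `run_ping_pong`, `relu`, `pair_sum`) chosen only for this example.

```python
from typing import Callable, List, Sequence, Tuple


def max_pool_in_place(buf: List[float], height: int, width: int,
                      kernel: int, stride: int) -> Tuple[int, int]:
    """Max-pool a row-major height x width map, writing the results into
    the front of the same buffer. Returns (out_height, out_width)."""
    if stride < kernel:
        # Follows the condition stated in the abstract.
        raise ValueError("this sketch requires stride >= kernel")
    if kernel > height or kernel > width:
        raise ValueError("kernel must fit inside the input")
    out_h = (height - kernel) // stride + 1
    out_w = (width - kernel) // stride + 1
    for oy in range(out_h):
        for ox in range(out_w):
            top, left = oy * stride, ox * stride
            best = max(
                buf[(top + dy) * width + (left + dx)]
                for dy in range(kernel)
                for dx in range(kernel)
            )
            # The current window has already been read, and every later
            # window only reads indices larger than this write index.
            buf[oy * out_w + ox] = best
    return out_h, out_w


# A layer reads n values from src, writes into dst, returns the output length.
Layer = Callable[[Sequence[float], int, List[float]], int]


def run_ping_pong(layers: Sequence[Layer], x: Sequence[float],
                  buf_len: int) -> List[float]:
    """Run layers in sequence using only two scratch buffers of buf_len."""
    if len(x) > buf_len:
        raise ValueError("input does not fit in the buffer")
    bufs = [[0.0] * buf_len, [0.0] * buf_len]
    n = len(x)
    bufs[0][:n] = x
    src = 0
    for layer in layers:
        n = layer(bufs[src], n, bufs[1 - src])
        src = 1 - src  # the output buffer becomes the next input
    return bufs[src][:n]


def relu(src: Sequence[float], n: int, dst: List[float]) -> int:
    for i in range(n):
        dst[i] = max(0.0, src[i])
    return n


def pair_sum(src: Sequence[float], n: int, dst: List[float]) -> int:
    for i in range(n // 2):
        dst[i] = src[2 * i] + src[2 * i + 1]
    return n // 2
```

With ping-pong buffers, peak activation memory is two buffers sized for the largest layer, rather than one buffer per layer. In-place pooling removes the need for a separate output buffer for that layer.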

# WattScale: A Data-driven Approach for Energy Efficiency Analytics of Buildings at Scale ( http://arxiv.org/abs/2007.01382v1 )

License: see the linked page

Srinivasan Iyengar, Stephen Lee, David Irwin, Prashant Shenoy, Benjamin Weil

Buildings consume over 40% of the total energy in modern societies, and improving their energy efficiency can significantly reduce our energy footprint. In this paper, we present WattScale, a data-driven approach to identify the least energy-efficient buildings from a large population of buildings in a city or a region. Unlike previous methods such as least-squares that use point estimates, WattScale uses Bayesian inference to capture the stochasticity in the daily energy usage by estimating the distribution of parameters that affect a building. Further, it compares them with similar homes in a given population. WattScale also incorporates a fault detection algorithm to identify the underlying causes of energy inefficiency. We validate our approach using ground truth data from different geographical locations, which showcases its applicability in various settings. WattScale has two execution modes, (i) individual and (ii) region-based, which we highlight using two case studies. For the individual execution mode, we present results from a city containing more than 10,000 buildings and show that more than half of the buildings are inefficient in one way or another, indicating a significant potential from energy improvement measures. Additionally, we provide probable causes of inefficiency and find that 41%, 23.73%, and 0.51% of homes have poor building envelope, heating, and cooling system faults, respectively. For the region-based execution mode, we show that WattScale can be extended to millions of homes in the US due to the recent availability of representative energy datasets.

Translated: 2022-11-14 14:55:03 Published: 2020-07-02

# Surface Denoising based on Normal Filtering in a Robust Statistics Framework ( http://arxiv.org/abs/2007.00842v1 )

License: see the linked page

Sunil Kumar Yadav and Martin Skrodzki and Eric Zimmermann and Konrad Polthier

During a surface acquisition process using 3D scanners, noise is inevitable, and an important step in geometry processing is to remove these noise components from the surfaces (given as point sets or triangulated meshes). The noise-removal process (denoising) can be performed by filtering the surface normals first and then adjusting the vertex positions according to the filtered normals. Therefore, in many available denoising algorithms, the computation of noise-free normals is a key factor. A variety of filters have been introduced for noise removal from normals, with different focus points such as robustness against outliers or large noise amplitudes. Although these filters perform well in different aspects, a unified framework is missing to establish the relation between them and to provide a theoretical analysis beyond the performance of each method. In this paper, we introduce such a framework to establish relations between a number of widely used nonlinear filters for face normals in mesh denoising and vertex normals in point set denoising. We cover robust statistical estimation with M-smoothers and their application to linear and non-linear normal filtering. Although these methods originate in different mathematical theories (which include diffusion-, bilateral-, and directional curvature-based algorithms), we demonstrate that all of them can be cast into a unified framework of robust statistics using robust error norms and their corresponding influence functions. This unification contributes to a better understanding of the individual methods and their relations with each other. Furthermore, the presented framework provides a platform for new techniques to combine the advantages of known filters and to compare them with available methods.

Translated: 2022-11-14 14:54:12 Published: 2020-07-02

# Estimating Blink Probability for Highlight Detection in Figure Skating Videos ( http://arxiv.org/abs/2007.01089v1 )

License: see the linked page

Tamami Nakano, Atsuya Sakata, Akihiro Kishimoto

Highlight detection in sports videos has a broad viewership and huge commercial potential. It is thus imperative to detect highlight scenes that better match human interest, with high temporal accuracy. Since people instinctively suppress blinks during attention-grabbing events and synchronously generate blinks at attention break points in videos, the instantaneous blink rate can be utilized as a highly accurate temporal indicator of human interest. Therefore, in this study, we propose a novel, automatic highlight detection method based on the blink rate. The method trains a one-dimensional convolution network (1D-CNN) to assess blink rates at each video frame from the spatio-temporal pose features of figure skating videos. Experiments show that the method successfully estimates the blink rate in 94% of the video clips and predicts the temporal change in the blink rate around a jump event with high accuracy. Moreover, the method detects not only the representative athletic action, but also the distinctive artistic expression of figure skating performance as key frames. This suggests that the blink-rate-based supervised learning approach enables high-accuracy highlight detection that more closely matches human sensibility.

Translated: 2022-11-14 14:47:38 Published: 2020-07-02

# Globally Optimal Surface Segmentation using Deep Learning with Learnable Smoothness Priors ( http://arxiv.org/abs/2007.01217v1 )

License: see the linked page

Leixin Zhou, Xiaodong Wu

Automated surface segmentation is important and challenging in many medical image analysis applications. Recently, deep learning based methods have been developed for various object segmentation tasks. Most of them are classification-based approaches, e.g. U-net, which predict for each voxel the probability of being target object or background. One problem with those methods is the lack of a topology guarantee for segmented objects, and post-processing is usually needed to infer the boundary surface of the object. In this paper, a novel model based on a convolutional neural network (CNN) followed by a learnable surface smoothing block is proposed to tackle the surface segmentation problem with end-to-end training. To the best of our knowledge, this is the first study to learn smoothness priors end-to-end with a CNN for direct surface segmentation with global optimality. Experiments carried out on Spectral Domain Optical Coherence Tomography (SD-OCT) retinal layer segmentation and Intravascular Ultrasound (IVUS) vessel wall segmentation demonstrated very promising results.

Translated: 2022-11-14 14:47:19 Published: 2020-07-02

# Autonomous and cooperative design of the monitor positions for a team of UAVs to maximize the quantity and quality of detected objects ( http://arxiv.org/abs/2007.01247v1 )

License: see the linked page

Dimitrios I. Koutras, Athanasios Ch. Kapoutsis and Elias B. Kosmatopoulos

This paper tackles the problem of positioning a swarm of UAVs inside a completely unknown terrain, with the objective of maximizing the overall situational awareness. The situational awareness is expressed by the number and quality of unique objects of interest inside the UAVs' fields of view. YOLOv3 and a system to identify duplicate objects of interest were employed to assign a single score to each configuration of the UAVs. Then, a novel navigation algorithm, capable of optimizing the previously defined score without taking into consideration the dynamics of either the UAVs or the environment, is proposed. A cornerstone of the proposed approach is that it shares the same convergence characteristics as the block coordinate descent (BCD) family of approaches. The effectiveness and performance of the proposed navigation scheme were evaluated in a series of experiments inside the AirSim simulator. The experimental evaluation indicates that the proposed navigation algorithm was able to consistently navigate the swarm of UAVs to "strategic" monitoring positions and also to adapt to different swarm sizes. Source code is available at https://github.com/dimikout3/ConvCAOAirSim.

Translated: 2022-11-14 14:47:00 Published: 2020-07-02

# Deep Learning Models for Visual Inspection on Automotive Assembling Line ( http://arxiv.org/abs/2007.01857v1 )

License: see the linked page

Muriel Mazzetto and Marcelo Teixeira and Érick Oliveira Rodrigues and Dalcimar Casanova

Automotive manufacturing assembly tasks are built upon visual inspections such as scratch identification on machined surfaces, part identification and selection, etc., which guarantee product and process quality. These tasks can be related to more than one type of vehicle that is produced within the same manufacturing line. Visual inspection was essentially human-led but has recently been supplemented by the artificial perception provided by computer vision systems (CVSs). Despite their relevance, the accuracy of CVSs varies according to environmental settings such as lighting, enclosure and quality of image acquisition. These issues entail costly solutions and override part of the benefits introduced by computer vision systems, mainly when they interfere with the operating cycle time of the factory. In this sense, this paper proposes the use of deep learning-based methodologies to assist in visual inspection tasks while leaving very little footprint in the manufacturing environment, exploring them as an end-to-end tool to ease CVS setup. The proposed approach is illustrated by four proofs of concept in a real automotive assembly line based on models for object detection, semantic segmentation, and anomaly detection.

Translated: 2022-11-14 14:45:41 Published: 2020-07-02

# Multi-agent Planning for thermalling gliders using multi level graph-search ( http://arxiv.org/abs/2007.01334v1 )

License: see the linked page

Muhammad Aneeq uz Zaman and Aamer Iqbal Bhatti

This paper solves a path planning problem for a group of gliders. The gliders are tasked with visiting a set of interest points. The gliders have limited range but are able to increase their range by visiting special points called thermals. The problem addressed in this paper is path planning for the gliders such that the total number of interest points visited by the gliders is maximized. This is referred to as the multi-agent problem. The problem is solved by first decomposing it into several single-agent problems. In a single-agent problem, a set of interest points is allocated to a single glider. This problem is solved by planning a path which maximizes the number of visited interest points from the allocated set. This is achieved through a uniform cost graph search, as shown in our earlier work. The multi-agent problem now consists of determining the best allocation (of interest points) for each glider. Two ways of solving this problem are presented: a brute force search approach, as shown in earlier work, and a Branch&Bound type graph search. The Branch&Bound approach is the main contribution of the paper. This approach is proven to be optimal and shown, using simulations, to be faster than the brute force search.

Translated: 2022-11-14 14:45:26 Published: 2020-07-02
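The abstract above names uniform cost graph search as the building block of the single-agent subproblem. Below is a generic sketch of that search technique only. The paper's actual state space (remaining range, thermals, visited interest points) is not given in the abstract and is not reproduced here; the graph encoding, the node names and the function `uniform_cost_search` are assumptions for illustration.

```python
import heapq
from typing import Dict, List, Optional, Tuple

# graph[node] is a list of (neighbor, non-negative edge cost) pairs.
Graph = Dict[str, List[Tuple[str, float]]]


def uniform_cost_search(
    graph: Graph, start: str, goal: str
) -> Optional[Tuple[float, List[str]]]:
    """Return (cost, path) of a cheapest path from start to goal, or None."""
    frontier: List[Tuple[float, str, List[str]]] = [(0.0, start, [start])]
    best_cost = {start: 0.0}
    while frontier:
        # Always expand the cheapest path found so far.
        cost, node, path = heapq.heappop(frontier)
        if node == goal:
            return cost, path
        if cost > best_cost.get(node, float("inf")):
            continue  # stale queue entry, a cheaper route was found already
        for neighbor, step in graph.get(node, []):
            new_cost = cost + step
            if new_cost < best_cost.get(neighbor, float("inf")):
                best_cost[neighbor] = new_cost
                heapq.heappush(frontier, (new_cost, neighbor, path + [neighbor]))
    return None
```

In a toy graph where a detour through a hypothetical `thermal` node is cheaper than the direct edge, the search returns the detour. It returns `None` when the goal cannot be reached.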

# Spores: Stateless Predictive Onion Routing for E-Squads ( http://arxiv.org/abs/2007.04766v1 )

License: see the linked page

Daniel Bosk (KTH), Yérom-David Bromberg (WIDE, IRISA), Sonja Buchegger (KTH), Adrien Luxey (WIDE, IRISA), François Taïani (WIDE, IRISA)

Mass surveillance of the population by state agencies and corporate parties is now a well-known fact. Journalists and whistle-blowers still lack means to circumvent global spying for the sake of their investigations. With Spores, we propose a way for journalists and their sources to plan a posteriori file exchanges when they physically meet. We leverage the multiplication of personal devices per capita to provide a lightweight, robust and fully anonymous decentralised file transfer protocol between users. Spores hinges on our novel concept of e-squads: one's personal devices, rendered intelligent by gossip communication protocols, can provide private and dependable services to their user. People's e-squads are federated into a novel onion routing network, able to withstand the inherent unreliability of personal appliances while providing reliable routing. Spores' performance is competitive, and the privacy properties of its communication outperform state-of-the-art onion routing strategies.

Translated: 2022-11-14 14:45:07 Published: 2020-07-02

# Higher-order Logic as Lingua Franca -- Integrating Argumentative Discourse and Deep Logical Analysis ( http://arxiv.org/abs/2007.01019v1 )

License: see the linked page

David Fuenmayor and Christoph Benzmüller

We present an approach towards the deep, pluralistic logical analysis of argumentative discourse that benefits from the application of state-of-the-art automated reasoning technology for classical higher-order logic. Thanks to its expressivity, this logic can adopt the status of a uniform lingua franca allowing the encoding of both formalized arguments (their deep logical structure) and dialectical interactions (their attack and support relations). We illustrate this by analyzing an excerpt from an argumentative debate on climate engineering. Another, novel contribution concerns the definition of abstract, language-theoretical foundations for the characterization and assessment of shallow semantical embeddings (SSEs) of non-classical logics in classical higher-order logic, which constitute a pillar of our approach. The novel perspective we draw enables more concise and more elegant characterizations of semantical embeddings of logics and logic combinations, which is demonstrated with several examples.

Translated: 2022-11-14 14:37:41 Published: 2020-07-02

# Artificial Stupidity ( http://arxiv.org/abs/2007.03616v1 )

License: see the linked page

Michael Falk

Public debate about AI is dominated by Frankenstein Syndrome, the fear that AI will become superhuman and escape human control. Although superintelligence is certainly a possibility, the interest it excites can distract the public from a more imminent concern: the rise of Artificial Stupidity (AS). This article discusses the roots of Frankenstein Syndrome in Mary Shelley's famous novel of 1818. It then provides a philosophical framework for analysing the stupidity of artificial agents, demonstrating that modern intelligent systems can be seen to suffer from 'stupidity of judgement'. Finally, it identifies an alternative literary tradition that exposes the perils and benefits of AS. In the writings of Edmund Spenser, Jonathan Swift and E.T.A. Hoffmann, ASs replace, oppress or seduce their human users. More optimistically, Joseph Furphy and Laurence Sterne imagine ASs that can serve human intellect as maps or as pipes. These writers provide a strong counternarrative to the myths that currently drive the AI debate. They identify ways in which even stupid artificial agents can evade human control, for instance by appealing to stereotypes or distancing us from reality. And they underscore the continuing importance of the literary imagination in an increasingly automated society.

Translated: 2022-11-14 14:37:24 Published: 2020-07-02

# Quantifying causal influences in the presence of a quantum common cause ( http://arxiv.org/abs/2007.01221v1 )

License: see the linked page

Mariami Gachechiladze, Nikolai Miklin, and Rafael Chaves

Quantum mechanics challenges our intuition about cause-effect relations in nature. Some fundamental concepts, including Reichenbach's common cause principle and the notion of local realism, have to be reconsidered. Traditionally, this is witnessed by the violation of a Bell inequality. But are Bell inequalities the only signature of the incompatibility between quantum correlations and causality theory? Motivated by this question, we introduce a general framework able to estimate causal influences between two variables, without the need for interventions and irrespective of the classical, quantum, or even post-quantum nature of a common cause. In particular, by considering the simplest instrumental scenario (for which violation of Bell inequalities is not possible), we show that every pure bipartite entangled state violates the classical bounds on causal influence, thus answering the posed question in the negative and opening a new avenue to explore the role of causality within quantum theory.

Translated: 2022-11-14 14:37:02 Published: 2020-07-02

# Data Augmenting Contrastive Learning of Speech Representations in the Time Domain ( http://arxiv.org/abs/2007.00991v1 )

License: see the linked page

Eugene Kharitonov and Morgane Rivière and Gabriel Synnaeve and Lior Wolf and Pierre-Emmanuel Mazaré and Matthijs Douze and Emmanuel Dupoux

Contrastive Predictive Coding (CPC), based on predicting future segments of speech from past segments, is emerging as a powerful algorithm for representation learning of speech signals. However, it still under-performs other methods on unsupervised evaluation benchmarks. Here, we introduce WavAugment, a time-domain data augmentation library, and find that applying augmentation in the past is generally more efficient and yields better performance than other methods. We find that a combination of pitch modification, additive noise and reverberation substantially increases the performance of CPC (relative improvement of 18-22%), beating the reference Libri-light results with 600 times less data. Using an out-of-domain dataset, time-domain data augmentation can push CPC to be on par with the state of the art on the Zero Speech Benchmark 2017. We also show that time-domain data augmentation consistently improves downstream limited-supervision phoneme classification tasks by 12-15% relative.

Translated: 2022-11-14 14:36:45 Published: 2020-07-02
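Of the three time-domain augmentations listed in the abstract above, additive noise is the simplest to sketch. The following is a standard-library illustration of adding Gaussian noise at a chosen signal-to-noise ratio. It does not use or mirror the WavAugment API; the function name `add_noise`, the Gaussian noise source and the SNR parameterization are assumptions for this example.

```python
import math
import random
from typing import List, Sequence


def add_noise(signal: Sequence[float], snr_db: float,
              rng: random.Random) -> List[float]:
    """Return signal plus Gaussian noise scaled to the requested SNR (dB)."""
    if not signal:
        raise ValueError("signal must not be empty")
    noise = [rng.gauss(0.0, 1.0) for _ in signal]
    signal_power = sum(s * s for s in signal) / len(signal)
    noise_power = sum(n * n for n in noise) / len(noise)
    # SNR in dB is 10 * log10(signal_power / noise_power), so solve for the
    # noise power that gives the requested ratio and rescale the noise.
    target_noise_power = signal_power / (10.0 ** (snr_db / 10.0))
    scale = math.sqrt(target_noise_power / noise_power)
    return [s + scale * n for s, n in zip(signal, noise)]
```

Passing a seeded `random.Random` keeps the augmentation reproducible. In a training loop one would typically draw a fresh SNR and fresh noise for every example.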

# Uncertainty-Guided Efficient Interactive Refinement of Fetal Brain Segmentation from Stacks of MRI Slices ( http://arxiv.org/abs/2007.00833v1 )

License: see the linked page

Guotai Wang, Michael Aertsen, Jan Deprest, Sebastien Ourselin, Tom Vercauteren, Shaoting Zhang

Segmentation of the fetal brain from stacks of motion-corrupted fetal MRI slices is important for motion correction and high-resolution volume reconstruction. Although Convolutional Neural Networks (CNNs) have been widely used for automatic segmentation of the fetal brain, their results may still benefit from interactive refinement for challenging slices. To improve the efficiency of the interactive refinement process, we propose an Uncertainty-Guided Interactive Refinement (UGIR) framework. We first propose a grouped convolution-based CNN to obtain multiple automatic segmentation predictions with uncertainty estimation in a single forward pass, then guide the user to provide interactions only in a subset of slices with the highest uncertainty. A novel interactive level set method is also proposed to obtain a refined result given the initial segmentation and user interactions. Experimental results show that: (1) our proposed CNN obtains uncertainty estimation in real time which correlates well with mis-segmentations, (2) the proposed interactive level set is effective and efficient for refinement, (3) UGIR obtains accurate refinement results with around 30% improvement of efficiency by using uncertainty to guide user interactions. Our code is available online.

Translated: 2022-11-14 14:36:27 Published: 2020-07-02

# Image Analysis Based on Nonnegative/Binary Matrix Factorization ( http://arxiv.org/abs/2007.00889v1 )

License: see the linked page

Hinako Asaoka and Kazue Kudo

Using nonnegative/binary matrix factorization (NBMF), a matrix can be decomposed into a nonnegative matrix and a binary matrix. Our analysis of facial images, based on NBMF and using the Fujitsu Digital Annealer, leads to successful image reconstruction and image classification. The NBMF algorithm converges in fewer iterations than those required for the convergence of nonnegative matrix factorization (NMF), although both techniques perform comparably in image classification.

Translated: 2022-11-14 14:35:33 Published: 2020-07-02
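The abstract above does not spell out the NBMF update steps. As a rough illustration of the binary half of such a factorization, the sketch below assumes the form V ≈ WH with W nonnegative and H binary, and finds the binary column h that best reconstructs one data column v for a fixed W. Exhaustive enumeration stands in here for the annealing hardware mentioned in the abstract and is only feasible for a handful of basis columns. The function name `best_binary_column` is hypothetical.

```python
from itertools import product
from typing import Sequence, Tuple


def best_binary_column(W: Sequence[Sequence[float]],
                       v: Sequence[float]) -> Tuple[int, ...]:
    """Return h in {0,1}^k minimizing the squared error ||v - W h||^2.

    W is given as a list of rows, each with k entries (one per basis column).
    """
    k = len(W[0])

    def squared_error(h: Tuple[int, ...]) -> float:
        return sum(
            (v_i - sum(w_ij * h_j for w_ij, h_j in zip(row, h))) ** 2
            for row, v_i in zip(W, v)
        )

    # Enumerate all 2^k binary vectors; an annealer would instead sample
    # low-energy solutions of the equivalent binary quadratic problem.
    return min(product((0, 1), repeat=k), key=squared_error)
```

Each column of H can be solved independently this way. A full factorization would alternate this step with a nonnegative least-squares update of W.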

# Spectral-Spatial Recurrent-Convolutional Networks for In-Vivo Hyperspectral Tumor Type Classification ( http://arxiv.org/abs/2007.01042v1 )

License: see the linked page

Marcel Bengs, Nils Gessert, Wiebke Laffers, Dennis Eggert, Stephan Westermann, Nina A. Mueller, Andreas O. H. Gerstner, Christian Betz, Alexander Schlaefer

Early detection of cancerous tissue is crucial for long-term patient survival. In the head and neck region, a typical diagnostic procedure is an endoscopic intervention where a medical expert manually assesses tissue using RGB camera images. While healthy and tumor regions are generally easier to distinguish, differentiating benign and malignant tumors is very challenging. This requires an invasive biopsy, followed by histological evaluation for diagnosis. Also, during tumor resection, tumor margins need to be verified by histological analysis. To avoid unnecessary tissue resection, a non-invasive, image-based diagnostic tool would be very valuable. Recently, hyperspectral imaging paired with deep learning has been proposed for this task, demonstrating promising results on ex-vivo specimens. In this work, we demonstrate the feasibility of in-vivo tumor type classification using hyperspectral imaging and deep learning. We analyze the value of using multiple hyperspectral bands compared to conventional RGB images, and we study several machine learning models' ability to make use of the additional spectral information. Based on our insights, we address spectral and spatial processing using recurrent-convolutional models for effective spectral aggregation and spatial feature learning. Our best model achieves an AUC of 76.3%, significantly outperforming previous conventional and deep learning methods.

Translated: 2022-11-14 14:35:24 Published: 2020-07-02

# OCTボリュームにおける物体位置推定のための4次元時空間畳み込みネットワーク

4D Spatio-Temporal Convolutional Networks for Object Position Estimation in OCT Volumes ( http://arxiv.org/abs/2007.01044v1 )

ライセンス: Link先を確認

Marcel Bengs, Nils Gessert, Alexander Schlaefer

(参考訳) オブジェクトの追跡とローカライズは、コンピュータ支援手術における中心的な問題である。光コヒーレンストモグラフィ(OCT)は、空間分解能と時間分解能が高いため、光学追跡システムとして用いられる。近年,3次元畳み込みニューラルネットワーク(CNN)は,一体積CT画像を用いたマーカーオブジェクトのポーズ推定に有望な性能を示した。このアプローチは空間情報にのみ依存するが、OCTはOCT画像ボリュームの時間的ストリームを高ボリュームレートで取得することを可能にする。本研究では、3次元CNNを4次元時空間CNNに体系的に拡張し、マーカーオブジェクト追跡に対する追加の時間情報の影響を評価する。その結果, OCTボリュームのストリームと4次元時空間畳み込みを用いた場合, 3次元CNNを用いた単一ボリューム処理と比較して平均絶対誤差が30%低いことがわかった。

Tracking and localizing objects is a central problem in computer-assisted surgery. Optical coherence tomography (OCT) can be employed as an optical tracking system, due to its high spatial and temporal resolution. Recently, 3D convolutional neural networks (CNNs) have shown promising performance for pose estimation of a marker object using single volumetric OCT images. While this approach relied on spatial information only, OCT allows for a temporal stream of OCT image volumes capturing the motion of an object at high volume rates. In this work, we systematically extend 3D CNNs to 4D spatio-temporal CNNs to evaluate the impact of additional temporal information for marker object tracking. Across various architectures, our results demonstrate that using a stream of OCT volumes and employing 4D spatio-temporal convolutions leads to a 30% lower mean absolute error compared to single volume processing with 3D CNNs.

翻訳日:2022-11-14 14:35:06 公開日:2020-07-02

# 食事療法のための人体取得・可視化・測定のためのRGB-Dベースのフレームワーク

RGB-D-based Framework to Acquire, Visualize and Measure the Human Body for Dietetic Treatments ( http://arxiv.org/abs/2007.00981v1 )

ライセンス: Link先を確認

Andrés Fuster-Guilló, Jorge Azorín-López, Marcelo Saval-Calvo, Juan Miguel Castillo-Zaragoza, Nahuel Garcia-DUrso, Robert B Fisher

(参考訳) 本研究の目的は,最先端のrgb-dセンサと仮想現実(vr)技術を用いた栄養栄養療法の改善である。近年の研究では、マルチメディア技術を用いて治療への付着性を改善することができる。しかし、この目的のために3dデータとvr技術を用いた研究はほとんどない。一方,食事療法中の患者の身体の3次元計測と経時的分析(4d)は困難である。この研究の主な貢献は、肥満治療に対する4次元体モデル可視化の効果を研究するための枠組みを提供することである。このシステムは、低コスト技術を用いて身体の完全な3dモデルを得ることができ、将来の簡単な移動を十分な精度と現実的な可視化で可能とし、肥満治療中の形状の進化(4d)の分析を可能にする。この3dボディモデルは、2dおよびvrデバイスを用いた肥満治療における可視化の効果を調べるために使用される。さらに,得られた3Dモデルを用いて体の測定を行う。合成オブジェクトと実オブジェクトの両方で測定値を得るための提案手法の精度について検討した。

This research aims to improve dietetic-nutritional treatment using state-of-the-art RGB-D sensors and virtual reality (VR) technology. Recent studies show that adherence to treatment can be improved using multimedia technologies. However, there are few studies using 3D data and VR technologies for this purpose. On the other hand, obtaining 3D measurements of the human body and analyzing them over time (4D) in patients undergoing dietary treatment is a challenging field. The main contribution of the work is to provide a framework to study the effect of 4D body model visualization on adherence to obesity treatment. The system can obtain a complete 3D model of a body using low-cost technology, allowing future straightforward transference with sufficient accuracy and realistic visualization, enabling the analysis of the evolution (4D) of the shape during the treatment of obesity. The 3D body models will be used for studying the effect of visualization on adherence to obesity treatment using 2D and VR devices. Moreover, we will use the acquired 3D models to obtain measurements of the body. An analysis of the accuracy of the proposed methods for obtaining measurements with both synthetic and real objects has been carried out.

翻訳日:2022-11-14 14:29:24 公開日:2020-07-02

# リアルタイム人間-ロボットインタラクションのための注意指向行動認識

Attention-Oriented Action Recognition for Real-Time Human-Robot Interaction ( http://arxiv.org/abs/2007.01065v1 )

ライセンス: Link先を確認

Ziyang Song, Ziyi Yin, Zejian Yuan, Chong Zhang, Wanchao Chi, Yonggen Ling, Shenghao Zhang

(参考訳) 行動認識タスクにおける顕著な進歩にもかかわらず、人間とロボットの相互作用に特化した行動認識では、多くの作業が行われていない。本稿では,インタラクションシナリオにおける行動認識タスクの特徴を深く検討し,リアルタイムインタラクションの必要性を満たすための注意指向マルチレベルネットワークフレームワークを提案する。具体的には、まず低解像度でシーン内のインタラクタに大まかに焦点を合わせ、高分解能で微細なポーズ推定を行うプリアテンションネットワークを用いる。他のコンパクトcnnは、抽出された骨格配列をアクション認識の入力として受け取り、局所空間-時間パターンとグローバル意味情報を効果的に捉えるための注意のようなメカニズムを利用する。このアプローチを評価するために,インタラクションシナリオにおける認識タスク用に,新たなアクションデータセットを構築した。モバイルコンピューティングプラットフォーム(Nvidia Jetson AGX Xavier)上でのデータセットと高効率(112fps/640 x 480 RGBD)の実験結果から,実時間人間ロボットインタラクションにおける動作認識に優れた適用性を示した。

Despite the notable progress made in action recognition tasks, not much work has been done in action recognition specifically for human-robot interaction. In this paper, we deeply explore the characteristics of the action recognition task in interaction scenarios and propose an attention-oriented multi-level network framework to meet the need for real-time interaction. Specifically, a Pre-Attention network is first employed to roughly focus on the interactor in the scene at low resolution and then to perform fine-grained pose estimation at high resolution. A second, compact CNN receives the extracted skeleton sequence as input for action recognition, utilizing attention-like mechanisms to capture local spatial-temporal patterns and global semantic information effectively. To evaluate our approach, we construct a new action dataset specifically for the recognition task in interaction scenarios. Experimental results on our dataset and high efficiency (112 fps at 640 x 480 RGBD) on the mobile computing platform (Nvidia Jetson AGX Xavier) demonstrate excellent applicability of our method to action recognition in real-time human-robot interaction.

翻訳日:2022-11-14 14:28:50 公開日:2020-07-02

# 実行長圧縮文書を非圧縮で自動ページ分割

Automatic Page Segmentation Without Decompressing the Run-Length Compressed Text Documents ( http://arxiv.org/abs/2007.01142v1 )

ライセンス: Link先を確認

Mohammed Javed and P. Nagabhushan

(参考訳) ページ分割は複雑なレイアウトを持つ文書の自動分析において重要な段階であると考えられている。これは伝統的に圧縮されていない文書で行われてきたが、実際の文書のほとんどは保存と転送を効率よくすることを要求する圧縮形式で存在する。しかし,圧縮の段階を経ることなく,圧縮文書内で直接ページ分割を行うことは難しい課題である。本研究では,ccitt group-3圧縮テキスト文書のラン長データに直接ページ分割操作を行う可能性を示す。そのため、テキスト文書を列に分割する前に、各欄を段落に、各段落をテキスト行に、各行を単語に分割し、最後に各単語を文字に分割して、テキスト文書の事前処理を行う必要がある。プリプロセッシングステージは、通常のテキスト領域と反転したテキスト領域を識別し、反転したテキスト領域を通常のモードにトグルする。カラム分離開始の続編では,空白空間の漸進的同化の新たな戦略が垂直方向に実行され,関連するパラメータの自己推定が提案されている。これらのパラメータを用いた列セグメンテーションを実現する手法を考案した。次に、次に示すのは2段階の水平行分離プロセスで、各列を段落に分割し、次にテキストラインに分割する。そして、単語と文字の分離を完了させる2段階の縦列分離プロセスが存在する。

Page segmentation is considered to be a crucial stage in the automatic analysis of documents with complex layouts. This has traditionally been carried out on uncompressed documents, although most real-life documents exist in compressed form so that storage and transfer are efficient. However, carrying out page segmentation directly on compressed documents, without going through a decompression stage, is a challenging goal. This paper demonstrates the possibility of carrying out page segmentation directly on the run-length data of a CCITT Group-3 compressed text document, which could be single- or multi-columned and might even have some text regions in inverted text color mode. Therefore, before segmenting the text document into columns, each column into paragraphs, each paragraph into text lines, each line into words, and, finally, each word into characters, the text document needs to be pre-processed. The pre-processing stage identifies the normal and inverted text regions, and the inverted text regions are toggled to the normal mode. Next, to initiate column separation, a new strategy of incremental assimilation of white-space runs in the vertical direction, together with auto-estimation of certain related parameters, is proposed. A procedure to realize column segmentation employing these extracted parameters has been devised. What follows is a two-level horizontal row separation process, which segments every column into paragraphs and, in turn, into text lines. Finally, a two-level vertical column separation process completes the separation into words and characters.
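
The idea of finding column gaps from run-length data without rebuilding the pixel image can be sketched as follows. This is only a minimal illustration under stated assumptions, not the authors' procedure: each scan line is assumed to be a list of alternating run lengths that starts with a white run, the gap width `min_gap` is passed in by hand rather than auto-estimated, and the function names are hypothetical.

```python
def white_intervals(runs):
    """Turn alternating run lengths (white run first) into white [start, end) spans."""
    spans, x = [], 0
    for i, length in enumerate(runs):
        if i % 2 == 0 and length > 0:  # even positions are white runs
            spans.append((x, x + length))
        x += length
    return spans


def intersect(a, b):
    """Intersect two sorted lists of half-open intervals."""
    out, i, j = [], 0, 0
    while i < len(a) and j < len(b):
        lo, hi = max(a[i][0], b[j][0]), min(a[i][1], b[j][1])
        if lo < hi:
            out.append((lo, hi))
        # Advance whichever interval ends first.
        if a[i][1] < b[j][1]:
            i += 1
        else:
            j += 1
    return out


def column_gaps(rows, min_gap):
    """Accumulate white space row by row and keep gaps at least min_gap wide."""
    common = white_intervals(rows[0])
    for runs in rows[1:]:
        common = intersect(common, white_intervals(runs))
    return [(lo, hi) for lo, hi in common if hi - lo >= min_gap]


# Two toy scan lines of width 20 with a shared white band in the middle.
rows = [[2, 5, 6, 5, 2], [3, 4, 5, 6, 2]]
gaps = column_gaps(rows, min_gap=4)
```

Only run boundaries are compared, so the cost depends on the number of runs rather than on the page width. Note that page margins wider than `min_gap` would also be reported and would need to be filtered separately.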

翻訳日:2022-11-14 14:28:33 公開日:2020-07-02

# 階層型ニューラルネットワークを用いた低出力物体カウント

Low-Power Object Counting with Hierarchical Neural Networks ( http://arxiv.org/abs/2007.01369v1 )

ライセンス: Link先を確認

Abhinav Goel, Caleb Tung, Sara Aghajanzadeh, Isha Ghodgaonkar, Shreya Ghosh, George K. Thiruvathukal, Yung-Hsiang Lu

(参考訳) ディープニューラルネットワーク(DNN)は、オブジェクトカウントなどの多くのコンピュータビジョンタスクにおいて最先端の精度を達成することができる。オブジェクトカウントはイメージとオブジェクトクエリの2つの入力を受け取り、クエリされたオブジェクトの発生回数を報告する。このようなタスクで高い精度を達成するために、DNNは数十億のオペレーションを必要とし、リソース制約のある低消費電力デバイスへのデプロイを困難にしている。以前の研究は、かなりの数のDNN操作が冗長であり、精度に影響を与えることなく排除できることを示している。これらの冗長性を低減するため,オブジェクトカウントのための階層型DNNアーキテクチャを提案する。このアーキテクチャは、RPN(Region Proposal Network)を使用して、クエリ対象を含む可能性のあるRegion-of-Interest(RoI)を提案する。階層型分類器は、実際にクエリされたオブジェクトを含むRoIを効率的に見つける。階層構造は視覚的に類似した対象カテゴリのグループを含む。階層の各ノードで小さなDNNを使用して、これらのグループを分類する。 RoIは階層分類器によって漸進的に処理される。 RoI のオブジェクトがクエリ対象と同じグループであれば、階層内の次の DNN は RoI をさらに処理し、そうでなければ RoI は破棄される。各画像を処理するためにいくつかの小さなDNNを使用することで、既存のオブジェクトカウンタと比較して、メモリ要求、推論時間、エネルギー消費、操作数を無視できる精度損失で削減できる。

Deep Neural Networks (DNNs) can achieve state-of-the-art accuracy in many computer vision tasks, such as object counting. Object counting takes two inputs: an image and an object query and reports the number of occurrences of the queried object. To achieve high accuracy on such tasks, DNNs require billions of operations, making them difficult to deploy on resource-constrained, low-power devices. Prior work shows that a significant number of DNN operations are redundant and can be eliminated without affecting the accuracy. To reduce these redundancies, we propose a hierarchical DNN architecture for object counting. This architecture uses a Region Proposal Network (RPN) to propose regions-of-interest (RoIs) that may contain the queried objects. A hierarchical classifier then efficiently finds the RoIs that actually contain the queried objects. The hierarchy contains groups of visually similar object categories. Small DNNs are used at each node of the hierarchy to classify between these groups. The RoIs are incrementally processed by the hierarchical classifier. If the object in an RoI is in the same group as the queried object, then the next DNN in the hierarchy processes the RoI further; otherwise, the RoI is discarded. By using a few small DNNs to process each image, this method reduces the memory requirement, inference time, energy consumption, and number of operations with negligible accuracy loss when compared with the existing object counters.
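
The early-discard logic of the hierarchical classifier can be sketched as below. This is a hedged illustration only: the paper places a small DNN at each node of the hierarchy, whereas this sketch assumes one stand-in classifier per level, and the group names, the dictionary-based RoIs, and the function `count_objects` are all hypothetical.

```python
def count_objects(rois, query_path, classifiers):
    """Count RoIs whose predicted group matches the query at every level.

    rois        -- list of RoI feature objects (opaque to this function)
    query_path  -- group labels from root to leaf for the queried object
    classifiers -- one callable per level; classifiers[i](roi) returns the
                   predicted group label at level i (stand-ins for small DNNs)
    """
    count = 0
    for roi in rois:
        for level, expected in enumerate(query_path):
            if classifiers[level](roi) != expected:
                break  # wrong group: discard the RoI, deeper classifiers never run
        else:
            count += 1  # the RoI survived every level, so it matches the query
    return count


# Toy RoIs that already carry their labels; real RoIs would be image features.
rois = [
    {"coarse": "animal", "fine": "cat"},
    {"coarse": "animal", "fine": "dog"},
    {"coarse": "vehicle", "fine": "car"},
    {"coarse": "animal", "fine": "cat"},
]
classifiers = [lambda r: r["coarse"], lambda r: r["fine"]]
num_cats = count_objects(rois, ["animal", "cat"], classifiers)
```

The saving comes from the `break`: an RoI rejected at a coarse level costs one small classifier call instead of a pass through the whole hierarchy.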

翻訳日:2022-11-14 14:26:44 公開日:2020-07-02

# D-NetPAD:説明可能で解釈可能なアイリス提示攻撃検出器

D-NetPAD: An Explainable and Interpretable Iris Presentation Attack Detector ( http://arxiv.org/abs/2007.01381v1 )

ライセンス: Link先を確認

Renu Sharma and Arun Ross

(参考訳) 虹彩認識システムは、相手が印刷された目、プラスチックの目、化粧品のコンタクトレンズなどの人工物を提示してシステムを回避する、提示攻撃(PA)に対して脆弱である。本研究では、DenseNet畳み込みニューラルネットワークアーキテクチャに基づくD-NetPADと呼ばれる有効で堅牢なアイリスPA検出器を提案する。 PAアーティファクト、センサー、データセット間の一般化性を示す。プロプライエタリデータセットと公開データセット(LivDet-2017)で実施した実験では,提案手法の有効性が検証された。提案手法は,プロプライエタリなデータセットでは0.2%の誤検出率で98.58%の真の検出率を示し,LivDet-2017データセットでは最先端の手法よりも優れていた。 D-NetPADの性能を説明するため、t-SNEプロットとGrad-CAMを用いて中間特徴分布と固定熱マップを可視化する。さらに,ネットワークによって抽出される特徴の性質を説明するために周波数解析を行う。ソースコードとトレーニングされたモデルはhttps://github.com/iPRoBe-lab/D-NetPADで入手できる。

An iris recognition system is vulnerable to presentation attacks, or PAs, where an adversary presents artifacts such as printed eyes, plastic eyes, or cosmetic contact lenses to circumvent the system. In this work, we propose an effective and robust iris PA detector called D-NetPAD based on the DenseNet convolutional neural network architecture. It demonstrates generalizability across PA artifacts, sensors and datasets. Experiments conducted on a proprietary dataset and a publicly available dataset (LivDet-2017) substantiate the effectiveness of the proposed method for iris PA detection. The proposed method results in a true detection rate of 98.58% at a false detection rate of 0.2% on the proprietary dataset and outperforms state-of-the-art methods on the LivDet-2017 dataset. We visualize intermediate feature distributions and fixation heatmaps using t-SNE plots and Grad-CAM, respectively, in order to explain the performance of D-NetPAD. Further, we conduct a frequency analysis to explain the nature of features being extracted by the network. The source code and trained model are available at https://github.com/iPRoBe-lab/D-NetPAD.
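
A figure such as "true detection rate at a given false detection rate" can be computed from detector scores roughly as follows. This is a sketch under assumptions, not the authors' evaluation code: it assumes that a higher score means "more likely a presentation attack", that the true detection rate is the fraction of attack samples flagged, and that the false detection rate is the fraction of bona fide samples flagged. The function name and the toy scores are hypothetical.

```python
def tdr_at_fdr(attack_scores, bonafide_scores, target_fdr):
    """Best true detection rate among thresholds whose FDR stays within target_fdr.

    A sample is flagged as an attack when its score is >= the threshold.
    """
    best = 0.0
    # Every observed score is a candidate threshold.
    for t in sorted(set(attack_scores) | set(bonafide_scores)):
        fdr = sum(s >= t for s in bonafide_scores) / len(bonafide_scores)
        if fdr <= target_fdr:
            tdr = sum(s >= t for s in attack_scores) / len(attack_scores)
            best = max(best, tdr)
    return best


# Toy scores, invented for illustration only.
attack = [0.9, 0.8, 0.7, 0.4]
bonafide = [0.1, 0.2, 0.3, 0.5, 0.05]
strict = tdr_at_fdr(attack, bonafide, target_fdr=0.0)
relaxed = tdr_at_fdr(attack, bonafide, target_fdr=0.25)
```

Tightening the allowed false detection rate forces a higher threshold, which is why `strict` cannot exceed `relaxed`.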

翻訳日:2022-11-14 14:26:23 公開日:2020-07-02

# ラテン文字で書かれた南アジアの言語処理:Dakshinaデータセット

Processing South Asian Languages Written in the Latin Script: the Dakshina Dataset ( http://arxiv.org/abs/2007.01176v1 )

ライセンス: Link先を確認

Brian Roark, Lawrence Wolf-Sonkin, Christo Kirov, Sabrina J. Mielke, Cibu Johny, Isin Demirsahin, Keith Hall

(参考訳) 本稿では,南アジア12言語を対象に,ラテン文字とネイティブ文字の両方のテキストからなる新しい資料であるdakshinaデータセットについて述べる。データセットは、各言語について、以下を含む。 1) 原本ウィキペディアテキスト 2) romanization lexicon,及び 3) 言語のネイティブスクリプトと基本ラテン文字の両方で、全文の並列データを生成する。各言語でwikipediaテキストの作成と選択に使用される方法、サンプルされた辞書に対する検証済みのローマ字化の収集、ネイティブスクリプトコレクションからの保持された文の手動ローマ字化を文書化する。さらに、単一単語の文字化、全文の文字化、ネイティブスクリプトとロマン化テキストの言語モデリングなど、データセットで可能ないくつかのタスクのベースライン結果も提供する。キーワード:ロマン化、翻訳、南アジア諸語

This paper describes the Dakshina dataset, a new resource consisting of text in both the Latin and native scripts for 12 South Asian languages. The dataset includes, for each language: 1) native script Wikipedia text; 2) a romanization lexicon; and 3) full sentence parallel data in both a native script of the language and the basic Latin alphabet. We document the methods used for preparation and selection of the Wikipedia text in each language; collection of attested romanizations for sampled lexicons; and manual romanization of held-out sentences from the native script collections. We additionally provide baseline results on several tasks made possible by the dataset, including single word transliteration, full sentence transliteration, and language modeling of native script and romanized text. Keywords: romanization, transliteration, South Asian languages

翻訳日:2022-11-14 14:20:40 公開日:2020-07-02

# テキスト分類のための新しいBGCapsule Network

A Novel BGCapsule Network for Text Classification ( http://arxiv.org/abs/2007.04302v1 )

ライセンス: Link先を確認

Akhilesh Kumar Gangwar and Vadlamani Ravi

(参考訳) 感情分析、ニュース分類、複数ラベル分類、意見分類といったテキスト分類タスクは、現代のディープラーニングネットワークにおいても難しい問題である。近年,画像分類にはCapsule Networks (CapsNets) が提案されている。 CapsNets は Convolutional Neural Networks (CNNs) に対していくつかの利点があるが、テキスト領域での妥当性は調査されていない。本稿では,複数のテキスト分類タスクにおいて,双方向ゲート型再帰ユニット(bigru)のアンサンブルが先行するカプセルモデルであるbgcapsuleを提案する。主カプセル層に先行する特徴抽出層に対して,両方向GRUのアンサンブルを用いた。このハイブリッドアーキテクチャは、基本的な前処理ステップを実行した後、グラブに基づく埋め込み層、bigruベースのアンサンブル層、プライマリカプセル層、フラット層、完全に接続されたrelu層、そして完全に接続されたsoftmax層からなる。 bgcapsuleの有効性を評価するために,映画レビュー(mr imdb 2005),ag newsデータセット,dbpedia ontologyデータセット,yelp review full dataset,yelp review polarityデータセットを含む5つのベンチマークデータセット(10,000レコードから70万レコード)について,広範な実験を行った。これらのベンチマークは、ニュース分類、感情分析、マルチクラス分類、マルチラベル分類、意見分類など、いくつかのテキスト分類タスクをカバーする。提案するアーキテクチャ(bgcapsule)は,ポジティブ感情キーワードやネガティブ感情キーワードなどの外部言語知識を必要とせず,既存の手法と比較して精度が向上した。さらに、BGCapsuleは他の既存の技術よりも早く収束した。

Several text classification tasks such as sentiment analysis, news categorization, multi-label classification and opinion classification are challenging problems even for modern deep learning networks. Recently, Capsule Networks (CapsNets) have been proposed for image classification. It has been shown that CapsNets have several advantages over Convolutional Neural Networks (CNNs), while their validity in the domain of text has been less explored. In this paper, we propose a novel hybrid architecture, viz., BGCapsule, which is a Capsule model preceded by an ensemble of Bidirectional Gated Recurrent Units (BiGRU) for several text classification tasks. We employ an ensemble of Bidirectional GRUs as the feature extraction layer preceding the primary capsule layer. The hybrid architecture, after performing basic pre-processing steps, consists of five layers: an embedding layer based on GloVe, a BiGRU-based ensemble layer, a primary capsule layer, a flatten layer and a fully connected ReLU layer followed by a fully connected softmax layer. In order to evaluate the effectiveness of BGCapsule, we conducted extensive experiments on five benchmark datasets (ranging from 10,000 records to 700,000 records) including Movie Review (MR Imdb 2005), AG News dataset, Dbpedia ontology dataset, Yelp Review Full dataset and Yelp review polarity dataset. These benchmarks cover several text classification tasks such as news categorization, sentiment analysis, multiclass classification, multi-label classification and opinion classification. We found that our proposed architecture (BGCapsule) achieves better accuracy compared to the existing methods without the help of any external linguistic knowledge such as positive sentiment keywords and negative sentiment keywords. Further, BGCapsule converged faster compared to other extant techniques.

翻訳日:2022-11-14 14:19:55 公開日:2020-07-02

# 移動背景に対する翻訳対象方向のデコードのためのショウジョウバエ運動視覚経路のモデル化

Modelling Drosophila Motion Vision Pathways for Decoding the Direction of Translating Objects Against Cluttered Moving Backgrounds ( http://arxiv.org/abs/2007.00886v1 )

ライセンス: Link先を確認

Qinbing Fu and Shigang Yue

(参考訳) 乱雑な動きの背景の前でオブジェクトを翻訳する方向を正しくかつ効率的にデコードすることは依然として難しい問題である。自然界において、軽量で低出力の飛行昆虫は、飛行中に高度に変動する環境において移動目標を検出するために運動視覚を適用し、運動知覚戦略を学ぶのに優れたパラダイムである。本稿では, ショウジョウバエの運動視覚経路を調査し, 最先端の生理学的研究に基づく計算モデルを提案する。提案する視覚系モデルでは,生物工学的ON・OF経路,広視野水平感度(HS),垂直感度(VS)システムなどが特徴である。本研究の主な貢献は2つの側面である。 1) 本モデルでは, 方向選択性(DS) と方向応答性(DO) の両方の反応を, フィードフォワード方式で, 運動知覚神経回路の主特性として明らかにした。 2) 運動前フィルタリング機構とON経路およびOF経路内の局所相関器のアンサンブルの組み合わせを含む時空間力学のモデル化により, 乱れの進行する背景の物体の翻訳に対して頑健な方向選択性を示し, 背景運動や乱れを効果的に抑制し, 動的応答を改善する。従って、対象の翻訳方向は、好ましくない方向(PD)またはヌル方向(ND)翻訳を示す正または負の出力を持つHSおよびVSシステムのグローバル応答として復号される。実験では,提案したニューラルネットワークモデルの有効性を検証し,より高速な移動,高コントラスト,大規模ターゲットへの応答性を示す。

Decoding the direction of translating objects in front of cluttered moving backgrounds, accurately and efficiently, is still a challenging problem. In nature, lightweight and low-powered flying insects apply motion vision to detect a moving target in highly variable environments during flight, making them excellent paradigms for learning motion perception strategies. This paper investigates the fruit fly Drosophila motion vision pathways and presents computational modelling based on cutting-edge physiological research. The proposed visual system model features bio-plausible ON and OFF pathways, and wide-field horizontal-sensitive (HS) and vertical-sensitive (VS) systems. The main contributions of this research are twofold: 1) the proposed model articulates the forming of both direction-selective (DS) and direction-opponent (DO) responses, revealed as principal features of motion perception neural circuits, in a feed-forward manner; 2) it also shows robust direction selectivity to translating objects in front of cluttered moving backgrounds, via the modelling of spatiotemporal dynamics including a combination of motion pre-filtering mechanisms and ensembles of local correlators inside both the ON and OFF pathways, which works effectively to suppress irrelevant background motion or distractors, and to improve the dynamic response. Accordingly, the direction of translating objects is decoded as global responses of both the HS and VS systems, with positive or negative output indicating preferred-direction (PD) or null-direction (ND) translation. The experiments have verified the effectiveness of the proposed neural system model, and demonstrated its responsive preference to faster-moving, higher-contrast and larger-size targets embedded in cluttered moving backgrounds.
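
How a local correlator can yield a positive output for one direction and a negative output for the other can be illustrated with the classic Hassenstein-Reichardt correlator. This is a stand-in chosen for illustration and is much simpler than the model in the abstract (no ON/OFF split, no pre-filtering, no HS/VS pooling); the function name, the one-sample delay, and the toy signals are assumptions.

```python
def correlator_response(left, right, delay=1):
    """Opponent output of a two-input delay-and-correlate motion detector.

    left, right -- brightness over time at two neighbouring photoreceptors
    Each half multiplies the delayed signal of one input with the current
    signal of the other; subtracting the halves gives a signed response.
    """
    total = 0.0
    for t in range(delay, len(left)):
        preferred = left[t - delay] * right[t]   # motion from left to right
        null = right[t - delay] * left[t]        # motion from right to left
        total += preferred - null
    return total


# A bright spot that visits the left receptor one step before the right one.
left_to_right = correlator_response([0, 1, 0, 0], [0, 0, 1, 0])
# The same spot travelling the other way.
right_to_left = correlator_response([0, 0, 1, 0], [0, 1, 0, 0])
```

The sign of the result plays the role of the preferred-direction versus null-direction indication described above, here for a single detector rather than a wide-field system.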

翻訳日:2022-11-14 14:19:26 公開日:2020-07-02

# ビデオから道路のレイアウトを理解する

Understanding Road Layout from Videos as a Whole ( http://arxiv.org/abs/2007.00822v1 )

ライセンス: Link先を確認

Buyu Liu, Bingbing Zhuang, Samuel Schulter, Pan Ji, Manmohan Chandraker

(参考訳) 本稿では,複雑な道路シーンのレイアウトをビデオシーケンスから推定する問題に対処する。この目的のために,道路属性予測問題として定式化し,その目的は各フレームの属性を正確かつ一貫して予測することである。先行研究とは対照的に,映像中のカメラの動きを活用すること,長期的映像情報を取り入れることの3つの新しい側面を生かした。具体的には,ビデオの予測一貫性を強制するモデルを提案する。我々のモデルは1つのLSTMと1つの特徴変換モジュール(FTM)から構成される。前者は隠された状態との一貫性の制約を暗黙的に含み、後者はビデオに沿って情報を集約する際にカメラの動きを明示的に考慮する。さらに,道路参加者,例えばオブジェクトをモデルに組み込むことにより,文脈情報を組み込むことを提案する。ビデオシーケンス全体が利用可能になると、私たちのモデルは、例えば過去と将来のフレームからの情報など、ローカルとグローバルの両方の手がかりをエンコードすることもできます。 1) グローバルまたは文脈的手がかりのいずれかを組み込むことで、予測精度が向上し、両方の活用が最高のパフォーマンスをもたらす。 2) LSTMおよびFTMモジュールの導入により,ビデオの予測一貫性が向上する。 (3)提案手法はSOTAよりも大きなマージンで優れている。

In this paper, we address the problem of inferring the layout of complex road scenes from video sequences. To this end, we formulate it as a top-view road attributes prediction problem and our goal is to predict these attributes for each frame both accurately and consistently. In contrast to prior work, we exploit the following three novel aspects: leveraging camera motions in videos, including context cues, and incorporating long-term video information. Specifically, we introduce a model that aims to enforce prediction consistency in videos. Our model consists of one LSTM and one Feature Transform Module (FTM). The former implicitly incorporates the consistency constraint with its hidden states, and the latter explicitly takes the camera motion into consideration when aggregating information along videos. Moreover, we propose to incorporate context information by introducing road participants, e.g. objects, into our model. When the entire video sequence is available, our model is also able to encode both local and global cues, e.g. information from both past and future frames. Experiments on two data sets show that: (1) Incorporating either global or contextual cues improves the prediction accuracy and leveraging both gives the best performance. (2) Introducing the LSTM and FTM modules improves the prediction consistency in videos. (3) The proposed method outperforms the SOTA by a large margin.

翻訳日:2022-11-14 14:18:25 公開日:2020-07-02

# ACFD:非対称カルトン顔検出器

ACFD: Asymmetric Cartoon Face Detector ( http://arxiv.org/abs/2007.00899v1 )

ライセンス: Link先を確認

Bin Zhang, Jian Li, Yabiao Wang, Zhipeng Cui, Yili Xia, Chengjie Wang, Jilin Li, Feiyue Huang

(参考訳) カルトゥーンの顔検出は、難しいシナリオが多いため、人間の顔検出よりも難しい作業である。本稿では, 顔内における大きな違いなど, マンガの顔の特徴に着目し, ACFD という非対称なマンガの顔検出手法を提案する。具体的には、いくつかの非対称な単発アグリゲーションモジュール(AOSA)、非対称な双方向特徴ピラミッドネットワーク(ABi-FPN)、動的アンカーマッチング戦略(DAM)、対応するマージン二分分類損失(MBC)からなる新しいバックボーンVoVNetV3である。特に、多様な受容場を持つ特徴を生成するために、VoVNetV3によりマルチスケールピラミッドの特徴を抽出し、いくつかの極端なポーズで顔を扱うためにABi-FPNによって同時に融合・拡張され、異なるアスペクト比を有する。さらに、DAMは顔ごとに十分な高品質のアンカーに適合し、MBCは差別の強い力である。これらのモジュールの有効性により,acfdは,モデルサイズ200mb,イメージあたりの推論時間50ms,トレーニング済みモデルなしで,2020 icartoon face challengeの検出トラックで第1位を達成した。

Cartoon face detection is a more challenging task than human face detection because many difficult scenarios are involved. Aiming at the characteristics of cartoon faces, such as huge intra-face differences, in this paper, we propose an asymmetric cartoon face detector, named ACFD. Specifically, it consists of the following modules: a novel backbone VoVNetV3 comprised of several asymmetric one-shot aggregation modules (AOSA), asymmetric bi-directional feature pyramid network (ABi-FPN), dynamic anchor match strategy (DAM) and the corresponding margin binary classification loss (MBC). In particular, to generate features with diverse receptive fields, multi-scale pyramid features are extracted by VoVNetV3, and then fused and enhanced simultaneously by ABi-FPN for handling faces in extreme poses and with disparate aspect ratios. Besides, DAM is used to match enough high-quality anchors for each face, and MBC provides strong discriminative power. With the effectiveness of these modules, our ACFD achieves the 1st place on the detection track of 2020 iCartoon Face Challenge under the constraints of model size 200MB, inference time 50ms per image, and without any pretrained models.

翻訳日:2022-11-14 14:18:05 公開日:2020-07-02

# マルチソースドメイン適応におけるソース選択のためのカリキュラムマネージャ

Curriculum Manager for Source Selection in Multi-Source Domain Adaptation ( http://arxiv.org/abs/2007.01261v1 )

ライセンス: Link先を確認

Luyu Yang, Yogesh Balaji, Ser-Nam Lim, Abhinav Shrivastava

(参考訳) マルチソース非教師付きドメイン適応の性能は、ラベル付きソースドメインサンプルからの転送の有効性に大きく依存する。本稿では,資源選択のためのカリキュラムマネージャ (CMSS) と呼ばれる,資源サンプルの動的カリキュラムを学習する逆エージェントを提案する。独立したネットワークモジュールであるCurriculum Managerは、トレーニング中のカリキュラムを常に更新し、どのドメインやサンプルがターゲットに合わせるのに適しているかを反復的に学習する。この背景にある直感は、Curriculum Managerが遅延ドメインの転送可能性を常に再測定し、ドメイン識別器のエラー率を逆向きに上昇させることである。 CMSSはドメインラベルの知識を一切必要としないが、よく知られた4つのベンチマークの他の手法よりもかなり優れている。また,提案手法に光を当てた解釈可能な結果も提供する。

The performance of Multi-Source Unsupervised Domain Adaptation depends significantly on the effectiveness of transfer from labeled source domain samples. In this paper, we propose an adversarial agent that learns a dynamic curriculum for source samples, called Curriculum Manager for Source Selection (CMSS). The Curriculum Manager, an independent network module, constantly updates the curriculum during training, and iteratively learns which domains or samples are best suited for aligning to the target. The intuition behind this is to force the Curriculum Manager to constantly re-measure the transferability of latent domains over time to adversarially raise the error rate of the domain discriminator. CMSS does not require any knowledge of the domain labels, yet it outperforms other methods on four well-known benchmarks by significant margins. We also provide interpretable results that shed light on the proposed method.

翻訳日:2022-11-14 14:11:19 公開日:2020-07-02

# DATE:MPSoCを使用可能なDVFSにおける高温サイドチャネル攻撃に対する防御

DATE: Defense Against TEmperature Side-Channel Attacks in DVFS Enabled MPSoCs ( http://arxiv.org/abs/2007.01377v1 )

ライセンス: Link先を確認

Somdip Dey, Amit Kumar Singh, Xiaohang Wang, and Klaus Dieter McDonald-Maier

(参考訳) 組込みデバイスを日常的に利用することの絶え間ない増加を考えると、サイドチャネルはそのようなシステムにおける情報フロー制御とセキュリティの課題である。そのような重要なセキュリティ欠陥の1つは、温度側チャネル攻撃によって悪用され、そこでは、セキュリティ欠陥を推測するために、処理要素からの放熱と伝播が時間とともに観測される。提案手法であるdate: defense against temperature side-channel attackでは,温度側チャネル攻撃に対してより安全なシステムを実現するために,空間的および時間的温度勾配を低減し,同時にデバイスの寿命の信頼性を高める新しい手法を提案する。本稿では,コンピュータシステムに対する温度側チャネル攻撃に対するセキュリティを定量化できる新しい指標であるサーマル・セキュリティ・イン・マルチ・プロシーサ(TSMP)を導入し,DATEは最先端のアプリケーションに対して最大139.24%の安全性を示し,温度サイクルを67.42%削減した。

Given the constant rise in utilizing embedded devices in daily life, side channels remain a challenge to information flow control and security in such systems. One such important security flaw could be exploited through temperature side-channel attacks, where heat dissipation and propagation from the processing elements are observed over time in order to deduce security flaws. In our proposed methodology, DATE: Defense Against TEmperature side-channel attacks, we propose a novel approach of reducing spatial and temporal thermal gradient, which makes the system more secure against temperature side-channel attacks, and at the same time increases the reliability of the device in terms of lifespan. In this paper, we have also introduced a new metric, Thermal-Security-in-Multi-Processors (TSMP), which is capable of quantifying the security against temperature side-channel attacks on computing systems, and DATE is evaluated to be 139.24% more secure at the most for certain applications than the state-of-the-art, while reducing thermal cycle by 67.42% at the most.

翻訳日:2022-11-14 14:11:04 公開日:2020-07-02

# 畳み込みニューラルネットワークを用いたX線画像上のCOVID-19症例の自動検出

Automatic Detection of COVID-19 Cases on X-ray images Using Convolutional Neural Networks ( http://arxiv.org/abs/2007.05494v1 )

ライセンス: Link先を確認

Lucas P. Soares and Cesar P. Soares

(参考訳) ここ数カ月、世界は新型コロナウイルスの急速な進歩に驚いている。この病気に直面し、社会経済的影響を最小限に抑えるためには、監視と治療に加えて、診断が重要な手順である。しかし、この実現には遅れや実験室への限られたアクセスが妨げられ、ケーストリアージを行うための新たな戦略が要求される。このシナリオでは、胸部x線およびct画像に基づく診断プロセスを支援するオプションとして、ディープラーニングモデルが提案されている。そこで本研究では,深層学習による畳み込みニューラルネットワーク(cnn)を用いて,胸部画像から新型コロナウイルスの検出プロセスを自動化することを目的とした。この結果は、covid-19の他の種類の検出方法へのアクセスを拡大し、この病気を識別するプロセスをスピードアップに寄与する可能性がある。使用するすべてのデータベース、ビルドされたコード、およびモデルのトレーニングから得られた結果は、オープンアクセスで利用できる。この行動は、結果の改善に寄与し、その結果、新型コロナウイルスに直面する進歩に寄与するため、他の研究者によるこれらのモデルの強化への関与を促進する。

In recent months the world has been surprised by the rapid advance of COVID-19. In order to face this disease and minimize its socio-economic impacts, in addition to surveillance and treatment, diagnosis is a crucial procedure. However, the realization of this is hampered by the delay and the limited access to laboratory tests, demanding new strategies to carry out case triage. In this scenario, deep learning models are being proposed as a possible option to assist the diagnostic process based on chest X-ray and computed tomography images. Therefore, this research aims to automate the process of detecting COVID-19 cases from chest images, using convolutional neural networks (CNN) through deep learning techniques. The results can contribute to expand access to other forms of detection of COVID-19 and to speed up the process of identifying this disease. All databases used, the codes built, and the results obtained from the models' training are available for open access. This action facilitates the involvement of other researchers in enhancing these models since this can contribute to the improvement of results and, consequently, the progress in confronting COVID-19.

翻訳日:2022-11-14 14:10:04 公開日:2020-07-02

# ウェアラブル呼吸モニタリング:コンテキストとセンサバイオマーカーによる解釈可能な推論

Wearable Respiration Monitoring: Interpretable Inference with Context and Sensor Biomarkers ( http://arxiv.org/abs/2007.01413v1 )

ライセンス: Link先を確認

Ridwan Alam, David B. Peden, and John C. Lach

(参考訳) 呼吸速度(br)、微小換気(ve)、その他の呼吸パラメータは、喘息などの多くの急性疾患の患者をリアルタイムにモニターするのに必須である。呼吸測定のための臨床標準、すなわちスピロメトリは、継続的な使用には適さない。ウェアラブルは心電図や運動といった多くの生理的信号を追跡できるが、呼吸はできない。他のモダリティからの呼吸は活発な研究の領域となっている。本研究では,ウェアラブル心電図と手首運動信号から呼吸パラメータを推定する。本研究では,文脈条件付き推論モデル学習において,物理活動などの利用可能なコンテキスト情報を利用するモジュール型で一般化可能な分類回帰パイプラインを提案する。これらのモデルで使用するウェアラブルecgから形態素およびパワー領域の新しい特徴を抽出する。このパイプラインには探索的特徴選択法が組み込まれ、アプリケーション固有の解釈可能なバイオマーカーを発見する。 15項目のデータを用いて,提案したパイプラインの2つの実装(BRとVE)を評価する。各実装は、一般化線形モデル、ランダムフォレスト、サポートベクトルマシン、ガウス過程回帰、および近傍成分分析を文脈回帰モデルとして比較する。置換、正則化、関連性決定法は、ECGの特徴をランク付けし、モデルや活動間で堅牢なECGバイオマーカーを特定するために用いられる。この研究は、連続監視だけでなく、バイオマーカーによる予防対策の設計においてもウェアラブルセンサーの可能性を示している。

Breathing rate (BR), minute ventilation (VE), and other respiratory parameters are essential for real-time patient monitoring in many acute health conditions, such as asthma. The clinical standard for measuring respiration, namely Spirometry, is hardly suitable for continuous use. Wearables can track many physiological signals, like ECG and motion, yet not respiration. Deriving respiration from other modalities has become an area of active research. In this work, we infer respiratory parameters from wearable ECG and wrist motion signals. We propose a modular and generalizable classification-regression pipeline to utilize available context information, such as physical activity, in learning context-conditioned inference models. Morphological and power domain novel features from the wearable ECG are extracted to use with these models. Exploratory feature selection methods are incorporated in this pipeline to discover application-specific interpretable biomarkers. Using data from 15 subjects, we evaluate two implementations of the proposed pipeline: for inferring BR and VE. Each implementation compares generalized linear model, random forest, support vector machine, Gaussian process regression, and neighborhood component analysis as contextual regression models. Permutation, regularization, and relevance determination methods are used to rank the ECG features to identify robust ECG biomarkers across models and activities. This work demonstrates the potential of wearable sensors not only in continuous monitoring, but also in designing biomarker-driven preventive measures.

翻訳日:2022-11-14 14:09:09 公開日:2020-07-02

# 事実に基づくテキスト編集

Fact-based Text Editing ( http://arxiv.org/abs/2007.00916v1 )

ライセンス: Link先を確認

Hayate Iso, Chao Qiao, Hang Li

(参考訳) 本稿では,知識ベース(例えば,いくつかの三重項)における事実をよりよく記述するために,与えられた文書を改訂することを目的とした,新しいテキスト編集タスクである fact-based text editing を提案する。真実を反映することはテキスト編集において一般的な要件であるため、このタスクは実用上重要である。まず、各インスタンスがドラフトテキスト、改訂テキスト、およびトリプルで表現された複数の事実からなる、事実ベースのテキスト編集の研究のためのデータセットを自動生成する手法を提案する。この手法を2つの公開テーブルツーテキストデータセットに適用し,それぞれ233kインスタンスと37kインスタンスからなる2つの新しいデータセットを得る。次に,バッファ,ストリーム,メモリを用いて与えられた事実を参照してドラフトテキストを編集する,事実ベースのテキスト編集のための新たなニューラルネットワークアーキテクチャ, FactEditorを提案する。この問題に対処するための簡単なアプローチは、エンコーダ-デコーダモデルを採用することである。この2つのデータセットの実験結果から, FactEditorは忠実度と流暢性の点でエンコーダ-デコーダのアプローチを上回ることが示された。結果はまた、FactEditor がエンコーダ-デコーダのアプローチよりも高速に推論を行うことを示している。

We propose a novel text editing task, referred to as \textit{fact-based text editing}, in which the goal is to revise a given document to better describe the facts in a knowledge base (e.g., several triples). The task is important in practice because reflecting the truth is a common requirement in text editing. First, we propose a method for automatically generating a dataset for research on fact-based text editing, where each instance consists of a draft text, a revised text, and several facts represented in triples. We apply the method into two public table-to-text datasets, obtaining two new datasets consisting of 233k and 37k instances, respectively. Next, we propose a new neural network architecture for fact-based text editing, called \textsc{FactEditor}, which edits a draft text by referring to given facts using a buffer, a stream, and a memory. A straightforward approach to address the problem would be to employ an encoder-decoder model. Our experimental results on the two datasets show that \textsc{FactEditor} outperforms the encoder-decoder approach in terms of fidelity and fluency. The results also show that \textsc{FactEditor} conducts inference faster than the encoder-decoder approach.

翻訳日:2022-11-14 14:08:49 公開日:2020-07-02

# IIE-NLP-NUT at SemEval-2020 Task 4: Guiding PLM with Prompt Template Reconstruction Strategy for ComVE

IIE-NLP-NUT at SemEval-2020 Task 4: Guiding PLM with Prompt Template Reconstruction Strategy for ComVE ( http://arxiv.org/abs/2007.00924v1 )

License: check the linked page

Luxi Xing, Yuqiang Xie, Yue Hu, Wei Peng

This paper introduces our systems for the first two subtasks of SemEval-2020 Task 4: Commonsense Validation and Explanation. To clarify the intention for judgment and inject contrastive information for selection, we propose an input reconstruction strategy with prompt templates. Specifically, we formalize the subtasks into the multiple-choice question answering format and construct the input with the prompt templates; the final prediction of question answering is then taken as the result of the subtasks. Experimental results show that our approaches achieve significantly better performance than the baseline systems. Our approaches secure the third rank on both official test sets of the first two subtasks, with accuracies of 96.4 and 94.3, respectively.

Translated: 2022-11-14 14:08:27 Published: 2020-07-02

# Project PIAF: Building a Native French Question-Answering Dataset

Project PIAF: Building a Native French Question-Answering Dataset ( http://arxiv.org/abs/2007.00968v1 )

License: check the linked page

Rachel Keraron, Guillaume Lancrenon, Mathilde Bras, Frédéric Allary, Gilles Moyse, Thomas Scialom, Edmundo-Pavel Soriano-Morales, Jacopo Staiano

Motivated by the lack of data for non-English languages, in particular for the evaluation of downstream tasks such as Question Answering, we present a participatory effort to collect a native French Question Answering Dataset. Furthermore, we describe and publicly release the annotation tool developed for our collection effort, along with the data obtained and preliminary baselines.

Translated: 2022-11-14 14:08:13 Published: 2020-07-02

# Debiasing Distributed Second Order Optimization with Surrogate Sketching and Scaled Regularization

Debiasing Distributed Second Order Optimization with Surrogate Sketching and Scaled Regularization ( http://arxiv.org/abs/2007.01327v1 )

License: check the linked page

Michał Dereziński, Burak Bartan, Mert Pilanci and Michael W. Mahoney

In distributed second order optimization, a standard strategy is to average many local estimates, each of which is based on a small sketch or batch of the data. However, the local estimates on each machine are typically biased, relative to the full solution on all of the data, and this can limit the effectiveness of averaging. Here, we introduce a new technique for debiasing the local estimates, which leads to both theoretical and empirical improvements in the convergence rate of distributed second order methods. Our technique has two novel components: (1) modifying standard sketching techniques to obtain what we call a surrogate sketch; and (2) carefully scaling the global regularization parameter for local computations. Our surrogate sketches are based on determinantal point processes, a family of distributions for which the bias of an estimate of the inverse Hessian can be computed exactly. Based on this computation, we show that when the objective being minimized is $l_2$-regularized with parameter $\lambda$ and individual machines are each given a sketch of size $m$, then to eliminate the bias, local estimates should be computed using a shrunk regularization parameter given by $\lambda^{\prime}=\lambda\cdot(1-\frac{d_{\lambda}}{m})$, where $d_{\lambda}$ is the $\lambda$-effective dimension of the Hessian (or, for quadratic problems, the data matrix).

Translated: 2022-11-14 14:02:25 Published: 2020-07-02
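
The scaled regularization rule in the abstract above can be tried out numerically. The following is only a minimal sketch: it assumes the common definition of the $\lambda$-effective dimension, $d_{\lambda}=\sum_i s_i/(s_i+\lambda)$ over the Hessian eigenvalues $s_i$, which the abstract does not spell out. The function names, the example spectrum, and the guard on the sketch size are illustrative choices, not taken from the paper.

```python
def effective_dimension(eigenvalues, lam):
    """Assumed definition: d_lam = sum_i s_i / (s_i + lam) over Hessian eigenvalues s_i."""
    return sum(s / (s + lam) for s in eigenvalues)


def scaled_regularizer(eigenvalues, lam, sketch_size):
    """Shrunk local parameter lam' = lam * (1 - d_lam / m), as stated in the abstract."""
    d_lam = effective_dimension(eigenvalues, lam)
    if sketch_size <= d_lam:
        # Illustrative guard: the formula would give a non-positive parameter here.
        raise ValueError("sketch size m must exceed the effective dimension")
    return lam * (1.0 - d_lam / sketch_size)


# Hypothetical spectrum of a 4-dimensional Hessian, chosen only for illustration.
hessian_eigenvalues = [9.0, 3.0, 1.0, 0.0]
lam = 1.0
d_lam = effective_dimension(hessian_eigenvalues, lam)   # 0.9 + 0.75 + 0.5 + 0.0
lam_local = scaled_regularizer(hessian_eigenvalues, lam, sketch_size=10)
```

With this toy spectrum the effective dimension is 2.15, so a sketch of size 10 shrinks the local regularizer from 1.0 to 0.785. Larger sketches leave the parameter closer to the global value.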

# The Global Landscape of Neural Networks: An Overview

The Global Landscape of Neural Networks: An Overview ( http://arxiv.org/abs/2007.01429v1 )

License: check the linked page

Ruoyu Sun, Dawei Li, Shiyu Liang, Tian Ding, R Srikant

One of the major concerns for neural network training is that the non-convexity of the associated loss functions may cause a bad landscape. The recent success of neural networks suggests that their loss landscape is not too bad, but what specific results do we know about the landscape? In this article, we review recent findings and results on the global landscape of neural networks. First, we point out that wide neural nets may have sub-optimal local minima under certain assumptions. Second, we discuss a few rigorous results on the geometric properties of wide networks such as "no bad basin", and some modifications that eliminate sub-optimal local minima and/or decreasing paths to infinity. Third, we discuss visualization and empirical explorations of the landscape for practical neural nets. Finally, we briefly discuss some convergence results and their relation to landscape results.

Translated: 2022-11-14 14:00:58 Published: 2020-07-02

# Addressing the interpretability problem for deep learning using many valued quantum logic

Addressing the interpretability problem for deep learning using many valued quantum logic ( http://arxiv.org/abs/2007.01819v1 )

License: check the linked page

Swapnil Nitin Shah

Deep learning models are widely used for various industrial and scientific applications. Even though these models have achieved considerable success in recent years, the machine learning community still lacks an understanding of the rationale behind decisions made by such systems. This problem of interpretability is further aggravated by the increasing complexity of such models. This paper utilizes concepts from machine learning, quantum computation and quantum field theory to demonstrate how a many valued quantum logic system naturally arises in a specific class of generative deep learning models called Convolutional Deep Belief Networks. It provides a robust theoretical framework for constructing deep learning models equipped with the interpretability of many valued quantum logic systems without compromising their computing efficiency.

Translated: 2022-11-14 14:00:10 Published: 2020-07-02

# Predictive Analytics for Water Asset Management: Machine Learning and Survival Analysis

Predictive Analytics for Water Asset Management: Machine Learning and Survival Analysis ( http://arxiv.org/abs/2007.03744v1 )

License: check the linked page

Maryam Rahbaralam, David Modesto, Jaume Cardús, Amir Abdollahi, and Fernando M Cucchietti

Understanding performance and prioritizing resources for the maintenance of the drinking-water pipe network throughout its life-cycle is a key part of water asset management. Renovation of this vital network is generally hindered by the difficulty or impossibility of gaining physical access to the pipes. We study a statistical and machine learning framework for the prediction of water pipe failures. We employ classical and modern classifiers for short-term prediction, and survival analysis to provide a broader perspective and a long-term forecast, which is usually needed for the economic analysis of the renovation. To enrich these models, we introduce new predictors based on water distribution domain knowledge and employ a modern oversampling technique to remedy the high imbalance coming from the few failures observed each year. For our case study, we use a dataset containing the failure records of all pipes within the water distribution network in Barcelona, Spain. The results shed light on the effect of important risk factors, such as pipe geometry, age, material, and soil cover, among others, and can help utility managers conduct more informed predictive maintenance tasks.

Translated: 2022-11-14 13:59:56 Published: 2020-07-02

# Mining and Tailings Dam Detection In Satellite Imagery Using Deep Learning

Mining and Tailings Dam Detection In Satellite Imagery Using Deep Learning ( http://arxiv.org/abs/2007.01076v1 )

License: check the linked page

Remis Balaniuk and Olga Isupova and Steven Reece

This work explores the combination of free cloud computing, free open-source software, and deep learning methods to analyse a real, large-scale problem: the automatic country-wide identification and classification of surface mines and mining tailings dams in Brazil. Locations of officially registered mines and dams were obtained from the Brazilian government open data resource. Multispectral Sentinel-2 satellite imagery, obtained and processed on the Google Earth Engine platform, was used to train and test deep neural networks using the TensorFlow 2 API and the Google Colab platform. Fully Convolutional Neural Networks were used in an innovative way to search for unregistered ore mines and tailings dams in large areas of the Brazilian territory. The efficacy of the approach is demonstrated by the discovery of 263 mines that do not have an official mining concession. This exploratory work highlights the potential of a set of new, freely available technologies for the construction of low-cost data science tools that have high social impact. At the same time, it discusses and seeks to suggest practical solutions for the complex and serious problem of illegal mining and the proliferation of tailings dams, which pose high risks to the population and the environment, especially in developing countries. Code is made publicly available at: https://github.com/remis/mining-discovery-with-deep-learning.

Translated: 2022-11-14 13:59:38 Published: 2020-07-02

# A Semi-Supervised Generative Adversarial Network for Prediction of Genetic Disease Outcomes

A Semi-Supervised Generative Adversarial Network for Prediction of Genetic Disease Outcomes ( http://arxiv.org/abs/2007.01200v1 )

License: check the linked page

Caio Davi and Ulisses Braga-Neto

For most diseases, building large databases of labeled genetic data is an expensive and time-demanding task. To address this, we introduce genetic Generative Adversarial Networks (gGAN), a semi-supervised approach based on an innovative GAN architecture to create large synthetic genetic data sets starting with a small amount of labeled data and a large amount of unlabeled data. Our goal is to determine the propensity of a new individual to develop the severe form of the illness from their genetic profile alone. The proposed model achieved satisfactory results using real genetic data from different datasets and populations, in which the test populations may not have the same genetic profiles. The proposed model is self-aware and capable of determining whether a new genetic profile has enough compatibility with the data on which the network was trained and is thus suitable for prediction. The code and datasets used can be found at https://github.com/caio-davi/gGAN.

Translated: 2022-11-14 13:51:59 Published: 2020-07-02

# Laplacian Change Point Detection for Dynamic Graphs

Laplacian Change Point Detection for Dynamic Graphs ( http://arxiv.org/abs/2007.01229v1 )

License: check the linked page

Shenyang Huang, Yasmeen Hitti, Guillaume Rabusseau, Reihaneh Rabbany

Dynamic and temporal graphs are rich data structures that are used to model complex relationships between entities over time. In particular, anomaly detection in temporal graphs is crucial for many real-world applications such as intrusion identification in network systems, detection of ecosystem disturbances and detection of epidemic outbreaks. In this paper, we focus on change point detection in dynamic graphs and address two main challenges associated with this problem: I) how to compare graph snapshots across time, II) how to capture temporal dependencies. To solve the above challenges, we propose Laplacian Anomaly Detection (LAD), which uses the spectrum of the Laplacian matrix of the graph structure at each snapshot to obtain low-dimensional embeddings. LAD explicitly models short-term and long-term dependencies by applying two sliding windows. In synthetic experiments, LAD outperforms the state-of-the-art method. We also evaluate our method on three real dynamic networks: the UCI message network, the US Senate co-sponsorship network and the Canadian bill voting network. In all three datasets, we demonstrate that our method can more effectively identify anomalous time points according to significant real-world events.

Translated: 2022-11-14 13:51:41 Published: 2020-07-02

# A Perspective on Gaussian Processes for Earth Observation

A Perspective on Gaussian Processes for Earth Observation ( http://arxiv.org/abs/2007.01238v1 )

License: check the linked page

Gustau Camps-Valls and Dino Sejdinovic and Jakob Runge and Markus Reichstein

Earth observation (EO) by airborne and satellite remote sensing and in-situ observations plays a fundamental role in monitoring our planet. In the last decade, machine learning, and Gaussian processes (GPs) in particular, have attained outstanding results in the estimation of bio-geo-physical variables from the acquired images at local and global scales in a time-resolved manner. GPs provide not only accurate estimates but also principled uncertainty estimates for the predictions, can easily accommodate multimodal data coming from different sensors and from multitemporal acquisitions, and allow the introduction of physical knowledge and a formal treatment of uncertainty quantification and error propagation. Despite great advances in forward and inverse modelling, GP models still have to face important challenges that are reviewed in this perspective paper. GP models should evolve towards data-driven physics-aware models that respect signal characteristics, are consistent with elementary laws of physics, and move from pure regression to observational causal inference.

Translated: 2022-11-14 13:51:12 Published: 2020-07-02

# Progressive Tandem Learning for Pattern Recognition with Deep Spiking Neural Networks

Progressive Tandem Learning for Pattern Recognition with Deep Spiking Neural Networks ( http://arxiv.org/abs/2007.01204v1 )

License: check the linked page

Jibin Wu, Chenglin Xu, Daquan Zhou, Haizhou Li, Kay Chen Tan

Spiking neural networks (SNNs) have shown clear advantages over traditional artificial neural networks (ANNs) for low latency and high computational efficiency, due to their event-driven nature and sparse communication. However, the training of deep SNNs is not straightforward. In this paper, we propose a novel ANN-to-SNN conversion and layer-wise learning framework for rapid and efficient pattern recognition, which is referred to as progressive tandem learning of deep SNNs. By studying the equivalence between ANNs and SNNs in the discrete representation space, a primitive network conversion method is introduced that takes full advantage of spike count to approximate the activation value of analog neurons. To compensate for the approximation errors arising from the primitive network conversion, we further introduce a layer-wise learning method with an adaptive training scheduler to fine-tune the network weights. The progressive tandem learning framework also allows hardware constraints, such as limited weight precision and fan-in connections, to be progressively imposed during training. The SNNs thus trained have demonstrated remarkable classification and regression capabilities on large-scale object recognition, image reconstruction, and speech separation tasks, while requiring at least an order of magnitude less inference time and fewer synaptic operations than other state-of-the-art SNN implementations. It therefore opens up a myriad of opportunities for pervasive mobile and embedded devices with a limited power budget.

Translated: 2022-11-14 13:44:50 Published: 2020-07-02
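
To make the idea of a spike count approximating an analog activation more concrete, here is a minimal sketch of generic rate coding with an integrate-and-fire neuron. It is not the conversion method of the paper: the neuron model (constant input, reset by subtraction), the threshold, the number of time steps, and all names are assumptions made for illustration.

```python
def spike_count(input_current, threshold=1.0, steps=8):
    """Count spikes of an integrate-and-fire neuron driven by a constant input."""
    membrane, spikes = 0.0, 0
    for _ in range(steps):
        membrane += input_current
        if membrane >= threshold:
            spikes += 1
            membrane -= threshold   # reset by subtraction keeps the residual charge
    return spikes


def approx_activation(input_current, threshold=1.0, steps=8):
    """Rescale the spike count back to the analog activation range."""
    return spike_count(input_current, threshold, steps) * threshold / steps


# Negative inputs never reach the threshold, so the neuron behaves like a ReLU
# that is quantized to multiples of threshold / steps and saturates at threshold.
activations = [approx_activation(a) for a in (-0.5, 0.25, 0.75)]
```

For these inputs the rescaled counts reproduce a ReLU exactly. Inputs that are not multiples of `threshold / steps` are rounded down, which is one source of the approximation error that a fine-tuning stage would need to compensate.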

# Human-centered collaborative robots with deep reinforcement learning

Human-centered collaborative robots with deep reinforcement learning ( http://arxiv.org/abs/2007.01009v1 )

License: check the linked page

Ali Ghadirzadeh, Xi Chen, Wenjie Yin, Zhengrong Yi, Mårten Björkman and Danica Kragic

We present a reinforcement-learning-based framework for human-centered collaborative systems. The framework is proactive and balances the benefits of timely actions with the risk of taking improper actions by minimizing the total time spent to complete the task. The framework is learned end-to-end in an unsupervised fashion, addressing the perception uncertainties and decision making in an integrated manner. On an example packaging task, the framework is shown to provide more fluent coordination between human and robot partners than alternatives in which the perception and decision-making systems are learned independently using supervised learning. The foremost benefit of the proposed approach is that it allows for fast adaptation to new human partners and tasks, since tedious annotation of motion data is avoided and the learning is performed online.

Translated: 2022-11-14 13:44:26 Published: 2020-07-02

# Verifiably Safe Exploration for End-to-End Reinforcement Learning

Verifiably Safe Exploration for End-to-End Reinforcement Learning ( http://arxiv.org/abs/2007.01223v1 )

License: check the linked page

Nathan Hunt, Nathan Fulton, Sara Magliacane, Nghia Hoang, Subhro Das, Armando Solar-Lezama

Deploying deep reinforcement learning in safety-critical settings requires developing algorithms that obey hard constraints during exploration. This paper contributes a first approach toward enforcing formal safety constraints on end-to-end policies with visual inputs. Our approach draws on recent advances in object detection and automated reasoning for hybrid dynamical systems. The approach is evaluated on a novel benchmark that emphasizes the challenge of safely exploring in the presence of hard constraints. Our benchmark draws from several proposed problem sets for safe learning and includes problems that emphasize challenges such as reward signals that are not aligned with safety constraints. On each of these benchmark problems, our algorithm completely avoids unsafe behavior while remaining competitive at optimizing for as much reward as is safe. We also prove that our method of enforcing the safety constraints preserves all safe policies from the original environment.

Translated: 2022-11-14 13:44:13 Published: 2020-07-02

# $\epsilon$-BMC: A Bayesian Ensemble Approach to Epsilon-Greedy Exploration in Model-Free Reinforcement Learning

$\epsilon$-BMC: A Bayesian Ensemble Approach to Epsilon-Greedy Exploration in Model-Free Reinforcement Learning ( http://arxiv.org/abs/2007.00869v1 )

License: check the linked page

Michael Gimelfarb, Scott Sanner, Chi-Guhn Lee

Resolving the exploration-exploitation trade-off remains a fundamental problem in the design and implementation of reinforcement learning (RL) algorithms. In this paper, we focus on model-free RL using the epsilon-greedy exploration policy, which, despite its simplicity, remains one of the most frequently used forms of exploration. However, a key limitation of this policy is the specification of $\varepsilon$. In this paper, we provide a novel Bayesian perspective of $\varepsilon$ as a measure of the uniformity of the Q-value function. Based on this new perspective, we introduce a closed-form Bayesian model update based on Bayesian model combination (BMC), which allows us to adapt $\varepsilon$ using experiences from the environment in constant time with monotone convergence guarantees. We demonstrate that our proposed algorithm, $\varepsilon$-BMC, efficiently balances exploration and exploitation on different problems, performing comparably to or outperforming the best tuned fixed annealing schedules and an alternative data-dependent $\varepsilon$ adaptation scheme proposed in the literature.

Translated: 2022-11-14 13:43:21 Published: 2020-07-02
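
For readers unfamiliar with the baseline policy discussed in the abstract above, the following minimal sketch shows plain epsilon-greedy action selection with a fixed $\varepsilon$. The Bayesian adaptation of $\varepsilon$ proposed in the paper is not reproduced here; the function name, the Q-values, and the seed are illustrative assumptions.

```python
import random


def epsilon_greedy(q_values, epsilon, rng):
    """With probability epsilon pick a uniformly random action, else the greedy one."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))
    return max(range(len(q_values)), key=lambda a: q_values[a])


rng = random.Random(0)              # fixed seed so the run is repeatable
q_values = [0.1, 0.5, 0.2]          # hypothetical Q-values for three actions

# epsilon = 0 never explores, so the greedy action (index 1) is always chosen.
greedy_action = epsilon_greedy(q_values, epsilon=0.0, rng=rng)

# epsilon = 0.3 explores on roughly 30% of the draws.
counts = [0, 0, 0]
for _ in range(10_000):
    counts[epsilon_greedy(q_values, epsilon=0.3, rng=rng)] += 1
```

With $\varepsilon = 0.3$ and three actions, the greedy action is expected on about 80% of the draws (70% exploitation plus a third of the exploration draws). This shows why the choice of $\varepsilon$ matters: it directly sets how often the current value estimates are ignored.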

# On the Outsized Importance of Learning Rates in Local Update Methods

On the Outsized Importance of Learning Rates in Local Update Methods ( http://arxiv.org/abs/2007.00878v1 )

License: check the linked page

Zachary Charles, Jakub Konečný

We study a family of algorithms, which we refer to as local update methods, that generalize many federated learning and meta-learning algorithms. We prove that for quadratic objectives, local update methods perform stochastic gradient descent on a surrogate loss function which we exactly characterize. We show that the choice of client learning rate controls the condition number of that surrogate loss, as well as the distance between the minimizers of the surrogate and true loss functions. We use this theory to derive novel convergence rates for federated averaging that showcase this trade-off between the condition number of the surrogate loss and its alignment with the true loss function. We validate our results empirically, showing that in communication-limited settings, proper learning rate tuning is often sufficient to reach near-optimal behavior. We also present a practical method for automatic learning rate decay in local update methods that helps reduce the need for learning rate tuning, and highlight its empirical performance on a variety of tasks and datasets.

Translated: 2022-11-14 13:42:57 Published: 2020-07-02
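
The effect of the client learning rate on where a local update method converges can be seen on a toy problem. The sketch below is an illustration only, not the paper's analysis: it assumes two clients with one-dimensional quadratic losses $f_i(x)=\tfrac{1}{2}a_i(x-c_i)^2$, full-gradient local steps, and a server that simply averages the client results. All names and numbers are hypothetical.

```python
def local_update_round(x, curvatures, centers, client_lr, local_steps):
    """One round: each client runs gradient descent locally, the server averages."""
    results = []
    for a, c in zip(curvatures, centers):
        x_local = x
        for _ in range(local_steps):
            x_local -= client_lr * a * (x_local - c)   # gradient of 0.5*a*(x-c)^2
        results.append(x_local)
    return sum(results) / len(results)


def run(curvatures, centers, client_lr, local_steps, rounds=2000):
    """Iterate rounds from x = 0 until the averaged model has settled."""
    x = 0.0
    for _ in range(rounds):
        x = local_update_round(x, curvatures, centers, client_lr, local_steps)
    return x


curvatures, centers = [1.0, 4.0], [0.0, 1.0]      # hypothetical two-client problem
# Minimizer of the average loss: curvature-weighted mean of the centers (0.8 here).
true_minimizer = sum(a * c for a, c in zip(curvatures, centers)) / sum(curvatures)
small_lr = run(curvatures, centers, client_lr=0.001, local_steps=10)
large_lr = run(curvatures, centers, client_lr=0.1, local_steps=10)
```

In this toy setting the small client learning rate settles close to the true minimizer 0.8, while the larger one settles near 0.60. This is consistent with the abstract's claim that the client learning rate controls the distance between the minimizers of the surrogate and true losses.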

# Deep Learning Defenses Against Adversarial Examples for Dynamic Risk Assessment

Deep Learning Defenses Against Adversarial Examples for Dynamic Risk Assessment ( http://arxiv.org/abs/2007.01017v1 )

License: check the linked page

Xabier Echeberria-Barrio, Amaia Gil-Lerchundi, Ines Goicoechea-Telleria and Raul Orduna-Urrutia

Deep Neural Networks were first developed decades ago, but it was not until recently that they started being extensively used, due to their computing power requirements. Since then, they have been increasingly applied to many fields and have undergone far-reaching advancements. More importantly, they have been utilized for critical matters, such as making decisions in healthcare procedures or autonomous driving, where risk management is crucial. Any mistakes in the diagnostics or decision-making in these fields could entail grave accidents, and even death. This is worrying, because it has been repeatedly reported that it is straightforward to attack this type of model. Thus, these attacks must be studied to be able to assess their risk, and defenses need to be developed to make models more robust. For this work, the most widely known attack was selected (adversarial attack) and several defenses were implemented against it (namely adversarial training, dimensionality reduction and prediction similarity). The obtained outcomes make the model more robust while keeping a similar accuracy. The idea was developed using a breast cancer dataset and a VGG16 and dense neural network model, but the solutions could be applied to datasets from other areas and different convolutional and dense deep neural network models.

Translated: 2022-11-14 13:41:53 Published: 2020-07-02

# In Search of Lost Domain Generalization

In Search of Lost Domain Generalization ( http://arxiv.org/abs/2007.01434v1 )

License: check the linked page

Ishaan Gulrajani, David Lopez-Paz

The goal of domain generalization algorithms is to predict well on distributions different from those seen during training. While a myriad of domain generalization algorithms exist, inconsistencies in experimental conditions (datasets, architectures, and model selection criteria) render fair and realistic comparisons difficult. In this paper, we are interested in understanding how useful domain generalization algorithms are in realistic settings. As a first step, we realize that model selection is non-trivial for domain generalization tasks. Contrary to prior work, we argue that domain generalization algorithms without a model selection strategy should be regarded as incomplete. Next, we implement DomainBed, a testbed for domain generalization including seven multi-domain datasets, nine baseline algorithms, and three model selection criteria. We conduct extensive experiments using DomainBed and find that, when carefully implemented, empirical risk minimization shows state-of-the-art performance across all datasets. Looking forward, we hope that the release of DomainBed, along with contributions from fellow researchers, will streamline reproducible and rigorous research in domain generalization.

Translated: 2022-11-14 13:35:02 Published: 2020-07-02

# 低光環境ニューラルサーベイランス

Low-light Environment Neural Surveillance ( http://arxiv.org/abs/2007.00843v1 )

ライセンス: Link先を確認

Michael Potter (1), Henry Gridley (1), Noah Lichtenstein (1), Kevin Hines (1), John Nguyen (1), Jacob Walsh (1) ((1) Northeastern University)

We design and implement an end-to-end system for real-time crime detection in low-light environments. Unlike Closed-Circuit Television, which performs reactively, the Low-Light Environment Neural Surveillance provides real-time crime alerts. The system uses a low-light video feed processed in real time by an optical-flow network, spatial and temporal networks, and a Support Vector Machine to identify shootings, assaults, and thefts. We create a low-light action-recognition dataset, LENS-4, which will be publicly available. An IoT infrastructure set up via Amazon Web Services interprets messages from the local board hosting the camera for action recognition and parses the results in the cloud to relay messages. The system achieves 71.5% accuracy at 20 FPS. The user interface is a mobile app that allows local authorities to receive notifications and to view a video of the crime scene. Citizens have a public app that enables law enforcement to push crime alerts based on user proximity.

Translated: 2022-11-14 13:34:43 Published: 2020-07-02

# A Closer Look at Local Aggregation Operators in Point Cloud Analysis ( http://arxiv.org/abs/2007.01294v1 )

License: see the linked page

Ze Liu and Han Hu and Yue Cao and Zheng Zhang and Xin Tong

Recent advances in network architectures for point cloud processing are mainly driven by new designs of local aggregation operators. However, the impact of these operators on network performance has not been carefully investigated, because the overall network architecture and implementation details differ in each solution. Meanwhile, most operators are only applied in shallow architectures. In this paper, we revisit the representative local aggregation operators and study their performance using the same deep residual architecture. Our investigation reveals that despite their different designs, all of these operators make surprisingly similar contributions to the network performance under the same network input and feature numbers, and yield state-of-the-art accuracy on standard benchmarks. This finding stimulates us to rethink the necessity of sophisticated designs of local aggregation operators for point cloud processing. To this end, we propose a simple local aggregation operator without learnable weights, named Position Pooling (PosPool), which performs similarly to or slightly better than existing sophisticated operators. In particular, a simple deep residual network with PosPool layers achieves outstanding performance on all benchmarks and outperforms the previous state-of-the-art methods on the challenging PartNet dataset by a large margin (7.4 mIoU). The code is publicly available at https://github.com/zeliu98/CloserLook3D
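The abstract only says that PosPool has no learnable weights, so the following is a minimal sketch of one weight-free aggregation in that spirit, not the authors' implementation. It assumes (my assumption, not stated above) that the feature channels are split into three equal groups, each scaled by one coordinate of the neighbor's relative position, and that the result is averaged over neighbors. The function name `pos_pool` and its arguments are hypothetical.

```python
def pos_pool(center, neighbor_xyz, neighbor_feats):
    """Weight-free local aggregation sketch.

    center:         (x, y, z) of the point being updated
    neighbor_xyz:   list of (x, y, z) neighbor positions (at least one)
    neighbor_feats: list of per-neighbor feature vectors; the channel
                    count must be divisible by 3 (assumed layout)
    """
    channels = len(neighbor_feats[0])
    if channels % 3 != 0:
        raise ValueError("channel count must be divisible by 3")
    group = channels // 3
    pooled = [0.0] * channels
    for xyz, feat in zip(neighbor_xyz, neighbor_feats):
        # Relative position of the neighbor with respect to the center.
        offset = [p - c for p, c in zip(xyz, center)]
        for ch in range(channels):
            # Channel group 0 is scaled by dx, group 1 by dy, group 2 by dz.
            pooled[ch] += offset[ch // group] * feat[ch]
    n = len(neighbor_feats)
    return [v / n for v in pooled]
```

Nothing in this sketch is trained: the only inputs are positions and features, which is the property the abstract emphasizes.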

Translated: 2022-11-14 13:33:50 Published: 2020-07-02

# Grading video interviews with fairness considerations ( http://arxiv.org/abs/2007.05461v1 )

License: see the linked page

Abhishek Singhania, Abhishek Unnam and Varun Aggarwal

There has been considerable interest in predicting human emotions and traits using facial images and videos. Lately, such work has come under criticism for poor labeling practices, inconclusive prediction results and fairness considerations. We present a careful methodology to automatically derive social skills of candidates based on their video response to interview questions. We, for the first time, include video data from multiple countries encompassing multiple ethnicities. Also, the videos were rated by individuals from multiple racial backgrounds, following several best practices, to achieve a consensus and unbiased measure of social skills. We develop two machine-learning models to predict social skills. The first model employs expert guidance to use plausibly causal features. The second uses deep learning and depends solely on the empirical correlations present in the data. We compare errors of both these models, study the specificity of the models and make recommendations. We further analyze fairness by studying the errors of models by race and gender. We verify the usefulness of our models by determining how well they predict interview outcomes for candidates. Overall, the study provides strong support for using artificial intelligence for video interview scoring, while taking care of fairness and ethical considerations.

Translated: 2022-11-14 13:33:27 Published: 2020-07-02

# The Impact of Explanations on AI Competency Prediction in VQA ( http://arxiv.org/abs/2007.00900v1 )

License: see the linked page

Kamran Alipour, Arijit Ray, Xiao Lin, Jurgen P. Schulze, Yi Yao, Giedrius T. Burachas

Explainability is one of the key elements for building trust in AI systems. Among numerous attempts to make AI explainable, quantifying the effect of explanations remains a challenge in conducting human-AI collaborative tasks. Aside from the ability to predict the overall behavior of AI, in many applications, users need to understand an AI agent's competency in different aspects of the task domain. In this paper, we evaluate the impact of explanations on the user's mental model of AI agent competency within the task of visual question answering (VQA). We quantify users' understanding of competency, based on the correlation between the actual system performance and user rankings. We introduce an explainable VQA system that uses spatial and object features and is powered by the BERT language model. Each group of users sees only one kind of explanation to rank the competencies of the VQA model. The proposed model is evaluated through between-subject experiments to probe explanations' impact on the user's perception of competency. The comparison between two VQA models shows that BERT-based explanations and the use of object features improve the user's prediction of the model's competencies.
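The correlation between actual system performance and user rankings can be illustrated with a rank correlation. The abstract does not say which coefficient is used, so Spearman's rho is an assumption here, and the function names are hypothetical. This is a sketch of the general idea only.

```python
from math import sqrt


def ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation of two equal-length, non-constant sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    std_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    std_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (std_x * std_y)


def spearman(actual_performance, user_scores):
    """Spearman's rho: Pearson correlation computed on the ranks."""
    return pearson(ranks(actual_performance), ranks(user_scores))
```

A value near 1 would mean that users order the model's competencies the same way its measured accuracy does; a value near -1 would mean the opposite ordering.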

Translated: 2022-11-14 13:33:09 Published: 2020-07-02

# High Dimensional Bayesian Optimization Assisted by Principal Component Analysis ( http://arxiv.org/abs/2007.00925v1 )

License: see the linked page

Elena Raponi, Hao Wang, Mariusz Bujny, Simonetta Boria and Carola Doerr

Bayesian Optimization (BO) is a surrogate-assisted global optimization technique that has been successfully applied in various fields, e.g., automated machine learning and design optimization. Built upon a so-called infill-criterion and Gaussian Process regression (GPR), the BO technique suffers from a substantial computational complexity and hampered convergence rate as the dimension of the search space increases. Scaling up BO for high-dimensional optimization problems remains a challenging task. In this paper, we propose to tackle the scalability of BO by hybridizing it with a Principal Component Analysis (PCA), resulting in a novel PCA-assisted BO (PCA-BO) algorithm. Specifically, the PCA procedure learns a linear transformation from all the evaluated points during the run and selects dimensions in the transformed space according to the variability of evaluated points. We then construct the GPR model and the infill-criterion in the space spanned by the selected dimensions. We assess the performance of our PCA-BO in terms of the empirical convergence rate and CPU time on multi-modal problems from the COCO benchmark framework. The experimental results show that PCA-BO can effectively reduce the CPU time incurred on high-dimensional problems, and maintains the convergence rate on problems with an adequate global structure. PCA-BO therefore provides a satisfactory trade-off between the convergence rate and computational efficiency, opening new ways to benefit from the strength of BO approaches in high-dimensional numerical optimization.

Translated: 2022-11-14 13:32:54 Published: 2020-07-02

# Accurate Characterization of Non-Uniformly Sampled Time Series using Stochastic Differential Equations ( http://arxiv.org/abs/2007.01073v1 )

License: see the linked page

Stijn de Waele

Non-uniform sampling arises when an experimenter does not have full control over the sampling characteristics of the process under investigation. Moreover, it is introduced intentionally in algorithms such as Bayesian optimization and compressive sensing. We argue that Stochastic Differential Equations (SDEs) are especially well-suited for characterizing second-order moments of such time series. We introduce new initial estimates for the numerical optimization of the likelihood, based on incremental estimation and initialization from autoregressive models. Furthermore, we introduce model truncation as a purely data-driven method to reduce the order of the estimated model based on the SDE likelihood. We show the increased accuracy achieved with the new estimator in simulation experiments, covering all challenging circumstances that may be encountered in characterizing a non-uniformly sampled time series. Finally, we apply the new estimator to experimental rainfall variability data.
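To illustrate why an SDE likelihood copes naturally with irregular sampling, here is a minimal sketch for the simplest case, a first-order (Ornstein-Uhlenbeck) process dx = -theta x dt + sigma dW. The paper's models may be of higher order; the choice of a first-order model, the function name, and the parameter names are assumptions made for illustration. The exact transition density depends only on the gap between consecutive samples, so uneven gaps need no resampling.

```python
import math


def ou_log_likelihood(times, values, theta, sigma):
    """Exact log-likelihood of an OU process observed at irregular times.

    Conditions on the first observation. For a gap dt the transition is
    Gaussian with mean x_prev * exp(-theta * dt) and variance
    sigma**2 / (2 * theta) * (1 - exp(-2 * theta * dt)).
    """
    if theta <= 0 or sigma <= 0:
        raise ValueError("theta and sigma must be positive")
    total = 0.0
    for k in range(1, len(times)):
        dt = times[k] - times[k - 1]
        if dt <= 0:
            raise ValueError("times must be strictly increasing")
        decay = math.exp(-theta * dt)
        mean = values[k - 1] * decay
        var = sigma ** 2 / (2 * theta) * (1 - decay ** 2)
        resid = values[k] - mean
        total += -0.5 * (math.log(2 * math.pi * var) + resid ** 2 / var)
    return total
```

A numerical optimizer could maximize this function over `theta` and `sigma`; the quality of the starting point for that optimization is the issue the abstract's initial estimates address.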

Translated: 2022-11-14 13:26:34 Published: 2020-07-02

# Learning Individualized Treatment Rules with Estimated Translated Inverse Propensity Score ( http://arxiv.org/abs/2007.01083v1 )

License: see the linked page

Zhiliang Wu, Yinchong Yang, Yunpu Ma, Yushan Liu, Rui Zhao, Michael Moor, Volker Tresp

Randomized controlled trials typically analyze the effectiveness of treatments with the goal of making treatment recommendations for patient subgroups. With the advance of electronic health records, a great variety of data has been collected in clinical practice, enabling the evaluation of treatments and treatment policies based on observational data. In this paper, we focus on learning individualized treatment rules (ITRs) to derive a treatment policy that is expected to generate a better outcome for an individual patient. In our framework, we cast ITR learning as a contextual bandit problem and minimize the expected risk of the treatment policy. We conduct experiments with the proposed framework both in a simulation study and based on a real-world dataset. In the latter case, we apply our proposed method to learn the optimal ITRs for the administration of intravenous (IV) fluids and vasopressors (VP). Based on various offline evaluation methods, we show that the policy derived in our framework demonstrates better performance compared to both the physicians and other baselines, including a simple treatment prediction approach. As a long-term goal, our derived policy might eventually lead to better clinical guidelines for the administration of IV and VP.
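Offline evaluation of a policy on logged contextual-bandit data is commonly done with an inverse propensity score (IPS) estimate. The sketch below shows the plain IPS estimator only; it is not the "estimated translated" variant named in the title, whose details are not given in the abstract. The record layout and names are hypothetical.

```python
def ips_value(logged, policy):
    """Plain inverse-propensity-score estimate of a policy's mean reward.

    logged: list of (context, action, reward, propensity) records, where
            propensity is the probability with which the logging policy
            chose `action` in `context` (assumed known or estimated).
    policy: function mapping a context to the action it would take.
    """
    if not logged:
        raise ValueError("no logged records")
    total = 0.0
    for context, action, reward, propensity in logged:
        if not 0.0 < propensity <= 1.0:
            raise ValueError("propensity must be in (0, 1]")
        # Only records where the evaluated policy agrees with the logged
        # action contribute, reweighted by how rarely that action was logged.
        if policy(context) == action:
            total += reward / propensity
    return total / len(logged)
```

Small propensities produce large weights, which is one reason why the propensities often have to be estimated carefully in observational data.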

Translated: 2022-11-14 13:26:22 Published: 2020-07-02

# Exponentially Weighted l_2 Regularization Strategy in Constructing Reinforced Second-order Fuzzy Rule-based Model ( http://arxiv.org/abs/2007.01208v1 )

License: see the linked page

Congcong Zhang, Sung-Kwun Oh, Witold Pedrycz, Zunwei Fu and Shanzhen Lu

In the conventional Takagi-Sugeno-Kang (TSK)-type fuzzy models, constant or linear functions are usually utilized as the consequent parts of the fuzzy rules, but they cannot effectively describe the behavior within local regions defined by the antecedent parts. In this article, a theoretical and practical design methodology is developed to address this problem. First, the information granulation (Fuzzy C-Means) method is applied to capture the structure in the data and split the input space into subspaces, as well as form the antecedent parts. Second, the quadratic polynomials (QPs) are employed as the consequent parts. Compared with constant and linear functions, QPs can describe the input-output behavior within the local regions (subspaces) by refining the relationship between input and output variables. However, although QPs can improve the approximation ability of the model, they could lead to the deterioration of the prediction ability of the model (e.g., overfitting). To handle this issue, we introduce an exponential weight approach inspired by the weight function theory encountered in harmonic analysis. More specifically, we adopt the exponential functions as the targeted penalty terms, which are combined with l_2 regularization (i.e., exponentially weighted l_2, ewl_2) to match the proposed reinforced second-order fuzzy rule-based model (RSFRM) properly. The advantage of ewl_2 compared to ordinary l_2 lies in separately identifying and penalizing different types of polynomial terms in the coefficient estimation, and its results not only alleviate the overfitting and prevent the deterioration of generalization ability but also effectively release the prediction potential of the model.

Translated: 2022-11-14 13:25:19 Published: 2020-07-02

# Outlier Detection through Null Space Analysis of Neural Networks ( http://arxiv.org/abs/2007.01263v1 )

License: see the linked page

Matthew Cook, Alina Zare, Paul Gader

Many machine learning classification systems lack competency awareness. Specifically, many systems lack the ability to identify when outliers (e.g., samples that are distinct from and not represented in the training data distribution) are being presented to the system. The ability to detect outliers is of practical significance since it can help the system behave in a reasonable way when encountering unexpected data. In prior work, outlier detection is commonly carried out in a processing pipeline that is distinct from the classification model. Thus, for a complete system that incorporates outlier detection and classification, two models must be trained, increasing the overall complexity of the approach. In this paper we use the concept of the null space to integrate an outlier detection method directly into a neural network used for classification. Our method, called Null Space Analysis (NuSA) of neural networks, works by computing and controlling the magnitude of the null space projection as data is passed through a network. Using these projections, we can then calculate a score that can differentiate between normal and abnormal data. Results are shown that indicate networks trained with NuSA retain their classification performance while also being able to detect outliers at rates similar to commonly used outlier detection algorithms.
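The magnitude of a null space projection can be sketched for a single linear layer y = W x. The null space of W is the orthogonal complement of its row space, so the part of x that the layer cannot "see" is what remains after projecting x onto the rows of W. The score below (norm of that remainder divided by the norm of x) is an illustrative choice, not necessarily the score defined in the paper, and the function names are hypothetical.

```python
import math


def row_space_basis(rows, tol=1e-12):
    """Orthonormal basis of the row space, built with Gram-Schmidt."""
    basis = []
    for row in rows:
        v = [float(x) for x in row]
        for b in basis:
            dot = sum(x * y for x, y in zip(v, b))
            v = [x - dot * y for x, y in zip(v, b)]
        norm = math.sqrt(sum(x * x for x in v))
        if norm > tol:  # skip rows that are linearly dependent
            basis.append([x / norm for x in v])
    return basis


def null_space_ratio(weight_rows, x):
    """Fraction of the norm of x that lies in the null space of W."""
    residual = [float(v) for v in x]
    for b in row_space_basis(weight_rows):
        dot = sum(r * y for r, y in zip(residual, b))
        residual = [r - dot * y for r, y in zip(residual, b)]
    x_norm = math.sqrt(sum(v * v for v in x))
    if x_norm == 0.0:
        raise ValueError("x must be non-zero")
    return math.sqrt(sum(r * r for r in residual)) / x_norm
```

An input with a ratio near 1 is almost invisible to the layer, which is the kind of sample such a score would flag as unusual.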

Translated: 2022-11-14 13:24:49 Published: 2020-07-02

# Spectral Methods for Ranking with Scarce Data ( http://arxiv.org/abs/2007.01346v1 )

License: see the linked page

Umang Varma, Lalit Jain, Anna C. Gilbert

Given a number of pairwise preferences of items, a common task is to rank all the items. Examples include pairwise movie ratings, New Yorker cartoon caption contests, and many other consumer preference tasks. What these settings have in common is two-fold: a scarcity of data (it may be costly to get comparisons for all the pairs of items) and additional feature information about the items (e.g., movie genre, director, and cast). In this paper we modify RankCentrality, a popular and well-studied method for rank aggregation, to account for few comparisons and to incorporate additional feature information. This method returns meaningful rankings even under scarce comparisons. Using diffusion-based methods, we incorporate feature information and outperform state-of-the-art methods in practice. We also provide improved sample complexity for RankCentrality in a variety of sampling schemes.
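For orientation, here is a minimal sketch of the unmodified RankCentrality idea as it is usually described: build a random walk that moves from item i to item j in proportion to how often j beat i, and rank items by the walk's stationary distribution. It does not include the feature-based modification proposed in the paper. The function name, the fixed iteration count, and the assumption that the comparison graph is connected are mine.

```python
def rank_centrality(n_items, comparisons, iterations=1000):
    """Score items by the stationary distribution of a comparison walk.

    comparisons: list of (winner, loser) index pairs; the comparison
                 graph is assumed to be connected.
    Returns a list of scores that sum to 1 (higher means ranked higher).
    """
    wins = [[0] * n_items for _ in range(n_items)]  # wins[i][j]: i beat j
    for winner, loser in comparisons:
        wins[winner][loser] += 1
    degree = [
        sum(1 for j in range(n_items) if wins[i][j] + wins[j][i] > 0)
        for i in range(n_items)
    ]
    d_max = max(degree)
    P = [[0.0] * n_items for _ in range(n_items)]
    for i in range(n_items):
        for j in range(n_items):
            total = wins[i][j] + wins[j][i]
            if i != j and total > 0:
                # Move from i to j in proportion to how often j beat i.
                P[i][j] = wins[j][i] / total / d_max
        P[i][i] = 1.0 - sum(P[i])  # remaining mass stays at i
    pi = [1.0 / n_items] * n_items
    for _ in range(iterations):  # power iteration: pi <- pi P
        pi = [sum(pi[i] * P[i][j] for i in range(n_items)) for j in range(n_items)]
    return pi
```

With very few comparisons many entries of the transition matrix are zero or based on a single outcome, which is the scarce-data regime the abstract targets.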

Translated: 2022-11-14 13:23:49 Published: 2020-07-02

# BusTr: Predicting Bus Travel Times from Real-Time Traffic ( http://arxiv.org/abs/2007.00882v1 )

License: see the linked page

Richard Barnes and Senaka Buthpitiya and James Cook and Alex Fabrikant and Andrew Tomkins and Fangzhou Xu

We present BusTr, a machine-learned model for translating road traffic forecasts into predictions of bus delays, used by Google Maps to serve the majority of the world's public transit systems where no official real-time bus tracking is provided. We demonstrate that our neural sequence model improves over DeepTTE, the state-of-the-art baseline, both in performance (-30% MAPE) and training stability. We also demonstrate significant generalization gains over simpler models, evaluated on longitudinal data to cope with a constantly evolving world.
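The reported metric, mean absolute percentage error (MAPE), can be sketched as follows. The function name is hypothetical and the numbers in the usage note are made up for illustration; they are not results from the paper.

```python
def mape(actual, predicted):
    """Mean absolute percentage error, in percent.

    actual values must be non-zero, since each error is divided by them.
    """
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    if any(a == 0 for a in actual):
        raise ValueError("actual values must be non-zero")
    errors = [abs((a - p) / a) for a, p in zip(actual, predicted)]
    return 100.0 * sum(errors) / len(errors)
```

For example, predicting 110 and 180 seconds for trips that actually took 100 and 200 seconds gives a MAPE of 10%.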

Translated: 2022-11-14 13:16:59 Published: 2020-07-02

# BOSH: Bayesian Optimization by Sampling Hierarchically ( http://arxiv.org/abs/2007.00939v1 )

License: see the linked page

Henry B. Moss, David S. Leslie, Paul Rayson

Deployments of Bayesian Optimization (BO) for functions with stochastic evaluations, such as parameter tuning via cross validation and simulation optimization, typically optimize an average of a fixed set of noisy realizations of the objective function. However, disregarding the true objective function in this manner finds a high-precision optimum of the wrong function. To solve this problem, we propose Bayesian Optimization by Sampling Hierarchically (BOSH), a novel BO routine pairing a hierarchical Gaussian process with an information-theoretic framework to generate a growing pool of realizations as the optimization progresses. We demonstrate that BOSH provides more efficient and higher-precision optimization than standard BO across synthetic benchmarks, simulation optimization, reinforcement learning and hyper-parameter tuning tasks.

Translated: 2022-11-14 13:16:26 Published: 2020-07-02

# Gamification of Pure Exploration for Linear Bandits ( http://arxiv.org/abs/2007.00953v1 )

License: see the linked page

Rémy Degenne, Pierre Ménard, Xuedong Shang, Michal Valko

We investigate an active pure-exploration setting that includes best-arm identification, in the context of linear stochastic bandits. While asymptotically optimal algorithms exist for standard multi-arm bandits, the existence of such algorithms for best-arm identification in linear bandits has been elusive despite several attempts to address it. First, we provide a thorough comparison of, and new insight into, different notions of optimality in the linear case, including G-optimality, transductive optimality from optimal experimental design and asymptotic optimality. Second, we design the first asymptotically optimal algorithm for fixed-confidence pure exploration in linear bandits. As a consequence, our algorithm naturally bypasses the pitfall caused by a simple but difficult instance, which most prior algorithms had to be engineered to deal with explicitly. Finally, we avoid the need to fully solve an optimal design problem by providing an approach that entails an efficient implementation.

Translated: 2022-11-14 13:16:12 Published: 2020-07-02

# Structure Adaptive Algorithms for Stochastic Bandits ( http://arxiv.org/abs/2007.00969v1 )

License: see the linked page

Rémy Degenne, Han Shao, Wouter M. Koolen

We study reward maximisation in a wide class of structured stochastic multi-armed bandit problems, where the mean rewards of arms satisfy some given structural constraints, e.g. linear, unimodal, sparse, etc. Our aim is to develop methods that are flexible (in that they easily adapt to different structures), powerful (in that they perform well empirically and/or provably match instance-dependent lower bounds) and efficient (in that the per-round computational burden is small). We develop asymptotically optimal algorithms from instance-dependent lower bounds using iterative saddle-point solvers. Our approach generalises recent iterative methods for pure exploration to reward maximisation, where a major challenge arises from the estimation of the sub-optimality gaps and their reciprocals. Still, we manage to achieve all the above desiderata. Notably, our technique avoids the computational cost of the full-blown saddle point oracle employed by previous work, while at the same time enabling finite-time regret bounds. Our experiments reveal that our method successfully leverages the structural assumptions, while its regret is at worst comparable to that of vanilla UCB.
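The vanilla UCB baseline mentioned at the end can be sketched as the classic UCB1 rule, which ignores any structure among the arms. This is the baseline only, not the structure-adaptive algorithm of the paper. Bernoulli rewards, the function name, and the fixed seed are assumptions made so that the example is reproducible.

```python
import math
import random


def ucb1(means, horizon, seed=0):
    """Run UCB1 on Bernoulli arms and return how often each arm was pulled.

    means:   true success probability of each arm (unknown to the learner)
    horizon: total number of rounds
    """
    rng = random.Random(seed)
    k = len(means)
    counts = [0] * k
    sums = [0.0] * k
    for t in range(1, horizon + 1):
        if t <= k:
            arm = t - 1  # pull every arm once to initialise the estimates
        else:
            # Empirical mean plus an exploration bonus that shrinks as the
            # arm is pulled more often.
            arm = max(
                range(k),
                key=lambda a: sums[a] / counts[a]
                + math.sqrt(2.0 * math.log(t) / counts[a]),
            )
        reward = 1.0 if rng.random() < means[arm] else 0.0
        counts[arm] += 1
        sums[arm] += reward
    return counts
```

Because the bonus is computed per arm, this rule cannot exploit linear, unimodal, or sparse structure, which is where a structure-aware method is expected to gain.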

Translated: 2022-11-14 13:15:56 Published: 2020-07-02

# Explaining predictive models with mixed features using Shapley values and conditional inference trees ( http://arxiv.org/abs/2007.01027v1 )

License: see the linked page

Annabelle Redelmeier, Martin Jullum, and Kjersti Aas

It is becoming increasingly important to explain complex, black-box machine learning models. Although there is an expanding literature on this topic, Shapley values stand out as a sound method to explain predictions from any type of machine learning model. The original development of Shapley values for prediction explanation relied on the assumption that the features being described were independent. This methodology was then extended to explain dependent features with an underlying continuous distribution. In this paper, we propose a method to explain mixed (i.e. continuous, discrete, ordinal, and categorical) dependent features by modeling the dependence structure of the features using conditional inference trees. We demonstrate our proposed method against the current industry standards in various simulation studies and find that our method often outperforms the other approaches. Finally, we apply our method to a real financial data set used in the 2018 FICO Explainable Machine Learning Challenge and show how our explanations compare to those of the FICO challenge Recognition Award-winning team.
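As background, exact Shapley values for a small model can be computed by enumerating feature subsets. The sketch below uses the simplest value function, in which features outside the subset are replaced by a fixed baseline; this corresponds to the independence-style setting described above, not to the conditional-inference-tree method proposed in the paper. The function name and arguments are hypothetical.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, x, baseline):
    """Exact Shapley values of each feature for the prediction at x.

    predict:  function taking a list of feature values
    x:        the instance to explain
    baseline: reference values used for features that are "absent"
    Cost grows as 2**len(x), so this is only practical for few features.
    """
    n = len(x)

    def value(subset):
        z = [x[i] if i in subset else baseline[i] for i in range(n)]
        return predict(z)

    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            # Shapley weight for a coalition of this size.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                phi[i] += weight * (value(s | {i}) - value(s))
    return phi
```

The values sum to the difference between the prediction at `x` and the prediction at `baseline`, which is the efficiency property that makes Shapley values attractive for explanation.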

Translated: 2022-11-14 13:15:03 Published: 2020-07-02

# Beyond Signal Propagation: Is Feature Diversity Necessary in Deep Neural Network Initialization? ( http://arxiv.org/abs/2007.01038v1 )

License: see the linked page

Yaniv Blumenfeld, Dar Gilboa, Daniel Soudry

Deep neural networks are typically initialized with random weights, with variances chosen to facilitate signal propagation and stable gradients. It is also believed that diversity of features is an important property of these initializations. We construct a deep convolutional network with identical features by initializing almost all the weights to 0. The architecture also enables perfect signal propagation and stable gradients, and achieves high accuracy on standard benchmarks. This indicates that random, diverse initializations are not necessary for training neural networks. An essential element in training this network is a mechanism of symmetry breaking; we study this phenomenon and find that standard GPU operations, which are non-deterministic, can serve as a sufficient source of symmetry breaking to enable training.

Translated: 2022-11-14 13:14:47 Published: 2020-07-02

# Bidirectional Encoder Representations from Transformers (BERT): A sentiment analysis odyssey ( http://arxiv.org/abs/2007.01127v1 )

License: see the linked page

Shivaji Alaparthi (Data Scientist, CenturyLink, Bengaluru, India) and Manit Mishra (Associate Professor, International Management Institute Bhubaneswar, India)

The purpose of the study is to investigate the relative effectiveness of four different sentiment analysis techniques: (1) an unsupervised lexicon-based model using Sent WordNet; (2) a traditional supervised machine learning model using logistic regression; (3) a supervised deep learning model using Long Short-Term Memory (LSTM); and (4) an advanced supervised deep learning model using Bidirectional Encoder Representations from Transformers (BERT). We use a publicly available labeled corpus of 50,000 movie reviews originally posted on the Internet Movie Database (IMDB) for analysis using the Sent WordNet lexicon, logistic regression, LSTM, and BERT. The first three models were run on a CPU-based system, whereas BERT was run on a GPU-based system. The sentiment classification performance was evaluated based on accuracy, precision, recall, and F1 score. The study puts forth two key insights: (1) the relative efficacy of four highly advanced and widely used sentiment analysis techniques; (2) the undisputed superiority of the pre-trained advanced supervised deep learning BERT model in sentiment analysis from text data. This study provides professionals in the analytics industry and academicians working on text analysis with key insight regarding the comparative classification performance of key sentiment analysis techniques, including the recently developed BERT. This is the first research endeavor to compare the advanced pre-trained supervised deep learning model of BERT vis-à-vis other sentiment analysis models, namely LSTM, logistic regression, and Sent WordNet.
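The four evaluation measures named above can be sketched for a binary sentiment task as follows. The function name and the convention of returning 0.0 when a denominator is zero are assumptions; the labels in the usage note are made up and are not results from the study.

```python
def binary_metrics(y_true, y_pred, positive=1):
    """Accuracy, precision, recall, and F1 for one positive class."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = len(pairs) - tp - fp - fn
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if precision + recall
        else 0.0
    )
    return {
        "accuracy": (tp + tn) / len(pairs),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }
```

For true labels `[1, 1, 1, 0, 0, 0]` and predictions `[1, 1, 0, 1, 0, 0]`, all four measures equal 2/3.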

Translated: 2022-11-14 13:08:38 Published: 2020-07-02

# PGD-UNet: A Position-Guided Deformable Network for Simultaneous Segmentation of Organs and Tumors ( http://arxiv.org/abs/2007.01001v1 )

License: see the linked page

Ziqiang Li, Hong Pan, Yaping Zhu, A. K. Qin

(参考訳) 臓器と腫瘍の精密セグメンテーションは臨床応用において重要な役割を担っている。不規則な形状と臓器や腫瘍の大きさ、そして関心の解剖学(aoi)と背景領域の間の重大な階級的不均衡のため、これは困難な課題である。加えて、ほとんどの場合、腫瘍と正常臓器は医療画像に重複することが多いが、現在のアプローチでは腫瘍と臓器の両方を正確に切り離すことができない。このような課題に対処すべく,変形可能な畳み込みの空間的変形能力を利用して臓器と腫瘍の幾何学的変形に対応する位置誘導型変形型unet,pgd-unetを提案する。位置情報はネットワークに明示的にエンコードされ、変形の能力を高める。また,従来の最大プーリング操作で失われた位置情報を保存する新しいプーリングモジュールを提案する。また、異なる構造の境界やアノテーションの主観性がはっきりしないため、ラベルは必ずしも医用画像の分割作業において正確ではない。これはラベルノイズによるトレーニングネットワークの過度な適合を引き起こす可能性がある。この問題に対処するために,新たな損失関数を定式化し,潜在的なラベルノイズがトレーニングプロセスに与える影響を抑制する。本手法は,2つの難解なセグメンテーションタスクで評価し,両タスクにおいて非常に有望なセグメンテーション精度を得た。

Precise segmentation of organs and tumors plays a crucial role in clinical applications. It is a challenging task due to the irregular shapes and various sizes of organs and tumors as well as the significant class imbalance between the anatomy of interest (AOI) and the background region. In addition, in most situations, tumors and normal organs overlap in medical images, but current approaches fail to delineate both tumors and organs accurately. To tackle such challenges, we propose a position-guided deformable UNet, namely PGD-UNet, which exploits the spatial deformation capabilities of deformable convolution to deal with the geometric transformation of both organs and tumors. Position information is explicitly encoded into the network to enhance the capabilities of deformation. Meanwhile, we introduce a new pooling module to preserve position information lost in conventional max-pooling operations. Besides, due to unclear boundaries between different structures as well as the subjectivity of annotations, labels are not necessarily accurate for medical image segmentation tasks. This may cause the trained network to overfit to label noise. To address this issue, we formulate a novel loss function to suppress the influence of potential label noise on the training process. Our method was evaluated on two challenging segmentation tasks and achieved very promising segmentation accuracy in both tasks.

翻訳日:2022-11-14 13:08:08 公開日:2020-07-02

# nlnde:ロバストな薬理学的実体検出のための注意とノイズチャンネルによる神経配列タガーの増強

NLNDE: Enhancing Neural Sequence Taggers with Attention and Noisy Channel for Robust Pharmacological Entity Detection ( http://arxiv.org/abs/2007.01022v1 )

ライセンス: Link先を確認

Lukas Lange, Heike Adel, Jannik Strötgen

(参考訳) 名前付きエンティティ認識は、英語のニューステキストで広く研究されている。しかし、他のドメインや言語への移行は依然として難しい問題である。本稿では,BioNLP Open Shared Tasks 2019のPharmaCoNERコンペティションの最初のサブトラックに参加したシステムについて述べる。スペイン語のテキストにおける薬理学的エンティティ検出を目的としたこのタスクは、非標準ドメインと言語設定を提供する。しかし、言語やドメインの専門知識を必要としないアーキテクチャを提案する。タスクをシーケンスラベリングタスクとして扱い,注意に基づく埋め込み選択と自動アノテートデータのトレーニングを行い,システムの性能をさらに向上させる。提案システムは,特に異なる技術を組み合わせることで,有望な結果を達成し,競争において最大88.6%のF1に達する。

Named entity recognition has been extensively studied on English news texts. However, the transfer to other domains and languages is still a challenging problem. In this paper, we describe the system with which we participated in the first subtrack of the PharmaCoNER competition of the BioNLP Open Shared Tasks 2019. Aiming at pharmacological entity detection in Spanish texts, the task provides a non-standard domain and language setting. However, we propose an architecture that requires neither language nor domain expertise. We treat the task as a sequence labeling task and experiment with attention-based embedding selection and the training on automatically annotated data to further improve our system's performance. Our system achieves promising results, especially by combining the different techniques, and reaches up to 88.6% F1 in the competition.

翻訳日:2022-11-14 13:07:28 公開日:2020-07-02

# nlnde: スペイン語の医学文書の非識別方法

NLNDE: The Neither-Language-Nor-Domain-Experts' Way of Spanish Medical Document De-Identification ( http://arxiv.org/abs/2007.01030v1 )

ライセンス: Link先を確認

Lukas Lange, Heike Adel, Jannik Strötgen

(参考訳) 自然言語処理は、最近この分野で多くの研究を導いた医学領域において大きな可能性を秘めている。しかし、患者ノートや臨床試験などの医療文書の安全な処理の前提条件は、プライバシに敏感な情報の適切な特定である。本稿では,IberLEF 2019の医療文書匿名化タスクであるMEDDOCANコンペティションに参加したNLNDEシステムについて述べる。スペインのデータから保護された健康情報をシーケンスラベル問題として検出・分類し、ニューラルネットワークの異なる埋め込み方法を検討する。非標準言語とドメイン設定を扱うにもかかわらず、NLNDEシステムは競争において有望な結果を達成する。

Natural language processing has huge potential in the medical domain, which has recently led to a lot of research in this field. However, a prerequisite of secure processing of medical documents, e.g., patient notes and clinical trials, is the proper de-identification of privacy-sensitive information. In this paper, we describe our NLNDE system, with which we participated in the MEDDOCAN competition, the medical document anonymization task of IberLEF 2019. We address the task of detecting and classifying protected health information from Spanish data as a sequence-labeling problem and investigate different embedding methods for our neural network. Despite dealing with a non-standard language and domain setting, the NLNDE system achieves promising results in the competition.

翻訳日:2022-11-14 13:07:14 公開日:2020-07-02

# 不完全な情報と制約下での深層強化学習による検査・維持計画

Deep reinforcement learning driven inspection and maintenance planning under incomplete information and constraints ( http://arxiv.org/abs/2007.01380v1 )

ライセンス: Link先を確認

C.P. Andriotis, K.G. Papakonstantinou

(参考訳) 劣化する工学環境における長期的なリスクとコストを最小限に抑えるための検査と保守の方針の決定は、複雑な最適化問題を構成する。主な計算上の課題は、(i) 構成要素数に対する状態・行動集合の濃度の指数関数的拡大による次元の呪い、(ii) 決定段階の数とともに指数関数的に成長する決定木に関連する履歴の呪い、(iii) 環境固有の確率性と検査・監視計測の変動性により引き起こされる状態不確実性の存在、(iv) 資源不足やその他の実現不可能または望ましくないシステム応答による、確率的な長期的制限に係る制約の存在である。本研究は,制約付き部分可観測マルコフ決定プロセス(POMDP)と多エージェント深層強化学習(DRL)の統合フレームワーク内で,これらの課題に対処する。POMDPは、確率的動的計画法とベイズ推論の原理を組み合わせることで、(ii)と(iii)に最適に取り組む。マルチエージェントDRLは、深層関数パラメータ化と分散制御の仮定を通じて(i)に対処する。課題(iv)は、ライフサイクルリスクに基づく制約と予算制限に重点を置いた適切な状態拡張とラグランジュ緩和を通じて、ここで処理される。基礎となるアルゴリズムのステップが提供され、提案フレームワークは、最もリソースとリスクを意識した方法で決定を行う必要がある場合に、確立されたポリシーベースラインを上回り、検査および介入行動の的確な処方を容易にすることが示された。

Determination of inspection and maintenance policies for minimizing long-term risks and costs in deteriorating engineering environments constitutes a complex optimization problem. Major computational challenges include the (i) curse of dimensionality, due to exponential scaling of state/action set cardinalities with the number of components; (ii) curse of history, related to exponentially growing decision-trees with the number of decision-steps; (iii) presence of state uncertainties, induced by inherent environment stochasticity and variability of inspection/monitoring measurements; (iv) presence of constraints, pertaining to stochastic long-term limitations, due to resource scarcity and other infeasible/undesirable system responses. In this work, these challenges are addressed within a joint framework of constrained Partially Observable Markov Decision Processes (POMDP) and multi-agent Deep Reinforcement Learning (DRL). POMDPs optimally tackle (ii)-(iii), combining stochastic dynamic programming with Bayesian inference principles. Multi-agent DRL addresses (i), through deep function parametrizations and decentralized control assumptions. Challenge (iv) is herein handled through proper state augmentation and Lagrangian relaxation, with emphasis on life-cycle risk-based constraints and budget limitations. The underlying algorithmic steps are provided, and the proposed framework is found to outperform well-established policy baselines and facilitate adept prescription of inspection and intervention actions, in cases where decisions must be made in the most resource- and risk-aware manner.

翻訳日:2022-11-14 12:59:14 公開日:2020-07-02

# perceptiongan: 知覚理解によるテキスト提供による実世界画像の構築

PerceptionGAN: Real-world Image Construction from Provided Text through Perceptual Understanding ( http://arxiv.org/abs/2007.00977v1 )

ライセンス: Link先を確認

Kanish Garg, Ajeet kumar Singh, Dorien Herremans, Brejesh Lall

(参考訳) 提示された記述テキストから画像を生成することは、知覚情報(形状、色、およびそれらの相互作用)を組み込むことが困難であり、提供されたテキストに高い関連性を与えるため、非常に難しい作業である。現在の方法では、通常不規則な物体の形、色、オブジェクト間の相互作用を持つ最初の低解像度画像を生成する。この初期画像はテキストの条件付けによって改善される。しかし,本手法は,dm-gan論文で指摘されているように,初期生成画像の精細化においてテキスト表現を効率的に利用する問題に主に対処しているが,この精細化プロセスの成功は初期生成画像の品質に大きく依存する。そこで本研究では,識別器モジュールに知覚的理解を取り入れ,優れた初期化画像を提供する手法を提案する。我々は第1段階の知覚情報を改善するとともに,最終生成画像の大幅な改善を実現した。本稿では,新しいStackGANアーキテクチャにアプローチを適用した。そして、複数の段階で画像分布をモデル化しながら、初期画像に含まれる知覚情報が改善されることを示す。最後に,テキストで条件づけされた現実的な多色画像を生成する。これらの画像は、基本的な知覚情報の改善とともに高品質である。さらに重要なことに、提案手法は他の最先端テキストベース画像生成モデルのパイプラインに統合でき、初期低解像度画像を生成することができる。また,StackGANアーキテクチャにおけるジェネレータ-ディスクリミネータペアの第3段階の強化により,StackGANの洗練プロセスの改善にも取り組んでいる。大規模だがスパースなデータセットMS COCOを用いた実験解析と最先端技術との比較により,提案手法の有効性がさらに検証された。

Generating an image from a provided descriptive text is quite a challenging task because of the difficulty in incorporating perceptual information (object shapes, colors, and their interactions) along with providing high relevancy related to the provided text. Current methods first generate an initial low-resolution image, which typically has irregular object shapes, colors, and interaction between objects. This initial image is then improved by conditioning on the text. However, these methods mainly address the problem of using text representation efficiently in the refinement of the initially generated image, while the success of this refinement process depends heavily on the quality of the initially generated image, as pointed out in the DM-GAN paper. Hence, we propose a method to provide good initialized images by incorporating perceptual understanding in the discriminator module. We improve the perceptual information at the first stage itself, which results in significant improvement in the final generated image. In this paper, we have applied our approach to the novel StackGAN architecture. We then show that the perceptual information included in the initial image is improved while modeling image distribution at multiple stages. Finally, we generated realistic multi-colored images conditioned by text. These images have good quality along with containing improved basic perceptual information. More importantly, the proposed method can be integrated into the pipeline of other state-of-the-art text-based-image-generation models to generate initial low-resolution images. We also worked on improving the refinement process in StackGAN by augmenting the third stage of the generator-discriminator pair in the StackGAN architecture. Our experimental analysis and comparison with the state-of-the-art on a large but sparse dataset MS COCO further validate the usefulness of our proposed approach.

翻訳日:2022-11-14 12:58:28 公開日:2020-07-02

# オブジェクト検出のための反復境界ボックスアノテーション

Iterative Bounding Box Annotation for Object Detection ( http://arxiv.org/abs/2007.00961v1 )

ライセンス: Link先を確認

Bishwo Adhikari and Heikki Huttunen

(参考訳) デジタル画像におけるオブジェクト検出のための境界ボックスの手動アノテーションは退屈で、時間とリソースを消費する。本稿では,効率的なバウンディングボックスアノテーションのための半自動手法を提案する。この方法は、ラベル付き画像の小さなバッチにオブジェクト検出器を反復的に訓練し、次のバッチにバウンディングボックスを提案することを学習する。本稿では,人間の行動をシミュレーションし,アノテータにデータを提示する順序など,異なるイテレーション戦略を比較するための実験的なセットアップを提案する。提案手法を3つのデータセットを用いて実験し,人手による注記作業の75%を省くことにより,人手による注記作業を大幅に削減できることを示した。

Manual annotation of bounding boxes for object detection in digital images is tedious and consumes considerable time and resources. In this paper, we propose a semi-automatic method for efficient bounding box annotation. The method trains the object detector iteratively on small batches of labeled images and learns to propose bounding boxes for the next batch, after which the human annotator only needs to correct possible errors. We propose an experimental setup for simulating the human actions and use it for comparing different iteration strategies, such as the order in which the data is presented to the annotator. We evaluate our method on three datasets and show that it can reduce the human annotation effort significantly, saving up to 75% of total manual annotation work.
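
The iterative loop described in the abstract above (label a batch, retrain, propose boxes for the next batch, let the annotator correct errors) can be sketched as follows. This is a toy simulation under stated assumptions, not the authors' setup: the "detector" simply proposes the mean of all boxes labeled so far, the simulated annotator accepts a proposal when its IoU with the true box is at least 0.5, and the names `annotate_iteratively`, `train_mean_box`, and `propose_mean_box` are hypothetical.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union else 0.0


def train_mean_box(labeled):
    """Toy 'training': the model is the coordinate-wise mean labeled box."""
    n = len(labeled)
    return tuple(sum(box[i] for _, box in labeled) / n for i in range(4))


def propose_mean_box(model, image_id):
    """Toy 'inference': propose the same mean box for every image."""
    return model


def annotate_iteratively(images, batch_size, train, propose, iou_threshold=0.5):
    """Label images batch by batch; return (labeled, manual_corrections)."""
    labeled, corrections, model = [], 0, None
    for start in range(0, len(images), batch_size):
        for image_id, true_box in images[start:start + batch_size]:
            proposal = propose(model, image_id) if model is not None else None
            # Simulated annotator: accept a good proposal, otherwise draw by hand.
            if proposal is None or iou(proposal, true_box) < iou_threshold:
                corrections += 1
            labeled.append((image_id, true_box))
        model = train(labeled)  # retrain on everything labeled so far
    return labeled, corrections


# Hypothetical data: five similar boxes and one outlier (image 4).
images = [
    (0, (10, 10, 50, 50)), (1, (12, 10, 52, 50)), (2, (11, 11, 51, 51)),
    (3, (10, 12, 50, 52)), (4, (100, 100, 140, 140)), (5, (11, 10, 51, 50)),
]
labeled, corrections = annotate_iteratively(
    images, batch_size=2, train=train_mean_box, propose=propose_mean_box)
saved_fraction = 1 - corrections / len(images)
```

In this toy run the first batch is labeled entirely by hand, and later batches need manual work only where the proposal misses, which is the effect the abstract quantifies on real datasets.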

翻訳日:2022-11-14 12:57:59 公開日:2020-07-02

# 視覚的質問応答のためのシーングラフ推論

Scene Graph Reasoning for Visual Question Answering ( http://arxiv.org/abs/2007.01072v1 )

ライセンス: Link先を確認

Marcel Hildebrandt, Hang Li, Rajat Koner, Volker Tresp, Stephan Günnemann

(参考訳) 視覚的な質問応答は、画像に関する自由形式の質問に答えることに関するものである。質問に対する深い言語的理解と、画像に存在する様々なオブジェクトと関連付ける能力を必要とするため、これは野心的な課題であり、コンピュータビジョンと自然言語処理の両方の技法を必要とする。本研究では,シーン内に存在するオブジェクトとその意味的・空間的関係に基づいて,コンテキスト駆動の逐次推論を行うことによってタスクにアプローチする手法を提案する。最初のステップとして、画像内のオブジェクトとその属性とその相互関係を記述するシーングラフを導出する。強化エージェントは、抽出されたシーングラフを自律的にナビゲートすることを学習し、回答を導出する基礎となるパスを生成する。我々は,手作業でキュレーションされたシーングラフを用いて,挑戦的なGQAデータセット上で最初の実験的検討を行い,提案手法が人間の性能水準にほぼ到達することを示した。

Visual question answering is concerned with answering free-form questions about an image. Since it requires a deep linguistic understanding of the question and the ability to associate it with various objects that are present in the image, it is an ambitious task and requires techniques from both computer vision and natural language processing. We propose a novel method that approaches the task by performing context-driven, sequential reasoning based on the objects and their semantic and spatial relationships present in the scene. As a first step, we derive a scene graph which describes the objects in the image, as well as their attributes and their mutual relationships. A reinforcement agent then learns to autonomously navigate over the extracted scene graph to generate paths, which are then the basis for deriving answers. We conduct a first experimental study on the challenging GQA dataset with manually curated scene graphs, where our method almost reaches the level of human performance.

翻訳日:2022-11-14 12:57:47 公開日:2020-07-02

# 深層マルチタスク学習と補助タスク学習の概観

A Brief Review of Deep Multi-task Learning and Auxiliary Task Learning ( http://arxiv.org/abs/2007.01126v1 )

ライセンス: Link先を確認

Partoo Vafaeikia, Khashayar Namdar, Farzad Khalvati

(参考訳) マルチタスク学習(mtl)は複数の学習タスクを同時に最適化し、共有情報を活用して各タスクの一般化とモデル予測を改善する。補助タスクをメインタスクに追加すれば、最終的にパフォーマンスが向上する。本稿では,近年のDeep Multi-task Learning (dMTL) アプローチについて簡単なレビューを行い,それに続いて,本タスクのモデルの性能向上のために,dMTLで使用できる有用な補助タスクを選択する方法について述べる。

Multi-task learning (MTL) optimizes several learning tasks simultaneously and leverages their shared information to improve generalization and the prediction of the model for each task. Auxiliary tasks can be added to the main task to ultimately boost the performance. In this paper, we provide a brief review on the recent deep multi-task learning (dMTL) approaches followed by methods on selecting useful auxiliary tasks that can be used in dMTL to improve the performance of the model for the main task.

翻訳日:2022-11-14 12:57:33 公開日:2020-07-02

# トレーサノーム逆数例

Trace-Norm Adversarial Examples ( http://arxiv.org/abs/2007.01855v1 )

ライセンス: Link先を確認

Ehsan Kazemi, Thomas Kerdreux and Liqiang Wang

(参考訳) ホワイトボックスの逆転摂動は反復最適化アルゴリズムによって求めるが、ほとんどの場合、元の画像の$l_p$近傍での逆転損失を最小限に抑える。逆探索を異なるノルムで制限すると、異なる構成の逆の例が得られる。ここでは,構造エンハンシングアルゴリズムを用いた歪み集合について検討する。敵対的な例のためのこれらの新しい構造は、最適化において広く普及しているが、例えば、また$l_p$証明書しか提供しない敵対的理論証明の挑戦である。敵の堅牢性はまだ実証的な分野であるため、防御機構は異なる構成の攻撃に対して合理的に評価されるべきである。さらに、これらの構造的対向摂動は、画像の自然なわずかな歪みとして知覚できないか知覚できないまま、$l_p$カウンタ部よりも大きな歪みを許容する。最後に、(局所的な)ぼやけなど、敵対的な摂動の発生をある程度制御できる。

White-box adversarial perturbations are sought via iterative optimization algorithms, most often by minimizing an adversarial loss on an $l_p$ neighborhood of the original image, the so-called distortion set. Constraining the adversarial search with different norms results in disparately structured adversarial examples. Here we explore several distortion sets with structure-enhancing algorithms. These new structures for adversarial examples, although pervasive in optimization, are for instance a challenge for adversarial theoretical certification, which again provides only $l_p$ certificates. Because adversarial robustness is still an empirical field, defense mechanisms should also reasonably be evaluated against differently structured attacks. Besides, these structured adversarial perturbations may allow for larger distortion sizes than their $l_p$ counterparts while remaining imperceptible or perceptible as natural slight distortions of the image. Finally, they allow some control over the generation of the adversarial perturbation, like (localized) blurriness.

翻訳日:2022-11-14 12:57:02 公開日:2020-07-02

# Decoder-free Robustness Disentanglement without (Additional) Supervision

Decoder-free Robustness Disentanglement without (Additional) Supervision ( http://arxiv.org/abs/2007.01356v1 )

ライセンス: Link先を確認

Yifei Wang, Dan Peng, Furui Liu, Zhenguo Li, Zhitang Chen, Jiansheng Yang

(参考訳) 対向訓練(adversarial training, at)は、入力からロバストな特徴のみを抽出することによって、機械学習モデルの敵対的脆弱性を軽減するために提案されているが、非ロバストで有用な特徴を捨てることにより、必然的に正確性が低下する。これにより、ロバストな特徴と非ロバストな特徴の両方を保存し、対立する表現学習でそれらを分離するモチベーションが生まれます。提案する逆非対称トレーニング(aat)アルゴリズムはロバスト表現と非ロバスト表現を確実に分離することができる。実験結果から,本手法は2つの表現を組み合わせることで精度を保っただけでなく,従来の作業よりもはるかに良好な絡み合いを実現できることがわかった。

Adversarial Training (AT) is proposed to alleviate the adversarial vulnerability of machine learning models by extracting only robust features from the input, which, however, inevitably leads to severe accuracy reduction as it discards the non-robust yet useful features. This motivates us to preserve both robust and non-robust features and separate them with disentangled representation learning. Our proposed Adversarial Asymmetric Training (AAT) algorithm can reliably disentangle robust and non-robust representations without additional supervision on robustness. Empirical results show that our method not only successfully preserves accuracy by combining the two representations, but also achieves much better disentanglement than previous work.

翻訳日:2022-11-14 12:50:59 公開日:2020-07-02

# データサンプリングとマルチタスク最適化による新しいDNNトレーニングフレームワーク

A Novel DNN Training Framework via Data Sampling and Multi-Task Optimization ( http://arxiv.org/abs/2007.01016v1 )

ライセンス: Link先を確認

Boyu Zhang, A. K. Qin, Hong Pan, Timos Sellis

(参考訳) 従来のDNNトレーニングパラダイムは、トレーニングに使用される注釈付きデータセット、すなわち総合トレーニングセットを特定の方法で分割することで得られる、1つのトレーニングセットと1つの検証セットに依存している。トレーニングセットはモデルのトレーニングに使用され、検証セットは過度な適合を避けるために、トレーニングの進行に伴ってモデルの一般化性能を推定するために使用される。このパラダイムには2つの大きな問題があります。まず、検証セットは、テストデータとの潜在的なミスマッチのため、一般化性能の偏りのない推定をほとんど保証しない。第二に、DNNの訓練は複雑な最適化問題を解くことに対応しており、これは劣った局所最適解に閉じ込められやすいため、望ましくない訓練結果をもたらす。これらの課題に対処するために,我々は新しいDNNトレーニングフレームワークを提案する。ランダムな分割により総合トレーニングセットからトレーニングセットと検証セットのペアを複数生成し、一つのモデルトレーニングプロセスから得られた有用な知識(例えば、有望なネットワークパラメータ)をマルチタスク最適化によって他のモデルトレーニングプロセスに転送しながら、各ペア上で事前指定された構造のDNNモデルを訓練し、訓練された全てのモデルの中で、全ペアの検証セットにわたって総合的に最高の性能を持つモデルを出力する。この新フレームワークで特徴付けられる知識伝達機構は、モデルトレーニングプロセスが局所最適解から逃れるのを支援することでトレーニングの有効性を向上させるだけでなく、他のモデルトレーニングプロセスから1つのモデルトレーニングプロセスに課される暗黙の正規化によって一般化性能を向上させることができる。提案するフレームワークを実装し,GPUクラスタ上で実装を並列化し,広く使用されているいくつかのDNNモデルのトレーニングに適用する。実験の結果,提案フレームワークが従来の学習パラダイムよりも優れていることが示された。

Conventional DNN training paradigms typically rely on one training set and one validation set, obtained by partitioning an annotated dataset used for training, namely the gross training set, in a certain way. The training set is used for training the model while the validation set is used to estimate the generalization performance of the trained model as the training proceeds to avoid over-fitting. There exist two major issues in this paradigm. Firstly, the validation set may hardly guarantee an unbiased estimate of generalization performance due to potential mismatching with test data. Secondly, training a DNN corresponds to solving a complex optimization problem, which is prone to getting trapped in inferior local optima and thus leads to undesired training results. To address these issues, we propose a novel DNN training framework. It generates multiple pairs of training and validation sets from the gross training set via random splitting, trains a DNN model of a pre-specified structure on each pair while transferring the useful knowledge (e.g., promising network parameters) obtained from one model training process to other model training processes via multi-task optimization, and outputs the model that, among all trained models, has the overall best performance across the validation sets from all pairs. The knowledge transfer mechanism featured in this new framework can not only enhance training effectiveness by helping the model training process to escape from local optima but also improve generalization performance via implicit regularization imposed on one model training process from other model training processes. We implement the proposed framework, parallelize the implementation on a GPU cluster, and apply it to train several widely used DNN models. Experimental results demonstrate the superiority of the proposed framework over the conventional training paradigm.
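
Two steps of the framework in the abstract above can be sketched in a few lines: generating several random training/validation pairs from the gross training set, and selecting the model that scores best across the validation sets of all pairs. The sketch deliberately omits the multi-task knowledge transfer between training processes, which is the core of the paper. The names `random_splits` and `select_best`, the validation fraction, and the scoring callback are assumptions made for illustration.

```python
import random


def random_splits(gross_training_set, num_pairs, val_fraction=0.2, seed=0):
    """Return num_pairs of (training_set, validation_set) via random splitting."""
    rng = random.Random(seed)
    n_val = max(1, int(len(gross_training_set) * val_fraction))
    pairs = []
    for _ in range(num_pairs):
        shuffled = list(gross_training_set)
        rng.shuffle(shuffled)
        # The first n_val shuffled items form the validation set, the rest train.
        pairs.append((shuffled[n_val:], shuffled[:n_val]))
    return pairs


def select_best(models, pairs, score):
    """Pick the model with the highest mean score over every validation set."""
    def mean_score(model):
        return sum(score(model, val) for _, val in pairs) / len(pairs)
    return max(models, key=mean_score)


# Hypothetical usage: ten samples, three independent random splits.
pairs = random_splits(list(range(10)), num_pairs=3, val_fraction=0.2)
```

Scoring each candidate on all validation sets rather than only its own reduces the dependence of model selection on one particular split, which is the first issue the abstract raises.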

翻訳日:2022-11-14 12:50:12 公開日:2020-07-02

# オブジェクトやシーンを識別するように訓練されたCNNの隠れた層の中に、オブジェクト検出器はありますか?

Are there any 'object detectors' in the hidden layers of CNNs trained to identify objects or scenes? ( http://arxiv.org/abs/2007.01062v1 )

ライセンス: Link先を確認

Ella M. Gale and Nicholas Martin and Ryan Blything and Anh Nguyen and Jeffrey S. Bowers

(参考訳) ニューラルネットワークの動作をよりよく理解するために、様々な単位選択性の測定方法が開発されている。しかし、異なる尺度は選択性の異なる推定を与えるため、選択的な対象表現が学習される条件とこれらの表現の機能的関連性に関して異なる結論が導かれてきた。対象選択性をよりよく特徴づけるために,AlexNetの多数のユニットに対して様々な選択性尺度の比較を行った。例えば,局所選択性,精度,クラス条件の平均活動選択性(CCMAS),ネットワーク分割,アクティベーション最大化(AM)画像の人間解釈,標準信号検出測定などである。異なる尺度は対象選択性の異なる推定値を与え、精度とCCMASの尺度は誤解を招くほど高い推定値を与えることがわかった。実際、最も選択的なユニットは、物体分類においてヒット率が低いか、誤警報率が高く(あるいはその両方)、物体検出器としては不十分であった。リカレントニューラルネットワークで報告された「グランドマザーセル」ユニットにわずかでも匹敵するほど選択的なユニットは見つからなかった。これらの結果を一般化するため,ImageNet あるいは Places-365 データセットで訓練された VGG-16 と GoogLeNet において「対象検出器」と記述されてきたユニットの選択性尺度を比較した。ここでも、物体分類におけるヒット率の低さと誤警報率の高さが確認された。信号検出尺度は、一般的な代替手法と比較して単一ユニット選択性をよりよく評価でき、画像分類のための深層畳み込みネットワークは隠れ層において物体検出器を学習しないと結論づける。

Various methods of measuring unit selectivity have been developed with the aim of better understanding how neural networks work. But the different measures provide divergent estimates of selectivity, and this has led to different conclusions regarding the conditions in which selective object representations are learned and the functional relevance of these representations. In an attempt to better characterize object selectivity, we undertake a comparison of various selectivity measures on a large set of units in AlexNet, including localist selectivity, precision, class-conditional mean activity selectivity (CCMAS), network dissection, the human interpretation of activation maximization (AM) images, and standard signal-detection measures. We find that the different measures provide different estimates of object selectivity, with precision and CCMAS measures providing misleadingly high estimates. Indeed, the most selective units had a poor hit-rate or a high false-alarm rate (or both) in object classification, making them poor object detectors. We fail to find any units that are even remotely as selective as the 'grandmother cell' units reported in recurrent neural networks. In order to generalize these results, we compared selectivity measures on units in VGG-16 and GoogLeNet trained on the ImageNet or Places-365 datasets that have been described as 'object detectors'. Again, we find poor hit-rates and high false-alarm rates for object classification. We conclude that signal-detection measures provide a better assessment of single-unit selectivity compared to common alternative approaches, and that deep convolutional networks for image classification do not learn object detectors in their hidden layers.
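
The contrast the abstract above draws between precision and signal-detection measures can be illustrated with a small sketch. It treats a hidden unit as a detector that "fires" when its activation exceeds a threshold and reports hit rate, false-alarm rate, and precision for one target class. The thresholding rule, the function name `unit_selectivity`, and the toy activations are assumptions for illustration; the paper's exact definitions may differ.

```python
def unit_selectivity(activations, labels, target_class, threshold):
    """Hit rate, false-alarm rate, and precision of one unit for one class."""
    fired = [a > threshold for a in activations]
    is_target = [lab == target_class for lab in labels]

    hits = sum(1 for f, t in zip(fired, is_target) if f and t)
    false_alarms = sum(1 for f, t in zip(fired, is_target) if f and not t)
    n_target = sum(is_target)
    n_other = len(labels) - n_target
    n_fired = hits + false_alarms

    return {
        # Fraction of target-class images on which the unit fires.
        "hit_rate": hits / n_target if n_target else 0.0,
        # Fraction of non-target images on which the unit fires.
        "false_alarm_rate": false_alarms / n_other if n_other else 0.0,
        # Fraction of firing events that are target-class images.
        "precision": hits / n_fired if n_fired else 0.0,
    }


# Hypothetical unit: it fires on only one of five dog images and on no cat image.
activations = [0.9, 0.2, 0.1, 0.3, 0.2, 0.1, 0.0, 0.2, 0.1, 0.3]
labels = ["dog"] * 5 + ["cat"] * 5
result = unit_selectivity(activations, labels, "dog", threshold=0.5)
```

In this toy case precision is perfect while the hit rate is only 0.2, so the unit would miss most dogs. That is the sense in which a precision-style measure alone can overstate how good an object detector a unit is.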

翻訳日:2022-11-14 12:49:42 公開日:2020-07-02

# 専門家としての事実:記号的知識よりも適応可能で解釈可能なニューラルメモリ

Facts as Experts: Adaptable and Interpretable Neural Memory over Symbolic Knowledge ( http://arxiv.org/abs/2007.00849v1 )

ライセンス: Link先を確認

Pat Verga, Haitian Sun, Livio Baldini Soares, William W. Cohen

(参考訳) 大規模言語モデルは現代のNLPモデリングの中核であり、膨大なコモンセンスと事実情報をエンコードすることが示されている。しかし、その知識はモデルの潜在パラメータ内にのみ存在し、検査や解釈にはアクセスできない。さらに悪いことに、トレーニングコーパスから記憶された事実情報は、世界が変化するにつれて、陳腐化する可能性が高い。パラメータとして格納された知識は、必然的にソース素材に固有のすべてのバイアスを示す。これらの問題に対処するため、記号的解釈可能な事実情報とサブシンボル的神経知識との明確なインターフェースを含むニューラル言語モデルを開発する。このモデルは2つの知識集約型質問応答タスクの性能を劇的に向上させる。さらに興味深いことに、モデルはシンボル表現を操作することで再トレーニングすることなく更新することができる。特にこのモデルは、以前のモデルでは不可能な方法で、新しい事実を追加し、既存の事実を上書きすることができます。

Massive language models are the core of modern NLP modeling and have been shown to encode impressive amounts of commonsense and factual information. However, that knowledge exists only within the latent parameters of the model, inaccessible to inspection and interpretation, and even worse, factual information memorized from the training corpora is likely to become stale as the world changes. Knowledge stored as parameters will also inevitably exhibit all of the biases inherent in the source materials. To address these problems, we develop a neural language model that includes an explicit interface between symbolically interpretable factual information and subsymbolic neural knowledge. We show that this model dramatically improves performance on two knowledge-intensive question-answering tasks. More interestingly, the model can be updated without re-training by manipulating its symbolic representations. In particular, this model allows us to add new facts and overwrite existing ones in ways that are not possible for earlier models.

翻訳日:2022-11-14 12:48:48 公開日:2020-07-02

# より少ないことで達成できるのか? 有害コメント分類のためのデータ拡張の検討

Can We Achieve More with Less? Exploring Data Augmentation for Toxic Comment Classification ( http://arxiv.org/abs/2007.00875v1 )

ライセンス: Link先を確認

Chetanya Rastogi, Nikka Mofid, Fang-I Hsiao

(参考訳) 本稿では、機械学習における最大の制限の一つに対処する。具体的には、データ拡張技術と機械学習アルゴリズムの組み合わせを利用して、高精度な分類器を小さなデータセットから構築できるかどうかを検討する。本稿では,データ拡張(eda)とバックトランスレーション,およびロジスティック回帰(logistic regression),サポートベクターマシン(svm),双方向長短期記憶ネットワーク(bi-lstm)の3つの一般的な学習アルゴリズムについて実験を行う。実験のために、wikipedia toxic commentsデータセットを利用して、データ拡張の利点を探求する過程で、サイバーいじめやオンラインハラスメントに対抗するために、コメント中の有毒な発言を検出し分類するモデルを開発することができる。最終的に、データ拡張技術は分類器の性能を大幅に向上させ、NLP問題におけるデータの欠如に対処するための優れた戦略であることがわかった。

This paper tackles one of the greatest limitations in Machine Learning: Data Scarcity. Specifically, we explore whether high accuracy classifiers can be built from small datasets, utilizing a combination of data augmentation techniques and machine learning algorithms. In this paper, we experiment with Easy Data Augmentation (EDA) and Backtranslation, as well as with three popular learning algorithms, Logistic Regression, Support Vector Machine (SVM), and Bidirectional Long Short-Term Memory Network (Bi-LSTM). For our experimentation, we utilize the Wikipedia Toxic Comments dataset so that in the process of exploring the benefits of data augmentation, we can develop a model to detect and classify toxic speech in comments to help fight back against cyberbullying and online harassment. Ultimately, we found that data augmentation techniques can be used to significantly boost the performance of classifiers and are an excellent strategy to combat lack of data in NLP problems.
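
Easy Data Augmentation (EDA), mentioned in the abstract above, consists of four token-level operations: synonym replacement, random insertion, random swap, and random deletion. The sketch below shows only the two operations that need no synonym dictionary. The function names, the seeded generator, and the example sentence are assumptions for illustration and are not taken from the paper's code.

```python
import random


def random_swap(words, n_swaps, rng):
    """Swap two randomly chosen positions, n_swaps times."""
    words = list(words)
    if len(words) < 2:
        return words
    for _ in range(n_swaps):
        i, j = rng.sample(range(len(words)), 2)
        words[i], words[j] = words[j], words[i]
    return words


def random_deletion(words, p, rng):
    """Drop each word with probability p, always keeping at least one word."""
    if not words:
        return []
    kept = [w for w in words if rng.random() >= p]
    return kept if kept else [rng.choice(words)]


# Hypothetical comment to augment; a fixed seed keeps the run repeatable.
rng = random.Random(0)
sentence = "this comment is perfectly fine".split()
augmented = [
    " ".join(random_swap(sentence, n_swaps=1, rng=rng)),
    " ".join(random_deletion(sentence, p=0.2, rng=rng)),
]
```

Each augmented sentence keeps the label of its source comment, which is how such operations enlarge a small labeled dataset.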

翻訳日:2022-11-14 12:48:33 公開日:2020-07-02

PDF登録状況（公開日: 20200702）