Fugu-MT: arXiv Paper Translations

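This site provides Japanese translations of arXiv papers that are 30 pages or shorter and released under a Creative Commons license (CC 0, CC BY, CC BY-SA). For papers whose body text is not CC-licensed, and for papers that are too long, only the metadata is translated. (arXiv metadata is CC 0.) The translated text is licensed under CC BY-SA 4.0. Translations are produced with Fugu-Machine Translator.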

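The operator of this site does not guarantee the quality of this site (including all information and translations) and accepts no responsibility whatsoever for any consequences arising from the use of this site (including all information and translations).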

These are the papers whose publication date is 20210202.

Title	Authors	Abstract	Publication date / Translation date
# Energy of a free Brownian particle coupled to thermal vacuum ( http://arxiv.org/abs/2003.13567v2 ) License: check the linked page	J. Spiechowicz, J. Łuczka	Experimentalists have come to temperatures very close to absolute zero at which physics that was once ordinary becomes extraordinary. In such a regime quantum effects and fluctuations start to play a dominant role. In this context we study the simplest open quantum system, namely, a free quantum Brownian particle coupled to thermal vacuum, i.e., a thermostat in the limiting case of absolute zero temperature. We analyze the average energy $E=E(c)$ of the particle from a weak to strong interaction strength $c$ between the particle and thermal vacuum. The impact of various dissipation mechanisms is considered. In the weak coupling regime the energy tends to zero as $E(c) \sim c\, \ln{(1/c)}$ while in the strong coupling regime it diverges to infinity as $E(c) \sim \sqrt{c}$. We demonstrate it for selected examples of the dissipation mechanisms defined by the memory kernel $\gamma(t)$ of the Generalized Langevin Equation. We reveal how at a fixed value of $c$ the energy $E(c)$ depends on the dissipation model: one has to compare values of the derivative $\gamma'(t)$ of the dissipation function $\gamma(t)$ at time $t=0$ or at the memory time $t=\tau_c$ which characterizes the degree of non-Markovianity of the Brownian particle dynamics. The impact of low temperature is also presented.	Translated: 2023-05-27 12:12:32 Published: 2021-02-02
# A Heuristic Quantum-Classical Algorithm for Modeling Substitutionally Disordered Binary Crystalline Materials ( http://arxiv.org/abs/2004.00957v3 ) License: check the linked page	Tanvi P. Gujarati, Tyler Takeshita, Andreas Hintennach, and Eunseok Lee	Improving the efficiency and accuracy of energy calculations has been of significant and continued interest in the area of materials informatics, a field that applies machine learning techniques to computational materials data. Here, we present a heuristic quantum-classical algorithm to efficiently model and predict the energies of substitutionally disordered binary crystalline materials. Specifically, a quantum circuit that scales linearly in the number of lattice sites is designed and trained to predict the energies of quantum chemical simulations in an exponentially-scaling feature space. This circuit is trained by classical supervised-learning using data obtained from classically-computed quantum chemical simulations. As a part of the training process, we introduce a sub-routine that is able to detect and rectify anomalies in the input data. The algorithm is demonstrated on the complex layer-structured Li-cobaltate system, a widely-used Li-ion battery cathode material component. Our results show that the proposed quantum circuit model presents a suitable choice for modelling the energies obtained from such quantum mechanical systems. Furthermore, analysis of the anomalous data provides important insights into the thermodynamic properties of the systems studied.	Translated: 2023-05-27 03:25:28 Published: 2021-02-02
# Fast tests for probing the causal structure of quantum processes ( http://arxiv.org/abs/2004.08308v3 ) License: check the linked page	Giulio Chiribella and Swati	The identification of causal relations is a cornerstone of the scientific method. Traditional approaches to this task are based on classical statistics. However, such classical approaches do not apply in the quantum domain, where a broader spectrum of causal relations becomes accessible. New approaches to quantum causal inference have been developed in recent years, and promising new features have been discovered. In this paper, we review and partly expand the framework and results of Ref. [1], which demonstrated quantum speedups in the identification of various types of causal relations induced by reversible processes.	Translated: 2023-05-23 04:38:00 Published: 2021-02-02
# Continuum limit of lattice quasielectron wavefunctions ( http://arxiv.org/abs/2004.12205v2 ) License: check the linked page	Aniket Patra, Birgit Hillebrecht, and Anne E. B. Nielsen	Trial states describing anyonic quasiholes in the Laughlin state were found early on, and it is therefore natural to expect that one should also be able to create anyonic quasielectrons. Nevertheless, the existing trial wavefunctions for quasielectrons show behaviors that are not compatible with the expected topological properties or their construction involves ad hoc elements. It was shown, however, that for lattice fractional quantum Hall systems, it is possible to find a relatively simple quasielectron wavefunction that has all the expected properties [New J. Phys. 20, 033029 (2018)]. This naturally poses the question: what happens to this wavefunction in the continuum limit? Here we demonstrate that, although one obtains a finite continuum wavefunction when the quasielectron is on top of a lattice site, such a limit of the lattice quasielectron does not exist in general. In particular, if the quasielectron is put anywhere else than on a lattice site, the lattice wavefunction diverges when the continuum limit is approached. The divergence can be removed by projecting the state on the lowest Landau level, but we find that the projected state also does not have the properties expected for anyonic quasielectrons. We hence conclude that the lattice quasielectron wavefunction does not solve the difficulty of finding trial states for anyonic quasielectrons in the continuum.	Translated: 2023-05-22 04:01:29 Published: 2021-02-02
# Topological delocalization in the completely disordered two-dimensional quantum walk ( http://arxiv.org/abs/2005.00203v3 ) License: check the linked page	János K. Asbóth, Arindam Mallick	We investigate numerically and theoretically the effect of spatial disorder on two-dimensional split-step discrete-time quantum walks with two internal "coin" states. Spatial disorder can lead to Anderson localization, inhibiting the spread of quantum walks, putting them at a disadvantage against their diffusively spreading classical counterparts. We find that spatial disorder of the most general type, i.e., position-dependent Haar random coin operators, does not lead to Anderson localization but to a diffusive spread instead. This is a delocalization, which happens because disorder places the quantum walk at a critical point between different anomalous Floquet-Anderson insulating topological phases. We base this explanation on the relationship of this general quantum walk to a simpler case more studied in the literature and for which disorder-induced delocalization of a topological origin has been observed. We review topological delocalization for the simpler quantum walk, using time evolution of the wave functions and level spacing statistics. We apply scattering theory to two-dimensional quantum walks and thus calculate the topological invariants of disordered quantum walks, substantiating the topological interpretation of the delocalization and finding signatures of the delocalization in the finite-size scaling of transmission. We show criticality of the Haar random quantum walk by calculating the critical exponent $\eta$ in three different ways and find $\eta \approx 0.52$ as in the integer quantum Hall effect. Our results showcase how theoretical ideas and numerical tools from solid-state physics can help us understand spatially random quantum walks.	Translated: 2023-05-21 15:08:07 Published: 2021-02-02
# Quantum Zeno approach for molecular energies with maximum commuting initial Hamiltonians ( http://arxiv.org/abs/2006.01066v2 ) License: check the linked page	Hongye Yu, Tzu-Chieh Wei	We propose to use a quantum adiabatic and simulated-annealing framework to compute the ground state of small molecules. The initial Hamiltonian of our algorithms is taken to be the maximum commuting Hamiltonian that consists of a maximal set of commuting terms in the full Hamiltonian of molecules in the Pauli basis. We consider two variants. In the first method, we perform the adiabatic evolution on the obtained time- or path-dependent Hamiltonian with the initial state as the ground state of the maximum commuting Hamiltonian. However, this method does suffer from the usual problems of adiabatic quantum computation due to degeneracy and energy-level crossings along the Hamiltonian path. This problem is mitigated by a Zeno method, i.e., via a series of eigenstate projections used in the quantum simulated annealing, with the path-dependent Hamiltonian augmented by a sum of Pauli X terms, whose contribution vanishes at the beginning and the end of the path. In addition to the ground state, the low lying excited states can be obtained using this quantum Zeno approach with equal accuracy to that of the ground state.	Translated: 2023-05-17 11:18:35 Published: 2021-02-02
# Ethical issues with using Internet of Things devices in citizen science research: A scoping review ( http://arxiv.org/abs/2007.09416v2 ) License: check the linked page	James Scheibner, Anna Jobin, Effy Vayena	Our chapter presents a scoping review of published scientific studies or case studies of scientific studies that utilise both citizen scientists and Internet of Things devices. Specifically, we selected studies where the authors had included at least a short discussion of the ethical issues encountered during the research process. Having conducted a search of five databases (IEEE Xplore, Scopus, Web of Science, ProQuest, and PubMed), we identified 631 potential results. Following abstract and title screening, and then full text eligibility assessment, we identified 34 published articles that matched our criteria. We then analysed the full text for these articles inductively and deductively, coding ethical issues into three main categories. These categories were autonomy and data privacy, data quality, and intellectual property. We also analysed the full text of these articles to see what strategies researchers took to resolve these ethical issues, as well as any legal implications raised. Following this analysis, our discussion provides recommendations for researchers who wish to integrate citizen scientists and Internet of Things devices into their research. First, all citizen science projects should integrate a data privacy protocol to protect the confidentiality of participants. Secondly, scientific researchers should consider any potential issues of data quality, including whether compromises might be required, before establishing a project. Finally, all intellectual property issues should be clarified both at the start of the project and during its lifecycle. Researchers should also consider any ethical issues that might flow from the use of commercially available Internet of Things devices for research.	Translated: 2023-05-09 03:05:43 Published: 2021-02-02
# Continuous narrowband lasing with coherently driven V-level atoms ( http://arxiv.org/abs/2007.12522v2 ) License: check the linked page	Christoph Hotter, David Plankensteiner, Helmut Ritsch	Simultaneous strong coherent pumping of the two transitions of a V-level atom with very different decay rates has been predicted to create almost perfect inversion on the narrower transition. Using the example of the blue and red transitions in Strontium we show that for suitable operating conditions the corresponding resonant gain can be used to continuously operate a laser on the narrow transition. In particular, for a strong detuning of the pump field with respect to the narrow transition, coherent laser emission occurs close to the bare atomic transition frequency exhibiting only a negligible contribution from coherent pump light scattered into the lasing mode. Calculations of the cavity output spectrum show that the resulting laser linewidth can get much smaller than the bandwidth of the pump light and even the natural linewidth of the narrow atomic transition. Its frequency is closely tied to the atomic transition frequency for properly chosen atom numbers. Simulations including atomic motion show Doppler cooling on the strong transition with minor motion heating on the lasing transition, so that continuous laser operation in the presence of a magneto-optical trap should be possible with current experimental technology.	Translated: 2023-05-08 08:29:26 Published: 2021-02-02
# Quantum system dynamics with a weakly nonlinear Josephson junction bath ( http://arxiv.org/abs/2008.08052v2 ) License: check the linked page	Jing Yang, Étienne Jussiau, Cyril Elouard, Karyn Le Hur, and Andrew N. Jordan	We investigate the influence of a weakly nonlinear Josephson bath consisting of a chain of Josephson junctions on the dynamics of a small quantum system (LC oscillator). Focusing on the regime where the charging energy is the largest energy scale, we perturbatively calculate the correlation function of the Josephson bath to the leading order in the Josephson energy divided by the charging energy while keeping the cosine potential exactly. When the variation of the charging energy along the chain ensures fast decay of the bath correlation function, the dynamics of the LC oscillator that is weakly and capacitively coupled to the Josephson bath can be solved through the Markovian master equation. We establish a duality relation for the Josephson bath between the regimes of large charging and Josephson energies respectively. The results can be applied to cases where the charging energy either is nonuniformly engineered or disordered in the chain. Furthermore, we find that the Josephson bath may become non-Markovian when the temperature is increased beyond the zero-temperature limit in that the bath correlation function gets shifted by a constant and does not decay with time.	Translated: 2023-05-05 22:47:47 Published: 2021-02-02
# Quantum illumination with multiple entangled photons ( http://arxiv.org/abs/2008.09455v4 ) License: check the linked page	Ricardo Gallego Torromé, Nadya Ben Bekhti-Winkel and Peter Knott	In this work, a theoretical generalization of Lloyd's quantum illumination to signal beams described by two entangled photon states is developed. It is shown that the new protocol offers a method to find the range of the target, reduces the size of the time-bandwidth product required to have the same signal-to-noise ratio as in Lloyd's quantum illumination, has a lower probability of false positive and is resilient against noise and also potentially against losses. However, the generation of the required three photon states for the protocol poses a technical problem for its practical implementation not fully addressed. Recent advances in triple photon generation that can overcome this problem are discussed. Other issues related with the protocol are also considered.	Translated: 2023-05-05 12:07:34 Published: 2021-02-02
# Instantaneous phase synchronization of two decoupled quantum limit-cycle oscillators induced by conditional photon detection ( http://arxiv.org/abs/2009.08286v2 ) License: check the linked page	Yuzuru Kato, Hiroya Nakao	We show that conditional photon detection induces instantaneous phase synchronization between two decoupled quantum limit-cycle oscillators. We consider two quantum van der Pol oscillators without mutual coupling, each with an additional linearly coupled bath, and perform continuous measurement of photon counting on the output fields of the two baths interacting through a beam splitter. It is observed that in-phase or anti-phase coherence of the two decoupled oscillators instantaneously increases after the photon detection and then decreases gradually in the weak quantum regime or quickly in the strong quantum regime until the next photon detection occurs. In the strong quantum regime, quantum entanglement also increases after the photon detection and quickly disappears. We derive the analytical upper bounds for the increases in the quantum entanglement and phase coherence by the conditional photon detection in the quantum limit.	Translated: 2023-05-02 00:09:18 Published: 2021-02-02
# Game theory to enhance stock management of Personal Protective Equipment (PPE) during the COVID-19 outbreak ( http://arxiv.org/abs/2009.11838v3 ) License: check the linked page	Khaled Abedrabboh, Matthias Pilz, Zaid Al-Fagih, Othman S. Al-Fagih, Jean-Christophe Nebel, Luluwah Al-Fagih	Since the outbreak of the COVID-19 pandemic, many healthcare facilities have suffered from shortages in medical resources, particularly in Personal Protective Equipment (PPE). In this paper, we propose a game-theoretic approach to schedule PPE orders among healthcare facilities. In this PPE game, each independent healthcare facility optimises its own storage utilisation in order to keep its PPE cost at a minimum. Such a model can reduce peak demand considerably when applied to a variable PPE consumption profile. Experiments conducted for NHS England regions using actual data confirm that the challenge of securing PPE supply during disasters such as COVID-19 can be eased if proper stock management procedures are adopted. These procedures can include early stockpiling, increasing storage capacities and implementing measures that can prolong the time period between successive infection waves, such as social distancing measures. Simulation results suggest that the provision of PPE dedicated storage space can be a viable solution to avoid straining PPE supply chains in case a second wave of COVID-19 infections occurs.	Translated: 2023-05-01 02:37:17 Published: 2021-02-02
# Blueprint for a Scalable Photonic Fault-Tolerant Quantum Computer ( http://arxiv.org/abs/2010.02905v2 ) License: check the linked page	J. Eli Bourassa, Rafael N. Alexander, Michael Vasmer, Ashlesha Patil, Ilan Tzitrin, Takaya Matsuura, Daiqin Su, Ben Q. Baragiola, Saikat Guha, Guillaume Dauphinais, Krishna K. Sabapathy, Nicolas C. Menicucci, Ish Dhand	Photonics is the platform of choice to build a modular, easy-to-network quantum computer operating at room temperature. However, no concrete architecture has been presented so far that exploits both the advantages of qubits encoded into states of light and the modern tools for their generation. Here we propose such a design for a scalable and fault-tolerant photonic quantum computer informed by the latest developments in theory and technology. Central to our architecture is the generation and manipulation of three-dimensional hybrid resource states comprising both bosonic qubits and squeezed vacuum states. The proposal enables exploiting state-of-the-art procedures for the non-deterministic generation of bosonic qubits combined with the strengths of continuous-variable quantum computation, namely the implementation of Clifford gates using easy-to-generate squeezed states. Moreover, the architecture is based on two-dimensional integrated photonic chips used to produce a qubit cluster state in one temporal and two spatial dimensions. By reducing the experimental challenges as compared to existing architectures and by enabling room-temperature quantum computation, our design opens the door to scalable fabrication and operation, which may allow photonics to leap-frog other platforms on the path to a quantum computer with millions of qubits.	Translated: 2023-04-29 20:13:56 Published: 2021-02-02
# tqix: A toolbox for Quantum in X: Quantum measurement, quantum tomography, quantum metrology, and others ( http://arxiv.org/abs/2010.03731v2 ) License: check the linked page	Le Bin Ho, Kieu Quang Tuan, Hung Q. Nguyen	We present an open-source computer program written in the Python language for quantum measurement and related issues. In our program, quantum states and operators, including quantum gates, can be developed into a quantum-object function represented by a matrix. Built into the program are several measurement schemes, including von Neumann measurement and weak measurement. Various numerical simulation methods are used to mimic the real experiment results. We first provide an overview of the program structure and then discuss the numerical simulation of quantum measurement. We illustrate the program's performance via quantum state tomography and quantum metrology. The program is built in a general language of quantum physics and thus is widely adaptable to various physical platforms, such as quantum optics, ion traps, superconducting circuit devices, and others. It is also ideal to use in classroom guidance with simulation and visualization of various quantum systems.	Translated: 2023-04-29 15:49:19 Published: 2021-02-02
# Dynamical Self-energy Mapping (DSEM) for quantum computing ( http://arxiv.org/abs/2010.05441v2 ) License: check the linked page	Diksha Dhawan, Mekena Metcalf, Dominika Zgid	For noisy intermediate-scale quantum (NISQ) devices only a moderate number of qubits with a limited coherence is available, thus enabling only shallow circuits and a few time evolution steps in the currently performed quantum computations. Here, we present how to bypass this challenge in practical molecular chemistry simulations on NISQ devices by employing a classical-quantum hybrid algorithm allowing us to produce a sparse Hamiltonian which contains only $\mathcal{O}(n^2)$ terms in a Gaussian orbital basis when compared to the $\mathcal{O}(n^4)$ terms of a standard Hamiltonian, where $n$ is the number of orbitals in the system. The classical part of this hybrid entails parameterization of the sparse, fictitious Hamiltonian in such a way that it recovers the self-energy of the original molecular system. The quantum machine then uses this fictitious Hamiltonian to calculate the self-energy of the system. We show that the developed hybrid algorithm yields very good total energies for small molecular test cases while reducing the depth of the quantum circuit by at least an order of magnitude when compared with simulations involving a full Hamiltonian.	Translated: 2023-04-29 07:33:58 Published: 2021-02-02
# MAXCUT QAOA performance guarantees for p > 1 ( http://arxiv.org/abs/2010.11209v2 ) License: check the linked page	Jonathan Wurtz, Peter J. Love	We obtain worst case performance guarantees for $p=2$ and $3$ QAOA for MAXCUT on uniform 3-regular graphs. Previous work by Farhi et al. obtained a lower bound on the approximation ratio of $0.692$ for $p=1$. We find a lower bound of $0.7559$ for $p=2$, where worst case graphs are those with no cycles $\leq 5$. This bound holds for any 3-regular graph evaluated at particular fixed parameters. We conjecture a hierarchy for all $p$, where worst case graphs have no cycles $\leq 2p+1$. Under this conjecture, the approximation ratio is at least $0.7924$ for all 3-regular graphs and $p=3$. In addition, using a simple indistinguishability argument we find an upper bound on the worst case approximation ratio for all $p$, which indicates classes of graphs for which there can be no quantum advantage for at least $p<6$.	Translated: 2023-04-28 03:01:13 Published: 2021-02-02
# 光格子におけるスピン軌道結合原子の指向性原子運動と二階トンネル制御 Controlling directed atomic motion and second-order tunneling of a spin-orbit-coupled atom in optical lattices ( http://arxiv.org/abs/2011.01399v2 ) ライセンス: Link先を確認	Xiaobing Luo, Zhao-Yun Zeng, Yu Guo, Baiyuan Yang, Jinpeng Xiao, Lei Li, Chao Kong, and Ai-Xi Chen	(参考訳) 格子揺らぎと時間周期ゼーマン場を受ける光学格子に閉じ込められた単一スピン軌道結合原子の密結合(TB)モデルに対するトンネル力学を理論的に探求する。解析的および数値的手法により、スピン軌道結合(SO)が多光子共鳴および遠方共振パラメータ系におけるトンネル力学にいくつかの新しい結果をもたらすことを示した。 When the driving frequency is resonant with the static Zeeman field (multi-photon resonances), we obtain an unexpected new dynamical localization (DL) phenomenon where the single SO-coupled atom is restricted to making perfect two-site Rabi oscillation accompanied by spin flipping.By using the unconventional DL phenomenon, we are able to generate a ratchetlike effect which enables directed atomic motion towards different directions and accompanies periodic spin-flipping under the action of SO coupling. 遠方共振の場合,通常のサイト間トンネルのみを抑えることで,SO結合のない従来の格子系ではアクセスできない次熱処理部位間でのスピン保存2次トンネルの実現が可能であることを示す。また, クエージーの平坦性(崩壊)と完全に凍結するダイナミクスの存在には, 通常の現場間トンネルとSO結合関連2次トンネルの同時制御が必要であることを示す。これらの結果はスピンベースの量子情報処理や新しいスピントロニクスデバイスの設計といった潜在的な応用に関係している可能性がある。 We theoretically explore the tunneling dynamics for the tight-binding (TB) model of a single spin-orbit-coupled atom trapped in an optical lattice subjected to lattice shaking and to time-periodic Zeeman field. By means of analytical and numerical methods, we demonstrate that the spin-orbit (SO) coupling adds some new results to the tunneling dynamics in both multiphoton resonance and far-off-resonance parameter regimes. 
When the driving frequency is resonant with the static Zeeman field (multi-photon resonances), we obtain an unexpected new dynamical localization (DL) phenomenon where the single SO-coupled atom is restricted to making perfect two-site Rabi oscillations accompanied by spin flipping. By using the unconventional DL phenomenon, we are able to generate a ratchetlike effect which enables directed atomic motion towards different directions and is accompanied by periodic spin flipping under the action of SO coupling. For the far-off-resonance case, we show that by suppressing the usual inter-site tunneling alone, it is possible to realize a type of spin-conserving second-order tunneling between next-nearest-neighboring sites, which is not accessible in the conventional lattice system without SO coupling. We also show that simultaneous control of the usual inter-site tunneling and the SO-coupling-related second-order tunneling is necessary for quasienergy flatness (collapse) and completely frozen dynamics to exist. These results may be relevant to potential applications such as spin-based quantum information processing and design of novel spintronics devices.	翻訳日:2023-04-25 11:56:59 公開日:2021-02-02
# 3次元ラシュバヘテロ構造における非定常リフシッツ転移における多重ギャップ超伝導 Multigaps superconductivity at unconventional Lifshitz transition in a 3D Rashba heterostructure at atomic limit ( http://arxiv.org/abs/2011.02311v2 ) ライセンス: Link先を確認	Vittoria Mazziotti, Antonio Valletta, Roberto Raimondi, Antonio Bianconi	(参考訳) 複数の量子サブバンドからなる電子スペクトルを持つ原子層の超格子からなる原子限界(HAL)におけるマルチギャップ超伝導3次元ヘテロ構造の臨界温度は、異なるギャップ間の接触交換相互作用によって駆動される形状共鳴によって増幅できることはよく知られている。この$t_c$増幅は、首を開くためのリフシッツ遷移において特異節点付近のフェルミ準位をチューニングする。近年、リニアインモーメントスピン軌道誘起スピンスプリッティング(Rashba spin-orbit coupling (RSOC))と呼ばれる3次元層状金属の反転対称性の破れに高い関心が寄せられている。しかし、RSOCが非BCS状態にある3D HALにおけるリフシッツ転移に近い多ギャップ超伝導の物理は知られていない。ボゴリューボフ理論による超伝導ギャップとディラック方程式の解による3次元電子波動関数を得るための重要な成果は、スピン軌道長と3次元超格子周期を適切に一致させることで、マルチギャップ超伝導をチューニングできることである。フェルミエネルギーが円ノルダル線付近で調整された場合、rsocの存在はk依存性の異方性ギャップ関数と臨界温度の両方を増幅する。本研究は,超格子変調パラメータのチューニングにより,超格子超伝導体におけるRSOCの効果を,既存の実験プラットフォームにおけるスピントロニクス機能や量子コンピューティングに必要な調整可能な材料で効果的に変化させる手法を提案する。 It is well known that the critical temperature of multi-gap superconducting 3D heterostructures at atomic limit (HAL) made of a superlattice of atomic layers with an electron spectrum made of several quantum subbands can be amplified by a shape resonance driven by the contact exchange interaction between different gaps. The $T_C$ amplification is achieved tuning the Fermi level near the singular nodal point at a Lifshitz transition for opening a neck. Recently high interest has been addressed to the breaking of inversion symmetry which leads to a linear-in-momentum spin-orbit induced spin splitting, universally referred to as Rashba spin-orbit coupling (RSOC) also in 3D layered metals. However the physics of multi-gap superconductivity near unconventional Lifshitz transitions in 3D HAL with RSOC, being in a non-BCS regime, is not known. 
The key result of this work, which obtains the superconducting gaps from Bogoliubov theory and the 3D electron wave functions by solving the Dirac equation, is the feasibility of tuning multi-gap superconductivity by suitably matching the spin-orbit length with the 3D superlattice period. It is found that the presence of the RSOC amplifies both the k-dependent anisotropic gap function and the critical temperature when the Fermi energy is tuned near the circular nodal line. Our results suggest a method to effectively vary the effect of RSOC on macroscopic superconductor condensates via the tuning of the superlattice modulation parameter in a way potentially relevant for spintronics functionalities in several existing experimental platforms and tunable materials needed for quantum devices for quantum computing.	翻訳日:2023-04-25 11:29:25 公開日:2021-02-02
# 複素平面における摂動理論:例外点とどの点を見つけるか Perturbation Theory in the Complex Plane: Exceptional Points and Where to Find Them ( http://arxiv.org/abs/2012.03688v2 ) ライセンス: Link先を確認	Antoine Marie and Hugh G. A. Burton and Pierre-Fran\c{c}ois Loos	(参考訳) 複素平面における量子化学の非エルミート拡大と摂動論との関係を考察する。量子系の物理学は、例外点として知られる複素値エネルギー特異点の位置と密接な関係にあることを観測する。平均場Hartree-Fock近似やRayleigh--Schr\odinger摂動理論を含む複素平面における非エルミート量子化学の基本概念を提示した後、特異点の物理学で実施された様々な研究活動の歴史的概要を提供する。特に、M{\o}ller--Plesset摂動論において得られる摂動級数の収束挙動とその量子相転移との関係について、基礎研究を取り上げ、収束と発散の両方の場合のM{\o}ller--Plesset摂動級数の全体的な精度を改善するためのいくつかの再仮定手法(Pad\'e や2次近似等)についても論じる。これらの各点は半充填のハバードディマーを用いて図示され、複素平面における解析的連続摂動理論の微妙な性質を理解するための汎用モデルであることが証明される。 We explore the non-Hermitian extension of quantum chemistry in the complex plane and its link with perturbation theory. We observe that the physics of a quantum system is intimately connected to the position of complex-valued energy singularities, known as exceptional points. After presenting the fundamental concepts of non-Hermitian quantum chemistry in the complex plane, including the mean-field Hartree--Fock approximation and Rayleigh--Schr\"odinger perturbation theory, we provide a historical overview of the various research activities that have been performed on the physics of singularities. In particular, we highlight seminal work on the convergence behaviour of perturbative series obtained within M{\o}ller--Plesset perturbation theory, and its links with quantum phase transitions. We also discuss several resummation techniques (such as Pad\'e and quadratic approximants) that can improve the overall accuracy of the M{\o}ller--Plesset perturbative series in both convergent and divergent cases. Each of these points is illustrated using the Hubbard dimer at half filling, which proves to be a versatile model for understanding the subtlety of analytically-continued perturbation theory in the complex plane.	翻訳日:2023-04-21 21:04:13 公開日:2021-02-02
# 位相不整合のない方式による高温原子蒸気からのサブMHz・高スペクトル輝度バイフォトンの生成 Generation of sub-MHz and spectrally-bright biphotons from hot atomic vapors with a phase mismatch-free scheme ( http://arxiv.org/abs/2012.04893v3 ) ライセンス: Link先を確認	Chia-Yu Hsu, Yu-Sheng Wang, Jia-Mou Chen, Fu-Chen Huang, Yi-Ting Ke, Emily Kay Huang, Weilun Hung, Kai-Lin Chao, Shih-Si Hsiao, Yi-Hsin Chen, Chih-Sung Chuu, Ying-Cheng Chen, Yong-Fan Chen, and Ite A. Yu	(参考訳) 高温原子蒸気からバイフォトンを生成するために、位相整合条件を維持する全同方向伝搬方式を自発的四光波混合(SFWM)過程で利用した。この方式により、我々のバイフォトンは従来の高温原子SFWMの研究を上回るだけでなく、冷却原子SFWMや共振器支援の自発的パラメトリック下方変換によって生成されるバイフォトンとも競合できる。この研究におけるバイフォトン線幅は1桁にわたって調整可能である。線幅を610 kHzに調整すると、バイフォトンの最大2光子相関関数$g_{s,as}^{(2)}$は42であった。この$g_{s,as}^{(2)}$は古典光に対するコーシー=シュワルツの不等式を440倍破っており、バイフォトンが高い純度を持つことを示している。610 kHzのバイフォトン源の線幅当たりの生成率は1500対/(s$\cdot$MHz)であり、これは文献における全てのサブMHzバイフォトン源の中で最良の結果である。ポンプパワーを16倍に増やすことで、線幅当たりの生成率をさらに2.3$\times$10$^4$対/(s$\cdot$MHz)に向上させ、最大$g_{s,as}^{(2)}$は6.7になった。さらに、線幅を290$\pm$20 kHzまで調整することができる。これは、様々な種類のシングルモードバイフォトンの中で、これまでで最も狭い線幅である。 We utilized the all-copropagating scheme, which maintains the phase-match condition, in the spontaneous four-wave mixing (SFWM) process to generate biphotons from a hot atomic vapor. The scheme enables our biphotons not only to surpass those in the previous works of hot-atom SFWM, but also to compete with the biphotons that are generated by either the cold-atom SFWM or the cavity-assisted spontaneous parametric down conversion. The biphoton linewidth in this work is tunable over an order of magnitude. When we tuned the linewidth to 610 kHz, the maximum two-photon correlation function, $g_{s,as}^{(2)}$, of the biphotons was 42. This $g_{s,as}^{(2)}$ violates the Cauchy-Schwarz inequality for classical light by a factor of 440, and demonstrates that the biphotons have a high purity. The generation rate per linewidth of the 610-kHz biphoton source is 1,500 pairs/(s$\cdot$MHz), which is the best result of all the sub-MHz biphoton sources in the literature. 
By increasing the pump power 16-fold, we further enhanced the generation rate per linewidth to 2.3$\times$10$^4$ pairs/(s$\cdot$MHz), while the maximum $g_{s,as}^{(2)}$ became 6.7. In addition, we are able to tune the linewidth down to 290$\pm$20 kHz. This is the narrowest linewidth to date among all kinds of single-mode biphotons.	翻訳日:2023-04-21 08:16:53 公開日:2021-02-02
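The violation factor of about 440 quoted in this entry can be reproduced with a short calculation. The sketch below assumes the usual form of the Cauchy-Schwarz test for classical light, $R = [g_{s,as}^{(2)}]^2 / (g_{s,s}^{(2)}\, g_{as,as}^{(2)}) \le 1$, and further assumes ideal thermal auto-correlations of 2 for both fields. Neither auto-correlation value appears in the abstract, so the defaults and the function name are illustrative only.

```python
def cauchy_schwarz_violation(g_cross, g_auto_stokes=2.0, g_auto_anti_stokes=2.0):
    """Return R = g_cross**2 / (g_auto_stokes * g_auto_anti_stokes).

    Classical light satisfies R <= 1. The default auto-correlations of 2
    are an assumption (ideal thermal statistics), not measured values.
    """
    return g_cross ** 2 / (g_auto_stokes * g_auto_anti_stokes)


if __name__ == "__main__":
    print(cauchy_schwarz_violation(42.0))  # 441.0, consistent with "a factor of 440"
    print(cauchy_schwarz_violation(6.7))   # the higher-pump-power case
```

With measured auto-correlations in place of the assumed value of 2, the factor would shift accordingly.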
# マルチパーティ$q$予測量子相関の強多元性 Strong polygamy of multi-party $q$-expected quantum correlations ( http://arxiv.org/abs/2101.05416v2 ) ライセンス: Link先を確認	Jeong San Kim	(参考訳) マルチパーティ量子相関の多元的性質は, tsallis $q$-エントロピーと$q$-expectation値に基づいて, {\em strong} 形式で特徴づけられることを示した。マルチパーティシステムに分散できる絡み合いの量を考えることで、Tsallis $q$-entropy と$q \geq 1$ の$q$-expectation という観点で、多パーティ絡み合いの強いポリガミー不等式のクラスを確立する。我々の新しい不等式クラスは、実際には、多元的絡み合いの通常の多元的不等式よりも厳密であり、その厳密性は例によって明確に示される。さらに、我々の新しい不等式クラスは、1つの党と他の党の任意の部分集合の間に分配される$q$-expected entanglementに関するものであるが、通常のポリガミー不等式は1つの党と他の党の間の絡みについてのみ考慮する。さらに、量子エンタングルメントの強いポリガミー不等式と、マルチパーティ量子システムに分布する量子不協和の同値性を確立する。 We show that the polygamous nature of multi-party quantum correlations can be characterized in a {\em stronger} form based on Tsallis $q$-entropy and $q$-expectation value. By considering the amount of entanglement that can be distributed in multi-party systems, we establish a class of strong polygamy inequalities of multi-party entanglement in terms of Tsallis $q$-entropy and $q$-expectation for $q \geq 1$. Our new class of inequalities is in fact tighter than the usual polygamy inequalities of multi-party entanglement, and the tightness is explicitly illustrated by an example. Moreover, our new class of inequalities is concerned with the $q$-expected entanglement distributed between a single party and any possible subsets of the rest parties whereas the usual polygamy inequality only considers the entanglement between a single party and another. We further establish the equivalence between strong polygamy inequalities of quantum entanglement and quantum discord distributed in multi-party quantum systems.	翻訳日:2023-04-15 05:29:10 公開日:2021-02-02
# JTrack:神経疾患と精神疾患の遠隔監視のためのデジタルバイオマーカープラットフォーム JTrack: A Digital Biomarker Platform for Remote Monitoring in Neurological and Psychiatric Diseases ( http://arxiv.org/abs/2101.10091v3 ) ライセンス: Link先を確認	Mehran Sahandi Far, Michael Stolz, Jona M. Fischer, Simon B. Eickhoff, Juergen Dukart	(参考訳) 目的: スマートフォンが収集する健康関連データは、気候評価に有望な補完的アプローチを提供する。ここでは、JTrackプラットフォームを、日常およびデジタル表現型におけるリモート監視のためのセキュアで信頼性が高く拡張可能なオープンソースソリューションとして紹介する。方法: JTrackはAndroidベースのスマートフォンアプリケーションとWebベースのプロジェクト管理ダッシュボードで構成されている。モーションセンサー、社会活動、身体活動、位置情報からの幅広い匿名化計測は、アクティブモードまたはパッシブモードで収集することができる。ダッシュボードはまた、研究間でのデータ収集を監視および管理するための管理ツールも提供する。スケーリング、再現性、データ管理、共有を容易にするために、DataLadをデータ管理インフラストラクチャとして統合しました。 JTrackは、セキュリティ、プライバシ、一般データ保護規則(GDPR)要件を満たすために開発された。結果: JTrackは、神経学、精神医学、その他の指標におけるデジタルバイオマーカー(DB)のリモート評価のための、オープンソース(オープンソースApache 2.0ライセンス下でリリースされている)プラットフォームである。 JTrackプラットフォームの主要なコンポーネントと、JTrackを使って収集されるデータの例を以下に示す。結論: スマートフォンベースのデジタルバイオマーカーデータは、健康と病気の日常生活行動に関する貴重な洞察を提供する可能性がある。 JTrackは、そのようなデータの収集のための簡単で信頼性の高いオープンソースソリューションを提供する。 Objective: Health-related data being collected by smartphones offer a promising complementary approach to in-clinic assessments. Here we introduce the JTrack platform as a secure, reliable and extendable open-source solution for remote monitoring in daily-life and digital phenotyping. Method: JTrack consists of an Android-based smartphone application and a web-based project management dashboard. A wide range of anonymized measurements from motion-sensors, social and physical activities and geolocation information can be collected in either active or passive modes. The dashboard also provides management tools to monitor and manage data collection across studies. To facilitate scaling, reproducibility, data management and sharing we integrated DataLad as a data management infrastructure. JTrack was developed to comply with security, privacy and the General Data Protection Regulation (GDPR) requirements. 
Results: JTrack is an open-source platform (released under the Apache 2.0 license) for remote assessment of digital biomarkers (DB) in neurological, psychiatric and other indications. The main components of the JTrack platform and examples of data being collected using JTrack are presented here. Conclusion: Smartphone-based Digital Biomarker data may provide valuable insight into daily life behaviour in health and disease. JTrack provides an easy and reliable open-source solution for collection of such data.	翻訳日:2023-04-14 21:02:46 公開日:2021-02-02
# 審判もプレーヤーである場合:eコマースマーケットプレースにおけるプライベートラベル製品推奨のバイアス When the Umpire is also a Player: Bias in Private Label Product Recommendations on E-commerce Marketplaces ( http://arxiv.org/abs/2102.00141v2 ) ライセンス: Link先を確認	Abhisek Dash, Abhijnan Chakraborty, Saptarshi Ghosh, Animesh Mukherjee, Krishna P. Gummadi	(参考訳) アルゴリズムリコメンデーションは、amazonのような大手eコマースマーケットプレースで何百万もの顧客と製品(その生産者と販売者)のやりとりを仲介する。近年、生産者と販売業者は、これらの市場に展開されるブラックボックスレコメンデーションアルゴリズムの公平性を懸念している。多くの苦情は、アルゴリズムが競合製品よりも独自の「プライベートラベル」製品を優先的に好むように偏っている市場に集中している。これらの懸念は、マーケットプレースが'organic'レコメンデーションを広告主導の'sponsored'レコメンデーション(独自のプライベートレーベルを含む)に強調または置き換えるにつれて悪化している。これらの懸念は広く報道され、規制当局による調査が生まれてきたが、われわれの知る限り、これらのマーケットプレースアルゴリズムの公開監査は行われていない。本研究では,アマゾンの商品推薦に関するエンドツーエンドの体系的な監査を行うことにより,このギャップを埋める。提案するネットワーク中心のフレームワークは,有機的およびスポンサーによる推奨項目間のバイアスを定量化し,比較する。提案されている多くのバイアス対策に従って、スポンサードレコメンデーションは、オーガニックレコメンデーションよりもamazon private label製品にかなり偏っていることが分かりました。私たちの発見は、主にAmazonのプロデューサや売り手にとって興味深いものですが、提案されたバイアス測定は一般的に、ソーシャルやコンテンツネットワークにおけるリンク形成バイアスを測定するのに役立ちます。 Algorithmic recommendations mediate interactions between millions of customers and products (in turn, their producers and sellers) on large e-commerce marketplaces like Amazon. In recent years, the producers and sellers have raised concerns about the fairness of black-box recommendation algorithms deployed on these marketplaces. Many complaints are centered around marketplaces biasing the algorithms to preferentially favor their own `private label' products over competitors. These concerns are exacerbated as marketplaces increasingly de-emphasize or replace `organic' recommendations with ad-driven `sponsored' recommendations, which include their own private labels. While these concerns have been covered in popular press and have spawned regulatory investigations, to our knowledge, there has not been any public audit of these marketplace algorithms. In this study, we bridge this gap by performing an end-to-end systematic audit of related item recommendations on Amazon. 
We propose a network-centric framework to quantify and compare the biases across organic and sponsored related item recommendations. Along a number of our proposed bias measures, we find that the sponsored recommendations are significantly more biased toward Amazon private label products compared to organic recommendations. While our findings are primarily interesting to producers and sellers on Amazon, our proposed bias measures are generally useful for measuring link formation bias in any social or content networks.	翻訳日:2023-04-13 06:57:41 公開日:2021-02-02
# ダイヤモンド中の遠心対称量子エミッタに対するスターク効果の研究 Investigation of the Stark Effect on a Centrosymmetric Quantum Emitter in Diamond ( http://arxiv.org/abs/2102.01322v1 ) ライセンス: Link先を確認	Lorenzo De Santis, Matthew Trusheim, Kevin Chen, Dirk Englund	(参考訳) ダイヤモンドの量子エミッターは、光学的にアクセス可能な固体量子ビットである。これらのうち、グループIV空孔欠陥中心は、長寿命スピン状態に対するコヒーレントで安定な光学界面として大きな関心を集めている。理論は、それらの反転対称性が、任意のホスト物質における光学的コヒーレンスに対する共通の制限である成層電界に対する一階不感性をもたらすことを示している。ここでは, ダイヤモンド中の個々のスズ空隙(snv)中心に適用した外部電場を用いて, この電界依存性を実験的に定量化する。これらの測定により、永久電気双極子モーメントと偏光性は、ダイヤモンド中の第iv族欠陥の反転対称性保護の最初の直接測定であるダイヤモンド窒素空隙(nv)中心よりも少なくとも4桁小さいことが判明した。さらに、電場誘起双極子を変調することにより、snvを局所電界ノイズのナノスケールプローブとして使用できることを示すとともに、この手法を用いてsnvに対するスペクトル拡散の影響を強調する。 Quantum emitters in diamond are leading optically-accessible solid-state qubits. Among these, Group IV-vacancy defect centers have attracted great interest as coherent and stable optical interfaces to long-lived spin states. Theory indicates that their inversion symmetry provides first-order insensitivity to stray electric fields, a common limitation for optical coherence in any host material. Here we experimentally quantify this electric field dependence via an external electric field applied to individual tin-vacancy (SnV) centers in diamond. These measurements reveal that the permanent electric dipole moment and polarizability are at least four orders of magnitude smaller than for the diamond nitrogen vacancy (NV) centers, representing the first direct measurement of the inversion symmetry protection of a Group IV defect in diamond. Moreover, we show that by modulating the electric-field-induced dipole we can use the SnV as a nanoscale probe of local electric field noise, and we employ this technique to highlight the effect of spectral diffusion on the SnV.	翻訳日:2023-04-13 00:51:58 公開日:2021-02-02
# ai開発におけるグローバル包摂の限界 The Limits of Global Inclusion in AI Development ( http://arxiv.org/abs/2102.01265v1 ) ライセンス: Link先を確認	Alan Chan and Chinasa T. Okolo and Zachary Terner and Angelina Wang	(参考訳) 人工知能(AI)システムの普及から利益を得るための最善策は、最も経済的な力を持つ人々だ。現存する世界的な不平等は、西側の機関がより多様なグループをaiシステムの開発と応用に巻き込み、外国人労働者の雇用や海外のデータセンターや研究所の設立などを行っている。しかし、富の優越性と、トップダウンのAIソリューションにおける文脈知識の欠如の両方を考えると、不足しているグループを含めるだけでなく、権力の再分配にもっと注力すべきだと私たちは主張する。 ai開発をリードする機会が公平に分配されることを保証するためにそれ以上のことがなければ、将来は、その適用状況に不適合なaiシステムのみを保持し、不平等を悪化させる可能性がある。 Those best-positioned to profit from the proliferation of artificial intelligence (AI) systems are those with the most economic power. Extant global inequality has motivated Western institutions to involve more diverse groups in the development and application of AI systems, including hiring foreign labour and establishing extra-national data centers and laboratories. However, given both the propensity of wealth to abet its own accumulation and the lack of contextual knowledge in top-down AI solutions, we argue that more focus should be placed on the redistribution of power, rather than just on including underrepresented groups. Unless more is done to ensure that opportunities to lead AI development are distributed justly, the future may hold only AI systems which are unsuited to their conditions of application, and exacerbate inequality.	翻訳日:2023-04-13 00:51:26 公開日:2021-02-02
# スピン増幅器を用いたアクシオン様暗黒物質の探索 Search for axion-like dark matter with spin-based amplifiers ( http://arxiv.org/abs/2102.01448v1 ) ライセンス: Link先を確認	Min Jiang, Haowen Su, Antoine Garcon, Xinhua Peng, Dmitry Budker	(参考訳) 超軽量アクシオン様粒子(ultralight axion-like particles、ALPs)は、標準模型を超える理論によって導入された、十分な動機を持つ暗黒物質候補である。しかし、既存の実験室実験によるALPの存在に対する制約は、通常は天体物理学的な限界よりも弱い現在の感度によって妨げられている。ここでは、8.3 feVから744 feVまでの約2桁にわたる質量範囲でALPを探索する新しい量子センサを実証する。センサは超偏極した長寿命核スピンを前置増幅器として利用し、コヒーレントに振動するアクシオン様暗黒物質場を100倍以上に増強する。スピンベース増幅器を用いて18 fT/Hz$^{1/2}$の超高磁気感度を達成し、これは最先端の核スピン磁力計よりもはるかに優れている。我々の実験は、この質量範囲にわたってALPと核子の結合を記述するパラメータ空間を制約し、67.5 feVにおいて$2.9\times 10^{-9}~\textrm{GeV}^{-1}$($95\%$信頼水準)に達し、以前の実験室限界を少なくとも5桁改善した。我々の測定はまた、ALP-核子2次相互作用とダークフォトン-核子相互作用を、天体物理学的限界を超える新たな限界で制約している。 Ultralight axion-like particles (ALPs) are well-motivated dark matter candidates introduced by theories beyond the standard model. However, the constraints on the existence of ALPs through existing laboratory experiments are hindered by their current sensitivities, which are usually weaker than astrophysical limits. Here, we demonstrate a new quantum sensor to search for ALPs in the mass range that spans about two decades from 8.3 feV to 744 feV. Our sensor makes use of hyperpolarized long-lived nuclear spins as a pre-amplifier that effectively enhances coherently oscillating axion-like dark-matter field by a factor of >100. Using spin-based amplifiers, we achieve an ultrahigh magnetic sensitivity of 18 fT/Hz$^{1/2}$, which is significantly better than state-of-the-art nuclear-spin magnetometers. Our experiment constrains the parameter space describing the coupling of ALPs to nucleons over our mass range, at 67.5 feV reaching $2.9\times 10^{-9}~\textrm{GeV}^{-1}$ ($95\%$ confidence level), improving over previous laboratory limits by at least five orders of magnitude. Our measurements also constrain the ALP-nucleon quadratic interaction and dark photon-nucleon interaction with new limits beyond the astrophysical ones.	翻訳日:2023-04-13 00:48:20 公開日:2021-02-02
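To put the quoted mass window in experimental terms, the sketch below converts an ALP mass to the frequency at which the corresponding field would oscillate, using the standard relation $f = mc^2/h$, and checks the "about two decades" statement. The conversion is textbook physics rather than something stated in the abstract, and the function names are hypothetical.

```python
import math

# Planck constant in eV*s, from the exact SI values of h and e.
PLANCK_EV_S = 4.135667696e-15


def compton_frequency_hz(mass_fev):
    """Oscillation frequency f = m c^2 / h for a mass-energy given in feV."""
    return mass_fev * 1e-15 / PLANCK_EV_S


def decades_spanned(low, high):
    """Number of factors of ten between two positive values."""
    return math.log10(high / low)


if __name__ == "__main__":
    print(compton_frequency_hz(8.3))    # roughly 2 Hz
    print(compton_frequency_hz(744.0))  # roughly 180 Hz
    print(decades_spanned(8.3, 744.0))  # just under 2
```

On this reading, the search window corresponds to signals between a few hertz and a few hundred hertz.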
# 量子ホーリズム Quantum Holism ( http://arxiv.org/abs/2102.01438v1 ) ライセンス: Link先を確認	Giacomo Mauro D'Ariano	(参考訳) 複合量子系は、その部分のあらゆる性質と両立しない性質を持つ。すべての局所的性質と両立しないこのような大域的性質の存在が、私が「メレオロジー的ホーリズム(mereological holism)」と呼ぶもの、すなわち量子論に特有のホーリズムを構成する。メレオロジー的ホーリズムは、「量子系」を「物理的対象」とする通常の理解を維持できなくするという劇的な概念的帰結をもたらす。合成された対象は、その部分の性質と両立する性質を持つからである。「性質」の概念は、操作的確率論(略してOPT)のクラス全体に一意的に拡張することができ、その最も関連するケースは量子論と古典理論である。古典理論はメレオロジー的にホーリスティックではないが、そうであるような他のOPTを探すことができる。OPTの枠組みにおける「系」の役割は、2つの客観的事象間の入出力接続である。古典理論のような非ホーリスティックな理論では、系は依然として「対象」とみなすことができる。逆に、ホーリスティックな理論では、「系」を「対象」と解釈することは理論的概念の実体化(hypostatization)を構成する。 A composite quantum system has properties that are incompatible with every property of its parts. The existence of such global properties incompatible with all local properties constitutes what I call "mereological holism"--the distinctive holism of Quantum Theory. Mereological holism has the dramatic conceptual consequence of making untenable the usual understanding of the "quantum system" as being a "physical object", since composed objects have properties compatible with those of their parts. The notion of "property" can be extended in a unique way to the whole class of operational probabilistic theories (OPTs for short), of which the most relevant cases are Quantum Theory and Classical Theory. Whereas Classical Theory is not mereologically holistic, we can now search for other OPTs that are so. Within the OPT framework the role of the "system" is that of an input-output connection between two objective events. In non-holistic theories, such as Classical Theory, the system can still be regarded as an "object". On the contrary, in holistic theories interpreting "system" as "object" constitutes a hypostatization of a theoretical notion.	翻訳日:2023-04-13 00:47:53 公開日:2021-02-02
# 埋込み粒子間の中力ファンデルワールス相互作用の効果的なスクリーニング Effective screening of medium-assisted Van der Waals interactions between embedded particles ( http://arxiv.org/abs/2102.01430v1 ) ライセンス: Link先を確認	Johannes Fiedler, Michael Walter, Stefan Yoshi Buhmann	(参考訳) 粒子対の分散相互作用に対する暗黙媒質の影響を論じ,真空に対する補正のための簡単な式を導出した。単一点ガウス二次数は、相互作用粒子の共鳴周波数に近い環境の誘電率二乗により真空ファンデルワールス$c_6$係数が遮蔽されるという直感的な結果をもたらす。この近似は、媒質がこれらの周波数で透明であれば特に適切である。原稿では、一般的に用いられる溶媒、原子、小分子の単純なモデルとパラメータセットを提供する。 The effect of an implicit medium on dispersive interactions of particle pairs is discussed and simple expressions for the correction relative to vacuum are derived. We show that a single point Gauss quadrature leads to the intuitive result that the vacuum van der Waals $C_6$ coefficient is screened by the permittivity squared of the environment evaluated near to the resonance frequencies of the interacting particles. This approximation should be particularly relevant if the medium is transparent at these frequencies. In the manuscript, we provide simple models and sets of parameters for commonly used solvents, atoms and small molecules.	翻訳日:2023-04-13 00:47:36 公開日:2021-02-02
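The screening statement in the entry above can be written out as a short worked relation. This is a sketch under stated assumptions: it uses the standard Casimir-Polder-type frequency integral over the polarizabilities $\alpha_A$, $\alpha_B$ at imaginary frequency $\mathrm{i}\xi$, ignores local-field corrections, and introduces $\bar\xi$ as a hypothetical label for the single quadrature point that the abstract places near the particles' resonance frequencies.

```latex
% Vacuum coefficient: frequency integral over both polarizabilities
C_6^{\mathrm{vac}} \;\propto\; \int_0^\infty \mathrm{d}\xi\;
    \alpha_A(\mathrm{i}\xi)\,\alpha_B(\mathrm{i}\xi)

% Implicit medium: the integrand is divided by the squared permittivity
C_6^{\mathrm{med}} \;\propto\; \int_0^\infty \mathrm{d}\xi\;
    \frac{\alpha_A(\mathrm{i}\xi)\,\alpha_B(\mathrm{i}\xi)}{\varepsilon^2(\mathrm{i}\xi)}

% Single-point quadrature: evaluate the slowly varying factor at \bar\xi
C_6^{\mathrm{med}} \;\approx\; \frac{C_6^{\mathrm{vac}}}{\varepsilon^2(\mathrm{i}\bar\xi)}
```

The last step is only reasonable when $\varepsilon$ varies slowly over the range where the polarizabilities contribute, which matches the abstract's remark that the approximation is most relevant when the medium is transparent at those frequencies.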
# マイクロ波光子カウンタによるスピン検出 Detecting spins with a microwave photon counter ( http://arxiv.org/abs/2102.01415v1 ) ライセンス: Link先を確認	Emanuele Albertinale, Léo Balembois, Eric Billaud, Vishal Ranjan, Daniel Flanigan, Thomas Schenkel, Daniel Estève, Denis Vion, Patrice Bertet, Emmanuel Flurin	(参考訳) 量子エミッタは電磁場を放射することで共鳴照明に応答する。これらの場の一成分は駆動トーンと位相コヒーレントであり、もう一つはインコヒーレントで、自発的に放出された光子からなり蛍光信号を形成する。原子や分子は、光周波数での蛍光によって日常的に検出され、量子技術や顕微鏡に重要な応用がある。一方、スピンは通常、連続波またはパルス磁気共鳴において、ラジオ波またはマイクロ波の周波数でのコヒーレント応答によって検出される。実際、スピンの蛍光検出は、低い自発放出率と、この周波数範囲における単一光子検出器の欠如によって妨げられる。ここでは、超伝導量子デバイスを用いて、マイクロ波周波数およびミリケルビン温度での蛍光によるシリコン中のドナースピンの小さなアンサンブルの検出を実証する。我々は、高Q値かつ小モード体積の超伝導共振器に結合することでスピンの放射減衰率を高め、超伝導量子ビットに基づく新開発のマイクロ波単一光子カウンタにデバイス出力を接続する。少数のスピンの磁気共鳴分光法における新しい手法としての蛍光検出の可能性について考察する。 Quantum emitters respond to resonant illumination by radiating electromagnetic fields. A component of these fields is phase-coherent with the driving tone, while another one is incoherent, consisting of spontaneously emitted photons and forming the fluorescence signal. Atoms and molecules are routinely detected by their fluorescence at optical frequencies, with important applications in quantum technology and microscopy. Spins, on the other hand, are usually detected by their coherent response at radio- or microwave frequencies, either in continuous-wave or pulsed magnetic resonance. Indeed, fluorescence detection of spins is hampered by their low spontaneous emission rate and by the lack of single-photon detectors in this frequency range. Here, using superconducting quantum devices, we demonstrate the detection of a small ensemble of donor spins in silicon by their fluorescence at microwave frequency and millikelvin temperatures. We enhance the spin radiative decay rate by coupling them to a high-quality-factor and small-mode-volume superconducting resonator, and we connect the device output to a newly-developed microwave single-photon counter based on a superconducting qubit. 
We discuss the potential of fluorescence detection as a novel method for magnetic resonance spectroscopy of small numbers of spins.	翻訳日:2023-04-13 00:47:28 公開日:2021-02-02
# 付加ガウス雑音に対する効率良く連結されたボソニック符号 An efficient, concatenated, bosonic code for additive Gaussian noise ( http://arxiv.org/abs/2102.01374v1 ) ライセンス: Link先を確認	Kosuke Fukui and Nicolas C. Menicucci	(参考訳) ボソニック符号は量子情報処理にノイズレジリエンスを提供する。この設定における一般的なノイズは加法ガウス雑音であり、長年の未解決問題は、このノイズチャネルのハッシュバウンドを達成する結合符号を設計することである。ここでは,GKP(Gottesman-Kitaev-Preskill)符号を用いて,残差を処理するために量子パリティ符号と結合した誤り発生量子ビットを検出し,破棄する。本手法は線形時間デコーダを応用し,幅広い量子計算や通信シナリオに適用できる。 Bosonic codes offer noise resilience for quantum information processing. A common type of noise in this setting is additive Gaussian noise, and a long-standing open problem is to design a concatenated code that achieves the hashing bound for this noise channel. Here we achieve this goal using a Gottesman-Kitaev-Preskill (GKP) code to detect and discard error-prone qubits, concatenated with a quantum parity code to handle the residual errors. Our method employs a linear-time decoder and has applications in a wide range of quantum computation and communication scenarios.	翻訳日:2023-04-13 00:46:20 公開日:2021-02-02
# ヘラルドX線光子とビームスプリッタの効率的な相互作用 Efficient Interaction of Heralded X-ray Photons with a Beam Splitter ( http://arxiv.org/abs/2102.01370v1 ) ライセンス: Link先を確認	E. Strizhevsky, D. Borodin, A. Schori, S. Francoual, R. Röhlsberger, and S. Shwartz	(参考訳) 数キロ電子ボルトの伝令付き(heralded)X線光子とビームスプリッタとの効率的な相互作用の実証実験を報告する。ビームスプリッタの出力で測定された伝令付き光子レートは約0.01カウント/sであり、ビームスプリッタがない場合のレートに匹敵する。我々はこのビームスプリッタと光子数分解および光子エネルギー分解検出器を用いて、単一のX線光子が分裂できないことを直接示す。本実験は、量子光学におけるX線の主な利点、すなわち高い忠実度と無視できる背景で実験結果を観測できることを実証する。 We report the experimental demonstration of efficient interaction of multi-kiloelectronvolt heralded x-ray photons with a beam splitter. The measured heralded photon rate at the outputs of the beam splitter is about 0.01 counts/s, which is comparable to the rate in the absence of the beam splitter. We use this beam splitter together with photon-number- and photon-energy-resolving detectors to show directly that single x-ray photons cannot split. Our experiment demonstrates the major advantage of x-rays for quantum optics: the possibility to observe experimental results with high fidelity and with negligible background.	翻訳日:2023-04-13 00:46:10 公開日:2021-02-02
# 非マルコフ量子過程の実験的特徴付け Experimental characterisation of a non-Markovian quantum process ( http://arxiv.org/abs/2102.01327v1 ) ライセンス: Link先を確認	K. Goswami, C. Giarmatzi, C. Monterola, S. Shrapnel, J. Romero, and F. Costa	(参考訳) すべての量子系は環境に結合する。このようなシステム環境相互作用は、異なる時間における量子演算間の時間的相関をもたらし、非マルコフノイズをもたらす。原則として、非マルコフ雑音の完全な特徴付けは、計算的かつ実験的に要求されるマルチタイムプロセス行列のトモグラフィーを必要とする。本稿では,より効率的な解法を提案する。情報理論的な尺度で定量化される非マルコフ性量の推定には, トモグラフィ的不完全測定を用いた機械学習モデルを用いる。我々は、量子光学実験でモデルをテストし、90\%の精度で非マルコビアン測度を予測できる。本実験は,大規模量子コンピュータに現れる非マルコフ雑音を効率的に検出する方法である。 Every quantum system is coupled to an environment. Such system-environment interaction leads to temporal correlation between quantum operations at different times, resulting in non-Markovian noise. In principle, a full characterisation of non-Markovian noise requires tomography of a multi-time processes matrix, which is both computationally and experimentally demanding. In this paper, we propose a more efficient solution. We employ machine learning models to estimate the amount of non-Markovianity, as quantified by an information-theoretic measure, with tomographically incomplete measurement. We test our model on a quantum optical experiment, and we are able to predict the non-Markovianity measure with $90\%$ accuracy. Our experiment paves the way for efficient detection of non-Markovian noise appearing in large scale quantum computers.	翻訳日:2023-04-13 00:45:59 公開日:2021-02-02
# 非エルミートNambu--Jona-Lasinioモデルにおけるフェルミオンと中間子質量生成 Fermion and meson mass generation in non-Hermitian Nambu--Jona-Lasinio models ( http://arxiv.org/abs/2102.01491v1 ) ライセンス: Link先を確認	Alexander Felski and S. P. Klevansky	(参考訳) 相互作用するフェルミオン系に対する非エルミート性の効果について検討する。我々は、非エルミート双線型項を3+1次元Nambu--Jona-Lasinio(NJL)モデルに含めることでこれを行う。2つの可能な双線型修正が$\mathcal{PT}$対称な理論を与える。これは、標準的なNJLモデルが擬ベクトル背景場 $ig \bar\psi\gamma_5 B_\mu \gamma^\mu \psi$ または反対称テンソル背景場 $g \bar\psi F_{\mu\nu}\gamma^\mu \gamma^\nu \psi$ によって拡張される場合に起こる。残る3つの双線型 $ig \bar\psi B_\mu \gamma^\mu \psi$, $ig\bar\psi \gamma_5 \psi$, $ig\bar\psi {1}\psi$ は本質的に反$\mathcal{PT}$対称であり、そのためハミルトニアンは全体としての対称性を持たない。擬ベクトル $ig \bar\psi\gamma_5 B_\mu \gamma^\mu \psi$ とベクトル $ig \bar\psi B_\mu \gamma^\mu \psi$ の組み合わせは、さらにカイラル対称である。したがって、この枠組みでは、非エルミート性、$\mathcal{PT}$対称性、カイラル対称性、およびNJLモデルの2体相互作用の様々な組み合わせが、実の有効フェルミオン質量(対応する修正された無質量自由ディラックモデルにはない特徴)の存在と動的生成、および複合粒子である擬スカラーおよびスカラー中間子モード($\pi$および$\sigma$中間子)の質量に与える影響を調べることができる。その結果、実フェルミオン質量解が存在するために$\mathcal{PT}$対称性は必要ではなく、むしろNJLモデルの2体相互作用が非エルミート双線型の効果に優先することが示された。カイラル対称性の効果は中間子モードにおいて最も明確に現れ、系がカイラル対称であれば、擬スカラーは常にゴールドストーン的である。中間子方程式の第2の解についても論じる。 We investigate the effects of non-Hermiticity on interacting fermionic systems. We do this by including non-Hermitian bilinear terms into the 3+1 dimensional Nambu--Jona-Lasinio (NJL) model. Two possible bilinear modifications give rise to $\mathcal{PT}$-symmetric theories; this happens when the standard NJL model is extended either by a pseudovector background field $ig \bar\psi\gamma_5 B_\mu \gamma^\mu \psi$ or by an antisymmetric-tensor background field $g \bar\psi F_{\mu\nu}\gamma^\mu \gamma^\nu \psi$. The three remaining bilinears are anti-$\mathcal{PT}$-symmetric in nature, $ig \bar\psi B_\mu \gamma^\mu \psi$, $ig\bar\psi \gamma_5 \psi$ and $ig\bar\psi {1}\psi$, so that the Hamiltonian then has no overall symmetry. The pseudovector $ig \bar\psi\gamma_5 B_\mu \gamma^\mu \psi$ and the vector $ig \bar\psi B_\mu \gamma^\mu \psi$ combinations are, in addition, chirally symmetric. 
Thus, within this framework we are able to examine the effects that the various combinations of non-Hermiticity, $\mathcal{PT}$ symmetry, chiral symmetry and the two-body interactions of the NJL model have on the existence and dynamical generation of a real effective fermion mass (a feature which is absent in the corresponding modified massless free Dirac models) as well as on the masses of the composite particles, the pseudoscalar and scalar mesonic modes ($\pi$ and $\sigma$ mesons). Our findings demonstrate that $\mathcal{PT}$ symmetry is not necessary for real fermion mass solutions to exist, rather the two-body interactions of the NJL model supersede the non-Hermitian bilinear effects. The effects of chiral symmetry are evident most clearly in the meson modes, the pseudoscalar of which will always be Goldstone in nature if the system is chirally symmetric. Second solutions of the mesonic equations are also discussed.	翻訳日:2023-04-13 00:39:07 公開日:2021-02-02
# 任意のスピンに対する無質量場方程式解の統一 Unification of massless field equations solutions for any spin ( http://arxiv.org/abs/2102.01485v1 ) ライセンス: Link先を確認	Sergio A. Hojman and Felipe A. Asenjo	(参考訳) クライン=ゴルドン、ディラック、マクスウェル、ラリタ=シュウィンガー、アインシュタイン方程式の厳密解(無質量場の場合)の統一について述べる。この方法は、関連するすべての力学場を、ダランベール方程式を満たす前ポテンシャル関数の積と微分で記述することに基づいている。前ポテンシャルが満たす連立方程式は非線形である。注目すべきことに、通常の波動方程式を満たす(勾配)直交前ポテンシャルの特解が存在し、これを用いてクライン=ゴルドン、ディラック、マクスウェル、ラリタ=シュウィンガーおよび(線形化された、および完全な)アインシュタイン方程式に対する非自明な厳密解を構成することができ、したがって任意のスピンに対するすべての無質量場方程式の解の統一が得られる。直交前ポテンシャルを用いて書かれたいくつかの解が提示される。この方法と、以前に開発された方法や物理学の他の主題との関係が指摘されている。 A unification of Klein--Gordon, Dirac, Maxwell, Rarita--Schwinger and Einstein equations exact solutions (for the massless-field cases) is presented. The method is based on writing all of the relevant dynamical fields in terms of products and derivatives of pre--potential functions, which satisfy d'Alembert's equation. The coupled equations satisfied by the pre--potentials are non-linear. Remarkably, there are particular solutions of (gradient) orthogonal pre--potentials that satisfy the usual wave equation which may be used to construct exact non--trivial solutions to Klein--Gordon, Dirac, Maxwell, Rarita--Schwinger and (linearized and full) Einstein equations, thus giving rise to a unification of the solutions of all massless field equations for any spin. Some solutions written in terms of orthogonal pre--potentials are presented. Relations of this method to previously developed ones, as well as to other subjects in physics are pointed out.	翻訳日:2023-04-13 00:38:16 公開日:2021-02-02
# 量子散乱理論の逆問題に対する代数的解法 An algebraic method for solving the inverse problem of quantum scattering theory ( http://arxiv.org/abs/2102.01464v1 ) ライセンス: Link先を確認	N.A. Khokhlov	(参考訳) 本稿では,マルケンコ理論に基づく量子散乱理論の逆問題を解くための新しい代数的手法を提案する。分離可能な形でマーケンコ方程式の核展開に三角形の波動セットを適用した。分離形式は、マーチンコ方程式を線形方程式系に還元することを可能にする。零軌道角運動量に対して、核膨張係数の線形式は運動量 q に依存する関数のフーリエ級数係数の言葉で得られ、q の有限範囲の散乱データによって決定される。 We present a new algebraic method for solving the inverse problem of quantum scattering theory based on the Marchenko theory. We applied a triangular wave set for the Marchenko equation kernel expansion in a separable form. The separable form allows a reduction of the Marchenko equation to a system of linear equations. For the zero orbital angular momentum, a linear expression of the kernel expansion coefficients is obtained in terms of the Fourier series coefficients of a function depending on the momentum q and determined by the scattering data on the finite range of q.	翻訳日:2023-04-13 00:37:57 公開日:2021-02-02
# ブロックチェーン技術を用いた分散型サプライチェーンアンチカウンタファイリングシステム Decentralizing Supply Chain Anti-Counterfeiting Systems Using Blockchain Technology ( http://arxiv.org/abs/2102.01456v1 ) ライセンス: Link先を確認	Neo C.K. Yiu	(参考訳) サプライチェーン産業における興味深い研究課題は、高級品の真正性を示す物理商品の評価と評価である。しかし、複雑で国際的に拡大するサプライチェーンネットワークで生産され輸送される今日の商品の反カントリー化と記録的な実績に対処する革新的なソフトウェアソリューションがいくつか存在する。しかし、これらのサプライチェーンシステムは中央集権的な権威や仲介者に依存して中央集権的なシステムアーキテクチャで実装されており、サプライチェーンを横断する不正な参加者ノードによって、製品レコードの悪意ある変更やシステムコンポーネントへのさまざまな潜在的な攻撃の影響を受けやすいシングルポイント処理、ストレージ、障害といった問題を引き起こしている。ブロックチェーン技術は、暗号通貨トランザクションの分散化、分散、不変の台帳から、さまざまなユースケースや既存の問題に対処する分散型で信頼性の高いアプリケーションを構築するためのプログラマブルなインタラクティブ環境へと進化した。本研究では,ブロックチェーン技術を用いたサプライチェーン産業の旧来のアンチカウンタファイトシステムを分散化し,信頼性の高いデータプロヴァンス検索,検証,管理,サプライチェーン産業における製品の反カウンタファイト能力の向上を図るために,nfc対応アンチカウンタファイトシステム(dnas)を提案する。提案したdNASは、エンタープライズコンソーシアム、プログラム可能なスマートコントラクト、分散ファイルストレージシステムの概念と互換性のあるコンセンサスプロトコル上で分散化されたブロックチェーンネットワークを使用して、データ完全性に魅力的な特性を提供する、プロファイランスレコードを自動検証するセキュアで不変な科学的データプロファイランス追跡管理プラットフォームを開発する。 An interesting research problem in supply chain industry is evaluating and determining provenance of physical goods - demonstrating authenticity of luxury goods. Yet, there have been a few innovative software solutions addressing product anti-counterfeiting and record provenance of today's goods that are produced and transported in complex and internationally-spanning supply chain networks. However, these supply chain systems have been implemented with centralized system architecture, relying on centralized authorities or any form of intermediaries, and leading to issues such as single-point processing, storage and failure, which could be susceptible to malicious modifications of product records or various potential attacks to system components by dishonest participant nodes traversing along the supply chain. 
Blockchain technology has evolved from being merely a decentralized, distributed and immutable ledger of cryptocurrency transactions to a programmable interactive environment for building decentralized and reliable applications that address different use cases and existing problems in the world. In this research, the Decentralized NFC-Enabled Anti-Counterfeiting System (dNAS) is proposed and developed, decentralizing a legacy anti-counterfeiting system of the supply chain industry using blockchain technology, to facilitate trustworthy data provenance retrieval, verification and management, and to strengthen the capability of product anti-counterfeiting in the supply chain industry. The proposed dNAS utilizes a decentralized blockchain network on a consensus protocol compatible with the concept of an enterprise consortium, programmable smart contracts and a distributed file storage system to develop a secure and immutable scientific data provenance tracking and management platform on which provenance records, providing compelling properties on data integrity, are validated automatically.	翻訳日:2023-04-13 00:37:47 公開日:2021-02-02
# 制御されたモジュラ乗算への計測に基づく非計算の適用 Measurement-based Uncomputation Applied to Controlled Modular Multiplication ( http://arxiv.org/abs/2102.01453v1 ) ライセンス: Link先を確認	Panjin Kim and Daewan Han	(参考訳) これは測定に基づく非計算の特定の使用に関する短い報告である。性能は魅力的ではないが、様々な量子回路の最適化技術に光を当てる可能性がある。 This is a brief report on a particular use of measurement-based uncomputation. Though not appealing in performance, it may shed light on optimization techniques in various quantum circuits.	翻訳日:2023-04-13 00:37:15 公開日:2021-02-02
# 熱前離散時間結晶の観察 Observation of a prethermal discrete time crystal ( http://arxiv.org/abs/2102.01695v1 ) ライセンス: Link先を確認	Antonis Kyprianidis, Francisco Machado, William Morong, Patrick Becker, Kate S. Collins, Dominic V. Else, Lei Feng, Paul W. Hess, Chetan Nayak, Guido Pagano, Norman Y. Yao, Christopher Monroe	(参考訳) 物質の相を定義し理解するための従来の枠組みは熱力学的平衡を必要とする。非平衡系への拡張は、多体熱化の性質や新しい物質相の発見に対する驚くべき洞察をもたらし、しばしば周期的に系の駆動によって触媒される。このようなフロッケ駆動からの固有の加熱は、系の強い障害を含むことで緩和できるが、非平衡相の一般性も隠蔽できる。本研究では,無秩序な非平衡駆動相,前熱離散時間結晶(PDTC)のシグネチャを観測するために,トラップイオン量子シミュレータを用いる。ここでは、多体加熱は障害による多体局在ではなく、高周波駆動によって抑制されるため、非平衡相が出現する可能性がある時間窓が広がる。 pdtcと多体局所障害を区別するいくつかの重要な特徴を観察し、その寿命の駆動周波数制御や初期状態のエネルギー密度に対する時間-結晶次数依存性などについて考察した。従って、フロッケ予熱は物質の本質的非平衡相を創り、安定化し、研究するための一般的な戦略として提示される。 The conventional framework for defining and understanding phases of matter requires thermodynamic equilibrium. Extensions to non-equilibrium systems have led to surprising insights into the nature of many-body thermalization and the discovery of novel phases of matter, often catalyzed by driving the system periodically. The inherent heating from such Floquet drives can be tempered by including strong disorder in the system, but this can also mask the generality of non-equilibrium phases. In this work, we utilize a trapped-ion quantum simulator to observe signatures of a non-equilibrium driven phase without disorder: the prethermal discrete time crystal (PDTC). Here, many-body heating is suppressed not by disorder-induced many-body localization, but instead via high-frequency driving, leading to an expansive time window where non-equilibrium phases can emerge. We observe a number of key features that distinguish the PDTC from its many-body-localized disordered counterpart, such as the drive-frequency control of its lifetime and the dependence of time-crystalline order on the energy density of the initial state. 
Floquet prethermalization is thus presented as a general strategy for creating, stabilizing and studying intrinsically out-of-equilibrium phases of matter.	翻訳日:2023-04-13 00:30:36 公開日:2021-02-02
# 超伝導量子ビットを用いた量子アルゴリズムにおける動的量子回路の爆発 Exploiting dynamic quantum circuits in a quantum algorithm with superconducting qubits ( http://arxiv.org/abs/2102.01682v1 ) ライセンス: Link先を確認	Antonio D. Corcoles, Maika Takita, Ken Inoue, Scott Lekuch, Zlatko K. Minev, Jerry M. Chow, Jay M. Gambetta	(参考訳) 実システム上での量子回路の実行は、単純に単体演算の時間順序のシーケンスに制限され、続いて射影測定が行われる。量子コンピューティングのハードウェアプラットフォームはサイズと能力が成熟し続けており、従来の構成を超えて量子回路を有効にすることが不可欠である。ここでは超伝導系量子系上の動的量子回路の領域について述べる。動的量子回路は、計算全体を通しての量子状態の進化だけでなく、キュービットのサブセットの周期的な測定や、回路の実行時間よりも短い時間スケールでの古典的な情報の同時処理も含む。ノイズ量子ハードウェアを用いて、動的回路を利用する適応バージョンにおいて、最も基本的な量子アルゴリズムの1つである量子位相推定を探索し、その結果を同じアルゴリズムの非適応実装と比較する。動的回路を用いたリアルタイム量子コンピューティングは,システム内のノイズやレイテンシが十分に低く,実量子システム上で利用可能なアルゴリズムの新たな領域への扉を開くことで,実質的かつ具体的な利点をもたらすことを実証する。 The execution of quantum circuits on real systems has largely been limited to those which are simply time-ordered sequences of unitary operations followed by a projective measurement. As hardware platforms for quantum computing continue to mature in size and capability, it is imperative to enable quantum circuits beyond their conventional construction. Here we break into the realm of dynamic quantum circuits on a superconducting-based quantum system. Dynamic quantum circuits involve not only the evolution of the quantum state throughout the computation, but also periodic measurements of a subset of qubits mid-circuit and concurrent processing of the resulting classical information within timescales shorter than the execution times of the circuits. Using noisy quantum hardware, we explore one of the most fundamental quantum algorithms, quantum phase estimation, in its adaptive version, which exploits dynamic circuits, and compare the results to a non-adaptive implementation of the same algorithm. 
We demonstrate that the version of real-time quantum computing with dynamic circuits can offer a substantial and tangible advantage when noise and latency are sufficiently low in the system, opening the door to a new realm of available algorithms on real quantum systems.	翻訳日:2023-04-13 00:30:17 公開日:2021-02-02
# 決定論的文脈自由言語によるアナログニューロン階層のより強い分離 Stronger Separation of Analog Neuron Hierarchy by Deterministic Context-Free Languages ( http://arxiv.org/abs/2102.01633v1 ) ライセンス: Link先を確認	Jiří Šíma	(参考訳) 離散時間リカレントニューラルネットワーク(nns)の計算能力をチョムスキー階層内の飽和線形活性化関数を用いて解析する。整数重みに制限されたこのモデルは、有限オートマトン(チョムスキーレベル3)と同値の2進状態 NN と一致し、正則言語(REG)を認識する一方、有理重みはこのモデルを3つのアナログ状態単位(チョムスキーレベル0)に対してもチューリング完全とする。中間モデル $\alpha$ANN を、有理重み付き$\alpha\geq 0$余剰アナログ状態ニューロンで拡張し、アナログニューロン階層 0ANNs $\subset$ 1ANNs $\subset$ 2ANNs $\subseteq$ 3ANNs を確立した。分離 1ANNs $\subsetneq$ 2ANNs は非正規決定論的文脈自由言語 (DCFL) $L_\#=\{0^n1^n\mid n\geq 1\}$ によって目撃され、実際の重みでも任意の 1ANN では認識できないが、DCFL (Chomsky level 2) は有理重みを持つ 2ANN では受け入れられる。本稿では,非正規DCFLが実重量の1ANNでは認識できないことを示すことにより,この分離を強化する。つまり (DCFLs $\setminus$ REG) $\subset$ (2ANNs $\setminus$ 1ANNs) であり,これは 1ANNs $\cap$ DCFLs = 0ANNs を意味する。この目的のために、$L_\#$は、このクラスの任意の言語に$L_\#$を還元することで、最も単純な非正規DCFLであることを示した。 We analyze the computational power of discrete-time recurrent neural networks (NNs) with the saturated-linear activation function within the Chomsky hierarchy. This model restricted to integer weights coincides with binary-state NNs with the Heaviside activation function, which are equivalent to finite automata (Chomsky level 3) recognizing regular languages (REG), while rational weights make this model Turing-complete even for three analog-state units (Chomsky level 0). For the intermediate model $\alpha$ANN of a binary-state NN that is extended with $\alpha\geq 0$ extra analog-state neurons with rational weights, we have established the analog neuron hierarchy 0ANNs $\subset$ 1ANNs $\subset$ 2ANNs $\subseteq$ 3ANNs. The separation 1ANNs $\subsetneqq$ 2ANNs has been witnessed by the non-regular deterministic context-free language (DCFL) $L_\#=\{0^n1^n\mid n\geq 1\}$ which cannot be recognized by any 1ANN even with real weights, while any DCFL (Chomsky level 2) is accepted by a 2ANN with rational weights. 
In this paper, we strengthen this separation by showing that no non-regular DCFL can be recognized by any 1ANN, even with real weights, which means (DCFLs $\setminus$ REG) $\subset$ (2ANNs $\setminus$ 1ANNs), implying 1ANNs $\cap$ DCFLs = 0ANNs. For this purpose, we show that $L_\#$ is the simplest non-regular DCFL by reducing $L_\#$ to any language in this class, which is by itself an interesting achievement in computability theory.	翻訳日:2023-04-13 00:29:04 公開日:2021-02-02
# 次数単位空間におけるスペクトルの幾何学的および代数的側面:比較 Geometric and algebraic aspects of spectrality in order unit spaces: a comparison ( http://arxiv.org/abs/2102.01628v1 ) ライセンス: Link先を確認	Anna Jenčová and Sylvia Pulmannová	(参考訳) 順序単位空間のスペクトル理論に対する2つのアプローチは、アルフセンとシュルツのスペクトル双対性とフーラスによるスペクトル圧縮基底である。前者は基底ノルム空間と双対性のある順序単位空間の幾何学的性質を用いるが、後者の概念は純粋に代数的である。フーリスアプローチは厳密にはより一般的であり、アルフセン・シュルツアプローチを特別な場合として含むことが示されている。これは二つの種類の例で示される: jb-代数がフーリススペクトルであることと、それらがリッカートであることは同値であり、中心対称状態空間は、必ずしもアルフセン・シュルツスペクトルではないにもかかわらずフーリススペクトルである。 Two approaches to spectral theory of order unit spaces are compared: the spectral duality of Alfsen and Shultz and the spectral compression bases due to Foulis. While the former approach uses the geometric properties of an order unit space in duality with a base norm space, the latter notion is purely algebraic. It is shown that the Foulis approach is strictly more general and contains the Alfsen-Shultz approach as a special case. This is demonstrated on two types of examples: the JB-algebras which are Foulis spectral if and only if they are Rickart, and the centrally symmetric state spaces, which may be Foulis spectral while not necessarily Alfsen-Shultz spectral.	翻訳日:2023-04-13 00:28:23 公開日:2021-02-02
# ボソニックデータ隠蔽:非線形光と非線形光のパワー Bosonic data hiding: power of linear vs non-linear optics ( http://arxiv.org/abs/2102.01622v1 ) ライセンス: Link先を確認	Krishna Kumar Sabapathy, Andreas Winter	(参考訳) ガウス状態のウィグナー関数と測定値の正則性は、古典的(フィードフォワード)通信(GOCC)により強化されたガウス測度演算として定式化された「線形光学」の識別力を束縛するエレガントな方法を提供する。これにより,コヒーレント状態のgoccノルム距離を厳格に特徴付ける竹岡と佐々木(pra 78:022320, 2008)の結果を再現し,一般化することができる。さらに、古典的および量子的シャノン理論からアイデアを呼び出すと、それぞれの状態が、原理的には指数関数的に確実に判別されるが、gocc測定の出力から指数関数的に近い多モードコヒーレント状態の確率的混合であることを示す。ローカル操作の制限されたクラスと古典的コミュニケーション(LOCC)による状態の生成と識別の不可逆性を示すLOCCデータ隠蔽(LOCC)と類似して、GOCCデータ隠蔽(GOCC data hidden)と呼ぶ。また, 正のウィグナー関数を用いた測定において, ヘルストロームを識別可能な任意の有界エネルギー状態に対して, 最小の識別性を保証し, 逆方向の一般境界も提示する。 GOCC測定にも同様の限界が存在すると推測する。 We show that the positivity of the Wigner function of Gaussian states and measurements provides an elegant way to bound the discriminating power of "linear optics", which we formalise as Gaussian measurement operations augmented by classical (feed-forward) communication (GOCC). This allows us to reproduce and generalise the result of Takeoka and Sasaki [PRA 78:022320, 2008], which tightly characterises the GOCC norm distance of coherent states, separating it from the optimal distinguishability according to Helstrom's theorem. Furthermore, invoking ideas from classical and quantum Shannon theory we show that there are states, each a probabilistic mixture of multi-mode coherent states, which are exponentially reliably discriminated in principle, but appear exponentially close judging from the output of GOCC measurements. In analogy to LOCC data hiding, which shows an irreversibility in the preparation and discrimination of states by the restricted class of local operations and classical communication (LOCC), we call the present effect GOCC data hiding. We also present general bounds in the opposite direction, guaranteeing a minimum of distinguishability under measurements with positive Wigner function, for any bounded-energy states that are Helstrom distinguishable. 
We conjecture that a similar bound holds for GOCC measurements.	翻訳日:2023-04-13 00:28:07 公開日:2021-02-02
# 弦を付さない位相空間ホログラフィー Phase space holography with no strings attached ( http://arxiv.org/abs/2102.01617v1 ) ライセンス: Link先を確認	D. V. Khveshchenko	(参考訳) このノートでは、位相空間 (bulk') における一般量子系の記述と時空 (boundary) とのホログラフィー的な対応を確立するという観点から、ウィグナー関数の表現について述べる。ある状況下では、前者は局所計量的変数の古典力学に還元され、後者はいくつかのボゾン化群場流体力学の形式を取る。この一般的な擬ホログラフィー双対性は、問題のシステムの特定の対称性に依存したり、応用ホログラフィーの様々な「アドホック」シナリオのように、基礎となる「弦理論」への接続を必要としない。 This note discusses the Wigner function representation from the standpoint of establishing a holography-like correspondence between the descriptions of a generic quantum system in the phase space ('bulk') picture versus its spacetime ('boundary') counterpart. Under certain circumstances the former might reduce to the classical dynamics of a local metric-like variable while the latter takes on the form of some bosonized collective field hydrodynamics. This generic pseudo-holographic duality neither relies on any particular symmetry of the system in question, nor does it require any connection to an underlying 'string theory', as in the various 'ad hoc' scenarios of applied holography.	翻訳日:2023-04-13 00:27:42 公開日:2021-02-02
# 超ポテンシャル W(x,A,B)=Atanh(px)+Btanh(6px) を持つ形状不変ポテンシャルの可解シュロディンガー方程式 Solvable Schrodinger Equations of Shape Invariant Potentials Having Superpotential W(x,A,B)=Atanh(px)+Btanh(6px) ( http://arxiv.org/abs/2102.02775v1 ) ライセンス: Link先を確認	Jamal Benbourenane, Mohamed Benbourenane, Hichem Eleuch	(参考訳) 形状不変法を用いて, 新たに提案した一次元時間独立Schrödinger方程式を完全に解く。対応するポテンシャルは $V_-(x,A,B) = -A\,\mathrm{sech}^2(px) - 6Bp\,\mathrm{sech}^2(6px) + (\tanh px - 6\tanh 6px)^2$ と超ポテンシャル $W(x,A,B) = A\tanh(px) + B\tanh(6px)$ で与えられる。我々は、形状不変性を持つ超ポテンシャルの超対称性量子力学技術を用いて、V_-ポテンシャルパートナーを持つシュレーディンガー方程式の族を正確に解き、そこで離散スペクトルと対応する固有関数が正確に閉形式で決定される。 Schrödinger 方程式が閉形式で解くのが困難であることはよく知られており、そのいくつかのみが知られている。正確な解を持つ新しい方程式を見つけることは、これらのビジニティにおいて数値法が失敗する曲がり角付近の隠れた物理的性質を理解するために重要である。この結果は核物理学や化学において、反抗力が顕著な存在感を持つ可能性を持っている。 A newly proposed one-dimensional time-independent Schrödinger equation is solved completely using the shape invariance method. The corresponding potential is given by $V_-(x,A,B) = -A\,\mathrm{sech}^2(px) - 6Bp\,\mathrm{sech}^2(6px) + (\tanh px - 6\tanh 6px)^2$ with superpotential $W(x,A,B) = A\tanh(px) + B\tanh(6px)$. We derive the exact solutions of the family of Schrödinger equations with the $V_-$ potential partner using the supersymmetric quantum mechanics technique for a superpotential having the shape invariance property, where the discrete spectrum and the corresponding eigenfunctions are determined exactly and in closed form. It is well known that Schrödinger equations are challenging to solve in closed form, and only a few such solutions are known. Finding new equations with exact solutions is crucial for understanding the hidden physical properties near turning points, where numerical methods fail. This result has potential applications in nuclear physics and chemistry, where antagonistic forces have a prominent presence.	翻訳日:2023-04-13 00:21:15 公開日:2021-02-02
# 非援助完全量子チャネルに対するレートスプリッティングによる新しいワンショットインナーバウンド Novel one-shot inner bounds for unassisted fully quantum channels via rate splitting ( http://arxiv.org/abs/2102.01766v1 ) ライセンス: Link先を確認	Sayantan Chakraborty and Aditya Nema and Pranab Sen	(参考訳) エンタングルメントアンヘルダー量子多重アクセスチャネル (qmac) とアンヘルプアンヘルダー2sender 2-receiver quantum interference channel (qic) 上で量子情報を送信するための最初の非自明な1ショット内界を証明した。既往の研究は、無支援QMACを無症候性イド限界(asymptotic iid limit)として知られるチャネルの多くの独立的および同一使用の限界でのみ研究し、無支援QMACを全く研究しなかった。私たちは内部境界を得るために、レート分割と逐次キャンセルという2つのテクニックを採用しています。レート分割は、漸近的なiid設定の古典的チャネルに対して、時間共有を回避し、内部境界を得るために以前に用いられた。我々の主な技術的貢献は、レート分割を古典的な漸近的なiid設定から量子ワンショット設定へと拡張することである。漸近的イドでは、QMACに対する一発の内界境界はヤード、デベタック、ヘイデンの速度領域に近づく。 QICでは、漸近的イド設定において新しい非自明な速度領域を得る。いずれの症例も,ワンショットおよび漸近的虹彩設定において,限られた絡み合い支援が提供される場合に拡張する。 QMAC と QIC のワンセットに対する限定的絡み合い結果は新しいものである。 QICでは, 漸近的イイド設定においても, 限られた絡み合いの結果が新しい。 We prove the first non-trivial one-shot inner bounds for sending quantum information over an entanglement unassisted two-sender quantum multiple access channel (QMAC) and an unassisted two-sender two-receiver quantum interference channel (QIC). Previous works only studied the unassisted QMAC in the limit of many independent and identical uses of the channel, also known as the asymptotic iid limit, and did not study the unassisted QIC at all. We employ two techniques, rate splitting and successive cancellation, in order to obtain our inner bound. Rate splitting was earlier used to obtain inner bounds, avoiding time sharing, for classical channels in the asymptotic iid setting. Our main technical contribution is to extend rate splitting from the classical asymptotic iid setting to the quantum one-shot setting. In the asymptotic iid limit our one-shot inner bound for QMAC approaches the rate region of Yard, Devetak and Hayden. For the QIC we get novel non-trivial rate regions in the asymptotic iid setting. 
All our results also extend to the case where limited entanglement assistance is provided, in both one-shot and asymptotic iid settings. The limited entanglement results in the one-shot setting for both QMAC and QIC are new. For the QIC the limited entanglement results are new even in the asymptotic iid setting.	翻訳日:2023-04-13 00:19:59 公開日:2021-02-02
# デジタル経済活動における技術知識に基づくスキル Skills-based on technological knowledge in the digital economy activity ( http://arxiv.org/abs/2102.01711v1 ) ライセンス: Link先を確認	Dr. Cesar R Salas-Guerra	(参考訳) 本研究は,技術知識を持つ人々が地域デジタル経済活動に与える影響と,近隣の都市への感染拡大の影響を計測することを目的とする。本研究の焦点は定量的,断面的であり,その設計は相関-causalである。この研究はブラジルのミナスジェライスの7つの小地域を対象とし、89の自治体で69%の人口と31%の農村部が組織されている。使用されるデータは、ブラジル政府の公開リポジトリで得られた4,361の観測結果からなり、パネルデータにまとめられ、部分最小二乗、微小領域空間回帰、機械学習による識別パターンを用いて分析された。回帰テストの確認分析は、CEの技術知識とデジタル経済活動の間に、$R^2 = .749$, $\beta = .867$, $p = .000$(値t = 18,298)の予測値を通じて大きな影響を与える。公立・私立大学機関(IUPP)、博士・修士課程の教授(DCNT)、情報技術職(CBO)などが有名である。技術基盤技術を求める企業の地理的集中は、小自治体の発展を鈍化させ、技術知識に基づく新しいビジネスモデルを支援する新しい政府技術イニシアチブの開発を示唆した。 This research seeks to measure the impact of people with technological knowledge on regional digital economic activity and the implications of prosperous cities' contagion effect on neighbouring ones. The focus of this study is quantitative and cross-sectional, and its design is correlational-causal. This study covers seven micro-regions of Minas Gerais in Brazil, organized in 89 municipalities, with 69% urban population and 31% rural. The data used consisted of 4,361 observations obtained in the Brazilian government's public repositories, organized into panel data, and analysed using partial least squares, micro-regional spatial regression, and identification patterns with machine learning. The confirmatory regression analysis establishes a significant impact of technological knowledge (CE) on digital economic activity (AED), with a predictive value of $R^2 = .749$, $\beta = .867$, $p = .000$ (value t = 18,298). The most prominent variables are public and private university institutions (IUPP), professors with doctorates and master's degrees (DCNT), and information technology occupations (CBO). 
A geographic concentration of companies that demand technology-based skills slowed down the development of small municipalities, suggesting the need for new government technology initiatives that support new business models based on technological knowledge.	翻訳日:2023-04-13 00:18:59 公開日:2021-02-02
# 乱れた双極子量子系における局所励起の寿命 Lifetimes of local excitations in disordered dipolar quantum systems ( http://arxiv.org/abs/2102.01705v1 ) ライセンス: Link先を確認	Rahul Nandkishore and Sarang Gopalakrishnan	(参考訳) 相互作用する量子双極子の強い乱れた系が局所的に励起されると、励起はいくつかの(潜在的に非常に長い)時間スケールで緩和する。この緩和過程は、粒子間双極子が創発的に励起される電子ガラスと、顕微鏡的双極子からなるシステム(量子磁石や超低温双極子分子など)の両方で解析される。我々は、エネルギー緩和率(t_1$ times)と緩和率(t_2$ times)の両方と、周波数、温度、偏光に依存することを考慮する。 2次元および3次元の系は、準2次元幾何学における次元交叉とともに考慮される。豊富なスケーリング法則が発見されている。 When a strongly disordered system of interacting quantum dipoles is locally excited, the excitation relaxes on some (potentially very long) timescale. We analyze this relaxation process, both for electron glasses with strong Coulomb interactions - in which particle-hole dipoles are emergent excitations - and for systems (e.g., quantum magnets or ultracold dipolar molecules) made up of microscopic dipoles. We consider both energy relaxation rates ($T_1$ times) and dephasing rates ($T_2$ times), and their dependence on frequency, temperature, and polarization. Systems in both two and three dimensions are considered, along with the dimensional crossover in quasi-two dimensional geometries. A rich set of scaling laws is found.	翻訳日:2023-04-13 00:18:34 公開日:2021-02-02
# CTとMRIの3次元超解像のための中間損失を有する畳み込みニューラルネットワーク Convolutional Neural Networks with Intermediate Loss for 3D Super-Resolution of CT and MRI Scans ( http://arxiv.org/abs/2001.01330v3 ) ライセンス: Link先を確認	Mariana-Iuliana Georgescu, Radu Tudor Ionescu, Nicolae Verga	(参考訳) 病院で一般的に使われているCTスキャナーは、現在512ピクセルまでの解像度の低い画像を生成する。画像中の1ピクセルは1ミリの組織に相当する。腫瘍を正確に分類し、治療計画を立案するには、高解像度のCTスキャンが必要である。同じ問題がMRIにも現れる。本稿では,3次元CTやMRIの単一画像超解像へのアプローチを提案する。提案手法は,10層からなる深層畳み込みニューラルネットワーク(CNN)と,第1層の畳み込み層の後に配置される中間層からなる。第1のcnnは2つの軸(幅と高さ)の解像度を増加させ、第2のcnnは第3軸(深さ)の解像度を増加させる。他の方法と異なり、アップスケーリング層の直後の基底トラス高解像度出力に対する損失を計算し、最後の畳み込み層の直後の損失を計算する。中間損失により、我々のネットワークは地上構造に近い、より良い出力を生み出すことができる。シャープな結果を得るために広く使われているアプローチは、固定標準偏差を用いてガウス的曖昧さを加えることである。固定標準偏差への過剰な適合を避けるため、他の手法とは異なり、様々な標準偏差を持つガウス平滑化を適用する。我々は2つのデータベースからのCTとMRIの2次元超解像と3次元超解像の文脈で評価し、2xと4xのスケーリング因子を用いて、様々な補間スキームに基づく文献やベースラインの関連研究と比較した。実験の結果,我々のアプローチは他の手法よりも優れた結果が得られることがわかった。また,人間の注記では,lanczos補間を97.55%で2倍,96.69%で4倍に拡大した症例では96.69%であった。 CT scanners that are commonly-used in hospitals nowadays produce low-resolution images, up to 512 pixels in size. One pixel in the image corresponds to a one millimeter piece of tissue. In order to accurately segment tumors and make treatment plans, doctors need CT scans of higher resolution. The same problem appears in MRI. In this paper, we propose an approach for the single-image super-resolution of 3D CT or MRI scans. Our method is based on deep convolutional neural networks (CNNs) composed of 10 convolutional layers and an intermediate upscaling layer that is placed after the first 6 convolutional layers. Our first CNN, which increases the resolution on two axes (width and height), is followed by a second CNN, which increases the resolution on the third axis (depth). Different from other methods, we compute the loss with respect to the ground-truth high-resolution output right after the upscaling layer, in addition to computing the loss after the last convolutional layer. 
The intermediate loss forces our network to produce a better output, closer to the ground truth. A widely used approach to obtain sharp results is to add Gaussian blur using a fixed standard deviation. In order to avoid overfitting to a fixed standard deviation, we apply Gaussian smoothing with various standard deviations, unlike other approaches. We evaluate our method in the context of 2D and 3D super-resolution of CT and MRI scans from two databases, comparing it to relevant related works from the literature and baselines based on various interpolation schemes, using 2x and 4x scaling factors. The empirical results show that our approach attains superior results to all other methods. Moreover, our human annotation study reveals that both doctors and regular annotators chose our method over Lanczos interpolation in 97.55% of cases for the 2x upscaling factor and in 96.69% of cases for the 4x upscaling factor.	翻訳日:2023-01-14 07:52:14 公開日:2021-02-02
# モバイル学習環境における深い注意学習セッションドロップアウト予測 Deep Attentive Study Session Dropout Prediction in Mobile Learning Environment ( http://arxiv.org/abs/2002.11624v5 ) ライセンス: Link先を確認	Youngnam Lee, Dongmin Shin, HyunBin Loh, Jaemin Lee, Piljae Chae, Junghyun Cho, Seoyon Park, Jinhwan Lee, Jineon Baek, Byungsoo Kim, Youngduck Choi	(参考訳) 学生のドロップアウト予測は、学生のエンゲージメントを改善する機会を提供し、学習体験の全体的な効果を最大化する。しかし、学生の退学に関する調査は、主に学校ドロップアウトやコースドロップアウトで行われており、モバイル学習環境における学習セッションの退学は十分に考慮されていない。本稿では,モバイル学習環境における学習セッションのドロップアウト予測問題について検討する。まず,モバイル学習環境における学習セッション,学習セッションドロップアウト,学習セッションドロップアウト予測タスクの概念を定義した。この定義に基づき,モバイル学習環境における学習セッションドロップアウト予測のための新しいトランスフォーマモデルdas: deep attentive studyセッションドロップアウト予測を提案する。 DASにはエンコーダ・デコーダ構造があり、マルチヘッドアテンションとポイントワイドフィードフォワードネットワークで構成されている。 DASの深い注意計算は、動的学生相互作用の間の複雑な関係を捉えることができる。私たちの知る限りでは、これはモバイル学習環境における学習セッションのドロップアウトを調査する最初の試みです。大規模データセットの実証評価から,DASはベースラインモデルと比較して,受信機動作特性曲線の下での領域の大幅な改善により,最高の性能を達成することが示された。 Student dropout prediction provides an opportunity to improve student engagement, which maximizes the overall effectiveness of learning experiences. However, researches on student dropout were mainly conducted on school dropout or course dropout, and study session dropout in a mobile learning environment has not been considered thoroughly. In this paper, we investigate the study session dropout prediction problem in a mobile learning environment. First, we define the concept of the study session, study session dropout and study session dropout prediction task in a mobile learning environment. Based on the definitions, we propose a novel Transformer based model for predicting study session dropout, DAS: Deep Attentive Study Session Dropout Prediction in Mobile Learning Environment. DAS has an encoder-decoder structure which is composed of stacked multi-head attention and point-wise feed-forward networks. The deep attentive computations in DAS are capable of capturing complex relations among dynamic student interactions. 
To the best of our knowledge, this is the first attempt to investigate study session dropout in a mobile learning environment. Empirical evaluations on a large-scale dataset show that DAS achieves the best performance with a significant improvement in area under the receiver operating characteristic curve compared to baseline models.	翻訳日:2023-01-01 04:22:04 公開日:2021-02-02
# 深部ランダム化ニューラルネットワーク Deep Randomized Neural Networks ( http://arxiv.org/abs/2002.12287v2 ) ライセンス: Link先を確認	Claudio Gallicchio and Simone Scardapane	(参考訳) ランダム化されたニューラルネットワークは、ほとんどの接続が固定されたニューラルネットワークの振る舞いを確率的または決定論的に探索する。このようなシステムの典型的な例は、隠れた層への接続が初期化後に未訓練のまま残される多層ニューラルネットワークアーキテクチャである。トレーニングアルゴリズムを減量セットで運用することを制限することは、本質的に、多くの興味深い特徴を持つランダム化されたニューラルネットワークのクラスを特徴付ける。その中でも、学習プロセスの極端な効率性は、完全に訓練されたアーキテクチャに関して、間違いなく顕著な優位性である。さらに、関連する単純化にもかかわらず、ランダム化されたニューラルネットワークは、実際の両方において顕著な特性を持ち、最先端の結果を複数のドメインで達成し、理論的には、ニューラルネットワークアーキテクチャ(例えば、隠れたレイヤの接続をトレーニングする前に)固有の特性を解析できる。近年、ランダム化ニューラルネットワークの研究は深層アーキテクチャへと拡張され、ベクトルやより複雑なデータ領域において効率的かつ極めて効率的なディープラーニングモデルの設計に向けた新たな研究方向が開かれた。本章では、ランダム化されたニューラルネットワークの設計と解析に関する主要な側面と、それらの近似能力に関する重要な結果について調査する。特に,まず,ランダム化ニューラルモデルの基礎をフィードフォワードネットワーク(すなわち,ランダムベクトル汎関数リンクと等価モデル)と畳み込みフィルタ(英語版)(convolutional filter)の文脈で導入し,その後にリカレントシステム(すなわち貯水池計算ネットワーク)に移行した。どちらの場合でも、深層ランダム化システムの領域における最近の結果と、その構造化ドメインへの(再帰モデルのための)適用に特に焦点を当てています。 Randomized Neural Networks explore the behavior of neural systems where the majority of connections are fixed, either in a stochastic or a deterministic fashion. Typical examples of such systems consist of multi-layered neural network architectures where the connections to the hidden layer(s) are left untrained after initialization. Limiting the training algorithms to operate on a reduced set of weights inherently characterizes the class of Randomized Neural Networks with a number of intriguing features. Among them, the extreme efficiency of the resulting learning processes is undoubtedly a striking advantage with respect to fully trained architectures. Besides, despite the involved simplifications, randomized neural systems possess remarkable properties both in practice, achieving state-of-the-art results in multiple domains, and theoretically, allowing to analyze intrinsic properties of neural architectures (e.g. before training of the hidden layers' connections). 
In recent years, the study of Randomized Neural Networks has been extended towards deep architectures, opening new research directions to the design of effective yet extremely efficient deep learning models in vectorial as well as in more complex data domains. This chapter surveys all the major aspects regarding the design and analysis of Randomized Neural Networks, and some of the key results with respect to their approximation capabilities. In particular, we first introduce the fundamentals of randomized neural models in the context of feed-forward networks (i.e., Random Vector Functional Link and equivalent models) and convolutional filters, before moving to the case of recurrent systems (i.e., Reservoir Computing networks). For both, we focus specifically on recent results in the domain of deep randomized systems, and (for recurrent models) their application to structured domains.	翻訳日:2022-12-28 07:12:03 公開日:2021-02-02
# サスペンド・ペイロードによる飛行モデルに基づくメタ強化学習 Model-Based Meta-Reinforcement Learning for Flight with Suspended Payloads ( http://arxiv.org/abs/2004.11345v2 ) ライセンス: Link先を確認	Suneel Belkhale, Rachel Li, Gregory Kahn, Rowan McAllister, Roberto Calandra, Sergey Levine	(参考訳) 吊り下げられたペイロードの輸送は、ロボットの動力に重大な、予測不能な変化を引き起こす可能性があるため、自律飛行車両にとって困難である。これらの変更は、最適飛行性能や破滅的な失敗につながる可能性がある。適応制御と学習に基づく手法は、原則としてこれらのハイブリッドロボットペイロードシステムの変化に適応することができるが、事前の物理的性質が不明なペイロードへの迅速なミッドフライ適応は未解決の問題である。本研究では,接続後飛行データから数秒以内に変化するダイナミクスのモデル「学習の仕方を学習する」メタラーニング手法を提案する。実験の結果,オンライン適応手法は,懸架されたペイロード輸送タスクにおいて,非適応手法よりも優れていることが示された。ビデオやその他の補足資料は、我々のウェブサイトで入手できる。 Transporting suspended payloads is challenging for autonomous aerial vehicles because the payload can cause significant and unpredictable changes to the robot's dynamics. These changes can lead to suboptimal flight performance or even catastrophic failure. Although adaptive control and learning-based methods can in principle adapt to changes in these hybrid robot-payload systems, rapid mid-flight adaptation to payloads that have a priori unknown physical properties remains an open problem. We propose a meta-learning approach that "learns how to learn" models of altered dynamics within seconds of post-connection flight data. Our experiments demonstrate that our online adaptation approach outperforms non-adaptive methods on a series of challenging suspended payload transportation tasks. Videos and other supplemental material are available on our website: https://sites.google.com/view/meta-rl-for-flight	翻訳日:2022-12-10 09:20:01 公開日:2021-02-02
# オープンソースソフトウェアにおける開発者エキスパートの表現 Representation of Developer Expertise in Open Source Software ( http://arxiv.org/abs/2005.10176v3 ) ライセンス: Link先を確認	Tapajit Dey, Andrey Karnauch, Audris Mockus	(参考訳) 背景: 開発者の専門知識の正確な表現は常に重要な研究課題です。多くの研究が個々のプロジェクト内で専門知識を表現する新しい手法を提案しているが、これらの手法は生態系レベルでは適用が困難である。しかし、ソフトウェア開発がモノリシックからモジュラーへとシフトするにつれ、例えばプロジェクトが新しいメンテナを見つけ、関連するスキルを持つ開発者を探そうとするときに、OSS開発全体のコンテキストにおける開発者の専門知識を表現する方法が必要である。目的: 私たちは,各apiや開発者,プロジェクトが表現されるスキルスペースの提案と構築を通じて,この知識ギャップに対処することを目的としています。メソッド: 私たちはWorld of Codeインフラストラクチャを使用して、オープンソース開発者が変更したファイルの完全なAPIセットを抽出し、そのデータに基づいて、API、開発者、プロジェクトのベクトル表現にDoc2Vec埋め込みを使用します。これらの埋め込みがSkill Spaceの仮定されたトポロジを反映しているかどうかを、開発者が使用/参加する新しいAPIやプロジェクト、プルリクエストが受け入れられるかどうかを予測することで評価します。また、Skill Spaceにおける開発者の表現が、自己報告のAPIの専門知識とどのように一致しているかを確認します。結果: 提案するスキル空間への埋め込みは, 仮定されたトポロジーを満足しているように思われる。このような表現が, オープンソースエコシステム全体の信頼(と効率)を高めるシグナルの構築に寄与し, 開発者の習熟度や学習に関連する他の現象の調査に役立つことを期待する。 Background: Accurate representation of developer expertise has always been an important research problem. While a number of studies proposed novel methods of representing expertise within individual projects, these methods are difficult to apply at an ecosystem level. However, with the focus of software development shifting from monolithic to modular, a method of representing developers' expertise in the context of the entire OSS development becomes necessary when, for example, a project tries to find new maintainers and look for developers with relevant skills. Aim: We aim to address this knowledge gap by proposing and constructing the Skill Space where each API, developer, and project is represented and postulate how the topology of this space should reflect what developers know (and projects need). Method: we use the World of Code infrastructure to extract the complete set of APIs in the files changed by open source developers and, based on that data, employ Doc2Vec embeddings for vector representations of APIs, developers, and projects. 
We then evaluate whether these embeddings reflect the postulated topology of the Skill Space by predicting what new APIs/projects developers use/join, and whether or not their pull requests get accepted. We also check how the developers' representations in the Skill Space align with their self-reported API expertise. Result: Our results suggest that the proposed embeddings in the Skill Space satisfy the postulated topology. We hope that such representations may aid in the construction of signals that increase the trust (and efficiency) of open source ecosystems at large and may aid investigations of other phenomena related to developer proficiency and learning.	翻訳日:2022-12-01 06:16:54 公開日:2021-02-02
# 不整合損失を伴うレビュー要約と感情分類のための統一的デュアルビューモデル A Unified Dual-view Model for Review Summarization and Sentiment Classification with Inconsistency Loss ( http://arxiv.org/abs/2006.01592v2 ) ライセンス: Link先を確認	Hou Pong Chan, Wang Chen, Irwin King	(参考訳) ユーザーレビューから正確な要約と感情を得ることは、現代のeコマースプラットフォームの重要な要素である。レビュー要約は、レビューの重要な意見と感情を記述する簡潔な要約を生成することを目的としており、感情分類はレビューの感情態度を示す感情ラベルを予測することを目的としている。レビュー要約と感情分類の両タスクにおいて,共有感情情報を効果的に活用するために,これら2つのタスクの性能を協調的に改善する新しいデュアルビューモデルを提案する。このモデルでは、エンコーダはまずレビューのコンテキスト表現を学習し、次に要約デコーダがレビュー要約語を単語毎に生成する。その後、ソースビュー感情分類器は、エンコードされたコンテキスト表現を使用してレビューの感情ラベルを予測し、サマリビュー感情分類器はデコーダ隠蔽状態を使用して生成された要約の感情ラベルを予測する。トレーニング中、これらの2つの分類器間の不一致を罰する不整合損失を導入する。これはデコーダがレビューで一貫した感情傾向を持つために要約を生成するのを助け、2つの感情分類器が互いに学ぶのに役立つ。異なる領域の4つの実世界のデータセットに対する実験結果は、我々のモデルの有効性を示す。 Acquiring accurate summarization and sentiment from user reviews is an essential component of modern e-commerce platforms. Review summarization aims at generating a concise summary that describes the key opinions and sentiment of a review, while sentiment classification aims to predict a sentiment label indicating the sentiment attitude of a review. To effectively leverage the shared sentiment information in both review summarization and sentiment classification tasks, we propose a novel dual-view model that jointly improves the performance of these two tasks. In our model, an encoder first learns a context representation for the review, then a summary decoder generates a review summary word by word. After that, a source-view sentiment classifier uses the encoded context representation to predict a sentiment label for the review, while a summary-view sentiment classifier uses the decoder hidden states to predict a sentiment label for the generated summary. During training, we introduce an inconsistency loss to penalize the disagreement between these two classifiers. 
This loss helps the decoder generate a summary whose sentiment tendency is consistent with that of the review, and it also helps the two sentiment classifiers learn from each other. Experimental results on four real-world datasets from different domains demonstrate the effectiveness of our model.	翻訳日:2022-11-26 00:31:11 公開日:2021-02-02
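As a rough illustration of a loss that penalizes disagreement between two sentiment classifiers, the sketch below measures the KL divergence between their predicted label distributions. The abstract does not say which disagreement measure is used, so the choice of KL divergence, the function name, and the example distributions are assumptions made for illustration only.

```python
import math

def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions over the same sentiment labels."""
    # eps guards against log(0); terms with p_i == 0 contribute nothing.
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q) if pi > 0)

# Hypothetical outputs of the two classifiers over (negative, neutral, positive).
source_view = [0.7, 0.2, 0.1]
summary_view = [0.1, 0.2, 0.7]

agreeing = kl_divergence(source_view, source_view)      # identical predictions
disagreeing = kl_divergence(source_view, summary_view)  # opposite predictions
```

The value is zero when the two views agree and grows as they diverge, which is the behavior a disagreement penalty needs.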
# Augmented Grasp Map Representation を用いた指向性ロボットグラフ合成 Orientation Attentive Robotic Grasp Synthesis with Augmented Grasp Map Representation ( http://arxiv.org/abs/2006.05123v2 ) ライセンス: Link先を確認	Georgia Chalvatzaki, Nikolaos Gkanatsios, Petros Maragos, Jan Peters	(参考訳) 物体の因果的な形態的特徴は、ロボットの把握の視覚的学習を阻害する、幅広い可視的把握方向を提供する可能性がある。既存の把持生成アプローチは、把持点ごとに大きく異なる向きのアノテーションを集約することで不連続な把持マップを構築するために呪いを負う。さらに,現状の手法では,ロボットの視点において,その実現可能性の制約を無視して,単一方向の把握候補を生成する。本稿では, 角度空間を複数のビンに分割することにより, 方向を局所的に歪曲する, 画素ワイズ合成に適した拡張型グリップマップ表現を提案する。さらに,向き付けビンへの分類と角度値回帰を共同で扱う向き付け注意把握合成(orange)フレームワークについても紹介する。双角方向写像はさらに、把握性が高い領域、すなわち実際の把握点となる確率に対する注意のメカニズムとして機能する。 Jacquardの94.71%の性能は、深度画像のみを用いた単純なU-Netで、マルチモーダルアプローチよりも優れています。その後の定性的な結果から,ORANGEが複数の方向のグリップを生成することの有効性を検証し,実現可能な計画的グリップを実現する。 Inherent morphological characteristics in objects may offer a wide range of plausible grasping orientations that obfuscates the visual learning of robotic grasping. Existing grasp generation approaches are cursed to construct discontinuous grasp maps by aggregating annotations for drastically different orientations per grasping point. Moreover, current methods generate grasp candidates across a single direction in the robot's viewpoint, ignoring its feasibility constraints. In this paper, we propose a novel augmented grasp map representation, suitable for pixel-wise synthesis, that locally disentangles grasping orientations by partitioning the angle space into multiple bins. Furthermore, we introduce the ORientation AtteNtive Grasp synthEsis (ORANGE) framework, that jointly addresses classification into orientation bins and angle-value regression. The bin-wise orientation maps further serve as an attention mechanism for areas with higher graspability, i.e. probability of being an actual grasp point. We report new state-of-the-art 94.71% performance on Jacquard, with a simple U-Net using only depth images, outperforming even multi-modal approaches. 
Subsequent qualitative results with a real bi-manual robot validate ORANGE's effectiveness in generating grasps for multiple orientations, which makes it possible to plan feasible grasps.	翻訳日:2022-11-23 15:31:26 公開日:2021-02-02
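The idea of partitioning the angle space into bins and pairing a bin classification with an angle-value regression can be sketched as a target encoding. This is only a minimal sketch: the number of bins, the treatment of angles as equivalent modulo pi (a symmetric-gripper assumption), and both function names are hypothetical and not taken from the paper.

```python
import math

def angle_to_bin(theta, num_bins):
    """Map a grasp angle to (bin index, offset from the bin centre)."""
    width = math.pi / num_bins
    wrapped = theta % math.pi  # fold the angle into [0, pi)
    # min() guards against floating-point results that land on the upper edge.
    index = min(int(wrapped // width), num_bins - 1)
    offset = wrapped - (index + 0.5) * width
    return index, offset

def bin_to_angle(index, offset, num_bins):
    """Invert angle_to_bin: bin centre plus the regressed offset."""
    width = math.pi / num_bins
    return (index + 0.5) * width + offset

index, offset = angle_to_bin(math.pi / 4, num_bins=3)
recovered = bin_to_angle(index, offset, num_bins=3)
```

The bin index would serve as the classification target and the offset as the regression target, so two very different grasp angles at the same pixel end up in separate bins instead of being averaged together.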
# Debona: より密接な境界と高速な対向ロバスト性証明のための分離境界ネットワーク解析 Debona: Decoupled Boundary Network Analysis for Tighter Bounds and Faster Adversarial Robustness Proofs ( http://arxiv.org/abs/2006.09040v2 ) ライセンス: Link先を確認	Christopher Brix, Thomas Noll	(参考訳) ニューラルネットワークは、安全クリティカルな現実世界のアプリケーションで一般的に使用される。残念なことに、予測された出力は、しばしば入力データの変更に対して非常に敏感である。このような敵の例が存在しないこと、あるいは具体的な例を提供することは、安全なアプリケーションを保証するために不可欠である。全ての潜在的な敵の例を列挙し、検証することは、計算的に不可能であるので、ネットワークアクティベーションの過大評価を用いて、不在の数学的に健全な証明を提供するための検証技術が開発されている。本稿では,これらのノード値の上限値と下限値の密接な計算を行うための改良手法を提案する。さらに,従来の最先端ソフトウェアである"Neurify"の一部を再実装することで,より高速な解析が可能になった。これらの適応を組み合わせることで、必要なランタイムを最大94%削減し、以前は複雑すぎたネットワークや入力の検索に成功した。畳み込みネットワークにおける最大プーリング層上の上下境界の厳密な証明を行う。広汎なユーザビリティを確保するため,実装固有の拡張に加えて,より高速かつ正確な境界計算も備えた実装"Debona"をオープンソース化した。 Neural networks are commonly used in safety-critical real-world applications. Unfortunately, the predicted output is often highly sensitive to small, and possibly imperceptible, changes to the input data. Proving that either no such adversarial examples exist, or providing a concrete instance, is therefore crucial to ensure safe applications. As enumerating and testing all potential adversarial examples is computationally infeasible, verification techniques have been developed to provide mathematically sound proofs of their absence using overestimations of the network activations. We propose an improved technique for computing tight upper and lower bounds of these node values, based on increased flexibility gained by computing both bounds independently of each other. Furthermore, we gain an additional improvement by re-implementing part of the original state-of-the-art software "Neurify", leading to a faster analysis. Combined, these adaptations reduce the necessary runtime by up to 94%, and allow a successful search for networks and inputs that were previously too complex. We provide proofs for tight upper and lower bounds on max-pooling layers in convolutional networks. 
To ensure widespread usability, we open source our implementation "Debona", featuring both the implementation-specific enhancements and the refined boundary computation for faster and more exact results.	翻訳日:2022-11-20 18:44:10 公開日:2021-02-02
# 会話型ニューロシンボリック・コモンセンス推論 Conversational Neuro-Symbolic Commonsense Reasoning ( http://arxiv.org/abs/2006.10022v3 ) ライセンス: Link先を確認	Forough Arabshahi, Jennifer Lee, Mikayla Gawarecki, Kathryn Mazaitis, Amos Azaria, Tom Mitchell	(参考訳) 会話AIシステムがより自然で広範に会話を行うためには、会話パートナーの予想外の推測を識別する機能を含む、より一般的な知識が必要である。例えば、"if it snows at night, then me early because i don't want to be late for work"というコマンドでは、リスナーの常識に基づいて、雪が降って渋滞が遅くなる場合にのみ目を覚ましたいという暗黙の仮定を推測する。ここでは、「if-(state), then-(action), because-(goal)"文」という形で与えられた不正確な自然言語コマンドを理解するという問題を考察する。より正確には、要求された行動が与えられた状態から所望の目標を達成することを許容する話者の未定の前提を識別する問題を考える(暗黙の前提を明示することによる詳細化)。我々はこのタスクのベンチマークデータセットをリリースし、人間から収集し、コモンセンス推定で注釈を付けた。マルチホップ推論鎖を抽出するニューロシンボリック定理証明器を提案し,この問題に応用する。さらに、現在のAIコモンセンスシステムが完全なカバレッジを欠いている現実に対応するため、私たちのニューロシンボリックシステム上に構築された対話型会話フレームワークも提供します。 In order for conversational AI systems to hold more natural and broad-ranging conversations, they will require much more commonsense, including the ability to identify unstated presumptions of their conversational partners. For example, in the command "If it snows at night then wake me up early because I don't want to be late for work" the speaker relies on commonsense reasoning of the listener to infer the implicit presumption that they wish to be woken only if it snows enough to cause traffic slowdowns. We consider here the problem of understanding such imprecisely stated natural language commands given in the form of "if-(state), then-(action), because-(goal)" statements. More precisely, we consider the problem of identifying the unstated presumptions of the speaker that allow the requested action to achieve the desired goal from the given state (perhaps elaborated by making the implicit presumptions explicit). We release a benchmark data set for this task, collected from humans and annotated with commonsense presumptions. We present a neuro-symbolic theorem prover that extracts multi-hop reasoning chains, and apply it to this problem. 
Furthermore, to accommodate the reality that current AI commonsense systems lack full coverage, we also present an interactive conversational framework built on our neuro-symbolic system that conversationally evokes commonsense knowledge from humans to complete its reasoning chains.	翻訳日:2022-11-19 18:59:11 公開日:2021-02-02
# ニューロンの構成的説明 Compositional Explanations of Neurons ( http://arxiv.org/abs/2006.14032v2 ) ライセンス: Link先を確認	Jesse Mu, Jacob Andreas	(参考訳) 本稿では,ニューロンの挙動を近似した構成論理的概念を同定し,深部表現におけるニューロンの説明手法について述べる。原子ラベルを説明として使用する以前の研究と比較すると、ニューロンを合成分析することで、より正確に表現的にその行動を特徴付けることができる。視覚と自然言語処理のモデルにおける解釈可能性に関するいくつかの質問に答えるためにこの手順を用いる。まず,ニューロンが学習する抽象化の種類について検討する。画像分類では、多くのニューロンが高度に抽象的だがセマンティック・コヒーレントな視覚概念を学習しているのに対し、他のポリセマンティックニューロンは複数の無関係な特徴を検知している。第2に,人間の解釈可能な概念を検出する視覚ニューロンはタスク性能と正の相関を示す一方,浅いヒューリスティックスのために発火するNLIニューロンはタスク性能と負の相関を示す。最後に、構成説明が、エンドユーザーがモデル動作を予測可能な方法で変更する単純な「コピーペースト」攻撃例を作成するための、アクセス可能な方法を提供する方法を示す。 We describe a procedure for explaining neurons in deep representations by identifying compositional logical concepts that closely approximate neuron behavior. Compared to prior work that uses atomic labels as explanations, analyzing neurons compositionally allows us to more precisely and expressively characterize their behavior. We use this procedure to answer several questions on interpretability in models for vision and natural language processing. First, we examine the kinds of abstractions learned by neurons. In image classification, we find that many neurons learn highly abstract but semantically coherent visual concepts, while other polysemantic neurons detect multiple unrelated features; in natural language inference (NLI), neurons learn shallow lexical heuristics from dataset biases. Second, we see whether compositional explanations give us insight into model performance: vision neurons that detect human-interpretable concepts are positively correlated with task performance, while NLI neurons that fire for shallow heuristics are negatively correlated with task performance. Finally, we show how compositional explanations provide an accessible way for end users to produce simple "copy-paste" adversarial examples that change model behavior in predictable ways.	翻訳日:2022-11-17 08:59:07 公開日:2021-02-02
# COAX:ソフト関数依存型多次元データにおける相関認識インデックス COAX: Correlation-Aware Indexing on Multidimensional Data with Soft Functional Dependencies ( http://arxiv.org/abs/2006.16393v3 ) ライセンス: Link先を確認	Ali Hadian, Behzad Ghaffari, Taiyi Wang, Thomas Heinis	(参考訳) 最近の研究は、パフォーマンスを改善するために基礎となるデータセットの分布を学習する学習インデックス構造を提案している。学習されたインデックスに関する最初の研究は、データの累積分布関数を学習することで、b木のようなインデックス構造がメモリフットプリントを小さくしながら、その性能を1桁改善できることを示した。本稿では,鍵の分布を学習する代わりに,データセットの属性間の相関関係を学習する多次元データのための学習指標であるCOAXを提案する。我々のアプローチは、多くのデータセットにおいて、2つの(または複数の)属性の値が相関しているという観測によって導かれる。 COAXはこれらの相関を利用してデータセットの次元を減少させる。より正確には、ある(または複数の)属性を残りの属性から$c_d$を推測する方法を学び、したがって$c_d$をインデックスする必要がない。これにより次元が小さくなり、インデックスはより小さくより効率的になる。提案手法の有効性をFD属性の予測可能性に基づいて理論的に検討する。さらに,データ中の関連属性を予測することにより,クエリ実行時間を短縮し,インデックスのメモリオーバーヘッドを低減できることを実験的に示す。実験では,インデックスのメモリフットプリントを4桁に減らしながら,実行時間を25%削減した。 Recent work proposed learned index structures, which learn the distribution of the underlying dataset to improve performance. The initial work on learned indexes has shown that by learning the cumulative distribution function of the data, index structures such as the B-Tree can improve their performance by one order of magnitude while having a smaller memory footprint. In this paper, we present COAX, a learned index for multidimensional data that, instead of learning the distribution of keys, learns the correlations between attributes of the dataset. Our approach is driven by the observation that in many datasets, values of two (or multiple) attributes are correlated. COAX exploits these correlations to reduce the dimensionality of the datasets. More precisely, we learn how to infer one (or multiple) attribute $C_d$ from the remaining attributes and hence no longer need to index attribute $C_d$. This reduces the dimensionality and hence makes the index smaller and more efficient. We theoretically investigate the effectiveness of the proposed technique based on the predictability of the FD attributes. 
We further show experimentally that by predicting correlated attributes in the data, we can improve the query execution time and reduce the memory overhead of the index. In our experiments, we reduce the execution time by 25% while reducing the memory footprint of the index by four orders of magnitude.	翻訳日:2022-11-15 15:34:15 公開日:2021-02-02
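The general idea of dropping a correlated attribute from an index can be sketched as follows: fit a simple model that predicts the dependent attribute from an indexed one, record the worst prediction error, and translate range predicates on the dependent attribute into ranges on the indexed one. This is a minimal sketch assuming a single linear, positively sloped dependency. The function names, the list scan standing in for an index lookup, and the example data are all hypothetical and are not COAX's actual implementation.

```python
def fit_line(xs, ys):
    """Least-squares fit ys ~ a * xs + b, plus the largest absolute residual."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    var_x = sum((x - mean_x) ** 2 for x in xs)
    a = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / var_x
    b = mean_y - a * mean_x
    eps = max(abs(y - (a * x + b)) for x, y in zip(xs, ys))
    return a, b, eps

def translate_range(lo, hi, a, b, eps):
    """Map a range on the dependent attribute to a range on the indexed one (a > 0)."""
    return (lo - eps - b) / a, (hi + eps - b) / a

def range_query(rows, lo, hi, model):
    """Answer lo <= y <= hi using only a scan over x, then filter false positives."""
    a, b, eps = model
    x_lo, x_hi = translate_range(lo, hi, a, b, eps)
    # Stand-in for an index scan on x; a real index would avoid the full pass.
    candidates = [r for r in rows if x_lo <= r[0] <= x_hi]
    return [r for r in candidates if lo <= r[1] <= hi]

# Hypothetical rows (x, y) with y roughly equal to 2 * x.
rows = [(1, 2.1), (2, 3.9), (3, 6.2), (4, 7.8), (5, 10.1), (6, 11.9)]
model = fit_line([r[0] for r in rows], [r[1] for r in rows])
result = range_query(rows, 5, 9, model)
```

Because every row satisfies |y - (a x + b)| <= eps, widening the translated range by eps guarantees that no matching row is missed. The final filter only removes false positives.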
# 入場拡大を伴うカスケード推論による効率的なコンフォメーション予測 Efficient Conformal Prediction via Cascaded Inference with Expanded Admission ( http://arxiv.org/abs/2007.03114v3 ) ライセンス: Link先を確認	Adam Fisch, Tal Schuster, Tommi Jaakkola, Regina Barzilay	(参考訳) 本稿では,1つの予測に代えて,予測候補の集合を特定することを目的とした,共形予測(CP)の新しいアプローチを提案する。この集合は高い確率で正しい解を含むことが保証され、多くのオープンな分類タスクに適している。標準CPパラダイムでは、予測された集合は使用不能に大きくなり、得られるコストもかかる。これは、正しい答えが一意ではなく、可能な答えの総数は高い設定で特に広まっています。まずcpの正しさ基準を拡張して,推定可能な「許容」回答を追加可能とし,有効な性能保証を提供しながら,予測セットのサイズを大幅に削減する。第二に、予測カスケードを適合させることでコストを減らし、より強力な分類器-アゲインを用いて、早期に不明瞭なラベルを積極的に作成し、有効な性能保証を提供する。薬物発見のための自然言語処理と計算化学の複数の応用におけるアプローチの実証的有効性を示す。 In this paper, we present a novel approach for conformal prediction (CP), in which we aim to identify a set of promising prediction candidates -- in place of a single prediction. This set is guaranteed to contain a correct answer with high probability, and is well-suited for many open-ended classification tasks. In the standard CP paradigm, the predicted set can often be unusably large and also costly to obtain. This is particularly pervasive in settings where the correct answer is not unique, and the number of total possible answers is high. We first expand the CP correctness criterion to allow for additional, inferred "admissible" answers, which can substantially reduce the size of the predicted set while still providing valid performance guarantees. Second, we amortize costs by conformalizing prediction cascades, in which we aggressively prune implausible labels early on by using progressively stronger classifiers -- again, while still providing valid performance guarantees. We demonstrate the empirical effectiveness of our approach for multiple applications in natural language processing and computational chemistry for drug discovery.	翻訳日:2022-11-13 01:52:21 公開日:2021-02-02
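For readers unfamiliar with the standard CP paradigm that this abstract builds on, the following is a minimal sketch of split conformal prediction for classification. It does not implement the paper's expanded admission criterion or its cascades. The nonconformity score (one minus the model's confidence), the function names, and all numbers are assumptions chosen for illustration.

```python
import math

def conformal_threshold(cal_scores, alpha):
    """Return the split-conformal threshold for a list of nonconformity scores."""
    n = len(cal_scores)
    # Rank of the (1 - alpha) quantile with the finite-sample (n + 1) correction.
    k = math.ceil((n + 1) * (1 - alpha))
    if k > n:
        return float("inf")  # too few calibration points: admit every label
    return sorted(cal_scores)[k - 1]

def prediction_set(confidences, threshold):
    """Keep every label whose nonconformity score 1 - confidence is within the threshold."""
    return {label for label, c in confidences.items() if 1.0 - c <= threshold}

# Hypothetical calibration scores: 1 - (confidence assigned to the true label).
cal_scores = [0.05, 0.10, 0.20, 0.30, 0.40, 0.15, 0.25]
tau = conformal_threshold(cal_scores, alpha=0.25)

# Hypothetical model confidences for one test input.
candidates = prediction_set({"a": 0.875, "b": 0.75, "c": 0.125}, tau)
```

Under the usual exchangeability assumption, a set built this way contains the true label with probability at least 1 - alpha. The abstract's contributions address how large and how costly such sets can become.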
# ビデオ分類のための地域別非ローカル操作 Region-based Non-local Operation for Video Classification ( http://arxiv.org/abs/2007.09033v5 ) ライセンス: Link先を確認	Guoxi Huang and Adrian G. Bors	(参考訳) 畳み込みニューラルネットワーク(cnns)は、小さなウィンドウサイズで畳み込み操作を深く積み重ねることで、長距離依存性をモデル化する。本稿では,ローカル操作の深いスタックを使わずに,長距離依存関係を直接キャプチャできる自己注意機構のファミリーとして,地域ベースの非ローカル操作(RNL)を提案する。中間特徴マップが与えられると、全ての位置の隣接領域から情報を集約することにより、その特徴を位置で再調整する。チャネルアテンションモジュールと提案したRNLを組み合わせることで,市販のCNNに組み込んだアテンションチェーンを設計し,エンドツーエンドのトレーニングを行う。本手法を2つのビデオ分類ベンチマークで評価する。提案手法の実験結果は,他の注意機構よりも優れており,Something V1データセットの最先端性能を実現している。 Convolutional Neural Networks (CNNs) model long-range dependencies by deeply stacking convolution operations with small window sizes, which makes the optimizations difficult. This paper presents region-based non-local (RNL) operations as a family of self-attention mechanisms, which can directly capture long-range dependencies without using a deep stack of local operations. Given an intermediate feature map, our method recalibrates the feature at a position by aggregating the information from the neighboring regions of all positions. By combining a channel attention module with the proposed RNL, we design an attention chain, which can be integrated into the off-the-shelf CNNs for end-to-end training. We evaluate our method on two video classification benchmarks. The experimental results of our method outperform other attention mechanisms, and we achieve state-of-the-art performance on the Something-Something V1 dataset.	翻訳日:2022-11-09 14:14:14 公開日:2021-02-02
# 音声感情認識のためのコンパクトグラフアーキテクチャ Compact Graph Architecture for Speech Emotion Recognition ( http://arxiv.org/abs/2008.02063v4 ) ライセンス: Link先を確認	A. Shirian, T. Guha	(参考訳) 本稿では,音声感情認識の課題に対処するディープグラフアプローチを提案する。データを表現するコンパクトで効率的でスケーラブルな方法は、グラフの形式です。グラフ信号処理の理論に倣って,周期グラフや線グラフとして音声信号をモデル化することを提案する。このようなグラフ構造により、標準的なGCNで使用される近似畳み込みとは対照的に、正確なグラフ畳み込みを行うことができるグラフ畳み込みネットワーク(GCN)ベースのアーキテクチャを構築することができる。一般的なIEMOCAPとMSP-IMPROVデータベースを用いた音声感情認識モデルの性能評価を行った。我々のモデルは、標準的なGCNや他の関連するディープグラフアーキテクチャよりも優れている。既存の音声感情認識法と比較すると,学習可能なパラメータ(約30K)が大幅に少なく,資源制約のあるデバイスに適用可能であることを示す。 We propose a deep graph approach to address the task of speech emotion recognition. A compact, efficient and scalable way to represent data is in the form of graphs. Following the theory of graph signal processing, we propose to model speech signal as a cycle graph or a line graph. Such graph structure enables us to construct a Graph Convolution Network (GCN)-based architecture that can perform an accurate graph convolution in contrast to the approximate convolution used in standard GCNs. We evaluated the performance of our model for speech emotion recognition on the popular IEMOCAP and MSP-IMPROV databases. Our model outperforms standard GCN and other relevant deep graph architectures indicating the effectiveness of our approach. When compared with existing speech emotion recognition methods, our model achieves comparable performance to the state-of-the-art with significantly fewer learnable parameters (~30K) indicating its applicability in resource-constrained devices.	翻訳日:2022-11-02 18:03:15 公開日:2021-02-02
# ConvBERT: Spanベースの動的畳み込みによるBERTの改善 ConvBERT: Improving BERT with Span-based Dynamic Convolution ( http://arxiv.org/abs/2008.02496v3 ) ライセンス: Link先を確認	Zihang Jiang, Weihao Yu, Daquan Zhou, Yunpeng Chen, Jiashi Feng, Shuicheng Yan	(参考訳) BERTのような事前訓練された言語モデルとその変種は、最近、様々な自然言語理解タスクにおいて印象的なパフォーマンスを達成した。しかし、BERTはグローバルな自己保持ブロックに大きく依存しているため、メモリフットプリントと計算コストが大きくなる。すべての注意は、グローバルな視点からアテンションマップを生成するための入力シーケンス全体に問い合わせるが、いくつかのヘッドは、局所的な依存関係のみを学ぶ必要がある、つまり計算冗長性の存在を観察する。そこで本研究では,これらの自己注意型ヘッダを置き換え,局所的依存関係を直接モデル化する,スパンベースの動的畳み込みを提案する。新たな畳み込み頭は、他の自己注意頭と共に、グローバルな文脈学習とローカルな文脈学習の両方においてより効率的である新しい混合注意ブロックを形成する。 BERTにこの混合注意設計を装備し、ConvBERTモデルを構築します。実験によると、ConvBERTはBERTとその変種を様々な下流タスクで大幅に上回り、トレーニングコストが低く、モデルのパラメータも少ない。注目すべきは、ConvBERTbase モデルは 86.4 GLUE スコアで、ELECTRAbase よりも 0.7 高い。コードと事前訓練されたモデルがリリースされる。 Pre-trained language models like BERT and its variants have recently achieved impressive performance in various natural language understanding tasks. However, BERT relies heavily on the global self-attention block and thus suffers from a large memory footprint and computation cost. Although all its attention heads query the whole input sequence to generate the attention map from a global perspective, we observe that some heads only need to learn local dependencies, which indicates computation redundancy. We therefore propose a novel span-based dynamic convolution to replace these self-attention heads and directly model local dependencies. The novel convolution heads, together with the remaining self-attention heads, form a new mixed attention block that is more efficient at both global and local context learning. We equip BERT with this mixed attention design and build a ConvBERT model. Experiments have shown that ConvBERT significantly outperforms BERT and its variants in various downstream tasks, with lower training cost and fewer model parameters. Remarkably, the ConvBERTbase model achieves a GLUE score of 86.4, 0.7 higher than ELECTRAbase, while using less than 1/4 of the training cost. 
Code and pre-trained models will be released.	翻訳日:2022-11-02 07:12:36 公開日:2021-02-02
# 汎用適応型人工知能システム設計のための有効摂動ネットワーク Beneficial Perturbation Network for designing general adaptive artificial intelligence systems ( http://arxiv.org/abs/2009.13954v2 ) ライセンス: Link先を確認	Shixian Wen, Amanda Rios, Yunhao Ge, Laurent Itti	(参考訳) 人間の脳は適応学習の金の標準である。経験から学び、利益を得るだけでなく、新しい状況にも適応できるのです。対照的に、ディープニューラルネットワークは、入力から出力への洗練された、固定されたマッピングのみを学習する。これにより適用性がよりダイナミックな状況に制限され、入力から出力マッピングが異なるコンテキストで変化する可能性がある。新しい独立したタスクを、前のタスクを忘れずにシーケンシャルに学習する。勾配勾配勾配を用いたニューラルネットワークにおける複数のタスクの連続的な学習は、破滅的な忘れを招き、新しいタスクの新しいマッピングを学ぶ際に、以前のタスクのマッピングが消去される。本稿では,これらの動的状況に対応するために,ネットワーク外,タスク依存のバイアスユニットを備えた,生物学的に可能な新しい深層ニューラルネットワークを提案する。これにより、単一のネットワークが初めて、出力マッピングに対する潜在的に無制限な並列入力を学習し、実行時にオンザフライを切り替えることが可能になる。バイアスユニットは、各タスクに有益な摂動(よく知られた対向的な摂動)を活用することでプログラムされる。与えられたタスクに対する有益な摂動は、そのタスクに対してネットワークを偏り、そのタスクを処理するためにネットワークを別のモードに切り替える。これにより、タスク間の破滅的な干渉がなくなる。我々のアプローチはメモリ効率が高くパラメータ効率が高く、多くのタスクに対応でき、様々なタスクやドメインで最先端のパフォーマンスを実現する。 The human brain is the gold standard of adaptive learning. It not only can learn and benefit from experience, but also can adapt to new situations. In contrast, deep neural networks only learn one sophisticated but fixed mapping from inputs to outputs. This limits their applicability to more dynamic situations, where input to output mapping may change with different contexts. A salient example is continual learning - learning new independent tasks sequentially without forgetting previous tasks. Continual learning of multiple tasks in artificial neural networks using gradient descent leads to catastrophic forgetting, whereby a previously learned mapping of an old task is erased when learning new mappings for new tasks. Here, we propose a new biologically plausible type of deep neural network with extra, out-of-network, task-dependent biasing units to accommodate these dynamic situations. This allows, for the first time, a single network to learn potentially unlimited parallel input to output mappings, and to switch on the fly between them at runtime. 
Biasing units are programmed by leveraging beneficial perturbations (opposite to well-known adversarial perturbations) for each task. Beneficial perturbations for a given task bias the network toward that task, essentially switching the network into a different mode to process that task. This largely eliminates catastrophic interference between tasks. Our approach is memory-efficient and parameter-efficient, can accommodate many tasks, and achieves state-of-the-art performance across different tasks and domains.	翻訳日:2022-10-14 03:07:36 公開日:2021-02-02
# ノイズ点雲データを用いた果樹構造解析のためのグラフベース手法 Graph-based methods for analyzing orchard tree structure using noisy point cloud data ( http://arxiv.org/abs/2009.13727v2 ) ライセンス: Link先を確認	Fredrik Westling, Dr James Underwood, Dr Mitch Bryson	(参考訳) LiDARを用いた果樹のディジタイズにより、成長するプラクティスを改良して収量を改善するために使用できる分析が可能になる。高度な分析には、個々の木を識別する機能や、葉状および構造的物質の識別など、データの幾何学的および意味的な理解が必要である。この情報の抽出は、データキャプチャーのように迅速で、果樹園全体を処理できるが、既存の分類とセグメンテーションの方法は、高品質のデータやカメラのような追加のデータソースに依存している。本稿では,手持ちまたは移動式LiDARが取得した低品質データに基づいて,個々の木の位置,区分,物質分類に特化してLiDARデータを解析する手法を提案する。 F1スコアが0.774,v尺度が0.915,トランク物質分類が0.490,実データが平均的なF1スコアが0.490,既存手法が一貫して向上し,実行時間が大幅に短縮された。 Digitisation of fruit trees using LiDAR enables analysis which can be used to better growing practices to improve yield. Sophisticated analysis requires geometric and semantic understanding of the data, including the ability to discern individual trees as well as identifying leafy and structural matter. Extraction of this information should be rapid, as should data capture, so that entire orchards can be processed, but existing methods for classification and segmentation rely on high-quality data or additional data sources like cameras. We present a method for analysis of LiDAR data specifically for individual tree location, segmentation and matter classification, which can operate on low-quality data captured by handheld or mobile LiDAR. Our methods for tree location and segmentation improved on existing methods with an F1 score of 0.774 and a v-measure of 0.915 respectively, while trunk matter classification performed poorly in absolute terms with an average F1 score of 0.490 on real data, though consistently outperformed existing methods and displayed a significantly shorter runtime.	翻訳日:2022-10-13 06:55:48 公開日:2021-02-02
# リカレントメモリを有する段落レベルのコモンセンストランスフォーマ Paragraph-level Commonsense Transformers with Recurrent Memory ( http://arxiv.org/abs/2010.01486v2 ) ライセンス: Link先を確認	Saadia Gabriel, Chandra Bhagavatula, Vered Shwartz, Ronan Le Bras, Maxwell Forbes, Yejin Choi	(参考訳) 物語のテキストに対する人間の理解は、テキストで明示的に述べられているものを超えて常識的推論を必要とする。最近のCOMETモデルでは、プレ条件やポスト条件、モチベーション、そして参加者の精神状態など、いくつかの次元に沿って、このような暗黙のコモンセンス推論を生成できる。しかし、COMETは短いフレーズのコモンセンス推論に基づいて訓練されたため、談話に依存しない。多元的な物語の各文で提示されると、その物語の他の部分と矛盾する推論を生成する可能性がある。談話認識コモンセンス推論の課題について述べる。物語の中の文が与えられると、目標は、物語の他の部分との一貫性を維持しながら、予め定義された次元に沿って常識的な推論を生成することである。このような大規模段落レベルのアノテーションは入手やコストがかかるため、文レベルのアノテーションを使用して、遠隔で管理されたコーパスを効率的にかつ自動的に構築する。このコーパスを用いて,物語からコヒーレントなコモンセンス推論を生成するために段落レベルの情報を含む談話認識モデルであるPARA-COMETを訓練する。 PARA-COMETは、前世界知識に関連する意味的知識と、現在の出来事が物語における前と将来の出来事にどのように関係しているかに関する叙述的知識の両方を捉えている。以上の結果から,PARA-COMETは文レベルのベースライン,特にコヒーレントかつ新規な推論に優れていた。 Human understanding of narrative texts requires making commonsense inferences beyond what is stated explicitly in the text. A recent model, COMET, can generate such implicit commonsense inferences along several dimensions such as pre- and post-conditions, motivations, and mental states of the participants. However, COMET was trained on commonsense inferences of short phrases, and is therefore discourse-agnostic. When presented with each sentence of a multi-sentence narrative, it might generate inferences that are inconsistent with the rest of the narrative. We present the task of discourse-aware commonsense inference. Given a sentence within a narrative, the goal is to generate commonsense inferences along predefined dimensions, while maintaining coherence with the rest of the narrative. Such large-scale paragraph-level annotation is hard to get and costly, so we use available sentence-level annotations to efficiently and automatically construct a distantly supervised corpus. 
Using this corpus, we train PARA-COMET, a discourse-aware model that incorporates paragraph-level information to generate coherent commonsense inferences from narratives. PARA-COMET captures both semantic knowledge pertaining to prior world knowledge, and episodic knowledge involving how current events relate to prior and future events in a narrative. Our results show that PARA-COMET outperforms the sentence-level baselines, particularly in generating inferences that are both coherent and novel.	翻訳日:2022-10-11 03:05:53 公開日:2021-02-02
# FaultNet: 断層分類のための深部畳み込みニューラルネットワーク FaultNet: A Deep Convolutional Neural Network for bearing fault classification ( http://arxiv.org/abs/2010.02146v2 ) ライセンス: Link先を確認	Rishikesh Magar, Lalit Ghule, Junhan Li, Yang Zhao and Amir Barati Farimani	(参考訳) 生産フロアにおける高度なセンサーの存在が増加し、マシンの健康に関する重要な洞察を提供するデータセットの収集につながった。機械の健康の重要かつ信頼性の高い指標である振動信号データにより、機械系で発生した異なる故障の理解を深めることができる。そこで本研究では, 異なる信号処理法を組み合わせることで, 機械系の振動信号データを解析し, 各種軸受故障を分類するための機械学習手法と結合する。また, 異なる信号処理手法を用いることの重要性を強調し, 故障検出の精度への影響を分析する。また,従来の機械学習アルゴリズムとは別に,高い精度で軸受故障の種類を効果的に判定できる畳み込みニューラルネットワークフォールトネットを提案する。本研究の差別化要因は,信号からより多くの情報を抽出するためのチャネルの提案であり,さらに精度の高い信号の分類に有用な特徴を抽出するために,平均チャネルとメディアチャネルを生信号に積み重ねた。 The increased presence of advanced sensors on production floors has led to the collection of datasets that can provide significant insights into machine health. Vibration signal data, an important and reliable indicator of machine health, can give us a greater understanding of the different faults occurring in mechanical systems. In this work, we analyze vibration signal data of mechanical systems with bearings by combining different signal processing methods and coupling them with machine learning techniques to classify different types of bearing faults. We also highlight the importance of using different signal processing methods and analyze their effect on accuracy for bearing fault detection. Apart from the traditional machine learning algorithms, we also propose a convolutional neural network, FaultNet, which can effectively determine the type of bearing fault with a high degree of accuracy. The distinguishing factor of this work is the idea of channels, proposed to extract more information from the signal: we stack the Mean and Median channels on the raw signal to extract more useful features and classify the signals with greater accuracy.	翻訳日:2022-10-10 22:25:45 公開日:2021-02-02
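Stacking mean and median channels on a raw signal might look like the sketch below. The abstract does not say how the two extra channels are computed, so the centred sliding window, its size, the edge handling, and the function names are all assumptions made for illustration.

```python
from statistics import mean, median

def sliding(signal, window, reducer):
    """Apply reducer over a centred window, shrinking the window at the edges."""
    half = window // 2
    return [reducer(signal[max(0, i - half): i + half + 1]) for i in range(len(signal))]

def stack_channels(signal, window=3):
    """Return [raw, mean-filtered, median-filtered] channels of equal length."""
    return [list(signal), sliding(signal, window, mean), sliding(signal, window, median)]

# Hypothetical vibration samples with one impulsive spike.
channels = stack_channels([0.0, 1.0, 8.0, 1.0, 0.0])
```

In this toy signal the mean channel smears the spike across its neighbours while the median channel suppresses it, so the three channels carry complementary views of the same samples for a convolutional network to consume.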
# 主最適輸送方向を用いた分類のための十分次元削減 Sufficient dimension reduction for classification using principal optimal transport direction ( http://arxiv.org/abs/2010.09921v4 ) ライセンス: Link先を確認	Cheng Meng and Jun Yu and Jingyi Zhang and Ping Ma and Wenxuan Zhong	(参考訳) 十分な次元還元は教師付き次元還元アプローチとして広く用いられる。既存の十分な次元縮小法は、連続応答を持つデータのために開発され、カテゴリー応答、特にバイナリ応答に対して不十分な性能を持つ可能性がある。この問題に対処するために,最適輸送を用いた十分次元縮小部分空間(SDR部分空間)の新たな推定法を提案する。提案手法は主最適輸送方向 (POTD) と命名され, 応答カテゴリの異なるデータ間の最適輸送結合の主方向を用いてSDR部分空間の基底を推定する。提案手法は, 十分次元縮小, 支持ベクトルマシン, 最適輸送という, 一見無関係な3つのトピック間の関係も明らかにする。我々はPOTDの漸近特性を調査し、クラスラベルにエラーがない場合、POTDはSDR部分空間のみを推定する。実証的な研究により、POTDは最先端の線形次元還元法よりも優れていた。 Sufficient dimension reduction is used pervasively as a supervised dimension reduction approach. Most existing sufficient dimension reduction methods are developed for data with a continuous response and may have an unsatisfactory performance for the categorical response, especially for the binary-response. To address this issue, we propose a novel estimation method of sufficient dimension reduction subspace (SDR subspace) using optimal transport. The proposed method, named principal optimal transport direction (POTD), estimates the basis of the SDR subspace using the principal directions of the optimal transport coupling between the data respecting different response categories. The proposed method also reveals the relationship among three seemingly irrelevant topics, i.e., sufficient dimension reduction, support vector machine, and optimal transport. We study the asymptotic properties of POTD and show that in the cases when the class labels contain no error, POTD estimates the SDR subspace exclusively. Empirical studies show POTD outperforms most of the state-of-the-art linear dimension reduction methods.	翻訳日:2022-10-05 21:32:36 公開日:2021-02-02
# Similarity Analysis of Self-Supervised Speech Representations ( http://arxiv.org/abs/2010.11481v2 ) License: see link	Yu-An Chung and Yonatan Belinkov and James Glass	Self-supervised speech representation learning has recently been a thriving research topic. Many algorithms have been proposed for learning useful representations from large-scale unlabeled data, and their applications to a wide range of speech tasks have also been investigated. However, there has been little research focusing on understanding the properties of existing approaches. In this work, we aim to provide a comparative study of some of the most representative self-supervised algorithms. Specifically, we quantify the similarities between different self-supervised representations using existing similarity measures. We also design probing tasks to study the correlation between the models' pre-training loss and the amount of specific speech information contained in their learned representations. In addition to showing how various self-supervised models behave differently given the same input, our study also finds that the training objective has a higher impact on representation similarity than architectural choices such as building blocks (RNN/Transformer/CNN) and directionality (uni/bidirectional). Our results also suggest that there exists a strong correlation between pre-training loss and downstream performance for some self-supervised algorithms.	Translated: 2022-10-04 05:56:28 Published: 2021-02-02
# RH-Net: Improving Neural Relation Extraction via Reinforcement Learning and Hierarchical Relational Searching ( http://arxiv.org/abs/2010.14255v2 ) License: see link	Jianing Wang	Distant supervision (DS) aims to generate a large-scale heuristically labeled corpus, which is currently widely used for neural relation extraction. However, it suffers heavily from noisy labeling and long-tail distribution problems. Many advanced approaches address the two problems separately, ignoring their mutual interactions. In this paper, we propose a novel framework named RH-Net, which utilizes Reinforcement learning and a Hierarchical relational searching module to improve relation extraction. We leverage reinforcement learning to instruct the model to select high-quality instances. We then propose the hierarchical relational searching module to share the semantics from correlative instances between data-rich and data-poor classes. During the iterative process, the two modules keep interacting to alleviate the noisy-labeling and long-tail problems simultaneously. Extensive experiments on the widely used NYT data set clearly show that our method achieves significant improvements over state-of-the-art baselines.	Translated: 2022-10-02 11:03:29 Published: 2021-02-02
# Leveraging Activity Recognition to Enable Protective Behavior Detection in Continuous Data ( http://arxiv.org/abs/2011.01776v4 ) License: see link	Chongyang Wang, Yuan Gao, Akhil Mathur, Amanda C. De C. Williams, Nicholas D. Lane, Nadia Bianchi-Berthouze	Protective behavior exhibited by people with chronic pain (CP) during physical activities is the key to understanding their physical and emotional states. Existing automatic protective behavior detection (PBD) methods rely on pre-segmentation of activities predefined by users. However, in real life, people perform activities casually. Therefore, where those activities present difficulties for people with chronic pain, technology-enabled support should be delivered continuously and automatically adapted to activity type and occurrence of protective behavior. Hence, to facilitate ubiquitous CP management, it becomes critical to enable accurate PBD over continuous data. In this paper, we propose to integrate human activity recognition (HAR) with PBD via a novel hierarchical HAR-PBD architecture comprising graph-convolution and long short-term memory (GC-LSTM) networks, and alleviate class imbalances using a class-balanced focal categorical-cross-entropy (CFCC) loss. Through in-depth evaluation of the approach using a CP patients' dataset, we show that leveraging HAR, GC-LSTM networks, and CFCC loss leads to a clear increase in PBD performance against the baseline (macro F1 score of 0.81 vs. 0.66 and precision-recall area-under-the-curve (PR-AUC) of 0.60 vs. 0.44). We conclude by discussing possible use cases of the hierarchical architecture in CP management and beyond. We also discuss current limitations and ways forward.	Translated: 2022-09-30 05:37:52 Published: 2021-02-02
# Leveraging Regular Fundus Images for Training UWF Fundus Diagnosis Models via Adversarial Learning and Pseudo-Labeling ( http://arxiv.org/abs/2011.13816v2 ) License: see link	Lie Ju, Xin Wang, Xin Zhao, Paul Bonnington, Tom Drummond, Zongyuan Ge	Recently, ultra-widefield (UWF) 200° fundus imaging by Optos cameras has gradually been introduced because it offers a broader view and detects more information on the fundus than regular 30°-60° fundus cameras. Compared with UWF fundus images, regular fundus images contain a large amount of high-quality and well-annotated data. Due to the domain gap, models trained on regular fundus images perform poorly when recognizing UWF fundus images. Hence, given that annotating medical data is labor intensive and time consuming, in this paper, we explore how to leverage regular fundus images to supplement the limited UWF fundus data and annotations for more efficient training. We propose the use of a modified cycle generative adversarial network (CycleGAN) model to bridge the gap between regular and UWF fundus and generate additional UWF fundus images for training. A consistency regularization term is proposed in the loss of the GAN to improve and regulate the quality of the generated data. Our method does not require that images from the two domains be paired or even that the semantic labels be the same, which provides great convenience for data collection. Furthermore, we show that our method is robust to noise and errors introduced by the generated unlabeled data with the pseudo-labeling technique. We evaluated the effectiveness of our methods on several common fundus diseases and tasks, such as diabetic retinopathy (DR) classification, lesion detection and tessellated fundus segmentation. The experimental results demonstrate that our proposed method simultaneously achieves superior generalizability of the learned representations and performance improvements in multiple tasks.	Translated: 2022-09-20 02:21:52 Published: 2021-02-02
# MAVIDH Score: A COVID-19 Severity Scoring using Chest X-Ray Pathology Features ( http://arxiv.org/abs/2011.14983v3 ) License: CC BY 4.0	Douglas P. S. Gomes, Michael J. Horry, Anwaar Ulhaq, Manoranjan Paul, Subrata Chakraborty, Manash Saha, Tanmoy Debnath, D.M. Motiur Rahaman	The application of computer vision for COVID-19 diagnosis is complex and challenging, given the risks associated with patient misclassifications. Arguably, the primary value of medical imaging for COVID-19 lies rather in patient prognosis. Radiological images can guide physicians in assessing the severity of the disease, and a series of images from the same patient at different stages can help to gauge disease progression. Hence, a simple method based on lung-pathology interpretable features for scoring disease severity from Chest X-rays is proposed here. As the primary contribution, this method correlates well with patient severity in different stages of disease progression, with competitive results compared to other existing, more complex methods. An original data selection approach is also proposed, allowing the simple model to learn the severity-related features. It is hypothesized that the resulting competitive performance presented here is related to the method being feature-based rather than reliant on lung involvement or opacity, as other methods in the literature are. A second contribution comes from the validation of the results, conceptualized as the scoring of patient groups from different stages of the disease. Besides performing such validation on an independent data set, the results were also compared with other proposed scoring methods in the literature. The results show that there is a significant correlation between the scoring system (MAVIDH) and patient outcome, which could potentially help physicians rate and follow disease progression in COVID-19 patients.	Translated: 2021-06-07 00:49:22 Published: 2021-02-02
# Self-correcting Q-Learning ( http://arxiv.org/abs/2012.01100v2 ) License: see link	Rong Zhu and Mattia Rigotti	The Q-learning algorithm is known to be affected by the maximization bias, i.e. the systematic overestimation of action values, an important issue that has recently received renewed attention. Double Q-learning has been proposed as an efficient algorithm to mitigate this bias. However, this comes at the price of an underestimation of action values, in addition to increased memory requirements and slower convergence. In this paper, we introduce a new way to address the maximization bias in the form of a "self-correcting algorithm" for approximating the maximum of an expected value. Our method balances the overestimation of the single estimator used in conventional Q-learning and the underestimation of the double estimator used in Double Q-learning. Applying this strategy to Q-learning results in Self-correcting Q-learning. We show theoretically that this new algorithm enjoys the same convergence guarantees as Q-learning while being more accurate. Empirically, it performs better than Double Q-learning in domains with rewards of high variance, and it even attains faster convergence than Q-learning in domains with rewards of zero or low variance. These advantages transfer to a Deep Q Network implementation that we call Self-correcting DQN and which outperforms regular DQN and Double DQN on several tasks in the Atari 2600 domain.	Translated: 2021-05-25 03:52:26 Published: 2021-02-02
# 6-Layer Model for a Structured Description and Categorization of Urban Traffic and Environment ( http://arxiv.org/abs/2012.06319v2 ) License: CC BY 4.0	Maike Scholtes, Lukas Westhofen, Lara Ruth Turner, Katrin Lotto, Michael Schuldes, Hendrik Weber, Nicolas Wagener, Christian Neurohr, Martin Bollmann, Franziska Körtke, Johannes Hiller, Michael Hoss, Julian Bock, Lutz Eckstein	Verification and validation of automated driving functions pose major challenges. Currently, scenario-based approaches are investigated in research and industry, aiming at a reduction of testing efforts by specifying safety-relevant scenarios. To define those scenarios and operate in a complex real-world design domain, a structured description of the environment is needed. Within the PEGASUS research project, the 6-Layer Model (6LM) was introduced for the description of highway scenarios. This paper refines the 6LM and extends it to urban traffic and environment. As defined in PEGASUS, the 6LM provides the possibility to categorize the environment and, therefore, functions as a structured basis for subsequent scenario description. The model enables a structured description and categorization of the general environment, without incorporating any knowledge or anticipating any functions of actors. Beyond that, there is a variety of other applications of the 6LM, which are elaborated in this paper. The 6LM includes a description of the road network and traffic guidance objects, roadside structures, temporary modifications of the former, dynamic objects, environmental conditions and digital information. The work at hand specifies each layer by categorizing its items. Guidelines are formulated and explanatory examples are given to standardize the application of the model for an objective environment description. In contrast to previous publications, the model and its design are described in far more detail. Finally, the holistic description of the 6LM presented includes remarks on possible future work when expanding the concept to machine perception aspects.	Translated: 2021-05-16 08:31:24 Published: 2021-02-02
# Flexible, Non-parametric Modeling Using Regularized Neural Networks ( http://arxiv.org/abs/2012.11369v2 ) License: see link	Oskar Allerbo, Rebecka Jörnsten	Non-parametric regression, such as generalized additive models (GAMs), is able to capture complex data dependencies in a flexible, yet interpretable way. However, choosing the format of the additive components often requires non-trivial data exploration. Here, we propose an alternative to GAMs, PrAda-net, which uses a one-hidden-layer neural network, trained with proximal gradient descent and adaptive lasso. PrAda-net automatically adjusts the size and architecture of the neural network to capture the complexity and structure of the underlying data generative model. The compact network obtained by PrAda-net can be translated to additive model components, making it suitable for non-parametric statistical modelling with automatic model selection. We demonstrate PrAda-net on simulated data, where we compare the test error performance, variable importance and variable subset identification properties of PrAda-net to other lasso-based approaches. We also apply PrAda-net to the massive U.K. black smoke data set, to demonstrate the capability of using PrAda-net as an alternative to GAMs. In contrast to GAMs, which often require domain knowledge to select the functional forms of the additive components, PrAda-net requires no such pre-selection while still resulting in interpretable additive components.	Translated: 2021-05-01 17:57:30 Published: 2021-02-02
# Adaptive Bi-directional Attention: Exploring Multi-Granularity Representations for Machine Reading Comprehension ( http://arxiv.org/abs/2012.10877v2 ) License: CC BY 4.0	Nuo Chen, Fenglin Liu, Chenyu You, Peilin Zhou, Yuexian Zou	Recently, the attention-enhanced multi-layer encoder, such as Transformer, has been extensively studied in Machine Reading Comprehension (MRC). To predict the answer, it is common practice to employ a predictor to draw information only from the final encoder layer, which generates the coarse-grained representations of the source sequences, i.e., passage and question. Previous studies have shown that the representation of the source sequence changes from fine-grained to more coarse-grained as the number of encoding layers increases. It is generally believed that with the growing number of layers in deep neural networks, the encoding process will gather increasingly more relevant information for each location, resulting in more coarse-grained representations, which increases the likelihood of similarity to other locations (i.e., homogeneity). Such a phenomenon will mislead the model into making wrong judgments and thus degrade the performance. To this end, we propose a novel approach called Adaptive Bidirectional Attention, which adaptively exploits the source representations of different levels for the predictor. Experimental results on the benchmark dataset SQuAD 2.0 demonstrate the effectiveness of our approach, and the results are better than the previous state-of-the-art model by 2.5% EM and 2.3% F1 scores.	Translated: 2021-05-01 08:45:32 Published: 2021-02-02
# Towards the Localisation of Lesions in Diabetic Retinopathy ( http://arxiv.org/abs/2012.11432v2 ) License: CC BY 4.0	Samuel Ofosu Mensah, Bubacarr Bah, Willie Brink	Convolutional Neural Networks (CNNs) have successfully been used to classify diabetic retinopathy (DR) fundus images in recent times. However, deeper representations in CNNs may capture higher-level semantics at the expense of spatial resolution. To make predictions usable for ophthalmologists, we use a post-attention technique called Gradient-weighted Class Activation Mapping (Grad-CAM) on the penultimate layer of deep learning models to produce coarse localisation maps on DR fundus images. This is to help identify discriminative regions in the images, consequently providing evidence for ophthalmologists to make a diagnosis and potentially save lives by early diagnosis. Specifically, this study uses pre-trained weights from four state-of-the-art deep learning models to produce and compare localisation maps of DR fundus images. The models used include VGG16, ResNet50, InceptionV3, and InceptionResNetV2. We find that InceptionV3 achieves the best performance with a test classification accuracy of 96.07%, and localises lesions better and faster than the other models.	Translated: 2021-04-27 12:17:57 Published: 2021-02-02
# BayesCard: Revitilizing Bayesian Frameworks for Cardinality Estimation ( http://arxiv.org/abs/2012.14743v2 ) License: see link	Ziniu Wu, Amir Shaikhha, Rong Zhu, Kai Zeng, Yuxing Han, Jingren Zhou	Cardinality estimation (CardEst) is an essential component in query optimizers and a fundamental problem in DBMS. A desired CardEst method should attain good algorithm performance, be stable to varied data settings, and be friendly to system deployment. However, no existing CardEst method can fulfill the three criteria at the same time. Traditional methods often have significant algorithm drawbacks such as large estimation errors. Recently proposed deep learning based methods largely improve the estimation accuracy, but their performance can be greatly affected by the data, and they are often difficult to deploy in systems. In this paper, we revitalize the Bayesian networks (BN) for CardEst by incorporating the techniques of probabilistic programming languages. We present BayesCard, the first framework that inherits the advantages of BNs, i.e., high estimation accuracy and interpretability, while overcoming their drawbacks, i.e., low structure learning and inference efficiency. This makes BayesCard a perfect candidate for commercial DBMS deployment. Our experimental results on several single-table and multi-table benchmarks indicate BayesCard's superiority over existing state-of-the-art CardEst methods: BayesCard achieves comparable or better accuracy, 1-2 orders of magnitude faster inference time, 1-3 orders of magnitude faster training time, 1-3 orders of magnitude smaller model size, and 1-2 orders of magnitude faster updates. Meanwhile, BayesCard maintains stable performance as the data varies across different settings. We also deploy BayesCard into PostgreSQL. On the IMDB benchmark workload, it improves the end-to-end query time by 13.3%, which is very close to the optimal result of 14.2% using an oracle of true cardinality.	Translated: 2021-04-18 20:28:24 Published: 2021-02-02
# Weight-of-evidence 2.0 with shrinkage and spline-binning ( http://arxiv.org/abs/2101.01494v2 ) License: CC BY 4.0	Jakob Raymaekers, Wouter Verbeke, Tim Verdonck	In many practical applications, such as fraud detection, credit risk modeling or medical decision making, classification models for assigning instances to a predefined set of classes are required to be both precise and interpretable. Linear modeling methods such as logistic regression are often adopted, since they offer an acceptable balance between precision and interpretability. Linear methods, however, are not well equipped to handle categorical predictors with high cardinality or to exploit non-linear relations in the data. As a solution, data preprocessing methods such as weight-of-evidence are typically used for transforming the predictors. The binning procedure that underlies the weight-of-evidence approach, however, has been little researched and typically relies on ad-hoc or expert-driven procedures. The objective in this paper, therefore, is to propose a formalized, data-driven and powerful method. To this end, we explore the discretization of continuous variables through the binning of spline functions, which allows for capturing non-linear effects in the predictor variables and yields highly interpretable predictors taking only a small number of discrete values. Moreover, we extend the weight-of-evidence approach and propose to estimate the proportions using shrinkage estimators. Together, this offers an improved ability to exploit both non-linear and categorical predictors for achieving increased classification precision, while maintaining interpretability of the resulting model and decreasing the risk of overfitting. We present the results of a series of experiments in a fraud detection setting, which illustrate the effectiveness of the presented approach. We facilitate reproduction of the presented results and adoption of the proposed approaches by providing both the dataset and the code for implementing the experiments and the presented approach.	Translated: 2021-04-11 17:43:04 Published: 2021-02-02
# Interpretable COVID-19 Chest X-Ray Classification via Orthogonality Constraint ( http://arxiv.org/abs/2102.08360v1 ) License: CC BY 4.0	Ella Y. Wang, Anirudh Som, Ankita Shukla, Hongjun Choi, Pavan Turaga	Deep neural networks have increasingly been used as an auxiliary tool in healthcare applications, due to their ability to improve performance of several diagnosis tasks. However, these methods are not widely adopted in clinical settings due to the practical limitations in the reliability, generalizability, and interpretability of deep learning based systems. As a result, methods have been developed that impose additional constraints during network training to gain more control as well as improve interpretability, facilitating their acceptance in the healthcare community. In this work, we investigate the benefit of using the Orthogonal Spheres (OS) constraint for classification of COVID-19 cases from chest X-ray images. The OS constraint can be written as a simple orthonormality term which is used in conjunction with the standard cross-entropy loss during classification network training. Previous studies have demonstrated significant benefits in applying such constraints to deep learning models. Our findings corroborate these observations, indicating that the orthonormality loss function effectively produces improved semantic localization via GradCAM visualizations, enhanced classification performance, and reduced model calibration error. Our approach achieves an improvement in accuracy of 1.6% and 4.8% for two- and three-class classification, respectively; similar results are found for models with data augmentation applied. In addition to these findings, our work also presents a new application of the OS regularizer in healthcare, increasing the post-hoc interpretability and performance of deep learning models for COVID-19 classification to facilitate adoption of these methods in clinical settings. We also identify the limitations of our strategy that can be explored in future research.	Translated: 2021-04-06 07:59:36 Published: 2021-02-02
# (参考訳) NFV対応ゼロタッチ6GネットワークのアクティブおよびAoI対応故障回復:モデルフリーDRLアプローチ Proactive and AoI-aware Failure Recovery for Stateful NFV-enabled Zero-Touch 6G Networks: Model-Free DRL Approach ( http://arxiv.org/abs/2103.03817v1 ) ライセンス: CC BY 4.0	Amirhossein Shaghaghi, Abolfazl Zakeri (Student Member, IEEE), Nader Mokari (Senior Member, IEEE), Mohammad Reza Javan (Senior Member, IEEE), Mohammad Behdadfar and Eduard A Jorswieck (Fellow, IEEE)	(参考訳) 本稿では,ネットワーク機能仮想化(NFV)実現ネットワークにおける組込みステートフル仮想ネットワーク機能(VNF)に対するゼロタッチPFR(ZT-PFR)と呼ばれるモデルフリー深部強化学習(DRL)に基づくプロアクティブ障害回復(PFR)フレームワークを提案する。 ZT-PFRの概念を実現するには,ネットワーク状態に基づく逐次意思決定が必要である。そこで本研究では,資源コストや不当な決定ペナルティを含むネットワークコスト関数を最小化し,効率的な資源利用のための最適化問題を定式化する。 ETSI と ITU に着想を得て,各 VNF 状態遷移がマルコフ過程に従うような,新しい入出力故障モデルを提案する。そこで本研究では,ソフトアクター・アクティクスや近位ポリシー最適化など,最先端のDRLベースの手法を提案する。さらに,ネットワーク状態の監視情報を適切な決定をするために,網状状態の監視情報を許容レベルに維持するために,イベントとスケジュールに基づく監視のバランスをとるために,情報年齢の概念(AoI)を適用した。いくつかのシミュレーションシナリオでは,本アルゴリズムの有効性を示し,ベースラインとの比較を行った。解析およびシミュレーション結果から,PFRのためのいくつかの重要なシステムとDRLアルゴリズムの設計知見を抽出した。例えば、DRLエージェント構造内の長短時間メモリ(LSTM)層で構成されるハイブリッドニューラルネットワークを使用して、差し迫った障害時間依存性をキャプチャします。 In this paper, we propose a model-free deep reinforcement learning (DRL)- based proactive failure recovery (PFR) framework called zero-touch PFR (ZT-PFR) for the embedded stateful virtual network functions (VNFs) in network function virtualization (NFV) enabled networks. To realize the ZT-PFR concept, sequential decision-making based on network status is necessary. To this end, we formulate an optimization problem for efficient resource usage by minimizing the defined network cost function including resource cost and wrong decision penalty. Inspired by ETSI and ITU, we propose a novel impending failure model where each VNF state transition follows a Markov process. As a solution, we propose state-of-the-art DRL-based methods such as soft actor-critic and proximal policy optimization. 
Moreover, to keep network state monitoring information at an acceptable level of freshness in order to make appropriate decisions, we apply the concept of the age of information (AoI) to strike a balance between event-based and scheduling-based monitoring. Several simulation scenarios are considered to show the effectiveness of our algorithm and provide a fair comparison with baselines. Several key system and DRL algorithm design insights for PFR are drawn from our analysis and simulation results. For example, we use a hybrid neural network, consisting of long short-term memory (LSTM) layers in the DRL agent structure, to capture impending failure time dependency.	翻訳日:2021-04-06 07:49:01 公開日:2021-02-02
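The age of information mentioned in the abstract above can be illustrated with a small sketch. It assumes a discrete-time model in which the age resets to zero whenever a fresh monitoring update arrives and otherwise grows by one per step; the function names and this time model are illustrative assumptions, not the paper's formulation.

```python
def age_of_information(update_times, horizon):
    """Return the AoI at each time step 0..horizon-1.

    update_times: time steps at which a fresh status update arrives
    (hypothetical input; the abstract does not specify the update model).
    The monitored state is assumed to be fresh at time 0.
    """
    updates = set(update_times)
    ages = []
    age = 0
    for t in range(horizon):
        if t in updates:
            age = 0          # a fresh update resets the age
        ages.append(age)
        age += 1             # without an update the information gets older
    return ages


def mean_age(update_times, horizon):
    """Average AoI over the horizon, a common summary of freshness."""
    ages = age_of_information(update_times, horizon)
    return sum(ages) / len(ages)


example_ages = age_of_information([0, 4], 6)
```

Sparser updates cost fewer monitoring resources but raise the mean age, which is the kind of trade-off a monitoring policy has to balance.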
# 神経推論における活性化関数の使用の形式化 Formalising the Use of the Activation Function in Neural Inference ( http://arxiv.org/abs/2102.04896v1 ) ライセンス: Link先を確認	Dalton A R Sakthivadivel	(参考訳) 本研究では,神経発火を抽象的に表現するためにアクティベーション関数をどのように利用できるか,そして,それが人工ニューラルネットワークでうまく機能するかを検討する。生物学的ニューロンのスパイクが、統計物理学における位相遷移の特定の普遍性クラスに属するかについて議論する。すると、人工ニューロンは、数学的に生物神経膜力学の平均場モデルであり、スパイクを相転移としてモデル化することから生じる。これにより、選択的神経発射を抽象的に処理し、パーセプトロン学習における活性化機能の役割を定式化する。このモデルを導出し、類似のニューラルケースを特定するとともに、フェーズ遷移を分析し、ニューラルネットワーク学習の物理を理解する。同時に,正準活性化関数の出現と性能に関する生物学的意味だけでなく,物理的正当性も示され,ニューラルラーニングや推論への影響も議論されている。 We investigate how activation functions can be used to describe neural firing in an abstract way, and in turn, why they work well in artificial neural networks. We discuss how a spike in a biological neurone belongs to a particular universality class of phase transitions in statistical physics. We then show that the artificial neurone is, mathematically, a mean field model of biological neural membrane dynamics, which arises from modelling spiking as a phase transition. This allows us to treat selective neural firing in an abstract way, and formalise the role of the activation function in perceptron learning. Along with deriving this model and specifying the analogous neural case, we analyse the phase transition to understand the physics of neural network learning. Together, it is shown that there is not only a biological meaning, but a physical justification, for the emergence and performance of canonical activation functions; implications for neural learning and inference are also discussed.	翻訳日:2021-04-05 00:32:45 公開日:2021-02-02
# ランキング vs. 分類:知識ベース完了品質の測定 Ranking vs. Classifying: Measuring Knowledge Base Completion Quality ( http://arxiv.org/abs/2102.06145v1 ) ライセンス: Link先を確認	Marina Speranskaya, Martin Schmitt, Benjamin Roth	(参考訳) 知識ベース補完法(KBC)は,知識ベース(KB)に存在する情報から,候補となる事実の可能性を推定することによって,行方不明な事実を推定することを目的とする。一般的な評価パラダイムでは、モデルは、新しい事実が受け入れられるべきか否かを実際に決めるのではなく、他の候補と高い順位で真事実の位置でのみ判断される。我々は,バイナリ予測の考察は実際のkbc品質を反映するために不可欠であり,現実的なシナリオに対してより透過的なモデル選択基準を提供するように設計された新しい評価パラダイムを提案する。 FB14k-QAQというデータセットを構築し、単一の事実の代わりにKBクエリ、すなわち1つのエンティティが変数に置き換えられた事実を使い、正しい答えとなるエンティティの集合を構築します。我々は、これらの正しい答えのいくつかをデータセットからランダムに取り除き、KBから欠落した現実世界の実体の現実的なシナリオをシミュレートする。このようにして、KBよりも実際の世界で正しい回答を持つクエリを処理できるモデルの性能を、有効な答えのないクエリの特別なケースを含む、明確に測定することができる。後者は特にランキング設定と対比する。我々は,最新のKB埋め込みモデルを新しいベンチマークで評価した。本実験で観察したランキングと分類に基づく評価の相対的性能の差は,評価課題の良好な性能が必ずしも実際の完了課題の良好な性能をもたらすとは限らないという仮説を裏付けるものである。本研究は,予測分離性の向上を図ったKB埋め込みモデルの今後の取り組みを動機付け,その第一歩として,しきい値の設定を奨励し,元のTransEと比較してF1スコアの分類を著しく改善する,シンプルなTransEの変種を提案する。 Knowledge base completion (KBC) methods aim at inferring missing facts from the information present in a knowledge base (KB) by estimating the likelihood of candidate facts. In the prevailing evaluation paradigm, models do not actually decide whether a new fact should be accepted or not but are solely judged on the position of true facts in a likelihood ranking with other candidates. We argue that consideration of binary predictions is essential to reflect the actual KBC quality, and propose a novel evaluation paradigm, designed to provide more transparent model selection criteria for a realistic scenario. We construct the data set FB14k-QAQ where instead of single facts, we use KB queries, i.e., facts where one entity is replaced with a variable, and construct corresponding sets of entities that are correct answers. We randomly remove some of these correct answers from the data set, simulating the realistic scenario of real-world entities missing from a KB. 
This way, we can explicitly measure a model's ability to handle queries that have more correct answers in the real world than in the KB, including the special case of queries without any valid answer. The latter in particular contrasts with the ranking setting. We evaluate a number of state-of-the-art KB embedding models on our new benchmark. The differences in relative performance between ranking-based and classification-based evaluation that we observe in our experiments confirm our hypothesis that good performance on the ranking task does not necessarily translate to good performance on the actual completion task. Our results motivate future work on KB embedding models with better prediction separability and, as a first step in that direction, we propose a simple variant of TransE that encourages thresholding and achieves a significant improvement in classification F1 score relative to the original TransE.	翻訳日:2021-04-05 00:32:16 公開日:2021-02-02
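The contrast between ranking-based and classification-based evaluation described in this abstract can be sketched as follows. The scores, entity names, threshold values, and the convention for an empty evaluation are made up for illustration; this is not the paper's benchmark code or its FB14k-QAQ data.

```python
def reciprocal_rank(scores, true_entity):
    """Ranking view: only the position of a known true answer matters."""
    ranked = sorted(scores, key=scores.get, reverse=True)
    return 1.0 / (ranked.index(true_entity) + 1)


def accept(scores, threshold):
    """Classification view: every candidate is accepted or rejected."""
    return {entity for entity, s in scores.items() if s >= threshold}


def micro_f1(queries, threshold):
    """Micro-averaged F1 over (scores, gold_answers) pairs.

    Assumption: if there is nothing to predict and nothing predicted,
    the result is defined as 1.0.
    """
    tp = fp = fn = 0
    for scores, gold in queries:
        predicted = accept(scores, threshold)
        tp += len(predicted & gold)
        fp += len(predicted - gold)
        fn += len(gold - predicted)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 1.0


# Hypothetical model scores for two queries.
q1 = {"paris": 0.9, "lyon": 0.4, "berlin": 0.3}   # gold answer: paris
q2 = {"paris": 0.8, "rome": 0.7}                  # query with no valid answer
queries = [(q1, {"paris"}), (q2, set())]

rank_score = reciprocal_rank(q1, "paris")   # perfect under ranking
f1_loose = micro_f1(queries, 0.5)           # penalised by q2's false positives
f1_strict = micro_f1(queries, 0.85)
```

The query without a valid answer cannot be scored by reciprocal rank at all, yet it dominates the classification result when the threshold is too loose.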
# TensorFlowによる透過FPGA高速化 Transparent FPGA Acceleration with TensorFlow ( http://arxiv.org/abs/2102.06018v1 ) ライセンス: Link先を確認	Simon Pfenning, Philipp Holzinger, Marc Reichenbach	(参考訳) 今日、ニューラルネットワークは機械学習の進歩を推進する主要なイノベーターの1つだ。これは特にニューラルネットワークの高速化ハードウェアの開発に影響を与えている。しかし、これらのアーキテクチャのほとんどは特殊なツールチェーンを必要とするため、新しいディープラーニングアクセラレータを使いたいと思うたびに、開発者にはある程度の労力がかかる。さらに、デバイスの柔軟性は、ランタイム環境の機能だけでなく、アーキテクチャ自体に結びついています。本稿では,TensorFlowをフロントエンドとして使用するツールフローを提案する。バックエンドではFPGAを使用し、HSAランタイム環境を介してアクセス可能です。このようにして、ユーザから新しいハードウェアを制御する複雑さを隠すと同時に、高い柔軟性を維持することができます。ハードウェアはネットワークの構造を静的に設定していないため、HSAツールフローによって実現できます。代わりに、ネットワークによって実行される各カーネルと、他のソースから同時に実行中に動的に再構成することができる。 OpenCL/OpenMP。 Today, artificial neural networks are one of the major innovators pushing the progress of machine learning. This has particularly affected the development of neural network accelerating hardware. However, since most of these architectures require specialized toolchains, there is a certain amount of additional effort for developers each time they want to make use of a new deep learning accelerator. Furthermore the flexibility of the device is bound to the architecture itself, as well as to the functionality of the runtime environment. In this paper we propose a toolflow using TensorFlow as frontend, thus offering developers the opportunity of using a familiar environment. On the backend we use an FPGA, which is addressable via an HSA runtime environment. In this way we are able to hide the complexity of controlling new hardware from the user, while at the same time maintaining a high amount of flexibility. This can be achieved by our HSA toolflow, since the hardware is not statically configured with the structure of the network. Instead, it can be dynamically reconfigured during runtime with the respective kernels executed by the network and simultaneously from other sources e.g. OpenCL/OpenMP.	翻訳日:2021-04-05 00:31:48 公開日:2021-02-02
# (参考訳) ゼロ・マイ・ショット・マルチダイアレクタル・アラビア列ラベリングのための自己学習事前学習言語モデル Self-Training Pre-Trained Language Models for Zero- and Few-Shot Multi-Dialectal Arabic Sequence Labeling ( http://arxiv.org/abs/2101.04758v4 ) ライセンス: CC BY 4.0	Muhammad Khalifa and Muhammad Abdul-Mageed and Khaled Shaalan	(参考訳) 通常、ダウンストリームタスクのために事前学習された言語モデルを微調整するために、十分な量の注釈付きデータが必要である。残念なことに、ラベル付きデータを得ることは、特に複数の言語や方言において、コストがかかる可能性がある。我々は、データリッチな言語からのみのリソースを用いて、データスカース品種の性能を向上させるために、ゼロおよび少数ショットシナリオで事前訓練された言語モデルを自己学習することを提案する。我々は、現代標準アラビア語(MSA)を微調整した言語モデルを用いて、複数の方言アラビア語(DA)品種における名前付きエンティティ(NE)とPOSタグを予測することで、アラビア語シーケンスラベリングの文脈におけるアプローチの有用性を実証する。自己学習は確かに強力であり, ゼロショットMSA-to-DA転送を10\% F$_1$ (NER) と2\%精度 (POSタグ付け) で改善している。限定的なラベル付きデータで、数回のシナリオでパフォーマンスがさらに向上します。本研究は, 自己学習に用いた未ラベルDA例から直接観察した性能向上効果を示す。我々の研究は、MSAリソースのみを活用するDAモデルを開発する機会を開き、他の言語やタスクにも拡張できます。私たちのコードと微調整されたモデルは、https://github.com/mohammadKhalifa/zero-shot-arabic-dialectsでアクセスできます。 A sufficient amount of annotated data is usually required to fine-tune pre-trained language models for downstream tasks. Unfortunately, attaining labeled data can be costly, especially for multiple language varieties and dialects. We propose to self-train pre-trained language models in zero- and few-shot scenarios to improve performance on data-scarce varieties using only resources from data-rich ones. We demonstrate the utility of our approach in the context of Arabic sequence labeling by using a language model fine-tuned on Modern Standard Arabic (MSA) only to predict named entities (NE) and part-of-speech (POS) tags on several dialectal Arabic (DA) varieties. We show that self-training is indeed powerful, improving zero-shot MSA-to-DA transfer by as large as \texttildelow 10\% F$_1$ (NER) and 2\% accuracy (POS tagging). We acquire even better performance in few-shot scenarios with limited amounts of labeled data. We conduct an ablation study and show that the performance boost observed directly results from the unlabeled DA examples used for self-training. 
Our work opens up opportunities for developing DA models exploiting only MSA resources and it can be extended to other languages and tasks. Our code and fine-tuned models can be accessed at https://github.com/mohammadKhalifa/zero-shot-arabic-dialects.	翻訳日:2021-04-04 03:58:45 公開日:2021-02-02
# (参考訳) 質量作用則による多スケール化学反応のデータの発見 Data-driven discovery of multiscale chemical reactions governed by the law of mass action ( http://arxiv.org/abs/2101.06589v2 ) ライセンス: CC BY 4.0	Juntao Huang and Yizhou Zhou and Wen-An Yong	(参考訳) 本稿では,質量作用の法則に則る多スケール化学反応を探索するためのデータ駆動型手法を提案する。まず, 触媒反応を伴わない系において, 反応物と生成物の化学量係数を表すために, 単一行列を用いる。行列内の負の成分は反応剤の化学量係数と生成物の正の係数を表す。第二に, 従来の最適化手法は局所的な極小領域に留まり, マルチスケール化学反応の学習において真の解を見出すことができなかった。この課題を克服するために,化学量係数が整数であるという事実を用いて,ネットワークパラメータを漸進的に決定する部分パラメータフリージング(PPF)手法を提案する。このような技術により、トレーニング過程において探索空間の寸法を徐々に小さくし、最終的に大域的最小値が得られる。古典的ミカエリス・メンテン運動学や水素酸化反応などの数値実験により, マルチスケール化学反応の学習におけるアルゴリズムの性能が検証された。コードは \url{https://github.com/juntaohuang/multiscale-chemical-reaction} で入手できる。 In this paper, we propose a data-driven method to discover multiscale chemical reactions governed by the law of mass action. First, we use a single matrix to represent the stoichiometric coefficients for both the reactants and products in a system without catalysis reactions. The negative entries in the matrix denote the stoichiometric coefficients for the reactants and the positive ones for the products. Second, we find that the conventional optimization methods usually get stuck in the local minima and could not find the true solution in learning the multiscale chemical reactions. To overcome this difficulty, we propose a partial-parameters-freezing (PPF) technique to progressively determine the network parameters by using the fact that the stoichiometric coefficients are integers. With such a technique, the dimension of the searching space is gradually reduced in the training process and the global minima can be eventually obtained. Several numerical experiments including the classical Michaelis-Menten kinetics and the hydrogen oxidation reactions verify the good performance of our algorithm in learning the multiscale chemical reactions. The code is available at \url{https://github.com/JuntaoHuang/multiscale-chemical-reaction}.	翻訳日:2021-03-28 02:13:33 公開日:2021-02-02
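A minimal sketch of the two ideas in this abstract: a single signed stoichiometric matrix driving mass-action kinetics, and freezing parameters that are close to integers. It assumes irreversible reactions and a simple tolerance rule for freezing; the function names, the tolerance, and the freezing criterion are illustrative assumptions and are not taken from the authors' repository.

```python
def reaction_rates(S, k, c):
    """Mass-action rate of each reaction.

    S[j][i] is the signed stoichiometric coefficient of species i in
    reaction j (negative: reactant, positive: product).
    k[j] is the rate constant and c[i] the concentration of species i.
    """
    rates = []
    for row, kj in zip(S, k):
        r = kj
        for coeff, ci in zip(row, c):
            if coeff < 0:                 # only reactants enter the rate law
                r *= ci ** (-coeff)
        rates.append(r)
    return rates


def concentration_derivatives(S, k, c):
    """dc_i/dt = sum_j S[j][i] * r_j."""
    rates = reaction_rates(S, k, c)
    return [sum(S[j][i] * rates[j] for j in range(len(S)))
            for i in range(len(c))]


def freeze_near_integers(params, tol=0.05):
    """Round parameters within tol of an integer and mark them frozen.

    Hypothetical stand-in for a partial-parameters-freezing step.
    """
    frozen = []
    for p in params:
        nearest = round(p)
        if abs(p - nearest) <= tol:
            frozen.append((float(nearest), True))
        else:
            frozen.append((p, False))
    return frozen


# 2 H2 + O2 -> 2 H2O with species order (H2, O2, H2O).
S = [[-2, -1, 2]]
rates = reaction_rates(S, [3.0], [2.0, 1.0, 0.0])
derivs = concentration_derivatives(S, [3.0], [2.0, 1.0, 0.0])
```

Frozen entries would be excluded from further gradient updates, which shrinks the search space as training proceeds.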
# (参考訳) TREGO: 効率的なグローバル最適化のための信頼度フレームワーク TREGO: a Trust-Region Framework for Efficient Global Optimization ( http://arxiv.org/abs/2101.06808v3 ) ライセンス: CC BY 4.0	Youssef Diouane and Victor Picheny and Rodolphe Le Riche and Alexandre Scotto Di Perrotolo	(参考訳) 効率的なグローバル最適化(EGO)はベイズ最適化の標準形式であり、高価なブラックボックス問題のグローバル最適化に成功している。しかし、EGOは次元のスケールに苦慮しており、理論上の保証は限られている。本研究では,信頼領域型EGO法(TREGO)の提案と解析を行う。 TREGOは、信頼領域内の通常のEGOステップとローカルステップを交互に使用する。信頼領域の古典的スキーム(十分な減少条件に基づく)に従うことで、最適化ステップのサブセットに限りEGOから離脱しながら、我々のアルゴリズムが強い大域収束特性を享受できることを実証する。既知のCOCOベンチマークに基づく広範な数値実験を用いて,TREGOの自己パラメータに対する感度を解析し,結果のアルゴリズムがEGOを一貫して上回っており,他の最先端のグローバル最適化手法と競合していることを示す。このメソッドはRパッケージのDiceOptim (https://cran.r-project.org/package=DiceOptim) とPythonライブラリ trieste (https://secondmind-labs.github.io/trieste/)の両方で利用できる。 Efficient Global Optimization (EGO) is the canonical form of Bayesian optimization that has been successfully applied to solve global optimization of expensive-to-evaluate black-box problems. However, EGO struggles to scale with dimension, and offers limited theoretical guarantees. In this work, we propose and analyze a trust-region-like EGO method (TREGO). TREGO alternates between regular EGO steps and local steps within a trust region. By following a classical scheme for the trust region (based on a sufficient decrease condition), we demonstrate that our algorithm enjoys strong global convergence properties, while departing from EGO only for a subset of optimization steps. Using extensive numerical experiments based on the well-known COCO benchmark, we first analyze the sensitivity of TREGO to its own parameters, then show that the resulting algorithm consistently outperforms EGO and is competitive with other state-of-the-art global optimization methods. The method is available both in the R package DiceOptim (https://cran.r-project.org/package=DiceOptim) and Python library trieste (https://secondmind-labs.github.io/trieste/).	翻訳日:2021-03-27 19:51:51 公開日:2021-02-02
# 摂動畳み込みを用いた生成逆ネットワーク Generative Adversarial Network using Perturbed-Convolutions ( http://arxiv.org/abs/2101.10841v2 ) ライセンス: Link先を確認	Seung Park, Yoon-Jae Yeo, and Yong-Goo Shin	(参考訳) GANトレーニングに対する洞察の高まりにもかかわらず、トレーニング手順の不安定さに悩まされている。この問題を軽減するために,本論文では,GANを安定的に訓練するための識別器をペナルティ化し,差別器の過度な問題を防止することを目的とした,摂動畳み込み(PConv)と呼ばれる新しい畳み込み層を提案する。 PConvは、畳み込み操作を行う前に入力テンソルをランダムに乱して摂動特徴を生成する。このアプローチは単純ですが,驚くほど効果的です。まず、乱れた入力テンソルを用いて実および生成されたサンプルを確実に分類するために、判別器の中間層は、局所的なリプシッツ値の小さい特徴を学習する必要がある。第二に、PConvの摂動特性のため、判別器は実際の画像を記憶することが困難であり、判別器は過度に適合する問題を回避できる。提案手法の一般化能力を示すために, CIFAR-10, CelebA-HQ, LSUN, 小型画像ネットなどの各種損失関数とデータセットを用いた広範囲な実験を行った。定量的評価により,WCLはFrechet開始距離(FID)において,GANおよび条件付きGANの性能を著しく向上することが示された。例えば、提案手法は、小画像NetデータセットのFIDスコアを58.59から50.42に改善する。 Despite growing insights into the GAN training, it still suffers from instability during the training procedure. To alleviate this problem, this paper presents a novel convolutional layer, called perturbed-convolution (PConv), which focuses on achieving two goals simultaneously: penalize the discriminator for training GAN stably and prevent the overfitting problem in the discriminator. PConv generates perturbed features by randomly disturbing an input tensor before performing the convolution operation. This approach is simple but surprisingly effective. First, to reliably classify real and generated samples using the disturbed input tensor, the intermediate layers in the discriminator should learn features having a small local Lipschitz value. Second, due to the perturbed features in PConv, the discriminator is difficult to memorize the real images; this makes the discriminator avoid the overfitting problem. To show the generalization ability of the proposed method, we conducted extensive experiments with various loss functions and datasets including CIFAR-10, CelebA-HQ, LSUN, and tiny-ImageNet. 
Quantitative evaluations demonstrate that WCL significantly improves the performance of GANs and conditional GANs in terms of Fréchet inception distance (FID). For instance, the proposed method improves FID scores on the tiny-ImageNet dataset from 58.59 to 50.42.	翻訳日:2021-03-22 11:34:10 公開日:2021-02-02
# 直進非巡回グラフニューラルネットワーク Directed Acyclic Graph Neural Networks ( http://arxiv.org/abs/2101.07965v3 ) ライセンス: Link先を確認	Veronika Thost, Jie Chen	(参考訳) グラフ構造化データは、科学と工学に広く現れる。グラフニューラルネットワーク(gnns)は、グラフに現れる関係帰納的バイアスを利用するように設計されており、構造情報がノードの特徴を補完するシナリオにおいて、他のタイプのニューラルネットワークを上回ることが示されている。最も一般的なGNNアーキテクチャは、メッセージパッシングに基づいて近隣からの情報を集約する。その一般性は広く適用された。本稿では、特殊だが広く使われているグラフ(DAG)に焦点をあて、ニューラルネットワーク設計に強力な帰納バイアス(部分順序付け)を注入する。我々は,部分順序で定義される流れに応じて情報を処理するアーキテクチャである, \emph{directed acyclic graph neural network},dagnnを提案する。 DAGNNは、初期の作業を特別なケース(例えば、木やノード表現を更新するモデルのモデル)として扱うフレームワークと見なすことができますが、以前のアーキテクチャに欠けているいくつかの重要なコンポーネントを特定します。我々は,DAGデータセット(ソースコード,ニューラルアーキテクチャ,確率的グラフィカルモデルなど)のアブレーション研究を含む総合的な実験を行い,DAGNNがより単純なDAGアーキテクチャや一般的なグラフアーキテクチャよりも優れていることを示す。 Graph-structured data ubiquitously appears in science and engineering. Graph neural networks (GNNs) are designed to exploit the relational inductive bias exhibited in graphs; they have been shown to outperform other forms of neural networks in scenarios where structure information supplements node features. The most common GNN architecture aggregates information from neighborhoods based on message passing. Its generality has made it broadly applicable. In this paper, we focus on a special, yet widely used, type of graphs -- DAGs -- and inject a stronger inductive bias -- partial ordering -- into the neural network design. We propose the \emph{directed acyclic graph neural network}, DAGNN, an architecture that processes information according to the flow defined by the partial order. DAGNN can be considered a framework that entails earlier works as special cases (e.g., models for trees and models updating node representations recurrently), but we identify several crucial components that prior architectures lack. 
We perform comprehensive experiments, including ablation studies, on representative DAG datasets (i.e., source code, neural architectures, and probabilistic graphical models) and demonstrate the superiority of DAGNN over simpler DAG architectures as well as general graph architectures.	翻訳日:2021-03-22 01:36:16 公開日:2021-02-02
# (参考訳) AIST++でダンスを学ぶ:音楽条件付き3Dダンス生成 Learn to Dance with AIST++: Music Conditioned 3D Dance Generation ( http://arxiv.org/abs/2101.08779v2 ) ライセンス: CC BY 4.0	Ruilong Li, Shan Yang, David A. Ross, Angjoo Kanazawa	(参考訳) 本稿では,音楽に基づく3Dダンス生成のためのトランスフォーマーに基づく学習フレームワークを提案する。ネットワークアーキテクチャを慎重に設計し,定性的に満足な結果を得るための鍵を実証的に研究する。重要なコンポーネントには、音楽とダンスの動きの相関をよく学習する深いクロスモーダルトランスフォーマーや、長距離の非凍結運動を生成するのに必須のfuture-n監督機構との完全な対応が含まれる。さらに,AISTのマルチビュー・ダンス・ビデオから再構成したAIST++と呼ばれる3Dモーションと音楽のペアデータセットを提案する。このデータセットは、1408列の3Dダンスモーションの1.1Mフレームを含み、10種類のダンスコレオグラフィーをカバーし、マルチビューカメラパラメータを伴っている。私たちの知る限り、これはこの種の最大のデータセットです。 AIST++のリッチな実験により、我々の手法は定性的かつ定量的に最先端の手法よりもはるかに優れた結果が得られることを示した。 In this paper, we present a transformer-based learning framework for 3D dance generation conditioned on music. We carefully design our network architecture and empirically study the keys for obtaining qualitatively pleasing results. The critical components include a deep cross-modal transformer, which well learns the correlation between the music and dance motion; and the full-attention with future-N supervision mechanism which is essential in producing long-range non-freezing motion. In addition, we propose a new dataset of paired 3D motion and music called AIST++, which we reconstruct from the AIST multi-view dance videos. This dataset contains 1.1M frames of 3D dance motion in 1408 sequences, covering 10 genres of dance choreographies and accompanied with multi-view camera parameters. To our knowledge it is the largest dataset of this kind. Rich experiments on AIST++ demonstrate our method produces much better results than the state-of-the-art methods both qualitatively and quantitatively.	翻訳日:2021-03-21 10:32:34 公開日:2021-02-02
# 深部強化学習を用いた心血管モデルの構築 : 敗血症治療における不確実性意識制御 Unifying Cardiovascular Modelling with Deep Reinforcement Learning for Uncertainty Aware Control of Sepsis Treatment ( http://arxiv.org/abs/2101.08477v2 ) ライセンス: Link先を確認	Thesath Nanayakkara, Gilles Clermont, Christopher James Langmead, and David Swigon	(参考訳) 敗血症はicuの主要な死亡原因であり、全入院の6%、米国における病院内死亡の35%を占めている。しかし、血管圧薬と流体投与の戦略については、広く合意されていない。また、異なる患者が治療に異なる反応を示し、個別治療の必要性を強調していることも観察されている。バソプレッサーと流体は心血管系生理学に特異的な影響を及ぼしており、医学的な研究により、血液力学的に誘導された生理学的な治療アプローチが示唆されている。そこで我々は,数学的モデリング,深層学習,強化学習,不確実性定量化の相補的強みを利用して,個別化,安全,不確実性を考慮した治療戦略を学習する新しいアプローチを提案する。まず、新しい生理駆動型リカレントニューラルネットワークを用いて、患者固有の動的心血管状態を予測する。この情報は、患者の実験室の歴史と観測可能なデータの学習された低次元表現とともに、バッチ分散強化学習を用いて価値分布を導出する。さらに, 安全クリティカルな領域では, エージェントが何をし, 知らないかを知ることが不可欠であり, このために, 患者それぞれの状態や行動に関連するモデルの不確実性を定量化し, 不確実性を認識し, 解釈可能な治療方針に関する一般的な枠組みを提案する。このフレームワークは、臨床医自身のフレームワークに対する信頼を反映して、簡単に微調整することができ、アクセス可能な場合は常に、人間の専門家の意見に影響を与えるように簡単に修正することができる。代表的な患者と検証コホートを用いて,生理学的に解釈可能な一般化可能な方針を学習したことを示す。 Sepsis is the leading cause of mortality in the ICU, responsible for 6% of all hospitalizations and 35% of all in-hospital deaths in USA. However, there is no universally agreed upon strategy for vasopressor and fluid administration. It has also been observed that different patients respond differently to treatment, highlighting the need for individualized treatment. Vasopressors and fluids are administrated with specific effects to cardiovascular physiology in mind and medical research has suggested that physiologic, hemodynamically guided, approaches to treatment. Thus we propose a novel approach, exploiting and unifying complementary strengths of Mathematical Modelling, Deep Learning, Reinforcement Learning and Uncertainty Quantification, to learn individualized, safe, and uncertainty aware treatment strategies. We first infer patient-specific, dynamic cardiovascular states using a novel physiology-driven recurrent neural network trained in an unsupervised manner. 
This information, along with a learned low-dimensional representation of the patient's lab history and observable data, is then used to derive value distributions using Batch Distributional Reinforcement Learning. Moreover, in a safety-critical domain it is essential to know what our agent does and does not know. To this end, we also quantify the model uncertainty associated with each patient state and action, and propose a general framework for uncertainty-aware, interpretable treatment policies. This framework can be tweaked easily to reflect a clinician's own confidence in the framework, and can be easily modified to factor in human expert opinion whenever it is accessible. Using representative patients and a validation cohort, we show that our method has learned physiologically interpretable, generalizable policies.	翻訳日:2021-03-21 08:01:25 公開日:2021-02-02
# 現実世界データを用いた薬物開発における人工知能の応用 Applications of artificial intelligence in drug development using real-world data ( http://arxiv.org/abs/2101.08904v2 ) ライセンス: Link先を確認	Zhaoyi Chen, Xiong Liu, William Hogan, Elizabeth Shenkman, Jiang Bian	(参考訳) 米国食品医薬品局(FDA)は、医薬品開発における実世界のデータの利用を積極的に推進している。 RWDは、治療が使用される実際の臨床環境を反映した重要な現実世界の証拠を生成することができる。一方、人工知能(AI)、特に機械学習とディープラーニング(ML/DL)の手法は、医薬品開発プロセスの多くの段階にわたって利用されてきた。 aiの進歩は、大規模な多次元rwdを分析する新しい戦略も提供した。そこで我々は過去20年間の論文の素早いレビューを行い、AIとRWDの両方を用いた薬物開発研究の概要について概説した。最も一般的な応用は、有害事象検出、トライアル採用、薬物再資源化であった。ここでは、現在の研究ギャップと今後の機会についても論じる。 The US Food and Drug Administration (FDA) has been actively promoting the use of real-world data (RWD) in drug development. RWD can generate important real-world evidence reflecting the real-world clinical environment where the treatments are used. Meanwhile, artificial intelligence (AI), especially machine- and deep-learning (ML/DL) methods, have been increasingly used across many stages of the drug development process. Advancements in AI have also provided new strategies to analyze large, multidimensional RWD. Thus, we conducted a rapid review of articles from the past 20 years, to provide an overview of the drug development studies that use both AI and RWD. We found that the most popular applications were adverse event detection, trial recruitment, and drug repurposing. Here, we also discuss current research gaps and future opportunities.	翻訳日:2021-03-20 17:27:57 公開日:2021-02-02
# ベイジアンネットワークが学習したグラフは、因果知識とどのように比べられるか? How do some Bayesian Network machine learned graphs compare to causal knowledge? ( http://arxiv.org/abs/2101.10461v2 ) ライセンス: Link先を確認	Anthony C. Constantinou, Norman Fenton, Martin Neil	(参考訳) ベイズネットワーク(BN)のグラフは、因果知識によって決定されるか、両方の組み合わせで学習することができる。バイオインフォマティクスのような分野では、BN構造学習アルゴリズムを適用することで、未知のままの新たな洞察を明らかにすることができる。しかし、これらのアルゴリズムは、実際のデータを扱う場合にしばしば発生するサンプルサイズにおいて、入力データが制限されている場合、効果が低い。本稿では、純粋に機械学習と純粋に知識ベースのBNに焦点を当て、グラフィカル構造と暗黙の統計モデルがどのようにデータを説明しているかの違いを調査します。テストは、BN構造がドメイン知識によって決定された以前の4つのケーススタディに基づいている。知識に基づくグラフを,TETRADで実装された3つの学習クラスにまたがる様々なアルゴリズムから生成された機械学習グラフと比較した。その結果、アルゴリズムはより高いモデル選択スコアを持つグラフを生成する一方で、知識に基づくグラフは興味のある変数のより正確な予測因子であることがわかった。スコアフィッティングの最大化は、限られたデータで歪みが増し、より高いスコアを共有しながら真のグラフからかなり逸脱するグラフィカルなパターンにアルゴリズムを導くため、限られたサンプルサイズの存在下では効果がない。これは、これらのケースにおける因果知識の価値と、限られたデータに適した適切なスコアの必要性を強調する。最後に、シミュレーションデータの結果が実際の実世界のパフォーマンスについてほとんどわからないという概念を支持する新たな証拠も提示しています。 The graph of a Bayesian Network (BN) can be machine learned, determined by causal knowledge, or a combination of both. In disciplines like bioinformatics, applying BN structure learning algorithms can reveal new insights that would otherwise remain unknown. However, these algorithms are less effective when the input data are limited in terms of sample size, which is often the case when working with real data. This paper focuses on purely machine learned and purely knowledge-based BNs and investigates their differences in terms of graphical structure and how well the implied statistical models explain the data. The tests are based on four previous case studies whose BN structure was determined by domain knowledge. Using various metrics, we compare the knowledge-based graphs to the machine learned graphs generated from various algorithms implemented in TETRAD spanning all three classes of learning. 
The results show that, while the algorithms produce graphs with much higher model selection scores, the knowledge-based graphs are more accurate predictors of variables of interest. Maximising score fitting is ineffective in the presence of limited sample size because the fitting becomes increasingly distorted with limited data, guiding algorithms towards graphical patterns that share higher fitting scores and yet deviate considerably from the true graph. This highlights the value of causal knowledge in these cases, as well as the need for more appropriate fitting scores suitable for limited data. Lastly, the experiments also provide new evidence that supports the notion that results from simulated data tell us little about actual real-world performance.	翻訳日:2021-03-14 19:19:40 公開日:2021-02-02
# シールドによる安全マルチエージェント強化学習 Safe Multi-Agent Reinforcement Learning via Shielding ( http://arxiv.org/abs/2101.11196v2 ) ライセンス: Link先を確認	Ingy Elsayed-Aly, Suda Bharadwaj, Christopher Amato, R\"udiger Ehlers, Ufuk Topcu, Lu Feng	(参考訳) マルチエージェント強化学習(MARL)は、学習プロセス中に保証された安全性(例えば、安全でない状態は一度も訪れない)を必要とする幅広い安全クリティカルなアプリケーションで、ますます使われている。そこで,安全MARLに対する2つの遮蔽手法を提案する。集中シールドでは,すべてのエージェントの協調動作を監視し,必要ならば安全でない動作を補正するために,単一のシールドを合成する。因子遮蔽では,すべてのエージェントが観察する結合状態空間の因子化に基づいて複数のシールドを合成し,各シールドはエージェントのサブセットにのみ責任を負う。実験結果から,各シールドは学習中のエージェントの安全性を,学習方針の質を損なうことなく保証できることがわかった。さらに,因子遮蔽は中央集権遮蔽よりも,エージェント数でよりスケーラブルである。 Multi-agent reinforcement learning (MARL) has been increasingly used in a wide range of safety-critical applications, which require guaranteed safety (e.g., no unsafe states are ever visited) during the learning process. Unfortunately, current MARL methods do not have safety guarantees. Therefore, we present two shielding approaches for safe MARL. In centralized shielding, we synthesize a single shield to monitor all agents' joint actions and correct any unsafe action if necessary. In factored shielding, we synthesize multiple shields based on a factorization of the joint state space observed by all agents; the set of shields monitors agents concurrently and each shield is only responsible for a subset of agents at each step. Experimental results show that both approaches can guarantee the safety of agents during learning without compromising the quality of learned policies; moreover, factored shielding is more scalable in the number of agents than centralized shielding.	翻訳日:2021-03-13 19:41:56 公開日:2021-02-02
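The centralized shielding idea in this abstract can be sketched with a toy grid world. The sketch assumes that "unsafe" means two agents occupying the same cell and that standing still is always a safe fallback from a safe state; the hand-written check below is a stand-in for a formally synthesized shield, and all names are hypothetical.

```python
MOVES = {
    "stay": (0, 0),
    "up": (0, 1),
    "down": (0, -1),
    "left": (-1, 0),
    "right": (1, 0),
}


def step(position, action):
    """Apply one move to a grid position."""
    dx, dy = MOVES[action]
    return (position[0] + dx, position[1] + dy)


def is_safe(positions):
    """Toy safety property: no two agents share a cell.

    Swap collisions (agents exchanging cells) are not checked here.
    """
    return len(set(positions)) == len(positions)


def centralized_shield(positions, joint_action):
    """Return the joint action, corrected if it would reach an unsafe state.

    Agents are switched to "stay" one at a time until the successor state
    is safe. If the current state is safe, all-"stay" is safe as well.
    """
    proposed = [step(p, a) for p, a in zip(positions, joint_action)]
    if is_safe(proposed):
        return list(joint_action)
    corrected = list(joint_action)
    for i in range(len(corrected)):
        corrected[i] = "stay"
        successor = [step(p, a) for p, a in zip(positions, corrected)]
        if is_safe(successor):
            break
    return corrected


positions = [(0, 0), (2, 0)]
blocked = centralized_shield(positions, ["right", "left"])   # would collide
allowed = centralized_shield(positions, ["up", "left"])      # already safe
```

A factored variant would run several such checks, each over a subset of agents, instead of one check over the full joint action.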
# (参考訳) トピック検出のためのdeep autoencoderベースのファジィc-means Deep Autoencoder-based Fuzzy C-Means for Topic Detection ( http://arxiv.org/abs/2102.02636v1 ) ライセンス: CC BY 4.0	Hendri Murfi, Natasha Rosaline, Nora Hariadi	(参考訳) トピック検出は、テキストデータの集合からトピックを決定するプロセスである。トピック検出手法の1つはクラスタリングに基づく手法で、centroidsがトピックであると仮定する。クラスタリング手法は、負の表現でデータを処理できるという利点がある。したがって、クラスタリング法はより広範な表現学習法と組み合わせることができる。本稿では,Deep Autoencoder とfuzzy c-means (DFCM) を用いて,話題検出のためのディープラーニングを採用する。オートエンコーダのエンコーダは、低次元表現学習を行う。ファジィc-平均は、中心体を識別するために低次元表現をグループ化する。オートエンコーダのデコーダは、centroidsを元の表現に変換し、トピックとして解釈する。このシミュレーションにより、DFCMは固有空間ベースのファジィc-平均(EFCM)のコヒーレンススコアを改善し、非負行列ファクタリゼーション(NMF)や潜在ディリクレアロケーション(LDA)といった主要な標準手法に匹敵する。 Topic detection is a process for determining topics from a collection of textual data. One of the topic detection methods is a clustering-based method, which assumes that the centroids are topics. The clustering method has the advantage that it can process data with negative representations. Therefore, the clustering method allows a combination with a broader representation learning method. In this paper, we adopt deep learning for topic detection by using a deep autoencoder and fuzzy c-means called deep autoencoder-based fuzzy c-means (DFCM). The encoder of the autoencoder performs a lower-dimensional representation learning. Fuzzy c-means groups the lower-dimensional representation to identify the centroids. The autoencoder's decoder transforms back the centroids into the original representation to be interpreted as the topics. Our simulation shows that DFCM improves the coherence score of eigenspace-based fuzzy c-means (EFCM) and is comparable to the leading standard methods, i.e., nonnegative matrix factorization (NMF) or latent Dirichlet allocation (LDA).	翻訳日:2021-02-06 01:27:22 公開日:2021-02-02
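The clustering step named in this abstract, fuzzy c-means, can be sketched in plain Python. The update rules are the standard fuzzy c-means ones; the points stand in for the encoder's low-dimensional representations, and the autoencoder itself is omitted, so this is an illustrative sketch and not the authors' DFCM implementation.

```python
import math


def memberships(points, centroids, m=2.0):
    """Standard FCM membership update.

    u[k][i] = 1 / sum_j (d(x_k, v_i) / d(x_k, v_j)) ** (2 / (m - 1)),
    where m > 1 is the fuzzifier.
    """
    power = 2.0 / (m - 1.0)
    result = []
    for x in points:
        dists = [math.dist(x, v) for v in centroids]
        if any(d == 0.0 for d in dists):
            # The point coincides with a centroid: full membership there.
            row = [1.0 if d == 0.0 else 0.0 for d in dists]
            total = sum(row)
            row = [r / total for r in row]
        else:
            row = [1.0 / sum((di / dj) ** power for dj in dists)
                   for di in dists]
        result.append(row)
    return result


def update_centroids(points, u, m=2.0):
    """Centroid i is the mean of the points weighted by u[k][i] ** m."""
    dim = len(points[0])
    centroids = []
    for i in range(len(u[0])):
        weights = [row[i] ** m for row in u]
        total = sum(weights)
        centroids.append(tuple(
            sum(w * x[d] for w, x in zip(weights, points)) / total
            for d in range(dim)
        ))
    return centroids


# Hypothetical 2-D encoded documents and two initial centroids.
points = [(0.0, 0.0), (0.0, 1.0), (4.0, 0.0), (4.0, 1.0)]
centroids = [(1.0, 0.5), (3.0, 0.5)]
for _ in range(20):
    u = memberships(points, centroids)
    centroids = update_centroids(points, u)
```

In the setting the abstract describes, the resulting centroids would then be passed through the decoder to be read as topics.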
# (参考訳) ヒトアシスタンスによる強化学習の改善:HIPPOジムによる人身学習の議論 Improving Reinforcement Learning with Human Assistance: An Argument for Human Subject Studies with HIPPO Gym ( http://arxiv.org/abs/2102.02639v1 ) ライセンス: CC BY-SA 4.0	Matthew E. Taylor, Nicholas Nissen, Yuan Wang, Neda Navidi	(参考訳) 強化学習(RL)は、ゲームプレイ、ロボット制御、およびその他の連続的な決定タスクのための一般的な機械学習パラダイムです。しかし、rlエージェントはランダムに振る舞うことから、長い学習時間と高いデータ要求を持つことが多い。複雑なタスクをよりよく学習するために、本稿では、外部の教師がRLエージェントの学習に大いに役立つことを論じる。 OpenAI Gymは、多数の標準環境やエージェントを含むRL研究の一般的なフレームワークであり、RL研究が大幅にアクセスしやすくなります。この記事では、新しいオープンソースRLフレームワーク、Openai Gym(HIPPO Gym)のためのヒューマン入力解析プラットフォーム、およびその作成に行われた設計決定について紹介します。このプラットフォームの目的は、人間-RLの研究を促進することであり、またバーを下げることで、より多くの研究者が人間の教師がRLエージェントを支援できる様々な方法を探ることができる。 Reinforcement learning (RL) is a popular machine learning paradigm for game playing, robotics control, and other sequential decision tasks. However, RL agents often have long learning times with high data requirements because they begin by acting randomly. In order to better learn in complex tasks, this article argues that an external teacher can often significantly help the RL agent learn. OpenAI Gym is a common framework for RL research, including a large number of standard environments and agents, making RL research significantly more accessible. This article introduces our new open-source RL framework, the Human Input Parsing Platform for Openai Gym (HIPPO Gym), and the design decisions that went into its creation. The goal of this platform is to facilitate human-RL research, again lowering the bar so that more researchers can quickly investigate different ways that human teachers could assist RL agents, including learning from demonstrations, learning from feedback, or curriculum learning.	翻訳日:2021-02-06 01:13:17 公開日:2021-02-02
# (参考訳) 素粒子物理学のための機械学習のリビングレビュー A Living Review of Machine Learning for Particle Physics ( http://arxiv.org/abs/2102.02770v1 ) ライセンス: CC BY 4.0	Matthew Feickert and Benjamin Nachman	(参考訳) ディープラーニングを含む現代の機械学習技術は急速に応用され、適応され、高エネルギー物理学のために開発されている。この研究の速いペースを考えると、我々は実験、現象学、または理論的分析にこれらのアプローチを開発し、適用する人々のための引用のほぼ包括的なリストを提供することを目標に生きたレビューを作成しました。生きた文書として、最新の開発を取り入れるためにできるだけ頻繁に更新されます。適切な(曖昧な)レビューのリストは、内部で見ることができる。論文は、可能な限り有用なトピックの小さなセットにグループ化されます。提案と貢献が最も歓迎され、参加の指示を提供します。 Modern machine learning techniques, including deep learning, are rapidly being applied, adapted, and developed for high energy physics. Given the fast pace of this research, we have created a living review with the goal of providing a nearly comprehensive list of citations for those developing and applying these approaches to experimental, phenomenological, or theoretical analyses. As a living document, it will be updated as often as possible to incorporate the latest developments. A list of proper (unchanging) reviews can be found within. Papers are grouped into a small set of topics to be as useful as possible. Suggestions and contributions are most welcome, and we provide instructions for participating.	翻訳日:2021-02-05 23:44:09 公開日:2021-02-02
# 深層学習アルゴリズムを用いた多基準決定の融合手法を用いたビッグデータ分析 Big Data Analytics Applying the Fusion Approach of Multicriteria Decision Making with Deep Learning Algorithms ( http://arxiv.org/abs/2102.02637v1 ) ライセンス: Link先を確認	Swarajya Lakshmi V Papineni, Snigdha Yarlagadda, Harita Akkineni, A. Mallikarjuna Reddy	(参考訳) データは、ネットワーク、クラウドコンピューティング、IoT(Internet of Things)、アクチュエータ、センサーなど、さまざまなタイプのデバイスに対する、人口と通信の急速な進歩によって進化している。データとコミュニケーションのコンテンツの増加は、ベロシティ、スピード、サイズ、価値の同値と一致し、将来の困難なタスクや最新の問題を解決するのに役立つ有用で有意義な知識を提供する。さらに、マルチクリトリアベースの意思決定は、ビッグデータ分析における代替効果に関連するさまざまな問題を解決する上で重要な課題の1つである。ビッグデータに対する洞察を提供するために、意思決定やマルチ基準に基づくディープラーニングメカニズムといったアルゴリズムを含む、最新の機械学習技術に基づくソリューションを見つける傾向があります。一方、実行時の双対性を高め、システム全体の潜在性と有効性を改善するために近似に従った導出がなされている。本質的には、ビジネス、農業、情報技術、コンピュータ科学を含むいくつかの分野は、深層学習と多基準に基づく意思決定問題を使用する。本稿では,ビッグデータ分析において直面する問題に対して,深層学習技術の概念を取り入れた多様なアプリケーションを提供し,データ駆動手法の融合手法による新たな研究を提案する。 Data is evolving with the rapid progress of population and communication for various types of devices such as networks, cloud computing, Internet of Things (IoT), actuators, and sensors. The increment of data and communication content goes with the equivalence of velocity, speed, size, and value to provide the useful and meaningful knowledge that helps to solve the future challenging tasks and latest issues. Besides, multicriteria based decision making is one of the key issues to solve for various issues related to the alternative effects in big data analysis. It tends to find a solution based on the latest machine learning techniques that include algorithms like decision making and deep learning mechanism based on multicriteria in providing insights to big data. On the other hand, the derivations are made for it to go with the approximations to increase the duality of runtime and improve the entire system's potentiality and efficacy. In essence, several fields, including business, agriculture, information technology, and computer science, use deep learning and multicriteria-based decision-making problems. 
This paper aims to present various applications that combine deep learning techniques with multicriteria approaches for issues faced in big data analytics, by proposing new studies that use fusion approaches of data-driven techniques.	翻訳日:2021-02-05 16:47:17 公開日:2021-02-02
# Autodidactic Neurosurgeon: オンライン学習によるモバイルエッジインテリジェンスの協調的深層推論 Autodidactic Neurosurgeon: Collaborative Deep Inference for Mobile Edge Intelligence via Online Learning ( http://arxiv.org/abs/2102.02638v1 ) ライセンス: Link先を確認	Letian Zhang, Lixing Chen, Jie Xu	(参考訳) ディープラーニング(DL)の最近の進歩は、多くのインテリジェントなモバイルアプリケーションとサービスの出現をもたらしましたが、その一方で、リソースに制約のあるモバイルデバイスで前例のないコンピューティングの課題を引き起こします。本論文では, リソースに制約のあるモバイルデバイスとエッジサーバの協調的な深層推論システムを構築し, オンデバイス処理と計算オフロードの両方のパワーを結集することを目的とする。このシステムの基本的な考え方は、ディープニューラルネットワーク(DNN)をモバイルデバイス上で動作するフロントエンド部とエッジサーバ上で動作するバックエンド部に分割することであり、主な課題は、エンドツーエンドの推論遅延を最小限に抑えるために最適なパーティションポイントを見つける方法である。最適なパーティションポイントを探索するために、専用のオフラインプロファイリングステージに大きく依存する既存のDNNパーティションとは異なり、我々のシステムは、Autodidactic Neurosurgeon (ANS)と呼ばれるオンライン学習モジュールを組み込んで、最適なパーティションポイントをオンザフライで自動的に学習する。したがって、適応的意思決定のための新たな知識を発生させることにより、システム環境の変化を密接に追従することができる。 ANSのコアは、$\mu$LinUCBと呼ばれる新しいコンテキスト型バンディット学習アルゴリズムであり、理論的な学習性能を保証するだけでなく、現実世界の実装を容易にするための超軽量です。本稿では,ansの設計を検証するために,映像ストリームオブジェクト検出テストベッド上でシステムを実装し,その性能評価を行う。この実験は、ANSがトラッキングシステムの変更とエンドツーエンドの推論遅延の低減の観点から、最先端のベンチマークを大幅に上回っていることを示しています。 Recent breakthroughs in deep learning (DL) have led to the emergence of many intelligent mobile applications and services, but in the meanwhile also pose unprecedented computing challenges on resource-constrained mobile devices. This paper builds a collaborative deep inference system between a resource-constrained mobile device and a powerful edge server, aiming at joining the power of both on-device processing and computation offloading. The basic idea of this system is to partition a deep neural network (DNN) into a front-end part running on the mobile device and a back-end part running on the edge server, with the key challenge being how to locate the optimal partition point to minimize the end-to-end inference delay. 
Unlike existing efforts on DNN partitioning that rely heavily on a dedicated offline profiling stage to search for the optimal partition point, our system has a built-in online learning module, called Autodidactic Neurosurgeon (ANS), to automatically learn the optimal partition point on-the-fly. Therefore, ANS is able to closely follow the changes of the system environment by generating new knowledge for adaptive decision making. The core of ANS is a novel contextual bandit learning algorithm, called $\mu$LinUCB, which not only has provable theoretical learning performance guarantee but also is ultra-lightweight for easy real-world implementation. We implement our system on a video stream object detection testbed to validate the design of ANS and evaluate its performance. The experiments show that ANS significantly outperforms state-of-the-art benchmarks in terms of tracking system changes and reducing the end-to-end inference delay.	翻訳日:2021-02-05 16:46:37 公開日:2021-02-02
# novoゲノムアセンブラの強化学習に向けて Towards a reinforcement learning de novo genome assembler ( http://arxiv.org/abs/2102.02649v1 ) ライセンス: Link先を確認	Kleber Padovani, Roberto Xavier, Andre Carvalho, Anna Reali, Annie Chateau, Ronnie Alves	(参考訳) 強化学習の使用は、学習プロセス中に人間の監督なしに複雑な活動を解くことに非常に有望であることが証明されている。しかし、彼らの成功例は主にゲームのようなフィクションやエンターテイメントの問題に焦点を当てている。本研究は、この関連現実問題であるゲノム組立を解くため、強化学習の応用に光を当てることを目的としている。この問題に対処する文献に唯一見られるアプローチを拡張することで、我々はQ学習アルゴリズムによって実行される知的エージェント学習の側面を慎重に検討し、実際のゲノムプロジェクトとより類似した特徴を持つシナリオに適用できる可能性を理解する。提案された改善には、以前に提案された報酬システムの変更、動的プランニングに基づく状態空間探索最適化戦略、進化的コンピューティングとの相互協力が含まれる。これらの調査は、従来よりも大きな入力を持つ23の新しい環境で実施された。これらの環境はすべて、科学コミュニティによるこの研究の進化のためにインターネット上で自由に利用できる。その結果,提案した改良手法による一貫した性能向上が示唆されたが,特に状態空間と行動空間の高次元性に関する限界も示している。また,近年,高次元入力を扱う他の領域から,深層強化学習を含む学習アプリケーションの強化を図ることで,実際のシナリオにおいてゲノムアセンブリーを効率的に取り組める経路を提案する。 The use of reinforcement learning has proven to be very promising for solving complex activities without human supervision during their learning process. However, their successful applications are predominantly focused on fictional and entertainment problems - such as games. Based on the above, this work aims to shed light on the application of reinforcement learning to solve this relevant real-world problem, the genome assembly. By expanding the only approach found in the literature that addresses this problem, we carefully explored the aspects of intelligent agent learning, performed by the Q-learning algorithm, to understand its suitability to be applied in scenarios whose characteristics are more similar to those faced by real genome projects. The improvements proposed here include changing the previously proposed reward system and including state space exploration optimization strategies based on dynamic pruning and mutual collaboration with evolutionary computing. These investigations were tried on 23 new environments with larger inputs than those used previously. 
All these environments are freely available on the internet so that the scientific community can build on this research. The results suggest consistent performance gains from the proposed improvements; however, they also demonstrate their limitations, especially those related to the high dimensionality of the state and action spaces. We also present paths that can be followed to tackle genome assembly efficiently in real scenarios, drawing on recent successful reinforcement learning applications (including deep reinforcement learning) from other domains that deal with high-dimensional inputs.	翻訳日:2021-02-05 16:39:11 公開日:2021-02-02
# DLpN: 深部学習者推定等方体積分率を用いた単層NODDI DLpN: Single-Shell NODDI Using Deep Learner Estimated Isotropic Volume Fraction ( http://arxiv.org/abs/2102.02772v1 ) ライセンス: Link先を確認	Abrar Faiyaz, Marvin Doyley, Giovanni Schifitto, Jianhui Zhong, Md Nasir Uddin	(参考訳) ニューライト配向分散・密度イメージング(NODDI)は,多層拡散MRIデータから細胞内,細胞外および遊離水信号の評価を可能にする。脳組織の微細構造を特徴付けるための洞察力のあるアプローチです。 NODDIパラメータの単一殻再構成は、特にニューロライト密度指数(NDI)に適合する際の故障に基づいて、過去の文献では無視されている。そこで本研究では, 以前に等方性体積分数 (f_{ISO}) を用いて, 単殻データを用いた堅牢なNODDIパラメータマップの作成の可能性を検討した。辞書に基づく深層学習手法を用いて,NODDIモデル制約とは独立に事前推定を行った。まず,f_{ISO} を予測するために,確率的スパース辞書ベースのネットワーク DictNet を提案する。単殻の場合,f_{ISO}推定辞書には拡散重み付けのない分数異方性(FA)とT2信号(S_0)が組み込まれていた。その後、NDIとオリエンテーション分散指数(ODI)を推定するために、NODDIフレームワークを事前設定で使用しました。合成データシミュレーションと3Tスキャナーで収集した人的データを用いて, 辞書を用いた深層学習前のNODDI(DLpN)の性能を, 単殻データと多殻データの両方に対して元のNODDI法と比較した。本研究では, DLpN 由来 NDI および ODI パラメータが単殻プロトコルのマルチシェル NODDI に匹敵し, b=2000 s/mm 2 のプロトコルが最高性能を発揮することを示唆した(ホワイトマターではエラー～2%,グレーマターでは4%程度)。これにより、DictNet f_{ISO} トレーニングのための2つの被験者の追加スキャンによって、単殻データに関するレトロスペクティブ研究のNODDI評価が可能になる。 Neurite orientation dispersion and density imaging (NODDI) enables assessment of intracellular, extracellular and free water signals from multi-shell diffusion MRI data. It is an insightful approach to characterize the brain tissue microstructure. Single-shell reconstruction for NODDI parameters has been discouraged in previous literature based on failure when fitting especially for the neurite density index (NDI). Here, we investigated the possibility to create robust NODDI parameter maps with single-shell data, using isotropic volume fraction (f_{ISO}) as prior. We made the prior estimation independent of NODDI model constraint using a dictionary based deep learning approach. First, we proposed a stochastic sparse dictionary-based network, DictNet in predicting f_{ISO} . In single-shell cases, fractional anisotropy (FA) and T2 signal without diffusion weighting ( S_0 ) were incorporated in the dictionary for f_{ISO} estimation. 
Then, the NODDI framework was used in a prior setting to estimate the NDI and orientation dispersion index (ODI). Using both synthetic data simulation and human data collected on a 3T scanner, we compared the performance of our dictionary-based deep learning prior NODDI (DLpN) with the original NODDI method for both single-shell and multi-shell data. Our results suggest that DLpN-derived NDI and ODI parameters for single-shell protocols are comparable with the original multi-shell NODDI, and that the protocol with b=2000 s/mm^2 performs the best (error ~2% in white matter and ~4% in grey matter). This may allow NODDI evaluation of retrospective studies on single-shell data by additional scanning of two subjects for DictNet f_{ISO} training.	翻訳日:2021-02-05 16:35:17 公開日:2021-02-02
# 量子自然言語処理における同義文のパラメータ化量子回路 Parametrized Quantum Circuits of Synonymous Sentences in Quantum Natural Language Processing ( http://arxiv.org/abs/2102.02204v1 ) ライセンス: Link先を確認	Mina Abbaszadeh, S. Shahin Mousavi, Vahid Salari	(参考訳) 本稿では、非英語言語であるペルシア語を対象に、量子自然言語処理における肯定的な他動詞文の合成的ベクトルベース意味論を構築し、英語とペルシア語の2言語における2つの同義文のパラメータ化量子回路を比較する。他動詞文の文法と意味を考慮し、ZX計算を介してDisCoCat図を量子回路形式に変換する。また、バイグラフ法を用いてDisCoCat図を書き換え、意味論側で量子回路に変換する。 In this paper, we develop a compositional vector-based semantics of positive transitive sentences in quantum natural language processing for a non-English language, i.e. Persian, to compare the parametrized quantum circuits of two synonymous sentences in two languages, English and Persian. By considering the grammar and meaning of a transitive sentence, we translate the DisCoCat diagram via ZX-calculus into quantum circuit form. We also use a bigraph method to rewrite the DisCoCat diagram and turn it into a quantum circuit on the semantic side.	翻訳日:2021-02-05 16:18:06 公開日:2021-02-02
# 右エッジデバイスの選択:GPGPU上でのCUDAベースのCNNのパワーと性能推定に向けて Pick the Right Edge Device: Towards Power and Performance Estimation of CUDA-based CNNs on GPGPUs ( http://arxiv.org/abs/2102.02645v1 ) ライセンス: Link先を確認	Christopher A. Metz, Mehran Goli, Rolf Drechsler	(参考訳) 機械学習(ML)の強力なテクニックとしての出現は、ビジネスのほぼすべての分野において、運用効率の向上や新たな価値提案の開発に役立っている。 MLモデルのデプロイとメンテナンスの課題に加えて、これらのモデルを実行するために適切なエッジデバイス(GPGPUなど)を選択すること(例えば、大規模な計算プロセスを備えたCNN)は、今日の組織が直面する最も困難な課題の1つです。レンタル(クラウド上で)やエッジデバイスを購入するコストが最終製品やサービスのコストに直接つながるため、最も効率的なデバイスを選択することが不可欠である。しかし、この意思決定には、MLワークフローの初期段階で識別しなければならないエッジデバイス上で動作するMLモデルのパフォーマンスと電力消費に関する深い知識が必要です。本稿では、GPGPU上でのCUDAベースのCNNの消費電力と性能の早期推定をMLエンジニアに提供する新しいMLベースのアプローチを紹介します。提案されたアプローチにより、MLエンジニアは開発初期のCNNモデルに対して最も効率的なGPGPUを選択することができます。 The emergence of Machine Learning (ML) as a powerful technique has been helping nearly all fields of business to increase operational efficiency or to develop new value propositions. Besides the challenges of deploying and maintaining ML models, picking the right edge device (e.g., GPGPUs) to run these models (e.g., CNN with the massive computational process) is one of the most pressing challenges faced by organizations today. As the cost of renting (on Cloud) or purchasing an edge device is directly connected to the cost of final products or services, choosing the most efficient device is essential. However, this decision making requires deep knowledge about performance and power consumption of the ML models running on edge devices that must be identified at the early stage of ML workflow. In this paper, we present a novel ML-based approach that provides ML engineers with the early estimation of both power consumption and performance of CUDA-based CNNs on GPGPUs. The proposed approach empowers ML engineers to pick the most efficient GPGPU for a given CNN model at the early stage of development.	翻訳日:2021-02-05 15:55:53 公開日:2021-02-02
# (参考訳) 転送のスケーリング法則 Scaling Laws for Transfer ( http://arxiv.org/abs/2102.01293v1 ) ライセンス: CC BY 4.0	Danny Hernandez, Jared Kaplan, Tom Henighan, and Sam McCandlish	(参考訳) 教師なしの微調整環境下における分布間の移動学習のための経験的スケーリング法について検討する。ますます大きなニューラルネットワークを固定サイズのデータセット上でスクラッチからトレーニングすると、最終的にはデータ制限となり、パフォーマンス(クロスエントロピー損失)が向上しなくなります。大きな言語データセットで事前トレーニングされたモデルで同じことをすると、パフォーマンス向上の勾配はゼロになるよりも単に小さくなります。同じサイズのトランスフォーマーが、スクラッチからトレーニングする際に同じ損失を達成するために必要なデータ量を決定することにより、事前トレーニングから“転送”された有効データを計算する。言い換えれば、私たちはデータの単位に集中し、他のすべてを固定します。提案手法は,パラメータ数と微調整データセットサイズに比例したパワーロー則を用いて,データ転送の効率をよく記述する。これらのパワーローの指数は、モデルの一般性と分布の近さ(対称性ではなく指向性)の尺度に対応すると信じています。事前学習は、微調整データセットのサイズを効果的に乗算する。全体的なパフォーマンスと同様に、転送はパラメータ、データ、計算の観点で予測できるスケールである。 We study empirical scaling laws for transfer learning between distributions in an unsupervised, fine-tuning setting. When we train increasingly large neural networks from-scratch on a fixed-size dataset, they eventually become data-limited and stop improving in performance (cross-entropy loss). When we do the same for models pre-trained on a large language dataset, the slope in performance gains is merely reduced rather than going to zero. We calculate the effective data "transferred" from pre-training by determining how much data a transformer of the same size would have required to achieve the same loss when training from scratch. In other words, we focus on units of data while holding everything else fixed. We find that the effective data transferred is described well in the low data regime by a power-law of parameter count and fine-tuning dataset size. We believe the exponents in these power-laws correspond to measures of the generality of a model and proximity of distributions (in a directed rather than symmetric sense). We find that pre-training effectively multiplies the fine-tuning dataset size. Transfer, like overall performance, scales predictably in terms of parameters, data, and compute.	翻訳日:2021-02-05 10:33:04 公開日:2021-02-02
# (参考訳) IoTと気象条件を利用して、キャンパス内のバストランジットを待つライダーを推定 Leveraging IoT and Weather Conditions to Estimate the Riders Waiting for the Bus Transit on Campus ( http://arxiv.org/abs/2102.01364v1 ) ライセンス: CC BY-SA 4.0	Ismail Arai, Ahmed Elnoshokaty, Samy El-Tawab	(参考訳) この時代の通信技術革命は、輸送の世界でスマートフォンの使用を増加させています。本論文では,スマートフォンのWi-Fiデータと気象条件を併用したIoTデバイスデータを用いて,ディープラーニングモデルを用いて,バス停で待機する乗客の予想数を予測することを提案する。本研究は、アメリカ合衆国バージニア州のジェームズ・マディソン大学(JMU)の交通バスシステムからデータを収集した。本稿では,停留所で待機する乗客数と気象条件との関係について検討する。実証実験では,JMUにおける複数の停留所を用いた実験を行い,高い精度を確認した。 Deep Neural Network (DNN) モデルと Linear Regression (LR) と Wide Neural Network (WNN) の2つのベースラインモデルを比較しました。DNNは、LRおよびWNNと比較して、予測の平均二乗誤差(MSE)スコアをそれぞれ35%および14%改善した。 The communication technology revolution in this era has increased the use of smartphones in the world of transportation. In this paper, we propose to leverage IoT device data, capturing passengers' smartphones' Wi-Fi data in conjunction with weather conditions to predict the expected number of passengers waiting at a bus stop at a specific time using deep learning models. Our study collected data from the transit bus system at James Madison University (JMU) in Virginia, USA. This paper studies the correlation between the number of passengers waiting at bus stops and weather conditions. Empirically, an experiment with several bus stops at JMU confirmed a high precision level. We compared our Deep Neural Network (DNN) model against two baseline models: Linear Regression (LR) and a Wide Neural Network (WNN). The DNN achieved 35% and 14% better Mean Squared Error (MSE) scores for predictions than LR and WNN, respectively.	翻訳日:2021-02-05 10:15:58 公開日:2021-02-02
# (参考訳) AutoFreeze:微調整を高速化するモデルブロックの自動凍結 AutoFreeze: Automatically Freezing Model Blocks to Accelerate Fine-tuning ( http://arxiv.org/abs/2102.01386v1 ) ライセンス: CC BY 4.0	Yuhan Liu, Saurabh Agarwal, Shivaram Venkataraman	(参考訳) 機械学習(ML)の急速な採用により、多くのドメインが、大規模なデータコーパスで事前トレーニングされたモデルを微調整するアプローチを使用している。しかし、我々の実験では、BERTのようなモデルの微調整でさえGPUを使用しても何時間もかかることが示されている。以前の作業では、最終レイヤ以外のすべてのレイヤの凍結など、微調整されるレイヤの数を制限することを提案しているが、このような静的アプローチは精度を低下させる。適応的手法を用いてどの層を訓練するかを選択するシステムであるAutoFreezeを提案し、精度を保ちながらモデル微調整をいかに加速させるかを示す。また,中間アクティベーションの効率的なキャッシングを可能にする機構を開発し,微調整を行う際の前方計算時間を短縮する。 4つのNLPタスクに対する評価は、キャッシュを有効にしたAutoFreezeが、最大2.55倍の微調整性能を向上できることを示している。 With the rapid adoption of machine learning (ML), a number of domains now use the approach of fine-tuning models pre-trained on a large corpus of data. However, our experiments show that even fine-tuning models like BERT can take many hours when using GPUs. While prior work proposes limiting the number of layers that are fine-tuned, e.g., freezing all layers but the last layer, we find that such static approaches lead to reduced accuracy. We propose AutoFreeze, a system that uses an adaptive approach to choose which layers are trained, and show how this can accelerate model fine-tuning while preserving accuracy. We also develop mechanisms to enable efficient caching of intermediate activations, which can reduce the forward computation time when performing fine-tuning. Our evaluation on four NLP tasks shows that AutoFreeze, with caching enabled, can improve fine-tuning performance by up to 2.55x.	翻訳日:2021-02-05 09:51:55 公開日:2021-02-02
# (参考訳) OPAM:機械学習を用いたオンライン購入行動分析 OPAM: Online Purchasing-behavior Analysis using Machine learning ( http://arxiv.org/abs/2102.01625v1 ) ライセンス: CC BY 4.0	Sohini Roychowdhury, Ebrahim Alareqi, Wenxi Li	(参考訳) 顧客購買行動分析は、オンラインベンダーとその顧客間の洞察に富んだコミュニケーション戦略を開発する上で重要な役割を果たす。近年のオンラインショッピングのトレンド拡大を支援するため,本研究では,教師付き,非監督型,半監督型学習手法を用いた購買行動分析システムを提案する。提案システムでは,顧客カテゴリやクラスタを特定するために,セッションおよびユーザ・ジャーニーレベルの購買行動を分析する。セッションレベルの購買予測のためのオンラインショッピングポータルの設計に対する感度は91-98%/73-99%の範囲で高い。ユーザジャーニーレベルの分析では、5つのユニークなユーザクラスタが示されており、その中では'New Shoppers'が最も予測可能であり、'Impulsive Shoppers'は低い視聴と高いカートリング行動で購入できる。さらに、クラスタ変換メトリクスと部分ラベル学習は、各ユーザクラスタの新たな/非ラベルイベントへの堅牢性を示す。これにより、顧客クラスタは戦略的にターゲットされたナッジモデルを支援することができる。 Customer purchasing behavior analysis plays a key role in developing insightful communication strategies between online vendors and their customers. To support the recent increase in online shopping trends, in this work, we present a customer purchasing behavior analysis system using supervised, unsupervised and semi-supervised learning methods. The proposed system analyzes session and user-journey level purchasing behaviors to identify customer categories/clusters that can be useful for targeted consumer insights at scale. We observe higher sensitivity to the design of online shopping portals for session-level purchasing prediction with accuracy/recall in range 91-98%/73-99%, respectively. The user-journey level analysis demonstrates five unique user clusters, wherein 'New Shoppers' are most predictable and 'Impulsive Shoppers' are most unique with low viewing and high carting behaviors for purchases. Further, cluster transformation metrics and partial label learning demonstrates the robustness of each user cluster to new/unlabelled events. Thus, customer clusters can aid strategic targeted nudge models.	翻訳日:2021-02-05 09:26:34 公開日:2021-02-02
# (参考訳) NLPを用いた金融トレンド予測のための確率時系列モデル A Stochastic Time Series Model for Predicting Financial Trends using NLP ( http://arxiv.org/abs/2102.01290v1 ) ライセンス: CC BY 4.0	Pratyush Muthukumar, Jie Zhong	(参考訳) 株価予測は、非常に複雑で非常に重要な研究分野です。ディープニューラルネットワーク技術の進歩により、研究者は金融トレンドを予測するために高精度なモデルを開発することができる。 ST-GAN(Stochastic Time-Series Generative Adversarial Network)と呼ばれる新しいディープラーニングモデルを提案し、財務ニューステキストと財務数値データの両方を分析して株価動向を予測します。我々は、GAN(Generative Adversarial Network)のような最先端技術を用いて、テキストデータと数値データの相関関係を時間とともに学習する。ナイーブ・ベイズの金融テキストデータに対する感情分析の学習表現と、数値データからのテクニカル指標を直接利用し、時系列GANを訓練する新しい方法を開発する。実験の結果,株価予測のための深層ニューラルネットワークの既存モデルおよび先行研究に対して有意な改善がみられた。 Stock price forecasting is a highly complex and vitally important field of research. Recent advancements in deep neural network technology allow researchers to develop highly accurate models to predict financial trends. We propose a novel deep learning model called ST-GAN, or Stochastic Time-series Generative Adversarial Network, that analyzes both financial news texts and financial numerical data to predict stock trends. We utilize cutting-edge technology like the Generative Adversarial Network (GAN) to learn the correlations among textual and numerical data over time. We develop a new method of training a time-series GAN directly using the learned representations of Naive Bayes' sentiment analysis on financial text data alongside technical indicators from numerical data. Our experimental results show significant improvement over various existing models and prior research on deep neural networks for stock price forecasting.	翻訳日:2021-02-05 06:16:39 公開日:2021-02-02
# (参考訳) cuboidal partitioningによるヒューマンマシン協調ビデオ符号化 Human-Machine Collaborative Video Coding Through Cuboidal Partitioning ( http://arxiv.org/abs/2102.01307v1 ) ライセンス: CC0 1.0	Ashek Ahmmed, Manoranjan Paul, Manzur Murshed, and David Taubman	(参考訳) ビデオコーディングアルゴリズムは、ビデオフレーム全体をエンコードしてデコードしますが、機能コーディング技術は、特定のアプリケーションに必要な最も重要な情報を保存および伝達するだけです。これは、ビデオコーディングが人間の知覚をターゲットとし、機能コーディングがマシンビジョンタスクをターゲットとするからです。近年,これら2つの領域間のギャップを埋める試みが行われている。本研究では,人間の視覚とcuboidsを用いた機械ビジョンアプリケーションとの共通性を利用して,映像符号化フレームワークを提案する。これは、ビデオフレーム上の長方形領域の推定が計算効率が高く、コンパクトな表現とオブジェクト中心を持つためである。このような特性は、従来のビデオコーディングシステムに付加価値をもたらすことがすでに示されています。ここで、現在のフレームからcuboidal feature descriptorを抽出し、オブジェクト検出の形でマシンビジョンタスクを達成するために使用する。実験結果から, 現在のテストフレームの立方形特徴指向表現を備えた場合, 訓練された分類器は, より優れた平均精度が得られることがわかった。さらに、この表現は、キャプチャされたフレームを受信機に通信する必要がある場合、ビットレートを7%削減する。 Video coding algorithms encode and decode an entire video frame while feature coding techniques only preserve and communicate the most critical information needed for a given application. This is because video coding targets human perception, while feature coding aims for machine vision tasks. Recently, attempts have been made to bridge the gap between these two domains. In this work, we propose a video coding framework that leverages the commonality between human vision and machine vision applications using cuboids. This is because cuboids, estimated rectangular regions over a video frame, are computationally efficient, have a compact representation, and are object centric. Such properties have already been shown to add value to traditional video coding systems. Herein, cuboidal feature descriptors are extracted from the current frame and then employed to accomplish a machine vision task in the form of object detection. Experimental results show that a trained classifier yields superior average precision when equipped with a cuboidal-feature-oriented representation of the current test frame. 
Additionally, this representation costs 7% less in bit rate if the captured frames need to be communicated to a receiver.	翻訳日:2021-02-05 06:05:55 公開日:2021-02-02
# (参考訳) グラフ制約付き変更点学習によるQRS複合検出 A Graph-Constrained Changepoint Learning Approach for Automatic QRS-Complex Detection ( http://arxiv.org/abs/2102.01319v1 ) ライセンス: CC BY 4.0	Atiyeh Fotoohinasab, Toby Hocking, and Fatemeh Afghah	(参考訳) 本研究では,Rピーク位置の探索にグラフベースの変化点検出モデルを適用し,ECG信号解析の新しい視点を提案する。このモデルは、ラベル付きECGデータから制約グラフを学習する新しいグラフ学習アルゴリズムに基づいています。提案した学習アルゴリズムは単純な初期グラフから始まり、Rピーク検出において最終グラフが最大精度を持つように反復的にグラフを編集する。 MIT-BIH不整脈データベースでアルゴリズムの性能を評価します。評価結果は,提案手法が他の最先端手法に匹敵する結果が得られることを示す。提案手法は、全体の感度 Sen = 99.64%、陽性適中率 PPR = 99.71%、検出誤差率 DER = 0.19 を達成している。 This study presents a new viewpoint on ECG signal analysis by applying a graph-based changepoint detection model to locate R-peak positions. This model is based on a new graph learning algorithm that learns the constraint graph from labeled ECG data. The proposed learning algorithm starts with a simple initial graph and iteratively edits the graph so that the final graph has the maximum accuracy in R-peak detection. We evaluate the performance of the algorithm on the MIT-BIH Arrhythmia Database. The evaluation results demonstrate that the proposed method can obtain results comparable to other state-of-the-art approaches. The proposed method achieves an overall sensitivity of Sen = 99.64%, a positive predictivity of PPR = 99.71%, and a detection error rate of DER = 0.19.	翻訳日:2021-02-05 05:57:19 公開日:2021-02-02
# (参考訳) 深層ニューラルネットワークの理解・可視化・説明に関する調査研究 A Survey on Understanding, Visualizations, and Explanation of Deep Neural Networks ( http://arxiv.org/abs/2102.01792v1 ) ライセンス: CC BY 4.0	Atefeh Shahroudnejad	(参考訳) 機械学習と信号処理領域の最近の進歩は、エンジニアリングの重要性の異なる困難な問題に対する前例のないパフォーマンスと高い精度のために、Deep Neural Networks(DNNs)への関心が大幅に高まりました。しかし、このような深層学習アーキテクチャが人間の生活に関わる決定(例えば、制御システムや医学的応用)に利用される場合、深層モデルの決定の背後にある議論を理解すること、信頼すること、一言で「説明」することが重要となる。多くのアプリケーションでは、人工ニューラルネットワーク(DNNを含む)はブラックボックスシステムと見なされ、内部処理アクションの十分な手がかりを提供していません。深層ネットワークの動作と決定を説明するための最近の取り組みが始まっていますが、DNNの行動と決定を推論することを目的とした説明可能な人工知能(XAI)ドメインはまだ初期段階にあります。本研究の目的は、DNNの内部的および全体的行動の理解、可視化、説明に関する包括的概要を提供することである。 Recent advancements in machine learning and signal processing domains have resulted in an extensive surge of interest in Deep Neural Networks (DNNs) due to their unprecedented performance and high accuracy for different and challenging problems of significant engineering importance. However, when such deep learning architectures are utilized for making critical decisions such as the ones that involve human lives (e.g., in control systems and medical applications), it is of paramount importance to understand, trust, and in one word "explain" the argument behind deep models' decisions. In many applications, artificial neural networks (including DNNs) are considered as black-box systems, which do not provide sufficient clue on their internal processing actions. Although some recent efforts have been initiated to explain the behaviors and decisions of deep networks, explainable artificial intelligence (XAI) domain, which aims at reasoning about the behavior and decisions of DNNs, is still in its infancy. The aim of this paper is to provide a comprehensive overview on Understanding, Visualization, and Explanation of the internal and overall behavior of DNNs.	翻訳日:2021-02-05 04:54:36 公開日:2021-02-02
# (参考訳) メジャー化対策、シーケンス複雑性、オンライン学習 Majorizing Measures, Sequential Complexities, and Online Learning ( http://arxiv.org/abs/2102.01729v1 ) ライセンス: CC BY 4.0	Adam Block, Yuval Dagan, and Sasha Rakhlin	(参考訳) 本稿では, シーケンシャルなRademacher複雑性を制御するために, ジェネリックチェアリングと大規模化手法を導入する。本研究は,水平独立な方法での逐次スケールセンシティブな次元の観点で支配される分数被覆数の概念を大規模化することで,さらに複雑性の仮定により,逐次スケールセンシティブな次元の積分による最悪ケースシーケンシャルなラデマッハ複雑性の厳密な制御を確立する。最後に、最悪ケースシーケンシャルなRademacher複雑性に対して、厳密な収縮不等式を確立する。上記は、経験的過程の古典的理論を逐次ケースに拡張する上で顕著なオープン問題の数の解決を構成し、その結果、オンライン学習のための鋭い結果を確立します。 We introduce the technique of generic chaining and majorizing measures for controlling sequential Rademacher complexity. We relate majorizing measures to the notion of fractional covering numbers, which we show to be dominated in terms of sequential scale-sensitive dimensions in a horizon-independent way, and, under additional complexity assumptions establish a tight control on worst-case sequential Rademacher complexity in terms of the integral of sequential scale-sensitive dimension. Finally, we establish a tight contraction inequality for worst-case sequential Rademacher complexity. The above constitutes the resolution of a number of outstanding open problems in extending the classical theory of empirical processes to the sequential case, and, in turn, establishes sharp results for online learning.	翻訳日:2021-02-05 03:52:59 公開日:2021-02-02
# (参考訳) 深層学習法に基づくトップビュー画像列における車両軌跡予測 Vehicle trajectory prediction in top-view image sequences based on deep learning method ( http://arxiv.org/abs/2102.01749v1 ) ライセンス: CC BY 4.0	Zahra Salahshoori Nejad, Hamed Heravi, Ali Rahimpour Jounghani, Abdollah Shahrezaie, Afshin Ebrahimi	(参考訳) 毎年、世界中の多くの負傷者や死亡者が自動車事故に関連しています。この値は、運転支援システムの使用により、最近ある程度減少している。運転支援システム(すなわち自動運転システム)の開発は、この数を減らす上で重要な役割を果たす。自動走行車および高度な安全システムにおいて,周辺車両の移動を推定・予測することが不可欠である。さらに,事故時の運転者の行動,車両の移動と周囲の車両の歴史,交通現場における位置など,多くの要因が軌跡の予測に影響を及ぼしている。車両は交通の安全な経路を移動し、最短で他のドライバーの予測不能な行動に反応しなければならない。ここでは,自動車の走行経路を予測するために,道路の空中画像から得られた画像から学習した計算量が少ないモデルを提案する。本手法は,ソーシャルテンソルを用いたエンコーダデコーダモデルに基づいて,周囲の車両の動きが対象車両に与える影響をモデル化する。提案モデルは,対象車両の移動履歴とその周辺状況に関する画像を見るだけで,任意の高速道路における車両の将来経路を予測できる。深層学習はこれらの画像の特徴を抽出するツールとして用いられた。 HighDデータベースを用いて道路の空中画像の画像データセットを作成し,本データベース上でのモデルの性能評価を行った。提案手法は, 5秒間, 1.91 の RMSE を達成し, 前回の研究では, 最良経路予測法よりも誤差が少ないことがわかった。 Annually, a large number of injuries and deaths around the world are related to motor vehicle accidents. This value has recently been reduced to some extent, via the use of driver-assistance systems. Developing driver-assistance systems (i.e., automated driving systems) can play a crucial role in reducing this number. Estimating and predicting surrounding vehicles' movement is essential for an automated vehicle and advanced safety systems. Moreover, predicting the trajectory is influenced by numerous factors, such as drivers' behavior during accidents, history of the vehicle's movement and the surrounding vehicles, and their position on the traffic scene. The vehicle must move over a safe path in traffic and react to other drivers' unpredictable behaviors in the shortest time. Herein, to predict automated vehicles' path, a model with low computational complexity is proposed, which is trained by images taken from the road's aerial image. 
Our method is based on an encoder-decoder model that utilizes a social tensor to model the effect of the surrounding vehicles' movement on the target vehicle. The proposed model can predict the vehicle's future path in any freeway only by viewing the images related to the history of the target vehicle's movement and its neighbors. Deep learning was used as a tool for extracting the features of these images. Using the HighD database, an image dataset of the road's aerial image was created, and the model's performance was evaluated on this new database. We achieved the RMSE of 1.91 for the next 5 seconds and found that the proposed method had less error than the best path-prediction methods in previous studies.	翻訳日:2021-02-05 02:54:41 公開日:2021-02-02
# (参考訳) Ansatz-Independent Variational Quantum Classifier Ansatz-Independent Variational Quantum Classifier ( http://arxiv.org/abs/2102.01759v1 ) ライセンス: CC BY 4.0	Hideyuki Miyahara and Vwani Roychowdhury	(参考訳) 変分量子分類器(VQCs)のパラダイムは、量子状態として \textit{classical information} を符号化し、次いで量子処理と古典的な予測を生成するための測定を行う。 VQCは、短期量子デバイスを効率的に活用するための候補である:$M$-dimensionalデータセットを含む分類器は、振幅エンコーディングを使用して、$\lceil \log_2 M \rceil$ qubitsだけで実装できる。しかしながら、VQCの設計と訓練のための一般的な枠組みは提案されておらず、古典的分類器との能力と分析的関係の根本的な理解はよく分かっていない。 VQCの奨励的な具体化である量子回路学習(QCL)では、アンサッツを用いて、所定の位相とパラメトリケートゲートを持つ回路として量子進化演算子を表現し、最適化によってゲートパラメータを学習する。本稿では、まずVQCに関するオープンな疑問に対処し、QCLを含むそれらがよく知られたカーネルメソッドに適合していることを示す。このような対応に基づき,効率的なアンサッツ非依存型VQCの設計枠組みを考案し,これをユニタリカーネル法 (UKM) と呼び,VQCにおけるユニタリ進化演算子を直接最適化する。そこで本研究では,QCLの性能がUKMによって上からバウンドされていることを示す。次に、与えられたユニタリ演算子に対して効率的な量子回路を設計するための変分回路実現(VCR)を提案する。 UKMとVCRを組み合わせることで、高性能回路を構築するための効率的な枠組みを確立します。最後に,複数のデータセットに対する広範囲な数値シミュレーションにより,ukmとvcrの性能を比較検討した。 The paradigm of variational quantum classifiers (VQCs) encodes \textit{classical information} as quantum states, followed by quantum processing and then measurements to generate classical predictions. VQCs are promising candidates for efficient utilization of a near-term quantum device: classifiers involving $M$-dimensional datasets can be implemented with only $\lceil \log_2 M \rceil$ qubits by using an amplitude encoding. A general framework for designing and training VQCs, however, has not been proposed, and a fundamental understanding of its power and analytical relationships with classical classifiers are not well understood. An encouraging specific embodiment of VQCs, quantum circuit learning (QCL), utilizes an ansatz: it expresses the quantum evolution operator as a circuit with a predetermined topology and parametrized gates; training involves learning the gate parameters through optimization. In this letter, we first address the open questions about VQCs and then show that they, including QCL, fit inside the well-known kernel method. 
Based on such correspondence, we devise a design framework of efficient ansatz-independent VQCs, which we call the unitary kernel method (UKM): it directly optimizes the unitary evolution operator in a VQC. Thus, we show that the performance of QCL is bounded from above by the UKM. Next, we propose a variational circuit realization (VCR) for designing efficient quantum circuits for a given unitary operator. By combining the UKM with the VCR, we establish an efficient framework for constructing high-performing circuits. We finally benchmark the relatively superior performance of the UKM and the VCR via extensive numerical simulations on multiple datasets.	翻訳日:2021-02-04 19:14:13 公開日:2021-02-02
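The qubit count mentioned in the abstract above follows from amplitude encoding: an $M$-dimensional vector is padded to the next power of two and normalized, so that its entries can serve as the amplitudes of a state on $\lceil \log_2 M \rceil$ qubits. The sketch below shows only this classical preprocessing step; the function names are hypothetical and it does not build or simulate a circuit.

```python
import math


def amplitude_encoding_qubits(m):
    """Number of qubits needed for an m-dimensional vector: ceil(log2(m))."""
    if m < 1:
        raise ValueError("dimension must be at least 1")
    # (m - 1).bit_length() equals ceil(log2(m)) for every integer m >= 1.
    return (m - 1).bit_length()


def amplitude_encode(vector):
    """Zero-pad to length 2**n and scale to unit Euclidean norm."""
    n_qubits = amplitude_encoding_qubits(len(vector))
    padded = list(vector) + [0.0] * (2 ** n_qubits - len(vector))
    norm = math.sqrt(sum(v * v for v in padded))
    if norm == 0.0:
        raise ValueError("cannot encode the zero vector")
    return [v / norm for v in padded]
```

For instance, a 784-dimensional input (a hypothetical 28x28 image) would need 10 qubits under this scheme.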
# (参考訳) UAVネットワークにおけるデータ駆動ミリ波通信のための分散条件付き汎用ネットワーク(GAN) Distributed Conditional Generative Adversarial Networks (GANs) for Data-Driven Millimeter Wave Communications in UAV Networks ( http://arxiv.org/abs/2102.01751v1 ) ライセンス: CC BY 4.0	Qianqian Zhang, Aidin Ferdowsi, Walid Saad, Mehdi Bennis	(参考訳) 本稿では,無人航空機(UAV)無線ネットワークにおけるミリ波(mmWave)通信のためのデータ駆動型空対地(A2G)チャネル推定手法を提案する。まず,ミリ波チャネル情報収集に有効なチャネル推定手法を開発し,各UAVは,各ビームフォーミング方向に沿って,条件付き生成対向ネットワーク(CGAN)を介してスタンドアロンチャネルモデルを訓練することができる。次に、訓練されたチャネルモデルのアプリケーションシナリオをより広い空間時間領域に拡張するために、分散CGANアーキテクチャに基づく協調フレームワークが開発され、各UAVが完全に分散された方法でmmWaveチャネル分布を共同で学ぶことができる。効率的な学習プロセスを保証するために、協調チャネルモデリングの学習率を最大化する最適なUAVネットワークトポロジーに必要な十分な条件を導出し、その後、分散ネットワーク構造に基づいて、UAV毎の最適CGAN学習ソリューションを特徴付ける。シミュレーションの結果,提案手法は各uavの局所的トレーニング誤差に頑健であることが判明した。一方、より大きな空飛ぶネットワークサイズでは、効率的な学習率を保証するために、UAV当たりの通信資源がより多く必要となる。また,情報共有のないスタンドアローンCGANや,他の2つの分散スキーム,すなわち多識別器CGANとフェデレートCGAN法と比較して,提案手法は,環境学習中に高いモデリング精度を示し,UAVダウンリンクmmWave通信のオンライン性能においてより高い平均データレートを達成することを示す。 In this paper, a novel framework is proposed to perform data-driven air-to-ground (A2G) channel estimation for millimeter wave (mmWave) communications in an unmanned aerial vehicle (UAV) wireless network. First, an effective channel estimation approach is developed to collect mmWave channel information, allowing each UAV to train a stand-alone channel model via a conditional generative adversarial network (CGAN) along each beamforming direction. Next, in order to expand the application scenarios of the trained channel model into a broader spatial-temporal domain, a cooperative framework, based on a distributed CGAN architecture, is developed, allowing each UAV to collaboratively learn the mmWave channel distribution in a fully-distributed manner. 
To guarantee an efficient learning process, necessary and sufficient conditions for the optimal UAV network topology that maximizes the learning rate for cooperative channel modeling are derived, and the optimal CGAN learning solution per UAV is subsequently characterized, based on the distributed network structure. Simulation results show that the proposed distributed CGAN approach is robust to the local training error at each UAV. Meanwhile, a larger airborne network size requires more communication resources per UAV to guarantee an efficient learning rate. The results also show that, compared with a stand-alone CGAN without information sharing and two other distributed schemes, namely a multi-discriminator CGAN and a federated CGAN method, the proposed distributed CGAN approach yields a higher modeling accuracy while learning the environment, and it achieves a larger average data rate in the online performance of UAV downlink mmWave communications.	翻訳日:2021-02-04 18:42:31 公開日:2021-02-02
# 二重分散化による近接最適オフライン強化学習 Near-Optimal Offline Reinforcement Learning via Double Variance Reduction ( http://arxiv.org/abs/2102.01748v1 ) ライセンス: Link先を確認	Ming Yin, Yu Bai, Yu-Xiang Wang	(参考訳) 我々は、履歴データのみを使用した政策最適化を目的としたRLのモチベーションの高い設定であるオフライン強化学習(RL)の問題を検討します。適用性は広いが、オフラインRLの理論的理解、例えば最適なサンプル複雑性は、例えば 'emph{tabular} Markov Decision Processes (MDPs) のような基本的な設定でも、ほとんど開かれている。本稿では,オフラインRLの新しい分散還元アルゴリズムであるOff-Policy Double Variance reduction(OPDVR)を提案する。以上より,opdvrは,有限ホリゾン定常遷移設定におけるオフラインデータの$\widetilde{o}(h^2/d_m\epsilon^2)$で,$h$は地平線長,$d_m$は行動ポリシーによって引き起こされる最小の限界的状態行動分布であることを示す。これは、最もよく知られた上限を$H$の係数で改善します。さらに,Omega(H^2/d_m\epsilon^2)$という情報理論の下限を確立し,OPDVRが対数因子に最適であることを証明した。最後に, OPDVR は非定常遷移を持つ有限水平 MDP や割引された報酬を持つ無限水平 MDP などの代替条件下で, 速度-最適サンプルの複雑性も達成できることを示す。 We consider the problem of offline reinforcement learning (RL) -- a well-motivated setting of RL that aims at policy optimization using only historical data. Despite its wide applicability, theoretical understandings of offline RL, such as its optimal sample complexity, remain largely open even in basic settings such as \emph{tabular} Markov Decision Processes (MDPs). In this paper, we propose Off-Policy Double Variance Reduction (OPDVR), a new variance reduction based algorithm for offline RL. Our main result shows that OPDVR provably identifies an $\epsilon$-optimal policy with $\widetilde{O}(H^2/d_m\epsilon^2)$ episodes of offline data in the finite-horizon stationary transition setting, where $H$ is the horizon length and $d_m$ is the minimal marginal state-action distribution induced by the behavior policy. This improves over the best known upper bound by a factor of $H$. Moreover, we establish an information-theoretic lower bound of $\Omega(H^2/d_m\epsilon^2)$ which certifies that OPDVR is optimal up to logarithmic factors. 
Lastly, we show that OPDVR also achieves rate-optimal sample complexity under alternative settings such as the finite-horizon MDPs with non-stationary transitions and the infinite horizon MDPs with discounted rewards.	翻訳日:2021-02-04 17:52:35 公開日:2021-02-02
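To make the scaling in the abstract above concrete, the sketch below evaluates the stated order $H^2/(d_m\epsilon^2)$ next to the previous best order, which the abstract says is larger by a factor of $H$. Constants and logarithmic factors are deliberately ignored, so the numbers indicate growth rates only, not actual episode counts. The function names are hypothetical.

```python
def opdvr_episode_order(horizon, d_min, epsilon):
    """Order of episodes stated for OPDVR: H^2 / (d_m * eps^2).

    Constants and log factors are dropped (an assumption of this sketch).
    """
    if horizon <= 0 or not 0 < d_min <= 1 or epsilon <= 0:
        raise ValueError("need horizon > 0, 0 < d_min <= 1, epsilon > 0")
    return horizon ** 2 / (d_min * epsilon ** 2)


def previous_episode_order(horizon, d_min, epsilon):
    """Previous best order implied by the abstract: a factor of H larger."""
    return horizon * opdvr_episode_order(horizon, d_min, epsilon)
```

Halving $\epsilon$ multiplies both orders by four, while doubling $H$ multiplies the OPDVR order by four and the previous order by eight.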
# 大規模で真のスパースニューラルネットワーク Truly Sparse Neural Networks at Scale ( http://arxiv.org/abs/2102.01732v1 ) ライセンス: Link先を確認	Selima Curci, Decebal Constantin Mocanu, Mykola Pechenizkiyi	(参考訳) 近年,ニューラルネットワークにおけるトレーニングと推論効率のデファクトなアプローチとして,スパーストレーニング手法が確立されている。しかし、この効率性は理論上は正しい。実際、誰もがバイナリマスクを使用してスパーシティをシミュレートします。典型的なディープラーニングソフトウェアとハードウェアは高密度マトリックス操作に最適化されています。本稿では直交的アプローチを採り、真にスパースなニューラルネットワークをトレーニングし、その潜在能力を最大限に活用できることを示す。この目的を達成するために,(1)並列学習アルゴリズムとそれに対応するスパース実装をスクラッチから構築し,(2)勾配流を優先する非学習パラメータを持つ活性化関数,(3)冗長性を除去するための隠れ神経細胞重要度指標という,3つの新しい貢献法を提案する。 1つにまとめると、私たちは記録を破り、表現力の観点から訓練された史上最大のニューラルネットワークを訓練することができる。その結果,環境にやさしい人工知能時代への道を歩みながら,最先端のパフォーマンスを実現することができた。 Recently, sparse training methods have started to be established as a de facto approach for training and inference efficiency in artificial neural networks. Yet, this efficiency is just in theory. In practice, everyone uses a binary mask to simulate sparsity since the typical deep learning software and hardware are optimized for dense matrix operations. In this paper, we take an orthogonal approach, and we show that we can train truly sparse neural networks to harvest their full potential. To achieve this goal, we introduce three novel contributions, specially designed for sparse neural networks: (1) a parallel training algorithm and its corresponding sparse implementation from scratch, (2) an activation function with non-trainable parameters to favour the gradient flow, and (3) a hidden neurons importance metric to eliminate redundancies. All in one, we are able to break the record and to train the largest neural network ever trained in terms of representational power -- reaching the bat brain size. The results show that our approach has state-of-the-art performance while opening the path for an environmentally friendly artificial intelligence era.	翻訳日:2021-02-04 17:48:45 公開日:2021-02-02
# MoonBoardクライミングルート分類と生成のための繰り返しニューラルネットワーク Recurrent Neural Network for MoonBoard Climbing Route Classification and Generation ( http://arxiv.org/abs/2102.01788v1 ) ライセンス: Link先を確認	Yi-Shiou Duh, Ray Chang	(参考訳) 登山ルートの難易度と新ルートの作成はどちらも困難である。既存の機械学習モデルでは、問題の難易度を正確に予測できないだけでなく、合理的な問題も生成できない。そこで本研究では,人間の登山者の手列を模倣するために開発した新しい移動前処理パイプラインである"betamove"を導入した。事前処理された移動シーケンスは、経路生成器とグレード予測器の両方を訓練するために使用された。ムーンボード問題を適切な移動順序に前処理することで、評価予測器の精度は人間のレベルに近い性能に到達し、経路生成器は以前の作業よりもずっと良い品質の新しい経路を生成する。 BetaMoveでは、機械学習の問題に対する人間の洞察を注入することができ、これが将来の登山スタイルの分類問題における移動学習の基礎となることを実証した。 Classifying the difficulties of climbing routes and generating new routes are both challenging. Existing machine learning models not only fail to accurately predict a problem's difficulty, but they are also unable to generate reasonable problems. In this work, we introduced "BetaMove", a new move preprocessing pipeline we developed, in order to mimic a human climber's hand sequence. The preprocessed move sequences were then used to train both a route generator and a grade predictor. By preprocessing a MoonBoard problem into a proper move sequence, the accuracy of our grade predictor reaches near human-level performance, and our route generator produces new routes of much better quality compared to previous work. We demonstrated that with BetaMove, we are able to inject human insights into the machine learning problems, and this can be the foundations for future transfer learning on climbing style classification problems.	翻訳日:2021-02-04 17:46:52 公開日:2021-02-02
# 情報的手法を用いた絵画の自動分析 Automatic analysis of artistic paintings using information-based measures ( http://arxiv.org/abs/2102.01767v1 ) ライセンス: Link先を確認	Jorge Miguel Silva, Diogo Pratas, Rui Antunes, S\'ergio Matos, and Armando J. Pinho	(参考訳) 芸術コミュニティは、芸術絵画の認証と分類のための自動計算分析にますます依存している。本稿では,物体の特徴の和を定量化する尺度である,その複雑さを分析し,芸術絵画に存在する隠れパターンと関係を同定する。具体的には,正規化圧縮法 (NC) とブロック分解法 (BDM) を91名の著者から集めた4,266点の絵画データセットに適用し,これらの情報に基づく手法が美術絵画の記述子としての可能性を検討する。どちらの尺度も、絵画、作家、芸術運動の類型を一貫して記述している。さらに、NCと絵画の粗さの尺度を組み合わせることで、効率的なスタイリスティックな記述子を作り出す。さらに,各絵画の局所情報を定量化することにより,アーティストの作風やその芸術的影響,共有技術に関する重要な情報を記述する指紋を定義する。より根本的には、この情報は、各著者が一般的にキャンバスにまたがる要素を構成・配布し、それゆえ、どのように作品が知覚されるかを記述する。最後に, 地域的複雑度と2点高さ差相関関数が, 美術絵画の作風と著者分類の方法論を改善する補助的特徴であることを示す。研究全体は、高速な著者特性評価と認証のための広範なウェブサイト(http://panther.web.ua.pt)によってサポートされています。 The artistic community is increasingly relying on automatic computational analysis for authentication and classification of artistic paintings. In this paper, we identify hidden patterns and relationships present in artistic paintings by analysing their complexity, a measure that quantifies the sum of characteristics of an object. Specifically, we apply Normalized Compression (NC) and the Block Decomposition Method (BDM) to a dataset of 4,266 paintings from 91 authors and examine the potential of these information-based measures as descriptors of artistic paintings. Both measures consistently described the equivalent types of paintings, authors, and artistic movements. Moreover, combining the NC with a measure of the roughness of the paintings creates an efficient stylistic descriptor. Furthermore, by quantifying the local information of each painting, we define a fingerprint that describes critical information regarding the artists' style, their artistic influences, and shared techniques. More fundamentally, this information describes how each author typically composes and distributes the elements across the canvas and, therefore, how their work is perceived. 
Finally, we demonstrate that regional complexity and two-point height difference correlation function are useful auxiliary features that improve current methodologies in style and author classification of artistic paintings. The whole study is supported by an extensive website (http://panther.web.ua.pt) for fast author characterization and authentication.	翻訳日:2021-02-04 17:35:59 公開日:2021-02-02
# 音声認識と翻訳のための多言語TEDxコーパス The Multilingual TEDx Corpus for Speech Recognition and Translation ( http://arxiv.org/abs/2102.01757v1 ) ライセンス: Link先を確認	Elizabeth Salesky, Matthew Wiesner, Jacob Bremerman, Roldano Cattoni, Matteo Negri, Marco Turchi, Douglas W. Oard, Matt Post	(参考訳) 音声認識(ASR)および音声翻訳(ST)研究を支援するために構築された多言語TEDxコーパスについて述べる。コーパスはTEDxの8つのソース言語による音声録音のコレクションである。書き起こしを文に分割し、ソース言語音声とターゲット言語翻訳に対応させる。コーパスはオープンソースコードとともにリリースされ、新しい講演や言語の拡張が可能になった。コーパス作成手法は,従来よりも多くの言語に適用でき,マルチウェイ並列評価セットを作成することができる。低リソース言語ペアの翻訳性能を改善するための多言語モデルを含む,複数のASRおよびST設定のベースラインを提供する。 We present the Multilingual TEDx corpus, built to support speech recognition (ASR) and speech translation (ST) research across many non-English source languages. The corpus is a collection of audio recordings from TEDx talks in 8 source languages. We segment transcripts into sentences and align them to the source-language audio and target-language translations. The corpus is released along with open-sourced code enabling extension to new talks and languages as they become available. Our corpus creation methodology can be applied to more languages than previous work, and creates multi-way parallel evaluation sets. We provide baselines in multiple ASR and ST settings, including multilingual models to improve translation performance for low-resource language pairs.	翻訳日:2021-02-04 17:34:45 公開日:2021-02-02
# ミニマックス最適化を伴わない連続Wasserstein-2重心推定 Continuous Wasserstein-2 Barycenter Estimation without Minimax Optimization ( http://arxiv.org/abs/2102.01752v1 ) ライセンス: Link先を確認	Alexander Korotin, Lingxiao Li, Justin Solomon, Evgeny Burnaev	(参考訳) ワッサーシュタイン・バリセンターは、最適輸送に基づく確率測度の重み付き平均の幾何学的概念を提供する。本稿では,離散性に制限されない入力尺度へのサンプルアクセスを与えられたWasserstein-2バリセンタを計算するスケーラブルなアルゴリズムを提案する。過去のアプローチはエントロピーあるいは二次正規化に依存しているが、我々はバイアスの導入を避けるために入力凸ニューラルネットワークとサイクルコンシスタンス正規化を用いる。その結果、私たちのアプローチはミニマックス最適化に頼りません。誤差境界に関する理論的分析と,提案手法の有効性を低次元定性シナリオおよび高次元定量的実験で実証的に証明する。 Wasserstein barycenters provide a geometric notion of the weighted average of probability measures based on optimal transport. In this paper, we present a scalable algorithm to compute Wasserstein-2 barycenters given sample access to the input measures, which are not restricted to being discrete. While past approaches rely on entropic or quadratic regularization, we employ input convex neural networks and cycle-consistency regularization to avoid introducing bias. As a result, our approach does not resort to minimax optimization. We provide theoretical analysis on error bounds as well as empirical evidence of the effectiveness of the proposed approach in low-dimensional qualitative scenarios and high-dimensional quantitative experiments.	翻訳日:2021-02-04 17:24:57 公開日:2021-02-02
# 自動走行車からのイベントデータを用いた人工知能システムの信頼性解析 Reliability Analysis of Artificial Intelligence Systems Using Recurrent Events Data from Autonomous Vehicles ( http://arxiv.org/abs/2102.01740v1 ) ライセンス: Link先を確認	Yili Hong and Jie Min and Caleb B. King and William Q. Meeker	(参考訳) 人工知能(AI)システムはますます一般的になり、トレンドは続きます。 AIシステムの例としては、自動運転車(AV)、コンピュータビジョン、自然言語処理、AI医療専門家などがある。安全かつ効果的なAIシステムのデプロイを可能にするためには、そのようなシステムの信頼性を評価する必要がある。従来、信頼性評価は信頼性テストデータとそれに続く統計モデリングと分析に基づいている。しかし、AIシステムのための信頼性データの可用性は、そのようなデータが通常敏感でプロプライエタリであるため、制限されます。カリフォルニア州自動車局(DMV)は、多くのAVメーカーがAVロードテストを行っているAVテストプログラムを監督および規制しています。プログラムに参加するメーカーは、カリフォルニア州のDMVに繰り返しの離脱イベントを報告する必要があります。この情報は一般に公開されています。本稿では、AVにおけるAIシステムの信頼性の表現としてリカレントデエンゲージメントイベントを使用し、AV駆動テストからリカレントイベントデータをモデル化・解析するための統計フレームワークを提案する。ソフトウェア信頼性には従来のパラメトリックモデルを用い,イベントプロセスを記述するために単調スプラインに基づく新しい非パラメトリックモデルを提案する。我々は,最良モデルの選択,不確かさの定量化,イベントプロセスにおける不均一性の検証のための推論手順を開発した。次に、4つのAVメーカから繰り返し発生するイベントデータを解析し、AV内のAIシステムの信頼性を推測する。また,提案分析を他のaiシステムの信頼性評価に適用する方法について述べる。 Artificial intelligence (AI) systems have become increasingly common and the trend will continue. Examples of AI systems include autonomous vehicles (AV), computer vision, natural language processing, and AI medical experts. To allow for safe and effective deployment of AI systems, the reliability of such systems needs to be assessed. Traditionally, reliability assessment is based on reliability test data and the subsequent statistical modeling and analysis. The availability of reliability data for AI systems, however, is limited because such data are typically sensitive and proprietary. The California Department of Motor Vehicles (DMV) oversees and regulates an AV testing program, in which many AV manufacturers are conducting AV road tests. Manufacturers participating in the program are required to report recurrent disengagement events to California DMV. This information is being made available to the public. 
In this paper, we use recurrent disengagement events as a representation of the reliability of the AI system in AV, and propose a statistical framework for modeling and analyzing the recurrent events data from AV driving tests. We use traditional parametric models in software reliability and propose a new nonparametric model based on monotonic splines to describe the event process. We develop inference procedures for selecting the best models, quantifying uncertainty, and testing heterogeneity in the event process. We then analyze the recurrent events data from four AV manufacturers, and make inferences on the reliability of the AI systems in AV. We also describe how the proposed analysis can be applied to assess the reliability of other AI systems.	翻訳日:2021-02-04 17:23:59 公開日:2021-02-02
# 深部畳み込みニューラルネットワークによる地表面の相互結合効果予測 Deep Convolutional Neural Networks to Predict Mutual Coupling Effects in Metasurfaces ( http://arxiv.org/abs/2102.01761v1 ) ライセンス: Link先を確認	Sensong An, Bowen Zheng, Mikhail Y. Shalaginov, Hong Tang, Hang Li, Li Zhou, Yunxi Dong, Mohammad Haerinia, Anuradha Murthy Agarwal, Clara Rivero-Baleine, Myungkoo Kang, Kathleen A. Richardson, Tian Gu, Juejun Hu, Clayton Fowler and Hualiang Zhang	(参考訳) metasurfacesはコンパクトで大規模の光学デバイスを実現するための新しい有望なプラットフォームを提供してきた。従来の準曲面設計手法では、要素間の近接場結合効果が非同一構造に囲まれると変化するため、ほとんどの場合において不正確な各要素の周期的境界条件を仮定する。本稿では,大規模アレイに配置された各ターゲットメタアトムの実際の電磁(EM)応答を,近接場結合効果を考慮して予測する深層学習手法を提案する。予測ニューラルネットワークは、ターゲットのメタ原子とその近傍の物理的仕様を入力として、その位相と振幅をミリ秒で計算する。この手法は, 相互結合による準曲面の性能劣化を説明するために適用可能であり, さらに最適化アルゴリズムと組み合わせて効率を最適化するためにも有効である。本手法の有効性を実証するため,従来の設計手法に比べてビーム偏向器とメタレンの効率が大幅に向上した。さらに, 準曲面の性能と相互結合による設計誤差の相関関係は, 特定の仕様(材料, 形状等)に拘束されないことを示す。そこで本手法は, 相互結合効果を探索し, 種々の中表面設計の性能を向上させるために, 容易に適用できることを想定する。 Metasurfaces have provided a novel and promising platform for the realization of compact and large-scale optical devices. The conventional metasurface design approach assumes periodic boundary conditions for each element, which is inaccurate in most cases since the near-field coupling effects between elements will change when surrounded by non-identical structures. In this paper, we propose a deep learning approach to predict the actual electromagnetic (EM) responses of each target meta-atom placed in a large array with near-field coupling effects taken into account. The predicting neural network takes the physical specifications of the target meta-atom and its neighbors as input, and calculates its phase and amplitude in milliseconds. This approach can be applied to explain metasurfaces' performance deterioration caused by mutual coupling and further used to optimize their efficiencies once combined with optimization algorithms. 
To demonstrate the efficacy of this methodology, we obtain large improvements in efficiency for a beam deflector and a metalens over the conventional design approach. Moreover, we show the correlations between a metasurface's performance and its design errors caused by mutual coupling are not bound to certain specifications (materials, shapes, etc.). As such, we envision that this approach can be readily applied to explore the mutual coupling effects and improve the performance of various metasurface designs.	翻訳日:2021-02-04 17:23:17 公開日:2021-02-02
# 拡張型ゲームにおけるStackelberg平衡の安全な探索 Safe Search for Stackelberg Equilibria in Extensive-Form Games ( http://arxiv.org/abs/2102.01775v1 ) ライセンス: Link先を確認	Chun Kai Ling, Noam Brown	(参考訳) Stackelberg平衡(Stackelberg equilibrium)は、リーダーがフォロワーに対してコミットメント権を持つ2プレイヤーゲームにおけるソリューションコンセプトである。近年では、空港のパトロールや野生動物の密猟防止など、多くのセキュリティアプリケーションの礎となっています。これらの設定の多くは本質的にシーケンシャルですが、既存のテクニックは事前にソリューション全体を計算します。本稿では,一般用ゲームにおける Stackelberg 平衡の計算に,オンライン計算を応用して解を改善する,理論的に健全かつ実証的に有効な探索手法を提案する。リーダーがゲーム全体を前もって解決しようとする代わりに、近似的な"青写真"ソリューションが最初にオフラインで計算され、実際のプレイで遭遇した特定のサブゲームのためにオンラインで改善される。提案手法は,事前計算したブループリント戦略に匹敵する性能が保証されていることを実証し,純粋にオフラインの手法に比べて大局的にゲームを解くことが可能であることを実証した。また,我々の検索操作はより小さなStackelberg問題としてキャストされる可能性を示し,戦略生成に基づく既存のアルゴリズムを補完する手法を提案する。 Stackelberg equilibrium is a solution concept in two-player games where the leader has commitment rights over the follower. In recent years, it has become a cornerstone of many security applications, including airport patrolling and wildlife poaching prevention. Even though many of these settings are sequential in nature, existing techniques pre-compute the entire solution ahead of time. In this paper, we present a theoretically sound and empirically effective way to apply search, which leverages extra online computation to improve a solution, to the computation of Stackelberg equilibria in general-sum games. Instead of the leader attempting to solve the full game upfront, an approximate "blueprint" solution is first computed offline and is then improved online for the particular subgames encountered in actual play. We prove that our search technique is guaranteed to perform no worse than the pre-computed blueprint strategy, and empirically demonstrate that it enables approximately solving significantly larger games compared to purely offline methods. We also show that our search operation may be cast as a smaller Stackelberg problem, making our method complementary to existing algorithms based on strategy generation.	翻訳日:2021-02-04 17:22:36 公開日:2021-02-02
# タイムウィンドウを用いたピックアップ・アンド・デリバリー問題における乗組員スケジューリングのメタヒューリスティック A metaheuristic for crew scheduling in a pickup-and-delivery problem with time windows ( http://arxiv.org/abs/2102.01780v1 ) ライセンス: Link先を確認	Mauro Lucci, Daniel Sever\'in, Paula Zabala	(参考訳) 車両のルーティングおよびクルースケジューリング問題(VRCSP)は、車両の艦隊のルートを計画し、車両とクルーの対応が時間内に固定されていない乗組員をスケジュールすることからなる。これにより、計画の柔軟性が向上し、艦隊の効率が向上するが、それに対して高い同期が要求される。本研究では,トラックやドライバーを用いて,時間窓によるピックアップ・アンド・デリバリ要求を計画の地平線上で満たさなければならないVRCSPを提案する。クルーは1人または2人のドライバーで構成され、それらのどれかは特定の場所のセットで緩和することができます。さらに、非商用シャトルの場所間での移動も可能で、追加費用は最小限に抑えられる。我々の問題はトラックとドライバーの異なる経路を考えるため、クルーが不可分なユニットとして扱われる文献にあるように、以前のvrcspでは考えられなかった柔軟性が増している。 2段階連続的なアプローチでこの問題に取り組みます:トラックルートのセットは第1段階で計算され、トラックルートと一致するドライバールートのセットは第2段階で取得されます。後者の段階におけるメタヒューリスティックベースアルゴリズムの性能を設計・評価する。提案手法は,新しい解の探索が困難になった場合の解の再使用を可能にする摂動手続きを主目的とする。この手順は、他の修理不可能ソリューションと一緒に、12-32台のトラック(計画地平線に応じて)を1時間未満で、15都市にまたがる100のリクエストのインスタンスで高品質のソリューションを見つけることができます。また,追加の運転士を乗せることによって,各乗務員に対して平均約60%の外部シャトルコストが削減され,場合によってはこのコストが完全に削減される可能性が示唆された。 A vehicle routing and crew scheduling problem (VRCSP) consists of simultaneously planning the routes of a fleet of vehicles and scheduling the crews, where the vehicle-crew correspondence is not fixed through time. This allows a greater planning flexibility and a more efficient use of the fleet, but in counterpart, a high synchronisation is demanded. In this work, we present a VRCSP where pickup-and-delivery requests with time windows have to be fulfilled over a given planning horizon by using trucks and drivers. Crews can be composed of 1 or 2 drivers and any of them can be relieved in a given set of locations. Moreover, they are allowed to travel among locations with non-company shuttles, at an additional cost that is minimised. As our problem considers distinct routes for trucks and drivers, we have an additional flexibility not contemplated in other previous VRCSP given in the literature where a crew is handled as an indivisible unit. 
We tackle this problem with a two-stage sequential approach: a set of truck routes is computed in the first stage, and a set of driver routes consistent with the truck routes is obtained in the second. We design and evaluate the performance of a metaheuristic-based algorithm for the latter stage. Our algorithm is mainly a GRASP with a perturbation procedure that allows reusing solutions already found in case the search for new solutions becomes difficult. This procedure, together with another that repairs infeasible solutions, allows us to find high-quality solutions on instances of 100 requests spread across 15 cities with a fleet of 12-32 trucks (depending on the planning horizon) in less than an hour. We also conclude that the possibility of carrying an additional driver decreases the cost of external shuttles by about 60% on average with respect to individual crews and, in some cases, removes this cost completely.	翻訳日:2021-02-04 17:21:56 公開日:2021-02-02
# コスタリカアクセントのスペイン語による人工的な子どもの声の生成 Generacion de voces artificiales infantiles en castellano con acento costarricense ( http://arxiv.org/abs/2102.01692v1 ) ライセンス: Link先を確認	Ana Lilia Alvarez-Blanco, Eugenia Cordoba-Warner, Marvin Coto-Jimenez, Vivian Fallas-Lopez, Maribel Morales Rodriguez	(参考訳) 本稿では,隠れマルコフモデルに基づく統計的パラメトリック音声合成の手法を用いて,コスタリカアクセントを用いた人工的な子どもの声生成の最初の経験を評価する。モデル学習に用いる音声サンプルを録音するプロセス、使用する技術の基礎、およびグループの認識を通じて結果の主観評価について説明します。その結果, 孤立した単語で評価した結果の明瞭さは, 参加する子どものグループの声よりも低いことがわかった。同様に、話す人の年齢と性別の検出は、自然な声の録音と比較して、人工音声に大きく影響されます。これらの結果から,新たなデータやプロセスによる今後の発展の数値的基準となるとともに,同じ手法で結果を改善するために,大量のデータを取得する必要性が示唆された。 This article evaluates a first experience of generating artificial children's voices with a Costa Rican accent, using the technique of statistical parametric speech synthesis based on Hidden Markov Models. The process of recording the voice samples used for learning the models, the fundamentals of the technique used, and the subjective evaluation of the results through the perception of a group of people are described. The results show that the intelligibility of the synthetic voices, evaluated on isolated words, is lower than that of the voices recorded by the group of participating children. Similarly, the detection of the age and gender of the speaker is significantly affected in artificial voices, relative to recordings of natural voices. These results show the need to obtain larger amounts of data, and they also provide a numerical reference for future developments resulting from new data or from processes that improve results with the same technique.	翻訳日:2021-02-04 17:19:38 公開日:2021-02-02
# Apollo:Transferable Architecture Exploration Apollo: Transferable Architecture Exploration ( http://arxiv.org/abs/2102.01723v1 ) ライセンス: Link先を確認	Amir Yazdanbakhsh, Christof Angermueller, Berkin Akin, Yanqi Zhou, Albin Jones, Milad Hashemi, Kevin Swersky, Satrajit Chatterjee, Ravi Narayanaswami, James Laudon	(参考訳) ムーアの法則の破滅とディープラーニングの利用の上昇は、特定のニューラルアーキテクチャに最適化されたカスタムアクセラレータの設計を促進する。このような加速器のアーキテクチャ探索は、目的関数を評価するのにコストがかかる複雑で高次元で構造化された入力空間上の制約付き最適化問題を引き起こす。既存のアクセラレータ設計のアプローチはサンプル非効率であり、エリアや遅延予算、ニューラルネットワークの構成など、異なる設計制約を持つ関連する最適化タスク間で知識を伝達しない。本研究では, ブラックボックス関数最適化の最近の進歩を活用して, サンプル効率の高い加速器設計のためのトランスファー可能なアーキテクチャ探索フレームワークApolloを提案する。このフレームワークを使用して、代替設計制約のあるさまざまなニューラルネットワークのアクセラレータ構成を最適化する。我々のフレームワークは,ベースラインのブラックボックス最適化手法よりも試料効率が高い(最大24.6%のスピードアップ)。さらに、異なる設計制約を持つターゲットアーキテクチャ間で知識を転送することで、apolloは最適な構成を素早く、しばしばより客観的な価値(最大25%の改善)で見つけることができることを示した。この奨励的な成果は、高品質のアクセラレータの生成を促進するための有望な道筋を示しています。 The looming end of Moore's Law and ascending use of deep learning drives the design of custom accelerators that are optimized for specific neural architectures. Architecture exploration for such accelerators forms a challenging constrained optimization problem over a complex, high-dimensional, and structured input space with a costly to evaluate objective function. Existing approaches for accelerator design are sample-inefficient and do not transfer knowledge between related optimizations tasks with different design constraints, such as area and/or latency budget, or neural architecture configurations. In this work, we propose a transferable architecture exploration framework, dubbed Apollo, that leverages recent advances in black-box function optimization for sample-efficient accelerator design. We use this framework to optimize accelerator configurations of a diverse set of neural architectures with alternative design constraints. 
We show that our framework finds high reward design configurations (up to 24.6% speedup) more sample-efficiently than a baseline black-box optimization approach. We further show that by transferring knowledge between target architectures with different design constraints, Apollo is able to find optimal configurations faster and often with better objective value (up to 25% improvements). This encouraging outcome portrays a promising path forward to facilitate generating higher quality accelerators.	翻訳日:2021-02-04 17:14:06 公開日:2021-02-02
# 心電図信号と心臓音の同期を用いた新しいトランスファー学習型心疾患スクリーニング法 A Novel Transfer Learning-Based Approach for Screening Pre-existing Heart Diseases Using Synchronized ECG Signals and Heart Sounds ( http://arxiv.org/abs/2102.01728v1 ) ライセンス: Link先を確認	Ramith Hettiarachchi, Udith Haputhanthri, Kithmini Herath, Hasindu Kariyawasam, Shehan Munasinghe, Kithmin Wickramasinghe, Duminda Samarasinghe, Anjula De Silva and Chamira Edussooriya	(参考訳) 既往心疾患の診断は、肺高血圧症、心臓リズム障害、血栓症、心不全、突然の心停止などの合併症の予防に役立つため重要である。このような疾患を識別するために、心電図(PCG)および心電図(ECG)波形は重要な情報を伝達する。したがって、これらの2種類のデータの有効利用は、疾患スクリーニングプロセスを改善する可能性を秘めている。本稿では,PCGとECGを同時取得したPhystoNet Challenge 2016 Datasetのサブセット上で,この仮説を評価する。我々の新しいDual-Convolutional Neural Networkベースのアプローチは、トランスファーラーニングを使用して、大規模なデータセットに適応する可能性を秘めつつ、公開可能なPCGとECGの同時データを限られた量保持する問題に対処する。また、記録的評価とサンプル的評価という2つの主要な評価フレームワークを導入し、トランスファーラーニングアプローチの豊富なパフォーマンス評価につながります。単一・二重モードデータを用いた手法との比較により,本手法が性能向上につながることが示された。さらに,各々収集されたECG波形やPCG波形は,同期PCG波形やECG波形の限られた数を有効活用し,未だに有意な分類性能を達成できるトランスファー可能な機能を提供することができた。 Diagnosing pre-existing heart diseases early in life is important as it helps prevent complications such as pulmonary hypertension, heart rhythm problems, blood clots, heart failure and sudden cardiac arrest. To identify such diseases, phonocardiogram (PCG) and electrocardiogram (ECG) waveforms convey important information. Therefore, effectively using these two modalities of data has the potential to improve the disease screening process. Here, we evaluate this hypothesis on a subset of the PhysioNet Challenge 2016 Dataset which contains simultaneously acquired PCG and ECG recordings. Our novel Dual-Convolutional Neural Network based approach uses transfer learning to tackle the problem of having limited amounts of simultaneous PCG and ECG data that is publicly available, while having the potential to adapt to larger datasets. 
In addition, we introduce two main evaluation frameworks, record-wise and sample-wise evaluation, which lead to a rich performance evaluation of the transfer learning approach. Comparisons with methods that use single- or dual-modality data show that our method can lead to better performance. Furthermore, our results show that individually collected ECG or PCG waveforms provide transferable features that help make effective use of a limited number of synchronized PCG and ECG waveforms while still achieving significant classification performance.	翻訳日:2021-02-04 17:13:26 公開日:2021-02-02
# FedProf: 動的データプロファイリングによるフェデレーション学習の最適化 FedProf: Optimizing Federated Learning with Dynamic Data Profiling ( http://arxiv.org/abs/2102.01733v1 ) ライセンス: Link先を確認	Wentai Wu, Ligang He, Weiwei Lin, Rui Mao, Chenlin Huang and Wei Song	(参考訳) フェデレートラーニング(FL)は、エンドデバイス(すなわちクライアント)上でのみローカルにアクセス可能な分散データから学ぶためのプライバシー保護ソリューションとして大きな可能性を示している。しかし、多くのシナリオでは、クライアントの大部分が、バイアス、ノイズ、あるいは無関係な低品質のデータのみを保有している。その結果、我々はFLの過程でその収束を構築し、遅らせることを目的としたグローバルモデルの品質を大幅に低下させる可能性があります。そこで本稿では,データプライバシを侵害することなくFLを最適化する手法を提案する。このアプローチの鍵となるのは、各クライアントとサーバのモデルデータフットプリントを生成する動的データプロファイリング手法です。フットプリントは、モデルの最初の完全接続層(fc-1)の出力分布に基づいて対応するデータパーティション上のグローバルモデルの表現を符号化する。クライアントとサーバのフットプリントを一致させることで、各flラウンドに参加する各クライアントの機会を適応的に調整し、クライアントの低品質データへの影響を軽減します。我々は,様々な fl 設定を用いた公開データセットの広範な実験を行った。その結果,グローバルモデルが収束するために必要なラウンド数(最大75\%)と全体時間(最大68\%)を大幅に削減するとともに,グローバルモデルの精度を最大2.5\%向上させることができた。 Federated Learning (FL) has shown great potential as a privacy-preserving solution to learning from decentralized data which are only accessible locally on end devices (i.e., clients). In many scenarios, however, a large proportion of the clients are probably in possession of only low-quality data that are biased, noisy or even irrelevant. As a result, they could significantly degrade the quality of the global model we aim to build and slow down its convergence in the course of FL. In light of this, we propose a novel approach to optimizing FL under such circumstances without breaching data privacy. The key of our approach is a dynamic data profiling method for generating model-data footprints on each client and the server. The footprint encodes the representation of the global model on the corresponding data partition based on the output distribution of the model's first fully-connected layer (FC-1). By matching the footprints from clients and the server, we adaptively adjust each client's opportunity of participation in each FL round to mitigate the impact from the clients with low-quality data. 
We have conducted extensive experiments on public data sets using various FL settings. Results show that our method significantly reduces the number of rounds (by up to 75\%) and overall time (by up to 68\%) required to have the global model converge while increasing the global model's accuracy by up to 2.5\%.	翻訳日:2021-02-04 17:12:42 公開日:2021-02-02
# 条件にまたがるロバストな性能を持つ話者照合バックエンド A Speaker Verification Backend with Robust Performance across Conditions ( http://arxiv.org/abs/2102.01760v1 ) ライセンス: Link先を確認	Luciana Ferrer, Mitchell McLaren, Niko Brummer	(参考訳) 本稿では,開発中の未知・未知の状況における話者検証の問題について述べる。話者照合の標準的な方法は、ディープニューラルネットワークを用いて話者埋め込みを抽出し、確率線形判別分析(plda)とグローバルロジスティック回帰スコア校正からなるバックエンドで処理することである。この方法は、キャリブレーションモデルのトレーニングに使用されるものと異なる条件でうまく動作しないシステムをもたらすことが知られている。入力条件に適応するために、持続時間などの自動抽出側情報を用いた適応キャリブレータを導入し、標準バックエンドの修正を提案します。バックエンドはバイナリのクロスエントロピーを最適化するために差別的に訓練される。話者に対してのみラベル付けされた多数の多様なデータセットでトレーニングされた場合、提案されているバックエンドは一貫して、場合によっては標準のpldaアプローチと比較して、いくつかの保持されたデータセットでキャリブレーションを劇的に改善する。差別性能も一貫して向上します。 PLDAと適応キャリブレータの併用訓練は必須であり,PLDAの凍結やキャリブレータの微調整では同様の効果が得られない。私たちの知る限り、本論文の結果は、さまざまな条件下で安定したアウトオブボックスのパフォーマンスを持つスピーカー検証システムを開発することができるという文献の最初の証拠です。 In this paper, we address the problem of speaker verification in conditions unseen or unknown during development. A standard method for speaker verification consists of extracting speaker embeddings with a deep neural network and processing them through a backend composed of probabilistic linear discriminant analysis (PLDA) and global logistic regression score calibration. This method is known to result in systems that work poorly on conditions different from those used to train the calibration model. We propose to modify the standard backend, introducing an adaptive calibrator that uses duration and other automatically extracted side-information to adapt to the conditions of the inputs. The backend is trained discriminatively to optimize binary cross-entropy. When trained on a number of diverse datasets that are labeled only with respect to speaker, the proposed backend consistently and, in some cases, dramatically improves calibration, compared to the standard PLDA approach, on a number of held-out datasets, some of which are markedly different from the training data. Discrimination performance is also consistently improved. 
We show that joint training of the PLDA and the adaptive calibrator is essential -- the same benefits cannot be achieved when freezing PLDA and fine-tuning the calibrator. To our knowledge, the results in this paper are the first evidence in the literature that it is possible to develop a speaker verification system with robust out-of-the-box performance on a large variety of conditions.	翻訳日:2021-02-04 17:12:00 公開日:2021-02-02
# re-diffusion glioma growth modelの初期条件評価:translational mri/histology (in)validation study Initial condition assessment for reaction-diffusion glioma growth models: A translational MRI/histology (in)validation study ( http://arxiv.org/abs/2102.01719v1 ) ライセンス: Link先を確認	Corentin Martens, Laetitia Lebrun, Christine Decaestecker, Thomas Vandamme, Yves-R\'emi Van Eycke, Antonin Rovai, Thierry Metens, Olivier Debeir, Serge Goldman, Isabelle Salmon, Gaetan Van Simaeys	(参考訳) 拡散性グリオーマは高浸潤性腫瘍であり、早期診断とフォローアップは通常磁気共鳴画像(MRI)に依存する。しかし、この技術の感度が限られているため、グリオーマ細胞浸潤の程度を直接評価することは不可能であり、最適以下の治療計画につながる。反応拡散成長モデルは、MRIで見るマージンを超えてグリオーマ細胞の浸潤を外挿し、その時空間的進化を予測するために何十年も提案されてきた。これらのモデルは、診断時に脳のあらゆる位置における腫瘍細胞密度値である初期状態を必要とする。腫瘍細胞密度関数とMRIで見られる異常のアウトラインを関連付ける研究がいくつか提案されているが、基礎となる仮定は確認されていない。本研究では,3Dプリンティングスライサーを用いたグリオ芽腫を有する非手術脳の立体的組織学的解析により,これらの仮定を検証することを提案する。細胞密度マップは、深層学習アプローチを用いて、組織学的スライドから計算される。次に、密度マップは、死後MR画像に登録され、腫瘍コアへのMR由来測地距離マップと関連付けられる。 T2 FLAIR MRIで見られる浮腫アウトラインとコアの距離との関係についても検討した。以上の結果から, (i) 腫瘍コアまでの距離で腫瘍細胞密度が指数関数的に減少することは理にかなわないが, (ii) 浮腫アウトラインは一般に細胞密度 iso-contour と一致せず, (iii) これらのアウトラインで一般的に採用されている腫瘍細胞密度値が過大評価される可能性が示唆された。これらの知見は、従来のmriを用いたグリオーマ細胞密度マップの導出の限界を浮き彫りにして、他の方法による反応拡散成長モデルの初期化と臨床応用の必要性を指摘している。 Diffuse gliomas are highly infiltrative tumors whose early diagnosis and follow-up usually rely on magnetic resonance imaging (MRI). However, the limited sensitivity of this technique makes it impossible to directly assess the extent of the glioma cell invasion, leading to sub-optimal treatment planning. Reaction-diffusion growth models have been proposed for decades to extrapolate glioma cell infiltration beyond margins visible on MRI and predict its spatial-temporal evolution. These models nevertheless require an initial condition, that is the tumor cell density values at every location of the brain at diagnosis time. 
Several works have proposed to relate the tumor cell density function to abnormality outlines visible on MRI but the underlying assumptions have never been verified so far. In this work we propose to verify these assumptions by stereotactic histological analysis of a non-operated brain with glioblastoma using a tailored 3D-printed slicer. Cell density maps are computed from histological slides using a deep learning approach. The density maps are then registered to a postmortem MR image and related to an MR-derived geodesic distance map to the tumor core. The relation between the edema outlines visible on T2 FLAIR MRI and the distance to the core is also investigated. Our results suggest that (i) the previously suggested exponential decrease of the tumor cell density with the distance to the tumor core is not unreasonable but (ii) the edema outlines may in general not correspond to a cell density iso-contour and (iii) the commonly adopted tumor cell density value at these outlines is likely overestimated. These findings highlight the limitations of using conventional MRI to derive glioma cell density maps and point out the need of validating other methods to initialize reaction-diffusion growth models and make them usable in clinical practice.	翻訳日:2021-02-04 17:03:38 公開日:2021-02-02
# 質問プールウェブサイトにおける学生エンゲージメント・ムードのドロップアウト予測 Characterizing Student Engagement Moods for Dropout Prediction in Question Pool Websites ( http://arxiv.org/abs/2102.00423v2 ) ライセンス: Link先を確認	Reza Hadi Mogavi, Xiaojuan Ma, Pan Hui	(参考訳) 問題ベース学習(英語: problem-based learning, pbl)は、問題解決によるハンズオントレーニングを支援する、一般的な指導手法である。 LeetCode、Code Chef、Math Playgroundといった質問プールのウェブサイト(QP)は、学生に本物で多様な、文脈に応じた質問を提供することでPBLを支援する。いずれにせよ、QPに登録されている学生の40%から80%は2ヶ月以内に退学している。本研究は,学生の参加感情を活用し,qpsからの学生の退学を理解・予測する最初の試みである。データ駆動型アプローチを採用することで、QP学生にとって5つの異なるエンゲージメント・ムード、すなわちチャレンジ・シーカー、主題シーカー、興味シーカー、喜びシーカー、非シーカーを識別する。学生は、各エンゲージメントのムードで質問に答える集団的な選好を持ち、その選好からの逸脱は、退学する確率を著しく高めている。最後に、この論文はQPの学生のドロップアウトを予測するための新しいハイブリッド機械学習モデル(我々はDropout-Plusと呼ぶ)を導入することで貢献します。テストの結果、中国で人気のqpで1万人近い学生がおり、dropout-plusは、精度、f1-measure、aucの点でライバルアルゴリズムのドロップアウト予測性能を上回っている。学生のドロップアウトを減らすために、QPマネージャーやオンライン学習の専門家にデザイン提案を行うことで、作業をまとめています。 Problem-Based Learning (PBL) is a popular approach to instruction that supports students to get hands-on training by solving problems. Question Pool websites (QPs) such as LeetCode, Code Chef, and Math Playground help PBL by supplying authentic, diverse, and contextualized questions to students. Nonetheless, empirical findings suggest that 40% to 80% of students registered in QPs drop out in less than two months. This research is the first attempt to understand and predict student dropouts from QPs via exploiting students' engagement moods. Adopting a data-driven approach, we identify five different engagement moods for QP students, which are namely challenge-seeker, subject-seeker, interest-seeker, joy-seeker, and non-seeker. We find that students have collective preferences for answering questions in each engagement mood, and deviation from those preferences increases their probability of dropping out significantly. Last but not least, this paper contributes by introducing a new hybrid machine learning model (we call Dropout-Plus) for predicting student dropouts in QPs. 
The test results on a popular QP in China, with nearly 10K students, show that Dropout-Plus can exceed the rival algorithms' dropout prediction performance in terms of accuracy, F1-measure, and AUC. We wrap up our work by giving some design suggestions to QP managers and online learning professionals to reduce their student dropouts.	翻訳日:2021-02-04 12:49:52 公開日:2021-02-02
# 化学空間探索のためのディープニューラルネットワークを用いた遺伝的アルゴリズムの再現性に関する研究 A reproducibility study of "Augmenting Genetic Algorithms with Deep Neural Networks for Exploring the Chemical Space" ( http://arxiv.org/abs/2102.00700v2 ) ライセンス: Link先を確認	Kevin Maik Jablonka, Fergus Mcilwaine, Susana Garcia, Berend Smit, Brian Yoo	(参考訳) Nigamら。 SELFIES表現を利用した遺伝的アルゴリズム(GA)を報告し、生成された分子の多様性を改善するための適応型ニューラルネットワークベースのペナルティを提案する。この論文の主な主張は、このGAは他の生成技術(罰則化されたlogPによって測定される)を上回っ、ニューラルネットワークベースの適応ペナルティが生成された分子の多様性を増加させることである。本研究では,それらの主張の再現性を検討した。全体としては、SELFIESベースのGAを用いて同等の結果を再現することができたが、ほとんどは(容易に最適化可能な)フィットネス機能の欠如(すなわち、長い硫黄を含む鎖を生成する)を利用していた。また, 判別器を用いて, 分子の発生を基準セットと類似するものに偏見を与えることができることを示す結果も再現した。さらに,多様性の進化を定量化し,いくつかのハイパーパラメータの影響を理解し,適応的ペナルティの改善を提案する。 Nigam et al. reported a genetic algorithm (GA) utilizing the SELFIES representation and also propose an adaptive, neural network-based penalty that is supposed to improve the diversity of the generated molecules. The main claims of the paper are that this GA outperforms other generative techniques (as measured by the penalized logP) and that a neural network-based adaptive penalty increases the diversity of the generated molecules. In this work, we investigated the reproducibility of their claims. Overall, we were able to reproduce comparable results using the SELFIES-based GA, but mostly by exploiting deficiencies of the (easily optimizable) fitness function (i.e., generating long, sulfur containing chains). In addition, we also reproduce results showing that the discriminator can be used to bias the generation of molecules to ones that are similar to the reference set. Moreover, we also attempted to quantify the evolution of the diversity, understand the influence of some hyperparameters, and propose improvements to the adaptive penalty.	翻訳日:2021-02-04 12:49:03 公開日:2021-02-02
# (参考訳) スマートグリッドデバイスの確率的ブールネットワークモデルによる強化学習 Reinforcement Learning with Probabilistic Boolean Network Models of Smart Grid Devices ( http://arxiv.org/abs/2102.01297v1 ) ライセンス: CC BY 4.0	Pedro J. Rivera Torres, Carlos Gershenson Garc\'ia, Samir Kanaan Izquierdo	(参考訳) スマートパワーグリッドの領域は、常に効率と回復力を向上させ、高品質な電力を保護し、抵抗グリッドで、障害を管理し、障害を回避する必要がある。これを実現するには、高いコンポーネント信頼性、適切なメンテナンス、および研究された障害発生が必要です。正しいシステム操作には、これらのアクティビティと、障害や障害を検出し、分類し、分離するための新しい方法論、予測アルゴリズムと分析(データ分析と資産条件を使用してアクティビティを計画および実行)によるプロセスをモデル化およびシミュレートする。本稿では,複雑な適応型自己組織型モデリング手法であるProbabilistic Boolean Networks (PBN) の応用を,スマートグリッドデバイスのダイナミクスの理解と,その動作のモデル化と特性評価の方法として紹介する。この研究は、PBNが標準的な強化学習サイクルと同等であることを示しています。エージェント/モデルは環境と相互作用し、報酬信号の形でフィードバックを受け取ります。好みの行動を特徴付けるために、異なる報酬構造が作成されました。この情報は、故障状況や故障を避けるためにPBNを導くために使用できます。 The area of Smart Power Grids needs to constantly improve its efficiency and resilience, to provide high quality electrical power, in a resistant grid, managing faults and avoiding failures. Achieving this requires high component reliability, adequate maintenance, and a studied failure occurrence. Correct system operation involves those activities, and novel methodologies to detect, classify, and isolate faults and failures, model and simulate processes with predictive algorithms and analytics (using data analysis and asset condition to plan and perform activities). We showcase the application of a complex-adaptive, self-organizing modeling method, Probabilistic Boolean Networks (PBN), as a way towards the understanding of the dynamics of smart grid devices, and to model and characterize their behavior. This work demonstrates that PBNs are equivalent to the standard Reinforcement Learning Cycle, in which the agent/model has an interaction with its environment and receives feedback from it in the form of a reward signal. Different reward structures were created in order to characterize preferred behavior. This information can be used to guide the PBN to avoid fault conditions and failures.	
翻訳日:2021-02-04 12:25:12 公開日:2021-02-02
# (参考訳) 未知の出現点と不出現点を持つ信号の時間内最適逐次検出 Optimal Sequential Detection of Signals with Unknown Appearance and Disappearance Points in Time ( http://arxiv.org/abs/2102.01310v1 ) ライセンス: CC BY 4.0	Alexander G. Tartakovsky, Nikita R. Berenkov, Alexei E. Kolessa, and Igor V. Nikiforov	(参考訳) 本論文は,変化の持続時間が有限かつ未知であると仮定して,逐次的な変化点検出問題に対処する。この問題は、信号や画像処理など、時間や空間の未知の点に信号が現れて消滅する多くのアプリケーションにとって重要である。与えられた平均走行距離の検出までの遅延を誤報に最小化する必要がある最短変化検出における従来の最適度基準とは対照的に、所定のウィンドウにおける誤報の局所最大確率に対する所定の時間(または空間)ウィンドウにおける検出の最小確率を最大化する信頼性の高い最大変化検出基準に焦点を当てる。最適な検出手順は、変更されたCUSUM手順であることを示します。次に、この最適手順の動作特性と、FMA(Finite moving Average)検出アルゴリズムとモンテカルロシミュレーションを用いた通常のCUSUM手順とを比較し、通常、後者のアルゴリズムは最適手法とほぼ同等の性能を持つことを示す。同時に、FMA手順には、通常不明な信号の強度への依存という大きな利点があります。最後に、FMAアルゴリズムを用いて光学画像中の衛星のかすかなストリークを検出する。 The paper addresses a sequential changepoint detection problem, assuming that the duration of change may be finite and unknown. This problem is of importance for many applications, e.g., for signal and image processing where signals appear and disappear at unknown points in time or space. In contrast to the conventional optimality criterion in quickest change detection that requires minimization of the expected delay to detection for a given average run length to a false alarm, we focus on a reliable maximin change detection criterion of maximizing the minimal probability of detection in a given time (or space) window for a given local maximal probability of false alarm in the prescribed window. We show that the optimal detection procedure is a modified CUSUM procedure. We then use Monte Carlo simulations to compare the operating characteristics of this optimal procedure with those of the Finite Moving Average (FMA) detection algorithm, which is popular in engineering, and of the ordinary CUSUM procedure; the simulations show that the latter two algorithms typically have almost the same performance as the optimal one. At the same time, the FMA procedure has a substantial advantage: it does not depend on the intensity of the signal, which is usually unknown. 
Finally, the FMA algorithm is applied to detecting faint streaks of satellites in optical images.	翻訳日:2021-02-04 12:07:24 公開日:2021-02-02
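The two baseline detectors named in the abstract above can be illustrated with a minimal sketch. It assumes a one-dimensional observation sequence and an upward shift in the mean; the function names `fma_alarm` and `cusum_alarm` and the parameters `window`, `drift`, and `threshold` are illustrative choices, not taken from the paper. The CUSUM shown is the ordinary recursion, not the modified procedure that the authors prove optimal.

```python
from typing import Optional, Sequence


def fma_alarm(xs: Sequence[float], window: int, threshold: float) -> Optional[int]:
    """Finite Moving Average rule: alarm at the first index n where the sum
    of the last `window` observations reaches `threshold`."""
    running = 0.0
    for n, x in enumerate(xs):
        running += x
        if n >= window:
            running -= xs[n - window]  # drop the sample that left the window
        if n >= window - 1 and running >= threshold:
            return n
    return None


def cusum_alarm(xs: Sequence[float], drift: float, threshold: float) -> Optional[int]:
    """Ordinary one-sided CUSUM: W_n = max(0, W_{n-1} + x_n - drift),
    alarm at the first index n where W_n reaches `threshold`."""
    w = 0.0
    for n, x in enumerate(xs):
        w = max(0.0, w + x - drift)
        if w >= threshold:
            return n
    return None


if __name__ == "__main__":
    # A signal of finite duration: it appears at index 10 and disappears at 15.
    signal = [0.0] * 10 + [1.0] * 5 + [0.0] * 10
    print(fma_alarm(signal, window=3, threshold=3.0))    # 12
    print(cusum_alarm(signal, drift=0.5, threshold=1.5)) # 12
```

In this sketch `fma_alarm` needs only a window length and a threshold, whereas `cusum_alarm` needs a `drift` term that is usually tuned to an assumed signal strength. That difference is one way to read the advantage the abstract attributes to FMA.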
# (参考訳) MPCを用いた安定制約マルコフ決定過程 Stability-Constrained Markov Decision Processes Using MPC ( http://arxiv.org/abs/2102.01383v1 ) ライセンス: CC BY-SA 4.0	Mario Zanon, S\'ebastien Gros, Michele Palladino	(参考訳) 本稿では,結果として生じる政策が安定化しているという制約の下で,割引マルコフ決定プロセス(MDP)の解決を検討する。実際には、MPPは何らかの政策近似に基づいて解決される。我々は、モデル予測制御(MPC)を強化学習の文脈における構造化ポリシーとして活用することを提案する最近の結果を活用し、MPCベースのポリシー内での安定性要件を直接導入できるようにする。これは、建設による政策の安定化にMDPのソリューションを制限します。 MPCの安定性理論は、比類のないMPCの場合で最も成熟している。したがって、我々はまず、安定した割引MDPを無数に再フォーマットできることを本論文で示します。この観察は、安定要件のあるMPCベースの政策が、安定であれば、割引されたMDPの最適政策と、そうでなければ最良の安定化政策を生み出すことを要求する。 In this paper, we consider solving discounted Markov Decision Processes (MDPs) under the constraint that the resulting policy is stabilizing. In practice MDPs are solved based on some form of policy approximation. We will leverage recent results proposing to use Model Predictive Control (MPC) as a structured policy in the context of Reinforcement Learning to make it possible to introduce stability requirements directly inside the MPC-based policy. This will restrict the solution of the MDP to stabilizing policies by construction. The stability theory for MPC is most mature for the undiscounted MPC case. Hence, we will first show in this paper that stable discounted MDPs can be reformulated as undiscounted ones. This observation will entail that the MPC-based policy with stability requirements will produce the optimal policy for the discounted MDP if it is stable, and the best stabilizing policy otherwise.	翻訳日:2021-02-04 11:48:53 公開日:2021-02-02
# (参考訳) 地球磁場モデリングと球状高調波分解による予測 Global Earth Magnetic Field Modeling and Forecasting with Spherical Harmonics Decomposition ( http://arxiv.org/abs/2102.01447v1 ) ライセンス: CC BY 4.0	Panagiotis Tigas and T\'eo Bloch and Vishal Upendran and Banafsheh Ferdoushi and Mark C. M. Cheung and Siddha Ganju and Ryan M. McGranaghan and Yarin Gal and Asti Bhatt	(参考訳) 太陽風による大域磁場の摂動のモデル化と予測は、オープンな課題である。現在のアプローチは、MHD(Magneticohydrodynamics)モデルのような計算に要求されるモデルのシミュレーションや、スパース基底局(SuperMAG)を通して空間的および時間的にサンプリングに依存する。本稿では、Spherical Harmonicsスペース2で予測するディープラーニングモデルを開発し、MHDモデルへの依存を置き換え、1分間のケイデンスでグローバルカバレッジを提供し、機能工学に依存する現在の最新技術を改善する。超磁気データセット(14.53%改善)とmhdシミュレーション(24.35%改善)の性能評価を行った。さらに,sparse ground-based station (supermag) に基づく球面高調波再構成の補間性能を評価し,mhdシミュレーションにより球面高調波が確実に大域磁場を再構成できることを示した。 Modeling and forecasting the solar wind-driven global magnetic field perturbations is an open challenge. Current approaches depend on simulations of computationally demanding models like the Magnetohydrodynamics (MHD) model or sampling spatially and temporally through sparse ground-based stations (SuperMAG). In this paper, we develop a Deep Learning model that forecasts in Spherical Harmonics space, replacing reliance on MHD models and providing global coverage at one-minute cadence, improving over the current state-of-the-art which relies on feature engineering. We evaluate the performance on the SuperMAG dataset (improved by 14.53%) and on MHD simulations (improved by 24.35%). Additionally, we evaluate the extrapolation performance of the spherical harmonics reconstruction based on sparse ground-based stations (SuperMAG), showing that spherical harmonics can reliably reconstruct the global magnetic field as evaluated on MHD simulation.	翻訳日:2021-02-04 11:27:39 公開日:2021-02-02
# (参考訳) 記憶に強い適応型OCO Strongly Adaptive OCO with Memory ( http://arxiv.org/abs/2102.01623v1 ) ライセンス: CC BY 4.0	Zhiyu Zhang, Ashok Cutkosky, Ioannis Ch. Paschalidis	(参考訳) オンライン制御の最近の進歩は、予測履歴に依存する損失関数を持つ標準オンライン学習問題の変種であるメモリによるオンライン学習を普及させました。本稿では,この問題に対する最初の強適応アルゴリズムを提案する。任意の区間$\mathcal{i}\subset[1:t]$において,提案アルゴリズムは,その区間における最善の固定コンパレータに対して$\tilde o\left(\sqrt{\|\mathcal{i}\|}\right)$ポリシー後悔を達成する。オンライン制御技術と組み合わせ、アルゴリズムは線形時間変位システムの制御に縛られる強い適応的な後悔をもたらします。 Recent progress in online control has popularized online learning with memory, a variant of the standard online learning problem with loss functions dependent on the prediction history. In this paper, we propose the first strongly adaptive algorithm for this problem: on any interval $\mathcal{I}\subset[1:T]$, the proposed algorithm achieves $\tilde O\left(\sqrt{\|\mathcal{I}\|}\right)$ policy regret against the best fixed comparator for that interval. Combined with online control techniques, our algorithm results in a strongly adaptive regret bound for the control of linear time-varying systems.	翻訳日:2021-02-04 11:17:15 公開日:2021-02-02
# ユビキタスエッジaiのためのtinyml TinyML for Ubiquitous Edge AI ( http://arxiv.org/abs/2102.01255v1 ) ライセンス: Link先を確認	Stanislava Soro	(参考訳) TinyMLは、機械学習、ハードウェア、ソフトウェアの交差点で急速に成長する多分野分野であり、極低電力範囲(mW範囲以下)で動作する組み込み(マイクロコントローラ駆動)デバイスでディープラーニングアルゴリズムを有効にすることに焦点を当てている。 tinymlは、電力効率が高く、コンパクトなディープニューラルネットワークモデルの設計、ソフトウェアフレームワークのサポート、さまざまなカスタマイズされたユビキタスな推論アプリケーションをバッテリ操作されたリソースに制約されたデバイス上で実行可能にする組み込みハードウェアの課題に対処する。本報告では,この分野の拡大を導く主要な課題と技術的実現要因について論じる。 TinyMLは、クラウド処理に依存しないが、分散エッジ推論と自律的な推論で繁栄する、新しいタイプのエッジサービスやアプリケーションへの扉を開く。 TinyML is a fast-growing multidisciplinary field at the intersection of machine learning, hardware, and software, that focuses on enabling deep learning algorithms on embedded (microcontroller powered) devices operating at extremely low power range (mW range and below). TinyML addresses the challenges in designing power-efficient, compact deep neural network models, supporting software framework, and embedded hardware that will enable a wide range of customized, ubiquitous inference applications on battery-operated, resource-constrained devices. In this report, we discuss the major challenges and technological enablers that direct this field's expansion. TinyML will open the door to the new types of edge services and applications that do not rely on cloud processing but thrive on distributed edge inference and autonomous reasoning.	翻訳日:2021-02-04 10:17:00 公開日:2021-02-02
# グラフィカルモデルによるドリフト推定 Drift Estimation with Graphical Models ( http://arxiv.org/abs/2102.01458v1 ) ライセンス: Link先を確認	Luigi Riso and Marco Guerzoni	(参考訳) 本稿では,教師付き機械学習における概念ドリフトの問題を扱う。私たちはグラフィカルモデルを使ってデータの可視構造を解明し、隠れたコンテキストの変化から推測します。従来のコンセプトドリフト検出方法とは異なり、このアプリケーションは特定のターゲット変数で使用される教師付き機械学習モデルに依存しないが、データセットの進化の独立した特性として概念ドリフトを評価しようとする。具体的には、新しいリンクの作成と、異なる期間に既存のリンクの消失を見て、グラフィカルモデルがどのように進化するかを調べる。本稿は,変化を強調し,最終的に時間とともに安定性を評価する指標を提示する手法を提案する。本研究は,オーストラリア電力市場における実世界データを用いた評価手法である。 This paper deals with the issue of concept drift in supervised machine learning. We make use of graphical models to elicit the visible structure of the data, and from it we infer changes in the hidden context. Unlike previous concept-drift detection methods, this application does not depend on the supervised machine learning model in use for a specific target variable, but it tries to assess the concept drift as an independent characteristic of the evolution of a dataset. Specifically, we investigate how a graphical model evolves by looking at the creation of new links and the disappearance of existing ones in different time periods. The paper suggests a method that highlights the changes and ultimately produces a metric to evaluate the stability over time. The paper evaluates the method with real-world data on the Australian electricity market.	翻訳日:2021-02-04 10:16:26 公開日:2021-02-02
# super-klust: 区分線形分類の別の方法 Super-klust: Another Way of Piecewise Linear Classification ( http://arxiv.org/abs/2102.01571v1 ) ライセンス: Link先を確認	Rahman Salim Zengin (1), Volkan Sezer (1) ((1) Istanbul Technical University)	(参考訳) これまでの研究であるSuper-kアルゴリズムでは,新しい一方向線形分類法が導入された。 super-kアルゴリズムに取り組んでいる間に、voronoi tessellation に基づいた分割線形分類器を得るための、同様の、より単純な方法があることが判明した。アルゴリズムの多次元ボクセル化と期待最大化の段階を距離ベースのクラスタリングアルゴリズム(好ましくはk平均)に置き換えることは、以前のアプローチと同様に機能する。ボキセル化をクラスタリングに置き換えているので、Supervised k Clusters や short Super-klust として、Super-k に関して修正アルゴリズムを名付けることに意義があることがわかりました。 Super-kアルゴリズムと同様に、Super-klustアルゴリズムはVoronoi Tessellationというラベル付きデータをカバーし、その結果を分類するためにtessellationを使用する。実験結果によると、super-klustアルゴリズムはsuper-kアルゴリズムと同様の性能特性を持つ。 In our previous study, the Super-k algorithm, we introduced a novel way of piecewise-linear classification. While working on the Super-k algorithm, we found that there is a similar and simpler way of obtaining a piecewise-linear classifier based on Voronoi tessellations. Replacing the multidimensional voxelization and expectation-maximization stages of the algorithm with a distance-based clustering algorithm, preferably k-means, works as well as the prior approach. Since we are replacing the voxelization with the clustering, we have found it meaningful to name the modified algorithm, with respect to Super-k, Supervised k Clusters, or Super-klust for short. Similar to the Super-k algorithm, the Super-klust algorithm covers data with a labeled Voronoi tessellation and uses the resulting tessellation for classification. According to the experimental results, the Super-klust algorithm has performance characteristics similar to those of the Super-k algorithm.	翻訳日:2021-02-04 10:15:54 公開日:2021-02-02
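The idea of a labeled Voronoi tessellation obtained by clustering can be sketched as follows. This is only an illustration under stated assumptions: the abstract does not say how clusters are tied to labels, so running k-means separately within each class is an assumption here, and the names `kmeans`, `fit`, and `predict` as well as the deterministic initialization are hypothetical, not the authors' implementation.

```python
from typing import Dict, Hashable, List, Sequence, Tuple

Point = Tuple[float, ...]


def sq_dist(a: Sequence[float], b: Sequence[float]) -> float:
    """Squared Euclidean distance."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points: List[Point], k: int, iters: int = 20) -> List[Point]:
    """Plain Lloyd iterations with a deterministic start (assumes k <= len(points))."""
    step = max(1, len(points) // k)
    centers = [points[i * step] for i in range(k)]
    for _ in range(iters):
        groups: List[List[Point]] = [[] for _ in centers]
        for p in points:
            j = min(range(len(centers)), key=lambda c: sq_dist(p, centers[c]))
            groups[j].append(p)
        # Move each center to the mean of its group; keep it if the group is empty.
        centers = [
            tuple(sum(col) / len(g) for col in zip(*g)) if g else centers[j]
            for j, g in enumerate(groups)
        ]
    return centers


def fit(X: List[Point], y: List[Hashable], k: int) -> List[Tuple[Point, Hashable]]:
    """Assumption: cluster each class separately and label its centers with that class."""
    by_label: Dict[Hashable, List[Point]] = {}
    for p, label in zip(X, y):
        by_label.setdefault(label, []).append(p)
    return [(c, label) for label, pts in by_label.items() for c in kmeans(pts, k)]


def predict(generators: List[Tuple[Point, Hashable]], p: Point) -> Hashable:
    """The nearest labeled center decides the class, i.e. the Voronoi cell containing p."""
    return min(generators, key=lambda g: sq_dist(p, g[0]))[1]


if __name__ == "__main__":
    # XOR-like layout that no single linear boundary separates.
    X = [(0.0, 0.0), (0.1, 0.0), (1.0, 1.0), (1.1, 1.0),
         (0.0, 1.0), (0.1, 1.0), (1.0, 0.0), (1.1, 0.0)]
    y = ["a", "a", "a", "a", "b", "b", "b", "b"]
    model = fit(X, y, k=2)
    print(predict(model, (0.2, 0.1)), predict(model, (0.9, 0.2)))  # a b
```

Because every decision boundary between two centers is a hyperplane, the resulting classifier is piecewise linear, which is the property the abstract emphasizes.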
# FEDZIP: コミュニケーション効率の高いフェデレーション学習のための圧縮フレームワーク FEDZIP: A Compression Framework for Communication-Efficient Federated Learning ( http://arxiv.org/abs/2102.01593v1 ) ライセンス: Link先を確認	Amirhossein Malekijoo, Mohammad Javad Fadaeieslam, Hanieh Malekijou, Morteza Homayounfar, Farshid Alizadeh-Shabdiz, Reza Rawassizadeh	(参考訳) Federated Learningは、ユーザのプライバシを保護し、サードパーティのアクセスから生データを保護することによって、無線デバイスのための分散機械学習(特にディープラーニング)の実装の転換点となる。学習プロセスを各クライアントに独立して割り当てます。まず、クライアントはローカルデータに基づいて機械学習モデルをローカルにトレーニングする。次に、クライアントはモデル重みとバイアス(データトレーニング)のローカルアップデートをサーバに転送する。その後、サーバは更新(クライアントから受信)を集約し、グローバルな学習モデルを作成する。しかし、クライアントとサーバ間の継続的な転送は通信コストを増大させ、ディープラーニングモデルで使用される多数のパラメータ(重みとバイアス)のためにリソース利用の観点から非効率である。貢献するクライアントやコミュニケーションラウンドの数が増えると、コミュニケーションのコストが懸念されるようになります。本研究では、クライアントとそのサーバ間のディープラーニングモデルから重みを転送しながら、更新のサイズを大幅に削減する新しいフレームワークであるFedZipを提案する。 fedzipはトップzスパーシフィケーションを実装し、クラスタリングで量子化を使用し、3つの異なるエンコーディングメソッドで圧縮を実装している。 FedZipは最先端の圧縮フレームワークを上回り、最大1085xまでの圧縮速度を達成し、通信中のクライアントの帯域幅とエネルギーの99%まで保持します。 Federated Learning marks a turning point in the implementation of decentralized machine learning (especially deep learning) for wireless devices by protecting users' privacy and safeguarding raw data from third-party access. It assigns the learning process independently to each client. First, clients locally train a machine learning model based on local data. Next, clients transfer local updates of model weights and biases (training data) to a server. Then, the server aggregates updates (received from clients) to create a global learning model. However, the continuous transfer between clients and the server increases communication costs and is inefficient from a resource utilization perspective due to the large number of parameters (weights and biases) used by deep learning models. The cost of communication becomes a greater concern when the number of contributing clients and communication rounds increases. 
In this work, we propose a novel framework, FedZip, that significantly decreases the size of updates while transferring weights from the deep learning model between clients and their servers. FedZip implements Top-z sparsification, uses quantization with clustering, and implements compression with three different encoding methods. FedZip outperforms state-of-the-art compression frameworks and reaches compression rates up to 1085x, and preserves up to 99% of bandwidth and 99% of energy for clients during communication.	翻訳日:2021-02-04 10:15:17 公開日:2021-02-02
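Two of the stages named in the abstract above, Top-z sparsification and clustering-based quantization, can be sketched on a flat list of weights. This is a rough illustration only: the abstract does not give the selection rule, the number of clusters, or the three encoding methods, so magnitude-based selection, one-dimensional k-means, and the names `top_z_sparsify` and `cluster_quantize` are assumptions, and the encoding stage is omitted.

```python
from typing import List, Sequence


def top_z_sparsify(weights: Sequence[float], z: int) -> List[float]:
    """Keep the z entries with the largest magnitude and zero out the rest
    (assumed selection rule)."""
    if z <= 0:
        return [0.0] * len(weights)
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]), reverse=True)
    keep = set(order[:z])
    return [w if i in keep else 0.0 for i, w in enumerate(weights)]


def cluster_quantize(values: Sequence[float], k: int, iters: int = 20) -> List[float]:
    """Replace each value by the centroid of its 1-D k-means cluster,
    so that at most k distinct values remain."""
    lo, hi = min(values), max(values)
    if k == 1 or lo == hi:
        mean = sum(values) / len(values)
        return [mean] * len(values)
    # Start with centroids spread evenly over the value range.
    centers = [lo + (hi - lo) * j / (k - 1) for j in range(k)]
    assign = [0] * len(values)
    for _ in range(iters):
        assign = [min(range(k), key=lambda j: abs(v - centers[j])) for v in values]
        for j in range(k):
            members = [v for v, a in zip(values, assign) if a == j]
            if members:
                centers[j] = sum(members) / len(members)
    return [centers[a] for a in assign]


if __name__ == "__main__":
    update = [0.05, -0.9, 0.3, -0.02, 0.7, 0.01]
    sparse = top_z_sparsify(update, z=3)
    print(sparse)  # [0.0, -0.9, 0.3, 0.0, 0.7, 0.0]
    print(len(set(cluster_quantize(sparse, k=3))) <= 3)  # True
```

After these two steps an update holds mostly zeros and only a handful of distinct values, which is what makes a subsequent lossless encoding effective; the compression figures quoted in the abstract refer to the authors' full pipeline, not to this sketch.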
# シンプレクティックガウス過程ダイナミクス Symplectic Gaussian Process Dynamics ( http://arxiv.org/abs/2102.01606v1 ) ライセンス: Link先を確認	Katharina Ensinger, Friedrich Solowjow, Michael Tiemann, Sebastian Trimpe	(参考訳) ダイナミクスモデル学習は困難であり、同時に研究の活発な分野でもある。潜在的安全性のため、制御タスクのような下流アプリケーションでは、理論的保証が必要である。 GPは空間上の関数近似子として豊富な理論的保証を誘導するが、力学系の時間的側面には明示的に対応しない。しかし、時間によるシステム特性の伝播は、まさに古典的な数値積分器が設計したものです。本稿では,任意の明示的あるいは暗黙的な単段積分器や多段積分器で基底系を識別し,数値積分器の特性を活用できる,スパースガウス過程に基づく変分推論手法を提案する。特に、ハミルトン問題とシンプレクティック積分器は、体積保存予測を生成する。 Dynamics model learning is challenging and at the same time an active field of research. Due to potential safety critical downstream applications, such as control tasks, there is a need for theoretical guarantees. While GPs induce rich theoretical guarantees as function approximators in space, they do not explicitly cope with the time aspect of dynamical systems. However, propagating system properties through time is exactly what classical numerical integrators were designed for. We introduce a recurrent sparse Gaussian process based variational inference scheme that is able to discretize the underlying system with any explicit or implicit single or multistep integrator, thus leveraging properties of numerical integrators. In particular we discuss Hamiltonian problems coupled with symplectic integrators producing volume preserving predictions.	翻訳日:2021-02-04 10:14:35 公開日:2021-02-02
# 自動階調システムからのデータを用いた学生のパフォーマンス予測 Predicting student performance using data from an auto-grading system ( http://arxiv.org/abs/2102.01270v1 ) ライセンス: Link先を確認	Huanyi Chen, Paul A.S. Ward	(参考訳) オンラインの自動採点システムが現れると、これらのシステムから得られる情報によって、研究者は学生の行動やパフォーマンスを予測する予測モデルを作成することができる。ウォータールー大学では、ECE 150 (Fundamentals of Programming) Instructional Teamは、教育成果を改善するために限られた教育リソースをよりよく割り当てる方法について洞察を得たいと考えています。現在、Instructional Teamは、学習時間をリアクティブベースで割り当てている。生徒を「要請通り」支援する。このアプローチは、助けを求める場所を持つ学生に役立ちます。しかし、苦しんでいる学生の多くは援助を求めて手を差し伸べません。したがって、私たちは研究チームとして、自動グレードシステムであるMarmosetのデータを調べて助けを必要とする学生を決定できるかどうかを探りたいと思っています。本稿では,マーモセット自動採点システムから抽出した様々な特徴を持つ決定木および線形回帰モデルの構築実験を行い,合格率,テストケース結果,提出数,提出時間間隔(最初の合理的な提出と締め切りの間の時間間隔)について検討した。各特徴について,解析結果を混乱行列レベルで解釈した。特に, 成績の悪い学生に対しては, 提出時間間隔を用いた線形回帰モデルが, 精度とf測定の点で, 最良であることを示す。また,成績の悪い生徒に誤分類された生徒は,すべてのモデルにおいて,線形回帰モデルの中では最も低い実例があることが示唆された。また,中間期においては,中間期前の最終割当の提出時間間隔が中間期性能を最も多く予測することを示す。しかし、最終試験では、中間試験のパフォーマンスが最終試験のパフォーマンスに最も貢献します。 As online auto-grading systems appear, information obtained from those systems can potentially enable researchers to create predictive models to predict student behaviour and performance. At the University of Waterloo, the ECE 150 (Fundamentals of Programming) Instructional Team wants to gain insight into how to better allocate the limited teaching resources to achieve improved educational outcomes. Currently, the Instructional Team allocates tutoring time on a reactive basis. They help students "as-requested". This approach serves those students with the wherewithal to request help; however, many of the students who are struggling do not reach out for assistance. Therefore, we, as the Research Team, want to explore whether we can determine which students need help by looking into the data from our auto-grading system, Marmoset. 
In this paper, we conducted experiments building decision-tree and linear-regression models with various features extracted from the Marmoset auto-grading system, including passing rate, testcase outcomes, number of submissions and submission time intervals (the time interval between the student's first reasonable submission and the deadline). For each feature, we interpreted the result at the confusion matrix level. Specifically for poor-performance students, we show that the linear-regression model using submission time intervals performs the best among all models in terms of Precision and F-Measure. We also show that students who are misclassified as poor-performance students have the lowest actual grades under the linear-regression model among all models. In addition, we show that for the midterm, the submission time interval of the last assignment before the midterm is the strongest predictor of midterm performance. However, for the final exam, the midterm performance contributes the most to the final exam performance.	翻訳日:2021-02-04 10:06:40 公開日:2021-02-02
# ノイズがカオスと出会うとき:ニューロカオス学習における確率的共鳴 When Noise meets Chaos: Stochastic Resonance in Neurochaos Learning ( http://arxiv.org/abs/2102.01316v1 ) ライセンス: Link先を確認	Harikrishnan NB and Nithin Nagaraj	(参考訳) カオスとノイズは脳に広がっています。ニューロンのカオス的な発砲と神経モデルにおけるノイズの構造的役割に触発され、私たちは初めてカオス、ノイズ、学習を接続します。本稿では,ニューロカオス学習(NL)における確率共鳴(SR)現象を実証する。 SRはNLの単一のニューロンのレベルで現われ、有効なsubthreshold信号の検出を可能にします。さらに、SRは、シミュレーションと実世界の音声桁データセットの両方において、分類タスクのための単一および複数のニューロンNLアーキテクチャで発生することが示されている。ニューロカオス学習における中間レベルのノイズは、分類タスクにおけるピークパフォーマンスを可能にし、AIアプリケーション、特に脳インスパイアされた学習アーキテクチャにおけるSRの役割を強調します。 Chaos and Noise are ubiquitous in the Brain. Inspired by the chaotic firing of neurons and the constructive role of noise in neuronal models, we for the first time connect chaos, noise and learning. In this paper, we demonstrate the Stochastic Resonance (SR) phenomenon in Neurochaos Learning (NL). SR manifests at the level of a single neuron of NL and enables efficient subthreshold signal detection. Furthermore, SR is shown to occur in single and multiple neuronal NL architectures for classification tasks, on both simulated and real-world spoken digit datasets. Intermediate levels of noise in neurochaos learning enable peak performance in classification tasks, thus highlighting the role of SR in AI applications, especially in brain-inspired learning architectures.	翻訳日:2021-02-04 10:05:55 公開日:2021-02-02
# 二元化ニューラルネットワークにおけるビット誤差耐性測定 Bit Error Tolerance Metrics for Binarized Neural Networks ( http://arxiv.org/abs/2102.01344v1 ) ライセンス: Link先を確認	Sebastian Buschj\"ager, Jian-Jia Chen, Kuan-Hsun Chen, Mario G\"unzel, Katharina Morik, Rodion Novkin, Lukas Pfahler, Mikail Yayla	(参考訳) ニューラルネットワーク(NN)推論システムのリソース需要を減らすために、電源電圧とタイミングパラメータをエネルギー消費とパフォーマンスで取引精度を調整する近似メモリを使用することが提案されている。これらのパラメータの調整はビットエラーに積極的につながり、トレーニング中にビットフリップが注入されるとNNによって許容されます。しかし、ビットエラー耐性を達成するための最先端の技術であるビットフリップトレーニングは、スケールがうまくいかず、膨大なオーバーヘッドをもたらし、高いビットエラー率(BER)に適用することはできません。 NNにおけるビットエラー耐性を実現する別の方法が必要であるが、NNのビットエラー耐性の背後にある基本原則はまだ報告されていない。この理解の欠如により、nnビットのエラー許容性に関する研究のさらなる進展が抑制される。本研究の目的は,二項化NN(BNN)に着目して,フリップトレーニングの原因となるNNの内部的変化を調べることである。そのために、ビットエラー耐性BNNの性質を2つの指標で定量化します。まず,プリアクティベーション値とバッチ正規化しきい値とのマージンを計算する,ニューロンレベルのビット誤り耐性メトリックを提案する。次に、神経細胞の相互作用に対するビット誤差許容度の影響を捉えるために、各ニューロンの重要性を測定し、すべての重要値のばらつきを計算するニューロン間ビット誤差許容度指標を提案します。実験結果は,この2つの指標がビット誤り許容性に強く関連していることを裏付ける。 To reduce the resource demand of neural network (NN) inference systems, it has been proposed to use approximate memory, in which the supply voltage and the timing parameters are tuned trading accuracy with energy consumption and performance. Tuning these parameters aggressively leads to bit errors, which can be tolerated by NNs when bit flips are injected during training. However, bit flip training, which is the state of the art for achieving bit error tolerance, does not scale well; it leads to massive overheads and cannot be applied for high bit error rates (BERs). Alternative methods to achieve bit error tolerance in NNs are needed, but the underlying principles behind the bit error tolerance of NNs have not been reported yet. With this lack of understanding, further progress in the research on NN bit error tolerance will be restrained. In this study, our objective is to investigate the internal changes in the NNs that bit flip training causes, with a focus on binarized NNs (BNNs). 
To this end, we quantify the properties of bit error tolerant BNNs with two metrics. First, we propose a neuron-level bit error tolerance metric, which calculates the margin between the pre-activation values and batch normalization thresholds. Secondly, to capture the effects of bit error tolerance on the interplay of neurons, we propose an inter-neuron bit error tolerance metric, which measures the importance of each neuron and computes the variance over all importance values. Our experimental results support that these two metrics are strongly related to bit error tolerance.	翻訳日:2021-02-04 10:05:22 公開日:2021-02-02
# スマートシティにおける連合学習:包括的調査 Federated Learning in Smart Cities: A Comprehensive Survey ( http://arxiv.org/abs/2102.01375v1 ) ライセンス: Link先を確認	Zhaohua Zheng, Yize Zhou, Yilong Sun, Zhang Wang, Boyi Liu and Keqiu Li	(参考訳) 連合学習はスマートシティのプロセスにおいて重要な役割を果たす。ビッグデータと人工知能の開発により、このプロセスではデータのプライバシ保護が問題となる。連合学習はこの問題を解くことができる。本稿では,様々な分野における連合学習とその応用の現況から始める。我々は総合的な調査を行う。本稿では,スマートシティの様々な分野における連合学習の適用に関する最新の研究をまとめる。モノのインターネット、輸送、通信、金融、医療、その他の分野の観点から、連合学習の現在の発展を深く理解する。その前に,連合学習の背景,定義,キー技術を紹介する。さらに、重要な技術と最新の結果についてレビューする。最後に,スマートシティにおける連合学習の今後の応用と研究方向について考察する。 Federated learning plays an important role in the process of smart cities. With the development of big data and artificial intelligence, there is a problem of data privacy protection in this process. Federated learning is capable of solving this problem. This paper starts with the current developments of federated learning and its applications in various fields. We conduct a comprehensive investigation. This paper summarizes the latest research on the application of federated learning in various fields of smart cities. It provides an in-depth understanding of the current development of federated learning in the Internet of Things, transportation, communications, finance, medical and other fields. Before that, we introduce the background, definition and key technologies of federated learning. Furthermore, we review the key technologies and the latest results. Finally, we discuss the future applications and research directions of federated learning in smart cities.	翻訳日:2021-02-04 10:04:38 公開日:2021-02-02
# AURSAD:Universal Robot Screwdriving Anomaly Detection Dataset AURSAD: Universal Robot Screwdriving Anomaly Detection Dataset ( http://arxiv.org/abs/2102.01409v1 ) ライセンス: Link先を確認	B{\l}a\.zej Leporowski, Daniella Tola, Casper Hansen and Alexandros Iosifidis	(参考訳) ねじ運転は最も人気のある産業プロセスの1つです。そのため、様々なロボットを用いてその手順を自動化することがますます一般的になっている。自動化によってスクリュー駆動プロセスの効率が向上するが、プロセスが正しく監視されていない場合、動作中に障害が発生し、アセンブリの有効性と品質に影響を与える可能性がある。機械学習(ML)は、望ましくない出来事を検出し、その影響を制限する可能性がある。そのためには、まず、自動走行を行う産業用ロボットの動作を完全に記述したデータセットを入手する必要がある。本報告では,UR3eシリーズロボットとOnRobot Screwdriverを用いて作成したデータセットについて述べる。さまざまなシナリオを作成し、プロセスに3種類の異常を導入し、利用可能なロボットとドライバーのセンサーを継続的に記録します。得られたデータは、正常および異常なロボット操作の2042のサンプルを含む。このデータを使用した短いMLベンチマークも提供されており、さらなる分析と実験のためのデータの適合性と可能性を示している。 Screwdriving is one of the most popular industrial processes. As such, it is increasingly common to automate that procedure by using various robots. Even though the automation increases the efficiency of the screwdriving process, if the process is not monitored correctly, faults may occur during operation, which can impact the effectiveness and quality of assembly. Machine Learning (ML) has the potential to detect those undesirable events and limit their impact. In order to do so, first a dataset that fully describes the operation of an industrial robot performing automated screwdriving must be available. This report describes a dataset created using a UR3e series robot and OnRobot Screwdriver. We create different scenarios and introduce 3 types of anomalies to the process while all available robot and screwdriver sensors are continuously recorded. The resulting data contains 2042 samples of normal and anomalous robot operation. Brief ML benchmarks using this data are also provided, showcasing the data's suitability and potential for further analysis and experimentation.	翻訳日:2021-02-04 10:04:09 公開日:2021-02-02
# 機械学習による投票傾向の予測 Predicting Propensity to Vote with Machine Learning ( http://arxiv.org/abs/2102.01535v1 ) ライセンス: Link先を確認	Rebecca D. Pollard, Sara M. Pollard, Scott Streit	(参考訳) 機械学習によって、過去の行動や属性から個人の投票傾向を推測できることを実証します。これは、投票者のアウトリーチ、投票者教育、投票促進(GOTV)キャンペーンのマイクロターゲティングに有用である。政治学者は1940年代後半から選挙結果を推定する高度な技術を発展させてきた。 2つの先行研究も同様に機械学習を使って個人の将来の投票行動を予測している。 TensorFlowを使った機械学習環境を構築し、2004年から2018年まで投票データを取得し、3つの実験を実施しました。マシューズ相関係数0.39という良好な結果を示す。 We demonstrate that machine learning makes it possible to infer an individual's propensity to vote from their past actions and attributes. This is useful for microtargeting voter outreach, voter education and get-out-the-vote (GOTV) campaigns. Political scientists have developed increasingly sophisticated techniques for estimating election outcomes since the late 1940s. Two prior studies similarly used machine learning to predict individual future voting behavior. We built a machine learning environment using TensorFlow, obtained voting data from 2004 to 2018, and then ran three experiments. We show positive results with a Matthews correlation coefficient of 0.39.	翻訳日:2021-02-04 10:03:35 公開日:2021-02-02
# 間欠通信による分散確率凸最適化のMin-Max複雑性 The Min-Max Complexity of Distributed Stochastic Convex Optimization with Intermittent Communication ( http://arxiv.org/abs/2102.01583v1 ) ライセンス: Link先を確認	Blake Woodworth, Brian Bullins, Ohad Shamir, Nathan Srebro	(参考訳) 間欠的通信設定における分散確率凸最適化のmin-max複雑性を(対数係数を除いて)解決する。この設定では、$M$台のマシンが目的関数を最適化するために$R$ラウンドの通信にわたって並列に動作し、各通信ラウンドにおいて各マシンが$K$個の確率勾配推定を逐次計算できる。本稿では、一致する上界を伴う新しい下界を示し、最適なアルゴリズムを確立する。 We resolve the min-max complexity of distributed stochastic convex optimization (up to a log factor) in the intermittent communication setting, where $M$ machines work in parallel over the course of $R$ rounds of communication to optimize the objective, and during each round of communication, each machine may sequentially compute $K$ stochastic gradient estimates. We present a novel lower bound with a matching upper bound that establishes an optimal algorithm.	翻訳日:2021-02-04 10:03:08 公開日:2021-02-02
# 重み付きリーマー平均を用いた非バランス音声のブラインド分離のための方向スパースフィルタリング Directional Sparse Filtering using Weighted Lehmer Mean for Blind Separation of Unbalanced Speech Mixtures ( http://arxiv.org/abs/2102.00196v2 ) ライセンス: Link先を確認	Karn Watcharasupat, Anh H. T. Nguyen, Ching-Hui Ooi and Andy W. H. Khong	(参考訳) 音声信号のブラインドソース分離において、ソーススペクトルの固有の不均衡は、混合行列の推定に単一ソース支配に依存する方法の課題である。本稿では,Lehmer平均と学習可能な重みを用いた指向性スパースフィルタリング(DSF)フレームワークに基づくアルゴリズムを提案し,ソースの不均衡を適応的に考慮する。複数の実音環境における音源分離性能の評価は, ベースライン法と比較して改善が見られた。 In blind source separation of speech signals, the inherent imbalance in the source spectrum poses a challenge for methods that rely on single-source dominance for the estimation of the mixing matrix. We propose an algorithm based on the directional sparse filtering (DSF) framework that utilizes the Lehmer mean with learnable weights to adaptively account for source imbalance. Performance evaluation in multiple real acoustic environments shows improvements in source separation compared to the baseline methods.	翻訳日:2021-02-04 10:00:49 公開日:2021-02-02
# PSLA:プリトレーニング、サンプリング、ラベリング、アグリゲーションによるオーディオイベント分類の改善 PSLA: Improving Audio Event Classification with Pretraining, Sampling, Labeling, and Aggregation ( http://arxiv.org/abs/2102.01243v1 ) ライセンス: Link先を確認	Yuan Gong, Yu-An Chung, and James Glass	(参考訳) オーディオイベント分類は活発な研究領域であり、幅広い用途があります。 AudioSetのリリース以来、分類精度の向上に大きく進歩しています。これは、主に新しいモデルアーキテクチャと注意モジュールの開発から来ています。しかし,オーディオセットを用いた音声イベント分類モデルの構築においては,適切なトレーニング手法が等しく重要であることが判明した。このギャップを埋めるため,本研究では,イメージネットプリトレーニング,バランスサンプリング,データ拡張,ラベル拡張,モデルアグリゲーション,設計選択など,モデルの精度を著しく向上させるトレーニング手法であるpslaを提案する。これらの手法でEfficientNetをトレーニングすることにより,AudioSet上で0.474の平均精度(mAP)を新たに達成し,従来の0.439よりも優れるモデルを得る。 Audio event classification is an active research area and has a wide range of applications. Since the release of AudioSet, great progress has been made in advancing the classification accuracy, which mostly comes from the development of novel model architectures and attention modules. However, we find that appropriate training techniques are equally important for building audio event classification models with AudioSet, but have not received the attention they deserve. To fill the gap, in this work, we present PSLA, a collection of training techniques that can noticeably boost the model accuracy including ImageNet pretraining, balanced sampling, data augmentation, label enhancement, model aggregation and their design choices. By training an EfficientNet with these techniques, we obtain a model that achieves a new state-of-the-art mean average precision (mAP) of 0.474 on AudioSet, outperforming the previous best system of 0.439.	翻訳日:2021-02-04 09:50:49 公開日:2021-02-02
# ターゲット話者抽出のためのマルチモーダルアテンション融合 Multimodal Attention Fusion for Target Speaker Extraction ( http://arxiv.org/abs/2102.01326v1 ) ライセンス: Link先を確認	Hiroshi Sato, Tsubasa Ochiai, Keisuke Kinoshita, Marc Delcroix, Tomohiro Nakatani, Shoko Araki	(参考訳) 音声,視覚的,位置的手がかりを用いた混合音声からターゲット話者の声を抽出することを目的としたターゲット話者抽出が注目されている。近年,補完音声と視覚的手がかりを用いてターゲット音声を抽出する音声-視覚的ターゲット話者抽出法が提案されている。音声と視覚を対象とする話者抽出はシミュレーションデータに対する単一モダリティ法よりも安定した性能を提供するが、現実の状況への適応や実記録混合物の評価は十分に検討されていない。現実的な状況に対処する上で大きな問題の1つは、実際の記録では両方の手がかりが等しく信頼性がない可能性があるため、システムの汚職を突き止めるための堅牢化である。視覚的な手がかりは閉塞の影響を受けます本研究では、マルチモーダル融合のための新しい注意メカニズムとそのトレーニング方法を提案し、より信頼性の高いものに手がかりの信頼性と重量を効果的に捉えることを可能にする。シミュレーションデータに対する従来の核融合機構よりも,信号対歪み比(SDR)を1.0dB向上させる。さらに,同時音声の音声・視覚データセットを実データを用いて記録し,提案手法による音声・視覚対象話者抽出が実データに有効であることを示す。 Target speaker extraction, which aims at extracting a target speaker's voice from a mixture of voices using audio, visual or locational clues, has received much interest. Recently an audio-visual target speaker extraction has been proposed that extracts target speech by using complementary audio and visual clues. Although audio-visual target speaker extraction offers a more stable performance than single modality methods for simulated data, its adaptation towards realistic situations has not been fully explored as well as evaluations on real recorded mixtures. One of the major issues to handle realistic situations is how to make the system robust to clue corruption because in real recordings both clues may not be equally reliable, e.g. visual clues may be affected by occlusions. In this work, we propose a novel attention mechanism for multi-modal fusion and its training methods that enable to effectively capture the reliability of the clues and weight the more reliable ones. Our proposals improve signal to distortion ratio (SDR) by 1.0 dB over conventional fusion mechanisms on simulated data. 
Moreover, we record an audio-visual dataset of simultaneous speech with realistic visual clue corruption and show that audio-visual target speaker extraction with our proposals works successfully on real data.	翻訳日:2021-02-04 09:50:11 公開日:2021-02-02
# 機械学習による有機半導体の電荷輸送の動的障害解析 Analyzing dynamical disorder for charge transport in organic semiconductors via machine learning ( http://arxiv.org/abs/2102.01479v1 ) ライセンス: Link先を確認	Patrick Reiser, Manuel Konrad, Artem Fediai, Salvador L\'eon, Wolfgang Wenzel and Pascal Friederich	(参考訳) 有機半導体は有機発光ダイオード(oled)や光電子応用といった今日のディスプレイ技術にとって不可欠である。しかし、有機材料は無機半導体と同じ電荷担体モビリティに到達せず、装置の効率を制限している。より高い電荷キャリア移動度を持つ新しい有機半導体を発見または設計するためには、計算アプローチ、特にマルチスケールモデルがますます重要になっている。しかし、そのようなモデルは計算コストが非常に高く、特に大規模システムや長時間のスケールが必要な場合、静的エネルギーや動的エネルギー障害を計算する場合である。電荷輸送を決定する主要な要因。ここでは、機械学習モデルをマルチスケールシミュレーションに統合することで、この欠点を克服する。これにより、関連する微視的材料特性、特に一連の応用関連分子に対する静的および動的障害寄与に関する前例のない洞察を得ることができます。静的な障害や浅いトラップの分布は、多くの材料に対して非常に非対称であり、ガウス的障害モデルに影響を与えている。さらに, エネルギー準位変動時間の解析を行い, 典型的ホッピング速度と比較し, 電荷輸送における動的障害の重要性を評価する。我々は,有機半導体の応用材料特性の予測に使用する計算手法の精度を大幅に向上し,仮想材料設計にこれらの手法を適用することを期待する。 Organic semiconductors are indispensable for today's display technologies in form of organic light emitting diodes (OLEDs) and further optoelectronic applications. However, organic materials do not reach the same charge carrier mobility as inorganic semiconductors, limiting the efficiency of devices. To find or even design new organic semiconductors with higher charge carrier mobility, computational approaches, in particular multiscale models, are becoming increasingly important. However, such models are computationally very costly, especially when large systems and long time scales are required, which is the case to compute static and dynamic energy disorder, i.e. dominant factor to determine charge transport. Here we overcome this drawback by integrating machine learning models into multiscale simulations. This allows us to obtain unprecedented insight into relevant microscopic materials properties, in particular static and dynamic disorder contributions for a series of application-relevant molecules. 
We find that static disorder and thus the distribution of shallow traps is highly asymmetrical for many materials, impacting widely considered Gaussian disorder models. We furthermore analyse characteristic energy level fluctuation times and compare them to typical hopping rates to evaluate the importance of dynamic disorder for charge transport. We hope that our findings will significantly improve the accuracy of computational methods used to predict application relevant materials properties of organic semiconductors, and thus make these methods applicable for virtual materials design.	翻訳日:2021-02-04 09:49:29 公開日:2021-02-02
# (参考訳) ニューラルセマンティックパーサーのロバスト性について On Robustness of Neural Semantic Parsers ( http://arxiv.org/abs/2102.01563v1 ) ライセンス: CC BY 4.0	Shuo Huang, Zhuang Li, Lizhen Qu, Lei Pan	(参考訳) 意味解析は自然言語(NL)の発話を論理形式(LF)に写し、多くの高度なNLP問題を支えている。セマンティックパーサーはディープニューラルネットワークでパフォーマンスが向上するが、敵対的サンプルに対する脆弱性を継承する。本論文では,敵対的攻撃の存在下でのセマンティックパーサーの堅牢性に関する実証的研究について述べる。形式的には、意味解析における敵対的サンプルは摂動を加えた発話-LF対と見なされ、その発話は元の発話と全く同じ意味を持つ。既存のベンチマークコーパスに基づくロバストネステストセットを構築するために,スケーラブルな手法を提案する。本研究は,ロバスト性テストセットにおける最先端のパーサーの性能評価と,データ拡張の効果評価に関する5つの研究課題に答えた。 Semantic parsing maps natural language (NL) utterances into logical forms (LFs), which underpins many advanced NLP problems. Semantic parsers gain performance boosts with deep neural networks, but inherit vulnerabilities against adversarial examples. In this paper, we provide an empirical study on the robustness of semantic parsers in the presence of adversarial attacks. Formally, adversaries of semantic parsing are considered to be the perturbed utterance-LF pairs, whose utterances have exactly the same meanings as the original ones. A scalable methodology is proposed to construct robustness test sets based on existing benchmark corpora. Our results answered five research questions in measuring the state-of-the-art parsers' performance on robustness test sets, and evaluating the effect of data augmentation.	翻訳日:2021-02-04 07:41:52 公開日:2021-02-02
# (参考訳) リアルタイム超解像におけるraw画像の活用 Exploiting Raw Images for Real-Scene Super-Resolution ( http://arxiv.org/abs/2102.01579v1 ) ライセンス: CC BY 4.0	Xiangyu Xu, Yongrui Ma, Wenxiu Sun, Ming-Hsuan Yang	(参考訳) 超解像度は、カメラセンサーの空間的制約を克服することを目的としたコンピュータビジョンの基本的な問題です。単一画像のスーパーレゾリューションでは大きな進歩が見られたが、ほとんどのアルゴリズムは合成データでのみうまく動作し、実際のシナリオでの応用を制限する。本稿では,合成データと実写画像のギャップを埋めるために,実写単像超解像の問題について検討する。我々は既存の超解像アルゴリズムの2つの問題に焦点を当てている: 実写訓練データの欠如とカメラから得られる視覚情報の活用不足。そこで本研究では,デジタルカメラの撮像過程をシミュレートし,よりリアルなトレーニングデータを生成する手法を提案する。第2の課題は、原画像に記録された放射情報を利用する2分岐畳み込みニューラルネットワークを開発することである。さらに,画像復元のための高密度チャネルアテンションブロックと,有効色補正のための学習型ガイド付きフィルタネットワークを提案する。我々のモデルは、特定のカメラタイプからの画像を意図的に訓練することなく、異なるカメラに一般化することができる。広汎な実験により,提案アルゴリズムは細部やクリアな構造を復元し,実際のシーンにおける単一画像超解像の高品質な結果が得られることを示した。 Super-resolution is a fundamental problem in computer vision which aims to overcome the spatial limitation of camera sensors. While significant progress has been made in single image super-resolution, most algorithms only perform well on synthetic data, which limits their applications in real scenarios. In this paper, we study the problem of real-scene single image super-resolution to bridge the gap between synthetic data and real captured images. We focus on two issues of existing super-resolution algorithms: lack of realistic training data and insufficient utilization of visual information obtained from cameras. To address the first issue, we propose a method to generate more realistic training data by mimicking the imaging process of digital cameras. For the second issue, we develop a two-branch convolutional neural network to exploit the radiance information originally-recorded in raw images. In addition, we propose a dense channel-attention block for better image restoration as well as a learning-based guided filter network for effective color correction. Our model is able to generalize to different cameras without deliberately training on images from specific camera types. 
Extensive experiments demonstrate that the proposed algorithm can recover fine details and clear structures, and achieve high-quality results for single image super-resolution in real scenes.	翻訳日:2021-02-04 07:14:02 公開日:2021-02-02
# (参考訳) GEMベンチマーク:自然言語生成とその評価とメトリクス The GEM Benchmark: Natural Language Generation, its Evaluation and Metrics ( http://arxiv.org/abs/2102.01672v1 ) ライセンス: CC BY 4.0	Sebastian Gehrmann, Tosin Adewumi, Karmanya Aggarwal, Pawan Sasanka Ammanamanchi, Aremu Anuoluwapo, Antoine Bosselut, Khyathi Raghavi Chandu, Miruna Clinciu, Dipanjan Das, Kaustubh D. Dhole, Wanyu Du, Esin Durmus, Ond\v{r}ej Du\v{s}ek, Chris Emezue, Varun Gangal, Cristina Garbacea, Tatsunori Hashimoto, Yufang Hou, Yacine Jernite, Harsh Jhamtani, Yangfeng Ji, Shailza Jolly, Dhruv Kumar, Faisal Ladhak, Aman Madaan, Mounica Maddela, Khyati Mahajan, Saad Mahamood, Bodhisattwa Prasad Majumder, Pedro Henrique Martins, Angelina McMillan-Major, Simon Mille, Emiel van Miltenburg, Moin Nadeem, Shashi Narayan, Vitaly Nikolaev, Rubungo Andre Niyongabo, Salomey Osei, Ankur Parikh, Laura Perez-Beltrachini, Niranjan Ramesh Rao, Vikas Raunak, Juan Diego Rodriguez, Sashank Santhanam, Jo\~ao Sedoc, Thibault Sellam, Samira Shaikh, Anastasia Shimorina, Marco Antonio Sobrevilla Cabezudo, Hendrik Strobelt, Nishant Subramani, Wei Xu, Diyi Yang, Akhila Yerukola, Jiawei Zhou	(参考訳) 自然言語生成(NLG)のための生きたベンチマークであるGEM、その評価、およびメトリクスを紹介します。 NLGの進捗測定は、自動メトリクス、データセット、および人間の評価基準の絶え間なく進化するエコシステムに依存しています。しかし、この移動目標のため、新しいモデルは、よく確立されているが欠陥のあるメトリクスを持つ分散アングロ中心のコーパスで評価されることが多い。この切断は、現在のモデルと進歩の機会の限界を特定するのを難しくする。この制限に対処するため、GEMは幅広いコーポラにモデルを簡単に適用でき、評価戦略をテストすることができる環境を提供します。ベンチマークの定期的なアップデートにより、NLGの研究はより多言語化され、モデルとともに課題を進化させる。この論文は、ACL 2021ワークショップで共有タスクを組織し、NLGコミュニティ全体を参加するよう招待する最初のリリースの説明として機能します。 We introduce GEM, a living benchmark for natural language Generation (NLG), its Evaluation, and Metrics. Measuring progress in NLG relies on a constantly evolving ecosystem of automated metrics, datasets, and human evaluation standards. However, due to this moving target, new models often still evaluate on divergent anglo-centric corpora with well-established, but flawed, metrics. 
This disconnect makes it challenging to identify the limitations of current models and opportunities for progress. Addressing this limitation, GEM provides an environment in which models can easily be applied to a wide set of corpora and evaluation strategies can be tested. Regular updates to the benchmark will help NLG research become more multilingual and evolve the challenge alongside models. This paper serves as the description of the initial release for which we are organizing a shared task at our ACL 2021 Workshop and to which we invite the entire NLG community to participate.	翻訳日:2021-02-04 06:42:02 公開日:2021-02-02
# (参考訳) ドメイン適応型エンドツーエンド音声認識のための内部言語モデルトレーニング Internal Language Model Training for Domain-Adaptive End-to-End Speech Recognition ( http://arxiv.org/abs/2102.01380v1 ) ライセンス: CC BY 4.0	Zhong Meng, Naoyuki Kanda, Yashesh Gaur, Sarangarajan Parthasarathy, Eric Sun, Liang Lu, Xie Chen, Jinyu Li, Yifan Gong	(参考訳) 外部言語モデル(LM)と既存のエンドツーエンド(E2E)自動音声認識(ASR)システムの統合の有効性は、内部言語モデル推定(ILME)法を用いて大幅に改善することができる。この方法では、推論中にE2Eスコアと外部LMスコアを補間して得られたスコアから内部LMスコアを減算する。 ILMEに基づく推論を改善するために、内部LM推定に影響を与えるE2Eモデルコンポーネントのみを更新することにより、内部LM損失を最小限に抑える内部LMトレーニング(ILMT)方法を提案する。 ILMTは、ESRの精度を犠牲にすることなく、既存のコンポーネント内でスタンドアロンのLMを形成するようE2Eモデルを奨励している。 ILMTの後、トレーニングと推論の基準が一致したよりモジュール化されたE2Eモデルは、ソースドメイン内部のLMをより徹底的に除去し、ターゲットドメイン外部のLMをより効果的に統合することを可能にする。 30K時間の訓練された繰り返しニューラルネットワークトランスデューサと注意ベースのエンコーダデコーダモデルで実験されたILMTは、ILMEベースの推論により、標準E2Eトレーニングから最大31.5%および11.4%の相対的な単語誤り率を、ドメイン外LibriSpeechとMicrosoft生産テストセットでShallow Fusionでそれぞれ達成する。 The efficacy of external language model (LM) integration with existing end-to-end (E2E) automatic speech recognition (ASR) systems can be improved significantly using the internal language model estimation (ILME) method. In this method, the internal LM score is subtracted from the score obtained by interpolating the E2E score with the external LM score, during inference. To improve the ILME-based inference, we propose an internal LM training (ILMT) method to minimize an additional internal LM loss by updating only the E2E model components that affect the internal LM estimation. ILMT encourages the E2E model to form a standalone LM inside its existing components, without sacrificing ASR accuracy. After ILMT, the more modular E2E model with matched training and inference criteria enables a more thorough elimination of the source-domain internal LM, and therefore leads to a more effective integration of the target-domain external LM. 
Experimented with 30K-hour trained recurrent neural network transducer and attention-based encoder-decoder models, ILMT with ILME-based inference achieves up to 31.5% and 11.4% relative word error rate reductions from standard E2E training with Shallow Fusion on out-of-domain LibriSpeech and in-domain Microsoft production test sets, respectively.	翻訳日:2021-02-04 05:54:46 公開日:2021-02-02
# (参考訳) サイズは重要である Size Matters ( http://arxiv.org/abs/2102.01582v1 ) ライセンス: CC BY 4.0	Mats L. Richter, Johan Byttner, Ulf Krumnack, Ludwig Schallner, Justin Shenk	(参考訳) 完全畳み込みニューラルネットワークは、ダウンサンプリングとプールの組み合わせで任意のサイズの入力を処理することができる。しかし、完全畳み込み画像分類器は入力サイズに対して不変ではなく、性能に有意な差が生じることが分かった。同じ画像を異なるスケールで提示すると、異なる結果が得られることがある。より詳しく見ると、入力サイズとモデル性能の間には単純な関係はない(「大きいほど良い」わけではない)が、各ネットワークが最適な入力サイズを持ち、そこで最良の結果を示すことがわかる。本研究では,層活性化のスペクトル解析やプローブ分類などの異なる手法を適用し,ネットワークアーキテクチャに特有の特徴があることを示す。この結果から、識別的特徴の大きさが、層間での推論プロセスの分散方法に重大な影響を与えていることが判明した。 Fully convolutional neural networks can process input of arbitrary size by applying a combination of downsampling and pooling. However, we find that fully convolutional image classifiers are not agnostic to the input size but rather show significant differences in performance: presenting the same image at different scales can result in different outcomes. A closer look reveals that there is no simple relationship between input size and model performance (no 'bigger is better'), but that each network has a preferred input size, for which it shows the best results. We investigate this phenomenon by applying different methods, including spectral analysis of layer activations and probe classifiers, showing that there are characteristic features depending on the network architecture. From this we find that the size of discriminatory features critically influences how the inference process is distributed among the layers.	翻訳日:2021-02-04 05:42:20 公開日:2021-02-02
# (参考訳) MultiTalk:多様な会話のための高分岐対話テストベッド MultiTalk: A Highly-Branching Dialog Testbed for Diverse Conversations ( http://arxiv.org/abs/2102.01263v1 ) ライセンス: CC BY 4.0	Yao Dou, Maxwell Forbes, Ari Holtzman, Yejin Choi	(参考訳) 与えられた履歴に対する多くの可能な応答がある会話対話について研究する。選択的なブランチ継続を通じて、高い分岐数(10)と複数の会話ターン(6)のバランスをとる320,000以上の会話ダイアログの文のコーパスであるMultiTalk Datasetを紹介します。高度に分岐した環境で、対話生成の研究に複数貢献します。多様な生成結果の集合を評価するために, 多様な参照のセットを最適に組み込む, 二部グラフマッチングに基づく単純なスコアリングアルゴリズムを提案する。事前学習された分類器から自動的に導出されるテキスト属性を用いて,予測会話深さの異なるレベルで複数の言語生成タスクについて検討した。集大成となるタスクは、困難な心の理論の問題であり、聞き手の予想される反応についての推論を必要とする制御可能な生成タスクである。 We study conversational dialog in which there are many possible responses to a given history. We present the MultiTalk Dataset, a corpus of over 320,000 sentences of written conversational dialog that balances a high branching factor (10) with several conversation turns (6) through selective branch continuation. We make multiple contributions to study dialog generation in the highly branching setting. In order to evaluate a diverse set of generations, we propose a simple scoring algorithm, based on bipartite graph matching, to optimally incorporate a set of diverse references. We study multiple language generation tasks at different levels of predictive conversation depth, using textual attributes induced automatically from pretrained classifiers. Our culminating task is a challenging theory of mind problem, a controllable generation task which requires reasoning about the expected reaction of the listener.	翻訳日:2021-02-04 05:26:22 公開日:2021-02-02
# (参考訳) 歴史資料への機械翻訳応用の2つの実証 Two Demonstrations of the Machine Translation Applications to Historical Documents ( http://arxiv.org/abs/2102.01417v1 ) ライセンス: CC BY-SA 4.0	Miguel Domingo and Francisco Casacuberta	(参考訳) 歴史的文書に2つの機械翻訳の応用例を示す。最初のタスクは、その元の言語の現代バージョンで書かれた歴史的な文書の新バージョンを生成することです。第2のアプリケーションは文書の正書法に限られる。綴りの一貫性を確保し、綴り規則の欠如に対処するために、文書の綴りを現代の標準に適応させます。我々は、ユーザがシステムの仮説に修正を導入することができる、インタラクティブで適応的なフレームワークに従った。システムはこれらの補正に反応し、それらを考慮した新しい仮説を生成する。ユーザがシステムの仮説に満足して検証すると、システムはオンライン学習戦略に従ってそのモデルに適応する。このシステムはクライアントサーバアーキテクチャに従って実装される。ニューラルモデルと通信するWebサイトを開発した。すべてのコードはオープンソースで公開されています。デモはhttp://demosmt.prhlt.upv.es/mthd/にホストされている。 We present our demonstration of two machine translation applications to historical documents. The first task consists in generating a new version of a historical document, written in the modern version of its original language. The second application is limited to a document's orthography. It adapts the document's spelling to modern standards to achieve orthographic consistency and to account for the lack of spelling conventions. We followed an interactive, adaptive framework that allows the user to introduce corrections to the system's hypothesis. The system reacts to these corrections by generating a new hypothesis that takes them into account. Once the user is satisfied with the system's hypothesis and validates it, the system adapts its model following an online learning strategy. This system is implemented following a client-server architecture. We developed a website which communicates with the neural models. All code is open-source and publicly available. The demonstration is hosted at http://demosmt.prhlt.upv.es/mthd/.	翻訳日:2021-02-04 05:14:17 公開日:2021-02-02
# (参考訳) 直接音声翻訳のためのCTCに基づく圧縮 CTC-based Compression for Direct Speech Translation ( http://arxiv.org/abs/2102.01578v1 ) ライセンス: CC BY-SA 4.0	Marco Gaido, Mauro Cettolo, Matteo Negri, Marco Turchi	(参考訳) 従来の研究は、入力音声の音素情報に基づく動的圧縮が音声翻訳(ST)に有益であることを示した。しかし、それらは音素認識のための専用モデルを必要とし、単一のモデルが入力音声を中間表現なしでターゲット言語に翻訳するdirect STではこのソリューションをテストしなかった。本研究では,direct STモデルにおいて入力の動的圧縮を行うための第1の手法を提案する。特に,コネクショニスト時間分類(CTC)を用いて,その音声特性に応じて入力列を圧縮する。我々の実験は、我々のソリューションが2つの言語ペア(英語-イタリア語と英語-ドイツ語)の強いベースラインに対して1.3-1.5BLEUの改善をもたらし、同時にメモリフットプリントを10%以上削減することを示した。 Previous studies demonstrated that a dynamic phone-informed compression of the input audio is beneficial for speech translation (ST). However, they required a dedicated model for phone recognition and did not test this solution for direct ST, in which a single model translates the input audio into the target language without intermediate representations. In this work, we propose the first method able to perform a dynamic compression of the input in direct ST models. In particular, we exploit the Connectionist Temporal Classification (CTC) to compress the input sequence according to its phonetic characteristics. Our experiments demonstrate that our solution brings a 1.3-1.5 BLEU improvement over a strong baseline on two language pairs (English-Italian and English-German), while also reducing the memory footprint by more than 10%.	翻訳日:2021-02-04 05:08:39 公開日:2021-02-02
# (参考訳) アトラスアウェア ConvNet for 正確かつロバストな解剖学的セグメンテーション Atlas-aware ConvNet for Accurate yet Robust Anatomical Segmentation ( http://arxiv.org/abs/2102.01256v1 ) ライセンス: CC BY 4.0	Yuan Liang, Weinan Song, Jiawei Yang, Liang Qiu, Kun Wang, Lei He	(参考訳) 畳み込みネットワーク(ConvNets)は、様々な解剖学的セグメンテーションタスクの有望な精度を達成しました。成功にもかかわらず、これらの手法はデータの出現変動に敏感である。アーティファクト,病理,スキャン設定によるスキャンの大きな変動を考えると,ロバストなConvNetは臨床応用には不可欠だが,十分に調査されていない。本稿では,画像スキャン間の解剖学的不変性に対するConvNetの認識を可能にすることで,課題を軽減することを提案する。具体的には,局所接続条件付き確率場(CRF)上の予測に対する明示的な制約として確率的アトラス事前分布を組み込んだ完全畳み込み制約導入モジュール(CAM)を導入し,ラベリング出力の解剖学的一貫性を効果的に強化する。さまざまなConvNetのブーストに柔軟に対応できるCAMを設計し、最適な性能につながるフュージョンパラメータをConvNetとの共同最適化にコンパクトにします。このようなアトラス事前分布の融合の利点が2つあることを、2つの脳パーセレーションタスクで示す。まず,予測の構造的異常を著しく低減し,両データセットのConvNetに基づく手法間の最先端の精度を実現する。第2に、既存のConvNetのロバスト性を大きく向上させることができる。(i) 合成病理によるスキャンのテスト、(ii) データセットをまたいだ異なるスキャンセットアップのスキャンのトレーニングと評価。提案手法は,CAMを微調整し,精度とロバスト性の向上を図ることで,既存のConvNetに容易に適用できることを示唆している。 Convolutional networks (ConvNets) have achieved promising accuracy for various anatomical segmentation tasks. Despite the success, these methods can be sensitive to data appearance variations. Considering the large variability of scans caused by artifacts, pathologies, and scanning setups, robust ConvNets are vital for clinical applications, but have not been fully explored. In this paper, we propose to mitigate the challenge by enabling ConvNets' awareness of the underlying anatomical invariances among imaging scans. Specifically, we introduce a fully convolutional Constraint Adoption Module (CAM) that incorporates probabilistic atlas priors as explicit constraints for predictions over a locally connected Conditional Random Field (CRF), which effectively reinforces the anatomical consistency of the labeling outputs. We design the CAM to be flexible for boosting various ConvNets, and compact for co-optimizing with ConvNets for fusion parameters that lead to the optimal performance. 
On two brain parcellation tasks, we show that the advantage of such atlas-prior fusion is two-fold. First, our models achieve state-of-the-art accuracy among ConvNet-based methods on both datasets by significantly reducing structural abnormalities in the predictions. Second, we can largely boost the robustness of existing ConvNets, as shown by (i) testing on scans with synthetic pathologies, and (ii) training and evaluating on scans from different scanning setups across datasets. Our method can be readily adopted by existing ConvNets by fine-tuning with the CAM plugged in, boosting accuracy and robustness.	翻訳日:2021-02-04 04:59:22 公開日:2021-02-02
# (参考訳) 画像認識のための方向畳み込みネットワーク Orientation Convolutional Networks for Image Recognition ( http://arxiv.org/abs/2102.01523v1 ) ライセンス: CC BY 4.0	Yalan Qin, Guorui Feng, Hanzhou Wu, Yanli Ren and Xinpeng Zhang	(参考訳) ディープ畳み込みニューラルネットワーク(DCNN)は強力な画像表現を得ることができ、画像認識に大きな注目を集めている。しかし、それらは内部機構による方向変換のモデリングに制限がある。本稿では,提案したLandmark Gabor Filters (LGFs) に基づく画像認識のためのOCN(Orientation Convolution Networks)を開発し,学習表現の方向性変化に対する堅牢性を向上させる。畳み込みフィルタをLGFで変調することにより、OCNは既存のディープラーニングネットワークと互換性を持つことができる。 LGF は Gabor フィルタバンクとして機能し、$p$ $(\ll n)$ 個の代表的な Gabor フィルタをランドマークとして選択し、元の Gabor フィルタをこれらのランドマークの疎線型結合として表現する。具体的には、行列ファクタリゼーションフレームワークに基づいて、スパース性および低ランク制約によるオリジナルのGaborフィルタのローカルおよびグローバル構造に対する柔軟な統合が利用される。低ランク構造の伝播により、元のGaborフィルタバンクの表現に対応するスパース性を著しく高めることができる。いくつかのベンチマークによる実験結果から,本手法はオリエンテーションに対する感度が低く,従来手法に比べて精度とコストが向上することが示された。さらに、OCNには学習するパラメータがほとんどなく、トレーニングネットワークの複雑さを大幅に削減できます。 Deep Convolutional Neural Networks (DCNNs) are capable of obtaining powerful image representations, which have attracted great attention in image recognition. However, they are limited in modeling orientation transformation by the internal mechanism. In this paper, we develop Orientation Convolution Networks (OCNs) for image recognition based on the proposed Landmark Gabor Filters (LGFs), so that the robustness of the learned representation against changes of orientation can be enhanced. By modulating the convolutional filter with LGFs, OCNs can be compatible with any existing deep learning network. LGFs act as a Gabor filter bank achieved by selecting $p$ $(\ll n)$ representative Gabor filters as landmarks and expressing the original Gabor filters as sparse linear combinations of these landmarks. Specifically, based on a matrix factorization framework, a flexible integration for the local and the global structure of original Gabor filters by sparsity and low-rank constraints is utilized. 
With the propagation of the low-rank structure, the corresponding sparsity for representation of the original Gabor filter bank can be significantly promoted. Experimental results over several benchmarks demonstrate that our method is less sensitive to orientation and produces higher performance in both accuracy and cost, compared with existing state-of-the-art methods. Besides, our OCNs have few parameters to learn and can significantly reduce the complexity of training the network.	翻訳日:2021-02-04 04:43:41 公開日:2021-02-02
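The abstract above describes modulating convolutional filters with a bank of Gabor filters. As a rough illustration only, the sketch below builds a plain Gabor bank over evenly spaced orientations and applies an element-wise modulation. It does not implement the landmark selection or the sparse, low-rank factorization of the paper; the function names, the default parameters (`sigma`, `lambd`, `gamma`, `psi`) and the element-wise form of the modulation are all assumptions made for this example.

```python
import numpy as np

def gabor_kernel(size, theta, sigma=2.0, lambd=4.0, gamma=0.5, psi=0.0):
    """Real part of a 2-D Gabor filter at orientation theta (radians).

    Parameter defaults are illustrative assumptions, not values from the paper.
    """
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1].astype(float)
    # Rotate the coordinate frame by theta.
    x_rot = x * np.cos(theta) + y * np.sin(theta)
    y_rot = -x * np.sin(theta) + y * np.cos(theta)
    # Gaussian envelope times a cosine carrier along the rotated x axis.
    envelope = np.exp(-(x_rot**2 + (gamma * y_rot)**2) / (2.0 * sigma**2))
    carrier = np.cos(2.0 * np.pi * x_rot / lambd + psi)
    return envelope * carrier

def gabor_bank(n_orientations=8, size=7):
    """Stack of Gabor kernels at evenly spaced orientations in [0, pi)."""
    thetas = np.arange(n_orientations) * np.pi / n_orientations
    return np.stack([gabor_kernel(size, t) for t in thetas])

def modulate(conv_filter, bank):
    """Element-wise modulation of one filter by every kernel in the bank
    (an assumed form of modulation, used here only for illustration)."""
    return conv_filter[None, :, :] * bank
```

Calling `modulate(w, gabor_bank())` on a 7x7 filter `w` yields eight orientation-specific copies of that filter, which is the kind of enrichment the abstract refers to.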
# (参考訳) Occluded Video Instance Segmentation Occluded Video Instance Segmentation ( http://arxiv.org/abs/2102.01558v1 ) ライセンス: CC BY 4.0	Jiyang Qi, Yan Gao, Xiaoyu Liu, Yao Hu, Xinggang Wang, Xiang Bai, Philip H.S. Torr, Serge Belongie, Alan Yuille, Song Bai	(参考訳) 映像理解システムは,シーン内に重い咬合が存在する場合,物体を知覚できるのか? この質問に答えるために、OVISと呼ばれる大規模データセットを収集し、ビデオインスタンスのセグメンテーション、すなわち、インクルードされたシーンでインスタンスを検出し、セグメンテーションし、追跡します。 OVISは25のセマンティックカテゴリから296kの高品質のインスタンスマスクで構成されており、オブジェクト閉塞は通常発生します。人間の視覚システムは文脈的推論と関連づけによってこれらを理解できるが、実験は現在の映像理解システムが満足していないことを示唆する。 OVISデータセットでは、最先端のアルゴリズムによって達成された最高のAPはわずか14.4であり、実際のシナリオでオブジェクト、インスタンス、ビデオを理解するための初期段階にあることを明らかにしています。また,閉塞による物体の欠落を補うために,時間的特徴キャリブレーションと呼ばれるプラグアンドプレイモジュールを提案する。 MaskTrack R-CNN と SipMask をベースに構築され、AP はそれぞれ 15.2 と 15.0 である。 OVISデータセットはhttp://songbai.site/ovis でリリースされる。 Can our video understanding systems perceive objects when a heavy occlusion exists in a scene? To answer this question, we collect a large scale dataset called OVIS for occluded video instance segmentation, that is, to simultaneously detect, segment, and track instances in occluded scenes. OVIS consists of 296k high-quality instance masks from 25 semantic categories, where object occlusions usually occur. While our human vision systems can understand those occluded instances by contextual reasoning and association, our experiments suggest that current video understanding systems are not satisfying. On the OVIS dataset, the highest AP achieved by state-of-the-art algorithms is only 14.4, which reveals that we are still at a nascent stage for understanding objects, instances, and videos in a real-world scenario. Moreover, to complement missing object cues caused by occlusion, we propose a plug-and-play module called temporal feature calibration. Built upon MaskTrack R-CNN and SipMask, we report an AP of 15.2 and 15.0 respectively. The OVIS dataset is released at http://songbai.site/ovis , and the project code will be available soon.	翻訳日:2021-02-04 04:27:15 公開日:2021-02-02
# (参考訳) Smoothness-Induction Sequential Variational Auto-Encoderによる時系列異常検出 Anomaly Detection of Time Series with Smoothness-Inducing Sequential Variational Auto-Encoder ( http://arxiv.org/abs/2102.01331v1 ) ライセンス: CC BY 4.0	Longyuan Li, Junchi Yan, Haiyang Wang, and Yaohui Jin	(参考訳) ディープジェネレーションモデルは、潜在表現の学習と時系列の複雑な依存性のモデリングにおけるその効果を実証している。本稿では,多次元時系列のロバストな推定と異常検出のためのスムースネス誘導逐次変分自動エンコーダ(SISVAE)モデルを提案する。我々のモデルは変分オートエンコーダ(VAE)に基づいており、そのバックボーンはリカレントニューラルネットワークによって実行され、生成モデルと推論モデルの両方において時系列の潜時構造をキャプチャする。具体的には,各タイムスタンプの平均と分散をフレキシブルニューラルネットワークでパラメータ化することで,既存のマルコフモデルで一般的である一定ノイズを仮定せずに動作可能な非定常モデルを実現する。しかし、そのような柔軟性はモデルに異常を生じさせる可能性がある。また,検出作業の便益となるロバストな密度推定を実現するため,推定よりもスムーズな事前推定法を提案する。提案された先行作業は、非平滑な再構築でペナルティを課す正規化として機能する。本モデルは,新しい確率勾配変動ベイズ推定器を用いて効率よく学習する。特に, 異常検出の判定基準として, 再構成確率と再構成誤差の2つを検討した。合成データセットと公開実世界のベンチマークの両方において,本モデルの有効性を示す。 Deep generative models have demonstrated their effectiveness in learning latent representation and modeling complex dependencies of time series. In this paper, we present a Smoothness-Inducing Sequential Variational Auto-Encoder (SISVAE) model for robust estimation and anomaly detection of multi-dimensional time series. Our model is based on Variational Auto-Encoder (VAE), and its backbone is fulfilled by a Recurrent Neural Network to capture latent temporal structures of time series for both generative model and inference model. Specifically, our model parameterizes mean and variance for each time-stamp with flexible neural networks, resulting in a non-stationary model that can work without the assumption of constant noise as commonly made by existing Markov models. However, such a flexibility may cause the model fragile to anomalies. To achieve robust density estimation which can also benefit detection tasks, we propose a smoothness-inducing prior over possible estimations. The proposed prior works as a regularizer that places penalty at non-smooth reconstructions. 
Our model is learned efficiently with a novel stochastic gradient variational Bayes estimator. In particular, we study two decision criteria for anomaly detection: reconstruction probability and reconstruction error. We show the effectiveness of our model on both synthetic datasets and public real-world benchmarks.	翻訳日:2021-02-04 03:39:46 公開日:2021-02-02
# (参考訳) 分布外入力検出のための疑似ベイズ型ニューラルネットワーク pseudo-Bayesian Neural Networks for detecting Out of Distribution Inputs ( http://arxiv.org/abs/2102.01336v1 ) ライセンス: CC BY-SA 4.0	Gagandeep Singh, Deepak Mishra	(参考訳) 従来のベイジアンニューラルネットワーク(BNN)は、単一の入力に対して複数の出力を提供できることが知られており、そのバリエーションは分布外(OOD)入力を検出するために利用することができる。 BNNは、優先順位の選択に対する感度のために訓練が困難である。そこで本研究では,重みに対する分布を学習する代わりに,推定時に点推定と摂動重みを用いる擬似BNNを提案する。従来のBNNのコスト関数を変更し、ポイント推定によりニューラルネットワークの重みのそれぞれにランダムな摂動の適切な量を注入する目的でパラメータを学習する。 In Distribution(ID)入力から複数の出力を用いてOOD入力を効果的に分離するために、確率分布の分散とエントロピーの指標から導出した2つの尺度を提案し、提案した擬似BNNと組み合わせる。全体として、この組み合わせは推論時にOODサンプルを検出する原則化された技術をもたらす。本手法は,多種多様なニューラルネットワークアーキテクチャと画像分類データセット上で評価する。提案手法は, 入力ごとに2～5個の重みサンプルを用いるだけで, 95% TPRにおけるFPR, AUROC, AUPR, Detection Errorなどの様々な指標において最先端の結果を達成し, 関連する従来手法を上回ることを示す。 Conventional Bayesian Neural Networks (BNNs) are known to be capable of providing multiple outputs for a single input, the variations in which can be utilised to detect Out of Distribution (OOD) inputs. BNNs are difficult to train due to their sensitivity towards the choice of priors. To alleviate this issue, we propose pseudo-BNNs where instead of learning distributions over weights, we use point estimates and perturb weights at the time of inference. We modify the cost function of conventional BNNs and use it to learn parameters for the purpose of injecting the right amount of random perturbation into each of the weights of a neural network with point estimates. In order to effectively segregate OOD inputs from In Distribution (ID) inputs using multiple outputs, we further propose two measures, derived from the index of dispersion and entropy of probability distributions, and combine them with the proposed pseudo-BNNs. Overall, this combination results in a principled technique to detect OOD samples at the time of inference. We evaluate our technique on a wide variety of neural network architectures and image classification datasets. 
We observe that our method achieves state-of-the-art results and beats the related previous work on various metrics such as FPR at 95% TPR, AUROC, AUPR and Detection Error by using just 2 to 5 samples of weights per input.	翻訳日:2021-02-04 02:57:31 公開日:2021-02-02
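The abstract above says the two OOD measures are derived from the index of dispersion and the entropy of the output distributions obtained from a few perturbed-weight forward passes. The sketch below computes the textbook versions of both quantities on a small array of softmax outputs. The exact measures used in the paper are not given in the abstract, so the function names, the averaging over classes and the small stabilising constant are assumptions.

```python
import numpy as np

def predictive_entropy(probs):
    """Entropy of the mean class distribution over weight samples.

    probs: array of shape (n_samples, n_classes); each row sums to 1.
    """
    mean = probs.mean(axis=0)
    # 1e-12 avoids log(0); it is a numerical convenience, not part of the definition.
    return float(-np.sum(mean * np.log(mean + 1e-12)))

def index_of_dispersion(probs):
    """Variance-to-mean ratio across weight samples, averaged over classes
    (the averaging step is an assumption made for this sketch)."""
    mean = probs.mean(axis=0)
    var = probs.var(axis=0)
    return float(np.mean(var / (mean + 1e-12)))

# Five perturbed-weight passes that agree (in-distribution-like behaviour) ...
agreeing = np.tile([0.9, 0.05, 0.05], (5, 1))
# ... and five passes that disagree strongly (OOD-like behaviour).
disagreeing = np.array([[1, 0, 0], [0, 1, 0], [0, 0, 1],
                        [1, 0, 0], [0, 1, 0]], dtype=float)
```

Both scores are higher for `disagreeing` than for `agreeing`, so thresholding either one gives a simple OOD detector in this toy setting.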
# (参考訳) ニューラルネットワークによるグラフ粗粒化 Graph Coarsening with Neural Networks ( http://arxiv.org/abs/2102.01350v1 ) ライセンス: CC BY 4.0	Chen Cai, Dingkang Wang, Yusu Wang	(参考訳) 大規模グラフがますます普及するにつれて、大規模グラフデータの処理、抽出、分析に重要な計算上の課題が生じる。グラフ粗大化は、重要な特性を維持しながらグラフのサイズを減らすための一般的なテクニックの1つです。リッチなグラフ粗い文献にもかかわらず、この分野におけるデータ駆動メソッドの探索は限られている。本研究では,グラフ粗化のためのグラフの深層学習の最近の進歩を活用する。我々はまず,粗いアルゴリズムの品質を測定するためのフレームワークを提案し,目標に応じて粗いグラフ上のLaplace演算子と関連するプロジェクション/リフト演算子を慎重に選択する必要があることを示した。粗いグラフに対する現在のエッジウェイト選択が準最適である可能性が示唆され、グラフニューラルネットワークを用いて重み付けマップをパラメータ化し、教師なし方法で粗い品質を改善するよう訓練する。本手法は, 合成ネットワークと実ネットワークの両方における広範な実験により, 還元率, グラフサイズ, グラフタイプなど, 一般的なグラフ粗さ化手法を大幅に改善できることを実証した。これは、より大きなサイズのグラフ(25\times$ of training graphs)に一般化し、異なる損失(微分可能かつ非微分可能)に適応し、より大きなグラフにスケールする。 As large-scale graphs become increasingly more prevalent, it poses significant computational challenges to process, extract and analyze large graph data. Graph coarsening is one popular technique to reduce the size of a graph while maintaining essential properties. Despite rich graph coarsening literature, there is only limited exploration of data-driven methods in the field. In this work, we leverage the recent progress of deep learning on graphs for graph coarsening. We first propose a framework for measuring the quality of coarsening algorithm and show that depending on the goal, we need to carefully choose the Laplace operator on the coarse graph and associated projection/lift operators. Motivated by the observation that the current choice of edge weight for the coarse graph may be sub-optimal, we parametrize the weight assignment map with graph neural networks and train it to improve the coarsening quality in an unsupervised way. Through extensive experiments on both synthetic and real networks, we demonstrate that our method significantly improves common graph coarsening methods under various metrics, reduction ratios, graph sizes, and graph types. 
It generalizes to graphs of larger size ($25\times$ the size of the training graphs), adapts to different losses (differentiable and non-differentiable), and scales to much larger graphs than previous work.	翻訳日:2021-02-04 02:45:47 公開日:2021-02-02
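The abstract above notes that the Laplace operator and the edge weights of the coarse graph have to be chosen with care. For orientation, the sketch below shows one common baseline choice: nodes are merged by a hard assignment and the coarse combinatorial Laplacian is taken as `P.T @ L @ P`, which sums the edge weights between supernodes. This is not the learned weight assignment of the paper; the function names and the hard partition are assumptions for this example.

```python
import numpy as np

def laplacian(adj):
    """Combinatorial Laplacian L = D - A of a weighted undirected graph."""
    return np.diag(adj.sum(axis=1)) - adj

def coarsen(adj, assignment, n_super):
    """Merge nodes by a hard assignment (node i goes to supernode assignment[i]).

    Returns the coarse Laplacian P^T L P and the matching coarse adjacency.
    """
    n = adj.shape[0]
    P = np.zeros((n, n_super))
    P[np.arange(n), assignment] = 1.0
    L_coarse = P.T @ laplacian(adj) @ P
    # Off-diagonal entries of L_coarse are the negated summed edge weights.
    adj_coarse = -L_coarse.copy()
    np.fill_diagonal(adj_coarse, 0.0)
    return L_coarse, adj_coarse

# Path graph 0-1-2-3, coarsened into supernodes {0, 1} and {2, 3}.
path = np.array([[0, 1, 0, 0],
                 [1, 0, 1, 0],
                 [0, 1, 0, 1],
                 [0, 0, 1, 0]], dtype=float)
L_c, A_c = coarsen(path, np.array([0, 0, 1, 1]), n_super=2)
```

A data-driven method in the spirit of the abstract would replace the summed weights in `A_c` with weights predicted by a graph neural network.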
# (参考訳) 骨格と成分特徴に基づくグラフ分類 Graph Classification Based on Skeleton and Component Features ( http://arxiv.org/abs/2102.01428v1 ) ライセンス: CC BY 4.0	Xue Liu, Wei Wei, Xiangnan Feng, Xiaobo Cao, Dan Sun	(参考訳) グラフ埋め込みを学習するためのほとんどの既存の一般的な方法は、固定順序のグローバル構造の特徴と構造階層表現の欠如のみを考慮します。この弱点に対処するため、匿名のランダムウォークで学習した定階構造と異なるサイズのサブグラフを用いたコンポーネント情報を用いて、骨格情報に基づく分類を実現するグラフ埋め込みアルゴリズムGraphCSCを提案する。 2つのグラフは、スケルトンとコンポーネントの両方が類似している場合に類似しているため、私たちのモデルでは、両方のグラフをグラフの均質性特性として埋め込みに統合します。最新のベースラインの包括的なリストと比較し、異なるデータセット上でモデルを示すとともに、実世界のグラフ分類タスクにおいて、私たちの研究が優れていることを実験で示します。 Most existing popular methods for learning graph embedding only consider fixed-order global structural features and lack structures hierarchical representation. To address this weakness, we propose a novel graph embedding algorithm named GraphCSC that realizes classification based on skeleton information using fixed-order structures learned in anonymous random walks manner, and component information using different size subgraphs. Two graphs are similar if their skeletons and components are both similar, thus in our model, we integrate both of them together into embeddings as graph homogeneity characterization. We demonstrate our model on different datasets in comparison with a comprehensive list of up-to-date state-of-the-art baselines, and experiments show that our work is superior in real-world graph classification tasks.	翻訳日:2021-02-04 02:15:16 公開日:2021-02-02
# (参考訳) ニューラルネットワークを用いた無補間センサのリアルタイム検出 Real-time detection of uncalibrated sensors using Neural Networks ( http://arxiv.org/abs/2102.01565v1 ) ライセンス: CC BY 4.0	Luis J. Mu\~noz-Molina, Ignacio Cazorla-Pi\~nar, Juan P. Dominguez-Morales, Fernando Perez-Pe\~na	(参考訳) 現在、センサは、科学、産業、日常生活など、その使用の恩恵を受けるいくつかのコンテキストにおいて重要な役割を果たす。しかし、取得した情報は信頼できるものでなければならない。センサの挙動の異常は、科学プロジェクトを台無しにしたり、工業生産ラインにおける生産の質を損なうなどの重大な結果をもたらす可能性がある。より微妙な種類の異常の1つは不均衡である。地上真理値に応じてキャリブレーションによりセンサが調整または標準化されていない場合、不校正が行われると言われる。本研究では,オンライン学習に基づく温度・湿度・圧力センサの非校正検出装置を開発した。このソリューションはニューラルネットワークをメインコンポーネントとして統合し、校正条件下でのセンサーの動作から学習する。そして、トレーニングとデプロイの後、一度発生した未校正を検出する。その結果, 提案手法は, 偏差値0.25度, 1% RH, 1.5Paの偏差をそれぞれ検出できることがわかった。このソリューションは、新しいセンサーの追加、新しい環境へのデプロイ、最小限のデータ量でモデルのトレーニングを可能にするトランスファーラーニングによって異なるコンテキストに適応することができる。 Nowadays, sensors play a major role in several contexts, such as science, industry and daily life, which benefit from their use. However, the retrieved information must be reliable. Anomalies in the behavior of sensors can give rise to critical consequences such as ruining a scientific project or jeopardizing the quality of the production in industrial production lines. One of the more subtle kinds of anomaly is uncalibration. An uncalibration is said to take place when the sensor is not adjusted or standardized by calibration according to a ground truth value. In this work, an online machine-learning based uncalibration detector for temperature, humidity and pressure sensors was developed. This solution integrates an Artificial Neural Network as its main component, which learns from the behavior of the sensors under calibrated conditions. Then, after being trained and deployed, it detects uncalibrations once they take place. The obtained results show that the proposed solution is able to detect uncalibrations for deviation values of 0.25 degrees, 1% RH and 1.5 Pa, respectively. 
This solution can be adapted to different contexts by means of transfer learning, whose application allows for the addition of new sensors, the deployment into new environments and the retraining of the model with minimum amounts of data.	翻訳日:2021-02-04 02:02:45 公開日:2021-02-02
# (参考訳) エージェントインセンティブ:因果的視点 Agent Incentives: A Causal Perspective ( http://arxiv.org/abs/2102.01685v1 ) ライセンス: CC BY 4.0	Tom Everitt, Ryan Carey, Eric Langlois, Pedro A Ortega, Shane Legg	(参考訳) 因果関係図を用いてエージェントインセンティブを分析するためのフレームワークを提案する。我々は、情報の価値に関する有名な基準が完成していると断定する。制御値に対する新たなグラフィカル基準を提案し、その健全性と完全性を確立します。また、環境の変化が最適な決定に影響を与えるかを示す応答インセンティブと、エージェントが変数 X を介してその有用性に影響を与えることができるかどうかを決定する機器制御インセンティブの2つの新しい概念を紹介します。両方の新しい概念について、私たちはサウンドと完全なグラフィカルな基準を提供します。これらの結果がAIシステムの安全性と公平性を評価するのにどのように役立つかを例に示します。 We present a framework for analysing agent incentives using causal influence diagrams. We establish that a well-known criterion for value of information is complete. We propose a new graphical criterion for value of control, establishing its soundness and completeness. We also introduce two new concepts for incentive analysis: response incentives indicate which changes in the environment affect an optimal decision, while instrumental control incentives establish whether an agent can influence its utility via a variable X. For both new concepts, we provide sound and complete graphical criteria. We show by example how these results can help with evaluating the safety and fairness of an AI system.	翻訳日:2021-02-04 01:50:28 公開日:2021-02-02
# (参考訳) WeNet: プロダクションファーストとプロダクションレディエンドツーエンドの音声認識ツールキット WeNet: Production First and Production Ready End-to-End Speech Recognition Toolkit ( http://arxiv.org/abs/2102.01547v1 ) ライセンス: CC BY 4.0	Binbin Zhang, Di Wu, Chao Yang, Xiaoyu Chen, Zhendong Peng, Xiangming Wang, Zhuoyuan Yao, Xiong Wang, Fan Yu, Lei Xie, Xin Lei	(参考訳) 本稿では、WeNetという新しいオープンソース、プロダクションファースト、プロダクション対応のエンドツーエンド(E2E)音声認識ツールキットを紹介します。 WeNetの主な動機は、E2E音声認識モデルの研究と製造の間のギャップを埋めることです。 WeNetは、ASRアプリケーションを複数の実世界のシナリオで展開する効率的な方法を提供しており、これは他のオープンソースのE2E音声認識ツールキットの主な違いと利点である。本稿では、モデルアーキテクチャ、フレームワーク設計、パフォーマンスメトリクスを含む3つの側面からWeNetを紹介します。 WeNetを用いたAISHELL-1の実験では、統一されたストリーミングおよび非ストリーミング2パス(U2)E2Eモデル上で有望な文字誤り率(CER)を与えるだけでなく、合理的なRTFとレイテンシも示しています。このツールキットはhttps://github.com/mobvoi/wenetで公開されている。 In this paper, we present a new open source, production first and production ready end-to-end (E2E) speech recognition toolkit named WeNet. The main motivation of WeNet is to close the gap between the research and the production of E2E speech recognition models. WeNet provides an efficient way to ship ASR applications in several real-world scenarios, which is the main difference and advantage to other open source E2E speech recognition toolkits. This paper introduces WeNet from three aspects, including model architecture, framework design and performance metrics. Our experiments on AISHELL-1 using WeNet, not only give a promising character error rate (CER) on a unified streaming and non-streaming two pass (U2) E2E model but also show reasonable RTF and latency, both of these aspects are favored for production adoption. The toolkit is publicly available at https://github.com/mobvoi/wenet.	翻訳日:2021-02-04 01:26:10 公開日:2021-02-02
# (参考訳) 連続的な手振りで話し、調音音声シンセサイザーを制御する SPEAK WITH YOUR HANDS Using Continuous Hand Gestures to control Articulatory Speech Synthesizer ( http://arxiv.org/abs/2102.01640v1 ) ライセンス: CC BY 4.0	Pramit Saha, Debasish Ray Mohapatra, Sidney Fels	(参考訳) 本稿では,手のジェスチャーによる調音音声合成エンジン Pink Trombone の制御の進歩について述べる。声道領域機能に基づく音声合成による連続指の動きと手首屈曲を連続音声に変換する。私たちは、仮想舌を制御するために、手首と個々の指の運動情報をキャプチャするために18のセンサーを備えたCyberglove IIを使用します。センサーの座標と曲げ値は、ノイズの多い値と外れ値を滑らかにするスプライン舌モデルに適合するために利用されます。上口蓋を固定とし,スプラインモデルを声道の動的下面(舌)として考慮し,Pink Tromboneに供給される1次元領域関数値を計算し,連続的な発声音を生成する。したがって、手首と指を操作することを学ぶことによって、声道を使用する必要なしに、単に自分の手を通して音声音を生成することを学ぶことができます。 This work presents our advancements in controlling an articulatory speech synthesis engine, viz. Pink Trombone, with hand gestures. Our interface translates continuous finger movements and wrist flexion into continuous speech using vocal tract area-function based articulatory speech synthesis. We use Cyberglove II with 18 sensors to capture the kinematic information of the wrist and the individual fingers, in order to control a virtual tongue. The coordinates and the bending values of the sensors are then utilized to fit a spline tongue model that smoothens out the noisy values and outliers. Considering the upper palate as fixed and the spline model as the dynamically moving lower surface (tongue) of the vocal tract, we compute 1D area functional values that are fed to the Pink Trombone, generating continuous speech sounds. Therefore, by learning to manipulate one's wrist and fingers, one can learn to produce speech sounds just through one's hands, without the need for using the vocal tract.	翻訳日:2021-02-04 01:15:32 公開日:2021-02-02
# (参考訳) 時間適応ガウスモデル Time Adaptive Gaussian Model ( http://arxiv.org/abs/2102.01238v1 ) ライセンス: CC BY 4.0	Federico Cieca, Veronica Tozzo	(参考訳) 多変量時系列分析は、データ分析パイプラインの不可欠な部分になりつつある。コ変数間の個々のタイムポイント接続と、これらの接続が時間内でどのように変化するかを理解することは簡単ではない。そこで本研究では,隠れマルコフモデルとガウスグラフィックモデル-時間適応ガウスモデル(TAGM)を活用した新しい手法を提案する。本モデルは時間的グラフィカルモデルの推論のための最先端手法の一般化であり,その定式化は,現在の手法よりも優れた結果を提供するモデルの両側面を活用している。特に、時間内にデータポイントをクラスタリングすることでパターン認識を行い、観察された変数間の確率的(そしておそらく因果関係)の関係を見出す。時間的ネットワーク推論の現在の方法と比較して、良い推論性能を示しながら基本的な仮定を減らします。 Multivariate time series analysis is becoming an integral part of data analysis pipelines. Understanding the individual time point connections between covariates as well as how these connections change in time is non-trivial. To this aim, we propose a novel method that leverages Hidden Markov Models and Gaussian Graphical Models -- Time Adaptive Gaussian Model (TAGM). Our model is a generalization of state-of-the-art methods for the inference of temporal graphical models; its formulation leverages both aspects of these models, providing better results than current methods. In particular, it performs pattern recognition by clustering data points in time, and it finds probabilistic (and possibly causal) relationships among the observed variables. Compared to current methods for temporal network inference, it reduces the basic assumptions while still showing good inference performance.	翻訳日:2021-02-04 00:53:48 公開日:2021-02-02
# (参考訳) グラフモデルを用いたガウス専門家の選択 Gaussian Experts Selection using Graphical Models ( http://arxiv.org/abs/2102.01496v1 ) ライセンス: CC BY 4.0	Hamed Jalali, Martin Pawelczyk, Gjerji Kasneci	(参考訳) 局所近似はガウス過程(GP)をビッグデータに拡張する一般的な手法である。ローカル近似は、元のデータセットをサブセットに分割し、各サブセットでローカルエキスパートをトレーニングすることで、時間の複雑さを低減する。専門家の予測の集約は、専門家間の条件依存または独立を仮定して行われる。専門家間の \emph{conditional independent assumption} (CI) を課すと、異なる専門家の予測の集約が、不確実性の定量化のコストで時間効率良く行われる。一方、モデルに依存する専門家は、非現実的に高い計算コストを犠牲にして正確な予測と不確実性定量を提供することができる。理論ガイドによる専門家選定ステップを通じて弱い専門家を排除することにより、依存専門家を集約する計算コストを大幅に削減し、校正された不確実性の定量化を確保します。専門家間の条件付き依存関係をエンコードするスパース精度行列を使用して,最も重要な専門家を選択することで,無向なグラフィカルモデルに関する文献の手法を活用する。モレロフ Local approximations are popular methods to scale Gaussian processes (GPs) to big data. Local approximations reduce time complexity by dividing the original dataset into subsets and training a local expert on each subset. Aggregating the experts' prediction is done assuming either conditional dependence or independence between the experts. Imposing the \emph{conditional independence assumption} (CI) between the experts renders the aggregation of different expert predictions time efficient at the cost of poor uncertainty quantification. On the other hand, modeling dependent experts can provide precise predictions and uncertainty quantification at the expense of impractically high computational costs. By eliminating weak experts via a theory-guided expert selection step, we substantially reduce the computational cost of aggregating dependent experts while ensuring calibrated uncertainty quantification. We leverage techniques from the literature on undirected graphical models, using sparse precision matrices that encode conditional dependencies between experts to select the most important experts. Moreov	翻訳日:2021-02-04 00:41:35 公開日:2021-02-02
# (参考訳) 確率勾配を持つ正確なランゲビンダイナミクス Exact Langevin Dynamics with Stochastic Gradients ( http://arxiv.org/abs/2102.01691v1 ) ライセンス: CC BY-SA 4.0	Adri\`a Garriga-Alonso and Vincent Fortuin	(参考訳) 確率勾配マルコフチェーンモンテカルロアルゴリズムは近似推論のための一般的なサンプラーであるが、一般的に偏見がある。これらの方法の最近のバージョンの多くを示しています(例)。チェンら。 (2014) は、受け入れ確率が常にゼロであるため、メトロポリス・ハスティングによる拒絶サンプリングでは修正できない。確率勾配Langevinダイナミクス(Welling and Teh, 2011)とハミルトンモンテカルロを一般化するGradient-Guided Monte Carlo (Horowitz, 1991)のような、実現可能な後方方向の軌道を持つサンプラーを使用することで、これを修正できます。このサンプルは確率勾配で使用することができ、複数のステップにわたって計算できる非ゼロ受容確率が得られることを示す。 Stochastic gradient Markov Chain Monte Carlo algorithms are popular samplers for approximate inference, but they are generally biased. We show that many recent versions of these methods (e.g. Chen et al. (2014)) cannot be corrected using Metropolis-Hastings rejection sampling, because their acceptance probability is always zero. We can fix this by employing a sampler with realizable backwards trajectories, such as Gradient-Guided Monte Carlo (Horowitz, 1991), which generalizes stochastic gradient Langevin dynamics (Welling and Teh, 2011) and Hamiltonian Monte Carlo. We show that this sampler can be used with stochastic gradients, yielding nonzero acceptance probabilities, which can be computed even across multiple steps.	翻訳日:2021-02-04 00:24:32 公開日:2021-02-02
# (参考訳) 対話型再構成によるジェネラティブモデルの解釈可能性評価 Evaluating the Interpretability of Generative Models by Interactive Reconstruction ( http://arxiv.org/abs/2102.01264v1 ) ライセンス: CC BY 4.0	Andrew Slavin Ross, Nina Chen, Elisa Zhao Hang, Elena L. Glassman, Finale Doshi-Velez	(参考訳) 機械学習モデルが多数の社会技術システムで最も有用であるためには、多くはそれらが人間に解釈可能でなければならないと主張した。しかし、解釈可能性への関心が高まりつつあるにもかかわらず、その測定方法に関する確固たるコンセンサスはいまだにない。これは表現学習において特に当てはまり、解釈可能性の研究は、合成データセットにのみ適用され、人間の要因に基づかない「偏角」測定に焦点を当てている。生成モデル表現の人間解釈可能性を定量化するタスクを導入し、ユーザが対話的に表現を修正してターゲットインスタンスを再構築する。合成データセットでは、このタスクの性能がベースラインアプローチよりもはるかに確実に絡み合ったモデルと絡み合ったモデルを区別する。実際のデータセットでは、広く信じられているが、多かれ少なかれ解釈可能なモデルを生成することが示されない表現学習方法の違いを見出す。いずれの場合も、Amazon Mechanical Turkに関する小規模のシンクアルード研究と大規模実験を実施し、定性的および定量的な結果が一致したことを確認しました。 For machine learning models to be most useful in numerous sociotechnical systems, many have argued that they must be human-interpretable. However, despite increasing interest in interpretability, there remains no firm consensus on how to measure it. This is especially true in representation learning, where interpretability research has focused on "disentanglement" measures only applicable to synthetic datasets and not grounded in human factors. We introduce a task to quantify the human-interpretability of generative model representations, where users interactively modify representations to reconstruct target instances. On synthetic datasets, we find performance on this task much more reliably differentiates entangled and disentangled models than baseline approaches. On a real dataset, we find it differentiates between representation learning methods widely believed but never shown to produce more or less interpretable models. In both cases, we ran small-scale think-aloud studies and large-scale experiments on Amazon Mechanical Turk to confirm that our qualitative and quantitative results agreed.	翻訳日:2021-02-03 23:55:04 公開日:2021-02-02
# (参考訳) 無線ネットワークプロトコル合成のためのマルチエージェント強化学習に向けて Towards Multi-agent Reinforcement Learning for Wireless Network Protocol Synthesis ( http://arxiv.org/abs/2102.01611v1 ) ライセンス: CC BY 4.0	Hrishikesh Dutta and Subir Biswas	(参考訳) 本稿では,無線ネットワークのためのマルチエージェント強化学習に基づくメディアアクセスフレームワークを提案する。アクセス問題はマルコフ決定プロセス(MDP)として定式化され、各ネットワークノードが分散学習エージェントとして機能する強化学習を用いて解決される。ソリューションコンポーネントは、ノードエージェントが自己複製を制御するためにMAC層パケットの負荷を制御することを漸進的に学習する単一ノードアクセスシナリオから、ステップバイステップで開発される。戦略は、より精巧な報酬構造を使用して、マルチノードの完全接続シナリオにスケールアップされます。また、より一般的な部分連結トポロジーに対する予備的実現可能性を示す。また,mac層伝送確率の調整を学べば,最適負荷時の理論的最大スループットが達成できるだけでなく,従来の手法と異なり,高い負荷条件でも最大スループットを維持できることを示した。さらに、この機能を保ちながら、そのメカニズムは異種ロードに依存しない。また、ノード間のプロトコルのアクセス優先度をパラメトリックに調整できることを示した。最後に、強化学習のオンライン学習機能により、プロトコルを時間変化の負荷条件に適応させることができることを示す。 This paper proposes a multi-agent reinforcement learning based medium access framework for wireless networks. The access problem is formulated as a Markov Decision Process (MDP), and solved using reinforcement learning with every network node acting as a distributed learning agent. The solution components are developed step by step, starting from a single-node access scenario in which a node agent incrementally learns to control MAC layer packet loads for reining in self-collisions. The strategy is then scaled up for multi-node fully-connected scenarios by using more elaborate reward structures. It also demonstrates preliminary feasibility for more general partially connected topologies. It is shown that by learning to adjust MAC layer transmission probabilities, the protocol is not only able to attain theoretical maximum throughput at an optimal load, but unlike classical approaches, it can also retain that maximum throughput at higher loading conditions. Additionally, the mechanism is agnostic to heterogeneous loading while preserving that feature. It is also shown that access priorities of the protocol across nodes can be parametrically adjusted. 
Finally, it is also shown that the online learning feature of reinforcement learning is able to make the protocol adapt to time-varying loading conditions.	翻訳日:2021-02-03 23:08:24 公開日:2021-02-02
# (参考訳) the workshop on program synthesis for scientific computing 参加報告 Report of the Workshop on Program Synthesis for Scientific Computing ( http://arxiv.org/abs/2102.01687v1 ) ライセンス: CC BY 4.0	Hal Finkel, Ignacio Laguna	(参考訳) プログラム合成は、学術、国立研究所、産業において活発な研究分野である。しかし、科学計算に直接適用できる仕事は、いくつかの印象的な成功を収めているが、制限されている。本報告は,科学計算におけるプログラム合成作業の関連分野を概観し,これまでの成功を議論し,今後の作業の機会を概説する。本報告は,2020年8月4日～5日(https://prog-synth-science.github.io/2020/)にサイエントコンピューティングのためのプログラム合成ワークショップの成果である。 Program synthesis is an active research field in academia, national labs, and industry. Yet, work directly applicable to scientific computing, while having some impressive successes, has been limited. This report reviews the relevant areas of program synthesis work for scientific computing, discusses successes to date, and outlines opportunities for future work. This report is the result of the Workshop on Program Synthesis for Scientific Computing, which was held virtually on August 4-5, 2020 (https://prog-synth-science.github.io/2020/).	翻訳日:2021-02-03 22:35:50 publication placeholder
# (参考訳) JPEGプライマリ量子化行列推定とクラスタリングによる画像スプライシング検出, 局在化, 属性化 Image Splicing Detection, Localization and Attribution via JPEG Primary Quantization Matrix Estimation and Clustering ( http://arxiv.org/abs/2102.01439v1 ) ライセンス: CC BY 4.0	Yakun Niu, Benedetta Tondi, Yao Zhao, Rongrong Ni and Mauro Barni	(参考訳) 異なる画像領域にわたる二重JPEGアーティファクトの不整合の検出は、画像スプライシングのような局所的な画像操作を検出し、それらをローカライズするためにしばしば使用される。本稿では,スプライシング領域の検出と局所化に加えて,異なるドナー画像から得られる領域を識別するエンド・ツー・エンドシステムを提案する。分割された領域と背景画像の両方が二重JPEG圧縮されていると仮定し、一次量子化行列の局所推定を用いて異なるソースから抽出されたスプライシング領域を区別する。そこで,推定された一次量子化行列に従って画像ブロックをクラスタリングし,形態的再構成により精度を向上させる。提案手法は,第2の圧縮が第1の圧縮よりも強いか弱いかに関わらず,アライメントと非アライメントの2つのJPEG圧縮を含む多種多様な設定で動作可能である。類似条件下でのベースライン法に対して優れた性能を示す広範な実験により,提案手法を検証した。 Detection of inconsistencies of double JPEG artefacts across different image regions is often used to detect local image manipulations, like image splicing, and to localize them. In this paper, we move one step further, proposing an end-to-end system that, in addition to detecting and localizing spliced regions, can also distinguish regions coming from different donor images. We assume that both the spliced regions and the background image have undergone a double JPEG compression, and use a local estimate of the primary quantization matrix to distinguish between spliced regions taken from different sources. To do so, we cluster the image blocks according to the estimated primary quantization matrix and refine the result by means of morphological reconstruction. The proposed method can work in a wide variety of settings including aligned and non-aligned double JPEG compression, and regardless of whether the second compression is stronger or weaker than the first one. We validated the proposed approach by means of extensive experiments showing its superior performance with respect to baseline methods working in similar conditions.	翻訳日:2021-02-03 21:13:15 公開日:2021-02-02
# (参考訳) U-LanD:不確実性駆動のビデオランドマーク検出 U-LanD: Uncertainty-Driven Video Landmark Detection ( http://arxiv.org/abs/2102.01586v1 ) ライセンス: CC BY 4.0	Mohammad H. Jafari, Christina Luong, Michael Tsang, Ang Nan Gu, Nathan Van Woudenberg, Robert Rohling, Teresa Tsang, Purang Abolmaesumi	(参考訳) 本稿では,ビデオ中のキーフレームとランドマークを共同検出するためのフレームワークであるU-LanDを提案する。私たちは、トレーニングラベルが騒々しく、非常にスパースな、特に困難な問題に取り組みます。 U-LanDは、重要なビデオフレームでのみ訓練された深いベイズランドマーク検出器が、それらのフレームの予測不確実性を大幅に低下させています。この観測を教師なし信号として使用し、ランドマークを検出するキーフレームを自動的に認識する。本フレームワークの試験ベッドとして,各ビデオの1フレームでのみ,スパースとノイジーな臨床ラベルが使用可能な,心臓の超音波画像を用いた。 4,493人のデータを用いて、U-LanDは、現在最先端の非ベイズ系患者よりも、R2スコアの42%という顕著な絶対的マージンで、モデルサイズにほとんどオーバーヘッドを課さないことが実証された。私たちのアプローチは汎用的で、騒々しいトレーニングラベルを持つ他の挑戦的なデータに適用できます。 This paper presents U-LanD, a framework for joint detection of key frames and landmarks in videos. We tackle a specifically challenging problem, where training labels are noisy and highly sparse. U-LanD builds upon a pivotal observation: a deep Bayesian landmark detector solely trained on key video frames, has significantly lower predictive uncertainty on those frames vs. other frames in videos. We use this observation as an unsupervised signal to automatically recognize key frames on which we detect landmarks. As a test-bed for our framework, we use ultrasound imaging videos of the heart, where sparse and noisy clinical labels are only available for a single frame in each video. Using data from 4,493 patients, we demonstrate that U-LanD can exceedingly outperform the state-of-the-art non-Bayesian counterpart by a noticeable absolute margin of 42% in R2 score, with almost no overhead imposed on the model size. Our approach is generic and can be potentially applied to other challenging data with noisy and sparse training labels.	翻訳日:2021-02-03 20:49:53 公開日:2021-02-02
# (参考訳) 局所差分プライバシーは$E_\gamma$-Divergenceの収縮と同等である Local Differential Privacy Is Equivalent to Contraction of $E_\gamma$-Divergence ( http://arxiv.org/abs/2102.01258v1 ) ライセンス: CC BY 4.0	Shahab Asoodeh, Maryam Aliakbarpour, and Flavio P. Calmon	(参考訳) ランダム化プライバシーメカニズムの局所差分プライバシー(LDP)保証について,その収縮特性を用いて検討する。まず, LDP 制約は $E_\gamma$-divergence の収縮係数で等価にキャストできることを示した。次に、この等価な式を使用して、任意の $f$-divergences の収縮係数の点でプライバシーメカニズムの LDP 保証を表現する。標準的な推定理論ツール(例えばル・カムやファノの逆手法)と組み合わせると、この結果はいくつかのテストにおいてプライバシーとユーティリティの間のトレードオフとミニマックスとベイズ推定問題を調べることができる。 We investigate the local differential privacy (LDP) guarantees of a randomized privacy mechanism via its contraction properties. We first show that LDP constraints can be equivalently cast in terms of the contraction coefficient of the $E_\gamma$-divergence. We then use this equivalent formula to express LDP guarantees of privacy mechanisms in terms of contraction coefficients of arbitrary $f$-divergences. When combined with standard estimation-theoretic tools (such as Le Cam's and Fano's converse methods), this result allows us to study the trade-off between privacy and utility in several testing and minimax and Bayesian estimation problems.	翻訳日:2021-02-03 20:25:35 公開日:2021-02-02
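The abstract above connects LDP to the $E_\gamma$-divergence. As a small numerical illustration of the underlying pointwise fact (a mechanism $K$ is $\varepsilon$-LDP exactly when $E_{e^\varepsilon}(K(\cdot|x)\,\|\,K(\cdot|x')) = 0$ for all input pairs), the sketch below evaluates the divergence for binary randomized response. It does not compute contraction coefficients, which is what the paper's result is about; the function names and the choice of randomized response as the mechanism are assumptions for this example.

```python
import numpy as np

def e_gamma(p, q, gamma):
    """E_gamma (hockey-stick) divergence between discrete distributions, gamma >= 1:
    the sum over outcomes of max(p - gamma * q, 0)."""
    p, q = np.asarray(p, dtype=float), np.asarray(q, dtype=float)
    return float(np.sum(np.maximum(p - gamma * q, 0.0)))

def randomized_response(eps):
    """Binary randomized response; row x is the output distribution K(.|x)."""
    keep = np.exp(eps) / (1.0 + np.exp(eps))
    return np.array([[keep, 1.0 - keep],
                     [1.0 - keep, keep]])

def worst_case_e_gamma(K, gamma):
    """Largest E_gamma divergence over all ordered pairs of inputs."""
    n = len(K)
    return max(e_gamma(K[x], K[xp], gamma) for x in range(n) for xp in range(n))

K = randomized_response(1.0)
tight = worst_case_e_gamma(K, np.exp(1.0))    # vanishes: K is 1-LDP
too_small = worst_case_e_gamma(K, np.exp(0.5))  # positive: K is not 0.5-LDP
```

With `gamma = 1` the same function returns the total variation distance between the two rows, which is a quick sanity check on the implementation.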
# (参考訳) 分散確率勾配降下の安定性と一般化 Stability and Generalization of the Decentralized Stochastic Gradient Descent ( http://arxiv.org/abs/2102.01302v1 ) ライセンス: CC BY 4.0	Tao Sun, Dongsheng Li, Bao Wang	(参考訳) 確率勾配に基づく手法の安定性と一般化は、機械学習モデルのアルゴリズム性能を理解する上で貴重な洞察を与える。深層学習のメインワークホースとして、確率勾配降下はかなりの量の研究を受けている。しかし、コミュニティはその分散型の変種にほとんど注意を払わなかった。本論文では,分散確率勾配降下の新たな定式化を提案する。この定式化と(非凸最適化理論を併用して、分散確率勾配勾配の第一の安定性と一般化保証を確立する。我々の理論的結果は、いくつかの一般的かつ穏やかな仮定に基づいて構築され、分散化が初めてsgdの安定性を低下させることが明らかとなった。さまざまな分散設定とベンチマーク機械学習モデルを用いて理論的結果を検証する。 The stability and generalization of stochastic gradient-based methods provide valuable insights into understanding the algorithmic performance of machine learning models. As the main workhorse for deep learning, stochastic gradient descent has received a considerable amount of studies. Nevertheless, the community paid little attention to its decentralized variants. In this paper, we provide a novel formulation of the decentralized stochastic gradient descent. Leveraging this formulation together with (non)convex optimization theory, we establish the first stability and generalization guarantees for the decentralized stochastic gradient descent. Our theoretical results are built on top of a few common and mild assumptions and reveal that the decentralization deteriorates the stability of SGD for the first time. We verify our theoretical findings by using a variety of decentralized settings and benchmark machine learning models.	翻訳日:2021-02-03 20:03:22 公開日:2021-02-02
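For readers unfamiliar with the algorithm analysed in the abstract above, the sketch below runs the standard decentralized SGD update, in which each node averages its parameter with its neighbours through a mixing matrix `W` and then takes a local gradient step. This is the textbook update, not the new formulation the paper introduces; the ring topology, the scalar quadratic losses and every name in the code are assumptions chosen to keep the example small.

```python
import numpy as np

def ring_mixing_matrix(n):
    """Doubly stochastic mixing matrix for a ring: weight 1/3 on self and both neighbours."""
    W = np.zeros((n, n))
    for i in range(n):
        for j in (i - 1, i, i + 1):
            W[i, j % n] = 1.0 / 3.0
    return W

def decentralized_sgd(targets, W, lr=0.1, steps=200, noise=0.0, seed=0):
    """Node i minimizes 0.5 * (x - targets[i])**2 while averaging with its neighbours.

    `noise` scales Gaussian noise added to each local gradient, standing in
    for the stochasticity of a minibatch gradient.
    """
    rng = np.random.default_rng(seed)
    x = np.zeros(len(targets))
    for _ in range(steps):
        grad = (x - targets) + noise * rng.standard_normal(len(targets))
        # Gossip averaging with neighbours, followed by a local gradient step.
        x = W @ x - lr * grad
    return x

targets = np.array([1.0, 2.0, 3.0, 4.0])
x_final = decentralized_sgd(targets, ring_mixing_matrix(4))
```

With a constant step size the nodes do not reach exact consensus: their average matches the minimizer of the summed loss, while each node keeps a small bias toward its own target.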
# (参考訳) 簡易予測器によるオンライン学習と0/1ゲームにおけるMinimaxの組合せ評価 Online Learning with Simple Predictors and a Combinatorial Characterization of Minimax in 0/1 Games ( http://arxiv.org/abs/2102.01646v1 ) ライセンス: CC BY 4.0	Steve Hanneke, Roi Livni, and Shay Moran	(参考訳) どのクラスがオンラインモデルで適切に学習できるのか? つまり、各ラウンドで概念クラスから予測子を使用するアルゴリズムによって。不適切な学習が必要な単純で自然なケースもあるが、不適切な予測器がどの程度複雑でなければならないのかを問うのは当然である。単純な"予測器を使って、常にほぼ最適のミス/リグレット境界を達成できるのか? 本研究は,アングリン(1987年)とリトルストーン(1988年)の先駆的研究から研究されてきたオープンな課題を解決するために,これがいつ可能かを完全に特徴づける。より正確には、任意の概念クラス C と任意の仮説クラス H を考えると、H からの予測器を用いてオンライン学習 C の最適誤差境界について、ほぼ厳しい境界 (ログファクタまで) を提供します。アプリケーションとして、(i)実現可能な設定では、(定数まで)ほぼ最適の誤りバウンドが、適切な予測者のスパース多数投票によって達成可能であり、(ii)不可知な設定では、(ログ係数まで)ほぼ最適の後悔バウンドをランダム化された固有アルゴリズムで達成できることを示す構成的証明を与える。独立性のある証明の技術的要素は、二元零サムゲームに対する有名なミニマックス定理(von Neumann, 1928)の一般化である。 Minimaxを満たすのに失敗する単純なゲームは、各プレーヤーが数字を選択し、より大きな数字が勝つ「大きな数字を誘導する」です。ペイオフ行列は無限三角形である。これが唯一の障害であることを示す:ゲームが非有界サイズの三角部分行列を含まないならば、ミニマックス定理は成立する。これはフォン・ノイマンのミニマックス定理を有限性(あるいはコンパクト性)の要件を取り除いて一般化し、オンライン学習に関心のあるゲームを正確に捉えている。 Which classes can be learned properly in the online model? -- that is, by an algorithm that at each round uses a predictor from the concept class. While there are simple and natural cases where improper learning is necessary, it is natural to ask how complex must the improper predictors be in such cases. Can one always achieve nearly optimal mistake/regret bounds using "simple" predictors? In this work, we give a complete characterization of when this is possible, thus settling an open problem which has been studied since the pioneering works of Angluin (1987) and Littlestone (1988). More precisely, given any concept class C and any hypothesis class H, we provide nearly tight bounds (up to a log factor) on the optimal mistake bounds for online learning C using predictors from H. Our bound yields an exponential improvement over the previously best known bound by Chase and Freitag (2020). 
As applications, we give constructive proofs showing that (i) in the realizable setting, a near-optimal mistake bound (up to a constant factor) can be attained by a sparse majority-vote of proper predictors, and (ii) in the agnostic setting, a near-optimal regret bound (up to a log factor) can be attained by a randomized proper algorithm. A technical ingredient of our proof which may be of independent interest is a generalization of the celebrated Minimax Theorem (von Neumann, 1928) for binary zero-sum games. A simple game which fails to satisfy Minimax is "Guess the Larger Number", where each player picks a number and the larger number wins. The payoff matrix is infinite triangular. We show this is the only obstruction: if a game does not contain triangular submatrices of unbounded sizes then the Minimax Theorem holds. This generalizes von Neumann's Minimax Theorem by removing requirements of finiteness (or compactness), and captures precisely the games of interest in online learning.	翻訳日:2021-02-03 19:40:44 公開日:2021-02-02
# (参考訳) 非同期Q-LearningとTD-Learningの有限サンプル保証に対するリアプノフ理論 A Lyapunov Theory for Finite-Sample Guarantees of Asynchronous Q-Learning and TD-Learning Variants ( http://arxiv.org/abs/2102.01567v1 ) ライセンス: CC BY 4.0	Zaiwei Chen, Siva Theja Maguluri, Sanjay Shakkottai, and Karthikeyan Shanmugam	(参考訳) 本稿では,大規模な値ベース非同期強化学習(RL)アルゴリズムの有限サンプル収束を保証する統一フレームワークを開発する。我々は、まずRLアルゴリズムをマルコフ確率近似(SA)アルゴリズムとして再構成し、不動点方程式を解く。次に、Lyapunov解析を開発し、マルコフSAの収束に関する平均二乗誤差境界を導出する。この中心的な結果に基づいて,$Q$-learning,$n$-step TD,TD$(\lambda)$,V-traceを含むオフポリシーTDアルゴリズムなどの非同期RLアルゴリズムに対して,有限サンプル平均二乗収束境界を確立する。副産物として、TD$(\lambda)$(および$n$-step TD)アルゴリズムの性能境界を一般の$\lambda$(および$n$)に対して解析することにより、RLにおけるブートストラップの効率というバイアス分散トレードオフを実証する。これは[37]で最初にオープンな問題として提起された。 This paper develops a unified framework to study finite-sample convergence guarantees of a large class of value-based asynchronous Reinforcement Learning (RL) algorithms. We do this by first reformulating the RL algorithms as Markovian Stochastic Approximation (SA) algorithms to solve fixed-point equations. We then develop a Lyapunov analysis and derive mean-square error bounds on the convergence of the Markovian SA. Based on this central result, we establish finite-sample mean-square convergence bounds for asynchronous RL algorithms such as $Q$-learning, $n$-step TD, TD$(\lambda)$, and off-policy TD algorithms including V-trace. As a by-product, by analyzing the performance bounds of the TD$(\lambda)$ (and $n$-step TD) algorithm for general $\lambda$ (and $n$), we demonstrate a bias-variance trade-off, i.e., efficiency of bootstrapping in RL. This was first posed as an open problem in [37].	翻訳日:2021-02-03 19:01:51 公開日:2021-02-02
# (参考訳) aura-net : アノテーションの少ない位相コントラスト顕微鏡画像の堅牢なセグメンテーション aura-net : robust segmentation of phase-contrast microscopy images with few annotations ( http://arxiv.org/abs/2102.01389v1 ) ライセンス: CC BY 4.0	Ethan Cohen and Virginie Uhlmann	(参考訳) 位相コントラスト顕微鏡画像の分割のための畳み込みニューラルネットワーク(CNN)であるAURA-netを提案する。 AURA-netは、トランスファーラーニングを使用してトレーニングと注意メカニズムを加速し、ネットワークが関連する画像機能に集中できるようにします。このように、非常に限られた量のアノテーションで効率的にトレーニングできます。したがって、我々のネットワークは、一般的にディープラーニング技術では小さすぎると考えられるデータセットのセグメンテーションを自動化するために利用することができる。 AURA-netはまた、位相コントラスト画像の特異性に順応し、さらに性能を向上させるアクティブな輪郭にインスパイアされた損失を使用する。 AURA-netは、いくつかの小さな(100枚未満)データセットにおいて最先端の代替品よりも優れていることを示す。 We present AURA-net, a convolutional neural network (CNN) for the segmentation of phase-contrast microscopy images. AURA-net uses transfer learning to accelerate training and Attention mechanisms to help the network focus on relevant image features. In this way, it can be trained efficiently with a very limited amount of annotations. Our network can thus be used to automate the segmentation of datasets that are generally considered too small for deep learning techniques. AURA-net also uses a loss inspired by active contours that is well-adapted to the specificity of phase-contrast images, further improving performance. We show that AURA-net outperforms state-of-the-art alternatives in several small (less than 100 images) datasets.	翻訳日:2021-02-03 17:46:11 公開日:2021-02-02
# (参考訳) 肺塞栓症自動検出のための多エネルギーCT画像からの低keV単色画像の予測 Prediction of low-keV monochromatic images from polyenergetic CT scans for improved automatic detection of pulmonary embolism ( http://arxiv.org/abs/2102.01445v1 ) ライセンス: CC BY 4.0	Constantin Seibold, Matthias A. Fink, Charlotte Goos, Hans-Ulrich Kauczor, Heinz-Peter Schlemmer, Rainer Stiefelhagen, Jens Kleesiek	(参考訳) 検出器ベースのスペクトル計算トモグラフィは、スペクトル情報を得る可能性を提供する最近のデュアルエネルギーCT(DECT)技術である。このスペクトルデータから、他の仮想単エネルギー(monoE)画像と異なり、異なるタイプの画像を引き出すことができる。 MonoE画像は、アーチファクトが減少し、コントラストが改善し、全体的なノイズ値が低下し、血管異常の診断精度が向上する理想的な候補となります。本稿では,従来の単エネルギーCTからのモノE画像の生成をエミュレートできる畳み込みニューラルネットワーク(CNN)を訓練している。本研究では,よく用いられる画像変換手法について検討する。これらの方法が視覚的に類似した出力を作成し、肺塞栓症(PE)の自動分類に使用されるとパフォーマンスが低下することを示しています。 PSNRとSSIMスコアに反映されるように,ネットワークによる分類と生成結果の改善を実現するマルチタスク最適化手法を用いて,これらの手法を拡張した。さらに,提案手法をRSNA-PEチャレンジデータセットのサブセット上で評価することにより,naïve分類アプローチと比較して,受信者動作特性曲線下面積(AuROC)を0.8142から0.8420に改善できることを示す。 Detector-based spectral computed tomography is a recent dual-energy CT (DECT) technology that offers the possibility of obtaining spectral information. From this spectral data, different types of images can be derived, amongst others virtual monoenergetic (monoE) images. MonoE images potentially exhibit decreased artifacts, improve contrast, and overall contain lower noise values, making them ideal candidates for better delineation and thus improved diagnostic accuracy of vascular abnormalities. In this paper, we are training convolutional neural networks (CNN) that can emulate the generation of monoE images from conventional single energy CT acquisitions. For this task, we investigate several commonly used image-translation methods. We demonstrate that these methods while creating visually similar outputs, lead to a poorer performance when used for automatic classification of pulmonary embolism (PE). 
We expand on these methods through the use of a multi-task optimization approach, under which the networks achieve improved classification as well as generation results, as reflected by PSNR and SSIM scores. Further, evaluating our proposed framework on a subset of the RSNA-PE challenge data set shows that we are able to improve the Area under the Receiver Operating Characteristic curve (AuROC) in comparison to a naïve classification approach from 0.8142 to 0.8420.	翻訳日:2021-02-03 17:36:52 公開日:2021-02-02
# (参考訳) モデルに基づくマルチパラメータマッピング Model-based multi-parameter mapping ( http://arxiv.org/abs/2102.01604v1 ) ライセンス: CC BY 4.0	Yael Balbastre, Mikael Brudfors, Michela Azzarito, Christian Lambert, Martina F. Callaghan, John Ashburner	(参考訳) 量的MRイメージングは、その豊富な情報コンテンツと標準化された対策のためにますます好まれています。しかし, 縦緩和率 (R1), 横緩和率 (R2), 磁化移動飽和度 (MTsat) などの定量的パラメータの抽出は, 高い非線形関数の反転を伴う。推定はしばしばノイズのない測定を仮定し、データのサブセットを使用して異なる量の分離を解決し、各計算を通じてエラーが伝播します。代わりに、データセット全体の確率的生成モデルを定式化し、逆転してパラメータ推定を適切に定義された確率的意味(例えば、最大可能性または最大a後方)で共同で回収することができる。実際には、反復的な方法を使用する必要がありますが、対数尤度の非凸性のために収束は困難です。しかし、我々はそれが新しい近似ヘッセンのおかげで達成できることを示し、それによって、信頼できるパラメータ推定が得られました。本稿では,このフレキシブルなフレームワークの有用性を,一般的なマルチパラメータマッピングフレームワークの文脈で実証し,デノイジングの事前設定と後方不確かさの予測の方法を示す。当社の実装では、PyTorchバックエンドを使用しており、GPUアクセラレーションのメリットがあります。 https://github.com/balbasty/nitorch で入手できる。 Quantitative MR imaging is increasingly favoured for its richer information content and standardised measures. However, extracting quantitative parameters such as the longitudinal relaxation rate (R1), apparent transverse relaxation rate (R2), or magnetisation-transfer saturation (MTsat) involves inverting a highly non-linear function. Estimations often assume noise-free measurements and use subsets of the data to solve for different quantities in isolation, with error propagating through each computation. Instead, a probabilistic generative model of the entire dataset can be formulated and inverted to jointly recover parameter estimates with a well-defined probabilistic meaning (e.g., maximum likelihood or maximum a posteriori). In practice, iterative methods must be used but convergence is difficult due to the non-convexity of the log-likelihood; yet, we show that it can be achieved thanks to a novel approximate Hessian and, with it, reliable parameter estimates obtained. 
Here, we demonstrate the utility of this flexible framework in the context of the popular multi-parameter mapping framework and further show how to incorporate a denoising prior and predict posterior uncertainty. Our implementation uses a PyTorch backend and benefits from GPU acceleration. It is available at https://github.com/balbasty/nitorch.	翻訳日:2021-02-03 17:30:30 公開日:2021-02-02
# 例外挿による神経データ拡張 Neural Data Augmentation via Example Extrapolation ( http://arxiv.org/abs/2102.01335v1 ) ライセンス: Link先を確認	Kenton Lee, Kelvin Guu, Luheng He, Tim Dozat, Hyung Won Chung	(参考訳) 機械学習の多くの応用では、トレーニングデータで特定の例のカテゴリが表現不足となり、テスト時にこのような"few-shot"ケースでシステムの性能が低下する可能性がある。一般的な治療は、表現不足の例を複製したり、新しい例をヒューリスティックに合成したりしてデータ拡張を行うことである。しかし、これらの治療法は実例の完全な多様性と複雑さをカバーできないことが多い。本稿では,ニューラルな例外挿(Ex2)を行うデータ拡張手法を提案する。ある分布からサンプリングされた少数の例を考えると、Ex2は同じ分布に属する新しい例を合成する。 Ex2モデルは、データ豊富なスライスの例生成手順をシミュレートして学習され、表現不足の少数のスライスに適用されます。 Ex2をさまざまな言語理解タスクに適用し、リレーション抽出(FewRel)やインテント分類+スロットフィリング(SNIPS)など、複数のfew-shot学習ベンチマークにおける最先端の手法を大幅に改善します。 In many applications of machine learning, certain categories of examples may be underrepresented in the training data, causing systems to underperform on such "few-shot" cases at test time. A common remedy is to perform data augmentation, such as by duplicating underrepresented examples, or heuristically synthesizing new examples. But these remedies often fail to cover the full diversity and complexity of real examples. We propose a data augmentation approach that performs neural Example Extrapolation (Ex2). Given a handful of exemplars sampled from some distribution, Ex2 synthesizes new examples that also belong to the same distribution. The Ex2 model is learned by simulating the example generation procedure on data-rich slices of the data, and it is applied to underrepresented, few-shot slices. We apply Ex2 to a range of language understanding tasks and significantly improve over state-of-the-art methods on multiple few-shot learning benchmarks, including for relation extraction (FewRel) and intent classification + slot filling (SNIPS).	翻訳日:2021-02-03 17:00:11 公開日:2021-02-02
# エッジ領域における色分布解析に基づく顔操作検出 Facial Manipulation Detection Based on the Color Distribution Analysis in Edge Region ( http://arxiv.org/abs/2102.01381v1 ) ライセンス: Link先を確認	Dong-Keon Kim, DongHee Kim, and Kwangsu Kim	(参考訳) 本研究では,操作画像におけるエッジの垂直領域の色分布解析に基づく,汎用的かつ堅牢な顔操作検出手法を提案する。現代顔操作法の大半は、合成画像における顔面境界に沿った画素値差の厄介さを低減するための画素補正手順を含む。この処理のため, 顔操作画像と偽造されていない自然画像との間には, 顔境界の違いがある。また、鍛造画像では、照明の自然な効果を損なう傾向があるため、顔境界と背景エッジ領域のギャップ分布に特徴的な不自然な特徴があるべきである。顔境界と背景エッジに特有の特徴を持つ顔操作画像を検出するニューラルネットワークを設計する。本研究では, 既存の顔操作検出法よりも, トレーニングの有無にかかわらず, 合成顔画像の検出法を各種データセットで比較した。 In this work, we present a generalized and robust facial manipulation detection method based on color distribution analysis of the vertical region of edge in a manipulated image. Most contemporary facial manipulation methods involve pixel correction procedures to reduce the awkwardness of pixel value differences along the facial boundary in a synthesized image. Because of this procedure, there are distinctive differences in the facial boundary between a face-manipulated image and an unforged natural image. Also, in the forged image, there should be distinctive and unnatural features in the gap distribution between facial boundary and background edge region because it tends to damage the natural effect of lighting. We design a neural network that detects face-manipulated images using these distinctive features in the facial boundary and background edge. Our extensive experiments show that our method outperforms other existing face manipulation detection methods at detecting synthesized face images in various datasets, regardless of whether a dataset was used in training.	翻訳日:2021-02-03 16:59:32 公開日:2021-02-02
# オブジェクトの共同発生に対するペナルティによるクラスタリング:計算的側面 Clustering with Penalty for Joint Occurrence of Objects: Computational Aspects ( http://arxiv.org/abs/2102.01424v1 ) ライセンス: Link先を確認	Ondřej Sokol and Vladimír Holý	(参考訳) Holý, Sokol, Černý (Applied Soft Computing, 2017, Vol. 60, p. 752-762) の手法は、与えられた多数の集合における出現に基づいてオブジェクトをクラスタリングする。アイデアは、同じセット内の同じクラスタから複数のオブジェクトの発生を最小限に抑えることです。本稿では,本手法の計算的側面について考察する。まず、最適クラスタリングの問題はNPハードであることが証明される。第二に、最適なクラスタリングを数値的に見つけるために、再数値化手順、高速なタスク固有の局所探索ヒューリスティック、単純化されたモデルに基づく初期解を用いた遺伝的アルゴリズムを提案する。第3に, シミュレーション研究により, 標準遺伝的アルゴリズムの改良により, 計算性能が著しく向上することを示す。 The method of Holý, Sokol and Černý (Applied Soft Computing, 2017, Vol. 60, p. 752-762) clusters objects based on their incidence in a large number of given sets. The idea is to minimize the occurrence of multiple objects from the same cluster in the same set. In the current paper, we study computational aspects of the method. First, we prove that the problem of finding the optimal clustering is NP-hard. Second, to numerically find a suitable clustering, we propose to use the genetic algorithm augmented by a renumbering procedure, a fast task-specific local search heuristic and an initial solution based on a simplified model. Third, in a simulation study, we demonstrate that our improvements of the standard genetic algorithm significantly enhance its computational performance.	翻訳日:2021-02-03 16:58:58 公開日:2021-02-02
# 強化学習におけるメトリクスと継続性 Metrics and continuity in reinforcement learning ( http://arxiv.org/abs/2102.01514v1 ) ライセンス: Link先を確認	Charline Le Lan, Marc G. Bellemare, Pablo Samuel Castro	(参考訳) 強化学習のほとんどの実践的応用では、個々の状態の直接推定を維持することは不可能であり、連続状態システムでは不可能である。代わりに、研究者はしばしば状態の類似性(明示的にも暗黙的にも)を利用して、限られたサンプルセットからうまく一般化できるモデルを構築します。使用される状態類似性、およびそれらが誘導する近隣やトポロジの概念は、アルゴリズムのパフォーマンスに直接影響するため、重要な重要性を有する。実際、最近の多くの研究では「よく行動する」地域の存在を仮定したアルゴリズムが導入されているが、将来の作業のためにそのようなトポロジの完全な仕様を残している。本稿では,これらのトポロジを定義するための統一的形式主義について,メトリクスのレンズを通じて紹介する。これらの指標の階層を確立し、強化学習問題を特定するマルコフ決定プロセスに関する理論的意味を実証する。我々は, 評価指標間の差異を実証的に評価し, 理論結果を補完する。 In most practical applications of reinforcement learning, it is untenable to maintain direct estimates for individual states; in continuous-state systems, it is impossible. Instead, researchers often leverage state similarity (whether explicitly or implicitly) to build models that can generalize well from a limited set of samples. The notion of state similarity used, and the neighbourhoods and topologies they induce, is thus of crucial importance, as it will directly affect the performance of the algorithms. Indeed, a number of recent works introduce algorithms assuming the existence of "well-behaved" neighbourhoods, but leave the full specification of such topologies for future work. In this paper we introduce a unified formalism for defining these topologies through the lens of metrics. We establish a hierarchy amongst these metrics and demonstrate their theoretical implications on the Markov Decision Process specifying the reinforcement learning problem. We complement our theoretical results with empirical evaluations showcasing the differences between the metrics considered.	翻訳日:2021-02-03 16:58:26 公開日:2021-02-02
# CNN圧縮のための重量共有機会の高速探索 Fast Exploration of Weight Sharing Opportunities for CNN Compression ( http://arxiv.org/abs/2102.01345v1 ) ライセンス: Link先を確認	Etienne Dupuis, David Novo, Ian O'Connor, Alberto Bosio	(参考訳) Convolutional Neural Networks(CNN)に関わる計算負荷は、通常、低消費電力の組み込みデバイスでは到達できない。この問題に対処するために、多くの近似技術があります。これらの手法は、設計空間探索(DSE)を用いて、各CNNに最適化する必要があるハイパーパラメータを持つ。本研究の目的は,最先端のCNNではDSEフェーズの所要時間が容易に爆発的に増大しうることを実証することである。そこで本稿では,出力の質を犠牲にすることなく,探索時間を劇的に短縮する最適化探索法を提案する。 The computational workload involved in Convolutional Neural Networks (CNNs) is typically out of reach for low-power embedded devices. There are a large number of approximation techniques to address this problem. These methods have hyper-parameters that need to be optimized for each CNN using design space exploration (DSE). The goal of this work is to demonstrate that the DSE phase time can easily explode for state-of-the-art CNNs. We thus propose the use of an optimized exploration process to drastically reduce the exploration time without sacrificing the quality of the output.	翻訳日:2021-02-03 16:57:14 公開日:2021-02-02
# データにおけるマイニング特徴関係 Mining Feature Relationships in Data ( http://arxiv.org/abs/2102.01355v1 ) ライセンス: Link先を確認	Andrew Lensen	(参考訳) 新しいデータセットに直面したとき、ほとんどの実践者はデータ内の興味深いパターンや特徴を発見するために探索的データ分析を行うことから始める。関連ルールマイニングのような手法は、データの特徴(属性)間の関係を明らかにするために一般的に用いられる。しかし、アソシエーションルールはルールベースの機械学習を使用するため、主にバイナリデータやカテゴリデータでの使用のために設計されている。現実世界のデータの大部分は本質的に連続的であり、そのようなデータの離散化は不正確で情報の少ない関連ルールをもたらす。本稿では,データ中の連続的・分類的特徴間の象徴的関係を自動的に発見する遺伝的プログラミング手法を用いて,特徴関係マイニング(FRM)という代替手法を提案する。我々の知る限りでは、我々の提案したアプローチは、特徴間の関係を明確に発見することを目的とした最初の象徴的なアプローチである。実世界のさまざまなデータセットにおける経験的テスト提案手法は、容易に解釈でき、データに対する明確かつ非自明な洞察を提供する高品質でシンプルな特徴関係を見つけることができる。 When faced with a new dataset, most practitioners begin by performing exploratory data analysis to discover interesting patterns and characteristics within data. Techniques such as association rule mining are commonly applied to uncover relationships between features (attributes) of the data. However, association rules are primarily designed for use on binary or categorical data, due to their use of rule-based machine learning. A large proportion of real-world data is continuous in nature, and discretisation of such data leads to inaccurate and less informative association rules. In this paper, we propose an alternative approach called feature relationship mining (FRM), which uses a genetic programming approach to automatically discover symbolic relationships between continuous or categorical features in data. To the best of our knowledge, our proposed approach is the first such symbolic approach with the goal of explicitly discovering relationships between features. Empirical testing on a variety of real-world datasets shows the proposed method is able to find high-quality, simple feature relationships which can be easily interpreted and which provide clear and non-trivial insight into data.	翻訳日:2021-02-03 16:56:46 公開日:2021-02-02
# 文レベル関係抽出のための改良ベースライン An Improved Baseline for Sentence-level Relation Extraction ( http://arxiv.org/abs/2102.01373v1 ) ライセンス: Link先を確認	Wenxuan Zhou, Muhao Chen	(参考訳) 文レベルの関係抽出(RE)は、文中の2つの実体間の関係を特定することを目的とする。この問題には多くの努力が費やされてきたが、最高の実行方法はまだ人間のパフォーマンスには及ばない。本論文では,実体表現とNAインスタンス予測という,徹底的に研究されていないREモデルの2つの側面を再検討する。当社の改良ベースラインモデルは、タイプマーカーを備えたエンティティ表現とNAインスタンス検出の強化のための信頼ベースの分類と組み合わされ、TACREDで75.0%のF1を達成し、以前のSOTAメソッドを大幅に上回っています。 Sentence-level relation extraction (RE) aims at identifying the relationship between two entities in a sentence. Many efforts have been devoted to this problem, while the best performing methods are still far behind human performance. In this paper, we revisit two aspects of RE models that are not thoroughly studied, namely entity representation and NA instance prediction. Our improved baseline model, incorporated with entity representations with type markers and confidence-based classification for enhanced NA instance detection, achieves an F1 of 75.0% on TACRED, significantly outperforms previous SOTA methods.	翻訳日:2021-02-03 16:54:43 公開日:2021-02-02
# MAUVE:オープンエンディングテキスト生成評価のためのヒューマンマシンダイバージェンス曲線 MAUVE: Human-Machine Divergence Curves for Evaluating Open-Ended Text Generation ( http://arxiv.org/abs/2102.01454v1 ) ライセンス: Link先を確認	Krishna Pillutla, Swabha Swayamdipta, Rowan Zellers, John Thickstun, Yejin Choi, Zaid Harchaoui	(参考訳) オープンエンドテキスト生成の大きな進歩にもかかわらず、このタスクの評価基準の設計には限界がある。本稿では,機械生成テキストの分布を人間の言語と直接比較する,オープンエンドテキスト生成の指標であるMAUVEを提案する。 MAUVEは2つの分布の分岐曲線の下の平均面積を測定し、モデル分布がよく近似する分布の一部から生じるものと、そうでないものという2つのタイプの誤差の間のトレードオフを探索する。ウェブテキスト領域とストーリー領域における2つのオープンエンドな生成タスク、および様々な復号アルゴリズムとモデルサイズについて実験を行った。この結果から,MAUVEによる評価は,モデルサイズに対する自然な挙動を反映していることが明らかとなった。 MAUVEの復号アルゴリズムの順序は、オープンエンドテキスト生成において最も広く使われている指標である世代パープレキシティと一致するが、MAUVEはモデルと人文の両方を考慮することにより、タスクに対するより原則化された評価基準を示す。 Despite major advances in open-ended text generation, there has been limited progress in designing evaluation metrics for this task. We propose MAUVE -- a metric for open-ended text generation, which directly compares the distribution of machine-generated text to that of human language. MAUVE measures the mean area under the divergence curve for the two distributions, exploring the trade-off between two types of errors: those arising from parts of the human distribution that the model distribution approximates well, and those it does not. We present experiments across two open-ended generation tasks in the web text domain and the story domain, and a variety of decoding algorithms and model sizes. Our results show that evaluation under MAUVE indeed reflects the more natural behavior with respect to model size, compared to prior metrics. MAUVE's ordering of the decoding algorithms also agrees with that of generation perplexity, the most widely used metric in open-ended text generation; however, MAUVE presents a more principled evaluation metric for the task as it considers both model and human text.	翻訳日:2021-02-03 16:54:09 公開日:2021-02-02
# 変換器(M-BERT)からの多言語双方向エンコーダ表現を用いたインドネシアニュースサイトのクリックベイト見出し検出 Clickbait Headline Detection in Indonesian News Sites using Multilingual Bidirectional Encoder Representations from Transformers (M-BERT) ( http://arxiv.org/abs/2102.01497v1 ) ライセンス: Link先を確認	Muhammad N. Fakhruzzaman, Saidah Z. Jannah, Ratih A. Ningrum, Indah Fahmiyah	(参考訳) クリック数は、オンライン広告主がニュースサイトに支払った金額に関連している。このようなビジネスモデルにより、一部のニュースサイトはクリックベイティングの汚いトリック、すなわちハイパーボリックで興味深い言葉、時には見出しの未完成の文章を使用して読者を意図的にいじめることを余儀なくされた。インドネシアの一部のオンラインニュースサイトもクリックベイトに参加し、他の既存のニュースサイトの信頼性を間接的に低下させた。埋め込み層として機能する予め訓練された言語モデルM-BERTを有するニューラルネットワークを100ノード隠蔽層と組み合わせ、シグモイド分類器をトッピングしてクリックベイト見出しを検出する。トレーニングデータセットとして合計6632の見出しで、分類器は著しくうまく機能しました。 5倍のクロス検証で評価され、精度スコアは0.914、f1スコアは0.914、精度スコアは0.916、ROC-AUCは0.92である。インドネシア語テキスト分類タスクにおける多言語BERTの使用がテストされ、さらなる拡張が可能となった。今後の可能性,社会的影響,クリックベイト検出の限界について論じる。 Click counts are related to the amount of money that online advertisers paid to news sites. Such business models forced some news sites to employ a dirty trick of click-baiting, i.e., using hyperbolic and interesting words, sometimes unfinished sentence in a headline to purposefully tease the readers. Some Indonesian online news sites also joined the party of clickbait, which indirectly degrade other established news sites' credibility. A neural network with a pre-trained language model M-BERT that acted as an embedding layer is then combined with a 100 nodes hidden layer and topped with a sigmoid classifier was trained to detect clickbait headlines. With a total of 6632 headlines as a training dataset, the classifier performed remarkably well. Evaluated with 5-fold cross validation, it has an accuracy score of 0.914, f1-score of 0.914, precision score of 0.916, and ROC-AUC of 0.92. The usage of multilingual BERT in Indonesian text classification task was tested and is possible to be enhanced further. Future possibilities, societal impact, and limitations of the clickbait detection are discussed.	翻訳日:2021-02-03 16:53:28 公開日:2021-02-02
# Deep Online Fused Video Stabilization Deep Online Fused Video Stabilization ( http://arxiv.org/abs/2102.01279v1 ) ライセンス: Link先を確認	Zhenmei Shi, Fuhao Shi, Wei-Sheng Lai, Chia-Kai Liang, Yingyu Liang	(参考訳) 本稿では、センサデータ(ジャイロスコープ)と画像コンテンツ(光学フロー)の両方を用いて、教師なし学習による動画の安定化を図るディープニューラルネットワーク(DNN)を提案する。ネットワークは、実際の/仮想カメラで光の流れを融合し、ヒストリーを関節運動表現に変換する。次に、LSTMブロックは新しい仮想カメラポーズを推測し、この仮想ポーズはフレームを安定させる反動グリッドを生成するために使用されます。新たな相対運動表現と多段階学習プロセスが提案され, 教師なしのモデルが最適化される。我々の知る限りでは、センサデータと画像の両方を安定化に利用する最初のDNNソリューションである。提案手法をアブレーション研究により検証し,提案手法は定量的評価とユーザスタディにより最先端の代替ソリューションよりも優れていることを示した。 We present a deep neural network (DNN) that uses both sensor data (gyroscope) and image content (optical flow) to stabilize videos through unsupervised learning. The network fuses optical flow with real/virtual camera pose histories into a joint motion representation. Next, the LSTM block infers the new virtual camera pose, and this virtual pose is used to generate a warping grid that stabilizes the frame. Novel relative motion representation as well as a multi-stage training process are presented to optimize our model without any supervision. To the best of our knowledge, this is the first DNN solution that adopts both sensor data and image for stabilization. We validate the proposed framework through ablation studies and demonstrated the proposed method outperforms the state-of-art alternative solutions via quantitative evaluations and a user study.	翻訳日:2021-02-03 16:49:01 公開日:2021-02-02
# 言語に基づくモーメント定位のためのプログレッシブ定位ネットワーク Progressive Localization Networks for Language-based Moment Localization ( http://arxiv.org/abs/2102.01282v1 ) ライセンス: Link先を確認	Qi Zheng, Jianfeng Dong, Xiaoye Qu, Xun Yang, Shouling Ji, Xun Wang	(参考訳) 本稿では,言語に基づくモーメントローカライゼーションの課題を対象とする。このタスクの言語ベースの設定により、ターゲットアクティビティのオープンなセットが可能になり、ビデオモーメントの時間的長さが大きく変化する。既存の手法では、まず時間長の異なる十分な候補モーメントをサンプリングし、それから与えられたクエリと照合して目標モーメントを決定する。しかし、定時間粒度で生成された候補モーメントは、モーメント長の大きな変動を処理するのに最適である。そこで本研究では,目標モーメントを粗大な方法で段階的にローカライズする多段階プログレッシブ・ローカライゼーション・ネットワーク(PLN)を提案する。具体的には、PLNの各段階は局所化分岐を持ち、特定の時間的粒度で生成される候補モーメントに焦点を当てる。候補モーメントの時間的粒度はステージによって異なる。さらに,条件付き特徴操作モジュールとアップサンプリング接続を考案し,複数のローカライズブランチを橋渡しする。この方法では、後段は事前に学習した情報を吸収することができるため、より細かい局所化が容易になる。 3つの公開データセットに対する大規模な実験は、言語に基づくモーメントローカライゼーションにおけるPLNの有効性と、長いビデオで短いモーメントをローカライズする可能性を示す。 This paper targets the task of language-based moment localization. The language-based setting of this task allows for an open set of target activities, resulting in a large variation of the temporal lengths of video moments. Most existing methods prefer to first sample sufficient candidate moments with various temporal lengths, and then match them with the given query to determine the target moment. However, candidate moments generated with a fixed temporal granularity may be suboptimal to handle the large variation in moment lengths. To this end, we propose a novel multi-stage Progressive Localization Network (PLN) which progressively localizes the target moment in a coarse-to-fine manner. Specifically, each stage of PLN has a localization branch, and focuses on candidate moments that are generated with a specific temporal granularity. The temporal granularities of candidate moments are different across the stages. Moreover, we devise a conditional feature manipulation module and an upsampling connection to bridge the multiple localization branches. 
In this fashion, the later stages are able to absorb the previously learned information, thus facilitating the more fine-grained localization. Extensive experiments on three public datasets demonstrate the effectiveness of our proposed PLN for language-based moment localization and its potential for localizing short moments in long videos.	翻訳日:2021-02-03 16:48:27 公開日:2021-02-02
# GCF-Net:ビデオ行動認識のためのGated Clip Fusion Network GCF-Net: Gated Clip Fusion Network for Video Action Recognition ( http://arxiv.org/abs/2102.01285v1 ) ライセンス: Link先を確認	Jenhao Hsiao and Jiawei Chen and Chiuman Ho	(参考訳) 近年、ビデオアクション認識の精度向上のほとんどは、新しく設計されたCNNアーキテクチャ(例えば、3D-CNN)から来ている。これらのモデルは、固定時間長の単一クリップにディープCNNを適用することで訓練される。各ビデオセグメントは3D-CNNモジュールによって個別に処理されるため、対応するクリップディスクリプタはローカルであり、クリップ間の関係は本質的に暗黙的です。ビデオレベルの予測としてクリップレベルの出力を直接平均化する一般的な方法は、ビデオを表すために関連情報を抽出および統合できるメカニズムの欠如のために失敗する傾向があります。本稿では、既存のビデオアクション分類器を小さな計算オーバーヘッドのコストで大幅に向上させることができるGated Clip Fusion Network(GCF-Net)について紹介する。 GCF-Netは、ビデオクリップ間の依存性を明示的にモデル化し、ローカルクリップディスクリプタの受容フィールドを強化します。さらに、アクションイベントに対する各クリップの重要性を計算し、関連するクリップのサブセットを選択してビデオレベルの分析を行う。大規模なベンチマークデータセット(Kinetics-600)では、提案されたGCF-Netは、それぞれ11.49%(中央クリップに基づく)と3.67%(高密度サンプリングクリップに基づく)の既存のアクション分類器の精度を高める。 In recent years, most of the accuracy gains for video action recognition have come from the newly designed CNN architectures (e.g., 3D-CNNs). These models are trained by applying a deep CNN on single clip of fixed temporal length. Since each video segment are processed by the 3D-CNN module separately, the corresponding clip descriptor is local and the inter-clip relationships are inherently implicit. Common method that directly averages the clip-level outputs as a video-level prediction is prone to fail due to the lack of mechanism that can extract and integrate relevant information to represent the video. In this paper, we introduce the Gated Clip Fusion Network (GCF-Net) that can greatly boost the existing video action classifiers with the cost of a tiny computation overhead. The GCF-Net explicitly models the inter-dependencies between video clips to strengthen the receptive field of local clip descriptors. Furthermore, the importance of each clip to an action event is calculated and a relevant subset of clips is selected accordingly for a video-level analysis. 
On a large benchmark dataset (Kinetics-600), the proposed GCF-Net elevates the accuracy of existing action classifiers by 11.49% (based on central clip) and 3.67% (based on densely sampled clips) respectively.	翻訳日:2021-02-03 16:47:43 公開日:2021-02-02
# Deep Refinement NetworkとAdaptive Weighting Lossを用いたCrisp Boundariesの学習 Learning Crisp Boundaries Using Deep Refinement Network and Adaptive Weighting Loss ( http://arxiv.org/abs/2102.01301v1 ) ライセンス: Link先を確認	Yi-Jun Cao, Chuan Lin, and Yong-Jie Li	(参考訳) 畳み込みニューラルネットワークを用いて境界検出において著しい進歩を遂げている。最近の境界検出モデルは、実際のオブジェクトの境界検出だけでなく、境界(オブジェクトの輪郭に沿って正確にローカライズ)にも焦点を合わせています。 crisp境界性能を評価する方法は2つある。基底真理と検出された輪郭の間の距離を測定するために、より厳密な耐性を用いる。もう1つは、後処理なしで輪郭マップを評価することに焦点を当てている。本研究では,両手法を解析し,両手法が輪郭評価の2つの側面であることを示す。そこで本研究では,複数の精錬モジュールを積み重ねた深層精錬ネットワーク(DRNet)と,効果的な適応融合によるクロスエントロピーとダイス損失を組み合わせた新たな損失関数を提案する。実験の結果,いくつかの利用可能なデータセットの最先端性能が得られた。 Significant progress has been made in boundary detection with the help of convolutional neural networks. Recent boundary detection models not only focus on real object boundary detection but also "crisp" boundaries (precisely localized along the object's contour). There are two methods to evaluate crisp boundary performance. One uses more strict tolerance to measure the distance between the ground truth and the detected contour. The other focuses on evaluating the contour map without any postprocessing. In this study, we analyze both methods and conclude that both methods are two aspects of crisp contour evaluation. Accordingly, we propose a novel network named deep refinement network (DRNet) that stacks multiple refinement modules to achieve richer feature representation and a novel loss function, which combines cross-entropy and dice loss through effective adaptive fusion. Experimental results demonstrated that we achieve state-of-the-art performance for several available datasets.	翻訳日:2021-02-03 16:46:57 公開日:2021-02-02
# Tooth Instance Segmentation from Cone-Beam CT Images through Point-based Detection and Gaussian Disentanglement ( http://arxiv.org/abs/2102.01315v1 ) License: see linked page	Jusang Lee, Minyoung Chung, Minkyung Lee, Yeong-Gil Shin	Individual tooth segmentation and identification from cone-beam computed tomography images are preoperative prerequisites for orthodontic treatments. Instance segmentation methods using convolutional neural networks have demonstrated ground-breaking results on individual tooth segmentation tasks, and are used in various medical imaging applications. While point-based detection networks achieve superior results on dental images, it is still a challenging task to distinguish adjacent teeth because of their similar topologies and proximate nature. In this study, we propose a point-based tooth localization network that effectively disentangles each individual tooth based on a Gaussian disentanglement objective function. The proposed network first performs heatmap regression accompanied by box regression for all the anatomical teeth. A novel Gaussian disentanglement penalty is employed by minimizing the sum of the pixel-wise multiplication of the heatmaps for all adjacent teeth pairs. Subsequently, individual tooth segmentation is performed by converting a pixel-wise labeling task to a distance map regression task to minimize false positives in adjacent regions of the teeth. Experimental results demonstrate that the proposed algorithm outperforms state-of-the-art approaches by increasing the average precision of detection by 9.1%, which results in a high performance in terms of individual tooth segmentation. The primary significance of the proposed method is two-fold: 1) the introduction of a point-based tooth detection framework that does not require additional classification and 2) the design of a novel loss function that effectively separates Gaussian distributions based on heatmap responses in the point-based detection framework.	Translated: 2021-02-03 16:46:21 Published: 2021-02-02
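The Gaussian disentanglement penalty in the abstract above is described as the sum of pixel-wise products of heatmaps over adjacent tooth pairs. A minimal sketch of that quantity follows, assuming 2D heatmaps stored as nested lists; the helper names, the isotropic Gaussian shape, and the pair list are illustrative assumptions, not the authors' implementation.

```python
import math


def gaussian_heatmap(height, width, center, sigma):
    """Isotropic 2D Gaussian heatmap peaking at center = (row, col)."""
    cy, cx = center
    return [
        [math.exp(-((y - cy) ** 2 + (x - cx) ** 2) / (2.0 * sigma ** 2)) for x in range(width)]
        for y in range(height)
    ]


def disentanglement_penalty(heatmaps, adjacent_pairs):
    """Sum of pixel-wise products of heatmaps over the given index pairs.

    The value grows when the heatmaps of two neighbouring teeth overlap,
    so minimizing it pushes the responses apart.
    """
    penalty = 0.0
    for i, j in adjacent_pairs:
        for row_i, row_j in zip(heatmaps[i], heatmaps[j]):
            penalty += sum(a * b for a, b in zip(row_i, row_j))
    return penalty
```

For example, two heatmaps whose centers are three pixels apart give a larger penalty than two whose centers are eleven pixels apart, which is the behaviour the penalty is meant to discourage.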
# Test-Time Adaptation for Out-of-distributed Image Inpainting ( http://arxiv.org/abs/2102.01360v1 ) License: see linked page	Chajin Shin, Taeoh Kim, Sangjin Lee and Sangyoun Lee	Deep learning-based image inpainting algorithms have shown great performance via powerful priors learned from numerous external natural images. However, they show unpleasant results on a test image whose distribution is far from that of the training images, because their models are biased toward the training images. In this paper, we propose a simple image inpainting algorithm with test-time adaptation named AdaFill. Given a single out-of-distributed test image, our goal is to complete the hole region more naturally than the pre-trained inpainting models. To achieve this goal, we treat the remaining valid regions of the test image as additional training cues, because natural images have strong internal similarities. Through this test-time adaptation, our network can explicitly exploit externally learned image priors from the pre-trained features as well as the internal prior of the test image. Experimental results show that AdaFill outperforms other models on various out-of-distribution test images. Furthermore, a model named ZeroFill, which is not pre-trained, also sometimes outperforms the pre-trained models.	Translated: 2021-02-03 16:45:32 Published: 2021-02-02
# Face Recognition Using $Sf_{3}CNN$ With Higher Feature Discrimination ( http://arxiv.org/abs/2102.01404v1 ) License: see linked page	Nayaneesh Kumar Mishra, Satish Kumar Singh	With the advent of 2-dimensional Convolution Neural Networks (2D CNNs), face recognition accuracy has risen above 99%. However, face recognition is still a challenge in real-world conditions. A video, instead of an image, can be more useful as an input for solving the challenges of face recognition in real-world conditions. This is because a video provides more features than an image. However, 2D CNNs cannot take advantage of the temporal features present in the video. We therefore propose a framework called $Sf_{3}CNN$ for face recognition in videos. The $Sf_{3}CNN$ framework uses a 3-dimensional Residual Network (3D ResNet) and A-Softmax loss for face recognition in videos. The use of 3D ResNet helps to capture both spatial and temporal features in one compact feature map. However, the 3D CNN features must be highly discriminative for efficient face recognition. The use of A-Softmax loss helps to extract highly discriminative features from the video for face recognition. The $Sf_{3}CNN$ framework gives an increased accuracy of 99.10% on the CVBL video database, compared with the previous 97% on the same database using 3D ResNets.	Translated: 2021-02-03 16:44:56 Published: 2021-02-02
# Face Recognition using 3D CNNs ( http://arxiv.org/abs/2102.01441v1 ) License: see linked page	Nayaneesh Kumar Mishra, Satish Kumar Singh	Face recognition is one of the most widely researched areas in computer vision and biometrics. This is because the non-intrusive nature of face biometrics makes it comparatively more suitable for surveillance applications at public places such as airports. The application of primitive methods to face recognition could not give very satisfactory performance. However, with the advent of machine learning and deep learning methods and their application to face recognition, several major breakthroughs were obtained. The use of 2D Convolution Neural Networks (2D CNNs) in face recognition surpassed human face recognition accuracy and reached 99%. Still, robust face recognition in the presence of real-world conditions such as variation in resolution, illumination and pose is a major challenge for researchers in face recognition. In this work, we used video as input to 3D CNN architectures to capture both spatial and time-domain information from the video for face recognition in real-world environments. For the purpose of experimentation, we developed our own video dataset, called the CVBL video dataset. The use of 3D CNNs for face recognition in videos shows promising results, with DenseNets performing best with an accuracy of 97% on the CVBL dataset.	Translated: 2021-02-03 16:44:18 Published: 2021-02-02
# Learning to Segment Human Body Parts with Synthetically Trained Deep Convolutional Networks ( http://arxiv.org/abs/2102.01460v1 ) License: see linked page	Alessandro Saviolo, Matteo Bonotto, Daniele Evangelista, Marco Imperoli, Emanuele Menegatti and Alberto Pretto	This paper presents a new framework for human body part segmentation based on Deep Convolutional Neural Networks trained using only synthetic data. The proposed approach achieves cutting-edge results without the need to train the models with real annotated data of human body parts. Our contributions include a data generation pipeline that exploits a game engine to create the synthetic data used for training the network, and a novel pre-processing module that combines edge response maps and adaptive histogram equalization to guide the network to learn the shape of the human body parts, ensuring robustness to changes in illumination conditions. To select the best candidate architecture, we performed exhaustive tests on manually annotated images of real human body limbs. We further present an ablation study to validate our pre-processing module. The results show that our method outperforms several state-of-the-art semantic segmentation networks by a large margin. We release an implementation of the proposed approach along with the acquired datasets with this paper.	Translated: 2021-02-03 16:43:40 Published: 2021-02-02
# From Culture to Clothing: Discovering the World Events Behind A Century of Fashion Images ( http://arxiv.org/abs/2102.01690v1 ) License: see linked page	Wei-Lin Hsiao, Kristen Grauman	Fashion is intertwined with external cultural factors, but identifying these links remains a manual process limited to only the most salient phenomena. We propose a data-driven approach to identify specific cultural factors affecting the clothes people wear. Using large-scale datasets of news articles and vintage photos spanning a century, we introduce a multi-modal statistical model to detect influence relationships between happenings in the world and people's choice of clothing. Furthermore, we apply our model to improve the concrete vision tasks of visual style forecasting and photo timestamping on two datasets. Our work is a first step towards a computational, scalable, and easily refreshable approach to link culture to clothing.	Translated: 2021-02-03 16:43:04 Published: 2021-02-02
# Detection of Racial Bias from Physiological Responses ( http://arxiv.org/abs/2102.01287v1 ) License: see linked page	Fateme Nikseresht, Runze Yan, Rachel Lew, Yingzheng Liu, Rose M. Sebastian, Afsaneh Doryab	Despite the evolution of norms and regulations to mitigate the harm from biases, harmful discrimination linked to an individual's unconscious biases persists. Our goal is to better understand and detect the physiological and behavioral indicators of implicit biases. This paper investigates whether we can reliably detect racial bias from physiological responses, including heart rate, conductive skin response, skin temperature, and micro-body movements. We analyzed data from 46 subjects whose physiological data was collected with an Empatica E4 wristband while they took an Implicit Association Test (IAT). Our machine learning and statistical analysis show that implicit bias can be predicted from physiological signals with 76.1% accuracy. Our results also show that the EDA signal associated with skin response has the strongest correlation with racial bias and that there are significant differences between the values of EDA features for biased and unbiased participants.	Translated: 2021-02-03 16:41:28 Published: 2021-02-02
# Gaze-based dual resolution deep imitation learning for high-precision dexterous robot manipulation ( http://arxiv.org/abs/2102.01295v1 ) License: see linked page	Heecheol Kim, Yoshiyuki Ohmura, and Yasuo Kuniyoshi	A high-precision manipulation task, such as needle threading, is challenging. Physiological studies have proposed connecting low-resolution peripheral vision and fast movement to transport the hand into the vicinity of an object, and using high-resolution foveated vision to achieve the accurate homing of the hand to the object. The results of this study demonstrate that a deep imitation learning-based method, inspired by the gaze-based dual resolution visuomotor control system in humans, can solve the needle threading task. First, we recorded the gaze movements of a human operator who was teleoperating a robot. Then, we used only a high-resolution image around the gaze to precisely control the thread position when it was close to the target. We used a low-resolution peripheral image to reach the vicinity of the target. The experimental results obtained in this study demonstrate that the proposed method enables precise manipulation tasks using a general-purpose robot manipulator and improves computational efficiency.	Translated: 2021-02-03 16:40:50 Published: 2021-02-02
# QoS-Aware Power Minimization of Distributed Many-Core Servers using Transfer Q-Learning ( http://arxiv.org/abs/2102.01348v1 ) License: see linked page	Dainius Jenkus, Fei Xia, Rishad Shafik, Alex Yakovlev	Web servers scaled across distributed systems necessitate complex runtime controls for providing quality of service (QoS) guarantees as well as minimizing energy costs under dynamic workloads. This paper presents a QoS-aware runtime controller that uses horizontal scaling (node allocation) and vertical scaling (resource allocation within nodes) synergistically to adapt to workloads while minimizing power consumption under a QoS constraint (i.e., response time). Horizontal scaling determines the number of active nodes based on workload demands and the required QoS according to a set of rules. It is then coupled with vertical scaling using transfer Q-learning, which further tunes power/performance based on the workload profile using dynamic voltage/frequency scaling (DVFS). It transfers Q-values within minimally explored states, reducing exploration requirements. In addition, the approach exploits the scalable architecture of the many-core server, allowing the reuse of available knowledge from fully or partially explored nodes. When combined, these methods reduce the exploration time and QoS violations compared to model-free Q-learning. The technique balances design-time and runtime costs to maximize portability and operational optimality, demonstrated through persistent power reductions with minimal QoS violations under different workload scenarios on heterogeneous multi-processing nodes of a server cluster.	Translated: 2021-02-03 16:40:14 Published: 2021-02-02
# Modular approach to data preprocessing in ALOHA and application to a smart industry use case ( http://arxiv.org/abs/2102.01349v1 ) License: see linked page	Cristina Chesta, Luca Rinelli	Applications in the smart industry domain, such as interaction with collaborative robots using vocal commands or machine vision systems, often require the deployment of deep learning algorithms on heterogeneous low-power computing platforms. The availability of software tools and frameworks to automate different design steps can support the effective implementation of DL algorithms on embedded systems, reducing related effort and costs. One very important aspect for the acceptance of the framework is its extensibility, i.e. the capability to accommodate different datasets and define customized preprocessing without requiring advanced skills. The paper addresses a modular approach, integrated into the ALOHA tool flow, to support the data preprocessing and transformation pipeline. This is realized through customizable plugins and allows the easy extension of the tool flow to encompass new use cases. To demonstrate the effectiveness of the approach, we present some experimental results related to a keyword spotting use case and we outline possible extensions to different use cases.	Translated: 2021-02-03 16:39:31 Published: 2021-02-02
# Subdimensional Expansion for Multi-objective Multi-agent Path Finding ( http://arxiv.org/abs/2102.01353v1 ) License: see linked page	Zhongqiang Ren, Sivakumar Rathinam and Howie Choset	Conventional multi-agent path planners typically determine a path that optimizes a single objective, such as path length. Many applications, however, may require multiple objectives, say time-to-completion and fuel use, to be simultaneously optimized in the planning process. Often, these criteria may not be readily compared and sometimes lie in competition with each other. Simply applying standard multi-objective search algorithms to multi-agent path finding may prove to be inefficient because the size of the space of possible solutions, i.e., the Pareto-optimal set, can grow exponentially with the number of agents (the dimension of the search space). This paper presents an approach that bypasses this so-called curse of dimensionality by leveraging our prior multi-agent work with a framework called subdimensional expansion. One example of subdimensional expansion, when applied to A*, is called M*, and M* was limited to a single objective function. We combine principles of dominance and subdimensional expansion to create a new algorithm named multi-objective M* (MOM*), which dynamically couples agents for planning only when those agents have to "interact" with each other. MOM* computes the complete Pareto-optimal set for multiple agents efficiently and naturally trades off sub-optimal approximations of the Pareto-optimal set and computational efficiency. Our approach is able to find the complete Pareto-optimal set for problem instances with hundreds of solutions which the standard multi-objective A* algorithms could not find within a bounded time.	Translated: 2021-02-03 16:38:55 Published: 2021-02-02
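The abstract above relies on the notion of dominance between multi-objective costs. As a minimal sketch of that notion only (not of MOM* or subdimensional expansion), the following filters a list of cost vectors down to its Pareto-optimal subset, assuming every objective is minimized; the function names and the example costs are illustrative.

```python
def dominates(a, b):
    """True if cost vector a is no worse than b in every objective and strictly better in one."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def pareto_front(costs):
    """Keep the cost vectors that no other vector in the list dominates."""
    return [c for c in costs if not any(dominates(other, c) for other in costs)]


# Hypothetical (time-to-completion, fuel use) costs of five candidate joint paths.
candidates = [(10, 5), (8, 7), (9, 9), (12, 4), (8, 7.5)]
print(pareto_front(candidates))
```

This pairwise filter is quadratic in the number of candidates, which is fine for illustration; the point of the paper is that the set itself can grow exponentially with the number of agents, so a search algorithm has to prune dominated partial solutions as it goes.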
# Unassisted Noise Reduction of Chemical Reaction Data Sets ( http://arxiv.org/abs/2102.01399v1 ) License: see linked page	Alessandra Toniato, Philippe Schwaller, Antonio Cardinale, Joppe Geluykens and Teodoro Laino	Existing deep learning models applied to reaction prediction in organic chemistry can reach high levels of accuracy (> 90% for Natural Language Processing-based ones). With no chemical knowledge embedded other than the information learnt from reaction data, the quality of the data sets plays a crucial role in the performance of the prediction models. Because human curation is prohibitively expensive, unaided approaches that remove chemically incorrect entries from existing data sets are essential to improve the performance of artificial intelligence models in synthetic chemistry tasks. Here we propose a machine learning-based, unassisted approach to remove chemically wrong entries from chemical reaction collections. We applied this method to the Pistachio collection of chemical reactions and to an open data set, both extracted from USPTO (United States Patent and Trademark Office) patents. Our results show an improved prediction quality for models trained on the cleaned and balanced data sets. For the retrosynthetic models, the round-trip accuracy metric grows by 13 percentage points and the value of the cumulative Jensen-Shannon divergence decreases by 30% compared to its original record. The coverage remains high at 97%, and the value of the class diversity is not affected by the cleaning. The proposed strategy is the first unassisted rule-free technique to address automatic noise reduction in chemical data sets.	Translated: 2021-02-03 16:38:10 Published: 2021-02-02
# Guidance on the Assurance of Machine Learning in Autonomous Systems (AMLAS) ( http://arxiv.org/abs/2102.01564v1 ) License: see linked page	Richard Hawkins, Colin Paterson, Chiara Picardi, Yan Jia, Radu Calinescu and Ibrahim Habli	Machine Learning (ML) is now used in a range of systems with results that are reported to exceed, under certain conditions, human performance. Many of these systems, in domains such as healthcare, automotive and manufacturing, exhibit high degrees of autonomy and are safety critical. Establishing justified confidence in ML forms a core part of the safety case for these systems. In this document we introduce a methodology for the Assurance of Machine Learning for use in Autonomous Systems (AMLAS). AMLAS comprises a set of safety case patterns and a process for (1) systematically integrating safety assurance into the development of ML components and (2) generating the evidence base for explicitly justifying the acceptable safety of these components when integrated into autonomous system applications.	Translated: 2021-02-03 16:37:31 Published: 2021-02-02
# The impact of external innovation on new drug approvals: A retrospective analysis ( http://arxiv.org/abs/2102.01260v1 ) License: see linked page	Xiong Liu, Craig E. Thomas, Christian C. Felder	Pharmaceutical companies are relying more often on external sources of innovation to boost their discovery research productivity. However, more in-depth knowledge about how external innovation may translate to successful product launches is still required in order to better understand how to best leverage the innovation ecosystem. We analyzed the pre-approval publication histories for FDA-approved new molecular entities (NMEs) and new biologic entities (NBEs) launched by 13 top research pharma companies during the last decade (2006-2016). We found that academic institutions contributed the majority of pre-approval publications and that publication subject matter is closely aligned with the strengths of the respective innovator. We found this to also be true for candidate drugs terminated in Phase 3, but the volume of literature on these molecules is substantially less than for approved drugs. This may suggest that approved drugs are often associated with a more robust dataset provided by a large number of institutes. Collectively, the results of our analysis support the hypothesis that a collaborative research innovation environment spanning academia, industry and government is highly conducive to successful drug approvals.	Translated: 2021-02-03 16:35:39 Published: 2021-02-02
# The Hitachi-JHU DIHARD III System: Competitive End-to-End Neural Diarization and X-Vector Clustering Systems Combined by DOVER-Lap ( http://arxiv.org/abs/2102.01363v1 ) License: see linked page	Shota Horiguchi, Nelson Yalta, Paola Garcia, Yuki Takashima, Yawen Xue, Desh Raj, Zili Huang, Yusuke Fujita, Shinji Watanabe, Sanjeev Khudanpur	This paper provides a detailed description of the Hitachi-JHU system that was submitted to the Third DIHARD Speech Diarization Challenge. The system outputs the ensemble results of five subsystems: two x-vector-based subsystems, two end-to-end neural diarization-based subsystems, and one hybrid subsystem. We refined each system so that all five subsystems became competitive and complementary. After the DOVER-Lap based system combination, it achieved diarization error rates of 11.58% and 14.09% in Track 1 full and core, and 16.94% and 20.01% in Track 2 full and core, respectively. With these results, we won second place in all the tasks of the challenge.	Translated: 2021-02-03 16:34:57 Published: 2021-02-02
# Bayesian Neural Networks for Virtual Flow Metering: An Empirical Study ( http://arxiv.org/abs/2102.01391v1 ) License: see linked page	Bjarne Grimstad, Mathilde Hotvedt, Anders T. Sandnes, Odd Kolbjørnsen, Lars S. Imsland	Recent works have presented promising results from the application of machine learning (ML) to the modeling of flow rates in oil and gas wells. The encouraging results combined with advantageous properties of ML models, such as computationally cheap evaluation and ease of calibration to new data, have sparked optimism for the development of data-driven virtual flow meters (VFMs). We contribute to this development by presenting a probabilistic VFM based on a Bayesian neural network. We consider homoscedastic and heteroscedastic measurement noise, and show how to train the models using maximum a posteriori estimation and variational inference. We study the methods by modeling on a large and heterogeneous dataset, consisting of 60 wells across five different oil and gas assets. The predictive performance is analyzed on historical and future test data, where we achieve an average error of 5-6% and 9-13% for the 50% best performing models, respectively. Variational inference appears to provide more robust predictions than the reference approach on future data. The difference in prediction performance and uncertainty on historical and future data is explored in detail, and the findings motivate the development of alternative strategies for data-driven VFM.	Translated: 2021-02-03 16:32:26 Published: 2021-02-02
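The virtual flow metering abstract distinguishes homoscedastic from heteroscedastic measurement noise. As a minimal sketch of that distinction, assuming a Gaussian likelihood, the function below computes the negative log-likelihood of observations given predicted means and variances: passing one shared variance corresponds to the homoscedastic case, passing a separate variance per observation to the heteroscedastic case. The function names are illustrative and not taken from the paper.

```python
import math


def gaussian_nll(y, mean, variance):
    """Negative log-likelihood of one observation under N(mean, variance)."""
    return 0.5 * (math.log(2.0 * math.pi * variance) + (y - mean) ** 2 / variance)


def dataset_nll(targets, means, variances):
    """Sum of per-observation negative log-likelihoods.

    Homoscedastic noise: every entry of variances is the same value.
    Heteroscedastic noise: each observation carries its own variance.
    """
    return sum(gaussian_nll(y, m, v) for y, m, v in zip(targets, means, variances))
```

With per-observation variances, a large residual is penalized less when the model also reports a large variance for that point, at the price of the log-variance term; this trade-off is what lets such a model express input-dependent uncertainty.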
# Robust data-driven discovery of partial differential equations with time-dependent coefficients ( http://arxiv.org/abs/2102.01432v1 ) License: see linked page	Aoxue Chen, Guang Lin	In this work, we propose a robust Bayesian sparse learning algorithm based on Bayesian group Lasso with spike and slab priors for the discovery of partial differential equations with variable coefficients. Using the samples drawn from the posterior distribution with a Gibbs sampler, we are able to estimate the values of coefficients, together with their standard errors and confidence intervals. Apart from constructing the error bars, uncertainty quantification can also be employed for designing new criteria of model selection and threshold setting. This makes our method more adjustable and robust in learning equations with time-dependent coefficients. Three criteria are introduced for model selection and threshold setting to identify the correct terms: the root mean square, total error bar, and group error bar. Moreover, three noise filters are integrated with the robust Bayesian sparse learning algorithm for better results with larger noise. Numerical results on three examples demonstrate that our method is more robust than sequential grouped threshold ridge regression and group Lasso in noisy situations.	Translated: 2021-02-03 16:31:44 Published: 2021-02-02
# 大次元縦型バイオマーカー履歴による臨床エンドポイントの個人的動的予測--ランドマークアプローチ Individual dynamic prediction of clinical endpoint from large dimensional longitudinal biomarker history: a landmark approach ( http://arxiv.org/abs/2102.01466v1 ) ライセンス: Link先を確認	Anthony Devaux (BPH), Robin Genuer (BPH, SISTM), Karine P\'er\`es (BPH), C\'ecile Proust-Lima (BPH)	(参考訳) 患者のフォローアップを通じて収集された個々のデータは、臨床イベントのリスクを評価し、最終的に治療戦略を適応するための重要な情報です。反復測度から1つまたは2つのマーカーへの個々の動的予測を計算するために、ジョイントモデルとランドマークモデルが提案されている。しかし、完全な患者の履歴がはるかに繰り返しのマーカーを含むケースにはほとんど拡張されません。そこで我々は, 多量のマーカーの繰り返し測定を応用して, 健康イベントを動的に予測する手法を提案することを目標とした。内因性マーカー履歴に拡張したランドマークアプローチと,サバイバルデータに適応した機械学習手法を組み合わせた。各マーカー軌跡はランドマーク時間まで収集された情報を用いてモデル化され、個々の軌跡を最も捉えた要約変数が導出される。これらの要約と追加の共変量は、異なる予測方法に含まれる。大規模な次元履歴を扱うためには、正規化レグレッションやランダムサバイバル森林といった生存データに適応した機械学習手法を用いて、ランドマーク時間からイベントを予測し、それらをスーパーラーナーにどのように組み合わせるかを示す。そして、ブリアスコアの推定値と検閲データに適応した受信者操作特性曲線下の領域を用いて、クロスバリデーションによりパフォーマンスを評価する。特に,予測者と事象との間に多数の非線形関係が存在する場合において,機械学習サバイバル手法の標準生存モデルに対する利点をシミュレーションで実証する。そこで本研究では, 原発性胆管炎患者に対する死亡予測の臨床的コンテキストと, 一般高齢者における死亡予測との公衆衛生的コンテキストの2つの予測条件を適用した。 Rで実施した手法により,繰り返しマーカーの数が多い場合でも,患者の縦断的履歴全体を用いた事象の予測が可能となった。繰り返しマーカーの混合モデルや単一の正しい検閲されたイベントのための方法が導入されたが、この方法はマーカーの他の適切なモデリング技術で使用することができ、競合するリスク設定に容易に拡張することができる。 The individual data collected throughout patient follow-up constitute crucial information for assessing the risk of a clinical event, and eventually for adapting a therapeutic strategy. Joint models and landmark models have been proposed to compute individual dynamic predictions from repeated measures to one or two markers. However, they hardly extend to the case where the complete patient history includes much more repeated markers possibly. Our objective was thus to propose a solution for the dynamic prediction of a health event that may exploit repeated measures of a possibly large number of markers. We combined a landmark approach extended to endogenous markers history with machine learning methods adapted to survival data. 
Each marker trajectory is modeled using the information collected up to landmark time, and summary variables that best capture the individual trajectories are derived. These summaries and additional covariates are then included in different prediction methods. To handle a possibly large dimensional history, we rely on machine learning methods adapted to survival data, namely regularized regressions and random survival forests, to predict the event from the landmark time, and we show how they can be combined into a superlearner. Then, the performances are evaluated by cross-validation using estimators of Brier Score and the area under the Receiver Operating Characteristic curve adapted to censored data. We demonstrate in a simulation study the benefits of machine learning survival methods over standard survival models, especially in the case of numerous and/or nonlinear relationships between the predictors and the event. We then applied the methodology in two prediction contexts: a clinical context with the prediction of death for patients with primary biliary cholangitis, and a public health context with the prediction of death in the general elderly population at different ages. Our methodology, implemented in R, enables the prediction of an event using the entire longitudinal patient history, even when the number of repeated markers is large. Although introduced with mixed models for the repeated markers and methods for a single right censored time-to-event, our method can be used with any other appropriate modeling technique for the markers and can be easily extended to competing risks setting.	翻訳日:2021-02-03 16:31:09 公開日:2021-02-02
# 連続時間における合成制御を用いた政策分析 Policy Analysis using Synthetic Controls in Continuous-Time ( http://arxiv.org/abs/2102.01577v1 ) ライセンス: Link先を確認	Alexis Bellot, Mihaela van der Schaar	(参考訳) 合成制御を用いた反実用推定は、因果推論における最も成功した最近の方法論発展の1つである。現在の記述では、その人気にもかかわらず、時間系列は単位と観測された制御単位の線形組み合わせとして表現された合成制御にまたがるだけである。本論文では,制御微分方程式の形式化を用いて,潜在反実パスを明示的にモデル化する連続時間代替法を提案する。このモデルは不規則に整合した多変量時系列の一般的な設定に直接適用でき、リッチな関数空間に最適化される可能性がある。 Counterfactual estimation using synthetic controls is one of the most successful recent methodological developments in causal inference. Despite its popularity, the current description only considers time series aligned across units and synthetic controls expressed as linear combinations of observed control units. We propose a continuous-time alternative that models the latent counterfactual path explicitly using the formalism of controlled differential equations. This model is directly applicable to the general setting of irregularly-aligned multivariate time series and may be optimized in rich function spaces -- thereby improving on some limitations of existing approaches.	翻訳日:2021-02-03 16:30:13 公開日:2021-02-02
# FINNを用いたFPGA上の量子ニューラルネットワークのベンチマーク Benchmarking Quantized Neural Networks on FPGAs with FINN ( http://arxiv.org/abs/2102.01341v1 ) ライセンス: Link先を確認	Quentin Ducasse, Pascal Cotret, Lo\"ic Lagadec, Robert Stewart	(参考訳) 最先端のニューラルネットワークのトレーニングと推論の両方のコストの増大は、正確性に最小限の影響を伴って使用するリソースを削減する方法を文学的に見直すことになった。精度を下げるには、精度の低下を無視するコストがかかる。ニューラルネットワークのトレーニングには強力なセットアップが必要だが、低電力と低リソースのハードウェアアーキテクチャでネットワークをデプロイできる必要がある。再構成可能なアーキテクチャは、特定のアプリケーションを見る場合、GPUよりも強力で柔軟なことが証明されている。本稿では、FPGA上に展開されたニューラルネットワークに適用した場合の混合精度の影響を評価することを目的とする。ニューラルネットワークを低精度でデプロイするツールを作成するフレームワークはいくつか存在するが、量子化の重要性とフレームワークの品質を評価するものはほとんどない。 Xilinxラボの2つのフレームワークであるFINNとBrevitasを使用して、2から8ビットの精度と複数の並列化構成の重みを使用して、ニューラルネットワークに対する量子化の影響を評価します。精度の低い表現と十分なトレーニングで等価な精度を得ることができます。しかし、圧縮されたネットワークはより並列化され、ネットワークのスループットが62倍高速になる。この作業で設定されたベンチマークは、パブリックリポジトリ(https://github.com/QDucasse/nnベンチマーク)で利用できる。 The ever-growing cost of both training and inference for state-of-the-art neural networks has brought literature to look upon ways to cut off resources used with a minimal impact on accuracy. Using lower precision comes at the cost of negligible loss in accuracy. While training neural networks may require a powerful setup, deploying a network must be possible on low-power and low-resource hardware architectures. Reconfigurable architectures have proven to be more powerful and flexible than GPUs when looking at a specific application. This article aims to assess the impact of mixed-precision when applied to neural networks deployed on FPGAs. While several frameworks exist that create tools to deploy neural networks using reduced-precision, few of them assess the importance of quantization and the framework quality. FINN and Brevitas, two frameworks from Xilinx labs, are used to assess the impact of quantization on neural networks using 2 to 8 bit precisions and weights with several parallelization configurations. Equivalent accuracy can be obtained using lower-precision representation and enough training. 
However, the compressed network can be better parallelized allowing the deployed network throughput to be 62 times faster. The benchmark set up in this work is available in a public repository (https://github.com/QDucasse/nn benchmark).	翻訳日:2021-02-03 16:28:59 公開日:2021-02-02
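The abstract above evaluates weights quantized to between 2 and 8 bits. As a rough illustration of what reducing weight precision means, here is a minimal sketch of symmetric uniform quantization in plain Python. It is a generic textbook scheme, not the quantizers used by FINN or Brevitas, and the function name `quantize_uniform` and the sample weights are assumptions made for this example only.

```python
def quantize_uniform(weights, bits):
    """Round weights to a signed `bits`-bit uniform grid and map them back to floats.

    Hypothetical helper for illustration; not part of FINN or Brevitas.
    """
    if bits < 2:
        raise ValueError("bits must be at least 2 for a signed grid")
    q_max = 2 ** (bits - 1) - 1          # largest integer level, e.g. 127 for 8 bits
    max_abs = max(abs(w) for w in weights)
    if max_abs == 0:
        return list(weights)             # nothing to scale
    scale = max_abs / q_max              # width of one quantization step
    quantized = []
    for w in weights:
        level = round(w / scale)                  # nearest integer level
        level = max(-q_max, min(q_max, level))    # clamp to the representable range
        quantized.append(level * scale)           # back to a float value
    return quantized


def max_error(weights, bits):
    """Largest absolute difference between original and quantized weights."""
    return max(abs(w - q) for w, q in zip(weights, quantize_uniform(weights, bits)))


sample_weights = [0.9, -0.4, 0.1, 0.0]   # made-up values
for bits in (2, 4, 8):
    print(bits, quantize_uniform(sample_weights, bits), max_error(sample_weights, bits))
```

With fewer bits the grid is coarser, so the rounding error grows. The abstract's claim is that enough training can compensate for this loss while the smaller representation parallelizes better on the FPGA.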
# 対向ロバストネスのための対向訓練の最近の進歩 Recent Advances in Adversarial Training for Adversarial Robustness ( http://arxiv.org/abs/2102.01356v1 ) ライセンス: Link先を確認	Tao Bai, Jinqi Luo, Jun Zhao, Bihan Wen	(参考訳) ディープラーニングモデルをだますための逆例は、数年前から研究されており、まだホットなトピックです。敵の訓練も、敵の例を守る効果から大きな注目を集めている。しかし、敵の訓練は完璧ではなく、解決すべき問題が多い。過去数年間、このコミュニティの研究者は様々な側面から敵の訓練を研究し、議論してきた。敵対的訓練の多くの新しい理論と理解が提案されている。本研究は, 敵意訓練の最近の進歩を, 異なる改善によって分類し, 初めて体系的に検討するものである。次に, 対人訓練における一般化問題について3つの視点から考察する。最後に,未解決の課題を浮き彫りにして,今後の方向性について述べる。 Adversarial examples for fooling deep learning models have been studied for several years and are still a hot topic. Adversarial training also receives enormous attention because of its effectiveness in defending against adversarial examples. However, adversarial training is not perfect, and many questions about it remain unsolved. During the last few years, researchers in this community have studied and discussed adversarial training from various aspects. Many new theories and understandings of adversarial training have been proposed. In this survey, we systematically review, for the first time, the recent progress on adversarial training, categorized by different improvements. Then we discuss the generalization problems in adversarial training from three perspectives. Finally, we highlight the challenges which are not fully solved and present potential future directions.	翻訳日:2021-02-03 16:28:18 公開日:2021-02-02
# LSTM-Recurrent Neural Networksを用いた車線変化までの時間予測 Predicting the Time Until a Vehicle Changes the Lane Using LSTM-based Recurrent Neural Networks ( http://arxiv.org/abs/2102.01431v1 ) ライセンス: Link先を確認	Florian Wirthm\"uller, Marvin Klimke, Julian Schlechtriemen, Jochen Hipp and Manfred Reichert	(参考訳) 高速道路における自動運転車の安全で快適な軌道計画には,交通状況の正確な予測が必要である。これまでのところ、車線変更が実際に起こる時点を推定するよりも、車線変更操作の検出に多くの研究が費やされてきた。しかし実際には、この時間情報はもっと役に立つかもしれない。本論文では,長期記憶型リカレントニューラルネットワークを用いて,高速道路における周辺車両の次の車線変化の時間を正確に予測するシステムの開発について述べる。大規模実世界のデータセットに基づく広範な評価により,本手法は,最も困難な状況であっても,根平均二乗誤差が0.7秒程度で,信頼性の高い予測を行うことができることが示された。車線変更の3.5秒前の予測は精度が高くなり、中央値の誤差は0.25秒未満である。要約すると、この記事は下流の高精度な位置予測のための基本的なステップを形成します。 To plan safe and comfortable trajectories for automated vehicles on highways, accurate predictions of traffic situations are needed. So far, a lot of research effort has been spent on detecting lane change maneuvers rather than on estimating the point in time a lane change actually happens. In practice, however, this temporal information might be even more useful. This paper deals with the development of a system that accurately predicts the time to the next lane change of surrounding vehicles on highways using long short-term memory-based recurrent neural networks. An extensive evaluation based on a large real-world data set shows that our approach is able to make reliable predictions, even in the most challenging situations, with a root mean squared error around 0.7 seconds. As early as 3.5 seconds prior to lane changes the predictions become highly accurate, showing a median error of less than 0.25 seconds. In summary, this article forms a fundamental step towards highly accurate downstream position predictions.	翻訳日:2021-02-03 16:27:48 公開日:2021-02-02
# 抽象的手法によるマルチエージェント深層補強学習行動の検証 An Abstraction-based Method to Verify Multi-Agent Deep Reinforcement-Learning Behaviours ( http://arxiv.org/abs/2102.01434v1 ) ライセンス: Link先を確認	Pierre El Mqirmi, Francesco Belardinelli and Borja G. Le\'on	(参考訳) マルチエージェント強化学習(RL)は、学習エージェントの安全な動作を保証するためにしばしば苦労するため、一般的には安全クリティカルな応用に適応しない。この問題に対処するために,形式検証と(深度)RLアルゴリズムを組み合わせて,トレーニングとテストの両方において,公式に指定された安全制約の満足度を保証する手法を提案する。私たちが提案するアプローチは、確率計算木論理(PCTL)で検証する制約を表現し、検証ステップの複雑さを減らすためにシステムの抽象表現を構築します。この抽象モデルにより、PCTLで表現される安全制約を満たす抽象ポリシーの集合をモデル検査技術で識別することができる。そして、これらの安全な抽象ポリシーに従ってエージェントの振る舞いが制限される。本手法を用いることで,エージェントの動作が常に安全制約を満たすことを保証し,抽象モデルを自動的に生成する手順を提供する。マルチエージェント環境において,本手法の有効性を実証的に評価し,実証する。 Multi-agent reinforcement learning (RL) often struggles to ensure the safe behaviours of the learning agents, and therefore it is generally not adapted to safety-critical applications. To address this issue, we present a methodology that combines formal verification with (deep) RL algorithms to guarantee the satisfaction of formally-specified safety constraints both in training and testing. The approach we propose expresses the constraints to verify in Probabilistic Computation Tree Logic (PCTL) and builds an abstract representation of the system to reduce the complexity of the verification step. This abstract model allows for model checking techniques to identify a set of abstract policies that meet the safety constraints expressed in PCTL. Then, the agents' behaviours are restricted according to these safe abstract policies. We provide formal guarantees that by using this method, the actions of the agents always meet the safety constraints, and provide a procedure to generate an abstract model automatically. We empirically evaluate and show the effectiveness of our method in a multi-agent environment.	翻訳日:2021-02-03 16:27:14 公開日:2021-02-02
# Entropy-Regularized Deep Reinforcement Learningによる平均フィールドゲームについて Approximately Solving Mean Field Games via Entropy-Regularized Deep Reinforcement Learning ( http://arxiv.org/abs/2102.01585v1 ) ライセンス: Link先を確認	Kai Cui, Heinz Koeppl	(参考訳) 最近の平均場ゲーム(MFG)は、多くのエージェント設定で近似的なナッシュ平衡の難解な計算を容易にする。本稿では,離散時間有限MFGを有限ホリゾン目標とする。非コンスタントな不動点作用素を持つ離散時間有限 MFG は、既存のMFG の文献で通常仮定されるような縮約的でないことを示し、不動点反復による収束を抑える。代わりに、エントロピー規則化とボルツマンポリシーを固定点反復に組み込む。その結果,既存手法が故障する近似不動点に対する証明可能な収束が得られ,nash平衡近似の本来の目標に到達した。提案手法はすべて, 操作可能な厳密解を用いた指導例と, 厳密解が難解な高次元問題の両方について評価されている。高次元シナリオでは、確立された深層強化学習法を適用し、実演と近似を経験的に組み合わせる。 The recent mean field game (MFG) formalism facilitates otherwise intractable computation of approximate Nash equilibria in many-agent settings. In this paper, we consider discrete-time finite MFGs subject to finite-horizon objectives. We show that all discrete-time finite MFGs with non-constant fixed point operators fail to be contractive as typically assumed in existing MFG literature, barring convergence via fixed point iteration. Instead, we incorporate entropy-regularization and Boltzmann policies into the fixed point iteration. As a result, we obtain provable convergence to approximate fixed points where existing methods fail, and reach the original goal of approximate Nash equilibria. All proposed methods are evaluated with respect to their exploitability, on both instructive examples with tractable exact solutions and high-dimensional problems where exact methods become intractable. In high-dimensional scenarios, we apply established deep reinforcement learning methods and empirically combine fictitious play with our approximations.	翻訳日:2021-02-03 16:26:38 公開日:2021-02-02
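The abstract above replaces greedy best responses with Boltzmann policies inside the fixed point iteration. As a small illustration of that single ingredient, the sketch below computes a Boltzmann (softmax) policy from action values in plain Python. The function name, the temperature parameter and the sample values are assumptions for this example; the sketch does not reproduce the paper's mean field fixed point iteration.

```python
import math


def boltzmann_policy(q_values, temperature):
    """Return action probabilities proportional to exp(q / temperature).

    Illustrative helper only; names and parameters are hypothetical.
    """
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    # Subtract the maximum before exponentiating for numerical stability.
    q_max = max(q_values)
    weights = [math.exp((q - q_max) / temperature) for q in q_values]
    total = sum(weights)
    return [w / total for w in weights]


q = [1.0, 2.0, 0.5]                           # made-up action values
print(boltzmann_policy(q, temperature=0.1))   # nearly greedy: mass concentrates on action 1
print(boltzmann_policy(q, temperature=10.0))  # nearly uniform over the three actions
```

A low temperature approaches the greedy policy, whose abrupt switches between actions are what makes plain fixed point iteration hard to converge. A higher temperature smooths the policy map at the price of only approximating the best response.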
# CLIP-Guided Generative Latent Space Search によるキャプションからの画像生成とその逆 Generating images from caption and vice versa via CLIP-Guided Generative Latent Space Search ( http://arxiv.org/abs/2102.01645v1 ) ライセンス: Link先を確認	Federico A. Galatolo and Mario G.C.A. Cimino and Gigliola Vaglini	(参考訳) 本研究では,与えられたキャプション(または画像)に対応する画像(またはキャプション)を生成する新しいゼロショットフレームワークであるGLaSSを提案する。 GLaSSは、画像と記述キャプションが同様の埋め込みを提供するCLIPニューラルネットワークに基づいている。別として、GLaSSは入力としてキャプション(または画像)を取り、CLIP埋め込みが入力に最も近い画像(またはキャプション)を生成します。この最適な画像(またはキャプション)は、遺伝的アルゴリズムによる探索後に生成ネットワークを介して生成される。画像生成器BigGANおよびStyleGAN2の実験とテキスト生成器GPT2の実験に基づいて、推定結果を示す。 In this research work we present GLaSS, a novel zero-shot framework to generate an image (or a caption) corresponding to a given caption (or image). GLaSS is based on the CLIP neural network which, given an image and a descriptive caption, provides similar embeddings. In contrast, GLaSS takes a caption (or an image) as an input, and generates the image (or the caption) whose CLIP embedding is most similar to that of the input. This optimal image (or caption) is produced via a generative network after an exploration by a genetic algorithm. Promising results are shown, based on experiments with the image generators BigGAN and StyleGAN2, and with the text generator GPT2.	翻訳日:2021-02-03 16:26:01 公開日:2021-02-02
# 皮膚病変分類のための不均衡小データセットの単一モデル深層学習 Single Model Deep Learning on Imbalanced Small Datasets for Skin Lesion Classification ( http://arxiv.org/abs/2102.01284v1 ) ライセンス: Link先を確認	Peng Yao, Shuwei Shen, Mengjuan Xu, Peng Liu, Fan Zhang, Jinyu Xing, Pengfei Shao, Benjamin Kaffenberger, and Ronald X. Xu	(参考訳) 深層畳み込みニューラルネットワーク(DCNN)モデルは皮膚疾患の診断のために広く研究されており、そのいくつかは皮膚科医の診断結果と同等かそれ以上に優れている。しかし, 皮膚疾患検出におけるDCNNの広範な実装は, 小さいサイズとデータ不均衡によって妨げられている。本稿では,小・不均衡なデータセットに基づく皮膚病変の単一モデル分類のための新しいデータ拡張戦略を提案する。まず、このデータセット上で様々なDCNNをトレーニングし、適度な複雑さを持つモデルがより大きなモデルより優れていることを示す。第二に、正規化DropOutとDropBlockを追加してオーバーフィッティングを削減し、小さなデータセットのサンプル不足の欠陥に対処するためにModified RandAugment Augmentation戦略を提案します。最後に,不均一なサンプルサイズと分類困難さを克服するために,新しい多重重み付き焦点損失関数を導入した。改良型ランダグメントと複数重み付き焦点損失を単一のDCNNモデルで組み合わせることで,ISIC 2018 challengeテストデータセットにおける複数のセンシングモデルと同等の分類精度を達成した。本研究では, 低リソース環境下での皮膚病変や他の多くの悪性度の自動スクリーニングのためにモバイル機器に実装するのに好適な, 計算リソースと推論時間の低コストで高い分類性能を達成できることを示した。 Deep convolutional neural network (DCNN) models have been widely explored for skin disease diagnosis and some of them have achieved diagnostic outcomes comparable or even superior to those of dermatologists. However, broad implementation of DCNN in skin disease detection is hindered by the small size and data imbalance of the publicly accessible skin lesion datasets. This paper proposes a novel data augmentation strategy for single model classification of skin lesions based on a small and imbalanced dataset. First, various DCNNs are trained on this dataset to show that the models with moderate complexity outperform the larger models. Second, regularization DropOut and DropBlock are added to reduce overfitting and a Modified RandAugment augmentation strategy is proposed to address the defects of sample underrepresentation in the small dataset. Finally, a novel Multi-Weighted Focal Loss function is introduced to overcome the challenge of uneven sample size and classification difficulty. 
By combining Modified RandAugment and Multi-weighted Focal Loss in a single DCNN model, we have achieved the classification accuracy comparable to those of multiple ensembling models on the ISIC 2018 challenge test dataset. Our study shows that this method is able to achieve a high classification performance at a low cost of computational resources and inference time, potentially suitable to implement in mobile devices for automated screening of skin lesions and many other malignancies in low resource settings.	翻訳日:2021-02-03 16:25:27 公開日:2021-02-02
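The abstract above names a Multi-Weighted Focal Loss but does not give its formula. For orientation only, the sketch below implements the standard class-weighted focal loss for a single sample, which down-weights easy examples through the factor (1 - p_t)^gamma. This is the widely used baseline form, not the authors' multi-weighted variant, and the function name, class weights and probabilities are assumptions for this example.

```python
import math


def focal_loss(probs, target, class_weights, gamma=2.0, eps=1e-12):
    """Class-weighted focal loss for one sample.

    probs: predicted class probabilities, target: index of the true class,
    class_weights: per-class weights (larger for rare classes).
    Standard form only; the paper's Multi-Weighted Focal Loss is not reproduced here.
    """
    p_t = max(probs[target], eps)         # probability assigned to the true class
    modulating = (1.0 - p_t) ** gamma     # close to 0 for confident, correct predictions
    return -class_weights[target] * modulating * math.log(p_t)


weights = [1.0, 3.0]                      # made-up weights: class 1 is the rare class
easy = focal_loss([0.9, 0.1], target=0, class_weights=weights)
hard = focal_loss([0.9, 0.1], target=1, class_weights=weights)
print(easy, hard)                         # the misclassified rare sample dominates the loss
```

Setting `gamma=0` and all weights to 1 recovers ordinary cross-entropy, which makes the effect of each term easy to check in isolation.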
# 積分画像と積分ヒストグラムに基づく移動端トーンマッピング Mobile-end Tone Mapping based on Integral Image and Integral Histogram ( http://arxiv.org/abs/2102.01289v1 ) ライセンス: Link先を確認	Jie Yang, Mengchen Lin, Ziyi Liu, Ulian Shahnovich, Orly Yadid-Pecht	(参考訳) 広いダイナミックレンジ(WDR)の画像トーンマッピングは、フィルム制作、セキュリティ監視、写真撮影など多くのアプリケーションで高い需要があります。今日の画像のほとんどは携帯電話からのものであるため、モバイルデバイスにとって特に重要です。そのため、そのような技術はモバイルデバイスの消費者市場で非常に要求され、優れた顧客体験のために不可欠です。しかし、高品質で高性能なWDR画像トーンマッピングの実装はモバイル端末ではほとんど見られない。本稿では,高性能なモバイル用WDR画像トーンマッピングの実装について紹介する。複数の受信フィールドのトーンマッピング結果を活用し、各ピクセルに適した値を算出する。積分画像と積分ヒストグラムの利用は必要な計算量を大幅に削減する。さらに、GPU並列計算を用いて処理速度を向上する。実験結果から,モバイルデバイス上で1秒以内に高解像度のWDR画像を処理し,画像品質を向上できることが示唆された。 Wide dynamic range (WDR) image tone mapping is in high demand in many applications like film production, security monitoring, and photography. It is especially crucial for mobile devices: most images taken today come from mobile phones, so such technology is in high demand in the consumer market of mobile devices and is essential for a good customer experience. However, high-quality and high-performance WDR image tone mapping implementations are rarely found on the mobile end. In this paper, we introduce a high-performance, mobile-end WDR image tone mapping implementation. It leverages the tone mapping results of multiple receptive fields and calculates a suitable value for each pixel. The use of an integral image and an integral histogram significantly reduces the required computation. Moreover, GPU parallel computation is used to increase the processing speed. The experimental results indicate that our implementation can process a high-resolution WDR image within a second on mobile devices and produce appealing image quality.	翻訳日:2021-02-03 16:24:42 公開日:2021-02-02
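The abstract above credits the integral image with cutting the computation needed for tone mapping. The sketch below shows the general idea in plain Python: after one pass over the image, the sum over any rectangle costs four lookups regardless of its size. It is a generic summed-area table, not the authors' GPU implementation, and the function names and the 3x3 sample image are assumptions for this example.

```python
def integral_image(img):
    """Build a summed-area table with an extra leading row and column of zeros.

    sat[y][x] holds the sum of img over rows < y and columns < x.
    """
    height, width = len(img), len(img[0])
    sat = [[0] * (width + 1) for _ in range(height + 1)]
    for y in range(height):
        row_sum = 0                       # running sum of the current row
        for x in range(width):
            row_sum += img[y][x]
            sat[y + 1][x + 1] = sat[y][x + 1] + row_sum
    return sat


def box_sum(sat, top, left, bottom, right):
    """Sum of img over rows top..bottom-1 and columns left..right-1 in O(1)."""
    return sat[bottom][right] - sat[top][right] - sat[bottom][left] + sat[top][left]


image = [[1, 2, 3],
         [4, 5, 6],
         [7, 8, 9]]                       # made-up pixel values
table = integral_image(image)
print(box_sum(table, 0, 0, 3, 3))         # whole image
print(box_sum(table, 1, 1, 3, 3))         # bottom-right 2x2 block
```

Local averages over several window sizes, such as the multiple receptive fields mentioned in the abstract, can then be read from the same table without re-scanning the pixels. An integral histogram applies the same construction per histogram bin.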
# ロバストハッシングによる偽画像検出 Fake-image detection with Robust Hashing ( http://arxiv.org/abs/2102.01313v1 ) ライセンス: Link先を確認	Miki Tanaka, Kiya Hitoshi	(参考訳) 本稿では,JPEG圧縮などの複数の操作手法が画像に適用された場合においても,ロバストハッシュがフェイクイメージを堅牢に検出できるかどうかを初めて検討する。実験では,GANで生成した偽画像を含む各種データセットを用いて,ロバストなハッシュによる偽検出が最先端のものよりも優れていることを示す。 In this paper, we investigate for the first time whether robust hashing can robustly detect fake images even when multiple manipulation techniques such as JPEG compression are applied to the images. In an experiment, the proposed fake detection with robust hashing is demonstrated to outperform a state-of-the-art method on various datasets including fake images generated with GANs.	翻訳日:2021-02-03 16:24:10 公開日:2021-02-02
# IoT用Ultra-Low-Power視覚センサによるエネルギー効率向上機械学習 Enabling energy efficient machine learning on a Ultra-Low-Power vision sensor for IoT ( http://arxiv.org/abs/2102.01340v1 ) ライセンス: Link先を確認	Francesco Paissan, Massimo Gottardi, Elisabetta Farella	(参考訳) IoT(Internet of Things)とスマートシティのパラダイムには、ユーザと市民に有用なサービスを返却するためにコンテキスト情報を抽出するユビキタス技術が含まれている。このシナリオにおいて重要な役割はコンピュータビジョンアプリケーションによって行われ、特定のデバイスから画像を取得する必要がある。ハイエンドカメラの必要性は、電力消費と高い計算資源の処理を要求するため、このプロセスにペナルティを課すことが多い。したがって、ハードウェア内モーション検出などの高度な機能を実装した新しい低消費電力視覚センサは、iot領域のコンピュータビジョンに不可欠である。残念なことに、エネルギー効率が高いため、これらのセンサーは知覚性能(解像度、フレームレート、色など)を悪化させる可能性がある。したがって、ドメイン固有のパイプラインは通常、これらのカメラの潜在能力を最大限活用するために配信される。本稿では,背景フィルタリングスマートビジョンセンサ(svs)のポテンシャルを最大限活用できるリアルタイム検出,分類,追跡パイプラインの開発,解析,実装について述べる。 8msの推算で得られる電力消費量は7.5mWである。 The Internet of Things (IoT) and smart city paradigm includes ubiquitous technology to extract context information in order to return useful services to users and citizens. An essential role in this scenario is often played by computer vision applications, requiring the acquisition of images from specific devices. The need for high-end cameras often penalizes this process since they are power-hungry and ask for high computational resources to be processed. Thus, the availability of novel low-power vision sensors, implementing advanced features like in-hardware motion detection, is crucial for computer vision in the IoT domain. Unfortunately, to be highly energy-efficient, these sensors might worsen the perception performance (e.g., resolution, frame rate, color). Therefore, domain-specific pipelines are usually delivered in order to exploit the full potential of these cameras. This paper presents the development, analysis, and embedded implementation of a realtime detection, classification and tracking pipeline able to exploit the full potential of background filtering Smart Vision Sensors (SVS). The power consumption obtained for the inference - which requires 8ms - is 7.5 mW.	翻訳日:2021-02-03 16:23:44 公開日:2021-02-02
# FPGA用ハードウェア効率残差ネットワーク Hardware-efficient Residual Networks for FPGAs ( http://arxiv.org/abs/2102.01351v1 ) ライセンス: Link先を確認	Olivia Weng, Alireza Khodamoradi, Ryan Kastner	(参考訳) 残差ネットワーク(ResNets)は、トレーニング収束を改善するために、ネットワーク内のスキップ接続(以前のレイヤからのアクティベーションを再利用する)を採用するが、これらのスキップ接続は、ResNetのハードウェア実装の課題を生じさせる。ハードウェアは、より多くの受信データを処理する前に、スキップ接続が処理されるのを待たなければならない。接続をスキップしなければ、ResNetsはよりハードウェア効率が良い。そこで本研究では,NonResNetと呼ばれるネットワークを構築して,ResNetのスキップ接続を段階的に除去する学習手法を提案する。 FPGAで実装すると、NonResNetはResNetのBRAM利用率を9%、LUT利用率を3%削減し、スループットを5%向上させることが示されています。 Residual networks (ResNets) employ skip connections in their networks -- reusing activations from previous layers -- to improve training convergence, but these skip connections create challenges for hardware implementations of ResNets. The hardware must either wait for skip connections to be processed before processing more incoming data or buffer them elsewhere. Without skip connections, ResNets would be more hardware-efficient. Thus, we present a teacher-student learning method to gradually prune away all of a ResNet's skip connections, constructing a network we call NonResNet. We show that when implemented for FPGAs, NonResNet decreases ResNet's BRAM utilization by 9% and LUT utilization by 3% and increases throughput by 5%.	翻訳日:2021-02-03 16:23:05 公開日:2021-02-02
# 常に個人的: デバイス上でのCNNのパーソナライゼーションにEarly Exitsを使う It's always personal: Using Early Exits for Efficient On-Device CNN Personalisation ( http://arxiv.org/abs/2102.01393v1 ) ライセンス: Link先を確認	Ilias Leontiadis, Stefanos Laskaridis, Stylianos I. Venieris, Nicholas D. Lane	(参考訳) 強力なハードウェアとモデル圧縮技術のおかげで、オンデバイス機械学習は現実的になっています。通常、これらのモデルは大きなGPUクラスタ上で事前訓練され、幅広い入力を一般化するのに十分なパラメータを持つ。この研究では、より小さく、パーソナライズされたモデルを特定のシナリオに適合させることで、高い精度と高速な実行を可能にしている。それでもデバイス上でのトレーニングは非常に困難であり、フラッグシップスマートフォンでも過剰な計算とメモリを必要とする。同時に、デバイス上のデータ可用性は制限され、サンプルのラベルが付けられないことが多い。この目的のために、モデルに早期出口を添付し、デバイス上でそれらをパーソナライズするフレームワークであるPersEPhonEEを紹介します。これにより、よりパーソナライズされたデータが利用可能になると、モデルが計算の大部分を段階的にバイパスすることができる。さらに,ネットワーク全体のパーソナライズ時間のごく一部で,早期出口を半教師付きで訓練する効率的なオンデバイスアルゴリズムを提案する。その結果、PersEPhonEEは、トレーニングコストを最大2.2倍、推論レイテンシを平均2.2-3.2倍まで下げながら、最大15.9%の精度を、デバイス上のラベルの可用性に応じて向上させる。 On-device machine learning is becoming a reality thanks to the availability of powerful hardware and model compression techniques. Typically, these models are pretrained on large GPU clusters and have enough parameters to generalise across a wide variety of inputs. In this work, we observe that a much smaller, personalised model can be employed to fit a specific scenario, resulting in both higher accuracy and faster execution. Nevertheless, on-device training is extremely challenging, imposing excessive computational and memory requirements even for flagship smartphones. At the same time, on-device data availability might be limited and samples are most frequently unlabelled. To this end, we introduce PersEPhonEE, a framework that attaches early exits on the model and personalises them on-device. These allow the model to progressively bypass a larger part of the computation as more personalised data become available. Moreover, we introduce an efficient on-device algorithm that trains the early exits in a semi-supervised manner at a fraction of the whole network's personalisation time. 
Results show that PersEPhonEE boosts accuracy by up to 15.9% while dropping the training cost by up to 2.2x and inference latency by 2.2-3.2x on average for the same accuracy, depending on the availability of labels on-device.	翻訳日:2021-02-03 16:22:29 公開日:2021-02-02
# 子どもとコンピュータの相互作用:最近の仕事、新しいデータセット、年齢検出 Child-Computer Interaction: Recent Works, New Dataset, and Age Detection ( http://arxiv.org/abs/2102.01405v1 ) ライセンス: Link先を確認	Ruben Tolosana, Juan Carlos Ruiz-Garcia, Ruben Vera-Rodriguez, Jaime Herreros-Rodriguez, Sergio Romero-Tapiador, Aythami Morales, Julian Fierrez	(参考訳) 子どもとコンピュータの相互作用に関する最近の研究を概観し,その意図する枠組みであるchildciについて述べる。i) モバイルデバイスと対話しながら,子どもの認知と神経運動の発達をよりよく理解すること,ii) e-learning と e-health の新たな応用を可能にすること,など。我々のフレームワークには、新しいモバイルアプリケーション、特定のデータ取得プロトコル、縦断的研究を可能にするために年次拡張が計画されているChildCIデータセット(ChildCIdb v1)の最初のリリースが含まれている。私たちのフレームワークでは、子どもたちはペンスタイラスと指を使ってタブレットデバイスと対話し、異なるレベルの神経運動と認知スキルを必要とする異なるタスクを実行します。 ChildCIdbは18ヶ月から8歳までの400人以上の子供で構成されており、ピアジェの理論の最初の3つの発達段階を考慮しています。さらに,ChildCIフレームワークの可能性の実証として,ChildCIdbが実現した多くの応用の1つとして,デバイスインタラクションに基づく子どもの年齢検出実験を行った。さまざまな機械学習アプローチが評価され、年齢グループを自動的に検出する34のグローバル機能セットを提案し、90%以上の精度結果を達成し、このタスクでより有用な機能の種類に関して興味深い結果を得ます。 We overview recent research in Child-Computer Interaction and describe our framework ChildCI intended for: i) generating a better understanding of the cognitive and neuromotor development of children while interacting with mobile devices, and ii) enabling new applications in e-learning and e-health, among others. Our framework includes a new mobile application, specific data acquisition protocols, and a first release of the ChildCI dataset (ChildCIdb v1), which is planned to be extended yearly to enable longitudinal studies. In our framework children interact with a tablet device, using both a pen stylus and the finger, performing different tasks that require different levels of neuromotor and cognitive skills. ChildCIdb comprises more than 400 children from 18 months to 8 years old, considering therefore the first three development stages of the Piaget's theory. 
In addition, and as a demonstration of the potential of the ChildCI framework, we include experimental results for one of the many applications enabled by ChildCIdb: children age detection based on device interaction. Different machine learning approaches are evaluated, proposing a new set of 34 global features to automatically detect age groups, achieving accuracy results over 90% and interesting findings in terms of the type of features more useful for this task.	翻訳日:2021-02-03 16:21:46 公開日:2021-02-02
# スケーラブルなマルチラベル画像検索のためのランク一貫性ディープハッシング Rank-Consistency Deep Hashing for Scalable Multi-Label Image Search ( http://arxiv.org/abs/2102.01486v1 ) ライセンス: Link先を確認	Cheng Ma, Jiwen Lu, Jie Zhou	(参考訳) ハッシュは大規模画像検索においてますます魅力的な技術になりつつあるため、マルチラベルハッシュもマルチレベルのセマンティックコンテンツを活用する能力に注目が集まっている。本稿では,スケーラブルなマルチラベル画像検索のための新しいディープハッシュ法を提案する。コントラストやトリプルトロスといった従来の目的と異なり、全てのサンプルに対して十分なグローバル監視情報を提供するために、ペアやトリプルトではなくランクリストを用いる。具体的には、元の空間とハミング空間の2つの空間からの類似性順序を整列するために、新しい階数整合性目標を適用する。強力な損失関数は、意味的類似性とハミング距離が2つの空間で一致しないサンプルをペナルティ化するように設計されている。また、導関数の簡潔な定式化とともに判別力を高めるために、マルチラベルソフトマックスクロスエントロピー損失が提示される。異なるラベルを持つサンプルの近傍構造を操作するために、サンプルと対応する複数のクラスセンター間の距離を減らすことにより、同じラベルを持つサンプルのハッシュベクトルをクラスタ化するマルチラベルクラスタリングロスを設計します。 MIRFLICKR-25K, IAPRTC12, NUS-WIDEの3つの公開マルチラベルデータセットを用いて, 提案手法の有効性を実証した。 As hashing becomes an increasingly appealing technique for large-scale image retrieval, multi-label hashing is also attracting more attention for the ability to exploit multi-level semantic contents. In this paper, we propose a novel deep hashing method for scalable multi-label image search. Unlike existing approaches with conventional objectives such as contrast and triplet losses, we employ a rank list, rather than pairs or triplets, to provide sufficient global supervision information for all the samples. Specifically, a new rank-consistency objective is applied to align the similarity orders from two spaces, the original space and the hamming space. A powerful loss function is designed to penalize the samples whose semantic similarity and hamming distance are mismatched in two spaces. Besides, a multi-label softmax cross-entropy loss is presented to enhance the discriminative power with a concise formulation of the derivative function. 
In order to manipulate the neighborhood structure of the samples with different labels, we design a multi-label clustering loss to cluster the hashing vectors of the samples with the same labels by reducing the distances between the samples and their multiple corresponding class centers. The state-of-the-art experimental results achieved on three public multi-label datasets, MIRFLICKR-25K, IAPRTC12 and NUS-WIDE, demonstrate the effectiveness of the proposed method.	翻訳日:2021-02-03 16:21:02 公開日:2021-02-02
# グラディエントフローの維持:グラディエントフローを用いたスパースネットワーク最適化の研究 Keep the Gradients Flowing: Using Gradient Flow to Study Sparse Network Optimization ( http://arxiv.org/abs/2102.01670v1 ) ライセンス: Link先を確認	Kale-ab Tessera, Sara Hooker, Benjamin Rosman	(参考訳) 密集型ニューラルネットワークと同じ性能に収束するスパースネットワークの訓練は、解明されている。最近の研究は初期化が鍵であることを示唆している。しかし、この研究の方向性は成功していますが、初期化だけに焦点を合わせると不十分なようです。本稿では,スパースモデルにおける正規化,最適化,アーキテクチャ選択の役割について考察する。我々は,スパースネットワークと高密度ネットワークの公平な比較を可能にする,単純な実験フレームワークであるSame Capacity Sparse vs Dense Comparison (SC-SDC)を提案する。さらに,スパースネットワークの性能と相関する勾配流,有効勾配流(EGF)の新たな測定法を提案する。トップラインメトリクスsc-sdcとegfを用いて,高濃度ネットワークで使用されるオプティマイザ,アクティベーション関数,レギュラライザのデフォルト選択がスパースネットワークに不利であることを示す。これらの結果から,スパースネットワークにおけるグラデーションフローは,アーキテクチャ設計とトレーニング体制の側面を再考することで改善できることを示した。私たちの研究は、初期化はパズルの1つの部分にすぎないことを示唆し、スパースネットワークへの調整最適化の広い視野を取ることは有望な結果をもたらす。 Training sparse networks to converge to the same performance as dense neural architectures has proven to be elusive. Recent work suggests that initialization is the key. However, while this direction of research has had some success, focusing on initialization alone appears to be inadequate. In this paper, we take a broader view of training sparse networks and consider the role of regularization, optimization and architecture choices on sparse models. We propose a simple experimental framework, Same Capacity Sparse vs Dense Comparison (SC-SDC), that allows for fair comparison of sparse and dense networks. Furthermore, we propose a new measure of gradient flow, Effective Gradient Flow (EGF), that better correlates to performance in sparse networks. Using top-line metrics, SC-SDC and EGF, we show that default choices of optimizers, activation functions and regularizers used for dense networks can disadvantage sparse networks. Based upon these findings, we show that gradient flow in sparse networks can be improved by reconsidering aspects of the architecture design and the training regime. 
Our work suggests that initialization is only one piece of the puzzle and taking a wider view of tailoring optimization to sparse networks yields promising results.	翻訳日:2021-02-03 16:19:49 公開日:2021-02-02
# サブサンプリングした半正定値計画によるコミュニティ検出 Community Detection with a Subsampled Semidefinite Program ( http://arxiv.org/abs/2102.01419v1 ) ライセンス: Link先を確認	Pedro Abdalla and Afonso S. Bandeira	(参考訳) 半正定値計画法は、クラスタリングやコミュニティ検出など、データサイエンスと信号処理のいくつかの問題に取り組むための重要なツールである。しかし、半正定値計画は実際には遅いことが多いため、スケッチなどの高速化手法がしばしば検討される。確率的ブロックモデルにおけるコミュニティ検出の文脈で、Mixon と Xie [9] は最近、ネットワークからサブサンプリングした部分グラフ上でのみ半正定値計画を解くことで計算量を大幅に節約するスケッチの枠組みを提案した。本稿では、2つの均衡したコミュニティを持つ確率的ブロックモデルに対するこの手法の統計的限界に関する Mixon と Xie の予想に、肯定的な解答を与える。 Semidefinite programming is an important tool to tackle several problems in data science and signal processing, including clustering and community detection. However, semidefinite programs are often slow in practice, so speed up techniques such as sketching are often considered. In the context of community detection in the stochastic block model, Mixon and Xie [9] have recently proposed a sketching framework in which a semidefinite program is solved only on a subsampled subgraph of the network, giving rise to significant computational savings. In this short paper, we provide a positive answer to a conjecture of Mixon and Xie about the statistical limits of this technique for the stochastic block model with two balanced communities.	翻訳日:2021-02-03 16:18:28 公開日:2021-02-02
# 対称的ブール因子解析とInstaHideへの応用 Symmetric Boolean Factor Analysis with Applications to InstaHide ( http://arxiv.org/abs/2102.01570v1 ) ライセンス: Link先を確認	Sitan Chen, Zhao Song, Runzhou Tao, Ruizhe Zhang	(参考訳) 本研究では、最近提案された分散学習手法である InstaHide (Huang et al.) のセキュリティについて検討する。最近のいくつかの研究は、次の行列分解問題との興味深い関連を利用して、さまざまな設定で InstaHide に対する再構成攻撃を与えている: {0,1}^r 上の m 個のランダムな k-スパース Boolean ベクトルの集まりのグラム行列が与えられたとき、(自明な対称性を除いて) それらのベクトルを復元せよ。これは、よく研究されているブール因子分析の問題のスパースかつ対称な変種、あるいは k-一様ハイパーグラフをその線グラフから復元するという古典的問題の平均ケース版と考えることもできる。従来のアルゴリズムは m が k に関して指数的に大きいことを必要とするか、k = 2 にしか適用できなかったため、InstaHide が適度に大きい k に対して再構成攻撃への何らかの「きめ細かいセキュリティ」を持つかどうかという問題は未解決のままであった。本研究では、上記の行列分解問題に対する単純な O(m^{\omega + 1}) 時間アルゴリズムを与えることで、この問いに否定的に答える。テンソル分解に基づくこのアルゴリズムは、m が r に関して少なくとも準線形であることしか必要としない。さらに、k-スパースベクトルの集まりが任意に選ばれる最悪ケースの設定に対する準多項式時間アルゴリズムによって、この結果を補完する。 In this work we examine the security of InstaHide, a recently proposed scheme for distributed learning (Huang et al.). A number of recent works have given reconstruction attacks for InstaHide in various regimes by leveraging an intriguing connection to the following matrix factorization problem: given the Gram matrix of a collection of m random k-sparse Boolean vectors in {0,1}^r, recover the vectors (up to the trivial symmetries). Equivalently, this can be thought of as a sparse, symmetric variant of the well-studied problem of Boolean factor analysis, or as an average-case version of the classic problem of recovering a k-uniform hypergraph from its line graph. As previous algorithms either required m to be exponentially large in k or only applied to k = 2, they left open the question of whether InstaHide possesses some form of "fine-grained security" against reconstruction attacks for moderately large k. In this work, we answer this in the negative by giving a simple O(m^{\omega + 1}) time algorithm for the above matrix factorization problem. Our algorithm, based on tensor decomposition, only requires m to be at least quasi-linear in r. We complement this result with a quasipolynomial-time algorithm for a worst-case setting of the problem where the collection of k-sparse vectors is chosen arbitrarily.	翻訳日:2021-02-03 16:17:53 公開日:2021-02-02
# ラジアル関数を超える深さ分離 Depth separation beyond radial functions ( http://arxiv.org/abs/2102.01621v1 ) ライセンス: Link先を確認	Luca Venturi, Samy Jelassi, Tristan Ozuch, Joan Bruna	(参考訳) ニューラルネットワークの高次元における深さ分離の結果は、ある種の関数が高次元 $d$ において隠れ層が2層のネットワークでは効率的に近似できるが、隠れ層が1層のネットワークではできないことを示す。この種の既存の結果は、主に放射状または1次元の構造を基礎に持つ関数に焦点を当てており、そのような関数は実際にはあまり現れない。本稿の最初の貢献は、(Eldan and Shamir, 2016) の証明戦略に基づいて、こうした結果をより一般的な関数のクラス、すなわち区分的な振動構造を持つ関数に拡張することである。このような結果の証明に共通するテーマは、隠れ層が1層のネットワークは、フーリエ表現が領域全体に広がる高エネルギー関数を近似できないという事実である。一方、隠れ層が1層のニューラルネットワークによる関数近似の既存の結果は、関数がスパースなフーリエ表現を持つことに依存している。領域の選択もまた、近似の上界と下界の間のギャップの原因となる。固定した近似領域、すなわち次元 $d$ の球面 $\mathbb{S}^{d-1}$ に焦点を当て、隠れ層が1層のネットワークで効率的に近似できる関数と、近似できないことが証明できる関数の両方を、フーリエ展開の観点から特徴づける。 High-dimensional depth separation results for neural networks show that certain functions can be efficiently approximated by two-hidden-layer networks but not by one-hidden-layer ones in high-dimensions $d$. Existing results of this type mainly focus on functions with an underlying radial or one-dimensional structure, which are usually not encountered in practice. The first contribution of this paper is to extend such results to a more general class of functions, namely functions with piece-wise oscillatory structure, by building on the proof strategy of (Eldan and Shamir, 2016). A common theme in the proof of such results is the fact that one-hidden-layer fail to approximate high-energy functions whose Fourier representation is spread in the domain. On the other hand, existing approximation results of a function by one-hidden-layer neural networks rely on the function having a sparse Fourier representation. The choice of the domain also represents a source of gaps between upper and lower approximation bounds. Focusing on a fixed approximation domain, namely the sphere $\mathbb{S}^{d-1}$ in dimension $d$, we provide a characterization of both functions which are efficiently approximable by one-hidden-layer networks and of functions which are provably not, in terms of their Fourier expansion.	翻訳日:2021-02-03 16:17:05 公開日:2021-02-02
# パラメータ化量子回路の容量と量子幾何学 Capacity and quantum geometry of parametrized quantum circuits ( http://arxiv.org/abs/2102.01659v1 ) ライセンス: Link先を確認	Tobias Haug, Kishor Bharti, M. S. Kim	(参考訳) ノイズの多い中規模量子デバイスのポテンシャルを利用するには、ハイブリッド量子古典的アルゴリズムを実行するのに最適なタイプの回路を見つけることが不可欠です。主な候補は、現在のデバイスで効果的に実装できるパラメトリズド量子回路である。本稿では、パラメータ空間の幾何学的構造を用いて、これらの回路の能力と訓練性を効果的な量子次元で評価し、回路の表現力と特定の初期化戦略を明らかにします。様々な人気回路タイプの表現力を評価し、使用する絡み合うゲートの種類によって顕著な違いを見つけます。特に回路は、その表現力のスケーリング法則によって特徴付けられる。我々は、パラメータ空間の量子幾何学の遷移を特定し、それは深い回路のための量子自然勾配の崩壊につながる。浅い回路では、量子自然勾配は通常の勾配に比べて桁違いに値が大きいが、どちらもグラデーションの消失に苦しむことがある。回路パラメータの固定セットをランダム化に調整することにより、回路が表現的だが不規則なプラトーに悩まされない領域を見つけ、回路を初期化するための良い方法を示唆する。その結果、パラメトリズド量子回路の理解が強化され、変分量子アルゴリズムが改善される。 To harness the potential of noisy intermediate-scale quantum devices, it is paramount to find the best type of circuits to run hybrid quantum-classical algorithms. Key candidates are parametrized quantum circuits that can be effectively implemented on current devices. Here, we evaluate the capacity and trainability of these circuits using the geometric structure of the parameter space via the effective quantum dimension, which reveals the expressive power of circuits in general as well as of particular initialization strategies. We assess the representation power of various popular circuit types and find striking differences depending on the type of entangling gates used. Particular circuits are characterized by scaling laws in their expressiveness. We identify a transition in the quantum geometry of the parameter space, which leads to a decay of the quantum natural gradient for deep circuits. For shallow circuits, the quantum natural gradient can be orders of magnitude larger in value compared to the regular gradient; however, both of them can suffer from vanishing gradients. By tuning a fixed set of circuit parameters to randomized ones, we find a region where the circuit is expressive, but does not suffer from barren plateaus, hinting at a good way to initialize circuits. 
Our results enhance the understanding of parametrized quantum circuits for improving variational quantum algorithms.	翻訳日:2021-02-03 16:16:22 公開日:2021-02-02
# 磁気共鳴脳イメージングにおける転送学習:システムレビュー Transfer Learning in Magnetic Resonance Brain Imaging: a Systematic Review ( http://arxiv.org/abs/2102.01530v1 ) ライセンス: Link先を確認	Juan Miguel Valverde, Vandad Imani, Ali Abdollahzadeh, Riccardo De Feo, Mithilesh Prakash, Robert Ciszek, Jussi Tohka	(参考訳) 転送学習は、関心のあるタスクの一般化を改善するために、関連するタスクから知識を取得することに焦点を当てた機械学習技術である。 MRIでは、移動学習はMR画像の変動に対処する戦略を開発する上で重要である。さらに、転送学習は、関心のあるタスクに関連するタスクを解決するために訓練された機械学習モデルを再利用するのに役立つ。研究の方向性,知識のギャップ,応用,そしてmr脳イメージングに応用されるトランスファー学習アプローチの中で広く使われる戦略を特定することを目的としています。 MR脳イメージングにトランスファー学習を適用した記事の系統的文献探索を行った。 433の研究をスクリーニングし,タスクタイプ,アプリケーション,機械学習手法などの関連情報を分類,抽出した。さらに、プライバシ、未確認ターゲットドメイン、ラベルなしデータに対処する脳MRI固有の転写学習アプローチや他の手法を精査した。脳mriタスクに転送学習を応用した記事は129件あった。最も頻繁な応用は認知症関連分類タスクと脳腫瘍のセグメンテーションであった。記事の大半は畳み込みニューラルネットワーク(CNN)で転送学習を使用した。プライバシー問題、未確認のターゲットドメイン、ラベルなしデータなど、明らかにMRI特有のアプローチはごくわずかだった。我々はグループ固有の広く使われるアプローチに対する新しい分類を提案した。脳MRIにおけるトランスファー学習への関心が高まっている。公共データセットは、アルツハイマーの診断/予後および腫瘍分割の人気に貢献している。同様に、事前訓練されたCNNの利用も促進されている。最後に、調査研究の大半は、転校学習を施した後の戦略の解釈を詳細に検討せず、他のアプローチと比較しなかった。 Transfer learning refers to machine learning techniques that focus on acquiring knowledge from related tasks to improve generalization in the tasks of interest. In MRI, transfer learning is important for developing strategies that address the variation in MR images. Additionally, transfer learning is beneficial to re-utilize machine learning models that were trained to solve related tasks to the task of interest. Our goal is to identify research directions, gaps of knowledge, applications, and widely used strategies among the transfer learning approaches applied in MR brain imaging. We performed a systematic literature search for articles that applied transfer learning to MR brain imaging. We screened 433 studies and we categorized and extracted relevant information, including task type, application, and machine learning methods. 
Furthermore, we closely examined brain MRI-specific transfer learning approaches and other methods that tackled privacy, unseen target domains, and unlabeled data. We found 129 articles that applied transfer learning to brain MRI tasks. The most frequent applications were dementia related classification tasks and brain tumor segmentation. A majority of articles utilized transfer learning on convolutional neural networks (CNNs). Only few approaches were clearly brain MRI specific, considered privacy issues, unseen target domains or unlabeled data. We proposed a new categorization to group specific, widely-used approaches. There is an increasing interest in transfer learning within brain MRI. Public datasets have contributed to the popularity of Alzheimer's diagnostics/prognostics and tumor segmentation. Likewise, the availability of pretrained CNNs has promoted their utilization. Finally, the majority of the surveyed studies did not examine in detail the interpretation of their strategies after applying transfer learning, and did not compare to other approaches.	翻訳日:2021-02-03 16:15:04 公開日:2021-02-02
# 人工知能を用いた医療画像解析のための医療データセット収集 Medical Datasets Collections for Artificial Intelligence-based Medical Image Analysis ( http://arxiv.org/abs/2102.01549v1 ) ライセンス: Link先を確認	Yang Wen	(参考訳) 我々は32の公開データセットを収集し,そのうち28は医用画像,4つは自然画像で,研究を行った。これらのデータセットの画像は、異なるカメラによってキャプチャされるため、モダリティ、フレームサイズ、容量が異なる。データアクセシビリティのため、私たちは多くのデータセットのwebサイトも提供しています。 We collected 32 public datasets, of which 28 for medical imaging and 4 for natural images, to conduct study. The images of these datasets are captured by different cameras, thus vary from each other in modality, frame size and capacity. For data accessibility, we also provide the websites of most datasets and hope this will help the readers reach the datasets.	翻訳日:2021-02-03 16:14:20 公開日:2021-02-02
# 医学的無関係なスタイル転送拡張を用いた計算病理のドメインに依存しない視覚表現の学習 Learning domain-agnostic visual representation for computational pathology using medically-irrelevant style transfer augmentation ( http://arxiv.org/abs/2102.01678v1 ) ライセンス: Link先を確認	Rikiya Yamashita, Jin Long, Snikitha Banda, Jeanne Shen, Daniel L. Rubin	(参考訳) 見えないデータに基づく機械学習モデルの最適一般化は、そのようなモデルの医療画像への臨床応用性を妨げる重要な課題である。ドメイン適応やドメイン一般化といった様々な方法がこの課題に対処するために進化してきたが、堅牢で一般化可能な表現の学習は医用画像理解の中核であり、現在も問題となっている。本稿では,芸術絵画からのランダムなスタイル転送に基づくデータ拡張の一形態であるSTRAP(Style TRansfer Augmentation for histoPathology)を提案する。スタイル転送は、高レベルの意味コンテンツを維持しながら、画像の低レベルのテクスチャコンテンツをランダムに選択された芸術絵画の無情報スタイルに置き換えます。これにより、ドメインシフトに対する堅牢性が向上し、ドメインに依存しない表現を学ぶためのシンプルで強力なツールとして使用できる。その結果,ストラップは大腸癌におけるマイクロサテライト状態をデジタル化組織病理画像を用いて予測する特定の分類タスクにおいて,最先端のパフォーマンス,特にドメインシフトの有無に寄与することが示された。 Suboptimal generalization of machine learning models on unseen data is a key challenge which hampers the clinical applicability of such models to medical imaging. Although various methods such as domain adaptation and domain generalization have evolved to combat this challenge, learning robust and generalizable representations is core to medical image understanding, and continues to be a problem. Here, we propose STRAP (Style TRansfer Augmentation for histoPathology), a form of data augmentation based on random style transfer from artistic paintings, for learning domain-agnostic visual representations in computational pathology. Style transfer replaces the low-level texture content of images with the uninformative style of randomly selected artistic paintings, while preserving high-level semantic content. This improves robustness to domain shift and can be used as a simple yet powerful tool for learning domain-agnostic representations. We demonstrate that STRAP leads to state-of-the-art performance, particularly in the presence of domain shifts, on a particular classification task of predicting microsatellite status in colorectal cancer using digitized histopathology images.	
翻訳日:2021-02-03 16:13:53 公開日:2021-02-02
# (参考訳) NeMo: ロバスト3次元ポース推定のためのコントラスト特徴のニューラルネットワークモデル NeMo: Neural Mesh Models of Contrastive Features for Robust 3D Pose Estimation ( http://arxiv.org/abs/2101.12378v2 ) ライセンス: CC BY 4.0	Angtian Wang, Adam Kortylewski, Alan Yuille	(参考訳) 3Dポーズ推定はコンピュータビジョンにおいて難しいが重要な課題である。本研究では,3Dポーズ推定における標準的深層学習手法が,対象物が部分的に遮蔽されたり,以前見つからなかったポーズから見たりした場合,堅牢ではないことを示した。生成的視覚モデルから部分閉塞へのロバスト性に着想を得て,物体の3次元生成表現とディープニューラルネットワークを,NeMoと呼ぶ統一ニューラルネットワークアーキテクチャに統合することを提案する。特にnemoは、密集した3dメッシュ上の各頂点における神経特徴活性化の生成モデルを学ぶ。微分可能レンダリングを用いて、NeMoとターゲット画像の特徴表現との再構成誤差を最小化することにより、3Dオブジェクトのポーズを推定する。レコンストラクション損失の局所視認を避けるために,特徴抽出器を訓練し,メッシュ上の個々の特徴表現間の距離をコントラスト学習を用いて最大化する。 PASCAL3D+、Occluded-PASCAL3D+およびObjectNet3Dに関する広範な実験により、NeMoは通常のディープネットワークに比べて、部分閉塞に対してより堅牢であり、かつ、通常のデータ上での競合性能を維持しながら、目に見えないポーズを示す。興味深いことに、私たちの実験では、メッシュ表現が真の物体ジオメトリを立方体で粗大に近似するだけであっても、NeMoが合理的にうまく機能することを示しており、正確な3Dポーズ推定には詳細な3Dジオメトリは必要ありません。コードはhttps://github.com/Angtian/NeMoで公開されている。 3D pose estimation is a challenging but important task in computer vision. In this work, we show that standard deep learning approaches to 3D pose estimation are not robust when objects are partially occluded or viewed from a previously unseen pose. Inspired by the robustness of generative vision models to partial occlusion, we propose to integrate deep neural networks with 3D generative representations of objects into a unified neural architecture that we term NeMo. In particular, NeMo learns a generative model of neural feature activations at each vertex on a dense 3D mesh. Using differentiable rendering we estimate the 3D object pose by minimizing the reconstruction error between NeMo and the feature representation of the target image. To avoid local optima in the reconstruction loss, we train the feature extractor to maximize the distance between the individual feature representations on the mesh using contrastive learning. 
Our extensive experiments on PASCAL3D+, occluded-PASCAL3D+ and ObjectNet3D show that NeMo is much more robust to partial occlusion and unseen pose compared to standard deep networks, while retaining competitive performance on regular data. Interestingly, our experiments also show that NeMo performs reasonably well even when the mesh representation only crudely approximates the true object geometry with a cuboid, hence revealing that the detailed 3D geometry is not needed for accurate 3D pose estimation. The code is publicly available at https://github.com/Angtian/NeMo.	翻訳日:2021-02-03 13:08:18 公開日:2021-02-02
# (参考訳) 天空画像からの深層学習照度予測モデルのベンチマーク -詳細な分析- Benchmarking of Deep Learning Irradiance Forecasting Models from Sky Images -- an in-depth Analysis ( http://arxiv.org/abs/2102.00721v2 ) ライセンス: CC BY 4.0	Quentin Paletta, Guillaume Arbod and Joan Lasenby	(参考訳) スマートグリッド、発電所の運用、ハイブリッドシステム管理、エネルギー取引など多くの産業応用は、ソーラーパネルからの断続的なエネルギー生産に対応するため、短期的な太陽予報の改善の恩恵を受ける可能性がある。しかし、現在の雲を空からモデル化するアプローチでは、雲の空間的配置、時間的ダイナミクス、太陽放射との物理的相互作用に関する精度が不足している。大規模データセットの増加によって、これらの制限に対処するためにデータ駆動メソッドが開発され、有望な結果が得られた。本研究では、半球空画像と外生変数のシーケンスから太陽光照射を予測するために訓練された4つのDeep Learningアーキテクチャを比較した。各モデルの相対的なパフォーマンスを評価するために、スマート永続化モデルに基づく予測スキルメトリックと、ランプと時間の歪みメトリックを使用しました。その結果、天空画像列の時空間的側面のエンコーディングは、試験年度の予測スキルが20.4%に達したことにより、予測を大幅に改善した。しかし、実験データに基づいて、深層学習モデルは「非常にスマートな永続化モデル」として振る舞う傾向があり、最も厄介なエラーを緩和しながら、時間的に永続化モデルと整合する傾向があると結論づけた。したがって、スカイカメラで捉えられたにもかかわらず、モデルはしばしば太陽を遮る雲のような大きな照度変化を引き起こす基本的な事象を見逃す。反応性から予測性まで、このアプローチの放射能予測への移行に貢献できることを願っています。 A number of industrial applications, such as smart grids, power plant operation, hybrid system management or energy trading, could benefit from improved short-term solar forecasting, addressing the intermittent energy production from solar panels. However, current approaches to modelling the cloud cover dynamics from sky images still lack precision regarding the spatial configuration of clouds, their temporal dynamics and physical interactions with solar radiation. Benefiting from a growing number of large datasets, data driven methods are being developed to address these limitations with promising results. In this study, we compare four commonly used Deep Learning architectures trained to forecast solar irradiance from sequences of hemispherical sky images and exogenous variables. To assess the relative performance of each model, we used the Forecast Skill metric based on the smart persistence model, as well as ramp and time distortion metrics. 
The results show that encoding spatiotemporal aspects of the sequence of sky images greatly improved the predictions with 10 min ahead Forecast Skill reaching 20.4% on the test year. However, based on the experimental data, we conclude that, with a common setup, Deep Learning models tend to behave just as a 'very smart persistence model', temporally aligned with the persistence model while mitigating its most penalising errors. Thus, despite being captured by the sky cameras, models often miss fundamental events causing large irradiance changes such as clouds obscuring the sun. We hope that our work will contribute to a shift of this approach to irradiance forecasting, from reactive to anticipatory.	翻訳日:2021-02-03 12:52:29 公開日:2021-02-02
# M2FN:マルチステップモダリティ融合による画像評価 M2FN: Multi-step Modality Fusion for Advertisement Image Assessment ( http://arxiv.org/abs/2102.00441v2 ) ライセンス: Link先を確認	Kyung-Wha Park (1), Jung-Woo Ha (2), JungHoon Lee (3), Sunyoung Kwon (4), Kyung-Min Kim (2), Byoung-Tak Zhang (1 and 5 and 6) ((1) Interdisciplinary Program in Neuroscience, Seoul National University., (2) NAVER AI LAB, NAVER CLOVA., (3) Statistics and Actuarial Science, Soongsil University., (4) School of Biomedical Convergence Engineering, Pusan National University., (5) Department of Computer Science and Engineering, Seoul National University., (6) Surromind Robotics.)	(参考訳) 特にユーザーの嗜好と広告品質に基づいて広告を評価することは、マーケティング業界にとって重要です。近年の研究では、ディープニューラルネットワークの利用を試みているが、これらの研究では画像関連補助属性(ad画像に頻繁に見られる埋め込みテキストを含む)は使用されていない。そこで,これらの属性が広告イメージの嗜好に与える影響を検討した。まず, 大規模実世界の広告ログデータを分析し, 本研究に基づいて, ユーザの好みにアピールしそうな広告画像を決定する新しいマルチステップモダリティ融合ネットワーク (m2fn) を提案する。本手法は,条件付きバッチ正規化に基づく低レベル融合と注意に基づく高レベル融合を含む,ネットワーク内の複数のステップを通じて補助属性を利用する。 M2FNは、美的画像評価に広く使用されているAVAデータセット上で検証し、豊富な補助属性を持つ実世界の広告データセットを用いて、嗜好予測における最先端のパフォーマンスを達成できることを実証しました。 Assessing advertisements, specifically on the basis of user preferences and ad quality, is crucial to the marketing industry. Although recent studies have attempted to use deep neural networks for this purpose, these studies have not utilized image-related auxiliary attributes, which include embedded text frequently found in ad images. We, therefore, investigated the influence of these attributes on ad image preferences. First, we analyzed large-scale real-world ad log data and, based on our findings, proposed a novel multi-step modality fusion network (M2FN) that determines advertising images likely to appeal to user preferences. Our method utilizes auxiliary attributes through multiple steps in the network, which include conditional batch normalization-based low-level fusion and attention-based high-level fusion. 
We verified M2FN on the AVA dataset, which is widely used for aesthetic image assessment, and then demonstrated that M2FN can achieve state-of-the-art performance in preference prediction using a real-world ad dataset with rich auxiliary attributes.	翻訳日:2021-02-03 12:49:30 公開日:2021-02-02
# 誰のための公平? テキスト要約における読者の公平性認識の理解 Fairness for Whom? Understanding the Reader's Perception of Fairness in Text Summarization ( http://arxiv.org/abs/2101.12406v2 ) ライセンス: Link先を確認	Anurag Shandilya, Abhisek Dash, Abhijnan Chakraborty, Kripabandhu Ghosh, Saptarshi Ghosh	(参考訳) ユーザが生成するテキスト情報の増加に伴い、近年、広範囲なコンテンツの概要を提供するための要約アルゴリズムの利用が増加している。これらのアルゴリズムを評価するための伝統的なメトリクス(例) ROUGEスコア)は、アルゴリズムの要約と人間生成の要約を一致させることに頼っている。しかし、テキストの内容が異質である場合、例えば、異なる社会的に有能なグループから来る場合、既存の要約アルゴリズムのほとんどは、元のデータにおける分布と非常に異なる社会集団を表すことが示されている。このような悪影響を軽減するため、公正保存要約アルゴリズムも提案されている。これらの研究のすべては、内容の作家の視点から公正の規範的な概念を検討し、根底にある公平性の概念に対する読者の認識を無視しています。このギャップを埋めるため,本研究では,フェアネス概念と読者がテキスト要約でどのように認識するかを考察する。実験により,読者の公平感は文脈に敏感な場合が多いことを示した。さらに、標準的なROUGE評価指標は、要約の知覚的(不公平)性を定量化できない。そこで本研究では,テキスト要約における知覚バイアスを定量化するための,ループ内人間メトリックとグラフベースの自動手法を提案する。我々は,不均質な社会-政治的マイクロブログデータセットのいくつかの要約(un)を定量化し,その有用性を示す。 With the surge in user-generated textual information, there has been a recent increase in the use of summarization algorithms for providing an overview of the extensive content. Traditional metrics for evaluation of these algorithms (e.g. ROUGE scores) rely on matching algorithmic summaries to human-generated ones. However, it has been shown that when the textual contents are heterogeneous, e.g., when they come from different socially salient groups, most existing summarization algorithms represent the social groups very differently compared to their distribution in the original data. To mitigate such adverse impacts, some fairness-preserving summarization algorithms have also been proposed. All of these studies have considered normative notions of fairness from the perspective of writers of the contents, neglecting the readers' perceptions of the underlying fairness notions. To bridge this gap, in this work, we study the interplay between the fairness notions and how readers perceive them in textual summaries. Through our experiments, we show that reader's perception of fairness is often context-sensitive. 
Moreover, standard ROUGE evaluation metrics are unable to quantify the perceived (un)fairness of the summaries. To this end, we propose a human-in-the-loop metric and an automated graph-based methodology to quantify the perceived bias in textual summaries. We demonstrate their utility by quantifying the (un)fairness of several summaries of heterogeneous socio-political microblog datasets.	翻訳日:2021-02-03 12:48:50 公開日:2021-02-02
# 強化学習のためのポリシーミラー降下:線形収束、新しいサンプリング複雑性、一般化問題クラス Policy Mirror Descent for Reinforcement Learning: Linear Convergence, New Sampling Complexity, and Generalized Problem Classes ( http://arxiv.org/abs/2102.00135v2 ) ライセンス: Link先を確認	Guanghui Lan	(参考訳) 本稿では、強凸または一般の凸正則化項を持つ強化学習 (RL) 問題を解くための新しいポリシーミラー降下 (PMD) 法を提案する。全体としては高度に非凸に見えるこれらの問題の構造的性質を調べることで、PMD 法が大域的最適性へ高速な線形収束率で収束することを示す。これらの手法の確率的な対応版を開発し、異なるサンプリング方式を用いて、強凸な (resp. 一般の凸な) 正則化項を持つ RL 問題を解くためのサンプリング複雑性 ${\cal O}(1/\epsilon)$ (resp., ${\cal O}(1/\epsilon^2)$) を確立する。ここで $\epsilon$ は目標精度を表す。さらに、これらの正則化項の勾配を計算する必要がある場合、その計算複雑性は、強凸な (resp. 一般の凸な) 正則化項を持つ問題に対して ${\cal O}\{(\log_\gamma \epsilon) [(1-\gamma)L/\mu]^{1/2}\log (1/\epsilon)\}$ (resp., ${\cal O} \{(\log_\gamma \epsilon ) [(1-\gamma)L/\epsilon]^{1/2}\}$) で上から抑えられることを示す。ここで $\gamma$ は割引因子を表す。我々の知る限り、これらの複雑性の上界は、アルゴリズムの開発とともに、最適化と RL の両方の文献において新しいものと思われる。また、これらの凸正則化項の導入は、RL モデルの柔軟性と適用範囲を大きく広げる。 We present new policy mirror descent (PMD) methods for solving reinforcement learning (RL) problems with either strongly convex or general convex regularizers. By exploring the structural properties of these overall seemly highly nonconvex problems we show that the PMD methods exhibit fast linear rate of convergence to the global optimality. We develop stochastic counterparts of these methods, and establish an ${\cal O}(1/\epsilon)$ (resp., ${\cal O}(1/\epsilon^2)$) sampling complexity for solving these RL problems with strongly (resp., general) convex regularizers using different sampling schemes, where $\epsilon$ denote the target accuracy. We further show that the complexity for computing the gradients of these regularizers, if necessary, can be bounded by ${\cal O}\{(\log_\gamma \epsilon) [(1-\gamma)L/\mu]^{1/2}\log (1/\epsilon)\}$ (resp., ${\cal O} \{(\log_\gamma \epsilon ) [(1-\gamma)L/\epsilon]^{1/2}\}$) for problems with strongly (resp., general) convex regularizers. Here $\gamma$ denotes the discounting factor. To the best of our knowledge, these complexity bounds, along with our algorithmic developments, appear to be new in both optimization and RL literature. The introduction of these convex regularizers also greatly expands the flexibility and applicability of RL models.	翻訳日:2021-02-03 12:48:10 公開日:2021-02-02
# 一般化非定常バンディット Generalized non-stationary bandits ( http://arxiv.org/abs/2102.00725v2 ) ライセンス: Link先を確認	Anne Gael Manegueu, Alexandra Carpentier and Yi Yu	(参考訳) 本稿では,スイッチングバンドイット問題を一般化する非定常確率バンドイット問題について検討する。スイッチングバンドイット問題(\textbf{Case a})に加えて、我々は3つの具体的な例に興味を持っている: (\textbf{b}) 腕の手段は局所多項式であり、 (\textbf{c}) 腕の手段は局所的に滑らかであり、 (\textbf{d}) 腕の隙間は束縛された数の屈曲点を持ち、そこでは最も高い腕の平均は短い範囲であまり変化しない。これらの3つの設定は非常に異なるが、共通する点がある: (i) ギャップの対数の同様の大きさのレベル集合の数を制御でき、 (ii) 最高平均は急な変更の数に制限があり、それ以外は変化が限られている。この一般的な設定では、特に4つの問題 (a)-(d) を効率的かつ統一的に解く1つのアルゴリズムを提案する。 In this paper, we study a non-stationary stochastic bandit problem, which generalizes the switching bandit problem. On top of the switching bandit problem (\textbf{Case a}), we are interested in three concrete examples: (\textbf{b}) the means of the arms are local polynomials, (\textbf{c}) the means of the arms are locally smooth, and (\textbf{d}) the gaps of the arms have a bounded number of inflexion points and where the highest arm mean cannot vary too much in a short range. These three settings are very different, but have in common the following: (i) the number of similarly-sized level sets of the logarithm of the gaps can be controlled, and (ii) the highest mean has a limited number of abrupt changes, and otherwise has limited variations. We propose a single algorithm in this general setting, that in particular solves in an efficient and unified way the four problems (a)-(d) mentioned.	翻訳日:2021-02-03 12:47:02 公開日:2021-02-02
# 無線画像伝送のためのSNR適応深部接合源チャネル符号化 SNR-adaptive deep joint source-channel coding for wireless image transmission ( http://arxiv.org/abs/2102.00202v2 ) ライセンス: Link先を確認	Mingze Ding and Jiahui Li and Mengyao Ma and Xiaopeng Fan	(参考訳) 本論文では,ノイズの多いチャネル上での画像のマルチユーザ伝送のためのジョイントソースチャネル符号化(JSCC)の問題を考えることにより,自動エンコーダを用いた深部ソースチャネル符号化方式を提案する。提案したJSCC方式では,信号対雑音比(SNR)を推定し,それを用いて送信画像の適応復号を行う。実験により,提案方式は異なるSNRの適応性に優れた結果が得られ,SNRのデコーダ推定誤差に頑健であることが示された。我々の知る限りでは、これは、異なるSNRの適応性に焦点を当て、マルチユーザシナリオに適用できる最初のディープJSCCスキームである。 Considering the problem of joint source-channel coding (JSCC) for multi-user transmission of images over noisy channels, an autoencoder-based novel deep joint source-channel coding scheme is proposed in this paper. In the proposed JSCC scheme, the decoder can estimate the signal-to-noise ratio (SNR) and use it to adaptively decode the transmitted image. Experiments demonstrate that the proposed scheme achieves impressive results in adaptability for different SNRs and is robust to the decoder's estimation error of the SNR. To the best of our knowledge, this is the first deep JSCC scheme that focuses on the adaptability for different SNRs and can be applied to multi-user scenarios.	翻訳日:2021-02-03 12:46:17 公開日:2021-02-02
# 分布型モンテカルロ木探索によるリスク認識と多目的意思決定 Risk Aware and Multi-Objective Decision Making with Distributional Monte Carlo Tree Search ( http://arxiv.org/abs/2102.00966v2 ) ライセンス: Link先を確認	Conor F. Hayes, Mathieu Reymond, Diederik M. Roijers, Enda Howley, Patrick Mannion	(参考訳) 多くのリスク認識および多目的強化学習設定において、ユーザの有用性はポリシーの単一実行から導かれる。これらの設定では、平均的な将来のリターンに基づいた決定は適切ではない。例えば、医療現場では、患者は病気を治療する機会を1つだけ持つことができる。決定を行う場合、期待されるリターン(強化学習では値として知られています)は、決定が持つ可能性のある有害あるいはポジティブな結果の範囲を考慮できないのです。我々の重要な洞察は、エージェントが決定時に要求する重要な情報を表現するために、期待される未来よりも分布を使うべきだということです。本論文では,個々の政策実行から得られる様々なリターンの有用性について,後方分布を学習するアルゴリズムである分散モンテカルロ木探索を提案する。さらに,本アルゴリズムは,期待値の効用に対する多目的強化学習において,最先端の手法よりも優れていた。 In many risk-aware and multi-objective reinforcement learning settings, the utility of the user is derived from the single execution of a policy. In these settings, making decisions based on the average future returns is not suitable. For example, in a medical setting a patient may only have one opportunity to treat their illness. When making a decision, just the expected return -- known in reinforcement learning as the value -- cannot account for the potential range of adverse or positive outcomes a decision may have. Our key insight is that we should use the distribution over expected future returns differently to represent the critical information that the agent requires at decision time. In this paper, we propose Distributional Monte Carlo Tree Search, an algorithm that learns a posterior distribution over the utility of the different possible returns attainable from individual policy executions, resulting in good policies for both risk-aware and multi-objective settings. Moreover, our algorithm outperforms the state-of-the-art in multi-objective reinforcement learning for the expected utility of the returns.	翻訳日:2021-02-03 12:45:42 公開日:2021-02-02
# (参考訳) 最適化による公正性 Fairness through Optimization ( http://arxiv.org/abs/2102.00311v2 ) ライセンス: CC BY 4.0	Violet Xinying Chen, J.N. Hooker	(参考訳) AIに基づく意思決定モデルにおける公平性を形式化する一般的なパラダイムとして最適化を提案する。最適化モデルは、高度に高度なソリューション技術を活用すると同時に、社会福祉機能として幅広い公正基準を定式化することができると論じる。本稿では,ニューラルネットワーク,サポートベクターマシン,ルールベースシステムといった文脈において,適切な制約を受ける社会福祉関数を最大化することにより,公平性指向の意思決定を支援する最適化モデルを提案する。特に、公平性や公平性と効率性の組み合わせを測定するさまざまな機能のためのトラクタブル最適化モデルについて述べる。これには、いくつかの不等式メトリクス、rawlsian criteria、mclooneとhoover indices、alpha fairness、nashとkalai-smorodinskyの交渉ソリューション、rawlsianとutilitarian criteriaの組み合わせ、統計バイアス測度が含まれる。これらのモデルはすべて、線形プログラミング、混合整数/線形プログラミング、または(2つのケースで)特殊な凸プログラミング方法によって効率的に解くことができる。 We propose optimization as a general paradigm for formalizing fairness in AI-based decision models. We argue that optimization models allow formulation of a wide range of fairness criteria as social welfare functions, while enabling AI to take advantage of highly advanced solution technology. We show how optimization models can assist fairness-oriented decision making in the context of neural networks, support vector machines, and rule-based systems by maximizing a social welfare function subject to appropriate constraints. In particular, we state tractable optimization models for a variety of functions that measure fairness or a combination of fairness and efficiency. These include several inequality metrics, Rawlsian criteria, the McLoone and Hoover indices, alpha fairness, the Nash and Kalai-Smorodinsky bargaining solutions, combinations of Rawlsian and utilitarian criteria, and statistical bias measures. All of these models can be efficiently solved by linear programming, mixed integer/linear programming, or (in two cases) specialized convex programming methods.	翻訳日:2021-02-03 12:44:55 公開日:2021-02-02
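
The last abstract above lists alpha fairness among the social welfare functions it treats. As a minimal sketch, assuming the standard textbook definition $W_\alpha(u) = \sum_i u_i^{1-\alpha}/(1-\alpha)$ for $\alpha \neq 1$ and $\sum_i \log u_i$ for $\alpha = 1$, the snippet below shows how the parameter trades efficiency against equality. The function name and the two sample allocations are hypothetical illustrations, and the paper's own optimization models are not reproduced here.

```python
import math


def alpha_fair_welfare(utilities, alpha):
    """Alpha-fairness social welfare of strictly positive utilities.

    alpha = 0 gives the utilitarian sum, alpha = 1 the sum of logs
    (proportional fairness); larger alpha weights the worst-off more.
    """
    if any(u <= 0 for u in utilities):
        raise ValueError("utilities must be strictly positive")
    if alpha < 0:
        raise ValueError("alpha must be non-negative")
    if alpha == 1:
        return sum(math.log(u) for u in utilities)
    return sum(u ** (1 - alpha) / (1 - alpha) for u in utilities)


# Two hypothetical allocations with the same total utility.
equal = [5.0, 5.0]
unequal = [9.0, 1.0]

for alpha in (0, 1, 2):
    print(alpha, alpha_fair_welfare(equal, alpha), alpha_fair_welfare(unequal, alpha))
```

With $\alpha = 0$ the two allocations score the same because only the total matters; for $\alpha = 1$ and $\alpha = 2$ the equal allocation scores higher, which is the sense in which larger $\alpha$ favours the worst-off.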

# 熱真空に結合した自由ブラウン粒子のエネルギー

Energy of a free Brownian particle coupled to thermal vacuum ( http://arxiv.org/abs/2003.13567v2 )

ライセンス: Link先を確認

J. Spiechowicz, J. Łuczka

(参考訳) 実験家は絶対零度に非常に近い温度に到達しており、そこではかつて普通であった物理が異常なものになる。このような領域では、量子効果とゆらぎが支配的な役割を果たし始める。この文脈で、最も単純な開放量子系、すなわち熱真空 (絶対零度の極限にある熱浴) に結合した自由な量子ブラウン粒子を研究する。粒子と熱真空の間の相互作用強度 $c$ が弱い場合から強い場合まで、粒子の平均エネルギー $E=E(c)$ を解析する。様々な散逸機構の影響を考察する。弱結合領域ではエネルギーは $E(c) \sim c\, \ln{(1/c)}$ としてゼロに近づくが、強結合領域では $E(c) \sim \sqrt{c}$ として無限大に発散する。このことを、一般化ランジュバン方程式のメモリカーネル $\gamma(t)$ で定義される散逸機構のいくつかの例について示す。$c$ を固定したときにエネルギー $E(c)$ が散逸モデルにどのように依存するかを明らかにする: 散逸関数 $\gamma(t)$ の導関数 $\gamma'(t)$ の値を、時刻 $t=0$ において、またはブラウン粒子の力学の非マルコフ性の度合いを特徴づけるメモリ時間 $t=\tau_c$ において比較すればよい。低温の影響も示す。

Experimentalists have come to temperatures very close to absolute zero at which physics that was once ordinary becomes extraordinary. In such a regime quantum effects and fluctuations start to play a dominant role. In this context we study the simplest open quantum system, namely, a free quantum Brownian particle coupled to thermal vacuum, i.e. thermostat in the limiting case of absolute zero temperature. We analyze the average energy $E=E(c)$ of the particle from a weak to strong interaction strength $c$ between the particle and thermal vacuum. The impact of various dissipation mechanisms is considered. In the weak coupling regime the energy tends to zero as $E(c) \sim c\, \ln{(1/c)}$ while in the strong coupling regime it diverges to infinity as $E(c) \sim \sqrt{c}$. We demonstrate it for selected examples of the dissipation mechanisms defined by the memory kernel $\gamma(t)$ of the Generalized Langevin Equation. We reveal how at a fixed value of $c$ the energy $E(c)$ depends on the dissipation model: one has to compare values of the derivative $\gamma'(t)$ of the dissipation function $\gamma(t)$ at time $t=0$ or at the memory time $t=\tau_c$ which characterizes the degree of non-Markovianity of the Brownian particle dynamics. The impact of low temperature is also presented.

翻訳日:2023-05-27 12:12:32 公開日:2021-02-02

# 置換不規則二元晶材料モデリングのためのヒューリスティック量子古典アルゴリズム

A Heuristic Quantum-Classical Algorithm for Modeling Substitutionally Disordered Binary Crystalline Materials ( http://arxiv.org/abs/2004.00957v3 )

ライセンス: Link先を確認

Tanvi P. Gujarati, Tyler Takeshita, Andreas Hintennach, and Eunseok Lee

(参考訳) エネルギー計算の効率と精度の向上は、計算材料データに機械学習技術を適用する分野である材料情報学の分野において、重要かつ継続的な関心を集めている。本稿では,置換不規則二元晶材料のエネルギーを効率的にモデル化し,予測するヒューリスティック量子古典アルゴリズムを提案する。具体的には、格子サイト数で線形にスケールする量子回路を設計し、指数的スケーリング特徴空間における量子化学シミュレーションのエネルギーを予測するために訓練する。この回路は、古典的計算された量子化学シミュレーションから得られたデータを用いて、古典的教師付き学習によって訓練される。トレーニングプロセスの一環として,入力データの異常を検出し,修正できるサブルーチンを導入する。このアルゴリズムは、広く使用されているリチウムイオン電池陰極材料であるLi-コバルテート系の複雑な層構造上で実証される。その結果,提案する量子回路モデルは,そのような量子力学系から得られるエネルギーのモデル化に最適であることがわかった。さらに、異常データの解析は、研究対象システムの熱力学特性に関する重要な洞察を与える。

Improving the efficiency and accuracy of energy calculations has been of significant and continued interest in the area of materials informatics, a field that applies machine learning techniques to computational materials data. Here, we present a heuristic quantum-classical algorithm to efficiently model and predict the energies of substitutionally disordered binary crystalline materials. Specifically, a quantum circuit that scales linearly in the number of lattice sites is designed and trained to predict the energies of quantum chemical simulations in an exponentially-scaling feature space. This circuit is trained by classical supervised learning using data obtained from classically-computed quantum chemical simulations. As a part of the training process, we introduce a sub-routine that is able to detect and rectify anomalies in the input data. The algorithm is demonstrated on the complex layer-structured Li-cobaltate system, a widely-used Li-ion battery cathode material component. Our results show that the proposed quantum circuit model presents a suitable choice for modelling the energies obtained from such quantum mechanical systems. Furthermore, analysis of the anomalous data provides important insights into the thermodynamic properties of the systems studied.

翻訳日:2023-05-27 03:25:28 公開日:2021-02-02

# 量子過程の因果構造を調べるための高速テスト

Fast tests for probing the causal structure of quantum processes ( http://arxiv.org/abs/2004.08308v3 )

ライセンス: Link先を確認

Giulio Chiribella and Swati

(参考訳) 因果関係の同定は科学的手法の基礎となっている。この課題に対する伝統的なアプローチは古典的な統計に基づいている。しかし、そのような古典的アプローチは、より広い因果関係のスペクトルがアクセス可能になる量子領域では適用されない。近年、量子因果推論の新しいアプローチが開発され、将来有望な新しい特徴が発見されている。本稿では、Refのフレームワークと結果をレビューし、部分的に拡張する。 [1]は可逆過程によって誘導される様々な種類の因果関係の同定において量子的スピードアップを示した。

The identification of causal relations is a cornerstone of the scientific method. Traditional approaches to this task are based on classical statistics. However, such classical approaches do not apply in the quantum domain, where a broader spectrum of causal relations becomes accessible. New approaches to quantum causal inference have been developed in recent years, and promising new features have been discovered. In this paper, we review and partly expand the framework and results of Ref. [1], which demonstrated quantum speedups in the identification of various types of causal relations induced by reversible processes.

翻訳日:2023-05-23 04:38:00 公開日:2021-02-02

# 格子準電子波動関数の連続極限

Continuum limit of lattice quasielectron wavefunctions ( http://arxiv.org/abs/2004.12205v2 )

ライセンス: Link先を確認

Aniket Patra, Birgit Hillebrecht, and Anne E. B. Nielsen

(参考訳) ラウリン状態における正準ホールを記述した試行状態が早期に発見され、従って、正準電子も生成できることを期待することは自然である。それでも、準電子に対する既存の試行波動関数は、期待される位相特性やそれらの構成と相容れない挙動を示す。しかし、格子分数量子ホール系では、期待される全ての性質 [new j. phys. 20, 033029 (2018)] を持つ比較的単純な準電子波動関数を見つけることができることが示されている。連続体極限におけるこの波動関数はどうなるのか? ここでは、準電子が格子点の上にあるときに有限連続な波動関数が得られるが、格子準電子の極限は一般に存在しないことを示す。特に、準電子が格子点以外の任意の場所に配置された場合、連続体極限が近づくと格子波動関数は発散する。発散は、最低ランダウレベルに状態を投影することで取り除くことができるが、投影された状態は、任意の準電子に対して期待される性質をも持たない。したがって、格子準電子波動関数は連続体における任意の準電子の試行状態を見つけることの難しさを解決しない。

Trial states describing anyonic quasiholes in the Laughlin state were found early on, and it is therefore natural to expect that one should also be able to create anyonic quasielectrons. Nevertheless, the existing trial wavefunctions for quasielectrons show behaviors that are not compatible with the expected topological properties or their construction involves ad hoc elements. It was shown, however, that for lattice fractional quantum Hall systems, it is possible to find a relatively simple quasielectron wavefunction that has all the expected properties [New J. Phys. 20, 033029 (2018)]. This naturally poses the question: what happens to this wavefunction in the continuum limit? Here we demonstrate that, although one obtains a finite continuum wavefunction when the quasielectron is on top of a lattice site, such a limit of the lattice quasielectron does not exist in general. In particular, if the quasielectron is put anywhere else than on a lattice site, the lattice wavefunction diverges when the continuum limit is approached. The divergence can be removed by projecting the state on the lowest Landau level, but we find that the projected state does also not have the properties expected for anyonic quasielectrons. We hence conclude that the lattice quasielectron wavefunction does not solve the difficulty of finding trial states for anyonic quasielectrons in the continuum.

翻訳日:2023-05-22 04:01:29 公開日:2021-02-02

# 完全無秩序2次元量子ウォークにおける位相的非局在化

Topological delocalization in the completely disordered two-dimensional quantum walk ( http://arxiv.org/abs/2005.00203v3 )

ライセンス: Link先を確認

J\'anos K. Asb\'oth, Arindam Mallick

(参考訳) 空間障害が2つの内部「コイン」状態を持つ2次元分割ステップ離散時間量子ウォークに与える影響を数値解析および理論的に検討した。空間障害はアンダーソンの局所化につながり、量子ウォークの拡散を阻害し、拡散的に広がる古典的ウォークに対して不利な状態に陥る。最も一般的なタイプの空間的障害、すなわち位置依存的なハールランダムコイン作用素は、アンダーソンの局所化ではなく拡散拡散につながる。これは非局在化であり、これは障害が量子ウォークを異なる異常なフロケ・アンダーソン絶縁位相の間の臨界点に配置するためである。この説明は、この一般的な量子ウォークと、文献でより研究されたより単純なケースとの関係と、障害による位相的起源の非局在化が観察されたことに基づく。我々は、波動関数の時間発展とレベル間隔統計を用いて、より単純な量子ウォークのための位相的非局在化をレビューする。散乱理論を2次元量子ウォークに適用し、乱れた量子ウォークの位相不変量を計算し、非局在化の位相的解釈を裏付け、伝送の有限スケールスケールにおける非局在化の符号を求める。我々は、3つの異なる方法で臨界指数$\eta$を計算し、整数量子ホール効果のように$\eta$$\approx$ 0.52を求めることで、ハール乱量子ウォークの臨界性を示す。固体物理学の理論的アイデアと数値ツールが、空間的にランダムな量子ウォークを理解する上でどのように役立つかを示す。

We investigate numerically and theoretically the effect of spatial disorder on two-dimensional split-step discrete-time quantum walks with two internal "coin" states. Spatial disorder can lead to Anderson localization, inhibiting the spread of quantum walks, putting them at a disadvantage against their diffusively spreading classical counterparts. We find that spatial disorder of the most general type, i.e., position-dependent Haar random coin operators, does not lead to Anderson localization but to a diffusive spread instead. This is a delocalization, which happens because disorder places the quantum walk at a critical point between different anomalous Floquet-Anderson insulating topological phases. We base this explanation on the relationship of this general quantum walk to a simpler case more studied in the literature and for which disorder-induced delocalization of a topological origin has been observed. We review topological delocalization for the simpler quantum walk, using time evolution of the wave functions and level spacing statistics. We apply scattering theory to two-dimensional quantum walks and thus calculate the topological invariants of disordered quantum walks, substantiating the topological interpretation of the delocalization and finding signatures of the delocalization in the finite-size scaling of transmission. We show criticality of the Haar random quantum walk by calculating the critical exponent $\eta$ in three different ways and find $\eta \approx 0.52$ as in the integer quantum Hall effect. Our results showcase how theoretical ideas and numerical tools from solid-state physics can help us understand spatially random quantum walks.

翻訳日:2023-05-21 15:08:07 公開日:2021-02-02

# 最大通勤初期ハミルトニアンをもつ分子エネルギーに対する量子ゼノアプローチ

Quantum Zeno approach for molecular energies with maximum commuting initial Hamiltonians ( http://arxiv.org/abs/2006.01066v2 )

ライセンス: Link先を確認

Hongye Yu, Tzu-Chieh Wei

(参考訳) 本稿では,小分子の基底状態を計算するために,量子断熱およびシミュレーションアニールフレームワークを提案する。我々のアルゴリズムの最初のハミルトニアンは、パウリ基底の分子のフルハミルトニアンにおける可換項の最大集合からなる最大可換ハミルトニアン(maximum commuting hamiltonian)である。我々は2つの変種を考える。第1の方法では、最大可換ハミルトニアンの基底状態を初期状態として、得られた時間または経路依存ハミルトニアンの断熱的進化を行う。しかし、この手法はハミルトニアン経路に沿った縮退性やエネルギーレベルの交差による断熱量子計算の通常の問題に苦しむ。この問題は、量子シミュレーションアニーリングで使われる一連の固有状態投影(eigenstate projections)を通じてゼノ法によって緩和され、経路依存ハミルトニアンはパウリ X 項の和によって拡張され、その寄与は経路の始点と終点で消滅する。基底状態に加えて、この量子Zenoアプローチを用いて、基底状態と同程度の精度で低エネルギー励起状態が得られる。

We propose to use a quantum adiabatic and simulated-annealing framework to compute the ground state of small molecules. The initial Hamiltonian of our algorithms is taken to be the maximum commuting Hamiltonian that consists of a maximal set of commuting terms in the full Hamiltonian of molecules in the Pauli basis. We consider two variants. In the first method, we perform the adiabatic evolution on the obtained time- or path-dependent Hamiltonian with the initial state as the ground state of the maximum commuting Hamiltonian. However, this method does suffer from the usual problems of adiabatic quantum computation due to degeneracy and energy-level crossings along the Hamiltonian path. This problem is mitigated by a Zeno method, i.e., via a series of eigenstate projections used in the quantum simulated annealing, with the path-dependent Hamiltonian augmented by a sum of Pauli X terms, whose contribution vanishes at the beginning and the end of the path. In addition to the ground state, the low lying excited states can be obtained using this quantum Zeno approach with equal accuracy to that of the ground state.

翻訳日:2023-05-17 11:18:35 公開日:2021-02-02

# 市民科学研究におけるモノのインターネットの利用に関する倫理的問題:スコーピング・レビュー

Ethical issues with using Internet of Things devices in citizen science research: A scoping review ( http://arxiv.org/abs/2007.09416v2 )

ライセンス: Link先を確認

James Scheibner, Anna Jobin, Effy Vayena

(参考訳) 本章では,市民科学者とインターネット・オブ・モノ(Internet of Things)デバイスの両方を活用する科学研究のスコーピングレビューを行った。具体的には、著者らが研究過程で遭遇した倫理的問題について少なくとも短い議論を含む研究を選択した。 IEEE Xplore, Scopus, Web of Science, ProQuest, PubMedの5つのデータベースを検索した結果、631の潜在的な結果が得られた。要約とタイトルのスクリーニングの後、全文の適格性評価を行い、基準に合致した34の論文を特定した。そして、これらの記事の全文を帰納的かつ帰納的に分析し、倫理問題を3つの主要なカテゴリに分けた。これらのカテゴリは、自律性とデータプライバシ、データ品質、知的財産である。我々はまた、これらの論文の全文を分析し、研究者がこれらの倫理的問題を解決するためにどのような戦略を採ったか、また法的意味を提起した。この分析に続き、市民科学者とIoTデバイスを研究に統合したい研究者に推奨する。まず、すべての市民科学プロジェクトは、参加者の機密性を保護するためにデータプライバシープロトコルを統合するべきである。第二に、科学研究者はプロジェクトを始める前に、妥協が必要かどうかなど、データ品質の潜在的な問題を検討するべきである。最後に、すべての知的財産問題はプロジェクトの開始時とライフサイクル中に明確にする必要があります。研究者は、商用のモノのインターネット(Internet of Things)デバイスによる研究から生じる倫理的問題も考慮すべきである。

Our chapter presents a scoping review of published scientific studies or case studies of scientific studies that utilise both citizen scientists and Internet of Things devices. Specifically, we selected studies where the authors had included at least a short discussion of the ethical issues encountered during the research process. Having conducted a search of five databases (IEEE Xplore, Scopus, Web of Science, ProQuest, and PubMed), we identified 631 potential results. Following abstract and title screening, and then full text eligibility assessment, we identified 34 published articles that matched our criteria. We then analysed the full text for these articles inductively and deductively, coding ethical issues into three main categories. These categories were autonomy and data privacy, data quality, and intellectual property. We also analysed the full text of these articles to see what strategies researchers took to resolve these ethical issues, as well as any legal implications raised. Following this analysis, our discussion provides recommendations for researchers who wish to integrate citizen scientists and Internet of Things devices into their research. First, all citizen science projects should integrate a data privacy protocol to protect the confidentiality of participants. Secondly, scientific researchers should consider any potential issues of data quality, including whether compromises might be required, before establishing a project. Finally, all intellectual property issues should be clarified both at the start of the project and during its lifecycle. Researchers should also consider any ethical issues that might flow from the use of commercially available Internet of Things devices for research.

翻訳日:2023-05-09 03:05:43 公開日:2021-02-02

# コヒーレント駆動Vレベル原子を用いた連続狭帯域ラシング

Continuous narrowband lasing with coherently driven V-level atoms ( http://arxiv.org/abs/2007.12522v2 )

ライセンス: Link先を確認

Christoph Hotter, David Plankensteiner, Helmut Ritsch

(参考訳) 非常に異なる速度のVレベル原子の2つの遷移の強いコヒーレントポンプは、より狭い遷移にほぼ完全な反転をもたらすと予測されている。ストロンチウムの青と赤の遷移の例を用いて、適切な条件下では、対応する共鳴ゲインを連続してレーザーをシャロー遷移で操作することができることを示す。特に、狭い遷移に関するポンプ磁場の強い変形は、ラシングモードに散乱するコヒーレントポンプ光からの無視可能な寄与のみを示す素原子遷移周波数に近づき、キャビティ出力スペクトルの計算により、結果として生じるレーザー光線幅がポンプ光の帯域幅や狭原子遷移の自然な直線幅よりもはるかに小さくなることが示されている。その周波数は、適切に選択された原子番号の原子遷移周波数と密接に結びついている。原子運動ショーのドップラー冷却を含むシミュレーションは、発振遷移の小さな運動加熱を伴う強い遷移のドップラー冷却を含むため、現在の実験技術では磁気光学トラップの先端における連続的なレーザー操作が可能となる。

Simultaneous strong coherent pumping of the two transitions of a V-level atom with very different decay rates has been predicted to create almost perfect inversion on the narrower transition. Using the example of the blue and red transitions in Strontium we show that for suitable operating conditions the corresponding resonant gain can be used to continuously operate a laser on the narrow transition. In particular, for a strong detuning of the pump field with respect to the narrow transition, coherent laser emission occurs close to the bare atomic transition frequency exhibiting only a negligible contribution from coherent pump light scattered into the lasing mode. Calculations of the cavity output spectrum show that the resulting laser linewidth can get much smaller than the bandwidth of the pump light and even the natural linewidth of the narrow atomic transition. Its frequency is closely tied to the atomic transition frequency for properly chosen atom numbers. Simulations including atomic motion show Doppler cooling on the strong transition with minor motion heating on the lasing transition, so that continuous laser operation in the presence of a magneto-optical trap should be possible with current experimental technology.

翻訳日:2023-05-08 08:29:26 公開日:2021-02-02

# 弱非線形ジョセフソン接合浴を用いた量子系力学

Quantum system dynamics with a weakly nonlinear Josephson junction bath ( http://arxiv.org/abs/2008.08052v2 )

ライセンス: Link先を確認

Jing Yang, \'Etienne Jussiau, Cyril Elouard, Karyn Le Hur, and Andrew N. Jordan

(参考訳) ジョセフソン接合の鎖からなる弱非線形ジョセフソン浴が小量子系(lc発振器)のダイナミクスに与える影響について検討した。電荷エネルギーが最大のエネルギースケールである状態に着目し、コサインポテンシャルを正確に保ちながら、電荷エネルギーによって分割されたジョセフソンエネルギーの先頭次数に対するジョセフソン浴の相関関数を摂動的に計算する。チェーンに沿った帯電エネルギーの変化がバス相関関数の高速崩壊を確実にするときに、ジョセフソン浴に弱く容量的に結合したlc発振器のダイナミクスをマルコフマスター方程式により解くことができる。ジョゼフソン浴とジョゼフソン浴の2重性関係をそれぞれ大帯電系とジョゼフソンエネルギー系で確立する。この結果は、電荷エネルギーが不均一に工学されたり、鎖に乱れたりした場合に適用できる。さらに, 温度がゼロ温度限界を超えれば, ジョセフソン浴は非マルコフ型になる可能性があり, バス相関関数が一定にシフトし, 時間とともに減衰しないことがわかった。

We investigate the influence of a weakly nonlinear Josephson bath consisting of a chain of Josephson junctions on the dynamics of a small quantum system (LC oscillator). Focusing on the regime where the charging energy is the largest energy scale, we perturbatively calculate the correlation function of the Josephson bath to the leading order in the Josephson energy divided by the charging energy while keeping the cosine potential exactly. When the variation of the charging energy along the chain ensures fast decay of the bath correlation function, the dynamics of the LC oscillator that is weakly and capacitively coupled to the Josephson bath can be solved through the Markovian master equation. We establish a duality relation for the Josephson bath between the regimes of large charging and Josephson energies respectively. The results can be applied to cases where the charging energy either is nonuniformly engineered or disordered in the chain. Furthermore, we find that the Josephson bath may become non-Markovian when the temperature is increased beyond the zero-temperature limit in that the bath correlation function gets shifted by a constant and does not decay with time.

翻訳日:2023-05-05 22:47:47 公開日:2021-02-02

# 複数の絡み合った光子をもつ量子照明

Quantum illumination with multiple entangled photons ( http://arxiv.org/abs/2008.09455v4 )

ライセンス: Link先を確認

Ricardo Gallego Torrom\'e, Nadya Ben Bekhti-Winkel and Peter Knott

(参考訳) 本研究では、ロイドの量子照明を2つの絡み合った光子状態によって記述された信号ビームに理論的に一般化する。この新しいプロトコルは,目標範囲を探索する方法を提供し,ロイドの量子照明と同じノイズ比の信号を持つために必要な時間帯域幅積のサイズを小さくし,偽陽性の確率を低くし,雑音に対して弾力性があり,損失にも耐えうることを示す。しかし、プロトコルに必要な3つの光子状態の生成は、その実用的実装が完全には解決されない技術的な問題を引き起こす。この問題を克服できる三重光子生成の最近の進歩を論じる。プロトコルに関する他の問題も考慮されている。

In this work, a theoretical generalization of Lloyd's quantum illumination to signal beams described by two entangled photon states is developed. It is shown that the new protocol offers a method to find the range of the target, reduces the size of the time-bandwidth product required to reach the same signal-to-noise ratio as in Lloyd's quantum illumination, has a lower probability of false positives, and is resilient against noise and potentially also against losses. However, the generation of the three-photon states required by the protocol poses a technical problem for its practical implementation that is not yet fully addressed. Recent advances in triple photon generation that can overcome this problem are discussed. Other issues related to the protocol are also considered.

翻訳日:2023-05-05 12:07:34 公開日:2021-02-02

# 条件付き光子検出による2つのデカップリング量子リミットサイクル発振器の瞬時位相同期

Instantaneous phase synchronization of two decoupled quantum limit-cycle oscillators induced by conditional photon detection ( http://arxiv.org/abs/2009.08286v2 )

ライセンス: Link先を確認

Yuzuru Kato, Hiroya Nakao

(参考訳) 条件付き光子検出は、2つのデカップリング量子リミットサイクル発振器間の瞬時位相同期を誘導する。相互結合のない2つの量子ファンデルポル発振器について検討し、それぞれに線形結合浴を付加し、ビームスプリッタを介して相互作用する2つの浴の出力場に基づいて光子を連続測定する。光子検出後に2つの分離発振器の相内あるいは反相コヒーレンスが瞬時に増加し、弱量子状態においては徐々に減少し、次の光子検出まで強い量子状態では急速に減少する。強い量子構造では、光子検出後に量子の絡み合いも増加し、すぐに消滅する。量子エンタングルメントと位相コヒーレンスの増加に対する解析上界を、量子極限における条件付き光子検出によって導出する。

We show that conditional photon detection induces instantaneous phase synchronization between two decoupled quantum limit-cycle oscillators. We consider two quantum van der Pol oscillators without mutual coupling, each with an additional linearly coupled bath, and perform continuous measurement of photon counting on the output fields of the two baths interacting through a beam splitter. It is observed that in-phase or anti-phase coherence of the two decoupled oscillators instantaneously increases after the photon detection and then decreases gradually in the weak quantum regime or quickly in the strong quantum regime until the next photon detection occurs. In the strong quantum regime, quantum entanglement also increases after the photon detection and quickly disappears. We derive the analytical upper bounds for the increases in the quantum entanglement and phase coherence by the conditional photon detection in the quantum limit.

翻訳日:2023-05-02 00:09:18 公開日:2021-02-02

# 感染拡大に伴う個人用保護具(ppe)の在庫管理強化のためのゲーム理論

Game theory to enhance stock management of Personal Protective Equipment (PPE) during the COVID-19 outbreak ( http://arxiv.org/abs/2009.11838v3 )

ライセンス: Link先を確認

Khaled Abedrabboh, Matthias Pilz, Zaid Al-Fagih, Othman S. Al-Fagih, Jean-Christophe Nebel, Luluwah Al-Fagih

(参考訳) 新型コロナウイルス(covid-19)のパンデミック以降、多くの医療施設は医療資源の不足、特に個人用防護具(ppe)の不足に苦しめられている。本稿では,医療施設間でPPE注文をスケジュールするゲーム理論アプローチを提案する。このPPEゲームでは、個々の独立した医療施設が、PPEのコストを最小限に抑えるために、自身のストレージ利用を最適化する。このようなモデルは、可変ppe消費プロファイルに適用するとピーク需要を大幅に削減することができる。実際のデータを用いてNHSイングランド地域で実施した実験では、適切な株式管理手順が採用されれば、新型コロナウイルスなどの災害時のPEP供給確保の課題が緩和できることが確認されている。これらの手順には、早期の備蓄、貯蔵能力の増大、社会的距離調整など、連続的な感染波間の期間を延長する措置の実施が含まれる。シミュレーションの結果,第2波のcovid-19感染が発生した場合,ppe専用ストレージスペースの提供はppeサプライチェーンの歪みを回避するための有効な解決策となる可能性が示唆された。

Since the outbreak of the COVID-19 pandemic, many healthcare facilities have suffered from shortages in medical resources, particularly in Personal Protective Equipment (PPE). In this paper, we propose a game-theoretic approach to schedule PPE orders among healthcare facilities. In this PPE game, each independent healthcare facility optimises its own storage utilisation in order to keep its PPE cost at a minimum. Such a model can reduce peak demand considerably when applied to a variable PPE consumption profile. Experiments conducted for NHS England regions using actual data confirm that the challenge of securing PPE supply during disasters such as COVID-19 can be eased if proper stock management procedures are adopted. These procedures can include early stockpiling, increasing storage capacities and implementing measures that can prolong the time period between successive infection waves, such as social distancing measures. Simulation results suggest that the provision of PPE dedicated storage space can be a viable solution to avoid straining PPE supply chains in case a second wave of COVID-19 infections occurs.

翻訳日:2023-05-01 02:37:17 公開日:2021-02-02

# スケーラブルなフォトニックフォールトトレラント量子コンピュータのための青写真

Blueprint for a Scalable Photonic Fault-Tolerant Quantum Computer ( http://arxiv.org/abs/2010.02905v2 )

ライセンス: Link先を確認

J. Eli Bourassa, Rafael N. Alexander, Michael Vasmer, Ashlesha Patil, Ilan Tzitrin, Takaya Matsuura, Daiqin Su, Ben Q. Baragiola, Saikat Guha, Guillaume Dauphinais, Krishna K. Sabapathy, Nicolas C. Menicucci, Ish Dhand

(参考訳) Photonicsは、室温で動くモジュラーで簡単にネットワークに繋がる量子コンピュータを構築するためのプラットフォームだ。しかし、光状態に符号化された量子ビットの利点と、その生成のための現代的なツールの両方を利用する具体的なアーキテクチャは提示されていない。本稿では,最新の理論・技術の発展にともなう,スケーラブルでフォールトトレラントなフォトニック量子コンピュータの設計を提案する。我々のアーキテクチャの中心は、ボゾン量子ビットと圧縮真空状態の両方からなる3次元ハイブリッド資源状態の生成と操作である。本提案は, ボソニック量子ビットの非決定論的生成と連続変数量子計算の強み, すなわち, 容易に生成できる圧縮状態を用いたクリフォードゲートの実装を組み合わせた, 最先端の手順を活用できる。さらに、このアーキテクチャは1つの時間次元と2つの空間次元のキュービットクラスタ状態を生成するために使用される2次元集積フォトニックチップに基づいている。既存のアーキテクチャに比べて実験的な課題を少なくし、室温量子計算を可能にすることで、我々の設計はスケーラブルな製造と運用への扉を開く。

Photonics is the platform of choice to build a modular, easy-to-network quantum computer operating at room temperature. However, no concrete architecture has been presented so far that exploits both the advantages of qubits encoded into states of light and the modern tools for their generation. Here we propose such a design for a scalable and fault-tolerant photonic quantum computer informed by the latest developments in theory and technology. Central to our architecture is the generation and manipulation of three-dimensional hybrid resource states comprising both bosonic qubits and squeezed vacuum states. The proposal enables exploiting state-of-the-art procedures for the non-deterministic generation of bosonic qubits combined with the strengths of continuous-variable quantum computation, namely the implementation of Clifford gates using easy-to-generate squeezed states. Moreover, the architecture is based on two-dimensional integrated photonic chips used to produce a qubit cluster state in one temporal and two spatial dimensions. By reducing the experimental challenges as compared to existing architectures and by enabling room-temperature quantum computation, our design opens the door to scalable fabrication and operation, which may allow photonics to leap-frog other platforms on the path to a quantum computer with millions of qubits.

翻訳日:2023-04-29 20:13:56 公開日:2021-02-02

# tqix: Xにおける量子のためのツールボックス:量子計測、量子トモグラフィ、量子メトロジーなど

tqix: A toolbox for Quantum in X: Quantum measurement, quantum tomography, quantum metrology, and others ( http://arxiv.org/abs/2010.03731v2 )

ライセンス: Link先を確認

Le Bin Ho, Kieu Quang Tuan, Hung Q. Nguyen

(参考訳) 本稿では,Python言語で書かれたオープンソースのコンピュータプログラムについて述べる。本プログラムでは,量子ゲートを含む量子状態と演算子を行列で表現した量子対象関数として開発することができる。プログラムに組み込むには、フォン・ノイマン測度や弱い測度を含むいくつかの測度スキームがある。実実験結果の再現には様々な数値シミュレーション手法が用いられる。まず、プログラム構造の概要を説明し、次いで量子計測の数値シミュレーションについて議論する。我々は,量子状態トモグラフィと量子メトロロジーを用いてプログラムの性能を説明する。このプログラムは量子物理学の一般的な言語で構築されており、量子光学、イオントラップ、超伝導回路デバイスなどの様々な物理プラットフォームに広く適用可能である。また、様々な量子システムのシミュレーションと可視化による教室指導での使用も理想的である。

We present an open-source computer program written in Python for quantum measurement and related issues. In our program, quantum states and operators, including quantum gates, can be developed into a quantum-object function represented by a matrix. Built into the program are several measurement schemes, including von Neumann measurement and weak measurement. Various numerical simulation methods are used to mimic the real experiment results. We first provide an overview of the program structure and then discuss the numerical simulation of quantum measurement. We illustrate the program's performance via quantum state tomography and quantum metrology. The program is built in a general language of quantum physics and thus is widely adaptable to various physical platforms, such as quantum optics, ion traps, superconducting circuit devices, and others. It is also well suited to classroom use, with simulation and visualization of various quantum systems.
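
As a rough illustration of what a projective (von Neumann) measurement of a state vector involves, the following sketch applies the Born rule in plain Python. It deliberately does not use the tqix API, whose function names are not given in the abstract; `measurement_probabilities`, `sample_outcomes`, and the chosen state are hypothetical names and values used only for this example.

```python
import math
import random


def measurement_probabilities(state):
    """Born-rule probabilities for measuring in the computational basis.

    `state` is a list of complex amplitudes; it is normalized first.
    """
    norm = math.sqrt(sum(abs(a) ** 2 for a in state))
    if norm == 0:
        raise ValueError("state vector must be non-zero")
    return [abs(a / norm) ** 2 for a in state]


def sample_outcomes(state, shots, seed=0):
    """Simulate `shots` repeated measurements and return counts per outcome."""
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    probs = measurement_probabilities(state)
    outcomes = rng.choices(range(len(state)), weights=probs, k=shots)
    return [outcomes.count(i) for i in range(len(state))]


# Equal superposition of |0> and |1>.
plus = [1 / math.sqrt(2), 1 / math.sqrt(2)]
probs = measurement_probabilities(plus)
counts = sample_outcomes(plus, shots=1000)
print(probs, counts)
```

The sampled counts fluctuate around the Born-rule probabilities, which is the kind of finite-statistics behaviour a simulation of a real experiment has to reproduce.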

翻訳日:2023-04-29 15:49:19 公開日:2021-02-02

# 量子コンピューティングのための動的自己エネルギーマッピング(DSEM)

Dynamical Self-energy Mapping (DSEM) for quantum computing ( http://arxiv.org/abs/2010.05441v2 )

ライセンス: Link先を確認

Diksha Dhawan, Mekena Metcalf, Dominika Zgid

(参考訳) ノイズの多い中間スケール量子(NISQ)デバイスでは、コヒーレンスに制限のある適度な数の量子ビットしか利用できないため、現在実行されている量子計算において、浅い回路と数回の進化ステップしか実現できない。本稿では,標準ハミルトニアンの$\mathcal{o}(n^4)$項と比較して,ガウス軌道基底において$\mathcal{o}(n^2)$項のみを含むスパースハミルトニアンを生成できる古典量子ハイブリッドアルゴリズムを用いて,nisqデバイスにおける分子化学シミュレーションにおいて,この課題を回避する方法を提案する。このハイブリッドの古典的な部分は、元の分子系の自己エネルギーを回復するように、スパースで架空のハミルトンのパラメータ化を必要とする。量子機械はこの架空のハミルトニアンを用いてシステムの自己エネルギーを計算する。開発したハイブリッドアルゴリズムは, 完全ハミルトニアンを含むシミュレーションと比較して, 量子回路の深さを少なくとも1桁小さくしながら, 小型分子テストケースにおいて非常に良好な総エネルギーが得られることを示す。

For noisy intermediate-scale quantum (NISQ) devices, only a moderate number of qubits with limited coherence is available, which permits only shallow circuits and a few time-evolution steps in currently performed quantum computations. Here, we present how to bypass this challenge in practical molecular chemistry simulations on NISQ devices by employing a classical-quantum hybrid algorithm allowing us to produce a sparse Hamiltonian which contains only $\mathcal{O}(n^2)$ terms in a Gaussian orbital basis when compared to the $\mathcal{O}(n^4)$ terms of a standard Hamiltonian, where $n$ is the number of orbitals in the system. The classical part of this hybrid entails parameterization of the sparse, fictitious Hamiltonian in such a way that it recovers the self-energy of the original molecular system. The quantum machine then uses this fictitious Hamiltonian to calculate the self-energy of the system. We show that the developed hybrid algorithm yields very good total energies for small molecular test cases while reducing the depth of the quantum circuit by at least an order of magnitude when compared with simulations involving a full Hamiltonian.

翻訳日:2023-04-29 07:33:58 公開日:2021-02-02

# maxcut qaoa による p > 1 のパフォーマンス保証

MAXCUT QAOA performance guarantees for p > 1 ( http://arxiv.org/abs/2010.11209v2 )

ライセンス: Link先を確認

Jonathan Wurtz, Peter J. Love

(参考訳) 均一な3つの正則グラフ上でMAXCUTに対する$p=2$および$$QAOAの最悪のケース性能保証を得る。 Farhiらによる以前の研究は、近似比が0.692$ for $p=1$の低い境界を得た。 0.7559$ for $p=2$で、最悪のケースグラフはサイクルのないグラフである。この境界は、特定の固定パラメータで評価された任意の3つの正規グラフに対して成り立つ。最悪のケースグラフがサイクルを持たないすべての$p$に対して$\leq 2p+1$を予想する。この予想の下で、近似比は3つの正則グラフすべてに対して少なくとも$0.7924$であり、$p=3$である。さらに、単純な区別不可能な議論を用いて、すべての$p$に対する最悪のケース近似比の上限を見つけ、これは少なくとも$p<6$に対して量子的優位性を持たないグラフのクラスを示す。

We obtain worst case performance guarantees for $p=2$ and $3$ QAOA for MAXCUT on uniform 3-regular graphs. Previous work by Farhi et al. obtained a lower bound on the approximation ratio of $0.692$ for $p=1$. We find a lower bound of $0.7559$ for $p=2$, where worst case graphs are those with no cycles of length $\leq 5$. This bound holds for any 3-regular graph evaluated at particular fixed parameters. We conjecture a hierarchy for all $p$, where worst case graphs have no cycles of length $\leq 2p+1$. Under this conjecture, the approximation ratio is at least $0.7924$ for all 3-regular graphs and $p=3$. In addition, using a simple indistinguishability argument we find an upper bound on the worst case approximation ratio for all $p$, which indicates classes of graphs for which there can be no quantum advantage for at least $p<6$.
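
The $p=1$ figure quoted above can be checked numerically with a short sketch. It assumes the closed form commonly used for $p=1$ QAOA on triangle-free 3-regular graphs, in which the expected cut fraction per edge is $\tfrac{1}{2} + \tfrac{1}{2}\sin(4\beta)\sin(\gamma)\cos^2(\gamma)$; the parametrization of $\gamma$ and $\beta$ is one possible convention and is an assumption here, as are the function names. The $p=2$ and $p=3$ bounds in the abstract need the paper's own analysis and are not reproduced.

```python
import math


def p1_edge_expectation(gamma, beta):
    """Expected cut fraction per edge for p=1 QAOA on a triangle-free
    3-regular graph (assumed closed form and angle convention)."""
    return 0.5 + 0.5 * math.sin(4 * beta) * math.sin(gamma) * math.cos(gamma) ** 2


def grid_maximum(steps=400):
    """Brute-force grid search over gamma in [0, pi] and beta in [0, pi/2]."""
    best_value, best_gamma, best_beta = 0.0, 0.0, 0.0
    for i in range(steps + 1):
        gamma = math.pi * i / steps
        for j in range(steps + 1):
            beta = (math.pi / 2) * j / steps
            value = p1_edge_expectation(gamma, beta)
            if value > best_value:
                best_value, best_gamma, best_beta = value, gamma, beta
    return best_value, best_gamma, best_beta


value, gamma, beta = grid_maximum()
analytic = 0.5 + 1 / (3 * math.sqrt(3))  # maximum of the closed form above
print(round(value, 4), round(analytic, 4))
```

Because every edge of such a graph contributes the same expectation and the maximum cut cannot exceed the number of edges, this per-edge value also lower-bounds the approximation ratio in the cycle-free case, which is consistent with the roughly $0.692$ quoted for $p=1$.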

翻訳日:2023-04-28 03:01:13 公開日:2021-02-02

# 光格子におけるスピン軌道結合原子の指向性原子運動と二階トンネル制御

Controlling directed atomic motion and second-order tunneling of a spin-orbit-coupled atom in optical lattices ( http://arxiv.org/abs/2011.01399v2 )

ライセンス: Link先を確認

Xiaobing Luo, Zhao-Yun Zeng, Yu Guo, Baiyuan Yang, Jinpeng Xiao, Lei Li, Chao Kong, and Ai-Xi Chen

(参考訳) 格子揺らぎと時間周期ゼーマン場を受ける光学格子に閉じ込められた単一スピン軌道結合原子の密結合(TB)モデルに対するトンネル力学を理論的に探求する。解析的および数値的手法により、スピン軌道結合(SO)が多光子共鳴および遠方共振パラメータ系におけるトンネル力学にいくつかの新しい結果をもたらすことを示した。 When the driving frequency is resonant with the static Zeeman field (multi-photon resonances), we obtain an unexpected new dynamical localization (DL) phenomenon where the single SO-coupled atom is restricted to making perfect two-site Rabi oscillation accompanied by spin flipping.By using the unconventional DL phenomenon, we are able to generate a ratchetlike effect which enables directed atomic motion towards different directions and accompanies periodic spin-flipping under the action of SO coupling. 遠方共振の場合,通常のサイト間トンネルのみを抑えることで,SO結合のない従来の格子系ではアクセスできない次熱処理部位間でのスピン保存2次トンネルの実現が可能であることを示す。また, クエージーの平坦性(崩壊)と完全に凍結するダイナミクスの存在には, 通常の現場間トンネルとSO結合関連2次トンネルの同時制御が必要であることを示す。これらの結果はスピンベースの量子情報処理や新しいスピントロニクスデバイスの設計といった潜在的な応用に関係している可能性がある。

We theoretically explore the tunneling dynamics for the tight-binding (TB) model of a single spin-orbit-coupled atom trapped in an optical lattice subjected to lattice shaking and to a time-periodic Zeeman field. By means of analytical and numerical methods, we demonstrate that the spin-orbit (SO) coupling adds some new results to the tunneling dynamics in both multiphoton resonance and far-off-resonance parameter regimes. When the driving frequency is resonant with the static Zeeman field (multi-photon resonances), we obtain an unexpected new dynamical localization (DL) phenomenon where the single SO-coupled atom is restricted to making perfect two-site Rabi oscillation accompanied by spin flipping. By using the unconventional DL phenomenon, we are able to generate a ratchetlike effect which enables directed atomic motion towards different directions and accompanies periodic spin-flipping under the action of SO coupling. For the far-off-resonance case, we show that by suppressing the usual inter-site tunneling alone, it is possible to realize a type of spin-conserving second-order tunneling between next-nearest-neighboring sites, which is not accessible in the conventional lattice system without SO coupling. We also show that simultaneous control of the usual inter-site tunneling and the SO-coupling-related second-order tunneling is necessary for quasienergy flatness (collapse) and completely frozen dynamics to exist. These results may be relevant to potential applications such as spin-based quantum information processing and design of novel spintronics devices.

Translation date: 2023-04-25 11:56:59 Publication date: 2021-02-02

# Multigaps superconductivity at unconventional Lifshitz transition in a 3D Rashba heterostructure at atomic limit ( http://arxiv.org/abs/2011.02311v2 )

License: see the linked page

Vittoria Mazziotti, Antonio Valletta, Roberto Raimondi, Antonio Bianconi

It is well known that the critical temperature of multi-gap superconducting 3D heterostructures at the atomic limit (HAL), made of a superlattice of atomic layers with an electron spectrum consisting of several quantum subbands, can be amplified by a shape resonance driven by the contact exchange interaction between different gaps. The $T_C$ amplification is achieved by tuning the Fermi level near the singular nodal point at a Lifshitz transition for opening a neck. Recently, considerable interest has been directed to the breaking of inversion symmetry, which leads to a linear-in-momentum spin-orbit-induced spin splitting, universally referred to as Rashba spin-orbit coupling (RSOC), also in 3D layered metals. However, the physics of multi-gap superconductivity near unconventional Lifshitz transitions in 3D HAL with RSOC, being in a non-BCS regime, is not known. The key result of this work, which obtains the superconducting gaps from Bogoliubov theory and the 3D electron wave functions from the solution of the Dirac equation, is the feasibility of tuning multi-gap superconductivity by suitably matching the spin-orbit length with the 3D superlattice period. It is found that the presence of the RSOC amplifies both the k-dependent anisotropic gap function and the critical temperature when the Fermi energy is tuned near the circular nodal line. Our results suggest a method to effectively vary the effect of RSOC on macroscopic superconductor condensates via the tuning of the superlattice modulation parameter, in a way potentially relevant for spintronics functionalities in several existing experimental platforms and for tunable materials needed in quantum devices for quantum computing.

Translation date: 2023-04-25 11:29:25 Publication date: 2021-02-02

# Perturbation Theory in the Complex Plane: Exceptional Points and Where to Find Them ( http://arxiv.org/abs/2012.03688v2 )

License: see the linked page

Antoine Marie, Hugh G. A. Burton, Pierre-François Loos

We explore the non-Hermitian extension of quantum chemistry in the complex plane and its link with perturbation theory. We observe that the physics of a quantum system is intimately connected to the position of complex-valued energy singularities, known as exceptional points. After presenting the fundamental concepts of non-Hermitian quantum chemistry in the complex plane, including the mean-field Hartree-Fock approximation and Rayleigh-Schrödinger perturbation theory, we provide a historical overview of the various research activities that have been performed on the physics of singularities. In particular, we highlight seminal work on the convergence behaviour of perturbative series obtained within Møller-Plesset perturbation theory, and its links with quantum phase transitions. We also discuss several resummation techniques (such as Padé and quadratic approximants) that can improve the overall accuracy of the Møller-Plesset perturbative series in both convergent and divergent cases. Each of these points is illustrated using the Hubbard dimer at half filling, which proves to be a versatile model for understanding the subtlety of analytically continued perturbation theory in the complex plane.
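The link between complex-plane singularities and resummation can be illustrated with a minimal Python sketch. It is not the paper's calculation: it assumes the textbook closed form for the singlet ground-state energy of the half-filled Hubbard dimer, $E(U) = (U - \sqrt{U^2 + 16t^2})/2$ with $t = 1$, and expands in the interaction strength $U$ rather than in a Møller-Plesset coupling parameter. The square-root branch points at $U = \pm 4it$ limit the Taylor series to $|U| < 4t$, while a diagonal Padé approximant built from the same coefficients stays closer to the exact value outside that radius. All function names are illustrative.

```python
from fractions import Fraction
from math import sqrt


def dimer_taylor_coeffs(order):
    """Taylor coefficients in U of E(U) = U/2 - 2*sqrt(1 + U**2/16), with t = 1."""
    coeffs = [Fraction(0)] * (order + 1)
    if order >= 1:
        coeffs[1] = Fraction(1, 2)
    binom = Fraction(1)  # generalized binomial coefficient C(1/2, k), starting at k = 0
    k = 0
    while 2 * k <= order:
        coeffs[2 * k] += -2 * binom / Fraction(16) ** k
        binom *= (Fraction(1, 2) - k) / (k + 1)  # C(1/2, k+1) from C(1/2, k)
        k += 1
    return coeffs


def solve(matrix, rhs):
    """Solve a small dense linear system exactly by Gauss-Jordan elimination.

    Raises StopIteration if the matrix is singular (no pivot found).
    """
    n = len(rhs)
    rows = [list(row) + [b] for row, b in zip(matrix, rhs)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if rows[r][col] != 0)
        rows[col], rows[pivot] = rows[pivot], rows[col]
        pivot_value = rows[col][col]
        rows[col] = [x / pivot_value for x in rows[col]]
        for r in range(n):
            if r != col and rows[r][col] != 0:
                factor = rows[r][col]
                rows[r] = [x - factor * y for x, y in zip(rows[r], rows[col])]
    return [row[n] for row in rows]


def pade(coeffs, m):
    """Return numerator a and denominator b (b[0] = 1) of the [m/m] Pade approximant."""
    matrix = [[coeffs[m + i - j] for j in range(1, m + 1)] for i in range(1, m + 1)]
    rhs = [-coeffs[m + i] for i in range(1, m + 1)]
    b = [Fraction(1)] + solve(matrix, rhs)
    a = [sum(b[j] * coeffs[k - j] for j in range(k + 1)) for k in range(m + 1)]
    return a, b


def evaluate(poly, x):
    """Evaluate a polynomial given by its coefficient list at the float x."""
    return sum(float(coef) * x**k for k, coef in enumerate(poly))


def exact_energy(u, t=1.0):
    """Closed-form singlet ground-state energy of the half-filled Hubbard dimer."""
    return 0.5 * (u - sqrt(u * u + 16.0 * t * t))


U = 6.0  # outside the convergence radius |U| < 4t
coeffs = dimer_taylor_coeffs(4)
a, b = pade(coeffs, 2)
taylor_value = evaluate(coeffs, U)
pade_value = evaluate(a, U) / evaluate(b, U)
print("exact :", exact_energy(U))
print("Taylor:", taylor_value)
print("Pade  :", pade_value)
```

At $U = 6t$ the fourth-order Taylor sum misses the exact energy by roughly 0.6, whereas the [2/2] Padé approximant built from the same five coefficients is off by roughly 0.1.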

Translation date: 2023-04-21 21:04:13 Publication date: 2021-02-02

# Generation of sub-MHz and spectrally-bright biphotons from hot atomic vapors with a phase mismatch-free scheme ( http://arxiv.org/abs/2012.04893v3 )

License: see the linked page

Chia-Yu Hsu, Yu-Sheng Wang, Jia-Mou Chen, Fu-Chen Huang, Yi-Ting Ke, Emily Kay Huang, Weilun Hung, Kai-Lin Chao, Shih-Si Hsiao, Yi-Hsin Chen, Chih-Sung Chuu, Ying-Cheng Chen, Yong-Fan Chen, and Ite A. Yu

We utilized the all-copropagating scheme, which maintains the phase-match condition, in the spontaneous four-wave mixing (SFWM) process to generate biphotons from a hot atomic vapor. The scheme enables our biphotons not only to surpass those in previous works on hot-atom SFWM, but also to compete with the biphotons that are generated by either cold-atom SFWM or cavity-assisted spontaneous parametric down conversion. The biphoton linewidth in this work is tunable over an order of magnitude. When we tuned the linewidth to 610 kHz, the maximum two-photon correlation function, $g_{s,as}^{(2)}$, of the biphotons was 42. This $g_{s,as}^{(2)}$ violates the Cauchy-Schwarz inequality for classical light by a factor of 440, and demonstrates that the biphotons have a high purity. The generation rate per linewidth of the 610-kHz biphoton source is 1,500 pairs/(s$\cdot$MHz), which is the best result of all the sub-MHz biphoton sources in the literature. By increasing the pump power 16-fold, we further enhanced the generation rate per linewidth to 2.3$\times$10$^4$ pairs/(s$\cdot$MHz), while the maximum $g_{s,as}^{(2)}$ became 6.7. In addition, we are able to tune the linewidth down to 290$\pm$20 kHz. This is the narrowest linewidth to date among all the various kinds of single-mode biphotons.
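The quoted violation factor can be checked with a short Python sketch. It uses the common normalized form of the Cauchy-Schwarz inequality for classical light, $[g_{s,as}^{(2)}]^2 / (g_{s,s}^{(2)} g_{as,as}^{(2)}) \le 1$, and assumes that both auto-correlations equal 2 (thermal statistics); the abstract does not state the auto-correlation values, so that number is an assumption. The last line also infers an absolute pair rate from the rate per linewidth, which is likewise an inference and not a figure given in the abstract.

```python
def cauchy_schwarz_ratio(g_cross, g_auto_s=2.0, g_auto_as=2.0):
    """Ratio g_cross^2 / (g_auto_s * g_auto_as); classical light keeps it <= 1.

    The default auto-correlation values of 2 are an assumption (thermal statistics).
    """
    return g_cross**2 / (g_auto_s * g_auto_as)


# Cross-correlation values quoted in the abstract.
ratio_low_power = cauchy_schwarz_ratio(42.0)   # 610 kHz linewidth
ratio_high_power = cauchy_schwarz_ratio(6.7)   # pump power increased 16-fold

# Inferred pair rate: (rate per linewidth) x (linewidth in MHz).
pair_rate_610khz = 1500.0 * 0.610  # pairs per second

print(ratio_low_power, ratio_high_power, pair_rate_610khz)
```

Under the stated assumption the first ratio is 441, consistent with the factor of about 440 in the abstract, and the ratio stays above 1 at the higher pump power.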

Translation date: 2023-04-21 08:16:53 Publication date: 2021-02-02

# Strong polygamy of multi-party $q$-expected quantum correlations ( http://arxiv.org/abs/2101.05416v2 )

License: see the linked page

Jeong San Kim

We show that the polygamous nature of multi-party quantum correlations can be characterized in a stronger form based on Tsallis $q$-entropy and $q$-expectation value. By considering the amount of entanglement that can be distributed in multi-party systems, we establish a class of strong polygamy inequalities of multi-party entanglement in terms of Tsallis $q$-entropy and $q$-expectation for $q \geq 1$. Our new class of inequalities is in fact tighter than the usual polygamy inequalities of multi-party entanglement, and the tightness is explicitly illustrated by an example. Moreover, our new class of inequalities is concerned with the $q$-expected entanglement distributed between a single party and any possible subset of the remaining parties, whereas the usual polygamy inequality only considers the entanglement between a single party and another. We further establish the equivalence between strong polygamy inequalities of quantum entanglement and quantum discord distributed in multi-party quantum systems.
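For readers unfamiliar with the Tsallis $q$-entropy, the following Python sketch evaluates the standard definition $S_q(\rho) = (1 - \mathrm{Tr}\,\rho^q)/(q - 1)$ from the eigenvalues of a density matrix, with the von Neumann entropy (natural logarithm) as the $q \to 1$ limit. It covers only the entropy itself; the $q$-expectation value and the polygamy inequalities of the paper are not implemented, and the function name is illustrative.

```python
from math import log


def tsallis_entropy(eigenvalues, q):
    """Tsallis q-entropy of a state given the eigenvalues of its density matrix."""
    if abs(q - 1.0) < 1e-12:
        # q -> 1 limit: von Neumann entropy in nats.
        return -sum(p * log(p) for p in eigenvalues if p > 0)
    return (1.0 - sum(p**q for p in eigenvalues if p > 0)) / (q - 1.0)


maximally_mixed_qubit = [0.5, 0.5]
pure_state = [1.0, 0.0]

print(tsallis_entropy(maximally_mixed_qubit, 2.0))
print(tsallis_entropy(maximally_mixed_qubit, 1.0))
print(tsallis_entropy(pure_state, 2.0))
```

For $q = 2$ the entropy reduces to one minus the purity, so the maximally mixed qubit gives 0.5 and any pure state gives 0.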

Translation date: 2023-04-15 05:29:10 Publication date: 2021-02-02

# JTrack: A Digital Biomarker Platform for Remote Monitoring in Neurological and Psychiatric Diseases ( http://arxiv.org/abs/2101.10091v3 )

License: see the linked page

Mehran Sahandi Far, Michael Stolz, Jona M. Fischer, Simon B. Eickhoff, Juergen Dukart

Objective: Health-related data being collected by smartphones offer a promising complementary approach to in-clinic assessments. Here we introduce the JTrack platform as a secure, reliable and extendable open-source solution for remote monitoring in daily life and for digital phenotyping. Method: JTrack consists of an Android-based smartphone application and a web-based project management dashboard. A wide range of anonymized measurements from motion sensors, social and physical activities and geolocation information can be collected in either active or passive modes. The dashboard also provides management tools to monitor and manage data collection across studies. To facilitate scaling, reproducibility, data management and sharing, we integrated DataLad as a data management infrastructure. JTrack was developed to comply with security, privacy and General Data Protection Regulation (GDPR) requirements. Results: JTrack is an open-source platform (released under the Apache 2.0 license) for remote assessment of digital biomarkers (DB) in neurological, psychiatric and other indications. The main components of the JTrack platform and examples of data being collected using JTrack are presented here. Conclusion: Smartphone-based digital biomarker data may provide valuable insight into daily-life behaviour in health and disease. JTrack provides an easy and reliable open-source solution for the collection of such data.

Translation date: 2023-04-14 21:02:46 Publication date: 2021-02-02

# When the Umpire is also a Player: Bias in Private Label Product Recommendations on E-commerce Marketplaces ( http://arxiv.org/abs/2102.00141v2 )

License: see the linked page

Abhisek Dash, Abhijnan Chakraborty, Saptarshi Ghosh, Animesh Mukherjee, Krishna P. Gummadi

Algorithmic recommendations mediate interactions between millions of customers and products (in turn, their producers and sellers) on large e-commerce marketplaces like Amazon. In recent years, producers and sellers have raised concerns about the fairness of black-box recommendation algorithms deployed on these marketplaces. Many complaints are centered around marketplaces biasing the algorithms to preferentially favor their own 'private label' products over competitors. These concerns are exacerbated as marketplaces increasingly de-emphasize or replace 'organic' recommendations with ad-driven 'sponsored' recommendations, which include their own private labels. While these concerns have been covered in the popular press and have spawned regulatory investigations, to our knowledge, there has not been any public audit of these marketplace algorithms. In this study, we bridge this gap by performing an end-to-end systematic audit of related item recommendations on Amazon. We propose a network-centric framework to quantify and compare the biases across organic and sponsored related item recommendations. Along a number of our proposed bias measures, we find that the sponsored recommendations are significantly more biased toward Amazon private label products compared to organic recommendations. While our findings are primarily interesting to producers and sellers on Amazon, our proposed bias measures are generally useful for measuring link formation bias in any social or content network.

Translation date: 2023-04-13 06:57:41 Publication date: 2021-02-02

# Investigation of the Stark Effect on a Centrosymmetric Quantum Emitter in Diamond ( http://arxiv.org/abs/2102.01322v1 )

License: see the linked page

Lorenzo De Santis, Matthew Trusheim, Kevin Chen, Dirk Englund

Quantum emitters in diamond are leading optically accessible solid-state qubits. Among these, Group IV-vacancy defect centers have attracted great interest as coherent and stable optical interfaces to long-lived spin states. Theory indicates that their inversion symmetry provides first-order insensitivity to stray electric fields, a common limitation for optical coherence in any host material. Here we experimentally quantify this electric field dependence via an external electric field applied to individual tin-vacancy (SnV) centers in diamond. These measurements reveal that the permanent electric dipole moment and polarizability are at least four orders of magnitude smaller than for the diamond nitrogen-vacancy (NV) centers, representing the first direct measurement of the inversion symmetry protection of a Group IV defect in diamond. Moreover, we show that by modulating the electric-field-induced dipole we can use the SnV as a nanoscale probe of local electric field noise, and we employ this technique to highlight the effect of spectral diffusion on the SnV.

Translation date: 2023-04-13 00:51:58 Publication date: 2021-02-02

# The Limits of Global Inclusion in AI Development ( http://arxiv.org/abs/2102.01265v1 )

License: see the linked page

Alan Chan and Chinasa T. Okolo and Zachary Terner and Angelina Wang

Those best positioned to profit from the proliferation of artificial intelligence (AI) systems are those with the most economic power. Extant global inequality has motivated Western institutions to involve more diverse groups in the development and application of AI systems, including hiring foreign labour and establishing extra-national data centers and laboratories. However, given both the propensity of wealth to abet its own accumulation and the lack of contextual knowledge in top-down AI solutions, we argue that more focus should be placed on the redistribution of power, rather than just on including underrepresented groups. Unless more is done to ensure that opportunities to lead AI development are distributed justly, the future may hold only AI systems that are unsuited to their conditions of application and that exacerbate inequality.

Translation date: 2023-04-13 00:51:26 Publication date: 2021-02-02

# Search for axion-like dark matter with spin-based amplifiers ( http://arxiv.org/abs/2102.01448v1 )

License: see the linked page

Min Jiang, Haowen Su, Antoine Garcon, Xinhua Peng, Dmitry Budker

Ultralight axion-like particles (ALPs) are well-motivated dark matter candidates introduced by theories beyond the standard model. However, the constraints on the existence of ALPs through existing laboratory experiments are hindered by their current sensitivities, which are usually weaker than astrophysical limits. Here, we demonstrate a new quantum sensor to search for ALPs in the mass range that spans about two decades from 8.3 feV to 744 feV. Our sensor makes use of hyperpolarized long-lived nuclear spins as a pre-amplifier that effectively enhances the coherently oscillating axion-like dark-matter field by a factor of >100. Using spin-based amplifiers, we achieve an ultrahigh magnetic sensitivity of 18 fT/Hz$^{1/2}$, which is significantly better than state-of-the-art nuclear-spin magnetometers. Our experiment constrains the parameter space describing the coupling of ALPs to nucleons over our mass range, at 67.5 feV reaching $2.9\times 10^{-9}~\textrm{GeV}^{-1}$ ($95\%$ confidence level), improving over previous laboratory limits by at least five orders of magnitude. Our measurements also constrain the ALP-nucleon quadratic interaction and dark photon-nucleon interaction with new limits beyond the astrophysical ones.
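To relate the quoted mass range to a measurable signal, the following Python sketch converts an ALP mass to the Compton frequency $\nu = mc^2/h$ at which the dark-matter field would oscillate. This is the standard conversion and not a detail taken from the abstract; the constant and function names are illustrative.

```python
from math import log10

PLANCK_EV_S = 4.135667696e-15  # Planck constant in eV*s (from the exact SI definition)


def compton_frequency_hz(mass_ev):
    """Oscillation frequency nu = m c^2 / h for a rest energy given in eV."""
    return mass_ev / PLANCK_EV_S


FEV = 1e-15  # one femto-electronvolt in eV

f_low = compton_frequency_hz(8.3 * FEV)
f_high = compton_frequency_hz(744 * FEV)
decades = log10(744 / 8.3)

print(f_low, f_high, decades)
```

The mass window corresponds to frequencies from about 2 Hz to about 180 Hz, and the ratio of the two masses spans just under two decades, matching the wording of the abstract.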

Translation date: 2023-04-13 00:48:20 Publication date: 2021-02-02

# Quantum Holism ( http://arxiv.org/abs/2102.01438v1 )

License: see the linked page

Giacomo Mauro D'Ariano

A composite quantum system has properties that are incompatible with every property of its parts. The existence of such global properties incompatible with all local properties constitutes what I call "mereological holism", the distinctive holism of Quantum Theory. Mereological holism has the dramatic conceptual consequence of making untenable the usual understanding of the "quantum system" as being a "physical object", since composed objects have properties compatible with those of their parts. The notion of "property" can be extended in a unique way to the whole class of operational probabilistic theories (OPTs for short), of which the most relevant cases are Quantum Theory and Classical Theory. Whereas Classical Theory is not mereologically holistic, we can now search for other OPTs that are so. Within the OPT framework the role of the "system" is that of an input-output connection between two objective events. In non-holistic theories, such as Classical Theory, the system can still be regarded as an "object". On the contrary, in holistic theories interpreting "system" as "object" constitutes a hypostatization of a theoretical notion.

Translation date: 2023-04-13 00:47:53 Publication date: 2021-02-02

# Effective screening of medium-assisted Van der Waals interactions between embedded particles ( http://arxiv.org/abs/2102.01430v1 )

License: see the linked page

Johannes Fiedler, Michael Walter, Stefan Yoshi Buhmann

The effect of an implicit medium on dispersive interactions of particle pairs is discussed and simple expressions for the correction relative to vacuum are derived. We show that a single-point Gauss quadrature leads to the intuitive result that the vacuum van der Waals $C_6$ coefficient is screened by the squared permittivity of the environment, evaluated near the resonance frequencies of the interacting particles. This approximation should be particularly relevant if the medium is transparent at these frequencies. In the manuscript, we provide simple models and sets of parameters for commonly used solvents, atoms and small molecules.
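The screening rule stated in the abstract can be written as $C_6^{\mathrm{medium}} \approx C_6^{\mathrm{vacuum}} / \varepsilon(\omega_{\mathrm{res}})^2$. The Python sketch below applies it to a made-up example; the permittivity value (the square of a refractive index of 1.33, roughly that of water at optical frequencies) and the function name are assumptions for illustration, not parameters from the paper.

```python
def screened_c6(c6_vacuum, permittivity_at_resonance):
    """Vacuum C6 coefficient divided by the squared permittivity of the medium."""
    return c6_vacuum / permittivity_at_resonance**2


# Illustrative input: relative permittivity n^2 with n = 1.33 (assumed value).
permittivity = 1.33**2
screening_factor = screened_c6(1.0, permittivity)

print(screening_factor)
```

With this assumed permittivity the dispersion coefficient drops to about a third of its vacuum value, which shows how strongly even a transparent medium can weaken the interaction under this approximation.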

Translation date: 2023-04-13 00:47:36 Publication date: 2021-02-02

# Detecting spins with a microwave photon counter ( http://arxiv.org/abs/2102.01415v1 )

License: see the linked page

Emanuele Albertinale, Léo Balembois, Eric Billaud, Vishal Ranjan, Daniel Flanigan, Thomas Schenkel, Daniel Estève, Denis Vion, Patrice Bertet, Emmanuel Flurin

Quantum emitters respond to resonant illumination by radiating electromagnetic fields. A component of these fields is phase-coherent with the driving tone, while another one is incoherent, consisting of spontaneously emitted photons and forming the fluorescence signal. Atoms and molecules are routinely detected by their fluorescence at optical frequencies, with important applications in quantum technology and microscopy. Spins, on the other hand, are usually detected by their coherent response at radio or microwave frequencies, either in continuous-wave or pulsed magnetic resonance. Indeed, fluorescence detection of spins is hampered by their low spontaneous emission rate and by the lack of single-photon detectors in this frequency range. Here, using superconducting quantum devices, we demonstrate the detection of a small ensemble of donor spins in silicon by their fluorescence at microwave frequency and millikelvin temperatures. We enhance the spin radiative decay rate by coupling them to a high-quality-factor and small-mode-volume superconducting resonator, and we connect the device output to a newly developed microwave single-photon counter based on a superconducting qubit. We discuss the potential of fluorescence detection as a novel method for magnetic resonance spectroscopy of small numbers of spins.

Translation date: 2023-04-13 00:47:28 Publication date: 2021-02-02

# An efficient, concatenated, bosonic code for additive Gaussian noise ( http://arxiv.org/abs/2102.01374v1 )

License: see the linked page

Kosuke Fukui and Nicolas C. Menicucci

Bosonic codes offer noise resilience for quantum information processing. A common type of noise in this setting is additive Gaussian noise, and a long-standing open problem is to design a concatenated code that achieves the hashing bound for this noise channel. Here we achieve this goal using a Gottesman-Kitaev-Preskill (GKP) code to detect and discard error-prone qubits, concatenated with a quantum parity code to handle the residual errors. Our method employs a linear-time decoder and has applications in a wide range of quantum computation and communication scenarios.

Translation date: 2023-04-13 00:46:20 Publication date: 2021-02-02

# Efficient Interaction of Heralded X-ray Photons with a Beam Splitter ( http://arxiv.org/abs/2102.01370v1 )

License: see the linked page

E. Strizhevsky, D. Borodin, A. Schori, S. Francoual, R. Röhlsberger, and S. Shwartz

We report the experimental demonstration of efficient interaction of multi-kiloelectronvolt heralded x-ray photons with a beam splitter. The measured heralded photon rate at the outputs of the beam splitter is about 0.01 counts/s, which is comparable to the rate in the absence of the beam splitter. We use this beam splitter together with photon-number- and photon-energy-resolving detectors to show directly that single x-ray photons cannot split. Our experiment demonstrates the major advantage of x rays for quantum optics: the possibility to observe experimental results with high fidelity and with negligible background.

Translation date: 2023-04-13 00:46:10 Publication date: 2021-02-02

# Experimental characterisation of a non-Markovian quantum process ( http://arxiv.org/abs/2102.01327v1 )

License: see the linked page

K. Goswami, C. Giarmatzi, C. Monterola, S. Shrapnel, J. Romero, and F. Costa

Every quantum system is coupled to an environment. Such system-environment interaction leads to temporal correlations between quantum operations at different times, resulting in non-Markovian noise. In principle, a full characterisation of non-Markovian noise requires tomography of a multi-time process matrix, which is both computationally and experimentally demanding. In this paper, we propose a more efficient solution. We employ machine learning models to estimate the amount of non-Markovianity, as quantified by an information-theoretic measure, with tomographically incomplete measurements. We test our model on a quantum optical experiment, and we are able to predict the non-Markovianity measure with $90\%$ accuracy. Our experiment paves the way for efficient detection of non-Markovian noise appearing in large-scale quantum computers.

Translation date: 2023-04-13 00:45:59 Publication date: 2021-02-02

# Fermion and meson mass generation in non-Hermitian Nambu-Jona-Lasinio models ( http://arxiv.org/abs/2102.01491v1 )

License: see the linked page

Alexander Felski and S. P. Klevansky

We investigate the effects of non-Hermiticity on interacting fermionic systems. We do this by including non-Hermitian bilinear terms in the 3+1 dimensional Nambu-Jona-Lasinio (NJL) model. Two possible bilinear modifications give rise to $\mathcal{PT}$-symmetric theories; this happens when the standard NJL model is extended either by a pseudovector background field $ig \bar\psi\gamma_5 B_\mu \gamma^\mu \psi$ or by an antisymmetric-tensor background field $g \bar\psi F_{\mu\nu}\gamma^\mu \gamma^\nu \psi$. The three remaining bilinears are anti-$\mathcal{PT}$-symmetric in nature, $ig \bar\psi B_\mu \gamma^\mu \psi, ig\bar\psi \gamma_5 \psi$ and $ig\bar\psi {1}\psi$, so that the Hamiltonian then has no overall symmetry. The pseudovector $ig \bar\psi\gamma_5 B_\mu \gamma^\mu \psi$ and the vector $ig \bar\psi B_\mu \gamma^\mu \psi$ combinations are, in addition, chirally symmetric. Thus, within this framework we are able to examine the effects that the various combinations of non-Hermiticity, $\mathcal{PT}$ symmetry, chiral symmetry and the two-body interactions of the NJL model have on the existence and dynamical generation of a real effective fermion mass (a feature which is absent in the corresponding modified massless free Dirac models) as well as on the masses of the composite particles, the pseudoscalar and scalar mesonic modes ($\pi$ and $\sigma$ mesons). Our findings demonstrate that $\mathcal{PT}$ symmetry is not necessary for real fermion mass solutions to exist; rather, the two-body interactions of the NJL model supersede the non-Hermitian bilinear effects. The effects of chiral symmetry are evident most clearly in the meson modes, the pseudoscalar of which will always be Goldstone in nature if the system is chirally symmetric. Second solutions of the mesonic equations are also discussed.

Translation date: 2023-04-13 00:39:07 Publication date: 2021-02-02

# Unification of massless field equations solutions for any spin ( http://arxiv.org/abs/2102.01485v1 )

License: see the linked page

Sergio A. Hojman and Felipe A. Asenjo

A unification of the exact solutions of the Klein-Gordon, Dirac, Maxwell, Rarita-Schwinger and Einstein equations (for the massless field cases) is presented. The method is based on writing all of the relevant dynamical fields in terms of products and derivatives of pre-potential functions, which satisfy the d'Alembert equation. The coupled equations satisfied by the pre-potentials are non-linear. Remarkably, there are particular solutions of (gradient) orthogonal pre-potentials that satisfy the usual wave equation and that may be used to construct exact non-trivial solutions to the Klein-Gordon, Dirac, Maxwell, Rarita-Schwinger and (linearized and full) Einstein equations, thus giving rise to a unification of the solutions of all massless field equations for any spin. Some solutions written in terms of orthogonal pre-potentials are presented. Relations of this method to previously developed ones, as well as to other subjects in physics, are pointed out.

Translation date: 2023-04-13 00:38:16 Publication date: 2021-02-02

# An algebraic method for solving the inverse problem of quantum scattering theory ( http://arxiv.org/abs/2102.01464v1 )

License: see the linked page

N.A. Khokhlov

(参考訳) 本稿では、マルチェンコ理論に基づいて量子散乱理論の逆問題を解く新しい代数的手法を提案する。マルチェンコ方程式の核を分離可能な形に展開するために三角波の組を用いた。分離可能な形により、マルチェンコ方程式は線形方程式系に帰着する。軌道角運動量がゼロの場合、核の展開係数は、運動量 q に依存し q の有限範囲の散乱データによって決まる関数のフーリエ級数係数の線形式として得られる。

We present a new algebraic method for solving the inverse problem of quantum scattering theory based on the Marchenko theory. We applied a triangular wave set for the Marchenko equation kernel expansion in a separable form. The separable form allows a reduction of the Marchenko equation to a system of linear equations. For the zero orbital angular momentum, a linear expression of the kernel expansion coefficients is obtained in terms of the Fourier series coefficients of a function depending on the momentum q and determined by the scattering data on the finite range of q.

翻訳日:2023-04-13 00:37:57 公開日:2021-02-02
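
The reduction to linear equations mentioned in the abstract above can be made explicit for a generic separable kernel. The following is only a sketch in assumed notation: the symbols $K$, $F$, $a_j$, $b_j$, $c_j$ and $M_{ij}$ are introduced here for illustration, sign conventions for the Marchenko equation vary between references, and the paper's specific triangular-wave expansion is not reproduced.

```latex
% Marchenko-type integral equation (assumed generic form)
K(r,r') + F(r,r') + \int_r^\infty K(r,s)\,F(s,r')\,ds = 0, \qquad r' > r .

% A separable input kernel with N terms forces a separable solution
F(s,r') = \sum_{j=1}^{N} a_j(s)\,b_j(r')
\;\Longrightarrow\;
K(r,r') = \sum_{j=1}^{N} c_j(r)\,b_j(r') ,

% and, for each fixed r, the coefficients solve an N x N linear system
\sum_{i=1}^{N} \bigl[\delta_{ij} + M_{ij}(r)\bigr]\,c_i(r) = -a_j(r),
\qquad
M_{ij}(r) = \int_r^\infty b_i(s)\,a_j(s)\,ds .
```

Substituting the separable form of $F$ into the integral equation and collecting the coefficient of each $b_j(r')$ gives $c_j(r) = -a_j(r) - \sum_i c_i(r) M_{ij}(r)$, which is the linear system in the last line.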

# ブロックチェーン技術を用いた分散型サプライチェーンアンチカウンタファイリングシステム

Decentralizing Supply Chain Anti-Counterfeiting Systems Using Blockchain Technology ( http://arxiv.org/abs/2102.01456v1 )

ライセンス: Link先を確認

Neo C.K. Yiu

(参考訳) サプライチェーン産業における興味深い研究課題は、高級品の真正性を示す物理商品の評価と評価である。しかし、複雑で国際的に拡大するサプライチェーンネットワークで生産され輸送される今日の商品の反カントリー化と記録的な実績に対処する革新的なソフトウェアソリューションがいくつか存在する。しかし、これらのサプライチェーンシステムは中央集権的な権威や仲介者に依存して中央集権的なシステムアーキテクチャで実装されており、サプライチェーンを横断する不正な参加者ノードによって、製品レコードの悪意ある変更やシステムコンポーネントへのさまざまな潜在的な攻撃の影響を受けやすいシングルポイント処理、ストレージ、障害といった問題を引き起こしている。ブロックチェーン技術は、暗号通貨トランザクションの分散化、分散、不変の台帳から、さまざまなユースケースや既存の問題に対処する分散型で信頼性の高いアプリケーションを構築するためのプログラマブルなインタラクティブ環境へと進化した。本研究では,ブロックチェーン技術を用いたサプライチェーン産業の旧来のアンチカウンタファイトシステムを分散化し,信頼性の高いデータプロヴァンス検索,検証,管理,サプライチェーン産業における製品の反カウンタファイト能力の向上を図るために,nfc対応アンチカウンタファイトシステム(dnas)を提案する。提案したdNASは、エンタープライズコンソーシアム、プログラム可能なスマートコントラクト、分散ファイルストレージシステムの概念と互換性のあるコンセンサスプロトコル上で分散化されたブロックチェーンネットワークを使用して、データ完全性に魅力的な特性を提供する、プロファイランスレコードを自動検証するセキュアで不変な科学的データプロファイランス追跡管理プラットフォームを開発する。

An interesting research problem in supply chain industry is evaluating and determining provenance of physical goods - demonstrating authenticity of luxury goods. Yet, there have been a few innovative software solutions addressing product anti-counterfeiting and record provenance of today's goods that are produced and transported in complex and internationally-spanning supply chain networks. However, these supply chain systems have been implemented with centralized system architecture, relying on centralized authorities or any form of intermediaries, and leading to issues such as single-point processing, storage and failure, which could be susceptible to malicious modifications of product records or various potential attacks to system components by dishonest participant nodes traversing along the supply chain. Blockchain technology has evolved from being merely a decentralized, distributed and immutable ledger of cryptocurrency transactions to a programmable interactive environment for building decentralized and reliable applications addressing different use cases and existing problems in the world. In this research, the Decentralized NFC-Enabled Anti-Counterfeiting System (dNAS) is proposed and developed, decentralizing a legacy anti-counterfeiting system of supply chain industry using Blockchain technology, to facilitate trustworthy data provenance retrieval, verification and management, as well as strengthening capability of product anti-counterfeiting in supply chain industry. The proposed dNAS utilizes decentralized blockchain network on a consensus protocol compatible with the concept of enterprise consortium, programmable smart contracts and a distributed file storage system to develop a secure and immutable scientific data provenance tracking and management platform on which provenance records, providing compelling properties on data integrity, are validated automatically.

翻訳日:2023-04-13 00:37:47 公開日:2021-02-02
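
The tamper-evidence property that the abstract above attributes to a blockchain ledger can be illustrated with a minimal hash chain. This is only a sketch and not the dNAS design: the record fields (`product_id`, `event`) and the helpers `record_hash`, `append_record` and `verify_chain` are hypothetical, and a real system would add consensus, signatures and smart contracts on top.

```python
import hashlib
import json


def record_hash(body: dict) -> str:
    """Hash a record body deterministically (keys are sorted before serialising)."""
    payload = json.dumps(body, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def append_record(chain: list, product_id: str, event: str) -> None:
    """Append a provenance record that commits to the previous record's hash."""
    prev_hash = chain[-1]["hash"] if chain else "0" * 64
    body = {"product_id": product_id, "event": event, "prev_hash": prev_hash}
    chain.append({**body, "hash": record_hash(body)})


def verify_chain(chain: list) -> bool:
    """Return True only if every record is intact and linked to its predecessor."""
    prev_hash = "0" * 64
    for rec in chain:
        body = {k: rec[k] for k in ("product_id", "event", "prev_hash")}
        if rec["prev_hash"] != prev_hash or rec["hash"] != record_hash(body):
            return False
        prev_hash = rec["hash"]
    return True


chain = []
append_record(chain, "bottle-001", "produced")
append_record(chain, "bottle-001", "shipped")
print(verify_chain(chain))  # True

chain[0]["event"] = "relabelled"  # simulate a malicious edit of a stored record
print(verify_chain(chain))  # False
```

Because each record stores the hash of its predecessor, changing any earlier record invalidates the check for that record, which is the basic mechanism behind the immutability claim.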

# 制御されたモジュラ乗算への計測に基づく非計算の適用

Measurement-based Uncomputation Applied to Controlled Modular Multiplication ( http://arxiv.org/abs/2102.01453v1 )

ライセンス: Link先を確認

Panjin Kim and Daewan Han

(参考訳) これは測定に基づく非計算の特定の使用に関する短い報告である。性能は魅力的ではないが、様々な量子回路の最適化技術に光を当てる可能性がある。

This is a brief report on a particular use of measurement-based uncomputation. Though not appealing in performance, it may shed light on optimization techniques in various quantum circuits.

翻訳日:2023-04-13 00:37:15 公開日:2021-02-02

# 熱前離散時間結晶の観察

Observation of a prethermal discrete time crystal ( http://arxiv.org/abs/2102.01695v1 )

ライセンス: Link先を確認

Antonis Kyprianidis, Francisco Machado, William Morong, Patrick Becker, Kate S. Collins, Dominic V. Else, Lei Feng, Paul W. Hess, Chetan Nayak, Guido Pagano, Norman Y. Yao, Christopher Monroe

(参考訳) 物質の相を定義し理解するための従来の枠組みは熱力学的平衡を必要とする。非平衡系への拡張は、多体熱化の性質や新しい物質相の発見に対する驚くべき洞察をもたらし、しばしば周期的に系の駆動によって触媒される。このようなフロッケ駆動からの固有の加熱は、系の強い障害を含むことで緩和できるが、非平衡相の一般性も隠蔽できる。本研究では,無秩序な非平衡駆動相,前熱離散時間結晶(PDTC)のシグネチャを観測するために,トラップイオン量子シミュレータを用いる。ここでは、多体加熱は障害による多体局在ではなく、高周波駆動によって抑制されるため、非平衡相が出現する可能性がある時間窓が広がる。 pdtcと多体局所障害を区別するいくつかの重要な特徴を観察し、その寿命の駆動周波数制御や初期状態のエネルギー密度に対する時間-結晶次数依存性などについて考察した。従って、フロッケ予熱は物質の本質的非平衡相を創り、安定化し、研究するための一般的な戦略として提示される。

The conventional framework for defining and understanding phases of matter requires thermodynamic equilibrium. Extensions to non-equilibrium systems have led to surprising insights into the nature of many-body thermalization and the discovery of novel phases of matter, often catalyzed by driving the system periodically. The inherent heating from such Floquet drives can be tempered by including strong disorder in the system, but this can also mask the generality of non-equilibrium phases. In this work, we utilize a trapped-ion quantum simulator to observe signatures of a non-equilibrium driven phase without disorder: the prethermal discrete time crystal (PDTC). Here, many-body heating is suppressed not by disorder-induced many-body localization, but instead via high-frequency driving, leading to an expansive time window where non-equilibrium phases can emerge. We observe a number of key features that distinguish the PDTC from its many-body-localized disordered counterpart, such as the drive-frequency control of its lifetime and the dependence of time-crystalline order on the energy density of the initial state. Floquet prethermalization is thus presented as a general strategy for creating, stabilizing and studying intrinsically out-of-equilibrium phases of matter.

翻訳日:2023-04-13 00:30:36 公開日:2021-02-02

# 超伝導量子ビットを用いた量子アルゴリズムにおける動的量子回路の活用

Exploiting dynamic quantum circuits in a quantum algorithm with superconducting qubits ( http://arxiv.org/abs/2102.01682v1 )

ライセンス: Link先を確認

Antonio D. Corcoles, Maika Takita, Ken Inoue, Scott Lekuch, Zlatko K. Minev, Jerry M. Chow, Jay M. Gambetta

(参考訳) 実システム上での量子回路の実行は、単純に単体演算の時間順序のシーケンスに制限され、続いて射影測定が行われる。量子コンピューティングのハードウェアプラットフォームはサイズと能力が成熟し続けており、従来の構成を超えて量子回路を有効にすることが不可欠である。ここでは超伝導系量子系上の動的量子回路の領域について述べる。動的量子回路は、計算全体を通しての量子状態の進化だけでなく、キュービットのサブセットの周期的な測定や、回路の実行時間よりも短い時間スケールでの古典的な情報の同時処理も含む。ノイズ量子ハードウェアを用いて、動的回路を利用する適応バージョンにおいて、最も基本的な量子アルゴリズムの1つである量子位相推定を探索し、その結果を同じアルゴリズムの非適応実装と比較する。動的回路を用いたリアルタイム量子コンピューティングは,システム内のノイズやレイテンシが十分に低く,実量子システム上で利用可能なアルゴリズムの新たな領域への扉を開くことで,実質的かつ具体的な利点をもたらすことを実証する。

The execution of quantum circuits on real systems has largely been limited to those which are simply time-ordered sequences of unitary operations followed by a projective measurement. As hardware platforms for quantum computing continue to mature in size and capability, it is imperative to enable quantum circuits beyond their conventional construction. Here we break into the realm of dynamic quantum circuits on a superconducting-based quantum system. Dynamic quantum circuits involve not only the evolution of the quantum state throughout the computation, but also periodic measurements of a subset of qubits mid-circuit and concurrent processing of the resulting classical information within timescales shorter than the execution times of the circuits. Using noisy quantum hardware, we explore one of the most fundamental quantum algorithms, quantum phase estimation, in its adaptive version, which exploits dynamic circuits, and compare the results to a non-adaptive implementation of the same algorithm. We demonstrate that the version of real-time quantum computing with dynamic circuits can offer a substantial and tangible advantage when noise and latency are sufficiently low in the system, opening the door to a new realm of available algorithms on real quantum systems.

翻訳日:2023-04-13 00:30:17 公開日:2021-02-02
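
The adaptive variant of phase estimation mentioned in the abstract above measures a single ancilla repeatedly and feeds earlier outcomes forward as a phase correction. The sketch below is a classical, noise-free simulation of the measurement probabilities only, assuming the eigenphase is an exact n-bit binary fraction. The function name `iterative_phase_estimation` is hypothetical, and nothing here reproduces the authors' circuits or hardware.

```python
import math


def iterative_phase_estimation(phase: float, n_bits: int) -> list:
    """Return [b1, ..., bn] for phase = 0.b1 b2 ... bn (binary).

    Bits are obtained from least significant (bn) to most significant (b1).
    Each round stands for: H, controlled-U^(2^(k-1)), a feed-forward phase
    correction, H, and a measurement of one ancilla qubit.
    """
    bits = [0] * n_bits  # bits[k - 1] will hold b_k
    for k in range(n_bits, 0, -1):
        # Correction from already measured bits: omega = 0.0 b_(k+1) b_(k+2) ...
        omega = sum(bits[j - 1] / 2 ** (j - k + 1) for j in range(k + 1, n_bits + 1))
        # Probability of reading 1 on the ancilla after the final Hadamard.
        p_one = math.sin(math.pi * (2 ** (k - 1) * phase - omega)) ** 2
        bits[k - 1] = round(p_one)  # p_one is 0 or 1 (up to rounding) for exact phases
    return bits


phase = 11 / 16  # 0.1011 in binary
print(iterative_phase_estimation(phase, 4))  # [1, 0, 1, 1]
```

In a real device each round needs the previous measurement result before the next correction can be applied, which is why the abstract stresses classical processing within the circuit's execution time.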

# 決定論的文脈自由言語によるアナログニューロン階層のより強い分離

Stronger Separation of Analog Neuron Hierarchy by Deterministic Context-Free Languages ( http://arxiv.org/abs/2102.01633v1 )

ライセンス: Link先を確認

Ji\v{r}\'i \v{S}\'ima

(参考訳) 離散時間リカレントニューラルネットワーク(nns)の計算能力をチョムスキー階層内の飽和線形活性化関数を用いて解析する。整数重みに制限されたこのモデルは、有限オートマトン(チョムスキーレベル3)と同値の2進状態 NN と一致し、正則言語(REG)を認識する一方、有理重みはこのモデルを3つのアナログ状態単位(チョムスキーレベル0)に対してもチューリング完全とする。中間モデル $\alpha$ANN を、有理重み付き$\alpha\geq 0$余剰アナログ状態ニューロンで拡張し、アナログニューロン階層 0ANNs $\subset$ 1ANNs $\subset$ 2ANNs $\subseteq$ 3ANNs を確立した。分離 1ANNs $\subsetneq$ 2ANNs は非正規決定論的文脈自由言語 (DCFL) $L_\#=\{0^n1^n\mid n\geq 1\}$ によって目撃され、実際の重みでも任意の 1ANN では認識できないが、DCFL (Chomsky level 2) は有理重みを持つ 2ANN では受け入れられる。本稿では,非正規DCFLが実重量の1ANNでは認識できないことを示すことにより,この分離を強化する。つまり (DCFLs $\setminus$ REG) $\subset$ (2ANNs $\setminus$ 1ANNs) であり,これは 1ANNs $\cap$ DCFLs = 0ANNs を意味する。この目的のために、$L_\#$は、このクラスの任意の言語に$L_\#$を還元することで、最も単純な非正規DCFLであることを示した。

We analyze the computational power of discrete-time recurrent neural networks (NNs) with the saturated-linear activation function within the Chomsky hierarchy. This model restricted to integer weights coincides with binary-state NNs with the Heaviside activation function, which are equivalent to finite automata (Chomsky level 3) recognizing regular languages (REG), while rational weights make this model Turing-complete even for three analog-state units (Chomsky level 0). For the intermediate model $\alpha$ANN of a binary-state NN that is extended with $\alpha\geq 0$ extra analog-state neurons with rational weights, we have established the analog neuron hierarchy 0ANNs $\subset$ 1ANNs $\subset$ 2ANNs $\subseteq$ 3ANNs. The separation 1ANNs $\subsetneqq$ 2ANNs has been witnessed by the non-regular deterministic context-free language (DCFL) $L_\#=\{0^n1^n\mid n\geq 1\}$ which cannot be recognized by any 1ANN even with real weights, while any DCFL (Chomsky level 2) is accepted by a 2ANN with rational weights. In this paper, we strengthen this separation by showing that any non-regular DCFL cannot be recognized by 1ANNs with real weights, which means (DCFLs $\setminus$ REG) $\subset$ (2ANNs $\setminus$ 1ANNs), implying 1ANNs $\cap$ DCFLs = 0ANNs. For this purpose, we have shown that $L_\#$ is the simplest non-regular DCFL by reducing $L_\#$ to any language in this class, which is by itself an interesting achievement in computability theory.

翻訳日:2023-04-13 00:29:04 公開日:2021-02-02
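
Two objects from the abstract above are easy to make concrete: the saturated-linear activation and the witness language $L_\#$. The sketch below is illustrative only; the function names `saturated_linear` and `in_l_hash` are hypothetical, and the remark about integer weights additionally assumes integer biases.

```python
def saturated_linear(x: float) -> float:
    """Saturated-linear activation: 0 below 0, identity on [0, 1], 1 above 1."""
    return min(1.0, max(0.0, x))


def in_l_hash(word: str) -> bool:
    """Membership test for L_# = { 0^n 1^n : n >= 1 }."""
    n = len(word) // 2
    return n >= 1 and word == "0" * n + "1" * n


# On integer inputs the activation only ever outputs 0 or 1, which is why a
# network with integer weights and binary initial states stays binary.
print(sorted({saturated_linear(x) for x in range(-3, 4)}))  # [0.0, 1.0]
print(in_l_hash("000111"), in_l_hash("00111"), in_l_hash(""))  # True False False
```

Recognising $L_\#$ requires counting an unbounded number of zeros, which a finite set of binary states cannot do; the abstract's result concerns how many analog-state neurons are needed to supply that capacity.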

# 順序単位空間におけるスペクトル性の幾何学的および代数的側面:比較

Geometric and algebraic aspects of spectrality in order unit spaces: a comparison ( http://arxiv.org/abs/2102.01628v1 )

ライセンス: Link先を確認

Anna Jen\v{c}ov\'a and Sylvia Pulmannov\'a

(参考訳) 順序単位空間のスペクトル理論に対する2つのアプローチは、アルフセンとシュルツのスペクトル双対性とフーラスによるスペクトル圧縮基底である。前者は基底ノルム空間と双対性のある順序単位空間の幾何学的性質を用いるが、後者の概念は純粋に代数的である。フーリスアプローチは厳密にはより一般的であり、アルフセン・シュルツアプローチを特別な場合として含むことが示されている。これは二つの種類の例で示される: jb-代数がフーリススペクトルであることと、それらがリッカートであることは同値であり、中心対称状態空間は、必ずしもアルフセン・シュルツスペクトルではないにもかかわらずフーリススペクトルである。

Two approaches to spectral theory of order unit spaces are compared: the spectral duality of Alfsen and Shultz and the spectral compression bases due to Foulis. While the former approach uses the geometric properties of an order unit space in duality with a base norm space, the latter notion is purely algebraic. It is shown that the Foulis approach is strictly more general and contains the Alfsen-Shultz approach as a special case. This is demonstrated on two types of examples: the JB-algebras which are Foulis spectral if and only if they are Rickart, and the centrally symmetric state spaces, which may be Foulis spectral while not necessarily Alfsen-Shultz spectral.

翻訳日:2023-04-13 00:28:23 公開日:2021-02-02

# ボソニックデータ隠蔽:線形光学と非線形光学の能力

Bosonic data hiding: power of linear vs non-linear optics ( http://arxiv.org/abs/2102.01622v1 )

ライセンス: Link先を確認

Krishna Kumar Sabapathy, Andreas Winter

(参考訳) ガウス状態のウィグナー関数と測定値の正則性は、古典的(フィードフォワード)通信(GOCC)により強化されたガウス測度演算として定式化された「線形光学」の識別力を束縛するエレガントな方法を提供する。これにより,コヒーレント状態のgoccノルム距離を厳格に特徴付ける竹岡と佐々木(pra 78:022320, 2008)の結果を再現し,一般化することができる。さらに、古典的および量子的シャノン理論からアイデアを呼び出すと、それぞれの状態が、原理的には指数関数的に確実に判別されるが、gocc測定の出力から指数関数的に近い多モードコヒーレント状態の確率的混合であることを示す。ローカル操作の制限されたクラスと古典的コミュニケーション(LOCC)による状態の生成と識別の不可逆性を示すLOCCデータ隠蔽(LOCC)と類似して、GOCCデータ隠蔽(GOCC data hidden)と呼ぶ。また, 正のウィグナー関数を用いた測定において, ヘルストロームを識別可能な任意の有界エネルギー状態に対して, 最小の識別性を保証し, 逆方向の一般境界も提示する。 GOCC測定にも同様の限界が存在すると推測する。

We show that the positivity of the Wigner function of Gaussian states and measurements provides an elegant way to bound the discriminating power of "linear optics", which we formalise as Gaussian measurement operations augmented by classical (feed-forward) communication (GOCC). This allows us to reproduce and generalise the result of Takeoka and Sasaki [PRA 78:022320, 2008], which tightly characterises the GOCC norm distance of coherent states, separating it from the optimal distinguishability according to Helstrom's theorem. Furthermore, invoking ideas from classical and quantum Shannon theory we show that there are states, each a probabilistic mixture of multi-mode coherent states, which are exponentially reliably discriminated in principle, but appear exponentially close judging from the output of GOCC measurements. In analogy to LOCC data hiding, which shows an irreversibility in the preparation and discrimination of states by the restricted class of local operations and classical communication (LOCC), we call the present effect GOCC data hiding. We also present general bounds in the opposite direction, guaranteeing a minimum of distinguishability under measurements with positive Wigner function, for any bounded-energy states that are Helstrom distinguishable. We conjecture that a similar bound holds for GOCC measurements.

翻訳日:2023-04-13 00:28:07 公開日:2021-02-02

# 弦を付さない位相空間ホログラフィー

Phase space holography with no strings attached ( http://arxiv.org/abs/2102.01617v1 )

ライセンス: Link先を確認

D. V. Khveshchenko

(参考訳) このノートでは、位相空間 (bulk') における一般量子系の記述と時空 (boundary) とのホログラフィー的な対応を確立するという観点から、ウィグナー関数の表現について述べる。ある状況下では、前者は局所計量的変数の古典力学に還元され、後者はいくつかのボゾン化群場流体力学の形式を取る。この一般的な擬ホログラフィー双対性は、問題のシステムの特定の対称性に依存したり、応用ホログラフィーの様々な「アドホック」シナリオのように、基礎となる「弦理論」への接続を必要としない。

This note discusses the Wigner function representation from the standpoint of establishing a holography-like correspondence between the descriptions of a generic quantum system in the phase space ('bulk') picture versus its spacetime ('boundary') counterpart. Under certain circumstances the former might reduce to the classical dynamics of a local metric-like variable while the latter takes on the form of some bosonized collective field hydrodynamics. This generic pseudo-holographic duality neither relies on any particular symmetry of the system in question, nor does it require any connection to an underlying 'string theory', as in the various 'ad hoc' scenarios of applied holography.

翻訳日:2023-04-13 00:27:42 公開日:2021-02-02

# 超ポテンシャル W(x,A,B)=Atanh(px)+Btanh(6px) を持つ形状不変ポテンシャルの可解シュロディンガー方程式

Solvable Schrodinger Equations of Shape Invariant Potentials Having Superpotential W(x,A,B)=Atanh(px)+Btanh(6px) ( http://arxiv.org/abs/2102.02775v1 )

ライセンス: Link先を確認

Jamal Benbourenane, Mohamed Benbourenane, Hichem Eleuch

(参考訳) 形状不変性の方法を用いて、新たに提案された一次元の時間非依存シュレーディンガー方程式を完全に解く。対応するポテンシャルは V_-(x,A,B) = -A(sech px)^2 - 6Bp(sech 6px)^2 + (tanh px - 6 tanh 6px)^2、超ポテンシャルは W(x,A,B) = A tanh(px) + B tanh(6px) で与えられる。形状不変性を持つ超ポテンシャルに対する超対称量子力学の手法を用いて、パートナーポテンシャル V_- を持つシュレーディンガー方程式の族の厳密解を導出し、離散スペクトルと対応する固有関数を閉じた形で厳密に決定する。シュレーディンガー方程式を閉じた形で解くことが難しいことはよく知られており、そのような例はわずかしか知られていない。厳密解を持つ新しい方程式を見つけることは、数値的手法がうまく働かない転回点近傍に隠れた物理的性質を理解するうえで重要である。この結果は、拮抗する力が顕著に現れる核物理学や化学への応用が期待される。

A newly proposed one-dimensional time-independent Schrödinger equation is solved completely using the shape invariance method. The corresponding potential is given by V_-(x,A,B) = -A(sech px)^2 - 6Bp(sech 6px)^2 + (tanh px - 6 tanh 6px)^2 with superpotential W(x,A,B) = A tanh(px) + B tanh(6px). We derive the exact solutions of the family of Schrödinger equations with the partner potential V_-, using the supersymmetric quantum mechanics technique for a superpotential with the shape invariance property; the discrete spectrum and the corresponding eigenfunctions are determined exactly and in closed form. It is well known that Schrödinger equations are challenging to solve in closed form, and only a few such solutions are known. Finding new equations with exact solutions is crucial for understanding the hidden physical properties near turning points, where numerical methods fail. This result has potential applications in nuclear physics and chemistry, where antagonistic forces have a prominent presence.

翻訳日:2023-04-13 00:21:15 公開日:2021-02-02

# 非援助完全量子チャネルに対するレートスプリッティングによる新しいワンショットインナーバウンド

Novel one-shot inner bounds for unassisted fully quantum channels via rate splitting ( http://arxiv.org/abs/2102.01766v1 )

ライセンス: Link先を確認

Sayantan Chakraborty and Aditya Nema and Pranab Sen

(参考訳) エンタングルメント支援のない2送信者量子多重アクセス通信路(QMAC)、および支援のない2送信者2受信者量子干渉通信路(QIC)を通じて量子情報を送るための、初の非自明なワンショット内界を証明する。先行研究は、支援のないQMACを、通信路を独立かつ同一に多数回使用する極限(漸近的iid極限)でのみ扱っており、支援のないQICはまったく扱っていなかった。内界を得るために、レート分割と逐次キャンセルという2つの手法を用いる。レート分割は、漸近的iid設定の古典通信路に対して、時間共有を避けつつ内界を得るために以前から用いられてきた。本研究の主な技術的貢献は、レート分割を古典的な漸近的iid設定から量子ワンショット設定へ拡張したことである。漸近的iid極限では、QMACに対するワンショット内界はYard、Devetak、Haydenのレート領域に近づく。QICについては、漸近的iid設定において新しい非自明なレート領域を得る。これらの結果はすべて、ワンショット設定と漸近的iid設定のいずれにおいても、限定的なエンタングルメント支援が与えられる場合に拡張できる。限定的なエンタングルメント支援に関する結果は、ワンショット設定ではQMACとQICの両方について新しい。QICについては、漸近的iid設定においても新しい。

We prove the first non-trivial one-shot inner bounds for sending quantum information over an entanglement-unassisted two-sender quantum multiple access channel (QMAC) and an unassisted two-sender two-receiver quantum interference channel (QIC). Previous works only studied the unassisted QMAC in the limit of many independent and identical uses of the channel, also known as the asymptotic iid limit, and did not study the unassisted QIC at all. We employ two techniques, rate splitting and successive cancellation, in order to obtain our inner bound. Rate splitting was earlier used to obtain inner bounds, avoiding time sharing, for classical channels in the asymptotic iid setting. Our main technical contribution is to extend rate splitting from the classical asymptotic iid setting to the quantum one-shot setting. In the asymptotic iid limit our one-shot inner bound for QMAC approaches the rate region of Yard, Devetak and Hayden. For the QIC we get novel non-trivial rate regions in the asymptotic iid setting. All our results also extend to the case where limited entanglement assistance is provided, in both one-shot and asymptotic iid settings. The limited entanglement results for the one-shot setting for both QMAC and QIC are new. For the QIC the limited entanglement results are new even in the asymptotic iid setting.

翻訳日:2023-04-13 00:19:59 公開日:2021-02-02

# デジタル経済活動における技術知識に基づくスキル

Skills-based on technological knowledge in the digital economy activity ( http://arxiv.org/abs/2102.01711v1 )

ライセンス: Link先を確認

Dr. Cesar R Salas-Guerra

(参考訳) 本研究は,技術知識を持つ人々が地域デジタル経済活動に与える影響と,近隣の都市への感染拡大の影響を計測することを目的とする。本研究の焦点は定量的,断面的であり,その設計は相関-causalである。この研究はブラジルのミナスジェライスの7つの小地域を対象とし、89の自治体で69%の人口と31%の農村部が組織されている。使用されるデータは、ブラジル政府の公開リポジトリで得られた4,361の観測結果からなり、パネルデータにまとめられ、部分最小二乗、微小領域空間回帰、機械学習による識別パターンを用いて分析された。回帰テストの確認分析は、CEの技術知識とデジタル経済活動の間に、R2 = .749, \b{eta} = .867, p = .000(値t = 18,298)の予測値を通じて大きな影響を与える。公立・私立大学機関(IUPP)、博士・修士課程の教授(DCNT)、情報技術職(CBO)などが有名である。技術基盤技術を求める企業の地理的集中は、小自治体の発展を鈍化させ、技術知識に基づく新しいビジネスモデルを支援する新しい政府技術イニシアチブの開発を示唆した。

This research seeks to measure the impact of people with technological knowledge on regional digital economic activity and the implications of prosperous cities' contagion effect on neighbouring ones. The study is quantitative and cross-sectional, with a correlational-causal design. It covers seven micro-regions of Minas Gerais in Brazil, organized in 89 municipalities, with a 69% urban and 31% rural population. The data consisted of 4,361 observations obtained from the Brazilian government's public repositories, organized into panel data, and analysed using partial least squares, micro-regional spatial regression, and pattern identification with machine learning. The confirmatory regression analysis establishes a significant impact of the technological knowledge CE on the digital economic activity AED, with $R^2 = .749$, $\beta = .867$, $p = .000$ ($t = 18.298$). The most prominent variables are public and private university institutions (IUPP), professors with doctorates and master's degrees (DCNT), and information technology occupations (CBO). A geographic concentration of companies that demand technology-based skills slowed the development of small municipalities, which suggests the need for new government technology initiatives that support new business models based on technological knowledge.

翻訳日:2023-04-13 00:18:59 公開日:2021-02-02

# 乱れた双極子量子系における局所励起の寿命

Lifetimes of local excitations in disordered dipolar quantum systems ( http://arxiv.org/abs/2102.01705v1 )

ライセンス: Link先を確認

Rahul Nandkishore and Sarang Gopalakrishnan

(参考訳) 相互作用する量子双極子の強い乱れた系が局所的に励起されると、励起はいくつかの(潜在的に非常に長い)時間スケールで緩和する。この緩和過程は、粒子間双極子が創発的に励起される電子ガラスと、顕微鏡的双極子からなるシステム(量子磁石や超低温双極子分子など)の両方で解析される。我々は、エネルギー緩和率(t_1$ times)と緩和率(t_2$ times)の両方と、周波数、温度、偏光に依存することを考慮する。 2次元および3次元の系は、準2次元幾何学における次元交叉とともに考慮される。豊富なスケーリング法則が発見されている。

When a strongly disordered system of interacting quantum dipoles is locally excited, the excitation relaxes on some (potentially very long) timescale. We analyze this relaxation process, both for electron glasses with strong Coulomb interactions - in which particle-hole dipoles are emergent excitations - and for systems (e.g., quantum magnets or ultracold dipolar molecules) made up of microscopic dipoles. We consider both energy relaxation rates ($T_1$ times) and dephasing rates ($T_2$ times), and their dependence on frequency, temperature, and polarization. Systems in both two and three dimensions are considered, along with the dimensional crossover in quasi-two dimensional geometries. A rich set of scaling laws is found.

翻訳日:2023-04-13 00:18:34 公開日:2021-02-02

# CTとMRIの3次元超解像のための中間損失を有する畳み込みニューラルネットワーク

Convolutional Neural Networks with Intermediate Loss for 3D Super-Resolution of CT and MRI Scans ( http://arxiv.org/abs/2001.01330v3 )

ライセンス: Link先を確認

Mariana-Iuliana Georgescu, Radu Tudor Ionescu, Nicolae Verga

(参考訳) 現在病院で一般的に使われているCTスキャナーは、最大512ピクセルの低解像度画像を生成する。画像の1ピクセルは1ミリメートルの組織に対応する。腫瘍を正確にセグメンテーションして治療計画を立てるには、より高解像度のCTスキャンが必要である。同じ問題はMRIにも現れる。本稿では、3次元のCTまたはMRIスキャンの単一画像超解像手法を提案する。提案手法は、10層の畳み込み層と、最初の6層の畳み込み層の後に置かれる中間アップスケーリング層からなる深層畳み込みニューラルネットワーク(CNN)に基づく。第1のCNNは2つの軸(幅と高さ)の解像度を高め、続く第2のCNNは第3の軸(深さ)の解像度を高める。他の手法と異なり、最後の畳み込み層の後の損失に加えて、アップスケーリング層の直後でも正解の高解像度出力に対する損失を計算する。この中間損失により、ネットワークは正解により近い、より良い出力を生成する。鮮明な結果を得るために広く使われている手法は、固定の標準偏差でガウスぼかしを加えることである。固定の標準偏差への過適合を避けるため、他の手法と異なり、さまざまな標準偏差のガウス平滑化を適用する。2つのデータベースのCTおよびMRIスキャンの2次元・3次元超解像で提案手法を評価し、2倍と4倍の拡大率を用いて、文献の関連研究およびさまざまな補間方式に基づくベースラインと比較した。実験結果は、提案手法が他のすべての手法より優れた結果を達成することを示している。さらに、人手による評価では、医師と一般の評価者の双方が、2倍拡大では97.55%、4倍拡大では96.69%のケースでLanczos補間よりも提案手法を選んだ。

CT scanners that are commonly used in hospitals nowadays produce low-resolution images, up to 512 pixels in size. One pixel in the image corresponds to a one-millimeter piece of tissue. In order to accurately segment tumors and make treatment plans, doctors need CT scans of higher resolution. The same problem appears in MRI. In this paper, we propose an approach for the single-image super-resolution of 3D CT or MRI scans. Our method is based on deep convolutional neural networks (CNNs) composed of 10 convolutional layers and an intermediate upscaling layer that is placed after the first 6 convolutional layers. Our first CNN, which increases the resolution on two axes (width and height), is followed by a second CNN, which increases the resolution on the third axis (depth). Different from other methods, we compute the loss with respect to the ground-truth high-resolution output right after the upscaling layer, in addition to computing the loss after the last convolutional layer. The intermediate loss forces our network to produce a better output, closer to the ground truth. A widely used approach to obtain sharp results is to add Gaussian blur using a fixed standard deviation. In order to avoid overfitting to a fixed standard deviation, we apply Gaussian smoothing with various standard deviations, unlike other approaches. We evaluate our method in the context of 2D and 3D super-resolution of CT and MRI scans from two databases, comparing it to related works from the literature and to baselines based on various interpolation schemes, using 2x and 4x scaling factors. The empirical results show that our approach attains superior results to all other methods. Moreover, our human annotation study reveals that both doctors and regular annotators chose our method over Lanczos interpolation in 97.55% of cases for the 2x upscaling factor and in 96.69% of cases for the 4x upscaling factor.

翻訳日:2023-01-14 07:52:14 公開日:2021-02-02
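
The intermediate-loss idea in the abstract above amounts to supervising two points of the network against the same high-resolution target. The sketch below shows only that bookkeeping on flat lists of pixel values. Mean absolute error and equal weighting of the two terms are assumptions made for illustration (the abstract does not state the loss function), and `mean_abs_error` and `total_loss` are hypothetical names.

```python
def mean_abs_error(pred: list, target: list) -> float:
    """Mean absolute error between two equally sized flat lists of pixel values."""
    return sum(abs(p - t) for p, t in zip(pred, target)) / len(target)


def total_loss(upscaled: list, final: list, target: list) -> float:
    """Loss right after the upscaling layer plus loss after the last layer."""
    return mean_abs_error(upscaled, target) + mean_abs_error(final, target)


target = [0.0, 0.5, 1.0, 0.5]      # ground-truth high-resolution pixels
upscaled = [0.25, 0.5, 0.75, 0.5]  # hypothetical output of the upscaling layer
final = [0.0, 0.5, 1.0, 0.25]      # hypothetical output of the last layer
print(total_loss(upscaled, final, target))  # 0.1875
```

Adding the first term means the layers before the upscaling step receive a direct training signal instead of relying only on gradients that pass through the remaining convolutional layers.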

# モバイル学習環境における深い注意学習セッションドロップアウト予測

Deep Attentive Study Session Dropout Prediction in Mobile Learning Environment ( http://arxiv.org/abs/2002.11624v5 )

ライセンス: Link先を確認

Youngnam Lee, Dongmin Shin, HyunBin Loh, Jaemin Lee, Piljae Chae, Junghyun Cho, Seoyon Park, Jinhwan Lee, Jineon Baek, Byungsoo Kim, Youngduck Choi

(参考訳) 学生のドロップアウト予測は、学生のエンゲージメントを改善する機会を提供し、学習体験の全体的な効果を最大化する。しかし、学生の退学に関する調査は、主に学校ドロップアウトやコースドロップアウトで行われており、モバイル学習環境における学習セッションの退学は十分に考慮されていない。本稿では,モバイル学習環境における学習セッションのドロップアウト予測問題について検討する。まず,モバイル学習環境における学習セッション,学習セッションドロップアウト,学習セッションドロップアウト予測タスクの概念を定義した。この定義に基づき,モバイル学習環境における学習セッションドロップアウト予測のための新しいトランスフォーマモデルdas: deep attentive studyセッションドロップアウト予測を提案する。 DASにはエンコーダ・デコーダ構造があり、マルチヘッドアテンションとポイントワイドフィードフォワードネットワークで構成されている。 DASの深い注意計算は、動的学生相互作用の間の複雑な関係を捉えることができる。私たちの知る限りでは、これはモバイル学習環境における学習セッションのドロップアウトを調査する最初の試みです。大規模データセットの実証評価から,DASはベースラインモデルと比較して,受信機動作特性曲線の下での領域の大幅な改善により,最高の性能を達成することが示された。

Student dropout prediction provides an opportunity to improve student engagement, which maximizes the overall effectiveness of learning experiences. However, researches on student dropout were mainly conducted on school dropout or course dropout, and study session dropout in a mobile learning environment has not been considered thoroughly. In this paper, we investigate the study session dropout prediction problem in a mobile learning environment. First, we define the concept of the study session, study session dropout and study session dropout prediction task in a mobile learning environment. Based on the definitions, we propose a novel Transformer based model for predicting study session dropout, DAS: Deep Attentive Study Session Dropout Prediction in Mobile Learning Environment. DAS has an encoder-decoder structure which is composed of stacked multi-head attention and point-wise feed-forward networks. The deep attentive computations in DAS are capable of capturing complex relations among dynamic student interactions. To the best of our knowledge, this is the first attempt to investigate study session dropout in a mobile learning environment. Empirical evaluations on a large-scale dataset show that DAS achieves the best performance with a significant improvement in area under the receiver operating characteristic curve compared to baseline models.

翻訳日:2023-01-01 04:22:04 公開日:2021-02-02
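
The abstract above reports results as area under the receiver operating characteristic curve (AUC). As a reminder of what that metric measures, the sketch below computes AUC as the probability that a randomly chosen positive example is scored above a randomly chosen negative one. The labels and scores are made-up illustrative values, not data from the paper, and `roc_auc` is a hypothetical helper.

```python
def roc_auc(labels: list, scores: list) -> float:
    """AUC as the probability that a random positive outranks a random negative."""
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a win
    return wins / (len(positives) * len(negatives))


labels = [1, 0, 1, 0, 1]            # 1 = the study session ended in a dropout
scores = [0.9, 0.2, 0.4, 0.6, 0.7]  # hypothetical predicted dropout probabilities
print(roc_auc(labels, scores))
```

Here five of the six positive-negative pairs are ordered correctly, so the value is 5/6. The pairwise loop is quadratic and is meant for clarity, not for large datasets.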

# 深部ランダム化ニューラルネットワーク

Deep Randomized Neural Networks ( http://arxiv.org/abs/2002.12287v2 )

ライセンス: Link先を確認

Claudio Gallicchio and Simone Scardapane

(参考訳) ランダム化されたニューラルネットワークは、ほとんどの接続が固定されたニューラルネットワークの振る舞いを確率的または決定論的に探索する。このようなシステムの典型的な例は、隠れた層への接続が初期化後に未訓練のまま残される多層ニューラルネットワークアーキテクチャである。トレーニングアルゴリズムを減量セットで運用することを制限することは、本質的に、多くの興味深い特徴を持つランダム化されたニューラルネットワークのクラスを特徴付ける。その中でも、学習プロセスの極端な効率性は、完全に訓練されたアーキテクチャに関して、間違いなく顕著な優位性である。さらに、関連する単純化にもかかわらず、ランダム化されたニューラルネットワークは、実際の両方において顕著な特性を持ち、最先端の結果を複数のドメインで達成し、理論的には、ニューラルネットワークアーキテクチャ(例えば、隠れたレイヤの接続をトレーニングする前に)固有の特性を解析できる。近年、ランダム化ニューラルネットワークの研究は深層アーキテクチャへと拡張され、ベクトルやより複雑なデータ領域において効率的かつ極めて効率的なディープラーニングモデルの設計に向けた新たな研究方向が開かれた。本章では、ランダム化されたニューラルネットワークの設計と解析に関する主要な側面と、それらの近似能力に関する重要な結果について調査する。特に,まず,ランダム化ニューラルモデルの基礎をフィードフォワードネットワーク(すなわち,ランダムベクトル汎関数リンクと等価モデル)と畳み込みフィルタ(英語版)(convolutional filter)の文脈で導入し,その後にリカレントシステム(すなわち貯水池計算ネットワーク)に移行した。どちらの場合でも、深層ランダム化システムの領域における最近の結果と、その構造化ドメインへの(再帰モデルのための)適用に特に焦点を当てています。

Randomized Neural Networks explore the behavior of neural systems where the majority of connections are fixed, either in a stochastic or a deterministic fashion. Typical examples of such systems consist of multi-layered neural network architectures where the connections to the hidden layer(s) are left untrained after initialization. Limiting the training algorithms to operate on a reduced set of weights inherently characterizes the class of Randomized Neural Networks with a number of intriguing features. Among them, the extreme efficiency of the resulting learning processes is undoubtedly a striking advantage with respect to fully trained architectures. Besides, despite the involved simplifications, randomized neural systems possess remarkable properties both in practice, achieving state-of-the-art results in multiple domains, and theoretically, allowing to analyze intrinsic properties of neural architectures (e.g. before training of the hidden layers' connections). In recent years, the study of Randomized Neural Networks has been extended towards deep architectures, opening new research directions to the design of effective yet extremely efficient deep learning models in vectorial as well as in more complex data domains. This chapter surveys all the major aspects regarding the design and analysis of Randomized Neural Networks, and some of the key results with respect to their approximation capabilities. In particular, we first introduce the fundamentals of randomized neural models in the context of feed-forward networks (i.e., Random Vector Functional Link and equivalent models) and convolutional filters, before moving to the case of recurrent systems (i.e., Reservoir Computing networks). For both, we focus specifically on recent results in the domain of deep randomized systems, and (for recurrent models) their application to structured domains.

翻訳日:2022-12-28 07:12:03 公開日:2021-02-02
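
The core idea in the abstract above, hidden connections that stay fixed after random initialization while only the output weights are trained, can be sketched in a few lines. This is a toy illustration in plain Python under several assumptions: one scalar input, a tanh hidden layer, a ridge-regression readout, and no direct input-to-output links (which Random Vector Functional Link models normally include). All function names and hyperparameters are hypothetical.

```python
import math
import random


def hidden_features(x: float, weights: list, biases: list) -> list:
    """Fixed random hidden layer: tanh(w * x + b) per unit, never trained."""
    return [math.tanh(w * x + b) for w, b in zip(weights, biases)]


def solve(a: list, b: list) -> list:
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x


def train_readout(h: list, y: list, ridge: float = 1e-3) -> list:
    """Ridge regression for the readout: (H^T H + ridge * I) beta = H^T y."""
    k = len(h[0])
    gram = [[sum(row[i] * row[j] for row in h) + (ridge if i == j else 0.0)
             for j in range(k)] for i in range(k)]
    rhs = [sum(row[i] * t for row, t in zip(h, y)) for i in range(k)]
    return solve(gram, rhs)


rng = random.Random(0)
n_hidden = 8
weights = [rng.uniform(-3, 3) for _ in range(n_hidden)]  # drawn once, then frozen
biases = [rng.uniform(-1, 1) for _ in range(n_hidden)]

xs = [i / 10 for i in range(-10, 11)]
ys = [math.sin(2 * x) for x in xs]  # toy regression target
h = [hidden_features(x, weights, biases) for x in xs]
beta = train_readout(h, ys)  # the only trained parameters

preds = [sum(b * f for b, f in zip(beta, row)) for row in h]
mse = sum((p - t) ** 2 for p, t in zip(preds, ys)) / len(ys)
print(f"training MSE with a trained readout only: {mse:.4f}")
```

Training reduces to a single linear solve, which is the source of the efficiency the abstract highlights; the quality of the fit depends on how well the random features happen to cover the target.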

# サスペンド・ペイロードによる飛行モデルに基づくメタ強化学習

Model-Based Meta-Reinforcement Learning for Flight with Suspended Payloads ( http://arxiv.org/abs/2004.11345v2 )

ライセンス: Link先を確認

Suneel Belkhale, Rachel Li, Gregory Kahn, Rowan McAllister, Roberto Calandra, Sergey Levine

(参考訳) 吊り下げられたペイロードの輸送は、ロボットの動力に重大な、予測不能な変化を引き起こす可能性があるため、自律飛行車両にとって困難である。これらの変更は、最適飛行性能や破滅的な失敗につながる可能性がある。適応制御と学習に基づく手法は、原則としてこれらのハイブリッドロボットペイロードシステムの変化に適応することができるが、事前の物理的性質が不明なペイロードへの迅速なミッドフライ適応は未解決の問題である。本研究では,接続後飛行データから数秒以内に変化するダイナミクスのモデル「学習の仕方を学習する」メタラーニング手法を提案する。実験の結果,オンライン適応手法は,懸架されたペイロード輸送タスクにおいて,非適応手法よりも優れていることが示された。ビデオやその他の補足資料は、我々のウェブサイトで入手できる。

Transporting suspended payloads is challenging for autonomous aerial vehicles because the payload can cause significant and unpredictable changes to the robot's dynamics. These changes can lead to suboptimal flight performance or even catastrophic failure. Although adaptive control and learning-based methods can in principle adapt to changes in these hybrid robot-payload systems, rapid mid-flight adaptation to payloads that have a priori unknown physical properties remains an open problem. We propose a meta-learning approach that "learns how to learn" models of altered dynamics within seconds of post-connection flight data. Our experiments demonstrate that our online adaptation approach outperforms non-adaptive methods on a series of challenging suspended payload transportation tasks. Videos and other supplemental material are available on our website: https://sites.google.com/view/meta-rl-for-flight

Translated: 2022-12-10 09:20:01 Published: 2021-02-02

# Representation of Developer Expertise in Open Source Software ( http://arxiv.org/abs/2005.10176v3 )

License: see the linked page

Tapajit Dey, Andrey Karnauch, Audris Mockus

(参考訳) 背景: 開発者の専門知識の正確な表現は常に重要な研究課題です。多くの研究が個々のプロジェクト内で専門知識を表現する新しい手法を提案しているが、これらの手法は生態系レベルでは適用が困難である。しかし、ソフトウェア開発がモノリシックからモジュラーへとシフトするにつれ、例えばプロジェクトが新しいメンテナを見つけ、関連するスキルを持つ開発者を探そうとするときに、OSS開発全体のコンテキストにおける開発者の専門知識を表現する方法が必要である。目的: 私たちは,各apiや開発者,プロジェクトが表現されるスキルスペースの提案と構築を通じて,この知識ギャップに対処することを目的としています。メソッド: 私たちはWorld of Codeインフラストラクチャを使用して、オープンソース開発者が変更したファイルの完全なAPIセットを抽出し、そのデータに基づいて、API、開発者、プロジェクトのベクトル表現にDoc2Vec埋め込みを使用します。これらの埋め込みがSkill Spaceの仮定されたトポロジを反映しているかどうかを、開発者が使用/参加する新しいAPIやプロジェクト、プルリクエストが受け入れられるかどうかを予測することで評価します。また、Skill Spaceにおける開発者の表現が、自己報告のAPIの専門知識とどのように一致しているかを確認します。結果: 提案するスキル空間への埋め込みは, 仮定されたトポロジーを満足しているように思われる。このような表現が, オープンソースエコシステム全体の信頼(と効率)を高めるシグナルの構築に寄与し, 開発者の習熟度や学習に関連する他の現象の調査に役立つことを期待する。

Background: Accurate representation of developer expertise has always been an important research problem. While a number of studies proposed novel methods of representing expertise within individual projects, these methods are difficult to apply at an ecosystem level. However, with the focus of software development shifting from monolithic to modular, a method of representing developers' expertise in the context of the entire OSS development becomes necessary when, for example, a project tries to find new maintainers and look for developers with relevant skills. Aim: We aim to address this knowledge gap by proposing and constructing the Skill Space where each API, developer, and project is represented and postulate how the topology of this space should reflect what developers know (and projects need). Method: we use the World of Code infrastructure to extract the complete set of APIs in the files changed by open source developers and, based on that data, employ Doc2Vec embeddings for vector representations of APIs, developers, and projects. We then evaluate if these embeddings reflect the postulated topology of the Skill Space by predicting what new APIs/projects developers use/join, and whether or not their pull requests get accepted. We also check how the developers' representations in the Skill Space align with their self-reported API expertise. Result: Our results suggest that the proposed embeddings in the Skill Space appear to satisfy the postulated topology and we hope that such representations may aid in the construction of signals that increase trust (and efficiency) of open source ecosystems at large and may aid investigations of other phenomena related to developer proficiency and learning.

Translated: 2022-12-01 06:16:54 Published: 2021-02-02

# A Unified Dual-view Model for Review Summarization and Sentiment Classification with Inconsistency Loss ( http://arxiv.org/abs/2006.01592v2 )

License: see the linked page

Hou Pong Chan, Wang Chen, Irwin King

(参考訳) ユーザーレビューから正確な要約と感情を得ることは、現代のeコマースプラットフォームの重要な要素である。レビュー要約は、レビューの重要な意見と感情を記述する簡潔な要約を生成することを目的としており、感情分類はレビューの感情態度を示す感情ラベルを予測することを目的としている。レビュー要約と感情分類の両タスクにおいて,共有感情情報を効果的に活用するために,これら2つのタスクの性能を協調的に改善する新しいデュアルビューモデルを提案する。このモデルでは、エンコーダはまずレビューのコンテキスト表現を学習し、次に要約デコーダがレビュー要約語を単語毎に生成する。その後、ソースビュー感情分類器は、エンコードされたコンテキスト表現を使用してレビューの感情ラベルを予測し、サマリビュー感情分類器はデコーダ隠蔽状態を使用して生成された要約の感情ラベルを予測する。トレーニング中、これらの2つの分類器間の不一致を罰する不整合損失を導入する。これはデコーダがレビューで一貫した感情傾向を持つために要約を生成するのを助け、2つの感情分類器が互いに学ぶのに役立つ。異なる領域の4つの実世界のデータセットに対する実験結果は、我々のモデルの有効性を示す。

Acquiring accurate summarization and sentiment from user reviews is an essential component of modern e-commerce platforms. Review summarization aims at generating a concise summary that describes the key opinions and sentiment of a review, while sentiment classification aims to predict a sentiment label indicating the sentiment attitude of a review. To effectively leverage the shared sentiment information in both review summarization and sentiment classification tasks, we propose a novel dual-view model that jointly improves the performance of these two tasks. In our model, an encoder first learns a context representation for the review, then a summary decoder generates a review summary word by word. After that, a source-view sentiment classifier uses the encoded context representation to predict a sentiment label for the review, while a summary-view sentiment classifier uses the decoder hidden states to predict a sentiment label for the generated summary. During training, we introduce an inconsistency loss to penalize the disagreement between these two classifiers. It helps the decoder to generate a summary to have a consistent sentiment tendency with the review and also helps the two sentiment classifiers learn from each other. Experiment results on four real-world datasets from different domains demonstrate the effectiveness of our model.
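
The inconsistency loss described above can be sketched as a divergence between the two classifiers' predicted sentiment distributions. The abstract does not state the exact form of the loss, so the KL divergence used here, the function names, and the example logits are all assumptions made for illustration.

```python
import math

def softmax(logits):
    """Turn raw classifier scores into a probability distribution."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def kl_divergence(p, q, eps=1e-12):
    """KL(p || q); eps guards against log(0)."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))

def inconsistency_loss(source_logits, summary_logits):
    """Hypothetical penalty for disagreement between the source-view and
    summary-view sentiment classifiers (KL divergence is an assumption)."""
    return kl_divergence(softmax(source_logits), softmax(summary_logits))

# Identical predictions incur no penalty; opposite predictions are penalized.
agree = inconsistency_loss([2.0, 0.1, -1.0], [2.0, 0.1, -1.0])
disagree = inconsistency_loss([2.0, 0.1, -1.0], [-1.0, 0.1, 2.0])
```

In a joint training setup, a term like this would be added to the summarization and classification losses, so that a summary whose sentiment drifts from the review's sentiment raises the total loss.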

Translated: 2022-11-26 00:31:11 Published: 2021-02-02

# Orientation Attentive Robotic Grasp Synthesis with Augmented Grasp Map Representation ( http://arxiv.org/abs/2006.05123v2 )

License: see the linked page

Georgia Chalvatzaki, Nikolaos Gkanatsios, Petros Maragos, Jan Peters

(参考訳) 物体の因果的な形態的特徴は、ロボットの把握の視覚的学習を阻害する、幅広い可視的把握方向を提供する可能性がある。既存の把持生成アプローチは、把持点ごとに大きく異なる向きのアノテーションを集約することで不連続な把持マップを構築するために呪いを負う。さらに,現状の手法では,ロボットの視点において,その実現可能性の制約を無視して,単一方向の把握候補を生成する。本稿では, 角度空間を複数のビンに分割することにより, 方向を局所的に歪曲する, 画素ワイズ合成に適した拡張型グリップマップ表現を提案する。さらに,向き付けビンへの分類と角度値回帰を共同で扱う向き付け注意把握合成(orange)フレームワークについても紹介する。双角方向写像はさらに、把握性が高い領域、すなわち実際の把握点となる確率に対する注意のメカニズムとして機能する。 Jacquardの94.71%の性能は、深度画像のみを用いた単純なU-Netで、マルチモーダルアプローチよりも優れています。その後の定性的な結果から,ORANGEが複数の方向のグリップを生成することの有効性を検証し,実現可能な計画的グリップを実現する。

Inherent morphological characteristics in objects may offer a wide range of plausible grasping orientations that obfuscates the visual learning of robotic grasping. Existing grasp generation approaches are cursed to construct discontinuous grasp maps by aggregating annotations for drastically different orientations per grasping point. Moreover, current methods generate grasp candidates across a single direction in the robot's viewpoint, ignoring its feasibility constraints. In this paper, we propose a novel augmented grasp map representation, suitable for pixel-wise synthesis, that locally disentangles grasping orientations by partitioning the angle space into multiple bins. Furthermore, we introduce the ORientation AtteNtive Grasp synthEsis (ORANGE) framework, that jointly addresses classification into orientation bins and angle-value regression. The bin-wise orientation maps further serve as an attention mechanism for areas with higher graspability, i.e. probability of being an actual grasp point. We report new state-of-the-art 94.71% performance on Jacquard, with a simple U-Net using only depth images, outperforming even multi-modal approaches. Subsequent qualitative results with a real bi-manual robot validate ORANGE's effectiveness in generating grasps for multiple orientations, hence allowing planning grasps that are feasible.
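
Partitioning the angle space into bins, with classification into a bin and regression of the angle value, can be illustrated with a small encoder and decoder. This is a minimal sketch under stated assumptions: the angle range of [-pi/2, pi/2), the default of six bins, and the function names are not taken from the paper.

```python
import math

def encode_angle(theta, num_bins=6):
    """Map a grasp angle to (bin index, offset within the bin in [0, 1)).

    Assumes grasp angles are symmetric, so theta is wrapped into
    [-pi/2, pi/2) before binning.
    """
    width = math.pi / num_bins
    wrapped = (theta + math.pi / 2) % math.pi  # now in [0, pi)
    index = min(int(wrapped // width), num_bins - 1)  # guard float rounding
    offset = wrapped / width - index
    return index, offset

def decode_angle(index, offset, num_bins=6):
    """Inverse of encode_angle: recover the angle from bin and offset."""
    width = math.pi / num_bins
    return (index + offset) * width - math.pi / 2

bin_index, bin_offset = encode_angle(0.3)
recovered = decode_angle(bin_index, bin_offset)
```

The bin index would serve as a classification target and the offset as a regression target, so that two very different orientations at the same pixel land in separate bins instead of being averaged into one value.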

Translated: 2022-11-23 15:31:26 Published: 2021-02-02

# Debona: Decoupled Boundary Network Analysis for Tighter Bounds and Faster Adversarial Robustness Proofs ( http://arxiv.org/abs/2006.09040v2 )

License: see the linked page

Christopher Brix, Thomas Noll

(参考訳) ニューラルネットワークは、安全クリティカルな現実世界のアプリケーションで一般的に使用される。残念なことに、予測された出力は、しばしば入力データの変更に対して非常に敏感である。このような敵の例が存在しないこと、あるいは具体的な例を提供することは、安全なアプリケーションを保証するために不可欠である。全ての潜在的な敵の例を列挙し、検証することは、計算的に不可能であるので、ネットワークアクティベーションの過大評価を用いて、不在の数学的に健全な証明を提供するための検証技術が開発されている。本稿では,これらのノード値の上限値と下限値の密接な計算を行うための改良手法を提案する。さらに,従来の最先端ソフトウェアである"Neurify"の一部を再実装することで,より高速な解析が可能になった。これらの適応を組み合わせることで、必要なランタイムを最大94%削減し、以前は複雑すぎたネットワークや入力の検索に成功した。畳み込みネットワークにおける最大プーリング層上の上下境界の厳密な証明を行う。広汎なユーザビリティを確保するため,実装固有の拡張に加えて,より高速かつ正確な境界計算も備えた実装"Debona"をオープンソース化した。

Neural networks are commonly used in safety-critical real-world applications. Unfortunately, the predicted output is often highly sensitive to small, and possibly imperceptible, changes to the input data. Proving that either no such adversarial examples exist, or providing a concrete instance, is therefore crucial to ensure safe applications. As enumerating and testing all potential adversarial examples is computationally infeasible, verification techniques have been developed to provide mathematically sound proofs of their absence using overestimations of the network activations. We propose an improved technique for computing tight upper and lower bounds of these node values, based on increased flexibility gained by computing both bounds independently of each other. Furthermore, we gain an additional improvement by re-implementing part of the original state-of-the-art software "Neurify", leading to a faster analysis. Combined, these adaptations reduce the necessary runtime by up to 94%, and allow a successful search for networks and inputs that were previously too complex. We provide proofs for tight upper and lower bounds on max-pooling layers in convolutional networks. To ensure widespread usability, we open source our implementation "Debona", featuring both the implementation specific enhancements as well as the refined boundary computation for faster and more exact results.
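
To make the idea of sound upper and lower bounds on node values concrete, the following sketch propagates an input box through one affine layer and a ReLU using plain interval arithmetic. This is a deliberately simpler technique than the bound computation in Neurify or Debona, and the weights and function names are invented for the example.

```python
def affine_bounds(weights, bias, lower, upper):
    """Interval bounds for y = W x + b when lower <= x <= upper elementwise."""
    out_lower, out_upper = [], []
    for row, b in zip(weights, bias):
        lo = hi = b
        for w, l, u in zip(row, lower, upper):
            if w >= 0:
                lo += w * l  # smallest contribution uses the lower input
                hi += w * u
            else:
                lo += w * u  # a negative weight flips which end is extreme
                hi += w * l
        out_lower.append(lo)
        out_upper.append(hi)
    return out_lower, out_upper

def relu_bounds(lower, upper):
    """ReLU is monotone, so it can be applied to each bound separately."""
    return [max(0.0, l) for l in lower], [max(0.0, u) for u in upper]

# Hypothetical 2x2 layer with inputs perturbed inside the box [-1, 1]^2.
weights = [[1.0, -2.0], [0.5, 0.5]]
bias = [0.0, -1.0]
pre_lower, pre_upper = affine_bounds(weights, bias, [-1.0, -1.0], [1.0, 1.0])
post_lower, post_upper = relu_bounds(pre_lower, pre_upper)
```

If the bounds on the output layer show that the correct class score always exceeds every other class score, no adversarial example exists inside the box. Looser bounds make such proofs fail more often, which is why tighter bound computation matters.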

Translated: 2022-11-20 18:44:10 Published: 2021-02-02

# Conversational Neuro-Symbolic Commonsense Reasoning ( http://arxiv.org/abs/2006.10022v3 )

License: see the linked page

Forough Arabshahi, Jennifer Lee, Mikayla Gawarecki, Kathryn Mazaitis, Amos Azaria, Tom Mitchell

(参考訳) 会話AIシステムがより自然で広範に会話を行うためには、会話パートナーの予想外の推測を識別する機能を含む、より一般的な知識が必要である。例えば、"if it snows at night, then me early because i don't want to be late for work"というコマンドでは、リスナーの常識に基づいて、雪が降って渋滞が遅くなる場合にのみ目を覚ましたいという暗黙の仮定を推測する。ここでは、「if-(state), then-(action), because-(goal)"文」という形で与えられた不正確な自然言語コマンドを理解するという問題を考察する。より正確には、要求された行動が与えられた状態から所望の目標を達成することを許容する話者の未定の前提を識別する問題を考える(暗黙の前提を明示することによる詳細化)。我々はこのタスクのベンチマークデータセットをリリースし、人間から収集し、コモンセンス推定で注釈を付けた。マルチホップ推論鎖を抽出するニューロシンボリック定理証明器を提案し,この問題に応用する。さらに、現在のAIコモンセンスシステムが完全なカバレッジを欠いている現実に対応するため、私たちのニューロシンボリックシステム上に構築された対話型会話フレームワークも提供します。

In order for conversational AI systems to hold more natural and broad-ranging conversations, they will require much more commonsense, including the ability to identify unstated presumptions of their conversational partners. For example, in the command "If it snows at night then wake me up early because I don't want to be late for work" the speaker relies on commonsense reasoning of the listener to infer the implicit presumption that they wish to be woken only if it snows enough to cause traffic slowdowns. We consider here the problem of understanding such imprecisely stated natural language commands given in the form of "if-(state), then-(action), because-(goal)" statements. More precisely, we consider the problem of identifying the unstated presumptions of the speaker that allow the requested action to achieve the desired goal from the given state (perhaps elaborated by making the implicit presumptions explicit). We release a benchmark data set for this task, collected from humans and annotated with commonsense presumptions. We present a neuro-symbolic theorem prover that extracts multi-hop reasoning chains, and apply it to this problem. Furthermore, to accommodate the reality that current AI commonsense systems lack full coverage, we also present an interactive conversational framework built on our neuro-symbolic system, that conversationally evokes commonsense knowledge from humans to complete its reasoning chains.

Translated: 2022-11-19 18:59:11 Published: 2021-02-02

# Compositional Explanations of Neurons ( http://arxiv.org/abs/2006.14032v2 )

License: see the linked page

Jesse Mu, Jacob Andreas

(参考訳) 本稿では,ニューロンの挙動を近似した構成論理的概念を同定し,深部表現におけるニューロンの説明手法について述べる。原子ラベルを説明として使用する以前の研究と比較すると、ニューロンを合成分析することで、より正確に表現的にその行動を特徴付けることができる。視覚と自然言語処理のモデルにおける解釈可能性に関するいくつかの質問に答えるためにこの手順を用いる。まず,ニューロンが学習する抽象化の種類について検討する。画像分類では、多くのニューロンが高度に抽象的だがセマンティック・コヒーレントな視覚概念を学習しているのに対し、他のポリセマンティックニューロンは複数の無関係な特徴を検知している。第2に,人間の解釈可能な概念を検出する視覚ニューロンはタスク性能と正の相関を示す一方,浅いヒューリスティックスのために発火するNLIニューロンはタスク性能と負の相関を示す。最後に、構成説明が、エンドユーザーがモデル動作を予測可能な方法で変更する単純な「コピーペースト」攻撃例を作成するための、アクセス可能な方法を提供する方法を示す。

We describe a procedure for explaining neurons in deep representations by identifying compositional logical concepts that closely approximate neuron behavior. Compared to prior work that uses atomic labels as explanations, analyzing neurons compositionally allows us to more precisely and expressively characterize their behavior. We use this procedure to answer several questions on interpretability in models for vision and natural language processing. First, we examine the kinds of abstractions learned by neurons. In image classification, we find that many neurons learn highly abstract but semantically coherent visual concepts, while other polysemantic neurons detect multiple unrelated features; in natural language inference (NLI), neurons learn shallow lexical heuristics from dataset biases. Second, we see whether compositional explanations give us insight into model performance: vision neurons that detect human-interpretable concepts are positively correlated with task performance, while NLI neurons that fire for shallow heuristics are negatively correlated with task performance. Finally, we show how compositional explanations provide an accessible way for end users to produce simple "copy-paste" adversarial examples that change model behavior in predictable ways.
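
The difference between an atomic label and a compositional logical concept can be shown with sets of inputs. The sketch below scores how well a formula approximates a neuron by intersection over union. The scoring choice, the concept names, and the toy data are assumptions, since the abstract does not specify them.

```python
def iou(a, b):
    """Intersection over union of two sets of input indices."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

# Hypothetical data: indices of inputs on which each concept is present,
# and indices on which the neuron's activation exceeds some threshold.
concepts = {
    "water": {0, 1, 2, 3},
    "river": {2, 3, 4},
    "blue": {0, 5},
}
neuron_active = {1, 2, 3, 4}

# Atomic explanation: a single concept label.
atomic_score = iou(neuron_active, concepts["water"])

# Compositional explanation: (water OR river) AND NOT blue.
formula = (concepts["water"] | concepts["river"]) - concepts["blue"]
compositional_score = iou(neuron_active, formula)
```

In this toy case the logical formula matches the neuron exactly while the best single label does not, which is the sense in which a compositional explanation can characterize behavior more precisely.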

Translated: 2022-11-17 08:59:07 Published: 2021-02-02

# COAX: Correlation-Aware Indexing on Multidimensional Data with Soft Functional Dependencies ( http://arxiv.org/abs/2006.16393v3 )

License: see the linked page

Ali Hadian, Behzad Ghaffari, Taiyi Wang, Thomas Heinis

(参考訳) 最近の研究は、パフォーマンスを改善するために基礎となるデータセットの分布を学習する学習インデックス構造を提案している。学習されたインデックスに関する最初の研究は、データの累積分布関数を学習することで、b木のようなインデックス構造がメモリフットプリントを小さくしながら、その性能を1桁改善できることを示した。本稿では,鍵の分布を学習する代わりに,データセットの属性間の相関関係を学習する多次元データのための学習指標であるCOAXを提案する。我々のアプローチは、多くのデータセットにおいて、2つの(または複数の)属性の値が相関しているという観測によって導かれる。 COAXはこれらの相関を利用してデータセットの次元を減少させる。より正確には、ある(または複数の)属性を残りの属性から$c_d$を推測する方法を学び、したがって$c_d$をインデックスする必要がない。これにより次元が小さくなり、インデックスはより小さくより効率的になる。提案手法の有効性をFD属性の予測可能性に基づいて理論的に検討する。さらに,データ中の関連属性を予測することにより,クエリ実行時間を短縮し,インデックスのメモリオーバーヘッドを低減できることを実験的に示す。実験では,インデックスのメモリフットプリントを4桁に減らしながら,実行時間を25%削減した。

Recent work proposed learned index structures, which learn the distribution of the underlying dataset to improve performance. The initial work on learned indexes has shown that by learning the cumulative distribution function of the data, index structures such as the B-Tree can improve their performance by one order of magnitude while having a smaller memory footprint. In this paper, we present COAX, a learned index for multidimensional data that, instead of learning the distribution of keys, learns the correlations between attributes of the dataset. Our approach is driven by the observation that in many datasets, values of two (or multiple) attributes are correlated. COAX exploits these correlations to reduce the dimensionality of the datasets. More precisely, we learn how to infer one (or multiple) attribute $C_d$ from the remaining attributes and hence no longer need to index attribute $C_d$. This reduces the dimensionality and hence makes the index smaller and more efficient. We theoretically investigate the effectiveness of the proposed technique based on the predictability of the FD attributes. We further show experimentally that by predicting correlated attributes in the data, we can improve the query execution time and reduce the memory overhead of the index. In our experiments, we reduce the execution time by 25% while reducing the memory footprint of the index by four orders of magnitude.
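
The idea of not indexing an attribute that can be inferred from another can be sketched with a one-dimensional sorted index. This is not the COAX data structure: the linear model with a fixed error margin, the separate outlier list, the class name, and the sample rows are all assumptions made for illustration.

```python
import bisect

class CorrelationAwareIndex:
    """Indexes rows (x, y) by x only, assuming y is roughly slope*x + intercept.

    Rows within `margin` of the line are found through the x index.
    Rows outside the margin are kept in a small outlier list.
    Assumes slope > 0.
    """

    def __init__(self, rows, slope, intercept, margin):
        self.slope, self.intercept, self.margin = slope, intercept, margin
        self.inliers = sorted(r for r in rows if self._residual(r) <= margin)
        self.outliers = [r for r in rows if self._residual(r) > margin]
        self.keys = [r[0] for r in self.inliers]

    def _residual(self, row):
        return abs(row[1] - (self.slope * row[0] + self.intercept))

    def query_y(self, y_lo, y_hi):
        """Return rows with y_lo <= y <= y_hi without any index on y."""
        # Translate the y range into an x range, widened by the margin.
        x_lo = (y_lo - self.margin - self.intercept) / self.slope
        x_hi = (y_hi + self.margin - self.intercept) / self.slope
        start = bisect.bisect_left(self.keys, x_lo)
        stop = bisect.bisect_right(self.keys, x_hi)
        hits = [r for r in self.inliers[start:stop] if y_lo <= r[1] <= y_hi]
        hits += [r for r in self.outliers if y_lo <= r[1] <= y_hi]
        return sorted(hits)

rows = [(1, 2.1), (2, 3.9), (3, 6.2), (4, 8.0), (5, 30.0)]
index = CorrelationAwareIndex(rows, slope=2.0, intercept=0.0, margin=0.5)
in_range = index.query_y(3.5, 6.5)
far_out = index.query_y(25.0, 35.0)
```

The more predictable the dependent attribute is, the smaller the margin and the outlier list become, and the closer a query on y gets to the cost of a query on x alone.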

Translated: 2022-11-15 15:34:15 Published: 2021-02-02

# Efficient Conformal Prediction via Cascaded Inference with Expanded Admission ( http://arxiv.org/abs/2007.03114v3 )

License: see the linked page

Adam Fisch, Tal Schuster, Tommi Jaakkola, Regina Barzilay

(参考訳) 本稿では,1つの予測に代えて,予測候補の集合を特定することを目的とした,共形予測(CP)の新しいアプローチを提案する。この集合は高い確率で正しい解を含むことが保証され、多くのオープンな分類タスクに適している。標準CPパラダイムでは、予測された集合は使用不能に大きくなり、得られるコストもかかる。これは、正しい答えが一意ではなく、可能な答えの総数は高い設定で特に広まっています。まずcpの正しさ基準を拡張して,推定可能な「許容」回答を追加可能とし,有効な性能保証を提供しながら,予測セットのサイズを大幅に削減する。第二に、予測カスケードを適合させることでコストを減らし、より強力な分類器-アゲインを用いて、早期に不明瞭なラベルを積極的に作成し、有効な性能保証を提供する。薬物発見のための自然言語処理と計算化学の複数の応用におけるアプローチの実証的有効性を示す。

In this paper, we present a novel approach for conformal prediction (CP), in which we aim to identify a set of promising prediction candidates -- in place of a single prediction. This set is guaranteed to contain a correct answer with high probability, and is well-suited for many open-ended classification tasks. In the standard CP paradigm, the predicted set can often be unusably large and also costly to obtain. This is particularly pervasive in settings where the correct answer is not unique, and the number of total possible answers is high. We first expand the CP correctness criterion to allow for additional, inferred "admissible" answers, which can substantially reduce the size of the predicted set while still providing valid performance guarantees. Second, we amortize costs by conformalizing prediction cascades, in which we aggressively prune implausible labels early on by using progressively stronger classifiers -- again, while still providing valid performance guarantees. We demonstrate the empirical effectiveness of our approach for multiple applications in natural language processing and computational chemistry for drug discovery.
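
For readers new to conformal prediction, the standard split conformal recipe that this work builds on can be sketched as follows. It does not implement the expanded admission criterion or the cascades proposed in the paper, and the calibration scores and label probabilities are made-up values.

```python
import math

def conformal_threshold(calibration_scores, alpha):
    """Score threshold giving coverage of at least 1 - alpha.

    Each calibration score is a nonconformity score, here taken to be
    1 - (model probability of the true label) on held-out data.
    """
    n = len(calibration_scores)
    rank = math.ceil((n + 1) * (1 - alpha))
    if rank > n:
        return float("inf")  # too few calibration points: keep every label
    return sorted(calibration_scores)[rank - 1]

def prediction_set(label_probs, threshold):
    """All labels whose nonconformity score does not exceed the threshold."""
    return {label for label, p in label_probs.items() if 1.0 - p <= threshold}

scores = [0.3, 0.05, 0.6, 0.1, 0.5, 0.2, 0.4]
threshold = conformal_threshold(scores, alpha=0.25)

confident = prediction_set({"cat": 0.55, "dog": 0.40, "fox": 0.05}, threshold)
ambiguous = prediction_set({"cat": 0.5, "dog": 0.5}, threshold)
```

The prediction set grows when the model is uncertain. When many labels are plausible the set can become very large, which is the situation the paper's expanded admission and cascaded pruning aim to address.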

Translated: 2022-11-13 01:52:21 Published: 2021-02-02

# Region-based Non-local Operation for Video Classification ( http://arxiv.org/abs/2007.09033v5 )

License: see the linked page

Guoxi Huang and Adrian G. Bors

(参考訳) 畳み込みニューラルネットワーク(cnns)は、小さなウィンドウサイズで畳み込み操作を深く積み重ねることで、長距離依存性をモデル化する。本稿では,ローカル操作の深いスタックを使わずに,長距離依存関係を直接キャプチャできる自己注意機構のファミリーとして,地域ベースの非ローカル操作(RNL)を提案する。中間特徴マップが与えられると、全ての位置の隣接領域から情報を集約することにより、その特徴を位置で再調整する。チャネルアテンションモジュールと提案したRNLを組み合わせることで,市販のCNNに組み込んだアテンションチェーンを設計し,エンドツーエンドのトレーニングを行う。本手法を2つのビデオ分類ベンチマークで評価する。提案手法の実験結果は,他の注意機構よりも優れており,Something V1データセットの最先端性能を実現している。

Convolutional Neural Networks (CNNs) model long-range dependencies by deeply stacking convolution operations with small window sizes, which makes the optimizations difficult. This paper presents region-based non-local (RNL) operations as a family of self-attention mechanisms, which can directly capture long-range dependencies without using a deep stack of local operations. Given an intermediate feature map, our method recalibrates the feature at a position by aggregating the information from the neighboring regions of all positions. By combining a channel attention module with the proposed RNL, we design an attention chain, which can be integrated into the off-the-shelf CNNs for end-to-end training. We evaluate our method on two video classification benchmarks. The experimental results of our method outperform other attention mechanisms, and we achieve state-of-the-art performance on the Something-Something V1 dataset.

Translated: 2022-11-09 14:14:14 Published: 2021-02-02

# Compact Graph Architecture for Speech Emotion Recognition ( http://arxiv.org/abs/2008.02063v4 )

License: see the linked page

A. Shirian, T. Guha

(参考訳) 本稿では,音声感情認識の課題に対処するディープグラフアプローチを提案する。データを表現するコンパクトで効率的でスケーラブルな方法は、グラフの形式です。グラフ信号処理の理論に倣って,周期グラフや線グラフとして音声信号をモデル化することを提案する。このようなグラフ構造により、標準的なGCNで使用される近似畳み込みとは対照的に、正確なグラフ畳み込みを行うことができるグラフ畳み込みネットワーク(GCN)ベースのアーキテクチャを構築することができる。一般的なIEMOCAPとMSP-IMPROVデータベースを用いた音声感情認識モデルの性能評価を行った。我々のモデルは、標準的なGCNや他の関連するディープグラフアーキテクチャよりも優れている。既存の音声感情認識法と比較すると,学習可能なパラメータ(約30K)が大幅に少なく,資源制約のあるデバイスに適用可能であることを示す。

We propose a deep graph approach to address the task of speech emotion recognition. A compact, efficient and scalable way to represent data is in the form of graphs. Following the theory of graph signal processing, we propose to model speech signal as a cycle graph or a line graph. Such graph structure enables us to construct a Graph Convolution Network (GCN)-based architecture that can perform an accurate graph convolution in contrast to the approximate convolution used in standard GCNs. We evaluated the performance of our model for speech emotion recognition on the popular IEMOCAP and MSP-IMPROV databases. Our model outperforms standard GCN and other relevant deep graph architectures indicating the effectiveness of our approach. When compared with existing speech emotion recognition methods, our model achieves comparable performance to the state-of-the-art with significantly fewer learnable parameters (~30K) indicating its applicability in resource-constrained devices.

Translated: 2022-11-02 18:03:15 Published: 2021-02-02

# ConvBERT: Improving BERT with Span-based Dynamic Convolution ( http://arxiv.org/abs/2008.02496v3 )

License: see the linked page

Zihang Jiang, Weihao Yu, Daquan Zhou, Yunpeng Chen, Jiashi Feng, Shuicheng Yan

(参考訳) BERTのような事前訓練された言語モデルとその変種は、最近、様々な自然言語理解タスクにおいて印象的なパフォーマンスを達成した。しかし、BERTはグローバルな自己保持ブロックに大きく依存しているため、メモリフットプリントと計算コストが大きくなる。すべての注意は、グローバルな視点からアテンションマップを生成するための入力シーケンス全体に問い合わせるが、いくつかのヘッドは、局所的な依存関係のみを学ぶ必要がある、つまり計算冗長性の存在を観察する。そこで本研究では,これらの自己注意型ヘッダを置き換え,局所的依存関係を直接モデル化する,スパンベースの動的畳み込みを提案する。新たな畳み込み頭は、他の自己注意頭と共に、グローバルな文脈学習とローカルな文脈学習の両方においてより効率的である新しい混合注意ブロックを形成する。 BERTにこの混合注意設計を装備し、ConvBERTモデルを構築します。実験によると、ConvBERTはBERTとその変種を様々な下流タスクで大幅に上回り、トレーニングコストが低く、モデルのパラメータも少ない。注目すべきは、ConvBERTbase モデルは 86.4 GLUE スコアで、ELECTRAbase よりも 0.7 高い。コードと事前訓練されたモデルがリリースされる。

Pre-trained language models like BERT and its variants have recently achieved impressive performance in various natural language understanding tasks. However, BERT heavily relies on the global self-attention block and thus suffers large memory footprint and computation cost. Although all its attention heads query on the whole input sequence for generating the attention map from a global perspective, we observe some heads only need to learn local dependencies, which means the existence of computation redundancy. We therefore propose a novel span-based dynamic convolution to replace these self-attention heads to directly model local dependencies. The novel convolution heads, together with the rest self-attention heads, form a new mixed attention block that is more efficient at both global and local context learning. We equip BERT with this mixed attention design and build a ConvBERT model. Experiments have shown that ConvBERT significantly outperforms BERT and its variants in various downstream tasks, with lower training cost and fewer model parameters. Remarkably, ConvBERTbase model achieves 86.4 GLUE score, 0.7 higher than ELECTRAbase, while using less than 1/4 training cost. Code and pre-trained models will be released.

Translated: 2022-11-02 07:12:36 Published: 2021-02-02

# Beneficial Perturbation Network for designing general adaptive artificial intelligence systems ( http://arxiv.org/abs/2009.13954v2 )

License: see the linked page

Shixian Wen, Amanda Rios, Yunhao Ge, Laurent Itti

(参考訳) 人間の脳は適応学習の金の標準である。経験から学び、利益を得るだけでなく、新しい状況にも適応できるのです。対照的に、ディープニューラルネットワークは、入力から出力への洗練された、固定されたマッピングのみを学習する。これにより適用性がよりダイナミックな状況に制限され、入力から出力マッピングが異なるコンテキストで変化する可能性がある。新しい独立したタスクを、前のタスクを忘れずにシーケンシャルに学習する。勾配勾配勾配を用いたニューラルネットワークにおける複数のタスクの連続的な学習は、破滅的な忘れを招き、新しいタスクの新しいマッピングを学ぶ際に、以前のタスクのマッピングが消去される。本稿では,これらの動的状況に対応するために,ネットワーク外,タスク依存のバイアスユニットを備えた,生物学的に可能な新しい深層ニューラルネットワークを提案する。これにより、単一のネットワークが初めて、出力マッピングに対する潜在的に無制限な並列入力を学習し、実行時にオンザフライを切り替えることが可能になる。バイアスユニットは、各タスクに有益な摂動(よく知られた対向的な摂動)を活用することでプログラムされる。与えられたタスクに対する有益な摂動は、そのタスクに対してネットワークを偏り、そのタスクを処理するためにネットワークを別のモードに切り替える。これにより、タスク間の破滅的な干渉がなくなる。我々のアプローチはメモリ効率が高くパラメータ効率が高く、多くのタスクに対応でき、様々なタスクやドメインで最先端のパフォーマンスを実現する。

The human brain is the gold standard of adaptive learning. It not only can learn and benefit from experience, but also can adapt to new situations. In contrast, deep neural networks only learn one sophisticated but fixed mapping from inputs to outputs. This limits their applicability to more dynamic situations, where input to output mapping may change with different contexts. A salient example is continual learning - learning new independent tasks sequentially without forgetting previous tasks. Continual learning of multiple tasks in artificial neural networks using gradient descent leads to catastrophic forgetting, whereby a previously learned mapping of an old task is erased when learning new mappings for new tasks. Here, we propose a new biologically plausible type of deep neural network with extra, out-of-network, task-dependent biasing units to accommodate these dynamic situations. This allows, for the first time, a single network to learn potentially unlimited parallel input to output mappings, and to switch on the fly between them at runtime. Biasing units are programmed by leveraging beneficial perturbations (opposite to well-known adversarial perturbations) for each task. Beneficial perturbations for a given task bias the network toward that task, essentially switching the network into a different mode to process that task. This largely eliminates catastrophic interference between tasks. Our approach is memory-efficient and parameter-efficient, can accommodate many tasks, and achieves state-of-the-art performance across different tasks and domains.

Translated: 2022-10-14 03:07:36 Published: 2021-02-02

# Graph-based methods for analyzing orchard tree structure using noisy point cloud data ( http://arxiv.org/abs/2009.13727v2 )

License: see the linked page

Fredrik Westling, Dr James Underwood, Dr Mitch Bryson

(参考訳) LiDARを用いた果樹のディジタイズにより、成長するプラクティスを改良して収量を改善するために使用できる分析が可能になる。高度な分析には、個々の木を識別する機能や、葉状および構造的物質の識別など、データの幾何学的および意味的な理解が必要である。この情報の抽出は、データキャプチャーのように迅速で、果樹園全体を処理できるが、既存の分類とセグメンテーションの方法は、高品質のデータやカメラのような追加のデータソースに依存している。本稿では,手持ちまたは移動式LiDARが取得した低品質データに基づいて,個々の木の位置,区分,物質分類に特化してLiDARデータを解析する手法を提案する。 F1スコアが0.774,v尺度が0.915,トランク物質分類が0.490,実データが平均的なF1スコアが0.490,既存手法が一貫して向上し,実行時間が大幅に短縮された。

Digitisation of fruit trees using LiDAR enables analysis which can be used to better growing practices to improve yield. Sophisticated analysis requires geometric and semantic understanding of the data, including the ability to discern individual trees as well as identifying leafy and structural matter. Extraction of this information should be rapid, as should data capture, so that entire orchards can be processed, but existing methods for classification and segmentation rely on high-quality data or additional data sources like cameras. We present a method for analysis of LiDAR data specifically for individual tree location, segmentation and matter classification, which can operate on low-quality data captured by handheld or mobile LiDAR. Our methods for tree location and segmentation improved on existing methods with an F1 score of 0.774 and a v-measure of 0.915 respectively, while trunk matter classification performed poorly in absolute terms with an average F1 score of 0.490 on real data, though consistently outperformed existing methods and displayed a significantly shorter runtime.

Translated: 2022-10-13 06:55:48 Published: 2021-02-02

# Paragraph-level Commonsense Transformers with Recurrent Memory ( http://arxiv.org/abs/2010.01486v2 )

License: see the linked page

Saadia Gabriel, Chandra Bhagavatula, Vered Shwartz, Ronan Le Bras, Maxwell Forbes, Yejin Choi

(参考訳) 物語のテキストに対する人間の理解は、テキストで明示的に述べられているものを超えて常識的推論を必要とする。最近のCOMETモデルでは、プレ条件やポスト条件、モチベーション、そして参加者の精神状態など、いくつかの次元に沿って、このような暗黙のコモンセンス推論を生成できる。しかし、COMETは短いフレーズのコモンセンス推論に基づいて訓練されたため、談話に依存しない。多元的な物語の各文で提示されると、その物語の他の部分と矛盾する推論を生成する可能性がある。談話認識コモンセンス推論の課題について述べる。物語の中の文が与えられると、目標は、物語の他の部分との一貫性を維持しながら、予め定義された次元に沿って常識的な推論を生成することである。このような大規模段落レベルのアノテーションは入手やコストがかかるため、文レベルのアノテーションを使用して、遠隔で管理されたコーパスを効率的にかつ自動的に構築する。このコーパスを用いて,物語からコヒーレントなコモンセンス推論を生成するために段落レベルの情報を含む談話認識モデルであるPARA-COMETを訓練する。 PARA-COMETは、前世界知識に関連する意味的知識と、現在の出来事が物語における前と将来の出来事にどのように関係しているかに関する叙述的知識の両方を捉えている。以上の結果から,PARA-COMETは文レベルのベースライン,特にコヒーレントかつ新規な推論に優れていた。

Human understanding of narrative texts requires making commonsense inferences beyond what is stated explicitly in the text. A recent model, COMET, can generate such implicit commonsense inferences along several dimensions such as pre- and post-conditions, motivations, and mental states of the participants. However, COMET was trained on commonsense inferences of short phrases, and is therefore discourse-agnostic. When presented with each sentence of a multi-sentence narrative, it might generate inferences that are inconsistent with the rest of the narrative. We present the task of discourse-aware commonsense inference. Given a sentence within a narrative, the goal is to generate commonsense inferences along predefined dimensions, while maintaining coherence with the rest of the narrative. Such large-scale paragraph-level annotation is hard to get and costly, so we use available sentence-level annotations to efficiently and automatically construct a distantly supervised corpus. Using this corpus, we train PARA-COMET, a discourse-aware model that incorporates paragraph-level information to generate coherent commonsense inferences from narratives. PARA-COMET captures both semantic knowledge pertaining to prior world knowledge, and episodic knowledge involving how current events relate to prior and future events in a narrative. Our results show that PARA-COMET outperforms the sentence-level baselines, particularly in generating inferences that are both coherent and novel.

Translated: 2022-10-11 03:05:53 Published: 2021-02-02

# FaultNet: A Deep Convolutional Neural Network for bearing fault classification ( http://arxiv.org/abs/2010.02146v2 )

License: see the linked page

Rishikesh Magar, Lalit Ghule, Junhan Li, Yang Zhao and Amir Barati Farimani

(参考訳) 生産フロアにおける高度なセンサーの存在が増加し、マシンの健康に関する重要な洞察を提供するデータセットの収集につながった。機械の健康の重要かつ信頼性の高い指標である振動信号データにより、機械系で発生した異なる故障の理解を深めることができる。そこで本研究では, 異なる信号処理法を組み合わせることで, 機械系の振動信号データを解析し, 各種軸受故障を分類するための機械学習手法と結合する。また, 異なる信号処理手法を用いることの重要性を強調し, 故障検出の精度への影響を分析する。また,従来の機械学習アルゴリズムとは別に,高い精度で軸受故障の種類を効果的に判定できる畳み込みニューラルネットワークフォールトネットを提案する。本研究の差別化要因は,信号からより多くの情報を抽出するためのチャネルの提案であり,さらに精度の高い信号の分類に有用な特徴を抽出するために,平均チャネルとメディアチャネルを生信号に積み重ねた。

The increased presence of advanced sensors on the production floors has led to the collection of datasets that can provide significant insights into machine health. An important and reliable indicator of machine health, vibration signal data can provide us a greater understanding of different faults occurring in mechanical systems. In this work, we analyze vibration signal data of mechanical systems with bearings by combining different signal processing methods and coupling them with machine learning techniques to classify different types of bearing faults. We also highlight the importance of using different signal processing methods and analyze their effect on accuracy for bearing fault detection. Apart from the traditional machine learning algorithms we also propose a convolutional neural network FaultNet which can effectively determine the type of bearing fault with a high degree of accuracy. The distinguishing factor of this work is the idea of channels proposed to extract more information from the signal, we have stacked the Mean and Median channels to raw signal to extract more useful features to classify the signals with greater accuracy.
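
One way to read "stacking the Mean and Median channels to raw signal" is to filter the raw vibration signal and stack the filtered copies as extra input channels. The abstract does not say how the channels are computed, so the sliding-window filters, the window size, and the function names below are assumptions.

```python
from statistics import mean, median

def sliding(signal, window, reducer):
    """Apply reducer over a centered window, truncated at the signal edges."""
    half = window // 2
    return [
        reducer(signal[max(0, i - half):min(len(signal), i + half + 1)])
        for i in range(len(signal))
    ]

def stack_channels(signal, window=3):
    """Return [raw, mean-filtered, median-filtered] as three channels."""
    return [
        list(signal),
        sliding(signal, window, mean),
        sliding(signal, window, median),
    ]

# A short hypothetical vibration trace with one sharp spike.
channels = stack_channels([0.0, 1.0, 10.0, 1.0, 0.0])
```

The mean channel smears the spike across its neighbors while the median channel suppresses it, so the three channels give a classifier complementary views of the same signal.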

Translated: 2022-10-10 22:25:45 Published: 2021-02-02

# Sufficient dimension reduction for classification using principal optimal transport direction ( http://arxiv.org/abs/2010.09921v4 )

License: see the linked page

Cheng Meng and Jun Yu and Jingyi Zhang and Ping Ma and Wenxuan Zhong

(参考訳) 十分な次元還元は教師付き次元還元アプローチとして広く用いられる。既存の十分な次元縮小法は、連続応答を持つデータのために開発され、カテゴリー応答、特にバイナリ応答に対して不十分な性能を持つ可能性がある。この問題に対処するために,最適輸送を用いた十分次元縮小部分空間(SDR部分空間)の新たな推定法を提案する。提案手法は主最適輸送方向 (POTD) と命名され, 応答カテゴリの異なるデータ間の最適輸送結合の主方向を用いてSDR部分空間の基底を推定する。提案手法は, 十分次元縮小, 支持ベクトルマシン, 最適輸送という, 一見無関係な3つのトピック間の関係も明らかにする。我々はPOTDの漸近特性を調査し、クラスラベルにエラーがない場合、POTDはSDR部分空間のみを推定する。実証的な研究により、POTDは最先端の線形次元還元法よりも優れていた。

Sufficient dimension reduction is used pervasively as a supervised dimension reduction approach. Most existing sufficient dimension reduction methods are developed for data with a continuous response and may have an unsatisfactory performance for the categorical response, especially for the binary-response. To address this issue, we propose a novel estimation method of sufficient dimension reduction subspace (SDR subspace) using optimal transport. The proposed method, named principal optimal transport direction (POTD), estimates the basis of the SDR subspace using the principal directions of the optimal transport coupling between the data respecting different response categories. The proposed method also reveals the relationship among three seemingly irrelevant topics, i.e., sufficient dimension reduction, support vector machine, and optimal transport. We study the asymptotic properties of POTD and show that in the cases when the class labels contain no error, POTD estimates the SDR subspace exclusively. Empirical studies show POTD outperforms most of the state-of-the-art linear dimension reduction methods.

翻訳日:2022-10-05 21:32:36 公開日:2021-02-02

# 自己教師付き音声表現の類似性解析

Similarity Analysis of Self-Supervised Speech Representations ( http://arxiv.org/abs/2010.11481v2 )

ライセンス: Link先を確認

Yu-An Chung and Yonatan Belinkov and James Glass

(参考訳) 近年,自己監督型音声表現学習が盛んに研究されている。大規模非ラベルデータから有用な表現を学習するために多くのアルゴリズムが提案されており、その幅広い音声タスクへの応用も研究されている。しかし、既存のアプローチの性質を理解することに焦点を当てた研究はほとんど行われていない。本研究では,最も代表的な自己教師型アルゴリズムについて比較研究することを目的とする。具体的には,既存の類似性尺度を用いて,異なる自己教師表現間の類似性を定量化する。また,モデルの事前学習損失と学習表現に含まれる特定の音声情報量との関係を調べるための探索タスクも設計した。各種自己教師型モデルが同じ入力でどのように振る舞うかを示すことに加え、本研究では、学習目標がビルディングブロック(RNN/Transformer/CNN)や方向性(ユニ/双方向)といったアーキテクチャ選択よりも表現類似性に高い影響があることも見出した。また,自己教師型アルゴリズムの学習前損失と下流性能との間には強い相関関係があることが示唆された。

Self-supervised speech representation learning has recently been a prosperous research topic. Many algorithms have been proposed for learning useful representations from large-scale unlabeled data, and their applications to a wide range of speech tasks have also been investigated. However, there has been little research focusing on understanding the properties of existing approaches. In this work, we aim to provide a comparative study of some of the most representative self-supervised algorithms. Specifically, we quantify the similarities between different self-supervised representations using existing similarity measures. We also design probing tasks to study the correlation between the models' pre-training loss and the amount of specific speech information contained in their learned representations. In addition to showing how various self-supervised models behave differently given the same input, our study also finds that the training objective has a higher impact on representation similarity than architectural choices such as building blocks (RNN/Transformer/CNN) and directionality (uni/bidirectional). Our results also suggest that there exists a strong correlation between pre-training loss and downstream performance for some self-supervised algorithms.

翻訳日:2022-10-04 05:56:28 公開日:2021-02-02

# RH-Net:強化学習と階層的関係探索によるニューラルネットワーク抽出の改善

RH-Net: Improving Neural Relation Extraction via Reinforcement Learning and Hierarchical Relational Searching ( http://arxiv.org/abs/2010.14255v2 )

ライセンス: Link先を確認

Jianing Wang

(参考訳) 遠隔監視(DS)は,現在ニューラルネットワーク抽出に広く利用されている大規模ヒューリスティックラベルコーパスを生成することを目的としている。しかし、ノイズの多いラベリングやロングテール分布の問題に苦しむ。多くの先進的なアプローチは、通常2つの問題に別々に対処し、両者の相互作用を無視する。本稿では、強化学習と階層型関係探索モジュールを用いて関係抽出を改善するRH-Netという新しいフレームワークを提案する。強化学習を利用して、モデルに高品質なインスタンスを選択するように指示する。次に、データリッチクラスとデータポーアクラス間の相関インスタンスのセマンティクスを共有する階層的関係探索モジュールを提案する。反復過程の間、2つのモジュールは相互作用を続け、ノイズと長い尾の問題を同時に緩和する。広範に使用されるnytデータセットに関する広範囲な実験により、最先端のベースラインよりも大きな改善が得られた。

Distant supervision (DS) aims to generate large-scale heuristically labeled corpora and is currently widely used for neural relation extraction. However, it suffers heavily from noisy labeling and long-tail distribution problems. Many advanced approaches address these two problems separately, ignoring their mutual interactions. In this paper, we propose a novel framework named RH-Net, which utilizes Reinforcement learning and a Hierarchical relational searching module to improve relation extraction. We leverage reinforcement learning to instruct the model to select high-quality instances. We then propose the hierarchical relational searching module to share the semantics from correlative instances between data-rich and data-poor classes. During the iterative process, the two modules keep interacting to alleviate the noisy-labeling and long-tail problems simultaneously. Extensive experiments on the widely used NYT data set clearly show that our method achieves significant improvements over state-of-the-art baselines.

翻訳日:2022-10-02 11:03:29 公開日:2021-02-02

# 連続データにおける活動認識を利用した保護行動検出

Leveraging Activity Recognition to Enable Protective Behavior Detection in Continuous Data ( http://arxiv.org/abs/2011.01776v4 )

ライセンス: Link先を確認

Chongyang Wang, Yuan Gao, Akhil Mathur, Amanda C. De C. Williams, Nicholas D. Lane, Nadia Bianchi-Berthouze

(参考訳) 身体活動中に慢性的な痛み(CP)を訴える人々の保護行動は、身体的および感情的状態を理解する鍵となる。既存のpbd(automatic protective behavior detection)メソッドは、ユーザが事前に定義したアクティビティの事前セグメンテーションに依存する。しかし、実生活では、人々はさりげなく活動する。したがって、これらの活動が慢性的な痛みを伴う人には困難である場合、技術支援は継続的に提供され、活動タイプや保護行動の発生に自動的に適応すべきである。したがって、ユビキタスCP管理を容易にするため、連続データ上で正確なPBDを実現することが重要となる。本稿では、グラフ畳み込みと長寿命メモリ(GC-LSTM)ネットワークを含む新しい階層的HAR-PBDアーキテクチャを用いて、ヒトの活動認識(HAR)とPBDを統合することを提案する。 CP患者データセットを用いたアプローチの詳細な評価により,HAR,GC-LSTMネットワーク,CFCC損失の活用により,ベースラインに対するPBD性能が明らかに向上すること (macro F1 score: 0.81 vs. 0.66, precision-recall Area-under-the-curve (PR-AUC: 0.60 vs. 0.44) が示されている。我々は、CP管理などにおける階層的アーキテクチャのユースケースについて論じる。また、現在の制限や方法についても議論する。

Protective behavior exhibited by people with chronic pain (CP) during physical activities is key to understanding their physical and emotional states. Existing automatic protective behavior detection (PBD) methods rely on pre-segmentation of activities predefined by users. However, in real life, people perform activities casually. Therefore, where those activities present difficulties for people with chronic pain, technology-enabled support should be delivered continuously and automatically adapted to the activity type and the occurrence of protective behavior. Hence, to facilitate ubiquitous CP management, it becomes critical to enable accurate PBD over continuous data. In this paper, we propose to integrate human activity recognition (HAR) with PBD via a novel hierarchical HAR-PBD architecture comprising graph-convolution and long short-term memory (GC-LSTM) networks, and to alleviate class imbalances using a class-balanced focal categorical-cross-entropy (CFCC) loss. Through in-depth evaluation of the approach using a CP patients' dataset, we show that leveraging HAR, GC-LSTM networks, and the CFCC loss leads to a clear increase in PBD performance against the baseline (macro F1 score of 0.81 vs. 0.66 and precision-recall area-under-the-curve (PR-AUC) of 0.60 vs. 0.44). We conclude by discussing possible use cases of the hierarchical architecture in CP management and beyond. We also discuss current limitations and ways forward.

翻訳日:2022-09-30 05:37:52 公開日:2021-02-02
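
The abstract above names a class-balanced focal categorical-cross-entropy (CFCC) loss but does not spell it out. The following is a minimal sketch of one common way to combine the two ideas, assuming the "effective number of samples" class weighting and the standard focal modulation; the function names, `beta`, and `gamma` are illustrative assumptions and may differ from the authors' formulation.

```python
import math


def class_balanced_weights(class_counts, beta=0.999):
    """Weight each class by (1 - beta) / (1 - beta**n_c).

    Assumes every class has at least one training sample (n_c >= 1).
    Rare classes receive larger weights than frequent ones.
    """
    return [(1.0 - beta) / (1.0 - beta ** n) for n in class_counts]


def cb_focal_loss(probs, target, class_counts, gamma=2.0, beta=0.999):
    """Class-balanced focal loss for a single example.

    probs: predicted class probabilities (assumed to sum to 1, with probs[target] > 0).
    target: index of the true class.
    """
    weight = class_balanced_weights(class_counts, beta)[target]
    p_t = probs[target]
    # (1 - p_t)**gamma down-weights examples that are already classified well.
    return -weight * (1.0 - p_t) ** gamma * math.log(p_t)
```

With `class_counts = [1000, 10]`, the second (rare) class gets the larger weight, and a confident correct prediction contributes less loss than an uncertain one.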

# 逆学習と擬似ラベリングによるUWFファウンダス診断モデルの訓練のための正規ファウンダス画像の活用

Leveraging Regular Fundus Images for Training UWF Fundus Diagnosis Models via Adversarial Learning and Pseudo-Labeling ( http://arxiv.org/abs/2011.13816v2 )

ライセンス: Link先を確認

Lie Ju, Xin Wang, Xin Zhao, Paul Bonnington, Tom Drummond, Zongyuan Ge

(参考訳) 近年,光学系カメラによる超広視野(uwf)200-fundusイメージングが,通常の30度-60度ファンダスカメラよりも広い視野でファンダスに関する情報を検出できるため,徐々に導入されている。 uwf の fundus 画像と比較すると、通常の fundus 画像には大量の高品質な注釈付きデータが含まれている。ドメインギャップのため、通常の眼底画像で訓練されたモデルでは、uff眼底画像の認識性能が低下する。そこで,本論文では,医療データの注釈付けが労働集約的かつ時間を要することを考慮し,より効率的なトレーニングのために,UWFファウンダスデータとアノテーションの限定的改善のために,通常のファウンダス画像を活用する方法について検討する。本稿では,通常のUWFファウンダスとUWFファウンダスとのギャップを埋めるために,修正サイクル生成敵ネットワーク(CycleGAN)モデルを提案する。生成したデータの品質を改善し,調整するために,GANの喪失時に一貫性正則化項を提案する。提案手法では,2つのドメインのイメージをペアにしたり,セマンティックラベルを同一にしたりする必要がなく,データ収集に非常に便利である。さらに,提案手法は擬似ラベル方式で生成したラベルなしデータによる雑音や誤差に対して頑健であることを示す。糖尿病性網膜症 (DR) 分類, 病変検出, 下顎骨切開術など, 一般的な基礎疾患や課題に対する方法の有効性を検討した。実験の結果,提案手法は複数のタスクにおいて,学習表現の優れた一般化性と性能向上を同時に達成できることが判明した。

Recently, ultra-widefield (UWF) 200° fundus imaging by Optos cameras has gradually been introduced because it captures more information on the fundus than regular 30° to 60° fundus cameras. Compared with UWF fundus images, regular fundus images contain a large amount of high-quality and well-annotated data. Due to the domain gap, models trained on regular fundus images perform poorly when recognizing UWF fundus images. Hence, given that annotating medical data is labor intensive and time consuming, in this paper we explore how to leverage regular fundus images to compensate for the limited UWF fundus data and annotations and enable more efficient training. We propose the use of a modified cycle generative adversarial network (CycleGAN) model to bridge the gap between regular and UWF fundus images and to generate additional UWF fundus images for training. A consistency regularization term is proposed in the loss of the GAN to improve and regulate the quality of the generated data. Our method does not require that images from the two domains be paired or even that the semantic labels be the same, which provides great convenience for data collection. Furthermore, we show that our method, with the pseudo-labeling technique, is robust to noise and errors introduced by the generated unlabeled data. We evaluated the effectiveness of our methods on several common fundus diseases and tasks, such as diabetic retinopathy (DR) classification, lesion detection and tessellated fundus segmentation. The experimental results demonstrate that our proposed method simultaneously achieves superior generalizability of the learned representations and performance improvements in multiple tasks.

翻訳日:2022-09-20 02:21:52 公開日:2021-02-02

# (参考訳) MAVIDHスコア:胸部X線像を用いた重症度検査

MAVIDH Score: A COVID-19 Severity Scoring using Chest X-Ray Pathology Features ( http://arxiv.org/abs/2011.14983v3 )

ライセンス: CC BY 4.0

Douglas P. S. Gomes, Michael J. Horry, Anwaar Ulhaq, Manoranjan Paul, Subrata Chakraborty, Manash Saha, Tanmoy Debnath, D.M. Motiur Rahaman

(参考訳) 患者の誤分類に関連するリスクを考えると、コンピュータビジョンのCOVID-19診断への応用は複雑で困難である。おそらく、covid-19の医療画像化の主要な価値は患者の予後にある。放射線画像は、病気の重症度を評価する医師を誘導し、同じ患者の異なる段階における一連の画像は、疾患の進行を評価するのに役立つ。そこで,胸部x線から疾患の重症度を判定するための肺病理学的特徴に基づく簡便な方法を提案する。この方法は, 疾患進行の異なる段階における患者の重症度と, 既存のより複雑な方法と比較して, 競争的な結果によく相関する。元のデータ選択アプローチも提案されており、単純なモデルで重大性に関する特徴を学習することができる。ここで示される競争的パフォーマンスは、他の文献のように肺への関与や不透明さに依存するのではなく、機能ベースである方法に関係していると仮定されている。第2の貢献は、疾患の異なる段階の患者グループのスコアとして概念化された結果の検証である。独立データセット上でこのような検証を行うのに加えて,文献における他の評価手法と比較した。以上の結果から,診断システム(MAVIDH)と患者の予後との間に有意な相関関係があることが示唆された。

The application of computer vision for COVID-19 diagnosis is complex and challenging, given the risks associated with patient misclassifications. Arguably, the primary value of medical imaging for COVID-19 lies rather in patient prognosis. Radiological images can guide physicians in assessing the severity of the disease, and a series of images from the same patient at different stages can help to gauge disease progression. Hence, a simple method based on interpretable lung-pathology features for scoring disease severity from chest X-rays is proposed here. As the primary contribution, this method correlates well with patient severity in different stages of disease progression, with competitive results compared to other existing, more complex methods. An original data selection approach is also proposed, allowing the simple model to learn the severity-related features. It is hypothesized that the competitive performance presented here is related to the method being feature-based rather than reliant on lung involvement or opacity, as other methods in the literature are. A second contribution comes from the validation of the results, conceptualized as the scoring of patient groups from different stages of the disease. Besides performing such validation on an independent data set, the results were also compared with other scoring methods proposed in the literature. The results show that there is a significant correlation between the scoring system (MAVIDH) and patient outcome, which could potentially help physicians rate and follow disease progression in COVID-19 patients.

翻訳日:2021-06-07 00:49:22 公開日:2021-02-02

# 自己修正Q-Learning

Self-correcting Q-Learning ( http://arxiv.org/abs/2012.01100v2 )

ライセンス: Link先を確認

Rong Zhu and Mattia Rigotti

(参考訳) Q学習アルゴリズムは、最大化バイアス、すなわち行動価値の体系的な過大評価の影響を受けることが知られており、これは最近改めて注目されている重要な問題である。このバイアスを緩和する効率的なアルゴリズムとして、ダブルQ学習が提案されている。しかしこれは、メモリ要求の増加と収束の遅さに加えて、行動価値の過小評価という代償を伴う。本稿では,期待値の最大値を近似する「自己修正アルゴリズム」という形で,最大化バイアスに対処する新しい手法を提案する。本手法は,従来のQ学習で用いられる単一推定器の過大評価と,ダブルQ学習で用いられる二重推定器の過小評価とのバランスをとる。この戦略をQ学習に適用すると、自己修正Q学習が得られる。理論的には,このアルゴリズムはQ学習と同じ収束保証を持ちながら,より高精度であることを示す。実験的には、報酬の分散が大きいドメインではダブルQ学習よりも性能が良く、報酬の分散がゼロまたは小さいドメインではQ学習よりも高速に収束する。これらの利点は、自己修正DQNと呼ぶディープQネットワークの実装にも引き継がれ、Atari 2600ドメインのいくつかのタスクにおいて、通常のDQNとダブルDQNを上回る。

The Q-learning algorithm is known to be affected by the maximization bias, i.e. the systematic overestimation of action values, an important issue that has recently received renewed attention. Double Q-learning has been proposed as an efficient algorithm to mitigate this bias. However, this comes at the price of an underestimation of action values, in addition to increased memory requirements and a slower convergence. In this paper, we introduce a new way to address the maximization bias in the form of a "self-correcting algorithm" for approximating the maximum of an expected value. Our method balances the overestimation of the single estimator used in conventional Q-learning and the underestimation of the double estimator used in Double Q-learning. Applying this strategy to Q-learning results in Self-correcting Q-learning. We show theoretically that this new algorithm enjoys the same convergence guarantees as Q-learning while being more accurate. Empirically, it performs better than Double Q-learning in domains with rewards of high variance, and it even attains faster convergence than Q-learning in domains with rewards of zero or low variance. These advantages transfer to a Deep Q Network implementation that we call Self-correcting DQN and which outperforms regular DQN and Double DQN on several tasks in the Atari 2600 domain.

翻訳日:2021-05-25 03:52:26 公開日:2021-02-02
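
The maximization bias described in the abstract above can be illustrated numerically. The following is a minimal sketch, not the paper's self-correcting estimator: it only contrasts the single estimator (maximum of sample means, as in Q-learning) with the double estimator (select on one sample set, evaluate on another, as in Double Q-learning). The Gaussian rewards, sample sizes, and function names are assumptions made for illustration.

```python
import random
import statistics


def single_estimator(samples_per_action):
    """Maximum of the per-action sample means (tends to overestimate)."""
    return max(statistics.fmean(s) for s in samples_per_action)


def double_estimator(samples_a, samples_b):
    """Choose the best action on set A, then evaluate that action on set B."""
    means_a = [statistics.fmean(s) for s in samples_a]
    best = max(range(len(means_a)), key=means_a.__getitem__)
    return statistics.fmean(samples_b[best])


def average_estimates(true_means, n_samples=10, trials=2000, seed=0):
    """Average both estimators over many independent trials."""
    rng = random.Random(seed)

    def draw():
        # One unit-variance Gaussian sample set per action.
        return [[rng.gauss(mu, 1.0) for _ in range(n_samples)] for mu in true_means]

    single_total = double_total = 0.0
    for _ in range(trials):
        set_a, set_b = draw(), draw()
        single_total += single_estimator(set_a)
        double_total += double_estimator(set_a, set_b)
    return single_total / trials, double_total / trials
```

Calling `average_estimates([0.5, 0.4, 0.3, 0.2, 0.1])` returns a single-estimator average above the true maximum of 0.5 and a double-estimator average below it, which is the gap the self-correcting approach aims to balance.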

# (参考訳) 都市交通・環境の構造化記述と分類のための6層モデル

6-Layer Model for a Structured Description and Categorization of Urban Traffic and Environment ( http://arxiv.org/abs/2012.06319v2 )

ライセンス: CC BY 4.0

Maike Scholtes, Lukas Westhofen, Lara Ruth Turner, Katrin Lotto, Michael Schuldes, Hendrik Weber, Nicolas Wagener, Christian Neurohr, Martin Bollmann, Franziska K\"ortke, Johannes Hiller, Michael Hoss, Julian Bock, Lutz Eckstein

(参考訳) 自動運転機能の検証と検証には大きな課題が伴う。現在、シナリオベースのアプローチは研究や産業において研究されており、安全関連シナリオを特定することでテストの労力を減らすことを目指している。これらのシナリオを定義し、複雑な実世界設計ドメインで運用するには、環境の構造化された記述が必要である。 PEGASUS研究プロジェクトでは、高速道路のシナリオを記述するために6層モデル (6LM) が導入された。本稿では6LMを改良し,都市交通と環境に拡張する。 PEGASUSで定義されているように、6LMは環境を分類し、その後のシナリオ記述のための構造化された基盤として機能する。このモデルは、知識を組み込んだり、アクターの機能を予測することなく、一般的な環境の構造化された記述と分類を可能にする。その他にも,本論文で詳述した 6lm の応用が数多く存在する。 6LMには、道路ネットワークと交通誘導対象、路面構造、前者の一時的な修正、動的オブジェクト、環境条件、デジタル情報などが記述されている。手前の作業は、各レイヤをアイテムを分類することによって指定する。対象環境記述のためのモデルの適用を標準化するためのガイドラインを定式化し、解説例を提示する。以前の出版物とは対照的に、モデルとその設計はより詳細に記述されている。最後に、提示された6LMの全体的記述には、概念を機械知覚の側面に拡張する際の将来の作業の可能性についての言及が含まれている。

Verification and validation of automated driving functions impose large challenges. Currently, scenario-based approaches are investigated in research and industry, aiming at a reduction of testing efforts by specifying safety relevant scenarios. To define those scenarios and operate in a complex real-world design domain, a structured description of the environment is needed. Within the PEGASUS research project, the 6-Layer Model (6LM) was introduced for the description of highway scenarios. This paper refines the 6LM and extends it to urban traffic and environment. As defined in PEGASUS, the 6LM provides the possibility to categorize the environment and, therefore, functions as a structured basis for subsequent scenario description. The model enables a structured description and categorization of the general environment, without incorporating any knowledge or anticipating any functions of actors. Beyond that, there is a variety of other applications of the 6LM, which are elaborated in this paper. The 6LM includes a description of the road network and traffic guidance objects, roadside structures, temporary modifications of the former, dynamic objects, environmental conditions and digital information. The work at hand specifies each layer by categorizing its items. Guidelines are formulated and explanatory examples are given to standardize the application of the model for an objective environment description. In contrast to previous publications, the model and its design are described in far more detail. Finally, the holistic description of the 6LM presented includes remarks on possible future work when expanding the concept to machine perception aspects.

翻訳日:2021-05-16 08:31:24 公開日:2021-02-02

# 正規化ニューラルネットワークを用いたフレキシブル非パラメトリックモデリング

Flexible, Non-parametric Modeling Using Regularized Neural Networks ( http://arxiv.org/abs/2012.11369v2 )

ライセンス: Link先を確認

Oskar Allerbo, Rebecka J\"ornsten

(参考訳) 一般化付加モデル(GAM)のような非パラメトリック回帰は、柔軟で解釈可能な方法で複雑なデータ依存関係をキャプチャすることができる。しかし、付加コンポーネントのフォーマットを選択するには、しばしば非自明なデータ探索が必要である。本稿では,近位勾配降下と適応lassoを訓練した,一層ニューラルネットワークを用いたgamsの代替手法であるprada-netを提案する。 PrAda-netは、ニューラルネットワークのサイズとアーキテクチャを自動的に調整し、基盤となるデータ生成モデルの複雑さと構造をキャプチャする。 PrAda-netにより得られたコンパクトネットワークは、自動モデル選択による非パラメトリック統計モデリングに適した付加モデルコンポーネントに変換できる。シミュレーションデータ上でPrAda-netを実演し、PrAda-netの試験誤差性能、変数の重要度、変数のサブセット識別特性を他のラッソベースのアプローチと比較する。我々はまた、prada-netをイギリスの巨大なブラックスモークデータセットに適用し、prada-netをgamsの代替品として使う能力を示す。加法成分の関数形式を選択するのにドメイン知識を必要とするGAMとは対照的に、プラダネットはそのような事前選択は必要とせず、それでも解釈可能な加法成分をもたらす。

Non-parametric regression, such as generalized additive models (GAMs), is able to capture complex data dependencies in a flexible, yet interpretable way. However, choosing the format of the additive components often requires non-trivial data exploration. Here, we propose an alternative to GAMs, PrAda-net, which uses a one-hidden-layer neural network, trained with proximal gradient descent and adaptive lasso. PrAda-net automatically adjusts the size and architecture of the neural network to capture the complexity and structure of the underlying data generative model. The compact network obtained by PrAda-net can be translated to additive model components, making it suitable for non-parametric statistical modelling with automatic model selection. We demonstrate PrAda-net on simulated data, where we compare the test error performance, variable importance and variable subset identification properties of PrAda-net to other lasso-based approaches. We also apply PrAda-net to the massive U.K. black smoke data set, to demonstrate the capability of PrAda-net as an alternative to GAMs. In contrast to GAMs, which often require domain knowledge to select the functional forms of the additive components, PrAda-net requires no such pre-selection while still resulting in interpretable additive components.

翻訳日:2021-05-01 17:57:30 公開日:2021-02-02
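
The abstract above mentions training with proximal gradient descent and adaptive lasso. The following is a minimal sketch of those two ingredients on a flat weight vector, assuming the usual soft-thresholding proximal operator and penalties inversely proportional to reference weight magnitudes; it is not the PrAda-net training procedure itself, and all names and defaults are illustrative.

```python
def soft_threshold(value, threshold):
    """Proximal operator of threshold * |x|: shrink towards zero, clip at zero."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def adaptive_penalties(reference_weights, base_penalty, gamma=1.0, eps=1e-8):
    """Adaptive lasso: weights that were small in a reference fit are penalised more."""
    return [base_penalty / (abs(w) ** gamma + eps) for w in reference_weights]


def proximal_step(weights, grads, step_size, penalties):
    """One proximal gradient step: gradient update, then per-weight soft-thresholding."""
    return [
        soft_threshold(w - step_size * g, step_size * lam)
        for w, g, lam in zip(weights, grads, penalties)
    ]
```

Because soft-thresholding sets small weights exactly to zero, repeated steps can remove connections entirely, which is what allows a network to shrink to a compact, interpretable structure.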

# (参考訳) 適応的双方向注意:機械読取理解のための多角性表現の探索

Adaptive Bi-directional Attention: Exploring Multi-Granularity Representations for Machine Reading Comprehension ( http://arxiv.org/abs/2012.10877v2 )

ライセンス: CC BY 4.0

Nuo Chen, Fenglin Liu, Chenyu You, Peilin Zhou, Yuexian Zou

(参考訳) 近年,Transformer などの注意機構を強化した多層エンコーダは,Machine Reading Comprehension (MRC) において広く研究されている。答えを予測するには、ソースシーケンス（パッセージと質問）の粗い粒度 (coarse-grained) の表現を生成する最終エンコーダ層のみから情報を取り出す予測器を用いるのが一般的である。以前の研究では、エンコーディング層が深くなるにつれて、ソースシーケンスの表現は細かい粒度 (fine-grained) から粗い粒度へと変化することが示されている。ディープニューラルネットワークの層数が増加するにつれて、エンコーディングプロセスは各位置に関する関連情報を徐々に収集し、その結果、より粗い粒度の表現が生まれ、他の位置と類似する可能性（同質性）が高まると一般に考えられている。このような現象はモデルに誤った判断をさせ、性能を低下させる。そこで本研究では,異なるレベルのソース表現を適応的に予測器に活用するAdaptive Bidirectional Attentionという手法を提案する。ベンチマークデータセットであるSQuAD 2.0の実験結果は、我々のアプローチの有効性を示し、従来の最先端モデルをEMで2.5%、F1スコアで2.3%上回る。

Recently, attention-enhanced multi-layer encoders, such as the Transformer, have been extensively studied in Machine Reading Comprehension (MRC). To predict the answer, it is common practice to employ a predictor that draws information only from the final encoder layer, which generates coarse-grained representations of the source sequences, i.e., passage and question. Previous studies have shown that the representation of the source sequence shifts from fine-grained to coarse-grained as the encoding depth increases. It is generally believed that, as the number of layers in a deep neural network grows, the encoding process gathers more and more relevant information for each location, resulting in more coarse-grained representations, which increases the likelihood of similarity to other locations (referred to as homogeneity). Such a phenomenon can mislead the model into making wrong judgments and thereby degrade performance. To this end, we propose a novel approach called Adaptive Bidirectional Attention, which adaptively exploits the source representations of different levels for the predictor. Experimental results on the benchmark dataset SQuAD 2.0 demonstrate the effectiveness of our approach, which outperforms the previous state-of-the-art model by 2.5% EM and 2.3% F1.

翻訳日:2021-05-01 08:45:32 公開日:2021-02-02

# (参考訳) 糖尿病網膜症における病変の局在

Towards the Localisation of Lesions in Diabetic Retinopathy ( http://arxiv.org/abs/2012.11432v2 )

ライセンス: CC BY 4.0

Samuel Ofosu Mensah, Bubacarr Bah, Willie Brink

(参考訳) 畳み込みニューラルネットワーク(CNN)は近年,糖尿病性網膜症(DR)基底画像の分類に成功している。しかし、cnnのより深い表現は、空間分解能を犠牲にして高レベルの意味論を捉えうる。眼科医にとって有用な予測を行うために,深層学習モデルのペナルティファイト層上に勾配強調クラスアクティベーションマッピング(grad-cam)と呼ばれるポストアテンション技術を用いて,眼底画像上の粗い局所化マップを作成する。これは画像の識別領域を特定するのに役立ち、眼科医が早期診断によって命を救える証拠となる。具体的には、4つの最先端ディープラーニングモデルの事前学習重量を用いて、DRファンダス画像のローカライズマップを作成し、比較する。 VGG16、ResNet50、InceptionV3、InceptionResNetV2が使用されている。 InceptionV3は96.07%の精度で最高の性能を達成し、ローカライズ病変は他のモデルよりも良く高速であることがわかった。

Convolutional Neural Networks (CNNs) have recently been used successfully to classify diabetic retinopathy (DR) fundus images. However, deeper representations in CNNs may capture higher-level semantics at the expense of spatial resolution. To make predictions usable for ophthalmologists, we use a post-attention technique called Gradient-weighted Class Activation Mapping (Grad-CAM) on the penultimate layer of deep learning models to produce coarse localisation maps on DR fundus images. This helps identify discriminative regions in the images, consequently providing evidence for ophthalmologists to make a diagnosis and potentially save lives through early diagnosis. Specifically, this study uses pre-trained weights from four state-of-the-art deep learning models to produce and compare localisation maps of DR fundus images. The models used include VGG16, ResNet50, InceptionV3, and InceptionResNetV2. We find that InceptionV3 achieves the best performance, with a test classification accuracy of 96.07%, and localises lesions better and faster than the other models.

翻訳日:2021-04-27 12:17:57 公開日:2021-02-02
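
The Grad-CAM step referred to in the abstract above reduces to a weighted sum of feature maps followed by a ReLU. The following is a minimal sketch in plain Python, assuming the feature maps and the gradients of the class score with respect to them have already been obtained from some deep learning framework; the nested-list layout and the function name are illustrative assumptions, not the study's code.

```python
def grad_cam(activations, gradients):
    """Combine K feature maps into one coarse localisation map.

    activations, gradients: lists of K maps, each an H x W nested list,
    where gradients[k][i][j] is d(class score) / d(activations[k][i][j]).
    """
    height, width = len(activations[0]), len(activations[0][0])
    # Channel weight: global average of the gradient over all spatial positions.
    weights = [sum(map(sum, grad)) / (height * width) for grad in gradients]
    cam = [[0.0] * width for _ in range(height)]
    for weight, fmap in zip(weights, activations):
        for i in range(height):
            for j in range(width):
                cam[i][j] += weight * fmap[i][j]
    # ReLU keeps only the regions that push the class score up.
    return [[max(0.0, value) for value in row] for row in cam]
```

The resulting map has the spatial size of the chosen layer, so in practice it is upsampled to the input resolution before being overlaid on the fundus image.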

# bayescard: 濃度推定のためのベイズフレームワークの復活

BayesCard: Revitilizing Bayesian Frameworks for Cardinality Estimation ( http://arxiv.org/abs/2012.14743v2 )

ライセンス: Link先を確認

Ziniu Wu, Amir Shaikhha, Rong Zhu, Kai Zeng, Yuxing Han, Jingren Zhou

(参考訳) 基数推定(CardEst)はクエリオプティマイザの重要な要素であり、DBMSの基本的な問題である。望ましいCardEst手法は、優れたアルゴリズム性能を達成し、さまざまなデータ設定に対して安定であり、システムへのデプロイが容易であるべきである。しかし、既存のCardEst手法でこの3つの基準を同時に満たすものはない。従来の手法では、大きな推定誤差のような重大なアルゴリズム上の欠点があることが多い。最近提案されたディープラーニングに基づく手法は推定精度を大幅に改善するが、その性能はデータに大きく左右され、システムへのデプロイが困難なことが多い。本稿では,確率的プログラミング言語の技法を取り入れて,CardEstのためのベイジアンネットワーク(BN)を再活性化する。我々は、BNの利点、すなわち高い推定精度と解釈可能性を継承しつつ、その欠点、すなわち構造学習と推論の効率の低さを克服する最初のフレームワークであるBayesCardを紹介する。これにより、BayesCardは商用DBMSへのデプロイの完璧な候補となる。複数の単一テーブルおよび複数テーブルのベンチマークでの実験結果は、既存の最先端CardEst手法に対するBayesCardの優位性を示している。BayesCardは、同等かそれ以上の精度、1-2桁高速な推論時間、1-3桁高速なトレーニング時間、1-3桁小さなモデルサイズ、1-2桁高速な更新を実現する。また、BayesCardは、データの設定を変化させても安定した性能を維持する。さらに、BayesCardをPostgreSQLにデプロイした。IMDBベンチマークのワークロードでは、エンドツーエンドのクエリ時間を13.3%改善し、真の基数のオラクルを使用した場合の最適な結果である14.2%に非常に近い。

Cardinality estimation (CardEst) is an essential component in query optimizers and a fundamental problem in DBMS. A desirable CardEst method should attain good algorithm performance, be stable across varied data settings, and be friendly to system deployment. However, no existing CardEst method can fulfill the three criteria at the same time. Traditional methods often have significant algorithmic drawbacks such as large estimation errors. Recently proposed deep learning based methods largely improve the estimation accuracy, but their performance can be greatly affected by the data and they are often difficult to deploy in a system. In this paper, we revitalize Bayesian networks (BNs) for CardEst by incorporating the techniques of probabilistic programming languages. We present BayesCard, the first framework that inherits the advantages of BNs, i.e., high estimation accuracy and interpretability, while overcoming their drawbacks, i.e., low structure learning and inference efficiency. This makes BayesCard a perfect candidate for commercial DBMS deployment. Our experimental results on several single-table and multi-table benchmarks indicate BayesCard's superiority over existing state-of-the-art CardEst methods: BayesCard achieves comparable or better accuracy, 1-2 orders of magnitude faster inference time, 1-3 orders faster training time, 1-3 orders smaller model size, and 1-2 orders faster updates. Meanwhile, BayesCard keeps stable performance when varying data with different settings. We also deploy BayesCard into PostgreSQL. On the IMDB benchmark workload, it improves the end-to-end query time by 13.3%, which is very close to the optimal result of 14.2% using an oracle of true cardinality.

翻訳日:2021-04-18 20:28:24 公開日:2021-02-02
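
The idea of estimating cardinalities from a Bayesian network, as described in the abstract above, can be sketched on a toy table. The example below assumes a hand-fixed chain structure A -> B -> C over three categorical columns and an equality predicate on each; BayesCard itself learns the structure and relies on probabilistic programming techniques, none of which is reproduced here. All function names are hypothetical.

```python
from collections import Counter


def fit_chain(rows):
    """Collect the count statistics needed for the chain A -> B -> C."""
    return {
        "n": len(rows),
        "a": Counter(a for a, _, _ in rows),
        "ab": Counter((a, b) for a, b, _ in rows),
        "b": Counter(b for _, b, _ in rows),
        "bc": Counter((b, c) for _, b, c in rows),
    }


def estimate_cardinality(model, a, b, c):
    """Estimate |{rows : A=a, B=b, C=c}| as N * P(a) * P(b|a) * P(c|b)."""
    if model["a"][a] == 0 or model["b"][b] == 0:
        return 0.0  # a value never seen in training has estimated probability zero
    p_a = model["a"][a] / model["n"]
    p_b_given_a = model["ab"][(a, b)] / model["a"][a]
    p_c_given_b = model["bc"][(b, c)] / model["b"][b]
    return model["n"] * p_a * p_b_given_a * p_c_given_b
```

The model stores only pairwise counts rather than the full joint distribution, which is why such factorised estimators stay small; the price is an approximation error whenever C depends on A beyond what B explains.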

# (参考訳) 収縮とスプラインバイニングを備えたエビデンス2.0

Weight-of-evidence 2.0 with shrinkage and spline-binning ( http://arxiv.org/abs/2101.01494v2 )

ライセンス: CC BY 4.0

Jakob Raymaekers, Wouter Verbeke, Tim Verdonck

(参考訳) 詐欺検出、信用リスクモデリング、医療意思決定など、多くの実用的な応用において、事前定義されたクラスにインスタンスを割り当てる分類モデルは、正確かつ解釈可能である必要がある。ロジスティック回帰のような線形モデリング手法は、精度と解釈可能性のバランスが許容できるため、しばしば採用される。しかし、線形法は、高カルジナリティを持つカテゴリー予測器を扱ったり、データの非線形関係を利用するには不十分である。解法として、ウェイト・オブ・エビデンスのようなデータ前処理法は一般的に予測器の変換に使用される。しかし、エビデンスウェイト・オブ・エビデンス・アプローチの根底にあるビンニング手順はほとんど研究されておらず、通常はアドホックや専門家主導の手順に依存している。そこで本研究では,形式化されたデータ駆動型,強力な手法を提案する。この目的のために,スプライン関数のバイナリ化を通じて連続変数の離散化を探求し,予測変数の非線形効果を捕捉し,少数の離散値のみを取り込む高度に解釈可能な予測器を得る。さらに,重み付けアプローチを拡張し,収縮推定器を用いて比率を推定する手法を提案する。これにより、非線形とカテゴリー予測の両方を活用する能力が向上し、分類精度が向上し、結果モデルの解釈可能性を維持し、オーバーフィッティングのリスクを低減できる。本稿では,提案手法の有効性を示す詐欺検出セットにおける一連の実験結果を示す。提案した結果の再現と,提案手法の採用を容易にするため,提案手法と実験実装のためのデータセットとコードの両方を提供する。

In many practical applications, such as fraud detection, credit risk modeling or medical decision making, classification models for assigning instances to a predefined set of classes are required to be both precise and interpretable. Linear modeling methods such as logistic regression are often adopted, since they offer an acceptable balance between precision and interpretability. Linear methods, however, are not well equipped to handle categorical predictors with high cardinality or to exploit non-linear relations in the data. As a solution, data preprocessing methods such as weight-of-evidence are typically used for transforming the predictors. The binning procedure that underlies the weight-of-evidence approach, however, has been little researched and typically relies on ad-hoc or expert-driven procedures. The objective of this paper, therefore, is to propose a formalized, data-driven and powerful method. To this end, we explore the discretization of continuous variables through the binning of spline functions, which allows for capturing non-linear effects in the predictor variables and yields highly interpretable predictors taking only a small number of discrete values. Moreover, we extend the weight-of-evidence approach and propose to estimate the proportions using shrinkage estimators. Together, this offers an improved ability to exploit both non-linear and categorical predictors for achieving increased classification precision, while maintaining interpretability of the resulting model and decreasing the risk of overfitting. We present the results of a series of experiments in a fraud detection setting, which illustrate the effectiveness of the presented approach. We facilitate reproduction of the presented results and adoption of the proposed approaches by providing both the dataset and the code for implementing the experiments and the presented approach.

翻訳日:2021-04-11 17:43:04 公開日:2021-02-02
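
The weight-of-evidence transformation discussed in the abstract above maps each category to the log-ratio of its share among positives to its share among negatives. The following is a minimal sketch, assuming binary labels and simple additive smoothing as a stand-in for the shrinkage estimators proposed in the paper, whose exact form is not given in the abstract; the function name and the `smoothing` default are illustrative.

```python
import math
from collections import Counter


def weight_of_evidence(categories, labels, smoothing=0.5):
    """Return {category: WoE} for a categorical predictor and 0/1 labels.

    WoE = ln( P(category | label=1) / P(category | label=0) ), with additive
    smoothing so that categories missing from one class stay finite.
    """
    pos = Counter(c for c, y in zip(categories, labels) if y == 1)
    neg = Counter(c for c, y in zip(categories, labels) if y == 0)
    levels = sorted(set(categories))
    n_levels = len(levels)
    n_pos, n_neg = sum(pos.values()), sum(neg.values())
    woe = {}
    for level in levels:
        share_pos = (pos[level] + smoothing) / (n_pos + smoothing * n_levels)
        share_neg = (neg[level] + smoothing) / (n_neg + smoothing * n_levels)
        woe[level] = math.log(share_pos / share_neg)
    return woe
```

Replacing each category by its WoE value gives a single numeric predictor that a logistic regression can use directly, which is how the transformation preserves interpretability for high-cardinality variables.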

# (参考訳) 直交性制約による解釈可能なcovid-19胸部x線分類

Interpretable COVID-19 Chest X-Ray Classification via Orthogonality Constraint ( http://arxiv.org/abs/2102.08360v1 )

ライセンス: CC BY 4.0

Ella Y. Wang, Anirudh Som, Ankita Shukla, Hongjun Choi, Pavan Turaga

(参考訳) ディープニューラルネットワークは、いくつかの診断タスクのパフォーマンスを改善する能力のため、医療アプリケーションにおける補助ツールとしてますます使われてきた。しかし, 深層学習系では信頼性, 一般化性, 解釈性に限界があるため, 臨床現場では広く採用されていない。その結果、ネットワークトレーニング中に追加の制約を課す方法が開発され、より制御しやすくなり、解釈性が向上し、医療コミュニティへの受け入れが促進された。本研究は,胸部X線画像から新型コロナウイルスの症例を分類するために,Orthogonal Spheres (OS) 制約を用いることの利点を検討する。 OS制約は、分類ネットワークトレーニング中の標準的なクロスエントロピー損失と合わせて用いられる単純な正則性項として記述することができる。従来の研究では、このような制約をディープラーニングモデルに適用する上で、大きなメリットが示されている。以上の結果から, 正規化損失関数はGradCAM視覚化による意味的局所化, 分類性能の向上, モデル校正誤差の低減を効果的に実現できることが示唆された。提案手法は2クラス分類と3クラス分類でそれぞれ1.6%,4.8%の精度向上を実現し,データ拡張を施したモデルでは同様の結果が得られた。これらの知見に加えて,本研究は,医療におけるOSレギュラーライザの新たな応用を提示し,臨床現場での導入を促進するために,COVID-19分類のためのディープラーニングモデルのポストホック解釈性と性能を高めた。また、今後のさらなる研究のために検討できる戦略の限界も特定します。

Deep neural networks have increasingly been used as an auxiliary tool in healthcare applications, due to their ability to improve the performance of several diagnosis tasks. However, these methods are not widely adopted in clinical settings due to practical limitations in the reliability, generalizability, and interpretability of deep learning based systems. As a result, methods have been developed that impose additional constraints during network training to gain more control and improve interpretability, facilitating their acceptance in the healthcare community. In this work, we investigate the benefit of using the Orthogonal Spheres (OS) constraint for classification of COVID-19 cases from chest X-ray images. The OS constraint can be written as a simple orthonormality term which is used in conjunction with the standard cross-entropy loss during classification network training. Previous studies have demonstrated significant benefits in applying such constraints to deep learning models. Our findings corroborate these observations, indicating that the orthonormality loss function effectively produces improved semantic localization via GradCAM visualizations, enhanced classification performance, and reduced model calibration error. Our approach achieves an improvement in accuracy of 1.6% and 4.8% for two- and three-class classification, respectively; similar results are found for models with data augmentation applied. In addition to these findings, our work also presents a new application of the OS regularizer in healthcare, increasing the post-hoc interpretability and performance of deep learning models for COVID-19 classification to facilitate adoption of these methods in clinical settings. We also identify limitations of our strategy that can be explored in future research.

翻訳日:2021-04-06 07:59:36 公開日:2021-02-02
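
The abstract above says the OS constraint can be written as a simple orthonormality term added to the cross-entropy loss. The following is a minimal sketch of a generic orthonormality penalty, assuming it is applied to the rows of some matrix (for example, feature vectors or weight vectors); which tensor the authors regularise and how the term is scaled are not stated in the abstract, so this is an illustration only.

```python
def orthonormality_penalty(rows):
    """Squared Frobenius norm of (R R^T - I) for a matrix given as a list of rows.

    The penalty is zero exactly when the rows are mutually orthogonal unit vectors.
    """
    penalty = 0.0
    for i, u in enumerate(rows):
        for j, v in enumerate(rows):
            gram = sum(a * b for a, b in zip(u, v))
            target = 1.0 if i == j else 0.0
            penalty += (gram - target) ** 2
    return penalty
```

In training, such a term would typically be multiplied by a small coefficient and added to the classification loss, so that the network is nudged towards decorrelated directions without overriding the main objective.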

# (参考訳) NFV対応ゼロタッチ6GネットワークのアクティブおよびAoI対応故障回復:モデルフリーDRLアプローチ

Proactive and AoI-aware Failure Recovery for Stateful NFV-enabled Zero-Touch 6G Networks: Model-Free DRL Approach ( http://arxiv.org/abs/2103.03817v1 )

ライセンス: CC BY 4.0

Amirhossein Shaghaghi, Abolfazl Zakeri (Student Member, IEEE), Nader Mokari (Senior Member, IEEE), Mohammad Reza Javan (Senior Member, IEEE), Mohammad Behdadfar and Eduard A Jorswieck (Fellow, IEEE)

(参考訳) 本稿では,ネットワーク機能仮想化(NFV)実現ネットワークにおける組込みステートフル仮想ネットワーク機能(VNF)に対するゼロタッチPFR(ZT-PFR)と呼ばれるモデルフリー深部強化学習(DRL)に基づくプロアクティブ障害回復(PFR)フレームワークを提案する。 ZT-PFRの概念を実現するには,ネットワーク状態に基づく逐次意思決定が必要である。そこで本研究では,資源コストや不当な決定ペナルティを含むネットワークコスト関数を最小化し,効率的な資源利用のための最適化問題を定式化する。 ETSI と ITU に着想を得て,各 VNF 状態遷移がマルコフ過程に従うような,新しい入出力故障モデルを提案する。そこで本研究では,ソフトアクター・アクティクスや近位ポリシー最適化など,最先端のDRLベースの手法を提案する。さらに,ネットワーク状態の監視情報を適切な決定をするために,網状状態の監視情報を許容レベルに維持するために,イベントとスケジュールに基づく監視のバランスをとるために,情報年齢の概念(AoI)を適用した。いくつかのシミュレーションシナリオでは,本アルゴリズムの有効性を示し,ベースラインとの比較を行った。解析およびシミュレーション結果から,PFRのためのいくつかの重要なシステムとDRLアルゴリズムの設計知見を抽出した。例えば、DRLエージェント構造内の長短時間メモリ(LSTM)層で構成されるハイブリッドニューラルネットワークを使用して、差し迫った障害時間依存性をキャプチャします。

In this paper, we propose a model-free deep reinforcement learning (DRL)-based proactive failure recovery (PFR) framework called zero-touch PFR (ZT-PFR) for the embedded stateful virtual network functions (VNFs) in network function virtualization (NFV) enabled networks. To realize the ZT-PFR concept, sequential decision-making based on network status is necessary. To this end, we formulate an optimization problem for efficient resource usage by minimizing a defined network cost function that includes the resource cost and a wrong-decision penalty. Inspired by ETSI and ITU, we propose a novel impending failure model where each VNF state transition follows a Markov process. As a solution, we propose state-of-the-art DRL-based methods such as soft actor-critic and proximal policy optimization. Moreover, to keep network state monitoring information at an acceptable level of freshness in order to make appropriate decisions, we apply the concept of the age of information (AoI) to strike a balance between event-based and scheduling-based monitoring. Several simulation scenarios are considered to show the effectiveness of our algorithm and provide a fair comparison with baselines. Several key system and DRL algorithm design insights for PFR are drawn from our analysis and simulation results. For example, we use a hybrid neural network, consisting of long short-term memory (LSTM) layers in the DRL agent structure, to capture impending failure time dependency.

翻訳日:2021-04-06 07:49:01 公開日:2021-02-02

# 神経推論における活性化関数の使用の形式化

Formalising the Use of the Activation Function in Neural Inference ( http://arxiv.org/abs/2102.04896v1 )

ライセンス: Link先を確認

Dalton A R Sakthivadivel

(参考訳) 本研究では,活性化関数を用いて神経発火をどのように抽象的に記述できるか,そしてなぜ活性化関数が人工ニューラルネットワークでうまく機能するのかを検討する。生物学的ニューロンのスパイクが,統計物理学における相転移の特定の普遍性クラスに属することを議論する。次に,人工ニューロンが数学的には生物の神経膜ダイナミクスの平均場モデルであり,スパイクを相転移としてモデル化することから生じることを示す。これにより,選択的な神経発火を抽象的に扱い,パーセプトロン学習における活性化関数の役割を定式化できる。このモデルを導出し,対応する神経系の場合を特定するとともに,相転移を解析してニューラルネットワーク学習の物理を理解する。以上から,正準的な活性化関数の出現と性能には生物学的意味だけでなく物理的正当性もあることが示され,ニューラルネットワークの学習と推論への影響も議論する。

We investigate how activation functions can be used to describe neural firing in an abstract way, and in turn, why they work well in artificial neural networks. We discuss how a spike in a biological neurone belongs to a particular universality class of phase transitions in statistical physics. We then show that the artificial neurone is, mathematically, a mean field model of biological neural membrane dynamics, which arises from modelling spiking as a phase transition. This allows us to treat selective neural firing in an abstract way, and formalise the role of the activation function in perceptron learning. Along with deriving this model and specifying the analogous neural case, we analyse the phase transition to understand the physics of neural network learning. Together, it is shown that there is not only a biological meaning, but also a physical justification, for the emergence and performance of canonical activation functions; implications for neural learning and inference are also discussed.

翻訳日:2021-04-05 00:32:45 公開日:2021-02-02

# ランキング vs. 分類:知識ベース完了品質の測定

Ranking vs. Classifying: Measuring Knowledge Base Completion Quality ( http://arxiv.org/abs/2102.06145v1 )

ライセンス: Link先を確認

Marina Speranskaya, Martin Schmitt, Benjamin Roth

(参考訳) 知識ベース補完(KBC)法は,候補となる事実の尤度を推定することで,知識ベース(KB)に存在する情報から欠落した事実を推論することを目的とする。現在主流の評価パラダイムでは,モデルは新しい事実を受け入れるべきかどうかを実際には判定せず,他の候補との尤度ランキングにおける真の事実の順位だけで評価される。我々は,実際のKBC品質を反映するには二値予測を考慮することが不可欠であると主張し,現実的なシナリオに対してより透明なモデル選択基準を与える新しい評価パラダイムを提案する。 FB14k-QAQというデータセットを構築し,単一の事実の代わりにKBクエリ,すなわち1つのエンティティを変数に置き換えた事実を用い,正しい答えとなるエンティティの集合を対応付ける。これらの正しい答えの一部をデータセットからランダムに取り除き,現実世界のエンティティがKBから欠落しているという現実的なシナリオをシミュレートする。これにより,有効な答えが存在しないクエリという特別な場合を含め,KBよりも現実世界に多くの正解があるクエリをモデルが扱う能力を明示的に測定できる。特に後者はランキング設定との違いが際立つ。最新のKB埋め込みモデルを新しいベンチマークで評価した。実験で観察された,ランキングに基づく評価と分類に基づく評価の間の相対的性能の差は,ランキング課題での良好な性能が必ずしも実際の補完課題での良好な性能につながらないという仮説を裏付ける。本結果は,予測の分離性が高いKB埋め込みモデルに関する今後の研究を動機付ける。その第一歩として,しきい値処理を促し,元のTransEと比べて分類F1スコアを大幅に改善する,TransEの単純な変種を提案する。

Knowledge base completion (KBC) methods aim at inferring missing facts from the information present in a knowledge base (KB) by estimating the likelihood of candidate facts. In the prevailing evaluation paradigm, models do not actually decide whether a new fact should be accepted or not but are solely judged on the position of true facts in a likelihood ranking with other candidates. We argue that consideration of binary predictions is essential to reflect the actual KBC quality, and propose a novel evaluation paradigm, designed to provide more transparent model selection criteria for a realistic scenario. We construct the data set FB14k-QAQ where instead of single facts, we use KB queries, i.e., facts where one entity is replaced with a variable, and construct corresponding sets of entities that are correct answers. We randomly remove some of these correct answers from the data set, simulating the realistic scenario of real-world entities missing from a KB. This way, we can explicitly measure a model's ability to handle queries that have more correct answers in the real world than in the KB, including the special case of queries without any valid answer. The latter especially contrasts with the ranking setting. We evaluate a number of state-of-the-art KB embedding models on our new benchmark. The differences in relative performance between ranking-based and classification-based evaluation that we observe in our experiments confirm our hypothesis that good performance on the ranking task does not necessarily translate to good performance on the actual completion task. Our results motivate future work on KB embedding models with better prediction separability and, as a first step in that direction, we propose a simple variant of TransE that encourages thresholding and achieves a significant improvement in classification F1 score relative to the original TransE.
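The contrast between ranking-based and classification-based evaluation described in this abstract can be illustrated with a small sketch. Everything below is an assumption made for illustration: the function names, the toy scores, and the choice of a micro-averaged F1 are not taken from the paper or from FB14k-QAQ. The point is only that a ranking metric has nothing to say about a query with no valid answer, while a thresholded classifier is penalized for every entity it wrongly accepts.

```python
def reciprocal_rank(scores, true_answers):
    """Ranking view: 1/position of the best-ranked true answer.

    Returns None when the query has no true answer at all, because the
    ranking metric is simply undefined in that case.
    """
    ranked = sorted(scores, key=scores.get, reverse=True)
    for position, entity in enumerate(ranked, start=1):
        if entity in true_answers:
            return 1.0 / position
    return None


def confusion_counts(scores, true_answers, threshold):
    """Classification view: accept every entity scoring at or above the threshold."""
    accepted = {entity for entity, score in scores.items() if score >= threshold}
    true_pos = len(accepted & true_answers)
    false_pos = len(accepted - true_answers)
    false_neg = len(true_answers - accepted)
    return true_pos, false_pos, false_neg


def micro_f1(queries, threshold):
    """Micro-averaged F1 over (scores, true_answers) pairs (hypothetical aggregation)."""
    tp = fp = fn = 0
    for scores, true_answers in queries:
        a, b, c = confusion_counts(scores, true_answers, threshold)
        tp, fp, fn = tp + a, fp + b, fn + c
    denominator = 2 * tp + fp + fn
    return 2 * tp / denominator if denominator else None


# Toy queries: the second one has no valid answer in the real world.
QUERIES = [
    ({"a": 0.9, "b": 0.2, "c": 0.1}, {"a"}),
    ({"a": 0.8, "b": 0.7, "c": 0.6}, set()),
]
```

In this toy setting the first query is ranked perfectly, yet a low threshold still accepts three wrong entities for the second query, which only the classification view can detect.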

翻訳日:2021-04-05 00:32:16 公開日:2021-02-02

# TensorFlowによる透過FPGA高速化

Transparent FPGA Acceleration with TensorFlow ( http://arxiv.org/abs/2102.06018v1 )

ライセンス: Link先を確認

Simon Pfenning, Philipp Holzinger, Marc Reichenbach

(参考訳) 今日、人工ニューラルネットワークは機械学習の進歩を推進する主要な原動力の1つである。これは特に、ニューラルネットワーク高速化ハードウェアの開発に影響を与えている。しかし、これらのアーキテクチャの多くは専用のツールチェーンを必要とするため、開発者は新しいディープラーニングアクセラレータを使おうとするたびに一定の追加労力を強いられる。さらに、デバイスの柔軟性はアーキテクチャ自体とランタイム環境の機能に縛られる。本稿では、TensorFlowをフロントエンドとするツールフローを提案し、開発者が使い慣れた環境を利用できるようにする。バックエンドにはFPGAを用い、HSAランタイム環境を介してアクセスする。これにより、新しいハードウェアを制御する複雑さをユーザから隠しつつ、高い柔軟性を維持できる。これが可能なのは、我々のHSAツールフローではハードウェアがネットワーク構造に合わせて静的に構成されないためである。代わりに、ネットワークが実行する各カーネルに合わせて実行時に動的に再構成でき、同時にOpenCL/OpenMPなど他のソースからのカーネルにも対応できる。

Today, artificial neural networks are one of the major innovators pushing the progress of machine learning. This has particularly affected the development of neural network accelerating hardware. However, since most of these architectures require specialized toolchains, there is a certain amount of additional effort for developers each time they want to make use of a new deep learning accelerator. Furthermore, the flexibility of the device is bound to the architecture itself, as well as to the functionality of the runtime environment. In this paper, we propose a toolflow using TensorFlow as frontend, thus offering developers the opportunity of using a familiar environment. On the backend we use an FPGA, which is addressable via an HSA runtime environment. In this way we are able to hide the complexity of controlling new hardware from the user, while at the same time maintaining a high degree of flexibility. This can be achieved by our HSA toolflow, since the hardware is not statically configured with the structure of the network. Instead, it can be dynamically reconfigured during runtime with the respective kernels executed by the network and, simultaneously, with kernels from other sources, e.g., OpenCL/OpenMP.

翻訳日:2021-04-05 00:31:48 公開日:2021-02-02

# (参考訳) ゼロショットおよび少数ショットの多方言アラビア語系列ラベリングのための事前学習言語モデルの自己学習

Self-Training Pre-Trained Language Models for Zero- and Few-Shot Multi-Dialectal Arabic Sequence Labeling ( http://arxiv.org/abs/2101.04758v4 )

ライセンス: CC BY 4.0

Muhammad Khalifa and Muhammad Abdul-Mageed and Khaled Shaalan

(参考訳) 事前学習された言語モデルを下流タスク向けに微調整するには、通常、十分な量の注釈付きデータが必要である。しかし、ラベル付きデータの入手は、特に複数の言語変種や方言についてはコストがかかる。我々は、データが豊富な変種のリソースだけを用いてデータの乏しい変種での性能を向上させるために、ゼロショットおよび少数ショットのシナリオで事前学習言語モデルを自己学習させることを提案する。現代標準アラビア語(MSA)のみで微調整した言語モデルを用いて、複数のアラビア語方言(DA)変種における固有表現(NE)と品詞(POS)タグを予測することで、アラビア語系列ラベリングの文脈における本アプローチの有用性を示す。自己学習は確かに強力であり、ゼロショットのMSAからDAへの転移を最大で約10% F$_1$ (NER)、2%の精度 (POSタグ付け) 改善する。限られた量のラベル付きデータを用いる少数ショットのシナリオでは、さらに高い性能が得られる。アブレーションスタディを行い、観察された性能向上が自己学習に用いたラベルなしDA例に直接起因することを示す。本研究は、MSAリソースのみを活用するDAモデルの開発に道を開き、他の言語やタスクにも拡張できる。コードと微調整済みモデルは https://github.com/mohammadKhalifa/zero-shot-arabic-dialects で入手できる。

A sufficient amount of annotated data is usually required to fine-tune pre-trained language models for downstream tasks. Unfortunately, obtaining labeled data can be costly, especially for multiple language varieties and dialects. We propose to self-train pre-trained language models in zero- and few-shot scenarios to improve performance on data-scarce varieties using only resources from data-rich ones. We demonstrate the utility of our approach in the context of Arabic sequence labeling by using a language model fine-tuned on Modern Standard Arabic (MSA) only to predict named entities (NE) and part-of-speech (POS) tags on several dialectal Arabic (DA) varieties. We show that self-training is indeed powerful, improving zero-shot MSA-to-DA transfer by as much as ~10% F$_1$ (NER) and 2% accuracy (POS tagging). We achieve even better performance in few-shot scenarios with limited amounts of labeled data. We conduct an ablation study and show that the observed performance boost results directly from the unlabeled DA examples used for self-training. Our work opens up opportunities for developing DA models exploiting only MSA resources, and it can be extended to other languages and tasks. Our code and fine-tuned models can be accessed at https://github.com/mohammadKhalifa/zero-shot-arabic-dialects.

翻訳日:2021-04-04 03:58:45 公開日:2021-02-02

# (参考訳) 質量作用の法則に従うマルチスケール化学反応のデータ駆動型発見

Data-driven discovery of multiscale chemical reactions governed by the law of mass action ( http://arxiv.org/abs/2101.06589v2 )

ライセンス: CC BY 4.0

Juntao Huang and Yizhou Zhou and Wen-An Yong

(参考訳) 本稿では、質量作用の法則に従うマルチスケール化学反応を発見するためのデータ駆動型手法を提案する。第一に、触媒反応を含まない系において、反応物と生成物の両方の化学量論係数を単一の行列で表す。行列の負の成分は反応物の化学量論係数を、正の成分は生成物の化学量論係数を表す。第二に、従来の最適化手法は、マルチスケール化学反応の学習において局所的最小値に陥りやすく、真の解を見つけられないことがわかった。この困難を克服するために、化学量論係数が整数であるという事実を利用してネットワークパラメータを段階的に確定する部分パラメータ凍結(PPF)手法を提案する。この手法により、訓練の過程で探索空間の次元が徐々に減少し、最終的に大域的最小値が得られる。古典的なミカエリス・メンテン速度論や水素酸化反応を含むいくつかの数値実験により、マルチスケール化学反応の学習における本アルゴリズムの良好な性能が確認された。コードは https://github.com/JuntaoHuang/multiscale-chemical-reaction で入手できる。

In this paper, we propose a data-driven method to discover multiscale chemical reactions governed by the law of mass action. First, we use a single matrix to represent the stoichiometric coefficients for both the reactants and products in a system without catalysis reactions. The negative entries in the matrix denote the stoichiometric coefficients for the reactants and the positive ones those for the products. Second, we find that conventional optimization methods usually get stuck in local minima and cannot find the true solution when learning multiscale chemical reactions. To overcome this difficulty, we propose a partial-parameters-freezing (PPF) technique to progressively determine the network parameters by using the fact that the stoichiometric coefficients are integers. With such a technique, the dimension of the search space is gradually reduced in the training process and the global minimum can eventually be obtained. Several numerical experiments, including the classical Michaelis-Menten kinetics and the hydrogen oxidation reactions, verify the good performance of our algorithm in learning multiscale chemical reactions. The code is available at https://github.com/JuntaoHuang/multiscale-chemical-reaction.
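The single-matrix encoding and the integer-freezing idea in this abstract can be sketched as follows. This is a minimal illustration and not the authors' code: the species ordering, the rate constants, the tolerance `tol`, and the helper `freeze_near_integers` are all assumptions introduced here. Negative entries mark reactants and positive entries mark products, as the abstract describes, and the mass-action rate of each reaction is its rate constant times the reactant concentrations raised to their stoichiometric coefficients.

```python
# Species order (assumed for this sketch): E, S, ES, P.
# One row per elementary reaction; negative = reactant, positive = product.
STOICH = [
    [-1, -1, 1, 0],   # E + S -> ES
    [1, 1, -1, 0],    # ES -> E + S
    [1, 0, -1, 1],    # ES -> E + P
]
RATE_CONSTANTS = [1.0, 0.5, 0.3]  # illustrative values, not from the paper


def reaction_rates(conc, stoich=STOICH, k=RATE_CONSTANTS):
    """Law of mass action: r_j = k_j * prod_i conc_i ** (reactant coefficient)."""
    rates = []
    for row, k_j in zip(stoich, k):
        rate = k_j
        for c, nu in zip(conc, row):
            if nu < 0:            # only reactants enter the rate law
                rate *= c ** (-nu)
        rates.append(rate)
    return rates


def concentration_derivatives(conc, stoich=STOICH, k=RATE_CONSTANTS):
    """d(conc_i)/dt = sum_j stoich[j][i] * r_j."""
    rates = reaction_rates(conc, stoich, k)
    return [
        sum(stoich[j][i] * rates[j] for j in range(len(stoich)))
        for i in range(len(conc))
    ]


def freeze_near_integers(params, tol=0.05):
    """Hypothetical freezing step: snap parameters within `tol` of an integer.

    Returns the updated values and a mask of which entries were frozen, so a
    training loop could stop updating the frozen ones.
    """
    values, frozen = [], []
    for p in params:
        nearest = round(p)
        if abs(p - nearest) < tol:
            values.append(float(nearest))
            frozen.append(True)
        else:
            values.append(p)
            frozen.append(False)
    return values, frozen
```

Freezing entries once they sit close to an integer is one plausible reading of how the search space could shrink during training; the actual schedule and criteria used by the PPF technique are described in the paper and its repository.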

翻訳日:2021-03-28 02:13:33 公開日:2021-02-02

# (参考訳) TREGO: 効率的なグローバル最適化のための信頼領域フレームワーク

TREGO: a Trust-Region Framework for Efficient Global Optimization ( http://arxiv.org/abs/2101.06808v3 )

ライセンス: CC BY 4.0

Youssef Diouane and Victor Picheny and Rodolphe Le Riche and Alexandre Scotto Di Perrotolo

(参考訳) 効率的なグローバル最適化(EGO)はベイズ最適化の標準的な形式であり、評価コストの高いブラックボックス問題のグローバル最適化に適用されて成功を収めてきた。しかし、EGOは次元の増加に対するスケーリングが難しく、理論的保証も限られている。本研究では、信頼領域型のEGO法(TREGO)を提案し、解析する。 TREGOは、通常のEGOステップと信頼領域内の局所ステップを交互に行う。信頼領域の古典的なスキーム(十分な減少条件に基づく)に従うことで、最適化ステップの一部でのみEGOから離れつつ、本アルゴリズムが強い大域収束性を持つことを示す。よく知られたCOCOベンチマークに基づく広範な数値実験により、まずTREGOの自身のパラメータに対する感度を解析し、次に得られたアルゴリズムがEGOを一貫して上回り、他の最先端のグローバル最適化手法と競合することを示す。本手法はRパッケージのDiceOptim (https://cran.r-project.org/package=DiceOptim) とPythonライブラリのtrieste (https://secondmind-labs.github.io/trieste/) の両方で利用できる。

Efficient Global Optimization (EGO) is the canonical form of Bayesian optimization that has been successfully applied to solve global optimization of expensive-to-evaluate black-box problems. However, EGO struggles to scale with dimension, and offers limited theoretical guarantees. In this work, we propose and analyze a trust-region-like EGO method (TREGO). TREGO alternates between regular EGO steps and local steps within a trust region. By following a classical scheme for the trust region (based on a sufficient decrease condition), we demonstrate that our algorithm enjoys strong global convergence properties, while departing from EGO only for a subset of optimization steps. Using extensive numerical experiments based on the well-known COCO benchmark, we first analyze the sensitivity of TREGO to its own parameters, then show that the resulting algorithm consistently outperforms EGO and is competitive with other state-of-the-art global optimization methods. The method is available both in the R package DiceOptim (https://cran.r-project.org/package=DiceOptim) and in the Python library trieste (https://secondmind-labs.github.io/trieste/).

翻訳日:2021-03-27 19:51:51 公開日:2021-02-02

# 摂動畳み込みを用いた生成逆ネットワーク

Generative Adversarial Network using Perturbed-Convolutions ( http://arxiv.org/abs/2101.10841v2 )

ライセンス: Link先を確認

Seung Park, Yoon-Jae Yeo, and Yong-Goo Shin

(参考訳) GANの訓練に関する知見は増えているものの、GANは依然として訓練過程の不安定さに悩まされている。この問題を軽減するために、本論文では摂動畳み込み(PConv)と呼ぶ新しい畳み込み層を提案する。PConvは、GANを安定して訓練するために識別器にペナルティを課すことと、識別器の過学習を防ぐことという2つの目標を同時に達成することを狙う。 PConvは、畳み込み演算を行う前に入力テンソルをランダムに乱すことで、摂動された特徴を生成する。このアプローチは単純だが驚くほど効果的である。第一に、乱された入力テンソルを用いて実サンプルと生成サンプルを確実に分類するためには、識別器の中間層は局所リプシッツ値の小さい特徴を学習しなければならない。第二に、PConvの摂動された特徴のために識別器は実画像を記憶しにくくなり、過学習が回避される。提案手法の汎化能力を示すために、CIFAR-10、CelebA-HQ、LSUN、tiny-ImageNetなどのデータセットとさまざまな損失関数を用いて広範な実験を行った。定量的評価により、提案手法はFréchet inception distance(FID)の観点でGANおよび条件付きGANの性能を大幅に改善することが示された。例えば、提案手法はtiny-ImageNetデータセットのFIDスコアを58.59から50.42に改善する。

Despite growing insights into GAN training, GANs still suffer from instability during the training procedure. To alleviate this problem, this paper presents a novel convolutional layer, called perturbed-convolution (PConv), which focuses on achieving two goals simultaneously: penalizing the discriminator to train the GAN stably and preventing the overfitting problem in the discriminator. PConv generates perturbed features by randomly disturbing an input tensor before performing the convolution operation. This approach is simple but surprisingly effective. First, to reliably classify real and generated samples using the disturbed input tensor, the intermediate layers in the discriminator should learn features having a small local Lipschitz value. Second, due to the perturbed features in PConv, it is difficult for the discriminator to memorize the real images; this helps the discriminator avoid the overfitting problem. To show the generalization ability of the proposed method, we conducted extensive experiments with various loss functions and datasets including CIFAR-10, CelebA-HQ, LSUN, and tiny-ImageNet. Quantitative evaluations demonstrate that the proposed method significantly improves the performance of GAN and conditional GAN in terms of Frechet inception distance (FID). For instance, the proposed method improves FID scores on the tiny-ImageNet dataset from 58.59 to 50.42.
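The core operation named in this abstract, disturbing the input before convolving it, can be sketched in one dimension. The abstract does not say how the input is disturbed, so the additive Gaussian noise, the `noise_std` parameter, and the `training` switch below are assumptions chosen for illustration and should not be read as the PConv layer itself.

```python
import random


def conv1d(x, kernel):
    """Valid-mode sliding dot product (cross-correlation, as in most DL libraries)."""
    n_out = len(x) - len(kernel) + 1
    return [
        sum(x[i + j] * kernel[j] for j in range(len(kernel)))
        for i in range(n_out)
    ]


def perturbed_conv1d(x, kernel, noise_std=0.1, rng=None, training=True):
    """Disturb the input at random, then convolve.

    Assumption: the disturbance is zero-mean Gaussian noise added to every
    input element, applied only while training.
    """
    if training and noise_std > 0:
        rng = rng or random.Random()
        x = [value + rng.gauss(0.0, noise_std) for value in x]
    return conv1d(x, kernel)
```

With `training=False` or `noise_std=0` the function reduces to the plain convolution, which mirrors how noise-based regularizers are usually switched off at evaluation time.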

翻訳日:2021-03-22 11:34:10 公開日:2021-02-02

# 有向非巡回グラフニューラルネットワーク

Directed Acyclic Graph Neural Networks ( http://arxiv.org/abs/2101.07965v3 )

ライセンス: Link先を確認

Veronika Thost, Jie Chen

(参考訳) グラフ構造化データは、科学と工学のいたるところに現れる。グラフニューラルネットワーク(GNN)は、グラフが示す関係的な帰納バイアスを利用するように設計されており、構造情報がノード特徴を補うシナリオでは他の形式のニューラルネットワークを上回ることが示されている。最も一般的なGNNアーキテクチャは、メッセージパッシングに基づいて近傍から情報を集約する。その汎用性により、GNNは広く適用されている。本稿では、特殊だが広く使われている種類のグラフである有向非巡回グラフ(DAG)に焦点を当て、より強い帰納バイアスである半順序をニューラルネットワークの設計に組み込む。我々は、半順序が定める流れに従って情報を処理するアーキテクチャである有向非巡回グラフニューラルネットワーク(DAGNN)を提案する。 DAGNNは、先行研究(例えば、木のためのモデルや、ノード表現を再帰的に更新するモデル)を特別な場合として含むフレームワークとみなせるが、我々は従来のアーキテクチャに欠けているいくつかの重要な構成要素を特定する。代表的なDAGデータセット(すなわち、ソースコード、ニューラルアーキテクチャ、確率的グラフィカルモデル)でアブレーション研究を含む包括的な実験を行い、DAGNNがより単純なDAGアーキテクチャや一般的なグラフアーキテクチャよりも優れていることを示す。

Graph-structured data ubiquitously appears in science and engineering. Graph neural networks (GNNs) are designed to exploit the relational inductive bias exhibited in graphs; they have been shown to outperform other forms of neural networks in scenarios where structure information supplements node features. The most common GNN architecture aggregates information from neighborhoods based on message passing. Its generality has made it broadly applicable. In this paper, we focus on a special, yet widely used, type of graph -- DAGs -- and inject a stronger inductive bias -- partial ordering -- into the neural network design. We propose the directed acyclic graph neural network, DAGNN, an architecture that processes information according to the flow defined by the partial order. DAGNN can be considered a framework that entails earlier works as special cases (e.g., models for trees and models updating node representations recurrently), but we identify several crucial components that prior architectures lack. We perform comprehensive experiments, including ablation studies, on representative DAG datasets (i.e., source code, neural architectures, and probabilistic graphical models) and demonstrate the superiority of DAGNN over simpler DAG architectures as well as general graph architectures.

翻訳日:2021-03-22 01:36:16 公開日:2021-02-02

# (参考訳) AIST++でダンスを学ぶ:音楽条件付き3Dダンス生成

Learn to Dance with AIST++: Music Conditioned 3D Dance Generation ( http://arxiv.org/abs/2101.08779v2 )

ライセンス: CC BY 4.0

Ruilong Li, Shan Yang, David A. Ross, Angjoo Kanazawa

(参考訳) 本稿では、音楽を条件とする3Dダンス生成のためのTransformerベースの学習フレームワークを提案する。ネットワークアーキテクチャを慎重に設計し、定性的に満足のいく結果を得るための鍵を実証的に調べる。重要な構成要素は、音楽とダンスの動きの相関をうまく学習する深いクロスモーダルTransformerと、長時間にわたって静止しない動きを生成するのに不可欠な、future-N監督を伴うフルアテンション機構である。さらに、AISTのマルチビューダンスビデオから再構成した、3Dモーションと音楽のペアからなる新しいデータセットAIST++を提案する。このデータセットは、1408シーケンス、110万フレームの3Dダンスモーションを含み、10ジャンルのダンス振付をカバーし、マルチビューカメラパラメータが付属する。我々の知る限り、これはこの種のデータセットとして最大である。 AIST++での豊富な実験により、本手法が定性的にも定量的にも最先端の手法よりはるかに優れた結果を生成することを示す。

In this paper, we present a transformer-based learning framework for 3D dance generation conditioned on music. We carefully design our network architecture and empirically study the keys for obtaining qualitatively pleasing results. The critical components include a deep cross-modal transformer, which learns the correlation between the music and dance motion well, and the full-attention with future-N supervision mechanism, which is essential in producing long-range non-freezing motion. In addition, we propose a new dataset of paired 3D motion and music called AIST++, which we reconstruct from the AIST multi-view dance videos. This dataset contains 1.1M frames of 3D dance motion in 1408 sequences, covering 10 genres of dance choreographies and accompanied by multi-view camera parameters. To our knowledge, it is the largest dataset of this kind. Rich experiments on AIST++ demonstrate that our method produces much better results than the state-of-the-art methods both qualitatively and quantitatively.

翻訳日:2021-03-21 10:32:34 公開日:2021-02-02

# 心血管モデリングと深層強化学習の統合による敗血症治療の不確実性を考慮した制御

Unifying Cardiovascular Modelling with Deep Reinforcement Learning for Uncertainty Aware Control of Sepsis Treatment ( http://arxiv.org/abs/2101.08477v2 )

ライセンス: Link先を確認

Thesath Nanayakkara, Gilles Clermont, Christopher James Langmead, and David Swigon

(参考訳) 敗血症はICUにおける主要な死亡原因であり、米国では全入院の6%、院内死亡の35%を占める。しかし、昇圧薬と輸液の投与戦略について普遍的な合意は存在しない。また、患者によって治療への反応が異なることも観察されており、個別化治療の必要性が浮き彫りになっている。昇圧薬と輸液は心血管生理への特定の効果を念頭に投与され、医学研究は生理学的で血行動態に基づく治療アプローチを示唆している。そこで我々は、数理モデリング、深層学習、強化学習、不確実性定量化の相補的な強みを活用、統合して、個別化され、安全で、不確実性を考慮した治療戦略を学習する新しいアプローチを提案する。まず、教師なしで訓練した、生理学に基づく新しいリカレントニューラルネットワークを用いて、患者固有の動的な心血管状態を推定する。次に、この情報を、患者の検査履歴と観測可能なデータから学習した低次元表現とともに用い、バッチ分布強化学習によって価値分布を導出する。さらに、安全性が重視される領域では、エージェントが何を知っていて何を知らないかを把握することが不可欠である。そのため、患者の各状態と行動に関するモデルの不確実性も定量化し、不確実性を考慮した解釈可能な治療方針のための一般的な枠組みを提案する。この枠組みは、臨床医自身の枠組みへの信頼度を反映するよう容易に調整でき、人間の専門家の意見が得られる場合にはそれを取り入れるよう容易に修正できる。代表的な患者と検証コホートを用いて、本手法が生理学的に解釈可能で一般化可能な方針を学習したことを示す。

Sepsis is the leading cause of mortality in the ICU, responsible for 6% of all hospitalizations and 35% of all in-hospital deaths in the USA. However, there is no universally agreed-upon strategy for vasopressor and fluid administration. It has also been observed that different patients respond differently to treatment, highlighting the need for individualized treatment. Vasopressors and fluids are administered with specific effects on cardiovascular physiology in mind, and medical research has suggested physiologic, hemodynamically guided approaches to treatment. Thus, we propose a novel approach, exploiting and unifying complementary strengths of Mathematical Modelling, Deep Learning, Reinforcement Learning and Uncertainty Quantification, to learn individualized, safe, and uncertainty-aware treatment strategies. We first infer patient-specific, dynamic cardiovascular states using a novel physiology-driven recurrent neural network trained in an unsupervised manner. This information, along with a learned low-dimensional representation of the patient's lab history and observable data, is then used to derive value distributions using Batch Distributional Reinforcement Learning. Moreover, in a safety-critical domain it is essential to know what our agent does and does not know. For this, we also quantify the model uncertainty associated with each patient state and action, and propose a general framework for uncertainty-aware, interpretable treatment policies. This framework can be tweaked easily to reflect a clinician's own confidence in the framework, and can be easily modified to factor in human expert opinion whenever it is accessible. Using representative patients and a validation cohort, we show that our method has learned physiologically interpretable, generalizable policies.

翻訳日:2021-03-21 08:01:25 公開日:2021-02-02

# 現実世界データを用いた薬物開発における人工知能の応用

Applications of artificial intelligence in drug development using real-world data ( http://arxiv.org/abs/2101.08904v2 )

ライセンス: Link先を確認

Zhaoyi Chen, Xiong Liu, William Hogan, Elizabeth Shenkman, Jiang Bian

(参考訳) 米国食品医薬品局(FDA)は、医薬品開発における実世界データ(RWD)の利用を積極的に推進している。 RWDは、治療が行われる実際の臨床環境を反映した重要な実世界エビデンスを生み出せる。一方、人工知能(AI)、特に機械学習と深層学習(ML/DL)の手法は、医薬品開発プロセスの多くの段階でますます利用されている。 AIの進歩は、大規模で多次元のRWDを分析する新しい戦略ももたらした。そこで我々は、過去20年間の論文の迅速レビューを行い、AIとRWDの両方を用いた医薬品開発研究の概要を示す。最も一般的な応用は、有害事象の検出、治験参加者の募集、ドラッグリポジショニングであった。現在の研究ギャップと今後の機会についても論じる。

The US Food and Drug Administration (FDA) has been actively promoting the use of real-world data (RWD) in drug development. RWD can generate important real-world evidence reflecting the real-world clinical environment where the treatments are used. Meanwhile, artificial intelligence (AI), especially machine- and deep-learning (ML/DL) methods, has been increasingly used across many stages of the drug development process. Advancements in AI have also provided new strategies to analyze large, multidimensional RWD. Thus, we conducted a rapid review of articles from the past 20 years to provide an overview of the drug development studies that use both AI and RWD. We found that the most popular applications were adverse event detection, trial recruitment, and drug repurposing. Here, we also discuss current research gaps and future opportunities.

翻訳日:2021-03-20 17:27:57 公開日:2021-02-02

# ベイジアンネットワークの機械学習されたグラフは因果知識とどう比較されるか?

How do some Bayesian Network machine learned graphs compare to causal knowledge? ( http://arxiv.org/abs/2101.10461v2 )

ライセンス: Link先を確認

Anthony C. Constantinou, Norman Fenton, Martin Neil

(参考訳) ベイズネットワーク(BN)のグラフは、因果知識によって決定されるか、両方の組み合わせで学習することができる。バイオインフォマティクスのような分野では、BN構造学習アルゴリズムを適用することで、未知のままの新たな洞察を明らかにすることができる。しかし、これらのアルゴリズムは、実際のデータを扱う場合にしばしば発生するサンプルサイズにおいて、入力データが制限されている場合、効果が低い。本稿では、純粋に機械学習と純粋に知識ベースのBNに焦点を当て、グラフィカル構造と暗黙の統計モデルがどのようにデータを説明しているかの違いを調査します。テストは、BN構造がドメイン知識によって決定された以前の4つのケーススタディに基づいている。知識に基づくグラフを,TETRADで実装された3つの学習クラスにまたがる様々なアルゴリズムから生成された機械学習グラフと比較した。その結果、アルゴリズムはより高いモデル選択スコアを持つグラフを生成する一方で、知識に基づくグラフは興味のある変数のより正確な予測因子であることがわかった。スコアフィッティングの最大化は、限られたデータで歪みが増し、より高いスコアを共有しながら真のグラフからかなり逸脱するグラフィカルなパターンにアルゴリズムを導くため、限られたサンプルサイズの存在下では効果がない。これは、これらのケースにおける因果知識の価値と、限られたデータに適した適切なスコアの必要性を強調する。最後に、シミュレーションデータの結果が実際の実世界のパフォーマンスについてほとんどわからないという概念を支持する新たな証拠も提示しています。

The graph of a Bayesian Network (BN) can be machine learned, determined by causal knowledge, or a combination of both. In disciplines like bioinformatics, applying BN structure learning algorithms can reveal new insights that would otherwise remain unknown. However, these algorithms are less effective when the input data are limited in terms of sample size, which is often the case when working with real data. This paper focuses on purely machine learned and purely knowledge-based BNs and investigates their differences in terms of graphical structure and how well the implied statistical models explain the data. The tests are based on four previous case studies whose BN structure was determined by domain knowledge. Using various metrics, we compare the knowledge-based graphs to the machine learned graphs generated from various algorithms implemented in TETRAD spanning all three classes of learning. The results show that, while the algorithms produce graphs with much higher model selection scores, the knowledge-based graphs are more accurate predictors of variables of interest. Maximising score fitting is ineffective in the presence of limited sample size because the fitting becomes increasingly distorted with limited data, guiding algorithms towards graphical patterns that share higher fitting scores and yet deviate considerably from the true graph. This highlights the value of causal knowledge in these cases, as well as the need for more appropriate fitting scores suitable for limited data. Lastly, the experiments also provide new evidence that supports the notion that results from simulated data tell us little about actual real-world performance.

翻訳日:2021-03-14 19:19:40 公開日:2021-02-02

# シールドによる安全マルチエージェント強化学習

Safe Multi-Agent Reinforcement Learning via Shielding ( http://arxiv.org/abs/2101.11196v2 )

ライセンス: Link先を確認

Ingy Elsayed-Aly, Suda Bharadwaj, Christopher Amato, R\"udiger Ehlers, Ufuk Topcu, Lu Feng

(参考訳) マルチエージェント強化学習(MARL)は、学習過程における安全性の保証(例えば、安全でない状態を一度も訪れないこと)を必要とする幅広い安全クリティカルな応用で、ますます使われるようになっている。しかし、現在のMARL手法には安全性の保証がない。そこで、安全なMARLのための2つのシールド手法を提案する。集中型シールドでは、単一のシールドを合成し、すべてのエージェントの結合行動を監視して、必要に応じて安全でない行動を修正する。因子分解型シールドでは、すべてのエージェントが観測する結合状態空間の因子分解に基づいて複数のシールドを合成する。シールドの集合はエージェントを並行して監視し、各シールドは各ステップでエージェントの部分集合だけを担当する。実験結果から、どちらの手法も、学習された方策の質を損なうことなく学習中のエージェントの安全性を保証できることがわかった。さらに、因子分解型シールドは集中型シールドよりもエージェント数に対してスケーラブルである。

Multi-agent reinforcement learning (MARL) has been increasingly used in a wide range of safety-critical applications, which require guaranteed safety (e.g., no unsafe states are ever visited) during the learning process. Unfortunately, current MARL methods do not have safety guarantees. Therefore, we present two shielding approaches for safe MARL. In centralized shielding, we synthesize a single shield to monitor all agents' joint actions and correct any unsafe action if necessary. In factored shielding, we synthesize multiple shields based on a factorization of the joint state space observed by all agents; the set of shields monitors agents concurrently and each shield is only responsible for a subset of agents at each step. Experimental results show that both approaches can guarantee the safety of agents during learning without compromising the quality of learned policies; moreover, factored shielding is more scalable in the number of agents than centralized shielding.
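The centralized shield described in this abstract, a single monitor that checks the joint action and corrects it when unsafe, might look roughly like the sketch below. The one-dimensional corridor, the collision and bounds checks, and the "everyone stays" fallback are all assumptions made for illustration; the paper synthesizes its shields from a safety specification rather than hand-writing them.

```python
# Hypothetical toy world: agents move on a corridor of `size` cells.
MOVES = {"left": -1, "stay": 0, "right": 1}


def next_positions(positions, joint_action):
    """Apply each agent's move to its current cell."""
    return tuple(p + MOVES[a] for p, a in zip(positions, joint_action))


def is_safe(positions, size=5):
    """Safe means every agent is inside the corridor and no two share a cell."""
    in_bounds = all(0 <= p < size for p in positions)
    no_collision = len(set(positions)) == len(positions)
    return in_bounds and no_collision


def centralized_shield(positions, joint_action, size=5):
    """Pass a safe joint action through; otherwise replace it.

    Assumption: the fallback is for every agent to stay put, which keeps the
    system safe whenever the current state is already safe.
    """
    if is_safe(next_positions(positions, joint_action), size):
        return tuple(joint_action)
    return tuple("stay" for _ in joint_action)
```

A factored variant would run several such monitors, each looking only at the agents inside its part of the state space, which is where the scalability gain mentioned in the abstract would come from.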

翻訳日:2021-03-13 19:41:56 公開日:2021-02-02

# (参考訳) トピック検出のためのdeep autoencoderベースのファジィc-means

Deep Autoencoder-based Fuzzy C-Means for Topic Detection ( http://arxiv.org/abs/2102.02636v1 )

ライセンス: CC BY 4.0

Hendri Murfi, Natasha Rosaline, Nora Hariadi

(参考訳) トピック検出は、テキストデータの集合からトピックを決定するプロセスである。トピック検出手法の1つはクラスタリングに基づく手法であり、セントロイドがトピックであると仮定する。クラスタリング手法には、負の値を含む表現のデータを処理できるという利点がある。したがって、クラスタリング手法はより幅広い表現学習手法と組み合わせられる。本稿では、深層オートエンコーダとファジィc-meansを組み合わせた、deep autoencoder-based fuzzy c-means(DFCM)と呼ぶ手法によって、トピック検出に深層学習を取り入れる。オートエンコーダのエンコーダは低次元の表現を学習する。ファジィc-meansは低次元表現をグループ化してセントロイドを特定する。オートエンコーダのデコーダはセントロイドを元の表現に変換し直し、これをトピックとして解釈する。シミュレーションにより、DFCMは固有空間ベースのファジィc-means(EFCM)よりコヒーレンススコアを改善し、非負値行列因子分解(NMF)や潜在ディリクレ配分法(LDA)といった主要な標準手法に匹敵することを示す。

Topic detection is a process for determining topics from a collection of textual data. One of the topic detection methods is a clustering-based method, which assumes that the centroids are topics. The clustering method has the advantage that it can process data with negative representations. Therefore, the clustering method allows a combination with a broader range of representation learning methods. In this paper, we adopt deep learning for topic detection by using a deep autoencoder and fuzzy c-means, called deep autoencoder-based fuzzy c-means (DFCM). The encoder of the autoencoder performs lower-dimensional representation learning. Fuzzy c-means groups the lower-dimensional representations to identify the centroids. The autoencoder's decoder transforms the centroids back into the original representation to be interpreted as the topics. Our simulation shows that DFCM improves the coherence score of eigenspace-based fuzzy c-means (EFCM) and is comparable to the leading standard methods, i.e., nonnegative matrix factorization (NMF) or latent Dirichlet allocation (LDA).
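The fuzzy c-means step that groups the encoded representations can be sketched with the standard textbook update rules. The sketch below assumes the points have already been encoded into a low-dimensional space; the autoencoder itself, the fuzzifier value `m=2.0`, and the function names are not taken from the paper. Memberships follow the usual rule $u_{ik} = 1 / \sum_j (d_{ik}/d_{jk})^{2/(m-1)}$ and centroids are membership-weighted means.

```python
import math


def memberships(points, centroids, m=2.0):
    """Standard fuzzy c-means membership update; one row per point."""
    exponent = 2.0 / (m - 1.0)
    rows = []
    for x in points:
        dists = [math.dist(x, c) for c in centroids]
        if any(d == 0.0 for d in dists):
            # A point sitting exactly on a centroid belongs fully to it.
            hits = [1.0 if d == 0.0 else 0.0 for d in dists]
            total = sum(hits)
            rows.append([h / total for h in hits])
        else:
            rows.append(
                [1.0 / sum((d_i / d_j) ** exponent for d_j in dists) for d_i in dists]
            )
    return rows


def update_centroids(points, u, m=2.0):
    """Centroid i is the mean of all points weighted by u[k][i] ** m."""
    n_clusters, dim = len(u[0]), len(points[0])
    centroids = []
    for i in range(n_clusters):
        weights = [u[k][i] ** m for k in range(len(points))]
        total = sum(weights)
        centroids.append(
            [sum(w * p[d] for w, p in zip(weights, points)) / total for d in range(dim)]
        )
    return centroids
```

Alternating the two functions until the centroids stop moving gives the usual iteration; in the pipeline the abstract describes, the resulting centroids would then be passed through the decoder to be read as topics.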

翻訳日:2021-02-06 01:27:22 公開日:2021-02-02

# (参考訳) 人間の支援による強化学習の改善: HIPPO Gymを用いた被験者実験の提唱

Improving Reinforcement Learning with Human Assistance: An Argument for Human Subject Studies with HIPPO Gym ( http://arxiv.org/abs/2102.02639v1 )

ライセンス: CC BY-SA 4.0

Matthew E. Taylor, Nicholas Nissen, Yuan Wang, Neda Navidi

(参考訳) 強化学習(RL)は、ゲームプレイ、ロボット制御、その他の逐次意思決定タスクで広く使われる機械学習パラダイムである。しかし、RLエージェントはランダムに行動することから始めるため、学習時間が長く、大量のデータを必要とすることが多い。本稿では、複雑なタスクをよりよく学習するうえで、外部の教師がRLエージェントの学習を大きく助けられる場合が多いことを論じる。 OpenAI Gymは、多数の標準的な環境とエージェントを含むRL研究の一般的なフレームワークであり、RL研究を大幅に取り組みやすくしている。本稿では、我々の新しいオープンソースRLフレームワークであるHuman Input Parsing Platform for Openai Gym(HIPPO Gym)と、その作成にあたっての設計上の決定を紹介する。このプラットフォームの目的は、人間とRLに関する研究を促進し、ここでも参入障壁を下げることで、デモンストレーションからの学習、フィードバックからの学習、カリキュラム学習など、人間の教師がRLエージェントを支援するさまざまな方法を、より多くの研究者が素早く調べられるようにすることである。

Reinforcement learning (RL) is a popular machine learning paradigm for game playing, robotics control, and other sequential decision tasks. However, RL agents often have long learning times with high data requirements because they begin by acting randomly. In order to better learn in complex tasks, this article argues that an external teacher can often significantly help the RL agent learn. OpenAI Gym is a common framework for RL research, including a large number of standard environments and agents, making RL research significantly more accessible. This article introduces our new open-source RL framework, the Human Input Parsing Platform for Openai Gym (HIPPO Gym), and the design decisions that went into its creation. The goal of this platform is to facilitate human-RL research, again lowering the bar so that more researchers can quickly investigate different ways that human teachers could assist RL agents, including learning from demonstrations, learning from feedback, or curriculum learning.

翻訳日:2021-02-06 01:13:17 公開日:2021-02-02

# (参考訳) 素粒子物理学のための機械学習のリビングレビュー

A Living Review of Machine Learning for Particle Physics ( http://arxiv.org/abs/2102.02770v1 )

ライセンス: CC BY 4.0

Matthew Feickert and Benjamin Nachman

(参考訳) 深層学習を含む現代の機械学習技術は、高エネルギー物理学向けに急速に応用、適応、開発されている。この研究の速いペースを踏まえ、我々は、これらのアプローチを開発し、実験的、現象論的、理論的解析に適用する人々のために、ほぼ網羅的な引用文献リストを提供することを目標として、リビングレビューを作成した。生きた文書として、最新の進展を取り入れるために可能な限り頻繁に更新される。正式な(更新されない)レビューのリストも文書内に掲載している。論文は、できるだけ有用になるよう、少数のトピックにグループ化されている。提案や貢献を歓迎し、参加方法の説明も提供する。

Modern machine learning techniques, including deep learning, are rapidly being applied, adapted, and developed for high energy physics. Given the fast pace of this research, we have created a living review with the goal of providing a nearly comprehensive list of citations for those developing and applying these approaches to experimental, phenomenological, or theoretical analyses. As a living document, it will be updated as often as possible to incorporate the latest developments. A list of proper (unchanging) reviews can be found within. Papers are grouped into a small set of topics to be as useful as possible. Suggestions and contributions are most welcome, and we provide instructions for participating.

翻訳日:2021-02-05 23:44:09 公開日:2021-02-02

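# Big Data Analytics Applying the Fusion Approach of Multicriteria Decision Making with Deep Learning Algorithms

Big Data Analytics Applying the Fusion Approach of Multicriteria Decision Making with Deep Learning Algorithms ( http://arxiv.org/abs/2102.02637v1 )

License: check the linked page

Swarajya Lakshmi V Papineni, Snigdha Yarlagadda, Harita Akkineni, A. Mallikarjuna Reddy

(Reference translation) Data is evolving with rapid advances in population and communication across many kinds of devices, such as networks, cloud computing, the Internet of Things (IoT), actuators, and sensors. The growth of data and communication content goes together with velocity, speed, size, and value, and yields useful and meaningful knowledge that helps solve future challenging tasks and current issues. In addition, multicriteria-based decision making is one of the key challenges in solving various problems related to alternative effects in big data analysis. To provide insight into big data, solutions tend to be sought with the latest machine learning techniques, including algorithms such as decision making and multicriteria-based deep learning mechanisms. Meanwhile, derivations are made following approximations to increase the duality of runtime and to improve the potential and efficacy of the whole system. In essence, several fields, including business, agriculture, information technology, and computer science, use deep learning and multicriteria-based decision making. This paper presents diverse applications that bring deep learning techniques to the problems faced in big data analytics and proposes new studies based on the fusion of data-driven techniques.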

Data is evolving with the rapid progress of population and communication for various types of devices such as networks, cloud computing, Internet of Things (IoT), actuators, and sensors. The increment of data and communication content goes with the equivalence of velocity, speed, size, and value to provide the useful and meaningful knowledge that helps to solve the future challenging tasks and latest issues. Besides, multicriteria based decision making is one of the key issues to solve for various issues related to the alternative effects in big data analysis. It tends to find a solution based on the latest machine learning techniques that include algorithms like decision making and deep learning mechanism based on multicriteria in providing insights to big data. On the other hand, the derivations are made for it to go with the approximations to increase the duality of runtime and improve the entire system's potentiality and efficacy. In essence, several fields, including business, agriculture, information technology, and computer science, use deep learning and multicriteria-based decision-making problems. This paper aims to provide various applications that involve the concepts of deep learning techniques and exploiting the multicriteria approaches for issues that are facing in big data analytics by proposing new studies with the fusion approaches of data-driven techniques.

Translated: 2021-02-05 16:47:17 Published: 2021-02-02

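# Autodidactic Neurosurgeon: Collaborative Deep Inference for Mobile Edge Intelligence via Online Learning

Autodidactic Neurosurgeon: Collaborative Deep Inference for Mobile Edge Intelligence via Online Learning ( http://arxiv.org/abs/2102.02638v1 )

License: check the linked page

Letian Zhang, Lixing Chen, Jie Xu

(Reference translation) Recent advances in deep learning (DL) have brought about many intelligent mobile applications and services, but they also pose unprecedented computing challenges for resource-constrained mobile devices. This paper builds a collaborative deep inference system between a resource-constrained mobile device and an edge server, aiming to combine the power of on-device processing and computation offloading. The basic idea is to split a deep neural network (DNN) into a front-end part that runs on the mobile device and a back-end part that runs on the edge server; the main challenge is finding the optimal partition point that minimizes end-to-end inference delay. Unlike existing DNN partitioning work, which relies heavily on a dedicated offline profiling stage to search for the optimal partition point, our system includes an online learning module, called Autodidactic Neurosurgeon (ANS), that learns the optimal partition point automatically on the fly. ANS can therefore closely track changes in the system environment by generating new knowledge for adaptive decision making. The core of ANS is a new contextual bandit learning algorithm, called $\mu$LinUCB, which not only guarantees theoretical learning performance but is also ultra-lightweight for easy real-world implementation. To validate the design of ANS, we implement the system on a video stream object detection testbed and evaluate its performance. The experiments show that ANS significantly outperforms state-of-the-art benchmarks in tracking system changes and in reducing end-to-end inference delay.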

Recent breakthroughs in deep learning (DL) have led to the emergence of many intelligent mobile applications and services, but in the meanwhile also pose unprecedented computing challenges on resource-constrained mobile devices. This paper builds a collaborative deep inference system between a resource-constrained mobile device and a powerful edge server, aiming at joining the power of both on-device processing and computation offloading. The basic idea of this system is to partition a deep neural network (DNN) into a front-end part running on the mobile device and a back-end part running on the edge server, with the key challenge being how to locate the optimal partition point to minimize the end-to-end inference delay. Unlike existing efforts on DNN partitioning that rely heavily on a dedicated offline profiling stage to search for the optimal partition point, our system has a built-in online learning module, called Autodidactic Neurosurgeon (ANS), to automatically learn the optimal partition point on-the-fly. Therefore, ANS is able to closely follow the changes of the system environment by generating new knowledge for adaptive decision making. The core of ANS is a novel contextual bandit learning algorithm, called $\mu$LinUCB, which not only has provable theoretical learning performance guarantee but also is ultra-lightweight for easy real-world implementation. We implement our system on a video stream object detection testbed to validate the design of ANS and evaluate its performance. The experiments show that ANS significantly outperforms state-of-the-art benchmarks in terms of tracking system changes and reducing the end-to-end inference delay.
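The partitioning idea in the abstract above can be illustrated with a brute-force search over partition points. This is only a sketch of the offline-profiling style that the abstract contrasts with ANS, not the $\mu$LinUCB algorithm; the function name `best_partition`, its parameters, and all latency figures are hypothetical.

```python
def best_partition(device_ms, server_ms, output_kb, input_kb, bandwidth_kb_per_ms):
    """Pick the partition point with the smallest end-to-end delay.

    Layers [0, p) run on the mobile device and layers [p, n) on the edge server.
    device_ms[i] / server_ms[i]: assumed latency of layer i on each side (ms).
    output_kb[i]: assumed size of layer i's output (KB); input_kb: raw input size.
    Returns (p, delay_ms).
    """
    n = len(device_ms)
    best_p, best_delay = None, float("inf")
    for p in range(n + 1):
        front = sum(device_ms[:p])   # on-device compute time
        back = sum(server_ms[p:])    # edge-server compute time
        if p == n:
            transfer = 0.0           # everything stays on the device
        else:
            # Send the raw input (p == 0) or the output of the last device layer.
            payload = input_kb if p == 0 else output_kb[p - 1]
            transfer = payload / bandwidth_kb_per_ms
        delay = front + transfer + back
        if delay < best_delay:
            best_p, best_delay = p, delay
    return best_p, best_delay


if __name__ == "__main__":
    device = [10, 20, 30]   # hypothetical per-layer latencies on the device
    server = [1, 2, 3]      # hypothetical per-layer latencies on the server
    sizes = [100, 10, 5]    # hypothetical per-layer output sizes in KB
    print(best_partition(device, server, sizes, 200, 10))  # fast link
    print(best_partition(device, server, sizes, 200, 1))   # slow link
```

With these made-up numbers the best split moves deeper into the network when the link gets slower, which is the kind of environment change an online learner would have to track without a fixed profile.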

Translated: 2021-02-05 16:46:37 Published: 2021-02-02

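# Towards a reinforcement learning de novo genome assembler

Towards a reinforcement learning de novo genome assembler ( http://arxiv.org/abs/2102.02649v1 )

License: check the linked page

Kleber Padovani, Roberto Xavier, Andre Carvalho, Anna Reali, Annie Chateau, Ronnie Alves

(Reference translation) Reinforcement learning has proven very promising for solving complex activities without human supervision during the learning process. However, its successes have mainly focused on fictional and entertainment problems such as games. This work aims to shed light on applying reinforcement learning to a relevant real-world problem, genome assembly. Extending the only approach found in the literature that addresses this problem, we carefully examine aspects of intelligent agent learning performed by the Q-learning algorithm, to understand whether it can be applied to scenarios whose characteristics are closer to those of real genome projects. The proposed improvements include changes to the previously proposed reward system, state-space exploration optimization strategies based on dynamic pruning, and mutual collaboration with evolutionary computing. These investigations were carried out on 23 new environments with larger inputs than before. All of these environments are freely available on the internet so that the scientific community can advance this research. The results suggest consistent performance gains from the proposed improvements, but they also show limitations, especially regarding the high dimensionality of the state and action spaces. We also propose paths toward tackling genome assembly efficiently in real scenarios, drawing on recent successful reinforcement learning applications, including deep reinforcement learning, from other domains that handle high-dimensional inputs.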

The use of reinforcement learning has proven to be very promising for solving complex activities without human supervision during the learning process. However, its successful applications are predominantly focused on fictional and entertainment problems, such as games. Based on the above, this work aims to shed light on the application of reinforcement learning to a relevant real-world problem, genome assembly. By expanding the only approach found in the literature that addresses this problem, we carefully explored the aspects of intelligent agent learning, performed by the Q-learning algorithm, to understand its suitability for scenarios whose characteristics are more similar to those faced by real genome projects. The improvements proposed here include changing the previously proposed reward system and including state space exploration optimization strategies based on dynamic pruning and mutual collaboration with evolutionary computing. These investigations were tried on 23 new environments with larger inputs than those used previously. All these environments are freely available on the internet for the evolution of this research by the scientific community. The results suggest consistent performance progress using the proposed improvements; however, they also demonstrate their limitations, especially related to the high dimensionality of state and action spaces. We also present the paths that can be traced to tackle genome assembly efficiently in real scenarios, considering recent, successful reinforcement learning applications, including deep reinforcement learning, from other domains dealing with high-dimensional inputs.

Translated: 2021-02-05 16:39:11 Published: 2021-02-02

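# DLpN: Single-Shell NODDI Using Deep Learner Estimated Isotropic Volume Fraction

DLpN: Single-Shell NODDI Using Deep Learner Estimated Isotropic Volume Fraction ( http://arxiv.org/abs/2102.02772v1 )

License: check the linked page

Abrar Faiyaz, Marvin Doyley, Giovanni Schifitto, Jianhui Zhong, Md Nasir Uddin

(Reference translation) Neurite orientation dispersion and density imaging (NODDI) enables assessment of intracellular, extracellular, and free water signals from multi-shell diffusion MRI data. It is an insightful approach for characterizing brain tissue microstructure. Single-shell reconstruction of NODDI parameters has been discouraged in the previous literature because fitting fails, especially for the neurite density index (NDI). In this study, we investigated whether robust NODDI parameter maps can be created from single-shell data by using the isotropic volume fraction ($f_{ISO}$) as a prior. We estimated the prior independently of the NODDI model constraints using a dictionary-based deep learning approach. First, we propose DictNet, a stochastic sparse dictionary-based network, to predict $f_{ISO}$. In the single-shell case, fractional anisotropy (FA) and the T2 signal without diffusion weighting ($S_0$) were incorporated into the dictionary for $f_{ISO}$ estimation. The NODDI framework was then used with the prior to estimate the NDI and the orientation dispersion index (ODI). Using synthetic data simulations and human data collected on a 3T scanner, we compared the performance of dictionary-based deep learning prior NODDI (DLpN) with the original NODDI method on both single-shell and multi-shell data. The results suggest that DLpN-derived NDI and ODI parameters for single-shell protocols are comparable to the original multi-shell NODDI, and that the protocol with $b=2000$ s/mm$^2$ performs best (error of about 2% in white matter and about 4% in grey matter). This may allow NODDI evaluation in retrospective studies on single-shell data with additional scans of two subjects for DictNet $f_{ISO}$ training.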

Neurite orientation dispersion and density imaging (NODDI) enables assessment of intracellular, extracellular and free water signals from multi-shell diffusion MRI data. It is an insightful approach to characterize the brain tissue microstructure. Single-shell reconstruction for NODDI parameters has been discouraged in previous literature based on failure when fitting especially for the neurite density index (NDI). Here, we investigated the possibility to create robust NODDI parameter maps with single-shell data, using isotropic volume fraction ($f_{ISO}$) as prior. We made the prior estimation independent of NODDI model constraint using a dictionary based deep learning approach. First, we proposed a stochastic sparse dictionary-based network, DictNet in predicting $f_{ISO}$. In single-shell cases, fractional anisotropy (FA) and T2 signal without diffusion weighting ($S_0$) were incorporated in the dictionary for $f_{ISO}$ estimation. Then, NODDI framework was used in a prior setting to estimate the NDI and orientation dispersion index (ODI). Using both synthetic data simulation and human data collected on a 3T scanner, we compared the performance of our dictionary based deep learning prior NODDI (DLpN) with original NODDI method for both single-shell and multi-shell data. Our results suggest that DLpN derived NDI and ODI parameters for single-shell protocols are comparable with original multi-shell NODDI, and protocol with $b=2000$ s/mm$^2$ performs the best (error ~2% in white matter and ~4% in grey matter). This may allow NODDI evaluation of retrospective studies on single-shell data by additional scanning of two subjects for DictNet $f_{ISO}$ training.

Translated: 2021-02-05 16:35:17 Published: 2021-02-02

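# Parametrized Quantum Circuits of Synonymous Sentences in Quantum Natural Language Processing

Parametrized Quantum Circuits of Synonymous Sentences in Quantum Natural Language Processing ( http://arxiv.org/abs/2102.02204v1 )

License: check the linked page

Mina Abbaszadeh, S. Shahin Mousavi, Vahid Salari

(Reference translation) In this paper, we develop a compositional vector-based semantics of positive transitive sentences in quantum natural language processing for a non-English language, Persian, in order to compare the parametrized quantum circuits of two synonymous sentences in English and Persian. Considering the grammar and meaning of a transitive sentence, we translate the DisCoCat diagram into quantum circuit form via the ZX-calculus. We also use a bigraph method to rewrite the DisCoCat diagram and convert it into a quantum circuit on the semantic side.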

In this paper, we develop a compositional vector-based semantics of positive transitive sentences in quantum natural language processing for a non-English language, i.e. Persian, to compare the parametrized quantum circuits of two synonymous sentences in two languages, English and Persian. By considering grammar+meaning of a transitive sentence, we translate DisCoCat diagram via ZX-calculus into quantum circuit form. Also, we use a bigraph method to rewrite DisCoCat diagram and turn into quantum circuit in the semantic side.

Translated: 2021-02-05 16:18:06 Published: 2021-02-02

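# Pick the Right Edge Device: Towards Power and Performance Estimation of CUDA-based CNNs on GPGPUs

Pick the Right Edge Device: Towards Power and Performance Estimation of CUDA-based CNNs on GPGPUs ( http://arxiv.org/abs/2102.02645v1 )

License: check the linked page

Christopher A. Metz, Mehran Goli, Rolf Drechsler

(Reference translation) The emergence of machine learning (ML) as a powerful technique has helped nearly every field of business improve operational efficiency or develop new value propositions. Beyond the challenges of deploying and maintaining ML models, picking the right edge device (e.g., a GPGPU) to run these models (e.g., a CNN with a massive computational process) is one of the most pressing challenges organizations face today. Because the cost of renting (in the cloud) or purchasing an edge device feeds directly into the cost of the final product or service, choosing the most efficient device is essential. This decision, however, requires deep knowledge of the performance and power consumption of ML models running on edge devices, which must be identified early in the ML workflow. This paper presents a novel ML-based approach that gives ML engineers an early estimate of both the power consumption and the performance of CUDA-based CNNs on GPGPUs. The proposed approach lets ML engineers pick the most efficient GPGPU for a given CNN model at an early stage of development.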

The emergence of Machine Learning (ML) as a powerful technique has been helping nearly all fields of business to increase operational efficiency or to develop new value propositions. Besides the challenges of deploying and maintaining ML models, picking the right edge device (e.g., GPGPUs) to run these models (e.g., CNN with the massive computational process) is one of the most pressing challenges faced by organizations today. As the cost of renting (on Cloud) or purchasing an edge device is directly connected to the cost of final products or services, choosing the most efficient device is essential. However, this decision making requires deep knowledge about performance and power consumption of the ML models running on edge devices that must be identified at the early stage of ML workflow. In this paper, we present a novel ML-based approach that provides ML engineers with the early estimation of both power consumption and performance of CUDA-based CNNs on GPGPUs. The proposed approach empowers ML engineers to pick the most efficient GPGPU for a given CNN model at the early stage of development.

Translated: 2021-02-05 15:55:53 Published: 2021-02-02

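# (Reference translation) Scaling Laws for Transfer

Scaling Laws for Transfer ( http://arxiv.org/abs/2102.01293v1 )

License: CC BY 4.0

Danny Hernandez, Jared Kaplan, Tom Henighan, and Sam McCandlish

(Reference translation) We study empirical scaling laws for transfer learning between distributions in an unsupervised, fine-tuning setting. When increasingly large neural networks are trained from scratch on a fixed-size dataset, they eventually become data-limited and their performance (cross-entropy loss) stops improving. When the same is done with models pre-trained on a large language dataset, the slope of the performance gains merely becomes smaller rather than going to zero. We compute the effective data "transferred" from pre-training by determining how much data a transformer of the same size would need to reach the same loss when trained from scratch. In other words, we focus on units of data and hold everything else fixed. We find that the effective data transferred is well described, in the low-data regime, by a power law in parameter count and fine-tuning dataset size. We believe the exponents of these power laws correspond to measures of the generality of a model and the proximity of distributions (in a directed rather than symmetric sense). Pre-training effectively multiplies the fine-tuning dataset size. Like overall performance, transfer scales predictably in terms of parameters, data, and compute.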

We study empirical scaling laws for transfer learning between distributions in an unsupervised, fine-tuning setting. When we train increasingly large neural networks from-scratch on a fixed-size dataset, they eventually become data-limited and stop improving in performance (cross-entropy loss). When we do the same for models pre-trained on a large language dataset, the slope in performance gains is merely reduced rather than going to zero. We calculate the effective data "transferred" from pre-training by determining how much data a transformer of the same size would have required to achieve the same loss when training from scratch. In other words, we focus on units of data while holding everything else fixed. We find that the effective data transferred is described well in the low data regime by a power-law of parameter count and fine-tuning dataset size. We believe the exponents in these power-laws correspond to measures of the generality of a model and proximity of distributions (in a directed rather than symmetric sense). We find that pre-training effectively multiplies the fine-tuning dataset size. Transfer, like overall performance, scales predictably in terms of parameters, data, and compute.
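The power-law claim in the abstract above can be written out as a formula. The symbols are labels chosen for this sketch, since the abstract names none: $D_T$ for the effective data transferred, $D_F$ for the fine-tuning dataset size, $N$ for the parameter count, and $k$, $\alpha$, $\beta$ for fitted constants whose values the abstract does not give.

$$D_T \approx k \, D_F^{\alpha} \, N^{\beta}$$

Under that reading, pre-training would act as a multiplier $(D_F + D_T)/D_F$ on the fine-tuning data, which is one way to interpret the statement that pre-training effectively multiplies the fine-tuning dataset size.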

Translated: 2021-02-05 10:33:04 Published: 2021-02-02

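# (Reference translation) Leveraging IoT and Weather Conditions to Estimate the Riders Waiting for the Bus Transit on Campus

Leveraging IoT and Weather Conditions to Estimate the Riders Waiting for the Bus Transit on Campus ( http://arxiv.org/abs/2102.01364v1 )

License: CC BY-SA 4.0

Ismail Arai, Ahmed Elnoshokaty, Samy El-Tawab

(Reference translation) The communication technology revolution of this era has increased smartphone use in the world of transportation. In this paper, we propose using IoT device data that captures passengers' smartphone Wi-Fi data, together with weather conditions, to predict with deep learning models the expected number of passengers waiting at a bus stop at a specific time. The data for this study were collected from the transit bus system at James Madison University (JMU) in Virginia, USA. The paper studies the relationship between the number of passengers waiting at bus stops and weather conditions. Empirically, an experiment with several bus stops at JMU confirmed a high level of precision. We compared our Deep Neural Network (DNN) model against two baseline models: Linear Regression (LR) and a Wide Neural Network (WNN). The DNN improved the Mean Squared Error (MSE) score of its predictions by 35% and 14% relative to LR and WNN, respectively.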

The communication technology revolution in this era has increased the use of smartphones in the world of transportation. In this paper, we propose to leverage IoT device data, capturing passengers' smartphones' Wi-Fi data in conjunction with weather conditions to predict the expected number of passengers waiting at a bus stop at a specific time using deep learning models. Our study collected data from the transit bus system at James Madison University (JMU) in Virginia, USA. This paper studies the correlation between the number of passengers waiting at bus stops and weather conditions. Empirically, an experiment with several bus stops in JMU, was utilized to confirm a high precision level. We compared our Deep Neural Network (DNN) model against two baseline models: Linear Regression (LR) and a Wide Neural Network (WNN). The gap between the baseline models and DNN was 35% and 14% better Mean Squared Error (MSE) scores for predictions in favor of the DNN compared to LR and WNN, respectively.

Translated: 2021-02-05 10:15:58 Published: 2021-02-02

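# (Reference translation) AutoFreeze: Automatically Freezing Model Blocks to Accelerate Fine-tuning

AutoFreeze: Automatically Freezing Model Blocks to Accelerate Fine-tuning ( http://arxiv.org/abs/2102.01386v1 )

License: CC BY 4.0

Yuhan Liu, Saurabh Agarwal, Shivaram Venkataraman

(Reference translation) With the rapid adoption of machine learning (ML), many domains now fine-tune models that were pre-trained on a large corpus of data. However, our experiments show that even fine-tuning models such as BERT can take many hours on GPUs. Prior work proposes limiting the number of layers that are fine-tuned, for example by freezing all layers except the last, but we find that such static approaches reduce accuracy. We propose AutoFreeze, a system that uses an adaptive approach to choose which layers are trained, and show how it can accelerate model fine-tuning while preserving accuracy. We also develop mechanisms for efficient caching of intermediate activations, which reduce forward computation time during fine-tuning. Our evaluation on four NLP tasks shows that AutoFreeze, with caching enabled, can improve fine-tuning performance by up to 2.55x.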

With the rapid adoption of machine learning (ML), a number of domains now use the approach of fine-tuning models pre-trained on a large corpus of data. However, our experiments show that even fine-tuning on models like BERT can take many hours when using GPUs. While prior work proposes limiting the number of layers that are fine-tuned, e.g., freezing all layers but the last layer, we find that such static approaches lead to reduced accuracy. We propose AutoFreeze, a system that uses an adaptive approach to choose which layers are trained and show how this can accelerate model fine-tuning while preserving accuracy. We also develop mechanisms to enable efficient caching of intermediate activations which can reduce the forward computation time when performing fine-tuning. Our evaluation on four NLP tasks shows that AutoFreeze, with caching enabled, can improve fine-tuning performance by up to 2.55x.

Translated: 2021-02-05 09:51:55 Published: 2021-02-02

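# (Reference translation) OPAM: Online Purchasing-behavior Analysis using Machine learning

OPAM: Online Purchasing-behavior Analysis using Machine learning ( http://arxiv.org/abs/2102.01625v1 )

License: CC BY 4.0

Sohini Roychowdhury, Ebrahim Alareqi, Wenxi Li

(Reference translation) Customer purchasing behavior analysis plays a key role in developing insightful communication strategies between online vendors and their customers. To support the recent growth in online shopping, this work presents a customer purchasing behavior analysis system that uses supervised, unsupervised, and semi-supervised learning methods. The proposed system analyzes session-level and user-journey-level purchasing behavior to identify customer categories and clusters. For session-level purchasing prediction, we observe higher sensitivity to the design of online shopping portals, with accuracy/recall in the range 91-98%/73-99%. The user-journey-level analysis reveals five distinct user clusters, among which 'New Shoppers' are the most predictable and 'Impulsive Shoppers' are the most distinctive, with low viewing and high carting behavior before purchases. In addition, cluster transformation metrics and partial label learning demonstrate the robustness of each user cluster to new or unlabelled events. Customer clusters can thus support strategically targeted nudge models.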

Customer purchasing behavior analysis plays a key role in developing insightful communication strategies between online vendors and their customers. To support the recent increase in online shopping trends, in this work, we present a customer purchasing behavior analysis system using supervised, unsupervised and semi-supervised learning methods. The proposed system analyzes session and user-journey level purchasing behaviors to identify customer categories/clusters that can be useful for targeted consumer insights at scale. We observe higher sensitivity to the design of online shopping portals for session-level purchasing prediction with accuracy/recall in range 91-98%/73-99%, respectively. The user-journey level analysis demonstrates five unique user clusters, wherein 'New Shoppers' are most predictable and 'Impulsive Shoppers' are most unique with low viewing and high carting behaviors for purchases. Further, cluster transformation metrics and partial label learning demonstrates the robustness of each user cluster to new/unlabelled events. Thus, customer clusters can aid strategic targeted nudge models.

Translated: 2021-02-05 09:26:34 Published: 2021-02-02

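# (Reference translation) A Stochastic Time Series Model for Predicting Financial Trends using NLP

A Stochastic Time Series Model for Predicting Financial Trends using NLP ( http://arxiv.org/abs/2102.01290v1 )

License: CC BY 4.0

Pratyush Muthukumar, Jie Zhong

(Reference translation) Stock price forecasting is a highly complex and vitally important field of research. Advances in deep neural network technology allow researchers to develop highly accurate models for predicting financial trends. We propose a new deep learning model called ST-GAN (Stochastic Time-series Generative Adversarial Network), which analyzes both financial news text and financial numerical data to predict stock trends. We use state-of-the-art techniques such as the Generative Adversarial Network (GAN) to learn the correlations between textual and numerical data over time. We develop a new method for training a time-series GAN that directly uses the learned representations of Naive Bayes sentiment analysis on financial text data together with technical indicators from numerical data. Our experimental results show significant improvement over various existing models and prior research on deep neural networks for stock price forecasting.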

Stock price forecasting is a highly complex and vitally important field of research. Recent advancements in deep neural network technology allow researchers to develop highly accurate models to predict financial trends. We propose a novel deep learning model called ST-GAN, or Stochastic Time-series Generative Adversarial Network, that analyzes both financial news texts and financial numerical data to predict stock trends. We utilize cutting-edge technology like the Generative Adversarial Network (GAN) to learn the correlations among textual and numerical data over time. We develop a new method of training a time-series GAN directly using the learned representations of Naive Bayes' sentiment analysis on financial text data alongside technical indicators from numerical data. Our experimental results show significant improvement over various existing models and prior research on deep neural networks for stock price forecasting.

Translated: 2021-02-05 06:16:39 Published: 2021-02-02

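# (Reference translation) Human-Machine Collaborative Video Coding Through Cuboidal Partitioning

Human-Machine Collaborative Video Coding Through Cuboidal Partitioning ( http://arxiv.org/abs/2102.01307v1 )

License: CC0 1.0

Ashek Ahmmed, Manoranjan Paul, Manzur Murshed, and David Taubman

(Reference translation) Video coding algorithms encode and decode an entire video frame, whereas feature coding techniques preserve and communicate only the most critical information needed for a given application. This is because video coding targets human perception, while feature coding targets machine vision tasks. Recently, attempts have been made to bridge the gap between these two domains. In this work, we propose a video coding framework that uses cuboids to exploit the commonality between human vision and machine vision applications. This is because cuboids, rectangular regions estimated over a video frame, are computationally efficient, compact in representation, and object-centric. Such properties have already been shown to add value to traditional video coding systems. Here, cuboidal feature descriptors are extracted from the current frame and used to accomplish a machine vision task in the form of object detection. Experimental results show that a trained classifier achieves better average precision when given the cuboidal-feature-oriented representation of the current test frame. In addition, this representation reduces the bit rate by 7% when the captured frames need to be communicated to a receiver.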

Video coding algorithms encode and decode an entire video frame while feature coding techniques only preserve and communicate the most critical information needed for a given application. This is because video coding targets human perception, while feature coding aims for machine vision tasks. Recently, attempts are being made to bridge the gap between these two domains. In this work, we propose a video coding framework that leverages the commonality that exists between human vision and machine vision applications using cuboids. This is because cuboids, estimated rectangular regions over a video frame, are computationally efficient, have a compact representation, and are object-centric. Such properties are already shown to add value to traditional video coding systems. Herein cuboidal feature descriptors are extracted from the current frame and then employed for accomplishing a machine vision task in the form of object detection. Experimental results show that a trained classifier yields superior average precision when equipped with a cuboidal-feature-oriented representation of the current test frame. Additionally, this representation costs 7% less in bit rate if the captured frames need to be communicated to a receiver.

Translated: 2021-02-05 06:05:55 Published: 2021-02-02

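# (Reference translation) A Graph-Constrained Changepoint Learning Approach for Automatic QRS-Complex Detection

A Graph-Constrained Changepoint Learning Approach for Automatic QRS-Complex Detection ( http://arxiv.org/abs/2102.01319v1 )

License: CC BY 4.0

Atiyeh Fotoohinasab, Toby Hocking, and Fatemeh Afghah

(Reference translation) This study offers a new perspective on ECG signal analysis by applying a graph-based changepoint detection model to locate R-peak positions. The model is based on a new graph learning algorithm that learns the constraint graph from labeled ECG data. The proposed learning algorithm starts with a simple initial graph and edits it iteratively so that the final graph has maximum accuracy in R-peak detection. We evaluate the algorithm on the MIT-BIH Arrhythmia Database. The evaluation shows that the proposed method obtains results comparable to other state-of-the-art methods. The proposed method achieves an overall sensitivity of Sen = 99.64%, a positive predictivity of PPR = 99.71%, and a detection error rate of DER = 0.19.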

This study presents a new viewpoint on ECG signal analysis by applying a graph-based changepoint detection model to locate R-peak positions. This model is based on a new graph learning algorithm to learn the constraint graph given the labeled ECG data. The proposed learning algorithm starts with a simple initial graph and iteratively edits the graph so that the final graph has the maximum accuracy in R-peak detection. We evaluate the performance of the algorithm on the MIT-BIH Arrhythmia Database. The evaluation results demonstrate that the proposed method can obtain comparable results to other state-of-the-art approaches. The proposed method achieves the overall sensitivity of Sen = 99.64%, positive predictivity of PPR = 99.71%, and detection error rate of DER = 0.19.
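For readers unfamiliar with the metrics quoted in the abstract above, here is a minimal sketch of how sensitivity and positive predictivity are conventionally computed from beat-level counts. The function name `detection_scores` and the counts in the example are hypothetical; the detection error rate is left out because the abstract does not state which definition it uses.

```python
def detection_scores(tp, fp, fn):
    """Return (sensitivity, positive_predictivity) from detection counts.

    tp: R-peaks correctly detected, fp: spurious detections, fn: missed peaks.
    """
    if tp + fn == 0 or tp + fp == 0:
        raise ValueError("need at least one true peak and one detection")
    sensitivity = tp / (tp + fn)            # share of true peaks that were found
    positive_predictivity = tp / (tp + fp)  # share of detections that were real
    return sensitivity, positive_predictivity


if __name__ == "__main__":
    # Made-up counts, chosen only to give values of a similar magnitude.
    sen, ppr = detection_scores(tp=9964, fp=29, fn=36)
    print(f"Sen = {sen:.2%}, PPR = {ppr:.2%}")
```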

Translated: 2021-02-05 05:57:19 Published: 2021-02-02

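# (Reference translation) A Survey on Understanding, Visualizations, and Explanation of Deep Neural Networks

A Survey on Understanding, Visualizations, and Explanation of Deep Neural Networks ( http://arxiv.org/abs/2102.01792v1 )

License: CC BY 4.0

Atefeh Shahroudnejad

(Reference translation) Recent advances in machine learning and signal processing have greatly increased interest in Deep Neural Networks (DNNs), owing to their unprecedented performance and high accuracy on a range of difficult problems of engineering importance. However, when such deep learning architectures are used for decisions that involve human lives (for example, in control systems and medical applications), it becomes essential to understand, trust, and in one word "explain" the reasoning behind a deep model's decisions. In many applications, artificial neural networks (including DNNs) are regarded as black-box systems that give insufficient clues about their internal processing. Although recent efforts to explain the behavior and decisions of deep networks have begun, the explainable artificial intelligence (XAI) domain, which aims to reason about the behavior and decisions of DNNs, is still in its infancy. The aim of this paper is to provide a comprehensive overview of the understanding, visualization, and explanation of the internal and overall behavior of DNNs.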

Recent advancements in machine learning and signal processing domains have resulted in an extensive surge of interest in Deep Neural Networks (DNNs) due to their unprecedented performance and high accuracy for different and challenging problems of significant engineering importance. However, when such deep learning architectures are utilized for making critical decisions such as the ones that involve human lives (e.g., in control systems and medical applications), it is of paramount importance to understand, trust, and in one word "explain" the argument behind deep models' decisions. In many applications, artificial neural networks (including DNNs) are considered as black-box systems, which do not provide sufficient clue on their internal processing actions. Although some recent efforts have been initiated to explain the behaviors and decisions of deep networks, explainable artificial intelligence (XAI) domain, which aims at reasoning about the behavior and decisions of DNNs, is still in its infancy. The aim of this paper is to provide a comprehensive overview on Understanding, Visualization, and Explanation of the internal and overall behavior of DNNs.

Translated: 2021-02-05 04:54:36 Published: 2021-02-02

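# (Reference translation) Majorizing Measures, Sequential Complexities, and Online Learning

Majorizing Measures, Sequential Complexities, and Online Learning ( http://arxiv.org/abs/2102.01729v1 )

License: CC BY 4.0

Adam Block, Yuval Dagan, and Sasha Rakhlin

(Reference translation) We introduce the technique of generic chaining and majorizing measures for controlling sequential Rademacher complexity. We relate majorizing measures to the notion of fractional covering numbers, which we show are dominated by sequential scale-sensitive dimensions in a horizon-independent way, and, under additional complexity assumptions, we establish tight control of the worst-case sequential Rademacher complexity in terms of the integral of the sequential scale-sensitive dimension. Finally, we establish a tight contraction inequality for worst-case sequential Rademacher complexity. Together these resolve a number of outstanding open problems in extending the classical theory of empirical processes to the sequential case and, in turn, establish sharp results for online learning.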

We introduce the technique of generic chaining and majorizing measures for controlling sequential Rademacher complexity. We relate majorizing measures to the notion of fractional covering numbers, which we show to be dominated in terms of sequential scale-sensitive dimensions in a horizon-independent way, and, under additional complexity assumptions establish a tight control on worst-case sequential Rademacher complexity in terms of the integral of sequential scale-sensitive dimension. Finally, we establish a tight contraction inequality for worst-case sequential Rademacher complexity. The above constitutes the resolution of a number of outstanding open problems in extending the classical theory of empirical processes to the sequential case, and, in turn, establishes sharp results for online learning.

Translated: 2021-02-05 03:52:59 Published: 2021-02-02

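# (Reference translation) Vehicle trajectory prediction in top-view image sequences based on deep learning method

Vehicle trajectory prediction in top-view image sequences based on deep learning method ( http://arxiv.org/abs/2102.01749v1 )

License: CC BY 4.0

Zahra Salahshoori Nejad, Hamed Heravi, Ali Rahimpour Jounghani, Abdollah Shahrezaie, Afshin Ebrahimi

(Reference translation) Every year, many injuries and deaths around the world are related to motor vehicle accidents. This number has recently fallen to some extent through the use of driver-assistance systems. Developing driver-assistance systems (i.e., automated driving systems) plays a crucial role in reducing it. Estimating and predicting the movement of surrounding vehicles is essential for automated vehicles and advanced safety systems. Moreover, trajectory prediction is influenced by many factors, such as driver behavior during accidents, the movement history of the vehicle and the surrounding vehicles, and their positions in the traffic scene. A vehicle must follow a safe path through traffic and react to other drivers' unpredictable behavior in the shortest possible time. Here, to predict the path of automated vehicles, we propose a model with low computational complexity that is trained on images taken from aerial imagery of the road. Our method is based on an encoder-decoder model that uses a social tensor to model how the movement of surrounding vehicles affects the target vehicle. The proposed model can predict a vehicle's future path on any freeway just by viewing images of the movement history of the target vehicle and its neighbors. Deep learning was used as a tool for extracting features from these images. Using the HighD database, we created an image dataset of aerial road images and evaluated the model's performance on it. We achieved an RMSE of 1.91 for the next 5 seconds and found that the proposed method had lower error than the best path-prediction methods in previous studies.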

Annually, a large number of injuries and deaths around the world are related to motor vehicle accidents. This value has recently been reduced to some extent, via the use of driver-assistance systems. Developing driver-assistance systems (i.e., automated driving systems) can play a crucial role in reducing this number. Estimating and predicting surrounding vehicles' movement is essential for an automated vehicle and advanced safety systems. Moreover, predicting the trajectory is influenced by numerous factors, such as drivers' behavior during accidents, history of the vehicle's movement and the surrounding vehicles, and their position on the traffic scene. The vehicle must move over a safe path in traffic and react to other drivers' unpredictable behaviors in the shortest time. Herein, to predict automated vehicles' path, a model with low computational complexity is proposed, which is trained by images taken from the road's aerial image. Our method is based on an encoder-decoder model that utilizes a social tensor to model the effect of the surrounding vehicles' movement on the target vehicle. The proposed model can predict the vehicle's future path in any freeway only by viewing the images related to the history of the target vehicle's movement and its neighbors. Deep learning was used as a tool for extracting the features of these images. Using the HighD database, an image dataset of the road's aerial image was created, and the model's performance was evaluated on this new database. We achieved the RMSE of 1.91 for the next 5 seconds and found that the proposed method had less error than the best path-prediction methods in previous studies.

Translated: 2021-02-05 02:54:41 Published: 2021-02-02

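# (Reference translation) Ansatz-Independent Variational Quantum Classifier

Ansatz-Independent Variational Quantum Classifier ( http://arxiv.org/abs/2102.01759v1 )

License: CC BY 4.0

Hideyuki Miyahara and Vwani Roychowdhury

(Reference translation) The paradigm of variational quantum classifiers (VQCs) encodes \textit{classical information} as quantum states, followed by quantum processing and then measurements that generate classical predictions. VQCs are promising candidates for making efficient use of near-term quantum devices: a classifier involving $M$-dimensional datasets can be implemented with only $\lceil \log_2 M \rceil$ qubits by using amplitude encoding. However, no general framework for designing and training VQCs has been proposed, and their power and their analytical relationship to classical classifiers are not well understood. Quantum circuit learning (QCL), an encouraging specific embodiment of VQCs, uses an ansatz: it expresses the quantum evolution operator as a circuit with a predetermined topology and parametrized gates, and learns the gate parameters through optimization. In this letter, we first address the open questions about VQCs and show that they, including QCL, fit inside the well-known kernel method. Based on this correspondence, we devise a design framework for efficient ansatz-independent VQCs, which we call the unitary kernel method (UKM); it directly optimizes the unitary evolution operator in a VQC. We thereby show that the performance of QCL is bounded from above by the UKM. Next, we propose a variational circuit realization (VCR) for designing efficient quantum circuits for a given unitary operator. Combining the UKM with the VCR, we establish an efficient framework for constructing high-performing circuits. Finally, we benchmark the performance of the UKM and the VCR through extensive numerical simulations on multiple datasets.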

The paradigm of variational quantum classifiers (VQCs) encodes \textit{classical information} as quantum states, followed by quantum processing and then measurements to generate classical predictions. VQCs are promising candidates for efficient utilization of a near-term quantum device: classifiers involving $M$-dimensional datasets can be implemented with only $\lceil \log_2 M \rceil$ qubits by using an amplitude encoding. A general framework for designing and training VQCs, however, has not been proposed, and their power and analytical relationships with classical classifiers are not well understood. An encouraging specific embodiment of VQCs, quantum circuit learning (QCL), utilizes an ansatz: it expresses the quantum evolution operator as a circuit with a predetermined topology and parametrized gates; training involves learning the gate parameters through optimization. In this letter, we first address the open questions about VQCs and then show that they, including QCL, fit inside the well-known kernel method. Based on such correspondence, we devise a design framework of efficient ansatz-independent VQCs, which we call the unitary kernel method (UKM): it directly optimizes the unitary evolution operator in a VQC. Thus, we show that the performance of QCL is bounded from above by the UKM. Next, we propose a variational circuit realization (VCR) for designing efficient quantum circuits for a given unitary operator. By combining the UKM with the VCR, we establish an efficient framework for constructing high-performing circuits. We finally benchmark the relatively superior performance of the UKM and the VCR via extensive numerical simulations on multiple datasets.
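As a rough illustration of the qubit count quoted in the abstract, the sketch below computes $\lceil \log_2 M \rceil$ and builds the zero-padded, unit-norm amplitude vector for a classical input. The function names are hypothetical and are not taken from the paper; the code only models the encoding step, not the classifier itself.

```python
import math


def qubits_needed(m: int) -> int:
    """Return ceil(log2(m)), the qubit count for amplitude-encoding m features."""
    if m < 1:
        raise ValueError("dimension must be positive")
    # (m - 1).bit_length() equals ceil(log2(m)) for every integer m >= 1.
    return (m - 1).bit_length()


def amplitude_encode(x):
    """Pad x with zeros to length 2**n and rescale it to unit Euclidean norm."""
    n = qubits_needed(len(x))
    norm = math.sqrt(sum(v * v for v in x))
    if norm == 0.0:
        raise ValueError("cannot encode the zero vector")
    padded = list(x) + [0.0] * (2 ** n - len(x))
    return [v / norm for v in padded]
```

Under these assumptions a 784-dimensional input (for example a flattened 28x28 image) would need 10 qubits, since $2^{10} = 1024 \ge 784$.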

翻訳日:2021-02-04 19:14:13 公開日:2021-02-02

# (参考訳) UAVネットワークにおけるデータ駆動ミリ波通信のための分散条件付き敵対的生成ネットワーク(GAN)

Distributed Conditional Generative Adversarial Networks (GANs) for Data-Driven Millimeter Wave Communications in UAV Networks ( http://arxiv.org/abs/2102.01751v1 )

ライセンス: CC BY 4.0

Qianqian Zhang, Aidin Ferdowsi, Walid Saad, Mehdi Bennis

(参考訳) 本稿では,無人航空機(UAV)無線ネットワークにおけるミリ波(mmWave)通信のためのデータ駆動型空対地(A2G)チャネル推定手法を提案する。まず,ミリ波チャネル情報収集に有効なチャネル推定手法を開発し,各UAVは,各ビームフォーミング方向に沿って,条件付き敵対的生成ネットワーク(CGAN)を介してスタンドアロンチャネルモデルを訓練することができる。次に、訓練されたチャネルモデルのアプリケーションシナリオをより広い時空間領域に拡張するために、分散CGANアーキテクチャに基づく協調フレームワークが開発され、各UAVが完全に分散された方法でmmWaveチャネル分布を共同で学ぶことができる。効率的な学習プロセスを保証するために、協調チャネルモデリングの学習率を最大化する最適なUAVネットワークトポロジーの必要十分条件を導出し、その後、分散ネットワーク構造に基づいて、UAV毎の最適CGAN学習ソリューションを特徴付ける。シミュレーションの結果,提案手法は各UAVの局所的トレーニング誤差に頑健であることが判明した。一方、空中ネットワークの規模が大きくなるほど、効率的な学習率を保証するために、UAV当たりの通信資源がより多く必要となる。また,情報共有のないスタンドアローンCGANや,他の2つの分散スキーム,すなわち多識別器CGANとフェデレートCGAN法と比較して,提案手法は,環境学習中に高いモデリング精度を示し,UAVダウンリンクmmWave通信のオンライン性能においてより高い平均データレートを達成することを示す。

In this paper, a novel framework is proposed to perform data-driven air-to-ground (A2G) channel estimation for millimeter wave (mmWave) communications in an unmanned aerial vehicle (UAV) wireless network. First, an effective channel estimation approach is developed to collect mmWave channel information, allowing each UAV to train a stand-alone channel model via a conditional generative adversarial network (CGAN) along each beamforming direction. Next, in order to expand the application scenarios of the trained channel model into a broader spatial-temporal domain, a cooperative framework, based on a distributed CGAN architecture, is developed, allowing each UAV to collaboratively learn the mmWave channel distribution in a fully-distributed manner. To guarantee an efficient learning process, necessary and sufficient conditions for the optimal UAV network topology that maximizes the learning rate for cooperative channel modeling are derived, and the optimal CGAN learning solution per UAV is subsequently characterized, based on the distributed network structure. Simulation results show that the proposed distributed CGAN approach is robust to the local training error at each UAV. Meanwhile, a larger airborne network size requires more communication resources per UAV to guarantee an efficient learning rate. The results also show that, compared with a stand-alone CGAN without information sharing and two other distributed schemes, namely: A multi-discriminator CGAN and a federated CGAN method, the proposed distributed CGAN approach yields a higher modeling accuracy while learning the environment, and it achieves a larger average data rate in the online performance of UAV downlink mmWave communications.

翻訳日:2021-02-04 18:42:31 公開日:2021-02-02

# 二重分散削減によるほぼ最適なオフライン強化学習

Near-Optimal Offline Reinforcement Learning via Double Variance Reduction ( http://arxiv.org/abs/2102.01748v1 )

ライセンス: Link先を確認

Ming Yin, Yu Bai, Yu-Xiang Wang

(参考訳) 我々は、履歴データのみを使用した政策最適化を目的とする、動機付けの明確なRLの設定であるオフライン強化学習(RL)の問題を検討する。適用性は広いが、オフラインRLの理論的理解、例えば最適なサンプル複雑性は、\emph{tabular} Markov Decision Processes (MDPs) のような基本的な設定でも、ほとんど未解決である。本稿では,オフラインRLの新しい分散削減アルゴリズムであるOff-Policy Double Variance Reduction (OPDVR)を提案する。主結果として,OPDVRは有限ホライズン定常遷移設定において,$\widetilde{O}(H^2/d_m\epsilon^2)$エピソードのオフラインデータで$\epsilon$最適ポリシーを証明可能な形で特定することを示す。ここで$H$はホライズン長,$d_m$は行動ポリシーによって引き起こされる最小の周辺状態行動分布である。これは、最もよく知られた上限を$H$の係数で改善する。さらに,$\Omega(H^2/d_m\epsilon^2)$という情報理論的下限を確立し,OPDVRが対数因子を除いて最適であることを証明した。最後に, OPDVR は非定常遷移を持つ有限ホライズン MDP や割引報酬を持つ無限ホライズン MDP などの代替設定下でも, レート最適なサンプル複雑性を達成できることを示す。

We consider the problem of offline reinforcement learning (RL) -- a well-motivated setting of RL that aims at policy optimization using only historical data. Despite its wide applicability, theoretical understandings of offline RL, such as its optimal sample complexity, remain largely open even in basic settings such as \emph{tabular} Markov Decision Processes (MDPs). In this paper, we propose Off-Policy Double Variance Reduction (OPDVR), a new variance reduction based algorithm for offline RL. Our main result shows that OPDVR provably identifies an $\epsilon$-optimal policy with $\widetilde{O}(H^2/d_m\epsilon^2)$ episodes of offline data in the finite-horizon stationary transition setting, where $H$ is the horizon length and $d_m$ is the minimal marginal state-action distribution induced by the behavior policy. This improves over the best known upper bound by a factor of $H$. Moreover, we establish an information-theoretic lower bound of $\Omega(H^2/d_m\epsilon^2)$ which certifies that OPDVR is optimal up to logarithmic factors. Lastly, we show that OPDVR also achieves rate-optimal sample complexity under alternative settings such as the finite-horizon MDPs with non-stationary transitions and the infinite horizon MDPs with discounted rewards.
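Placing the bounds from the abstract side by side makes the optimality claim explicit. The third line is only inferred from the stated factor-of-$H$ improvement and is not quoted in the abstract; $N$, the number of offline episodes, is notation assumed here.

```latex
\begin{align*}
\text{OPDVR upper bound:} \quad & N = \widetilde{O}\!\left(\frac{H^2}{d_m \epsilon^2}\right), \\
\text{information-theoretic lower bound:} \quad & N = \Omega\!\left(\frac{H^2}{d_m \epsilon^2}\right), \\
\text{implied previous upper bound:} \quad & N = \widetilde{O}\!\left(\frac{H^3}{d_m \epsilon^2}\right).
\end{align*}
```

Because the first two expressions agree up to the logarithmic factors hidden in $\widetilde{O}$, the upper bound cannot be improved in its dependence on $H$, $d_m$, or $\epsilon$.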

翻訳日:2021-02-04 17:52:35 公開日:2021-02-02

# 大規模で真のスパースニューラルネットワーク

Truly Sparse Neural Networks at Scale ( http://arxiv.org/abs/2102.01732v1 )

ライセンス: Link先を確認

Selima Curci, Decebal Constantin Mocanu, Mykola Pechenizkiyi

(参考訳) 近年,ニューラルネットワークにおけるトレーニングと推論効率のデファクトなアプローチとして,スパーストレーニング手法が確立されつつある。しかし、この効率性は理論上のものにすぎない。実際には、典型的なディープラーニングソフトウェアとハードウェアが密行列演算に最適化されているため、誰もがバイナリマスクを使用してスパース性をシミュレートしている。本稿では直交的アプローチを採り、真にスパースなニューラルネットワークをトレーニングし、その潜在能力を最大限に活用できることを示す。この目的を達成するために,スパースニューラルネットワーク向けに特別に設計した3つの新しい貢献を導入する。(1)並列学習アルゴリズムとそれに対応するスクラッチからのスパース実装,(2)勾配流を促進する非学習パラメータを持つ活性化関数,(3)冗長性を除去するための隠れニューロン重要度指標である。これらを合わせることで、記録を破り、表現力の観点でこれまでに訓練された最大のニューラルネットワークを訓練することができ、コウモリの脳の規模に到達した。その結果,提案手法は環境にやさしい人工知能時代への道を開きながら,最先端の性能を持つことが示された。

Recently, sparse training methods have started to be established as a de facto approach for training and inference efficiency in artificial neural networks. Yet, this efficiency is just in theory. In practice, everyone uses a binary mask to simulate sparsity since the typical deep learning software and hardware are optimized for dense matrix operations. In this paper, we take an orthogonal approach, and we show that we can train truly sparse neural networks to harvest their full potential. To achieve this goal, we introduce three novel contributions, specially designed for sparse neural networks: (1) a parallel training algorithm and its corresponding sparse implementation from scratch, (2) an activation function with non-trainable parameters to favour the gradient flow, and (3) a hidden neurons importance metric to eliminate redundancies. All in one, we are able to break the record and to train the largest neural network ever trained in terms of representational power -- reaching the bat brain size. The results show that our approach has state-of-the-art performance while opening the path for an environmentally friendly artificial intelligence era.
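The difference between simulating sparsity with a binary mask and storing only the kept weights can be sketched in a few lines. This is a toy illustration in plain Python with hypothetical function names, not the parallel training algorithm or the sparse implementation described in the paper.

```python
def masked_dense_matvec(weights, mask, x):
    """Simulated sparsity: every weight is stored and multiplied, then masked."""
    return [
        sum(w * m * xi for w, m, xi in zip(w_row, m_row, x))
        for w_row, m_row in zip(weights, mask)
    ]


def to_sparse_rows(weights, mask):
    """Keep only the unmasked weights as (column, value) pairs per output row."""
    return [
        [(j, w) for j, (w, m) in enumerate(zip(w_row, m_row)) if m]
        for w_row, m_row in zip(weights, mask)
    ]


def sparse_matvec(sparse_rows, x):
    """True sparsity: work and storage scale with the number of kept weights."""
    return [sum(w * x[j] for j, w in row) for row in sparse_rows]


weights = [[0.5, -1.0, 2.0, 0.0], [1.5, 0.0, -0.5, 3.0]]
mask = [[1, 0, 0, 0], [0, 0, 1, 1]]
x = [1.0, 2.0, 3.0, 4.0]

dense_out = masked_dense_matvec(weights, mask, x)
sparse_rows = to_sparse_rows(weights, mask)
sparse_out = sparse_matvec(sparse_rows, x)
stored_dense = sum(len(row) for row in weights)      # all 8 weights kept in memory
stored_sparse = sum(len(row) for row in sparse_rows)  # only the 3 unmasked weights
```

Both paths produce the same layer output, but the masked version still stores and multiplies all eight weights, which is the gap between theoretical and realized efficiency that the abstract points at.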

翻訳日:2021-02-04 17:48:45 公開日:2021-02-02

# MoonBoardクライミングルート分類と生成のための繰り返しニューラルネットワーク

Recurrent Neural Network for MoonBoard Climbing Route Classification and Generation ( http://arxiv.org/abs/2102.01788v1 )

ライセンス: Link先を確認

Yi-Shiou Duh, Ray Chang

(参考訳) 登山ルートの難易度と新ルートの作成はどちらも困難である。既存の機械学習モデルでは、問題の難易度を正確に予測できないだけでなく、合理的な問題も生成できない。そこで本研究では,人間の登山者の手列を模倣するために開発した新しい移動前処理パイプラインである"betamove"を導入した。事前処理された移動シーケンスは、経路生成器とグレード予測器の両方を訓練するために使用された。ムーンボード問題を適切な移動順序に前処理することで、評価予測器の精度は人間のレベルに近い性能に到達し、経路生成器は以前の作業よりもずっと良い品質の新しい経路を生成する。 BetaMoveでは、機械学習の問題に対する人間の洞察を注入することができ、これが将来の登山スタイルの分類問題における移動学習の基礎となることを実証した。

Classifying the difficulties of climbing routes and generating new routes are both challenging. Existing machine learning models not only fail to accurately predict a problem's difficulty, but they are also unable to generate reasonable problems. In this work, we introduced "BetaMove", a new move preprocessing pipeline we developed, in order to mimic a human climber's hand sequence. The preprocessed move sequences were then used to train both a route generator and a grade predictor. By preprocessing a MoonBoard problem into a proper move sequence, the accuracy of our grade predictor reaches near human-level performance, and our route generator produces new routes of much better quality compared to previous work. We demonstrated that with BetaMove, we are able to inject human insights into the machine learning problems, and this can be the foundations for future transfer learning on climbing style classification problems.

翻訳日:2021-02-04 17:46:52 公開日:2021-02-02

# 情報的手法を用いた絵画の自動分析

Automatic analysis of artistic paintings using information-based measures ( http://arxiv.org/abs/2102.01767v1 )

ライセンス: Link先を確認

Jorge Miguel Silva, Diogo Pratas, Rui Antunes, S\'ergio Matos, and Armando J. Pinho

(参考訳) 芸術コミュニティは、芸術絵画の認証と分類のための自動計算分析にますます依存している。本稿では,物体の特徴の和を定量化する尺度である,その複雑さを分析し,芸術絵画に存在する隠れパターンと関係を同定する。具体的には,正規化圧縮法 (NC) とブロック分解法 (BDM) を91名の著者から集めた4,266点の絵画データセットに適用し,これらの情報に基づく手法が美術絵画の記述子としての可能性を検討する。どちらの尺度も、絵画、作家、芸術運動の類型を一貫して記述している。さらに、NCと絵画の粗さの尺度を組み合わせることで、効率的なスタイリスティックな記述子を作り出す。さらに,各絵画の局所情報を定量化することにより,アーティストの作風やその芸術的影響,共有技術に関する重要な情報を記述する指紋を定義する。より根本的には、この情報は、各著者が一般的にキャンバスにまたがる要素を構成・配布し、それゆえ、どのように作品が知覚されるかを記述する。最後に, 地域的複雑度と2点高さ差相関関数が, 美術絵画の作風と著者分類の方法論を改善する補助的特徴であることを示す。研究全体は、高速な著者特性評価と認証のための広範なウェブサイト(http://panther.web.ua.pt)によってサポートされています。

The artistic community is increasingly relying on automatic computational analysis for authentication and classification of artistic paintings. In this paper, we identify hidden patterns and relationships present in artistic paintings by analysing their complexity, a measure that quantifies the sum of characteristics of an object. Specifically, we apply Normalized Compression (NC) and the Block Decomposition Method (BDM) to a dataset of 4,266 paintings from 91 authors and examine the potential of these information-based measures as descriptors of artistic paintings. Both measures consistently described the equivalent types of paintings, authors, and artistic movements. Moreover, combining the NC with a measure of the roughness of the paintings creates an efficient stylistic descriptor. Furthermore, by quantifying the local information of each painting, we define a fingerprint that describes critical information regarding the artists' style, their artistic influences, and shared techniques. More fundamentally, this information describes how each author typically composes and distributes the elements across the canvas and, therefore, how their work is perceived. Finally, we demonstrate that regional complexity and two-point height difference correlation function are useful auxiliary features that improve current methodologies in style and author classification of artistic paintings. The whole study is supported by an extensive website (http://panther.web.ua.pt) for fast author characterization and authentication.

翻訳日:2021-02-04 17:35:59 公開日:2021-02-02

# 音声認識と翻訳のための多言語TEDxコーパス

The Multilingual TEDx Corpus for Speech Recognition and Translation ( http://arxiv.org/abs/2102.01757v1 )

ライセンス: Link先を確認

Elizabeth Salesky, Matthew Wiesner, Jacob Bremerman, Roldano Cattoni, Matteo Negri, Marco Turchi, Douglas W. Oard, Matt Post

(参考訳) 音声認識(ASR)および音声翻訳(ST)研究を支援するために構築された多言語TEDxコーパスについて述べる。コーパスはTEDxの8つのソース言語による音声録音のコレクションである。書き起こしを文に分割し、ソース言語音声とターゲット言語翻訳に対応させる。コーパスはオープンソースコードとともにリリースされ、新しい講演や言語の拡張が可能になった。コーパス作成手法は,従来よりも多くの言語に適用でき,マルチウェイ並列評価セットを作成することができる。低リソース言語ペアの翻訳性能を改善するための多言語モデルを含む,複数のASRおよびST設定のベースラインを提供する。

We present the Multilingual TEDx corpus, built to support speech recognition (ASR) and speech translation (ST) research across many non-English source languages. The corpus is a collection of audio recordings from TEDx talks in 8 source languages. We segment transcripts into sentences and align them to the source-language audio and target-language translations. The corpus is released along with open-sourced code enabling extension to new talks and languages as they become available. Our corpus creation methodology can be applied to more languages than previous work, and creates multi-way parallel evaluation sets. We provide baselines in multiple ASR and ST settings, including multilingual models to improve translation performance for low-resource language pairs.

翻訳日:2021-02-04 17:34:45 公開日:2021-02-02

# ミニマックス最適化を伴わない連続Wasserstein-2重心推定

Continuous Wasserstein-2 Barycenter Estimation without Minimax Optimization ( http://arxiv.org/abs/2102.01752v1 )

ライセンス: Link先を確認

Alexander Korotin, Lingxiao Li, Justin Solomon, Evgeny Burnaev

(参考訳) Wasserstein重心は、最適輸送に基づく確率測度の重み付き平均の幾何学的概念を提供する。本稿では,離散的であるとは限らない入力測度へのサンプルアクセスが与えられたときにWasserstein-2重心を計算するスケーラブルなアルゴリズムを提案する。過去のアプローチはエントロピー正則化あるいは二次正則化に依存しているが、我々はバイアスの導入を避けるために入力凸ニューラルネットワークとサイクル一貫性正則化を用いる。その結果、我々のアプローチはミニマックス最適化に頼らない。誤差境界に関する理論的分析に加え,低次元の定性的シナリオおよび高次元の定量的実験において提案手法の有効性を示す実証的証拠を示す。

Wasserstein barycenters provide a geometric notion of the weighted average of probability measures based on optimal transport. In this paper, we present a scalable algorithm to compute Wasserstein-2 barycenters given sample access to the input measures, which are not restricted to being discrete. While past approaches rely on entropic or quadratic regularization, we employ input convex neural networks and cycle-consistency regularization to avoid introducing bias. As a result, our approach does not resort to minimax optimization. We provide theoretical analysis on error bounds as well as empirical evidence of the effectiveness of the proposed approach in low-dimensional qualitative scenarios and high-dimensional quantitative experiments.

翻訳日:2021-02-04 17:24:57 公開日:2021-02-02

# 自動走行車からのイベントデータを用いた人工知能システムの信頼性解析

Reliability Analysis of Artificial Intelligence Systems Using Recurrent Events Data from Autonomous Vehicles ( http://arxiv.org/abs/2102.01740v1 )

ライセンス: Link先を確認

Yili Hong and Jie Min and Caleb B. King and William Q. Meeker

(参考訳) 人工知能(AI)システムはますます一般的になり、トレンドは続きます。 AIシステムの例としては、自動運転車(AV)、コンピュータビジョン、自然言語処理、AI医療専門家などがある。安全かつ効果的なAIシステムのデプロイを可能にするためには、そのようなシステムの信頼性を評価する必要がある。従来、信頼性評価は信頼性テストデータとそれに続く統計モデリングと分析に基づいている。しかし、AIシステムのための信頼性データの可用性は、そのようなデータが通常敏感でプロプライエタリであるため、制限されます。カリフォルニア州自動車局(DMV)は、多くのAVメーカーがAVロードテストを行っているAVテストプログラムを監督および規制しています。プログラムに参加するメーカーは、カリフォルニア州のDMVに繰り返しの離脱イベントを報告する必要があります。この情報は一般に公開されています。本稿では、AVにおけるAIシステムの信頼性の表現としてリカレントデエンゲージメントイベントを使用し、AV駆動テストからリカレントイベントデータをモデル化・解析するための統計フレームワークを提案する。ソフトウェア信頼性には従来のパラメトリックモデルを用い,イベントプロセスを記述するために単調スプラインに基づく新しい非パラメトリックモデルを提案する。我々は,最良モデルの選択,不確かさの定量化,イベントプロセスにおける不均一性の検証のための推論手順を開発した。次に、4つのAVメーカから繰り返し発生するイベントデータを解析し、AV内のAIシステムの信頼性を推測する。また,提案分析を他のaiシステムの信頼性評価に適用する方法について述べる。

Artificial intelligence (AI) systems have become increasingly common and the trend will continue. Examples of AI systems include autonomous vehicles (AV), computer vision, natural language processing, and AI medical experts. To allow for safe and effective deployment of AI systems, the reliability of such systems needs to be assessed. Traditionally, reliability assessment is based on reliability test data and the subsequent statistical modeling and analysis. The availability of reliability data for AI systems, however, is limited because such data are typically sensitive and proprietary. The California Department of Motor Vehicles (DMV) oversees and regulates an AV testing program, in which many AV manufacturers are conducting AV road tests. Manufacturers participating in the program are required to report recurrent disengagement events to California DMV. This information is being made available to the public. In this paper, we use recurrent disengagement events as a representation of the reliability of the AI system in AV, and propose a statistical framework for modeling and analyzing the recurrent events data from AV driving tests. We use traditional parametric models in software reliability and propose a new nonparametric model based on monotonic splines to describe the event process. We develop inference procedures for selecting the best models, quantifying uncertainty, and testing heterogeneity in the event process. We then analyze the recurrent events data from four AV manufacturers, and make inferences on the reliability of the AI systems in AV. We also describe how the proposed analysis can be applied to assess the reliability of other AI systems.

翻訳日:2021-02-04 17:23:59 公開日:2021-02-02

# 深層畳み込みニューラルネットワークによるメタサーフェスの相互結合効果予測

Deep Convolutional Neural Networks to Predict Mutual Coupling Effects in Metasurfaces ( http://arxiv.org/abs/2102.01761v1 )

ライセンス: Link先を確認

Sensong An, Bowen Zheng, Mikhail Y. Shalaginov, Hong Tang, Hang Li, Li Zhou, Yunxi Dong, Mohammad Haerinia, Anuradha Murthy Agarwal, Clara Rivero-Baleine, Myungkoo Kang, Kathleen A. Richardson, Tian Gu, Juejun Hu, Clayton Fowler and Hualiang Zhang

(参考訳) メタサーフェスは、コンパクトで大規模な光学デバイスを実現するための新しい有望なプラットフォームを提供してきた。従来のメタサーフェス設計手法では各要素に周期的境界条件を仮定するが、要素間の近接場結合効果は非同一構造に囲まれると変化するため、この仮定はほとんどの場合において不正確である。本稿では,大規模アレイに配置された各ターゲットメタアトムの実際の電磁(EM)応答を,近接場結合効果を考慮して予測する深層学習手法を提案する。予測ニューラルネットワークは、ターゲットのメタアトムとその近傍の物理的仕様を入力として、その位相と振幅をミリ秒で計算する。この手法は, 相互結合によるメタサーフェスの性能劣化の説明に適用でき, さらに最適化アルゴリズムと組み合わせることで効率の最適化にも利用できる。本手法の有効性を実証するため,ビーム偏向器とメタレンズについて,従来の設計手法に比べて効率の大幅な向上を得た。さらに, メタサーフェスの性能と相互結合による設計誤差の相関関係は, 特定の仕様(材料, 形状等)に拘束されないことを示す。そこで本手法は, 相互結合効果を探索し, 種々のメタサーフェス設計の性能を向上させるために, 容易に適用できると考えられる。

Metasurfaces have provided a novel and promising platform for the realization of compact and large-scale optical devices. The conventional metasurface design approach assumes periodic boundary conditions for each element, which is inaccurate in most cases since the near-field coupling effects between elements will change when surrounded by non-identical structures. In this paper, we propose a deep learning approach to predict the actual electromagnetic (EM) responses of each target meta-atom placed in a large array with near-field coupling effects taken into account. The predicting neural network takes the physical specifications of the target meta-atom and its neighbors as input, and calculates its phase and amplitude in milliseconds. This approach can be applied to explain metasurfaces' performance deterioration caused by mutual coupling and further used to optimize their efficiencies once combined with optimization algorithms. To demonstrate the efficacy of this methodology, we obtain large improvements in efficiency for a beam deflector and a metalens over the conventional design approach. Moreover, we show the correlations between a metasurface's performance and its design errors caused by mutual coupling are not bound to certain specifications (materials, shapes, etc.). As such, we envision that this approach can be readily applied to explore the mutual coupling effects and improve the performance of various metasurface designs.

翻訳日:2021-02-04 17:23:17 公開日:2021-02-02

# 展開形ゲームにおけるStackelberg均衡の安全な探索

Safe Search for Stackelberg Equilibria in Extensive-Form Games ( http://arxiv.org/abs/2102.01775v1 )

ライセンス: Link先を確認

Chun Kai Ling, Noam Brown

(参考訳) Stackelberg均衡(Stackelberg equilibrium)は、リーダーがフォロワーに対してコミットメント権を持つ2プレイヤーゲームにおける解概念である。近年では、空港のパトロールや野生動物の密猟防止など、多くのセキュリティアプリケーションの礎となっている。これらの設定の多くは本質的に逐次的であるが、既存の手法は事前に解全体を計算する。本稿では,一般和ゲームにおける Stackelberg 均衡の計算に,追加のオンライン計算を活用して解を改善する探索を適用する,理論的に健全かつ実証的に有効な方法を提案する。リーダーがゲーム全体を前もって解決しようとする代わりに、近似的な"ブループリント"解がまずオフラインで計算され、実際のプレイで遭遇した特定のサブゲームに対してオンラインで改善される。提案する探索手法は事前計算したブループリント戦略を下回らない性能が保証されることを証明し,純粋にオフラインの手法に比べてはるかに大規模なゲームを近似的に解けることを実証的に示す。また,我々の探索操作はより小さなStackelberg問題として定式化できることを示し,本手法が戦略生成に基づく既存のアルゴリズムと相補的であることを示す。

Stackelberg equilibrium is a solution concept in two-player games where the leader has commitment rights over the follower. In recent years, it has become a cornerstone of many security applications, including airport patrolling and wildlife poaching prevention. Even though many of these settings are sequential in nature, existing techniques pre-compute the entire solution ahead of time. In this paper, we present a theoretically sound and empirically effective way to apply search, which leverages extra online computation to improve a solution, to the computation of Stackelberg equilibria in general-sum games. Instead of the leader attempting to solve the full game upfront, an approximate "blueprint" solution is first computed offline and is then improved online for the particular subgames encountered in actual play. We prove that our search technique is guaranteed to perform no worse than the pre-computed blueprint strategy, and empirically demonstrate that it enables approximately solving significantly larger games compared to purely offline methods. We also show that our search operation may be cast as a smaller Stackelberg problem, making our method complementary to existing algorithms based on strategy generation.
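To make the notion of commitment concrete, the following sketch finds the best pure-strategy commitment in a small two-player matrix game. It is a deliberately simplified setting (normal form, pure commitments only, follower ties broken in the leader's favor) and is not the search technique of the paper; the function name and the payoff matrices are made up for illustration.

```python
def pure_stackelberg(leader_payoff, follower_payoff):
    """Best pure commitment for the leader in a two-player matrix game.

    Rows are leader actions, columns are follower actions. The follower
    best-responds to the committed row; ties are broken in the leader's favor.
    Returns (leader_action, follower_action, leader_value).
    """
    best = None
    for i, (l_row, f_row) in enumerate(zip(leader_payoff, follower_payoff)):
        top = max(f_row)
        # Among the follower's best responses, pick the one the leader prefers.
        j = max((c for c, v in enumerate(f_row) if v == top), key=lambda c: l_row[c])
        if best is None or l_row[j] > best[2]:
            best = (i, j, l_row[j])
    return best


# Hypothetical 2x2 game: row 0 strictly dominates row 1 for the leader.
leader = [[2, 4], [1, 3]]
follower = [[1, 0], [0, 1]]
commitment = pure_stackelberg(leader, follower)
```

In this toy game the leader's dominant row 0 leads the follower to column 0 and a leader payoff of 2, whereas committing to the dominated row 1 induces column 1 and a payoff of 3, which is the kind of advantage commitment rights can provide.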

翻訳日:2021-02-04 17:22:36 公開日:2021-02-02

# タイムウィンドウを用いたピックアップ・アンド・デリバリー問題における乗組員スケジューリングのメタヒューリスティック

A metaheuristic for crew scheduling in a pickup-and-delivery problem with time windows ( http://arxiv.org/abs/2102.01780v1 )

ライセンス: Link先を確認

Mauro Lucci, Daniel Sever\'in, Paula Zabala

(参考訳) 車両ルーティングおよび乗務員スケジューリング問題(VRCSP)は、車両群のルートの計画と乗務員のスケジューリングを同時に行う問題であり、車両と乗務員の対応は時間を通じて固定されない。これにより、計画の柔軟性が向上し、車両群をより効率的に利用できるが、その反面、高度な同期が要求される。本研究では,トラックとドライバーを用いて,時間窓付きのピックアップ・アンド・デリバリー要求を所定の計画期間内に満たさなければならないVRCSPを提示する。乗務員は1人または2人のドライバーで構成され、いずれのドライバーも所定の場所の集合で交代できる。さらに、ドライバーは社外のシャトルで場所間を移動することが許され、その追加費用は最小化される。我々の問題はトラックとドライバーに別々のルートを考えるため、乗務員が不可分な単位として扱われる文献中の従来のVRCSPでは考慮されていなかった柔軟性が加わる。この問題には2段階の逐次的アプローチで取り組む。第1段階でトラックルートの集合を計算し、第2段階でトラックルートと整合するドライバールートの集合を得る。後者の段階について、メタヒューリスティックに基づくアルゴリズムを設計し、その性能を評価する。このアルゴリズムは主としてGRASPであり,新しい解の探索が困難になった場合に既に見つかった解を再利用できるようにする摂動手続きを備える。この手続きと、実行不可能な解を修復する別の手続きを併用することで、15都市にまたがる100件の要求と12〜32台のトラック(計画期間に依存)からなるインスタンスで、1時間未満で高品質な解を見つけることができる。また,追加のドライバーを乗せられるようにすることで,1人乗務の場合と比べて外部シャトルのコストが平均約60%減少し,場合によってはこのコストが完全になくなると結論づける。

A vehicle routing and crew scheduling problem (VRCSP) consists of simultaneously planning the routes of a fleet of vehicles and scheduling the crews, where the vehicle-crew correspondence is not fixed through time. This allows greater planning flexibility and a more efficient use of the fleet, but in return, a high degree of synchronisation is demanded. In this work, we present a VRCSP where pickup-and-delivery requests with time windows have to be fulfilled over a given planning horizon by using trucks and drivers. Crews can be composed of 1 or 2 drivers and any of them can be relieved in a given set of locations. Moreover, they are allowed to travel among locations with non-company shuttles, at an additional cost that is minimised. As our problem considers distinct routes for trucks and drivers, we have an additional flexibility not contemplated in previous VRCSPs in the literature, where a crew is handled as an indivisible unit. We tackle this problem with a two-stage sequential approach: a set of truck routes is computed in the first stage and a set of driver routes consistent with the truck routes is obtained in the second one. We design and evaluate the performance of a metaheuristic-based algorithm for the latter stage. Our algorithm is mainly a GRASP with a perturbation procedure that allows reusing solutions already found in case the search for new solutions becomes difficult. This procedure, together with another procedure to repair infeasible solutions, allows us to find high-quality solutions on instances of 100 requests spread across 15 cities with a fleet of 12-32 trucks (depending on the planning horizon) in less than an hour. We also conclude that the possibility of carrying an additional driver leads to a decrease of the cost of external shuttles by about 60% on average with respect to individual crews and, in some cases, to remove this cost completely.

翻訳日:2021-02-04 17:21:56 公開日:2021-02-02

# コスタリカ訛りのスペイン語による子どもの人工音声の生成

Generacion de voces artificiales infantiles en castellano con acento costarricense ( http://arxiv.org/abs/2102.01692v1 )

ライセンス: Link先を確認

Ana Lilia Alvarez-Blanco, Eugenia Cordoba-Warner, Marvin Coto-Jimenez, Vivian Fallas-Lopez, Maribel Morales Rodriguez

(参考訳) 本稿では,隠れマルコフモデルに基づく統計的パラメトリック音声合成の手法を用いて,コスタリカアクセントを用いた人工的な子どもの声生成の最初の経験を評価する。モデル学習に用いる音声サンプルを録音するプロセス、使用する技術の基礎、およびグループの認識を通じて結果の主観評価について説明します。その結果, 孤立した単語で評価した結果の明瞭さは, 参加する子どものグループの声よりも低いことがわかった。同様に、話す人の年齢と性別の検出は、自然な声の録音と比較して、人工音声に大きく影響されます。これらの結果から,新たなデータやプロセスによる今後の発展の数値的基準となるとともに,同じ手法で結果を改善するために,大量のデータを取得する必要性が示唆された。

This article evaluates a first experience of generating artificial children's voices with a Costa Rican accent, using the technique of statistical parametric speech synthesis based on Hidden Markov Models. The process of recording the voice samples used for learning the models, the fundamentals of the technique used and the subjective evaluation of the results through the perception of a group of people is described. The results show that the intelligibility of the results, evaluated in isolated words, is lower than the voices recorded by the group of participating children. Similarly, the detection of the age and gender of the speaking person is significantly affected in artificial voices, relative to recordings of natural voices. These results show the need to obtain larger amounts of data, in addition to becoming a numerical reference for future developments resulting from new data or from processes to improve results in the same technique.

翻訳日:2021-02-04 17:19:38 公開日:2021-02-02

# Apollo:Transferable Architecture Exploration

Apollo: Transferable Architecture Exploration ( http://arxiv.org/abs/2102.01723v1 )

ライセンス: Link先を確認

Amir Yazdanbakhsh, Christof Angermueller, Berkin Akin, Yanqi Zhou, Albin Jones, Milad Hashemi, Kevin Swersky, Satrajit Chatterjee, Ravi Narayanaswami, James Laudon

(参考訳) 迫りつつあるムーアの法則の終焉とディープラーニング利用の増加は、特定のニューラルアーキテクチャに最適化されたカスタムアクセラレータの設計を促進している。このようなアクセラレータのアーキテクチャ探索は、複雑で高次元かつ構造化された入力空間上の、目的関数の評価コストが高い困難な制約付き最適化問題となる。既存のアクセラレータ設計のアプローチはサンプル効率が悪く、面積やレイテンシの予算、ニューラルアーキテクチャの構成など、異なる設計制約を持つ関連する最適化タスク間で知識を転移しない。本研究では, ブラックボックス関数最適化の最近の進歩を活用して, サンプル効率の高いアクセラレータ設計を行う転移可能なアーキテクチャ探索フレームワークApolloを提案する。このフレームワークを使用して、代替的な設計制約の下で多様なニューラルアーキテクチャのアクセラレータ構成を最適化する。我々のフレームワークは,ベースラインのブラックボックス最適化手法よりも高いサンプル効率で,高報酬の設計構成(最大24.6%のスピードアップ)を見つけることを示す。さらに、異なる設計制約を持つターゲットアーキテクチャ間で知識を転移することで、Apolloは最適な構成をより速く、しばしばより良い目的関数値(最大25%の改善)で見つけられることを示す。この有望な成果は、より高品質なアクセラレータの生成を促進するための有望な道筋を示している。

The looming end of Moore's Law and the ascending use of deep learning drive the design of custom accelerators that are optimized for specific neural architectures. Architecture exploration for such accelerators forms a challenging constrained optimization problem over a complex, high-dimensional, and structured input space with a costly-to-evaluate objective function. Existing approaches for accelerator design are sample-inefficient and do not transfer knowledge between related optimization tasks with different design constraints, such as area and/or latency budget, or neural architecture configurations. In this work, we propose a transferable architecture exploration framework, dubbed Apollo, that leverages recent advances in black-box function optimization for sample-efficient accelerator design. We use this framework to optimize accelerator configurations of a diverse set of neural architectures with alternative design constraints. We show that our framework finds high reward design configurations (up to 24.6% speedup) more sample-efficiently than a baseline black-box optimization approach. We further show that by transferring knowledge between target architectures with different design constraints, Apollo is able to find optimal configurations faster and often with better objective value (up to 25% improvements). This encouraging outcome portrays a promising path forward to facilitate generating higher quality accelerators.

翻訳日:2021-02-04 17:14:06 公開日:2021-02-02

# 心電図信号と心臓音の同期を用いた新しいトランスファー学習型心疾患スクリーニング法

A Novel Transfer Learning-Based Approach for Screening Pre-existing Heart Diseases Using Synchronized ECG Signals and Heart Sounds ( http://arxiv.org/abs/2102.01728v1 )

ライセンス: Link先を確認

Ramith Hettiarachchi, Udith Haputhanthri, Kithmini Herath, Hasindu Kariyawasam, Shehan Munasinghe, Kithmin Wickramasinghe, Duminda Samarasinghe, Anjula De Silva and Chamira Edussooriya

(参考訳) 既存の心疾患を人生の早期に診断することは、肺高血圧症、心臓リズム障害、血栓症、心不全、突然の心停止などの合併症の予防に役立つため重要である。このような疾患を識別するために、心音図(PCG)および心電図(ECG)波形は重要な情報を伝達する。したがって、これらの2種類のデータの有効利用は、疾患スクリーニングプロセスを改善する可能性を秘めている。本稿では,PCGとECGを同時取得した記録を含むPhysioNet Challenge 2016 Datasetのサブセット上で,この仮説を評価する。我々の新しいDual-Convolutional Neural Networkベースのアプローチは、トランスファーラーニングを使用して、公開されているPCGとECGの同時記録データが限られているという問題に対処しつつ、より大規模なデータセットに適応する可能性を持つ。また、レコード単位(record-wise)評価とサンプル単位(sample-wise)評価という2つの主要な評価フレームワークを導入し、トランスファーラーニングアプローチの豊富な性能評価を可能にする。単一・二重モダリティのデータを用いた手法との比較により,本手法が性能向上につながることが示された。さらに,個別に収集されたECG波形やPCG波形は転移可能な特徴を提供でき,限られた数の同期PCG・ECG波形を有効に活用しながら有意な分類性能を達成できることが示された。

Diagnosing pre-existing heart diseases early in life is important as it helps prevent complications such as pulmonary hypertension, heart rhythm problems, blood clots, heart failure and sudden cardiac arrest. To identify such diseases, phonocardiogram (PCG) and electrocardiogram (ECG) waveforms convey important information. Therefore, effectively using these two modalities of data has the potential to improve the disease screening process. Here, we evaluate this hypothesis on a subset of the PhysioNet Challenge 2016 Dataset which contains simultaneously acquired PCG and ECG recordings. Our novel Dual-Convolutional Neural Network based approach uses transfer learning to tackle the problem of having limited amounts of simultaneous PCG and ECG data that is publicly available, while having the potential to adapt to larger datasets. In addition, we introduce two main evaluation frameworks named record-wise and sample-wise evaluation which leads to a rich performance evaluation for the transfer learning approach. Comparisons with methods which used single or dual modality data show that our method can lead to better performance. Furthermore, our results show that individually collected ECG or PCG waveforms are able to provide transferable features which could effectively help to make use of a limited number of synchronized PCG and ECG waveforms and still achieve significant classification performance.

翻訳日:2021-02-04 17:13:26 公開日:2021-02-02

# FedProf: 動的データプロファイリングによるフェデレーション学習の最適化

FedProf: Optimizing Federated Learning with Dynamic Data Profiling ( http://arxiv.org/abs/2102.01733v1 )

ライセンス: Link先を確認

Wentai Wu, Ligang He, Weiwei Lin, Rui Mao, Chenlin Huang and Wei Song

(参考訳) フェデレートラーニング(FL)は、エンドデバイス(すなわちクライアント)上でのみローカルにアクセス可能な分散データから学ぶためのプライバシー保護ソリューションとして大きな可能性を示している。しかし、多くのシナリオでは、クライアントの大部分が、バイアス、ノイズ、あるいは無関係な低品質のデータのみを保有している可能性が高い。その結果、こうしたクライアントは、構築しようとするグローバルモデルの品質を大幅に低下させ、FLの過程でその収束を遅らせる可能性がある。そこで本稿では,データプライバシを侵害することなく、このような状況下でFLを最適化する新しい手法を提案する。このアプローチの鍵となるのは、各クライアントとサーバ上でモデル・データのフットプリントを生成する動的データプロファイリング手法である。フットプリントは、モデルの最初の全結合層(FC-1)の出力分布に基づいて、対応するデータパーティション上のグローバルモデルの表現を符号化する。クライアントとサーバのフットプリントを照合することで、各FLラウンドに各クライアントが参加する機会を適応的に調整し、低品質データを持つクライアントの影響を軽減する。我々は,様々なFL設定を用いて公開データセット上で広範な実験を行った。その結果,グローバルモデルが収束するために必要なラウンド数(最大75\%)と全体時間(最大68\%)を大幅に削減するとともに,グローバルモデルの精度を最大2.5\%向上させることができた。

Federated Learning (FL) has shown great potential as a privacy-preserving solution to learning from decentralized data which are only accessible locally on end devices (i.e., clients). In many scenarios, however, a large proportion of the clients are probably in possession of only low-quality data that are biased, noisy or even irrelevant. As a result, they could significantly degrade the quality of the global model we aim to build and slow down its convergence in the course of FL. In light of this, we propose a novel approach to optimizing FL under such circumstances without breaching data privacy. The key to our approach is a dynamic data profiling method for generating model-data footprints on each client and the server. The footprint encodes the representation of the global model on the corresponding data partition based on the output distribution of the model's first fully-connected layer (FC-1). By matching the footprints from clients and the server, we adaptively adjust each client's opportunity of participation in each FL round to mitigate the impact from the clients with low-quality data. We have conducted extensive experiments on public data sets using various FL settings. Results show that our method significantly reduces the number of rounds (by up to 75\%) and overall time (by up to 68\%) required to have the global model converge while increasing the global model's accuracy by up to 2.5\%.
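
The footprint-matching idea in the abstract above can be illustrated with a small sketch. Everything concrete here is an assumption made for illustration and is not taken from the paper: the footprint is modeled as a per-unit Gaussian (mean, variance) of FC-1 outputs, the mismatch is an averaged KL divergence, and the participation probability decays exponentially with that mismatch. The names `footprint`, `divergence`, `participation_probs` and the parameter `alpha` are hypothetical.

```python
import math

def footprint(activations):
    """Per-unit (mean, variance) of FC-1 outputs.

    `activations` is a list of rows, one row of FC-1 outputs per sample.
    Modeling each unit as a Gaussian is an assumption of this sketch.
    """
    n = len(activations)
    fp = []
    for col in zip(*activations):
        mu = sum(col) / n
        var = sum((v - mu) ** 2 for v in col) / n + 1e-8  # avoid zero variance
        fp.append((mu, var))
    return fp

def gaussian_kl(p, q):
    """KL divergence KL(p || q) between two univariate Gaussians (mean, variance)."""
    (m1, v1), (m2, v2) = p, q
    return 0.5 * (math.log(v2 / v1) + (v1 + (m1 - m2) ** 2) / v2 - 1.0)

def divergence(client_fp, server_fp):
    """Average per-unit KL divergence between a client and the server footprint."""
    return sum(gaussian_kl(c, s) for c, s in zip(client_fp, server_fp)) / len(server_fp)

def participation_probs(divs, alpha=1.0):
    """Hypothetical rule: selection probability proportional to exp(-alpha * divergence)."""
    weights = [math.exp(-alpha * d) for d in divs]
    total = sum(weights)
    return [w / total for w in weights]
```

Under these assumptions, a client whose footprint is far from the server's is sampled less often but is never excluded outright; the exact matching and scheduling rules used by FedProf may differ.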

翻訳日:2021-02-04 17:12:42 公開日:2021-02-02

# 条件にまたがるロバストな性能を持つ話者照合バックエンド

A Speaker Verification Backend with Robust Performance across Conditions ( http://arxiv.org/abs/2102.01760v1 )

ライセンス: Link先を確認

Luciana Ferrer, Mitchell McLaren, Niko Brummer

(参考訳) 本稿では,開発中の未知・未知の状況における話者検証の問題について述べる。話者照合の標準的な方法は、ディープニューラルネットワークを用いて話者埋め込みを抽出し、確率線形判別分析(plda)とグローバルロジスティック回帰スコア校正からなるバックエンドで処理することである。この方法は、キャリブレーションモデルのトレーニングに使用されるものと異なる条件でうまく動作しないシステムをもたらすことが知られている。入力条件に適応するために、持続時間などの自動抽出側情報を用いた適応キャリブレータを導入し、標準バックエンドの修正を提案します。バックエンドはバイナリのクロスエントロピーを最適化するために差別的に訓練される。話者に対してのみラベル付けされた多数の多様なデータセットでトレーニングされた場合、提案されているバックエンドは一貫して、場合によっては標準のpldaアプローチと比較して、いくつかの保持されたデータセットでキャリブレーションを劇的に改善する。差別性能も一貫して向上します。 PLDAと適応キャリブレータの併用訓練は必須であり,PLDAの凍結やキャリブレータの微調整では同様の効果が得られない。私たちの知る限り、本論文の結果は、さまざまな条件下で安定したアウトオブボックスのパフォーマンスを持つスピーカー検証システムを開発することができるという文献の最初の証拠です。

In this paper, we address the problem of speaker verification in conditions unseen or unknown during development. A standard method for speaker verification consists of extracting speaker embeddings with a deep neural network and processing them through a backend composed of probabilistic linear discriminant analysis (PLDA) and global logistic regression score calibration. This method is known to result in systems that work poorly on conditions different from those used to train the calibration model. We propose to modify the standard backend, introducing an adaptive calibrator that uses duration and other automatically extracted side-information to adapt to the conditions of the inputs. The backend is trained discriminatively to optimize binary cross-entropy. When trained on a number of diverse datasets that are labeled only with respect to speaker, the proposed backend consistently and, in some cases, dramatically improves calibration, compared to the standard PLDA approach, on a number of held-out datasets, some of which are markedly different from the training data. Discrimination performance is also consistently improved. We show that joint training of the PLDA and the adaptive calibrator is essential -- the same benefits cannot be achieved when freezing PLDA and fine-tuning the calibrator. To our knowledge, the results in this paper are the first evidence in the literature that it is possible to develop a speaker verification system with robust out-of-the-box performance on a large variety of conditions.

翻訳日:2021-02-04 17:12:00 公開日:2021-02-02

# re-diffusion glioma growth modelの初期条件評価:translational mri/histology (in)validation study

Initial condition assessment for reaction-diffusion glioma growth models: A translational MRI/histology (in)validation study ( http://arxiv.org/abs/2102.01719v1 )

ライセンス: Link先を確認

Corentin Martens, Laetitia Lebrun, Christine Decaestecker, Thomas Vandamme, Yves-R\'emi Van Eycke, Antonin Rovai, Thierry Metens, Olivier Debeir, Serge Goldman, Isabelle Salmon, Gaetan Van Simaeys

(参考訳) 拡散性グリオーマは高浸潤性腫瘍であり、早期診断とフォローアップは通常磁気共鳴画像(MRI)に依存する。しかし、この技術の感度が限られているため、グリオーマ細胞浸潤の程度を直接評価することは不可能であり、最適以下の治療計画につながる。反応拡散成長モデルは、MRIで見るマージンを超えてグリオーマ細胞の浸潤を外挿し、その時空間的進化を予測するために何十年も提案されてきた。これらのモデルは、診断時に脳のあらゆる位置における腫瘍細胞密度値である初期状態を必要とする。腫瘍細胞密度関数とMRIで見られる異常のアウトラインを関連付ける研究がいくつか提案されているが、基礎となる仮定は確認されていない。本研究では,3Dプリンティングスライサーを用いたグリオ芽腫を有する非手術脳の立体的組織学的解析により,これらの仮定を検証することを提案する。細胞密度マップは、深層学習アプローチを用いて、組織学的スライドから計算される。次に、密度マップは、死後MR画像に登録され、腫瘍コアへのMR由来測地距離マップと関連付けられる。 T2 FLAIR MRIで見られる浮腫アウトラインとコアの距離との関係についても検討した。以上の結果から, (i) 腫瘍コアまでの距離で腫瘍細胞密度が指数関数的に減少することは理にかなわないが, (ii) 浮腫アウトラインは一般に細胞密度 iso-contour と一致せず, (iii) これらのアウトラインで一般的に採用されている腫瘍細胞密度値が過大評価される可能性が示唆された。これらの知見は、従来のmriを用いたグリオーマ細胞密度マップの導出の限界を浮き彫りにして、他の方法による反応拡散成長モデルの初期化と臨床応用の必要性を指摘している。

Diffuse gliomas are highly infiltrative tumors whose early diagnosis and follow-up usually rely on magnetic resonance imaging (MRI). However, the limited sensitivity of this technique makes it impossible to directly assess the extent of the glioma cell invasion, leading to sub-optimal treatment planning. Reaction-diffusion growth models have been proposed for decades to extrapolate glioma cell infiltration beyond margins visible on MRI and predict its spatial-temporal evolution. These models nevertheless require an initial condition, that is, the tumor cell density values at every location of the brain at diagnosis time. Several works have proposed to relate the tumor cell density function to abnormality outlines visible on MRI but the underlying assumptions have never been verified so far. In this work we propose to verify these assumptions by stereotactic histological analysis of a non-operated brain with glioblastoma using a tailored 3D-printed slicer. Cell density maps are computed from histological slides using a deep learning approach. The density maps are then registered to a postmortem MR image and related to an MR-derived geodesic distance map to the tumor core. The relation between the edema outlines visible on T2 FLAIR MRI and the distance to the core is also investigated. Our results suggest that (i) the previously suggested exponential decrease of the tumor cell density with the distance to the tumor core is not unreasonable but (ii) the edema outlines may in general not correspond to a cell density iso-contour and (iii) the commonly adopted tumor cell density value at these outlines is likely overestimated. These findings highlight the limitations of using conventional MRI to derive glioma cell density maps and point out the need to validate other methods to initialize reaction-diffusion growth models and make them usable in clinical practice.

翻訳日:2021-02-04 17:03:38 公開日:2021-02-02

# 質問プールウェブサイトにおける学生エンゲージメント・ムードのドロップアウト予測

Characterizing Student Engagement Moods for Dropout Prediction in Question Pool Websites ( http://arxiv.org/abs/2102.00423v2 )

ライセンス: Link先を確認

Reza Hadi Mogavi, Xiaojuan Ma, Pan Hui

(参考訳) 問題ベース学習(英語: problem-based learning, pbl)は、問題解決によるハンズオントレーニングを支援する、一般的な指導手法である。 LeetCode、Code Chef、Math Playgroundといった質問プールのウェブサイト(QP)は、学生に本物で多様な、文脈に応じた質問を提供することでPBLを支援する。いずれにせよ、QPに登録されている学生の40%から80%は2ヶ月以内に退学している。本研究は,学生の参加感情を活用し,qpsからの学生の退学を理解・予測する最初の試みである。データ駆動型アプローチを採用することで、QP学生にとって5つの異なるエンゲージメント・ムード、すなわちチャレンジ・シーカー、主題シーカー、興味シーカー、喜びシーカー、非シーカーを識別する。学生は、各エンゲージメントのムードで質問に答える集団的な選好を持ち、その選好からの逸脱は、退学する確率を著しく高めている。最後に、この論文はQPの学生のドロップアウトを予測するための新しいハイブリッド機械学習モデル(我々はDropout-Plusと呼ぶ)を導入することで貢献します。テストの結果、中国で人気のqpで1万人近い学生がおり、dropout-plusは、精度、f1-measure、aucの点でライバルアルゴリズムのドロップアウト予測性能を上回っている。学生のドロップアウトを減らすために、QPマネージャーやオンライン学習の専門家にデザイン提案を行うことで、作業をまとめています。

Problem-Based Learning (PBL) is a popular approach to instruction that supports students to get hands-on training by solving problems. Question Pool websites (QPs) such as LeetCode, Code Chef, and Math Playground help PBL by supplying authentic, diverse, and contextualized questions to students. Nonetheless, empirical findings suggest that 40% to 80% of students registered in QPs drop out in less than two months. This research is the first attempt to understand and predict student dropouts from QPs via exploiting students' engagement moods. Adopting a data-driven approach, we identify five different engagement moods for QP students, which are namely challenge-seeker, subject-seeker, interest-seeker, joy-seeker, and non-seeker. We find that students have collective preferences for answering questions in each engagement mood, and deviation from those preferences increases their probability of dropping out significantly. Last but not least, this paper contributes by introducing a new hybrid machine learning model (we call Dropout-Plus) for predicting student dropouts in QPs. The test results on a popular QP in China, with nearly 10K students, show that Dropout-Plus can exceed the rival algorithms' dropout prediction performance in terms of accuracy, F1-measure, and AUC. We wrap up our work by giving some design suggestions to QP managers and online learning professionals to reduce their student dropouts.

翻訳日:2021-02-04 12:49:52 公開日:2021-02-02

# 化学空間探索のためのディープニューラルネットワークを用いた遺伝的アルゴリズムの再現性に関する研究

A reproducibility study of "Augmenting Genetic Algorithms with Deep Neural Networks for Exploring the Chemical Space" ( http://arxiv.org/abs/2102.00700v2 )

ライセンス: Link先を確認

Kevin Maik Jablonka, Fergus Mcilwaine, Susana Garcia, Berend Smit, Brian Yoo

(参考訳) Nigamら。 SELFIES表現を利用した遺伝的アルゴリズム(GA)を報告し、生成された分子の多様性を改善するための適応型ニューラルネットワークベースのペナルティを提案する。この論文の主な主張は、このGAは他の生成技術(罰則化されたlogPによって測定される)を上回っ、ニューラルネットワークベースの適応ペナルティが生成された分子の多様性を増加させることである。本研究では,それらの主張の再現性を検討した。全体としては、SELFIESベースのGAを用いて同等の結果を再現することができたが、ほとんどは(容易に最適化可能な)フィットネス機能の欠如(すなわち、長い硫黄を含む鎖を生成する)を利用していた。また, 判別器を用いて, 分子の発生を基準セットと類似するものに偏見を与えることができることを示す結果も再現した。さらに,多様性の進化を定量化し,いくつかのハイパーパラメータの影響を理解し,適応的ペナルティの改善を提案する。

Nigam et al. reported a genetic algorithm (GA) utilizing the SELFIES representation and also proposed an adaptive, neural network-based penalty that is supposed to improve the diversity of the generated molecules. The main claims of the paper are that this GA outperforms other generative techniques (as measured by the penalized logP) and that a neural network-based adaptive penalty increases the diversity of the generated molecules. In this work, we investigated the reproducibility of their claims. Overall, we were able to reproduce comparable results using the SELFIES-based GA, but mostly by exploiting deficiencies of the (easily optimizable) fitness function (i.e., generating long, sulfur-containing chains). In addition, we also reproduce results showing that the discriminator can be used to bias the generation of molecules to ones that are similar to the reference set. Moreover, we also attempted to quantify the evolution of the diversity, understand the influence of some hyperparameters, and propose improvements to the adaptive penalty.

翻訳日:2021-02-04 12:49:03 公開日:2021-02-02

# (参考訳) スマートグリッドデバイスの確率的ブールネットワークモデルによる強化学習

Reinforcement Learning with Probabilistic Boolean Network Models of Smart Grid Devices ( http://arxiv.org/abs/2102.01297v1 )

ライセンス: CC BY 4.0

Pedro J. Rivera Torres, Carlos Gershenson Garc\'ia, Samir Kanaan Izquierdo

(参考訳) スマートパワーグリッドの領域は、常に効率と回復力を向上させ、高品質な電力を保護し、抵抗グリッドで、障害を管理し、障害を回避する必要がある。これを実現するには、高いコンポーネント信頼性、適切なメンテナンス、および研究された障害発生が必要です。正しいシステム操作には、これらのアクティビティと、障害や障害を検出し、分類し、分離するための新しい方法論、予測アルゴリズムと分析(データ分析と資産条件を使用してアクティビティを計画および実行)によるプロセスをモデル化およびシミュレートする。本稿では,複雑な適応型自己組織型モデリング手法であるProbabilistic Boolean Networks (PBN) の応用を,スマートグリッドデバイスのダイナミクスの理解と,その動作のモデル化と特性評価の方法として紹介する。この研究は、PBNが標準的な強化学習サイクルと同等であることを示しています。エージェント/モデルは環境と相互作用し、報酬信号の形でフィードバックを受け取ります。好みの行動を特徴付けるために、異なる報酬構造が作成されました。この情報は、故障状況や故障を避けるためにPBNを導くために使用できます。

The area of Smart Power Grids needs to constantly improve its efficiency and resilience, to provide high quality electrical power, in a resistant grid, managing faults and avoiding failures. Achieving this requires high component reliability, adequate maintenance, and a studied failure occurrence. Correct system operation involves those activities, and novel methodologies to detect, classify, and isolate faults and failures, model and simulate processes with predictive algorithms and analytics (using data analysis and asset condition to plan and perform activities). We showcase the application of a complex-adaptive, self-organizing modeling method, Probabilistic Boolean Networks (PBN), as a way towards the understanding of the dynamics of smart grid devices, and to model and characterize their behavior. This work demonstrates that PBNs are equivalent to the standard Reinforcement Learning Cycle, in which the agent/model has an interaction with its environment and receives feedback from it in the form of a reward signal. Different reward structures were created in order to characterize preferred behavior. This information can be used to guide the PBN to avoid fault conditions and failures.

翻訳日:2021-02-04 12:25:12 公開日:2021-02-02

# (参考訳) 未知の出現点と不出現点を持つ信号の時間内最適逐次検出

Optimal Sequential Detection of Signals with Unknown Appearance and Disappearance Points in Time ( http://arxiv.org/abs/2102.01310v1 )

ライセンス: CC BY 4.0

Alexander G. Tartakovsky, Nikita R. Berenkov, Alexei E. Kolessa, and Igor V. Nikiforov

(参考訳) 本論文は,変化の持続時間が有限かつ未知であると仮定して,逐次的な変化点検出問題に対処する。この問題は、信号や画像処理など、時間や空間の未知の点に信号が現れて消滅する多くのアプリケーションにとって重要である。与えられた平均走行距離の検出までの遅延を誤報に最小化する必要がある最短変化検出における従来の最適度基準とは対照的に、所定のウィンドウにおける誤報の局所最大確率に対する所定の時間(または空間)ウィンドウにおける検出の最小確率を最大化する信頼性の高い最大変化検出基準に焦点を当てる。最適な検出手順は、変更されたCUSUM手順であることを示します。次に、この最適手順の動作特性と、FMA(Finite moving Average)検出アルゴリズムとモンテカルロシミュレーションを用いた通常のCUSUM手順とを比較し、通常、後者のアルゴリズムは最適手法とほぼ同等の性能を持つことを示す。同時に、FMA手順には、通常不明な信号の強度への依存という大きな利点があります。最後に、FMAアルゴリズムを用いて光学画像中の衛星のかすかなストリークを検出する。

The paper addresses a sequential changepoint detection problem, assuming that the duration of change may be finite and unknown. This problem is of importance for many applications, e.g., for signal and image processing where signals appear and disappear at unknown points in time or space. In contrast to the conventional optimality criterion in quickest change detection that requires minimization of the expected delay to detection for a given average run length to a false alarm, we focus on a reliable maximin change detection criterion of maximizing the minimal probability of detection in a given time (or space) window for a given local maximal probability of false alarm in the prescribed window. We show that the optimal detection procedure is a modified CUSUM procedure. We then compare operating characteristics of this optimal procedure with those of the Finite Moving Average (FMA) detection algorithm, which is popular in engineering, and the ordinary CUSUM procedure using Monte Carlo simulations, which show that typically the latter algorithms have almost the same performance as the optimal one. At the same time, the FMA procedure has a substantial advantage -- independence of the intensity of the signal, which is usually unknown. Finally, the FMA algorithm is applied to detecting faint streaks of satellites in optical images.
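
For readers unfamiliar with the two baseline detectors named in the abstract above, here is a minimal sketch of an ordinary one-sided CUSUM and a Finite Moving Average (FMA) detector. It is not the modified CUSUM that the paper proves optimal. The function names, the `drift` reference value, the uniform FMA window weights and the thresholds are illustrative assumptions.

```python
def cusum_alarm(xs, drift, threshold):
    """Ordinary one-sided CUSUM.

    The statistic accumulates (x - drift), is reset to zero whenever it
    would go negative, and raises an alarm at the first index where it
    reaches `threshold`. Returns that index, or None if no alarm occurs.
    """
    stat = 0.0
    for n, x in enumerate(xs):
        stat = max(0.0, stat + x - drift)
        if stat >= threshold:
            return n
    return None

def fma_alarm(xs, window, threshold):
    """Finite Moving Average detector with uniform weights (an assumption).

    Raises an alarm at the first index where the sum of the last `window`
    samples reaches `threshold`. Returns that index, or None.
    """
    for n in range(window - 1, len(xs)):
        if sum(xs[n - window + 1 : n + 1]) >= threshold:
            return n
    return None

# A signal that appears at index 10 and disappears at index 15.
signal = [0.0] * 10 + [1.0] * 5 + [0.0] * 10
first_cusum = cusum_alarm(signal, drift=0.5, threshold=2.0)
first_fma = fma_alarm(signal, window=3, threshold=3.0)
```

Note that the CUSUM reference value `drift` is normally tied to the assumed signal intensity, whereas this FMA statistic only needs a window length and a threshold, which is in the spirit of the advantage the abstract attributes to FMA.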

翻訳日:2021-02-04 12:07:24 公開日:2021-02-02

# (参考訳) MPCを用いた安定制約マルコフ決定過程

Stability-Constrained Markov Decision Processes Using MPC ( http://arxiv.org/abs/2102.01383v1 )

ライセンス: CC BY-SA 4.0

Mario Zanon, S\'ebastien Gros, Michele Palladino

(参考訳) 本稿では,結果として生じる政策が安定化しているという制約の下で,割引マルコフ決定プロセス(MDP)の解決を検討する。実際には、MPPは何らかの政策近似に基づいて解決される。我々は、モデル予測制御(MPC)を強化学習の文脈における構造化ポリシーとして活用することを提案する最近の結果を活用し、MPCベースのポリシー内での安定性要件を直接導入できるようにする。これは、建設による政策の安定化にMDPのソリューションを制限します。 MPCの安定性理論は、比類のないMPCの場合で最も成熟している。したがって、我々はまず、安定した割引MDPを無数に再フォーマットできることを本論文で示します。この観察は、安定要件のあるMPCベースの政策が、安定であれば、割引されたMDPの最適政策と、そうでなければ最良の安定化政策を生み出すことを要求する。

In this paper, we consider solving discounted Markov Decision Processes (MDPs) under the constraint that the resulting policy is stabilizing. In practice MDPs are solved based on some form of policy approximation. We will leverage recent results proposing to use Model Predictive Control (MPC) as a structured policy in the context of Reinforcement Learning to make it possible to introduce stability requirements directly inside the MPC-based policy. This will restrict the solution of the MDP to stabilizing policies by construction. The stability theory for MPC is most mature for the undiscounted MPC case. Hence, we will first show in this paper that stable discounted MDPs can be reformulated as undiscounted ones. This observation will entail that the MPC-based policy with stability requirements will produce the optimal policy for the discounted MDP if it is stable, and the best stabilizing policy otherwise.

翻訳日:2021-02-04 11:48:53 公開日:2021-02-02

# (参考訳) 地球磁場モデリングと球状高調波分解による予測

Global Earth Magnetic Field Modeling and Forecasting with Spherical Harmonics Decomposition ( http://arxiv.org/abs/2102.01447v1 )

ライセンス: CC BY 4.0

Panagiotis Tigas and T\'eo Bloch and Vishal Upendran and Banafsheh Ferdoushi and Mark C. M. Cheung and Siddha Ganju and Ryan M. McGranaghan and Yarin Gal and Asti Bhatt

(参考訳) 太陽風による大域磁場の摂動のモデル化と予測は、オープンな課題である。現在のアプローチは、MHD(Magneticohydrodynamics)モデルのような計算に要求されるモデルのシミュレーションや、スパース基底局(SuperMAG)を通して空間的および時間的にサンプリングに依存する。本稿では、Spherical Harmonicsスペース2で予測するディープラーニングモデルを開発し、MHDモデルへの依存を置き換え、1分間のケイデンスでグローバルカバレッジを提供し、機能工学に依存する現在の最新技術を改善する。超磁気データセット(14.53%改善)とmhdシミュレーション(24.35%改善)の性能評価を行った。さらに,sparse ground-based station (supermag) に基づく球面高調波再構成の補間性能を評価し,mhdシミュレーションにより球面高調波が確実に大域磁場を再構成できることを示した。

Modeling and forecasting the solar wind-driven global magnetic field perturbations is an open challenge. Current approaches depend on simulations of computationally demanding models like the Magnetohydrodynamics (MHD) model or sampling spatially and temporally through sparse ground-based stations (SuperMAG). In this paper, we develop a Deep Learning model that forecasts in Spherical Harmonics space, replacing reliance on MHD models and providing global coverage at one minute cadence, improving over the current state-of-the-art which relies on feature engineering. We evaluate the performance on the SuperMAG dataset (improved by 14.53%) and MHD simulations (improved by 24.35%). Additionally, we evaluate the extrapolation performance of the spherical harmonics reconstruction based on sparse ground-based stations (SuperMAG), showing that spherical harmonics can reliably reconstruct the global magnetic field as evaluated on MHD simulation.

翻訳日:2021-02-04 11:27:39 公開日:2021-02-02

# (参考訳) 記憶に強い適応型OCO

Strongly Adaptive OCO with Memory ( http://arxiv.org/abs/2102.01623v1 )

ライセンス: CC BY 4.0

Zhiyu Zhang, Ashok Cutkosky, Ioannis Ch. Paschalidis

(参考訳) オンライン制御の最近の進歩は、予測履歴に依存する損失関数を持つ標準オンライン学習問題の変種であるメモリによるオンライン学習を普及させました。本稿では,この問題に対する最初の強適応アルゴリズムを提案する。任意の区間$\mathcal{i}\subset[1:t]$において,提案アルゴリズムは,その区間における最善の固定コンパレータに対して$\tilde o\left(\sqrt{|\mathcal{i}|}\right)$ポリシー後悔を達成する。オンライン制御技術と組み合わせ、アルゴリズムは線形時間変位システムの制御に縛られる強い適応的な後悔をもたらします。

Recent progress in online control has popularized online learning with memory, a variant of the standard online learning problem with loss functions dependent on the prediction history. In this paper, we propose the first strongly adaptive algorithm for this problem: on any interval $\mathcal{I}\subset[1:T]$, the proposed algorithm achieves $\tilde O\left(\sqrt{|\mathcal{I}|}\right)$ policy regret against the best fixed comparator for that interval. Combined with online control techniques, our algorithm results in a strongly adaptive regret bound for the control of linear time-varying systems.

翻訳日:2021-02-04 11:17:15 公開日:2021-02-02

# ユビキタスエッジaiのためのtinyml

TinyML for Ubiquitous Edge AI ( http://arxiv.org/abs/2102.01255v1 )

ライセンス: Link先を確認

Stanislava Soro

(参考訳) TinyMLは、機械学習、ハードウェア、ソフトウェアの交差点で急速に成長する多分野分野であり、極低電力範囲(mW範囲以下)で動作する組み込み(マイクロコントローラ駆動)デバイスでディープラーニングアルゴリズムを有効にすることに焦点を当てている。 tinymlは、電力効率が高く、コンパクトなディープニューラルネットワークモデルの設計、ソフトウェアフレームワークのサポート、さまざまなカスタマイズされたユビキタスな推論アプリケーションをバッテリ操作されたリソースに制約されたデバイス上で実行可能にする組み込みハードウェアの課題に対処する。本報告では,この分野の拡大を導く主要な課題と技術的実現要因について論じる。 TinyMLは、クラウド処理に依存しないが、分散エッジ推論と自律的な推論で繁栄する、新しいタイプのエッジサービスやアプリケーションへの扉を開く。

TinyML is a fast-growing multidisciplinary field at the intersection of machine learning, hardware, and software, that focuses on enabling deep learning algorithms on embedded (microcontroller powered) devices operating at extremely low power range (mW range and below). TinyML addresses the challenges in designing power-efficient, compact deep neural network models, supporting software framework, and embedded hardware that will enable a wide range of customized, ubiquitous inference applications on battery-operated, resource-constrained devices. In this report, we discuss the major challenges and technological enablers that direct this field's expansion. TinyML will open the door to the new types of edge services and applications that do not rely on cloud processing but thrive on distributed edge inference and autonomous reasoning.

翻訳日:2021-02-04 10:17:00 公開日:2021-02-02

# グラフィカルモデルによるドリフト推定

Drift Estimation with Graphical Models ( http://arxiv.org/abs/2102.01458v1 )

ライセンス: Link先を確認

Luigi Riso and Marco Guerzoni

(参考訳) 本稿では,教師付き機械学習における概念ドリフトの問題を扱う。私たちはグラフィカルモデルを使ってデータの可視構造を解明し、隠れたコンテキストの変化から推測します。従来のコンセプトドリフト検出方法とは異なり、このアプリケーションは特定のターゲット変数で使用される教師付き機械学習モデルに依存しないが、データセットの進化の独立した特性として概念ドリフトを評価しようとする。具体的には、新しいリンクの作成と、異なる期間に既存のリンクの消失を見て、グラフィカルモデルがどのように進化するかを調べる。本稿は,変化を強調し,最終的に時間とともに安定性を評価する指標を提示する手法を提案する。本研究は,オーストラリア電力市場における実世界データを用いた評価手法である。

This paper deals with the issue of concept drift in supervised machine learning. We make use of graphical models to elicit the visible structure of the data and we infer from there changes in the hidden context. Differently from previous concept-drift detection methods, this application does not depend on the supervised machine learning model in use for a specific target variable, but it tries to assess the concept drift as an independent characteristic of the evolution of a dataset. Specifically, we investigate how a graphical model evolves by looking at the creation of new links and the disappearance of existing ones in different time periods. The paper suggests a method that highlights the changes and eventually produces a metric to evaluate the stability over time. The paper evaluates the method with real-world data on the Australian electricity market.
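
The link-based comparison described in the abstract above can be sketched as a comparison of edge sets between two time periods. The abstract does not specify the stability metric; the Jaccard similarity used here is a stand-in chosen for illustration, and the function name `edge_changes` is hypothetical. Edges are treated as undirected.

```python
def edge_changes(old_edges, new_edges):
    """Compare the edge sets of two graphical models fitted on different periods.

    Returns (created, removed, stability), where `created` and `removed` are
    sets of undirected edges and `stability` is the Jaccard similarity of the
    two edge sets (1.0 means the structure did not change). The Jaccard
    choice is an assumption of this sketch, not the paper's metric.
    """
    old = {frozenset(e) for e in old_edges}
    new = {frozenset(e) for e in new_edges}
    created = new - old
    removed = old - new
    union = old | new
    stability = len(old & new) / len(union) if union else 1.0
    return created, removed, stability
```

A low stability value between consecutive periods would then flag a candidate drift, independently of any downstream supervised model.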

翻訳日:2021-02-04 10:16:26 公開日:2021-02-02

# super-klust: 区分線形分類の別の方法

Super-klust: Another Way of Piecewise Linear Classification ( http://arxiv.org/abs/2102.01571v1 )

ライセンス: Link先を確認

Rahman Salim Zengin (1), Volkan Sezer (1) ((1) Istanbul Technical University)

(参考訳) これまでの研究であるSuper-kアルゴリズムでは,新しい一方向線形分類法が導入された。 super-kアルゴリズムに取り組んでいる間に、voronoi tessellation に基づいた分割線形分類器を得るための、同様の、より単純な方法があることが判明した。アルゴリズムの多次元ボクセル化と期待最大化の段階を距離ベースのクラスタリングアルゴリズム(好ましくはk平均)に置き換えることは、以前のアプローチと同様に機能する。ボキセル化をクラスタリングに置き換えているので、Supervised k Clusters や short Super-klust として、Super-k に関して修正アルゴリズムを名付けることに意義があることがわかりました。 Super-kアルゴリズムと同様に、Super-klustアルゴリズムはVoronoi Tessellationというラベル付きデータをカバーし、その結果を分類するためにtessellationを使用する。実験結果によると、super-klustアルゴリズムはsuper-kアルゴリズムと同様の性能特性を持つ。

With our previous study, the Super-k algorithm, we have introduced a novel way of piecewise-linear classification. While working on the Super-k algorithm, we have found that there is a similar and simpler way to obtain a piecewise-linear classifier based on Voronoi tessellations. Replacing the multidimensional voxelization and expectation-maximization stages of the algorithm with a distance-based clustering algorithm, preferably k-means, works as well as the prior approach. Since we are replacing the voxelization with the clustering, we have found it meaningful to name the modified algorithm, with respect to Super-k, as Supervised k Clusters or in short Super-klust. Similar to the Super-k algorithm, the Super-klust algorithm covers data with a labeled Voronoi tessellation, and uses the resulting tessellation for classification. According to the experimental results, the Super-klust algorithm has performance characteristics similar to those of the Super-k algorithm.
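
One plausible reading of the abstract above is: cluster the training data with k-means, attach a class label to each centroid, and classify a new point by the label of its nearest centroid, i.e. by its cell in the labeled Voronoi tessellation. The sketch below follows that reading. Clustering each class separately, the deterministic initialization, and the names `fit` and `predict` are assumptions of this sketch and may differ from the authors' implementation.

```python
def _sqdist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def _kmeans(points, k, iters=20):
    """Plain Lloyd's k-means, initialized with the first k points."""
    centroids = [list(p) for p in points[:k]]
    for _ in range(iters):
        groups = [[] for _ in centroids]
        for p in points:
            j = min(range(len(centroids)), key=lambda i: _sqdist(p, centroids[i]))
            groups[j].append(p)
        for j, g in enumerate(groups):
            if g:  # keep the old centroid if a cluster ends up empty
                centroids[j] = [sum(c) / len(g) for c in zip(*g)]
    return centroids

def fit(X, y, k):
    """Cluster each class separately; return a list of (centroid, label) pairs."""
    model = []
    for label in sorted(set(y)):
        pts = [x for x, t in zip(X, y) if t == label]
        model += [(c, label) for c in _kmeans(pts, min(k, len(pts)))]
    return model

def predict(model, x):
    """Label of the Voronoi cell (nearest labeled centroid) that contains x."""
    return min(model, key=lambda cl: _sqdist(x, cl[0]))[1]
```

Because the boundary between two Voronoi cells is a hyperplane, the resulting decision boundary is piecewise linear, which matches the title of the paper.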

翻訳日:2021-02-04 10:15:54 公開日:2021-02-02

# FEDZIP: コミュニケーション効率の高いフェデレーション学習のための圧縮フレームワーク

FEDZIP: A Compression Framework for Communication-Efficient Federated Learning ( http://arxiv.org/abs/2102.01593v1 )

ライセンス: Link先を確認

Amirhossein Malekijoo, Mohammad Javad Fadaeieslam, Hanieh Malekijou, Morteza Homayounfar, Farshid Alizadeh-Shabdiz, Reza Rawassizadeh

(参考訳) Federated Learningは、ユーザのプライバシを保護し、サードパーティのアクセスから生データを保護することによって、無線デバイスのための分散機械学習(特にディープラーニング)の実装の転換点となる。学習プロセスを各クライアントに独立して割り当てます。まず、クライアントはローカルデータに基づいて機械学習モデルをローカルにトレーニングする。次に、クライアントはモデル重みとバイアス(データトレーニング)のローカルアップデートをサーバに転送する。その後、サーバは更新(クライアントから受信)を集約し、グローバルな学習モデルを作成する。しかし、クライアントとサーバ間の継続的な転送は通信コストを増大させ、ディープラーニングモデルで使用される多数のパラメータ(重みとバイアス)のためにリソース利用の観点から非効率である。貢献するクライアントやコミュニケーションラウンドの数が増えると、コミュニケーションのコストが懸念されるようになります。本研究では、クライアントとそのサーバ間のディープラーニングモデルから重みを転送しながら、更新のサイズを大幅に削減する新しいフレームワークであるFedZipを提案する。 fedzipはトップzスパーシフィケーションを実装し、クラスタリングで量子化を使用し、3つの異なるエンコーディングメソッドで圧縮を実装している。 FedZipは最先端の圧縮フレームワークを上回り、最大1085xまでの圧縮速度を達成し、通信中のクライアントの帯域幅とエネルギーの99%まで保持します。

Federated Learning marks a turning point in the implementation of decentralized machine learning (especially deep learning) for wireless devices by protecting users' privacy and safeguarding raw data from third-party access. It assigns the learning process independently to each client. First, clients locally train a machine learning model based on local data. Next, clients transfer local updates of model weights and biases (training data) to a server. Then, the server aggregates updates (received from clients) to create a global learning model. However, the continuous transfer between clients and the server increases communication costs and is inefficient from a resource utilization perspective due to the large number of parameters (weights and biases) used by deep learning models. The cost of communication becomes a greater concern when the number of contributing clients and communication rounds increases. In this work, we propose a novel framework, FedZip, that significantly decreases the size of updates while transferring weights from the deep learning model between clients and their servers. FedZip implements Top-z sparsification, uses quantization with clustering, and implements compression with three different encoding methods. FedZip outperforms state-of-the-art compression frameworks and reaches compression rates up to 1085x, and preserves up to 99% of bandwidth and 99% of energy for clients during communication.
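
The first two stages named in the abstract above, Top-z sparsification and clustering-based quantization, can be sketched as follows. The final encoding stage is omitted. Function names, the flat-list representation of an update, and the use of 1-D k-means with caller-supplied initial centers are assumptions of this sketch, not details taken from the paper.

```python
def top_z_sparsify(update, z):
    """Keep the z largest-magnitude entries of a flat update and zero the rest."""
    order = sorted(range(len(update)), key=lambda i: abs(update[i]), reverse=True)
    keep = set(order[:z])
    return [v if i in keep else 0.0 for i, v in enumerate(update)]

def cluster_quantize(values, centers, iters=10):
    """Quantize values with 1-D k-means.

    Returns (codes, centers): each value is replaced by the index of its
    nearest center, so only the small codebook and the codes need to be sent.
    """
    centers = list(centers)

    def nearest(v):
        return min(range(len(centers)), key=lambda j: abs(v - centers[j]))

    for _ in range(iters):
        codes = [nearest(v) for v in values]
        for j in range(len(centers)):
            members = [v for v, c in zip(values, codes) if c == j]
            if members:  # leave a center unchanged if nothing maps to it
                centers[j] = sum(members) / len(members)
    codes = [nearest(v) for v in values]
    return codes, centers
```

After these two steps an update is a short codebook plus a sequence of small integers dominated by the code for zero, which is the kind of input on which a lossless encoder compresses well.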

翻訳日:2021-02-04 10:15:17 公開日:2021-02-02

# シンプレクティックガウス過程ダイナミクス

Symplectic Gaussian Process Dynamics ( http://arxiv.org/abs/2102.01606v1 )

ライセンス: Link先を確認

Katharina Ensinger, Friedrich Solowjow, Michael Tiemann, Sebastian Trimpe

(参考訳) ダイナミクスモデル学習は困難であり、同時に研究の活発な分野でもある。潜在的安全性のため、制御タスクのような下流アプリケーションでは、理論的保証が必要である。 GPは空間上の関数近似子として豊富な理論的保証を誘導するが、力学系の時間的側面には明示的に対応しない。しかし、時間によるシステム特性の伝播は、まさに古典的な数値積分器が設計したものです。本稿では,任意の明示的あるいは暗黙的な単段積分器や多段積分器で基底系を識別し,数値積分器の特性を活用できる,スパースガウス過程に基づく変分推論手法を提案する。特に、ハミルトン問題とシンプレクティック積分器は、体積保存予測を生成する。

Dynamics model learning is challenging and at the same time an active field of research. Due to potential safety critical downstream applications, such as control tasks, there is a need for theoretical guarantees. While GPs induce rich theoretical guarantees as function approximators in space, they do not explicitly cope with the time aspect of dynamical systems. However, propagating system properties through time is exactly what classical numerical integrators were designed for. We introduce a recurrent sparse Gaussian process based variational inference scheme that is able to discretize the underlying system with any explicit or implicit single or multistep integrator, thus leveraging properties of numerical integrators. In particular we discuss Hamiltonian problems coupled with symplectic integrators producing volume preserving predictions.

翻訳日:2021-02-04 10:14:35 公開日:2021-02-02

# 自動階調システムからのデータを用いた学生のパフォーマンス予測

Predicting student performance using data from an auto-grading system ( http://arxiv.org/abs/2102.01270v1 )

ライセンス: Link先を確認

Huanyi Chen, Paul A.S. Ward

(参考訳) オンラインの自動採点システムが現れると、これらのシステムから得られる情報によって、研究者は学生の行動やパフォーマンスを予測する予測モデルを作成することができる。ウォータールー大学では、ECE 150 (Fundamentals of Programming) Instructional Teamは、教育成果を改善するために限られた教育リソースをよりよく割り当てる方法について洞察を得たいと考えています。現在、Instructional Teamは、学習時間をリアクティブベースで割り当てている。生徒を「要請通り」支援する。このアプローチは、助けを求める場所を持つ学生に役立ちます。しかし、苦しんでいる学生の多くは援助を求めて手を差し伸べません。したがって、私たちは研究チームとして、自動グレードシステムであるMarmosetのデータを調べて助けを必要とする学生を決定できるかどうかを探りたいと思っています。本稿では,マーモセット自動採点システムから抽出した様々な特徴を持つ決定木および線形回帰モデルの構築実験を行い,合格率,テストケース結果,提出数,提出時間間隔(最初の合理的な提出と締め切りの間の時間間隔)について検討した。各特徴について,解析結果を混乱行列レベルで解釈した。特に, 成績の悪い学生に対しては, 提出時間間隔を用いた線形回帰モデルが, 精度とf測定の点で, 最良であることを示す。また,成績の悪い生徒に誤分類された生徒は,すべてのモデルにおいて,線形回帰モデルの中では最も低い実例があることが示唆された。また,中間期においては,中間期前の最終割当の提出時間間隔が中間期性能を最も多く予測することを示す。しかし、最終試験では、中間試験のパフォーマンスが最終試験のパフォーマンスに最も貢献します。

As online auto-grading systems appear, information obtained from those systems can potentially enable researchers to create predictive models to predict student behaviour and performance. At the University of Waterloo, the ECE 150 (Fundamentals of Programming) Instructional Team wants to get an insight into how to allocate the limited teaching resources better to achieve improved educational outcomes. Currently, the Instructional Team allocates tutoring time on a reactive basis. They help students "as-requested". This approach serves those students with the wherewithal to request help; however, many of the students who are struggling do not reach out for assistance. Therefore, we, as the Research Team, want to explore if we can determine which students need help by looking into the data from our auto-grading system, Marmoset. In this paper, we conducted experiments building decision-tree and linear-regression models with various features extracted from the Marmoset auto-grading system, including passing rate, testcase outcomes, number of submissions and submission time intervals (the time interval between the student's first reasonable submission and the deadline). For each feature, we interpreted the result at the confusion matrix level. Specifically for poor-performance students, we show that the linear-regression model using submission time intervals performs the best among all models in terms of Precision and F-Measure. We also show that students who are misclassified as poor-performance students have the lowest actual grades in the linear-regression model among all models. In addition, we show that for the midterm, the submission time interval of the last assignment before the midterm predicts the midterm performance the most. However, for the final exam, the midterm performance contributes the most to the final exam performance.

翻訳日:2021-02-04 10:06:40 公開日:2021-02-02

# ノイズがカオスと出会うとき:ニューロカオス学習における確率的共鳴

When Noise meets Chaos: Stochastic Resonance in Neurochaos Learning ( http://arxiv.org/abs/2102.01316v1 )

ライセンス: Link先を確認

Harikrishnan NB and Nithin Nagaraj

(参考訳) カオスとノイズは脳に広がっています。ニューロンのカオス的な発砲と神経モデルにおけるノイズの構造的役割に触発され、私たちは初めてカオス、ノイズ、学習を接続します。本稿では,ニューロカオス学習(NL)における確率共鳴(SR)現象を実証する。 SRはNLの単一のニューロンのレベルで現われ、有効なsubthreshold信号の検出を可能にします。さらに、SRは、シミュレーションと実世界の音声桁データセットの両方において、分類タスクのための単一および複数のニューロンNLアーキテクチャで発生することが示されている。ニューロカオス学習における中間レベルのノイズは、分類タスクにおけるピークパフォーマンスを可能にし、AIアプリケーション、特に脳インスパイアされた学習アーキテクチャにおけるSRの役割を強調します。

Chaos and Noise are ubiquitous in the Brain. Inspired by the chaotic firing of neurons and the constructive role of noise in neuronal models, we for the first time connect chaos, noise and learning. In this paper, we demonstrate the Stochastic Resonance (SR) phenomenon in Neurochaos Learning (NL). SR manifests at the level of a single neuron of NL and enables efficient subthreshold signal detection. Furthermore, SR is shown to occur in single and multiple neuronal NL architectures for classification tasks - both on simulated and real-world spoken digit datasets. Intermediate levels of noise in neurochaos learning enable peak performance in classification tasks, thus highlighting the role of SR in AI applications, especially in brain-inspired learning architectures.

翻訳日:2021-02-04 10:05:55 公開日:2021-02-02

# 二元化ニューラルネットワークにおけるビット誤差耐性測定

Bit Error Tolerance Metrics for Binarized Neural Networks ( http://arxiv.org/abs/2102.01344v1 )

ライセンス: Link先を確認

Sebastian Buschjäger, Jian-Jia Chen, Kuan-Hsun Chen, Mario Günzel, Katharina Morik, Rodion Novkin, Lukas Pfahler, Mikail Yayla

(参考訳) ニューラルネットワーク(NN)推論システムのリソース需要を減らすために、電源電圧とタイミングパラメータをエネルギー消費とパフォーマンスで取引精度を調整する近似メモリを使用することが提案されている。これらのパラメータの調整はビットエラーに積極的につながり、トレーニング中にビットフリップが注入されるとNNによって許容されます。しかし、ビットエラー耐性を達成するための最先端の技術であるビットフリップトレーニングは、スケールがうまくいかず、膨大なオーバーヘッドをもたらし、高いビットエラー率(BER)に適用することはできません。 NNにおけるビットエラー耐性を実現する別の方法が必要であるが、NNのビットエラー耐性の背後にある基本原則はまだ報告されていない。この理解の欠如により、nnビットのエラー許容性に関する研究のさらなる進展が抑制される。本研究の目的は,二項化NN(BNN)に着目して,フリップトレーニングの原因となるNNの内部的変化を調べることである。そのために、ビットエラー耐性BNNの性質を2つの指標で定量化します。まず,プリアクティベーション値とバッチ正規化しきい値とのマージンを計算する,ニューロンレベルのビット誤り耐性メトリックを提案する。次に、神経細胞の相互作用に対するビット誤差許容度の影響を捉えるために、各ニューロンの重要性を測定し、すべての重要値のばらつきを計算するニューロン間ビット誤差許容度指標を提案します。実験結果は,この2つの指標がビット誤り許容性に強く関連していることを裏付ける。

To reduce the resource demand of neural network (NN) inference systems, it has been proposed to use approximate memory, in which the supply voltage and the timing parameters are tuned to trade accuracy for energy consumption and performance. Tuning these parameters aggressively leads to bit errors, which can be tolerated by NNs when bit flips are injected during training. However, bit flip training, which is the state of the art for achieving bit error tolerance, does not scale well; it leads to massive overheads and cannot be applied for high bit error rates (BERs). Alternative methods to achieve bit error tolerance in NNs are needed, but the underlying principles behind the bit error tolerance of NNs have not been reported yet. With this lack of understanding, further progress in the research on NN bit error tolerance will be restrained. In this study, our objective is to investigate the internal changes in the NNs that bit flip training causes, with a focus on binarized NNs (BNNs). To this end, we quantify the properties of bit error tolerant BNNs with two metrics. First, we propose a neuron-level bit error tolerance metric, which calculates the margin between the pre-activation values and batch normalization thresholds. Secondly, to capture the effects of bit error tolerance on the interplay of neurons, we propose an inter-neuron bit error tolerance metric, which measures the importance of each neuron and computes the variance over all importance values. Our experimental results support that these two metrics are strongly related to bit error tolerance.
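
The abstract describes the neuron-level metric only as a margin between pre-activation values and batch normalization thresholds. The following is a minimal sketch of that idea, not the paper's definition. It assumes the margin is the absolute distance to the threshold and that a neuron outputs its positive value when the pre-activation reaches the threshold. The function names and numbers are illustrative.

```python
def neuron_margins(pre_activations, thresholds):
    """Absolute distance of each neuron's pre-activation from its threshold."""
    return [abs(a - t) for a, t in zip(pre_activations, thresholds)]


def flips_output(pre_activation, threshold, perturbation):
    """True if adding `perturbation` moves the value across the threshold."""
    before = pre_activation >= threshold
    after = pre_activation + perturbation >= threshold
    return before != after


# Made-up values for three neurons.
margins = neuron_margins([5.0, -1.0, 12.0], [3.0, 0.0, 4.0])

# A perturbation of -4.0 flips the neuron with margin 2.0 but not the one with margin 8.0.
small_margin_flips = flips_output(5.0, 3.0, -4.0)
large_margin_flips = flips_output(12.0, 4.0, -4.0)
```

Under these assumptions a larger margin means more accumulated bit errors are needed before the neuron's binary output changes, which is one way to read the reported link between margin and tolerance.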

翻訳日:2021-02-04 10:05:22 公開日:2021-02-02

# スマートシティにおける連合学習:包括的調査

Federated Learning in Smart Cities: A Comprehensive Survey ( http://arxiv.org/abs/2102.01375v1 )

ライセンス: Link先を確認

Zhaohua Zheng, Yize Zhou, Yilong Sun, Zhang Wang, Boyi Liu and Keqiu Li

(参考訳) 連合学習はスマートシティのプロセスにおいて重要な役割を果たす。ビッグデータと人工知能の開発により、このプロセスではデータのプライバシ保護が問題となる。フェデレーション学習はこの問題を解くことができる。本稿では,様々な分野における連合学習とその応用の現況から始める。我々は総合的な調査を行う。本稿では,スマートシティの様々な分野におけるフェデレーション学習の適用に関する最新の研究をまとめる。モノのインターネット、輸送、通信、金融、医療、その他の分野からの連合学習の現在の発展に関する深い理解。その前に,フェデレーション学習の背景,定義,キー技術を紹介する。さらに、重要な技術と最新の結果についてレビューする。最後に,スマートシティにおける連合学習の今後の応用と研究方向について考察する。

Federated learning plays an important role in the development of smart cities. With the development of big data and artificial intelligence, data privacy protection has become a problem in this process. Federated learning is capable of solving this problem. This paper starts with the current developments of federated learning and its applications in various fields. We conduct a comprehensive investigation: this paper summarizes the latest research on the application of federated learning in various fields of smart cities and gives an in-depth view of the current development of federated learning in the Internet of Things, transportation, communications, finance, medical and other fields. Before that, we introduce the background, definition and key technologies of federated learning. Furthermore, we review the key technologies and the latest results. Finally, we discuss the future applications and research directions of federated learning in smart cities.

翻訳日:2021-02-04 10:04:38 公開日:2021-02-02

# AURSAD:Universal Robot Screwdriving Anomaly Detection Dataset

AURSAD: Universal Robot Screwdriving Anomaly Detection Dataset ( http://arxiv.org/abs/2102.01409v1 )

ライセンス: Link先を確認

Błażej Leporowski, Daniella Tola, Casper Hansen and Alexandros Iosifidis

(参考訳) ねじ運転は最も人気のある産業プロセスの1つです。そのため、様々なロボットを用いてその手順を自動化することがますます一般的になっている。自動化によってスクリュー駆動プロセスの効率が向上するが、プロセスが正しく監視されていない場合、動作中に障害が発生し、アセンブリの有効性と品質に影響を与える可能性がある。機械学習(ML)は、望ましくない出来事を検出し、その影響を制限する可能性がある。そのためには、まず、自動走行を行う産業用ロボットの動作を完全に記述したデータセットを入手する必要がある。本報告では,UR3eシリーズロボットとOnRobot Screwdriverを用いて作成したデータセットについて述べる。さまざまなシナリオを作成し、プロセスに3種類の異常を導入し、利用可能なロボットとドライバーのセンサーを継続的に記録します。得られたデータは、正常および異常なロボット操作の2042のサンプルを含む。このデータを使用した短いMLベンチマークも提供されており、さらなる分析と実験のためのデータの適合性と可能性を示している。

Screwdriving is one of the most popular industrial processes. As such, it is increasingly common to automate that procedure by using various robots. Even though the automation increases the efficiency of the screwdriving process, if the process is not monitored correctly, faults may occur during operation, which can impact the effectiveness and quality of assembly. Machine Learning (ML) has the potential to detect those undesirable events and limit their impact. In order to do so, first a dataset that fully describes the operation of an industrial robot performing automated screwdriving must be available. This report describes a dataset created using a UR3e series robot and OnRobot Screwdriver. We create different scenarios and introduce 3 types of anomalies to the process while all available robot and screwdriver sensors are continuously recorded. The resulting data contains 2042 samples of normal and anomalous robot operation. Brief ML benchmarks using this data are also provided, showcasing the data's suitability and potential for further analysis and experimentation.

翻訳日:2021-02-04 10:04:09 公開日:2021-02-02

# 機械学習による投票傾向の予測

Predicting Propensity to Vote with Machine Learning ( http://arxiv.org/abs/2102.01535v1 )

ライセンス: Link先を確認

Rebecca D. Pollard, Sara M. Pollard, Scott Streit

(参考訳) 機械学習は、過去の行動や属性から投票する個人の傾向を推測する能力を可能にすることを実証します。これは、投票者のアウトリーチ、投票者教育、govtキャンペーンのマイクロターゲティングに有用である。政治学者は1940年代後半から選挙結果を推定する高度な技術を発展させた。 2つの先行研究は機械学習を使って将来の投票行動を予測する。 TensorFlowを使った機械学習環境を構築し、2004年から2018年まで投票データを取得し、3つの実験を実施しました。マシューズ相関係数 0.39 で陽性となった。

We demonstrate that machine learning makes it possible to infer an individual's propensity to vote from their past actions and attributes. This is useful for microtargeting voter outreach, voter education and get-out-the-vote (GOTV) campaigns. Political scientists have developed increasingly sophisticated techniques for estimating election outcomes since the late 1940s. Two prior studies similarly used machine learning to predict individual future voting behavior. We built a machine learning environment using TensorFlow, obtained voting data from 2004 to 2018, and then ran three experiments. We show positive results with a Matthews correlation coefficient of 0.39.
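
For readers unfamiliar with the reported metric, here is a minimal sketch of how a Matthews correlation coefficient is computed from a binary confusion matrix. The counts are invented for illustration and are not the paper's data. Returning 0.0 when a row or column of the confusion matrix is empty is a common convention, assumed here.

```python
import math


def matthews_corrcoef(tp, fp, fn, tn):
    """MCC from true/false positive and negative counts of a binary classifier."""
    numerator = tp * tn - fp * fn
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # Convention assumed here: an undefined MCC (zero denominator) is reported as 0.0.
    return numerator / denominator if denominator else 0.0


# Invented counts: a perfect, an inverted and a chance-level classifier.
perfect = matthews_corrcoef(tp=50, fp=0, fn=0, tn=50)
inverted = matthews_corrcoef(tp=0, fp=5, fn=5, tn=0)
chance = matthews_corrcoef(tp=25, fp=25, fn=25, tn=25)
```

The coefficient ranges from -1 (every prediction wrong) through 0 (no better than chance) to +1 (every prediction right), so a value of 0.39 indicates a moderate positive association between predicted and actual voting.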

翻訳日:2021-02-04 10:03:35 公開日:2021-02-02

# 間欠通信による分散確率凸最適化のMin-Max複雑性

The Min-Max Complexity of Distributed Stochastic Convex Optimization with Intermittent Communication ( http://arxiv.org/abs/2102.01583v1 )

ライセンス: Link先を確認

Blake Woodworth, Brian Bullins, Ohad Shamir, Nathan Srebro

(参考訳) 間欠的通信設定における分散確率凸最適化(対数係数まで)の最小限の複雑性を解消し、M$マシンが目標を最適化するために$R$ラウンドの通信に対して並列に動作するようにし、各通信において各マシンが$K$確率勾配推定を逐次計算することができる。本稿では、最適なアルゴリズムを確立するための、一致した上限を持つ新しい下界を示す。

We resolve the min-max complexity of distributed stochastic convex optimization (up to a log factor) in the intermittent communication setting, where $M$ machines work in parallel over the course of $R$ rounds of communication to optimize the objective, and during each round of communication, each machine may sequentially compute $K$ stochastic gradient estimates. We present a novel lower bound with a matching upper bound that establishes an optimal algorithm.
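
The intermittent communication setting can be made concrete with a small simulation. The sketch below uses local SGD with averaging, which is one well-known algorithm that fits this setting. It is not claimed to be the optimal algorithm of the paper. The toy objective, step size and noise model are assumptions, and the machines are simulated one after another.

```python
import random


def stochastic_gradient(x, rng):
    """Gradient of the toy objective f(x) = 0.5 * (x - 3)^2 plus bounded noise."""
    return (x - 3.0) + rng.uniform(-1.0, 1.0)


def local_sgd(num_machines, num_rounds, steps_per_round, lr=0.1, seed=0):
    """M machines, R communication rounds, K sequential gradient steps per round."""
    rng = random.Random(seed)
    x = 0.0  # shared iterate, synchronized at every communication round
    for _ in range(num_rounds):
        local_iterates = []
        for _ in range(num_machines):
            x_local = x
            for _ in range(steps_per_round):  # K sequential stochastic gradients
                x_local -= lr * stochastic_gradient(x_local, rng)
            local_iterates.append(x_local)
        x = sum(local_iterates) / num_machines  # one round of communication
    return x


x_final = local_sgd(num_machines=4, num_rounds=10, steps_per_round=5)
```

With these settings the iterate moves from 0 towards the minimizer at 3. The point of the sketch is the loop structure: communication happens only `num_rounds` times, while each machine uses `num_rounds * steps_per_round` gradient estimates in total.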

翻訳日:2021-02-04 10:03:08 公開日:2021-02-02

# 重み付きリーマー平均を用いた非バランス音声のブラインド分離のための方向スパースフィルタリング

Directional Sparse Filtering using Weighted Lehmer Mean for Blind Separation of Unbalanced Speech Mixtures ( http://arxiv.org/abs/2102.00196v2 )

ライセンス: Link先を確認

Karn Watcharasupat and Anh H. T. Nguyen and Ching-Hui Ooi and Andy W. H. Khong

(参考訳) 音声信号のブラインドソース分離において、ソーススペクトルの固有の不均衡は、混合行列の推定に単一ソース支配に依存する方法の課題である。本稿では,Lehmer平均と学習可能な重みを用いた指向性スパースフィルタリング(DSF)フレームワークに基づくアルゴリズムを提案し,ソースの不均衡を適応的に考慮する。複数の実音環境における音源分離性能の評価は, ベースライン法と比較して改善が見られた。

In blind source separation of speech signals, the inherent imbalance in the source spectrum poses a challenge for methods that rely on single-source dominance for the estimation of the mixing matrix. We propose an algorithm based on the directional sparse filtering (DSF) framework that utilizes the Lehmer mean with learnable weights to adaptively account for source imbalance. Performance evaluation in multiple real acoustic environments shows improvements in source separation compared to the baseline methods.
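
The abstract does not give the DSF objective, so the sketch below shows only the standard weighted Lehmer mean for positive values. How the paper plugs this mean into its loss, and how the weights are learned, is not shown. The function name and sample values are illustrative.

```python
def weighted_lehmer_mean(values, weights, p):
    """Weighted Lehmer mean: sum(w * x**p) / sum(w * x**(p - 1)) for positive x."""
    numerator = sum(w * x ** p for x, w in zip(values, weights))
    denominator = sum(w * x ** (p - 1) for x, w in zip(values, weights))
    return numerator / denominator


values = [1.0, 2.0, 4.0]
uniform = [1.0, 1.0, 1.0]

arithmetic = weighted_lehmer_mean(values, uniform, p=1)  # p = 1: arithmetic mean
contraharmonic = weighted_lehmer_mean(values, uniform, p=2)  # p = 2: pulled towards large values
only_last = weighted_lehmer_mean(values, [0.0, 0.0, 1.0], p=2)  # all weight on the last value
```

Raising `p` moves the mean towards the largest value and lowering it moves the mean towards the smallest, while the weights change how much each value counts. Making the weights learnable presumably lets the method rebalance sources of unequal strength.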

翻訳日:2021-02-04 10:00:49 公開日:2021-02-02

# PSLA:プリトレーニング、サンプリング、ラベリング、アグリゲーションによるオーディオイベント分類の改善

PSLA: Improving Audio Event Classification with Pretraining, Sampling, Labeling, and Aggregation ( http://arxiv.org/abs/2102.01243v1 )

ライセンス: Link先を確認

Yuan Gong, Yu-An Chung, and James Glass

(参考訳) オーディオイベント分類は活発な研究領域であり、幅広い用途があります。 AudioSetのリリース以来、分類精度の向上に大きく進歩しています。これは、主に新しいモデルアーキテクチャと注意モジュールの開発から来ています。しかし,オーディオセットを用いた音声イベント分類モデルの構築においては,適切なトレーニング手法が等しく重要であることが判明した。このギャップを埋めるため,本研究では,イメージネットプリトレーニング,バランスサンプリング,データ拡張,ラベル拡張,モデルアグリゲーション,設計選択など,モデルの精度を著しく向上させるトレーニング手法であるpslaを提案する。これらの手法でEfficientNetをトレーニングすることにより,AudioSet上で0.474の平均精度(mAP)を新たに達成し,従来の0.439よりも優れるモデルを得る。

Audio event classification is an active research area and has a wide range of applications. Since the release of AudioSet, great progress has been made in advancing the classification accuracy, which mostly comes from the development of novel model architectures and attention modules. However, we find that appropriate training techniques are equally important for building audio event classification models with AudioSet, but have not received the attention they deserve. To fill the gap, in this work, we present PSLA, a collection of training techniques that can noticeably boost the model accuracy including ImageNet pretraining, balanced sampling, data augmentation, label enhancement, model aggregation and their design choices. By training an EfficientNet with these techniques, we obtain a model that achieves a new state-of-the-art mean average precision (mAP) of 0.474 on AudioSet, outperforming the previous best system of 0.439.

翻訳日:2021-02-04 09:50:49 公開日:2021-02-02

# ターゲット話者抽出のためのマルチモーダルアテンション融合

Multimodal Attention Fusion for Target Speaker Extraction ( http://arxiv.org/abs/2102.01326v1 )

ライセンス: Link先を確認

Hiroshi Sato, Tsubasa Ochiai, Keisuke Kinoshita, Marc Delcroix, Tomohiro Nakatani, Shoko Araki

(参考訳) 音声,視覚的,位置的手がかりを用いた混合音声からターゲット話者の声を抽出することを目的としたターゲット話者抽出が注目されている。近年,補完音声と視覚的手がかりを用いてターゲット音声を抽出する音声-視覚的ターゲット話者抽出法が提案されている。音声と視覚を対象とする話者抽出はシミュレーションデータに対する単一モダリティ法よりも安定した性能を提供するが、現実の状況への適応や実記録混合物の評価は十分に検討されていない。現実的な状況に対処する上で大きな問題の1つは、実際の記録では両方の手がかりが等しく信頼性がない可能性があるため、システムの汚職を突き止めるための堅牢化である。視覚的な手がかりは閉塞の影響を受けます本研究では、マルチモーダル融合のための新しい注意メカニズムとそのトレーニング方法を提案し、より信頼性の高いものに手がかりの信頼性と重量を効果的に捉えることを可能にする。シミュレーションデータに対する従来の核融合機構よりも,信号対歪み比(SDR)を1.0dB向上させる。さらに,同時音声の音声・視覚データセットを実データを用いて記録し,提案手法による音声・視覚対象話者抽出が実データに有効であることを示す。

Target speaker extraction, which aims at extracting a target speaker's voice from a mixture of voices using audio, visual or locational clues, has received much interest. Recently, audio-visual target speaker extraction has been proposed, which extracts target speech by using complementary audio and visual clues. Although audio-visual target speaker extraction offers more stable performance than single-modality methods on simulated data, neither its adaptation to realistic situations nor its evaluation on real recorded mixtures has been fully explored. One of the major issues in handling realistic situations is how to make the system robust to clue corruption, because in real recordings both clues may not be equally reliable, e.g. visual clues may be affected by occlusions. In this work, we propose a novel attention mechanism for multi-modal fusion and its training methods, which make it possible to effectively capture the reliability of the clues and give more weight to the more reliable ones. Our proposals improve signal to distortion ratio (SDR) by 1.0 dB over conventional fusion mechanisms on simulated data. Moreover, we also record an audio-visual dataset of simultaneous speech with realistic visual clue corruption and show that audio-visual target speaker extraction with our proposals works successfully on real data.

翻訳日:2021-02-04 09:50:11 公開日:2021-02-02

# 機械学習による有機半導体の電荷輸送の動的障害解析

Analyzing dynamical disorder for charge transport in organic semiconductors via machine learning ( http://arxiv.org/abs/2102.01479v1 )

ライセンス: Link先を確認

Patrick Reiser, Manuel Konrad, Artem Fediai, Salvador León, Wolfgang Wenzel and Pascal Friederich

(参考訳) 有機半導体は有機発光ダイオード(oled)や光電子応用といった今日のディスプレイ技術にとって不可欠である。しかし、有機材料は無機半導体と同じ電荷担体モビリティに到達せず、装置の効率を制限している。より高い電荷キャリア移動度を持つ新しい有機半導体を発見または設計するためには、計算アプローチ、特にマルチスケールモデルがますます重要になっている。しかし、そのようなモデルは計算コストが非常に高く、特に大規模システムや長時間のスケールが必要な場合、静的エネルギーや動的エネルギー障害を計算する場合である。電荷輸送を決定する主要な要因。ここでは、機械学習モデルをマルチスケールシミュレーションに統合することで、この欠点を克服する。これにより、関連する微視的材料特性、特に一連の応用関連分子に対する静的および動的障害寄与に関する前例のない洞察を得ることができます。静的な障害や浅いトラップの分布は、多くの材料に対して非常に非対称であり、ガウス的障害モデルに影響を与えている。さらに, エネルギー準位変動時間の解析を行い, 典型的ホッピング速度と比較し, 電荷輸送における動的障害の重要性を評価する。我々は,有機半導体の応用材料特性の予測に使用する計算手法の精度を大幅に向上し,仮想材料設計にこれらの手法を適用することを期待する。

Organic semiconductors are indispensable for today's display technologies in the form of organic light emitting diodes (OLEDs) and further optoelectronic applications. However, organic materials do not reach the same charge carrier mobility as inorganic semiconductors, limiting the efficiency of devices. To find or even design new organic semiconductors with higher charge carrier mobility, computational approaches, in particular multiscale models, are becoming increasingly important. However, such models are computationally very costly, especially when large systems and long time scales are required, which is the case when computing static and dynamic energy disorder, i.e. the dominant factors determining charge transport. Here we overcome this drawback by integrating machine learning models into multiscale simulations. This allows us to obtain unprecedented insight into relevant microscopic materials properties, in particular static and dynamic disorder contributions for a series of application-relevant molecules. We find that static disorder and thus the distribution of shallow traps is highly asymmetrical for many materials, impacting widely considered Gaussian disorder models. We furthermore analyse characteristic energy level fluctuation times and compare them to typical hopping rates to evaluate the importance of dynamic disorder for charge transport. We hope that our findings will significantly improve the accuracy of computational methods used to predict application-relevant materials properties of organic semiconductors, and thus make these methods applicable for virtual materials design.

翻訳日:2021-02-04 09:49:29 公開日:2021-02-02

# (参考訳) ニューラルセマンティックパーサーのロバスト性について

On Robustness of Neural Semantic Parsers ( http://arxiv.org/abs/2102.01563v1 )

ライセンス: CC BY 4.0

Shuo Huang, Zhuang Li, Lizhen Qu, Lei Pan

(参考訳) 意味解析は自然言語(NL)の発話を論理形式(LF)に写し、多くの高度なNLP問題を支えている。セマンティックパーサーはディープニューラルネットワークでパフォーマンスが向上するが、逆の例に対する脆弱性を継承する。本論文では,逆アタックの存在下でのセマンティックパーサーの堅牢性に関する実証的研究について述べる。形式的には、意味解析の敵は摂動的発話-LF対と見なされ、その発話は原語と全く同じ意味を持つ。既存のベンチマークコーパスに基づくロバストネステストセットを構築するために,スケーラブルな手法を提案する。本研究は,ロバスト性テストセットにおけるサーテ・オブ・ザ・アーツ・パーサーの性能評価と,データ拡張の効果評価に関する5つの研究課題に答えた。

Semantic parsing maps natural language (NL) utterances into logical forms (LFs), which underpins many advanced NLP problems. Semantic parsers gain performance boosts with deep neural networks, but inherit vulnerabilities against adversarial examples. In this paper, we provide an empirical study on the robustness of semantic parsers in the presence of adversarial attacks. Formally, adversaries of semantic parsing are considered to be the perturbed utterance-LF pairs, whose utterances have exactly the same meanings as the original ones. A scalable methodology is proposed to construct robustness test sets based on existing benchmark corpora. Our results answer five research questions in measuring the state-of-the-art parsers' performance on robustness test sets, and evaluating the effect of data augmentation.

翻訳日:2021-02-04 07:41:52 公開日:2021-02-02

# (参考訳) リアルタイム超解像におけるraw画像の活用

Exploiting Raw Images for Real-Scene Super-Resolution ( http://arxiv.org/abs/2102.01579v1 )

ライセンス: CC BY 4.0

Xiangyu Xu, Yongrui Ma, Wenxiu Sun, Ming-Hsuan Yang

(参考訳) 超解像度は、カメラセンサーの空間的制約を克服することを目的としたコンピュータビジョンの基本的な問題です。単一画像のスーパーレゾリューションでは大きな進歩が見られたが、ほとんどのアルゴリズムは合成データでのみうまく動作し、実際のシナリオでの応用を制限する。本稿では,合成データと実写画像のギャップを埋めるために,実写単像超解像の問題について検討する。我々は既存の超解像アルゴリズムの2つの問題に焦点を当てている: 実写訓練データの欠如とカメラから得られる視覚情報の活用不足。そこで本研究では,デジタルカメラの撮像過程をシミュレートし,よりリアルなトレーニングデータを生成する手法を提案する。第2の課題は、原画像に記録された放射情報を利用する2分岐畳み込みニューラルネットワークを開発することである。さらに,画像復元のための高密度チャネルアテンションブロックと,有効色補正のための学習型ガイド付きフィルタネットワークを提案する。我々のモデルは、特定のカメラタイプからの画像を意図的に訓練することなく、異なるカメラに一般化することができる。広汎な実験により,提案アルゴリズムは細部やクリアな構造を復元し,実際のシーンにおける単一画像超解像の高品質な結果が得られることを示した。

Super-resolution is a fundamental problem in computer vision which aims to overcome the spatial limitation of camera sensors. While significant progress has been made in single image super-resolution, most algorithms only perform well on synthetic data, which limits their applications in real scenarios. In this paper, we study the problem of real-scene single image super-resolution to bridge the gap between synthetic data and real captured images. We focus on two issues of existing super-resolution algorithms: lack of realistic training data and insufficient utilization of visual information obtained from cameras. To address the first issue, we propose a method to generate more realistic training data by mimicking the imaging process of digital cameras. For the second issue, we develop a two-branch convolutional neural network to exploit the radiance information originally-recorded in raw images. In addition, we propose a dense channel-attention block for better image restoration as well as a learning-based guided filter network for effective color correction. Our model is able to generalize to different cameras without deliberately training on images from specific camera types. Extensive experiments demonstrate that the proposed algorithm can recover fine details and clear structures, and achieve high-quality results for single image super-resolution in real scenes.

翻訳日:2021-02-04 07:14:02 公開日:2021-02-02

# (参考訳) GEMベンチマーク:自然言語生成とその評価とメトリクス

The GEM Benchmark: Natural Language Generation, its Evaluation and Metrics ( http://arxiv.org/abs/2102.01672v1 )

ライセンス: CC BY 4.0

Sebastian Gehrmann, Tosin Adewumi, Karmanya Aggarwal, Pawan Sasanka Ammanamanchi, Aremu Anuoluwapo, Antoine Bosselut, Khyathi Raghavi Chandu, Miruna Clinciu, Dipanjan Das, Kaustubh D. Dhole, Wanyu Du, Esin Durmus, Ondřej Dušek, Chris Emezue, Varun Gangal, Cristina Garbacea, Tatsunori Hashimoto, Yufang Hou, Yacine Jernite, Harsh Jhamtani, Yangfeng Ji, Shailza Jolly, Dhruv Kumar, Faisal Ladhak, Aman Madaan, Mounica Maddela, Khyati Mahajan, Saad Mahamood, Bodhisattwa Prasad Majumder, Pedro Henrique Martins, Angelina McMillan-Major, Simon Mille, Emiel van Miltenburg, Moin Nadeem, Shashi Narayan, Vitaly Nikolaev, Rubungo Andre Niyongabo, Salomey Osei, Ankur Parikh, Laura Perez-Beltrachini, Niranjan Ramesh Rao, Vikas Raunak, Juan Diego Rodriguez, Sashank Santhanam, João Sedoc, Thibault Sellam, Samira Shaikh, Anastasia Shimorina, Marco Antonio Sobrevilla Cabezudo, Hendrik Strobelt, Nishant Subramani, Wei Xu, Diyi Yang, Akhila Yerukola, Jiawei Zhou

(参考訳) 自然言語生成(NLG)のための生きたベンチマークであるGEM、その評価、およびメトリクスを紹介します。 NLGの進捗測定は、自動メトリクス、データセット、および人間の評価基準の絶え間なく進化するエコシステムに依存しています。しかし、この移動目標のため、新しいモデルは、よく確立されているが欠陥のあるメトリクスを持つ分散アングロ中心のコーパスで評価されることが多い。この切断は、現在のモデルと進歩の機会の限界を特定するのを難しくする。この制限に対処するため、GEMは幅広いコーポラにモデルを簡単に適用でき、評価戦略をテストすることができる環境を提供します。ベンチマークの定期的なアップデートにより、NLGの研究はより多言語化され、モデルとともに課題を進化させる。この論文は、ACL 2021ワークショップで共有タスクを組織し、NLGコミュニティ全体を参加するよう招待する最初のリリースの説明として機能します。

We introduce GEM, a living benchmark for natural language Generation (NLG), its Evaluation, and Metrics. Measuring progress in NLG relies on a constantly evolving ecosystem of automated metrics, datasets, and human evaluation standards. However, due to this moving target, new models often still evaluate on divergent anglo-centric corpora with well-established, but flawed, metrics. This disconnect makes it challenging to identify the limitations of current models and opportunities for progress. Addressing this limitation, GEM provides an environment in which models can easily be applied to a wide set of corpora and evaluation strategies can be tested. Regular updates to the benchmark will help NLG research become more multilingual and evolve the challenge alongside models. This paper serves as the description of the initial release for which we are organizing a shared task at our ACL 2021 Workshop and to which we invite the entire NLG community to participate.

翻訳日:2021-02-04 06:42:02 公開日:2021-02-02

# (参考訳) ドメイン適応型エンドツーエンド音声認識のための内部言語モデルトレーニング

Internal Language Model Training for Domain-Adaptive End-to-End Speech Recognition ( http://arxiv.org/abs/2102.01380v1 )

ライセンス: CC BY 4.0

Zhong Meng, Naoyuki Kanda, Yashesh Gaur, Sarangarajan Parthasarathy, Eric Sun, Liang Lu, Xie Chen, Jinyu Li, Yifan Gong

(参考訳) 外部言語モデル(LM)と既存のエンドツーエンド(E2E)自動音声認識(ASR)システムの統合の有効性は、内部言語モデル推定(ILME)法を用いて大幅に改善することができる。この方法では、推論中にE2Eスコアと外部LMスコアを補間して得られたスコアから内部LMスコアを減算する。 ILMEに基づく推論を改善するために、内部LM推定に影響を与えるE2Eモデルコンポーネントのみを更新することにより、内部LM損失を最小限に抑える内部LMトレーニング(ILMT)方法を提案する。 ILMTは、ESRの精度を犠牲にすることなく、既存のコンポーネント内でスタンドアロンのLMを形成するようE2Eモデルを奨励している。 ILMTの後、トレーニングと推論の基準が一致したよりモジュール化されたE2Eモデルは、ソースドメイン内部のLMをより徹底的に除去し、ターゲットドメイン外部のLMをより効果的に統合することを可能にする。 30K時間の訓練された繰り返しニューラルネットワークトランスデューサと注意ベースのエンコーダデコーダモデルで実験されたILMTは、ILMEベースの推論により、標準E2Eトレーニングから最大31.5%および11.4%の相対的な単語誤り率を、ドメイン外LibriSpeechとMicrosoft生産テストセットでShallow Fusionでそれぞれ達成する。

The efficacy of external language model (LM) integration with existing end-to-end (E2E) automatic speech recognition (ASR) systems can be improved significantly using the internal language model estimation (ILME) method. In this method, during inference, the internal LM score is subtracted from the score obtained by interpolating the E2E score with the external LM score. To improve ILME-based inference, we propose an internal LM training (ILMT) method that minimizes an additional internal LM loss by updating only the E2E model components that affect the internal LM estimation. ILMT encourages the E2E model to form a standalone LM inside its existing components, without sacrificing ASR accuracy. After ILMT, the more modular E2E model with matched training and inference criteria enables a more thorough elimination of the source-domain internal LM, and therefore leads to a more effective integration of the target-domain external LM. In experiments with recurrent neural network transducer and attention-based encoder-decoder models trained on 30K hours of data, ILMT with ILME-based inference achieves up to 31.5% and 11.4% relative word error rate reductions from standard E2E training with Shallow Fusion on out-of-domain LibriSpeech and in-domain Microsoft production test sets, respectively.
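
The scoring rule described for ILME-based inference can be sketched as follows, assuming log-domain scores and a weighted subtraction. The weights, hypothesis names and score values are invented for illustration, and estimating the internal LM score from the E2E model is outside this sketch.

```python
def fused_score(log_p_e2e, log_p_ext, log_p_int, ext_weight, int_weight):
    """E2E score interpolated with the external LM score, minus the internal LM score."""
    return log_p_e2e + ext_weight * log_p_ext - int_weight * log_p_int


# Hypothetical hypotheses: (E2E log-prob, external LM log-prob, internal LM log-prob).
hypotheses = {
    "hyp_a": (-4.0, -4.0, -1.0),  # strongly favoured by the source-domain internal LM
    "hyp_b": (-4.5, -4.0, -4.0),
}


def best_hypothesis(ext_weight, int_weight):
    """Return the hypothesis name with the highest fused score."""
    return max(
        hypotheses,
        key=lambda name: fused_score(*hypotheses[name], ext_weight, int_weight),
    )


shallow_fusion_choice = best_hypothesis(ext_weight=0.5, int_weight=0.0)
ilme_choice = best_hypothesis(ext_weight=0.5, int_weight=0.5)
```

Setting `int_weight` to zero reduces the rule to Shallow Fusion. With a positive `int_weight`, a hypothesis that scores well mainly because of the source-domain internal LM loses that advantage, which is why the two settings pick different hypotheses in this toy example.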

翻訳日:2021-02-04 05:54:46 公開日:2021-02-02

# (参考訳) 大きさ

Size Matters ( http://arxiv.org/abs/2102.01582v1 )

ライセンス: CC BY 4.0

Mats L. Richter, Johan Byttner, Ulf Krumnack, Ludwdig Schallner, Justin Shenk

(参考訳) 完全畳み込みニューラルネットワークは、ダウンサンプリングとプールの組み合わせで任意のサイズの入力を処理することができる。しかし, 完全畳み込み画像分類器は入力サイズに依存せず, 性能に有意な差があることが判明した。より詳しく見ると、入力サイズとモデル性能の間には単純な関係はない(`bigger is better'は存在しない)が、各ネットワークが最適な入力サイズを持ち、最良の結果を示していることがわかる。本研究では,層活性化のスペクトル解析やプローブ分類などの異なる手法を適用し,ネットワークアーキテクチャに特有の特徴があることを示す。この結果から、識別的特徴の大きさが、層間での推論プロセスの分散方法に重大な影響を与えていることが判明した。

Fully convolutional neural networks can process input of arbitrary size by applying a combination of downsampling and pooling. However, we find that fully convolutional image classifiers are not agnostic to the input size but rather show significant differences in performance: presenting the same image at different scales can result in different outcomes. A closer look reveals that there is no simple relationship between input size and model performance (no `bigger is better'), but that each network has a preferred input size, for which it shows the best results. We investigate this phenomenon by applying different methods, including spectral analysis of layer activations and probe classifiers, showing that there are characteristic features depending on the network architecture. From this we find that the size of discriminatory features critically influences how the inference process is distributed among the layers.

翻訳日:2021-02-04 05:42:20 公開日:2021-02-02

# (参考訳) MultiTalk:多言語会話のための高分岐ダイアログ

MultiTalk: A Highly-Branching Dialog Testbed for Diverse Conversations ( http://arxiv.org/abs/2102.01263v1 )

ライセンス: CC BY 4.0

Yao Dou, Maxwell Forbes, Ari Holtzman, Yejin Choi

(参考訳) 与えられた履歴に対する多くの可能な応答がある会話対話について研究する。選択的なブランチ継続を通じて、高分岐率(10)と複数の会話回転(6)のバランスをとる320,000以上の会話ダイアログの文のコーパスであるMultiTalk Datasetを紹介します。高度に分岐した環境で、対話生成の研究に複数貢献します。多様な世代の世代を評価するために, 多様な参照のセットを最適に組み込むために, 二分グラフマッチングに基づく単純なスコアリングアルゴリズムを提案する。事前学習された分類器から自動的に引き起こされるテキスト属性を用いて,予測会話深さの異なるレベルで複数の言語生成タスクについて検討した。本研究の課題は,聴取者の期待する反応の推論を必要とする制御可能な生成タスクである心的問題の挑戦的理論である。

We study conversational dialog in which there are many possible responses to a given history. We present the MultiTalk Dataset, a corpus of over 320,000 sentences of written conversational dialog that balances a high branching factor (10) with several conversation turns (6) through selective branch continuation. We make multiple contributions to study dialog generation in the highly branching setting. In order to evaluate a diverse set of generations, we propose a simple scoring algorithm, based on bipartite graph matching, to optimally incorporate a set of diverse references. We study multiple language generation tasks at different levels of predictive conversation depth, using textual attributes induced automatically from pretrained classifiers. Our culminating task is a challenging theory of mind problem, a controllable generation task which requires reasoning about the expected reaction of the listener.
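
The abstract names a scoring algorithm based on bipartite graph matching without giving details. A minimal sketch of the general idea follows: each generation is paired with a distinct reference so that the total similarity is maximal. The Jaccard token overlap, the normalization and the brute-force search are assumptions made for illustration. A real implementation would use a polynomial-time assignment solver.

```python
from itertools import permutations


def jaccard(a, b):
    """Token-overlap similarity; a stand-in for whatever metric is actually used."""
    tokens_a, tokens_b = set(a.lower().split()), set(b.lower().split())
    union = tokens_a | tokens_b
    return len(tokens_a & tokens_b) / len(union) if union else 0.0


def matching_score(generations, references, similarity=jaccard):
    """Average similarity under the best one-to-one generation/reference assignment."""
    if len(generations) > len(references):
        raise ValueError("this sketch assumes at most one generation per reference")
    if not generations:
        return 0.0
    # Brute force over all assignments; only practical for small sets.
    best_total = max(
        sum(similarity(g, r) for g, r in zip(generations, assignment))
        for assignment in permutations(references, len(generations))
    )
    return best_total / len(generations)


references = ["see you soon", "i love tea", "what a day"]
diverse = matching_score(["i love tea", "see you soon"], references)
repetitive = matching_score(["i love tea", "i love tea"], references)
```

Because each reference can be matched only once, repeating the same response earns credit a single time, so a diverse set of generations scores higher than a repetitive one.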

翻訳日:2021-02-04 05:26:22 公開日:2021-02-02

# (参考訳) 歴史資料への機械翻訳応用の2つの実証

Two Demonstrations of the Machine Translation Applications to Historical Documents ( http://arxiv.org/abs/2102.01417v1 )

ライセンス: CC BY-SA 4.0

Miguel Domingo and Francisco Casacuberta

(参考訳) 歴史的文書に2つの機械翻訳の応用例を示す。最初のタスクは、その元の言語の現代バージョンで書かれた歴史的な文書の新バージョンを生成することです。第2のアプリケーションは文書の正書法に限られる。文章の綴りの一貫性と綴り規則の欠如を会計するために、文書の綴りを現代の標準に適応させます。我々は、ユーザがシステムの仮説に修正を導入することができる、インタラクティブで適応的なフレームワークに従った。システムはこれらの補正に反応し、それらを考慮した新しい仮説を生成する。ユーザがシステムの仮説に満足して検証すると、システムはオンライン学習戦略に従ってそのモデルに適応する。このシステムはクライアントサーバアーキテクチャに従って実装される。ニューラルモデルと通信するWebサイトを開発した。すべてのコードはオープンソースで公開されています。デモはhttp://demosmt.prhlt.upv.es/mthd/にホストされている。

We present our demonstration of two machine translation applications to historical documents. The first task consists in generating a new version of a historical document, written in the modern version of its original language. The second application is limited to a document's orthography. It adapts the document's spelling to modern standards in order to achieve orthographic consistency and account for the lack of spelling conventions. We followed an interactive, adaptive framework that allows the user to introduce corrections to the system's hypothesis. The system reacts to these corrections by generating a new hypothesis that takes them into account. Once the user is satisfied with the system's hypothesis and validates it, the system adapts its model following an online learning strategy. This system is implemented following a client-server architecture. We developed a website which communicates with the neural models. All code is open-source and publicly available. The demonstration is hosted at http://demosmt.prhlt.upv.es/mthd/.

翻訳日:2021-02-04 05:14:17 公開日:2021-02-02

# (参考訳) 直接音声翻訳のためのCTCに基づく圧縮

CTC-based Compression for Direct Speech Translation ( http://arxiv.org/abs/2102.01578v1 )

ライセンス: CC BY-SA 4.0

Marco Gaido, Mauro Cettolo, Matteo Negri, Marco Turchi

(参考訳) 従来,音声入力音声の動的音声インフォーム圧縮は音声翻訳(ST)に有用であった。しかし、彼らは音声認識のための専用モデルを必要とし、単一のモデルが入力音声を中間表現なしでターゲット言語に翻訳するdirect stのこのソリューションをテストしなかった。本研究では,入力間接STモデルの動的圧縮を行うための第1の手法を提案する。特に,コネクショニスト時間分類(ctc)を用いて,その音声特性に応じて入力列を圧縮する。我々の実験は、我々のソリューションが2つの言語ペア(英語-イタリア語と英語-ドイツ語)の強いベースラインに対して1.3-1.5BLEUの改善をもたらし、文脈的にメモリフットプリントを10%以上削減することを示した。

Previous studies demonstrated that a dynamic phone-informed compression of the input audio is beneficial for speech translation (ST). However, they required a dedicated model for phone recognition and did not test this solution for direct ST, in which a single model translates the input audio into the target language without intermediate representations. In this work, we propose the first method able to perform a dynamic compression of the input in direct ST models. In particular, we exploit the Connectionist Temporal Classification (CTC) to compress the input sequence according to its phonetic characteristics. Our experiments demonstrate that our solution brings a 1.3-1.5 BLEU improvement over a strong baseline on two language pairs (English-Italian and English-German), while also reducing the memory footprint by more than 10%.
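
One way to compress a frame sequence according to CTC predictions is to merge runs of consecutive frames that receive the same label. The sketch below assumes per-frame labels are already available, for example as the most probable CTC output per frame, and merges each run by averaging. The merge rule and the sample data are assumptions for illustration and are not taken from the abstract.

```python
from itertools import groupby


def ctc_compress(frames, labels):
    """Average consecutive frames that share the same predicted label."""
    compressed = []
    index = 0
    for _, run in groupby(labels):
        length = len(list(run))
        chunk = frames[index:index + length]
        # Column-wise mean over the frames in this run.
        compressed.append([sum(column) / length for column in zip(*chunk)])
        index += length
    return compressed


# Five 2-dimensional frames with made-up labels, including the CTC blank symbol.
frames = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0], [7.0, 8.0], [9.0, 10.0]]
labels = ["a", "a", "<blank>", "b", "b"]
compressed = ctc_compress(frames, labels)
```

Here five frames shrink to three vectors. A shorter sequence means less computation and memory in the layers that follow, which is consistent with the reduced memory footprint reported above.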

翻訳日:2021-02-04 05:08:39 公開日:2021-02-02

# (参考訳) アトラスアウェア ConvNetfor 正確かつロバストな解剖学的セグメンテーション

Atlas-aware ConvNet for Accurate yet Robust Anatomical Segmentation ( http://arxiv.org/abs/2102.01256v1 )

ライセンス: CC BY 4.0

Yuan Liang, Weinan Song, Jiawei Yang, Liang Qiu, Kun Wang, Lei He

(参考訳) 畳み込みネットワーク(ConvNets)は、様々な解剖学的セグメンテーションタスクの有望な精度を達成しました。成功にもかかわらず、これらの手法はデータの出現変動に敏感である。アーティファクト,病理,スキャン設定によるスキャンの大きな変動を考えると,ロバストなConvNetは臨床応用には不可欠だが,十分に調査されていない。本稿では,画像スキャン中の解剖学的不均一性に対するconvnetの認識を可能にすることで,課題を軽減することを提案する。具体的には,局所接続条件付き確率場(cfr)上の予測に対する明示的な制約として確率的アトラス前置法を組み込んだ完全畳み込み制約導入モジュール(cam)を導入し,ラベリング出力の解剖学的一貫性を効果的に強化する。さまざまなConvNetのブーストに柔軟に対応できるCAMを設計し、最適な性能につながるフュージョンパラメータをConvNetとの共同最適化にコンパクトにします。このようなアトラス前駆体融合の利点は2つの脳パーセレーションタスクで2倍になることを示す。まず,予測の構造的異常を著しく低減し,両データセットのConvNetに基づく手法間の最先端の精度を実現する。第2に、既存のconvnetのロバスト性を大きく向上させることができる。(i) 合成病理によるスキャンのテスト、(ii) データセットをまたいだ異なるスキャンセットアップのスキャンのトレーニングと評価。提案手法は,CAMを微調整し,精度とロバスト性の向上を図ることで,既存のConvNetに容易に適用できることを示唆している。

Convolutional networks (ConvNets) have achieved promising accuracy for various anatomical segmentation tasks. Despite this success, these methods can be sensitive to variations in data appearance. Considering the large variability of scans caused by artifacts, pathologies, and scanning setups, robust ConvNets are vital for clinical applications, yet they have not been fully explored. In this paper, we propose to mitigate the challenge by enabling ConvNets' awareness of the underlying anatomical invariances among imaging scans. Specifically, we introduce a fully convolutional Constraint Adoption Module (CAM) that incorporates probabilistic atlas priors as explicit constraints for predictions over a locally connected Conditional Random Field (CRF), which effectively reinforces the anatomical consistency of the labeling outputs. We design the CAM to be flexible enough to boost various ConvNets, and compact enough to be co-optimized with ConvNets so that the fusion parameters lead to optimal performance. We show that the advantage of such atlas prior fusion is two-fold on two brain parcellation tasks. First, our models achieve state-of-the-art accuracy among ConvNet-based methods on both datasets, by significantly reducing structural abnormalities of predictions. Second, we can largely boost the robustness of existing ConvNets, as shown by: (i) testing on scans with synthetic pathologies, and (ii) training and evaluation on scans of different scanning setups across datasets. Our method can be easily adopted by existing ConvNets by fine-tuning with the CAM plugged in, for accuracy and robustness boosts.

翻訳日:2021-02-04 04:59:22 公開日:2021-02-02

# (参考訳) 画像認識のための方向畳み込みネットワーク

Orientation Convolutional Networks for Image Recognition ( http://arxiv.org/abs/2102.01523v1 )

ライセンス: CC BY 4.0

Yalan Qin, Guorui Feng, Hanzhou Wu, Yanli Ren and Xinpeng Zhang

(参考訳) ディープ畳み込みニューラルネットワーク(DCNN)は強力な画像表現を得ることができ、画像認識に大きな注目を集めている。しかし、それらは内部機構による方向変換のモデリングに制限がある。本稿では,提案したLandmark Gabor Filters (LGFs) に基づく画像認識のためのOCN(Orientation Convolution Networks)を開発し,学習表現の方向性変化に対する堅牢性を向上させる。畳み込みフィルタをLGFで変調することにより、OCNは既存のディープラーニングネットワークと互換性を持つことができる。 LGF は Gabor フィルタバンクとして機能し、$p$ $\left( \ll n\right)$ 個の代表的な Gabor フィルタをランドマークとして選択し、元の Gabor フィルタをこれらのランドマークの疎線型結合として表現する。具体的には、行列ファクタリゼーションフレームワークに基づいて、スパース性および低ランク制約によるオリジナルのGaborフィルタのローカルおよびグローバル構造に対する柔軟な統合が利用される。低ランク構造の伝播により、元のGaborフィルタバンクの表現に対応する空間を著しく促進することができる。いくつかのベンチマークによる実験結果から,本手法はオリエンテーションに対する感度が低く,従来手法に比べて精度とコストが向上することが示された。さらに、OCNには学習するパラメータがほとんどなく、トレーニングネットワークの複雑さを大幅に削減できます。

Deep Convolutional Neural Networks (DCNNs) are capable of obtaining powerful image representations, which have attracted great attention in image recognition. However, they are limited in modeling orientation transformation by the internal mechanism. In this paper, we develop Orientation Convolution Networks (OCNs) for image recognition based on the proposed Landmark Gabor Filters (LGFs), so that the robustness of the learned representation against changes of orientation can be enhanced. By modulating the convolutional filter with LGFs, OCNs can be compatible with any existing deep learning networks. LGFs act as a Gabor filter bank obtained by selecting $p$ $\left( \ll n\right)$ representative Gabor filters as landmarks and expressing the original Gabor filters as sparse linear combinations of these landmarks. Specifically, based on a matrix factorization framework, a flexible integration for the local and the global structure of original Gabor filters by sparsity and low-rank constraints is utilized. With the propagation of the low-rank structure, the corresponding sparsity for representation of the original Gabor filter bank can be significantly promoted. Experimental results over several benchmarks demonstrate that our method is less sensitive to orientation and produces higher performance in both accuracy and cost, compared with the existing state-of-the-art methods. Besides, our OCNs have few parameters to learn and can significantly reduce the complexity of training networks.

翻訳日:2021-02-04 04:43:41 公開日:2021-02-02

# (参考訳) Occluded Video Instance Segmentation

Occluded Video Instance Segmentation ( http://arxiv.org/abs/2102.01558v1 )

ライセンス: CC BY 4.0

Jiyang Qi, Yan Gao, Xiaoyu Liu, Yao Hu, Xinggang Wang, Xiang Bai, Philip H.S. Torr, Serge Belongie, Alan Yuille, Song Bai

(参考訳) 映像理解システムは,シーン内に重い咬合が存在する場合,物体を知覚できるのか? この質問に答えるために、OVISと呼ばれる大規模データセットを収集し、ビデオインスタンスのセグメンテーション、すなわち、インクルードされたシーンでインスタンスを検出し、セグメンテーションし、追跡します。 OVISは25のセマンティックカテゴリから296kの高品質のインスタンスマスクで構成されており、オブジェクト閉塞は通常発生します。人間の視覚システムは文脈的推論と関連づけによってこれらを理解できるが、実験は現在の映像理解システムが満足していないことを示唆する。 OVISデータセットでは、最先端のアルゴリズムによって達成された最高のAPはわずか14.4であり、実際のシナリオでオブジェクト、インスタンス、ビデオを理解するための初期段階にあることを明らかにしています。また,閉塞による物体の欠落を補うために,時間的特徴キャリブレーションと呼ばれるプラグアンドプレイモジュールを提案する。 MaskTrack R-CNN と SipMask をベースに構築され、AP はそれぞれ 15.2 と 15.0 である。 OVISデータセットはhttp://songbai.site/ovis でリリースされる。

Can our video understanding systems perceive objects when a heavy occlusion exists in a scene? To answer this question, we collect a large scale dataset called OVIS for occluded video instance segmentation, that is, to simultaneously detect, segment, and track instances in occluded scenes. OVIS consists of 296k high-quality instance masks from 25 semantic categories, where object occlusions usually occur. While our human vision systems can understand those occluded instances by contextual reasoning and association, our experiments suggest that current video understanding systems are not satisfying. On the OVIS dataset, the highest AP achieved by state-of-the-art algorithms is only 14.4, which reveals that we are still at a nascent stage for understanding objects, instances, and videos in a real-world scenario. Moreover, to complement missing object cues caused by occlusion, we propose a plug-and-play module called temporal feature calibration. Built upon MaskTrack R-CNN and SipMask, we report an AP of 15.2 and 15.0 respectively. The OVIS dataset is released at http://songbai.site/ovis , and the project code will be available soon.

翻訳日:2021-02-04 04:27:15 公開日:2021-02-02

# (参考訳) Smoothness-Induction Sequential Variational Auto-Encoderによる時系列異常検出

Anomaly Detection of Time Series with Smoothness-Inducing Sequential Variational Auto-Encoder ( http://arxiv.org/abs/2102.01331v1 )

ライセンス: CC BY 4.0

Longyuan Li, Junchi Yan, Haiyang Wang, and Yaohui Jin

(参考訳) ディープジェネレーションモデルは、潜在表現の学習と時系列の複雑な依存性のモデリングにおけるその効果を実証している。本稿では,多次元時系列のロバストな推定と異常検出のためのスムースネス誘導逐次変分自動エンコーダ(SISVAE)モデルを提案する。我々のモデルは変分オートエンコーダ(VAE)に基づいており、そのバックボーンはリカレントニューラルネットワークによって実行され、生成モデルと推論モデルの両方において時系列の潜時構造をキャプチャする。具体的には,各タイムスタンプの平均と分散をフレキシブルニューラルネットワークでパラメータ化することで,既存のマルコフモデルで一般的である一定ノイズを仮定せずに動作可能な非定常モデルを実現する。しかし、そのような柔軟性はモデルに異常を生じさせる可能性がある。また,検出作業の便益となるロバストな密度推定を実現するため,推定よりもスムーズな事前推定法を提案する。提案された先行作業は、非平滑な再構築でペナルティを課す正規化として機能する。本モデルは,新しい確率勾配変動ベイズ推定器を用いて効率よく学習する。特に, 異常検出の判定基準として, 再構成確率と再構成誤差の2つを検討した。合成データセットと公開実世界のベンチマークの両方において,本モデルの有効性を示す。

Deep generative models have demonstrated their effectiveness in learning latent representations and modeling complex dependencies of time series. In this paper, we present a Smoothness-Inducing Sequential Variational Auto-Encoder (SISVAE) model for robust estimation and anomaly detection of multi-dimensional time series. Our model is based on the Variational Auto-Encoder (VAE), and its backbone is a Recurrent Neural Network that captures latent temporal structures of time series for both the generative model and the inference model. Specifically, our model parameterizes the mean and variance for each time-stamp with flexible neural networks, resulting in a non-stationary model that can work without the assumption of constant noise commonly made by existing Markov models. However, such flexibility may make the model fragile to anomalies. To achieve robust density estimation, which can also benefit detection tasks, we propose a smoothness-inducing prior over possible estimations. The proposed prior works as a regularizer that penalizes non-smooth reconstructions. Our model is learned efficiently with a novel stochastic gradient variational Bayes estimator. In particular, we study two decision criteria for anomaly detection: reconstruction probability and reconstruction error. We show the effectiveness of our model on both synthetic datasets and public real-world benchmarks.

翻訳日:2021-02-04 03:39:46 公開日:2021-02-02

# (参考訳) 分布外入力検出のための疑似ベイズ型ニューラルネットワーク

pseudo-Bayesian Neural Networks for detecting Out of Distribution Inputs ( http://arxiv.org/abs/2102.01336v1 )

ライセンス: CC BY-SA 4.0

Gagandeep Singh, Deepak Mishra

(参考訳) 従来のベイジアンニューラルネットワーク(BNN)は、単一の入力に対して複数の出力を提供できることが知られており、そのバリエーションは分布外(OOD)入力を検出するために利用することができる。 BNNは、事前分布の選択に対する感度のために訓練が困難である。そこで本研究では,重みに対する分布を学習する代わりに,点推定を用い,推論時に重みを摂動させる擬似BNNを提案する。従来のBNNのコスト関数を変更し、点推定を持つニューラルネットワークの重みのそれぞれに適切な量のランダムな摂動を注入する目的でパラメータを学習する。 In Distribution(ID)入力から複数の出力を用いてOOD入力を効果的に分離するために、分散指数(index of dispersion)と確率分布のエントロピーから導出した2つの尺度を提案し、提案した擬似BNNと組み合わせる。全体として、この組み合わせは推論時にOODサンプルを検出する原則化された技術をもたらす。本手法は,多種多様なニューラルネットワークアーキテクチャと画像分類データセット上で評価する。提案手法は最先端の結果を達成し,入力ごとにわずか2～5個の重みサンプルを用いるだけで,95% TPRにおけるFPR,AUROC,AUPR,Detection Errorなどの各種指標で関連する先行研究を上回ることを示す。

Conventional Bayesian Neural Networks (BNNs) are known to be capable of providing multiple outputs for a single input, the variations in which can be utilised to detect Out of Distribution (OOD) inputs. BNNs are difficult to train due to their sensitivity towards the choice of priors. To alleviate this issue, we propose pseudo-BNNs where, instead of learning distributions over weights, we use point estimates and perturb weights at the time of inference. We modify the cost function of conventional BNNs and use it to learn parameters for the purpose of injecting the right amount of random perturbation into each of the weights of a neural network with point estimates. In order to effectively segregate OOD inputs from In Distribution (ID) inputs using multiple outputs, we further propose two measures, derived from the index of dispersion and entropy of probability distributions, and combine them with the proposed pseudo-BNNs. Overall, this combination results in a principled technique to detect OOD samples at the time of inference. We evaluate our technique on a wide variety of neural network architectures and image classification datasets. We observe that our method achieves state-of-the-art results and beats the related previous work on various metrics such as FPR at 95% TPR, AUROC, AUPR and Detection Error by using just 2 to 5 samples of weights per input.

翻訳日:2021-02-04 02:57:31 公開日:2021-02-02
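
The pseudo-BNN abstract above describes perturbing point-estimate weights at inference time and scoring the spread of the resulting outputs. The following is only a minimal sketch of that idea, not the authors' method: it assumes a toy linear classifier, a single fixed noise scale `sigma` (the paper learns how much noise to inject per weight), and two hypothetical scores, the entropy of the mean prediction and a summed variance-to-mean ratio. The function names are illustrative.

```python
import math
import random


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def perturbed_predictions(x, weights, sigma, n_samples, rng):
    """Class probabilities from n_samples noisy copies of a linear classifier.

    weights[k] is the weight vector of class k. sigma is a fixed noise scale,
    which is an assumption of this sketch (the paper learns the perturbation).
    """
    samples = []
    for _ in range(n_samples):
        logits = [
            sum((w + sigma * rng.gauss(0.0, 1.0)) * xi for w, xi in zip(row, x))
            for row in weights
        ]
        samples.append(softmax(logits))
    return samples


def ood_scores(samples, eps=1e-12):
    """Return (entropy of the mean prediction, summed index of dispersion)."""
    n = len(samples)
    n_classes = len(samples[0])
    mean = [sum(s[k] for s in samples) / n for k in range(n_classes)]
    var = [sum((s[k] - mean[k]) ** 2 for s in samples) / n for k in range(n_classes)]
    entropy = -sum(p * math.log(p + eps) for p in mean)
    dispersion = sum(v / (m + eps) for v, m in zip(var, mean))
    return entropy, dispersion
```

Under these assumptions, an input whose predictions change a lot across weight samples gets a larger dispersion score; a threshold on either score would then act as the OOD detector.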

# (参考訳) ニューラルネットワークによるグラフ粗粒化

Graph Coarsening with Neural Networks ( http://arxiv.org/abs/2102.01350v1 )

ライセンス: CC BY 4.0

Chen Cai, Dingkang Wang, Yusu Wang

(参考訳) 大規模グラフがますます普及するにつれて、大規模グラフデータの処理、抽出、分析に重要な計算上の課題が生じる。グラフ粗大化は、重要な特性を維持しながらグラフのサイズを減らすための一般的なテクニックの1つです。リッチなグラフ粗い文献にもかかわらず、この分野におけるデータ駆動メソッドの探索は限られている。本研究では,グラフ粗化のためのグラフの深層学習の最近の進歩を活用する。我々はまず,粗いアルゴリズムの品質を測定するためのフレームワークを提案し,目標に応じて粗いグラフ上のLaplace演算子と関連するプロジェクション/リフト演算子を慎重に選択する必要があることを示した。粗いグラフに対する現在のエッジウェイト選択が準最適である可能性が示唆され、グラフニューラルネットワークを用いて重み付けマップをパラメータ化し、教師なし方法で粗い品質を改善するよう訓練する。本手法は, 合成ネットワークと実ネットワークの両方における広範な実験により, 還元率, グラフサイズ, グラフタイプなど, 一般的なグラフ粗さ化手法を大幅に改善できることを実証した。これは、より大きなサイズのグラフ(トレーニンググラフの $25\times$)に一般化し、異なる損失(微分可能かつ非微分可能)に適応し、より大きなグラフにスケールする。

As large-scale graphs become increasingly prevalent, processing, extracting and analyzing large graph data poses significant computational challenges. Graph coarsening is one popular technique to reduce the size of a graph while maintaining essential properties. Despite the rich graph coarsening literature, there is only limited exploration of data-driven methods in the field. In this work, we leverage the recent progress of deep learning on graphs for graph coarsening. We first propose a framework for measuring the quality of coarsening algorithms and show that, depending on the goal, we need to carefully choose the Laplace operator on the coarse graph and the associated projection/lift operators. Motivated by the observation that the current choice of edge weight for the coarse graph may be sub-optimal, we parametrize the weight assignment map with graph neural networks and train it to improve the coarsening quality in an unsupervised way. Through extensive experiments on both synthetic and real networks, we demonstrate that our method significantly improves common graph coarsening methods under various metrics, reduction ratios, graph sizes, and graph types. It generalizes to graphs of larger size ($25\times$ of training graphs), is adaptive to different losses (differentiable and non-differentiable), and scales to much larger graphs than previous work.

翻訳日:2021-02-04 02:45:47 公開日:2021-02-02

# (参考訳) 骨格と成分特徴に基づくグラフ分類

Graph Classification Based on Skeleton and Component Features ( http://arxiv.org/abs/2102.01428v1 )

ライセンス: CC BY 4.0

Xue Liu, Wei Wei, Xiangnan Feng, Xiaobo Cao, Dan Sun

(参考訳) グラフ埋め込みを学習するためのほとんどの既存の一般的な方法は、固定順序のグローバル構造の特徴と構造階層表現の欠如のみを考慮します。この弱点に対処するため、匿名のランダムウォークで学習した定階構造と異なるサイズのサブグラフを用いたコンポーネント情報を用いて、骨格情報に基づく分類を実現するグラフ埋め込みアルゴリズムGraphCSCを提案する。 2つのグラフは、スケルトンとコンポーネントの両方が類似している場合に類似しているため、私たちのモデルでは、両方のグラフをグラフの均質性特性として埋め込みに統合します。最新のベースラインの包括的なリストと比較し、異なるデータセット上でモデルを示すとともに、実世界のグラフ分類タスクにおいて、私たちの研究が優れていることを実験で示します。

Most existing popular methods for learning graph embeddings only consider fixed-order global structural features and lack hierarchical representation of structures. To address this weakness, we propose a novel graph embedding algorithm named GraphCSC that realizes classification based on skeleton information, using fixed-order structures learned in an anonymous random-walk manner, and component information, using subgraphs of different sizes. Two graphs are similar if their skeletons and components are both similar; thus, in our model, we integrate both of them into embeddings as a graph homogeneity characterization. We demonstrate our model on different datasets in comparison with a comprehensive list of up-to-date state-of-the-art baselines, and experiments show that our work is superior in real-world graph classification tasks.

翻訳日:2021-02-04 02:15:16 公開日:2021-02-02

# (参考訳) ニューラルネットワークを用いた未校正センサのリアルタイム検出

Real-time detection of uncalibrated sensors using Neural Networks ( http://arxiv.org/abs/2102.01565v1 )

ライセンス: CC BY 4.0

Luis J. Mu\~noz-Molina, Ignacio Cazorla-Pi\~nar, Juan P. Dominguez-Morales, Fernando Perez-Pe\~na

(参考訳) 現在、センサは、科学、産業、日常生活など、その使用の恩恵を受けるいくつかのコンテキストにおいて重要な役割を果たす。しかし、取得した情報は信頼できるものでなければならない。センサの挙動の異常は、科学プロジェクトを台無しにしたり、工業生産ラインにおける生産の質を損なうなどの重大な結果をもたらす可能性がある。より微妙な種類の異常の1つは不均衡である。地上真理値に応じてキャリブレーションによりセンサが調整または標準化されていない場合、不校正が行われると言われる。本研究では,オンライン学習に基づく温度・湿度・圧力センサの非校正検出装置を開発した。このソリューションはニューラルネットワークをメインコンポーネントとして統合し、校正条件下でのセンサーの動作から学習する。そして、トレーニングとデプロイの後、一度発生した未校正を検出する。その結果, 提案手法は, 偏差値0.25度, 1% RH, 1.5Paの偏差をそれぞれ検出できることがわかった。このソリューションは、新しいセンサーの追加、新しい環境へのデプロイ、最小限のデータ量でモデルのトレーニングを可能にするトランスファーラーニングによって異なるコンテキストに適応することができる。

Nowadays, sensors play a major role in several contexts, such as science, industry and daily life, which benefit from their use. However, the retrieved information must be reliable. Anomalies in the behavior of sensors can give rise to critical consequences such as ruining a scientific project or jeopardizing the quality of the production in industrial production lines. One of the more subtle kinds of anomalies is uncalibration. An uncalibration is said to take place when the sensor is not adjusted or standardized by calibration according to a ground truth value. In this work, an online machine-learning based uncalibration detector for temperature, humidity and pressure sensors was developed. This solution integrates an Artificial Neural Network as its main component, which learns from the behavior of the sensors under calibrated conditions. Then, after being trained and deployed, it detects uncalibrations once they take place. The obtained results show that the proposed solution is able to detect uncalibrations for deviation values of 0.25 degrees, 1% RH and 1.5 Pa, respectively. This solution can be adapted to different contexts by means of transfer learning, whose application allows for the addition of new sensors, the deployment into new environments and the retraining of the model with minimum amounts of data.

翻訳日:2021-02-04 02:02:45 公開日:2021-02-02

# (参考訳) エージェントインセンティブ:因果的視点

Agent Incentives: A Causal Perspective ( http://arxiv.org/abs/2102.01685v1 )

ライセンス: CC BY 4.0

Tom Everitt, Ryan Carey, Eric Langlois, Pedro A Ortega, Shane Legg

(参考訳) 因果関係図を用いてエージェントインセンティブを分析するためのフレームワークを提案する。我々は、情報の価値に関する有名な基準が完成していると断定する。制御値に対する新たなグラフィカル基準を提案し、その健全性と完全性を確立します。また、環境の変化が最適な決定に影響を与えるかを示す応答インセンティブと、エージェントが変数 X を介してその有用性に影響を与えることができるかどうかを決定する機器制御インセンティブの2つの新しい概念を紹介します。両方の新しい概念について、私たちはサウンドと完全なグラフィカルな基準を提供します。これらの結果がAIシステムの安全性と公平性を評価するのにどのように役立つかを例に示します。

We present a framework for analysing agent incentives using causal influence diagrams. We establish that a well-known criterion for value of information is complete. We propose a new graphical criterion for value of control, establishing its soundness and completeness. We also introduce two new concepts for incentive analysis: response incentives indicate which changes in the environment affect an optimal decision, while instrumental control incentives establish whether an agent can influence its utility via a variable X. For both new concepts, we provide sound and complete graphical criteria. We show by example how these results can help with evaluating the safety and fairness of an AI system.

翻訳日:2021-02-04 01:50:28 公開日:2021-02-02

# (参考訳) WeNet: プロダクションファーストとプロダクションレディエンドツーエンドの音声認識ツールキット

WeNet: Production First and Production Ready End-to-End Speech Recognition Toolkit ( http://arxiv.org/abs/2102.01547v1 )

ライセンス: CC BY 4.0

Binbin Zhang, Di Wu, Chao Yang, Xiaoyu Chen, Zhendong Peng, Xiangming Wang, Zhuoyuan Yao, Xiong Wang, Fan Yu, Lei Xie, Xin Lei

(参考訳) 本稿では、WeNetという新しいオープンソース、プロダクションファースト、プロダクション対応のエンドツーエンド(E2E)音声認識ツールキットを紹介します。 WeNetの主な動機は、E2E音声認識モデルの研究と製造の間のギャップを埋めることです。 WeNetは、ASRアプリケーションを複数の実世界のシナリオで展開する効率的な方法を提供しており、これは他のオープンソースのE2E音声認識ツールキットの主な違いと利点である。本稿では、モデルアーキテクチャ、フレームワーク設計、パフォーマンスメトリクスを含む3つの側面からWeNetを紹介します。 WeNetを用いたAISHELL-1の実験では、統一されたストリーミングおよび非ストリーミング2パス(U2)E2Eモデル上で有望な文字誤り率(CER)を与えるだけでなく、合理的なRTFとレイテンシも示しています。このツールキットはhttps://github.com/mobvoi/wenetで公開されている。

In this paper, we present a new open source, production first and production ready end-to-end (E2E) speech recognition toolkit named WeNet. The main motivation of WeNet is to close the gap between the research and the production of E2E speech recognition models. WeNet provides an efficient way to ship ASR applications in several real-world scenarios, which is its main difference from, and advantage over, other open source E2E speech recognition toolkits. This paper introduces WeNet from three aspects: model architecture, framework design and performance metrics. Our experiments on AISHELL-1 using WeNet not only give a promising character error rate (CER) on a unified streaming and non-streaming two-pass (U2) E2E model but also show reasonable RTF and latency; both of these aspects are favored for production adoption. The toolkit is publicly available at https://github.com/mobvoi/wenet.

翻訳日:2021-02-04 01:26:10 公開日:2021-02-02

# (参考訳) 連続的な手振りで話し、調音音声シンセサイザーを制御する

SPEAK WITH YOUR HANDS Using Continuous Hand Gestures to control Articulatory Speech Synthesizer ( http://arxiv.org/abs/2102.01640v1 )

ライセンス: CC BY 4.0

Pramit Saha, Debasish Ray Mohapatra, Sidney Fels

(参考訳) 本稿では,調音音声合成エンジンであるPink Tromboneを手振りで制御する取り組みの進歩について述べる。本インタフェースは,声道領域関数に基づく調音音声合成により,連続的な指の動きと手首の屈曲を連続音声に変換する。私たちは、仮想舌を制御するために、手首と個々の指の運動情報をキャプチャするために18のセンサーを備えたCyberglove IIを使用します。センサーの座標と曲げ値は、ノイズの多い値と外れ値を滑らかにするスプライン舌モデルに適合するために利用されます。上口蓋を固定とし,スプラインモデルを声道の動的下面(舌)として考慮し,Pink Tromboneに供給される1次元領域関数値を計算し,連続的な発声音を生成する。したがって、手首と指を操作することを学ぶことによって、声道を使用する必要なしに、単に自分の手を通して音声音を生成することを学ぶことができます。

This work presents our advancements in controlling an articulatory speech synthesis engine, viz., Pink Trombone, with hand gestures. Our interface translates continuous finger movements and wrist flexion into continuous speech using vocal tract area-function based articulatory speech synthesis. We use Cyberglove II with 18 sensors to capture the kinematic information of the wrist and the individual fingers, in order to control a virtual tongue. The coordinates and the bending values of the sensors are then utilized to fit a spline tongue model that smoothens out the noisy values and outliers. Considering the upper palate as fixed and the spline model as the dynamically moving lower surface (tongue) of the vocal tract, we compute 1D area functional values that are fed to the Pink Trombone, generating continuous speech sounds. Therefore, by learning to manipulate one's wrist and fingers, one can learn to produce speech sounds just through one's hands, without the need for using the vocal tract.

翻訳日:2021-02-04 01:15:32 公開日:2021-02-02

# (参考訳) 時間適応ガウスモデル

Time Adaptive Gaussian Model ( http://arxiv.org/abs/2102.01238v1 )

ライセンス: CC BY 4.0

Federico Cieca, Veronica Tozzo

(参考訳) 多変量時系列分析は、データ分析パイプラインの不可欠な部分になりつつある。コ変数間の個々のタイムポイント接続と、これらの接続が時間内でどのように変化するかを理解することは簡単ではない。そこで本研究では,隠れマルコフモデルとガウスグラフィックモデル-時間適応ガウスモデル(TAGM)を活用した新しい手法を提案する。本モデルは時間的グラフィカルモデルの推論のための最先端手法の一般化であり,その定式化は,現在の手法よりも優れた結果を提供するモデルの両側面を活用している。特に、時間内にデータポイントをクラスタリングすることでパターン認識を行い、観察された変数間の確率的(そしておそらく因果関係)の関係を見出す。時間的ネットワーク推論の現在の方法と比較して、良い推論性能を示しながら基本的な仮定を減らします。

Multivariate time series analysis is becoming an integral part of data analysis pipelines. Understanding the individual time point connections between covariates as well as how these connections change in time is non-trivial. To this aim, we propose a novel method that leverages Hidden Markov Models and Gaussian Graphical Models -- Time Adaptive Gaussian Model (TAGM). Our model is a generalization of state-of-the-art methods for the inference of temporal graphical models; its formulation leverages both aspects of these models, providing better results than current methods. In particular, it performs pattern recognition by clustering data points in time, and it finds probabilistic (and possibly causal) relationships among the observed variables. Compared to current methods for temporal network inference, it reduces the basic assumptions while still showing good inference performance.

翻訳日:2021-02-04 00:53:48 公開日:2021-02-02

# (参考訳) グラフモデルを用いたガウス専門家の選択

Gaussian Experts Selection using Graphical Models ( http://arxiv.org/abs/2102.01496v1 )

ライセンス: CC BY 4.0

Hamed Jalali, Martin Pawelczyk, Gjerji Kasneci

(参考訳) 局所近似はガウス過程(GP)をビッグデータに拡張する一般的な手法である。ローカル近似は、元のデータセットをサブセットに分割し、各サブセットでローカルエキスパートをトレーニングすることで、時間の複雑さを低減する。専門家の予測の集約は、専門家間の条件依存または独立を仮定して行われる。専門家間の \emph{conditional independent assumption} (CI) を課すと、異なる専門家の予測の集約が、不確実性の定量化のコストで時間効率良く行われる。一方、モデルに依存する専門家は、非現実的に高い計算コストを犠牲にして正確な予測と不確実性定量を提供することができる。理論ガイドによる専門家選定ステップを通じて弱い専門家を排除することにより、依存専門家を集約する計算コストを大幅に削減し、校正された不確実性の定量化を確保します。専門家間の条件付き依存関係をエンコードするスパース精度行列を使用して,最も重要な専門家を選択することで,無向なグラフィカルモデルに関する文献の手法を活用する。モレロフ

Local approximations are popular methods to scale Gaussian processes (GPs) to big data. Local approximations reduce time complexity by dividing the original dataset into subsets and training a local expert on each subset. Aggregating the experts' prediction is done assuming either conditional dependence or independence between the experts. Imposing the \emph{conditional independence assumption} (CI) between the experts renders the aggregation of different expert predictions time efficient at the cost of poor uncertainty quantification. On the other hand, modeling dependent experts can provide precise predictions and uncertainty quantification at the expense of impractically high computational costs. By eliminating weak experts via a theory-guided expert selection step, we substantially reduce the computational cost of aggregating dependent experts while ensuring calibrated uncertainty quantification. We leverage techniques from the literature on undirected graphical models, using sparse precision matrices that encode conditional dependencies between experts to select the most important experts. Moreov

翻訳日:2021-02-04 00:41:35 公開日:2021-02-02

# (参考訳) 確率勾配を持つ正確なランゲビンダイナミクス

Exact Langevin Dynamics with Stochastic Gradients ( http://arxiv.org/abs/2102.01691v1 )

ライセンス: CC BY-SA 4.0

Adri\`a Garriga-Alonso and Vincent Fortuin

(参考訳) 確率勾配マルコフチェーンモンテカルロアルゴリズムは近似推論のための一般的なサンプラーであるが、一般的にバイアスがある。これらの方法の最近のバージョンの多く(例: Chen et al. (2014))は、受理確率が常にゼロであるため、メトロポリス・ヘイスティングス棄却サンプリングでは補正できないことを示す。確率勾配Langevinダイナミクス(Welling and Teh, 2011)とハミルトンモンテカルロを一般化するGradient-Guided Monte Carlo (Horowitz, 1991)のような、実現可能な後方方向の軌道を持つサンプラーを使用することで、これを修正できます。このサンプラーは確率勾配で使用することができ、複数のステップにわたって計算できる非ゼロの受理確率が得られることを示す。

Stochastic gradient Markov Chain Monte Carlo algorithms are popular samplers for approximate inference, but they are generally biased. We show that many recent versions of these methods (e.g. Chen et al. (2014)) cannot be corrected using Metropolis-Hastings rejection sampling, because their acceptance probability is always zero. We can fix this by employing a sampler with realizable backwards trajectories, such as Gradient-Guided Monte Carlo (Horowitz, 1991), which generalizes stochastic gradient Langevin dynamics (Welling and Teh, 2011) and Hamiltonian Monte Carlo. We show that this sampler can be used with stochastic gradients, yielding nonzero acceptance probabilities, which can be computed even across multiple steps.

翻訳日:2021-02-04 00:24:32 公開日:2021-02-02

# (参考訳) 対話型再構成によるジェネラティブモデルの解釈可能性評価

Evaluating the Interpretability of Generative Models by Interactive Reconstruction ( http://arxiv.org/abs/2102.01264v1 )

ライセンス: CC BY 4.0

Andrew Slavin Ross, Nina Chen, Elisa Zhao Hang, Elena L. Glassman, Finale Doshi-Velez

(参考訳) 機械学習モデルが多数の社会技術システムで最も有用であるためには、多くはそれらが人間に解釈可能でなければならないと主張した。しかし、解釈可能性への関心が高まりつつあるにもかかわらず、その測定方法に関する確固たるコンセンサスはいまだにない。これは表現学習において特に当てはまり、解釈可能性の研究は、合成データセットにのみ適用され、人間の要因に基づかない「偏角」測定に焦点を当てている。生成モデル表現の人間解釈可能性を定量化するタスクを導入し、ユーザが対話的に表現を修正してターゲットインスタンスを再構築する。合成データセットでは、このタスクの性能がベースラインアプローチよりもはるかに確実に絡み合ったモデルと絡み合ったモデルを区別する。実際のデータセットでは、広く信じられているが、多かれ少なかれ解釈可能なモデルを生成することが示されない表現学習方法の違いを見出す。いずれの場合も、Amazon Mechanical Turkに関する小規模のシンクアルード研究と大規模実験を実施し、定性的および定量的な結果が一致したことを確認しました。

For machine learning models to be most useful in numerous sociotechnical systems, many have argued that they must be human-interpretable. However, despite increasing interest in interpretability, there remains no firm consensus on how to measure it. This is especially true in representation learning, where interpretability research has focused on "disentanglement" measures only applicable to synthetic datasets and not grounded in human factors. We introduce a task to quantify the human-interpretability of generative model representations, where users interactively modify representations to reconstruct target instances. On synthetic datasets, we find performance on this task much more reliably differentiates entangled and disentangled models than baseline approaches. On a real dataset, we find it differentiates between representation learning methods widely believed but never shown to produce more or less interpretable models. In both cases, we ran small-scale think-aloud studies and large-scale experiments on Amazon Mechanical Turk to confirm that our qualitative and quantitative results agreed.

翻訳日:2021-02-03 23:55:04 公開日:2021-02-02

# (参考訳) 無線ネットワークプロトコル合成のためのマルチエージェント強化学習に向けて

Towards Multi-agent Reinforcement Learning for Wireless Network Protocol Synthesis ( http://arxiv.org/abs/2102.01611v1 )

ライセンス: CC BY 4.0

Hrishikesh Dutta and Subir Biswas

(参考訳) 本稿では,無線ネットワークのためのマルチエージェント強化学習に基づくメディアアクセスフレームワークを提案する。アクセス問題はマルコフ決定プロセス(MDP)として定式化され、各ネットワークノードが分散学習エージェントとして機能する強化学習を用いて解決される。ソリューションコンポーネントは、ノードエージェントが自己複製を制御するためにMAC層パケットの負荷を制御することを漸進的に学習する単一ノードアクセスシナリオから、ステップバイステップで開発される。戦略は、より精巧な報酬構造を使用して、マルチノードの完全接続シナリオにスケールアップされます。また、より一般的な部分連結トポロジーに対する予備的実現可能性を示す。また,mac層伝送確率の調整を学べば,最適負荷時の理論的最大スループットが達成できるだけでなく,従来の手法と異なり,高い負荷条件でも最大スループットを維持できることを示した。さらに、この機能を保ちながら、そのメカニズムは異種ロードに依存しない。また、ノード間のプロトコルのアクセス優先度をパラメトリックに調整できることを示した。最後に、強化学習のオンライン学習機能により、プロトコルを時間変化の負荷条件に適応させることができることを示す。

This paper proposes a multi-agent reinforcement learning based medium access framework for wireless networks. The access problem is formulated as a Markov Decision Process (MDP), and solved using reinforcement learning with every network node acting as a distributed learning agent. The solution components are developed step by step, starting from a single-node access scenario in which a node agent incrementally learns to control MAC layer packet loads for reining in self-collisions. The strategy is then scaled up for multi-node fully-connected scenarios by using more elaborate reward structures. It also demonstrates preliminary feasibility for more general partially connected topologies. It is shown that by learning to adjust MAC layer transmission probabilities, the protocol is not only able to attain theoretical maximum throughput at an optimal load, but unlike classical approaches, it can also retain that maximum throughput at higher loading conditions. Additionally, the mechanism is agnostic to heterogeneous loading while preserving that feature. It is also shown that access priorities of the protocol across nodes can be parametrically adjusted. Finally, it is also shown that the online learning feature of reinforcement learning is able to make the protocol adapt to time-varying loading conditions.

翻訳日:2021-02-03 23:08:24 公開日:2021-02-02

# (参考訳) 科学計算のためのプログラム合成ワークショップ報告

Report of the Workshop on Program Synthesis for Scientific Computing ( http://arxiv.org/abs/2102.01687v1 )

ライセンス: CC BY 4.0

Hal Finkel, Ignacio Laguna

(参考訳) プログラム合成は、学術、国立研究所、産業において活発な研究分野である。しかし、科学計算に直接適用できる仕事は、いくつかの印象的な成功を収めているが、制限されている。本報告は,科学計算におけるプログラム合成作業の関連分野を概観し,これまでの成功を議論し,今後の作業の機会を概説する。本報告は,2020年8月4日～5日にオンラインで開催された科学計算のためのプログラム合成ワークショップ(https://prog-synth-science.github.io/2020/)の成果である。

Program synthesis is an active research field in academia, national labs, and industry. Yet, work directly applicable to scientific computing, while having some impressive successes, has been limited. This report reviews the relevant areas of program synthesis work for scientific computing, discusses successes to date, and outlines opportunities for future work. This report is the result of the Workshop on Program Synthesis for Scientific Computing, which was held virtually on August 4-5, 2020 (https://prog-synth-science.github.io/2020/).

翻訳日:2021-02-03 22:35:50 公開日:2021-02-02

# (参考訳) JPEGプライマリ量子化行列推定とクラスタリングによる画像スプライシング検出, 局在化, 属性化

Image Splicing Detection, Localization and Attribution via JPEG Primary Quantization Matrix Estimation and Clustering ( http://arxiv.org/abs/2102.01439v1 )

ライセンス: CC BY 4.0

Yakun Niu, Benedetta Tondi, Yao Zhao, Rongrong Ni and Mauro Barni

(参考訳) 異なる画像領域にわたる二重JPEGアーティファクトの不整合の検出は、画像スプライシングのような局所的な画像操作を検出し、それらをローカライズするためにしばしば使用される。本稿では,スプライシング領域の検出と局所化に加えて,異なるドナー画像から得られる領域を識別するエンド・ツー・エンドシステムを提案する。分割された領域と背景画像の両方が二重JPEG圧縮されていると仮定し、一次量子化行列の局所推定を用いて異なるソースから抽出されたスプライシング領域を区別する。そこで,推定された一次量子化行列に従って画像ブロックをクラスタリングし,形態的再構成により精度を向上させる。提案手法は,第2の圧縮が第1の圧縮よりも強いか弱いかに関わらず,アライメントと非アライメントの2つのJPEG圧縮を含む多種多様な設定で動作可能である。類似条件下でのベースライン法に対して優れた性能を示す広範な実験により,提案手法を検証した。

Detection of inconsistencies of double JPEG artefacts across different image regions is often used to detect local image manipulations, like image splicing, and to localize them. In this paper, we move one step further, proposing an end-to-end system that, in addition to detecting and localizing spliced regions, can also distinguish regions coming from different donor images. We assume that both the spliced regions and the background image have undergone a double JPEG compression, and use a local estimate of the primary quantization matrix to distinguish between spliced regions taken from different sources. To do so, we cluster the image blocks according to the estimated primary quantization matrix and refine the result by means of morphological reconstruction. The proposed method can work in a wide variety of settings including aligned and non-aligned double JPEG compression, and regardless of whether the second compression is stronger or weaker than the first one. We validated the proposed approach by means of extensive experiments showing its superior performance with respect to baseline methods working in similar conditions.

翻訳日:2021-02-03 21:13:15 公開日:2021-02-02

# (参考訳) U-LanD:不確実性駆動のビデオランドマーク検出

U-LanD: Uncertainty-Driven Video Landmark Detection ( http://arxiv.org/abs/2102.01586v1 )

ライセンス: CC BY 4.0

Mohammad H. Jafari, Christina Luong, Michael Tsang, Ang Nan Gu, Nathan Van Woudenberg, Robert Rohling, Teresa Tsang, Purang Abolmaesumi

(参考訳) 本稿では,ビデオ中のキーフレームとランドマークを共同検出するためのフレームワークであるU-LanDを提案する。私たちは、トレーニングラベルが騒々しく、非常にスパースな、特に困難な問題に取り組みます。 U-LanDは、重要なビデオフレームでのみ訓練された深いベイズランドマーク検出器が、それらのフレームの予測不確実性を大幅に低下させています。この観測を教師なし信号として使用し、ランドマークを検出するキーフレームを自動的に認識する。本フレームワークの試験ベッドとして,各ビデオの1フレームでのみ,スパースとノイジーな臨床ラベルが使用可能な,心臓の超音波画像を用いた。 4,493人のデータを用いて、U-LanDは、現在最先端の非ベイズ系患者よりも、R2スコアの42%という顕著な絶対的マージンで、モデルサイズにほとんどオーバーヘッドを課さないことが実証された。私たちのアプローチは汎用的で、騒々しいトレーニングラベルを持つ他の挑戦的なデータに適用できます。

This paper presents U-LanD, a framework for joint detection of key frames and landmarks in videos. We tackle a specifically challenging problem, where training labels are noisy and highly sparse. U-LanD builds upon a pivotal observation: a deep Bayesian landmark detector solely trained on key video frames, has significantly lower predictive uncertainty on those frames vs. other frames in videos. We use this observation as an unsupervised signal to automatically recognize key frames on which we detect landmarks. As a test-bed for our framework, we use ultrasound imaging videos of the heart, where sparse and noisy clinical labels are only available for a single frame in each video. Using data from 4,493 patients, we demonstrate that U-LanD can exceedingly outperform the state-of-the-art non-Bayesian counterpart by a noticeable absolute margin of 42% in R2 score, with almost no overhead imposed on the model size. Our approach is generic and can be potentially applied to other challenging data with noisy and sparse training labels.

翻訳日:2021-02-03 20:49:53 公開日:2021-02-02

# (参考訳) 局所差分プライバシーは$E_\gamma$-Divergenceの収縮と同等である

Local Differential Privacy Is Equivalent to Contraction of $E_\gamma$-Divergence ( http://arxiv.org/abs/2102.01258v1 )

ライセンス: CC BY 4.0

Shahab Asoodeh, Maryam Aliakbarpour, and Flavio P. Calmon

(参考訳) ランダム化プライバシーメカニズムの局所差分プライバシー(LDP)保証について,その収縮特性を用いて検討する。まず, LDP 制約は $E_\gamma$-divergence の収縮係数で等価にキャストできることを示した。次に、この等価な式を使用して、任意の $f$-divergences の収縮係数の点でプライバシーメカニズムの LDP 保証を表現する。標準的な推定理論ツール(例えばル・カムやファノの逆手法)と組み合わせると、この結果はいくつかのテストにおいてプライバシーとユーティリティの間のトレードオフとミニマックスとベイズ推定問題を調べることができる。

We investigate the local differential privacy (LDP) guarantees of a randomized privacy mechanism via its contraction properties. We first show that LDP constraints can be equivalently cast in terms of the contraction coefficient of the $E_\gamma$-divergence. We then use this equivalent formula to express LDP guarantees of privacy mechanisms in terms of contraction coefficients of arbitrary $f$-divergences. When combined with standard estimation-theoretic tools (such as Le Cam's and Fano's converse methods), this result allows us to study the trade-off between privacy and utility in several testing and minimax and Bayesian estimation problems.
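
As a small numerical companion to the abstract above, the sketch below computes the $E_\gamma$-divergence (also called the hockey-stick divergence) for discrete distributions and checks it on binary randomized response. It relies on the standard pointwise characterisation that a mechanism $K$ is $\varepsilon$-LDP exactly when $E_{e^\varepsilon}(K(\cdot|x)\,\|\,K(\cdot|x')) = 0$ for every pair of inputs. It only checks pairs of inputs and does not compute the contraction coefficient over arbitrary input distributions studied in the paper; all function names are illustrative.

```python
import math
import numpy as np

def e_gamma(p, q, gamma):
    """E_gamma-divergence between two discrete distributions:
    sum_y max(p(y) - gamma * q(y), 0), for gamma >= 1."""
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    return float(np.maximum(p - gamma * q, 0.0).sum())

def randomized_response(eps):
    """Binary randomized response as a 2x2 row-stochastic matrix K[x, y]:
    the true bit is reported with probability e^eps / (1 + e^eps)."""
    keep = math.exp(eps) / (1.0 + math.exp(eps))
    return np.array([[keep, 1.0 - keep],
                     [1.0 - keep, keep]])

def max_pairwise_e_gamma(K, gamma):
    """Largest E_gamma-divergence between the output distributions of any two inputs."""
    n = K.shape[0]
    return max(e_gamma(K[x], K[xp], gamma) for x in range(n) for xp in range(n))

eps = 1.0
K = randomized_response(eps)
tight = max_pairwise_e_gamma(K, math.exp(eps))        # gamma = e^eps
loose = max_pairwise_e_gamma(K, math.exp(eps / 2.0))  # a smaller gamma
```

Here `tight` vanishes (up to rounding), which matches randomized response being $\varepsilon$-LDP, while `loose` is strictly positive because the same mechanism is not $(\varepsilon/2)$-LDP.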

翻訳日:2021-02-03 20:25:35 公開日:2021-02-02

# (参考訳) 分散確率勾配降下の安定性と一般化

Stability and Generalization of the Decentralized Stochastic Gradient Descent ( http://arxiv.org/abs/2102.01302v1 )

ライセンス: CC BY 4.0

Tao Sun, Dongsheng Li, Bao Wang

(参考訳) 確率勾配に基づく手法の安定性と一般化は、機械学習モデルのアルゴリズム性能を理解する上で貴重な洞察を与える。深層学習のメインワークホースとして、確率勾配降下はかなりの量の研究を受けている。しかし、コミュニティはその分散型の変種にほとんど注意を払わなかった。本論文では,分散確率勾配降下の新たな定式化を提案する。この定式化と(非凸最適化理論を併用して、分散確率勾配勾配の第一の安定性と一般化保証を確立する。我々の理論的結果は、いくつかの一般的かつ穏やかな仮定に基づいて構築され、分散化が初めてsgdの安定性を低下させることが明らかとなった。さまざまな分散設定とベンチマーク機械学習モデルを用いて理論的結果を検証する。

The stability and generalization of stochastic gradient-based methods provide valuable insights into the algorithmic performance of machine learning models. As the main workhorse for deep learning, stochastic gradient descent has received considerable study. Nevertheless, the community has paid little attention to its decentralized variants. In this paper, we provide a novel formulation of the decentralized stochastic gradient descent. Leveraging this formulation together with (non)convex optimization theory, we establish the first stability and generalization guarantees for the decentralized stochastic gradient descent. Our theoretical results are built on top of a few common and mild assumptions and reveal, for the first time, that decentralization deteriorates the stability of SGD. We verify our theoretical findings by using a variety of decentralized settings and benchmark machine learning models.
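
For readers unfamiliar with the decentralized setting, the following is a minimal sketch of one common form of decentralized SGD: each node averages its iterate with its neighbours through a doubly stochastic mixing matrix and then takes a local gradient step. It is not the paper's formulation or analysis; the ring topology, the quadratic local objectives, and all names are assumptions chosen to keep the example small.

```python
import numpy as np

def ring_mixing_matrix(n: int) -> np.ndarray:
    """Doubly stochastic mixing matrix for a ring: each node averages
    itself and its two neighbours with weight 1/3 (requires n >= 3)."""
    W = np.zeros((n, n))
    for i in range(n):
        for j in (i - 1, i, i + 1):
            W[i, j % n] = 1.0 / 3.0
    return W

def decentralized_sgd(b, W, lr=0.1, steps=500, noise_std=0.0, seed=0):
    """Node i holds f_i(x) = 0.5 * (x - b[i])**2 and a local iterate x[i].
    Each step: mix with neighbours via W, then take a local (noisy) gradient step."""
    rng = np.random.default_rng(seed)
    b = np.asarray(b, dtype=float)
    x = np.zeros_like(b)
    for _ in range(steps):
        grad = (x - b) + noise_std * rng.normal(size=b.shape)
        x = W @ x - lr * grad
    return x

b = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
W = ring_mixing_matrix(5)
x = decentralized_sgd(b, W)
```

With exact gradients the network average of the iterates converges to the minimiser of the average objective (the mean of `b`), while the individual nodes keep a residual disagreement under a constant step size. Setting `noise_std` above zero turns the gradient steps into stochastic ones.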

翻訳日:2021-02-03 20:03:22 公開日:2021-02-02

# (参考訳) 簡易予測器によるオンライン学習と0/1ゲームにおけるMinimaxの組合せ評価

Online Learning with Simple Predictors and a Combinatorial Characterization of Minimax in 0/1 Games ( http://arxiv.org/abs/2102.01646v1 )

ライセンス: CC BY 4.0

Steve Hanneke, Roi Livni, and Shay Moran

(参考訳) どのクラスがオンラインモデルで適切に学習できるのか? つまり、各ラウンドで概念クラスから予測子を使用するアルゴリズムによって。不適切な学習が必要な単純で自然なケースもあるが、不適切な予測器がどの程度複雑でなければならないのかを問うのは当然である。単純な"予測器を使って、常にほぼ最適のミス/リグレット境界を達成できるのか? 本研究は,アングリン(1987年)とリトルストーン(1988年)の先駆的研究から研究されてきたオープンな課題を解決するために,これがいつ可能かを完全に特徴づける。より正確には、任意の概念クラス C と任意の仮説クラス H を考えると、H からの予測器を用いてオンライン学習 C の最適誤差境界について、ほぼ厳しい境界 (ログファクタまで) を提供します。アプリケーションとして、(i)実現可能な設定では、(定数まで)ほぼ最適の誤りバウンドが、適切な予測者のスパース多数投票によって達成可能であり、(ii)不可知な設定では、(ログ係数まで)ほぼ最適の後悔バウンドをランダム化された固有アルゴリズムで達成できることを示す構成的証明を与える。独立性のある証明の技術的要素は、二元零サムゲームに対する有名なミニマックス定理(von Neumann, 1928)の一般化である。 Minimaxを満たすのに失敗する単純なゲームは、各プレーヤーが数字を選択し、より大きな数字が勝つ「大きな数字を誘導する」です。ペイオフ行列は無限三角形である。これが唯一の障害であることを示す:ゲームが非有界サイズの三角部分行列を含まないならば、ミニマックス定理は成立する。これはフォン・ノイマンのミニマックス定理を有限性(あるいはコンパクト性)の要件を取り除いて一般化し、オンライン学習に関心のあるゲームを正確に捉えている。

Which classes can be learned properly in the online model, that is, by an algorithm that at each round uses a predictor from the concept class? While there are simple and natural cases where improper learning is necessary, it is natural to ask how complex the improper predictors must be in such cases. Can one always achieve nearly optimal mistake/regret bounds using "simple" predictors? In this work, we give a complete characterization of when this is possible, thus settling an open problem which has been studied since the pioneering works of Angluin (1987) and Littlestone (1988). More precisely, given any concept class C and any hypothesis class H, we provide nearly tight bounds (up to a log factor) on the optimal mistake bounds for online learning C using predictors from H. Our bound yields an exponential improvement over the previously best known bound by Chase and Freitag (2020). As applications, we give constructive proofs showing that (i) in the realizable setting, a near-optimal mistake bound (up to a constant factor) can be attained by a sparse majority-vote of proper predictors, and (ii) in the agnostic setting, a near-optimal regret bound (up to a log factor) can be attained by a randomized proper algorithm. A technical ingredient of our proof which may be of independent interest is a generalization of the celebrated Minimax Theorem (von Neumann, 1928) for binary zero-sum games. A simple game which fails to satisfy Minimax is "Guess the Larger Number", where each player picks a number and the larger number wins. The payoff matrix is infinite triangular. We show this is the only obstruction: if a game does not contain triangular submatrices of unbounded sizes then the Minimax Theorem holds. This generalizes von Neumann's Minimax Theorem by removing requirements of finiteness (or compactness), and captures precisely the games of interest in online learning.
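
The failure of Minimax in "Guess the Larger Number" can be illustrated with a short sketch. It makes two simplifying assumptions that are not in the abstract: ties go to the column player, and mixed strategies have finite support (represented as dictionaries from numbers to probabilities). All names are hypothetical.

```python
from fractions import Fraction

def payoff(i: int, j: int) -> int:
    """Row player picks i, column player picks j; the row player wins
    (payoff 1) iff i > j. Ties going to the column player is an assumption."""
    return 1 if i > j else 0

def column_best_response(row_strategy: dict) -> int:
    """Against a finitely supported row strategy, matching its largest
    supported number holds the row player to payoff 0."""
    return max(row_strategy)

def row_best_response(col_strategy: dict) -> int:
    """Against a finitely supported column strategy, picking one more than
    its largest supported number wins with probability 1."""
    return max(col_strategy) + 1

def expected_payoff(row_strategy: dict, col_strategy: dict) -> Fraction:
    """Expected payoff to the row player under two independent mixed strategies."""
    return sum(p * q * payoff(i, j)
               for i, p in row_strategy.items()
               for j, q in col_strategy.items())

uniform = {n: Fraction(1, 10) for n in range(10)}
lower = expected_payoff(uniform, {column_best_response(uniform): Fraction(1)})
upper = expected_payoff({row_best_response(uniform): Fraction(1)}, uniform)
```

Whoever commits to a strategy first loses: the row player's guaranteed value is 0 (`lower`), while the column player cannot push the row player's payoff below 1 (`upper`), so the max-min and min-max values differ. The same gap appears in any finite truncation only at the boundary, which is why the obstruction needs triangular submatrices of unbounded size.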

翻訳日:2021-02-03 19:40:44 公開日:2021-02-02

# (参考訳) 非同期Q-LearningとTD-Learningの有限サンプル保証に対するリアプノフ理論

A Lyapunov Theory for Finite-Sample Guarantees of Asynchronous Q-Learning and TD-Learning Variants ( http://arxiv.org/abs/2102.01567v1 )

ライセンス: CC BY 4.0

Zaiwei Chen, Siva Theja Maguluri, Sanjay Shakkottai, and Karthikeyan Shanmugam

(参考訳) 本稿では,大規模な値ベース非同期強化学習(RL)アルゴリズムの有限サンプル収束を保証する統一フレームワークを開発する。我々は、まずRLアルゴリズムをマルコフ確率近似(SA)アルゴリズムとして再構成し、不動点方程式を解く。次に、Lyapunov解析を開発し、マルコフSAの収束に関する平均二乗誤差境界を導出する。この中心的な結果に基づいて,$Q$-learning,$n$-step TD,TD$(\lambda)$,V-traceを含む非政治的なTDアルゴリズムなどの非同期RLアルゴリズムに対して,有限サンプル平均二乗収束境界を確立する。副産物として、TD$(\lambda)$(および$n$-step TD)アルゴリズムの性能境界を一般の$\lambda$(および$n$)に対して解析することにより、RLにおけるブートストラップの効率というバイアス分散トレードオフを実証する。これは[37]で最初にオープンな問題として提起された。

This paper develops a unified framework to study finite-sample convergence guarantees of a large class of value-based asynchronous Reinforcement Learning (RL) algorithms. We do this by first reformulating the RL algorithms as Markovian Stochastic Approximation (SA) algorithms to solve fixed-point equations. We then develop a Lyapunov analysis and derive mean-square error bounds on the convergence of the Markovian SA. Based on this central result, we establish finite-sample mean-square convergence bounds for asynchronous RL algorithms such as $Q$-learning, $n$-step TD, TD$(\lambda)$, and off-policy TD algorithms including V-trace. As a by-product, by analyzing the performance bounds of the TD$(\lambda)$ (and $n$-step TD) algorithm for general $\lambda$ (and $n$), we demonstrate a bias-variance trade-off, i.e., efficiency of bootstrapping in RL. This was first posed as an open problem in [37].
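
To make "asynchronous" concrete, here is a minimal tabular sketch of asynchronous $Q$-learning: a single trajectory is generated by a behaviour policy, and only the visited state-action entry is updated at each step. The two-state deterministic MDP, the uniform behaviour policy, and the constant step size are toy assumptions; none of the paper's bounds are reproduced here.

```python
import numpy as np

def step(state: int, action: int):
    """Toy deterministic MDP (hypothetical): action 0 stays, action 1
    switches state; the reward is 1 when the next state is 1, else 0."""
    next_state = state if action == 0 else 1 - state
    return next_state, float(next_state == 1)

def async_q_learning(n_steps=20000, gamma=0.9, alpha=0.5, seed=0):
    """Asynchronous tabular Q-learning along one trajectory generated by a
    uniformly random behaviour policy; one table entry is updated per step."""
    rng = np.random.default_rng(seed)
    Q = np.zeros((2, 2))
    state = 0
    for _ in range(n_steps):
        action = int(rng.integers(2))
        next_state, reward = step(state, action)
        td_target = reward + gamma * Q[next_state].max()
        Q[state, action] += alpha * (td_target - Q[state, action])
        state = next_state
    return Q

Q = async_q_learning()
# Fixed point of the Bellman optimality equation for this toy MDP with gamma = 0.9.
Q_star = np.array([[9.0, 10.0], [10.0, 9.0]])
```

Because the toy MDP is deterministic, the iterates settle on `Q_star` without any step-size decay; with random transitions or rewards a constant step size would only reach a neighbourhood of the fixed point, which is the regime that finite-sample bounds quantify.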

翻訳日:2021-02-03 19:01:51 公開日:2021-02-02

# (参考訳) aura-net : アノテーションの少ない位相コントラスト顕微鏡画像の堅牢なセグメンテーション

aura-net : robust segmentation of phase-contrast microscopy images with few annotations ( http://arxiv.org/abs/2102.01389v1 )

ライセンス: CC BY 4.0

Ethan Cohen and Virginie Uhlmann

(参考訳) 位相コントラスト顕微鏡画像の分割のための畳み込みニューラルネットワーク(CNN)であるAURA-netを提案する。 AURA-netは、トランスファーラーニングを使用してトレーニングと注意メカニズムを加速し、ネットワークが関連する画像機能に集中できるようにします。このように、非常に限られた量のアノテーションで効率的にトレーニングできます。したがって、我々のネットワークは、一般的にディープラーニング技術では小さすぎると考えられるデータセットのセグメンテーションを自動化するために利用することができる。 AURA-netはまた、位相コントラスト画像の特異性に順応し、さらに性能を向上させるアクティブな輪郭にインスパイアされた損失を使用する。 AURA-netは、いくつかの小さな(100倍未満)データセットにおいて最先端の代替品よりも優れていることを示す。

We present AURA-net, a convolutional neural network (CNN) for the segmentation of phase-contrast microscopy images. AURA-net uses transfer learning to accelerate training and attention mechanisms to help the network focus on relevant image features. In this way, it can be trained efficiently with a very limited amount of annotations. Our network can thus be used to automate the segmentation of datasets that are generally considered too small for deep learning techniques. AURA-net also uses a loss inspired by active contours that is well-adapted to the specificity of phase-contrast images, further improving performance. We show that AURA-net outperforms state-of-the-art alternatives on several small (fewer than 100 images) datasets.

翻訳日:2021-02-03 17:46:11 公開日:2021-02-02

# (参考訳) 肺塞栓症自動検出のための多エネルギーCT画像からの低keV単色画像の予測

Prediction of low-keV monochromatic images from polyenergetic CT scans for improved automatic detection of pulmonary embolism ( http://arxiv.org/abs/2102.01445v1 )

ライセンス: CC BY 4.0

Constantin Seibold, Matthias A. Fink, Charlotte Goos, Hans-Ulrich Kauczor, Heinz-Peter Schlemmer, Rainer Stiefelhagen, Jens Kleesiek

(参考訳) 検出器ベースのスペクトル計算トモグラフィは、スペクトル情報を得る可能性を提供する最近のデュアルエネルギーCT(DECT)技術である。このスペクトルデータから、他の仮想単エネルギー(monoE)画像と異なり、異なるタイプの画像を引き出すことができる。 MonoE画像は、アーチファクトが減少し、コントラストが改善し、全体的なノイズ値が低下し、血管異常の診断精度が向上する理想的な候補となります。本稿では,従来の単エネルギーCTからのモノE画像の生成をエミュレートできる畳み込みニューラルネットワーク(CNN)を訓練している。本研究では,よく用いられる画像変換手法について検討する。これらの方法が視覚的に類似した出力を作成し、肺塞栓症(PE)の自動分類に使用されるとパフォーマンスが低下することを示しています。 psnrとssimスコアに反映されるように,ネットワークによる分類と生成結果の改善を実現するマルチタスク最適化手法を用いて,これらの手法を拡張した。さらに,提案手法をrsna-peチャレンジデータセットのサブセット上で評価することにより,受信者動作特性曲線(auroc)下の領域を0.8142から0.8420までのna\"ive分類アプローチと比較して改善できることを示す。

Detector-based spectral computed tomography is a recent dual-energy CT (DECT) technology that offers the possibility of obtaining spectral information. From this spectral data, different types of images can be derived, among them virtual monoenergetic (monoE) images. MonoE images potentially exhibit decreased artifacts, improved contrast, and lower overall noise, making them ideal candidates for better delineation and thus improved diagnostic accuracy of vascular abnormalities. In this paper, we train convolutional neural networks (CNNs) that can emulate the generation of monoE images from conventional single-energy CT acquisitions. For this task, we investigate several commonly used image-translation methods. We demonstrate that these methods, while creating visually similar outputs, lead to poorer performance when used for automatic classification of pulmonary embolism (PE). We expand on these methods through the use of a multi-task optimization approach, under which the networks achieve improved classification as well as generation results, as reflected by PSNR and SSIM scores. Further, evaluating our proposed framework on a subset of the RSNA-PE challenge data set shows that we are able to improve the Area under the Receiver Operating Characteristic curve (AuROC) in comparison to a naïve classification approach from 0.8142 to 0.8420.

翻訳日:2021-02-03 17:36:52 公開日:2021-02-02

# (参考訳) モデルに基づくマルチパラメータマッピング

Model-based multi-parameter mapping ( http://arxiv.org/abs/2102.01604v1 )

ライセンス: CC BY 4.0

Yael Balbastre, Mikael Brudfors, Michela Azzarito, Christian Lambert, Martina F. Callaghan, John Ashburner

(参考訳) 量的MRイメージングは、その豊富な情報コンテンツと標準化された対策のためにますます好まれています。しかし, 縦緩和率 (R1), 横緩和率 (R2*), 磁化移動飽和度 (MTsat) などの定量的パラメータの抽出は, 高い非線形関数の反転を伴う。推定はしばしばノイズのない測定を仮定し、データのサブセットを使用して異なる量の分離を解決し、各計算を通じてエラーが伝播します。代わりに、データセット全体の確率的生成モデルを定式化し、逆転してパラメータ推定を適切に定義された確率的意味(例えば、最大可能性または最大a後方)で共同で回収することができる。実際には、反復的な方法を使用する必要がありますが、ログの類似性の非凸性のために収束は困難です。しかし、我々はそれが新しい近似ヘッセンのおかげで達成できることを示し、それによって、信頼できるパラメータ推定が得られました。本稿では,このフレキシブルなフレームワークの有用性を,一般的なマルチパラメータマッピングフレームワークの文脈で実証し,デノイジンの事前設定と後方不確かさの予測の方法を示す。当社の実装では、PyTorchバックエンドを使用しており、GPUアクセラレーションのメリットがあります。 https://github.com/balbasty/nitorch で入手できる。

Quantitative MR imaging is increasingly favoured for its richer information content and standardised measures. However, extracting quantitative parameters such as the longitudinal relaxation rate (R1), apparent transverse relaxation rate (R2*), or magnetisation-transfer saturation (MTsat) involves inverting a highly non-linear function. Estimations often assume noise-free measurements and use subsets of the data to solve for different quantities in isolation, with error propagating through each computation. Instead, a probabilistic generative model of the entire dataset can be formulated and inverted to jointly recover parameter estimates with a well-defined probabilistic meaning (e.g., maximum likelihood or maximum a posteriori). In practice, iterative methods must be used but convergence is difficult due to the non-convexity of the log-likelihood; yet, we show that it can be achieved thanks to a novel approximate Hessian and, with it, reliable parameter estimates obtained. Here, we demonstrate the utility of this flexible framework in the context of the popular multi-parameter mapping framework and further show how to incorporate a denoising prior and predict posterior uncertainty. Our implementation uses a PyTorch backend and benefits from GPU acceleration. It is available at https://github.com/balbasty/nitorch.

翻訳日:2021-02-03 17:30:30 公開日:2021-02-02

# 例外挿による神経データ拡張

Neural Data Augmentation via Example Extrapolation ( http://arxiv.org/abs/2102.01335v1 )

ライセンス: Link先を確認

Kenton Lee, Kelvin Guu, Luheng He, Tim Dozat, Hyung Won Chung

(参考訳) 機械学習の多くの応用では、トレーニングデータで特定の例のカテゴリが過小評価され、テスト時にこのような"フェーショット"ケースでシステムが過小評価される可能性がある。一般的な治療は、表現不足の例を複製したり、新しい例をヒューリスティックに合成したりしてデータ拡張を行うことである。しかし、これらの治療法は実例の完全な多様性と複雑さをカバーできないことが多い。本稿では,ニューラルサンプル補間(Ex2)を行うデータ拡張手法を提案する。ある分布からサンプリングされた少数の例を考えると、Ex2は同じ分布に属する新しい例を合成する。 Ex2モデルは、データ豊富なスライスの例生成手順をシミュレートして学習され、表現不足の少数のスライスに適用されます。 Ex2をさまざまな言語理解タスクに適用し、リレーション抽出(FewRel)やインテント分類+スロットフィリング(SNIPS)など、複数のマルチショット学習ベンチマークにおける最先端の手法を大幅に改善します。

In many applications of machine learning, certain categories of examples may be underrepresented in the training data, causing systems to underperform on such "few-shot" cases at test time. A common remedy is to perform data augmentation, such as by duplicating underrepresented examples, or heuristically synthesizing new examples. But these remedies often fail to cover the full diversity and complexity of real examples. We propose a data augmentation approach that performs neural Example Extrapolation (Ex2). Given a handful of exemplars sampled from some distribution, Ex2 synthesizes new examples that also belong to the same distribution. The Ex2 model is learned by simulating the example generation procedure on data-rich slices of the data, and it is applied to underrepresented, few-shot slices. We apply Ex2 to a range of language understanding tasks and significantly improve over state-of-the-art methods on multiple few-shot learning benchmarks, including for relation extraction (FewRel) and intent classification + slot filling (SNIPS).

翻訳日:2021-02-03 17:00:11 公開日:2021-02-02

# エッジ領域における色分布解析に基づく顔操作検出

Facial Manipulation Detection Based on the Color Distribution Analysis in Edge Region ( http://arxiv.org/abs/2102.01381v1 )

ライセンス: Link先を確認

Dong-Keon Kim, DongHee Kim, and Kwangsu Kim

(参考訳) 本研究では,操作画像におけるエッジの垂直領域の色分布解析に基づく,汎用的かつ堅牢な顔操作検出手法を提案する。現代顔操作法の大半は、合成画像における顔面境界に沿った画素値差の厄介さを低減するための画素補正手順を含む。この方法では, 顔操作画像と忘れ去られた自然画像との間には, 顔境界の違いがある。また、鍛造画像では、照明の自然な効果を損なう傾向があるため、顔境界と背景エッジ領域のギャップ分布に特徴的な不自然な特徴があるべきである。顔境界と背景エッジに特有の特徴を持つ顔操作画像を検出するニューラルネットワークを設計する。本研究では, 既存の顔操作検出法よりも, トレーニングの有無にかかわらず, 合成顔画像の検出法を各種データセットで比較した。

In this work, we present a generalized and robust facial manipulation detection method based on color distribution analysis of the vertical region of edges in a manipulated image. Most contemporary facial manipulation methods involve pixel correction procedures for reducing the awkwardness of pixel value differences along the facial boundary in a synthesized image. Because of this procedure, there are distinctive differences in the facial boundary between face-manipulated images and unforged natural images. Also, in a forged image, there should be distinctive and unnatural features in the gap distribution between the facial boundary and the background edge region, because the manipulation tends to damage the natural effect of lighting. We design a neural network for detecting face-manipulated images using these distinctive features in the facial boundary and background edge. Our extensive experiments show that our method outperforms existing face manipulation detection methods at detecting synthesized face images across various datasets, regardless of whether a dataset was used in training.

翻訳日:2021-02-03 16:59:32 公開日:2021-02-02

# オブジェクトの共同発生に対するペナルティによるクラスタリング:計算的側面

Clustering with Penalty for Joint Occurrence of Objects: Computational Aspects ( http://arxiv.org/abs/2102.01424v1 )

ライセンス: Link先を確認

Ondřej Sokol and Vladimír Holý

(参考訳) Holý, Sokol, Černý (Applied Soft Computing, 2017, Vol. 60, p. 752-762) の手法は、与えられた多数の集合への出現に基づいてオブジェクトをクラスタリングする。アイデアは、同じセット内の同じクラスタから複数のオブジェクトの発生を最小限に抑えることです。本稿では,本手法の計算的側面について考察する。まず、最適クラスタリングの問題はNPハードであることが証明される。第二に、最適なクラスタリングを数値的に見つけるために、再数値化手順、高速なタスク固有の局所探索ヒューリスティック、単純化されたモデルに基づく初期解を用いた遺伝的アルゴリズムを提案する。第3に, シミュレーション研究により, 標準遺伝的アルゴリズムの改良により, 計算性能が著しく向上することを示す。

The method of Holý, Sokol and Černý (Applied Soft Computing, 2017, Vol. 60, p. 752-762) clusters objects based on their incidence in a large number of given sets. The idea is to minimize the occurrence of multiple objects from the same cluster in the same set. In the current paper, we study computational aspects of the method. First, we prove that the problem of finding the optimal clustering is NP-hard. Second, to numerically find a suitable clustering, we propose to use the genetic algorithm augmented by a renumbering procedure, a fast task-specific local search heuristic and an initial solution based on a simplified model. Third, in a simulation study, we demonstrate that our improvements of the standard genetic algorithm significantly enhance its computational performance.

翻訳日:2021-02-03 16:58:58 公開日:2021-02-02

# 強化学習におけるメトリクスと継続性

Metrics and continuity in reinforcement learning ( http://arxiv.org/abs/2102.01514v1 )

ライセンス: Link先を確認

Charline Le Lan, Marc G. Bellemare, Pablo Samuel Castro

(参考訳) 強化学習のほとんどの実践的応用では、個々の状態の直接推定を維持することは不可能であり、連続状態システムでは不可能である。代わりに、研究者はしばしば状態の類似性(明示的にも暗黙的にも)を利用して、限られたサンプルセットからうまく一般化できるモデルを構築します。使用される状態類似性、およびそれらが誘導する近隣やトポロジの概念は、アルゴリズムのパフォーマンスに直接影響するため、重要な重要性を有する。実際、最近の多くの研究では「よく行動する」地域の存在を仮定したアルゴリズムが導入されているが、将来の作業のためにそのようなトポロジの完全な仕様を残している。本稿では,これらのトポロジを定義するための統一的形式主義について,メトリクスのレンズを通じて紹介する。これらの指標の階層を確立し、強化学習問題を特定するマルコフ決定プロセスに関する理論的意味を実証する。我々は, 評価指標間の差異を実証的に評価し, 理論結果を補完する。

In most practical applications of reinforcement learning, it is untenable to maintain direct estimates for individual states; in continuous-state systems, it is impossible. Instead, researchers often leverage state similarity (whether explicitly or implicitly) to build models that can generalize well from a limited set of samples. The notion of state similarity used, and the neighbourhoods and topologies they induce, is thus of crucial importance, as it will directly affect the performance of the algorithms. Indeed, a number of recent works introduce algorithms assuming the existence of "well-behaved" neighbourhoods, but leave the full specification of such topologies for future work. In this paper we introduce a unified formalism for defining these topologies through the lens of metrics. We establish a hierarchy amongst these metrics and demonstrate their theoretical implications on the Markov Decision Process specifying the reinforcement learning problem. We complement our theoretical results with empirical evaluations showcasing the differences between the metrics considered.

翻訳日:2021-02-03 16:58:26 公開日:2021-02-02

# CNN圧縮のための重量共有機会の高速探索

Fast Exploration of Weight Sharing Opportunities for CNN Compression ( http://arxiv.org/abs/2102.01345v1 )

ライセンス: Link先を確認

Etienne Dupuis, David Novo, Ian O'Connor, Alberto Bosio

(参考訳) Convolutional Neural Networks(CNN)に関わる計算負荷は、通常、低消費電力の組み込みデバイスでは到達できない。この問題に対処するために、多くの近似技術があります。これらの手法は、設計空間探索(DSE)を用いて、各CNNに最適化する必要があるハイパーパラメータを持つ。本研究の目的は,DSEフェーズタイムがアートCNNの状態に対して容易に爆発できることを実証することである。そこで本稿では,出力の質を犠牲にすることなく,探索時間を劇的に短縮する最適化探索法を提案する。

The computational workload involved in Convolutional Neural Networks (CNNs) is typically out of reach for low-power embedded devices. There are a large number of approximation techniques to address this problem. These methods have hyper-parameters that need to be optimized for each CNN using design space exploration (DSE). The goal of this work is to demonstrate that the DSE phase time can easily explode for state-of-the-art CNNs. We thus propose the use of an optimized exploration process to drastically reduce the exploration time without sacrificing the quality of the output.
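
As background for the abstract above, weight sharing replaces the weights of a layer with a small codebook of shared values plus an index per weight, and the number of shared values is one of the hyper-parameters a design space exploration has to tune. The sketch below uses plain 1-D k-means on a random toy kernel; it is an assumption-laden illustration of that trade-off, not the exploration method proposed in the paper, and `share_weights` is a hypothetical name.

```python
import numpy as np

def share_weights(weights: np.ndarray, n_shared: int, n_iter: int = 20):
    """Quantise a weight tensor to n_shared values with 1-D k-means.
    Returns (codebook, indices); codebook[indices] is the approximated tensor."""
    flat = weights.ravel()
    # Initialise centroids evenly between the smallest and largest weight.
    codebook = np.linspace(flat.min(), flat.max(), n_shared)
    for _ in range(n_iter):
        indices = np.abs(flat[:, None] - codebook[None, :]).argmin(axis=1)
        for k in range(n_shared):
            members = flat[indices == k]
            if members.size:            # leave empty clusters where they are
                codebook[k] = members.mean()
    indices = np.abs(flat[:, None] - codebook[None, :]).argmin(axis=1)
    return codebook, indices.reshape(weights.shape)

rng = np.random.default_rng(0)
layer = rng.normal(size=(16, 3, 3, 3))      # a toy convolution kernel
errors = {}
for n_shared in (2, 4, 16):                 # the hyper-parameter a DSE would sweep
    codebook, idx = share_weights(layer, n_shared)
    errors[n_shared] = float(np.mean((layer - codebook[idx]) ** 2))
```

Even this toy sweep shows why exploration is costly: every candidate setting requires re-quantising the layer, and in a real flow each candidate would also need an accuracy evaluation of the whole network.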

翻訳日:2021-02-03 16:57:14 公開日:2021-02-02

# データにおけるマイニング特徴関係

Mining Feature Relationships in Data ( http://arxiv.org/abs/2102.01355v1 )

ライセンス: Link先を確認

Andrew Lensen

(参考訳) 新しいデータセットに直面したとき、ほとんどの実践者はデータ内の興味深いパターンや特徴を発見するために探索的データ分析を行うことから始める。関連ルールマイニングのような手法は、データの特徴(属性)間の関係を明らかにするために一般的に用いられる。しかし、アソシエーションルールはルールベースの機械学習を使用するため、主にバイナリデータやカテゴリデータでの使用のために設計されている。現実世界のデータの大部分は本質的に連続的であり、そのようなデータの離散化は不正確で情報の少ない関連ルールをもたらす。本稿では,データ中の連続的・分類的特徴間の象徴的関係を自動的に発見する遺伝的プログラミング手法を用いて,特徴関係マイニング(FRM)という代替手法を提案する。我々の知る限りでは、我々の提案したアプローチは、特徴間の関係を明確に発見することを目的とした最初の象徴的なアプローチである。実世界のさまざまなデータセットにおける経験的テスト提案手法は、容易に解釈でき、データに対する明確かつ非自明な洞察を提供する高品質でシンプルな特徴関係を見つけることができる。

When faced with a new dataset, most practitioners begin by performing exploratory data analysis to discover interesting patterns and characteristics within data. Techniques such as association rule mining are commonly applied to uncover relationships between features (attributes) of the data. However, association rules are primarily designed for use on binary or categorical data, due to their use of rule-based machine learning. A large proportion of real-world data is continuous in nature, and discretisation of such data leads to inaccurate and less informative association rules. In this paper, we propose an alternative approach called feature relationship mining (FRM), which uses a genetic programming approach to automatically discover symbolic relationships between continuous or categorical features in data. To the best of our knowledge, our proposed approach is the first such symbolic approach with the goal of explicitly discovering relationships between features. Empirical testing on a variety of real-world datasets shows the proposed method is able to find high-quality, simple feature relationships which can be easily interpreted and which provide clear and non-trivial insight into data.

翻訳日:2021-02-03 16:56:46 公開日:2021-02-02

# 文レベル関係抽出のための改良ベースライン

An Improved Baseline for Sentence-level Relation Extraction ( http://arxiv.org/abs/2102.01373v1 )

ライセンス: Link先を確認

Wenxuan Zhou, Muhao Chen

(参考訳) 文レベルの関係抽出(RE)は、文中の2つの実体間の関係を特定することを目的とする。この問題には多くの努力が費やされてきたが、最高の実行方法はまだ人間のパフォーマンスには及ばない。本論文では,実体表現とNAインスタンス予測という,徹底的に研究されていないREモデルの2つの側面を再検討する。当社の改良ベースラインモデルは、タイプマーカーを備えたエンティティ表現とNAインスタンス検出の強化のための信頼ベースの分類と組み合わされ、TACREDで75.0%のF1を達成し、以前のSOTAメソッドを大幅に上回っています。

Sentence-level relation extraction (RE) aims at identifying the relationship between two entities in a sentence. Many efforts have been devoted to this problem, while the best performing methods are still far behind human performance. In this paper, we revisit two aspects of RE models that are not thoroughly studied, namely entity representation and NA instance prediction. Our improved baseline model, which incorporates entity representations with type markers and confidence-based classification for enhanced NA instance detection, achieves an F1 of 75.0% on TACRED, significantly outperforming previous SOTA methods.

翻訳日:2021-02-03 16:54:43 公開日:2021-02-02

# MAUVE:オープンエンディングテキスト生成評価のためのヒューマンマシンダイバージェンス曲線

MAUVE: Human-Machine Divergence Curves for Evaluating Open-Ended Text Generation ( http://arxiv.org/abs/2102.01454v1 )

ライセンス: Link先を確認

Krishna Pillutla, Swabha Swayamdipta, Rowan Zellers, John Thickstun, Yejin Choi, Zaid Harchaoui

(参考訳) オープンエンドテキスト生成の大きな進歩にもかかわらず、このタスクの評価基準の設計には限界がある。本稿では,機械生成テキストの分布を人間の言語と直接比較する,オープンエンドテキスト生成の指標であるMAUVEを提案する。 MAUVEは2つの分布の分岐曲線の下の平均面積を測定し、モデル分布がよく近似する分布の一部から生じるものと、そうでないものという2つのタイプの誤差の間のトレードオフを探索する。ウェブテキスト領域とストーリー領域における2つのオープンエンドな生成タスク、および様々な復号アルゴリズムとモデルサイズについて実験を行った。この結果から,MAUVEによる評価は,モデルサイズに対する自然な挙動を反映していることが明らかとなった。 MAUVEの復号アルゴリズムの順序は、オープンエンドテキスト生成において最も広く使われている指標である世代パープレキシティと一致するが、MAUVEはモデルと人文の両方を考慮することにより、タスクに対するより原則化された評価基準を示す。

Despite major advances in open-ended text generation, there has been limited progress in designing evaluation metrics for this task. We propose MAUVE -- a metric for open-ended text generation, which directly compares the distribution of machine-generated text to that of human language. MAUVE measures the mean area under the divergence curve for the two distributions, exploring the trade-off between two types of errors: those arising from parts of the human distribution that the model distribution approximates well, and those it does not. We present experiments across two open-ended generation tasks in the web text domain and the story domain, and a variety of decoding algorithms and model sizes. Our results show that evaluation under MAUVE indeed reflects the more natural behavior with respect to model size, compared to prior metrics. MAUVE's ordering of the decoding algorithms also agrees with that of generation perplexity, the most widely used metric in open-ended text generation; however, MAUVE presents a more principled evaluation metric for the task as it considers both model and human text.
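
The notion of an area under a divergence curve can be made concrete with a small sketch on discrete distributions. It assumes the curve is traced by mixtures $R_\lambda = \lambda P + (1-\lambda) Q$ with points $(\exp(-c\,\mathrm{KL}(Q\|R_\lambda)), \exp(-c\,\mathrm{KL}(P\|R_\lambda)))$, and that the inputs are already histograms over a shared set of bins. The scaling constant `c`, the number of mixture points, and the toy histograms are assumptions, and the step that turns raw text into such histograms is omitted entirely.

```python
import numpy as np

def kl(p: np.ndarray, q: np.ndarray) -> float:
    """KL(p || q) for discrete distributions; terms with p == 0 contribute 0."""
    mask = p > 0
    return float(np.sum(p[mask] * np.log(p[mask] / q[mask])))

def divergence_curve_area(p, q, c: float = 5.0, n_points: int = 99) -> float:
    """Area under the curve traced by mixtures r = lam * p + (1 - lam) * q,
    with points (exp(-c * KL(q || r)), exp(-c * KL(p || r)))."""
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    # Walk lam from near 1 down to near 0 so that x is non-decreasing.
    lams = np.linspace(1.0, 0.0, n_points + 2)[1:-1]
    xs, ys = [0.0], [1.0]                      # limiting end point
    for lam in lams:
        r = lam * p + (1.0 - lam) * q
        xs.append(np.exp(-c * kl(q, r)))
        ys.append(np.exp(-c * kl(p, r)))
    xs.append(1.0)                             # other limiting end point
    ys.append(0.0)
    xs, ys = np.array(xs), np.array(ys)
    # Trapezoidal rule over the ordered points.
    return float(np.sum((xs[1:] - xs[:-1]) * (ys[1:] + ys[:-1]) / 2.0))

human = np.array([0.5, 0.3, 0.2, 0.0])
same = divergence_curve_area(human, human)
close = divergence_curve_area(human, np.array([0.4, 0.3, 0.2, 0.1]))
disjoint = divergence_curve_area(human, np.array([0.0, 0.0, 0.0, 1.0]))
```

Identical distributions give an area of 1, distributions with disjoint support give an area close to 0, and a mildly different distribution lands in between, which is the behaviour one would want from a score that compares machine text with human text.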

翻訳日:2021-02-03 16:54:09 公開日:2021-02-02

# 変換器(M-BERT)からの多言語双方向エンコーダ表現を用いたインドネシアニュースサイトのクリックベイト見出し検出

Clickbait Headline Detection in Indonesian News Sites using Multilingual Bidirectional Encoder Representations from Transformers (M-BERT) ( http://arxiv.org/abs/2102.01497v1 )

ライセンス: Link先を確認

Muhammad N. Fakhruzzaman, Saidah Z. Jannah, Ratih A. Ningrum, Indah Fahmiyah

(参考訳) クリック数は、オンライン広告主がニュースサイトに支払った金額に関連している。このようなビジネスモデルにより、一部のニュースサイトはクリックベイティングの汚いトリック、すなわちハイパーボリックで興味深い言葉、時には見出しの未完成の文章を使用して読者を意図的にいじめることを余儀なくされた。インドネシアの一部のオンラインニュースサイトもクリックベイトに参加し、他の既存のニュースサイトの信頼性を間接的に低下させた。埋め込み層として機能する予め訓練された言語モデルM-BERTを有するニューラルネットワークを100ノード隠蔽層と組み合わせ、シグモイド分類器をトッピングしてクリックベイト見出しを検出する。トレーニングデータセットとして合計6632の見出しで、分類器は著しくうまく機能しました。 5倍のクロス検証で評価され、正解率(accuracy)は0.914、F1スコアは0.914、適合率(precision)は0.916、ROC-AUCは0.92である。インドネシア語テキスト分類タスクにおける多言語BERTの使用がテストされ、さらなる拡張が可能となった。今後の可能性,社会的影響,クリックベイト検出の限界について論じる。

Click counts are related to the amount of money that online advertisers pay to news sites. Such business models have forced some news sites to employ the dirty trick of click-baiting, i.e., using hyperbolic and interesting words, and sometimes an unfinished sentence, in a headline to purposefully tease readers. Some Indonesian online news sites have also joined the clickbait party, which indirectly degrades the credibility of other established news sites. A neural network that uses the pre-trained language model M-BERT as an embedding layer, combined with a 100-node hidden layer and topped with a sigmoid classifier, was trained to detect clickbait headlines. With a total of 6632 headlines as a training dataset, the classifier performed remarkably well. Evaluated with 5-fold cross validation, it has an accuracy score of 0.914, an F1-score of 0.914, a precision score of 0.916, and a ROC-AUC of 0.92. The usage of multilingual BERT in an Indonesian text classification task was tested and can be enhanced further. Future possibilities, societal impact, and limitations of clickbait detection are discussed.

翻訳日:2021-02-03 16:53:28 公開日:2021-02-02

# Deep Online Fused Video Stabilization

Deep Online Fused Video Stabilization ( http://arxiv.org/abs/2102.01279v1 )

ライセンス: Link先を確認

Zhenmei Shi, Fuhao Shi, Wei-Sheng Lai, Chia-Kai Liang, Yingyu Liang

(参考訳) 本稿では、センサデータ(ジャイロスコープ)と画像コンテンツ(光学フロー)の両方を用いて、教師なし学習による動画の安定化を図るディープニューラルネットワーク(DNN)を提案する。ネットワークは、実際の/仮想カメラで光の流れを融合し、ヒストリーを関節運動表現に変換する。次に、LSTMブロックは新しい仮想カメラポーズを推測し、この仮想ポーズはフレームを安定させる反動グリッドを生成するために使用されます。新たな相対運動表現と多段階学習プロセスが提案され, 教師なしのモデルが最適化される。我々の知る限りでは、センサデータと画像の両方を安定化に利用する最初のDNNソリューションである。提案手法をアブレーション研究により検証し,提案手法は定量的評価とユーザスタディにより最先端の代替ソリューションよりも優れていることを示した。

We present a deep neural network (DNN) that uses both sensor data (gyroscope) and image content (optical flow) to stabilize videos through unsupervised learning. The network fuses optical flow with real/virtual camera pose histories into a joint motion representation. Next, the LSTM block infers the new virtual camera pose, and this virtual pose is used to generate a warping grid that stabilizes the frame. A novel relative motion representation and a multi-stage training process are presented to optimize our model without any supervision. To the best of our knowledge, this is the first DNN solution that adopts both sensor data and image content for stabilization. We validate the proposed framework through ablation studies and demonstrate that the proposed method outperforms state-of-the-art alternative solutions via quantitative evaluations and a user study.

翻訳日:2021-02-03 16:49:01 公開日:2021-02-02

# 言語に基づくモーメント定位のためのプログレッシブ定位ネットワーク

Progressive Localization Networks for Language-based Moment Localization ( http://arxiv.org/abs/2102.01282v1 )

ライセンス: Link先を確認

Qi Zheng, Jianfeng Dong, Xiaoye Qu, Xun Yang, Shouling Ji, Xun Wang

(参考訳) 本稿では,言語に基づくモーメントローカライゼーションの課題を対象とする。このタスクの言語ベースの設定により、ターゲットアクティビティのオープンなセットが可能になり、ビデオモーメントの時間的長さが大きく変化する。既存の手法では、まず時間長の異なる十分な候補モーメントをサンプリングし、それから与えられたクエリと照合して目標モーメントを決定する。しかし、定時間粒度で生成された候補モーメントは、モーメント長の大きな変動を処理するのに最適である。そこで本研究では,目標モーメントを粗大な方法で段階的にローカライズする多段階プログレッシブ・ローカライゼーション・ネットワーク(PLN)を提案する。具体的には、PLNの各段階は局所化分岐を持ち、特定の時間的粒度で生成される候補モーメントに焦点を当てる。候補モーメントの時間的粒度はステージによって異なる。さらに,条件付き特徴操作モジュールとアップサンプリング接続を考案し,複数のローカライズブランチを橋渡しする。この方法では、後段は事前に学習した情報を吸収することができるため、より細かい局所化が容易になる。 3つの公開データセットに対する大規模な実験は、言語に基づくモーメントローカライゼーションにおけるPLNの有効性と、長いビデオで短いモーメントをローカライズする可能性を示す。

This paper targets the task of language-based moment localization. The language-based setting of this task allows for an open set of target activities, resulting in a large variation of the temporal lengths of video moments. Most existing methods prefer to first sample sufficient candidate moments with various temporal lengths, and then match them with the given query to determine the target moment. However, candidate moments generated with a fixed temporal granularity may be suboptimal to handle the large variation in moment lengths. To this end, we propose a novel multi-stage Progressive Localization Network (PLN) which progressively localizes the target moment in a coarse-to-fine manner. Specifically, each stage of PLN has a localization branch, and focuses on candidate moments that are generated with a specific temporal granularity. The temporal granularities of candidate moments are different across the stages. Moreover, we devise a conditional feature manipulation module and an upsampling connection to bridge the multiple localization branches. In this fashion, the later stages are able to absorb the previously learned information, thus facilitating the more fine-grained localization. Extensive experiments on three public datasets demonstrate the effectiveness of our proposed PLN for language-based moment localization and its potential for localizing short moments in long videos.

翻訳日:2021-02-03 16:48:27 公開日:2021-02-02

# GCF-Net:ビデオ行動認識のためのGated Clip Fusion Network

GCF-Net: Gated Clip Fusion Network for Video Action Recognition ( http://arxiv.org/abs/2102.01285v1 )

ライセンス: Link先を確認

Jenhao Hsiao and Jiawei Chen and Chiuman Ho

(参考訳) 近年、ビデオアクション認識の精度向上のほとんどは、新しく設計されたCNNアーキテクチャ(例えば、3D-CNN)から来ている。これらのモデルは、固定時間長の単一クリップにディープCNNを適用することで訓練される。各ビデオセグメントは3D-CNNモジュールによって個別に処理されるため、対応するクリップディスクリプタはローカルであり、クリップ間の関係は本質的に暗黙的です。ビデオレベルの予測としてクリップレベルの出力を直接平均化する一般的な方法は、ビデオを表すために関連情報を抽出および統合できるメカニズムの欠如のために失敗する傾向があります。本稿では、既存のビデオアクション分類器を小さな計算オーバーヘッドのコストで大幅に向上させることができるGated Clip Fusion Network(GCF-Net)について紹介する。 GCF-Netは、ビデオクリップ間の依存性を明示的にモデル化し、ローカルクリップディスクリプタの受容フィールドを強化します。さらに、アクションイベントに対する各クリップの重要性を計算し、関連するクリップのサブセットを選択してビデオレベルの分析を行う。大規模なベンチマークデータセット(Kinetics-600)では、提案されたGCF-Netは、それぞれ11.49%(中央クリップに基づく)と3.67%(高密度サンプリングクリップに基づく)の既存のアクション分類器の精度を高める。

In recent years, most of the accuracy gains for video action recognition have come from newly designed CNN architectures (e.g., 3D-CNNs). These models are trained by applying a deep CNN on a single clip of fixed temporal length. Since each video segment is processed by the 3D-CNN module separately, the corresponding clip descriptor is local and the inter-clip relationships are inherently implicit. The common method of directly averaging the clip-level outputs as a video-level prediction is prone to fail because it lacks a mechanism that can extract and integrate relevant information to represent the video. In this paper, we introduce the Gated Clip Fusion Network (GCF-Net), which can greatly boost existing video action classifiers at the cost of a tiny computation overhead. The GCF-Net explicitly models the inter-dependencies between video clips to strengthen the receptive field of local clip descriptors. Furthermore, the importance of each clip to an action event is calculated and a relevant subset of clips is selected accordingly for a video-level analysis. On a large benchmark dataset (Kinetics-600), the proposed GCF-Net elevates the accuracy of existing action classifiers by 11.49% (based on central clip) and 3.67% (based on densely sampled clips), respectively.

翻訳日:2021-02-03 16:47:43 公開日:2021-02-02

# Deep Refinement NetworkとAdaptive Weighting Lossを用いたCrisp Boundariesの学習

Learning Crisp Boundaries Using Deep Refinement Network and Adaptive Weighting Loss ( http://arxiv.org/abs/2102.01301v1 )

ライセンス: Link先を確認

Yi-Jun Cao, Chuan Lin, and Yong-Jie Li

(参考訳) 畳み込みニューラルネットワークを用いて境界検出において著しい進歩を遂げている。最近の境界検出モデルは、実際のオブジェクトの境界検出だけでなく、境界(オブジェクトの輪郭に沿って正確にローカライズ)にも焦点を合わせています。 crisp境界性能を評価する方法は2つある。基底真理と検出された輪郭の間の距離を測定するために、より厳密な耐性を用いる。もう1つは、後処理なしで輪郭マップを評価することに焦点を当てている。本研究では,両手法を解析し,両手法が輪郭評価の2つの側面であることを示す。そこで本研究では,複数の精錬モジュールを積み重ねた深層精錬ネットワーク(DRNet)と,効果的な適応融合によるクロスエントロピーとダイス損失を組み合わせた新たな損失関数を提案する。実験の結果,いくつかの利用可能なデータセットの最先端性能が得られた。

Significant progress has been made in boundary detection with the help of convolutional neural networks. Recent boundary detection models focus not only on real object boundary detection but also on "crisp" boundaries (precisely localized along the object's contour). There are two methods to evaluate crisp boundary performance. One uses a stricter tolerance to measure the distance between the ground truth and the detected contour. The other focuses on evaluating the contour map without any postprocessing. In this study, we analyze both methods and conclude that they capture two aspects of crisp contour evaluation. Accordingly, we propose a novel network named deep refinement network (DRNet) that stacks multiple refinement modules to achieve richer feature representation and a novel loss function, which combines cross-entropy and dice loss through effective adaptive fusion. Experimental results demonstrate that we achieve state-of-the-art performance on several available datasets.
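
As a rough illustration of what combining cross-entropy and dice loss can look like, the following Python sketch blends the two terms over a flattened boundary map. The abstract does not describe the adaptive fusion scheme, so the fixed weight `alpha` and all function names here are hypothetical stand-ins, not the DRNet loss itself.

```python
import math


def bce_loss(pred, target, eps=1e-7):
    """Mean binary cross-entropy over a flat list of pixel probabilities."""
    total = 0.0
    for p, t in zip(pred, target):
        p = min(max(p, eps), 1.0 - eps)  # clip to avoid log(0)
        total += -(t * math.log(p) + (1 - t) * math.log(1 - p))
    return total / len(pred)


def dice_loss(pred, target, eps=1e-7):
    """One minus the (squared-denominator) dice coefficient."""
    inter = sum(p * t for p, t in zip(pred, target))
    denom = sum(p * p for p in pred) + sum(t * t for t in target)
    return 1.0 - (2.0 * inter + eps) / (denom + eps)


def combined_loss(pred, target, alpha=0.5):
    """Fixed-weight blend; `alpha` is an assumed placeholder for the
    adaptive weighting mentioned in the abstract."""
    return alpha * bce_loss(pred, target) + (1 - alpha) * dice_loss(pred, target)
```

With `alpha` close to 1 the blend behaves like plain cross-entropy; with `alpha` close to 0 it leans on the dice term, which is commonly used when edge pixels are far rarer than non-edge pixels.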

翻訳日:2021-02-03 16:46:57 公開日:2021-02-02

# 点型検出とガウス離散によるコーンビームCT像からの歯列分離

Tooth Instance Segmentation from Cone-Beam CT Images through Point-based Detection and Gaussian Disentanglement ( http://arxiv.org/abs/2102.01315v1 )

ライセンス: Link先を確認

Jusang Lee, Minyoung Chung, Minkyung Lee, Yeong-Gil Shin

(参考訳) 歯の個々の分割およびコーンビームCT画像からの識別は、矯正治療の術前前提条件である。畳み込みニューラルネットワークを用いたインスタンスセグメンテーション手法は,個々の歯のセグメンテーションタスクにおいて画期的な結果を示し,様々な医用画像アプリケーションで用いられている。点に基づく検出ネットワークは歯科画像において優れた結果を得るが, 類似したトポロジーと近近性から隣接歯を識別することは依然として難しい課題である。本研究では,ガウス離散客観的関数に基づいて各歯を効果的に解離する点ベースの歯の局在化ネットワークを提案する。提案したネットワークはまず,すべての解剖学的歯に対するボックスレグレッションを伴うヒートマップレグレッションを行う。隣り合う全ての歯のヒートマップの画素ワイド乗算の和を最小化することにより、新しいガウスのゆがみのペナルティを用いる。その後、ピクセルワイズラベリングタスクを距離マップ回帰タスクに変換して個々の歯の分割を行い、歯の隣接する領域における偽陽性を最小限に抑える。実験結果から, 検出精度を9.1%向上させることで, 最新のアプローチを上回り, 個々の歯の区分において高い性能を発揮できることが示された。提案手法の主な意義は, 1) 追加の分類を必要としない点ベース歯検出フレームワークの導入, 2) 点ベース検出フレームワークにおける熱マップ応答に基づいてガウス分布を効果的に分離する新規な損失関数の設計である。

Individual tooth segmentation and identification from cone-beam computed tomography images are preoperative prerequisites for orthodontic treatments. Instance segmentation methods using convolutional neural networks have demonstrated ground-breaking results on individual tooth segmentation tasks, and are used in various medical imaging applications. While point-based detection networks achieve superior results on dental images, it is still a challenging task to distinguish adjacent teeth because of their similar topologies and proximate nature. In this study, we propose a point-based tooth localization network that effectively disentangles each individual tooth based on a Gaussian disentanglement objective function. The proposed network first performs heatmap regression accompanied by box regression for all the anatomical teeth. A novel Gaussian disentanglement penalty is employed by minimizing the sum of the pixel-wise multiplication of the heatmaps for all adjacent teeth pairs. Subsequently, individual tooth segmentation is performed by converting a pixel-wise labeling task to a distance map regression task to minimize false positives in adjacent regions of the teeth. Experimental results demonstrate that the proposed algorithm outperforms state-of-the-art approaches by increasing the average precision of detection by 9.1%, which results in a high performance in terms of individual tooth segmentation. The primary significance of the proposed method is two-fold: 1) the introduction of a point-based tooth detection framework that does not require additional classification and 2) the design of a novel loss function that effectively separates Gaussian distributions based on heatmap responses in the point-based detection framework.
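
The penalty described above, the sum of pixel-wise products of heatmaps over adjacent teeth pairs, can be sketched in a few lines of Python. This is a minimal 2D illustration under assumed names (`gaussian_heatmap`, `disentanglement_penalty`) and an assumed isotropic Gaussian; the paper works on CBCT volumes and the abstract does not give the exact formulation.

```python
import math


def gaussian_heatmap(height, width, cy, cx, sigma):
    """Isotropic 2D Gaussian heatmap centred at (cy, cx); an assumed form."""
    return [
        [math.exp(-((y - cy) ** 2 + (x - cx) ** 2) / (2 * sigma ** 2))
         for x in range(width)]
        for y in range(height)
    ]


def disentanglement_penalty(heatmaps, adjacent_pairs):
    """Sum over adjacent pairs of the pixel-wise product of their heatmaps.

    The value is large when two heatmaps overlap, so minimising it pushes
    neighbouring responses apart.
    """
    total = 0.0
    for i, j in adjacent_pairs:
        for row_i, row_j in zip(heatmaps[i], heatmaps[j]):
            total += sum(a * b for a, b in zip(row_i, row_j))
    return total
```

Two heatmaps whose centres are close yield a larger penalty than two that are far apart, which is the behaviour a training objective would exploit to separate neighbouring teeth.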

翻訳日:2021-02-03 16:46:21 公開日:2021-02-02

# 分散画像インパインティングのためのテスト時間適応

Test-Time Adaptation for Out-of-distributed Image Inpainting ( http://arxiv.org/abs/2102.01360v1 )

ライセンス: Link先を確認

Chajin Shin, Taeoh Kim, Sangjin Lee and Sangyoun Lee

(参考訳) ディープラーニングベースのイメージインペインティングアルゴリズムは、多数の外部自然画像から学習した強力な事前知識により、優れたパフォーマンスを示している。しかし, モデルがトレーニング画像に偏っているため, 分布がトレーニング画像の分布からかけ離れているテスト画像に対して, 不愉快な結果を示す。本稿では,AdaFillというテスト時間適応を用いた簡易な画像インペインティングアルゴリズムを提案する。分布外の1つのテスト画像が与えられたとき、私たちの目標は、事前訓練されたインペインティングモデルよりも自然に穴領域を完成させることです。この目的を達成するために,自然画像は内部的類似性が強いため,テスト画像の残りの有効領域を追加の訓練の手がかりとして扱う。このテスト時間適応により、我々のネットワークは、事前訓練された特徴から外部的に学習された画像の事前知識と、テスト画像の内部的事前知識を明示的に利用することができる。実験の結果,AdaFillは様々な分布外テスト画像において他のモデルを上回った。さらに、事前トレーニングされていないZeroFillというモデルも、事前トレーニングされたモデルを上回ることがあります。

Deep learning-based image inpainting algorithms have shown great performance via powerful priors learned from numerous external natural images. However, they show unpleasant results on test images whose distribution is far from that of the training images because their models are biased toward the training images. In this paper, we propose a simple image inpainting algorithm with test-time adaptation named AdaFill. Given a single out-of-distribution test image, our goal is to complete the hole region more naturally than the pre-trained inpainting models. To achieve this goal, we treat the remaining valid regions of the test image as additional training cues because natural images have strong internal similarities. Through this test-time adaptation, our network can explicitly exploit externally learned image priors from the pre-trained features as well as the internal prior of the test image. Experimental results show that AdaFill outperforms other models on various out-of-distribution test images. Furthermore, a model named ZeroFill, which is not pre-trained, also sometimes outperforms the pre-trained models.

翻訳日:2021-02-03 16:45:32 公開日:2021-02-02

# $Sf_{3}CNN$を用いた高次特徴識別による顔認識

Face Recognition Using $Sf_{3}CNN$ With Higher Feature Discrimination ( http://arxiv.org/abs/2102.01404v1 )

ライセンス: Link先を確認

Nayaneesh Kumar Mishra, Satish Kumar Singh

(参考訳) 2次元畳み込みニューラルネットワーク(2D CNN)の出現により、顔認識精度は99%を超えている。しかし、顔認識は現実世界の状況では依然として課題です。画像の代わりにビデオは、実際の状況における顔認識の課題を解決するのに、入力としてより有用である。これは、ビデオが画像よりも多くの特徴を提供するためです。しかし、2D CNNはビデオの時間的特徴を生かすことはできない。そこで我々は,ビデオの顔認識のために$Sf_{3}CNN$というフレームワークを提案する。$Sf_{3}CNN$フレームワークは、ビデオの顔認識に3次元残差ネットワーク(3D ResNet)とA-Softmax損失を用いる。3D ResNetの使用は、空間的特徴と時間的特徴の両方を1つのコンパクトな特徴マップにキャプチャするのに役立つ。しかし、効率的な顔認識のためには、3D CNNの特徴は高い識別性を持つ必要がある。A-Softmax損失の使用は、顔認識のためにビデオから識別性の高い特徴を抽出するのに役立つ。$Sf_{3}CNN$フレームワークは、CVBLビデオデータベースで99.10%の精度を達成し、同じデータベースで3D ResNetsを用いた従来の97%を上回る。

With the advent of 2-dimensional Convolutional Neural Networks (2D CNNs), face recognition accuracy has risen above 99%. However, face recognition is still a challenge in real-world conditions. A video, instead of an image, can be more useful as an input for solving the challenges of face recognition in real-world conditions. This is because a video provides more features than an image. However, 2D CNNs cannot take advantage of the temporal features present in the video. We therefore propose a framework called $Sf_{3}CNN$ for face recognition in videos. The $Sf_{3}CNN$ framework uses a 3-dimensional Residual Network (3D ResNet) and A-Softmax loss for face recognition in videos. The use of 3D ResNet helps to capture both spatial and temporal features into one compact feature map. However, the 3D CNN features must be highly discriminative for efficient face recognition. The use of A-Softmax loss helps to extract highly discriminative features from the video for face recognition. The $Sf_{3}CNN$ framework achieves an accuracy of 99.10% on the CVBL video database, compared with the previous 97% on the same database using 3D ResNets.

翻訳日:2021-02-03 16:44:56 公開日:2021-02-02

# 3次元CNNを用いた顔認識

Face Recognition using 3D CNNs ( http://arxiv.org/abs/2102.01441v1 )

ライセンス: Link先を確認

Nayaneesh Kumar Mishra, Satish Kumar Singh

(参考訳) 顔認識の領域はコンピュータビジョンと生体計測の分野で最も広く研究されている分野の1つである。これは、顔の生体認証の非侵入的な性質が、空港などの公共の場所での監視分野の応用に比較的適しているためである。顔認識における原始的手法の適用は, 十分な性能を得られなかった。しかし、機械学習と深層学習の出現と顔認識への応用により、いくつかの大きなブレークスルーが得られた。顔認識における2次元畳み込みニューラルネットワーク(2d cnn)の使用は、人間の顔認識精度を越え、99%に達した。それでも、解像度、照明、ポーズの変動などの現実世界条件の存在下での堅牢な顔認識は、顔認識の研究者にとって大きな課題です。本研究では,映像を3次元CNNアーキテクチャの入力として使用し,映像から空間領域情報と時間領域情報をキャプチャして実環境における顔認識を行った。実験のために,CVBLビデオデータセットという独自のビデオデータセットを開発した。ビデオの顔認識に3D CNNを使用することは、CVBLデータセットで97%の精度で最高のパフォーマンスを発揮するDenseNetsで有望な結果を示しています。

Face recognition is one of the most widely researched areas in computer vision and biometrics. This is because the non-intrusive nature of face biometrics makes it comparatively more suitable for surveillance applications in public places such as airports. Primitive face recognition methods could not give very satisfactory performance. However, with the advent of machine learning and deep learning methods and their application to face recognition, several major breakthroughs were obtained. The use of 2D Convolutional Neural Networks (2D CNNs) in face recognition surpassed human face recognition accuracy and reached 99%. Still, robust face recognition in the presence of real-world conditions such as variation in resolution, illumination and pose is a major challenge for researchers in face recognition. In this work, we used video as input to 3D CNN architectures for capturing both spatial and time domain information from the video for face recognition in real-world environments. For the purpose of experimentation, we have developed our own video dataset called the CVBL video dataset. The use of 3D CNNs for face recognition in videos shows promising results, with DenseNets performing the best with an accuracy of 97% on the CVBL dataset.

翻訳日:2021-02-03 16:44:18 公開日:2021-02-02

# 合成学習した深部畳み込みネットワークによる人体部品の分節学習

Learning to Segment Human Body Parts with Synthetically Trained Deep Convolutional Networks ( http://arxiv.org/abs/2102.01460v1 )

ライセンス: Link先を確認

Alessandro Saviolo, Matteo Bonotto, Daniele Evangelista, Marco Imperoli, Emanuele Menegatti and Alberto Pretto

(参考訳) 本稿では,合成データのみを用いた深層畳み込みニューラルネットワークに基づく人体部分分割のための新しいフレームワークを提案する。提案手法は,人体の実際のアノテートデータを用いたモデルの訓練を必要とせず,最先端の結果を得る。私たちの貢献は、ネットワークを訓練するために使用される合成データを作成するためのゲームエンジンを利用するデータ生成パイプラインと、エッジレスポンスマップと適応ヒストグラムの等化を組み合わせた新しい前処理モジュールで、ネットワークをガイドして、照明条件の変化に対する堅牢性を保証する人体部品の形状を学びます。最適な候補アーキテクチャを選択するために,実人手足の手動注釈画像の徹底的な検査を行った。さらに、前処理モジュールを検証するためのアブレーション研究について述べる。その結果,本手法は,最先端のセマンティクスセグメンテーションネットワークを大きなマージンで上回っていることがわかった。本論文では,得られたデータセットと合わせて,提案手法の実装をリリースする。

This paper presents a new framework for human body part segmentation based on Deep Convolutional Neural Networks trained using only synthetic data. The proposed approach achieves cutting-edge results without the need to train the models with real annotated data of human body parts. Our contributions include a data generation pipeline that exploits a game engine to create the synthetic data used for training the network, and a novel pre-processing module that combines an edge response map and adaptive histogram equalization to guide the network to learn the shape of the human body parts while ensuring robustness to changes in the illumination conditions. For selecting the best candidate architecture, we performed exhaustive tests on manually annotated images of real human body limbs. We further present an ablation study to validate our pre-processing module. The results show that our method outperforms several state-of-the-art semantic segmentation networks by a large margin. We release an implementation of the proposed approach along with the acquired datasets with this paper.

翻訳日:2021-02-03 16:43:40 公開日:2021-02-02

# 文化から服へ:ファッションのイメージの世紀の後ろにある世界イベントを発見する

From Culture to Clothing: Discovering the World Events Behind A Century of Fashion Images ( http://arxiv.org/abs/2102.01690v1 )

ライセンス: Link先を確認

Wei-Lin Hsiao, Kristen Grauman

(参考訳) ファッションは外部の文化的要因と絡み合っているが、これらのリンクを特定することは、最も健全な現象に限られる手作業である。着る衣服に影響を及ぼす特定の文化的要因を特定するためのデータ駆動アプローチを提案する。 1世紀にわたるニュース記事やヴィンテージ写真の大規模なデータセットを用いて、世界の出来事と衣服の選択の間の影響関係を検出するマルチモーダル統計モデルを導入する。さらに,2つのデータセット上でのビジュアルスタイル予測とフォトタイムスタンプの具体的ビジョンタスクの改善に本モデルを適用した。私たちの仕事は、文化と衣類を結びつけるための計算可能でスケーラブルで簡単に更新可能なアプローチに向けた第一歩です。

Fashion is intertwined with external cultural factors, but identifying these links remains a manual process limited to only the most salient phenomena. We propose a data-driven approach to identify specific cultural factors affecting the clothes people wear. Using large-scale datasets of news articles and vintage photos spanning a century, we introduce a multi-modal statistical model to detect influence relationships between happenings in the world and people's choice of clothing. Furthermore, we apply our model to improve the concrete vision tasks of visual style forecasting and photo timestamping on two datasets. Our work is a first step towards a computational, scalable, and easily refreshable approach to link culture to clothing.

翻訳日:2021-02-03 16:43:04 公開日:2021-02-02

# 生理的反応による人種バイアスの検出

Detection of Racial Bias from Physiological Responses ( http://arxiv.org/abs/2102.01287v1 )

ライセンス: Link先を確認

Fateme Nikseresht, Runze Yan, Rachel Lew, Yingzheng Liu, Rose M.Sebastian, Afsaneh Doryab

(参考訳) 偏見から害を和らげるための規範や規制の進化にもかかわらず、個人の無意識バイアスに関連する有害な差別は続いている。我々の目標は、暗黙のバイアスの生理的および行動的指標をよりよく理解し、検出することである。本稿では,心拍数,伝導性皮膚反応,皮膚温度,微小体運動などの生理的反応から,人種的偏見を確実に検出できるかどうかを検討する。インプリシット・アソシエーション・テスト (IAT) を施行中, Empatica E4 リストバンドを用いて生理データを収集した46名の被験者のデータを解析した。機械学習と統計解析により、76.1%の精度で生理信号から暗黙のバイアスを予測できることが示された。また,皮膚反応に関連するEDA信号は,人種的バイアスと最も強い相関関係を持ち,偏見のある参加者と偏見のない参加者のEDA特徴値には有意な差があることを示した。

Despite the evolution of norms and regulations to mitigate the harm from biases, harmful discrimination linked to an individual's unconscious biases persists. Our goal is to better understand and detect the physiological and behavioral indicators of implicit biases. This paper investigates whether we can reliably detect racial bias from physiological responses, including heart rate, conductive skin response, skin temperature, and micro-body movements. We analyzed data from 46 subjects whose physiological data was collected with Empatica E4 wristband while taking an Implicit Association Test (IAT). Our machine learning and statistical analysis show that implicit bias can be predicted from physiological signals with 76.1% accuracy. Our results also show that the EDA signal associated with skin response has the strongest correlation with racial bias and that there are significant differences between the values of EDA features for biased and unbiased participants.

翻訳日:2021-02-03 16:41:28 公開日:2021-02-02

# 高精度なデキサスロボット操作のためのGazeベースのデュアルリゾリューションディープイミテーション学習

Gaze-based dual resolution deep imitation learning for high-precision dexterous robot manipulation ( http://arxiv.org/abs/2102.01295v1 )

ライセンス: Link先を確認

Heecheol Kim, Yoshiyuki Ohmura, and Yasuo Kuniyoshi

(参考訳) 針のスレッディングのような高精度な操作作業は困難である。生理学的研究は、低解像度の周辺視覚と高速移動をつなげて物体の近傍に手を運ぶことを提案し、高分解能の焦点視覚を用いて物体への正確な手のホーミングを実現する。本研究は,人間の視線に基づく双対分解能振動子制御システムにインスパイアされた,深層模倣学習に基づく手法により,針のスレッディング作業が解決できることを実証した。まず,ロボットを遠隔操作している操作者の視線の動きを記録した。次に,視線周辺の高分解能画像のみを用いて,目標近傍の糸位置を正確に制御した。我々は低解像度の周辺画像を用いて目標付近に到達した。本研究で得られた実験結果は,汎用ロボットマニピュレータを用いた高精度操作が可能であり,計算効率が向上することを示す。

A high-precision manipulation task, such as needle threading, is challenging. Physiological studies have proposed connecting low-resolution peripheral vision and fast movement to transport the hand into the vicinity of an object, and using high-resolution foveated vision to achieve the accurate homing of the hand to the object. The results of this study demonstrate that a deep imitation learning based method, inspired by the gaze-based dual resolution visuomotor control system in humans, can solve the needle threading task. First, we recorded the gaze movements of a human operator who was teleoperating a robot. Then, we used only a high-resolution image around the gaze to precisely control the thread position when it was close to the target. We used a low-resolution peripheral image to reach the vicinity of the target. The experimental results obtained in this study demonstrate that the proposed method enables precise manipulation tasks using a general-purpose robot manipulator and improves computational efficiency.

翻訳日:2021-02-03 16:40:50 公開日:2021-02-02

# Transfer Q-Learning を用いた分散マルチコアサーバのQoS対応電力最小化

QoS-Aware Power Minimization of Distributed Many-Core Servers using Transfer Q-Learning ( http://arxiv.org/abs/2102.01348v1 )

ライセンス: Link先を確認

Dainius Jenkus, Fei Xia, Rishad Shafik, Alex Yakovlev

(参考訳) 分散システムにまたがってスケールされたWebサーバは、サービス品質(QoS)を保証するための複雑なランタイムコントロールを必要とします。本稿では、水平スケーリング(ノードアロケーション)と垂直スケーリング(ノード内のリソースアロケーション)メソッドを相乗的に使用して、QoS制約(応答時間)下での消費電力を最小限に抑えながら、ワークロードへの適応を提供するQoS対応ランタイムコントローラを提案する。水平スケーリングは、一連のルールに従って、ワークロード要求と必要なQoSに基づいてアクティブノード数を決定する。そして、ダイナミック電圧/周波数スケーリング(dvfs)を使用してワークロードプロファイルに基づいて、さらにパワー/パフォーマンスをチューニングするトランスファーqラーニングを用いた垂直スケーリングと結合する。最小限の探索条件でQ値の転送を行う。さらにこのアプローチでは,マルチコアサーバのスケーラブルなアーキテクチャを活用して,完全あるいは部分的に探索されたノードから利用可能な知識を再利用する。これらの手法を組み合わせることで、モデルフリーのq-learningと比較して、探索時間とqos違反を低減できる。このテクニックは設計時間とランタイムコストのバランスをとり、サーバクラスタの異種マルチプロセスノード上の異なるワークロードシナリオにおいて、qos違反を最小限に抑えながら、永続的な電力削減と運用上の最適性を最大化する。

Web servers scaled across distributed systems necessitate complex runtime controls for providing quality of service (QoS) guarantees as well as minimizing the energy costs under dynamic workloads. This paper presents a QoS-aware runtime controller that uses horizontal scaling (node allocation) and vertical scaling (resource allocation within nodes) methods synergistically to adapt to workloads while minimizing the power consumption under a QoS constraint (i.e., response time). Horizontal scaling determines the number of active nodes based on workload demands and the required QoS according to a set of rules. It is then coupled with vertical scaling using transfer Q-learning, which further tunes power/performance based on the workload profile using dynamic voltage/frequency scaling (DVFS). It transfers Q-values within minimally explored states, reducing exploration requirements. In addition, the approach exploits the scalable architecture of the many-core server, allowing the reuse of available knowledge from fully or partially explored nodes. When combined, these methods reduce the exploration time and QoS violations compared to model-free Q-learning. The technique balances design-time and runtime costs to maximize the portability and operational optimality, demonstrated through persistent power reductions with minimal QoS violations under different workload scenarios on heterogeneous multi-processing nodes of a server cluster.
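
To make the idea of reusing Q-values concrete, here is a minimal tabular sketch in Python. It shows a standard Q-learning update plus a simple copy of Q-values from an explored node into states that another node has barely visited. The abstract does not give the transfer rule, so `transfer_q`, the `min_visits` threshold, and the state and action labels are assumptions made only for illustration.

```python
def q_update(q, state, action, reward, next_state, alpha=0.1, gamma=0.9):
    """Standard tabular Q-learning update on a dict-of-dicts table."""
    best_next = max(q.get(next_state, {}).values(), default=0.0)
    old = q.setdefault(state, {}).get(action, 0.0)
    q[state][action] = old + alpha * (reward + gamma * best_next - old)
    return q[state][action]


def transfer_q(source_q, target_q, visits, min_visits=1):
    """Copy Q-values from a source table into target states that were
    visited fewer than `min_visits` times (hypothetical transfer rule)."""
    for state, actions in source_q.items():
        if visits.get(state, 0) < min_visits:
            target_q[state] = dict(actions)
    return target_q
```

In a DVFS setting the actions could stand for voltage/frequency levels and the reward could penalise power use and response-time violations, but those mappings are not specified in the abstract.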

翻訳日:2021-02-03 16:40:14 公開日:2021-02-02

# ALOHAにおけるデータ前処理のモジュール的アプローチとスマート産業ユースケースへの応用

Modular approach to data preprocessing in ALOHA and application to a smart industry use case ( http://arxiv.org/abs/2102.01349v1 )

ライセンス: Link先を確認

Cristina Chesta, Luca Rinelli

(参考訳) 音声コマンドやマシンビジョンシステムを使用した協調ロボットとのインタラクションなど、スマート産業領域でのアプリケーションには、しばしば異種低電力コンピューティングプラットフォームにディープラーニングアルゴリズムの展開が必要です。異なる設計手順を自動化するためのソフトウェアツールとフレームワークの可用性は、組み込みシステムにおけるDLアルゴリズムの効果的な実装をサポートし、関連する労力とコストを削減できる。フレームワークの受け入れにおいて非常に重要な側面の1つは、拡張性(extensibility)である。高度なスキルを必要とせずに、異なるデータセットに対応し、カスタマイズされた前処理を定義する能力。データ前処理と変換パイプラインをサポートするために、ALOHAツールフローに統合されたモジュラーアプローチに対処する。これはカスタマイズ可能なプラグインによって実現され、新しいユースケースを包含するツールフローを簡単に拡張できる。本手法の有効性を示すために,キーワードスポッティングユースケースに関する実験結果を示し,異なるユースケースへの拡張の可能性について概説する。

Applications in the smart industry domain, such as interaction with collaborative robots using vocal commands or machine vision systems, often require the deployment of deep learning (DL) algorithms on heterogeneous low-power computing platforms. The availability of software tools and frameworks to automate different design steps can support the effective implementation of DL algorithms on embedded systems, reducing related effort and costs. One very important aspect for the acceptance of a framework is its extensibility, i.e., the capability to accommodate different datasets and define customized preprocessing without requiring advanced skills. The paper addresses a modular approach, integrated into the ALOHA tool flow, to support the data preprocessing and transformation pipeline. This is realized through customizable plugins and allows the easy extension of the tool flow to encompass new use cases. To demonstrate the effectiveness of the approach, we present some experimental results related to a keyword spotting use case and we outline possible extensions to different use cases.

翻訳日:2021-02-03 16:39:31 公開日:2021-02-02

# 多目的マルチエージェントパス探索のための部分次元拡張

Subdimensional Expansion for Multi-objective Multi-agent Path Finding ( http://arxiv.org/abs/2102.01353v1 )

ライセンス: Link先を確認

Zhongqiang Ren, Sivakumar Rathinam and Howie Choset

(参考訳) 従来のマルチエージェントパスプランナーは、通常、経路長のような単一の目的を最適化する経路を決定する。しかし、多くのアプリケーションでは、計画プロセスで同時に最適化されるために、完成までの時間と燃料の使用など、複数の目的が必要です。しばしば、これらの基準は容易に比較されず、しばしば互いに競合している。標準的な多目的探索アルゴリズムをマルチエージェントパス探索に適用するだけで、可能解の空間、すなわちパレート最適集合のサイズがエージェントの数(探索空間の次元)とともに指数関数的に増加するため、非効率であることが証明できる。本稿では,このいわゆる次元の呪いを回避し,従来のマルチエージェントワークをサブ次元展開という枠組みで活用するアプローチを提案する。 A* に適用された部分次元展開の例は M* と呼ばれ、M* は単目的函数に制限された。支配と部分次元拡大の原則を組み合わせて、マルチオブジェクトM*(MOM*)と呼ばれる新しいアルゴリズムを作成し、エージェントが互いに「相互作用」しなければならない場合にのみ、計画のためのエージェントを動的に結合します。 MOM*は、複数のエージェントに対する完全なパレート最適集合を効率的に計算し、パレート最適集合の最適部分近似と計算効率を自然に交換する。我々の手法は、標準多目的A*アルゴリズムでは有界時間内に見つからない数百の解を持つ問題インスタンスに対する完全なパレート最適集合を見つけることができる。

Conventional multi-agent path planners typically determine a path that optimizes a single objective, such as path length. Many applications, however, may require multiple objectives, say time-to-completion and fuel use, to be simultaneously optimized in the planning process. Often, these criteria may not be readily compared and sometimes lie in competition with each other. Simply applying standard multi-objective search algorithms to multi-agent path finding may prove to be inefficient because the size of the space of possible solutions, i.e., the Pareto-optimal set, can grow exponentially with the number of agents (the dimension of the search space). This paper presents an approach that bypasses this so-called curse of dimensionality by leveraging our prior multi-agent work with a framework called subdimensional expansion. One example of subdimensional expansion, when applied to A*, is called M*; M* was limited to a single objective function. We combine principles of dominance and subdimensional expansion to create a new algorithm named multi-objective M* (MOM*), which dynamically couples agents for planning only when those agents have to "interact" with each other. MOM* computes the complete Pareto-optimal set for multiple agents efficiently and naturally trades off sub-optimal approximations of the Pareto-optimal set against computational efficiency. Our approach is able to find the complete Pareto-optimal set for problem instances with hundreds of solutions, which the standard multi-objective A* algorithms could not find within a bounded time.
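
The dominance principle mentioned above can be illustrated with a small Python sketch that filters cost vectors down to a Pareto-optimal set. It covers only the dominance check, not MOM* or subdimensional expansion, and the function names and sample costs are hypothetical.

```python
def dominates(a, b):
    """True if cost vector `a` is no worse than `b` in every objective
    and strictly better in at least one (minimisation)."""
    return (all(x <= y for x, y in zip(a, b))
            and any(x < y for x, y in zip(a, b)))


def pareto_front(costs):
    """Keep only the non-dominated cost vectors, dropping duplicates."""
    front = []
    for c in costs:
        if any(dominates(f, c) for f in front):
            continue  # c is dominated by something already kept
        front = [f for f in front if not dominates(c, f)]
        if c not in front:
            front.append(c)
    return front


# Example: (time-to-completion, fuel use) for five candidate joint paths.
candidates = [(10, 5), (8, 7), (9, 9), (12, 4), (8, 7)]
print(pareto_front(candidates))
```

Here `(9, 9)` is discarded because `(8, 7)` is better in both objectives, while the remaining three vectors are mutually incomparable and all stay in the set.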

翻訳日:2021-02-03 16:38:55 公開日:2021-02-02

# 化学反応データ集合の無支援ノイズ低減

Unassisted Noise Reduction of Chemical Reaction Data Sets ( http://arxiv.org/abs/2102.01399v1 )

ライセンス: Link先を確認

Alessandra Toniato, Philippe Schwaller, Antonio Cardinale, Joppe Geluykens and Teodoro Laino

(参考訳) 有機化学における反応予測に応用された既存のディープラーニングモデルは、高いレベルの精度(自然言語処理ベースでは90%)に達する可能性がある。反応データから得られた情報以上に化学知識が組み込まれていないため、予測モデルの性能においてデータセットの品質が重要な役割を果たす。人間のキュレーションは極めて高価だが、既存のデータセットから化学的に間違ったエントリを取り除くための支援のないアプローチの必要性は、合成化学タスクにおける人工知能モデルのパフォーマンスを改善するために不可欠である。本稿では,化学反応コレクションから化学的に間違った成分を除去する機械学習による非支援手法を提案する。我々はこの手法を,米国特許庁(USPTO)特許から抽出した化学反応ピスタチオとオープンデータセットの収集に適用した。その結果,クリーン化およびバランスの取れたデータセットでトレーニングしたモデルの予測精度が向上した。逆合成モデルでは、ラウンドトリップ精度メトリックは13パーセントポイント増加し、累積Jensen Shannon発散の値は元のレコードと比較して30%減少します。カバレッジは97%で高いままであり、クラス多様性の価値はクリーニングによって影響を受けません。提案手法は,化学データの自動ノイズ低減に対処する最初の無規制手法である。

Existing deep learning models applied to reaction prediction in organic chemistry can reach high levels of accuracy (> 90% for Natural Language Processing-based ones). With no chemical knowledge embedded other than the information learnt from reaction data, the quality of the data sets plays a crucial role in the performance of the prediction models. Because human curation is prohibitively expensive, unaided approaches to remove chemically incorrect entries from existing data sets are essential to improve artificial intelligence models' performance in synthetic chemistry tasks. Here we propose a machine learning-based, unassisted approach to remove chemically wrong entries from chemical reaction collections. We applied this method to the collection of chemical reactions Pistachio and to an open data set, both extracted from USPTO (United States Patent Office) patents. Our results show an improved prediction quality for models trained on the cleaned and balanced data sets. For the retrosynthetic models, the round-trip accuracy metric grows by 13 percentage points and the value of the cumulative Jensen Shannon divergence decreases by 30% compared to its original record. The coverage remains high at 97%, and the value of the class-diversity is not affected by the cleaning. The proposed strategy is the first unassisted rule-free technique to address automatic noise reduction in chemical data sets.
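
For readers unfamiliar with the Jensen Shannon divergence mentioned above, the following Python sketch computes the standard divergence between two discrete distributions. It is not the cumulative variant reported in the abstract, whose definition is not given there; the function names are illustrative.

```python
import math


def kl_divergence(p, q):
    """Kullback-Leibler divergence in bits; terms with p_i == 0 contribute 0."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def js_divergence(p, q):
    """Jensen-Shannon divergence: mean KL divergence to the midpoint mixture."""
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)
```

With base-2 logarithms the value lies between 0 (identical distributions) and 1 (distributions with disjoint support), so a lower value indicates that two class distributions are closer to each other.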

翻訳日:2021-02-03 16:38:10 公開日:2021-02-02

# 自律システム(AMLAS)における機械学習の保証に関するガイダンス

Guidance on the Assurance of Machine Learning in Autonomous Systems (AMLAS) ( http://arxiv.org/abs/2102.01564v1 )

ライセンス: Link先を確認

Richard Hawkins, Colin Paterson, Chiara Picardi, Yan Jia, Radu Calinescu and Ibrahim Habli

(参考訳) 機械学習(ML)は、現在、特定の条件下では人間のパフォーマンスを超えると報告された結果を持つ様々なシステムで使用されている。これらのシステムの多くは、ヘルスケア、自動車、製造業などの分野で、高い自律性を示し、安全性が重要です。 MLの正当性を確立することは、これらのシステムの安全ケースの中核をなす。本稿では,自律システム(AMLAS)における機械学習の保証に関する方法論を紹介する。 AMLASは、(1)MLコンポーネントの開発に安全保証を体系的に統合する工程と、(2)自律システムアプリケーションに統合された場合に、これらのコンポーネントの許容される安全性を明確に正当化するエビデンスベースを生成する工程と、からなる。

Machine Learning (ML) is now used in a range of systems with results that are reported to exceed, under certain conditions, human performance. Many of these systems, in domains such as healthcare, automotive and manufacturing, exhibit high degrees of autonomy and are safety critical. Establishing justified confidence in ML forms a core part of the safety case for these systems. In this document we introduce a methodology for the Assurance of Machine Learning for use in Autonomous Systems (AMLAS). AMLAS comprises a set of safety case patterns and a process for (1) systematically integrating safety assurance into the development of ML components and (2) generating the evidence base for explicitly justifying the acceptable safety of these components when integrated into autonomous system applications.

翻訳日:2021-02-03 16:37:31 公開日:2021-02-02

# 外部イノベーションが新規医薬品承認に与える影響--ふりかえり分析

The impact of external innovation on new drug approvals: A retrospective analysis ( http://arxiv.org/abs/2102.01260v1 )

ライセンス: Link先を確認

Xiong Liu, Craig E. Thomas, Christian C. Felder

(参考訳) 製薬会社は、発見研究の生産性を高めるために外部のイノベーション源に頼りがちです。しかし、イノベーションエコシステムを最大限に活用する方法についてより深く理解するためには、外部のイノベーションがプロダクトのローンチを成功に導く方法に関する深い知識が必要である。 FDAが承認した新規分子実体(NMEs)と13の大手製薬会社(2006-2016年)が立ち上げた新生物実体(NBEs)について,承認前の文献を分析した。学術機関が承認前の出版物の大半に貢献し、出版主題がそれぞれの革新者の強みと密接に一致していることが判明した。これは第3相で終了する候補薬にも当てはまりますが、これらの分子に関する文献の量は承認された薬物よりもはるかに少ないです。これは、認可された薬物は、多くの研究所によって提供されるより堅牢なデータセットとしばしば関連していることを示唆している。総合的に分析した結果,学界,産業界,政府にまたがる共同研究イノベーション環境が,医薬品承認の成功に非常に寄与するという仮説が支持された。

Pharmaceutical companies are relying more often on external sources of innovation to boost their discovery research productivity. However, more in-depth knowledge about how external innovation may translate to successful product launches is still required in order to better understand how to best leverage the innovation ecosystem. We analyzed the pre-approval publication histories for FDA-approved new molecular entities (NMEs) and new biologic entities (NBEs) launched by 13 top research pharma companies during the last decade (2006-2016). We found that academic institutions contributed the majority of pre-approval publications and that publication subject matter is closely aligned with the strengths of the respective innovator. We found this to also be true for candidate drugs terminated in Phase 3, but the volume of literature on these molecules is substantially less than for approved drugs. This may suggest that approved drugs are often associated with a more robust dataset provided by a large number of institutes. Collectively, the results of our analysis support the hypothesis that a collaborative research innovation environment spanning across academia, industry and government is highly conducive to successful drug approvals.

翻訳日:2021-02-03 16:35:39 公開日:2021-02-02

# 日立JHU DiHARD IIIシステム:DOVER-Lapと組み合わせた競合型エンドツーエンドニューラルダイアリゼーションとXベクトルクラスタリングシステム

The Hitachi-JHU DIHARD III System: Competitive End-to-End Neural Diarization and X-Vector Clustering Systems Combined by DOVER-Lap ( http://arxiv.org/abs/2102.01363v1 )

ライセンス: Link先を確認

Shota Horiguchi, Nelson Yalta, Paola Garcia, Yuki Takashima, Yawen Xue, Desh Raj, Zili Huang, Yusuke Fujita, Shinji Watanabe, Sanjeev Khudanpur

(参考訳) 本稿では,第3回DIHARD音声ダイアリゼーションチャレンジに提出された日立-JHUシステムについて詳述する。このシステムは5つのサブシステム(x-vectorベースのサブシステム2つ、エンドツーエンドのニューラルネットワークダイアリゼーションベースのサブシステム2つ、ハイブリッドサブシステム1つ)のアンサンブル結果を出力する。各システムを洗練し、5つのサブシステムすべてが競争力と補完的になります。 DOVER-Lapベースのシステムの組み合わせの後、トラック1のフルとコアで11.58 %と14.09 %、トラック2のフルとコアで16.94 %と20.01 %というダイアリゼーションエラー率を達成した。その結果、私たちはチャレンジのすべてのタスクで2位を獲得しました。

This paper provides a detailed description of the Hitachi-JHU system that was submitted to the Third DIHARD Speech Diarization Challenge. The system outputs the ensemble results of five subsystems: two x-vector-based subsystems, two end-to-end neural diarization-based subsystems, and one hybrid subsystem. We refined each system so that all five subsystems became competitive and complementary. After the DOVER-Lap based system combination, it achieved diarization error rates of 11.58 % and 14.09 % in Track 1 full and core, and 16.94 % and 20.01 % in Track 2 full and core, respectively. With these results, we won second place in all the tasks of the challenge.

翻訳日:2021-02-03 16:34:57 公開日:2021-02-02

# 仮想フロー計測のためのベイジアンニューラルネットワーク:実証的研究

Bayesian Neural Networks for Virtual Flow Metering: An Empirical Study ( http://arxiv.org/abs/2102.01391v1 )

ライセンス: Link先を確認

Bjarne Grimstad, Mathilde Hotvedt, Anders T. Sandnes, Odd Kolbj{\o}rnsen, Lars S. Imsland

(参考訳) 最近の研究は、機械学習(ML)を油井やガス井の流量のモデリングに適用することで有望な成果を上げている。計算的に安い評価や新しいデータへのキャリブレーションの容易さといったMLモデルの有利な特性と組み合わせることで、データ駆動型仮想フローメータ(VFM)の開発に楽観的になった。ベイズニューラルネットワークに基づく確率的VFMを提示することにより,この発展に寄与する。均質および異方性測定ノイズを考察し、最大後オリ推定と変動推論を用いたモデルの訓練方法を示す。 5つの異なる石油およびガス資産にまたがる60の井戸からなる大規模で不均一なデータセットをモデル化して手法を研究します。予測性能は過去のデータと将来のテストデータに基づいて分析され、50%のベストパフォーマンスモデルの平均誤差は5-6%と9-13%であった。変動推論は、将来のデータに対する参照アプローチよりも堅牢な予測を提供するように見える。歴史的および将来のデータに対する予測性能と不確実性の違いを詳細に検討し、調査結果はデータ駆動VFMのための代替戦略の開発を動機づける。

Recent works have presented promising results from the application of machine learning (ML) to the modeling of flow rates in oil and gas wells. The encouraging results combined with advantageous properties of ML models, such as computationally cheap evaluation and ease of calibration to new data, have sparked optimism for the development of data-driven virtual flow meters (VFMs). We contribute to this development by presenting a probabilistic VFM based on a Bayesian neural network. We consider homoscedastic and heteroscedastic measurement noise, and show how to train the models using maximum a posteriori estimation and variational inference. We study the methods by modeling on a large and heterogeneous dataset, consisting of 60 wells across five different oil and gas assets. The predictive performance is analyzed on historical and future test data, where we achieve an average error of 5-6% and 9-13% for the 50% best performing models, respectively. Variational inference appears to provide more robust predictions than the reference approach on future data. The difference in prediction performance and uncertainty on historical and future data is explored in detail, and the findings motivate the development of alternative strategies for data-driven VFM.
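The distinction between homoscedastic and heteroscedastic measurement noise mentioned above can be illustrated with a per-sample Gaussian negative log-likelihood. This is a minimal sketch, assuming Gaussian noise; the function name `gaussian_nll` and the log-variance parameterisation are illustrative choices, not taken from the paper.

```python
import math

def gaussian_nll(y, mean, log_var):
    """Negative log-likelihood of y under N(mean, exp(log_var)).

    Homoscedastic noise: one shared log_var for every sample.
    Heteroscedastic noise: log_var is predicted per input, alongside the mean.
    """
    variance = math.exp(log_var)
    return 0.5 * (math.log(2.0 * math.pi) + log_var + (y - mean) ** 2 / variance)

# A large residual is penalised less when the model also predicts a large variance.
loss_confident = gaussian_nll(y=3.0, mean=0.0, log_var=0.0)
loss_uncertain = gaussian_nll(y=3.0, mean=0.0, log_var=math.log(9.0))
```

Under maximum a posteriori training, a term of this kind would typically be summed over the data and combined with a log-prior on the weights; how the paper sets this up is not stated in the abstract.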

翻訳日:2021-02-03 16:32:26 公開日:2021-02-02

# 時間依存係数をもつ偏微分方程式のロバストなデータ駆動探索

Robust data-driven discovery of partial differential equations with time-dependent coefficients ( http://arxiv.org/abs/2102.01432v1 )

ライセンス: Link先を確認

Aoxue Chen, Guang Lin

(参考訳) 本研究では,ベイズ群Lassoとスパイクとスラブの先行値を用いた,可変係数の偏微分方程式の発見に基づく,堅牢なベイズスパース学習アルゴリズムを提案する。 Gibbsサンプラーで後方分布から抽出したサンプルを用いて、標準誤差と信頼区間とともに係数の値を推定することができます。エラーバーの構築とは別に、モデル選択としきい値設定の新しい基準の設計にも不確実性定量化を用いることができる。これにより、時間依存係数を持つ学習方程式において、より調整可能でロバストな手法が可能となる。モデル選択としきい値設定の3つの基準を導入し、正しい用語を識別する:ルート平均平方、総誤差バー、グループエラーバーである。さらに,3つのノイズフィルタを頑健なベイズスパース学習アルゴリズムと統合し,より大きなノイズでより良い結果を得る。数値計算により,本手法は3つの例による雑音条件下での逐次グループ化閾値リッジ回帰とグループラッソよりも頑健であることが示された。

In this work, we propose a robust Bayesian sparse learning algorithm based on Bayesian group Lasso with spike and slab priors for the discovery of partial differential equations with variable coefficients. Using the samples drawn from the posterior distribution with a Gibbs sampler, we are able to estimate the values of coefficients, together with their standard errors and confidence intervals. Apart from constructing the error bars, uncertainty quantification can also be employed for designing new criteria of model selection and threshold setting. This makes our method more adjustable and robust in learning equations with time-dependent coefficients. Three criteria are introduced for model selection and threshold setting to identify the correct terms: the root mean square, total error bar, and group error bar. Moreover, three noise filters are integrated with the robust Bayesian sparse learning algorithm for better results with larger noise. Numerical results on three examples demonstrate that our method is more robust than sequential grouped threshold ridge regression and group Lasso in noisy situations.

翻訳日:2021-02-03 16:31:44 公開日:2021-02-02

# 大次元縦型バイオマーカー履歴による臨床エンドポイントの個人的動的予測--ランドマークアプローチ

Individual dynamic prediction of clinical endpoint from large dimensional longitudinal biomarker history: a landmark approach ( http://arxiv.org/abs/2102.01466v1 )

ライセンス: Link先を確認

Anthony Devaux (BPH), Robin Genuer (BPH, SISTM), Karine P\'er\`es (BPH), C\'ecile Proust-Lima (BPH)

(参考訳) 患者のフォローアップを通じて収集された個々のデータは、臨床イベントのリスクを評価し、最終的に治療戦略を適応するための重要な情報です。反復測度から1つまたは2つのマーカーへの個々の動的予測を計算するために、ジョイントモデルとランドマークモデルが提案されている。しかし、完全な患者の履歴がはるかに繰り返しのマーカーを含むケースにはほとんど拡張されません。そこで我々は, 多量のマーカーの繰り返し測定を応用して, 健康イベントを動的に予測する手法を提案することを目標とした。内因性マーカー履歴に拡張したランドマークアプローチと,サバイバルデータに適応した機械学習手法を組み合わせた。各マーカー軌跡はランドマーク時間まで収集された情報を用いてモデル化され、個々の軌跡を最も捉えた要約変数が導出される。これらの要約と追加の共変量は、異なる予測方法に含まれる。大規模な次元履歴を扱うためには、正規化レグレッションやランダムサバイバル森林といった生存データに適応した機械学習手法を用いて、ランドマーク時間からイベントを予測し、それらをスーパーラーナーにどのように組み合わせるかを示す。そして、ブリアスコアの推定値と検閲データに適応した受信者操作特性曲線下の領域を用いて、クロスバリデーションによりパフォーマンスを評価する。特に,予測者と事象との間に多数の非線形関係が存在する場合において,機械学習サバイバル手法の標準生存モデルに対する利点をシミュレーションで実証する。そこで本研究では, 原発性胆管炎患者に対する死亡予測の臨床的コンテキストと, 一般高齢者における死亡予測との公衆衛生的コンテキストの2つの予測条件を適用した。 Rで実施した手法により,繰り返しマーカーの数が多い場合でも,患者の縦断的履歴全体を用いた事象の予測が可能となった。繰り返しマーカーの混合モデルや単一の正しい検閲されたイベントのための方法が導入されたが、この方法はマーカーの他の適切なモデリング技術で使用することができ、競合するリスク設定に容易に拡張することができる。

The individual data collected throughout patient follow-up constitute crucial information for assessing the risk of a clinical event, and eventually for adapting a therapeutic strategy. Joint models and landmark models have been proposed to compute individual dynamic predictions from repeated measures of one or two markers. However, they hardly extend to the case where the complete patient history may include many more repeated markers. Our objective was thus to propose a solution for the dynamic prediction of a health event that may exploit repeated measures of a possibly large number of markers. We combined a landmark approach extended to endogenous marker history with machine learning methods adapted to survival data. Each marker trajectory is modeled using the information collected up to landmark time, and summary variables that best capture the individual trajectories are derived. These summaries and additional covariates are then included in different prediction methods. To handle a possibly large dimensional history, we rely on machine learning methods adapted to survival data, namely regularized regressions and random survival forests, to predict the event from the landmark time, and we show how they can be combined into a superlearner. Then, performance is evaluated by cross-validation using estimators of the Brier Score and the area under the Receiver Operating Characteristic curve adapted to censored data. We demonstrate in a simulation study the benefits of machine learning survival methods over standard survival models, especially in the case of numerous and/or nonlinear relationships between the predictors and the event. We then applied the methodology in two prediction contexts: a clinical context with the prediction of death for patients with primary biliary cholangitis, and a public health context with the prediction of death in the general elderly population at different ages.
Our methodology, implemented in R, enables the prediction of an event using the entire longitudinal patient history, even when the number of repeated markers is large. Although introduced with mixed models for the repeated markers and methods for a single right censored time-to-event, our method can be used with any other appropriate modeling technique for the markers and can be easily extended to competing risks setting.

翻訳日:2021-02-03 16:31:09 公開日:2021-02-02

# 連続時間における合成制御を用いた政策分析

Policy Analysis using Synthetic Controls in Continuous-Time ( http://arxiv.org/abs/2102.01577v1 )

ライセンス: Link先を確認

Alexis Bellot, Mihaela van der Schaar

(参考訳) 合成制御を用いた反実用推定は、因果推論における最も成功した最近の方法論発展の1つである。現在の記述では、その人気にもかかわらず、時間系列は単位と観測された制御単位の線形組み合わせとして表現された合成制御にまたがるだけである。本論文では,制御微分方程式の形式化を用いて,潜在反実パスを明示的にモデル化する連続時間代替法を提案する。このモデルは不規則に整合した多変量時系列の一般的な設定に直接適用でき、リッチな関数空間に最適化される可能性がある。

Counterfactual estimation using synthetic controls is one of the most successful recent methodological developments in causal inference. Despite its popularity, the current description only considers time series aligned across units and synthetic controls expressed as linear combinations of observed control units. We propose a continuous-time alternative that models the latent counterfactual path explicitly using the formalism of controlled differential equations. This model is directly applicable to the general setting of irregularly-aligned multivariate time series and may be optimized in rich function spaces -- thereby improving on some limitations of existing approaches.

翻訳日:2021-02-03 16:30:13 公開日:2021-02-02

# FINNを用いたFPGA上の量子化ニューラルネットワークのベンチマーク

Benchmarking Quantized Neural Networks on FPGAs with FINN ( http://arxiv.org/abs/2102.01341v1 )

ライセンス: Link先を確認

Quentin Ducasse, Pascal Cotret, Lo\"ic Lagadec, Robert Stewart

(参考訳) 最先端のニューラルネットワークのトレーニングと推論の両方のコストの増大は、正確性に最小限の影響を伴って使用するリソースを削減する方法を文学的に見直すことになった。精度を下げるには、精度の低下を無視するコストがかかる。ニューラルネットワークのトレーニングには強力なセットアップが必要だが、低電力と低リソースのハードウェアアーキテクチャでネットワークをデプロイできる必要がある。再構成可能なアーキテクチャは、特定のアプリケーションを見る場合、GPUよりも強力で柔軟なことが証明されている。本稿では、FPGA上に展開されたニューラルネットワークに適用した場合の混合精度の影響を評価することを目的とする。ニューラルネットワークを低精度でデプロイするツールを作成するフレームワークはいくつか存在するが、量子化の重要性とフレームワークの品質を評価するものはほとんどない。 Xilinxラボの2つのフレームワークであるFINNとBrevitasを使用して、2から8ビットの精度と複数の並列化構成の重みを使用して、ニューラルネットワークに対する量子化の影響を評価します。精度の低い表現と十分なトレーニングで等価な精度を得ることができます。しかし、圧縮されたネットワークはより並列化され、ネットワークのスループットが62倍高速になる。この作業で設定されたベンチマークは、パブリックリポジトリ(https://github.com/QDucasse/nn_benchmark)で利用できる。

The ever-growing cost of both training and inference for state-of-the-art neural networks has led the literature to look for ways to reduce resource usage with minimal impact on accuracy. Using lower precision comes at the cost of a negligible loss in accuracy. While training neural networks may require a powerful setup, deploying a network must be possible on low-power and low-resource hardware architectures. Reconfigurable architectures have proven to be more powerful and flexible than GPUs when looking at a specific application. This article aims to assess the impact of mixed-precision when applied to neural networks deployed on FPGAs. While several frameworks exist that create tools to deploy neural networks using reduced-precision, few of them assess the importance of quantization and the framework quality. FINN and Brevitas, two frameworks from Xilinx labs, are used to assess the impact of quantization on neural networks using 2 to 8 bit precisions and weights with several parallelization configurations. Equivalent accuracy can be obtained using lower-precision representation and enough training. However, the compressed network can be better parallelized allowing the deployed network throughput to be 62 times faster. The benchmark set up in this work is available in a public repository (https://github.com/QDucasse/nn_benchmark).

翻訳日:2021-02-03 16:28:59 公開日:2021-02-02

# 対向ロバストネスのための対向訓練の最近の進歩

Recent Advances in Adversarial Training for Adversarial Robustness ( http://arxiv.org/abs/2102.01356v1 )

ライセンス: Link先を確認

Tao Bai, Jinqi Luo, Jun Zhao, Bihan Wen

(参考訳) ディープラーニングモデルをだますための逆例は、数年前から研究されており、まだホットなトピックです。敵の訓練も、敵の例を守る効果から大きな注目を集めている。しかし、敵の訓練は完璧ではなく、解決すべき問題が多い。過去数年間、このコミュニティの研究者は様々な側面から敵の訓練を研究し、議論してきた。敵対的訓練の多くの新しい理論と理解が提案されている。本研究は, 敵意訓練の最近の進歩を, 異なる改善によって分類し, 初めて体系的に検討するものである。次に, 対人訓練における一般化問題について3つの視点から考察する。最後に,未解決の課題を浮き彫りにして,今後の方向性について述べる。

Adversarial examples for fooling deep learning models have been studied for several years and are still a hot topic. Adversarial training also receives enormous attention because of its effectiveness in defending against adversarial examples. However, adversarial training is not perfect, and many questions about it remain to be solved. During the last few years, researchers in this community have studied and discussed adversarial training from various aspects. Many new theories and understandings of adversarial training have been proposed. In this survey, we systematically review the recent progress on adversarial training for the first time, categorized by different improvements. Then we discuss the generalization problems in adversarial training from three perspectives. Finally, we highlight the challenges which are not fully solved and present potential future directions.

翻訳日:2021-02-03 16:28:18 公開日:2021-02-02

# LSTM-Recurrent Neural Networksを用いた車線変化までの時間予測

Predicting the Time Until a Vehicle Changes the Lane Using LSTM-based Recurrent Neural Networks ( http://arxiv.org/abs/2102.01431v1 )

ライセンス: Link先を確認

Florian Wirthm\"uller, Marvin Klimke, Julian Schlechtriemen, Jochen Hipp and Manfred Reichert

(参考訳) 高速道路における自動運転車の安全で快適な軌道計画には,交通状況の正確な予測が必要である。これまでのところ、車線変更が実際に起こる時点を推定するよりも、車線変更操作の検出に多くの研究が費やされてきた。しかし実際には、この時間情報はもっと役に立つかもしれない。本論文では,長期記憶型リカレントニューラルネットワークを用いて,高速道路における周辺車両の次の車線変化の時間を正確に予測するシステムの開発について述べる。大規模実世界のデータセットに基づく広範な評価により,本手法は,最も困難な状況であっても,根平均二乗誤差が0.7秒程度で,信頼性の高い予測を行うことができることが示された。車線変更の3.5秒前の予測は精度が高くなり、中央値の誤差は0.25秒未満である。要約すると、この記事は下流の高精度な位置予測のための基本的なステップを形成します。

To plan safe and comfortable trajectories for automated vehicles on highways, accurate predictions of traffic situations are needed. So far, a lot of research effort has been spent on detecting lane change maneuvers rather than on estimating the point in time a lane change actually happens. In practice, however, this temporal information might be even more useful. This paper deals with the development of a system that accurately predicts the time to the next lane change of surrounding vehicles on highways using long short-term memory-based recurrent neural networks. An extensive evaluation based on a large real-world data set shows that our approach is able to make reliable predictions, even in the most challenging situations, with a root mean squared error around 0.7 seconds. As early as 3.5 seconds prior to lane changes the predictions become highly accurate, showing a median error of less than 0.25 seconds. In summary, this article forms a fundamental step towards highly accurate downstream position predictions.

翻訳日:2021-02-03 16:27:48 公開日:2021-02-02

# 抽象的手法によるマルチエージェント深層補強学習行動の検証

An Abstraction-based Method to Verify Multi-Agent Deep Reinforcement-Learning Behaviours ( http://arxiv.org/abs/2102.01434v1 )

ライセンス: Link先を確認

Pierre El Mqirmi, Francesco Belardinelli and Borja G. Le\'on

(参考訳) マルチエージェント強化学習(RL)は、学習エージェントの安全な動作を保証するためにしばしば苦労するため、一般的には安全クリティカルな応用に適応しない。この問題に対処するために,形式検証と(深度)RLアルゴリズムを組み合わせて,トレーニングとテストの両方において,公式に指定された安全制約の満足度を保証する手法を提案する。私たちが提案するアプローチは、確率計算木論理(PCTL)で検証する制約を表現し、検証ステップの複雑さを減らすためにシステムの抽象表現を構築します。この抽象モデルにより、PCTLで表現される安全制約を満たす抽象ポリシーの集合をモデル検査技術で識別することができる。そして、これらの安全な抽象ポリシーに従ってエージェントの振る舞いが制限される。本手法を用いることで,エージェントの動作が常に安全制約を満たすことを保証し,抽象モデルを自動的に生成する手順を提供する。マルチエージェント環境において,本手法の有効性を実証的に評価し,実証する。

Multi-agent reinforcement learning (RL) often struggles to ensure the safe behaviours of the learning agents, and therefore it is generally not adapted to safety-critical applications. To address this issue, we present a methodology that combines formal verification with (deep) RL algorithms to guarantee the satisfaction of formally-specified safety constraints both in training and testing. The approach we propose expresses the constraints to verify in Probabilistic Computation Tree Logic (PCTL) and builds an abstract representation of the system to reduce the complexity of the verification step. This abstract model allows for model checking techniques to identify a set of abstract policies that meet the safety constraints expressed in PCTL. Then, the agents' behaviours are restricted according to these safe abstract policies. We provide formal guarantees that by using this method, the actions of the agents always meet the safety constraints, and provide a procedure to generate an abstract model automatically. We empirically evaluate and show the effectiveness of our method in a multi-agent environment.

翻訳日:2021-02-03 16:27:14 公開日:2021-02-02

# Entropy-Regularized Deep Reinforcement Learningによる平均フィールドゲームについて

Approximately Solving Mean Field Games via Entropy-Regularized Deep Reinforcement Learning ( http://arxiv.org/abs/2102.01585v1 )

ライセンス: Link先を確認

Kai Cui, Heinz Koeppl

(参考訳) 最近の平均場ゲーム(MFG)は、多くのエージェント設定で近似的なナッシュ平衡の難解な計算を容易にする。本稿では,離散時間有限MFGを有限ホリゾン目標とする。非コンスタントな不動点作用素を持つ離散時間有限 MFG は、既存のMFG の文献で通常仮定されるような縮約的でないことを示し、不動点反復による収束を抑える。代わりに、エントロピー規則化とボルツマンポリシーを固定点反復に組み込む。その結果,既存手法が故障する近似不動点に対する証明可能な収束が得られ,nash平衡近似の本来の目標に到達した。提案手法はすべて, 操作可能な厳密解を用いた指導例と, 厳密解が難解な高次元問題の両方について評価されている。高次元シナリオでは、確立された深層強化学習法を適用し、実演と近似を経験的に組み合わせる。

The recent mean field game (MFG) formalism facilitates otherwise intractable computation of approximate Nash equilibria in many-agent settings. In this paper, we consider discrete-time finite MFGs subject to finite-horizon objectives. We show that all discrete-time finite MFGs with non-constant fixed point operators fail to be contractive as typically assumed in existing MFG literature, barring convergence via fixed point iteration. Instead, we incorporate entropy-regularization and Boltzmann policies into the fixed point iteration. As a result, we obtain provable convergence to approximate fixed points where existing methods fail, and reach the original goal of approximate Nash equilibria. All proposed methods are evaluated with respect to their exploitability, on both instructive examples with tractable exact solutions and high-dimensional problems where exact methods become intractable. In high-dimensional scenarios, we apply established deep reinforcement learning methods and empirically combine fictitious play with our approximations.
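The Boltzmann policies mentioned above can be sketched as a softmax over action values. This is a generic illustration under assumptions: the name `boltzmann_policy`, the `temperature` parameter, and the use of plain action values are hypothetical, and the exact regularized operator used in the paper is not given in the abstract.

```python
import math

def boltzmann_policy(q_values, temperature):
    """Softmax over action values; smaller temperatures give greedier policies."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [q / temperature for q in q_values]
    offset = max(scaled)  # subtract the maximum for numerical stability
    weights = [math.exp(s - offset) for s in scaled]
    total = sum(weights)
    return [w / total for w in weights]

policy = boltzmann_policy([1.0, 2.0, 3.0], temperature=1.0)
```

A high temperature pushes the policy towards uniform and a low temperature towards the greedy action, which is the usual trade-off between smoothness of the fixed point iteration and closeness to an exact best response.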

翻訳日:2021-02-03 16:26:38 公開日:2021-02-02

# CLIP-Guided Generative Latent Space Search によるキャプションからの画像生成とその逆

Generating images from caption and vice versa via CLIP-Guided Generative Latent Space Search ( http://arxiv.org/abs/2102.01645v1 )

ライセンス: Link先を確認

Federico A. Galatolo and Mario G.C.A. Cimino and Gigliola Vaglini

(参考訳) 本研究では,与えられたキャプション(または画像)に対応する画像(またはキャプション)を生成する新しいゼロショットフレームワークであるGLaSSを提案する。 GLaSSは、画像と記述キャプションが同様の埋め込みを提供するCLIPニューラルネットワークに基づいている。別として、GLaSSは入力としてキャプション(または画像)を取り、CLIP埋め込みが入力に最も近い画像(またはキャプション)を生成します。この最適な画像(またはキャプション)は、遺伝的アルゴリズムによる探索後に生成ネットワークを介して生成される。画像生成器BigGANおよびStyleGAN2の実験とテキスト生成器GPT2の実験に基づいて、推定結果を示す。

In this research work we present GLaSS, a novel zero-shot framework to generate an image (or a caption) corresponding to a given caption (or image). GLaSS is based on the CLIP neural network, which, given an image and a descriptive caption, provides similar embeddings. In contrast, GLaSS takes a caption (or an image) as an input, and generates the image (or the caption) whose CLIP embedding is most similar to the input one. This optimal image (or caption) is produced via a generative network after an exploration by a genetic algorithm. Promising results are shown, based on experiments with the image generators BigGAN and StyleGAN2, and with the text generator GPT2.

翻訳日:2021-02-03 16:26:01 公開日:2021-02-02

# 皮膚病変分類のための不均衡小データセットの単一モデル深層学習

Single Model Deep Learning on Imbalanced Small Datasets for Skin Lesion Classification ( http://arxiv.org/abs/2102.01284v1 )

ライセンス: Link先を確認

Peng Yao, Shuwei Shen, Mengjuan Xu, Peng Liu, Fan Zhang, Jinyu Xing, Pengfei Shao, Benjamin Kaffenberger, and Ronald X. Xu

(参考訳) 深層畳み込みニューラルネットワーク(DCNN)モデルは皮膚疾患の診断のために広く研究されており、そのいくつかは皮膚科医の診断結果と同等かそれ以上に優れている。しかし, 皮膚疾患検出におけるdcnnの広範な実装は, 小さいサイズとデータ不均衡によって妨げられている。本稿では,小・不均衡なデータセットに基づく皮膚病変の単一モデル分類のための新しいデータ拡張戦略を提案する。まず、このデータセット上で様々なDCNNをトレーニングし、適度な複雑さを持つモデルがより大きなモデルより優れていることを示す。第二に、正規化DropOutとDropBlockを追加してオーバーフィッティングを削減し、小さなデータセットのサンプル不足の欠陥に対処するためにModified RandAugment Augmentation戦略を提案します。最後に,不均一なサンプルサイズと分類困難さを克服するために,新しい多重音声損失関数を導入した。改良型ランダグメントと複数重み付き焦点損失を単一のdcnnモデルで組み合わせることで,isic 2018 challengeテストデータセットにおける複数のセンシングモデルと同等の分類精度を達成した。本研究では, 低リソース環境下での皮膚病変や他の多くの悪性度の自動スクリーニングのためにモバイル機器に実装するのに好適な, 計算リソースと推論時間の低コストで高い分類性能を達成できることを示した。

Deep convolutional neural network (DCNN) models have been widely explored for skin disease diagnosis and some of them have achieved diagnostic outcomes comparable or even superior to those of dermatologists. However, broad implementation of DCNN in skin disease detection is hindered by the small size and data imbalance of the publicly accessible skin lesion datasets. This paper proposes a novel data augmentation strategy for single model classification of skin lesions based on a small and imbalanced dataset. First, various DCNNs are trained on this dataset to show that the models with moderate complexity outperform the larger models. Second, regularization DropOut and DropBlock are added to reduce overfitting and a Modified RandAugment augmentation strategy is proposed to address the defects of sample underrepresentation in the small dataset. Finally, a novel Multi-Weighted Focal Loss function is introduced to overcome the challenge of uneven sample size and classification difficulty. By combining Modified RandAugment and Multi-weighted Focal Loss in a single DCNN model, we have achieved classification accuracy comparable to that of multiple ensembling models on the ISIC 2018 challenge test dataset. Our study shows that this method is able to achieve a high classification performance at a low cost of computational resources and inference time, potentially suitable to implement in mobile devices for automated screening of skin lesions and many other malignancies in low resource settings.
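For orientation, the standard class-weighted focal loss that such a loss builds on can be sketched as follows. This is the generic focal loss, not the paper's Multi-Weighted Focal Loss, whose exact form is not given in the abstract; `focal_loss`, `gamma`, and `class_weights` are illustrative names.

```python
import math

def focal_loss(probs, target, gamma=2.0, class_weights=None):
    """Standard focal loss for one sample: -alpha_t * (1 - p_t)**gamma * log(p_t).

    probs: predicted class probabilities; target: index of the true class.
    class_weights: optional per-class weights to counter class imbalance.
    """
    p_t = max(probs[target], 1e-12)  # guard against log(0)
    alpha_t = 1.0 if class_weights is None else class_weights[target]
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)

easy = focal_loss([0.1, 0.9], target=1)  # well-classified sample
hard = focal_loss([0.6, 0.4], target=1)  # poorly classified sample
```

The `(1 - p_t)**gamma` factor shrinks the contribution of easy samples, while the class weight addresses uneven sample counts; with `gamma=0` and no weights the expression reduces to ordinary cross-entropy.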

翻訳日:2021-02-03 16:25:27 公開日:2021-02-02

# 積分画像と積分ヒストグラムに基づく移動端トーンマッピング

Mobile-end Tone Mapping based on Integral Image and Integral Histogram ( http://arxiv.org/abs/2102.01289v1 )

ライセンス: Link先を確認

Jie Yang, Mengchen Lin, Ziyi Liu, Ulian Shahnovich, Orly Yadid-Pecht

(参考訳) 広いダイナミックレンジ(WDR)の画像トーンマッピングは、フィルム制作、セキュリティ監視、写真撮影など多くのアプリケーションで高い需要があります。今日の画像のほとんどは携帯電話からのものであるため、モバイルデバイスにとって特に重要です。そのため、そのような技術はモバイルデバイスの消費者市場で非常に要求され、優れた顧客体験のために不可欠です。しかし、高品質で高性能なWDR画像トーンマッピングの実装はモバイル端末ではほとんど見られない。本稿では,高性能なモバイル用WDR画像トーンマッピングの実装について紹介する。複数の受信フィールドのトーンマッピング結果を活用し、各ピクセルに適した値を算出する。積分画像と積分ヒストグラムの利用は必要な計算量を大幅に削減する。さらに、GPU並列計算を用いて処理速度を向上する。実験結果から,モバイルデバイス上で1秒以内に高解像度のWDR画像を処理し,画像品質を向上できることが示唆された。

Wide dynamic range (WDR) image tone mapping is in high demand in many applications like film production, security monitoring, and photography. It is especially crucial for mobile devices because most of the images taken today are from mobile phones, hence such technology is highly demanded in the consumer market of mobile devices and is essential for a good customer experience. However, high-quality and high-performance WDR image tone mapping implementations are rarely found in the mobile-end. In this paper, we introduce a high performance, mobile-end WDR image tone mapping implementation. It leverages the tone mapping results of multiple receptive fields and calculates a suitable value for each pixel. The utilization of integral image and integral histogram significantly reduces the required computation. Moreover, GPU parallel computation is used to increase the processing speed. The experimental results indicate that our implementation can process a high-resolution WDR image within a second on mobile devices and produce appealing image quality.
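The saving from an integral image comes from turning any rectangular sum into four table lookups, independent of the window size. A minimal sketch of that idea follows; the function names are illustrative, and the paper's GPU implementation is not reproduced here.

```python
def integral_image(img):
    """Summed-area table of a 2-D list, padded with a leading zero row and column."""
    height, width = len(img), len(img[0])
    table = [[0] * (width + 1) for _ in range(height + 1)]
    for y in range(height):
        row_sum = 0
        for x in range(width):
            row_sum += img[y][x]
            table[y + 1][x + 1] = table[y][x + 1] + row_sum
    return table

def box_sum(table, top, left, bottom, right):
    """Sum over rows top..bottom-1 and columns left..right-1 using four lookups."""
    return (table[bottom][right] - table[top][right]
            - table[bottom][left] + table[top][left])

def box_mean(table, top, left, bottom, right):
    """Mean intensity of the same window, e.g. as a local statistic for one receptive field."""
    return box_sum(table, top, left, bottom, right) / ((bottom - top) * (right - left))

image = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
table = integral_image(image)
```

Once the table is built in a single pass, local means for several window sizes around each pixel cost the same constant number of lookups, which is what makes evaluating multiple receptive fields affordable.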

翻訳日:2021-02-03 16:24:42 公開日:2021-02-02

# ロバストハッシングによる偽画像検出

Fake-image detection with Robust Hashing ( http://arxiv.org/abs/2102.01313v1 )

ライセンス: Link先を確認

Miki Tanaka, Kiya Hitoshi

(参考訳) 本稿では,JPEG圧縮などの複数の操作手法を初めて画像に適用した場合においても,ロバストハッシュがフェイクイメージを堅牢に検出できるかどうかを検討する。実験では,ganで生成した偽画像を含む各種データセットを用いて,ロバストなハッシュによる偽検出が最先端のものよりも優れていることを示す。

In this paper, we investigate for the first time whether robust hashing can robustly detect fake images even when multiple manipulation techniques such as JPEG compression are applied to the images. In an experiment, the proposed fake detection with robust hashing is shown to outperform a state-of-the-art method on various datasets, including fake images generated with GANs.

翻訳日:2021-02-03 16:24:10 公開日:2021-02-02

# IoT用Ultra-Low-Power視覚センサによるエネルギー効率向上機械学習

Enabling energy efficient machine learning on a Ultra-Low-Power vision sensor for IoT ( http://arxiv.org/abs/2102.01340v1 )

ライセンス: Link先を確認

Francesco Paissan, Massimo Gottardi, Elisabetta Farella

(参考訳) IoT(Internet of Things)とスマートシティのパラダイムには、ユーザと市民に有用なサービスを返却するためにコンテキスト情報を抽出するユビキタス技術が含まれている。このシナリオにおいて重要な役割はコンピュータビジョンアプリケーションによって行われ、特定のデバイスから画像を取得する必要がある。ハイエンドカメラの必要性は、電力消費と高い計算資源の処理を要求するため、このプロセスにペナルティを課すことが多い。したがって、ハードウェア内モーション検出などの高度な機能を実装した新しい低消費電力視覚センサは、iot領域のコンピュータビジョンに不可欠である。残念なことに、エネルギー効率が高いため、これらのセンサーは知覚性能(解像度、フレームレート、色など)を悪化させる可能性がある。したがって、ドメイン固有のパイプラインは通常、これらのカメラの潜在能力を最大限活用するために配信される。本稿では,背景フィルタリングスマートビジョンセンサ(svs)のポテンシャルを最大限活用できるリアルタイム検出,分類,追跡パイプラインの開発,解析,実装について述べる。 8msの推算で得られる電力消費量は7.5mWである。

The Internet of Things (IoT) and smart city paradigm includes ubiquitous technology to extract context information in order to return useful services to users and citizens. An essential role in this scenario is often played by computer vision applications, requiring the acquisition of images from specific devices. The need for high-end cameras often penalizes this process since they are power-hungry and their output requires substantial computational resources to process. Thus, the availability of novel low-power vision sensors, implementing advanced features like in-hardware motion detection, is crucial for computer vision in the IoT domain. Unfortunately, to be highly energy-efficient, these sensors might worsen the perception performance (e.g., resolution, frame rate, color). Therefore, domain-specific pipelines are usually delivered in order to exploit the full potential of these cameras. This paper presents the development, analysis, and embedded implementation of a real-time detection, classification and tracking pipeline able to exploit the full potential of background filtering Smart Vision Sensors (SVS). The power consumption obtained for the inference, which requires 8 ms, is 7.5 mW.

翻訳日:2021-02-03 16:23:44 公開日:2021-02-02

# fpga用ハードウェア効率残差ネットワーク

Hardware-efficient Residual Networks for FPGAs ( http://arxiv.org/abs/2102.01351v1 )

ライセンス: Link先を確認

Olivia Weng, Alireza Khodamoradi, Ryan Kastner

(参考訳) 残差ネットワーク(resnets)は、トレーニング収束を改善するために、ネットワーク内のスキップ接続(以前のレイヤからのアクティベーションを再利用する)を採用するが、これらのスキップ接続は、resnetのハードウェア実装の課題を生じさせる。ハードウェアは、より多くの受信データを処理する前に、スキップ接続が処理されるのを待たなければならない。接続をスキップしなければ、ResNetsはよりハードウェア効率が良い。そこで本研究では,NonResNetと呼ばれるネットワークを構築して,ResNetのスキップ接続を段階的に除去する学習手法を提案する。 FPGAで実装すると、NonResNetはResNetのBRAM利用率を9%、LUT利用率を3%、スループットを5%向上させることが示されています。

Residual networks (ResNets) employ skip connections in their networks -- reusing activations from previous layers -- to improve training convergence, but these skip connections create challenges for hardware implementations of ResNets. The hardware must either wait for skip connections to be processed before processing more incoming data or buffer them elsewhere. Without skip connections, ResNets would be more hardware-efficient. Thus, we present a teacher-student learning method to gradually prune away all of a ResNet's skip connections, constructing a network we call NonResNet. We show that when implemented for FPGAs, NonResNet decreases ResNet's BRAM utilization by 9% and LUT utilization by 3% and increases throughput by 5%.

翻訳日:2021-02-03 16:23:05 公開日:2021-02-02

# 常に個人的: デバイス上でのCNNのパーソナライゼーションにEarly Exitsを使う

It's always personal: Using Early Exits for Efficient On-Device CNN Personalisation ( http://arxiv.org/abs/2102.01393v1 )

ライセンス: Link先を確認

Ilias Leontiadis, Stefanos Laskaridis, Stylianos I. Venieris, Nicholas D. Lane

(参考訳) 強力なハードウェアとモデル圧縮技術のおかげで、オンデバイス機械学習は現実的になっています。通常、これらのモデルは大きなGPUクラスタ上で事前訓練され、幅広い入力を一般化するのに十分なパラメータを持つ。この研究では、より小さく、パーソナライズされたモデルを特定のシナリオに適合させることで、高い精度と高速な実行を可能にしている。それでもデバイス上でのトレーニングは非常に困難であり、フラッグシップスマートフォンでも過剰な計算とメモリを必要とする。同時に、デバイス上のデータ可用性は制限され、サンプルのラベルが付けられないことが多い。この目的のために、モデルに早期出口を添付し、デバイス上でそれらをパーソナライズするフレームワークであるPersEPhonEEを紹介します。これにより、よりパーソナライズされたデータが利用可能になると、モデルが計算の大部分を段階的にバイパスすることができる。さらに,ネットワーク全体のパーソナライズ時間のごく一部で,早期出口を半教師付きで訓練する効率的なオンデバイスアルゴリズムを提案する。その結果、PersEPhonEEは、トレーニングコストを最大2.2倍、推論レイテンシを平均2.2-3.2倍まで下げながら、最大15.9%の精度を、デバイス上のラベルの可用性に応じて向上させる。

On-device machine learning is becoming a reality thanks to the availability of powerful hardware and model compression techniques. Typically, these models are pretrained on large GPU clusters and have enough parameters to generalise across a wide variety of inputs. In this work, we observe that a much smaller, personalised model can be employed to fit a specific scenario, resulting in both higher accuracy and faster execution. Nevertheless, on-device training is extremely challenging, imposing excessive computational and memory requirements even for flagship smartphones. At the same time, on-device data availability might be limited and samples are most frequently unlabelled. To this end, we introduce PersEPhonEE, a framework that attaches early exits on the model and personalises them on-device. These allow the model to progressively bypass a larger part of the computation as more personalised data become available. Moreover, we introduce an efficient on-device algorithm that trains the early exits in a semi-supervised manner at a fraction of the whole network's personalisation time. Results show that PersEPhonEE boosts accuracy by up to 15.9% while dropping the training cost by up to 2.2x and inference latency by 2.2-3.2x on average for the same accuracy, depending on the availability of labels on-device.
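The early-exit idea can be sketched as follows: the backbone runs stage by stage, and inference stops at the first exit whose prediction is confident enough. This is a minimal sketch, assuming a confidence-threshold exit rule, which is a common choice; the abstract does not state the rule PersEPhonEE uses, and `early_exit_predict`, `stages`, `exits`, and `threshold` are hypothetical names.

```python
def early_exit_predict(x, stages, exits, threshold=0.9):
    """Return (predicted_class, exit_depth).

    stages: list of callables forming the backbone, applied in order.
    exits: list of callables, one per stage, mapping features to class probabilities.
    The last exit always answers, acting as the full model's classifier.
    """
    features = x
    last = len(stages) - 1
    for depth, (stage, exit_head) in enumerate(zip(stages, exits)):
        features = stage(features)
        probs = exit_head(features)
        confidence = max(probs)
        if confidence >= threshold or depth == last:
            return probs.index(confidence), depth

# Toy two-stage model: the first exit is unsure, the second is confident.
toy_stages = [lambda h: h + 1, lambda h: h + 1]
toy_exits = [lambda h: [0.6, 0.4], lambda h: [0.05, 0.95]]
prediction, depth = early_exit_predict(0, toy_stages, toy_exits, threshold=0.9)
```

As an exit becomes more accurate for a given user, more inputs clear the threshold at a shallow depth, so a larger share of the backbone is skipped.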

翻訳日:2021-02-03 16:22:29 公開日:2021-02-02

# 子どもとコンピュータの相互作用:最近の仕事、新しいデータセット、年齢検出

Child-Computer Interaction: Recent Works, New Dataset, and Age Detection ( http://arxiv.org/abs/2102.01405v1 )

ライセンス: Link先を確認

Ruben Tolosana, Juan Carlos Ruiz-Garcia, Ruben Vera-Rodriguez, Jaime Herreros-Rodriguez, Sergio Romero-Tapiador, Aythami Morales, Julian Fierrez

(参考訳) 子どもとコンピュータの相互作用に関する最近の研究を概観し,その意図する枠組みであるchildciについて述べる。i) モバイルデバイスと対話しながら,子どもの認知と神経運動の発達をよりよく理解すること,ii) e-learning と e-health の新たな応用を可能にすること,など。我々のフレームワークには、新しいモバイルアプリケーション、特定のデータ取得プロトコル、縦断的研究を可能にするために年次拡張が計画されているChildCIデータセット(ChildCIdb v1)の最初のリリースが含まれている。私たちのフレームワークでは、子どもたちはペンスタイラスと指を使ってタブレットデバイスと対話し、異なるレベルの神経運動と認知スキルを必要とする異なるタスクを実行します。 ChildCIdbは18ヶ月から8歳までの400人以上の子供で構成されており、ピアジェの理論の最初の3つの発達段階を考慮しています。さらに,ChildCIフレームワークの可能性の実証として,ChildCIdbが実現した多くの応用の1つとして,デバイスインタラクションに基づく子どもの年齢検出実験を行った。さまざまな機械学習アプローチが評価され、年齢グループを自動的に検出する34のグローバル機能セットを提案し、90%以上の精度結果を達成し、このタスクでより有用な機能の種類に関して興味深い結果を得ます。

We overview recent research in Child-Computer Interaction and describe our framework ChildCI intended for: i) generating a better understanding of the cognitive and neuromotor development of children while interacting with mobile devices, and ii) enabling new applications in e-learning and e-health, among others. Our framework includes a new mobile application, specific data acquisition protocols, and a first release of the ChildCI dataset (ChildCIdb v1), which is planned to be extended yearly to enable longitudinal studies. In our framework children interact with a tablet device, using both a pen stylus and the finger, performing different tasks that require different levels of neuromotor and cognitive skills. ChildCIdb comprises more than 400 children from 18 months to 8 years old, thereby covering the first three development stages of Piaget's theory. In addition, and as a demonstration of the potential of the ChildCI framework, we include experimental results for one of the many applications enabled by ChildCIdb: children's age detection based on device interaction. Different machine learning approaches are evaluated, proposing a new set of 34 global features to automatically detect age groups, achieving accuracy results over 90% and interesting findings in terms of the type of features more useful for this task.

翻訳日:2021-02-03 16:21:46 公開日:2021-02-02

# スケーラブルなマルチラベル画像検索のためのランク一貫性ディープハッシング

Rank-Consistency Deep Hashing for Scalable Multi-Label Image Search ( http://arxiv.org/abs/2102.01486v1 )

ライセンス: Link先を確認

Cheng Ma, Jiwen Lu, Jie Zhou

(参考訳) ハッシュは大規模画像検索においてますます魅力的な技術になりつつあるため、マルチラベルハッシュもマルチレベルのセマンティックコンテンツを活用する能力に注目が集まっている。本稿では,スケーラブルなマルチラベル画像検索のための新しいディープハッシュ法を提案する。コントラストやトリプルトロスといった従来の目的と異なり、全てのサンプルに対して十分なグローバル監視情報を提供するために、ペアやトリプルトではなくランクリストを用いる。具体的には、元の空間とハミング空間の2つの空間からの類似性順序を整列するために、新しい階数整合性目標を適用する。強力な損失関数は、意味的類似性とハミング距離が2つの空間で一致しないサンプルをペナルティ化するように設計されている。また、導関数の簡潔な定式化とともに判別力を高めるために、マルチラベルソフトマックスクロスエントロピー損失が提示される。異なるラベルを持つサンプルの近傍構造を操作するために、サンプルと対応する複数のクラスセンター間の距離を減らすことにより、同じラベルを持つサンプルのハッシュベクトルをクラスタ化するマルチラベルクラスタリングロスを設計します。 MIRFLICKR-25K, IAPRTC12, NUS-WIDEの3つの公開マルチラベルデータセットを用いて, 提案手法の有効性を実証した。

As hashing becomes an increasingly appealing technique for large-scale image retrieval, multi-label hashing is also attracting more attention for the ability to exploit multi-level semantic contents. In this paper, we propose a novel deep hashing method for scalable multi-label image search. Unlike existing approaches with conventional objectives such as contrast and triplet losses, we employ a rank list, rather than pairs or triplets, to provide sufficient global supervision information for all the samples. Specifically, a new rank-consistency objective is applied to align the similarity orders from two spaces, the original space and the hamming space. A powerful loss function is designed to penalize the samples whose semantic similarity and hamming distance are mismatched in two spaces. Besides, a multi-label softmax cross-entropy loss is presented to enhance the discriminative power with a concise formulation of the derivative function. In order to manipulate the neighborhood structure of the samples with different labels, we design a multi-label clustering loss to cluster the hashing vectors of the samples with the same labels by reducing the distances between the samples and their multiple corresponding class centers. The state-of-the-art experimental results achieved on three public multi-label datasets, MIRFLICKR-25K, IAPRTC12 and NUS-WIDE, demonstrate the effectiveness of the proposed method.

翻訳日:2021-02-03 16:21:02 公開日:2021-02-02
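
The rank-consistency idea in the abstract above can be illustrated with a small sketch. It assumes that semantic similarity is simply the number of shared labels, and it only counts item pairs whose order in Hamming space disagrees with their semantic order. The function names are hypothetical and this is not the loss function proposed in the paper, only a way to see what a "mismatch between the two spaces" means.

```python
def hamming(a, b):
    """Number of positions where two equal-length binary codes differ."""
    return sum(x != y for x, y in zip(a, b))


def shared_labels(a, b):
    """Assumed semantic similarity: how many labels two samples share."""
    return len(set(a) & set(b))


def rank_violations(query_code, query_labels, codes, labels):
    """Count ordered pairs (i, j) where item i is semantically closer to the
    query than item j but is not strictly closer in Hamming distance."""
    dists = [hamming(query_code, c) for c in codes]
    sims = [shared_labels(query_labels, l) for l in labels]
    violations = 0
    for i in range(len(codes)):
        for j in range(len(codes)):
            if sims[i] > sims[j] and dists[i] >= dists[j]:
                violations += 1
    return violations


query_code = [0, 0, 0, 0]
query_labels = {"sky", "sea"}
codes = [[0, 0, 0, 1], [0, 1, 1, 1], [0, 0, 1, 1]]
labels = [{"sky", "sea"}, {"sky"}, {"car"}]
print(rank_violations(query_code, query_labels, codes, labels))  # 1
```

In this toy database the second item shares one label with the query but is farther in Hamming space than the third item, which shares none, so exactly one pair is out of order.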

# グラディエントフローの維持:グラディエントフローを用いたスパースネットワーク最適化の研究

Keep the Gradients Flowing: Using Gradient Flow to Study Sparse Network Optimization ( http://arxiv.org/abs/2102.01670v1 )

ライセンス: Link先を確認

Kale-ab Tessera, Sara Hooker, Benjamin Rosman

(参考訳) 密なニューラルネットワークと同じ性能に収束するようにスパースネットワークを訓練することは、困難であることが分かっている。最近の研究は初期化が鍵であることを示唆している。しかし、この研究の方向性は一定の成功を収めているものの、初期化だけに焦点を合わせるのは不十分なようである。本稿では,スパースネットワークの訓練をより広い視点から捉え,スパースモデルにおける正則化,最適化,アーキテクチャ選択の役割について考察する。我々は,スパースネットワークと密なネットワークの公平な比較を可能にする,単純な実験フレームワークであるSame Capacity Sparse vs Dense Comparison (SC-SDC)を提案する。さらに,スパースネットワークの性能とよりよく相関する勾配フローの新たな尺度である有効勾配フロー(EGF)を提案する。トップライン指標,SC-SDC,EGFを用いて,密なネットワークで使用されるオプティマイザ,活性化関数,正則化手法のデフォルトの選択がスパースネットワークに不利になりうることを示す。これらの結果から,スパースネットワークにおける勾配フローは,アーキテクチャ設計とトレーニング方式の側面を再考することで改善できることを示す。私たちの研究は、初期化はパズルの1ピースにすぎず、スパースネットワークに合わせて最適化を調整するというより広い視点を取ることで有望な結果が得られることを示唆している。

Training sparse networks to converge to the same performance as dense neural architectures has proven to be elusive. Recent work suggests that initialization is the key. However, while this direction of research has had some success, focusing on initialization alone appears to be inadequate. In this paper, we take a broader view of training sparse networks and consider the role of regularization, optimization and architecture choices on sparse models. We propose a simple experimental framework, Same Capacity Sparse vs Dense Comparison (SC-SDC), that allows for fair comparison of sparse and dense networks. Furthermore, we propose a new measure of gradient flow, Effective Gradient Flow (EGF), that better correlates to performance in sparse networks. Using top-line metrics, SC-SDC and EGF, we show that default choices of optimizers, activation functions and regularizers used for dense networks can disadvantage sparse networks. Based upon these findings, we show that gradient flow in sparse networks can be improved by reconsidering aspects of the architecture design and the training regime. Our work suggests that initialization is only one piece of the puzzle and taking a wider view of tailoring optimization to sparse networks yields promising results.

翻訳日:2021-02-03 16:19:49 公開日:2021-02-02

# サブサンプリングした半正定値計画によるコミュニティ検出

Community Detection with a Subsampled Semidefinite Program ( http://arxiv.org/abs/2102.01419v1 )

ライセンス: Link先を確認

Pedro Abdalla and Afonso S. Bandeira

(参考訳) 半正定値計画法は、クラスタリングやコミュニティ検出など、データサイエンスと信号処理のいくつかの問題に取り組むための重要なツールである。しかし、半正定値計画問題は実際には解くのが遅いことが多いため、スケッチなどの高速化手法がしばしば検討される。確率的ブロックモデルにおけるコミュニティ検出の文脈において、Mixon と Xie [9] は最近、ネットワークからサブサンプリングした部分グラフ上でのみ半正定値計画問題を解き、計算量を大幅に節約するスケッチフレームワークを提案した。この短い論文では,2つの均衡したコミュニティをもつ確率的ブロックモデルに対するこの手法の統計的限界に関する Mixon と Xie の予想に,肯定的な答えを与える。

Semidefinite programming is an important tool to tackle several problems in data science and signal processing, including clustering and community detection. However, semidefinite programs are often slow in practice, so speed-up techniques such as sketching are often considered. In the context of community detection in the stochastic block model, Mixon and Xie [9] have recently proposed a sketching framework in which a semidefinite program is solved only on a subsampled subgraph of the network, giving rise to significant computational savings. In this short paper, we provide a positive answer to a conjecture of Mixon and Xie about the statistical limits of this technique for the stochastic block model with two balanced communities.

翻訳日:2021-02-03 16:18:28 公開日:2021-02-02

# 対称的ブール因子解析とInstaHideへの応用

Symmetric Boolean Factor Analysis with Applications to InstaHide ( http://arxiv.org/abs/2102.01570v1 )

ライセンス: Link先を確認

Sitan Chen, Zhao Song, Runzhou Tao, Ruizhe Zhang

(参考訳) 本研究では,最近提案された分散学習手法であるInstaHide (Huang et al.) のセキュリティについて検討する。いくつかの最近の研究は、以下の行列分解問題との興味深い関連を利用して、様々な条件下でInstaHideに対する再構成攻撃を与えている: {0,1}^r における m 個のランダムな k-スパース Boolean ベクトルの集まりのグラム行列が与えられたとき、(自明な対称性を除いて)それらのベクトルを復元する。これは、よく研究されているブール因子分析の問題の疎で対称な変種として、あるいは k-一様ハイパーグラフをその線グラフから復元する古典的な問題の平均ケース版として考えることもできる。以前のアルゴリズムは m が k について指数関数的に大きいことを必要とするか、あるいは k = 2 にのみ適用可能であったため、InstaHide が適度に大きな k に対して再構成攻撃に対する何らかの「きめ細かいセキュリティ」を持つかどうかという問題は未解決のままであった。本研究では、上記の行列分解問題に対して単純な O(m^{\omega + 1}) 時間アルゴリズムを与えることで、この問いに否定的に答える。テンソル分解に基づくこのアルゴリズムは、m が r について少なくとも準線形であることしか必要としない。さらに、k-スパースベクトルの集まりが任意に選ばれる最悪ケースの設定に対する準多項式時間アルゴリズムによって、この結果を補完する。

In this work we examine the security of InstaHide, a recently proposed scheme for distributed learning (Huang et al.). A number of recent works have given reconstruction attacks for InstaHide in various regimes by leveraging an intriguing connection to the following matrix factorization problem: given the Gram matrix of a collection of m random k-sparse Boolean vectors in {0,1}^r, recover the vectors (up to the trivial symmetries). Equivalently, this can be thought of as a sparse, symmetric variant of the well-studied problem of Boolean factor analysis, or as an average-case version of the classic problem of recovering a k-uniform hypergraph from its line graph. As previous algorithms either required m to be exponentially large in k or only applied to k = 2, they left open the question of whether InstaHide possesses some form of "fine-grained security" against reconstruction attacks for moderately large k. In this work, we answer this in the negative by giving a simple O(m^{\omega + 1}) time algorithm for the above matrix factorization problem. Our algorithm, based on tensor decomposition, only requires m to be at least quasi-linear in r. We complement this result with a quasipolynomial-time algorithm for a worst-case setting of the problem where the collection of k-sparse vectors is chosen arbitrarily.

翻訳日:2021-02-03 16:17:53 公開日:2021-02-02
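
To make the input of the matrix factorization problem above concrete, the following sketch builds m random k-sparse Boolean vectors in {0,1}^r and their Gram matrix. It only constructs the problem instance; it does not attempt the recovery algorithm from the paper. The helper names and the small values of m, r and k are assumptions chosen for illustration.

```python
import random


def random_k_sparse(r, k, rng):
    """Return a Boolean vector of length r with exactly k ones."""
    support = set(rng.sample(range(r), k))
    return [1 if i in support else 0 for i in range(r)]


def gram_matrix(vectors):
    """Entry (i, j) is the inner product of vectors i and j, which for
    Boolean vectors equals the size of the overlap of their supports."""
    return [[sum(a * b for a, b in zip(u, v)) for v in vectors] for u in vectors]


rng = random.Random(0)   # fixed seed so the example is reproducible
m, r, k = 5, 12, 3       # illustrative sizes only
vectors = [random_k_sparse(r, k, rng) for _ in range(m)]
G = gram_matrix(vectors)
for row in G:
    print(row)
```

Every diagonal entry equals k, and each off-diagonal entry counts how many coordinates two vectors share, which is the only information an attacker sees in this formulation.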

# ラジアル関数を超える深さ分離

Depth separation beyond radial functions ( http://arxiv.org/abs/2102.01621v1 )

ライセンス: Link先を確認

Luca Venturi, Samy Jelassi, Tristan Ozuch, Joan Bruna

(参考訳) ニューラルネットワークの高次元における深さ分離の結果は、高次元 $d$ において、ある種の関数が隠れ層2層のネットワークでは効率的に近似できるが、隠れ層1層のネットワークではできないことを示している。このタイプの既存の結果は、主に放射状または1次元の構造を持つ関数に焦点を当てているが、そのような関数は実際にはあまり現れない。本稿の最初の貢献は、(Eldan and Shamir, 2016)の証明戦略に基づいて、より一般的な関数のクラス、すなわち区分的振動構造を持つ関数にその結果を拡張することである。このような結果の証明に共通するのは、隠れ層1層のネットワークは、フーリエ表現が領域全体に広がる高エネルギー関数を近似できないという事実である。一方、隠れ層1層のニューラルネットワークによる関数近似の既存の結果は、関数がスパースなフーリエ表現を持つことに依存している。領域の選択もまた、近似の上界と下界の間のギャップの原因となる。固定された近似領域、すなわち次元 $d$ における球面 $\mathbb{S}^{d-1}$ に焦点をあてて、隠れ層1層のネットワークで効率的に近似可能な関数と、証明可能な形で近似不可能な関数の両方を、フーリエ展開の観点から特徴づける。

High-dimensional depth separation results for neural networks show that certain functions can be efficiently approximated by two-hidden-layer networks but not by one-hidden-layer ones in high dimension $d$. Existing results of this type mainly focus on functions with an underlying radial or one-dimensional structure, which are usually not encountered in practice. The first contribution of this paper is to extend such results to a more general class of functions, namely functions with piece-wise oscillatory structure, by building on the proof strategy of (Eldan and Shamir, 2016). A common theme in the proof of such results is the fact that one-hidden-layer networks fail to approximate high-energy functions whose Fourier representation is spread in the domain. On the other hand, existing approximation results of a function by one-hidden-layer neural networks rely on the function having a sparse Fourier representation. The choice of the domain also represents a source of gaps between upper and lower approximation bounds. Focusing on a fixed approximation domain, namely the sphere $\mathbb{S}^{d-1}$ in dimension $d$, we provide a characterization of both functions which are efficiently approximable by one-hidden-layer networks and of functions which are provably not, in terms of their Fourier expansion.

翻訳日:2021-02-03 16:17:05 公開日:2021-02-02

# パラメータ化量子回路の容量と量子幾何学

Capacity and quantum geometry of parametrized quantum circuits ( http://arxiv.org/abs/2102.01659v1 )

ライセンス: Link先を確認

Tobias Haug, Kishor Bharti, M. S. Kim

(参考訳) ノイズの多い中規模量子デバイスのポテンシャルを利用するには、ハイブリッド量子古典アルゴリズムを実行するのに最適なタイプの回路を見つけることが不可欠である。主な候補は、現在のデバイスで効果的に実装できるパラメータ化量子回路である。本稿では、パラメータ空間の幾何学的構造を用いて、有効量子次元を通じてこれらの回路の容量と訓練可能性を評価する。有効量子次元は、回路一般の表現力と、特定の初期化戦略の表現力の両方を明らかにする。様々な一般的な回路タイプの表現力を評価し、使用するエンタングルゲートの種類によって顕著な違いがあることを見いだす。特定の回路は、その表現力のスケーリング則によって特徴付けられる。我々は、パラメータ空間の量子幾何学における遷移を特定し、それが深い回路における量子自然勾配の減衰につながることを示す。浅い回路では、量子自然勾配は通常の勾配に比べて桁違いに大きな値になりうるが、どちらも勾配消失に苦しむことがある。固定された回路パラメータの集合からランダム化されたパラメータへと調整することにより、回路が表現力を持ちながら不毛な台地(barren plateau)に悩まされない領域を見つけ、これは回路を初期化する良い方法を示唆する。我々の結果は、変分量子アルゴリズムの改善に向けて、パラメータ化量子回路の理解を深める。

To harness the potential of noisy intermediate-scale quantum devices, it is paramount to find the best type of circuits to run hybrid quantum-classical algorithms. Key candidates are parametrized quantum circuits that can be effectively implemented on current devices. Here, we evaluate the capacity and trainability of these circuits using the geometric structure of the parameter space via the effective quantum dimension, which reveals the expressive power of circuits in general as well as of particular initialization strategies. We assess the representation power of various popular circuit types and find striking differences depending on the type of entangling gates used. Particular circuits are characterized by scaling laws in their expressiveness. We identify a transition in the quantum geometry of the parameter space, which leads to a decay of the quantum natural gradient for deep circuits. For shallow circuits, the quantum natural gradient can be orders of magnitude larger in value compared to the regular gradient; however, both of them can suffer from vanishing gradients. By tuning a fixed set of circuit parameters to randomized ones, we find a region where the circuit is expressive, but does not suffer from barren plateaus, hinting at a good way to initialize circuits. Our results enhance the understanding of parametrized quantum circuits for improving variational quantum algorithms.

翻訳日:2021-02-03 16:16:22 公開日:2021-02-02

# 磁気共鳴脳イメージングにおける転移学習:系統的レビュー

Transfer Learning in Magnetic Resonance Brain Imaging: a Systematic Review ( http://arxiv.org/abs/2102.01530v1 )

ライセンス: Link先を確認

Juan Miguel Valverde, Vandad Imani, Ali Abdollahzadeh, Riccardo De Feo, Mithilesh Prakash, Robert Ciszek, Jussi Tohka

(参考訳) 転移学習とは、関心のあるタスクにおける汎化を改善するために、関連するタスクから知識を獲得することに焦点を当てた機械学習技術を指す。MRIでは、転移学習はMR画像のばらつきに対処する戦略を開発する上で重要である。さらに、転移学習は、関心のあるタスクに関連するタスクを解くために訓練された機械学習モデルを再利用するのに役立つ。我々の目的は、MR脳イメージングに適用される転移学習アプローチについて、研究の方向性、知識のギャップ、応用、および広く使われている戦略を特定することである。MR脳イメージングに転移学習を適用した論文の系統的文献検索を行った。433件の研究をスクリーニングし、タスクの種類、応用、機械学習手法などの関連情報を分類、抽出した。さらに、脳MRIに特有の転移学習アプローチや、プライバシー、未知のターゲットドメイン、ラベルなしデータに取り組んだ他の手法を詳しく調べた。脳MRIタスクに転移学習を適用した論文は129件あった。最も頻繁な応用は認知症関連の分類タスクと脳腫瘍のセグメンテーションであった。論文の大半は畳み込みニューラルネットワーク(CNN)で転移学習を使用していた。明確に脳MRIに特有のアプローチや、プライバシーの問題、未知のターゲットドメイン、ラベルなしデータを考慮したアプローチはごくわずかだった。我々は、広く使われている特定のアプローチをグループ化するための新しい分類を提案した。脳MRIにおける転移学習への関心は高まっている。公開データセットは、アルツハイマー病の診断・予後予測および腫瘍セグメンテーションの人気に貢献している。同様に、事前学習済みCNNが利用可能であることが、その利用を促進している。最後に、調査した研究の大半は、転移学習を適用した後の戦略の解釈を詳細に検討しておらず、他のアプローチとの比較も行っていなかった。

Transfer learning refers to machine learning techniques that focus on acquiring knowledge from related tasks to improve generalization in the tasks of interest. In MRI, transfer learning is important for developing strategies that address the variation in MR images. Additionally, transfer learning is beneficial for reusing machine learning models that were trained to solve tasks related to the task of interest. Our goal is to identify research directions, gaps of knowledge, applications, and widely used strategies among the transfer learning approaches applied in MR brain imaging. We performed a systematic literature search for articles that applied transfer learning to MR brain imaging. We screened 433 studies and we categorized and extracted relevant information, including task type, application, and machine learning methods. Furthermore, we closely examined brain MRI-specific transfer learning approaches and other methods that tackled privacy, unseen target domains, and unlabeled data. We found 129 articles that applied transfer learning to brain MRI tasks. The most frequent applications were dementia-related classification tasks and brain tumor segmentation. A majority of articles utilized transfer learning on convolutional neural networks (CNNs). Only a few approaches were clearly brain MRI specific or considered privacy issues, unseen target domains or unlabeled data. We proposed a new categorization to group specific, widely-used approaches. There is an increasing interest in transfer learning within brain MRI. Public datasets have contributed to the popularity of Alzheimer's diagnostics/prognostics and tumor segmentation. Likewise, the availability of pretrained CNNs has promoted their utilization. Finally, the majority of the surveyed studies did not examine in detail the interpretation of their strategies after applying transfer learning, and did not compare to other approaches.

翻訳日:2021-02-03 16:15:04 公開日:2021-02-02

# 人工知能を用いた医療画像解析のための医療データセット収集

Medical Datasets Collections for Artificial Intelligence-based Medical Image Analysis ( http://arxiv.org/abs/2102.01549v1 )

ライセンス: Link先を確認

Yang Wen

(参考訳) 我々は32の公開データセットを収集し,そのうち28は医用画像,4つは自然画像で,研究を行った。これらのデータセットの画像は、異なるカメラによって撮影されているため、モダリティ、フレームサイズ、容量が異なる。データへのアクセスを容易にするため、ほとんどのデータセットのWebサイトも提供しており、読者がデータセットにたどり着く助けになることを期待している。

We collected 32 public datasets, of which 28 are for medical imaging and 4 are for natural images, to conduct our study. The images in these datasets were captured by different cameras and thus vary in modality, frame size and capacity. For data accessibility, we also provide the websites of most datasets and hope this will help readers reach the datasets.

翻訳日:2021-02-03 16:14:20 公開日:2021-02-02

# 医学的無関係なスタイル転送拡張を用いた計算病理のドメインに依存しない視覚表現の学習

Learning domain-agnostic visual representation for computational pathology using medically-irrelevant style transfer augmentation ( http://arxiv.org/abs/2102.01678v1 )

ライセンス: Link先を確認

Rikiya Yamashita, Jin Long, Snikitha Banda, Jeanne Shen, Daniel L. Rubin

(参考訳) 未知のデータに対する機械学習モデルの汎化性能が不十分であることは、そのようなモデルの医用画像への臨床応用を妨げる重要な課題である。ドメイン適応やドメイン汎化といった様々な方法がこの課題に対処するために発展してきたが、堅牢で汎化可能な表現の学習は医用画像理解の中核であり、現在も問題であり続けている。本稿では,計算病理学においてドメインに依存しない視覚表現を学習するための,芸術絵画からのランダムなスタイル転送に基づくデータ拡張の一形態であるSTRAP(Style TRansfer Augmentation for histoPathology)を提案する。スタイル転送は、高レベルの意味内容を維持しながら、画像の低レベルのテクスチャ内容を、ランダムに選択された芸術絵画の情報を持たないスタイルに置き換える。これにより、ドメインシフトに対する堅牢性が向上し、ドメインに依存しない表現を学ぶためのシンプルで強力なツールとして使用できる。デジタル化された組織病理画像を用いて大腸癌のマイクロサテライト状態を予測するという特定の分類タスクにおいて,STRAPが,特にドメインシフトがある場合に,最先端の性能をもたらすことを示す。

Suboptimal generalization of machine learning models on unseen data is a key challenge which hampers the clinical applicability of such models to medical imaging. Although various methods such as domain adaptation and domain generalization have evolved to combat this challenge, learning robust and generalizable representations is core to medical image understanding, and continues to be a problem. Here, we propose STRAP (Style TRansfer Augmentation for histoPathology), a form of data augmentation based on random style transfer from artistic paintings, for learning domain-agnostic visual representations in computational pathology. Style transfer replaces the low-level texture content of images with the uninformative style of randomly selected artistic paintings, while preserving high-level semantic content. This improves robustness to domain shift and can be used as a simple yet powerful tool for learning domain-agnostic representations. We demonstrate that STRAP leads to state-of-the-art performance, particularly in the presence of domain shifts, on a particular classification task of predicting microsatellite status in colorectal cancer using digitized histopathology images.

翻訳日:2021-02-03 16:13:53 公開日:2021-02-02

# (参考訳) NeMo: ロバストな3次元ポーズ推定のためのコントラスト特徴のニューラルメッシュモデル

NeMo: Neural Mesh Models of Contrastive Features for Robust 3D Pose Estimation ( http://arxiv.org/abs/2101.12378v2 )

ライセンス: CC BY 4.0

Angtian Wang, Adam Kortylewski, Alan Yuille

(参考訳) 3Dポーズ推定はコンピュータビジョンにおいて難しいが重要な課題である。本研究では,3Dポーズ推定における標準的な深層学習手法が,物体が部分的に遮蔽されている場合や,これまで見たことのないポーズから観測される場合に,堅牢ではないことを示す。生成的視覚モデルの部分遮蔽に対するロバスト性に着想を得て,ディープニューラルネットワークと物体の3次元生成表現を,NeMoと呼ぶ統一的なニューラルアーキテクチャに統合することを提案する。特にNeMoは、密な3Dメッシュ上の各頂点におけるニューラル特徴活性化の生成モデルを学習する。微分可能レンダリングを用いて、NeMoとターゲット画像の特徴表現との再構成誤差を最小化することにより、3Dオブジェクトのポーズを推定する。再構成損失の局所最適解を避けるために,コントラスト学習を用いて,メッシュ上の個々の特徴表現間の距離を最大化するように特徴抽出器を訓練する。PASCAL3D+、Occluded-PASCAL3D+およびObjectNet3Dに関する広範な実験により、NeMoは通常のデータ上での競争力のある性能を維持しながら、標準的なディープネットワークに比べて、部分遮蔽や未知のポーズに対してはるかに堅牢であることが示された。興味深いことに、私たちの実験では、メッシュ表現が真の物体形状を直方体で粗く近似するだけであっても、NeMoがそれなりにうまく機能することも示されており、正確な3Dポーズ推定には詳細な3D形状が必要ないことが明らかになった。コードは https://github.com/Angtian/NeMo で公開されている。

3D pose estimation is a challenging but important task in computer vision. In this work, we show that standard deep learning approaches to 3D pose estimation are not robust when objects are partially occluded or viewed from a previously unseen pose. Inspired by the robustness of generative vision models to partial occlusion, we propose to integrate deep neural networks with 3D generative representations of objects into a unified neural architecture that we term NeMo. In particular, NeMo learns a generative model of neural feature activations at each vertex on a dense 3D mesh. Using differentiable rendering we estimate the 3D object pose by minimizing the reconstruction error between NeMo and the feature representation of the target image. To avoid local optima in the reconstruction loss, we train the feature extractor to maximize the distance between the individual feature representations on the mesh using contrastive learning. Our extensive experiments on PASCAL3D+, occluded-PASCAL3D+ and ObjectNet3D show that NeMo is much more robust to partial occlusion and unseen pose compared to standard deep networks, while retaining competitive performance on regular data. Interestingly, our experiments also show that NeMo performs reasonably well even when the mesh representation only crudely approximates the true object geometry with a cuboid, hence revealing that the detailed 3D geometry is not needed for accurate 3D pose estimation. The code is publicly available at https://github.com/Angtian/NeMo.

翻訳日:2021-02-03 13:08:18 公開日:2021-02-02

# (参考訳) 天空画像からの深層学習照度予測モデルのベンチマーク -詳細な分析-

Benchmarking of Deep Learning Irradiance Forecasting Models from Sky Images -- an in-depth Analysis ( http://arxiv.org/abs/2102.00721v2 )

ライセンス: CC BY 4.0

Quentin Paletta, Guillaume Arbod and Joan Lasenby

(参考訳) スマートグリッド、発電所の運用、ハイブリッドシステム管理、エネルギー取引など多くの産業応用は、ソーラーパネルからの断続的なエネルギー生産に対応するため、短期的な日射予測の改善の恩恵を受ける可能性がある。しかし、天空画像から雲量のダイナミクスをモデル化する現在のアプローチは、雲の空間的配置、時間的ダイナミクス、太陽放射との物理的相互作用に関する精度がまだ不足している。大規模データセットの増加を背景に、これらの制限に対処するためのデータ駆動手法が開発されつつあり、有望な結果が得られている。本研究では、半球状の天空画像と外生変数のシーケンスから日射量を予測するように訓練された、一般的に使用される4つのDeep Learningアーキテクチャを比較する。各モデルの相対的な性能を評価するために、スマートパーシステンスモデルに基づく予測スキル(Forecast Skill)指標と、ランプおよび時間歪みの指標を使用した。その結果、天空画像列の時空間的側面をエンコードすることで予測が大幅に改善され、テスト年における10分先の予測スキルは20.4%に達した。しかし、実験データに基づいて、一般的な設定では深層学習モデルは「非常にスマートなパーシステンスモデル」として振る舞う傾向があり、最もペナルティの大きい誤差を緩和しつつも、時間的にはパーシステンスモデルと揃ってしまうと結論づける。したがって、スカイカメラで捉えられているにもかかわらず、モデルは太陽を遮る雲のような大きな日射量変化を引き起こす基本的な事象をしばしば見逃す。我々の研究が、日射量予測に対するこのアプローチを、反応的なものから先読み的なものへと転換することに貢献できることを願っている。

A number of industrial applications, such as smart grids, power plant operation, hybrid system management or energy trading, could benefit from improved short-term solar forecasting, addressing the intermittent energy production from solar panels. However, current approaches to modelling the cloud cover dynamics from sky images still lack precision regarding the spatial configuration of clouds, their temporal dynamics and physical interactions with solar radiation. Benefiting from a growing number of large datasets, data driven methods are being developed to address these limitations with promising results. In this study, we compare four commonly used Deep Learning architectures trained to forecast solar irradiance from sequences of hemispherical sky images and exogenous variables. To assess the relative performance of each model, we used the Forecast Skill metric based on the smart persistence model, as well as ramp and time distortion metrics. The results show that encoding spatiotemporal aspects of the sequence of sky images greatly improved the predictions with 10 min ahead Forecast Skill reaching 20.4% on the test year. However, based on the experimental data, we conclude that, with a common setup, Deep Learning models tend to behave just as a 'very smart persistence model', temporally aligned with the persistence model while mitigating its most penalising errors. Thus, despite being captured by the sky cameras, models often miss fundamental events causing large irradiance changes such as clouds obscuring the sun. We hope that our work will contribute to a shift of this approach to irradiance forecasting, from reactive to anticipatory.

翻訳日:2021-02-03 12:52:29 公開日:2021-02-02
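
The Forecast Skill metric mentioned above compares a model against a reference forecast. A minimal sketch follows, assuming the common RMSE-based definition `1 - RMSE_model / RMSE_reference`; the abstract does not spell out the formula, so this definition is an assumption. The reference forecasts (for example from a smart persistence model) are passed in as plain numbers rather than computed, and the irradiance values are made up for illustration.

```python
import math


def rmse(predicted, observed):
    """Root mean squared error between two equal-length sequences."""
    squared = [(p - o) ** 2 for p, o in zip(predicted, observed)]
    return math.sqrt(sum(squared) / len(squared))


def forecast_skill(model_pred, reference_pred, observed):
    """Assumed definition: 1 - RMSE(model) / RMSE(reference).
    0 means no better than the reference, 1 means a perfect forecast,
    and negative values mean worse than the reference."""
    return 1.0 - rmse(model_pred, observed) / rmse(reference_pred, observed)


observed = [100.0, 200.0, 300.0]    # illustrative irradiance values
reference = [110.0, 190.0, 310.0]   # hypothetical reference forecast
model = [105.0, 195.0, 305.0]       # hypothetical model forecast
print(forecast_skill(model, reference, observed))  # 0.5
```

Here the model halves the reference error, which corresponds to a skill of 50%. Under the same assumed definition, the 20.4% reported in the abstract would correspond to an error about one fifth lower than the reference.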

# M2FN:広告画像評価のためのマルチステップモダリティ融合

M2FN: Multi-step Modality Fusion for Advertisement Image Assessment ( http://arxiv.org/abs/2102.00441v2 )

ライセンス: Link先を確認

Kyung-Wha Park (1), Jung-Woo Ha (2), JungHoon Lee (3), Sunyoung Kwon (4), Kyung-Min Kim (2), Byoung-Tak Zhang (1 and 5 and 6) ((1) Interdisciplinary Program in Neuroscience, Seoul National University., (2) NAVER AI LAB, NAVER CLOVA., (3) Statistics and Actuarial Science, Soongsil University., (4) School of Biomedical Convergence Engineering, Pusan National University., (5) Department of Computer Science and Engineering, Seoul National University., (6) Surromind Robotics.)

(参考訳) 広告を評価すること、特にユーザーの嗜好と広告品質に基づいて評価することは、マーケティング業界にとって極めて重要である。近年の研究では、この目的にディープニューラルネットワークを利用する試みがなされているが、これらの研究では画像関連の補助属性(広告画像に頻繁に見られる埋め込みテキストを含む)は使用されていない。そこで,これらの属性が広告画像の嗜好に与える影響を検討した。まず, 大規模な実世界の広告ログデータを分析し, その知見に基づいて, ユーザの嗜好に訴えそうな広告画像を判定する新しいマルチステップモダリティ融合ネットワーク (M2FN) を提案した。本手法は,条件付きバッチ正規化に基づく低レベル融合と注意機構に基づく高レベル融合を含む,ネットワーク内の複数のステップを通じて補助属性を利用する。美的画像評価に広く使用されているAVAデータセットでM2FNを検証し、さらに豊富な補助属性を持つ実世界の広告データセットを用いて、M2FNが嗜好予測において最先端の性能を達成できることを実証した。

Assessing advertisements, specifically on the basis of user preferences and ad quality, is crucial to the marketing industry. Although recent studies have attempted to use deep neural networks for this purpose, these studies have not utilized image-related auxiliary attributes, which include embedded text frequently found in ad images. We, therefore, investigated the influence of these attributes on ad image preferences. First, we analyzed large-scale real-world ad log data and, based on our findings, proposed a novel multi-step modality fusion network (M2FN) that determines advertising images likely to appeal to user preferences. Our method utilizes auxiliary attributes through multiple steps in the network, which include conditional batch normalization-based low-level fusion and attention-based high-level fusion. We verified M2FN on the AVA dataset, which is widely used for aesthetic image assessment, and then demonstrated that M2FN can achieve state-of-the-art performance in preference prediction using a real-world ad dataset with rich auxiliary attributes.

翻訳日:2021-02-03 12:49:30 公開日:2021-02-02

# 誰のための公平? テキスト要約における読者の公平性認識の理解

Fairness for Whom? Understanding the Reader's Perception of Fairness in Text Summarization ( http://arxiv.org/abs/2101.12406v2 )

ライセンス: Link先を確認

Anurag Shandilya, Abhisek Dash, Abhijnan Chakraborty, Kripabandhu Ghosh, Saptarshi Ghosh

(参考訳) ユーザが生成するテキスト情報の急増に伴い、近年、膨大なコンテンツの概要を提供するための要約アルゴリズムの利用が増加している。これらのアルゴリズムを評価するための伝統的な指標(例: ROUGEスコア)は、アルゴリズムによる要約と人間が作成した要約を照合することに依拠している。しかし、テキストの内容が異質である場合、例えば、社会的に顕著な異なるグループから来ている場合、既存の要約アルゴリズムのほとんどは、元のデータにおける分布と比べて社会集団を大きく異なる形で表現することが示されている。このような悪影響を軽減するため、公平性を保つ要約アルゴリズムも提案されている。これらの研究はすべて、内容の書き手の視点から公平性の規範的な概念を検討しており、根底にある公平性の概念に対する読者の認識を無視している。このギャップを埋めるため,本研究では,公平性の概念と、読者がテキスト要約においてそれをどのように認識するかとの相互作用を研究する。実験により,読者の公平性の認識は文脈に敏感な場合が多いことを示す。さらに、標準的なROUGE評価指標は、要約について知覚される公平性・不公平性を定量化できない。そこで本研究では,テキスト要約において知覚されるバイアスを定量化するための,人間参加型(human-in-the-loop)の指標とグラフベースの自動手法を提案する。異質な社会政治的マイクロブログデータセットのいくつかの要約の公平性・不公平性を定量化することで,その有用性を示す。

With the surge in user-generated textual information, there has been a recent increase in the use of summarization algorithms for providing an overview of the extensive content. Traditional metrics for evaluation of these algorithms (e.g. ROUGE scores) rely on matching algorithmic summaries to human-generated ones. However, it has been shown that when the textual contents are heterogeneous, e.g., when they come from different socially salient groups, most existing summarization algorithms represent the social groups very differently compared to their distribution in the original data. To mitigate such adverse impacts, some fairness-preserving summarization algorithms have also been proposed. All of these studies have considered normative notions of fairness from the perspective of writers of the contents, neglecting the readers' perceptions of the underlying fairness notions. To bridge this gap, in this work, we study the interplay between the fairness notions and how readers perceive them in textual summaries. Through our experiments, we show that reader's perception of fairness is often context-sensitive. Moreover, standard ROUGE evaluation metrics are unable to quantify the perceived (un)fairness of the summaries. To this end, we propose a human-in-the-loop metric and an automated graph-based methodology to quantify the perceived bias in textual summaries. We demonstrate their utility by quantifying the (un)fairness of several summaries of heterogeneous socio-political microblog datasets.

翻訳日:2021-02-03 12:48:50 公開日:2021-02-02

# 強化学習のためのポリシーミラー降下:線形収束、新しいサンプリング複雑性、一般化問題クラス

Policy Mirror Descent for Reinforcement Learning: Linear Convergence, New Sampling Complexity, and Generalized Problem Classes ( http://arxiv.org/abs/2102.00135v2 )

ライセンス: Link先を確認

Guanghui Lan

(参考訳) 本稿では,強凸あるいは一般の凸正則化項を持つ強化学習(RL)問題を解くための新しいポリシーミラー降下(PMD)法を提案する。一見すると高度に非凸なこれらの問題の構造的性質を調べることにより、PMD法が大域的最適性への高速な線形収束率を示すことを示す。これらの方法の確率的な対応物を開発し、異なるサンプリングスキームを用いて、強凸(resp., 一般の凸)正則化項を持つこれらのRL問題を解くための ${\cal O}(1/\epsilon)$ (resp., ${\cal O}(1/\epsilon^2)$) のサンプリング複雑性を確立する。ここで $\epsilon$ は目標精度を表す。さらに,必要な場合にこれらの正則化項の勾配を計算するための複雑性は,強凸(resp., 一般の凸)正則化項を持つ問題に対して,${\cal O}\{(\log_\gamma \epsilon) [(1-\gamma)L/\mu]^{1/2}\log (1/\epsilon)\}$ (resp., ${\cal O} \{(\log_\gamma \epsilon ) [(1-\gamma)L/\epsilon]^{1/2}\}$) で抑えられることを示す。ここで $\gamma$ は割引率を表す。我々の知る限り、これらの複雑性の上界は、我々のアルゴリズムの開発とともに、最適化とRLの文献の両方において新しいものと思われる。これらの凸正則化項の導入は、RLモデルの柔軟性と適用性も大きく広げる。

We present new policy mirror descent (PMD) methods for solving reinforcement learning (RL) problems with either strongly convex or general convex regularizers. By exploring the structural properties of these seemingly highly nonconvex problems, we show that the PMD methods exhibit a fast linear rate of convergence to global optimality. We develop stochastic counterparts of these methods, and establish an ${\cal O}(1/\epsilon)$ (resp., ${\cal O}(1/\epsilon^2)$) sampling complexity for solving these RL problems with strongly (resp., general) convex regularizers using different sampling schemes, where $\epsilon$ denotes the target accuracy. We further show that the complexity for computing the gradients of these regularizers, if necessary, can be bounded by ${\cal O}\{(\log_\gamma \epsilon) [(1-\gamma)L/\mu]^{1/2}\log (1/\epsilon)\}$ (resp., ${\cal O} \{(\log_\gamma \epsilon ) [(1-\gamma)L/\epsilon]^{1/2}\}$) for problems with strongly (resp., general) convex regularizers. Here $\gamma$ denotes the discounting factor. To the best of our knowledge, these complexity bounds, along with our algorithmic developments, appear to be new in both the optimization and RL literature. The introduction of these convex regularizers also greatly expands the flexibility and applicability of RL models.

翻訳日:2021-02-03 12:48:10 公開日:2021-02-02
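
As a rough illustration of what a mirror descent step on a policy looks like, the sketch below performs one update for a single state. It assumes the Kullback-Leibler divergence as the Bregman distance, action values expressed as costs (lower is better) and no regularizer, in which case the step has the closed form `new_policy(a) ∝ policy(a) * exp(-step_size * Q(a))`. The abstract does not state these choices, so they are assumptions, and the function name and numbers are hypothetical.

```python
import math


def pmd_step(policy, q_values, step_size):
    """One KL-based mirror descent update of a policy at a single state.

    policy:    current action probabilities (must sum to 1)
    q_values:  estimated cost-to-go of each action (lower is better)
    step_size: positive step size
    """
    weights = [p * math.exp(-step_size * q) for p, q in zip(policy, q_values)]
    total = sum(weights)
    return [w / total for w in weights]


policy = [1 / 3, 1 / 3, 1 / 3]
q_values = [1.0, 2.0, 3.0]     # hypothetical action costs
for _ in range(3):
    policy = pmd_step(policy, q_values, step_size=1.0)
    print([round(p, 3) for p in policy])
```

Each step shifts probability mass multiplicatively toward the cheapest action while keeping the result a valid probability distribution, which is the behavior a multiplicative-weights style update is expected to show.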

# 一般化非定常バンディット

Generalized non-stationary bandits ( http://arxiv.org/abs/2102.00725v2 )

ライセンス: Link先を確認

Anne Gael Manegueu, Alexandra Carpentier and Yi Yu

(参考訳) 本稿では,スイッチングバンディット問題を一般化する非定常確率的バンディット問題について検討する。スイッチングバンディット問題(Case a)に加えて、我々は3つの具体的な例に関心がある: (b) アームの平均が局所的に多項式である場合、(c) アームの平均が局所的に滑らかである場合、(d) アームのギャップが有界な個数の変曲点を持ち、かつ最も高いアームの平均が短い範囲ではあまり変化しない場合。これらの3つの設定は非常に異なるが、次の点が共通している: (i) ギャップの対数の、同程度の大きさのレベル集合の数を制御でき、(ii) 最も高い平均は急激な変化の回数が限られており、それ以外では変動が限られている。この一般的な設定において、特に上記の4つの問題 (a)-(d) を効率的かつ統一的に解く単一のアルゴリズムを提案する。

In this paper, we study a non-stationary stochastic bandit problem, which generalizes the switching bandit problem. On top of the switching bandit problem (Case a), we are interested in three concrete examples: (b) the means of the arms are local polynomials, (c) the means of the arms are locally smooth, and (d) the gaps of the arms have a bounded number of inflexion points and where the highest arm mean cannot vary too much in a short range. These three settings are very different, but have in common the following: (i) the number of similarly-sized level sets of the logarithm of the gaps can be controlled, and (ii) the highest mean has a limited number of abrupt changes, and otherwise has limited variations. We propose a single algorithm in this general setting, that in particular solves in an efficient and unified way the four problems (a)-(d) mentioned.

翻訳日:2021-02-03 12:47:02 公開日:2021-02-02

# 無線画像伝送のためのSNR適応型深層結合情報源・通信路符号化

SNR-adaptive deep joint source-channel coding for wireless image transmission ( http://arxiv.org/abs/2102.00202v2 )

ライセンス: Link先を確認

Mingze Ding and Jiahui Li and Mengyao Ma and Xiaopeng Fan

(参考訳) 本論文では,ノイズのある通信路上での画像のマルチユーザ伝送のための結合情報源・通信路符号化(JSCC)の問題を考え,オートエンコーダに基づく新しい深層結合情報源・通信路符号化方式を提案する。提案するJSCC方式では,デコーダが信号対雑音比(SNR)を推定し,それを用いて送信画像を適応的に復号できる。実験により,提案方式は異なるSNRへの適応性において優れた結果を達成し,デコーダによるSNRの推定誤差に対して頑健であることが示された。我々の知る限りでは、これは異なるSNRへの適応性に焦点を当て、マルチユーザシナリオに適用できる最初の深層JSCC方式である。

Considering the problem of joint source-channel coding (JSCC) for multi-user transmission of images over noisy channels, an autoencoder-based novel deep joint source-channel coding scheme is proposed in this paper. In the proposed JSCC scheme, the decoder can estimate the signal-to-noise ratio (SNR) and use it to adaptively decode the transmitted image. Experiments demonstrate that the proposed scheme achieves impressive results in adaptability for different SNRs and is robust to the decoder's estimation error of the SNR. To the best of our knowledge, this is the first deep JSCC scheme that focuses on the adaptability for different SNRs and can be applied to multi-user scenarios.

翻訳日:2021-02-03 12:46:17 公開日:2021-02-02
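
For readers unfamiliar with how a target SNR translates into channel noise, here is a small sketch of an additive white Gaussian noise (AWGN) channel. The abstract only says "noisy channels", so the AWGN model, the function names and the toy signal are assumptions for illustration; nothing here reproduces the paper's encoder or decoder.

```python
import math
import random


def noise_power_for_snr(signal, snr_db):
    """Noise variance that yields the requested SNR (in dB) for this signal."""
    signal_power = sum(x * x for x in signal) / len(signal)
    return signal_power / (10 ** (snr_db / 10))


def awgn_channel(signal, snr_db, rng):
    """Add zero-mean Gaussian noise scaled to the requested SNR."""
    sigma = math.sqrt(noise_power_for_snr(signal, snr_db))
    return [x + rng.gauss(0.0, sigma) for x in signal]


rng = random.Random(0)                 # fixed seed for reproducibility
signal = [1.0, -1.0, 1.0, -1.0]        # toy transmitted symbols
received = awgn_channel(signal, snr_db=10.0, rng=rng)
print(noise_power_for_snr(signal, 10.0))
print(received)
```

A decoder that adapts to the SNR would receive an estimate of `snr_db` alongside `received`; how that estimate is produced and used is specific to the paper and is not sketched here.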

# 分布型モンテカルロ木探索によるリスク認識と多目的意思決定

Risk Aware and Multi-Objective Decision Making with Distributional Monte Carlo Tree Search ( http://arxiv.org/abs/2102.00966v2 )

ライセンス: Link先を確認

Conor F. Hayes, Mathieu Reymond, Diederik M. Roijers, Enda Howley, Patrick Mannion

(参考訳) 多くのリスク考慮型および多目的の強化学習設定において、ユーザの効用はポリシーの1回の実行から導かれる。これらの設定では、将来のリターンの平均に基づいて決定を行うことは適切ではない。例えば、医療現場では、患者が病気を治療する機会は1回しかないかもしれない。決定を行う場合、期待リターン(強化学習では価値として知られる)だけでは、決定がもたらしうる有害あるいは好ましい結果の範囲を考慮できない。我々の重要な洞察は、エージェントが決定時に必要とする重要な情報を表現するために、期待される将来のリターン上の分布を別の形で用いるべきだということである。本論文では,個々のポリシー実行から得られる様々なリターンの効用についての事後分布を学習するアルゴリズムである分布型モンテカルロ木探索を提案する。これにより、リスク考慮型と多目的の両方の設定において良いポリシーが得られる。さらに,本アルゴリズムは,リターンの期待効用に関する多目的強化学習において,最先端の手法よりも優れている。

In many risk-aware and multi-objective reinforcement learning settings, the utility of the user is derived from the single execution of a policy. In these settings, making decisions based on the average future returns is not suitable. For example, in a medical setting a patient may only have one opportunity to treat their illness. When making a decision, just the expected return -- known in reinforcement learning as the value -- cannot account for the potential range of adverse or positive outcomes a decision may have. Our key insight is that we should use the distribution over expected future returns differently to represent the critical information that the agent requires at decision time. In this paper, we propose Distributional Monte Carlo Tree Search, an algorithm that learns a posterior distribution over the utility of the different possible returns attainable from individual policy executions, resulting in good policies for both risk-aware and multi-objective settings. Moreover, our algorithm outperforms the state-of-the-art in multi-objective reinforcement learning for the expected utility of the returns.

Translated: 2021-02-03 12:45:42 Published: 2021-02-02

# (Reference translation) Fairness through Optimization ( http://arxiv.org/abs/2102.00311v2 )

License: CC BY 4.0

Violet Xinying Chen, J.N. Hooker

We propose optimization as a general paradigm for formalizing fairness in AI-based decision models. We argue that optimization models allow formulation of a wide range of fairness criteria as social welfare functions, while enabling AI to take advantage of highly advanced solution technology. We show how optimization models can assist fairness-oriented decision making in the context of neural networks, support vector machines, and rule-based systems by maximizing a social welfare function subject to appropriate constraints. In particular, we state tractable optimization models for a variety of functions that measure fairness or a combination of fairness and efficiency. These include several inequality metrics, Rawlsian criteria, the McLoone and Hoover indices, alpha fairness, the Nash and Kalai-Smorodinsky bargaining solutions, combinations of Rawlsian and utilitarian criteria, and statistical bias measures. All of these models can be efficiently solved by linear programming, mixed integer/linear programming, or (in two cases) specialized convex programming methods.

Translated: 2021-02-03 12:44:55 Published: 2021-02-02

PDF registration status (publication date: 20210202)