Fugu-MT: arxivの論文翻訳

このサイトではarxivの論文のうち、30ページ以下でCreative Commonsライセンス（CC 0, CC BY, CC BY-SA）の論文を日本語訳しています。本文がCCでない論文、長すぎる論文はメタデータのみを翻訳しています。（arxivのメタデータは CC 0です。）翻訳文のライセンスはCC BY-SA 4.0です。翻訳にはFugu-Machine Translatorを利用しています。

本サイトの運営者は本サイト（すべての情報・翻訳含む）の品質を保証せず、本サイト（すべての情報・翻訳含む）を使用して発生したあらゆる結果について一切の責任を負いません。

公開日が20200204となっている論文です。

Title	Authors	Abstract	論文公表日・翻訳日
# SiCにおけるSi空隙スピン量子ビットの局所振動モード Local vibrational modes of Si vacancy spin qubits in SiC ( http://arxiv.org/abs/2002.00067v2 ) ライセンス: Link先を確認	Z. Shang, A. Hashemi, Y. Berenc\'en, H.-P. Komsa, P. Erhart, A. V. Krasheninnikov, G. V. Astakhov	(参考訳) シリコン炭化物は、この技術的にフレンドリーな材料における点欠陥の異常スピンと光学的性質のため、量子応用にとって非常に有望なプラットフォームである。これらの性質は結晶振動の影響を強く受けているが、スピン量子ビットの挙動とそれらの正確な関係は十分に研究されていない。成長した4H-SiCにおけるSi空隙スピン量子ビットの局所振動モードを明らかにする。共振マイクロ波場を用いて1種類の欠陥、いわゆるV2中心からの寄与を分離し、7つの等分されたフォノンレプリカとともにゼロフォノン線を観測する。さらに, 実験データとよく一致したフォトルミネッセンスライン形状の第一原理計算について述べる。計算精度を高め,計算時間を短縮するために,機械学習アルゴリズムを用いて力定数を抽出する。これにより、Si空孔における光学放出中に励起電子と結合した格子振動の支配的なモードを特定できる。 36MeVの共鳴フォノンエネルギーとDebye-Waller因子の約6%を得る。我々は光誘起スピン偏光の活性化エネルギーが局所振動エネルギーによって与えられることを実験的に確立した。本研究は,sicスピン量子ビットにおける電子状態と振動モードのカップリングに関する知見を与え,スピン,光学,機械的,熱的性質の予測に不可欠である。このアプローチは、sicおよび他の3dおよび2d材料にスペクトル重なりのある多くのスピン欠陥に適用することができる。 Silicon carbide is a very promising platform for quantum applications because of extraordinary spin and optical properties of point defects in this technologically-friendly material. These properties are strongly influenced by crystal vibrations, but the exact relationship between them and the behavior of spin qubits is not fully investigated. We uncover the local vibrational modes of the Si vacancy spin qubits in as-grown 4H-SiC. We apply the resonant microwave field to isolate the contribution from one particular type of defects, the so-called V2 center, and observe the zero-phonon line together with seven equally-separated phonon replicas. Furthermore, we present first-principles calculations of the photoluminescence lineshape, which are in excellent agreement with our experimental data. To boost up the calculation accuracy and decrease the computation time, we extract the force constants using machine learning algorithms. This allows us to identify dominant modes in the lattice vibrations coupled to an excited electron during optical emission in the Si vacancy. The resonance phonon energy of 36 meV and the Debye-Waller factor of about 6% are obtained. We establish experimentally that the activation energy of the optically-induced spin polarization is given by the local vibrational energy. Our findings give insight into the coupling of electronic states to vibrational modes in SiC spin qubits, which is essential to predict their spin, optical, mechanical and thermal properties. The approach described can be applied to a large variety of spin defects with spectrally overlapped contributions in SiC as well as in other 3D and 2D materials.	翻訳日:2023-06-05 02:30:05 公開日:2020-02-04
# double-raman singlet と doublet light-matter scheme を用いたオフ軸光渦 Off-axis optical vortices using double-Raman singlet and doublet light-matter schemes ( http://arxiv.org/abs/2002.00504v2 ) ライセンス: Link先を確認	Hamid Reza Hamedi, Julius Ruseckas, Emmanuel Paspalakis and Gediminas Juzeli\=unas	(参考訳) ダブルラマンゲイン原子媒体内で伝播するオフ軸光渦の形成について検討した。原子は2つの弱いプローブ場と、軌道角運動量(oam)を運ぶ2つの強いポンプビームと相互作用する。我々は、強力なポンプレーザーの1つだけがOAMを搭載している状況を考える。物質に結合したプローブ場の特定の重ね合わせは、シフト軸を持つ特定の光学的渦を形成する。このようなオフ軸渦は、2光子デチューニングの値に応じて、サブまたはスーパールミナル群速度で媒体内を伝播することができる。ポンプフィールドのエネルギーがプローブフィールドに伝達されるため、スーパールミネッセンス光渦は増幅と関連付けられる。周辺渦の位置は、ポンプ場のoamと強度によって操作できる。第2プローブフィールドの振幅が原子雲の先頭でゼロであれば、個々のプローブビームとポンプフィールドの間で光渦の交換が可能であることを示す。このモデルは4つのポンプ場と相互作用するより複雑なダブルラマン・ダブルトまで拡張されている。ダブルラマン・シンレットとは対照的に、ゼロ2光子デチューニングでもオフ軸サブまたは超光渦の生成が可能になった。 We study the formation of off-axis optical vortices propagating inside a double-Raman gain atomic medium. The atoms interact with two weak probe fields as well as two strong pump beams which can carry orbital angular momentum (OAM). We consider a situation when only one of the strong pump lasers carries an OAM. A particular superposition of probe fields coupled to the matter is shown to form specific optical vortices with shifted axes. Such off-axis vortices can propagate inside the medium with sub- or superluminal group velocity depending on the value of the two-photon detuning. The superluminal optical vortices are associated with the amplification as the energy of pump fields is transferred to the probe fields. The position of the peripheral vortices can be manipulated by the OAM and intensity of the pump fields. We show that the exchange of optical vortices is possible between individual probe beams and the pump fields when the amplitude of the second probe field is zero at the beginning of the atomic cloud. The model is extended to a more complex double Raman doublet interacting with four pump fields. In contrast to the double-Raman-singlet, now the generation of the off-axis sub- or superluminal optical vortices is possible even for zero two-photon detuning.	翻訳日:2023-06-05 00:18:17 公開日:2020-02-04
# オプトメカニカルキャビティにおけるモードロック Mode locking in an optomechanical cavity ( http://arxiv.org/abs/2002.01157v1 ) ライセンス: Link先を確認	Eyal Buks, Roei Levi and Ivar Martin	(参考訳) 本研究では, メカニカル共振器ミラーと光増幅器を統合した光リングキャビティを実験的に検討した。この装置は同期や自励振動を含む様々な興味深い非線形効果を示す。光リングキャビティの周波数が懸架ミラーの機械的周波数に非常に近い場合に受動的に発生する光パルスを観測する。このメカニカルモードロックのしきい値における光パワーは、光増幅器の量子ノイズと関連していることがわかった。 We experimentally study a fiber-based optical ring cavity integrated with a mechanical resonator mirror and an optical amplifier. The device exhibits a variety of intriguing nonlinear effects including synchronization and self-excited oscillation. Passively generated optical pulses are observed when the frequency of the optical ring cavity is tuned very close to the mechanical frequency of the suspended mirror. The optical power at the threshold of this process of mechanical mode locking is found to be related to quantum noise of the optical amplifier.	翻訳日:2023-06-04 18:52:04 公開日:2020-02-04
# MANETを活用した被災地安全給付システム Secure Payment System Utilizing MANET for Disaster Areas ( http://arxiv.org/abs/2002.01081v1 ) ライセンス: Link先を確認	Babatunde Ojetunde, Naoki Shibata, Juntao Gao	(参考訳) 災害地域のモバイル決済システムは、食品、衣類、薬品などの回収品を購入する人々に電子取引を提供する可能性がある。逆に、災害地域での取引を可能にするためには、大規模地震や洪水など災害時に破壊される可能性のある通信インフラ(有線ネットワークや携帯電話ネットワークなど)が必要であるため、災害地域では依存できない。本稿では,災害時の買い物を許可するトランザクションを実現するために,インフラストラクチャレスマネットを利用した新しいモバイル決済システムを提案する。具体的には、顧客間取引の支払い保証と、ブルームフィルタとマークルツリーに基づく軽量なスキームによるマルチレベル推奨機構を提供し、通信のオーバーヘッドを低減するための推奨機構を導入する。モバイル決済システムでは,位置情報に基づく相互監視方式やブラインドシグネチャなど,さまざまな方式を採用することでセキュアな取引を実現するとともに,新たに導入したイベントチェーン機構により,二重消費攻撃が防止される。シミュレーションにより検証したように,提案手法は災害現場で有効であり,高いトランザクション完了率,テストシナリオ全体の65%～90%を達成し,全体の平均7MBの商談メッセージサイズを持つモバイルデバイスのストレージ効率を向上する。 Mobile payment system in a disaster area have the potential to provide electronic transactions for people purchasing recovery goods like foodstuffs, clothes, and medicine. Conversely, to enable transactions in a disaster area, current payment systems need communication infrastructures (such as wired networks and cellular networks) which may be ruined during such disasters as large-scale earthquakes and flooding and thus cannot be depended on in a disaster area. In this paper, we introduce a new mobile payment system utilizing infrastructureless MANETs to enable transactions that permit users to shop in disaster areas. Specifically, we introduce an endorsement-based mechanism to provide payment guarantees for a customer-to-merchant transaction and a multilevel endorsement mechanism with a lightweight scheme based on Bloom filter and Merkle tree to reduce communication overheads. Our mobile payment system achieves secure transaction by adopting various schemes such as location-based mutual monitoring scheme and blind signature, while our newly introduce event chain mechanism prevents double spending attacks. As validated by simulations, the proposed mobile payment system is useful in a disaster area, achieving high transaction completion ratio, 65% - 90% for all scenario tested, and is storage-efficient for mobile devices with an overall average of 7MB merchant message size.	翻訳日:2023-06-04 18:51:39 公開日:2020-02-04
# 低光強度限界を超える原子鎖の光学応答:線形古典振動子モデルの有効性 Optical response of atom chains beyond the limit of low light intensity: The validity of the linear classical oscillator model ( http://arxiv.org/abs/2002.01417v1 ) ライセンス: Link先を確認	L. A. Williamson and J. Ruostekoski	(参考訳) 弱いコヒーレント入射光を受ける原子は、サブラジアントおよびスーパーラジアント集団励起固有モードをサポートする、結合した古典線形振動子として扱うことができる。量子多体マスター方程式を解くことにより, 閉じ込められた原子鎖からのコヒーレントかつ非コヒーレントな散乱を解くことにより, 駆動の強度を増大させる上でのこの擬似古典振動子モデルの妥当性の限界を同定する。線形古典振動子モデルからの偏差は、光によって励起される集合固有モードの共振線幅$\upsilon_\alpha$に敏感に依存し、$\upsilon_\alpha$のパワーローとしてかなりの偏差が発生する強度が増大することを示した。線形古典振動子モデル(英語版)は、超ラジアント励起よりもずっと低い強度で不正確となり、7つの原子の例システムでは、2つのケース間で30の係数で入射光強度が異なる臨界入射光強度が生じる。個々にエキサイティングな固有モデムにより、この臨界強度はより狭い共鳴とより強く相互作用する系に対して$\upsilon_\alpha^{2.5}$のスケーリングを持ち、より広い共鳴に対して$\upsilon_\alpha^3$のスケーリングに近づき、双極子-双極子相互作用が減少する。また、$\upsilon_\alpha^3$のスケーリングは原子間の量子揺らぎが無視される半古典的結果に対応する。完全モード整合ドライブの場合と定在波駆動の場合の両方について検討し,極低ラジアントモードのみに出現する2例とファノ共鳴の位置との間に有意差を認めた。 Atoms subject to weak coherent incident light can be treated as coupled classical linear oscillators, supporting subradiant and superradiant collective excitation eigenmodes. We identify the limits of validity of this \emph{linear classical oscillator model} at increasing intensities of the drive by solving the quantum many-body master equation for coherent and incoherent scattering from a chain of trapped atoms. We show that deviations from the linear classical oscillator model depend sensitively on the resonance linewidths $\upsilon_\alpha$ of the collective eigenmodes excited by light, with the intensity at which substantial deviation occurs scaling as a powerlaw of $\upsilon_\alpha$. The linear classical oscillator model then becomes inaccurate at much lower intensities for subradiant collective excitations than superradiant ones, with an example system of seven atoms resulting in critical incident light intensities differing by a factor of 30 between the two cases. By individually exciting eigenmodes we find that this critical intensity has a $\upsilon_\alpha^{2.5}$ scaling for narrower resonances and more strongly interacting systems, while it approaches a $\upsilon_\alpha^3$ scaling for broader resonances and when the dipole-dipole interactions are reduced. The $\upsilon_\alpha^3$ scaling also corresponds to the semiclassical result whereby quantum fluctuations between the atoms have been neglected. We study both the case of perfectly mode-matched drives and the case of standing wave drives, with significant differences between the two cases appearing only at very subradiant modes and positions of Fano resonances.	翻訳日:2023-06-04 18:46:40 公開日:2020-02-04
# 有限データを用いた多重位相の適応ベイズ推定実験 Experimental adaptive Bayesian estimation of multiple phases with limited data ( http://arxiv.org/abs/2002.01232v1 ) ライセンス: Link先を確認	Mauro Valeri, Emanuele Polino, Davide Poderini, Ilaria Gianani, Giacomo Corrielli, Andrea Crespi, Roberto Osellame, Nicol\`o Spagnolo and Fabio Sciarrino	(参考訳) 推定過程における究極の境界を達成することが量子計量学の主目的である。この文脈では、限られた量の資源しか使わずに複数のパラメータを測定する必要がある。この目的のために、追加の制御パラメータを利用する適応プロトコルは、そのような限られたデータレジームで動作する量子センサーの性能を最適化するツールを提供する。推定プロセス中に制御パラメータをチューニングするための最適な戦略を見つけることは自明な問題であり、機械学習技術はそのような課題に対処するための自然な解決策である。本稿では,非常に限られたデータで最適性能に達するように調整された適応ベイズ型マルチパラメータ推定手法を初めて実験的に検討し,実装する。フェムト秒レーザーライティングにより作製されたコンパクトでフレキシブルな集積フォトニック回路を用いて,高次制御による異なる戦略を実現する。その結果、適応戦略は限られた量のリソースを扱う現実的なセンサに対して実行可能なアプローチになり得ることが示された。 Achieving ultimate bounds in estimation processes is the main objective of quantum metrology. In this context, several problems require measurement of multiple parameters by employing only a limited amount of resources. To this end, adaptive protocols, exploiting additional control parameters, provide a tool to optimize the performance of a quantum sensor to work in such limited data regime. Finding the optimal strategies to tune the control parameters during the estimation process is a non-trivial problem, and machine learning techniques are a natural solution to address such task. Here, we investigate and implement experimentally for the first time an adaptive Bayesian multiparameter estimation technique tailored to reach optimal performances with very limited data. We employ a compact and flexible integrated photonic circuit, fabricated by femtosecond laser writing, which allows to implement different strategies with high degree of control. The obtained results show that adaptive strategies can become a viable approach for realistic sensors working with a limited amount of resources.	翻訳日:2023-06-04 18:44:23 公開日:2020-02-04
# 核スピン島での電子シャットリングによる超放射能様ダイナミクス Superradiant-like dynamics by electron shuttling on a nuclear-spin island ( http://arxiv.org/abs/2002.01219v1 ) ライセンス: Link先を確認	Yi-Nan Fang, Ying-Dan Wang, Rosario Fazio, and Stefano Chesi	(参考訳) 単一電子量子ドットにおける原子核スピン浴の超ラジアント様ダイナミクスを,電子がアイソトープに富む「原子核スピン島」上で周期的に停止するのを考慮して研究する。均一な超微細相互作用を仮定し、シャットリングによる核スピンの進化とその超輝度との関係を詳細に論じる。我々は、断続的なスピンの進化を免れる最小の停止時間を導出する。さらに,近傍のマイクロマグネットの不均質場下での低速・高速シャットリングについて検討する。最後に, 本手法を定常量子ドットモデルと比較することにより, 非断熱シャットリングがクーロン封鎖を解除し, 超ラジアント的挙動を確立する上で果たす役割を強調した。 We investigate superradiant-like dynamics of the nuclear-spin bath in a single-electron quantum dot, by considering electrons cyclically shuttling on/off an isotopically enriched `nuclear-spin island'. Assuming a uniform hyperfine interaction, we discuss in detail the nuclear spin evolution under shuttling and its relation to superradiance. We derive the minimum shuttling time which allows to escape the adiabatic spin evolution. Furthermore, we discuss slow/fast shuttling under the inhomogeneous field of a nearby micromagnet. Finally, by comparing our scheme to a model with stationary quantum dot, we stress the important role played by non-adiabatic shuttling in lifting the Coulomb blockade and thus establishing the superradiant-like behavior.	翻訳日:2023-06-04 18:44:08 公開日:2020-02-04
# LIGOの光とキログラム質量鏡の量子相関 Quantum correlations between the light and kilogram-mass mirrors of LIGO ( http://arxiv.org/abs/2002.01519v1 ) ライセンス: Link先を確認	Haocun Yu, L. McCuller, M. Tse, L. Barsotti, N. Mavalvala, J. Betzwieser, C. D. Blair, S. E. Dwyer, A. Effler, M. Evans, A. Fernandez-Galiana, P. Fritschel, V. V. Frolov, N. Kijbunchoo, F. Matichard, D. E. McClelland, T. McRae, A. Mullavey, D. Sigg, B. J. J. Slagmolen, C. Whittle, A. Buikema, Y. Chen, T. R. Corbitt, R. Schnabel, R. Abbott, C. Adams, R. X. Adhikari, A. Ananyeva, S. Appert, K. Arai, J. S. Areeda, Y. Asali, S. M. Aston, C. Austin, A. M. Baer, M. Ball, S. W. Ballmer, S. Banagiri, D. Barker, J. Bartlett, B. K. Berger, D. Bhattacharjee, G. Billingsley, S. Biscans, R. M. Blair, N. Bode, P. Booker, R. Bork, A. Bramley, A. F. Brooks, D. D. Brown, C. Cahillane, K. C. Cannon, X. Chen, A. A. Ciobanu, F. Clara, S. J. Cooper, K. R. Corley, S. T. Countryman, P. B. Covas, D. C. Coyne, L. E. H. Datrier, D. Davis, C. Di Fronzo, K. L. Dooley, J. C. Driggers, P. Dupej, T. Etzel, T. M. Evans, J. Feicht, P. Fulda, M. Fyffe, J. A. Giaime, K. D. Giardina, P. Godwin, E. Goetz, S. Gras, C. Gray, R. Gray, A. C. Green, Anchal Gupta, E. K. Gustafson, R. Gustafson, J. Hanks, J. Hanson, T. Hardwick, R. K. Hasskew, M. C. Heintze, A. F. Helmling-Cornell, N. A. Holland, J. D. Jones, S. Kandhasamy, S. Karki, M. Kasprzack, K. Kawabe, P. J. King, J. S. Kissel, Rahul Kumar, M. Landry, B. B. Lane, B. Lantz, M. Laxen, Y. K. Lecoeuche, J. Leviton, J. Liu, M. Lormand, A. P. Lundgren, R. Macas, M. MacInnis, D. M. Macleod, G. L. Mansell, S. M\'arka, Z. M\'arka, D. V. Martynov, K. Mason, T. J. Massinger, R. McCarthy, S. McCormick, J. McIver, G. Mendell, K. Merfeld, E. L. Merilh, F. Meylahn, T. Mistry, R. Mittleman, G. Moreno, C. M. Mow-Lowry, S. Mozzon, T. J. N. Nelson, P. Nguyen, L. K. Nuttall, J. Oberling, Richard J. Oram, C. Osthelder, D. J. Ottaway, H. Overmier, J. R. Palamos, W. Parker, E. Payne, A. Pele, C. J. Perez, M. Pirello, H. Radkins, K. E. Ramirez, J. W. Richardson, K. Riles, N. A. Robertson, J. G. Rollins, C. L. Romel, J. H. Romie, M. P. Ross, K. Ryan, T. Sadecki, E. J. Sanchez, L. E. Sanchez, T. R. Saravanan, R. L. Savage, D. Schaetzl, R. M. S. Schofield, E. Schwartz, D. Sellers, T. Shaffer, J. R. Smith, S. Soni, B. Sorazu, A. P. Spencer, K. A. Strain, L. Sun, M. J. Szczepa\'nczyk, M. Thomas, P. Thomas, K. A. Thorne, K. Toland, C. I. Torrie, G. Traylor, A. L. Urban, G. Vajente, G. Valdes, D. C. Vander-Hyde, P. J. Veitch, K. Venkateswara, G. Venugopalan, A. D. Viets, T. Vo, C. Vorvick, M. Wade, R. L. Ward, J. Warner, B. Weaver, R. Weiss, B. Willke, C. C. Wipf, L. Xiao, H. Yamamoto, Hang Yu, L. Zhang, M. E. Zucker, and J. Zweizig	(参考訳) より高精度な微小な力と変位の測定は、量子力学の柱によって課される限界(ハイゼンベルクの不確実性原理)に遭遇する。物体の位置を連続的に測定できる精度の限界は、標準量子極限(SQL)として知られている。光をプローブとして使用すると、物体に照射される光子放射圧の不確かさと光電検出における光子数とのバランスからSQLが生じる。 sqlを超える唯一の可能性は、オブジェクトの位置/運動量の不確かさとそれが反射する光の光子数/位相の不確かさとの相関である。本稿では,レーザー干渉計重力波観測所(LIGO)において,この種の量子相関が自然に発生するという理論的予測を実験的に証明する。以上の結果から,200kwレーザービームの位相と先端ligo検出器の40kgミラーの位置における量子力学的不確かさは,sqlの下の1.4(3db)以下の合同量子不確かさをもたらすことがわかった。量子相関は重力波(gw)観測だけでなく、将来全ての種類の測定値を改善すると予測している。 Measurement of minuscule forces and displacements with ever greater precision encounters a limit imposed by a pillar of quantum mechanics: the Heisenberg uncertainty principle. A limit to the precision with which the position of an object can be measured continuously is known as the standard quantum limit (SQL). When light is used as the probe, the SQL arises from the balance between the uncertainties of photon radiation pressure imposed on the object and of the photon number in the photoelectric detection. The only possibility surpassing the SQL is via correlations within the position/momentum uncertainty of the object and the photon number/phase uncertainty of the light it reflects. Here, we experimentally prove the theoretical prediction that this type of quantum correlation is naturally produced in the Laser Interferometer Gravitational-wave Observatory (LIGO). Our measurements show that the quantum mechanical uncertainties in the phases of the 200 kW laser beams and in the positions of the 40 kg mirrors of the Advanced LIGO detectors yield a joint quantum uncertainty a factor of 1.4 (3dB) below the SQL. We anticipate that quantum correlations will not only improve gravitational wave (GW) observatories but all types of measurements in future.	翻訳日:2023-06-04 18:38:09 公開日:2020-02-04
# 一ターン量子参照ゲームにおける複素性制限 Complexity limitations on one-turn quantum refereed games ( http://arxiv.org/abs/2002.01509v1 ) ライセンス: Link先を確認	Soumik Ghosh, John Watrous	(参考訳) 本稿では、量子状態を送信する2人のプレーヤー間の抽象ゲームである量子参照ゲームの複雑性理論的側面について検討し、そのプレイヤーがどのプレイヤーが勝つかを決定するために、2つの状態に対して効率的に実装可能なジョイント計測を行う。複雑性クラス $\mathrm{qrg}(1)$ は、一方のプレイヤーがyesインスタンスで常に高い確率で勝つことができ、もう一方のプレイヤーは、他方のプレイヤーの戦略に関わらず、常に無インスタンスで高い確率で勝つことができる決定問題を含んでいる。このクラスは自明に$\mathrm{QMA} \cup \text{co-}\mathrm{QMA}$を含み、$\mathrm{PSPACE}$に含まれることが知られている。このクラスの2つの制限付き不変量に対してより強い包含を証明します。 Specifically, if one of the players is limited to sending a classical (probabilistic) state rather than a quantum state, the resulting complexity class $\mathrm{CQRG}(1)$ is contained in $\exists\cdot\mathrm{PP}$ (the nondeterministic polynomial-time operator applied to $\mathrm{PP}$); while if both players send quantum states but the referee is forced to measure one of the states first, and incorporates the classical outcome of this measurement into a measurement of the second state, the resulting class $\mathrm{MQRG}(1)$ is contained in $\mathrm{P}\cdot\mathrm{PP}$ (the unbounded-error probabilistic polynomial-time operator applied to $\mathrm{PP}$). This paper studies complexity theoretic aspects of quantum refereed games, which are abstract games between two competing players that send quantum states to a referee, who performs an efficiently implementable joint measurement on the two states to determine which of the player wins. The complexity class $\mathrm{QRG}(1)$ contains those decision problems for which one of the players can always win with high probability on yes-instances and the other player can always win with high probability on no-instances, regardless of the opposing player's strategy. This class trivially contains $\mathrm{QMA} \cup \text{co-}\mathrm{QMA}$ and is known to be contained in $\mathrm{PSPACE}$. We prove stronger containments on two restricted variants of this class. Specifically, if one of the players is limited to sending a classical (probabilistic) state rather than a quantum state, the resulting complexity class $\mathrm{CQRG}(1)$ is contained in $\exists\cdot\mathrm{PP}$ (the nondeterministic polynomial-time operator applied to $\mathrm{PP}$); while if both players send quantum states but the referee is forced to measure one of the states first, and incorporates the classical outcome of this measurement into a measurement of the second state, the resulting class $\mathrm{MQRG}(1)$ is contained in $\mathrm{P}\cdot\mathrm{PP}$ (the unbounded-error probabilistic polynomial-time operator applied to $\mathrm{PP}$).	翻訳日:2023-06-04 18:37:50 公開日:2020-02-04
# 非定常光ホモダイン量子状態トモグラフィによる条件分光 Conditional Spectroscopy via Non-Stationary Optical Homodyne Quantum State Tomography ( http://arxiv.org/abs/2002.01465v1 ) ライセンス: Link先を確認	Johannes Thewes, Carolin L\"uders, Marc A{\ss}mann	(参考訳) 連続可変量子状態トモグラフィーは、量子光学における光場の性質を研究する最も強力な手法の1つである。しかし、固定位相参照の必要性は、半導体分光法のような他の分野で広く使われることを妨げている。本稿では,超高速分光法の特殊要件に応用した非定常量子状態トモグラフィーを提案する。具体的には、固定位相参照を必要とせず、約100\,fsの時間分解能で光場の振幅と位相にアクセスできる。さらに,本手法は,従来の方法では実験的に到達できない確率力学の条件研究を可能とし,サブpsスケールにおける熱光場の確率力学を観測することにより,実験的にその能力を示す。最後に,本手法の離散変数類似と見なされるハンベリーブラウン-トウィス光子相関実験の相違点と類似点について考察する。 Continuous variable quantum state tomography is one of the most powerful techniques to study the properties of light fields in quantum optics. However, the need for a fixed phase reference has so far prevented widespread usage in other fields such as semiconductor spectroscopy. Here, we introduce non-stationary quantum state tomography, which adapts the technique to the special requirements of ultrafast spectroscopy. In detail, we gain access to the amplitude and phase of light fields with a temporal resolution of about 100\,fs without the need for a fixed phase reference. Further, we show how our technique allows us to perform conditional studies of stochastic dynamics that are inaccessible experimentally by conventional means, and demonstrate the capabilities experimentally by monitoring the stochastic dynamics of a thermal light field on the sub-ps scale. Finally, we discuss differences and similarities to more standard Hanbury Brown-Twiss photon correlation experiments, which may be considered as the discrete variable analogues of our technique.	翻訳日:2023-06-04 18:36:00 公開日:2020-02-04
# エネルギー電流の一方向通り:境界駆動量子スピン鎖におけるユビキタス現象 One-way street for the energy current: A ubiquitous phenomenon in boundary-driven quantum spin chains ( http://arxiv.org/abs/2002.01463v1 ) ライセンス: Link先を確認	Deborah Oliveira and Emmanuel Pereira and Humberto C F Lemos	(参考訳) 量子スケールでのエネルギー輸送の非自明な性質の記述に焦点を当て、境界駆動の$\mathit{XXZ}$と$\mathit{XXX}$Heisenbergモデルで記述された非対称量子スピン鎖について検討する。我々は定常状態の性質を確立するためにシステムのダイナミクスに関連するリンドブラッドマスター方程式の対称性を探索する。境界における目標偏光に対する一般的な仮定の下では、エネルギーの流れに対するユニークな方法の存在である一方通行路現象(英語版)(one-way street phenomena)と呼ばれるエネルギー整流(英語版)に関連する(しかし、より強い)効果の発生を示す。正確には、エネルギー電流は境界における浴槽の反転によって大きさや方向が変化せず、その方向は鎖のバルク内の非対称性によって完全に決定される。結果はシステムの規模や輸送体制とは無関係である。本研究は, 境界駆動型スピンシステムにおけるエネルギー流の一方向路現象のユビキタスな発生を示すものであり, エネルギー電流の制御・操作に使用される効率的な量子デバイスの研究・構築に寄与すると考えられる。 Focusing on the description of nontrivial properties of the energy transport at quantum scale, we investigate asymmetrical quantum spin chains described by boundary-driven $\mathit{XXZ}$ and $\mathit{XXX}$ Heisenberg models. We search for symmetries properties of the Lindblad master equation related to the dynamics of the system in order to establish properties of the steady state. Under rather general assumptions for the target polarization at the boundaries, we show the occurrence of an effect related to (but stronger than) energy rectification, namely, the one-way street phenomenon, which is the existence of an unique way for the energy flow. Precisely, the energy current does not change in magnitude and direction as we invert the baths at the boundaries: its direction is completely determined by the asymmetry in the bulk of the chain. The results follow independent of the system size and of the transport regime. Our findings show the ubiquitous occurrence of the one-way street phenomenon for the energy flow in boundary-driven spin systems and, we believe, they shall be an useful contribution to the area devoted to the investigation and building of efficient quantum devices used to control and manipulate the energy current.	翻訳日:2023-06-04 18:35:45 公開日:2020-02-04
# 内部摩擦を伴う量子カルノーサイクル Quantum Carnot cycle with inner friction ( http://arxiv.org/abs/2002.01457v1 ) ライセンス: Link先を確認	Sel\c{c}uk \c{C}akmak, Ferdi Altintas	(参考訳) 単一駆動スピンは6ストロークの不可逆量子カルノーサイクルの作用物質として研究される。有限時間断熱変換に伴う内部摩擦がサイクル効率と収穫作業に与える影響について詳細に検討した。内摩擦は、作業出力とサイクル効率を著しく低減し、エンジンが過速な断熱変換のための正の作業を生み出すことを不可能にする。理想カルノ効率は準静的変換に対してのみ到達することが分かる。古典的なカルノ効率からのサイクル効率のずれは、内摩擦によるエントロピー全体の生成に直接関係する効率ラグによって与えられる。サイクルの緩和過程における放出熱はエントロピー生成と内部摩擦と関連している。また, スケール不変な量子ワーキング物質の実験結果の延長と, 液体状態核磁気共鳴系における可逆量子カルノサイクルの実験的実装の可能性についても考察した。 A single driven spin is investigated as the working substance of a six-stroke irreversible quantum Carnot cycle. The role of inner friction associated with the finite-time adiabatic transformations on the cycle efficiency and the harvested work are investigated in detail. The inner friction is found to significantly reduce the work output and the cycle efficiency which can make the engine incapable to produce positive work for the too fast adiabatic transformations. The ideal Carnot efficiency is found to be reached only for the quasi-static transformations. A deviation of the cycle efficiency from the classical Carnot efficiency has been given by an efficiency lag which is directly related to the total entropy production due to the inner friction. The released heat in the relaxation processes of the cycle are associated with the entropy production and the inner friction. The extension of the results for a scale invariant quantum working substance and the possible experimental implementation of the irreversible quantum Carnot cycle in a liquid state nuclear magnetic resonance setup are also discussed.	翻訳日:2023-06-04 18:35:23 公開日:2020-02-04
# QCA技術における新しいレベル感度T-FFを用いた同期カウンタ設計 Synchronous Counter Design Using Novel Level Sensitive T-FF in QCA Technology ( http://arxiv.org/abs/2002.11587v1 ) ライセンス: Link先を確認	Ali H. Majeed, Esam Alkaldy, Mohd Shamian bin Zainal, andDanial Bin MD Nor	(参考訳) 量子ドットセルオートマトン(QCA)ナノ技術は、低消費電力や小型化といった特徴から、コンピュータ科学者を惹きつけている。多くの論文が、多くのqca回路の無署名化や論理ゲートの最適構造への提示におけるこの技術の利用に関する論文で発表されている。 Tフリップフロップはデジタルデザインの重要な部分であり、同期カウンタや非同期カウンタの設計に使用することができる。本稿では,新しいTフリップフロップ構造を最適に提示する。提示された新しいゲートはnビットバイナリ同期カウンタの設計に使われた。 QCADesignerソフトウェアは設計した回路の検証とシミュレーション結果の提示に使われ、QCAProツールは電力分析に使用された。提案された設計は最小限の電力を必要とし、以前の設計よりも優れた改善が見られた。 The quantum-dot cellular automata (QCA) nano-technique has attracted computer scientists due to its noticeable features such as low power consumption and small size. Many papers have been published in the literature about the utilization of this technology for de-signing many QCA circuits and for presenting logic gates in an optimal structure. The T flip-flop, which is an essential part of digital designs, can be used to design synchronous and asynchronous counters. This paper presents a novel T flip-flop structure in an optimal form. The presented novel gate was used to design an N-bit binary synchronous counter. The QCADesigner software was used to verify the designed circuits and to present the simulation results, while the QCAPro tool was used for the power analysis. The proposed design required minimal power and showed good improvements over previous designs.	翻訳日:2023-06-04 18:26:34 公開日:2020-02-04
# 暗号通貨、ファイトマネー、ブロックチェーン、データベース Criptocurrencies, Fiat Money, Blockchains and Databases ( http://arxiv.org/abs/2002.08466v1 ) ライセンス: Link先を確認	Jorge Barrera	(参考訳) 暗号通貨を含む2つの通貨の分類を解析する。暗号通貨の定義が与えられ、その価格がどのように固定されているかに基づいてその分類が提示される。現状のファイトマネーの使用と2段階銀行システムの運用の特徴について論じる。暗号通貨はフィアットマネーと比較され、後者が克服できない側面が示される。ブロックチェーンとデータベースの特徴について述べる。両技術の可能な使用事例を比較し、暗号通貨や特定のレコードに加えてブロックチェーンが有用性を示していないのに対して、データベースは運用中のほとんどの自動化システムの基盤となっている点に注意が必要である。 Two taxonomies of money that include cryptocurrencies are analyzed. A definition of the term cryptocurrency is given and a taxonomy of them is presented, based on how its price is fixed. The characteristics of the use of current fiat money and the operation of two-level banking systems are discussed. Cryptocurrencies are compared with fiat money and the aspects in which the latter cannot be overcome are indicated. The characteristics of blockchains and databases are described. The possible cases of use of both technologies are compared, and it is noted that blockchains, in addition to cryptocurrencies and certain records, have not yet shown their usefulness, while databases constitute the foundation of most of the automated systems in operation.	翻訳日:2023-06-04 18:25:41 公開日:2020-02-04
# 共変量子力学と量子時空 Covariant Quantum Mechanics and Quantum Spacetime ( http://arxiv.org/abs/2002.07083v1 ) ライセンス: Link先を確認	Suzana Bedi\'c, Otto C. W. Kong and Hock King Ting	(参考訳) 本稿では、ローレンツ対称性の下でミンコフスキー四ベクトルとして変換される位置および運動量作用素を持つハイゼンベルク・ワイル対称性からの群理論的構成に基づくローレンツ共変量子力学の定式化について述べる。基本表現は、本質的に正則表現の既約成分であるコヒーレント状態表現(英語版)(coherent state representation)として識別され、群 $C^$-algebra の拡張の一致する表現は可観測体の代数を与える。この定式化の重要な特徴は、ユニタリではなく擬似ユニタリであり、ミンコフスキー時空表現と全く同じ意味である。明示的な波動関数の記述は、変数領域の制限なしに与えられるが、有限積分内積を持つ。関連する共変共変振動子フォック状態基底は、ユークリッド位置と任意の「次元」の運動量作用素を持つ調和振動子のものと正確に類似したすべての標準特性を持つ。ローレンツ対称性のガリレオ極限とローレンツ共変フレームワークの古典極限は、実数座標と非可換作用素座標の両方で与えられる位相空間の対称性を通して記述された力学を含む代数とその表現の適切な対称性収縮によって厳密に検索される。後者は(射影的)ヒルベルト空間を量子/非可換時空として明示的な図式を与える。 We present in the article the formulation of a version of Lorentz covariant quantum mechanics based on a group theoretical construction from a Heisenberg-Weyl symmetry with position and momentum operators transforming as Minkowski four-vectors under the Lorentz symmetry. The basic representation is identified as a coherent state representation, essentially an irreducible component of the regular representation, with the matching representation of an extension of the group $C^$-algebra giving the algebra of observables. The key feature of the formulation is that it is not unitary but pseudo-unitary, exactly in the same sense as the Minkowski spacetime representation. Explicit wavefunction description is given without any restriction of the variable domains, yet with a finite integral inner product. The associated covariant harmonic oscillator Fock state basis has all the standard properties in exact analog to those of a harmonic oscillator with Euclidean position and momentum operators of any `dimension'. Galilean limit of the Lorentz symmetry and the classical limit of the Lorentz covariant framework are retrieved rigorously through appropriate symmetry contractions of the algebra and its representation, including the dynamics described through the symmetry of the phase space, given both in terms of real/complex number coordinates and noncommutative operator coordinates. The latter gives an explicit picture of the (projective) Hilbert space as a quantum/noncommutative spacetime.	翻訳日:2023-06-04 18:25:31 公開日:2020-02-04
# 連続スペクトルを持つPT対称ポテンシャル PT-symmetric potentials having continuous spectra ( http://arxiv.org/abs/2002.04398v1 ) ライセンス: Link先を確認	Zichao Wen and Carl M. Bender	(参考訳) 連続スペクトルを持つ1次元PT対称量子力学ハミルトニアンは研究される。ハミルトン派は$H=p^2+V(x)$で、$V(x)$は$x$で奇数であり、純粋な虚数であり、$\|x\|\to\infty$として消える。 Five PT-symmetric potentials are studied: the Scarf-II potential $V_1(x)=iA_1\,{\rm sech}(x)\tanh(x)$, which decays exponentially for large $\|x\|$; the rational potentials $V_2(x)=iA_2\,x/(1+x^4)$ and $V_3(x)=iA_3\,x/(1+\|x\|^3)$, which decay algebraically for large $\|x\|$; the step-function potential $V_4(x)=iA_4\,{\rm sgn}(x)\theta(2.5-\|x\|)$, which has compact support; the regulated Coulomb potential $V_5(x)=iA_5\,x/(1+x^2)$, which decays slowly as $\|x\|\to\infty$ and may be viewed as a long-range potential. 実パラメータ$A_n$はこれらのポテンシャルの強度を測定する。これらのポテンシャルに関連する時間非依存的なシュリンガー固有値問題の解法は、対応するハミルトンのスペクトルが普遍性を示すことを示した。一般に、固有値は一部実数であり、一部複素数である。実固有値はスペクトルの連続部分を形成し、複素固有値はスペクトルの離散部分を形成する。実固有値は、0$から$+\infty$まで連続的に値が変化する。複素固有値は離散複素共役対で発生し、$V_n(x)$$$1\leq n\leq4$) の場合、これらの対の数は有限であり、強度パラメータ$A_n$の値が増加するにつれて増加する。しかし、$v_5(x)$ に対して、原点に極限点を持つ離散固有値の {\it infinite} 列が存在する。この配列は複雑であるが、逆二乗収束を持つため、水素原子に対するバルマー級数と似ている。 One-dimensional PT-symmetric quantum-mechanical Hamiltonians having continuous spectra are studied. The Hamiltonians considered have the form $H=p^2+V(x)$, where $V(x)$ is odd in $x$, pure imaginary, and vanishes as $\|x\|\to\infty$. Five PT-symmetric potentials are studied: the Scarf-II potential $V_1(x)=iA_1\,{\rm sech}(x)\tanh(x)$, which decays exponentially for large $\|x\|$; the rational potentials $V_2(x)=iA_2\,x/(1+x^4)$ and $V_3(x)=iA_3\,x/(1+\|x\|^3)$, which decay algebraically for large $\|x\|$; the step-function potential $V_4(x)=iA_4\,{\rm sgn}(x)\theta(2.5-\|x\|)$, which has compact support; the regulated Coulomb potential $V_5(x)=iA_5\,x/(1+x^2)$, which decays slowly as $\|x\|\to\infty$ and may be viewed as a long-range potential. The real parameters $A_n$ measure the strengths of these potentials. Numerical techniques for solving the time-independent Schr\"odinger eigenvalue problems associated with these potentials reveal that the spectra of the corresponding Hamiltonians exhibit universal properties. In general, the eigenvalues are partly real and partly complex. The real eigenvalues form the continuous part of the spectrum and the complex eigenvalues form the discrete part of the spectrum. The real eigenvalues range continuously in value from $0$ to $+\infty$. The complex eigenvalues occur in discrete complex-conjugate pairs and for $V_n(x)$ ($1\leq n\leq4$) the number of these pairs is finite and increases as the value of the strength parameter $A_n$ increases. However, for $V_5(x)$ there is an {\it infinite} sequence of discrete eigenvalues with a limit point at the origin. This sequence is complex, but it is similar to the Balmer series for the hydrogen atom because it has inverse-square convergence.	翻訳日:2023-06-04 18:24:46 公開日:2020-02-04
# エンタングルメントスワップによる量子密度符号化の実現手法 Scheme for realizing quantum dense coding via entanglement swapping ( http://arxiv.org/abs/2002.02422v1 ) ライセンス: Link先を確認	Nilakantha Meher	(参考訳) 量子密度符号化 (quantum dense coding) は、1つの量子ビット(qubit)だけを送信することで、送信者(Alice)からリモート受信機(Bob)に2つの古典的な情報を送信するプロトコルである。本稿では、ある数個の2レベル原子を含む空洞配列で交換する量子密度符号 \textit{via} エンタングルメントを実現するための実験的に実現可能なスキームを提案する。原子-キャビティカップリングやキャビティ間カップリングなどのシステムパラメータの適切な選択は、情報の完全な転送を可能にする。フォトニック結晶キャビティと超伝導共振器の文脈で最近達成された実験値を用いて、情報の高忠実度転送が可能であることが示されている。実験的な欠陥を模倣するため、結合強度と共振周波数の両方の障害を考慮する。 Quantum dense coding is a protocol for transmitting two classical bits of information from a sender (Alice) to a remote receiver (Bob) by sending only one quantum bit (qubit). In this article, we propose an experimentally feasible scheme to realize quantum dense coding \textit{via} entanglement swapping in a cavity array containing a certain number of two-level atoms. Proper choice of system parameters such as atom-cavity couplings and inter-cavity couplings allows perfect transfer of information. A high fidelity transfer of information is shown to be possible by using recently achieved experimental values in the context of photonic crystal cavities and superconducting resonators. To mimic experimental imperfections, disorder in both the coupling strengths and resonance frequencies is considered.	翻訳日:2023-06-04 18:24:09 公開日:2020-02-04
# 古典的・直観主義的な数学言語は、物理学における時間の理解を形作る Classical and intuitionistic mathematical languages shape our understanding of time in physics ( http://arxiv.org/abs/2002.01653v1 ) ライセンス: Link先を確認	Nicolas Gisin	(参考訳) 物理学は時間のない古典数学で定式化されている。時間進化過程に基づいて構築された直観主義数学に基づく定式化は、我々の物理的現実の経験に近い視点を提供する。 Physics is formulated in terms of timeless classical mathematics. A formulation on the basis of intuitionist mathematics, built on time-evolving processes, would offer a perspective that is closer to our experience of physical reality.	翻訳日:2023-06-04 18:23:56 公開日:2020-02-04
# 賃貸住宅スポット市場:オンライン情報交換が送金データをどのように補完するか Rental Housing Spot Markets: How Online Information Exchanges Can Supplement Transacted-Rents Data ( http://arxiv.org/abs/2002.01578v1 ) ライセンス: Link先を確認	Geoff Boeing, Jake Wegmann, Junfeng Jiao	(参考訳) アメリカン・コミュニティ・サーベイ(英語版)やアメリカン・ハウジング・サーベイ(英語版)のような伝統的な米国の賃貸住宅データソースは、既存の借り手が毎月支払う市場について報告している。彼らはスポットマーケットについて明確には教えてくれない - すなわち、現在の住宅購入者が住宅を購入するために支払わなければならない家賃だ。この研究は、政府のデータと何百万もの同時賃貸物件を比較し、賃貸料の請求がこれらの最新の推計から大きく異なることを発見した。従来型の住宅データは、現在の市場状況や、特にタイトで高価な賃貸市場のある都市における手頃価格の課題を過小評価している。 Traditional US rental housing data sources such as the American Community Survey and the American Housing Survey report on the transacted market - what existing renters pay each month. They do not explicitly tell us about the spot market - i.e., the asking rents that current homeseekers must pay to acquire housing - though they are routinely used as a proxy. This study compares governmental data to millions of contemporaneous rental listings and finds that asking rents diverge substantially from these most recent estimates. Conventional housing data understate current market conditions and affordability challenges, especially in cities with tight and expensive rental markets.	翻訳日:2023-06-04 18:23:52 公開日:2020-02-04
# ローカライズのための連合学習--プライバシー保護型クラウドソーシング手法 Federated Learning for Localization: A Privacy-Preserving Crowdsourcing Method ( http://arxiv.org/abs/2001.01911v2 ) ライセンス: Link先を確認	Bekir Sait Ciftler, Abdullatif Albaseer, Noureddine Lasla, Mohamed Abdallah	(参考訳) 受信信号強度(RSS)指紋ベースのローカライゼーションは、低コストで実装が容易なため、多くの研究成果を惹きつけ、ロケーションベースのサービスの商業的応用を育ててきた。深層学習(DL)アルゴリズムをローカライズに活用する研究が数多く行われている。 DLの機能を抽出し、自律的に分類する能力は、指紋ベースのローカライゼーションの魅力的なソリューションとなる。これらの解は、大量の測定値を持つDLモデルの頻繁な再訓練を必要とする。クラウドソーシングは大量のデータを集めるのに優れた方法だが、集中型サーバーでラベル付きデータを収集する必要があるため、参加者のプライバシーを損なう。近年,フェデレーション学習は,クラウドソーシングの参加者のプライバシ保護の問題を解決する手段として,エッジデバイス上でのモデルトレーニングを分散的に行うという実践的な概念として現れており,参加者はもはやデータを集中型サーバに公開しない。本稿では,クラウドソーシング参加者のプライバシーを維持しつつ,RSS指紋による位置推定の精度を向上させるために,フェデレーション学習を利用した新しい手法を提案する。フェデレートされた学習を利用することで、ユーザのデータのプライバシを保存することを保証すると同時に、実世界の環境でキャプチャされた実験データによる適切なローカライゼーションパフォーマンスを実現することができる。提案手法は, 集中学習用ブースタとして使用する場合のローカライズ精度を1.8m向上させ, 単独使用時のローカライズ精度を良好に向上させた。 Received Signal Strength (RSS) fingerprint-based localization has attracted a lot of research effort and cultivated many commercial applications of location-based services due to its low cost and ease of implementation. Many studies are exploring the use of deep learning (DL) algorithms for localization. DL's ability to extract features and to classify autonomously makes it an attractive solution for fingerprint-based localization. These solutions require frequent retraining of DL models with vast amounts of measurements. Although crowdsourcing is an excellent way to gather immense amounts of data, it jeopardizes the privacy of participants, as it requires to collect labeled data at a centralized server. Recently, federated learning has emerged as a practical concept in solving the privacy preservation issue of crowdsourcing participants by performing model training at the edge devices in a decentralized manner; the participants do not expose their data anymore to a centralized server. This paper presents a novel method utilizing federated learning to improve the accuracy of RSS fingerprint-based localization while preserving the privacy of the crowdsourcing participants. Employing federated learning allows ensuring \emph{preserving the privacy of user data} while enabling an adequate localization performance with experimental data captured in real-world settings. The proposed method improved localization accuracy by 1.8 meters when used as a booster for centralized learning and achieved satisfactory localization accuracy when used standalone.	翻訳日:2023-01-13 21:09:40 公開日:2020-02-04
# d3ba: 非決定性計画を用いたビジネスプロセス最適化ツール D3BA: A Tool for Optimizing Business Processes Using Non-Deterministic Planning ( http://arxiv.org/abs/2001.02619v2 ) ライセンス: Link先を確認	Tathagata Chakraborti and Yasaman Khazaeni	(参考訳) 本稿では,対話エージェントの宣言的設計に関する最近の研究に基づいて,ai計画の力を利用してビジネスプロセスを最適化するディジタルビジネス自動化のための,エキサイティングな新しいツールであるd3baを提案する。このツールは、複雑なビジネスプロセスを構築し、最適化し、メンテナンスするための強力なフレームワークを提供する。我々は、この構成技法の有意義な特徴を説明し、他の構成哲学と比較し、この新興のビジネスプロセス自動化分野における研究のエキサイティングな機会を強調する。 This paper builds upon recent work in the declarative design of dialogue agents and proposes an exciting new tool -- D3BA -- Declarative Design for Digital Business Automation, built to optimize business processes using the power of AI planning. The tool provides a powerful framework to build, optimize, and maintain complex business processes and optimize them by composing with services that automate one or more subtasks. We illustrate salient features of this composition technique, compare with other philosophies of composition, and highlight exciting opportunities for research in this emerging field of business process automation.	翻訳日:2023-01-13 10:07:44 公開日:2020-02-04
# DALC:微粒化交通速度予測のための分散LSTMカスタマイズ DALC: Distributed Automatic LSTM Customization for Fine-Grained Traffic Speed Prediction ( http://arxiv.org/abs/2001.09821v2 ) ライセンス: Link先を確認	Ming-Chang Lee and Jia-Chun Lin	(参考訳) 過去10年間で、短期交通予測のためのいくつかのアプローチが導入されている。しかし,多数の検出器を地理的に配置して交通データを収集する大規模交通ネットワークでは,詳細な交通予測がまだ未解決である。本稿では,単一検出器のlstmモデルを有限マルコフ決定プロセスにカスタマイズする問題を定式化し,それに対応する予測精度を可能な限り満足し,時間消費を極力低くできるように,単一検出器のlstmモデルを自動カスタマイズする自動lstmカスタマイズ(alc)アルゴリズムを導入する。 ALCアルゴリズムに基づいて,大規模輸送ネットワークにおけるLSTMモデル毎にLSTMモデルをカスタマイズするために,分散自動LSTMカスタマイズ(DALC)と呼ばれる分散アプローチを導入する。本実験は, dalcがapache spark mllibが提供する複数のアプローチよりも高い予測精度を提供することを示す。 Over the past decade, several approaches have been introduced for short-term traffic prediction. However, providing fine-grained traffic prediction for large-scale transportation networks where numerous detectors are geographically deployed to collect traffic data is still an open issue. To address this issue, in this paper, we formulate the problem of customizing an LSTM model for a single detector into a finite Markov decision process and then introduce an Automatic LSTM Customization (ALC) algorithm to automatically customize an LSTM model for a single detector such that the corresponding prediction accuracy can be as satisfactory as possible and the time consumption can be as low as possible. Based on the ALC algorithm, we introduce a distributed approach called Distributed Automatic LSTM Customization (DALC) to customize an LSTM model for every detector in large-scale transportation networks. Our experiment demonstrates that the DALC provides higher prediction accuracy than several approaches provided by Apache Spark MLlib.	翻訳日:2023-01-07 05:06:47 公開日:2020-02-04
# 創発的コミュニケーションにおけるグラフ表現学習に向けて Towards Graph Representation Learning in Emergent Communication ( http://arxiv.org/abs/2001.09063v2 ) ライセンス: Link先を確認	Agnieszka S{\l}owik, Abhinav Gupta, William L. Hamilton, Mateja Jamnik, Sean B. Holden	(参考訳) 最近の神経科学の発見は、人間の脳が幾何学的構造(例えば概念空間を通して)の情報を表すことを示唆している。コミュニケーションするために、エンティティとその属性の複雑な表現を1つの単語または文にフラット化する。本稿では,マルチエージェントシステムにおける言語進化と協調を支援するために,グラフ畳み込みネットワークを用いる。画像ベースの参照ゲームに動機づけられ,複雑度が異なるグラフ参照ゲームを提案し,言語出現と協調の観点から望ましい特性を示す強力なベースラインモデルを提供する。出現したコミュニケーションプロトコルは頑健であり、エージェントはゲームの変動の真の要因を明らかにし、トレーニング中に遭遇したサンプルを超えて一般化することを学ぶ。 Recent findings in neuroscience suggest that the human brain represents information in a geometric structure (for instance, through conceptual spaces). In order to communicate, we flatten the complex representation of entities and their attributes into a single word or a sentence. In this paper we use graph convolutional networks to support the evolution of language and cooperation in multi-agent systems. Motivated by an image-based referential game, we propose a graph referential game with varying degrees of complexity, and we provide strong baseline models that exhibit desirable properties in terms of language emergence and cooperation. We show that the emerged communication protocol is robust, that the agents uncover the true factors of variation in the game, and that they learn to generalize beyond the samples encountered during training.	翻訳日:2023-01-07 04:41:04 公開日:2020-02-04
# 低複雑性畳み込みニューラルネットワークのための事前定義されたスパーシリティ Pre-defined Sparsity for Low-Complexity Convolutional Neural Networks ( http://arxiv.org/abs/2001.10710v2 ) ライセンス: Link先を確認	Souvik Kundu, Mahdi Nazemi, Massoud Pedram, Keith M. Chugg, Peter A. Beerel	(参考訳) 深層畳み込みニューラルネットワークを処理するための高エネルギーコストは、組み込みシステムやIoTデバイスのようなエネルギー制約のあるプラットフォームへのユビキタスな展開を妨げる。この研究は、フィルター内およびフィルター間を周期的に繰り返すサポートセットを持つ、事前定義されたスパース2dカーネルによる畳み込み層を導入する。周期的スパースカーネルの効率的な保存により、パラメータセーブはDRAMアクセスの削減によるエネルギー効率の大幅な改善に変換できるため、トレーニングと推論の両方においてエネルギー消費と精度のトレードオフが大幅に改善されることが期待できる。このアプローチを評価するために、ResNet18とVGG16アーキテクチャのスパース変種において、広く受け入れられている2つのデータセットであるCIFAR-10とTiny ImageNetを用いて実験を行った。ベースラインモデルと比較すると,提案手法ではモデルパラメータが最大82%少なく,フラップが5.6倍少なく,cifar-10ではresnet18の精度が無視できる。 Tiny ImageNetでトレーニングされたVGG16では、FLOPが5.8倍少なく、モデルパラメータが83.3%少なく、トップ5(トップ-1)の精度はわずか1.2%(2.1%)である。また,提案アーキテクチャの性能をShuffleNetとMobileNetV2の性能と比較した。同様のハイパーパラメータとFLOPを用いて、ResNet18の変種は平均精度が2.8%向上した。 The high energy cost of processing deep convolutional neural networks impedes their ubiquitous deployment in energy-constrained platforms such as embedded systems and IoT devices. This work introduces convolutional layers with pre-defined sparse 2D kernels that have support sets that repeat periodically within and across filters. Due to the efficient storage of our periodic sparse kernels, the parameter savings can translate into considerable improvements in energy efficiency due to reduced DRAM accesses, thus promising significant improvements in the trade-off between energy consumption and accuracy for both training and inference. To evaluate this approach, we performed experiments with two widely accepted datasets, CIFAR-10 and Tiny ImageNet in sparse variants of the ResNet18 and VGG16 architectures. Compared to baseline models, our proposed sparse variants require up to 82% fewer model parameters with 5.6times fewer FLOPs with negligible loss in accuracy for ResNet18 on CIFAR-10. For VGG16 trained on Tiny ImageNet, our approach requires 5.8times fewer FLOPs and up to 83.3% fewer model parameters with a drop in top-5 (top-1) accuracy of only 1.2% (2.1%). We also compared the performance of our proposed architectures with that of ShuffleNet andMobileNetV2. Using similar hyperparameters and FLOPs, our ResNet18 variants yield an average accuracy improvement of 2.8%.	翻訳日:2023-01-05 21:02:03 公開日:2020-02-04
# グラフニューラルネットワークを用いた確率論的論理推論 Efficient Probabilistic Logic Reasoning with Graph Neural Networks ( http://arxiv.org/abs/2001.11850v2 ) ライセンス: Link先を確認	Yuyu Zhang, Xinshi Chen, Yuan Yang, Arun Ramamurthy, Bo Li, Yuan Qi, Le Song	(参考訳) 論理規則と確率的グラフィカルモデルを組み合わせたマルコフ論理ネットワーク(MLN)は、多くの知識グラフ問題に対処するために用いられる。しかし、MLNの推論は計算集約的であり、MLNの工業的応用は非常に困難である。近年,グラフニューラルネットワーク(gnn)が大規模グラフ問題に対して効率的かつ効果的なツールとして登場している。それでも、GNNはモデルに事前のロジックルールを明示的に取り入れておらず、ターゲットタスクに多くのラベル付き例を必要とする可能性がある。本稿では,MLNとGNNの組み合わせについて検討し,MLNの変分推論にグラフニューラルネットワークを用いる。本稿では,表現力とモデルの単純さのバランスのよいGNN変種であるExpressGNNを提案する。いくつかのベンチマークデータセットに関する広範な実験は、ExpressGNNが効率的かつ効率的な確率論的論理推論をもたらすことを示した。 Markov Logic Networks (MLNs), which elegantly combine logic rules and probabilistic graphical models, can be used to address many knowledge graph problems. However, inference in MLN is computationally intensive, making the industrial-scale application of MLN very difficult. In recent years, graph neural networks (GNNs) have emerged as efficient and effective tools for large-scale graph problems. Nevertheless, GNNs do not explicitly incorporate prior logic rules into the models, and may require many labeled examples for a target task. In this paper, we explore the combination of MLNs and GNNs, and use graph neural networks for variational inference in MLN. We propose a GNN variant, named ExpressGNN, which strikes a nice balance between the representation power and the simplicity of the model. Our extensive experiments on several benchmark datasets demonstrate that ExpressGNN leads to effective and efficient probabilistic logic reasoning.	翻訳日:2023-01-05 20:44:59 公開日:2020-02-04
# NAViDAd:ディープオートエンコーダに基づく非参照オーディオ映像品質指標 NAViDAd: A No-Reference Audio-Visual Quality Metric Based on a Deep Autoencoder ( http://arxiv.org/abs/2001.11406v2 ) ライセンス: Link先を確認	Helard Martinez, M. C. Farias, A. Hines	(参考訳) 音声信号とビデオ信号の両方の品質予測モデルの開発は、かなり成熟した分野である。しかし、複数のマルチモーダルモデルが提案されているが、音声・視覚品質予測の分野はいまだに新興分野である。実際、組み合わせとパラメトリックのメトリクスによって得られる妥当なパフォーマンスにもかかわらず、現在、信頼できるピクセルベースのオーディオ視覚品質指標は存在しない。この研究で提示されたアプローチは、説明的なオーディオとビデオ機能を備えたオートエンコーダが、複雑なオーディオとビデオのインタラクションを記述することのできる一連の機能を生み出すかもしれないという仮定に基づいている。この仮説に基づいて,Deep Autoencoder (NAViDAd) に基づく非参照オーディオ-ビジュアル品質メトリクスを提案する。モデル視覚特徴は、ビデオ成分の自然シーン統計(NSS)と時空間測度である。一方、音声成分のスペクトログラム表現を演算して音声特徴を得る。このモデルは、ディープオートエンコーダ層と分類層を含む2層フレームワークによって形成される。これら2つのレイヤは積み重ねられ、ディープニューラルネットワークモデルを構築するためにトレーニングされます。モデルは、代表的なオーディオおよびビデオアーティファクトを含む、大きな刺激セットを使用して訓練され、テストされる。このモデルは、UnB-AVとLiveNetflix-IIデータベースでテストするとうまく動作した。 %の結果, 主観的品質スコアと高い相関性を有する品質スコアが得られた。 The development of models for quality prediction of both audio and video signals is a fairly mature field. But, although several multimodal models have been proposed, the area of audio-visual quality prediction is still an emerging area. In fact, despite the reasonable performance obtained by combination and parametric metrics, currently there is no reliable pixel-based audio-visual quality metric. The approach presented in this work is based on the assumption that autoencoders, fed with descriptive audio and video features, might produce a set of features that is able to describe the complex audio and video interactions. Based on this hypothesis, we propose a No-Reference Audio-Visual Quality Metric Based on a Deep Autoencoder (NAViDAd). The model visual features are natural scene statistics (NSS) and spatial-temporal measures of the video component. Meanwhile, the audio features are obtained by computing the spectrogram representation of the audio component. The model is formed by a 2-layer framework that includes a deep autoencoder layer and a classification layer. These two layers are stacked and trained to build the deep neural network model. The model is trained and tested using a large set of stimuli, containing representative audio and video artifacts. The model performed well when tested against the UnB-AV and the LiveNetflix-II databases. %Results shows that this type of approach produces quality scores that are highly correlated to subjective quality scores.	翻訳日:2023-01-05 12:47:34 公開日:2020-02-04
# 畳み込みニューラルネットワークを用いた高次元モータ画像タスクの分類 Classification of High-Dimensional Motor Imagery Tasks based on An End-to-end role assigned convolutional neural network ( http://arxiv.org/abs/2002.00210v2 ) ライセンス: Link先を確認	Byeong-Hoo Lee, Ji-Hoon Jeong, Kyung-Hwan Shim, Seong-Whan Lee	(参考訳) 脳コンピュータインタフェース(BCI)は、ユーザと外部デバイス間の直接通信経路を提供する。エレクトロ脳波(EEG)運動画像(MI)パラダイムは、非侵襲的BCIにおいて、ユーザの動作実行意図を含む符号化信号を得るために広く用いられている。しかし、EEGは複雑な非定常特性を持ち、デコード性能は不十分である。シングルアームの多数の動作を想像することにより、人工的なコマンドマッチングなしで復号性能を向上させることができる。そこで本研究では,片腕の9種類の動作を含む直感的脳波データを9名の被験者から収集した。階層型CNNアーキテクチャの原理を応用して,各上肢領域の識別的特徴を考慮した終端から終端までの畳み込みニューラルネットワーク(ERA-CNN)を提案する。提案手法は,従来の3-class,5-class,および2種類の7-class分類タスクよりも優れている。そこで本研究では,eare-cnnを用いたロバストな性能を有する脳波信号のみを用いて,ユーザの意図を復号する可能性を示す。 A brain-computer interface (BCI) provides a direct communication pathway between user and external devices. Electroencephalogram (EEG) motor imagery (MI) paradigm is widely used in non-invasive BCI to obtain encoded signals contained user intention of movement execution. However, EEG has intricate and non-stationary properties resulting in insufficient decoding performance. By imagining numerous movements of a single-arm, decoding performance can be improved without artificial command matching. In this study, we collected intuitive EEG data contained the nine different types of movements of a single-arm from 9 subjects. We propose an end-to-end role assigned convolutional neural network (ERA-CNN) which considers discriminative features of each upper limb region by adopting the principle of a hierarchical CNN architecture. The proposed model outperforms previous methods on 3-class, 5-class and two different types of 7-class classification tasks. Hence, we demonstrate the possibility of decoding user intention by using only EEG signals with robust performance using an ERA-CNN.	翻訳日:2023-01-05 01:12:21 公開日:2020-02-04
# 医療におけるベイズネットワーク:医療条件による流通 Bayesian Networks in Healthcare: Distribution by Medical Condition ( http://arxiv.org/abs/2002.00224v2 ) ライセンス: Link先を確認	Scott McLachlan, Kudakwashe Dube, Graham A Hitman, Norman E Fenton, Evangelia Kyrimi	(参考訳) ベイジアンネットワーク(BN)は、実際には採用と一致せず、医療に多大な利益をもたらす可能性がある研究の注目を集めている。研究は、BNでモデル化されている医療条件の種類や、どのように、なぜ異なる条件に適用されるのかについて、調査していない。本研究は、医療関連BNモデルが提案されている医療条件の範囲と、適用されている最も一般的な医療条件間のアプローチの差異を同定し、定量化することを目的とする。医療BNの約3分の2は、心臓、がん、心理、肺の4つの疾患に焦点を当てている。 BNがどのように機能し、どのような能力を持つかについての理解の欠如は、日々の医療実践においてポジティブな変化をもたらすために、BNの潜在能力を完全に認識することは、より深い理解と促進によってのみ実現できる、と我々は信じている。 Bayesian networks (BNs) have received increasing research attention that is not matched by adoption in practice and yet have potential to significantly benefit healthcare. Hitherto, research works have not investigated the types of medical conditions being modelled with BNs, nor whether any differences exist in how and why they are applied to different conditions. This research seeks to identify and quantify the range of medical conditions for which healthcare-related BN models have been proposed, and the differences in approach between the most common medical conditions to which they have been applied. We found that almost two-thirds of all healthcare BNs are focused on four conditions: cardiac, cancer, psychological and lung disorders. We believe that a lack of understanding regarding how BNs work and what they are capable of exists, and that it is only with greater understanding and promotion that we may ever realise the full potential of BNs to effect positive change in daily healthcare practice.	翻訳日:2023-01-05 01:03:30 公開日:2020-02-04
# ガウス過程を用いた電子顕微鏡画像のZ厚さとXY異方性の推定 Estimation of Z-Thickness and XY-Anisotropy of Electron Microscopy Images using Gaussian Processes ( http://arxiv.org/abs/2002.00228v2 ) ライセンス: Link先を確認	Thanuja D. Ambegoda, Julien N. P. Martel, Jozef Adamcik, Matthew Cook, Richard H. R. Hahnloser	(参考訳) シリアルセクション電子顕微鏡(SsEM)は、生体組織の体積情報をナノメートルスケールで取得する技術として広く用いられている。しかし、同定された細胞構造と体積量子化の正確な3次元再構成は、XYイメージング面に沿った断面厚さと異方性(または伸展)の正確な推定を必要とする。実際、多くの画像処理アルゴリズムは単に撮像面内の等方性を想定している。そこで本稿では,画像統計の非パラメトリックベイズ回帰法を用いて,電子顕微鏡断面の厚さと伸びを推定する手法を提案する。我々は,原子間力顕微鏡(AFM)により得られた直接測定値を用いて,我々の厚さと伸張率を検証し,最近の間接厚さ推定法や相対Z座標推定法と比較して推定誤差が低いことを示す。さらに,直接計測された断面厚み値を用いたssem画像の最初のデータセットを作成し,間接厚み推定法の評価を行った。 Serial section electron microscopy (ssEM) is a widely used technique for obtaining volumetric information of biological tissues at nanometer scale. However, accurate 3D reconstructions of identified cellular structures and volumetric quantifications require precise estimates of section thickness and anisotropy (or stretching) along the XY imaging plane. In fact, many image processing algorithms simply assume isotropy within the imaging plane. To ameliorate this problem, we present a method for estimating thickness and stretching of electron microscopy sections using non-parametric Bayesian regression of image statistics. We verify our thickness and stretching estimates using direct measurements obtained by atomic force microscopy (AFM) and show that our method has a lower estimation error compared to a recent indirect thickness estimation method as well as a relative Z coordinate estimation method. Furthermore, we have made the first dataset of ssSEM images with directly measured section thickness values publicly available for the evaluation of indirect thickness estimation methods.	翻訳日:2023-01-05 00:47:20 公開日:2020-02-04
# 視覚知覚改善のための水中画像の同時強調と超解像 Simultaneous Enhancement and Super-Resolution of Underwater Imagery for Improved Visual Perception ( http://arxiv.org/abs/2002.01155v1 ) ライセンス: Link先を確認	Md Jahidul Islam, Peigen Luo and Junaed Sattar	(参考訳) 本稿では,水中ロボットビジョンにおけるsesr(single enhancement and super- resolution)問題を紹介し,その解決法を提案する。本稿では,2倍,3倍,あるいは4倍の空間分解能で知覚的画質の復元を学習できる残差ネットワークベース生成モデルであるDeep SESRを提案する。本研究では,彩度特異的な水中色劣化,画像のシャープさの欠如,高レベル特徴表現の損失に対処するマルチモーダル目的関数を定式化することにより,そのトレーニングを監督する。また、画像の突出した前景領域を学習するために監督され、ネットワークがグローバルなコントラスト強化を学ぶためのガイドとなる。我々は、高速な推論のための共有階層的特徴空間上で、相性予測とSESRを共同で学習するエンドツーエンドのトレーニングパイプラインを設計する。さらに,大規模sesr学習を容易にする最初のデータセットであるufo-120を提案する。 UFO-120や他の標準データセットの徹底的な実験的評価により、Deep SESRは水中画像の強調や超高解像度化のために既存のソリューションよりも優れていることを示す。また,様々なスペクトル・空間劣化レベルを持つ水中画像や,目に見えない自然物を含む地上画像を含むいくつかのテストケースにおいて,その一般化性能を検証する。最後に,単板配置の計算可能性を分析し,視覚誘導水中ロボットの操作性を示す。モデルとデータセット情報は、https://github.com/xahidbuffon/Deep-SESR.comで提供される。 In this paper, we introduce and tackle the simultaneous enhancement and super-resolution (SESR) problem for underwater robot vision and provide an efficient solution for near real-time applications. We present Deep SESR, a residual-in-residual network-based generative model that can learn to restore perceptual image qualities at 2x, 3x, or 4x higher spatial resolution. We supervise its training by formulating a multi-modal objective function that addresses the chrominance-specific underwater color degradation, lack of image sharpness, and loss in high-level feature representation. It is also supervised to learn salient foreground regions in the image, which in turn guides the network to learn global contrast enhancement. We design an end-to-end training pipeline to jointly learn the saliency prediction and SESR on a shared hierarchical feature space for fast inference. Moreover, we present UFO-120, the first dataset to facilitate large-scale SESR learning; it contains over 1500 training samples and a benchmark test set of 120 samples. By thorough experimental evaluation on the UFO-120 and other standard datasets, we demonstrate that Deep SESR outperforms the existing solutions for underwater image enhancement and super-resolution. We also validate its generalization performance on several test cases that include underwater images with diverse spectral and spatial degradation levels, and also terrestrial images with unseen natural objects. Lastly, we analyze its computational feasibility for single-board deployments and demonstrate its operational benefits for visually-guided underwater robots. The model and dataset information will be available at: https://github.com/xahidbuffon/Deep-SESR.	翻訳日:2023-01-04 03:46:06 公開日:2020-02-04
# スパースサンプリングによる原子スケールSTEM-EELS画像の高速再構成 Fast reconstruction of atomic-scale STEM-EELS images from sparse sampling ( http://arxiv.org/abs/2002.01225v1 ) ライセンス: Link先を確認	Etienne Monier, Thomas Oberlin, Nathalie Brun, Xiaoyan Li, Marcel Tenc\'e, Nicolas Dobigeon	(参考訳) 本稿では、走査型透過電子顕微鏡(STEM)の取得を加速するために、部分サンプリング分光像の再構成について論じる。画像再構成の問題は、多くの画像モダリティの文献で広く検討されているが、STEM電子エネルギー損失分光法(EELS)によって取得されたスペクトル画像などの3Dデータを扱う試みはわずかである。また, 顕微鏡文献で提案されている手法のうち, 一部は高速であるが不正確であり, 一部は正確な再構成を行うが, 高計算負荷の費用がかかる。したがって,提案手法のいずれも精度と計算複雑性の面での期待を満たさない。本稿では,原子スケールEELSに適した高速かつ高精度な再構成手法を提案する。この方法は、STEM-EELS画像上で初めて使用されるベータプロセスファクター解析(BPFA)のような一般的なソリューションと比較される。 real as合成データに基づく実験を行う。 This paper discusses the reconstruction of partially sampled spectrum-images to accelerate the acquisition in scanning transmission electron microscopy (STEM). The problem of image reconstruction has been widely considered in the literature for many imaging modalities, but only a few attempts handled 3D data such as spectral images acquired by STEM electron energy loss spectroscopy (EELS). Besides, among the methods proposed in the microscopy literature, some are fast but inaccurate while others provide accurate reconstruction but at the price of a high computation burden. Thus none of the proposed reconstruction methods fulfills our expectations in terms of accuracy and computation complexity. In this paper, we propose a fast and accurate reconstruction method suited for atomic-scale EELS. This method is compared to popular solutions such as beta process factor analysis (BPFA) which is used for the first time on STEM-EELS images. Experiments based on real as synthetic data will be conducted.	翻訳日:2023-01-04 03:45:38 公開日:2020-02-04
# 畳み込み層による畳み込み型画像共有のプライバシー保護 Privacy-Preserving Image Sharing via Sparsifying Layers on Convolutional Groups ( http://arxiv.org/abs/2002.01469v1 ) ライセンス: Link先を確認	Sohrab Ferdowsi, Behrooz Razeghi, Taras Holotyak, Flavio P. Calmon, Slava Voloshynovskiy	(参考訳) 本稿では,大規模な設定において,プライバシーに配慮した画像共有の問題に対処する実践的な枠組みを提案する。コンパクト性は常に大規模に求められていますが、プライバシーに敏感なコンテンツをさらに保護しようとすると、このニーズはさらに深刻になります。そこで我々は、画像のエンコードを行い、一方から、表現はプライバシー保護の膨大なコストを払わずにパブリックドメインに格納されるが、それゆえに、攻撃者に対して組合せ的探索的な推測機構が利用できない限り、画像から識別可能なコンテンツが漏れないようにした。一方、認証されたユーザには、セキュアに維持できる非常にコンパクトなキーが提供されている。これは、対応するアクセスグラインド画像の曖昧化と再構築に使用できる。我々は、画像の異なる属性を再構築する責任を負う複数のコンパクトコードを提供しながら、機能マップをスパース化変換を通じて独立に渡す、我々の設計の畳み込みオートエンコーダでこれを達成する。このフレームワークは、公開実装が利用可能な大規模な画像データベース上でテストされる。 We propose a practical framework to address the problem of privacy-aware image sharing in large-scale setups. We argue that, while compactness is always desired at scale, this need is more severe when trying to furthermore protect the privacy-sensitive content. We therefore encode images, such that, from one hand, representations are stored in the public domain without paying the huge cost of privacy protection, but ambiguated and hence leaking no discernible content from the images, unless a combinatorially-expensive guessing mechanism is available for the attacker. From the other hand, authorized users are provided with very compact keys that can easily be kept secure. This can be used to disambiguate and reconstruct faithfully the corresponding access-granted images. We achieve this with a convolutional autoencoder of our design, where feature maps are passed independently through sparsifying transformations, providing multiple compact codes, each responsible for reconstructing different attributes of the image. The framework is tested on a large-scale database of images with public implementation available.	翻訳日:2023-01-04 03:45:23 公開日:2020-02-04
# 特徴精製に基づく畳み込みニューラルネットワークを用いた単一タスクの運動画像分類 Motor Imagery Classification of Single-Arm Tasks Using Convolutional Neural Network based on Feature Refining ( http://arxiv.org/abs/2002.01122v1 ) ライセンス: Link先を確認	Byeong-Hoo Lee, Ji-Hoon Jeong, Kyung-Hwan Shim, Dong-Joo Kim	(参考訳) 脳コンピュータインタフェース(BCI)は、ユーザの意図とステータスを理解するために脳信号をデコードする。単純で安全なデータ取得プロセスのため、脳波(EEG)は非侵襲的BCIで一般的に用いられる。 eegパラダイムの一つであるmotor image (mi) は、信号起源による運動機能の回復やリハビリによく用いられる。しかし、脳波信号は振動・非定常信号であり、MIを正確に収集・分類することが困難である。本研究では,2つの畳み込みブロックからなるbfr-cnn(band-power feature refining convolutional neural network)を提案する。脳波信号を収集し、単一アームの運動想像力を含むMIデータセットを作成しました。提案手法は従来の4クラスmiタスク分類よりも優れている。そこで我々は,BFR-CNNを用いた脳波信号のみを用いて,ユーザ意図の復号化が可能であることを実証した。 Brain-computer interface (BCI) decodes brain signals to understand user intention and status. Because of its simple and safe data acquisition process, electroencephalogram (EEG) is commonly used in non-invasive BCI. One of EEG paradigms, motor imagery (MI) is commonly used for recovery or rehabilitation of motor functions due to its signal origin. However, the EEG signals are an oscillatory and non-stationary signal that makes it difficult to collect and classify MI accurately. In this study, we proposed a band-power feature refining convolutional neural network (BFR-CNN) which is composed of two convolution blocks to achieve high classification accuracy. We collected EEG signals to create MI dataset contained the movement imagination of a single-arm. The proposed model outperforms conventional approaches in 4-class MI tasks classification. Hence, we demonstrate that the decoding of user intention is possible by using only EEG signals with robust performance using BFR-CNN.	翻訳日:2023-01-04 03:44:44 公開日:2020-02-04
# 乱流予測におけるリカレントニューラルネットワークの利用について On the use of recurrent neural networks for predictions of turbulent flows ( http://arxiv.org/abs/2002.01222v1 ) ライセンス: Link先を確認	Luca Guastoni, Prem A. Srinivasan, Hossein Azizpour, Philipp Schlatter and Ricardo Vinuesa	(参考訳) 本稿では,Moehlis {\it et al. による近接壁乱流の低次モデルを用いて,リカレントニューラルネットワークの予測能力を評価する。ニュー・J・フィス(New J. Phys)。 56, 2004) である。その結果, 適切に訓練されたlong short-term memory (lstm) ネットワークを用いて, 乱流統計量と流れの動的挙動の優れた予測が可能となり, 平均の相対誤差とゆらぎは1-%$以下となることがわかった。また,流れの瞬時予測のみに基づく損失関数の使用は乱流統計学において最良の予測にはならない可能性があり,計算された統計学に基づいて停止基準を定義する必要がある。さらに、瞬時に予測されるだけでなく、流れの平均的な振る舞いを含むより洗練された損失関数は、より高速なニューラルネットワークトレーニングをもたらす可能性がある。 In this paper, the prediction capabilities of recurrent neural networks are assessed in the low-order model of near-wall turbulence by Moehlis {\it et al.} (New J. Phys. {\bf 6}, 56, 2004). Our results show that it is possible to obtain excellent predictions of the turbulence statistics and the dynamic behavior of the flow with properly trained long short-term memory (LSTM) networks, leading to relative errors in the mean and the fluctuations below $1\%$. We also observe that using a loss function based only on the instantaneous predictions of the flow may not lead to the best predictions in terms of turbulence statistics, and it is necessary to define a stopping criterion based on the computed statistics. Furthermore, more sophisticated loss functions, including not only the instantaneous predictions but also the averaged behavior of the flow, may lead to much faster neural network training.	翻訳日:2023-01-04 03:44:29 公開日:2020-02-04
# 脳波信号を用いたてんかん発作予測のための機械学習 Machine Learning for Predicting Epileptic Seizures Using EEG Signals: A Review ( http://arxiv.org/abs/2002.01925v1 ) ライセンス: Link先を確認	Khansa Rasheed, Adnan Qayyum, Junaid Qadir, Shobi Sivathamboo, Patrick Kwan, Levin Kuhlmann, Terence O'Brien, and Adeel Razi	(参考訳) 人工知能(AI)と機械学習(ML)技術の進歩により、研究者たちは、これらの技術を用いて臨床実践の進歩を目指している。医療の主要な目的の1つは、予防的介入をタイムリーに提供する病気の早期発見と予測である。これは特にてんかんの症例であり、再発と予測不能な発作が特徴である。患者は、何らかの形で事前に予測できた場合、てんかん発作の副作用を軽減できる。数十年の研究にもかかわらず、発作の予測は未解決の問題である。これは少なくとも、問題の解決に十分な量のデータが不足しているためである。 MLベースのアルゴリズムには、てんかん発作の早期かつ正確な予測においてパラダイムシフトをもたらす可能性がある、エキサイティングな新しい展開がある。本稿では,脳波信号を用いた発作早期予測における最先端ML手法の総合的なレビューを行う。私たちは現在の研究におけるギャップ、課題、落とし穴を特定し、今後の方向性を推奨します。 With the advancement in artificial intelligence (AI) and machine learning (ML) techniques, researchers are striving towards employing these techniques for advancing clinical practice. One of the key objectives in healthcare is the early detection and prediction of disease to timely provide preventive interventions. This is especially the case for epilepsy, which is characterized by recurrent and unpredictable seizures. Patients can be relieved from the adverse consequences of epileptic seizures if it could somehow be predicted in advance. Despite decades of research, seizure prediction remains an unsolved problem. This is likely to remain at least partly because of the inadequate amount of data to resolve the problem. There have been exciting new developments in ML-based algorithms that have the potential to deliver a paradigm shift in the early and accurate prediction of epileptic seizures. Here we provide a comprehensive review of state-of-the-art ML techniques in early prediction of seizures using EEG signals. We will identify the gaps, challenges, and pitfalls in the current research and recommend future directions.	翻訳日:2023-01-04 03:43:59 公開日:2020-02-04
# 話者手がかりを用いた感情認識 Emotion Recognition Using Speaker Cues ( http://arxiv.org/abs/2002.03566v1 ) ライセンス: Link先を確認	Ismail Shahin	(参考訳) 本研究の目的は、話者手がかりを用いて未知の感情を特定することである。本研究では,2段階の枠組みを用いて未知の感情を同定する。第1段階は未知の感情を発話する話者を特定することに焦点を当て、第2段階は認識された話者が前段で発する未知の感情を特定することに焦点を当てている。提案手法はアラビア語Emirati-accented speech databaseで男女15人を対象に評価されている。抽出した特徴としてMel-Frequency Cepstral Coefficients (MFCCs) が用いられ、本研究ではHidden Markov Model (HMM) が分類器として利用されている。その結果,2段階の枠組みに基づく感情認識精度は,ガウス混合モデル (GMM) やサポートベクトルマシン (SVM) ,ベクトル量子化 (VQ) など,一段階のアプローチと最先端の分類器に基づくものよりも高いことがわかった。 2段階のアプローチに基づく平均感情認識精度は67.5%であり、それぞれ1段階のアプローチであるGMM、SVM、VQに基づいて61.4%、63.3%、64.5%、61.5%に達する。 2段階の枠組みに基づいて達成された結果は、人間の聴取者による主観的評価に非常に近い。 This research aims at identifying the unknown emotion using speaker cues. In this study, we identify the unknown emotion using a two-stage framework. The first stage focuses on identifying the speaker who uttered the unknown emotion, while the next stage focuses on identifying the unknown emotion uttered by the recognized speaker in the prior stage. This proposed framework has been evaluated on an Arabic Emirati-accented speech database uttered by fifteen speakers per gender. Mel-Frequency Cepstral Coefficients (MFCCs) have been used as the extracted features and Hidden Markov Model (HMM) has been utilized as the classifier in this work. Our findings demonstrate that emotion recognition accuracy based on the two-stage framework is greater than that based on the one-stage approach and the state-of-the-art classifiers and models such as Gaussian Mixture Model (GMM), Support Vector Machine (SVM), and Vector Quantization (VQ). The average emotion recognition accuracy based on the two-stage approach is 67.5%, while the accuracy reaches to 61.4%, 63.3%, 64.5%, and 61.5%, based on the one-stage approach, GMM, SVM, and VQ, respectively. The achieved results based on the two-stage framework are very close to those attained in subjective assessment by human listeners.	翻訳日:2023-01-04 03:43:46 公開日:2020-02-04
# 結合CNNを用いたハイパースペクトルとLiDARデータの分類 Classification of Hyperspectral and LiDAR Data Using Coupled CNNs ( http://arxiv.org/abs/2002.01144v1 ) ライセンス: Link先を確認	Renlong Hang, Zhu Li, Pedram Ghamisi, Danfeng Hong, Guiyu Xia, and Qingshan Liu	(参考訳) 本稿では,2つの結合畳み込みニューラルネットワーク(CNN)を用いて,高スペクトルと光検出・ラング(LiDAR)データを融合する,効率的かつ効率的なフレームワークを提案する。 1つのCNNは、ハイパースペクトルデータからスペクトル空間的特徴を学習するために設計され、もう1つはLiDARデータから標高情報を取得するために使用される。どちらも3つの畳み込み層で構成され、最後の2つの畳み込み層はパラメータ共有戦略を介して結合される。融合相では、これらの不均一な特徴を十分に統合するために、特徴レベルおよび決定レベル融合法が同時に使用される。機能レベルの融合については,連結戦略,最大化戦略,累積戦略を含む3つの異なる融合戦略が評価される。決定レベルの融合では、各出力の分類精度によって重み付けを決定する重み付け和戦略が採用される。提案モデルは、アメリカ合衆国ヒューストンで取得した都市データセットと、イタリアのトレントで取得した農村データセットに基づいて評価される。ヒューストンのデータでは、我々のモデルは96.03%の精度で新しい記録を達成できる。 Trentoのデータでは、全体的な精度は99.12%である。これらの結果は,提案モデルの有効性を十分に証明する。 In this paper, we propose an efficient and effective framework to fuse hyperspectral and Light Detection And Ranging (LiDAR) data using two coupled convolutional neural networks (CNNs). One CNN is designed to learn spectral-spatial features from hyperspectral data, and the other one is used to capture the elevation information from LiDAR data. Both of them consist of three convolutional layers, and the last two convolutional layers are coupled together via a parameter sharing strategy. In the fusion phase, feature-level and decision-level fusion methods are simultaneously used to integrate these heterogeneous features sufficiently. For the feature-level fusion, three different fusion strategies are evaluated, including the concatenation strategy, the maximization strategy, and the summation strategy. For the decision-level fusion, a weighted summation strategy is adopted, where the weights are determined by the classification accuracy of each output. The proposed model is evaluated on an urban data set acquired over Houston, USA, and a rural one captured over Trento, Italy. On the Houston data, our model can achieve a new record overall accuracy of 96.03%. On the Trento data, it achieves an overall accuracy of 99.12%. These results sufficiently certify the effectiveness of our proposed model.	翻訳日:2023-01-04 03:38:05 公開日:2020-02-04
# ブロック強度と勾配差(BIGD)記述子を用いたテクスチャ分類 Texture Classification using Block Intensity and Gradient Difference (BIGD) Descriptor ( http://arxiv.org/abs/2002.01154v1 ) ライセンス: Link先を確認	Yuting Hu, Zhen Wang, and Ghassan AlRegib	(参考訳) 本稿では,ブロック強度と勾配差(BIGD)という,効率的で独特な局所記述子を提案する。画像パッチでは、マルチスケールブロックペアをランダムにサンプリングし、各ブロックの強度と勾配の違いを利用してローカルなbigdディスクリプタを構築する。ランダムサンプリング戦略とマルチスケールフレームワークは、bigdディスクリプタが異なる方向と空間的な粒度レベルでのパッチの特徴的なパターンを捉えるのに役立つ。局所集約ディスクリプタ(VLAD)や改良されたフィッシャーベクトル(IFV)のベクトルを用いて、局所的なBIGDディスクリプタをフルイメージディスクリプタにエンコードし、その後、テクスチャ分類のための線形サポートベクタマシン(SVM)分類器に入力する。提案する記述子は,Brodatz,CUReT,KTH-TIPS,KTH-TIPS-2a,-2bを含む5つの公的なテクスチャデータセットに対して,その分類性能を評価することで,その特徴と現状を比較した。実験結果から, 識別力の強いBIGDディスクリプタは, 最先端テクスチャディスクリプタ, 密度マイクロブロック差 (DMD) よりも0.12%～6.43%高い分類精度が得られた。 In this paper, we present an efficient and distinctive local descriptor, namely block intensity and gradient difference (BIGD). In an image patch, we randomly sample multi-scale block pairs and utilize the intensity and gradient differences of pairwise blocks to construct the local BIGD descriptor. The random sampling strategy and the multi-scale framework help BIGD descriptors capture the distinctive patterns of patches at different orientations and spatial granularity levels. We use vectors of locally aggregated descriptors (VLAD) or improved Fisher vector (IFV) to encode local BIGD descriptors into a full image descriptor, which is then fed into a linear support vector machine (SVM) classifier for texture classification. We compare the proposed descriptor with typical and state-of-the-art ones by evaluating their classification performance on five public texture data sets including Brodatz, CUReT, KTH-TIPS, and KTH-TIPS-2a and -2b. Experimental results show that the proposed BIGD descriptor with stronger discriminative power yields 0.12% ~ 6.43% higher classification accuracy than the state-of-the-art texture descriptor, dense microblock difference (DMD).	翻訳日:2023-01-04 03:37:47 公開日:2020-02-04
# 畳み込みニューラルネットワークを用いた下水道映像の妨害レベル検出 Obstruction level detection of sewer videos using convolutional neural networks ( http://arxiv.org/abs/2002.01284v1 ) ライセンス: Link先を確認	Mario A. Gutierrez-Mondragon, Dario Garcia-Gasulla, Sergio Alvarez-Napagao, Jaume Brossa-Ordo\~nez and Rafael Gimenez-Esteban	(参考訳) 下水道網は、排水を中央処理場に輸送して処理し、環境に戻すように設計されている。このプロセスは現在の社会にとって重要であり、水性疾患を予防し、安全な飲料水を提供し、一般的な衛生を強化する。下水道網の完全運用を維持するため、常にサンプリング検査を行い、障害を識別する。通常、閉鎖回路テレビシステムはパイプの内部を記録し、妨害レベルを報告するために使われ、掃除作業が引き起こされる可能性がある。現在、障害レベルのアセスメントは手動で行われており、時間がかかり、一貫性がない。本研究では,パイプの閉塞レベルを特定するために畳み込みニューラルネットワークを訓練する手法を設計し,このような頻繁かつ反復的な作業に要する人的労力を削減する。私たちは、モデルに入力するための有用なフレームを生成するために、探索および適応されたビデオのデータベースを集めました。結果として得られた分類器は、デプロイ可能なパフォーマンスを得る。このアプローチの一貫性と工業的適用性を検証するために,階層的適合性伝達説明可能性技術を統合することで,ニューラルネットワークの動作をより深く理解することができる。最後に, 提案システムにより, 下水道試験における速度, 精度, 整合性を向上することができる。私たちの分析では、データ収集方法論のさらなる品質向上に関するガイドラインも公開しています。 Worldwide, sewer networks are designed to transport wastewater to a centralized treatment plant to be treated and returned to the environment. This process is critical for the current society, preventing waterborne illnesses, providing safe drinking water and enhancing general sanitation. To keep a sewer network perfectly operational, sampling inspections are performed constantly to identify obstructions. Typically, a Closed-Circuit Television system is used to record the inside of pipes and report the obstruction level, which may trigger a cleaning operative. Currently, the obstruction level assessment is done manually, which is time-consuming and inconsistent. In this work, we design a methodology to train a Convolutional Neural Network for identifying the level of obstruction in pipes, thus reducing the human effort required on such a frequent and repetitive task. We gathered a database of videos that are explored and adapted to generate useful frames to fed into the model. Our resulting classifier obtains deployment ready performances. To validate the consistency of the approach and its industrial applicability, we integrate the Layer-wise Relevance Propagation explainability technique, which enables us to further understand the behavior of the neural network for this task. In the end, the proposed system can provide higher speed, accuracy, and consistency in the process of sewer examination. Our analysis also uncovers some guidelines on how to further improve the quality of the data gathering methodology.	翻訳日:2023-01-04 03:37:18 公開日:2020-02-04
# tfp.mcmc: 現代のハードウェア用に作られた現代のマルコフチェーンモンテカルロツール tfp.mcmc: Modern Markov Chain Monte Carlo Tools Built for Modern Hardware ( http://arxiv.org/abs/2002.01184v1 ) ライセンス: Link先を確認	Junpeng Lao, Christopher Suter, Ian Langmore, Cyril Chimisov, Ashish Saxena, Pavel Sountsov, Dave Moore, Rif A. Saurous, Matthew D. Hoffman, and Joshua V. Dillon	(参考訳) マルコフ連鎖モンテカルロ(mcmc)は20世紀の最も重要なアルゴリズムの1つと見なされている。非正規化確率関数のみを用いた漸近収束、安定性、および推定子分散境界の保証は確率計画に不可欠である。本稿では、tensorflow probability mcmc toolkitを紹介し、その設計の動機となるいくつかの考察について述べる。 Markov chain Monte Carlo (MCMC) is widely regarded as one of the most important algorithms of the 20th century. Its guarantees of asymptotic convergence, stability, and estimator-variance bounds using only unnormalized probability functions make it indispensable to probabilistic programming. In this paper, we introduce the TensorFlow Probability MCMC toolkit, and discuss some of the considerations that motivated its design.	翻訳日:2023-01-04 03:36:56 公開日:2020-02-04
# ベイズ最適化のための不確かさ定量化 Uncertainty Quantification for Bayesian Optimization ( http://arxiv.org/abs/2002.01569v1 ) ライセンス: Link先を確認	Rui Tuo, Wenjia Wang	(参考訳) ベイズ最適化はグローバル最適化手法のクラスである。これは、基礎となる目的函数をガウス過程の実現と見なしている。ベイズ最適化の出力はガウス過程の仮定に従ってランダムであるが、この不確実性の定量化は文献ではほとんど研究されていない。本研究では,最大点や目的関数の値の信頼領域を構築するという観点から,ベイズ最適化アルゴリズムの出力不確実性を評価するための新しい手法を提案する。これらの領域は効率的に計算でき、その信頼度レベルは逐次ガウス過程回帰のために新しく開発された一様誤差境界によって保証される。本理論は、既存の全ての逐次サンプリングポリシーと停止基準の統一不確実性定量化フレームワークを提供する。 Bayesian optimization is a class of global optimization techniques. It regards the underlying objective function as a realization of a Gaussian process. Although the outputs of Bayesian optimization are random according to the Gaussian process assumption, quantification of this uncertainty is rarely studied in the literature. In this work, we propose a novel approach to assess the output uncertainty of Bayesian optimization algorithms, in terms of constructing confidence regions of the maximum point or value of the objective function. These regions can be computed efficiently, and their confidence levels are guaranteed by newly developed uniform error bounds for sequential Gaussian process regression. Our theory provides a unified uncertainty quantification framework for all existing sequential sampling policies and stopping criteria.	翻訳日:2023-01-04 03:36:50 公開日:2020-02-04
# REAK: エラーレートに基づく適応リグによる信頼性解析 REAK: Reliability analysis through Error rate-based Adaptive Kriging ( http://arxiv.org/abs/2002.01110v1 ) ライセンス: Link先を確認	Zeyu Wang and Abdollah Shafieezadeh	(参考訳) 様々な分野のモデルが複雑になりつつあるため、関連する計算要求は大幅に増大している。これらのシステムの障害確率が小さい場合の信頼性解析は極めて困難であり、大量のコストシミュレーションを必要とする。この課題に対処するために、Error rate-based Adaptive Kriging (REAK) による信頼性解析を提案する。ここでは、リンデベルク条件に基づく中央極限定理の拡張を用いて、間違った符号推定値を持つ設計サンプル数の分布を導出し、失敗確率推定値の最大誤差率を決定する。このエラーレートは、設計サンプルの戦略的生成のための適応スキームの各段階における効果的なサンプリング領域の最適な確立を可能にする。さらに、信頼性解析の停止基準として使用される故障確率推定の目標精度の設定を容易にする。これらの能力は、洗練された計算要求モデルへの呼び出し数を著しく削減することができる。非線形性と次元の異なる4つの例に対するREAKの適用について述べる。その結果,モンテカルロシミュレーション (ak-mcs) と逐次kriging reliability analysis (iskra) の改善による適応krigingの最先端手法と比較して,reakは計算需要を最大50%削減できることがわかった。 As models in various fields are becoming more complex, associated computational demands have been increasing significantly. Reliability analysis for these systems when failure probabilities are small is significantly challenging, requiring a large number of costly simulations. To address this challenge, this paper introduces Reliability analysis through Error rate-based Adaptive Kriging (REAK). An extension of the Central Limit Theorem based on Lindeberg condition is adopted here to derive the distribution of the number of design samples with wrong sign estimate and subsequently determine the maximum error rate for failure probability estimates. This error rate enables optimal establishment of effective sampling regions at each stage of an adaptive scheme for strategic generation of design samples. Moreover, it facilitates setting a target accuracy for failure probability estimation, which is used as stopping criterion for reliability analysis. These capabilities together can significantly reduce the number of calls to sophisticated, computationally demanding models. The application of REAK for four examples with varying extent of nonlinearity and dimension is presented. Results indicate that REAK is able to reduce the computational demand by as high as 50% compared to state-of-the-art methods of Adaptive Kriging with Monte Carlo Simulation (AK-MCS) and Improved Sequential Kriging Reliability Analysis (ISKRA).	翻訳日:2023-01-04 03:36:04 公開日:2020-02-04
# リカレントニューラルネットワークを用いたチャネル状態情報を用いたセンチメートルレベルの屋内定位 Centimeter-Level Indoor Localization using Channel State Information with Recurrent Neural Networks ( http://arxiv.org/abs/2002.01411v1 ) ライセンス: Link先を確認	Jianyuan Yu, R. Michael Buehrer	(参考訳) モノのインターネットや自動運転の現代的な技術は、より正確な位置決めを必要とする。古典的な位置技術は主に屋外のシナリオに適応するが、複数の経路を持つ屋内のケースの要件を満たさない。一方、ノイズや時間変化にロバストな特徴として、チャネル状態情報(csi)はより正確な測位において、受信信号強度指標(rssi)よりも優れていることが示されている。そこで本稿では,線形アンテナから収集した実CSIデータを用いて,センチメートルレベルの屋内位置推定を行うニューラルネットワーク手法を提案する。チャネル応答の振幅または相関行列を入力として使用することにより、データサイズを大幅に削減し、ノイズを抑制することができる。また、リカレントニューラルネットワーク(RNN)と信号雑音比(SNR)情報によるユーザ動作軌跡の整合性を利用して、特に小型データ学習における推定精度をさらに向上させることができる。これらの貢献はすべて、他の古典的な教師付き学習方法の結果に基づいて、ニューラルネットワークの効率に恩恵をもたらします。 Modern techniques in the Internet of Things or autonomous driving require more accuracy positioning ever. Classic location techniques mainly adapt to outdoor scenarios, while they do not meet the requirement of indoor cases with multiple paths. Meanwhile as a feature robust to noise and time variations, Channel State Information (CSI) has shown its advantages over Received Signal Strength Indicator (RSSI) at more accurate positioning. To this end, this paper proposes the neural network method to estimate the centimeter-level indoor positioning with real CSI data collected from linear antennas. It utilizes an amplitude of channel response or a correlation matrix as the input, which can highly reduce the data size and suppress the noise. Also, it makes use of the consistency in the user motion trajectory via Recurrent Neural Network (RNN) and signal-noise ratio (SNR) information, which can further improve the estimation accuracy, especially in small datasize learning. These contributions all benefit the efficiency of the neural network, based on the results with other classic supervised learning methods.	翻訳日:2023-01-04 03:35:00 公開日:2020-02-04
# 深層学習による公共空間利用の計測:デトロイト川流域におけるベンチマーク研究 Measuring the Utilization of Public Open Spaces by Deep Learning: a Benchmark Study at the Detroit Riverfront ( http://arxiv.org/abs/2002.01461v1 ) ライセンス: Link先を確認	Peng Sun, Rui Hou, Jerome Lynch	(参考訳) 身体活動と社会的相互作用は、健康的なライフスタイルを保証する重要な活動である。公園、広場、緑道などの公共のオープンスペース(POS)は、これらの活動を促進する重要な環境である。 POSを評価するためには、その内部の施設をどのように利用するかを研究する必要がある。しかし、POSの利用を研究する従来のアプローチは手作業であり、時間と労力が集中している。質的な洞察のみを提供することもある。監視カメラの活用や,コンピュータビジョンによるユーザ関連情報の抽出が望まれている。本稿では,posにおけるヒューマンアクティビティを定量的に計測するための概念実証型コンピュータビジョンフレームワークを提案し,デトロイト・リバーフロント・コンサージェンシー(drfc)監視カメラネットワークを用いた提案フレームワークのケーススタディを示す。カスタムイメージデータセットはフレームワークをトレーニングするために提示され、データセットには、様々な照明条件下でDRFC公園空間の18のカメラから収集された7826の完全な注釈付き画像が含まれている。データセット分析と、ワンステップユーザのローカライズとアクティビティ認識のためのベースラインモデルも提供されている。 mAPの結果は, 歩行者検出では77.5\%, サイクリスト検出では81.6\%であった。動作マップはフレームワークによって自律的に生成され、異なるposユーザを見つけ出し、行動のローカライズに対する平均誤差は10cm以内である。 Physical activities and social interactions are essential activities that ensure a healthy lifestyle. Public open spaces (POS), such as parks, plazas and greenways, are key environments that encourage those activities. To evaluate a POS, there is a need to study how humans use the facilities within it. However, traditional approaches to studying use of POS are manual and therefore time and labor intensive. They also may only provide qualitative insights. It is appealing to make use of surveillance cameras and to extract user-related information through computer vision. This paper proposes a proof-of-concept deep learning computer vision framework for measuring human activities quantitatively in POS and demonstrates a case study of the proposed framework using the Detroit Riverfront Conservancy (DRFC) surveillance camera network. A custom image dataset is presented to train the framework; the dataset includes 7826 fully annotated images collected from 18 cameras across the DRFC park space under various illumination conditions. Dataset analysis is also provided as well as a baseline model for one-step user localization and activity recognition. The mAP results are 77.5\% for {\it pedestrian} detection and 81.6\% for {\it cyclist} detection. Behavioral maps are autonomously generated by the framework to locate different POS users and the average error for behavioral localization is within 10 cm.	翻訳日:2023-01-04 03:28:27 公開日:2020-02-04
# モノイド上の確率的オートマトンについて On Stochastic Automata over Monoids ( http://arxiv.org/abs/2002.01214v1 ) ライセンス: Link先を確認	Karl-Heinz Zimmermann, Merve Nur Cakir	(参考訳) 入力集合としてのモノイド上の確率オートマトンを研究する。これらのオートマトンの定義性は、自由モノイドの固有の普遍性を置き換える拡張条件を必要とする。ツラカイネンの結果の一般化として、モノイド上の一般化されたオートマトンはその確率的対象と同じ受容力を持つことを示す。準同型の鍵は、入力状態のモノイド準同型と遷移行列のモノイド準同型の間の可換性である。確率オートマトンがモノイド上で受容する言語のクロージャ特性について検討した。マトリックス確率オートマトンがモノイド上で受容する言語のクロージャ特性について検討した。 Stochastic automata over monoids as input sets are studied. The well-definedness of these automata requires an extension postulate that replaces the inherent universal property of free monoids. As a generalization of Turakainen's result, it will be shown that the generalized automata over monoids have the same acceptance power as their stochastic counterparts. The key to homomorphisms is a commuting property between the monoid homomorphism of input states and the monoid homomorphism of transition matrices. Closure properties of the languages accepted by stochastic automata over monoids are investigated. matrices. Closure properties of the languages accepted by stochastic automata over monoids are investigated.	翻訳日:2023-01-04 03:27:42 公開日:2020-02-04
# Ethics Codes Onって誰の側? 権力と責任と社会的利益は Whose Side are Ethics Codes On? Power, Responsibility and the Social Good ( http://arxiv.org/abs/2002.01559v1 ) ライセンス: Link先を確認	Anne L. Washington, Rachel S. Kuo	(参考訳) 倫理規範の道徳的権威は、それらが統一社会に仕えるという仮定に由来するが、これは共有資源の政治的側面を無視している。社会学者ハワード・S・ベッカー(Howard S. Becker)は古典的なエッセイ『Whose Side Are We On』で、研究者にその力と責任を明確にするよう求めた。ベッカーの信頼の階層に基づいて,データ倫理規範の批判的言説分析と,データ技術の「社会的善」という概念化について報告する。分析の結果、企業や専門団体の倫理規定が消費者を社会と混同し、機関でほとんど沈黙していたことが明らかとなった。デジタル時代の社会変化に関するコミュニティオーガナイザへのインタビューは分析を補完し、疎外化コミュニティの懸念に対する技術的な解決策の限界を克服した。文書と生活経験の間の溝を浮かび上がらせる証拠を考えると、我々は消費者を高める倫理規定が同時に脆弱な人口のニーズに従属する可能性があると論じる。競合するデジタルリソースを理解することは、公益技術の新興分野の中心である。本稿では,デジタルディファレンシャル脆弱性の概念を導入し,データ技術における有害な暴露を説明するとともに,将来的な倫理規定に対する勧告を提案する。 The moral authority of ethics codes stems from an assumption that they serve a unified society, yet this ignores the political aspects of any shared resource. The sociologist Howard S. Becker challenged researchers to clarify their power and responsibility in the classic essay: Whose Side Are We On. Building on Becker's hierarchy of credibility, we report on a critical discourse analysis of data ethics codes and emerging conceptualizations of beneficence, or the "social good", of data technology. The analysis revealed that ethics codes from corporations and professional associations conflated consumers with society and were largely silent on agency. Interviews with community organizers about social change in the digital era supplement the analysis, surfacing the limits of technical solutions to concerns of marginalized communities. Given evidence that highlights the gulf between the documents and lived experiences, we argue that ethics codes that elevate consumers may simultaneously subordinate the needs of vulnerable populations. Understanding contested digital resources is central to the emerging field of public interest technology. We introduce the concept of digital differential vulnerability to explain disproportionate exposures to harm within data technology and suggest recommendations for future ethics codes.	翻訳日:2023-01-04 03:26:42 公開日:2020-02-04
# 多段ポンプの多目的予測におけるデータ拡張型ニューラルネットワーク Neural network with data augmentation in multi-objective prediction of multi-stage pump ( http://arxiv.org/abs/2002.02402v1 ) ライセンス: Link先を確認	Hang Zhao	(参考訳) データ拡張を伴うニューラルネットワークに基づく多段階ポンプ法の多目的予測法を提案する。キー設計変数と遠心ポンプの外部特性値(ヘッドとパワー)の高非線形性について検討するために、ニューラルネットワークモデル(NN)を2次応答面モデル(RSF)、ラジアル基底ガウス応答面モデル(RBF)、クリギングモデル(KRG)と比較して構築する。単段遠心ポンプの数値モデル検証実験により,CFDに基づく数値モデルは非常に正確かつ公平であることが確認された。すべての予測モデルは、それぞれ設計範囲の3つのキー変数の異なる組み合わせの下で、60個のサンプルによって訓練される。 4つの予測モデルに基づく頭部とパワーの精度をcfdシミュレーション値と比較して解析した。その結果、ニューラルネットワークモデルは、他の3つのサロゲートモデルと比較して、すべての外部特性値において優れた性能を示すことがわかった。最後に,データ拡張(NNDA)に基づくニューラルネットワークモデルを提案し,シミュレーションコストが高すぎること,特にCFD問題におけるデータ不足を理由として提案する。データ拡張を伴うモデルは、異なる属性のサンプルポイントごとに補間することでデータを3倍にすることができる。データ拡張によるニューラルネットワークモデルの性能は,従来のニューラルネットワークモデルよりも優れていることを示す。したがって、NNの予測能力は、より多くのシミュレーションコストを伴わずに向上する。データ拡張により、次の最適化のために多段ポンプの最適化問題を解き、将来有限要素解析最適化問題に一般化する上で、より良い予測モデルとなる。 A multi-objective prediction method of multi-stage pump method based on neural network with data augmentation is proposed. In order to study the highly nonlinear relationship between key design variables and centrifugal pump external characteristic values (head and power), the neural network model (NN) is built in comparison with the quadratic response surface model (RSF), the radial basis Gaussian response surface model (RBF), and the Kriging model (KRG). The numerical model validation experiment of another type of single stage centrifugal pump showed that numerical model based on CFD is quite accurate and fair. All of prediction models are trained by 60 samples under the different combination of three key variables in design range respectively. The accuracy of the head and power based on the four predictions models are analyzed comparing with the CFD simulation values. The results show that the neural network model has better performance in all external characteristic values comparing with other three surrogate models. Finally, a neural network model based on data augmentation (NNDA) is proposed for the reason that simulation cost is too high and data is scarce in mechanical simulation field especially in CFD problems. The model with data augmentation can triple the data by interpolation at each sample point of different attributes. It shows that the performance of neural network model with data augmentation is better than former neural network model. Therefore, the prediction ability of NN is enhanced without more simulation costs. With data augmentation it can be a better prediction model used in solving the optimization problems of multistage pump for next optimization and generalized to finite element analysis optimization problems in future.	翻訳日:2023-01-04 03:26:21 公開日:2020-02-04
# ディープニューラルネットワークを用いたロバスト顔アライメントの多段階モデル Multistage Model for Robust Face Alignment Using Deep Neural Networks ( http://arxiv.org/abs/2002.01075v1 ) ライセンス: Link先を確認	Huabin Wang and Rui Cheng and Jian Zhou and Liang Tao and Hon Keung Kwan	(参考訳) 厳しいオクルージョンや大きなポーズのバリエーションのような制約のない条件を一般化する能力は、顔のアライメントで達成する上で難しい目標である。本稿では,空間変換器ネットワーク,時間ガラスネットワーク,および模範的形状制約を生かした,深層ニューラルネットワークに基づく多段階モデルを提案する。まず、畳み込み層と残留単位からなる空間変圧器生成逆ネットワークを用いて、顔検出器による初期化問題(回転やスケールの変動など)を解決し、顔アライメントのための顔境界ボックスの改善を図る。そして、積み重ねられた砂時計ネットワークを用いてランドマークの予備位置と対応するスコアを取得する。また,高得点者に基づいて低得点のランドマークを決定するために,exemplar-based shape dictionaryが設計されている。顔形状制約を組み込むことにより、閉塞や散在した背景による不整合ランドマークを大幅に改善することができる。提案手法は他の最先端手法よりも優れた性能を示すために,挑戦的ベンチマークデータセットに基づく広範囲な実験を行った。 An ability to generalize unconstrained conditions such as severe occlusions and large pose variations remains a challenging goal to achieve in face alignment. In this paper, a multistage model based on deep neural networks is proposed which takes advantage of spatial transformer networks, hourglass networks and exemplar-based shape constraints. First, a spatial transformer - generative adversarial network which consists of convolutional layers and residual units is utilized to solve the initialization issues caused by face detectors, such as rotation and scale variations, to obtain improved face bounding boxes for face alignment. Then, stacked hourglass network is employed to obtain preliminary locations of landmarks as well as their corresponding scores. In addition, an exemplar-based shape dictionary is designed to determine landmarks with low scores based on those with high scores. By incorporating face shape constraints, misaligned landmarks caused by occlusions or cluttered backgrounds can be considerably improved. Extensive experiments based on challenging benchmark datasets are performed to demonstrate the superior performance of the proposed method over other state-of-the-art methods.	翻訳日:2023-01-04 03:25:56 公開日:2020-02-04
# トップダウンアテンションを用いた選択的セグメンテーションネットワーク Selective Segmentation Networks Using Top-Down Attention ( http://arxiv.org/abs/2002.01125v1 ) ライセンス: Link先を確認	Mahdi Biparva, John Tsotsos	(参考訳) 畳み込みニューラルネットワーク(convolutional neural networks)は、ネットワーク階層の底にある入力感覚データの、視覚階層の上部にある意味情報への変換をモデル化する。フィードフォワード処理は、いくつかのオブジェクト認識タスクに十分である。トップダウンの選択はボトムアップのfeedforwardパスに加えて必要となる。部分的には、階層的特徴ピラミッドによって課される位置情報の喪失の欠点に対処することができる。本稿では,Top-Down選択ネットワークでBottom-Up \convnetsを拡張可能な,オブジェクトセグメンテーションのための統合2パスフレームワークを提案する。我々は,トップダウン選択ゲーティング活動を利用してボトムアップ隠れ動作のセグメンテーション予測を行う。ネットワークの両端におけるタスク要求を満たす損失項を持つエンドツーエンドのマルチタスクフレームワークを開発する。提案するベンチマークデータセットのネットワークをセマンティクスセグメンテーションのために評価し,トップダウン選択能力を有するネットワークがベースラインモデルを上回ることを示す。さらに、我々は新しいセグメンテーションパラダイムの優れた側面に光を当て、パラメトリックスキップ接続に純粋に依存するベースラインモデルよりも、新しいフレームワークの効率を質的かつ定量的に支援した。 Convolutional neural networks model the transformation of the input sensory data at the bottom of a network hierarchy to the semantic information at the top of the visual hierarchy. Feedforward processing is sufficient for some object recognition tasks. Top-Down selection is potentially required in addition to the Bottom-Up feedforward pass. It can, in part, address the shortcoming of the loss of location information imposed by the hierarchical feature pyramids. We propose a unified 2-pass framework for object segmentation that augments Bottom-Up \convnets with a Top-Down selection network. We utilize the top-down selection gating activities to modulate the bottom-up hidden activities for segmentation predictions. We develop an end-to-end multi-task framework with loss terms satisfying task requirements at the two ends of the network. We evaluate the proposed network on benchmark datasets for semantic segmentation, and show that networks with the Top-Down selection capability outperform the baseline model. Additionally, we shed light on the superior aspects of the new segmentation paradigm and qualitatively and quantitatively support the efficiency of the novel framework over the baseline model that relies purely on parametric skip connections.	翻訳日:2023-01-04 03:18:58 公開日:2020-02-04
# 境界不規則性をもつ逆ロバストフレームサンプリング Adversarially Robust Frame Sampling with Bounded Irregularities ( http://arxiv.org/abs/2002.01147v1 ) ライセンス: Link先を確認	Hanhan Li, Pin Wang	(参考訳) 近年,ビデオから意味のある情報を自動抽出するビデオ解析ツールが広く研究され,展開されている。ほとんどが計算コストのかかるディープニューラルネットワークを使用しているため、そのようなアルゴリズムにビデオフレームのサブセットだけを投入することが望ましい。フレームを固定レートでサンプリングすることは、その単純さ、代表性、解釈性のために常に魅力的である。例えば、人気のcloud video apiは、ビデオ中の毎秒1フレームのみを処理することで、ビデオとショットのラベルを生成した。しかし、選択したフレームをサンプリングされた場所に配置することで、このような戦略を簡単に攻撃することができる。本稿では,このサンプリング問題に対するエレガントな解決法を提案する。 In recent years, video analysis tools for automatically extracting meaningful information from videos are widely studied and deployed. Because most of them use deep neural networks which are computationally expensive, feeding only a subset of video frames into such algorithms is desired. Sampling the frames with fixed rate is always attractive for its simplicity, representativeness, and interpretability. For example, a popular cloud video API generated video and shot labels by processing only the first frame of every second in a video. However, one can easily attack such strategies by placing chosen frames at the sampled locations. In this paper, we present an elegant solution to this sampling problem that is provably robust against adversarial attacks and introduces bounded irregularities as well.	翻訳日:2023-01-04 03:18:40 公開日:2020-02-04
# AutoEncoder-based Lifted Multicuts を用いた教師なし多人数追跡 Unsupervised Multiple Person Tracking using AutoEncoder-Based Lifted Multicuts ( http://arxiv.org/abs/2002.01192v1 ) ライセンス: Link先を確認	Kalun Ho, Janis Keuper, Margret Keuper	(参考訳) マルチオブジェクト追跡(MOT)はコンピュータビジョンにおける長年の課題である。検出パラダイムによるトラッキングに基づく現在のアプローチは、データをトラックに正しく関連付けるために、ある種のドメイン知識または監督を必要とする。本研究では,視覚特徴と最小コストリフトされたマルチカットに基づく教師なしマルチオブジェクト追跡手法を提案する。提案手法は,画像列中の隣接フレームから重畳することなく抽出できるストレートフォワード時空間的手がかりに基づく。これらの手がかりに基づくクラスタリングにより、追跡タスクに必要な出現不変性を学び、オートエンコーダを訓練し、適切な潜在表現を生成することができる。このように、結果として生じる潜在表現は、信頼できる時空間的特徴を抽出できない大きな時間的距離でも追跡するための堅牢な外観手がかりとして機能する。提案したアノテーションを使わずにトレーニングされているにもかかわらず,我々のモデルは,歩行者追跡のための挑戦的なMOTベンチマーク上での競争結果を提供する。 Multiple Object Tracking (MOT) is a long-standing task in computer vision. Current approaches based on the tracking by detection paradigm either require some sort of domain knowledge or supervision to associate data correctly into tracks. In this work, we present an unsupervised multiple object tracking approach based on visual features and minimum cost lifted multicuts. Our method is based on straight-forward spatio-temporal cues that can be extracted from neighboring frames in an image sequences without superivison. Clustering based on these cues enables us to learn the required appearance invariances for the tracking task at hand and train an autoencoder to generate suitable latent representation. Thus, the resulting latent representations can serve as robust appearance cues for tracking even over large temporal distances where no reliable spatio-temporal features could be extracted. We show that, despite being trained without using the provided annotations, our model provides competitive results on the challenging MOT Benchmark for pedestrian tracking.	翻訳日:2023-01-04 03:17:58 公開日:2020-02-04
# Selective Convolutional Network: 背景を無視する効率的なオブジェクト検出器 Selective Convolutional Network: An Efficient Object Detector with Ignoring Background ( http://arxiv.org/abs/2002.01205v1 ) ライセンス: Link先を確認	Hefei Ling, Yangyang Qin, Li Zhang, Yuxuan Shi, Ping Li	(参考訳) アテンション機構がオブジェクト検出器を含む多くのcnnの性能を効果的に改善できることはよく知られている。特徴写像を精細化する代わりに、注意を向ける新しい試みによって計算の複雑さを抑える。そこで,本研究では,有意義かつ有意義な情報を含む位置のみを選択的に計算するscn(elective convolutional network)と呼ばれる効率的な物体検出器を提案する。基本的な考え方は、特に特徴抽出時の計算コストを効果的に削減する、重要でない背景領域を排除することである。そこで本稿では,ネットワークの次を導くためのオーバーヘッドを無視する,精巧な構造を設計する。エンドツーエンドのトレーニング可能で、エンベディングも簡単です。追加のセグメンテーションデータセットなしでは、直接監督と間接監督を含む2つの異なる列車戦略を探索する。 PASCAL VOC2007およびMS COCO検出データセットの性能評価実験を行った。その結果, SSD と Pelee を本手法に統合することにより, SCN の精度を低下させることなく, 1/5 および 1/3 の範囲での計算を平均的に削減できることがわかった。 It is well known that attention mechanisms can effectively improve the performance of many CNNs including object detectors. Instead of refining feature maps prevalently, we reduce the prohibitive computational complexity by a novel attempt at attention. Therefore, we introduce an efficient object detector called Selective Convolutional Network (SCN), which selectively calculates only on the locations that contain meaningful and conducive information. The basic idea is to exclude the insignificant background areas, which effectively reduces the computational cost especially during the feature extraction. To solve it, we design an elaborate structure with negligible overheads to guide the network where to look next. It's end-to-end trainable and easy-embedding. Without additional segmentation datasets, we explores two different train strategies including direct supervision and indirect supervision. Extensive experiments assess the performance on PASCAL VOC2007 and MS COCO detection datasets. Results show that SSD and Pelee integrated with our method averagely reduce the calculations in a range of 1/5 and 1/3 with slight loss of accuracy, demonstrating the feasibility of SCN.	翻訳日:2023-01-04 03:17:43 公開日:2020-02-04
# トポメトリックマップにおける単一画像からの深部幾何学的6DF位置決め Deep-Geometric 6 DoF Localization from a Single Image in Topo-metric Maps ( http://arxiv.org/abs/2002.01210v1 ) ライセンス: Link先を確認	Tom Roussel, Punarjay Chakravarty, Gaurav Pandey, Tinne Tuytelaars, Luc Van Eycken	(参考訳) 本稿では,カメラの6自由度(dof)の全体像を,予めマッピングされた環境における1つの画像から推定可能な,深部地理ローカライザについて述べる。我々の地図はトポメトリックであり、6つのDoFポーズが知られている離散位相ノードを持つ。マップの各topoノードは、2d特徴と3d位置がマッピングプロセスの一部として格納される一連のポイントで構成されています。マッピングフェーズでは、ステレオカメラと通常のステレオビジュアルスラムパイプラインを使用します。ローカライゼーションフェーズでは,1枚のカメライメージをDeep Learningを用いてトポロジカルノードにローカライズし,マッチングした2D特徴(およびトポマップにおけるそれらの3D位置)の幾何アルゴリズム(PnP)を用いて,カメラの全6DoFのグローバルな一貫したポーズを決定する。本手法は,マッピングと位置決めアルゴリズムとセンサ(stereoとmono)を分離し,単一のカメラを用いて,予めマッピングした環境で正確な6自由度位置推定を可能にする。携帯電話やドローンなどの単一カメラデバイスにおけるVR/ARやローカライゼーションの応用の可能性を考えると、私たちのハイブリッドアルゴリズムは、シミュレーションや実環境における単一の画像からのポーズを回帰する完全なDeep-LearningベースのPose-Netと好適に比較できる。 We describe a Deep-Geometric Localizer that is able to estimate the full 6 Degree of Freedom (DoF) global pose of the camera from a single image in a previously mapped environment. Our map is a topo-metric one, with discrete topological nodes whose 6 DoF poses are known. Each topo-node in our map also comprises of a set of points, whose 2D features and 3D locations are stored as part of the mapping process. For the mapping phase, we utilise a stereo camera and a regular stereo visual SLAM pipeline. During the localization phase, we take a single camera image, localize it to a topological node using Deep Learning, and use a geometric algorithm (PnP) on the matched 2D features (and their 3D positions in the topo map) to determine the full 6 DoF globally consistent pose of the camera. Our method divorces the mapping and the localization algorithms and sensors (stereo and mono), and allows accurate 6 DoF pose estimation in a previously mapped environment using a single camera. With potential VR/AR and localization applications in single camera devices such as mobile phones and drones, our hybrid algorithm compares favourably with the fully Deep-Learning based Pose-Net that regresses pose from a single image in simulated as well as real environments.	翻訳日:2023-01-04 03:17:22 公開日:2020-02-04
# 物体追跡のための3次元モデル輪郭エネルギーとキーポイントの組み合わせ Combining 3D Model Contour Energy and Keypoints for Object Tracking ( http://arxiv.org/abs/2002.01379v1 ) ライセンス: Link先を確認	Bogdan Bugaev, Anton Kryshchenko, Roman Belov	(参考訳) 単分子モデルに基づく3次元追跡のための新しい組み合わせアプローチを提案する。キーポイントベースの手法を用いて予備オブジェクトのポーズを推定する。次に、輪郭エネルギー関数を最適化してポーズを洗練する。エネルギーは、モデル投影の輪郭と画像エッジとの対応度を決定する。生画像勾配の強度と向きの両方に基づいて算出する。最適化のために,局所最適性を克服し,キーポイントに基づくポーズ推定から得られる情報を考慮した検索領域制約を提案する。この手法は,キーポイントベースおよびエッジベースアプローチの多くの問題を解消する。本手法は,様々な照明条件,動作パターン,速度の動画を含む公開ベンチマークデータセット上で,最先端手法と比較することにより,その効率性を示す。 We present a new combined approach for monocular model-based 3D tracking. A preliminary object pose is estimated by using a keypoint-based technique. The pose is then refined by optimizing the contour energy function. The energy determines the degree of correspondence between the contour of the model projection and the image edges. It is calculated based on both the intensity and orientation of the raw image gradient. For optimization, we propose a technique and search area constraints that allow overcoming the local optima and taking into account information obtained through keypoint-based pose estimation. Owing to its combined nature, our method eliminates numerous issues of keypoint-based and edge-based approaches. We demonstrate the efficiency of our method by comparing it with state-of-the-art methods on a public benchmark dataset that includes videos with various lighting conditions, movement patterns, and speed.	翻訳日:2023-01-04 03:16:55 公開日:2020-02-04
# Action Graphs: グラフ畳み込みネットワークによる弱教師付きアクションローカライゼーション Action Graphs: Weakly-supervised Action Localization with Graph Convolution Networks ( http://arxiv.org/abs/2002.01449v1 ) ライセンス: Link先を確認	Maheen Rashid, Hedvig Kjellstr\"om, Yong Jae Lee	(参考訳) 本稿では,グラフ畳み込みに基づく弱教師付き動作定位法を提案する。関連するアクションクラスに対応するビデオ時間セグメントを検索・分類するために、システムは、各ビデオ内の識別可能な時間セグメントを識別し、各アクションの完全な範囲を識別できる必要がある。弱いビデオレベルのラベルでこれを達成するには、システムは、トレーニングデータ内のビデオ間のモーメント間の類似性と相違を利用して、アクションの出現方法と、アクションの全範囲を構成するサブアクションの両方を理解する必要がある。しかし、現在の手法では、ビデオモーメント間の類似性を明示的に使用せず、局所化と分類予測を知らせている。本稿では,ビデオモーメント間の類似性を明示的にモデル化するためにグラフ畳み込みを用いる新しい手法を提案する。本手法は外観と動きを符号化した類似性グラフを用いて,THUMOS '14, ActivityNet 1.2, Charadesの動作ローカライゼーションを弱めに制御する手法である。 We present a method for weakly-supervised action localization based on graph convolutions. In order to find and classify video time segments that correspond to relevant action classes, a system must be able to both identify discriminative time segments in each video, and identify the full extent of each action. Achieving this with weak video level labels requires the system to use similarity and dissimilarity between moments across videos in the training data to understand both how an action appears, as well as the sub-actions that comprise the action's full extent. However, current methods do not make explicit use of similarity between video moments to inform the localization and classification predictions. We present a novel method that uses graph convolutions to explicitly model similarity between video moments. Our method utilizes similarity graphs that encode appearance and motion, and pushes the state of the art on THUMOS '14, ActivityNet 1.2, and Charades for weakly supervised action localization.	翻訳日:2023-01-04 03:16:43 公開日:2020-02-04
# トピックネットワークから分散認知マップへ:自発的地理情報領域におけるZipfian Topic Universes From Topic Networks to Distributed Cognitive Maps: Zipfian Topic Universes in the Area of Volunteered Geographic Information ( http://arxiv.org/abs/2002.01454v1 ) ライセンス: Link先を確認	Alexander Mehler and R\"udiger Gleim and Regina Gaitsch and Wahed Hemati and Tolga Uslu	(参考訳) 近くの場所(都市など)は関連する言葉で書かれていますか。本稿では,地理情報の語彙符号化の分野において,この研究課題をテクスチュアリティのレベルに転送する。この目的のために,いわゆるトピックネットワークの助けを借りて,都市や地域レベルでの住所のテキストをモデル化するボランティア地理情報(vgi)を探索する。このことは、言語がテキストの話題レベルに関する地理情報をエンコードし、ネットワーク化する方法を調べるために行われる。我々の仮説は、場所のネットワーク的テーマ化は、距離や著者の基盤となるコミュニティに関係なく、類似している、というものである。そこで本研究では,言語多層ネットワーク(LMN)を新たなモデルとして,特にテキストコーパスにおけるテーマネットワークを自動生成する多言語トピックネットワーク(MTN)を提案する。本研究は、地理的な場所(特に都市)がオンラインコミュニケーションに存在するテーマ宇宙のZipfian組織を示す。我々は、この発見を認知地図の文脈で解釈し、いわゆるテーママップによって拡張する概念である。この発見の解釈によれば、認知地図の一部としてのテーママップの組織化は、基盤となるメディアの継続的な存在を保証する共有可能なコンテンツを生成する傾向から生じる。ウィキペディアの特別なウィキや抽出例を用いて仮説を検証した。このようにして、私たちは結論に達します: 互いに近いかどうかに関わらず、場所はトピック宇宙の類似のサブネットワークにまたがる隣り合う場所にあります。 Are nearby places (e.g. cities) described by related words? In this article we transfer this research question in the field of lexical encoding of geographic information onto the level of intertextuality. To this end, we explore Volunteered Geographic Information (VGI) to model texts addressing places at the level of cities or regions with the help of so-called topic networks. This is done to examine how language encodes and networks geographic information on the aboutness level of texts. Our hypothesis is that the networked thematizations of places are similar - regardless of their distances and the underlying communities of authors. To investigate this we introduce Multiplex Topic Networks (MTN), which we automatically derive from Linguistic Multilayer Networks (LMN) as a novel model, especially of thematic networking in text corpora. Our study shows a Zipfian organization of the thematic universe in which geographical places (especially cities) are located in online communication. We interpret this finding in the context of cognitive maps, a notion which we extend by so-called thematic maps. According to our interpretation of this finding, the organization of thematic maps as part of cognitive maps results from a tendency of authors to generate shareable content that ensures the continued existence of the underlying media. We test our hypothesis by example of special wikis and extracts of Wikipedia. In this way we come to the conclusion: Places, whether close to each other or not, are located in neighboring places that span similar subnetworks in the topic universe.	翻訳日:2023-01-04 03:08:48 公開日:2020-02-04
# 汎用学習エージェントのための神経進化的枠組み Neuro-evolutionary Frameworks for Generalized Learning Agents ( http://arxiv.org/abs/2002.01088v1 ) ライセンス: Link先を確認	Thommen George Karimpanal	(参考訳) 近年のディープラーニングと強化学習の成功は、最先端の人工知能技術としての地位を確立している。しかし、サンプル効率の低さや限定的な一般化能力といったこれらのアプローチの長年の欠点は、システムの設計とデプロイの方法を再検討する必要があることを示している。本稿では,これらの学習システムと進化アルゴリズムの特定のバリエーションを組み合わせることで,様々な望ましい行動の自動獲得や行動優先の有用なセットなど,ユニークな特徴が出現する可能性について強調する。これにより、環境との最小限の相互作用で、学習を一般化し、継続的に行う方法が整うことができる。このような神経進化の枠組みから期待される改善と関連する課題、そして多くの研究分野への応用の可能性について論じる。 The recent successes of deep learning and deep reinforcement learning have firmly established their statuses as state-of-the-art artificial learning techniques. However, longstanding drawbacks of these approaches, such as their poor sample efficiencies and limited generalization capabilities point to a need for re-thinking the way such systems are designed and deployed. In this paper, we emphasize how the use of these learning systems, in conjunction with a specific variation of evolutionary algorithms could lead to the emergence of unique characteristics such as the automated acquisition of a variety of desirable behaviors and useful sets of behavior priors. This could pave the way for learning to occur in a generalized and continual manner, with minimal interactions with the environment. We discuss the anticipated improvements from such neuro-evolutionary frameworks, along with the associated challenges, as well as its potential for application to a number of research areas.	翻訳日:2023-01-04 03:08:10 公開日:2020-02-04
# 脳波信号を用いた宣言記憶の符号化と復号のためのニューラルオシレーション Neural Oscillations for Encoding and Decoding Declarative Memory using EEG Signals ( http://arxiv.org/abs/2002.01126v1 ) ライセンス: Link先を確認	Jenifer Kalafatovich, Minji Lee	(参考訳) 宣言記憶は、日常生活体験の記憶との関係について研究されている。前回の研究では、エンコーディングフェーズにおける動作性能に関するパワースペクトルの変化が報告されたが、デコーディングフェーズはまだ検討が必要である。本研究では,記憶過程に関連する神経振動の変化について検討する。参加者は脳波信号が記録されている間に、フェーズのエンコーディングとデコードを行うためのメモリタスクを依頼された。その結果, エンコーディングフェーズでは, 低ベータ, 高ベータ帯, 低ベータ帯, 高ベータ帯, ガンマ帯が左側頭葉領域で有意に低下し, その後の記憶への影響が認められた。復号フェーズでは, 前方-中央領域でアルファパワーの低下がみられた。その結果、βバンドとαバンドがメモリタスクのエンコードとデコードフェーズにそれぞれ有意な相関を示した。 Declarative memory has been studied for its relationship with remembering daily life experiences. Previous studies reported changes in power spectra during encoding phase related to behavioral performance, however decoding phase still needs to be explored. This study investigates neural oscillations changes related to memory process. Participants were asked to perform a memory task for encoding and decoding phase while EEG signals were recorded. Results showed that for encoding phase, there was a significant decrease of power in low beta, high beta bands over fronto-central area and a decrease in low beta, high beta and gamma bands over left temporal area related to successful subsequent memory effects. For decoding phase, only significant decreases of alpha power were observed over fronto-central area. This finding showed relevance of beta and alpha band for encoding and decoding phase of a memory task respectively.	翻訳日:2023-01-04 03:07:56 公開日:2020-02-04
# 弱監視対象検出のためのオブジェクトインスタンスマイニング Object Instance Mining for Weakly Supervised Object Detection ( http://arxiv.org/abs/2002.01087v1 ) ライセンス: Link先を確認	Chenhao Lin, Siwen Wang, Dongqi Xu, Yu Lu, Wayne Zhang	(参考訳) 近年,画像レベルのアノテーションのみを用いたオブジェクト検出(WSOD)が注目されている。複数のインスタンス学習を使用する既存のアプローチは、各カテゴリのイメージ内の最も識別的なオブジェクトから学ぶ傾向があるため、ローカルオプティマに容易に当てはまる。したがって、これらのメソッドは、WSODのパフォーマンスを低下させるオブジェクトインスタンスの欠如に悩まされる。この問題に対処するため,本論文では,オブジェクト検出の弱いエンドツーエンドのオブジェクトインスタンスマイニング(OIM)フレームワークを提案する。 oimは、追加のアノテーションなしで、空間および外観グラフに情報伝達を導入することで、各画像に存在するすべての可能なオブジェクトインスタンスの検出を試みる。反復学習プロセスでは、同一クラスからの識別の少ないオブジェクトインスタンスを徐々に検出し、トレーニングに利用することができる。さらに、各オブジェクトインスタンスのより大きな部分を学習し、パフォーマンスをさらに向上するために、オブジェクトインスタンスの再重み付け損失を設計する。 VOC 2007 と 2012 の2つの公開データベースの実験結果は,提案手法の有効性を実証している。 Weakly supervised object detection (WSOD) using only image-level annotations has attracted growing attention over the past few years. Existing approaches using multiple instance learning easily fall into local optima, because such mechanism tends to learn from the most discriminative object in an image for each category. Therefore, these methods suffer from missing object instances which degrade the performance of WSOD. To address this problem, this paper introduces an end-to-end object instance mining (OIM) framework for weakly supervised object detection. OIM attempts to detect all possible object instances existing in each image by introducing information propagation on the spatial and appearance graphs, without any additional annotations. During the iterative learning process, the less discriminative object instances from the same class can be gradually detected and utilized for training. In addition, we design an object instance reweighted loss to learn larger portion of each object instance to further improve the performance. The experimental results on two publicly available databases, VOC 2007 and 2012, demonstrate the efficacy of proposed approach.	翻訳日:2023-01-04 03:07:43 公開日:2020-02-04
# グループ写真の美的品質評価 Aesthetic Quality Assessment for Group photograph ( http://arxiv.org/abs/2002.01096v1 ) ライセンス: Link先を確認	Yaoting Wang (1 and 2), Yongzhen Ke (1 and 2), Kai Wang (1 and 2), Cuijiao Zhang (1 and 2), Fan Qin (3) ((1) School of computer science and technology, Tiangong University, (2) Tianjin Key Laboratory of Autonomous Intelligence Technology and Systems, (3) Business School, Nankai University)	(参考訳) 画像美的品質評価は近年注目されているが、特定の種類の写真、すなわちグループ写真についての研究はあまり行われていない。本研究では,グループ写真の経験と原則に基づく,高度な機能セットを設計した。Opened-eye, Gaze, Smile, Occluded Face, Face Orientation, Facial blur, Character Center。次に,これらと83の汎用的な美的特徴を組み合わせることで,2つの美的評価モデルを構築した。また,審美スコアを付記したグループ写真gpdの大規模データセットを構築した。実験の結果,プロの写真とスナップショットを分類し,同一場面における多様な人間状態の複数のグループ写真の識別を予測できることがわかった。 Image aesthetic quality assessment has got much attention in recent years, but not many works have been done on a specific genre of photos: Group photograph. In this work, we designed a set of high-level features based on the experience and principles of group photography: Opened-eye, Gaze, Smile, Occluded faces, Face Orientation, Facial blur, Character center. Then we combined them and 83 generic aesthetic features to build two aesthetic assessment models. We also constructed a large dataset of group photographs - GPD- annotated with the aesthetic score. The experimental result shows that our features perform well for categorizing professional photos and snapshots and predicting the distinction of multiple group photographs of diverse human states under the same scene.	翻訳日:2023-01-04 03:07:27 公開日:2020-02-04
# 映像中の異常な活動検出のためのランキングロス機能付き3次元ResNet 3D ResNet with Ranking Loss Function for Abnormal Activity Detection in Videos ( http://arxiv.org/abs/2002.01132v1 ) ライセンス: Link先を確認	Shikha Dubey, Abhijeet Boragule, Moongu Jeon	(参考訳) 異常な活動検出はコンピュータビジョンの分野で最も困難なタスクの1つである。本研究は, 異常映像と正常映像の両方を用いて, 映像レベルの情報を提供し, 複数インスタンス学習の助けを借りて異常映像を学習する, 異常行動検出の最近の研究成果に動機づけられている。時間的アノテーションがない場合、そのようなモデルは異常を検出しながら誤報をしがちである。そこで本稿では,異常な活動検知タスクを実行しながら,誤警報率を最小限に抑えるタスクに焦点をあてる。ビデオ行動認識タスクにおけるこれらの誤報の軽減と最近の3Dディープニューラルネットワークの進歩は、提案手法で3D ResNetを活用する動機を与え、ビデオから時空間の特徴を抽出するのに役立つ。その後,これらの特徴と深層マルチインスタンス学習と,提案するランキング損失を用いて,映像セグメントレベルでの異常スコアの予測を行う。そこで,提案手法は3D Deep Multiple Instance Learning with ResNet (MILR) と新しいランキング損失関数を併用して,UCF-Crimeベンチマークデータセット上での最高の性能を実現する。提案手法の有効性をUCF-Crimeデータセットで示す。 Abnormal activity detection is one of the most challenging tasks in the field of computer vision. This study is motivated by the recent state-of-art work of abnormal activity detection, which utilizes both abnormal and normal videos in learning abnormalities with the help of multiple instance learning by providing the data with video-level information. In the absence of temporal-annotations, such a model is prone to give a false alarm while detecting the abnormalities. For this reason, in this paper, we focus on the task of minimizing the false alarm rate while performing an abnormal activity detection task. The mitigation of these false alarms and recent advancement of 3D deep neural network in video action recognition task collectively give us motivation to exploit the 3D ResNet in our proposed method, which helps to extract spatial-temporal features from the videos. Afterwards, using these features and deep multiple instance learning along with the proposed ranking loss, our model learns to predict the abnormality score at the video segment level. Therefore, our proposed method 3D deep Multiple Instance Learning with ResNet (MILR) along with the new proposed ranking loss function achieves the best performance on the UCF-Crime benchmark dataset, as compared to other state-of-art methods. The effectiveness of our proposed method is demonstrated on the UCF-Crime dataset.	翻訳日:2023-01-04 03:01:23 公開日:2020-02-04
# GTC:CTCの効率的かつ正確なテキスト認識に向けた指導的訓練 GTC: Guided Training of CTC Towards Efficient and Accurate Scene Text Recognition ( http://arxiv.org/abs/2002.01276v1 ) ライセンス: Link先を確認	Wenyang Hu, Xiaocong Cai, Jun Hou, Shuai Yi, Zhiping Lin	(参考訳) コネクショニスト時間分類(ctc)と注意機構は、近年のテキスト認識における2つの主要なアプローチである。注意に基づく手法と比較して、CTCデコーダはより短い推論時間を持つが、精度は低い。効率的かつ効果的なモデルを設計するために、より強力な注意指導からCTCモデルがより優れたアライメントと特徴表現を学習するCTC(GTC)のガイド付きトレーニングを提案する。ガイド付きトレーニングの利点により、CTCモデルは、高速な推論速度を維持しながら、正規および不規則なシーンテキストの堅牢かつ正確な予測を実現する。さらに,ctcデコーダの可能性をさらに活用するために,グラフ畳み込みネットワーク(gcn)を提案し,抽出された特徴の局所相関について検討した。標準ベンチマークに関する広範囲な実験により,本モデルが正規および不規則なテキスト認識のための新たな最先端技術を実現し,注意に基づく手法の6倍の推論時間を必要とすることが示された。 Connectionist Temporal Classification (CTC) and attention mechanism are two main approaches used in recent scene text recognition works. Compared with attention-based methods, CTC decoder has a much shorter inference time, yet a lower accuracy. To design an efficient and effective model, we propose the guided training of CTC (GTC), where CTC model learns a better alignment and feature representations from a more powerful attentional guidance. With the benefit of guided training, CTC model achieves robust and accurate prediction for both regular and irregular scene text while maintaining a fast inference speed. Moreover, to further leverage the potential of CTC decoder, a graph convolutional network (GCN) is proposed to learn the local correlations of extracted features. Extensive experiments on standard benchmarks demonstrate that our end-to-end model achieves a new state-of-the-art for regular and irregular scene text recognition and needs 6 times shorter inference time than attentionbased methods.	翻訳日:2023-01-04 02:59:54 公開日:2020-02-04
# ピクセルワイズ条件付き生成逆ネットワークによる画像合成と補完 Pixel-wise Conditioned Generative Adversarial Networks for Image Synthesis and Completion ( http://arxiv.org/abs/2002.01281v1 ) ライセンス: Link先を確認	Cyprien Ruffino and Romain H\'erault and Eric Laloy and Gilles Gasso	(参考訳) generative adversarial networks (gans) は教師なし画像生成に成功している。いくつかの作品では、画像の一部を再構成して生成を条件づけることで、ganを画像の塗装に拡張している。その成功にもかかわらず、これらの手法は、画像ピクセルの小さなサブセットのみが事前に知られているような設定に制限がある。本稿では,ごく少数の画素値が提供される場合の条件付きGANの有効性について検討する。本稿では,GAN目標関数に明示的なコスト項を付加して画素単位の条件を強制するモデリングフレームワークを提案する。本稿では,この正規化項が生成画像の品質と与えられた画素制約を満たすことに与える影響について検討する。最近のPacGAN技術を用いて、我々は生成したサンプルの多様性を維持する。 FashionMNISTにおける実験により、正規化項は生成画像の品質と条件付けとの間のトレードオフを効果的に制御することを示した。 cifar-10 と celeba データセットの実験的評価により, 画素条件を強制しながら, fr\'echet インセプション距離の観点で, 視覚的, 定量的に精度の高い結果が得られることが示された。また,完全畳み込みネットワークを用いたテクスチャ画像生成タスクの評価を行った。最後の貢献として、この手法を古典的な地質シミュレーション応用に適用する。 Generative Adversarial Networks (GANs) have proven successful for unsupervised image generation. Several works have extended GANs to image inpainting by conditioning the generation with parts of the image to be reconstructed. Despite their success, these methods have limitations in settings where only a small subset of the image pixels is known beforehand. In this paper we investigate the effectiveness of conditioning GANs when very few pixel values are provided. We propose a modelling framework which results in adding an explicit cost term to the GAN objective function to enforce pixel-wise conditioning. We investigate the influence of this regularization term on the quality of the generated images and the fulfillment of the given pixel constraints. Using the recent PacGAN technique, we ensure that we keep diversity in the generated samples. Conducted experiments on FashionMNIST show that the regularization term effectively controls the trade-off between quality of the generated images and the conditioning. Experimental evaluation on the CIFAR-10 and CelebA datasets evidences that our method achieves accurate results both visually and quantitatively in term of Fr\'echet Inception Distance, while still enforcing the pixel conditioning. We also evaluate our method on a texture image generation task using fully-convolutional networks. As a final contribution, we apply the method to a classical geological simulation application.	翻訳日:2023-01-04 02:59:37 公開日:2020-02-04
# 有界部分集合の学習 $l_p$ Learning bounded subsets of $L_p$ ( http://arxiv.org/abs/2002.01182v1 ) ライセンス: Link先を確認	Shahar Mendelson	(参考訳) 基礎となるクラスが$L_p$の有界部分集合であり、対象の$Y$が$L_p$に属する学習問題を研究する。以前は、ミニマックスサンプルの複雑性推定は、そのような有界性仮定の下で、$p=\infty$のときのみ知られていた。任意の$p > 4$ の急激なサンプル複雑性の推定値を示す。これは、重み付き問題に適した学習手順に基づいている。 We study learning problems in which the underlying class is a bounded subset of $L_p$ and the target $Y$ belongs to $L_p$. Previously, minimax sample complexity estimates were known under such boundedness assumptions only when $p=\infty$. We present a sharp sample complexity estimate that holds for any $p > 4$. It is based on a learning procedure that is suited for heavy-tailed problems.	翻訳日:2023-01-04 02:52:18 公開日:2020-02-04
# ALPINE:ネットワーク埋め込みを用いたアクティブリンク予測 ALPINE: Active Link Prediction using Network Embedding ( http://arxiv.org/abs/2002.01227v1 ) ライセンス: Link先を確認	Xi Chen, Bo Kang, Jefrey Lijffijt and Tijl De Bie	(参考訳) 多くの実世界の問題は、部分的に観測されたネットワーク内のリンクを予測するものとして定式化することができる。例えば、Facebookの友情提案、消費者製品推奨、犯罪ネットワーク内のアクター間の隠れた相互作用の識別などがある。いくつかのリンク予測アルゴリズム、特に最近導入されたネットワーク埋め込みは、ネットワークの観測部分に依存するだけでこれを行うことができる。多くの場合、ノード対のリンク状態はクエリされ、リンク予測アルゴリズムによって追加情報として使用できる。残念なことに、このようなクエリはコストも時間もかかるため、どのノード対でクエリするかを慎重に検討する必要がある。本稿では,特定のノード対を問合せした後のリンク予測精度の向上を,アクティブな学習環境で使用するために推定する。具体的には,ネットワーク埋め込みに基づくリンク予測のための最初の手法である ALPINE (Active Link Prediction usIng Network Embedding) を提案する。この目的のために,v-optimalityの概念を実験設計からこの設定に一般化するとともに,標準分類設定で開発されたより基本的なアクティブラーニングヒューリスティックスを一般化した。実データによる実証結果から、ALPINEはスケーラブルであり、リンク予測精度をはるかに少ないクエリで向上させる。 Many real-world problems can be formalized as predicting links in a partially observed network. Examples include Facebook friendship suggestions, consumer-product recommendations, and the identification of hidden interactions between actors in a crime network. Several link prediction algorithms, notably those recently introduced using network embedding, are capable of doing this by just relying on the observed part of the network. Often, the link status of a node pair can be queried, which can be used as additional information by the link prediction algorithm. Unfortunately, such queries can be expensive or time-consuming, mandating the careful consideration of which node pairs to query. In this paper we estimate the improvement in link prediction accuracy after querying any particular node pair, to use in an active learning setup. Specifically, we propose ALPINE (Active Link Prediction usIng Network Embedding), the first method to achieve this for link prediction based on network embedding. To this end, we generalized the notion of V-optimality from experimental design to this setting, as well as more basic active learning heuristics originally developed in standard classification settings. Empirical results on real data show that ALPINE is scalable, and boosts link prediction accuracy with far fewer queries.	翻訳日:2023-01-04 02:52:12 公開日:2020-02-04
# ウィスラー電波の検出と特徴化のための機械学習技術 Machine Learning Techniques to Detect and Characterise Whistler Radio Waves ( http://arxiv.org/abs/2002.01244v1 ) ライセンス: Link先を確認	Othniel J.E.Y. Konan, Amit Kumar Mishra, Stefan Lotz	(参考訳) ライトニングストロークは強力な電磁パルスを生成し、非常に低周波(VLF)波を電磁界線に沿って半球に伝播させる。 vlfアンテナ受信機は、これらの雷撃によって発生するホイッスラー波を検出するために使用できる。受信ホイッスラー波の特定の時間/周波数依存性は、磁気圏のプラズマ圏領域における電子密度の推定を可能にする。したがって、ホイッスラーの識別と特徴付けは、プラズマ圏をリアルタイムに監視し、統計研究に使用するイベントの大規模なデータベースを構築するための重要なタスクである。ウイスラー検出技術の現状は、Lichtenberger (2009) が開発したAutomatic Whistler Detection (AWD) 法である。この方法は2次元の画像相関に基づいており、vlf受信アンテナ(例えば南極)に位置する重要な計算ハードウェアを必要とする。本研究の目的は,vlf受信機が提供するデータからホイッスラーを自動的に検出できる機械学習モデルを開発することである。提案手法は,VLF受信機が生成したスペクトルデータに対して,画像分類と局所化を組み合わせることで,各ウィスラーの識別とローカライズを行う。対象とするデータには,SANAEとMarionのAWDが特定した約2300のイベントがあり,トレーニングや検証,テストデータとして使用される予定である。 3つの検出器設計が提案されている。 AWDと同様の手法を用いており、第1は分光図から抽出した関心領域のイメージ分類を用いており、第2はオブジェクト検出における最先端であるYOLOを用いている。これらの検出器はマリオンのデータセットで15%未満の誤検知と誤報を達成できることが示されている。 Lightning strokes create powerful electromagnetic pulses that routinely cause very low frequency (VLF) waves to propagate across hemispheres along geomagnetic field lines. VLF antenna receivers can be used to detect these whistler waves generated by these lightning strokes. The particular time/frequency dependence of the received whistler wave enables the estimation of electron density in the plasmasphere region of the magnetosphere. Therefore the identification and characterisation of whistlers are important tasks to monitor the plasmasphere in real-time and to build large databases of events to be used for statistical studies. The current state of the art in detecting whistler is the Automatic Whistler Detection (AWD) method developed by Lichtenberger (2009). This method is based on image correlation in 2 dimensions and requires significant computing hardware situated at the VLF receiver antennas (e.g. in Antarctica). The aim of this work is to develop a machine learning-based model capable of automatically detecting whistlers in the data provided by the VLF receivers. The approach is to use a combination of image classification and localisation on the spectrogram data generated by the VLF receivers to identify and localise each whistler. The data at hand has around 2300 events identified by AWD at SANAE and Marion and will be used as training, validation, and testing data. Three detector designs have been proposed. The first one using a similar method to AWD, the second using image classification on regions of interest extracted from a spectrogram, and the last one using YOLO, the current state of the art in object detection. It has been shown that these detectors can achieve a misdetection and false alarm of less than 15% on Marion's dataset.	翻訳日:2023-01-04 02:51:51 公開日:2020-02-04
# グラディエント型敵攻撃に対するミニマックス防御 Minimax Defense against Gradient-based Adversarial Attacks ( http://arxiv.org/abs/2002.01256v1 ) ライセンス: Link先を確認	Blerta Lindqvist, Rauf Izmailov	(参考訳) 最先端の敵攻撃はニューラルネットワーク分類器を対象としている。ニューラルネットワークはデフォルトで勾配降下を利用して損失関数を最小化する。分類器の損失関数の勾配は、勾配に基づく敵対攻撃によって逆摂動画像を生成する。我々は、他のタイプの最適化がニューラルネットワークの分類器にエッジを与えるかどうかに疑問を呈する。本稿では,最小限の最適化を応用した新たな手法を提案する。我々のミニマックス分類器は、GANジェネレータでミニマックスゲームをする生成逆数ネットワーク(GAN)の判別器である。さらに、我々のgan生成器は、元の多様体とは異なる多様体にすべての点を投影する。我々は,MNIST, CIFAR-10, German Traffic Sign (TRAFFIC) の3つのデータセット上で, 敵攻撃Carlini Wagner (CW), DeepFool, Fast Gradient Sign Method (FGSM) を用いる。 CW攻撃に対して、我々のミニマックス防衛は98.07%(MNISTデフォルト98.93%)、73.90%(CIFAR-10デフォルト83.14%)、94.54%(TRAFFICデフォルト96.97%)を達成した。 DeepFool攻撃に対して、私たちのミニマックス防衛は98.87%(MNIST)、76.61%(CIFAR-10)、94.57%(TRAFFIC)を達成した。 FGSM攻撃に対して,97.01%(MNIST),76.79%(CIFAR-10),81.41%(TRAFFIC)を達成した。我々のMinimax対逆アプローチは、ニューラルネットワーク分類器の防御戦略に大きな変化をもたらす。 State-of-the-art adversarial attacks are aimed at neural network classifiers. By default, neural networks use gradient descent to minimize their loss function. The gradient of a classifier's loss function is used by gradient-based adversarial attacks to generate adversarially perturbed images. We pose the question whether another type of optimization could give neural network classifiers an edge. Here, we introduce a novel approach that uses minimax optimization to foil gradient-based adversarial attacks. Our minimax classifier is the discriminator of a generative adversarial network (GAN) that plays a minimax game with the GAN generator. In addition, our GAN generator projects all points onto a manifold that is different from the original manifold since the original manifold might be the cause of adversarial attacks. To measure the performance of our minimax defense, we use adversarial attacks - Carlini Wagner (CW), DeepFool, Fast Gradient Sign Method (FGSM) - on three datasets: MNIST, CIFAR-10 and German Traffic Sign (TRAFFIC). Against CW attacks, our minimax defense achieves 98.07% (MNIST-default 98.93%), 73.90% (CIFAR-10-default 83.14%) and 94.54% (TRAFFIC-default 96.97%). Against DeepFool attacks, our minimax defense achieves 98.87% (MNIST), 76.61% (CIFAR-10) and 94.57% (TRAFFIC). Against FGSM attacks, we achieve 97.01% (MNIST), 76.79% (CIFAR-10) and 81.41% (TRAFFIC). Our Minimax adversarial approach presents a significant shift in defense strategy for neural network classifiers.	翻訳日:2023-01-04 02:51:24 公開日:2020-02-04
# 情報基盤によるタスク駆動型制御の学習 Learning Task-Driven Control Policies via Information Bottlenecks ( http://arxiv.org/abs/2002.01428v1 ) ライセンス: Link先を確認	Vincent Pacelli and Anirudha Majumdar	(参考訳) 本稿では,視覚や深度などの感覚の豊富なロボットシステムに対して,タスク駆動制御ポリシを合成するための強化学習手法を提案する。標準強化学習アルゴリズムは通常、システムの状態全体とリッチなセンサー観測に制御アクションを密結合するポリシーを生成する。その結果、結果として得られるポリシーは、状態や観察(背景の色の変化など)のタスク非関連部分の変化に敏感になることが多い。対照的に、ここで紹介するアプローチは、制御アクションの計算に使われるタスク駆動表現を作成することを学びます。形式的には、これは状態とタスク駆動型表現の間の情報ボトルネックを生成するポリシー勾配スタイルのアルゴリズムを導出することで達成される。本稿では,深度画像を用いた把握タスクやRGB画像を用いた球キャッチタスクなど,複数の例を対象としたシミュレーション結果の完全セットで示す。標準方針勾配法との比較により,我々のアルゴリズムが生み出すタスク駆動型政策は,センサノイズやタスク非関連な環境変化に対して,はるかに堅牢であることが示された。 This paper presents a reinforcement learning approach to synthesizing task-driven control policies for robotic systems equipped with rich sensory modalities (e.g., vision or depth). Standard reinforcement learning algorithms typically produce policies that tightly couple control actions to the entirety of the system's state and rich sensor observations. As a consequence, the resulting policies can often be sensitive to changes in task-irrelevant portions of the state or observations (e.g., changing background colors). In contrast, the approach we present here learns to create a task-driven representation that is used to compute control actions. Formally, this is achieved by deriving a policy gradient-style algorithm that creates an information bottleneck between the states and the task-driven representation; this constrains actions to only depend on task-relevant information. We demonstrate our approach in a thorough set of simulation results on multiple examples including a grasping task that utilizes depth images and a ball-catching task that utilizes RGB images. Comparisons with a standard policy gradient approach demonstrate that the task-driven policies produced by our algorithm are often significantly more robust to sensor noise and task-irrelevant changes in the environment.	翻訳日:2023-01-04 02:50:54 公開日:2020-02-04
# DVNet:大規模脳血管再建のためのメモリ効率の良い3次元CNN DVNet: A Memory-Efficient Three-Dimensional CNN for Large-Scale Neurovascular Reconstruction ( http://arxiv.org/abs/2002.01568v1 ) ライセンス: Link先を確認	Leila Saadatifard, Aryan Mobiny, Pavel Govyadinov, Hien Nguyen, David Mayerich	(参考訳) 脳の微細構造図は神経変性疾患などの慢性疾患による変化を含む神経機能や行動を理解するために重要である。ナイフエッジ走査顕微鏡(KESM)のような技術は、細胞内分解能で全臓器のイメージングを可能にする。しかし、マルチテラバイトのデータサイズは手動アノテーションを非現実的かつ自動的なセグメンテーションを難しくする。密集した細胞と相互接続された微小血管ネットワークを組み合わせることは、現在のセグメンテーションアルゴリズムの課題である。高スループット顕微鏡データの巨大なサイズは、高速でほとんど教師なしのアルゴリズムを必要とする。本稿では,ピクセル単位のセマンティクスセグメンテーションのための完全畳み込み型,深層,密結合型エンコーダデコーダについて検討する。深いネットワークでしばしば発生する過大なメモリの複雑さは、スキップ接続を使用して軽減され、結果としてパラメータが減少し、以前のアーキテクチャよりも大幅にパフォーマンスが向上する。提案ネットワークは,オープンソースベンチマークに適用したセマンティックセグメンテーション問題に対して,優れた性能を提供する。我々はついに細胞および微小血管のセグメンテーションのためのネットワークを実証し、臓器規模の神経血管分析の定量的測定を可能にした。 Maps of brain microarchitecture are important for understanding neurological function and behavior, including alterations caused by chronic conditions such as neurodegenerative disease. Techniques such as knife-edge scanning microscopy (KESM) provide the potential for whole organ imaging at sub-cellular resolution. However, multi-terabyte data sizes make manual annotation impractical and automatic segmentation challenging. Densely packed cells combined with interconnected microvascular networks are a challenge for current segmentation algorithms. The massive size of high-throughput microscopy data necessitates fast and largely unsupervised algorithms. In this paper, we investigate a fully-convolutional, deep, and densely-connected encoder-decoder for pixel-wise semantic segmentation. The excessive memory complexity often encountered with deep and dense networks is mitigated using skip connections, resulting in fewer parameters and enabling a significant performance increase over prior architectures. The proposed network provides superior performance for semantic segmentation problems applied to open-source benchmarks. We finally demonstrate our network for cellular and microvascular segmentation, enabling quantitative metrics for organ-scale neurovascular analysis.	翻訳日:2023-01-04 02:50:37 公開日:2020-02-04
# BOFFIN TTS:ベイズ最適化による少数ショット話者適応 BOFFIN TTS: Few-Shot Speaker Adaptation by Bayesian Optimization ( http://arxiv.org/abs/2002.01953v1 ) ライセンス: Link先を確認	Henry B.Moss, Vatsal Aggarwal, Nishant Prateek, Javier Gonz\'alez, Roberto Barra-Chicote	(参考訳) 本稿では,話者適応のための新しいアプローチであるBOFFIN TTS(Bayesian Optimization for FIne-tuning Neural Text To Speech)を提案する。ここでは、ターゲット発話の小さなコーパスを用いて、訓練済みのTSモデルを微調整し、新しい話者を模倣する。微調整制御を行うハイパーパラメータのコーパス固有の構成を必要とするような,一様適応戦略は存在しないことを実証する。ターゲット話者のハイパーパラメータ値を効率的に最適化するためにベイズ最適化を用いることで、標準手法よりも平均30%高い話者類似度で適応することができる。複数のコーパスを通して、boffin ttsは10分未満の音声を使って新しい話者を合成することを学び、ベースモデルを訓練するために使用する話者と同じ自然性を達成することが示されている。 We present BOFFIN TTS (Bayesian Optimization For FIne-tuning Neural Text To Speech), a novel approach for few-shot speaker adaptation. Here, the task is to fine-tune a pre-trained TTS model to mimic a new speaker using a small corpus of target utterances. We demonstrate that there does not exist a one-size-fits-all adaptation strategy, with convincing synthesis requiring a corpus-specific configuration of the hyper-parameters that control fine-tuning. By using Bayesian optimization to efficiently optimize these hyper-parameter values for a target speaker, we are able to perform adaptation with an average 30% improvement in speaker similarity over standard techniques. Results indicate, across multiple corpora, that BOFFIN TTS can learn to synthesize new speakers using less than ten minutes of audio, achieving the same naturalness as produced for the speakers used to train the base model.	翻訳日:2023-01-04 02:50:03 公開日:2020-02-04
# 合成経験によるDQNリプレイメモリのブートストラップ Bootstrapping a DQN Replay Memory with Synthetic Experiences ( http://arxiv.org/abs/2002.01370v1 ) ライセンス: Link先を確認	Wenzel Baron Pilar von Pilchau and Anthony Stein and J\"org H\"ahner	(参考訳) 多くのDeep Reinforcement Learningアルゴリズムの重要なコンポーネントは、生成したエクスペリエンスの記憶機構やメモリとして機能するExperience Replayである。これらの経験はトレーニングに使われ、エージェントが問題空間を安定して完璧な軌道を見つけるのに役立ちます。しかし、古典的な体験リプレイは実際に作った経験のみを使うが、保存されたサンプルは抽出できる問題の知識という形で大きな可能性を秘めている。学習者を支援するために,非決定論的離散環境において合成経験を生成するアルゴリズムを提案する。補間されたエクスペリエンスリプレイは、フリーズレイク環境で評価され、エージェントが従来のバージョンよりも早く、さらに良く学習できるようにサポートできることが示されている。 An important component of many Deep Reinforcement Learning algorithms is the Experience Replay which serves as a storage mechanism or memory of made experiences. These experiences are used for training and help the agent to stably find the perfect trajectory through the problem space. The classic Experience Replay however makes only use of the experiences it actually made, but the stored samples bear great potential in form of knowledge about the problem that can be extracted. We present an algorithm that creates synthetic experiences in a nondeterministic discrete environment to assist the learner. The Interpolated Experience Replay is evaluated on the FrozenLake environment and we show that it can support the agent to learn faster and even better than the classic version.	翻訳日:2023-01-04 02:43:19 公開日:2020-02-04
# コストセンシティブな大マルジン分類器に対する近似マージンアプローチ Apportioned Margin Approach for Cost Sensitive Large Margin Classifiers ( http://arxiv.org/abs/2002.01408v1 ) ライセンス: Link先を確認	Lee-Ad Gottlieb, Eran Kaufman, Aryeh Kontorovich	(参考訳) コストに敏感なマルチクラス分類の問題を考察し、より重要でないクラスを犠牲にして、重要なクラスの感度を高めたいと考えている。我々はこの問題に対処するために {\em Apportioned margin} フレームワークを採用し、同じ境界を共有するクラス間の効率的なマージンシフトを可能にする。すべてのクラス間の決定境界は、与えられた優先順位付けベクトルに従ってそれらのマージンを分割し、重要なクラスに対してより厳密なエラーを生じると同時に、全体のアウト・オブ・サンプルエラーを減少させる。フレームワークの効率的な実装の実証に加えて、一般化バウンダリの導出、フィッシャーの一貫性の実証、Mercurerのカーネルとニューラルネットワークへの適応、そしてすべてのアカウントで有望な実証結果の報告を行う。 We consider the problem of cost sensitive multiclass classification, where we would like to increase the sensitivity of an important class at the expense of a less important one. We adopt an {\em apportioned margin} framework to address this problem, which enables an efficient margin shift between classes that share the same boundary. The decision boundary between all pairs of classes divides the margin between them in accordance to a given prioritization vector, which yields a tighter error bound for the important classes while also reducing the overall out-of-sample error. In addition to demonstrating an efficient implementation of our framework, we derive generalization bounds, demonstrate Fisher consistency, adapt the framework to Mercer's kernel and to neural networks, and report promising empirical results on all accounts.	翻訳日:2023-01-04 02:43:06 公開日:2020-02-04
# ベイジアン能動差分選択による心理測定検査の高速化 Accelerating Psychometric Screening Tests With Bayesian Active Differential Selection ( http://arxiv.org/abs/2002.01547v1 ) ライセンス: Link先を確認	Trevor J. Larsen, Gustavo Malkomes, Dennis L. Barbour	(参考訳) 古典的な心理測定関数推定法は過度な測定を必要とするか、目標の心理測定関数の低分解能近似しか生成しない。本稿では,ある患者の心理計測関数推定の変化を迅速にスクリーニングする新しい方法を提案する。ベイジアン能動モデル選択を用いて、従来のオーディオグラムと異なるものかどうかを素早く見つけることを目的として、純音音響グラムの自動検査を行う。我々は,国立労働安全衛生研究所のオーディオメトリックデータを用いて,我々のアプローチを検証する。最初の結果は、2つのテストセッションの間に患者の聴力関数が高信頼で変化したかどうかを数音で検出できることを示している。 Classical methods for psychometric function estimation either require excessive measurements or produce only a low-resolution approximation of the target psychometric function. In this paper, we propose a novel solution for rapid screening for a change in the psychometric function estimation of a given patient. We use Bayesian active model selection to perform an automated pure-tone audiogram test with the goal of quickly finding if the current audiogram will be different from a previous audiogram. We validate our approach using audiometric data from the National Institute for Occupational Safety and Health NIOSH. Initial results show that with a few tones we can detect if the patient's audiometric function has changed between the two test sessions with high confidence.	翻訳日:2023-01-04 02:42:21 公開日:2020-02-04
# 大きなバッチトレーニングはウォームアップを必要としない Large Batch Training Does Not Need Warmup ( http://arxiv.org/abs/2002.01576v1 ) ライセンス: Link先を確認	Zhouyuan Huo, Bin Gu, Heng Huang	(参考訳) 大規模なバッチサイズによるディープニューラルネットワークのトレーニングでは、有望な結果が得られ、現実世界のアプリケーションの多くにメリットがある。しかし、オプティマイザは早期にゆっくりと収束し、大規模なディープラーニング最適化ヒューリスティックと理論的基礎の間にはギャップがある。本稿では,大規模バッチトレーニングのための新しい階層型適応レートスケーリング(clars)アルゴリズムを提案する。また,勾配法の新しい微粒化解析を導入することにより,提案手法の収束率も解析する。我々は,このギャップを埋め,線形学習率のスケーリング,漸進的ウォームアップ,層幅適応率のスケーリングなど,3つの一般的な大規模バッチトレーニング手法の理論的洞察を示す。大規模な実験により,提案アルゴリズムは,ImageNetデータセット上での高度なディープニューラルネットワーク(ResNet,DenseNet,MobileNet)のトレーニングにおいて,最先端の大規模バッチオプティマイザの収束を克服し,漸進的なウォームアップ手法よりも優れていた。 Training deep neural networks using a large batch size has shown promising results and benefits many real-world applications. However, the optimizer converges slowly at early epochs and there is a gap between large-batch deep learning optimization heuristics and theoretical underpinnings. In this paper, we propose a novel Complete Layer-wise Adaptive Rate Scaling (CLARS) algorithm for large-batch training. We also analyze the convergence rate of the proposed method by introducing a new fine-grained analysis of gradient-based methods. Based on our analysis, we bridge the gap and illustrate the theoretical insights for three popular large-batch training techniques, including linear learning rate scaling, gradual warmup, and layer-wise adaptive rate scaling. Extensive experiments demonstrate that the proposed algorithm outperforms gradual warmup technique by a large margin and defeats the convergence of the state-of-the-art large-batch optimizer in training advanced deep neural networks (ResNet, DenseNet, MobileNet) on ImageNet dataset.	翻訳日:2023-01-04 02:42:10 公開日:2020-02-04
# 空中画像マッチングのための双方向アンサンブルを有する双方向対称ネットワーク A Two-Stream Symmetric Network with Bidirectional Ensemble for Aerial Image Matching ( http://arxiv.org/abs/2002.01325v1 ) ライセンス: Link先を確認	Jae-Hyun Park, Woo-Jeoung Nam, Seong-Whan Lee	(参考訳) 本稿では,2ストリームの深層ネットワークを用いて異なる環境下で得られた2つの空中画像を正確にマッチングする手法を提案する。ネットワークは、対象画像を内部的に増強することにより、3つの入力画像で2ストリームを考慮し、トレーニングにおける追加の強化ペアを反映する。その結果、深層ネットワークのトレーニングプロセスは規則化され、そのネットワークは空中画像のばらつきに対して堅牢になる。さらに,幾何学的変換の同型性に動機付けられた双方向ネットワークに基づくアンサンブル手法を提案する。ネットワークやパラメータが加わらない2つの大域的変換パラメータが得られ、非対称なマッチング結果が軽減され、2つの結果の融合により性能が大幅に向上する。実験では,Google Earth と International Society for Photogrammetry and Remote Sensing (ISPRS) の航空画像を用いた。その結果を定量的に評価するために、マッチングの度合いを測る正しいキーポイント(PCK)メトリックの確率を適用した。定性的かつ定量的な結果は,従来の航空画像のマッチング方法と比較して,性能の差が大きいことを示している。すべてのコードとトレーニングされたモデル、およびデータセットはオンラインで利用可能です。 In this paper, we propose a novel method to precisely match two aerial images that were obtained in different environments via a two-stream deep network. By internally augmenting the target image, the network considers the two-stream with the three input images and reflects the additional augmented pair in the training. As a result, the training process of the deep network is regularized and the network becomes robust for the variance of aerial images. Furthermore, we introduce an ensemble method that is based on the bidirectional network, which is motivated by the isomorphic nature of the geometric transformation. We obtain two global transformation parameters without any additional network or parameters, which alleviate asymmetric matching results and enable significant improvement in performance by fusing two outcomes. For the experiment, we adopt aerial images from Google Earth and the International Society for Photogrammetry and Remote Sensing (ISPRS). To quantitatively assess our result, we apply the probability of correct keypoints (PCK) metric, which measures the degree of matching. The qualitative and quantitative results show the sizable gap of performance compared to the conventional methods for matching the aerial images. All code and our trained model, as well as the dataset are available online.	翻訳日:2023-01-04 02:41:51 公開日:2020-02-04
# ノード重み依存トラベルセールスパーソン問題:近似アルゴリズムとランダム探索ヒューリスティックス The Node Weight Dependent Traveling Salesperson Problem: Approximation Algorithms and Randomized Search Heuristics ( http://arxiv.org/abs/2002.01070v1 ) ライセンス: Link先を確認	Jakob Bossek, Katrin Casel, Pascal Kerschke and Frank Neumann	(参考訳) 車両経路の領域におけるいくつかの重要な最適化問題は、古典的旅行販売問題(TSP)の変種と見なすことができる。進化計算の分野では,過去5年間で旅行盗難問題(TTP)の関心が高まっている。本稿では,旅行中に訪れたノードの重みに対して移動コストが増加するという観点から,このような問題に対する重みの影響について検討する。これにより、トラベリング・ティーフ問題や時間依存のTSP変種といった重要なTSP変種を抽象化し、重量依存による難易度の増加を正確に研究することができる。計量距離と有界正の重みを持つTSPのこの重み依存バージョンに対する3.59近似を提供する。さらに、古典的突然変異演算子と、重み付きTSPに適応した最先端進化アルゴリズムEAXの2つの変種を用いて、単純なランダム化局所探索実験を行った。その結果,ノード重みがツアー中のノードの位置に与える影響が示唆された。 Several important optimization problems in the area of vehicle routing can be seen as a variant of the classical Traveling Salesperson Problem (TSP). In the area of evolutionary computation, the traveling thief problem (TTP) has gained increasing interest over the last 5 years. In this paper, we investigate the effect of weights on such problems, in the sense that the cost of traveling increases with respect to the weights of nodes already visited during a tour. This provides abstractions of important TSP variants such as the Traveling Thief Problem and time dependent TSP variants, and allows to study precisely the increase in difficulty caused by weight dependence. We provide a 3.59-approximation for this weight dependent version of TSP with metric distances and bounded positive weights. Furthermore, we conduct experimental investigations for simple randomized local search with classical mutation operators and two variants of the state-of-the-art evolutionary algorithm EAX adapted to the weighted TSP. Our results show the impact of the node weights on the position of the nodes in the resulting tour.	翻訳日:2023-01-04 02:41:32 公開日:2020-02-04
# 大規模分散トレーニングにおける効率向上 Improving Efficiency in Large-Scale Decentralized Distributed Training ( http://arxiv.org/abs/2002.01119v1 ) ライセンス: Link先を確認	Wei Zhang, Xiaodong Cui, Abdullah Kayi, Mingrui Liu, Ulrich Finkler, Brian Kingsbury, George Saon, Youssef Mroueh, Alper Buyuktosunoglu, Payel Das, David Kung, Michael Picheny	(参考訳) Decentralized Parallel SGD (D-PSGD) と非同期型 Asynchronous Parallel SGD (AD-PSGD) は分散学習アルゴリズムの一群であり、大規模深層学習に有効である。 A)D-PSGDの欠点は、混合行列のスペクトルギャップがシステム内の学習者の数が増えると減少し、ハマーが収束することである。本稿では,通信コストを最小化しつつスペクトルギャップを改善し,(a)d-psgdに基づくトレーニングを高速化する手法について検討する。提案手法の有効性を示すために,2000時間Switchboard音声認識タスクとImageNetコンピュータビジョンタスクの実験を行った。 IBM P9 スーパーコンピュータ上では,Hav5-2000 Switchboard (SWB) テストセットで7.5% WER,CallHome (CH) テストセットで13.3% WER,SWB で7.7% WER で1.98時間,CH で128 V100 GPU で13.3% WER で2.28時間,LSTM 音響モデルをトレーニングすることができる。 Decentralized Parallel SGD (D-PSGD) and its asynchronous variant Asynchronous Parallel SGD (AD-PSGD) is a family of distributed learning algorithms that have been demonstrated to perform well for large-scale deep learning tasks. One drawback of (A)D-PSGD is that the spectral gap of the mixing matrix decreases when the number of learners in the system increases, which hampers convergence. In this paper, we investigate techniques to accelerate (A)D-PSGD based training by improving the spectral gap while minimizing the communication cost. We demonstrate the effectiveness of our proposed techniques by running experiments on the 2000-hour Switchboard speech recognition task and the ImageNet computer vision task. On an IBM P9 supercomputer, our system is able to train an LSTM acoustic model in 2.28 hours with 7.5% WER on the Hub5-2000 Switchboard (SWB) test set and 13.3% WER on the CallHome (CH) test set using 64 V100 GPUs and in 1.98 hours with 7.7% WER on SWB and 13.3% WER on CH using 128 V100 GPUs, the fastest training time reported to date.	翻訳日:2023-01-04 02:41:17 公開日:2020-02-04
# Feature-Rich biLSTMモデルによるアラビア語の発音回復 Arabic Diacritic Recovery Using a Feature-Rich biLSTM Model ( http://arxiv.org/abs/2002.01207v1 ) ライセンス: Link先を確認	Kareem Darwish, Ahmed Abdelali, Hamdy Mubarak, Mohamed Eldesouki	(参考訳) 方言(短母音)は通常アラビア文字を書く際に省略され、読み手はそれらを正しく発音するために再導入する必要がある。アラビア語のダイアクリティカルは2種類あり、第1は語彙選択を規定するコアワードダイアクリティカル(CW)、第2はケースエンディング(CE)であり、通常は語幹の端に現れ、一般的にそれらの構文的役割を規定する。 CEのリカバリは、単語間の依存関係のため、コアワードのダイアクリティカルティクスを回復するよりも比較的難しい。本稿では,言語的特徴と表面的特徴を多用した機能豊富なリカレントニューラルネットワークモデルを用いて,コアワードのダイアクリティカルティクスとケースエンディングの両方を復元する。本モデルでは,従来の2.86\%のcwエラー率と3.7%のceエラー率 (ceer) と2.2%のcwerと2.5%の古典アラビア語 (ca) のceをそれぞれ上回っている。ダイアクリッド化した単語コアとケースエンディングとを組み合わせると、それぞれMSAとCAそれぞれ6.0%と4.3%となる。これは、そのような深層神経モデルに対する機能工学の有効性を強調している。 Diacritics (short vowels) are typically omitted when writing Arabic text, and readers have to reintroduce them to correctly pronounce words. There are two types of Arabic diacritics: the first are core-word diacritics (CW), which specify the lexical selection, and the second are case endings (CE), which typically appear at the end of the word stem and generally specify their syntactic roles. Recovering CEs is relatively harder than recovering core-word diacritics due to inter-word dependencies, which are often distant. In this paper, we use a feature-rich recurrent neural network model that uses a variety of linguistic and surface-level features to recover both core word diacritics and case endings. Our model surpasses all previous state-of-the-art systems with a CW error rate (CWER) of 2.86\% and a CE error rate (CEER) of 3.7% for Modern Standard Arabic (MSA) and CWER of 2.2% and CEER of 2.5% for Classical Arabic (CA). When combining diacritized word cores with case endings, the resultant word error rate is 6.0% and 4.3% for MSA and CA respectively. This highlights the effectiveness of feature engineering for such deep neural models.	翻訳日:2023-01-04 02:34:05 公開日:2020-02-04
# テキスト分類コーパスの拡張のための反復データプログラミング Iterative Data Programming for Expanding Text Classification Corpora ( http://arxiv.org/abs/2002.01412v1 ) ライセンス: Link先を確認	Neil Mallinar, Abhishek Shah, Tin Kam Ho, Rajendra Ugrani, Ayush Gupta	(参考訳) 実世界のテキスト分類タスクは、しばしば取得するのに高価なラベル付きトレーニング例を必要とする。機械教育の最近の進歩、特にデータプログラミングパラダイムは、ラベリング関数(英語版)として知られる弱いモデルを構築するための一般的なフレームワークを通じてデータセットを迅速に作成し、アンサンブル学習技術によってそれらを認知する。本稿では,近傍の弱モデル生成を最小限の監督で行うことで,テキストデータセットの強化を図るための,高速で簡単なデータプログラミング手法を提案する。さらに,本手法では,大量の未ラベルデータから疎分散なサンプルを同定する反復的手法を用いる。反復型データプログラミング技術は、よりラベル付きデータが人間のループで確認されるので、新しい弱いモデルを改善する。会話エージェントの意図認識を改善するタスクを含む,文分類作業における経験的結果を示す。 Real-world text classification tasks often require many labeled training examples that are expensive to obtain. Recent advancements in machine teaching, specifically the data programming paradigm, facilitate the creation of training data sets quickly via a general framework for building weak models, also known as labeling functions, and denoising them through ensemble learning techniques. We present a fast, simple data programming method for augmenting text data sets by generating neighborhood-based weak models with minimal supervision. Furthermore, our method employs an iterative procedure to identify sparsely distributed examples from large volumes of unlabeled data. The iterative data programming techniques improve newer weak models as more labeled data is confirmed with human-in-loop. We show empirical results on sentence classification tasks, including those from a task of improving intent recognition in conversational agents.	翻訳日:2023-01-04 02:33:39 公開日:2020-02-04
# オンデバイス自然言語処理のための軽量畳み込み表現 Lightweight Convolutional Representations for On-Device Natural Language Processing ( http://arxiv.org/abs/2002.01535v1 ) ライセンス: Link先を確認	Shrey Desai, Geoffrey Goh, Arun Babu, Ahmed Aly	(参考訳) ディープニューラルネットワークの計算とメモリの複雑さの増大により、低リソースの電子機器(携帯電話、タブレット、ウェアラブルなど)へのデプロイが困難になった。これらの懸念に対処するために多くのモデル圧縮手法を開発したが、入力表現自体を凝縮したものはほとんどない。本研究では,任意のニューラルモデルにスワップできる高速で正確で軽量な畳み込み表現法を提案する。さらに、Samsung Galaxy S9のリソース中心のメトリクス(例えば、モデルファイルサイズ、レイテンシ、メモリ使用量)を考慮すると、リカレント表現よりも利得を示す。 The increasing computational and memory complexities of deep neural networks have made it difficult to deploy them on low-resource electronic devices (e.g., mobile phones, tablets, wearables). Practitioners have developed numerous model compression methods to address these concerns, but few have condensed input representations themselves. In this work, we propose a fast, accurate, and lightweight convolutional representation that can be swapped into any neural model and compressed significantly (up to 32x) with a negligible reduction in performance. In addition, we show gains over recurrent representations when considering resource-centric metrics (e.g., model file size, latency, memory usage) on a Samsung Galaxy S9.	翻訳日:2023-01-04 02:33:25 公開日:2020-02-04
# HVACシステム故障検出のための伝達学習 Transfer Learning for HVAC System Fault Detection ( http://arxiv.org/abs/2002.01060v1 ) ライセンス: Link先を確認	Chase P. Dowling and Baosen Zhang	(参考訳) 空調システムの故障は建物の熱的快適性とエネルギー効率を低下させ、研究コミュニティから大きな注目を集め、データ駆動方式が人気を集めている。しかし、通常の運用状態と故障状態のようなラベル付きデータの欠如は、HVACシステムへの機械学習の適用を遅らせている。加えて、特定の建物では、適切な時間をかけて訓練を行うには、観測された欠陥の数が不十分な場合もあります。これらの課題を克服するために,通常の操作と故障操作を区別する新しいベイズ分類器の転送手法を提案する。鍵となるのは、この分類器を大量のセンサーと故障データ(例えばシミュレーションや標準テストデータ)で建物で訓練し、その分類器を新しい建物から少量の通常の操作データを使って新しい建物に転送することである。異なる気候における建築的類似の建物間で分類器を転送するための概念実証を行い,分類精度とリコールの維持に必要なサンプルは少ないことを示した。 Faults in HVAC systems degrade thermal comfort and energy efficiency in buildings and have received significant attention from the research community, with data driven methods gaining in popularity. Yet the lack of labeled data, such as normal versus faulty operational status, has slowed the application of machine learning to HVAC systems. In addition, for any particular building, there may be an insufficient number of observed faults over a reasonable amount of time for training. To overcome these challenges, we present a transfer methodology for a novel Bayesian classifier designed to distinguish between normal operations and faulty operations. The key is to train this classifier on a building with a large amount of sensor and fault data (for example, via simulation or standard test data) then transfer the classifier to a new building using a small amount of normal operations data from the new building. We demonstrate a proof-of-concept for transferring a classifier between architecturally similar buildings in different climates and show few samples are required to maintain classification precision and recall.	翻訳日:2023-01-04 02:33:11 公開日:2020-02-04
# boostingによる効率的、ノイズ耐性、プライベートラーニング Efficient, Noise-Tolerant, and Private Learning via Boosting ( http://arxiv.org/abs/2002.01100v1 ) ライセンス: Link先を確認	Mark Bun, Marco Leandro Carmosino, Jessica Sorrell	(参考訳) プライベートブースティングアルゴリズムを設計するためのシンプルなフレームワークを導入する。我々はこれらのアルゴリズムが差分プライベートで、効率的で、耐雑音性のあるPAC学習者である自然条件を与える。この枠組みを実証するために,標本複雑性が次元に依存しない大規模半空間に対して,雑音耐性およびプライベートpac学習器を構築する。大数学のハーフスペース学習者に2つのサンプル複雑性境界を与えます。 1つの境界は差分プライバシーのみに基づいており、この保証を一般化を保証するための資産として利用する。この最初の境界は、独立した関心を持つかもしれないプライバシーからpac学習者を得る一般的な方法を示している。第2境界は、大マルジン分類理論(脂肪散乱次元)の標準手法を用いて、大マルジンハーフスペースの微分プライベート学習において最もよく知られたサンプルの複雑さと一致し、さらにランダムラベルノイズを許容する。 We introduce a simple framework for designing private boosting algorithms. We give natural conditions under which these algorithms are differentially private, efficient, and noise-tolerant PAC learners. To demonstrate our framework, we use it to construct noise-tolerant and private PAC learners for large-margin halfspaces whose sample complexity does not depend on the dimension. We give two sample complexity bounds for our large-margin halfspace learner. One bound is based only on differential privacy, and uses this guarantee as an asset for ensuring generalization. This first bound illustrates a general methodology for obtaining PAC learners from privacy, which may be of independent interest. The second bound uses standard techniques from the theory of large-margin classification (the fat-shattering dimension) to match the best known sample complexity for differentially private learning of large-margin halfspaces, while additionally tolerating random label noise.	翻訳日:2023-01-04 02:32:54 公開日:2020-02-04
# ケイリー変換によるスティーフェル多様体の効率的なリーマン最適化 Efficient Riemannian Optimization on the Stiefel Manifold via the Cayley Transform ( http://arxiv.org/abs/2002.01113v1 ) ライセンス: Link先を確認	Jun Li, Li Fuxin, Sinisa Todorovic	(参考訳) パラメータ行列に厳密な正規性制約を課すことは、ディープラーニングにおいて有利であることが示されている。これは、スティフェル多様体上のリーマン最適化に相当するが、計算上は高価である。この課題に対処するために、(1) 最適化更新のための反復ケイリー変換に基づく新しい効率的なリトラクションマップ、(2) モーメントの射影とスティーフェル多様体上のケイリー変換の組み合わせに基づく暗黙的なベクトル輸送機構を提案する。モーメントを持つケイリーSGDと、スティーフェル多様体上のケイリーADAMの2つの新しい最適化アルゴリズムを指定する。 Cayley SGDの収束性は理論的に解析される。 cnnトレーニングの実験ではどちらのアルゴリズムも (a)CNNパラメータの正規性を強制する既存のアプローチと比較してイテレーション毎の実行時間の削減。 b) CNNの性能を損なうことなく, ベースラインSGDおよびADAMアルゴリズムよりも高速収束率を得る。 Cayley SGDとCayley ADAMもまた、RNNのユニタリ遷移行列を最適化するためのトレーニング時間を短縮することを示した。 Strictly enforcing orthonormality constraints on parameter matrices has been shown advantageous in deep learning. This amounts to Riemannian optimization on the Stiefel manifold, which, however, is computationally expensive. To address this challenge, we present two main contributions: (1) A new efficient retraction map based on an iterative Cayley transform for optimization updates, and (2) An implicit vector transport mechanism based on the combination of a projection of the momentum and the Cayley transform on the Stiefel manifold. We specify two new optimization algorithms: Cayley SGD with momentum, and Cayley ADAM on the Stiefel manifold. Convergence of Cayley SGD is theoretically analyzed. Our experiments for CNN training demonstrate that both algorithms: (a) Use less running time per iteration relative to existing approaches that enforce orthonormality of CNN parameters; and (b) Achieve faster convergence rates than the baseline SGD and ADAM algorithms without compromising the performance of the CNN. Cayley SGD and Cayley ADAM are also shown to reduce the training time for optimizing the unitary transition matrices in RNNs.	翻訳日:2023-01-04 02:32:37 公開日:2020-02-04
# GANにおける肯定的非ラベル分類について On Positive-Unlabeled Classification in GAN ( http://arxiv.org/abs/2002.01136v1 ) ライセンス: Link先を確認	Tianyu Guo, Chang Xu, Jiajun Huang, Yunhe Wang, Boxin Shi, Chao Xu, Dacheng Tao	(参考訳) 本稿では,標準GANの正・負の分類問題を定義し,その上で,識別器のトレーニングを安定化させる新しい手法を提案する。伝統的に、生成データは負である間、実際のデータは正とみなす。この正負の分類基準は, 実データよりも現実的であっても, 生成データの品質を徐々に向上させることなく, 判別器の学習過程を通じて常に固定された。対照的に、生成したデータをラベルなしとして扱う方が合理的であり、品質に応じて正または負の値になる可能性がある。判別器はこの正・未ラベルの分類問題に対する分類器であり、新しい正の無ラベルGAN(PUGAN)を導出する。提案モデルが達成する大域的最適性と同等の最適化目標について理論的に考察する。 PUGANは、これらの高度な判別器安定化手法と同等またはそれ以上の性能を達成できる。 This paper defines a positive and unlabeled classification problem for standard GANs, which then leads to a novel technique to stabilize the training of the discriminator in GANs. Traditionally, real data are taken as positive while generated data are negative. This positive-negative classification criterion was kept fixed all through the learning process of the discriminator without considering the gradually improved quality of generated data, even if they could be more realistic than real data at times. In contrast, it is more reasonable to treat the generated data as unlabeled, which could be positive or negative according to their quality. The discriminator is thus a classifier for this positive and unlabeled classification problem, and we derive a new Positive-Unlabeled GAN (PUGAN). We theoretically discuss the global optimality the proposed model will achieve and the equivalent optimization goal. Empirically, we find that PUGAN can achieve comparable or even better performance than those sophisticated discriminator stabilization methods.	翻訳日:2023-01-04 02:31:54 公開日:2020-02-04
# マルコフ雑音による線形2時間確率近似の有限時間解析 Finite Time Analysis of Linear Two-timescale Stochastic Approximation with Markovian Noise ( http://arxiv.org/abs/2002.01268v1 ) ライセンス: Link先を確認	Maxim Kaledin, Eric Moulines, Alexey Naumov, Vladislav Tadic, Hoi-To Wai	(参考訳) 線形2時間スケール確率近似(SA)スキームは、特に政策評価問題において強化学習(RL)で人気を博したアルゴリズムの重要なクラスである。近年、このスキームの有限時間解析の確立に、特に実際にユビキタスなマルコフ(非i.i.d.)ノイズ設定の下で、多くの研究がなされている。本稿では,線形2時間スケールSAの有限時間解析について述べる。我々の境界はマルコフとマルティンゲールノイズの収束速度に差がないことを示しているが、定数のみがマルコフ連鎖の混合時間に影響されている。適切なステップサイズスケジュールでは、期待されるエラーバウンドの過渡項は$o(1/k^c)$であり、定常項は${\cal o}(1/k)$であり、ここで$c>1$と$k$はイテレーション番号である。さらに、期待誤差の漸近的拡大を示し、一致する下限が$\omega(1/k)$ であることを示す。我々の理論を支持するため、簡単な数値実験を行う。 Linear two-timescale stochastic approximation (SA) scheme is an important class of algorithms which has become popular in reinforcement learning (RL), particularly for the policy evaluation problem. Recently, a number of works have been devoted to establishing the finite time analysis of the scheme, especially under the Markovian (non-i.i.d.) noise settings that are ubiquitous in practice. In this paper, we provide a finite-time analysis for linear two timescale SA. Our bounds show that there is no discrepancy in the convergence rate between Markovian and martingale noise, only the constants are affected by the mixing time of the Markov chain. With an appropriate step size schedule, the transient term in the expected error bound is $o(1/k^c)$ and the steady-state term is ${\cal O}(1/k)$, where $c>1$ and $k$ is the iteration number. Furthermore, we present an asymptotic expansion of the expected error with a matching lower bound of $\Omega(1/k)$. A simple numerical experiment is presented to support our theory.	翻訳日:2023-01-04 02:31:13 公開日:2020-02-04
# ビジュアルコンセプト-メタコンセプト学習 Visual Concept-Metaconcept Learning ( http://arxiv.org/abs/2002.01464v1 ) ライセンス: Link先を確認	Chi Han, Jiayuan Mao, Chuang Gan, Joshua B. Tenenbaum, Jiajun Wu	(参考訳) 視覚的な入力から赤と緑を認識し、それらがオブジェクトの同じ性質(つまり色)を記述することも理解している。本稿では,画像と関連する質問応答対から概念とメタ概念を共同学習するための視覚概念メタコンセプタ(VCML)を提案する。キーとなるのは,視覚概念とメタ概念の双方向接続を活用することだ。視覚的表現は、見当たらない概念のペア間の関係を予測するための基礎となる手がかりを提供する。赤と緑がオブジェクトの同じ性質を記述していることを知ると、立方体と球面がオブジェクトの形状を分類するため、オブジェクトの同じ性質も記述しているという事実を一般化する。一方、メタコンセプトに関する知識は、限られた、騒々しい、バイアスのあるデータから視覚的な概念を学ぶのに役立ちます。紫のキューブの例から、新しい色の紫はキューブの形ではなくキューブの色に似ています。合成および実世界の両方のデータセットの評価は、我々の主張を検証する。 Humans reason with concepts and metaconcepts: we recognize red and green from visual input; we also understand that they describe the same property of objects (i.e., the color). In this paper, we propose the visual concept-metaconcept learner (VCML) for joint learning of concepts and metaconcepts from images and associated question-answer pairs. The key is to exploit the bidirectional connection between visual concepts and metaconcepts. Visual representations provide grounding cues for predicting relations between unseen pairs of concepts. Knowing that red and green describe the same property of objects, we generalize to the fact that cube and sphere also describe the same property of objects, since they both categorize the shape of objects. Meanwhile, knowledge about metaconcepts empowers visual concept learning from limited, noisy, and even biased data. From just a few examples of purple cubes we can understand a new color purple, which resembles the hue of the cubes instead of the shape of them. Evaluation on both synthetic and real-world datasets validates our claims.	翻訳日:2023-01-04 02:25:14 公開日:2020-02-04
# グラフィカル相互情報最大化によるグラフ表現学習 Graph Representation Learning via Graphical Mutual Information Maximization ( http://arxiv.org/abs/2002.01169v1 ) ライセンス: Link先を確認	Zhen Peng, Wenbing Huang, Minnan Luo, Qinghua Zheng, Yu Rong, Tingyang Xu, Junzhou Huang	(参考訳) ソーシャルネットワークやコミュニケーションネットワークといった様々な情報ネットワークの内容の豊かさは、外部の監督なしに高品質な表現を学ぶ前例のない可能性をもたらす。本稿では,グラフ構造データからの豊富な情報を,教師なしの方法で埋め込み空間に保存し,抽出する方法を検討する。そこで本研究では,入力グラフとハイレベル隠れ表現との相関を測定するための新しい概念であるグラフィカル相互情報(gmi)を提案する。 gmiはベクトル空間からグラフ領域への従来の相互情報計算の考え方を一般化し、ノードの特徴と位相構造から相互情報を測定することは不可欠である。まず、既存のグラフ表現学習アルゴリズムでは避けられない制約である入力グラフの同型変換に不変であり、MINEのような現在の相互情報推定手法によって効率的に推定・最大化することができる。 GMIの助けを借りて、グラフニューラルエンコーダの入力と出力の間でGMIを最大化することで訓練された教師なし学習モデルを開発する。トランスダクティブおよびインダクティブノード分類およびリンク予測に関する検討実験により,本手法は最先端の教師なし手法よりも優れ,時には教師なし手法よりも優れることが示された。 The richness in the content of various information networks such as social networks and communication networks provides the unprecedented potential for learning high-quality expressive representations without external supervision. This paper investigates how to preserve and extract the abundant information from graph-structured data into embedding space in an unsupervised manner. To this end, we propose a novel concept, Graphical Mutual Information (GMI), to measure the correlation between input graphs and high-level hidden representations. GMI generalizes the idea of conventional mutual information computations from vector space to the graph domain where measuring mutual information from two aspects of node features and topological structure is indispensable. GMI exhibits several benefits: First, it is invariant to the isomorphic transformation of input graphs---an inevitable constraint in many existing graph representation learning algorithms; Besides, it can be efficiently estimated and maximized by current mutual information estimation methods such as MINE; Finally, our theoretical analysis confirms its correctness and rationality. With the aid of GMI, we develop an unsupervised learning model trained by maximizing GMI between the input and output of a graph neural encoder. Considerable experiments on transductive as well as inductive node classification and link prediction demonstrate that our method outperforms state-of-the-art unsupervised counterparts, and even sometimes exceeds the performance of supervised ones.	翻訳日:2023-01-04 02:24:00 公開日:2020-02-04
# コンパクトパターン表現のための整数重み付き節付き回帰tsetlinマシン A Regression Tsetlin Machine with Integer Weighted Clauses for Compact Pattern Representation ( http://arxiv.org/abs/2002.01245v1 ) ライセンス: Link先を確認	K. Darshana Abeyrathna, Ole-Christoffer Granmo, Morten Goodwin	(参考訳) Regression Tsetlin Machine (RTM)は、最先端の非線形回帰モデルの解釈可能性の欠如に対処する。これは命題論理の連結節を使用して、データ内の非線形頻繁なパターンをキャプチャすることで実現される。しかし、これらは線型回帰関数と同様の和を通じて連続的な出力に結合され、非線型成分とユニタリ重みを持つ。 RTMは競合精度で非線形回帰問題を解くが、出力の解像度は使用される節数に比例する。これは、計算コストが解像度によって増加することを意味する。この問題を解決するため、整数重み付きRTM節を導入する。我々の整数重み付き節(integer weighted clause)は、同一のサブパターン-nリピート節をキャプチャした複数の節のコンパクトな表現であり、整数重み n で 1 に変換される。さらに,文節と重みの両方を同時に学習する,いわゆる確率探索を活かした新しい学習手法を提案する。整数重み付きrtmのポテンシャルを6つの人工データセットを用いて実証的に評価する。その結果、整数重み付きRTMは通常のRTMに比べて計算資源が大幅に少ないため、同等以上の精度で取得できることがわかった。さらに,整数重み付けにより実数値よりも精度が向上することを示す。 The Regression Tsetlin Machine (RTM) addresses the lack of interpretability impeding state-of-the-art nonlinear regression models. It does this by using conjunctive clauses in propositional logic to capture the underlying non-linear frequent patterns in the data. These, in turn, are combined into a continuous output through summation, akin to a linear regression function, however, with non-linear components and unity weights. Although the RTM has solved non-linear regression problems with competitive accuracy, the resolution of the output is proportional to the number of clauses employed. This means that computation cost increases with resolution. To reduce this problem, we here introduce integer weighted RTM clauses. Our integer weighted clause is a compact representation of multiple clauses that capture the same sub-pattern-N repeating clauses are turned into one, with an integer weight N. This reduces computation cost N times, and increases interpretability through a sparser representation. We further introduce a novel learning scheme that allows us to simultaneously learn both the clauses and their weights, taking advantage of so-called stochastic searching on the line. We evaluate the potential of the integer weighted RTM empirically using six artificial datasets. The results show that the integer weighted RTM is able to acquire on par or better accuracy using significantly less computational resources compared to regular RTMs. We further show that integer weights yield improved accuracy over real-valued ones.	翻訳日:2023-01-04 02:23:36 公開日:2020-02-04
# スパイクニューラルネットワークのサイズとレジリエンスの多目的最適化 Multi-Objective Optimization for Size and Resilience of Spiking Neural Networks ( http://arxiv.org/abs/2002.01406v1 ) ライセンス: Link先を確認	Mihaela Dimovska, Travis Johnston, Catherine D. Schuman, J. Parker Mitchell, Thomas E. Potok	(参考訳) 脳の接続メカニズムにインスパイアされたニューロモルフィックコンピューティングアーキテクチャは、シリコンのスパイクニューラルネットワーク(snn)をモデル化する。そのため、ニューロモルフィックアーキテクチャは、制御や機械学習タスクを実行できる小型で低消費電力のチップを目標として設計・開発されている。しかし、開発したハードウェアの消費電力は、チップ上で評価されているネットワークのサイズに大きく依存する。さらに、チップ上で評価されたトレーニングされたSNNの精度は、ネットワークの学習重量を乱すハードウェアの電圧と電流の変動によって変化する可能性がある。ハードウェア側でこれらの混乱を最小限に抑える努力が行われているが、デプロイされたネットワークをよりレジリエンスにするためのソフトウェアベースの戦略は、この問題をさらに緩和するのに役立つ。本研究では,スパイキングニューラルネットワークを2つのニューロモルフィックアーキテクチャの実装に適用し,そのサイズを小さくすると同時に,ハードウェア故障に対する耐性を高めることを目的とした。進化的アルゴリズムを利用してSNNを訓練し、SNNのサイズとレジリエンスを最適化する多目的フィットネス関数を提案する。この戦略がハードウェアの欠点に対してより回復力のある、高性能で小型のネットワークに繋がることを示す。 Inspired by the connectivity mechanisms in the brain, neuromorphic computing architectures model Spiking Neural Networks (SNNs) in silicon. As such, neuromorphic architectures are designed and developed with the goal of having small, low power chips that can perform control and machine learning tasks. However, the power consumption of the developed hardware can greatly depend on the size of the network that is being evaluated on the chip. Furthermore, the accuracy of a trained SNN that is evaluated on chip can change due to voltage and current variations in the hardware that perturb the learned weights of the network. While efforts are made on the hardware side to minimize those perturbations, a software based strategy to make the deployed networks more resilient can help further alleviate that issue. In this work, we study Spiking Neural Networks in two neuromorphic architecture implementations with the goal of decreasing their size, while at the same time increasing their resiliency to hardware faults. We leverage an evolutionary algorithm to train the SNNs and propose a multiobjective fitness function to optimize the size and resiliency of the SNN. We demonstrate that this strategy leads to well-performing, small-sized networks that are more resilient to hardware faults.	翻訳日:2023-01-04 02:22:18 公開日:2020-02-04

Title

Authors

Abstract

論文公表日・翻訳日

# SiCにおけるSi空隙スピン量子ビットの局所振動モード

Local vibrational modes of Si vacancy spin qubits in SiC ( http://arxiv.org/abs/2002.00067v2 )

ライセンス: Link先を確認

Z. Shang, A. Hashemi, Y. Berenc\'en, H.-P. Komsa, P. Erhart, A. V. Krasheninnikov, G. V. Astakhov

(参考訳) シリコン炭化物は、この技術的にフレンドリーな材料における点欠陥の異常スピンと光学的性質のため、量子応用にとって非常に有望なプラットフォームである。これらの性質は結晶振動の影響を強く受けているが、スピン量子ビットの挙動とそれらの正確な関係は十分に研究されていない。成長した4H-SiCにおけるSi空隙スピン量子ビットの局所振動モードを明らかにする。共振マイクロ波場を用いて1種類の欠陥、いわゆるV2中心からの寄与を分離し、7つの等分されたフォノンレプリカとともにゼロフォノン線を観測する。さらに, 実験データとよく一致したフォトルミネッセンスライン形状の第一原理計算について述べる。計算精度を高め,計算時間を短縮するために,機械学習アルゴリズムを用いて力定数を抽出する。これにより、Si空孔における光学放出中に励起電子と結合した格子振動の支配的なモードを特定できる。 36MeVの共鳴フォノンエネルギーとDebye-Waller因子の約6%を得る。我々は光誘起スピン偏光の活性化エネルギーが局所振動エネルギーによって与えられることを実験的に確立した。本研究は,sicスピン量子ビットにおける電子状態と振動モードのカップリングに関する知見を与え,スピン,光学,機械的,熱的性質の予測に不可欠である。このアプローチは、sicおよび他の3dおよび2d材料にスペクトル重なりのある多くのスピン欠陥に適用することができる。

Silicon carbide is a very promising platform for quantum applications because of extraordinary spin and optical properties of point defects in this technologically-friendly material. These properties are strongly influenced by crystal vibrations, but the exact relationship between them and the behavior of spin qubits is not fully investigated. We uncover the local vibrational modes of the Si vacancy spin qubits in as-grown 4H-SiC. We apply the resonant microwave field to isolate the contribution from one particular type of defects, the so-called V2 center, and observe the zero-phonon line together with seven equally-separated phonon replicas. Furthermore, we present first-principles calculations of the photoluminescence lineshape, which are in excellent agreement with our experimental data. To boost up the calculation accuracy and decrease the computation time, we extract the force constants using machine learning algorithms. This allows us to identify dominant modes in the lattice vibrations coupled to an excited electron during optical emission in the Si vacancy. The resonance phonon energy of 36 meV and the Debye-Waller factor of about 6% are obtained. We establish experimentally that the activation energy of the optically-induced spin polarization is given by the local vibrational energy. Our findings give insight into the coupling of electronic states to vibrational modes in SiC spin qubits, which is essential to predict their spin, optical, mechanical and thermal properties. The approach described can be applied to a large variety of spin defects with spectrally overlapped contributions in SiC as well as in other 3D and 2D materials.

翻訳日:2023-06-05 02:30:05 公開日:2020-02-04

# double-raman singlet と doublet light-matter scheme を用いたオフ軸光渦

Off-axis optical vortices using double-Raman singlet and doublet light-matter schemes ( http://arxiv.org/abs/2002.00504v2 )

ライセンス: Link先を確認

Hamid Reza Hamedi, Julius Ruseckas, Emmanuel Paspalakis and Gediminas Juzeli\=unas

(参考訳) ダブルラマンゲイン原子媒体内で伝播するオフ軸光渦の形成について検討した。原子は2つの弱いプローブ場と、軌道角運動量(oam)を運ぶ2つの強いポンプビームと相互作用する。我々は、強力なポンプレーザーの1つだけがOAMを搭載している状況を考える。物質に結合したプローブ場の特定の重ね合わせは、シフト軸を持つ特定の光学的渦を形成する。このようなオフ軸渦は、2光子デチューニングの値に応じて、サブまたはスーパールミナル群速度で媒体内を伝播することができる。ポンプフィールドのエネルギーがプローブフィールドに伝達されるため、スーパールミネッセンス光渦は増幅と関連付けられる。周辺渦の位置は、ポンプ場のoamと強度によって操作できる。第2プローブフィールドの振幅が原子雲の先頭でゼロであれば、個々のプローブビームとポンプフィールドの間で光渦の交換が可能であることを示す。このモデルは4つのポンプ場と相互作用するより複雑なダブルラマン・ダブルトまで拡張されている。ダブルラマン・シンレットとは対照的に、ゼロ2光子デチューニングでもオフ軸サブまたは超光渦の生成が可能になった。

We study the formation of off-axis optical vortices propagating inside a double-Raman gain atomic medium. The atoms interact with two weak probe fields as well as two strong pump beams which can carry orbital angular momentum (OAM). We consider a situation when only one of the strong pump lasers carries an OAM. A particular superposition of probe fields coupled to the matter is shown to form specific optical vortices with shifted axes. Such off-axis vortices can propagate inside the medium with sub- or superluminal group velocity depending on the value of the two-photon detuning. The superluminal optical vortices are associated with the amplification as the energy of pump fields is transferred to the probe fields. The position of the peripheral vortices can be manipulated by the OAM and intensity of the pump fields. We show that the exchange of optical vortices is possible between individual probe beams and the pump fields when the amplitude of the second probe field is zero at the beginning of the atomic cloud. The model is extended to a more complex double Raman doublet interacting with four pump fields. In contrast to the double-Raman-singlet, now the generation of the off-axis sub- or superluminal optical vortices is possible even for zero two-photon detuning.

翻訳日:2023-06-05 00:18:17 公開日:2020-02-04

# オプトメカニカルキャビティにおけるモードロック

Mode locking in an optomechanical cavity ( http://arxiv.org/abs/2002.01157v1 )

ライセンス: Link先を確認

Eyal Buks, Roei Levi and Ivar Martin

(参考訳) 本研究では, メカニカル共振器ミラーと光増幅器を統合した光リングキャビティを実験的に検討した。この装置は同期や自励振動を含む様々な興味深い非線形効果を示す。光リングキャビティの周波数が懸架ミラーの機械的周波数に非常に近い場合に受動的に発生する光パルスを観測する。このメカニカルモードロックのしきい値における光パワーは、光増幅器の量子ノイズと関連していることがわかった。

We experimentally study a fiber-based optical ring cavity integrated with a mechanical resonator mirror and an optical amplifier. The device exhibits a variety of intriguing nonlinear effects including synchronization and self-excited oscillation. Passively generated optical pulses are observed when the frequency of the optical ring cavity is tuned very close to the mechanical frequency of the suspended mirror. The optical power at the threshold of this process of mechanical mode locking is found to be related to quantum noise of the optical amplifier.

翻訳日:2023-06-04 18:52:04 公開日:2020-02-04

# MANETを活用した被災地安全給付システム

Secure Payment System Utilizing MANET for Disaster Areas ( http://arxiv.org/abs/2002.01081v1 )

ライセンス: Link先を確認

Babatunde Ojetunde, Naoki Shibata, Juntao Gao

(参考訳) 災害地域のモバイル決済システムは、食品、衣類、薬品などの回収品を購入する人々に電子取引を提供する可能性がある。逆に、災害地域での取引を可能にするためには、大規模地震や洪水など災害時に破壊される可能性のある通信インフラ(有線ネットワークや携帯電話ネットワークなど)が必要であるため、災害地域では依存できない。本稿では,災害時の買い物を許可するトランザクションを実現するために,インフラストラクチャレスマネットを利用した新しいモバイル決済システムを提案する。具体的には、顧客間取引の支払い保証と、ブルームフィルタとマークルツリーに基づく軽量なスキームによるマルチレベル推奨機構を提供し、通信のオーバーヘッドを低減するための推奨機構を導入する。モバイル決済システムでは,位置情報に基づく相互監視方式やブラインドシグネチャなど,さまざまな方式を採用することでセキュアな取引を実現するとともに,新たに導入したイベントチェーン機構により,二重消費攻撃が防止される。シミュレーションにより検証したように,提案手法は災害現場で有効であり,高いトランザクション完了率,テストシナリオ全体の65%～90%を達成し,全体の平均7MBの商談メッセージサイズを持つモバイルデバイスのストレージ効率を向上する。

Mobile payment system in a disaster area have the potential to provide electronic transactions for people purchasing recovery goods like foodstuffs, clothes, and medicine. Conversely, to enable transactions in a disaster area, current payment systems need communication infrastructures (such as wired networks and cellular networks) which may be ruined during such disasters as large-scale earthquakes and flooding and thus cannot be depended on in a disaster area. In this paper, we introduce a new mobile payment system utilizing infrastructureless MANETs to enable transactions that permit users to shop in disaster areas. Specifically, we introduce an endorsement-based mechanism to provide payment guarantees for a customer-to-merchant transaction and a multilevel endorsement mechanism with a lightweight scheme based on Bloom filter and Merkle tree to reduce communication overheads. Our mobile payment system achieves secure transaction by adopting various schemes such as location-based mutual monitoring scheme and blind signature, while our newly introduce event chain mechanism prevents double spending attacks. As validated by simulations, the proposed mobile payment system is useful in a disaster area, achieving high transaction completion ratio, 65% - 90% for all scenario tested, and is storage-efficient for mobile devices with an overall average of 7MB merchant message size.

翻訳日:2023-06-04 18:51:39 公開日:2020-02-04

# 低光強度限界を超える原子鎖の光学応答:線形古典振動子モデルの有効性

Optical response of atom chains beyond the limit of low light intensity: The validity of the linear classical oscillator model ( http://arxiv.org/abs/2002.01417v1 )

ライセンス: Link先を確認

L. A. Williamson and J. Ruostekoski

(参考訳) 弱いコヒーレント入射光を受ける原子は、サブラジアントおよびスーパーラジアント集団励起固有モードをサポートする、結合した古典線形振動子として扱うことができる。量子多体マスター方程式を解くことにより, 閉じ込められた原子鎖からのコヒーレントかつ非コヒーレントな散乱を解くことにより, 駆動の強度を増大させる上でのこの擬似古典振動子モデルの妥当性の限界を同定する。線形古典振動子モデルからの偏差は、光によって励起される集合固有モードの共振線幅$\upsilon_\alpha$に敏感に依存し、$\upsilon_\alpha$のパワーローとしてかなりの偏差が発生する強度が増大することを示した。線形古典振動子モデル(英語版)は、超ラジアント励起よりもずっと低い強度で不正確となり、7つの原子の例システムでは、2つのケース間で30の係数で入射光強度が異なる臨界入射光強度が生じる。個々にエキサイティングな固有モデムにより、この臨界強度はより狭い共鳴とより強く相互作用する系に対して$\upsilon_\alpha^{2.5}$のスケーリングを持ち、より広い共鳴に対して$\upsilon_\alpha^3$のスケーリングに近づき、双極子-双極子相互作用が減少する。また、$\upsilon_\alpha^3$のスケーリングは原子間の量子揺らぎが無視される半古典的結果に対応する。完全モード整合ドライブの場合と定在波駆動の場合の両方について検討し,極低ラジアントモードのみに出現する2例とファノ共鳴の位置との間に有意差を認めた。

Atoms subject to weak coherent incident light can be treated as coupled classical linear oscillators, supporting subradiant and superradiant collective excitation eigenmodes. We identify the limits of validity of this \emph{linear classical oscillator model} at increasing intensities of the drive by solving the quantum many-body master equation for coherent and incoherent scattering from a chain of trapped atoms. We show that deviations from the linear classical oscillator model depend sensitively on the resonance linewidths $\upsilon_\alpha$ of the collective eigenmodes excited by light, with the intensity at which substantial deviation occurs scaling as a powerlaw of $\upsilon_\alpha$. The linear classical oscillator model then becomes inaccurate at much lower intensities for subradiant collective excitations than superradiant ones, with an example system of seven atoms resulting in critical incident light intensities differing by a factor of 30 between the two cases. By individually exciting eigenmodes we find that this critical intensity has a $\upsilon_\alpha^{2.5}$ scaling for narrower resonances and more strongly interacting systems, while it approaches a $\upsilon_\alpha^3$ scaling for broader resonances and when the dipole-dipole interactions are reduced. The $\upsilon_\alpha^3$ scaling also corresponds to the semiclassical result whereby quantum fluctuations between the atoms have been neglected. We study both the case of perfectly mode-matched drives and the case of standing wave drives, with significant differences between the two cases appearing only at very subradiant modes and positions of Fano resonances.

翻訳日:2023-06-04 18:46:40 公開日:2020-02-04

# 有限データを用いた多重位相の適応ベイズ推定実験

Experimental adaptive Bayesian estimation of multiple phases with limited data ( http://arxiv.org/abs/2002.01232v1 )

ライセンス: Link先を確認

Mauro Valeri, Emanuele Polino, Davide Poderini, Ilaria Gianani, Giacomo Corrielli, Andrea Crespi, Roberto Osellame, Nicol\`o Spagnolo and Fabio Sciarrino

(参考訳) 推定過程における究極の境界を達成することが量子計量学の主目的である。この文脈では、限られた量の資源しか使わずに複数のパラメータを測定する必要がある。この目的のために、追加の制御パラメータを利用する適応プロトコルは、そのような限られたデータレジームで動作する量子センサーの性能を最適化するツールを提供する。推定プロセス中に制御パラメータをチューニングするための最適な戦略を見つけることは自明な問題であり、機械学習技術はそのような課題に対処するための自然な解決策である。本稿では,非常に限られたデータで最適性能に達するように調整された適応ベイズ型マルチパラメータ推定手法を初めて実験的に検討し,実装する。フェムト秒レーザーライティングにより作製されたコンパクトでフレキシブルな集積フォトニック回路を用いて,高次制御による異なる戦略を実現する。その結果、適応戦略は限られた量のリソースを扱う現実的なセンサに対して実行可能なアプローチになり得ることが示された。

Achieving ultimate bounds in estimation processes is the main objective of quantum metrology. In this context, several problems require measurement of multiple parameters by employing only a limited amount of resources. To this end, adaptive protocols, exploiting additional control parameters, provide a tool to optimize the performance of a quantum sensor to work in such limited data regime. Finding the optimal strategies to tune the control parameters during the estimation process is a non-trivial problem, and machine learning techniques are a natural solution to address such task. Here, we investigate and implement experimentally for the first time an adaptive Bayesian multiparameter estimation technique tailored to reach optimal performances with very limited data. We employ a compact and flexible integrated photonic circuit, fabricated by femtosecond laser writing, which allows to implement different strategies with high degree of control. The obtained results show that adaptive strategies can become a viable approach for realistic sensors working with a limited amount of resources.

翻訳日:2023-06-04 18:44:23 公開日:2020-02-04

# 核スピン島での電子シャットリングによる超放射能様ダイナミクス

Superradiant-like dynamics by electron shuttling on a nuclear-spin island ( http://arxiv.org/abs/2002.01219v1 )

ライセンス: Link先を確認

Yi-Nan Fang, Ying-Dan Wang, Rosario Fazio, and Stefano Chesi

(参考訳) 単一電子量子ドットにおける原子核スピン浴の超ラジアント様ダイナミクスを,電子がアイソトープに富む「原子核スピン島」上で周期的に停止するのを考慮して研究する。均一な超微細相互作用を仮定し、シャットリングによる核スピンの進化とその超輝度との関係を詳細に論じる。我々は、断続的なスピンの進化を免れる最小の停止時間を導出する。さらに,近傍のマイクロマグネットの不均質場下での低速・高速シャットリングについて検討する。最後に, 本手法を定常量子ドットモデルと比較することにより, 非断熱シャットリングがクーロン封鎖を解除し, 超ラジアント的挙動を確立する上で果たす役割を強調した。

We investigate superradiant-like dynamics of the nuclear-spin bath in a single-electron quantum dot, by considering electrons cyclically shuttling on/off an isotopically enriched `nuclear-spin island'. Assuming a uniform hyperfine interaction, we discuss in detail the nuclear spin evolution under shuttling and its relation to superradiance. We derive the minimum shuttling time which allows to escape the adiabatic spin evolution. Furthermore, we discuss slow/fast shuttling under the inhomogeneous field of a nearby micromagnet. Finally, by comparing our scheme to a model with stationary quantum dot, we stress the important role played by non-adiabatic shuttling in lifting the Coulomb blockade and thus establishing the superradiant-like behavior.

翻訳日:2023-06-04 18:44:08 公開日:2020-02-04

# LIGOの光とキログラム質量鏡の量子相関

Quantum correlations between the light and kilogram-mass mirrors of LIGO ( http://arxiv.org/abs/2002.01519v1 )

ライセンス: Link先を確認

Haocun Yu, L. McCuller, M. Tse, L. Barsotti, N. Mavalvala, J. Betzwieser, C. D. Blair, S. E. Dwyer, A. Effler, M. Evans, A. Fernandez-Galiana, P. Fritschel, V. V. Frolov, N. Kijbunchoo, F. Matichard, D. E. McClelland, T. McRae, A. Mullavey, D. Sigg, B. J. J. Slagmolen, C. Whittle, A. Buikema, Y. Chen, T. R. Corbitt, R. Schnabel, R. Abbott, C. Adams, R. X. Adhikari, A. Ananyeva, S. Appert, K. Arai, J. S. Areeda, Y. Asali, S. M. Aston, C. Austin, A. M. Baer, M. Ball, S. W. Ballmer, S. Banagiri, D. Barker, J. Bartlett, B. K. Berger, D. Bhattacharjee, G. Billingsley, S. Biscans, R. M. Blair, N. Bode, P. Booker, R. Bork, A. Bramley, A. F. Brooks, D. D. Brown, C. Cahillane, K. C. Cannon, X. Chen, A. A. Ciobanu, F. Clara, S. J. Cooper, K. R. Corley, S. T. Countryman, P. B. Covas, D. C. Coyne, L. E. H. Datrier, D. Davis, C. Di Fronzo, K. L. Dooley, J. C. Driggers, P. Dupej, T. Etzel, T. M. Evans, J. Feicht, P. Fulda, M. Fyffe, J. A. Giaime, K. D. Giardina, P. Godwin, E. Goetz, S. Gras, C. Gray, R. Gray, A. C. Green, Anchal Gupta, E. K. Gustafson, R. Gustafson, J. Hanks, J. Hanson, T. Hardwick, R. K. Hasskew, M. C. Heintze, A. F. Helmling-Cornell, N. A. Holland, J. D. Jones, S. Kandhasamy, S. Karki, M. Kasprzack, K. Kawabe, P. J. King, J. S. Kissel, Rahul Kumar, M. Landry, B. B. Lane, B. Lantz, M. Laxen, Y. K. Lecoeuche, J. Leviton, J. Liu, M. Lormand, A. P. Lundgren, R. Macas, M. MacInnis, D. M. Macleod, G. L. Mansell, S. M\'arka, Z. M\'arka, D. V. Martynov, K. Mason, T. J. Massinger, R. McCarthy, S. McCormick, J. McIver, G. Mendell, K. Merfeld, E. L. Merilh, F. Meylahn, T. Mistry, R. Mittleman, G. Moreno, C. M. Mow-Lowry, S. Mozzon, T. J. N. Nelson, P. Nguyen, L. K. Nuttall, J. Oberling, Richard J. Oram, C. Osthelder, D. J. Ottaway, H. Overmier, J. R. Palamos, W. Parker, E. Payne, A. Pele, C. J. Perez, M. Pirello, H. Radkins, K. E. Ramirez, J. W. Richardson, K. Riles, N. A. Robertson, J. G. Rollins, C. L. Romel, J. H. Romie, M. P. Ross, K. Ryan, T. Sadecki, E. J. Sanchez, L. E. Sanchez, T. R. Saravanan, R. L. Savage, D. Schaetzl, R. M. S. Schofield, E. Schwartz, D. Sellers, T. Shaffer, J. R. Smith, S. Soni, B. Sorazu, A. P. Spencer, K. A. Strain, L. Sun, M. J. Szczepa\'nczyk, M. Thomas, P. Thomas, K. A. Thorne, K. Toland, C. I. Torrie, G. Traylor, A. L. Urban, G. Vajente, G. Valdes, D. C. Vander-Hyde, P. J. Veitch, K. Venkateswara, G. Venugopalan, A. D. Viets, T. Vo, C. Vorvick, M. Wade, R. L. Ward, J. Warner, B. Weaver, R. Weiss, B. Willke, C. C. Wipf, L. Xiao, H. Yamamoto, Hang Yu, L. Zhang, M. E. Zucker, and J. Zweizig

(参考訳) より高精度な微小な力と変位の測定は、量子力学の柱によって課される限界(ハイゼンベルクの不確実性原理)に遭遇する。物体の位置を連続的に測定できる精度の限界は、標準量子極限(SQL)として知られている。光をプローブとして使用すると、物体に照射される光子放射圧の不確かさと光電検出における光子数とのバランスからSQLが生じる。 sqlを超える唯一の可能性は、オブジェクトの位置/運動量の不確かさとそれが反射する光の光子数/位相の不確かさとの相関である。本稿では,レーザー干渉計重力波観測所(LIGO)において,この種の量子相関が自然に発生するという理論的予測を実験的に証明する。以上の結果から,200kwレーザービームの位相と先端ligo検出器の40kgミラーの位置における量子力学的不確かさは,sqlの下の1.4(3db)以下の合同量子不確かさをもたらすことがわかった。量子相関は重力波(gw)観測だけでなく、将来全ての種類の測定値を改善すると予測している。

Measurement of minuscule forces and displacements with ever greater precision encounters a limit imposed by a pillar of quantum mechanics: the Heisenberg uncertainty principle. A limit to the precision with which the position of an object can be measured continuously is known as the standard quantum limit (SQL). When light is used as the probe, the SQL arises from the balance between the uncertainties of photon radiation pressure imposed on the object and of the photon number in the photoelectric detection. The only possibility surpassing the SQL is via correlations within the position/momentum uncertainty of the object and the photon number/phase uncertainty of the light it reflects. Here, we experimentally prove the theoretical prediction that this type of quantum correlation is naturally produced in the Laser Interferometer Gravitational-wave Observatory (LIGO). Our measurements show that the quantum mechanical uncertainties in the phases of the 200 kW laser beams and in the positions of the 40 kg mirrors of the Advanced LIGO detectors yield a joint quantum uncertainty a factor of 1.4 (3dB) below the SQL. We anticipate that quantum correlations will not only improve gravitational wave (GW) observatories but all types of measurements in future.

翻訳日:2023-06-04 18:38:09 公開日:2020-02-04

# 一ターン量子参照ゲームにおける複素性制限

Complexity limitations on one-turn quantum refereed games ( http://arxiv.org/abs/2002.01509v1 )

ライセンス: Link先を確認

Soumik Ghosh, John Watrous

(参考訳) 本稿では、量子状態を送信する2人のプレーヤー間の抽象ゲームである量子参照ゲームの複雑性理論的側面について検討し、そのプレイヤーがどのプレイヤーが勝つかを決定するために、2つの状態に対して効率的に実装可能なジョイント計測を行う。複雑性クラス $\mathrm{qrg}(1)$ は、一方のプレイヤーがyesインスタンスで常に高い確率で勝つことができ、もう一方のプレイヤーは、他方のプレイヤーの戦略に関わらず、常に無インスタンスで高い確率で勝つことができる決定問題を含んでいる。このクラスは自明に$\mathrm{QMA} \cup \text{co-}\mathrm{QMA}$を含み、$\mathrm{PSPACE}$に含まれることが知られている。このクラスの2つの制限付き不変量に対してより強い包含を証明します。 Specifically, if one of the players is limited to sending a classical (probabilistic) state rather than a quantum state, the resulting complexity class $\mathrm{CQRG}(1)$ is contained in $\exists\cdot\mathrm{PP}$ (the nondeterministic polynomial-time operator applied to $\mathrm{PP}$); while if both players send quantum states but the referee is forced to measure one of the states first, and incorporates the classical outcome of this measurement into a measurement of the second state, the resulting class $\mathrm{MQRG}(1)$ is contained in $\mathrm{P}\cdot\mathrm{PP}$ (the unbounded-error probabilistic polynomial-time operator applied to $\mathrm{PP}$).

This paper studies complexity theoretic aspects of quantum refereed games, which are abstract games between two competing players that send quantum states to a referee, who performs an efficiently implementable joint measurement on the two states to determine which of the player wins. The complexity class $\mathrm{QRG}(1)$ contains those decision problems for which one of the players can always win with high probability on yes-instances and the other player can always win with high probability on no-instances, regardless of the opposing player's strategy. This class trivially contains $\mathrm{QMA} \cup \text{co-}\mathrm{QMA}$ and is known to be contained in $\mathrm{PSPACE}$. We prove stronger containments on two restricted variants of this class. Specifically, if one of the players is limited to sending a classical (probabilistic) state rather than a quantum state, the resulting complexity class $\mathrm{CQRG}(1)$ is contained in $\exists\cdot\mathrm{PP}$ (the nondeterministic polynomial-time operator applied to $\mathrm{PP}$); while if both players send quantum states but the referee is forced to measure one of the states first, and incorporates the classical outcome of this measurement into a measurement of the second state, the resulting class $\mathrm{MQRG}(1)$ is contained in $\mathrm{P}\cdot\mathrm{PP}$ (the unbounded-error probabilistic polynomial-time operator applied to $\mathrm{PP}$).

翻訳日:2023-06-04 18:37:50 公開日:2020-02-04

# 非定常光ホモダイン量子状態トモグラフィによる条件分光

Conditional Spectroscopy via Non-Stationary Optical Homodyne Quantum State Tomography ( http://arxiv.org/abs/2002.01465v1 )

ライセンス: Link先を確認

Johannes Thewes, Carolin L\"uders, Marc A{\ss}mann

(参考訳) 連続可変量子状態トモグラフィーは、量子光学における光場の性質を研究する最も強力な手法の1つである。しかし、固定位相参照の必要性は、半導体分光法のような他の分野で広く使われることを妨げている。本稿では,超高速分光法の特殊要件に応用した非定常量子状態トモグラフィーを提案する。具体的には、固定位相参照を必要とせず、約100\,fsの時間分解能で光場の振幅と位相にアクセスできる。さらに,本手法は,従来の方法では実験的に到達できない確率力学の条件研究を可能とし,サブpsスケールにおける熱光場の確率力学を観測することにより,実験的にその能力を示す。最後に,本手法の離散変数類似と見なされるハンベリーブラウン-トウィス光子相関実験の相違点と類似点について考察する。

Continuous variable quantum state tomography is one of the most powerful techniques to study the properties of light fields in quantum optics. However, the need for a fixed phase reference has so far prevented widespread usage in other fields such as semiconductor spectroscopy. Here, we introduce non-stationary quantum state tomography, which adapts the technique to the special requirements of ultrafast spectroscopy. In detail, we gain access to the amplitude and phase of light fields with a temporal resolution of about 100\,fs without the need for a fixed phase reference. Further, we show how our technique allows us to perform conditional studies of stochastic dynamics that are inaccessible experimentally by conventional means, and demonstrate the capabilities experimentally by monitoring the stochastic dynamics of a thermal light field on the sub-ps scale. Finally, we discuss differences and similarities to more standard Hanbury Brown-Twiss photon correlation experiments, which may be considered as the discrete variable analogues of our technique.

翻訳日:2023-06-04 18:36:00 公開日:2020-02-04

# エネルギー電流の一方向通り:境界駆動量子スピン鎖におけるユビキタス現象

One-way street for the energy current: A ubiquitous phenomenon in boundary-driven quantum spin chains ( http://arxiv.org/abs/2002.01463v1 )

ライセンス: Link先を確認

Deborah Oliveira and Emmanuel Pereira and Humberto C F Lemos

(参考訳) 量子スケールでのエネルギー輸送の非自明な性質の記述に焦点を当て、境界駆動の$\mathit{XXZ}$と$\mathit{XXX}$Heisenbergモデルで記述された非対称量子スピン鎖について検討する。我々は定常状態の性質を確立するためにシステムのダイナミクスに関連するリンドブラッドマスター方程式の対称性を探索する。境界における目標偏光に対する一般的な仮定の下では、エネルギーの流れに対するユニークな方法の存在である一方通行路現象(英語版)(one-way street phenomena)と呼ばれるエネルギー整流(英語版)に関連する(しかし、より強い)効果の発生を示す。正確には、エネルギー電流は境界における浴槽の反転によって大きさや方向が変化せず、その方向は鎖のバルク内の非対称性によって完全に決定される。結果はシステムの規模や輸送体制とは無関係である。本研究は, 境界駆動型スピンシステムにおけるエネルギー流の一方向路現象のユビキタスな発生を示すものであり, エネルギー電流の制御・操作に使用される効率的な量子デバイスの研究・構築に寄与すると考えられる。

Focusing on the description of nontrivial properties of the energy transport at quantum scale, we investigate asymmetrical quantum spin chains described by boundary-driven $\mathit{XXZ}$ and $\mathit{XXX}$ Heisenberg models. We search for symmetries properties of the Lindblad master equation related to the dynamics of the system in order to establish properties of the steady state. Under rather general assumptions for the target polarization at the boundaries, we show the occurrence of an effect related to (but stronger than) energy rectification, namely, the one-way street phenomenon, which is the existence of an unique way for the energy flow. Precisely, the energy current does not change in magnitude and direction as we invert the baths at the boundaries: its direction is completely determined by the asymmetry in the bulk of the chain. The results follow independent of the system size and of the transport regime. Our findings show the ubiquitous occurrence of the one-way street phenomenon for the energy flow in boundary-driven spin systems and, we believe, they shall be an useful contribution to the area devoted to the investigation and building of efficient quantum devices used to control and manipulate the energy current.

翻訳日:2023-06-04 18:35:45 公開日:2020-02-04

# 内部摩擦を伴う量子カルノーサイクル

Quantum Carnot cycle with inner friction ( http://arxiv.org/abs/2002.01457v1 )

ライセンス: Link先を確認

Sel\c{c}uk \c{C}akmak, Ferdi Altintas

(参考訳) 単一駆動スピンは6ストロークの不可逆量子カルノーサイクルの作用物質として研究される。有限時間断熱変換に伴う内部摩擦がサイクル効率と収穫作業に与える影響について詳細に検討した。内摩擦は、作業出力とサイクル効率を著しく低減し、エンジンが過速な断熱変換のための正の作業を生み出すことを不可能にする。理想カルノ効率は準静的変換に対してのみ到達することが分かる。古典的なカルノ効率からのサイクル効率のずれは、内摩擦によるエントロピー全体の生成に直接関係する効率ラグによって与えられる。サイクルの緩和過程における放出熱はエントロピー生成と内部摩擦と関連している。また, スケール不変な量子ワーキング物質の実験結果の延長と, 液体状態核磁気共鳴系における可逆量子カルノサイクルの実験的実装の可能性についても考察した。

A single driven spin is investigated as the working substance of a six-stroke irreversible quantum Carnot cycle. The role of inner friction associated with the finite-time adiabatic transformations on the cycle efficiency and the harvested work are investigated in detail. The inner friction is found to significantly reduce the work output and the cycle efficiency which can make the engine incapable to produce positive work for the too fast adiabatic transformations. The ideal Carnot efficiency is found to be reached only for the quasi-static transformations. A deviation of the cycle efficiency from the classical Carnot efficiency has been given by an efficiency lag which is directly related to the total entropy production due to the inner friction. The released heat in the relaxation processes of the cycle are associated with the entropy production and the inner friction. The extension of the results for a scale invariant quantum working substance and the possible experimental implementation of the irreversible quantum Carnot cycle in a liquid state nuclear magnetic resonance setup are also discussed.

翻訳日:2023-06-04 18:35:23 公開日:2020-02-04

# QCA技術における新しいレベル感度T-FFを用いた同期カウンタ設計

Synchronous Counter Design Using Novel Level Sensitive T-FF in QCA Technology ( http://arxiv.org/abs/2002.11587v1 )

ライセンス: Link先を確認

Ali H. Majeed, Esam Alkaldy, Mohd Shamian bin Zainal, andDanial Bin MD Nor

(参考訳) 量子ドットセルオートマトン(QCA)ナノ技術は、低消費電力や小型化といった特徴から、コンピュータ科学者を惹きつけている。多くの論文が、多くのqca回路の無署名化や論理ゲートの最適構造への提示におけるこの技術の利用に関する論文で発表されている。 Tフリップフロップはデジタルデザインの重要な部分であり、同期カウンタや非同期カウンタの設計に使用することができる。本稿では,新しいTフリップフロップ構造を最適に提示する。提示された新しいゲートはnビットバイナリ同期カウンタの設計に使われた。 QCADesignerソフトウェアは設計した回路の検証とシミュレーション結果の提示に使われ、QCAProツールは電力分析に使用された。提案された設計は最小限の電力を必要とし、以前の設計よりも優れた改善が見られた。

The quantum-dot cellular automata (QCA) nano-technique has attracted computer scientists due to its noticeable features such as low power consumption and small size. Many papers have been published in the literature about the utilization of this technology for de-signing many QCA circuits and for presenting logic gates in an optimal structure. The T flip-flop, which is an essential part of digital designs, can be used to design synchronous and asynchronous counters. This paper presents a novel T flip-flop structure in an optimal form. The presented novel gate was used to design an N-bit binary synchronous counter. The QCADesigner software was used to verify the designed circuits and to present the simulation results, while the QCAPro tool was used for the power analysis. The proposed design required minimal power and showed good improvements over previous designs.

翻訳日:2023-06-04 18:26:34 公開日:2020-02-04

# 暗号通貨、ファイトマネー、ブロックチェーン、データベース

Criptocurrencies, Fiat Money, Blockchains and Databases ( http://arxiv.org/abs/2002.08466v1 )

ライセンス: Link先を確認

Jorge Barrera

(参考訳) 暗号通貨を含む2つの通貨の分類を解析する。暗号通貨の定義が与えられ、その価格がどのように固定されているかに基づいてその分類が提示される。現状のファイトマネーの使用と2段階銀行システムの運用の特徴について論じる。暗号通貨はフィアットマネーと比較され、後者が克服できない側面が示される。ブロックチェーンとデータベースの特徴について述べる。両技術の可能な使用事例を比較し、暗号通貨や特定のレコードに加えてブロックチェーンが有用性を示していないのに対して、データベースは運用中のほとんどの自動化システムの基盤となっている点に注意が必要である。

Two taxonomies of money that include cryptocurrencies are analyzed. A definition of the term cryptocurrency is given and a taxonomy of them is presented, based on how its price is fixed. The characteristics of the use of current fiat money and the operation of two-level banking systems are discussed. Cryptocurrencies are compared with fiat money and the aspects in which the latter cannot be overcome are indicated. The characteristics of blockchains and databases are described. The possible cases of use of both technologies are compared, and it is noted that blockchains, in addition to cryptocurrencies and certain records, have not yet shown their usefulness, while databases constitute the foundation of most of the automated systems in operation.

翻訳日:2023-06-04 18:25:41 公開日:2020-02-04

# 共変量子力学と量子時空

Covariant Quantum Mechanics and Quantum Spacetime ( http://arxiv.org/abs/2002.07083v1 )

ライセンス: Link先を確認

Suzana Bedi\'c, Otto C. W. Kong and Hock King Ting

(参考訳) 本稿では、ローレンツ対称性の下でミンコフスキー四ベクトルとして変換される位置および運動量作用素を持つハイゼンベルク・ワイル対称性からの群理論的構成に基づくローレンツ共変量子力学の定式化について述べる。基本表現は、本質的に正則表現の既約成分であるコヒーレント状態表現(英語版)(coherent state representation)として識別され、群 $C^*$-algebra の拡張の一致する表現は可観測体の代数を与える。この定式化の重要な特徴は、ユニタリではなく擬似ユニタリであり、ミンコフスキー時空表現と全く同じ意味である。明示的な波動関数の記述は、変数領域の制限なしに与えられるが、有限積分内積を持つ。関連する共変共変振動子フォック状態基底は、ユークリッド位置と任意の「次元」の運動量作用素を持つ調和振動子のものと正確に類似したすべての標準特性を持つ。ローレンツ対称性のガリレオ極限とローレンツ共変フレームワークの古典極限は、実数座標と非可換作用素座標の両方で与えられる位相空間の対称性を通して記述された力学を含む代数とその表現の適切な対称性収縮によって厳密に検索される。後者は(射影的)ヒルベルト空間を量子/非可換時空として明示的な図式を与える。

We present in the article the formulation of a version of Lorentz covariant quantum mechanics based on a group theoretical construction from a Heisenberg-Weyl symmetry with position and momentum operators transforming as Minkowski four-vectors under the Lorentz symmetry. The basic representation is identified as a coherent state representation, essentially an irreducible component of the regular representation, with the matching representation of an extension of the group $C^*$-algebra giving the algebra of observables. The key feature of the formulation is that it is not unitary but pseudo-unitary, exactly in the same sense as the Minkowski spacetime representation. Explicit wavefunction description is given without any restriction of the variable domains, yet with a finite integral inner product. The associated covariant harmonic oscillator Fock state basis has all the standard properties in exact analog to those of a harmonic oscillator with Euclidean position and momentum operators of any `dimension'. Galilean limit of the Lorentz symmetry and the classical limit of the Lorentz covariant framework are retrieved rigorously through appropriate symmetry contractions of the algebra and its representation, including the dynamics described through the symmetry of the phase space, given both in terms of real/complex number coordinates and noncommutative operator coordinates. The latter gives an explicit picture of the (projective) Hilbert space as a quantum/noncommutative spacetime.

翻訳日:2023-06-04 18:25:31 公開日:2020-02-04

# 連続スペクトルを持つPT対称ポテンシャル

PT-symmetric potentials having continuous spectra ( http://arxiv.org/abs/2002.04398v1 )

ライセンス: Link先を確認

Zichao Wen and Carl M. Bender

(参考訳) 連続スペクトルを持つ1次元PT対称量子力学ハミルトニアンは研究される。ハミルトン派は$H=p^2+V(x)$で、$V(x)$は$x$で奇数であり、純粋な虚数であり、$|x|\to\infty$として消える。 Five PT-symmetric potentials are studied: the Scarf-II potential $V_1(x)=iA_1\,{\rm sech}(x)\tanh(x)$, which decays exponentially for large $|x|$; the rational potentials $V_2(x)=iA_2\,x/(1+x^4)$ and $V_3(x)=iA_3\,x/(1+|x|^3)$, which decay algebraically for large $|x|$; the step-function potential $V_4(x)=iA_4\,{\rm sgn}(x)\theta(2.5-|x|)$, which has compact support; the regulated Coulomb potential $V_5(x)=iA_5\,x/(1+x^2)$, which decays slowly as $|x|\to\infty$ and may be viewed as a long-range potential. 実パラメータ$A_n$はこれらのポテンシャルの強度を測定する。これらのポテンシャルに関連する時間非依存的なシュリンガー固有値問題の解法は、対応するハミルトンのスペクトルが普遍性を示すことを示した。一般に、固有値は一部実数であり、一部複素数である。実固有値はスペクトルの連続部分を形成し、複素固有値はスペクトルの離散部分を形成する。実固有値は、0$から$+\infty$まで連続的に値が変化する。複素固有値は離散複素共役対で発生し、$V_n(x)$$$1\leq n\leq4$) の場合、これらの対の数は有限であり、強度パラメータ$A_n$の値が増加するにつれて増加する。しかし、$v_5(x)$ に対して、原点に極限点を持つ離散固有値の {\it infinite} 列が存在する。この配列は複雑であるが、逆二乗収束を持つため、水素原子に対するバルマー級数と似ている。

One-dimensional PT-symmetric quantum-mechanical Hamiltonians having continuous spectra are studied. The Hamiltonians considered have the form $H=p^2+V(x)$, where $V(x)$ is odd in $x$, pure imaginary, and vanishes as $|x|\to\infty$. Five PT-symmetric potentials are studied: the Scarf-II potential $V_1(x)=iA_1\,{\rm sech}(x)\tanh(x)$, which decays exponentially for large $|x|$; the rational potentials $V_2(x)=iA_2\,x/(1+x^4)$ and $V_3(x)=iA_3\,x/(1+|x|^3)$, which decay algebraically for large $|x|$; the step-function potential $V_4(x)=iA_4\,{\rm sgn}(x)\theta(2.5-|x|)$, which has compact support; the regulated Coulomb potential $V_5(x)=iA_5\,x/(1+x^2)$, which decays slowly as $|x|\to\infty$ and may be viewed as a long-range potential. The real parameters $A_n$ measure the strengths of these potentials. Numerical techniques for solving the time-independent Schr\"odinger eigenvalue problems associated with these potentials reveal that the spectra of the corresponding Hamiltonians exhibit universal properties. In general, the eigenvalues are partly real and partly complex. The real eigenvalues form the continuous part of the spectrum and the complex eigenvalues form the discrete part of the spectrum. The real eigenvalues range continuously in value from $0$ to $+\infty$. The complex eigenvalues occur in discrete complex-conjugate pairs and for $V_n(x)$ ($1\leq n\leq4$) the number of these pairs is finite and increases as the value of the strength parameter $A_n$ increases. However, for $V_5(x)$ there is an {\it infinite} sequence of discrete eigenvalues with a limit point at the origin. This sequence is complex, but it is similar to the Balmer series for the hydrogen atom because it has inverse-square convergence.

翻訳日:2023-06-04 18:24:46 公開日:2020-02-04

# エンタングルメントスワップによる量子密度符号化の実現手法

Scheme for realizing quantum dense coding via entanglement swapping ( http://arxiv.org/abs/2002.02422v1 )

ライセンス: Link先を確認

Nilakantha Meher

(参考訳) 量子密度符号化 (quantum dense coding) は、1つの量子ビット(qubit)だけを送信することで、送信者(Alice)からリモート受信機(Bob)に2つの古典的な情報を送信するプロトコルである。本稿では、ある数個の2レベル原子を含む空洞配列で交換する量子密度符号 \textit{via} エンタングルメントを実現するための実験的に実現可能なスキームを提案する。原子-キャビティカップリングやキャビティ間カップリングなどのシステムパラメータの適切な選択は、情報の完全な転送を可能にする。フォトニック結晶キャビティと超伝導共振器の文脈で最近達成された実験値を用いて、情報の高忠実度転送が可能であることが示されている。実験的な欠陥を模倣するため、結合強度と共振周波数の両方の障害を考慮する。

Quantum dense coding is a protocol for transmitting two classical bits of information from a sender (Alice) to a remote receiver (Bob) by sending only one quantum bit (qubit). In this article, we propose an experimentally feasible scheme to realize quantum dense coding \textit{via} entanglement swapping in a cavity array containing a certain number of two-level atoms. Proper choice of system parameters such as atom-cavity couplings and inter-cavity couplings allows perfect transfer of information. A high fidelity transfer of information is shown to be possible by using recently achieved experimental values in the context of photonic crystal cavities and superconducting resonators. To mimic experimental imperfections, disorder in both the coupling strengths and resonance frequencies is considered.

翻訳日:2023-06-04 18:24:09 公開日:2020-02-04

# 古典的・直観主義的な数学言語は、物理学における時間の理解を形作る

Classical and intuitionistic mathematical languages shape our understanding of time in physics ( http://arxiv.org/abs/2002.01653v1 )

ライセンス: Link先を確認

Nicolas Gisin

(参考訳) 物理学は時間のない古典数学で定式化されている。時間進化過程に基づいて構築された直観主義数学に基づく定式化は、我々の物理的現実の経験に近い視点を提供する。

Physics is formulated in terms of timeless classical mathematics. A formulation on the basis of intuitionist mathematics, built on time-evolving processes, would offer a perspective that is closer to our experience of physical reality.

翻訳日:2023-06-04 18:23:56 公開日:2020-02-04

# 賃貸住宅スポット市場:オンライン情報交換が送金データをどのように補完するか

Rental Housing Spot Markets: How Online Information Exchanges Can Supplement Transacted-Rents Data ( http://arxiv.org/abs/2002.01578v1 )

ライセンス: Link先を確認

Geoff Boeing, Jake Wegmann, Junfeng Jiao

(参考訳) アメリカン・コミュニティ・サーベイ(英語版)やアメリカン・ハウジング・サーベイ(英語版)のような伝統的な米国の賃貸住宅データソースは、既存の借り手が毎月支払う市場について報告している。彼らはスポットマーケットについて明確には教えてくれない - すなわち、現在の住宅購入者が住宅を購入するために支払わなければならない家賃だ。この研究は、政府のデータと何百万もの同時賃貸物件を比較し、賃貸料の請求がこれらの最新の推計から大きく異なることを発見した。従来型の住宅データは、現在の市場状況や、特にタイトで高価な賃貸市場のある都市における手頃価格の課題を過小評価している。

Traditional US rental housing data sources such as the American Community Survey and the American Housing Survey report on the transacted market - what existing renters pay each month. They do not explicitly tell us about the spot market - i.e., the asking rents that current homeseekers must pay to acquire housing - though they are routinely used as a proxy. This study compares governmental data to millions of contemporaneous rental listings and finds that asking rents diverge substantially from these most recent estimates. Conventional housing data understate current market conditions and affordability challenges, especially in cities with tight and expensive rental markets.

翻訳日:2023-06-04 18:23:52 公開日:2020-02-04

# ローカライズのための連合学習--プライバシー保護型クラウドソーシング手法

Federated Learning for Localization: A Privacy-Preserving Crowdsourcing Method ( http://arxiv.org/abs/2001.01911v2 )

ライセンス: Link先を確認

Bekir Sait Ciftler, Abdullatif Albaseer, Noureddine Lasla, Mohamed Abdallah

(参考訳) 受信信号強度(RSS)指紋ベースのローカライゼーションは、低コストで実装が容易なため、多くの研究成果を惹きつけ、ロケーションベースのサービスの商業的応用を育ててきた。深層学習(DL)アルゴリズムをローカライズに活用する研究が数多く行われている。 DLの機能を抽出し、自律的に分類する能力は、指紋ベースのローカライゼーションの魅力的なソリューションとなる。これらの解は、大量の測定値を持つDLモデルの頻繁な再訓練を必要とする。クラウドソーシングは大量のデータを集めるのに優れた方法だが、集中型サーバーでラベル付きデータを収集する必要があるため、参加者のプライバシーを損なう。近年,フェデレーション学習は,クラウドソーシングの参加者のプライバシ保護の問題を解決する手段として,エッジデバイス上でのモデルトレーニングを分散的に行うという実践的な概念として現れており,参加者はもはやデータを集中型サーバに公開しない。本稿では,クラウドソーシング参加者のプライバシーを維持しつつ,RSS指紋による位置推定の精度を向上させるために,フェデレーション学習を利用した新しい手法を提案する。フェデレートされた学習を利用することで、ユーザのデータのプライバシを保存することを保証すると同時に、実世界の環境でキャプチャされた実験データによる適切なローカライゼーションパフォーマンスを実現することができる。提案手法は, 集中学習用ブースタとして使用する場合のローカライズ精度を1.8m向上させ, 単独使用時のローカライズ精度を良好に向上させた。

Received Signal Strength (RSS) fingerprint-based localization has attracted a lot of research effort and cultivated many commercial applications of location-based services due to its low cost and ease of implementation. Many studies are exploring the use of deep learning (DL) algorithms for localization. DL's ability to extract features and to classify autonomously makes it an attractive solution for fingerprint-based localization. These solutions require frequent retraining of DL models with vast amounts of measurements. Although crowdsourcing is an excellent way to gather immense amounts of data, it jeopardizes the privacy of participants, as it requires to collect labeled data at a centralized server. Recently, federated learning has emerged as a practical concept in solving the privacy preservation issue of crowdsourcing participants by performing model training at the edge devices in a decentralized manner; the participants do not expose their data anymore to a centralized server. This paper presents a novel method utilizing federated learning to improve the accuracy of RSS fingerprint-based localization while preserving the privacy of the crowdsourcing participants. Employing federated learning allows ensuring \emph{preserving the privacy of user data} while enabling an adequate localization performance with experimental data captured in real-world settings. The proposed method improved localization accuracy by 1.8 meters when used as a booster for centralized learning and achieved satisfactory localization accuracy when used standalone.

翻訳日:2023-01-13 21:09:40 公開日:2020-02-04

# d3ba: 非決定性計画を用いたビジネスプロセス最適化ツール

D3BA: A Tool for Optimizing Business Processes Using Non-Deterministic Planning ( http://arxiv.org/abs/2001.02619v2 )

ライセンス: Link先を確認

Tathagata Chakraborti and Yasaman Khazaeni

(参考訳) 本稿では,対話エージェントの宣言的設計に関する最近の研究に基づいて,ai計画の力を利用してビジネスプロセスを最適化するディジタルビジネス自動化のための,エキサイティングな新しいツールであるd3baを提案する。このツールは、複雑なビジネスプロセスを構築し、最適化し、メンテナンスするための強力なフレームワークを提供する。我々は、この構成技法の有意義な特徴を説明し、他の構成哲学と比較し、この新興のビジネスプロセス自動化分野における研究のエキサイティングな機会を強調する。

This paper builds upon recent work in the declarative design of dialogue agents and proposes an exciting new tool -- D3BA -- Declarative Design for Digital Business Automation, built to optimize business processes using the power of AI planning. The tool provides a powerful framework to build, optimize, and maintain complex business processes and optimize them by composing with services that automate one or more subtasks. We illustrate salient features of this composition technique, compare with other philosophies of composition, and highlight exciting opportunities for research in this emerging field of business process automation.

翻訳日:2023-01-13 10:07:44 公開日:2020-02-04

# DALC:微粒化交通速度予測のための分散LSTMカスタマイズ

DALC: Distributed Automatic LSTM Customization for Fine-Grained Traffic Speed Prediction ( http://arxiv.org/abs/2001.09821v2 )

ライセンス: Link先を確認

Ming-Chang Lee and Jia-Chun Lin

(参考訳) 過去10年間で、短期交通予測のためのいくつかのアプローチが導入されている。しかし,多数の検出器を地理的に配置して交通データを収集する大規模交通ネットワークでは,詳細な交通予測がまだ未解決である。本稿では,単一検出器のlstmモデルを有限マルコフ決定プロセスにカスタマイズする問題を定式化し,それに対応する予測精度を可能な限り満足し,時間消費を極力低くできるように,単一検出器のlstmモデルを自動カスタマイズする自動lstmカスタマイズ(alc)アルゴリズムを導入する。 ALCアルゴリズムに基づいて,大規模輸送ネットワークにおけるLSTMモデル毎にLSTMモデルをカスタマイズするために,分散自動LSTMカスタマイズ(DALC)と呼ばれる分散アプローチを導入する。本実験は, dalcがapache spark mllibが提供する複数のアプローチよりも高い予測精度を提供することを示す。

Over the past decade, several approaches have been introduced for short-term traffic prediction. However, providing fine-grained traffic prediction for large-scale transportation networks where numerous detectors are geographically deployed to collect traffic data is still an open issue. To address this issue, in this paper, we formulate the problem of customizing an LSTM model for a single detector into a finite Markov decision process and then introduce an Automatic LSTM Customization (ALC) algorithm to automatically customize an LSTM model for a single detector such that the corresponding prediction accuracy can be as satisfactory as possible and the time consumption can be as low as possible. Based on the ALC algorithm, we introduce a distributed approach called Distributed Automatic LSTM Customization (DALC) to customize an LSTM model for every detector in large-scale transportation networks. Our experiment demonstrates that the DALC provides higher prediction accuracy than several approaches provided by Apache Spark MLlib.

翻訳日:2023-01-07 05:06:47 公開日:2020-02-04

# 創発的コミュニケーションにおけるグラフ表現学習に向けて

Towards Graph Representation Learning in Emergent Communication ( http://arxiv.org/abs/2001.09063v2 )

ライセンス: Link先を確認

Agnieszka S{\l}owik, Abhinav Gupta, William L. Hamilton, Mateja Jamnik, Sean B. Holden

(参考訳) 最近の神経科学の発見は、人間の脳が幾何学的構造(例えば概念空間を通して)の情報を表すことを示唆している。コミュニケーションするために、エンティティとその属性の複雑な表現を1つの単語または文にフラット化する。本稿では,マルチエージェントシステムにおける言語進化と協調を支援するために,グラフ畳み込みネットワークを用いる。画像ベースの参照ゲームに動機づけられ,複雑度が異なるグラフ参照ゲームを提案し,言語出現と協調の観点から望ましい特性を示す強力なベースラインモデルを提供する。出現したコミュニケーションプロトコルは頑健であり、エージェントはゲームの変動の真の要因を明らかにし、トレーニング中に遭遇したサンプルを超えて一般化することを学ぶ。

Recent findings in neuroscience suggest that the human brain represents information in a geometric structure (for instance, through conceptual spaces). In order to communicate, we flatten the complex representation of entities and their attributes into a single word or a sentence. In this paper we use graph convolutional networks to support the evolution of language and cooperation in multi-agent systems. Motivated by an image-based referential game, we propose a graph referential game with varying degrees of complexity, and we provide strong baseline models that exhibit desirable properties in terms of language emergence and cooperation. We show that the emerged communication protocol is robust, that the agents uncover the true factors of variation in the game, and that they learn to generalize beyond the samples encountered during training.

翻訳日:2023-01-07 04:41:04 公開日:2020-02-04

# 低複雑性畳み込みニューラルネットワークのための事前定義されたスパーシリティ

Pre-defined Sparsity for Low-Complexity Convolutional Neural Networks ( http://arxiv.org/abs/2001.10710v2 )

ライセンス: Link先を確認

Souvik Kundu, Mahdi Nazemi, Massoud Pedram, Keith M. Chugg, Peter A. Beerel

(参考訳) 深層畳み込みニューラルネットワークを処理するための高エネルギーコストは、組み込みシステムやIoTデバイスのようなエネルギー制約のあるプラットフォームへのユビキタスな展開を妨げる。この研究は、フィルター内およびフィルター間を周期的に繰り返すサポートセットを持つ、事前定義されたスパース2dカーネルによる畳み込み層を導入する。周期的スパースカーネルの効率的な保存により、パラメータセーブはDRAMアクセスの削減によるエネルギー効率の大幅な改善に変換できるため、トレーニングと推論の両方においてエネルギー消費と精度のトレードオフが大幅に改善されることが期待できる。このアプローチを評価するために、ResNet18とVGG16アーキテクチャのスパース変種において、広く受け入れられている2つのデータセットであるCIFAR-10とTiny ImageNetを用いて実験を行った。ベースラインモデルと比較すると,提案手法ではモデルパラメータが最大82%少なく,フラップが5.6倍少なく,cifar-10ではresnet18の精度が無視できる。 Tiny ImageNetでトレーニングされたVGG16では、FLOPが5.8倍少なく、モデルパラメータが83.3%少なく、トップ5(トップ-1)の精度はわずか1.2%(2.1%)である。また,提案アーキテクチャの性能をShuffleNetとMobileNetV2の性能と比較した。同様のハイパーパラメータとFLOPを用いて、ResNet18の変種は平均精度が2.8%向上した。

The high energy cost of processing deep convolutional neural networks impedes their ubiquitous deployment in energy-constrained platforms such as embedded systems and IoT devices. This work introduces convolutional layers with pre-defined sparse 2D kernels that have support sets that repeat periodically within and across filters. Due to the efficient storage of our periodic sparse kernels, the parameter savings can translate into considerable improvements in energy efficiency due to reduced DRAM accesses, thus promising significant improvements in the trade-off between energy consumption and accuracy for both training and inference. To evaluate this approach, we performed experiments with two widely accepted datasets, CIFAR-10 and Tiny ImageNet in sparse variants of the ResNet18 and VGG16 architectures. Compared to baseline models, our proposed sparse variants require up to 82% fewer model parameters with 5.6times fewer FLOPs with negligible loss in accuracy for ResNet18 on CIFAR-10. For VGG16 trained on Tiny ImageNet, our approach requires 5.8times fewer FLOPs and up to 83.3% fewer model parameters with a drop in top-5 (top-1) accuracy of only 1.2% (2.1%). We also compared the performance of our proposed architectures with that of ShuffleNet andMobileNetV2. Using similar hyperparameters and FLOPs, our ResNet18 variants yield an average accuracy improvement of 2.8%.

翻訳日:2023-01-05 21:02:03 公開日:2020-02-04

# グラフニューラルネットワークを用いた確率論的論理推論

Efficient Probabilistic Logic Reasoning with Graph Neural Networks ( http://arxiv.org/abs/2001.11850v2 )

ライセンス: Link先を確認

Yuyu Zhang, Xinshi Chen, Yuan Yang, Arun Ramamurthy, Bo Li, Yuan Qi, Le Song

(参考訳) 論理規則と確率的グラフィカルモデルを組み合わせたマルコフ論理ネットワーク(MLN)は、多くの知識グラフ問題に対処するために用いられる。しかし、MLNの推論は計算集約的であり、MLNの工業的応用は非常に困難である。近年,グラフニューラルネットワーク(gnn)が大規模グラフ問題に対して効率的かつ効果的なツールとして登場している。それでも、GNNはモデルに事前のロジックルールを明示的に取り入れておらず、ターゲットタスクに多くのラベル付き例を必要とする可能性がある。本稿では,MLNとGNNの組み合わせについて検討し,MLNの変分推論にグラフニューラルネットワークを用いる。本稿では,表現力とモデルの単純さのバランスのよいGNN変種であるExpressGNNを提案する。いくつかのベンチマークデータセットに関する広範な実験は、ExpressGNNが効率的かつ効率的な確率論的論理推論をもたらすことを示した。

Markov Logic Networks (MLNs), which elegantly combine logic rules and probabilistic graphical models, can be used to address many knowledge graph problems. However, inference in MLN is computationally intensive, making the industrial-scale application of MLN very difficult. In recent years, graph neural networks (GNNs) have emerged as efficient and effective tools for large-scale graph problems. Nevertheless, GNNs do not explicitly incorporate prior logic rules into the models, and may require many labeled examples for a target task. In this paper, we explore the combination of MLNs and GNNs, and use graph neural networks for variational inference in MLN. We propose a GNN variant, named ExpressGNN, which strikes a nice balance between the representation power and the simplicity of the model. Our extensive experiments on several benchmark datasets demonstrate that ExpressGNN leads to effective and efficient probabilistic logic reasoning.

翻訳日:2023-01-05 20:44:59 公開日:2020-02-04

# NAViDAd:ディープオートエンコーダに基づく非参照オーディオ映像品質指標

NAViDAd: A No-Reference Audio-Visual Quality Metric Based on a Deep Autoencoder ( http://arxiv.org/abs/2001.11406v2 )

ライセンス: Link先を確認

Helard Martinez, M. C. Farias, A. Hines

(参考訳) 音声信号とビデオ信号の両方の品質予測モデルの開発は、かなり成熟した分野である。しかし、複数のマルチモーダルモデルが提案されているが、音声・視覚品質予測の分野はいまだに新興分野である。実際、組み合わせとパラメトリックのメトリクスによって得られる妥当なパフォーマンスにもかかわらず、現在、信頼できるピクセルベースのオーディオ視覚品質指標は存在しない。この研究で提示されたアプローチは、説明的なオーディオとビデオ機能を備えたオートエンコーダが、複雑なオーディオとビデオのインタラクションを記述することのできる一連の機能を生み出すかもしれないという仮定に基づいている。この仮説に基づいて,Deep Autoencoder (NAViDAd) に基づく非参照オーディオ-ビジュアル品質メトリクスを提案する。モデル視覚特徴は、ビデオ成分の自然シーン統計(NSS)と時空間測度である。一方、音声成分のスペクトログラム表現を演算して音声特徴を得る。このモデルは、ディープオートエンコーダ層と分類層を含む2層フレームワークによって形成される。これら2つのレイヤは積み重ねられ、ディープニューラルネットワークモデルを構築するためにトレーニングされます。モデルは、代表的なオーディオおよびビデオアーティファクトを含む、大きな刺激セットを使用して訓練され、テストされる。このモデルは、UnB-AVとLiveNetflix-IIデータベースでテストするとうまく動作した。 %の結果, 主観的品質スコアと高い相関性を有する品質スコアが得られた。

The development of models for quality prediction of both audio and video signals is a fairly mature field. But, although several multimodal models have been proposed, the area of audio-visual quality prediction is still an emerging area. In fact, despite the reasonable performance obtained by combination and parametric metrics, currently there is no reliable pixel-based audio-visual quality metric. The approach presented in this work is based on the assumption that autoencoders, fed with descriptive audio and video features, might produce a set of features that is able to describe the complex audio and video interactions. Based on this hypothesis, we propose a No-Reference Audio-Visual Quality Metric Based on a Deep Autoencoder (NAViDAd). The model visual features are natural scene statistics (NSS) and spatial-temporal measures of the video component. Meanwhile, the audio features are obtained by computing the spectrogram representation of the audio component. The model is formed by a 2-layer framework that includes a deep autoencoder layer and a classification layer. These two layers are stacked and trained to build the deep neural network model. The model is trained and tested using a large set of stimuli, containing representative audio and video artifacts. The model performed well when tested against the UnB-AV and the LiveNetflix-II databases. %Results shows that this type of approach produces quality scores that are highly correlated to subjective quality scores.

翻訳日:2023-01-05 12:47:34 公開日:2020-02-04

# 畳み込みニューラルネットワークを用いた高次元モータ画像タスクの分類

Classification of High-Dimensional Motor Imagery Tasks based on An End-to-end role assigned convolutional neural network ( http://arxiv.org/abs/2002.00210v2 )

ライセンス: Link先を確認

Byeong-Hoo Lee, Ji-Hoon Jeong, Kyung-Hwan Shim, Seong-Whan Lee

(参考訳) 脳コンピュータインタフェース(BCI)は、ユーザと外部デバイス間の直接通信経路を提供する。エレクトロ脳波(EEG)運動画像(MI)パラダイムは、非侵襲的BCIにおいて、ユーザの動作実行意図を含む符号化信号を得るために広く用いられている。しかし、EEGは複雑な非定常特性を持ち、デコード性能は不十分である。シングルアームの多数の動作を想像することにより、人工的なコマンドマッチングなしで復号性能を向上させることができる。そこで本研究では,片腕の9種類の動作を含む直感的脳波データを9名の被験者から収集した。階層型CNNアーキテクチャの原理を応用して,各上肢領域の識別的特徴を考慮した終端から終端までの畳み込みニューラルネットワーク(ERA-CNN)を提案する。提案手法は,従来の3-class,5-class,および2種類の7-class分類タスクよりも優れている。そこで本研究では,eare-cnnを用いたロバストな性能を有する脳波信号のみを用いて,ユーザの意図を復号する可能性を示す。

A brain-computer interface (BCI) provides a direct communication pathway between user and external devices. Electroencephalogram (EEG) motor imagery (MI) paradigm is widely used in non-invasive BCI to obtain encoded signals contained user intention of movement execution. However, EEG has intricate and non-stationary properties resulting in insufficient decoding performance. By imagining numerous movements of a single-arm, decoding performance can be improved without artificial command matching. In this study, we collected intuitive EEG data contained the nine different types of movements of a single-arm from 9 subjects. We propose an end-to-end role assigned convolutional neural network (ERA-CNN) which considers discriminative features of each upper limb region by adopting the principle of a hierarchical CNN architecture. The proposed model outperforms previous methods on 3-class, 5-class and two different types of 7-class classification tasks. Hence, we demonstrate the possibility of decoding user intention by using only EEG signals with robust performance using an ERA-CNN.

翻訳日:2023-01-05 01:12:21 公開日:2020-02-04

# 医療におけるベイズネットワーク:医療条件による流通

Bayesian Networks in Healthcare: Distribution by Medical Condition ( http://arxiv.org/abs/2002.00224v2 )

ライセンス: Link先を確認

Scott McLachlan, Kudakwashe Dube, Graham A Hitman, Norman E Fenton, Evangelia Kyrimi

(参考訳) ベイジアンネットワーク(BN)は、実際には採用と一致せず、医療に多大な利益をもたらす可能性がある研究の注目を集めている。研究は、BNでモデル化されている医療条件の種類や、どのように、なぜ異なる条件に適用されるのかについて、調査していない。本研究は、医療関連BNモデルが提案されている医療条件の範囲と、適用されている最も一般的な医療条件間のアプローチの差異を同定し、定量化することを目的とする。医療BNの約3分の2は、心臓、がん、心理、肺の4つの疾患に焦点を当てている。 BNがどのように機能し、どのような能力を持つかについての理解の欠如は、日々の医療実践においてポジティブな変化をもたらすために、BNの潜在能力を完全に認識することは、より深い理解と促進によってのみ実現できる、と我々は信じている。

Bayesian networks (BNs) have received increasing research attention that is not matched by adoption in practice and yet have potential to significantly benefit healthcare. Hitherto, research works have not investigated the types of medical conditions being modelled with BNs, nor whether any differences exist in how and why they are applied to different conditions. This research seeks to identify and quantify the range of medical conditions for which healthcare-related BN models have been proposed, and the differences in approach between the most common medical conditions to which they have been applied. We found that almost two-thirds of all healthcare BNs are focused on four conditions: cardiac, cancer, psychological and lung disorders. We believe that a lack of understanding regarding how BNs work and what they are capable of exists, and that it is only with greater understanding and promotion that we may ever realise the full potential of BNs to effect positive change in daily healthcare practice.

翻訳日:2023-01-05 01:03:30 公開日:2020-02-04

# ガウス過程を用いた電子顕微鏡画像のZ厚さとXY異方性の推定

Estimation of Z-Thickness and XY-Anisotropy of Electron Microscopy Images using Gaussian Processes ( http://arxiv.org/abs/2002.00228v2 )

ライセンス: Link先を確認

Thanuja D. Ambegoda, Julien N. P. Martel, Jozef Adamcik, Matthew Cook, Richard H. R. Hahnloser

(参考訳) シリアルセクション電子顕微鏡(SsEM)は、生体組織の体積情報をナノメートルスケールで取得する技術として広く用いられている。しかし、同定された細胞構造と体積量子化の正確な3次元再構成は、XYイメージング面に沿った断面厚さと異方性(または伸展)の正確な推定を必要とする。実際、多くの画像処理アルゴリズムは単に撮像面内の等方性を想定している。そこで本稿では,画像統計の非パラメトリックベイズ回帰法を用いて,電子顕微鏡断面の厚さと伸びを推定する手法を提案する。我々は,原子間力顕微鏡(AFM)により得られた直接測定値を用いて,我々の厚さと伸張率を検証し,最近の間接厚さ推定法や相対Z座標推定法と比較して推定誤差が低いことを示す。さらに,直接計測された断面厚み値を用いたssem画像の最初のデータセットを作成し,間接厚み推定法の評価を行った。

Serial section electron microscopy (ssEM) is a widely used technique for obtaining volumetric information of biological tissues at nanometer scale. However, accurate 3D reconstructions of identified cellular structures and volumetric quantifications require precise estimates of section thickness and anisotropy (or stretching) along the XY imaging plane. In fact, many image processing algorithms simply assume isotropy within the imaging plane. To ameliorate this problem, we present a method for estimating thickness and stretching of electron microscopy sections using non-parametric Bayesian regression of image statistics. We verify our thickness and stretching estimates using direct measurements obtained by atomic force microscopy (AFM) and show that our method has a lower estimation error compared to a recent indirect thickness estimation method as well as a relative Z coordinate estimation method. Furthermore, we have made the first dataset of ssSEM images with directly measured section thickness values publicly available for the evaluation of indirect thickness estimation methods.

翻訳日:2023-01-05 00:47:20 公開日:2020-02-04

# 視覚知覚改善のための水中画像の同時強調と超解像

Simultaneous Enhancement and Super-Resolution of Underwater Imagery for Improved Visual Perception ( http://arxiv.org/abs/2002.01155v1 )

ライセンス: Link先を確認

Md Jahidul Islam, Peigen Luo and Junaed Sattar

(参考訳) 本稿では,水中ロボットビジョンにおけるsesr(single enhancement and super- resolution)問題を紹介し,その解決法を提案する。本稿では,2倍,3倍,あるいは4倍の空間分解能で知覚的画質の復元を学習できる残差ネットワークベース生成モデルであるDeep SESRを提案する。本研究では,彩度特異的な水中色劣化,画像のシャープさの欠如,高レベル特徴表現の損失に対処するマルチモーダル目的関数を定式化することにより,そのトレーニングを監督する。また、画像の突出した前景領域を学習するために監督され、ネットワークがグローバルなコントラスト強化を学ぶためのガイドとなる。我々は、高速な推論のための共有階層的特徴空間上で、相性予測とSESRを共同で学習するエンドツーエンドのトレーニングパイプラインを設計する。さらに,大規模sesr学習を容易にする最初のデータセットであるufo-120を提案する。 UFO-120や他の標準データセットの徹底的な実験的評価により、Deep SESRは水中画像の強調や超高解像度化のために既存のソリューションよりも優れていることを示す。また,様々なスペクトル・空間劣化レベルを持つ水中画像や,目に見えない自然物を含む地上画像を含むいくつかのテストケースにおいて,その一般化性能を検証する。最後に,単板配置の計算可能性を分析し,視覚誘導水中ロボットの操作性を示す。モデルとデータセット情報は、https://github.com/xahidbuffon/Deep-SESR.comで提供される。

In this paper, we introduce and tackle the simultaneous enhancement and super-resolution (SESR) problem for underwater robot vision and provide an efficient solution for near real-time applications. We present Deep SESR, a residual-in-residual network-based generative model that can learn to restore perceptual image qualities at 2x, 3x, or 4x higher spatial resolution. We supervise its training by formulating a multi-modal objective function that addresses the chrominance-specific underwater color degradation, lack of image sharpness, and loss in high-level feature representation. It is also supervised to learn salient foreground regions in the image, which in turn guides the network to learn global contrast enhancement. We design an end-to-end training pipeline to jointly learn the saliency prediction and SESR on a shared hierarchical feature space for fast inference. Moreover, we present UFO-120, the first dataset to facilitate large-scale SESR learning; it contains over 1500 training samples and a benchmark test set of 120 samples. By thorough experimental evaluation on the UFO-120 and other standard datasets, we demonstrate that Deep SESR outperforms the existing solutions for underwater image enhancement and super-resolution. We also validate its generalization performance on several test cases that include underwater images with diverse spectral and spatial degradation levels, and also terrestrial images with unseen natural objects. Lastly, we analyze its computational feasibility for single-board deployments and demonstrate its operational benefits for visually-guided underwater robots. The model and dataset information will be available at: https://github.com/xahidbuffon/Deep-SESR.

翻訳日:2023-01-04 03:46:06 公開日:2020-02-04

# スパースサンプリングによる原子スケールSTEM-EELS画像の高速再構成

Fast reconstruction of atomic-scale STEM-EELS images from sparse sampling ( http://arxiv.org/abs/2002.01225v1 )

ライセンス: Link先を確認

Etienne Monier, Thomas Oberlin, Nathalie Brun, Xiaoyan Li, Marcel Tenc\'e, Nicolas Dobigeon

(参考訳) 本稿では、走査型透過電子顕微鏡(STEM)の取得を加速するために、部分サンプリング分光像の再構成について論じる。画像再構成の問題は、多くの画像モダリティの文献で広く検討されているが、STEM電子エネルギー損失分光法(EELS)によって取得されたスペクトル画像などの3Dデータを扱う試みはわずかである。また, 顕微鏡文献で提案されている手法のうち, 一部は高速であるが不正確であり, 一部は正確な再構成を行うが, 高計算負荷の費用がかかる。したがって,提案手法のいずれも精度と計算複雑性の面での期待を満たさない。本稿では,原子スケールEELSに適した高速かつ高精度な再構成手法を提案する。この方法は、STEM-EELS画像上で初めて使用されるベータプロセスファクター解析(BPFA)のような一般的なソリューションと比較される。 real as合成データに基づく実験を行う。

This paper discusses the reconstruction of partially sampled spectrum-images to accelerate the acquisition in scanning transmission electron microscopy (STEM). The problem of image reconstruction has been widely considered in the literature for many imaging modalities, but only a few attempts handled 3D data such as spectral images acquired by STEM electron energy loss spectroscopy (EELS). Besides, among the methods proposed in the microscopy literature, some are fast but inaccurate while others provide accurate reconstruction but at the price of a high computation burden. Thus none of the proposed reconstruction methods fulfills our expectations in terms of accuracy and computation complexity. In this paper, we propose a fast and accurate reconstruction method suited for atomic-scale EELS. This method is compared to popular solutions such as beta process factor analysis (BPFA) which is used for the first time on STEM-EELS images. Experiments based on real as synthetic data will be conducted.

翻訳日:2023-01-04 03:45:38 公開日:2020-02-04

# 畳み込み層による畳み込み型画像共有のプライバシー保護

Privacy-Preserving Image Sharing via Sparsifying Layers on Convolutional Groups ( http://arxiv.org/abs/2002.01469v1 )

ライセンス: Link先を確認

Sohrab Ferdowsi, Behrooz Razeghi, Taras Holotyak, Flavio P. Calmon, Slava Voloshynovskiy

(参考訳) 本稿では,大規模な設定において,プライバシーに配慮した画像共有の問題に対処する実践的な枠組みを提案する。コンパクト性は常に大規模に求められていますが、プライバシーに敏感なコンテンツをさらに保護しようとすると、このニーズはさらに深刻になります。そこで我々は、画像のエンコードを行い、一方から、表現はプライバシー保護の膨大なコストを払わずにパブリックドメインに格納されるが、それゆえに、攻撃者に対して組合せ的探索的な推測機構が利用できない限り、画像から識別可能なコンテンツが漏れないようにした。一方、認証されたユーザには、セキュアに維持できる非常にコンパクトなキーが提供されている。これは、対応するアクセスグラインド画像の曖昧化と再構築に使用できる。我々は、画像の異なる属性を再構築する責任を負う複数のコンパクトコードを提供しながら、機能マップをスパース化変換を通じて独立に渡す、我々の設計の畳み込みオートエンコーダでこれを達成する。このフレームワークは、公開実装が利用可能な大規模な画像データベース上でテストされる。

We propose a practical framework to address the problem of privacy-aware image sharing in large-scale setups. We argue that, while compactness is always desired at scale, this need is more severe when trying to furthermore protect the privacy-sensitive content. We therefore encode images, such that, from one hand, representations are stored in the public domain without paying the huge cost of privacy protection, but ambiguated and hence leaking no discernible content from the images, unless a combinatorially-expensive guessing mechanism is available for the attacker. From the other hand, authorized users are provided with very compact keys that can easily be kept secure. This can be used to disambiguate and reconstruct faithfully the corresponding access-granted images. We achieve this with a convolutional autoencoder of our design, where feature maps are passed independently through sparsifying transformations, providing multiple compact codes, each responsible for reconstructing different attributes of the image. The framework is tested on a large-scale database of images with public implementation available.

翻訳日:2023-01-04 03:45:23 公開日:2020-02-04

# 特徴精製に基づく畳み込みニューラルネットワークを用いた単一タスクの運動画像分類

Motor Imagery Classification of Single-Arm Tasks Using Convolutional Neural Network based on Feature Refining ( http://arxiv.org/abs/2002.01122v1 )

ライセンス: Link先を確認

Byeong-Hoo Lee, Ji-Hoon Jeong, Kyung-Hwan Shim, Dong-Joo Kim

(参考訳) 脳コンピュータインタフェース(BCI)は、ユーザの意図とステータスを理解するために脳信号をデコードする。単純で安全なデータ取得プロセスのため、脳波(EEG)は非侵襲的BCIで一般的に用いられる。 eegパラダイムの一つであるmotor image (mi) は、信号起源による運動機能の回復やリハビリによく用いられる。しかし、脳波信号は振動・非定常信号であり、MIを正確に収集・分類することが困難である。本研究では,2つの畳み込みブロックからなるbfr-cnn(band-power feature refining convolutional neural network)を提案する。脳波信号を収集し、単一アームの運動想像力を含むMIデータセットを作成しました。提案手法は従来の4クラスmiタスク分類よりも優れている。そこで我々は,BFR-CNNを用いた脳波信号のみを用いて,ユーザ意図の復号化が可能であることを実証した。

Brain-computer interface (BCI) decodes brain signals to understand user intention and status. Because of its simple and safe data acquisition process, electroencephalogram (EEG) is commonly used in non-invasive BCI. One of EEG paradigms, motor imagery (MI) is commonly used for recovery or rehabilitation of motor functions due to its signal origin. However, the EEG signals are an oscillatory and non-stationary signal that makes it difficult to collect and classify MI accurately. In this study, we proposed a band-power feature refining convolutional neural network (BFR-CNN) which is composed of two convolution blocks to achieve high classification accuracy. We collected EEG signals to create MI dataset contained the movement imagination of a single-arm. The proposed model outperforms conventional approaches in 4-class MI tasks classification. Hence, we demonstrate that the decoding of user intention is possible by using only EEG signals with robust performance using BFR-CNN.

翻訳日:2023-01-04 03:44:44 公開日:2020-02-04

# 乱流予測におけるリカレントニューラルネットワークの利用について

On the use of recurrent neural networks for predictions of turbulent flows ( http://arxiv.org/abs/2002.01222v1 )

ライセンス: Link先を確認

Luca Guastoni, Prem A. Srinivasan, Hossein Azizpour, Philipp Schlatter and Ricardo Vinuesa

(参考訳) 本稿では,Moehlis {\it et al. による近接壁乱流の低次モデルを用いて,リカレントニューラルネットワークの予測能力を評価する。ニュー・J・フィス(New J. Phys)。 56, 2004) である。その結果, 適切に訓練されたlong short-term memory (lstm) ネットワークを用いて, 乱流統計量と流れの動的挙動の優れた予測が可能となり, 平均の相対誤差とゆらぎは1-%$以下となることがわかった。また,流れの瞬時予測のみに基づく損失関数の使用は乱流統計学において最良の予測にはならない可能性があり,計算された統計学に基づいて停止基準を定義する必要がある。さらに、瞬時に予測されるだけでなく、流れの平均的な振る舞いを含むより洗練された損失関数は、より高速なニューラルネットワークトレーニングをもたらす可能性がある。

In this paper, the prediction capabilities of recurrent neural networks are assessed in the low-order model of near-wall turbulence by Moehlis {\it et al.} (New J. Phys. {\bf 6}, 56, 2004). Our results show that it is possible to obtain excellent predictions of the turbulence statistics and the dynamic behavior of the flow with properly trained long short-term memory (LSTM) networks, leading to relative errors in the mean and the fluctuations below $1\%$. We also observe that using a loss function based only on the instantaneous predictions of the flow may not lead to the best predictions in terms of turbulence statistics, and it is necessary to define a stopping criterion based on the computed statistics. Furthermore, more sophisticated loss functions, including not only the instantaneous predictions but also the averaged behavior of the flow, may lead to much faster neural network training.

翻訳日:2023-01-04 03:44:29 公開日:2020-02-04

# 脳波信号を用いたてんかん発作予測のための機械学習

Machine Learning for Predicting Epileptic Seizures Using EEG Signals: A Review ( http://arxiv.org/abs/2002.01925v1 )

ライセンス: Link先を確認

Khansa Rasheed, Adnan Qayyum, Junaid Qadir, Shobi Sivathamboo, Patrick Kwan, Levin Kuhlmann, Terence O'Brien, and Adeel Razi

(参考訳) 人工知能(AI)と機械学習(ML)技術の進歩により、研究者たちは、これらの技術を用いて臨床実践の進歩を目指している。医療の主要な目的の1つは、予防的介入をタイムリーに提供する病気の早期発見と予測である。これは特にてんかんの症例であり、再発と予測不能な発作が特徴である。患者は、何らかの形で事前に予測できた場合、てんかん発作の副作用を軽減できる。数十年の研究にもかかわらず、発作の予測は未解決の問題である。これは少なくとも、問題の解決に十分な量のデータが不足しているためである。 MLベースのアルゴリズムには、てんかん発作の早期かつ正確な予測においてパラダイムシフトをもたらす可能性がある、エキサイティングな新しい展開がある。本稿では,脳波信号を用いた発作早期予測における最先端ML手法の総合的なレビューを行う。私たちは現在の研究におけるギャップ、課題、落とし穴を特定し、今後の方向性を推奨します。

With the advancement in artificial intelligence (AI) and machine learning (ML) techniques, researchers are striving towards employing these techniques for advancing clinical practice. One of the key objectives in healthcare is the early detection and prediction of disease to timely provide preventive interventions. This is especially the case for epilepsy, which is characterized by recurrent and unpredictable seizures. Patients can be relieved from the adverse consequences of epileptic seizures if it could somehow be predicted in advance. Despite decades of research, seizure prediction remains an unsolved problem. This is likely to remain at least partly because of the inadequate amount of data to resolve the problem. There have been exciting new developments in ML-based algorithms that have the potential to deliver a paradigm shift in the early and accurate prediction of epileptic seizures. Here we provide a comprehensive review of state-of-the-art ML techniques in early prediction of seizures using EEG signals. We will identify the gaps, challenges, and pitfalls in the current research and recommend future directions.

翻訳日:2023-01-04 03:43:59 公開日:2020-02-04

# 話者手がかりを用いた感情認識

Emotion Recognition Using Speaker Cues ( http://arxiv.org/abs/2002.03566v1 )

ライセンス: Link先を確認

Ismail Shahin

(参考訳) 本研究の目的は、話者手がかりを用いて未知の感情を特定することである。本研究では,2段階の枠組みを用いて未知の感情を同定する。第1段階は未知の感情を発話する話者を特定することに焦点を当て、第2段階は認識された話者が前段で発する未知の感情を特定することに焦点を当てている。提案手法はアラビア語Emirati-accented speech databaseで男女15人を対象に評価されている。抽出した特徴としてMel-Frequency Cepstral Coefficients (MFCCs) が用いられ、本研究ではHidden Markov Model (HMM) が分類器として利用されている。その結果,2段階の枠組みに基づく感情認識精度は,ガウス混合モデル (GMM) やサポートベクトルマシン (SVM) ,ベクトル量子化 (VQ) など,一段階のアプローチと最先端の分類器に基づくものよりも高いことがわかった。 2段階のアプローチに基づく平均感情認識精度は67.5%であり、それぞれ1段階のアプローチであるGMM、SVM、VQに基づいて61.4%、63.3%、64.5%、61.5%に達する。 2段階の枠組みに基づいて達成された結果は、人間の聴取者による主観的評価に非常に近い。

This research aims at identifying the unknown emotion using speaker cues. In this study, we identify the unknown emotion using a two-stage framework. The first stage focuses on identifying the speaker who uttered the unknown emotion, while the next stage focuses on identifying the unknown emotion uttered by the recognized speaker in the prior stage. This proposed framework has been evaluated on an Arabic Emirati-accented speech database uttered by fifteen speakers per gender. Mel-Frequency Cepstral Coefficients (MFCCs) have been used as the extracted features and Hidden Markov Model (HMM) has been utilized as the classifier in this work. Our findings demonstrate that emotion recognition accuracy based on the two-stage framework is greater than that based on the one-stage approach and the state-of-the-art classifiers and models such as Gaussian Mixture Model (GMM), Support Vector Machine (SVM), and Vector Quantization (VQ). The average emotion recognition accuracy based on the two-stage approach is 67.5%, while the accuracy reaches to 61.4%, 63.3%, 64.5%, and 61.5%, based on the one-stage approach, GMM, SVM, and VQ, respectively. The achieved results based on the two-stage framework are very close to those attained in subjective assessment by human listeners.

翻訳日:2023-01-04 03:43:46 公開日:2020-02-04

# 結合CNNを用いたハイパースペクトルとLiDARデータの分類

Classification of Hyperspectral and LiDAR Data Using Coupled CNNs ( http://arxiv.org/abs/2002.01144v1 )

ライセンス: Link先を確認

Renlong Hang, Zhu Li, Pedram Ghamisi, Danfeng Hong, Guiyu Xia, and Qingshan Liu

(参考訳) 本稿では,2つの結合畳み込みニューラルネットワーク(CNN)を用いて,高スペクトルと光検出・ラング(LiDAR)データを融合する,効率的かつ効率的なフレームワークを提案する。 1つのCNNは、ハイパースペクトルデータからスペクトル空間的特徴を学習するために設計され、もう1つはLiDARデータから標高情報を取得するために使用される。どちらも3つの畳み込み層で構成され、最後の2つの畳み込み層はパラメータ共有戦略を介して結合される。融合相では、これらの不均一な特徴を十分に統合するために、特徴レベルおよび決定レベル融合法が同時に使用される。機能レベルの融合については,連結戦略,最大化戦略,累積戦略を含む3つの異なる融合戦略が評価される。決定レベルの融合では、各出力の分類精度によって重み付けを決定する重み付け和戦略が採用される。提案モデルは、アメリカ合衆国ヒューストンで取得した都市データセットと、イタリアのトレントで取得した農村データセットに基づいて評価される。ヒューストンのデータでは、我々のモデルは96.03%の精度で新しい記録を達成できる。 Trentoのデータでは、全体的な精度は99.12%である。これらの結果は,提案モデルの有効性を十分に証明する。

In this paper, we propose an efficient and effective framework to fuse hyperspectral and Light Detection And Ranging (LiDAR) data using two coupled convolutional neural networks (CNNs). One CNN is designed to learn spectral-spatial features from hyperspectral data, and the other one is used to capture the elevation information from LiDAR data. Both of them consist of three convolutional layers, and the last two convolutional layers are coupled together via a parameter sharing strategy. In the fusion phase, feature-level and decision-level fusion methods are simultaneously used to integrate these heterogeneous features sufficiently. For the feature-level fusion, three different fusion strategies are evaluated, including the concatenation strategy, the maximization strategy, and the summation strategy. For the decision-level fusion, a weighted summation strategy is adopted, where the weights are determined by the classification accuracy of each output. The proposed model is evaluated on an urban data set acquired over Houston, USA, and a rural one captured over Trento, Italy. On the Houston data, our model can achieve a new record overall accuracy of 96.03%. On the Trento data, it achieves an overall accuracy of 99.12%. These results sufficiently certify the effectiveness of our proposed model.

翻訳日:2023-01-04 03:38:05 公開日:2020-02-04

# ブロック強度と勾配差(BIGD)記述子を用いたテクスチャ分類

Texture Classification using Block Intensity and Gradient Difference (BIGD) Descriptor ( http://arxiv.org/abs/2002.01154v1 )

ライセンス: Link先を確認

Yuting Hu, Zhen Wang, and Ghassan AlRegib

(参考訳) 本稿では,ブロック強度と勾配差(BIGD)という,効率的で独特な局所記述子を提案する。画像パッチでは、マルチスケールブロックペアをランダムにサンプリングし、各ブロックの強度と勾配の違いを利用してローカルなbigdディスクリプタを構築する。ランダムサンプリング戦略とマルチスケールフレームワークは、bigdディスクリプタが異なる方向と空間的な粒度レベルでのパッチの特徴的なパターンを捉えるのに役立つ。局所集約ディスクリプタ(VLAD)や改良されたフィッシャーベクトル(IFV)のベクトルを用いて、局所的なBIGDディスクリプタをフルイメージディスクリプタにエンコードし、その後、テクスチャ分類のための線形サポートベクタマシン(SVM)分類器に入力する。提案する記述子は,Brodatz,CUReT,KTH-TIPS,KTH-TIPS-2a,-2bを含む5つの公的なテクスチャデータセットに対して,その分類性能を評価することで,その特徴と現状を比較した。実験結果から, 識別力の強いBIGDディスクリプタは, 最先端テクスチャディスクリプタ, 密度マイクロブロック差 (DMD) よりも0.12%～6.43%高い分類精度が得られた。

In this paper, we present an efficient and distinctive local descriptor, namely block intensity and gradient difference (BIGD). In an image patch, we randomly sample multi-scale block pairs and utilize the intensity and gradient differences of pairwise blocks to construct the local BIGD descriptor. The random sampling strategy and the multi-scale framework help BIGD descriptors capture the distinctive patterns of patches at different orientations and spatial granularity levels. We use vectors of locally aggregated descriptors (VLAD) or improved Fisher vector (IFV) to encode local BIGD descriptors into a full image descriptor, which is then fed into a linear support vector machine (SVM) classifier for texture classification. We compare the proposed descriptor with typical and state-of-the-art ones by evaluating their classification performance on five public texture data sets including Brodatz, CUReT, KTH-TIPS, and KTH-TIPS-2a and -2b. Experimental results show that the proposed BIGD descriptor with stronger discriminative power yields 0.12% ~ 6.43% higher classification accuracy than the state-of-the-art texture descriptor, dense microblock difference (DMD).

翻訳日:2023-01-04 03:37:47 公開日:2020-02-04

# 畳み込みニューラルネットワークを用いた下水道映像の妨害レベル検出

Obstruction level detection of sewer videos using convolutional neural networks ( http://arxiv.org/abs/2002.01284v1 )

ライセンス: Link先を確認

Mario A. Gutierrez-Mondragon, Dario Garcia-Gasulla, Sergio Alvarez-Napagao, Jaume Brossa-Ordo\~nez and Rafael Gimenez-Esteban

(参考訳) 下水道網は、排水を中央処理場に輸送して処理し、環境に戻すように設計されている。このプロセスは現在の社会にとって重要であり、水性疾患を予防し、安全な飲料水を提供し、一般的な衛生を強化する。下水道網の完全運用を維持するため、常にサンプリング検査を行い、障害を識別する。通常、閉鎖回路テレビシステムはパイプの内部を記録し、妨害レベルを報告するために使われ、掃除作業が引き起こされる可能性がある。現在、障害レベルのアセスメントは手動で行われており、時間がかかり、一貫性がない。本研究では,パイプの閉塞レベルを特定するために畳み込みニューラルネットワークを訓練する手法を設計し,このような頻繁かつ反復的な作業に要する人的労力を削減する。私たちは、モデルに入力するための有用なフレームを生成するために、探索および適応されたビデオのデータベースを集めました。結果として得られた分類器は、デプロイ可能なパフォーマンスを得る。このアプローチの一貫性と工業的適用性を検証するために,階層的適合性伝達説明可能性技術を統合することで,ニューラルネットワークの動作をより深く理解することができる。最後に, 提案システムにより, 下水道試験における速度, 精度, 整合性を向上することができる。私たちの分析では、データ収集方法論のさらなる品質向上に関するガイドラインも公開しています。

Worldwide, sewer networks are designed to transport wastewater to a centralized treatment plant to be treated and returned to the environment. This process is critical for the current society, preventing waterborne illnesses, providing safe drinking water and enhancing general sanitation. To keep a sewer network perfectly operational, sampling inspections are performed constantly to identify obstructions. Typically, a Closed-Circuit Television system is used to record the inside of pipes and report the obstruction level, which may trigger a cleaning operative. Currently, the obstruction level assessment is done manually, which is time-consuming and inconsistent. In this work, we design a methodology to train a Convolutional Neural Network for identifying the level of obstruction in pipes, thus reducing the human effort required on such a frequent and repetitive task. We gathered a database of videos that are explored and adapted to generate useful frames to fed into the model. Our resulting classifier obtains deployment ready performances. To validate the consistency of the approach and its industrial applicability, we integrate the Layer-wise Relevance Propagation explainability technique, which enables us to further understand the behavior of the neural network for this task. In the end, the proposed system can provide higher speed, accuracy, and consistency in the process of sewer examination. Our analysis also uncovers some guidelines on how to further improve the quality of the data gathering methodology.

翻訳日:2023-01-04 03:37:18 公開日:2020-02-04

# tfp.mcmc: 現代のハードウェア用に作られた現代のマルコフチェーンモンテカルロツール

tfp.mcmc: Modern Markov Chain Monte Carlo Tools Built for Modern Hardware ( http://arxiv.org/abs/2002.01184v1 )

ライセンス: Link先を確認

Junpeng Lao, Christopher Suter, Ian Langmore, Cyril Chimisov, Ashish Saxena, Pavel Sountsov, Dave Moore, Rif A. Saurous, Matthew D. Hoffman, and Joshua V. Dillon

(参考訳) マルコフ連鎖モンテカルロ(mcmc)は20世紀の最も重要なアルゴリズムの1つと見なされている。非正規化確率関数のみを用いた漸近収束、安定性、および推定子分散境界の保証は確率計画に不可欠である。本稿では、tensorflow probability mcmc toolkitを紹介し、その設計の動機となるいくつかの考察について述べる。

Markov chain Monte Carlo (MCMC) is widely regarded as one of the most important algorithms of the 20th century. Its guarantees of asymptotic convergence, stability, and estimator-variance bounds using only unnormalized probability functions make it indispensable to probabilistic programming. In this paper, we introduce the TensorFlow Probability MCMC toolkit, and discuss some of the considerations that motivated its design.

翻訳日:2023-01-04 03:36:56 公開日:2020-02-04

# ベイズ最適化のための不確かさ定量化

Uncertainty Quantification for Bayesian Optimization ( http://arxiv.org/abs/2002.01569v1 )

ライセンス: Link先を確認

Rui Tuo, Wenjia Wang

(参考訳) ベイズ最適化はグローバル最適化手法のクラスである。これは、基礎となる目的函数をガウス過程の実現と見なしている。ベイズ最適化の出力はガウス過程の仮定に従ってランダムであるが、この不確実性の定量化は文献ではほとんど研究されていない。本研究では,最大点や目的関数の値の信頼領域を構築するという観点から,ベイズ最適化アルゴリズムの出力不確実性を評価するための新しい手法を提案する。これらの領域は効率的に計算でき、その信頼度レベルは逐次ガウス過程回帰のために新しく開発された一様誤差境界によって保証される。本理論は、既存の全ての逐次サンプリングポリシーと停止基準の統一不確実性定量化フレームワークを提供する。

Bayesian optimization is a class of global optimization techniques. It regards the underlying objective function as a realization of a Gaussian process. Although the outputs of Bayesian optimization are random according to the Gaussian process assumption, quantification of this uncertainty is rarely studied in the literature. In this work, we propose a novel approach to assess the output uncertainty of Bayesian optimization algorithms, in terms of constructing confidence regions of the maximum point or value of the objective function. These regions can be computed efficiently, and their confidence levels are guaranteed by newly developed uniform error bounds for sequential Gaussian process regression. Our theory provides a unified uncertainty quantification framework for all existing sequential sampling policies and stopping criteria.

翻訳日:2023-01-04 03:36:50 公開日:2020-02-04

# REAK: エラーレートに基づく適応リグによる信頼性解析

REAK: Reliability analysis through Error rate-based Adaptive Kriging ( http://arxiv.org/abs/2002.01110v1 )

ライセンス: Link先を確認

Zeyu Wang and Abdollah Shafieezadeh

(参考訳) 様々な分野のモデルが複雑になりつつあるため、関連する計算要求は大幅に増大している。これらのシステムの障害確率が小さい場合の信頼性解析は極めて困難であり、大量のコストシミュレーションを必要とする。この課題に対処するために、Error rate-based Adaptive Kriging (REAK) による信頼性解析を提案する。ここでは、リンデベルク条件に基づく中央極限定理の拡張を用いて、間違った符号推定値を持つ設計サンプル数の分布を導出し、失敗確率推定値の最大誤差率を決定する。このエラーレートは、設計サンプルの戦略的生成のための適応スキームの各段階における効果的なサンプリング領域の最適な確立を可能にする。さらに、信頼性解析の停止基準として使用される故障確率推定の目標精度の設定を容易にする。これらの能力は、洗練された計算要求モデルへの呼び出し数を著しく削減することができる。非線形性と次元の異なる4つの例に対するREAKの適用について述べる。その結果,モンテカルロシミュレーション (ak-mcs) と逐次kriging reliability analysis (iskra) の改善による適応krigingの最先端手法と比較して,reakは計算需要を最大50%削減できることがわかった。

As models in various fields are becoming more complex, associated computational demands have been increasing significantly. Reliability analysis for these systems when failure probabilities are small is significantly challenging, requiring a large number of costly simulations. To address this challenge, this paper introduces Reliability analysis through Error rate-based Adaptive Kriging (REAK). An extension of the Central Limit Theorem based on Lindeberg condition is adopted here to derive the distribution of the number of design samples with wrong sign estimate and subsequently determine the maximum error rate for failure probability estimates. This error rate enables optimal establishment of effective sampling regions at each stage of an adaptive scheme for strategic generation of design samples. Moreover, it facilitates setting a target accuracy for failure probability estimation, which is used as stopping criterion for reliability analysis. These capabilities together can significantly reduce the number of calls to sophisticated, computationally demanding models. The application of REAK for four examples with varying extent of nonlinearity and dimension is presented. Results indicate that REAK is able to reduce the computational demand by as high as 50% compared to state-of-the-art methods of Adaptive Kriging with Monte Carlo Simulation (AK-MCS) and Improved Sequential Kriging Reliability Analysis (ISKRA).

翻訳日:2023-01-04 03:36:04 公開日:2020-02-04

# リカレントニューラルネットワークを用いたチャネル状態情報を用いたセンチメートルレベルの屋内定位

Centimeter-Level Indoor Localization using Channel State Information with Recurrent Neural Networks ( http://arxiv.org/abs/2002.01411v1 )

ライセンス: Link先を確認

Jianyuan Yu, R. Michael Buehrer

(参考訳) モノのインターネットや自動運転の現代的な技術は、より正確な位置決めを必要とする。古典的な位置技術は主に屋外のシナリオに適応するが、複数の経路を持つ屋内のケースの要件を満たさない。一方、ノイズや時間変化にロバストな特徴として、チャネル状態情報(csi)はより正確な測位において、受信信号強度指標(rssi)よりも優れていることが示されている。そこで本稿では,線形アンテナから収集した実CSIデータを用いて,センチメートルレベルの屋内位置推定を行うニューラルネットワーク手法を提案する。チャネル応答の振幅または相関行列を入力として使用することにより、データサイズを大幅に削減し、ノイズを抑制することができる。また、リカレントニューラルネットワーク(RNN)と信号雑音比(SNR)情報によるユーザ動作軌跡の整合性を利用して、特に小型データ学習における推定精度をさらに向上させることができる。これらの貢献はすべて、他の古典的な教師付き学習方法の結果に基づいて、ニューラルネットワークの効率に恩恵をもたらします。

Modern techniques in the Internet of Things or autonomous driving require more accuracy positioning ever. Classic location techniques mainly adapt to outdoor scenarios, while they do not meet the requirement of indoor cases with multiple paths. Meanwhile as a feature robust to noise and time variations, Channel State Information (CSI) has shown its advantages over Received Signal Strength Indicator (RSSI) at more accurate positioning. To this end, this paper proposes the neural network method to estimate the centimeter-level indoor positioning with real CSI data collected from linear antennas. It utilizes an amplitude of channel response or a correlation matrix as the input, which can highly reduce the data size and suppress the noise. Also, it makes use of the consistency in the user motion trajectory via Recurrent Neural Network (RNN) and signal-noise ratio (SNR) information, which can further improve the estimation accuracy, especially in small datasize learning. These contributions all benefit the efficiency of the neural network, based on the results with other classic supervised learning methods.

翻訳日:2023-01-04 03:35:00 公開日:2020-02-04

# 深層学習による公共空間利用の計測:デトロイト川流域におけるベンチマーク研究

Measuring the Utilization of Public Open Spaces by Deep Learning: a Benchmark Study at the Detroit Riverfront ( http://arxiv.org/abs/2002.01461v1 )

ライセンス: Link先を確認

Peng Sun, Rui Hou, Jerome Lynch

(参考訳) 身体活動と社会的相互作用は、健康的なライフスタイルを保証する重要な活動である。公園、広場、緑道などの公共のオープンスペース(POS)は、これらの活動を促進する重要な環境である。 POSを評価するためには、その内部の施設をどのように利用するかを研究する必要がある。しかし、POSの利用を研究する従来のアプローチは手作業であり、時間と労力が集中している。質的な洞察のみを提供することもある。監視カメラの活用や,コンピュータビジョンによるユーザ関連情報の抽出が望まれている。本稿では,posにおけるヒューマンアクティビティを定量的に計測するための概念実証型コンピュータビジョンフレームワークを提案し,デトロイト・リバーフロント・コンサージェンシー(drfc)監視カメラネットワークを用いた提案フレームワークのケーススタディを示す。カスタムイメージデータセットはフレームワークをトレーニングするために提示され、データセットには、様々な照明条件下でDRFC公園空間の18のカメラから収集された7826の完全な注釈付き画像が含まれている。データセット分析と、ワンステップユーザのローカライズとアクティビティ認識のためのベースラインモデルも提供されている。 mAPの結果は, 歩行者検出では77.5\%, サイクリスト検出では81.6\%であった。動作マップはフレームワークによって自律的に生成され、異なるposユーザを見つけ出し、行動のローカライズに対する平均誤差は10cm以内である。

Physical activities and social interactions are essential activities that ensure a healthy lifestyle. Public open spaces (POS), such as parks, plazas and greenways, are key environments that encourage those activities. To evaluate a POS, there is a need to study how humans use the facilities within it. However, traditional approaches to studying use of POS are manual and therefore time and labor intensive. They also may only provide qualitative insights. It is appealing to make use of surveillance cameras and to extract user-related information through computer vision. This paper proposes a proof-of-concept deep learning computer vision framework for measuring human activities quantitatively in POS and demonstrates a case study of the proposed framework using the Detroit Riverfront Conservancy (DRFC) surveillance camera network. A custom image dataset is presented to train the framework; the dataset includes 7826 fully annotated images collected from 18 cameras across the DRFC park space under various illumination conditions. Dataset analysis is also provided as well as a baseline model for one-step user localization and activity recognition. The mAP results are 77.5\% for {\it pedestrian} detection and 81.6\% for {\it cyclist} detection. Behavioral maps are autonomously generated by the framework to locate different POS users and the average error for behavioral localization is within 10 cm.

翻訳日:2023-01-04 03:28:27 公開日:2020-02-04

# モノイド上の確率的オートマトンについて

On Stochastic Automata over Monoids ( http://arxiv.org/abs/2002.01214v1 )

ライセンス: Link先を確認

Karl-Heinz Zimmermann, Merve Nur Cakir

(参考訳) 入力集合としてのモノイド上の確率オートマトンを研究する。これらのオートマトンの定義性は、自由モノイドの固有の普遍性を置き換える拡張条件を必要とする。ツラカイネンの結果の一般化として、モノイド上の一般化されたオートマトンはその確率的対象と同じ受容力を持つことを示す。準同型の鍵は、入力状態のモノイド準同型と遷移行列のモノイド準同型の間の可換性である。確率オートマトンがモノイド上で受容する言語のクロージャ特性について検討した。マトリックス確率オートマトンがモノイド上で受容する言語のクロージャ特性について検討した。

Stochastic automata over monoids as input sets are studied. The well-definedness of these automata requires an extension postulate that replaces the inherent universal property of free monoids. As a generalization of Turakainen's result, it will be shown that the generalized automata over monoids have the same acceptance power as their stochastic counterparts. The key to homomorphisms is a commuting property between the monoid homomorphism of input states and the monoid homomorphism of transition matrices. Closure properties of the languages accepted by stochastic automata over monoids are investigated. matrices. Closure properties of the languages accepted by stochastic automata over monoids are investigated.

翻訳日:2023-01-04 03:27:42 公開日:2020-02-04

# Ethics Codes Onって誰の側? 権力と責任と社会的利益は

Whose Side are Ethics Codes On? Power, Responsibility and the Social Good ( http://arxiv.org/abs/2002.01559v1 )

ライセンス: Link先を確認

Anne L. Washington, Rachel S. Kuo

(参考訳) 倫理規範の道徳的権威は、それらが統一社会に仕えるという仮定に由来するが、これは共有資源の政治的側面を無視している。社会学者ハワード・S・ベッカー(Howard S. Becker)は古典的なエッセイ『Whose Side Are We On』で、研究者にその力と責任を明確にするよう求めた。ベッカーの信頼の階層に基づいて,データ倫理規範の批判的言説分析と,データ技術の「社会的善」という概念化について報告する。分析の結果、企業や専門団体の倫理規定が消費者を社会と混同し、機関でほとんど沈黙していたことが明らかとなった。デジタル時代の社会変化に関するコミュニティオーガナイザへのインタビューは分析を補完し、疎外化コミュニティの懸念に対する技術的な解決策の限界を克服した。文書と生活経験の間の溝を浮かび上がらせる証拠を考えると、我々は消費者を高める倫理規定が同時に脆弱な人口のニーズに従属する可能性があると論じる。競合するデジタルリソースを理解することは、公益技術の新興分野の中心である。本稿では,デジタルディファレンシャル脆弱性の概念を導入し,データ技術における有害な暴露を説明するとともに,将来的な倫理規定に対する勧告を提案する。

The moral authority of ethics codes stems from an assumption that they serve a unified society, yet this ignores the political aspects of any shared resource. The sociologist Howard S. Becker challenged researchers to clarify their power and responsibility in the classic essay: Whose Side Are We On. Building on Becker's hierarchy of credibility, we report on a critical discourse analysis of data ethics codes and emerging conceptualizations of beneficence, or the "social good", of data technology. The analysis revealed that ethics codes from corporations and professional associations conflated consumers with society and were largely silent on agency. Interviews with community organizers about social change in the digital era supplement the analysis, surfacing the limits of technical solutions to concerns of marginalized communities. Given evidence that highlights the gulf between the documents and lived experiences, we argue that ethics codes that elevate consumers may simultaneously subordinate the needs of vulnerable populations. Understanding contested digital resources is central to the emerging field of public interest technology. We introduce the concept of digital differential vulnerability to explain disproportionate exposures to harm within data technology and suggest recommendations for future ethics codes.

翻訳日:2023-01-04 03:26:42 公開日:2020-02-04

# 多段ポンプの多目的予測におけるデータ拡張型ニューラルネットワーク

Neural network with data augmentation in multi-objective prediction of multi-stage pump ( http://arxiv.org/abs/2002.02402v1 )

ライセンス: Link先を確認

Hang Zhao

(参考訳) データ拡張を伴うニューラルネットワークに基づく多段階ポンプ法の多目的予測法を提案する。キー設計変数と遠心ポンプの外部特性値(ヘッドとパワー)の高非線形性について検討するために、ニューラルネットワークモデル(NN)を2次応答面モデル(RSF)、ラジアル基底ガウス応答面モデル(RBF)、クリギングモデル(KRG)と比較して構築する。単段遠心ポンプの数値モデル検証実験により,CFDに基づく数値モデルは非常に正確かつ公平であることが確認された。すべての予測モデルは、それぞれ設計範囲の3つのキー変数の異なる組み合わせの下で、60個のサンプルによって訓練される。 4つの予測モデルに基づく頭部とパワーの精度をcfdシミュレーション値と比較して解析した。その結果、ニューラルネットワークモデルは、他の3つのサロゲートモデルと比較して、すべての外部特性値において優れた性能を示すことがわかった。最後に,データ拡張(NNDA)に基づくニューラルネットワークモデルを提案し,シミュレーションコストが高すぎること,特にCFD問題におけるデータ不足を理由として提案する。データ拡張を伴うモデルは、異なる属性のサンプルポイントごとに補間することでデータを3倍にすることができる。データ拡張によるニューラルネットワークモデルの性能は,従来のニューラルネットワークモデルよりも優れていることを示す。したがって、NNの予測能力は、より多くのシミュレーションコストを伴わずに向上する。データ拡張により、次の最適化のために多段ポンプの最適化問題を解き、将来有限要素解析最適化問題に一般化する上で、より良い予測モデルとなる。

A multi-objective prediction method of multi-stage pump method based on neural network with data augmentation is proposed. In order to study the highly nonlinear relationship between key design variables and centrifugal pump external characteristic values (head and power), the neural network model (NN) is built in comparison with the quadratic response surface model (RSF), the radial basis Gaussian response surface model (RBF), and the Kriging model (KRG). The numerical model validation experiment of another type of single stage centrifugal pump showed that numerical model based on CFD is quite accurate and fair. All of prediction models are trained by 60 samples under the different combination of three key variables in design range respectively. The accuracy of the head and power based on the four predictions models are analyzed comparing with the CFD simulation values. The results show that the neural network model has better performance in all external characteristic values comparing with other three surrogate models. Finally, a neural network model based on data augmentation (NNDA) is proposed for the reason that simulation cost is too high and data is scarce in mechanical simulation field especially in CFD problems. The model with data augmentation can triple the data by interpolation at each sample point of different attributes. It shows that the performance of neural network model with data augmentation is better than former neural network model. Therefore, the prediction ability of NN is enhanced without more simulation costs. With data augmentation it can be a better prediction model used in solving the optimization problems of multistage pump for next optimization and generalized to finite element analysis optimization problems in future.

翻訳日:2023-01-04 03:26:21 公開日:2020-02-04

# ディープニューラルネットワークを用いたロバスト顔アライメントの多段階モデル

Multistage Model for Robust Face Alignment Using Deep Neural Networks ( http://arxiv.org/abs/2002.01075v1 )

ライセンス: Link先を確認

Huabin Wang and Rui Cheng and Jian Zhou and Liang Tao and Hon Keung Kwan

(参考訳) 厳しいオクルージョンや大きなポーズのバリエーションのような制約のない条件を一般化する能力は、顔のアライメントで達成する上で難しい目標である。本稿では,空間変換器ネットワーク,時間ガラスネットワーク,および模範的形状制約を生かした,深層ニューラルネットワークに基づく多段階モデルを提案する。まず、畳み込み層と残留単位からなる空間変圧器生成逆ネットワークを用いて、顔検出器による初期化問題(回転やスケールの変動など)を解決し、顔アライメントのための顔境界ボックスの改善を図る。そして、積み重ねられた砂時計ネットワークを用いてランドマークの予備位置と対応するスコアを取得する。また,高得点者に基づいて低得点のランドマークを決定するために,exemplar-based shape dictionaryが設計されている。顔形状制約を組み込むことにより、閉塞や散在した背景による不整合ランドマークを大幅に改善することができる。提案手法は他の最先端手法よりも優れた性能を示すために,挑戦的ベンチマークデータセットに基づく広範囲な実験を行った。

An ability to generalize unconstrained conditions such as severe occlusions and large pose variations remains a challenging goal to achieve in face alignment. In this paper, a multistage model based on deep neural networks is proposed which takes advantage of spatial transformer networks, hourglass networks and exemplar-based shape constraints. First, a spatial transformer - generative adversarial network which consists of convolutional layers and residual units is utilized to solve the initialization issues caused by face detectors, such as rotation and scale variations, to obtain improved face bounding boxes for face alignment. Then, stacked hourglass network is employed to obtain preliminary locations of landmarks as well as their corresponding scores. In addition, an exemplar-based shape dictionary is designed to determine landmarks with low scores based on those with high scores. By incorporating face shape constraints, misaligned landmarks caused by occlusions or cluttered backgrounds can be considerably improved. Extensive experiments based on challenging benchmark datasets are performed to demonstrate the superior performance of the proposed method over other state-of-the-art methods.

翻訳日:2023-01-04 03:25:56 公開日:2020-02-04

# トップダウンアテンションを用いた選択的セグメンテーションネットワーク

Selective Segmentation Networks Using Top-Down Attention ( http://arxiv.org/abs/2002.01125v1 )

ライセンス: Link先を確認

Mahdi Biparva, John Tsotsos

(参考訳) 畳み込みニューラルネットワーク(convolutional neural networks)は、ネットワーク階層の底にある入力感覚データの、視覚階層の上部にある意味情報への変換をモデル化する。フィードフォワード処理は、いくつかのオブジェクト認識タスクに十分である。トップダウンの選択はボトムアップのfeedforwardパスに加えて必要となる。部分的には、階層的特徴ピラミッドによって課される位置情報の喪失の欠点に対処することができる。本稿では,Top-Down選択ネットワークでBottom-Up \convnetsを拡張可能な,オブジェクトセグメンテーションのための統合2パスフレームワークを提案する。我々は,トップダウン選択ゲーティング活動を利用してボトムアップ隠れ動作のセグメンテーション予測を行う。ネットワークの両端におけるタスク要求を満たす損失項を持つエンドツーエンドのマルチタスクフレームワークを開発する。提案するベンチマークデータセットのネットワークをセマンティクスセグメンテーションのために評価し,トップダウン選択能力を有するネットワークがベースラインモデルを上回ることを示す。さらに、我々は新しいセグメンテーションパラダイムの優れた側面に光を当て、パラメトリックスキップ接続に純粋に依存するベースラインモデルよりも、新しいフレームワークの効率を質的かつ定量的に支援した。

Convolutional neural networks model the transformation of the input sensory data at the bottom of a network hierarchy to the semantic information at the top of the visual hierarchy. Feedforward processing is sufficient for some object recognition tasks. Top-Down selection is potentially required in addition to the Bottom-Up feedforward pass. It can, in part, address the shortcoming of the loss of location information imposed by the hierarchical feature pyramids. We propose a unified 2-pass framework for object segmentation that augments Bottom-Up \convnets with a Top-Down selection network. We utilize the top-down selection gating activities to modulate the bottom-up hidden activities for segmentation predictions. We develop an end-to-end multi-task framework with loss terms satisfying task requirements at the two ends of the network. We evaluate the proposed network on benchmark datasets for semantic segmentation, and show that networks with the Top-Down selection capability outperform the baseline model. Additionally, we shed light on the superior aspects of the new segmentation paradigm and qualitatively and quantitatively support the efficiency of the novel framework over the baseline model that relies purely on parametric skip connections.

翻訳日:2023-01-04 03:18:58 公開日:2020-02-04

# 境界不規則性をもつ逆ロバストフレームサンプリング

Adversarially Robust Frame Sampling with Bounded Irregularities ( http://arxiv.org/abs/2002.01147v1 )

ライセンス: Link先を確認

Hanhan Li, Pin Wang

(参考訳) 近年,ビデオから意味のある情報を自動抽出するビデオ解析ツールが広く研究され,展開されている。ほとんどが計算コストのかかるディープニューラルネットワークを使用しているため、そのようなアルゴリズムにビデオフレームのサブセットだけを投入することが望ましい。フレームを固定レートでサンプリングすることは、その単純さ、代表性、解釈性のために常に魅力的である。例えば、人気のcloud video apiは、ビデオ中の毎秒1フレームのみを処理することで、ビデオとショットのラベルを生成した。しかし、選択したフレームをサンプリングされた場所に配置することで、このような戦略を簡単に攻撃することができる。本稿では,このサンプリング問題に対するエレガントな解決法を提案する。

In recent years, video analysis tools for automatically extracting meaningful information from videos are widely studied and deployed. Because most of them use deep neural networks which are computationally expensive, feeding only a subset of video frames into such algorithms is desired. Sampling the frames with fixed rate is always attractive for its simplicity, representativeness, and interpretability. For example, a popular cloud video API generated video and shot labels by processing only the first frame of every second in a video. However, one can easily attack such strategies by placing chosen frames at the sampled locations. In this paper, we present an elegant solution to this sampling problem that is provably robust against adversarial attacks and introduces bounded irregularities as well.

翻訳日:2023-01-04 03:18:40 公開日:2020-02-04

# AutoEncoder-based Lifted Multicuts を用いた教師なし多人数追跡

Unsupervised Multiple Person Tracking using AutoEncoder-Based Lifted Multicuts ( http://arxiv.org/abs/2002.01192v1 )

ライセンス: Link先を確認

Kalun Ho, Janis Keuper, Margret Keuper

(参考訳) マルチオブジェクト追跡(MOT)はコンピュータビジョンにおける長年の課題である。検出パラダイムによるトラッキングに基づく現在のアプローチは、データをトラックに正しく関連付けるために、ある種のドメイン知識または監督を必要とする。本研究では,視覚特徴と最小コストリフトされたマルチカットに基づく教師なしマルチオブジェクト追跡手法を提案する。提案手法は,画像列中の隣接フレームから重畳することなく抽出できるストレートフォワード時空間的手がかりに基づく。これらの手がかりに基づくクラスタリングにより、追跡タスクに必要な出現不変性を学び、オートエンコーダを訓練し、適切な潜在表現を生成することができる。このように、結果として生じる潜在表現は、信頼できる時空間的特徴を抽出できない大きな時間的距離でも追跡するための堅牢な外観手がかりとして機能する。提案したアノテーションを使わずにトレーニングされているにもかかわらず,我々のモデルは,歩行者追跡のための挑戦的なMOTベンチマーク上での競争結果を提供する。

Multiple Object Tracking (MOT) is a long-standing task in computer vision. Current approaches based on the tracking by detection paradigm either require some sort of domain knowledge or supervision to associate data correctly into tracks. In this work, we present an unsupervised multiple object tracking approach based on visual features and minimum cost lifted multicuts. Our method is based on straight-forward spatio-temporal cues that can be extracted from neighboring frames in an image sequences without superivison. Clustering based on these cues enables us to learn the required appearance invariances for the tracking task at hand and train an autoencoder to generate suitable latent representation. Thus, the resulting latent representations can serve as robust appearance cues for tracking even over large temporal distances where no reliable spatio-temporal features could be extracted. We show that, despite being trained without using the provided annotations, our model provides competitive results on the challenging MOT Benchmark for pedestrian tracking.

翻訳日:2023-01-04 03:17:58 公開日:2020-02-04

# Selective Convolutional Network: 背景を無視する効率的なオブジェクト検出器

Selective Convolutional Network: An Efficient Object Detector with Ignoring Background ( http://arxiv.org/abs/2002.01205v1 )

ライセンス: Link先を確認

Hefei Ling, Yangyang Qin, Li Zhang, Yuxuan Shi, Ping Li

(参考訳) アテンション機構がオブジェクト検出器を含む多くのcnnの性能を効果的に改善できることはよく知られている。特徴写像を精細化する代わりに、注意を向ける新しい試みによって計算の複雑さを抑える。そこで,本研究では,有意義かつ有意義な情報を含む位置のみを選択的に計算するscn(elective convolutional network)と呼ばれる効率的な物体検出器を提案する。基本的な考え方は、特に特徴抽出時の計算コストを効果的に削減する、重要でない背景領域を排除することである。そこで本稿では,ネットワークの次を導くためのオーバーヘッドを無視する,精巧な構造を設計する。エンドツーエンドのトレーニング可能で、エンベディングも簡単です。追加のセグメンテーションデータセットなしでは、直接監督と間接監督を含む2つの異なる列車戦略を探索する。 PASCAL VOC2007およびMS COCO検出データセットの性能評価実験を行った。その結果, SSD と Pelee を本手法に統合することにより, SCN の精度を低下させることなく, 1/5 および 1/3 の範囲での計算を平均的に削減できることがわかった。

It is well known that attention mechanisms can effectively improve the performance of many CNNs including object detectors. Instead of refining feature maps prevalently, we reduce the prohibitive computational complexity by a novel attempt at attention. Therefore, we introduce an efficient object detector called Selective Convolutional Network (SCN), which selectively calculates only on the locations that contain meaningful and conducive information. The basic idea is to exclude the insignificant background areas, which effectively reduces the computational cost especially during the feature extraction. To solve it, we design an elaborate structure with negligible overheads to guide the network where to look next. It's end-to-end trainable and easy-embedding. Without additional segmentation datasets, we explores two different train strategies including direct supervision and indirect supervision. Extensive experiments assess the performance on PASCAL VOC2007 and MS COCO detection datasets. Results show that SSD and Pelee integrated with our method averagely reduce the calculations in a range of 1/5 and 1/3 with slight loss of accuracy, demonstrating the feasibility of SCN.

翻訳日:2023-01-04 03:17:43 公開日:2020-02-04

# トポメトリックマップにおける単一画像からの深部幾何学的6DF位置決め

Deep-Geometric 6 DoF Localization from a Single Image in Topo-metric Maps ( http://arxiv.org/abs/2002.01210v1 )

ライセンス: Link先を確認

Tom Roussel, Punarjay Chakravarty, Gaurav Pandey, Tinne Tuytelaars, Luc Van Eycken

(参考訳) 本稿では,カメラの6自由度(dof)の全体像を,予めマッピングされた環境における1つの画像から推定可能な,深部地理ローカライザについて述べる。我々の地図はトポメトリックであり、6つのDoFポーズが知られている離散位相ノードを持つ。マップの各topoノードは、2d特徴と3d位置がマッピングプロセスの一部として格納される一連のポイントで構成されています。マッピングフェーズでは、ステレオカメラと通常のステレオビジュアルスラムパイプラインを使用します。ローカライゼーションフェーズでは,1枚のカメライメージをDeep Learningを用いてトポロジカルノードにローカライズし,マッチングした2D特徴(およびトポマップにおけるそれらの3D位置)の幾何アルゴリズム(PnP)を用いて,カメラの全6DoFのグローバルな一貫したポーズを決定する。本手法は,マッピングと位置決めアルゴリズムとセンサ(stereoとmono)を分離し,単一のカメラを用いて,予めマッピングした環境で正確な6自由度位置推定を可能にする。携帯電話やドローンなどの単一カメラデバイスにおけるVR/ARやローカライゼーションの応用の可能性を考えると、私たちのハイブリッドアルゴリズムは、シミュレーションや実環境における単一の画像からのポーズを回帰する完全なDeep-LearningベースのPose-Netと好適に比較できる。

We describe a Deep-Geometric Localizer that is able to estimate the full 6 Degree of Freedom (DoF) global pose of the camera from a single image in a previously mapped environment. Our map is a topo-metric one, with discrete topological nodes whose 6 DoF poses are known. Each topo-node in our map also comprises of a set of points, whose 2D features and 3D locations are stored as part of the mapping process. For the mapping phase, we utilise a stereo camera and a regular stereo visual SLAM pipeline. During the localization phase, we take a single camera image, localize it to a topological node using Deep Learning, and use a geometric algorithm (PnP) on the matched 2D features (and their 3D positions in the topo map) to determine the full 6 DoF globally consistent pose of the camera. Our method divorces the mapping and the localization algorithms and sensors (stereo and mono), and allows accurate 6 DoF pose estimation in a previously mapped environment using a single camera. With potential VR/AR and localization applications in single camera devices such as mobile phones and drones, our hybrid algorithm compares favourably with the fully Deep-Learning based Pose-Net that regresses pose from a single image in simulated as well as real environments.

翻訳日:2023-01-04 03:17:22 公開日:2020-02-04

# 物体追跡のための3次元モデル輪郭エネルギーとキーポイントの組み合わせ

Combining 3D Model Contour Energy and Keypoints for Object Tracking ( http://arxiv.org/abs/2002.01379v1 )

ライセンス: Link先を確認

Bogdan Bugaev, Anton Kryshchenko, Roman Belov

(参考訳) 単分子モデルに基づく3次元追跡のための新しい組み合わせアプローチを提案する。キーポイントベースの手法を用いて予備オブジェクトのポーズを推定する。次に、輪郭エネルギー関数を最適化してポーズを洗練する。エネルギーは、モデル投影の輪郭と画像エッジとの対応度を決定する。生画像勾配の強度と向きの両方に基づいて算出する。最適化のために,局所最適性を克服し,キーポイントに基づくポーズ推定から得られる情報を考慮した検索領域制約を提案する。この手法は,キーポイントベースおよびエッジベースアプローチの多くの問題を解消する。本手法は,様々な照明条件,動作パターン,速度の動画を含む公開ベンチマークデータセット上で,最先端手法と比較することにより,その効率性を示す。

We present a new combined approach for monocular model-based 3D tracking. A preliminary object pose is estimated by using a keypoint-based technique. The pose is then refined by optimizing the contour energy function. The energy determines the degree of correspondence between the contour of the model projection and the image edges. It is calculated based on both the intensity and orientation of the raw image gradient. For optimization, we propose a technique and search area constraints that allow overcoming the local optima and taking into account information obtained through keypoint-based pose estimation. Owing to its combined nature, our method eliminates numerous issues of keypoint-based and edge-based approaches. We demonstrate the efficiency of our method by comparing it with state-of-the-art methods on a public benchmark dataset that includes videos with various lighting conditions, movement patterns, and speed.

翻訳日:2023-01-04 03:16:55 公開日:2020-02-04

# Action Graphs: グラフ畳み込みネットワークによる弱教師付きアクションローカライゼーション

Action Graphs: Weakly-supervised Action Localization with Graph Convolution Networks ( http://arxiv.org/abs/2002.01449v1 )

ライセンス: Link先を確認

Maheen Rashid, Hedvig Kjellstr\"om, Yong Jae Lee

(参考訳) 本稿では,グラフ畳み込みに基づく弱教師付き動作定位法を提案する。関連するアクションクラスに対応するビデオ時間セグメントを検索・分類するために、システムは、各ビデオ内の識別可能な時間セグメントを識別し、各アクションの完全な範囲を識別できる必要がある。弱いビデオレベルのラベルでこれを達成するには、システムは、トレーニングデータ内のビデオ間のモーメント間の類似性と相違を利用して、アクションの出現方法と、アクションの全範囲を構成するサブアクションの両方を理解する必要がある。しかし、現在の手法では、ビデオモーメント間の類似性を明示的に使用せず、局所化と分類予測を知らせている。本稿では,ビデオモーメント間の類似性を明示的にモデル化するためにグラフ畳み込みを用いる新しい手法を提案する。本手法は外観と動きを符号化した類似性グラフを用いて,THUMOS '14, ActivityNet 1.2, Charadesの動作ローカライゼーションを弱めに制御する手法である。

We present a method for weakly-supervised action localization based on graph convolutions. In order to find and classify video time segments that correspond to relevant action classes, a system must be able to both identify discriminative time segments in each video, and identify the full extent of each action. Achieving this with weak video level labels requires the system to use similarity and dissimilarity between moments across videos in the training data to understand both how an action appears, as well as the sub-actions that comprise the action's full extent. However, current methods do not make explicit use of similarity between video moments to inform the localization and classification predictions. We present a novel method that uses graph convolutions to explicitly model similarity between video moments. Our method utilizes similarity graphs that encode appearance and motion, and pushes the state of the art on THUMOS '14, ActivityNet 1.2, and Charades for weakly supervised action localization.

翻訳日:2023-01-04 03:16:43 公開日:2020-02-04

# トピックネットワークから分散認知マップへ:自発的地理情報領域におけるZipfian Topic Universes

From Topic Networks to Distributed Cognitive Maps: Zipfian Topic Universes in the Area of Volunteered Geographic Information ( http://arxiv.org/abs/2002.01454v1 )

ライセンス: Link先を確認

Alexander Mehler and R\"udiger Gleim and Regina Gaitsch and Wahed Hemati and Tolga Uslu

(参考訳) 近くの場所(都市など)は関連する言葉で書かれていますか。本稿では,地理情報の語彙符号化の分野において,この研究課題をテクスチュアリティのレベルに転送する。この目的のために,いわゆるトピックネットワークの助けを借りて,都市や地域レベルでの住所のテキストをモデル化するボランティア地理情報(vgi)を探索する。このことは、言語がテキストの話題レベルに関する地理情報をエンコードし、ネットワーク化する方法を調べるために行われる。我々の仮説は、場所のネットワーク的テーマ化は、距離や著者の基盤となるコミュニティに関係なく、類似している、というものである。そこで本研究では,言語多層ネットワーク(LMN)を新たなモデルとして,特にテキストコーパスにおけるテーマネットワークを自動生成する多言語トピックネットワーク(MTN)を提案する。本研究は、地理的な場所(特に都市)がオンラインコミュニケーションに存在するテーマ宇宙のZipfian組織を示す。我々は、この発見を認知地図の文脈で解釈し、いわゆるテーママップによって拡張する概念である。この発見の解釈によれば、認知地図の一部としてのテーママップの組織化は、基盤となるメディアの継続的な存在を保証する共有可能なコンテンツを生成する傾向から生じる。ウィキペディアの特別なウィキや抽出例を用いて仮説を検証した。このようにして、私たちは結論に達します: 互いに近いかどうかに関わらず、場所はトピック宇宙の類似のサブネットワークにまたがる隣り合う場所にあります。

Are nearby places (e.g. cities) described by related words? In this article we transfer this research question in the field of lexical encoding of geographic information onto the level of intertextuality. To this end, we explore Volunteered Geographic Information (VGI) to model texts addressing places at the level of cities or regions with the help of so-called topic networks. This is done to examine how language encodes and networks geographic information on the aboutness level of texts. Our hypothesis is that the networked thematizations of places are similar - regardless of their distances and the underlying communities of authors. To investigate this we introduce Multiplex Topic Networks (MTN), which we automatically derive from Linguistic Multilayer Networks (LMN) as a novel model, especially of thematic networking in text corpora. Our study shows a Zipfian organization of the thematic universe in which geographical places (especially cities) are located in online communication. We interpret this finding in the context of cognitive maps, a notion which we extend by so-called thematic maps. According to our interpretation of this finding, the organization of thematic maps as part of cognitive maps results from a tendency of authors to generate shareable content that ensures the continued existence of the underlying media. We test our hypothesis by example of special wikis and extracts of Wikipedia. In this way we come to the conclusion: Places, whether close to each other or not, are located in neighboring places that span similar subnetworks in the topic universe.

翻訳日:2023-01-04 03:08:48 公開日:2020-02-04

# 汎用学習エージェントのための神経進化的枠組み

Neuro-evolutionary Frameworks for Generalized Learning Agents ( http://arxiv.org/abs/2002.01088v1 )

ライセンス: Link先を確認

Thommen George Karimpanal

(参考訳) 近年のディープラーニングと強化学習の成功は、最先端の人工知能技術としての地位を確立している。しかし、サンプル効率の低さや限定的な一般化能力といったこれらのアプローチの長年の欠点は、システムの設計とデプロイの方法を再検討する必要があることを示している。本稿では,これらの学習システムと進化アルゴリズムの特定のバリエーションを組み合わせることで,様々な望ましい行動の自動獲得や行動優先の有用なセットなど,ユニークな特徴が出現する可能性について強調する。これにより、環境との最小限の相互作用で、学習を一般化し、継続的に行う方法が整うことができる。このような神経進化の枠組みから期待される改善と関連する課題、そして多くの研究分野への応用の可能性について論じる。

The recent successes of deep learning and deep reinforcement learning have firmly established their statuses as state-of-the-art artificial learning techniques. However, longstanding drawbacks of these approaches, such as their poor sample efficiencies and limited generalization capabilities point to a need for re-thinking the way such systems are designed and deployed. In this paper, we emphasize how the use of these learning systems, in conjunction with a specific variation of evolutionary algorithms could lead to the emergence of unique characteristics such as the automated acquisition of a variety of desirable behaviors and useful sets of behavior priors. This could pave the way for learning to occur in a generalized and continual manner, with minimal interactions with the environment. We discuss the anticipated improvements from such neuro-evolutionary frameworks, along with the associated challenges, as well as its potential for application to a number of research areas.

翻訳日:2023-01-04 03:08:10 公開日:2020-02-04

# 脳波信号を用いた宣言記憶の符号化と復号のためのニューラルオシレーション

Neural Oscillations for Encoding and Decoding Declarative Memory using EEG Signals ( http://arxiv.org/abs/2002.01126v1 )

ライセンス: Link先を確認

Jenifer Kalafatovich, Minji Lee

(参考訳) 宣言記憶は、日常生活体験の記憶との関係について研究されている。前回の研究では、エンコーディングフェーズにおける動作性能に関するパワースペクトルの変化が報告されたが、デコーディングフェーズはまだ検討が必要である。本研究では,記憶過程に関連する神経振動の変化について検討する。参加者は脳波信号が記録されている間に、フェーズのエンコーディングとデコードを行うためのメモリタスクを依頼された。その結果, エンコーディングフェーズでは, 低ベータ, 高ベータ帯, 低ベータ帯, 高ベータ帯, ガンマ帯が左側頭葉領域で有意に低下し, その後の記憶への影響が認められた。復号フェーズでは, 前方-中央領域でアルファパワーの低下がみられた。その結果、βバンドとαバンドがメモリタスクのエンコードとデコードフェーズにそれぞれ有意な相関を示した。

Declarative memory has been studied for its relationship with remembering daily life experiences. Previous studies reported changes in power spectra during encoding phase related to behavioral performance, however decoding phase still needs to be explored. This study investigates neural oscillations changes related to memory process. Participants were asked to perform a memory task for encoding and decoding phase while EEG signals were recorded. Results showed that for encoding phase, there was a significant decrease of power in low beta, high beta bands over fronto-central area and a decrease in low beta, high beta and gamma bands over left temporal area related to successful subsequent memory effects. For decoding phase, only significant decreases of alpha power were observed over fronto-central area. This finding showed relevance of beta and alpha band for encoding and decoding phase of a memory task respectively.

翻訳日:2023-01-04 03:07:56 公開日:2020-02-04

# 弱監視対象検出のためのオブジェクトインスタンスマイニング

Object Instance Mining for Weakly Supervised Object Detection ( http://arxiv.org/abs/2002.01087v1 )

ライセンス: Link先を確認

Chenhao Lin, Siwen Wang, Dongqi Xu, Yu Lu, Wayne Zhang

(参考訳) 近年,画像レベルのアノテーションのみを用いたオブジェクト検出(WSOD)が注目されている。複数のインスタンス学習を使用する既存のアプローチは、各カテゴリのイメージ内の最も識別的なオブジェクトから学ぶ傾向があるため、ローカルオプティマに容易に当てはまる。したがって、これらのメソッドは、WSODのパフォーマンスを低下させるオブジェクトインスタンスの欠如に悩まされる。この問題に対処するため,本論文では,オブジェクト検出の弱いエンドツーエンドのオブジェクトインスタンスマイニング(OIM)フレームワークを提案する。 oimは、追加のアノテーションなしで、空間および外観グラフに情報伝達を導入することで、各画像に存在するすべての可能なオブジェクトインスタンスの検出を試みる。反復学習プロセスでは、同一クラスからの識別の少ないオブジェクトインスタンスを徐々に検出し、トレーニングに利用することができる。さらに、各オブジェクトインスタンスのより大きな部分を学習し、パフォーマンスをさらに向上するために、オブジェクトインスタンスの再重み付け損失を設計する。 VOC 2007 と 2012 の2つの公開データベースの実験結果は,提案手法の有効性を実証している。

Weakly supervised object detection (WSOD) using only image-level annotations has attracted growing attention over the past few years. Existing approaches using multiple instance learning easily fall into local optima, because such mechanism tends to learn from the most discriminative object in an image for each category. Therefore, these methods suffer from missing object instances which degrade the performance of WSOD. To address this problem, this paper introduces an end-to-end object instance mining (OIM) framework for weakly supervised object detection. OIM attempts to detect all possible object instances existing in each image by introducing information propagation on the spatial and appearance graphs, without any additional annotations. During the iterative learning process, the less discriminative object instances from the same class can be gradually detected and utilized for training. In addition, we design an object instance reweighted loss to learn larger portion of each object instance to further improve the performance. The experimental results on two publicly available databases, VOC 2007 and 2012, demonstrate the efficacy of proposed approach.

翻訳日:2023-01-04 03:07:43 公開日:2020-02-04

# グループ写真の美的品質評価

Aesthetic Quality Assessment for Group photograph ( http://arxiv.org/abs/2002.01096v1 )

ライセンス: Link先を確認

Yaoting Wang (1 and 2), Yongzhen Ke (1 and 2), Kai Wang (1 and 2), Cuijiao Zhang (1 and 2), Fan Qin (3) ((1) School of computer science and technology, Tiangong University, (2) Tianjin Key Laboratory of Autonomous Intelligence Technology and Systems, (3) Business School, Nankai University)

(参考訳) 画像美的品質評価は近年注目されているが、特定の種類の写真、すなわちグループ写真についての研究はあまり行われていない。本研究では,グループ写真の経験と原則に基づく,高度な機能セットを設計した。Opened-eye, Gaze, Smile, Occluded Face, Face Orientation, Facial blur, Character Center。次に,これらと83の汎用的な美的特徴を組み合わせることで,2つの美的評価モデルを構築した。また,審美スコアを付記したグループ写真gpdの大規模データセットを構築した。実験の結果,プロの写真とスナップショットを分類し,同一場面における多様な人間状態の複数のグループ写真の識別を予測できることがわかった。

Image aesthetic quality assessment has got much attention in recent years, but not many works have been done on a specific genre of photos: Group photograph. In this work, we designed a set of high-level features based on the experience and principles of group photography: Opened-eye, Gaze, Smile, Occluded faces, Face Orientation, Facial blur, Character center. Then we combined them and 83 generic aesthetic features to build two aesthetic assessment models. We also constructed a large dataset of group photographs - GPD- annotated with the aesthetic score. The experimental result shows that our features perform well for categorizing professional photos and snapshots and predicting the distinction of multiple group photographs of diverse human states under the same scene.

翻訳日:2023-01-04 03:07:27 公開日:2020-02-04

# 映像中の異常な活動検出のためのランキングロス機能付き3次元ResNet

3D ResNet with Ranking Loss Function for Abnormal Activity Detection in Videos ( http://arxiv.org/abs/2002.01132v1 )

ライセンス: Link先を確認

Shikha Dubey, Abhijeet Boragule, Moongu Jeon

(参考訳) 異常な活動検出はコンピュータビジョンの分野で最も困難なタスクの1つである。本研究は, 異常映像と正常映像の両方を用いて, 映像レベルの情報を提供し, 複数インスタンス学習の助けを借りて異常映像を学習する, 異常行動検出の最近の研究成果に動機づけられている。時間的アノテーションがない場合、そのようなモデルは異常を検出しながら誤報をしがちである。そこで本稿では,異常な活動検知タスクを実行しながら,誤警報率を最小限に抑えるタスクに焦点をあてる。ビデオ行動認識タスクにおけるこれらの誤報の軽減と最近の3Dディープニューラルネットワークの進歩は、提案手法で3D ResNetを活用する動機を与え、ビデオから時空間の特徴を抽出するのに役立つ。その後,これらの特徴と深層マルチインスタンス学習と,提案するランキング損失を用いて,映像セグメントレベルでの異常スコアの予測を行う。そこで,提案手法は3D Deep Multiple Instance Learning with ResNet (MILR) と新しいランキング損失関数を併用して,UCF-Crimeベンチマークデータセット上での最高の性能を実現する。提案手法の有効性をUCF-Crimeデータセットで示す。

Abnormal activity detection is one of the most challenging tasks in the field of computer vision. This study is motivated by the recent state-of-art work of abnormal activity detection, which utilizes both abnormal and normal videos in learning abnormalities with the help of multiple instance learning by providing the data with video-level information. In the absence of temporal-annotations, such a model is prone to give a false alarm while detecting the abnormalities. For this reason, in this paper, we focus on the task of minimizing the false alarm rate while performing an abnormal activity detection task. The mitigation of these false alarms and recent advancement of 3D deep neural network in video action recognition task collectively give us motivation to exploit the 3D ResNet in our proposed method, which helps to extract spatial-temporal features from the videos. Afterwards, using these features and deep multiple instance learning along with the proposed ranking loss, our model learns to predict the abnormality score at the video segment level. Therefore, our proposed method 3D deep Multiple Instance Learning with ResNet (MILR) along with the new proposed ranking loss function achieves the best performance on the UCF-Crime benchmark dataset, as compared to other state-of-art methods. The effectiveness of our proposed method is demonstrated on the UCF-Crime dataset.

翻訳日:2023-01-04 03:01:23 公開日:2020-02-04

# GTC:CTCの効率的かつ正確なテキスト認識に向けた指導的訓練

GTC: Guided Training of CTC Towards Efficient and Accurate Scene Text Recognition ( http://arxiv.org/abs/2002.01276v1 )

ライセンス: Link先を確認

Wenyang Hu, Xiaocong Cai, Jun Hou, Shuai Yi, Zhiping Lin

(参考訳) コネクショニスト時間分類(ctc)と注意機構は、近年のテキスト認識における2つの主要なアプローチである。注意に基づく手法と比較して、CTCデコーダはより短い推論時間を持つが、精度は低い。効率的かつ効果的なモデルを設計するために、より強力な注意指導からCTCモデルがより優れたアライメントと特徴表現を学習するCTC(GTC)のガイド付きトレーニングを提案する。ガイド付きトレーニングの利点により、CTCモデルは、高速な推論速度を維持しながら、正規および不規則なシーンテキストの堅牢かつ正確な予測を実現する。さらに,ctcデコーダの可能性をさらに活用するために,グラフ畳み込みネットワーク(gcn)を提案し,抽出された特徴の局所相関について検討した。標準ベンチマークに関する広範囲な実験により,本モデルが正規および不規則なテキスト認識のための新たな最先端技術を実現し,注意に基づく手法の6倍の推論時間を必要とすることが示された。

Connectionist Temporal Classification (CTC) and attention mechanism are two main approaches used in recent scene text recognition works. Compared with attention-based methods, CTC decoder has a much shorter inference time, yet a lower accuracy. To design an efficient and effective model, we propose the guided training of CTC (GTC), where CTC model learns a better alignment and feature representations from a more powerful attentional guidance. With the benefit of guided training, CTC model achieves robust and accurate prediction for both regular and irregular scene text while maintaining a fast inference speed. Moreover, to further leverage the potential of CTC decoder, a graph convolutional network (GCN) is proposed to learn the local correlations of extracted features. Extensive experiments on standard benchmarks demonstrate that our end-to-end model achieves a new state-of-the-art for regular and irregular scene text recognition and needs 6 times shorter inference time than attentionbased methods.

翻訳日:2023-01-04 02:59:54 公開日:2020-02-04

# ピクセルワイズ条件付き生成逆ネットワークによる画像合成と補完

Pixel-wise Conditioned Generative Adversarial Networks for Image Synthesis and Completion ( http://arxiv.org/abs/2002.01281v1 )

ライセンス: Link先を確認

Cyprien Ruffino and Romain H\'erault and Eric Laloy and Gilles Gasso

(参考訳) generative adversarial networks (gans) は教師なし画像生成に成功している。いくつかの作品では、画像の一部を再構成して生成を条件づけることで、ganを画像の塗装に拡張している。その成功にもかかわらず、これらの手法は、画像ピクセルの小さなサブセットのみが事前に知られているような設定に制限がある。本稿では,ごく少数の画素値が提供される場合の条件付きGANの有効性について検討する。本稿では,GAN目標関数に明示的なコスト項を付加して画素単位の条件を強制するモデリングフレームワークを提案する。本稿では,この正規化項が生成画像の品質と与えられた画素制約を満たすことに与える影響について検討する。最近のPacGAN技術を用いて、我々は生成したサンプルの多様性を維持する。 FashionMNISTにおける実験により、正規化項は生成画像の品質と条件付けとの間のトレードオフを効果的に制御することを示した。 cifar-10 と celeba データセットの実験的評価により, 画素条件を強制しながら, fr\'echet インセプション距離の観点で, 視覚的, 定量的に精度の高い結果が得られることが示された。また,完全畳み込みネットワークを用いたテクスチャ画像生成タスクの評価を行った。最後の貢献として、この手法を古典的な地質シミュレーション応用に適用する。

Generative Adversarial Networks (GANs) have proven successful for unsupervised image generation. Several works have extended GANs to image inpainting by conditioning the generation with parts of the image to be reconstructed. Despite their success, these methods have limitations in settings where only a small subset of the image pixels is known beforehand. In this paper we investigate the effectiveness of conditioning GANs when very few pixel values are provided. We propose a modelling framework which results in adding an explicit cost term to the GAN objective function to enforce pixel-wise conditioning. We investigate the influence of this regularization term on the quality of the generated images and the fulfillment of the given pixel constraints. Using the recent PacGAN technique, we ensure that we keep diversity in the generated samples. Conducted experiments on FashionMNIST show that the regularization term effectively controls the trade-off between quality of the generated images and the conditioning. Experimental evaluation on the CIFAR-10 and CelebA datasets evidences that our method achieves accurate results both visually and quantitatively in term of Fr\'echet Inception Distance, while still enforcing the pixel conditioning. We also evaluate our method on a texture image generation task using fully-convolutional networks. As a final contribution, we apply the method to a classical geological simulation application.

翻訳日:2023-01-04 02:59:37 公開日:2020-02-04

# 有界部分集合の学習 $l_p$

Learning bounded subsets of $L_p$ ( http://arxiv.org/abs/2002.01182v1 )

ライセンス: Link先を確認

Shahar Mendelson

(参考訳) 基礎となるクラスが$L_p$の有界部分集合であり、対象の$Y$が$L_p$に属する学習問題を研究する。以前は、ミニマックスサンプルの複雑性推定は、そのような有界性仮定の下で、$p=\infty$のときのみ知られていた。任意の$p > 4$ の急激なサンプル複雑性の推定値を示す。これは、重み付き問題に適した学習手順に基づいている。

We study learning problems in which the underlying class is a bounded subset of $L_p$ and the target $Y$ belongs to $L_p$. Previously, minimax sample complexity estimates were known under such boundedness assumptions only when $p=\infty$. We present a sharp sample complexity estimate that holds for any $p > 4$. It is based on a learning procedure that is suited for heavy-tailed problems.

翻訳日:2023-01-04 02:52:18 公開日:2020-02-04

# ALPINE:ネットワーク埋め込みを用いたアクティブリンク予測

ALPINE: Active Link Prediction using Network Embedding ( http://arxiv.org/abs/2002.01227v1 )

ライセンス: Link先を確認

Xi Chen, Bo Kang, Jefrey Lijffijt and Tijl De Bie

(参考訳) 多くの実世界の問題は、部分的に観測されたネットワーク内のリンクを予測するものとして定式化することができる。例えば、Facebookの友情提案、消費者製品推奨、犯罪ネットワーク内のアクター間の隠れた相互作用の識別などがある。いくつかのリンク予測アルゴリズム、特に最近導入されたネットワーク埋め込みは、ネットワークの観測部分に依存するだけでこれを行うことができる。多くの場合、ノード対のリンク状態はクエリされ、リンク予測アルゴリズムによって追加情報として使用できる。残念なことに、このようなクエリはコストも時間もかかるため、どのノード対でクエリするかを慎重に検討する必要がある。本稿では,特定のノード対を問合せした後のリンク予測精度の向上を,アクティブな学習環境で使用するために推定する。具体的には,ネットワーク埋め込みに基づくリンク予測のための最初の手法である ALPINE (Active Link Prediction usIng Network Embedding) を提案する。この目的のために,v-optimalityの概念を実験設計からこの設定に一般化するとともに,標準分類設定で開発されたより基本的なアクティブラーニングヒューリスティックスを一般化した。実データによる実証結果から、ALPINEはスケーラブルであり、リンク予測精度をはるかに少ないクエリで向上させる。

Many real-world problems can be formalized as predicting links in a partially observed network. Examples include Facebook friendship suggestions, consumer-product recommendations, and the identification of hidden interactions between actors in a crime network. Several link prediction algorithms, notably those recently introduced using network embedding, are capable of doing this by just relying on the observed part of the network. Often, the link status of a node pair can be queried, which can be used as additional information by the link prediction algorithm. Unfortunately, such queries can be expensive or time-consuming, mandating the careful consideration of which node pairs to query. In this paper we estimate the improvement in link prediction accuracy after querying any particular node pair, to use in an active learning setup. Specifically, we propose ALPINE (Active Link Prediction usIng Network Embedding), the first method to achieve this for link prediction based on network embedding. To this end, we generalized the notion of V-optimality from experimental design to this setting, as well as more basic active learning heuristics originally developed in standard classification settings. Empirical results on real data show that ALPINE is scalable, and boosts link prediction accuracy with far fewer queries.

翻訳日:2023-01-04 02:52:12 公開日:2020-02-04

# ウィスラー電波の検出と特徴化のための機械学習技術

Machine Learning Techniques to Detect and Characterise Whistler Radio Waves ( http://arxiv.org/abs/2002.01244v1 )

ライセンス: Link先を確認

Othniel J.E.Y. Konan, Amit Kumar Mishra, Stefan Lotz

(参考訳) ライトニングストロークは強力な電磁パルスを生成し、非常に低周波(VLF)波を電磁界線に沿って半球に伝播させる。 vlfアンテナ受信機は、これらの雷撃によって発生するホイッスラー波を検出するために使用できる。受信ホイッスラー波の特定の時間/周波数依存性は、磁気圏のプラズマ圏領域における電子密度の推定を可能にする。したがって、ホイッスラーの識別と特徴付けは、プラズマ圏をリアルタイムに監視し、統計研究に使用するイベントの大規模なデータベースを構築するための重要なタスクである。ウイスラー検出技術の現状は、Lichtenberger (2009) が開発したAutomatic Whistler Detection (AWD) 法である。この方法は2次元の画像相関に基づいており、vlf受信アンテナ(例えば南極)に位置する重要な計算ハードウェアを必要とする。本研究の目的は,vlf受信機が提供するデータからホイッスラーを自動的に検出できる機械学習モデルを開発することである。提案手法は,VLF受信機が生成したスペクトルデータに対して,画像分類と局所化を組み合わせることで,各ウィスラーの識別とローカライズを行う。対象とするデータには,SANAEとMarionのAWDが特定した約2300のイベントがあり,トレーニングや検証,テストデータとして使用される予定である。 3つの検出器設計が提案されている。 AWDと同様の手法を用いており、第1は分光図から抽出した関心領域のイメージ分類を用いており、第2はオブジェクト検出における最先端であるYOLOを用いている。これらの検出器はマリオンのデータセットで15%未満の誤検知と誤報を達成できることが示されている。

Lightning strokes create powerful electromagnetic pulses that routinely cause very low frequency (VLF) waves to propagate across hemispheres along geomagnetic field lines. VLF antenna receivers can be used to detect these whistler waves generated by these lightning strokes. The particular time/frequency dependence of the received whistler wave enables the estimation of electron density in the plasmasphere region of the magnetosphere. Therefore the identification and characterisation of whistlers are important tasks to monitor the plasmasphere in real-time and to build large databases of events to be used for statistical studies. The current state of the art in detecting whistler is the Automatic Whistler Detection (AWD) method developed by Lichtenberger (2009). This method is based on image correlation in 2 dimensions and requires significant computing hardware situated at the VLF receiver antennas (e.g. in Antarctica). The aim of this work is to develop a machine learning-based model capable of automatically detecting whistlers in the data provided by the VLF receivers. The approach is to use a combination of image classification and localisation on the spectrogram data generated by the VLF receivers to identify and localise each whistler. The data at hand has around 2300 events identified by AWD at SANAE and Marion and will be used as training, validation, and testing data. Three detector designs have been proposed. The first one using a similar method to AWD, the second using image classification on regions of interest extracted from a spectrogram, and the last one using YOLO, the current state of the art in object detection. It has been shown that these detectors can achieve a misdetection and false alarm of less than 15% on Marion's dataset.

翻訳日:2023-01-04 02:51:51 公開日:2020-02-04

# グラディエント型敵攻撃に対するミニマックス防御

Minimax Defense against Gradient-based Adversarial Attacks ( http://arxiv.org/abs/2002.01256v1 )

ライセンス: Link先を確認

Blerta Lindqvist, Rauf Izmailov

(参考訳) 最先端の敵攻撃はニューラルネットワーク分類器を対象としている。ニューラルネットワークはデフォルトで勾配降下を利用して損失関数を最小化する。分類器の損失関数の勾配は、勾配に基づく敵対攻撃によって逆摂動画像を生成する。我々は、他のタイプの最適化がニューラルネットワークの分類器にエッジを与えるかどうかに疑問を呈する。本稿では,最小限の最適化を応用した新たな手法を提案する。我々のミニマックス分類器は、GANジェネレータでミニマックスゲームをする生成逆数ネットワーク(GAN)の判別器である。さらに、我々のgan生成器は、元の多様体とは異なる多様体にすべての点を投影する。我々は,MNIST, CIFAR-10, German Traffic Sign (TRAFFIC) の3つのデータセット上で, 敵攻撃Carlini Wagner (CW), DeepFool, Fast Gradient Sign Method (FGSM) を用いる。 CW攻撃に対して、我々のミニマックス防衛は98.07%(MNISTデフォルト98.93%)、73.90%(CIFAR-10デフォルト83.14%)、94.54%(TRAFFICデフォルト96.97%)を達成した。 DeepFool攻撃に対して、私たちのミニマックス防衛は98.87%(MNIST)、76.61%(CIFAR-10)、94.57%(TRAFFIC)を達成した。 FGSM攻撃に対して,97.01%(MNIST),76.79%(CIFAR-10),81.41%(TRAFFIC)を達成した。我々のMinimax対逆アプローチは、ニューラルネットワーク分類器の防御戦略に大きな変化をもたらす。

State-of-the-art adversarial attacks are aimed at neural network classifiers. By default, neural networks use gradient descent to minimize their loss function. The gradient of a classifier's loss function is used by gradient-based adversarial attacks to generate adversarially perturbed images. We pose the question whether another type of optimization could give neural network classifiers an edge. Here, we introduce a novel approach that uses minimax optimization to foil gradient-based adversarial attacks. Our minimax classifier is the discriminator of a generative adversarial network (GAN) that plays a minimax game with the GAN generator. In addition, our GAN generator projects all points onto a manifold that is different from the original manifold since the original manifold might be the cause of adversarial attacks. To measure the performance of our minimax defense, we use adversarial attacks - Carlini Wagner (CW), DeepFool, Fast Gradient Sign Method (FGSM) - on three datasets: MNIST, CIFAR-10 and German Traffic Sign (TRAFFIC). Against CW attacks, our minimax defense achieves 98.07% (MNIST-default 98.93%), 73.90% (CIFAR-10-default 83.14%) and 94.54% (TRAFFIC-default 96.97%). Against DeepFool attacks, our minimax defense achieves 98.87% (MNIST), 76.61% (CIFAR-10) and 94.57% (TRAFFIC). Against FGSM attacks, we achieve 97.01% (MNIST), 76.79% (CIFAR-10) and 81.41% (TRAFFIC). Our Minimax adversarial approach presents a significant shift in defense strategy for neural network classifiers.

翻訳日:2023-01-04 02:51:24 公開日:2020-02-04

# 情報基盤によるタスク駆動型制御の学習

Learning Task-Driven Control Policies via Information Bottlenecks ( http://arxiv.org/abs/2002.01428v1 )

ライセンス: Link先を確認

Vincent Pacelli and Anirudha Majumdar

(参考訳) 本稿では,視覚や深度などの感覚の豊富なロボットシステムに対して,タスク駆動制御ポリシを合成するための強化学習手法を提案する。標準強化学習アルゴリズムは通常、システムの状態全体とリッチなセンサー観測に制御アクションを密結合するポリシーを生成する。その結果、結果として得られるポリシーは、状態や観察(背景の色の変化など)のタスク非関連部分の変化に敏感になることが多い。対照的に、ここで紹介するアプローチは、制御アクションの計算に使われるタスク駆動表現を作成することを学びます。形式的には、これは状態とタスク駆動型表現の間の情報ボトルネックを生成するポリシー勾配スタイルのアルゴリズムを導出することで達成される。本稿では,深度画像を用いた把握タスクやRGB画像を用いた球キャッチタスクなど,複数の例を対象としたシミュレーション結果の完全セットで示す。標準方針勾配法との比較により,我々のアルゴリズムが生み出すタスク駆動型政策は,センサノイズやタスク非関連な環境変化に対して,はるかに堅牢であることが示された。

This paper presents a reinforcement learning approach to synthesizing task-driven control policies for robotic systems equipped with rich sensory modalities (e.g., vision or depth). Standard reinforcement learning algorithms typically produce policies that tightly couple control actions to the entirety of the system's state and rich sensor observations. As a consequence, the resulting policies can often be sensitive to changes in task-irrelevant portions of the state or observations (e.g., changing background colors). In contrast, the approach we present here learns to create a task-driven representation that is used to compute control actions. Formally, this is achieved by deriving a policy gradient-style algorithm that creates an information bottleneck between the states and the task-driven representation; this constrains actions to only depend on task-relevant information. We demonstrate our approach in a thorough set of simulation results on multiple examples including a grasping task that utilizes depth images and a ball-catching task that utilizes RGB images. Comparisons with a standard policy gradient approach demonstrate that the task-driven policies produced by our algorithm are often significantly more robust to sensor noise and task-irrelevant changes in the environment.

翻訳日:2023-01-04 02:50:54 公開日:2020-02-04

# DVNet:大規模脳血管再建のためのメモリ効率の良い3次元CNN

DVNet: A Memory-Efficient Three-Dimensional CNN for Large-Scale Neurovascular Reconstruction ( http://arxiv.org/abs/2002.01568v1 )

ライセンス: Link先を確認

Leila Saadatifard, Aryan Mobiny, Pavel Govyadinov, Hien Nguyen, David Mayerich

(参考訳) 脳の微細構造図は神経変性疾患などの慢性疾患による変化を含む神経機能や行動を理解するために重要である。ナイフエッジ走査顕微鏡(KESM)のような技術は、細胞内分解能で全臓器のイメージングを可能にする。しかし、マルチテラバイトのデータサイズは手動アノテーションを非現実的かつ自動的なセグメンテーションを難しくする。密集した細胞と相互接続された微小血管ネットワークを組み合わせることは、現在のセグメンテーションアルゴリズムの課題である。高スループット顕微鏡データの巨大なサイズは、高速でほとんど教師なしのアルゴリズムを必要とする。本稿では,ピクセル単位のセマンティクスセグメンテーションのための完全畳み込み型,深層,密結合型エンコーダデコーダについて検討する。深いネットワークでしばしば発生する過大なメモリの複雑さは、スキップ接続を使用して軽減され、結果としてパラメータが減少し、以前のアーキテクチャよりも大幅にパフォーマンスが向上する。提案ネットワークは,オープンソースベンチマークに適用したセマンティックセグメンテーション問題に対して,優れた性能を提供する。我々はついに細胞および微小血管のセグメンテーションのためのネットワークを実証し、臓器規模の神経血管分析の定量的測定を可能にした。

Maps of brain microarchitecture are important for understanding neurological function and behavior, including alterations caused by chronic conditions such as neurodegenerative disease. Techniques such as knife-edge scanning microscopy (KESM) provide the potential for whole organ imaging at sub-cellular resolution. However, multi-terabyte data sizes make manual annotation impractical and automatic segmentation challenging. Densely packed cells combined with interconnected microvascular networks are a challenge for current segmentation algorithms. The massive size of high-throughput microscopy data necessitates fast and largely unsupervised algorithms. In this paper, we investigate a fully-convolutional, deep, and densely-connected encoder-decoder for pixel-wise semantic segmentation. The excessive memory complexity often encountered with deep and dense networks is mitigated using skip connections, resulting in fewer parameters and enabling a significant performance increase over prior architectures. The proposed network provides superior performance for semantic segmentation problems applied to open-source benchmarks. We finally demonstrate our network for cellular and microvascular segmentation, enabling quantitative metrics for organ-scale neurovascular analysis.

翻訳日:2023-01-04 02:50:37 公開日:2020-02-04

# BOFFIN TTS:ベイズ最適化による少数ショット話者適応

BOFFIN TTS: Few-Shot Speaker Adaptation by Bayesian Optimization ( http://arxiv.org/abs/2002.01953v1 )

ライセンス: Link先を確認

Henry B.Moss, Vatsal Aggarwal, Nishant Prateek, Javier Gonz\'alez, Roberto Barra-Chicote

(参考訳) 本稿では,話者適応のための新しいアプローチであるBOFFIN TTS(Bayesian Optimization for FIne-tuning Neural Text To Speech)を提案する。ここでは、ターゲット発話の小さなコーパスを用いて、訓練済みのTSモデルを微調整し、新しい話者を模倣する。微調整制御を行うハイパーパラメータのコーパス固有の構成を必要とするような,一様適応戦略は存在しないことを実証する。ターゲット話者のハイパーパラメータ値を効率的に最適化するためにベイズ最適化を用いることで、標準手法よりも平均30%高い話者類似度で適応することができる。複数のコーパスを通して、boffin ttsは10分未満の音声を使って新しい話者を合成することを学び、ベースモデルを訓練するために使用する話者と同じ自然性を達成することが示されている。

We present BOFFIN TTS (Bayesian Optimization For FIne-tuning Neural Text To Speech), a novel approach for few-shot speaker adaptation. Here, the task is to fine-tune a pre-trained TTS model to mimic a new speaker using a small corpus of target utterances. We demonstrate that there does not exist a one-size-fits-all adaptation strategy, with convincing synthesis requiring a corpus-specific configuration of the hyper-parameters that control fine-tuning. By using Bayesian optimization to efficiently optimize these hyper-parameter values for a target speaker, we are able to perform adaptation with an average 30% improvement in speaker similarity over standard techniques. Results indicate, across multiple corpora, that BOFFIN TTS can learn to synthesize new speakers using less than ten minutes of audio, achieving the same naturalness as produced for the speakers used to train the base model.

翻訳日:2023-01-04 02:50:03 公開日:2020-02-04

# 合成経験によるDQNリプレイメモリのブートストラップ

Bootstrapping a DQN Replay Memory with Synthetic Experiences ( http://arxiv.org/abs/2002.01370v1 )

ライセンス: Link先を確認

Wenzel Baron Pilar von Pilchau and Anthony Stein and J\"org H\"ahner

(参考訳) 多くのDeep Reinforcement Learningアルゴリズムの重要なコンポーネントは、生成したエクスペリエンスの記憶機構やメモリとして機能するExperience Replayである。これらの経験はトレーニングに使われ、エージェントが問題空間を安定して完璧な軌道を見つけるのに役立ちます。しかし、古典的な体験リプレイは実際に作った経験のみを使うが、保存されたサンプルは抽出できる問題の知識という形で大きな可能性を秘めている。学習者を支援するために,非決定論的離散環境において合成経験を生成するアルゴリズムを提案する。補間されたエクスペリエンスリプレイは、フリーズレイク環境で評価され、エージェントが従来のバージョンよりも早く、さらに良く学習できるようにサポートできることが示されている。

An important component of many Deep Reinforcement Learning algorithms is the Experience Replay which serves as a storage mechanism or memory of made experiences. These experiences are used for training and help the agent to stably find the perfect trajectory through the problem space. The classic Experience Replay however makes only use of the experiences it actually made, but the stored samples bear great potential in form of knowledge about the problem that can be extracted. We present an algorithm that creates synthetic experiences in a nondeterministic discrete environment to assist the learner. The Interpolated Experience Replay is evaluated on the FrozenLake environment and we show that it can support the agent to learn faster and even better than the classic version.

翻訳日:2023-01-04 02:43:19 公開日:2020-02-04

# コストセンシティブな大マルジン分類器に対する近似マージンアプローチ

Apportioned Margin Approach for Cost Sensitive Large Margin Classifiers ( http://arxiv.org/abs/2002.01408v1 )

ライセンス: Link先を確認

Lee-Ad Gottlieb, Eran Kaufman, Aryeh Kontorovich

(参考訳) コストに敏感なマルチクラス分類の問題を考察し、より重要でないクラスを犠牲にして、重要なクラスの感度を高めたいと考えている。我々はこの問題に対処するために {\em Apportioned margin} フレームワークを採用し、同じ境界を共有するクラス間の効率的なマージンシフトを可能にする。すべてのクラス間の決定境界は、与えられた優先順位付けベクトルに従ってそれらのマージンを分割し、重要なクラスに対してより厳密なエラーを生じると同時に、全体のアウト・オブ・サンプルエラーを減少させる。フレームワークの効率的な実装の実証に加えて、一般化バウンダリの導出、フィッシャーの一貫性の実証、Mercurerのカーネルとニューラルネットワークへの適応、そしてすべてのアカウントで有望な実証結果の報告を行う。

We consider the problem of cost sensitive multiclass classification, where we would like to increase the sensitivity of an important class at the expense of a less important one. We adopt an {\em apportioned margin} framework to address this problem, which enables an efficient margin shift between classes that share the same boundary. The decision boundary between all pairs of classes divides the margin between them in accordance to a given prioritization vector, which yields a tighter error bound for the important classes while also reducing the overall out-of-sample error. In addition to demonstrating an efficient implementation of our framework, we derive generalization bounds, demonstrate Fisher consistency, adapt the framework to Mercer's kernel and to neural networks, and report promising empirical results on all accounts.

翻訳日:2023-01-04 02:43:06 公開日:2020-02-04

# ベイジアン能動差分選択による心理測定検査の高速化

Accelerating Psychometric Screening Tests With Bayesian Active Differential Selection ( http://arxiv.org/abs/2002.01547v1 )

ライセンス: Link先を確認

Trevor J. Larsen, Gustavo Malkomes, Dennis L. Barbour

(参考訳) 古典的な心理測定関数推定法は過度な測定を必要とするか、目標の心理測定関数の低分解能近似しか生成しない。本稿では,ある患者の心理計測関数推定の変化を迅速にスクリーニングする新しい方法を提案する。ベイジアン能動モデル選択を用いて、従来のオーディオグラムと異なるものかどうかを素早く見つけることを目的として、純音音響グラムの自動検査を行う。我々は,国立労働安全衛生研究所のオーディオメトリックデータを用いて,我々のアプローチを検証する。最初の結果は、2つのテストセッションの間に患者の聴力関数が高信頼で変化したかどうかを数音で検出できることを示している。

Classical methods for psychometric function estimation either require excessive measurements or produce only a low-resolution approximation of the target psychometric function. In this paper, we propose a novel solution for rapid screening for a change in the psychometric function estimation of a given patient. We use Bayesian active model selection to perform an automated pure-tone audiogram test with the goal of quickly finding if the current audiogram will be different from a previous audiogram. We validate our approach using audiometric data from the National Institute for Occupational Safety and Health NIOSH. Initial results show that with a few tones we can detect if the patient's audiometric function has changed between the two test sessions with high confidence.

翻訳日:2023-01-04 02:42:21 公開日:2020-02-04

# 大きなバッチトレーニングはウォームアップを必要としない

Large Batch Training Does Not Need Warmup ( http://arxiv.org/abs/2002.01576v1 )

ライセンス: Link先を確認

Zhouyuan Huo, Bin Gu, Heng Huang

(参考訳) 大規模なバッチサイズによるディープニューラルネットワークのトレーニングでは、有望な結果が得られ、現実世界のアプリケーションの多くにメリットがある。しかし、オプティマイザは早期にゆっくりと収束し、大規模なディープラーニング最適化ヒューリスティックと理論的基礎の間にはギャップがある。本稿では,大規模バッチトレーニングのための新しい階層型適応レートスケーリング(clars)アルゴリズムを提案する。また,勾配法の新しい微粒化解析を導入することにより,提案手法の収束率も解析する。我々は,このギャップを埋め,線形学習率のスケーリング,漸進的ウォームアップ,層幅適応率のスケーリングなど,3つの一般的な大規模バッチトレーニング手法の理論的洞察を示す。大規模な実験により,提案アルゴリズムは,ImageNetデータセット上での高度なディープニューラルネットワーク(ResNet,DenseNet,MobileNet)のトレーニングにおいて,最先端の大規模バッチオプティマイザの収束を克服し,漸進的なウォームアップ手法よりも優れていた。

Training deep neural networks using a large batch size has shown promising results and benefits many real-world applications. However, the optimizer converges slowly at early epochs and there is a gap between large-batch deep learning optimization heuristics and theoretical underpinnings. In this paper, we propose a novel Complete Layer-wise Adaptive Rate Scaling (CLARS) algorithm for large-batch training. We also analyze the convergence rate of the proposed method by introducing a new fine-grained analysis of gradient-based methods. Based on our analysis, we bridge the gap and illustrate the theoretical insights for three popular large-batch training techniques, including linear learning rate scaling, gradual warmup, and layer-wise adaptive rate scaling. Extensive experiments demonstrate that the proposed algorithm outperforms gradual warmup technique by a large margin and defeats the convergence of the state-of-the-art large-batch optimizer in training advanced deep neural networks (ResNet, DenseNet, MobileNet) on ImageNet dataset.

翻訳日:2023-01-04 02:42:10 公開日:2020-02-04

# 空中画像マッチングのための双方向アンサンブルを有する双方向対称ネットワーク

A Two-Stream Symmetric Network with Bidirectional Ensemble for Aerial Image Matching ( http://arxiv.org/abs/2002.01325v1 )

ライセンス: Link先を確認

Jae-Hyun Park, Woo-Jeoung Nam, Seong-Whan Lee

(参考訳) 本稿では,2ストリームの深層ネットワークを用いて異なる環境下で得られた2つの空中画像を正確にマッチングする手法を提案する。ネットワークは、対象画像を内部的に増強することにより、3つの入力画像で2ストリームを考慮し、トレーニングにおける追加の強化ペアを反映する。その結果、深層ネットワークのトレーニングプロセスは規則化され、そのネットワークは空中画像のばらつきに対して堅牢になる。さらに,幾何学的変換の同型性に動機付けられた双方向ネットワークに基づくアンサンブル手法を提案する。ネットワークやパラメータが加わらない2つの大域的変換パラメータが得られ、非対称なマッチング結果が軽減され、2つの結果の融合により性能が大幅に向上する。実験では,Google Earth と International Society for Photogrammetry and Remote Sensing (ISPRS) の航空画像を用いた。その結果を定量的に評価するために、マッチングの度合いを測る正しいキーポイント(PCK)メトリックの確率を適用した。定性的かつ定量的な結果は,従来の航空画像のマッチング方法と比較して,性能の差が大きいことを示している。すべてのコードとトレーニングされたモデル、およびデータセットはオンラインで利用可能です。

In this paper, we propose a novel method to precisely match two aerial images that were obtained in different environments via a two-stream deep network. By internally augmenting the target image, the network considers the two-stream with the three input images and reflects the additional augmented pair in the training. As a result, the training process of the deep network is regularized and the network becomes robust for the variance of aerial images. Furthermore, we introduce an ensemble method that is based on the bidirectional network, which is motivated by the isomorphic nature of the geometric transformation. We obtain two global transformation parameters without any additional network or parameters, which alleviate asymmetric matching results and enable significant improvement in performance by fusing two outcomes. For the experiment, we adopt aerial images from Google Earth and the International Society for Photogrammetry and Remote Sensing (ISPRS). To quantitatively assess our result, we apply the probability of correct keypoints (PCK) metric, which measures the degree of matching. The qualitative and quantitative results show the sizable gap of performance compared to the conventional methods for matching the aerial images. All code and our trained model, as well as the dataset are available online.

翻訳日:2023-01-04 02:41:51 公開日:2020-02-04

# ノード重み依存トラベルセールスパーソン問題:近似アルゴリズムとランダム探索ヒューリスティックス

The Node Weight Dependent Traveling Salesperson Problem: Approximation Algorithms and Randomized Search Heuristics ( http://arxiv.org/abs/2002.01070v1 )

ライセンス: Link先を確認

Jakob Bossek, Katrin Casel, Pascal Kerschke and Frank Neumann

(参考訳) 車両経路の領域におけるいくつかの重要な最適化問題は、古典的旅行販売問題(TSP)の変種と見なすことができる。進化計算の分野では,過去5年間で旅行盗難問題(TTP)の関心が高まっている。本稿では,旅行中に訪れたノードの重みに対して移動コストが増加するという観点から,このような問題に対する重みの影響について検討する。これにより、トラベリング・ティーフ問題や時間依存のTSP変種といった重要なTSP変種を抽象化し、重量依存による難易度の増加を正確に研究することができる。計量距離と有界正の重みを持つTSPのこの重み依存バージョンに対する3.59近似を提供する。さらに、古典的突然変異演算子と、重み付きTSPに適応した最先端進化アルゴリズムEAXの2つの変種を用いて、単純なランダム化局所探索実験を行った。その結果,ノード重みがツアー中のノードの位置に与える影響が示唆された。

Several important optimization problems in the area of vehicle routing can be seen as a variant of the classical Traveling Salesperson Problem (TSP). In the area of evolutionary computation, the traveling thief problem (TTP) has gained increasing interest over the last 5 years. In this paper, we investigate the effect of weights on such problems, in the sense that the cost of traveling increases with respect to the weights of nodes already visited during a tour. This provides abstractions of important TSP variants such as the Traveling Thief Problem and time dependent TSP variants, and allows to study precisely the increase in difficulty caused by weight dependence. We provide a 3.59-approximation for this weight dependent version of TSP with metric distances and bounded positive weights. Furthermore, we conduct experimental investigations for simple randomized local search with classical mutation operators and two variants of the state-of-the-art evolutionary algorithm EAX adapted to the weighted TSP. Our results show the impact of the node weights on the position of the nodes in the resulting tour.

翻訳日:2023-01-04 02:41:32 公開日:2020-02-04

# 大規模分散トレーニングにおける効率向上

Improving Efficiency in Large-Scale Decentralized Distributed Training ( http://arxiv.org/abs/2002.01119v1 )

ライセンス: Link先を確認

Wei Zhang, Xiaodong Cui, Abdullah Kayi, Mingrui Liu, Ulrich Finkler, Brian Kingsbury, George Saon, Youssef Mroueh, Alper Buyuktosunoglu, Payel Das, David Kung, Michael Picheny

(参考訳) Decentralized Parallel SGD (D-PSGD) と非同期型 Asynchronous Parallel SGD (AD-PSGD) は分散学習アルゴリズムの一群であり、大規模深層学習に有効である。 A)D-PSGDの欠点は、混合行列のスペクトルギャップがシステム内の学習者の数が増えると減少し、ハマーが収束することである。本稿では,通信コストを最小化しつつスペクトルギャップを改善し,(a)d-psgdに基づくトレーニングを高速化する手法について検討する。提案手法の有効性を示すために,2000時間Switchboard音声認識タスクとImageNetコンピュータビジョンタスクの実験を行った。 IBM P9 スーパーコンピュータ上では,Hav5-2000 Switchboard (SWB) テストセットで7.5% WER,CallHome (CH) テストセットで13.3% WER,SWB で7.7% WER で1.98時間,CH で128 V100 GPU で13.3% WER で2.28時間,LSTM 音響モデルをトレーニングすることができる。

Decentralized Parallel SGD (D-PSGD) and its asynchronous variant Asynchronous Parallel SGD (AD-PSGD) is a family of distributed learning algorithms that have been demonstrated to perform well for large-scale deep learning tasks. One drawback of (A)D-PSGD is that the spectral gap of the mixing matrix decreases when the number of learners in the system increases, which hampers convergence. In this paper, we investigate techniques to accelerate (A)D-PSGD based training by improving the spectral gap while minimizing the communication cost. We demonstrate the effectiveness of our proposed techniques by running experiments on the 2000-hour Switchboard speech recognition task and the ImageNet computer vision task. On an IBM P9 supercomputer, our system is able to train an LSTM acoustic model in 2.28 hours with 7.5% WER on the Hub5-2000 Switchboard (SWB) test set and 13.3% WER on the CallHome (CH) test set using 64 V100 GPUs and in 1.98 hours with 7.7% WER on SWB and 13.3% WER on CH using 128 V100 GPUs, the fastest training time reported to date.

翻訳日:2023-01-04 02:41:17 公開日:2020-02-04

# Feature-Rich biLSTMモデルによるアラビア語の発音回復

Arabic Diacritic Recovery Using a Feature-Rich biLSTM Model ( http://arxiv.org/abs/2002.01207v1 )

ライセンス: Link先を確認

Kareem Darwish, Ahmed Abdelali, Hamdy Mubarak, Mohamed Eldesouki

(参考訳) 方言(短母音)は通常アラビア文字を書く際に省略され、読み手はそれらを正しく発音するために再導入する必要がある。アラビア語のダイアクリティカルは2種類あり、第1は語彙選択を規定するコアワードダイアクリティカル(CW)、第2はケースエンディング(CE)であり、通常は語幹の端に現れ、一般的にそれらの構文的役割を規定する。 CEのリカバリは、単語間の依存関係のため、コアワードのダイアクリティカルティクスを回復するよりも比較的難しい。本稿では,言語的特徴と表面的特徴を多用した機能豊富なリカレントニューラルネットワークモデルを用いて,コアワードのダイアクリティカルティクスとケースエンディングの両方を復元する。本モデルでは,従来の2.86\%のcwエラー率と3.7%のceエラー率 (ceer) と2.2%のcwerと2.5%の古典アラビア語 (ca) のceをそれぞれ上回っている。ダイアクリッド化した単語コアとケースエンディングとを組み合わせると、それぞれMSAとCAそれぞれ6.0%と4.3%となる。これは、そのような深層神経モデルに対する機能工学の有効性を強調している。

Diacritics (short vowels) are typically omitted when writing Arabic text, and readers have to reintroduce them to correctly pronounce words. There are two types of Arabic diacritics: the first are core-word diacritics (CW), which specify the lexical selection, and the second are case endings (CE), which typically appear at the end of the word stem and generally specify their syntactic roles. Recovering CEs is relatively harder than recovering core-word diacritics due to inter-word dependencies, which are often distant. In this paper, we use a feature-rich recurrent neural network model that uses a variety of linguistic and surface-level features to recover both core word diacritics and case endings. Our model surpasses all previous state-of-the-art systems with a CW error rate (CWER) of 2.86\% and a CE error rate (CEER) of 3.7% for Modern Standard Arabic (MSA) and CWER of 2.2% and CEER of 2.5% for Classical Arabic (CA). When combining diacritized word cores with case endings, the resultant word error rate is 6.0% and 4.3% for MSA and CA respectively. This highlights the effectiveness of feature engineering for such deep neural models.

翻訳日:2023-01-04 02:34:05 公開日:2020-02-04

# テキスト分類コーパスの拡張のための反復データプログラミング

Iterative Data Programming for Expanding Text Classification Corpora ( http://arxiv.org/abs/2002.01412v1 )

ライセンス: Link先を確認

Neil Mallinar, Abhishek Shah, Tin Kam Ho, Rajendra Ugrani, Ayush Gupta

(参考訳) 実世界のテキスト分類タスクは、しばしば取得するのに高価なラベル付きトレーニング例を必要とする。機械教育の最近の進歩、特にデータプログラミングパラダイムは、ラベリング関数(英語版)として知られる弱いモデルを構築するための一般的なフレームワークを通じてデータセットを迅速に作成し、アンサンブル学習技術によってそれらを認知する。本稿では,近傍の弱モデル生成を最小限の監督で行うことで,テキストデータセットの強化を図るための,高速で簡単なデータプログラミング手法を提案する。さらに,本手法では,大量の未ラベルデータから疎分散なサンプルを同定する反復的手法を用いる。反復型データプログラミング技術は、よりラベル付きデータが人間のループで確認されるので、新しい弱いモデルを改善する。会話エージェントの意図認識を改善するタスクを含む,文分類作業における経験的結果を示す。

Real-world text classification tasks often require many labeled training examples that are expensive to obtain. Recent advancements in machine teaching, specifically the data programming paradigm, facilitate the creation of training data sets quickly via a general framework for building weak models, also known as labeling functions, and denoising them through ensemble learning techniques. We present a fast, simple data programming method for augmenting text data sets by generating neighborhood-based weak models with minimal supervision. Furthermore, our method employs an iterative procedure to identify sparsely distributed examples from large volumes of unlabeled data. The iterative data programming techniques improve newer weak models as more labeled data is confirmed with human-in-loop. We show empirical results on sentence classification tasks, including those from a task of improving intent recognition in conversational agents.

翻訳日:2023-01-04 02:33:39 公開日:2020-02-04

# オンデバイス自然言語処理のための軽量畳み込み表現

Lightweight Convolutional Representations for On-Device Natural Language Processing ( http://arxiv.org/abs/2002.01535v1 )

ライセンス: Link先を確認

Shrey Desai, Geoffrey Goh, Arun Babu, Ahmed Aly

(参考訳) ディープニューラルネットワークの計算とメモリの複雑さの増大により、低リソースの電子機器(携帯電話、タブレット、ウェアラブルなど)へのデプロイが困難になった。これらの懸念に対処するために多くのモデル圧縮手法を開発したが、入力表現自体を凝縮したものはほとんどない。本研究では,任意のニューラルモデルにスワップできる高速で正確で軽量な畳み込み表現法を提案する。さらに、Samsung Galaxy S9のリソース中心のメトリクス(例えば、モデルファイルサイズ、レイテンシ、メモリ使用量)を考慮すると、リカレント表現よりも利得を示す。

The increasing computational and memory complexities of deep neural networks have made it difficult to deploy them on low-resource electronic devices (e.g., mobile phones, tablets, wearables). Practitioners have developed numerous model compression methods to address these concerns, but few have condensed input representations themselves. In this work, we propose a fast, accurate, and lightweight convolutional representation that can be swapped into any neural model and compressed significantly (up to 32x) with a negligible reduction in performance. In addition, we show gains over recurrent representations when considering resource-centric metrics (e.g., model file size, latency, memory usage) on a Samsung Galaxy S9.

翻訳日:2023-01-04 02:33:25 公開日:2020-02-04

# HVACシステム故障検出のための伝達学習

Transfer Learning for HVAC System Fault Detection ( http://arxiv.org/abs/2002.01060v1 )

ライセンス: Link先を確認

Chase P. Dowling and Baosen Zhang

(参考訳) 空調システムの故障は建物の熱的快適性とエネルギー効率を低下させ、研究コミュニティから大きな注目を集め、データ駆動方式が人気を集めている。しかし、通常の運用状態と故障状態のようなラベル付きデータの欠如は、HVACシステムへの機械学習の適用を遅らせている。加えて、特定の建物では、適切な時間をかけて訓練を行うには、観測された欠陥の数が不十分な場合もあります。これらの課題を克服するために,通常の操作と故障操作を区別する新しいベイズ分類器の転送手法を提案する。鍵となるのは、この分類器を大量のセンサーと故障データ(例えばシミュレーションや標準テストデータ)で建物で訓練し、その分類器を新しい建物から少量の通常の操作データを使って新しい建物に転送することである。異なる気候における建築的類似の建物間で分類器を転送するための概念実証を行い,分類精度とリコールの維持に必要なサンプルは少ないことを示した。

Faults in HVAC systems degrade thermal comfort and energy efficiency in buildings and have received significant attention from the research community, with data driven methods gaining in popularity. Yet the lack of labeled data, such as normal versus faulty operational status, has slowed the application of machine learning to HVAC systems. In addition, for any particular building, there may be an insufficient number of observed faults over a reasonable amount of time for training. To overcome these challenges, we present a transfer methodology for a novel Bayesian classifier designed to distinguish between normal operations and faulty operations. The key is to train this classifier on a building with a large amount of sensor and fault data (for example, via simulation or standard test data) then transfer the classifier to a new building using a small amount of normal operations data from the new building. We demonstrate a proof-of-concept for transferring a classifier between architecturally similar buildings in different climates and show few samples are required to maintain classification precision and recall.

翻訳日:2023-01-04 02:33:11 公開日:2020-02-04

# boostingによる効率的、ノイズ耐性、プライベートラーニング

Efficient, Noise-Tolerant, and Private Learning via Boosting ( http://arxiv.org/abs/2002.01100v1 )

ライセンス: Link先を確認

Mark Bun, Marco Leandro Carmosino, Jessica Sorrell

(参考訳) プライベートブースティングアルゴリズムを設計するためのシンプルなフレームワークを導入する。我々はこれらのアルゴリズムが差分プライベートで、効率的で、耐雑音性のあるPAC学習者である自然条件を与える。この枠組みを実証するために,標本複雑性が次元に依存しない大規模半空間に対して,雑音耐性およびプライベートpac学習器を構築する。大数学のハーフスペース学習者に2つのサンプル複雑性境界を与えます。 1つの境界は差分プライバシーのみに基づいており、この保証を一般化を保証するための資産として利用する。この最初の境界は、独立した関心を持つかもしれないプライバシーからpac学習者を得る一般的な方法を示している。第2境界は、大マルジン分類理論(脂肪散乱次元)の標準手法を用いて、大マルジンハーフスペースの微分プライベート学習において最もよく知られたサンプルの複雑さと一致し、さらにランダムラベルノイズを許容する。

We introduce a simple framework for designing private boosting algorithms. We give natural conditions under which these algorithms are differentially private, efficient, and noise-tolerant PAC learners. To demonstrate our framework, we use it to construct noise-tolerant and private PAC learners for large-margin halfspaces whose sample complexity does not depend on the dimension. We give two sample complexity bounds for our large-margin halfspace learner. One bound is based only on differential privacy, and uses this guarantee as an asset for ensuring generalization. This first bound illustrates a general methodology for obtaining PAC learners from privacy, which may be of independent interest. The second bound uses standard techniques from the theory of large-margin classification (the fat-shattering dimension) to match the best known sample complexity for differentially private learning of large-margin halfspaces, while additionally tolerating random label noise.

翻訳日:2023-01-04 02:32:54 公開日:2020-02-04

# ケイリー変換によるスティーフェル多様体の効率的なリーマン最適化

Efficient Riemannian Optimization on the Stiefel Manifold via the Cayley Transform ( http://arxiv.org/abs/2002.01113v1 )

ライセンス: Link先を確認

Jun Li, Li Fuxin, Sinisa Todorovic

(参考訳) パラメータ行列に厳密な正規性制約を課すことは、ディープラーニングにおいて有利であることが示されている。これは、スティフェル多様体上のリーマン最適化に相当するが、計算上は高価である。この課題に対処するために、(1) 最適化更新のための反復ケイリー変換に基づく新しい効率的なリトラクションマップ、(2) モーメントの射影とスティーフェル多様体上のケイリー変換の組み合わせに基づく暗黙的なベクトル輸送機構を提案する。モーメントを持つケイリーSGDと、スティーフェル多様体上のケイリーADAMの2つの新しい最適化アルゴリズムを指定する。 Cayley SGDの収束性は理論的に解析される。 cnnトレーニングの実験ではどちらのアルゴリズムも (a)CNNパラメータの正規性を強制する既存のアプローチと比較してイテレーション毎の実行時間の削減。 b) CNNの性能を損なうことなく, ベースラインSGDおよびADAMアルゴリズムよりも高速収束率を得る。 Cayley SGDとCayley ADAMもまた、RNNのユニタリ遷移行列を最適化するためのトレーニング時間を短縮することを示した。

Strictly enforcing orthonormality constraints on parameter matrices has been shown advantageous in deep learning. This amounts to Riemannian optimization on the Stiefel manifold, which, however, is computationally expensive. To address this challenge, we present two main contributions: (1) A new efficient retraction map based on an iterative Cayley transform for optimization updates, and (2) An implicit vector transport mechanism based on the combination of a projection of the momentum and the Cayley transform on the Stiefel manifold. We specify two new optimization algorithms: Cayley SGD with momentum, and Cayley ADAM on the Stiefel manifold. Convergence of Cayley SGD is theoretically analyzed. Our experiments for CNN training demonstrate that both algorithms: (a) Use less running time per iteration relative to existing approaches that enforce orthonormality of CNN parameters; and (b) Achieve faster convergence rates than the baseline SGD and ADAM algorithms without compromising the performance of the CNN. Cayley SGD and Cayley ADAM are also shown to reduce the training time for optimizing the unitary transition matrices in RNNs.

翻訳日:2023-01-04 02:32:37 公開日:2020-02-04

# GANにおける肯定的非ラベル分類について

On Positive-Unlabeled Classification in GAN ( http://arxiv.org/abs/2002.01136v1 )

ライセンス: Link先を確認

Tianyu Guo, Chang Xu, Jiajun Huang, Yunhe Wang, Boxin Shi, Chao Xu, Dacheng Tao

(参考訳) 本稿では,標準GANの正・負の分類問題を定義し,その上で,識別器のトレーニングを安定化させる新しい手法を提案する。伝統的に、生成データは負である間、実際のデータは正とみなす。この正負の分類基準は, 実データよりも現実的であっても, 生成データの品質を徐々に向上させることなく, 判別器の学習過程を通じて常に固定された。対照的に、生成したデータをラベルなしとして扱う方が合理的であり、品質に応じて正または負の値になる可能性がある。判別器はこの正・未ラベルの分類問題に対する分類器であり、新しい正の無ラベルGAN(PUGAN)を導出する。提案モデルが達成する大域的最適性と同等の最適化目標について理論的に考察する。 PUGANは、これらの高度な判別器安定化手法と同等またはそれ以上の性能を達成できる。

This paper defines a positive and unlabeled classification problem for standard GANs, which then leads to a novel technique to stabilize the training of the discriminator in GANs. Traditionally, real data are taken as positive while generated data are negative. This positive-negative classification criterion was kept fixed all through the learning process of the discriminator without considering the gradually improved quality of generated data, even if they could be more realistic than real data at times. In contrast, it is more reasonable to treat the generated data as unlabeled, which could be positive or negative according to their quality. The discriminator is thus a classifier for this positive and unlabeled classification problem, and we derive a new Positive-Unlabeled GAN (PUGAN). We theoretically discuss the global optimality the proposed model will achieve and the equivalent optimization goal. Empirically, we find that PUGAN can achieve comparable or even better performance than those sophisticated discriminator stabilization methods.

翻訳日:2023-01-04 02:31:54 公開日:2020-02-04

# マルコフ雑音による線形2時間確率近似の有限時間解析

Finite Time Analysis of Linear Two-timescale Stochastic Approximation with Markovian Noise ( http://arxiv.org/abs/2002.01268v1 )

ライセンス: Link先を確認

Maxim Kaledin, Eric Moulines, Alexey Naumov, Vladislav Tadic, Hoi-To Wai

(参考訳) 線形2時間スケール確率近似(SA)スキームは、特に政策評価問題において強化学習(RL)で人気を博したアルゴリズムの重要なクラスである。近年、このスキームの有限時間解析の確立に、特に実際にユビキタスなマルコフ(非i.i.d.)ノイズ設定の下で、多くの研究がなされている。本稿では,線形2時間スケールSAの有限時間解析について述べる。我々の境界はマルコフとマルティンゲールノイズの収束速度に差がないことを示しているが、定数のみがマルコフ連鎖の混合時間に影響されている。適切なステップサイズスケジュールでは、期待されるエラーバウンドの過渡項は$o(1/k^c)$であり、定常項は${\cal o}(1/k)$であり、ここで$c>1$と$k$はイテレーション番号である。さらに、期待誤差の漸近的拡大を示し、一致する下限が$\omega(1/k)$ であることを示す。我々の理論を支持するため、簡単な数値実験を行う。

Linear two-timescale stochastic approximation (SA) scheme is an important class of algorithms which has become popular in reinforcement learning (RL), particularly for the policy evaluation problem. Recently, a number of works have been devoted to establishing the finite time analysis of the scheme, especially under the Markovian (non-i.i.d.) noise settings that are ubiquitous in practice. In this paper, we provide a finite-time analysis for linear two timescale SA. Our bounds show that there is no discrepancy in the convergence rate between Markovian and martingale noise, only the constants are affected by the mixing time of the Markov chain. With an appropriate step size schedule, the transient term in the expected error bound is $o(1/k^c)$ and the steady-state term is ${\cal O}(1/k)$, where $c>1$ and $k$ is the iteration number. Furthermore, we present an asymptotic expansion of the expected error with a matching lower bound of $\Omega(1/k)$. A simple numerical experiment is presented to support our theory.

翻訳日:2023-01-04 02:31:13 公開日:2020-02-04

# ビジュアルコンセプト-メタコンセプト学習

Visual Concept-Metaconcept Learning ( http://arxiv.org/abs/2002.01464v1 )

ライセンス: Link先を確認

Chi Han, Jiayuan Mao, Chuang Gan, Joshua B. Tenenbaum, Jiajun Wu

(参考訳) 視覚的な入力から赤と緑を認識し、それらがオブジェクトの同じ性質(つまり色)を記述することも理解している。本稿では,画像と関連する質問応答対から概念とメタ概念を共同学習するための視覚概念メタコンセプタ(VCML)を提案する。キーとなるのは,視覚概念とメタ概念の双方向接続を活用することだ。視覚的表現は、見当たらない概念のペア間の関係を予測するための基礎となる手がかりを提供する。赤と緑がオブジェクトの同じ性質を記述していることを知ると、立方体と球面がオブジェクトの形状を分類するため、オブジェクトの同じ性質も記述しているという事実を一般化する。一方、メタコンセプトに関する知識は、限られた、騒々しい、バイアスのあるデータから視覚的な概念を学ぶのに役立ちます。紫のキューブの例から、新しい色の紫はキューブの形ではなくキューブの色に似ています。合成および実世界の両方のデータセットの評価は、我々の主張を検証する。

Humans reason with concepts and metaconcepts: we recognize red and green from visual input; we also understand that they describe the same property of objects (i.e., the color). In this paper, we propose the visual concept-metaconcept learner (VCML) for joint learning of concepts and metaconcepts from images and associated question-answer pairs. The key is to exploit the bidirectional connection between visual concepts and metaconcepts. Visual representations provide grounding cues for predicting relations between unseen pairs of concepts. Knowing that red and green describe the same property of objects, we generalize to the fact that cube and sphere also describe the same property of objects, since they both categorize the shape of objects. Meanwhile, knowledge about metaconcepts empowers visual concept learning from limited, noisy, and even biased data. From just a few examples of purple cubes we can understand a new color purple, which resembles the hue of the cubes instead of the shape of them. Evaluation on both synthetic and real-world datasets validates our claims.

翻訳日:2023-01-04 02:25:14 公開日:2020-02-04

# グラフィカル相互情報最大化によるグラフ表現学習

Graph Representation Learning via Graphical Mutual Information Maximization ( http://arxiv.org/abs/2002.01169v1 )

ライセンス: Link先を確認

Zhen Peng, Wenbing Huang, Minnan Luo, Qinghua Zheng, Yu Rong, Tingyang Xu, Junzhou Huang

(参考訳) ソーシャルネットワークやコミュニケーションネットワークといった様々な情報ネットワークの内容の豊かさは、外部の監督なしに高品質な表現を学ぶ前例のない可能性をもたらす。本稿では,グラフ構造データからの豊富な情報を,教師なしの方法で埋め込み空間に保存し,抽出する方法を検討する。そこで本研究では,入力グラフとハイレベル隠れ表現との相関を測定するための新しい概念であるグラフィカル相互情報(gmi)を提案する。 gmiはベクトル空間からグラフ領域への従来の相互情報計算の考え方を一般化し、ノードの特徴と位相構造から相互情報を測定することは不可欠である。まず、既存のグラフ表現学習アルゴリズムでは避けられない制約である入力グラフの同型変換に不変であり、MINEのような現在の相互情報推定手法によって効率的に推定・最大化することができる。 GMIの助けを借りて、グラフニューラルエンコーダの入力と出力の間でGMIを最大化することで訓練された教師なし学習モデルを開発する。トランスダクティブおよびインダクティブノード分類およびリンク予測に関する検討実験により,本手法は最先端の教師なし手法よりも優れ,時には教師なし手法よりも優れることが示された。

The richness in the content of various information networks such as social networks and communication networks provides the unprecedented potential for learning high-quality expressive representations without external supervision. This paper investigates how to preserve and extract the abundant information from graph-structured data into embedding space in an unsupervised manner. To this end, we propose a novel concept, Graphical Mutual Information (GMI), to measure the correlation between input graphs and high-level hidden representations. GMI generalizes the idea of conventional mutual information computations from vector space to the graph domain where measuring mutual information from two aspects of node features and topological structure is indispensable. GMI exhibits several benefits: First, it is invariant to the isomorphic transformation of input graphs---an inevitable constraint in many existing graph representation learning algorithms; Besides, it can be efficiently estimated and maximized by current mutual information estimation methods such as MINE; Finally, our theoretical analysis confirms its correctness and rationality. With the aid of GMI, we develop an unsupervised learning model trained by maximizing GMI between the input and output of a graph neural encoder. Considerable experiments on transductive as well as inductive node classification and link prediction demonstrate that our method outperforms state-of-the-art unsupervised counterparts, and even sometimes exceeds the performance of supervised ones.

翻訳日:2023-01-04 02:24:00 公開日:2020-02-04

# コンパクトパターン表現のための整数重み付き節付き回帰tsetlinマシン

A Regression Tsetlin Machine with Integer Weighted Clauses for Compact Pattern Representation ( http://arxiv.org/abs/2002.01245v1 )

ライセンス: Link先を確認

K. Darshana Abeyrathna, Ole-Christoffer Granmo, Morten Goodwin

(参考訳) Regression Tsetlin Machine (RTM)は、最先端の非線形回帰モデルの解釈可能性の欠如に対処する。これは命題論理の連結節を使用して、データ内の非線形頻繁なパターンをキャプチャすることで実現される。しかし、これらは線型回帰関数と同様の和を通じて連続的な出力に結合され、非線型成分とユニタリ重みを持つ。 RTMは競合精度で非線形回帰問題を解くが、出力の解像度は使用される節数に比例する。これは、計算コストが解像度によって増加することを意味する。この問題を解決するため、整数重み付きRTM節を導入する。我々の整数重み付き節(integer weighted clause)は、同一のサブパターン-nリピート節をキャプチャした複数の節のコンパクトな表現であり、整数重み n で 1 に変換される。さらに,文節と重みの両方を同時に学習する,いわゆる確率探索を活かした新しい学習手法を提案する。整数重み付きrtmのポテンシャルを6つの人工データセットを用いて実証的に評価する。その結果、整数重み付きRTMは通常のRTMに比べて計算資源が大幅に少ないため、同等以上の精度で取得できることがわかった。さらに,整数重み付けにより実数値よりも精度が向上することを示す。

The Regression Tsetlin Machine (RTM) addresses the lack of interpretability impeding state-of-the-art nonlinear regression models. It does this by using conjunctive clauses in propositional logic to capture the underlying non-linear frequent patterns in the data. These, in turn, are combined into a continuous output through summation, akin to a linear regression function, however, with non-linear components and unity weights. Although the RTM has solved non-linear regression problems with competitive accuracy, the resolution of the output is proportional to the number of clauses employed. This means that computation cost increases with resolution. To reduce this problem, we here introduce integer weighted RTM clauses. Our integer weighted clause is a compact representation of multiple clauses that capture the same sub-pattern-N repeating clauses are turned into one, with an integer weight N. This reduces computation cost N times, and increases interpretability through a sparser representation. We further introduce a novel learning scheme that allows us to simultaneously learn both the clauses and their weights, taking advantage of so-called stochastic searching on the line. We evaluate the potential of the integer weighted RTM empirically using six artificial datasets. The results show that the integer weighted RTM is able to acquire on par or better accuracy using significantly less computational resources compared to regular RTMs. We further show that integer weights yield improved accuracy over real-valued ones.

翻訳日:2023-01-04 02:23:36 公開日:2020-02-04

# スパイクニューラルネットワークのサイズとレジリエンスの多目的最適化

Multi-Objective Optimization for Size and Resilience of Spiking Neural Networks ( http://arxiv.org/abs/2002.01406v1 )

ライセンス: Link先を確認

Mihaela Dimovska, Travis Johnston, Catherine D. Schuman, J. Parker Mitchell, Thomas E. Potok

(参考訳) 脳の接続メカニズムにインスパイアされたニューロモルフィックコンピューティングアーキテクチャは、シリコンのスパイクニューラルネットワーク(snn)をモデル化する。そのため、ニューロモルフィックアーキテクチャは、制御や機械学習タスクを実行できる小型で低消費電力のチップを目標として設計・開発されている。しかし、開発したハードウェアの消費電力は、チップ上で評価されているネットワークのサイズに大きく依存する。さらに、チップ上で評価されたトレーニングされたSNNの精度は、ネットワークの学習重量を乱すハードウェアの電圧と電流の変動によって変化する可能性がある。ハードウェア側でこれらの混乱を最小限に抑える努力が行われているが、デプロイされたネットワークをよりレジリエンスにするためのソフトウェアベースの戦略は、この問題をさらに緩和するのに役立つ。本研究では,スパイキングニューラルネットワークを2つのニューロモルフィックアーキテクチャの実装に適用し,そのサイズを小さくすると同時に,ハードウェア故障に対する耐性を高めることを目的とした。進化的アルゴリズムを利用してSNNを訓練し、SNNのサイズとレジリエンスを最適化する多目的フィットネス関数を提案する。この戦略がハードウェアの欠点に対してより回復力のある、高性能で小型のネットワークに繋がることを示す。

Inspired by the connectivity mechanisms in the brain, neuromorphic computing architectures model Spiking Neural Networks (SNNs) in silicon. As such, neuromorphic architectures are designed and developed with the goal of having small, low power chips that can perform control and machine learning tasks. However, the power consumption of the developed hardware can greatly depend on the size of the network that is being evaluated on the chip. Furthermore, the accuracy of a trained SNN that is evaluated on chip can change due to voltage and current variations in the hardware that perturb the learned weights of the network. While efforts are made on the hardware side to minimize those perturbations, a software based strategy to make the deployed networks more resilient can help further alleviate that issue. In this work, we study Spiking Neural Networks in two neuromorphic architecture implementations with the goal of decreasing their size, while at the same time increasing their resiliency to hardware faults. We leverage an evolutionary algorithm to train the SNNs and propose a multiobjective fitness function to optimize the size and resiliency of the SNN. We demonstrate that this strategy leads to well-performing, small-sized networks that are more resilient to hardware faults.

翻訳日:2023-01-04 02:22:18 公開日:2020-02-04

PDF登録状況（公開日: 20200204）