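Fugu-MT: arXiv Paper Translations

This site provides Japanese translations of arXiv papers that are 30 pages or fewer and released under a Creative Commons license (CC 0, CC BY, CC BY-SA). For papers whose body text is not CC-licensed, or that are too long, only the metadata is translated. (arXiv metadata is CC 0.) The translations are licensed under CC BY-SA 4.0. Translation is performed with Fugu-Machine Translator.

The operator of this site does not guarantee the quality of this site (including all information and translations) and accepts no responsibility for any consequences arising from the use of this site (including all information and translations).

Papers with a publication date of 2020-01-19.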

Title	Authors	Abstract	Publication date / Translation date
# ディープ拡散過程を用いたイベントデータからの動的およびパーソナライズされた共生ネットワークの学習 Learning Dynamic and Personalized Comorbidity Networks from Event Data using Deep Diffusion Processes ( http://arxiv.org/abs/2001.02585v2 ) ライセンス: Link先を確認	Zhaozhi Qian, Ahmed M. Alaa, Alexis Bellot, Jem Rashbass, Mihaela van der Schaar	(参考訳) コンコルビンド病は、個人によって異なる複雑な時間的パターンを通じて発生し進行する。電子的な健康記録では、患者が持つ異なる疾患を観察できるが、それぞれの共同症状の時間的関係を推測できる。事象データからこのような時間的パターンを学習することは、疾患の病態を理解し、予後を予測するのに不可欠である。この目的のために我々は,ダイナミックグラフで表現された共生病発症者間の時間的関係をモデル化する深層拡散過程(ddp)を開発した。 ddpは、動的重み付きグラフのエッジによってパラメータ化された強度関数を持つ多次元点過程としてモデル化されたイベントを含む。グラフ構造は、患者の履歴をエッジウェイトにマッピングするニューラルネットワークによって変調され、疾患軌跡の豊かな時間的表現を可能にする。 DDPパラメータは臨床的に有意義な構成要素に分離され、正確なリスク予測と疾患病理の理解不能な表現の両目的に役立てることができる。癌登録データを用いた実験でこれらの特徴を説明する。 Comorbid diseases co-occur and progress via complex temporal patterns that vary among individuals. In electronic health records we can observe the different diseases a patient has, but can only infer the temporal relationship between each co-morbid condition. Learning such temporal patterns from event data is crucial for understanding disease pathology and predicting prognoses. To this end, we develop deep diffusion processes (DDP) to model "dynamic comorbidity networks", i.e., the temporal relationships between comorbid disease onsets expressed through a dynamic graph. A DDP comprises events modelled as a multi-dimensional point process, with an intensity function parameterized by the edges of a dynamic weighted graph. The graph structure is modulated by a neural network that maps patient history to edge weights, enabling rich temporal representations for disease trajectories. The DDP parameters decouple into clinically meaningful components, which enables serving the dual purpose of accurate risk prediction and intelligible representation of disease pathology. We illustrate these features in experiments using cancer registry data.	翻訳日:2023-01-13 09:40:16 公開日:2020-01-19
# 制限ボルツマンマシンのモード支援非教師なし学習 Mode-Assisted Unsupervised Learning of Restricted Boltzmann Machines ( http://arxiv.org/abs/2001.05559v2 ) ライセンス: Link先を確認	Haik Manukian, Yan Ru Pei, Sean R.B. Bearden, Massimiliano Di Ventra	(参考訳) 制限ボルツマン機械(RBM)は、強力な生成モデルのクラスであるが、それらの訓練では、典型的な損失関数の教師付きバックプロパゲーションとは異なり、近似も困難である。本稿では,標準勾配更新をrbm基底状態(モード)のサンプルから構築したオフグレード方向と適切に組み合わせることにより,従来の勾配法よりも劇的にトレーニングが向上することを示す。モードトレーニングと呼ばれるこのアプローチは、収束相対エントロピー(KL分散)の低下に加えて、より高速なトレーニングと安定性を促進する。この手法の安定性と収束性の証明とともに、KLの発散を正確に計算できる合成データセットや、より大規模な機械学習標準であるMNISTにも有効性を示す。我々の提案するモードトレーニングは、任意の勾配法とともに適用でき、深層、畳み込み、非制限ボルツマン機械のようなより一般的なエネルギーベースのニューラルネットワーク構造に容易に拡張できるため、非常に多様である。 Restricted Boltzmann machines (RBMs) are a powerful class of generative models, but their training requires computing a gradient that, unlike supervised backpropagation on typical loss functions, is notoriously difficult even to approximate. Here, we show that properly combining standard gradient updates with an off-gradient direction, constructed from samples of the RBM ground state (mode), improves their training dramatically over traditional gradient methods. This approach, which we call mode training, promotes faster training and stability, in addition to lower converged relative entropy (KL divergence). Along with the proofs of stability and convergence of this method, we also demonstrate its efficacy on synthetic datasets where we can compute KL divergences exactly, as well as on a larger machine learning standard, MNIST. The mode training we suggest is quite versatile, as it can be applied in conjunction with any given gradient method, and is easily extended to more general energy-based neural network structures such as deep, convolutional and unrestricted Boltzmann machines.	翻訳日:2023-01-11 05:56:51 公開日:2020-01-19
# 各種NISQデバイスにおける文脈知覚・時間認識ビットマッピング Context-Sensitive and Duration-Aware Qubit Mapping for Various NISQ Devices ( http://arxiv.org/abs/2001.06887v1 ) ライセンス: Link先を確認	Yu Zhang and Haowei Deng and Quanxi Li	(参考訳) 量子コンピューティング(QC)技術はこの10年で第二のルネッサンスに達した。完全にプログラム可能なQCデバイスは超伝導またはイオントラップ技術に基づいて構築されている。異なる量子技術には独自のパラメータ指標があるが、NISQ時代のQCデバイスは、量子ビットと接続の制限、短いコヒーレンス時間、高いゲートエラー率といった共通の特徴と課題を共有している。プログラマが書いた量子プログラムは、2量子ゲートがいくつかの量子ビットで許可されるため、実際のハードウェア上で直接動作することはほとんどできなかった。したがって、量子コンピューティングコンパイラはマッピング問題を解決し、ハードウェアの限界に合うように元のプログラムを変換しなければならない。上記の問題に対処するため、異なる量子技術を要約し、量子抽象機械(QAM)を抽象的に定義し、次にQAMに基づくContext-sensitive and Duration-Aware Remapping Algorithm(Codar)を提案する。キュービット毎にロックを導入することで、Codarはゲート長差とプログラムコンテキストを認識し、より多くのプログラムの並列性を抽出し、プログラムの実行時間を短縮することができる。最もよく知られているアルゴリズムと比較して、Codarはいくつかの量子アルゴリズムの総実行時間を半減し、17.5%から19.4%を削減した。 Quantum computing (QC) technologies have reached a second renaissance in the last decade. Some fully programmable QC devices have been built based on superconducting or ion trap technologies. Although different quantum technologies have their own parameter indicators, QC devices in the NISQ era share common features and challenges such as limited qubits and connectivity, short coherence time and high gate error rates. Quantum programs written by programmers could hardly run on real hardware directly since two-qubit gates are usually allowed on few pairs of qubits. Therefore, quantum computing compilers must resolve the mapping problem and transform original programs to fit the hardware limitation. To address the issues mentioned above, we summarize different quantum technologies and abstractly define Quantum Abstract Machine (QAM); then propose a COntext-sensitive and Duration-Aware Remapping algorithm (Codar) based on the QAM. By introducing lock for each qubit, Codar is aware of gate duration difference and program context, which bring it abilities to extract more program's parallelism and reduce program execution time. 
Compared to the best-known algorithm, Codar halves the total execution time of several quantum algorithms and cuts total execution time by 17.5% to 19.4% on average across different architectures.	翻訳日:2023-01-10 05:36:43 公開日:2020-01-19
# 非定常変形特異振動子:量子不変量と分解法 Nonstationary deformed singular oscillator: quantum invariants and the factorization method ( http://arxiv.org/abs/2001.06764v1 ) ライセンス: Link先を確認	Kevin Zelaya	(参考訳) 定常特異発振器に関連する時間依存ポテンシャルの新しいファミリーを導入する。これは、特異振動子に対して非定常量子不変量を構成することができることに気付き、達成される。そのような不変量はエルマコフ方程式の解に関連する係数に依存し、後者は各時間における解の正則性を保証するために必須となる。この形式では、ハミルトニアンではなく量子不変量に分解法を適用した後、時間パラメータを変換に導入し、新しい時間依存ポテンシャルの運動定数である分解作用素へと導く。適切な極限の下では、初期量子不変量は定常特異振動子ハミルトニアンに還元され、そのような場合、従来の分解法で得られたポテンシャルの族を復元し、文献で以前報告される。さらに、ポテンシャルの特異な障壁が消え、非特異な時間依存ポテンシャルとなるような特別な制限が議論される。 New families of time-dependent potentials related with the stationary singular oscillator are introduced. This is achieved after noticing that a non stationary quantum invariant can be constructed for the singular oscillator. Such invariant depends on coefficients that are related to solutions of an Ermakov equation, the latter becomes essential since it guarantees the regularity of the solutions at each time. In this form, after applying the factorization method to the quantum invariant, rather than the Hamiltonian, one manages to introduce the time parameter into the transformation, leading to factorized operators which are the constants of motion of the new time-dependent potentials. Under the appropriate limit, the initial quantum invariant reduces to the stationary singular oscillator Hamiltonian, in such case, one recovers the families of potentials obtained through the conventional factorization method and previously reported in the literature. In addition, some special limits are discussed such that the singular barrier of the potential vanishes, leading to non-singular time-dependent potentials.	翻訳日:2023-01-10 05:20:34 公開日:2020-01-19
# r\"ontgen項の付加性と反動による自然放出率の補正 Additivity of R\"ontgen term and recoil-induced correction to spontaneous emission rate ( http://arxiv.org/abs/2001.08145v1 ) ライセンス: Link先を確認	Anwei Zhang and Danying Yu	(参考訳) 移動原子については、R\"{o}ntgen 相互作用項と発光光子によって誘導されるリコイル効果の2つの因子からの寄与により自然放出率が変化する。本稿では,完全導電板近傍における一様移動原子の放出速度を調べ,これら2つの因子による補正を求める。我々は、R\"{o}ntgen 項によって個別に誘導される補正とリコイル効果を簡単に加えることができ、結果として、崩壊率に対する総補正が得られることを見出した。さらに、R\"{o}ntgen 項が正の補正を与えるのに対し、リコイル効果は負の補正をもたらすことが示されている。我々の研究は、量子光学における移動粒子の光-物質相互作用の研究への道を開くものである。 For a moving atom, the spontaneous emission rate is modified due to the contributions from two factors, the R\"{o}ntgen interaction term and the recoil effect induced by the emitted photon. Here we investigate the emission rate of a uniformly moving atom near a perfectly conducting plate and obtain the corrections induced by these two factors. We find that the corrections individually induced by the R\"{o}ntgen term and the recoil effect can be simply added and result in the total correction to the decay rate. Moreover, it is shown that the R\"{o}ntgen term gives positive correction, while the recoil effect induces negative correction. Our work paves the way towards the future studies of the light-matter interaction for the moving particle in quantum optics.	翻訳日:2023-01-10 05:19:57 公開日:2020-01-19
# SlideImages:教育用画像分類用データセット SlideImages: A Dataset for Educational Image Classification ( http://arxiv.org/abs/2001.06823v1 ) ライセンス: Link先を確認	David Morris, Eric M\"uller-Budack, Ralph Ewerth	(参考訳) 過去数年間、畳み込みニューラルネットワーク(convolutional neural networks、cnns)はコンピュータビジョンのタスクで印象的な成果を上げてきた。さらに、イラストやデータの可視化、図形などのセンサ以外の画像は、複雑な情報伝達や大規模なデータセットの探索に一般的に使用される。しかし、この種の画像はコンピュータビジョンにはほとんど注目されていない。 CNNや他の技術は大量のトレーニングデータを使用する。現在、多くの文書分析システムは、教育用画像データの大規模なデータセットが不足しているため、シーン画像に基づいて訓練されている。本稿では,この課題に対処し,教育イラストの分類を行うためのデータセットであるSlideImagesを提示する。 SlideImagesには、Wikimedia CommonsやAI2Dデータセットなど、さまざまなソースから収集したトレーニングデータと、教育スライドから収集したテストデータが含まれている。我々は、このデータセットを用いたアプローチが新しい教育画像や潜在的に他の領域にうまく一般化するように、実際の教育イメージをテストデータセットとして保存してきた。さらに,標準ディープニューラルアーキテクチャを用いたベースラインシステムを提案し,限られたトレーニングデータの扱いについて検討する。 In the past few years, convolutional neural networks (CNNs) have achieved impressive results in computer vision tasks, which however mainly focus on photos with natural scene content. Besides, non-sensor derived images such as illustrations, data visualizations, figures, etc. are typically used to convey complex information or to explore large datasets. However, this kind of images has received little attention in computer vision. CNNs and similar techniques use large volumes of training data. Currently, many document analysis systems are trained in part on scene images due to the lack of large datasets of educational image data. In this paper, we address this issue and present SlideImages, a dataset for the task of classifying educational illustrations. SlideImages contains training data collected from various sources, e.g., Wikimedia Commons and the AI2D dataset, and test data collected from educational slides. We have reserved all the actual educational images as a test dataset in order to ensure that the approaches using this dataset generalize well to new educational images, and potentially other domains. 
Furthermore, we present a baseline system using a standard deep neural architecture and discuss how to deal with the challenge of limited training data.	翻訳日:2023-01-08 12:46:33 公開日:2020-01-19
# RGB-DオドメトリーとSLAM RGB-D Odometry and SLAM ( http://arxiv.org/abs/2001.06875v1 ) ライセンス: Link先を確認	Javier Civera and Seong Hun Lee	(参考訳) 現代のRGB-Dセンサーの出現は、ロボット工学、拡張現実(AR)、そして3Dスキャンを含む多くのアプリケーション分野に大きな影響を与えた。低コストで低消費電力で、LiDARのような従来のレンジセンサーの代替品である。さらに、RGBカメラとは異なり、RGB-Dセンサーは3Dシーン再構成のためのフレーム単位の三角測量の必要性を取り除く追加の深度情報を提供する。これらのメリットは、モバイルロボティクスとarで非常に人気があり、エゴモーションと3dシーン構造を見積もることが非常に興味深い。このような空間的理解により、ロボットは衝突することなく自律的に移動でき、ユーザーは画像ストリームに一貫性のある仮想エンティティを挿入できる。本章では,RGB-Dストリーム入力を用いたオードメトリと同時局所化とマッピング(略称SLAM)の共通定式化について述べる。前者はシーンのローカルマップに対するインクリメンタルカメラの動きを追跡することを目的としており、後者はカメラ軌跡とグローバルマップを一貫性を持って共同で推定することを目的としている。どちらの場合でも、標準手法は非線形最適化技術を用いてコスト関数を最小化する。本章は3つの主要な部分から構成される: 第一部では、オドメトリーとSLAMの基本概念を紹介し、RGB-Dセンサーの使用を動機づける。また,多くのオドメトリーおよびSLAMアルゴリズムに関係した数学的予備次数を与える。第2部では,カメラポーズ追跡,シーンマッピング,ループクローズという,slamシステムの3つの主要コンポーネントについて詳述する。各コンポーネントについて、文献で提案される様々なアプローチについて述べる。最終部では,先進的な研究トピックに関する簡単な議論と最先端技術への言及について述べる。 The emergence of modern RGB-D sensors had a significant impact in many application fields, including robotics, augmented reality (AR) and 3D scanning. They are low-cost, low-power and low-size alternatives to traditional range sensors such as LiDAR. Moreover, unlike RGB cameras, RGB-D sensors provide the additional depth information that removes the need of frame-by-frame triangulation for 3D scene reconstruction. These merits have made them very popular in mobile robotics and AR, where it is of great interest to estimate ego-motion and 3D scene structure. Such spatial understanding can enable robots to navigate autonomously without collisions and allow users to insert virtual entities consistent with the image stream. In this chapter, we review common formulations of odometry and Simultaneous Localization and Mapping (known by its acronym SLAM) using RGB-D stream input. 
The two topics are closely related, as the former aims to track the incremental camera motion with respect to a local map of the scene, and the latter to jointly estimate the camera trajectory and the global map with consistency. In both cases, the standard approaches minimize a cost function using nonlinear optimization techniques. This chapter consists of three main parts: In the first part, we introduce the basic concept of odometry and SLAM and motivate the use of RGB-D sensors. We also give mathematical preliminaries relevant to most odometry and SLAM algorithms. In the second part, we detail the three main components of SLAM systems: camera pose tracking, scene mapping and loop closing. For each component, we describe different approaches proposed in the literature. In the final part, we provide a brief discussion on advanced research topics with the references to the state-of-the-art.	翻訳日:2023-01-08 12:46:15 公開日:2020-01-19
# 単眼腹腔鏡トレーニングにおける拡張現実によるサチューリング Towards Augmented Reality-based Suturing in Monocular Laparoscopic Training ( http://arxiv.org/abs/2001.06894v1 ) ライセンス: Link先を確認	Chandrakanth Jayachandran Preetha, Jonathan Kloss, Fabian Siegfried Wehrtmann, Lalith Sharan, Carolyn Fan, Beat Peter M\"uller-Stich, Felix Nickel, Sandy Engelhardt	(参考訳) 低侵襲手術(MIS)技術は、回復時間短縮や術後の副作用の減少など重要な臨床効果を提供するため、外科医の間で急速に普及している。しかし、従来の内視鏡システムは、深度知覚、空間方向、視野を損なう単眼映像を出力する。これらの状況下で実行される最も複雑なタスクの1つである。この作業の重要な要素は針ホルダーと手術針の間の相互作用である。針と楽器のリアルタイムの信頼性の高い3次元局在化は、推定された針面とその回転中心と機器の関係など、その量的幾何学的関係を記述するパラメータを追加してシーンを増強するために用いられる。これは基本的なスキルと手術技術の標準化と訓練に寄与し、手術全体のパフォーマンスを高め、合併症のリスクを軽減できる。本論文は,シリコーンパッド上での腹腔鏡視訓練結果を改善するために,定量的・定性的な視覚的表現を伴う拡張現実環境を提案する。これはマルチクラスセグメンテーションと深度マップ予測を実行するマルチタスク教師付きディープニューラルネットワークによって実現されている。深度マップとセグメンテーションマップを生成するための外科訓練シナリオに似た仮想環境を構築することで,ラベルの空洞化が克服されている。提案する畳み込みニューラルネットワークは実際の手術訓練シナリオでテストされ,針の閉塞に頑健であることが判明した。本発明のネットワークは、手術針分割用ダイススコア0.67、針ホルダ計器セグメンテーション0.81、深さ推定用平均絶対誤差6.5mmを達成する。 Minimally Invasive Surgery (MIS) techniques have gained rapid popularity among surgeons since they offer significant clinical benefits including reduced recovery time and diminished post-operative adverse effects. However, conventional endoscopic systems output monocular video which compromises depth perception, spatial orientation and field of view. Suturing is one of the most complex tasks performed under these circumstances. Key components of this tasks are the interplay between needle holder and the surgical needle. Reliable 3D localization of needle and instruments in real time could be used to augment the scene with additional parameters that describe their quantitative geometric relation, e.g. the relation between the estimated needle plane and its rotation center and the instrument. This could contribute towards standardization and training of basic skills and operative techniques, enhance overall surgical performance, and reduce the risk of complications. 
The paper proposes an Augmented Reality environment with quantitative and qualitative visual representations to enhance the outcomes of laparoscopic training performed on a silicone pad. This is enabled by a multi-task supervised deep neural network which performs multi-class segmentation and depth map prediction. Label scarcity is overcome by creating a virtual environment which resembles the surgical training scenario to generate dense depth maps and segmentation maps. The proposed convolutional neural network was tested on real surgical training scenarios and was shown to be robust to occlusion of the needle. The network achieves a Dice score of 0.67 for surgical needle segmentation, 0.81 for needle holder instrument segmentation and a mean absolute error of 6.5 mm for depth estimation.	翻訳日:2023-01-08 12:45:49 公開日:2020-01-19
# 時間的ドメインに基づく社会的影響予測へのアプローチ An Approach for Time-aware Domain-based Social Influence Prediction ( http://arxiv.org/abs/2001.07838v1 ) ライセンス: Link先を確認	Bilal Abu-Salih, Kit Yan Chan, Omar Al-Kadi, Marwan Al-Tawil, Pornpit Wongthongtham, Tomayess Issa, Heba Saadeh, Malak Al-Hassan, Bushra Bremie, Abdulaziz Albahlal	(参考訳) オンラインソーシャルネットワーク(OSN)は、人々がさまざまな状況や領域で意見、関心、考えを表現できる仮想プラットフォームを確立し、正当なユーザーだけでなく、スパマーや他の信頼できないユーザーがコンテンツを公開し、広めることを可能にする。そのため、社会信頼の概念は情報処理・データサイエンティスト・情報消費者・企業から注目を集めている。ソーシャルビッグデータ(SBD)の価値を取得する主な理由の1つは、OSNのユーザの信頼性を評価することのできるフレームワークと方法論を提供することである。これらのアプローチは、大規模ソーシャルデータに対応するためにスケーラブルであるべきです。したがって、分析プロセスを改善し、拡張し、SBDの信頼性を推測するために、社会的信頼を十分に理解する必要がある。露出した環境の設定とosnに関する制限の少なさを考えると、mediumは正当で本物のユーザーだけでなく、スパマーや他の信頼性の低いユーザーもコンテンツを公開し、広めることができる。そこで本稿では,意味分析と機械学習モジュールを用いて,異なる時間領域におけるユーザの信頼性を計測し,予測する手法を提案する。実験の結果,組み込まれた機械学習技術の適用性を評価し,信頼性の高いドメインベースユーザを予測する。 Online Social Networks(OSNs) have established virtual platforms enabling people to express their opinions, interests and thoughts in a variety of contexts and domains, allowing legitimate users as well as spammers and other untrustworthy users to publish and spread their content. Hence, the concept of social trust has attracted the attention of information processors/data scientists and information consumers/business firms. One of the main reasons for acquiring the value of Social Big Data (SBD) is to provide frameworks and methodologies using which the credibility of OSNs users can be evaluated. These approaches should be scalable to accommodate large-scale social data. Hence, there is a need for well comprehending of social trust to improve and expand the analysis process and inferring the credibility of SBD. Given the exposed environment's settings and fewer limitations related to OSNs, the medium allows legitimate and genuine users as well as spammers and other low trustworthy users to publish and spread their content. 
Hence, this paper presents an approach that incorporates semantic analysis and machine learning modules to measure and predict users' trustworthiness in numerous domains over different time periods. The evaluation of the conducted experiment validates the applicability of the incorporated machine learning techniques to predict highly trustworthy domain-based users.	翻訳日:2023-01-08 12:45:20 公開日:2020-01-19
# SQLFlow:SQLと機械学習の橋渡し SQLFlow: A Bridge between SQL and Machine Learning ( http://arxiv.org/abs/2001.06846v1 ) ライセンス: Link先を確認	Yi Wang, Yang Yang, Weiguo Zhu, Yi Wu, Xu Yan, Yongfeng Liu, Yu Wang, Liang Xie, Ziyao Gao, Wenjing Zhu, Xiang Chen, Wei Yan, Mingjie Tang, Yuan Tang	(参考訳) 産業用AIシステムは、主にエンドツーエンドの機械学習(ML)ワークフローである。典型的なレコメンデーションまたはビジネスインテリジェンスシステムには、多くのオンラインマイクロサービスとオフラインジョブが含まれる。このようなワークフローをSQLで効率的に開発するためのSQLFlowについて説明する。 SQLを使うことで、開発者は目的(何)と手順(方法)を無視した短いプログラムを書くことができる。以前のデータベースシステムは、MLをサポートするためにSQL方言を拡張した。 SQLFlow(https://sqlflow.org/sqlflow )は、MySQL、Apache Hive、Alibaba MaxCompute、TensorFlow、XGBoost、Scikit-learnといったMLエンジンなど、さまざまなデータベースシステムのブリッジとして機能する別の戦略を採用している。 SQLの構文を慎重に拡張して、さまざまなSQL方言で拡張を動作させました。我々は,協調構文解析アルゴリズムを考案して拡張を実装した。 SQLFlowは、教師付き、教師なしの学習、深いネットワークとツリーモデル、トレーニングと予測に加えて視覚モデルの説明、MLに加えてデータ処理と機能抽出など、さまざまなMLテクニックに対して効率的で表現力がある。 SQLFlowは、フォールトトレラントな実行とオンプレミスデプロイメントのために、SQLプログラムをKubernetesネイティブワークフローにコンパイルする。現在の産業ユーザはAnt Financial、DiDi、Alibaba Groupなどだ。 Industrial AI systems are mostly end-to-end machine learning (ML) workflows. A typical recommendation or business intelligence system includes many online micro-services and offline jobs. We describe SQLFlow for developing such workflows efficiently in SQL. SQL enables developers to write short programs focusing on the purpose (what) and ignoring the procedure (how). Previous database systems extended their SQL dialect to support ML. SQLFlow (https://sqlflow.org/sqlflow ) takes another strategy to work as a bridge over various database systems, including MySQL, Apache Hive, and Alibaba MaxCompute, and ML engines like TensorFlow, XGBoost, and scikit-learn. We extended SQL syntax carefully to make the extension working with various SQL dialects. We implement the extension by inventing a collaborative parsing algorithm. 
SQLFlow is efficient and expressive for a wide variety of ML techniques -- supervised and unsupervised learning; deep networks and tree models; visual model explanation in addition to training and prediction; data processing and feature extraction in addition to ML. SQLFlow compiles a SQL program into a Kubernetes-native workflow for fault-tolerant execution and on-cloud deployment. Current industrial users include Ant Financial, DiDi, and Alibaba Group.	翻訳日:2023-01-08 12:44:59 公開日:2020-01-19
# より効率的かつ効果的な推論に向けて:多人数共同決定 Towards More Efficient and Effective Inference: The Joint Decision of Multi-Participants ( http://arxiv.org/abs/2001.06774v1 ) ライセンス: Link先を確認	Hui Zhu, Zhulin An, Kaiqiang Xu, Xiaolong Hu, Yongjun Xu	(参考訳) 局所的なアーキテクチャを最適化したり、ネットワークを深くすることで畳み込みニューラルネットワークの性能を向上させる既存のアプローチは、モデルのサイズを大幅に増加させる傾向がある。需要の高いエッジデバイスにニューラルネットワークをデプロイして適用するためには、ネットワークの規模を縮小することが非常に重要です。しかし、ネットワークを圧縮して画像処理の性能を低下させることは容易である。本稿では,エッジデバイスに適した推論手法を提案する。多層および多層ネットワークを主成分とする多成分の結合決定は、従来の畳み込みニューラルネットワークのパラメータの合計数と同等で、より高い分類精度(cifar-10では0.26%、cifar-100では4.49%)を達成することができる。 Existing approaches to improve the performances of convolutional neural networks by optimizing the local architectures or deepening the networks tend to increase the size of models significantly. In order to deploy and apply the neural networks to edge devices which are in great demand, reducing the scale of networks are quite crucial. However, It is easy to degrade the performance of image processing by compressing the networks. In this paper, we propose a method which is suitable for edge devices while improving the efficiency and effectiveness of inference. The joint decision of multi-participants, mainly contain multi-layers and multi-networks, can achieve higher classification accuracy (0.26% on CIFAR-10 and 4.49% on CIFAR-100 at most) with similar total number of parameters for classical convolutional neural networks.	翻訳日:2023-01-08 12:39:47 公開日:2020-01-19
# 人間の解析のための学習構成型ニューラル情報融合 Learning Compositional Neural Information Fusion for Human Parsing ( http://arxiv.org/abs/2001.06804v1 ) ライセンス: Link先を確認	Wenguan Wang, Zhijie Zhang, Siyuan Qi, Jianbing Shen, Yanwei Pang, and Ling Shao	(参考訳) この研究は、ニューラルネットワークと人体の構成階層を組み合わせることで、効率的で完全な人間の解析を行うことを提案する。我々はこのアプローチを神経情報融合フレームワークとして定式化する。本モデルでは,階層上の3つの推論プロセスから情報を収集する。直接推論(画像情報を用いて人体の各部分を直接予測する),ボトムアップ推論(構成部品から知識を組み立てる),トップダウン推論(親ノードからコンテキストを推定する)である。ボトムアップとトップダウンの推論は、それぞれ人体の構成関係と分解関係をモデル化する。さらに、複数のソース情報の融合を入力、すなわちソースの信頼度を推定し、考慮することで条件付けする。モデル全体がエンドツーエンドで微分可能で、情報の流れや構造を明示的にモデル化します。提案手法は4つの一般的なデータセットに対して広範に評価され,高速な処理速度を23fpsで実現した。この方向への今後の研究を容易にするため、コードと結果がリリースされました。 This work proposes to combine neural networks with the compositional hierarchy of human bodies for efficient and complete human parsing. We formulate the approach as a neural information fusion framework. Our model assembles the information from three inference processes over the hierarchy: direct inference (directly predicting each part of a human body using image information), bottom-up inference (assembling knowledge from constituent parts), and top-down inference (leveraging context from parent nodes). The bottom-up and top-down inferences explicitly model the compositional and decompositional relations in human bodies, respectively. In addition, the fusion of multi-source information is conditioned on the inputs, i.e., by estimating and considering the confidence of the sources. The whole model is end-to-end differentiable, explicitly modeling information flows and structures. Our approach is extensively evaluated on four popular datasets, outperforming the state-of-the-arts in all cases, with a fast processing speed of 23fps. Our code and results have been released to help ease future research in this direction.	翻訳日:2023-01-08 12:39:31 公開日:2020-01-19
# 注意グラフニューラルネットワークによるゼロショットビデオオブジェクトセグメンテーション Zero-Shot Video Object Segmentation via Attentive Graph Neural Networks ( http://arxiv.org/abs/2001.06807v1 ) ライセンス: Link先を確認	Wenguan Wang, Xiankai Lu, Jianbing Shen, David Crandall, and Ling Shao	(参考訳) 本研究では、ゼロショットビデオオブジェクトセグメンテーション(ZVOS)のための新しい注意グラフニューラルネットワーク(AGNN)を提案する。提案されたAGNNは、このタスクをビデオグラフ上で反復的な情報融合のプロセスとして再放送する。特にAGNNは、フレームをノードとして効率的に表現し、任意のフレームペア間の関係をエッジとして、完全に連結されたグラフを構築している。基礎となる対関係は微分可能な注意機構によって記述される。パラメトリックメッセージパッシングにより、AGNNはビデオフレーム間のよりリッチで高次な関係を効果的に捉え、マイニングすることができ、それによってビデオ内容のより完全な理解とより正確なフォアグラウンド推定が可能になる。 3つのビデオセグメンテーションデータセットの実験結果は、agnnがそれぞれのケースで新しい最先端を設定することを示している。我々は、このフレームワークの一般化可能性をさらに示すために、AGNNを次のタスクに拡張する: Image Object Co-segmentation (IOCS)。我々は2つのIOCSデータセットで実験を行い、AGNNモデルの優越性を再び観察する。広範な実験により、AGNNはビデオフレームや関連画像間のセマンティック/出現関係を学習し、共通のオブジェクトを発見することができる。 This work proposes a novel attentive graph neural network (AGNN) for zero-shot video object segmentation (ZVOS). The suggested AGNN recasts this task as a process of iterative information fusion over video graphs. Specifically, AGNN builds a fully connected graph to efficiently represent frames as nodes, and relations between arbitrary frame pairs as edges. The underlying pair-wise relations are described by a differentiable attention mechanism. Through parametric message passing, AGNN is able to efficiently capture and mine much richer and higher-order relations between video frames, thus enabling a more complete understanding of video content and more accurate foreground estimation. Experimental results on three video segmentation datasets show that AGNN sets a new state-of-the-art in each case. To further demonstrate the generalizability of our framework, we extend AGNN to an additional task: image object co-segmentation (IOCS). We perform experiments on two famous IOCS datasets and observe again the superiority of our AGNN model. 
The extensive experiments verify that AGNN is able to learn the underlying semantic/appearance relationships among video frames or related images, and discover the common objects.	翻訳日:2023-01-08 12:39:13 公開日:2020-01-19
# 監視されていないビデオオブジェクトのセグメンテーションとコ・アテンション・シームズ・ネットワーク See More, Know More: Unsupervised Video Object Segmentation with Co-Attention Siamese Networks ( http://arxiv.org/abs/2001.06810v1 ) ライセンス: Link先を確認	Xiankai Lu, Wenguan Wang, Chao Ma, Jianbing Shen, Ling Shao, and Fatih Porikli	(参考訳) 本稿では,Co-attention Siamese Network (COSNet) と呼ばれる新しいネットワークを導入し,包括的視点から,教師なしビデオオブジェクトのセグメンテーションタスクに対処する。我々は,映像フレーム間の固有相関の重要性を強調し,短時間の時間セグメントにおける外見や動きに対する差別的前景表現の学習に重点を置いた,最先端のディープラーニングベースのソリューションを改善するためのグローバルなコアテンション機構を取り入れた。ネットワーク内のコアテンション層は,共同計算と共同特徴空間へのコアテンション応答の付加により,グローバルな相関関係とシーンコンテキストを捕捉するための効率的かつ有能な段階を提供する。 COSNetをビデオフレームのペアでトレーニングし、トレーニングデータを自然に強化し、学習能力を向上します。セグメンテーション段階において、コアテンションモデルは、複数の参照フレームを一緒に処理することで有用な情報を符号化し、頻繁に出現し、より健全なフォアグラウンドオブジェクトを推測する。ビデオ内のリッチなコンテキストをマイニングするために,さまざまなコアテンション変種を導出できる統一的でエンドツーエンドのトレーニング可能なフレームワークを提案する。 3つの大きなベンチマークに関する大規模な実験では、COSNetが現在の選択肢よりも大きなマージンで優れています。 We introduce a novel network, called CO-attention Siamese Network (COSNet), to address the unsupervised video object segmentation task from a holistic view. We emphasize the importance of inherent correlation among video frames and incorporate a global co-attention mechanism to improve further the state-of-the-art deep learning based solutions that primarily focus on learning discriminative foreground representations over appearance and motion in short-term temporal segments. The co-attention layers in our network provide efficient and competent stages for capturing global correlations and scene context by jointly computing and appending co-attention responses into a joint feature space. We train COSNet with pairs of video frames, which naturally augments training data and allows increased learning capacity. During the segmentation stage, the co-attention model encodes useful information by processing multiple reference frames together, which is leveraged to infer the frequently reappearing and salient foreground objects better. 
We propose a unified and end-to-end trainable framework where different co-attention variants can be derived for mining the rich context within videos. Our extensive experiments over three large benchmarks demonstrate that COSNet outperforms the current alternatives by a large margin.	翻訳日:2023-01-08 12:38:54 公開日:2020-01-19
# ヒューマン・アウェア・モーション・デブロアリング Human-Aware Motion Deblurring ( http://arxiv.org/abs/2001.06816v1 ) ライセンス: Link先を確認	Ziyi Shen, Wenguan Wang, Xiankai Lu, Jianbing Shen, Haibin Ling, Tingfa Xu, and Ling Shao	(参考訳) 本稿では,前景(FG)と背景(BG)との間に動きのぼかしをアンタングルする人間認識型デブロアリングモデルを提案する。提案モデルは三分岐エンコーダデコーダアーキテクチャに基づいている。第1の2つの分枝はそれぞれfg人間とbgの細部を研削するために学習され、第3の分枝は2つの領域からのマルチスケールなデブラリング情報を包括的に融合することにより、グローバルかつ調和的な結果を生み出す。提案モデルは, エンド・ツー・エンド方式で, 教師付き, 人間対応の注意機構を付与する。 FGの人間の情報をエンコードするソフトマスクを学習し、FG/BGデコーダブランチを明示的に駆動して特定のドメインに集中する。さらに,人間を認識できる画像デブラリングの研究に資するため,8,422個のぼやけた画像ペアと65,784個のfg人間バウンディングボックスからなるhidという大規模データセットを導入する。 HIDEは、広い範囲のシーン、人間のオブジェクトのサイズ、動きのパターン、背景の複雑さにまたがるように設計されている。公開ベンチマークとデータセットに関する広範な実験により,我々のモデルは,特にセマンティクス詳細の把握において,最先端のモーションデブラリング手法に対して好適に機能することが示された。 This paper proposes a human-aware deblurring model that disentangles the motion blur between foreground (FG) humans and background (BG). The proposed model is based on a triple-branch encoder-decoder architecture. The first two branches are learned for sharpening FG humans and BG details, respectively; while the third one produces global, harmonious results by comprehensively fusing multi-scale deblurring information from the two domains. The proposed model is further endowed with a supervised, human-aware attention mechanism in an end-to-end fashion. It learns a soft mask that encodes FG human information and explicitly drives the FG/BG decoder-branches to focus on their specific domains. To further benefit the research towards Human-aware Image Deblurring, we introduce a large-scale dataset, named HIDE, which consists of 8,422 blurry and sharp image pairs with 65,784 densely annotated FG human bounding boxes. HIDE is specifically built to span a broad range of scenes, human object sizes, motion patterns, and background complexities. 
Extensive experiments on public benchmarks and our dataset demonstrate that our model performs favorably against the state-of-the-art motion deblurring methods, especially in capturing semantic details.	翻訳日:2023-01-08 12:38:09 公開日:2020-01-19
# セマンティックセグメンテーションのためのゲートパス選択ネットワーク Gated Path Selection Network for Semantic Segmentation ( http://arxiv.org/abs/2001.06819v1 ) ライセンス: Link先を確認	Qichuan Geng, Hong Zhang, Xiaojuan Qi, Ruigang Yang, Zhong Zhou, Gao Huang	(参考訳) セマンティクスのセグメンテーションは、大規模なバリエーション、変形、異なる視点を扱う必要がある困難なタスクである。本稿では,適応受容場を学習することを目的とした新しいネットワークgated path selection network(gpsnet)を開発した。 GPSNetにおいて、我々はまず2次元のマルチスケールネットワーク、SuperNetを設計する。望ましいセマンティックコンテキストを動的に選択するために、さらにゲート予測モジュールを導入する。通常のグリッド上のサンプル位置の最適化に重点を置く以前の研究とは対照的に、GPSNetは自由形式の密接なセマンティックコンテキストを適応的にキャプチャすることができる。導出された適応受容場はデータ依存であり、異なるオブジェクト幾何学変換をモデル化できる柔軟性がある。都市景観とADE20Kの2つの代表的なセマンティックセマンティックセグメンテーションデータセットにおいて,提案手法が従来手法より一貫して優れ,ベルやホイッスルを使わずに競争性能を達成することを示す。 Semantic segmentation is a challenging task that needs to handle large scale variations, deformations and different viewpoints. In this paper, we develop a novel network named Gated Path Selection Network (GPSNet), which aims to learn adaptive receptive fields. In GPSNet, we first design a two-dimensional multi-scale network - SuperNet, which densely incorporates features from growing receptive fields. To dynamically select desirable semantic context, a gate prediction module is further introduced. In contrast to previous works that focus on optimizing sample positions on the regular grids, GPSNet can adaptively capture free form dense semantic contexts. The derived adaptive receptive fields are data-dependent, and are flexible that can model different object geometric transformations. On two representative semantic segmentation datasets, i.e., Cityscapes, and ADE20K, we show that the proposed approach consistently outperforms previous methods and achieves competitive performance without bells and whistles.	翻訳日:2023-01-08 12:37:46 公開日:2020-01-19
# FIS-Nets: Full-image Supervised Networks for Monocular Depth Estimation ( http://arxiv.org/abs/2001.11092v1 ) License: check the linked page	Bei Wang and Jianping An	This paper addresses the importance of full-image supervision for monocular depth estimation. We propose a semi-supervised architecture that combines an unsupervised framework based on image consistency with a supervised framework for dense depth completion. The latter provides full-image depth as supervision for the former. Ego-motion from the navigation system is also embedded into the unsupervised framework as output supervision of an inner temporal transform network, which improves monocular depth estimation. In the evaluation, we show that our proposed model outperforms other approaches on depth estimation.	Translated: 2023-01-08 12:36:08 Published: 2020-01-19
# Weakly Supervised Learning Meets Ride-Sharing User Experience Enhancement ( http://arxiv.org/abs/2001.09027v1 ) License: check the linked page	Lan-Zhe Guo, Feng Kuang, Zhang-Xun Liu, Yu-Feng Li, Nan Ma, Xiao-Hu Qie	Weakly supervised learning aims at coping with scarce labeled data. Previous weakly supervised studies typically assume that there is only one kind of weak supervision in data. In many applications, however, raw data usually contains more than one kind of weak supervision at the same time. For example, in user experience enhancement from Didi, one of the largest online ride-sharing platforms, the ride comment data contains severe label noise (due to the subjective factors of passengers) and severe label distribution bias (due to the sampling bias). We call such a problem "compound weakly supervised learning". In this paper, we propose the CWSL method to address this problem based on Didi ride-sharing comment data. Specifically, an instance reweighting strategy is employed to cope with severe label noise in comment data, where the weights for harmful noisy instances are small. Robust criteria like AUC rather than accuracy and the validation performance are optimized for the correction of biased data labels. Alternating optimization and stochastic gradient methods accelerate the optimization on large-scale data. Experiments on Didi ride-sharing comment data clearly validate its effectiveness. We hope this work may shed some light on applying weakly supervised learning to complex real situations.	Translated: 2023-01-08 10:14:25 Published: 2020-01-19
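The two ingredients named in the CWSL abstract above, instance reweighting and AUC as a criterion, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the log-loss choice, and the way weights are supplied are all assumptions made for illustration.

```python
import math


def auc(scores, labels):
    """Fraction of (positive, negative) pairs ranked correctly; ties count as 1/2."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative example")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def weighted_log_loss(probs, labels, weights):
    """Weighted mean log loss; a small weight mutes an instance with a suspect label."""
    total = 0.0
    for p, y, w in zip(probs, labels, weights):
        p = min(max(p, 1e-12), 1.0 - 1e-12)  # keep log() finite
        total += -w * (y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / sum(weights)
```

Because AUC depends only on how positives rank against negatives, it is less sensitive to a shifted label distribution than plain accuracy, which is presumably why the abstract prefers it.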
# Finding Optimal Points for Expensive Functions Using Adaptive RBF-Based Surrogate Model Via Uncertainty Quantification ( http://arxiv.org/abs/2001.06858v1 ) License: check the linked page	Ray-Bing Chen, Yuan Wang, C. F. Jeff Wu	Global optimization of expensive functions has important applications in physical and computer experiments. It is a challenging problem to develop an efficient optimization scheme, because each function evaluation can be costly and the derivative information of the function is often not available. We propose a novel global optimization framework using an adaptive radial basis function (RBF) based surrogate model via uncertainty quantification. The framework consists of two iteration steps. It first employs an RBF-based Bayesian surrogate model to approximate the true function, where the parameters of the RBFs can be adaptively estimated and updated each time a new point is explored. Then it utilizes a model-guided selection criterion to identify a new point from a candidate set for function evaluation. The selection criterion adopted here is a sample version of the expected improvement (EI) criterion. We conduct simulation studies with standard test functions, which show that the proposed method has some advantages, especially when the true surface is not very smooth. In addition, we also propose modified approaches to improve the search performance for identifying global optimal points and to deal with higher-dimensional scenarios.	Translated: 2023-01-08 10:13:30 Published: 2020-01-19
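A "sample version" of expected improvement, as mentioned in the abstract above, can be sketched as an average over draws from the surrogate's posterior at each candidate point. The sketch assumes minimization and assumes the posterior draws are already available; how the RBF-based Bayesian surrogate produces them is not shown, and all names are hypothetical.

```python
def sample_expected_improvement(posterior_draws, best_so_far):
    """Mean of max(best_so_far - y, 0) over posterior draws y (minimization)."""
    if not posterior_draws:
        raise ValueError("need at least one posterior draw")
    gains = (max(best_so_far - y, 0.0) for y in posterior_draws)
    return sum(gains) / len(posterior_draws)


def pick_next_point(candidates, draws_by_candidate, best_so_far):
    """Choose the candidate whose sample EI is largest."""
    return max(
        candidates,
        key=lambda x: sample_expected_improvement(draws_by_candidate[x], best_so_far),
    )
```

For example, with a current best value of 2.0, a candidate whose draws are 1.0 and 3.0 has a sample EI of 0.5, while one whose draws are all 2.5 has a sample EI of 0.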
# A meta-algorithm for classification using random recursive tree ensembles: A high energy physics application ( http://arxiv.org/abs/2001.06880v1 ) License: check the linked page	Vidhi Lalchand	The aim of this work is to propose a meta-algorithm for automatic classification in the presence of discrete binary classes. Classifier learning in the presence of overlapping class distributions is a challenging problem in machine learning. Overlapping classes are described by the presence of ambiguous areas in the feature space with a high density of points belonging to both classes. This often occurs in real-world datasets; one such example is numeric data denoting properties of particle decays derived from high-energy accelerators like the Large Hadron Collider (LHC). A significant body of research targeting the class overlap problem uses ensemble classifiers to boost the performance of algorithms by using them iteratively in multiple stages or using multiple copies of the same model on different subsets of the input training data. The former is called boosting and the latter is called bagging. The algorithm proposed in this thesis targets a challenging classification problem in high energy physics: that of improving the statistical significance of the Higgs discovery. The underlying dataset used to train the algorithm is experimental data built from the official ATLAS full-detector simulation with Higgs events (signal) mixed with different background events (background) that closely mimic the statistical properties of the signal, generating class overlap. The algorithm proposed is a variant of the classical boosted decision tree, which is known to be one of the most successful analysis techniques in experimental physics. The algorithm utilizes a unified framework that combines two meta-learning techniques: bagging and boosting. The results show that this combination only works in the presence of a randomization trick in the base learners.	Translated: 2023-01-08 10:13:13 Published: 2020-01-19
# Motion Classification using Kinematically Sifted ACGAN-Synthesized Radar Micro-Doppler Signatures ( http://arxiv.org/abs/2001.08582v1 ) License: check the linked page	Baris Erol, Sevgi Zubeyde Gurbuz, Moeness G. Amin	Deep neural networks (DNNs) have recently received vast attention in applications requiring classification of radar returns, including radar-based human activity recognition for security, smart homes, assisted living, and biomedicine. However, acquiring a sufficiently large training dataset remains a daunting task due to the high human costs and resources required for radar data collection. In this paper, an extended approach to adversarial learning is proposed for generation of synthetic radar micro-Doppler signatures that are well-adapted to different environments. The synthetic data is evaluated using visual interpretation, analysis of kinematic consistency, data diversity, dimensions of the latent space, and saliency maps. A principal component analysis (PCA) based kinematic-sifting algorithm is introduced to ensure that synthetic signatures are consistent with physically possible human motions. The synthetic dataset is used to train a 19-layer deep convolutional neural network (DCNN) to classify micro-Doppler signatures acquired from an environment different from that of the dataset supplied to the adversarial network. An overall accuracy of 93% is achieved on a dataset that contains multiple aspect angles (0, 30, 45, and 60 degrees), with a 9% improvement as a result of kinematic sifting.	Translated: 2023-01-08 10:12:46 Published: 2020-01-19
# FRESH: Interactive Reward Shaping in High-Dimensional State Spaces using Human Feedback ( http://arxiv.org/abs/2001.06781v1 ) License: check the linked page	Baicen Xiao, Qifan Lu, Bhaskar Ramasubramanian, Andrew Clark, Linda Bushnell, Radha Poovendran	Reinforcement learning has been successful in training autonomous agents to accomplish goals in complex environments. Although this has been adapted to multiple settings, including robotics and computer games, human players often find it easier to obtain higher rewards in some environments than reinforcement learning algorithms. This is especially true of high-dimensional state spaces where the reward obtained by the agent is sparse or extremely delayed. In this paper, we seek to effectively integrate feedback signals supplied by a human operator with deep reinforcement learning algorithms in high-dimensional state spaces. We call this FRESH (Feedback-based REward SHaping). During training, a human operator is presented with trajectories from a replay buffer and then provides feedback on states and actions in the trajectory. In order to generalize feedback signals provided by the human operator to previously unseen states and actions at test time, we use a feedback neural network. We use an ensemble of neural networks with a shared network architecture to represent model uncertainty and the confidence of the neural network in its output. The output of the feedback neural network is converted to a shaping reward that is added to the reward provided by the environment. We evaluate our approach on the Bowling and Skiing Atari games in the Arcade Learning Environment. Although human experts have been able to achieve high scores in these environments, state-of-the-art deep learning algorithms perform poorly. We observe that FRESH is able to achieve much higher scores than state-of-the-art deep learning algorithms in both environments. FRESH also achieves a 21.4% higher score than a human expert in Bowling and does as well as a human expert in Skiing.	Translated: 2023-01-08 10:12:23 Published: 2020-01-19
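The step from ensemble output to shaping reward in the FRESH abstract above might look like the following sketch. The abstract does not give the conversion rule, so everything here is an assumption: the ensemble outputs are taken to be "good action" scores in [0, 1], disagreement is measured by the standard deviation, and the names `shaped_reward`, `scale`, and `max_disagreement` are hypothetical.

```python
from statistics import mean, pstdev


def shaped_reward(env_reward, ensemble_outputs, scale=1.0, max_disagreement=0.2):
    """Add a feedback-derived bonus to env_reward when the ensemble is confident.

    ensemble_outputs: one score in [0, 1] per ensemble member for the current
    state-action pair (assumed format).
    """
    estimate = mean(ensemble_outputs)
    disagreement = pstdev(ensemble_outputs)  # spread across ensemble members
    if disagreement > max_disagreement:
        # Members disagree too much: fall back to the environment reward alone.
        return env_reward
    # Map an estimate in [0, 1] to a bonus in [-scale, +scale].
    return env_reward + scale * (2.0 * estimate - 1.0)
```

The hard confidence threshold is only one possible design; a bonus scaled smoothly by the ensemble's agreement would serve the same purpose.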
# Correcting Knowledge Base Assertions ( http://arxiv.org/abs/2001.06917v1 ) License: check the linked page	Jiaoyan Chen, Xi Chen, Ian Horrocks, Ernesto Jimenez-Ruiz, and Erik B. Myklebus	The usefulness and usability of knowledge bases (KBs) are often limited by quality issues. One common issue is the presence of erroneous assertions, often caused by lexical or semantic confusion. We study the problem of correcting such assertions, and present a general correction framework which combines lexical matching, semantic embedding, soft constraint mining and semantic consistency checking. The framework is evaluated using DBpedia and an enterprise medical KB.	Translated: 2023-01-08 10:11:58 Published: 2020-01-19
# Deep Learning for Hindi Text Classification: A Comparison ( http://arxiv.org/abs/2001.10340v1 ) License: check the linked page	Ramchandra Joshi, Purvi Goel, Raviraj Joshi	Natural Language Processing (NLP) and especially natural language text analysis have seen great advances in recent times. Usage of deep learning in text processing has revolutionized the techniques for text processing and achieved remarkable results. Different deep learning architectures like CNN, LSTM, and the very recent Transformer have been used to achieve state-of-the-art results on a variety of NLP tasks. In this work, we survey a host of deep learning architectures for text classification tasks. The work is specifically concerned with the classification of Hindi text. Research on the classification of the morphologically rich and low-resource Hindi language, written in Devanagari script, has been limited due to the absence of a large labeled corpus. In this work, we used translated versions of English datasets to evaluate models based on CNN, LSTM and Attention. Multilingual pre-trained sentence embeddings based on BERT and LASER are also compared to evaluate their effectiveness for the Hindi language. The paper also serves as a tutorial for popular text classification techniques.	Translated: 2023-01-08 10:05:01 Published: 2020-01-19
# Multi-Level Representation Learning for Deep Subspace Clustering ( http://arxiv.org/abs/2001.08533v1 ) License: check the linked page	Mohsen Kheirandishfard, Fariba Zohrizadeh, Farhad Kamangar	This paper proposes a novel deep subspace clustering approach which uses convolutional autoencoders to transform input images into new representations lying on a union of linear subspaces. The first contribution of our work is to insert multiple fully-connected linear layers between the encoder layers and their corresponding decoder layers to promote learning more favorable representations for subspace clustering. These connection layers facilitate the feature learning procedure by combining low-level and high-level information for generating multiple sets of self-expressive and informative representations at different levels of the encoder. Moreover, we introduce a novel loss minimization problem which leverages an initial clustering of the samples to effectively fuse the multi-level representations and recover the underlying subspaces more accurately. The loss function is then minimized through an iterative scheme which alternately updates the network parameters and produces new clusterings of the samples. Experiments on four real-world datasets demonstrate that our approach exhibits superior performance compared to the state-of-the-art methods on most of the subspace clustering problems.	Translated: 2023-01-08 10:04:12 Published: 2020-01-19
# Image denoising via K-SVD with primal-dual active set algorithm ( http://arxiv.org/abs/2001.06780v1 ) License: check the linked page	Quan Xiao, Canhong Wen, Zirui Yan	The K-SVD algorithm has been successfully applied to image denoising tasks for decades, but a major bottleneck in speed and accuracy remains to be overcome. For the sparse coding stage in K-SVD, which involves an $\ell_{0}$ constraint, prevailing methods usually seek approximate solutions greedily but are less effective once the noise level is high. The alternative $\ell_{1}$ optimization has proved more powerful than $\ell_{0}$, but its time consumption prevents its use in practice. In this paper, we propose a new K-SVD framework called K-SVD$_P$ by applying the primal-dual active set (PDAS) algorithm to it. Unlike K-SVD based on greedy algorithms, the K-SVD$_P$ algorithm develops a selection strategy motivated by the KKT (Karush-Kuhn-Tucker) condition and yields an efficient update in the sparse coding stage. Since the K-SVD$_P$ algorithm iteratively seeks an equivalent solution to the dual problem with a simple explicit expression in this denoising problem, denoising speed and quality can be achieved simultaneously. Experiments are carried out and demonstrate that the denoising performance of our K-SVD$_P$ is comparable with state-of-the-art methods.	Translated: 2023-01-08 10:03:55 Published: 2020-01-19
# Algebraic and Analytic Approaches for Parameter Learning in Mixture Models ( http://arxiv.org/abs/2001.06776v1 ) License: check the linked page	Akshay Krishnamurthy, Arya Mazumdar, Andrew McGregor, Soumyabrata Pal	We present two different approaches for parameter learning in several mixture models in one dimension. Our first approach uses complex-analytic methods and applies to Gaussian mixtures with shared variance, binomial mixtures with shared success probability, and Poisson mixtures, among others. An example result is that $\exp(O(N^{1/3}))$ samples suffice to exactly learn a mixture of $k<N$ Poisson distributions, each with integral rate parameters bounded by $N$. Our second approach uses algebraic and combinatorial tools and applies to binomial mixtures with shared trial parameter $N$ and differing success parameters, as well as to mixtures of geometric distributions. Again, as an example, for binomial mixtures with $k$ components and success parameters discretized to resolution $\epsilon$, $O(k^2(N/\epsilon)^{8/\sqrt{\epsilon}})$ samples suffice to exactly recover the parameters. For some of these distributions, our results represent the first guarantees for parameter estimation.	Translated: 2023-01-08 10:03:30 Published: 2020-01-19
# Learning Options from Demonstration using Skill Segmentation ( http://arxiv.org/abs/2001.06793v1 ) License: check the linked page	Matthew Cockcroft, Shahil Mawjee, Steven James, Pravesh Ranchod	We present a method for learning options from segmented demonstration trajectories. The trajectories are first segmented into skills using nonparametric Bayesian clustering, and a reward function for each segment is then learned using inverse reinforcement learning. From this, a set of inferred trajectories for the demonstration is generated. Option initiation sets and termination conditions are learned from these trajectories using the one-class support vector machine clustering algorithm. We demonstrate our method in the four rooms domain, where an agent is able to autonomously discover usable options from human demonstration. Our results show that these inferred options can then be used to improve learning and planning.	Translated: 2023-01-08 10:03:08 Published: 2020-01-19
# Distributionally Robust Bayesian Quadrature Optimization ( http://arxiv.org/abs/2001.06814v1 ) License: check the linked page	Thanh Tang Nguyen, Sunil Gupta, Huong Ha, Santu Rana, Svetha Venkatesh	Bayesian quadrature optimization (BQO) maximizes the expectation of an expensive black-box integrand taken over a known probability distribution. In this work, we study BQO under distributional uncertainty in which the underlying probability distribution is unknown except for a limited set of its i.i.d. samples. A standard BQO approach maximizes the Monte Carlo estimate of the true expected objective given the fixed sample set. Though the Monte Carlo estimate is unbiased, it has high variance given a small set of samples and thus can result in a spurious objective function. We adopt the distributionally robust optimization perspective on this problem by maximizing the expected objective under the most adversarial distribution. In particular, we propose a novel posterior sampling based algorithm, namely distributionally robust BQO (DRBQO), for this purpose. We demonstrate the empirical effectiveness of our proposed framework in synthetic and real-world problems, and characterize its theoretical convergence via Bayesian regret.	Translated: 2023-01-08 10:02:45 Published: 2020-01-19
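The contrast drawn in the abstract above, a plain Monte Carlo estimate versus an expectation under the most adversarial distribution, can be made concrete with a small sketch. The abstract does not specify the set of distributions considered, so a simple capped-reweighting (CVaR-style) set is assumed here purely for illustration; the surrogate model and posterior sampling of DRBQO are left out, and all names are hypothetical.

```python
def monte_carlo_estimate(values):
    """Plain average of the objective over the fixed sample set."""
    return sum(values) / len(values)


def worst_case_estimate(values, alpha):
    """Smallest reweighted average with each weight capped at 1 / (alpha * n).

    alpha = 1 forces uniform weights (the Monte Carlo estimate); smaller alpha
    lets an adversary concentrate mass on the lowest objective values.
    """
    if not 0.0 < alpha <= 1.0:
        raise ValueError("alpha must be in (0, 1]")
    cap = 1.0 / (alpha * len(values))
    remaining, total = 1.0, 0.0
    for v in sorted(values):  # the adversary fills the lowest values first
        q = min(cap, remaining)
        total += q * v
        remaining -= q
        if remaining <= 0.0:
            break
    return total
```

A robust choice would then pick the input whose worst-case estimate is largest, rather than the one whose plain average is largest.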

# Mode-Assisted Unsupervised Learning of Restricted Boltzmann Machines ( http://arxiv.org/abs/2001.05559v2 )

License: check the linked page

Haik Manukian, Yan Ru Pei, Sean R.B. Bearden, Massimiliano Di Ventra

Restricted Boltzmann machines (RBMs) are a powerful class of generative models, but their training requires computing a gradient that, unlike supervised backpropagation on typical loss functions, is notoriously difficult even to approximate. Here, we show that properly combining standard gradient updates with an off-gradient direction, constructed from samples of the RBM ground state (mode), improves their training dramatically over traditional gradient methods. This approach, which we call mode training, promotes faster training and stability, in addition to lower converged relative entropy (KL divergence). Along with the proofs of stability and convergence of this method, we also demonstrate its efficacy on synthetic datasets where we can compute KL divergences exactly, as well as on a larger machine learning standard, MNIST. The mode training we suggest is quite versatile, as it can be applied in conjunction with any given gradient method, and is easily extended to more general energy-based neural network structures such as deep, convolutional and unrestricted Boltzmann machines.

Translated: 2023-01-11 05:56:51 Published: 2020-01-19

# Context-Sensitive and Duration-Aware Qubit Mapping for Various NISQ Devices ( http://arxiv.org/abs/2001.06887v1 )

License: check the linked page

Yu Zhang and Haowei Deng and Quanxi Li

Quantum computing (QC) technologies have reached a second renaissance in the last decade. Some fully programmable QC devices have been built based on superconducting or ion trap technologies. Although different quantum technologies have their own parameter indicators, QC devices in the NISQ era share common features and challenges such as limited qubits and connectivity, short coherence time and high gate error rates. Quantum programs written by programmers can hardly run on real hardware directly, since two-qubit gates are usually allowed only on a few pairs of qubits. Therefore, quantum computing compilers must resolve the mapping problem and transform original programs to fit the hardware limitations. To address the issues mentioned above, we summarize different quantum technologies and abstractly define a Quantum Abstract Machine (QAM); we then propose a COntext-sensitive and Duration-Aware Remapping algorithm (Codar) based on the QAM. By introducing a lock for each qubit, Codar is aware of gate duration differences and program context, which enables it to extract more of a program's parallelism and reduce program execution time. Compared to the best-known algorithm, Codar halves the total execution time of several quantum algorithms and cuts total execution time by 17.5% to 19.4% on average across different architectures.

Translated: 2023-01-10 05:36:43 Published: 2020-01-19
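The per-qubit lock described in the Codar abstract above can be sketched as a duration-aware, as-soon-as-possible scheduler: a gate may start only once every qubit it touches is unlocked, and it keeps those qubits locked until it finishes. This covers only the timing idea; the remapping itself (SWAP insertion under connectivity limits) is not modeled, and the gate names, durations, and function name are assumptions for illustration.

```python
def schedule(gates, durations):
    """Return (start_times, makespan) for gates given in program order.

    gates: list of (gate_name, qubits) tuples.
    durations: dict mapping gate_name to its duration in arbitrary time units.
    """
    lock_until = {}  # qubit -> time at which its current lock is released
    starts = []
    for name, qubits in gates:
        # A gate starts when the last of its qubits becomes free.
        start = max((lock_until.get(q, 0) for q in qubits), default=0)
        end = start + durations[name]
        for q in qubits:
            lock_until[q] = end  # lock every touched qubit until the gate ends
        starts.append(start)
    return starts, max(lock_until.values(), default=0)
```

With single-qubit gates of duration 1 and two-qubit gates of duration 2, the sequence h(0), cx(0,1), h(2), cx(1,2) starts at times 0, 1, 0, 3 and finishes at time 5: the gate on qubit 2 runs in parallel with the gate on qubit 0 because their locks do not overlap.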

# Nonstationary deformed singular oscillator: quantum invariants and the factorization method ( http://arxiv.org/abs/2001.06764v1 )

License: check the linked page

Kevin Zelaya

New families of time-dependent potentials related to the stationary singular oscillator are introduced. This is achieved after noticing that a nonstationary quantum invariant can be constructed for the singular oscillator. Such an invariant depends on coefficients that are related to solutions of an Ermakov equation; the latter is essential since it guarantees the regularity of the solutions at each time. In this form, after applying the factorization method to the quantum invariant, rather than the Hamiltonian, one manages to introduce the time parameter into the transformation, leading to factorized operators which are the constants of motion of the new time-dependent potentials. Under the appropriate limit, the initial quantum invariant reduces to the stationary singular oscillator Hamiltonian; in that case, one recovers the families of potentials obtained through the conventional factorization method and previously reported in the literature. In addition, some special limits are discussed such that the singular barrier of the potential vanishes, leading to non-singular time-dependent potentials.

Translated: 2023-01-10 05:20:34 Published: 2020-01-19

# r\"ontgen項の付加性と反動による自然放出率の補正

Additivity of R\"ontgen term and recoil-induced correction to spontaneous emission rate ( http://arxiv.org/abs/2001.08145v1 )

ライセンス: Link先を確認

Anwei Zhang and Danying Yu

(参考訳) 移動原子については、Röntgen 相互作用項と発光光子によって誘導されるリコイル効果の2つの因子からの寄与により自然放出率が変化する。本稿では,完全導電板近傍における一様移動原子の放出速度を調べ,これら2つの因子による補正を求める。我々は、Röntgen 項によって個別に誘導される補正とリコイル効果を簡単に加えることができ、結果として、崩壊率に対する総補正が得られることを見出した。さらに、Röntgen 項が正の補正を与えるのに対し、リコイル効果は負の補正をもたらすことが示されている。我々の研究は、量子光学における移動粒子の光-物質相互作用の研究への道を開くものである。

For a moving atom, the spontaneous emission rate is modified due to the contributions from two factors, the Röntgen interaction term and the recoil effect induced by the emitted photon. Here we investigate the emission rate of a uniformly moving atom near a perfectly conducting plate and obtain the corrections induced by these two factors. We find that the corrections individually induced by the Röntgen term and the recoil effect can be simply added and result in the total correction to the decay rate. Moreover, it is shown that the Röntgen term gives a positive correction, while the recoil effect induces a negative correction. Our work paves the way towards future studies of light-matter interaction for moving particles in quantum optics.

翻訳日:2023-01-10 05:19:57 公開日:2020-01-19

# SlideImages:教育用画像分類用データセット

SlideImages: A Dataset for Educational Image Classification ( http://arxiv.org/abs/2001.06823v1 )

ライセンス: Link先を確認

David Morris, Eric Müller-Budack, Ralph Ewerth

(参考訳) 過去数年間、畳み込みニューラルネットワーク(convolutional neural networks、cnns)はコンピュータビジョンのタスクで印象的な成果を上げてきた。さらに、イラストやデータの可視化、図形などのセンサ以外の画像は、複雑な情報伝達や大規模なデータセットの探索に一般的に使用される。しかし、この種の画像はコンピュータビジョンにはほとんど注目されていない。 CNNや他の技術は大量のトレーニングデータを使用する。現在、多くの文書分析システムは、教育用画像データの大規模なデータセットが不足しているため、シーン画像に基づいて訓練されている。本稿では,この課題に対処し,教育イラストの分類を行うためのデータセットであるSlideImagesを提示する。 SlideImagesには、Wikimedia CommonsやAI2Dデータセットなど、さまざまなソースから収集したトレーニングデータと、教育スライドから収集したテストデータが含まれている。我々は、このデータセットを用いたアプローチが新しい教育画像や潜在的に他の領域にうまく一般化するように、実際の教育イメージをテストデータセットとして保存してきた。さらに,標準ディープニューラルアーキテクチャを用いたベースラインシステムを提案し,限られたトレーニングデータの扱いについて検討する。

In the past few years, convolutional neural networks (CNNs) have achieved impressive results in computer vision tasks, which, however, mainly focus on photos with natural scene content. Besides, non-sensor-derived images such as illustrations, data visualizations, figures, etc. are typically used to convey complex information or to explore large datasets. However, this kind of image has received little attention in computer vision. CNNs and similar techniques use large volumes of training data. Currently, many document analysis systems are trained in part on scene images due to the lack of large datasets of educational image data. In this paper, we address this issue and present SlideImages, a dataset for the task of classifying educational illustrations. SlideImages contains training data collected from various sources, e.g., Wikimedia Commons and the AI2D dataset, and test data collected from educational slides. We have reserved all the actual educational images as a test dataset in order to ensure that the approaches using this dataset generalize well to new educational images, and potentially other domains. Furthermore, we present a baseline system using a standard deep neural architecture and discuss dealing with the challenge of limited training data.

翻訳日:2023-01-08 12:46:33 公開日:2020-01-19

# RGB-DオドメトリーとSLAM

RGB-D Odometry and SLAM ( http://arxiv.org/abs/2001.06875v1 )

ライセンス: Link先を確認

Javier Civera and Seong Hun Lee

(参考訳) 現代のRGB-Dセンサーの出現は、ロボット工学、拡張現実(AR)、そして3Dスキャンを含む多くのアプリケーション分野に大きな影響を与えた。低コストで低消費電力で、LiDARのような従来のレンジセンサーの代替品である。さらに、RGBカメラとは異なり、RGB-Dセンサーは3Dシーン再構成のためのフレーム単位の三角測量の必要性を取り除く追加の深度情報を提供する。これらのメリットは、モバイルロボティクスとarで非常に人気があり、エゴモーションと3dシーン構造を見積もることが非常に興味深い。このような空間的理解により、ロボットは衝突することなく自律的に移動でき、ユーザーは画像ストリームに一貫性のある仮想エンティティを挿入できる。本章では,RGB-Dストリーム入力を用いたオードメトリと同時局所化とマッピング(略称SLAM)の共通定式化について述べる。前者はシーンのローカルマップに対するインクリメンタルカメラの動きを追跡することを目的としており、後者はカメラ軌跡とグローバルマップを一貫性を持って共同で推定することを目的としている。どちらの場合でも、標準手法は非線形最適化技術を用いてコスト関数を最小化する。本章は3つの主要な部分から構成される: 第一部では、オドメトリーとSLAMの基本概念を紹介し、RGB-Dセンサーの使用を動機づける。また,多くのオドメトリーおよびSLAMアルゴリズムに関係した数学的予備次数を与える。第2部では,カメラポーズ追跡,シーンマッピング,ループクローズという,slamシステムの3つの主要コンポーネントについて詳述する。各コンポーネントについて、文献で提案される様々なアプローチについて述べる。最終部では,先進的な研究トピックに関する簡単な議論と最先端技術への言及について述べる。

The emergence of modern RGB-D sensors has had a significant impact on many application fields, including robotics, augmented reality (AR) and 3D scanning. They are low-cost, low-power and compact alternatives to traditional range sensors such as LiDAR. Moreover, unlike RGB cameras, RGB-D sensors provide additional depth information that removes the need for frame-by-frame triangulation for 3D scene reconstruction. These merits have made them very popular in mobile robotics and AR, where it is of great interest to estimate ego-motion and 3D scene structure. Such spatial understanding can enable robots to navigate autonomously without collisions and allow users to insert virtual entities consistent with the image stream. In this chapter, we review common formulations of odometry and Simultaneous Localization and Mapping (known by its acronym SLAM) using RGB-D stream input. The two topics are closely related, as the former aims to track the incremental camera motion with respect to a local map of the scene, and the latter to jointly estimate the camera trajectory and the global map with consistency. In both cases, the standard approaches minimize a cost function using nonlinear optimization techniques. This chapter consists of three main parts: In the first part, we introduce the basic concepts of odometry and SLAM and motivate the use of RGB-D sensors. We also give mathematical preliminaries relevant to most odometry and SLAM algorithms. In the second part, we detail the three main components of SLAM systems: camera pose tracking, scene mapping and loop closing. For each component, we describe different approaches proposed in the literature. In the final part, we provide a brief discussion of advanced research topics with references to the state of the art.

翻訳日:2023-01-08 12:46:15 公開日:2020-01-19

# 単眼腹腔鏡トレーニングにおける拡張現実によるサチューリング

Towards Augmented Reality-based Suturing in Monocular Laparoscopic Training ( http://arxiv.org/abs/2001.06894v1 )

ライセンス: Link先を確認

Chandrakanth Jayachandran Preetha, Jonathan Kloss, Fabian Siegfried Wehrtmann, Lalith Sharan, Carolyn Fan, Beat Peter Müller-Stich, Felix Nickel, Sandy Engelhardt

(参考訳) 低侵襲手術(MIS)技術は、回復時間短縮や術後の副作用の減少など重要な臨床効果を提供するため、外科医の間で急速に普及している。しかし、従来の内視鏡システムは、深度知覚、空間方向、視野を損なう単眼映像を出力する。これらの状況下で実行される最も複雑なタスクの1つである。この作業の重要な要素は針ホルダーと手術針の間の相互作用である。針と楽器のリアルタイムの信頼性の高い3次元局在化は、推定された針面とその回転中心と機器の関係など、その量的幾何学的関係を記述するパラメータを追加してシーンを増強するために用いられる。これは基本的なスキルと手術技術の標準化と訓練に寄与し、手術全体のパフォーマンスを高め、合併症のリスクを軽減できる。本論文は,シリコーンパッド上での腹腔鏡視訓練結果を改善するために,定量的・定性的な視覚的表現を伴う拡張現実環境を提案する。これはマルチクラスセグメンテーションと深度マップ予測を実行するマルチタスク教師付きディープニューラルネットワークによって実現されている。深度マップとセグメンテーションマップを生成するための外科訓練シナリオに似た仮想環境を構築することで,ラベルの空洞化が克服されている。提案する畳み込みニューラルネットワークは実際の手術訓練シナリオでテストされ,針の閉塞に頑健であることが判明した。本発明のネットワークは、手術針分割用ダイススコア0.67、針ホルダ計器セグメンテーション0.81、深さ推定用平均絶対誤差6.5mmを達成する。

Minimally Invasive Surgery (MIS) techniques have gained rapid popularity among surgeons since they offer significant clinical benefits including reduced recovery time and diminished post-operative adverse effects. However, conventional endoscopic systems output monocular video which compromises depth perception, spatial orientation and field of view. Suturing is one of the most complex tasks performed under these circumstances. A key component of this task is the interplay between the needle holder and the surgical needle. Reliable 3D localization of needle and instruments in real time could be used to augment the scene with additional parameters that describe their quantitative geometric relation, e.g. the relation between the estimated needle plane and its rotation center and the instrument. This could contribute towards standardization and training of basic skills and operative techniques, enhance overall surgical performance, and reduce the risk of complications. The paper proposes an Augmented Reality environment with quantitative and qualitative visual representations to enhance laparoscopic training outcomes performed on a silicone pad. This is enabled by a multi-task supervised deep neural network which performs multi-class segmentation and depth map prediction. Scarcity of labels has been overcome by creating a virtual environment which resembles the surgical training scenario to generate dense depth maps and segmentation maps. The proposed convolutional neural network was tested on real surgical training scenarios and was shown to be robust to occlusion of the needle. The network achieves a Dice score of 0.67 for surgical needle segmentation, 0.81 for needle holder instrument segmentation and a mean absolute error of 6.5 mm for depth estimation.
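For readers unfamiliar with the two metrics reported above, the following minimal sketch shows how a Dice score and a mean absolute error are commonly computed. It uses flat Python lists instead of image arrays, and the handling of two empty masks is a convention assumed here; the abstract does not describe the authors' evaluation code, so the function names and inputs are illustrative only.

```python
def dice_score(pred_mask, true_mask):
    """Dice coefficient 2|A ∩ B| / (|A| + |B|) for binary masks given as 0/1 sequences."""
    if len(pred_mask) != len(true_mask):
        raise ValueError("masks must have the same length")
    intersection = sum(p * t for p, t in zip(pred_mask, true_mask))
    total = sum(pred_mask) + sum(true_mask)
    # Assumed convention: two empty masks count as a perfect match.
    return 1.0 if total == 0 else 2.0 * intersection / total


def mean_absolute_error(pred_depth, true_depth):
    """Mean absolute error, in the same unit as the inputs (e.g. millimetres)."""
    if len(pred_depth) != len(true_depth) or not pred_depth:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(p - t) for p, t in zip(pred_depth, true_depth)) / len(pred_depth)


if __name__ == "__main__":
    print(dice_score([1, 1, 0, 0], [1, 0, 0, 0]))          # overlap of one pixel
    print(mean_absolute_error([10.0, 20.0], [12.0, 17.0]))  # depths in mm
```

A Dice score of 1.0 means the predicted and reference masks coincide exactly, and 0.0 means they do not overlap at all.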

翻訳日:2023-01-08 12:45:49 公開日:2020-01-19

# 時間的ドメインに基づく社会的影響予測へのアプローチ

An Approach for Time-aware Domain-based Social Influence Prediction ( http://arxiv.org/abs/2001.07838v1 )

ライセンス: Link先を確認

Bilal Abu-Salih, Kit Yan Chan, Omar Al-Kadi, Marwan Al-Tawil, Pornpit Wongthongtham, Tomayess Issa, Heba Saadeh, Malak Al-Hassan, Bushra Bremie, Abdulaziz Albahlal

(参考訳) オンラインソーシャルネットワーク(OSN)は、人々がさまざまな状況や領域で意見、関心、考えを表現できる仮想プラットフォームを確立し、正当なユーザーだけでなく、スパマーや他の信頼できないユーザーがコンテンツを公開し、広めることを可能にする。そのため、社会信頼の概念は情報処理・データサイエンティスト・情報消費者・企業から注目を集めている。ソーシャルビッグデータ(SBD)の価値を取得する主な理由の1つは、OSNのユーザの信頼性を評価することのできるフレームワークと方法論を提供することである。これらのアプローチは、大規模ソーシャルデータに対応するためにスケーラブルであるべきです。したがって、分析プロセスを改善し、拡張し、SBDの信頼性を推測するために、社会的信頼を十分に理解する必要がある。露出した環境の設定とosnに関する制限の少なさを考えると、mediumは正当で本物のユーザーだけでなく、スパマーや他の信頼性の低いユーザーもコンテンツを公開し、広めることができる。そこで本稿では,意味分析と機械学習モジュールを用いて,異なる時間領域におけるユーザの信頼性を計測し,予測する手法を提案する。実験の結果,組み込まれた機械学習技術の適用性を評価し,信頼性の高いドメインベースユーザを予測する。

Online Social Networks (OSNs) have established virtual platforms enabling people to express their opinions, interests and thoughts in a variety of contexts and domains, allowing legitimate users as well as spammers and other untrustworthy users to publish and spread their content. Hence, the concept of social trust has attracted the attention of information processors/data scientists and information consumers/business firms. One of the main reasons for acquiring the value of Social Big Data (SBD) is to provide frameworks and methodologies with which the credibility of OSN users can be evaluated. These approaches should be scalable to accommodate large-scale social data. Hence, there is a need for a thorough understanding of social trust to improve and expand the analysis process and to infer the credibility of SBD. Given the exposed environment's settings and the fewer limitations related to OSNs, the medium allows legitimate and genuine users as well as spammers and other less trustworthy users to publish and spread their content. Hence, this paper presents an approach that incorporates semantic analysis and machine learning modules to measure and predict users' trustworthiness in numerous domains in different time periods. The evaluation of the conducted experiment validates the applicability of the incorporated machine learning techniques to predict highly trustworthy domain-based users.

翻訳日:2023-01-08 12:45:20 公開日:2020-01-19

# SQLFlow:SQLと機械学習の橋渡し

SQLFlow: A Bridge between SQL and Machine Learning ( http://arxiv.org/abs/2001.06846v1 )

ライセンス: Link先を確認

Yi Wang, Yang Yang, Weiguo Zhu, Yi Wu, Xu Yan, Yongfeng Liu, Yu Wang, Liang Xie, Ziyao Gao, Wenjing Zhu, Xiang Chen, Wei Yan, Mingjie Tang, Yuan Tang

(参考訳) 産業用AIシステムは、主にエンドツーエンドの機械学習(ML)ワークフローである。典型的なレコメンデーションまたはビジネスインテリジェンスシステムには、多くのオンラインマイクロサービスとオフラインジョブが含まれる。このようなワークフローをSQLで効率的に開発するためのSQLFlowについて説明する。 SQLを使うことで、開発者は目的(何)と手順(方法)を無視した短いプログラムを書くことができる。以前のデータベースシステムは、MLをサポートするためにSQL方言を拡張した。 SQLFlow(https://sqlflow.org/sqlflow )は、MySQL、Apache Hive、Alibaba MaxCompute、TensorFlow、XGBoost、Scikit-learnといったMLエンジンなど、さまざまなデータベースシステムのブリッジとして機能する別の戦略を採用している。 SQLの構文を慎重に拡張して、さまざまなSQL方言で拡張を動作させました。我々は,協調構文解析アルゴリズムを考案して拡張を実装した。 SQLFlowは、教師付き、教師なしの学習、深いネットワークとツリーモデル、トレーニングと予測に加えて視覚モデルの説明、MLに加えてデータ処理と機能抽出など、さまざまなMLテクニックに対して効率的で表現力がある。 SQLFlowは、フォールトトレラントな実行とオンプレミスデプロイメントのために、SQLプログラムをKubernetesネイティブワークフローにコンパイルする。現在の産業ユーザはAnt Financial、DiDi、Alibaba Groupなどだ。

Industrial AI systems are mostly end-to-end machine learning (ML) workflows. A typical recommendation or business intelligence system includes many online micro-services and offline jobs. We describe SQLFlow for developing such workflows efficiently in SQL. SQL enables developers to write short programs focusing on the purpose (what) and ignoring the procedure (how). Previous database systems extended their SQL dialect to support ML. SQLFlow (https://sqlflow.org/sqlflow) takes another strategy to work as a bridge over various database systems, including MySQL, Apache Hive, and Alibaba MaxCompute, and ML engines like TensorFlow, XGBoost, and scikit-learn. We extended SQL syntax carefully to make the extension work with various SQL dialects. We implement the extension by inventing a collaborative parsing algorithm. SQLFlow is efficient and expressive for a wide variety of ML techniques -- supervised and unsupervised learning; deep networks and tree models; visual model explanation in addition to training and prediction; data processing and feature extraction in addition to ML. SQLFlow compiles a SQL program into a Kubernetes-native workflow for fault-tolerant execution and on-cloud deployment. Current industrial users include Ant Financial, DiDi, and Alibaba Group.

翻訳日:2023-01-08 12:44:59 公開日:2020-01-19

# より効率的かつ効果的な推論に向けて:多人数共同決定

Towards More Efficient and Effective Inference: The Joint Decision of Multi-Participants ( http://arxiv.org/abs/2001.06774v1 )

ライセンス: Link先を確認

Hui Zhu, Zhulin An, Kaiqiang Xu, Xiaolong Hu, Yongjun Xu

(参考訳) 局所的なアーキテクチャを最適化したり、ネットワークを深くすることで畳み込みニューラルネットワークの性能を向上させる既存のアプローチは、モデルのサイズを大幅に増加させる傾向がある。需要の高いエッジデバイスにニューラルネットワークをデプロイして適用するためには、ネットワークの規模を縮小することが非常に重要です。しかし、ネットワークを圧縮して画像処理の性能を低下させることは容易である。本稿では,エッジデバイスに適した推論手法を提案する。多層および多層ネットワークを主成分とする多成分の結合決定は、従来の畳み込みニューラルネットワークのパラメータの合計数と同等で、より高い分類精度(cifar-10では0.26%、cifar-100では4.49%)を達成することができる。

Existing approaches to improving the performance of convolutional neural networks by optimizing the local architectures or deepening the networks tend to increase the size of models significantly. In order to deploy and apply the neural networks to edge devices, which are in great demand, reducing the scale of networks is crucial. However, it is easy to degrade the performance of image processing by compressing the networks. In this paper, we propose a method which is suitable for edge devices while improving the efficiency and effectiveness of inference. The joint decision of multi-participants, mainly comprising multi-layers and multi-networks, can achieve higher classification accuracy (at most 0.26% on CIFAR-10 and 4.49% on CIFAR-100) with a similar total number of parameters for classical convolutional neural networks.
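One simple way to picture a joint decision among several participants is to average the class probabilities produced by several classifier heads (for example, exits at different layers or separate small networks) and pick the class with the highest mean. The abstract does not state the combination rule the authors use, so the unweighted average below is an assumption, and `joint_decision` and the example logits are hypothetical.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def joint_decision(participant_logits):
    """Average the class probabilities of all participants (assumed rule).

    participant_logits: one list of logits per participant, all of equal length.
    Returns (predicted_class_index, averaged_probabilities).
    """
    if not participant_logits:
        raise ValueError("at least one participant is required")
    probabilities = [softmax(logits) for logits in participant_logits]
    n_classes = len(probabilities[0])
    averaged = [
        sum(p[c] for p in probabilities) / len(probabilities)
        for c in range(n_classes)
    ]
    predicted = max(range(n_classes), key=averaged.__getitem__)
    return predicted, averaged


if __name__ == "__main__":
    # Three hypothetical participants voting over three classes.
    label, probs = joint_decision([[2.0, 1.0, 0.0], [0.0, 3.0, 0.0], [0.5, 2.0, 0.0]])
    print(label, probs)
```

In the example, the first participant alone would choose class 0, but the averaged decision follows the other two participants and selects class 1.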

翻訳日:2023-01-08 12:39:47 公開日:2020-01-19

# 人間の解析のための学習構成型ニューラル情報融合

Learning Compositional Neural Information Fusion for Human Parsing ( http://arxiv.org/abs/2001.06804v1 )

ライセンス: Link先を確認

Wenguan Wang, Zhijie Zhang, Siyuan Qi, Jianbing Shen, Yanwei Pang, and Ling Shao

(参考訳) この研究は、ニューラルネットワークと人体の構成階層を組み合わせることで、効率的で完全な人間の解析を行うことを提案する。我々はこのアプローチを神経情報融合フレームワークとして定式化する。本モデルでは,階層上の3つの推論プロセスから情報を収集する。直接推論(画像情報を用いて人体の各部分を直接予測する),ボトムアップ推論(構成部品から知識を組み立てる),トップダウン推論(親ノードからコンテキストを推定する)である。ボトムアップとトップダウンの推論は、それぞれ人体の構成関係と分解関係をモデル化する。さらに、複数のソース情報の融合を入力、すなわちソースの信頼度を推定し、考慮することで条件付けする。モデル全体がエンドツーエンドで微分可能で、情報の流れや構造を明示的にモデル化します。提案手法は4つの一般的なデータセットに対して広範に評価され,高速な処理速度を23fpsで実現した。この方向への今後の研究を容易にするため、コードと結果がリリースされました。

This work proposes to combine neural networks with the compositional hierarchy of human bodies for efficient and complete human parsing. We formulate the approach as a neural information fusion framework. Our model assembles the information from three inference processes over the hierarchy: direct inference (directly predicting each part of a human body using image information), bottom-up inference (assembling knowledge from constituent parts), and top-down inference (leveraging context from parent nodes). The bottom-up and top-down inferences explicitly model the compositional and decompositional relations in human bodies, respectively. In addition, the fusion of multi-source information is conditioned on the inputs, i.e., by estimating and considering the confidence of the sources. The whole model is end-to-end differentiable, explicitly modeling information flows and structures. Our approach is extensively evaluated on four popular datasets, outperforming the state of the art in all cases, with a fast processing speed of 23fps. Our code and results have been released to help ease future research in this direction.

翻訳日:2023-01-08 12:39:31 公開日:2020-01-19

# 注意グラフニューラルネットワークによるゼロショットビデオオブジェクトセグメンテーション

Zero-Shot Video Object Segmentation via Attentive Graph Neural Networks ( http://arxiv.org/abs/2001.06807v1 )

ライセンス: Link先を確認

Wenguan Wang, Xiankai Lu, Jianbing Shen, David Crandall, and Ling Shao

(参考訳) 本研究では、ゼロショットビデオオブジェクトセグメンテーション(ZVOS)のための新しい注意グラフニューラルネットワーク(AGNN)を提案する。提案されたAGNNは、このタスクをビデオグラフ上で反復的な情報融合のプロセスとして再放送する。特にAGNNは、フレームをノードとして効率的に表現し、任意のフレームペア間の関係をエッジとして、完全に連結されたグラフを構築している。基礎となる対関係は微分可能な注意機構によって記述される。パラメトリックメッセージパッシングにより、AGNNはビデオフレーム間のよりリッチで高次な関係を効果的に捉え、マイニングすることができ、それによってビデオ内容のより完全な理解とより正確なフォアグラウンド推定が可能になる。 3つのビデオセグメンテーションデータセットの実験結果は、agnnがそれぞれのケースで新しい最先端を設定することを示している。我々は、このフレームワークの一般化可能性をさらに示すために、AGNNを次のタスクに拡張する: Image Object Co-segmentation (IOCS)。我々は2つのIOCSデータセットで実験を行い、AGNNモデルの優越性を再び観察する。広範な実験により、AGNNはビデオフレームや関連画像間のセマンティック/出現関係を学習し、共通のオブジェクトを発見することができる。

This work proposes a novel attentive graph neural network (AGNN) for zero-shot video object segmentation (ZVOS). The suggested AGNN recasts this task as a process of iterative information fusion over video graphs. Specifically, AGNN builds a fully connected graph to efficiently represent frames as nodes, and relations between arbitrary frame pairs as edges. The underlying pair-wise relations are described by a differentiable attention mechanism. Through parametric message passing, AGNN is able to efficiently capture and mine much richer and higher-order relations between video frames, thus enabling a more complete understanding of video content and more accurate foreground estimation. Experimental results on three video segmentation datasets show that AGNN sets a new state-of-the-art in each case. To further demonstrate the generalizability of our framework, we extend AGNN to an additional task: image object co-segmentation (IOCS). We perform experiments on two famous IOCS datasets and observe again the superiority of our AGNN model. The extensive experiments verify that AGNN is able to learn the underlying semantic/appearance relationships among video frames or related images, and discover the common objects.

翻訳日:2023-01-08 12:39:13 公開日:2020-01-19

# 監視されていないビデオオブジェクトのセグメンテーションとコ・アテンション・シームズ・ネットワーク

See More, Know More: Unsupervised Video Object Segmentation with Co-Attention Siamese Networks ( http://arxiv.org/abs/2001.06810v1 )

ライセンス: Link先を確認

Xiankai Lu, Wenguan Wang, Chao Ma, Jianbing Shen, Ling Shao, and Fatih Porikli

(参考訳) 本稿では,Co-attention Siamese Network (COSNet) と呼ばれる新しいネットワークを導入し,包括的視点から,教師なしビデオオブジェクトのセグメンテーションタスクに対処する。我々は,映像フレーム間の固有相関の重要性を強調し,短時間の時間セグメントにおける外見や動きに対する差別的前景表現の学習に重点を置いた,最先端のディープラーニングベースのソリューションを改善するためのグローバルなコアテンション機構を取り入れた。ネットワーク内のコアテンション層は,共同計算と共同特徴空間へのコアテンション応答の付加により,グローバルな相関関係とシーンコンテキストを捕捉するための効率的かつ有能な段階を提供する。 COSNetをビデオフレームのペアでトレーニングし、トレーニングデータを自然に強化し、学習能力を向上します。セグメンテーション段階において、コアテンションモデルは、複数の参照フレームを一緒に処理することで有用な情報を符号化し、頻繁に出現し、より健全なフォアグラウンドオブジェクトを推測する。ビデオ内のリッチなコンテキストをマイニングするために,さまざまなコアテンション変種を導出できる統一的でエンドツーエンドのトレーニング可能なフレームワークを提案する。 3つの大きなベンチマークに関する大規模な実験では、COSNetが現在の選択肢よりも大きなマージンで優れています。

We introduce a novel network, called CO-attention Siamese Network (COSNet), to address the unsupervised video object segmentation task from a holistic view. We emphasize the importance of inherent correlation among video frames and incorporate a global co-attention mechanism to further improve the state-of-the-art deep learning based solutions that primarily focus on learning discriminative foreground representations over appearance and motion in short-term temporal segments. The co-attention layers in our network provide efficient and competent stages for capturing global correlations and scene context by jointly computing and appending co-attention responses into a joint feature space. We train COSNet with pairs of video frames, which naturally augments training data and allows increased learning capacity. During the segmentation stage, the co-attention model encodes useful information by processing multiple reference frames together, which is leveraged to better infer the frequently reappearing and salient foreground objects. We propose a unified and end-to-end trainable framework where different co-attention variants can be derived for mining the rich context within videos. Our extensive experiments over three large benchmarks demonstrate that COSNet outperforms the current alternatives by a large margin.

翻訳日:2023-01-08 12:38:54 公開日:2020-01-19

# ヒューマン・アウェア・モーション・デブロアリング

Human-Aware Motion Deblurring ( http://arxiv.org/abs/2001.06816v1 )

ライセンス: Link先を確認

Ziyi Shen, Wenguan Wang, Xiankai Lu, Jianbing Shen, Haibin Ling, Tingfa Xu, and Ling Shao

(参考訳) 本稿では,前景(FG)と背景(BG)との間に動きのぼかしをアンタングルする人間認識型デブロアリングモデルを提案する。提案モデルは三分岐エンコーダデコーダアーキテクチャに基づいている。最初の2つの分枝はそれぞれFGの人間とBGの細部を鮮明化するために学習され、第3の分枝は2つの領域からのマルチスケールなデブラリング情報を包括的に融合することにより、グローバルかつ調和的な結果を生み出す。提案モデルは, エンド・ツー・エンド方式で, 教師付き, 人間対応の注意機構を付与する。 FGの人間の情報をエンコードするソフトマスクを学習し、FG/BGデコーダブランチを明示的に駆動して特定のドメインに集中する。さらに,人間を認識できる画像デブラリングの研究に資するため,8,422個のぼやけた画像と鮮明な画像のペアと65,784個のFG人間バウンディングボックスからなるHIDEという大規模データセットを導入する。 HIDEは、広い範囲のシーン、人間のオブジェクトのサイズ、動きのパターン、背景の複雑さにまたがるように設計されている。公開ベンチマークとデータセットに関する広範な実験により,我々のモデルは,特にセマンティクス詳細の把握において,最先端のモーションデブラリング手法に対して好適に機能することが示された。

This paper proposes a human-aware deblurring model that disentangles the motion blur between foreground (FG) humans and background (BG). The proposed model is based on a triple-branch encoder-decoder architecture. The first two branches are learned for sharpening FG humans and BG details, respectively; while the third one produces global, harmonious results by comprehensively fusing multi-scale deblurring information from the two domains. The proposed model is further endowed with a supervised, human-aware attention mechanism in an end-to-end fashion. It learns a soft mask that encodes FG human information and explicitly drives the FG/BG decoder-branches to focus on their specific domains. To further benefit the research towards Human-aware Image Deblurring, we introduce a large-scale dataset, named HIDE, which consists of 8,422 blurry and sharp image pairs with 65,784 densely annotated FG human bounding boxes. HIDE is specifically built to span a broad range of scenes, human object sizes, motion patterns, and background complexities. Extensive experiments on public benchmarks and our dataset demonstrate that our model performs favorably against the state-of-the-art motion deblurring methods, especially in capturing semantic details.

翻訳日:2023-01-08 12:38:09 公開日:2020-01-19

# セマンティックセグメンテーションのためのゲートパス選択ネットワーク

Gated Path Selection Network for Semantic Segmentation ( http://arxiv.org/abs/2001.06819v1 )

ライセンス: Link先を確認

Qichuan Geng, Hong Zhang, Xiaojuan Qi, Ruigang Yang, Zhong Zhou, Gao Huang

(参考訳) セマンティクスのセグメンテーションは、大規模なバリエーション、変形、異なる視点を扱う必要がある困難なタスクである。本稿では,適応受容場を学習することを目的とした新しいネットワークgated path selection network(gpsnet)を開発した。 GPSNetにおいて、我々はまず2次元のマルチスケールネットワーク、SuperNetを設計する。望ましいセマンティックコンテキストを動的に選択するために、さらにゲート予測モジュールを導入する。通常のグリッド上のサンプル位置の最適化に重点を置く以前の研究とは対照的に、GPSNetは自由形式の密接なセマンティックコンテキストを適応的にキャプチャすることができる。導出された適応受容場はデータ依存であり、異なるオブジェクト幾何学変換をモデル化できる柔軟性がある。都市景観とADE20Kの2つの代表的なセマンティックセマンティックセグメンテーションデータセットにおいて,提案手法が従来手法より一貫して優れ,ベルやホイッスルを使わずに競争性能を達成することを示す。

Semantic segmentation is a challenging task that needs to handle large scale variations, deformations and different viewpoints. In this paper, we develop a novel network named Gated Path Selection Network (GPSNet), which aims to learn adaptive receptive fields. In GPSNet, we first design a two-dimensional multi-scale network, SuperNet, which densely incorporates features from growing receptive fields. To dynamically select desirable semantic context, a gate prediction module is further introduced. In contrast to previous works that focus on optimizing sample positions on the regular grids, GPSNet can adaptively capture free-form dense semantic contexts. The derived adaptive receptive fields are data-dependent and flexible enough to model different object geometric transformations. On two representative semantic segmentation datasets, i.e., Cityscapes and ADE20K, we show that the proposed approach consistently outperforms previous methods and achieves competitive performance without bells and whistles.

翻訳日:2023-01-08 12:37:46 公開日:2020-01-19

# FIS-Nets:単眼深度推定のためのフルイメージ監視ネットワーク

FIS-Nets: Full-image Supervised Networks for Monocular Depth Estimation ( http://arxiv.org/abs/2001.11092v1 )

ライセンス: Link先を確認

Bei Wang and Jianping An

(参考訳) 本稿では,単眼深度推定における全画像監視の重要性について述べる。画像整合性を利用する教師なしのフレームワークと、深い深度補完を行う教師なしのフレームワークを組み合わせた半教師付きアーキテクチャを提案する。後者は、前者の監督としてフルイメージの深さを提供する。ナビゲーションシステムからのエゴモーションは、内部時間変換ネットワークの出力監視として教師なしのフレームワークにも組み込まれ、単眼深度推定をより良くする。本評価では,提案手法が他の深度推定手法よりも優れていることを示す。

This paper addresses the importance of full-image supervision for monocular depth estimation. We propose a semi-supervised architecture, which combines an unsupervised framework that uses image consistency with a supervised framework for dense depth completion. The latter provides full-image depth as supervision for the former. Ego-motion from a navigation system is also embedded into the unsupervised framework as output supervision of an inner temporal transform network, which improves monocular depth estimation. In the evaluation, we show that our proposed model outperforms other approaches on depth estimation.

翻訳日:2023-01-08 12:36:08 公開日:2020-01-19

# ライドシェアリングによるユーザエクスペリエンス向上を実現するWeakly Supervised Learning

Weakly Supervised Learning Meets Ride-Sharing User Experience Enhancement ( http://arxiv.org/abs/2001.09027v1 )

ライセンス: Link先を確認

Lan-Zhe Guo, Feng Kuang, Zhang-Xun Liu, Yu-Feng Li, Nan Ma, Xiao-Hu Qie

(参考訳) 弱教師付き学習は、ラベル付きデータの不足に対処することを目的としている。従来の弱い教師付き研究では、データに弱い監督が1つしかないと仮定している。しかし、多くのアプリケーションでは、生データは通常、複数の弱い監督を同時に含む。例えば、最大規模のオンラインライドシェアリングプラットフォームであるDidiのユーザエクスペリエンス向上において、ライドコメントデータは(乗客の主観的要因による)ラベルノイズと(サンプリングバイアスによる)ラベル分布バイアスを含む。このような問題を「弱教師付き学習」と呼んでいる。本稿では,didiの配車コメントデータに基づいてこの問題に対処するためのcwsl手法を提案する。具体的には, 有害な雑音の重み付けが小さいコメントデータにおいて, ラベルノイズに対処するために, インスタンス再重み付け戦略を用いる。精度よりもAUCのようなロバストな基準と検証性能はバイアスデータラベルの修正に最適化されている。代用最適化と確率勾配法は大規模データの最適化を加速する。 Didiのライドシェアリングコメントデータの実験は、その有効性を明確に検証した。この研究が、複雑な実環境に弱い教師付き学習を適用することに光を当てることを望む。

Weakly supervised learning aims at coping with scarce labeled data. Previous weakly supervised studies typically assume that there is only one kind of weak supervision in data. In many applications, however, raw data usually contains more than one kind of weak supervision at the same time. For example, in user experience enhancement from Didi, one of the largest online ride-sharing platforms, the ride comment data contains severe label noise (due to the subjective factors of passengers) and severe label distribution bias (due to the sampling bias). We call such a problem "compound weakly supervised learning". In this paper, we propose the CWSL method to address this problem based on Didi ride-sharing comment data. Specifically, an instance reweighting strategy is employed to cope with severe label noise in comment data, where the weights for harmful noisy instances are small. Robust criteria like AUC rather than accuracy and the validation performance are optimized for the correction of biased data labels. Alternating optimization and stochastic gradient methods accelerate the optimization on large-scale data. Experiments on Didi ride-sharing comment data clearly validate the effectiveness. We hope this work may shed some light on applying weakly supervised learning to complex real situations.
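The two ingredients named in the abstract, instance reweighting and AUC as an evaluation criterion, can each be sketched in a few lines. The code below is a generic illustration and not the CWSL method: the weights are supplied by hand rather than learned, and the function names `weighted_mean_loss` and `auc` are chosen here for illustration.

```python
def weighted_mean_loss(losses, weights):
    """Average per-instance losses, down-weighting instances believed to be noisy."""
    if len(losses) != len(weights):
        raise ValueError("losses and weights must have the same length")
    total_weight = sum(weights)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")
    return sum(w * l for w, l in zip(weights, losses)) / total_weight


def auc(scores, labels):
    """Area under the ROC curve: the fraction of (positive, negative) pairs
    in which the positive instance is scored higher (ties count as half)."""
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("both classes must be present")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in positives
        for n in negatives
    )
    return wins / (len(positives) * len(negatives))


if __name__ == "__main__":
    # A suspected noisy instance (second loss) gets weight 0 and is ignored.
    print(weighted_mean_loss([1.0, 4.0], [1.0, 0.0]))
    print(auc([0.9, 0.8, 0.3, 0.1], [1, 0, 1, 0]))
```

Because AUC depends only on how positives are ranked relative to negatives, it does not change when the proportion of the two classes in the sample changes, which is one reason it is often preferred over accuracy under label distribution bias.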

翻訳日:2023-01-08 10:14:25 公開日:2020-01-19

# 適応RBFベースサロゲートモデルを用いた不確実性定量化による支出関数の最適点の探索

Finding Optimal Points for Expensive Functions Using Adaptive RBF-Based Surrogate Model Via Uncertainty Quantification ( http://arxiv.org/abs/2001.06858v1 )

ライセンス: Link先を確認

Ray-Bing Chen, Yuan Wang, C. F. Jeff Wu

(参考訳) 高価な関数のグローバルな最適化は、物理およびコンピュータ実験において重要な応用である。各関数の評価は費用がかかり、その関数の導出情報が得られないことが多いため、効率的な最適化スキームを開発することは難しい問題である。本稿では,適応的放射基底関数(RBF)に基づく不確実性定量化による代理モデルを用いた新しいグローバル最適化フレームワークを提案する。フレームワークは2つのイテレーションステップで構成される。まずRBFに基づくベイズ代理モデルを用いて真の関数を近似し、新しい点が探索されるたびにRBFのパラメータを適応的に推定し更新することができる。そして、モデル誘導選択基準を用いて、関数評価のための候補セットから新しい点を識別する。ここで採用される選択基準は、期待される改善基準(EI)のサンプル版である。標準試験関数を用いたシミュレーション実験を行い,本手法は,特に実表面があまり滑らかでない場合において,いくつかの利点があることを示す。さらに,グローバルな最適点を同定し,高次元シナリオに対応するために,探索性能を改善するための改良手法を提案する。

Global optimization of expensive functions has important applications in physical and computer experiments. It is a challenging problem to develop an efficient optimization scheme, because each function evaluation can be costly and the derivative information of the function is often not available. We propose a novel global optimization framework using adaptive Radial Basis Functions (RBF) based surrogate model via uncertainty quantification. The framework consists of two iteration steps. It first employs an RBF-based Bayesian surrogate model to approximate the true function, where the parameters of the RBFs can be adaptively estimated and updated each time a new point is explored. Then it utilizes a model-guided selection criterion to identify a new point from a candidate set for function evaluation. The selection criterion adopted here is a sample version of the expected improvement (EI) criterion. We conduct simulation studies with standard test functions, which show that the proposed method has some advantages, especially when the true surface is not very smooth. In addition, we also propose modified approaches to improve the search performance for identifying global optimal points and to deal with higher-dimensional scenarios.
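A sample version of the expected improvement criterion can be sketched as a Monte Carlo average over draws from the surrogate model's posterior at each candidate point. The sketch below assumes a minimization problem and takes the posterior draws as given inputs; how the RBF-based Bayesian surrogate produces them is not described in the abstract, so that part is left out. The names `sample_expected_improvement` and `pick_next_point` are hypothetical.

```python
def sample_expected_improvement(posterior_draws, best_so_far):
    """Monte Carlo EI for minimization: mean of max(best_so_far - y, 0) over draws y."""
    if not posterior_draws:
        raise ValueError("at least one posterior draw is required")
    improvements = [max(best_so_far - y, 0.0) for y in posterior_draws]
    return sum(improvements) / len(improvements)


def pick_next_point(candidates, draws_per_candidate, best_so_far):
    """Return the candidate with the largest sample EI, together with its score."""
    if len(candidates) != len(draws_per_candidate) or not candidates:
        raise ValueError("need one non-empty list of draws per candidate")
    scores = [
        sample_expected_improvement(draws, best_so_far)
        for draws in draws_per_candidate
    ]
    best_index = max(range(len(candidates)), key=scores.__getitem__)
    return candidates[best_index], scores[best_index]


if __name__ == "__main__":
    # Two hypothetical candidates with posterior draws of the function value.
    point, score = pick_next_point(
        candidates=["a", "b"],
        draws_per_candidate=[[1.2, 1.1], [0.0, 2.0]],
        best_so_far=1.0,
    )
    print(point, score)
```

In the example, candidate "a" has a lower posterior mean spread but never beats the current best value of 1.0, while candidate "b" is more uncertain and sometimes improves on it, so the criterion selects "b". This is how EI trades off exploitation against exploration.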

翻訳日:2023-01-08 10:13:30 公開日:2020-01-19

# ランダム再帰木アンサンブルを用いた分類のためのメタアルゴリズム:高エネルギー物理応用

A meta-algorithm for classification using random recursive tree ensembles: A high energy physics application ( http://arxiv.org/abs/2001.06880v1 )

ライセンス: Link先を確認

Vidhi Lalchand

(参考訳) 本研究の目的は,離散二項クラスの存在下での自動分類のためのメタアルゴリズムを提案することである。重複するクラス分布の存在下での分類器学習は、機械学習において難しい問題である。重なり合うクラスは、両クラスに属する点の密度が高い特徴空間におけるあいまいな領域の存在によって記述される。これは実世界のデータセットでしばしば起こり、例えばLHC(Large Hadron Collider)のような高エネルギー加速器に由来する粒子崩壊の性質を示す数値データである。クラスオーバーラップ問題を対象とした重要な研究機関は、アンサンブル分類器を使用して、複数の段階で繰り返し、あるいは入力トレーニングデータの異なるサブセットで同じモデルの複数のコピーを使用することで、アルゴリズムの性能を向上させる。前者をブースティング(boosting)、後者をバグング(bagging)と呼ぶ。この論文で提案されたアルゴリズムは、ヒッグス発見の統計的重要性を改善する高エネルギー物理学における挑戦的な分類問題をターゲットにしている。アルゴリズムのトレーニングに使用される基礎となるデータセットは、信号生成クラスオーバーラップの統計特性を忠実に模倣する、ヒッグスイベント(信号)と異なるバックグラウンドイベント(背景)を混合した公式のATLASフル検出器シミュレーションから構築された実験データである。提案したアルゴリズムは、実験物理学において最も成功した解析手法の1つである古典的な強化決定木の変種である。このアルゴリズムは、ベージとブーピングという2つのメタラーニングテクニックを組み合わせた統合フレームワークを利用している。その結果,この組み合わせは,基礎学習者のランダム化トリックの存在下でのみ有効であることがわかった。

The aim of this work is to propose a meta-algorithm for automatic classification in the presence of discrete binary classes. Classifier learning in the presence of overlapping class distributions is a challenging problem in machine learning. Overlapping classes are described by the presence of ambiguous areas in the feature space with a high density of points belonging to both classes. This often occurs in real-world datasets; one such example is numeric data denoting properties of particle decays derived from high-energy accelerators like the Large Hadron Collider (LHC). A significant body of research targeting the class overlap problem uses ensemble classifiers to boost the performance of algorithms by using them iteratively in multiple stages or by using multiple copies of the same model on different subsets of the input training data. The former is called boosting and the latter is called bagging. The algorithm proposed in this thesis targets a challenging classification problem in high energy physics - that of improving the statistical significance of the Higgs discovery. The underlying dataset used to train the algorithm is experimental data built from the official ATLAS full-detector simulation with Higgs events (signal) mixed with different background events (background) that closely mimic the statistical properties of the signal, generating class overlap. The algorithm proposed is a variant of the classical boosted decision tree, which is known to be one of the most successful analysis techniques in experimental physics. The algorithm utilizes a unified framework that combines two meta-learning techniques - bagging and boosting. The results show that this combination only works in the presence of a randomization trick in the base learners.

翻訳日:2023-01-08 10:13:13 公開日:2020-01-19
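
The combination of bagging and boosting described in the abstract above can be sketched roughly as follows. This is a minimal illustration, not the thesis's algorithm: the base learner is a decision stump rather than a full decision tree, the "randomization trick" is assumed here to be a random feature subset per boosting round (the abstract does not say which trick is used), and every function name (`fit_stump`, `boost`, `bagged_boost`, `predict`) is hypothetical. Labels are assumed to be encoded as -1 and +1.

```python
import numpy as np

def fit_stump(X, y, w):
    """Find the weighted-error-minimizing stump (feature, threshold, polarity)."""
    best = (np.inf, 0, 0.0, 1)
    for j in range(X.shape[1]):
        for t in np.unique(X[:, j]):
            for s in (1, -1):
                pred = np.where(s * (X[:, j] - t) > 0, 1, -1)
                err = w[pred != y].sum()
                if err < best[0]:
                    best = (err, j, t, s)
    return best

def boost(X, y, rounds, rng, n_feat):
    """AdaBoost over stumps; each round sees only a random feature subset."""
    w = np.full(len(y), 1.0 / len(y))
    model = []
    for _ in range(rounds):
        # Assumed randomization step: restrict the stump to random features.
        feats = rng.choice(X.shape[1], size=n_feat, replace=False)
        err, j, t, s = fit_stump(X[:, feats], y, w)
        j = int(feats[j])  # map back to the original feature index
        err = float(np.clip(err, 1e-10, 1 - 1e-10))
        alpha = 0.5 * np.log((1 - err) / err)
        pred = np.where(s * (X[:, j] - t) > 0, 1, -1)
        w = w * np.exp(-alpha * y * pred)  # up-weight misclassified points
        w /= w.sum()
        model.append((alpha, j, t, s))
    return model

def boost_score(model, X):
    """Weighted vote of all stumps in one boosted model."""
    return sum(a * np.where(s * (X[:, j] - t) > 0, 1, -1) for a, j, t, s in model)

def bagged_boost(X, y, n_bags=5, rounds=10, n_feat=1, seed=0):
    """Bagging: train one boosted model per bootstrap resample."""
    rng = np.random.default_rng(seed)
    models = []
    for _ in range(n_bags):
        idx = rng.integers(0, len(y), size=len(y))  # sample with replacement
        models.append(boost(X[idx], y[idx], rounds, rng, n_feat))
    return models

def predict(models, X):
    """Sum the boosted scores of all bags and take the sign."""
    score = sum(boost_score(m, X) for m in models)
    return np.where(score >= 0, 1, -1)
```

Calling `bagged_boost(X, y)` on an array `X` of shape `(n_samples, n_features)` and then `predict(models, X)` returns one label in {-1, +1} per row.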

# 運動学的に選別されたACGAN合成レーダマイクロドップラーシグネチャを用いた動作分類

Motion Classification using Kinematically Sifted ACGAN-Synthesized Radar Micro-Doppler Signatures ( http://arxiv.org/abs/2001.08582v1 )

ライセンス: Link先を確認

Baris Erol, Sevgi Zubeyde Gurbuz, Moeness G. Amin

(参考訳) ディープニューラルネットワーク(dnn)は最近、レーダーベースのヒューマンアクティビティ認識、スマートホーム、生活支援、バイオメディシンなど、レーダーリターンの分類を必要とするアプリケーションで注目を集めている。しかし,レーダーデータ収集に必要な人的コストや資源が高すぎるため,十分な規模のトレーニングデータセットの取得は依然として大変な作業である。本稿では,様々な環境に適応した合成レーダマイクロドップラーシグネチャを生成するための,逆学習への拡張アプローチを提案する。合成データは,視覚的解釈,キネマティック一貫性の分析,データの多様性,潜伏空間の次元,塩分マップを用いて評価する。合成シグネチャが物理的に可能な人間の動作と一致していることを保証するために, 原理成分分析 (pca) に基づくキネマティックシフティングアルゴリズムが導入された。合成データセットは、19層ディープ畳み込みニューラルネットワーク(DCNN)をトレーニングし、敵ネットワークに供給されたデータセットとは異なる環境から取得したマイクロドップラーシグネチャを分類する。全体的な精度93%は、複数のアスペクトアングル(0デグ、30デグ、45デグ、60デグ)を含むデータセット上で達成され、キネマティックなシフティングの結果、9%改善されている。

Deep neural networks (DNNs) have recently received vast attention in applications requiring classification of radar returns, including radar-based human activity recognition for security, smart homes, assisted living, and biomedicine. However, acquiring a sufficiently large training dataset remains a daunting task due to the high human costs and resources required for radar data collection. In this paper, an extended approach to adversarial learning is proposed for generation of synthetic radar micro-Doppler signatures that are well-adapted to different environments. The synthetic data is evaluated using visual interpretation, analysis of kinematic consistency, data diversity, dimensions of the latent space, and saliency maps. A principal component analysis (PCA) based kinematic-sifting algorithm is introduced to ensure that synthetic signatures are consistent with physically possible human motions. The synthetic dataset is used to train a 19-layer deep convolutional neural network (DCNN) to classify micro-Doppler signatures acquired from an environment different from that of the dataset supplied to the adversarial network. An overall accuracy of 93% is achieved on a dataset that contains multiple aspect angles (0 deg., 30 deg., 45 deg., and 60 deg.), with a 9% improvement as a result of kinematic sifting.

翻訳日:2023-01-08 10:12:46 公開日:2020-01-19

# ヒューマンフィードバックを用いた高次元状態空間におけるインタラクティブ報酬形成

FRESH: Interactive Reward Shaping in High-Dimensional State Spaces using Human Feedback ( http://arxiv.org/abs/2001.06781v1 )

ライセンス: Link先を確認

Baicen Xiao, Qifan Lu, Bhaskar Ramasubramanian, Andrew Clark, Linda Bushnell, Radha Poovendran

(参考訳) 強化学習は複雑な環境で目標を達成するための自律エージェントの訓練に成功している。これはロボティクスやコンピュータゲームを含む複数の設定に適応しているが、一部の環境では強化学習アルゴリズムよりも高い報酬を得る方が容易である。これは、エージェントによって得られる報酬がスパースまたは非常に遅れた高次元状態空間に特に当てはまる。本稿では,人間の操作者からのフィードバック信号を高次元状態空間における深層強化学習アルゴリズムに効果的に統合することを目的とする。これをFRESH(FeedbackベースのReward SHaping)と呼ぶ。トレーニング中、人間オペレータはリプレイバッファからの軌道を提示され、軌道の状態と動作についてのフィードバックを提供する。人間のオペレータが提供したフィードバック信号を、テスト時に事前に認識した状態やアクションに一般化するために、フィードバックニューラルネットワークを使用する。我々は、モデルの不確実性とニューラルネットワークの出力に対する信頼性を表すために、ニューラルネットワークと共有ネットワークアーキテクチャのアンサンブルを使用する。フィードバックニューラルネットワークの出力は、環境が提供する報酬に付加されたシェーピング報酬に変換される。アーケード学習環境におけるボーリングとスキーのアタリゲームに対する我々のアプローチを評価する。人間のエキスパートはこれらの環境で高いスコアを得ることができたが、最先端のディープラーニングアルゴリズムはパフォーマンスが悪い。我々はFRESHが両環境における最先端のディープラーニングアルゴリズムよりもはるかに高いスコアを得られることを観察した。 FRESHはまた、ボーリングの人間専門家よりも21.4%高いスコアを獲得し、スキーの人間専門家でもある。

Reinforcement learning has been successful in training autonomous agents to accomplish goals in complex environments. Although this has been adapted to multiple settings, including robotics and computer games, human players often find it easier to obtain higher rewards in some environments than reinforcement learning algorithms. This is especially true of high-dimensional state spaces where the reward obtained by the agent is sparse or extremely delayed. In this paper, we seek to effectively integrate feedback signals supplied by a human operator with deep reinforcement learning algorithms in high-dimensional state spaces. We call this FRESH (Feedback-based REward SHaping). During training, a human operator is presented with trajectories from a replay buffer and then provides feedback on states and actions in the trajectory. In order to generalize feedback signals provided by the human operator to previously unseen states and actions at test time, we use a feedback neural network. We use an ensemble of neural networks with a shared network architecture to represent model uncertainty and the confidence of the neural network in its output. The output of the feedback neural network is converted to a shaping reward that is added to the reward provided by the environment. We evaluate our approach on the Bowling and Skiing Atari games in the Arcade Learning Environment. Although human experts have been able to achieve high scores in these environments, state-of-the-art deep learning algorithms perform poorly. We observe that FRESH is able to achieve much higher scores than state-of-the-art deep learning algorithms in both environments. FRESH also achieves a 21.4% higher score than a human expert in Bowling and does as well as a human expert in Skiing.

翻訳日:2023-01-08 10:12:23 公開日:2020-01-19
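
The step in which the feedback ensemble's output becomes a shaping reward added to the environment reward might look roughly like the sketch below. The abstract does not specify the conversion rule, so everything here is an assumption made for illustration: the name `shaped_reward`, the parameters `scale` and `max_std`, the mapping from a mean "good feedback" probability to a bonus in [-scale, scale], and the rule of skipping shaping when the ensemble members disagree. Ensemble members are modelled as plain callables standing in for the paper's neural networks.

```python
import numpy as np

def shaped_reward(env_reward, state_action, ensemble, scale=1.0, max_std=0.2):
    """Add a feedback-based shaping term to the environment reward.

    ensemble: callables mapping a (state, action) input to the predicted
    probability that a human would label it as good (hypothetical interface).
    """
    probs = np.array([member(state_action) for member in ensemble])
    mean, std = probs.mean(), probs.std()
    # Assumed confidence gate: if members disagree too much, do not shape.
    if std > max_std:
        return env_reward
    # Map the mean probability in [0, 1] to a bonus in [-scale, scale].
    return env_reward + scale * (2.0 * mean - 1.0)
```

Gating on the ensemble's standard deviation is one simple way to use the ensemble as an uncertainty estimate; other choices (for example, scaling the bonus by the confidence) would fit the abstract's description equally well.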

# 知識ベースアサーションの修正

Correcting Knowledge Base Assertions ( http://arxiv.org/abs/2001.06917v1 )

ライセンス: Link先を確認

Jiaoyan Chen, Xi Chen, Ian Horrocks, Ernesto Jimenez-Ruiz, and Erik B. Myklebust

(参考訳) 知識ベース(KB)の有用性とユーザビリティは品質の問題によって制限されることが多い。よくある問題は誤った主張の存在であり、しばしば語彙的あるいは意味的な混乱によって引き起こされる。そこで本研究では,このようなアサーションを訂正する問題について検討し,語彙マッチング,意味埋め込み,ソフト制約マイニング,意味一貫性チェックを組み合わせた一般的な補正フレームワークを提案する。このフレームワークはDBpediaと企業医療KBを用いて評価される。

The usefulness and usability of knowledge bases (KBs) is often limited by quality issues. One common issue is the presence of erroneous assertions, often caused by lexical or semantic confusion. We study the problem of correcting such assertions, and present a general correction framework which combines lexical matching, semantic embedding, soft constraint mining and semantic consistency checking. The framework is evaluated using DBpedia and an enterprise medical KB.

翻訳日:2023-01-08 10:11:58 公開日:2020-01-19

# ヒンディー語テキスト分類のためのディープラーニング:比較

Deep Learning for Hindi Text Classification: A Comparison ( http://arxiv.org/abs/2001.10340v1 )

ライセンス: Link先を確認

Ramchandra Joshi, Purvi Goel, Raviraj Joshi

(参考訳) 自然言語処理(NLP)、特に自然言語テキスト解析は近年大きな進歩を遂げている。テキスト処理におけるディープラーニングの利用は、テキスト処理技術に革命をもたらし、驚くべき結果をもたらした。 cnn、lstm、そして非常に最近のtransformerのような異なるディープラーニングアーキテクチャは、nlpタスクのさまざまな技術結果を達成するために使われている。本研究では,テキスト分類タスクのためのディープラーニングアーキテクチャのホストを調査した。この作品はヒンディー語のテキストの分類に特に関係している。デヴァナガリ文字で書かれた形態的に豊かで低資源のヒンディー語を分類する研究は、大きなラベル付きコーパスがないために限られている。本研究では,CNN,LSTM,注意に基づくモデル評価のために,英文データセットの翻訳版を用いた。 BERT と LASER に基づく多言語事前学習文の埋め込みも比較し,ヒンディー語の有効性を評価する。この論文は、一般的なテキスト分類技法のチュートリアルとしても機能する。

Natural Language Processing (NLP), and especially natural language text analysis, has seen great advances in recent times. The use of deep learning in text processing has revolutionized text processing techniques and achieved remarkable results. Different deep learning architectures like CNN, LSTM, and the very recent Transformer have been used to achieve state-of-the-art results on a variety of NLP tasks. In this work, we survey a host of deep learning architectures for text classification tasks. The work is specifically concerned with the classification of Hindi text. Research on the classification of the morphologically rich and low-resource Hindi language, written in Devanagari script, has been limited due to the absence of a large labeled corpus. In this work, we used translated versions of English datasets to evaluate models based on CNN, LSTM, and Attention. Multilingual pre-trained sentence embeddings based on BERT and LASER are also compared to evaluate their effectiveness for the Hindi language. The paper also serves as a tutorial for popular text classification techniques.

翻訳日:2023-01-08 10:05:01 公開日:2020-01-19

# 深層部分空間クラスタリングのためのマルチレベル表現学習

Multi-Level Representation Learning for Deep Subspace Clustering ( http://arxiv.org/abs/2001.08533v1 )

ライセンス: Link先を確認

Mohsen Kheirandishfard, Fariba Zohrizadeh, Farhad Kamangar

(参考訳) 本稿では,畳み込みオートエンコーダを用いて入力画像を線形部分空間の結合上にある新しい表現に変換する,新しい深層部分空間クラスタリング手法を提案する。我々の研究の最初の貢献は、エンコーダ層とそれに対応するデコーダ層の間に複数の完全に接続された線形層を挿入し、サブスペースクラスタリングのためのより好ましい表現の学習を促進することである。これらの接続層は、エンコーダの異なるレベルで複数の自己表現および情報表現を生成するために、低レベルと高レベルの情報を組み合わせることで、特徴学習の手順を促進する。さらに,サンプルの初期クラスタリングを利用して,マルチレベル表現を効果的に融合し,下位部分空間をより正確に復元する新たな損失最小化問題を提案する。損失関数は、代わりにネットワークパラメータを更新し、サンプルの新しいクラスタリングを生成する反復スキームによって最小化される。 4つの実世界のデータセットに対する実験により、我々の手法は、ほとんどのサブスペースクラスタリング問題における最先端手法よりも優れた性能を示すことが示された。

This paper proposes a novel deep subspace clustering approach which uses convolutional autoencoders to transform input images into new representations lying on a union of linear subspaces. The first contribution of our work is to insert multiple fully-connected linear layers between the encoder layers and their corresponding decoder layers to promote learning more favorable representations for subspace clustering. These connection layers facilitate the feature learning procedure by combining low-level and high-level information for generating multiple sets of self-expressive and informative representations at different levels of the encoder. Moreover, we introduce a novel loss minimization problem which leverages an initial clustering of the samples to effectively fuse the multi-level representations and recover the underlying subspaces more accurately. The loss function is then minimized through an iterative scheme which alternately updates the network parameters and produces new clusterings of the samples. Experiments on four real-world datasets demonstrate that our approach exhibits superior performance compared to the state-of-the-art methods on most of the subspace clustering problems.

翻訳日:2023-01-08 10:04:12 公開日:2020-01-19

# 主双対アクティブセットアルゴリズムを用いたK-SVDによる画像ノイズ除去

Image denoising via K-SVD with primal-dual active set algorithm ( http://arxiv.org/abs/2001.06780v1 )

ライセンス: Link先を確認

Quan Xiao, Canhong Wen, Zirui Yan

(参考訳) K-SVDアルゴリズムは、何十年にもわたって画像復調タスクにうまく適用されてきたが、速度と精度の大きなボトルネックは、いまだに壊れる必要がある。 K-SVD のスパース符号化段階では、$\ell_{0}$ 制約が伴うが、一般的な手法では、ノイズレベルが高くなると、近似的な解を求めることが多い。代替の$\ell_{1}$最適化は$\ell_{0}$よりも強力であることが証明されているが、時間消費によって実装が妨げられる。本稿では,Primal-Dual Active Set (PDAS)アルゴリズムを適用し,K-SVD$_P$という新しいK-SVDフレームワークを提案する。 K-SVDのアルゴリズムと異なり、K-SVD$_P$アルゴリズムはKKT(Karush-Kuhn-Tucker)条件によって動機付けられた選択戦略を開発し、スパース符号化段階における効率的な更新をもたらす。 K-SVD$_P$アルゴリズムは、このデノナイジング問題において単純な明示的な表現で反復的に双対問題の等価解を求めるため、デノナイジングの速度と品質を同時に達成することができる。実験を行い,最先端手法を用いたk-svd$_p$と同等の性能を示す。

The K-SVD algorithm has been successfully applied to image denoising tasks for decades, but a major bottleneck in speed and accuracy remains to be overcome. For the sparse coding stage in K-SVD, which involves an $\ell_{0}$ constraint, prevailing methods usually seek approximate solutions greedily but become less effective once the noise level is high. The alternative $\ell_{1}$ optimization has been shown to be more powerful than $\ell_{0}$; however, its time consumption hinders its implementation. In this paper, we propose a new K-SVD framework called K-SVD$_P$ by applying the primal-dual active set (PDAS) algorithm to it. Unlike K-SVD based on greedy algorithms, the K-SVD$_P$ algorithm develops a selection strategy motivated by the KKT (Karush-Kuhn-Tucker) condition and yields an efficient update in the sparse coding stage. Since the K-SVD$_P$ algorithm iteratively seeks an equivalent solution to the dual problem with a simple explicit expression in this denoising problem, speed and quality of denoising can be achieved simultaneously. Experiments are carried out and demonstrate that the denoising performance of our K-SVD$_P$ is comparable with state-of-the-art methods.

翻訳日:2023-01-08 10:03:55 公開日:2020-01-19
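
To make the sparse coding stage more concrete, the sketch below shows one common form of a primal-dual active set iteration for the $\ell_{0}$-constrained problem $\min_x \|y - Dx\|_2^2$ subject to $\|x\|_0 \le T$. It is an illustration under stated assumptions, not the paper's K-SVD$_P$ update: the abstract gives no formulas, so the selection rule (keep the `T` indices with the largest $|x_j + d_j|$, where $d = D^\top(y - Dx)$ plays the role of the dual variable) follows the usual PDAS formulation for best-subset selection, the dictionary `D` is assumed to have unit-norm columns, and the name `pdas_sparse_code` is hypothetical.

```python
import numpy as np

def pdas_sparse_code(D, y, T, max_iter=20):
    """Approximate min ||y - D x||^2 subject to ||x||_0 <= T.

    D: dictionary with unit-norm columns (assumption), shape (n, n_atoms).
    y: signal of shape (n,).  T: maximum number of nonzero coefficients.
    """
    n_atoms = D.shape[1]
    x = np.zeros(n_atoms)   # primal variable (sparse code)
    d = D.T @ y             # dual variable at x = 0
    active = None
    for _ in range(max_iter):
        # KKT-motivated selection: keep the T largest |x_j + d_j|.
        new_active = np.sort(np.argsort(-np.abs(x + d))[:T])
        if active is not None and np.array_equal(new_active, active):
            break           # the active set has stabilized
        active = new_active
        # Primal update: least squares restricted to the active atoms.
        x = np.zeros(n_atoms)
        x[active] = np.linalg.lstsq(D[:, active], y, rcond=None)[0]
        # Dual update: residual correlations, zero on the active set.
        d = D.T @ (y - D @ x)
        d[active] = 0.0
    return x
```

In contrast to a greedy method that adds one atom per step, each iteration here can swap several atoms in and out of the support at once.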

# 混合モデルにおけるパラメータ学習のための代数的および解析的アプローチ

Algebraic and Analytic Approaches for Parameter Learning in Mixture Models ( http://arxiv.org/abs/2001.06776v1 )

ライセンス: Link先を確認

Akshay Krishnamurthy, Arya Mazumdar, Andrew McGregor, Soumyabrata Pal

(参考訳) 1次元のいくつかの混合モデルにおけるパラメータ学習のための2つの異なるアプローチを提案する。最初のアプローチは複素解析的手法を使用し,共有分散を持つガウス混合,共有成功確率を持つ二項混合,ポアソン混合などに適用できる。例として、$\exp(O(N^{1/3}))$ 個のサンプルは、それぞれが $N$ で上から抑えられた整数のレートパラメータを持つ $k<N$ 個のポアソン分布の混合を正確に学習するのに十分である。第2のアプローチは代数的および組合せ的ツールを使用し、共有試行パラメータ $N$ と異なる成功パラメータを持つ二項混合、および幾何分布の混合に適用できる。例えば、$k$ 個の成分を持ち、成功パラメータが分解能 $\epsilon$ に離散化された二項混合の場合、$O(k^2(N/\epsilon)^{8/\sqrt{\epsilon}})$ 個のサンプルはパラメータを正確に回復するのに十分である。これらの分布のいくつかについては,本結果がパラメータ推定の最初の保証となる。

We present two different approaches for parameter learning in several mixture models in one dimension. Our first approach uses complex-analytic methods and applies to Gaussian mixtures with shared variance, binomial mixtures with shared success probability, and Poisson mixtures, among others. An example result is that $\exp(O(N^{1/3}))$ samples suffice to exactly learn a mixture of $k<N$ Poisson distributions, each with integral rate parameters bounded by $N$. Our second approach uses algebraic and combinatorial tools and applies to binomial mixtures with shared trial parameter $N$ and differing success parameters, as well as to mixtures of geometric distributions. Again, as an example, for binomial mixtures with $k$ components and success parameters discretized to resolution $\epsilon$, $O(k^2(N/\epsilon)^{8/\sqrt{\epsilon}})$ samples suffice to exactly recover the parameters. For some of these distributions, our results represent the first guarantees for parameter estimation.

翻訳日:2023-01-08 10:03:30 公開日:2020-01-19

# スキルセグメンテーションを用いた実演からの学習オプション

Learning Options from Demonstration using Skill Segmentation ( http://arxiv.org/abs/2001.06793v1 )

ライセンス: Link先を確認

Matthew Cockcroft, Shahil Mawjee, Steven James, Pravesh Ranchod

(参考訳) 本稿では,セグメント化されたデモ軌跡からオプションを学習する手法を提案する。トラジェクタはまず非パラメトリックベイズクラスタリングを用いてスキルに分割され、各セグメントに対する報酬関数は逆強化学習を用いて学習される。これにより、デモのための一連の推論軌道が生成される。 1クラスのサポートベクターマシンクラスタリングアルゴリズムを用いて、これらの軌道からオプション開始セットと終了条件を学習する。提案手法は,エージェントが人間の実演から利用可能な選択肢を自律的に発見できる4部屋領域で実証する。その結果,これらの推論オプションは学習と計画の改善に有効であることが示唆された。

We present a method for learning options from segmented demonstration trajectories. The trajectories are first segmented into skills using nonparametric Bayesian clustering and a reward function for each segment is then learned using inverse reinforcement learning. From this, a set of inferred trajectories for the demonstration are generated. Option initiation sets and termination conditions are learned from these trajectories using the one-class support vector machine clustering algorithm. We demonstrate our method in the four rooms domain, where an agent is able to autonomously discover usable options from human demonstration. Our results show that these inferred options can then be used to improve learning and planning.

翻訳日:2023-01-08 10:03:08 公開日:2020-01-19

# 分布ロバストベイズ求積最適化

Distributionally Robust Bayesian Quadrature Optimization ( http://arxiv.org/abs/2001.06814v1 )

ライセンス: Link先を確認

Thanh Tang Nguyen, Sunil Gupta, Huong Ha, Santu Rana, Svetha Venkatesh

(参考訳) ベイズ求積最適化(BQO)は、既知の確率分布に関してとった、高価なブラックボックス被積分関数の期待値を最大化する。本研究では,基礎となる確率分布が限られた数のi.i.d.サンプルを除いて未知であるという分布の不確実性の下でのBQOについて検討する。標準的なBQOアプローチは、固定されたサンプル集合が与えられたときの真の期待目的関数のモンテカルロ推定を最大化する。モンテカルロ推定は不偏であるが、サンプル数が少ない場合には分散が大きく、誤った目的関数をもたらす可能性がある。我々は,最も敵対的な分布の下での期待目的関数を最大化することにより,分布的ロバスト最適化の観点をこの問題に適用する。特に, この目的のために, 分布ロバストBQO (DRBQO) と呼ぶ、事後サンプリングに基づく新しいアルゴリズムを提案する。提案手法の合成および実世界の問題における実証的有効性を実証し,ベイズ的後悔(Bayesian regret)を通じてその理論的収束を特徴づける。

Bayesian quadrature optimization (BQO) maximizes the expectation of an expensive black-box integrand taken over a known probability distribution. In this work, we study BQO under distributional uncertainty in which the underlying probability distribution is unknown except for a limited set of its i.i.d. samples. A standard BQO approach maximizes the Monte Carlo estimate of the true expected objective given the fixed sample set. Though the Monte Carlo estimate is unbiased, it has high variance given a small set of samples and can thus result in a spurious objective function. We adopt the distributionally robust optimization perspective on this problem by maximizing the expected objective under the most adversarial distribution. In particular, we propose a novel posterior sampling based algorithm, namely distributionally robust BQO (DRBQO), for this purpose. We demonstrate the empirical effectiveness of our proposed framework in synthetic and real-world problems, and characterize its theoretical convergence via Bayesian regret.

翻訳日:2023-01-08 10:02:45 公開日:2020-01-19

PDF登録状況（公開日: 20200119）