Fugu-MT: arxivの論文翻訳

このサイトではarxivの論文のうち、30ページ以下でCreative Commonsライセンス（CC 0, CC BY, CC BY-SA）の論文を日本語訳しています。本文がCCでない論文、長すぎる論文はメタデータのみを翻訳しています。（arxivのメタデータは CC 0です。）翻訳文のライセンスはCC BY-SA 4.0です。翻訳にはFugu-Machine Translatorを利用しています。

本サイトの運営者は本サイト（すべての情報・翻訳含む）の品質を保証せず、本サイト（すべての情報・翻訳含む）を使用して発生したあらゆる結果について一切の責任を負いません。

公開日が20200112となっている論文です。

Title	Authors	Abstract	論文公表日・翻訳日
# bring-Your-Own-Device (BYOD) プログラミングエグゼムのセキュア化 Securing Bring-Your-Own-Device (BYOD) Programming Exams ( http://arxiv.org/abs/2001.03942v1 ) ライセンス: Link先を確認	Oka Kurniawan, Norman Tiong Seng Lee, Christopher M. Poskitt	(参考訳) 従来のペンと紙の試験は、実践的なコーディング能力をターゲットにした教育や学習の目的と不一致であるため、現代の大学プログラミングコースでは不十分である。残念ながら、多くの機関は専用のコンピュータラボでアセスメントを実行するためのリソースやスペースを欠いている。このことは、学生がどのように学習するかと同様の環境でプログラムできるような、BYOD( bring-your-own-device)試験フォーマットの開発を動機付けている。本稿では,学生のラップトップをシステムやインターネットアクセスの制限のあるセキュアなワークステーションに変えるソフトウェアであるロックダウンブラウザに基づくbyod試験ソリューションについて述べる。この技術を学習管理システムとクラウドベースのプログラミングツールと組み合わせることで,対話的かつ制御可能な環境において,概念的かつ実践的なプログラミング問題に取り組むことができる。我々は、このソリューションを主要な学部プログラミングコースに導入した経験を反映し、ポリシーとサポートメカニズムが技術そのものと同じくらい重要であるという私たちの主要な教訓を強調します。 Traditional pen and paper exams are inadequate for modern university programming courses as they are misaligned with pedagogies and learning objectives that target practical coding ability. Unfortunately, many institutions lack the resources or space to be able to run assessments in dedicated computer labs. This has motivated the development of bring-your-own-device (BYOD) exam formats, allowing students to program in a similar environment to how they learnt, but presenting instructors with significant additional challenges in preventing plagiarism and cheating. In this paper, we describe a BYOD exam solution based on lockdown browsers, software which temporarily turns students' laptops into secure workstations with limited system or internet access. We combine the use of this technology with a learning management system and cloud-based programming tool to facilitate conceptual and practical programming questions that can be tackled in an interactive but controlled environment. We reflect on our experience of implementing this solution for a major undergraduate programming course, highlighting our principal lesson that policies and support mechanisms are as important to consider as the technology itself.	翻訳日:2023-06-08 03:57:45 公開日:2020-01-12
# 生成的逆模倣学習の計算と一般化について On Computation and Generalization of Generative Adversarial Imitation Learning ( http://arxiv.org/abs/2001.02792v2 ) ライセンス: Link先を確認	Minshuo Chen, Yizhou Wang, Tianyi Liu, Zhuoran Yang, Xingguo Li, Zhaoran Wang, Tuo Zhao	(参考訳) GAIL(Generative Adversarial Imitation Learning)は、シーケンシャルな意思決定ポリシーを学ぶための強力で実践的なアプローチである。強化学習(RL)とは異なり、GAILは専門家(例えば人間)による実証データを活用し、未知の環境のポリシーと報酬関数の両方を学ぶ。顕著な経験的進歩にもかかわらず、GAILの背後にある理論はほとんど不明である。主な困難は、デモデータの根底にある時間依存性と、凸凹構造を持たないGAILの最小計算定式化である。このような理論と実践のギャップを埋めるため,ガイルの理論的性質を考察する。具体的には,(1)報奨関数のクラスが適切に制御される限り,GAILに対して一般化を保証すること,(2)報奨関数が再生カーネル関数としてパラメータ化されるGAILに対して,GAILを確率的一階最適化アルゴリズムにより効率よく解き、定常解へのサブ線形収束を実現すること,を示す。我々の知る限り、これらは報酬/政治機能近似による模倣学習の統計的および計算的保証に関する最初の結果である。解析を支援するために数値実験を行った。 Generative Adversarial Imitation Learning (GAIL) is a powerful and practical approach for learning sequential decision-making policies. Different from Reinforcement Learning (RL), GAIL takes advantage of demonstration data by experts (e.g., human), and learns both the policy and reward function of the unknown environment. Despite the significant empirical progresses, the theory behind GAIL is still largely unknown. The major difficulty comes from the underlying temporal dependency of the demonstration data and the minimax computational formulation of GAIL without convex-concave structure. To bridge such a gap between theory and practice, this paper investigates the theoretical properties of GAIL. Specifically, we show: (1) For GAIL with general reward parameterization, the generalization can be guaranteed as long as the class of the reward functions is properly controlled; (2) For GAIL, where the reward is parameterized as a reproducing kernel function, GAIL can be efficiently solved by stochastic first order optimization algorithms, which attain sublinear convergence to a stationary solution. To the best of our knowledge, these are the first results on statistical and computational guarantees of imitation learning with reward/policy function approximation. Numerical experiments are provided to support our analysis.	翻訳日:2023-01-13 04:21:45 公開日:2020-01-12
# アダプティブ隣り合わせ型分別スパルスPCAによる次元化 Supervised Discriminative Sparse PCA with Adaptive Neighbors for Dimensionality Reduction ( http://arxiv.org/abs/2001.03103v2 ) ライセンス: Link先を確認	Zhenhua Shi, Dongrui Wu, Jian Huang, Yu-Kai Wang, Chin-Teng Lin	(参考訳) 情報可視化,特徴抽出,クラスタリング,回帰,分類において,特にノイズの多い高次元データを処理するための重要な操作である。しかし、既存のアプローチのほとんどは、データのグローバル構造かローカル構造を保存しているが、両方ではない。主成分分析(PCA)のようなグローバルなデータ構造のみを保持するアプローチは、通常、外れ値に敏感である。局所性保存プロジェクションのような局所データ構造のみを保存するアプローチは通常、教師なし(そのためラベル情報は使用できない)で、固定された類似性グラフを使用する。そこで本研究では, 線形次元削減手法として, 適応隣人との識別的スパースPCA(SDSPCAAN)を新たに提案し, 適応隣人とのクラスタリングを図った。その結果、グローバルデータ構造とローカルデータ構造、およびラベル情報の両方が、より次元性の低減に使用される。 9つの高次元データセットの分類実験により,提案したSDSPCAANの有効性とロバスト性を検証した。 Dimensionality reduction is an important operation in information visualization, feature extraction, clustering, regression, and classification, especially for processing noisy high dimensional data. However, most existing approaches preserve either the global or the local structure of the data, but not both. Approaches that preserve only the global data structure, such as principal component analysis (PCA), are usually sensitive to outliers. Approaches that preserve only the local data structure, such as locality preserving projections, are usually unsupervised (and hence cannot use label information) and uses a fixed similarity graph. We propose a novel linear dimensionality reduction approach, supervised discriminative sparse PCA with adaptive neighbors (SDSPCAAN), to integrate neighborhood-free supervised discriminative sparse PCA and projected clustering with adaptive neighbors. As a result, both global and local data structures, as well as the label information, are used for better dimensionality reduction. Classification experiments on nine high-dimensional datasets validated the effectiveness and robustness of our proposed SDSPCAAN.	翻訳日:2023-01-13 04:19:51 公開日:2020-01-12
# 量子振幅減衰符号に対する線形計画法 Linear programming bounds for quantum amplitude damping codes ( http://arxiv.org/abs/2001.03976v1 ) ライセンス: Link先を確認	Yingkai Ouyang and Ching-Yi Lai	(参考訳) 近似量子誤り訂正符号(AQEC)が完全量子誤り訂正符号よりも性能が優れていることを考慮すれば、それらの性能を定量化することが重要となる。量子重み列挙器は、量子誤り訂正符号の最小距離において最良の上限を設定するが、これらの境界はaqec符号に直接は適用されない。本稿では、振幅減衰(AD)誤差に対する量子量列挙器を導入し、近似量子誤差補正の枠組みの中で機能する。特に、符号空間に固有な補助的完全重み列挙子を導入し、さらに、ad誤差に対する量子重み列挙子とこの補助的完全重み列挙子との線形関係を確立する。これにより、AQEC ADコードに対応するパラメータが存在しない場合にのみ実現不可能な線形プログラムを確立することができる。線形プログラムを説明するために、任意のADエラーを修正することができる3ビットAD符号の存在を数値的に排除する。 Given that approximate quantum error-correcting (AQEC) codes have a potentially better performance than perfect quantum error correction codes, it is pertinent to quantify their performance. While quantum weight enumerators establish some of the best upper bounds on the minimum distance of quantum error-correcting codes, these bounds do not directly apply to AQEC codes. Herein, we introduce quantum weight enumerators for amplitude damping (AD) errors and work within the framework of approximate quantum error correction. In particular, we introduce an auxiliary exact weight enumerator that is intrinsic to a code space and moreover, we establish a linear relationship between the quantum weight enumerators for AD errors and this auxiliary exact weight enumerator. This allows us to establish a linear program that is infeasible only when AQEC AD codes with corresponding parameters do not exist. To illustrate our linear program, we numerically rule out the existence of three-qubit AD codes that are capable of correcting an arbitrary AD error.	翻訳日:2023-01-12 05:11:06 公開日:2020-01-12
# 信号解析と量子形式論:プランク定数を持たない量子化 Signal analysis and quantum formalism: Quantizations with no Planck constant ( http://arxiv.org/abs/2001.04916v1 ) ライセンス: Link先を確認	Jean Pierre Gazeau and Celestin Habonimana	(参考訳) 信号解析は、信号ベクトル空間(例えばフーリエ、ガボー、ウェーブレットなど)におけるアイデンティティの様々な分解能に基づいている。同様の分解能は関数や分布の量子化器として使われ、時間周波数や時間スケールの量子形式への道を歩み、興味深いか予期せぬ特徴を明らかにする。光子ではなく波の量子論と見なされる古典的電磁磁性への拡張について述べる。 Signal analysis is built upon various resolutions of the identity in signal vector spaces, e.g. Fourier, Gabor, wavelets, etc. Similar resolutions are used as quantizers of functions or distributions, paving the way to a time-frequency or time-scale quantum formalism and revealing interesting or unexpected features. Extensions to classical electromagnetism viewed as a quantum theory for waves and not for photons are mentioned.	翻訳日:2023-01-12 05:10:49 公開日:2020-01-12
# 集中型無線アクセスネットワークにおける大規模MIMO処理のための量子アニールの活用 Leveraging Quantum Annealing for Large MIMO Processing in Centralized Radio Access Networks ( http://arxiv.org/abs/2001.04014v1 ) ライセンス: Link先を確認	Minsung Kim, Davide Venturelli, Kyle Jamieson	(参考訳) 無線容量の増加に対するユーザの需要は、供給を上回っており、この需要を満たすために、新しいmimo無線物理層技術において大きな進歩を遂げている。高性能なシステムは、アルゴリズムの計算能力が非常に高いため、ほとんど実用的ではないままである。最適な性能を得るためには、ユーザ数と各ユーザのデータレートの両方で指数関数的に増加する計算量が必要となることが多い。これにより、基地局の計算能力は無線容量の重要な制限要因の1つとなっている。 QuAMaxは、この問題に量子アニールを利用して対処する最初の大規模なMIMO無線アクセスネットワークである。我々は2,031量子ビットD-Wave 2000Q量子アニールにQuAMaxを実装した。実験の結果,2000Qの計算時間10〜$\mu$sは,48ユーザ,48APアンテナBPSK通信を20dB SNRのビット誤り率10〜6}$,1500バイトのフレーム誤り率10〜4}$で実現可能であることがわかった。 User demand for increasing amounts of wireless capacity continues to outpace supply, and so to meet this demand, significant progress has been made in new MIMO wireless physical layer techniques. Higher-performance systems now remain impractical largely only because their algorithms are extremely computationally demanding. For optimal performance, an amount of computation that increases at an exponential rate both with the number of users and with the data rate of each user is often required. The base station's computational capacity is thus becoming one of the key limiting factors on wireless capacity. QuAMax is the first large MIMO centralized radio access network design to address this issue by leveraging quantum annealing on the problem. We have implemented QuAMax on the 2,031 qubit D-Wave 2000Q quantum annealer, the state-of-the-art in the field. Our experimental results evaluate that implementation on real and synthetic MIMO channel traces, showing that 10~$\mu$s of compute time on the 2000Q can enable 48 user, 48 AP antenna BPSK communication at 20 dB SNR with a bit error rate of $10^{-6}$ and a 1,500 byte frame error rate of $10^{-4}$.	翻訳日:2023-01-12 05:09:08 公開日:2020-01-12
# ロバストニューラルネットワークの関数誤差補正 Functional Error Correction for Robust Neural Networks ( http://arxiv.org/abs/2001.03814v1 ) ライセンス: Link先を確認	Kunping Huang, Paul Siegel, Anxiao (Andrew) Jiang	(参考訳) ニューラルネットワーク(NeuralNets)をハードウェアで実装する場合、その重みをメモリデバイスに格納する必要がある。格納された重みにノイズが蓄積されると、NeuralNetのパフォーマンスは低下する。本稿では,重みを保護するために誤り訂正符号(ECC)の使い方について検討する。データストレージにおける古典的な誤り訂正とは異なり、最適化の目的は、保護されたビットの誤り率を最小化するのではなく、エラー修正後のNeuralNetのパフォーマンスを最適化することである。すなわち、ニューラルネットワークを入力の関数として見ることにより、エラー訂正スキームは関数指向である。最大の課題は、ディープニューラルネットワークは、しばしば数百万から数億の重量を持ち、ECCの大きな冗長性オーバーヘッドを引き起こし、重みとNeuralNetのパフォーマンスの関係は非常に複雑であることだ。そこで本研究では,ECC保護のための重要なビットのサブセットのみを選択するSelective Protection (SP)方式を提案する。このようなビットを探し出し、ECCの冗長性とNeuralNetの性能のトレードオフを最適化するために、深層強化学習に基づくアルゴリズムを提案する。実験の結果,本アルゴリズムは,本手法と比較して,機能的誤り訂正タスクの性能が大幅に向上することを確認した。 When neural networks (NeuralNets) are implemented in hardware, their weights need to be stored in memory devices. As noise accumulates in the stored weights, the NeuralNet's performance will degrade. This paper studies how to use error correcting codes (ECCs) to protect the weights. Different from classic error correction in data storage, the optimization objective is to optimize the NeuralNet's performance after error correction, instead of minimizing the Uncorrectable Bit Error Rate in the protected bits. That is, by seeing the NeuralNet as a function of its input, the error correction scheme is function-oriented. A main challenge is that a deep NeuralNet often has millions to hundreds of millions of weights, causing a large redundancy overhead for ECCs, and the relationship between the weights and its NeuralNet's performance can be highly complex. To address the challenge, we propose a Selective Protection (SP) scheme, which chooses only a subset of important bits for ECC protection. To find such bits and achieve an optimized tradeoff between ECC's redundancy and NeuralNet's performance, we present an algorithm based on deep reinforcement learning. Experimental results verify that compared to the natural baseline scheme, the proposed algorithm achieves substantially better performance for the functional error correction task.	翻訳日:2023-01-12 05:08:48 公開日:2020-01-12
# CUREデータセット:オーディオイベント分類のためのラダーネットワーク CURE Dataset: Ladder Networks for Audio Event Classification ( http://arxiv.org/abs/2001.03896v1 ) ライセンス: Link先を確認	Harishchandra Dubey, Dimitra Emmanouilidou, Ivan J. Tashev	(参考訳) 音声イベント分類は、監視、音声、ビデオ、マルチメディア検索など、いくつかのアプリケーションにとって重要なタスクである。約300万人が聴力を失い、周囲で起きている出来事を認識できない。本稿では,聴覚障害者に最も関係のある特定の音声イベントのキュレーションセットを含む治療データセットについて述べる。本論文では,freesoundプロジェクトから派生した5s音声記録を用いたラダーネットワーク型音声イベント分類器を提案する。我々は,現在最先端の畳み込みニューラルネットワーク(CNN)をオーディオ機能として採用した。また,イベント分類のための極端学習機械 (ELM) についても検討する。本研究では,提案する分類器をサポートベクトルマシン(SVM)ベースラインと比較する。異なる録音シナリオ間のミスマッチを低減することを目的とした信号と特徴の正規化を提案する。まず、CNNは弱いラベル付きAudiosetデータに基づいて訓練される。次に, 予め学習したモデルを用いて, 提案する治療コーパスの特徴抽出を行う。 esc-50データセットを第2の評価セットとして組み込む。 ELM と SVM の分類器に対するラダーネットワークの優位性について,ロバスト性および分類精度の向上の観点から検証した。 Ladder ネットワークはデータミスマッチに対して堅牢であるが、単純な SVM と ELM の分類器はそのようなミスマッチに敏感であり、提案手法が重要な役割を果たす。 ESC-50とCUREコーパスによる実験的研究は、提案手法によって提供されるデータセットの複雑さと堅牢性の違いを解明する。 Audio event classification is an important task for several applications such as surveillance, audio, video and multimedia retrieval etc. There are approximately 3M people with hearing loss who can't perceive events happening around them. This paper establishes the CURE dataset which contains curated set of specific audio events most relevant for people with hearing loss. We propose a ladder network based audio event classifier that utilizes 5s sound recordings derived from the Freesound project. We adopted the state-of-the-art convolutional neural network (CNN) embeddings as audio features for this task. We also investigate extreme learning machine (ELM) for event classification. In this study, proposed classifiers are compared with support vector machine (SVM) baseline. We propose signal and feature normalization that aims to reduce the mismatch between different recordings scenarios. Firstly, CNN is trained on weakly labeled Audioset data. Next, the pre-trained model is adopted as feature extractor for proposed CURE corpus. We incorporate ESC-50 dataset as second evaluation set. Results and discussions validate the superiority of Ladder network over ELM and SVM classifier in terms of robustness and increased classification accuracy. While Ladder network is robust to data mismatches, simpler SVM and ELM classifiers are sensitive to such mismatches, where the proposed normalization techniques can play an important role. Experimental studies with ESC-50 and CURE corpora elucidate the differences in dataset complexity and robustness offered by proposed approaches.	翻訳日:2023-01-12 05:08:11 公開日:2020-01-12
# 包括的深層学習に基づく非線形時間依存パラメタライズドPDEのオーダーモデリング A comprehensive deep learning-based approach to reduced order modeling of nonlinear time-dependent parametrized PDEs ( http://arxiv.org/abs/2001.04001v1 ) ライセンス: Link先を確認	Stefania Fresca, Luca Dede, Andrea Manzoni	(参考訳) 還元基底法(RB)法(例えば、適切な直交分解法(POD))のような伝統的な還元次数モデリング技術は、それらに基づくモードの線形重ね合わせの基本的な仮定のため、非線形時間依存パラメトリゼーションPDEを扱う際に厳しい制限を受ける。このため、トランスポート、ウェーブ、対流支配現象などの時間上を伝播するコヒーレント構造を特徴とする問題の場合、rb法は、高忠実度全階モデル(fom)の解に対して十分に精度の低下した次数近似を求めると、通常、効率の悪い還元次数モデル(rom)が得られる。本研究は,これらの制約を克服するために,ディープラーニング(DL)アルゴリズムを応用した低次モデル設定のための非線形手法を提案する。結果として得られた非線形ROMはDL-ROMと呼ばれ、非線形トライアル多様体(線形ROMの基底関数の集合に対応する)と非線形還元力学(線形ROMの投影段階に対応する)の両方をDLアルゴリズムに頼って非侵襲的に学習し、後者は異なるパラメータ値に対して得られたFOM解の集合に基づいて訓練する。本稿では, 線形および非線形時間依存型パラメタライズPDEのためのDL-ROMを構築する方法, さらに, 異なるパラメタライズPDE問題を特徴とするテストケースにおいて, その精度を評価する。 PDE解多様体の内在次元に等しい次元のDL-ROMは、同じ精度を達成するために大量のPODモードを必要とする状況において、パラメータ化されたPDEの解を近似できることを示す。 Traditional reduced order modeling techniques such as the reduced basis (RB) method (relying, e.g., on proper orthogonal decomposition (POD)) suffer from severe limitations when dealing with nonlinear time-dependent parametrized PDEs, because of the fundamental assumption of linear superimposition of modes they are based on. For this reason, in the case of problems featuring coherent structures that propagate over time such as transport, wave, or convection-dominated phenomena, the RB method usually yields inefficient reduced order models (ROMs) if one aims at obtaining reduced order approximations sufficiently accurate compared to the high-fidelity, full order model (FOM) solution. To overcome these limitations, in this work, we propose a new nonlinear approach to set reduced order models by exploiting deep learning (DL) algorithms. In the resulting nonlinear ROM, which we refer to as DL-ROM, both the nonlinear trial manifold (corresponding to the set of basis functions in a linear ROM) as well as the nonlinear reduced dynamics (corresponding to the projection stage in a linear ROM) are learned in a non-intrusive way by relying on DL algorithms; the latter are trained on a set of FOM solutions obtained for different parameter values. In this paper, we show how to construct a DL-ROM for both linear and nonlinear time-dependent parametrized PDEs; moreover, we assess its accuracy on test cases featuring different parametrized PDE problems. Numerical results indicate that DL-ROMs whose dimension is equal to the intrinsic dimensionality of the PDE solutions manifold are able to approximate the solution of parametrized PDEs in situations where a huge number of POD modes would be necessary to achieve the same degree of accuracy.	翻訳日:2023-01-12 05:07:51 公開日:2020-01-12
# アモルファスフォトニック位相絶縁体 Amorphous photonic topological insulator ( http://arxiv.org/abs/2001.03819v1 ) ライセンス: Link先を確認	Peiheng Zhou, Gui-Geng Liu, Xin Ren, Yihao Yang, Haoran Xue, Lei Bi, Longjiang Deng, Yidong Chong, and Baile Zhang	(参考訳) フォトニックトポロジー絶縁体(ptis)はバンドトポロジーによって保護される頑健なフォトニックエッジ状態を示す。標準バンド理論は、長距離の位置順序を持たないが短距離順序のみを持つ非結晶格子によって形成される物質のアモルファス相には適用されない。その他の興味深い性質の中で、アモルファス媒体はガラスと液体の相の遷移を示し、短距離秩序の劇的な変化を伴う。本稿では,Chern-number-based PTIのアモルファス変種について実験的に検討する。格子の歪み強度を調整することにより、ガラス-液相遷移の前に、光学的位相的エッジ状態がアモルファス状態に持続可能であることを示す。液体状の格子構造への遷移の後、位相的エッジ状態のシグネチャは消滅する。アモルファス格子におけるトポロジーと短距離秩序の間のこの相互作用は、新しい非結晶トポロジーフォトニック材料への道を開く。 Photonic topological insulators (PTIs) exhibit robust photonic edge states protected by band topology, similar to electronic edge states in topological band insulators. Standard band theory does not apply to amorphous phases of matter, which are formed by non-crystalline lattices with no long-range positional order but only short-range order. Among other interesting properties, amorphous media exhibit transitions between glassy and liquid phases, accompanied by dramatic changes in short-range order. Here, we experimentally investigate amorphous variants of a Chern-number-based PTI. By tuning the disorder strength in the lattice, we demonstrate that photonic topological edge states can persist into the amorphous regime, prior to the glass-to-liquid transition. After the transition to a liquid-like lattice configuration, the signatures of topological edge states disappear. This interplay between topology and short-range order in amorphous lattices paves the way for new classes of non-crystalline topological photonic materials.	翻訳日:2023-01-12 05:07:18 公開日:2020-01-12
# 希薄スピングラスにおけるフラストレーションと遷移点の簡単な関係 A simple relation between frustration and transition points in diluted spin glasses ( http://arxiv.org/abs/2001.03903v1 ) ライセンス: Link先を確認	Ryoji Miyazaki, Yuta Kudo, Masayuki Ohzeki, Kazuyuki Tanaka	(参考訳) スピングラスのフラストレーションと相転移点の関係について検討した。この関係は, 相転移点ゼロ温度における格子内のフラストレーション点数の条件として表され, 複数の格子に対して相転移点に非常に近い点を与えることが報告された。関係の証明はないが、いくつかの格子の良好な対応は、相転移における関係の妥当性とフラストレーションの重要な役割を示唆している。さらに, この関係を考察するため, 希釈格子との関係を自然拡張し, 結合拡散二乗格子の有効性を検証した。得られた点が幅広い希釈速度で相転移点と良好に一致していることを確認する。その結果,スピングラスの相転移に対するフラストレーションの重要性について,前回の非希釈格子に対する提案が支持された。 We investigate a possible relation between frustration and phase-transition points in spin glasses. The relation is represented as a condition of the number of frustrated plaquettes in the lattice at phase-transition points at zero temperature and was reported to provide very close points to the phase-transition points for several lattices. Although there has been no proof of the relation, the good correspondence in several lattices suggests the validity of the relation and some important role of frustration in the phase transitions. To examine the relation further, we present a natural extension of the relation to diluted lattices and verify its effectiveness for bond-diluted square lattices. We then confirm that the resulting points are in good agreement with the phase-transition points in a wide range of dilution rate. Our result supports the suggestion from the previous work for non-diluted lattices on the importance of frustration to the phase transition of spin glasses.	翻訳日:2023-01-12 05:06:59 公開日:2020-01-12
# ドープ半導体薄膜における超高速2光子放出 Ultrafast two-photon emission in a doped semiconductor thin film ( http://arxiv.org/abs/2001.03975v1 ) ライセンス: Link先を確認	Futai Hu, Liu Li, Yuan Liu, Yuan Meng, Mali Gong, and Yuanmu Yang	(参考訳) 高次量子遷移として、2光子放出は1光子放出に比べて非常に低い発生率を持つため、禁止された過程であると考えられている。本稿では,超高速2光子発光を可能とし,高閉じ込められた表面プラズモンポラリトンモードを利用した半導体薄膜の発光方式を提案する。表面プラズモンポラリトンモードは、半導体中の2光子放出と同時にスペクトルと空間の重なりを持つように調整される。縮退ドープしたInSbを原材料として, 2光子放出は10ミリ秒からピコ秒までの10桁の速度で加速し, 1光子放出速度を超えることを示す。この結果,超高速光子生成のための半導体プラットフォームが,中赤外波長の波長可変化を実現した。 As a high-order quantum transition, two-photon emission has an extremely low occurrence rate compared to one-photon emission, thus having been considered a forbidden process. Here, we propose a scheme that allows ultrafast two-photon emission, leveraging highly confined surface plasmon polariton modes in a degenerately-doped, light-emitting semiconductor thin film. The surface plasmon polariton modes are tailored to have simultaneous spectral and spatial overlap with the two-photon emission in the semiconductor. Using degenerately-doped InSb as the prototype material, we show that the two-photon emission can be accelerated by 10 orders of magnitude: from tens of milliseconds to picoseconds, surpassing the one-photon emission rate. Our result provides a semiconductor platform for ultrafast single and entangled photon generation, with a tunable emission wavelength in the mid-infrared.	翻訳日:2023-01-12 05:06:01 公開日:2020-01-12
# 非剛性画像登録と剛性画像登録の比較検討 A Comparative Study for Non-rigid Image Registration and Rigid Image Registration ( http://arxiv.org/abs/2001.03831v1 ) ライセンス: Link先を確認	Xiaoran Zhang, Hexiang Dong, Di Gao and Xiao Zhao	(参考訳) 画像登録アルゴリズムは一般に非剛性と剛性という2つのグループに分類できる。近年,深層学習に基づくアルゴリズムでは,非剛性画像登録関数を特徴付けるニューラルネットワークが採用されている。しかし、彼らは常に改善しますか? 本研究では,最先端のDeep-based non-rigid registration approachと厳密な登録アプローチを比較した。データはkaggle dog vs cat competition \url{https://www.kaggle.com/c/dogs-vs-cats/}から生成され、変換、回転、スケーリング、せん断、ピクセルワイズ非剛性変換を含む剛性変換におけるアルゴリズムの性能をテストする。 voxelmorphは、比較のためにhardidsetとnonrigidsetを別々にトレーニングし、登録性能を改善するために元のアーキテクチャにガウスぼけ層を追加する。根平均二乗誤差 (RMSE) と平均絶対誤差 (MAE) の両値における最良の定量値は, SimpleElastix と Voxelmorph による非剛性登録により得られる。視覚評価のための代表サンプルを選択する。 Image registration algorithms can be generally categorized into two groups: non-rigid and rigid. Recently, many deep learning-based algorithms employ a neural net to characterize non-rigid image registration function. However, do they always perform better? In this study, we compare the state-of-art deep learning-based non-rigid registration approach with rigid registration approach. The data is generated from Kaggle Dog vs Cat Competition \url{https://www.kaggle.com/c/dogs-vs-cats/} and we test the algorithms' performance on rigid transformation including translation, rotation, scaling, shearing and pixelwise non-rigid transformation. The Voxelmorph is trained on rigidset and nonrigidset separately for comparison and we also add a gaussian blur layer to its original architecture to improve registration performance. The best quantitative results in both root-mean-square error (RMSE) and mean absolute error (MAE) metrics for rigid registration are produced by SimpleElastix and non-rigid registration by Voxelmorph. We select representative samples for visual assessment.	翻訳日:2023-01-12 04:59:45 公開日:2020-01-12
# スカラー量子化学習による深層最適化多重記述画像符号化 Deep Optimized Multiple Description Image Coding via Scalar Quantization Learning ( http://arxiv.org/abs/2001.03851v1 ) ライセンス: Link先を確認	Lijun Zhao, Huihui Bai, Anhong Wang, Yao Zhao	(参考訳) 本稿では,多重記述(md)圧縮損失を最小化することで最適化した深層多重記述符号化(mdc)フレームワークを提案する。第一に、mdマルチスケール拡張エンコーダネットワークは複数の記述テンソルを生成し、スカラー量子化子によって離散化されるが、これらの量子化テンソルはmdカスケードブロックデコーダネットワークによってデ圧縮される。人工ニューラルネットワークパラメータの総量を大幅に削減するために、これらの2種類のネットワークからなるオートエンコーダネットワークを対称パラメータ共有構造として設計する。第2に、このオートエンコーダネットワークと一対のスカラー量子化器を、エンドツーエンドのセルフ教師方式で同時に学習する。第3に、画像空間分布の変化を考慮すると、各スカラー量子化器は直接量子化ではなく、mdテンソルを生成するために重要指数マップを伴っている。第4に,複数記述の多様化を暗黙的に規則化する多重記述構造類似性距離損失を導入し,MD再構成の他,多様化復号化を明示的に監督する。最後に、我々のMDCフレームワークは、複数の一般的なデータセットでテストした場合、画像符号化効率について、最先端のMDCアプローチよりも優れた性能を示すことを示す。 In this paper, we introduce a deep multiple description coding (MDC) framework optimized by minimizing multiple description (MD) compressive loss. First, MD multi-scale-dilated encoder network generates multiple description tensors, which are discretized by scalar quantizers, while these quantized tensors are decompressed by MD cascaded-ResBlock decoder networks. To greatly reduce the total amount of artificial neural network parameters, an auto-encoder network composed of these two types of network is designed as a symmetrical parameter sharing structure. Second, this autoencoder network and a pair of scalar quantizers are simultaneously learned in an end-to-end self-supervised way. Third, considering the variation in the image spatial distribution, each scalar quantizer is accompanied by an importance-indicator map to generate MD tensors, rather than using direct quantization. Fourth, we introduce the multiple description structural similarity distance loss, which implicitly regularizes the diversified multiple description generations, to explicitly supervise multiple description diversified decoding in addition to MD reconstruction loss. Finally, we demonstrate that our MDC framework performs better than several state-of-the-art MDC approaches regarding image coding efficiency when tested on several commonly available datasets.	翻訳日:2023-01-12 04:59:25 公開日:2020-01-12
# 水頭症に対するロバスト脳磁気共鳴画像分割 : 硬度と軟度の検討 Robust Brain Magnetic Resonance Image Segmentation for Hydrocephalus Patients: Hard and Soft Attention ( http://arxiv.org/abs/2001.03857v1 ) ライセンス: Link先を確認	Xuhua Ren, Jiayu Huo, Kai Xuan, Dongming Wei, Lichi Zhang, Qian Wang	(参考訳) 脳磁気共鳴(MR)の脳卒中患者に対するセグメンテーションは難しい作業であると考えられている。異なる個体の脳解剖学的構造の変化をコード化することは容易ではない。この課題は、特に水頭症患者の画像データを考慮するとさらに難しくなり、しばしば大きな変形があり、通常の被験者とは大きく異なる。本稿では,水頭症mr画像のセグメンテーション問題を解決するための,ハードおよびソフトアテンションモジュールを用いた新しい戦略を提案する。私たちの主な貢献は3倍です。 1) ハードアテンションモジュールは,マルチアトラス法とボクセルモルフツールを用いて粗いセグメンテーションマップを生成し,その後のセグメンテーションプロセスをガイドし,そのロバスト性を向上させる。 2 ソフトアテンションモジュールは、位置注意を取り入れて正確な文脈情報を把握し、セグメンテーション精度をさらに向上させる。 3) 実際の臨床シナリオにおいて脳MRI画像の定量化に不可欠である insula, thalamus, and many other region ofinterests (ROIs) を抽出し, 本法の有効性を検証した。提案手法は,17種類の意識関連ROIを異なる被験者に対して高いバラツキで分割することで,ロバスト性と精度を大幅に向上させる。私たちの知る限りでは、脳卒中患者の脳のセグメンテーション問題を解決するためにディープラーニングを利用した最初の研究である。 Brain magnetic resonance (MR) segmentation for hydrocephalus patients is considered as a challenging work. Encoding the variation of the brain anatomical structures from different individuals cannot be easily achieved. The task becomes even more difficult especially when the image data from hydrocephalus patients are considered, which often have large deformations and differ significantly from the normal subjects. Here, we propose a novel strategy with hard and soft attention modules to solve the segmentation problems for hydrocephalus MR images. Our main contributions are three-fold: 1) the hard-attention module generates coarse segmentation map using multi-atlas-based method and the VoxelMorph tool, which guides subsequent segmentation process and improves its robustness; 2) the soft-attention module incorporates position attention to capture precise context information, which further improves the segmentation accuracy; 3) we validate our method by segmenting insula, thalamus and many other regions-of-interests (ROIs) that are critical to quantify brain MR images of hydrocephalus patients in real clinical scenario. The proposed method achieves much improved robustness and accuracy when segmenting all 17 consciousness-related ROIs with high variations for different subjects. To the best of our knowledge, this is the first work to employ deep learning for solving the brain segmentation problems of hydrocephalus patients.	翻訳日:2023-01-12 04:59:04 公開日:2020-01-12
# ヒューマンロボットインタラクションのための深層学習に基づく感情予測のためのハイパーパラメータ最適化 Hyperparameters optimization for Deep Learning based emotion prediction for Human Robot Interaction ( http://arxiv.org/abs/2001.03855v1 ) ライセンス: Link先を確認	Shruti Jaiswal, and Gora Chand Nandi	(参考訳) ヒューマノイドロボットが私たちの社会空間を共有できるようにするには、音声、ジェスチャー、感情の共有といった複数のモードを使ってロボットと簡単に対話できる技術を開発する必要がある。本研究は,リアルタイムコミュニケーションのために低リソースのソーシャルロボット上でより適応的に計算できる,計算資源の削減とネットワークハイパーパラメータの少なさを必要とする感情認識問題の核となる問題に対処することを目的とした。より具体的には、人間型ロボットをリアルタイムで試した場合に複数のデータセット上で組み合わせてテストした場合、感情分類のための既存のネットワークアーキテクチャよりも最大6%精度が向上したインセプションモジュールベースの畳み込みニューラルネットワークアーキテクチャを提案する。提案モデルでは,トレーニング可能なハイパーパラメータを94%まで削減し,人間のロボットインタラクションなどのリアルタイムアプリケーションで使用できることを明確に示すバニラCNNモデルと比較した。十分に堅牢で精度の高い方法論を検証するために,厳密な実験が実施されている。最後に、モデルを人型ロボットNAOにリアルタイムに実装し、モデルの堅牢性を評価する。 To enable humanoid robots to share our social space we need to develop technology for easy interaction with the robots using multiple modes such as speech, gestures and share our emotions with them. We have targeted this research towards addressing the core issue of emotion recognition problem which would require less computation resources and much lesser number of network hyperparameters which will be more adaptive to be computed on low resourced social robots for real time communication. More specifically, here we have proposed an Inception module based Convolutional Neural Network Architecture which has achieved improved accuracy of upto 6% improvement over the existing network architecture for emotion classification when combinedly tested over multiple datasets when tried over humanoid robots in real - time. Our proposed model is reducing the trainable Hyperparameters to an extent of 94% as compared to vanilla CNN model which clearly indicates that it can be used in real time based application such as human robot interaction. Rigorous experiments have been performed to validate our methodology which is sufficiently robust and could achieve high level of accuracy. Finally, the model is implemented in a humanoid robot, NAO in real time and robustness of the model is evaluated.	翻訳日:2023-01-12 04:58:05 公開日:2020-01-12
# 非有界大域最適化に対する適応拡張ベイズ最適化 Adaptive Expansion Bayesian Optimization for Unbounded Global Optimization ( http://arxiv.org/abs/2001.04815v1 ) ライセンス: Link先を確認	Wei Chen and Mark Fuge	(参考訳) ベイズ最適化は通常、固定変数境界内で実行される。機械学習アルゴリズムのハイパーパラメータチューニングのような場合、変数境界の設定は簡単ではない。任意の固定境界が真の大域的最適性を含むことは保証できない。本稿では,大域的最適化を必ずしも含まない初期探索空間を定義し,必要であれば探索空間を拡大するベイズ最適化手法を提案する。しかし、過剰な爆発は探索空間の膨張の間に起こりうる。拡張空間における探索と利用を適応的にバランスさせることができる。合成試験関数とMLPハイパーパラメータ最適化タスクの結果から,提案手法は現在の最先端手法と同等以上の性能を示した。 Bayesian optimization is normally performed within fixed variable bounds. In cases like hyperparameter tuning for machine learning algorithms, setting the variable bounds is not trivial. It is hard to guarantee that any fixed bounds will include the true global optimum. We propose a Bayesian optimization approach that only needs to specify an initial search space that does not necessarily include the global optimum, and expands the search space when necessary. However, over-exploration may occur during the search space expansion. Our method can adaptively balance exploration and exploitation in an expanding space. Results on a range of synthetic test functions and an MLP hyperparameter optimization task show that the proposed method out-performs or at least as good as the current state-of-the-art methods.	翻訳日:2023-01-12 04:57:45 公開日:2020-01-12
# 離散時間量子ウォークを用いたスペクトル磁化ラチェット Spectral Magnetization Ratchets with Discrete Time Quantum Walks ( http://arxiv.org/abs/2001.03868v1 ) ライセンス: Link先を確認	A. Mallick, M. V. Fistul, P. Kaczynska, S. Flach	(参考訳) 我々は、周期的離散時間量子ウォーク(DTQW)のスペクトル磁化に対するラチェット効果を予測し、理論的に詳細に研究する。これらの一般化DTQWは、対応するコイン演算子パラメータを離散時間で周期的に変化させることにより達成される。期間はm=1,2,3$である。 m$-周期 dtqws のダイナミクスは、2バンド分散関係 $\omega^{(m)_{\pm}(k)$ によって特徴づけられ、ここで $k$ は波動ベクトルである。我々は、$m$- periodic DTQWsの一般化パリティ対称性を同定する。対称性は、コイン演算子パラメータの適切な選択によって、$m=2,3$で破ることができる。得られた対称性の破れはラチェット効果、すなわち非零のスペクトル磁化 $m_s(\omega)$ の出現をもたらす。このラチェット効果は、周期DTQWの時間依存性相関関数の連続量子測定の枠組みで観察することができる。 We predict and theoretically study in detail the ratchet effect for the spectral magnetization of periodic discrete time quantum walks (DTQWs) --- a repetition of a sequence of $m$ different DTQWs. These generalized DTQWs are achieved by varying the corresponding coin operator parameters periodically with discrete time. We consider periods $m=1,2,3$. The dynamics of $m$-periodic DTQWs is characterized by a two-band dispersion relation $\omega^{(m)}_{\pm}(k)$, where $k$ is the wave vector. We identify a generalized parity symmetry of $m$-periodic DTQWs. The symmetry can be broken for $m=2,3$ by proper choices of the coin operator parameters. The obtained symmetry breaking results in a ratchet effect, i.e. the appearance of a nonzero spectral magnetization $M_s(\omega)$. This ratchet effect can be observed in the framework of continuous quantum measurements of the time-dependent correlation function of periodic DTQWs.	翻訳日:2023-01-12 04:57:21 公開日:2020-01-12
# 連続モデル生成のための同時外挿・補間ネットワーク Concurrently Extrapolating and Interpolating Networks for Continuous Model Generation ( http://arxiv.org/abs/2001.03847v1 ) ライセンス: Link先を確認	Lijun Zhao, Jinjing Zhang, Fan Zhang, Anhong Wang, Huihui Bai, Yao Zhao	(参考訳) 多くの深層画像平滑化演算子は、異なるパラメータで設定されたアルゴリズムごとに、異なる明示的な構造-テクスチャペアをラベルイメージとして使用する場合、常に繰り返し訓練される。このようなトレーニング戦略は、しばしば長い時間をかけて、機器リソースをコストのかかる方法で消費します。この課題に対処するために、より強力なモデル生成ツールとして、連続ネットワーク補間を一般化し、特定の効果ラベル画像のセットのみを必要とするモデル列を形成するための、単純かつ効果的なモデル生成戦略を提案する。画像平滑化演算子を正確に学習するために、現在のネットワークアーキテクチャの大部分に簡単に挿入できる二重状態集約(DSA)モジュールを提案する。このモジュールに基づき、局所特徴集約ブロックと非局所特徴集約ブロックを備えた二重状態集約ニューラルネットワーク構造を設計し、表現能力の高い演算子を得る。多くの客観的および視覚的実験結果の評価を通じて,提案手法は連続したモデルを生成することができ,画像平滑化のための最先端手法よりも優れた性能が得られることを示す。 Most deep image smoothing operators are always trained repetitively when different explicit structure-texture pairs are employed as label images for each algorithm configured with different parameters. This kind of training strategy often takes a long time and spends equipment resources in a costly manner. To address this challenging issue, we generalize continuous network interpolation as a more powerful model generation tool, and then propose a simple yet effective model generation strategy to form a sequence of models that only requires a set of specific-effect label images. To precisely learn image smoothing operators, we present a double-state aggregation (DSA) module, which can be easily inserted into most of current network architecture. Based on this module, we design a double-state aggregation neural network structure with a local feature aggregation block and a nonlocal feature aggregation block to obtain operators with large expression capacity. Through the evaluation of many objective and visual experimental results, we show that the proposed method is capable of producing a series of continuous models and achieves better performance than that of several state-of-the-art methods for image smoothing.	翻訳日:2023-01-12 04:50:46 公開日:2020-01-12
# 車両再識別のための属性誘導型特徴学習ネットワーク Attribute-guided Feature Learning Network for Vehicle Re-identification ( http://arxiv.org/abs/2001.03872v1 ) ライセンス: Link先を確認	Huibing Wang, Jinjia Peng, Dongyan Chen, Guangqi Jiang, Tongtong Zhao, Xianping Fu	(参考訳) 自動車再識別(reID)は,近年ホットな話題となっている都市監視ビデオの自動解析において重要な役割を担っている。しかし、これは車両の様々な視点、多彩な照明、複雑な環境によって引き起こされる重大な問題である。現在、ほとんどの既存の車両のreIDアプローチは、より優れた表現を導き出すためにメトリクスやアンサンブルを学習することに焦点を当てている。しかし、詳細な記述を含む車両の特性は、reIDモデルの訓練に有用である。そこで,本稿では,豊富な属性特徴を持つグローバル表現をエンドツーエンドに学習可能な,新しいAttribute-Guided Network(AGNet)を提案する。特に、属性誘導モジュールがagnetで提案され、カテゴリ分類のための識別的特徴の選択を逆ガイドできる属性マスクを生成する。さらに,提案したAGNetでは,属性に基づくラベル平滑化(ALS)の損失がreIDモデルの訓練に有効であることを示す。総合実験の結果, vehicleid データセットと veri-776 データセットの両方において優れた性能が得られた。 Vehicle re-identification (reID) plays an important role in the automatic analysis of the increasing urban surveillance videos, which has become a hot topic in recent years. However, it poses the critical but challenging problem that is caused by various viewpoints of vehicles, diversified illuminations and complicated environments. Till now, most existing vehicle reID approaches focus on learning metrics or ensemble to derive better representation, which are only take identity labels of vehicle into consideration. However, the attributes of vehicle that contain detailed descriptions are beneficial for training reID model. Hence, this paper proposes a novel Attribute-Guided Network (AGNet), which could learn global representation with the abundant attribute features in an end-to-end manner. Specially, an attribute-guided module is proposed in AGNet to generate the attribute mask which could inversely guide to select discriminative features for category classification. Besides that, in our proposed AGNet, an attribute-based label smoothing (ALS) loss is presented to better train the reID model, which can strength the distinct ability of vehicle reID model to regularize AGNet model according to the attributes. Comprehensive experimental results clearly demonstrate that our method achieves excellent performance on both VehicleID dataset and VeRi-776 dataset.	翻訳日:2023-01-12 04:50:02 公開日:2020-01-12
# 視覚的感情分類のためのマルチソースドメイン適応 Multi-source Domain Adaptation for Visual Sentiment Classification ( http://arxiv.org/abs/2001.03886v1 ) ライセンス: Link先を確認	Chuang Lin, Sicheng Zhao, Lei Meng, Tat-Seng Chua	(参考訳) 視覚感情分類に関する既存のドメイン適応法は通常、十分なラベル付きデータのソースドメインから学んだ知識を、ゆるやかにラベル付けされたデータまたはラベル付きデータのターゲットドメインに転送する単一ソースシナリオで検討される。しかし、実際には、単一のソースドメインのデータは通常、限られたボリュームを持ち、ターゲットドメインの特徴をほとんどカバーできない。本稿では,多元感情生成支援ネットワーク(msgan,multi-source sentiment generative adversarial network)と呼ばれる,視覚的感情分類のためのマルチソースドメイン適応(mda)手法を提案する。複数のソースドメインからのデータを扱うために、ソースドメインとターゲットドメインの両方からのデータが同じ分布を共有する、統一された感情潜在空間を見つけることを学ぶ。これは、エンドツーエンドのサイクル一貫した逆学習を通じて達成される。 4つのベンチマークデータセットで実施された大規模な実験により、MSGANは視覚的感情分類のための最先端のMDAアプローチよりも大幅に優れていることが示された。 Existing domain adaptation methods on visual sentiment classification typically are investigated under the single-source scenario, where the knowledge learned from a source domain of sufficient labeled data is transferred to the target domain of loosely labeled or unlabeled data. However, in practice, data from a single source domain usually have a limited volume and can hardly cover the characteristics of the target domain. In this paper, we propose a novel multi-source domain adaptation (MDA) method, termed Multi-source Sentiment Generative Adversarial Network (MSGAN), for visual sentiment classification. To handle data from multiple source domains, it learns to find a unified sentiment latent space where data from both the source and target domains share a similar distribution. This is achieved via cycle consistent adversarial learning in an end-to-end manner. Extensive experiments conducted on four benchmark datasets demonstrate that MSGAN significantly outperforms the state-of-the-art MDA approaches for visual sentiment classification.	翻訳日:2023-01-12 04:49:41 公開日:2020-01-12
# メラノーマセグメンテーションのための適応受容場を持つ補完ネットワーク Complementary Network with Adaptive Receptive Fields for Melanoma Segmentation ( http://arxiv.org/abs/2001.03893v1 ) ライセンス: Link先を確認	Xiaoqing Guo, Zhen Chen, Yixuan Yuan	(参考訳) 皮膚内視鏡像におけるメラノーマの自動分画は皮膚癌のコンピュータ診断に不可欠である。既存の手法はホールに悩まされ、セグメンテーション性能に制限のある問題を縮小する。そこで本研究では,適応的受容学習を用いた補足ネットワークを提案する。セグメンテーションタスクを独立して行う代わりに, メラノーマ病変を検出するフォアグラウンドネットワークと, 非メラノーマ領域をマスキングするバックグラウンドネットワークを提案する。さらに,アダプティブ・アラス・コンボリューション (AAC) とナレッジ・アグリゲーション・モジュール (KAM) を提案する。 aacは、複数のスケールでの受容野を明示的に制御し、kamは、深い特徴マップに従って調整される適応受容野と拡張された畳み込みによって、浅い特徴マップを畳み込みます。さらに、前景と背景ネットワーク間の依存関係を利用するために、新たな相互損失が提案され、これらの2つのネットワーク内で相互に影響を及ぼすことができる。この相互学習戦略により、半教師付き学習が可能となり、境界感性が向上する。 Skin Imaging Collaboration (ISIC) 2018 skin lesion segmentation dataset を用いて, ディスコ効率86.4%を達成し, 最先端のメラノーマ・セグメンテーション法と比較して優れた性能を示した。 Automatic melanoma segmentation in dermoscopic images is essential in computer-aided diagnosis of skin cancer. Existing methods may suffer from the hole and shrink problems with limited segmentation performance. To tackle these issues, we propose a novel complementary network with adaptive receptive filed learning. Instead of regarding the segmentation task independently, we introduce a foreground network to detect melanoma lesions and a background network to mask non-melanoma regions. Moreover, we propose adaptive atrous convolution (AAC) and knowledge aggregation module (KAM) to fill holes and alleviate the shrink problems. AAC explicitly controls the receptive field at multiple scales and KAM convolves shallow feature maps by dilated convolutions with adaptive receptive fields, which are adjusted according to deep feature maps. In addition, a novel mutual loss is proposed to utilize the dependency between the foreground and background networks, thereby enabling the reciprocally influence within these two networks. Consequently, this mutual training strategy enables the semi-supervised learning and improve the boundary-sensitivity. Training with Skin Imaging Collaboration (ISIC) 2018 skin lesion segmentation dataset, our method achieves a dice co-efficient of 86.4% and shows better performance compared with state-of-the-art melanoma segmentation methods.	翻訳日:2023-01-12 04:49:24 公開日:2020-01-12
# アテンションフロー: エンドツーエンドの関節アテンション推定 Attention Flow: End-to-End Joint Attention Estimation ( http://arxiv.org/abs/2001.03960v1 ) ライセンス: Link先を確認	\"Omer S\"umer, Peter Gerjets, Ulrich Trautwein, Enkelejda Kasneci	(参考訳) 本稿では,ソーシャルシーンビデオにおける共同注意の理解の問題に対処する。共同注意は、対象物または関心領域における2人以上の個人の共通の視線行動であり、人間とコンピュータの相互作用、教育評価、注意障害のある患者の治療など、幅広い応用がある。本手法は,有意な特徴を選定し,共同注意の局所化を向上する2つの新しい畳み込み注意機構を用いて,エンドツーエンドで共同注意を学習する。複雑な社会シーンを含むビデオCoAttデータセットにおいて,サリエンシマップとアテンション機構の効果を比較し,定量的および定性的な結果が共同注意の検出と局所化に与える影響を報告する。 This paper addresses the problem of understanding joint attention in third-person social scene videos. Joint attention is the shared gaze behaviour of two or more individuals on an object or an area of interest and has a wide range of applications such as human-computer interaction, educational assessment, treatment of patients with attention disorders, and many more. Our method, Attention Flow, learns joint attention in an end-to-end fashion by using saliency-augmented attention maps and two novel convolutional attention mechanisms that determine to select relevant features and improve joint attention localization. We compare the effect of saliency maps and attention mechanisms and report quantitative and qualitative results on the detection and localization of joint attention in the VideoCoAtt dataset, which contains complex social scenes.	翻訳日:2023-01-12 04:48:19 公開日:2020-01-12
# 線虫c. elegansの頭と尾の局在 Head and Tail Localization of C. elegans ( http://arxiv.org/abs/2001.03981v1 ) ライセンス: Link先を確認	Mansi Ranjit Mane, Aniket Anand Deshmukh, Adam J. Iliff	(参考訳) C. elegans は神経科学で行動分析によく用いられるが、これは神経系が小さく、接続性も良好であるためである。動物を局在させ、頭と尾を区別することは、行動測定中にワームを追跡する重要なタスクであり、定量分析を行う。画像中のワームの頭部と尾の両方を局在化するためのニューラルネットワークによるアプローチを示す。 C. elegansの行動分析のためのオープンソースの機械学習ベースのソリューションを再現可能な論文で実証的な結果を得るために、コードを公開する。 C. elegans is commonly used in neuroscience for behaviour analysis because of it's compact nervous system with well-described connectivity. Localizing the animal and distinguishing between its head and tail are important tasks to track the worm during behavioural assays and to perform quantitative analyses. We demonstrate a neural network based approach to localize both the head and the tail of the worm in an image. To make empirical results in the paper reproducible and promote open source machine learning based solutions for C. elegans behavioural analysis, we also make our code publicly available.	翻訳日:2023-01-12 04:48:06 公開日:2020-01-12
# マルチモード単一パス時空間スクイーズ Multimode Single-Pass Spatio-temporal Squeezing ( http://arxiv.org/abs/2001.03972v1 ) ライセンス: Link先を確認	Luca La Volpe, Syamsundar De, Tiphaine Kouadou, Dmitri Horoshko, Mikhail Kolobov, Claude Fabre, Valentina Parigi, Nicolas Treps	(参考訳) 量子情報や量子気象学に応用可能なブロードバンド多重モード励起光の単一パス源を提案する。ソースは、非線形バルク結晶内の非共線形配置におけるタイプiパラメトリックダウンコンバージョン(pdc)プロセスに基づいている。生成したスクイーズ光は、空間的にも時間的にも形成された局所発振器を用いてホモダイン測定により、時空間的多モード挙動を示す。最後に,共分散行列に基づくアプローチにより,複数の独立な時空間モードと空間モード間のスクイーズ分布を明らかにする。これは、ソースのマルチモード機能を明確に検証します。 We present a single-pass source of broadband multimode squeezed light with potential application in quantum information and quantum metrology. The source is based on a type I parametric down-conversion (PDC) process inside a bulk nonlinear crystal in a non-collinear configuration. The generated squeezed light exhibits a spatiotemporal multimode behavior that is probed using a homodyne measurement with a local oscillator shaped both spatially and temporally. Finally we follow a covariance matrix based approach to reveal the distribution of the squeezing among several independent temporal and spatial modes. This unambiguously validates the multimode feature of our source.	翻訳日:2023-01-12 04:47:56 公開日:2020-01-12
# 機械学習を用いたアップリンク無線通信におけるチャネル割り当て Channel Assignment in Uplink Wireless Communication using Machine Learning Approach ( http://arxiv.org/abs/2001.03952v1 ) ライセンス: Link先を確認	Guangyu Jia and Zhaohui Yang and Hak-Keung Lam and Jianfeng Shi and Mohammad Shikh-Bahaei	(参考訳) この手紙は、アップリンク無線通信システムにおけるチャネル割り当て問題を調査する。我々の目標は、整数チャネル割り当て制約を受ける全ユーザの総和率を最大化することです。凸最適化に基づくアルゴリズムが提供され、各ステップで閉形式解が得られる最適なチャネル割り当てが得られる。凸最適化に基づくアルゴリズムでは計算の複雑さが高いため、機械学習手法を用いて計算効率のよい解を求める。具体的には、凸最適化に基づくアルゴリズムを用いてデータを生成し、元の問題を、畳み込みニューラルネットワーク(CNN)、フィードフォワードニューラルネットワーク(FNN)、ランダムフォレスト(ランダムフォレスト)、ゲートリカレントユニットネットワーク(GRU)の統合によって対処する回帰問題に変換する。その結果,機械学習手法は予測精度をわずかに向上させて計算時間を大幅に短縮することを示した。 This letter investigates a channel assignment problem in uplink wireless communication systems. Our goal is to maximize the sum rate of all users subject to integer channel assignment constraints. A convex optimization based algorithm is provided to obtain the optimal channel assignment, where the closed-form solution is obtained in each step. Due to high computational complexity in the convex optimization based algorithm, machine learning approaches are employed to obtain computational efficient solutions. More specifically, the data are generated by using convex optimization based algorithm and the original problem is converted to a regression problem which is addressed by the integration of convolutional neural networks (CNNs), feed-forward neural networks (FNNs), random forest and gated recurrent unit networks (GRUs). The results demonstrate that the machine learning method largely reduces the computation time with slightly compromising of prediction accuracy.	翻訳日:2023-01-12 04:42:12 公開日:2020-01-12
# Fact Grounding を用いたデータ・テキスト生成における課題の再考 Revisiting Challenges in Data-to-Text Generation with Fact Grounding ( http://arxiv.org/abs/2001.03830v1 ) ライセンス: Link先を確認	Hongmin Wang	(参考訳) データ対テキスト生成モデルは、正しい入力ソースを参照してデータの忠実性を保証するという課題に直面している。この分野の研究を刺激するために、ワイズマンらは、ボックステーブルとラインスコアテーブルからnbaゲームサマリーを生成するために、rotowireコーパスを導入した。しかし、この方向に限定的な試みが行われ、課題は残る。我々は,要約内容の約60%しかボックススコアレコードに接地できないコーパスにおける顕著なボトルネックを観察する。このような情報不足は、条件付き言語モデルが無条件の無作為な事実を生み出すことを誤認し、結果として事実的幻覚を引き起こす傾向がある。本研究では,情報バランスを回復し,実地データ・テキスト生成に重点を置いたタスクを改良する。我々は、2017-19年の50パーセント以上のデータと豊富な入力テーブルを備えた、浄化された大規模データセットであるRotoWire-FG(Fact-Grounding)を導入し、この方向へのさらなる研究の焦点を期待している。さらに,新たなテーブル再構成を補助タスクとして統合することで,最先端モデルに対するデータ忠実度の向上を実現し,生成品質を向上する。 Data-to-text generation models face challenges in ensuring data fidelity by referring to the correct input source. To inspire studies in this area, Wiseman et al. (2017) introduced the RotoWire corpus on generating NBA game summaries from the box- and line-score tables. However, limited attempts have been made in this direction and the challenges remain. We observe a prominent bottleneck in the corpus where only about 60% of the summary contents can be grounded to the boxscore records. Such information deficiency tends to misguide a conditioned language model to produce unconditioned random facts and thus leads to factual hallucinations. In this work, we restore the information balance and revamp this task to focus on fact-grounded data-to-text generation. We introduce a purified and larger-scale dataset, RotoWire-FG (Fact-Grounding), with 50% more data from the year 2017-19 and enriched input tables, hoping to attract more research focuses in this direction. Moreover, we achieve improved data fidelity over the state-of-the-art models by integrating a new form of table reconstruction as an auxiliary task to boost the generation quality.	翻訳日:2023-01-12 04:40:40 公開日:2020-01-12
# ニューラルモデルの一般化を再考する:名前付きエンティティ認識のケーススタディ Rethinking Generalization of Neural Models: A Named Entity Recognition Case Study ( http://arxiv.org/abs/2001.03844v1 ) ライセンス: Link先を確認	Jinlan Fu, Pengfei Liu, Qi Zhang, Xuanjing Huang	(参考訳) ニューラルネットワークベースのモデルは、多くのNLPタスクにおいて印象的なパフォーマンスを達成したが、異なるモデルの一般化動作は、まだ理解されていない。本稿では,既存のモデルの一般化挙動を異なる視点から分析し,その一般化能力の相違を,提案手法のレンズを通して特徴付けるためのテストベッドとしてnerタスクを取り入れた。詳細な分析による実験では、既存のニューラルネットワークnerモデルのボトルネックを、ブレークダウンパフォーマンス分析、アノテーションエラー、データセットバイアス、および改善の方向を示すカテゴリ関係の観点から診断する。我々は、将来の研究のためのデータセット(reconll, ploner)をプロジェクトページでリリースした。本論文の副産物として,最近のNER論文を包括的に要約したプロジェクトをオープンソースとして公開し,さまざまな研究トピックに分類した。 While neural network-based models have achieved impressive performance on a large body of NLP tasks, the generalization behavior of different models remains poorly understood: Does this excellent performance imply a perfect generalization model, or are there still some limitations? In this paper, we take the NER task as a testbed to analyze the generalization behavior of existing models from different perspectives and characterize the differences of their generalization abilities through the lens of our proposed measures, which guides us to better design models and training methods. Experiments with in-depth analyses diagnose the bottleneck of existing neural NER models in terms of breakdown performance analysis, annotation errors, dataset bias, and category relationships, which suggest directions for improvement. We have released the datasets: (ReCoNLL, PLONER) for the future research at our project page: http://pfliu.com/InterpretNER/. As a by-product of this paper, we have open-sourced a project that involves a comprehensive summary of recent NER papers and classifies them into different research topics: https://github.com/pfliu-nlp/Named-Entity-Recognition-NER-Papers.	翻訳日:2023-01-12 04:40:19 公開日:2020-01-12
# 新しい単語の意味の検出:スペイン語における単語埋め込みモデルの比較 Detecting New Word Meanings: A Comparison of Word Embedding Models in Spanish ( http://arxiv.org/abs/2001.05285v1 ) ライセンス: Link先を確認	Andr\'es Torres-Rivera and Juan-Manuel Torres-Moreno	(参考訳) 意味ネオロジズム(sn)は、形態を維持しながら新しい単語の意味を取得する単語として定義される。この種のネオロジズムの性質を考えると、これらの新しい単語の意味を識別するタスクは、現在、neologyのオブザーバリーの専門家によって手作業で行われている。 SNを半自動で検出するために,トピックモデリング,キーワード抽出,単語感覚の曖昧さといった手法を組み合わせたシステムを開発した。トピックモデリングの役割は、入力テキストで扱われるテーマを検出することである。例えば、バイラルはコンピュータサイエンス(CS)の文脈で1つの意味を持ち、健康について話すときにもう1つの意味を持っている。キーワードを抽出するために,posタグフィルタリング付きtextrankを用いた。この方法では、既にスペイン語のレキシコンの一部である関連語を得ることができる。ディープラーニングモデルを使用して、与えられたキーワードに新しい意味があるかどうかを判断します。すべての既知の意味(あるいはトピック)とは異なる埋め込みは、単語が有効なsn候補であることを示している。本研究では,Word2Vec,Sense2Vec,FastTextという単語埋め込みモデルについて検討した。モデルは、スペイン語のwikipediaをコーパスとして、同等のパラメータでトレーニングされた。次に、各モデルが生成する異なる埋め込みを示すために、単語のリストとその一致(ネオロジズムのデータベースから得られた)を使用しました。最後に、これらの結果と各単語の一致を比較して、ある単語がSNの有効な候補であるかどうかを判断する方法を示す。 Semantic neologisms (SN) are defined as words that acquire a new word meaning while maintaining their form. Given the nature of this kind of neologisms, the task of identifying these new word meanings is currently performed manually by specialists at observatories of neology. To detect SN in a semi-automatic way, we developed a system that implements a combination of the following strategies: topic modeling, keyword extraction, and word sense disambiguation. The role of topic modeling is to detect the themes that are treated in the input text. Themes within a text give clues about the particular meaning of the words that are used, for example: viral has one meaning in the context of computer science (CS) and another when talking about health. To extract keywords, we used TextRank with POS tag filtering. With this method, we can obtain relevant words that are already part of the Spanish lexicon. We use a deep learning model to determine if a given keyword could have a new meaning. Embeddings that are different from all the known meanings (or topics) indicate that a word might be a valid SN candidate. In this study, we examine the following word embedding models: Word2Vec, Sense2Vec, and FastText. The models were trained with equivalent parameters using Wikipedia in Spanish as corpora. Then we used a list of words and their concordances (obtained from our database of neologisms) to show the different embeddings that each model yields. Finally, we present a comparison of these outcomes with the concordances of each word to show how we can determine if a word could be a valid candidate for SN.	翻訳日:2023-01-12 04:39:55 公開日:2020-01-12
# ニューラルネットワークを用いたウルドゥー英語機械音訳 Urdu-English Machine Transliteration using Neural Networks ( http://arxiv.org/abs/2001.05296v1 ) ライセンス: Link先を確認	Usman Mohy ud Din	(参考訳) 近年は機械翻訳が注目されている。これは、ある言語から他の言語へのテキストの翻訳に焦点を当てた、計算言語学のサブフィールドである。さまざまな翻訳技術の中で、現在ニューラルネットワークは、注意のメカニズム、シーケンスツーシーケンス、長期のモデリングを備えた単一の大きなニューラルネットワークを提供することで、ドメインをリードしている。機械翻訳分野の著しい進歩にもかかわらず、専門用語を含む語彙外語(oov)の翻訳、名前付き文字を含む外国語は、現在の最先端の翻訳システムにとって依然として課題であり、低資源言語や異なる構造を持つ言語間の翻訳において、状況はさらに悪化する。言語の形態的豊かさのため、単語は異なる文脈で異なる髄を持つことがある。このようなシナリオでは、単語の翻訳は正しい/品質の翻訳を提供するのに十分ではない。翻訳は、翻訳中の単語/文の文脈を考える方法である。 urduのような低リソース言語の場合、システムのトレーニングに十分な大きさの並列コーパスを持つ/探すのは非常に困難である。本研究では,教師なし言語に依存しない予測最大化(EM)に基づく翻訳手法を提案する。システムは並列コーパスからパターンと外語彙(OOV)の単語を学習し、文字コーパスで明示的にトレーニングする必要はない。このアプローチは、フレーズベース、階層的フレーズベースおよび因子ベースモデルとLSTMとトランスフォーマーモデルを含む2つのニューラルマシン翻訳モデルを含む統計機械翻訳(SMT)の3つのモデルで検証される。 Machine translation has gained much attention in recent years. It is a sub-field of computational linguistic which focus on translating text from one language to other language. Among different translation techniques, neural network currently leading the domain with its capabilities of providing a single large neural network with attention mechanism, sequence-to-sequence and long-short term modelling. Despite significant progress in domain of machine translation, translation of out-of-vocabulary words(OOV) which include technical terms, named-entities, foreign words are still a challenge for current state-of-art translation systems, and this situation becomes even worse while translating between low resource languages or languages having different structures. Due to morphological richness of a language, a word may have different meninges in different context. In such scenarios, translation of word is not only enough in order provide the correct/quality translation. Transliteration is a way to consider the context of word/sentence during translation. For low resource language like Urdu, it is very difficult to have/find parallel corpus for transliteration which is large enough to train the system. In this work, we presented transliteration technique based on Expectation Maximization (EM) which is un-supervised and language independent. Systems learns the pattern and out-of-vocabulary (OOV) words from parallel corpus and there is no need to train it on transliteration corpus explicitly. This approach is tested on three models of statistical machine translation (SMT) which include phrasebased, hierarchical phrase-based and factor based models and two models of neural machine translation which include LSTM and transformer model.	翻訳日:2023-01-12 04:39:31 公開日:2020-01-12
# 商品画像分類のためのトリックの袋 Bag of Tricks for Retail Product Image Classification ( http://arxiv.org/abs/2001.03992v1 ) ライセンス: Link先を確認	Muktabh Mayank Srivastava	(参考訳) 小売商品画像分類は、セルフチェックアウトストアや自動小売実行評価のような現実のシステムを構築する上で重要なコンピュータビジョンと機械学習の問題である。本研究では,各種小売商品画像分類データセットの深層学習モデルの精度を高めるための様々な手法を提案する。これらの手法により、小売商品画像分類のための微調整コンブネットの精度を大きなマージンで向上させることができる。最も顕著なトリックとして、複数のデータセットで一貫したゲインを提供する、Local-Concepts-Accumulation (LCA)層と呼ばれる新しいニューラルネットワーク層を導入する。小売製品識別の精度を高めるための他の2つのトリックは、instagramでトレーニング済みのconvnetを使用して、最大エントロピーを分類の補助損失として使用することです。 Retail Product Image Classification is an important Computer Vision and Machine Learning problem for building real world systems like self-checkout stores and automated retail execution evaluation. In this work, we present various tricks to increase accuracy of Deep Learning models on different types of retail product image classification datasets. These tricks enable us to increase the accuracy of fine tuned convnets for retail product image classification by a large margin. As the most prominent trick, we introduce a new neural network layer called Local-Concepts-Accumulation (LCA) layer which gives consistent gains across multiple datasets. Two other tricks we find to increase accuracy on retail product identification are using an instagram-pretrained Convnet and using Maximum Entropy as an auxiliary loss for classification.	翻訳日:2023-01-12 04:33:28 公開日:2020-01-12
# ガウス過程を用いた特徴量に基づく非剛性画像登録の検討 An Investigation of Feature-based Nonrigid Image Registration using Gaussian Process ( http://arxiv.org/abs/2001.05862v1 ) ライセンス: Link先を確認	Siming Bayer, Ute Spiske, Jie Luo, Tobias Geimer, William M. Wells III, Martin Ostermeier, Rebecca Fahrig, Arya Nabavi, Christoph Bert, Ilker Eyupoglo, and Andreas Maier	(参考訳) 適応的治療計画や術中画像更新のような幅広い臨床応用において,fdr(feature-based deformable registration)アプローチは単純さと計算複雑性の低さから広く採用されている。 fdrアルゴリズムは、選択された特徴間の確立された対応によって与えられるスパースフィールドを補間することにより、密度の高い変位場を推定する。本稿では, 変形場をガウス過程 (GP) とみなす一方, 選択した特徴を有効変形の先行情報とみなす。 gpを用いて, 高密度変位場と対応する不確かさマップの両方を同時に推定することができる。さらに,合成,ファントム,臨床データを用いた2乗指数カーネルの異なるハイパーパラメータ設定の性能評価を行った。定量的比較の結果,GP-based interpolation は最先端のB-spline interpolation と同等の性能を示した。 gpに基づく補間の最大の臨床的利点は、計算された濃密な変位マップの数学的不確かさの信頼できる推定を与えることである。 For a wide range of clinical applications, such as adaptive treatment planning or intraoperative image update, feature-based deformable registration (FDR) approaches are widely employed because of their simplicity and low computational complexity. FDR algorithms estimate a dense displacement field by interpolating a sparse field, which is given by the established correspondence between selected features. In this paper, we consider the deformation field as a Gaussian Process (GP), whereas the selected features are regarded as prior information on the valid deformations. Using GP, we are able to estimate the both dense displacement field and a corresponding uncertainty map at once. Furthermore, we evaluated the performance of different hyperparameter settings for squared exponential kernels with synthetic, phantom and clinical data respectively. The quantitative comparison shows, GP-based interpolation has performance on par with state-of-the-art B-spline interpolation. The greatest clinical benefit of GP-based interpolation is that it gives a reliable estimate of the mathematical uncertainty of the calculated dense displacement map.	翻訳日:2023-01-12 04:33:15 公開日:2020-01-12
# 依存情報を用いた確率的自然言語生成 Stochastic Natural Language Generation Using Dependency Information ( http://arxiv.org/abs/2001.03897v1 ) ライセンス: Link先を確認	Elham Seifossadat and Hossein Sameti	(参考訳) 本稿では,自然言語テキスト生成のための確率コーパスモデルを提案する。提案モデルでは,まず,特徴集合を通じてトレーニングデータから依存関係関係を符号化し,次にそれらの特徴を結合して,与えられた意味表現のための新しい依存性木を生成し,最終的に生成した依存性木から自然言語の発話を生成する。我々は、表式、対話法、rdfフォーマットの9つのドメインでモデルをテストする。また,対話行動,E2E,WebNLGデータセットを用いたBLEUおよびERR評価指標を用いて学習したニューラルネットワークに基づくアプローチと同等の結果が得られる。また,人間評価結果を報告することにより,情報性や自然性,品質の面から高品質な発話を生成できることを示した。 This article presents a stochastic corpus-based model for generating natural language text. Our model first encodes dependency relations from training data through a feature set, then concatenates these features to produce a new dependency tree for a given meaning representation, and finally generates a natural language utterance from the produced dependency tree. We test our model on nine domains from tabular, dialogue act and RDF format. Our model outperforms the corpus-based state-of-the-art methods trained on tabular datasets and also achieves comparable results with neural network-based approaches trained on dialogue act, E2E and WebNLG datasets for BLEU and ERR evaluation metrics. Also, by reporting Human Evaluation results, we show that our model produces high-quality utterances in aspects of informativeness and naturalness as well as quality.	翻訳日:2023-01-12 04:32:58 公開日:2020-01-12
# 疎フィードバックを伴う複雑な操作課題に対する深層強化学習 Deep Reinforcement Learning for Complex Manipulation Tasks with Sparse Feedback ( http://arxiv.org/abs/2001.03877v1 ) ライセンス: Link先を確認	Binyamin Manela	(参考訳) 疎いフィードバックから最適なポリシーを学ぶことは、強化学習における既知の課題である。 Hindsight Experience Replay (HER) は、そのような課題を解決するためのマルチゴール強化学習アルゴリズムである。このアルゴリズムは、全ての失敗をエピソードで達成された代替(仮想)目標の成功として扱い、その仮想目標から実際の目標へと一般化する。 HERには既知の欠陥があり、比較的単純なタスクに限定されている。本論文では,既存のherアルゴリズムに基づく3つのアルゴリズムを提案する。まず、エージェントがより価値のある情報を学ぶ仮想目標を優先します。この性質を仮想ゴールの「textit{instructiveness}」と呼び、エージェントが仮想ゴールから実際のゴールへの一般化をいかにうまく行うかを表すヒューリスティックな尺度で定義する。第二に,学習過程全体を通してバイアスを生じさせるような誤解を招くサンプルを検出し,除去するフィルタリングプロセスを設計した。最後に、HERと組み合わせたカリキュラム学習の形式を用いて、複雑でシーケンシャルなタスクの学習を可能にする。このアルゴリズムを \textit{curriculum her} と呼ぶ。アルゴリズムをテストするため、3つの難解な操作環境を構築しました。それぞれの環境は複雑度が3つある。実験の結果,herアルゴリズムと比較した場合,最終的な成功率とサンプル効率は大幅に向上した。 Learning optimal policies from sparse feedback is a known challenge in reinforcement learning. Hindsight Experience Replay (HER) is a multi-goal reinforcement learning algorithm that comes to solve such tasks. The algorithm treats every failure as a success for an alternative (virtual) goal that has been achieved in the episode and then generalizes from that virtual goal to real goals. HER has known flaws and is limited to relatively simple tasks. In this thesis, we present three algorithms based on the existing HER algorithm that improves its performances. First, we prioritize virtual goals from which the agent will learn more valuable information. We call this property the \textit{instructiveness} of the virtual goal and define it by a heuristic measure, which expresses how well the agent will be able to generalize from that virtual goal to actual goals. Secondly, we designed a filtering process that detects and removes misleading samples that may induce bias throughout the learning process. Lastly, we enable the learning of complex, sequential, tasks using a form of curriculum learning combined with HER. We call this algorithm \textit{Curriculum HER}. To test our algorithms, we built three challenging manipulation environments with sparse reward functions. Each environment has three levels of complexity. Our empirical results show vast improvement in the final success rate and sample efficiency when compared to the original HER algorithm.	翻訳日:2023-01-12 04:32:44 公開日:2020-01-12
# Fastは無料より優れている - 敵のトレーニングを再考する Fast is better than free: Revisiting adversarial training ( http://arxiv.org/abs/2001.03994v1 ) ライセンス: Link先を確認	Eric Wong, Leslie Rice, J. Zico Kolter	(参考訳) 強固なディープネットワークを学習する手法であるadversarial trainingは、通常、プロジェクテッド・グラデーション・フォーマル(pgd)のような一階法で敵の例を構築する必要があるため、従来のトレーニングよりも高価であると考えられている。本稿では, 従来非効率と思われていたアプローチである, より弱く安価な逆境を用いて, 実験的に堅牢なモデルを訓練できるという驚くべき発見を行ない, 実際に行う訓練よりもコストがかからない手法を提案する。具体的には,ファストグレードサイン法(fast gradient sign method, fgsm)とランダム初期化を組み合わせることで,pgdベースのトレーニングと同等の効果を示すが,コストは極めて低い。さらに, ディープネットワークの効率的なトレーニングのための標準技術を用いて, 45%のロバストなcifar10分類器を6分で学習できること, 43%のロバストなイメージネット分類器を12時間以内に2/255$で学習できること, 10時間から50時間かけて同じしきい値に到達した"free"アドバーサリートレーニングに基づく過去の作業と比較して, fgsmの敵意訓練をさらに促進できることを示した。最後に,FGSM逆行訓練の失敗の原因となった「破滅的オーバーフィッティング(catastrophic overfitting)」と呼ばれる障害モードを同定した。この論文で実験を再現するためのコードはすべてhttps://github.com/locuslab/fast_adversarial.comにある。 Adversarial training, a method for learning robust deep networks, is typically assumed to be more expensive than traditional training due to the necessity of constructing adversarial examples via a first-order method like projected gradient decent (PGD). In this paper, we make the surprising discovery that it is possible to train empirically robust models using a much weaker and cheaper adversary, an approach that was previously believed to be ineffective, rendering the method no more costly than standard training in practice. Specifically, we show that adversarial training with the fast gradient sign method (FGSM), when combined with random initialization, is as effective as PGD-based training but has significantly lower cost. Furthermore we show that FGSM adversarial training can be further accelerated by using standard techniques for efficient training of deep networks, allowing us to learn a robust CIFAR10 classifier with 45% robust accuracy to PGD attacks with $\epsilon=8/255$ in 6 minutes, and a robust ImageNet classifier with 43% robust accuracy at $\epsilon=2/255$ in 12 hours, in comparison to past work based on "free" adversarial training which took 10 and 50 hours to reach the same respective thresholds. Finally, we identify a failure mode referred to as "catastrophic overfitting" which may have caused previous attempts to use FGSM adversarial training to fail. All code for reproducing the experiments in this paper as well as pretrained model weights are at https://github.com/locuslab/fast_adversarial.	翻訳日:2023-01-12 04:31:21 公開日:2020-01-12
# テキスト分類のためのテンソルグラフ畳み込みネットワーク Tensor Graph Convolutional Networks for Text Classification ( http://arxiv.org/abs/2001.05313v1 ) ライセンス: Link先を確認	Xien Liu, Xinxin You, Xiao Zhang, Ji Wu and Ping Lv	(参考訳) 逐次学習モデルと比較して、グラフベースのニューラルネットワークは、グローバル情報をキャプチャする能力など、優れた特性を示している。本稿では,テキスト分類問題に対するグラフベースニューラルネットワークについて検討する。この課題に対して、新しいフレームワークTensorGCN(テンソルグラフ畳み込みネットワーク)が提案されている。テキストグラフテンソルは、まずセマンティック、構文、シーケンシャルな文脈情報を記述するために構築される。そして、テキストグラフテンソル上で2種類の伝播学習を行う。 1つ目は、単一のグラフ内の近傍ノードからの情報を集約するために使用されるグラフ内伝搬である。 2つ目はグラフ間の異種情報の調和に使用されるグラフ間伝播である。ベンチマークデータセットを用いて大規模な実験を行い,提案手法の有効性を示した。提案するTensorGCNは,異なる種類のグラフからの異種情報の調和と統合に有効な方法である。 Compared to sequential learning models, graph-based neural networks exhibit some excellent properties, such as ability capturing global information. In this paper, we investigate graph-based neural networks for text classification problem. A new framework TensorGCN (tensor graph convolutional networks), is presented for this task. A text graph tensor is firstly constructed to describe semantic, syntactic, and sequential contextual information. Then, two kinds of propagation learning perform on the text graph tensor. The first is intra-graph propagation used for aggregating information from neighborhood nodes in a single graph. The second is inter-graph propagation used for harmonizing heterogeneous information between graphs. Extensive experiments are conducted on benchmark datasets, and the results illustrate the effectiveness of our proposed framework. Our proposed TensorGCN presents an effective way to harmonize and integrate heterogeneous information from different kinds of graphs.	翻訳日:2023-01-12 04:30:28 公開日:2020-01-12

Title

Authors

Abstract

論文公表日・翻訳日

# bring-Your-Own-Device (BYOD) プログラミングエグゼムのセキュア化

Securing Bring-Your-Own-Device (BYOD) Programming Exams ( http://arxiv.org/abs/2001.03942v1 )

ライセンス: Link先を確認

Oka Kurniawan, Norman Tiong Seng Lee, Christopher M. Poskitt

(参考訳) 従来のペンと紙の試験は、実践的なコーディング能力をターゲットにした教育や学習の目的と不一致であるため、現代の大学プログラミングコースでは不十分である。残念ながら、多くの機関は専用のコンピュータラボでアセスメントを実行するためのリソースやスペースを欠いている。このことは、学生がどのように学習するかと同様の環境でプログラムできるような、BYOD( bring-your-own-device)試験フォーマットの開発を動機付けている。本稿では,学生のラップトップをシステムやインターネットアクセスの制限のあるセキュアなワークステーションに変えるソフトウェアであるロックダウンブラウザに基づくbyod試験ソリューションについて述べる。この技術を学習管理システムとクラウドベースのプログラミングツールと組み合わせることで,対話的かつ制御可能な環境において,概念的かつ実践的なプログラミング問題に取り組むことができる。我々は、このソリューションを主要な学部プログラミングコースに導入した経験を反映し、ポリシーとサポートメカニズムが技術そのものと同じくらい重要であるという私たちの主要な教訓を強調します。

Traditional pen and paper exams are inadequate for modern university programming courses as they are misaligned with pedagogies and learning objectives that target practical coding ability. Unfortunately, many institutions lack the resources or space to be able to run assessments in dedicated computer labs. This has motivated the development of bring-your-own-device (BYOD) exam formats, allowing students to program in a similar environment to how they learnt, but presenting instructors with significant additional challenges in preventing plagiarism and cheating. In this paper, we describe a BYOD exam solution based on lockdown browsers, software which temporarily turns students' laptops into secure workstations with limited system or internet access. We combine the use of this technology with a learning management system and cloud-based programming tool to facilitate conceptual and practical programming questions that can be tackled in an interactive but controlled environment. We reflect on our experience of implementing this solution for a major undergraduate programming course, highlighting our principal lesson that policies and support mechanisms are as important to consider as the technology itself.

翻訳日:2023-06-08 03:57:45 公開日:2020-01-12

# 生成的逆模倣学習の計算と一般化について

On Computation and Generalization of Generative Adversarial Imitation Learning ( http://arxiv.org/abs/2001.02792v2 )

ライセンス: Link先を確認

Minshuo Chen, Yizhou Wang, Tianyi Liu, Zhuoran Yang, Xingguo Li, Zhaoran Wang, Tuo Zhao

(参考訳) GAIL(Generative Adversarial Imitation Learning)は、シーケンシャルな意思決定ポリシーを学ぶための強力で実践的なアプローチである。強化学習(RL)とは異なり、GAILは専門家(例えば人間)による実証データを活用し、未知の環境のポリシーと報酬関数の両方を学ぶ。顕著な経験的進歩にもかかわらず、GAILの背後にある理論はほとんど不明である。主な困難は、デモデータの根底にある時間依存性と、凸凹構造を持たないGAILの最小計算定式化である。このような理論と実践のギャップを埋めるため,ガイルの理論的性質を考察する。具体的には,(1)報奨関数のクラスが適切に制御される限り,GAILに対して一般化を保証すること,(2)報奨関数が再生カーネル関数としてパラメータ化されるGAILに対して,GAILを確率的一階最適化アルゴリズムにより効率よく解き、定常解へのサブ線形収束を実現すること,を示す。我々の知る限り、これらは報酬/政治機能近似による模倣学習の統計的および計算的保証に関する最初の結果である。解析を支援するために数値実験を行った。

Generative Adversarial Imitation Learning (GAIL) is a powerful and practical approach for learning sequential decision-making policies. Different from Reinforcement Learning (RL), GAIL takes advantage of demonstration data by experts (e.g., human), and learns both the policy and reward function of the unknown environment. Despite the significant empirical progresses, the theory behind GAIL is still largely unknown. The major difficulty comes from the underlying temporal dependency of the demonstration data and the minimax computational formulation of GAIL without convex-concave structure. To bridge such a gap between theory and practice, this paper investigates the theoretical properties of GAIL. Specifically, we show: (1) For GAIL with general reward parameterization, the generalization can be guaranteed as long as the class of the reward functions is properly controlled; (2) For GAIL, where the reward is parameterized as a reproducing kernel function, GAIL can be efficiently solved by stochastic first order optimization algorithms, which attain sublinear convergence to a stationary solution. To the best of our knowledge, these are the first results on statistical and computational guarantees of imitation learning with reward/policy function approximation. Numerical experiments are provided to support our analysis.

翻訳日:2023-01-13 04:21:45 公開日:2020-01-12

# アダプティブ隣り合わせ型分別スパルスPCAによる次元化

Supervised Discriminative Sparse PCA with Adaptive Neighbors for Dimensionality Reduction ( http://arxiv.org/abs/2001.03103v2 )

ライセンス: Link先を確認

Zhenhua Shi, Dongrui Wu, Jian Huang, Yu-Kai Wang, Chin-Teng Lin

(参考訳) 情報可視化,特徴抽出,クラスタリング,回帰,分類において,特にノイズの多い高次元データを処理するための重要な操作である。しかし、既存のアプローチのほとんどは、データのグローバル構造かローカル構造を保存しているが、両方ではない。主成分分析(PCA)のようなグローバルなデータ構造のみを保持するアプローチは、通常、外れ値に敏感である。局所性保存プロジェクションのような局所データ構造のみを保存するアプローチは通常、教師なし(そのためラベル情報は使用できない)で、固定された類似性グラフを使用する。そこで本研究では, 線形次元削減手法として, 適応隣人との識別的スパースPCA(SDSPCAAN)を新たに提案し, 適応隣人とのクラスタリングを図った。その結果、グローバルデータ構造とローカルデータ構造、およびラベル情報の両方が、より次元性の低減に使用される。 9つの高次元データセットの分類実験により,提案したSDSPCAANの有効性とロバスト性を検証した。

Dimensionality reduction is an important operation in information visualization, feature extraction, clustering, regression, and classification, especially for processing noisy high dimensional data. However, most existing approaches preserve either the global or the local structure of the data, but not both. Approaches that preserve only the global data structure, such as principal component analysis (PCA), are usually sensitive to outliers. Approaches that preserve only the local data structure, such as locality preserving projections, are usually unsupervised (and hence cannot use label information) and uses a fixed similarity graph. We propose a novel linear dimensionality reduction approach, supervised discriminative sparse PCA with adaptive neighbors (SDSPCAAN), to integrate neighborhood-free supervised discriminative sparse PCA and projected clustering with adaptive neighbors. As a result, both global and local data structures, as well as the label information, are used for better dimensionality reduction. Classification experiments on nine high-dimensional datasets validated the effectiveness and robustness of our proposed SDSPCAAN.

翻訳日:2023-01-13 04:19:51 公開日:2020-01-12

# 量子振幅減衰符号に対する線形計画法

Linear programming bounds for quantum amplitude damping codes ( http://arxiv.org/abs/2001.03976v1 )

ライセンス: Link先を確認

Yingkai Ouyang and Ching-Yi Lai

(参考訳) 近似量子誤り訂正符号(AQEC)が完全量子誤り訂正符号よりも性能が優れていることを考慮すれば、それらの性能を定量化することが重要となる。量子重み列挙器は、量子誤り訂正符号の最小距離において最良の上限を設定するが、これらの境界はaqec符号に直接は適用されない。本稿では、振幅減衰(AD)誤差に対する量子量列挙器を導入し、近似量子誤差補正の枠組みの中で機能する。特に、符号空間に固有な補助的完全重み列挙子を導入し、さらに、ad誤差に対する量子重み列挙子とこの補助的完全重み列挙子との線形関係を確立する。これにより、AQEC ADコードに対応するパラメータが存在しない場合にのみ実現不可能な線形プログラムを確立することができる。線形プログラムを説明するために、任意のADエラーを修正することができる3ビットAD符号の存在を数値的に排除する。

Given that approximate quantum error-correcting (AQEC) codes have a potentially better performance than perfect quantum error correction codes, it is pertinent to quantify their performance. While quantum weight enumerators establish some of the best upper bounds on the minimum distance of quantum error-correcting codes, these bounds do not directly apply to AQEC codes. Herein, we introduce quantum weight enumerators for amplitude damping (AD) errors and work within the framework of approximate quantum error correction. In particular, we introduce an auxiliary exact weight enumerator that is intrinsic to a code space and moreover, we establish a linear relationship between the quantum weight enumerators for AD errors and this auxiliary exact weight enumerator. This allows us to establish a linear program that is infeasible only when AQEC AD codes with corresponding parameters do not exist. To illustrate our linear program, we numerically rule out the existence of three-qubit AD codes that are capable of correcting an arbitrary AD error.

翻訳日:2023-01-12 05:11:06 公開日:2020-01-12

# 信号解析と量子形式論:プランク定数を持たない量子化

Signal analysis and quantum formalism: Quantizations with no Planck constant ( http://arxiv.org/abs/2001.04916v1 )

ライセンス: Link先を確認

Jean Pierre Gazeau and Celestin Habonimana

(参考訳) 信号解析は、信号ベクトル空間(例えばフーリエ、ガボー、ウェーブレットなど)におけるアイデンティティの様々な分解能に基づいている。同様の分解能は関数や分布の量子化器として使われ、時間周波数や時間スケールの量子形式への道を歩み、興味深いか予期せぬ特徴を明らかにする。光子ではなく波の量子論と見なされる古典的電磁磁性への拡張について述べる。

Signal analysis is built upon various resolutions of the identity in signal vector spaces, e.g. Fourier, Gabor, wavelets, etc. Similar resolutions are used as quantizers of functions or distributions, paving the way to a time-frequency or time-scale quantum formalism and revealing interesting or unexpected features. Extensions to classical electromagnetism viewed as a quantum theory for waves and not for photons are mentioned.

翻訳日:2023-01-12 05:10:49 公開日:2020-01-12

# 集中型無線アクセスネットワークにおける大規模MIMO処理のための量子アニールの活用

Leveraging Quantum Annealing for Large MIMO Processing in Centralized Radio Access Networks ( http://arxiv.org/abs/2001.04014v1 )

ライセンス: Link先を確認

Minsung Kim, Davide Venturelli, Kyle Jamieson

(参考訳) 無線容量の増加に対するユーザの需要は、供給を上回っており、この需要を満たすために、新しいmimo無線物理層技術において大きな進歩を遂げている。高性能なシステムは、アルゴリズムの計算能力が非常に高いため、ほとんど実用的ではないままである。最適な性能を得るためには、ユーザ数と各ユーザのデータレートの両方で指数関数的に増加する計算量が必要となることが多い。これにより、基地局の計算能力は無線容量の重要な制限要因の1つとなっている。 QuAMaxは、この問題に量子アニールを利用して対処する最初の大規模なMIMO無線アクセスネットワークである。我々は2,031量子ビットD-Wave 2000Q量子アニールにQuAMaxを実装した。実験の結果,2000Qの計算時間10〜$\mu$sは,48ユーザ,48APアンテナBPSK通信を20dB SNRのビット誤り率10〜6}$,1500バイトのフレーム誤り率10〜4}$で実現可能であることがわかった。

User demand for increasing amounts of wireless capacity continues to outpace supply, and so to meet this demand, significant progress has been made in new MIMO wireless physical layer techniques. Higher-performance systems now remain impractical largely only because their algorithms are extremely computationally demanding. For optimal performance, an amount of computation that increases at an exponential rate both with the number of users and with the data rate of each user is often required. The base station's computational capacity is thus becoming one of the key limiting factors on wireless capacity. QuAMax is the first large MIMO centralized radio access network design to address this issue by leveraging quantum annealing on the problem. We have implemented QuAMax on the 2,031 qubit D-Wave 2000Q quantum annealer, the state-of-the-art in the field. Our experimental results evaluate that implementation on real and synthetic MIMO channel traces, showing that 10~$\mu$s of compute time on the 2000Q can enable 48 user, 48 AP antenna BPSK communication at 20 dB SNR with a bit error rate of $10^{-6}$ and a 1,500 byte frame error rate of $10^{-4}$.

翻訳日:2023-01-12 05:09:08 公開日:2020-01-12

# ロバストニューラルネットワークの関数誤差補正

Functional Error Correction for Robust Neural Networks ( http://arxiv.org/abs/2001.03814v1 )

ライセンス: Link先を確認

Kunping Huang, Paul Siegel, Anxiao (Andrew) Jiang

(参考訳) ニューラルネットワーク(NeuralNets)をハードウェアで実装する場合、その重みをメモリデバイスに格納する必要がある。格納された重みにノイズが蓄積されると、NeuralNetのパフォーマンスは低下する。本稿では,重みを保護するために誤り訂正符号(ECC)の使い方について検討する。データストレージにおける古典的な誤り訂正とは異なり、最適化の目的は、保護されたビットの誤り率を最小化するのではなく、エラー修正後のNeuralNetのパフォーマンスを最適化することである。すなわち、ニューラルネットワークを入力の関数として見ることにより、エラー訂正スキームは関数指向である。最大の課題は、ディープニューラルネットワークは、しばしば数百万から数億の重量を持ち、ECCの大きな冗長性オーバーヘッドを引き起こし、重みとNeuralNetのパフォーマンスの関係は非常に複雑であることだ。そこで本研究では,ECC保護のための重要なビットのサブセットのみを選択するSelective Protection (SP)方式を提案する。このようなビットを探し出し、ECCの冗長性とNeuralNetの性能のトレードオフを最適化するために、深層強化学習に基づくアルゴリズムを提案する。実験の結果,本アルゴリズムは,本手法と比較して,機能的誤り訂正タスクの性能が大幅に向上することを確認した。

When neural networks (NeuralNets) are implemented in hardware, their weights need to be stored in memory devices. As noise accumulates in the stored weights, the NeuralNet's performance will degrade. This paper studies how to use error correcting codes (ECCs) to protect the weights. Different from classic error correction in data storage, the optimization objective is to optimize the NeuralNet's performance after error correction, instead of minimizing the Uncorrectable Bit Error Rate in the protected bits. That is, by seeing the NeuralNet as a function of its input, the error correction scheme is function-oriented. A main challenge is that a deep NeuralNet often has millions to hundreds of millions of weights, causing a large redundancy overhead for ECCs, and the relationship between the weights and its NeuralNet's performance can be highly complex. To address the challenge, we propose a Selective Protection (SP) scheme, which chooses only a subset of important bits for ECC protection. To find such bits and achieve an optimized tradeoff between ECC's redundancy and NeuralNet's performance, we present an algorithm based on deep reinforcement learning. Experimental results verify that compared to the natural baseline scheme, the proposed algorithm achieves substantially better performance for the functional error correction task.

翻訳日:2023-01-12 05:08:48 公開日:2020-01-12

# CUREデータセット:オーディオイベント分類のためのラダーネットワーク

CURE Dataset: Ladder Networks for Audio Event Classification ( http://arxiv.org/abs/2001.03896v1 )

ライセンス: Link先を確認

Harishchandra Dubey, Dimitra Emmanouilidou, Ivan J. Tashev

(参考訳) 音声イベント分類は、監視、音声、ビデオ、マルチメディア検索など、いくつかのアプリケーションにとって重要なタスクである。約300万人が聴力を失い、周囲で起きている出来事を認識できない。本稿では,聴覚障害者に最も関係のある特定の音声イベントのキュレーションセットを含む治療データセットについて述べる。本論文では,freesoundプロジェクトから派生した5s音声記録を用いたラダーネットワーク型音声イベント分類器を提案する。我々は,現在最先端の畳み込みニューラルネットワーク(CNN)をオーディオ機能として採用した。また,イベント分類のための極端学習機械 (ELM) についても検討する。本研究では,提案する分類器をサポートベクトルマシン(SVM)ベースラインと比較する。異なる録音シナリオ間のミスマッチを低減することを目的とした信号と特徴の正規化を提案する。まず、CNNは弱いラベル付きAudiosetデータに基づいて訓練される。次に, 予め学習したモデルを用いて, 提案する治療コーパスの特徴抽出を行う。 esc-50データセットを第2の評価セットとして組み込む。 ELM と SVM の分類器に対するラダーネットワークの優位性について,ロバスト性および分類精度の向上の観点から検証した。 Ladder ネットワークはデータミスマッチに対して堅牢であるが、単純な SVM と ELM の分類器はそのようなミスマッチに敏感であり、提案手法が重要な役割を果たす。 ESC-50とCUREコーパスによる実験的研究は、提案手法によって提供されるデータセットの複雑さと堅牢性の違いを解明する。

Audio event classification is an important task for several applications such as surveillance, audio, video and multimedia retrieval etc. There are approximately 3M people with hearing loss who can't perceive events happening around them. This paper establishes the CURE dataset which contains curated set of specific audio events most relevant for people with hearing loss. We propose a ladder network based audio event classifier that utilizes 5s sound recordings derived from the Freesound project. We adopted the state-of-the-art convolutional neural network (CNN) embeddings as audio features for this task. We also investigate extreme learning machine (ELM) for event classification. In this study, proposed classifiers are compared with support vector machine (SVM) baseline. We propose signal and feature normalization that aims to reduce the mismatch between different recordings scenarios. Firstly, CNN is trained on weakly labeled Audioset data. Next, the pre-trained model is adopted as feature extractor for proposed CURE corpus. We incorporate ESC-50 dataset as second evaluation set. Results and discussions validate the superiority of Ladder network over ELM and SVM classifier in terms of robustness and increased classification accuracy. While Ladder network is robust to data mismatches, simpler SVM and ELM classifiers are sensitive to such mismatches, where the proposed normalization techniques can play an important role. Experimental studies with ESC-50 and CURE corpora elucidate the differences in dataset complexity and robustness offered by proposed approaches.

翻訳日:2023-01-12 05:08:11 公開日:2020-01-12

# 包括的深層学習に基づく非線形時間依存パラメタライズドPDEのオーダーモデリング

A comprehensive deep learning-based approach to reduced order modeling of nonlinear time-dependent parametrized PDEs ( http://arxiv.org/abs/2001.04001v1 )

ライセンス: Link先を確認

Stefania Fresca, Luca Dede, Andrea Manzoni

(参考訳) 還元基底法(RB)法(例えば、適切な直交分解法(POD))のような伝統的な還元次数モデリング技術は、それらに基づくモードの線形重ね合わせの基本的な仮定のため、非線形時間依存パラメトリゼーションPDEを扱う際に厳しい制限を受ける。このため、トランスポート、ウェーブ、対流支配現象などの時間上を伝播するコヒーレント構造を特徴とする問題の場合、rb法は、高忠実度全階モデル(fom)の解に対して十分に精度の低下した次数近似を求めると、通常、効率の悪い還元次数モデル(rom)が得られる。本研究は,これらの制約を克服するために,ディープラーニング(DL)アルゴリズムを応用した低次モデル設定のための非線形手法を提案する。結果として得られた非線形ROMはDL-ROMと呼ばれ、非線形トライアル多様体(線形ROMの基底関数の集合に対応する)と非線形還元力学(線形ROMの投影段階に対応する)の両方をDLアルゴリズムに頼って非侵襲的に学習し、後者は異なるパラメータ値に対して得られたFOM解の集合に基づいて訓練する。本稿では, 線形および非線形時間依存型パラメタライズPDEのためのDL-ROMを構築する方法, さらに, 異なるパラメタライズPDE問題を特徴とするテストケースにおいて, その精度を評価する。 PDE解多様体の内在次元に等しい次元のDL-ROMは、同じ精度を達成するために大量のPODモードを必要とする状況において、パラメータ化されたPDEの解を近似できることを示す。

Traditional reduced order modeling techniques such as the reduced basis (RB) method (relying, e.g., on proper orthogonal decomposition (POD)) suffer from severe limitations when dealing with nonlinear time-dependent parametrized PDEs, because of the fundamental assumption of linear superimposition of modes they are based on. For this reason, in the case of problems featuring coherent structures that propagate over time such as transport, wave, or convection-dominated phenomena, the RB method usually yields inefficient reduced order models (ROMs) if one aims at obtaining reduced order approximations sufficiently accurate compared to the high-fidelity, full order model (FOM) solution. To overcome these limitations, in this work, we propose a new nonlinear approach to set reduced order models by exploiting deep learning (DL) algorithms. In the resulting nonlinear ROM, which we refer to as DL-ROM, both the nonlinear trial manifold (corresponding to the set of basis functions in a linear ROM) as well as the nonlinear reduced dynamics (corresponding to the projection stage in a linear ROM) are learned in a non-intrusive way by relying on DL algorithms; the latter are trained on a set of FOM solutions obtained for different parameter values. In this paper, we show how to construct a DL-ROM for both linear and nonlinear time-dependent parametrized PDEs; moreover, we assess its accuracy on test cases featuring different parametrized PDE problems. Numerical results indicate that DL-ROMs whose dimension is equal to the intrinsic dimensionality of the PDE solutions manifold are able to approximate the solution of parametrized PDEs in situations where a huge number of POD modes would be necessary to achieve the same degree of accuracy.

翻訳日:2023-01-12 05:07:51 公開日:2020-01-12

# アモルファスフォトニック位相絶縁体

Amorphous photonic topological insulator ( http://arxiv.org/abs/2001.03819v1 )

ライセンス: Link先を確認

Peiheng Zhou, Gui-Geng Liu, Xin Ren, Yihao Yang, Haoran Xue, Lei Bi, Longjiang Deng, Yidong Chong, and Baile Zhang

(参考訳) フォトニックトポロジー絶縁体(ptis)はバンドトポロジーによって保護される頑健なフォトニックエッジ状態を示す。標準バンド理論は、長距離の位置順序を持たないが短距離順序のみを持つ非結晶格子によって形成される物質のアモルファス相には適用されない。その他の興味深い性質の中で、アモルファス媒体はガラスと液体の相の遷移を示し、短距離秩序の劇的な変化を伴う。本稿では,Chern-number-based PTIのアモルファス変種について実験的に検討する。格子の歪み強度を調整することにより、ガラス-液相遷移の前に、光学的位相的エッジ状態がアモルファス状態に持続可能であることを示す。液体状の格子構造への遷移の後、位相的エッジ状態のシグネチャは消滅する。アモルファス格子におけるトポロジーと短距離秩序の間のこの相互作用は、新しい非結晶トポロジーフォトニック材料への道を開く。

Photonic topological insulators (PTIs) exhibit robust photonic edge states protected by band topology, similar to electronic edge states in topological band insulators. Standard band theory does not apply to amorphous phases of matter, which are formed by non-crystalline lattices with no long-range positional order but only short-range order. Among other interesting properties, amorphous media exhibit transitions between glassy and liquid phases, accompanied by dramatic changes in short-range order. Here, we experimentally investigate amorphous variants of a Chern-number-based PTI. By tuning the disorder strength in the lattice, we demonstrate that photonic topological edge states can persist into the amorphous regime, prior to the glass-to-liquid transition. After the transition to a liquid-like lattice configuration, the signatures of topological edge states disappear. This interplay between topology and short-range order in amorphous lattices paves the way for new classes of non-crystalline topological photonic materials.

翻訳日:2023-01-12 05:07:18 公開日:2020-01-12

# 希薄スピングラスにおけるフラストレーションと遷移点の簡単な関係

A simple relation between frustration and transition points in diluted spin glasses ( http://arxiv.org/abs/2001.03903v1 )

ライセンス: Link先を確認

Ryoji Miyazaki, Yuta Kudo, Masayuki Ohzeki, Kazuyuki Tanaka

(参考訳) スピングラスのフラストレーションと相転移点の関係について検討した。この関係は, 相転移点ゼロ温度における格子内のフラストレーション点数の条件として表され, 複数の格子に対して相転移点に非常に近い点を与えることが報告された。関係の証明はないが、いくつかの格子の良好な対応は、相転移における関係の妥当性とフラストレーションの重要な役割を示唆している。さらに, この関係を考察するため, 希釈格子との関係を自然拡張し, 結合拡散二乗格子の有効性を検証した。得られた点が幅広い希釈速度で相転移点と良好に一致していることを確認する。その結果,スピングラスの相転移に対するフラストレーションの重要性について,前回の非希釈格子に対する提案が支持された。

We investigate a possible relation between frustration and phase-transition points in spin glasses. The relation is represented as a condition of the number of frustrated plaquettes in the lattice at phase-transition points at zero temperature and was reported to provide very close points to the phase-transition points for several lattices. Although there has been no proof of the relation, the good correspondence in several lattices suggests the validity of the relation and some important role of frustration in the phase transitions. To examine the relation further, we present a natural extension of the relation to diluted lattices and verify its effectiveness for bond-diluted square lattices. We then confirm that the resulting points are in good agreement with the phase-transition points in a wide range of dilution rate. Our result supports the suggestion from the previous work for non-diluted lattices on the importance of frustration to the phase transition of spin glasses.

翻訳日:2023-01-12 05:06:59 公開日:2020-01-12

# ドープ半導体薄膜における超高速2光子放出

Ultrafast two-photon emission in a doped semiconductor thin film ( http://arxiv.org/abs/2001.03975v1 )

ライセンス: Link先を確認

Futai Hu, Liu Li, Yuan Liu, Yuan Meng, Mali Gong, and Yuanmu Yang

(参考訳) 高次量子遷移として、2光子放出は1光子放出に比べて非常に低い発生率を持つため、禁止された過程であると考えられている。本稿では,超高速2光子発光を可能とし,高閉じ込められた表面プラズモンポラリトンモードを利用した半導体薄膜の発光方式を提案する。表面プラズモンポラリトンモードは、半導体中の2光子放出と同時にスペクトルと空間の重なりを持つように調整される。縮退ドープしたInSbを原材料として, 2光子放出は10ミリ秒からピコ秒までの10桁の速度で加速し, 1光子放出速度を超えることを示す。この結果,超高速光子生成のための半導体プラットフォームが,中赤外波長の波長可変化を実現した。

As a high-order quantum transition, two-photon emission has an extremely low occurrence rate compared to one-photon emission, thus having been considered a forbidden process. Here, we propose a scheme that allows ultrafast two-photon emission, leveraging highly confined surface plasmon polariton modes in a degenerately-doped, light-emitting semiconductor thin film. The surface plasmon polariton modes are tailored to have simultaneous spectral and spatial overlap with the two-photon emission in the semiconductor. Using degenerately-doped InSb as the prototype material, we show that the two-photon emission can be accelerated by 10 orders of magnitude: from tens of milliseconds to picoseconds, surpassing the one-photon emission rate. Our result provides a semiconductor platform for ultrafast single and entangled photon generation, with a tunable emission wavelength in the mid-infrared.

翻訳日:2023-01-12 05:06:01 公開日:2020-01-12

# 非剛性画像登録と剛性画像登録の比較検討

A Comparative Study for Non-rigid Image Registration and Rigid Image Registration ( http://arxiv.org/abs/2001.03831v1 )

ライセンス: Link先を確認

Xiaoran Zhang, Hexiang Dong, Di Gao and Xiao Zhao

(参考訳) 画像登録アルゴリズムは一般に非剛性と剛性という2つのグループに分類できる。近年,深層学習に基づくアルゴリズムでは,非剛性画像登録関数を特徴付けるニューラルネットワークが採用されている。しかし、彼らは常に改善しますか? 本研究では,最先端のDeep-based non-rigid registration approachと厳密な登録アプローチを比較した。データはkaggle dog vs cat competition \url{https://www.kaggle.com/c/dogs-vs-cats/}から生成され、変換、回転、スケーリング、せん断、ピクセルワイズ非剛性変換を含む剛性変換におけるアルゴリズムの性能をテストする。 voxelmorphは、比較のためにhardidsetとnonrigidsetを別々にトレーニングし、登録性能を改善するために元のアーキテクチャにガウスぼけ層を追加する。根平均二乗誤差 (RMSE) と平均絶対誤差 (MAE) の両値における最良の定量値は, SimpleElastix と Voxelmorph による非剛性登録により得られる。視覚評価のための代表サンプルを選択する。

Image registration algorithms can be generally categorized into two groups: non-rigid and rigid. Recently, many deep learning-based algorithms employ a neural net to characterize non-rigid image registration function. However, do they always perform better? In this study, we compare the state-of-art deep learning-based non-rigid registration approach with rigid registration approach. The data is generated from Kaggle Dog vs Cat Competition \url{https://www.kaggle.com/c/dogs-vs-cats/} and we test the algorithms' performance on rigid transformation including translation, rotation, scaling, shearing and pixelwise non-rigid transformation. The Voxelmorph is trained on rigidset and nonrigidset separately for comparison and we also add a gaussian blur layer to its original architecture to improve registration performance. The best quantitative results in both root-mean-square error (RMSE) and mean absolute error (MAE) metrics for rigid registration are produced by SimpleElastix and non-rigid registration by Voxelmorph. We select representative samples for visual assessment.

翻訳日:2023-01-12 04:59:45 公開日:2020-01-12

# スカラー量子化学習による深層最適化多重記述画像符号化

Deep Optimized Multiple Description Image Coding via Scalar Quantization Learning ( http://arxiv.org/abs/2001.03851v1 )

ライセンス: Link先を確認

Lijun Zhao, Huihui Bai, Anhong Wang, Yao Zhao

(参考訳) 本稿では,多重記述(md)圧縮損失を最小化することで最適化した深層多重記述符号化(mdc)フレームワークを提案する。第一に、mdマルチスケール拡張エンコーダネットワークは複数の記述テンソルを生成し、スカラー量子化子によって離散化されるが、これらの量子化テンソルはmdカスケードブロックデコーダネットワークによってデ圧縮される。人工ニューラルネットワークパラメータの総量を大幅に削減するために、これらの2種類のネットワークからなるオートエンコーダネットワークを対称パラメータ共有構造として設計する。第2に、このオートエンコーダネットワークと一対のスカラー量子化器を、エンドツーエンドのセルフ教師方式で同時に学習する。第3に、画像空間分布の変化を考慮すると、各スカラー量子化器は直接量子化ではなく、mdテンソルを生成するために重要指数マップを伴っている。第4に,複数記述の多様化を暗黙的に規則化する多重記述構造類似性距離損失を導入し,MD再構成の他,多様化復号化を明示的に監督する。最後に、我々のMDCフレームワークは、複数の一般的なデータセットでテストした場合、画像符号化効率について、最先端のMDCアプローチよりも優れた性能を示すことを示す。

In this paper, we introduce a deep multiple description coding (MDC) framework optimized by minimizing multiple description (MD) compressive loss. First, MD multi-scale-dilated encoder network generates multiple description tensors, which are discretized by scalar quantizers, while these quantized tensors are decompressed by MD cascaded-ResBlock decoder networks. To greatly reduce the total amount of artificial neural network parameters, an auto-encoder network composed of these two types of network is designed as a symmetrical parameter sharing structure. Second, this autoencoder network and a pair of scalar quantizers are simultaneously learned in an end-to-end self-supervised way. Third, considering the variation in the image spatial distribution, each scalar quantizer is accompanied by an importance-indicator map to generate MD tensors, rather than using direct quantization. Fourth, we introduce the multiple description structural similarity distance loss, which implicitly regularizes the diversified multiple description generations, to explicitly supervise multiple description diversified decoding in addition to MD reconstruction loss. Finally, we demonstrate that our MDC framework performs better than several state-of-the-art MDC approaches regarding image coding efficiency when tested on several commonly available datasets.

翻訳日:2023-01-12 04:59:25 公開日:2020-01-12

# 水頭症に対するロバスト脳磁気共鳴画像分割 : 硬度と軟度の検討

Robust Brain Magnetic Resonance Image Segmentation for Hydrocephalus Patients: Hard and Soft Attention ( http://arxiv.org/abs/2001.03857v1 )

ライセンス: Link先を確認

Xuhua Ren, Jiayu Huo, Kai Xuan, Dongming Wei, Lichi Zhang, Qian Wang

(参考訳) 脳磁気共鳴(MR)の脳卒中患者に対するセグメンテーションは難しい作業であると考えられている。異なる個体の脳解剖学的構造の変化をコード化することは容易ではない。この課題は、特に水頭症患者の画像データを考慮するとさらに難しくなり、しばしば大きな変形があり、通常の被験者とは大きく異なる。本稿では,水頭症mr画像のセグメンテーション問題を解決するための,ハードおよびソフトアテンションモジュールを用いた新しい戦略を提案する。私たちの主な貢献は3倍です。 1) ハードアテンションモジュールは,マルチアトラス法とボクセルモルフツールを用いて粗いセグメンテーションマップを生成し,その後のセグメンテーションプロセスをガイドし,そのロバスト性を向上させる。 2 ソフトアテンションモジュールは、位置注意を取り入れて正確な文脈情報を把握し、セグメンテーション精度をさらに向上させる。 3) 実際の臨床シナリオにおいて脳MRI画像の定量化に不可欠である insula, thalamus, and many other region ofinterests (ROIs) を抽出し, 本法の有効性を検証した。提案手法は,17種類の意識関連ROIを異なる被験者に対して高いバラツキで分割することで,ロバスト性と精度を大幅に向上させる。私たちの知る限りでは、脳卒中患者の脳のセグメンテーション問題を解決するためにディープラーニングを利用した最初の研究である。

Brain magnetic resonance (MR) segmentation for hydrocephalus patients is considered as a challenging work. Encoding the variation of the brain anatomical structures from different individuals cannot be easily achieved. The task becomes even more difficult especially when the image data from hydrocephalus patients are considered, which often have large deformations and differ significantly from the normal subjects. Here, we propose a novel strategy with hard and soft attention modules to solve the segmentation problems for hydrocephalus MR images. Our main contributions are three-fold: 1) the hard-attention module generates coarse segmentation map using multi-atlas-based method and the VoxelMorph tool, which guides subsequent segmentation process and improves its robustness; 2) the soft-attention module incorporates position attention to capture precise context information, which further improves the segmentation accuracy; 3) we validate our method by segmenting insula, thalamus and many other regions-of-interests (ROIs) that are critical to quantify brain MR images of hydrocephalus patients in real clinical scenario. The proposed method achieves much improved robustness and accuracy when segmenting all 17 consciousness-related ROIs with high variations for different subjects. To the best of our knowledge, this is the first work to employ deep learning for solving the brain segmentation problems of hydrocephalus patients.

翻訳日:2023-01-12 04:59:04 公開日:2020-01-12

# ヒューマンロボットインタラクションのための深層学習に基づく感情予測のためのハイパーパラメータ最適化

Hyperparameters optimization for Deep Learning based emotion prediction for Human Robot Interaction ( http://arxiv.org/abs/2001.03855v1 )

ライセンス: Link先を確認

Shruti Jaiswal, and Gora Chand Nandi

(参考訳) ヒューマノイドロボットが私たちの社会空間を共有できるようにするには、音声、ジェスチャー、感情の共有といった複数のモードを使ってロボットと簡単に対話できる技術を開発する必要がある。本研究は,リアルタイムコミュニケーションのために低リソースのソーシャルロボット上でより適応的に計算できる,計算資源の削減とネットワークハイパーパラメータの少なさを必要とする感情認識問題の核となる問題に対処することを目的とした。より具体的には、人間型ロボットをリアルタイムで試した場合に複数のデータセット上で組み合わせてテストした場合、感情分類のための既存のネットワークアーキテクチャよりも最大6%精度が向上したインセプションモジュールベースの畳み込みニューラルネットワークアーキテクチャを提案する。提案モデルでは,トレーニング可能なハイパーパラメータを94%まで削減し,人間のロボットインタラクションなどのリアルタイムアプリケーションで使用できることを明確に示すバニラCNNモデルと比較した。十分に堅牢で精度の高い方法論を検証するために,厳密な実験が実施されている。最後に、モデルを人型ロボットNAOにリアルタイムに実装し、モデルの堅牢性を評価する。

To enable humanoid robots to share our social space we need to develop technology for easy interaction with the robots using multiple modes such as speech, gestures and share our emotions with them. We have targeted this research towards addressing the core issue of emotion recognition problem which would require less computation resources and much lesser number of network hyperparameters which will be more adaptive to be computed on low resourced social robots for real time communication. More specifically, here we have proposed an Inception module based Convolutional Neural Network Architecture which has achieved improved accuracy of upto 6% improvement over the existing network architecture for emotion classification when combinedly tested over multiple datasets when tried over humanoid robots in real - time. Our proposed model is reducing the trainable Hyperparameters to an extent of 94% as compared to vanilla CNN model which clearly indicates that it can be used in real time based application such as human robot interaction. Rigorous experiments have been performed to validate our methodology which is sufficiently robust and could achieve high level of accuracy. Finally, the model is implemented in a humanoid robot, NAO in real time and robustness of the model is evaluated.

翻訳日:2023-01-12 04:58:05 公開日:2020-01-12

# 非有界大域最適化に対する適応拡張ベイズ最適化

Adaptive Expansion Bayesian Optimization for Unbounded Global Optimization ( http://arxiv.org/abs/2001.04815v1 )

ライセンス: Link先を確認

Wei Chen and Mark Fuge

(参考訳) ベイズ最適化は通常、固定変数境界内で実行される。機械学習アルゴリズムのハイパーパラメータチューニングのような場合、変数境界の設定は簡単ではない。任意の固定境界が真の大域的最適性を含むことは保証できない。本稿では,大域的最適化を必ずしも含まない初期探索空間を定義し,必要であれば探索空間を拡大するベイズ最適化手法を提案する。しかし、過剰な爆発は探索空間の膨張の間に起こりうる。拡張空間における探索と利用を適応的にバランスさせることができる。合成試験関数とMLPハイパーパラメータ最適化タスクの結果から,提案手法は現在の最先端手法と同等以上の性能を示した。

Bayesian optimization is normally performed within fixed variable bounds. In cases like hyperparameter tuning for machine learning algorithms, setting the variable bounds is not trivial. It is hard to guarantee that any fixed bounds will include the true global optimum. We propose a Bayesian optimization approach that only needs to specify an initial search space that does not necessarily include the global optimum, and expands the search space when necessary. However, over-exploration may occur during the search space expansion. Our method can adaptively balance exploration and exploitation in an expanding space. Results on a range of synthetic test functions and an MLP hyperparameter optimization task show that the proposed method out-performs or at least as good as the current state-of-the-art methods.

翻訳日:2023-01-12 04:57:45 公開日:2020-01-12

# 離散時間量子ウォークを用いたスペクトル磁化ラチェット

Spectral Magnetization Ratchets with Discrete Time Quantum Walks ( http://arxiv.org/abs/2001.03868v1 )

ライセンス: Link先を確認

A. Mallick, M. V. Fistul, P. Kaczynska, S. Flach

(参考訳) 我々は、周期的離散時間量子ウォーク(DTQW)のスペクトル磁化に対するラチェット効果を予測し、理論的に詳細に研究する。これらの一般化DTQWは、対応するコイン演算子パラメータを離散時間で周期的に変化させることにより達成される。期間はm=1,2,3$である。 m$-周期 dtqws のダイナミクスは、2バンド分散関係 $\omega^{(m)_{\pm}(k)$ によって特徴づけられ、ここで $k$ は波動ベクトルである。我々は、$m$- periodic DTQWsの一般化パリティ対称性を同定する。対称性は、コイン演算子パラメータの適切な選択によって、$m=2,3$で破ることができる。得られた対称性の破れはラチェット効果、すなわち非零のスペクトル磁化 $m_s(\omega)$ の出現をもたらす。このラチェット効果は、周期DTQWの時間依存性相関関数の連続量子測定の枠組みで観察することができる。

We predict and theoretically study in detail the ratchet effect for the spectral magnetization of periodic discrete time quantum walks (DTQWs) --- a repetition of a sequence of $m$ different DTQWs. These generalized DTQWs are achieved by varying the corresponding coin operator parameters periodically with discrete time. We consider periods $m=1,2,3$. The dynamics of $m$-periodic DTQWs is characterized by a two-band dispersion relation $\omega^{(m)}_{\pm}(k)$, where $k$ is the wave vector. We identify a generalized parity symmetry of $m$-periodic DTQWs. The symmetry can be broken for $m=2,3$ by proper choices of the coin operator parameters. The obtained symmetry breaking results in a ratchet effect, i.e. the appearance of a nonzero spectral magnetization $M_s(\omega)$. This ratchet effect can be observed in the framework of continuous quantum measurements of the time-dependent correlation function of periodic DTQWs.

翻訳日:2023-01-12 04:57:21 公開日:2020-01-12

# 連続モデル生成のための同時外挿・補間ネットワーク

Concurrently Extrapolating and Interpolating Networks for Continuous Model Generation ( http://arxiv.org/abs/2001.03847v1 )

ライセンス: Link先を確認

Lijun Zhao, Jinjing Zhang, Fan Zhang, Anhong Wang, Huihui Bai, Yao Zhao

(参考訳) 多くの深層画像平滑化演算子は、異なるパラメータで設定されたアルゴリズムごとに、異なる明示的な構造-テクスチャペアをラベルイメージとして使用する場合、常に繰り返し訓練される。このようなトレーニング戦略は、しばしば長い時間をかけて、機器リソースをコストのかかる方法で消費します。この課題に対処するために、より強力なモデル生成ツールとして、連続ネットワーク補間を一般化し、特定の効果ラベル画像のセットのみを必要とするモデル列を形成するための、単純かつ効果的なモデル生成戦略を提案する。画像平滑化演算子を正確に学習するために、現在のネットワークアーキテクチャの大部分に簡単に挿入できる二重状態集約(DSA)モジュールを提案する。このモジュールに基づき、局所特徴集約ブロックと非局所特徴集約ブロックを備えた二重状態集約ニューラルネットワーク構造を設計し、表現能力の高い演算子を得る。多くの客観的および視覚的実験結果の評価を通じて,提案手法は連続したモデルを生成することができ,画像平滑化のための最先端手法よりも優れた性能が得られることを示す。

Most deep image smoothing operators are always trained repetitively when different explicit structure-texture pairs are employed as label images for each algorithm configured with different parameters. This kind of training strategy often takes a long time and spends equipment resources in a costly manner. To address this challenging issue, we generalize continuous network interpolation as a more powerful model generation tool, and then propose a simple yet effective model generation strategy to form a sequence of models that only requires a set of specific-effect label images. To precisely learn image smoothing operators, we present a double-state aggregation (DSA) module, which can be easily inserted into most of current network architecture. Based on this module, we design a double-state aggregation neural network structure with a local feature aggregation block and a nonlocal feature aggregation block to obtain operators with large expression capacity. Through the evaluation of many objective and visual experimental results, we show that the proposed method is capable of producing a series of continuous models and achieves better performance than that of several state-of-the-art methods for image smoothing.

翻訳日:2023-01-12 04:50:46 公開日:2020-01-12

# 車両再識別のための属性誘導型特徴学習ネットワーク

Attribute-guided Feature Learning Network for Vehicle Re-identification ( http://arxiv.org/abs/2001.03872v1 )

ライセンス: Link先を確認

Huibing Wang, Jinjia Peng, Dongyan Chen, Guangqi Jiang, Tongtong Zhao, Xianping Fu

(参考訳) 自動車再識別(reID)は,近年ホットな話題となっている都市監視ビデオの自動解析において重要な役割を担っている。しかし、これは車両の様々な視点、多彩な照明、複雑な環境によって引き起こされる重大な問題である。現在、ほとんどの既存の車両のreIDアプローチは、より優れた表現を導き出すためにメトリクスやアンサンブルを学習することに焦点を当てている。しかし、詳細な記述を含む車両の特性は、reIDモデルの訓練に有用である。そこで,本稿では,豊富な属性特徴を持つグローバル表現をエンドツーエンドに学習可能な,新しいAttribute-Guided Network(AGNet)を提案する。特に、属性誘導モジュールがagnetで提案され、カテゴリ分類のための識別的特徴の選択を逆ガイドできる属性マスクを生成する。さらに,提案したAGNetでは,属性に基づくラベル平滑化(ALS)の損失がreIDモデルの訓練に有効であることを示す。総合実験の結果, vehicleid データセットと veri-776 データセットの両方において優れた性能が得られた。

Vehicle re-identification (reID) plays an important role in the automatic analysis of the increasing urban surveillance videos, which has become a hot topic in recent years. However, it poses the critical but challenging problem that is caused by various viewpoints of vehicles, diversified illuminations and complicated environments. Till now, most existing vehicle reID approaches focus on learning metrics or ensemble to derive better representation, which are only take identity labels of vehicle into consideration. However, the attributes of vehicle that contain detailed descriptions are beneficial for training reID model. Hence, this paper proposes a novel Attribute-Guided Network (AGNet), which could learn global representation with the abundant attribute features in an end-to-end manner. Specially, an attribute-guided module is proposed in AGNet to generate the attribute mask which could inversely guide to select discriminative features for category classification. Besides that, in our proposed AGNet, an attribute-based label smoothing (ALS) loss is presented to better train the reID model, which can strength the distinct ability of vehicle reID model to regularize AGNet model according to the attributes. Comprehensive experimental results clearly demonstrate that our method achieves excellent performance on both VehicleID dataset and VeRi-776 dataset.

翻訳日:2023-01-12 04:50:02 公開日:2020-01-12

# 視覚的感情分類のためのマルチソースドメイン適応

Multi-source Domain Adaptation for Visual Sentiment Classification ( http://arxiv.org/abs/2001.03886v1 )

ライセンス: Link先を確認

Chuang Lin, Sicheng Zhao, Lei Meng, Tat-Seng Chua

(参考訳) 視覚感情分類に関する既存のドメイン適応法は通常、十分なラベル付きデータのソースドメインから学んだ知識を、ゆるやかにラベル付けされたデータまたはラベル付きデータのターゲットドメインに転送する単一ソースシナリオで検討される。しかし、実際には、単一のソースドメインのデータは通常、限られたボリュームを持ち、ターゲットドメインの特徴をほとんどカバーできない。本稿では,多元感情生成支援ネットワーク(msgan,multi-source sentiment generative adversarial network)と呼ばれる,視覚的感情分類のためのマルチソースドメイン適応(mda)手法を提案する。複数のソースドメインからのデータを扱うために、ソースドメインとターゲットドメインの両方からのデータが同じ分布を共有する、統一された感情潜在空間を見つけることを学ぶ。これは、エンドツーエンドのサイクル一貫した逆学習を通じて達成される。 4つのベンチマークデータセットで実施された大規模な実験により、MSGANは視覚的感情分類のための最先端のMDAアプローチよりも大幅に優れていることが示された。

Existing domain adaptation methods on visual sentiment classification typically are investigated under the single-source scenario, where the knowledge learned from a source domain of sufficient labeled data is transferred to the target domain of loosely labeled or unlabeled data. However, in practice, data from a single source domain usually have a limited volume and can hardly cover the characteristics of the target domain. In this paper, we propose a novel multi-source domain adaptation (MDA) method, termed Multi-source Sentiment Generative Adversarial Network (MSGAN), for visual sentiment classification. To handle data from multiple source domains, it learns to find a unified sentiment latent space where data from both the source and target domains share a similar distribution. This is achieved via cycle consistent adversarial learning in an end-to-end manner. Extensive experiments conducted on four benchmark datasets demonstrate that MSGAN significantly outperforms the state-of-the-art MDA approaches for visual sentiment classification.

翻訳日:2023-01-12 04:49:41 公開日:2020-01-12

# メラノーマセグメンテーションのための適応受容場を持つ補完ネットワーク

Complementary Network with Adaptive Receptive Fields for Melanoma Segmentation ( http://arxiv.org/abs/2001.03893v1 )

ライセンス: Link先を確認

Xiaoqing Guo, Zhen Chen, Yixuan Yuan

(参考訳) 皮膚内視鏡像におけるメラノーマの自動分画は皮膚癌のコンピュータ診断に不可欠である。既存の手法はホールに悩まされ、セグメンテーション性能に制限のある問題を縮小する。そこで本研究では,適応的受容学習を用いた補足ネットワークを提案する。セグメンテーションタスクを独立して行う代わりに, メラノーマ病変を検出するフォアグラウンドネットワークと, 非メラノーマ領域をマスキングするバックグラウンドネットワークを提案する。さらに,アダプティブ・アラス・コンボリューション (AAC) とナレッジ・アグリゲーション・モジュール (KAM) を提案する。 aacは、複数のスケールでの受容野を明示的に制御し、kamは、深い特徴マップに従って調整される適応受容野と拡張された畳み込みによって、浅い特徴マップを畳み込みます。さらに、前景と背景ネットワーク間の依存関係を利用するために、新たな相互損失が提案され、これらの2つのネットワーク内で相互に影響を及ぼすことができる。この相互学習戦略により、半教師付き学習が可能となり、境界感性が向上する。 Skin Imaging Collaboration (ISIC) 2018 skin lesion segmentation dataset を用いて, ディスコ効率86.4%を達成し, 最先端のメラノーマ・セグメンテーション法と比較して優れた性能を示した。

Automatic melanoma segmentation in dermoscopic images is essential in computer-aided diagnosis of skin cancer. Existing methods may suffer from the hole and shrink problems with limited segmentation performance. To tackle these issues, we propose a novel complementary network with adaptive receptive filed learning. Instead of regarding the segmentation task independently, we introduce a foreground network to detect melanoma lesions and a background network to mask non-melanoma regions. Moreover, we propose adaptive atrous convolution (AAC) and knowledge aggregation module (KAM) to fill holes and alleviate the shrink problems. AAC explicitly controls the receptive field at multiple scales and KAM convolves shallow feature maps by dilated convolutions with adaptive receptive fields, which are adjusted according to deep feature maps. In addition, a novel mutual loss is proposed to utilize the dependency between the foreground and background networks, thereby enabling the reciprocally influence within these two networks. Consequently, this mutual training strategy enables the semi-supervised learning and improve the boundary-sensitivity. Training with Skin Imaging Collaboration (ISIC) 2018 skin lesion segmentation dataset, our method achieves a dice co-efficient of 86.4% and shows better performance compared with state-of-the-art melanoma segmentation methods.

翻訳日:2023-01-12 04:49:24 公開日:2020-01-12

# アテンションフロー: エンドツーエンドの関節アテンション推定

Attention Flow: End-to-End Joint Attention Estimation ( http://arxiv.org/abs/2001.03960v1 )

ライセンス: Link先を確認

\"Omer S\"umer, Peter Gerjets, Ulrich Trautwein, Enkelejda Kasneci

(参考訳) 本稿では,ソーシャルシーンビデオにおける共同注意の理解の問題に対処する。共同注意は、対象物または関心領域における2人以上の個人の共通の視線行動であり、人間とコンピュータの相互作用、教育評価、注意障害のある患者の治療など、幅広い応用がある。本手法は,有意な特徴を選定し,共同注意の局所化を向上する2つの新しい畳み込み注意機構を用いて,エンドツーエンドで共同注意を学習する。複雑な社会シーンを含むビデオCoAttデータセットにおいて,サリエンシマップとアテンション機構の効果を比較し,定量的および定性的な結果が共同注意の検出と局所化に与える影響を報告する。

This paper addresses the problem of understanding joint attention in third-person social scene videos. Joint attention is the shared gaze behaviour of two or more individuals on an object or an area of interest and has a wide range of applications such as human-computer interaction, educational assessment, treatment of patients with attention disorders, and many more. Our method, Attention Flow, learns joint attention in an end-to-end fashion by using saliency-augmented attention maps and two novel convolutional attention mechanisms that determine to select relevant features and improve joint attention localization. We compare the effect of saliency maps and attention mechanisms and report quantitative and qualitative results on the detection and localization of joint attention in the VideoCoAtt dataset, which contains complex social scenes.

翻訳日:2023-01-12 04:48:19 公開日:2020-01-12

# 線虫c. elegansの頭と尾の局在

Head and Tail Localization of C. elegans ( http://arxiv.org/abs/2001.03981v1 )

ライセンス: Link先を確認

Mansi Ranjit Mane, Aniket Anand Deshmukh, Adam J. Iliff

(参考訳) C. elegans は神経科学で行動分析によく用いられるが、これは神経系が小さく、接続性も良好であるためである。動物を局在させ、頭と尾を区別することは、行動測定中にワームを追跡する重要なタスクであり、定量分析を行う。画像中のワームの頭部と尾の両方を局在化するためのニューラルネットワークによるアプローチを示す。 C. elegansの行動分析のためのオープンソースの機械学習ベースのソリューションを再現可能な論文で実証的な結果を得るために、コードを公開する。

C. elegans is commonly used in neuroscience for behaviour analysis because of it's compact nervous system with well-described connectivity. Localizing the animal and distinguishing between its head and tail are important tasks to track the worm during behavioural assays and to perform quantitative analyses. We demonstrate a neural network based approach to localize both the head and the tail of the worm in an image. To make empirical results in the paper reproducible and promote open source machine learning based solutions for C. elegans behavioural analysis, we also make our code publicly available.

翻訳日:2023-01-12 04:48:06 公開日:2020-01-12

# マルチモード単一パス時空間スクイーズ

Multimode Single-Pass Spatio-temporal Squeezing ( http://arxiv.org/abs/2001.03972v1 )

ライセンス: Link先を確認

Luca La Volpe, Syamsundar De, Tiphaine Kouadou, Dmitri Horoshko, Mikhail Kolobov, Claude Fabre, Valentina Parigi, Nicolas Treps

(参考訳) 量子情報や量子気象学に応用可能なブロードバンド多重モード励起光の単一パス源を提案する。ソースは、非線形バルク結晶内の非共線形配置におけるタイプiパラメトリックダウンコンバージョン(pdc)プロセスに基づいている。生成したスクイーズ光は、空間的にも時間的にも形成された局所発振器を用いてホモダイン測定により、時空間的多モード挙動を示す。最後に,共分散行列に基づくアプローチにより,複数の独立な時空間モードと空間モード間のスクイーズ分布を明らかにする。これは、ソースのマルチモード機能を明確に検証します。

We present a single-pass source of broadband multimode squeezed light with potential application in quantum information and quantum metrology. The source is based on a type I parametric down-conversion (PDC) process inside a bulk nonlinear crystal in a non-collinear configuration. The generated squeezed light exhibits a spatiotemporal multimode behavior that is probed using a homodyne measurement with a local oscillator shaped both spatially and temporally. Finally we follow a covariance matrix based approach to reveal the distribution of the squeezing among several independent temporal and spatial modes. This unambiguously validates the multimode feature of our source.

翻訳日:2023-01-12 04:47:56 公開日:2020-01-12

# 機械学習を用いたアップリンク無線通信におけるチャネル割り当て

Channel Assignment in Uplink Wireless Communication using Machine Learning Approach ( http://arxiv.org/abs/2001.03952v1 )

ライセンス: Link先を確認

Guangyu Jia and Zhaohui Yang and Hak-Keung Lam and Jianfeng Shi and Mohammad Shikh-Bahaei

(参考訳) この手紙は、アップリンク無線通信システムにおけるチャネル割り当て問題を調査する。我々の目標は、整数チャネル割り当て制約を受ける全ユーザの総和率を最大化することです。凸最適化に基づくアルゴリズムが提供され、各ステップで閉形式解が得られる最適なチャネル割り当てが得られる。凸最適化に基づくアルゴリズムでは計算の複雑さが高いため、機械学習手法を用いて計算効率のよい解を求める。具体的には、凸最適化に基づくアルゴリズムを用いてデータを生成し、元の問題を、畳み込みニューラルネットワーク(CNN)、フィードフォワードニューラルネットワーク(FNN)、ランダムフォレスト(ランダムフォレスト)、ゲートリカレントユニットネットワーク(GRU)の統合によって対処する回帰問題に変換する。その結果,機械学習手法は予測精度をわずかに向上させて計算時間を大幅に短縮することを示した。

This letter investigates a channel assignment problem in uplink wireless communication systems. Our goal is to maximize the sum rate of all users subject to integer channel assignment constraints. A convex optimization based algorithm is provided to obtain the optimal channel assignment, where the closed-form solution is obtained in each step. Due to high computational complexity in the convex optimization based algorithm, machine learning approaches are employed to obtain computational efficient solutions. More specifically, the data are generated by using convex optimization based algorithm and the original problem is converted to a regression problem which is addressed by the integration of convolutional neural networks (CNNs), feed-forward neural networks (FNNs), random forest and gated recurrent unit networks (GRUs). The results demonstrate that the machine learning method largely reduces the computation time with slightly compromising of prediction accuracy.

翻訳日:2023-01-12 04:42:12 公開日:2020-01-12

# Fact Grounding を用いたデータ・テキスト生成における課題の再考

Revisiting Challenges in Data-to-Text Generation with Fact Grounding ( http://arxiv.org/abs/2001.03830v1 )

ライセンス: Link先を確認

Hongmin Wang

(参考訳) データ対テキスト生成モデルは、正しい入力ソースを参照してデータの忠実性を保証するという課題に直面している。この分野の研究を刺激するために、ワイズマンらは、ボックステーブルとラインスコアテーブルからnbaゲームサマリーを生成するために、rotowireコーパスを導入した。しかし、この方向に限定的な試みが行われ、課題は残る。我々は,要約内容の約60%しかボックススコアレコードに接地できないコーパスにおける顕著なボトルネックを観察する。このような情報不足は、条件付き言語モデルが無条件の無作為な事実を生み出すことを誤認し、結果として事実的幻覚を引き起こす傾向がある。本研究では,情報バランスを回復し,実地データ・テキスト生成に重点を置いたタスクを改良する。我々は、2017-19年の50パーセント以上のデータと豊富な入力テーブルを備えた、浄化された大規模データセットであるRotoWire-FG(Fact-Grounding)を導入し、この方向へのさらなる研究の焦点を期待している。さらに,新たなテーブル再構成を補助タスクとして統合することで,最先端モデルに対するデータ忠実度の向上を実現し,生成品質を向上する。

Data-to-text generation models face challenges in ensuring data fidelity by referring to the correct input source. To inspire studies in this area, Wiseman et al. (2017) introduced the RotoWire corpus on generating NBA game summaries from the box- and line-score tables. However, limited attempts have been made in this direction and the challenges remain. We observe a prominent bottleneck in the corpus where only about 60% of the summary contents can be grounded to the boxscore records. Such information deficiency tends to misguide a conditioned language model to produce unconditioned random facts and thus leads to factual hallucinations. In this work, we restore the information balance and revamp this task to focus on fact-grounded data-to-text generation. We introduce a purified and larger-scale dataset, RotoWire-FG (Fact-Grounding), with 50% more data from the year 2017-19 and enriched input tables, hoping to attract more research focuses in this direction. Moreover, we achieve improved data fidelity over the state-of-the-art models by integrating a new form of table reconstruction as an auxiliary task to boost the generation quality.

翻訳日:2023-01-12 04:40:40 公開日:2020-01-12

# ニューラルモデルの一般化を再考する:名前付きエンティティ認識のケーススタディ

Rethinking Generalization of Neural Models: A Named Entity Recognition Case Study ( http://arxiv.org/abs/2001.03844v1 )

ライセンス: Link先を確認

Jinlan Fu, Pengfei Liu, Qi Zhang, Xuanjing Huang

(参考訳) ニューラルネットワークベースのモデルは、多くのNLPタスクにおいて印象的なパフォーマンスを達成したが、異なるモデルの一般化動作は、まだ理解されていない。本稿では,既存のモデルの一般化挙動を異なる視点から分析し,その一般化能力の相違を,提案手法のレンズを通して特徴付けるためのテストベッドとしてnerタスクを取り入れた。詳細な分析による実験では、既存のニューラルネットワークnerモデルのボトルネックを、ブレークダウンパフォーマンス分析、アノテーションエラー、データセットバイアス、および改善の方向を示すカテゴリ関係の観点から診断する。我々は、将来の研究のためのデータセット(reconll, ploner)をプロジェクトページでリリースした。本論文の副産物として,最近のNER論文を包括的に要約したプロジェクトをオープンソースとして公開し,さまざまな研究トピックに分類した。

While neural network-based models have achieved impressive performance on a large body of NLP tasks, the generalization behavior of different models remains poorly understood: Does this excellent performance imply a perfect generalization model, or are there still some limitations? In this paper, we take the NER task as a testbed to analyze the generalization behavior of existing models from different perspectives and characterize the differences of their generalization abilities through the lens of our proposed measures, which guides us to better design models and training methods. Experiments with in-depth analyses diagnose the bottleneck of existing neural NER models in terms of breakdown performance analysis, annotation errors, dataset bias, and category relationships, which suggest directions for improvement. We have released the datasets: (ReCoNLL, PLONER) for the future research at our project page: http://pfliu.com/InterpretNER/. As a by-product of this paper, we have open-sourced a project that involves a comprehensive summary of recent NER papers and classifies them into different research topics: https://github.com/pfliu-nlp/Named-Entity-Recognition-NER-Papers.

翻訳日:2023-01-12 04:40:19 公開日:2020-01-12

# 新しい単語の意味の検出:スペイン語における単語埋め込みモデルの比較

Detecting New Word Meanings: A Comparison of Word Embedding Models in Spanish ( http://arxiv.org/abs/2001.05285v1 )

ライセンス: Link先を確認

Andr\'es Torres-Rivera and Juan-Manuel Torres-Moreno

(参考訳) 意味ネオロジズム(sn)は、形態を維持しながら新しい単語の意味を取得する単語として定義される。この種のネオロジズムの性質を考えると、これらの新しい単語の意味を識別するタスクは、現在、neologyのオブザーバリーの専門家によって手作業で行われている。 SNを半自動で検出するために,トピックモデリング,キーワード抽出,単語感覚の曖昧さといった手法を組み合わせたシステムを開発した。トピックモデリングの役割は、入力テキストで扱われるテーマを検出することである。例えば、バイラルはコンピュータサイエンス(CS)の文脈で1つの意味を持ち、健康について話すときにもう1つの意味を持っている。キーワードを抽出するために,posタグフィルタリング付きtextrankを用いた。この方法では、既にスペイン語のレキシコンの一部である関連語を得ることができる。ディープラーニングモデルを使用して、与えられたキーワードに新しい意味があるかどうかを判断します。すべての既知の意味(あるいはトピック)とは異なる埋め込みは、単語が有効なsn候補であることを示している。本研究では,Word2Vec,Sense2Vec,FastTextという単語埋め込みモデルについて検討した。モデルは、スペイン語のwikipediaをコーパスとして、同等のパラメータでトレーニングされた。次に、各モデルが生成する異なる埋め込みを示すために、単語のリストとその一致(ネオロジズムのデータベースから得られた)を使用しました。最後に、これらの結果と各単語の一致を比較して、ある単語がSNの有効な候補であるかどうかを判断する方法を示す。

Semantic neologisms (SN) are defined as words that acquire a new word meaning while maintaining their form. Given the nature of this kind of neologisms, the task of identifying these new word meanings is currently performed manually by specialists at observatories of neology. To detect SN in a semi-automatic way, we developed a system that implements a combination of the following strategies: topic modeling, keyword extraction, and word sense disambiguation. The role of topic modeling is to detect the themes that are treated in the input text. Themes within a text give clues about the particular meaning of the words that are used, for example: viral has one meaning in the context of computer science (CS) and another when talking about health. To extract keywords, we used TextRank with POS tag filtering. With this method, we can obtain relevant words that are already part of the Spanish lexicon. We use a deep learning model to determine if a given keyword could have a new meaning. Embeddings that are different from all the known meanings (or topics) indicate that a word might be a valid SN candidate. In this study, we examine the following word embedding models: Word2Vec, Sense2Vec, and FastText. The models were trained with equivalent parameters using Wikipedia in Spanish as corpora. Then we used a list of words and their concordances (obtained from our database of neologisms) to show the different embeddings that each model yields. Finally, we present a comparison of these outcomes with the concordances of each word to show how we can determine if a word could be a valid candidate for SN.

翻訳日:2023-01-12 04:39:55 公開日:2020-01-12

# ニューラルネットワークを用いたウルドゥー英語機械音訳

Urdu-English Machine Transliteration using Neural Networks ( http://arxiv.org/abs/2001.05296v1 )

ライセンス: Link先を確認

Usman Mohy ud Din

(参考訳) 近年は機械翻訳が注目されている。これは、ある言語から他の言語へのテキストの翻訳に焦点を当てた、計算言語学のサブフィールドである。さまざまな翻訳技術の中で、現在ニューラルネットワークは、注意のメカニズム、シーケンスツーシーケンス、長期のモデリングを備えた単一の大きなニューラルネットワークを提供することで、ドメインをリードしている。機械翻訳分野の著しい進歩にもかかわらず、専門用語を含む語彙外語(oov)の翻訳、名前付き文字を含む外国語は、現在の最先端の翻訳システムにとって依然として課題であり、低資源言語や異なる構造を持つ言語間の翻訳において、状況はさらに悪化する。言語の形態的豊かさのため、単語は異なる文脈で異なる髄を持つことがある。このようなシナリオでは、単語の翻訳は正しい/品質の翻訳を提供するのに十分ではない。翻訳は、翻訳中の単語/文の文脈を考える方法である。 urduのような低リソース言語の場合、システムのトレーニングに十分な大きさの並列コーパスを持つ/探すのは非常に困難である。本研究では,教師なし言語に依存しない予測最大化(EM)に基づく翻訳手法を提案する。システムは並列コーパスからパターンと外語彙(OOV)の単語を学習し、文字コーパスで明示的にトレーニングする必要はない。このアプローチは、フレーズベース、階層的フレーズベースおよび因子ベースモデルとLSTMとトランスフォーマーモデルを含む2つのニューラルマシン翻訳モデルを含む統計機械翻訳(SMT)の3つのモデルで検証される。

Machine translation has gained much attention in recent years. It is a sub-field of computational linguistic which focus on translating text from one language to other language. Among different translation techniques, neural network currently leading the domain with its capabilities of providing a single large neural network with attention mechanism, sequence-to-sequence and long-short term modelling. Despite significant progress in domain of machine translation, translation of out-of-vocabulary words(OOV) which include technical terms, named-entities, foreign words are still a challenge for current state-of-art translation systems, and this situation becomes even worse while translating between low resource languages or languages having different structures. Due to morphological richness of a language, a word may have different meninges in different context. In such scenarios, translation of word is not only enough in order provide the correct/quality translation. Transliteration is a way to consider the context of word/sentence during translation. For low resource language like Urdu, it is very difficult to have/find parallel corpus for transliteration which is large enough to train the system. In this work, we presented transliteration technique based on Expectation Maximization (EM) which is un-supervised and language independent. Systems learns the pattern and out-of-vocabulary (OOV) words from parallel corpus and there is no need to train it on transliteration corpus explicitly. This approach is tested on three models of statistical machine translation (SMT) which include phrasebased, hierarchical phrase-based and factor based models and two models of neural machine translation which include LSTM and transformer model.

翻訳日:2023-01-12 04:39:31 公開日:2020-01-12

# 商品画像分類のためのトリックの袋

Bag of Tricks for Retail Product Image Classification ( http://arxiv.org/abs/2001.03992v1 )

ライセンス: Link先を確認

Muktabh Mayank Srivastava

(参考訳) 小売商品画像分類は、セルフチェックアウトストアや自動小売実行評価のような現実のシステムを構築する上で重要なコンピュータビジョンと機械学習の問題である。本研究では,各種小売商品画像分類データセットの深層学習モデルの精度を高めるための様々な手法を提案する。これらの手法により、小売商品画像分類のための微調整コンブネットの精度を大きなマージンで向上させることができる。最も顕著なトリックとして、複数のデータセットで一貫したゲインを提供する、Local-Concepts-Accumulation (LCA)層と呼ばれる新しいニューラルネットワーク層を導入する。小売製品識別の精度を高めるための他の2つのトリックは、instagramでトレーニング済みのconvnetを使用して、最大エントロピーを分類の補助損失として使用することです。

Retail Product Image Classification is an important Computer Vision and Machine Learning problem for building real world systems like self-checkout stores and automated retail execution evaluation. In this work, we present various tricks to increase accuracy of Deep Learning models on different types of retail product image classification datasets. These tricks enable us to increase the accuracy of fine tuned convnets for retail product image classification by a large margin. As the most prominent trick, we introduce a new neural network layer called Local-Concepts-Accumulation (LCA) layer which gives consistent gains across multiple datasets. Two other tricks we find to increase accuracy on retail product identification are using an instagram-pretrained Convnet and using Maximum Entropy as an auxiliary loss for classification.

翻訳日:2023-01-12 04:33:28 公開日:2020-01-12

# ガウス過程を用いた特徴量に基づく非剛性画像登録の検討

An Investigation of Feature-based Nonrigid Image Registration using Gaussian Process ( http://arxiv.org/abs/2001.05862v1 )

ライセンス: Link先を確認

Siming Bayer, Ute Spiske, Jie Luo, Tobias Geimer, William M. Wells III, Martin Ostermeier, Rebecca Fahrig, Arya Nabavi, Christoph Bert, Ilker Eyupoglo, and Andreas Maier

(参考訳) 適応的治療計画や術中画像更新のような幅広い臨床応用において,fdr(feature-based deformable registration)アプローチは単純さと計算複雑性の低さから広く採用されている。 fdrアルゴリズムは、選択された特徴間の確立された対応によって与えられるスパースフィールドを補間することにより、密度の高い変位場を推定する。本稿では, 変形場をガウス過程 (GP) とみなす一方, 選択した特徴を有効変形の先行情報とみなす。 gpを用いて, 高密度変位場と対応する不確かさマップの両方を同時に推定することができる。さらに,合成,ファントム,臨床データを用いた2乗指数カーネルの異なるハイパーパラメータ設定の性能評価を行った。定量的比較の結果,GP-based interpolation は最先端のB-spline interpolation と同等の性能を示した。 gpに基づく補間の最大の臨床的利点は、計算された濃密な変位マップの数学的不確かさの信頼できる推定を与えることである。

For a wide range of clinical applications, such as adaptive treatment planning or intraoperative image update, feature-based deformable registration (FDR) approaches are widely employed because of their simplicity and low computational complexity. FDR algorithms estimate a dense displacement field by interpolating a sparse field, which is given by the established correspondence between selected features. In this paper, we consider the deformation field as a Gaussian Process (GP), whereas the selected features are regarded as prior information on the valid deformations. Using GP, we are able to estimate the both dense displacement field and a corresponding uncertainty map at once. Furthermore, we evaluated the performance of different hyperparameter settings for squared exponential kernels with synthetic, phantom and clinical data respectively. The quantitative comparison shows, GP-based interpolation has performance on par with state-of-the-art B-spline interpolation. The greatest clinical benefit of GP-based interpolation is that it gives a reliable estimate of the mathematical uncertainty of the calculated dense displacement map.

翻訳日:2023-01-12 04:33:15 公開日:2020-01-12

# 依存情報を用いた確率的自然言語生成

Stochastic Natural Language Generation Using Dependency Information ( http://arxiv.org/abs/2001.03897v1 )

ライセンス: Link先を確認

Elham Seifossadat and Hossein Sameti

(参考訳) 本稿では,自然言語テキスト生成のための確率コーパスモデルを提案する。提案モデルでは,まず,特徴集合を通じてトレーニングデータから依存関係関係を符号化し,次にそれらの特徴を結合して,与えられた意味表現のための新しい依存性木を生成し,最終的に生成した依存性木から自然言語の発話を生成する。我々は、表式、対話法、rdfフォーマットの9つのドメインでモデルをテストする。また,対話行動,E2E,WebNLGデータセットを用いたBLEUおよびERR評価指標を用いて学習したニューラルネットワークに基づくアプローチと同等の結果が得られる。また,人間評価結果を報告することにより,情報性や自然性,品質の面から高品質な発話を生成できることを示した。

This article presents a stochastic corpus-based model for generating natural language text. Our model first encodes dependency relations from training data through a feature set, then concatenates these features to produce a new dependency tree for a given meaning representation, and finally generates a natural language utterance from the produced dependency tree. We test our model on nine domains from tabular, dialogue act and RDF format. Our model outperforms the corpus-based state-of-the-art methods trained on tabular datasets and also achieves comparable results with neural network-based approaches trained on dialogue act, E2E and WebNLG datasets for BLEU and ERR evaluation metrics. Also, by reporting Human Evaluation results, we show that our model produces high-quality utterances in aspects of informativeness and naturalness as well as quality.

翻訳日:2023-01-12 04:32:58 公開日:2020-01-12

# 疎フィードバックを伴う複雑な操作課題に対する深層強化学習

Deep Reinforcement Learning for Complex Manipulation Tasks with Sparse Feedback ( http://arxiv.org/abs/2001.03877v1 )

ライセンス: Link先を確認

Binyamin Manela

(参考訳) 疎いフィードバックから最適なポリシーを学ぶことは、強化学習における既知の課題である。 Hindsight Experience Replay (HER) は、そのような課題を解決するためのマルチゴール強化学習アルゴリズムである。このアルゴリズムは、全ての失敗をエピソードで達成された代替(仮想)目標の成功として扱い、その仮想目標から実際の目標へと一般化する。 HERには既知の欠陥があり、比較的単純なタスクに限定されている。本論文では,既存のherアルゴリズムに基づく3つのアルゴリズムを提案する。まず、エージェントがより価値のある情報を学ぶ仮想目標を優先します。この性質を仮想ゴールの「textit{instructiveness}」と呼び、エージェントが仮想ゴールから実際のゴールへの一般化をいかにうまく行うかを表すヒューリスティックな尺度で定義する。第二に,学習過程全体を通してバイアスを生じさせるような誤解を招くサンプルを検出し,除去するフィルタリングプロセスを設計した。最後に、HERと組み合わせたカリキュラム学習の形式を用いて、複雑でシーケンシャルなタスクの学習を可能にする。このアルゴリズムを \textit{curriculum her} と呼ぶ。アルゴリズムをテストするため、3つの難解な操作環境を構築しました。それぞれの環境は複雑度が3つある。実験の結果,herアルゴリズムと比較した場合,最終的な成功率とサンプル効率は大幅に向上した。

Learning optimal policies from sparse feedback is a known challenge in reinforcement learning. Hindsight Experience Replay (HER) is a multi-goal reinforcement learning algorithm that comes to solve such tasks. The algorithm treats every failure as a success for an alternative (virtual) goal that has been achieved in the episode and then generalizes from that virtual goal to real goals. HER has known flaws and is limited to relatively simple tasks. In this thesis, we present three algorithms based on the existing HER algorithm that improves its performances. First, we prioritize virtual goals from which the agent will learn more valuable information. We call this property the \textit{instructiveness} of the virtual goal and define it by a heuristic measure, which expresses how well the agent will be able to generalize from that virtual goal to actual goals. Secondly, we designed a filtering process that detects and removes misleading samples that may induce bias throughout the learning process. Lastly, we enable the learning of complex, sequential, tasks using a form of curriculum learning combined with HER. We call this algorithm \textit{Curriculum HER}. To test our algorithms, we built three challenging manipulation environments with sparse reward functions. Each environment has three levels of complexity. Our empirical results show vast improvement in the final success rate and sample efficiency when compared to the original HER algorithm.

翻訳日:2023-01-12 04:32:44 公開日:2020-01-12

# Fastは無料より優れている - 敵のトレーニングを再考する

Fast is better than free: Revisiting adversarial training ( http://arxiv.org/abs/2001.03994v1 )

ライセンス: Link先を確認

Eric Wong, Leslie Rice, J. Zico Kolter

(参考訳) 強固なディープネットワークを学習する手法であるadversarial trainingは、通常、プロジェクテッド・グラデーション・フォーマル(pgd)のような一階法で敵の例を構築する必要があるため、従来のトレーニングよりも高価であると考えられている。本稿では, 従来非効率と思われていたアプローチである, より弱く安価な逆境を用いて, 実験的に堅牢なモデルを訓練できるという驚くべき発見を行ない, 実際に行う訓練よりもコストがかからない手法を提案する。具体的には,ファストグレードサイン法(fast gradient sign method, fgsm)とランダム初期化を組み合わせることで,pgdベースのトレーニングと同等の効果を示すが,コストは極めて低い。さらに, ディープネットワークの効率的なトレーニングのための標準技術を用いて, 45%のロバストなcifar10分類器を6分で学習できること, 43%のロバストなイメージネット分類器を12時間以内に2/255$で学習できること, 10時間から50時間かけて同じしきい値に到達した"free"アドバーサリートレーニングに基づく過去の作業と比較して, fgsmの敵意訓練をさらに促進できることを示した。最後に,FGSM逆行訓練の失敗の原因となった「破滅的オーバーフィッティング(catastrophic overfitting)」と呼ばれる障害モードを同定した。この論文で実験を再現するためのコードはすべてhttps://github.com/locuslab/fast_adversarial.comにある。

Adversarial training, a method for learning robust deep networks, is typically assumed to be more expensive than traditional training due to the necessity of constructing adversarial examples via a first-order method like projected gradient decent (PGD). In this paper, we make the surprising discovery that it is possible to train empirically robust models using a much weaker and cheaper adversary, an approach that was previously believed to be ineffective, rendering the method no more costly than standard training in practice. Specifically, we show that adversarial training with the fast gradient sign method (FGSM), when combined with random initialization, is as effective as PGD-based training but has significantly lower cost. Furthermore we show that FGSM adversarial training can be further accelerated by using standard techniques for efficient training of deep networks, allowing us to learn a robust CIFAR10 classifier with 45% robust accuracy to PGD attacks with $\epsilon=8/255$ in 6 minutes, and a robust ImageNet classifier with 43% robust accuracy at $\epsilon=2/255$ in 12 hours, in comparison to past work based on "free" adversarial training which took 10 and 50 hours to reach the same respective thresholds. Finally, we identify a failure mode referred to as "catastrophic overfitting" which may have caused previous attempts to use FGSM adversarial training to fail. All code for reproducing the experiments in this paper as well as pretrained model weights are at https://github.com/locuslab/fast_adversarial.

翻訳日:2023-01-12 04:31:21 公開日:2020-01-12

# テキスト分類のためのテンソルグラフ畳み込みネットワーク

Tensor Graph Convolutional Networks for Text Classification ( http://arxiv.org/abs/2001.05313v1 )

ライセンス: Link先を確認

Xien Liu, Xinxin You, Xiao Zhang, Ji Wu and Ping Lv

(参考訳) 逐次学習モデルと比較して、グラフベースのニューラルネットワークは、グローバル情報をキャプチャする能力など、優れた特性を示している。本稿では,テキスト分類問題に対するグラフベースニューラルネットワークについて検討する。この課題に対して、新しいフレームワークTensorGCN(テンソルグラフ畳み込みネットワーク)が提案されている。テキストグラフテンソルは、まずセマンティック、構文、シーケンシャルな文脈情報を記述するために構築される。そして、テキストグラフテンソル上で2種類の伝播学習を行う。 1つ目は、単一のグラフ内の近傍ノードからの情報を集約するために使用されるグラフ内伝搬である。 2つ目はグラフ間の異種情報の調和に使用されるグラフ間伝播である。ベンチマークデータセットを用いて大規模な実験を行い,提案手法の有効性を示した。提案するTensorGCNは,異なる種類のグラフからの異種情報の調和と統合に有効な方法である。

Compared to sequential learning models, graph-based neural networks exhibit some excellent properties, such as ability capturing global information. In this paper, we investigate graph-based neural networks for text classification problem. A new framework TensorGCN (tensor graph convolutional networks), is presented for this task. A text graph tensor is firstly constructed to describe semantic, syntactic, and sequential contextual information. Then, two kinds of propagation learning perform on the text graph tensor. The first is intra-graph propagation used for aggregating information from neighborhood nodes in a single graph. The second is inter-graph propagation used for harmonizing heterogeneous information between graphs. Extensive experiments are conducted on benchmark datasets, and the results illustrate the effectiveness of our proposed framework. Our proposed TensorGCN presents an effective way to harmonize and integrate heterogeneous information from different kinds of graphs.

翻訳日:2023-01-12 04:30:28 公開日:2020-01-12

PDF登録状況（公開日: 20200112）