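Fugu-MT: arXiv paper translations

This site provides Japanese translations of arXiv papers that are 30 pages or fewer and released under a Creative Commons license (CC 0, CC BY, CC BY-SA). For papers whose body text is not under a CC license, or that are too long, only the metadata is translated. (arXiv metadata is CC 0.) The translations are licensed under CC BY-SA 4.0. Translation is done with Fugu-Machine Translator.

The operator of this site does not guarantee the quality of this site (including all information and translations) and accepts no responsibility for any consequences arising from the use of this site (including all information and translations).

The papers listed here have a publication date of 20220806.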

Title	Authors	Abstract	Publication date / Translation date
# 球面と双曲面上で定義された量子発振子のエネルギー固有値の代数的導出 Algebraic derivation of the Energy Eigenvalues for the quantum oscillator defined on the Sphere and the Hyperbolic plane ( http://arxiv.org/abs/2103.02518v2 ) ライセンス: Link先を確認	Atulit Srivastava and Sanjeev Kant Soni	(参考訳) 量子調和振動子のエネルギーの固有値の代数的導出を定数曲率の表面、すなわち球面上または双曲平面上で与える。 2次元(2次元)二次超可積分系のエネルギー固有値の固定には、ダスカロイアニスが提案した方法を用いて、動積分作用素の多項式代数の有限次元表現の存在によって決定されると仮定する。表現を実現するツールは変形パラフェルミオン発振器である。エネルギーの固有値は計算され、我々の導出した結果は古典的解析法で計算された既知のエネルギー固有値と代数的に一致する。この記事の主な成果であるこの主張は、詳細なプレゼンテーションによって実証される。また,球面および双曲面上のエネルギースペクトルの質的差異についても考察した。 We give an algebraic derivation of the eigenvalues of energy of a quantum harmonic oscillator on the surface of constant curvature, i.e. on the sphere or on the hyperbolic plane. We use the method proposed by Daskaloyannis for fixing the energy eigenvalues of two-dimensional (2D) quadratically superintegrable systems by assuming that they are determined by the existence of finite-dimensional representation of the polynomial algebra of the motion integral operators. The tool for realizing representations is the deformed parafermionic oscillator. The eigenvalues of energy are calculated and the result derived by us algebraically agrees with the known energy eigenvalues calculated by classical analytical methods. This assertion which is the main result of this article is demonstrated by a detailed presentation. We also discuss the qualitative difference of the energy spectra on the sphere and on the hyperbolic plane.	翻訳日:2023-04-09 08:03:06 公開日:2022-08-06
# キラル超流体中の幾何学的誘導 Geometric Induction in Chiral Superfluids ( http://arxiv.org/abs/2112.04528v3 ) ライセンス: Link先を確認	Qing-Dong Jiang and A. Balatsky	(参考訳) 曲面を被覆したキラル超流動薄膜の特性について検討する。秩序パラメータのベクトルの性質から、幾何学的ゲージ場が出現し、異常渦-幾何学的相互作用や曲率誘起質量/スピン超電流など多くの観測可能な効果をもたらす。我々は、この理論をカイラル超流動$\rm ^3 He$のよく知られた相に適用し、実験的に観測可能なシグネチャを導出する。さらに, 軟質表面がキラル超流動からひずみを補うために適応できるフレキシブルジオメトリの場合についても検討した。幾何学とキラル超流動秩序の相互作用が提案され、量子状態の制御と歪の操作に魅力的な道が開かれた。 We explore the properties of chiral superfluid thin films coating a curved surface. Due to the vector nature of the order parameter, a geometric gauge field emerges and leads to a number of observable effects such as anomalous vortex-geometric interaction and curvature-induced mass/spin supercurrents. We apply our theory to several well-known phases of chiral superfluid $\rm ^3 He$ and derive experimentally observable signatures. We further discuss the cases of flexible geometries where a soft surface can adapt itself to compensate for the strain from the chiral superfluid. The proposed interplay between geometry and chiral superfluid order provides a fascinating avenue to control and manipulate quantum states with strain.	翻訳日:2023-03-05 02:55:35 公開日:2022-08-06
# 容量結合型数電子一重項量子ビットに対するロバストエンタングゲート Robust entangling gate for capacitively coupled few-electron singlet-triplet qubits ( http://arxiv.org/abs/2201.01583v2 ) ライセンス: Link先を確認	Guo Xuan Chan, Xin Wang	(参考訳) 量子制御がノイズに影響を受けない量子ビットパラメータの軌跡であるスイートスポットの探索は、高忠実度量子ゲートを達成する鍵となる。従来の二重量子ドット一重項量子ビットにおいて、各点が1つの電子(「2電子一重項量子ビット」)をホストするようなスイートスポットを探す努力は、特に2量子ビット演算では失敗に終わった。ここでは、2つの量子ドット("four-electron singlet-triplet qubit")に合計4つの電子を含む、各ドットが1つ以上の電子をホストできる一重項量子ビットを考える。構成-相互作用計算を用いて,この結合量子ビット系にスイートスポットが現れることを理論的に証明した。さらに,現実の電荷ノイズや超微細ノイズ下では,提案するスイートスポットでの2量子ビット動作は,従来の2電子シングルト三重項量子ビットシステム($\sim90\%$)よりも高いゲートフィディティ($\sim99\%$)を提供できることを示した。我々は,シングルトリップキュービット系における高忠実度2ビットゲートの実現を容易にする。 The search of a sweet spot, locus in qubit parameters where quantum control is first-order insensitive to noises, is key to achieve high-fidelity quantum gates. Efforts to search for such a sweet spot in conventional double-quantum-dot singlet-triplet qubits where each dot hosts one electron ("two-electron singlet-triplet qubit"), especially for two-qubit operations, have been unsuccessful. Here we consider singlet-triplet qubits allowing each dot to host more than one electron, with a total of four electrons in the double quantum dots ("four-electron singlet-triplet qubit"). We theoretically demonstrate, using configuration-interaction calculations, that sweet spots appear in this coupled qubit system. We further demonstrate that, under realistic charge noise and hyperfine noise, two-qubit operation at the proposed sweet spot could offer gate fidelities ($\sim99\%$) that are higher than conventional two-electron singlet-triplet qubit system ($\sim90\%$). Our results should facilitate realization of high-fidelity two-qubit gates in singlet-triplet qubit systems.	翻訳日:2023-03-02 05:43:15 公開日:2022-08-06
# 変分量子パルス学習 Variational Quantum Pulse Learning ( http://arxiv.org/abs/2203.17267v3 ) ライセンス: Link先を確認	Zhiding Liang, Hanrui Wang, Jinglei Cheng, Yongshan Ding, Hang Ren, Zhengqi Gao, Zhirui Hu, Duane S. Boning, Xuehai Qian, Song Han, Weiwen Jiang, Yiyu Shi	(参考訳) 量子コンピューティングは、古典的ハードウェア上で計算的に難解な問題を解く最も有望な新興技術の一つである。既存の多くの研究は、変分量子回路(VQC)のような機械学習タスクのゲートレベルにおける変分量子アルゴリズムの使用に焦点を当てている。しかし、vqcは1つの回転ゲートで1つのパラメータしか訓練できないなど、パラメータの数が少ないため、柔軟性と表現性に制限がある。一方、量子パルスは量子コンピューティングのスタックの量子ゲートよりも小さく、制御パラメータがより大きいことが観察された。本稿では、vqcの有望な性能に触発されて、学習タスクで直接量子パルスを訓練する新しいパラダイムである変分量子パルス(vqp)を提案する。提案手法は,最適化フレームワークにおいてパルスの振幅を引いたり押したりすることで,変動量子パルスを操作する。可変量子アルゴリズムと同様に、パルスをトレーニングするためのフレームワークはノイズの中間スケール量子(nisq)コンピュータの雑音に対するロバスト性を維持する。二値分類の例では、VQP学習は(実機からのノイズモデルを持つ)カイスキットノイズシミュレータとibmq-jarkataのVQC学習と比較して最大11%と9%高い精度を達成し、その効果と実現可能性を示した。 VQPが信頼性の高い結果を得るための安定性も、ノイズの存在下で検証されている。 Quantum computing is among the most promising emerging techniques to solve problems that are computationally intractable on classical hardware. A large body of existing works focus on using variational quantum algorithms on the gate level for machine learning tasks, such as the variational quantum circuit (VQC). However, VQC has limited flexibility and expressibility due to limited number of parameters, e.g. only one parameter can be trained in one rotation gate. On the other hand, we observe that quantum pulses are lower than quantum gates in the stack of quantum computing and offers more control parameters. Inspired by the promising performance of VQC, in this paper we propose variational quantum pulses (VQP), a novel paradigm to directly train quantum pulses for learning tasks. The proposed method manipulates variational quantum pulses by pulling and pushing the amplitudes of pulses in an optimization framework. Similar to variational quantum algorithms, our framework to train pulses maintains the robustness to noise on Noisy Intermediate-Scale Quantum (NISQ) computers. 
In an example task of binary classification, VQP learning achieves up to 11% and 9% higher accuracy compared with VQC learning on the Qiskit noise simulators (with a noise model from a real machine) and ibmq_jakarta, respectively, demonstrating its effectiveness and feasibility. The stability of VQP in obtaining reliable results has also been verified in the presence of noise.	翻訳日:2023-02-20 04:43:10 公開日:2022-08-06
# 敵対的サプライチェーン攻撃の防止または緩和 : 法的分析 Preventing or Mitigating Adversarial Supply Chain Attacks; a legal analysis ( http://arxiv.org/abs/2208.03466v1 ) ライセンス: Link先を確認	Kaspar Rosager Ludvigsen, Shishir Nagaraja, Angela Daly	(参考訳) 現在、世界はインターネット全体を通じて強く結びついており、食品からインフラ、テクノロジーまで、あらゆるものを提供する非常にサプライチェーンも持っている。サプライチェーンは、デジタルと物理的の両方の意味で、敵の攻撃に対して脆弱であり、それらを破壊または破滅させる可能性がある。本稿では、このような攻撃が成功した事例を2つ検討し、その成果が今後どうなるのかを考察し、EUと国家法がこれらの攻撃を防ぎ得るか、あるいはあらゆるコストでそれらを緩和しようとしない企業を罰するかを分析する。現行の国家規制は技術面では不十分であり、サプライチェーンの攻撃を防ぐ上で最大の役割を果たせる適切な当事者を強制または強制することは不可能である。しかし、現在のeuの法律は正しい道筋をたどっており、サイバーセキュリティに関して国家法が適切な規制を怠っているため、こうした大きな脅威を考えるためにはさらなる警戒が必要かもしれない。 The world is currently strongly connected through both the internet at large, but also the very supply chains which provide everything from food to infrastructure and technology. The supply chains are themselves vulnerable to adversarial attacks, both in a digital and physical sense, which can disrupt or at worst destroy them. In this paper, we take a look at two examples of such successful attacks and consider what their consequences may be going forward, and analyse how EU and national law can prevent these attacks or otherwise punish companies which do not try to mitigate them at all possible costs. We find that the current types of national regulation are not technology specific enough, and cannot force or otherwise mandate the correct parties who could play the biggest role in preventing supply chain attacks to do everything in their power to mitigate them. But, current EU law is on the right path, and further vigilance may be what is necessary to consider these large threats, as national law tends to fail at properly regulating companies when it comes to cybersecurity.	翻訳日:2023-02-19 10:22:04 公開日:2022-08-06
# 公衆衛生におけるデータサイエンス : 次世代の能力の構築 Data science in public health: building next generation capacity ( http://arxiv.org/abs/2208.03461v1 ) ライセンス: Link先を確認	Nicholas Mirin, Heather Mattie, Latifa Jackson, Zainab Samad, Rumi Chunara	(参考訳) 急速に進化する技術、データ、分析的な風景は多くの分野や職業に浸透している。公衆衛生において、データリテラシーを含むデータサイエンススキルの必要性は、既存の公衆衛生研究と介入慣行のギャップを埋めるための新しいデータタイプと分析手法の可能性と、そのようなデータや方法が健康格差を持続または拡大する可能性の両方から特に顕著である。米国トップ10および世界の公衆衛生学校における公衆衛生コースとプログラムのレビューを通じて、本稿は、公衆衛生データ科学における既存の教育活動を要約する。これらの既存の慣習は、これらのカリキュラムをさらに多くの学校や人口に広げる努力に役立ちます。データサイエンス倫理コースのオファリングは、人口の健康原則が、従来の公衆衛生カリキュラムのコアを拡大するために、データに関わるレベルのトレーニングにどのようにブレンドできるかを評価する文脈でも検討されている。また、国内外の「教室外」研修プログラムからの並列的な知見を合成し、公衆衛生データ科学の多様性を高めるためのアプローチを推し進める。これらのプログラムのレビューとそれらの合成に基づいて、4点式を蒸留し、公衆衛生の目標達成とデジタル時代の生活の質向上にデータを活用するために、流線型の重要かつ包括的な実践者集団の開発に向けて、公衆衛生データサイエンス教育の取り組みを強化する。 Rapidly evolving technology, data and analytic landscapes are permeating many fields and professions. In public health, the need for data science skills including data literacy is particularly prominent given both the potential of novel data types and analysis methods to fill gaps in existing public health research and intervention practices, as well as the potential of such data or methods to perpetuate or augment health disparities. Through a review of public health courses and programs at the top 10 U.S. and globally ranked schools of public health, this article summarizes existing educational efforts in public health data science. These existing practices serve to inform efforts for broadening such curricula to further schools and populations. Data science ethics course offerings are also examined in context of assessing how population health principles can be blended into training across levels of data involvement to augment the traditional core of public health curricula. Parallel findings from domestic and international 'outside the classroom' training programs are also synthesized to advance approaches for increasing diversity in public health data science. 
Based on these program reviews and their synthesis, a four-point formula is distilled for furthering public health data science education efforts, toward development of a critical and inclusive mass of practitioners with fluency to leverage data to advance goals of public health and improve quality of life in the digital age.	翻訳日:2023-02-19 10:21:46 公開日:2022-08-06
# オフナディア測地偏差予測のための深層学習アンサンブルフレームワーク A Deep Learning Ensemble Framework for Off-Nadir Geocentric Pose Prediction ( http://arxiv.org/abs/2205.11230v3 ) ライセンス: Link先を確認	Christopher Sun, Jai Sharma, Milind Maiti	(参考訳) 自然災害対応を加速する計算手法には、変化検出、地図アライメント、視覚支援ナビゲーションなどがある。現在のソフトウェアはnadirに近い画像のみに最適に機能するが、オフnadir画像は自然災害後の最初の情報源であることが多い。上記のタスクにオフnadir画像を使用するには、重力に対する航空機の空間方向であるジオセントリックなポーズの計算が必要である。本研究では,世界の都市における5,923個の近海RGB衛星画像を用いて,地球中心のポーズを予測するためのディープラーニングアンサンブルフレームワークを提案する。まず、U-Net Fully Convolutional Neural Networkは、RGB画像の画素方向の地上高度マスクを予測する。そして、標高マスクをRGB画像と連結して第2畳み込みモデルに入力される4チャンネル入力を形成し、方位角と倍率スケールを予測する。 R2=0.917の性能精度は従来の手法よりも大幅に向上した。また,教師付き補間により異常除去を行い,標高マスクの感度分析を行い,データ特徴量の有用性を評価し,将来的な特徴工学の道筋を動機付ける。本研究で構築した高精度ソフトウェアは,災害対応のための地図作成とナビゲーションに有効である。 Computational methods to accelerate natural disaster response include change detection, map alignment, and vision-aided navigation. Current software functions optimally only on near-nadir images, though off-nadir images are often the first sources of information following a natural disaster. The use of off-nadir images for the aforementioned tasks requires the computation of geocentric pose, which is an aerial vehicle's spatial orientation with respect to gravity. This study proposes a deep learning ensemble framework to predict geocentric pose using 5,923 near-nadir and off-nadir RGB satellite images of cities worldwide. First, a U-Net Fully Convolutional Neural Network predicts the pixel-wise above-ground elevation mask of the RGB images. Then, the elevation masks are concatenated with the RGB images to form four-channel inputs fed into a second convolutional model, which predicts orientation angle and magnification scale. A performance accuracy of R2=0.917 significantly outperforms previous methodologies. In addition, outlier removal is performed through supervised interpolation, and a sensitivity analysis of elevation masks is conducted to gauge the usefulness of data features, motivating future avenues of feature engineering. 
The high-accuracy software built in this study contributes to mapping and navigation procedures for effective disaster response to save lives.	翻訳日:2023-02-14 08:49:32 公開日:2022-08-06
# 量子確率同値は、巨視的実在論と整合する測定の逆因果モデルに繋がる A quantum stochastic equivalence leads to a retrocausal model for measurement consistent with macroscopic realism ( http://arxiv.org/abs/2205.06070v2 ) ライセンス: Link先を確認	Margaret D Reid and Peter D Drummond	(参考訳) 本稿では, 量子力学から逆因果性が自然に生じることを示し, 量子測定を巨視的実在論と整合的に説明する。固有状態 $|x_{j}\rangle$ の重ね合わせで用意された系に対する測定 $\hat{x}$ を解析し、測定は増幅によってモデル化される。経路積分定理を導出することにより、量子確率分布 $Q(x,p,t)$ と、振幅 $x(t)$ および $p(t)$ に対するそれぞれ時間逆向きおよび時間順向きの同時確率方程式との等価性を証明する。逆向きと順向きの軌道は、初期時刻の境界で結び付けられる。 Deutschのような'causal consistency'とボルンの規則は自然に現れる。特徴は固有状態に関連する真空ノイズである。固有値とは異なり、このノイズは増幅されず、測定不能であり、その正確なゆらぎは過去と未来の境界条件に由来する。巨視的実在論との整合性が見出される: 巨視的重ね合わせについては、測定 $\hat{x}$ の巨視的結果は測定開始前に決定されているとみなされる。これにより、巨視的因果関係と微視的逆因果関係のハイブリッド、および他の実在論モデルが導かれる。この結果は、波動関数の「収縮」が増幅に伴って起こることを裏付ける: 結果 $x_{j}$ で事後選択された初期'状態'に対する分布 $Q_{j}(x,p,0)$ は量子状態ではないが、巨視的重ね合わせに対しては固有状態 $|x_{j}\rangle$ に近づく。完全な不可逆的収縮は、測定器(メーター)への結合によってシミュレートされる。 Einstein-Podolsky-Rosen と Bell の相関について論じる。 In this paper, we show how retrocausality arises naturally from within quantum mechanics, and explains quantum measurement consistently with macroscopic realism. We analyze a measurement $\hat{x}$ on a system prepared in a superposition of eigenstates $|x_{j}\rangle$ where the measurement is modeled by amplification. By deriving a path-integral theorem, we prove an equivalence between a quantum probability distribution $Q(x,p,t)$ and simultaneous back-in-time and forward-in-time stochastic equations for amplitudes $x(t)$ and $p(t)$, respectively. The backward and forward trajectories are linked at the initial-time boundary. A Deutsch-like 'causal consistency' and Born's rule emerge naturally. A feature is the vacuum noise associated with the eigenstate. Unlike the eigenvalue, this noise is not amplified and is not measurable, the precise fluctuations originating from past and future boundary conditions. 
We find consistency with macroscopic realism: For macroscopic superpositions, the macroscopic outcome of the measurement $\hat{x}$ is considered determined prior to the onset of the measurement. This leads to hybrid macro-causal and micro-retrocausal relations, and other models of realism. Our results support that the 'collapse' of the wave function occurs with amplification: The distribution $Q_{j}(x,p,0)$ for the initial 'state' postselected on the outcome $x_{j}$ is not a quantum state but approaches the eigenstate $|x_{j}\rangle$ for a macroscopic superposition. The full irreversible collapse is simulated by coupling to a meter. We discuss Einstein-Podolsky-Rosen and Bell correlations.	翻訳日:2023-02-13 09:37:49 公開日:2022-08-06
# 室温Rydberg-Atomを用いたテラヘルツ受信器 Terahertz Receiver based on Room-Temperature Rydberg-Atoms ( http://arxiv.org/abs/2205.11021v2 ) ライセンス: Link先を確認	Yayi Lin, Zhenyue She, Zhiwen Chen, Xianzhe Li, Caixia Zhang, Kaiyu Liao, Xinding Zhang, Wei Huang, Hui Yan, and Shiliang Zhu	(参考訳) 実用的なテラヘルツ無線通信の実現は多くの課題に直面している。 THz無線通信には高感度受信機が重要である。ここでは, セシウムRydberg原子に基づくテラヘルツ受信機を室温気相セルで実証する。最小検出可能なTHz電界を校正する。この受信機では、振幅変調または周波数変調テラヘルツ波を光信号に位相感応変換する。その結果、原子レシーバーはその量子特性のために多くの利点があることがわかった。特に、この受信機を用いて長距離THz無線通信を実現することができる。さらに、原子受信機は、THz無線-光リンクで使用することができる。 Realization of practical terahertz wireless communications still faces many challenges. The receiver with high sensitivity is important for THz wireless communications. Here we demonstrate a terahertz receiver based on the cesium Rydberg atoms in a room-temperature vapor cell. The minimum detectable THz electric field is calibrated. With this receiver, the phase-sensitive conversion of amplitude-modulated or frequency-modulated terahertz waves into optical signals is performed. The results show that the atomic receiver has many advantages due to its quantum properties. Especially, the long distance THz wireless communications is achievable using this receiver. Furthermore, the atomic receiver can be used in the THz wireless-to-optical link.	翻訳日:2023-02-12 00:57:13 公開日:2022-08-06
# 架空の同一粒子の熱力学特性とフェルミオン符号問題への応用に関する研究 On the thermodynamic properties of fictitious identical particles and the application to fermion sign problem ( http://arxiv.org/abs/2206.08341v2 ) ライセンス: Link先を確認	Yunuo Xiong, Hongwei Xiong	(参考訳) 最近開発された同一のボソンとフェルミオンの経路積分分子動力学を一般化することにより、実パラメータ$\xi$ がボソンとフェルミオンの間を連続的に補間される架空の同一粒子の有限温度熱力学的性質を考える。一般解析と数値実験により、平均エネルギーは、この実パラメータ $\xi$ の関数として良い解析的性質を持つことが判明し、これは、虚多項式関数による外挿により同じフェルミオンの熱力学特性を$\xi\geq 0$ で正確に計算した後に計算する機会を与える。本手法は,有限温度フェルミオン系に対して効率的に正確なエネルギー値を与えることができることを示す。我々の研究は、いくつかの量子系のフェルミオン符号問題を回避する機会を提供する。 By generalizing the recently developed path integral molecular dynamics for identical bosons and fermions, we consider the finite-temperature thermodynamic properties of fictitious identical particles with a real parameter $\xi$ interpolating continuously between bosons ($\xi=1$) and fermions ($\xi=-1$). Through general analysis and numerical experiments we find that the average energy may have good analytical property as a function of this real parameter $\xi$, which provides the chance to calculate the thermodynamical properties of identical fermions by an extrapolation with a simple polynomial function after accurately calculating the thermodynamic properties of the fictitious particles for $\xi\geq 0$. Using several examples, it is shown that our method can efficiently give accurate energy values for finite-temperature fermionic systems. Our work provides a chance to circumvent the fermion sign problem for some quantum systems.	翻訳日:2023-02-11 13:46:11 公開日:2022-08-06
# 非自明なPT対称連続体ハミルトニアンとその固有状態と固有値 A non-trivial PT-symmetric continuum Hamiltonian and its Eigenstates and Eigenvalues ( http://arxiv.org/abs/2206.12900v2 ) ライセンス: Link先を確認	Lawrence Mead, David Garfinkle, Sungwook Lee	(参考訳) 本稿では,連続体PT対称ハミルトニアンが支配する非自明な系について述べる。このハミルトニアンは単純調和振動子と等スペクトル(iso-spectral)であることを示す。その固有関数と、それらの関数が正規直交集合を形成する複素平面上の経路を求める。また、この系に対する隠れ対称性作用素 ${\cal C}$ も求める。すべての計算は解析的に、近似なしで行われる。 In this paper, a non-trivial system governed by a continuum PT-symmetric Hamiltonian is discussed. We show that this Hamiltonian is iso-spectral to the simple harmonic oscillator. We find its eigenfunctions and the path in the complex plane along which these functions form an orthonormal set. We also find the hidden symmetry operator, ${\cal C}$, for this system. All calculations are performed analytically and without approximation.	翻訳日:2023-02-07 23:48:00 公開日:2022-08-06
# キラル対称性のバルクモードを有するツイストエニオン空洞共振器と超軽量アクシオン暗黒物質に対する感度 Twisted Anyon Cavity Resonators with Bulk Modes of Chiral Symmetry and Sensitivity to Ultra-Light Axion Dark Matter ( http://arxiv.org/abs/2208.01640v2 ) ライセンス: Link先を確認	J. F. Bourhill, E. C. I. Paterson, M. Goryachev, M. E. Tobar	(参考訳) 本研究では,エニオン空洞共振器(Anyon Cavity Resonator)を考案する。共振器はねじれた中空構造に基づいており、選択された共振モードは非ゼロのヘリシティを示すことができる。キャビティの断面によっては、モードはこれまで研究されてきたものよりも一般的な対称性を持つ。例えば、ねじれのない場合、モードはボソンの形式であり、一方、$180^{\circ}$ のねじれでは対称性はフェルミオンの形式である。一般にねじれた共振器はエニオンの形式であることを示す。非ゼロのヘリシティはモードをアクシオンに結合させ、アップコンバージョン極限では、モードが共振器の帯域幅内で超軽量アクシオンに結合することを示す。この結合は振幅変調されたサイドバンドを加え、共振器の帯域幅内の単一モードのみを用いて、超軽量アクシオンを簡便かつ高感度に探索することを可能にする。 In this work, we invent the Anyon Cavity Resonator. The resonator is based on twisted hollow structures, which allow select resonant modes to exhibit non-zero helicity. Depending on the cross-section of the cavity, the modes have more general symmetry than what has been studied before. For example, with no twist, the mode is in the form of a boson, while with a $180^{\circ}$ twist the symmetry is in the form of a fermion. We show that the generally twisted resonator is in the form of an anyon. The non-zero helicity couples the mode to axions, and we show in the upconversion limit the mode couples to ultra-light axions within the bandwidth of the resonator. The coupling adds amplitude modulated sidebands and allows a simple sensitive way to search for ultra-light axions using only a single mode within the resonator's bandwidth.	翻訳日:2023-02-02 18:44:45 公開日:2022-08-06
# 捕捉イオン量子ゲートの忠実性に及ぼす高速雑音の影響 The effect of fast noise on the fidelity of trapped-ions quantum gates ( http://arxiv.org/abs/2208.03570v1 ) ライセンス: Link先を確認	Haim Nakav, Ran Finkelstein, Lee Peleg, Nitzan Akerman and Roee Ozeri	(参考訳) 高忠実度シングルおよびマルチ量子ビット演算は量子情報処理のバックボーンを構成する。この忠実度は、1ビットまたは2ビットのレベルを極めて整合的で正確な方法で結合する能力に基づいている。コヒーレント量子進化に必要な条件は、これらの遷移を駆動する非常に安定な局所振動子である。本稿では,局所発振器線幅よりもはるかに高い周波数での雑音である高速雑音が,閉じ込められたイオン系の1ビットおよび2ビットゲートの忠実度に及ぼす影響について検討する。我々は,共振する$\pi$ 回転とオフ共振側バンド遷移を含む単一量子ビット演算に対する高速雑音の影響を解析・測定する。さらに, モルマー・ソレンセン2量子ゲートにおける高速位相雑音の影響を解析した。我々は、量子ビット応答周波数における雑音パワースペクトル密度から与えられる単一のパラメータを通して、これら全ての演算の性能を統一的かつ簡便に推定する方法を見出した。解析は位相雑音や閉じ込められたイオン系に焦点をあてるが、他の高速ノイズ源やスピン状量子ビットが共通のボソニック場によって結合された他の量子ビット系に関係している。私たちの分析は、量子ハードウェアプラットフォームとゲートの分離を導くのに役立ち、フォールトトレラントな量子コンピューティングに対する信頼性を向上させることができます。 High fidelity single and multi-qubit operations compose the backbone of quantum information processing. This fidelity is based on the ability to couple single- or two-qubit levels in an extremely coherent and precise manner. A necessary condition for coherent quantum evolution is a highly stable local oscillator driving these transitions. Here we study the effect of fast noise, that is noise at frequencies much higher than the local oscillator linewidth, on the fidelity of one- and two-qubit gates in a trapped-ion system. We analyze and measure the effect of fast noise on single qubit operations including resonant $\pi$ rotations and off-resonant sideband transitions . We further analyze the effect of fast phase noise on the Molmer-Sorensen two-qubit gate. We find a unified and simple way to estimate the performance of all of these operations through a single parameter given by the noise power spectral density at the qubit response frequency. While our analysis focuses on phase noise and on trapped-ion systems, it is relevant for other sources of fast noise as well as for other qubit systems in which spin-like qubits are coupled by a common bosonic field. 
Our analysis can help in guiding the design of quantum hardware platforms and gates, improving their fidelity towards fault-tolerant quantum computing.	翻訳日:2023-02-02 02:25:49 公開日:2022-08-06
# 半量子鍵分配プロトコルの3次元と4次元における無バイアス基底 Mutually Unbiased Bases In 3 and 4 Dimensions Semi-quantum Key Distribution Protocol ( http://arxiv.org/abs/2208.03548v1 ) ライセンス: Link先を確認	Hasnaa Hajji, Morad El Baz	(参考訳) 半量子鍵分布は伝統的に2レベル量子系に基づいている。本稿では,様々な非バイアスベースを用いた高次元システムに基づく半量子鍵分散プロトコルの無条件セキュリティについて述べる。まず,3次元と4次元の非バイアス基底を用いた3次元の場合を,量子チャネルの雑音の関数として,鍵レートに対する下界を導出する。次に, 4次元状態に対して, 相互に偏りのない基底数が異なる半量子鍵分布プロトコルに一般化する。半量子鍵分布プロトコルを高次元の非バイアスベースに基づけることで、ノイズの許容しきい値と秘密鍵レートの最大到達値を高めることができることがわかった。 Semi-quantum key distribution is traditionally based on two-level quantum systems. In this paper, an unconditional security of a semi quantum key distribution protocol based on higher-dimensional systems using various mutually unbiased bases is presented. We first consider the three dimensional case using three and four mutually unbiased bases and derive a lower bound for the key rate as a function of the quantum channel's noise. We then generalize the result to a semi-quantum key distribution protocol that employs different number of mutually unbiased bases for four-dimensional states. It is found that basing the semi-quantum key distribution protocol on higher-dimensional mutually unbiased bases can increase the tolerable threshold of the noise and the maximum achievable value of the secret key rate.	翻訳日:2023-02-02 02:25:29 公開日:2022-08-06
# 原子媒体を組み込んだハイブリッド光機械システムにおける可変電磁誘導多重トランジスタ Tunable Electromagnetically Induced Multi-Transparencies in Hybrid Optomechanical system Incorporating Atomic Medium ( http://arxiv.org/abs/2208.03547v1 ) ライセンス: Link先を確認	M. Hunza, M. Asjad, T. Abbas, M. Qasymeh, B. Teklu and H. Eleuch	(参考訳) 同一の$\Lambda$型原子を組み込んだハイブリッド原子-オプトメカニクス系を考える。このシステムは、光とフォノニックの二重駆動を受ける。光学的線形および二次的相互作用を利用することにより、複数の電磁透過窓が得られることを示す。さらに、内蔵メカニカルポンプにより、透明窓を制御して調整する。例えば、外部機械ポンプの位相を調整することにより、追加の制御パラメータが有効となり、吸収・放出プロファイルが向上する。本研究は,キャビティ-オプトメカニクス系を組み込んだ量子デバイス内部の伝搬信号の効率的な修正手法を提案する。 We consider a hybrid atom-optomechanical system incorporating N identical $\Lambda$-type atoms. The system is subjected to dual optical and phononic drives. We show that by exploiting the optomechanical linear and quadratic interactions, multiple electromagnetic transparency windows are attained. Furthermore, owing to the incorporated mechanical pump, the transparency windows are controlled and tuned. For instance, by adjusting the phase of the external mechanical pump, additional controlling parameters are enabled, and the absorption/emission profiles are enhanced. Our present study provides an efficient approach to modifying propagating signals inside the quantum devices incorporating cavity-optomechanical systems.	翻訳日:2023-02-02 02:25:17 公開日:2022-08-06
# 加速フレームにおけるミンコフスキー・フォック状態 Minkowski-Fock states in accelerated frames ( http://arxiv.org/abs/2208.03481v1 ) ライセンス: Link先を確認	Riccardo Falcone, Claudio Conti	(参考訳) 非慣性観測者に対するミンコフスキー粒子状態の明示的なウィグナー定式化は知られていない。ここでは、加速フレームにおけるミンコフスキー・フォック状態の特性関数を計算するための一般的な処方を導出する。単粒子状態と二粒子状態の特別な場合において、この方法により運動量空間における粒子数の平均値と相関関数、およびそれらが観測者の加速度から受ける影響を導出できる。我々は,ミンコフスキー単粒子状態と2粒子状態の区別不可能性をリンドラー粒子分布の観点から示し,これは観測者がフレームの加速度を検出する方法とみなすことができる。 2粒子状態の場合、観測者は異なる運動量を持つリンドラー粒子間の相関を測定することでも加速を検出することができる。 An explicit Wigner formulation of Minkowski particle states for non-inertial observers is unknown. Here, we derive a general prescription to compute the characteristic function for Minkowski-Fock states in accelerated frames. For the special case of single-particle and two-particle states, this method makes it possible to derive mean values of particle numbers and correlation functions in momentum space, and the way they are affected by the acceleration of the observer. We show an indistinguishability between Minkowski single-particle and two-particle states in terms of Rindler particle distribution that can be regarded as a way for the observer to detect any acceleration of the frame. We find that for two-particle states the observer is also able to detect acceleration by measuring the correlation between Rindler particles with different momenta.	翻訳日:2023-02-02 02:25:08 公開日:2022-08-06
# 非摂動カシミール効果:真空構造、閉じ込め、カイラル対称性の破れ Nonperturbative Casimir effects: Vacuum structure, Confinement, and Chiral Symmetry Breaking ( http://arxiv.org/abs/2208.03457v1 ) ライセンス: Link先を確認	Alexander Molochkov	(参考訳) 境界を持つ時空における真空および物質の再構成について概観する。閉じ込めゲージ理論と強く相互作用するフェルミオン系の相の性質を考察する。特に、カシミールプレートの存在下でのカイラル相転移および非閉じ込め相転移の性質を扱う。また、そのような系における質量スケールのシフトと、その起こりうる動的・幾何学的性質についても論じる。 A review of vacuum and matter restructuring in space-time with boundaries is presented. We consider phase properties of confining gauge theories and strongly interacting fermion systems. In particular, we consider the properties of the chiral and deconfinement phase transitions in the presence of Casimir plates. We also discuss mass scale shifts in such systems and their possible dynamical and geometrical nature.	翻訳日:2023-02-02 02:24:54 公開日:2022-08-06
# フォック状態格子における光の量子位相状態のコヒーレント制御 Coherent control of quantum topological states of light in Fock-state lattices ( http://arxiv.org/abs/2208.03452v1 ) ライセンス: Link先を確認	Jinfeng Deng, Hang Dong, Chuanyu Zhang, Yaozu Wu, Jiale Yuan, Xuhao Zhu, Feitong Jin, Hekang Li, Zhen Wang, Han Cai, Chao Song, H. Wang, J. Q. You, and Da-Wei Wang	(参考訳) トポロジカルフォトニクスは、従来の電子材料を超えてトポロジカル物理を探求する新しいプラットフォームを提供し、位相的に保護された光輸送とレーザーにおける有望な応用を刺激する。偏光や波動ベクトルのような古典的な自由度は、トポロジカル光モードの合成に日常的に使用される。古典的な体制を超えて、光の本質的な量子の性質は、本質的に異なる位相状態の富を生み出し、量子情報処理における位相的保護を提供する。本稿では,3つの共振器をgmon qubitに均一に結合した超伝導回路における量子化光の位相状態に関する実験を行う。本研究では, ゼロエネルギー状態, ひずみ誘起擬ランダウレベル, バレーホール効果, ハルダンキラルエッジ電流のトポロジカル輸送を示す1次元および2次元フォック状態格子を構築した。本研究では、光の位相状態を量子状態まで拡張し、凝縮物質物理学の位相位相相と回路量子電磁力学を橋渡し、複数の共振器の量子状態を制御する新しい自由度を提供する。 Topological photonics provides a novel platform to explore topological physics beyond traditional electronic materials and stimulates promising applications in topologically protected light transport and lasers. Classical degrees of freedom such as polarizations and wavevectors are routinely used to synthesize topological light modes. Beyond the classical regime, inherent quantum nature of light gives birth to a wealth of fundamentally distinct topological states, which offer topological protection in quantum information processing. Here we implement such experiments on topological states of quantized light in a superconducting circuit, on which three resonators are tunably coupled to a gmon qubit. We construct one and two-dimensional Fock-state lattices where topological transport of zero-energy states, strain induced pseudo-Landau levels, valley Hall effect and Haldane chiral edge currents are demonstrated. Our study extends the topological states of light to the quantum regime, bridges topological phases of condensed matter physics with circuit quantum electrodynamics, and offers a new freedom in controlling the quantum states of multiple resonators.	翻訳日:2023-02-02 02:24:49 公開日:2022-08-06
# 複素弱値測定による一般量子相関の動作特性評価 Operational characterization of general quantum correlation via complex weak value measurement ( http://arxiv.org/abs/2208.03442v1 ) ライセンス: Link先を確認	Agung Budiyono and Hermawan K. Dipojono	(参考訳) 過去20年間、量子相関の理解は絡み合いよりも一般的であり、分離可能な状態でさえ古典的な対象によってエミュレートできない相関を生じる可能性がある。このような一般的な非古典的相関は、基礎的な観点から興味深いだけでなく、様々な量子情報処理タスクや量子技術において資源として認識されている。本稿では, 2成分系における一般量子相関を, 選択後の弱い測定値を用いた直接実験室操作の観点から評価する。局所基底の弱測定により得られた弱値の虚部と、他の局所基底の事後選択とに基づく量と、これら2つの基底の可能な全ての選択に対する最適化手順を定義する。一般の量子相関の量子化器に対する一定の要求を満たすことを示す。不確かさの最小の真の量子共有として統計的に解釈できる。一般の純粋な状態に対する絡み合いの忠実な証人であり、絡み合いの線形エントロピーのスケールされた平方根に観測可能な下界を与える。次に,各状態における局所的射影測定の結果に基づいて,任意の局所的測定基準の最適推定における最小平均絶対誤差として,多部状態における一般量子相関に関する情報理論的解釈を提案する。 The last two decades have witnessed significant progress on the understanding the quantum correlation more general than entanglement, wherein even a separable state may yield correlation that cannot be emulated by any classical object. Such a general nonclassical correlation is not only intriguing from the fundamental point of view, but it has also been recognized as a resource in a variety of quantum information processing tasks and quantum technology. Here, we propose a characterization of the general quantum correlation in bipartite system in terms of direct laboratory operations using weak measurement with postselection. We define a quantity based on the imaginary part of weak values obtained via weak measurement of a local basis followed by a postselection of another local basis, and an optimization procedure over all possible choices of the two bases. We show that it satisfies certain desirable requirements for a quantifier of general quantum correlations. It may be statistically interpreted as the minimum genuine quantum share of uncertainty. It is a faithful witness of entanglement for general pure states, giving an observable lower bound to a scaled square root of the linear entropy of entanglement. 
We then suggest an information theoretic interpretation of the general quantum correlation in a multipartite state as the minimum mean absolute error in an optimal estimation of any local measurement basis, based on the outcomes of local projective measurement on the state, in the worst case scenario.	翻訳日:2023-02-02 02:24:30 公開日:2022-08-06
# ベル不等式違反と関節マッピングのゲームに基づく測定における相関の保存 Conservation of correlation in measurement underlying the violation of Bell inequalities and a game of joint mapping ( http://arxiv.org/abs/2208.03441v1 ) ライセンス: Link先を確認	Agung Budiyono	(参考訳) ベルの不等式に違反する量子測定に何が必要か? 測度によらず、スピン-$\frac{1}{2}$粒子 (qubit) にスピンの定値を割り当てることができると仮定すると、c-値のスピン変数と呼ばれるが、これは任意の連続実数を取ることができる。さらに、測定値のc値のスピン変数を、可能な値の連続範囲から二進標準量子スピン値 $\pm 1$ にマッピングし、二部相関を保存する。ここで、そのようなc値スピン変数を実際に構成できることを示す。したがって、このモデルでは、状態が絡み合っているときベルの不等式を破るように量子測定を強制する相関の保存の要件であると主張することができる。次に、実数対の特定のアンサンブルを2進数対のペアに独立にマッピングするよう2つの当事者に要求し、相関が保存されるという条件のもとに統計ゲームについて議論する。相関の保存により、ゲームはベルの定理を尊重し、古典的戦略(すなわち局所的戦略と決定論的戦略)が勝てないゲームが存在することを意味する。一方、絡み合ったスピン-$\frac{1}{2}$粒子と局所的な量子スピン測定のための回路のアンサンブルにアクセスする量子戦略は、ゲームに勝つために使用できる。 What compels quantum measurement to violate the Bell inequalities? Suppose that regardless of measurement, one can assign to a spin-$\frac{1}{2}$ particle (qubit) a definite value of spin, called c-valued spin variable, but, it may take any continuous real number. Suppose further that measurement maps the c-valued spin variable from the continuous range of possible values onto the binary standard quantum spin values $\pm 1$ while preserving the bipartite correlation. Here, we show that such c-valued spin variables can indeed be constructed. In this model, one may therefore argue that it is the requirement of conservation of correlation which compels quantum measurement to violate the Bell inequalities when the prepared state is entangled. We then discuss a statistical game which captures the model of measurement, wherein two parties are asked to independently map a specific ensemble of pairs of real numbers onto pairs of binary numbers $\pm 1$, under the requirement that the correlation is preserved. The conservation of correlation forces the game to respect the Bell theorem, which implies that there is a class of games no classical (i.e., local and deterministic) strategy can ever win. 
On the other hand, a quantum strategy with access to an ensemble of entangled spin-$\frac{1}{2}$ particles and circuits for local quantum spin measurement can be used to win the game.	翻訳日:2023-02-02 02:24:09 公開日:2022-08-06
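To make the Bell-inequality statement in the entry above concrete, here is a minimal sketch that evaluates the CHSH combination for the textbook singlet-state correlation $E(a,b) = -\cos(a-b)$. It illustrates only the standard quantum prediction, not the c-valued spin model of the paper; the function names and the choice of measurement angles are illustrative assumptions.

```python
import math


def singlet_correlation(a: float, b: float) -> float:
    """Textbook quantum prediction for the spin correlation of a singlet
    pair measured along directions at angles a and b (in radians)."""
    return -math.cos(a - b)


def chsh_value(a: float, a_prime: float, b: float, b_prime: float) -> float:
    """CHSH combination S = E(a,b) - E(a,b') + E(a',b) + E(a',b')."""
    E = singlet_correlation
    return E(a, b) - E(a, b_prime) + E(a_prime, b) + E(a_prime, b_prime)


# Standard angle choice that maximizes |S| for the singlet state.
S = chsh_value(0.0, math.pi / 2, math.pi / 4, 3 * math.pi / 4)

CLASSICAL_BOUND = 2.0               # local deterministic strategies obey |S| <= 2
TSIRELSON_BOUND = 2 * math.sqrt(2)  # maximum reachable by quantum correlations

print(f"|S| = {abs(S):.4f}, classical bound = {CLASSICAL_BOUND}")
```

With these angles each of the four correlations contributes $\sqrt{2}/2$ in magnitude with the same sign, so $|S| = 2\sqrt{2}$, which exceeds the classical bound of 2.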
# 時間発展テンソルネットワークアルゴリズムの正規化スキーム Regularized scheme of time evolution tensor network algorithms ( http://arxiv.org/abs/2208.03436v1 ) ライセンス: Link先を確認	Li-Xiang Cen	(参考訳) 量子格子系の時間発展をシミュレートするために正規化分解法を提案する。トロッター分解を超えて、得られるプロパゲーターのコンパクトな構造は高次のベーカー・キャンベル・ハウスドルフ級数を示す。次に、ハイゼンベルク型あるいはキタエフ型相互作用を持つスピン格子系の基底状態エネルギーを決定するために、テンソルネットワークアルゴリズムの正規化スキームを開発する。ベンチマーク計算は正規化アルゴリズムの2つの利点を明らかにしている: 安定した収束を持ち、キタエフスピン液体に単純な更新法を適用する場合でもバイアスの影響を受けない; 生成したテンソルネットワークの縮約は、はるかに低い計算コストで急速に収束し、物理的期待値の計算におけるボトルネックを緩和する。 Regularized factorization is proposed to simulate time evolution for quantum lattice systems. Transcending the Trotter decomposition, the resulting compact structure of the propagator indicates a high-order Baker-Campbell-Hausdorff series. A regularized scheme of tensor network algorithms is then developed to determine the ground state energy for spin lattice systems with Heisenberg or Kitaev-type interactions. Benchmark calculations reveal two distinct merits of the regularized algorithm: it has stable convergence, immune to the bias even in applying the simple update method to the Kitaev spin liquid; contraction of the produced tensor network can converge rapidly with much lower computing cost, relaxing the bottleneck in calculating the physical expectation value.	翻訳日:2023-02-02 02:23:44 公開日:2022-08-06
# 複数の補ラベルによる学習 Learning with Multiple Complementary Labels ( http://arxiv.org/abs/1912.12927v4 ) ライセンス: Link先を確認	Lei Feng, Takuo Kaneko, Bo Han, Gang Niu, Bo An, Masashi Sugiyama	(参考訳) 補ラベル(CL)は単に例の不正なクラスを示すが、CLで学習すると、正しいクラスを予測できる多クラス分類器が得られる。残念ながら、従来の問題設定では各例に1つのCLしか許されない。ラベル付け者は1つの例に対して複数のCL(MCL)を容易に特定できるため、これは手法の可能性を著しく制限する。本稿では、各例にMCLを許容する新しい問題設定と、MCLで学習するための2つの方法を提案する。第1の方法では、MCLを多数の単一のCLに分解する2つのラッパーを設計し、CLで学習するための任意の手法を使えるようにする。しかし、MCLが保持する監視情報は、分解後に概念的に希釈される。そこで第2の方法では、バイアスのないリスク推定器を導出する。これを最小化することでMCLの各集合を全体として処理でき、推定誤差境界も得られる。さらに第2の方法を、適切に選択した上界を最小化する形に改良する。実験によると、前者の方法はMCLでの学習にうまく機能するが、後者の方がさらに優れている。 A complementary label (CL) simply indicates an incorrect class of an example, but learning with CLs results in multi-class classifiers that can predict the correct class. Unfortunately, the problem setting only allows a single CL for each example, which notably limits its potential since our labelers may easily identify multiple CLs (MCLs) for one example. In this paper, we propose a novel problem setting to allow MCLs for each example and two ways for learning with MCLs. In the first way, we design two wrappers that decompose MCLs into many single CLs, so that we can use any method for learning with CLs. However, the supervision information that MCLs hold is conceptually diluted after decomposition. Thus, in the second way, we derive an unbiased risk estimator; minimizing it processes each set of MCLs as a whole and possesses an estimation error bound. We further improve the second way into minimizing properly chosen upper bounds. Experiments show that the former way works well for learning with MCLs but the latter is even better.	翻訳日:2023-01-17 02:05:33 公開日:2022-08-06
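The "decompose MCLs into many single CLs" idea in the entry above can be sketched as follows. This is only a plain one-pair-per-label decomposition written for illustration; the abstract does not specify how the paper's two wrappers actually work, and the function name and toy data are hypothetical.

```python
from typing import Hashable, Iterable, List, Sequence, Tuple


def decompose_mcls(
    examples: Iterable[Tuple[Hashable, Sequence[int]]],
) -> List[Tuple[Hashable, int]]:
    """Turn each (example, set of complementary labels) pair into one
    (example, single complementary label) pair per distinct label."""
    pairs: List[Tuple[Hashable, int]] = []
    for x, complementary_labels in examples:
        # Deduplicate and sort so the output order is deterministic.
        for label in sorted(set(complementary_labels)):
            pairs.append((x, label))
    return pairs


# Hypothetical toy data: each example lists classes it does NOT belong to.
data = [("img_0", [1, 3]), ("img_1", [0]), ("img_2", [0, 1, 2])]
single_cl_data = decompose_mcls(data)
print(single_cl_data)
```

After this step any single-CL learner could be trained on `single_cl_data`, although each pair then ignores the other labels in its original set, which is the dilution of supervision the abstract mentions.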
# 敵対的汚染の存在下での文脈探索 Contextual Search in the Presence of Adversarial Corruptions ( http://arxiv.org/abs/2002.11650v6 ) ライセンス: Link先を確認	Akshay Krishnamurthy, Thodoris Lykouris, Chara Podimata, and Robert Schapire	(参考訳) 本研究では、二分探索を高次元に一般化した文脈探索について検討する。これは特徴ベースの動的価格設定などの設定を捉える。この問題の標準的な定式化は、エージェントが特定の同質な応答モデルに従って行動すると仮定する。しかし実際には、一部の応答は敵対的に汚染されている可能性がある。既存のアルゴリズムは、仮定された応答モデルが全てのエージェントについて(ほぼ)正確であることに大きく依存しており、そのような任意の誤特定がわずかに存在するだけでも性能が低下する。我々は、一部のエージェントが基礎となる応答モデルと矛盾する振る舞いをしうる場合の文脈探索の研究を開始する。特に、多次元二分探索法に基づくアルゴリズムと勾配降下法に基づくアルゴリズムの2つを提案する。これらのアルゴリズムは、敵対的汚染がない場合にはほぼ最適なリグレットを達成し、汚染されたエージェントの数に応じて性能が緩やかに劣化することを示す。これは、任意の敵対的雑音モデルにおける文脈探索の最初の結果である。我々の手法は、学習理論、ゲーム理論、高次元幾何学、凸解析から着想を得ている。 We study contextual search, a generalization of binary search in higher dimensions, which captures settings such as feature-based dynamic pricing. Standard formulations of this problem assume that agents act in accordance with a specific homogeneous response model. In practice, however, some responses may be adversarially corrupted. Existing algorithms heavily depend on the assumed response model being (approximately) accurate for all agents and have poor performance in the presence of even a few such arbitrary misspecifications. We initiate the study of contextual search when some of the agents can behave in ways inconsistent with the underlying response model. In particular, we provide two algorithms, one based on multidimensional binary search methods and one based on gradient descent. We show that these algorithms attain near-optimal regret in the absence of adversarial corruptions and their performance degrades gracefully with the number of such agents, providing the first results for contextual search in any adversarial noise model. Our techniques draw inspiration from learning theory, game theory, high-dimensional geometry, and convex analysis.	翻訳日:2022-12-28 15:00:02 公開日:2022-08-06
# 3次元点雲の部分分割に対するクロスシェイプ注意 Cross-Shape Attention for Part Segmentation of 3D Point Clouds ( http://arxiv.org/abs/2003.09053v4 ) ライセンス: Link先を確認	Marios Loizou, Dmitry Petrov, Melinos Averkiou, Evangelos Kalogerakis	(参考訳) 本稿では,3次元形状分割を目的とし,コレクション内の形状にまたがる点的特徴表現を伝播する手法を提案する。これは、異なる形状の点間の相互作用の度合いを評価し、特徴伝播を媒介するクロス形状の注意操作によって達成される。各テスト形状について,このようなクロス形状の注意操作を行うのに適した入力コレクション内の形状を求める。得られたポイントワイズ特徴表現は,実験で示されたように,より一貫性のある3次元形状分割結果をもたらす。 We present a method that propagates point-wise feature representations across shapes within a collection for the purpose of 3D shape segmentation. This is achieved through a cross-shape attention operation that assesses the degree of interaction between points on different shapes and mediates feature propagation. For each test shape, our method finds shapes in an input collection that are suited for executing such cross-shape attention operations. The resulting point-wise feature representations lead to more consistent 3D shape segmentation results, as demonstrated in our experiments.	翻訳日:2022-12-21 22:14:52 公開日:2022-08-06
# NAS-Navigator: 説明可能なワンショットディープニューラルネットワーク合成のためのビジュアルステアリング NAS-Navigator: Visual Steering for Explainable One-Shot Deep Neural Network Synthesis ( http://arxiv.org/abs/2009.13008v3 ) ライセンス: Link先を確認	Anjul Tyagi, Cong Xie, Klaus Mueller	(参考訳) 近年のディープラーニング分野の進歩は、いくつかのアプリケーションにおいて非常に大きなニューラルネットワークの有効性を示している。しかし、これらのディープニューラルネットワークのサイズが大きくなるにつれて、良い結果を得るために多くのパラメータを設定するのがますます難しくなっている。現在、アナリストは、労働集約的で時間を要する多くの異なる設定とパラメータ設定を実験する必要があります。一方、ニューラルネットワークアーキテクチャ探索のための完全自動化技術の能力は、人間の専門知識がなくても制限される。この問題に対処するため,我々は,ワンショットアーキテクチャ探索技術に基づいて,ニューラルネットワークアーキテクチャ最適化のタスクをグラフ空間探索として定式化する。このアプローチでは、全ての候補アーキテクチャのスーパーグラフをワンショットで訓練し、最適なニューラルネットワークをサブグラフとして識別する。本稿では,分析者が効率的にサブグラフ空間を構築し,ドメイン知識を注入することでネットワーク探索をガイドするフレームワークを提案する。基本的なニューラルネットワークコンポーネントで構成されたネットワークアーキテクチャ空間から始めて、アナリストはワンショット検索スキームを通じて、最も有望なコンポーネントを効果的に選択することができる。このテクニックを反復的に適用することで、アナリストは与えられたアプリケーションの最適なニューラルネットワークアーキテクチャに収束することができる。探索中、アナリストは、探索空間の散在した視覚化から提供された知識を利用して、異なるコンポーネントを編集し、より高速な収束を導くことができる。我々は,複数のディープラーニング研究者と共同でインタフェースを設計し,その最終効果をユーザ・スタディと2つのケース・スタディで評価した。 Recent advancements in the area of deep learning have shown the effectiveness of very large neural networks in several applications. However, as these deep neural networks continue to grow in size, it becomes more and more difficult to configure their many parameters to obtain good results. Presently, analysts must experiment with many different configurations and parameter settings, which is labor-intensive and time-consuming. On the other hand, the capacity of fully automated techniques for neural network architecture search is limited without the domain knowledge of human experts. To deal with the problem, we formulate the task of neural network architecture optimization as a graph space exploration, based on the one-shot architecture search technique. In this approach, a super-graph of all candidate architectures is trained in one-shot and the optimal neural network is identified as a sub-graph. 
In this paper, we present a framework that allows analysts to effectively build the solution sub-graph space and guide the network search by injecting their domain knowledge. Starting with the network architecture space composed of basic neural network components, analysts are empowered to effectively select the most promising components via our one-shot search scheme. Applying this technique in an iterative manner allows analysts to converge to the best performing neural network architecture for a given application. During the exploration, analysts can use their domain knowledge aided by cues provided from a scatterplot visualization of the search space to edit different components and guide the search for faster convergence. We designed our interface in collaboration with several deep learning researchers and its final effectiveness is evaluated with a user study and two case studies.	翻訳日:2022-10-13 20:37:50 公開日:2022-08-06
# 口腔癌におけるサイズと顕微鏡の特徴抽出と分類のための深層学習:畳み込みニューラルネットワークの強化 Deep Learning for Size and Microscope Feature Extraction and Classification in Oral Cancer: Enhanced Convolution Neural Network ( http://arxiv.org/abs/2208.07855v1 ) ライセンス: Link先を確認	Prakrit Joshi, Omar Hisham Alsadoon, Abeer Alsadoon, Nada AlSallami, Tarik A. Rashid, P.W.C. Prasad, Sami Haddad	(参考訳) 背景と目的: 過度に適合する問題は、深層学習技術が口腔癌の画像分類でうまく実装されていない理由である。本研究の目的は,畳み込みニューラルネットワークを用いたDeep Learningアルゴリズムを用いて,必要な次元削減特徴マップを正確に作成するためのオーバーフィッティングの削減である。方法論:提案システムは,自動エンコーダ技術を用いて特徴抽出の効率を高め,情報を圧縮する拡張畳み込みニューラルネットワークで構成されている。この手法では、入力データを生成するためにアンプールとデコンボリューションを行い、入力データと出力データの差を最小限に抑える。さらに、入力データセットから特徴特徴を抽出し、それらの特徴から入力データを再生し、ネットワークを学習して過度な適合を低減する。結果: 共焦点レーザー内視鏡(CLE)画像の異なるサンプル画像群を用いて, 精度の異なる処理時間値が得られた。その結果,提案手法は現在のシステムよりも優れていることがわかった。さらに,本システムでは,分類精度を5～5.5%向上し,平均処理時間を20～30ミリ秒短縮した。結論:本システムでは,CLE画像から異なる解剖学的位置の口腔癌細胞の正確な分類に焦点を当てた。最後に, オーバーフィッティング問題を解決するオートエンコーダ法を用いて, 精度と処理時間を向上させる。 Background and Aim: Over-fitting issue has been the reason behind deep learning technology not being successfully implemented in oral cancer images classification. The aims of this research were reducing overfitting for accurately producing the required dimension reduction feature map through Deep Learning algorithm using Convolutional Neural Network. Methodology: The proposed system consists of Enhanced Convolutional Neural Network that uses an autoencoder technique to increase the efficiency of the feature extraction process and compresses information. In this technique, unpooling and deconvolution is done to generate the input data to minimize the difference between input and output data. Moreover, it extracts characteristic features from the input data set to regenerate input data from those features by learning a network to reduce overfitting. Results: Different accuracy and processing time value is achieved while using different sample image group of Confocal Laser Endomicroscopy (CLE) images. 
The results showed that the proposed solution is better than the current system. Moreover, the proposed system has improved the classification accuracy by 5 to 5.5% on average and reduced the average processing time by 20 to 30 milliseconds. Conclusion: The proposed system focuses on the accurate classification of oral cancer cells of different anatomical locations from the CLE images. Finally, this study enhances the accuracy and processing time using the autoencoder method that solves the overfitting problem.	翻訳日:2022-08-28 22:23:21 公開日:2022-08-06
# 致死性疾患早期発見のための効率的な新規発見法 Efficient Novelty Detection Methods for Early Warning of Potential Fatal Diseases ( http://arxiv.org/abs/2208.04732v1 ) ライセンス: Link先を確認	Sèdjro Salomon Hotegni (1), Ernest Fokoué (2) ((1) African Institute for Mathematical Sciences - Rwanda, (2) Rochester Institute of Technology - United States)	(参考訳) CHE(Critical Health Episodes)のような致命的な病気は、集中治療室に入院した患者にとって真の危険である。これらのエピソードは不可逆的な臓器損傷や死を引き起こすことがある。それでも、時間内に診断することは、その不便さを大幅に減らすだろう。そこで本研究では,急性低血圧エピソードや頻拍エピソードなどのCHEの早期警戒システムの構築に焦点をあてた。予測の早期性を確保するため、観測期間(観測窓)と臨界事象が起こりうる期間(目標窓)との間に1時間の間隔が考慮された。 MIMIC IIデータセットを用いて,提案システムの性能評価を行った。このシステムはまず、3つの異なるモードを使って追加の特徴を抽出する。そして、相互情報量ゲインによる特徴重要度を用いて、最も関連性の高い特徴の選択を可能にする特徴選択プロセスを実施した。最後に,高性能予測モデルLightGBMを用いてエピソード分類を行った。 MIG-LightGBMと呼ばれるこのアプローチは、イベントリコール(ER)、縮小精度(RP)、平均予測時間(aveAT)、平均False Alarms(aveFA)、イベントF1スコア(EF1スコア)の5つの異なる指標を用いて評価された。したがって、大きなaveATだけでなく、大きなEF1スコアと低いaveFAを示す手法が、CHEの早期予測に高効率であるとみなされる。予測モデルとして Extreme Gradient Boosting, Support Vector Classification あるいは Naive Bayes を用いたシステムと比較すると,提案システムは非常に支配的であった。また、階層型学習アプローチよりも優れていることも確認した。 Fatal diseases, such as Critical Health Episodes (CHEs), represent real dangers for patients hospitalized in Intensive Care Units. These episodes can lead to irreversible organ damage and death. Nevertheless, diagnosing them in time would greatly reduce their inconvenience. This study therefore focused on building a highly effective early warning system for CHEs such as Acute Hypotensive Episodes and Tachycardia Episodes. To enable early prediction, a gap of one hour was considered between the observation periods (Observation Windows) and the periods during which a critical event can occur (Target Windows). The MIMIC II dataset was used to evaluate the performance of the proposed system. The system first extracts additional features using three different modes. 
Then, the feature selection process allowing the selection of the most relevant features was performed using the Mutual Information Gain feature importance. Finally, the high-performance predictive model LightGBM was used to perform episode classification. This approach called MIG-LightGBM was evaluated using five different metrics: Event Recall (ER), Reduced Precision (RP), average Anticipation Time (aveAT), average False Alarms (aveFA), and Event F1-score (EF1-score). A method is therefore considered highly efficient for the early prediction of CHEs if it exhibits not only a large aveAT but also a large EF1-score and a low aveFA. Compared to systems using Extreme Gradient Boosting, Support Vector Classification or Naive Bayes as a predictive model, the proposed system was found to be highly dominant. It also confirmed its superiority over the Layered Learning approach.	翻訳日:2022-08-10 13:21:50 公開日:2022-08-06
# TripHLApan:トリプルコーディングマトリックスと転移学習に基づくHLA分子結合ペプチドの予測 TripHLApan: predicting HLA molecules binding peptides based on triple coding matrix and transfer learning ( http://arxiv.org/abs/2208.04314v1 ) ライセンス: Link先を確認	Meng Wang, Chuqi Lei, Jianxin Wang, Yaohang Li and Min Li	(参考訳) ヒト白血球抗原(HLA)は、ヒト免疫領域において重要な分子ファミリーであり、外部の脅威を認識し、T細胞にペプチドを提示することで免疫応答を誘導する。近年では、特定の免疫応答を誘導する腫瘍ワクチンの合成ががん治療の最前線となっている。ペプチドとHLAの結合パターンを計算的にモデル化することで、腫瘍ワクチンの開発を大いに加速することができる。しかし,ほとんどの予測手法の性能は極めて限定的であり,モデリングの基盤として既存の生物学的知識の分析を十分に活用することはできない。本稿では,HLA分子ペプチド結合予測のためのパン特異的予測モデルTripHLApanを提案する。 TripHLApanは、3重符号化行列、BiGRU+アテンションモデル、転移学習戦略を統合することで、強力な予測能力を示す。総合的な評価は、異なる試験環境でHLA-IおよびHLA-IIペプチド結合を予測するTripHLApanの有効性を示す。 HLA-Iの予測力は、最新のデータセットでさらに実証される。また,TripHLApanはメラノーマ患者の試料において強い結合再構成能を有することが示された。結論として、TripHLApanは腫瘍ワクチンの合成のためのHLA-IおよびHLA-II分子ペプチドの結合を予測する強力なツールである。 Human leukocyte antigen (HLA) is an important molecule family in the field of human immunity, which recognizes foreign threats and triggers immune responses by presenting peptides to T cells. In recent years, the synthesis of tumor vaccines to induce specific immune responses has become the forefront of cancer treatment. Computationally modeling the binding patterns between peptide and HLA can greatly accelerate the development of tumor vaccines. However, the performance of most prediction methods is very limited and they cannot fully take advantage of the analysis of existing biological knowledge as the basis of modeling. In this paper, we propose TripHLApan, a novel pan-specific prediction model, for HLA molecular peptide binding prediction. TripHLApan exhibits powerful prediction ability by integrating triple coding matrix, BiGRU + Attention models, and transfer learning strategy. The comprehensive evaluations demonstrate the effectiveness of TripHLApan in predicting HLA-I and HLA-II peptide binding in different test environments. The predictive power of HLA-I is further demonstrated in the latest data set. 
In addition, we show that TripHLApan has strong binding reconstitution ability in the samples of a melanoma patient. In conclusion, TripHLApan is a powerful tool for predicting the binding of HLA-I and HLA-II molecular peptides for the synthesis of tumor vaccines.	翻訳日:2022-08-10 13:08:24 公開日:2022-08-06
# smart explorer:インタラクティブな探索による密集したクラッター内の物体認識 Smart Explorer: Recognizing Objects in Dense Clutter via Interactive Exploration ( http://arxiv.org/abs/2208.03496v1 ) ライセンス: Link先を確認	Zhenyu Wu, Ziwei Wang, Zibu Wei, Yi Wei and Haibin Yan	(参考訳) 密集クラッタにおける物体の認識は、把握、梱包、再配置など、幅広いロボット操作タスクにおいて重要な役割を担っている。しかし, 従来の視覚認識モデルでは, 症例間の有意な咬合による物体の欠落や, 物体の混み合いが高まる視覚の曖昧さによる不正確な予測が一般的である。本稿では,すべての物体を密集したクラッタで認識するための,smart explorerと呼ばれる対話型探索フレームワークを提案する。われわれのスマートエクスプローラーは、認識性能を最大化するためにクラッタと物理的に相互作用し、動作回数を最小限に抑えながら、最適な精度と効率のトレードオフによって、偽陽性と負の低減を効果的に行うことができる。具体的には,まずクラッタの多視点rgb-d画像を収集し,対応する点雲を再構成する。ビュー間でrgbイメージのインスタンスセグメンテーションを集約することにより、既存のクラスと各クラスのオブジェクト数を予測するクラッターのインスタンス毎ポイントクラウドパーティションを取得する。有効物理相互作用のためのプッシュ動作は、インスタンスセグメンテーションエントロピーとマルチビューオブジェクトの不一致からなる認識の不確実性を大幅に低減するために生成される。したがって、密閉クラッタにおける物体認識の最適精度-効率トレードオフは、反復的なインスタンス予測と物理的相互作用によって達成される。大規模な実験では、スマートエクスプローラーがいくつかのアクションだけで有望な認識精度を獲得し、ランダムなプッシュを大きなマージンで上回ります。 Recognizing objects in dense clutter accurately plays an important role to a wide variety of robotic manipulation tasks including grasping, packing, rearranging and many others. However, conventional visual recognition models usually miss objects because of the significant occlusion among instances and causes incorrect prediction due to the visual ambiguity with the high object crowdedness. In this paper, we propose an interactive exploration framework called Smart Explorer for recognizing all objects in dense clutters. Our Smart Explorer physically interacts with the clutter to maximize the recognition performance while minimize the number of motions, where the false positives and negatives can be alleviated effectively with the optimal accuracy-efficiency trade-offs. Specifically, we first collect the multi-view RGB-D images of the clutter and reconstruct the corresponding point cloud. 
By aggregating the instance segmentation of RGB images across views, we acquire the instance-wise point cloud partition of the clutter through which the existing classes and the number of objects for each class are predicted. The pushing actions for effective physical interaction are generated to sizably reduce the recognition uncertainty that consists of the instance segmentation entropy and multi-view object disagreement. Therefore, the optimal accuracy-efficiency trade-off of object recognition in dense clutter is achieved via iterative instance prediction and physical interaction. Extensive experiments demonstrate that our Smart Explorer acquires promising recognition accuracy with only a few actions and outperforms random pushing by a large margin.	翻訳日:2022-08-10 13:04:11 公開日:2022-08-06
# autoshape: 時系列クラスタリングのためのautoencoder-shapeletアプローチ AUTOSHAPE: An Autoencoder-Shapelet Approach for Time Series Clustering ( http://arxiv.org/abs/2208.04313v1 ) ライセンス: Link先を確認	Guozhong Li, Byron Choi, Jianliang Xu, Sourav S Bhowmick, Daphne Ngar-yin Mah, and Grace Lai-Hung Wong	(参考訳) 時系列シェープレットは、最近時系列クラスタリング(TSC)に有効であることが判明した識別サブシーケンスである。シェープレットはクラスタの解釈に便利である。したがって、TSCの主な課題は、異なるクラスタを識別する高品質な可変長形状レットを見つけることである。本稿では,新しいオートエンコーダ・シェープレットアプローチ(autoshape)を提案する。このアプローチは,教師なしの方法でシェープレットを決定する際に,オートエンコーダとシェープレットの両方を利用する最初の研究である。オートエンコーダは高品質なシェープレットを学習するために特別に設計されている。より具体的には、潜在表現学習を指導するために、異なる変数の可変長シェープレット候補(時系列サブシーケンス)の統一埋め込みを学ぶために、最新の自己教師付き損失を用い、統一空間における識別埋め込みを選択するための多様性損失を提案する。本稿では,クラスタリングのための元の時系列空間におけるシェープレットを復元する再構成損失について紹介する。最後に、学習中のクラスタリング性能をAUTOSHAPEに知らせるため、Davies Bouldin index(DBI)を採用する。 AUTOSHAPEについて広範な実験を行った。単変量時系列(UTS)におけるクラスタリング性能を評価するために,UCRアーカイブデータセットを用いたAUTOSHAPEと15の代表的な手法を比較した。多変量時系列(MTS)の性能を調べるため,30UEAアーカイブデータセット上でAUTOSHAPEを5つの競合手法で評価した。その結果、AUTOSHAPEは、比較したすべての手法の中で最高であることがわかった。 3つのUTSケーススタディと1つのMSSケーススタディにおいて,クラスタをシェープレットで解釈し,それぞれ興味深い直感を得ることができる。 Time series shapelets are discriminative subsequences that have been recently found effective for time series clustering (TSC). The shapelets are convenient for interpreting the clusters. Thus, the main challenge for TSC is to discover high-quality variable-length shapelets to discriminate different clusters. In this paper, we propose a novel autoencoder-shapelet approach (AUTOSHAPE), which is the first study to take the advantage of both autoencoder and shapelet for determining shapelets in an unsupervised manner. An autoencoder is specially designed to learn high-quality shapelets. More specifically, for guiding the latent representation learning, we employ the latest self-supervised loss to learn the unified embeddings for variable-length shapelet candidates (time series subsequences) of different variables, and propose the diversity loss to select the discriminating embeddings in the unified space. 
We introduce the reconstruction loss to recover shapelets in the original time series space for clustering. Finally, we adopt the Davies-Bouldin index (DBI) to inform AUTOSHAPE of the clustering performance during learning. We present extensive experiments on AUTOSHAPE. To evaluate the clustering performance on univariate time series (UTS), we compare AUTOSHAPE with 15 representative methods using UCR archive datasets. To study the performance on multivariate time series (MTS), we evaluate AUTOSHAPE on 30 UEA archive datasets with 5 competitive methods. The results validate that AUTOSHAPE is the best among all the methods compared. We interpret clusters with shapelets, and can obtain interesting intuitions about clusters in three UTS case studies and one MTS case study, respectively.	翻訳日:2022-08-10 12:09:53 公開日:2022-08-06
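Since the entry above uses the Davies-Bouldin index (DBI) as its clustering-quality signal, a minimal pure-Python sketch of the standard DBI definition may help: the mean, over clusters, of the worst-case ratio of summed within-cluster scatter to centroid distance (lower is better). This is not AUTOSHAPE's code; the function name and toy points are illustrative assumptions, and Euclidean distance is assumed.

```python
import math
from collections import defaultdict
from typing import Dict, List, Sequence


def davies_bouldin_index(
    points: Sequence[Sequence[float]], labels: Sequence[int]
) -> float:
    """Standard Davies-Bouldin index with Euclidean distance.

    Raises ValueError for fewer than two clusters; two clusters with
    identical centroids would cause a ZeroDivisionError.
    """
    clusters: Dict[int, List[Sequence[float]]] = defaultdict(list)
    for point, label in zip(points, labels):
        clusters[label].append(point)
    if len(clusters) < 2:
        raise ValueError("DBI needs at least two clusters")

    # Centroid of each cluster (coordinate-wise mean).
    centroids = {
        label: [sum(coord) / len(members) for coord in zip(*members)]
        for label, members in clusters.items()
    }
    # Scatter: mean distance of a cluster's points to its centroid.
    scatter = {
        label: sum(math.dist(p, centroids[label]) for p in members) / len(members)
        for label, members in clusters.items()
    }

    total = 0.0
    for i in clusters:
        total += max(
            (scatter[i] + scatter[j]) / math.dist(centroids[i], centroids[j])
            for j in clusters
            if j != i
        )
    return total / len(clusters)


# Toy data: two tight, well-separated groups.
toy_points = [(0.0, 0.0), (0.0, 2.0), (10.0, 0.0), (10.0, 2.0)]
good_dbi = davies_bouldin_index(toy_points, [0, 0, 1, 1])
bad_dbi = davies_bouldin_index(toy_points, [0, 1, 0, 1])
print(good_dbi, bad_dbi)
```

On the toy data the natural grouping has scatter 1 per cluster and centroid distance 10, giving a DBI of 0.2, while the mismatched labeling has scatter 5 and centroid distance 2, giving 5.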
# 情報収縮とグラフ拡張の観点から見たGNNのオーバースクワッシング Oversquashing in GNNs through the lens of information contraction and graph expansion ( http://arxiv.org/abs/2208.03471v1 ) ライセンス: Link先を確認	Pradeep Kr. Banerjee, Kedar Karhadkar, Yu Guang Wang, Uri Alon, Guido Montúfar	(参考訳) メッセージパッシンググラフニューラルネットワーク(GNN)における信号伝搬の質は、最近の研究で見られるように、その表現性に強く影響を及ぼす。特に、長距離相互作用に依存する予測タスクでは、ノード特徴の再帰的な集約が「オーバースクワッシング」と呼ばれる望ましくない現象につながる可能性がある。本稿では,情報収縮に基づくオーバースクワッシング分析の枠組みを提案する。我々の解析はフォン・ノイマンによる信頼性のある計算のモデルに導かれており、オーバースクワッシングをノイズのある計算グラフにおける信号のクエンチング(減衰)として捉える新たな洞察を与える。これに基づき、オーバースクワッシングの緩和を目的としたグラフ再配線アルゴリズムを提案する。本アルゴリズムは、エクスパンダーグラフの構成に動機づけられたランダムな局所エッジフリッププリミティブを用いる。提案アルゴリズムのスペクトル拡張特性を、既存の曲率に基づく非局所再配線戦略のものと比較する。合成実験により、我々のアルゴリズムは一般に拡張の速度が遅いものの、全体として計算コストが低く、ノードの次数を正確に保ち、グラフを決して非連結にしないことを示した。 The quality of signal propagation in message-passing graph neural networks (GNNs) strongly influences their expressivity, as has been observed in recent works. In particular, for prediction tasks relying on long-range interactions, recursive aggregation of node features can lead to an undesired phenomenon called "oversquashing". We present a framework for analyzing oversquashing based on information contraction. Our analysis is guided by a model of reliable computation due to von Neumann that lends a new insight into oversquashing as signal quenching in noisy computation graphs. Building on this, we propose a graph rewiring algorithm aimed at alleviating oversquashing. Our algorithm employs a random local edge flip primitive motivated by an expander graph construction. We compare the spectral expansion properties of our algorithm with those of an existing curvature-based non-local rewiring strategy. Synthetic experiments show that while our algorithm in general has a slower rate of expansion, it is overall computationally cheaper, preserves the node degrees exactly and never disconnects the graph.	翻訳日:2022-08-09 14:31:07 publish日:2022-08-06
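The two invariants claimed in the entry above (node degrees preserved exactly, graph never disconnected) can be illustrated with a generic local edge flip around a "hub" edge. The sketch below is one plausible form of such a primitive, assumed for illustration; the abstract does not give the paper's exact sampling rule, and all names here are hypothetical.

```python
import random
from typing import Dict, Set

Graph = Dict[int, Set[int]]  # undirected simple graph as adjacency sets


def random_local_edge_flip(adj: Graph, rng: random.Random) -> bool:
    """Attempt one in-place flip around a random hub edge (i, j).

    Picks u in N(i) and v in N(j), then replaces edges (i,u), (j,v) by
    (i,v), (j,u). Returns False and leaves the graph unchanged if the
    sampled move would create a duplicate edge or is not available.
    """
    i = rng.choice(sorted(adj))
    if not adj[i]:
        return False
    j = rng.choice(sorted(adj[i]))
    u_candidates = sorted(adj[i] - {j})
    v_candidates = sorted(adj[j] - {i})
    if not u_candidates or not v_candidates:
        return False
    u = rng.choice(u_candidates)
    v = rng.choice(v_candidates)
    if u == v or v in adj[i] or u in adj[j]:
        return False
    # The hub edge (i, j) is kept, so u and v stay reachable through it.
    adj[i].remove(u); adj[u].remove(i)
    adj[j].remove(v); adj[v].remove(j)
    adj[i].add(v); adj[v].add(i)
    adj[j].add(u); adj[u].add(j)
    return True


def is_connected(adj: Graph) -> bool:
    """Depth-first search reachability check from an arbitrary node."""
    start = next(iter(adj))
    seen = {start}
    stack = [start]
    while stack:
        node = stack.pop()
        for nbr in adj[node]:
            if nbr not in seen:
                seen.add(nbr)
                stack.append(nbr)
    return len(seen) == len(adj)


rng = random.Random(0)
adj = {k: {(k - 1) % 8, (k + 1) % 8} for k in range(8)}  # 8-node cycle
degrees_before = {k: len(nbrs) for k, nbrs in adj.items()}
flips = sum(random_local_edge_flip(adj, rng) for _ in range(200))
degrees_after = {k: len(nbrs) for k, nbrs in adj.items()}
print(flips, degrees_after == degrees_before, is_connected(adj))
```

Each accepted flip removes one edge and adds one edge at each of the four nodes involved, so degrees cannot change, and because the hub edge stays in place every removed connection has a two-step detour.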
# GCNN-LSTMハイブリッドニューラルネットワークによるアルゴリズム生成領域の検出 Detecting Algorithmically Generated Domains Using a GCNN-LSTM Hybrid Neural Network ( http://arxiv.org/abs/2208.03445v1 ) ライセンス: Link先を確認	Zheng Wang	(参考訳) ドメイン生成アルゴリズム(DGA)は、ボットネットによってC&Cサーバとボットの間のステルスなコマンドと制御(C&C)通信チャネルを構築するために使用される。 DGAは、多数の擬似ランダムアルゴリズム生成ドメイン(AGD)を周期的に生成することができる。 AGD検出アルゴリズムは、既存のDGA技術に対応する軽量で有望なソリューションを提供する。本稿では,agd検出のためのgcnn(gated convolutional neural network)-lstm(long short-term memory)ハイブリッドニューラルネットワーク(glhnn)を提案する。 GLHNNでは、LSTM上のドメイン名から情報的特徴を抽出するためにGCNNが適用される。 GLHNNは6種類のDGAをカバーするAGDを用いて実験的に検証されている。 glhnnは最先端検出モデルと比較され、テストされたモデルの中で最高の検出性能を示す。 Domain generation algorithm (DGA) is used by botnets to build a stealthy command and control (C&C) communication channel between the C&C server and the bots. A DGA can periodically produce a large number of pseudo-random algorithmically generated domains (AGDs). AGD detection algorithms provide a lightweight, promising solution in response to the existing DGA techniques. In this paper, a GCNN (gated convolutional neural network)-LSTM (long short-term memory) Hybrid Neural Network (GLHNN) for AGD detection is proposed. In GLHNN, GCNN is applied to extract the informative features from domain names on top of LSTM which further processes the feature sequence. GLHNN is experimentally validated using representative AGDs covering six classes of DGAs. GLHNN is compared with the state-of-the-art detection models and demonstrates the best overall detection performance among these tested models.	翻訳日:2022-08-09 14:20:42 公開日:2022-08-06
# ブラフ体まわりの流れの大渦シミュレーションのための深層学習クロージャモデル Deep Learning Closure Models for Large-Eddy Simulation of Flows around Bluff Bodies ( http://arxiv.org/abs/2208.03498v1 ) ライセンス: Link先を確認	Justin Sirignano and Jonathan F. MacArt	(参考訳) 大渦シミュレーション(LES)のための深層学習(DL)クロージャモデルを開発し、中程度のレイノルズ数における矩形柱まわりの非圧縮性流れについて評価した。壁近傍流れのシミュレーションは空力モデリングにおいて依然として中心的な課題である。剥離流れのRANS予測はしばしば不正確であり、LESは壁近傍で極めて小さいメッシュサイズを必要としうる。DL-LESモデルは随伴PDE最適化法を用いて訓練され、直接数値シミュレーション(DNS)データに可能な限り一致させる。その後、サンプル外(すなわち、トレーニングデータに含まれない新しいアスペクト比とレイノルズ数)で評価し、標準的なLESモデル(動的Smagorinskyモデル)と比較する。 DL-LESモデルは動的Smagorinskyモデルよりも優れており、比較的粗いメッシュ(DNSグリッドを各デカルト座標方向に4分の1にダウンサンプリングしたもの)上で正確なLES予測を達成できる。抗力係数、平均流れ、レイノルズ応力の予測におけるDL-LESモデルの精度を検討する。重要な課題は、LESで関心のある量が定常状態の流れ統計量であることである。例えば、時間平均速度 $\bar{u}(x) = \displaystyle \lim_{t \rightarrow \infty} \frac{1}{t} \int_0^t u(s,x) ds$ がそうである。したがって、定常状態の流れ統計量を計算するには、領域を通過する多数のフロー時間にわたってDL-LES方程式をシミュレートする必要がある。関数形が深層ニューラルネットワークによって定義される非定常な偏微分方程式モデルが $t \in [0, \infty)$ で安定かつ正確であり続けられるかどうかは、非自明な問題である。その結果、DL-LESモデルは長い物理時間にわたって正確かつ安定であり、空力応用に関連するブラフ体まわりの乱流について、速度、変動、抗力係数の定常状態統計量の推定が可能であることが示された。 A deep learning (DL) closure model for large-eddy simulation (LES) is developed and evaluated for incompressible flows around a rectangular cylinder at moderate Reynolds numbers. Near-wall flow simulation remains a central challenge in aerodynamic modeling: RANS predictions of separated flows are often inaccurate, while LES can require prohibitively small near-wall mesh sizes. The DL-LES model is trained using adjoint PDE optimization methods to match, as closely as possible, direct numerical simulation (DNS) data. It is then evaluated out-of-sample (i.e., for new aspect ratios and Reynolds numbers not included in the training data) and compared against a standard LES model (the dynamic Smagorinsky model). The DL-LES model outperforms dynamic Smagorinsky and is able to achieve accurate LES predictions on a relatively coarse mesh (downsampled from the DNS grid by a factor of four in each Cartesian direction). 
We study the accuracy of the DL-LES model for predicting the drag coefficient, mean flow, and Reynolds stress. A crucial challenge is that the LES quantities of interest are the steady-state flow statistics; for example, the time-averaged mean velocity $\bar{u}(x) = \displaystyle \lim_{t \rightarrow \infty} \frac{1}{t} \int_0^t u(s,x) ds$. Calculating the steady-state flow statistics therefore requires simulating the DL-LES equations over a large number of flow times through the domain; it is a non-trivial question whether an unsteady partial differential equation model whose functional form is defined by a deep neural network can remain stable and accurate on $t \in [0, \infty)$. Our results demonstrate that the DL-LES model is accurate and stable over large physical time spans, enabling the estimation of the steady-state statistics for the velocity, fluctuations, and drag coefficient of turbulent flows around bluff bodies relevant to aerodynamic applications.	翻訳日:2022-08-09 14:20:29 公開日:2022-08-06
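As a short worked note on the time-averaged velocity in the entry above: the average is taken over time, so the integration variable is the time $s$ while the position $x$ is held fixed. In a simulation the limit is typically approximated by a finite-time average over stored snapshots. The symbols $T$, $N$ and $\Delta t$ below are notation assumed here for illustration and do not come from the abstract.

```latex
\bar{u}(x)
  = \lim_{t \to \infty} \frac{1}{t} \int_0^t u(s, x)\, ds
  \approx \frac{1}{T} \int_0^T u(s, x)\, ds
  \approx \frac{1}{N} \sum_{n=1}^{N} u(n \Delta t, x),
  \qquad T = N \Delta t .
```

The approximation only becomes reliable once $T$ spans many flow-through times, which is why long-time stability of the learned closure model matters for these statistics.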
# 精度を犠牲にすることなくグラフ畳み込みネットワークのトリプルスパース化 Triple Sparsification of Graph Convolutional Networks without Sacrificing the Accuracy ( http://arxiv.org/abs/2208.03559v1 ) ライセンス: Link先を確認	Md. Khaledur Rahman, Ariful Azad	(参考訳) グラフニューラルネットワーク(GNN)は、グラフ上で異なる機械学習タスクを実行するために広く使われている。グラフのサイズが拡大し、GNNがより深くなるにつれて、メモリ要件に加えてトレーニングと推論の時間もコストがかかるようになる。したがって、精度を犠牲にしないグラフのスパース化やモデル圧縮は、グラフ学習タスクにとって実行可能なアプローチとなる。いくつかの既存手法は、グラフおよびGNNモデルのスパース化のみを研究している。本稿では,GNNにおけるあらゆるスパース化を研究対象とするSparseGCNパイプラインを提案する。我々は理論的解析を行い, 一般的に用いられるベンチマークグラフデータセットでの精度を犠牲にすることなく, 埋め込み行列に最大11.6%のスパーシティを追加できることを実証的に示した。 Graph Neural Networks (GNNs) are widely used to perform different machine learning tasks on graphs. As the size of the graphs grows, and the GNNs get deeper, training and inference time become costly in addition to the memory requirement. Thus, without sacrificing accuracy, graph sparsification or model compression becomes a viable approach for graph learning tasks. A few existing techniques only study the sparsification of graphs and GNN models. In this paper, we develop a SparseGCN pipeline to study all possible sparsification in GNN. We provide a theoretical analysis and empirically show that it can add up to 11.6% additional sparsity to the embedding matrix without sacrificing accuracy on commonly used benchmark graph datasets.	翻訳日:2022-08-09 14:19:49 公開日:2022-08-06
# LFGCF: タグ対応レコメンデーションのためのライトフォークソノミーグラフ協調フィルタリング LFGCF: Light Folksonomy Graph Collaborative Filtering for Tag-Aware Recommendation ( http://arxiv.org/abs/2208.03454v1 ) ライセンス: Link先を確認	Yin Zhang, Can Xu, XianJun Wu, Yan Zhang, LiGang Dong, Weigang Wang	(参考訳) タグ認識レコメンデーション(tag-aware recommendation)は、タグ付け行動に基づいて、ユーザ向けにパーソナライズされたアイテムリストを予測するタスクである。 last.fmやmovielensのようなタグ付け機能を持つ多くのアプリケーションにとって非常に重要である。近年、一般的なレコメンデーションにおいて新たな最先端となったグラフ畳み込みネットワーク (GCN) を用いて、タグ認識レコメンデーションシステム (TRS) を改良するために多くの努力が注がれている。しかし、一部の手法は正当化なしにGCNから直接継承されており、タグによって導入されるスパース性、あいまいさ、冗長性の問題を緩和することが難しく、その結果、トレーニングが困難になり、レコメンデーション性能が低下する。本稿では,GCNの設計を簡略化し,TRS向けにより簡潔にすることを目的とする。本稿では,必須のGCNコンポーネントのみを含む,Light Folksonomy Graph Collaborative Filtering(LFGCF)と呼ばれる新しいタグ認識レコメンデーションモデルを提案する。具体的には、LFGCFはまず、ユーザによるタグ付与とアイテムへのタグ付けの記録からフォークソノミーグラフを構築する。次に、シンプルな設計の集約を用いてフォークソノミーグラフ上の高次表現を学習し、複数の層で学習した埋め込みの重み付け和を用いて情報更新を行う。ユーザとアイテム間の情報ギャップを埋めるために、タグの埋め込みを共有する。さらに、ユーザの好みやアイテムの特徴をよりよく表現するために、TransRTという正則化関数を提案する。 3つの実世界のデータセットに対する広範なハイパーパラメータ実験とアブレーション研究により、LFGCFはより少ないパラメータで、タグ対応のトップNレコメンデーションにおいてほとんどのベースラインを著しく上回ることが示された。 Tag-aware recommendation is a task of predicting a personalized list of items for a user by their tagging behaviors. It is crucial for many applications with tagging capabilities like last.fm or movielens. Recently, many efforts have been devoted to improving Tag-aware recommendation systems (TRS) with Graph Convolutional Networks (GCN), which have become the new state of the art for general recommendation. However, some solutions are directly inherited from GCN without justifications, which makes it difficult to alleviate the sparsity, ambiguity, and redundancy issues introduced by tags, thus adding to difficulties of training and degrading recommendation performance. In this work, we aim to simplify the design of GCN to make it more concise for TRS. We propose a novel tag-aware recommendation model named Light Folksonomy Graph Collaborative Filtering (LFGCF), which only includes the essential GCN components. 
Specifically, LFGCF first constructs Folksonomy Graphs from the records of users assigning tags and items being tagged. Then we leverage the simple design of aggregation to learn the high-order representations on Folksonomy Graphs and use the weighted sum of the embeddings learned at several layers for information updating. We share tag embeddings to bridge the information gap between users and items. Besides, a regularization function named TransRT is proposed to better depict user preferences and item features. Extensive hyperparameter experiments and ablation studies on three real-world datasets show that LFGCF uses fewer parameters and significantly outperforms most baselines for the tag-aware top-N recommendations.	翻訳日:2022-08-09 14:07:45 公開日:2022-08-06
# NeuCASL: ニューロモルフィックエンジンの論理設計からシステムシミュレーションまで NeuCASL: From Logic Design to System Simulation of Neuromorphic Engines ( http://arxiv.org/abs/2208.03500v1 ) ライセンス: Link先を確認	Dharanidhar Dang, Amitash Nanda, Bill Lin and Debashis Sahoo	(参考訳) ムーアの法則が飽和し、デナードスケーリングが壁にぶつかったため、伝統的なフォン・ノイマン型システムはCNNのような計算集約的なアルゴリズムに必要なGFlops/Wを提供できない。非従来型コンピューティングアプローチの最近のトレンドは、そのようなアルゴリズムのために高エネルギー効率なコンピューティングシステムを設計できるという希望を与えている。ニューロモルフィックコンピューティングは、脳にインスパイアされた回路、新興技術の利用、低消費電力という性質を備えた有望なアプローチである。研究者は、メモリスタ、シリコンフォトニクス、FinFET、カーボンナノチューブなどの様々な新しい技術を使って、ニューロモルフィックコンピュータを実証している。しかし、ニューロモルフィック論理設計から始めてアーキテクチャシミュレーションまで進める柔軟なCADツールは、この有望なパラダイムの台頭を支えるものとしてまだ実証されていない。本プロジェクトでは,ニューロモルフィック論理設計,回路シミュレーション,システム性能および信頼性推定のための,オープンソースのPythonベースフルシステムCADフレームワークであるNeuCASLを構築することを目的とする。これは私たちの知る限りでは初めてのものである。 With Moore's law saturating and Dennard scaling hitting its wall, traditional von Neumann systems cannot offer the GFlops/watt for compute-intensive algorithms such as CNN. Recent trends in unconventional computing approaches give us hope to design highly energy-efficient computing systems for such algorithms. Neuromorphic computing is a promising such approach with its brain-inspired circuitry, use of emerging technologies, and low-power nature. Researchers use a variety of novel technologies such as memristors, silicon photonics, FinFET, and carbon nanotubes to demonstrate a neuromorphic computer. However, a flexible CAD tool to start from neuromorphic logic design and go up to architectural simulation is yet to be demonstrated to support the rise of this promising paradigm. In this project, we aim to build NeuCASL, an open-source Python-based full system CAD framework for neuromorphic logic design, circuit simulation, and system performance and reliability estimation. This is a first of its kind to the best of our knowledge.	翻訳日:2022-08-09 14:07:14 公開日:2022-08-06
# 教師なし3次元行動表現学習のための対照的なポジティブマイニング Contrastive Positive Mining for Unsupervised 3D Action Representation Learning ( http://arxiv.org/abs/2208.03497v1 ) ライセンス: Link先を確認	Haoyuan Zhang, Yonghong Hou, Wenjing Zhang and Wanqing Li	(参考訳) 最近のコントラスト学習に基づく3次元行動表現学習は大きな進歩を遂げている。しかし、厳密なポジティブ/ネガティブの制約はいまだ緩和されておらず、非自己ポジティブの利用もまだ検討されていない。本稿では,教師なしスケルトン3D行動表現学習のためのContrastive Positive Mining (CPM) フレームワークを提案する。 CPMは、学習を促進するためにコンテキストキュー内の非自己ポジティブを特定する。具体的には、siameseエンコーダを採用し、コンテキストキュー内のすべてのインスタンスを参照して拡張インスタンスの類似度分布が一致するように訓練する。キュー内の非自己ポジティブのインスタンスを識別することにより、マイニングしたポジティブの知識を活用し、学習された潜在空間のクラス内およびクラス間多様性に対する頑健性を高めるポジティブ強化学習戦略を提案する。実験の結果,提案したCPMは有効であり,NTUおよびPKU-MMDデータセットにおいて,既存の最先端の教師なし手法よりも優れていた。 Recent contrastive-based 3D action representation learning has made great progress. However, the strict positive/negative constraint is yet to be relaxed and the use of non-self positives is yet to be explored. In this paper, a Contrastive Positive Mining (CPM) framework is proposed for unsupervised skeleton 3D action representation learning. The CPM identifies non-self positives in a contextual queue to boost learning. Specifically, the siamese encoders are adopted and trained to match the similarity distributions of the augmented instances in reference to all instances in the contextual queue. By identifying the non-self positive instances in the queue, a positive-enhanced learning strategy is proposed to leverage the knowledge of mined positives to boost the robustness of the learned latent space against intra-class and inter-class diversity. Experimental results have shown that the proposed CPM is effective and outperforms the existing state-of-the-art unsupervised methods on the challenging NTU and PKU-MMD datasets.	翻訳日:2022-08-09 14:03:05 公開日:2022-08-06
# 3次元計測のための深層学習による空間位相アンラッピング Deep Learning-enabled Spatial Phase Unwrapping for 3D Measurement ( http://arxiv.org/abs/2208.03524v1 ) ライセンス: Link先を確認	Xiaolong Luo, Wanzhong Song, Songlin Bai, Yu Li, and Zhihe Zhao	(参考訳) 3次元撮像速度とシステムコストの観点からは、単一周波数パターンを投影する単一カメラシステムは、提案されている全てのフリンジ投影プロフィロメトリ (FPP) システムの中で理想的な選択肢である。このシステムは、堅牢な空間位相アンラッピング(SPU)アルゴリズムを必要とする。しかし、堅牢なSPUは複雑な場面では依然として課題である。品質誘導型SPUアルゴリズムは、アンラッピングの前に位相マップの信頼性の低い点を識別する、より効率的な方法を必要とする。エンドツーエンドのディープラーニングSPU手法は、汎用性と解釈可能性の問題に直面している。本稿では,FPPにおける頑健なSPUに対して,ディープラーニングと従来の経路追従を組み合わせたハイブリッド手法を提案する。このハイブリッドSPU方式は、従来の品質誘導型SPU法よりも高い堅牢性、エンドツーエンドのディープラーニング方式よりも高い解釈性、および未知のデータに対する汎用性を示す。複数の照明条件と、画像解像度, フリンジ数, フリンジ方向, 光学波長が異なる複数のFPPシステムからなる実データセットでの実験により, 提案手法の有効性を検証した。 In terms of 3D imaging speed and system cost, the single-camera system projecting single-frequency patterns is the ideal option among all proposed Fringe Projection Profilometry (FPP) systems. This system necessitates a robust spatial phase unwrapping (SPU) algorithm. However, robust SPU remains a challenge in complex scenes. Quality-guided SPU algorithms need more efficient ways to identify the unreliable points in phase maps before unwrapping. End-to-end deep learning SPU methods face generality and interpretability problems. This paper proposes a hybrid method combining deep learning and traditional path-following for robust SPU in FPP. This hybrid SPU scheme demonstrates better robustness than traditional quality-guided SPU methods, better interpretability than end-to-end deep learning schemes, and generality on unseen data. Experiments on the real dataset of multiple illumination conditions and multiple FPP systems differing in image resolution, the number of fringes, fringe direction, and optics wavelength verify the effectiveness of the proposed method.	翻訳日:2022-08-09 14:02:50 公開日:2022-08-06
# 正確かつ説明可能な深層学習システムによる胸部X線画像の解釈における観察者間一致の改善 An Accurate and Explainable Deep Learning System Improves Interobserver Agreement in the Interpretation of Chest Radiograph ( http://arxiv.org/abs/2208.03545v1 ) ライセンス: Link先を確認	Hieu H. Pham, Ha Q. Nguyen, Hieu T. Nguyen, Linh T. Le, Lam Khanh	(参考訳) 最近の人工知能(AI)アルゴリズムは、様々な医学分類タスクにおいて放射線科医レベルの性能を達成した。しかし、放射線科医に画像レベルの分類を説明する上で不可欠な、CXRスキャンからの異常所見の局在化に取り組んだ研究は少ない。本稿では,CXRスキャンを複数の胸部疾患に分類し、同時に画像上の重要な所見のほとんどの種類を局在化できる、VinDr-CXRという説明可能な深層学習システムを紹介する。 VinDr-CXRは、放射線科医によるバウンディングボックスアノテーションが付与された51,485件のCXRスキャンで訓練された。 3,000件のCXRスキャンからなる後ろ向き検証セットにおいて、6つの一般的な胸部疾患の分類で経験豊富な放射線科医に匹敵する性能を示し、受信者動作特性曲線下面積(AUROC)の平均は0.967(95%信頼区間[CI]: 0.958-0.975)であった。 VinDr-CXRは、独立した患者コホートにおいても外部検証され、その堅牢性を示した。 14種類の病変を対象とする局在化タスクでは、FROC解析により、VinDr-CXRはスキャンあたり1.0個の偽陽性病変という割合で80.2%の感度を達成した。 6名の経験豊富な放射線科医を支援する際のVinDr-CXRの臨床的影響を測定するため,前向き研究も実施した。その結果,診断支援ツールとして使用すると,Fleiss' Kappa平均が1.5%増加し,放射線科医間の一致が有意に改善した。また, 放射線科医がVinDr-CXRの提案を参照した後, 各放射線科医とシステムとの一致度は, コーエンのカッパ平均で3.3%と顕著に増加した。 Recent artificial intelligence (AI) algorithms have achieved radiologist-level performance on various medical classification tasks. However, only a few studies addressed the localization of abnormal findings from CXR scans, which is essential in explaining the image-level classification to radiologists. We introduce in this paper an explainable deep learning system called VinDr-CXR that can classify a CXR scan into multiple thoracic diseases and, at the same time, localize most types of critical findings on the image. VinDr-CXR was trained on 51,485 CXR scans with radiologist-provided bounding box annotations. It demonstrated a comparable performance to experienced radiologists in classifying 6 common thoracic diseases on a retrospective validation set of 3,000 CXR scans, with a mean area under the receiver operating characteristic curve (AUROC) of 0.967 (95% confidence interval [CI]: 0.958-0.975). The VinDr-CXR was also externally validated in independent patient cohorts and showed its robustness. 
For the localization task with 14 types of lesions, our free-response receiver operating characteristic (FROC) analysis showed that the VinDr-CXR achieved a sensitivity of 80.2% at the rate of 1.0 false-positive lesion identified per scan. A prospective study was also conducted to measure the clinical impact of the VinDr-CXR in assisting six experienced radiologists. The results indicated that the proposed system, when used as a diagnosis supporting tool, significantly improved the agreement between radiologists themselves with an increase of 1.5% in mean Fleiss' Kappa. We also observed that, after the radiologists consulted VinDr-CXR's suggestions, the agreement between each of them and the system was remarkably increased by 3.3% in mean Cohen's Kappa.	翻訳日:2022-08-09 14:02:35 公開日:2022-08-06
# パネルデータを用いた因果推論のための予測アルゴリズム Forecasting Algorithms for Causal Inference with Panel Data ( http://arxiv.org/abs/2208.03489v1 ) ライセンス: Link先を確認	Jacob Goldin, Julian Nyarko, Justin Young	(参考訳) パネルデータによる因果推論は、社会科学研究の核となる課題である。予測手法の進歩は、処置が行われなかった場合の処置単位の反事実的な推移をより正確に予測することで、この課題を促進できる。本稿では,新たに開発された時系列予測用の深層ニューラルアーキテクチャ(N-BEATSアルゴリズム)を利用する。本手法は, 処置後期間における処置単位の「合成」未処置バージョンを予測するために, 制御単位の先行値を組み込むことにより, 従来の時系列アプリケーションから適応させたものである。本手法から導出した推定器をSyNBEATSと呼び,様々な設定において従来の二方向固定効果や合成コントロール法を大きく上回ることを見出した。また,SyNBEATSは,行列補完や合成差分の差分法(synthetic difference in differences)等の最近のパネル推定手法と比較して,同等あるいはより正確な性能が得られることがわかった。本研究の結果は,パネル設定における因果推論を改善するために,予測分野の進歩をいかに活用できるかを示している。 Conducting causal inference with panel data is a core challenge in social science research. Advances in forecasting methods can facilitate this task by more accurately predicting the counterfactual evolution of a treated unit had treatment not occurred. In this paper, we draw on a newly developed deep neural architecture for time series forecasting (the N-BEATS algorithm). We adapt this method from conventional time series applications by incorporating leading values of control units to predict a "synthetic" untreated version of the treated unit in the post-treatment period. We refer to the estimator derived from this method as SyNBEATS, and find that it significantly outperforms traditional two-way fixed effects and synthetic control methods across a range of settings. We also find that SyNBEATS attains comparable or more accurate performance relative to more recent panel estimation methods such as matrix completion and synthetic difference in differences. Our results highlight how advances in the forecasting literature can be harnessed to improve causal inference in panel settings.	翻訳日:2022-08-09 13:55:59 公開日:2022-08-06
# 最大重み$k$独立集合によるグラフプーリング Graph Pooling with Maximum-Weight $k$-Independent Sets ( http://arxiv.org/abs/2208.03523v1 ) ライセンス: Link先を確認	Davide Bacciu, Alessio Conte, Francesco Landolfi	(参考訳) 大規模ネットワークやリレーショナルデータを扱う場合、グラフの縮約は基本的な操作である。粗視化した構造上で解くことで、計算負荷の高いタスクを縮小できる。同時に、グラフの縮約は、構造からマルチレゾリューション表現を抽出するための、グラフニューラルネットワークにおけるプーリング層の役割も果たす。これらの文脈において、距離関係と位相的性質を保存する縮約機構の能力は、現実規模の問題への適用を可能にするスケーラビリティとともに、基本的なものと考えられる。本稿では,最大重み$k$独立集合というグラフ理論的概念に基づくグラフ粗視化機構を導入し,GPU上での効率的な並列実装を可能にする貪欲アルゴリズムを提案する。本手法は, 正規データ(画像, シーケンス)における制御可能な等間隔粗視化機構に対応する、グラフ構造での最初の手法である。我々は、経路長の歪み境界に関する理論的保証と、粗視化グラフにおいて重要な位相的性質を保存する能力を証明する。これらの概念を利用してグラフプーリング機構を定義し、グラフ分類タスクで経験的に評価して、文献のプーリング手法と比べて良好な結果を示す。 Graph reductions are fundamental when dealing with large scale networks and relational data. They allow downsizing tasks of high computational impact by solving them in coarsened structures. At the same time, graph reductions play the role of pooling layers in graph neural networks, to extract multi-resolution representations from structures. In these contexts, the ability of the reduction mechanism to preserve distance relationships and topological properties appears fundamental, along with a scalability enabling its application to real-world sized problems. In this paper, we introduce a graph coarsening mechanism based on the graph-theoretic concept of maximum-weight $k$-independent sets, providing a greedy algorithm that allows efficient parallel implementation on GPUs. Our method is the first graph-structured counterpart of controllable equispaced coarsening mechanisms in regular data (images, sequences). We prove theoretical guarantees for distortion bounds on path lengths, as well as the ability to preserve key topological properties in the coarsened graphs. We leverage these concepts to define a graph pooling mechanism that we empirically assess in graph classification tasks, showing that it compares favorably against pooling methods in the literature.	翻訳日:2022-08-09 13:43:50 公開日:2022-08-06
# エントロピー損失を用いたロバスト深層学習に向けて Towards Robust Deep Learning using Entropic Losses ( http://arxiv.org/abs/2208.03566v1 ) ライセンス: Link先を確認	David Macêdo	(参考訳) 現在のディープラーニングソリューションは、推論中にサンプルを確実に分類できるかどうかを知らせないことでよく知られている。より信頼性の高いディープラーニングソリューションを構築するための最も効果的な方法の1つは、いわゆるアウト・オブ・ディストリビューション(out-of-distribution)検出タスクにおけるパフォーマンスを改善することだ。言い換えれば、分散検出能力のあるシステムは、ニューラルネットワークがトレーニングされていないクラスのインスタンスに送信されると、ナンセンスな分類を行うことを拒否する可能性がある。本論文は, 新たな損失関数と検出スコアを提案することにより, 未解決の分散検出タスクに取り組む。不確実性推定は、より堅牢なディープラーニングシステムを構築する上で重要な補助タスクでもある。そこで,本研究では,ディープニューラルネットワークが提示する確率がどの程度現実的かを評価するロバストネス関連タスクにも対処する。提案手法の有効性を実証するために,最先端の成果を含む実験セットに加えて,最大エントロピー原理に基づく議論を用いて,提案手法の理論的基礎を確立する。現在のほとんどの方法とは異なり、損失とスコアはシームレスで原則的なソリューションであり、高速で効率的な推論に加えて正確な予測を生み出します。さらに、深層ニューラルネットワークのトレーニングに使用される損失を置き換え、検出のための迅速なスコアを計算するだけで、現在のプロジェクトや将来のプロジェクトに組み込むことができます。 Current deep learning solutions are well known for not informing whether they can reliably classify an example during inference. One of the most effective ways to build more reliable deep learning solutions is to improve their performance in the so-called out-of-distribution detection task, which essentially consists of "know that you do not know" or "know the unknown". In other words, out-of-distribution detection capable systems may reject performing a nonsense classification when submitted to instances of classes on which the neural network was not trained. This thesis tackles the defiant out-of-distribution detection task by proposing novel loss functions and detection scores. Uncertainty estimation is also a crucial auxiliary task in building more robust deep learning systems. Therefore, we also deal with this robustness-related task, which evaluates how realistic the probabilities presented by the deep neural network are. 
To demonstrate the effectiveness of our approach, in addition to a substantial set of experiments, which includes state-of-the-art results, we use arguments based on the principle of maximum entropy to establish the theoretical foundation of the proposed approaches. Unlike most current methods, our losses and scores are seamless and principled solutions that produce accurate predictions in addition to fast and efficient inference. Moreover, our approaches can be incorporated into current and future projects simply by replacing the loss used to train the deep neural network and computing a rapid score for detection.	翻訳日:2022-08-09 13:43:33 公開日:2022-08-06
# ベクトル化表現を用いたグラフ型軌道予測器の一般化解析 Generalizability Analysis of Graph-based Trajectory Predictor with Vectorized Representation ( http://arxiv.org/abs/2208.03578v1 ) ライセンス: Link先を確認	Juanwu Lu, Wei Zhan, Masayoshi Tomizuka, Yeping Hu	(参考訳) 軌道予測は自動運転車にとって不可欠な課題の1つである。機械学習の最近の進歩は、一連の高度な軌道予測アルゴリズムを生み出した。近年,ベクトル化表現を用いたグラフニューラルネットワーク(gnns)の軌道予測の有効性が多くの研究者によって実証されている。それにもかかわらず、これらのアルゴリズムは様々なシナリオにわたるモデルの一般化可能性にほとんど注意を払わないか、あるいはトレーニングとテストデータが同様の統計に従うと仮定する。実際、テストシナリオが見えない場合や、アウト・オブ・ディストリビューション(OOD)の場合、結果のトレイン・テストのドメインシフトは通常、予測性能が大幅に低下し、下流のモジュールに影響を与え、最終的には深刻な事故を引き起こす。したがって、それらの一般化可能性の観点から予測モデルを徹底的に研究することが重要であり、その弱点を識別するだけでなく、これらのモデルを改善するための洞察を与えることもできる。本稿では,ブラックボックスモデルの解釈を支援する特徴属性法による一般化可能性分析フレームワークを提案する。本ケーススタディでは,ベクトル化表現を用いた最先端のグラフベース軌道予測器の詳細な一般化解析を行う。結果は、ドメインシフトによるパフォーマンスの大幅な低下を示し、これらの問題の潜在的な原因を特定するための洞察を提供する。最後に、一般的な予測課題と、トレーニングプロセスによって引き起こされる重み付けバイアスが精度を低下させる可能性について結論づける。 Trajectory prediction is one of the essential tasks for autonomous vehicles. Recent progress in machine learning gave birth to a series of advanced trajectory prediction algorithms. Lately, the effectiveness of using graph neural networks (GNNs) with vectorized representations for trajectory prediction has been demonstrated by many researchers. Nonetheless, these algorithms either pay little attention to models' generalizability across various scenarios or simply assume training and test data follow similar statistics. In fact, when test scenarios are unseen or Out-of-Distribution (OOD), the resulting train-test domain shift usually leads to significant degradation in prediction performance, which will impact downstream modules and eventually lead to severe accidents. Therefore, it is of great importance to thoroughly investigate the prediction models in terms of their generalizability, which can not only help identify their weaknesses but also provide insights on how to improve these models. 
This paper proposes a generalizability analysis framework using feature attribution methods to help interpret black-box models. For the case study, we provide an in-depth generalizability analysis of one of the state-of-the-art graph-based trajectory predictors that utilize vectorized representation. Results show significant performance degradation due to domain shift, and feature attribution provides insights to identify potential causes of these problems. Finally, we conclude the common prediction challenges and how weighting biases induced by the training process can deteriorate the accuracy.	翻訳日:2022-08-09 13:43:09 公開日:2022-08-06
# MonoViT: Vision Transformerを用いた自己教師付き単眼深度推定 MonoViT: Self-Supervised Monocular Depth Estimation with a Vision Transformer ( http://arxiv.org/abs/2208.03543v1 ) ライセンス: Link先を確認	Chaoqiang Zhao, Youmin Zhang, Matteo Poggi, Fabio Tosi, Xianda Guo, Zheng Zhu, Guan Huang, Yang Tang, Stefano Mattoccia	(参考訳) 自己教師付き単眼深度推定は、入手困難な深度ラベルを訓練に必要としない魅力的な解である。畳み込みニューラルネットワーク(CNN)は、最近このタスクで大きな成功を収めた。しかし、その限定的な受容野は、既存のネットワークアーキテクチャを局所的な推論のみに制限し、自己教師付きパラダイムの有効性を損なう。 Vision Transformer (ViT) が最近達成した成果を踏まえ, ViTモデルで実現されるグローバルな推論と自己教師付き単眼深度推定の柔軟性を組み合わせた新しいフレームワーク MonoViT を提案する。単純な畳み込みとTransformerブロックを組み合わせることで、我々のモデルは局所的および大域的な推論が可能となり、より詳細かつ高精度な深度予測が得られ、MonoViTは確立されたKITTIデータセット上で最先端のパフォーマンスを達成できる。さらに、MonoViTはMake3DやDrivingStereoといった他のデータセットでも優れた汎化能力を示している。 Self-supervised monocular depth estimation is an attractive solution that does not require hard-to-source depth labels for training. Convolutional neural networks (CNNs) have recently achieved great success in this task. However, their limited receptive field constrains existing network architectures to reason only locally, dampening the effectiveness of the self-supervised paradigm. In the light of the recent successes achieved by Vision Transformers (ViTs), we propose MonoViT, a brand-new framework combining the global reasoning enabled by ViT models with the flexibility of self-supervised monocular depth estimation. By combining plain convolutions with Transformer blocks, our model can reason locally and globally, yielding depth prediction at a higher level of detail and accuracy, allowing MonoViT to achieve state-of-the-art performance on the established KITTI dataset. Moreover, MonoViT proves its superior generalization capacities on other datasets such as Make3D and DrivingStereo.	翻訳日:2022-08-09 13:21:46 公開日:2022-08-06
# 凍結CLIPモデルは効率的なビデオ学習者である Frozen CLIP Models are Efficient Video Learners ( http://arxiv.org/abs/2208.03550v1 ) ライセンス: Link先を確認	Ziyi Lin, Shijie Geng, Renrui Zhang, Peng Gao, Gerard de Melo, Xiaogang Wang, Jifeng Dai, Yu Qiao, Hongsheng Li	(参考訳) ビデオ認識ではエンドツーエンド学習パラダイムが主流であり、まず事前訓練された画像モデルの重みでビデオ認識モデルを初期化し、次にビデオ上でエンドツーエンドのトレーニングを行う。これにより、ビデオネットワークは事前訓練された画像モデルの恩恵を受けることができる。しかし、ビデオでの微調整にはかなりの計算とメモリリソースが必要であり、画像バックボーンを微調整せずに事前訓練された画像特徴を直接使用する代替手段は、劣った結果につながる。幸いなことに、Contrastive Vision-Language Pre-training(CLIP)の最近の進歩は、視覚認識タスクのための新しい道を開いた。大規模なオープンボキャブラリの画像テキストペアデータで事前トレーニングされたこれらのモデルは、豊富なセマンティクスを持つ強力な視覚的表現を学習する。本稿では,凍結したCLIP特徴を用いて高品質なビデオ認識モデルを直接トレーニングする効率的なフレームワークとして,EVL(Efficient Video Learning)を提案する。具体的には,軽量なTransformerデコーダを用いてクエリトークンを学習し,CLIP画像エンコーダからフレームレベルの空間的特徴を動的に収集する。さらに,各デコーダ層に局所時間モジュールを適用し,隣接するフレームとその注意マップから時間的手がかりを検出する。凍結したバックボーンにより効率的にトレーニングできるにもかかわらず、我々のモデルは様々なビデオ認識データセットで高品質なビデオ表現を学習する。コードはhttps://github.com/OpenGVLab/efficient-video-recognitionで入手できる。 Video recognition has been dominated by the end-to-end learning paradigm -- first initializing a video recognition model with weights of a pretrained image model and then conducting end-to-end training on videos. This enables the video network to benefit from the pretrained image model. However, this requires substantial computation and memory resources for finetuning on videos and the alternative of directly using pretrained image features without finetuning the image backbone leads to subpar results. Fortunately, recent advances in Contrastive Vision-Language Pre-training (CLIP) pave the way for a new route for visual recognition tasks. Pretrained on large open-vocabulary image-text pair data, these models learn powerful visual representations with rich semantics. In this paper, we present Efficient Video Learning (EVL) -- an efficient framework for directly training high-quality video recognition models with frozen CLIP features. 
Specifically, we employ a lightweight Transformer decoder and learn a query token to dynamically collect frame-level spatial features from the CLIP image encoder. Furthermore, we adopt a local temporal module in each decoder layer to discover temporal clues from adjacent frames and their attention maps. We show that despite being efficient to train with a frozen backbone, our models learn high quality video representations on a variety of video recognition datasets. Code is available at https://github.com/OpenGVLab/efficient-video-recognition.	翻訳日:2022-08-09 13:21:29 公開日:2022-08-06
# オートキュレーションを用いた誘導パッチマッチングによる現代のカメラ解像度のインペインティング Inpainting at Modern Camera Resolution by Guided PatchMatch with Auto-Curation ( http://arxiv.org/abs/2208.03552v1 ) ライセンス: Link先を確認	Lingzhi Zhang, Connelly Barnes, Kevin Wampler, Sohrab Amirghodsi, Eli Shechtman, Zhe Lin, Jianbo Shi	(参考訳) 近年、深部モデルでは低解像度画像の塗装にSOTAの性能が確立されているが、4K以上の近代カメラや大穴の分解能に欠ける。我々は、4kおよび現代のセンサーの代表的な画像のインペインティングベンチマークデータセットに寄与する。深層学習と従来の手法を組み合わせた新しい枠組みを提案する。既存の深層塗装モデルlamaを用いて, 穴を適切に埋め, 構造, セグメント化, 深さ, および多重誘導パッチマッチングによる3つのガイド画像を構築し, アップサンプリングした8つの画像を生成する。次に,8x8の非対称な対の選好行列上で,カラム和による良好な着色を選択する新しいキュレーションモジュールを通じて,すべての候補の着色をフィードする。私たちのフレームワークの結果は、8つの強力なベースライン以上のユーザによって圧倒的に好まれており、最高のベースラインlamaよりも7.4までの定量的メトリクスが改善されています。 Recently, deep models have established SOTA performance for low-resolution image inpainting, but they lack fidelity at resolutions associated with modern cameras such as 4K or more, and for large holes. We contribute an inpainting benchmark dataset of photos at 4K and above representative of modern sensors. We demonstrate a novel framework that combines deep learning and traditional methods. We use an existing deep inpainting model LaMa to fill the hole plausibly, establish three guide images consisting of structure, segmentation, depth, and apply a multiply-guided PatchMatch to produce eight candidate upsampled inpainted images. Next, we feed all candidate inpaintings through a novel curation module that chooses a good inpainting by column summation on an 8x8 antisymmetric pairwise preference matrix. Our framework's results are overwhelmingly preferred by users over 8 strong baselines, with improvements of quantitative metrics up to 7.4 over the best baseline LaMa, and our technique when paired with 4 different SOTA inpainting backbones improves each such that ours is overwhelmingly preferred by users over a strong super-res baseline.	翻訳日:2022-08-09 13:21:06 公開日:2022-08-06
# HSIC-InfoGAN:近似相互情報量の最大化による教師なしdisentangled表現の学習 HSIC-InfoGAN: Learning Unsupervised Disentangled Representations by Maximising Approximated Mutual Information ( http://arxiv.org/abs/2208.03563v1 ) ライセンス: Link先を確認	Xiao Liu, Spyridon Thermos, Pedro Sanchez, Alison Q. O'Neil, Sotirios A. Tsaftaris	(参考訳) disentangled表現の学習には、教師信号か、あるいはバイアスとしての特定のモデル設計と学習制約の導入が必要である。 InfoGANは、潜在表現とそれに対応する生成画像との相互情報量を最大化することにより、教師なしのdisentangled表現を学習する一般的なdisentanglementフレームワークである。相互情報量の最大化は、補助ネットワークを導入し、潜在回帰損失を用いてトレーニングすることで達成される。この短い探索的論文では,Hilbert-Schmidt Independence Criterion (HSIC) を用いて潜在表現と画像間の相互情報量を近似する手法(HSIC-InfoGANと呼ぶ)について検討する。 HSIC損失を直接最適化することで、追加の補助ネットワークが不要になる。我々は,各モデルにおけるdisentanglementのレベルを定性的に比較し,HSIC-InfoGANのハイパーパラメータを調整するための戦略を提案し,医療応用におけるHSIC-InfoGANの可能性について議論する。 Learning disentangled representations requires either supervision or the introduction of specific model designs and learning constraints as biases. InfoGAN is a popular disentanglement framework that learns unsupervised disentangled representations by maximising the mutual information between latent representations and their corresponding generated images. Maximisation of mutual information is achieved by introducing an auxiliary network and training with a latent regression loss. In this short exploratory paper, we study the use of the Hilbert-Schmidt Independence Criterion (HSIC) to approximate mutual information between latent representation and image, termed HSIC-InfoGAN. Directly optimising the HSIC loss avoids the need for an additional auxiliary network. We qualitatively compare the level of disentanglement in each model, suggest a strategy to tune the hyperparameters of HSIC-InfoGAN, and discuss the potential of HSIC-InfoGAN for medical applications.	翻訳日:2022-08-09 13:20:45 公開日:2022-08-06
# 画像間・画像内特徴融合による深層非校正フォトメトリックステレオ Deep Uncalibrated Photometric Stereo via Inter-Intra Image Feature Fusion ( http://arxiv.org/abs/2208.03440v1 ) ライセンス: Link先を確認	Fangzhou Gao, Meng Wang, Lianghao Zhang, Li Wang, Jiawan Zhang	(参考訳) 非校正フォトメトリックステレオは、変化する未知の照明下で撮影された画像から詳細な表面法線を推定するために提案された。近年、ディープラーニングはこの劣決定問題に強力なデータ事前知識をもたらしている。本稿では,画像間表現を効率的に利用して法線推定を導く,深層非校正フォトメトリックステレオの新しい手法を提案する。従来の手法では、最適化ベースのニューラル逆レンダリングや、入力数に依存しない単一のプーリング層を使用して複数の入力を処理するが、入力画像間の情報の利用という点では非効率である。異なる照明下の複数の画像が与えられたとき、画像内および画像間の変化は高い相関関係にあると考えられる。この相関した変化に着想を得て、画像ごとの特徴抽出に画像間表現を導入する画像間・画像内特徴融合モジュールを設計した。この追加の表現は、画像ごとの特徴抽出を誘導し、法線推定の曖昧さを排除するために使われる。幅広い試料,特に暗い材質に対する本設計の効果を実証する。本手法は, 合成データと実データの両方において, 最先端の手法よりも大幅に優れた結果が得られる。 Uncalibrated photometric stereo is proposed to estimate the detailed surface normal from images under varying and unknown lightings. Recently, deep learning brings powerful data priors to this underdetermined problem. This paper presents a new method for deep uncalibrated photometric stereo, which efficiently utilizes the inter-image representation to guide the normal estimation. Previous methods use optimization-based neural inverse rendering or a single size-independent pooling layer to deal with multiple inputs, which are inefficient for utilizing information among input images. Given multi-images under different lighting, we consider the intra-image and inter-image variations highly correlated. Motivated by the correlated variations, we designed an inter-intra image feature fusion module to introduce the inter-image representation into the per-image feature extraction. The extra representation is used to guide the per-image feature extraction and eliminate the ambiguity in normal estimation. We demonstrate the effect of our design on a wide range of samples, especially on dark materials. Our method produces significantly better results than the state-of-the-art methods on both synthetic and real data.	翻訳日:2022-08-09 13:14:55 公開日:2022-08-06
# AFE-CNN:アクション特徴強調による3次元骨格に基づく行動認識 AFE-CNN: 3D Skeleton-based Action Recognition with Action Feature Enhancement ( http://arxiv.org/abs/2208.03444v1 ) ライセンス: Link先を確認	Shannan Guan, Haiyan Lu, Linchao Zhu, Gengfa Fang	(参考訳) 既存の3Dスケルトンに基づく行動認識アプローチは、手作りの行動特徴を画像フォーマットにエンコードし、CNNによってデコードすることで、優れた性能を実現している。しかし、この方法には2つの制限がある。 a) 手作りの行動特徴では難しい行動への対処が困難であり、 b) 一般に、行動認識精度を向上させるために複雑なCNNモデルが必要となり、通常は大きな計算負荷が生じる。これらの限界を克服するため,我々は,難しい行動に適応できるように3Dスケルトンベースの行動の特徴を強調することに特化した新しいAFE-CNNを導入する。キー関節、ボーンベクトル、キーフレーム、時間的観点からの特徴強調モジュールを提案する。これにより、AFE-CNNはカメラの視点や身体サイズの変化に対してより頑健になり,難しい行動における認識精度が大幅に向上する。さらに,AFE-CNNは、行動特徴が強調された画像のデコードに軽量なCNNモデルを採用しており、最先端手法よりもはるかに低い計算負荷を保証する。 NTU RGB+D, NTU RGB+D 120, UTKinect-Action3Dの3つのベンチマークスケルトンベース行動データセットを用いてAFE-CNNを評価し、広範な実験結果はAFE-CNNの優れた性能を示している。 Existing 3D skeleton-based action recognition approaches reach impressive performance by encoding handcrafted action features to image format and decoding by CNNs. However, such methods are limited in two ways: a) the handcrafted action features struggle to handle challenging actions, and b) they generally require complex CNN models to improve action recognition accuracy, which usually incur a heavy computational burden. To overcome these limitations, we introduce a novel AFE-CNN, which is devoted to enhancing the features of 3D skeleton-based actions to adapt to challenging actions. We propose feature enhancement modules from key joint, bone vector, key frame and temporal perspectives, thus the AFE-CNN is more robust to camera views and body sizes variation, and significantly improves the recognition accuracy on challenging actions. Moreover, our AFE-CNN adopts a light-weight CNN model to decode images with action feature enhanced, which ensures a much lower computational burden than the state-of-the-art methods. We evaluate the AFE-CNN on three benchmark skeleton-based action datasets: NTU RGB+D, NTU RGB+D 120, and UTKinect-Action3D. Extensive experimental results demonstrate the outstanding performance of AFE-CNN.	翻訳日:2022-08-09 13:14:38 公開日:2022-08-06
# クラスは文脈に対して不変であり、その逆も成り立つ:分布外汎化のための不変性の学習について Class Is Invariant to Context and Vice Versa: On Learning Invariance for Out-Of-Distribution Generalization ( http://arxiv.org/abs/2208.03462v1 ) ライセンス: Link先を確認	Jiaxin Qi, Kaihua Tang, Qianru Sun, Xian-Sheng Hua, and Hanwang Zhang	(参考訳) Out-Of-Distribution Generalization (OOD) とは、環境変化に対する不変性を学習することである。すべてのクラスで文脈が均等に分布している場合、「クラスは文脈に対して不変である」という基本原理により文脈を容易に除去できるため、OODは自明である。しかし、そのようなバランスのとれたデータセットの収集は現実的ではない。不均衡なデータで学習すると、モデルが文脈に偏り、OODを損なう。したがって、OODの鍵は文脈のバランスである。我々は、先行研究で広く採用されている「文脈バイアスは直接アノテーションできる、あるいはバイアスのあるクラス予測から推定できる」という仮定は、文脈を不完全あるいは不正確にすると主張する。これに対し、上記の原理の見過ごされてきたもう一方の側面、すなわち文脈もクラスに対して不変であることを指摘する。これは、(すでにラベル付けされている)クラスを変化する環境とみなし、文脈ラベルなしで文脈バイアスを解決する動機となる。このアイデアは、クラス内サンプル類似度の対照損失を最小化しつつ、この類似度がすべてのクラスにわたって不変となるようにすることで実装する。種々の文脈バイアスとドメインギャップを持つベンチマークにおいて、我々の文脈推定を備えた単純な再重み付けに基づく分類器が最先端の性能を達成することを示す。付録に理論的正当化を、https://github.com/simpleshinobu/IRMCon にコードを提供する。 Out-Of-Distribution generalization (OOD) is all about learning invariance against environmental changes. If the context in every class is evenly distributed, OOD would be trivial because the context can be easily removed due to an underlying principle: class is invariant to context. However, collecting such a balanced dataset is impractical. Learning on imbalanced data biases the model toward context and thus hurts OOD. Therefore, the key to OOD is context balance. We argue that the widely adopted assumption in prior work, the context bias can be directly annotated or estimated from biased class prediction, renders the context incomplete or even incorrect. In contrast, we point out the ever-overlooked other side of the above principle: context is also invariant to class, which motivates us to consider the classes (which are already labeled) as the varying environments to resolve context bias (without context labels). We implement this idea by minimizing the contrastive loss of intra-class sample similarity while assuring this similarity to be invariant across all classes. 
On benchmarks with various context biases and domain gaps, we show that a simple re-weighting based classifier equipped with our context estimation achieves state-of-the-art performance. We provide theoretical justifications in the Appendix and code at https://github.com/simpleshinobu/IRMCon.	翻訳日:2022-08-09 13:14:14 公開日:2022-08-06
# MRIモダリティ欠損下での深層学習に基づく脳腫瘍セグメンテーションの解析 Analyzing Deep Learning Based Brain Tumor Segmentation with Missing MRI Modalities ( http://arxiv.org/abs/2208.03470v1 ) ライセンス: Link先を確認	Benteng Ma, Yushi Wang, and Shen Wang	(参考訳) 本報告では,MRIモダリティが欠損した場合の脳腫瘍セグメンテーションに対する既存の深層学習(DL)手法の比較分析を行う。評価したアプローチには、Adversarial Co-training Network (ACN) と、mmGANとDeepMedicの組み合わせが含まれる。 mmGANのより安定的で使いやすいバージョンも、GitHubリポジトリでオープンソース化されている。 BraTS2018データセットを用いて、最先端のACNが、特にT1cが欠損している場合に優れた性能を発揮することを示す。一方、mmGANとDeepMedicの単純な組み合わせも、1つのMRIモダリティのみが欠損している場合に高い可能性を示す。さらに本研究は、MRIモダリティ欠損下での脳腫瘍セグメンテーションの今後の研究方向についての議論を提起する。 This technical report presents a comparative analysis of existing deep learning (DL) based approaches for brain tumor segmentation with missing MRI modalities. Approaches evaluated include the Adversarial Co-training Network (ACN) and a combination of mmGAN and DeepMedic. A more stable and easy-to-use version of mmGAN is also open-sourced at a GitHub repository. Using the BraTS2018 dataset, this work demonstrates that the state-of-the-art ACN performs better, especially when T1c is missing, while a simple combination of mmGAN and DeepMedic also shows strong potential when only one MRI modality is missing. Additionally, this work initiates a discussion of future research directions for brain tumor segmentation with missing MRI modalities.	翻訳日:2022-08-09 13:13:53 公開日:2022-08-06
# 顔に基づく影響計算のための不確実性モデリング付きマルチタスクトランス Multi-Task Transformer with uncertainty modelling for Face Based Affective Computing ( http://arxiv.org/abs/2208.03506v1 ) ライセンス: Link先を確認	Gauthier Tallec, Jules Bonnard, Arnaud Dapogny, Kévin Bailly	(参考訳) 顔に基づく感情計算は、顔画像から感情を検出する。人間の行動のより良い自動理解を解き放ち、人間と機械の相互作用を改善するための道を開くのに役立つ。しかし、それは感情の計算的表現を設計する難しいタスクが伴う。これまでのところ、感情は2次元ヴァレンス/オーラル空間で連続的に表現されるか、エクマンの7つの基本的な感情で離散的に表現されている。あるいは、エクマンの顔行動ユニット(AU)システムは、一元的な筋活動のコードブックを使用して感情を活性化するためにも使われている。 ABAW3とABAW4 マルチタスクチャレンジは、これらの3種類のラベルに注釈を付けた大規模なデータベースを提供する最初の作業である。本稿では,ヴァレンス覚醒,行動単位,基本的な感情を共同で予測するトランスフォーマティブ型マルチタスク手法を提案する。アーキテクチャの観点から、我々のメソッドはタスク間の類似性を効率的にモデル化するためにタスクワイズトークンアプローチを使用します。学習の観点からは、3つのタスクアノテーション間の確率性の差をモデル化するために不確実性重み付き損失を用いる。 Face based affective computing consists in detecting emotions from face images. It is useful to unlock better automatic comprehension of human behaviours and could pave the way toward improved human-machine interactions. However, it comes with the challenging task of designing a computational representation of emotions. So far, emotions have been represented either continuously in the 2D Valence/Arousal space or in a discrete manner with Ekman's 7 basic emotions. Alternatively, Ekman's Facial Action Unit (AU) system has also been used to characterize emotions using a codebook of unitary muscular activations. The ABAW3 and ABAW4 Multi-Task Challenges are the first work to provide a large-scale database annotated with those three types of labels. In this paper we present a transformer based multi-task method for jointly learning to predict valence-arousal, action units and basic emotions. From an architectural standpoint our method uses a taskwise token approach to efficiently model the similarities between the tasks. From a learning point of view we use an uncertainty weighted loss for modelling the difference of stochasticity between the annotations of the three tasks.	翻訳日:2022-08-09 13:13:42 公開日:2022-08-06
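The abstract above names an uncertainty weighted loss without giving its form. As a minimal sketch, assuming the commonly used homoscedastic-uncertainty weighting (each task loss scaled by `exp(-s_i)` plus a regularizer `s_i`, where `s_i` would be a learnable log-variance), the combination can be written in plain Python. The function name, the weighting form, and the example numbers are assumptions for illustration and are not taken from the paper.

```python
import math

def uncertainty_weighted_loss(task_losses, log_vars):
    """Combine per-task losses as sum_i exp(-s_i) * L_i + s_i.

    Assumption: this is the common homoscedastic-uncertainty form;
    the paper's exact loss may differ.
    """
    if len(task_losses) != len(log_vars):
        raise ValueError("need one log-variance per task")
    return sum(math.exp(-s) * loss + s for loss, s in zip(task_losses, log_vars))

# Hypothetical losses for valence-arousal, action units, and basic emotions.
losses = [0.8, 0.5, 1.2]
equal = uncertainty_weighted_loss(losses, [0.0, 0.0, 0.0])  # reduces to a plain sum
down = uncertainty_weighted_loss(losses, [0.0, 0.0, 1.0])   # third task down-weighted
```

With all log-variances at zero the result is just the sum of the task losses; raising one log-variance shrinks that task's contribution while the added `s_i` term keeps the weights from collapsing to zero.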
# 古典量子深層学習による半導体欠陥検出 Semiconductor Defect Detection by Hybrid Classical-Quantum Deep Learning ( http://arxiv.org/abs/2208.03514v1 ) ライセンス: Link先を確認	YuanFu Yang and Min Sun	(参考訳) 人工知能と自動運転技術の急速な発展により、半導体の需要は大幅に増加すると予想されている。しかし、半導体製造の大規模な拡大と新しい技術の発展は、多くの欠陥ウエハをもたらす。これらの欠陥ウエハが正しく検査されていない場合、欠陥ウエハの非効率な半導体処理は、過剰な二酸化炭素排出やエネルギー消費など、我々の環境にさらなる影響をもたらす。本稿では、量子コンピューティングの情報処理の利点を活用し、欠陥学習欠陥レビュー(DLDR)を促進する。短期量子プロセッサの深層学習のための古典量子ハイブリッドアルゴリズムを提案する。実装されたパラメータをチューニングすることにより、我々のフレームワークによって駆動される量子回路は、ウェハ欠陥マップ分類、欠陥パターン分類、ホットスポット検出を含む、所定のDLDRタスクを学習する。さらに,表現性やエンタングル能力の異なるパラメタライズド量子回路についても検討する。これらの結果は、半導体欠陥検出のための回路ベースの量子ディープラーニングを開発するための将来のロードマップを構築するために使用できる。 With the rapid development of artificial intelligence and autonomous driving technology, the demand for semiconductors is projected to rise substantially. However, the massive expansion of semiconductor manufacturing and the development of new technology will bring many defect wafers. If these defect wafers are not correctly inspected, the ineffective semiconductor processing on these defect wafers will cause additional impact to our environment, such as excessive carbon dioxide emission and energy consumption. In this paper, we utilize the information processing advantages of quantum computing to promote the defect learning defect review (DLDR). We propose a classical-quantum hybrid algorithm for deep learning on near-term quantum processors. By tuning the parameters implemented on it, a quantum circuit driven by our framework learns a given DLDR task, including wafer defect map classification, defect pattern classification, and hotspot detection. In addition, we explore parametrized quantum circuits with different expressibility and entangling capacities. These results can be used to build a future roadmap to develop circuit-based quantum deep learning for semiconductor defect detection.	翻訳日:2022-08-09 13:13:23 公開日:2022-08-06
# マルチプレックス検出に基づく全スライド画像分類のための複数インスタンス学習ネットワーク Multiplex-detection Based Multiple Instance Learning Network for Whole Slide Image Classification ( http://arxiv.org/abs/2208.03526v1 ) ライセンス: Link先を確認	Zhikang Wang, Yue Bi, Tong Pan, Chris Bain, Richard Bassed, Seiya Imoto, Jianhua Yao, Jiangning Song	(参考訳) マルチ・インスタンス・ラーニング(MIL)は、診断病理のためのスライド画像全体(WSI)を分類する強力な手法である。 WSI分類におけるMILの根本的な課題は、バッグラベルをトリガーするtextit{ critical instance}を見つけることである。しかし、以前の方法は主に独立かつ同一の分布仮説(\textit{i.i.d})に基づいて設計され、例間の相関や腫瘍の不均一性は無視される。本稿では,上記の課題に取り組むために,新しいマルチプレックス検出型マルチインスタンス学習(mdmil)を提案する。具体的には、MDMILは、内部クエリ生成モジュール(IQGM)と多重検出モジュール(MDM)によって構成され、トレーニング中にメモリベースのコントラスト損失を補助する。まず、IQGMは、分布解析後の信頼性の高い特徴を集約することにより、インスタンスの確率を与え、その後のMDMの内部クエリ(IQ)を生成する。次に、mdmにおける多重検出クロスアテンション(mdca)と多頭自己アテンション(mhsa)が協調してwsiの最終表現を生成する。このプロセスでは、IQおよびトレーニング可能な変動クエリ(VQ)がインスタンス間の接続を構築し、不均一な腫瘍に対するモデルの堅牢性を大幅に向上する。最後に、機能空間の制約をさらに強制し、トレーニングプロセスを安定化するために、各イテレーションで1つのサンプルが入力されてもwsi分類に実行可能なメモリベースのコントラスト損失を採用する。我々はCAMELYON16, TCGA-NSCLC, TCGA-RCCの3つの計算病理データセットについて実験を行った。 MDMILの精度とAUCは,他の最先端手法よりも優れていることを示す。 Multiple instance learning (MIL) is a powerful approach to classify whole slide images (WSIs) for diagnostic pathology. A fundamental challenge of MIL on WSI classification is to discover the \textit{critical instances} that trigger the bag label. However, previous methods are primarily designed under the independent and identical distribution hypothesis (\textit{i.i.d}), ignoring either the correlations between instances or heterogeneity of tumours. In this paper, we propose a novel multiplex-detection-based multiple instance learning (MDMIL) to tackle the issues above. Specifically, MDMIL is constructed by the internal query generation module (IQGM) and the multiplex detection module (MDM) and assisted by the memory-based contrastive loss during training. Firstly, IQGM gives the probability of instances and generates the internal query (IQ) for the subsequent MDM by aggregating highly reliable features after the distribution analysis. 
Secondly, the multiplex-detection cross-attention (MDCA) and multi-head self-attention (MHSA) in MDM cooperate to generate the final representations for the WSI. In this process, the IQ and trainable variational query (VQ) successfully build up the connections between instances and significantly improve the model's robustness toward heterogeneous tumours. Finally, to further enforce constraints in the feature space and stabilize the training process, we adopt a memory-based contrastive loss, which is practicable for WSI classification even with a single sample as input in each iteration. We conduct experiments on three computational pathology datasets: CAMELYON16, TCGA-NSCLC, and TCGA-RCC. The higher accuracy and AUC demonstrate the superiority of our proposed MDMIL over other state-of-the-art methods.	翻訳日:2022-08-09 13:13:06 公開日:2022-08-06
# カルマンフィルタを用いた短時間交通流予測 Short Duration Traffic Flow Prediction Using Kalman Filtering ( http://arxiv.org/abs/2208.03415v1 ) ライセンス: Link先を確認	Khondhaker Al Momin, Saurav Barua, Md. Shahreer Jamil, Omar Faruqe Hamim	(参考訳) 計算フィルタリング手法であるkalman filter technique (kft) を用いて,短時間交通流量の予測について検討した。短期交通予測は交通管理と交通システムの運用において重要なツールである。道路案内と高度トラベラー情報システムによる移動時間推定には, 短期交通流量値結果を用いることができる。 kftは均質なトラフィックでテストされているが、その効率性はまだ調査されていない。この調査は、ソバンバグ・モスクに近いダッカのミルプル・ロードで行われた。ストリームには不均一なトラフィックの混合が含まれており、予測の不確実性が示唆される。提案されたメソッドはPythonでpykalmanライブラリを使って実行される。ライブラリは主に、不確実性に対処するKFTフレームワークの高度なデータベースモデリングに使用される。データは、車両の3時間の交通量から導かれた。 2005年にバングラデシュのroads and highways division(rhd)が発行したgemetry design standards manualによると、不均一な交通フローの値は同等の旅客車単位(pcu)に変換された。 5分間のアグリゲーションから得られたPCUを提案モデルのデータセットとして利用した。提案されたモデルの平均絶対パーセンテージ誤差(MAPE)は14.62であり、KFTモデルは合理的に予測できることを示している。根平均二乗誤差(RMSPE)は18.73%の精度を示し、25%未満である。開発されたモデルはR2値0.879であり、データセットの変数の87.9%を説明できることを示している。データがより長期にわたって収集された場合、R2値は1.0に近い可能性がある。 The research examined predicting short-duration traffic flow counts with the Kalman filtering technique (KFT), a computational filtering method. Short-term traffic prediction is an important tool for operation in traffic management and transportation system. The short-term traffic flow value results can be used for travel time estimation by route guidance and advanced traveler information systems. Though the KFT has been tested for homogeneous traffic, its efficiency in heterogeneous traffic has yet to be investigated. The research was conducted on Mirpur Road in Dhaka, near the Sobhanbagh Mosque. The stream contains a heterogeneous mix of traffic, which implies uncertainty in prediction. The propositioned method is executed in Python using the pykalman library. The library is mostly used in advanced database modeling in the KFT framework, which addresses uncertainty. The data was derived from a three-hour traffic count of the vehicle. 
According to the Geometric Design Standards Manual published by Roads and Highways Division (RHD), Bangladesh in 2005, the heterogeneous traffic flow value was translated into an equivalent passenger car unit (PCU). The PCU obtained from five-minute aggregation was then utilized as the proposed model's dataset. The proposed model has a mean absolute percent error (MAPE) of 14.62%, indicating that the KFT model can forecast reasonably well. The root mean square percent error (RMSPE) is 18.73%, which is less than 25%; hence the model is acceptable. The developed model has an R² value of 0.879, indicating that it can explain 87.9 percent of the variability in the dataset. If the data were collected over a more extended period of time, the R² value could be closer to 1.0.	翻訳日:2022-08-09 13:02:59 公開日:2022-08-06
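The abstract above reports MAPE, RMSPE and R² for one-step traffic forecasts from a Kalman filter. The following is a minimal sketch of how those three metrics relate to the forecasts, assuming a scalar random-walk Kalman filter written in plain Python rather than the pykalman library the authors used. The PCU counts, the noise variances, and all function names are made up for illustration and are not the paper's data or model.

```python
import math

def kalman_1d(observations, process_var=1.0, obs_var=4.0):
    """One-step-ahead forecasts from a scalar random-walk Kalman filter."""
    x, p = observations[0], 1.0      # initial state estimate and its variance
    preds = []
    for z in observations[1:]:
        p = p + process_var          # predict: state carries over, variance grows
        preds.append(x)              # forecast for this interval, made before seeing z
        k = p / (p + obs_var)        # Kalman gain
        x = x + k * (z - x)          # update the estimate with the new count
        p = (1.0 - k) * p            # update the variance
    return preds

def mape(actual, pred):
    """Mean absolute percent error, in percent."""
    return 100.0 * sum(abs((a - f) / a) for a, f in zip(actual, pred)) / len(actual)

def rmspe(actual, pred):
    """Root mean square percent error, in percent."""
    return 100.0 * math.sqrt(
        sum(((a - f) / a) ** 2 for a, f in zip(actual, pred)) / len(actual)
    )

def r_squared(actual, pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(actual) / len(actual)
    ss_res = sum((a - f) ** 2 for a, f in zip(actual, pred))
    ss_tot = sum((a - mean) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot

# Made-up five-minute PCU counts, for illustration only.
pcu = [210.0, 225.0, 240.0, 232.0, 250.0, 265.0, 258.0, 270.0]
forecast = kalman_1d(pcu)
actual = pcu[1:]
print(mape(actual, forecast), rmspe(actual, forecast), r_squared(actual, forecast))
```

Because RMSPE squares each percentage error before averaging, it is never smaller than MAPE on the same forecasts, which is consistent with the reported 18.73% versus 14.62%.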
# DeepGen: 異種検索広告生成とリアルタイムカスタマイズ DeepGen: Diverse Search Ad Generation and Real-Time Customization ( http://arxiv.org/abs/2208.03438v1 ) ライセンス: Link先を確認	Konstantin Golobokov, Junyi Chai, Victor Ye Dong, Mandy Gu, Bingyu Chi, Jie Cao, Yulan Yan, Yi Liu	(参考訳) 我々は、BingAdsの顧客向けにスポンサー付き検索広告(ads)を自動的に作成するWebスケールのシステムであるDeepGenを紹介する。我々は、最先端の自然言語生成(NLG)モデルを利用して、広告主のWebページから流動的な広告を抽象的に生成し、事実性や推論速度などの実用的な問題を解決する。さらに,ユーザの検索クエリに応答してカスタマイズされた広告をリアルタイムに生成し,ユーザが求めているものに基づいて,同製品のさまざまな側面を強調表示する。これを実現するために,本システムでは,先行してより小さな広告を多種多様な選択で選択し,クエリ時に最も関連性の高い広告を完全広告に縫い付ける。我々は、制御可能なNLGモデルをトレーニングし、異なる販売ポイントをハイライトする同じWebページの複数の広告を生成することにより、生成の多様性を向上させる。システム設計は、まず異なる目的で訓練された生成モデルのアンサンブルを実行し、次に多様性サンプリングアルゴリズムを用いて、オンライン選択のための生成結果の多様なサブセットを選択することにより、水平方向に多様性を向上する。実験の結果,提案するシステム設計の有効性が示された。当社のシステムは、現在本番環境で運用されており、bingで提供されているグローバル広告の${\sim}4\%を提供しています。 We present DeepGen, a system deployed at web scale for automatically creating sponsored search advertisements (ads) for BingAds customers. We leverage state-of-the-art natural language generation (NLG) models to generate fluent ads from advertiser's web pages in an abstractive fashion and solve practical issues such as factuality and inference speed. In addition, our system creates a customized ad in real-time in response to the user's search query, therefore highlighting different aspects of the same product based on what the user is looking for. To achieve this, our system generates a diverse choice of smaller pieces of the ad ahead of time and, at query time, selects the most relevant ones to be stitched into a complete ad. We improve generation diversity by training a controllable NLG model to generate multiple ads for the same web page highlighting different selling points. Our system design further improves diversity horizontally by first running an ensemble of generation models trained with different objectives and then using a diversity sampling algorithm to pick a diverse subset of generation results for online selection. 
Experimental results show the effectiveness of our proposed system design. Our system is currently deployed in production, serving ${\sim}4\%$ of global ads served in Bing.	翻訳日:2022-08-09 13:02:36 公開日:2022-08-06
# Follow Me: ターゲット駆動型レコメンデーション対話システムのための会話計画 Follow Me: Conversation Planning for Target-driven Recommendation Dialogue Systems ( http://arxiv.org/abs/2208.03516v1 ) ライセンス: Link先を確認	Jian Wang, Dongding Lin, Wenjie Li	(参考訳) 推薦対話システムは,ユーザとソーシャル・ボンドを構築し,高品質なレコメンデーションを提供することを目的としている。本稿では,目標駆動型レコメンデーション対話システムと呼ばれる有望なパラダイムに向けて前進する。会話を通じて、ユーザが指定されたターゲットを受け入れるように自然に導く方法に重点を置いています。そこで本研究では,対話行動と話題のシーケンスを計画し,異なる会話ステージ間を積極的に移動させる目標駆動型会話計画(tcp)フレームワークを提案する。次に、TCPに予定内容を適用して対話生成をガイドする。実験の結果,対話計画が目標主導型レコメンデーション対話システムの性能を大幅に向上させることがわかった。 Recommendation dialogue systems aim to build social bonds with users and provide high-quality recommendations. This paper pushes forward towards a promising paradigm called target-driven recommendation dialogue systems, which is highly desired yet under-explored. We focus on how to naturally lead users to accept the designated targets gradually through conversations. To this end, we propose a Target-driven Conversation Planning (TCP) framework to plan a sequence of dialogue actions and topics, driving the system to transit between different conversation stages proactively. We then apply our TCP with planned content to guide dialogue generation. Experimental results show that our conversation planning significantly improves the performance of target-driven recommendation dialogue systems.	翻訳日:2022-08-09 13:02:13 公開日:2022-08-06
# 解剖学的追跡データを用いた繊維束検出のための時間的アンサンブルを用いた制約付き自己監督法 Constrained self-supervised method with temporal ensembling for fiber bundle detection on anatomic tracing data ( http://arxiv.org/abs/2208.03569v1 ) ライセンス: Link先を確認	Vaanathi Sundaresan, Julia F. Lehman, Sean Fitzgibbon, Saad Jbabdi, Suzanne N. Haber, Anastasia Yendiki	(参考訳) anatomic tracing dataは、拡散mriでよく見られるエラーに対処するのに必要な脳回路に関する詳細な情報を提供する。しかし, 追跡データ上での繊維束の自動検出は, 歪み, ノイズ, アーティファクトの存在, 強度・コントラストの変動などにより困難である。本研究では,マカク脳のトレーサ部における繊維束の正確なセグメンテーションを考慮した,解剖学的制約を考慮した自己教師付き損失関数を用いた深層学習法を提案する。また,手動ラベルの可用性が限られているため,ラベルなしデータを効率よく使用して性能を向上させるための半教師付きトレーニング手法と,偽陽性のさらなる低減のための位置制約を用いる。異なるマカクの未確認区間における本手法の評価は, 真正率~0.90の有望な結果をもたらす。このメソッドのコードはhttps://github.com/v-sundaresan/fiberbundle_seg_tracingで入手できる。 Anatomic tracing data provides detailed information on brain circuitry essential for addressing some of the common errors in diffusion MRI tractography. However, automated detection of fiber bundles on tracing data is challenging due to sectioning distortions, presence of noise and artifacts and intensity/contrast variations. In this work, we propose a deep learning method with a self-supervised loss function that takes anatomy-based constraints into account for accurate segmentation of fiber bundles on the tracer sections from macaque brains. Also, given the limited availability of manual labels, we use a semi-supervised training technique for efficiently using unlabeled data to improve the performance, and location constraints for further reduction of false positives. Evaluation of our method on unseen sections from a different macaque yields promising results with a true positive rate of ~0.90. The code for our method is available at https://github.com/v-sundaresan/fiberbundle_seg_tracing.	翻訳日:2022-08-09 12:58:11 公開日:2022-08-06
# 手続き型犯罪ドラマシリーズCSIの記憶可能性の分析 Analysing the Memorability of a Procedural Crime-Drama TV Series, CSI ( http://arxiv.org/abs/2208.03479v1 ) ライセンス: Link先を確認	Sean Cummins and Lorin Sweeney and Alan F. Smeaton	(参考訳) 我々は,映像の暗記性を予測するタスクを微調整した視覚変換器を用いて,人気テレビシリーズCSIの5シーズンスパンの記憶可能性について検討した。ビデオの暗記性スコアを付加した詳細な注釈付きコーパスを用いて、一般的な犯罪ドラマテレビのジャンルを調査することにより、映像の暗記性スコアから意味を抽出する方法を示す。映像の記憶可能性と番組の様々な側面を関連付けるための定量的分析を行う。本稿では,教育,マーケティング,インデクシングといった分野のマルチメディアを利用したアプリケーションにおいて,テレビや映画製作などにおいて,映像の記憶可能性の重要性について考察する。 We investigate the memorability of a 5-season span of a popular crime-drama TV series, CSI, through the application of a vision transformer fine-tuned on the task of predicting video memorability. By investigating the popular genre of crime-drama TV through the use of a detailed annotated corpus combined with video memorability scores, we show how to extrapolate meaning from the memorability scores generated on video shots. We perform a quantitative analysis to relate video shot memorability to a variety of aspects of the show. The insights we present in this paper illustrate the importance of video memorability in applications which use multimedia in areas like education, marketing, indexing, as well as in the case here namely TV and film production.	翻訳日:2022-08-09 12:44:59 公開日:2022-08-06
# 臨床関連二次特徴を用いた膵腫瘍検出の改善 Improved Pancreatic Tumor Detection by Utilizing Clinically-Relevant Secondary Features ( http://arxiv.org/abs/2208.03581v1 ) ライセンス: Link先を確認	Christiaan G.A. Viviers and Mark Ramaekers and Peter H.N. de With and Dimitrios Mavroeidis and Joost Nederend and Misha Luyer and Fons van der Sommen	(参考訳) 膵癌は、がん関連死亡の世界的な原因の1つである。コンピュータ支援診断・診断法(CAD)におけるDeep Learningの成功にもかかわらず,膵癌検出にはほとんど注意が払われていない。本報告では, 膵腫瘍の診断法として, 周囲解剖学的特徴を活用し, 放射線科医の知識を他の従来の深層学習法と比較して有効活用するための方法を提案する。この目的のために膵管腺癌99例と膵腫瘍を伴わない97例からなる新しいデータセットを収集した。膵癌の増殖パターンのため、腫瘍は常に低感覚病変として見られるわけではないため、専門家は腫瘍の存在を示す可能性のある二次的な外部特徴の視認性について言及している。本稿では, 膵管, 総胆管, 膵臓の2次的特徴を利用するU-NetライクなDeep CNNとCTスキャンを併用した手法を提案する。これらの特徴を用いて、モデルが膵腫瘍の存在を判断する。この分類と局所化手法のセグメンテーションは、99%の感度(1件欠落)と99%の特異性を達成し、従来の最先端法に比べて5%の感度向上を実現する。さらに、従来のPDAC検出法と比較して、適切な精度と推論時間の短い位置情報を提供する。これらの結果は,新しいCAD手法の開発において,臨床専門家の知識を取り入れることの重要性を強調した。 Pancreatic cancer is one of the global leading causes of cancer-related deaths. Despite the success of Deep Learning in computer-aided diagnosis and detection (CAD) methods, little attention has been paid to the detection of Pancreatic Cancer. We propose a method for detecting pancreatic tumor that utilizes clinically-relevant features in the surrounding anatomical structures, thereby better aiming to exploit the radiologist's knowledge compared to other, conventional deep learning approaches. To this end, we collect a new dataset consisting of 99 cases with pancreatic ductal adenocarcinoma (PDAC) and 97 control cases without any pancreatic tumor. Due to the growth pattern of pancreatic cancer, the tumor may not be always visible as a hypodense lesion, therefore experts refer to the visibility of secondary external features that may indicate the presence of the tumor. We propose a method based on a U-Net-like Deep CNN that exploits the following external secondary features: the pancreatic duct, common bile duct and the pancreas, along with a processed CT scan. 
Using these features, the model segments the pancreatic tumor if it is present. This segmentation for classification and localization approach achieves a performance of 99% sensitivity (one case missed) and 99% specificity, which realizes a 5% increase in sensitivity over the previous state-of-the-art method. The model additionally provides location information with reasonable accuracy and a shorter inference time compared to previous PDAC detection methods. These results offer a significant performance improvement and highlight the importance of incorporating the knowledge of the clinical expert when developing novel CAD methods.	翻訳日:2022-08-09 12:39:49 公開日:2022-08-06
# 部分観測可能な環境における再帰的ネットワーク、隠れ状態、信念 Recurrent networks, hidden states and beliefs in partially observable environments ( http://arxiv.org/abs/2208.03520v1 ) ライセンス: Link先を確認	Gaspard Lambrechts, Adrien Bolland, Damien Ernst	(参考訳) 強化学習は、動的に未知な環境との相互作用から最適方針を学ぶことを目的としている。多くの手法は値関数の近似に頼り、ほぼ最適ポリシーを導出する。部分的に観測可能な環境では、これらの関数は履歴と呼ばれる観測と過去の行動の完全な順序に依存する。本研究では,そのような値関数を近似するために訓練されたリカレントニューラルネットワークが,その歴史が与えられた状態の後方確率分布を内部的にフィルタすることを示す。より正確には、リカレントニューラルネットワークがQ-関数を学習するにつれて、その隠れた状態が、最適制御に関連する状態変数の信念とますます相関していることが示される。この相関は相互情報によって測定される。さらに,エージェントの期待リターンは,その隠れた状態と信念の間の高い相互情報に達するために,その再帰的なアーキテクチャの能力によって増加することを示した。最後に,隠蔽状態と最適制御に無関係な変数の信念との相互情報を学習過程を通じて減少させることを示す。要約すると、その隠れた状態において、部分的に観測可能な環境のq関数を近似する再帰的ニューラルネットワークは、最適な行動を取るための信念の関連部分と関連付けられた履歴から十分な統計を再現する。 Reinforcement learning aims to learn optimal policies from interaction with environments whose dynamics are unknown. Many methods rely on the approximation of a value function to derive near-optimal policies. In partially observable environments, these functions depend on the complete sequence of observations and past actions, called the history. In this work, we show empirically that recurrent neural networks trained to approximate such value functions internally filter the posterior probability distribution of the current state given the history, called the belief. More precisely, we show that, as a recurrent neural network learns the Q-function, its hidden states become more and more correlated with the beliefs of state variables that are relevant to optimal control. This correlation is measured through their mutual information. In addition, we show that the expected return of an agent increases with the ability of its recurrent architecture to reach a high mutual information between its hidden states and the beliefs. Finally, we show that the mutual information between the hidden states and the beliefs of variables that are irrelevant for optimal control decreases through the learning process. 
In summary, this work shows that in its hidden states, a recurrent neural network approximating the Q-function of a partially observable environment reproduces a sufficient statistic from the history that is correlated to the relevant part of the belief for taking optimal actions.	翻訳日:2022-08-09 12:32:51 公開日:2022-08-06
# 複数物体追跡のための変圧器に基づく割当決定ネットワーク Transformer-based assignment decision network for multiple object tracking ( http://arxiv.org/abs/2208.03571v1 ) ライセンス: Link先を確認	Athena Psalta, Vasileios Tsironis and Konstantinos Karantzalos	(参考訳) データアソシエーションは、複数のオブジェクト追跡(MOT)メソッドにおいて、トラッキング・バイ・検出のパラダイムに従う重要なコンポーネントである。データアソシエーションプロセスを用いて、各時間ステップ毎に検出と既存のターゲット間の割り当てを確立するようにした完全な軌跡を生成する。近年のデータアソシエーション手法は,多次元線形代入タスクやネットワークフローの最小化問題を解くか,あるいは複数の仮説追跡によって解決しようとする。しかし、推論中に最適な割り当てを計算する最適化ステップは、任意の解に計算の複雑さを付加する全てのシーケンスフレームに必要である。この目的のために,本研究の文脈では,推論中に明示的な最適化を必要とせず,データアソシエーションに取り組むトランスフォーマティブベースの割当決定ネットワーク(tadn)を導入する。特に、TADNは、ネットワークの単一のフォワードパスにおいて、検出とアクティブターゲット間の割り当てペアを直接推論することができる。我々は、TADNをかなり単純なMOTフレームワークに統合し、効率的なエンドツーエンドトレーニングのための新しいトレーニング戦略を設計し、MOT17とUA-DETRACの2つの人気のあるベンチマーク上で、オンラインビジュアルトラッキング・バイ・検出MOTに対する我々のアプローチの可能性を示した。提案手法は,咬合処理や再同定といった重要な補助成分を欠くトラッカーとしての性質にもかかわらず,ほとんどの評価指標において最先端を上回っている。このメソッドの実装はhttps://github.com/psaltaath/tadn-motで公開されている。 Data association is a crucial component for any multiple object tracking (MOT) method that follows the tracking-by-detection paradigm. To generate complete trajectories such methods employ a data association process to establish assignments between detections and existing targets during each timestep. Recent data association approaches try to solve a multi-dimensional linear assignment task or a network flow minimization problem or either tackle it via multiple hypotheses tracking. However, during inference an optimization step that computes optimal assignments is required for every sequence frame adding significant computational complexity in any given solution. To this end, in the context of this work we introduce Transformer-based Assignment Decision Network (TADN) that tackles data association without the need of any explicit optimization during inference. In particular, TADN can directly infer assignment pairs between detections and active targets in a single forward pass of the network. 
We integrate TADN into a rather simple MOT framework, design a novel training strategy for efficient end-to-end training, and demonstrate the high potential of our approach for online visual tracking-by-detection MOT on two popular benchmarks, i.e. MOT17 and UA-DETRAC. Our proposed approach outperforms the state of the art in most evaluation metrics despite its simple nature as a tracker that lacks significant auxiliary components such as occlusion handling or re-identification. The implementation of our method is publicly available at https://github.com/psaltaath/tadn-mot.	翻訳日:2022-08-09 12:29:38 公開日:2022-08-06
# 胸部X線からの肺炎検出のための適応的PSOに基づく深部特徴抽出法 An Adaptive and Altruistic PSO-based Deep Feature Selection Method for Pneumonia Detection from Chest X-Rays ( http://arxiv.org/abs/2208.03558v1 ) ライセンス: Link先を確認	Rishav Pramanik, Sourodip Sarkar, Ram Sarkar	(参考訳) 肺炎は、特に世界の所得不足地域での小児死亡の主な原因の1つである。非常に高度な機器や医薬品で検出・治療できるが、発展途上国では依然として肺炎の検出が主要な関心事である。コンピュータ支援型診断システム(CAD)は,プロの医療専門家よりも手術コストが低いため,そのような国で利用することができる。本稿では,深層学習の概念とメタヒューリスティックアルゴリズムを用いて,胸部X線からの肺炎検出のためのCADシステムを提案する。まず,ターゲット肺炎データセット上で微調整されたresnet50から深い特徴を抽出する。そこで,我々は,メモリに基づく適応パラメータを用いて修正を行い,エージェントに利他的動作を組み込むことにより,機能選択を行うpso( particle swarm optimization)に基づく特徴選択手法を提案する。我々は特徴選択法を適応的・利他的PSO (AAPSO) と命名した。提案手法はresnet50モデルから得られた非形成的特徴を除去し,全体の肺炎検出能力を向上した。肺炎検出のための他のいくつかのフレームワークよりも, 広く利用可能な肺炎データセットの広範な実験と徹底的な解析により, 提案手法の優越性が確立された。肺炎の検出とは別に、AAPSOはいくつかの標準UCIデータセット、がん予測のための遺伝子発現データセット、COVID-19予測データセットでさらに評価されている。その結果,AAPSOが現実の様々な問題に対処する上で有用であることが確認された。この作業のソースコードはhttps://github.com/rishavpramanik/AAPSOで確認できる。 Pneumonia is one of the major reasons for child mortality, especially in income-deprived regions of the world. Although it can be detected and treated with far less sophisticated instruments and medication, Pneumonia detection still remains a major concern in developing countries. Computer-aided diagnosis (CAD) systems can be used in such countries due to their lower operating costs than professional medical experts. In this paper, we propose a CAD system for Pneumonia detection from Chest X-rays, using the concepts of deep learning and a meta-heuristic algorithm. We first extract deep features from the pre-trained ResNet50, fine-tuned on a target Pneumonia dataset. Then, we propose a feature selection technique based on particle swarm optimization (PSO), which is modified using a memory-based adaptation parameter, and enriched by incorporating an altruistic behavior into the agents. We name our feature selection method adaptive and altruistic PSO (AAPSO). 
The proposed method successfully eliminates non-informative features obtained from the ResNet50 model, thereby improving the Pneumonia detection ability of the overall framework. Extensive experimentation and thorough analysis on a publicly available Pneumonia dataset establish the superiority of the proposed method over several other frameworks used for Pneumonia detection. Apart from Pneumonia detection, AAPSO is further evaluated on some standard UCI datasets, gene expression datasets for cancer prediction and a COVID-19 prediction dataset. The overall results are satisfactory, thereby confirming the usefulness of AAPSO in dealing with varied real-life problems. The supporting source code for this work can be found at https://github.com/rishavpramanik/AAPSO.	翻訳日:2022-08-09 12:22:46 公開日:2022-08-06
# 災害後被害分類のための多視点深層学習 Multi-view deep learning for reliable post-disaster damage classification ( http://arxiv.org/abs/2208.03419v1 ) ライセンス: Link先を確認	Asim Bashir Khajwal, Chih-Shen Cheng, Arash Noshadravan	(参考訳) 本研究は,人工知能(AI)と多視点画像を用いた,より信頼性の高い建築損傷分類を実現することを目的とする。災害後の被害評価にAIを採用するための現在の実践と研究の取り組みは一般的に行われている (a)定性的で、基準的被害規模に基づく建物被害レベルの厳格な分類を欠いたもの (b) 限られた視界を持つ航空画像や衛星画像に基づいて訓練されているが, 損傷の規模を完全に説明できない。本研究は,より高精度で信頼性の高い被害度自動定量化を実現するため,建物の多面的および空中的視点による総合的な視覚データの利用を提案する。このような空間的損傷予測モデルを実現するために、損傷した建物の異なる視点からの情報を結合するマルチビュー畳み込みニューラルネットワーク(mv-cnn)アーキテクチャが使用される。この空間的3dコンテキスト損傷情報は、損傷のより正確な識別と、損傷レベルの信頼できる定量化をもたらす。提案モデルでは, ハリケーン・ハーヴェイに続き, 調査対象の建物について, 専門家ラベル付きジオタグ付き画像を含む偵察視覚データセットを訓練し, 検証した。開発したモデルでは,災害レベルの予測に適度な精度を示し,よりインフォームドで信頼性の高い災害管理を支援する。 This study aims to enable more reliable automated post-disaster building damage classification using artificial intelligence (AI) and multi-view imagery. The current practices and research efforts in adopting AI for post-disaster damage assessment are generally (a) qualitative, lacking refined classification of building damage levels based on standard damage scales, and (b) trained based on aerial or satellite imagery with limited views, which, although indicative, are not completely descriptive of the damage scale. To enable more accurate and reliable automated quantification of damage levels, the present study proposes the use of more comprehensive visual data in the form of multiple ground and aerial views of the buildings. To have such a spatially-aware damage prediction model, a Multi-view Convolution Neural Network (MV-CNN) architecture is used that combines the information from different views of a damaged building. This spatial 3D context damage information will result in more accurate identification of damages and reliable quantification of damage levels. The proposed model is trained and validated on reconnaissance visual dataset containing expert-labeled, geotagged images of the inspected buildings following hurricane Harvey. 
The developed model demonstrates reasonably good accuracy in predicting the damage levels and can be used to support more informed and reliable AI-assisted disaster management practices.	翻訳日:2022-08-09 12:21:33 公開日:2022-08-06
# IVT:3D Pose Estimationのためのエンド・ツー・エンドのインスタンス誘導型ビデオトランス IVT: An End-to-End Instance-guided Video Transformer for 3D Pose Estimation ( http://arxiv.org/abs/2208.03431v1 ) ライセンス: Link先を確認	Zhongwei Qiu, Qiansheng Yang, Jian Wang, Dongmei Fu	(参考訳) video 3d human pose estimationは、ビデオから人間の関節の3d座標をローカライズすることを目的としている。近年の変圧器を用いた手法では、2次元ポーズ推定のステップで視覚深度特徴が失われるため、文脈深度特徴を効果的にモデル化できない2次元ポーズからの時空間情報の取り込みに焦点が当てられている。本稿では,このパラダイムを,視覚的特徴から時空間深度情報を効果的に学習し,映像フレームから直接3Dポーズを予測する,エンドツーエンドのフレームワークであるインスタンス誘導ビデオ変換器(IVT)に単純化する。特に、まず、ビデオフレームを一連のインスタンス誘導トークンとして定式化し、各トークンが人間のインスタンスの3dポーズを予測する役割を担います。これらのトークンは、人中心から人体関節への関節オフセットの誘導によって抽出されるため、体構造情報を含む。そして、これらのトークンをIVTに送信し、時空間深度を学習する。また,複数人間の変動尺度を扱うために,クロススケールのインスタンス誘導型注意機構を提案する。最後に、各人物の3Dポーズを座標回帰によりインスタンス誘導トークンから復号する。 3つの広く使われている3次元ポーズ推定ベンチマークの実験により、提案したIVTが最先端の性能を達成することが示された。 Video 3D human pose estimation aims to localize the 3D coordinates of human joints from videos. Recent transformer-based approaches focus on capturing the spatiotemporal information from sequential 2D poses, which cannot model the contextual depth feature effectively since the visual depth features are lost in the step of 2D pose estimation. In this paper, we simplify the paradigm into an end-to-end framework, Instance-guided Video Transformer (IVT), which enables learning spatiotemporal contextual depth information from visual features effectively and predicts 3D poses directly from video frames. In particular, we firstly formulate video frames as a series of instance-guided tokens and each token is in charge of predicting the 3D pose of a human instance. These tokens contain body structure information since they are extracted by the guidance of joint offsets from the human center to the corresponding body joints. Then, these tokens are sent into IVT for learning spatiotemporal contextual depth. In addition, we propose a cross-scale instance-guided attention mechanism to handle the variational scales among multiple persons. 
Finally, the 3D poses of each person are decoded from instance-guided tokens by coordinate regression. Experiments on three widely-used 3D pose estimation benchmarks show that the proposed IVT achieves state-of-the-art performance.	翻訳日:2022-08-09 12:21:11 公開日:2022-08-06
# 乾燥領域分割におけるデータ拡張の効果の検討 Exploring the Effects of Data Augmentation for Drivable Area Segmentation ( http://arxiv.org/abs/2208.03437v1 ) ライセンス: Link先を確認	Srinjoy Bhuiya, Ayushman Kumar, Sankalok Sen	(参考訳) ドライビング可能な地域のリアルタイムセグメンテーションは、自動車における自律的な認識を達成する上で重要な役割を果たす。近年,ディープラーニングを用いた画像分割モデルの開発が急速に進んでいる。しかしながら、ほとんどの進歩はモデルアーキテクチャ設計において行われてきた。セグメンテーションに関連する教師付きディープラーニング問題の解決において、モデルが構築するモデルの成功は、そのモデルに使用する入力トレーニングデータの量と品質に依存する。このデータは、セグメンテーションモデルのより優れた作業のために、よく注釈付けされた様々な画像を含むべきである。データセットのアノテーションに関連するこのような問題は、テストとバリデーションにおいて過大なタイプIとIIのエラーでモデルが終了する原因となり、現実世界の問題に対処しようとすると悪意のある問題を引き起こします。この問題に対処し、モデルをより正確でダイナミックで堅牢にするために、サンプルトレーニングデータを拡張し、全体としてより良く、より多様なものにするために、データ拡張が使われます。そこで本研究では,既存の画像データセットを分析し,それに応じて拡張を行うことで,データ拡張のメリットを検討することに焦点を当てる。以上の結果から,既存技術(SOTA)モデルの性能と堅牢性は,モデル複雑性や推論時間の増加を伴わずに劇的に向上できることが示された。本論文では,他の拡張手法と戦略の徹底的な研究と,現在広く利用されているそれに対応する効果についてのみ検討した。結果はすべて、広く使われているCityscapes Datasetで報告されています。 The real-time segmentation of drivable areas plays a vital role in accomplishing autonomous perception in cars. Recently there have been some rapid strides in the development of image segmentation models using deep learning. However, most of the advancements have been made in model architecture design. In solving any supervised deep learning problem related to segmentation, the success of the model that one builds depends upon the amount and quality of input training data we use for that model. This data should contain well-annotated varied images for better working of the segmentation model. Issues like this pertaining to annotations in a dataset can lead the model to conclude with overwhelming Type I and II errors in testing and validation, causing malicious issues when trying to tackle real world problems. To address this problem and to make our model more accurate, dynamic, and robust, data augmentation comes into usage as it helps in expanding our sample training data and making it better and more diversified overall. 
Hence, in our study, we focus on investigating the benefits of data augmentation by analyzing pre-existing image datasets and performing augmentations accordingly. Our results show that the performance and robustness of existing state-of-the-art (SOTA) models can be increased dramatically without any increase in model complexity or inference time. The augmentations used in this paper were chosen only after thorough research into several other augmentation methodologies and strategies in widespread use today and their corresponding effects. All our results are reported on the widely used Cityscapes dataset.	翻訳日:2022-08-09 12:20:51 公開日:2022-08-06
# haloae: 異常検出と局在化のためのhalonetベースの局所変圧器オートエンコーダ HaloAE: An HaloNet based Local Transformer Auto-Encoder for Anomaly Detection and Localization ( http://arxiv.org/abs/2208.03486v1 ) ライセンス: Link先を確認	E. Mathian, H. Liu, L. Fernandez-Cuesta, D. Samaras, M. Foll, L. Chen	(参考訳) 非教師付き異常検出と局所化は、あらゆる可能な異常を収集・ラベル付けすることは不可能であるため、重要な課題である。多くの研究は、異常の正確なセグメンテーションを達成するために、ローカル情報とグローバル情報を統合することの重要性を強調している。このため、長距離コンテンツインタラクションのモデリングを可能にするtransformerへの関心が高まっている。しかし、自己注意によるグローバルな相互作用は、ほとんどの画像スケールでは一般的に高価すぎる。本研究では,HaloNetを用いたTransformerのローカル2次元バージョンに基づく最初の自動エンコーダであるHaloAEを紹介する。 haloaeでは,畳み込みと局所的な2次元ブロックワイズセルフアテンション層を結合し,単一モデルによる異常検出とセグメント化を共同で行うハイブリッドモデルを構築した。我々はMVTecデータセットの競合的な結果を達成し、Transformerを組み込んだビジョンモデルが自己注意操作の局所的な計算の恩恵を受け、他のアプリケーションへの道を開くことを示唆した。 Unsupervised anomaly detection and localization is a crucial task as it is impossible to collect and label all possible anomalies. Many studies have emphasized the importance of integrating local and global information to achieve accurate segmentation of anomalies. To this end, there has been a growing interest in Transformer, which allows modeling long-range content interactions. However, global interactions through self attention are generally too expensive for most image scales. In this study, we introduce HaloAE, the first auto-encoder based on a local 2D version of Transformer with HaloNet. With HaloAE, we have created a hybrid model that combines convolution and local 2D block-wise self-attention layers and jointly performs anomaly detection and segmentation through a single model. We achieved competitive results on the MVTec dataset, suggesting that vision models incorporating Transformer could benefit from a local computation of the self-attention operation, and pave the way for other applications.	翻訳日:2022-08-09 12:20:28 公開日:2022-08-06
# DeepFakeビデオにおける行動シグネチャの検出に関する研究 Study of detecting behavioral signatures within DeepFake videos ( http://arxiv.org/abs/2208.03561v1 ) ライセンス: Link先を確認	Qiaomu Miao, Sinhwa Kang, Stacy Marsella, Steve DiPaola, Chao Wang, Ari Shapiro	(参考訳) 娯楽、コミュニケーション、トレーニング、広告など様々な目的のために話している人々の合成ビデオ画像の生成には強い関心がある。ディープフェイク生成モデルの開発により、合成ビデオ画像は、自然に捉えたビデオから肉眼で見分けがつかないようになる。さらに、多くの手法は、より慎重で法医学的な視覚的分析を避けるために改善を続けている。いくつかのディープフェイクビデオは、顔のパペットを使って作られ、俳優の動きを通じて合成画像の頭部と顔を直接制御し、俳優が他の俳優のイメージを「パペット」することができる。本稿では、話者の視覚的な外観を制御しつつ、行動信号を他の音源から転送することで、ある人の動きが元の話者と区別できるかどうかを問う。我々は合成画像を比較して研究を行う。 1)異なる発話をする別の人に由来する。 2)同じ人が別の発話をすることに由来する。 3)同じ発話をする別の人に由来する。本研究は,3症例すべてにおける合成ビデオは,元のソースビデオよりもリアルさに欠け,エンゲージメントも低いと見なされることを示している。以上の結果から,視覚的外見から分離した人物の動きから検出可能な行動シグネチャが存在する可能性が示唆され,この行動シグネチャは,撮影された映像と深い偽物とを区別するためにも用いられることが示唆された。 There is strong interest in the generation of synthetic video imagery of people talking for various purposes, including entertainment, communication, training, and advertisement. With the development of deep fake generation models, synthetic video imagery will soon be visually indistinguishable to the naked eye from a naturally captured video. In addition, many methods are continuing to improve to avoid more careful, forensic visual analysis. Some deep fake videos are produced through the use of facial puppetry, which directly controls the head and face of the synthetic image through the movements of the actor, allowing the actor to 'puppet' the image of another. In this paper, we address the question of whether one person's movements can be distinguished from the original speaker by controlling the visual appearance of the speaker but transferring the behavior signals from another source. We conduct a study by comparing synthetic imagery that: 1) originates from a different person speaking a different utterance, 2) originates from the same person speaking a different utterance, and 3) originates from a different person speaking the same utterance. 
Our study shows that synthetic videos in all three cases are seen as less real and less engaging than the original source video. Our results indicate that there could be a behavioral signature that is detectable from a person's movements that is separate from their visual appearance, and that this behavioral signature could be used to distinguish a deep fake from a properly captured video.	翻訳日:2022-08-09 12:20:09 公開日:2022-08-06
# 証明型学習における形式的(dis)頑健性に関する基礎的限界について On the Fundamental Limits of Formally (Dis)Proving Robustness in Proof-of-Learning ( http://arxiv.org/abs/2208.03567v1 ) ライセンス: Link先を確認	Congyu Fang, Hengrui Jia, Anvith Thudi, Mohammad Yaghini, Christopher A. Choquette-Choo, Natalie Dullerud, Varun Chandrasekaran, Nicolas Papernot	(参考訳) Proof-of-learning(PoL)は、モデル所有者が機械学習トレーニングチェックポイントを使用して、トレーニングに必要な計算を拡張した証拠を確立することを提案する。 PoLフォアゴ暗号手法の著者らは、確率勾配勾配や適応的変種に適用することで、ディープラーニングへのスケーラビリティの厳密なセキュリティ保証を行う。この公式な分析の欠如は、攻撃者が訓練していないモデルの証明を偽造できる可能性を残している。本稿では,PoLプロトコルが公式な (dis) 証明できない理由の形式解析に寄与する。そのため、PoLにおける証明検証の2つの役割を解消する。 a)証明が有効な勾配降下軌道であるか否かを効率的に決定し、 (b)修了後(即ちスプーフィング)に証明を製作するコストを高くすることで優先を確立すること。そこで本研究では,効率的な検証が正当な証明の受け入れと無効な証明の拒否のトレードオフをもたらすことを示す。このノイズがトレーニングに与える影響に関する正確な分析モデルがなければ、pol検証アルゴリズムが堅牢かどうかを正式に保証することはできない。また,PoLポストホックトレーニングをスプーフすることは,非凸学習において同一の終点を持つ異なる軌跡を見つけることに似ている。しかし、最終モデルの重みに関する事前知識がそのような軌道の発見に役立つかどうか、厳密には分かっていない。我々は、上記のオープン問題に対処するまで、形式的ロバスト性保証で新しいクラスのpolプロトコルを定式化するために、暗号に重きを置く必要があると結論づける。特に、これが優先事項の確立に役立ちます。分析から得られた知見の副産物として,PoLに対する2つの新たな攻撃を実証した。 Proof-of-learning (PoL) proposes a model owner use machine learning training checkpoints to establish a proof of having expended the necessary compute for training. The authors of PoL forego cryptographic approaches and trade rigorous security guarantees for scalability to deep learning by being applicable to stochastic gradient descent and adaptive variants. This lack of formal analysis leaves the possibility that an attacker may be able to spoof a proof for a model they did not train. We contribute a formal analysis of why the PoL protocol cannot be formally (dis)proven to be robust against spoofing adversaries. To do so, we disentangle the two roles of proof verification in PoL: (a) efficiently determining if a proof is a valid gradient descent trajectory, and (b) establishing precedence by making it more expensive to craft a proof after training completes (i.e., spoofing). 
We show that efficient verification results in a tradeoff between accepting legitimate proofs and rejecting invalid proofs because deep learning necessarily involves noise. Without a precise analytical model for how this noise affects training, we cannot formally guarantee whether a PoL verification algorithm is robust. Then, we demonstrate that establishing precedence robustly also reduces to an open problem in learning theory: spoofing a PoL post hoc training is akin to finding different trajectories with the same endpoint in non-convex learning. Yet, we do not rigorously know whether a priori knowledge of the final model weights helps discover such trajectories. We conclude that, until the aforementioned open problems are addressed, relying more heavily on cryptography is likely needed to formulate a new class of PoL protocols with formal robustness guarantees. In particular, this will help with establishing precedence. As a by-product of insights from our analysis, we also demonstrate two novel attacks against PoL.	翻訳日:2022-08-09 12:16:44 公開日:2022-08-06
# 強化記憶ユニットによる認知的評価の学習 Learning Human Cognitive Appraisal Through Reinforcement Memory Unit ( http://arxiv.org/abs/2208.03473v1 ) ライセンス: Link先を確認	Yaosi Hu and Zhenzhong Chen	(参考訳) 逐次的評価課題における人間の認知評価の効果を生かした,リカレントニューラルネットワークのための新しいメモリ強調機構を提案する。記憶増強機構を2つの正および負の強化記憶とともに評価状態を含む強化記憶ユニット(RMU)として概念化する。 2つの強化記憶はより強い刺激によって減衰または強化される。その後、正及び負の強化記憶の競合によって評価状態を更新する。したがって、RMUは、人間の感情経験を推定するための刺激の激しい変化の下で、評価の変動を学習することができる。ビデオ品質評価と体験タスクの映像品質評価実験で示すように、提案した強化記憶ユニットは、人間の認知評価をモデル化するためのRMUの有効性を示す。 We propose a novel memory-enhancing mechanism for recurrent neural networks that exploits the effect of human cognitive appraisal in sequential assessment tasks. We conceptualize the memory-enhancing mechanism as Reinforcement Memory Unit (RMU) that contains an appraisal state together with two positive and negative reinforcement memories. The two reinforcement memories are decayed or strengthened by stronger stimulus. Thereafter the appraisal state is updated through the competition of positive and negative reinforcement memories. Therefore, RMU can learn the appraisal variation under violent changing of the stimuli for estimating human affective experience. As shown in the experiments of video quality assessment and video quality of experience tasks, the proposed reinforcement memory unit achieves superior performance among recurrent neural networks, that demonstrates the effectiveness of RMU for modeling human cognitive appraisal.	翻訳日:2022-08-09 12:14:17 公開日:2022-08-06

# 容量結合型数電子一重項-三重項量子ビットに対するロバストなエンタングリングゲート

Robust entangling gate for capacitively coupled few-electron singlet-triplet qubits ( http://arxiv.org/abs/2201.01583v2 )

ライセンス: Link先を確認

Guo Xuan Chan, Xin Wang

(参考訳) スイートスポット、すなわち量子制御がノイズに対して一次のオーダーで鈍感になる量子ビットパラメータ上の軌跡の探索は、高忠実度量子ゲートを達成する鍵となる。各ドットが1つの電子をホストする従来の二重量子ドット一重項-三重項量子ビット(「2電子一重項-三重項量子ビット」)において、そのようなスイートスポットを探す努力は、特に2量子ビット演算では成功していない。ここでは、各ドットが2つ以上の電子をホストでき、二重量子ドット全体で合計4つの電子を含む一重項-三重項量子ビット(「4電子一重項-三重項量子ビット」)を考える。配置間相互作用計算を用いて,この結合量子ビット系にスイートスポットが現れることを理論的に示す。さらに,現実的な電荷ノイズおよび超微細ノイズの下で,提案するスイートスポットでの2量子ビット動作は,従来の2電子一重項-三重項量子ビット系($\sim90\%$)よりも高いゲート忠実度($\sim99\%$)を提供しうることを示す。我々の結果は,一重項-三重項量子ビット系における高忠実度2量子ビットゲートの実現を容易にするはずである。

The search for a sweet spot, a locus in qubit parameter space where quantum control is first-order insensitive to noise, is key to achieving high-fidelity quantum gates. Efforts to find such a sweet spot in conventional double-quantum-dot singlet-triplet qubits where each dot hosts one electron ("two-electron singlet-triplet qubit"), especially for two-qubit operations, have been unsuccessful. Here we consider singlet-triplet qubits allowing each dot to host more than one electron, with a total of four electrons in the double quantum dots ("four-electron singlet-triplet qubit"). We theoretically demonstrate, using configuration-interaction calculations, that sweet spots appear in this coupled qubit system. We further demonstrate that, under realistic charge noise and hyperfine noise, two-qubit operation at the proposed sweet spot could offer gate fidelities ($\sim99\%$) that are higher than those of the conventional two-electron singlet-triplet qubit system ($\sim90\%$). Our results should facilitate the realization of high-fidelity two-qubit gates in singlet-triplet qubit systems.

翻訳日:2023-03-02 05:43:15 公開日:2022-08-06

# 変分量子パルス学習

Variational Quantum Pulse Learning ( http://arxiv.org/abs/2203.17267v3 )

ライセンス: Link先を確認

Zhiding Liang, Hanrui Wang, Jinglei Cheng, Yongshan Ding, Hang Ren, Zhengqi Gao, Zhirui Hu, Duane S. Boning, Xuehai Qian, Song Han, Weiwen Jiang, Yiyu Shi

(参考訳) 量子コンピューティングは、古典的ハードウェア上で計算的に難解な問題を解く最も有望な新興技術の一つである。既存の多くの研究は、変分量子回路(VQC)のような機械学習タスクのゲートレベルにおける変分量子アルゴリズムの使用に焦点を当てている。しかし、vqcは1つの回転ゲートで1つのパラメータしか訓練できないなど、パラメータの数が少ないため、柔軟性と表現性に制限がある。一方、量子パルスは量子コンピューティングのスタックの量子ゲートよりも小さく、制御パラメータがより大きいことが観察された。本稿では、vqcの有望な性能に触発されて、学習タスクで直接量子パルスを訓練する新しいパラダイムである変分量子パルス(vqp)を提案する。提案手法は,最適化フレームワークにおいてパルスの振幅を引いたり押したりすることで,変動量子パルスを操作する。可変量子アルゴリズムと同様に、パルスをトレーニングするためのフレームワークはノイズの中間スケール量子(nisq)コンピュータの雑音に対するロバスト性を維持する。二値分類の例では、VQP学習は(実機からのノイズモデルを持つ)カイスキットノイズシミュレータとibmq-jarkataのVQC学習と比較して最大11%と9%高い精度を達成し、その効果と実現可能性を示した。 VQPが信頼性の高い結果を得るための安定性も、ノイズの存在下で検証されている。

Quantum computing is among the most promising emerging techniques to solve problems that are computationally intractable on classical hardware. A large body of existing work focuses on using variational quantum algorithms at the gate level for machine learning tasks, such as the variational quantum circuit (VQC). However, VQC has limited flexibility and expressibility due to its limited number of parameters, e.g. only one parameter can be trained in one rotation gate. On the other hand, we observe that quantum pulses sit lower than quantum gates in the quantum computing stack and offer more control parameters. Inspired by the promising performance of VQC, in this paper we propose variational quantum pulses (VQP), a novel paradigm to directly train quantum pulses for learning tasks. The proposed method manipulates variational quantum pulses by pulling and pushing the amplitudes of pulses in an optimization framework. Similar to variational quantum algorithms, our framework to train pulses maintains robustness to noise on Noisy Intermediate-Scale Quantum (NISQ) computers. In an example task of binary classification, VQP learning achieves up to 11% and 9% higher accuracy compared with VQC learning on the qiskit noise simulators (with a noise model from a real machine) and ibmq-jarkata, respectively, demonstrating its effectiveness and feasibility. The stability of VQP in obtaining reliable results has also been verified in the presence of noise.

翻訳日:2023-02-20 04:43:10 公開日:2022-08-06

# 敵対的サプライチェーン攻撃の防止または緩和 : 法的分析

Preventing or Mitigating Adversarial Supply Chain Attacks; a legal analysis ( http://arxiv.org/abs/2208.03466v1 )

ライセンス: Link先を確認

Kaspar Rosager Ludvigsen, Shishir Nagaraja, Angela Daly

(参考訳) 現在、世界はインターネット全体を通じて強く結びついており、食品からインフラ、テクノロジーまで、あらゆるものを提供する非常にサプライチェーンも持っている。サプライチェーンは、デジタルと物理的の両方の意味で、敵の攻撃に対して脆弱であり、それらを破壊または破滅させる可能性がある。本稿では、このような攻撃が成功した事例を2つ検討し、その成果が今後どうなるのかを考察し、EUと国家法がこれらの攻撃を防ぎ得るか、あるいはあらゆるコストでそれらを緩和しようとしない企業を罰するかを分析する。現行の国家規制は技術面では不十分であり、サプライチェーンの攻撃を防ぐ上で最大の役割を果たせる適切な当事者を強制または強制することは不可能である。しかし、現在のeuの法律は正しい道筋をたどっており、サイバーセキュリティに関して国家法が適切な規制を怠っているため、こうした大きな脅威を考えるためにはさらなる警戒が必要かもしれない。

The world is currently strongly connected through both the internet at large and the very supply chains which provide everything from food to infrastructure and technology. The supply chains are themselves vulnerable to adversarial attacks, both in a digital and physical sense, which can disrupt or at worst destroy them. In this paper, we take a look at two examples of such successful attacks and consider what their consequences may be going forward, and analyse how EU and national law can prevent these attacks or otherwise punish companies which do not try to mitigate them at all possible costs. We find that the current types of national regulation are not technology-specific enough, and cannot force or otherwise mandate the parties who could play the biggest role in preventing supply chain attacks to do everything in their power to mitigate them. However, current EU law is on the right path, and further vigilance may be what is necessary to address these large threats, as national law tends to fail at properly regulating companies when it comes to cybersecurity.

翻訳日:2023-02-19 10:22:04 公開日:2022-08-06

# 公衆衛生におけるデータサイエンス : 次世代の能力の構築

Data science in public health: building next generation capacity ( http://arxiv.org/abs/2208.03461v1 )

ライセンス: Link先を確認

Nicholas Mirin, Heather Mattie, Latifa Jackson, Zainab Samad, Rumi Chunara

(参考訳) 急速に進化する技術、データ、分析的な風景は多くの分野や職業に浸透している。公衆衛生において、データリテラシーを含むデータサイエンススキルの必要性は、既存の公衆衛生研究と介入慣行のギャップを埋めるための新しいデータタイプと分析手法の可能性と、そのようなデータや方法が健康格差を持続または拡大する可能性の両方から特に顕著である。米国トップ10および世界の公衆衛生学校における公衆衛生コースとプログラムのレビューを通じて、本稿は、公衆衛生データ科学における既存の教育活動を要約する。これらの既存の慣習は、これらのカリキュラムをさらに多くの学校や人口に広げる努力に役立ちます。データサイエンス倫理コースのオファリングは、人口の健康原則が、従来の公衆衛生カリキュラムのコアを拡大するために、データに関わるレベルのトレーニングにどのようにブレンドできるかを評価する文脈でも検討されている。また、国内外の「教室外」研修プログラムからの並列的な知見を合成し、公衆衛生データ科学の多様性を高めるためのアプローチを推し進める。これらのプログラムのレビューとそれらの合成に基づいて、4点式を蒸留し、公衆衛生の目標達成とデジタル時代の生活の質向上にデータを活用するために、流線型の重要かつ包括的な実践者集団の開発に向けて、公衆衛生データサイエンス教育の取り組みを強化する。

Rapidly evolving technology, data and analytic landscapes are permeating many fields and professions. In public health, the need for data science skills including data literacy is particularly prominent given both the potential of novel data types and analysis methods to fill gaps in existing public health research and intervention practices, as well as the potential of such data or methods to perpetuate or augment health disparities. Through a review of public health courses and programs at the top 10 U.S. and globally ranked schools of public health, this article summarizes existing educational efforts in public health data science. These existing practices serve to inform efforts for broadening such curricula to further schools and populations. Data science ethics course offerings are also examined in context of assessing how population health principles can be blended into training across levels of data involvement to augment the traditional core of public health curricula. Parallel findings from domestic and international 'outside the classroom' training programs are also synthesized to advance approaches for increasing diversity in public health data science. Based on these program reviews and their synthesis, a four-point formula is distilled for furthering public health data science education efforts, toward development of a critical and inclusive mass of practitioners with fluency to leverage data to advance goals of public health and improve quality of life in the digital age.

翻訳日:2023-02-19 10:21:46 公開日:2022-08-06

# オフナディア画像のジオセントリックポーズ予測のための深層学習アンサンブルフレームワーク

A Deep Learning Ensemble Framework for Off-Nadir Geocentric Pose Prediction ( http://arxiv.org/abs/2205.11230v3 )

ライセンス: Link先を確認

Christopher Sun, Jai Sharma, Milind Maiti

(参考訳) 自然災害対応を加速する計算手法には、変化検出、地図アライメント、視覚支援ナビゲーションなどがある。現在のソフトウェアはnadirに近い画像のみに最適に機能するが、オフnadir画像は自然災害後の最初の情報源であることが多い。上記のタスクにオフnadir画像を使用するには、重力に対する航空機の空間方向であるジオセントリックなポーズの計算が必要である。本研究では,世界の都市における5,923枚のニアnadirおよびオフnadirのRGB衛星画像を用いて,地球中心のポーズを予測するためのディープラーニングアンサンブルフレームワークを提案する。まず、U-Net Fully Convolutional Neural Networkは、RGB画像の画素方向の地上高度マスクを予測する。そして、標高マスクをRGB画像と連結して第2畳み込みモデルに入力される4チャンネル入力を形成し、方位角と倍率スケールを予測する。 $R^2=0.917$ の性能精度は従来の手法よりも大幅に向上した。また,教師付き補間により異常除去を行い,標高マスクの感度分析を行い,データ特徴量の有用性を評価し,将来的な特徴工学の道筋を動機付ける。本研究で構築した高精度ソフトウェアは,災害対応のための地図作成とナビゲーションに有効である。

Computational methods to accelerate natural disaster response include change detection, map alignment, and vision-aided navigation. Current software functions optimally only on near-nadir images, though off-nadir images are often the first sources of information following a natural disaster. The use of off-nadir images for the aforementioned tasks requires the computation of geocentric pose, which is an aerial vehicle's spatial orientation with respect to gravity. This study proposes a deep learning ensemble framework to predict geocentric pose using 5,923 near-nadir and off-nadir RGB satellite images of cities worldwide. First, a U-Net Fully Convolutional Neural Network predicts the pixel-wise above-ground elevation mask of the RGB images. Then, the elevation masks are concatenated with the RGB images to form four-channel inputs fed into a second convolutional model, which predicts orientation angle and magnification scale. The framework reaches $R^2=0.917$, significantly outperforming previous methodologies. In addition, outlier removal is performed through supervised interpolation, and a sensitivity analysis of elevation masks is conducted to gauge the usefulness of data features, motivating future avenues of feature engineering. The high-accuracy software built in this study contributes to mapping and navigation procedures for effective disaster response to save lives.

翻訳日:2023-02-14 08:49:32 公開日:2022-08-06
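
The four-channel input mentioned in the abstract above can be pictured with a short NumPy sketch. This is only an illustration under stated assumptions: the function name `make_four_channel_input`, the channel-last layout, and the toy array sizes are hypothetical and are not taken from the paper, whose actual preprocessing is not described in this listing.

```python
import numpy as np

def make_four_channel_input(rgb, elevation):
    """Stack an (H, W) elevation mask onto an (H, W, 3) RGB image.

    Returns an (H, W, 4) float32 array. Layout and dtype are assumptions
    made for this sketch, not the authors' documented pipeline.
    """
    rgb = np.asarray(rgb, dtype=np.float32)
    elevation = np.asarray(elevation, dtype=np.float32)
    if rgb.ndim != 3 or rgb.shape[2] != 3:
        raise ValueError("rgb must have shape (H, W, 3)")
    if elevation.shape != rgb.shape[:2]:
        raise ValueError("elevation must have shape (H, W) matching rgb")
    # Add a trailing channel axis to the mask, then append it as channel 4.
    return np.concatenate([rgb, elevation[..., np.newaxis]], axis=-1)

# Toy data standing in for a satellite tile and a predicted elevation mask.
rgb_tile = np.zeros((4, 5, 3))
elevation_mask = np.ones((4, 5))
four_channel = make_four_channel_input(rgb_tile, elevation_mask)
print(four_channel.shape)
```

In a two-stage setup like the one the abstract describes, an array of this shape would be what the second convolutional model consumes; the first three channels carry colour and the fourth carries the predicted height.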

# 量子確率的等価性は、巨視的実在論と整合する測定の逆因果モデルを導く

A quantum stochastic equivalence leads to a retrocausal model for measurement consistent with macroscopic realism ( http://arxiv.org/abs/2205.06070v2 )

ライセンス: Link先を確認

Margaret D Reid and Peter D Drummond

(参考訳) 本稿では, 量子力学から自然に逆因果性が生じることを示し, 量子計測をマクロ的リアリズムと一貫して説明する。固有状態 $|x_{j}\rangle$ の重ね合わせで用意された系上で測定値 $\hat{x}$ を解析し、増幅によって測定値がモデル化される。経路積分定理を導出することにより、量子確率分布 $Q(x,p,t)$ と振幅 $x(t)$ と $p(t)$ の同時バックインタイムとフォワードインタイム確率方程式の等価性が証明される。後方と前方の軌道は、初期時間境界でリンクされる。 Deutschのような'causal consistency'とボルンの規則は自然に現れる。特徴は固有状態に関連する真空ノイズである。固有値とは異なり、このノイズは増幅されず、測定不能であり、過去の境界条件と将来の境界条件に由来する正確なゆらぎである。巨視的重ね合わせについては、測定開始前に測定値$\hat{x}$の巨視的結果が決定される。これにより、マクロ因果とミクロ逆因果のハイブリッドな関係、および他のリアリズムのモデルが導かれる。この結果は、波動関数の「収縮」は増幅によって起こることを裏付けるものである: 結果 $x_{j}$ で事後選択された最初の'状態' に対する分布 $Q_{j}(x,p,0)$ は量子状態ではないが、マクロ重ね合わせに対しては固有状態 $|x_{j}\rangle$ に近づく。完全な不可逆的崩壊は、メーター(測定器)への結合によってシミュレートされる。 Einstein-Podolsky-Rosen と Bell の相関について論じる。

In this paper, we show how retrocausality arises naturally from within quantum mechanics, and explains quantum measurement consistently with macroscopic realism. We analyze a measurement $\hat{x}$ on a system prepared in a superposition of eigenstates $|x_{j}\rangle$ where the measurement is modeled by amplification. By deriving a path-integral theorem, we prove an equivalence between a quantum probability distribution $Q(x,p,t)$ and simultaneous back-in-time and forward-in-time stochastic equations for amplitudes $x(t)$ and $p(t)$, respectively. The backward and forward trajectories are linked at the initial-time boundary. A Deutsch-like 'causal consistency' and Born's rule emerge naturally. A feature is the vacuum noise associated with the eigenstate. Unlike the eigenvalue, this noise is not amplified and is not measurable, the precise fluctuations originating from past and future boundary conditions. We find consistency with macroscopic realism: For macroscopic superpositions, the macroscopic outcome of the measurement $\hat{x}$ is considered determined prior to the onset of the measurement. This leads to hybrid macro-causal and micro-retrocausal relations, and other models of realism. Our results support that the 'collapse' of the wave function occurs with amplification: The distribution $Q_{j}(x,p,0)$ for the initial 'state' postselected on the outcome $x_{j}$ is not a quantum state but approaches the eigenstate $|x_{j}\rangle$ for a macroscopic superposition. The full irreversible collapse is simulated by coupling to a meter. We discuss Einstein-Podolsky-Rosen and Bell correlations.

翻訳日:2023-02-13 09:37:49 公開日:2022-08-06

# 室温Rydberg-Atomを用いたテラヘルツ受信器

Terahertz Receiver based on Room-Temperature Rydberg-Atoms ( http://arxiv.org/abs/2205.11021v2 )

ライセンス: Link先を確認

Yayi Lin, Zhenyue She, Zhiwen Chen, Xianzhe Li, Caixia Zhang, Kaiyu Liao, Xinding Zhang, Wei Huang, Hui Yan, and Shiliang Zhu

(参考訳) 実用的なテラヘルツ無線通信の実現は多くの課題に直面している。 THz無線通信には高感度受信機が重要である。ここでは, セシウムRydberg原子に基づくテラヘルツ受信機を室温気相セルで実証する。最小検出可能なTHz電界を校正する。この受信機では、振幅変調または周波数変調テラヘルツ波を光信号に位相感応変換する。その結果、原子レシーバーはその量子特性のために多くの利点があることがわかった。特に、この受信機を用いて長距離THz無線通信を実現することができる。さらに、原子受信機は、THz無線-光リンクで使用することができる。

Realization of practical terahertz wireless communications still faces many challenges. A receiver with high sensitivity is important for THz wireless communications. Here we demonstrate a terahertz receiver based on cesium Rydberg atoms in a room-temperature vapor cell. The minimum detectable THz electric field is calibrated. With this receiver, the phase-sensitive conversion of amplitude-modulated or frequency-modulated terahertz waves into optical signals is performed. The results show that the atomic receiver has many advantages due to its quantum properties. In particular, long-distance THz wireless communication is achievable using this receiver. Furthermore, the atomic receiver can be used in a THz wireless-to-optical link.

翻訳日:2023-02-12 00:57:13 公開日:2022-08-06

# 架空の同一粒子の熱力学特性とフェルミオン符号問題への応用に関する研究

On the thermodynamic properties of fictitious identical particles and the application to fermion sign problem ( http://arxiv.org/abs/2206.08341v2 )

ライセンス: Link先を確認

Yunuo Xiong, Hongwei Xiong

(参考訳) 最近開発された同一のボソンとフェルミオンの経路積分分子動力学を一般化することにより、実パラメータ $\xi$ がボソン($\xi=1$)とフェルミオン($\xi=-1$)の間を連続的に補間する架空の同一粒子の有限温度熱力学的性質を考える。一般解析と数値実験により、平均エネルギーはこの実パラメータ $\xi$ の関数として良い解析的性質を持ちうることが判明した。これにより、$\xi\geq 0$ の架空粒子の熱力学特性を正確に計算したうえで、単純な多項式関数による外挿によって同一フェルミオンの熱力学特性を計算できる可能性が生じる。いくつかの例を用いて、本手法が有限温度フェルミオン系に対して効率的に正確なエネルギー値を与えられることを示す。我々の研究は、いくつかの量子系のフェルミオン符号問題を回避する機会を提供する。

By generalizing the recently developed path integral molecular dynamics for identical bosons and fermions, we consider the finite-temperature thermodynamic properties of fictitious identical particles with a real parameter $\xi$ interpolating continuously between bosons ($\xi=1$) and fermions ($\xi=-1$). Through general analysis and numerical experiments we find that the average energy may have good analytical property as a function of this real parameter $\xi$, which provides the chance to calculate the thermodynamical properties of identical fermions by an extrapolation with a simple polynomial function after accurately calculating the thermodynamic properties of the fictitious particles for $\xi\geq 0$. Using several examples, it is shown that our method can efficiently give accurate energy values for finite-temperature fermionic systems. Our work provides a chance to circumvent the fermion sign problem for some quantum systems.

翻訳日:2023-02-11 13:46:11 公開日:2022-08-06
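
The extrapolation idea in the abstract above (sample the energy for $\xi\geq 0$, fit a polynomial, read off the value at $\xi=-1$) can be tried on a toy model. The sketch below is not the authors' path integral molecular dynamics. It assumes two non-interacting particles in a 1D harmonic trap with partition function $Z(\xi)=[Z_1(\beta)^2+\xi Z_1(2\beta)]/2$, which gives bosons at $\xi=1$ and fermions at $\xi=-1$; the function names, $\beta=1$, $\hbar\omega=1$, the five-point grid, and the degree-4 fit are all illustrative choices.

```python
import numpy as np

def z1(beta, hbar_omega=1.0):
    """Partition function of one particle in a 1D harmonic trap."""
    return 1.0 / (2.0 * np.sinh(0.5 * beta * hbar_omega))

def log_z(xi, beta):
    """ln Z for two non-interacting fictitious particles (toy model).

    Z = [Z1(beta)^2 + xi * Z1(2*beta)] / 2, an assumed two-particle form
    that reduces to bosons at xi = +1 and fermions at xi = -1.
    """
    return np.log(0.5 * (z1(beta) ** 2 + xi * z1(2.0 * beta)))

def energy(xi, beta, h=1e-5):
    """Average energy E = -d(ln Z)/d(beta), via a central finite difference."""
    return -(log_z(xi, beta + h) - log_z(xi, beta - h)) / (2.0 * h)

beta = 1.0
xi_grid = np.linspace(0.0, 1.0, 5)            # only the xi >= 0 sector is sampled
e_grid = np.array([energy(xi, beta) for xi in xi_grid])

coeffs = np.polyfit(xi_grid, e_grid, deg=4)    # polynomial through the five samples
e_extrapolated = np.polyval(coeffs, -1.0)      # extrapolate to the fermion point
e_exact = energy(-1.0, beta)                   # exact toy-model fermion energy

print(f"extrapolated: {e_extrapolated:.3f}  exact: {e_exact:.3f}")
```

In this toy setting the extrapolated value lands within a few percent of the exact fermion energy. That says nothing about accuracy for interacting systems, where the smoothness of $E(\xi)$ is exactly what the paper investigates.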

# 非自明なPT対称連続体ハミルトニアンとその固有状態と固有値

A non-trivial PT-symmetric continuum Hamiltonian and its Eigenstates and Eigenvalues ( http://arxiv.org/abs/2206.12900v2 )

ライセンス: Link先を確認

Lawrence Mead, David Garfinkle, Sungwook Lee

(参考訳) 本稿では,連続体PT対称ハミルトニアンが支配する非自明な系について述べる。このハミルトニアンは単純な高調波発振器と等距離であることを示す。我々は、それらの函数が正規直交集合を形成する複素平面の固有関数と経路を見つける。また、この系に対して隠れ対称性作用素 ${\cal C}$ も見つかる。すべての計算は解析的に、近似なしで行われる。

In this paper, a non-trivial system governed by a continuum PT-symmetric Hamiltonian is discussed. We show that this Hamiltonian is iso-spectral to the simple harmonic oscillator. We find its eigenfunctions and the path in the complex plane along which these functions form an orthonormal set. We also find the hidden symmetry operator, ${\cal C}$, for this system. All calculations are performed analytically and without approximation.

翻訳日:2023-02-07 23:48:00 公開日:2022-08-06

# キラル対称性のバルクモードを有するツイストエニオン空洞共振器と超軽量アクシオン暗黒物質に対する感度

Twisted Anyon Cavity Resonators with Bulk Modes of Chiral Symmetry and Sensitivity to Ultra-Light Axion Dark Matter ( http://arxiv.org/abs/2208.01640v2 )

ライセンス: Link先を確認

J. F. Bourhill, E. C. I. Paterson, M. Goryachev, M. E. Tobar

(参考訳) 本研究では,エニオン空洞共振器(Anyon Cavity Resonator)を考案する。共振器はねじれた中空構造に基づいており、選択した共振モードは非ゼロのヘリシティを示すことができる。キャビティの断面によっては、モードはこれまで研究されてきたものよりも一般的な対称性を持つ。例えば、ねじれのない場合、モードはボソンの形式であり、一方、$180^{\circ}$ のツイストでは対称性はフェルミオンの形式である。一般にツイストされた共振器はエニオンの形式であることを示す。非ゼロのヘリシティはモードをアクシオンに結合させ、アップコンバージョン極限では、モードが共振器の帯域幅内で超軽量アクシオンに結合することを示す。この結合は振幅変調されたサイドバンドを付加し、共振器の帯域幅内の単一モードのみを用いて、簡便かつ高感度に超軽量アクシオンを探索できる。

In this work, we invent the Anyon Cavity Resonator. The resonator is based on twisted hollow structures, which allow select resonant modes to exhibit non-zero helicity. Depending on the cross-section of the cavity, the modes have more general symmetry than what has been studied before. For example, with no twist, the mode is in the form of a boson, while with a $180^{\circ}$ twist the symmetry is in the form of a fermion. We show that the generally twisted resonator is in the form of an anyon. The non-zero helicity couples the mode to axions, and we show that in the upconversion limit the mode couples to ultra-light axions within the bandwidth of the resonator. The coupling adds amplitude-modulated sidebands and allows a simple, sensitive way to search for ultra-light axions using only a single mode within the resonator's bandwidth.

翻訳日:2023-02-02 18:44:45 公開日:2022-08-06

# 捕捉イオン量子ゲートの忠実性に及ぼす高速雑音の影響

The effect of fast noise on the fidelity of trapped-ions quantum gates ( http://arxiv.org/abs/2208.03570v1 )

ライセンス: Link先を確認

Haim Nakav, Ran Finkelstein, Lee Peleg, Nitzan Akerman and Roee Ozeri

(参考訳) 高忠実度シングルおよびマルチ量子ビット演算は量子情報処理のバックボーンを構成する。この忠実度は、1ビットまたは2ビットのレベルを極めて整合的で正確な方法で結合する能力に基づいている。コヒーレント量子進化に必要な条件は、これらの遷移を駆動する非常に安定な局所振動子である。本稿では,局所発振器線幅よりもはるかに高い周波数での雑音である高速雑音が,閉じ込められたイオン系の1ビットおよび2ビットゲートの忠実度に及ぼす影響について検討する。我々は,共振する$\pi$ 回転とオフ共振側バンド遷移を含む単一量子ビット演算に対する高速雑音の影響を解析・測定する。さらに, モルマー・ソレンセン2量子ゲートにおける高速位相雑音の影響を解析した。我々は、量子ビット応答周波数における雑音パワースペクトル密度から与えられる単一のパラメータを通して、これら全ての演算の性能を統一的かつ簡便に推定する方法を見出した。解析は位相雑音や閉じ込められたイオン系に焦点をあてるが、他の高速ノイズ源やスピン状量子ビットが共通のボソニック場によって結合された他の量子ビット系に関係している。私たちの分析は、量子ハードウェアプラットフォームとゲートの分離を導くのに役立ち、フォールトトレラントな量子コンピューティングに対する信頼性を向上させることができます。

High-fidelity single- and multi-qubit operations compose the backbone of quantum information processing. This fidelity is based on the ability to couple single- or two-qubit levels in an extremely coherent and precise manner. A necessary condition for coherent quantum evolution is a highly stable local oscillator driving these transitions. Here we study the effect of fast noise, that is, noise at frequencies much higher than the local oscillator linewidth, on the fidelity of one- and two-qubit gates in a trapped-ion system. We analyze and measure the effect of fast noise on single-qubit operations including resonant $\pi$ rotations and off-resonant sideband transitions. We further analyze the effect of fast phase noise on the Molmer-Sorensen two-qubit gate. We find a unified and simple way to estimate the performance of all of these operations through a single parameter given by the noise power spectral density at the qubit response frequency. While our analysis focuses on phase noise and on trapped-ion systems, it is relevant for other sources of fast noise as well as for other qubit systems in which spin-like qubits are coupled by a common bosonic field. Our analysis can help in guiding the design of quantum hardware platforms and gates, improving their fidelity towards fault-tolerant quantum computing.

翻訳日:2023-02-02 02:25:49 公開日:2022-08-06

# 半量子鍵分配プロトコルの3次元と4次元における無バイアス基底

Mutually Unbiased Bases In 3 and 4 Dimensions Semi-quantum Key Distribution Protocol ( http://arxiv.org/abs/2208.03548v1 )

ライセンス: Link先を確認

Hasnaa Hajji, Morad El Baz

(参考訳) 半量子鍵分布は伝統的に2レベル量子系に基づいている。本稿では,様々な非バイアスベースを用いた高次元システムに基づく半量子鍵分散プロトコルの無条件セキュリティについて述べる。まず,3次元と4次元の非バイアス基底を用いた3次元の場合を,量子チャネルの雑音の関数として,鍵レートに対する下界を導出する。次に, 4次元状態に対して, 相互に偏りのない基底数が異なる半量子鍵分布プロトコルに一般化する。半量子鍵分布プロトコルを高次元の非バイアスベースに基づけることで、ノイズの許容しきい値と秘密鍵レートの最大到達値を高めることができることがわかった。

Semi-quantum key distribution is traditionally based on two-level quantum systems. In this paper, the unconditional security of a semi-quantum key distribution protocol based on higher-dimensional systems using various mutually unbiased bases is presented. We first consider the three-dimensional case using three and four mutually unbiased bases and derive a lower bound for the key rate as a function of the quantum channel's noise. We then generalize the result to a semi-quantum key distribution protocol that employs different numbers of mutually unbiased bases for four-dimensional states. It is found that basing the semi-quantum key distribution protocol on higher-dimensional mutually unbiased bases can increase the tolerable threshold of the noise and the maximum achievable value of the secret key rate.
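
The notion of mutually unbiased bases used in the abstract above can be checked numerically. The sketch below is an illustration only: it assumes the textbook construction for odd prime dimension (computational basis plus d bases with quadratic phases) and is not taken from the paper. The names `mub_bases`, `overlap_sq` and `mutually_unbiased` are hypothetical helpers. Two bases are mutually unbiased when every cross overlap satisfies |<u|v>|^2 = 1/d.

```python
import cmath
import itertools
import math

def mub_bases(d):
    """Return d + 1 orthonormal bases of C^d, assuming d is an odd prime.

    Basis 0 is the computational basis. Basis k + 1 (k = 0..d-1) has vectors
    v_j[m] = w**(k*m*m + j*m) / sqrt(d), with w = exp(2*pi*i/d).
    """
    w = cmath.exp(2j * math.pi / d)
    bases = [[[1.0 if m == j else 0.0 for m in range(d)] for j in range(d)]]
    for k in range(d):
        bases.append([[w ** (k * m * m + j * m) / math.sqrt(d) for m in range(d)]
                      for j in range(d)])
    return bases

def overlap_sq(u, v):
    """Squared modulus of the inner product <u|v>."""
    return abs(sum(a.conjugate() * b for a, b in zip(u, v))) ** 2

def mutually_unbiased(b1, b2, d, tol=1e-9):
    """True when every vector of b1 overlaps every vector of b2 with weight 1/d."""
    return all(abs(overlap_sq(u, v) - 1.0 / d) < tol for u in b1 for v in b2)

bases = mub_bases(3)
all_unbiased = all(mutually_unbiased(b1, b2, 3)
                   for b1, b2 in itertools.combinations(bases, 2))
print(len(bases), all_unbiased)
```

For d = 3 this yields four bases, so protocols using three or four mutually unbiased bases can pick a subset of them.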

翻訳日:2023-02-02 02:25:29 公開日:2022-08-06

# 原子媒体を組み込んだハイブリッド光機械システムにおける可変電磁誘起多重透明化

Tunable Electromagnetically Induced Multi-Transparencies in Hybrid Optomechanical system Incorporating Atomic Medium ( http://arxiv.org/abs/2208.03547v1 )

ライセンス: Link先を確認

M. Hunza, M. Asjad, T. Abbas, M. Qasymeh, B. Teklu and H. Eleuch

(参考訳) 同一の$\Lambda$型原子を組み込んだハイブリッド原子-オプトメカニクス系を考える。このシステムは、光とフォノニックの二重駆動を受ける。光学的線形および二次的相互作用を利用することにより、複数の電磁透過窓が得られることを示す。さらに、内蔵メカニカルポンプにより、透明窓を制御して調整する。例えば、外部機械ポンプの位相を調整することにより、追加の制御パラメータが有効となり、吸収・放出プロファイルが向上する。本研究は,キャビティ-オプトメカニクス系を組み込んだ量子デバイス内部の伝搬信号の効率的な修正手法を提案する。

We consider a hybrid atom-optomechanical system incorporating N identical $\Lambda$-type atoms. The system is subjected to dual optical and phononic drives. We show that by exploiting the optomechanical linear and quadratic interactions, multiple electromagnetic transparency windows are attained. Furthermore, owing to the incorporated mechanical pump, the transparency windows are controlled and tuned. For instance, by adjusting the phase of the external mechanical pump, additional controlling parameters are enabled, and the absorption/emission profiles are enhanced. Our present study provides an efficient approach to modifying propagating signals inside the quantum devices incorporating cavity-optomechanical systems.

翻訳日:2023-02-02 02:25:17 公開日:2022-08-06

# 加速フレームにおけるミンコフスキー・フォック状態

Minkowski-Fock states in accelerated frames ( http://arxiv.org/abs/2208.03481v1 )

ライセンス: Link先を確認

Riccardo Falcone, Claudio Conti

(参考訳) 非慣性観測者に対するミンコフスキー粒子状態の明示的なウィグナー定式化は不明である。ここでは、加速フレームにおけるミンコフスキー・フォック状態の特性関数を計算するための一般的な処方則を導出する。単粒子状態と二粒子状態の特別な場合において、この方法は運動量空間における粒子数の平均値と相関関数、および観測者の加速度の影響を導出することができる。我々は,ミンコフスキー粒子と2粒子状態の区別不可能性をリンドラー粒子分布の観点から示し,観察者がフレームの加速度を検出する方法とみなすことができる。 2粒子状態の場合、観測者は異なるモータを持つリンドラー粒子間の相関を測定することで加速を検出することができる。

An explicit Wigner formulation of Minkowski particle states for non-inertial observers is unknown. Here, we derive a general prescription to compute the characteristic function for Minkowski-Fock states in accelerated frames. For the special case of single-particle and two-particle states, this method enables one to derive mean values of particle numbers and the correlation function in momentum space, and the way they are affected by the acceleration of the observer. We show an indistinguishability between Minkowski single-particle and two-particle states in terms of Rindler particle distribution that can be regarded as a way for the observer to detect any acceleration of the frame. We find that for two-particle states the observer is also able to detect acceleration by measuring the correlation between Rindler particles with different momenta.

翻訳日:2023-02-02 02:25:08 公開日:2022-08-06

# 非摂動カシミール効果:真空構造、閉じ込め、カイラル対称性の破断

Nonperturbative Casimir effects: Vacuum structure, Confinement, and Chiral Symmetry Breaking ( http://arxiv.org/abs/2208.03457v1 )

ライセンス: Link先を確認

Alexander Molochkov

(参考訳) 境界を持つ時空間における真空および物質の再構成について概観する。収束ゲージ理論と強い相互作用を持つフェルミオン系の位相特性を考察する。特に、キラル相と分解相はカシミールプレートの存在下で性質を遷移させる。また、そのようなシステムにおける質量スケールシフトとその動的および幾何学的性質についても論じる。

A review of vacuum and matter restructuring in space-time with boundaries is presented. We consider phase properties of confining gauge theories and strongly interacting fermion systems. In particular, we discuss the properties of the chiral and deconfinement phase transitions in the presence of Casimir plates. We also discuss mass scale shifts in such systems and their possible dynamical and geometrical nature.

翻訳日:2023-02-02 02:24:54 公開日:2022-08-06

# フォック状態格子における光の量子位相状態のコヒーレント制御

Coherent control of quantum topological states of light in Fock-state lattices ( http://arxiv.org/abs/2208.03452v1 )

ライセンス: Link先を確認

Jinfeng Deng, Hang Dong, Chuanyu Zhang, Yaozu Wu, Jiale Yuan, Xuhao Zhu, Feitong Jin, Hekang Li, Zhen Wang, Han Cai, Chao Song, H. Wang, J. Q. You, and Da-Wei Wang

(参考訳) トポロジカルフォトニクスは、従来の電子材料を超えてトポロジカル物理を探求する新しいプラットフォームを提供し、位相的に保護された光輸送とレーザーにおける有望な応用を刺激する。偏光や波動ベクトルのような古典的な自由度は、トポロジカル光モードの合成に日常的に使用される。古典的な体制を超えて、光の本質的な量子の性質は、本質的に異なる位相状態の富を生み出し、量子情報処理における位相的保護を提供する。本稿では,3つの共振器をgmon qubitに均一に結合した超伝導回路における量子化光の位相状態に関する実験を行う。本研究では, ゼロエネルギー状態, ひずみ誘起擬ランダウレベル, バレーホール効果, ハルダンキラルエッジ電流のトポロジカル輸送を示す1次元および2次元フォック状態格子を構築した。本研究では、光の位相状態を量子状態まで拡張し、凝縮物質物理学の位相位相相と回路量子電磁力学を橋渡し、複数の共振器の量子状態を制御する新しい自由度を提供する。

Topological photonics provides a novel platform to explore topological physics beyond traditional electronic materials and stimulates promising applications in topologically protected light transport and lasers. Classical degrees of freedom such as polarizations and wavevectors are routinely used to synthesize topological light modes. Beyond the classical regime, the inherent quantum nature of light gives birth to a wealth of fundamentally distinct topological states, which offer topological protection in quantum information processing. Here we implement such experiments on topological states of quantized light in a superconducting circuit, on which three resonators are tunably coupled to a gmon qubit. We construct one- and two-dimensional Fock-state lattices where topological transport of zero-energy states, strain-induced pseudo-Landau levels, the valley Hall effect and Haldane chiral edge currents are demonstrated. Our study extends the topological states of light to the quantum regime, bridges topological phases of condensed matter physics with circuit quantum electrodynamics, and offers a new degree of freedom in controlling the quantum states of multiple resonators.

翻訳日:2023-02-02 02:24:49 公開日:2022-08-06

# 複素弱値測定による一般量子相関の動作特性評価

Operational characterization of general quantum correlation via complex weak value measurement ( http://arxiv.org/abs/2208.03442v1 )

ライセンス: Link先を確認

Agung Budiyono and Hermawan K. Dipojono

(参考訳) 過去20年間、量子相関の理解は絡み合いよりも一般的であり、分離可能な状態でさえ古典的な対象によってエミュレートできない相関を生じる可能性がある。このような一般的な非古典的相関は、基礎的な観点から興味深いだけでなく、様々な量子情報処理タスクや量子技術において資源として認識されている。本稿では, 2成分系における一般量子相関を, 選択後の弱い測定値を用いた直接実験室操作の観点から評価する。局所基底の弱測定により得られた弱値の虚部と、他の局所基底の事後選択とに基づく量と、これら2つの基底の可能な全ての選択に対する最適化手順を定義する。一般の量子相関の量子化器に対する一定の要求を満たすことを示す。不確かさの最小の真の量子共有として統計的に解釈できる。一般の純粋な状態に対する絡み合いの忠実な証人であり、絡み合いの線形エントロピーのスケールされた平方根に観測可能な下界を与える。次に,各状態における局所的射影測定の結果に基づいて,任意の局所的測定基準の最適推定における最小平均絶対誤差として,多部状態における一般量子相関に関する情報理論的解釈を提案する。

The last two decades have witnessed significant progress in understanding quantum correlations more general than entanglement, wherein even a separable state may yield correlations that cannot be emulated by any classical object. Such a general nonclassical correlation is not only intriguing from the fundamental point of view, but it has also been recognized as a resource in a variety of quantum information processing tasks and quantum technology. Here, we propose a characterization of the general quantum correlation in a bipartite system in terms of direct laboratory operations using weak measurement with postselection. We define a quantity based on the imaginary part of weak values obtained via weak measurement of a local basis followed by a postselection of another local basis, and an optimization procedure over all possible choices of the two bases. We show that it satisfies certain desirable requirements for a quantifier of general quantum correlations. It may be statistically interpreted as the minimum genuine quantum share of uncertainty. It is a faithful witness of entanglement for general pure states, giving an observable lower bound to a scaled square root of the linear entropy of entanglement. We then suggest an information-theoretic interpretation of the general quantum correlation in a multipartite state as the minimum mean absolute error in an optimal estimation of any local measurement basis, based on the outcomes of local projective measurement on the state, in the worst-case scenario.

翻訳日:2023-02-02 02:24:30 公開日:2022-08-06

# ベル不等式違反と関節マッピングのゲームに基づく測定における相関の保存

Conservation of correlation in measurement underlying the violation of Bell inequalities and a game of joint mapping ( http://arxiv.org/abs/2208.03441v1 )

ライセンス: Link先を確認

Agung Budiyono

(参考訳) ベルの不等式に違反する量子測定に何が必要か? 測度によらず、スピン-$\frac{1}{2}$粒子 (qubit) にスピンの定値を割り当てることができると仮定すると、c-値のスピン変数と呼ばれるが、これは任意の連続実数を取ることができる。さらに、測定値のc値のスピン変数を、可能な値の連続範囲から二進標準量子スピン値 $\pm 1$ にマッピングし、二部相関を保存する。ここで、そのようなc値スピン変数を実際に構成できることを示す。したがって、このモデルでは、状態が絡み合っているときベルの不等式を破るように量子測定を強制する相関の保存の要件であると主張することができる。次に、実数対の特定のアンサンブルを2進数対のペアに独立にマッピングするよう2つの当事者に要求し、相関が保存されるという条件のもとに統計ゲームについて議論する。相関の保存により、ゲームはベルの定理を尊重し、古典的戦略(すなわち局所的戦略と決定論的戦略)が勝てないゲームが存在することを意味する。一方、絡み合ったスピン-$\frac{1}{2}$粒子と局所的な量子スピン測定のための回路のアンサンブルにアクセスする量子戦略は、ゲームに勝つために使用できる。

What compels quantum measurement to violate the Bell inequalities? Suppose that regardless of measurement, one can assign to a spin-$\frac{1}{2}$ particle (qubit) a definite value of spin, called a c-valued spin variable, which may take any continuous real value. Suppose further that measurement maps the c-valued spin variable from the continuous range of possible values onto the binary standard quantum spin values $\pm 1$ while preserving the bipartite correlation. Here, we show that such c-valued spin variables can indeed be constructed. In this model, one may therefore argue that it is the requirement of conservation of correlation which compels quantum measurement to violate the Bell inequalities when the prepared state is entangled. We then discuss a statistical game which captures the model of measurement, wherein two parties are asked to independently map a specific ensemble of pairs of real numbers onto pairs of binary numbers $\pm 1$, under the requirement that the correlation is preserved. The conservation of correlation forces the game to respect the Bell theorem, which implies that there is a class of games that no classical (i.e., local and deterministic) strategy can ever win. On the other hand, a quantum strategy with access to an ensemble of entangled spin-$\frac{1}{2}$ particles and circuits for local quantum spin measurement can be used to win the game.
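
The gap between local deterministic strategies and entangled spin-$\frac{1}{2}$ pairs mentioned above can be illustrated with the standard CHSH form of the Bell inequality. This sketch is not the game defined in the paper. It assumes the well-known singlet correlation E(a, b) = -cos(a - b) and a conventional choice of measurement angles, and the function names are hypothetical.

```python
import itertools
import math

def chsh(e00, e01, e10, e11):
    """CHSH combination S = E(a0,b0) + E(a0,b1) + E(a1,b0) - E(a1,b1)."""
    return e00 + e01 + e10 - e11

# Local deterministic strategies: each party fixes an outcome +/-1 per setting.
classical_max = max(
    abs(chsh(a0 * b0, a0 * b1, a1 * b0, a1 * b1))
    for a0, a1, b0, b1 in itertools.product((-1, 1), repeat=4)
)

def singlet_correlation(theta_a, theta_b):
    """Spin correlation of the singlet state for coplanar measurement angles."""
    return -math.cos(theta_a - theta_b)

# Conventional angle choice (an assumption of this sketch).
a0, a1 = 0.0, math.pi / 2
b0, b1 = math.pi / 4, -math.pi / 4
quantum_value = abs(chsh(
    singlet_correlation(a0, b0), singlet_correlation(a0, b1),
    singlet_correlation(a1, b0), singlet_correlation(a1, b1),
))
print(classical_max, quantum_value)
```

Enumerating all 16 deterministic assignments bounds |S| by 2, while the singlet correlations reach 2√2, which is the kind of separation the abstract attributes to the conservation of correlation.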

翻訳日:2023-02-02 02:24:09 公開日:2022-08-06

# 時間発展テンソルネットワークアルゴリズムの正規化スキーム

Regularized scheme of time evolution tensor network algorithms ( http://arxiv.org/abs/2208.03436v1 )

ライセンス: Link先を確認

Li-Xiang Cen

(参考訳) 量子格子系の時間発展をシミュレートするために正規化分解法を提案する。トロッター分解を超越すると、プロパゲーターのコンパクト構造は高階ベーカー・カンベル・ハウスドルフ級数を示す。テンソルネットワークアルゴリズムの正規化スキームは、ハイゼンベルク型あるいは北エフ型相互作用を持つスピン格子系の基底状態エネルギーを決定するために開発される。ベンチマーク計算は正規化アルゴリズムの2つの利点を明らかにしている: 安定収束を持ち、キタエフスピン液体に単純な更新法を適用する場合でもバイアスに影響を受けない; 生成したテンソルネットワークの収縮は、計算コストをはるかに低くして急速に収束し、ボトルネックを緩和して物理的期待値を計算する。

A regularized factorization is proposed to simulate time evolution for quantum lattice systems. Transcending the Trotter decomposition, the resulting compact structure of the propagator indicates a high-order Baker-Campbell-Hausdorff series. A regularized scheme of tensor network algorithms is then developed to determine the ground state energy for spin lattice systems with Heisenberg or Kitaev-type interactions. Benchmark calculations reveal two distinct merits of the regularized algorithm: it has stable convergence, immune to the bias even in applying the simple update method to the Kitaev spin liquid; contraction of the produced tensor network can converge rapidly with much lower computing cost, relaxing the bottleneck in calculating the physical expectation value.

翻訳日:2023-02-02 02:23:44 公開日:2022-08-06

# 複数の補ラベルによる学習

Learning with Multiple Complementary Labels ( http://arxiv.org/abs/1912.12927v4 )

ライセンス: Link先を確認

Lei Feng, Takuo Kaneko, Bo Han, Gang Niu, Bo An, Masashi Sugiyama

(参考訳) 補ラベル(CL)は単に例の不正なクラスを示すが、CLで学習すると、正しいクラスを予測できる複数のクラス分類器が得られる。残念ながら、問題設定では各例に1つのCLしか使用できないため、ラベル付け者が簡単に複数のCL(MCL)を識別できるため、そのポテンシャルは特に制限される。本稿では, MCL を各例に適用可能な新しい問題設定法と, MCL を学習するための2つの方法を提案する。まず、MCLを複数の単一のCLに分解する2つのラッパーを設計し、CLで学習するためにどんな方法でも使えるようにした。しかし、MCLが保持する監視情報は、分解後に概念的に希釈される。したがって、2つ目の方法では、バイアスのないリスク推定器を導出し、MCLの集合を全体として処理し、推定誤差境界を持つように最小化する。適切に選択された上限を最小化する第2の方法をさらに改善する。実験によると、以前の方法はMCLで学ぶのにうまく機能するが、後者の方が優れている。

A complementary label (CL) simply indicates an incorrect class of an example, but learning with CLs results in multi-class classifiers that can predict the correct class. Unfortunately, the problem setting only allows a single CL for each example, which notably limits its potential since our labelers may easily identify multiple CLs (MCLs) for one example. In this paper, we propose a novel problem setting to allow MCLs for each example and two ways for learning with MCLs. In the first way, we design two wrappers that decompose MCLs into many single CLs, so that we can use any method for learning with CLs. However, the supervision information that MCLs hold is conceptually diluted after decomposition. Thus, in the second way, we derive an unbiased risk estimator; minimizing it processes each set of MCLs as a whole and possesses an estimation error bound. We further improve the second way into minimizing properly chosen upper bounds. Experiments show that the former way works well for learning with MCLs but the latter is even better.
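
The decomposition idea described above can be sketched in a few lines. The abstract does not say how the two wrappers differ, so the code below shows only one plausible decomposition (one single-CL example per complementary label). The names `decompose_mcl` and `candidate_classes` and the toy data are hypothetical.

```python
from typing import Hashable, Iterable, List, Set, Tuple

def decompose_mcl(
    dataset: Iterable[Tuple[Hashable, Set[int]]]
) -> List[Tuple[Hashable, int]]:
    """Turn (example, set of complementary labels) pairs into
    (example, single complementary label) pairs, one per label."""
    return [(x, cl) for x, cls in dataset for cl in sorted(cls)]

def candidate_classes(cls: Set[int], num_classes: int) -> Set[int]:
    """Classes that a whole set of complementary labels does not rule out."""
    return set(range(num_classes)) - cls

# Toy data: example identifiers paired with their complementary label sets.
data = [("img_0", {1, 3}), ("img_1", {0})]
single_cl = decompose_mcl(data)
print(single_cl)
print(candidate_classes({1, 3}, num_classes=4))
```

After decomposition each pair carries only one excluded class, so the joint statement "neither class 1 nor class 3" is no longer visible in a single training pair. This is one way to read the dilution that motivates treating each label set as a whole.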

翻訳日:2023-01-17 02:05:33 公開日:2022-08-06

# 敵対的腐敗の存在下での文脈探索

Contextual Search in the Presence of Adversarial Corruptions ( http://arxiv.org/abs/2002.11650v6 )

ライセンス: Link先を確認

Akshay Krishnamurthy, Thodoris Lykouris, Chara Podimata, and Robert Schapire

(参考訳) 本研究では,より高次元のバイナリ検索の一般化である文脈探索について検討し,機能ベースの動的価格設定などの設定を捉えた。この問題の標準的な定式化は、エージェントが特定の同種反応モデルに従って作用すると仮定する。しかし実際には、一部の反応は逆向きに腐敗することがある。既存のアルゴリズムは、仮定された応答モデルが全てのエージェントに(ほぼ)正確であることに大きく依存しており、そのような任意的な誤特定の存在下でも性能が劣る。エージェントが基盤となる応答モデルと矛盾する方法で振る舞うことができる場合、文脈探索の研究を開始する。特に,多次元二元探索法に基づくアルゴリズムと勾配勾配に基づくアルゴリズムの2つを提案する。これらのアルゴリズムは, 敵対的汚職の欠如と, それらの性能が, エージェント数に応じて優雅に低下していることが示され, 敵対的雑音モデルにおける文脈探索の最初の結果となった。学習理論,ゲーム理論,高次元幾何学,凸解析から着想を得た。

We study contextual search, a generalization of binary search in higher dimensions, which captures settings such as feature-based dynamic pricing. Standard formulations of this problem assume that agents act in accordance with a specific homogeneous response model. In practice, however, some responses may be adversarially corrupted. Existing algorithms heavily depend on the assumed response model being (approximately) accurate for all agents and have poor performance in the presence of even a few such arbitrary misspecifications. We initiate the study of contextual search when some of the agents can behave in ways inconsistent with the underlying response model. In particular, we provide two algorithms, one based on multidimensional binary search methods and one based on gradient descent. We show that these algorithms attain near-optimal regret in the absence of adversarial corruptions and their performance degrades gracefully with the number of such agents, providing the first results for contextual search in any adversarial noise model. Our techniques draw inspiration from learning theory, game theory, high-dimensional geometry, and convex analysis.
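
As a point of reference for the abstract above, the one-dimensional, corruption-free special case of contextual search is ordinary binary search over a hidden value with accept/reject feedback, as in posted-price selling. The sketch below shows only that baseline. It is not one of the paper's algorithms, and `binary_search_value` and `hidden_value` are hypothetical names.

```python
def binary_search_value(buys, rounds, low=0.0, high=1.0):
    """Estimate a hidden value in [low, high] from binary accept/reject feedback.

    `buys(price)` returns True when the hidden value is at least `price`.
    """
    for _ in range(rounds):
        price = (low + high) / 2
        if buys(price):
            low = price   # value >= price: discard the lower half
        else:
            high = price  # value < price: discard the upper half
    return low, high

hidden_value = 0.6180339887
low, high = binary_search_value(lambda p: hidden_value >= p, rounds=20)
print(low, high)
```

Each round halves the interval, but a single corrupted response would permanently discard the half that contains the true value. That fragility is the failure mode the corruption-robust setting is meant to address.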

翻訳日:2022-12-28 15:00:02 公開日:2022-08-06

# 3次元点雲の部分分割に対するクロスシェイプ注意

Cross-Shape Attention for Part Segmentation of 3D Point Clouds ( http://arxiv.org/abs/2003.09053v4 )

ライセンス: Link先を確認

Marios Loizou, Dmitry Petrov, Melinos Averkiou, Evangelos Kalogerakis

(参考訳) 本稿では,3次元形状分割を目的とし,コレクション内の形状にまたがる点的特徴表現を伝播する手法を提案する。これは、異なる形状の点間の相互作用の度合いを評価し、特徴伝播を媒介するクロス形状の注意操作によって達成される。各テスト形状について,このようなクロス形状の注意操作を行うのに適した入力コレクション内の形状を求める。得られたポイントワイズ特徴表現は,実験で示されたように,より一貫性のある3次元形状分割結果をもたらす。

We present a method that propagates point-wise feature representations across shapes within a collection for the purpose of 3D shape segmentation. This is achieved through a cross-shape attention operation that assesses the degree of interaction between points on different shapes and mediates feature propagation. For each test shape, our method finds shapes in an input collection that are suited for executing such cross-shape attention operations. The resulting point-wise feature representations lead to more consistent 3D shape segmentation results, as demonstrated in our experiments.
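
The abstract does not give the form of the cross-shape attention operation. As a rough illustration, the sketch below uses generic scaled dot-product attention, with the points of one shape as queries and the points of a second shape as keys and values. This is an assumption of the sketch rather than the paper's architecture, and `cross_attention` and the toy features are hypothetical.

```python
import math

def cross_attention(queries, keys, values):
    """Scaled dot-product attention from points of one shape to points of another.

    queries: per-point features of the test shape (n x d)
    keys, values: per-point features of a second shape (m x d and m x c)
    Returns the attention weights (n x m) and the propagated features (n x c).
    """
    d = len(queries[0])
    weights, outputs = [], []
    for q in queries:
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d) for k in keys]
        top = max(scores)  # subtract the max for numerical stability
        exps = [math.exp(s - top) for s in scores]
        total = sum(exps)
        row = [e / total for e in exps]
        weights.append(row)
        outputs.append([sum(w * v[c] for w, v in zip(row, values))
                        for c in range(len(values[0]))])
    return weights, outputs

# Toy per-point features for two shapes (2 points and 3 points).
shape_a = [[1.0, 0.0], [0.0, 1.0]]
shape_b = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
weights, propagated = cross_attention(shape_a, shape_b, shape_b)
print(weights[0])
```

Each row of `weights` sums to one and measures how strongly a point on the first shape attends to each point on the second. The weighted sum then plays the role of the propagated point-wise feature.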

翻訳日:2022-12-21 22:14:52 公開日:2022-08-06

# NAS-Navigator: 説明可能なワンショットディープニューラルネットワーク合成のためのビジュアルステアリング

NAS-Navigator: Visual Steering for Explainable One-Shot Deep Neural Network Synthesis ( http://arxiv.org/abs/2009.13008v3 )

ライセンス: Link先を確認

Anjul Tyagi, Cong Xie, Klaus Mueller

(参考訳) 近年のディープラーニング分野の進歩は、いくつかのアプリケーションにおいて非常に大きなニューラルネットワークの有効性を示している。しかし、これらのディープニューラルネットワークのサイズが大きくなるにつれて、良い結果を得るために多くのパラメータを設定するのがますます難しくなっている。現在、アナリストは、労働集約的で時間を要する多くの異なる設定とパラメータ設定を実験する必要があります。一方、ニューラルネットワークアーキテクチャ探索のための完全自動化技術の能力は、人間の専門知識がなくても制限される。この問題に対処するため,我々は,ワンショットアーキテクチャ探索技術に基づいて,ニューラルネットワークアーキテクチャ最適化のタスクをグラフ空間探索として定式化する。このアプローチでは、全ての候補アーキテクチャのスーパーグラフをワンショットで訓練し、最適なニューラルネットワークをサブグラフとして識別する。本稿では,分析者が効率的にサブグラフ空間を構築し,ドメイン知識を注入することでネットワーク探索をガイドするフレームワークを提案する。基本的なニューラルネットワークコンポーネントで構成されたネットワークアーキテクチャ空間から始めて、アナリストはワンショット検索スキームを通じて、最も有望なコンポーネントを効果的に選択することができる。このテクニックを反復的に適用することで、アナリストは与えられたアプリケーションの最適なニューラルネットワークアーキテクチャに収束することができる。探索中、アナリストは、探索空間の散在した視覚化から提供された知識を利用して、異なるコンポーネントを編集し、より高速な収束を導くことができる。我々は,複数のディープラーニング研究者と共同でインタフェースを設計し,その最終効果をユーザ・スタディと2つのケース・スタディで評価した。

Recent advancements in the area of deep learning have shown the effectiveness of very large neural networks in several applications. However, as these deep neural networks continue to grow in size, it becomes more and more difficult to configure their many parameters to obtain good results. Presently, analysts must experiment with many different configurations and parameter settings, which is labor-intensive and time-consuming. On the other hand, the capacity of fully automated techniques for neural network architecture search is limited without the domain knowledge of human experts. To deal with the problem, we formulate the task of neural network architecture optimization as a graph space exploration, based on the one-shot architecture search technique. In this approach, a super-graph of all candidate architectures is trained in one-shot and the optimal neural network is identified as a sub-graph. In this paper, we present a framework that allows analysts to effectively build the solution sub-graph space and guide the network search by injecting their domain knowledge. Starting with the network architecture space composed of basic neural network components, analysts are empowered to effectively select the most promising components via our one-shot search scheme. Applying this technique in an iterative manner allows analysts to converge to the best performing neural network architecture for a given application. During the exploration, analysts can use their domain knowledge aided by cues provided from a scatterplot visualization of the search space to edit different components and guide the search for faster convergence. We designed our interface in collaboration with several deep learning researchers and its final effectiveness is evaluated with a user study and two case studies.

翻訳日:2022-10-13 20:37:50 公開日:2022-08-06

# 口腔癌におけるサイズと顕微鏡の特徴抽出と分類のための深層学習:畳み込みニューラルネットワークの強化

Deep Learning for Size and Microscope Feature Extraction and Classification in Oral Cancer: Enhanced Convolution Neural Network ( http://arxiv.org/abs/2208.07855v1 )

ライセンス: Link先を確認

Prakrit Joshi, Omar Hisham Alsadoon, Abeer Alsadoon, Nada AlSallami, Tarik A. Rashid, P.W.C. Prasad, Sami Haddad

(参考訳) 背景と目的: 過度に適合する問題は、深層学習技術が口腔癌の画像分類でうまく実装されていない理由である。本研究の目的は,畳み込みニューラルネットワークを用いたDeep Learningアルゴリズムを用いて,必要な次元削減特徴マップを正確に作成するためのオーバーフィッティングの削減である。方法論:提案システムは,自動エンコーダ技術を用いて特徴抽出の効率を高め,情報を圧縮する拡張畳み込みニューラルネットワークで構成されている。この手法では、入力データを生成するためにアンプールとデコンボリューションを行い、入力データと出力データの差を最小限に抑える。さらに、入力データセットから特徴特徴を抽出し、それらの特徴から入力データを再生し、ネットワークを学習して過度な適合を低減する。結果: 共焦点レーザー内視鏡(CLE)画像の異なるサンプル画像群を用いて, 精度の異なる処理時間値が得られた。その結果,提案手法は現在のシステムよりも優れていることがわかった。さらに,本システムでは,分類精度を5～5.5%向上し,平均処理時間を20～30ミリ秒短縮した。結論:本システムでは,CLE画像から異なる解剖学的位置の口腔癌細胞の正確な分類に焦点を当てた。最後に, オーバーフィッティング問題を解決するオートエンコーダ法を用いて, 精度と処理時間を向上させる。

Background and Aim: Overfitting has prevented deep learning from being successfully applied to oral cancer image classification. The aim of this research is to reduce overfitting so that a Deep Learning algorithm based on a Convolutional Neural Network can accurately produce the required dimension-reduced feature map. Methodology: The proposed system consists of an Enhanced Convolutional Neural Network that uses an autoencoder technique to increase the efficiency of the feature extraction process and to compress information. In this technique, unpooling and deconvolution are performed to regenerate the input data, minimizing the difference between input and output data. Moreover, the network extracts characteristic features from the input data set and learns to regenerate the input data from those features, which reduces overfitting. Results: Different accuracy and processing time values are achieved with different sample image groups of Confocal Laser Endomicroscopy (CLE) images. The results show that the proposed solution is better than the current system. Moreover, the proposed system improves the classification accuracy by 5 to 5.5% on average and reduces the average processing time by 20 to 30 milliseconds. Conclusion: The proposed system focuses on the accurate classification of oral cancer cells from different anatomical locations in CLE images. Finally, this study improves accuracy and processing time by using the autoencoder method, which addresses the overfitting problem.

翻訳日:2022-08-28 22:23:21 公開日:2022-08-06

# 致死的疾患の早期警戒のための効率的な新規性検出法

Efficient Novelty Detection Methods for Early Warning of Potential Fatal Diseases ( http://arxiv.org/abs/2208.04732v1 )

ライセンス: Link先を確認

Sèdjro Salomon Hotegni (1), Ernest Fokoué (2) ((1) African Institute for Mathematical Sciences - Rwanda, (2) Rochester Institute of Technology - United States)

(参考訳) CHE(Critical Health Episodes)のような致命的な病気は、集中治療室に入院した患者にとって真の危険である。これらのエピソードは臓器の損傷や死を引き起こすことがある。それでも、時間内に診断することは、その不便さを大幅に減らすだろう。そこで本研究では,急性低血圧エピソードや頻拍エピソードなどのCHEの早期警戒システムの構築に焦点をあてた。予測の正確性を高めるため、観測期間(観測窓)と臨界事象が起こる期間(目標窓)との間に1時間の間隔が考慮された。 MIMIC IIデータセットを用いて,提案システムの性能評価を行った。このシステムはまず、3つの異なるモードを使って追加機能を抽出する。そして、相互情報ゲイン機能重要度を用いて、最も関連性の高い特徴の選択を可能にする特徴選択プロセスを実施した。最後に,高性能予測モデルLightGBMを用いてエピソード分類を行った。 MIG-LightGBMと呼ばれるこのアプローチは、イベントリコール(ER)、縮小精度(RP)、平均予測時間(aveAT)、平均False Alarms(aveFA)、イベントF1スコア(EF1スコア)の5つの異なる指標を用いて評価された。したがって,CHE の早期予測には大きな AveAT だけでなく,大きな EF1 スコアと低い AveFA も有効であると考えられる。予測モデルとして Extreme Gradient Boosting, Support Vector Classification あるいは Naive Bayes を用いたシステムと比較すると,提案システムは非常に支配的であった。また、階層型学習アプローチよりも優れていることも確認した。

Fatal diseases, as Critical Health Episodes (CHEs), represent real dangers for patients hospitalized in Intensive Care Units. These episodes can lead to irreversible organ damage and death. Nevertheless, diagnosing them in time would greatly reduce their harm. This study therefore focused on building a highly effective early warning system for CHEs such as Acute Hypotensive Episodes and Tachycardia Episodes. To allow for early prediction, a gap of one hour was considered between the observation periods (Observation Windows) and the periods during which a critical event can occur (Target Windows). The MIMIC II dataset was used to evaluate the performance of the proposed system. The system first extracts additional features using three different modes. Then, a feature selection process based on the Mutual Information Gain feature importance selects the most relevant features. Finally, the high-performance predictive model LightGBM is used to perform episode classification. This approach, called MIG-LightGBM, was evaluated using five different metrics: Event Recall (ER), Reduced Precision (RP), average Anticipation Time (aveAT), average False Alarms (aveFA), and Event F1-score (EF1-score). A method is therefore considered highly efficient for the early prediction of CHEs if it exhibits not only a large aveAT but also a large EF1-score and a low aveFA. Compared to systems using Extreme Gradient Boosting, Support Vector Classification or Naive Bayes as a predictive model, the proposed system was found to be highly dominant. It also confirmed its superiority over the Layered Learning approach.
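
The feature selection step described above ranks features by their mutual information with the episode label. The sketch below shows that ranking for discrete toy features only. The estimator used in the paper is not specified in the abstract, continuous vital-sign features would first need binning or a different estimator, and the feature names and helper functions here are hypothetical.

```python
import math
from collections import Counter

def mutual_information(xs, ys):
    """Mutual information (in nats) between two discrete sequences."""
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    return sum(
        (c / n) * math.log((c / n) / ((px[x] / n) * (py[y] / n)))
        for (x, y), c in pxy.items()
    )

def rank_features(columns, labels):
    """Sort feature names by decreasing mutual information with the labels."""
    scores = {name: mutual_information(col, labels) for name, col in columns.items()}
    return sorted(scores, key=scores.get, reverse=True), scores

labels = [0, 0, 1, 1]
columns = {
    "informative": [0, 0, 1, 1],    # copies the label
    "uninformative": [0, 1, 0, 1],  # statistically independent of the label
}
ranking, scores = rank_features(columns, labels)
print(ranking, scores)
```

The top-ranked features would then be passed to the classifier. In the described system that classifier is LightGBM, which is not reproduced in this sketch.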

翻訳日:2022-08-10 13:21:50 公開日:2022-08-06

# TripHLApan:トリプルコーディングマトリックスと転写学習に基づくHLA分子結合ペプチドの予測

TripHLApan: predicting HLA molecules binding peptides based on triple coding matrix and transfer learning ( http://arxiv.org/abs/2208.04314v1 )

ライセンス: Link先を確認

Meng Wang, Chuqi Lei, Jianxin Wang, Yaohang Li and Min Li

(参考訳) ヒト白血球抗原(HLA)は、ヒト免疫領域において重要な分子ファミリーであり、外部の脅威を認識し、T細胞にペプチドを提示することで免疫応答を誘導する。近年では、特定の免疫応答を誘導する腫瘍ワクチンの合成ががん治療の最前線となっている。ペプチドとhlaの結合パターンを計算的にモデル化することで、腫瘍ワクチンの開発を大いに加速することができる。しかし,ほとんどの予測手法の性能は極めて限定的であり,モデリングの基盤として既存の生物学的知識の分析を十分に活用することはできない。本稿では,HLA分子ペプチド結合予測のためのパン特異的予測モデルTripHLApanを提案する。 TripHLApanは、3重符号化行列、BiGRU+アテンションモデル、転送学習戦略を統合することで、強力な予測能力を示す。総合的な評価は、異なる試験環境でHLA-IおよびHLA-IIペプチド結合を予測するTripHLApanの有効性を示す。 HLA-Iの予測力は、最新のデータセットでさらに実証される。また,トリプラパンはメラノーマ患者の試料中に強い結合再構成能を有することが示された。結論として、TripHLApanは腫瘍ワクチンの合成のためのHLA-IおよびHLA-II分子ペプチドの結合を予測する強力なツールである。

Human leukocyte antigen (HLA) is an important molecule family in the field of human immunity, which recognizes foreign threats and triggers immune responses by presenting peptides to T cells. In recent years, the synthesis of tumor vaccines to induce specific immune responses has become the forefront of cancer treatment. Computationally modeling the binding patterns between peptides and HLA can greatly accelerate the development of tumor vaccines. However, the performance of most prediction methods is very limited, and they cannot take full advantage of existing biological knowledge as the basis of modeling. In this paper, we propose TripHLApan, a novel pan-specific prediction model, for HLA molecular peptide binding prediction. TripHLApan exhibits powerful prediction ability by integrating a triple coding matrix, BiGRU + Attention models, and a transfer learning strategy. The comprehensive evaluations demonstrate the effectiveness of TripHLApan in predicting HLA-I and HLA-II peptide binding in different test environments. Its predictive power for HLA-I is further demonstrated on the latest data set. In addition, we show that TripHLApan has strong binding reconstitution ability in the samples of a melanoma patient. In conclusion, TripHLApan is a powerful tool for predicting the binding of HLA-I and HLA-II molecular peptides for the synthesis of tumor vaccines.

翻訳日:2022-08-10 13:08:24 公開日:2022-08-06

# smart explorer:インタラクティブな探索による密集したクラッター内の物体認識

Smart Explorer: Recognizing Objects in Dense Clutter via Interactive Exploration ( http://arxiv.org/abs/2208.03496v1 )

ライセンス: Link先を確認

Zhenyu Wu, Ziwei Wang, Zibu Wei, Yi Wei and Haibin Yan

(参考訳) 密集クラッタにおける物体の認識は、把握、梱包、再配置など、幅広いロボット操作タスクにおいて重要な役割を担っている。しかし, 従来の視覚認識モデルでは, 症例間の有意な咬合による物体の欠落や, 物体の混み合いが高まる視覚の曖昧さによる不正確な予測が一般的である。本稿では,すべての物体を密集したクラッタで認識するための,smart explorerと呼ばれる対話型探索フレームワークを提案する。われわれのスマートエクスプローラーは、認識性能を最大化するためにクラッタと物理的に相互作用し、動作回数を最小限に抑えながら、最適な精度と効率のトレードオフによって、偽陽性と負の低減を効果的に行うことができる。具体的には,まずクラッタの多視点rgb-d画像を収集し,対応する点雲を再構成する。ビュー間でrgbイメージのインスタンスセグメンテーションを集約することにより、既存のクラスと各クラスのオブジェクト数を予測するクラッターのインスタンス毎ポイントクラウドパーティションを取得する。有効物理相互作用のためのプッシュ動作は、インスタンスセグメンテーションエントロピーとマルチビューオブジェクトの不一致からなる認識の不確実性を大幅に低減するために生成される。したがって、密閉クラッタにおける物体認識の最適精度-効率トレードオフは、反復的なインスタンス予測と物理的相互作用によって達成される。大規模な実験では、スマートエクスプローラーがいくつかのアクションだけで有望な認識精度を獲得し、ランダムなプッシュを大きなマージンで上回ります。

Recognizing objects in dense clutter accurately plays an important role in a wide variety of robotic manipulation tasks including grasping, packing, rearranging and many others. However, conventional visual recognition models usually miss objects because of the significant occlusion among instances and make incorrect predictions due to the visual ambiguity caused by high object crowdedness. In this paper, we propose an interactive exploration framework called Smart Explorer for recognizing all objects in dense clutter. Our Smart Explorer physically interacts with the clutter to maximize the recognition performance while minimizing the number of motions, so that false positives and negatives can be alleviated effectively with the optimal accuracy-efficiency trade-off. Specifically, we first collect multi-view RGB-D images of the clutter and reconstruct the corresponding point cloud. By aggregating the instance segmentation of RGB images across views, we acquire the instance-wise point cloud partition of the clutter, through which the existing classes and the number of objects for each class are predicted. The pushing actions for effective physical interaction are generated to sizably reduce the recognition uncertainty, which consists of the instance segmentation entropy and multi-view object disagreement. Therefore, the optimal accuracy-efficiency trade-off of object recognition in dense clutter is achieved via iterative instance prediction and physical interaction. Extensive experiments demonstrate that our Smart Explorer achieves promising recognition accuracy with only a few actions and outperforms random pushing by a large margin.

翻訳日:2022-08-10 13:04:11 公開日:2022-08-06

# autoshape: 時系列クラスタリングのためのautoencoder-shapeletアプローチ

AUTOSHAPE: An Autoencoder-Shapelet Approach for Time Series Clustering ( http://arxiv.org/abs/2208.04313v1 )

ライセンス: Link先を確認

Guozhong Li, Byron Choi, Jianliang Xu, Sourav S Bhowmick, Daphne Ngar-yin Mah, and Grace Lai-Hung Wong

(参考訳) 時系列シェープレットは、最近時系列クラスタリング(TSC)に有効であることが判明した識別サブシーケンスである。シェープレットはクラスタの解釈に便利である。したがって、TSCの主な課題は、異なるクラスタを識別する高品質な可変長形状レットを見つけることである。本稿では,新しいオートエンコーダ・シェープレットアプローチ(autoshape)を提案する。このアプローチは,教師なしの方法でシェープレットを決定する際に,オートエンコーダとシェープレットの両方を利用する最初の研究である。オートエンコーダは高品質なシェープレットを学習するために特別に設計されている。より具体的には、潜在表現学習を指導するために、異なる変数の可変長シェープレット候補(時系列サブシーケンス)の統一埋め込みを学ぶために、最新の自己教師付き損失を用い、統一空間における識別埋め込みを選択するための多様性損失を提案する。本稿では,クラスタリングのための元の時系列空間におけるシェープレットを復元する再構成損失について紹介する。最後に、学習中のクラスタリング性能をAUTOSHAPEに知らせるため、Davies Bouldin index(DBI)を採用する。 AUTOSHAPEについて広範な実験を行った。単変量時系列(UTS)におけるクラスタリング性能を評価するために,UCRアーカイブデータセットを用いたAUTOSHAPEと15の代表的な手法を比較した。多変量時系列(MTS)の性能を調べるため,30UEAアーカイブデータセット上でAUTOSHAPEを5つの競合手法で評価した。その結果、AUTOSHAPEは、比較したすべての手法の中で最高であることがわかった。 3つのUTSケーススタディと1つのMSSケーススタディにおいて,クラスタをシェープレットで解釈し,それぞれ興味深い直感を得ることができる。

Time series shapelets are discriminative subsequences that have been recently found effective for time series clustering (TSC). The shapelets are convenient for interpreting the clusters. Thus, the main challenge for TSC is to discover high-quality variable-length shapelets to discriminate different clusters. In this paper, we propose a novel autoencoder-shapelet approach (AUTOSHAPE), which is the first study to take advantage of both autoencoders and shapelets for determining shapelets in an unsupervised manner. An autoencoder is specially designed to learn high-quality shapelets. More specifically, for guiding the latent representation learning, we employ the latest self-supervised loss to learn the unified embeddings for variable-length shapelet candidates (time series subsequences) of different variables, and propose the diversity loss to select the discriminating embeddings in the unified space. We introduce the reconstruction loss to recover shapelets in the original time series space for clustering. Finally, we adopt the Davies-Bouldin index (DBI) to inform AUTOSHAPE of the clustering performance during learning. We present extensive experiments on AUTOSHAPE. To evaluate the clustering performance on univariate time series (UTS), we compare AUTOSHAPE with 15 representative methods using UCR archive datasets. To study the performance on multivariate time series (MTS), we evaluate AUTOSHAPE on 30 UEA archive datasets with 5 competitive methods. The results validate that AUTOSHAPE is the best among all the methods compared. We interpret clusters with shapelets and obtain interesting intuitions about clusters in three UTS case studies and one MTS case study.
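
The Davies-Bouldin index mentioned above scores a clustering from the clusters alone, which is why it can serve as an unsupervised training signal. The sketch below computes the standard index on toy one-dimensional clusters. How AUTOSHAPE feeds it into its loss is not described in the abstract and is not shown here, and the `tight` and `loose` data are made up for illustration.

```python
import math

def centroid(points):
    """Coordinate-wise mean of a list of points, returned as a tuple."""
    return tuple(sum(coords) / len(points) for coords in zip(*points))

def davies_bouldin(clusters):
    """Davies-Bouldin index for a list of clusters (each a list of point tuples).

    Lower values indicate compact, well-separated clusters.
    """
    centers = [centroid(c) for c in clusters]
    # Scatter of a cluster: mean distance of its points to its centroid.
    scatter = [sum(math.dist(p, mu) for p in c) / len(c)
               for c, mu in zip(clusters, centers)]
    k = len(clusters)
    # For each cluster, take the worst (largest) similarity to any other cluster.
    worst = [max((scatter[i] + scatter[j]) / math.dist(centers[i], centers[j])
                 for j in range(k) if j != i)
             for i in range(k)]
    return sum(worst) / k

tight = [[(0.0,), (2.0,)], [(10.0,), (12.0,)]]  # far-apart clusters
loose = [[(0.0,), (2.0,)], [(3.0,), (5.0,)]]    # nearly touching clusters
print(davies_bouldin(tight), davies_bouldin(loose))
```

For `tight`, each cluster has scatter 1 and the centroids are 10 apart, giving an index of 0.2. The nearly touching clusters in `loose` score worse (higher).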

翻訳日:2022-08-10 12:09:53 公開日:2022-08-06

# 情報収縮とグラフ拡張の観点から見たGNNのオーバースクワッシング

Oversquashing in GNNs through the lens of information contraction and graph expansion ( http://arxiv.org/abs/2208.03471v1 )

ライセンス: Link先を確認

Pradeep Kr. Banerjee, Kedar Karhadkar, Yu Guang Wang, Uri Alon, Guido Montúfar

(参考訳) 最近の研究で観察されているように、メッセージパッシンググラフニューラルネットワーク(GNN)における信号伝搬の質は、その表現力に強く影響する。特に、長距離相互作用に依存する予測タスクでは、ノード特徴の再帰的な集約が「オーバースクワッシング(oversquashing)」と呼ばれる望ましくない現象を引き起こすことがある。本稿では,情報収縮に基づいてオーバースクワッシングを解析する枠組みを提案する。我々の解析はフォン・ノイマンによる信頼性のある計算のモデルに導かれており、オーバースクワッシングをノイズのある計算グラフにおける信号のクエンチング(消失)として捉える新たな洞察を与える。これに基づき,オーバースクワッシングの緩和を目的としたグラフ再配線アルゴリズムを提案する。本アルゴリズムは、エクスパンダーグラフの構成に動機づけられたランダムな局所エッジフリップ操作を用いる。提案アルゴリズムのスペクトル拡張特性を,既存の曲率に基づく非局所的な再配線戦略と比較する。合成実験により、我々のアルゴリズムは一般に拡張の速度は遅いものの、全体として計算コストが低く、ノード次数を厳密に保存し、グラフを決して非連結にしないことを示した。

The quality of signal propagation in message-passing graph neural networks (GNNs) strongly influences their expressivity as has been observed in recent works. In particular, for prediction tasks relying on long-range interactions, recursive aggregation of node features can lead to an undesired phenomenon called "oversquashing". We present a framework for analyzing oversquashing based on information contraction. Our analysis is guided by a model of reliable computation due to von Neumann that lends a new insight into oversquashing as signal quenching in noisy computation graphs. Building on this, we propose a graph rewiring algorithm aimed at alleviating oversquashing. Our algorithm employs a random local edge flip primitive motivated by an expander graph construction. We compare the spectral expansion properties of our algorithm with that of an existing curvature-based non-local rewiring strategy. Synthetic experiments show that while our algorithm in general has a slower rate of expansion, it is overall computationally cheaper, preserves the node degrees exactly and never disconnects the graph.
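
The abstract states that the rewiring step is a random local edge flip that preserves node degrees exactly and never disconnects the graph. The sketch below shows one flip rule with those two properties, of the kind used in flip-based expander constructions: around a hub edge u-v, the pendant edges a-u and b-v are exchanged for a-v and b-u. Whether this matches the paper's primitive in every detail is an assumption, and the function names and the adjacency-set representation are hypothetical.

```python
import random
from collections import deque


def random_edge_flip(adj, rng):
    """Try one local flip on an undirected simple graph; return True if applied.

    adj maps each node to the set of its neighbours and is modified in place.
    """
    u = rng.choice(sorted(adj))
    if not adj[u]:
        return False
    v = rng.choice(sorted(adj[u]))
    # a hangs off u, b hangs off v; the new edges a-v and b-u must not exist yet.
    cand_a = [a for a in adj[u] if a != v and a not in adj[v]]
    cand_b = [b for b in adj[v] if b != u and b not in adj[u]]
    if not cand_a or not cand_b:
        return False
    a, b = rng.choice(sorted(cand_a)), rng.choice(sorted(cand_b))
    # Swap the two pendant edges around the hub edge u-v, which stays in place.
    adj[u].remove(a)
    adj[a].remove(u)
    adj[v].remove(b)
    adj[b].remove(v)
    adj[u].add(b)
    adj[b].add(u)
    adj[v].add(a)
    adj[a].add(v)
    return True


def is_connected(adj):
    """Breadth-first check that every node is reachable from an arbitrary start."""
    start = next(iter(adj))
    seen, queue = {start}, deque([start])
    while queue:
        for w in adj[queue.popleft()]:
            if w not in seen:
                seen.add(w)
                queue.append(w)
    return len(seen) == len(adj)
```

Because the hub edge u-v is kept, any path that used a-u or b-v can be rerouted through it, which is why connectivity survives each flip; every node loses one neighbour and gains one, so degrees are unchanged.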

翻訳日:2022-08-09 14:31:07 公開日:2022-08-06

# GCNN-LSTMハイブリッドニューラルネットワークによるアルゴリズム生成ドメインの検出

Detecting Algorithmically Generated Domains Using a GCNN-LSTM Hybrid Neural Network ( http://arxiv.org/abs/2208.03445v1 )

ライセンス: Link先を確認

Zheng Wang

(参考訳) ドメイン生成アルゴリズム(DGA)は、ボットネットがC&Cサーバとボットの間にステルス性の高いコマンド&コントロール(C&C)通信チャネルを構築するために使用される。 DGAは、多数の擬似ランダムなアルゴリズム生成ドメイン(AGD)を周期的に生成することができる。 AGD検出アルゴリズムは、既存のDGA技術に対する軽量で有望な解決策を提供する。本稿では,AGD検出のためのGCNN(gated convolutional neural network)-LSTM(long short-term memory)ハイブリッドニューラルネットワーク(GLHNN)を提案する。 GLHNNでは、GCNNがドメイン名から情報量の多い特徴を抽出し、LSTMがその特徴系列をさらに処理する。 GLHNNは、6クラスのDGAをカバーする代表的なAGDを用いて実験的に検証されている。 GLHNNを最先端の検出モデルと比較した結果、テストしたモデルの中で総合的に最も高い検出性能を示した。

Domain generation algorithm (DGA) is used by botnets to build a stealthy command and control (C&C) communication channel between the C&C server and the bots. A DGA can periodically produce a large number of pseudo-random algorithmically generated domains (AGDs). AGD detection algorithms provide a lightweight, promising solution in response to the existing DGA techniques. In this paper, a GCNN (gated convolutional neural network)-LSTM (long short-term memory) Hybrid Neural Network (GLHNN) for AGD detection is proposed. In GLHNN, GCNN is applied to extract the informative features from domain names on top of LSTM which further processes the feature sequence. GLHNN is experimentally validated using representative AGDs covering six classes of DGAs. GLHNN is compared with the state-of-the-art detection models and demonstrates the best overall detection performance among these tested models.

翻訳日:2022-08-09 14:20:42 公開日:2022-08-06

# ブラフボディまわりの流れのラージエディシミュレーションのための深層学習クロージャモデル

Deep Learning Closure Models for Large-Eddy Simulation of Flows around Bluff Bodies ( http://arxiv.org/abs/2208.03498v1 )

ライセンス: Link先を確認

Justin Sirignano and Jonathan F. MacArt

(参考訳) ラージエディシミュレーション(LES)のための深層学習(DL)クロージャモデルを開発し,中程度のレイノルズ数における矩形柱まわりの非圧縮性流れについて評価した。壁近傍流れのシミュレーションは依然として空力モデリングの中心的な課題である。RANSによる剥離流れの予測はしばしば不正確であり、一方でLESは法外に小さい壁近傍メッシュサイズを必要とすることがある。 DL-LESモデルは随伴PDE最適化法を用いて訓練され、直接数値シミュレーション(DNS)データに可能な限り一致するようにする。その後、サンプル外(すなわち、訓練データに含まれない新しいアスペクト比とレイノルズ数)で評価し、標準的なLESモデル(動的Smagorinskyモデル)と比較する。 DL-LESモデルは動的Smagorinskyモデルよりも優れており、比較的粗いメッシュ(DNSグリッドから各デカルト方向に4分の1にダウンサンプリングしたもの)上で正確なLES予測を達成できる。抗力係数,平均流れ,レイノルズ応力の予測におけるDL-LESモデルの精度を調べた。重要な課題は、LESで関心のある量が定常状態の流れの統計量であることである。例えば、時間平均速度 $\bar{u}(x) = \displaystyle \lim_{t \rightarrow \infty} \frac{1}{t} \int_0^t u(s,x) ds$ である。したがって、定常状態の流れの統計量を計算するには、流れが領域を何度も通過する長い時間にわたってDL-LES方程式をシミュレートする必要がある。関数形が深層ニューラルネットワークによって定義される非定常偏微分方程式モデルが $t \in [0, \infty)$ で安定かつ正確であり続けられるかどうかは、自明な問題ではない。その結果,DL-LESモデルは長い物理時間にわたって正確かつ安定であり,空力的応用に関連するブラフボディまわりの乱流について,速度,変動,抗力係数の定常状態統計量の推定が可能であることが示された。

A deep learning (DL) closure model for large-eddy simulation (LES) is developed and evaluated for incompressible flows around a rectangular cylinder at moderate Reynolds numbers. Near-wall flow simulation remains a central challenge in aerodynamic modeling: RANS predictions of separated flows are often inaccurate, while LES can require prohibitively small near-wall mesh sizes. The DL-LES model is trained using adjoint PDE optimization methods to match, as closely as possible, direct numerical simulation (DNS) data. It is then evaluated out-of-sample (i.e., for new aspect ratios and Reynolds numbers not included in the training data) and compared against a standard LES model (the dynamic Smagorinsky model). The DL-LES model outperforms dynamic Smagorinsky and is able to achieve accurate LES predictions on a relatively coarse mesh (downsampled from the DNS grid by a factor of four in each Cartesian direction). We study the accuracy of the DL-LES model for predicting the drag coefficient, mean flow, and Reynolds stress. A crucial challenge is that the LES quantities of interest are the steady-state flow statistics; for example, the time-averaged mean velocity $\bar{u}(x) = \displaystyle \lim_{t \rightarrow \infty} \frac{1}{t} \int_0^t u(s,x) ds$. Calculating the steady-state flow statistics therefore requires simulating the DL-LES equations over a large number of flow times through the domain; it is a non-trivial question whether an unsteady partial differential equation model whose functional form is defined by a deep neural network can remain stable and accurate on $t \in [0, \infty)$. Our results demonstrate that the DL-LES model is accurate and stable over large physical time spans, enabling the estimation of the steady-state statistics for the velocity, fluctuations, and drag coefficient of turbulent flows around bluff bodies relevant to aerodynamic applications.
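
The time-averaged velocity in the abstract is an average of the signal over the time variable at a fixed location, taken in the limit of a long window. The sketch below approximates that average for a finite window with the trapezoidal rule, to illustrate why short windows give biased statistics. The signal `u` is a hypothetical stand-in for the velocity at one point, not output of any DL-LES solver.

```python
import math


def running_time_average(u, t_end, dt):
    """Approximate (1/t) * integral_0^t u(s) ds with the trapezoidal rule."""
    n = int(round(t_end / dt))
    integral = 0.0
    for k in range(n):
        integral += 0.5 * (u(k * dt) + u((k + 1) * dt)) * dt
    return integral / (n * dt)


def u(s):
    """Hypothetical point signal: mean value 1 plus a periodic fluctuation."""
    return 1.0 + math.sin(s)
```

With this signal, a window of length 1 is still dominated by the fluctuation, while a window of length 1000 is close to the long-time mean of 1.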

翻訳日:2022-08-09 14:20:29 公開日:2022-08-06

# 精度を犠牲にしないグラフ畳み込みネットワークのトリプルスパース化

Triple Sparsification of Graph Convolutional Networks without Sacrificing the Accuracy ( http://arxiv.org/abs/2208.03559v1 )

ライセンス: Link先を確認

Md. Khaledur Rahman, Ariful Azad

(参考訳) グラフニューラルネットワーク(GNN)は、グラフ上のさまざまな機械学習タスクに広く使われている。グラフのサイズが大きくなり、GNNが深くなるにつれて、メモリ要件に加えて、学習と推論の時間もコストが高くなる。したがって、精度を犠牲にしないグラフのスパース化やモデル圧縮は、グラフ学習タスクにとって有効なアプローチとなる。いくつかの既存手法は、グラフとGNNモデルのスパース化のみを研究している。本稿では,GNNにおいて可能なすべてのスパース化を研究するためのSparseGCNパイプラインを開発する。理論的解析を行うとともに, 一般的に用いられるベンチマークグラフデータセットで精度を犠牲にすることなく, 埋め込み行列に最大11.6\%のスパース性を追加できることを実証的に示した。

Graph Neural Networks (GNNs) are widely used to perform different machine learning tasks on graphs. As the size of the graphs grows, and the GNNs get deeper, training and inference time become costly in addition to the memory requirement. Thus, without sacrificing accuracy, graph sparsification, or model compression becomes a viable approach for graph learning tasks. A few existing techniques only study the sparsification of graphs and GNN models. In this paper, we develop a SparseGCN pipeline to study all possible sparsification in GNN. We provide a theoretical analysis and empirically show that it can add up to 11.6\% additional sparsity to the embedding matrix without sacrificing the accuracy of the commonly used benchmark graph datasets.

翻訳日:2022-08-09 14:19:49 公開日:2022-08-06

# LFGCF: タグ対応レコメンデーションのための軽量フォークソノミーグラフ協調フィルタリング

LFGCF: Light Folksonomy Graph Collaborative Filtering for Tag-Aware Recommendation ( http://arxiv.org/abs/2208.03454v1 )

ライセンス: Link先を確認

Yin Zhang, Can Xu, XianJun Wu, Yan Zhang, LiGang Dong, Weigang Wang

(参考訳) タグ対応レコメンデーション(tag-aware recommendation)は、ユーザのタグ付け行動に基づいて、そのユーザ向けにパーソナライズされたアイテムのリストを予測するタスクである。 last.fmやmovielensのようなタグ付け機能を持つ多くのアプリケーションにとって非常に重要である。近年,一般的なレコメンデーションにおいて新たな最先端技術となったグラフ畳み込みネットワーク(GCN)を用いて、タグ対応レコメンデーションシステム(TRS)を改良する取り組みが数多く行われている。しかし、一部の手法は根拠を示さずにGCNをそのまま継承しており、タグによって生じるスパース性、曖昧性、冗長性の問題を緩和しにくく、その結果、学習が難しくなり推薦性能が低下する。本研究では,GCNの設計を簡略化し,TRSに向けてより簡潔にすることを目的とする。本稿では,必須のGCNコンポーネントのみを含む,Light Folksonomy Graph Collaborative Filtering(LFGCF)と呼ばれる新しいタグ対応レコメンデーションモデルを提案する。具体的には、LFGCFはまず、ユーザがタグを付与した記録とアイテムがタグ付けされた記録からフォークソノミーグラフを構築する。次に,集約のシンプルな設計を用いてフォークソノミーグラフ上の高次表現を学習し,複数の層で学習した埋め込みの重み付き和を情報更新に用いる。ユーザとアイテムの間の情報ギャップを埋めるために、タグの埋め込みを共有する。さらに、ユーザの嗜好とアイテムの特徴をよりよく表現するために、TransRTという正則化関数を提案する。 3つの実世界データセットに対する広範なハイパーパラメータ実験とアブレーション研究により、LFGCFはより少ないパラメータで、タグ対応のトップN推薦においてほとんどのベースラインを大きく上回ることを示した。

Tag-aware recommendation is a task of predicting a personalized list of items for a user by their tagging behaviors. It is crucial for many applications with tagging capabilities like last.fm or movielens. Recently, many efforts have been devoted to improving Tag-aware recommendation systems (TRS) with Graph Convolutional Networks (GCN), which has become new state-of-the-art for the general recommendation. However, some solutions are directly inherited from GCN without justifications, which is difficult to alleviate the sparsity, ambiguity, and redundancy issues introduced by tags, thus adding to difficulties of training and degrading recommendation performance. In this work, we aim to simplify the design of GCN to make it more concise for TRS. We propose a novel tag-aware recommendation model named Light Folksonomy Graph Collaborative Filtering (LFGCF), which only includes the essential GCN components. Specifically, LFGCF first constructs Folksonomy Graphs from the records of user assigning tags and item getting tagged. Then we leverage the simple design of aggregation to learn the high-order representations on Folksonomy Graphs and use the weighted sum of the embeddings learned at several layers for information updating. We share tags embeddings to bridge the information gap between users and items. Besides, a regularization function named TransRT is proposed to better depict user preferences and item features. Extensive hyperparameters experiments and ablation studies on three real-world datasets show that LFGCF uses fewer parameters and significantly outperforms most baselines for the tag-aware top-N recommendations.
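
The abstract describes a simplified aggregation step followed by a weighted sum of the embeddings learned at several layers. A minimal sketch of that idea follows, assuming LightGCN-style propagation (neighbour averaging with symmetric degree normalisation, no feature transform, no nonlinearity). The normalisation, the layer weights, and all names here are assumptions for illustration; the abstract does not specify LFGCF's exact choices.

```python
import math


def light_propagate(adj, emb, layer_weights):
    """Propagate embeddings over a graph and combine the layers by a weighted sum.

    adj: node -> set of neighbours; emb: node -> list of floats (same keys);
    layer_weights: one weight per layer, where layer 0 is the input embedding.
    """
    dim = len(next(iter(emb.values())))
    current = {n: list(vec) for n, vec in emb.items()}
    final = {n: [layer_weights[0] * x for x in vec] for n, vec in emb.items()}
    for w in layer_weights[1:]:
        nxt = {}
        for n in adj:
            acc = [0.0] * dim
            for m in adj[n]:
                # Symmetric degree normalisation (assumed, LightGCN-style).
                norm = 1.0 / math.sqrt(len(adj[n]) * len(adj[m]))
                for d in range(dim):
                    acc[d] += norm * current[m][d]
            nxt[n] = acc
        current = nxt
        # Add this layer's embeddings to the running weighted sum.
        for n in final:
            for d in range(dim):
                final[n][d] += w * current[n][d]
    return final
```

On a two-node user-tag graph with equal layer weights, each node ends up with an even blend of its own embedding and its neighbour's.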

翻訳日:2022-08-09 14:07:45 公開日:2022-08-06

# NeuCASL: 論理設計からニューロモルフィックエンジンのシステムシミュレーションへ

NeuCASL: From Logic Design to System Simulation of Neuromorphic Engines ( http://arxiv.org/abs/2208.03500v1 )

ライセンス: Link先を確認

Dharanidhar Dang, Amitash Nanda, Bill Lin and Debashis Sahoo

(参考訳) ムーアの法則が飽和し、デナードスケーリングが壁に突き当たった現在、従来のフォン・ノイマン型システムは、CNNのような計算集約型アルゴリズムに必要なGFlops/Wを提供できない。非従来型コンピューティングアプローチの最近の動向は、そのようなアルゴリズム向けにエネルギー効率の高いコンピューティングシステムを設計できるという希望を与えている。ニューロモルフィックコンピューティングは、脳に着想を得た回路、新興技術の利用、低消費電力という性質を備えた有望なアプローチである。研究者は、メモリスタ、シリコンフォトニクス、FinFET、カーボンナノチューブなどのさまざまな新しい技術を用いて、ニューロモルフィックコンピュータを実証している。しかし、ニューロモルフィック論理設計から始めてアーキテクチャシミュレーションまで進められる柔軟なCADツールは、この有望なパラダイムの発展を支えるものとしてまだ実証されていない。本プロジェクトでは,ニューロモルフィック論理設計,回路シミュレーション,システムの性能および信頼性の評価のための,オープンソースのPythonベースのフルシステムCADフレームワークであるNeuCASLの構築を目指す。我々の知る限り、これはこの種のものとして初めてである。

With Moore's law saturating and Dennard scaling hitting its wall, traditional von Neumann systems cannot offer the GFlops/watt for compute-intensive algorithms such as CNN. Recent trends in unconventional computing approaches give us hope to design highly energy-efficient computing systems for such algorithms. Neuromorphic computing is one such promising approach with its brain-inspired circuitry, use of emerging technologies, and low-power nature. Researchers use a variety of novel technologies such as memristors, silicon photonics, FinFET, and carbon nanotubes to demonstrate a neuromorphic computer. However, a flexible CAD tool to start from neuromorphic logic design and go up to architectural simulation is yet to be demonstrated to support the rise of this promising paradigm. In this project, we aim to build NeuCASL, an open-source Python-based full system CAD framework for neuromorphic logic design, circuit simulation, and system performance and reliability estimation. This is a first of its kind to the best of our knowledge.

翻訳日:2022-08-09 14:07:14 公開日:2022-08-06

# 教師なし3次元行動表現学習のための対照的なポジティブマイニング

Contrastive Positive Mining for Unsupervised 3D Action Representation Learning ( http://arxiv.org/abs/2208.03497v1 )

ライセンス: Link先を確認

Haoyuan Zhang, Yonghong Hou, Wenjing Zhang and Wanqing Li

(参考訳) 近年、コントラスト学習に基づく3次元行動表現学習は大きな進歩を遂げている。しかし、厳密なポジティブ/ネガティブの制約はまだ緩和されておらず、非自己ポジティブ(non-self positive)の利用もまだ検討されていない。本稿では,スケルトンに基づく教師なし3次元行動表現学習のためのContrastive Positive Mining(CPM)フレームワークを提案する。 CPMは、学習を促進するためにコンテキストキュー内の非自己ポジティブを特定する。具体的には、シャム(siamese)エンコーダを採用し、コンテキストキュー内のすべてのインスタンスを参照として、拡張されたインスタンスの類似度分布が一致するように訓練する。キュー内の非自己ポジティブなインスタンスを特定することで、マイニングしたポジティブの知識を活用し、クラス内およびクラス間の多様性に対する学習済み潜在空間の頑健性を高める、ポジティブを強化した学習戦略を提案する。実験の結果,提案するCPMは有効であり,難易度の高いNTUおよびPKU-MMDデータセットにおいて,既存の最先端の教師なし手法を上回ることが示された。

Recent contrastive based 3D action representation learning has made great progress. However, the strict positive/negative constraint is yet to be relaxed and the use of non-self positive is yet to be explored. In this paper, a Contrastive Positive Mining (CPM) framework is proposed for unsupervised skeleton 3D action representation learning. The CPM identifies non-self positives in a contextual queue to boost learning. Specifically, the siamese encoders are adopted and trained to match the similarity distributions of the augmented instances in reference to all instances in the contextual queue. By identifying the non-self positive instances in the queue, a positive-enhanced learning strategy is proposed to leverage the knowledge of mined positives to boost the robustness of the learned latent space against intra-class and inter-class diversity. Experimental results have shown that the proposed CPM is effective and outperforms the existing state-of-the-art unsupervised methods on the challenging NTU and PKU-MMD datasets.

翻訳日:2022-08-09 14:03:05 公開日:2022-08-06

# 3次元計測のための深層学習による空間位相アンラッピング

Deep Learning-enabled Spatial Phase Unwrapping for 3D Measurement ( http://arxiv.org/abs/2208.03524v1 )

ライセンス: Link先を確認

Xiaolong Luo, Wanzhong Song, Songlin Bai, Yu Li, and Zhihe Zhao

(参考訳) 3次元撮像の速度とシステムコストの観点からは、単一周波数パターンを投影する単一カメラシステムが、これまで提案されたフリンジ投影プロフィロメトリ(FPP)システムの中で理想的な選択肢である。このシステムには、頑健な空間位相アンラッピング(SPU)アルゴリズムが不可欠である。しかし、複雑なシーンにおける頑健なSPUは依然として課題である。品質誘導型SPUアルゴリズムは、アンラッピングの前に位相マップ中の信頼できない点を特定する、より効率的な方法を必要とする。エンドツーエンドの深層学習SPU手法は、汎用性と解釈可能性の問題に直面している。本稿では,FPPにおける頑健なSPUのために,深層学習と従来の経路追従法を組み合わせたハイブリッド手法を提案する。このハイブリッドSPU方式は、従来の品質誘導型SPU法よりも高い頑健性、エンドツーエンドの深層学習方式よりも高い解釈可能性、および未知のデータに対する汎用性を示す。複数の照明条件、ならびに画像解像度、フリンジ数、フリンジ方向、光学波長の異なる複数のFPPシステムで取得した実データセットでの実験により、提案手法の有効性を検証した。

In terms of 3D imaging speed and system cost, the single-camera system projecting single-frequency patterns is the ideal option among all proposed Fringe Projection Profilometry (FPP) systems. This system necessitates a robust spatial phase unwrapping (SPU) algorithm. However, robust SPU remains a challenge in complex scenes. Quality-guided SPU algorithms need more efficient ways to identify the unreliable points in phase maps before unwrapping. End-to-end deep learning SPU methods face generality and interpretability problems. This paper proposes a hybrid method combining deep learning and traditional path-following for robust SPU in FPP. This hybrid SPU scheme demonstrates better robustness than traditional quality-guided SPU methods, better interpretability than end-to-end deep learning scheme, and generality on unseen data. Experiments on the real dataset of multiple illumination conditions and multiple FPP systems differing in image resolution, the number of fringes, fringe direction, and optics wavelength verify the effectiveness of the proposed method.
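
Spatial phase unwrapping by path following rests on a simple one-dimensional step: wrap each difference between neighbouring samples back into [-pi, pi) and integrate along the path. The sketch below shows only that basic step (often attributed to Itoh) on a 1D signal. It is not the paper's hybrid method, and it assumes the true phase changes by less than pi between neighbours, which is exactly what fails at the unreliable points the abstract mentions.

```python
import math


def unwrap_1d(wrapped):
    """Remove 2*pi jumps between consecutive samples of a wrapped phase signal."""
    if not wrapped:
        return []
    out = [wrapped[0]]
    for prev, cur in zip(wrapped, wrapped[1:]):
        delta = cur - prev
        # Wrap the difference into [-pi, pi), then integrate it along the path.
        delta -= 2.0 * math.pi * math.floor((delta + math.pi) / (2.0 * math.pi))
        out.append(out[-1] + delta)
    return out
```

A linear phase ramp that has been wrapped into (-pi, pi] is recovered exactly, as long as the step between samples stays below pi.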

翻訳日:2022-08-09 14:02:50 公開日:2022-08-06

# 正確かつ説明可能な深層学習システムによる胸部X線画像の読影における観察者間一致の改善

An Accurate and Explainable Deep Learning System Improves Interobserver Agreement in the Interpretation of Chest Radiograph ( http://arxiv.org/abs/2208.03545v1 )

ライセンス: Link先を確認

Hieu H. Pham, Ha Q. Nguyen, Hieu T. Nguyen, Linh T. Le, Lam Khanh

(参考訳) 最近の人工知能(AI)アルゴリズムは、さまざまな医療分類タスクにおいて放射線科医レベルの性能を達成している。しかし、CXRスキャンからの異常所見の局所化に取り組んだ研究は少ない。局所化は、画像レベルの分類を放射線科医に説明する上で不可欠である。本稿では,CXRスキャンを複数の胸部疾患に分類すると同時に、画像上で重要な所見のほとんどの種類を局所化できる、VinDr-CXRという説明可能な深層学習システムを紹介する。 VinDr-CXRは、放射線科医が付与したバウンディングボックスアノテーション付きの51,485件のCXRスキャンで訓練された。 3,000件のCXRスキャンからなる後ろ向き検証セットにおいて、6つの一般的な胸部疾患の分類で経験豊富な放射線科医に匹敵する性能を示し、受信者動作特性曲線下面積(AUROC)の平均は0.967(95%信頼区間[CI]:0.958-0.975)であった。 VinDr-CXRは独立した患者コホートでも外部検証され、その頑健性を示した。 14種類の病変を対象とする局所化タスクでは、FROC(free-response receiver operating characteristic)解析により、VinDr-CXRはスキャンあたり1.0個の偽陽性病変という条件で80.2%の感度を達成した。また、6名の経験豊富な放射線科医を支援する際のVinDr-CXRの臨床的効果を測定するために、前向き研究を実施した。その結果,提案システムを診断支援ツールとして使用すると、Fleiss' Kappaの平均が1.5%増加し、放射線科医同士の一致が有意に改善した。さらに、放射線科医がVinDr-CXRの提案を参照した後は、各放射線科医とシステムとの一致がCohen's Kappaの平均で3.3%と顕著に増加した。

Recent artificial intelligence (AI) algorithms have achieved radiologist-level performance on various medical classification tasks. However, only a few studies addressed the localization of abnormal findings from CXR scans, which is essential in explaining the image-level classification to radiologists. We introduce in this paper an explainable deep learning system called VinDr-CXR that can classify a CXR scan into multiple thoracic diseases and, at the same time, localize most types of critical findings on the image. VinDr-CXR was trained on 51,485 CXR scans with radiologist-provided bounding box annotations. It demonstrated a comparable performance to experienced radiologists in classifying 6 common thoracic diseases on a retrospective validation set of 3,000 CXR scans, with a mean area under the receiver operating characteristic curve (AUROC) of 0.967 (95% confidence interval [CI]: 0.958-0.975). The VinDr-CXR was also externally validated in independent patient cohorts and showed its robustness. For the localization task with 14 types of lesions, our free-response receiver operating characteristic (FROC) analysis showed that the VinDr-CXR achieved a sensitivity of 80.2% at the rate of 1.0 false-positive lesion identified per scan. A prospective study was also conducted to measure the clinical impact of the VinDr-CXR in assisting six experienced radiologists. The results indicated that the proposed system, when used as a diagnosis supporting tool, significantly improved the agreement between radiologists themselves with an increase of 1.5% in mean Fleiss' Kappa. We also observed that, after the radiologists consulted VinDr-CXR's suggestions, the agreement between each of them and the system was remarkably increased by 3.3% in mean Cohen's Kappa.
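
Agreement between one radiologist and the system is reported as Cohen's kappa, which corrects raw agreement for the agreement expected by chance: kappa = (p_o - p_e) / (1 - p_e). The sketch below computes it for two raters labelling the same items. The label values in the usage note are made up for illustration and are not data from the study.

```python
from collections import Counter


def cohens_kappa(ratings_a, ratings_b):
    """Cohen's kappa for two raters who labelled the same items in the same order."""
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("both raters must label the same non-empty set of items")
    n = len(ratings_a)
    # Observed agreement: fraction of items with identical labels.
    observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    # Chance agreement: product of the two raters' marginal label frequencies.
    count_a, count_b = Counter(ratings_a), Counter(ratings_b)
    expected = sum(count_a[c] * count_b[c] for c in count_a) / (n * n)
    if expected == 1.0:
        raise ValueError("kappa is undefined when chance agreement equals 1")
    return (observed - expected) / (1.0 - expected)
```

Two raters who agree on every item get kappa 1, while raters who agree on half of a balanced binary set only as often as chance predicts get kappa 0.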

翻訳日:2022-08-09 14:02:35 公開日:2022-08-06

# パネルデータを用いた因果推論のための予測アルゴリズム

Forecasting Algorithms for Causal Inference with Panel Data ( http://arxiv.org/abs/2208.03489v1 )

ライセンス: Link先を確認

Jacob Goldin, Julian Nyarko, Justin Young

(参考訳) パネルデータを用いた因果推論は、社会科学研究の中心的な課題である。予測手法の進歩は、処置が行われなかった場合の処置単位の反事実的な推移をより正確に予測することで、この課題に貢献できる。本稿では,時系列予測のために新たに開発された深層ニューラルアーキテクチャ(N-BEATSアルゴリズム)を利用する。本手法は、従来の時系列への応用から適応させたもので、対照単位のリード値(leading values)を取り込むことにより、処置後期間における処置単位の「合成」された未処置バージョンを予測する。この手法から導かれる推定量をSyNBEATSと呼び,さまざまな設定において従来の二方向固定効果法や合成コントロール法を大きく上回ることを見出した。また,SyNBEATSは,行列補完や合成差分の差分法(synthetic difference in differences)といったより新しいパネル推定手法と比べても、同等以上の精度を達成することがわかった。本研究の結果は,予測に関する研究の進歩を、パネル設定における因果推論の改善にいかに活用できるかを示している。

Conducting causal inference with panel data is a core challenge in social science research. Advances in forecasting methods can facilitate this task by more accurately predicting the counterfactual evolution of a treated unit had treatment not occurred. In this paper, we draw on a newly developed deep neural architecture for time series forecasting (the N-BEATS algorithm). We adapt this method from conventional time series applications by incorporating leading values of control units to predict a "synthetic" untreated version of the treated unit in the post-treatment period. We refer to the estimator derived from this method as SyNBEATS, and find that it significantly outperforms traditional two-way fixed effects and synthetic control methods across a range of settings. We also find that SyNBEATS attains comparable or more accurate performance relative to more recent panel estimation methods such as matrix completion and synthetic difference in differences. Our results highlight how advances in the forecasting literature can be harnessed to improve causal inference in panel settings.

翻訳日:2022-08-09 13:55:59 公開日:2022-08-06

# 最大重み$k$-独立集合によるグラフプーリング

Graph Pooling with Maximum-Weight $k$-Independent Sets ( http://arxiv.org/abs/2208.03523v1 )

ライセンス: Link先を確認

Davide Bacciu, Alessio Conte, Francesco Landolfi

(参考訳) 大規模ネットワークや関係データを扱う際、グラフの縮約は基本的な操作である。粗視化した構造上で解くことで、計算負荷の高いタスクの規模を縮小できる。同時に、グラフ縮約はグラフニューラルネットワークにおけるプーリング層の役割を果たし、構造から多重解像度の表現を抽出する。これらの文脈では、縮約機構が距離関係と位相的性質を保存できることが、実世界規模の問題に適用できるスケーラビリティとともに、基本的に重要と考えられる。本稿では,最大重み$k$-独立集合というグラフ理論の概念に基づくグラフ粗視化機構を導入し,GPU上で効率的に並列実装できる貪欲アルゴリズムを提供する。本手法は, 正則データ(画像, 系列)における制御可能な等間隔の粗視化機構に対応する、グラフ構造に対する最初の手法である。経路長の歪みの上界に関する理論的保証と、粗視化されたグラフにおいて重要な位相的性質を保存できることを証明する。これらの概念を利用してグラフプーリング機構を定義し、グラフ分類タスクで経験的に評価した結果、文献にあるプーリング手法と比べて良好な結果が得られた。

Graph reductions are fundamental when dealing with large scale networks and relational data. They allow to downsize tasks of high computational impact by solving them in coarsened structures. At the same time, graph reductions play the role of pooling layers in graph neural networks, to extract multi-resolution representations from structures. In these contexts, the ability of the reduction mechanism to preserve distance relationships and topological properties appears fundamental, along with a scalability enabling its application to real-world sized problems. In this paper, we introduce a graph coarsening mechanism based on the graph-theoretic concept of maximum-weight $k$-independent sets, providing a greedy algorithm that allows efficient parallel implementation on GPUs. Our method is the first graph-structured counterpart of controllable equispaced coarsening mechanisms in regular data (images, sequences). We prove theoretical guarantees for distortion bounds on path lengths, as well as the ability to preserve key topological properties in the coarsened graphs. We leverage these concepts to define a graph pooling mechanism that we empirically assess in graph classification tasks, showing that it compares favorably against pooling methods in literature.
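
A k-independent set is a set of nodes whose pairwise hop distance is greater than k (k = 1 gives the ordinary independent set). The sketch below is a plain sequential greedy heuristic for the maximum-weight version: visit nodes by decreasing weight and keep a node unless it lies within k hops of one already kept. It is an illustration of the concept only; the paper's parallel GPU algorithm is not reproduced, and the function names are hypothetical.

```python
from collections import deque


def within_distance(adj, source, k):
    """Nodes at hop distance <= k from source (source included), found by BFS."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        n = queue.popleft()
        if dist[n] == k:
            continue  # do not expand beyond k hops
        for m in adj[n]:
            if m not in dist:
                dist[m] = dist[n] + 1
                queue.append(m)
    return set(dist)


def greedy_k_independent_set(adj, weight, k):
    """Pick nodes by decreasing weight, skipping any node within k hops of a pick."""
    blocked, chosen = set(), []
    for n in sorted(adj, key=lambda x: (-weight[x], x)):
        if n not in blocked:
            chosen.append(n)
            blocked |= within_distance(adj, n, k)
    return chosen
```

The chosen nodes can then act as the centres of the coarsened graph, with k controlling how far apart they are, much like a stride controls spacing when pooling an image.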

翻訳日:2022-08-09 13:43:50 公開日:2022-08-06

# エントロピー損失を用いたロバスト深層学習に向けて

Towards Robust Deep Learning using Entropic Losses ( http://arxiv.org/abs/2208.03566v1 )

ライセンス: Link先を確認

David Macêdo

(参考訳) 現在のディープラーニングソリューションは、推論時にサンプルを確実に分類できるかどうかを知らせないことでよく知られている。より信頼性の高いディープラーニングソリューションを構築するための最も効果的な方法の1つは、いわゆる分布外(out-of-distribution)検出タスクにおける性能を改善することである。このタスクは本質的に「知らないということを知る」、あるいは「未知を知る」ことからなる。言い換えれば、分布外検出が可能なシステムは、ニューラルネットワークが訓練されていないクラスのインスタンスを与えられたときに、無意味な分類を行うことを拒否できる。本学位論文は, 新たな損失関数と検出スコアを提案することにより, 困難な分布外検出タスクに取り組む。不確実性推定も、より堅牢なディープラーニングシステムを構築する上で重要な補助タスクである。そこで,ディープニューラルネットワークが提示する確率がどの程度現実的かを評価する、この頑健性に関連するタスクも扱う。提案手法の有効性を実証するために,最先端の結果を含む多数の実験に加えて,最大エントロピー原理に基づく議論を用いて,提案手法の理論的基礎を確立する。現在のほとんどの手法とは異なり、我々の損失とスコアはシームレスで原理に基づいた解決策であり、高速で効率的な推論に加えて正確な予測をもたらす。さらに、我々の手法は、ディープニューラルネットワークの訓練に用いる損失を置き換え、検出のための高速なスコアを計算するだけで、現在および将来のプロジェクトに組み込むことができる。

Current deep learning solutions are well known for not informing whether they can reliably classify an example during inference. One of the most effective ways to build more reliable deep learning solutions is to improve their performance in the so-called out-of-distribution detection task, which essentially consists of "know that you do not know" or "know the unknown". In other words, out-of-distribution detection capable systems may reject performing a nonsense classification when submitted to instances of classes on which the neural network was not trained. This thesis tackles the defiant out-of-distribution detection task by proposing novel loss functions and detection scores. Uncertainty estimation is also a crucial auxiliary task in building more robust deep learning systems. Therefore, we also deal with this robustness-related task, which evaluates how realistic the probabilities presented by the deep neural network are. To demonstrate the effectiveness of our approach, in addition to a substantial set of experiments, which includes state-of-the-art results, we use arguments based on the principle of maximum entropy to establish the theoretical foundation of the proposed approaches. Unlike most current methods, our losses and scores are seamless and principled solutions that produce accurate predictions in addition to fast and efficient inference. Moreover, our approaches can be incorporated into current and future projects simply by replacing the loss used to train the deep neural network and computing a rapid score for detection.
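
The abstract mentions computing a rapid score for out-of-distribution detection and grounds the approach in the maximum entropy principle. As a generic illustration of an entropy-based score (an assumption for this sketch, not the thesis's specific losses or scores), the code below turns logits into softmax probabilities and uses the negative entropy as a confidence score: values near zero indicate a peaked prediction, strongly negative values indicate a near-uniform one that a detector could reject.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def negative_entropy_score(logits):
    """Negative Shannon entropy of the softmax output; higher means more confident."""
    probs = softmax(logits)
    return sum(p * math.log(p) for p in probs if p > 0.0)
```

A detector built on this sketch would compare the score with a threshold chosen on validation data and refuse to classify inputs that fall below it.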

翻訳日:2022-08-09 13:43:33 公開日:2022-08-06

# ベクトル化表現を用いたグラフ型軌道予測器の一般化解析

Generalizability Analysis of Graph-based Trajectory Predictor with Vectorized Representation ( http://arxiv.org/abs/2208.03578v1 )

ライセンス: Link先を確認

Juanwu Lu, Wei Zhan, Masayoshi Tomizuka, Yeping Hu

(参考訳) 軌道予測は自動運転車にとって不可欠な課題の1つである。機械学習の最近の進歩は、一連の高度な軌道予測アルゴリズムを生み出した。近年,ベクトル化表現を用いたグラフニューラルネットワーク(gnns)の軌道予測の有効性が多くの研究者によって実証されている。それにもかかわらず、これらのアルゴリズムは様々なシナリオにわたるモデルの一般化可能性にほとんど注意を払わないか、あるいはトレーニングとテストデータが同様の統計に従うと仮定する。実際、テストシナリオが見えない場合や、アウト・オブ・ディストリビューション(OOD)の場合、結果のトレイン・テストのドメインシフトは通常、予測性能が大幅に低下し、下流のモジュールに影響を与え、最終的には深刻な事故を引き起こす。したがって、それらの一般化可能性の観点から予測モデルを徹底的に研究することが重要であり、その弱点を識別するだけでなく、これらのモデルを改善するための洞察を与えることもできる。本稿では,ブラックボックスモデルの解釈を支援する特徴属性法による一般化可能性分析フレームワークを提案する。本ケーススタディでは,ベクトル化表現を用いた最先端のグラフベース軌道予測器の詳細な一般化解析を行う。結果は、ドメインシフトによるパフォーマンスの大幅な低下を示し、これらの問題の潜在的な原因を特定するための洞察を提供する。最後に、一般的な予測課題と、トレーニングプロセスによって引き起こされる重み付けバイアスが精度を低下させる可能性について結論づける。

Trajectory prediction is one of the essential tasks for autonomous vehicles. Recent progress in machine learning gave birth to a series of advanced trajectory prediction algorithms. Lately, the effectiveness of using graph neural networks (GNNs) with vectorized representations for trajectory prediction has been demonstrated by many researchers. Nonetheless, these algorithms either pay little attention to models' generalizability across various scenarios or simply assume training and test data follow similar statistics. In fact, when test scenarios are unseen or Out-of-Distribution (OOD), the resulting train-test domain shift usually leads to significant degradation in prediction performance, which will impact downstream modules and eventually lead to severe accidents. Therefore, it is of great importance to thoroughly investigate the prediction models in terms of their generalizability, which can not only help identify their weaknesses but also provide insights on how to improve these models. This paper proposes a generalizability analysis framework using feature attribution methods to help interpret black-box models. For the case study, we provide an in-depth generalizability analysis of one of the state-of-the-art graph-based trajectory predictors that utilize vectorized representation. Results show significant performance degradation due to domain shift, and feature attribution provides insights to identify potential causes of these problems. Finally, we conclude the common prediction challenges and how weighting biases induced by the training process can deteriorate the accuracy.

翻訳日:2022-08-09 13:43:09 公開日:2022-08-06

# MonoViT: Vision Transformerを用いた自己教師あり単眼深度推定

MonoViT: Self-Supervised Monocular Depth Estimation with a Vision Transformer ( http://arxiv.org/abs/2208.03543v1 )

ライセンス: Link先を確認

Chaoqiang Zhao, Youmin Zhang, Matteo Poggi, Fabio Tosi, Xianda Guo, Zheng Zhu, Guan Huang, Yang Tang, Stefano Mattoccia

(参考訳) 自己教師あり単眼深度推定は、訓練に入手困難な深度ラベルを必要としない魅力的な解決策である。畳み込みニューラルネットワーク(CNN)は、最近このタスクで大きな成功を収めた。しかし、その限られた受容野は、既存のネットワークアーキテクチャを局所的な推論のみに制約し、自己教師ありパラダイムの有効性を弱めている。 Vision Transformer(ViT)が最近収めた成功を踏まえ, ViTモデルによって可能になる大域的な推論と、自己教師あり単眼深度推定の柔軟性を組み合わせた新しいフレームワークMonoViTを提案する。通常の畳み込みとTransformerブロックを組み合わせることで、我々のモデルは局所的にも大域的にも推論でき、より高い詳細度と精度で深度を予測できるため、MonoViTは定評のあるKITTIデータセット上で最先端の性能を達成する。さらに、MonoViTはMake3DやDrivingStereoといった他のデータセットでも優れた汎化能力を示す。

Self-supervised monocular depth estimation is an attractive solution that does not require hard-to-source depth labels for training. Convolutional neural networks (CNNs) have recently achieved great success in this task. However, their limited receptive field constrains existing network architectures to reason only locally, dampening the effectiveness of the self-supervised paradigm. In the light of the recent successes achieved by Vision Transformers (ViTs), we propose MonoViT, a brand-new framework combining the global reasoning enabled by ViT models with the flexibility of self-supervised monocular depth estimation. By combining plain convolutions with Transformer blocks, our model can reason locally and globally, yielding depth prediction at a higher level of detail and accuracy, allowing MonoViT to achieve state-of-the-art performance on the established KITTI dataset. Moreover, MonoViT proves its superior generalization capacities on other datasets such as Make3D and DrivingStereo.

翻訳日:2022-08-09 13:21:46 公開日:2022-08-06

# 凍結CLIPモデルは効率的なビデオ学習者である

Frozen CLIP Models are Efficient Video Learners ( http://arxiv.org/abs/2208.03550v1 )

ライセンス: Link先を確認

Ziyi Lin, Shijie Geng, Renrui Zhang, Peng Gao, Gerard de Melo, Xiaogang Wang, Jifeng Dai, Yu Qiao, Hongsheng Li

(参考訳) ビデオ認識では、エンドツーエンド学習のパラダイムが主流となっている。すなわち、まず事前学習済み画像モデルの重みでビデオ認識モデルを初期化し、次にビデオ上でエンドツーエンドの訓練を行う。これにより、ビデオネットワークは事前学習済み画像モデルの恩恵を受けることができる。しかし、ビデオでのファインチューニングにはかなりの計算資源とメモリ資源が必要であり、一方で、画像バックボーンをファインチューニングせずに事前学習済みの画像特徴を直接用いるという代替手段では、劣った結果しか得られない。幸いなことに、Contrastive Vision-Language Pre-training(CLIP)の最近の進歩は、視覚認識タスクへの新しい道を開いた。大規模なオープン語彙の画像・テキスト対データで事前学習されたこれらのモデルは、豊富なセマンティクスを持つ強力な視覚表現を学習する。本稿では,凍結したCLIP特徴を用いて高品質なビデオ認識モデルを直接訓練する効率的なフレームワークとして,EVL(Efficient Video Learning)を提案する。具体的には,軽量なTransformerデコーダを用い、クエリトークンを学習して、CLIP画像エンコーダからフレームレベルの空間的特徴を動的に収集する。さらに,各デコーダ層に局所時間モジュールを導入し,隣接するフレームとそのアテンションマップから時間的な手がかりを発見する。凍結したバックボーンによって効率的に訓練できるにもかかわらず、我々のモデルはさまざまなビデオ認識データセットで高品質なビデオ表現を学習することを示す。コードはhttps://github.com/OpenGVLab/efficient-video-recognition で入手できる。

Video recognition has been dominated by the end-to-end learning paradigm -- first initializing a video recognition model with weights of a pretrained image model and then conducting end-to-end training on videos. This enables the video network to benefit from the pretrained image model. However, this requires substantial computation and memory resources for finetuning on videos and the alternative of directly using pretrained image features without finetuning the image backbone leads to subpar results. Fortunately, recent advances in Contrastive Vision-Language Pre-training (CLIP) pave the way for a new route for visual recognition tasks. Pretrained on large open-vocabulary image-text pair data, these models learn powerful visual representations with rich semantics. In this paper, we present Efficient Video Learning (EVL) -- an efficient framework for directly training high-quality video recognition models with frozen CLIP features. Specifically, we employ a lightweight Transformer decoder and learn a query token to dynamically collect frame-level spatial features from the CLIP image encoder. Furthermore, we adopt a local temporal module in each decoder layer to discover temporal clues from adjacent frames and their attention maps. We show that despite being efficient to train with a frozen backbone, our models learn high quality video representations on a variety of video recognition datasets. Code is available at https://github.com/OpenGVLab/efficient-video-recognition.

翻訳日:2022-08-09 13:21:29 公開日:2022-08-06

# オートキュレーションを用いた誘導パッチマッチングによる現代のカメラ解像度のインペインティング

Inpainting at Modern Camera Resolution by Guided PatchMatch with Auto-Curation ( http://arxiv.org/abs/2208.03552v1 )

ライセンス: Link先を確認

Lingzhi Zhang, Connelly Barnes, Kevin Wampler, Sohrab Amirghodsi, Eli Shechtman, Zhe Lin, Jianbo Shi

(参考訳) 近年、深層モデルは低解像度画像のインペインティングにおいてSOTAの性能を確立しているが、4K以上といった現代のカメラの解像度や、大きな穴に対しては忠実度に欠ける。我々は、現代のセンサーを代表する4K以上の写真からなるインペインティングのベンチマークデータセットを提供する。また、深層学習と従来の手法を組み合わせた新しいフレームワークを示す。既存の深層インペインティングモデルLaMaを用いて穴をもっともらしく埋め、構造、セグメンテーション、深度からなる3つのガイド画像を作成し、多重ガイド付きPatchMatchを適用して、アップサンプリングされたインペインティング画像の候補を8つ生成する。次に,8x8の反対称なペアワイズ選好行列の列和によって良好なインペインティングを選ぶ新しいキュレーションモジュールに、すべての候補を入力する。我々のフレームワークの結果は、8つの強力なベースラインと比べてユーザから圧倒的に好まれ、最良のベースラインであるLaMaに対して定量的指標で最大7.4の改善を示した。さらに、我々の手法を4つの異なるSOTAインペインティングバックボーンと組み合わせると、それぞれが改善され、強力な超解像ベースラインよりもユーザから圧倒的に好まれた。

Recently, deep models have established SOTA performance for low-resolution image inpainting, but they lack fidelity at resolutions associated with modern cameras, such as 4K or more, and for large holes. We contribute an inpainting benchmark dataset of photos at 4K and above, representative of modern sensors. We demonstrate a novel framework that combines deep learning and traditional methods. We use an existing deep inpainting model, LaMa, to fill the hole plausibly, establish three guide images consisting of structure, segmentation, and depth, and apply a multiply-guided PatchMatch to produce eight candidate upsampled inpainted images. Next, we feed all candidate inpaintings through a novel curation module that chooses a good inpainting by column summation on an 8x8 antisymmetric pairwise preference matrix. Our framework's results are overwhelmingly preferred by users over 8 strong baselines, with improvements in quantitative metrics of up to 7.4 over the best baseline, LaMa. When paired with 4 different SOTA inpainting backbones, our technique improves each of them, such that ours is overwhelmingly preferred by users over a strong super-res baseline.
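The curation step, choosing a candidate by column summation on an antisymmetric pairwise preference matrix, can be sketched as follows. The sign convention is an assumption made for this example (the abstract does not state it): `pref[i][j] > 0` means candidate `j` is preferred over candidate `i`, so the column with the largest sum wins. The function name and the 3x3 toy matrix are hypothetical; the paper uses eight candidates.

```python
def pick_by_column_sum(pref):
    """Pick the candidate whose column sum is largest.

    pref is an n x n antisymmetric matrix: pref[i][j] == -pref[j][i].
    Assumed convention: pref[i][j] > 0 means candidate j beats candidate i.
    """
    n = len(pref)
    for i in range(n):
        for j in range(n):
            if abs(pref[i][j] + pref[j][i]) > 1e-9:
                raise ValueError("preference matrix must be antisymmetric")
    col_sums = [sum(pref[i][j] for i in range(n)) for j in range(n)]
    best = max(range(n), key=lambda j: col_sums[j])
    return best, col_sums

# Toy 3-candidate matrix (made-up numbers).
pref = [
    [0.0, 0.4, -0.2],
    [-0.4, 0.0, -0.6],
    [0.2, 0.6, 0.0],
]
best, col_sums = pick_by_column_sum(pref)
print(best, col_sums)
```

Summing a column aggregates every pairwise comparison involving that candidate, so a single noisy comparison is less likely to decide the outcome than in a knockout-style selection.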

翻訳日:2022-08-09 13:21:06 公開日:2022-08-06

# HSIC-InfoGAN:近似相互情報の最大化による教師なしアンタングル表現の学習

HSIC-InfoGAN: Learning Unsupervised Disentangled Representations by Maximising Approximated Mutual Information ( http://arxiv.org/abs/2208.03563v1 )

ライセンス: Link先を確認

Xiao Liu, Spyridon Thermos, Pedro Sanchez, Alison Q. O'Neil, Sotirios A. Tsaftaris

(参考訳) 不整合表現の学習には、特定のモデル設計の導入とバイアスとしての学習制約の監督が必要である。 InfoGANは、潜在表現と対応する生成画像の相互情報を最大化することにより、教師なしの非絡み合い表現を学習する一般的な非絡み合いフレームワークである。相互情報の最大化は、潜在回帰損失を伴う補助ネットワークとトレーニングを導入することによって達成される。本稿では,Hilbert-Schmidt Independence Criterion (HSIC) を用いて,潜在表現と画像間の相互情報を近似する手法について検討する。 HSIC損失を直接最適化することは、追加の補助ネットワークの必要性を避ける。我々は,各モデルにおけるゆがみのレベルを質的に比較し,HSIC-InfoGANのハイパーパラメータを調整するための戦略を提案し,医療応用におけるHSIC-InfoGANの可能性について議論する。

Learning disentangled representations requires either supervision or the introduction of specific model designs and learning constraints as biases. InfoGAN is a popular disentanglement framework that learns unsupervised disentangled representations by maximising the mutual information between latent representations and their corresponding generated images. Maximisation of mutual information is achieved by introducing an auxiliary network and training with a latent regression loss. In this short exploratory paper, we study the use of the Hilbert-Schmidt Independence Criterion (HSIC) to approximate mutual information between latent representation and image, an approach we term HSIC-InfoGAN. Directly optimising the HSIC loss avoids the need for an additional auxiliary network. We qualitatively compare the level of disentanglement in each model, suggest a strategy to tune the hyperparameters of HSIC-InfoGAN, and discuss the potential of HSIC-InfoGAN for medical applications.
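For readers unfamiliar with HSIC, the sketch below computes the standard biased empirical estimator, trace(KHLH) / (n - 1)^2, where K and L are kernel Gram matrices of the two samples and H is the centering matrix. The Gaussian kernel, the bandwidths, and the function names are assumptions made for illustration; the abstract does not say which kernel or estimator the authors use.

```python
import math

def gaussian_gram(xs, sigma):
    """Gram matrix K[i][j] = exp(-||x_i - x_j||^2 / (2 * sigma^2))."""
    n = len(xs)
    return [[math.exp(-sum((a - b) ** 2 for a, b in zip(xs[i], xs[j]))
                      / (2.0 * sigma ** 2))
             for j in range(n)] for i in range(n)]

def center(K):
    """Return H K H, where H = I - (1/n) * ones is the centering matrix."""
    n = len(K)
    row_mean = [sum(K[i]) / n for i in range(n)]
    col_mean = [sum(K[i][j] for i in range(n)) / n for j in range(n)]
    total_mean = sum(row_mean) / n
    return [[K[i][j] - row_mean[i] - col_mean[j] + total_mean
             for j in range(n)] for i in range(n)]

def hsic(xs, ys, sigma_x=1.0, sigma_y=1.0):
    """Biased empirical HSIC between paired samples xs and ys."""
    n = len(xs)
    Kc = center(gaussian_gram(xs, sigma_x))
    L = gaussian_gram(ys, sigma_y)
    # trace(K H L H) = trace((H K H) L) = sum_ij (HKH)[i][j] * L[i][j], L symmetric.
    return sum(Kc[i][j] * L[i][j] for i in range(n) for j in range(n)) / (n - 1) ** 2

# Toy check (made-up data): identical samples are dependent, a constant is not.
xs = [[0.0], [1.0], [2.0], [3.0]]
constant = [[5.0], [5.0], [5.0], [5.0]]
hsic_dependent = hsic(xs, xs)
hsic_constant = hsic(xs, constant)
print(hsic_dependent, hsic_constant)
```

In a GAN setting, one would presumably maximise such a quantity between latent codes and features of the generated images, which needs no extra regression network; that reading is an interpretation of the abstract, not a description of the released code.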

翻訳日:2022-08-09 13:20:45 公開日:2022-08-06

# イントラ画像融合による深部非共役フォトメトリックステレオ

Deep Uncalibrated Photometric Stereo via Inter-Intra Image Feature Fusion ( http://arxiv.org/abs/2208.03440v1 )

ライセンス: Link先を確認

Fangzhou Gao, Meng Wang, Lianghao Zhang, Li Wang, Jiawan Zhang

(参考訳) 様々な光と未知の光の下での像からの詳細な表面の正常さを推定するために,未調整の測光ステレオが提案されている。近年、ディープラーニングは、この未決定問題に先立って強力なデータをもたらす。本稿では,画像間表現を効率的に利用して正規推定を導く,奥行き非共役フォトメトリックステレオの新しい手法を提案する。従来の手法では、最適化ベースのニューラルネットワークの逆レンダリングや単一のサイズ非依存のプーリング層を使用して複数の入力を処理するが、入力画像間の情報の利用には非効率である。異なる照明下でのマルチイメージを考えると、画像内および画像間の変化は高い相関関係にあると考えられる。そこで我々は,画像間特徴抽出に画像間表現を導入するためのイントラ画像間特徴融合モジュールを設計した。余分な表現は画像ごとの特徴抽出を誘導し、正規推定の曖昧さを排除するために使われる。当社の設計が幅広い試料,特に暗黒物質に与える影響を実証する。本手法は, 合成データと実データの両方において, 最先端の手法よりも優れた結果が得られる。

Uncalibrated photometric stereo aims to estimate detailed surface normals from images captured under varying and unknown lighting. Recently, deep learning has brought powerful data priors to this underdetermined problem. This paper presents a new method for deep uncalibrated photometric stereo that efficiently utilizes an inter-image representation to guide normal estimation. Previous methods use optimization-based neural inverse rendering or a single size-independent pooling layer to deal with multiple inputs, both of which are inefficient at utilizing information across input images. Given multiple images under different lighting, we consider the intra-image and inter-image variations to be highly correlated. Motivated by these correlated variations, we design an inter-intra image feature fusion module that introduces the inter-image representation into the per-image feature extraction. The extra representation is used to guide the per-image feature extraction and eliminate ambiguity in normal estimation. We demonstrate the effect of our design on a wide range of samples, especially on dark materials. Our method produces significantly better results than state-of-the-art methods on both synthetic and real data.

翻訳日:2022-08-09 13:14:55 公開日:2022-08-06

# AFE-CNN:アクション特徴強調による3次元骨格に基づく行動認識

AFE-CNN: 3D Skeleton-based Action Recognition with Action Feature Enhancement ( http://arxiv.org/abs/2208.03444v1 )

ライセンス: Link先を確認

Shannan Guan, Haiyan Lu, Linchao Zhu, Gengfa Fang

(参考訳) 既存の3Dスケルトンに基づくアクション認識アプローチは、手作りのアクション機能を画像フォーマットにエンコードし、CNNによってデコードすることで、印象的なパフォーマンスを実現する。しかし、この方法には2つの制限がある。 a)手作りの動作特徴は、困難な行動に対処することが困難であり、 b) 一般に、行動認識精度を向上させるために複雑なCNNモデルが必要である。これらの限界を克服するため,我々は,挑戦的行動に適応するために,3dスケルトンベースの動作の特徴を強化することに専心する新しい afe-cnn を導入する。そこで,AFE-CNNはカメラの視界や身体サイズの変化に対してより堅牢であり,挑戦行動における認識精度を大幅に向上させる。さらに,AFE-CNNでは,動作特徴が強化された画像を復号化するために,軽量CNNモデルを採用している。 NTU RGB+D, NTU RGB+D 120, UTKinect-Action3Dの3つのベンチマークスケルトンに基づく行動データセットを用いてAFE-CNNを評価する。

Existing 3D skeleton-based action recognition approaches reach impressive performance by encoding handcrafted action features into an image format and decoding them with CNNs. However, such methods are limited in two ways: a) handcrafted action features struggle to handle challenging actions, and b) they generally require complex CNN models to improve action recognition accuracy, which usually incurs a heavy computational burden. To overcome these limitations, we introduce a novel AFE-CNN, which is devoted to enhancing the features of 3D skeleton-based actions so that they adapt to challenging actions. We propose feature enhancement modules from the key joint, bone vector, key frame, and temporal perspectives; as a result, AFE-CNN is more robust to variation in camera view and body size and significantly improves recognition accuracy on challenging actions. Moreover, AFE-CNN adopts a lightweight CNN model to decode images with enhanced action features, which ensures a much lower computational burden than state-of-the-art methods. We evaluate AFE-CNN on three benchmark skeleton-based action datasets: NTU RGB+D, NTU RGB+D 120, and UTKinect-Action3D. Extensive experimental results demonstrate the outstanding performance of AFE-CNN.

翻訳日:2022-08-09 13:14:38 公開日:2022-08-06

# クラスは文脈と副詞に不変:外部分布一般化のための学習不変性について

Class Is Invariant to Context and Vice Versa: On Learning Invariance for Out-Of-Distribution Generalization ( http://arxiv.org/abs/2208.03462v1 )

ライセンス: Link先を確認

Jiaxin Qi, Kaihua Tang, Qianru Sun, Xian-Sheng Hua, and Hanwang Zhang

(参考訳) Out-Of-Distribution Generalization (OOD) とは、環境変化に対する不変性を学習することである。すべてのクラスのコンテキストが均等に分散されている場合、OODは自明である。しかし、そのようなバランスのとれたデータセットの収集は現実的ではない。不均衡なデータを学習することで、モデルがコンテキストに偏り、OODを損なう。したがって、OODの鍵はコンテキストバランスである。先行研究において広く採用されている仮定である文脈バイアスは、バイアス付きクラス予測から直接注釈付けや推定が可能であり、文脈が不完全あるいは不正確であると主張する。コンテキストもクラスに不変であり、コンテキストバイアス(文脈ラベルなしで)を解決する様々な環境としてクラス(すでにラベル付けされている)を考える動機となります。この概念を実装し、クラス内サンプル類似性の対照的な損失を最小限に抑えつつ、この類似性を全てのクラスにわたって不変とすることで実装する。種々のコンテキストバイアスとドメインギャップを持つベンチマークにおいて、文脈推定を備えた単純な再重み付けに基づく分類器が最先端の性能を達成することを示す。 Appendix の理論的正当化と https://github.com/simpleshinobu/IRMCon のコードを提供する。

Out-Of-Distribution generalization (OOD) is all about learning invariance against environmental changes. If the context in every class were evenly distributed, OOD would be trivial because the context could be easily removed thanks to an underlying principle: class is invariant to context. However, collecting such a balanced dataset is impractical. Learning on imbalanced data biases the model toward context and thus hurts OOD. Therefore, the key to OOD is context balance. We argue that the assumption widely adopted in prior work, namely that the context bias can be directly annotated or estimated from biased class prediction, renders the context incomplete or even incorrect. In contrast, we point out the ever-overlooked other side of the above principle: context is also invariant to class, which motivates us to consider the classes (which are already labeled) as the varying environments to resolve context bias (without context labels). We implement this idea by minimizing the contrastive loss of intra-class sample similarity while ensuring that this similarity is invariant across all classes. On benchmarks with various context biases and domain gaps, we show that a simple re-weighting based classifier equipped with our context estimation achieves state-of-the-art performance. We provide theoretical justifications in the Appendix and code at https://github.com/simpleshinobu/IRMCon.

翻訳日:2022-08-09 13:14:14 公開日:2022-08-06

# MRIモダリティの欠如による深層学習に基づく脳腫瘍の分節解析

Analyzing Deep Learning Based Brain Tumor Segmentation with Missing MRI Modalities ( http://arxiv.org/abs/2208.03470v1 )

ライセンス: Link先を確認

Benteng Ma, Yushi Wang, and Shen Wang

(参考訳) 本報告では,mriを欠いた脳腫瘍分割に対する既存の深層学習(dl)法の比較検討を行った。評価されたアプローチには、Adversarial Co-Training Network (ACN) と mmGAN と DeepMedic の組み合わせがある。 mmGANのより安定的で使いやすいバージョンも、GitHubリポジトリでオープンソース化されている。 BraTS2018データセットを使用することで、最先端のACNが特にT1cが欠落している場合には、パフォーマンスが向上することを示した。 mmGANとDeepMedicの単純な組み合わせは、1つのMRIモダリティが欠如している場合にも強いポテンシャルを示す。さらに、この研究は、MRIモダリティの欠如を伴う脳腫瘍セグメンテーションの今後の研究方向について議論を始めた。

This technical report presents a comparative analysis of existing deep learning (DL) based approaches for brain tumor segmentation with missing MRI modalities. The approaches evaluated include the Adversarial Co-training Network (ACN) and a combination of mmGAN and DeepMedic. A more stable and easy-to-use version of mmGAN is also open-sourced in a GitHub repository. Using the BraTS2018 dataset, this work demonstrates that the state-of-the-art ACN performs better, especially when T1c is missing, while a simple combination of mmGAN and DeepMedic also shows strong potential when only one MRI modality is missing. Additionally, this work initiates a discussion of future research directions for brain tumor segmentation with missing MRI modalities.

翻訳日:2022-08-09 13:13:53 公開日:2022-08-06

# 顔に基づく影響計算のための不確実性モデリング付きマルチタスクトランス

Multi-Task Transformer with uncertainty modelling for Face Based Affective Computing ( http://arxiv.org/abs/2208.03506v1 )

ライセンス: Link先を確認

Gauthier Tallec, Jules Bonnard, Arnaud Dapogny, K\'evin Bailly

(参考訳) 顔に基づく感情計算は、顔画像から感情を検出する。人間の行動のより良い自動理解を解き放ち、人間と機械の相互作用を改善するための道を開くのに役立つ。しかし、それは感情の計算的表現を設計する難しいタスクが伴う。これまでのところ、感情は2次元ヴァレンス/オーラル空間で連続的に表現されるか、エクマンの7つの基本的な感情で離散的に表現されている。あるいは、エクマンの顔行動ユニット(AU)システムは、一元的な筋活動のコードブックを使用して感情を活性化するためにも使われている。 ABAW3とABAW4 マルチタスクチャレンジは、これらの3種類のラベルに注釈を付けた大規模なデータベースを提供する最初の作業である。本稿では,ヴァレンス覚醒,行動単位,基本的な感情を共同で予測するトランスフォーマティブ型マルチタスク手法を提案する。アーキテクチャの観点から、我々のメソッドはタスク間の類似性を効率的にモデル化するためにタスクワイズトークンアプローチを使用します。学習の観点からは、3つのタスクアノテーション間の確率性の差をモデル化するために不確実性重み付き損失を用いる。

Face-based affective computing consists in detecting emotions from face images. It is useful for unlocking better automatic comprehension of human behaviour and could pave the way toward improved human-machine interaction. However, it comes with the challenging task of designing a computational representation of emotions. So far, emotions have been represented either continuously in the 2D Valence/Arousal space or in a discrete manner with Ekman's 7 basic emotions. Alternatively, Ekman's Facial Action Unit (AU) system has also been used to characterize emotions using a codebook of unitary muscular activations. The ABAW3 and ABAW4 Multi-Task Challenges are the first to provide a large-scale database annotated with those three types of labels. In this paper, we present a transformer-based multi-task method for jointly learning to predict valence-arousal, action units, and basic emotions. From an architectural standpoint, our method uses a task-wise token approach to efficiently model the similarities between the tasks. From a learning point of view, we use an uncertainty-weighted loss to model the difference in stochasticity between the annotations of the three tasks.
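An uncertainty-weighted multi-task loss is often written as the sum over tasks of exp(-s_i) * L_i + s_i, where s_i is a learnable log-variance for task i. The sketch below uses that commonly seen simplified form as an assumption; the abstract does not give the exact formula the authors use, and the function name and the numbers are hypothetical.

```python
import math

def uncertainty_weighted_loss(task_losses, log_vars):
    """Combine per-task losses as sum_i exp(-s_i) * L_i + s_i.

    task_losses: list of non-negative per-task loss values L_i
    log_vars:    list of log-variances s_i, one per task (learned in practice)
    A large s_i down-weights a noisy task; the + s_i term stops s_i from
    growing without bound.
    """
    if len(task_losses) != len(log_vars):
        raise ValueError("need exactly one log-variance per task")
    return sum(math.exp(-s) * loss + s for loss, s in zip(task_losses, log_vars))

# Made-up losses for three tasks (e.g. valence-arousal, action units, emotions).
losses = [1.0, 2.0, 3.0]
equal_weighting = uncertainty_weighted_loss(losses, [0.0, 0.0, 0.0])
downweight_second = uncertainty_weighted_loss(losses, [0.0, math.log(2.0), 0.0])
print(equal_weighting, downweight_second)
```

With all log-variances at zero the expression reduces to a plain sum of the task losses, which makes it a convenient starting point for training.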

翻訳日:2022-08-09 13:13:42 公開日:2022-08-06

# 古典量子深層学習による半導体欠陥検出

Semiconductor Defect Detection by Hybrid Classical-Quantum Deep Learning ( http://arxiv.org/abs/2208.03514v1 )

ライセンス: Link先を確認

YuanFu Yang and Min Sun

(参考訳) 人工知能と自動運転技術の急速な発展により、半導体の需要は大幅に増加すると予想されている。しかし、半導体製造の大規模な拡大と新しい技術の発展は、多くの欠陥ウエハをもたらす。これらの欠陥ウエハが正しく検査されていない場合、欠陥ウエハの非効率な半導体処理は、過剰な二酸化炭素排出やエネルギー消費など、我々の環境にさらなる影響をもたらす。本稿では、量子コンピューティングの情報処理の利点を活用し、欠陥学習欠陥レビュー(DLDR)を促進する。短期量子プロセッサの深層学習のための古典量子ハイブリッドアルゴリズムを提案する。実装されたパラメータをチューニングすることにより、我々のフレームワークによって駆動される量子回路は、ウェハ欠陥マップ分類、欠陥パターン分類、ホットスポット検出を含む、所定のDLDRタスクを学習する。さらに,表現性やエンタングル能力の異なるパラメタライズド量子回路についても検討する。これらの結果は、半導体欠陥検出のための回路ベースの量子ディープラーニングを開発するための将来のロードマップを構築するために使用できる。

With the rapid development of artificial intelligence and autonomous driving technology, the demand for semiconductors is projected to rise substantially. However, the massive expansion of semiconductor manufacturing and the development of new technology will produce many defective wafers. If these defective wafers are not correctly inspected, ineffective semiconductor processing on them will have an additional impact on our environment, such as excessive carbon dioxide emissions and energy consumption. In this paper, we utilize the information processing advantages of quantum computing to promote defect learning defect review (DLDR). We propose a classical-quantum hybrid algorithm for deep learning on near-term quantum processors. By tuning the parameters implemented on it, a quantum circuit driven by our framework learns a given DLDR task, including wafer defect map classification, defect pattern classification, and hotspot detection. In addition, we explore parametrized quantum circuits with different expressibility and entangling capabilities. These results can be used to build a future roadmap for developing circuit-based quantum deep learning for semiconductor defect detection.

翻訳日:2022-08-09 13:13:23 公開日:2022-08-06

# マルチプレックス検出に基づく全スライド画像分類のための複数インスタンス学習ネットワーク

Multiplex-detection Based Multiple Instance Learning Network for Whole Slide Image Classification ( http://arxiv.org/abs/2208.03526v1 )

ライセンス: Link先を確認

Zhikang Wang, Yue Bi, Tong Pan, Chris Bain, Richard Bassed, Seiya Imoto, Jianhua Yao, Jiangning Song

(参考訳) マルチ・インスタンス・ラーニング(MIL)は、診断病理のためのスライド画像全体(WSI)を分類する強力な手法である。 WSI分類におけるMILの根本的な課題は、バッグラベルをトリガーする\textit{critical instance}を見つけることである。しかし、以前の方法は主に独立かつ同一の分布仮説(\textit{i.i.d})に基づいて設計され、例間の相関や腫瘍の不均一性は無視される。本稿では,上記の課題に取り組むために,新しいマルチプレックス検出型マルチインスタンス学習(MDMIL)を提案する。具体的には、MDMILは、内部クエリ生成モジュール(IQGM)と多重検出モジュール(MDM)によって構成され、トレーニング中にメモリベースのコントラスト損失を補助する。まず、IQGMは、分布解析後の信頼性の高い特徴を集約することにより、インスタンスの確率を与え、その後のMDMの内部クエリ(IQ)を生成する。次に、MDMにおける多重検出クロスアテンション(MDCA)と多頭自己アテンション(MHSA)が協調してWSIの最終表現を生成する。このプロセスでは、IQおよびトレーニング可能な変動クエリ(VQ)がインスタンス間の接続を構築し、不均一な腫瘍に対するモデルの堅牢性を大幅に向上する。最後に、機能空間の制約をさらに強制し、トレーニングプロセスを安定化するために、各イテレーションで1つのサンプルが入力されてもWSI分類に実行可能なメモリベースのコントラスト損失を採用する。我々はCAMELYON16, TCGA-NSCLC, TCGA-RCCの3つの計算病理データセットについて実験を行った。 MDMILの精度とAUCは,他の最先端手法よりも優れていることを示す。

Multiple instance learning (MIL) is a powerful approach to classify whole slide images (WSIs) for diagnostic pathology. A fundamental challenge of MIL on WSI classification is to discover the \textit{critical instances} that trigger the bag label. However, previous methods are primarily designed under the independent and identically distributed (\textit{i.i.d.}) hypothesis, ignoring either the correlations between instances or the heterogeneity of tumours. In this paper, we propose a novel multiplex-detection-based multiple instance learning (MDMIL) method to tackle the issues above. Specifically, MDMIL consists of an internal query generation module (IQGM) and a multiplex detection module (MDM), and is assisted by a memory-based contrastive loss during training. First, IQGM gives the probability of instances and generates the internal query (IQ) for the subsequent MDM by aggregating highly reliable features after distribution analysis. Second, the multiplex-detection cross-attention (MDCA) and multi-head self-attention (MHSA) in MDM cooperate to generate the final representations for the WSI. In this process, the IQ and the trainable variational query (VQ) build up the connections between instances and significantly improve the model's robustness toward heterogeneous tumours. Finally, to further enforce constraints in the feature space and stabilize the training process, we adopt a memory-based contrastive loss, which is practicable for WSI classification even with a single sample as input in each iteration. We conduct experiments on three computational pathology datasets: CAMELYON16, TCGA-NSCLC, and TCGA-RCC. The higher accuracy and AUC demonstrate the superiority of our proposed MDMIL over other state-of-the-art methods.

翻訳日:2022-08-09 13:13:06 公開日:2022-08-06

# カルマンフィルタを用いた短時間交通流予測

Short Duration Traffic Flow Prediction Using Kalman Filtering ( http://arxiv.org/abs/2208.03415v1 )

ライセンス: Link先を確認

Khondhaker Al Momin, Saurav Barua, Md. Shahreer Jamil, Omar Faruqe Hamim

(参考訳) 計算フィルタリング手法であるkalman filter technique (kft) を用いて,短時間交通流量の予測について検討した。短期交通予測は交通管理と交通システムの運用において重要なツールである。道路案内と高度トラベラー情報システムによる移動時間推定には, 短期交通流量値結果を用いることができる。 kftは均質なトラフィックでテストされているが、その効率性はまだ調査されていない。この調査は、ソバンバグ・モスクに近いダッカのミルプル・ロードで行われた。ストリームには不均一なトラフィックの混合が含まれており、予測の不確実性が示唆される。提案されたメソッドはPythonでpykalmanライブラリを使って実行される。ライブラリは主に、不確実性に対処するKFTフレームワークの高度なデータベースモデリングに使用される。データは、車両の3時間の交通量から導かれた。 2005年にバングラデシュのroads and highways division(rhd)が発行したgemetry design standards manualによると、不均一な交通フローの値は同等の旅客車単位(pcu)に変換された。 5分間のアグリゲーションから得られたPCUを提案モデルのデータセットとして利用した。提案されたモデルの平均絶対パーセンテージ誤差(MAPE)は14.62であり、KFTモデルは合理的に予測できることを示している。根平均二乗誤差(RMSPE)は18.73%の精度を示し、25%未満である。開発されたモデルはR2値0.879であり、データセットの変数の87.9%を説明できることを示している。データがより長期にわたって収集された場合、R2値は1.0に近い可能性がある。

This research examines the prediction of short-duration traffic flow counts with the Kalman filtering technique (KFT), a computational filtering method. Short-term traffic prediction is an important tool for operations in traffic management and transportation systems. Short-term traffic flow predictions can be used for travel time estimation by route guidance and advanced traveler information systems. Though the KFT has been tested on homogeneous traffic, its efficiency in heterogeneous traffic has yet to be investigated. The research was conducted on Mirpur Road in Dhaka, near the Sobhanbagh Mosque. The stream contains a heterogeneous mix of traffic, which implies uncertainty in prediction. The proposed method is implemented in Python using the pykalman library. The library is mostly used in advanced database modeling in the KFT framework, which addresses uncertainty. The data were derived from a three-hour traffic count of vehicles. The heterogeneous traffic flow values were converted into equivalent passenger car units (PCU) according to the Geometric Design Standards Manual published by the Roads and Highways Division (RHD), Bangladesh, in 2005. The PCU values obtained from five-minute aggregation were then used as the dataset for the proposed model. The proposed model has a mean absolute percent error (MAPE) of 14.62, indicating that the KFT model can forecast reasonably well. The root mean square percent error (RMSPE) is 18.73%, which is below 25%; hence the model is considered acceptable. The developed model has an $R^2$ value of 0.879, indicating that it can explain 87.9 percent of the variability in the dataset. If data were collected over a longer period of time, the $R^2$ value could be closer to 1.0.
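To make the workflow concrete, here is a minimal sketch of one-step-ahead forecasting with a scalar Kalman filter, followed by the MAPE and RMSPE error measures. It deliberately does not use pykalman: it is a hand-written random-walk (local level) filter, and the model choice, the noise variances, and the PCU counts are all made-up assumptions, not the authors' setup or data.

```python
import math

def kalman_one_step_forecasts(observations, process_var, obs_var, init_mean, init_var):
    """One-step-ahead forecasts from a scalar random-walk Kalman filter.

    State model:       x_t = x_{t-1} + w_t,  w_t ~ N(0, process_var)
    Observation model: z_t = x_t + v_t,      v_t ~ N(0, obs_var)
    Assumes var + obs_var stays positive.
    """
    mean, var = init_mean, init_var
    forecasts = []
    for z in observations:
        var += process_var            # predict: uncertainty grows
        forecasts.append(mean)        # forecast made before seeing z
        gain = var / (var + obs_var)  # Kalman gain
        mean += gain * (z - mean)     # update with the new count
        var *= 1.0 - gain
    return forecasts

def mape(actual, predicted):
    """Mean absolute percent error, in percent."""
    return 100.0 * sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)

def rmspe(actual, predicted):
    """Root mean square percent error, in percent."""
    return 100.0 * math.sqrt(
        sum(((a - p) / a) ** 2 for a, p in zip(actual, predicted)) / len(actual))

# Hypothetical five-minute PCU counts (illustrative only).
pcu_counts = [120.0, 132.0, 128.0, 140.0, 137.0]
forecasts = kalman_one_step_forecasts(
    pcu_counts, process_var=25.0, obs_var=100.0,
    init_mean=pcu_counts[0], init_var=100.0)
print(mape(pcu_counts[1:], forecasts[1:]), rmspe(pcu_counts[1:], forecasts[1:]))
```

The ratio of `process_var` to `obs_var` controls how quickly the forecast follows new counts; in practice these values would be estimated from data rather than fixed by hand.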

翻訳日:2022-08-09 13:02:59 公開日:2022-08-06

# DeepGen: 異種検索広告生成とリアルタイムカスタマイズ

DeepGen: Diverse Search Ad Generation and Real-Time Customization ( http://arxiv.org/abs/2208.03438v1 )

ライセンス: Link先を確認

Konstantin Golobokov, Junyi Chai, Victor Ye Dong, Mandy Gu, Bingyu Chi, Jie Cao, Yulan Yan, Yi Liu

(参考訳) 我々は、BingAdsの顧客向けにスポンサー付き検索広告(ads)を自動的に作成するWebスケールのシステムであるDeepGenを紹介する。我々は、最先端の自然言語生成(NLG)モデルを利用して、広告主のWebページから流動的な広告を抽象的に生成し、事実性や推論速度などの実用的な問題を解決する。さらに,ユーザの検索クエリに応答してカスタマイズされた広告をリアルタイムに生成し,ユーザが求めているものに基づいて,同製品のさまざまな側面を強調表示する。これを実現するために,本システムでは,先行してより小さな広告を多種多様な選択で選択し,クエリ時に最も関連性の高い広告を完全広告に縫い付ける。我々は、制御可能なNLGモデルをトレーニングし、異なる販売ポイントをハイライトする同じWebページの複数の広告を生成することにより、生成の多様性を向上させる。システム設計は、まず異なる目的で訓練された生成モデルのアンサンブルを実行し、次に多様性サンプリングアルゴリズムを用いて、オンライン選択のための生成結果の多様なサブセットを選択することにより、水平方向に多様性を向上する。実験の結果,提案するシステム設計の有効性が示された。当社のシステムは、現在本番環境で運用されており、Bingで提供されているグローバル広告の${\sim}4\%$を提供しています。

We present DeepGen, a system deployed at web scale for automatically creating sponsored search advertisements (ads) for BingAds customers. We leverage state-of-the-art natural language generation (NLG) models to generate fluent ads from advertisers' web pages in an abstractive fashion and solve practical issues such as factuality and inference speed. In addition, our system creates a customized ad in real time in response to the user's search query, thereby highlighting different aspects of the same product based on what the user is looking for. To achieve this, our system generates a diverse choice of smaller pieces of the ad ahead of time and, at query time, selects the most relevant ones to be stitched into a complete ad. We improve generation diversity by training a controllable NLG model to generate multiple ads for the same web page highlighting different selling points. Our system design further improves diversity horizontally by first running an ensemble of generation models trained with different objectives and then using a diversity sampling algorithm to pick a diverse subset of generation results for online selection. Experimental results show the effectiveness of our proposed system design. Our system is currently deployed in production, serving ${\sim}4\%$ of global ads served in Bing.
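The abstract does not say which diversity sampling algorithm is used. As one plausible illustration only, the sketch below greedily picks texts that are farthest (in word-level Jaccard distance) from those already chosen. The function names, the distance measure, the seeding rule, and the example ad texts are all hypothetical.

```python
def jaccard_distance(a, b):
    """1 - |A ∩ B| / |A ∪ B| over lower-cased word sets."""
    sa, sb = set(a.lower().split()), set(b.lower().split())
    union = sa | sb
    if not union:
        return 0.0
    return 1.0 - len(sa & sb) / len(union)

def select_diverse(candidates, k):
    """Greedy max-min selection of up to k mutually dissimilar candidates.

    The first candidate seeds the selection (for example, the top-ranked one);
    each later pick maximises its distance to the closest already-chosen item.
    """
    if k <= 0 or not candidates:
        return []
    chosen = [0]
    while len(chosen) < min(k, len(candidates)):
        best_idx, best_score = None, -1.0
        for i in range(len(candidates)):
            if i in chosen:
                continue
            score = min(jaccard_distance(candidates[i], candidates[j]) for j in chosen)
            if score > best_score:
                best_idx, best_score = i, score
        chosen.append(best_idx)
    return [candidates[i] for i in chosen]

# Made-up ad texts.
ads = [
    "free shipping on all shoes",
    "free shipping on all boots",
    "award winning customer support",
]
picked = select_diverse(ads, 2)
print(picked)
```

A production system would more likely measure distance in an embedding space and balance diversity against predicted relevance; the greedy max-min rule here only conveys the general idea.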

翻訳日:2022-08-09 13:02:36 公開日:2022-08-06

# Follow Me: ターゲット駆動型レコメンデーション対話システムのための会話計画

Follow Me: Conversation Planning for Target-driven Recommendation Dialogue Systems ( http://arxiv.org/abs/2208.03516v1 )

ライセンス: Link先を確認

Jian Wang, Dongding Lin, Wenjie Li

(参考訳) 推薦対話システムは,ユーザとソーシャル・ボンドを構築し,高品質なレコメンデーションを提供することを目的としている。本稿では,目標駆動型レコメンデーション対話システムと呼ばれる有望なパラダイムに向けて前進する。会話を通じて、ユーザが指定されたターゲットを受け入れるように自然に導く方法に重点を置いています。そこで本研究では,対話行動と話題のシーケンスを計画し,異なる会話ステージ間を積極的に移動させる目標駆動型会話計画(tcp)フレームワークを提案する。次に、TCPに予定内容を適用して対話生成をガイドする。実験の結果,対話計画が目標主導型レコメンデーション対話システムの性能を大幅に向上させることがわかった。

Recommendation dialogue systems aim to build social bonds with users and provide high-quality recommendations. This paper pushes forward towards a promising paradigm called target-driven recommendation dialogue systems, which is highly desired yet under-explored. We focus on how to naturally and gradually lead users to accept the designated targets through conversation. To this end, we propose a Target-driven Conversation Planning (TCP) framework to plan a sequence of dialogue actions and topics, driving the system to transition between different conversation stages proactively. We then apply our TCP with planned content to guide dialogue generation. Experimental results show that our conversation planning significantly improves the performance of target-driven recommendation dialogue systems.

翻訳日:2022-08-09 13:02:13 公開日:2022-08-06

# 解剖学的追跡データを用いた繊維束検出のための時間的アンサンブルを用いた制約付き自己監督法

Constrained self-supervised method with temporal ensembling for fiber bundle detection on anatomic tracing data ( http://arxiv.org/abs/2208.03569v1 )

ライセンス: Link先を確認

Vaanathi Sundaresan, Julia F. Lehman, Sean Fitzgibbon, Saad Jbabdi, Suzanne N. Haber, Anastasia Yendiki

(参考訳) anatomic tracing dataは、拡散mriでよく見られるエラーに対処するのに必要な脳回路に関する詳細な情報を提供する。しかし, 追跡データ上での繊維束の自動検出は, 歪み, ノイズ, アーティファクトの存在, 強度・コントラストの変動などにより困難である。本研究では,マカク脳のトレーサ部における繊維束の正確なセグメンテーションを考慮した,解剖学的制約を考慮した自己教師付き損失関数を用いた深層学習法を提案する。また,手動ラベルの可用性が限られているため,ラベルなしデータを効率よく使用して性能を向上させるための半教師付きトレーニング手法と,偽陽性のさらなる低減のための位置制約を用いる。異なるマカクの未確認区間における本手法の評価は, 真正率~0.90の有望な結果をもたらす。このメソッドのコードはhttps://github.com/v-sundaresan/fiberbundle_seg_tracingで入手できる。

Anatomic tracing data provides detailed information on brain circuitry essential for addressing some of the common errors in diffusion MRI tractography. However, automated detection of fiber bundles on tracing data is challenging due to sectioning distortions, presence of noise and artifacts and intensity/contrast variations. In this work, we propose a deep learning method with a self-supervised loss function that takes anatomy-based constraints into account for accurate segmentation of fiber bundles on the tracer sections from macaque brains. Also, given the limited availability of manual labels, we use a semi-supervised training technique for efficiently using unlabeled data to improve the performance, and location constraints for further reduction of false positives. Evaluation of our method on unseen sections from a different macaque yields promising results with a true positive rate of ~0.90. The code for our method is available at https://github.com/v-sundaresan/fiberbundle_seg_tracing.

翻訳日:2022-08-09 12:58:11 公開日:2022-08-06

# 手続き型犯罪ドラマシリーズCSIの記憶可能性の分析

Analysing the Memorability of a Procedural Crime-Drama TV Series, CSI ( http://arxiv.org/abs/2208.03479v1 )

ライセンス: Link先を確認

Sean Cummins and Lorin Sweeney and Alan F. Smeaton

(参考訳) 我々は,映像の暗記性を予測するタスクを微調整した視覚変換器を用いて,人気テレビシリーズCSIの5シーズンスパンの記憶可能性について検討した。ビデオの暗記性スコアを付加した詳細な注釈付きコーパスを用いて、一般的な犯罪ドラマテレビのジャンルを調査することにより、映像の暗記性スコアから意味を抽出する方法を示す。映像の記憶可能性と番組の様々な側面を関連付けるための定量的分析を行う。本稿では,教育,マーケティング,インデクシングといった分野のマルチメディアを利用したアプリケーションにおいて,テレビや映画製作などにおいて,映像の記憶可能性の重要性について考察する。

We investigate the memorability of a 5-season span of a popular crime-drama TV series, CSI, through the application of a vision transformer fine-tuned on the task of predicting video memorability. By investigating the popular genre of crime-drama TV through the use of a detailed annotated corpus combined with video memorability scores, we show how to extrapolate meaning from the memorability scores generated on video shots. We perform a quantitative analysis to relate video shot memorability to a variety of aspects of the show. The insights we present in this paper illustrate the importance of video memorability in applications that use multimedia in areas like education, marketing, and indexing, as well as, in the case here, TV and film production.

翻訳日:2022-08-09 12:44:59 公開日:2022-08-06

# 臨床関連二次特徴を用いた膵腫瘍検出の改善

Improved Pancreatic Tumor Detection by Utilizing Clinically-Relevant Secondary Features ( http://arxiv.org/abs/2208.03581v1 )

ライセンス: Link先を確認

Christiaan G.A. Viviers and Mark Ramaekers and Peter H.N. de With and Dimitrios Mavroeidis and Joost Nederend and Misha Luyer and Fons van der Sommen

(参考訳) 膵癌は、がん関連死亡の世界的な原因の1つである。コンピュータ支援診断・診断法(CAD)におけるDeep Learningの成功にもかかわらず,膵癌検出にはほとんど注意が払われていない。本報告では, 膵腫瘍の診断法として, 周囲解剖学的特徴を活用し, 放射線科医の知識を他の従来の深層学習法と比較して有効活用するための方法を提案する。この目的のために膵管腺癌99例と膵腫瘍を伴わない97例からなる新しいデータセットを収集した。膵癌の増殖パターンのため、腫瘍は常に低感覚病変として見られるわけではないため、専門家は腫瘍の存在を示す可能性のある二次的な外部特徴の視認性について言及している。本稿では, 膵管, 総胆管, 膵臓の2次的特徴を利用するU-NetライクなDeep CNNとCTスキャンを併用した手法を提案する。これらの特徴を用いて、モデルが膵腫瘍の存在を判断する。この分類と局所化手法のセグメンテーションは、99%の感度(1件欠落)と99%の特異性を達成し、従来の最先端法に比べて5%の感度向上を実現する。さらに、従来のPDAC検出法と比較して、適切な精度と推論時間の短い位置情報を提供する。これらの結果は,新しいCAD手法の開発において,臨床専門家の知識を取り入れることの重要性を強調した。

Pancreatic cancer is one of the leading causes of cancer-related deaths worldwide. Despite the success of Deep Learning in computer-aided diagnosis and detection (CAD) methods, little attention has been paid to the detection of pancreatic cancer. We propose a method for detecting pancreatic tumors that utilizes clinically relevant features in the surrounding anatomical structures, thereby aiming to exploit the radiologist's knowledge better than conventional deep learning approaches. To this end, we collect a new dataset consisting of 99 cases with pancreatic ductal adenocarcinoma (PDAC) and 97 control cases without any pancreatic tumor. Due to the growth pattern of pancreatic cancer, the tumor may not always be visible as a hypodense lesion; therefore, experts refer to the visibility of secondary external features that may indicate the presence of the tumor. We propose a method based on a U-Net-like Deep CNN that exploits the following external secondary features: the pancreatic duct, the common bile duct, and the pancreas, along with a processed CT scan. Using these features, the model segments the pancreatic tumor if it is present. This segmentation-for-classification-and-localization approach achieves 99% sensitivity (one case missed) and 99% specificity, a 5% increase in sensitivity over the previous state-of-the-art method. The model additionally provides location information with reasonable accuracy and a shorter inference time compared to previous PDAC detection methods. These results offer a significant performance improvement and highlight the importance of incorporating the knowledge of the clinical expert when developing novel CAD methods.

翻訳日:2022-08-09 12:39:49 公開日:2022-08-06

# 部分観測可能な環境における再帰的ネットワーク、隠れ状態、信念

Recurrent networks, hidden states and beliefs in partially observable environments ( http://arxiv.org/abs/2208.03520v1 )

ライセンス: Link先を確認

Gaspard Lambrechts, Adrien Bolland, Damien Ernst

(参考訳) 強化学習は、動的に未知な環境との相互作用から最適方針を学ぶことを目的としている。多くの手法は値関数の近似に頼り、ほぼ最適ポリシーを導出する。部分的に観測可能な環境では、これらの関数は履歴と呼ばれる観測と過去の行動の完全な順序に依存する。本研究では,そのような値関数を近似するために訓練されたリカレントニューラルネットワークが,その歴史が与えられた状態の後方確率分布を内部的にフィルタすることを示す。より正確には、リカレントニューラルネットワークがQ-関数を学習するにつれて、その隠れた状態が、最適制御に関連する状態変数の信念とますます相関していることが示される。この相関は相互情報によって測定される。さらに,エージェントの期待リターンは,その隠れた状態と信念の間の高い相互情報に達するために,その再帰的なアーキテクチャの能力によって増加することを示した。最後に,隠蔽状態と最適制御に無関係な変数の信念との相互情報を学習過程を通じて減少させることを示す。要約すると、その隠れた状態において、部分的に観測可能な環境のq関数を近似する再帰的ニューラルネットワークは、最適な行動を取るための信念の関連部分と関連付けられた履歴から十分な統計を再現する。

Reinforcement learning aims to learn optimal policies from interaction with environments whose dynamics are unknown. Many methods rely on the approximation of a value function to derive near-optimal policies. In partially observable environments, these functions depend on the complete sequence of observations and past actions, called the history. In this work, we show empirically that recurrent neural networks trained to approximate such value functions internally filter the posterior probability distribution of the current state given the history, called the belief. More precisely, we show that, as a recurrent neural network learns the Q-function, its hidden states become more and more correlated with the beliefs of state variables that are relevant to optimal control. This correlation is measured through their mutual information. In addition, we show that the expected return of an agent increases with the ability of its recurrent architecture to reach a high mutual information between its hidden states and the beliefs. Finally, we show that the mutual information between the hidden states and the beliefs of variables that are irrelevant for optimal control decreases through the learning process. In summary, this work shows that in its hidden states, a recurrent neural network approximating the Q-function of a partially observable environment reproduces a sufficient statistic from the history that is correlated to the relevant part of the belief for taking optimal actions.
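The belief mentioned above is the posterior over hidden states given the history, and for a discrete model it follows the standard Bayes filter recursion b'(s') ∝ O(o | s', a) · Σ_s T(s' | s, a) · b(s). The sketch below implements that recursion for a tiny, made-up two-state model; the table layout (`T[action][s][s_next]`, `O[action][s_next][observation]`) and the numbers are assumptions chosen for the example.

```python
def belief_update(belief, action, observation, T, O):
    """One Bayes-filter step for a discrete partially observable model.

    belief:      list with belief[s] = P(state = s | history so far)
    T[a][s][s2]: transition probability P(s2 | s, a)
    O[a][s2][o]: observation probability P(o | s2, a)
    """
    n = len(belief)
    unnormalised = []
    for s_next in range(n):
        predicted = sum(T[action][s][s_next] * belief[s] for s in range(n))
        unnormalised.append(O[action][s_next][observation] * predicted)
    z = sum(unnormalised)
    if z == 0.0:
        raise ValueError("observation has zero probability under the model")
    return [u / z for u in unnormalised]

# Toy model: one action, static state, noisy sensor (made-up numbers).
T = [[[1.0, 0.0],
      [0.0, 1.0]]]
O = [[[0.8, 0.2],   # in state 0, observation 0 is seen with probability 0.8
      [0.2, 0.8]]]  # in state 1, observation 0 is seen with probability 0.2
belief = belief_update([0.5, 0.5], action=0, observation=0, T=T, O=O)
print(belief)
```

The paper's claim is that the hidden state of a recurrent network trained on the Q-function ends up carrying information about this quantity, even though the network is never given `T` or `O`.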

翻訳日:2022-08-09 12:32:51 公開日:2022-08-06

# 複数物体追跡のための変圧器に基づく割当決定ネットワーク

Transformer-based assignment decision network for multiple object tracking ( http://arxiv.org/abs/2208.03571v1 )

ライセンス: Link先を確認

Athena Psalta, Vasileios Tsironis and Konstantinos Karantzalos

Data association is a crucial component of any multiple object tracking (MOT) method that follows the tracking-by-detection paradigm. To generate complete trajectories, such methods employ a data association process to establish assignments between detections and existing targets at each timestep. Recent data association approaches solve a multi-dimensional linear assignment task or a network flow minimization problem, or tackle the problem via multiple hypothesis tracking. However, during inference an optimization step that computes optimal assignments is required for every frame of the sequence, adding significant computational complexity to any given solution. To this end, we introduce the Transformer-based Assignment Decision Network (TADN), which tackles data association without the need for any explicit optimization during inference. In particular, TADN can directly infer assignment pairs between detections and active targets in a single forward pass of the network. We integrate TADN into a rather simple MOT framework, design a novel training strategy for efficient end-to-end training, and demonstrate the high potential of our approach for online visual tracking-by-detection MOT on two popular benchmarks, MOT17 and UA-DETRAC. Our proposed approach outperforms the state of the art on most evaluation metrics despite its simple nature as a tracker that lacks significant auxiliary components such as occlusion handling or re-identification. The implementation of our method is publicly available at https://github.com/psaltaath/tadn-mot.
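
To make the contrast in this abstract concrete, here is a minimal sketch that sets a per-frame optimization step beside a direct decoding of assignment scores. Both functions are illustrative assumptions: `optimal_assignment` uses brute force as a stand-in for a linear assignment solver, and `direct_assignment` with its `null_score` threshold is only one plausible way to read pairs off a score matrix, not necessarily the decoding rule TADN uses.

```python
from itertools import permutations

def optimal_assignment(cost):
    """Exhaustive linear assignment for a square cost matrix (rows: targets, cols: detections).

    Stands in for the per-frame optimization step used by classical trackers;
    real systems use the Hungarian algorithm rather than brute force.
    """
    n = len(cost)
    best = min(permutations(range(n)), key=lambda p: sum(cost[i][p[i]] for i in range(n)))
    # Each pair is (target index, detection index).
    return list(enumerate(best))

def direct_assignment(scores, null_score=0.5):
    """Assign each detection to its highest-scoring target, or to None.

    scores[d][t] is a hypothetical network output in [0, 1]; a detection whose
    best score falls below null_score is left unassigned (e.g. a new object).
    """
    pairs = []
    for d, row in enumerate(scores):
        t = max(range(len(row)), key=row.__getitem__)
        pairs.append((d, t if row[t] >= null_score else None))
    return pairs
```

Note that the per-row argmax does not by itself prevent two detections from choosing the same target; a tracker built this way has to rely on the learned scores, or on an extra rule, to keep assignments one-to-one.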

Translated: 2022-08-09 12:29:38 Published: 2022-08-06

# An Adaptive and Altruistic PSO-based Deep Feature Selection Method for Pneumonia Detection from Chest X-Rays ( http://arxiv.org/abs/2208.03558v1 )

License: see the linked page

Rishav Pramanik, Sourodip Sarkar, Ram Sarkar

Pneumonia is one of the major causes of child mortality, especially in income-deprived regions of the world. Although it can be detected and treated with relatively unsophisticated instruments and medication, pneumonia detection remains a major concern in developing countries. Computer-aided diagnosis (CAD) systems can be used in such countries because their operating costs are lower than those of professional medical experts. In this paper, we propose a CAD system for pneumonia detection from chest X-rays, using the concepts of deep learning and a meta-heuristic algorithm. We first extract deep features from a pre-trained ResNet50 fine-tuned on a target pneumonia dataset. Then, we propose a feature selection technique based on particle swarm optimization (PSO), which is modified using a memory-based adaptation parameter and enriched by incorporating altruistic behavior into the agents. We name our feature selection method adaptive and altruistic PSO (AAPSO). The proposed method successfully eliminates non-informative features obtained from the ResNet50 model, thereby improving the pneumonia detection ability of the overall framework. Extensive experimentation and thorough analysis on a publicly available pneumonia dataset establish the superiority of the proposed method over several other frameworks used for pneumonia detection. Apart from pneumonia detection, AAPSO is further evaluated on some standard UCI datasets, gene expression datasets for cancer prediction, and a COVID-19 prediction dataset. The overall results are satisfactory, thereby confirming the usefulness of AAPSO in dealing with varied real-life problems. The supporting source code for this work can be found at https://github.com/rishavpramanik/AAPSO.
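
For orientation, the following is a minimal sketch of plain binary PSO used for feature selection. It covers only the standard velocity update and sigmoid transfer function; the memory-based adaptation parameter and the altruistic behavior that define AAPSO are not reproduced here, and the function name, default hyperparameters, and `fitness` callback are assumptions made for illustration.

```python
import math
import random

def binary_pso(fitness, n_features, n_particles=8, n_iters=30, w=0.7, c1=1.5, c2=1.5, seed=0):
    """Plain binary PSO: each particle is a 0/1 mask over features; fitness is maximized."""
    rng = random.Random(seed)
    pos = [[rng.randint(0, 1) for _ in range(n_features)] for _ in range(n_particles)]
    vel = [[0.0] * n_features for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_fit = [fitness(p) for p in pos]
    g = max(range(n_particles), key=pbest_fit.__getitem__)
    gbest, gbest_fit = pbest[g][:], pbest_fit[g]
    for _ in range(n_iters):
        for i in range(n_particles):
            for j in range(n_features):
                # Velocity: inertia plus pulls toward the personal and global bests.
                vel[i][j] = (w * vel[i][j]
                             + c1 * rng.random() * (pbest[i][j] - pos[i][j])
                             + c2 * rng.random() * (gbest[j] - pos[i][j]))
                # Sigmoid transfer function turns the velocity into a bit probability.
                prob = 1.0 / (1.0 + math.exp(-vel[i][j]))
                pos[i][j] = 1 if rng.random() < prob else 0
            fit = fitness(pos[i])
            if fit > pbest_fit[i]:
                pbest[i], pbest_fit[i] = pos[i][:], fit
                if fit > gbest_fit:
                    gbest, gbest_fit = pos[i][:], fit
    return gbest, gbest_fit
```

In a pipeline like the one described, `fitness` would typically score a mask by training a lightweight classifier on the selected deep features and penalizing the number of features kept; that choice is also an assumption rather than a detail given in the abstract.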

Translated: 2022-08-09 12:22:46 Published: 2022-08-06

# Multi-view deep learning for reliable post-disaster damage classification ( http://arxiv.org/abs/2208.03419v1 )

License: see the linked page

Asim Bashir Khajwal, Chih-Shen Cheng, Arash Noshadravan

This study aims to enable more reliable automated post-disaster building damage classification using artificial intelligence (AI) and multi-view imagery. Current practices and research efforts in adopting AI for post-disaster damage assessment are generally (a) qualitative, lacking refined classification of building damage levels based on standard damage scales, and (b) trained on aerial or satellite imagery with limited views, which, although indicative, do not fully describe the damage scale. To enable more accurate and reliable automated quantification of damage levels, the present study proposes the use of more comprehensive visual data in the form of multiple ground and aerial views of the buildings. To obtain such a spatially aware damage prediction model, a Multi-view Convolutional Neural Network (MV-CNN) architecture is used that combines the information from different views of a damaged building. This spatial 3D context damage information results in more accurate identification of damage and more reliable quantification of damage levels. The proposed model is trained and validated on a reconnaissance visual dataset containing expert-labeled, geotagged images of the buildings inspected following Hurricane Harvey. The developed model demonstrates reasonably good accuracy in predicting the damage levels and can be used to support more informed and reliable AI-assisted disaster management practices.
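
A common way for multi-view CNNs to combine information from several images of one object is view pooling: each view passes through a shared feature extractor, and the resulting vectors are merged by an element-wise maximum. The sketch below shows only that fusion step on plain lists; whether this study fuses views in exactly this way is an assumption, and `view_pool` is a hypothetical name.

```python
def view_pool(view_features):
    """Element-wise max over per-view feature vectors (one vector per image of the building)."""
    if not view_features:
        raise ValueError("need at least one view")
    # zip(*...) groups the k-th feature of every view together before taking the max.
    return [max(values) for values in zip(*view_features)]
```

Because the maximum ignores ordering, the pooled descriptor does not depend on the order in which ground and aerial views are supplied.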

Translated: 2022-08-09 12:21:33 Published: 2022-08-06

# IVT: An End-to-End Instance-guided Video Transformer for 3D Pose Estimation ( http://arxiv.org/abs/2208.03431v1 )

License: see the linked page

Zhongwei Qiu, Qiansheng Yang, Jian Wang, Dongmei Fu

Video 3D human pose estimation aims to localize the 3D coordinates of human joints from videos. Recent transformer-based approaches focus on capturing spatiotemporal information from sequential 2D poses, which cannot model contextual depth features effectively because the visual depth features are lost in the 2D pose estimation step. In this paper, we simplify the paradigm into an end-to-end framework, the Instance-guided Video Transformer (IVT), which learns spatiotemporal contextual depth information from visual features effectively and predicts 3D poses directly from video frames. In particular, we first formulate video frames as a series of instance-guided tokens, where each token is in charge of predicting the 3D pose of one human instance. These tokens contain body structure information because they are extracted under the guidance of joint offsets from the human center to the corresponding body joints. Then, these tokens are sent into IVT to learn spatiotemporal contextual depth. In addition, we propose a cross-scale instance-guided attention mechanism to handle the varying scales among multiple persons. Finally, the 3D pose of each person is decoded from the instance-guided tokens by coordinate regression. Experiments on three widely used 3D pose estimation benchmarks show that the proposed IVT achieves state-of-the-art performance.

Translated: 2022-08-09 12:21:11 Published: 2022-08-06

# Exploring the Effects of Data Augmentation for Drivable Area Segmentation ( http://arxiv.org/abs/2208.03437v1 )

License: see the linked page

Srinjoy Bhuiya, Ayushman Kumar, Sankalok Sen

The real-time segmentation of drivable areas plays a vital role in achieving autonomous perception in cars. Recently, there have been rapid strides in the development of image segmentation models using deep learning. However, most of the advances have been made in model architecture design. In any supervised deep learning problem related to segmentation, the success of a model depends on the amount and quality of the training data used for it. This data should contain well-annotated, varied images for the segmentation model to work well. Problems with the annotations in a dataset can leave the model with an overwhelming number of Type I and Type II errors in testing and validation, causing serious problems when it is applied to real-world tasks. To address this problem and to make the model more accurate, dynamic, and robust, data augmentation is used, as it helps expand the sample training data and makes it better and more diversified overall. Hence, in our study, we focus on investigating the benefits of data augmentation by analyzing pre-existing image datasets and performing augmentations accordingly. Our results show that the performance and robustness of existing state-of-the-art (SOTA) models can be increased dramatically without any increase in model complexity or inference time. The augmentations used in this paper were chosen only after a thorough study of several other augmentation methodologies and strategies in widespread use today and their corresponding effects. All our results are reported on the widely used Cityscapes dataset.
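
One practical point about augmentation for segmentation is that any geometric transform must be applied to the image and its label mask together, otherwise the annotations no longer line up with the pixels. The sketch below shows this for a random horizontal flip on plain 2D lists; the abstract does not list the specific augmentations studied, so the flip and the names `hflip` and `augment_pair` are illustrative assumptions.

```python
import random

def hflip(grid):
    """Mirror a 2D grid (list of rows) left to right."""
    return [row[::-1] for row in grid]

def augment_pair(image, mask, rng, p=0.5):
    """Randomly flip an image and its segmentation mask together.

    rng is a random.Random instance, so the draw can be seeded and reproduced.
    """
    if rng.random() < p:
        return hflip(image), hflip(mask)
    return image, mask
```

Since the transform runs only while training batches are prepared, it adds no parameters to the model and no cost at inference time, which matches the abstract's point about unchanged model complexity.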

Translated: 2022-08-09 12:20:51 Published: 2022-08-06

# HaloAE: An HaloNet based Local Transformer Auto-Encoder for Anomaly Detection and Localization ( http://arxiv.org/abs/2208.03486v1 )

License: see the linked page

E. Mathian, H. Liu, L. Fernandez-Cuesta, D. Samaras, M. Foll, L. Chen

Unsupervised anomaly detection and localization is a crucial task because it is impossible to collect and label all possible anomalies. Many studies have emphasized the importance of integrating local and global information to achieve accurate segmentation of anomalies. To this end, there has been growing interest in the Transformer, which allows long-range content interactions to be modeled. However, global interactions through self-attention are generally too expensive for most image scales. In this study, we introduce HaloAE, the first auto-encoder based on a local 2D version of the Transformer with HaloNet. With HaloAE, we have created a hybrid model that combines convolution and local 2D block-wise self-attention layers and jointly performs anomaly detection and segmentation with a single model. We achieved competitive results on the MVTec dataset, suggesting that vision models incorporating the Transformer could benefit from a local computation of the self-attention operation, which paves the way for other applications.

Translated: 2022-08-09 12:20:28 Published: 2022-08-06

# Study of detecting behavioral signatures within DeepFake videos ( http://arxiv.org/abs/2208.03561v1 )

License: see the linked page

Qiaomu Miao, Sinhwa Kang, Stacy Marsella, Steve DiPaola, Chao Wang, Ari Shapiro

There is strong interest in the generation of synthetic video imagery of people talking for various purposes, including entertainment, communication, training, and advertisement. With the development of deep fake generation models, synthetic video imagery will soon be visually indistinguishable to the naked eye from naturally captured video. In addition, many methods continue to improve in order to evade more careful, forensic visual analysis. Some deep fake videos are produced through facial puppetry, which directly controls the head and face of the synthetic image through the movements of an actor, allowing the actor to 'puppet' the image of another person. In this paper, we address the question of whether one person's movements can be distinguished from those of the original speaker by controlling the visual appearance of the speaker while transferring the behavior signals from another source. We conduct a study comparing synthetic imagery that 1) originates from a different person speaking a different utterance, 2) originates from the same person speaking a different utterance, and 3) originates from a different person speaking the same utterance. Our study shows that synthetic videos in all three cases are seen as less real and less engaging than the original source video. Our results indicate that there could be a behavioral signature, detectable from a person's movements and separate from their visual appearance, and that this behavioral signature could be used to distinguish a deep fake from a properly captured video.

Translated: 2022-08-09 12:20:09 Published: 2022-08-06

# On the Fundamental Limits of Formally (Dis)Proving Robustness in Proof-of-Learning ( http://arxiv.org/abs/2208.03567v1 )

License: see the linked page

Congyu Fang, Hengrui Jia, Anvith Thudi, Mohammad Yaghini, Christopher A. Choquette-Choo, Natalie Dullerud, Varun Chandrasekaran, Nicolas Papernot

Proof-of-learning (PoL) proposes that a model owner use machine learning training checkpoints to establish a proof of having expended the necessary compute for training. The authors of PoL forgo cryptographic approaches and trade rigorous security guarantees for scalability to deep learning by making the protocol applicable to stochastic gradient descent and its adaptive variants. This lack of formal analysis leaves open the possibility that an attacker may be able to spoof a proof for a model they did not train. We contribute a formal analysis of why the PoL protocol cannot be formally (dis)proven to be robust against spoofing adversaries. To do so, we disentangle the two roles of proof verification in PoL: (a) efficiently determining whether a proof is a valid gradient descent trajectory, and (b) establishing precedence by making it more expensive to craft a proof after training completes (i.e., spoofing). We show that efficient verification results in a tradeoff between accepting legitimate proofs and rejecting invalid proofs because deep learning necessarily involves noise. Without a precise analytical model of how this noise affects training, we cannot formally guarantee whether a PoL verification algorithm is robust. Then, we demonstrate that establishing precedence robustly also reduces to an open problem in learning theory: spoofing a PoL post hoc training is akin to finding different trajectories with the same endpoint in non-convex learning. Yet, we do not rigorously know whether prior knowledge of the final model weights helps discover such trajectories. We conclude that, until the aforementioned open problems are addressed, relying more heavily on cryptography is likely needed to formulate a new class of PoL protocols with formal robustness guarantees. In particular, this will help with establishing precedence. As a by-product of insights from our analysis, we also demonstrate two novel attacks against PoL.
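
The verification tradeoff described in this abstract can be pictured with a toy check: the verifier replays one training update from a logged checkpoint and accepts the next checkpoint only if the replayed weights land within a tolerance of the claimed ones. This is a simplified sketch under stated assumptions, not the PoL protocol itself; `sgd_step`, `verify_step`, `grad_fn`, and the Euclidean tolerance `delta` are hypothetical names chosen for illustration.

```python
import math

def sgd_step(weights, grad_fn, batch, lr):
    """One plain SGD update: w <- w - lr * grad(w, batch)."""
    grads = grad_fn(weights, batch)
    return [w - lr * g for w, g in zip(weights, grads)]

def verify_step(w_start, w_claimed, grad_fn, batch, lr, delta):
    """Accept the claimed checkpoint if replaying the update lands within delta of it."""
    replayed = sgd_step(w_start, grad_fn, batch, lr)
    # Euclidean distance between the replayed and the claimed weights.
    return math.dist(replayed, w_claimed) <= delta
```

Because real training is noisy (for example through hardware nondeterminism), `delta` cannot be zero: a larger value accepts more honest proofs but also leaves more room for fabricated checkpoints, which is the tension the abstract points to.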

Translated: 2022-08-09 12:16:44 Published: 2022-08-06

# Learning Human Cognitive Appraisal Through Reinforcement Memory Unit ( http://arxiv.org/abs/2208.03473v1 )

License: see the linked page

Yaosi Hu and Zhenzhong Chen

We propose a novel memory-enhancing mechanism for recurrent neural networks that exploits the effect of human cognitive appraisal in sequential assessment tasks. We conceptualize the memory-enhancing mechanism as a Reinforcement Memory Unit (RMU) that contains an appraisal state together with two reinforcement memories, one positive and one negative. The two reinforcement memories are decayed or strengthened by stronger stimuli. The appraisal state is then updated through the competition between the positive and negative reinforcement memories. As a result, the RMU can learn how appraisal varies under drastic changes in the stimuli when estimating human affective experience. As shown in experiments on video quality assessment and video quality of experience tasks, the proposed reinforcement memory unit achieves superior performance among recurrent neural networks, which demonstrates the effectiveness of the RMU for modeling human cognitive appraisal.

Translated: 2022-08-09 12:14:17 Published: 2022-08-06

PDF registration status (publication date: 20220806)