Fugu-MT: Japanese Translations of arXiv Papers

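This site provides Japanese translations of arXiv papers that are 30 pages or fewer and released under a Creative Commons license (CC 0, CC BY, CC BY-SA). For papers whose body text is not CC-licensed, or that are too long, only the metadata is translated. (arXiv metadata is CC 0.) The translations are licensed under CC BY-SA 4.0. Translation is performed with Fugu-Machine Translator.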

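The operator of this site makes no guarantee as to the quality of this site (including all information and translations) and accepts no liability whatsoever for any consequences arising from the use of this site (including all information and translations).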

The papers listed here have a publication date of 20230106.

Title	Authors	Abstract	Publication date / Translation date
# 二成分量子チャネルの前方古典的容量の束縛 Bounding the forward classical capacity of bipartite quantum channels ( http://arxiv.org/abs/2010.01058v3 ) ライセンス: Link先を確認	Dawei Ding, Sumeet Khatri, Yihui Quek, Peter W. Shor, Xin Wang, Mark M. Wilde	(参考訳) 両部量子チャネルにおける前方古典通信の様々な方法を紹介する。点対点チャネルは二成分チャネルの特別な場合であるため、この測度は点対点チャネルの古典的通信の測度に還元される。その結果、これらの削減された測度は、wangらが量子チャネルの古典的容量の境界に関する以前の研究で報告されている。応用として、この測度は二部流路の前方古典的容量の上限であることを示す。減少測度は、古典的フィードバックチャネルによって支援される点対点量子チャネルの古典的容量の上界である。様々な測度のいくつかは半定義プログラミングによって計算できる。 We introduce various measures of forward classical communication for bipartite quantum channels. Since a point-to-point channel is a special case of a bipartite channel, the measures reduce to measures of classical communication for point-to-point channels. As it turns out, these reduced measures have been reported in prior work of Wang et al. on bounding the classical capacity of a quantum channel. As applications, we show that the measures are upper bounds on the forward classical capacity of a bipartite channel. The reduced measures are upper bounds on the classical capacity of a point-to-point quantum channel assisted by a classical feedback channel. Some of the various measures can be computed by semi-definite programming.	翻訳日:2023-04-30 04:01:53 公開日:2023-01-06
# 最適線形検出器の設計 --ボトムアップアプローチ- Designing optimal linear detectors -- a bottom-up approach ( http://arxiv.org/abs/2110.07942v5 ) ライセンス: Link先を確認	Joe Bentley, Hendra Nurdin, Yanbei Chen, Xiang Li, Haixing Miao	(参考訳) 本稿では, 線形検出器を最適感度で実現するための系統的アプローチを開発し, 極弱信号の検出を可能にする。まず、一般的な制約は線形検出器の入出力伝達関数の特定のクラスに導かれる。すると、そのクラスにおける転送関数の物理的実現が量子ネットワーク合成技術を用いて見出され、入出力転送関数から直接物理セットアップを推測することができる。最小内部モード数を持つ最小限の実現法を探索することにより、最適検出器は内部スクイーズ方式であることが示される。そして、パリティ時間対称系に動機づけられた非最小実現を探索し、量子非退化測定を体系的に回収する。 This paper develops a systematic approach to realising linear detectors with an optimised sensitivity, allowing for the detection of extremely weak signals. First, general constraints are derived on a specific class of input-output transfer functions of a linear detector. Then a physical realization of transfer functions in that class is found using the quantum network synthesis technique, which allows for the inference of the physical setup directly from the input-output transfer function. By exploring a minimal realization which has the minimum number of internal modes, it is shown that the optimal such detectors are internal squeezing schemes. Then, investigating non-minimal realizations, which is motivated by the parity-time symmetric systems, a quantum non-demolition measurement is systematically recovered.	翻訳日:2023-03-11 10:10:31 公開日:2023-01-06
# ゼロ次元ボゾン系の合成空間における創発的非エルミート局在現象 Emergent non-Hermitian localization phenomena in the synthetic space of zero-dimensional bosonic systems ( http://arxiv.org/abs/2110.15286v6 ) ライセンス: Link先を確認	Ievgen I. Arkhipov, Fabrizio Minganti	(参考訳) 非エルミート系の相転移は、最先端の理論と実験的研究に焦点をあてている。一方、パリティ時間 ($\cal PT$-) と反$\cal PT$-対称物理学は、例外点 (EPs) と呼ばれる非エルミートスペクトル特異点の存在により、常に関心を集めている。一方、非エルミート系のトポロジカルおよび局在遷移は、例えば非エルミート皮膚効果や従来のバルク境界対応の欠如など、新しい現象を示す。従来の研究の大部分は、位相的および局在的遷移現象を示すために、微調整された拡張格子を必要とする非エルミートハミルトン系にのみ焦点をあてており、本研究では、非エルミート局所化現象が、ゼロ次元ボソニック系の合成場モーメント空間、例えば、反$\cal PT$および$\cal PT$対称量子ダイマーにおいて自然にどのように現れるかを示す。これは低次元系の局所化遷移をシミュレートする機会を与え、例えば結合キャビティや導波路のような複雑な配列を構成する必要はない。実際、運動の場モーメント方程式は、1次元(1D)合成格子で動く等価な(準)粒子を記述することができる。この合成場モーメント空間は、高縮退EPの存在によって誘導される非エルミート皮膚効果のような非自明な局在現象を示すことができる。我々は,高次場モーメント固有空間をSylvester行列形状の合成1次元非エルミート・ハミルトニアンでエミュレートした反$\cal PT$-symmetric two-modeシステムの例を示した。この結果は、光子モーメントや相関関数を測定することにより、超伝導回路やトロイダル共振器などの最先端光学装置で直接検証することができる。 Phase transitions in non-Hermitian systems are at the focus of cutting edge theoretical and experimental research. On the one hand, parity-time- ($\cal PT$-) and anti-$\cal PT$-symmetric physics have gained ever-growing interest, due to the existence of non-Hermitian spectral singularities called exceptional points (EPs). On the other, topological and localization transitions in non-Hermitian systems reveal new phenomena, e.g., the non-Hermitian skin effect and the absence of conventional bulk-boundary correspondence. The great majority of previous studies exclusively focus on non-Hermitian Hamiltonians, whose realization requires {\it a priori} fine-tuned extended lattices to exhibit topological and localization transition phenomena. In this work, we show how the non-Hermitian localization phenomena can naturally emerge in the synthetic field moments space of zero-dimensional bosonic systems, e.g., in anti-$\cal PT$ and $\cal PT$-symmetric quantum dimers. This offers an opportunity to simulate localization transitions in low-dimensional systems, without the need to construct complex arrays of, e.g., coupled cavities or waveguides. Indeed, the field moment equations of motion can describe an equivalent (quasi-)particle moving in a one-dimensional (1D) synthetic lattice. This synthetic field moments space can exhibit nontrivial localization phenomena, such as the non-Hermitian skin effect, induced by the presence of highly-degenerate EPs. We demonstrate our findings on the example of an anti-$\cal PT$-symmetric two-mode system, whose higher-order field moments eigenspace is emulated by a synthetic 1D non-Hermitian Hamiltonian having a Sylvester matrix shape. Our results can be directly verified in state-of-the-art optical setups, such as superconducting circuits and toroidal resonators, by measuring photon moments or correlation functions.	翻訳日:2023-03-10 00:58:39 公開日:2023-01-06
# 多部門内在的非局所性とデバイス非依存会議鍵合意 Multipartite Intrinsic Non-Locality and Device-Independent Conference Key Agreement ( http://arxiv.org/abs/2111.02596v3 ) ライセンス: Link先を確認	Aby Philip, Eneet Kaur, Peter Bierhorst, and Mark M. Wilde	(参考訳) 本研究では,デバイス非依存(DI)会議鍵契約におけるマルチパートシナリオにおける資源の定量化手法として,マルチパーティ固有の非局所性を導入する。局所演算と共通乱数性と呼ばれる自由操作のクラスにおいて,多部固有の非局所性は加法,凸,単調であることを証明する。我々の技術的貢献の1つとして、我々は2種類の多成分相互情報の連鎖規則を確立し、多成分内在的非局所性が付加的であることを証明するために使用する。この連鎖規則は他の文脈において独立した関心を持つかもしれない。多部内在的非局所性(multipartite intrinsic non-locality)は、diカンファレンスキーアグリーメントの一般的な多部内在的非局所性(multipartite intrinsic non-locality)において、秘密鍵レートの上限となるものです。本稿では、DI会議鍵プロトコルの様々な例について論じ、これらのプロトコルの上限を既知の下限と比較する。最後に、di量子鍵分布の最近の実験的実現における上限を計算する。 In this work, we introduce multipartite intrinsic non-locality as a method for quantifying resources in the multipartite scenario of device-independent (DI) conference key agreement. We prove that multipartite intrinsic non-locality is additive, convex, and monotone under a class of free operations called local operations and common randomness. As one of our technical contributions, we establish a chain rule for two variants of multipartite mutual information, which we then use to prove that multipartite intrinsic non-locality is additive. This chain rule may be of independent interest in other contexts. All of these properties of multipartite intrinsic non-locality are helpful in establishing the main result of our paper: multipartite intrinsic non-locality is an upper bound on secret key rate in the general multipartite scenario of DI conference key agreement. We discuss various examples of DI conference key protocols and compare our upper bounds for these protocols with known lower bounds. Finally, we calculate upper bounds on recent experimental realizations of DI quantum key distribution.	翻訳日:2023-03-09 04:47:36 公開日:2023-01-06
# 量子エラー補正と耐故障性入門 Introduction to Quantum Error Correction and Fault Tolerance ( http://arxiv.org/abs/2111.08894v4 ) ライセンス: Link先を確認	Steven M. Girvin	(参考訳) 2019年のLes Houches Summer Schoolの講義ノートは、ビットと量子ビットによる古典的および量子的誤り訂正と連続的な可変系(高調波発振器)の導入を目的としている。後者の焦点は、超電導回路とマイクロ波光子に基づくモジュラーアーキテクチャによって、今日または近い将来に実現可能な実用的な例に焦点が当てられる。ゴールとビジョンは「ハードウエア効率」な量子エラー補正であり、実用的で有用なフォールトトレランスと回路深度を達成するために、指数関数的に大きなハードウェアオーバーヘッドを必要としない。 These lecture notes from the 2019 Les Houches Summer School on 'Quantum Information Machines' are intended to provide an introduction to classical and quantum error correction with bits and qubits, and with continuous variable systems (harmonic oscillators). The focus on the latter will be on practical examples that can be realized today or in the near future with a modular architecture based on superconducting electrical circuits and microwave photons. The goal and vision is 'hardware-efficient' quantum error correction that does not require exponentially large hardware overhead in order to achieve practical and useful levels of fault tolerance and circuit depth.	翻訳日:2023-03-07 22:01:26 公開日:2023-01-06
# 責任ある量子について語る: 認識は絶対最小であり... Talking about responsible quantum: Awareness is the absolute minimum... that we need to do ( http://arxiv.org/abs/2112.01378v4 ) ライセンス: Link先を確認	Tara Roberson	(参考訳) 量子技術に関するハイプは、このセクターの社会的影響について議論を呼んだ。量子技術の責任ある開発を確実にするための呼び出しは、具体的なケーススタディや無責任量子の実例の欠如によって複雑になる。この段階では、責任量子はコリングリッジジレンマを思い起こさせる状況に直面している。このジレンマにおいて、社会的なリスクと利益に関する議論が最も影響のある瞬間は、最も少ない情報が得られる時間でもある。この課題の裏側は、セクターの軌道(および潜在的な問題)が閉じ込められる前に、量子の公共利益を調べるためのプロセスを構築する機会である。この分野での最近の研究は、量子研究者やイノベーターが不確実性や懸念に対処するために社会と協力する必要があると主張している。量子利害関係者の関与と責任観の理解により、この提案を支持し、量子技術の責任ある開発と利用に関するさらなる対話を可能にすることを目指す。 Hype over novel quantum technologies has prompted discussion on the likely societal impacts of the sector. Calls to ensure the responsible development of quantum technologies are complicated by a lack of concrete case studies or real-world examples of irresponsible quantum. At this stage, responsible quantum faces a situation reminiscent of the Collingridge dilemma. In this dilemma, the moment in which discussion on societal risks and benefits can be most impactful is also the time where the least information is available. The flipside of this challenge is an opportunity to build processes for examining the public good of quantum before the trajectory (and potential problems) of the sector become locked in. Recent work in this space has argued that quantum researchers and innovators must work with society to address uncertainties and concerns. By engaging quantum stakeholders and understanding their perspectives on responsibility, this paper seeks to support this proposition and enable further dialogue on responsible development and use of quantum technologies.	翻訳日:2023-03-06 04:24:55 公開日:2023-01-06
# 準巡回符号から構築した新しい二進量子符号 New Binary Quantum Codes Constructed from Quasi-Cyclic Codes ( http://arxiv.org/abs/2112.07137v3 ) ライセンス: Link先を確認	Chaofeng Guan, Ruihu Li, Liangdong Lu, Yu Yao	(参考訳) 量子符号は古典的シンプレクティック双対包含符号によって構成できることはよく知られている。本稿では,2世代準巡回符号のファミリーを考察し,これらの符号がシンプレクティックな二重包含となるための十分な条件を導出する。そこで,シンプレクティック双対包含符号を用いたバイナリ量子符号の構成法を提案する。アプリケーションとして、最もよく知られた結果を超える8つのバイナリ量子コードを構築します。さらに、伝播規則によってさらに36個の新しいバイナリ量子符号が得られ、いずれも最小距離における下限を改善する。 It is well known that quantum codes can be constructed by means of classical symplectic dual-containing codes. This paper considers a family of two-generator quasi-cyclic codes and derives sufficient conditions for these codes to be symplectic dual-containing. Then, a new method for constructing binary quantum codes using symplectic dual-containing codes is proposed. As an application, we construct 8 binary quantum codes that exceed the best-known results. Further, another 36 new binary quantum codes are obtained by propagation rules, all of which improve the lower bound on the minimum distances.	翻訳日:2023-03-04 14:31:02 公開日:2023-01-06
# ホログラフィ、セルレーション、誤り訂正符号 Holography, cellulations and error correcting codes ( http://arxiv.org/abs/2112.12468v2 ) ライセンス: Link先を確認	Marika Taylor, Charles Woodward	(参考訳) 双曲平面に関連する量子誤り訂正符号はads$_3$/cft$_2$対応の文脈で広く研究されている。本稿では,高次元のホログラフィックジオメトリに関連する符号の体系的研究を開始し,ジオメトリの空間断面のセルレーションと安定化符号を関連づける。本研究では,3次元双曲空間(AdS$_4$)に対するHaPPY符号の類似を,絶対最大絡み(AME)符号と非AME符号の両方を用いて構成する。これらの符号は双曲空間の均一な正則テッセレーションに基づいているが、テッセレーションのポリトープの離散対称性を保存するAME符号は2次元以上は存在しないことに留意する。また,論理情報が境界に関連付けられる双曲空間に対するスタビリサー符号の異なる構成を探索し,それらの解釈について考察する。双曲空間のトロイダル還元による重力-スカラー理論(JT重力など)に基づくホログラフィック双対の興味深いクラスに、我々の符号がどのように適用できるかを説明する。 Quantum error correction codes associated with the hyperbolic plane have been explored extensively in the context of the AdS$_3$/CFT$_2$ correspondence. In this paper we initiate a systematic study of codes associated with holographic geometries in higher dimensions, relating cellulations of the spatial sections of the geometries to stabiliser codes. We construct analogues of the HaPPY code for three-dimensional hyperbolic space (AdS$_4$), using both absolutely maximally entangled (AME) and non-AME codes. These codes are based on uniform regular tessellations of hyperbolic space but we note that AME codes that preserve the discrete symmetry of the polytope of the tessellation do not exist above two dimensions. We also explore different constructions of stabiliser codes for hyperbolic spaces in which the logical information is associated with the boundary and discuss their potential interpretation. We explain how our codes could be applied to interesting classes of holographic dualities based on gravity-scalar theories (such as JT gravity) through toroidal reductions of hyperbolic spaces.	翻訳日:2023-03-03 18:00:04 公開日:2023-01-06
# 混合状態自由QFTにおける量子情報のチャネル誘起ダイナミクス Channel induced dynamics of quantum information in mixed state free QFTs ( http://arxiv.org/abs/2201.02723v3 ) ライセンス: Link先を確認	Michal Baczyk	(参考訳) 本稿では,場の励起を量子チャネルとして表現できる量子場理論(QFT)の研究フレームワークを提案する。 1次元QFT系の正規化真空状態と2つの同一自由QFT系の格子制御熱場二重状態の2つの普遍状態に対する提案方式の内部動作を実証する。単体および非単体ボソニックガウスチャネル(ペッツ回収マップを含む)の動作について検討する。チャネル静的動作とチャネル誘起力学の特性を評価し定量化するために,量子エントロピーと忠実度を計算する。 We propose a framework for Quantum Field Theory (QFT) studies that allows us to represent field excitations as quantum channels. We demonstrate inner-workings of the proposed scheme for two universal states: the regularized vacuum state of a one dimensional QFT system and the lattice-regulated Thermofield Double State of two identical free QFTs. We investigate actions of unitary and non-unitary Bosonic Gaussian channels (including Petz Recovery maps). To evaluate and quantify the character of the channel static action and channel induced dynamics we calculate quantum entropies and fidelities.	翻訳日:2023-03-01 23:36:38 公開日:2023-01-06
# ブロックチェーンユーザの政治的、経済的、ガバナンス的態度 Political, economic, and governance attitudes of blockchain users ( http://arxiv.org/abs/2301.02734v1 ) ライセンス: Link先を確認	Lucia M. Korpas, Seth Frey, Joshua Tan	(参考訳) ブロックチェーンエコシステムの一部である人々を対象に、暗号政治、暗号経済、暗号統治の感情を評価するための調査を行う。 3710人の調査回答に基づいて、その信念、態度、暗号参加の態様を説明し、自己報告された政治的提携とブロックチェーンエコシステムがこれらとどのように関連しているかを調査した。我々は,経済力分布の認識,暗号に対する個人的態度,ガバナンスにおける権力分布に関する規範的信念,ブロックチェーン技術の外部的規制に関する質問において,分極を観察した。政治的自己同一化の相違は、経済的公平性、性平等、意思決定力、適切な規制の獲得方法に関する意見と相関し、ブロックチェーン関連は、暗号通貨のガバナンスと規制に関する意見と相関し、回答者の暗号と個人的目標の意味的概念が関与している。また、理論駆動構成の政治軸は、データによって支持され、データから生じる他の回答者のグループ化や信念の可能性を調査する。 We present a survey to evaluate crypto-political, crypto-economic, and crypto-governance sentiment in people who are part of a blockchain ecosystem. Based on 3710 survey responses, we describe their beliefs, attitudes, and modes of participation in crypto and investigate how self-reported political affiliation and blockchain ecosystem affiliation are associated with these. We observed polarization in questions on perceptions of the distribution of economic power, personal attitudes towards crypto, normative beliefs about the distribution of power in governance, and external regulation of blockchain technologies. Differences in political self-identification correlated with opinions on economic fairness, gender equity, decision-making power and how to obtain favorable regulation, while blockchain affiliation correlated with opinions on governance and regulation of crypto and respondents' semantic conception of crypto and personal goals for their involvement. We also find that a theory-driven constructed political axis is supported by the data and investigate the possibility of other groupings of respondents or beliefs arising from the data.	翻訳日:2023-02-19 13:30:26 公開日:2023-01-06
# 関数型プログラミング・アサインメントにおける学生クラスタの識別 : クイックラーナーからストラグリング学生へ Identifying Different Student Clusters in Functional Programming Assignments: From Quick Learners to Struggling Students ( http://arxiv.org/abs/2301.02611v1 ) ライセンス: Link先を確認	Chuqin Geng, Wenwen Xu, Yingjie Xu, Brigitte Pientka, Xujie Si	(参考訳) インストラクターや学生は、学生がいかにうまく教材を習得しているか、学生が苦闘しているかを示す重要な指標として、プログラミング課題の成績にしばしば注目される。しかしこれは誤解を招く可能性がある。特に、学生がオートグレーターにアクセスできる場合、成績は大幅に歪められることがある。本稿では,McGill大学における関数型プログラミングコースから収集した学生の課題提出データを,幅広い特徴を取り入れて分析する。グレードに加えて、アクティビティ時間データ、費やされた時間、静的エラーの数についても検討する。これにより、クラスタアルゴリズムを通じて、"quick-learning"、"hardworking"、"satisficing"、"struggling"の4つの学生クラスタを識別することができます。次に、作業習慣、作業期間、エラーの範囲、エラーを修正する能力が、学生の異なるクラスタに与える影響を分析する。この構造化分析は、インストラクターがさまざまなタイプの学生を積極的に支援し、コース全体のデザインの異なる側面を強調するための貴重な洞察を提供する。また、学生自身がどの側面に苦しむかを理解し、明確化を追求し、仕事の習慣を調整するための洞察を提供する。 Instructors and students alike are often focused on the grade in programming assignments as a key measure of how well a student is mastering the material and whether a student is struggling. This can be, however, misleading. Especially when students have access to auto-graders, their grades may be heavily skewed. In this paper, we analyze student assignment submission data collected from a functional programming course taught at McGill University incorporating a wide range of features. In addition to the grade, we consider activity time data, time spent, and the number of static errors. This allows us to identify four clusters of students: "Quick-learning", "Hardworking", "Satisficing", and "Struggling" through cluster algorithms. We then analyze how work habits, working duration, the range of errors, and the ability to fix errors impact different clusters of students. This structured analysis provides valuable insights for instructors to actively help different types of students and emphasize different aspects of their overall course design. It also provides insights for students themselves to understand which aspects they still struggle with and allows them to seek clarification and adjust their work habits.	翻訳日:2023-02-19 13:30:00 公開日:2023-01-06
# インフォマティクスにおけるバランス改善 : 学生との正直な議論 Better Balance in Informatics: An Honest Discussion with Students ( http://arxiv.org/abs/2301.02532v1 ) ライセンス: Link先を確認	Elisavet Kozyri, Mariel Evelyn Markussen Ellingsen, Ragnhild Abel Grape, Letizia Jaccheri	(参考訳) 近年,コンピュータ科学(cs)の学術環境において,男女のバランスを促進する取り組みが盛んに行われている。しかし、学生から博士号取得者、教員まで、すべてのcsの学術レベルでは男女差が残っている。この傾向は、UiT(ノルウェー北極大学)のコンピュータ科学科(Department of Computer Science)が続く。 UiTのCS環境におけるこの傾向に対処するため,本学部の学生を対象に構造化された議論を行った。これらの議論から収集したデータを分析した結果、我々の部署のジェンダーギャップを緩和できる行動項目が特定できた。特に、これらの議論は、達成する方法を解明した。 (i)CS学部課程への学生のバランスの取れた流れ (二)バランスの取れたCS研究環境、及び (iii)csアカデミア(例えばphdプログラム)のより高いレベルへの卒業生のバランスの取れたフロー。本報告では, 省庁に対して行った議論の結果とその後の提言について述べる。また、ジェンダーバランス行動計画の一環として、他の機関が同様のイベントを組織化するためのロードマップも提供します。 In recent years, there has been considerable effort to promote gender balance in the academic environment of Computer Science (CS). However, there is still a gender gap at all CS academic levels: from students, to PhD candidates, to faculty members. This general trend is followed by the Department of Computer Science at UiT The Arctic University of Norway. To combat this trend within the CS environment at UiT, we embarked on structured discussions with students of our department. After analyzing the data collected from these discussions, we were able to identify action items that could mitigate the existing gender gap at our department. In particular, these discussions elucidated ways to achieve (i) a balanced flow of students into CS undergraduate program, (ii) a balanced CS study environment, and (iii) a balanced flow of graduates into higher levels of the CS academia (e.g., PhD program). This paper presents the results of the discussions and the subsequent recommendations that we made to the administration of the department. We also provide a road-map that other institutions could follow to organize similar events as part of their gender-balance action plan.	翻訳日:2023-02-19 13:29:38 公開日:2023-01-06
# 労働者の声:なぜ労働者中心のクラウドワークアプローチが混み合っているのか Voices of Workers: Why a Worker-Centered Approach to Crowd Work Is Challenging ( http://arxiv.org/abs/2212.14471v2 ) ライセンス: Link先を確認	Caifan Du, Matthew Lease	(参考訳) 広く、多様性があり、シフトし、目に見えない群衆の労働力を理解するにはどうすればよいのか。一般市民のコミュニティフォーラムにおける公開投稿のオンライン観察と分析から得られた知見を報告する。特に,群集作業のメディア描写に関して,群集作業員とジャーナリストの間で繰り返し緊張関係が見られた。群衆の多様性は、群衆の仕事の幅広い経験に対処する上で、あらゆる1次元表現が不十分であることがわかった。我々は、規模、多様性、可視性、そして大衆の宣伝に対する抵抗が、特に群衆の仕事に対する労働者中心のアプローチを特に困難にし、労働者の多様性とその生活経験をよりよく理解する必要があると論じている。 How can we better understand the broad, diverse, shifting, and invisible crowd workforce, so that we can better support it? We present findings from online observations and analysis of publicly available postings from a community forum of crowd workers. In particular, we observed recurring tensions between crowd workers and journalists regarding media depictions of crowd work. We found that crowd diversity makes any one-dimensional representation inadequate in addressing the wide-ranging experiences of crowd work. We argue that the scale, diversity, invisibility, and the crowds' resistance to publicity make a worker-centered approach to crowd work particularly challenging, necessitating better understanding the diversity of workers and their lived experiences.	翻訳日:2023-02-19 13:22:51 公開日:2023-01-06
# 社会分析のための自己教師付きハイパーグラフ表現学習 Self-supervised Hypergraph Representation Learning for Sociological Analysis ( http://arxiv.org/abs/2212.11440v2 ) ライセンス: Link先を確認	Xiangguo Sun, Hong Cheng, Bo Liu, Jia Li, Hongyang Chen, Guandong Xu, Hongzhi Yin	(参考訳) 現代の社会学は行動分析の説得力のある社会的基準の多くを深く発見してきた。残念ながら、それらの多くは、オンラインソーシャルネットワークで測定され、提示されるには主観的すぎる。一方、データマイニング技術はデータパターンをよりよく見つけることができるが、その多くは不自然な理解を残している。本稿では,データマイニング技術と社会学的行動基準のさらなる融合を支援するための基本的な方法論を提案する。まず、効果的なハイパーグラフ認識と高速なライングラフ構築フレームワークを提案する。ハイパーグラフは、ハイパーグラフの各エッジが2つ以上のノードを含んでおり、社会環境を記述するのに最適であるため、個人とその環境間の相互作用をより深く示すことができる。ライングラフは、それぞれの社会環境を、異なる環境間の基盤となる影響を持つスーパーノードとして扱う。そこで,我々は従来の対関係を越え,様々な社会学的基準の下でより豊かなパターンを探索する。第2に,ユーザからユーザへ,ユーザへ,環境へ,環境から環境へ流れる社会的影響を学習するハイパーグラフベースのニューラルネットを提案する。第3に、社会的適合性、社会的等価性、環境の進化、社会分極化といった社会学的基準を効果的に評価するために、質的および定量的なソリューションを提案する。広範な実験により,オンラインユーザ行動と社会学的分析のためのデータマイニングタスクを,フレームワークがより良くサポートできることが判明した。 Modern sociology has profoundly uncovered many convincing social criteria for behavioural analysis. Unfortunately, many of them are too subjective to be measured and presented in online social networks. On the other hand, data mining techniques can better find data patterns but many of them leave behind unnatural understanding. In this paper, we propose a fundamental methodology to support the further fusion of data mining techniques and sociological behavioral criteria. Our highlights are three-fold: First, we propose an effective hypergraph awareness and a fast line graph construction framework. The hypergraph can more profoundly indicate the interactions between individuals and their environments because each edge in the hypergraph (a.k.a. hyperedge) contains more than two nodes, which is perfect to describe social environments. A line graph treats each social environment as a super node with the underlying influence between different environments. In this way, we go beyond traditional pair-wise relations and explore richer patterns under various sociological criteria; Second, we propose a novel hypergraph-based neural network to learn social influence flowing from users to users, users to environments, environment to users, and environments to environments. The neural network can be learned via a task-free method, making our model very flexible to support various data mining tasks and sociological analysis; Third, we propose both qualitative and quantitative solutions to effectively evaluate the most common sociological criteria like social conformity, social equivalence, environmental evolving and social polarization. Our extensive experiments show that our framework can better support both data mining tasks for online user behaviours and sociological analysis.	翻訳日:2023-02-19 13:15:53 公開日:2023-01-06
# 回転ボース・アインシュタイン凝縮体の基底状態を計算するための二次流れ Second-order flows for computing the ground states of rotating Bose-Einstein condensates ( http://arxiv.org/abs/2205.00805v2 ) ライセンス: Link先を確認	Haifan Chen, Guozhi Dong, Wei Liu, Ziqing Xie	(参考訳) 本稿では,一階流と見なされる勾配流と区別される二階時間微分を含む人工進化微分方程式について述べる。これは、凸最適化の減衰を伴う慣性力学の最近の進歩により、一般的なトピックである。数学的には、回転ボース・アインシュタイン凝縮体(bec)の基底状態は、正規化制約の下で角運動量回転項を持つグロス・ピタエフスキーエネルギー汎関数の最小値としてモデル化することができる。この制約付き非凸最適化問題に対するエネルギー最小化戦略として2種類の二階流を導入する。提案した人工力学は、散逸を伴う2階非線形双曲偏微分方程式である。時間的離散化のための明示的および半単純的手法や空間的離散化のためのフーリエ擬スペクトル法など、いくつかの数値的離散化方式が議論されている。これらのアルゴリズムは、回転するbecの基底状態を計算するための効率的でロバストなアルゴリズムを提供する。特に, 新たに開発したアルゴリズムは, 勾配流に基づく最先端の数値手法よりも優れていることがわかった。勾配流型アプローチと比較して、明示的な時間的離散化戦略を採用すると、提案手法はより安定した時間的ステップサイズを実現することができる; 半単純離散化では、同じステップサイズを使用するが、提案手法が停止基準に達するためには、より少ないイテレーションが必要であり、毎回ステップがほぼ同じ計算複雑性に遭遇する。リッチで詳細な数値例が検証と比較のために文書化されている。 Second-order flows in this paper refer to some artificial evolutionary differential equations involving second-order time derivatives distinguished from gradient flows which are considered to be first-order flows. This is a popular topic due to the recent advances of inertial dynamics with damping in convex optimization. Mathematically, the ground state of a rotating Bose-Einstein condensate (BEC) can be modeled as a minimizer of the Gross-Pitaevskii energy functional with angular momentum rotational term under the normalization constraint. We introduce two types of second-order flows as energy minimization strategies for this constrained non-convex optimization problem, in order to approach the ground state. The proposed artificial dynamics are novel second-order nonlinear hyperbolic partial differential equations with dissipation. Several numerical discretization schemes are discussed, including explicit and semi-implicit methods for temporal discretization, combined with a Fourier pseudospectral method for spatial discretization. These provide a series of efficient and robust algorithms for computing the ground states of rotating BECs. Particularly, the newly developed algorithms turn out to be superior to the state-of-the-art numerical methods based on the gradient flow. In comparison with the gradient flow type approaches: when explicit temporal discretization strategies are adopted, the proposed methods allow for larger stable time step sizes; for semi-implicit discretization, using the same step size, far fewer iterations are needed for the proposed methods to reach the stopping criterion, and every time step has almost the same computational complexity. Rich and detailed numerical examples are documented for verification and comparison.	翻訳日:2023-02-14 20:40:58 公開日:2023-01-06
# ニューラルネットワークを用いたcovid-19患者のフィットネス依存型オプティマイザ Fitness Dependent Optimizer with Neural Networks for COVID-19 patients ( http://arxiv.org/abs/2302.02986v1 ) ライセンス: Link先を確認	Maryam T. Abdulkhaleq, Tarik A. Rashid, Bryar A. Hassan, Abeer Alsadoon, Nebojsa Bacanin, Amit Chhabra, S. Vimal	(参考訳) 2019年に中国で発生した新型コロナウイルス(COVID-19)は、世界の健康に大きな影響を与え、世界中の医療機関に多大な負担を与えている。これらの効果は今日も続いている。ウイルスの感染を制限する一つの戦略は、疑わしい症例を早期に診断し、病気がさらに拡大する前に適切な対策を講じることである。本研究は, 文献的臨床データに基づき, 感染の可能性を診断し, 明らかにすることを目的としている。本研究では,5つの機械学習技術(GWO_MLP,GWO_CMLP,MGWO_MLP,FDO_MLP,FDO_CMLP)を用いて,Covid-19患者を2つのカテゴリに分類した。実験はすべての使用モデルに有望な結果をもたらした。適用された手法は、通常精度の点で非常によく似た性能を示した。しかし、各テストデータセットにおいて、FDO_MLPとFDO_CMLPは100%精度で最良の結果を得た。他のモデルの結果は、ある実験から別の実験へと変化した。その結果,FDOアルゴリズムを学習アルゴリズムとして用いたモデルは,高い精度が得られる可能性が示唆された。しかし、FDOは他のアルゴリズムと比較して最長のランタイムを持つことがわかった。 covid 19モデルへのリンクはこちら。 https://github.com/tarik4rashid4/covid19models The Coronavirus, known as COVID-19, which appeared in 2019 in China, has significantly affected global health and become a huge burden on health institutions all over the world. These effects are continuing today. One strategy for limiting the virus's transmission is to have an early diagnosis of suspected cases and take appropriate measures before the disease spreads further. This work aims to diagnose and show the probability of getting infected by the disease according to textual clinical data. In this work, we used five machine learning techniques (GWO_MLP, GWO_CMLP, MGWO_MLP, FDO_MLP, FDO_CMLP) all of which aim to classify Covid-19 patients into two categories (Positive and Negative). Experiments showed promising results for all used models. The applied methods showed very similar performance, typically in terms of accuracy. However, in each tested dataset, FDO_MLP and FDO_CMLP produced the best results with 100% accuracy. The other models' results varied from one experiment to the other. It is concluded that the models on which the FDO algorithm was used as a learning algorithm had the possibility of obtaining higher accuracy. However, it is found that FDO has the longest runtime compared to the other algorithms. The link to the covid 19 models is found here: https://github.com/Tarik4Rashid4/covid19models	翻訳日:2023-02-12 13:04:44 公開日:2023-01-06
# 15のパズル-3つのヒューリスティックス法のハイブリッド化による新しいアプローチ The Fifteen Puzzle- A New Approach through Hybridizing Three Heuristics Methods ( http://arxiv.org/abs/2302.02985v1 ) ライセンス: Link先を確認	Dler O. Hasan, Aso M. Aladdin, Hardi Sabah Talabani, Tarik Ahmed Rashid, and Seyedali Mirjalili	(参考訳) 15のパズル問題は、数世紀にわたって数学愛好家を魅了してきた最も古典的な問題の1つである。これは主に、探索すべき約$10^{13}$の状態を持つ状態空間の巨大なサイズと、Fifteen Puzzleインスタンスの解決にいくつかのアルゴリズムが適用されているためである。本稿では,この大きな状態空間に対処するために,マンハッタン距離 (md), 線形衝突 (lc), 歩行距離 (wd) といった3つのヒューリスティックを持つ双方向a* (ba) 探索アルゴリズムを用いた。 3つのヒューリスティックはアルゴリズムによって生成された状態の数を劇的に減らす方法でハイブリダイゼーションされる。さらに、これらのヒューリスティックは25KBのストレージしか必要としないが、アルゴリズムは生成された状態の数を効果的に減らし、ノード数を減らした。 BAサーチの実装は,空間の複雑さを著しく低減し,最適解か準最適解かを保証できる。 The Fifteen Puzzle problem is one of the most classical problems that have captivated mathematical enthusiasts for centuries. This is mainly because of the huge size of the state space, with approximately $10^{13}$ states that have to be explored, and several algorithms have been applied to solve the Fifteen Puzzle instances. In this paper, to deal with this large state space, the Bidirectional A* (BA) search algorithm with three heuristics, namely Manhattan distance (MD), linear conflict (LC), and walking distance (WD), has been used to solve the Fifteen Puzzle problems. The three mentioned heuristics will be hybridized in a way that can dramatically reduce the number of generated states by the algorithm. Moreover, all those heuristics require only 25KB of storage but help the algorithm effectively reduce the number of generated states and expand fewer nodes. Our implementation of BA search can significantly reduce the space complexity, and guarantee either optimal or near-optimal solutions.	翻訳日:2023-02-12 13:04:23 公開日:2023-01-06
# 非マルコフ散逸から量子ナノデバイスの時空間制御へ From Non-Markovian Dissipation to Spatiotemporal Control of Quantum Nanodevices ( http://arxiv.org/abs/2205.11247v3 ) ライセンス: Link先を確認	Thibaut Lacroix, Brendon W. Lovett, Alex W. Chin	(参考訳) 量子効果を利用するナノデバイスは、将来の量子技術(QT)の重要な要素であるが、それらの実世界の性能は、局所的な「環境」相互作用から生じるデコヒーレンスによって強く制限されている。複数の機能ユニットを含むデバイスが複雑化するにつれて、ローカルな環境が重なり始め、新しい時間と長さのスケールで環境に媒介するデコヒーレンス現象が発生する可能性がある。このような複雑で本質的に非マルコフ力学は、QTのスケールアップに挑戦する可能性があるが、一方では、酵素や光合成タンパク質のような生物学的ナノマシンで起こることが示唆されるように、環境が「シグナル」とエネルギーを伝達する能力も、コンポーネント間プロセスの時空間的調整を可能にする可能性がある。数値的に正確な多くのボディ・メソッド(テンソル・ネットワーク)を探索し、空間的に離れた非相互作用量子系の進化を伝播する環境力学をどのように推し進めるかを探求する。本研究では, 環境に放出されるエネルギーを遠隔で収穫し, 過渡的な励起・反応性状態を生成する方法を示し, システム励起によって引き起こされる再編成が「機能的」量子システムの「下流」運動を質的かつ可逆的に変化させる可能性を明らかにする。完全なシステム環境波動関数へのアクセスにより、これらの現象の基礎となる顕微鏡プロセスが解明され、エネルギー効率のよい量子デバイスにどのように活用できるかの新しい知見が得られた。 Nanodevices exploiting quantum effects are critically important elements of future quantum technologies (QT), but their real-world performance is strongly limited by decoherence arising from local 'environmental' interactions. Compounding this, as devices become more complex, i.e. contain multiple functional units, the 'local' environments begin to overlap, creating the possibility of environmentally mediated decoherence phenomena on new time-and-length scales. Such complex and inherently non-Markovian dynamics could present a challenge for scaling up QT, but, on the other hand, the ability of environments to transfer 'signals' and energy might also enable sophisticated spatiotemporal coordination of inter-component processes, as is suggested to happen in biological nanomachines, like enzymes and photosynthetic proteins. Exploiting numerically exact many-body methods (tensor networks) we study a general, fully quantum model that allows us to explore how propagating environmental dynamics can instigate and direct the evolution of spatially remote, non-interacting quantum systems. We demonstrate how energy dissipated into the environment can be remotely harvested to create transient excited/reactive states, and also identify how reorganisation triggered by system excitation can qualitatively and reversibly alter the 'downstream' kinetics of a 'functional' quantum system. With access to complete system-environment wave functions, we elucidate the microscopic processes underlying these phenomena, providing new insight into how they could be exploited for energy efficient quantum devices.	翻訳日:2023-02-12 07:47:55 公開日:2023-01-06
# mc-qtaim分析によるコヒーレント量子重ね合わせマロンアルデヒドのエキゾチック結合の解明 The MC-QTAIM analysis reveals an exotic bond in the coherently quantum superposed Malonaldehyde ( http://arxiv.org/abs/2205.12090v3 ) ライセンス: Link先を確認	Mohammad Goli and Shant Shahbazian	(参考訳) マロンアルデヒド分子の2つの酸素原子間のプロトンは、2つの井戸の間にプロトン波動関数が非局在化する効果的な二重ウェルポテンシャルを経験する。そこで我々は分子分割法における原子の最先端の多成分量子理論を用いて分子構造、すなわち分子と結合ネットワークの原子をマロンアルデヒドの重ね合わせのabイニティオ波動関数から得る。プロトンが水素盆地を形成するマロンアルデヒドのよく知られたクランプ・プロトン描写とは対照的に、重畳された状態では水素盆地は消滅し、代わりに2つの新しいハイブリッド酸素-水素盆地が出現し、2つの盆地の間に陽子集団が均等に分布する。ハイブリッド盆地間の相互作用は、前例のないメカニズムによって安定している。これは、一方の盆地における1プロトン密度と他方の盆地における1電子密度の古典的クーロン相互作用の安定化を含む。この安定化機構は、化学において既知の結合モードと異なる結合をもたらす。 The proton between the two oxygen atoms of the malonaldehyde molecule experiences an effective double-well potential in which the proton wavefunction is delocalized between the two wells. Herein we employed the state-of-the-art multi-component quantum theory of atoms in molecules partitioning scheme to obtain the molecular structure, i.e. atoms in molecules and bonding network, from the superposed ab initio wavefunctions of malonaldehyde. In contrast to the familiar clamped-proton portrayal of malonaldehyde, in which the proton forms a hydrogen basin, for the superposed states the hydrogen basin disappears and two novel hybrid oxygen-hydrogen basins appear instead, with an even distribution of the proton population between the two basins. The interaction between the hybrid basins is stabilizing thanks to an unprecedented mechanism. This involves the stabilizing classical Coulomb interaction of the one-proton density in one of the basins with the one-electron density in the other basin. This stabilizing mechanism yields a bond foreign to the known bonding modes in chemistry.	翻訳日:2023-02-11 22:03:54 公開日:2023-01-06
# 空洞内超低温原子の自己組織化超放射相における次元交叉 Dimensional crossover in self-organised super-radiant phases of ultra cold atoms inside a cavity ( http://arxiv.org/abs/2206.04518v3 ) ライセンス: Link先を確認	Poornima Shakya, Amulya Ratnakar, Sankalpa Ghosh	(参考訳) 各ポンプがキャビティ軸の方向と角度が異なる2ポンプ配置で照らされた線形光学キャビティ内の超低温ボソニック原子の凝縮について考察する。このような構成は, 1次元量子光学格子配置からキャビティ-原子相互作用によって誘導される2次元量子光学格子配置への滑らかな遷移を可能にする。ホルシュタイン・プリマコフ変換を用いて、超放射相におけるそのような自己組織基底状態の原子密度プロファイルを、そのような動的量子光学格子におけるポンプの角方向の関数として発見し、座標空間と運動量空間におけるそれらの構造解析を提供する。論文の後半部では、このような量子光学格子ポテンシャルにおけるbose-hubbardモデルの拡張の観点から、対応する結果が定性的にも理解できることを示す。 We consider a condensate of ultra cold bosonic atoms in a linear optical cavity illuminated by a two-pump configuration where each pump is making different angles with the direction of the cavity axis. We show such configuration allows a smooth transition from a one-dimensional quantum optical lattice configuration to a two-dimensional quantum optical lattice configuration induced by the cavity-atom interaction. Using a Holstein-Primakoff transformation, we find out the atomic density profile of such self-organised ground state in the super-radiant phase as a function of the angular orientations of the pump in such dynamical quantum optical lattice, and, also provide an analysis of their structures in coordinate and momentum space. In the later part of the paper, we show how the corresponding results can also be qualitatively understood in terms of an Extended Bose-Hubbard model in such quantum optical lattice potential.	翻訳日:2023-02-10 01:35:16 公開日:2023-01-06
# 一般化確率論:テンソル積問題 Generalized possibilistic Theories: the tensor product problem ( http://arxiv.org/abs/2207.09905v2 ) ライセンス: Link先を確認	Eric Buffenoir (INPHYNI)	(参考訳) 演算量子論理プログラムに触発されて,量子力学の再構成プログラムにおいても,確率を導出概念とみなすことができるという主張が得られた。本稿では,確率が3値(ポシビリスティック)意味領域に属する反事実文に置き換えられる物理理論の操作的記述を提案する。状態空間と効果空間は、Chu 3 空間を通して双対性に置かれるポーズとして構築される。状態と効果の空間上の凸性要件は、基本的に一般化確率論で扱われ、これらの空間上の半格子構造に置き換えられる。純粋な状態は、状態全体の空間を生成する完全既約要素として容易に構築される。理論のチャネル(つまり対称性)は自然にChu準同型として構築される。公理論は「一般化ポシビリスティック理論」と呼ばれるものに対して、この状態/効果がチュ空間の圏(英語版)(chu space's category)に基づいて要約することができる。両部実験の問題点は,本論文の主な技術として扱われる。このとき、状態空間のテンソル積に対する公理が与えられ、解が明示的に構成される。次に、このテンソル積と数学文献に存在する半格子のテンソル積との関係/差分を解析する。半格子のテンソル積に対するこの新しい提案は、この研究の興味深い副産物と見なすことができる。 Inspired by the operational quantum logic program, we have the contention that probabilities can be viewed as a derived concept, even in a reconstruction program of Quantum Mechanics. We propose an operational description of physical theories where probabilities are replaced by counterfactual statements belonging to a three-valued (i.e. possibilistic) semantic domain. The space of states and the space of effects are then built as posets put in duality through a Chu 3 space. The convexity requirements on the spaces of states and effects, addressed basically in Generalized Probabilistic Theories, are then replaced by semi-lattice structures on these spaces. The pure states are also easily constructed as completely meet-irreducible elements which generate the whole space of states. The channels (i.e. symmetries) of the theory are then naturally built as Chu morphisms. An axiomatic can then be summarized for what can be called ''Generalized possibilistic Theory'' based on this States/Effects Chu space's category. The problem of bipartite experiment is then addressed as the main skill of this paper. An axiomatic for the tensor product of the space of states is then given and a solution is explicitly constructed. 
The relations/differences between this tensor product and the tensor product of semi-lattices present in the mathematical literature are then analyzed. This new proposal for the tensor product of semi-lattices can be considered as an interesting byproduct of this work.	翻訳日:2023-02-04 12:43:27 公開日:2023-01-06
# 曲面グラフェン超格子におけるスピン依存伝達 Spin-dependent transmission in curved graphene superlattice ( http://arxiv.org/abs/2208.02220v2 ) ライセンス: Link先を確認	Jaouad El-hassouny, Ahmed Jellal, El Houssine Atmani	(参考訳) 4つの領域から構成されるN$セルの曲面グラフェン超格子におけるスピン依存透過について検討した。 1つ目はconcaveで、3つ目はconvexで、平らなグラフェンシートから距離$d$で隔てられた2つの円の弧である。トンネル解析により、システムに関連するすべての伝送路と反射路を決定できる。その結果、細胞数が同じスピンで伝達を減少させることで作用することが示された。我々は,$d$と$N$が十分に大きいとき,固体スピンフィルター効果を予測する。最後に、同一のスピンがエネルギー範囲を超えた伝送の抑制の程度と持続時間が$d$で制御可能であると判定する。 We investigate spin-dependent transmission in a curved graphene superlattice of $N$ cells where each one is made up of four regions. The first is concave, and the third is convex, two arcs of circles separated by a distance $d$ from flat graphene sheets. The tunneling analysis allows us to determine all transmission and reflection channels associated with our system. As a result, we show that the number of cells acts by decreasing the transmissions with the same spin. We predict a solid spin-filtering effect when $d$ and $N$ are sufficiently large. Finally, it is determined that the degree and duration of suppression of the transmissions with the same spin over a range of energy are controllable using $d$.	翻訳日:2023-02-02 09:56:54 公開日:2023-01-06
# 2次元ディラック・ワイルフェルミオンのラシュバ寄与:通常の量子レジームを超えて Rashba contribution of 2D Dirac-Weyl fermions: Beyond ordinary quantum regime ( http://arxiv.org/abs/2208.07661v2 ) ライセンス: Link先を確認	Ahmed Jellal, Dariush Jahani, Omid Akhavan	(参考訳) グラフェン中のディラック・ワイルフェルミオンのエネルギー準位をラシュバが最小長条件に寄与する磁場下で検討した。 2+1)次元の磁気モーメントに結合したdirac様電荷キャリアのエネルギー分散の正確な解は、運動量空間表現を用いて得られる。さらに、2次元ディラック様準粒子の応用に関しては、我々の理論と結果をいくつかの特別なケースで拡張し、高磁場限界における新興エネルギースペクトルはラシュバカップリング、$\lambda_{r}$、およびランダウ準位のバンド指数とは独立になることを示した。 We study the energy levels of Dirac-Weyl fermions in graphene subject to a magnetic field with Rashba contribution in the minimal length situation. The exact solution for the energy dispersion of Dirac-like charge carriers coupled to the magnetic moments in a (2+1)-dimension is obtained by the use of the momentum space representation. Moreover, as it comes to applications for 2D Dirac-like quasiparticles, we also extend our theory and results in some special cases, showing that the emerging energy spectrum at the high magnetic field limit becomes independent of the Rashba coupling, $\lambda_{R}$, and the band index of Landau levels.	翻訳日:2023-01-30 22:53:05 公開日:2023-01-06
# 準周期ポテンシャルと定常ホッピング振幅を持つ周期駆動モデル:移動ギャップと多フラクタル状態の工学 Periodically driven model with quasiperiodic potential and staggered hopping amplitudes: engineering of mobility gaps and multifractal states ( http://arxiv.org/abs/2208.10853v2 ) ライセンス: Link先を確認	Sreemayee Aditya, K. Sengupta, Diptiman Sen	(参考訳) 準周期ポテンシャルを持つモデルの周期的駆動が静的モデルに相反しない興味深いフロケット位相を生成できるかどうかを考察する。具体的には、オンサイト準周期ポテンシャル $v_0$ を持つ1次元の時間独立モデルであるオーブリー=アンドロ=eモデルと、スタッガー形式をとる最近傍ホッピング振幅を考える。周波数$\omega$で周期的に変化する均一なホッピング振幅を加える。 2つの位相しか持たない単純な位相図を持つ静的Aubry-Andr\'eモデルとは異なり、駆動モデルは、拡張状態のみを持つ位相、異なる準エネルギーバンドを分離する複数のモビリティギャップを持つ位相、共存する多重フラクタル状態と局所状態のみを持つ混合位相、そして局所状態のみを持つ位相の4つの相を持つ。マルチフラクタル状態は、拡張状態と局所状態の両方の値とは異なる指数でシステムサイズとスケールする逆参加比を一般化した。さらに、$\omega$ と $V_0$ が変化するとき、異なる種類の状態間の複雑な再帰遷移を観察する。高周波および大きな駆動振幅の限界において、Floquet準エネルギーは非駆動系のエネルギーと一致するが、Floquet固有状態ははるかに拡張されている。また、1粒子の波動パケットの拡散について検討し、常に弾道的であるが、弾道速度はシステムパラメータによって大きく変化し、静的モデルでは発生しない$V_0$に対する非単調な依存を示すことがある。準周期ポテンシャルと駆動の相互作用は、静的モデルには現れないリッチな位相図を生成すると結論づける。 We study if periodic driving of a model with a quasiperiodic potential can generate interesting Floquet phases which have no counterparts in the static model. Specifically, we consider the Aubry-Andr\'e model which is a one-dimensional time-independent model with an on-site quasiperiodic potential $V_0$ and a nearest-neighbor hopping amplitude which is taken to have a staggered form. We add a uniform hopping amplitude which varies periodically in time with a frequency $\omega$. Unlike the static Aubry-Andr\'e model which has a simple phase diagram with only two phases (only extended or only localized states), we find that the driven model has four possible phases: a phase with only extended states, a phase with multiple mobility gaps separating different quasienergy bands, a mixed phase with coexisting extended, multifractal, and localized states, and a phase with only localized states. 
The multifractal states have generalized inverse participation ratios which scale with the system size with exponents which are different from the values for both extended and localized states. In addition, we observe intricate re-entrant transitions between the different kinds of states when $\omega$ and $V_0$ are varied. In the limit of high frequency and large driving amplitude, we find that the Floquet quasienergies match the energies of the undriven system, but the Floquet eigenstates are much more extended. We also study the spreading of a one-particle wave packet and find that it is always ballistic but the ballistic velocity varies significantly with the system parameters, sometimes showing a non-monotonic dependence on $V_0$ which does not occur in the static model. We conclude that the interplay of quasiperiodic potential and driving produces a rich phase diagram which does not appear in the static model.	翻訳日:2023-01-30 02:26:58 公開日:2023-01-06
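The scaling argument in this abstract can be made concrete with a small sketch. It assumes the common definition of the generalized inverse participation ratio, $\sum_j |\psi_j|^{2q}$ for a normalized state, and a power-law fit $\mathrm{IPR} \sim L^{-\eta}$ between two system sizes; the function names are hypothetical and this is not the authors' Floquet code.

```python
import math


def inverse_participation_ratio(amplitudes, q=2):
    """Generalized IPR: sum_j |psi_j|^(2q), with the state normalized first."""
    weights = [abs(a) ** 2 for a in amplitudes]
    norm = sum(weights)
    return sum((w / norm) ** q for w in weights)


def scaling_exponent(ipr_small, size_small, ipr_large, size_large):
    """Estimate eta in IPR ~ L^(-eta) from two system sizes."""
    return -math.log(ipr_large / ipr_small) / math.log(size_large / size_small)


# Fully extended states (uniform weight on every site) at two sizes.
extended_100 = inverse_participation_ratio([1.0] * 100)
extended_1000 = inverse_participation_ratio([1.0] * 1000)

# A state localized on a single site, at the same two sizes.
localized_100 = inverse_participation_ratio([1.0] + [0.0] * 99)
localized_1000 = inverse_participation_ratio([1.0] + [0.0] * 999)

eta_extended = scaling_exponent(extended_100, 100, extended_1000, 1000)
eta_localized = scaling_exponent(localized_100, 100, localized_1000, 1000)
```

For $q=2$ the exponent is 1 for the extended states and 0 for the localized ones; a multifractal state would give a value strictly between the two.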
# AI行動の記述による人間とAIのコラボレーションの改善 Improving Human-AI Collaboration With Descriptions of AI Behavior ( http://arxiv.org/abs/2301.06937v1 ) ライセンス: Link先を確認	\'Angel Alexander Cabrera, Adam Perer, Jason I. Hong	(参考訳) 人々はAIシステムを使って意思決定を改善するが、しばしばAIの予測を過度に、あるいは過度に予測し、手伝わなかったよりも悪いパフォーマンスをする。人々がAIアシスタントを適切に頼りにするために、動作記述、AIシステムがインスタンスのサブグループでどのように機能するかの詳細を示すことを提案する。我々は,フェイクレビュー検出,衛星画像分類,鳥の分類という3つの異なるドメインの225名を対象に,行動記述の有効性をユーザ調査により検証した。行動記述は、AIの失敗を識別し、より正確な場合にAIへの信頼を高める2つのメカニズムを通じて、人間とAIの精度を高めることができることがわかった。これらの知見は、人間とAIのコラボレーションにおける人々のメンタルモデルの重要性を強調し、ハイレベルなAI行動の人々に通知することで、AI支援による意思決定を大幅に改善できることを示した。 People work with AI systems to improve their decision making, but often under- or over-rely on AI predictions and perform worse than they would have unassisted. To help people appropriately rely on AI aids, we propose showing them behavior descriptions, details of how AI systems perform on subgroups of instances. We tested the efficacy of behavior descriptions through user studies with 225 participants in three distinct domains: fake review detection, satellite image classification, and bird classification. We found that behavior descriptions can increase human-AI accuracy through two mechanisms: helping people identify AI failures and increasing people's reliance on the AI when it is more accurate. These findings highlight the importance of people's mental models in human-AI collaboration and show that informing people of high-level AI behaviors can significantly improve AI-assisted decision making.	翻訳日:2023-01-29 14:06:44 公開日:2023-01-06
# AdaEnsemble: クリックスルーレート予測のための適応スパース構造型アンサンブルネットワークの学習 AdaEnsemble: Learning Adaptively Sparse Structured Ensemble Network for Click-Through Rate Prediction ( http://arxiv.org/abs/2301.08353v1 ) ライセンス: Link先を確認	YaChen Yan, Liubo Li	(参考訳) 機能相互作用の学習は、推薦システムや広告ランキングにおける大規模CTR予測の成功に不可欠である。研究者と実践者は、機能相互作用の探索とモデリングのための様々なニューラルネットワークアーキテクチャを幅広く提案した。しかし、異なるデータセットが異なるニューラルネットワークアーキテクチャや特徴相互作用タイプを好んでおり、異なる特徴相互作用学習手法には独自の利点があることが示唆されている。 AdaEnsemble: AdaEnsemble: Sparsely-Gated Mixture-of-Experts (SparseMoE)アーキテクチャは、異種機能相互作用の専門家の強みを生かし、各例のエキスパートの疎結合へのルーティングを適応的に学習することで、異なるタイプの機能相互作用の動的階層を構築することができる。予測精度と推論効率をさらに向上するため,機能間相互作用深度選択のための動的早期退避機構を組み込んだ。 AdaEnsembleは、機能相互作用の深さを適応的に選択し、対応するSparseMoEスタック層を見つけて、予測を終了し、計算することができる。そこで,提案アーキテクチャは,SparseMoE層内の疎ゲート専門家の指数的組み合わせの利点を継承し,さらにより深い層を実行することなく最適な特徴相互作用深さを動的に選択する。提案したAdaEnsembleを実装し,実世界のデータセット上での性能を評価する。 AdaEnsembleの最先端モデルに対する有効性と有効性を示す実験結果である。 Learning feature interactions is crucial to success for large-scale CTR prediction in recommender systems and Ads ranking. Researchers and practitioners extensively proposed various neural network architectures for searching and modeling feature interactions. However, we observe that different datasets favor different neural network architectures and feature interaction types, suggesting that different feature interaction learning methods may have their own unique advantages. Inspired by this observation, we propose AdaEnsemble: a Sparsely-Gated Mixture-of-Experts (SparseMoE) architecture that can leverage the strengths of heterogeneous feature interaction experts and adaptively learns the routing to a sparse combination of experts for each example, allowing us to build a dynamic hierarchy of the feature interactions of different types and orders. To further improve the prediction accuracy and inference efficiency, we incorporate the dynamic early exiting mechanism for feature interaction depth selection. 
The AdaEnsemble can adaptively choose the feature interaction depth and find the corresponding SparseMoE stacking layer to exit and compute prediction from. Therefore, our proposed architecture inherits the advantages of the exponential combinations of sparsely gated experts within SparseMoE layers and further dynamically selects the optimal feature interaction depth without executing deeper layers. We implement the proposed AdaEnsemble and evaluate its performance on real-world datasets. Extensive experiment results demonstrate the efficiency and effectiveness of AdaEnsemble over state-of-the-art models.	翻訳日:2023-01-29 13:48:36 公開日:2023-01-06
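The routing idea behind a sparsely gated mixture of experts can be sketched in a few lines. This is generic top-k gating under simplifying assumptions (scalar expert outputs, fixed gate logits, no training, no early exit), not the AdaEnsemble architecture itself; `top_k_gate` and `mixture_output` are hypothetical names.

```python
import math


def top_k_gate(logits, k):
    """Keep the k largest gate logits, apply a softmax over them, zero the rest."""
    chosen = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)[:k]
    peak = max(logits[i] for i in chosen)  # subtract the max for numerical stability
    exps = {i: math.exp(logits[i] - peak) for i in chosen}
    total = sum(exps.values())
    return [exps[i] / total if i in exps else 0.0 for i in range(len(logits))]


def mixture_output(expert_outputs, gates):
    """Weighted sum of expert outputs; experts with a zero gate are skipped."""
    return sum(g * out for g, out in zip(gates, expert_outputs) if g > 0.0)


# Four experts, two of them selected for this example.
gates = top_k_gate([2.0, 0.5, 2.0, -1.0], k=2)
prediction = mixture_output([10.0, 99.0, 20.0, 99.0], gates)
```

Because unselected experts receive a gate of exactly zero, their outputs never need to be computed in a real system, which is where the inference savings of sparse routing come from.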
# 機械学習によるコビビリティに向けた遷移経路の発見 Discovering Transition Pathways Towards Coviability with Machine Learning ( http://arxiv.org/abs/2301.10023v1 ) ライセンス: Link先を確認	Laure Berti-Equille and Rafael L. G. Raimundo	(参考訳) 共生性(coviability)とは、人間と自然が機能的で公平で永続的な方法で共存できる、複数の社会生態学的配置とガバナンス構造を指す。環境劣化と社会的に脆弱な領域において、共生可能な状態に移行することは困難である。本稿では,ブラジル北東部の地域住民が採用・実施できるコビビリティ・パスを発見するために,機械学習,アグロエコロジー,社会科学を組み合わせたフランス・ブラジル共同研究プロジェクトについて述べる。 Coviability refers to the multiple socio-ecological arrangements and governance structures under which humans and nature can coexist in functional, fair, and persistent ways. Transitioning to a coviable state in environmentally degraded and socially vulnerable territories is challenging. This paper presents an ongoing French-Brazilian joint research project combining machine learning, agroecology, and social sciences to discover coviability pathways that can be adopted and implemented by local populations in the North-East region of Brazil.	翻訳日:2023-01-29 13:41:12 公開日:2023-01-06
# 財政責任:自動意思決定における公共信頼の実現 Fiduciary Responsibility: Facilitating Public Trust in Automated Decision Making ( http://arxiv.org/abs/2301.10001v1 ) ライセンス: Link先を確認	Shannon B. Harper and Eric S. Weber	(参考訳) 自動意思決定システムは、さまざまな肯定的かつ否定的な方法で、ますます普及し、大衆に影響を与える。政府や民間機関はこれらのシステムを使用して、社会問題や組織的課題に対処するために、特定の人間によって規定されたルールに従って情報を処理する。研究と実世界の経験は、公衆が自動意思決定システムとそれらを展開する機関への信頼を欠いていることを示している。帰納定理(recreancy theorem)は、行政機関が行政責任を負うならば、国民は自動意思決定システムによってなされた決定や影響を信頼し、支援する可能性が高いと主張している。しかし、一般にはこれらのシステムがどのように機能しているかを知らされず、結果として組織的な決定が行われることが多い。自動意思決定システムによる‘ブラックボックス’の効果は、完全性と信頼性に対する大衆の認識を減少させる。その結果、公共の商品や利益の喪失に伴う不公平さやコストを特定し、挑戦し、修正する能力を失うことになる。現在のポジションペーパーでは、自動意思決定システムにおける義務の役割を定義し説明する。本稿では、データサイエンスライフサイクル(DSL)として自動意思決定システムを定式化し、DSLのコンテキスト内での業務責任の影響について検討する。 DSLにおける財政的な責任は、自動意思決定システムに対する国民の信頼の欠如に対処するための方法論を提供する。我々は,DSL の複数の文脈において,ファデューシャルな責任が顕在化し,それぞれが自身の不信源の緩和を必要とすることを仮定する。受託者の責任を立証するために、ロサンゼルス警察(lapd)の予測警察ケーススタディを調査した。 Automated decision-making systems are being increasingly deployed and affect the public in a multitude of positive and negative ways. Governmental and private institutions use these systems to process information according to certain human-devised rules in order to address social problems or organizational challenges. Both research and real-world experience indicate that the public lacks trust in automated decision-making systems and the institutions that deploy them. The recreancy theorem argues that the public is more likely to trust and support decisions made or influenced by automated decision-making systems if the institutions that administer them meet their fiduciary responsibility. However, often the public is never informed of how these systems operate and resultant institutional decisions are made. A ``black box'' effect of automated decision-making systems reduces the public's perceptions of integrity and trustworthiness. 
The result is that the public loses the capacity to identify, challenge, and rectify unfairness or the costs associated with the loss of public goods or benefits. The current position paper defines and explains the role of fiduciary responsibility within an automated decision-making system. We formulate an automated decision-making system as a data science lifecycle (DSL) and examine the implications of fiduciary responsibility within the context of the DSL. Fiduciary responsibility within DSLs provides a methodology for addressing the public's lack of trust in automated decision-making systems and the institutions that employ them to make decisions affecting the public. We posit that fiduciary responsibility manifests in several contexts of a DSL, each of which requires its own mitigation of sources of mistrust. To instantiate fiduciary responsibility, a Los Angeles Police Department (LAPD) predictive policing case study is examined.	翻訳日:2023-01-29 13:40:15 公開日:2023-01-06
# Contra Bellum: 言語の混乱としてのベルの定理 Contra Bellum: Bell's theorem as a confusion of languages ( http://arxiv.org/abs/2301.10727v1 ) ライセンス: Link先を確認	Marek Czachor (Politechnika Gdańska)	(参考訳) ベルの定理は、数学モデルの無限階層の中で定式化された数学的予測の矛盾である。レベル$k\in\mathbb{Z}$で定式化された不等式は、レベル$k+1$の確率によって破られる。我々は、$k=0$が古典世界に対応すると考える傾向があるが、量子世界は$k=1$である。しかし、$k=0$の不等式は$k=1$確率で破られるので、$k=1$不等式は$k=2$確率で破られ、$k=-1$不等式は$k=0$確率で破られる。ベルの定理の論理を受け入れて、何も存在しないことを帰納的に証明できるだろうか。 Bell's theorem is a conflict of mathematical predictions formulated within an infinite hierarchy of mathematical models. Inequalities formulated at level $k\in\mathbb{Z}$ are violated by probabilities at level $k+1$. We are inclined to think that $k=0$ corresponds to the classical world, while the quantum one is $k=1$. However, as the $k=0$ inequalities are violated by $k=1$ probabilities, the same relation holds between $k=1$ inequalities violated by $k=2$ probabilities, $k=-1$ inequalities, violated by $k=0$ probabilities, and so forth. Accepting the logic of the Bell theorem, can we prove by induction that nothing exists?	翻訳日:2023-01-29 13:12:00 公開日:2023-01-06
# 2d-block geminals:計算複雑性を低減した非1-orthogonalおよび非0-seniorityモデル 2D-Block Geminals: a non 1-orthogonal and non 0-seniority model with reduced computational complexity ( http://arxiv.org/abs/2209.00834v4 ) ライセンス: Link先を確認	Patrick Cassam-Chenaï (JAD), Thomas Perez (JAD), Davide Accomasso	(参考訳) ここでは、geminal 関数は強直交やセニオリティ 0 に制約されない新しいgeminal product wave function ansatzを提案する。代わりに、電子の区別不能性を犠牲にすることなく、計算労力を大幅に下げるジェミナル間のより弱い直交性制約を導入する。つまり、geminal に対応する電子対は完全には区別できないし、その積はパウリの原理に従って反対称化されて bona fide な電子波関数を形成しなければならない。最も単純な非自明なモデルでは、解の集合はブロック対角行列によって与えられ、各ブロックはサイズ 2x2 であり、最適化される複素パラメータで乗算されるパウリ行列または正規化された対角行列からなる。この単純化されたgeminalsのアンサッツにより、量子可観測体の行列要素の計算における項数は大幅に減少する。原理の証明が報告され、アンザッツが計算コストを手頃に保ちながら、強直交ジェミナル積よりも正確であることを確認する。 We present a new geminal product wave function ansatz where the geminals are not constrained to be strongly orthogonal nor to be of seniority zero. Instead, we introduce weaker orthogonality constraints between geminals which significantly lower the computational effort, without sacrificing the indistinguishability of the electrons. That is to say, the electron pairs corresponding to the geminals are not fully distinguishable, and their product has still to be antisymmetrized according to the Pauli principle to form a bona fide electronic wave function. Our geometrical constraints translate into simple equations involving the traces of products of our geminal matrices. In the simplest non-trivial model, a set of solutions is given by block-diagonal matrices where each block is of size 2x2 and consists of either a Pauli matrix or a normalized diagonal matrix, multiplied by a complex parameter to be optimized. With this simplified ansatz for geminals, the number of terms in the calculation of the matrix elements of quantum observables is considerably reduced. A proof of principle is reported and confirms that the ansatz is more accurate than strongly orthogonal geminal products while remaining computationally affordable.	翻訳日:2023-01-28 04:09:21 公開日:2023-01-06
# 線形光学超放射とサブ放射の光学的解釈 Optical interpretation of linear-optics superradiance and subradiance ( http://arxiv.org/abs/2209.00918v2 ) ライセンス: Link先を確認	S. Asselie and A. Cipris and W. Guerin	(参考訳) 超放射と準放射は通常、電磁場が原子間の効果的な相互作用のみを提供する「原子図」であるディック集合状態の枠組みで記述される。本稿では,原子媒質中の光の伝播と散乱について,複雑な感受性と散乱性を提供する相補的図式について述べる。この「オプティカル・ピクチャー」は乱れたサンプルの線形光学系で有効であり、単純な教科書式から感受性と散乱断面積を計算できる場合、主に低密度で関係している。この図では、超放射能は効果的な屈折率で装った単一散乱事象による分散効果であるが、サブ放射能は多重散乱によるものである。解釈を裏付ける数値データと実験データを提示する。 Super- and subradiance are usually described in the framework of Dicke collective states, which is an ``atomic picture'' in which the electromagnetic field only provides an effective interaction between the atoms. Here, we discuss a complementary picture, in which we describe the propagation and scattering of light in the atomic medium, which provides a complex susceptibility and scatterers. This ``optical picture'' is valid in the linear-optics regime for disordered samples and is mainly relevant at low density, when the susceptibility and scattering cross-section can be computed from simple textbook formulas. In this picture, superradiance is a dispersion effect due to a single scattering event dressed by an effective refractive index, whereas subradiance is due to multiple scattering. We present numerical and experimental data supporting our interpretation.	翻訳日:2023-01-28 04:00:24 公開日:2023-01-06
# 誘導Rydberg格子気体の超輝度誘起性 Superradiance-induced multistability in driven Rydberg lattice gases ( http://arxiv.org/abs/2209.10366v2 ) ライセンス: Link先を確認	Yunhui He, Zhengyang Bai, Yuechun Jiao, Jianming Zhao, and Weibin Li	(参考訳) マイクロ波(MW)磁場で結合した1次元のリドベルク原子の定常状態相について検討し、高エネルギーのリドベルク原子は単体および集合(超ラジアント)崩壊によって低エネルギーに崩壊する。平均場アプローチを用いて,MW結合,状態内バンデルワールス(vdW)相互作用,およびRydberg状態間の単体・集団散逸について検討した。線形安定解析により、均一、反強磁性、発振、双安定および多安定相を含む一連の相が得られることが明らかになった。 vdW相互作用がなければ、一様相のみが見つかる。 vdW相互作用の存在下では、超ラジカル崩壊速度の強度を増加させると、多安定解が増大する。数値シミュレーションにより,双安定相と多安定相は長鎖の超放射によって安定化することが示された。均一相と多安定相の間の臨界点と原子番号によるスケーリングを求める。有限鎖のマスター方程式を数値的に解くことにより、平均場多安定相は、Rydberg 集団の期待値と異なる部位におけるRydberg 原子間の2体相関によって特徴づけられることを示す。 We study steady state phases of a one-dimensional array of Rydberg atoms coupled by a microwave (MW) field where the higher energy Rydberg state decays to the lower energy one via single-body and collective (superradiant) decay. Using mean-field approaches, we examine the interplay among the MW coupling, intra-state van der Waals (vdW) interaction, and single-body and collective dissipation between Rydberg states. A linear stability analysis reveals that a series of phases, including uniform, antiferromagnetic, oscillatory, and bistable and multistable phases can be obtained. Without the vdW interaction, only uniform phases are found. In the presence of the vdW interaction, multistable solutions are enhanced when increasing the strength of the superradiant decay rate. Our numerical simulations show that the bistable and multistable phases are stabilized by superradiance in a long chain. The critical point between the uniform and multistable phases and its scaling with the atom number is obtained. Through numerically solving the master equation of a finite chain, we show that the mean-field multistable phase could be characterized by expectation values of Rydberg populations and two-body correlations between Rydberg atoms in different sites.	翻訳日:2023-01-25 20:47:14 公開日:2023-01-06
# 臨界の双対性視点 Duality viewpoint of criticality ( http://arxiv.org/abs/2209.13450v4 ) ライセンス: Link先を確認	Linhao Li, Yuan Yao	(参考訳) 本研究では、異なる対称性保護位相(SPT)位相を連結する双対変換の下で自己双対な量子多体系について検討する。これらの自己双対モデルの臨界性の幾何学的説明を提供する。より正確には、周期境界条件下での基底状態(準退化)、すなわちバルクスペクトルの到達可能性を示す。同様に、双対性対称性を含む臨界点の対称性群は混合の't Hooft 異常を持つ。このアプローチは通常の0-形式対称性を持つ自己双対モデルのスペクトルを予測できるだけでなく、より高い形式やサブシステム対称性のような一般化対称性を持つモデルにも適用できる。アプリケーションとして、1次元と2次元のいくつかの例で結果を説明し、2つの異なるSPTを分離する。 In this work, we study quantum many-body systems which are self-dual under duality transformation connecting different symmetry protected topological (SPT) phases. We provide a geometric explanation of the criticality of these self-dual models. More precisely, we show a ground state (quasi-)degeneracy under the periodic boundary conditions,i.e., the ingappability of the bulk spectrum. Equivalently, the symmetry group at criticality, including the duality symmetry, has a mixed 't Hooft anomaly. This approach can not only predict the spectrum of the self-dual model with ordinary 0-form symmetry, but also be applied to that with generalized symmetry, such as higher form and subsystem symmetry. As an application, we illustrate our results with several examples in one and two dimensions, which separate two different SPTs.	翻訳日:2023-01-25 00:23:25 公開日:2023-01-06
# プログラム可能な時空間パラメトリックモードセンサ A Programmable Spatiotemporal Quantum Parametric Mode Sorter ( http://arxiv.org/abs/2210.16517v2 ) ライセンス: Link先を確認	Malvika Garikapati, Santosh Kumar, He Zhang, Yong Meng Sua, and Yu-Ping Huang	(参考訳) 我々は,モード選択型量子周波数アップコンバージョンによる複合時空間ヒルベルト空間における高次元信号のプログラム可能なパラメトリックモードソータを実験的に示す。具体的な例として、量子通信の応用を念頭に置いて、ラゲール・ガウシアンモードとエルミート・ガウシアンモードをそれぞれ信号の空間的および時間的基底と考える。アップコンバージョンポンプの時空間プロファイルを変調することにより、これらのモードにおける単一光子の忠実な選択と重ね合わせモードを示す。その結果,アップ変換光を単一モードファイバに結合し,位相マッチングのエッジでアップコンバージョンを操作することにより,量子モードソート性能が向上した。ポンプ時間プロファイルのみを最適化することにより、時空間モードの相互非バイアス基底(MUB)集合に対して12dB以上の絶滅を達成する。この完全にプログラム可能で効率的なシステムは、量子通信、量子計算、量子メトロロジーの有効なリソースとして機能する。 We experimentally demonstrate a programmable parametric mode sorter of high-dimensional signals in a composite spatiotemporal Hilbert space through mode-selective quantum frequency up-conversion. As a concrete example and with quantum communication applications in mind, we consider the Laguerre-Gaussian and Hermite-Gaussian modes as the spatial and temporal state basis for the signals, respectively. By modulating the spatiotemporal profiles of the up-conversion pump, we demonstrate the faithful selection of single photons in those modes and their superposition modes. Our results show an improvement in the quantum mode-sorting performance by coupling the up-converted light into a single-mode fiber and/or operating the upconversion at the edge of phase matching. By optimizing pump temporal profiles only, we achieve more than 12 dB extinction for mutually unbiased basis (MUB) sets of the spatiotemporal modes. This fully programmable and efficient system could serve as a viable resource for quantum communications, quantum computation, and quantum metrology.	翻訳日:2023-01-21 03:07:48 公開日:2023-01-06
# 離散力学系における非自明な最小固定点の探索 Finding Nontrivial Minimum Fixed Points in Discrete Dynamical Systems ( http://arxiv.org/abs/2301.04090v1 ) ライセンス: Link先を確認	Zirou Qiu, Chen Chen, Madhav V. Marathe, S. S. Ravi, Daniel J. Rosenkrantz, Richard E. Stearns, Anil Vullikanti	(参考訳) ネットワーク化された離散力学システムは、協調ゲームにおけるエージェントによる伝染と意思決定の拡散をモデル化するためにしばしば用いられる。このような力学系の固定点は、システムが収束する構成を表す。望ましくない感染(噂や誤報など)の拡散においては、少数の影響を受けるノードを持つ固定点への収束が望ましい目標である。このような考慮により、影響を受けるノード数が最小となるシステムの非自明な固定点を見つけるという、新しい最適化問題を定式化する。 P = NP でない限り、任意の定数 $\epsilon > 0$ に対して、この問題の解を係数 $n^{1-\epsilon}$ 以内に近似する多項式時間アルゴリズムは存在しない。この計算難易度に対処するため,この問題を効率的に解決できる特別な事例をいくつか挙げる。さらに,適切な大きさのネットワークに対する問題に対処する整数線形プログラムを提案する。大規模ネットワーク上での問題を解くために、貪欲選択法とともに一般的なヒューリスティックな枠組みを提案する。実世界のネットワークにおける広範囲な実験結果から,提案するヒューリスティックスの有効性が示された。 Networked discrete dynamical systems are often used to model the spread of contagions and decision-making by agents in coordination games. Fixed points of such dynamical systems represent configurations to which the system converges. In the dissemination of undesirable contagions (such as rumors and misinformation), convergence to fixed points with a small number of affected nodes is a desirable goal. Motivated by such considerations, we formulate a novel optimization problem of finding a nontrivial fixed point of the system with the minimum number of affected nodes. We establish that, unless P = NP, there is no polynomial time algorithm for approximating a solution to this problem to within the factor $n^{1-\epsilon}$ for any constant $\epsilon > 0$. To cope with this computational intractability, we identify several special cases for which the problem can be solved efficiently. Further, we introduce an integer linear program to address the problem for networks of reasonable sizes. For solving the problem on larger networks, we propose a general heuristic framework along with greedy selection methods. Extensive experimental results on real-world networks demonstrate the effectiveness of the proposed heuristics.	翻訳日:2023-01-11 17:21:41 公開日:2023-01-06
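To make the optimization problem in this abstract concrete, here is a brute-force sketch on a tiny network. It assumes a threshold update rule on closed neighborhoods (a node is affected in the next step exactly when at least `thresholds[node]` nodes among itself and its neighbors are affected now); that rule, the data layout, and the function names are assumptions for illustration, and the exhaustive search stands in for the paper's integer linear program and heuristics, which are not reproduced.

```python
from itertools import combinations


def is_fixed_point(adjacency, thresholds, active):
    """True if one synchronous update leaves the set of affected nodes unchanged."""
    for node, neighbours in adjacency.items():
        count = (node in active) + sum(1 for u in neighbours if u in active)
        next_state = count >= thresholds[node]
        if next_state != (node in active):
            return False
    return True


def minimum_nontrivial_fixed_point(adjacency, thresholds):
    """Smallest non-empty fixed point, found by trying subsets in order of size.

    Exponential time, so only usable on very small networks.
    """
    nodes = sorted(adjacency)
    for size in range(1, len(nodes) + 1):
        for subset in combinations(nodes, size):
            active = frozenset(subset)
            if is_fixed_point(adjacency, thresholds, active):
                return active
    return None


# A path 0 - 1 - 2 - 3 in which every node has threshold 2.
path = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
thresholds = {0: 2, 1: 2, 2: 2, 3: 2}
smallest = minimum_nontrivial_fixed_point(path, thresholds)
```

On this path no single affected node can sustain itself, while the pair `{0, 1}` keeps each other above threshold without pushing node 2 over its own, so the search returns that pair.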
# 拡散写像によるトポロジカルニューラルネットワーク量子状態の分類 Classifying topological neural network quantum states via diffusion maps ( http://arxiv.org/abs/2301.02683v1 ) ライセンス: Link先を確認	Yanting Teng, Subir Sachdev, Mathias S. Scheurer	(参考訳) 量子多体系におけるトポロジ的順序を検出するための教師なし機械学習手法を議論し、実証する。制限されたボルツマン機械を用いて低エネルギースペクトルの変分アンザッツを定義することで、確率が指数関数的に減衰する波動関数とその変動エネルギーをサンプリングし、拡散写像スキームの入力として使用するトレーニングデータセットを定義する。拡散写像は波動関数の低次元埋め込みを提供し、超選択セクターの存在や欠如を明らかにし、したがって位相順序を与える。拡散写像に対して,ネットワークパラメータを用いて量子状態の必要相似性測度を定義できることを示し,多項式時間での効率的な評価を可能にした。しかし、考えられる「ゲージ冗長性」を慎重に考慮する必要がある。明示的な例として、このメソッドを toric コードに適用します。 We discuss and demonstrate an unsupervised machine-learning procedure to detect topological order in quantum many-body systems. Using a restricted Boltzmann machine to define a variational ansatz for the low-energy spectrum, we sample wave functions with probability decaying exponentially with their variational energy; this defines our training dataset that we use as input to a diffusion map scheme. The diffusion map provides a low-dimensional embedding of the wave functions, revealing the presence or absence of superselection sectors and, thus, topological order. We show that for the diffusion map, the required similarity measure of quantum states can be defined in terms of the network parameters, allowing for an efficient evaluation within polynomial time. However, possible ''gauge redundancies'' have to be carefully taken into account. As an explicit example, we apply the method to the toric code.	翻訳日:2023-01-10 18:56:25 公開日:2023-01-06
# マルチモーダル歌詞-リズムマッチング Multimodal Lyrics-Rhythm Matching ( http://arxiv.org/abs/2301.02732v1 ) ライセンス: Link先を確認	Callie C. Liao, Duoduo Liao, Jesse Guessford	(参考訳) 最近の音楽の人工知能研究の増加にもかかわらず、キーワード、強調された音節、強いビートといった歌詞とリズムの主要成分の間の顕著な相関は、あまり研究されていない。これは、オーディオのずれ、音節識別の不正確さ、そして最も重要なのは、学際的な知識の必要性といった課題による可能性がある。このような研究の欠如に対処するため,本稿では,歌詞と音楽のキーコンポーネントを言語的制約なくマッチングする,新しいマルチモーダルな歌詞・リズムマッチング手法を提案する。私たちは、メタデータが容易に利用できる楽譜ではなく、オーディオを使用します。これにより課題は増えますが、手法の適用の柔軟性が高まります。さらに,音楽の強いビート,歌詞の音節,歌手の発音の聴覚的変化,特に歌詞キーワードなど,鍵となる歌詞要素と鍵となるリズム要素のマッチングに活用される様々なマルチモーダルなパターンを創造的に生成する。この有利なアプローチは、効率的なリズムベースのオーディオアライメントアルゴリズムを含む聴覚的歌詞とリズムの相関を研究するためのユニークな方法を提供するだけでなく、音楽や音楽認知と計算言語学を橋渡しする。実験の結果,平均で0.81の確率が一致し,約30%の楽曲が0.9以上のキーワードが強いビートに着地する確率を示し,そのうち12%が完璧に着地した。また、類似度指標を用いて、歌詞とリズムの相関性を評価する。楽曲の50%近くが0.70以上の類似性を持っている。結論として,本手法は洞察に富む相関関係を計算的に明らかにすることにより,歌詞とリズムの関係に大きく寄与する。 Despite the recent increase in research on artificial intelligence for music, prominent correlations between key components of lyrics and rhythm such as keywords, stressed syllables, and strong beats are not frequently studied. This is likely due to challenges such as audio misalignment, inaccuracies in syllabic identification, and most importantly, the need for cross-disciplinary knowledge. To address this lack of research, we propose a novel multimodal lyrics-rhythm matching approach in this paper that specifically matches key components of lyrics and music with each other without any language limitations. We use audio instead of sheet music with readily available metadata, which creates more challenges yet increases the application flexibility of our method. Furthermore, our approach creatively generates several patterns involving various multimodalities, including music strong beats, lyrical syllables, auditory changes in a singer's pronunciation, and especially lyrical keywords, which are utilized for matching key lyrical elements with key rhythmic elements. 
This advantageous approach not only provides a unique way to study auditory lyrics-rhythm correlations including efficient rhythm-based audio alignment algorithms, but also bridges computational linguistics with music as well as music cognition. Our experimental results reveal a 0.81 probability of matching on average, and around 30% of the songs have a probability of 0.9 or higher of keywords landing on strong beats, including 12% of the songs with a perfect landing. Similarity metrics are also used to evaluate the correlation between lyrics and rhythm; they show that nearly 50% of the songs have a similarity of 0.70 or higher. In conclusion, our approach contributes significantly to the lyrics-rhythm relationship by computationally unveiling insightful correlations.	翻訳日:2023-01-10 18:48:19 公開日:2023-01-06
# 文脈内終端自動音声認識における外部オフポリティ・スピーチ・トゥ・テキストマッピングの利用 Using External Off-Policy Speech-To-Text Mappings in Contextual End-To-End Automated Speech Recognition ( http://arxiv.org/abs/2301.02736v1 ) ライセンス: Link先を確認	David M. Chan, Shalini Ghosh, Ariya Rastrow, Björn Hoffmeister	(参考訳) 自動音声認識(ASR)モデルの一般化性能の改善にもかかわらず、ダウンストリームタスクのためのASRモデルの特殊化は、主にデータ可用性の低下(データ収集の増大)とデータ分布の急激なシフト(より頻繁なモデル微調整の要求)のために難しい課題である。本研究では,外部知識の活用の可能性,特にtext-to-speech法で生成されたオフポリシーのキーバリューストアを用いて,新しいデータ分布へのフレキシブルなトレーニング後適応を可能にする。提案手法では,text-to-speechから得た音声埋め込みとセマンティックテキストの埋め込みを併用して,近似k-nearest-neighbor (KNN) に基づく注意融合ステップを用いて,ASRに偏りを与える。 LibriSpeechと社内の音声アシスタント/検索データセットに関する実験では、提案手法はドメイン適応時間を最大1K GPU時間短縮できると同時に、微調整ベースラインと比較して最大3%のWER改善が得られることが示され、ゼロおよび少数ショットシナリオに挑戦して生産ASRシステムを適用するための有望なアプローチが示唆された。 Despite improvements to the generalization performance of automated speech recognition (ASR) models, specializing ASR models for downstream tasks remains a challenging task, primarily due to reduced data availability (necessitating increased data collection), and rapidly shifting data distributions (requiring more frequent model fine-tuning). In this work, we investigate the potential of leveraging external knowledge, particularly through off-policy key-value stores generated with text-to-speech methods, to allow for flexible post-training adaptation to new data distributions. In our approach, audio embeddings captured from text-to-speech, along with semantic text embeddings, are used to bias ASR via an approximate k-nearest-neighbor (KNN) based attentive fusion step. Our experiments on LibriSpeech and in-house voice assistant/search datasets show that the proposed approach can reduce domain adaptation time by up to 1K GPU-hours while providing up to 3% WER improvement compared to a fine-tuning baseline, suggesting a promising approach for adapting production ASR systems in challenging zero and few-shot scenarios.	翻訳日:2023-01-10 18:47:49 公開日:2023-01-06
# 化学空間上の仮説駆動型能動学習による分子の構造-親和関係の発見 Discovery of structure-property relations for molecules via hypothesis-driven active learning over the chemical space ( http://arxiv.org/abs/2301.02665v1 ) ライセンス: Link先を確認	Ayana Ghosh, Sergei V. Kalinin and Maxim A. Ziatdinov	(参考訳) 薬物標的、生体分子系、触媒、光電気、有機エレクトロニクス、電池の分子候補の発見は、望まれる機能性をターゲットとした化学空間の迅速な探索が可能な機械学習アルゴリズムの開発を必要とする。本稿では,仮説学習に基づく化学空間上のアクティブラーニングのための新しいアプローチを提案する。我々は、データの小さな部分集合に基づいて、興味の構造と機能の間の可能な関係に関する仮説を構築し、それをガウス過程の(確率的な)平均関数として導入する。このアプローチはSISSOやアクティブラーニングといったシンボリック回帰手法の要素をひとつのフレームワークに統合する。ここでは、qm9データセットについて実証するが、分子科学と固体材料科学の両方の分野のデータセットに広く適用することができる。 Discovery of the molecular candidates for applications in drug targets, biomolecular systems, catalysts, photovoltaics, organic electronics, and batteries, necessitates development of machine learning algorithms capable of rapid exploration of the chemical spaces targeting the desired functionalities. Here we introduce a novel approach for the active learning over the chemical spaces based on hypothesis learning. We construct the hypotheses on the possible relationships between structures and functionalities of interest based on a small subset of data and introduce them as (probabilistic) mean functions for the Gaussian process. This approach combines the elements from the symbolic regression methods such as SISSO and active learning into a single framework. Here, we demonstrate it for the QM9 dataset, but it can be applied more broadly to datasets from both domains of molecular and solid-state materials sciences.	翻訳日:2023-01-10 18:21:33 公開日:2023-01-06
# グロキングモジュラー算術 Grokking modular arithmetic ( http://arxiv.org/abs/2301.02679v1 ) ライセンス: Link先を確認	Andrey Gromov	(参考訳) モジュラー演算のタスクを学習し,「grokking」と呼ばれる一般化の急激な飛躍を示す,シンプルなニューラルネットワークを提案する。具体的には、以下を示す。 (i) 正則化を一切用いずに、MSE損失関数によるバニラ勾配降下の下で様々なモジュラー演算タスクをgrokkingする完全連結二層ネットワーク。 (ii) モジュラー算術のgrokkingが、タスクによって構造が決定される特定の特徴写像の学習に対応することの証拠。 (iii) 多種多様なモジュラー算術タスクを解決する重み(従って特徴写像)の解析式。 (iv) これらの特徴写像がバニラ勾配降下およびAdamWによっても見出されることの証拠。これにより、ネットワークによって学習された表現の完全な解釈可能性を確立する。 We present a simple neural network that can learn modular arithmetic tasks and exhibits a sudden jump in generalization known as "grokking". Concretely, we present (i) fully-connected two-layer networks that exhibit grokking on various modular arithmetic tasks under vanilla gradient descent with the MSE loss function in the absence of any regularization; (ii) evidence that grokking modular arithmetic corresponds to learning specific feature maps whose structure is determined by the task; (iii) analytic expressions for the weights -- and thus for the feature maps -- that solve a large class of modular arithmetic tasks; and (iv) evidence that these feature maps are also found by vanilla gradient descent as well as AdamW, thereby establishing complete interpretability of the representations learnt by the network.	翻訳日:2023-01-10 18:21:19 公開日:2023-01-06
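To make the task setup concrete, here is a minimal sketch of a modular-addition dataset of the kind grokking experiments are typically run on. The modulus `p`, the 50/50 train/test split, the one-hot encoding, and the function names are illustrative assumptions; the abstract does not specify them.

```python
import random


def modular_addition_dataset(p, train_fraction=0.5, seed=0):
    """Build all (a, b) -> (a + b) mod p examples and split them at random."""
    pairs = [(a, b) for a in range(p) for b in range(p)]
    random.Random(seed).shuffle(pairs)  # deterministic shuffle for a given seed
    examples = [((a, b), (a + b) % p) for a, b in pairs]
    cut = int(train_fraction * len(examples))
    return examples[:cut], examples[cut:]


def one_hot(i, p):
    """Encode residue i as a length-p indicator vector (an assumed input format)."""
    return [1.0 if j == i else 0.0 for j in range(p)]


train, test = modular_addition_dataset(7)
print(len(train), len(test))
```

Grokking would then show up as test accuracy on the held-out pairs jumping long after the training pairs have been fit; this sketch only prepares the data and does not train a network.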
# 並列分散型大規模ディープラーニング学習システム Systems for Parallel and Distributed Large-Model Deep Learning Training ( http://arxiv.org/abs/2301.02691v1 ) ライセンス: Link先を確認	Kabir Nagrecha	(参考訳) ディープラーニング(DL)は、コンピュータビジョン、自然言語処理、表形式のデータ分析など、さまざまな分野のアプリケーションを変換している。 dlモデルの精度向上の追求は、数十億の学習可能なパラメータにまたがる最近のトランスフォーマーモデルによって、ますます大きなニューラルネットワークアーキテクチャを探求するようになった。これらの設計は、メモリボトルネック、ランタイム効率の低下、モデル開発における高コストなど、DL空間に新たなスケール駆動システム課題を導入している。これらの問題に対処する努力は、ニューラルアーキテクチャの並列化、メモリ階層にまたがるデータの流出、メモリ効率のよいデータ表現といったテクニックを探求してきた。この調査では、大規模なモデルトレーニングシステムの展望を探求し、主要な課題とそれに対応する様々なテクニックを強調します。 Deep learning (DL) has transformed applications in a variety of domains, including computer vision, natural language processing, and tabular data analysis. The search for improved DL model accuracy has led practitioners to explore increasingly large neural architectures, with some recent Transformer models spanning hundreds of billions of learnable parameters. These designs have introduced new scale-driven systems challenges for the DL space, such as memory bottlenecks, poor runtime efficiency, and high costs of model development. Efforts to address these issues have explored techniques such as parallelization of neural architectures, spilling data across the memory hierarchy, and memory-efficient data representations. This survey will explore the large-model training systems landscape, highlighting key challenges and the various techniques that have been used to address them.	翻訳日:2023-01-10 18:21:05 公開日:2023-01-06
# エンコーダ・デコーダ言語モデルによるペアリング抗体配列の条件付き生成 Conditional Generation of Paired Antibody Chain Sequences through Encoder-Decoder Language Model ( http://arxiv.org/abs/2301.02748v1 ) ライセンス: Link先を確認	Simon K.S. Chu, Kathy Y. Wei	(参考訳) タンパク質言語モデル(lms)は、シーケンス、構造、機能予測に成功している。しかし、現在、タンパク質 LM は単一配列のエンコーダまたはデコーダのみのアーキテクチャに制限されている。ここでは, 抗体鎖ペアリングをT5アーキテクチャを用いて前方および後方翻訳としてモデル化したpAbT5を紹介する。 pAbT5は、配列生成と不一致を、教師なしおよび教師なしの分類として正確に反映していることを示す。我々のタンパク質LMは可変長配列を生成し、その次単語予測確率は配列アライメントから位置特異的スコアリング行列と一致する。タンパク質 LM の他の研究と同様に、pAbT5 は実験測定において最先端の教師なし予測を行う。我々の知る限り、pAbT5はタンパク質-タンパク質相互作用のための最初のエンコーダ-デコーダタンパク質LMである。 Protein language models (LMs) have been successful in sequence, structural and functional predictions. However, currently, protein LMs are limited to encoder- or decoder-only architectures for single sequences while many biological contexts involve protein-protein interactions. Here, we introduce pAbT5, which models antibody chain pairing as forward- and back-translations using a T5-based architecture. We show that pAbT5 accurately reflects chain pairing through sequence generation and mispairing as unsupervised and supervised classifications. Our protein LM generates variable-length sequences and its next-word prediction probability agrees with position-specific scoring matrix from sequence alignment. Like other works in protein LM, pAbT5 performs state-of-the-art unsupervised prediction on experimental measurements. To the best of our knowledge, pAbT5 is the first encoder-decoder protein LM for protein-protein interactions.	翻訳日:2023-01-10 18:02:36 公開日:2023-01-06
# 3DAvatarGAN: パーソナライズされた編集可能なアバターのためのブリッジドメイン 3DAvatarGAN: Bridging Domains for Personalized Editable Avatars ( http://arxiv.org/abs/2301.02700v1 ) ライセンス: Link先を確認	Rameen Abdal, Hsin-Ying Lee, Peihao Zhu, Menglei Chai, Aliaksandr Siarohin, Peter Wonka, Sergey Tulyakov	(参考訳) 現代の3D-GANは、一貫した構造を持つ大規模データセットのトレーニングによって幾何学とテクスチャを合成する。このようなモデルを、しばしば未知の、高度に変動した幾何学とカメラ情報に基づくスタイル化された芸術データで訓練することは、まだ不可能である。マルチビューの一貫性とテクスチャの質を維持しながら、3D GANをそのような芸術的データでトレーニングできるだろうか? そこで本研究では,ソースドメインが事前訓練された3D-GANであり,ターゲットドメインが2D-GANである適応フレームワークを提案する。次に、2Dジェネレータからソース3Dジェネレータに知識を蒸留する。そこで我々はまず,ドメイン間のカメラパラメータの分布を調整する最適化手法を提案する。第二に,質の高いテクスチャを学習するために必要な規則化を提案し,平坦な形状などの幾何学的解の退化を回避した。第3に,芸術領域の誇張された幾何学をモデル化するための変形に基づく手法について述べる。最後に、ソースとターゲットドメインの潜在空間をリンクする3D-GANの新しい逆変換法を提案する。私たちのコントリビューションは、初めて、芸術データセット上でパーソナライズされた3Dアバターの生成、編集、アニメーションを可能にしました。 Modern 3D-GANs synthesize geometry and texture by training on large-scale datasets with a consistent structure. Training such models on stylized, artistic data, with often unknown, highly variable geometry, and camera information has not yet been shown possible. Can we train a 3D GAN on such artistic data, while maintaining multi-view consistency and texture quality? To this end, we propose an adaptation framework, where the source domain is a pre-trained 3D-GAN, while the target domain is a 2D-GAN trained on artistic datasets. We then distill the knowledge from a 2D generator to the source 3D generator. To do that, we first propose an optimization-based method to align the distributions of camera parameters across domains. Second, we propose regularizations necessary to learn high-quality texture, while avoiding degenerate geometric solutions, such as flat shapes. Third, we show a deformation-based technique for modeling exaggerated geometry of artistic domains, enabling -- as a byproduct -- personalized geometric editing. Finally, we propose a novel inversion method for 3D-GANs linking the latent spaces of the source and the target domains. 
Our contributions -- for the first time -- allow for the generation, editing, and animation of personalized artistic 3D avatars on artistic datasets.	翻訳日:2023-01-10 18:02:26 公開日:2023-01-06
# 胸部X線画像における深層学習に基づく新型コロナウイルス認識モデルの設計 : 知識蒸留アプローチ Designing an Improved Deep Learning-based Model for COVID-19 Recognition in Chest X-ray Images: A Knowledge Distillation Approach ( http://arxiv.org/abs/2301.02735v1 ) ライセンス: Link先を確認	AmirReza BabaAhmadi, Sahar Khalafi, Masoud ShariatPanahi, Moosa Ayati	(参考訳) 新型コロナウイルス(covid-19)は、異なる側面の人間や社会に悪影響を及ぼしている。新型コロナウイルスの診断が不正確で、適切な治療が不十分なため、多くの人が死亡した。世界中の研究者によって,手動・自動特徴抽出技術に基づく多数の解が研究されている。通常、自動特徴抽出法、特にディープラーニングモデルは、必要な計算を実行するために強力なハードウェアシステムを必要とする。残念なことに、多くの機関や社会は、高品質のハードウェア機器の高価さのために、これらの進歩から利益を得ることができない。その結果,本研究では, 組込みデバイス, モバイルデバイス, 従来のコンピュータ上でのモデル実行に伴う計算コストの低減, および, 医用認識タスクの性能と精度を確保するために, これまでに公表した手法(少なくとも最先端モデルと同等の性能)と比較して, モデルの性能を向上すること, の2つの目標に焦点をあてた。本研究では,VGG19とResNet50V2という2つのニューラルネットワークを用いて,データセットの特徴抽出を改善した。これらのネットワークはどちらも、指定されたデータセットからセマンティック機能を提供する。この目的のために、モバイルと組み込みデバイスで最小限の計算を必要としながらセマンティック機能を抽出するMobileNetV2という代替ネットワークが検討された。知識蒸留(KD)は、教師ネットワーク(統合ResNet50V2とVGG19)から学生ネットワーク(MobileNetV2)へ知識を伝達し、MobileNetV2の性能を改善し、胸部X線画像から新型コロナウイルス識別タスクの堅牢で正確なモデルを実現するために用いられた。 COVID-19 has adversely affected humans and societies in different aspects. Numerous people have perished due to inaccurate COVID-19 identification and, consequently, a lack of appropriate medical treatment. Numerous solutions based on manual and automatic feature extraction techniques have been investigated to address this issue by researchers worldwide. Typically, automatic feature extraction methods, particularly deep learning models, necessitate a powerful hardware system to perform the necessary computations. Unfortunately, many institutions and societies cannot benefit from these advancements due to the prohibitively high cost of high-quality hardware equipment. 
As a result, this study focused on two primary goals: first, lowering the computational costs associated with running the proposed model on embedded devices, mobile devices, and conventional computers; and second, improving the model's performance relative to previously published methods (performing at least on par with state-of-the-art models) in order to ensure its accuracy on the medical recognition task. This study used two neural networks to improve feature extraction from our dataset: VGG19 and ResNet50V2. Both of these networks are capable of providing semantic features from the nominated dataset. To this end, an alternative network was considered, namely MobileNetV2, which excels at extracting semantic features while requiring minimal computation on mobile and embedded devices. Knowledge distillation (KD) was used to transfer knowledge from the teacher network (concatenated ResNet50V2 and VGG19) to the student network (MobileNetV2) to improve MobileNetV2 performance and to achieve a robust and accurate model for the COVID-19 identification task from chest X-ray images.	翻訳日:2023-01-10 18:02:06 公開日:2023-01-06
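The teacher-to-student transfer described above can be sketched with a common soft-target distillation loss. This is an assumed formulation, not necessarily the one used in the paper: the temperature, the weighting `alpha`, and the combination of a KL term with a cross-entropy term are illustrative choices, and the logits stand in for outputs of the teacher (ResNet50V2 + VGG19) and the student (MobileNetV2).

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax with an optional temperature."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(student_logits, teacher_logits, label,
                      temperature=4.0, alpha=0.5):
    """alpha * T^2 * KL(teacher || student) + (1 - alpha) * cross-entropy."""
    p_teacher = softmax(teacher_logits, temperature)
    p_student_soft = softmax(student_logits, temperature)
    # KL divergence between the softened teacher and student distributions.
    kl = sum(pt * (math.log(pt) - math.log(ps))
             for pt, ps in zip(p_teacher, p_student_soft) if pt > 0)
    # Ordinary cross-entropy of the student against the hard label.
    ce = -math.log(softmax(student_logits)[label])
    return alpha * temperature ** 2 * kl + (1 - alpha) * ce


# Hypothetical logits for a 3-class problem; class index 2 is the true label.
print(distillation_loss([0.5, 1.0, 2.0], [1.0, 2.0, 3.0], label=2))
```

The `T^2` factor is the usual rescaling that keeps the soft-target term comparable in magnitude to the hard-label term as the temperature changes.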
# 量子測定におけるリンドブラジアン誘起アライメント Lindbladian-Induced Alignment in Quantum Measurements ( http://arxiv.org/abs/2301.02664v1 ) ライセンス: Link先を確認	Robert Englman and Asher Yahalom	(参考訳) システム・ポインター・ウェーブ・パッケージの初期値と観測可能な値が整列した値に対して、外積から系の密度行列の内積への遷移として、不明瞭な時間連続的な減少を確実にするリンドブレディアン形式の式が提案される。ジャンプ作用素は可観測性に基づいており、測定セットから一意に決定されたパラメータ(これはS. Weinbergの波束形式論のリンドブラディアン分解と異なる)とボルンの確率規則に従っている。この新しさは、周囲(測定装置を含む)の観察モードへの適応性を定式化することにある。したがって、遷移は有限期間である(フォン・ノイマンの定式化における即時性とは対照的である)。この期間は単純な半スピンモデルとして推定される。 An expression of the Lindbladian form is proposed that ensures an unambiguous time-continuous reduction of the initial system-pointer wave-packet to one in which the readings and the observable's values are aligned, formalized as the transition from an outer product to an inner product of the system's and apparatus' density matrices. The jump operators are in the basis of the observables, with uniquely determined parameters derived from the measurement set-up (thereby differing from S. Weinberg's Lindbladian resolution of wave-packet formalism) and conforming to Born's probability rules. The novelty lies in formalising the adaptability of the surroundings (including the measuring device) to the mode of observation. Accordingly, the transition is of finite duration (in contrast to its instantaneousness in the von Neumann's formulation). This duration is estimated for a simple half-spin-like model.	翻訳日:2023-01-10 17:46:26 公開日:2023-01-06
# 誤り緩和のための仮説検証:エラー緩和の評価方法 Hypothesis Testing for Error Mitigation: How to Evaluate Error Mitigation ( http://arxiv.org/abs/2301.02690v1 ) ライセンス: Link先を確認	Abdullah Ash Saki, Amara Katabarwa, Salonik Resch, George Umbrarescu	(参考訳) ノイズの多い中間スケール量子(NISQ)時代には、量子エラー緩和は量子デバイスから有用なパフォーマンスを抽出するために必要なツールとなる。しかし、誤差緩和技術によってしばしば想定されるノイズモデルと、量子デバイス上の実際のノイズとの間には大きなギャップがある。その結果、技術の理論的な期待と日々のパフォーマンスの間にギャップが生じている。特に量子デバイスのクラウドユーザーは、デバイスをそのまま利用することが多く、このギャップを最も感じている。これらのテクニックの有用性における不確実性をどのようにパラメータ化し、エラー緩和の実装に必要なリソースとアルゴリズムレベルで要求される精度との間でどのように判断すればよいのか? 第1の質問に答えるために,量子エラー緩和の枠組み内で仮説検定を導入するとともに,第2の質問に対して,エラー緩和実装のリソース要件と緩和効率の両方を考慮した包括的な性能指数(figure of merit)を提案する。この性能指数は、様々なエラー緩和手法のスケーラビリティと精度のトレードオフを評価するのに有用である。最後に, 仮説検定と性能指数を用いて, ゼロノイズ外挿, ランダム化コンパイル, 測定誤差緩和, 動的デカップリング, 推定回路による緩和などの個別の手法からなる16個の誤差緩和パイプラインを実験的に評価した。合計275,640個の回路をIBMの量子コンピュータ2台で走らせた。 In the noisy intermediate-scale quantum (NISQ) era, quantum error mitigation will be a necessary tool to extract useful performance out of quantum devices. However, there is a big gap between the noise models often assumed by error mitigation techniques and the actual noise on quantum devices. As a consequence, there arises a gap between the theoretical expectations of the techniques and their everyday performance. Cloud users of quantum devices in particular, who often take the devices as they are, feel this gap the most. How should they parametrize their uncertainty in the usefulness of these techniques and be able to make judgement calls between resources required to implement error mitigation and the accuracy required at the algorithmic level? To answer the first question, we introduce hypothesis testing within the framework of quantum error mitigation and for the second question, we propose an inclusive figure of merit that accounts for both resource requirement and mitigation efficiency of an error mitigation implementation. The figure of merit is useful to weigh the trade-offs between the scalability and accuracy of various error mitigation methods. 
Finally, using the hypothesis testing and the figure of merit, we experimentally evaluate 16 error mitigation pipelines composed of singular methods such as zero noise extrapolation, randomized compilation, measurement error mitigation, dynamical decoupling, and mitigation with estimation circuits. In total, our data involved running 275,640 circuits on two IBM quantum computers.	翻訳日:2023-01-10 17:46:07 公開日:2023-01-06
# 農村道路における多変量交通状態予測のための注意-LSTM Attention-LSTM for Multivariate Traffic State Prediction on Rural Roads ( http://arxiv.org/abs/2301.02731v1 ) ライセンス: Link先を確認	Elahe Sherafat and Bilal Farooq and Amir Hossein Karbasi and Seyedehsan Seyedabrishami	(参考訳) 正確な交通量と速度予測は、輸送に幅広い応用がある。旅行者と交通機関の意思決定者の両方にとって有用かつタイムリーな情報が得られる。本研究では,イラン最大の観光地都市チャルスとテヘランを結ぶ重要な農村道路セグメントにおいて,交通量と速度を同時に予測するために,注意に基づく長短期記憶モデル(A-LSTM)を提案する。さらに,A-LSTMモデルとLong Short-Term Memory(LSTM)モデルとの比較を行った。どちらのモデルも速度と流れの予測において許容できる性能を示している。しかし、A-LSTMモデルは5分および15分間隔でLSTMを上回る。対照的に、30分間の間隔で2つのモデルの間に有意な差はない。異なる時間地平線に基づくモデルの性能を比較することにより、15分間の地平線モデルは最低の平均二乗誤差(MSE)0.0032に達し、続いて30分間と5分間の地平線がそれぞれ0.004と0.0051である。さらに,15分間の時間間隔において,時間的カテゴリの入力変数である one-hot または cyclic の 2 つの変換に基づくモデルの結果を比較した。その結果, 周期的特徴符号化によるLSTMとA-LSTMは, one-hot特徴符号化よりも優れていた。 Accurate traffic volume and speed prediction have a wide range of applications in transportation. It can result in useful and timely information for both travellers and transportation decision-makers. In this study, an Attention based Long Short-Term Memory model (A-LSTM) is proposed to simultaneously predict traffic volume and speed in a critical rural road segment which connects Tehran to Chalus, the top tourist destination city in Iran. Moreover, this study compares the results of the A-LSTM model with the Long Short-Term Memory (LSTM) model. Both models show acceptable performance in predicting speed and flow. However, the A-LSTM model outperforms the LSTM in 5 and 15-minute intervals. In contrast, there is no meaningful difference between the two models for the 30-minute time interval. By comparing the performance of the models based on different time horizons, the 15-minute horizon model outperforms the others by reaching the lowest Mean Square Error (MSE) loss of 0.0032, followed by the 30 and 5-minutes horizons with 0.004 and 0.0051, respectively. 
In addition, this study compares the results of the models based on two transformations of temporal categorical input variables, one-hot or cyclic, for the 15-minute time interval. The results demonstrate that both LSTM and A-LSTM with cyclic feature encoding outperform those with one-hot feature encoding.	翻訳日:2023-01-10 17:28:29 公開日:2023-01-06
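The two encodings compared in the abstract above can be illustrated for one temporal variable. This sketch assumes hour-of-day as the categorical input (the abstract does not list the exact variables) and uses hypothetical function names. The point is that a cyclic sine/cosine encoding keeps 23:00 close to 00:00, whereas one-hot encoding treats every pair of hours as equally distant.

```python
import math


def one_hot_hour(hour):
    """24-dimensional indicator vector for an hour in 0..23."""
    return [1 if h == hour else 0 for h in range(24)]


def cyclic_hour(hour):
    """Map an hour onto the unit circle so that the encoding wraps at midnight."""
    angle = 2 * math.pi * hour / 24
    return (math.sin(angle), math.cos(angle))


def dist(u, v):
    """Euclidean distance between two encoded vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


# One-hot: 23h and 12h are equally far from 0h.
print(dist(one_hot_hour(23), one_hot_hour(0)), dist(one_hot_hour(12), one_hot_hour(0)))
# Cyclic: 23h is much closer to 0h than 12h is.
print(dist(cyclic_hour(23), cyclic_hour(0)), dist(cyclic_hour(12), cyclic_hour(0)))
```

The cyclic form also uses two input dimensions instead of 24, which may be part of why it helps, although the abstract does not give a reason.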
# イメージベース表現を用いた自己整合複素多項式を用いたアンテナ設計における散乱係数のモデル化 Modeling Scattering Coefficients in Antenna Design using Self-Attentive Complex Polynomials with Image-based Representation ( http://arxiv.org/abs/2301.02747v1 ) ライセンス: Link先を確認	Andrew Cohen, Weiping Dou, Jiang Zhu, Slawomir Koziel, Peter Renner, Jan-Ove Mattsson, Xiaomeng Yang, Beidi Chen, Kevin Stone, Yuandong Tian	(参考訳) 周波数要件を満たし、複数の物理基準に対して最適であるアンテナ設計を見つけることは、次世代ハードウェアの設計において重要な要素である。しかし、目的関数は一般に非常に非線形であり、微妙な設計変更に敏感であるため、そのようなプロセスは自明ではない。さらに、最適化される目的は、しばしば電磁シミュレーション(EM)であり、商業シミュレーションソフトウェアでは遅くて高価である。本研究では,CZP (Constant Zeros Poles) と呼ばれるサンプル効率・精度の高い代理モデルを提案し,シミュレータを使わずに与えられた2次元平面アンテナ設計の周波数領域における散乱係数を直接推定する。 CZPは散乱係数の周波数応答に関する複素零点と極を予測し、マクスウェル方程式を含む任意の線形PDEに対して理論的に正当化した。さらに、czpは、低次元表現を使用する代わりに、既存のメッシュベースのemシミュレーション技術や注意に基づくニューラルネットワークアーキテクチャにインスパイアされたアンテナトポロジーのための新しいイメージベース表現を利用する。実験では,czpが試験損失の点でベースラインを上回るだけでなく,40kのトレーニングサンプルしか持たない商用ソフトウェアで検証可能な2dアンテナ設計を,強化学習などの先進的な逐次探索技術と組み合わせることで検証できることを実証した。 Finding antenna designs that satisfy frequency requirements and are also optimal with respect to multiple physical criteria is a critical component in designing next generation hardware. However, such a process is non-trivial because the objective function is typically highly nonlinear and sensitive to subtle design change. Moreover, the objective to be optimized often involves electromagnetic (EM) simulations, which is slow and expensive with commercial simulation software. In this work, we propose a sample-efficient and accurate surrogate model, named CZP (Constant Zeros Poles), to directly estimate the scattering coefficients in the frequency domain of a given 2D planar antenna design, without using a simulator. CZP achieves this by predicting the complex zeros and poles for the frequency response of scattering coefficients, which we have theoretically justified for any linear PDE, including Maxwell's equations. 
Moreover, instead of using low-dimensional representations, CZP leverages a novel image-based representation for antenna topology inspired by the existing mesh-based EM simulation techniques, and attention-based neural network architectures. We demonstrate experimentally that CZP not only outperforms baselines in terms of test loss, but is also able to find 2D antenna designs verifiable by commercial software with only 40k training samples, when coupled with advanced sequential search techniques like reinforcement learning.	翻訳日:2023-01-10 17:28:06 公開日:2023-01-06
# ボケおよびポアソンノイズ除去のための異方性および等方性全変動の異なる効率的な画像分割フレームワーク Efficient Image Segmentation Framework with Difference of Anisotropic and Isotropic Total Variation for Blur and Poisson Noise Removal ( http://arxiv.org/abs/2301.03393v1 ) ライセンス: Link先を確認	Kevin Bui, Yifei Lou, Fredrick Park, Jack Xin	(参考訳) 本稿では,ぼかしとポアソンノイズによって劣化した画像のセグメント化を目的とする。画像をスムースに分割するために$k$-meansクラスタリングを行う。特に、画像平滑化ステップでは、ムンフォード・シャーモデルにおけるガウス雑音の最小二乗忠実度をポアソン雑音に対応する最大後方(map)項に置き換え、画像勾配のスパーシティを促進するための正規化として、異方性および等方性総変動(aitv)の重み付き差分を取り入れる。このような非凸モデルに対しては、特定の分割方式を開発し、近似演算子を用いて乗算器の交互方向法(ADMM)を適用する。 ADMM方式の有効性を検証するために収束解析を行う。様々なセグメンテーションシナリオ(grayscale/color and multiphase)における数値実験により,本手法がsatを含む多くのセグメンテーション手法を上回っていることを示した。 In this paper, we aim to segment an image degraded by blur and Poisson noise. We adopt a smoothing-and-thresholding (SaT) segmentation framework that finds a piecewise-smooth solution, followed by $k$-means clustering to segment the image. Specifically for the image smoothing step, we replace the least-squares fidelity for Gaussian noise in the Mumford-Shah model with a maximum posterior (MAP) term to deal with Poisson noise and we incorporate the weighted difference of anisotropic and isotropic total variation (AITV) as a regularization to promote the sparsity of image gradients. For such a nonconvex model, we develop a specific splitting scheme and utilize a proximal operator to apply the alternating direction method of multipliers (ADMM). Convergence analysis is provided to validate the efficacy of the ADMM scheme. Numerical experiments on various segmentation scenarios (grayscale/color and multiphase) showcase that our proposed method outperforms a number of segmentation methods, including the original SaT.	翻訳日:2023-01-10 17:18:28 公開日:2023-01-06
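The AITV regularizer named in the abstract above can be written down directly for a small image. This is a minimal sketch assuming forward differences with a zero gradient at the far boundary and a weight `alpha` in [0, 1]; the discretization and the function names are assumptions, and the full smoothing model (Poisson fidelity term, ADMM solver) is not reproduced here.

```python
import math


def gradients(img):
    """Yield forward-difference gradients (dx, dy) for every pixel."""
    rows, cols = len(img), len(img[0])
    for i in range(rows):
        for j in range(cols):
            # Differences are set to zero past the last column/row (assumed boundary rule).
            dx = img[i][j + 1] - img[i][j] if j + 1 < cols else 0.0
            dy = img[i + 1][j] - img[i][j] if i + 1 < rows else 0.0
            yield dx, dy


def aitv(img, alpha=0.5):
    """Anisotropic TV minus alpha times isotropic TV."""
    aniso = sum(abs(dx) + abs(dy) for dx, dy in gradients(img))
    iso = sum(math.hypot(dx, dy) for dx, dy in gradients(img))
    return aniso - alpha * iso


edge = [[0.0, 1.0], [0.0, 1.0]]    # axis-aligned edge
corner = [[0.0, 1.0], [1.0, 1.0]]  # gradient with both components non-zero
print(aitv(edge, alpha=1.0), aitv(corner, alpha=1.0))
```

With `alpha = 1`, gradients that have only one non-zero component contribute nothing to the penalty, which gives some intuition for why the difference promotes sparse image gradients.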
# 混乱した頭: 拡散モデルが対面生成でGANを上回った Diffused Heads: Diffusion Models Beat GANs on Talking-Face Generation ( http://arxiv.org/abs/2301.03396v1 ) ライセンス: Link先を確認	Michał Stypułkowski, Konstantinos Vougioukas, Sen He, Maciej Zięba, Stavros Petridis, Maja Pantic	(参考訳) 顔の生成は、これまで、追加の参照ビデオからのガイダンスなしで、頭の動きや自然な表情を作り出すのに苦労してきた。近年の拡散型生成モデルの開発により、より現実的で安定したデータ合成が可能となり、画像およびビデオ生成の性能は他の生成モデルを上回るものとなった。本研究では,人間の頭部の映像を生成するのに1つの識別画像と音声シーケンスしか必要としない自己回帰拡散モデルを提案する。我々のソリューションは、頭の動き、点滅などの表情を幻覚させ、特定の背景を保存することができる。 2つの異なるデータセットでモデルを評価し、両者で最先端の結果を得る。 Talking face generation has historically struggled to produce head movements and natural facial expressions without guidance from additional reference videos. Recent developments in diffusion-based generative models allow for more realistic and stable data synthesis and their performance on image and video generation has surpassed that of other generative models. In this work, we present an autoregressive diffusion model that requires only one identity image and audio sequence to generate a video of a realistic talking human head. Our solution is capable of hallucinating head movements, facial expressions, such as blinks, and preserving a given background. We evaluate our model on two different datasets, achieving state-of-the-art results on both of them.	翻訳日:2023-01-10 17:17:59 公開日:2023-01-06
# ビデオイベント関連予測のための構造記号表現の防御 In Defense of Structural Symbolic Representation for Video Event-Relation Prediction ( http://arxiv.org/abs/2301.03410v1 ) ライセンス: Link先を確認	Andrew Lu, Xudong Lin, Yulei Niu, Shih-Fu Chang	(参考訳) ビデオ内のイベント関係を理解するには、イベントの基盤となる構造(イベントタイプ、関連する引数ロール、対応するエンティティ)と推論に必要な事実的知識を理解するモデルが必要である。構造記号表現(SSR)に基づく手法は、イベントタイプと関連する引数ロール/エンティティを直接入力として取り込んで推論を行う。しかし、現在最先端のビデオイベント関連予測システムは、入力ビデオから連続的な特徴ベクトルを使用する必要があることを示している。SSR入力のみに基づく既存手法は、オラクルのイベントタイプと引数ロールが与えられた場合でも完全に失敗する。本稿では,以下の質問に答えるために,広範な実験分析を行う。 1) SSR ベースの方法が失敗した理由 2) 映像イベント関連予測の評価設定を適切に理解する方法 3) SSR に基づく手法の可能性を明らかにする方法。まず,従来のSSRに基づくビデオイベント予測モデルの失敗が,準最適なトレーニング設定に起因することを特定する。意外なことに、調整されたハイパーパラメータを持つ単純なSSRモデルでは、最先端モデルに対して20%の絶対的なマクロ精度向上が得られる。次に,質的かつ定量的な分析を通じて,映像のみを入力として使用する評価が現在実現不可能であること,および正確な評価がオラクルのイベント情報に依存することを示す。そこで本研究では,SSRに基づくモデルをイベント系列モデルにさらにコンテキスト化し,外部視覚コモンセンス知識ベースをイベント関連予測の事前学習データセットに再構成する簡易かつ効果的な手法により,より事実的な知識を具備することを提案する。その結果、新たな最先端モデルによって、最終的に25%のマクロ精度パフォーマンス向上が実現される。 Understanding event relationships in videos requires a model to understand the underlying structures of events (i.e., the event type, the associated argument roles, and corresponding entities) along with factual knowledge needed for reasoning. Structural symbolic representation (SSR) based methods directly take event types and associated argument roles/entities as inputs to perform reasoning. However, the state-of-the-art video event-relation prediction system shows the necessity of using continuous feature vectors from input videos; existing methods based solely on SSR inputs fail completely, even when given oracle event types and argument roles. In this paper, we conduct an extensive empirical analysis to answer the following questions: 1) why SSR-based methods failed; 2) how to understand the evaluation setting of video event relation prediction properly; 3) how to uncover the potential of SSR-based methods. We first identify the failure of previous SSR-based video event prediction models to be caused by sub-optimal training settings. 
Surprisingly, we find that a simple SSR-based model with tuned hyperparameters can actually yield a 20% absolute improvement in macro-accuracy over the state-of-the-art model. Then, through qualitative and quantitative analysis, we show that evaluation taking only video as input is currently infeasible and that an accurate evaluation relies on oracle event information. Based on these findings, we propose to further contextualize the SSR-based model to an Event-Sequence Model and equip it with more factual knowledge through a simple yet effective way of reformulating external visual commonsense knowledge bases into an event-relation prediction pretraining dataset. The resultant new state-of-the-art model eventually establishes a 25% macro-accuracy performance boost.	翻訳日:2023-01-10 17:17:33 公開日:2023-01-06
# アラビア語手話認識モデルの設計 Design of Arabic Sign Language Recognition Model ( http://arxiv.org/abs/2301.02693v1 ) ライセンス: Link先を確認	Muhammad Al-Barham, Ahmad Jamal, Musa Al-Yaman	(参考訳) 聴覚障害者は手話を使ってコミュニケーションしており、手話はジェスチャー、動き、姿勢、および話し言葉のアルファベットや単語に対応する表情の組み合わせである。提案するアラビア手話認識モデルは,難聴者や聴覚障害者が一般の人々と効果的にコミュニケーションするのに役立つ。認識は手話のアルファベットを文字に変換する4つの段階からなる。すなわち、後のモデルの訓練とテストに用いるアラビア手話アルファベットの画像を読み込む画像読み込み段階、認識に必要な特徴を抽出するために正規化、画像拡張、リサイズ、フィルタリングなどの画像処理技術を適用する前処理段階、CNNなどのディープラーニング技術による訓練段階、そして未知の画像に対するモデルの性能を示すテスト段階である。モデルは主にPyTorchライブラリを用いて構築・テストされた。モデルはArASL2018でテストされ、40の署名者から収集された32のアルファベット記号に対して54,000の画像で構成され、データセットにはトレーニングデータセットとテストデータセットの2つのセットがある。本報告で詳述するように,精度,時間,使用の柔軟性の観点からシステムの信頼性を確保する必要があった。最後に、今後の研究はアラビア語の手話からアラビア語のテキストに変換するモデルになる。 Deaf people are using sign language for communication, and it is a combination of gestures, movements, postures, and facial expressions that correspond to alphabets and words in spoken languages. The proposed Arabic sign language recognition model helps deaf and hard-of-hearing people communicate effectively with ordinary people. 
Recognition proceeds in four stages that convert alphabet signs into letters: an image loading stage, which loads the images of Arabic sign language alphabets later used to train and test the model; a pre-processing stage, which applies image processing techniques such as normalization, image augmentation, resizing, and filtering to extract the features needed for accurate recognition; a training stage, which uses deep learning techniques such as CNNs; and a testing stage, which demonstrates how well the model performs on images it has not seen before. The model was built and tested mainly using the PyTorch library. The model is tested on ArASL2018, consisting of 54,000 images for 32 alphabet signs gathered from 40 signers, and the dataset has two sets: a training set and a testing set. We had to ensure that the system is reliable in terms of accuracy, time, and flexibility of use, as explained in detail in this report. Finally, future work will be a model that converts Arabic sign language into Arabic text.	翻訳日:2023-01-10 16:51:37 公開日:2023-01-06
# RUPNet:リアルタイムポリープセグメンテーションのための残差アップサンプリングネットワーク RUPNet: Residual upsampling network for real-time polyp segmentation ( http://arxiv.org/abs/2301.02703v1 ) ライセンス: Link先を確認	Nikhil Kumar Tomar, Ulas Bagci, Debesh Jha	(参考訳) 大腸癌は世界中でがん関連死亡の最も多い原因の一つである。早期のポリープの検出と除去は死亡率の低下に寄与し、隣接する臓器への拡散の防止にも寄与する。早期のポリープ検出は世界中の何百万人もの患者を救い、臨床的な負担を軽減できる。しかし,ポリープ検出率は内視鏡医によって大きく異なる。深層学習に基づく手法が多数提案されているが,ほとんどの研究は精度の向上に焦点を当てている。本稿では,リアルタイムに処理でき、高い再現率と適合率を示す、大腸ポリープ分割のための新しいアーキテクチャであるResidual Upsampling Network (RUPNet)を提案する。提案アーキテクチャであるRUPNetは、3つのエンコーダ、3つのデコーダブロックと、ネットワークの終端にある追加のアップサンプリングブロックで構成されるエンコーダ・デコーダネットワークである。画像サイズ$512 \times 512$で,平均ダイス係数0.7658,平均IoU 0.6553,感度0.8049,適合率0.7995,F2スコア0.9361で,毎秒152.60フレームの優れたリアルタイム動作速度を実現する。その結果, RUPNetは高い精度を維持しつつリアルタイムフィードバックを提供でき、早期ポリープ検出のための優れたベンチマークとなることが示唆された。 Colorectal cancer is among the most prevalent causes of cancer-related mortality worldwide. Detection and removal of polyps at an early stage can help reduce mortality and even help prevent spread to adjacent organs. Early polyp detection could save the lives of millions of patients worldwide as well as reduce the clinical burden. However, the polyp detection rate varies significantly among endoscopists. Numerous deep learning-based methods have been proposed; however, most of these studies focus on improving accuracy. Here, we propose a novel architecture, Residual Upsampling Network (RUPNet), for colon polyp segmentation that can process in real time and shows high recall and precision. The proposed architecture, RUPNet, is an encoder-decoder network that consists of three encoders, three decoder blocks, and some additional upsampling blocks at the end of the network. With an image size of $512 \times 512$, the proposed method achieves an excellent real-time operation speed of 152.60 frames per second with an average Dice coefficient of 0.7658, mean intersection over union of 0.6553, sensitivity of 0.8049, precision of 0.7995, and F2-score of 0.9361. 
The results suggest that RUPNet can give real-time feedback while retaining high accuracy, indicating a good benchmark for early polyp detection.	翻訳日:2023-01-10 16:51:20 公開日:2023-01-06
# 条件付き翻訳を用いた交通事故・事故分類データセットの強化 Augmenting Ego-Vehicle for Traffic Near-Miss and Accident Classification Dataset using Manipulating Conditional Style Translation ( http://arxiv.org/abs/2301.02726v1 ) ライセンス: Link先を確認	Hilmil Pradana, Minh-Son Dao, and Koji Zettsu	(参考訳) 先進的な自動運転システムを開発するために、多くの研究者は、クローズドサーキットテレビ(cctv)とダッシュボード搭載カメラから可能な全ての交通リスクケースに注意を払っている。これらの手法のほとんどは、異常が発生したフレーム毎の識別に重点を置いているが、実現されていないため、道路交通参加者は、利用可能なアノテーションデータセットによって、トラフィックビデオで異常を検出できないため、エゴ車両が衝突する可能性がある。近接ミスは事故の一種であり、狭義に避けられる事故と定義できる。しかし,事故発生前の事故とニアミスの間には差はなく,事故の定義を再定義し,DADA-2000データセット上での事故の不整合を再注釈することへの貢献である。事故発生時刻の開始時刻と終了時刻を延ばすことで,事故発生時のエゴ運動を正確にカバーし,事故発生時を含む交通リスク事故を一貫した分類を行い,現実の運転支援システムにより重要な情報を提供する。提案手法は条件付きスタイル変換(cst)と分離可能な3次元畳み込みニューラルネットワーク(s3d)の2つの構成要素を統合する。 CSTアーキテクチャは、再アノテーションDADA-2000データセットを増大させ、交通事故ビデオの数を増やし、異なる種類の条件下での動画分類モデルの性能を一般化するために使用されるunsupervised image-to-image translation network (UNIT)によって導かれる。評価では, クロスバリデーション解析において, ベースラインモデルから10.25%の正のマージンで有意な改善が得られた。 To develop the advanced self-driving systems, many researchers are focusing to alert all possible traffic risk cases from closed-circuit television (CCTV) and dashboard-mounted cameras. Most of these methods focused on identifying frame-by-frame in which an anomaly has occurred, but they are unrealized, which road traffic participant can cause ego-vehicle leading into collision because of available annotation dataset only to detect anomaly on traffic video. Near-miss is one type of accident and can be defined as a narrowly avoided accident. However, there is no difference between accident and near-miss at the time before the accident happened, so our contribution is to redefine the accident definition and re-annotate the accident inconsistency on DADA-2000 dataset together with near-miss. 
By extending the start and end times of each accident, our annotation can precisely cover all ego-motions during an incident and consistently classify all possible traffic risk accidents, including near-misses, to give more critical information to real-world driving assistance systems. The proposed method integrates two components: conditional style translation (CST) and a separable 3-dimensional convolutional neural network (S3D). The CST architecture is derived from unsupervised image-to-image translation networks (UNIT) and is used to augment the re-annotated DADA-2000 dataset, increasing the number of traffic risk accident videos and helping the video classification model generalize across different types of conditions, while S3D performs video classification to demonstrate the consistency of the dataset re-annotation. In evaluation, the proposed method improved accuracy over the baseline model by a margin of 10.25% in cross-validation analysis.	翻訳日:2023-01-10 16:50:55 公開日:2023-01-06
# LS-DYNA機械学習による短繊維強化複合材料の非線形モデリング LS-DYNA Machine Learning-based Multiscale Method for Nonlinear Modeling of Short Fiber-Reinforced Composites ( http://arxiv.org/abs/2301.02738v1 ) ライセンス: Link先を確認	Haoyan Wei, C. T. Wu, Wei Hu, Tung-Huan Su, Hitoshi Oura, Masato Nishi, Tadashi Naito, Stan Chung, Leo Shen	(参考訳) 短繊維強化複合材料(英: short-fiber-reinforced composites、SFRC)は、自動車やエレクトロニクス産業における軽量構造応用のための高性能な工学材料である。通常、SFRC構造は異種組織を誘導する射出成形により製造され、結果として生じる非線形異方性挙動は従来のマイクロメカニカル解析により予測することが困難である。本稿では, 有限要素シミュレーションソフトウェアLS-DYNAにおいて射出成形誘起微細構造, 材料均質化, 深層材料ネットワーク(DMN)を統合し, SFRCの構造解析を行う機械学習ベースのマルチスケール手法を提案する。 DMNは物理埋め込み機械学習モデルであり、オフライントレーニングを通じて複合材料の代表体積要素に隠されたマイクロスケールの材料形態を学習する。 DMNを有限要素に結合することにより,高忠実性直接数値シミュレーションよりも桁違いに高速な計算速度で複合材料や構造物の非線形挙動を予測する,高精度で効率的なデータ駆動手法を開発した。産業規模のSFRC製品をモデル化するために, 転移学習を用いて統一されたDMNデータベースを生成し, 射出成形による繊維配向と体積分率が複合材料全体の特性に及ぼす影響を効果的に把握する。このLS-DYNA機械学習に基づくマルチスケールSFRCモデリングの有望な性能を示す数値的な例を示す。 Short-fiber-reinforced composites (SFRC) are high-performance engineering materials for lightweight structural applications in the automotive and electronics industries. Typically, SFRC structures are manufactured by injection molding, which induces heterogeneous microstructures, and the resulting nonlinear anisotropic behaviors are challenging to predict by conventional micromechanical analyses. In this work, we present a machine learning-based multiscale method by integrating injection molding-induced microstructures, material homogenization, and Deep Material Network (DMN) in the finite element simulation software LS-DYNA for structural analysis of SFRC. DMN is a physics-embedded machine learning model that learns the microscale material morphologies hidden in representative volume elements of composites through offline training. 
By coupling DMN with finite elements, we have developed a highly accurate and efficient data-driven approach, which predicts nonlinear behaviors of composite materials and structures at a computational speed orders-of-magnitude faster than the high-fidelity direct numerical simulation. To model industrial-scale SFRC products, transfer learning is utilized to generate a unified DMN database, which effectively captures the effects of injection molding-induced fiber orientations and volume fractions on the overall composite properties. Numerical examples are presented to demonstrate the promising performance of this LS-DYNA machine learning-based multiscale method for SFRC modeling.	翻訳日:2023-01-10 16:33:45 公開日:2023-01-06
# 低信号対雑音比における等張リキャリブレーション Isotonic Recalibration under a Low Signal-to-Noise Ratio ( http://arxiv.org/abs/2301.02692v1 ) ライセンス: Link先を確認	Mario V. Wüthrich, Johanna Ziegel	(参考訳) 保険料体系は、異なる価格コホート間で系統的な相互資金繰りがないことを保証するために、自動校正性を満たすべきである。回帰モデルは自動校正されないことが多い。自動校正を保証するために,与えられた回帰モデルに等張再校正を適用することを提案する。我々の主な結果は、信号対雑音比が低い場合、この等張再校正ステップが説明可能な価格体系をもたらすことを証明している。これは、得られる等張再校正された回帰関数の複雑性が低いためである。 Insurance pricing systems should fulfill the auto-calibration property to ensure that there is no systematic cross-financing between different price cohorts. Often, regression models are not auto-calibrated. We propose to apply isotonic recalibration to a given regression model to ensure auto-calibration. Our main result proves that under a low signal-to-noise ratio, this isotonic recalibration step leads to explainable pricing systems because the resulting isotonically recalibrated regression functions have a low complexity.	翻訳日:2023-01-10 16:32:20 公開日:2023-01-06
# 宇宙空間における主成分分析 Principal Component Analysis in Space Forms ( http://arxiv.org/abs/2301.02750v1 ) ライセンス: Link先を確認	Puoya Tabaghi, Michael Khanzadeh, Yusu Wang, Sivash Mirarab	(参考訳) 主成分分析(PCA)は、現代のデータ科学の成果である。実践者は通常、データがユークリッド幾何学に適合するとpcaを行う。しかし、階層データのような特定のデータ型の場合、他の幾何学的空間の方が適切である。我々は、ゼロ曲率(ユークリッド)空間に加えて、定数正(球面)および負(双曲)曲率を持つ空間形式でPCAを研究する。リーマン多様体上の任意の点において、接ベクトルの集合に基づくリーマンアフィン部分空間を定義でき、可逆写像を使って多様体への接ベクトルを射影し、逆もまたできる。空間形式における点の集合に対する低次元リーマンアフィン部分空間を見つけることは、そのようなアフィン部分空間が同じ次元と曲率の空間形式に等長であるため、次元の減少に等しい。主成分を見つけるために、アフィン部分空間にデータ点を投影する最小平均コストで多様体値のデータ点の集合を最もよく表現する(リーマン的)アフィン部分空間を求める。そこで我々は,(1) ユークリッドPCAと同様の等式を解くことでアフィン部分空間を推定し,(2) 異なる次元の最適アフィン部分空間をネスト集合とする,という2つの大きな利点をもたらす特定のコスト関数を提案する。これらの性質は、ほとんどが収束が遅く、理論的な保証が弱い反復アルゴリズムである既存の方法よりも進歩する。特に双曲型 PCA の場合、関連する等式はローレンツ空間で作用し、不定内積が与えられ、したがってローレンツ空間とユークリッド空間の等式の間の接続を確立する。球面および双曲空間でシミュレートされたデータセット上で提案した空間形式PCAを評価し,コンバージェンス速度や精度において他の手法よりも優れていることを示す。 Principal component analysis (PCA) is a workhorse of modern data science. Practitioners typically perform PCA assuming the data conforms to Euclidean geometry. However, for specific data types, such as hierarchical data, other geometrical spaces may be more appropriate. We study PCA in space forms; that is, those with constant positive (spherical) and negative (hyperbolic) curvatures, in addition to zero-curvature (Euclidean) spaces. At any point on a Riemannian manifold, one can define a Riemannian affine subspace based on a set of tangent vectors and use invertible maps to project tangent vectors to the manifold and vice versa. Finding a low-dimensional Riemannian affine subspace for a set of points in a space form amounts to dimensionality reduction because, as we show, any such affine subspace is isometric to a space form of the same dimension and curvature. To find principal components, we seek a (Riemannian) affine subspace that best represents a set of manifold-valued data points with the minimum average cost of projecting data points onto the affine subspace. 
We propose specific cost functions that bring about two major benefits: (1) the affine subspace can be estimated by solving an eigenequation -- similar to that of Euclidean PCA, and (2) optimal affine subspaces of different dimensions form a nested set. These properties provide advances over existing methods which are mostly iterative algorithms with slow convergence and weaker theoretical guarantees. Specifically for hyperbolic PCA, the associated eigenequation operates in the Lorentzian space, endowed with an indefinite inner product; we thus establish a connection between Lorentzian and Euclidean eigenequations. We evaluate the proposed space form PCA on data sets simulated in spherical and hyperbolic spaces and show that it outperforms alternative methods in convergence speed or accuracy, often both.	翻訳日:2023-01-10 16:25:38 公開日:2023-01-06
# マルチラベル学習能力のキャラクタリゼーション A Characterization of Multilabel Learnability ( http://arxiv.org/abs/2301.02729v1 ) ライセンス: Link先を確認	Vinod Raman, Unique Subedi, Ambuj Tewari	(参考訳) マルチラベル分類の問題点を考察し,バッチおよびオンライン設定における学習可能性について考察する。両方の設定において、各関数クラスのシングルラベル制限が学習可能である場合に限り、マルチラベル関数クラスが学習可能であることを示す。拡張として,バッチ設定におけるマルチアウトプット回帰とオンライン設定におけるバンディットフィードバックについても検討した。前者は学習可能性w.r.t.$L_p$損失を特徴付ける。後者については、フルフィードバック設定と同様の特性を示す。 We consider the problem of multilabel classification and investigate learnability in batch and online settings. In both settings, we show that a multilabel function class is learnable if and only if each single-label restriction of the function class is learnable. As extensions, we also study multioutput regression in the batch setting and bandit feedback in the online setting. For the former, we characterize learnability w.r.t. $L_p$ losses. For the latter, we show a similar characterization as in the full-feedback setting.	翻訳日:2023-01-10 16:06:23 公開日:2023-01-06
# 極弱スーパービジョンを用いた少数ショットノード分類 Few-shot Node Classification with Extremely Weak Supervision ( http://arxiv.org/abs/2301.02708v1 ) ライセンス: Link先を確認	Song Wang, Yushun Dong, Kaize Ding, Chen Chen, Jundong Li	(参考訳) 数少ないノード分類は、限定されたラベル付きノードを参照として分類することを目的としている。最近のマイナショットノード分類法は、ラベル付きノードが豊富なクラス(メタトレーニングクラス)から学び、制限されたラベル付きノード(メタテストクラス)に一般化する。それでも実世界のグラフでは、多くのクラスで豊富なラベル付きノードを得るのは通常困難である。実際には、各メタトレーニングクラスは、非常に弱い監督問題として知られる複数のラベル付きノードのみで構成されることができる。メタトレーニングのためのラベル付きノードが極めて限られている数少ないノード分類では、メタトレーニングとメタテストの間の一般化ギャップが大きくなるため、サブ最適パフォーマンスが向上する。この問題に取り組むために,極端に弱い監督を持つ少数ノード分類の新たな問題について検討し,広く普及しているメタラーニングフレームワークに基づく原理フレームワーク x-fnc を提案する。具体的には,様々なメタ学習タスクにメタ知識を蓄積し,その知識をメタテストタスクに一般化することが目的である。極端に少ないラベル付きノードから生じる課題に対処するため、擬似ラベル付きノードを追加参照として取得し、極めて限られた監視情報から効果的に学習する2つの必須モジュールを提案する。さらに、4つのノード分類データセットについて、最先端のベースラインと比較してフレームワークの優位性を検証するために、極めて弱い監督力を持つ広範な実験を行った。 Few-shot node classification aims at classifying nodes with limited labeled nodes as references. Recent few-shot node classification methods typically learn from classes with abundant labeled nodes (i.e., meta-training classes) and then generalize to classes with limited labeled nodes (i.e., meta-test classes). Nevertheless, on real-world graphs, it is usually difficult to obtain abundant labeled nodes for many classes. In practice, each meta-training class can only consist of several labeled nodes, known as the extremely weak supervision problem. In few-shot node classification, with extremely limited labeled nodes for meta-training, the generalization gap between meta-training and meta-test will become larger and thus lead to suboptimal performance. To tackle this issue, we study a novel problem of few-shot node classification with extremely weak supervision and propose a principled framework X-FNC under the prevalent meta-learning framework. Specifically, our goal is to accumulate meta-knowledge across different meta-training tasks with extremely weak supervision and generalize such knowledge to meta-test tasks. 
To address the challenges resulting from extremely scarce labeled nodes, we propose two essential modules to obtain pseudo-labeled nodes as extra references and effectively learn from extremely limited supervision information. We further conduct extensive experiments on four node classification datasets with extremely weak supervision to validate the superiority of our framework compared to the state-of-the-art baselines.	翻訳日:2023-01-10 15:57:19 公開日:2023-01-06
# Witscript 3:会話におけるジョーク改善のためのハイブリッドAIシステム Witscript 3: A Hybrid AI System for Improvising Jokes in a Conversation ( http://arxiv.org/abs/2301.02695v1 ) ライセンス: Link先を確認	Joe Toplyn	(参考訳) 以前の論文ではWitscriptとWitscript 2が紹介されていた。 Witscriptはワードプレイに依存するジョークを生成するが、Witscript 2で生成されるジョークは常識に依存する。本稿では, 3つのジョーク生成機構を用いてジョーク候補を生成し, 出力する最適な候補を選択する Witscript 3 を提示した。 WitscriptやWitscript 2と同様に、Witscript 3はプロのコメディライターが作ったユーモアアルゴリズムに基づいている。人間はwitscript 3の入力文に対する応答を44%の冗談だと判断した。これは、Witscript 3がチャットボットに人間のようなユーモアを与えるための別のステップであることを示す証拠である。 Previous papers presented Witscript and Witscript 2, AI systems for improvising jokes in a conversation. Witscript generates jokes that rely on wordplay, whereas the jokes generated by Witscript 2 rely on common sense. This paper extends that earlier work by presenting Witscript 3, which generates joke candidates using three joke production mechanisms and then selects the best candidate to output. Like Witscript and Witscript 2, Witscript 3 is based on humor algorithms created by an expert comedy writer. Human evaluators judged Witscript 3's responses to input sentences to be jokes 44% of the time. This is evidence that Witscript 3 represents another step toward giving a chatbot a humanlike sense of humor.	翻訳日:2023-01-10 15:39:03 公開日:2023-01-06
# 感覚関係の階層を生かした談話関係感覚の対比学習の促進 Facilitating Contrastive Learning of Discourse Relational Senses by Exploiting the Hierarchy of Sense Relations ( http://arxiv.org/abs/2301.02724v1 ) ライセンス: Link先を確認	Wanqiu Long and Bonnie Webber	(参考訳) 暗黙の談話関係認識は、2つの隣接するテキストの間に保持される感覚や感覚を、それらの間に明示的な接続性がない場合に識別する難しいタスクである。 PDTB-2とPDTB-3の両方では、談話関係感覚は4つの広義のトップレベル感覚からより特定の感覚まで3段階の階層に分類される。暗黙的談話関係認識に関するほとんどの以前の研究は、センス階層を単にどのセンスラベルが利用可能かを示すために使用してきた。ここではさらに -- 認識プロセス自体にセンス階層を組み込んで、対比学習で使用される否定的な例を選択する。追加の努力なしに、このアプローチはタスクの最先端のパフォーマンスを達成する。 Implicit discourse relation recognition is a challenging task that involves identifying the sense or senses that hold between two adjacent spans of text, in the absence of an explicit connective between them. In both PDTB-2 and PDTB-3, discourse relational senses are organized into a three-level hierarchy ranging from four broad top-level senses, to more specific senses below them. Most previous work on implicit discourse relation recognition have used the sense hierarchy simply to indicate what sense labels were available. Here we do more -- incorporating the sense hierarchy into the recognition process itself and using it to select the negative examples used in contrastive learning. With no additional effort, the approach achieves state-of-the-art performance on the task.	翻訳日:2023-01-10 15:38:52 公開日:2023-01-06
# フルフィールド超音波キャラクタリゼーションのための深層学習 Deep learning for full-field ultrasonic characterization ( http://arxiv.org/abs/2301.02378v1 ) ライセンス: Link先を確認	Yang Xu, Fatemeh Pourahmadian, Jian Song, Conglin Wang	(参考訳) 本研究は、機械学習の最近の進歩を活用し、全波形データから層状成分の機械的特性を分散再構築するための物理ベースのデータ解析プラットフォームを構築する。本稿では,2つの論理,すなわち直接反転と物理インフォームドニューラルネットワーク(PINN)について検討する。直接反転には3つのステップがある。 (i)フルフィールドデータのスペクトルノイズ除去と微分、(ii)各領域における未知の物理・正規化パラメータのプロファイルを近似するための適切なニューラルマップの構築、及び (iii)(i)のデータを用いたTikhonov-regularized PDE損失の最小化によるニューラルネットワークの同時学習。 PINNは、フィールド変数が物理的未知や損失関数重みのような(スカラーまたは分散された)補助パラメータを備えたニューラルマップによってモデル化されるマルチタスク学習を通じて、予測能力を持つ複雑なシステムの効率的なサロゲートモデルを提供する。 PINNは、基礎となる物理法則を制約として、データ不適合の尺度を最小化することで訓練される。本研究では,超音波データからの学習を容易にするため,PINNの損失に (a)データ不適合を計算するための波数依存のソボレフノルムと、(b)弾性波伝搬に関するPDEの形式を活用して損失目標を自然にバランスさせる, 特定のスケーリングフレームワークにおける非適応重みを採用する。どちらのパラダイムも合成データと実験室テストデータで調べられる。後者の場合、複数の周波数で再構成を行い、データ駆動モデリングにおける検証と妥当性確認の重要性を強調した相補的な実験によって結果が検証される。 This study takes advantage of recent advances in machine learning to establish a physics-based data analytic platform for distributed reconstruction of mechanical properties in layered components from full waveform data. In this vein, two logics, namely the direct inversion and physics-informed neural networks (PINNs), are explored. The direct inversion entails three steps: (i) spectral denoising and differentiation of the full-field data, (ii) building appropriate neural maps to approximate the profile of unknown physical and regularization parameters on their respective domains, and (iii) simultaneous training of the neural networks by minimizing the Tikhonov-regularized PDE loss using data from (i). PINNs furnish efficient surrogate models of complex systems with predictive capabilities via multitask learning where the field variables are modeled by neural maps endowed with (scalar or distributed) auxiliary parameters such as physical unknowns and loss function weights. 
PINNs are then trained by minimizing a measure of data misfit subject to the underlying physical laws as constraints. In this study, to facilitate learning from ultrasonic data, the PINNs loss adopts (a) wavenumber-dependent Sobolev norms to compute the data misfit, and (b) non-adaptive weights in a specific scaling framework to naturally balance the loss objectives by leveraging the form of PDEs germane to elastic-wave propagation. Both paradigms are examined via synthetic and laboratory test data. In the latter case, the reconstructions are performed at multiple frequencies and the results are verified by a set of complementary experiments highlighting the importance of verification and validation in data-driven modeling.	翻訳日:2023-01-10 00:36:51 公開日:2023-01-06
# 深部生物学的経路インフォームド・パス-ゲノム多モード生存予測 Deep Biological Pathway Informed Pathology-Genomic Multimodal Survival Prediction ( http://arxiv.org/abs/2301.02383v1 ) ライセンス: Link先を確認	Lin Qiu, Aminollah Khormali, Kai Liu	(参考訳) 病理画像やゲノムデータなどのマルチモーダルデータの統合は、パーソナライズされた治療におけるがんの不均一性と複雑性の理解、および生存予測の強化に不可欠である。病理学とゲノムデータを統合する進歩にもかかわらず、ほとんどの既存の手法は複雑なモダリティ間の関係を完全に掘り出すことはできない。さらに、前臨床発見と臨床予測を管理するこれらのモデルから説明可能な特徴を特定することは、がんの診断、予後、治療反応の研究に不可欠である。生命予後予測だけでなく, 生存率の異なる遺伝子や経路を同定するために, 病理画像とゲノムデータを統合した, 新たな生物学的経路形成型病理遺伝深層モデル ponet を提案する。 The Cancer Genome Atlas (TCGA) データセットの6つの実験結果から,提案手法は優れた予測性能を示し,有意義な生物学的解釈を示した。提案手法は,疾患の理解と治療耐性の予測に汎用的な応用性を有するマルチモーダルバイオメディカルデータを用いた,生体情報による深層ネットワークの訓練方法に関する知見を確立する。 The integration of multi-modal data, such as pathological images and genomic data, is essential for understanding cancer heterogeneity and complexity for personalized treatments, as well as for enhancing survival predictions. Despite the progress made in integrating pathology and genomic data, most existing methods cannot mine the complex inter-modality relations thoroughly. Additionally, identifying explainable features from these models that govern preclinical discovery and clinical prediction is crucial for cancer diagnosis, prognosis, and therapeutic response studies. We propose PONET- a novel biological pathway-informed pathology-genomic deep model that integrates pathological images and genomic data not only to improve survival prediction but also to identify genes and pathways that cause different survival rates in patients. Empirical results on six of The Cancer Genome Atlas (TCGA) datasets show that our proposed method achieves superior predictive performance and reveals meaningful biological interpretations. The proposed method establishes insight into how to train biologically informed deep networks on multimodal biomedical data which will have general applicability for understanding diseases and predicting response and resistance to treatment.	翻訳日:2023-01-10 00:36:27 公開日:2023-01-06
# マルチジャンル音楽トランスフォーマー：フルレングスの楽曲の作曲 Multi-Genre Music Transformer -- Composing Full Length Musical Piece ( http://arxiv.org/abs/2301.02385v1 ) ライセンス: Link先を確認	Abhinav Kaushal Keshari	(参考訳) 音楽を生成するタスクにおいて、アートファクタは大きな役割を担い、AIにとって大きな課題である。従来は、新しい楽曲を制作するための敵対的な訓練や、様々な音楽要素(ビート、テンポ、音楽ステム)の互換性をモデル化する作業は、この課題を学習する素晴らしい例であった。しかし、これはマッシュアップの生成や、テンポや調（キー）の分布から特徴を学習して類似したパターンを生成することに限られていた。複合語トランスフォーマーは、複合語で定義された音楽イベントを含むシーケンス生成チャレンジとして音楽生成タスクを表現することができた。これらの音楽イベントは、音符の進行、コードの変更、調和、芸術的要素をより正確に記述する。本研究の目的は,楽曲のジャンルや形式も考慮した、より難しい課題を含む,より適応的な学習プロセスを通じて楽曲の制作を学ぶマルチジャンルトランスフォーマーの実装である。我々は,マルチジャンルの複合語データセットを構築し,このデータセット上で学習した線形トランスフォーマを実装した。このマルチジャンルトランスフォーマーは、オリジナル曲に匹敵する多種多様な新曲をフルレングスで生成することができた。このモデルは、議論した他のモデルより2～5倍速く訓練できる。 In the task of generating music, the art factor plays a big role and is a great challenge for AI. Previous work involving adversarial training to produce new music pieces and modeling the compatibility of variety in music (beats, tempo, musical stems) demonstrated great examples of learning this task. However, this work was limited to generating mashups or learning features from tempo and key distributions to produce similar patterns. The Compound Word Transformer was able to represent the music generation task as a sequence generation challenge involving musical events defined by compound words. These musical events give a more accurate description of note progression, chord changes, harmony, and the art factor. The objective of the project is to implement a Multi-Genre Transformer that learns to produce music pieces through a more adaptive learning process involving a more challenging task in which the genre or form of the composition is also considered. We built a multi-genre compound word dataset and implemented a linear transformer trained on this dataset. We call this model the Multi-Genre Transformer; it was able to generate new full-length musical pieces that are diverse and comparable to original tracks. The model trains 2-5 times faster than other models discussed.	翻訳日:2023-01-10 00:36:08 公開日:2023-01-06
# 心電図ノイズ除去のためのデータ駆動ガウス過程フィルタ A Data-Driven Gaussian Process Filter for Electrocardiogram Denoising ( http://arxiv.org/abs/2301.02607v1 ) ライセンス: Link先を確認	Mircea Dumitru, Qiao Li, Erick Andres Perez Alday, Ali Bahrami Rad, Gari D. Clifford, Reza Sameni	(参考訳) 目的: 心電図 (ECG) フィルタリングを含む様々な用途に効果的に使用されているガウス過程 (GP) ベースのフィルタは、計算負荷が高く、そのハイパーパラメータの選択は通常アドホックである。方法: この2つの問題に対処するため、ECG位相領域(ECG phase domain)という概念を用いてデータ駆動GPフィルタを開発する。ECG位相領域とは、ECGビートを一定数のサンプル上に時間伸縮して表現し、Rピークを揃えたものであり、ガウス分布に従うと仮定する。この仮定の下で、サンプル平均と共分散行列の計算を単純化し、アドホックなハイパーパラメータなしでデータ駆動方式でGPフィルタの効率的な実装を可能にする。提案フィルタはPhysioNet QTデータベース上で,最先端のウェーブレットベースフィルタと比較して評価する。付加雑音を用いて、-5dBから30dBまでの5dB刻みのSNRレベルにおけるフィルタの信号対雑音比(SNR)改善を測定して評価を行った。臨床評価のために, 原信号とフィルタ信号のQT間隔の推定誤差を測定し, ベンチマークフィルタと比較した。結果: 提案するGPフィルタは, テストした全雑音レベルでベンチマークフィルタよりも優れていることが示された。また、QT間隔推定誤差のバイアスと分散の観点から、最先端フィルタよりも優れている。結論: 提案するGPフィルタは臨床および研究応用においてECGを前処理するための汎用的手法であり, 任意の長さとサンプリング周波数のECGに適用可能であり, その性能に対する信頼区間を提供する。 Objective: Gaussian Processes (GP)-based filters, which have been effectively used for various applications including electrocardiogram (ECG) filtering, can be computationally demanding and the choice of their hyperparameters is typically ad hoc. Methods: We develop a data-driven GP filter to address both issues, using the notion of the ECG phase domain -- a time-warped representation of the ECG beats onto a fixed number of samples and aligned R-peaks, which is assumed to follow a Gaussian distribution. Under this assumption, the computation of the sample mean and covariance matrix is simplified, enabling an efficient implementation of the GP filter in a data-driven manner, with no ad hoc hyperparameters. The proposed filter is evaluated and compared with a state-of-the-art wavelet-based filter, on the PhysioNet QT Database. The performance is evaluated by measuring the signal-to-noise ratio (SNR) improvement of the filter at SNR levels ranging from -5 to 30dB, in 5dB steps, using additive noise. 
For a clinical evaluation, the error between the estimated QT-intervals of the original and filtered signals is measured and compared with the benchmark filter. Results: It is shown that the proposed GP filter outperforms the benchmark filter for all the tested noise levels. It also outperforms the state-of-the-art filter in terms of QT-interval estimation error bias and variance. Conclusion: The proposed GP filter is a versatile technique for preprocessing the ECG in clinical and research applications, is applicable to ECG of arbitrary lengths and sampling frequencies, and provides confidence intervals for its performance.	翻訳日:2023-01-10 00:35:49 公開日:2023-01-06
# NEC違反:トンネルとカシミール効果 NEC violation: Tunnelling versus the Casimir effect ( http://arxiv.org/abs/2301.02455v1 ) ライセンス: Link先を確認	Jean Alexandre and Drew Backhouse	(参考訳) 有限体積で許容される2つの縮退したミニマ間のトンネルは、非拡張対称な基底状態をもたらす。これにより、フィールドを含むボックス内の連続的なモーメントの集合が仮定された場合、十分な低温でヌルエネルギ条件に違反する。離散モーメントを考慮すると、この図を修正でき、トンネルによって引き起こされる基底状態エネルギーにカシミールエネルギーを加えることで達成される。ゼロ温度に焦点をあてると、これらの非自明な効果は、典型的な長さスケールに依存する。 We show that tunnelling between two degenerate minima, as allowed in a finite volume, leads to a non-extensive symmetric ground state. This results in Null Energy Condition violation for sufficiently low temperatures, when a continuous set of momenta in the box containing the field is assumed. Taking into account discrete momenta can modify this picture and is achieved via the addition of the Casimir energy to the tunnelling-induced ground state energy. Focusing on zero-temperature, these non-trivial effects are found to compete, depending on the typical length scales involved.	翻訳日:2023-01-10 00:34:49 公開日:2023-01-06
# 有限温度シミュレーションのための適応変分量子最小絡み合い典型的な熱状態 Adaptive variational quantum minimally entangled typical thermal states for finite temperature simulations ( http://arxiv.org/abs/2301.02592v1 ) ライセンス: Link先を確認	João C. Getelina, Niladri Gomes, Thomas Iadecola, Peter P. Orth, Yong-Xin Yao	(参考訳) 熱平衡における量子多体系のシミュレーションのためのスケーラブルな量子アルゴリズムは、有限温度における量子物質の特性を予測するのに重要である。ここでは,最小絡み合った典型的な熱状態(METTS)アルゴリズムの量子コンピューティング版について記述し,ベンチマークを行った。 AVQMETTSと呼ばれるアルゴリズムは、ノイズの多い中間スケール量子(NISQ)ハードウェアに適した、コンパクトで問題固有の量子回路を動的に生成する。我々は、状態ベクトルシミュレータ上でAVQMETTSをベンチマークし、1次元と2次元の積分可能および非可積分量子スピンモデルの熱エネルギー計算を行い、回路複雑性の概して線形なスケールを示す。最後に,二次元横磁場イジングモデルの有限温度相転移線をマッピングする。 Scalable quantum algorithms for the simulation of quantum many-body systems in thermal equilibrium are important for predicting properties of quantum matter at finite temperatures. Here we describe and benchmark a quantum computing version of the minimally entangled typical thermal states (METTS) algorithm for which we adopt an adaptive variational approach to perform the required quantum imaginary time evolution. The algorithm, which we name AVQMETTS, dynamically generates compact and problem-specific quantum circuits, which are suitable for noisy intermediate-scale quantum (NISQ) hardware. We benchmark AVQMETTS on statevector simulators and perform thermal energy calculations of integrable and nonintegrable quantum spin models in one and two dimensions and demonstrate an approximately linear system-size scaling of the circuit complexity. Finally, we map out the finite-temperature phase transition line of the two-dimensional transverse field Ising model.	翻訳日:2023-01-10 00:34:39 公開日:2023-01-06
# 遅延系の階層的運動方程式(heom)アナログ:共振器間光子伝播を例に A hierarchical equations of motion (HEOM) analog for systems with delay: illustrated on inter-cavity photon propagation ( http://arxiv.org/abs/2301.02626v1 ) ライセンス: Link先を確認	Robert Fuchs and Marten Richter	(参考訳) 過去20年間で、谷村と久保の階層的運動方程式(HEOM)は、システムバス問題の数値計算のための動きに基づくツールの方程式となっている。 HEOMは今日では、外浴を通しての散逸・移行プロセスの多くに一般化されている。空間的に拡張されたフォトニック系では、浴槽内の光子の伝播は量子エミッタのカップリングの遅延/遅延を引き起こす。ここで、HEOMの導出の背後にあるアイデアは光子遅延の場合に一般化され、2つの誘電スラブの単純な例に適用される。導出方程式は遅延を記述するための単純な信頼できる枠組みを提供し、経路積分処理の代替となるかもしれない。 Over the last two decades, the hierarchical equations of motion (HEOM) of Tanimura and Kubo have become the equation of motion-based tool for numerically exact calculations of system-bath problems. The HEOM is today generalized to many cases of dissipation and transfer processes through an external bath. In spatially extended photonic systems, the propagation of photons through the bath leads to retardation/delays in the coupling of quantum emitters. Here, the idea behind the HEOM derivation is generalized to the case of photon retardation and applied to the simple example of two dielectric slabs. The derived equations provide a simple reliable framework for describing retardation and may provide an alternative to path integral treatments.	翻訳日:2023-01-10 00:34:24 公開日:2023-01-06
# 高性能コンピューティングにおける神話と伝説 Myths and Legends in High-Performance Computing ( http://arxiv.org/abs/2301.02432v1 ) ライセンス: Link先を確認	Satoshi Matsuoka, Jens Domke, Mohamed Wahib, and Aleksandr Drozd, Torsten Hoefler	(参考訳) このユーモラスで思考を挑発する記事では、ハイパフォーマンスコンピューティングコミュニティのメンバーの間で伝承される神話や伝説について論じる。カンファレンスやミーティング、プロダクト広告、論文、さらにはツイートやブログ、ニュース記事といったコミュニケーションから、コミュニティ内(そしてそれ以上)でこれらの神話を収集しました。それらは、デンナード・スケーリングやムーアの法則のような多くのスケーリング法則の終わりによって引き起こされた、現在の大規模な変化の時代におけるジートジストであると信じています。いくつかの法則が終わる一方で、アルゴリズムスケーリングや新しいアーキテクチャ研究など、新しい方向性が開かれる。しかし、これらの神話は科学的事実に基づくことはめったにないが、しばしばいくつかの証拠や議論に基づいている。実際、これは多くの神話が存在する理由であり、それが明確に答えられない理由であると信じている。それぞれに明確な答えがあるように感じられるが、ベートーヴェンがモーツァルトより優れているかどうかという問題など、哲学的な議論が絶え間ない。我々は、私たちの神話の収集を、研究と産業投資の新たな方向性に関する議論として見たいと思っています。 In this humorous and thought provoking article, we discuss certain myths and legends that are folklore among members of the high-performance computing community. We collected those myths from conversations at conferences and meetings, product advertisements, papers, and other communications such as tweets, blogs, and news articles within (and beyond) our community. We believe they represent the zeitgeist of the current era of massive change, driven by the end of many scaling laws such as Dennard scaling and Moore's law. While some laws end, new directions open up, such as algorithmic scaling or novel architecture research. However, these myths are rarely based on scientific facts but often on some evidence or argumentation. In fact, we believe that this is the very reason for the existence of many myths and why they cannot be answered clearly. While it feels like there should be clear answers for each, some may remain endless philosophical debates such as the question whether Beethoven was better than Mozart. We would like to see our collection of myths as a discussion of possible new directions for research and industry investment.	翻訳日:2023-01-10 00:34:13 公開日:2023-01-06
# 量子多重アクセスワイヤタップチャネル:ワンショットで実現可能なシークレットレート領域について Quantum Multiple Access Wiretap Channel: On the One-Shot Achievable Secrecy Rate Regions ( http://arxiv.org/abs/2301.02479v1 ) ライセンス: Link先を確認	Hadi Aghaee and Bahareh Akhbari	(参考訳) 本稿では,古典的量子多重アクセスパケットチャネル(CQ-MA-WTC)をワンショット設定で検討する。そこで本研究では,CQ-MA-WTCを同時位置ベース復号器を用いて解析し,信頼性の高い復号化を行う。また,CQ-MA-WTC を Sen のワンショット継手典型補題を用いて解析し,信頼性の高い復号化を行う。同時位置ベースデコーダは、複数の仮説テスト問題を引き起こす傾向がある。また、凸分割を用いて同時シナリオにおけるプライバシー基準を分析することも問題となる。両問題を克服するために,まず cq-ma-wtc の双対と見なすことのできる新しいチャネルを導入する。このチャネルは、複数のメッセージ(pp-qwtc)を持つポイントツーポイント量子ワイヤータップチャネルと呼ばれる。以下では,この問題を解決するための戦略として,量子放送チャネル(qbcs)をワンショット設定で検討し,解析する。 In this paper, we want to investigate classical-quantum multiple access wiretap channels (CQ-MA-WTC) under one-shot setting. In this regard, we analyze the CQ-MA-WTC using simultaneous position-based decoder for reliable decoding and using a newly introduced technique in order to decode securely. Also, for the sake of comparison, we analyze the CQ-MA-WTC using Sen's one-shot joint typicality lemma for reliable decoding. The simultaneous position-based decoder tends to a multiple hypothesis testing problem. Also, using convex splitting to analyze the privacy criteria in a simultaneous scenario becomes problematic. To overcome both problems, we first introduce a new channel that can be considered as a dual to the CQ-MA-WTC. This channel is called a point-to-point quantum wiretap channel with multiple messages (PP-QWTC). In the following, as a strategy to solve the problem, we also investigate and analyze quantum broadcast channels (QBCs) under the one-shot setting.	翻訳日:2023-01-10 00:33:55 公開日:2023-01-06
# グラフェン系ナノアンテナにおけるエッジ効果とコンダクタンスの理論 Theory of Edge Effects and Conductance for Applications in Graphene-based Nanoantennas ( http://arxiv.org/abs/2301.02441v1 ) ライセンス: Link先を確認	Tomer Berghaus, Touvia Miloh, Oded Gottlieb, and Gregory Slepyan	(参考訳) 本稿では,グラフェンにおけるエッジ効果の理論を,テラヘルツ,赤外線,可視周波数領域のナノアンテナへの応用に適用する。その特性は、通常の表面伝導率ではなく、動的導電率の観点から定式化して到達した自己整合性である。エッジ効果の物理的モデルは、ディラックフェルミオンの概念に基づいている。表面コンダクタンスは一般感受性と見なされ、kuboアプローチによって計算される。以前のモデルとは対照的に、表面コンダクタンスは非均質かつ非局所となる。表面コンダクタンスの空間的挙動は、シートの長さと電気化学的ポテンシャルに依存する。数値シミュレーションの結果,2.1-800nmの範囲と0.1-1.0ev範囲の電気化学ポテンシャルについて検討した。長さが800nmを超えると、我々のモデルは比較的高い精度で古典的なドリュード導電率モデルと一致することが示されている。比較的短い長さでは、導電性は通常空間振動を示し、導電性に欠け、グラフェン系アンテナの特性に強く影響を及ぼす。このような空間振動の周期と振幅は、電気化学的ポテンシャルに強く依存する。新しい理論は、ゲート電圧の電気化学的ポテンシャルを変化させることで、電気制御されたナノアンテナを実現する方法を開く。得られた結果は、現代の量子技術における炭素系ナノデバイスの設計に適用できる。 In this paper, we develop a theory of edge effects in graphene for its applications to nanoantennas in the terahertz, infrared, and visible frequency ranges. Its characteristic feature is self-consistency, reached due to the formulation in terms of dynamical conductance instead of the ordinarily used surface conductivity. The physical model of edge effects is based on using the concept of Dirac fermions. The surface conductance is considered as a general susceptibility and is calculated via the Kubo approach. In contrast with earlier models, the surface conductance becomes nonhomogeneous and nonlocal. The spatial behavior of the surface conductance depends on the length of the sheet and the electrochemical potential. Results of numerical simulations are presented for lengths in the range of 2.1-800 nm and electrochemical potentials ranging between 0.1 and 1.0 eV. It is shown that if the length exceeds 800 nm, our model agrees with the classical Drude conductivity model with a relatively high degree of accuracy. For rather short lengths, the conductance usually exhibits spatial oscillations, which are absent in conductivity and strongly affect the properties of graphene-based antennas. 
The period and amplitude of such spatial oscillations strongly depend on the electrochemical potential. The new theory opens the way for realizing electrically controlled nanoantennas by changing the electrochemical potential by means of the gate voltage. The obtained results may be applicable to the design of carbon-based nanodevices in modern quantum technologies.	翻訳日:2023-01-10 00:27:02 公開日:2023-01-06
# フェムト秒パルス駆動非線形光ファイバにおける偏光励起光の発生の最適化 Optimizing the generation of polarization squeezed light in nonlinear optical fibers driven by femtosecond pulses ( http://arxiv.org/abs/2301.02454v1 ) ライセンス: Link先を確認	A. V. Andrianov, N. A. Kalinin, A. A. Sorokin, E. A. Anashkina, L. L. Sanchez-Soto, J. F. Corney, and G. Leuchs	(参考訳) 超短パルスレーザーに対するkerr効果を利用した光ファイバでは、明るい絞り光を生成することができる。しかし、繊維中のパルス伝搬は、スクイーズを劣化させる非保存効果を受ける。本稿では,su(2)不変で技術的な摂動に対して頑健な2モード偏光スクイージングを解析し,偏光維持ファイバ内で生成する。我々は、様々な非保存効果と実ファイバーデータを含むファイバの量子パルス進化の先進モデルを用いて、プロセスとパルスパラメータの厳密な数値最適化を行う。数値結果は実験結果と一致している。 Bright squeezed light can be generated in optical fibers utilizing the Kerr effect for ultrashort laser pulses. However, pulse propagation in a fiber is subject to nonconservative effects that deteriorate the squeezing. Here, we analyze two-mode polarization squeezing, which is SU(2)-invariant, robust against technical perturbations, and can be generated in a polarization-maintaining fiber. We perform a rigorous numerical optimization of the process and the pulse parameters using our advanced model of quantum pulse evolution in the fiber that includes various nonconservative effects and real fiber data. Numerical results are consistent with experimental results.	翻訳日:2023-01-10 00:26:43 公開日:2023-01-06
# 未検出光による量子イメージング蒸留実験 Experimental quantum imaging distillation with undetected light ( http://arxiv.org/abs/2301.02529v1 ) ライセンス: Link先を確認	Jorge Fuenzalida, Marta Gilaberte Basset, Sebastian Töpfer, Juan P. Torres, Markus Gräfe	(参考訳) 誘導コヒーレンス効果に基づくイメージングは、光子対を用いて、それをプローブする光を検出することなく、物体の情報を得る。 1つの光子が物体を照らすが、そのパートナーのみが検出されるため、偶然の事象の測定は不要である。検出された光子の特定の干渉パターンを観察して、追従対象の情報を開示する。ここでは、この撮像技術がノイズに耐性を持たせることを実験的に実証する。本稿では,関心信号の干渉変調に基づく画像蒸留法を提案する。提案手法は,実利得信号の250倍のノイズレベルに対しても高品質の画像を生成することができることを示す。また、我々の発見に関する詳細な理論的説明も含んでいる。 Imaging based on the induced coherence effect makes use of photon pairs to obtain information of an object without detecting the light that probes it. While one photon illuminates the object, only its partner is detected, so no measurement of coincidence events is needed. The sought-after object's information is revealed by observing a certain interference pattern on the detected photon. Here we demonstrate experimentally that this imaging technique can be made resilient to noise. We introduce an imaging distillation approach based on the interferometric modulation of the signal of interest. We show that our scheme can generate a high-quality image of an object even against noise levels up to 250 times the actual signal of interest. We also include a detailed theoretical explanation of our findings.	翻訳日:2023-01-10 00:26:33 公開日:2023-01-06
# アウトデコヒーレンスによる古典性:概念、マルコビアン性との関係、およびランダム行列論アプローチ Classicality with(out) decoherence: Concepts, relation to Markovianity, and a random matrix theory approach ( http://arxiv.org/abs/2301.02563v1 ) ライセンス: Link先を確認	Philipp Strasberg	(参考訳) 古典の世界が量子物理学の根底からどのように現われるかという疑問に対する答えは、次のように再検討され、連結され、拡張される。まず、オープン量子系のデコヒーレンス、一貫性/デコヒーレントヒストリー、コルモゴロフ一貫性の3つの異なる概念を比較する。第二に、これらの概念をつなぐ量子マルコフ性(厳密に定義される)の重要な役割が確立される。第3に、ランダム行列理論モデルを用いて、大量のコヒーレンスが存在するにもかかわらず、遅い観測値と粗い観測値の測定統計値において、量子効果が指数関数的に抑制されることが示されている。これはまた数値的に例示されており、古典性の出現に対する非可積分性とカオスの可能性と重要性を強調している。 Answers to the question of how a classical world emerges from underlying quantum physics are revisited, connected and extended as follows. First, three distinct concepts are compared: decoherence in open quantum systems, consistent/decoherent histories and Kolmogorov consistency. Second, the crucial role of quantum Markovianity (defined rigorously) to connect these concepts is established. Third, using a random matrix theory model, quantum effects are shown to be exponentially suppressed in the measurement statistics of slow and coarse observables despite the presence of a large amount of coherence. This is also numerically exemplified, and it highlights the potential and importance of non-integrability and chaos for the emergence of classicality.	翻訳日:2023-01-10 00:26:14 公開日:2023-01-06
# エンドツーエンド無線通信のためのハイブリッド量子古典オートエンコーダ Hybrid Quantum-Classical Autoencoders for End-to-End Radio Communication ( http://arxiv.org/abs/2301.02609v1 ) ライセンス: Link先を確認	Zsolt Tabi and Bence Bakó and Dániel T. R. Nagy and Péter Vaderna and Zsófia Kallus and Péter Hága and Zoltán Zimborás	(参考訳) 量子ニューラルネットワークは、ノイズの多い量子処理ユニットを応用するための候補として浮上している。本稿では,エンドツーエンド無線通信のためのハイブリッド量子古典オートエンコーダを提案する。古典的無線システムの物理層において,ノイズチャネル上での標準符号化無線信号のシミュレーションアーキテクチャの性能について検討する。我々は、受信機内の量子デコーダが送信部内の古典的エンコーダで動作するハイブリッドモデルを実装した。信号劣化に対する堅牢性に優れた入力シンボルの潜在空間表現を学ぶことに加えて、量子ビット回路の一般化されたデータ再ロードスキームにより、アプリケーションの推論時間制約を満たすことができる。 Quantum neural networks are emerging as potential candidates to leverage noisy quantum processing units for applications. Here we introduce hybrid quantum-classical autoencoders for end-to-end radio communication. In the physical layer of classical wireless systems, we study the performance of simulated architectures for standard encoded radio signals over a noisy channel. We implement a hybrid model, where a quantum decoder in the receiver works with a classical encoder in the transmitter part. Besides learning a latent space representation of the input symbols with good robustness against signal degradation, a generalized data re-uploading scheme for the qubit-based circuits makes it possible to meet the inference-time constraints of the application.	翻訳日:2023-01-10 00:25:59 公開日:2023-01-06
# ハード組合せ問題に対する量子価格に基づく列生成フレームワーク A quantum pricing-based column generation framework for hard combinatorial problems ( http://arxiv.org/abs/2301.02637v1 ) ライセンス: Link先を確認	Wesley da Silva Coelho, Loïc Henriet, Louis-Paul Henry	(参考訳) 本研究では、中性原子プラットフォームに基づく量子サンプリング器を含む完全ハイブリッド古典量子アルゴリズムを提案する。このアプローチは、オペレーションリサーチの分野で開発された古典列生成フレームワークにインスパイアされ、量子プロシージャが古典的な解法にどのように役立つかを示す。提案手法を最小頂点色問題にベンチマークし,提案したハイブリッド量子古典列生成アルゴリズムが比較的数イテレーションで優れた解が得られることを示す。結果と最先端の古典的手法と量子的アプローチを比較した。 In this work, we present a complete hybrid classical-quantum algorithm involving a quantum sampler based on neutral atom platforms. This approach is inspired by classical column generation frameworks developed in the field of Operations Research and shows how quantum procedures can assist classical solvers in addressing hard combinatorial problems. We benchmark our method on the Minimum Vertex Coloring problem and show that the proposed hybrid quantum-classical column generation algorithm can yield good solutions in relatively few iterations. We compare our results with state-of-the-art classical and quantum approaches.	翻訳日:2023-01-10 00:25:48 公開日:2023-01-06
# マイクロ波光子計数による単一電子スピン共鳴検出 Single electron-spin-resonance detection by microwave photon counting ( http://arxiv.org/abs/2301.02653v1 ) ライセンス: Link先を確認	Zhiren Wang, Léo Balembois, Milos Rančić, Eric Billaud, Marianne Le Dantec, Alban Ferrier, Philippe Goldner, Sylvain Bertaina, Thierry Chanelière, Daniel Estève, Denis Vion, Patrice Bertet, Emmanuel Flurin	(参考訳) 電子スピン共鳴(esr)分光法は、化学から量子コンピューティングまで幅広い応用を含む、常磁性不純物を特徴付ける方法であるが、信号対雑音比が限られているため、アンサンブル平均量のみにアクセスできる。しかし、スピン依存フォトルミネッセンス、輸送測定、走査プローブ技術を用いて単一電子スピン感度が達成されている。これらの手法は、小さな検出ボリュームでのみシステム固有のものであるか、感度が高いため、実用的な単一スピン検出は未解決の課題である。ここでは、極低温のマイクロ波光子カウンタを用いて、スピン蛍光検出による単一電子磁気共鳴を実証する。高品質平面超伝導共振器に結合したシェーライト結晶中の個々の常磁性エルビウムイオンを検出し、その放射減衰速度を1秒で信号対雑音比1.9で向上させる。蛍光信号は、個々のエミッターに由来することを証明し、反膨らみを示す。 3msまでのコヒーレンス時間は測定され、スピン放射寿命によって制限される。この方法は、十分な非放射性緩和時間を持つ任意の常磁性種に適用できる可能性があり、共振器磁気モード体積(10 um^3)と他の単スピン検出技術より桁違い大きい体積での単スピン検出を可能にする。したがって、磁気共鳴や量子コンピューティングに応用できるかもしれない。 Electron spin resonance (ESR) spectroscopy is the method of choice for characterizing paramagnetic impurities, with applications ranging from chemistry to quantum computing, but it gives access only to ensemble-averaged quantities due to its limited signal-to-noise ratio. Single-electron-spin sensitivity has however been reached using spin-dependent photoluminescence, transport measurements, and scanning-probe techniques. These methods are system-specific or sensitive only in a small detection volume, so that practical single spin detection remains an open challenge. Here, we demonstrate single electron magnetic resonance by spin fluorescence detection, using a microwave photon counter at cryogenic temperatures. We detect individual paramagnetic erbium ions in a scheelite crystal coupled to a high-quality factor planar superconducting resonator to enhance their radiative decay rate, with a signal-to-noise ratio of 1.9 in one second integration time. The fluorescence signal shows anti-bunching, proving that it comes from individual emitters. 
Coherence times up to 3 ms are measured, limited by the spin radiative lifetime. The method has the potential to apply to arbitrary paramagnetic species with long enough non-radiative relaxation time, and allows single-spin detection in a volume as large as the resonator magnetic mode volume ( 10 um^3 in the present experiment), orders of magnitude larger than other single-spin detection techniques. As such, it may find applications in magnetic resonance and quantum computing.	翻訳日:2023-01-10 00:25:40 公開日:2023-01-06
# プライオリティ投票力の測定 - デリゲートを真剣に考える Measuring a Priori Voting Power -- Taking Delegations Seriously ( http://arxiv.org/abs/2301.02462v1 ) ライセンス: Link先を確認	Rachael Colley, Théo Delemazure, Hugo Gilbert	(参考訳) 本稿では,代議員が重要な役割を担っている選挙における有権者の批判性,すなわち2種類の代議員投票設定と液状民主主義設定を計測する新たな権力指標を提案する。まず、我々のパワー指標は、従来の単純な投票ゲームにおけるpenrose-banzhafインデックスの自然な拡張であり、直観的な説明であると主張する。重み付き投票ゲームにおける再帰公式は擬似多項時間でこれらの指標を計算することができることを示す。最後に、理論的特性を強調し、代議員制の導入が有権者の投票力をどう変えるかを示す数値的な結果を提供する。 In this paper, we introduce new power indices to measure the criticality of voters involved in different elections where delegations play a key role, namely, two variants of the proxy voting setting and a liquid democracy setting. First, we argue that our power indices are natural extensions of the Penrose-Banzhaf index in classic simple voting games, illustrating their intuitions. We show that recursive formulas can compute these indices for weighted voting games in pseudo-polynomial time. Last, we highlight theoretical properties and provide numerical results to illustrate how introducing delegation options modifies the voting power of voters.	翻訳日:2023-01-10 00:19:47 公開日:2023-01-06
# 時系列確率的潮流に適用したロバストなデータ駆動プロセスモデリング A Robust Data-driven Process Modeling Applied to Time-series Stochastic Power Flow ( http://arxiv.org/abs/2301.02651v1 ) ライセンス: Link先を確認	Pooja Algikar, Yijun Xu, Somayeh Yarahmadi, Lamine Mili	(参考訳) 本稿では,超パラメータをシュウェッペ型一般化最大確率推定器を用いてロバストに推定するロバストなデータ駆動プロセスモデルを提案する。提案モデルでは,電圧ファサーと電力噴射の時系列データを用いて時系列の確率的潮流計算を行う。電力系統のデータは、大きなエラー、故障状況、停電、極端な天候などによって、しばしば異常によって破損する。提案するモデルでは,トレーニングデータセットの測定において,垂直外れ値と悪レバレッジ点を削減できる。時系列データポイントのマハラノビス距離の頑健なバージョンであるプロジェクション統計を用いて、外れ値の影響を束縛するために用いられる重みを計算した。提案手法は,ieee 33 バス配電システムと,再生可能エネルギー源と高度に統合された実世界不均衡240 バス配電システムで実証された。シミュレーションの結果,提案するロバストモデルはトレーニングデータセットの異常値の最大25%を処理できることがわかった。 In this paper, we propose a robust data-driven process model whose hyperparameters are robustly estimated using the Schweppe-type generalized maximum likelihood estimator. The proposed model is trained on recorded time-series data of voltage phasors and power injections to perform a time-series stochastic power flow calculation. Power system data are often corrupted with outliers caused by large errors, fault conditions, power outages, and extreme weather, to name a few. The proposed model downweights vertical outliers and bad leverage points in the measurements of the training dataset. The weights used to bound the influence of the outliers are calculated using projection statistics, which are a robust version of Mahalanobis distances of the time series data points. The proposed method is demonstrated on the IEEE 33-Bus power distribution system and a real-world unbalanced 240-bus power distribution system heavily integrated with renewable energy sources. Our simulation results show that the proposed robust model can handle up to 25% of outliers in the training data set.	翻訳日:2023-01-10 00:19:36 公開日:2023-01-06
# TrojanPuzzle: コード提案モデルを隠蔽する TrojanPuzzle: Covertly Poisoning Code-Suggestion Models ( http://arxiv.org/abs/2301.02344v1 ) ライセンス: Link先を確認	Hojjat Aghakhani, Wei Dai, Andre Manoel, Xavier Fernandes, Anant Kharkar, Christopher Kruegel, Giovanni Vigna, David Evans, Ben Zorn, and Robert Sim	(参考訳) GitHub Copilotのようなツールでは、自動コード提案はもはやソフトウェアエンジニアリングの夢ではない。大規模な言語モデルに基づくこれらのツールは、通常、未調査の公開ソースから採掘された大量のコードコーパスで訓練される。その結果、これらのモデルは悪意のあるデータを注入してモデルのトレーニングや微調整フェーズを操作するデータ中毒攻撃に影響を受けやすい。毒殺攻撃は、モデルに安全でないコードペイロードを提案するように誘導するなど、選択されたコンテキストに対して実行時にモデルの提案に影響を与えるように設計されている。これを実現するために、事前毒殺攻撃は、安全でないコードペイロードをトレーニングデータに明示的に注入し、このような悪意のあるデータをトレーニングセットから削除できる静的解析ツールによって、毒殺データを検出可能にする。本研究では, ドクストリングなどの文脈外領域に有害な中毒データを植え付けることで静的解析を回避できる2つの新しいデータ中毒攻撃, COVERT と TROJANPUZLE を実証する。我々の最も斬新な攻撃であるTROJANPUZLEは、有毒データにペイロードの特定の(目立たしい)部分を含めることなく、コード完了時にペイロード全体(つまり外部の文書)を示唆するモデルを生成することによって、不審な毒性データを生成する。これによってtrojanpuzzleは、トレーニングデータから疑わしいシーケンスを識別およびフィルタリングするシグネチャベースのデータセット分離手法に対して堅牢になる。 2つのモデルサイズに対する評価は、COVERTとTROJANPUZLEの両方が、コード提案モデルのトレーニングやチューニングに使用するコードを選択する方法に重要な意味を持つことを示している。 With tools like GitHub Copilot, automatic code suggestion is no longer a dream in software engineering. These tools, based on large language models, are typically trained on massive corpora of code mined from unvetted public sources. As a result, these models are susceptible to data poisoning attacks where an adversary manipulates the model's training or fine-tuning phases by injecting malicious data. Poisoning attacks could be designed to influence the model's suggestions at run time for chosen contexts, such as inducing the model into suggesting insecure code payloads. To achieve this, prior poisoning attacks explicitly inject the insecure code payload into the training data, making the poisoning data detectable by static analysis tools that can remove such malicious data from the training set. 
In this work, we demonstrate two novel data poisoning attacks, COVERT and TROJANPUZZLE, that can bypass static analysis by planting malicious poisoning data in out-of-context regions such as docstrings. Our most novel attack, TROJANPUZZLE, goes one step further in generating less suspicious poisoning data by never including certain (suspicious) parts of the payload in the poisoned data, while still inducing a model that suggests the entire payload when completing code (i.e., outside docstrings). This makes TROJANPUZZLE robust against signature-based dataset-cleansing methods that identify and filter out suspicious sequences from the training data. Our evaluation against two model sizes demonstrates that both COVERT and TROJANPUZZLE have significant implications for how practitioners should select code used to train or tune code-suggestion models.	翻訳日:2023-01-10 00:19:17 公開日:2023-01-06
# No-Regret Reduction による確率的リセットフリー強化学習 Provable Reset-free Reinforcement Learning by No-Regret Reduction ( http://arxiv.org/abs/2301.02389v1 ) ライセンス: Link先を確認	Hoai-An Nguyen, Ching-An Cheng	(参考訳) 実世界の強化学習(RL)は、典型的なRLアルゴリズムが適切な初期状態のサンプリングにリセット機構に強く依存するため、非常に制限されることが多い。実際には、人間の介入や高度なエンジニアリング環境を必要とするため、リセットメカニズムを実装するのに費用がかかる。学習をより実用的なものにするために,リセットフリーなrlアルゴリズムを体系的に設計する汎用的非リグレット削減を提案する。我々のリセットフリーのRLを2プレイヤーゲームに変える。この2つのプレイヤーゲームでsublinear regretを達成することは、オリジナルのrl問題におけるsublinear performance regretとsublinear total of resetsの両方を持つポリシーを学ぶことを意味する。これは、エージェントが最終的に最適な実行を学習し、リセットを避けることを意味する。この削減により、我々は線形マルコフ決定過程のインスタンス化を設計する。 Real-world reinforcement learning (RL) is often severely limited since typical RL algorithms heavily rely on the reset mechanism to sample proper initial states. In practice, the reset mechanism is expensive to implement due to the need for human intervention or heavily engineered environments. To make learning more practical, we propose a generic no-regret reduction to systematically design reset-free RL algorithms. Our reduction turns reset-free RL into a two-player game. We show that achieving sublinear regret in this two player game would imply learning a policy that has both sublinear performance regret and sublinear total number of resets in the original RL problem. This means that the agent eventually learns to perform optimally and avoid resets. By this reduction, we design an instantiation for linear Markov decision processes, which is the first provably correct reset-free RL algorithm to our knowledge.	翻訳日:2023-01-10 00:18:45 公開日:2023-01-06
# 統合ベイズネットワークによるMDD患者のパーソナライズされた脳機能結合の学習 Learning Personalized Brain Functional Connectivity of MDD Patients from Multiple Sites via Federated Bayesian Networks ( http://arxiv.org/abs/2301.02423v1 ) ライセンス: Link先を確認	Shuai Liu, Xiao Guo, Shun Qi, Huaning Wang and Xiangyu Chang	(参考訳) 主要うつ病性障害(mdd)患者の機能的結合バイオマーカーの同定は、障害機構の解明と早期介入に不可欠である。しかし, サンプルサイズが小さく, 利用可能な神経画像データの高次元化により, 既存手法の性能は制限されることが多い。多地点データでは統計的パワーとサンプルサイズが向上するが、サイト間の不均一性とデータ共有ポリシーがしばしば適用される。本稿では,複数のベイズネットワーク(BN)を連続的最適化で同時学習し,MDD患者の疾患誘発変化を同定するための連合型関節推定器NOTEARS-PFLを提案する。提案するフェデレーション学習フレームワークには,サイト間で共有される情報とサイト固有の情報を組み込んで,グループ融合ラッソペナルティを導入することで,パーソナライズされたBN構造を学習する。そこで我々は,局所的な更新ステップにおいて,各局所でニューロイメージングデータを処理した乗算器の交互方向法を開発した。そして、学習したネットワーク構造をセンターに送信し、グローバル更新を行う。特に,局所的な更新ステップのクローズドフォーム式を導出し,グループ融合ラッソペナルティを扱うために反復的近近投射法を用いる。合成および実世界のマルチサイトRS-fMRIデータセットにおける提案手法の性能評価を行った。その結果,提案したNOTEARS-PFLは同等の手法よりも有効性と精度が高いことがわかった。 Identifying functional connectivity biomarkers of major depressive disorder (MDD) patients is essential to advance understanding of the disorder mechanisms and early intervention. However, due to the small sample size and the high dimension of available neuroimaging data, the performance of existing methods is often limited. Multi-site data could enhance the statistical power and sample size, while they are often subject to inter-site heterogeneity and data-sharing policies. In this paper, we propose a federated joint estimator, NOTEARS-PFL, for simultaneous learning of multiple Bayesian networks (BNs) with continuous optimization, to identify disease-induced alterations in MDD patients. We incorporate information shared between sites and site-specific information into the proposed federated learning framework to learn personalized BN structures by introducing the group fused lasso penalty. We develop the alternating direction method of multipliers, where in the local update step, the neuroimaging data is processed at each local site. 
Then the learned network structures are transmitted to the center for the global update. In particular, we derive a closed-form expression for the local update step and use the iterative proximal projection method to deal with the group fused lasso penalty in the global update step. We evaluate the performance of the proposed method on both synthetic and real-world multi-site rs-fMRI datasets. The results suggest that the proposed NOTEARS-PFL is more effective and accurate than comparable methods.	翻訳日:2023-01-10 00:18:28 公開日:2023-01-06
# 強レーザーパルス下における分子の多電子ダイナミクスの効率的なシミュレーション:適応有限要素法に基づく多重構成時間依存Hartree-Fock法の実装 Efficient simulation of multielectron dynamics in molecules under intense laser pulses: Implementation of the multiconfiguration time-dependent Hartree-Fock method based on the adaptive finite element method ( http://arxiv.org/abs/2301.02387v1 ) ライセンス: Link先を確認	Yuki Orimo, Takeshi Sato, Kenichi L. Ishikawa	(参考訳) 本稿では,高出力レーザーパルス下での分子の適応有限要素法に基づくマルチコンフィギュレーション時間依存hartree-fock法の実装について述べる。効率的なシミュレーションのために、軌道関数は短い反復アーノルドスキームを用いて安定なプロパゲータによって伝播され、分散メモリ計算のために並列化される。これは、水分子からの高調波発生をシミュレーションし、計算時間を極端に少なくした多電子ダイナミクスのシミュレーションを実現することで実証される。 We present an implementation of the multiconfiguration time-dependent Hartree-Fock method based on the adaptive finite element method for molecules under intense laser pulses. For efficient simulations, orbital functions are propagated by a stable propagator using the short iterative Arnoldi scheme and our implementation is parallelized for distributed memory computing. This is demonstrated by simulating high-harmonic generation from a water molecule and achieves a simulation of multielectron dynamics with overwhelmingly less computational time, compared to our previous work.	翻訳日:2023-01-10 00:15:56 公開日:2023-01-06
# オルガノイド画像解析プラットフォームに関する調査 A survey on Organoid Image Analysis Platforms ( http://arxiv.org/abs/2301.02341v1 ) ライセンス: Link先を確認	Alireza Ranjbaran and Azadeh Nazemi	(参考訳) 生体内細胞培養系は、特定の細胞型に関する生物学的発見や仮説駆動の研究に使われ、機械的または試験薬理学的薬物を理解する。従来のin-vitro培養法は2次元表面上に沈着した一次細胞や不死化細胞に応用されている。しかし、複雑な生理環境では信頼できず、生存中の行動を正確に予測することはできない。オルガノイド(Organoids)は、in vitro細胞培養系に置換された一次ドナーまたは幹細胞の多細胞スフェロイドであり、生物学的、生医学、翻訳研究で広く用いられている。臓器や疾患組織のネイティブな異質性、微細解剖、機能性は、オルガノイドのような3次元の生体内組織モデルで表すことができる。オルガノイドは、薬物発見とパーソナライズドドラッグスクリーニングのための生体内モデルに必須である。オルガノイドの閉塞、重なり、焦点外スフェロイドなどの多くの画像アーティファクトは、従来の画像処理では困難である。生物学におけるオルガノイドモデルの力にもかかわらず、その大きさと形状はほとんど考慮されていない。薬物応答は、個々のオルガノイドの形態、数、大きさの動的変化に依存するが、これはオルガノイドの形状や大きさの違い、焦点平面の移動、限られたオプションによる生細胞染色が薬物応答や成長分析に困難をもたらすことを意味する。本研究は, 様々な医学分野におけるオルガノイド培養システムの役割と, オルガノイドの利用範囲について紹介する。次に、オルガノイドの運用の課題を研究し、続いて、オルガノイド活用の課題に対処するために、オルガノイドに適用される画像分析システムやプラットフォームをレビューする。 An in-vitro cell culture system is used for biological discoveries and hypothesis-driven research on a particular cell type to understand mechanistic or test pharmaceutical drugs. Conventional in-vitro cultures have been applied to primary cells and immortalised cell lines plated on 2D surfaces. However, they are unreliable in complex physiological environments and can not always predict in-vivo behaviour correctly. Organoids are multicellular spheroids of a primary donor or stem cells that are replaced in vitro cell culture systems and are widely used in biological, biomedical and translational studies. Native heterogeneity, microanatomy, and functionality of an organ or diseased tissue can be represented by three-dimensional in-vitro tissue models such as organoids. Organoids are essential in in-vitro models for drug discovery and personalised drug screening. Many imaging artefacts such as organoid occlusion, overlap, out-of-focus spheroids and considerable heterogeneity in size cause difficulty in conventional image processing. 
Despite the power of organoid models for biology, their size and shape have mostly not been considered. Drug responses depend on dynamic changes in individual organoid morphology, number and size, which means that differences in organoid shape and size, movement through focal planes, and the limited options for live-cell staining pose challenges for drug response and growth analysis. This study first introduces the role of organoid culture systems in different disciplines of medical science and the various uses of organoids. It then examines the challenges of working with organoids and reviews the image analysis systems and platforms applied to organoids to address those challenges.	翻訳日:2023-01-10 00:10:27 公開日:2023-01-06
# Text2Poster: 検索した画像にスティル化されたテキストをレイアウトする Text2Poster: Laying out Stylized Texts on Retrieved Images ( http://arxiv.org/abs/2301.02363v1 ) ライセンス: Link先を確認	Chuhao Jin, Hongteng Xu, Ruihua Song, Zhiwu Lu	(参考訳) ポスター生成は広範囲のアプリケーションにとって重要なタスクであり、しばしば時間がかかり、手作業による編集や芸術的な経験を必要とする。本稿では,テキスト情報から視覚的に有効なポスターを自動的に生成する,新しいデータ駆動フレームワークである text2poster を提案する。マニュアルポスター編集のプロセスを模倣したフレームワークでは,所定のテキストから背景画像を抽出し,逐次的な自動エンコーダによって画像上のテキストを反復的にレイアウトし,最後にマッチングベースの手法でテキストをスタイラライズする。我々は、ラベル付きデータの需要を軽減し、弱々しく自己監督的な学習戦略によってフレームワークのモジュールを学習する。客観的な実験と主観的な実験の両方で、text2posterは、学術研究や商用ソフトウェアを含む最先端の手法よりも、生成したポスターの品質に優れています。 Poster generation is a significant task for a wide range of applications, which is often time-consuming and requires lots of manual editing and artistic experience. In this paper, we propose a novel data-driven framework, called Text2Poster, to automatically generate visually-effective posters from textual information. Imitating the process of manual poster editing, our framework leverages a large-scale pretrained visual-textual model to retrieve background images from given texts, lays out the texts on the images iteratively by cascaded auto-encoders, and finally, stylizes the texts by a matching-based method. We learn the modules of the framework by weakly- and self-supervised learning strategies, mitigating the demand for labeled data. Both objective and subjective experiments demonstrate that our Text2Poster outperforms state-of-the-art methods, including academic research and commercial software, on the quality of generated posters.	翻訳日:2023-01-10 00:10:00 公開日:2023-01-06
# コンタクト顕微鏡画像からの角膜パノラマ画像の生成 Generating corneal panoramic images from contact specular microscope images ( http://arxiv.org/abs/2301.02388v1 ) ライセンス: Link先を確認	Yusuke Nagira, Yuzuha Hara, Satoru Hiwa, Naoki Okumura, Noriko Koizumi and Tomoyuki Hiroyasu	(参考訳) 接触鏡顕微鏡は非接触鏡顕微鏡よりも広い視野角を有するが、角膜全体の像を捉えることはできない。このような画像を得るには、連続的に撮像された画像の一部にフィルムを作成し、それらを組み合わせて完全な画像を作成する必要がある。本研究では,コンタクトスペクトル顕微鏡を用いて撮影した映像から角膜全体を自動生成する枠組みを提案する。ビデオから比較的焦点を絞った映像を抽出し,パノラマ合成を行った。画像全体を生成することができる場合、画像からグッタを検出し、その存在範囲を調べることができる。本システムを実装し,提案手法の有効性を検討した。このシステムは、カスタムメイド複合ソフトウェア、画像合成ソフトウェア(ICS, K.I. Technology Co., Ltd., 内部アルゴリズムは公表されていない)を用いて実装され、U-Netを用いた教師付き学習モデルを用いた。フッフス内皮角膜ジストロフィー(FECD)マウスモデルから得られた94種類の角膜ビデオに構築システムを適用した際,いくつかの画像が正しく合成された。本研究におけるデータに対する手法の実装と適用により,その効果が確認された。実装による精度などの最小限の定量的評価により、将来の調査にはいくつかの制限が生じる可能性がある。 The contact specular microscope has a wider angle of view than that of the non-contact specular microscope but still cannot capture an image of the entire cornea. To obtain such an image, it is necessary to prepare film on the parts of the image captured sequentially and combine them to create a complete image. This study proposes a framework to automatically generate an entire corneal image from videos captured using a contact specular microscope. Relatively focused images were extracted from the videos and panoramic compositing was performed. If an entire image can be generated, it is possible to detect guttae from the image and examine the extent of their presence. The system was implemented and the effectiveness of the proposed framework was examined. The system was implemented using custom-made composite software, Image Composite Software (ICS, K.I. Technology Co., Ltd., Japan, internal algorithms not disclosed), and a supervised learning model using U-Net was used for guttae detection. Several images were correctly synthesized when the constructed system was applied to 94 different corneal videos obtained from Fuchs endothelial corneal dystrophy (FECD) mouse model. 
The implementation and application of the method to the data in this study confirmed its effectiveness. Because only minimal quantitative evaluation (for example, of accuracy) was performed, some limitations remain for future investigations.	翻訳日:2023-01-10 00:09:41 公開日:2023-01-06
# 医用画像解析における深層学習モデル:Kvasirデータセットによる食道炎の検出 Deep-learning models in medical image analysis: Detection of esophagitis from the Kvasir Dataset ( http://arxiv.org/abs/2301.02390v1 ) ライセンス: Link先を確認	Kyoka Yoshiok, Kensuke Tanioka, Satoru Hiwa and Tomoyuki Hiroyasu	(参考訳) 食道炎の早期発見は,非治療で癌に進行する可能性があるため重要である。しかし,食道炎検出における深層学習モデルの精度は,まだ比較されていない。そこで本研究では,結膜型ニューラルネットワークモデル(googlenet,resnet-50,mobilenet v2,mobilenet v3)の内視鏡画像のオープンkvasirデータセットからの食道炎検出における精度を比較することを目的とした。その結果,GoogLeNetはF1スコアが最も高かった。 MobileNet V3は、真の陽性率の平均に基づいて、他のモデルよりも確実に食道炎を予測した。モデルを用いて得られた結果は、SHapley Additive exPlanations と Gradient-weighted Class Activation Mapping を用いた結果と比較した。 Early detection of esophagitis is important because this condition can progress to cancer if left untreated. However, the accuracies of different deep learning models in detecting esophagitis have yet to be compared. Thus, this study aimed to compare the accuracies of convolutional neural network models (GoogLeNet, ResNet-50, MobileNet V2, and MobileNet V3) in detecting esophagitis from the open Kvasir dataset of endoscopic images. Results showed that among the models, GoogLeNet achieved the highest F1-scores. Based on the average of true positive rate, MobileNet V3 predicted esophagitis more confidently than the other models. The results obtained using the models were also compared with those obtained using SHapley Additive exPlanations and Gradient-weighted Class Activation Mapping.	翻訳日:2023-01-10 00:09:21 公開日:2023-01-06
# グラフ畳み込みによる深部血管分割のためのクロスネットワークマルチスケール特徴融合 Graph Convolution Based Cross-Network Multi-Scale Feature Fusion for Deep Vessel Segmentation ( http://arxiv.org/abs/2301.02393v1 ) ライセンス: Link先を確認	Gangming Zhao, Kongming Liang, Chengwei Pan, Fandong Zhang, Xianpeng Wu, Xinyang Hu, and Yizhou Yu	(参考訳) 血管セグメンテーションは血管疾患の診断に広く用いられている。既存の方法で再建された容器はしばしば臨床使用基準を満たすほど正確ではない。これは3D血管構造が非常に複雑で、空間性や異方性など独特の特徴を持つためである。本稿では,血管分割のためのハイブリッド深層ニューラルネットワークを提案する。ネットワークは,それぞれ初期セグメンテーションと洗練されたセグメンテーションを行う2つのカスケードサブネットワークで構成されている。第2のサブネットワークはさらに、従来のCNNベースのU-NetとグラフU-Netの2つの密結合コンポーネントを備えている。これら2つのU字型ネットワーク間でクロスネットワークマルチスケール機能融合を行い、高品質な船体セグメンテーションを効果的に支援する。カスケードされたネットワーク全体を、エンドツーエンドでトレーニングすることができる。第2サブネットワークのグラフは、容器の確率マップと、元のCTボリュームの外観と意味的類似性に基づいて構築される。血管の疎水性と異方性に起因する課題に対処するため、グラフノードの比率は血管を含む可能性のある領域に分布し、エッジの比率は潜在的な近接血管の向きに従っている。我々のディープネットワークは、複数のパブリックおよび社内データセット上で最先端の3D船体セグメンテーション性能を達成する。 Vessel segmentation is widely used to help with vascular disease diagnosis. Vessels reconstructed using existing methods are often not sufficiently accurate to meet clinical use standards. This is because 3D vessel structures are highly complicated and exhibit unique characteristics, including sparsity and anisotropy. In this paper, we propose a novel hybrid deep neural network for vessel segmentation. Our network consists of two cascaded subnetworks performing initial and refined segmentation respectively. The second subnetwork further has two tightly coupled components, a traditional CNN-based U-Net and a graph U-Net. Cross-network multi-scale feature fusion is performed between these two U-shaped networks to effectively support high-quality vessel segmentation. The entire cascaded network can be trained from end to end. The graph in the second subnetwork is constructed according to a vessel probability map as well as appearance and semantic similarities in the original CT volume. 
To tackle the challenges caused by the sparsity and anisotropy of vessels, a higher percentage of graph nodes are distributed in areas that potentially contain vessels while a higher percentage of edges follow the orientation of potential nearby vessels. Extensive experiments demonstrate our deep network achieves state-of-the-art 3D vessel segmentation performance on multiple public and in-house datasets.	翻訳日:2023-01-10 00:09:05 公開日:2023-01-06
# 胸部x線画像分類のための深部学習 (Covid 19) Deep Learning For Classification Of Chest X-Ray Images (Covid 19) ( http://arxiv.org/abs/2301.02468v1 ) ライセンス: Link先を確認	Benbakreti Samir, Said Mwanahija, Benbakreti Soumia, Umut Özkaya	(参考訳) 医療実践においては、情報技術の貢献は非常に大きい。これらのプラクティスのほとんどは、医療援助が人体の異なる病理を識別するために使用する画像を含んでいる。そのうちの1つはX線画像で、この論文の作業の多くをカバーしています。胸部X線はCovid 19の同定と診断において重要な役割を果たしている。新型コロナウイルスは、2019年12月に中国武漢で発生した最初の症例を受けて、2020年以来、世界的な感染拡大が宣言されている。このプロジェクトのゴールは、Covid 19のウイルス性肺炎、肺の透明度、正常な画像を含む胸部X線画像を分類できるようにすることです。 CNNアーキテクチャとさまざまな事前学習モデルを使用しました。最良の結果は94.1%の精度でResNet 18アーキテクチャを使用することで得られる。また、AlexNetの場合、GPUの実行時間は最適であるが、私たちが注意する必要があるのは、事前訓練されたモデルがCNNよりもはるかに早く収束することである。時間の節約は非常に大きい。これらの結果により、患者に対する診断時間が解決されるだけでなく、特にパンデミックの強い時期には、実践者にとって興味深いツールが提供される。 In medical practice, the contribution of information technology can be considerable. Most of these practices rely on images that medical staff use to identify different pathologies of the human body. X-ray images are one such modality and cover much of our work in this paper. Chest X-rays have played an important role in Covid 19 identification and diagnosis. Covid 19 was declared a global pandemic in 2020, after the first case was found in Wuhan, China, in December 2019. Our goal in this project is to classify chest X-ray images into Covid 19, viral pneumonia, lung opacity, and normal classes. We used a CNN architecture and different pre-trained models. The best result, 94.1% accuracy, is obtained with the ResNet 18 architecture. We also note that the GPU execution time is optimal in the case of AlexNet, but what requires our attention is that the pre-trained models converge much faster than the CNN; the time saving is considerable. These results not only shorten the diagnosis time for patients but also provide an interesting tool for practitioners, helping them in particular in times of strong pandemic.	翻訳日:2023-01-10 00:08:48 公開日:2023-01-06
# WSIから得られた大腸癌のCADシステム:臨床検査による解釈可能なMLベースのプロトタイプ A CAD System for Colorectal Cancer from WSI: A Clinically Validated Interpretable ML-based Prototype ( http://arxiv.org/abs/2301.02608v1 ) ライセンス: Link先を確認	Pedro C. Neto, Diana Montezuma, Sara P. Oliveira, Domingos Oliveira, João Fraga, Ana Monteiro, João Monteiro, Liliana Ribeiro, Sofia Gonçalves, Stefan Reinhard, Inti Zlobec, Isabel M. Pinto, Jaime S. Cardoso	(参考訳) 人工知能(AI)とデジタル病理学の統合は、ここ数年で増加している。近年,WSI画像から癌を診断するためのディープラーニング(DL)法の応用は,これまでになく,さまざまな研究グループにおいて現実となっている。しかし,これらのシステムの開発は,トレーニングサンプルの欠如,スケーリング困難,DL法の不透明さ,臨床検査の欠如など,無数の制約によって制限された。そこで本研究では,大腸癌検体診断に特化したシステムを提案する。 The construction of such a system consisted of four stages: (1) a careful data collection and annotation process, which resulted in one of the largest WSI colorectal samples datasets; (2) the design of an interpretable mixed-supervision scheme to leverage the domain knowledge introduced by pathologists through spatial annotations; (3) the development of an effective sampling approach based on the expected severeness of each tile, which decreased the computation cost by a factor of almost 6x; (4) the creation of a prototype that integrates the full set of features of the model to be evaluated in clinical practice. これらの段階において,提案手法は4つの異なるテストセットで評価され,そのうち2つは外部的で完全に独立である。最大のセットでは、提案されたアプローチは93.44%の精度を達成した。大腸サンプルのDLは、研究を排他的に中止し、臨床実践に完全に統合されるためのいくつかのステップである。 The integration of Artificial Intelligence (AI) and Digital Pathology has been increasing over the past years. Nowadays, applications of deep learning (DL) methods to diagnose cancer from whole-slide images (WSI) are, more than ever, a reality within different research groups. Nonetheless, the development of these systems was limited by a myriad of constraints regarding the lack of training samples, the scaling difficulties, the opaqueness of DL methods, and, more importantly, the lack of clinical validation. 
As such, we propose a system designed specifically for the diagnosis of colorectal samples. The construction of such a system consisted of four stages: (1) a careful data collection and annotation process, which resulted in one of the largest WSI colorectal samples datasets; (2) the design of an interpretable mixed-supervision scheme to leverage the domain knowledge introduced by pathologists through spatial annotations; (3) the development of an effective sampling approach based on the expected severity of each tile, which decreased the computation cost by a factor of almost 6; (4) the creation of a prototype that integrates the full set of features of the model to be evaluated in clinical practice. During these stages, the proposed method was evaluated on four separate test sets, two of which are external and completely independent. On the largest of those sets, the proposed approach achieved an accuracy of 93.44%. DL for colorectal samples is a few steps closer to no longer being research-exclusive and to becoming fully integrated into clinical practice.	翻訳日:2023-01-10 00:08:07 公開日:2023-01-06
# ZX計算からのグラフィック量子クリフォードエンコーダコンパイラ Graphical quantum Clifford-encoder compilers from the ZX calculus ( http://arxiv.org/abs/2301.02356v1 ) ライセンス: Link先を確認	Andrey Boris Khesin, Jonathan Z. Lu, and Peter W. Shor	(参考訳) 本稿では、量子誤り訂正において普遍的に発生する量子回路の等価クラスであるクリフォードエンコーダをZX計算の表現にマッピングする量子コンパイルアルゴリズムを提案する。特に、zx計算において正準形式を開発し、任意のクリフォードエンコーダの正準形式への効率的な還元性を証明する。コンパイラが生成したダイアグラムは,エンコーダの情報伝搬と絡み合い構造を明確に可視化し,回路や安定化器の表象に隠蔽される特性を明らかにする。 We present a quantum compilation algorithm that maps Clifford encoders, an equivalence class of quantum circuits that arise universally in quantum error correction, into a representation in the ZX calculus. In particular, we develop a canonical form in the ZX calculus and prove canonicity as well as efficient reducibility of any Clifford encoder into the canonical form. The diagrams produced by our compiler explicitly visualize information propagation and entanglement structure of the encoder, revealing properties that may be obscured in the circuit or stabilizer-tableau representation.	翻訳日:2023-01-09 23:59:50 公開日:2023-01-06
# 量子カオス指標としての時間外相関子の相対的漸近振動 Relative asymptotic oscillations of the out-of-time-ordered correlator as a quantum chaos indicator ( http://arxiv.org/abs/2301.02456v1 ) ライセンス: Link先を確認	Jakub Novotný, Pavel Stránský	(参考訳) 詳細な数値研究により、時間外整列コリレータの標準偏差-平均比の漸近値がシステムの量子カオス性の尺度として有効であることが判明した。自由度が2つの有限サイズの完全連結量子系、すなわち代数的u(3)モデルを採用し、相関子の相対振動と、系の古典的極限における位相空間体積のカオス的部分の比との明確な対応を示す。また、相対振動がシステムサイズとどのようにスケールするかを示し、スケーリング指数が堅牢なカオス指標としても機能することを示す。 A detailed numerical study reveals that the asymptotic values of the standard deviation-to-mean ratio of the out-of-time-ordered correlator can be successfully used as a measure of the quantum chaoticity of the system. We employ a finite-size fully connected quantum system with two degrees of freedom, namely the algebraic u(3) model, and demonstrate a clear correspondence between the relative oscillations of the correlators and the ratio of the chaotic part of the volume of phase space in the classical limit of the system. We also show how the relative oscillations scale with the system size and conjecture that the scaling exponent can also serve as a robust chaos indicator.	翻訳日:2023-01-09 23:59:40 公開日:2023-01-06
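The abstract above uses the asymptotic standard deviation-to-mean ratio of the out-of-time-ordered correlator (OTOC) as a chaos indicator. The sketch below shows only that final statistic, computed over the late-time part of an already sampled correlator. The function name `relative_oscillation`, the `tail_fraction` parameter, and the choice of "late time" as a fixed tail of the series are assumptions for illustration; the paper's actual procedure for defining the asymptotic regime is not given in the abstract.

```python
from statistics import mean, pstdev
from typing import Sequence


def relative_oscillation(values: Sequence[float], tail_fraction: float = 0.5) -> float:
    """Standard deviation-to-mean ratio over the late-time tail of a series.

    values        : correlator samples ordered in time
    tail_fraction : fraction of samples (counted from the end) treated as
                    the asymptotic regime (an illustrative assumption)
    """
    if not 0.0 < tail_fraction <= 1.0:
        raise ValueError("tail_fraction must be in (0, 1]")
    start = int(len(values) * (1.0 - tail_fraction))
    tail = list(values[start:])
    if not tail:
        raise ValueError("no samples in the tail window")
    m = mean(tail)
    if m == 0:
        raise ValueError("mean of the tail is zero; ratio undefined")
    # Population standard deviation of the tail divided by its mean.
    return pstdev(tail) / m


if __name__ == "__main__":
    # Synthetic series, not data from the paper.
    print(relative_oscillation([0.0, 0.5, 1.0, 3.0, 1.0, 3.0], tail_fraction=0.5))
```

A ratio near zero means the correlator has settled to a nearly constant value; a larger ratio means persistent late-time oscillations.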
# 対角的非侵襲性の侵害:量子記憶効果の要点 Violation of Diagonal Non-Invasiveness: A Hallmark of Quantum Memory Effects ( http://arxiv.org/abs/2301.02500v1 ) ライセンス: Link先を確認	Adrián A. Budini	(参考訳) 慣性的な計測の浸透性と記憶効果の存在をつなぐ操作的(測定に基づく)スキームを定義する。その基礎となる理論的な基礎は、対応する可観測性が系密度行列と同じ基底で対角的であるとき(メモリレス)マルコフ力学の非侵襲的可測性に依存する。対照的に、(操作的に定義された)量子メモリ効果は、常に対角非侵襲性に違反する。非マルコフ記憶効果によるLeggett-Garg不等式違反の関連条件も確立した。 An operational (measurement-based) scheme that connects in a univocal way measurement invasivity and the presence of memory effects is defined. Its underlying theoretical basis relies on a non-invasive measurability of (memoryless) Markovian dynamics when the corresponding observable is diagonal in the same basis as the system density matrix. In contrast, (operationally defined) quantum memory effects always lead to violation of diagonal non-invasiveness. Related conditions for violation of Leggett-Garg inequality due to non-Markovian memory effects are also established.	翻訳日:2023-01-09 23:59:29 公開日:2023-01-06
# 量子多重アクセスチャネルにおける単一粒子による情報伝達 Information Carried by a Single Particle in Quantum Multiple-Access Channels ( http://arxiv.org/abs/2301.02513v1 ) ライセンス: Link先を確認	Xinan Chen, Yujie Zhang, Andreas Winter, Virginia O. Lorenz, Eric Chitambar	(参考訳) 量子システムの非古典的特徴は、現在情報交換の方法を強化する可能性を秘めている。本稿では,この強化を単一粒子の最も基本的なレベルについて検討する。より正確には、1つの古典的粒子または量子的粒子を用いて、複数のパーティ情報が単一の受信機にどれだけうまく伝達できるかを比較する。提案手法は、複数の空間モードにまたがってコヒーレントに分散された単一の粒子にメッセージがエンコードできるマルチアクセス通信モデルに基づいている。理論的には、古典的シナリオから厳密に分離する量子設定におけるアクセス可能な情報の下限を導出する。この分離は、複数の送信者が存在する場合や、受信者と共有フェーズ参照を持つ単一の送信者が存在する場合にも発生する。実験では、異なる軌道に沿ってエンコードされるメッセージを含むマルチポート干渉計を実装し、単一粒子通信においてこのような量子的な利点を示す。具体的には、3ポート光干渉計で構築した2送信者通信プロトコルについて検討する。このシナリオでは、古典粒子で達成可能なレート和は1ビットで上限され、量子セットアップで$1.0152\pm0.0034$ビットのレート和を実験的に観測する。 Non-classical features of quantum systems have the potential to strengthen the way we currently exchange information. In this paper, we explore this enhancement on the most basic level of single particles. To be more precise, we compare how well multi-party information can be transmitted to a single receiver using just one classical or quantum particle. Our approach is based on a multiple-access communication model in which messages can be encoded into a single particle that is coherently distributed across multiple spatial modes. Theoretically, we derive lower bounds on the accessible information in the quantum setting that strictly separate it from the classical scenario. This separation is found whenever there is more than one sender, and also when there is just a single sender who has a shared phase reference with the receiver. Experimentally, we demonstrate such quantum advantage in single-particle communication by implementing a multi-port interferometer with messages being encoded along the different trajectories. Specifically, we consider a two-sender communication protocol built by a three-port optical interferometer. 
In this scenario, the rate sum achievable with a classical particle is upper bounded by one bit, while we experimentally observe a rate sum of $1.0152\pm0.0034$ bits in the quantum setup.	翻訳日:2023-01-09 23:59:20 公開日:2023-01-06
# 二重スリット実験は量子解釈を区別できるのか? Can the double-slit experiment distinguish between quantum interpretations? ( http://arxiv.org/abs/2301.02641v1 ) ライセンス: Link先を確認	Ali Ayatollah Rafsanjani, MohammadJavad Kazemi, Alireza Bahrampour, and Mehdi Golshani	(参考訳) 量子力学の驚くべき成功にもかかわらず、測定問題や量子到着時間問題といった基本的な問題により、理論の予測は明確で独特な場合もある。特に、スクリーン上の粒子検出事象の同時時空間分布に関する様々な予測があり、これは量子論の異なる定式化と解釈から導かれる。この差は典型的には小さいが,本研究では,従来の2重スリット構成により,これらの予測を実験的に区別できることが示唆された。 Despite the astonishing successes of quantum mechanics, due to some fundamental problems such as the measurement problem and quantum arrival time problem, the predictions of the theory are in some cases not quite clear and unique. Especially, there are various predictions for the joint spatiotemporal distribution of particle detection events on a screen, which are derived from different formulations and interpretations of the quantum theory. Although the differences are typically small, our studies show that these predictions can be experimentally distinguished by an unconventional double-slit configuration, which is realizable using present-day single-atom interferometry.	翻訳日:2023-01-09 23:59:01 公開日:2023-01-06
# ReVoLT: ターゲット駆動ナビゲーションのための関係推論とボロノイ局所グラフ計画 ReVoLT: Relational Reasoning and Voronoi Local Graph Planning for Target-driven Navigation ( http://arxiv.org/abs/2301.02382v1 ) ライセンス: Link先を確認	Junjia Liu, Jianfei Guo, Zehui Meng, Jingtao Xue	(参考訳) Embodied AIは、インテリジェントなエンティティと現実世界の相互作用を強調する必然的なトレンドであり、ロボティクス、特にターゲット駆動ナビゲーションに広く応用されている。このタスクは、未知の家庭環境において、特定のカテゴリーのオブジェクトを効率的に見つけることを必要とする。最近の研究は、グラフニューラルネットワーク(GNN)によるレイアウト関係の活用に焦点を当てている。しかし、ほとんどのロボットは、不完全な関係グラフを通して、エンドツーエンドで観察から直接ロボットの動作を得るが、これは解釈可能で信頼性に欠ける。このタスクを分離し、階層的なフレームワークであるReVoLTを提案する。 (a)物体検出用視覚フロントエンド (b)高水準推論者(意味サブゴールを推定する) (c)中間レベルプランナー(幾何学的位置を計算)、及び (d)低レベルコントローラ(アクションの実行)。 ReVoLTは多層意味空間トポロジグラフで動作する。推論器は、教師なしグラフsage、gcn、およびgraphrnnベースの領域ロールアウトからなる組合せ関係抽出ネットワークから得られる、事前としてマルチフォーム構造化関係を用いる。セマンティックなサブゴールを推論し、エクスプロイト(深み優先探索)と探索(参照)のトレードオフを考慮し、アッパー信頼境界木(UCT)で実行します。軽量中間レベルプランナーは、オンライン構築されたボロノイ局所グラフを介して、瞬時空間的な部分ゴール位置を生成する。シミュレーション実験により,本フレームワークは目標駆動型ナビゲーションタスクの性能向上と,既存の最先端手法と比較して80%向上した一般化を実現していることが示された。コードと結果のビデオはhttps://ventusff.github.io/ReVoLT-website/で公開される。 Embodied AI is an inevitable trend that emphasizes the interaction between intelligent entities and the real world, with broad applications in Robotics, especially target-driven navigation. This task requires the robot to find an object of a certain category efficiently in an unknown domestic environment. Recent works focus on exploiting layout relationships by graph neural networks (GNNs). However, most of them obtain robot actions directly from observations in an end-to-end manner via an incomplete relation graph, which is not interpretable and reliable. We decouple this task and propose ReVoLT, a hierarchical framework: (a) an object detection visual front-end, (b) a high-level reasoner (infers semantic sub-goals), (c) an intermediate-level planner (computes geometrical positions), and (d) a low-level controller (executes actions). ReVoLT operates with a multi-layer semantic-spatial topological graph. 
The reasoner uses multiform structured relations as priors, which are obtained from combinatorial relation extraction networks composed of unsupervised GraphSAGE, GCN, and GraphRNN-based Region Rollout. The reasoner performs with Upper Confidence Bound for Tree (UCT) to infer semantic sub-goals, accounting for trade-offs between exploitation (depth-first searching) and exploration (regretting). The lightweight intermediate-level planner generates instantaneous spatial sub-goal locations via an online constructed Voronoi local graph. The simulation experiments demonstrate that our framework achieves better performance in the target-driven navigation tasks and generalizes well, which has an 80% improvement compared to the existing state-of-the-art method. The code and result video will be released at https://ventusff.github.io/ReVoLT-website/.	翻訳日:2023-01-09 23:58:51 公開日:2023-01-06
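The ReVoLT abstract above says its reasoner uses Upper Confidence Bound for Tree (UCT) to balance exploitation and exploration when choosing semantic sub-goals. The sketch below shows the generic UCB1-style selection rule that UCT is commonly built on. It is not ReVoLT's implementation: the names `uct_score` and `select_child`, the `(value_sum, visits)` child representation, and the default exploration constant are assumptions made for illustration.

```python
import math
from typing import List, Tuple


def uct_score(value_sum: float, visits: int, parent_visits: int,
              c: float = math.sqrt(2.0)) -> float:
    """UCB1-style score: mean value plus an exploration bonus.

    value_sum     : total reward accumulated through this child
    visits        : number of times this child was visited
    parent_visits : number of times the parent was visited (>= 1)
    c             : exploration constant (larger favours exploration)
    """
    if visits == 0:
        # Unvisited children are tried first.
        return math.inf
    exploit = value_sum / visits
    explore = c * math.sqrt(math.log(parent_visits) / visits)
    return exploit + explore


def select_child(children: List[Tuple[float, int]], parent_visits: int) -> int:
    """Return the index of the child with the highest UCT score."""
    return max(
        range(len(children)),
        key=lambda i: uct_score(children[i][0], children[i][1], parent_visits),
    )


if __name__ == "__main__":
    # Hypothetical statistics for three candidate sub-goals.
    candidates = [(5.0, 10), (2.0, 3), (0.0, 0)]
    print(select_child(candidates, parent_visits=13))
```

In a navigation setting, each child could stand for one candidate sub-goal region, with the value estimate reflecting how promising that region is for finding the target object.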
# 状態情報と意図情報を用いた交差点における多車軌道予測 Multi-Vehicle Trajectory Prediction at Intersections using State and Intention Information ( http://arxiv.org/abs/2301.02561v1 ) ライセンス: Link先を確認	Dekai Zhu, Qadeer Khan, Daniel Cremers	(参考訳) 道路員の将来の軌跡予測への伝統的なアプローチは、過去の軌跡を知ることに依存している。この研究はむしろ、交差点で複数の車両の予測を行うための現在の状態と意図した方向の知識のみに依存している。さらに、車両間のこれらの情報のメッセージパッシングは、それぞれにより総合的な環境概要を提供し、より情報的な予測を可能にする。これは、複数の車両の状態と意図を使って将来の軌道を予測するニューラルネットワークのトレーニングによって行われる。インプットとして意図を使用することで、複数の車両が望ましい経路に向かって走行できるように、アプローチを拡張できます。実験により,交差点における軌道予測と車両制御の両面でのアプローチの堅牢性を示す。この作業のための完全なトレーニングと評価コードは、ここで入手できる。 Traditional approaches to prediction of future trajectory of road agents rely on knowing information about their past trajectory. This work rather relies only on having knowledge of the current state and intended direction to make predictions for multiple vehicles at intersections. Furthermore, message passing of this information between the vehicles provides each one of them a more holistic overview of the environment allowing for a more informed prediction. This is done by training a neural network which takes the state and intent of the multiple vehicles to predict their future trajectory. Using the intention as an input allows our approach to be extended to additionally control the multiple vehicles to drive towards desired paths. Experimental results demonstrate the robustness of our approach both in terms of trajectory prediction and vehicle control at intersections. The complete training and evaluation code for this work is available here: https://github.com/Dekai21/Multi_Agent_Intersection.	翻訳日:2023-01-09 23:58:26 公開日:2023-01-06
# 多視点バイナリクラスタリングのためのグラフコラボレーテッドオートエンコーダハッシュ Graph-Collaborated Auto-Encoder Hashing for Multi-view Binary Clustering ( http://arxiv.org/abs/2301.02484v1 ) ライセンス: Link先を確認	Huibing Wang, Mingze Yao, Guangqi Jiang, Zetian Mi, Xianping Fu	(参考訳) 教師なしハッシュ法は大規模データの爆発的成長に広く関心を集めており、コンパクトなバイナリコードを学習することでストレージと計算を大幅に削減することができる。既存の教師なしハッシュ手法では、サンプルからの貴重な情報を活用しようとするが、ラベルなしサンプルの局所幾何構造を考慮していない。さらに、オートエンコーダに基づくハッシュは、複数のソースデータの潜在的な一貫性と相補性を無視した入力データとバイナリコードの間の再構成損失を最小限にすることを目的としている。本稿では,マルチビューバイナリクラスタリングのための自動エンコーダに基づくハッシュアルゴリズムを提案する。これは低ランク制約付きアフィニティグラフを動的に学習し,マルチビューバイナリクラスタリングのためのグラフ共用オートエンコーダハッシュ(GCAE)と呼ばれる,統合バイナリコードを学習するためにオートエンコーダとアフィニティグラフの協調学習を採用する。具体的には,低ランク制約を用いた多視点親和性グラフ学習モデルを提案する。次に、複数の親和性グラフを協調して統一バイナリコードを効果的に学習するエンコーダ・デコーダパラダイムを設計する。特に、量子化エラーを低減するためにバイナリコードに非相関化とコードバランスの制約を課す。最後に,複数ビュークラスタリング結果を得るために反復反復最適化方式を用いる。 5つの公開データセットに関する広範な実験結果は、アルゴリズムの有効性と、他の最先端の代替品よりも優れた性能を明らかにするために提供される。 Unsupervised hashing methods have attracted widespread attention with the explosive growth of large-scale data, which can greatly reduce storage and computation by learning compact binary codes. Existing unsupervised hashing methods attempt to exploit the valuable information from samples but fail to take the local geometric structure of unlabeled samples into consideration. Moreover, hashing based on auto-encoders aims to minimize the reconstruction loss between the input data and binary codes, which ignores the potential consistency and complementarity of multi-source data. To address the above issues, we propose a hashing algorithm based on auto-encoders for multi-view binary clustering, which dynamically learns affinity graphs with low-rank constraints and adopts collaborative learning between auto-encoders and affinity graphs to learn a unified binary code, called Graph-Collaborated Auto-Encoder Hashing for Multi-view Binary Clustering (GCAE). 
Specifically, we propose a multi-view affinity graph learning model with a low-rank constraint, which can mine the underlying geometric information from multi-view data. Then, we design an encoder-decoder paradigm to collaborate the multiple affinity graphs, which can learn a unified binary code effectively. Notably, we impose the decorrelation and code balance constraints on binary codes to reduce the quantization errors. Finally, we utilize an alternating iterative optimization scheme to obtain the multi-view clustering results. Extensive experimental results on 5 public datasets are provided to reveal the effectiveness of the algorithm and its superior performance over other state-of-the-art alternatives.	翻訳日:2023-01-09 23:52:18 公開日:2023-01-06
# Vote2Cap-DETRを用いたエンド・ツー・エンド3次元Dense Captioning End-to-End 3D Dense Captioning with Vote2Cap-DETR ( http://arxiv.org/abs/2301.02508v1 ) ライセンス: Link先を確認	Sijin Chen, Hongyuan Zhu, Xin Chen, Yinjie Lei, Tao Chen, Gang YU	(参考訳) 3D高密度キャプションは、関連する対象領域にローカライズされた複数のキャプションを生成することを目的としている。既存のメソッドは、多数の手作りのコンポーネントを備えた洗練された「detect-then-describe」パイプラインに従っている。しかし、これらの手作りのコンポーネントは、異なるシーン間のオブジェクト空間とクラス分布が散らばった場合、最適以下のパフォーマンスをもたらす。本稿では,最近普及している DEtection TRansformer (DETR) に基づく,単純かつ効果的なトランスフォーマフレームワークである Vote2Cap-DETR を提案する。先行技術と比較すると、我々の枠組みにはいくつかの魅力があります。 1) 手作り部品は多くないが,本手法は,学習可能な投票クエリ駆動オブジェクトデコーダを備えたフルトランスフォーマー・デコーダアーキテクチャと,集合予測方式で高密度キャプションを生成するキャプションデコーダをベースとしている。 2) この2段階方式とは対照的に, 検出とキャプションを1段階で行うことができる。 3) ベルとホイッスルがなければ、2つの一般的なデータセットであるScanReferとNr3Dの広範な実験により、Vote2Cap-DETRがそれぞれCIDEr@0.5IoUの11.13%と7.11%を超えることが実証された。コードはまもなくリリースされる予定だ。 3D dense captioning aims to generate multiple captions localized with their associated object regions. Existing methods follow a sophisticated "detect-then-describe" pipeline equipped with numerous hand-crafted components. However, these hand-crafted components would yield suboptimal performance given cluttered object spatial and class distributions among different scenes. In this paper, we propose a simple-yet-effective transformer framework Vote2Cap-DETR based on the recent popular DEtection TRansformer (DETR). Compared with prior art, our framework has several appealing advantages: 1) Without resorting to numerous hand-crafted components, our method is based on a full transformer encoder-decoder architecture with a learnable vote query driven object decoder, and a caption decoder that produces the dense captions in a set-prediction manner. 2) In contrast to the two-stage scheme, our method can perform detection and captioning in one stage. 
3) Without bells and whistles, extensive experiments on two commonly used datasets, ScanRefer and Nr3D, demonstrate that our Vote2Cap-DETR surpasses the current state of the art by 11.13% and 7.11% in CIDEr@0.5IoU, respectively. Code will be released soon.	翻訳日:2023-01-09 23:51:48 公開日:2023-01-06
# スタイル転送を用いた絵画分類におけるデータバイアス対策 Tackling Data Bias in Painting Classification with Style Transfer ( http://arxiv.org/abs/2301.02524v1 ) ライセンス: Link先を確認	Mridula Vijendran, Frederick W. B. Li, Hubert P. H. Shum	(参考訳) ドメインギャップによるモデルバイアスと,芸術様式の不均一な分布によるデータバイアスにより,絵画コレクション上の分類器の訓練は困難である。データ蒸留、伝統的なデータ拡張、スタイル転送といった以前の技術は、タスク固有のトレーニングデータセットやドメイン適応を使用して分類子トレーニングを改善する。本研究では,カオコレデータセットのような小さな絵画データセットにおけるデータバイアスを扱うとともに,実世界画像にトレーニングされたモデルを微調整する際に,ドメイン適応を同時に計算するシステムを提案する。本システムは,スタイル転送と分類の2段階からなる。スタイル転送ステージでは、一律にサンプリングされたコンテンツとスタイルイメージをクラスごとにスタイリッシュなトレーニングサンプルを生成し、各ドメインごとにスタイル変換ネットワークをトレーニングします。分類段階では、オリジナルトレーニングデータセットとスタイライゼーション画像のトレーニングにおいて、注意層におけるスタイル層とコンテンツ層の有効性を解釈することができる。多数派と少数派における増分サンプルの割合を動的に変化させることで、モデル性能と収束性をトレードオフすることができる。訓練期間の短縮と,訓練パラメータの少ない分類器を用いて,somaと同等の結果を得る。 It is difficult to train classifiers on paintings collections due to model bias from domain gaps and data bias from the uneven distribution of artistic styles. Previous techniques like data distillation, traditional data augmentation and style transfer improve classifier training using task specific training datasets or domain adaptation. We propose a system to handle data bias in small paintings datasets like the Kaokore dataset while simultaneously accounting for domain adaptation in fine-tuning a model trained on real world images. Our system consists of two stages which are style transfer and classification. In the style transfer stage, we generate the stylized training samples per class with uniformly sampled content and style images and train the style transformation network per domain. In the classification stage, we can interpret the effectiveness of the style and content layers at the attention layers when training on the original training dataset and the stylized images. We can tradeoff the model performance and convergence by dynamically varying the proportion of augmented samples in the majority and minority classes. 
We achieve comparable results to the SOTA with fewer training epochs and a classifier with fewer training parameters.	翻訳日:2023-01-09 23:51:26 公開日:2023-01-06
# 3次元物体検出のためのモデル非依存階層的注意 Model-Agnostic Hierarchical Attention for 3D Object Detection ( http://arxiv.org/abs/2301.02650v1 ) ライセンス: Link先を確認	Manli Shu, Le Xue, Ning Yu, Roberto Martín-Martín, Juan Carlos Niebles, Caiming Xiong, Ran Xu	(参考訳) 汎用ネットワークアーキテクチャとしてのトランスフォーマーは最近、3dポイントクラウドオブジェクト検出で大きな成功を収めている。しかし, 通常の変圧器では階層構造が欠如しているため, 異なるスケールで特徴を学習することは困難であり, 局所的特徴を抽出する能力を抑制する。このような制限により、異なるサイズのオブジェクトでは性能が不均衡になり、小さいオブジェクトでは性能が劣る。本研究では,トランスを用いた3D検出器のモジュール化階層設計として,新しい2つの注意機構を提案する。異なるスケールで機能学習を可能にするために,単一スケールの入力機能から複数スケールのトークンを構築するシンプルなマルチスケールアテンションを提案する。局所化特徴集約のために,各境界ボックスの提案に対して適応的注意範囲を持つサイズ適応局所注意を提案する。この2つのアテンションモジュールはモデルに依存しないネットワーク層で、エンドツーエンドトレーニングのために既存のポイントクラウドトランスフォーマーにプラグインすることができます。提案手法を室内3次元点状物体検出ベンチマークで評価した。提案するモジュールを最先端のトランスフォーマーベースの3d検出器に差し込むことで,従来の2つのベンチマークの最良の結果を改善し,小型オブジェクトに対する改善マージンを最大にする。 Transformers as versatile network architectures have recently seen great success in 3D point cloud object detection. However, the lack of hierarchy in a plain transformer makes it difficult to learn features at different scales and restrains its ability to extract localized features. Such limitation makes them have imbalanced performance on objects of different sizes, with inferior performance on smaller ones. In this work, we propose two novel attention mechanisms as modularized hierarchical designs for transformer-based 3D detectors. To enable feature learning at different scales, we propose Simple Multi-Scale Attention that builds multi-scale tokens from a single-scale input feature. For localized feature aggregation, we propose Size-Adaptive Local Attention with adaptive attention ranges for every bounding box proposal. Both of our attention modules are model-agnostic network layers that can be plugged into existing point cloud transformers for end-to-end training. We evaluate our method on two widely used indoor 3D point cloud object detection benchmarks. 
By plugging our proposed modules into the state-of-the-art transformer-based 3D detector, we improve the previous best results on both benchmarks, with the largest improvement margin on small objects.	翻訳日:2023-01-09 23:50:51 公開日:2023-01-06
# 連続制御タスクのための集中型協調探索政策 Centralized Cooperative Exploration Policy for Continuous Control Tasks ( http://arxiv.org/abs/2301.02375v1 ) ライセンス: Link先を確認	Chao Li, Chen Gong, Qiang He, Xinwen Hou and Yu Liu	(参考訳) 深層強化学習(drl)アルゴリズムは、様々な複雑な制御タスクを巧みに解決する。この現象的な成功は、DRLが知的エージェントに環境を十分に探索し、エージェントトレーニングプロセス中に多様な経験を収集するよう促すことによるものである。したがって、探査はdrlの最適ポリシーにアクセスする上で重要な役割を果たす。近年の継続的制御タスクの進歩にもかかわらず、これらのタスクの探索は不十分なままである。連続制御タスクにおける探索を明示的に奨励するために,価値関数の過小評価と過大評価を利用して探索能力を維持するCCEP(Centralized Cooperative Exploration Policy)を提案する。 CCEPはまず、異なるパラメータで初期化された2つの値関数を保持し、値関数のペアから複数の探索スタイルで多様なポリシーを生成する。さらに、集中型ポリシフレームワークは、CCEPが複数のポリシ間のメッセージ配信を実現し、さらに環境の協調的な探索に寄与することを保証する。大規模な実験の結果、CCEPは高い探査能力を発揮することが示された。実証分析では、CCEPによる学習政策における多様な探索スタイルが示され、より多くの探検地域での利益が得られている。そしてこのccepの探索能力は、実験で示された複数の連続制御タスクにまたがる現在の最先端のメソッドよりも優れています。 The deep reinforcement learning (DRL) algorithm works brilliantly on solving various complex control tasks. This phenomenal success can be partly attributed to DRL encouraging intelligent agents to sufficiently explore the environment and collect diverse experiences during the agent training process. Therefore, exploration plays a significant role in accessing an optimal policy for DRL. Despite recent works making great progress in continuous control tasks, exploration in these tasks has remained insufficiently investigated. To explicitly encourage exploration in continuous control tasks, we propose CCEP (Centralized Cooperative Exploration Policy), which utilizes underestimation and overestimation of value functions to maintain the capacity of exploration. CCEP first keeps two value functions initialized with different parameters, and generates diverse policies with multiple exploration styles from a pair of value functions. In addition, a centralized policy framework ensures that CCEP achieves message delivery between multiple policies, furthermore contributing to exploring the environment cooperatively. 
Extensive experimental results demonstrate that CCEP achieves higher exploration capacity. Empirical analysis shows diverse exploration styles in the policies learned by CCEP, reaping benefits in more exploration regions. This exploration capacity allows CCEP to outperform the current state-of-the-art methods across the multiple continuous control tasks in our experiments.	翻訳日:2023-01-09 23:50:13 公開日:2023-01-06
# 共形損失制御予測 Conformal Loss-Controlling Prediction ( http://arxiv.org/abs/2301.02424v1 ) ライセンス: Link先を確認	Di Wang, Ping Wang, Zhong Ji, Xiaojun Yang, Hongyue Li	(参考訳) コンフォーマル予測は、予測セットの予測カバレッジを制御する学習フレームワークであり、任意の学習アルゴリズムに基づいてポイント予測を行うことができる。本研究では,損失関数の値を制御する必要がある状況に対して,共形予測を拡張した共形損失制御予測という学習フレームワークを提案する。リスク制御予測セットと,損失関数の期待値を制御することを目的とした共形リスク制御に関する既存の研究とは違い,本論文では,誤発見損失から一般損失への共形予測の拡張である任意のテスト対象の損失に着目した。制御保証は有限事例におけるデータの交換可能性の仮定の下で証明され、数値気象予報アプリケーションのクラス変動損失と統計的後処理を伴う分類について実証的に検証し、ポイントワイズ分類およびポイントワイズ回帰問題として導入する。すべての理論解析と実験結果から,損失制御手法の有効性を確認した。 Conformal prediction is a learning framework controlling prediction coverage of prediction sets, which can be built on any learning algorithm for point prediction. This work proposes a learning framework named conformal loss-controlling prediction, which extends conformal prediction to the situation where the value of a loss function needs to be controlled. Different from existing works about risk-controlling prediction sets and conformal risk control with the purpose of controlling the expected values of loss functions, the proposed approach in this paper focuses on the loss for any test object, which is an extension of conformal prediction from miscoverage loss to some general loss. The controlling guarantee is proved under the assumption of exchangeability of data in finite-sample cases and the framework is tested empirically for classification with a class-varying loss and statistical postprocessing of numerical weather forecasting applications, which are introduced as point-wise classification and point-wise regression problems. All theoretical analysis and experimental results confirm the effectiveness of our loss-controlling approach.	翻訳日:2023-01-09 23:49:54 公開日:2023-01-06
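The abstract above extends conformal prediction from controlling miscoverage to controlling a general loss. For orientation, the sketch below shows only the standard starting point, split conformal prediction for classification, and not the loss-controlling method proposed in the paper. The function names, the nonconformity score `1 - p(label)`, and the toy numbers are assumptions chosen for illustration.

```python
import math
from typing import Dict, List, Sequence


def conformal_threshold(calibration_scores: Sequence[float], alpha: float) -> float:
    """Threshold for split conformal prediction.

    calibration_scores : nonconformity scores of the true labels on a
                         held-out calibration set
    alpha              : target miscoverage level, e.g. 0.1 for 90% coverage

    Returns the ceil((n + 1) * (1 - alpha))-th smallest score, or infinity
    when the calibration set is too small for the requested level.
    """
    n = len(calibration_scores)
    k = math.ceil((n + 1) * (1.0 - alpha))
    if k > n:
        return math.inf
    return sorted(calibration_scores)[k - 1]


def prediction_set(class_probs: Dict[str, float], threshold: float) -> List[str]:
    """All labels whose nonconformity score 1 - p(label) is within the threshold."""
    return [label for label, p in class_probs.items() if 1.0 - p <= threshold]


if __name__ == "__main__":
    # Toy calibration scores and model output, for illustration only.
    q = conformal_threshold([0.3, 0.1, 0.2], alpha=0.5)
    print(q, prediction_set({"a": 0.85, "b": 0.10, "c": 0.05}, q))
```

Under exchangeability of calibration and test data, sets built this way contain the true label with probability at least `1 - alpha`. According to the abstract, the paper replaces this miscoverage guarantee with a guarantee on a more general loss.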
# 圧縮アクティベーションは並列トレーニングのモデルに役立つか? Does compressing activations help model parallel training? ( http://arxiv.org/abs/2301.02654v1 ) ライセンス: Link先を確認	Song Bian, Dacheng Li, Hongyi Wang, Eric P. Xing, Shivaram Venkataraman	(参考訳) 大規模トランスフォーマーモデルは様々なタスクにおいて例外的な性能で知られているが、通信集約型モデル並列性を必要とするため、訓練は困難である。トレーニング速度を改善する1つの方法は、通信におけるメッセージサイズを圧縮することである。従来の手法は主にデータ並列性の設定における勾配の圧縮に焦点を合わせてきたが、モデル並列設定における圧縮は未調査領域である。モデル並列性はデータ並列性と根本的に異なる特徴を持つことがわかった。本研究では,モデル並列性に対する圧縮手法の有効性に関する実験的検討を行った。我々は,一般的なTransformerトレーニングフレームワークを用いて,プルーニングベース,学習ベース,量子化ベースという3つの圧縮アルゴリズムの共通クラスを実装し,評価する。我々は、これらの手法を160以上の設定と8つの一般的なデータセットで評価し、異なるハイパーパラメータ、ハードウェア、微調整および事前学習の段階を考慮に入れた。モデルのスケールアップ時の分析も行っています。最後に,モデル並列性圧縮アルゴリズムの今後の開発について考察する。 Large-scale Transformer models are known for their exceptional performance in a range of tasks, but training them can be difficult due to the requirement for communication-intensive model parallelism. One way to improve training speed is to compress the message size in communication. Previous approaches have primarily focused on compressing gradients in a data parallelism setting, but compression in a model-parallel setting is an understudied area. We have discovered that model parallelism has fundamentally different characteristics than data parallelism. In this work, we present the first empirical study on the effectiveness of compression methods for model parallelism. We implement and evaluate three common classes of compression algorithms - pruning-based, learning-based, and quantization-based - using a popular Transformer training framework. We evaluate these methods across more than 160 settings and 8 popular datasets, taking into account different hyperparameters, hardware, and both fine-tuning and pre-training stages. We also provide analysis when the model is scaled up. Finally, we provide insights for future development of model parallelism compression algorithms.	翻訳日:2023-01-09 23:49:35 公開日:2023-01-06
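The abstract above evaluates pruning-based, learning-based, and quantization-based compression of the messages exchanged in model-parallel training. To make the quantization-based class concrete, here is a minimal sketch of uniform min-max quantization applied to a flat list of activation values. It is a generic textbook scheme, not one of the specific algorithms evaluated in the paper; the function names and the 8-bit default are assumptions.

```python
from typing import List, Sequence, Tuple


def quantize(values: Sequence[float], num_bits: int = 8) -> Tuple[List[int], float, float]:
    """Uniformly quantize values to integer codes in [0, 2**num_bits - 1].

    Returns (codes, offset, scale) so that value ~= offset + code * scale.
    """
    lo, hi = min(values), max(values)
    levels = (1 << num_bits) - 1
    # A constant tensor has zero range; any positive scale works then.
    scale = (hi - lo) / levels if hi > lo else 1.0
    codes = [round((v - lo) / scale) for v in values]
    return codes, lo, scale


def dequantize(codes: Sequence[int], offset: float, scale: float) -> List[float]:
    """Reconstruct approximate values from integer codes."""
    return [offset + c * scale for c in codes]


if __name__ == "__main__":
    # Hypothetical activation values.
    activations = [-1.0, 0.0, 0.5, 1.0]
    codes, offset, scale = quantize(activations)
    print(codes, dequantize(codes, offset, scale))
```

Sending 8-bit codes plus two floats instead of 32-bit values shrinks the message roughly fourfold, at the cost of a per-element error of at most half a quantization step. Whether that trade-off helps end-to-end training is the empirical question the paper studies.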
# object as query: 任意の2dオブジェクト検出器に3d検出能力を備える Object as Query: Equipping Any 2D Object Detector with 3D Detection Ability ( http://arxiv.org/abs/2301.02364v1 ) ライセンス: Link先を確認	Zitian Wang, Zehao Huang, Jiahui Fu, Naiyan Wang, Si Liu	(参考訳) マルチビュー画像からの3Dオブジェクト検出は、ここ数年で注目されている。既存の方法は、主に多視点画像から3D表現を確立し、オブジェクト検出に高密度な検出ヘッドを採用するか、オブジェクトをローカライズするために3D空間に分散されたオブジェクトクエリを使用する。本稿では,多視点3次元物体検出装置(MV2D)を設計し,任意の2次元物体検出装置を装備して,多視点3次元物体検出の促進を図る。 MV2Dは2D検出器を利用して、リッチな画像意味論に基づくオブジェクトクエリを生成する。これらの動的に生成されたクエリにより、MV2Dは計算コストを増大させることなくより大きな3D空間のオブジェクトを検出でき、3Dオブジェクトをローカライズする強力な能力を示す。生成したクエリに対して,分散クロスアテンションモジュールを設計し,特定のオブジェクトの特徴に注目させることにより,計算コストを低減し,ノイズによる干渉を抑制する。 nuScenesデータセットの評価結果は、動的オブジェクトクエリとスパース特徴集約が3次元検出能力を損なわないことを示す。 MV2Dは既存の手法の中でも最先端の性能を示している。 MV2Dが将来の研究の新たなベースラインになることを期待している。 3D object detection from multi-view images has drawn much attention over the past few years. Existing methods mainly establish 3D representations from multi-view images and adopt a dense detection head for object detection, or employ object queries distributed in 3D space to localize objects. In this paper, we design Multi-View 2D Objects guided 3D Object Detector (MV2D), which can be equipped with any 2D object detector to promote multi-view 3D object detection. Since 2D detections can provide valuable priors for object existence, MV2D exploits 2D detector to generate object queries conditioned on the rich image semantics. These dynamically generated queries enable MV2D to detect objects in larger 3D space without increased computational costs and shows a strong capability of localizing 3D objects. For the generated queries, we design a sparse cross attention module to force them to focus on the features of specific objects, which reduces the computational cost and suppresses interference from noises. The evaluation results on the nuScenes dataset demonstrate that dynamic object queries and sparse feature aggregation do not harm 3D detection capability. 
MV2D also exhibits state-of-the-art performance among existing methods. We hope MV2D can serve as a new baseline for future research.	翻訳日:2023-01-09 23:41:56 公開日:2023-01-06
# Anchor3DLane:モノクロ3Dレーン検出のための3Dアンカーの学習 Anchor3DLane: Learning to Regress 3D Anchors for Monocular 3D Lane Detection ( http://arxiv.org/abs/2301.02371v1 ) ライセンス: Link先を確認	Shaofei Huang, Zhenwei Shen, Zehao Huang, Zihan Ding, Jiao Dai, Jizhong Han, Naiyan Wang, Si Liu	(参考訳) 深さ情報がないため,単眼3次元レーン検出は難しい課題である。 3Dレーン検出の一般的な解決策は、まず正面視(FV)画像や特徴を逆視点マッピング(IPM)で鳥眼視(BEV)空間に変換し、BEV特徴から車線を検出することである。しかし、IPMが平らな地上での仮定やコンテキスト情報の喪失に依存しているため、BEV表現から3D情報を復元するには不正確である。 BEVを排除し、FV表現から直接3Dレーンを予測する試みがなされているが、3Dレーンの構造的表現が欠如していることから、他のBEVベースの方法よりも性能が低い。本稿では,3d空間における3dレーンアンカーを定義し,fv表現から直接3dレーンを予測するためのアンカー3dlane法を提案する。 3DレーンアンカーはFV機能に投影され、正確な予測を行うための優れた構造情報とコンテキスト情報の両方を含む特徴を抽出する。さらにanchor3dlaneをマルチフレーム設定に拡張し、パフォーマンス改善のために時間情報を取り込む。さらに,車線間の等幅特性を利用した大域的最適化手法も開発し,予測の側方誤差を低減する。 3つの人気のある3Dレーン検出ベンチマークの大規模な実験により、我々のAnchor3DLaneは従来のBEVベースの手法より優れ、最先端のパフォーマンスを実現しています。 Monocular 3D lane detection is a challenging task due to its lack of depth information. A popular solution to 3D lane detection is to first transform the front-viewed (FV) images or features into the bird-eye-view (BEV) space with inverse perspective mapping (IPM) and detect lanes from BEV features. However, the reliance of IPM on flat ground assumption and loss of context information makes it inaccurate to restore 3D information from BEV representations. An attempt has been made to get rid of BEV and predict 3D lanes from FV representations directly, while it still underperforms other BEV-based methods given its lack of structured representation for 3D lanes. In this paper, we define 3D lane anchors in the 3D space and propose a BEV-free method named Anchor3DLane to predict 3D lanes directly from FV representations. 3D lane anchors are projected to the FV features to extract their features which contain both good structural and context information to make accurate predictions. We further extend Anchor3DLane to the multi-frame setting to incorporate temporal information for performance improvement. 
In addition, we develop a global optimization method that uses the equal-width property between lanes to reduce the lateral error of predictions. Extensive experiments on three popular 3D lane detection benchmarks show that Anchor3DLane outperforms previous BEV-based methods and achieves state-of-the-art performance.	翻訳日:2023-01-09 23:41:35 公開日:2023-01-06
# codetalker: 個別動作を優先した音声駆動3d顔アニメーション CodeTalker: Speech-Driven 3D Facial Animation with Discrete Motion Prior ( http://arxiv.org/abs/2301.02379v1 ) ライセンス: Link先を確認	Jinbo Xing, Menghan Xia, Yuechen Zhang, Xiaodong Cun, Jue Wang, Tien-Tsin Wong	(参考訳) 音声駆動の3D顔アニメーションは広く研究されているが、音声視覚データの極めて不適切な性質と不足のため、現実主義と鮮明さを達成するには依然としてギャップがある。既存の作業は、通常、回帰タスクへのクロスモーダルマッピングを定式化するが、これは回帰と平均の問題に悩まされ、過度に滑らかな顔の動きにつながる。本稿では,学習したコードブックの有限プロキシ空間において,音声による顔のアニメーションをコードクエリタスクとしてキャストすることを提案する。コードブックは、実際の顔の動きに対する自己再構成によって学習され、現実的な顔の動きに埋め込まれる。離散的動作空間上では、入力された音声信号から顔の動きを逐次合成する時間的自己回帰モデルが用いられ、口唇同期と多彩な表情が保証される。提案手法は, 定性的かつ定量的に, 現在の最先端手法よりも優れていることを示す。また、ユーザスタディは、知覚品質の優位性をさらに正当化する。 Speech-driven 3D facial animation has been widely studied, yet there is still a gap to achieving realism and vividness due to the highly ill-posed nature and scarcity of audio-visual data. Existing works typically formulate the cross-modal mapping into a regression task, which suffers from the regression-to-mean problem leading to over-smoothed facial motions. In this paper, we propose to cast speech-driven facial animation as a code query task in a finite proxy space of the learned codebook, which effectively promotes the vividness of the generated motions by reducing the cross-modal mapping uncertainty. The codebook is learned by self-reconstruction over real facial motions and thus embedded with realistic facial motion priors. Over the discrete motion space, a temporal autoregressive model is employed to sequentially synthesize facial motions from the input speech signal, which guarantees lip-sync as well as plausible facial expressions. We demonstrate that our approach outperforms current state-of-the-art methods both qualitatively and quantitatively. Also, a user study further justifies our superiority in perceptual quality.	翻訳日:2023-01-09 23:41:07 公開日:2023-01-06
# cyberloc: 正確な長期視定位を目指して CyberLoc: Towards Accurate Long-term Visual Localization ( http://arxiv.org/abs/2301.02403v1 ) ライセンス: Link先を確認	Liu Liu, Yukai Lin, Xiao Liang, Qichao Xu, Miao Jia, Yangdong Liu, Yuxiang Wen, Wei Luo, Jiangwei Li	(参考訳) 本報告では,課題条件下でロバストかつ高精度なポーズ推定を行うための画像ベースビジュアルローカライズパイプラインであるcyberlocを紹介する。提案手法は4つのモジュールを連結して構成する。まず、異なる条件下で複数の参照シーケンスが存在する場合、各参照シーケンスに対する1つのマップであるシーンの正確な3Dマップを構築するためにマッピングモジュールを適用する。次に、単一の画像ベースのローカライゼーションパイプライン(retrieval--matching--PnP)を行い、クエリ画像毎に6-DoFカメラのポーズを3Dマップ毎に推定する。第3に、6-DoFカメラのポーズをフィルタし、1つの6-DoFカメラのポーズをクエリに出力するコンセンサスセット最大化モジュールを提案する。最後に、6-DoFのクエリポーズを最適化し、候補となるグローバルな6-DoFカメラポーズとその対応するグローバルな2D-3Dマッチング、連続的なクエリイメージとクエリシーケンスのSLAMポーズのスパース2D-2D特徴マッチングを入力として、ロバストなポーズ修正モジュールを提案する。 4シーズンデータセットを用いた実験により,本手法は高精度かつロバスト性が得られた。特に,本手法は,地図を用いた自律運転用ローカライゼーション(MLAD-ECCV2022)に関するECCV 2022ワークショップのローカライゼーション課題に勝っている。 This technical report introduces CyberLoc, an image-based visual localization pipeline for robust and accurate long-term pose estimation under challenging conditions. The proposed method comprises four modules connected in a sequence. First, a mapping module is applied to build accurate 3D maps of the scene, one map for each reference sequence if there exist multiple reference sequences under different conditions. Second, a single-image-based localization pipeline (retrieval--matching--PnP) is performed to estimate 6-DoF camera poses for each query image, one for each 3D map. Third, a consensus set maximization module is proposed to filter out outlier 6-DoF camera poses, and outputs one 6-DoF camera pose for a query. Finally, a robust pose refinement module is proposed to optimize 6-DoF query poses, taking candidate global 6-DoF camera poses and their corresponding global 2D-3D matches, sparse 2D-2D feature matches between consecutive query images and SLAM poses of the query sequence as input. Experiments on the 4seasons dataset show that our method achieves high accuracy and robustness. 
In particular, our approach wins the localization challenge of the ECCV 2022 workshop on Map-based Localization for Autonomous Driving (MLAD-ECCV2022).	翻訳日:2023-01-09 23:40:48 公開日:2023-01-06
# 視覚変換器の効率よいFew-shot Adaptationの探索 Exploring Efficient Few-shot Adaptation for Vision Transformers ( http://arxiv.org/abs/2301.02419v1 ) ライセンス: Link先を確認	Chengming Xu, Siqian Yang, Yabiao Wang, Zhanxiong Wang, Yanwei Fu, Xiangyang Xue	(参考訳) FSL(Few-shot Learning)の課題は,ラベル付きトレーニングサンプルを豊富に含むベースカテゴリから学習した知識を利用して,ラベル付きサンプルを少数含む新規カテゴリの推論を行うことである。 FSLタスクには多くの研究があるが、視覚トランスフォーマー(ViT)がFSLのバックボーンとして採用されることは稀であり、バックボーン全体や分類層を微調整することに焦点を当てる試みはほとんどない。基本的に、ViTは、他のビジョンタスクで同等またはさらに優れたパフォーマンスを享受していることが示されているが、現実のFSLシナリオでViTを効率的に微調整することは、まだ非常に簡単ではない。そこで本研究では,FSLタスクの微調整を容易にするトランスフォーマーチューニング(eTT)手法を提案する。鍵となる新機能は、タスクとバックボーンチューニングのために新たに提示された注意プレフィックスチューニング(apt)とドメイン残差アダプタ(dra)から生まれます。具体的には、APTでは、プレフィックスを各自己保持層に取り付けられた新しいキーと値ペアに投影し、タスク固有の情報を提供する。さらに,学習可能なオフセットベクトルの形でdraを設計し,ベースデータと新規データの間の潜在的な領域ギャップを処理する。 aptが初期タスク固有の情報からあまり逸脱しないようにするため、我々はさらにプレフィックスと初期プロトタイプの射影分布の類似性を最大化し、更新手順を規則化する新しいプロトタイプ正規化を提案する。提案手法はメタデータセットの課題に対して優れた性能を発揮する。我々は,モデルの有効性を示す広範な実験を行った。 Few-shot Learning (FSL) aims to perform inference on novel categories containing only a few labeled examples, with the help of knowledge learned from base categories containing abundant labeled training samples. While there are numerous works on the FSL task, Vision Transformers (ViTs) have rarely been taken as the backbone for FSL, with the few trials focusing on naive finetuning of the whole backbone or the classification layer. Essentially, although ViTs have been shown to enjoy comparable or even better performance on other vision tasks, it is still very nontrivial to efficiently finetune ViTs in real-world FSL scenarios. To this end, we propose a novel efficient Transformer Tuning (eTT) method that facilitates finetuning ViTs in FSL tasks. The key novelties come from the newly presented Attentive Prefix Tuning (APT) and Domain Residual Adapter (DRA) for the task and backbone tuning, respectively. 
Specifically, in APT, the prefix is projected to new key and value pairs that are attached to each self-attention layer to provide the model with task-specific information. Moreover, we design the DRA in the form of learnable offset vectors to handle the potential domain gaps between base and novel data. To ensure that APT does not deviate much from the initial task-specific information, we further propose a novel prototypical regularization, which maximizes the similarity between the projected distribution of the prefix and the initial prototypes, regularizing the update procedure. Our method achieves outstanding performance on the challenging Meta-Dataset. We conduct extensive experiments to show the efficacy of our model.	翻訳日:2023-01-09 23:40:25 公開日:2023-01-06
# シーケンシャル量子強化トレーニングを用いたトレーサブル量子機械学習に向けて SEQUENT: Towards Traceable Quantum Machine Learning using Sequential Quantum Enhanced Training ( http://arxiv.org/abs/2301.02601v1 ) ライセンス: Link先を確認	Philipp Altmann, Leo S\"unkel, Jonas Stein, Tobias M\"uller, Christoph Roch and Claudia Linnhoff-Popien	(参考訳) 量子コンピューティングのような新しいコンピューティングパラダイムを機械学習の分野に適用する動きが最近注目を集めている。しかし、高次元実世界の応用は純粋に量子ハードウェアで解決できないため、古典的および量子機械学習のパラダイムを用いたハイブリッド手法が提案されている。例えば、移動学習法はハイブリッド画像分類タスクに適用可能であることが示されている。それでも、有益な回路アーキテクチャを探求する必要がある。したがって、選択した回路アーキテクチャとパラメータ化の影響の追跡は、有効なハイブリッド手法の開発に不可欠である。しかし、現在の方法には、両方の部分を同時に訓練するプロセスが含まれているため、古典的および量子的な影響の厳密な分離性が認められない。したがって、これらのアーキテクチャは、最小限の量子インパクトを使用しながらより優れた予測精度をもたらすモデルを生成するかもしれない。本稿では,量子コンピューティング手法のハイブリッド機械学習へのトレーサブルな応用に向けて,逐次的量子強化トレーニング(sequent)により改良されたアーキテクチャとトレーニングプロセスを提案する。さらに,現在の手法の欠点と予備的な実験結果に対する形式的な証拠を,sequentの適用可能性の実証として提示する。 Applying new computing paradigms like quantum computing to the field of machine learning has recently gained attention. However, as high-dimensional real-world applications are not yet feasible to be solved using purely quantum hardware, hybrid methods using both classical and quantum machine learning paradigms have been proposed. For instance, transfer learning methods have been shown to be successfully applicable to hybrid image classification tasks. Nevertheless, beneficial circuit architectures still need to be explored. Therefore, tracing the impact of the chosen circuit architecture and parameterization is crucial for the development of beneficially applicable hybrid methods. However, current methods include processes where both parts are trained concurrently, therefore not allowing for a strict separability of classical and quantum impact. Thus, those architectures might produce models that yield a superior prediction accuracy whilst employing the least possible quantum impact. 
To tackle this issue, we propose Sequential Quantum Enhanced Training (SEQUENT), an improved architecture and training process for the traceable application of quantum computing methods to hybrid machine learning. Furthermore, we provide formal evidence for the disadvantage of current methods and preliminary experimental results as a proof-of-concept for the applicability of SEQUENT.	翻訳日:2023-01-09 23:33:15 公開日:2023-01-06
# 住宅負荷の迅速応答のためのマルチエージェント強化学習 Multi-Agent Reinforcement Learning for Fast-Timescale Demand Response of Residential Loads ( http://arxiv.org/abs/2301.02593v1 ) ライセンス: Link先を確認	Vincent Mai, Philippe Maisonneuve, Tianyu Zhang, Hadi Nekoei, Liam Paull, Antoine Lesage-Landry	(参考訳) 高量の再生可能エネルギー資源を統合するためには、電力グリッドは高振幅で高速な時間スケールの発電に対処できなければならない。需要応答による周波数規制は、空気調和機のような時間的に柔軟な負荷を調整し、これらの変動に対処する可能性がある。動的制約を伴う離散制御のための既存のアプローチは、数百のエージェントによる高速な時間スケールアクション選択に満足な性能を提供するのに苦労している。局所通信を用いたマルチエージェントポリシー最適化を訓練した分散エージェントを提案する。ハンドエンジニアリングとマルチエージェント通信による学習という,2つのコミュニケーションフレームワークについて検討する。結果として得られるポリシは、周波数規制に対して良好かつ堅牢に機能し、一定の処理時間の間、任意の数のハウスにシームレスにスケールする。 To integrate high amounts of renewable energy resources, electrical power grids must be able to cope with high amplitude, fast timescale variations in power generation. Frequency regulation through demand response has the potential to coordinate temporally flexible loads, such as air conditioners, to counteract these variations. Existing approaches for discrete control with dynamic constraints struggle to provide satisfactory performance for fast timescale action selection with hundreds of agents. We propose a decentralized agent trained with multi-agent proximal policy optimization with localized communication. We explore two communication frameworks: hand-engineered, or learned through targeted multi-agent communication. The resulting policies perform well and robustly for frequency regulation, and scale seamlessly to arbitrary numbers of houses for constant processing times.	翻訳日:2023-01-09 23:32:35 公開日:2023-01-06
# 楕円スライスサンプリング再訪の可逆性 Reversibility of elliptical slice sampling revisited ( http://arxiv.org/abs/2301.02426v1 ) ライセンス: Link先を確認	Mareike Hasenpflug, Viacheslav Natarovskii, Daniel Rudolf	(参考訳) マレー、アダムズ、マッケイが2010年に導入した後方分布の近似サンプリングのためのマルコフ連鎖法である楕円スライスサンプリングの明確性について考察する。我々は正規性要件を指摘し、可逆性特性の別の証明を提供する。特に、これは無限次元分離ヒルベルト空間上でもスライスサンプリングスキームの正しさを保証する。 We discuss the well-definedness of elliptical slice sampling, a Markov chain approach for approximate sampling of posterior distributions introduced by Murray, Adams and MacKay 2010. We point to a regularity requirement and provide an alternative proof of the reversibility property. In particular, this guarantees the correctness of the slice sampling scheme also on infinite-dimensional separable Hilbert spaces.	翻訳日:2023-01-09 23:32:23 公開日:2023-01-06
# mask-then-fill: イベント抽出のための柔軟かつ効果的なデータ拡張フレームワーク Mask-then-Fill: A Flexible and Effective Data Augmentation Framework for Event Extraction ( http://arxiv.org/abs/2301.02427v1 ) ライセンス: Link先を確認	Jun Gao, Changlong Yu, Wei Wang, Huan Zhao, Ruifeng Xu	(参考訳) イベント抽出のための柔軟かつ効果的なデータ拡張フレームワークであるmask-then-fillを提案する。このアプローチは、テキストのより柔軟な操作を可能にし、元のイベント構造を可能な限り変更することなく、より多様なデータを生成することができる。具体的には、まず随伴文の断片をランダムにマスキングし、それから可変長のテキストを細調整された埋め込みモデルで埋め込む。主な利点は、テキスト中の任意の長さの断片を、単一の単語または固定長の断片だけを置換できる既存の方法と比較して、可変長の別の断片に置き換えることができることである。トリガおよび引数抽出タスクにおいて,提案手法はベースライン手法よりも有効であり,低リソース設定において特に強い結果を示す。さらに分析した結果,多様性と分布的類似性のバランスが良好であることが判明した。 We present Mask-then-Fill, a flexible and effective data augmentation framework for event extraction. Our approach allows for more flexible manipulation of text and thus can generate more diverse data while keeping the original event structure unchanged as much as possible. Specifically, it first randomly masks out an adjunct sentence fragment and then infills a variable-length text span with a fine-tuned infilling model. The main advantage lies in that it can replace a fragment of arbitrary length in the text with another fragment of variable length, compared to the existing methods which can only replace a single word or a fixed-length fragment. On trigger and argument extraction tasks, the proposed framework is more effective than baseline methods and it demonstrates particularly strong results in the low-resource setting. Our further analysis shows that it achieves a good balance between diversity and distributional similarity.	翻訳日:2023-01-09 23:31:44 公開日:2023-01-06
# OPD@NL4Opt:最適化問題のNERタスクに対するアンサンブルアプローチ OPD@NL4Opt: An ensemble approach for the NER task of the optimization problem ( http://arxiv.org/abs/2301.02459v1 ) ライセンス: Link先を確認	Kangxu Wang, Ze Chen, Jiewen Zheng	(参考訳) 本稿では,NL4Optコンペティションサブタスク1(NERタスク)に対するアンサンブルアプローチを提案する。このタスクでは、まず、競合データセットに基づいて事前訓練された言語モデルを微調整する。そして,モデル一般化とロバスト性を高めるために,差分学習率と対角訓練戦略を採用する。さらに、モデルアンサンブル法を用いて最終予測を行い、マイクロ平均F1スコア93.3%を達成し、NERタスクにおいて第2位を獲得する。 In this paper, we present an ensemble approach for the NL4Opt competition subtask 1 (NER task). For this task, we first fine-tune the pretrained language models on the competition dataset. Then we adopt differential learning rates and adversarial training strategies to enhance the model generalization and robustness. Additionally, we use a model ensemble method for the final prediction, which achieves a micro-averaged F1 score of 93.3% and attains the second prize in the NER task.	翻訳日:2023-01-09 23:31:30 公開日:2023-01-06
# トランスフォーマーを用いたメンタルヘルスポストの因果分類 Causal Categorization of Mental Health Posts using Transformers ( http://arxiv.org/abs/2301.02589v1 ) ライセンス: Link先を確認	Muskan Garg, Simranjeet Kaur, Ritika Bhardwaj, Aastha Jain, Chandni Saxena	(参考訳) 近年、臨床心理学のデジタル化が進み、NLP研究コミュニティはソーシャルメディアにおけるメンタルヘルス検出の分野に革命をもたらした。既存のメンタルヘルス分析研究は、ソーシャルメディアに対するユーザの意図を分類するための横断的研究を中心に展開されている。詳細な分析のために,既存の分類器について検討し,限られたトレーニングサンプルによる学習ベース手法の非効率性を示唆する因果分類問題を解く。この課題に対処するために、トランスフォーマーモデルを使用し、"CAMS"データセット上でトレーニング済みのトランスファー学習の有効性を実証する。実験結果により精度が向上し,基礎となるテキストにおける因果関係の同定の重要性が示された。 With recent developments in digitization of clinical psychology, NLP research community has revolutionized the field of mental health detection on social media. Existing research in mental health analysis revolves around the cross-sectional studies to classify users' intent on social media. For in-depth analysis, we investigate existing classifiers to solve the problem of causal categorization which suggests the inefficiency of learning based methods due to limited training samples. To handle this challenge, we use transformer models and demonstrate the efficacy of a pre-trained transfer learning on "CAMS" dataset. The experimental result improves the accuracy and depicts the importance of identifying cause-and-effect relationships in the underlying text.	翻訳日:2023-01-09 23:31:21 公開日:2023-01-06
# 私の必要なものを本当に理解している人:知識とペルソナを基盤とする知的で友好的な対話エージェント You Truly Understand What I Need: Intellectual and Friendly Dialogue Agents grounding Knowledge and Persona ( http://arxiv.org/abs/2301.02401v1 ) ライセンス: Link先を確認	Jungwoo Lim, Myunghoon Kang, Yuna Hur, Seungwon Jung, Jinsung Kim, Yoonna Jang, Dongyub Lee, Hyesung Ji, Donghoon Shin, Seungryong Kim, and Heuiseok Lim	(参考訳) 人間と流動的に対話する会話エージェントを構築するため、事前学習された言語モデルに知識や個人のプロファイルをブレンドする。しかし、知識とペルソナを同時に考慮するモデルは依然として限定的であり、幻覚とペルソナの使用のパッシブな方法につながる。外部知識とペルソナを同時に活用する効果的な対話エージェントを提案する。エージェントは、ポリエンコーダで実装された候補スコアで回答を生成するために使用する適切な知識とペルソナを選択する。そして,本モデルでは,知識人格拡張クエリを用いた検索拡張により,より少なめの幻覚とより親しみのある発話を生成する。我々はペルソナ知識チャットの実験を行い、自動メトリクスに基づくグラウンドおよび生成タスクにおける最先端のパフォーマンスを達成する。さらに,人間の評価と質的結果を通して,幻覚とエンゲージメントに関するモデルからの回答を検証する。本稿では,他の検索者と比較して関連文書の抽出に有効であることを示すとともに,複数の候補スコアリング手法との比較を行った。コードはhttps://github.com/dlawjddn803/infoで入手できる。 To build a conversational agent that interacts fluently with humans, previous studies blend knowledge or personal profile into the pre-trained language model. However, the model that considers knowledge and persona at the same time is still limited, leading to hallucination and a passive way of using personas. We propose an effective dialogue agent that grounds external knowledge and persona simultaneously. The agent selects the proper knowledge and persona to use for generating the answers with our candidate scoring implemented with a poly-encoder. Then, our model generates the utterance with lesser hallucination and more engagingness utilizing retrieval augmented generation with knowledge-persona enhanced query. We conduct experiments on the persona-knowledge chat and achieve state-of-the-art performance in grounding and generation tasks on the automatic metrics. Moreover, we validate the answers from the models regarding hallucination and engagingness through human evaluation and qualitative results. 
We show our retriever's effectiveness in extracting relevant documents compared to previous retrievers, along with a comparison of multiple candidate scoring methods. Code is available at https://github.com/dlawjddn803/INFO	翻訳日:2023-01-09 23:25:36 公開日:2023-01-06
# SAIDS : 方言とサルカズムを指標とした感性分析の新しいアプローチ SAIDS: A Novel Approach for Sentiment Analysis Informed of Dialect and Sarcasm ( http://arxiv.org/abs/2301.02521v1 ) ライセンス: Link先を確認	Abdelrahman Kaseb and Mona Farouk	(参考訳) 感情分析はあらゆるソーシャルネットワークで不可欠な部分となり、意思決定者がユーザーの意見をほぼあらゆる側面から知ることができる。その重要性にもかかわらず、感傷的テキストの感情のような、感情分析の主要な課題の1つである複数の問題に直面する。本稿では,アラビア語ツイートの感情,皮肉,方言を予測する新しいシステム(SAIDS)を導入することで,この問題に対処する。 SAIDSは、感情を予測するために既知の情報として、皮肉と方言の予測を使用する。言語モデルとしてMARBERTを使用して文の埋め込みを生成し、それをサルカズムと方言モデルに渡し、3つのモデルの出力を連結して感情分析モデルに渡す。複数のシステム設計が実験され、報告された。 SAIDSはArSarcasm-v2データセットに適用され、感情分析タスクの最先端モデルを上回った。すべてのタスクを一緒にトレーニングすることで、SAIDSはそれぞれ75.98 FPN、59.09 F1スコア、71.13 F1スコアの感情分析、sarcasm detection、弁別識別を行う。システム設計は、他のタスクに依存する任意のタスクのパフォーマンスを向上させるために使用できる。 Sentiment analysis has become an essential part of every social network, as it enables decision-makers to know more about users' opinions in almost all aspects of life. Despite its importance, it encounters multiple issues, such as the sentiment of sarcastic text, which is one of the main challenges of sentiment analysis. This paper tackles this challenge by introducing a novel system (SAIDS) that predicts the sentiment, sarcasm and dialect of Arabic tweets. SAIDS uses its prediction of sarcasm and dialect as known information to predict the sentiment. It uses MARBERT as a language model to generate sentence embedding, then passes it to the sarcasm and dialect models, and then the outputs of the three models are concatenated and passed to the sentiment analysis model. Multiple system design setups were experimented with and reported. SAIDS was applied to the ArSarcasm-v2 dataset where it outperforms the state-of-the-art model for the sentiment analysis task. By training all tasks together, SAIDS achieves results of 75.98 FPN, 59.09 F1-score and 71.13 F1-score for sentiment analysis, sarcasm detection, and dialect identification respectively. The system design can be used to enhance the performance of any task which is dependent on other tasks.	翻訳日:2023-01-09 23:25:21 公開日:2023-01-06
# エンティティクラスタとしてのトピック: 言語モデルとグラフニューラルネットワークによるエンティティベースのトピック Topics as Entity Clusters: Entity-based Topics from Language Models and Graph Neural Networks ( http://arxiv.org/abs/2301.02458v1 ) ライセンス: Link先を確認	Manuel V. Loureiro, Steven Derby and Tri Kurniawan Wijaya	(参考訳) トピックモデルはコーパスの背後にある潜伏構造を明らかにすることを目的としている。トピックモデリングの文脈では、ほとんどの語彙は基礎となるトピックを明らかにするのに無関係であるか、関連する概念と強い関係を持ち、これらのトピックの解釈可能性に影響を与える。さらに、言語への依存や表現力の制限は、かなりの計算資源を必要とする。そこで本研究では,概念的実体を用いたクラスタベースのトピックモデリング手法を提案する。エンティティは、関係情報に富んだ現実世界の概念の言語に依存しない表現である。この目的のために、我々は実体のベクトル表現を抽出する。 (i)言語モデルを用いた百科事典 (ii)グラフニューラルネットワークを用いた知識ベース。我々は,この手法がコヒーレンシー指標の他の最先端トピックモデルより一貫して優れており,グラフベース埋め込みに符号化された明示的な知識は,言語モデルの文脈的埋め込みに符号化された暗黙的な知識よりも,より一貫性のあるトピックを提供することを示した。 Topic models aim to reveal the latent structure behind a corpus, typically conducted over a bag-of-words representation of documents. In the context of topic modeling, most vocabulary is either irrelevant for uncovering underlying topics or contains strong relationships with relevant concepts, impacting the interpretability of these topics. Furthermore, their limited expressiveness and dependency on language demand considerable computation resources. Hence, we propose a novel approach for cluster-based topic modeling that employs conceptual entities. Entities are language-agnostic representations of real-world concepts rich in relational information. To this end, we extract vector representations of entities from (i) an encyclopedic corpus using a language model; and (ii) a knowledge base using a graph neural network. We demonstrate that our approach consistently outperforms other state-of-the-art topic models across coherency metrics and find that the explicit knowledge encoded in the graph-based embeddings provides more coherent topics than the implicit knowledge encoded with the contextualized embeddings of language models.	翻訳日:2023-01-09 23:24:42 公開日:2023-01-06
# ハイブリッドディープラーニング技術(CNN+GRU)に基づく画像キャプションアルゴリズム An Image captioning algorithm based on the Hybrid Deep Learning Technique (CNN+GRU) ( http://arxiv.org/abs/2301.02440v1 ) ライセンス: Link先を確認	Rana Adnan Ahmad, Muhammad Azhar, Hina Sattar	(参考訳) エンコーダ-デコーダフレームワークによる画像キャプションは,CNNを主にエンコーダとして,LSTMをデコーダとして使用した過去10年間で著しく進歩している。単純な画像の正確さという点では驚くべき成果があるが、時間的複雑さと空間的複雑さの効率性には欠ける。さらに,多くの情報やオブジェクトを持つ複雑な画像の場合,このCNN-LSTMペアの性能は,画像に提示されるシーンのセマンティックな理解が欠如していることから,指数関数的に低下した。そこで,これらの問題を考慮し,CNN-GRUエンコーダ・デコーダ・フレームワークを提案する。デコーダの隠れた状態を考慮して、入力画像とその類似意味表現を再構成し、モデルトレーニング中に意味再構築器からの再構成スコアを確率と共に使用して、生成されたキャプションの品質を評価する。その結果、デコーダは、改良された意味情報を受け取り、キャプション生成プロセスが向上する。モデルテストでは、復元スコアとログライクフッドを組み合わせることで、最も適切なキャプションを選択することもできる。提案モデルでは,画像キャプションのための最先端のLSTM-A5モデルよりも,時間的複雑性と精度が優れている。 Image captioning with the encoder-decoder framework has shown tremendous advancement in the last decade, with a CNN mainly used as the encoder and an LSTM as the decoder. Despite such an impressive achievement in terms of accuracy on simple images, it falls short in terms of time and space complexity. In addition, in the case of complex images with a lot of information and objects, the performance of this CNN-LSTM pair degrades exponentially due to the lack of semantic understanding of the scenes presented in the images. Thus, to address these issues, we present a CNN-GRU encoder-decoder framework for a caption-to-image reconstructor that takes the semantic context, as well as the time complexity, into consideration. By taking the hidden states of the decoder into consideration, the input image and its similar semantic representations are reconstructed, and reconstruction scores from a semantic reconstructor are used in conjunction with the likelihood during model training to assess the quality of the generated caption. As a result, the decoder receives improved semantic information, enhancing the caption production process. 
During model testing, the reconstruction score can also be combined with the log-likelihood to choose the most appropriate caption. The suggested model outperforms the state-of-the-art LSTM-A5 model for image captioning in terms of time complexity and accuracy.	翻訳日:2023-01-09 23:24:24 公開日:2023-01-06
# IMKGA-SM:シーケンスモデリングによる解釈可能なマルチモーダル知識グラフ回答予測 IMKGA-SM: Interpretable Multimodal Knowledge Graph Answer Prediction via Sequence Modeling ( http://arxiv.org/abs/2301.02445v1 ) ライセンス: Link先を確認	Yilin Wen, Biao Luo and Yuqian Zhao	(参考訳) マルチモーダル知識グラフリンク予測は,マルチモーダルデータに対するリンク予測タスクの精度と効率を向上させることを目的としている。しかし、複雑なマルチモーダル情報やスパーストレーニングデータの場合、ほとんどの手法では解釈可能性と高い精度を同時に達成することは困難である。そこで本稿では,この課題に対処するために,多変量知識グラフ応答予測(imkga-sm)という新しいモデルを開発した。まず,マルチモーダル微細粒度融合法を提案し,vgg16とocr(optical character recognition)技術を用いて画像や画像からテキスト情報を効果的に抽出する。次に、知識グラフリンク予測タスクをオフライン強化学習マルコフ決定モデルとしてモデル化し、統一シーケンスフレームワークに抽象化する。対話的な知覚に基づく報酬期待機構と特別な因果的マスキング機構が設計され、クエリを推論パスに「変換」する。そこで,マルチモーダル最適化の問題点を軽減するために,自己回帰動的勾配調整機構を提案する。最後に、2つのデータセットが実験に採用され、一般的なSOTAベースラインが比較に使用される。その結果,開発したIMKGA-SMは,異なるサイズのマルチモーダルリンク予測データセット上でのSOTAベースラインよりもはるかに優れた性能が得られることがわかった。 Multimodal knowledge graph link prediction aims to improve the accuracy and efficiency of link prediction tasks for multimodal data. However, for complex multimodal information and sparse training data, it is usually difficult to achieve interpretability and high accuracy simultaneously for most methods. To address this difficulty, a new model is developed in this paper, namely Interpretable Multimodal Knowledge Graph Answer Prediction via Sequence Modeling (IMKGA-SM). First, a multi-modal fine-grained fusion method is proposed, and VGG16 and Optical Character Recognition (OCR) techniques are adopted to effectively extract text information from images and images. Then, the knowledge graph link prediction task is modelled as an offline reinforcement learning Markov decision model, which is then abstracted into a unified sequence framework. An interactive perception-based reward expectation mechanism and a special causal masking mechanism are designed, which "converts" the query into an inference path. Then, an autoregressive dynamic gradient adjustment mechanism is proposed to alleviate the insufficient problem of multimodal optimization. 
Finally, two datasets are adopted for experiments, and the popular SOTA baselines are used for comparison. The results show that the developed IMKGA-SM achieves much better performance than SOTA baselines on multimodal link prediction datasets of different sizes.	翻訳日:2023-01-09 23:23:32 公開日:2023-01-06
# 逐次依存型マルチタスク学習のためのタスク認識特徴抽出フレームワーク Task Aware Feature Extraction Framework for Sequential Dependence Multi-Task Learning ( http://arxiv.org/abs/2301.02494v1 ) ライセンス: Link先を確認	Xuewen Tao and Mingming Ha and Xiaobo Guo and Qiongxu Ma and Hongwei Cheng and Wenfang Lin	(参考訳) マルチタスク学習(mtl)は多くの現実世界のアプリケーションでうまく実装されており、単一のモデルで複数のタスクを同時に解決することを目指している。マルチタスク学習の一般的な考え方は、全タスクのパフォーマンスを改善するために、グローバルパラメータ共有機構とタスク固有の特徴抽出器を設計することである。しかし、タスク間のシーケンシャルな依存は滅多に研究されていないが、オンラインのオンラインレコメンデーション(インプレッション、クリック、コンバージョンなど)で頻繁に発生する。この問題に関する理論的研究はほとんどなく、ほとんどのMTL手法で採用されているバイアス最適化オブジェクトはオンライン性能を劣化させる。さらに、さまざまなタスク間のトレードオフのバランスを保ち、共通および特定の表現を効果的に学習する上でも課題は残る。本稿では,まず,厳密な数学的観点から逐次依存度mtlを解析し,不偏最適化対象として依存度タスク学習損失を設計する。また,逐次依存型MTLのためのタスク認識特徴抽出(TAFE)フレームワークを提案する。オフラインデータセットとオンラインa/b実装に関する広範な実験により,提案手法の有効性が実証された。 Multi-task learning (MTL), which aims to solve multiple tasks simultaneously with a single model, has been successfully implemented in many real-world applications. The general idea of multi-task learning is to design global parameter-sharing mechanisms and task-specific feature extractors to improve the performance of all tasks. However, sequential dependence between tasks is rarely studied but frequently encountered in e-commerce online recommendation, e.g., impression, click, and conversion on a displayed product. There is little theoretical work on this problem, and the biased optimization objective adopted in most MTL methods deteriorates online performance. Besides, challenges remain in balancing the trade-off between various tasks and effectively learning common and specific representations. In this paper, we first analyze sequential dependence MTL from a rigorous mathematical perspective and design a dependence task learning loss to provide an unbiased optimization objective. 
We also propose a Task Aware Feature Extraction (TAFE) framework for sequential dependence MTL, which makes it possible to selectively reconstruct implicit shared representations from a sample-wise view and extract explicit task-specific information in a more efficient way. Extensive experiments on offline datasets and online A/B implementation demonstrate the effectiveness of our proposed TAFE.	翻訳日:2023-01-09 23:23:12 公開日:2023-01-06
# GNNによる乗客需要予測 GNN-based Passenger Request Prediction ( http://arxiv.org/abs/2301.02515v1 ) ライセンス: Link先を確認	Aqsa Ashraf Makhdomi and Iqra Altaf Gillani	(参考訳) 乗客の要求予測は、配車プラットフォームにおける運用計画、制御、管理に不可欠である。需要予測問題は広く研究されているが、乗客のOrigin-Destination(OD)フロー予測は研究コミュニティからはあまり注目されていない。本稿では,乗客のodフローを予測するための注意機構とともに,グラフニューラルネットワークフレームワークを開発した。提案フレームワークでは,異なる場所からの要求間で発生する線形および非線形のさまざまな依存関係を活用し,その場所の繰り返しパターンとコンテキストデータをキャプチャする。さらに、道路網を網羅し、モデルの複雑さと精度を維持するグリッドセルの最適サイズを決定する。提案手法の特徴と各種成分を明らかにするため,広範なシミュレーションを行った。その結果,提案モデルが既存のベースラインよりも優れた性能を示すことができた。 Passenger request prediction is essential for operations planning, control, and management in ride-sharing platforms. While the demand prediction problem has been studied extensively, the Origin-Destination (OD) flow prediction of passengers has received less attention from the research community. This paper develops a Graph Neural Network framework along with the Attention Mechanism to predict the OD flow of passengers. The proposed framework exploits various linear and non-linear dependencies that arise among requests originating from different locations and captures the repetition pattern and the contextual data of that place. Moreover, the optimal size of the grid cell that covers the road network and preserves the complexity and accuracy of the model is determined. Extensive simulations are conducted to examine the characteristics of our proposed approach and its various components. The results show the superior performance of our proposed model compared to the existing baselines.	翻訳日:2023-01-09 23:22:48 公開日:2023-01-06
# パールの反実的手法による反実的説明の評価 Evaluating counterfactual explanations using Pearl's counterfactual method ( http://arxiv.org/abs/2301.02499v1 ) ライセンス: Link先を確認	Bevan I. Smith	(参考訳) 対実的な説明 (CE) は、異なる望ましい結果を生み出す別のシナリオを生成する方法である。例えば、もし学生がコースを失敗すると予測された場合、反実的な説明は生徒が合格すると予測される別の方法を与えることができる。アプリケーションはたくさんあります。しかし、CEは必ずしもデータの真の因果構造を考慮していない機械学習モデルから現在生成されている。これにより、CE量にバイアスを導入することができる。本研究は,これまでのところ反事実説明(ce)文献に見られない,judea pearlの反事実計算法を用いてcesをテストするためのものである。さらに,これらのCEを3つの異なる因果構造上で評価し,その根底構造が生成するCEにどのように影響するかを示す。本研究は,pearl法を用いてcesを評価する方法を示し,(限られたサンプルサイズではあるが)cesの30%がpearl法で計算したものと矛盾していることを示した。このことは、CEを単に信頼できないことを示し、元の機械学習モデルを使ってカウンターファクトを盲目的に計算する前に、真の因果構造を知ることが不可欠であることを示している。 Counterfactual explanations (CEs) are methods for generating an alternative scenario that produces a different desirable outcome. For example, if a student is predicted to fail a course, then counterfactual explanations can provide the student with alternate ways so that they would be predicted to pass. The applications are many. However, CEs are currently generated from machine learning models that do not necessarily take into account the true causal structure in the data. By doing this, bias can be introduced into the CE quantities. I propose in this study to test the CEs using Judea Pearl's method of computing counterfactuals which has thus far, surprisingly, not been seen in the counterfactual explanation (CE) literature. I furthermore evaluate these CEs on three different causal structures to show how the true underlying causal structure affects the CEs that are generated. This study presented a method of evaluating CEs using Pearl's method and it showed, (although using a limited sample size), that thirty percent of the CEs conflicted with those computed by Pearl's method. This shows that we cannot simply trust CEs and it is vital for us to know the true causal structure before we blindly compute counterfactuals using the original machine learning model.	翻訳日:2023-01-09 23:22:36 公開日:2023-01-06
# architect, regularize and replay (arr): 継続的学習のための柔軟なハイブリッドアプローチ Architect, Regularize and Replay (ARR): a Flexible Hybrid Approach for Continual Learning ( http://arxiv.org/abs/2301.02464v1 ) ライセンス: Link先を確認	Vincenzo Lomonaco, Lorenzo Pellegrini, Gabriele Graffieti, Davide Maltoni	(参考訳) 近年,機械学習手法,特に深層表現学習に対する関心が高まり,基本的な仮定を克服し,様々な分布シフトやサンプル選択バイアスを受ける非定常環境への取り組みが見られた。この文脈では、アーキテクチャの優先順位、レギュラライザ、リプレイポリシーに基づくいくつかの計算アプローチが、それらが開発され評価された特定のシナリオによって異なる成功度で提案されている。しかし、柔軟かつ一般的に調整可能な効率効率性トレードオフに適用できる包括的なハイブリッドソリューションを設計することは、まだ遠い目標に思える。本稿では、古典的シナリオ(例えば、クラス増分学習)における最先端の成果を達成し、CIFAR-100、CORe50、ImageNet-1000などの実世界のデータセットから生成された任意のデータストリームに一般化できるAR1アルゴリズムとその変種をハイブリッドで一般化した「アーキテクチャ、規則化、再生」を提案する。 In recent years we have witnessed a renewed interest in machine learning methodologies, especially for deep representation learning, that could overcome basic i.i.d. assumptions and tackle non-stationary environments subject to various distributional shifts or sample selection biases. Within this context, several computational approaches based on architectural priors, regularizers and replay policies have been proposed with different degrees of success depending on the specific scenario in which they were developed and assessed. However, designing comprehensive hybrid solutions that can flexibly and generally be applied with tunable efficiency-effectiveness trade-offs still seems a distant goal. In this paper, we propose "Architect, Regularize and Replay" (ARR), an hybrid generalization of the renowned AR1 algorithm and its variants, that can achieve state-of-the-art results in classic scenarios (e.g. class-incremental learning) but also generalize to arbitrary data streams generated from real-world datasets such as CIFAR-100, CORe50 and ImageNet-1000.	翻訳日:2023-01-09 23:16:25 公開日:2023-01-06
# 大型類人猿行動のトリプルストリームディープメトリック学習 Triple-stream Deep Metric Learning of Great Ape Behavioural Actions ( http://arxiv.org/abs/2301.02642v1 ) ライセンス: Link先を確認	Otto Brookes, Majid Mirmehdi, Hjalmar K\"uhl, Tilo Burghardt	(参考訳) 本稿では,類人猿の行動認識のための最初のメトリック学習システムを提案する。提案手法は,DensePose-Cチンパンジーのボディー部分分割ストリームの利用により,従来のRGBの外観や光フローストリームを効果的に補完することを示す。異なる特徴融合手法と長い尾認識手法を用いてシステム変異を評価した。 PanAf-500データセットでは、9つの動作アクションに対して180,000のアノテートフレームが手作業で記述されているため、トップ1の精度が約12%向上した。さらに,本研究の結果を定性的に分析し,そのデータを用いた文献と比較して,クラス毎の平均精度が約23%向上できることを示すロングテール認識手法を用いて,メートル法学習システムを強化した。最後に、埋め込み空間はメートル法として構築されるので、新しい幾何学とトポロジーを示す巨大な猿の行動行動空間の最初のデータ駆動可視化を提供する。この研究が、絶滅危惧猿の利益のために、コンピュータビジョンのこの重要な応用分野へのさらなる関心を喚起することを願っている。 We propose the first metric learning system for the recognition of great ape behavioural actions. Our proposed triple stream embedding architecture works on camera trap videos taken directly in the wild and demonstrates that the utilisation of an explicit DensePose-C chimpanzee body part segmentation stream effectively complements traditional RGB appearance and optical flow streams. We evaluate system variants with different feature fusion techniques and long-tail recognition approaches. Results and ablations show performance improvements of ~12% in top-1 accuracy over previous results achieved on the PanAf-500 dataset containing 180,000 manually annotated frames across nine behavioural actions. Furthermore, we provide a qualitative analysis of our findings and augment the metric learning system with long-tail recognition techniques showing that average per class accuracy -- critical in the domain -- can be improved by ~23% compared to the literature on that dataset. Finally, since our embedding spaces are constructed as metric, we provide first data-driven visualisations of the great ape behavioural action spaces revealing emerging geometry and topology. We hope that the work sparks further interest in this vital application area of computer vision for the benefit of endangered great apes.	
翻訳日:2023-01-09 23:15:24 公開日:2023-01-06
# tarvis: ターゲットベースのビデオセグメンテーションのための統一アプローチ TarViS: A Unified Approach for Target-based Video Segmentation ( http://arxiv.org/abs/2301.02657v1 ) ライセンス: Link先を確認	Ali Athar, Alexander Hermans, Jonathon Luiten, Deva Ramanan, Bastian Leibe	(参考訳) ビデオセグメンテーションの一般的なドメインは、現在複数のベンチマークにまたがる異なるタスクに断片化されている。最先端技術の急速な進歩にもかかわらず、現在の手法は圧倒的にタスク固有であり、概念的には他のタスクに一般化できない。マルチタスク機能を備えた最近のアプローチにインスパイアされたTarViSは、ビデオ内の任意に定義された「ターゲット」の集合をセグメント化する必要のあるタスクに適用可能な、新しく統一されたネットワークアーキテクチャである。我々のアプローチは、タスクがこれらのターゲットをどのように定義するかに関して柔軟であり、後者を抽象的な「クエリ」としてモデル化し、ピクセル精度の高いターゲットマスクを予測するのに使用される。単一のTarViSモデルは、異なるタスクにまたがるデータセットのコレクションを共同でトレーニングすることができ、タスク固有のリトレーニングなしで、推論中にタスク間のホットスワップを行うことができる。有効性を示すために,ビデオインスタンスセグメンテーション(VIS),ビデオパノプティクスセグメンテーション(VPS),ビデオオブジェクトセグメンテーション(VOS),ポイントインテンプラ誘導トラッキング(PET)の4つのタスクにTarViSを適用した。これら4つのタスクにまたがる5/7ベンチマークの最先端性能と,残りの2つのタスクの競合性能を実現する。 The general domain of video segmentation is currently fragmented into different tasks spanning multiple benchmarks. Despite rapid progress in the state-of-the-art, current methods are overwhelmingly task-specific and cannot conceptually generalize to other tasks. Inspired by recent approaches with multi-task capability, we propose TarViS: a novel, unified network architecture that can be applied to any task that requires segmenting a set of arbitrarily defined 'targets' in video. Our approach is flexible with respect to how tasks define these targets, since it models the latter as abstract 'queries' which are then used to predict pixel-precise target masks. A single TarViS model can be trained jointly on a collection of datasets spanning different tasks, and can hot-swap between tasks during inference without any task-specific retraining. To demonstrate its effectiveness, we apply TarViS to four different tasks, namely Video Instance Segmentation (VIS), Video Panoptic Segmentation (VPS), Video Object Segmentation (VOS) and Point Exemplar-guided Tracking (PET). 
Our unified, jointly trained model achieves state-of-the-art performance on 5/7 benchmarks spanning these four tasks, and competitive performance on the remaining two.	翻訳日:2023-01-09 23:15:09 公開日:2023-01-06
# 深層学習駆動サルエント領域における有効なp値 Valid P-Value for Deep Learning-Driven Salient Region ( http://arxiv.org/abs/2301.02437v1 ) ライセンス: Link先を確認	Daiki Miwa, Vo Nguyen Le Duy, Ichiro Takeuchi	(参考訳) 深層学習モデルの予測を解釈し,説明するために,様々なサリエンシマップ手法が提案されている。精度マップにより、入力信号のどの部分が予測結果に強い影響を与えるかを解釈できる。しかし、深層学習モデルにおける複雑な計算によってサリエンシマップが得られるため、サリエンシマップ自体の信頼性を知ることはしばしば困難である。そこで本研究では,p値の形で有意な領域の信頼性を定量化する手法を提案する。本研究は,訓練された深層学習モデルによって選択された仮説としてサルエント領域を考察し,選択推論フレームワークを採用することを目的とする。提案手法は,有意な領域の偽陽性検出の確率を確実に制御できる。提案手法の有効性を,合成データセットと実データセットの数値例を用いて示す。さらに,提案するCNNに対して,実装コストを伴わずに選択推論を行うKerasベースのフレームワークを開発した。 Various saliency map methods have been proposed to interpret and explain predictions of deep learning models. Saliency maps allow us to interpret which parts of the input signals have a strong influence on the prediction results. However, since a saliency map is obtained by complex computations in deep learning models, it is often difficult to know how reliable the saliency map itself is. In this study, we propose a method to quantify the reliability of a salient region in the form of p-values. Our idea is to consider a salient region as a selected hypothesis by the trained deep learning model and employ the selective inference framework. The proposed method can provably control the probability of false positive detections of salient regions. We demonstrate the validity of the proposed method through numerical examples in synthetic and real datasets. Furthermore, we develop a Keras-based framework for conducting the proposed selective inference for a wide class of CNNs without additional implementation cost.	翻訳日:2023-01-09 23:14:43 公開日:2023-01-06
# "No, to the Right" --共有自律性によるロボット操作のためのオンライン言語修正 "No, to the Right" -- Online Language Corrections for Robotic Manipulation via Shared Autonomy ( http://arxiv.org/abs/2301.02555v1 ) ライセンス: Link先を確認	Yuchen Cui and Siddharth Karamcheti and Raj Palleti and Nidhya Shivakumar and Percy Liang and Dorsa Sadigh	(参考訳) 言語誘導型ロボットインタラクションのためのシステムは、適応性と学習効率の2つの重要なデシダータを満たす必要がある。残念ながら、既存のインストラクションフォローエージェントは適応できず、オンライン自然言語監視を組み込む能力が欠如している。本研究では,実行中にオンラインで与えられる自然言語の修正(「右へ」や「いや、本の方へ」など)を取り入れて適応するためのフレームワークであるLanguage-Informed Latent Actions with Corrections (LILAC) を提示することで,これらの問題に対処する。我々は共有自律性パラダイムの中でリッチ操作ドメインを探求する。言語は、人間がロボットをガイドするために使用できる有意義で低次元の制御空間を生成する学習されたモデルへの入力です。それぞれのリアルタイム補正は、人間のコントロール空間を洗練し、正確で拡張された動作を可能にします。我々は,Franka Emika Pandaマニピュレータを用いて複雑な操作作業を行うユーザスタディを通じて,我々のアプローチを評価する。オープンループ指導とシングルターン共有自律の両方を対象とする既存の学習ベースラインと比較して,我々の修正認識アプローチはタスク完了率が高く,信頼性,正確性,使いやすさから,ユーザによって主観的に好まれることを示す。 Systems for language-guided human-robot interaction must satisfy two key desiderata for broad adoption: adaptivity and learning efficiency. Unfortunately, existing instruction-following agents cannot adapt, lacking the ability to incorporate online natural language supervision, and even if they could, require hundreds of demonstrations to learn even simple policies. In this work, we address these problems by presenting Language-Informed Latent Actions with Corrections (LILAC), a framework for incorporating and adapting to natural language corrections - "to the right," or "no, towards the book" - online, during execution. We explore rich manipulation domains within a shared autonomy paradigm. Instead of discrete turn-taking between a human and robot, LILAC splits agency between the human and robot: language is an input to a learned model that produces a meaningful, low-dimensional control space that the human can use to guide the robot. 
Each real-time correction refines the human's control space, enabling precise, extended behaviors - with the added benefit of requiring only a handful of demonstrations to learn. We evaluate our approach via a user study where users work with a Franka Emika Panda manipulator to complete complex manipulation tasks. Compared to existing learned baselines covering both open-loop instruction following and single-turn shared autonomy, we show that our corrections-aware approach obtains higher task completion rates, and is subjectively preferred by users because of its reliability, precision, and ease of use.	翻訳日:2023-01-09 23:14:27 公開日:2023-01-06
# フィードバックゲーテッド整流線形単位 Feedback-Gated Rectified Linear Units ( http://arxiv.org/abs/2301.02610v1 ) ライセンス: Link先を確認	Marco Kemmerling	(参考訳) フィードバック接続は人間の脳において重要な役割を果たすが、ニューラルネットワーク研究ではあまり注目されていない。ここでは, 整流線形ユニットをゲートとする生物学的フィードバック機構を提案する。 MNISTデータセットでは、フィードバックのあるオートエンコーダは、フィードバックのないものに比べて、より高速な収束、パフォーマンスの向上、ノイズに対する堅牢性を示している。 CIFAR-10データセットにフィードバックのあるネットワークを適用すると、顕著さや一貫性は劣るものの、いくつかの利点が観察できる。 Feedback connections play a prominent role in the human brain but have not received much attention in artificial neural network research. Here, a biologically inspired feedback mechanism which gates rectified linear units is proposed. On the MNIST dataset, autoencoders with feedback show faster convergence, better performance, and more robustness to noise compared to their counterparts without feedback. Some benefits, although less pronounced and less consistent, can be observed when networks with feedback are applied on the CIFAR-10 dataset.	翻訳日:2023-01-09 23:13:59 公開日:2023-01-06
# TWR-MCAE: 壁面レーダーによる人体動作認識のためのデータ拡張手法 TWR-MCAE: A Data Augmentation Method for Through-the-Wall Radar Human Motion Recognition ( http://arxiv.org/abs/2301.02488v1 ) ライセンス: Link先を確認	Weicheng Gao, Xiaopeng Yang, Xiaodong Qu, Tian Lan	(参考訳) 壁面減衰,マルチパス効果,システム干渉による壁面レーダ(twr)の人間動作の精度低下と収束時間の延長という課題を解決するため,マルチリンク自動符号化ニューラルネットワーク(twr-mcae)データ拡張法を提案する。特に、TWR-MCAEアルゴリズムは、特異値分解(SVD)ベースのデータ前処理モジュール、改良された座標注意モジュール、圧縮検出可能な反復収縮しきい値再構成アルゴリズム(LISTA)モジュール、適応重みモジュールで共同構築される。データ前処理モジュールは、壁クラッタ、人の動き特徴、ノイズサブスペース分離を実現する。改良された座標注意モジュールは、クラッタおよびノイズ抑制を実現する。 LISTAモジュールはヒトの運動特徴増強を実現する。適応重み加群は重みを学び、3つの部分空間を融合する。 TWR-MCAEは壁クラッタの低ランク特性を抑制でき、同時に人の動きの空間特性を高めることができる。分類ステップの前にリンクすることで、他の事前知識を追加したり、より多くのデータを再収集することなく、特徴抽出能力を改善することができる。実験により,提案手法はピーク信号対雑音比(psnr)が向上し,認識精度が向上し,バックエンド分類器の学習プロセスを高速化することを示した。 To solve the problems of reduced accuracy and prolonging convergence time of through-the-wall radar (TWR) human motion due to wall attenuation, multipath effect, and system interference, we propose a multilink auto-encoding neural network (TWR-MCAE) data augmentation method. Specifically, the TWR-MCAE algorithm is jointly constructed by a singular value decomposition (SVD)-based data preprocessing module, an improved coordinate attention module, a compressed sensing learnable iterative shrinkage threshold reconstruction algorithm (LISTA) module, and an adaptive weight module. The data preprocessing module achieves wall clutter, human motion features, and noise subspaces separation. The improved coordinate attention module achieves clutter and noise suppression. The LISTA module achieves human motion feature enhancement. The adaptive weight module learns the weights and fuses the three subspaces. The TWR-MCAE can suppress the low-rank characteristics of wall clutter and enhance the sparsity characteristics in human motion at the same time. It can be linked before the classification step to improve the feature extraction capability without adding other prior knowledge or recollecting more data. 
Experiments show that the proposed algorithm gets a better peak signal-to-noise ratio (PSNR), which increases the recognition accuracy and speeds up the training process of the back-end classifiers.	翻訳日:2023-01-09 23:13:51 公開日:2023-01-06
# semantic match: ヘルスケアのためのxaiのデバッギング機能帰属メソッド Semantic match: Debugging feature attribution methods in XAI for healthcare ( http://arxiv.org/abs/2301.02080v2 ) ライセンス: Link先を確認	Giovanni Cin\`a, Tabea E. R\"ober, Rob Goedhart, \c{S}. \.Ilker Birbil	(参考訳) 最近、医療用の認証人工知能(AI)ツールが急増し、この技術の採用に関する議論が再燃している。このような議論の1つのスレッドは、説明可能なAI(XAI)と、AIデバイスをより透明で信頼性の高いものにすることの約束に関するものだ。医療AI分野で活動している一部の声は、説明可能なAI技術、特に特徴帰属手法の信頼性に関する懸念を表明し、その使用とガイドラインや標準への含意を疑問視している。画像データに固有の問題を一般化することにより, 保温後の局所的説明可能性に関する既存の批判は, 浴水で赤ちゃんを投げ捨てるものである, と論じる。まず、その問題を説明と人間の理解のセマンティックマッチの欠如として特徴づける。機能の重要度がいつ確実に使用できるのかを理解するため、低レベルと高レベルの機能の重要度を区別する。 EHR(Electronic Health Records)のような表層データのような,低レベルの機能に明確なセマンティクスが付与されたデータタイプに対しては,セマンティクスマッチングが実現可能であるため,機能属性手法を有意義かつ有用な方法で使用することが可能である,と論じる。 The recent spike in certified Artificial Intelligence (AI) tools for healthcare has renewed the debate around adoption of this technology. One thread of such debate concerns Explainable AI (XAI) and its promise to render AI devices more transparent and trustworthy. A few voices active in the medical AI space have expressed concerns on the reliability of Explainable AI techniques and especially feature attribution methods, questioning their use and inclusion in guidelines and standards. Despite valid concerns, we argue that existing criticism on the viability of post-hoc local explainability methods throws away the baby with the bathwater by generalizing a problem that is specific to image data. We begin by characterizing the problem as a lack of semantic match between explanations and human understanding. To understand when feature importance can be used reliably, we introduce a distinction between feature importance of low- and high-level features. We argue that for data types where low-level features come endowed with a clear semantics, such as tabular data like Electronic Health Records (EHRs), semantic match can be obtained, and thus feature attribution methods can still be employed in a meaningful and useful way.	
翻訳日:2023-01-09 23:06:15 公開日:2023-01-06
# 未知のユーティリティ関数によるネットワークユーティリティの最大化: 分散データ駆動バイレベル最適化アプローチ Network Utility Maximization with Unknown Utility Functions: A Distributed, Data-Driven Bilevel Optimization Approach ( http://arxiv.org/abs/2301.01801v2 ) ライセンス: Link先を確認	Kaiyi Ji and Lei Ying	(参考訳) 公平なリソース割り当ては、通信ネットワークにおける最も重要なトピックの1つである。既存のソリューションはほとんどの場合、各ユーザユーティリティ関数が知られて凹凸であると仮定する。本稿では,ユーティリティ機能が未知である場合,ユーザに対してどのようにリソースを割り当てるか,という問いに答える。この答えは、ユーザユーティリティが複雑でクローズドフォームが入手困難である、次世代のAI対応通信ネットワークにおいてますます重要になっている。本稿では,分散およびデータ駆動の双方向最適化手法を用いて,分散ネットワークユーティリティ最大化(NUM)アルゴリズムと,データ駆動学習アルゴリズムを用いて,真のネットワークユーティリティの総和を最大化するための最良サロゲートユーティリティ関数を求める。提案アルゴリズムは、データサンプル(ユーティリティ値または勾配値)から学習し、サロゲートユーティリティ関数を自動チューニングして真のネットワークユーティリティを最大化するので、未知のユーティリティ関数で機能する。一般ネットワークでは,提案アルゴリズムの非凹凸汎関数による非漸近収束率を定式化する。提案手法の有効性を実世界のネットワークで検証し,提案手法の有効性を検証した。 Fair resource allocation is one of the most important topics in communication networks. Existing solutions almost exclusively assume each user utility function is known and concave. This paper seeks to answer the following question: how to allocate resources when utility functions are unknown, even to the users? This answer has become increasingly important in the next-generation AI-aware communication networks where the user utilities are complex and their closed-forms are hard to obtain. In this paper, we provide a new solution using a distributed and data-driven bilevel optimization approach, where the lower level is a distributed network utility maximization (NUM) algorithm with concave surrogate utility functions, and the upper level is a data-driven learning algorithm to find the best surrogate utility functions that maximize the sum of true network utility. The proposed algorithm learns from data samples (utility values or gradient values) to autotune the surrogate utility functions to maximize the true network utility, so works for unknown utility functions. 
For the general network, we establish the nonasymptotic convergence rate of the proposed algorithm with nonconcave utility functions. The simulations validate our theoretical results and demonstrate the great effectiveness of the proposed method in a real-world network.	翻訳日:2023-01-09 23:05:52 公開日:2023-01-06
# 質問応答としての感情因果対抽出 Emotion-Cause Pair Extraction as Question Answering ( http://arxiv.org/abs/2301.01982v2 ) ライセンス: Link先を確認	Huu-Hiep Nguyen and Minh-Tien Nguyen	(参考訳) Emotion-Cause Pair extract (ECPE) のタスクは、感情や原因節のアノテーションなしで、文書の潜在的な感情のペアを抽出することを目的としている。従来のECPEのアプローチでは、複雑なアーキテクチャを用いて感情による相互作用をモデル化し、従来の2段階処理方式を改良しようと試みてきた。本稿では,質問応答(QA)問題にECPEタスクを投入し,それに取り組むための単純かつ効果的なBERTベースのソリューションを提案する。文書が与えられた場合、ガイド-QAモデルはまず、固定された質問を用いて最適な感情節を予測する。次に、予測された感情は、感情の最も潜在的な原因を予測する質問として使用される。我々は,標準ECPEコーパスでモデルを評価する。実験の結果, 単純性にもかかわらず, 有望な結果が得られ, 容易に再現できることが示唆された。 Guided-QAのコードも提供される。 The task of Emotion-Cause Pair Extraction (ECPE) aims to extract all potential emotion-cause pairs of a document without any annotation of emotion or cause clauses. Previous approaches on ECPE have tried to improve conventional two-step processing schemes by using complex architectures for modeling emotion-cause interaction. In this paper, we cast the ECPE task to the question answering (QA) problem and propose simple yet effective BERT-based solutions to tackle it. Given a document, our Guided-QA model first predicts the best emotion clause using a fixed question. Then the predicted emotion is used as a question to predict the most potential cause for the emotion. We evaluate our model on a standard ECPE corpus. The experimental results show that despite its simplicity, our Guided-QA achieves promising results and is easy to reproduce. The code of Guided-QA is also provided.	翻訳日:2023-01-09 23:05:34 公開日:2023-01-06
# 映像言語課題のための学習軌跡単語アライメント Learning Trajectory-Word Alignments for Video-Language Tasks ( http://arxiv.org/abs/2301.01953v2 ) ライセンス: Link先を確認	Xu Yang, Zhangzikang Li, Haiyang Xu, Hanwang Zhang, Qinghao Ye, Chenliang Li, Ming Yan, Yu Zhang, Fei Huang, Songfang Huang	(参考訳) Image-Language BERT (IL-BERT) と Video-Language BERT (VDL-BERT) では、言葉でオブジェクトを調整することが重要な役割を果たす。オブジェクトがいくつかの空間的パッチをカバーしている場合とは異なり、ビデオ内のオブジェクトは通常、オブジェクトの軌道として現れる、すなわち、いくつかの空間的だがより長い時間的パッチにまたがるので、豊富な時空間的コンテキストを含む。しかしながら、現代のVDL-BERTは、通常、パッチ・トゥ・ワード(P2W)の注意を配置するためにIL-BERTに従うというこの軌跡を無視する一方、そのような注意は、自明な空間的コンテキストを過度に露出し、時間的文脈を無視する。そこで本稿では,ビデオ言語タスクを解くためのトラジェクティブ・ワードアライメントを学習するための新しいTW-BERTを提案する。このようなアライメントは、新しく設計されたt2wの注意によって学習される。また,従来のVDL-BERTを追従して,モーダルエンコーダにワード・トゥ・パッチ(W2P)の注意を設定する。 T2WとW2Pの注意は多様であるため、我々のクロスモーダルエンコーダは非対称である。この非対称なクロスモーダルエンコーダが堅牢な視覚言語アソシエーションを構築するのに役立ち、ビデオやテキストエンコーダによって計算された埋め込み空間を閉じるための粒度の 'align-before-fuse'' 戦略を提案する。提案した戦略とT2Wの注目により、我々のTW-BERTは、テキストからビデオまでの検索タスクにおけるSOTAパフォーマンスと、より多くのデータで訓練されたVDL-BERTを用いたビデオ質問応答タスクにおける同等のパフォーマンスを達成する。コードは補足資料で入手できます。 Aligning objects with words plays a critical role in Image-Language BERT (IL-BERT) and Video-Language BERT (VDL-BERT). Different from the image case where an object covers some spatial patches, an object in a video usually appears as an object trajectory, i.e., it spans over a few spatial but longer temporal patches and thus contains abundant spatiotemporal contexts. However, modern VDL-BERTs neglect this trajectory characteristic that they usually follow IL-BERTs to deploy the patch-to-word (P2W) attention while such attention may over-exploit trivial spatial contexts and neglect significant temporal contexts. To amend this, we propose a novel TW-BERT to learn Trajectory-Word alignment for solving video-language tasks. Such alignment is learned by a newly designed trajectory-to-word (T2W) attention. Besides T2W attention, we also follow previous VDL-BERTs to set a word-to-patch (W2P) attention in the cross-modal encoder. 
Since T2W and W2P attentions have diverse structures, our cross-modal encoder is asymmetric. To further help this asymmetric cross-modal encoder build robust vision-language associations, we propose a fine-grained ``align-before-fuse'' strategy to pull close the embedding spaces calculated by the video and text encoders. By the proposed strategy and T2W attention, our TW-BERT achieves SOTA performances on text-to-video retrieval tasks, and comparable performances on video question answering tasks with some VDL-BERTs trained on much more data. The code will be available in the supplementary material.	翻訳日:2023-01-09 23:05:09 公開日:2023-01-06
# ハイパーパラメータ最適化による自律レースシステムのデータ駆動モデル同定 Data-Driven Model Identification via Hyperparameter Optimization for Autonomous Racing Systems ( http://arxiv.org/abs/2301.01470v2 ) ライセンス: Link先を確認	Hyunki Seong, Chanyoung Chung, and David Hyunchul Shim	(参考訳) 本稿では,ハイパーパラメータ最適化(MIHO)を用いたモデル同定手法を提案する。提案手法は,データ駆動最適化方式で動的モデルのパラメータを同定する効率的な探索探索戦略を採用する。フルスケールの自動運転車であるAV-21のモデルパラメータ同定にMIHOを利用する。次に、モデルベースの計画・制御システムの設計に最適化されたパラメータを組み込む。実験では、学習されたパラメトリックモデルは与えられたデータセットの適合性を示し、目に見えない動的シナリオにおける一般化能力を示す。さらに、モデルベースシステムを検証するために、広範囲なフィールドテストを実施します。テストの結果,インディアナポリス・モーター・スピードウェイとラスベガス・モーター・スピードウェイで,学習したモデル力学を活用し,障害物回避と200km/h以上の高速走行に成功した。 MIHOのソースコードとテストのビデオはhttps://github.com/hynkis/MIHOで公開されている。 In this letter, we propose a model identification method via hyperparameter optimization (MIHO). Our method adopts an efficient explore-exploit strategy to identify the parameters of dynamic models in a data-driven optimization manner. We utilize MIHO for model parameter identification of the AV-21, a full-scaled autonomous race vehicle. We then incorporate the optimized parameters for the design of model-based planning and control systems of our platform. In experiments, the learned parametric models demonstrate good fitness to given datasets and show generalization ability in unseen dynamic scenarios. We further conduct extensive field tests to validate our model-based system. The tests show that our race systems leverage the learned model dynamics and successfully perform obstacle avoidance and high-speed driving over $200 km/h$ at the Indianapolis Motor Speedway and Las Vegas Motor Speedway. The source code for MIHO and videos of the tests are available at https://github.com/hynkis/MIHO.	翻訳日:2023-01-09 23:04:31 公開日:2023-01-06
# アンチスクイージングによる量子ビット-光子相互作用の動的強化 Dynamically enhancing qubit-photon interactions with anti-squeezing ( http://arxiv.org/abs/2212.04991v1 ) ライセンス: Link先を確認	M. Villiers, W. C. Smith, A. Petrescu, A. Borgognoni, M. Delbecq, A. Sarlette, M. Mirrahimi, P. Campagne-Ibarcq, T. Kontos and Z. Leghtas	(参考訳) 発振器と量子ビットとの相互作用強度は、発振器の真空場変動とともに増大する。良く知られた縮退パラメトリック発振器は、その固有状態がスクイーズドフォック状態である強く離調したスクイージング領域への関心を復活させた。これらの増幅場ゆらぎにより、この振動子をスクイーズすることで量子ビット-光子相互作用を動的に促進することが最近提案されている。超伝導回路実験において、スクイージングの5.5dBにおいて、キュービットと発振器の分散相互作用の2倍の増大を観測し、キュービット-光子相互作用のその場動的制御を示す。この研究は、スクイーズド光子の振動子と量子ビットとの実験的カップリングを開始し、強化された相互作用を求める実験プラットフォームでの普及を慎重に後押しする。 The interaction strength of an oscillator to a qubit grows with the oscillator's vacuum field fluctuations. The well known degenerate parametric oscillator has revived interest in the regime of strongly detuned squeezing, where its eigenstates are squeezed Fock states. Owing to these amplified field fluctuations, it was recently proposed that squeezing this oscillator would dynamically boost qubit-photon interactions. In a superconducting circuit experiment, we observe a two-fold increase in the dispersive interaction between a qubit and an oscillator at 5.5 dB of squeezing, demonstrating in-situ dynamical control of qubit-photon interactions. This work initiates the experimental coupling of oscillators of squeezed photons to qubits, and cautiously motivates their dissemination in experimental platforms seeking enhanced interactions.	翻訳日:2023-01-09 17:30:45 公開日:2023-01-06
# 高振幅状態の連続可変量子トモグラフィ Continuous-variable quantum tomography of high-amplitude states ( http://arxiv.org/abs/2212.07406v1 ) ライセンス: Link先を確認	Ekaterina Fedotova, Nikolai Kuznetsov, Egor Tiunov, A. E. Ulanov and A. I. Lvovsky	(参考訳) 量子状態トモグラフィーは現代の量子技術の重要な構成要素である。電磁場のような連続可変高調波オシレータシステムに適用する場合、既存のトモグラフィー法は一般に離散基底で状態を再構成し、したがって比較的低い振幅とエネルギーを持つ状態に制限される。そこで,この制限を克服するために,フィードフォワードニューラルネットワークを用いて,密度行列を直接連続位置で取得する。このアプローチの重要な利点は、詳細な再構築のために位相空間内の特定の領域を選択できることです。これにより、状態振幅による再構成に必要なリソース量のスケーリングが比較的遅くなり、その結果、本手法でアクセス可能な振幅の範囲を劇的に増加させることができる。 Quantum state tomography is an essential component of modern quantum technology. In application to continuous-variable harmonic-oscilator systems, such as the electromagnetic field, existing tomography methods typically reconstruct the state in discrete bases, and are hence limited to states with relatively low amplitudes and energies. Here we overcome this limitation by utilizing a feed-forward neural network to obtain the density matrix directly in the continuous position basis. An important benefit of our approach is the ability to choose specific regions in the phase space for detailed reconstruction. This results in relatively slow scaling of the amount of resources required for the reconstruction with the state amplitude, and hence allows us to dramatically increase the range of amplitudes accessible with our method.	翻訳日:2023-01-09 14:31:05 公開日:2023-01-06

Title

Authors

Abstract

論文公表日・翻訳日

# 二成分量子チャネルの前方古典的容量の束縛

Bounding the forward classical capacity of bipartite quantum channels ( http://arxiv.org/abs/2010.01058v3 )

ライセンス: Link先を確認

Dawei Ding, Sumeet Khatri, Yihui Quek, Peter W. Shor, Xin Wang, Mark M. Wilde

(参考訳) 両部量子チャネルにおける前方古典通信の様々な方法を紹介する。点対点チャネルは二成分チャネルの特別な場合であるため、この測度は点対点チャネルの古典的通信の測度に還元される。その結果、これらの削減された測度は、wangらが量子チャネルの古典的容量の境界に関する以前の研究で報告されている。応用として、この測度は二部流路の前方古典的容量の上限であることを示す。減少測度は、古典的フィードバックチャネルによって支援される点対点量子チャネルの古典的容量の上界である。様々な測度のいくつかは半定義プログラミングによって計算できる。

We introduce various measures of forward classical communication for bipartite quantum channels. Since a point-to-point channel is a special case of a bipartite channel, the measures reduce to measures of classical communication for point-to-point channels. As it turns out, these reduced measures have been reported in prior work of Wang et al. on bounding the classical capacity of a quantum channel. As applications, we show that the measures are upper bounds on the forward classical capacity of a bipartite channel. The reduced measures are upper bounds on the classical capacity of a point-to-point quantum channel assisted by a classical feedback channel. Some of the various measures can be computed by semi-definite programming.

翻訳日:2023-04-30 04:01:53 公開日:2023-01-06

# 最適線形検出器の設計 --ボトムアップアプローチ-

Designing optimal linear detectors -- a bottom-up approach ( http://arxiv.org/abs/2110.07942v5 )

ライセンス: Link先を確認

Joe Bentley, Hendra Nurdin, Yanbei Chen, Xiang Li, Haixing Miao

(参考訳) 本稿では, 線形検出器を最適感度で実現するための系統的アプローチを開発し, 極弱信号の検出を可能にする。まず、一般的な制約は線形検出器の入出力伝達関数の特定のクラスに導かれる。すると、そのクラスにおける転送関数の物理的実現が量子ネットワーク合成技術を用いて見出され、入出力転送関数から直接物理セットアップを推測することができる。最小内部モード数を持つ最小限の実現法を探索することにより、最適検出器は内部スクイーズ方式であることが示される。そして、パリティ時間対称系に動機づけられた非最小実現を探索し、量子非退化測定を体系的に回収する。

This paper develops a systematic approach to realising linear detectors with an optimised sensitivity, allowing for the detection of extremely weak signals. First, general constraints are derived on a specific class of input-output transfer functions of a linear detector. Then a physical realization of transfer functions in that class is found using the quantum network synthesis technique, which allows for the inference of the physical setup directly from the input-output transfer function. By exploring a minimal realization which has the minimum number of internal modes, it is shown that the optimal such detectors are internal squeezing schemes. Then, investigating non-minimal realizations, which is motivated by the parity-time symmetric systems, a quantum non-demolition measurement is systematically recovered.

翻訳日:2023-03-11 10:10:31 公開日:2023-01-06

# ゼロ次元ボゾン系の合成空間における創発的非エルミート局在現象

Emergent non-Hermitian localization phenomena in the synthetic space of zero-dimensional bosonic systems ( http://arxiv.org/abs/2110.15286v6 )

ライセンス: Link先を確認

Ievgen I. Arkhipov, Fabrizio Minganti

(参考訳) 非エルミート系の相転移は、最先端の理論と実験的研究に焦点をあてている。一方、パリティ時間 ($\cal PT$-) と反$\cal PT$-対称物理学は、例外点 (EPs) と呼ばれる非エルミートスペクトル特異点の存在により、常に関心を集めている。一方、非エルミート系のトポロジカルおよび局在遷移は、例えば非エルミート皮膚効果や従来のバルク境界対応の欠如など、新しい現象を示す。従来の研究の大部分は、位相的および局在的遷移現象を示すために、微調整された拡張格子を必要とする非エルミートハミルトン系にのみ焦点をあてており、本研究では、非エルミート局所化現象が、ゼロ次元ボソニック系の合成場モーメント空間、例えば、反$\cal PT$ および $\cal PT$-symmetric 量子ダイマーにおいて自然にどのように現れるかを示す。これは低次元系の局所化遷移をシミュレートする機会を与え、例えば結合キャビティや導波路のような複雑な配列を構成する必要はない。実際、運動の場モーメント方程式は、1次元(1D)合成格子で動く等価な(準)粒子を記述することができる。この合成場モーメント空間は、高縮退EPの存在によって誘導される非エルミート皮膚効果のような非自明な局在現象を示すことができる。我々は,高次場モーメント固有空間をSylvester行列形状の合成1次元非エルミート・ハミルトニアンでエミュレートした反$\cal PT$-symmetric two-modeシステムの例を示した。この結果は、光子モーメントや相関関数を測定することにより、超伝導回路やトロイダル共振器などの最先端光学装置で直接検証することができる。

Phase transitions in non-Hermitian systems are at the focus of cutting-edge theoretical and experimental research. On the one hand, parity-time- ($\cal PT$-) and anti-$\cal PT$-symmetric physics have gained ever-growing interest, due to the existence of non-Hermitian spectral singularities called exceptional points (EPs). On the other, topological and localization transitions in non-Hermitian systems reveal new phenomena, e.g., the non-Hermitian skin effect and the absence of conventional bulk-boundary correspondence. The great majority of previous studies exclusively focus on non-Hermitian Hamiltonians, whose realization requires {\it a priori} fine-tuned extended lattices to exhibit topological and localization transition phenomena. In this work, we show how non-Hermitian localization phenomena can naturally emerge in the synthetic field moments space of zero-dimensional bosonic systems, e.g., in anti-$\cal PT$- and $\cal PT$-symmetric quantum dimers. This offers an opportunity to simulate localization transitions in low-dimensional systems, without the need to construct complex arrays of, e.g., coupled cavities or waveguides. Indeed, the field moment equations of motion can describe an equivalent (quasi-)particle moving in a one-dimensional (1D) synthetic lattice. This synthetic field moments space can exhibit nontrivial localization phenomena, such as the non-Hermitian skin effect, induced by the presence of highly-degenerate EPs. We demonstrate our findings on the example of an anti-$\cal PT$-symmetric two-mode system, whose higher-order field moments eigenspace is emulated by a synthetic 1D non-Hermitian Hamiltonian having a Sylvester matrix shape. Our results can be directly verified in state-of-the-art optical setups, such as superconducting circuits and toroidal resonators, by measuring photon moments or correlation functions.

翻訳日:2023-03-10 00:58:39 公開日:2023-01-06

# 多部門内在的非局所性とデバイス非依存会議鍵合意

Multipartite Intrinsic Non-Locality and Device-Independent Conference Key Agreement ( http://arxiv.org/abs/2111.02596v3 )

ライセンス: Link先を確認

Aby Philip, Eneet Kaur, Peter Bierhorst, and Mark M. Wilde

(参考訳) 本研究では,デバイス非依存(DI)会議鍵契約におけるマルチパートシナリオにおける資源の定量化手法として,マルチパーティ固有の非局所性を導入する。局所演算と共通乱数性と呼ばれる自由操作のクラスにおいて,多部固有の非局所性は加法,凸,単調であることを証明する。我々の技術的貢献の1つとして、我々は2種類の多成分相互情報の連鎖規則を確立し、多成分内在的非局所性が付加的であることを証明するために使用する。この連鎖規則は他の文脈において独立した関心を持つかもしれない。多部内在的非局所性(multipartite intrinsic non-locality)は、diカンファレンスキーアグリーメントの一般的な多部内在的非局所性(multipartite intrinsic non-locality)において、秘密鍵レートの上限となるものです。本稿では、DI会議鍵プロトコルの様々な例について論じ、これらのプロトコルの上限を既知の下限と比較する。最後に、di量子鍵分布の最近の実験的実現における上限を計算する。

In this work, we introduce multipartite intrinsic non-locality as a method for quantifying resources in the multipartite scenario of device-independent (DI) conference key agreement. We prove that multipartite intrinsic non-locality is additive, convex, and monotone under a class of free operations called local operations and common randomness. As one of our technical contributions, we establish a chain rule for two variants of multipartite mutual information, which we then use to prove that multipartite intrinsic non-locality is additive. This chain rule may be of independent interest in other contexts. All of these properties of multipartite intrinsic non-locality are helpful in establishing the main result of our paper: multipartite intrinsic non-locality is an upper bound on secret key rate in the general multipartite scenario of DI conference key agreement. We discuss various examples of DI conference key protocols and compare our upper bounds for these protocols with known lower bounds. Finally, we calculate upper bounds on recent experimental realizations of DI quantum key distribution.

翻訳日:2023-03-09 04:47:36 公開日:2023-01-06

# 量子エラー補正と耐故障性入門

Introduction to Quantum Error Correction and Fault Tolerance ( http://arxiv.org/abs/2111.08894v4 )

ライセンス: Link先を確認

Steven M. Girvin

(参考訳) 2019年のLes Houches Summer Schoolの講義ノートは、ビットと量子ビットによる古典的および量子的誤り訂正と連続的な可変系(高調波発振器)の導入を目的としている。後者の焦点は、超電導回路とマイクロ波光子に基づくモジュラーアーキテクチャによって、今日または近い将来に実現可能な実用的な例に焦点が当てられる。ゴールとビジョンは「ハードウエア効率」な量子エラー補正であり、実用的で有用なフォールトトレランスと回路深度を達成するために、指数関数的に大きなハードウェアオーバーヘッドを必要としない。

These lecture notes from the 2019 Les Houches Summer School on 'Quantum Information Machines' are intended to provide an introduction to classical and quantum error correction with bits and qubits, and with continuous variable systems (harmonic oscillators). The focus on the latter will be on practical examples that can be realized today or in the near future with a modular architecture based on superconducting electrical circuits and microwave photons. The goal and vision is 'hardware-efficient' quantum error correction that does not require exponentially large hardware overhead in order to achieve practical and useful levels of fault tolerance and circuit depth.

翻訳日:2023-03-07 22:01:26 公開日:2023-01-06

# 責任ある量子について語る: 認識は絶対最小であり...

Talking about responsible quantum: Awareness is the absolute minimum... that we need to do ( http://arxiv.org/abs/2112.01378v4 )

ライセンス: Link先を確認

Tara Roberson

(参考訳) 量子技術に関するハイプは、このセクターの社会的影響について議論を呼んだ。量子技術の責任ある開発を確実にするための呼び出しは、具体的なケーススタディや無責任量子の実例の欠如によって複雑になる。この段階では、責任量子はコリングリッジジレンマを思い起こさせる状況に直面している。このジレンマにおいて、社会的なリスクと利益に関する議論が最も影響のある瞬間は、最も少ない情報が得られる時間でもある。この課題の裏側は、セクターの軌道(および潜在的な問題)が閉じ込められる前に、量子の公共利益を調べるためのプロセスを構築する機会である。この分野での最近の研究は、量子研究者やイノベーターが不確実性や懸念に対処するために社会と協力する必要があると主張している。量子利害関係者の関与と責任観の理解により、この提案を支持し、量子技術の責任ある開発と利用に関するさらなる対話を可能にすることを目指す。

Hype over novel quantum technologies has prompted discussion on the likely societal impacts of the sector. Calls to ensure the responsible development of quantum technologies are complicated by a lack of concrete case studies or real-world examples of irresponsible quantum. At this stage, responsible quantum faces a situation reminiscent of the Collingridge dilemma. In this dilemma, the moment in which discussion on societal risks and benefits can be most impactful is also the time where the least information is available. The flipside of this challenge is an opportunity to build processes for examining the public good of quantum before the trajectory (and potential problems) of the sector become locked in. Recent work in this space has argued that quantum researchers and innovators must work with society to address uncertainties and concerns. By engaging quantum stakeholders and understanding their perspectives on responsibility, this paper seeks to support this proposition and enable further dialogue on responsible development and use of quantum technologies.

翻訳日:2023-03-06 04:24:55 公開日:2023-01-06

# 準巡回符号から構築した新しい二進量子符号

New Binary Quantum Codes Constructed from Quasi-Cyclic Codes ( http://arxiv.org/abs/2112.07137v3 )

ライセンス: Link先を確認

Chaofeng Guan, Ruihu Li, Liangdong Lu, Yu Yao

(参考訳) 量子符号は古典的シンプレクティック双対包含符号によって構成できることはよく知られている。本稿では,2世代準巡回符号のファミリーを考察し,これらの符号がシンプレクティックな二重包含となるための十分な条件を導出する。そこで,シンプレクティック双対包含符号を用いたバイナリ量子符号の構成法を提案する。アプリケーションとして、最もよく知られた結果を超える8つのバイナリ量子コードを構築します。さらに、伝播規則によってさらに36個の新しいバイナリ量子符号が得られ、いずれも最小距離における下限を改善する。

It is well known that quantum codes can be constructed by means of classical symplectic dual-containing codes. This paper considers a family of two-generator quasi-cyclic codes and derives sufficient conditions for these codes to be symplectic dual-containing. Then, a new method for constructing binary quantum codes using symplectic dual-containing codes is proposed. As an application, we construct 8 binary quantum codes that exceed the best-known results. Further, another 36 new binary quantum codes are obtained by propagation rules, all of which improve the lower bound on the minimum distances.

翻訳日:2023-03-04 14:31:02 公開日:2023-01-06

# ホログラフィ、セルレーション、誤り訂正符号

Holography, cellulations and error correcting codes ( http://arxiv.org/abs/2112.12468v2 )

ライセンス: Link先を確認

Marika Taylor, Charles Woodward

(参考訳) 双曲平面に関連する量子誤り訂正符号はads$_3$/cft$_2$対応の文脈で広く研究されている。本稿では,高次元のホログラフィックジオメトリに関連する符号の体系的研究を開始し,ジオメトリの空間断面のセルレーションと安定化符号を関連づける。本研究では,3次元双曲空間(AdS$_4$)に対するHaPPY符号の類似を,絶対最大絡み(AME)符号と非AME符号の両方を用いて構成する。これらの符号は双曲空間の均一な正則テッセレーションに基づいているが、テッセレーションのポリトープの離散対称性を保存するAME符号は2次元以上は存在しないことに留意する。また,論理情報が境界に関連付けられる双曲空間に対するスタビリサー符号の異なる構成を探索し,それらの解釈について考察する。双曲空間のトロイダル還元による重力-スカラー理論(JT重力など)に基づくホログラフィック双対の興味深いクラスに、我々の符号がどのように適用できるかを説明する。

Quantum error correction codes associated with the hyperbolic plane have been explored extensively in the context of the AdS$_3$/CFT$_2$ correspondence. In this paper we initiate a systematic study of codes associated with holographic geometries in higher dimensions, relating cellulations of the spatial sections of the geometries to stabiliser codes. We construct analogues of the HaPPY code for three-dimensional hyperbolic space (AdS$_4$), using both absolutely maximally entangled (AME) and non-AME codes. These codes are based on uniform regular tessellations of hyperbolic space but we note that AME codes that preserve the discrete symmetry of the polytope of the tessellation do not exist above two dimensions. We also explore different constructions of stabiliser codes for hyperbolic spaces in which the logical information is associated with the boundary and discuss their potential interpretation. We explain how our codes could be applied to interesting classes of holographic dualities based on gravity-scalar theories (such as JT gravity) through toroidal reductions of hyperbolic spaces.

翻訳日:2023-03-03 18:00:04 公開日:2023-01-06

# 混合状態自由QFTにおける量子情報のチャネル誘起ダイナミクス

Channel induced dynamics of quantum information in mixed state free QFTs ( http://arxiv.org/abs/2201.02723v3 )

ライセンス: Link先を確認

Michal Baczyk

(参考訳) 本稿では,場の励起を量子チャネルとして表現できる量子場理論(QFT)の研究フレームワークを提案する。 1次元QFT系の正規化真空状態と2つの同一自由QFT系の格子制御熱場二重状態の2つの普遍状態に対する提案方式の内部動作を実証する。単体および非単体ボソニックガウスチャネル(ペッツ回収マップを含む)の動作について検討する。チャネル静的動作とチャネル誘起力学の特性を評価し定量化するために,量子エントロピーと忠実度を計算する。

We propose a framework for Quantum Field Theory (QFT) studies that allows us to represent field excitations as quantum channels. We demonstrate the inner workings of the proposed scheme for two universal states: the regularized vacuum state of a one-dimensional QFT system and the lattice-regulated Thermofield Double State of two identical free QFTs. We investigate actions of unitary and non-unitary Bosonic Gaussian channels (including Petz Recovery maps). To evaluate and quantify the character of the channel static action and channel induced dynamics we calculate quantum entropies and fidelities.

翻訳日:2023-03-01 23:36:38 公開日:2023-01-06

# ブロックチェーンユーザの政治的、経済的、ガバナンス的態度

Political, economic, and governance attitudes of blockchain users ( http://arxiv.org/abs/2301.02734v1 )

ライセンス: Link先を確認

Lucia M. Korpas, Seth Frey, Joshua Tan

(参考訳) ブロックチェーンエコシステムの一部である人々を対象に、暗号政治、暗号経済、暗号統治の感情を評価するための調査を行う。 3710人の調査回答に基づいて、その信念、態度、暗号参加の態様を説明し、自己報告された政治的提携とブロックチェーンエコシステムがこれらとどのように関連しているかを調査した。我々は,経済力分布の認識,暗号に対する個人的態度,ガバナンスにおける権力分布に関する規範的信念,ブロックチェーン技術の外部的規制に関する質問において,分極を観察した。政治的自己同一化の相違は、経済的公平性、性平等、意思決定力、適切な規制の獲得方法に関する意見と相関し、ブロックチェーン関連は、暗号通貨のガバナンスと規制に関する意見と相関し、回答者の暗号と個人的目標の意味的概念が関与している。また、理論駆動構成の政治軸は、データによって支持され、データから生じる他の回答者のグループ化や信念の可能性を調査する。

We present a survey to evaluate crypto-political, crypto-economic, and crypto-governance sentiment in people who are part of a blockchain ecosystem. Based on 3710 survey responses, we describe their beliefs, attitudes, and modes of participation in crypto and investigate how self-reported political affiliation and blockchain ecosystem affiliation are associated with these. We observed polarization in questions on perceptions of the distribution of economic power, personal attitudes towards crypto, normative beliefs about the distribution of power in governance, and external regulation of blockchain technologies. Differences in political self-identification correlated with opinions on economic fairness, gender equity, decision-making power and how to obtain favorable regulation, while blockchain affiliation correlated with opinions on governance and regulation of crypto and respondents' semantic conception of crypto and personal goals for their involvement. We also find that a theory-driven constructed political axis is supported by the data and investigate the possibility of other groupings of respondents or beliefs arising from the data.

翻訳日:2023-02-19 13:30:26 公開日:2023-01-06

# 関数型プログラミング・アサインメントにおける学生クラスタの識別 : クイックラーナーからストラグリング学生へ

Identifying Different Student Clusters in Functional Programming Assignments: From Quick Learners to Struggling Students ( http://arxiv.org/abs/2301.02611v1 )

ライセンス: Link先を確認

Chuqin Geng, Wenwen Xu, Yingjie Xu, Brigitte Pientka, Xujie Si

(参考訳) インストラクターや学生は、学生がいかにうまく教材を習得しているか、学生が苦闘しているかを示す重要な指標として、プログラミング課題の成績にしばしば注目される。しかしこれは誤解を招く可能性がある。特に、学生がオートグレーターにアクセスできる場合、成績は大幅に歪められることがある。本稿では,McGill大学における関数型プログラミングコースから収集した学生の課題提出データを,幅広い特徴を取り入れて分析する。グレードに加えて、アクティビティ時間データ、費やされた時間、静的エラーの数についても検討する。これにより、クラスタアルゴリズムを通じて、"quick-learning"、"hardworking"、"satisficing"、"struggling"の4つの学生クラスタを識別することができます。次に、作業習慣、作業期間、エラーの範囲、エラーを修正する能力が、学生の異なるクラスタに与える影響を分析する。この構造化分析は、インストラクターがさまざまなタイプの学生を積極的に支援し、コース全体のデザインの異なる側面を強調するための貴重な洞察を提供する。また、学生自身がどの側面に苦しむかを理解し、明確化を追求し、仕事の習慣を調整するための洞察を提供する。

Instructors and students alike are often focused on the grade in programming assignments as a key measure of how well a student is mastering the material and whether a student is struggling. This can be, however, misleading. Especially when students have access to auto-graders, their grades may be heavily skewed. In this paper, we analyze student assignment submission data collected from a functional programming course taught at McGill University incorporating a wide range of features. In addition to the grade, we consider activity time data, time spent, and the number of static errors. This allows us to identify four clusters of students: "Quick-learning", "Hardworking", "Satisficing", and "Struggling" through clustering algorithms. We then analyze how work habits, working duration, the range of errors, and the ability to fix errors impact different clusters of students. This structured analysis provides valuable insights for instructors to actively help different types of students and emphasize different aspects of their overall course design. It also provides insights for students themselves to understand which aspects they still struggle with and allows them to seek clarification and adjust their work habits.

翻訳日:2023-02-19 13:30:00 公開日:2023-01-06

# インフォマティクスにおけるバランス改善 : 学生との正直な議論

Better Balance in Informatics: An Honest Discussion with Students ( http://arxiv.org/abs/2301.02532v1 )

ライセンス: Link先を確認

Elisavet Kozyri, Mariel Evelyn Markussen Ellingsen, Ragnhild Abel Grape, Letizia Jaccheri

(参考訳) 近年,コンピュータ科学(cs)の学術環境において,男女のバランスを促進する取り組みが盛んに行われている。しかし、学生から博士号取得者、教員まで、すべてのcsの学術レベルでは男女差が残っている。この傾向は、UiT(ノルウェー北極大学)のコンピュータ科学科(Department of Computer Science)が続く。 UiTのCS環境におけるこの傾向に対処するため,本学部の学生を対象に構造化された議論を行った。これらの議論から収集したデータを分析した結果、我々の部署のジェンダーギャップを緩和できる行動項目が特定できた。特に、これらの議論は、達成する方法を解明した。 (i)CS学部課程への学生のバランスの取れた流れ (二)バランスの取れたCS研究環境、及び (iii)csアカデミア(例えばphdプログラム)のより高いレベルへの卒業生のバランスの取れたフロー。本報告では, 省庁に対して行った議論の結果とその後の提言について述べる。また、ジェンダーバランス行動計画の一環として、他の機関が同様のイベントを組織化するためのロードマップも提供します。

In recent years, there has been considerable effort to promote gender balance in the academic environment of Computer Science (CS). However, there is still a gender gap at all CS academic levels: from students, to PhD candidates, to faculty members. This general trend is followed by the Department of Computer Science at UiT The Arctic University of Norway. To combat this trend within the CS environment at UiT, we embarked on structured discussions with students of our department. After analyzing the data collected from these discussions, we were able to identify action items that could mitigate the existing gender gap at our department. In particular, these discussions elucidated ways to achieve (i) a balanced flow of students into the CS undergraduate program, (ii) a balanced CS study environment, and (iii) a balanced flow of graduates into higher levels of CS academia (e.g., PhD programs). This paper presents the results of the discussions and the subsequent recommendations that we made to the administration of the department. We also provide a road-map that other institutions could follow to organize similar events as part of their gender-balance action plan.

翻訳日:2023-02-19 13:29:38 公開日:2023-01-06

# 労働者の声:なぜ労働者中心のクラウドワークアプローチが混み合っているのか

Voices of Workers: Why a Worker-Centered Approach to Crowd Work Is Challenging ( http://arxiv.org/abs/2212.14471v2 )

ライセンス: Link先を確認

Caifan Du, Matthew Lease

(参考訳) 広く、多様性があり、シフトし、目に見えない群衆の労働力を理解するにはどうすればよいのか。一般市民のコミュニティフォーラムにおける公開投稿のオンライン観察と分析から得られた知見を報告する。特に,群集作業のメディア描写に関して,群集作業員とジャーナリストの間で繰り返し緊張関係が見られた。群衆の多様性は、群衆の仕事の幅広い経験に対処する上で、あらゆる1次元表現が不十分であることがわかった。我々は、規模、多様性、可視性、そして大衆の宣伝に対する抵抗が、特に群衆の仕事に対する労働者中心のアプローチを特に困難にし、労働者の多様性とその生活経験をよりよく理解する必要があると論じている。

How can we better understand the broad, diverse, shifting, and invisible crowd workforce, so that we can better support it? We present findings from online observations and analysis of publicly available postings from a community forum of crowd workers. In particular, we observed recurring tensions between crowd workers and journalists regarding media depictions of crowd work. We found that crowd diversity makes any one-dimensional representation inadequate in addressing the wide-ranging experiences of crowd work. We argue that the scale, diversity, invisibility, and the crowds' resistance to publicity make a worker-centered approach to crowd work particularly challenging, necessitating a better understanding of the diversity of workers and their lived experiences.

翻訳日:2023-02-19 13:22:51 公開日:2023-01-06

# 社会分析のための自己教師付きハイパーグラフ表現学習

Self-supervised Hypergraph Representation Learning for Sociological Analysis ( http://arxiv.org/abs/2212.11440v2 )

ライセンス: Link先を確認

Xiangguo Sun, Hong Cheng, Bo Liu, Jia Li, Hongyang Chen, Guandong Xu, Hongzhi Yin

(参考訳) 現代の社会学は行動分析の説得力のある社会的基準の多くを深く発見してきた。残念ながら、それらの多くは、オンラインソーシャルネットワークで測定され、提示されるには主観的すぎる。一方、データマイニング技術はデータパターンをよりよく見つけることができるが、その多くは不自然な理解を残している。本稿では,データマイニング技術と社会学的行動基準のさらなる融合を支援するための基本的な方法論を提案する。まず、効果的なハイパーグラフ認識と高速なライングラフ構築フレームワークを提案する。ハイパーグラフは、ハイパーグラフの各エッジが2つ以上のノードを含んでおり、社会環境を記述するのに最適であるため、個人とその環境間の相互作用をより深く示すことができる。ライングラフは、それぞれの社会環境を、異なる環境間の基盤となる影響を持つスーパーノードとして扱う。そこで,我々は従来の対関係を越え,様々な社会学的基準の下でより豊かなパターンを探索する。第2に,ユーザからユーザへ,ユーザへ,環境へ,環境から環境へ流れる社会的影響を学習するハイパーグラフベースのニューラルネットを提案する。第3に、社会的適合性、社会的等価性、環境の進化、社会分極化といった社会学的基準を効果的に評価するために、質的および定量的なソリューションを提案する。広範な実験により,オンラインユーザ行動と社会学的分析のためのデータマイニングタスクを,フレームワークがより良くサポートできることが判明した。

Modern sociology has profoundly uncovered many convincing social criteria for behavioural analysis. Unfortunately, many of them are too subjective to be measured and presented in online social networks. On the other hand, data mining techniques can better find data patterns but many of them leave behind unnatural understanding. In this paper, we propose a fundamental methodology to support the further fusion of data mining techniques and sociological behavioral criteria. Our highlights are three-fold: First, we propose an effective hypergraph awareness and a fast line graph construction framework. The hypergraph can more profoundly indicate the interactions between individuals and their environments because each edge in the hypergraph (a.k.a. hyperedge) contains more than two nodes, which is perfect to describe social environments. A line graph treats each social environment as a super node with the underlying influence between different environments. In this way, we go beyond traditional pair-wise relations and explore richer patterns under various sociological criteria; Second, we propose a novel hypergraph-based neural network to learn social influence flowing from users to users, users to environments, environment to users, and environments to environments. The neural network can be learned via a task-free method, making our model very flexible to support various data mining tasks and sociological analysis; Third, we propose both qualitative and quantitative solutions to effectively evaluate the most common sociological criteria like social conformity, social equivalence, environmental evolving and social polarization. Our extensive experiments show that our framework can better support both data mining tasks for online user behaviours and sociological analysis.

翻訳日:2023-02-19 13:15:53 公開日:2023-01-06
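
The line graph mentioned in the abstract above can be illustrated with a small sketch. It uses the textbook definition, in which every hyperedge becomes a node and two such nodes are linked when the hyperedges share at least one member. The paper's own construction may differ (for example, it may weight or direct the links), and the function name `line_graph` and the toy `environments` data are made up for illustration.

```python
from itertools import combinations


def line_graph(hyperedges):
    """Build the line graph of a hypergraph (textbook definition).

    `hyperedges` maps a hyperedge name (a "social environment") to the set
    of nodes (users) it contains. The result maps each pair of overlapping
    hyperedges to the set of users they share.
    """
    links = {}
    for name_a, name_b in combinations(sorted(hyperedges), 2):
        shared = set(hyperedges[name_a]) & set(hyperedges[name_b])
        if shared:  # link two environments only when they share a user
            links[(name_a, name_b)] = shared
    return links


# Hypothetical toy data: three environments, one user ("bob") in two of them.
environments = {
    "club": {"ann", "bob", "cat"},
    "lab": {"bob", "dan"},
    "team": {"eve", "fay", "gus"},
}

if __name__ == "__main__":
    print(line_graph(environments))
```

In this toy data only "club" and "lab" overlap, so the line graph has a single link between those two super nodes, and "team" stays isolated.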

# 回転ボース・アインシュタイン凝縮体の基底状態を計算するための二次流れ

Second-order flows for computing the ground states of rotating Bose-Einstein condensates ( http://arxiv.org/abs/2205.00805v2 )

ライセンス: Link先を確認

Haifan Chen, Guozhi Dong, Wei Liu, Ziqing Xie

(参考訳) 本稿では,一階流と見なされる勾配流と区別される二階時間微分を含む人工進化微分方程式について述べる。これは、凸最適化の減衰を伴う慣性力学の最近の進歩により、一般的なトピックである。数学的には、回転ボース・アインシュタイン凝縮体(bec)の基底状態は、正規化制約の下で角運動量回転項を持つグロス・ピタエフスキーエネルギー汎関数の最小値としてモデル化することができる。この制約付き非凸最適化問題に対するエネルギー最小化戦略として2種類の二階流を導入する。提案した人工力学は、散逸を伴う2階非線形双曲偏微分方程式である。時間的離散化のための明示的および半単純的手法や空間的離散化のためのフーリエ擬スペクトル法など、いくつかの数値的離散化方式が議論されている。これらのアルゴリズムは、回転するbecの基底状態を計算するための効率的でロバストなアルゴリズムを提供する。特に, 新たに開発したアルゴリズムは, 勾配流に基づく最先端の数値手法よりも優れていることがわかった。勾配流型アプローチと比較して、明示的な時間的離散化戦略を採用すると、提案手法はより安定した時間的ステップサイズを実現することができる; 半単純離散化では、同じステップサイズを使用するが、提案手法が停止基準に達するためには、より少ないイテレーションが必要であり、毎回ステップがほぼ同じ計算複雑性に遭遇する。リッチで詳細な数値例が検証と比較のために文書化されている。

Second-order flows in this paper refer to some artificial evolutionary differential equations involving second-order time derivatives distinguished from gradient flows which are considered to be first-order flows. This is a popular topic due to the recent advances of inertial dynamics with damping in convex optimization. Mathematically, the ground state of a rotating Bose-Einstein condensate (BEC) can be modeled as a minimizer of the Gross-Pitaevskii energy functional with angular momentum rotational term under the normalization constraint. We introduce two types of second-order flows as energy minimization strategies for this constrained non-convex optimization problem, in order to approach the ground state. The proposed artificial dynamics are novel second-order nonlinear hyperbolic partial differential equations with dissipation. Several numerical discretization schemes are discussed, including explicit and semi-implicit methods for temporal discretization, combined with a Fourier pseudospectral method for spatial discretization. These provide a series of efficient and robust algorithms for computing the ground states of rotating BECs. Particularly, the newly developed algorithms turn out to be superior to the state-of-the-art numerical methods based on the gradient flow. In comparison with gradient flow type approaches, when explicit temporal discretization strategies are adopted, the proposed methods allow for larger stable time step sizes, while for semi-implicit discretization, using the same step size, far fewer iterations are needed for the proposed methods to reach the stopping criterion, and every time step has almost the same computational complexity. Rich and detailed numerical examples are documented for verification and comparison.

翻訳日:2023-02-14 20:40:58 公開日:2023-01-06

# ニューラルネットワークを用いたcovid-19患者のフィットネス依存型オプティマイザ

Fitness Dependent Optimizer with Neural Networks for COVID-19 patients ( http://arxiv.org/abs/2302.02986v1 )

ライセンス: Link先を確認

Maryam T. Abdulkhaleq, Tarik A. Rashid, Bryar A. Hassan, Abeer Alsadoon, Nebojsa Bacanin, Amit Chhabra, S. Vimal

(参考訳) 2019年に中国で発生した新型コロナウイルス(COVID-19)は、世界の健康に大きな影響を与え、世界中の医療機関に多大な負担を与えている。これらの効果は今日も続いている。ウイルスの感染を制限する一つの戦略は、疑わしい症例を早期に診断し、病気がさらに拡大する前に適切な対策を講じることである。本研究は, 文献的臨床データに基づき, 感染の可能性を診断し, 明らかにすることを目的としている。本研究では,5つの機械学習技術(GWO_MLP,GWO_CMLP,MGWO_MLP,FDO_MLP,FDO_CMLP)を用いて,Covid-19患者を2つのカテゴリに分類した。実験はすべての使用モデルに有望な結果をもたらした。適用された手法は、通常精度の点で非常によく似た性能を示した。しかし、各テストデータセットにおいて、FDO_MLPとFDO_CMLPは100%精度で最良の結果を得た。他のモデルの結果は、ある実験から別の実験へと変化した。その結果,FDOアルゴリズムを学習アルゴリズムとして用いたモデルは,高い精度が得られる可能性が示唆された。しかし、FDOは他のアルゴリズムと比較して最長のランタイムを持つことがわかった。 covid 19モデルへのリンクはこちら。 https://github.com/tarik4rashid4/covid19models

The Coronavirus, known as COVID-19, which appeared in 2019 in China, has significantly affected global health and become a huge burden on health institutions all over the world. These effects are continuing today. One strategy for limiting the virus's transmission is to have an early diagnosis of suspected cases and take appropriate measures before the disease spreads further. This work aims to diagnose and show the probability of getting infected by the disease according to textual clinical data. In this work, we used five machine learning techniques (GWO_MLP, GWO_CMLP, MGWO_MLP, FDO_MLP, FDO_CMLP) all of which aim to classify COVID-19 patients into two categories (Positive and Negative). Experiments showed promising results for all used models. The applied methods showed very similar performance, typically in terms of accuracy. However, in each tested dataset, FDO_MLP and FDO_CMLP produced the best results with 100% accuracy. The other models' results varied from one experiment to the other. It is concluded that the models on which the FDO algorithm was used as a learning algorithm had the possibility of obtaining higher accuracy. However, it is found that FDO has the longest runtime compared to the other algorithms. The COVID-19 models are available here: https://github.com/Tarik4Rashid4/covid19models

翻訳日:2023-02-12 13:04:44 公開日:2023-01-06

# 15のパズル-3つのヒューリスティックス法のハイブリッド化による新しいアプローチ

The Fifteen Puzzle- A New Approach through Hybridizing Three Heuristics Methods ( http://arxiv.org/abs/2302.02985v1 )

ライセンス: Link先を確認

Dler O. Hasan, Aso M. Aladdin, Hardi Sabah Talabani, Tarik Ahmed Rashid, and Seyedali Mirjalili

(参考訳) 15のパズル問題は、数世紀にわたって数学愛好家を魅了してきた最も古典的な問題の1つである。これは主に、探索すべき約$10^{13}$の状態を持つ状態空間の巨大なサイズと、Fifteen Puzzleインスタンスの解決にいくつかのアルゴリズムが適用されているためである。本稿では,この大きな状態空間に対処するために,マンハッタン距離 (MD), 線形衝突 (LC), 歩行距離 (WD) といった3つのヒューリスティックを持つ双方向A* (BA*) 探索アルゴリズムを用いた。 3つのヒューリスティックはアルゴリズムによって生成された状態の数を劇的に減らす方法でハイブリダイゼーションされる。さらに、これらのヒューリスティックは25KBのストレージしか必要としないが、アルゴリズムは生成された状態の数を効果的に減らし、ノード数を減らした。 BA*サーチの実装は,空間の複雑さを著しく低減し,最適解か準最適解かを保証できる。

The Fifteen Puzzle problem is one of the most classical problems that have captivated mathematical enthusiasts for centuries. This is mainly because of the huge size of the state space, with approximately $10^{13}$ states that have to be explored, and several algorithms have been applied to solve Fifteen Puzzle instances. In this paper, to deal with this large state space, the Bidirectional A* (BA*) search algorithm with three heuristics, namely Manhattan distance (MD), linear conflict (LC), and walking distance (WD), has been used to solve Fifteen Puzzle problems. The three heuristics are hybridized in a way that can dramatically reduce the number of states generated by the algorithm. Moreover, all those heuristics require only 25KB of storage but help the algorithm effectively reduce the number of generated states and expand fewer nodes. Our implementation of BA* search can significantly reduce the space complexity, and guarantee either optimal or near-optimal solutions.

翻訳日:2023-02-12 13:04:23 公開日:2023-01-06
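
To make two of the quantities in the abstract above concrete, here is a minimal sketch of the Manhattan distance heuristic and of the size of the reachable state space. It is an illustration only, not the authors' implementation: the goal layout (tiles 1 to 15 in row-major order with the blank last) and all function names are assumptions, and the linear conflict heuristic, the walking distance heuristic and the bidirectional A* search are left out.

```python
from math import factorial

SIZE = 4
# Assumed goal layout: tiles 1..15 in row-major order, blank (0) in the last cell.
GOAL = tuple(range(1, SIZE * SIZE)) + (0,)


def manhattan_distance(state):
    """Sum of horizontal plus vertical offsets of every tile from its goal cell.

    `state` is a tuple of 16 integers in row-major order; 0 marks the blank,
    which is not counted.
    """
    total = 0
    for index, tile in enumerate(state):
        if tile == 0:
            continue
        goal_index = tile - 1  # tile t belongs in cell t - 1 under the assumed goal
        total += abs(index // SIZE - goal_index // SIZE)
        total += abs(index % SIZE - goal_index % SIZE)
    return total


def reachable_state_count():
    """Only half of the 16! arrangements can be reached from a given state."""
    return factorial(SIZE * SIZE) // 2


if __name__ == "__main__":
    one_move_away = tuple(range(1, 15)) + (0, 15)  # blank and tile 15 swapped
    print(manhattan_distance(GOAL))           # 0
    print(manhattan_distance(one_move_away))  # 1
    print(reachable_state_count())            # 10461394944000
```

The count 16!/2 = 10,461,394,944,000 is where the figure of roughly $10^{13}$ states comes from.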

# 非マルコフ散逸から量子ナノデバイスの時空間制御へ

From Non-Markovian Dissipation to Spatiotemporal Control of Quantum Nanodevices ( http://arxiv.org/abs/2205.11247v3 )

ライセンス: Link先を確認

Thibaut Lacroix, Brendon W. Lovett, Alex W. Chin

(参考訳) 量子効果を利用するナノデバイスは、将来の量子技術(QT)の重要な要素であるが、それらの実世界の性能は、局所的な「環境」相互作用から生じるデコヒーレンスによって強く制限されている。複数の機能ユニットを含むデバイスが複雑化するにつれて、ローカルな環境が重なり始め、新しい時間と長さのスケールで環境に媒介するデコヒーレンス現象が発生する可能性がある。このような複雑で本質的に非マルコフ力学は、QTのスケールアップに挑戦する可能性があるが、一方では、酵素や光合成タンパク質のような生物学的ナノマシンで起こることが示唆されるように、環境が「シグナル」とエネルギーを伝達する能力も、コンポーネント間プロセスの時空間的調整を可能にする可能性がある。数値的に正確な多くのボディ・メソッド(テンソル・ネットワーク)を探索し、空間的に離れた非相互作用量子系の進化を伝播する環境力学をどのように推し進めるかを探求する。本研究では, 環境に放出されるエネルギーを遠隔で収穫し, 過渡的な励起・反応性状態を生成する方法を示し, システム励起によって引き起こされる再編成が「機能的」量子システムの「下流」運動を質的かつ可逆的に変化させる可能性を明らかにする。完全なシステム環境波動関数へのアクセスにより、これらの現象の基礎となる顕微鏡プロセスが解明され、エネルギー効率のよい量子デバイスにどのように活用できるかの新しい知見が得られた。

Nanodevices exploiting quantum effects are critically important elements of future quantum technologies (QT), but their real-world performance is strongly limited by decoherence arising from local 'environmental' interactions. Compounding this, as devices become more complex, i.e. contain multiple functional units, the 'local' environments begin to overlap, creating the possibility of environmentally mediated decoherence phenomena on new time-and-length scales. Such complex and inherently non-Markovian dynamics could present a challenge for scaling up QT, but -- on the other hand -- the ability of environments to transfer 'signals' and energy might also enable sophisticated spatiotemporal coordination of inter-component processes, as is suggested to happen in biological nanomachines, like enzymes and photosynthetic proteins. Exploiting numerically exact many-body methods (tensor networks), we study a general, fully quantum model that allows us to explore how propagating environmental dynamics can instigate and direct the evolution of spatially remote, non-interacting quantum systems. We demonstrate how energy dissipated into the environment can be remotely harvested to create transient excited/reactive states, and also identify how reorganisation triggered by system excitation can qualitatively and reversibly alter the 'downstream' kinetics of a 'functional' quantum system. With access to complete system-environment wave functions, we elucidate the microscopic processes underlying these phenomena, providing new insight into how they could be exploited for energy-efficient quantum devices.

翻訳日:2023-02-12 07:47:55 公開日:2023-01-06

# mc-qtaim分析によるコヒーレント量子重ね合わせマロンアルデヒドのエキゾチック結合の解明

The MC-QTAIM analysis reveals an exotic bond in the coherently quantum superposed Malonaldehyde ( http://arxiv.org/abs/2205.12090v3 )

ライセンス: Link先を確認

Mohammad Goli and Shant Shahbazian

(参考訳) マロンアルデヒド分子の2つの酸素原子間のプロトンは、2つの井戸の間にプロトン波動関数が非局在化する効果的な二重ウェルポテンシャルを経験する。そこで我々は分子分割法における原子の最先端の多成分量子理論を用いて分子構造、すなわち分子と結合ネットワークの原子をマロンアルデヒドの重ね合わせのabイニティオ波動関数から得る。プロトンが水素盆地を形成するマロンアルデヒドのよく知られたクランプ・プロトン描写とは対照的に、重畳された状態では水素盆地は消滅し、代わりに2つの新しいハイブリッド酸素-水素盆地が出現し、2つの盆地の間に陽子集団が均等に分布する。ハイブリッド盆地間の相互作用は、前例のないメカニズムによって安定している。これは、一方の盆地における1プロトン密度と他方の盆地における1電子密度の古典的クーロン相互作用の安定化を含む。この安定化機構は、化学において既知の結合モードと異なる結合をもたらす。

The proton between the two oxygen atoms of the malonaldehyde molecule experiences an effective double-well potential in which the proton wavefunction is delocalized between the two wells. Herein we employed the state-of-the-art multi-component quantum theory of atoms in molecules partitioning scheme to obtain the molecular structure, i.e. atoms in molecules and bonding network, from the superposed ab initio wavefunctions of malonaldehyde. In contrast to the familiar clamped-proton portrayal of malonaldehyde, in which the proton forms a hydrogen basin, for the superposed states the hydrogen basin disappears and two novel hybrid oxygen-hydrogen basins appear instead, with an even distribution of the proton population between the two basins. The interaction between the hybrid basins is stabilizing thanks to an unprecedented mechanism. This involves the stabilizing classical Coulomb interaction of the one-proton density in one of the basins with the one-electron density in the other basin. This stabilizing mechanism yields a bond foreign to the known bonding modes in chemistry.

翻訳日:2023-02-11 22:03:54 公開日:2023-01-06

# 空洞内超低温原子の自己組織化超放射相における次元交叉

Dimensional crossover in self-organised super-radiant phases of ultra cold atoms inside a cavity ( http://arxiv.org/abs/2206.04518v3 )

ライセンス: Link先を確認

Poornima Shakya, Amulya Ratnakar, Sankalpa Ghosh

(参考訳) 各ポンプがキャビティ軸の方向と角度が異なる2ポンプ配置で照らされた線形光学キャビティ内の超低温ボソニック原子の凝縮について考察する。このような構成は, 1次元量子光学格子配置からキャビティ-原子相互作用によって誘導される2次元量子光学格子配置への滑らかな遷移を可能にする。ホルシュタイン・プリマコフ変換を用いて、超放射相におけるそのような自己組織基底状態の原子密度プロファイルを、そのような動的量子光学格子におけるポンプの角方向の関数として発見し、座標空間と運動量空間におけるそれらの構造解析を提供する。論文の後半部では、このような量子光学格子ポテンシャルにおけるbose-hubbardモデルの拡張の観点から、対応する結果が定性的にも理解できることを示す。

We consider a condensate of ultra cold bosonic atoms in a linear optical cavity illuminated by a two-pump configuration in which each pump makes a different angle with the cavity axis. We show that such a configuration allows a smooth transition from a one-dimensional quantum optical lattice configuration to a two-dimensional one induced by the cavity-atom interaction. Using a Holstein-Primakoff transformation, we find the atomic density profile of such a self-organised ground state in the super-radiant phase as a function of the angular orientations of the pumps in such a dynamical quantum optical lattice, and also provide an analysis of their structures in coordinate and momentum space. In the later part of the paper, we show how the corresponding results can also be qualitatively understood in terms of an extended Bose-Hubbard model in such a quantum optical lattice potential.

翻訳日:2023-02-10 01:35:16 公開日:2023-01-06

# 一般化確率論:テンソル積問題

Generalized possibilistic Theories: the tensor product problem ( http://arxiv.org/abs/2207.09905v2 )

ライセンス: Link先を確認

Eric Buffenoir (INPHYNI)

(参考訳) 演算量子論理プログラムに触発されて,量子力学の再構成プログラムにおいても,確率を導出概念とみなすことができるという主張が得られた。本稿では,確率が3値(ポシビリスティック)意味領域に属する反事実文に置き換えられる物理理論の操作的記述を提案する。状態空間と効果空間は、Chu 3 空間を通して双対性に置かれるポーズとして構築される。状態と効果の空間上の凸性要件は、基本的に一般化確率論で扱われ、これらの空間上の半格子構造に置き換えられる。純粋な状態は、状態全体の空間を生成する完全既約要素として容易に構築される。理論のチャネル(つまり対称性)は自然にChu準同型として構築される。公理論は「一般化ポシビリスティック理論」と呼ばれるものに対して、この状態/効果がチュ空間の圏(英語版)(chu space's category)に基づいて要約することができる。両部実験の問題点は,本論文の主な技術として扱われる。このとき、状態空間のテンソル積に対する公理が与えられ、解が明示的に構成される。次に、このテンソル積と数学文献に存在する半格子のテンソル積との関係/差分を解析する。半格子のテンソル積に対するこの新しい提案は、この研究の興味深い副産物と見なすことができる。

Inspired by the operational quantum logic program, we have the contention that probabilities can be viewed as a derived concept, even in a reconstruction program of Quantum Mechanics. We propose an operational description of physical theories where probabilities are replaced by counterfactual statements belonging to a three-valued (i.e. possibilistic) semantic domain. The space of states and the space of effects are then built as posets put in duality through a Chu 3 space. The convexity requirements on the spaces of states and effects, addressed basically in Generalized Probabilistic Theories, are then replaced by semi-lattice structures on these spaces. The pure states are also easily constructed as completely meet-irreducible elements which generate the whole space of states. The channels (i.e. symmetries) of the theory are then naturally built as Chu morphisms. An axiomatic can then be summarized for what can be called ''Generalized possibilistic Theory'' based on this States/Effects Chu space's category. The problem of bipartite experiment is then addressed as the main skill of this paper. An axiomatic for the tensor product of the space of states is then given and a solution is explicitly constructed. The relations/differences between this tensor product and the tensor product of semi-lattices present in the mathematical literature are then analyzed. This new proposal for the tensor product of semi-lattices can be considered as an interesting byproduct of this work.

翻訳日:2023-02-04 12:43:27 公開日:2023-01-06

# 曲面グラフェン超格子におけるスピン依存伝達

Spin-dependent transmission in curved graphene superlattice ( http://arxiv.org/abs/2208.02220v2 )

ライセンス: Link先を確認

Jaouad El-hassouny, Ahmed Jellal, El Houssine Atmani

(参考訳) 4つの領域から構成されるN$セルの曲面グラフェン超格子におけるスピン依存透過について検討した。 1つ目はconcaveで、3つ目はconvexで、平らなグラフェンシートから距離$d$で隔てられた2つの円の弧である。トンネル解析により、システムに関連するすべての伝送路と反射路を決定できる。その結果、細胞数が同じスピンで伝達を減少させることで作用することが示された。我々は,$d$と$N$が十分に大きいとき,固体スピンフィルター効果を予測する。最後に、同一のスピンがエネルギー範囲を超えた伝送の抑制の程度と持続時間が$d$で制御可能であると判定する。

We investigate spin-dependent transmission in a curved graphene superlattice of $N$ cells where each one is made up of four regions. The first is concave, and the third is convex, two arcs of circles separated by a distance $d$ from flat graphene sheets. The tunneling analysis allows us to determine all transmission and reflection channels associated with our system. As a result, we show that the number of cells acts by decreasing the transmissions with the same spin. We predict a solid spin-filtering effect when $d$ and $N$ are sufficiently large. Finally, it is determined that the degree and duration of suppression of the transmissions with the same spin over a range of energy are controllable using $d$.

翻訳日:2023-02-02 09:56:54 公開日:2023-01-06

# 2次元ディラック・ワイルフェルミオンのラシュバ寄与:通常の量子レジームを超えて

Rashba contribution of 2D Dirac-Weyl fermions: Beyond ordinary quantum regime ( http://arxiv.org/abs/2208.07661v2 )

ライセンス: Link先を確認

Ahmed Jellal, Dariush Jahani, Omid Akhavan

(参考訳) グラフェン中のディラック・ワイルフェルミオンのエネルギー準位をラシュバが最小長条件に寄与する磁場下で検討した。 2+1)次元の磁気モーメントに結合したdirac様電荷キャリアのエネルギー分散の正確な解は、運動量空間表現を用いて得られる。さらに、2次元ディラック様準粒子の応用に関しては、我々の理論と結果をいくつかの特別なケースで拡張し、高磁場限界における新興エネルギースペクトルはラシュバカップリング、$\lambda_{r}$、およびランダウ準位のバンド指数とは独立になることを示した。

We study the energy levels of Dirac-Weyl fermions in graphene subject to a magnetic field with Rashba contribution in the minimal length situation. The exact solution for the energy dispersion of Dirac-like charge carriers coupled to the magnetic moments in a (2+1)-dimension is obtained by the use of the momentum space representation. Moreover, as it comes to applications for 2D Dirac-like quasiparticles, we also extend our theory and results in some special cases, showing that the emerging energy spectrum at the high magnetic field limit becomes independent of the Rashba coupling, $\lambda_{R}$, and the band index of Landau levels.

翻訳日:2023-01-30 22:53:05 公開日:2023-01-06

# 準周期ポテンシャルと定常ホッピング振幅を持つ周期駆動モデル:移動ギャップと多フラクタル状態の工学

Periodically driven model with quasiperiodic potential and staggered hopping amplitudes: engineering of mobility gaps and multifractal states ( http://arxiv.org/abs/2208.10853v2 )

ライセンス: Link先を確認

Sreemayee Aditya, K. Sengupta, Diptiman Sen

(参考訳) 準周期ポテンシャルを持つモデルの周期的駆動が静的モデルに相反しない興味深いフロケット位相を生成できるかどうかを考察する。具体的には、オンサイト準周期ポテンシャル $v_0$ を持つ1次元の時間独立モデルであるオーブリー=アンドロ=eモデルと、スタッガー形式をとる最近傍ホッピング振幅を考える。周波数$\omega$で周期的に変化する均一なホッピング振幅を加える。 2つの位相しか持たない単純な位相図を持つ静的Aubry-Andr\'eモデルとは異なり、駆動モデルは、拡張状態のみを持つ位相、異なる準エネルギーバンドを分離する複数のモビリティギャップを持つ位相、共存する多重フラクタル状態と局所状態のみを持つ混合位相、そして局所状態のみを持つ位相の4つの相を持つ。マルチフラクタル状態は、拡張状態と局所状態の両方の値とは異なる指数でシステムサイズとスケールする逆参加比を一般化した。さらに、$\omega$ と $V_0$ が変化するとき、異なる種類の状態間の複雑な再帰遷移を観察する。高周波および大きな駆動振幅の限界において、Floquet準エネルギーは非駆動系のエネルギーと一致するが、Floquet固有状態ははるかに拡張されている。また、1粒子の波動パケットの拡散について検討し、常に弾道的であるが、弾道速度はシステムパラメータによって大きく変化し、静的モデルでは発生しない$V_0$に対する非単調な依存を示すことがある。準周期ポテンシャルと駆動の相互作用は、静的モデルには現れないリッチな位相図を生成すると結論づける。

We study whether periodic driving of a model with a quasiperiodic potential can generate interesting Floquet phases which have no counterparts in the static model. Specifically, we consider the Aubry-André model, which is a one-dimensional time-independent model with an on-site quasiperiodic potential $V_0$ and a nearest-neighbor hopping amplitude which is taken to have a staggered form. We add a uniform hopping amplitude which varies periodically in time with a frequency $\omega$. Unlike the static Aubry-André model, which has a simple phase diagram with only two phases (only extended or only localized states), we find that the driven model has four possible phases: a phase with only extended states, a phase with multiple mobility gaps separating different quasienergy bands, a mixed phase with coexisting extended, multifractal, and localized states, and a phase with only localized states. The multifractal states have generalized inverse participation ratios which scale with the system size with exponents which are different from the values for both extended and localized states. In addition, we observe intricate re-entrant transitions between the different kinds of states when $\omega$ and $V_0$ are varied. In the limit of high frequency and large driving amplitude, we find that the Floquet quasienergies match the energies of the undriven system, but the Floquet eigenstates are much more extended. We also study the spreading of a one-particle wave packet and find that it is always ballistic but the ballistic velocity varies significantly with the system parameters, sometimes showing a non-monotonic dependence on $V_0$ which does not occur in the static model. We conclude that the interplay of quasiperiodic potential and driving produces a rich phase diagram which does not appear in the static model.
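The inverse participation ratio mentioned in the abstract can be illustrated with a minimal sketch. The definition used here, $\sum_j |\psi_j|^{2q}$ for a normalized state, is the common textbook convention and is an assumption; the paper's exact definition and normalization may differ. The function name and the two test states are hypothetical.

```python
def inverse_participation_ratio(amplitudes, q=2):
    """Generalized IPR, sum_j |psi_j|^(2q), after normalizing the state."""
    norm = sum(abs(a) ** 2 for a in amplitudes)
    return sum((abs(a) ** 2 / norm) ** q for a in amplitudes)

L = 64
extended = [1.0] * L      # equal weight on every site
localized = [0.0] * L
localized[10] = 1.0       # all weight on a single site

ipr_extended = inverse_participation_ratio(extended)    # equals 1/L
ipr_localized = inverse_participation_ratio(localized)  # equals 1
```

For $q=2$ in one dimension, the value falls off as $1/L$ for a uniformly extended state and stays of order one for a localized state; a multifractal state would show an intermediate power of $L$.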

翻訳日:2023-01-30 02:26:58 公開日:2023-01-06

# AI行動の記述による人間とAIのコラボレーションの改善

Improving Human-AI Collaboration With Descriptions of AI Behavior ( http://arxiv.org/abs/2301.06937v1 )

ライセンス: Link先を確認

Ángel Alexander Cabrera, Adam Perer, Jason I. Hong

(参考訳) 人々はAIシステムを使って意思決定を改善するが、しばしばAIの予測を過度に、あるいは過度に予測し、手伝わなかったよりも悪いパフォーマンスをする。人々がAIアシスタントを適切に頼りにするために、動作記述、AIシステムがインスタンスのサブグループでどのように機能するかの詳細を示すことを提案する。我々は,フェイクレビュー検出,衛星画像分類,鳥の分類という3つの異なるドメインの225名を対象に,行動記述の有効性をユーザ調査により検証した。行動記述は、AIの失敗を識別し、より正確な場合にAIへの信頼を高める2つのメカニズムを通じて、人間とAIの精度を高めることができることがわかった。これらの知見は、人間とAIのコラボレーションにおける人々のメンタルモデルの重要性を強調し、ハイレベルなAI行動の人々に通知することで、AI支援による意思決定を大幅に改善できることを示した。

People work with AI systems to improve their decision making, but often under- or over-rely on AI predictions and perform worse than they would have unassisted. To help people appropriately rely on AI aids, we propose showing them behavior descriptions, details of how AI systems perform on subgroups of instances. We tested the efficacy of behavior descriptions through user studies with 225 participants in three distinct domains: fake review detection, satellite image classification, and bird classification. We found that behavior descriptions can increase human-AI accuracy through two mechanisms: helping people identify AI failures and increasing people's reliance on the AI when it is more accurate. These findings highlight the importance of people's mental models in human-AI collaboration and show that informing people of high-level AI behaviors can significantly improve AI-assisted decision making.

翻訳日:2023-01-29 14:06:44 公開日:2023-01-06

# AdaEnsemble: クリックスルーレート予測のための適応スパース構造型アンサンブルネットワークの学習

AdaEnsemble: Learning Adaptively Sparse Structured Ensemble Network for Click-Through Rate Prediction ( http://arxiv.org/abs/2301.08353v1 )

ライセンス: Link先を確認

YaChen Yan, Liubo Li

(参考訳) 機能相互作用の学習は、推薦システムや広告ランキングにおける大規模CTR予測の成功に不可欠である。研究者と実践者は、機能相互作用の探索とモデリングのための様々なニューラルネットワークアーキテクチャを幅広く提案した。しかし、異なるデータセットが異なるニューラルネットワークアーキテクチャや特徴相互作用タイプを好んでおり、異なる特徴相互作用学習手法には独自の利点があることが示唆されている。 AdaEnsemble: AdaEnsemble: Sparsely-Gated Mixture-of-Experts (SparseMoE)アーキテクチャは、異種機能相互作用の専門家の強みを生かし、各例のエキスパートの疎結合へのルーティングを適応的に学習することで、異なるタイプの機能相互作用の動的階層を構築することができる。予測精度と推論効率をさらに向上するため,機能間相互作用深度選択のための動的早期退避機構を組み込んだ。 AdaEnsembleは、機能相互作用の深さを適応的に選択し、対応するSparseMoEスタック層を見つけて、予測を終了し、計算することができる。そこで,提案アーキテクチャは,SparseMoE層内の疎ゲート専門家の指数的組み合わせの利点を継承し,さらにより深い層を実行することなく最適な特徴相互作用深さを動的に選択する。提案したAdaEnsembleを実装し,実世界のデータセット上での性能を評価する。 AdaEnsembleの最先端モデルに対する有効性と有効性を示す実験結果である。

Learning feature interactions is crucial to success for large-scale CTR prediction in recommender systems and Ads ranking. Researchers and practitioners extensively proposed various neural network architectures for searching and modeling feature interactions. However, we observe that different datasets favor different neural network architectures and feature interaction types, suggesting that different feature interaction learning methods may have their own unique advantages. Inspired by this observation, we propose AdaEnsemble: a Sparsely-Gated Mixture-of-Experts (SparseMoE) architecture that can leverage the strengths of heterogeneous feature interaction experts and adaptively learns the routing to a sparse combination of experts for each example, allowing us to build a dynamic hierarchy of the feature interactions of different types and orders. To further improve the prediction accuracy and inference efficiency, we incorporate the dynamic early exiting mechanism for feature interaction depth selection. The AdaEnsemble can adaptively choose the feature interaction depth and find the corresponding SparseMoE stacking layer to exit and compute prediction from. Therefore, our proposed architecture inherits the advantages of the exponential combinations of sparsely gated experts within SparseMoE layers and further dynamically selects the optimal feature interaction depth without executing deeper layers. We implement the proposed AdaEnsemble and evaluate its performance on real-world datasets. Extensive experiment results demonstrate the efficiency and effectiveness of AdaEnsemble over state-of-the-art models.
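As a rough illustration of sparsely gated expert routing, the sketch below applies generic top-k gating as found in the sparse Mixture-of-Experts literature. It is an assumption that AdaEnsemble's gate resembles this; the function names, the scalar expert outputs, and the choice `k=2` are hypothetical, and the early-exit mechanism is not modeled.

```python
import math

def sparse_gate(logits, k):
    """Keep the k largest gate logits, softmax over them, and zero out the rest."""
    top = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)[:k]
    m = max(logits[i] for i in top)
    exps = {i: math.exp(logits[i] - m) for i in top}
    total = sum(exps.values())
    return [exps[i] / total if i in exps else 0.0 for i in range(len(logits))]

def moe_output(expert_outputs, logits, k=2):
    """Combine only the selected experts; unselected experts contribute nothing."""
    weights = sparse_gate(logits, k)
    return sum(w * y for w, y in zip(weights, expert_outputs) if w > 0.0)

gate_logits = [2.0, -1.0, 0.5, 2.0]
gate_weights = sparse_gate(gate_logits, k=2)
combined = moe_output([1.0, 5.0, 7.0, 3.0], gate_logits, k=2)
```

Because only `k` experts receive non-zero weight per example, the others need not be evaluated at all, which is where the inference savings of such architectures come from.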

翻訳日:2023-01-29 13:48:36 公開日:2023-01-06

# 機械学習によるコビビリティに向けた遷移経路の発見

Discovering Transition Pathways Towards Coviability with Machine Learning ( http://arxiv.org/abs/2301.10023v1 )

ライセンス: Link先を確認

Laure Berti-Equille and Rafael L. G. Raimundo

(参考訳) 共生性(coviability)とは、人間と自然が機能的で公平で永続的な方法で共存できる、複数の社会生態学的配置とガバナンス構造を指す。環境劣化と社会的に脆弱な領域において、共生可能な状態に移行することは困難である。本稿では,ブラジル北東部の地域住民が採用・実施できるコビビリティ・パスを発見するために,機械学習,アグロエコロジー,社会科学を組み合わせたフランス・ブラジル共同研究プロジェクトについて述べる。

Coviability refers to the multiple socio-ecological arrangements and governance structures under which humans and nature can coexist in functional, fair, and persistent ways. Transitioning to a coviable state in environmentally degraded and socially vulnerable territories is challenging. This paper presents an ongoing French-Brazilian joint research project combining machine learning, agroecology, and social sciences to discover coviability pathways that can be adopted and implemented by local populations in the North-East region of Brazil.

翻訳日:2023-01-29 13:41:12 公開日:2023-01-06

# 財政責任:自動意思決定における公共信頼の実現

Fiduciary Responsibility: Facilitating Public Trust in Automated Decision Making ( http://arxiv.org/abs/2301.10001v1 )

ライセンス: Link先を確認

Shannon B. Harper and Eric S. Weber

(参考訳) 自動意思決定システムは、さまざまな肯定的かつ否定的な方法で、ますます普及し、大衆に影響を与える。政府や民間機関はこれらのシステムを使用して、社会問題や組織的課題に対処するために、特定の人間によって規定されたルールに従って情報を処理する。研究と実世界の経験は、公衆が自動意思決定システムとそれらを展開する機関への信頼を欠いていることを示している。帰納定理(recreancy theorem)は、行政機関が行政責任を負うならば、国民は自動意思決定システムによってなされた決定や影響を信頼し、支援する可能性が高いと主張している。しかし、一般にはこれらのシステムがどのように機能しているかを知らされず、結果として組織的な決定が行われることが多い。自動意思決定システムによる‘ブラックボックス’の効果は、完全性と信頼性に対する大衆の認識を減少させる。その結果、公共の商品や利益の喪失に伴う不公平さやコストを特定し、挑戦し、修正する能力を失うことになる。現在のポジションペーパーでは、自動意思決定システムにおける義務の役割を定義し説明する。本稿では、データサイエンスライフサイクル(DSL)として自動意思決定システムを定式化し、DSLのコンテキスト内での業務責任の影響について検討する。 DSLにおける財政的な責任は、自動意思決定システムに対する国民の信頼の欠如に対処するための方法論を提供する。我々は,DSL の複数の文脈において,ファデューシャルな責任が顕在化し,それぞれが自身の不信源の緩和を必要とすることを仮定する。受託者の責任を立証するために、ロサンゼルス警察(lapd)の予測警察ケーススタディを調査した。

Automated decision-making systems are being increasingly deployed and affect the public in a multitude of positive and negative ways. Governmental and private institutions use these systems to process information according to certain human-devised rules in order to address social problems or organizational challenges. Both research and real-world experience indicate that the public lacks trust in automated decision-making systems and the institutions that deploy them. The recreancy theorem argues that the public is more likely to trust and support decisions made or influenced by automated decision-making systems if the institutions that administer them meet their fiduciary responsibility. However, often the public is never informed of how these systems operate and resultant institutional decisions are made. A ``black box'' effect of automated decision-making systems reduces the public's perceptions of integrity and trustworthiness. The result is that the public loses the capacity to identify, challenge, and rectify unfairness or the costs associated with the loss of public goods or benefits. The current position paper defines and explains the role of fiduciary responsibility within an automated decision-making system. We formulate an automated decision-making system as a data science lifecycle (DSL) and examine the implications of fiduciary responsibility within the context of the DSL. Fiduciary responsibility within DSLs provides a methodology for addressing the public's lack of trust in automated decision-making systems and the institutions that employ them to make decisions affecting the public. We posit that fiduciary responsibility manifests in several contexts of a DSL, each of which requires its own mitigation of sources of mistrust. To instantiate fiduciary responsibility, a Los Angeles Police Department (LAPD) predictive policing case study is examined.

翻訳日:2023-01-29 13:40:15 公開日:2023-01-06

# Contra Bellum: 言語の混乱としてのベルの定理

Contra Bellum: Bell's theorem as a confusion of languages ( http://arxiv.org/abs/2301.10727v1 )

ライセンス: Link先を確認

Marek Czachor (Politechnika Gdańska)

(参考訳) ベルの定理(ベルのりん、英: bell's theorem)は、数学モデルの無限階層の中で定式化された数学的予測の矛盾である。レベル$k\in\mathbb{Z}$で定式化された不等式は、レベル$k+1$で確率に反する。我々は、$k=0$が古典世界に対応すると考える傾向があるが、量子世界は$k=1$である。しかし、$k=0$の不等式は$k=1$確率で破られるので、$k=1$不等式は$k=2$確率で破られ、$k=-1$不等式は$k=0$確率で破られる。ベルの定理の論理を受け入れて、何も存在しないことを帰納的に証明できるだろうか。

Bell's theorem is a conflict of mathematical predictions formulated within an infinite hierarchy of mathematical models. Inequalities formulated at level $k\in\mathbb{Z}$ are violated by probabilities at level $k+1$. We are inclined to think that $k=0$ corresponds to the classical world, while the quantum one is $k=1$. However, as the $k=0$ inequalities are violated by $k=1$ probabilities, the same relation holds between $k=1$ inequalities violated by $k=2$ probabilities, $k=-1$ inequalities violated by $k=0$ probabilities, and so forth. Accepting the logic of the Bell theorem, can we prove by induction that nothing exists?
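For readers unfamiliar with what a violated inequality looks like in the usual classical-versus-quantum case, here is a minimal sketch of the standard CHSH combination evaluated with the textbook singlet correlation $E(a,b) = -\cos(a-b)$. This is the familiar two-level picture only; the infinite hierarchy of levels $k$ proposed in the paper is not modeled, and the function names and angle choices are illustrative assumptions.

```python
import math

def singlet_correlation(a, b):
    """Quantum prediction E(a, b) = -cos(a - b) for a spin singlet."""
    return -math.cos(a - b)

def chsh(E, a, a2, b, b2):
    """CHSH combination; local hidden-variable models satisfy |S| <= 2."""
    return E(a, b) - E(a, b2) + E(a2, b) + E(a2, b2)

a, a2 = 0.0, math.pi / 2
b, b2 = math.pi / 4, 3 * math.pi / 4
S = chsh(singlet_correlation, a, a2, b, b2)
```

With these angles the magnitude of `S` reaches $2\sqrt{2}$, above the bound of 2 that holds for local hidden-variable correlations.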

翻訳日:2023-01-29 13:12:00 公開日:2023-01-06

# 2d-block geminals:計算複雑性を低減した非1-orthogonalおよび非0-seniorityモデル

2D-Block Geminals: a non 1-orthogonal and non 0-seniority model with reduced computational complexity ( http://arxiv.org/abs/2209.00834v4 )

ライセンス: Link先を確認

Patrick Cassam-Chenaï (JAD), Thomas Perez (JAD), Davide Accomasso

(参考訳) ここでは、geminal 関数は強直交や高次性 0 に制約されない新しいgeminal product wave function ansatzを提案する。代わりに、電子の区別不能性を犠牲にすることなく、計算労力を大幅に下げるジェミナル間のより弱い直交性制約を導入する。つまり、geminal に対応する電子対は完全には区別できないし、その積はパウリの原理に従って反対称化されて \textit{bona fide} 電子波関数を形成しなければならない。最も単純な非自明なモデルでは、解の集合はブロック対角行列によって与えられ、各ブロックはサイズ 2x2 であり、最適化される複素パラメータで乗算されるパウリ行列または正規化された対角行列からなる。この単純化されたgeminalsのアンサッツにより、量子可観測体の行列要素の計算における項数は大幅に減少する。原理の証明が報告され、アンザッツが計算的に手頃な価格を維持しながら強い直交の宝石製品よりも正確であることを確認する。

We present a new geminal product wave function ansatz where the geminals are not constrained to be strongly orthogonal nor to be of seniority zero. Instead, we introduce weaker orthogonality constraints between geminals which significantly lower the computational effort, without sacrificing the indistinguishability of the electrons. That is to say, the electron pairs corresponding to the geminals are not fully distinguishable, and their product has still to be antisymmetrized according to the Pauli principle to form a bona fide electronic wave function. Our geometrical constraints translate into simple equations involving the traces of products of our geminal matrices. In the simplest non-trivial model, a set of solutions is given by block-diagonal matrices where each block is of size 2x2 and consists of either a Pauli matrix or a normalized diagonal matrix, multiplied by a complex parameter to be optimized. With this simplified ansatz for geminals, the number of terms in the calculation of the matrix elements of quantum observables is considerably reduced. A proof of principle is reported and confirms that the ansatz is more accurate than strongly orthogonal geminal products while remaining computationally affordable.

翻訳日:2023-01-28 04:09:21 公開日:2023-01-06

# 線形光学超放射とサブ放射の光学的解釈

Optical interpretation of linear-optics superradiance and subradiance ( http://arxiv.org/abs/2209.00918v2 )

ライセンス: Link先を確認

S. Asselie and A. Cipris and W. Guerin

(参考訳) 超放射と準放射は通常、電磁場が原子間の効果的な相互作用のみを提供する「原子図」であるディック集合状態の枠組みで記述される。本稿では,原子媒質中の光の伝播と散乱について,複雑な感受性と散乱性を提供する相補的図式について述べる。この「オプティカル・ピクチャー」は乱れたサンプルの線形光学系で有効であり、単純な教科書式から感受性と散乱断面積を計算できる場合、主に低密度で関係している。この図では、超放射能は効果的な屈折率で装った単一散乱事象による分散効果であるが、サブ放射能は多重散乱によるものである。解釈を裏付ける数値データと実験データを提示する。

Super- and subradiance are usually described in the framework of Dicke collective states, which is an ``atomic picture'' in which the electromagnetic field only provides an effective interaction between the atoms. Here, we discuss a complementary picture, in which we describe the propagation and scattering of light in the atomic medium, which provides a complex susceptibility and scatterers. This ``optical picture'' is valid in the linear-optics regime for disordered samples and is mainly relevant at low density, when the susceptibility and scattering cross-section can be computed from simple textbook formulas. In this picture, superradiance is a dispersion effect due to a single scattering event dressed by an effective refractive index, whereas subradiance is due to multiple scattering. We present numerical and experimental data supporting our interpretation.

翻訳日:2023-01-28 04:00:24 公開日:2023-01-06

# 誘導Rydberg格子気体の超輝度誘起性

Superradiance-induced multistability in driven Rydberg lattice gases ( http://arxiv.org/abs/2209.10366v2 )

ライセンス: Link先を確認

Yunhui He, Zhengyang Bai, Yuechun Jiao, Jianming Zhao, and Weibin Li

(参考訳) マイクロ波(MW)磁場で結合した1次元のリドベルク原子の定常状態相について検討し、高エネルギーのリドベルク原子は単体および集合(超ラジアント)崩壊によって低エネルギーに崩壊する。平均場アプローチを用いて,MW結合,状態内バンデルワールス(vdW)相互作用,およびRydberg状態間の単体・集団散逸について検討した。線形安定解析により、均一、反強磁性、発振、双安定および多安定相を含む一連の相が得られることが明らかになった。 vdW相互作用がなければ、一様相のみが見つかる。 vdW相互作用の存在下では、超ラジカル崩壊速度の強度を増加させると、多安定解が増大する。数値シミュレーションにより,双安定相と多安定相は長鎖の超放射によって安定化することが示された。均一相と多安定相の間の臨界点と原子番号によるスケーリングを求める。有限鎖のマスター方程式を数値的に解くことにより、平均場多安定相は、Rydberg 集団の期待値と異なる部位におけるRydberg 原子間の2体相関によって特徴づけられることを示す。

We study steady state phases of a one-dimensional array of Rydberg atoms coupled by a microwave (MW) field where the higher energy Rydberg state decays to the lower energy one via single-body and collective (superradiant) decay. Using mean-field approaches, we examine the interplay among the MW coupling, intra-state van der Waals (vdW) interaction, and single-body and collective dissipation between Rydberg states. A linear stability analysis reveals that a series of phases, including uniform, antiferromagnetic, oscillatory, and bistable and multistable phases can be obtained. Without the vdW interaction, only uniform phases are found. In the presence of the vdW interaction, multistable solutions are enhanced when increasing the strength of the superradiant decay rate. Our numerical simulations show that the bistable and multistable phases are stabilized by superradiance in a long chain. The critical point between the uniform and multistable phases and its scaling with the atom number is obtained. Through numerically solving the master equation of a finite chain, we show that the mean-field multistable phase could be characterized by expectation values of Rydberg populations and two-body correlations between Rydberg atoms in different sites.

翻訳日:2023-01-25 20:47:14 公開日:2023-01-06

# 臨界の双対性視点

Duality viewpoint of criticality ( http://arxiv.org/abs/2209.13450v4 )

ライセンス: Link先を確認

Linhao Li, Yuan Yao

(参考訳) 本研究では、異なる対称性保護位相(SPT)位相を連結する双対変換の下で自己双対な量子多体系について検討する。これらの自己双対モデルの臨界性の幾何学的説明を提供する。より正確には、周期境界条件下での基底状態(準退化)、すなわちバルクスペクトルの到達可能性を示す。同様に、双対性対称性を含む臨界点の対称性群は混合の't Hooft 異常を持つ。このアプローチは通常の0-形式対称性を持つ自己双対モデルのスペクトルを予測できるだけでなく、より高い形式やサブシステム対称性のような一般化対称性を持つモデルにも適用できる。アプリケーションとして、1次元と2次元のいくつかの例で結果を説明し、2つの異なるSPTを分離する。

In this work, we study quantum many-body systems which are self-dual under duality transformation connecting different symmetry protected topological (SPT) phases. We provide a geometric explanation of the criticality of these self-dual models. More precisely, we show a ground state (quasi-)degeneracy under the periodic boundary conditions, i.e., the ingappability of the bulk spectrum. Equivalently, the symmetry group at criticality, including the duality symmetry, has a mixed 't Hooft anomaly. This approach can not only predict the spectrum of the self-dual model with ordinary 0-form symmetry, but also be applied to that with generalized symmetry, such as higher-form and subsystem symmetry. As an application, we illustrate our results with several examples in one and two dimensions, which separate two different SPTs.

翻訳日:2023-01-25 00:23:25 公開日:2023-01-06

# プログラム可能な時空間パラメトリックモードセンサ

A Programmable Spatiotemporal Quantum Parametric Mode Sorter ( http://arxiv.org/abs/2210.16517v2 )

ライセンス: Link先を確認

Malvika Garikapati, Santosh Kumar, He Zhang, Yong Meng Sua, and Yu-Ping Huang

(参考訳) 我々は,モード選択型量子周波数アップコンバージョンによる複合時空間ヒルベルト空間における高次元信号のプログラム可能なパラメトリックモードソータを実験的に示す。具体的な例として、量子通信の応用を念頭に置いて、ラゲール・ガウシアンモードとエルミート・ガウシアンモードをそれぞれ信号の空間的および時間的基底と考える。アップコンバージョンポンプの時空間プロファイルを変調することにより、これらのモードにおける単一光子の忠実な選択と重ね合わせモードを示す。その結果,アップ変換光を単一モードファイバに結合し,位相マッチングのエッジでアップコンバージョンを操作することにより,量子モードソート性能が向上した。ポンプ時間プロファイルのみを最適化することにより、時空間モードの相互非バイアス基底(MUB)集合に対して12dB以上の絶滅を達成する。この完全にプログラム可能で効率的なシステムは、量子通信、量子計算、量子メトロロジーの有効なリソースとして機能する。

We experimentally demonstrate a programmable parametric mode sorter of high-dimensional signals in a composite spatiotemporal Hilbert space through mode-selective quantum frequency up-conversion. As a concrete example and with quantum communication applications in mind, we consider the Laguerre-Gaussian and Hermite-Gaussian modes as the spatial and temporal state basis for the signals, respectively. By modulating the spatiotemporal profiles of the up-conversion pump, we demonstrate the faithful selection of single photons in those modes and their superposition modes. Our results show an improvement in the quantum mode-sorting performance by coupling the up-converted light into a single-mode fiber and/or operating the upconversion at the edge of phase matching. By optimizing pump temporal profiles only, we achieve more than 12 dB extinction for mutually unbiased basis (MUB) sets of the spatiotemporal modes. This fully programmable and efficient system could serve as a viable resource for quantum communications, quantum computation, and quantum metrology.

翻訳日:2023-01-21 03:07:48 公開日:2023-01-06

# 離散力学系における非自明な最小固定点の探索

Finding Nontrivial Minimum Fixed Points in Discrete Dynamical Systems ( http://arxiv.org/abs/2301.04090v1 )

ライセンス: Link先を確認

Zirou Qiu, Chen Chen, Madhav V. Marathe, S. S. Ravi, Daniel J. Rosenkrantz, Richard E. Stearns, Anil Vullikanti

(参考訳) ネットワーク化された離散力学システムは、協調ゲームにおけるエージェントによる伝染と意思決定の拡散をモデル化するためにしばしば用いられる。このような力学系の固定点は、システムが収束する構成を表す。望ましくない感染(噂や誤報など)の拡散においては、少数の影響を受けるノードを持つ固定点への収束が望ましい目標である。このような考慮により、影響を受けるノード数が最小となるシステムの非自明な固定点を見つけるという、新しい最適化問題を定式化する。 p = np でない限り、この問題の解を任意の定数エプシロン > 0 の係数 n^1-\epsilon に近似する多項式時間アルゴリズムは存在しない。この計算難易度に対処するため,この問題を効率的に解決できる特別な事例をいくつか挙げる。さらに,適切な大きさのネットワークに対する問題に対処する整数線形プログラムを提案する。大規模ネットワーク上での問題を解くために、欲求選択法とともに一般的なヒューリスティックな枠組みを提案する。実世界のネットワークにおける広範囲な実験結果から,提案するヒューリスティックスの有効性が示された。

Networked discrete dynamical systems are often used to model the spread of contagions and decision-making by agents in coordination games. Fixed points of such dynamical systems represent configurations to which the system converges. In the dissemination of undesirable contagions (such as rumors and misinformation), convergence to fixed points with a small number of affected nodes is a desirable goal. Motivated by such considerations, we formulate a novel optimization problem of finding a nontrivial fixed point of the system with the minimum number of affected nodes. We establish that, unless P = NP, there is no polynomial time algorithm for approximating a solution to this problem to within the factor $n^{1-\epsilon}$ for any constant $\epsilon > 0$. To cope with this computational intractability, we identify several special cases for which the problem can be solved efficiently. Further, we introduce an integer linear program to address the problem for networks of reasonable sizes. For solving the problem on larger networks, we propose a general heuristic framework along with greedy selection methods. Extensive experimental results on real-world networks demonstrate the effectiveness of the proposed heuristics.
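The optimization problem can be made concrete with a small brute-force sketch. It assumes a synchronous threshold update rule (a node is affected if enough nodes in its closed neighborhood are affected) and takes "nontrivial" to mean "not the all-zero configuration"; both are assumptions for illustration, since the abstract does not fix the update rule. The search is exponential in the number of nodes, in line with the hardness result, and is not the paper's integer linear program or heuristic.

```python
from itertools import combinations

def step(state, neighbors, thresholds):
    """One synchronous update: v is 1 iff enough nodes in its closed neighborhood are 1."""
    return tuple(
        1 if state[v] + sum(state[u] for u in neighbors[v]) >= thresholds[v] else 0
        for v in range(len(state))
    )

def min_nontrivial_fixed_point(neighbors, thresholds):
    """Try affected sets in order of increasing size; return the first fixed point found."""
    n = len(neighbors)
    for size in range(1, n + 1):
        for affected in combinations(range(n), size):
            state = tuple(1 if v in affected else 0 for v in range(n))
            if step(state, neighbors, thresholds) == state:
                return state
    return None

# Hypothetical example: path graph 0-1-2-3 with threshold 2 at every node.
neighbors = [[1], [0, 2], [1, 3], [2]]
thresholds = [2, 2, 2, 2]
fixed_point = min_nontrivial_fixed_point(neighbors, thresholds)
```

On this example no single affected node can sustain itself, so the smallest nontrivial fixed point has two adjacent affected nodes.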

翻訳日:2023-01-11 17:21:41 公開日:2023-01-06

# 拡散写像によるトポロジカルニューラルネットワーク量子状態の分類

Classifying topological neural network quantum states via diffusion maps ( http://arxiv.org/abs/2301.02683v1 )

ライセンス: Link先を確認

Yanting Teng, Subir Sachdev, Mathias S. Scheurer

(参考訳) 量子多体系におけるトポロジ的順序を検出するための教師なし機械学習手法を議論し、実証する。制限されたボルツマン機械を用いて低エネルギースペクトルの変分アンザッツを定義することで、確率が指数関数的に減衰する波動関数とその変動エネルギーをサンプリングし、拡散写像スキームの入力として使用するトレーニングデータセットを定義する。拡散写像は波動関数の低次元埋め込みを提供し、超選択セクターの存在や欠如を明らかにし、したがって位相順序を与える。拡散写像に対して,ネットワークパラメータを用いて量子状態の必要相似性測度を定義できることを示し,多項式時間での効率的な評価を可能にした。しかし、考えられる「ゲージ冗長性」を慎重に考慮する必要がある。明示的な例として、このメソッドを toric コードに適用します。

We discuss and demonstrate an unsupervised machine-learning procedure to detect topological order in quantum many-body systems. Using a restricted Boltzmann machine to define a variational ansatz for the low-energy spectrum, we sample wave functions with probability decaying exponentially with their variational energy; this defines our training dataset that we use as input to a diffusion map scheme. The diffusion map provides a low-dimensional embedding of the wave functions, revealing the presence or absence of superselection sectors and, thus, topological order. We show that for the diffusion map, the required similarity measure of quantum states can be defined in terms of the network parameters, allowing for an efficient evaluation within polynomial time. However, possible ''gauge redundancies'' have to be carefully taken into account. As an explicit example, we apply the method to the toric code.

翻訳日:2023-01-10 18:56:25 公開日:2023-01-06

# マルチモーダル歌詞-リズムマッチング

Multimodal Lyrics-Rhythm Matching ( http://arxiv.org/abs/2301.02732v1 )

ライセンス: Link先を確認

Callie C. Liao, Duoduo Liao, Jesse Guessford

(参考訳) 最近の音楽の人工知能研究の増加にもかかわらず、歌詞の主要成分とキーワード、強調された音節、強いビートといったリズムの相関は、あまり研究されていない。 thsは、オーディオの誤用、音節識別の不正確さ、そして最も重要なのは、学際的な知識の必要性といった課題による可能性がある。このような研究の欠如に対処するため,本稿では,歌詞と音楽のキーコンポーネントを言語的制約なくマッチングする,新しいマルチモーダルな歌詞・リズムマッチング手法を提案する。私たちは、簡単に利用可能なメタデータで楽譜をシートするのではなく、オーディオを使用します。さらに,音楽の強いビート,歌詞の音節,歌手の発音の聴覚的変化,特に歌詞キーワードなど,鍵となるリズミカル要素と鍵となるリズミカル要素のマッチングに活用される様々なマルチモーダルなパターンを創造的に生成する。この有利なアプローチは、効率的なリズムベースのオーディオアライメントアルゴリズムを含む聴覚的歌詞とリズムの相関を研究するためのユニークな方法を提供するだけでなく、音楽や音楽認知と計算言語学を橋渡しする。実験の結果,平均で0.81の確率が一致し,約30%の楽曲が0.9以上のキーワードが強いビートに着地する確率を示し,そのうち12%が完璧に着地した。また、類似度指標を用いて、歌詞とリズムの相関性を評価する。楽曲の50%近くが0.70以上の類似性を持っている。結論として,本手法は洞察に富む相関関係を計算的に明らかにすることにより,歌詞とリズムの関係に大きく寄与する。

Despite the recent increase in research on artificial intelligence for music, prominent correlations between key components of lyrics and rhythm such as keywords, stressed syllables, and strong beats are not frequently studied. This is likely due to challenges such as audio misalignment, inaccuracies in syllabic identification, and most importantly, the need for cross-disciplinary knowledge. To address this lack of research, we propose a novel multimodal lyrics-rhythm matching approach in this paper that specifically matches key components of lyrics and music with each other without any language limitations. We use audio instead of sheet music with readily available metadata, which creates more challenges yet increases the application flexibility of our method. Furthermore, our approach creatively generates several patterns involving various multimodalities, including music strong beats, lyrical syllables, auditory changes in a singer's pronunciation, and especially lyrical keywords, which are utilized for matching key lyrical elements with key rhythmic elements. This advantageous approach not only provides a unique way to study auditory lyrics-rhythm correlations including efficient rhythm-based audio alignment algorithms, but also bridges computational linguistics with music as well as music cognition. Our experimental results reveal a 0.81 probability of matching on average, and around 30% of the songs have a probability of 0.9 or higher of keywords landing on strong beats, including 12% of the songs with a perfect landing. Similarity metrics are also used to evaluate the correlation between lyrics and rhythm; they show that nearly 50% of the songs have a similarity of 0.70 or higher. In conclusion, our approach contributes significantly to the lyrics-rhythm relationship by computationally unveiling insightful correlations.
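One way to read "probability of keywords landing on strong beats" is as the fraction of keyword onsets that fall close to a strong beat. The sketch below computes that fraction. The matching criterion, the 50 ms tolerance, the function name, and the sample timings are all hypothetical, since the abstract does not state how a landing is decided.

```python
def strong_beat_match_probability(keyword_onsets, strong_beats, tolerance=0.05):
    """Fraction of keyword onset times (seconds) within `tolerance` of any strong beat."""
    if not keyword_onsets:
        return 0.0
    hits = sum(
        1 for t in keyword_onsets
        if any(abs(t - b) <= tolerance for b in strong_beats)
    )
    return hits / len(keyword_onsets)

# Hypothetical timings: strong beats every 2 s, four keyword onsets.
strong_beats = [0.0, 2.0, 4.0, 6.0]
keyword_onsets = [0.02, 1.10, 3.98, 6.04]
p = strong_beat_match_probability(keyword_onsets, strong_beats)
```

Here three of the four keywords fall within the tolerance window, giving a per-song value that could then be averaged across a corpus.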

翻訳日:2023-01-10 18:48:19 公開日:2023-01-06
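
The "probability of keywords landing on strong beats" reported in the abstract above could be computed in many ways; the abstract does not give the procedure. The following is only a minimal sketch under stated assumptions: keyword onsets and strong-beat times are already available in seconds, and a keyword counts as "landing" when it falls within a tolerance window of a beat. The function name `keyword_beat_match_probability` and the `tolerance` parameter are hypothetical and not taken from the paper.

```python
from bisect import bisect_left


def keyword_beat_match_probability(keyword_onsets, strong_beats, tolerance=0.1):
    """Fraction of keyword onsets (seconds) within `tolerance` of a strong beat.

    Hypothetical helper for illustration; not the authors' method.
    """
    beats = sorted(strong_beats)
    if not keyword_onsets or not beats:
        return 0.0
    hits = 0
    for onset in keyword_onsets:
        i = bisect_left(beats, onset)
        # The nearest beat is either the one just before or just after the onset.
        candidates = beats[max(i - 1, 0):i + 1]
        if min(abs(onset - b) for b in candidates) <= tolerance:
            hits += 1
    return hits / len(keyword_onsets)
```

With beats at 0, 2 and 4 seconds and a 0.1 s tolerance, onsets at 0.05 and 3.95 count as matches while onsets at 1.0 and 4.2 do not.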

# 文脈内終端自動音声認識における外部オフポリティ・スピーチ・トゥ・テキストマッピングの利用

Using External Off-Policy Speech-To-Text Mappings in Contextual End-To-End Automated Speech Recognition ( http://arxiv.org/abs/2301.02736v1 )

ライセンス: Link先を確認

David M. Chan, Shalini Ghosh, Ariya Rastrow, Björn Hoffmeister

(参考訳) 自動音声認識(ASR)モデルの一般化性能の改善にもかかわらず、ダウンストリームタスクのためのASRモデルの特殊化は、主にデータ可用性の低下(データ収集の増大が必要)とデータ分布の急激なシフト(より頻繁なモデル微調整が必要)のために難しい課題である。本研究では,外部知識,特にtext-to-speech法で生成されたオフポリシーのキーバリューストアを活用して,新しいデータ分布へのフレキシブルなトレーニング後適応を可能にする可能性を検討する。提案手法では,text-to-speechから得られる音声埋め込みとセマンティックテキスト埋め込みを併用し,近似k-nearest-neighbor (KNN) に基づく注意融合ステップを通じてASRにバイアスを与える。 LibriSpeechと社内の音声アシスタント/検索データセットに関する実験では、提案手法は微調整ベースラインと比較してドメイン適応時間を最大1K GPU時間短縮しつつ、WERを最大3%改善できることが示され、困難なゼロショットおよび少数ショットのシナリオで本番ASRシステムを適応させるための有望なアプローチが示唆された。

Despite improvements to the generalization performance of automated speech recognition (ASR) models, specializing ASR models for downstream tasks remains challenging, primarily due to reduced data availability (necessitating increased data collection) and rapidly shifting data distributions (requiring more frequent model fine-tuning). In this work, we investigate the potential of leveraging external knowledge, particularly through off-policy key-value stores generated with text-to-speech methods, to allow for flexible post-training adaptation to new data distributions. In our approach, audio embeddings captured from text-to-speech, along with semantic text embeddings, are used to bias ASR via an approximate k-nearest-neighbor (KNN) based attentive fusion step. Our experiments on LibriSpeech and in-house voice assistant/search datasets show that the proposed approach can reduce domain adaptation time by up to 1K GPU-hours while providing up to 3% WER improvement compared to a fine-tuning baseline, suggesting a promising approach for adapting production ASR systems in challenging zero and few-shot scenarios.

翻訳日:2023-01-10 18:47:49 公開日:2023-01-06
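
The KNN-based attentive fusion step described in the abstract above can be illustrated with a toy example. This is a sketch under several assumptions, not the paper's architecture: the search here is exact rather than approximate, attention is plain scaled dot-product, fusion is a simple residual sum, and values have the same dimension as the query. The name `knn_attentive_fusion` is hypothetical.

```python
import math


def knn_attentive_fusion(query, keys, values, k=2):
    """Add an attention-weighted sum of the values of the k nearest keys to the query.

    Returns (fused_vector, neighbour_indices, attention_weights).
    """
    # Exact nearest-neighbour search by squared Euclidean distance
    # (a production system would use an approximate index instead).
    dists = [sum((q - x) ** 2 for q, x in zip(query, key)) for key in keys]
    nearest = sorted(range(len(keys)), key=lambda i: dists[i])[:k]

    # Scaled dot-product attention over the retrieved keys only.
    scale = math.sqrt(len(query))
    scores = [sum(q * x for q, x in zip(query, keys[i])) / scale for i in nearest]
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]

    # Weighted sum of retrieved values, fused with the query by addition.
    dim = len(values[0])
    context = [sum(w * values[i][d] for w, i in zip(weights, nearest)) for d in range(dim)]
    fused = [q + c for q, c in zip(query, context)]
    return fused, nearest, weights
```

In this picture the keys would hold embeddings of synthesized audio and the values the associated text embeddings, so new vocabulary can be added by extending the store rather than by fine-tuning the model.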

# 化学空間上の仮説駆動型能動学習による分子の構造-親和関係の発見

Discovery of structure-property relations for molecules via hypothesis-driven active learning over the chemical space ( http://arxiv.org/abs/2301.02665v1 )

ライセンス: Link先を確認

Ayana Ghosh, Sergei V. Kalinin and Maxim A. Ziatdinov

(参考訳) 薬物標的、生体分子系、触媒、光電気、有機エレクトロニクス、電池の分子候補の発見は、望まれる機能性をターゲットとした化学空間の迅速な探索が可能な機械学習アルゴリズムの開発を必要とする。本稿では,仮説学習に基づく化学空間上のアクティブラーニングのための新しいアプローチを提案する。我々は、データの小さな部分集合に基づいて、興味の構造と機能の間の可能な関係に関する仮説を構築し、それをガウス過程の(確率的な)平均関数として導入する。このアプローチはSISSOやアクティブラーニングといったシンボリック回帰手法の要素をひとつのフレームワークに統合する。ここでは、qm9データセットについて実証するが、分子科学と固体材料科学の両方の分野のデータセットに広く適用することができる。

Discovery of molecular candidates for applications in drug targets, biomolecular systems, catalysts, photovoltaics, organic electronics, and batteries necessitates the development of machine learning algorithms capable of rapid exploration of chemical spaces targeting the desired functionalities. Here we introduce a novel approach for active learning over chemical spaces based on hypothesis learning. We construct hypotheses on the possible relationships between structures and functionalities of interest based on a small subset of data and introduce them as (probabilistic) mean functions for the Gaussian process. This approach combines elements from symbolic regression methods such as SISSO and active learning into a single framework. Here, we demonstrate it for the QM9 dataset, but it can be applied more broadly to datasets from both domains of molecular and solid-state materials sciences.

翻訳日:2023-01-10 18:21:33 公開日:2023-01-06

# グロキングモジュラー算術

Grokking modular arithmetic ( http://arxiv.org/abs/2301.02679v1 )

ライセンス: Link先を確認

Andrey Gromov

(参考訳) モジュラー演算のタスクを学習し,「grokking」と呼ばれる一般化の急激な飛躍を示す,シンプルなニューラルネットワークを提案する。具体的には、 (i) 正則化を一切用いず、MSE損失関数とバニラ勾配降下の下で様々なモジュラー演算タスクにおいてgrokkingを示す完全連結二層ネットワーク、 (ii) モジュラー算術のgrokkingが、タスクによって構造が決定される特定の特徴写像の学習に対応することの証拠、 (iii) 多種多様なモジュラー算術タスクを解決する重み(従って特徴写像)の解析式、 (iv) これらの特徴写像がAdamWと同様にバニラ勾配降下によっても見出されることの証拠を示し、ネットワークによって学習された表現の完全な解釈可能性を確立する。

We present a simple neural network that can learn modular arithmetic tasks and exhibits a sudden jump in generalization known as "grokking". Concretely, we present (i) fully-connected two-layer networks that exhibit grokking on various modular arithmetic tasks under vanilla gradient descent with the MSE loss function in the absence of any regularization; (ii) evidence that grokking modular arithmetic corresponds to learning specific feature maps whose structure is determined by the task; (iii) analytic expressions for the weights -- and thus for the feature maps -- that solve a large class of modular arithmetic tasks; and (iv) evidence that these feature maps are also found by vanilla gradient descent as well as AdamW, thereby establishing complete interpretability of the representations learnt by the network.

翻訳日:2023-01-10 18:21:19 公開日:2023-01-06
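
To make the modular arithmetic task in the abstract above concrete, here is a minimal sketch of how such a dataset might be built. The input encoding (two concatenated one-hot vectors), the choice of addition as the operation, and the train fraction are assumptions for illustration; the abstract does not specify them. The helper names are hypothetical.

```python
import random


def one_hot(index, size):
    """One-hot vector of length `size` with a 1.0 at `index`."""
    return [1.0 if j == index else 0.0 for j in range(size)]


def modular_addition_dataset(p, train_fraction=0.5, seed=0):
    """All pairs (a, b) with label (a + b) mod p, split into train and test sets.

    Each input is the concatenation of one-hot(a) and one-hot(b), length 2 * p.
    """
    examples = [
        (one_hot(a, p) + one_hot(b, p), (a + b) % p)
        for a in range(p)
        for b in range(p)
    ]
    rng = random.Random(seed)  # fixed seed so the split is reproducible
    rng.shuffle(examples)
    n_train = int(train_fraction * len(examples))
    return examples[:n_train], examples[n_train:]
```

Grokking would then show up as test accuracy on the held-out pairs jumping long after the training pairs have been fit.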

# 並列分散型大規模ディープラーニング学習システム

Systems for Parallel and Distributed Large-Model Deep Learning Training ( http://arxiv.org/abs/2301.02691v1 )

ライセンス: Link先を確認

Kabir Nagrecha

(参考訳) ディープラーニング(DL)は、コンピュータビジョン、自然言語処理、表形式のデータ分析など、さまざまな分野のアプリケーションを変革している。 DLモデルの精度向上の追求により、実務者はますます大きなニューラルネットワークアーキテクチャを探求するようになり、最近のトランスフォーマーモデルの中には数千億の学習可能なパラメータを持つものもある。これらの設計は、メモリボトルネック、ランタイム効率の低下、モデル開発における高コストなど、DL空間に新たなスケール駆動システム課題を導入している。これらの問題に対処する努力は、ニューラルアーキテクチャの並列化、メモリ階層にまたがるデータの流出、メモリ効率のよいデータ表現といったテクニックを探求してきた。この調査では、大規模なモデルトレーニングシステムの展望を探求し、主要な課題とそれに対応する様々なテクニックを強調します。

Deep learning (DL) has transformed applications in a variety of domains, including computer vision, natural language processing, and tabular data analysis. The search for improved DL model accuracy has led practitioners to explore increasingly large neural architectures, with some recent Transformer models spanning hundreds of billions of learnable parameters. These designs have introduced new scale-driven systems challenges for the DL space, such as memory bottlenecks, poor runtime efficiency, and high costs of model development. Efforts to address these issues have explored techniques such as parallelization of neural architectures, spilling data across the memory hierarchy, and memory-efficient data representations. This survey will explore the large-model training systems landscape, highlighting key challenges and the various techniques that have been used to address them.

翻訳日:2023-01-10 18:21:05 公開日:2023-01-06

# エンコーダ・デコーダ言語モデルによるペアリング抗体配列の条件付き生成

Conditional Generation of Paired Antibody Chain Sequences through Encoder-Decoder Language Model ( http://arxiv.org/abs/2301.02748v1 )

ライセンス: Link先を確認

Simon K.S. Chu, Kathy Y. Wei

(参考訳) タンパク質言語モデル(lms)は、シーケンス、構造、機能予測に成功している。しかし、現在、タンパク質 LM は単一配列のエンコーダまたはデコーダのみのアーキテクチャに制限されている。ここでは, 抗体鎖ペアリングをT5アーキテクチャを用いて前方および後方翻訳としてモデル化したpAbT5を紹介する。 pAbT5は、配列生成と不一致を、教師なしおよび教師なしの分類として正確に反映していることを示す。我々のタンパク質LMは可変長配列を生成し、その次単語予測確率は配列アライメントから位置特異的スコアリング行列と一致する。タンパク質 LM の他の研究と同様に、pAbT5 は実験測定において最先端の教師なし予測を行う。我々の知る限り、pAbT5はタンパク質-タンパク質相互作用のための最初のエンコーダ-デコーダタンパク質LMである。

Protein language models (LMs) have been successful in sequence, structural and functional predictions. However, currently, protein LMs are limited to encoder- or decoder-only architectures for single sequences while many biological contexts involve protein-protein interactions. Here, we introduce pAbT5, which models antibody chain pairing as forward- and back-translations using a T5-based architecture. We show that pAbT5 accurately reflects chain pairing through sequence generation and mispairing as unsupervised and supervised classifications. Our protein LM generates variable-length sequences and its next-word prediction probability agrees with position-specific scoring matrix from sequence alignment. Like other works in protein LM, pAbT5 performs state-of-the-art unsupervised prediction on experimental measurements. To the best of our knowledge, pAbT5 is the first encoder-decoder protein LM for protein-protein interactions.

翻訳日:2023-01-10 18:02:36 公開日:2023-01-06

# 3DAvatarGAN: パーソナライズされた編集可能なアバターのためのブリッジドメイン

3DAvatarGAN: Bridging Domains for Personalized Editable Avatars ( http://arxiv.org/abs/2301.02700v1 )

ライセンス: Link先を確認

Rameen Abdal, Hsin-Ying Lee, Peihao Zhu, Menglei Chai, Aliaksandr Siarohin, Peter Wonka, Sergey Tulyakov

(参考訳) 現代の3D-GANは、一貫した構造を持つ大規模データセットのトレーニングによって幾何学とテクスチャを合成する。このようなモデルを、しばしば未知の、高度に変動した幾何学とカメラ情報に基づくスタイル化された芸術データで訓練することは、まだ不可能である。マルチビューの一貫性とテクスチャの質を維持しながら、3D GANをそのような芸術的データでトレーニングできるだろうか? そこで本研究では,ソースドメインが事前訓練された3D-GANであり,ターゲットドメインが2D-GANである適応フレームワークを提案する。次に、2Dジェネレータからソース3Dジェネレータに知識を蒸留する。そこで我々はまず,ドメイン間のカメラパラメータの分布を調整する最適化手法を提案する。第二に,質の高いテクスチャを学習するために必要な規則化を提案し,平坦な形状などの幾何学的解の退化を回避した。第3に,芸術領域の誇張された幾何学をモデル化するための変形に基づく手法について述べる。最後に、ソースとターゲットドメインの潜在空間をリンクする3D-GANの新しい逆変換法を提案する。私たちのコントリビューションは、初めて、芸術データセット上でパーソナライズされた3Dアバターの生成、編集、アニメーションを可能にしました。

Modern 3D-GANs synthesize geometry and texture by training on large-scale datasets with a consistent structure. Training such models on stylized, artistic data, with often unknown, highly variable geometry, and camera information has not yet been shown possible. Can we train a 3D GAN on such artistic data, while maintaining multi-view consistency and texture quality? To this end, we propose an adaptation framework, where the source domain is a pre-trained 3D-GAN, while the target domain is a 2D-GAN trained on artistic datasets. We then distill the knowledge from a 2D generator to the source 3D generator. To do that, we first propose an optimization-based method to align the distributions of camera parameters across domains. Second, we propose regularizations necessary to learn high-quality texture, while avoiding degenerate geometric solutions, such as flat shapes. Third, we show a deformation-based technique for modeling exaggerated geometry of artistic domains, enabling -- as a byproduct -- personalized geometric editing. Finally, we propose a novel inversion method for 3D-GANs linking the latent spaces of the source and the target domains. Our contributions -- for the first time -- allow for the generation, editing, and animation of personalized artistic 3D avatars on artistic datasets.

翻訳日:2023-01-10 18:02:26 公開日:2023-01-06

# 胸部X線画像における深層学習に基づく新型コロナウイルス認識モデルの設計 : 知識蒸留アプローチ

Designing an Improved Deep Learning-based Model for COVID-19 Recognition in Chest X-ray Images: A Knowledge Distillation Approach ( http://arxiv.org/abs/2301.02735v1 )

ライセンス: Link先を確認

AmirReza BabaAhmadi, Sahar Khalafi, Masoud ShariatPanahi, Moosa Ayati

(参考訳) 新型コロナウイルス(covid-19)は、異なる側面の人間や社会に悪影響を及ぼしている。新型コロナウイルスの診断が不正確で、適切な治療が不十分なため、多くの人が死亡した。世界中の研究者によって,手動・自動特徴抽出技術に基づく多数の解が研究されている。通常、自動特徴抽出法、特にディープラーニングモデルは、必要な計算を実行するために強力なハードウェアシステムを必要とする。残念なことに、多くの機関や社会は、高品質のハードウェア機器の高価さのために、これらの進歩から利益を得ることができない。その結果,本研究では, 組込みデバイス, モバイルデバイス, 従来のコンピュータ上でのモデル実行に伴う計算コストの低減, および, 医用認識タスクの性能と精度を確保するために, これまでに公表した手法(少なくとも最先端モデルと同等の性能)と比較して, モデルの性能を向上すること, の2つの目標に焦点をあてた。本研究では,VGG19とResNet50V2という2つのニューラルネットワークを用いて,データセットの特徴抽出を改善した。これらのネットワークはどちらも、指定されたデータセットからセマンティック機能を提供する。この目的のために、モバイルと組み込みデバイスで最小限の計算を必要としながらセマンティック機能を抽出するMobileNetV2という代替ネットワークが検討された。知識蒸留(KD)は、教師ネットワーク(統合ResNet50V2とVGG19)から学生ネットワーク(MobileNetV2)へ知識を伝達し、MobileNetV2の性能を改善し、胸部X線画像から新型コロナウイルス識別タスクの堅牢で正確なモデルを実現するために用いられた。

COVID-19 has adversely affected humans and societies in different aspects. Numerous people have perished due to inaccurate COVID-19 identification and, consequently, a lack of appropriate medical treatment. Numerous solutions based on manual and automatic feature extraction techniques have been investigated by researchers worldwide to address this issue. Typically, automatic feature extraction methods, particularly deep learning models, necessitate a powerful hardware system to perform the necessary computations. Unfortunately, many institutions and societies cannot benefit from these advancements due to the prohibitively high cost of high-quality hardware equipment. As a result, this study focused on two primary goals: first, lowering the computational costs associated with running the proposed model on embedded devices, mobile devices, and conventional computers; and second, improving the model's performance in comparison to previously published methods (performing at least on par with state-of-the-art models) in order to ensure its performance and accuracy for the medical recognition task. This study used two neural networks to improve feature extraction from our dataset: VGG19 and ResNet50V2. Both of these networks are capable of providing semantic features from the nominated dataset. To this end, an alternative network was considered, namely MobileNetV2, which excels at extracting semantic features while requiring minimal computation on mobile and embedded devices. Knowledge distillation (KD) was used to transfer knowledge from the teacher network (concatenated ResNet50V2 and VGG19) to the student network (MobileNetV2) to improve MobileNetV2 performance and to achieve a robust and accurate model for the COVID-19 identification task from chest X-ray images.

翻訳日:2023-01-10 18:02:06 公開日:2023-01-06
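
The teacher-to-student transfer described in the abstract above is commonly trained with a temperature-softened loss. The sketch below uses the widely known formulation of Hinton et al. (KL divergence between softened teacher and student distributions plus cross-entropy on the true label); whether the paper uses exactly this loss, and the `temperature` and `alpha` values shown, are assumptions.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax with optional temperature scaling."""
    scaled = [z / temperature for z in logits]
    top = max(scaled)
    exps = [math.exp(z - top) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(student_logits, teacher_logits, label,
                      temperature=4.0, alpha=0.5):
    """alpha * T^2 * KL(teacher || student) + (1 - alpha) * cross-entropy(label)."""
    p_teacher = softmax(teacher_logits, temperature)
    p_student = softmax(student_logits, temperature)
    # KL divergence between the softened teacher and student distributions.
    kl = sum(pt * math.log(pt / ps)
             for pt, ps in zip(p_teacher, p_student) if pt > 0)
    # Standard cross-entropy of the student against the hard label.
    ce = -math.log(softmax(student_logits)[label])
    return alpha * temperature ** 2 * kl + (1 - alpha) * ce
```

The factor `temperature ** 2` keeps the gradient scale of the soft term comparable to the hard term when the temperature changes.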

# 量子測定におけるリンドブラジアン誘起アライメント

Lindbladian-Induced Alignment in Quantum Measurements ( http://arxiv.org/abs/2301.02664v1 )

ライセンス: Link先を確認

Robert Englman and Asher Yahalom

(参考訳) システム・ポインター・ウェーブ・パッケージの初期値と観測可能な値が整列した値に対して、外積から系の密度行列の内積への遷移として、不明瞭な時間連続的な減少を確実にするリンドブレディアン形式の式が提案される。ジャンプ作用素は可観測性に基づいており、測定セットから一意に決定されたパラメータ(これはS. Weinbergの波束形式論のリンドブラディアン分解と異なる)とボルンの確率規則に従っている。この新しさは、周囲(測定装置を含む)の観察モードへの適応性を定式化することにある。したがって、遷移は有限期間である(フォン・ノイマンの定式化における即時性とは対照的である)。この期間は単純な半スピンモデルとして推定される。

An expression of the Lindbladian form is proposed that ensures an unambiguous time-continuous reduction of the initial system-pointer wave-packet to one in which the readings and the observable's values are aligned, formalized as the transition from an outer product to an inner product of the system's and apparatus' density matrices. The jump operators are in the basis of the observables, with uniquely determined parameters derived from the measurement set-up (thereby differing from S. Weinberg's Lindbladian resolution of wave-packet formalism) and conforming to Born's probability rules. The novelty lies in formalising the adaptability of the surroundings (including the measuring device) to the mode of observation. Accordingly, the transition is of finite duration (in contrast to its instantaneousness in von Neumann's formulation). This duration is estimated for a simple half-spin-like model.

翻訳日:2023-01-10 17:46:26 公開日:2023-01-06

# 誤り緩和のための仮説検証:エラー緩和の評価方法

Hypothesis Testing for Error Mitigation: How to Evaluate Error Mitigation ( http://arxiv.org/abs/2301.02690v1 )

ライセンス: Link先を確認

Abdullah Ash Saki, Amara Katabarwa, Salonik Resch, George Umbrarescu

(参考訳) ノイズの多い中間スケール量子(NISQ)時代には、量子エラー軽減は量子デバイスから有用なパフォーマンスを抽出するために必要なツールとなる。しかし、誤差緩和技術によってしばしば想定されるノイズモデルと、量子デバイス上の実際のノイズとの間には大きなギャップがある。その結果、技術の理論的な期待と日々のパフォーマンスの間にギャップが生じている。特に量子デバイスのクラウドユーザーは、デバイスをそのまま利用することが多いが、このギャップを最も感じている。これらのテクニックの有用性における不確実性をパラメータ化して,エラー軽減に必要なリソースとアルゴリズムレベルでの精度を判断するには,どうすればよいのか? 第1の質問に答えるために,量子エラー緩和の枠組み内で仮説検証を導入するとともに,第2の質問に対して,エラー緩和実装のリソース要件と緩和効率の両方を考慮した包括的メリット図を提案する。メリットの図形は、様々なエラー軽減手法のスケーラビリティと精度のトレードオフを評価するのに有用である。最後に, 仮説検証と実測値を用いて, ゼロノイズ外挿, ランダム化コンパイル, 測定誤差緩和, 動的デカップリング, 推定回路による緩和などの単一の手法からなる16個の誤差軽減パイプラインを実験的に評価した。合計275,640個の回路をIBMの量子コンピュータ2台で走らせた。

In the noisy intermediate-scale quantum (NISQ) era, quantum error mitigation will be a necessary tool to extract useful performance out of quantum devices. However, there is a big gap between the noise models often assumed by error mitigation techniques and the actual noise on quantum devices. As a consequence, there arises a gap between the theoretical expectations of the techniques and their everyday performance. Cloud users of quantum devices in particular, who often take the devices as they are, feel this gap the most. How should they parametrize their uncertainty in the usefulness of these techniques and be able to make judgement calls between resources required to implement error mitigation and the accuracy required at the algorithmic level? To answer the first question, we introduce hypothesis testing within the framework of quantum error mitigation and for the second question, we propose an inclusive figure of merit that accounts for both resource requirement and mitigation efficiency of an error mitigation implementation. The figure of merit is useful to weigh the trade-offs between the scalability and accuracy of various error mitigation methods. Finally, using the hypothesis testing and the figure of merit, we experimentally evaluate 16 error mitigation pipelines composed of singular methods such as zero noise extrapolation, randomized compilation, measurement error mitigation, dynamical decoupling, and mitigation with estimation circuits. In total our data involved running 275,640 circuits on two IBM quantum computers.

翻訳日:2023-01-10 17:46:07 公開日:2023-01-06

# 農村道路における多変量交通状態予測のための注意-LSTM

Attention-LSTM for Multivariate Traffic State Prediction on Rural Roads ( http://arxiv.org/abs/2301.02731v1 )

ライセンス: Link先を確認

Elahe Sherafat and Bilal Farooq and Amir Hossein Karbasi and Seyedehsan Seyedabrishami

(参考訳) 正確な交通量と速度予測は、輸送に幅広い応用がある。旅行者と交通機関の意思決定者の両方にとって有用かつタイムリーな情報が得られる。本研究では,イラン最大の観光地都市チャルスとテヘランを結ぶ重要な農村道路セグメントにおいて,交通量と速度を同時に予測するために,注意に基づく長期記憶モデル(A-LSTM)を提案する。さらに,A-LSTMモデルとLong Short-Term Memory(LSTM)モデルとの比較を行った。どちらのモデルも速度と流れを予測できる性能を示している。しかし、A-LSTMモデルはLSTMを5分15分間隔で上回る。対照的に、30分間の間隔で2つのモデルの間に有意な差はない。異なる時間地平線に基づくモデルの性能を比較することにより、15分間の地平線モデルは最低平均角誤差(MSE)が0.0032に達し、続いて30分間と5分間の地平線が0.004と0.0051である。さらに,15分間の時間間隔において,時間的カテゴリの入力変数である 1-hot または cyclic の 2 つの変換に基づくモデルの結果を比較した。その結果, 周期的特徴符号化によるLSTMとA-LSTMは, 単孔的特徴符号化よりも優れていた。

Accurate traffic volume and speed prediction has a wide range of applications in transportation. It can result in useful and timely information for both travellers and transportation decision-makers. In this study, an Attention-based Long Short-Term Memory model (A-LSTM) is proposed to simultaneously predict traffic volume and speed on a critical rural road segment which connects Tehran to Chalus, the most popular tourist destination city in Iran. Moreover, this study compares the results of the A-LSTM model with the Long Short-Term Memory (LSTM) model. Both models show acceptable performance in predicting speed and flow. However, the A-LSTM model outperforms the LSTM in 5 and 15-minute intervals. In contrast, there is no meaningful difference between the two models for the 30-minute time interval. By comparing the performance of the models based on different time horizons, the 15-minute horizon model outperforms the others by reaching the lowest Mean Square Error (MSE) loss of 0.0032, followed by the 30 and 5-minute horizons with 0.004 and 0.0051, respectively. In addition, this study compares the results of the models based on two transformations of temporal categorical input variables, one-hot or cyclic, for the 15-minute time interval. The results demonstrate that both LSTM and A-LSTM with cyclic feature encoding outperform those with one-hot feature encoding.

翻訳日:2023-01-10 17:28:29 公開日:2023-01-06
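
The two transformations of temporal categorical variables compared in the abstract above (one-hot versus cyclic) can be sketched as follows. The sine/cosine form is the usual way to build a cyclic encoding; that the paper uses exactly this form is an assumption, and the function names are hypothetical.

```python
import math


def cyclic_encode(value, period):
    """Map a periodic value (e.g. hour of day, period=24) to (sin, cos) on the unit circle."""
    angle = 2 * math.pi * (value % period) / period
    return math.sin(angle), math.cos(angle)


def one_hot_encode(value, period):
    """One-hot vector of length `period` for the same periodic value."""
    return [1 if i == value % period else 0 for i in range(period)]
```

The cyclic form keeps hour 23 close to hour 0 and uses two inputs instead of 24, whereas one-hot vectors treat every pair of distinct hours as equally far apart.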

# イメージベース表現を用いた自己整合複素多項式を用いたアンテナ設計における散乱係数のモデル化

Modeling Scattering Coefficients in Antenna Design using Self-Attentive Complex Polynomials with Image-based Representation ( http://arxiv.org/abs/2301.02747v1 )

ライセンス: Link先を確認

Andrew Cohen, Weiping Dou, Jiang Zhu, Slawomir Koziel, Peter Renner, Jan-Ove Mattsson, Xiaomeng Yang, Beidi Chen, Kevin Stone, Yuandong Tian

(参考訳) 周波数要件を満たし、複数の物理基準に対して最適であるアンテナ設計を見つけることは、次世代ハードウェアの設計において重要な要素である。しかし、目的関数は一般に非常に非線形であり、微妙な設計変更に敏感であるため、そのようなプロセスは自明ではない。さらに、最適化される目的は、しばしば電磁シミュレーション(EM)であり、商業シミュレーションソフトウェアでは遅くて高価である。本研究では,CZP (Constant Zeros Poles) と呼ばれるサンプル効率・精度の高い代理モデルを提案し,シミュレータを使わずに与えられた2次元平面アンテナ設計の周波数領域における散乱係数を直接推定する。 CZPは散乱係数の周波数応答に関する複素零点と極を予測し、マクスウェル方程式を含む任意の線形PDEに対して理論的に正当化した。さらに、czpは、低次元表現を使用する代わりに、既存のメッシュベースのemシミュレーション技術や注意に基づくニューラルネットワークアーキテクチャにインスパイアされたアンテナトポロジーのための新しいイメージベース表現を利用する。実験では,czpが試験損失の点でベースラインを上回るだけでなく,40kのトレーニングサンプルしか持たない商用ソフトウェアで検証可能な2dアンテナ設計を,強化学習などの先進的な逐次探索技術と組み合わせることで検証できることを実証した。

Finding antenna designs that satisfy frequency requirements and are also optimal with respect to multiple physical criteria is a critical component in designing next generation hardware. However, such a process is non-trivial because the objective function is typically highly nonlinear and sensitive to subtle design changes. Moreover, the objective to be optimized often involves electromagnetic (EM) simulations, which are slow and expensive with commercial simulation software. In this work, we propose a sample-efficient and accurate surrogate model, named CZP (Constant Zeros Poles), to directly estimate the scattering coefficients in the frequency domain of a given 2D planar antenna design, without using a simulator. CZP achieves this by predicting the complex zeros and poles for the frequency response of scattering coefficients, which we have theoretically justified for any linear PDE, including Maxwell's equations. Moreover, instead of using low-dimensional representations, CZP leverages a novel image-based representation for antenna topology inspired by the existing mesh-based EM simulation techniques, and attention-based neural network architectures. We demonstrate experimentally that CZP not only outperforms baselines in terms of test loss, but also is able to find 2D antenna designs verifiable by commercial software with only 40k training samples, when coupled with advanced sequential search techniques like reinforcement learning.

翻訳日:2023-01-10 17:28:06 公開日:2023-01-06
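
Predicting zeros and poles, as the CZP abstract above describes, yields a frequency response through the generic pole-zero form of a rational transfer function. The sketch below evaluates that generic form at a given frequency; the exact parameterization used by CZP is not stated in the abstract, so the `gain` argument and the function name `pole_zero_response` are assumptions.

```python
import math


def pole_zero_response(freq_hz, zeros, poles, gain=1.0):
    """Evaluate gain * prod(s - z_k) / prod(s - p_k) at s = j * 2 * pi * f."""
    s = complex(0.0, 2 * math.pi * freq_hz)
    numerator = complex(gain)
    for z in zeros:
        numerator *= (s - z)
    denominator = complex(1.0)
    for p in poles:
        denominator *= (s - p)
    return numerator / denominator
```

A model that outputs a handful of complex zeros and poles therefore defines the response at every frequency at once, instead of predicting each frequency sample separately.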

# ボケおよびポアソンノイズ除去のための異方性および等方性全変動の異なる効率的な画像分割フレームワーク

Efficient Image Segmentation Framework with Difference of Anisotropic and Isotropic Total Variation for Blur and Poisson Noise Removal ( http://arxiv.org/abs/2301.03393v1 )

ライセンス: Link先を確認

Kevin Bui, Yifei Lou, Fredrick Park, Jack Xin

(参考訳) 本稿では,ぼかしとポアソンノイズによって劣化した画像のセグメント化を目的とする。区分的に滑らかな解を求めた後に$k$-meansクラスタリングで画像を分割する、平滑化と閾値処理(SaT)によるセグメンテーションフレームワークを採用する。特に、画像平滑化ステップでは、ムンフォード・シャーモデルにおけるガウス雑音の最小二乗忠実度をポアソン雑音に対応する最大事後確率(MAP)項に置き換え、画像勾配のスパーシティを促進するための正規化として、異方性および等方性総変動(AITV)の重み付き差分を取り入れる。このような非凸モデルに対しては、特定の分割方式を開発し、近接作用素を用いて乗算器の交互方向法(ADMM)を適用する。 ADMM方式の有効性を検証するために収束解析を行う。様々なセグメンテーションシナリオ(グレースケール/カラーおよび多相)における数値実験により,本手法が元のSaTを含む多くのセグメンテーション手法を上回っていることを示した。

In this paper, we aim to segment an image degraded by blur and Poisson noise. We adopt a smoothing-and-thresholding (SaT) segmentation framework that finds a piecewise-smooth solution, followed by $k$-means clustering to segment the image. Specifically for the image smoothing step, we replace the least-squares fidelity for Gaussian noise in the Mumford-Shah model with a maximum a posteriori (MAP) term to deal with Poisson noise and we incorporate the weighted difference of anisotropic and isotropic total variation (AITV) as a regularization to promote the sparsity of image gradients. For such a nonconvex model, we develop a specific splitting scheme and utilize a proximal operator to apply the alternating direction method of multipliers (ADMM). Convergence analysis is provided to validate the efficacy of the ADMM scheme. Numerical experiments on various segmentation scenarios (grayscale/color and multiphase) showcase that our proposed method outperforms a number of segmentation methods, including the original SaT.

翻訳日:2023-01-10 17:18:28 公開日:2023-01-06
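
The AITV regularizer named in the abstract above is the anisotropic total variation minus a weighted isotropic total variation of the image gradient. The sketch below computes it for a small grayscale image; the forward-difference discretization with zero differences at the boundary and the default `alpha` are assumptions chosen for illustration.

```python
import math


def aitv(image, alpha=0.5):
    """Sum over pixels of |dx| + |dy| - alpha * sqrt(dx^2 + dy^2).

    `image` is a list of equal-length rows of floats; dx and dy are forward
    differences, taken as zero on the last column and last row.
    """
    rows, cols = len(image), len(image[0])
    total = 0.0
    for i in range(rows):
        for j in range(cols):
            dx = image[i][j + 1] - image[i][j] if j + 1 < cols else 0.0
            dy = image[i + 1][j] - image[i][j] if i + 1 < rows else 0.0
            total += abs(dx) + abs(dy) - alpha * math.hypot(dx, dy)
    return total
```

With `alpha=1.0` the penalty vanishes wherever the gradient has only one nonzero component, which is how the difference favors sparse gradients more strongly than plain total variation.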

# 混乱した頭:拡散モデルが対面生成でGANを上回った

Diffused Heads: Diffusion Models Beat GANs on Talking-Face Generation ( http://arxiv.org/abs/2301.03396v1 )

ライセンス: Link先を確認

Michał Stypułkowski, Konstantinos Vougioukas, Sen He, Maciej Zięba, Stavros Petridis, Maja Pantic

(参考訳) 顔の生成は、これまで、追加の参照ビデオからのガイダンスなしで、頭の動きや自然な表情を作り出すのに苦労してきた。近年の拡散型生成モデルの開発により、より現実的で安定したデータ合成が可能となり、画像およびビデオ生成の性能は他の生成モデルを上回るものとなった。本研究では,人間の頭部の映像を生成するのに1つの識別画像と音声シーケンスしか必要としない自己回帰拡散モデルを提案する。我々のソリューションは、頭の動き、点滅などの表情を幻覚させ、特定の背景を保存することができる。 2つの異なるデータセットでモデルを評価し、両者で最先端の結果を得る。

Talking face generation has historically struggled to produce head movements and natural facial expressions without guidance from additional reference videos. Recent developments in diffusion-based generative models allow for more realistic and stable data synthesis and their performance on image and video generation has surpassed that of other generative models. In this work, we present an autoregressive diffusion model that requires only one identity image and audio sequence to generate a video of a realistic talking human head. Our solution is capable of hallucinating head movements, facial expressions, such as blinks, and preserving a given background. We evaluate our model on two different datasets, achieving state-of-the-art results on both of them.

翻訳日:2023-01-10 17:17:59 公開日:2023-01-06

# ビデオイベント関連予測のための構造記号表現の防御

In Defense of Structural Symbolic Representation for Video Event-Relation Prediction ( http://arxiv.org/abs/2301.03410v1 )

ライセンス: Link先を確認

Andrew Lu, Xudong Lin, Yulei Niu, Shih-Fu Chang

(参考訳) ビデオ内のイベント関係を理解するには、イベントの基盤となる構造(イベントタイプ、関連する引数ロール、対応するエンティティ)と推論に必要な事実的知識を理解するモデルが必要である。構造記号表現(SSR)に基づく手法は、イベントタイプと関連する引数ロール/エンティティを直接入力として取り込んで推論を行う。しかし、現在最先端のビデオイベント関連予測システムは、入力ビデオから連続的な特徴ベクトルを使用する必要があることを示しており、SSR入力のみに基づく既存の手法は、オラクルのイベントタイプと引数ロールが与えられても完全に失敗する。本稿では,以下の質問に答えるために,広範な実験分析を行う。 1) SSR ベースの方法が失敗した理由 2) 映像イベント関連予測の評価設定を適切に理解する方法 3) SSR に基づく手法の可能性を明らかにする方法。まず,従来のSSRに基づくビデオイベント予測モデルの失敗が,準最適なトレーニング設定に起因することを特定する。意外なことに、調整されたハイパーパラメータを持つ単純なSSRモデルでは、最先端モデルに対してマクロ精度が絶対値で20%向上する。次に,質的かつ定量的な分析を通じて,映像のみを入力として使用する評価が現在実現不可能であること,および正確な評価がオラクルのイベント情報に依存していることを示す。これらの知見に基づき,SSRに基づくモデルをイベント系列モデルにさらにコンテキスト化し,外部視覚コモンセンス知識ベースをイベント関係予測の事前学習データセットに再構成する簡易かつ効果的な手法により,より事実的な知識を具備することを提案する。その結果、新たな最先端モデルによって、最終的に25%のマクロ精度パフォーマンス向上が実現される。

Understanding event relationships in videos requires a model to understand the underlying structures of events (i.e., the event type, the associated argument roles, and corresponding entities) along with factual knowledge needed for reasoning. Structural symbolic representation (SSR) based methods directly take event types and associated argument roles/entities as inputs to perform reasoning. However, the state-of-the-art video event-relation prediction system shows the necessity of using continuous feature vectors from input videos; existing methods based solely on SSR inputs fail completely, even when given oracle event types and argument roles. In this paper, we conduct an extensive empirical analysis to answer the following questions: 1) why SSR-based methods failed; 2) how to understand the evaluation setting of video event relation prediction properly; 3) how to uncover the potential of SSR-based methods. We first identify the failure of previous SSR-based video event prediction models to be caused by sub-optimal training settings. Surprisingly, we find that a simple SSR-based model with tuned hyperparameters can actually yield a 20% absolute improvement in macro-accuracy over the state-of-the-art model. Then, through qualitative and quantitative analysis, we show that evaluation that takes only video as input is currently infeasible, and that an accurate evaluation relies on oracle event information. Based on these findings, we propose to further contextualize the SSR-based model to an Event-Sequence Model and equip it with more factual knowledge through a simple yet effective way of reformulating external visual commonsense knowledge bases into an event-relation prediction pretraining dataset. The resultant new state-of-the-art model eventually establishes a 25% macro-accuracy performance boost.

翻訳日:2023-01-10 17:17:33 公開日:2023-01-06

# アラビア語手話認識モデルの設計

Design of Arabic Sign Language Recognition Model ( http://arxiv.org/abs/2301.02693v1 )

ライセンス: Link先を確認

Muhammad Al-Barham, Ahmad Jamal, Musa Al-Yaman

(参考訳) 聴覚障害者は手話を使ってコミュニケーションしており、ジェスチャー、動き、姿勢、および話し言葉のアルファベットや単語に対応する表情の組み合わせである。提案するアラビア手話認識モデルは,ろう者や難聴者が一般人と効果的にコミュニケーションするのに役立つ。認識は、手話のアルファベットを文字に変換する次の4つの段階からなる。すなわち、モデルの訓練とテストに用いるアラビア手話アルファベットの画像を読み込む画像読み込み段階、正規化、画像拡張、リサイズ、フィルタリングなどの画像処理技術を適用して認識に必要な特徴を抽出する前処理段階、CNNなどのディープラーニング技術によって行われる訓練段階、そしてモデルがこれまで見たことのない画像に対してどの程度有効に機能するかを示すテスト段階である。モデルは主にPyTorchライブラリを用いて構築およびテストされた。モデルはArASL2018でテストされ、40の署名者から収集された32のアルファベット記号に対して54,000の画像で構成され、データセットにはトレーニングデータセットとテストデータセットの2つのセットがある。本報告で詳しく説明するには,システムの正確性,時間,柔軟性の観点から信頼性を確保する必要があった。最後に、今後の研究はアラビア語の手話からアラビア語のテキストに変換するモデルになる。

Deaf people use sign language for communication; it is a combination of gestures, movements, postures, and facial expressions that correspond to alphabets and words in spoken languages. The proposed Arabic sign language recognition model helps deaf and hard-of-hearing people communicate effectively with ordinary people. The recognition has four stages for converting the sign alphabet into letters: an image loading stage, which loads the images of Arabic sign language alphabets that are later used to train and test the model; a pre-processing stage, which applies image processing techniques such as normalization, image augmentation, resizing, and filtering to extract the features necessary for accurate recognition; a training stage, which is carried out with deep learning techniques such as CNNs; and a testing stage, which demonstrates how effectively the model performs on images it has not seen before. The model was built and tested mainly using the PyTorch library. The model is tested on ArASL2018, consisting of 54,000 images for 32 alphabet signs gathered from 40 signers, and the dataset has two sets: a training dataset and a testing dataset. We had to ensure that the system is reliable in terms of accuracy, time, and flexibility of use, as explained in detail in this report. Finally, future work will be a model that converts Arabic sign language into Arabic text.

翻訳日:2023-01-10 16:51:37 公開日:2023-01-06

# rupnet:リアルタイムポリプセグメンテーションのための残差アップサンプリングネットワーク

RUPNet: Residual upsampling network for real-time polyp segmentation ( http://arxiv.org/abs/2301.02703v1 )

ライセンス: Link先を確認

Nikhil Kumar Tomar, Ulas Bagci, Debesh Jha

(参考訳) 大腸癌は世界中でがん関連死亡の最も多い原因の一つである。ポリープの早期検出と除去は死亡率の低下に寄与し、隣接する臓器への拡散の防止にも役立つ。早期のポリープ検出は世界中の何百万人もの患者を救い、臨床的な負担を軽減できる。しかし,ポリープ検出率は内視鏡医によって大きく異なる。深層学習に基づく手法が多数提案されているが,ほとんどの研究は精度の向上に注力している。本稿では,リアルタイムに処理でき,高い再現率と適合率を示す,大腸ポリープ分割のための新しいアーキテクチャであるResidual Upsampling Network (RUPNet)を提案する。提案アーキテクチャであるRUPNetは、3つのエンコーダ、3つのデコーダブロックと、ネットワークの終端にある追加のアップサンプリングブロックで構成されるエンコーダ・デコーダネットワークである。画像サイズ512 × 512において,平均ダイス係数0.7658,平均IoU(mean intersection over union)0.6553,感度0.8049,適合率0.7995,F2スコア0.9361で,毎秒152.60フレームの優れたリアルタイム動作速度を実現する。その結果, RUPNetは高い精度を維持しつつリアルタイムフィードバックを提供でき,早期ポリープ検出のための優れたベンチマークとなることが示唆された。

Colorectal cancer is among the most prevalent causes of cancer-related mortality worldwide. Detection and removal of polyps at an early stage can help reduce mortality and even help prevent spread to adjacent organs. Early polyp detection could save the lives of millions of patients around the world as well as reduce the clinical burden. However, the polyp detection rate varies significantly among endoscopists. Numerous deep learning-based methods have been proposed; however, most studies focus on improving accuracy. Here, we propose a novel architecture, Residual Upsampling Network (RUPNet), for colon polyp segmentation that can process in real-time and show high recall and precision. The proposed architecture, RUPNet, is an encoder-decoder network that consists of three encoders, three decoder blocks, and some additional upsampling blocks at the end of the network. With an image size of 512 × 512, the proposed method achieves an excellent real-time operation speed of 152.60 frames per second with an average dice coefficient of 0.7658, mean intersection over union of 0.6553, sensitivity of 0.8049, precision of 0.7995, and F2-score of 0.9361. The results suggest that RUPNet can give real-time feedback while retaining high accuracy, indicating a good benchmark for early polyp detection.

Translated: 2023-01-10 16:51:20 Published: 2023-01-06

# 条件付き翻訳を用いた交通事故・事故分類データセットの強化

Augmenting Ego-Vehicle for Traffic Near-Miss and Accident Classification Dataset using Manipulating Conditional Style Translation ( http://arxiv.org/abs/2301.02726v1 )

License: see the linked page

Hilmil Pradana, Minh-Son Dao, and Koji Zettsu

(参考訳) 先進的な自動運転システムを開発するために、多くの研究者は、クローズドサーキットテレビ(cctv)とダッシュボード搭載カメラから可能な全ての交通リスクケースに注意を払っている。これらの手法のほとんどは、異常が発生したフレーム毎の識別に重点を置いているが、実現されていないため、道路交通参加者は、利用可能なアノテーションデータセットによって、トラフィックビデオで異常を検出できないため、エゴ車両が衝突する可能性がある。近接ミスは事故の一種であり、狭義に避けられる事故と定義できる。しかし,事故発生前の事故とニアミスの間には差はなく,事故の定義を再定義し,DADA-2000データセット上での事故の不整合を再注釈することへの貢献である。事故発生時刻の開始時刻と終了時刻を延ばすことで,事故発生時のエゴ運動を正確にカバーし,事故発生時を含む交通リスク事故を一貫した分類を行い,現実の運転支援システムにより重要な情報を提供する。提案手法は条件付きスタイル変換(cst)と分離可能な3次元畳み込みニューラルネットワーク(s3d)の2つの構成要素を統合する。 CSTアーキテクチャは、再アノテーションDADA-2000データセットを増大させ、交通事故ビデオの数を増やし、異なる種類の条件下での動画分類モデルの性能を一般化するために使用されるunsupervised image-to-image translation network (UNIT)によって導かれる。評価では, クロスバリデーション解析において, ベースラインモデルから10.25%の正のマージンで有意な改善が得られた。

To develop advanced self-driving systems, many researchers focus on flagging all possible traffic risk cases from closed-circuit television (CCTV) and dashboard-mounted cameras. Most of these methods focus on identifying, frame by frame, where an anomaly has occurred, but they do not identify which road traffic participant could lead the ego-vehicle into a collision, because the available annotated datasets only support detecting anomalies in traffic video. A near-miss is one type of accident and can be defined as a narrowly avoided accident. However, there is no difference between an accident and a near-miss before the accident happens, so our contribution is to redefine the accident definition and re-annotate inconsistent accident annotations in the DADA-2000 dataset together with near-misses. By extending the start and end times of the accident duration, our annotation can precisely cover all ego-motions during an incident and consistently classify all possible traffic risk accidents, including near-misses, to give more critical information to real-world driving assistance systems. The proposed method integrates two components: conditional style translation (CST) and a separable 3-dimensional convolutional neural network (S3D). The CST architecture is derived from unsupervised image-to-image translation networks (UNIT) and is used to augment the re-annotated DADA-2000 dataset, increasing the number of traffic risk accident videos and helping the video classification model generalize across different types of conditions, while S3D is used for video classification to demonstrate the consistency of the dataset re-annotation. In evaluation, the proposed method improved accuracy by a margin of 10.25% over the baseline model in cross-validation analysis.

Translated: 2023-01-10 16:50:55 Published: 2023-01-06

# LS-DYNA機械学習による短繊維強化複合材料の非線形モデリング

LS-DYNA Machine Learning-based Multiscale Method for Nonlinear Modeling of Short Fiber-Reinforced Composites ( http://arxiv.org/abs/2301.02738v1 )

License: see the linked page

Haoyan Wei, C. T. Wu, Wei Hu, Tung-Huan Su, Hitoshi Oura, Masato Nishi, Tadashi Naito, Stan Chung, Leo Shen

(参考訳) 短繊維強化複合材料(英: short-fiber-reinforceed Composites、SFRC)は、自動車やエレクトロニクス産業における軽量構造応用のための高性能な工学材料である。通常、SFRC構造は異種組織を誘導する射出成形により製造され、結果として生じる非線形異方性挙動は従来のマイクロメカニカル解析により予測することが困難である。本稿では, 有限要素シミュレーションソフトウェアls-dynaにおける射出成形誘起微細構造, 材料均質化, 深層材料ネットワーク(dmn)を統合し, sfrcの構造解析を行う機械学習ベースのマルチスケール手法を提案する。 DMNは物理埋め込み機械学習モデルであり、オフライントレーニングを通じて複合材料の代表体積要素に隠されたマイクロスケールの材料形態を学習する。 DMNを有限要素に結合することにより,高忠実性直接数値シミュレーションよりも高速に計算速度で複合材料や構造物の非線形挙動を予測する,高精度で効率的なデータ駆動手法を開発した。産業規模のSFRC製品をモデル化するために, 転写学習を用いて一貫したDMNデータベースを生成し, 射出成形による繊維配向と体積分率が複合性に及ぼす影響を効果的に把握する。このLS-DYNA機械学習に基づくマルチスケールSFRCモデリングの有望な性能を示す数値的な例を示す。

Short-fiber-reinforced composites (SFRC) are high-performance engineering materials for lightweight structural applications in the automotive and electronics industries. Typically, SFRC structures are manufactured by injection molding, which induces heterogeneous microstructures, and the resulting nonlinear anisotropic behaviors are challenging to predict by conventional micromechanical analyses. In this work, we present a machine learning-based multiscale method by integrating injection molding-induced microstructures, material homogenization, and Deep Material Network (DMN) in the finite element simulation software LS-DYNA for structural analysis of SFRC. DMN is a physics-embedded machine learning model that learns the microscale material morphologies hidden in representative volume elements of composites through offline training. By coupling DMN with finite elements, we have developed a highly accurate and efficient data-driven approach, which predicts nonlinear behaviors of composite materials and structures at a computational speed orders-of-magnitude faster than the high-fidelity direct numerical simulation. To model industrial-scale SFRC products, transfer learning is utilized to generate a unified DMN database, which effectively captures the effects of injection molding-induced fiber orientations and volume fractions on the overall composite properties. Numerical examples are presented to demonstrate the promising performance of this LS-DYNA machine learning-based multiscale method for SFRC modeling.

Translated: 2023-01-10 16:33:45 Published: 2023-01-06

# 低信号対雑音比における等張リカレーション

Isotonic Recalibration under a Low Signal-to-Noise Ratio ( http://arxiv.org/abs/2301.02692v1 )

License: see the linked page

Mario V. Wüthrich, Johanna Ziegel

(参考訳) 保険料体系は、異なる価格コホート間で系統的な相互資金繰りがないことを保証するために、自動調整資産を満たすべきである。回帰モデルは自動校正されないことが多い。自動校正を保証するために,任意の回帰モデルに等速再校正を適用することを提案する。我々の主な結果は、信号対雑音比の低さの下で、この等張リカバリレーションステップが説明可能な価格体系をもたらすことを証明している。

Insurance pricing systems should fulfill the auto-calibration property to ensure that there is no systematic cross-financing between different price cohorts. Often, regression models are not auto-calibrated. We propose to apply isotonic recalibration to a given regression model to ensure auto-calibration. Our main result proves that under a low signal-to-noise ratio, this isotonic recalibration step leads to explainable pricing systems because the resulting isotonically recalibrated regression functions have a low complexity.
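
A minimal sketch of what an isotonic recalibration step can look like, assuming the standard pool-adjacent-violators (PAV) algorithm with equal weights and distinct predictions. The function name `isotonic_recalibrate` and the toy numbers are hypothetical and not taken from the paper. The number of distinct fitted values gives a rough sense of the "low complexity" the abstract refers to.

```python
def isotonic_recalibrate(predictions, outcomes):
    """Fit a non-decreasing step function of the predictions to the outcomes.

    Uses pool-adjacent-violators with equal weights. Assumes the predictions
    are distinct; tied predictions would need to be pooled first.
    """
    order = sorted(range(len(predictions)), key=lambda i: predictions[i])
    blocks = []  # each block is [sum of outcomes, number of observations]
    for i in order:
        blocks.append([outcomes[i], 1])
        # Merge backwards while the previous block mean exceeds the last one.
        while (len(blocks) > 1
               and blocks[-2][0] / blocks[-2][1] > blocks[-1][0] / blocks[-1][1]):
            total, count = blocks.pop()
            blocks[-1][0] += total
            blocks[-1][1] += count

    fitted_sorted = []
    for total, count in blocks:
        fitted_sorted.extend([total / count] * count)

    # Map fitted values back to the original observation order.
    fitted = [0.0] * len(predictions)
    for rank, i in enumerate(order):
        fitted[i] = fitted_sorted[rank]
    return fitted


# Hypothetical model predictions and observed claims.
predictions = [0.1, 0.2, 0.3, 0.4, 0.5]
outcomes = [1.0, 3.0, 2.0, 4.0, 3.5]
fitted = isotonic_recalibrate(predictions, outcomes)
n_steps = len(set(fitted))  # number of distinct price levels after recalibration
```

Within each pooled block the fitted value equals the mean outcome of that block, which is the empirical counterpart of auto-calibration.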

Translated: 2023-01-10 16:32:20 Published: 2023-01-06

# 宇宙空間における主成分分析

Principal Component Analysis in Space Forms ( http://arxiv.org/abs/2301.02750v1 )

License: see the linked page

Puoya Tabaghi, Michael Khanzadeh, Yusu Wang, Sivash Mirarab

(参考訳) 主成分分析(PCA)は、現代のデータ科学の成果である。実践者は通常、データがユークリッド幾何学に適合するとpcaを行う。しかし、階層データのような特定のデータ型の場合、他の幾何学的空間の方が適切である。我々は、ゼロ曲率(ユークリッド)空間に加えて、定数正(球面)および負(双曲)曲率を持つ空間形式でPCAを研究する。リーマン多様体上の任意の点において、接ベクトルの集合に基づくリーマンアフィン部分空間を定義でき、可逆写像を使って多様体への接ベクトルを射影し、逆もまたできる。空間形式における点の集合に対する低次元リーマンアフィン部分空間を見つけることは、そのようなアフィン部分空間が同じ次元と曲率の空間形式に等長であるため、次元の減少に等しい。主成分を見つけるために、アフィン部分空間にデータ点を投影する最小平均コストで多様体値のデータ点の集合を最もよく表現する(リーマン的)アフィン部分空間を求める。そこで我々は,(1) ユークリッドPCAと同様の等式を解くことでアフィン部分空間を推定し,(2) 異なる次元の最適アフィン部分空間をネスト集合とする,という2つの大きな利点をもたらす特定のコスト関数を提案する。これらの性質は、ほとんどが収束が遅く、理論的な保証が弱い反復アルゴリズムである既存の方法よりも進歩する。特に双曲型 PCA の場合、関連する等式はローレンツ空間で作用し、不定内積が与えられ、したがってローレンツ空間とユークリッド空間の等式の間の接続を確立する。球面および双曲空間でシミュレートされたデータセット上で提案した空間形式PCAを評価し,コンバージェンス速度や精度において他の手法よりも優れていることを示す。

Principal component analysis (PCA) is a workhorse of modern data science. Practitioners typically perform PCA assuming the data conforms to Euclidean geometry. However, for specific data types, such as hierarchical data, other geometrical spaces may be more appropriate. We study PCA in space forms; that is, those with constant positive (spherical) and negative (hyperbolic) curvatures, in addition to zero-curvature (Euclidean) spaces. At any point on a Riemannian manifold, one can define a Riemannian affine subspace based on a set of tangent vectors and use invertible maps to project tangent vectors to the manifold and vice versa. Finding a low-dimensional Riemannian affine subspace for a set of points in a space form amounts to dimensionality reduction because, as we show, any such affine subspace is isometric to a space form of the same dimension and curvature. To find principal components, we seek a (Riemannian) affine subspace that best represents a set of manifold-valued data points with the minimum average cost of projecting data points onto the affine subspace. We propose specific cost functions that bring about two major benefits: (1) the affine subspace can be estimated by solving an eigenequation -- similar to that of Euclidean PCA, and (2) optimal affine subspaces of different dimensions form a nested set. These properties provide advances over existing methods which are mostly iterative algorithms with slow convergence and weaker theoretical guarantees. Specifically for hyperbolic PCA, the associated eigenequation operates in the Lorentzian space, endowed with an indefinite inner product; we thus establish a connection between Lorentzian and Euclidean eigenequations. We evaluate the proposed space form PCA on data sets simulated in spherical and hyperbolic spaces and show that it outperforms alternative methods in convergence speed or accuracy, often both.
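
To make the "indefinite inner product" of Lorentzian space concrete, here is a small sketch of the hyperboloid model of hyperbolic space with curvature -1. It shows only the geometric ingredients (inner product, lifting a point onto the hyperboloid, geodesic distance) and is not the paper's PCA algorithm; all function names are assumptions chosen for illustration.

```python
import math


def lorentz_inner(x, y):
    """Lorentzian inner product: minus the product of the first coordinates
    plus the ordinary dot product of the remaining ones."""
    return -x[0] * y[0] + sum(a * b for a, b in zip(x[1:], y[1:]))


def lift_to_hyperboloid(v):
    """Map a Euclidean vector v to (sqrt(1 + |v|^2), v), which satisfies
    lorentz_inner(x, x) == -1 up to rounding."""
    return [math.sqrt(1.0 + sum(a * a for a in v))] + list(v)


def hyperbolic_distance(x, y):
    """Geodesic distance between two points on the hyperboloid."""
    # Clamp to 1.0 so rounding errors cannot push acosh out of its domain.
    return math.acosh(max(1.0, -lorentz_inner(x, y)))


origin = lift_to_hyperboloid([0.0, 0.0])
point = lift_to_hyperboloid([math.sinh(1.0), 0.0])  # one unit from the origin
distance = hyperbolic_distance(origin, point)
```

Because the inner product is indefinite, `lorentz_inner(x, x)` is negative for every point on the hyperboloid, which is why eigenproblems in this space differ from their Euclidean counterparts.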

Translated: 2023-01-10 16:25:38 Published: 2023-01-06

# マルチラベル学習能力のキャラクタリゼーション

A Characterization of Multilabel Learnability ( http://arxiv.org/abs/2301.02729v1 )

License: see the linked page

Vinod Raman, Unique Subedi, Ambuj Tewari

(参考訳) マルチラベル分類の問題点を考察し,バッチおよびオンライン設定における学習可能性について考察する。両方の設定において、各関数クラスのシングルラベル制限が学習可能である場合に限り、マルチラベル関数クラスが学習可能であることを示す。拡張として,バッチ設定におけるマルチアウトプット回帰とオンライン設定におけるバンディットフィードバックについても検討した。前者は学習可能性w.r.t.$L_p$損失を特徴付ける。後者については、フルフィードバック設定と同様の特性を示す。

We consider the problem of multilabel classification and investigate learnability in batch and online settings. In both settings, we show that a multilabel function class is learnable if and only if each single-label restriction of the function class is learnable. As extensions, we also study multioutput regression in the batch setting and bandit feedback in the online setting. For the former, we characterize learnability w.r.t. $L_p$ losses. For the latter, we show a similar characterization as in the full-feedback setting.
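
The notion of a "single-label restriction" can be illustrated with a toy example: a multilabel predictor outputs one bit per label, and its k-th restriction keeps only the k-th bit. The sketch below uses the Hamming loss, which is an assumption made here for illustration (the abstract does not name a loss), and `toy_hypothesis` is a made-up predictor. It says nothing about learnability itself; it only shows how a multilabel error decomposes over the restrictions.

```python
def restrict(hypothesis, k):
    """k-th single-label restriction: x -> k-th coordinate of hypothesis(x)."""
    return lambda x: hypothesis(x)[k]


def hamming_loss(pred, target):
    """Fraction of labels on which the prediction and the target disagree."""
    return sum(p != t for p, t in zip(pred, target)) / len(target)


def toy_hypothesis(x):
    # Hypothetical predictor on integers with three binary labels.
    return (int(x > 0), int(x % 2 == 0), int(x > 10))


x, y = 4, (1, 0, 0)
num_labels = len(y)

# 0-1 error of each single-label restriction on the example (x, y).
per_label_errors = [
    int(restrict(toy_hypothesis, k)(x) != y[k]) for k in range(num_labels)
]
multilabel_loss = hamming_loss(toy_hypothesis(x), y)
```

Here the Hamming loss of the multilabel prediction equals the average of the per-restriction 0-1 errors.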

Translated: 2023-01-10 16:06:23 Published: 2023-01-06

# 極弱スーパービジョンを用いた少数ショットノード分類

Few-shot Node Classification with Extremely Weak Supervision ( http://arxiv.org/abs/2301.02708v1 )

License: see the linked page

Song Wang, Yushun Dong, Kaize Ding, Chen Chen, Jundong Li

(参考訳) 数少ないノード分類は、限定されたラベル付きノードを参照として分類することを目的としている。最近のマイナショットノード分類法は、ラベル付きノードが豊富なクラス(メタトレーニングクラス)から学び、制限されたラベル付きノード(メタテストクラス)に一般化する。それでも実世界のグラフでは、多くのクラスで豊富なラベル付きノードを得るのは通常困難である。実際には、各メタトレーニングクラスは、非常に弱い監督問題として知られる複数のラベル付きノードのみで構成されることができる。メタトレーニングのためのラベル付きノードが極めて限られている数少ないノード分類では、メタトレーニングとメタテストの間の一般化ギャップが大きくなるため、サブ最適パフォーマンスが向上する。この問題に取り組むために,極端に弱い監督を持つ少数ノード分類の新たな問題について検討し,広く普及しているメタラーニングフレームワークに基づく原理フレームワーク x-fnc を提案する。具体的には,様々なメタ学習タスクにメタ知識を蓄積し,その知識をメタテストタスクに一般化することが目的である。極端に少ないラベル付きノードから生じる課題に対処するため、擬似ラベル付きノードを追加参照として取得し、極めて限られた監視情報から効果的に学習する2つの必須モジュールを提案する。さらに、4つのノード分類データセットについて、最先端のベースラインと比較してフレームワークの優位性を検証するために、極めて弱い監督力を持つ広範な実験を行った。

Few-shot node classification aims at classifying nodes with limited labeled nodes as references. Recent few-shot node classification methods typically learn from classes with abundant labeled nodes (i.e., meta-training classes) and then generalize to classes with limited labeled nodes (i.e., meta-test classes). Nevertheless, on real-world graphs, it is usually difficult to obtain abundant labeled nodes for many classes. In practice, each meta-training class can only consist of several labeled nodes, known as the extremely weak supervision problem. In few-shot node classification, with extremely limited labeled nodes for meta-training, the generalization gap between meta-training and meta-test will become larger and thus lead to suboptimal performance. To tackle this issue, we study a novel problem of few-shot node classification with extremely weak supervision and propose a principled framework X-FNC under the prevalent meta-learning framework. Specifically, our goal is to accumulate meta-knowledge across different meta-training tasks with extremely weak supervision and generalize such knowledge to meta-test tasks. To address the challenges resulting from extremely scarce labeled nodes, we propose two essential modules to obtain pseudo-labeled nodes as extra references and effectively learn from extremely limited supervision information. We further conduct extensive experiments on four node classification datasets with extremely weak supervision to validate the superiority of our framework compared to the state-of-the-art baselines.
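
For readers unfamiliar with episodic meta-learning, the sketch below samples one N-way K-shot task from a pool of labeled nodes. It is generic episode sampling, not the X-FNC framework or its pseudo-labeling modules; the function `sample_episode`, its parameters, and the toy node ids are assumptions. With only a handful of labeled nodes per class, as in the extremely weak supervision setting, very few distinct episodes can be drawn.

```python
import random


def sample_episode(labels_by_class, n_way, k_shot, q_query, rng):
    """Sample an N-way K-shot task.

    Returns two dicts mapping each sampled class to its support node ids
    and its query node ids. Support and query nodes never overlap.
    """
    classes = rng.sample(sorted(labels_by_class), n_way)
    support, query = {}, {}
    for c in classes:
        nodes = rng.sample(labels_by_class[c], k_shot + q_query)
        support[c] = nodes[:k_shot]
        query[c] = nodes[k_shot:]
    return support, query


# Hypothetical graph: 4 classes with only 5 labeled nodes each.
labels_by_class = {c: list(range(100 * c, 100 * c + 5)) for c in range(4)}
support, query = sample_episode(
    labels_by_class, n_way=2, k_shot=1, q_query=2, rng=random.Random(0)
)
```

Passing an explicit `random.Random` instance keeps the sampling reproducible without touching the global random state.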

Translated: 2023-01-10 15:57:19 Published: 2023-01-06

# Witscript 3:会話におけるジョーク改善のためのハイブリッドAIシステム

Witscript 3: A Hybrid AI System for Improvising Jokes in a Conversation ( http://arxiv.org/abs/2301.02695v1 )

License: see the linked page

Joe Toplyn

(参考訳) 以前の論文ではWitscriptとWitscript 2が紹介されていた。 Witscriptはワードプレイに依存するジョークを生成するが、Witscript 2で生成されるジョークは常識に依存する。本稿では, 3つのジョーク生成機構を用いてジョーク候補を生成し, 出力する最適な候補を選択する Witscript 3 を提示した。 WitscriptやWitscript 2と同様に、Witscript 3はプロのコメディライターが作ったユーモアアルゴリズムに基づいている。人間はwitscript 3の入力文に対する応答を44%の冗談だと判断した。これは、Witscript 3がチャットボットに人間のようなユーモアを与えるための別のステップであることを示す証拠である。

Previous papers presented Witscript and Witscript 2, AI systems for improvising jokes in a conversation. Witscript generates jokes that rely on wordplay, whereas the jokes generated by Witscript 2 rely on common sense. This paper extends that earlier work by presenting Witscript 3, which generates joke candidates using three joke production mechanisms and then selects the best candidate to output. Like Witscript and Witscript 2, Witscript 3 is based on humor algorithms created by an expert comedy writer. Human evaluators judged Witscript 3's responses to input sentences to be jokes 44% of the time. This is evidence that Witscript 3 represents another step toward giving a chatbot a humanlike sense of humor.

Translated: 2023-01-10 15:39:03 Published: 2023-01-06

# 感覚関係の階層を生かした談話関係感覚の対比学習の促進

Facilitating Contrastive Learning of Discourse Relational Senses by Exploiting the Hierarchy of Sense Relations ( http://arxiv.org/abs/2301.02724v1 )

License: see the linked page

Wanqiu Long and Bonnie Webber

(参考訳) 暗黙の談話関係認識は、2つの隣接するテキストの間に保持される感覚や感覚を、それらの間に明示的な接続性がない場合に識別する難しいタスクである。 PDTB-2とPDTB-3の両方では、談話関係感覚は4つの広義のトップレベル感覚からより特定の感覚まで3段階の階層に分類される。暗黙的談話関係認識に関するほとんどの以前の研究は、センス階層を単にどのセンスラベルが利用可能かを示すために使用してきた。ここではさらに -- 認識プロセス自体にセンス階層を組み込んで、対比学習で使用される否定的な例を選択する。追加の努力なしに、このアプローチはタスクの最先端のパフォーマンスを達成する。

Implicit discourse relation recognition is a challenging task that involves identifying the sense or senses that hold between two adjacent spans of text, in the absence of an explicit connective between them. In both PDTB-2 and PDTB-3, discourse relational senses are organized into a three-level hierarchy ranging from four broad top-level senses to more specific senses below them. Most previous work on implicit discourse relation recognition has used the sense hierarchy simply to indicate what sense labels were available. Here we do more -- incorporating the sense hierarchy into the recognition process itself and using it to select the negative examples used in contrastive learning. With no additional effort, the approach achieves state-of-the-art performance on the task.

Translated: 2023-01-10 15:38:52 Published: 2023-01-06

# フルフィールド超音波キャラクタリゼーションのための深層学習

Deep learning for full-field ultrasonic characterization ( http://arxiv.org/abs/2301.02378v1 )

License: see the linked page

Yang Xu, Fatemeh Pourahmadian, Jian Song, Conglin Wang

(参考訳) 本研究は、機械学習の最近の進歩を活用し、全波形データから層状成分の機械的特性を分散再構築するための物理ベースのデータ解析プラットフォームを構築する。本稿では,2つの論理,すなわち直接反転と物理インフォームドニューラルネットワーク(PINN)について検討する。直接反転には3つのステップがある。 (i)フルフィールドデータのスペクトル分別と微分二各領域における未知の物理・正規化パラメータのプロファイルを近似するための適切なニューラルマップの構築、及び 3)Tikhonov-regularized PDE損失の最小化によるニューラルネットワークの同時学習 (i)。 PINNは、フィールド変数が物理的未知や損失関数重みのような(スケールまたは分散された)補助パラメータによって与えられるニューラルネットワークによってモデル化されるマルチタスク学習を通じて予測能力を持つ複雑なシステムの効率的なサロゲートモデルを提供する。 PINNは、基礎となる物理法則に基づくデータ不適合の尺度を制約として最小化することで訓練される。本研究では,超音波データからの学習を容易にするため,ピンズロスを採用する。 (a)データ不適合を計算するための波数依存のソボレフノルム b) PDEの形式を弾性波伝搬に活用することにより, 損失目標を自然にバランスさせる, 特定のスケーリングフレームワークにおける非適応重み付けを行う。どちらのパラダイムも合成データと実験室テストデータで調べられる。後者の場合、複数の周波数で再構成を行い、データ駆動モデリングにおける検証と検証の重要性を強調した相補的な実験によって結果が検証される。

This study takes advantage of recent advances in machine learning to establish a physics-based data analytic platform for distributed reconstruction of mechanical properties in layered components from full waveform data. In this vein, two logics, namely the direct inversion and physics-informed neural networks (PINNs), are explored. The direct inversion entails three steps: (i) spectral denoising and differentiation of the full-field data, (ii) building appropriate neural maps to approximate the profile of unknown physical and regularization parameters on their respective domains, and (iii) simultaneous training of the neural networks by minimizing the Tikhonov-regularized PDE loss using data from (i). PINNs furnish efficient surrogate models of complex systems with predictive capabilities via multitask learning where the field variables are modeled by neural maps endowed with (scalar or distributed) auxiliary parameters such as physical unknowns and loss function weights. PINNs are then trained by minimizing a measure of data misfit subject to the underlying physical laws as constraints. In this study, to facilitate learning from ultrasonic data, the PINNs loss adopts (a) wavenumber-dependent Sobolev norms to compute the data misfit, and (b) non-adaptive weights in a specific scaling framework to naturally balance the loss objectives by leveraging the form of PDEs germane to elastic-wave propagation. Both paradigms are examined via synthetic and laboratory test data. In the latter case, the reconstructions are performed at multiple frequencies and the results are verified by a set of complementary experiments highlighting the importance of verification and validation in data-driven modeling.

Translated: 2023-01-10 00:36:51 Published: 2023-01-06

# 深部生物学的経路インフォームド・パス-ゲノム多モード生存予測

Deep Biological Pathway Informed Pathology-Genomic Multimodal Survival Prediction ( http://arxiv.org/abs/2301.02383v1 )

License: see the linked page

Lin Qiu, Aminollah Khormali, Kai Liu

(参考訳) 病理画像やゲノムデータなどのマルチモーダルデータの統合は、パーソナライズされた治療におけるがんの不均一性と複雑性の理解、および生存予測の強化に不可欠である。病理学とゲノムデータを統合する進歩にもかかわらず、ほとんどの既存の手法は複雑なモダリティ間の関係を完全に掘り出すことはできない。さらに、前臨床発見と臨床予測を管理するこれらのモデルから説明可能な特徴を特定することは、がんの診断、予後、治療反応の研究に不可欠である。生命予後予測だけでなく, 生存率の異なる遺伝子や経路を同定するために, 病理画像とゲノムデータを統合した, 新たな生物学的経路形成型病理遺伝深層モデル ponet を提案する。 The Cancer Genome Atlas (TCGA) データセットの6つの実験結果から,提案手法は優れた予測性能を示し,有意義な生物学的解釈を示した。提案手法は,疾患の理解と治療耐性の予測に汎用的な応用性を有するマルチモーダルバイオメディカルデータを用いた,生体情報による深層ネットワークの訓練方法に関する知見を確立する。

The integration of multi-modal data, such as pathological images and genomic data, is essential for understanding cancer heterogeneity and complexity for personalized treatments, as well as for enhancing survival predictions. Despite the progress made in integrating pathology and genomic data, most existing methods cannot mine the complex inter-modality relations thoroughly. Additionally, identifying explainable features from these models that govern preclinical discovery and clinical prediction is crucial for cancer diagnosis, prognosis, and therapeutic response studies. We propose PONET, a novel biological pathway-informed pathology-genomic deep model that integrates pathological images and genomic data not only to improve survival prediction but also to identify genes and pathways that cause different survival rates in patients. Empirical results on six of The Cancer Genome Atlas (TCGA) datasets show that our proposed method achieves superior predictive performance and reveals meaningful biological interpretations. The proposed method establishes insight into how to train biologically informed deep networks on multimodal biomedical data, which will have general applicability for understanding diseases and predicting response and resistance to treatment.

Translated: 2023-01-10 00:36:27 Published: 2023-01-06

# 多世代音楽変換器-完全長楽譜

Multi-Genre Music Transformer -- Composing Full Length Musical Piece ( http://arxiv.org/abs/2301.02385v1 )

License: see the linked page

Abhinav Kaushal Keshari

(参考訳) 音楽を生成するタスクにおいて、アートファクタは大きな役割を担い、AIにとって大きな課題である。従来は、新しい楽曲を制作するための敵対的な訓練や、様々な音楽(ビート、テンポ、音楽ステム)の互換性をモデル化する作業は、この課題を学習する素晴らしい例であった。これはマッシュアップやテンポや鍵分布から学習的特徴を発生させることに限られていた。複合語トランスフォーマーは、複合語で定義された音楽イベントを含むシーケンス生成チャレンジとして音楽生成タスクを表現することができた。これらの音楽イベントは、音符の進行、コードの変更、調和、芸術的要素をより正確に記述する。本研究の目的は,楽曲のジャンルや形式も考慮した課題を含む,より適応的な学習プロセスを通じて楽曲の制作を学ぶマルチジャンルトランスフォーマーの実装である。我々は,複数種類の複合語データセットを構築し,このデータセット上で学習した線形トランスフォーマを実装した。このマルチジャンルトランスフォーマーは、オリジナル曲に匹敵する多種多様な新曲をフルタイムで生成することができた。モデルは他のモデルより2～5倍速い速度で走行する。

In the task of generating music, the art factor plays a big role and is a great challenge for AI. Previous work involving adversarial training to produce new music pieces and modeling the compatibility of variety in music (beats, tempo, musical stems) demonstrated great examples of learning this task. However, that work was limited to generating mashups or learning features from tempo and key distributions to produce similar patterns. The Compound Word Transformer was able to represent the music generation task as a sequence generation challenge involving musical events defined by compound words. These musical events give a more accurate description of note progression, chord changes, harmony, and the art factor. The objective of this project is to implement a Multi-Genre Transformer that learns to produce music pieces through a more adaptive learning process involving a more challenging task in which the genre or form of the composition is also considered. We built a multi-genre compound word dataset and implemented a linear transformer trained on this dataset. We call this the Multi-Genre Transformer; it was able to generate full-length new musical pieces that are diverse and comparable to the original tracks. The model trains 2-5 times faster than the other models discussed.

Translated: 2023-01-10 00:36:08 Published: 2023-01-06

# 心電図同期のためのデータ駆動ガウスプロセスフィルタ

A Data-Driven Gaussian Process Filter for Electrocardiogram Denoising ( http://arxiv.org/abs/2301.02607v1 )

License: see the linked page

Mircea Dumitru, Qiao Li, Erick Andres Perez Alday, Ali Bahrami Rad, Gari D. Clifford, Reza Sameni

(参考訳) 目的: 心電図 (ECG) フィルタリングを含む様々な用途に効果的に使用されているガウス過程 (GP) ベースのフィルタは、計算的に要求され、そのハイパーパラメータの選択は通常アドホックである。方法: ecgフェーズドメイン(ecg phase domain)という概念を用いて、データ駆動gpフィルタを開発し、一定数のサンプルにecgビートをタイムウォードで表現し、ガウス分布に従うと仮定したrピークをアライメントする。この仮定の下で、サンプル平均と共分散行列の計算を単純化し、アドホックなハイパーパラメータなしでデータ駆動方式でGPフィルタの効率的な実装を可能にする。提案フィルタはPhyloNet QTデータベース上で,最先端のウェーブレットベースフィルタと比較して評価する。付加雑音を用いた5dBステップにおいて,5dBから30dBまでのSNRレベルにおけるフィルタの信号対雑音比(SNR)改善を測定して評価を行った。臨床評価のために, 原信号とフィルタ信号のqt間隔の推定誤差を測定し, ベンチマークフィルタと比較した。結果: 提案するgpフィルタは, 全雑音レベルのベンチマークフィルタよりも優れていることが示された。また、QT間隔推定誤差バイアスと分散の観点から、最先端フィルタよりも優れている。結論: GPフィルタは臨床および研究応用においてECGを前処理するための汎用的手法であり, 任意の長さとサンプリング周波数のECGに適用可能であり, その性能に対する信頼区間を提供する。

Objective: Gaussian Processes (GP)-based filters, which have been effectively used for various applications including electrocardiogram (ECG) filtering, can be computationally demanding, and the choice of their hyperparameters is typically ad hoc. Methods: We develop a data-driven GP filter to address both issues, using the notion of the ECG phase domain -- a time-warped representation of the ECG beats onto a fixed number of samples and aligned R-peaks, which is assumed to follow a Gaussian distribution. Under this assumption, the computation of the sample mean and covariance matrix is simplified, enabling an efficient implementation of the GP filter in a data-driven manner, with no ad hoc hyperparameters. The proposed filter is evaluated and compared with a state-of-the-art wavelet-based filter on the PhysioNet QT Database. The performance is evaluated by measuring the signal-to-noise ratio (SNR) improvement of the filter at SNR levels ranging from -5 to 30dB, in 5dB steps, using additive noise. For a clinical evaluation, the error between the estimated QT-intervals of the original and filtered signals is measured and compared with the benchmark filter. Results: It is shown that the proposed GP filter outperforms the benchmark filter for all the tested noise levels. It also outperforms the state-of-the-art filter in terms of QT-interval estimation error bias and variance. Conclusion: The proposed GP filter is a versatile technique for preprocessing the ECG in clinical and research applications, is applicable to ECG of arbitrary lengths and sampling frequencies, and provides confidence intervals for its performance.
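
The SNR-improvement measure used in this kind of evaluation can be sketched as follows, assuming the usual definition of SNR as the ratio of signal power to error power in decibels. The functions and the toy signals are hypothetical and do not implement the GP filter itself; `filtered` simply stands in for the output of some denoiser.

```python
import math


def snr_db(clean, observed):
    """SNR in dB of `observed` relative to the reference `clean` signal.

    Assumes `observed` differs from `clean`; identical signals would give
    zero error power and a division by zero.
    """
    signal_power = sum(s * s for s in clean)
    error_power = sum((o - s) ** 2 for s, o in zip(clean, observed))
    return 10.0 * math.log10(signal_power / error_power)


def snr_improvement_db(clean, noisy, filtered):
    """Output SNR minus input SNR, in dB."""
    return snr_db(clean, filtered) - snr_db(clean, noisy)


clean = [0.0, 1.0, 0.0, -1.0]
noisy = [0.2, 1.2, -0.2, -0.8]         # error amplitude 0.2 per sample
filtered = [0.02, 1.02, -0.02, -0.98]  # hypothetical output, error amplitude 0.02
improvement = snr_improvement_db(clean, noisy, filtered)
```

Reducing the error amplitude by a factor of ten reduces the error power by a factor of one hundred, which corresponds to an improvement of about 20 dB.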

Translated: 2023-01-10 00:35:49 Published: 2023-01-06

# NEC違反:トンネルとカシミール効果

NEC violation: Tunnelling versus the Casimir effect ( http://arxiv.org/abs/2301.02455v1 )

License: see the linked page

Jean Alexandre and Drew Backhouse

(参考訳) 有限体積で許容される2つの縮退したミニマ間のトンネルは、非拡張対称な基底状態をもたらす。これにより、フィールドを含むボックス内の連続的なモーメントの集合が仮定された場合、十分な低温でヌルエネルギ条件に違反する。離散モーメントを考慮すると、この図を修正でき、トンネルによって引き起こされる基底状態エネルギーにカシミールエネルギーを加えることで達成される。ゼロ温度に焦点をあてると、これらの非自明な効果は、典型的な長さスケールに依存する。

We show that tunnelling between two degenerate minima, as allowed in a finite volume, leads to a non-extensive symmetric ground state. This results in Null Energy Condition violation for sufficiently low temperatures, when a continuous set of momenta in the box containing the field is assumed. Taking into account discrete momenta can modify this picture and is achieved via the addition of the Casimir energy to the tunnelling-induced ground state energy. Focusing on zero-temperature, these non-trivial effects are found to compete, depending on the typical length scales involved.

Translated: 2023-01-10 00:34:49 Published: 2023-01-06

# 有限温度シミュレーションのための適応変分量子最小絡み合い典型的な熱状態

Adaptive variational quantum minimally entangled typical thermal states for finite temperature simulations ( http://arxiv.org/abs/2301.02592v1 )

License: see the linked page

João C. Getelina, Niladri Gomes, Thomas Iadecola, Peter P. Orth, Yong-Xin Yao

(参考訳) 熱平衡における量子多体系のシミュレーションのためのスケーラブルな量子アルゴリズムは、有限温度における量子物質の特性を予測するのに重要である。ここでは,最小絡み合った典型的な熱状態(metts)アルゴリズムの量子コンピューティング版について記述し,ベンチマークを行った。 AVQMETTSと呼ばれるアルゴリズムは、ノイズの多い中間スケール量子(NISQ)ハードウェアに適した、コンパクトで問題固有の量子回路を動的に生成する。我々は、状態ベクトルシミュレータ上でAVQMETTSをベンチマークし、1次元と2次元の積分可能および非可積分量子スピンモデルの熱エネルギー計算を行い、回路複雑性の概して線形なスケールを示す。最後に,二次元横磁場イジングモデルの有限温度相転移線をマッピングする。

Scalable quantum algorithms for the simulation of quantum many-body systems in thermal equilibrium are important for predicting properties of quantum matter at finite temperatures. Here we describe and benchmark a quantum computing version of the minimally entangled typical thermal states (METTS) algorithm for which we adopt an adaptive variational approach to perform the required quantum imaginary time evolution. The algorithm, which we name AVQMETTS, dynamically generates compact and problem-specific quantum circuits, which are suitable for noisy intermediate-scale quantum (NISQ) hardware. We benchmark AVQMETTS on statevector simulators and perform thermal energy calculations of integrable and nonintegrable quantum spin models in one and two dimensions and demonstrate an approximately linear system-size scaling of the circuit complexity. Finally, we map out the finite-temperature phase transition line of the two-dimensional transverse field Ising model.

Translated: 2023-01-10 00:34:39 Published: 2023-01-06

# 遅延系の階層的運動方程式(heom)アナログ:共振器間光子伝播を例に

A hierarchical equations of motion (HEOM) analog for systems with delay: illustrated on inter-cavity photon propagation ( http://arxiv.org/abs/2301.02626v1 )

License: see the linked page

Robert Fuchs and Marten Richter

(参考訳) 過去20年間で、谷村と久保の階層的運動方程式(HEOM)は、システムバス問題の数値計算のための動きに基づくツールの方程式となっている。 HEOMは今日では、外浴を通しての散逸・移行プロセスの多くに一般化されている。空間的に拡張されたフォトニック系では、浴槽内の光子の伝播は量子エミッタのカップリングの遅延/遅延を引き起こす。ここで、HEOMの導出の背後にあるアイデアは光子遅延の場合に一般化され、2つの誘電スラブの単純な例に適用される。導出方程式は遅延を記述するための単純な信頼できる枠組みを提供し、経路積分処理の代替となるかもしれない。

Over the last two decades, the hierarchical equations of motion (HEOM) of Tanimura and Kubo have become the equation of motion-based tool for numerically exact calculations of system-bath problems. The HEOM is today generalized to many cases of dissipation and transfer processes through an external bath. In spatially extended photonic systems, the propagation of photons through the bath leads to retardation/delays in the coupling of quantum emitters. Here, the idea behind the HEOM derivation is generalized to the case of photon retardation and applied to the simple example of two dielectric slabs. The derived equations provide a simple reliable framework for describing retardation and may provide an alternative to path integral treatments.

Translated: 2023-01-10 00:34:24 Published: 2023-01-06

# 高性能コンピューティングにおける神話と伝説

Myths and Legends in High-Performance Computing ( http://arxiv.org/abs/2301.02432v1 )

License: see the linked page

Satoshi Matsuoka, Jens Domke, Mohamed Wahib, Aleksandr Drozd, Torsten Hoefler

(参考訳) このユーモラスで思考を挑発する記事では、ハイパフォーマンスコンピューティングコミュニティのメンバーの間で伝承される神話や伝説について論じる。カンファレンスやミーティング、プロダクト広告、論文、さらにはツイートやブログ、ニュース記事といったコミュニケーションから、コミュニティ内(そしてそれ以上)でこれらの神話を収集しました。それらは、デンナード・スケーリングやムーアの法則のような多くのスケーリング法則の終わりによって引き起こされた、現在の大規模な変化の時代におけるジートジストであると信じています。いくつかの法則が終わる一方で、アルゴリズムスケーリングや新しいアーキテクチャ研究など、新しい方向性が開かれる。しかし、これらの神話は科学的事実に基づくことはめったにないが、しばしばいくつかの証拠や議論に基づいている。実際、これは多くの神話が存在する理由であり、それが明確に答えられない理由であると信じている。それぞれに明確な答えがあるように感じられるが、ベートーヴェンがモーツァルトより優れているかどうかという問題など、哲学的な議論が絶え間ない。我々は、私たちの神話の収集を、研究と産業投資の新たな方向性に関する議論として見たいと思っています。

In this humorous and thought-provoking article, we discuss certain myths and legends that are folklore among members of the high-performance computing community. We collected those myths from conversations at conferences and meetings, product advertisements, papers, and other communications such as tweets, blogs, and news articles within (and beyond) our community. We believe they represent the zeitgeist of the current era of massive change, driven by the end of many scaling laws such as Dennard scaling and Moore's law. While some laws end, new directions open up, such as algorithmic scaling or novel architecture research. However, these myths are rarely based on scientific facts but often on some evidence or argumentation. In fact, we believe that this is the very reason for the existence of many myths and why they cannot be answered clearly. While it feels like there should be clear answers for each, some may remain endless philosophical debates, such as the question of whether Beethoven was better than Mozart. We would like to see our collection of myths as a discussion of possible new directions for research and industry investment.

Translated: 2023-01-10 00:34:13 Published: 2023-01-06

# 量子多重アクセスワイヤタップチャネル:ワンショットで実現可能なシークレットレート領域について

Quantum Multiple Access Wiretap Channel: On the One-Shot Achievable Secrecy Rate Regions ( http://arxiv.org/abs/2301.02479v1 )

License: see the linked page

Hadi Aghaee and Bahareh Akhbari

(参考訳) 本稿では,古典的量子多重アクセスパケットチャネル(CQ-MA-WTC)をワンショット設定で検討する。そこで本研究では,CQ-MA-WTCを同時位置ベース復号器を用いて解析し,信頼性の高い復号化を行う。また,CQ-MA-WTC を Sen のワンショット継手典型補題を用いて解析し,信頼性の高い復号化を行う。同時位置ベースデコーダは、複数の仮説テスト問題を引き起こす傾向がある。また、凸分割を用いて同時シナリオにおけるプライバシー基準を分析することも問題となる。両問題を克服するために,まず cq-ma-wtc の双対と見なすことのできる新しいチャネルを導入する。このチャネルは、複数のメッセージ(pp-qwtc)を持つポイントツーポイント量子ワイヤータップチャネルと呼ばれる。以下では,この問題を解決するための戦略として,量子放送チャネル(qbcs)をワンショット設定で検討し,解析する。

In this paper, we investigate classical-quantum multiple access wiretap channels (CQ-MA-WTC) in the one-shot setting. In this regard, we analyze the CQ-MA-WTC using a simultaneous position-based decoder for reliable decoding and a newly introduced technique for secure decoding. Also, for the sake of comparison, we analyze the CQ-MA-WTC using Sen's one-shot joint typicality lemma for reliable decoding. The simultaneous position-based decoder leads to a multiple hypothesis testing problem. Also, using convex splitting to analyze the privacy criteria in a simultaneous scenario becomes problematic. To overcome both problems, we first introduce a new channel that can be considered as a dual to the CQ-MA-WTC. This channel is called a point-to-point quantum wiretap channel with multiple messages (PP-QWTC). Then, as a strategy to solve the problem, we also investigate and analyze quantum broadcast channels (QBCs) in the one-shot setting.

Translated: 2023-01-10 00:33:55 Published: 2023-01-06

# グラフェン系ナノアンテナにおけるエッジ効果とコンダクタンスの理論

Theory of Edge Effects and Conductance for Applications in Graphene-based Nanoantennas ( http://arxiv.org/abs/2301.02441v1 )

ライセンス: Link先を確認

Tomer Berghaus, Touvia Miloh, Oded Gottlieb, and Gregory Slepyan

(参考訳) 本稿では,グラフェンにおけるエッジ効果の理論を,テラヘルツ,赤外線,可視周波数領域のナノアンテナへの応用に適用する。その特性は、通常の表面伝導率ではなく、動的導電率の観点から定式化して到達した自己整合性である。エッジ効果の物理的モデルは、ディラックフェルミオンの概念に基づいている。表面コンダクタンスは一般感受性と見なされ、kuboアプローチによって計算される。以前のモデルとは対照的に、表面コンダクタンスは非均質かつ非局所となる。表面コンダクタンスの空間的挙動は、シートの長さと電気化学的ポテンシャルに依存する。数値シミュレーションの結果,2.1-800nmの範囲と0.1-1.0ev範囲の電気化学ポテンシャルについて検討した。長さが800nmを超えると、我々のモデルは比較的高い精度で古典的なドリュード導電率モデルと一致することが示されている。比較的短い長さでは、導電性は通常空間振動を示し、導電性に欠け、グラフェン系アンテナの特性に強く影響を及ぼす。このような空間振動の周期と振幅は、電気化学的ポテンシャルに強く依存する。新しい理論は、ゲート電圧の電気化学的ポテンシャルを変化させることで、電気制御されたナノアンテナを実現する方法を開く。得られた結果は、現代の量子技術における炭素系ナノデバイスの設計に適用できる。

In this paper, we develop a theory of edge effects in graphene for its applications to nanoantennas in the terahertz, infrared, and visible frequency ranges. Its characteristic feature is self-consistency, reached due to the formulation in terms of dynamical conductance instead of the ordinarily used surface conductivity. The physical model of edge effects is based on the concept of Dirac fermions. The surface conductance is considered as a general susceptibility and is calculated via the Kubo approach. In contrast with earlier models, the surface conductance becomes nonhomogeneous and nonlocal. The spatial behavior of the surface conductance depends on the length of the sheet and the electrochemical potential. Results of numerical simulations are presented for lengths in the range of 2.1-800 nm and electrochemical potentials in the range of 0.1-1.0 eV. It is shown that if the length exceeds 800 nm, our model agrees with the classical Drude conductivity model with a relatively high degree of accuracy. For rather short lengths, the conductance usually exhibits spatial oscillations, which are absent in conductivity and strongly affect the properties of graphene-based antennas. The period and amplitude of such spatial oscillations strongly depend on the electrochemical potential. The new theory opens the way for realizing electrically controlled nanoantennas by changing the electrochemical potential by means of the gate voltage. The obtained results may be applicable to the design of carbon-based nanodevices in modern quantum technologies.

翻訳日:2023-01-10 00:27:02 公開日:2023-01-06

# フェムト秒パルス駆動非線形光ファイバにおける偏光励起光の発生の最適化

Optimizing the generation of polarization squeezed light in nonlinear optical fibers driven by femtosecond pulses ( http://arxiv.org/abs/2301.02454v1 )

ライセンス: Link先を確認

A. V. Andrianov, N. A. Kalinin, A. A. Sorokin, E. A. Anashkina, L. L. Sanchez-Soto, J. F. Corney, and G. Leuchs

(参考訳) 超短パルスレーザーに対するkerr効果を利用した光ファイバでは、明るい絞り光を生成することができる。しかし、繊維中のパルス伝搬は、スクイーズを劣化させる非保存効果を受ける。本稿では,su(2)不変で技術的な摂動に対して頑健な2モード偏光スクイージングを解析し,偏光維持ファイバ内で生成する。我々は、様々な非保存効果と実ファイバーデータを含むファイバの量子パルス進化の先進モデルを用いて、プロセスとパルスパラメータの厳密な数値最適化を行う。数値結果は実験結果と一致している。

Bright squeezed light can be generated in optical fibers utilizing the Kerr effect for ultrashort laser pulses. However, pulse propagation in a fiber is subject to nonconservative effects that deteriorate the squeezing. Here, we analyze two-mode polarization squeezing, which is SU(2)-invariant, robust against technical perturbations, and can be generated in a polarization-maintaining fiber. We perform a rigorous numerical optimization of the process and the pulse parameters using our advanced model of quantum pulse evolution in the fiber that includes various nonconservative effects and real fiber data. Numerical results are consistent with experimental results.

翻訳日:2023-01-10 00:26:43 公開日:2023-01-06

# 未検出光による量子イメージング蒸留実験

Experimental quantum imaging distillation with undetected light ( http://arxiv.org/abs/2301.02529v1 )

ライセンス: Link先を確認

Jorge Fuenzalida, Marta Gilaberte Basset, Sebastian Töpfer, Juan P. Torres, Markus Gräfe

(参考訳) 誘導コヒーレンス効果に基づくイメージングは、光子対を用いて、それをプローブする光を検出することなく、物体の情報を得る。 1つの光子が物体を照らすが、そのパートナーのみが検出されるため、偶然の事象の測定は不要である。検出された光子の特定の干渉パターンを観察して、追従対象の情報を開示する。ここでは、この撮像技術がノイズに耐性を持たせることを実験的に実証する。本稿では,関心信号の干渉変調に基づく画像蒸留法を提案する。提案手法は,実利得信号の250倍のノイズレベルに対しても高品質の画像を生成することができることを示す。また、我々の発見に関する詳細な理論的説明も含んでいる。

Imaging based on the induced coherence effect makes use of photon pairs to obtain information about an object without detecting the light that probes it. While one photon illuminates the object, only its partner is detected, so no measurement of coincidence events is needed. The sought-after object's information is revealed by observing a certain interference pattern on the detected photon. Here we demonstrate experimentally that this imaging technique can be made resilient to noise. We introduce an imaging distillation approach based on the interferometric modulation of the signal of interest. We show that our scheme can generate a high-quality image of an object even against noise levels up to 250 times the actual signal of interest. We also include a detailed theoretical explanation of our findings.

翻訳日:2023-01-10 00:26:33 公開日:2023-01-06

# アウトデコヒーレンスによる古典性:概念、マルコビアン性との関係、およびランダム行列論アプローチ

Classicality with(out) decoherence: Concepts, relation to Markovianity, and a random matrix theory approach ( http://arxiv.org/abs/2301.02563v1 )

ライセンス: Link先を確認

Philipp Strasberg

(参考訳) 古典の世界が量子物理学の根底からどのように現われるかという疑問に対する答えは、次のように再検討され、連結され、拡張される。まず、オープン量子系のデコヒーレンス、一貫性/デコヒーレントヒストリー、コルモゴロフ一貫性の3つの異なる概念を比較する。第二に、これらの概念をつなぐ量子マルコフ性(厳密に定義される)の重要な役割が確立される。第3に、ランダム行列理論モデルを用いて、大量のコヒーレンスが存在するにもかかわらず、遅い観測値と粗い観測値の測定統計値において、量子効果が指数関数的に抑制されることが示されている。これはまた数値的に例示されており、古典性の出現に対する非可積分性とカオスの可能性と重要性を強調している。

Answers to the question of how a classical world emerges from underlying quantum physics are revisited, connected and extended as follows. First, three distinct concepts are compared: decoherence in open quantum systems, consistent/decoherent histories and Kolmogorov consistency. Second, the crucial role of quantum Markovianity (defined rigorously) in connecting these concepts is established. Third, using a random matrix theory model, quantum effects are shown to be exponentially suppressed in the measurement statistics of slow and coarse observables despite the presence of a large amount of coherence. This is also numerically exemplified, and it highlights the potential and importance of non-integrability and chaos for the emergence of classicality.

翻訳日:2023-01-10 00:26:14 公開日:2023-01-06

# エンドツーエンド無線通信のためのハイブリッド量子古典オートエンコーダ

Hybrid Quantum-Classical Autoencoders for End-to-End Radio Communication ( http://arxiv.org/abs/2301.02609v1 )

ライセンス: Link先を確認

Zsolt Tabi, Bence Bakó, Dániel T. R. Nagy, Péter Vaderna, Zsófia Kallus, Péter Hága, Zoltán Zimborás

(参考訳) 量子ニューラルネットワークは、ノイズの多い量子処理ユニットを応用するための候補として浮上している。本稿では,エンドツーエンド無線通信のためのハイブリッド量子古典オートエンコーダを提案する。古典的無線システムの物理層において,ノイズチャネル上での標準符号化無線信号のシミュレーションアーキテクチャの性能について検討する。我々は、受信機内の量子デコーダが送信部内の古典的エンコーダで動作するハイブリッドモデルを実装した。信号劣化に対する堅牢性に優れた入力シンボルの潜在空間表現を学ぶことに加えて、量子ビット回路の一般化されたデータ再ロードスキームにより、アプリケーションの推論時間制約を満たすことができる。

Quantum neural networks are emerging as potential candidates to leverage noisy quantum processing units for applications. Here we introduce hybrid quantum-classical autoencoders for end-to-end radio communication. In the physical layer of classical wireless systems, we study the performance of simulated architectures for standard encoded radio signals over a noisy channel. We implement a hybrid model, where a quantum decoder in the receiver works with a classical encoder in the transmitter. Besides learning a latent space representation of the input symbols with good robustness against signal degradation, a generalized data re-uploading scheme for the qubit-based circuits makes it possible to meet the inference-time constraints of the application.

翻訳日:2023-01-10 00:25:59 公開日:2023-01-06

# ハード組合せ問題に対する量子価格に基づく列生成フレームワーク

A quantum pricing-based column generation framework for hard combinatorial problems ( http://arxiv.org/abs/2301.02637v1 )

ライセンス: Link先を確認

Wesley da Silva Coelho, Loïc Henriet, Louis-Paul Henry

(参考訳) 本研究では、中性原子プラットフォームに基づく量子サンプリング器を含む完全ハイブリッド古典量子アルゴリズムを提案する。このアプローチは、オペレーションリサーチの分野で開発された古典列生成フレームワークにインスパイアされ、量子プロシージャが古典的な解法にどのように役立つかを示す。提案手法を最小頂点色問題にベンチマークし,提案したハイブリッド量子古典列生成アルゴリズムが比較的数イテレーションで優れた解が得られることを示す。結果と最先端の古典的手法と量子的アプローチを比較した。

In this work, we present a complete hybrid classical-quantum algorithm involving a quantum sampler based on neutral atom platforms. This approach is inspired by classical column generation frameworks developed in the field of Operations Research and shows how quantum procedures can assist classical solvers in addressing hard combinatorial problems. We benchmark our method on the Minimum Vertex Coloring problem and show that the proposed hybrid quantum-classical column generation algorithm can yield good solutions in relatively few iterations. We compare our results with state-of-the-art classical and quantum approaches.

翻訳日:2023-01-10 00:25:48 公開日:2023-01-06
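
For readers unfamiliar with column generation, the textbook set-covering formulation of minimum vertex coloring is sketched below. This is the standard classical formulation, not something taken from the abstract; the symbols $\mathcal{S}'$, $\lambda_s$ and $\pi_v$ are notation introduced here, and the assumption that the quantum sampler is used for the pricing step is inferred only from the title.

```latex
% Restricted master problem over a subset S' of the independent sets of G = (V, E):
% choose as few independent sets (colors) as possible so that every vertex is covered.
\begin{align*}
\min_{\lambda \ge 0}\quad & \sum_{s \in \mathcal{S}'} \lambda_s \\
\text{s.t.}\quad & \sum_{s \in \mathcal{S}' :\, v \in s} \lambda_s \ge 1 \qquad \forall v \in V .
\end{align*}
% Pricing step, with \pi_v \ge 0 the dual values of the covering constraints:
\[
s^\star \in \arg\max_{s \text{ independent in } G} \sum_{v \in s} \pi_v ,
\qquad \text{add } s^\star \text{ to } \mathcal{S}' \text{ if } 1 - \sum_{v \in s^\star} \pi_v < 0 .
\]
```

The pricing step is a maximum-weight independent set problem, which is the hard subproblem a sampler would be asked to approximate; the loop stops when no column with negative reduced cost is found.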

# マイクロ波光子計数による単一電子スピン共鳴検出

Single electron-spin-resonance detection by microwave photon counting ( http://arxiv.org/abs/2301.02653v1 )

ライセンス: Link先を確認

Zhiren Wang, Léo Balembois, Milos Rančić, Eric Billaud, Marianne Le Dantec, Alban Ferrier, Philippe Goldner, Sylvain Bertaina, Thierry Chanelière, Daniel Estève, Denis Vion, Patrice Bertet, Emmanuel Flurin

(参考訳) 電子スピン共鳴(esr)分光法は、化学から量子コンピューティングまで幅広い応用を含む、常磁性不純物を特徴付ける方法であるが、信号対雑音比が限られているため、アンサンブル平均量のみにアクセスできる。しかし、スピン依存フォトルミネッセンス、輸送測定、走査プローブ技術を用いて単一電子スピン感度が達成されている。これらの手法は、小さな検出ボリュームでのみシステム固有のものであるか、感度が高いため、実用的な単一スピン検出は未解決の課題である。ここでは、極低温のマイクロ波光子カウンタを用いて、スピン蛍光検出による単一電子磁気共鳴を実証する。高品質平面超伝導共振器に結合したシェーライト結晶中の個々の常磁性エルビウムイオンを検出し、その放射減衰速度を1秒で信号対雑音比1.9で向上させる。蛍光信号は、個々のエミッターに由来することを証明し、反膨らみを示す。 3msまでのコヒーレンス時間は測定され、スピン放射寿命によって制限される。この方法は、十分な非放射性緩和時間を持つ任意の常磁性種に適用できる可能性があり、共振器磁気モード体積(10 um^3)と他の単スピン検出技術より桁違い大きい体積での単スピン検出を可能にする。したがって、磁気共鳴や量子コンピューティングに応用できるかもしれない。

Electron spin resonance (ESR) spectroscopy is the method of choice for characterizing paramagnetic impurities, with applications ranging from chemistry to quantum computing, but it gives access only to ensemble-averaged quantities due to its limited signal-to-noise ratio. Single-electron-spin sensitivity has, however, been reached using spin-dependent photoluminescence, transport measurements, and scanning-probe techniques. These methods are system-specific or sensitive only in a small detection volume, so that practical single-spin detection remains an open challenge. Here, we demonstrate single-electron magnetic resonance by spin fluorescence detection, using a microwave photon counter at cryogenic temperatures. We detect individual paramagnetic erbium ions in a scheelite crystal coupled to a high-quality-factor planar superconducting resonator to enhance their radiative decay rate, with a signal-to-noise ratio of 1.9 in one second of integration time. The fluorescence signal shows anti-bunching, proving that it comes from individual emitters. Coherence times up to 3 ms are measured, limited by the spin radiative lifetime. The method has the potential to apply to arbitrary paramagnetic species with a long enough non-radiative relaxation time, and allows single-spin detection in a volume as large as the resonator magnetic mode volume (10 µm^3 in the present experiment), orders of magnitude larger than with other single-spin detection techniques. As such, it may find applications in magnetic resonance and quantum computing.

翻訳日:2023-01-10 00:25:40 公開日:2023-01-06

# プライオリティ投票力の測定 - デリゲートを真剣に考える

Measuring a Priori Voting Power -- Taking Delegations Seriously ( http://arxiv.org/abs/2301.02462v1 )

ライセンス: Link先を確認

Rachael Colley, Théo Delemazure, Hugo Gilbert

(参考訳) 本稿では,代議員が重要な役割を担っている選挙における有権者の批判性,すなわち2種類の代議員投票設定と液状民主主義設定を計測する新たな権力指標を提案する。まず、我々のパワー指標は、従来の単純な投票ゲームにおけるpenrose-banzhafインデックスの自然な拡張であり、直観的な説明であると主張する。重み付き投票ゲームにおける再帰公式は擬似多項時間でこれらの指標を計算することができることを示す。最後に、理論的特性を強調し、代議員制の導入が有権者の投票力をどう変えるかを示す数値的な結果を提供する。

In this paper, we introduce new power indices to measure the criticality of voters involved in different elections where delegations play a key role, namely, two variants of the proxy voting setting and a liquid democracy setting. First, we argue that our power indices are natural extensions of the Penrose-Banzhaf index in classic simple voting games, illustrating their intuitions. We show that recursive formulas can compute these indices for weighted voting games in pseudo-polynomial time. Last, we highlight theoretical properties and provide numerical results to illustrate how introducing delegation options modifies the voting power of voters.

翻訳日:2023-01-10 00:19:47 公開日:2023-01-06
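
As background for the abstract above, the sketch below computes the classic Penrose-Banzhaf measure for an ordinary weighted voting game in pseudo-polynomial time, using a standard subset-sum counting recursion. It does not implement the paper's delegation-aware indices or its recursive formulas; the function name `penrose_banzhaf` and the example game are illustrative assumptions, and weights are assumed to be non-negative integers.

```python
def penrose_banzhaf(weights, quota):
    """Penrose-Banzhaf measure of each voter in the game [quota; weights].

    Voter i is critical for a coalition S of the other voters when
    w(S) < quota <= w(S) + w_i. The measure is the fraction of the
    2^(n-1) coalitions of the other voters for which i is critical.
    Assumes non-negative integer weights and quota >= 1.
    """
    if quota < 1:
        raise ValueError("quota must be a positive integer")
    n = len(weights)
    measures = []
    for i, w_i in enumerate(weights):
        others = weights[:i] + weights[i + 1:]
        total = sum(others)
        # counts[s] = number of coalitions of the other voters with total weight s
        counts = [0] * (total + 1)
        counts[0] = 1
        for w in others:
            # iterate downwards so each voter is used at most once
            for s in range(total, w - 1, -1):
                counts[s] += counts[s - w]
        # i swings exactly the coalitions with weight in [quota - w_i, quota - 1]
        low = max(quota - w_i, 0)
        high = min(quota - 1, total)
        swings = sum(counts[low:high + 1])
        measures.append(swings / 2 ** (n - 1))
    return measures
```

For example, in the hypothetical game `penrose_banzhaf([2, 1, 1], 3)` the first voter is critical in three of the four coalitions of the others, and each small voter in one of four. The running time is polynomial in the number of voters and the total weight, which is what "pseudo-polynomial" refers to.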

# 時系列確率的潮流に適用したロバストなデータ駆動プロセスモデリング

A Robust Data-driven Process Modeling Applied to Time-series Stochastic Power Flow ( http://arxiv.org/abs/2301.02651v1 )

ライセンス: Link先を確認

Pooja Algikar, Yijun Xu, Somayeh Yarahmadi, Lamine Mili

(参考訳) 本稿では,超パラメータをシュウェッペ型一般化最大確率推定器を用いてロバストに推定するロバストなデータ駆動プロセスモデルを提案する。提案モデルでは,電圧ファサーと電力噴射の時系列データを用いて時系列の確率的潮流計算を行う。電力系統のデータは、大きなエラー、故障状況、停電、極端な天候などによって、しばしば異常によって破損する。提案するモデルでは,トレーニングデータセットの測定において,垂直外れ値と悪レバレッジ点を削減できる。時系列データポイントのマハラノビス距離の頑健なバージョンであるプロジェクション統計を用いて、外れ値の影響を束縛するために用いられる重みを計算した。提案手法は,ieee 33 バス配電システムと,再生可能エネルギー源と高度に統合された実世界不均衡240 バス配電システムで実証された。シミュレーションの結果,提案するロバストモデルはトレーニングデータセットの異常値の最大25%を処理できることがわかった。

In this paper, we propose a robust data-driven process model whose hyperparameters are robustly estimated using the Schweppe-type generalized maximum likelihood estimator. The proposed model is trained on recorded time-series data of voltage phasors and power injections to perform a time-series stochastic power flow calculation. Power system data are often corrupted with outliers caused by large errors, fault conditions, power outages, and extreme weather, to name a few. The proposed model downweights vertical outliers and bad leverage points in the measurements of the training dataset. The weights used to bound the influence of the outliers are calculated using projection statistics, which are a robust version of Mahalanobis distances of the time series data points. The proposed method is demonstrated on the IEEE 33-Bus power distribution system and a real-world unbalanced 240-bus power distribution system heavily integrated with renewable energy sources. Our simulation results show that the proposed robust model can handle up to 25% of outliers in the training data set.

翻訳日:2023-01-10 00:19:36 公開日:2023-01-06
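
The projection statistics mentioned in the abstract above can be illustrated with a small pure-Python sketch. It follows the commonly cited definition (project the data onto directions through the coordinate-wise median, then standardize by the median and the scaled median absolute deviation); the weight rule `min(1, (cutoff / PS)^2)` and the `cutoff` value are assumptions chosen for illustration, not settings taken from the paper.

```python
import math
from statistics import median


def projection_statistics(points):
    """Return one projection statistic per point (a robust outlyingness score)."""
    dim = len(points[0])
    # coordinate-wise median as a robust center
    center = [median(p[k] for p in points) for k in range(dim)]
    stats = [0.0] * len(points)
    for p in points:
        direction = [p[k] - center[k] for k in range(dim)]
        norm = math.sqrt(sum(d * d for d in direction))
        if norm == 0.0:
            continue  # the point coincides with the center: no direction
        unit = [d / norm for d in direction]
        proj = [sum(x[k] * unit[k] for k in range(dim)) for x in points]
        med = median(proj)
        # 1.4826 makes the MAD consistent with the standard deviation for Gaussian data
        mad = 1.4826 * median(abs(z - med) for z in proj)
        if mad == 0.0:
            continue  # degenerate direction: more than half the projections coincide
        for i, z in enumerate(proj):
            stats[i] = max(stats[i], abs(z - med) / mad)
    return stats


def outlier_weights(stats, cutoff=2.0):
    """Downweight points whose projection statistic exceeds the (assumed) cutoff."""
    return [1.0 if s == 0.0 else min(1.0, (cutoff / s) ** 2) for s in stats]
```

With a tight cluster around the origin and a single far-away point, the far point receives a large projection statistic and a weight close to zero, while the cluster keeps weight one.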

# TrojanPuzzle: コード提案モデルを隠蔽する

TrojanPuzzle: Covertly Poisoning Code-Suggestion Models ( http://arxiv.org/abs/2301.02344v1 )

ライセンス: Link先を確認

Hojjat Aghakhani, Wei Dai, Andre Manoel, Xavier Fernandes, Anant Kharkar, Christopher Kruegel, Giovanni Vigna, David Evans, Ben Zorn, and Robert Sim

(参考訳) GitHub Copilotのようなツールでは、自動コード提案はもはやソフトウェアエンジニアリングの夢ではない。大規模な言語モデルに基づくこれらのツールは、通常、未調査の公開ソースから採掘された大量のコードコーパスで訓練される。その結果、これらのモデルは悪意のあるデータを注入してモデルのトレーニングや微調整フェーズを操作するデータ中毒攻撃に影響を受けやすい。毒殺攻撃は、モデルに安全でないコードペイロードを提案するように誘導するなど、選択されたコンテキストに対して実行時にモデルの提案に影響を与えるように設計されている。これを実現するために、従来の毒殺攻撃は、安全でないコードペイロードをトレーニングデータに明示的に注入するため、このような悪意のあるデータをトレーニングセットから削除できる静的解析ツールによって、毒殺データが検出可能になる。本研究では、docstringなどの文脈外領域に悪意のある中毒データを植え付けることで静的解析を回避できる2つの新しいデータ中毒攻撃、COVERT と TROJANPUZZLE を実証する。我々の最も斬新な攻撃であるTROJANPUZZLEは、中毒データにペイロードの特定の(疑わしい)部分を決して含めずに、コード補完時(つまりdocstringの外側)にペイロード全体を提案するモデルを誘導することで、より疑わしくない中毒データを生成する。これによってTROJANPUZZLEは、トレーニングデータから疑わしいシーケンスを識別およびフィルタリングするシグネチャベースのデータセット浄化手法に対して堅牢になる。2つのモデルサイズに対する評価は、COVERTとTROJANPUZZLEの両方が、コード提案モデルのトレーニングやチューニングに使用するコードを選択する方法に重要な意味を持つことを示している。

With tools like GitHub Copilot, automatic code suggestion is no longer a dream in software engineering. These tools, based on large language models, are typically trained on massive corpora of code mined from unvetted public sources. As a result, these models are susceptible to data poisoning attacks where an adversary manipulates the model's training or fine-tuning phases by injecting malicious data. Poisoning attacks could be designed to influence the model's suggestions at run time for chosen contexts, such as inducing the model into suggesting insecure code payloads. To achieve this, prior poisoning attacks explicitly inject the insecure code payload into the training data, making the poisoning data detectable by static analysis tools that can remove such malicious data from the training set. In this work, we demonstrate two novel data poisoning attacks, COVERT and TROJANPUZZLE, that can bypass static analysis by planting malicious poisoning data in out-of-context regions such as docstrings. Our most novel attack, TROJANPUZZLE, goes one step further in generating less suspicious poisoning data by never including certain (suspicious) parts of the payload in the poisoned data, while still inducing a model that suggests the entire payload when completing code (i.e., outside docstrings). This makes TROJANPUZZLE robust against signature-based dataset-cleansing methods that identify and filter out suspicious sequences from the training data. Our evaluation against two model sizes demonstrates that both COVERT and TROJANPUZZLE have significant implications for how practitioners should select code used to train or tune code-suggestion models.

翻訳日:2023-01-10 00:19:17 公開日:2023-01-06

# No-Regret Reduction による確率的リセットフリー強化学習

Provable Reset-free Reinforcement Learning by No-Regret Reduction ( http://arxiv.org/abs/2301.02389v1 )

ライセンス: Link先を確認

Hoai-An Nguyen, Ching-An Cheng

(参考訳) 実世界の強化学習(RL)は、典型的なRLアルゴリズムが適切な初期状態のサンプリングにリセット機構に強く依存するため、非常に制限されることが多い。実際には、人間の介入や高度なエンジニアリング環境を必要とするため、リセットメカニズムを実装するのに費用がかかる。学習をより実用的なものにするために,リセットフリーなrlアルゴリズムを体系的に設計する汎用的非リグレット削減を提案する。我々のリセットフリーのRLを2プレイヤーゲームに変える。この2つのプレイヤーゲームでsublinear regretを達成することは、オリジナルのrl問題におけるsublinear performance regretとsublinear total of resetsの両方を持つポリシーを学ぶことを意味する。これは、エージェントが最終的に最適な実行を学習し、リセットを避けることを意味する。この削減により、我々は線形マルコフ決定過程のインスタンス化を設計する。

Real-world reinforcement learning (RL) is often severely limited since typical RL algorithms heavily rely on the reset mechanism to sample proper initial states. In practice, the reset mechanism is expensive to implement due to the need for human intervention or heavily engineered environments. To make learning more practical, we propose a generic no-regret reduction to systematically design reset-free RL algorithms. Our reduction turns reset-free RL into a two-player game. We show that achieving sublinear regret in this two-player game would imply learning a policy that has both sublinear performance regret and a sublinear total number of resets in the original RL problem. This means that the agent eventually learns to perform optimally and avoid resets. By this reduction, we design an instantiation for linear Markov decision processes, which is, to our knowledge, the first provably correct reset-free RL algorithm.

翻訳日:2023-01-10 00:18:45 公開日:2023-01-06

# 統合ベイズネットワークによるMDD患者のパーソナライズされた脳機能結合の学習

Learning Personalized Brain Functional Connectivity of MDD Patients from Multiple Sites via Federated Bayesian Networks ( http://arxiv.org/abs/2301.02423v1 )

ライセンス: Link先を確認

Shuai Liu, Xiao Guo, Shun Qi, Huaning Wang and Xiangyu Chang

(参考訳) 主要うつ病性障害(mdd)患者の機能的結合バイオマーカーの同定は、障害機構の解明と早期介入に不可欠である。しかし, サンプルサイズが小さく, 利用可能な神経画像データの高次元化により, 既存手法の性能は制限されることが多い。多地点データでは統計的パワーとサンプルサイズが向上するが、サイト間の不均一性とデータ共有ポリシーがしばしば適用される。本稿では,複数のベイズネットワーク(BN)を連続的最適化で同時学習し,MDD患者の疾患誘発変化を同定するための連合型関節推定器NOTEARS-PFLを提案する。提案するフェデレーション学習フレームワークには,サイト間で共有される情報とサイト固有の情報を組み込んで,グループ融合ラッソペナルティを導入することで,パーソナライズされたBN構造を学習する。そこで我々は,局所的な更新ステップにおいて,各局所でニューロイメージングデータを処理した乗算器の交互方向法を開発した。そして、学習したネットワーク構造をセンターに送信し、グローバル更新を行う。特に,局所的な更新ステップのクローズドフォーム式を導出し,グループ融合ラッソペナルティを扱うために反復的近近投射法を用いる。合成および実世界のマルチサイトRS-fMRIデータセットにおける提案手法の性能評価を行った。その結果,提案したNOTEARS-PFLは同等の手法よりも有効性と精度が高いことがわかった。

Identifying functional connectivity biomarkers of major depressive disorder (MDD) patients is essential to advance the understanding of the disorder mechanisms and to enable early intervention. However, due to the small sample size and the high dimension of available neuroimaging data, the performance of existing methods is often limited. Multi-site data could enhance the statistical power and sample size, but they are often subject to inter-site heterogeneity and data-sharing policies. In this paper, we propose a federated joint estimator, NOTEARS-PFL, for simultaneous learning of multiple Bayesian networks (BNs) with continuous optimization, to identify disease-induced alterations in MDD patients. We incorporate information shared between sites and site-specific information into the proposed federated learning framework to learn personalized BN structures by introducing the group fused lasso penalty. We develop an alternating direction method of multipliers in which, in the local update step, the neuroimaging data is processed at each local site. The learned network structures are then transmitted to the center for the global update. In particular, we derive a closed-form expression for the local update step and use the iterative proximal projection method to deal with the group fused lasso penalty in the global update step. We evaluate the performance of the proposed method on both synthetic and real-world multi-site rs-fMRI datasets. The results suggest that the proposed NOTEARS-PFL is more effective and accurate than the comparable methods.

翻訳日:2023-01-10 00:18:28 公開日:2023-01-06

# 強レーザーパルス下における分子の多電子ダイナミクスの効率的なシミュレーション:適応有限要素法に基づく多重構成時間依存Hartree-Fock法の実装

Efficient simulation of multielectron dynamics in molecules under intense laser pulses: Implementation of the multiconfiguration time-dependent Hartree-Fock method based on the adaptive finite element method ( http://arxiv.org/abs/2301.02387v1 )

ライセンス: Link先を確認

Yuki Orimo, Takeshi Sato, Kenichi L. Ishikawa

(参考訳) 本稿では,高出力レーザーパルス下での分子の適応有限要素法に基づくマルチコンフィギュレーション時間依存hartree-fock法の実装について述べる。効率的なシミュレーションのために、軌道関数は短い反復アーノルドスキームを用いて安定なプロパゲータによって伝播され、分散メモリ計算のために並列化される。これは、水分子からの高調波発生をシミュレーションし、計算時間を極端に少なくした多電子ダイナミクスのシミュレーションを実現することで実証される。

We present an implementation of the multiconfiguration time-dependent Hartree-Fock method based on the adaptive finite element method for molecules under intense laser pulses. For efficient simulations, orbital functions are propagated by a stable propagator using the short iterative Arnoldi scheme, and our implementation is parallelized for distributed-memory computing. We demonstrate this by simulating high-harmonic generation from a water molecule, achieving a simulation of multielectron dynamics with overwhelmingly less computational time than in our previous work.

翻訳日:2023-01-10 00:15:56 公開日:2023-01-06

# オルガノイド画像解析プラットフォームに関する調査

A survey on Organoid Image Analysis Platforms ( http://arxiv.org/abs/2301.02341v1 )

ライセンス: Link先を確認

Alireza Ranjbaran and Azadeh Nazemi

(参考訳) 生体内細胞培養系は、特定の細胞型に関する生物学的発見や仮説駆動の研究に使われ、機械的または試験薬理学的薬物を理解する。従来のin-vitro培養法は2次元表面上に沈着した一次細胞や不死化細胞に応用されている。しかし、複雑な生理環境では信頼できず、生存中の行動を正確に予測することはできない。オルガノイド(Organoids)は、in vitro細胞培養系に置換された一次ドナーまたは幹細胞の多細胞スフェロイドであり、生物学的、生医学、翻訳研究で広く用いられている。臓器や疾患組織のネイティブな異質性、微細解剖、機能性は、オルガノイドのような3次元の生体内組織モデルで表すことができる。オルガノイドは、薬物発見とパーソナライズドドラッグスクリーニングのための生体内モデルに必須である。オルガノイドの閉塞、重なり、焦点外スフェロイドなどの多くの画像アーティファクトは、従来の画像処理では困難である。生物学におけるオルガノイドモデルの力にもかかわらず、その大きさと形状はほとんど考慮されていない。薬物応答は、個々のオルガノイドの形態、数、大きさの動的変化に依存するが、これはオルガノイドの形状や大きさの違い、焦点平面の移動、限られたオプションによる生細胞染色が薬物応答や成長分析に困難をもたらすことを意味する。本研究は, 様々な医学分野におけるオルガノイド培養システムの役割と, オルガノイドの利用範囲について紹介する。次に、オルガノイドの運用の課題を研究し、続いて、オルガノイド活用の課題に対処するために、オルガノイドに適用される画像分析システムやプラットフォームをレビューする。

An in-vitro cell culture system is used for biological discoveries and hypothesis-driven research on a particular cell type, to understand mechanisms or to test pharmaceutical drugs. Conventional in-vitro cultures have been applied to primary cells and immortalised cell lines plated on 2D surfaces. However, they are unreliable in complex physiological environments and cannot always predict in-vivo behaviour correctly. Organoids are multicellular spheroids of primary donor or stem cells that can replace conventional in-vitro cell culture systems and are widely used in biological, biomedical and translational studies. The native heterogeneity, microanatomy, and functionality of an organ or diseased tissue can be represented by three-dimensional in-vitro tissue models such as organoids. Organoids are essential in-vitro models for drug discovery and personalised drug screening. Many imaging artefacts, such as organoid occlusion, overlap, out-of-focus spheroids and considerable heterogeneity in size, cause difficulty in conventional image processing. Despite the power of organoid models for biology, their size and shape have mostly not been considered. Drug responses depend on dynamic changes in individual organoid morphology, number and size, which means that differences in organoid shape and size, movement through focal planes, and live-cell staining with limited options pose challenges for drug response and growth analysis. This study first introduces the importance of the organoid culture system in different disciplines of medical science and the various scopes of organoid use. It then studies the challenges of working with organoids and reviews the image analysis systems and platforms applied to organoids to address these challenges.

翻訳日:2023-01-10 00:10:27 公開日:2023-01-06

# Text2Poster: 検索した画像にスティル化されたテキストをレイアウトする

Text2Poster: Laying out Stylized Texts on Retrieved Images ( http://arxiv.org/abs/2301.02363v1 )

ライセンス: Link先を確認

Chuhao Jin, Hongteng Xu, Ruihua Song, Zhiwu Lu

(参考訳) ポスター生成は広範囲のアプリケーションにとって重要なタスクであり、しばしば時間がかかり、手作業による編集や芸術的な経験を必要とする。本稿では,テキスト情報から視覚的に有効なポスターを自動的に生成する,新しいデータ駆動フレームワークである Text2Poster を提案する。マニュアルポスター編集のプロセスを模倣したフレームワークでは,所定のテキストから背景画像を抽出し,逐次的な自動エンコーダによって画像上のテキストを反復的にレイアウトし,最後にマッチングベースの手法でテキストをスタイラライズする。我々は、ラベル付きデータの需要を軽減し、弱々しく自己監督的な学習戦略によってフレームワークのモジュールを学習する。客観的な実験と主観的な実験の両方で、Text2Posterは、学術研究や商用ソフトウェアを含む最先端の手法よりも、生成したポスターの品質に優れています。

Poster generation is a significant task for a wide range of applications, which is often time-consuming and requires lots of manual editing and artistic experience. In this paper, we propose a novel data-driven framework, called Text2Poster, to automatically generate visually effective posters from textual information. Imitating the process of manual poster editing, our framework leverages a large-scale pretrained visual-textual model to retrieve background images from given texts, lays out the texts on the images iteratively by cascaded auto-encoders, and finally, stylizes the texts by a matching-based method. We learn the modules of the framework by weakly- and self-supervised learning strategies, mitigating the demand for labeled data. Both objective and subjective experiments demonstrate that our Text2Poster outperforms state-of-the-art methods, including academic research and commercial software, on the quality of generated posters.

翻訳日:2023-01-10 00:10:00 公開日:2023-01-06

# コンタクト顕微鏡画像からの角膜パノラマ画像の生成

Generating corneal panoramic images from contact specular microscope images ( http://arxiv.org/abs/2301.02388v1 )

ライセンス: Link先を確認

Yusuke Nagira, Yuzuha Hara, Satoru Hiwa, Naoki Okumura, Noriko Koizumi and Tomoyuki Hiroyasu

(参考訳) 接触鏡顕微鏡は非接触鏡顕微鏡よりも広い視野角を有するが、角膜全体の像を捉えることはできない。このような画像を得るには、連続的に撮像された画像の一部にフィルムを作成し、それらを組み合わせて完全な画像を作成する必要がある。本研究では,コンタクトスペクトル顕微鏡を用いて撮影した映像から角膜全体を自動生成する枠組みを提案する。ビデオから比較的焦点を絞った映像を抽出し,パノラマ合成を行った。画像全体を生成することができる場合、画像からグッタを検出し、その存在範囲を調べることができる。本システムを実装し,提案手法の有効性を検討した。このシステムは、カスタムメイド複合ソフトウェア、画像合成ソフトウェア(ICS, K.I. Technology Co., Ltd., 内部アルゴリズムは公表されていない)を用いて実装され、U-Netを用いた教師付き学習モデルを用いた。フッフス内皮角膜ジストロフィー(FECD)マウスモデルから得られた94種類の角膜ビデオに構築システムを適用した際,いくつかの画像が正しく合成された。本研究におけるデータに対する手法の実装と適用により,その効果が確認された。実装による精度などの最小限の定量的評価により、将来の調査にはいくつかの制限が生じる可能性がある。

The contact specular microscope has a wider angle of view than the non-contact specular microscope but still cannot capture an image of the entire cornea. To obtain such an image, it is necessary to film parts of the cornea sequentially and combine the captured parts into a complete image. This study proposes a framework to automatically generate an entire corneal image from videos captured using a contact specular microscope. Relatively focused images were extracted from the videos and panoramic compositing was performed. If an entire image can be generated, it is possible to detect guttae from the image and examine the extent of their presence. The system was implemented and the effectiveness of the proposed framework was examined. The system was implemented using custom-made composite software, Image Composite Software (ICS, K.I. Technology Co., Ltd., Japan, internal algorithms not disclosed), and a supervised learning model using U-Net was used for guttae detection. Several images were correctly synthesized when the constructed system was applied to 94 different corneal videos obtained from a Fuchs endothelial corneal dystrophy (FECD) mouse model. The implementation and application of the method to the data in this study confirmed its effectiveness. Because only minimal quantitative evaluation, such as of accuracy, was performed, the study has limitations that remain for future investigations.

翻訳日:2023-01-10 00:09:41 公開日:2023-01-06

# 医用画像解析における深層学習モデル:Kvasirデータセットによる食道炎の検出

Deep-learning models in medical image analysis: Detection of esophagitis from the Kvasir Dataset ( http://arxiv.org/abs/2301.02390v1 )

ライセンス: Link先を確認

Kyoka Yoshiok, Kensuke Tanioka, Satoru Hiwa and Tomoyuki Hiroyasu

(参考訳) 食道炎の早期発見は,非治療で癌に進行する可能性があるため重要である。しかし,食道炎検出における深層学習モデルの精度は,まだ比較されていない。そこで本研究では,結膜型ニューラルネットワークモデル(googlenet,resnet-50,mobilenet v2,mobilenet v3)の内視鏡画像のオープンkvasirデータセットからの食道炎検出における精度を比較することを目的とした。その結果,GoogLeNetはF1スコアが最も高かった。 MobileNet V3は、真の陽性率の平均に基づいて、他のモデルよりも確実に食道炎を予測した。モデルを用いて得られた結果は、SHapley Additive exPlanations と Gradient-weighted Class Activation Mapping を用いた結果と比較した。

Early detection of esophagitis is important because this condition can progress to cancer if left untreated. However, the accuracies of different deep learning models in detecting esophagitis have yet to be compared. Thus, this study aimed to compare the accuracies of convolutional neural network models (GoogLeNet, ResNet-50, MobileNet V2, and MobileNet V3) in detecting esophagitis from the open Kvasir dataset of endoscopic images. Results showed that among the models, GoogLeNet achieved the highest F1-scores. Based on the average of true positive rate, MobileNet V3 predicted esophagitis more confidently than the other models. The results obtained using the models were also compared with those obtained using SHapley Additive exPlanations and Gradient-weighted Class Activation Mapping.

翻訳日:2023-01-10 00:09:21 公開日:2023-01-06
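
The two metrics named in the abstract above, the F1-score and the true positive rate, can be computed per class as in the following minimal sketch. The function name `class_scores` and the label strings are hypothetical and chosen only for illustration; no figures from the paper are reproduced.

```python
def class_scores(y_true, y_pred, positive):
    """Return (precision, true positive rate, F1) for one class treated as positive."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    # true positive rate is also called recall or sensitivity
    tpr = tp / (tp + fn) if tp + fn else 0.0
    # F1 is the harmonic mean of precision and recall
    f1 = 2 * precision * tpr / (precision + tpr) if precision + tpr else 0.0
    return precision, tpr, f1
```

Calling it once per class and averaging the results gives macro-averaged scores, which is one common way to summarize a multi-class dataset; whether the paper averages this way is not stated in the abstract.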

# グラフ畳み込みによる深部血管分割のためのクロスネットワークマルチスケール特徴融合

Graph Convolution Based Cross-Network Multi-Scale Feature Fusion for Deep Vessel Segmentation ( http://arxiv.org/abs/2301.02393v1 )

ライセンス: Link先を確認

Gangming Zhao, Kongming Liang, Chengwei Pan, Fandong Zhang, Xianpeng Wu, Xinyang Hu, and Yizhou Yu

(参考訳) 血管セグメンテーションは血管疾患の診断に広く用いられている。既存の方法で再建された容器はしばしば臨床使用基準を満たすほど正確ではない。これは3D血管構造が非常に複雑で、空間性や異方性など独特の特徴を持つためである。本稿では,血管分割のためのハイブリッド深層ニューラルネットワークを提案する。ネットワークは,それぞれ初期セグメンテーションと洗練されたセグメンテーションを行う2つのカスケードサブネットワークで構成されている。第2のサブネットワークはさらに、従来のCNNベースのU-NetとグラフU-Netの2つの密結合コンポーネントを備えている。これら2つのU字型ネットワーク間でクロスネットワークマルチスケール機能融合を行い、高品質な船体セグメンテーションを効果的に支援する。カスケードされたネットワーク全体を、エンドツーエンドでトレーニングすることができる。第2サブネットワークのグラフは、容器の確率マップと、元のCTボリュームの外観と意味的類似性に基づいて構築される。血管の疎水性と異方性に起因する課題に対処するため、グラフノードの比率は血管を含む可能性のある領域に分布し、エッジの比率は潜在的な近接血管の向きに従っている。我々のディープネットワークは、複数のパブリックおよび社内データセット上で最先端の3D船体セグメンテーション性能を達成する。

Vessel segmentation is widely used to help with vascular disease diagnosis. Vessels reconstructed using existing methods are often not sufficiently accurate to meet clinical use standards. This is because 3D vessel structures are highly complicated and exhibit unique characteristics, including sparsity and anisotropy. In this paper, we propose a novel hybrid deep neural network for vessel segmentation. Our network consists of two cascaded subnetworks performing initial and refined segmentation respectively. The second subnetwork further has two tightly coupled components, a traditional CNN-based U-Net and a graph U-Net. Cross-network multi-scale feature fusion is performed between these two U-shaped networks to effectively support high-quality vessel segmentation. The entire cascaded network can be trained from end to end. The graph in the second subnetwork is constructed according to a vessel probability map as well as appearance and semantic similarities in the original CT volume. To tackle the challenges caused by the sparsity and anisotropy of vessels, a higher percentage of graph nodes are distributed in areas that potentially contain vessels while a higher percentage of edges follow the orientation of potential nearby vessels. Extensive experiments demonstrate our deep network achieves state-of-the-art 3D vessel segmentation performance on multiple public and in-house datasets.

翻訳日:2023-01-10 00:09:05 公開日:2023-01-06

# 胸部X線画像分類のための深層学習 (Covid 19)

Deep Learning For Classification Of Chest X-Ray Images (Covid 19) ( http://arxiv.org/abs/2301.02468v1 )

ライセンス: Link先を確認

Benbakreti Samir, Said Mwanahija, Benbakreti Soumia, Umut Özkaya

(参考訳) 医療実践においては、情報技術の貢献は非常に大きい。これらのプラクティスのほとんどは、医療援助が人体の異なる病理を識別するために使用する画像を含んでいる。そのうちの1つはX線画像で、この論文の作業の多くをカバーしています。胸部X線はCovid 19の同定と診断において重要な役割を果たしている。新型コロナウイルスは、2019年12月に中国武漢で発生した最初の症例を受けて、2020年以来、世界的な感染拡大が宣言されている。このプロジェクトのゴールは、Covid 19のウイルス性肺炎、肺の透明度、正常な画像を含む胸部X線画像を分類できるようにすることです。 cnnアーキテクチャとさまざまな事前学習モデルを使用しました。最良の結果は94.1%の精度でresnet 18アーキテクチャを使用することで得られる。また、AlexNetの場合、GPUの実行時間は最適であるが、私たちが注意する必要があるのは、事前訓練されたモデルがCNNよりもはるかに早く収束することである。時間の節約は非常に大きい。これらの結果により、患者に対する診断時間が解決されるだけでなく、特にパンデミックの強い時期には、実践者にとって興味深いツールが提供される。

In medical practice, the contribution of information technology can be considerable. Most of these practices involve images that medical staff use to identify different pathologies of the human body. Among them are X-ray images, which are the focus of much of our work in this paper. Chest X-rays have played an important role in Covid 19 identification and diagnosis. Covid 19 has been declared a global pandemic since 2020, after the first case was found in Wuhan, China, in December 2019. Our goal in this project is to classify chest X-ray images into Covid 19, viral pneumonia, lung opacity, and normal images. We used a CNN architecture and different pre-trained models. The best result is obtained with the ResNet 18 architecture, with 94.1% accuracy. We also note that the GPU execution time is best in the case of AlexNet, but what requires our attention is that the pre-trained models converge much faster than the CNN. The time saving is considerable. These results not only help reduce diagnosis time for patients but also provide an interesting tool for practitioners, helping them in particular during periods of severe pandemic.

翻訳日:2023-01-10 00:08:48 公開日:2023-01-06

# WSIから得られた大腸癌のCADシステム:臨床検査による解釈可能なMLベースのプロトタイプ

A CAD System for Colorectal Cancer from WSI: A Clinically Validated Interpretable ML-based Prototype ( http://arxiv.org/abs/2301.02608v1 )

ライセンス: Link先を確認

Pedro C. Neto, Diana Montezuma, Sara P. Oliveira, Domingos Oliveira, João Fraga, Ana Monteiro, João Monteiro, Liliana Ribeiro, Sofia Gonçalves, Stefan Reinhard, Inti Zlobec, Isabel M. Pinto, Jaime S. Cardoso

(参考訳) 人工知能(AI)とデジタル病理学の統合は、ここ数年で増加している。近年,WSI画像から癌を診断するためのディープラーニング(DL)法の応用は,これまでになく,さまざまな研究グループにおいて現実となっている。しかし,これらのシステムの開発は,トレーニングサンプルの欠如,スケーリング困難,DL法の不透明さ,臨床検査の欠如など,無数の制約によって制限された。そこで本研究では,大腸癌検体診断に特化したシステムを提案する。 The construction of such a system consisted of four stages: (1) a careful data collection and annotation process, which resulted in one of the largest WSI colorectal samples datasets; (2) the design of an interpretable mixed-supervision scheme to leverage the domain knowledge introduced by pathologists through spatial annotations; (3) the development of an effective sampling approach based on the expected severeness of each tile, which decreased the computation cost by a factor of almost 6x; (4) the creation of a prototype that integrates the full set of features of the model to be evaluated in clinical practice. これらの段階において,提案手法は4つの異なるテストセットで評価され,そのうち2つは外部的で完全に独立である。最大のセットでは、提案されたアプローチは93.44%の精度を達成した。大腸サンプルのDLは、研究を排他的に中止し、臨床実践に完全に統合されるためのいくつかのステップである。

The integration of Artificial Intelligence (AI) and Digital Pathology has been increasing over the past years. Nowadays, applications of deep learning (DL) methods to diagnose cancer from whole-slide images (WSI) are, more than ever, a reality within different research groups. Nonetheless, the development of these systems has been limited by a myriad of constraints: the lack of training samples, scaling difficulties, the opaqueness of DL methods, and, more importantly, the lack of clinical validation. As such, we propose a system designed specifically for the diagnosis of colorectal samples. The construction of such a system consisted of four stages: (1) a careful data collection and annotation process, which resulted in one of the largest WSI colorectal samples datasets; (2) the design of an interpretable mixed-supervision scheme to leverage the domain knowledge introduced by pathologists through spatial annotations; (3) the development of an effective sampling approach based on the expected severeness of each tile, which decreased the computation cost by a factor of almost 6x; (4) the creation of a prototype that integrates the full set of features of the model to be evaluated in clinical practice. During these stages, the proposed method was evaluated on four separate test sets, two of which are external and completely independent. On the largest of those sets, the proposed approach achieved an accuracy of 93.44%. DL for colorectal samples is a few steps closer to no longer being research-exclusive and to becoming fully integrated in clinical practice.

翻訳日:2023-01-10 00:08:07 公開日:2023-01-06

# ZX計算からのグラフィック量子クリフォードエンコーダコンパイラ

Graphical quantum Clifford-encoder compilers from the ZX calculus ( http://arxiv.org/abs/2301.02356v1 )

ライセンス: Link先を確認

Andrey Boris Khesin, Jonathan Z. Lu, and Peter W. Shor

(参考訳) 本稿では、量子誤り訂正において普遍的に発生する量子回路の等価クラスであるクリフォードエンコーダをZX計算の表現にマッピングする量子コンパイルアルゴリズムを提案する。特に、zx計算において正準形式を開発し、任意のクリフォードエンコーダの正準形式への効率的な還元性を証明する。コンパイラが生成したダイアグラムは,エンコーダの情報伝搬と絡み合い構造を明確に可視化し,回路や安定化器の表象に隠蔽される特性を明らかにする。

We present a quantum compilation algorithm that maps Clifford encoders, an equivalence class of quantum circuits that arise universally in quantum error correction, into a representation in the ZX calculus. In particular, we develop a canonical form in the ZX calculus and prove canonicity as well as efficient reducibility of any Clifford encoder into the canonical form. The diagrams produced by our compiler explicitly visualize information propagation and entanglement structure of the encoder, revealing properties that may be obscured in the circuit or stabilizer-tableau representation.

翻訳日:2023-01-09 23:59:50 公開日:2023-01-06

# 量子カオス指標としての時間外相関子の相対的漸近振動

Relative asymptotic oscillations of the out-of-time-ordered correlator as a quantum chaos indicator ( http://arxiv.org/abs/2301.02456v1 )

ライセンス: Link先を確認

Jakub Novotný, Pavel Stránský

(参考訳) 詳細な数値研究により、時間外整列コリレータの標準偏差-平均比の漸近値がシステムの量子カオス性の尺度として有効であることが判明した。自由度が2つの有限サイズの完全連結量子系、すなわち代数的u(3)モデルを採用し、相関子の相対振動と、系の古典的極限における位相空間体積のカオス的部分の比との明確な対応を示す。また、相対振動がシステムサイズとどのようにスケールするかを示し、スケーリング指数が堅牢なカオス指標としても機能することを示す。

A detailed numerical study reveals that the asymptotic values of the standard deviation-to-mean ratio of the out-of-time-ordered correlator can be successfully used as a measure of the quantum chaoticity of the system. We employ a finite-size fully connected quantum system with two degrees of freedom, namely the algebraic u(3) model, and demonstrate a clear correspondence between the relative oscillations of the correlators and the ratio of the chaotic part of the volume of phase space in the classical limit of the system. We also show how the relative oscillations scale with the system size and conjecture that the scaling exponent can also serve as a robust chaos indicator.

翻訳日:2023-01-09 23:59:40 公開日:2023-01-06
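
The indicator described in the abstract above is a standard deviation-to-mean ratio taken over the long-time behaviour of the correlator. A minimal sketch of that ratio for a sampled signal is given below. The function name, the `tail_fraction` parameter, and the choice of the last half of the samples as the "asymptotic" window are assumptions made for illustration; the paper's actual averaging procedure is not given in the abstract.

```python
import statistics


def relative_oscillation(otoc_values, tail_fraction=0.5):
    """Standard deviation-to-mean ratio over the late-time tail of a sampled signal.

    otoc_values   -- correlator values sampled at increasing times (hypothetical input)
    tail_fraction -- fraction of the samples, counted from the end, treated as
                     the asymptotic regime (an assumption of this sketch)
    """
    if not 0 < tail_fraction <= 1:
        raise ValueError("tail_fraction must lie in (0, 1]")
    start = int(len(otoc_values) * (1 - tail_fraction))
    tail = otoc_values[start:]
    if len(tail) < 2:
        raise ValueError("need at least two samples in the tail")
    mean = statistics.fmean(tail)
    if mean == 0:
        raise ValueError("mean of the tail is zero; ratio is undefined")
    # Population standard deviation of the tail divided by its mean.
    return statistics.pstdev(tail) / mean
```

A constant tail gives a ratio of zero, and larger late-time fluctuations relative to the mean give a larger ratio.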

# 対角的非侵襲性の侵害:量子記憶効果の要点

Violation of Diagonal Non-Invasiveness: A Hallmark of Quantum Memory Effects ( http://arxiv.org/abs/2301.02500v1 )

ライセンス: Link先を確認

Adrián A. Budini

(参考訳) 慣性的な計測の浸透性と記憶効果の存在をつなぐ操作的(測定に基づく)スキームを定義する。その基礎となる理論的な基礎は、対応する可観測性が系密度行列と同じ基底で対角的であるとき(メモリレス)マルコフ力学の非侵襲的可測性に依存する。対照的に、(操作的に定義された)量子メモリ効果は、常に対角非侵襲性に違反する。非マルコフ記憶効果によるLeggett-Garg不等式違反の関連条件も確立した。

An operational (measurement-based) scheme that univocally connects measurement invasiveness with the presence of memory effects is defined. Its underlying theoretical basis relies on the non-invasive measurability of (memoryless) Markovian dynamics when the corresponding observable is diagonal in the same basis as the system density matrix. In contrast, (operationally defined) quantum memory effects always lead to violation of diagonal non-invasiveness. Related conditions for violation of the Leggett-Garg inequality due to non-Markovian memory effects are also established.

翻訳日:2023-01-09 23:59:29 公開日:2023-01-06

# 量子多重アクセスチャネルにおける単一粒子による情報伝達

Information Carried by a Single Particle in Quantum Multiple-Access Channels ( http://arxiv.org/abs/2301.02513v1 )

ライセンス: Link先を確認

Xinan Chen, Yujie Zhang, Andreas Winter, Virginia O. Lorenz, Eric Chitambar

(参考訳) 量子システムの非古典的特徴は、現在情報交換の方法を強化する可能性を秘めている。本稿では,この強化を単一粒子の最も基本的なレベルについて検討する。より正確には、1つの古典的粒子または量子的粒子を用いて、複数のパーティ情報が単一の受信機にどれだけうまく伝達できるかを比較する。提案手法は、複数の空間モードにまたがってコヒーレントに分散された単一の粒子にメッセージがエンコードできるマルチアクセス通信モデルに基づいている。理論的には、古典的シナリオから厳密に分離する量子設定におけるアクセス可能な情報の下限を導出する。この分離は、複数の送信者が存在する場合や、受信者と共有フェーズ参照を持つ単一の送信者が存在する場合にも発生する。実験では、異なる軌道に沿ってエンコードされるメッセージを含むマルチポート干渉計を実装し、単一粒子通信においてこのような量子的な利点を示す。具体的には、3ポート光干渉計で構築した2送信者通信プロトコルについて検討する。このシナリオでは、古典粒子で達成可能なレート和は1ビットで上限され、量子セットアップで$1.0152\pm0.0034$ビットのレート和を実験的に観測する。

Non-classical features of quantum systems have the potential to strengthen the way we currently exchange information. In this paper, we explore this enhancement on the most basic level of single particles. To be more precise, we compare how well multi-party information can be transmitted to a single receiver using just one classical or quantum particle. Our approach is based on a multiple-access communication model in which messages can be encoded into a single particle that is coherently distributed across multiple spatial modes. Theoretically, we derive lower bounds on the accessible information in the quantum setting that strictly separate it from the classical scenario. This separation is found whenever there is more than one sender, and also when there is just a single sender who has a shared phase reference with the receiver. Experimentally, we demonstrate such quantum advantage in single-particle communication by implementing a multi-port interferometer with messages being encoded along the different trajectories. Specifically, we consider a two-sender communication protocol built by a three-port optical interferometer. In this scenario, the rate sum achievable with a classical particle is upper bounded by one bit, while we experimentally observe a rate sum of $1.0152\pm0.0034$ bits in the quantum setup.

翻訳日:2023-01-09 23:59:20 公開日:2023-01-06
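
As a rough reading of the numbers reported in the abstract above, the size of the observed excess over the classical bound can be worked out as follows. This assumes that the quoted $\pm 0.0034$ is one standard deviation, which the abstract does not state, and the symbols $R_1$ and $R_2$ for the two senders' rates are introduced here only for illustration.

```latex
\begin{align*}
R_1 + R_2 &\le 1 \ \text{bit} && \text{(classical particle)} \\
R_1 + R_2 &= 1.0152 \pm 0.0034 \ \text{bits} && \text{(observed, quantum setup)} \\
\frac{1.0152 - 1}{0.0034} &= \frac{0.0152}{0.0034} \approx 4.5 && \text{(excess in units of the quoted uncertainty)}
\end{align*}
```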

# 二重スリット実験は量子解釈を区別できるのか?

Can the double-slit experiment distinguish between quantum interpretations? ( http://arxiv.org/abs/2301.02641v1 )

ライセンス: Link先を確認

Ali Ayatollah Rafsanjani, MohammadJavad Kazemi, Alireza Bahrampour, and Mehdi Golshani

(参考訳) 量子力学の驚くべき成功にもかかわらず、測定問題や量子到着時間問題といった基本的な問題により、理論の予測は明確で独特な場合もある。特に、スクリーン上の粒子検出事象の同時時空間分布に関する様々な予測があり、これは量子論の異なる定式化と解釈から導かれる。この差は典型的には小さいが,本研究では,従来の2重スリット構成により,これらの予測を実験的に区別できることが示唆された。

Despite the astonishing successes of quantum mechanics, due to some fundamental problems such as the measurement problem and quantum arrival time problem, the predictions of the theory are in some cases not quite clear and unique. Especially, there are various predictions for the joint spatiotemporal distribution of particle detection events on a screen, which are derived from different formulations and interpretations of the quantum theory. Although the differences are typically small, our studies show that these predictions can be experimentally distinguished by an unconventional double-slit configuration, which is realizable using present-day single-atom interferometry.

翻訳日:2023-01-09 23:59:01 公開日:2023-01-06

# ReVoLT: ターゲット駆動ナビゲーションのための関係推論とボロノイ局所グラフ計画

ReVoLT: Relational Reasoning and Voronoi Local Graph Planning for Target-driven Navigation ( http://arxiv.org/abs/2301.02382v1 )

ライセンス: Link先を確認

Junjia Liu, Jianfei Guo, Zehui Meng, Jingtao Xue

(参考訳) Embodied AIは、インテリジェントなエンティティと現実世界の相互作用を強調する必然的なトレンドであり、ロボティクス、特にターゲット駆動ナビゲーションに広く応用されている。このタスクは、未知の家庭環境において、特定のカテゴリーのオブジェクトを効率的に見つけることを必要とする。最近の研究は、グラフニューラルネットワーク(GNN)によるレイアウト関係の活用に焦点を当てている。しかし、ほとんどのロボットは、不完全な関係グラフを通して、エンドツーエンドで観察から直接ロボットの動作を得るが、これは解釈可能で信頼性に欠ける。このタスクを分離し、階層的なフレームワークであるReVoLTを提案する。 (a)物体検出用視覚フロントエンド (b)高水準推論者(意味サブゴールを推定する) (c)中間レベルプランナー(幾何学的位置を計算)、及び (d)低レベルコントローラ(アクションの実行)。 ReVoLTは多層意味空間トポロジグラフで動作する。推論器は、教師なしグラフsage、gcn、およびgraphrnnベースの領域ロールアウトからなる組合せ関係抽出ネットワークから得られる、事前としてマルチフォーム構造化関係を用いる。セマンティックなサブゴールを推論し、エクスプロイト(深み優先探索)と探索(参照)のトレードオフを考慮し、アッパー信頼境界木(UCT)で実行します。軽量中間レベルプランナーは、オンライン構築されたボロノイ局所グラフを介して、瞬時空間的な部分ゴール位置を生成する。シミュレーション実験により,本フレームワークは目標駆動型ナビゲーションタスクの性能向上と,既存の最先端手法と比較して80%向上した一般化を実現していることが示された。コードと結果のビデオはhttps://ventusff.github.io/ReVoLT-website/で公開される。

Embodied AI is an inevitable trend that emphasizes the interaction between intelligent entities and the real world, with broad applications in robotics, especially target-driven navigation. This task requires the robot to find an object of a certain category efficiently in an unknown domestic environment. Recent works focus on exploiting layout relationships with graph neural networks (GNNs). However, most of them obtain robot actions directly from observations in an end-to-end manner via an incomplete relation graph, which is neither interpretable nor reliable. We decouple this task and propose ReVoLT, a hierarchical framework: (a) an object detection visual front-end, (b) a high-level reasoner (infers semantic sub-goals), (c) an intermediate-level planner (computes geometrical positions), and (d) a low-level controller (executes actions). ReVoLT operates with a multi-layer semantic-spatial topological graph. The reasoner uses multiform structured relations as priors, which are obtained from combinatorial relation extraction networks composed of unsupervised GraphSAGE, GCN, and GraphRNN-based Region Rollout. The reasoner uses Upper Confidence Bound for Tree (UCT) to infer semantic sub-goals, accounting for trade-offs between exploitation (depth-first searching) and exploration (regretting). The lightweight intermediate-level planner generates instantaneous spatial sub-goal locations via an online constructed Voronoi local graph. The simulation experiments demonstrate that our framework achieves better performance in target-driven navigation tasks and generalizes well, with an 80% improvement compared to the existing state-of-the-art method. The code and result video will be released at https://ventusff.github.io/ReVoLT-website/.

翻訳日:2023-01-09 23:58:51 公開日:2023-01-06
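
The abstract above says the reasoner trades off exploitation and exploration with UCT. A minimal sketch of that kind of selection rule is given below, using the textbook UCB1 score (mean value plus an exploration bonus). This is not the ReVoLT implementation: the function names, the sub-goal labels, the statistics format, and the constant `c` are all assumptions for illustration.

```python
import math


def uct_score(total_value, visits, parent_visits, c=math.sqrt(2)):
    """Textbook UCB1 score: average value plus an exploration bonus."""
    if visits == 0:
        # Unvisited candidates are tried first.
        return math.inf
    exploit = total_value / visits
    explore = c * math.sqrt(math.log(parent_visits) / visits)
    return exploit + explore


def select_subgoal(children, c=math.sqrt(2)):
    """Pick the candidate sub-goal with the highest score.

    children -- dict mapping a (hypothetical) sub-goal name to a
                (total_value, visits) pair accumulated so far
    """
    parent_visits = sum(visits for _, visits in children.values())
    return max(
        children,
        key=lambda name: uct_score(*children[name], parent_visits, c),
    )
```

With `c=0` the rule is purely greedy; a larger `c` favours rarely visited sub-goals, for example `select_subgoal({"kitchen": (9.0, 10), "bedroom": (0.5, 1)})` prefers the less-visited candidate under the default constant.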

# 状態情報と意図情報を用いた交差点における多車軌道予測

Multi-Vehicle Trajectory Prediction at Intersections using State and Intention Information ( http://arxiv.org/abs/2301.02561v1 )

ライセンス: Link先を確認

Dekai Zhu, Qadeer Khan, Daniel Cremers

(参考訳) 道路員の将来の軌跡予測への伝統的なアプローチは、過去の軌跡を知ることに依存している。この研究はむしろ、交差点で複数の車両の予測を行うための現在の状態と意図した方向の知識のみに依存している。さらに、車両間のこれらの情報のメッセージパッシングは、それぞれにより総合的な環境概要を提供し、より情報的な予測を可能にする。これは、複数の車両の状態と意図を使って将来の軌道を予測するニューラルネットワークのトレーニングによって行われる。インプットとして意図を使用することで、複数の車両が望ましい経路に向かって走行できるように、アプローチを拡張できます。実験により,交差点における軌道予測と車両制御の両面でのアプローチの堅牢性を示す。この作業のための完全なトレーニングと評価コードは、ここで入手できる。

Traditional approaches to prediction of future trajectory of road agents rely on knowing information about their past trajectory. This work rather relies only on having knowledge of the current state and intended direction to make predictions for multiple vehicles at intersections. Furthermore, message passing of this information between the vehicles provides each one of them a more holistic overview of the environment allowing for a more informed prediction. This is done by training a neural network which takes the state and intent of the multiple vehicles to predict their future trajectory. Using the intention as an input allows our approach to be extended to additionally control the multiple vehicles to drive towards desired paths. Experimental results demonstrate the robustness of our approach both in terms of trajectory prediction and vehicle control at intersections. The complete training and evaluation code for this work is available here: https://github.com/Dekai21/Multi_Agent_Intersection.

翻訳日:2023-01-09 23:58:26 公開日:2023-01-06

# 多視点バイナリクラスタリングのためのグラフコラボレーテッドオートエンコーダハッシュ

Graph-Collaborated Auto-Encoder Hashing for Multi-view Binary Clustering ( http://arxiv.org/abs/2301.02484v1 )

ライセンス: Link先を確認

Huibing Wang, Mingze Yao, Guangqi Jiang, Zetian Mi, Xianping Fu

(参考訳) 教師なしハッシュ法は大規模データの爆発的成長に広く関心を集めており、コンパクトなバイナリコードを学習することでストレージと計算を大幅に削減することができる。既存の教師なしハッシュ手法では、サンプルからの貴重な情報を活用しようとするが、ラベルなしサンプルの局所幾何構造を考慮していない。さらに、オートエンコーダに基づくハッシュは、複数のソースデータの潜在的な一貫性と相補性を無視した入力データとバイナリコードの間の再構成損失を最小限にすることを目的としている。本稿では,マルチビューバイナリクラスタリングのための自動エンコーダに基づくハッシュアルゴリズムを提案する。これは低ランク制約付きアフィニティグラフを動的に学習し,マルチビューバイナリクラスタリングのためのグラフ共用オートエンコーダハッシュ(gcae)と呼ばれる,統合バイナリコードを学習するためにオートエンコーダとアフィニティグラフの協調学習を採用する。具体的には,低ランク制約を用いた多視点親和性グラフ学習モデルを提案する。次に、複数の親和性グラフを協調して統一バイナリコードを効果的に学習するエンコーダ・デコーダパラダイムを設計する。特に、量子化エラーを低減するためにバイナリコードに非相関化とコードバランスの制約を課す。最後に,複数ビュークラスタリング結果を得るために交互反復最適化方式を用いる。 5つの公開データセットに関する広範な実験結果は、アルゴリズムの有効性と、他の最先端の代替品よりも優れた性能を明らかにするために提供される。

Unsupervised hashing methods have attracted widespread attention with the explosive growth of large-scale data, since they can greatly reduce storage and computation by learning compact binary codes. Existing unsupervised hashing methods attempt to exploit the valuable information from samples, but fail to take the local geometric structure of unlabeled samples into consideration. Moreover, hashing based on auto-encoders aims to minimize the reconstruction loss between the input data and binary codes, which ignores the potential consistency and complementarity of multi-source data. To address the above issues, we propose a hashing algorithm based on auto-encoders for multi-view binary clustering, which dynamically learns affinity graphs with low-rank constraints and adopts collaborative learning between auto-encoders and affinity graphs to learn a unified binary code, called Graph-Collaborated Auto-Encoder Hashing for Multi-view Binary Clustering (GCAE). Specifically, we propose a multi-view affinity graph learning model with a low-rank constraint, which can mine the underlying geometric information from multi-view data. Then, we design an encoder-decoder paradigm to collaborate the multiple affinity graphs, which can learn a unified binary code effectively. Notably, we impose decorrelation and code balance constraints on the binary codes to reduce the quantization errors. Finally, we utilize an alternating iterative optimization scheme to obtain the multi-view clustering results. Extensive experimental results on 5 public datasets are provided to reveal the effectiveness of the algorithm and its superior performance over other state-of-the-art alternatives.

翻訳日:2023-01-09 23:52:18 公開日:2023-01-06
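
The abstract above mentions decorrelation and code balance constraints on the binary codes. The sketch below shows the forms these two constraints commonly take in the hashing literature, written as penalties on codes in {-1, +1}. It is an illustration under that assumption, not the GCAE objective itself: the function names and the exact penalty definitions are hypothetical.

```python
def binarize(features):
    """Map real-valued features to codes in {-1, +1} with the sign function."""
    return [[1 if x >= 0 else -1 for x in row] for row in features]


def balance_penalty(codes):
    """Sum over bits of the squared bit mean.

    Zero when every bit is +1 for exactly half of the samples.
    """
    n = len(codes)
    bits = len(codes[0])
    return sum((sum(row[j] for row in codes) / n) ** 2 for j in range(bits))


def decorrelation_penalty(codes):
    """Sum of squared off-diagonal entries of (1/n) * B^T B.

    Zero when every pair of distinct bits is uncorrelated over the samples.
    """
    n = len(codes)
    bits = len(codes[0])
    penalty = 0.0
    for j in range(bits):
        for k in range(j + 1, bits):
            corr = sum(row[j] * row[k] for row in codes) / n
            penalty += 2 * corr ** 2  # the matrix is symmetric: count (j, k) and (k, j)
    return penalty
```

Four samples carrying all four 2-bit patterns are both balanced and decorrelated, whereas codes whose two bits always agree stay balanced but incur the full decorrelation penalty.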

# Vote2Cap-DETRを用いたエンド・ツー・エンド3次元Dense Captioning

End-to-End 3D Dense Captioning with Vote2Cap-DETR ( http://arxiv.org/abs/2301.02508v1 )

ライセンス: Link先を確認

Sijin Chen, Hongyuan Zhu, Xin Chen, Yinjie Lei, Tao Chen, Gang YU

(参考訳) 3D高密度キャプションは、関連する対象領域にローカライズされた複数のキャプションを生成することを目的としている。既存のメソッドは、多数の手作りのコンポーネントを備えた洗練された「detect-then-describe」パイプラインに従っている。しかし、これらの手作りのコンポーネントは、異なるシーン間のオブジェクト空間とクラス分布が散らばった場合、最適以下のパフォーマンスをもたらす。本稿では,最近普及している DEtection TRansformer (DETR) に基づく,シンプルかつ効果的なトランスフォーマフレームワークである Vote2Cap-DETR を提案する。先行技術と比較すると、我々の枠組みにはいくつかの魅力があります。 1) 多数の手作り部品に頼らず,本手法は,学習可能な投票クエリ駆動オブジェクトデコーダを備えたフルトランスフォーマー・エンコーダ・デコーダアーキテクチャと,集合予測方式で高密度キャプションを生成するキャプションデコーダをベースとしている。 2) 2段階方式とは対照的に, 検出とキャプションを1段階で行うことができる。 3) 特別な工夫なしで、2つの一般的なデータセットであるScanReferとNr3Dの広範な実験により、Vote2Cap-DETRがCIDEr@0.5IoUで現在の最先端手法をそれぞれ11.13%と7.11%上回ることが実証された。コードはまもなくリリースされる予定だ。

3D dense captioning aims to generate multiple captions localized with their associated object regions. Existing methods follow a sophisticated "detect-then-describe" pipeline equipped with numerous hand-crafted components. However, these hand-crafted components would yield suboptimal performance given cluttered object spatial and class distributions among different scenes. In this paper, we propose a simple-yet-effective transformer framework Vote2Cap-DETR based on the recently popular DEtection TRansformer (DETR). Compared with prior art, our framework has several appealing advantages: 1) Without resorting to numerous hand-crafted components, our method is based on a full transformer encoder-decoder architecture with a learnable vote query driven object decoder, and a caption decoder that produces the dense captions in a set-prediction manner. 2) In contrast to the two-stage scheme, our method can perform detection and captioning in one stage. 3) Without bells and whistles, extensive experiments on two commonly used datasets, ScanRefer and Nr3D, demonstrate that our Vote2Cap-DETR surpasses current state-of-the-art methods by 11.13% and 7.11% in CIDEr@0.5IoU, respectively. Codes will be released soon.

翻訳日:2023-01-09 23:51:48 公開日:2023-01-06

# スタイル転送を用いた絵画分類におけるデータバイアス対策

Tackling Data Bias in Painting Classification with Style Transfer ( http://arxiv.org/abs/2301.02524v1 )

ライセンス: Link先を確認

Mridula Vijendran, Frederick W. B. Li, Hubert P. H. Shum

(参考訳) ドメインギャップによるモデルバイアスと,芸術様式の不均一な分布によるデータバイアスにより,絵画コレクション上の分類器の訓練は困難である。データ蒸留、伝統的なデータ拡張、スタイル転送といった以前の技術は、タスク固有のトレーニングデータセットやドメイン適応を使用して分類子トレーニングを改善する。本研究では,カオコレデータセットのような小さな絵画データセットにおけるデータバイアスを扱うとともに,実世界画像にトレーニングされたモデルを微調整する際に,ドメイン適応を同時に計算するシステムを提案する。本システムは,スタイル転送と分類の2段階からなる。スタイル転送ステージでは、一律にサンプリングされたコンテンツとスタイルイメージをクラスごとにスタイリッシュなトレーニングサンプルを生成し、各ドメインごとにスタイル変換ネットワークをトレーニングします。分類段階では、オリジナルトレーニングデータセットとスタイライゼーション画像のトレーニングにおいて、注意層におけるスタイル層とコンテンツ層の有効性を解釈することができる。多数派と少数派における増分サンプルの割合を動的に変化させることで、モデル性能と収束性をトレードオフすることができる。訓練期間の短縮と,訓練パラメータの少ない分類器を用いて,somaと同等の結果を得る。

It is difficult to train classifiers on painting collections due to model bias from domain gaps and data bias from the uneven distribution of artistic styles. Previous techniques like data distillation, traditional data augmentation and style transfer improve classifier training using task-specific training datasets or domain adaptation. We propose a system to handle data bias in small painting datasets like the Kaokore dataset while simultaneously accounting for domain adaptation in fine-tuning a model trained on real-world images. Our system consists of two stages: style transfer and classification. In the style transfer stage, we generate the stylized training samples per class with uniformly sampled content and style images and train the style transformation network per domain. In the classification stage, we can interpret the effectiveness of the style and content layers at the attention layers when training on the original training dataset and the stylized images. We can trade off model performance and convergence by dynamically varying the proportion of augmented samples in the majority and minority classes. We achieve comparable results to the SOTA with fewer training epochs and a classifier with fewer training parameters.

翻訳日:2023-01-09 23:51:26 公開日:2023-01-06

# 3次元物体検出のためのモデル非依存階層的注意

Model-Agnostic Hierarchical Attention for 3D Object Detection ( http://arxiv.org/abs/2301.02650v1 )

ライセンス: Link先を確認

Manli Shu, Le Xue, Ning Yu, Roberto Martín-Martín, Juan Carlos Niebles, Caiming Xiong, Ran Xu

(参考訳) 汎用ネットワークアーキテクチャとしてのトランスフォーマーは最近、3dポイントクラウドオブジェクト検出で大きな成功を収めている。しかし, 通常の変圧器では階層構造が欠如しているため, 異なるスケールで特徴を学習することは困難であり, 局所的特徴を抽出する能力を抑制する。このような制限により、異なるサイズのオブジェクトでは性能が不均衡になり、小さいオブジェクトでは性能が劣る。本研究では,トランスを用いた3D検出器のモジュール化階層設計として,新しい2つの注意機構を提案する。異なるスケールで機能学習を可能にするために,単一スケールの入力機能から複数スケールのトークンを構築するシンプルなマルチスケールアテンションを提案する。局所化特徴集約のために,各境界ボックスの提案に対して適応的注意範囲を持つサイズ適応局所注意を提案する。この2つのアテンションモジュールはモデルに依存しないネットワーク層で、エンドツーエンドトレーニングのために既存のポイントクラウドトランスフォーマーにプラグインすることができます。提案手法を室内3次元点状物体検出ベンチマークで評価した。提案するモジュールを最先端のトランスフォーマーベースの3d検出器に差し込むことで,従来の2つのベンチマークの最良の結果を改善し,小型オブジェクトに対する改善マージンを最大にする。

Transformers as versatile network architectures have recently seen great success in 3D point cloud object detection. However, the lack of hierarchy in a plain transformer makes it difficult to learn features at different scales and restrains its ability to extract localized features. This limitation leads to imbalanced performance on objects of different sizes, with inferior performance on smaller ones. In this work, we propose two novel attention mechanisms as modularized hierarchical designs for transformer-based 3D detectors. To enable feature learning at different scales, we propose Simple Multi-Scale Attention that builds multi-scale tokens from a single-scale input feature. For localized feature aggregation, we propose Size-Adaptive Local Attention with adaptive attention ranges for every bounding box proposal. Both of our attention modules are model-agnostic network layers that can be plugged into existing point cloud transformers for end-to-end training. We evaluate our method on two widely used indoor 3D point cloud object detection benchmarks. By plugging our proposed modules into the state-of-the-art transformer-based 3D detector, we improve the previous best results on both benchmarks, with the largest improvement margin on small objects.

翻訳日:2023-01-09 23:50:51 公開日:2023-01-06

# 連続制御タスクのための集中型協調探索政策

Centralized Cooperative Exploration Policy for Continuous Control Tasks ( http://arxiv.org/abs/2301.02375v1 )

ライセンス: Link先を確認

Chao Li, Chen Gong, Qiang He, Xinwen Hou and Yu Liu

(参考訳) 深層強化学習(drl)アルゴリズムは、様々な複雑な制御タスクを巧みに解決する。この現象的な成功は、DRLが知的エージェントに環境を十分に探索し、エージェントトレーニングプロセス中に多様な経験を収集するよう促すことによるものである。したがって、探査はdrlの最適ポリシーにアクセスする上で重要な役割を果たす。近年の継続的制御タスクの進歩にもかかわらず、これらのタスクの探索は不十分なままである。連続制御タスクにおける探索を明示的に奨励するために,価値関数の過小評価と過大評価を利用して探索能力を維持するCCEP(Centralized Cooperative Exploration Policy)を提案する。 CCEPはまず、異なるパラメータで初期化された2つの値関数を保持し、値関数のペアから複数の探索スタイルで多様なポリシーを生成する。さらに、集中型ポリシフレームワークは、CCEPが複数のポリシ間のメッセージ配信を実現し、さらに環境の協調的な探索に寄与することを保証する。大規模な実験の結果、CCEPは高い探査能力を発揮することが示された。実証分析では、CCEPによる学習政策における多様な探索スタイルが示され、より多くの探検地域での利益が得られている。そしてこのccepの探索能力は、実験で示された複数の連続制御タスクにまたがる現在の最先端のメソッドよりも優れています。

Deep reinforcement learning (DRL) algorithms work well on various complex control tasks. This success can be partly attributed to DRL encouraging intelligent agents to sufficiently explore the environment and collect diverse experiences during the agent training process. Therefore, exploration plays a significant role in reaching an optimal policy in DRL. Despite recent works making great progress in continuous control tasks, exploration in these tasks remains insufficiently investigated. To explicitly encourage exploration in continuous control tasks, we propose CCEP (Centralized Cooperative Exploration Policy), which utilizes underestimation and overestimation of value functions to maintain the capacity for exploration. CCEP first keeps two value functions initialized with different parameters, and generates diverse policies with multiple exploration styles from this pair of value functions. In addition, a centralized policy framework ensures that CCEP achieves message delivery between multiple policies, further contributing to cooperative exploration of the environment. Extensive experimental results demonstrate that CCEP achieves higher exploration capacity. Empirical analysis shows diverse exploration styles in the policies learned by CCEP, which reap benefits in more exploration regions. This exploration capacity ensures that CCEP outperforms the current state-of-the-art methods across the multiple continuous control tasks shown in the experiments.

翻訳日:2023-01-09 23:50:13 公開日:2023-01-06

# 共形損失制御予測

Conformal Loss-Controlling Prediction ( http://arxiv.org/abs/2301.02424v1 )

ライセンス: Link先を確認

Di Wang, Ping Wang, Zhong Ji, Xiaojun Yang, Hongyue Li

(参考訳) コンフォーマル予測は、予測セットの予測カバレッジを制御する学習フレームワークであり、任意の学習アルゴリズムに基づいてポイント予測を行うことができる。本研究では,損失関数の値を制御する必要がある状況に対して,共形予測を拡張した共形損失制御予測という学習フレームワークを提案する。リスク制御予測セットと,損失関数の期待値を制御することを目的とした共形リスク制御に関する既存の研究とは違い,本論文では,誤発見損失から一般損失への共形予測の拡張である任意のテスト対象の損失に着目した。制御保証は有限事例におけるデータの交換可能性の仮定の下で証明され、数値気象予報アプリケーションのクラス変動損失と統計的後処理を伴う分類について実証的に検証し、ポイントワイズ分類およびポイントワイズ回帰問題として導入する。すべての理論解析と実験結果から,損失制御手法の有効性を確認した。

Conformal prediction is a learning framework controlling prediction coverage of prediction sets, which can be built on any learning algorithm for point prediction. This work proposes a learning framework named conformal loss-controlling prediction, which extends conformal prediction to the situation where the value of a loss function needs to be controlled. Different from existing works about risk-controlling prediction sets and conformal risk control with the purpose of controlling the expected values of loss functions, the proposed approach in this paper focuses on the loss for any test object, which is an extension of conformal prediction from miscoverage loss to some general loss. The controlling guarantee is proved under the assumption of exchangeability of data in finite-sample cases and the framework is tested empirically for classification with a class-varying loss and statistical postprocessing of numerical weather forecasting applications, which are introduced as point-wise classification and point-wise regression problems. All theoretical analysis and experimental results confirm the effectiveness of our loss-controlling approach.

翻訳日:2023-01-09 23:49:54 公開日:2023-01-06
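
The abstract above builds on conformal prediction, which controls the coverage of prediction sets on top of any point predictor. For orientation, the sketch below shows the standard split conformal recipe for regression intervals, which is the coverage-controlling baseline and not the loss-controlling extension proposed in the paper. The function names and the use of absolute residuals as the nonconformity score are assumptions of this sketch.

```python
import math


def conformal_quantile(scores, alpha):
    """Return the ceil((n + 1) * (1 - alpha))-th smallest calibration score.

    If that rank exceeds n, no finite threshold gives the guarantee,
    so infinity is returned.
    """
    n = len(scores)
    rank = math.ceil((n + 1) * (1 - alpha))
    if rank > n:
        return math.inf
    return sorted(scores)[rank - 1]


def prediction_interval(point_prediction, calibration_residuals, alpha=0.1):
    """Symmetric interval around a point prediction.

    calibration_residuals -- y - y_hat on held-out calibration data
                             (assumed exchangeable with the test point)
    alpha                 -- target miscoverage level
    """
    q = conformal_quantile([abs(r) for r in calibration_residuals], alpha)
    return point_prediction - q, point_prediction + q
```

Under exchangeability, the returned interval covers the true value with probability at least `1 - alpha`; with too few calibration points for the requested level, the interval becomes infinite rather than silently under-covering.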

# 圧縮アクティベーションは並列トレーニングのモデルに役立つか?

Does compressing activations help model parallel training? ( http://arxiv.org/abs/2301.02654v1 )

ライセンス: Link先を確認

Song Bian, Dacheng Li, Hongyi Wang, Eric P. Xing, Shivaram Venkataraman

(参考訳) 大規模トランスフォーマーモデルは様々なタスクにおいて例外的な性能で知られているが、通信集約型モデル並列性を必要とするため、訓練は困難である。トレーニング速度を改善する1つの方法は、通信におけるメッセージサイズを圧縮することである。従来の手法は主にデータ並列性の設定における勾配の圧縮に焦点を合わせてきたが、モデル並列設定における圧縮は未調査領域である。モデル並列性はデータ並列性と根本的に異なる特徴を持つことがわかった。本研究では,モデル並列性に対する圧縮手法の有効性に関する実験的検討を行った。我々は,一般的なTransformerトレーニングフレームワークを用いて,プルーニングベース,学習ベース,量子化ベースという3つの圧縮アルゴリズムの共通クラスを実装し,評価する。我々は、これらの手法を160以上の設定と8つの一般的なデータセットで評価し、異なるハイパーパラメータ、ハードウェア、微調整および事前学習の段階を考慮に入れた。モデルのスケールアップ時の分析も行っています。最後に,モデル並列性圧縮アルゴリズムの今後の開発について考察する。

Large-scale Transformer models are known for their exceptional performance in a range of tasks, but training them can be difficult due to the requirement for communication-intensive model parallelism. One way to improve training speed is to compress the message size in communication. Previous approaches have primarily focused on compressing gradients in a data parallelism setting, but compression in a model-parallel setting is an understudied area. We have discovered that model parallelism has fundamentally different characteristics than data parallelism. In this work, we present the first empirical study on the effectiveness of compression methods for model parallelism. We implement and evaluate three common classes of compression algorithms - pruning-based, learning-based, and quantization-based - using a popular Transformer training framework. We evaluate these methods across more than 160 settings and 8 popular datasets, taking into account different hyperparameters, hardware, and both fine-tuning and pre-training stages. We also provide analysis when the model is scaled up. Finally, we provide insights for future development of model parallelism compression algorithms.

翻訳日:2023-01-09 23:49:35 公開日:2023-01-06
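
The abstract above names pruning-based and quantization-based compression among the classes it evaluates. The sketch below illustrates the general idea on a flat list of activation values: uniform min-max quantization to a fixed number of bits, and top-k magnitude pruning. These are generic textbook variants chosen for illustration; the specific algorithms, frameworks, and settings used in the paper are not given in the abstract, and all names here are hypothetical.

```python
def quantize(values, bits=8):
    """Uniform min-max quantization of a list of floats to integer codes."""
    lo, hi = min(values), max(values)
    levels = (1 << bits) - 1
    # Guard against a constant input, where the range is zero.
    scale = (hi - lo) / levels if hi > lo else 1.0
    codes = [round((v - lo) / scale) for v in values]
    return codes, lo, scale


def dequantize(codes, lo, scale):
    """Approximate reconstruction of the original values."""
    return [lo + c * scale for c in codes]


def top_k_sparsify(values, k):
    """Keep the k largest-magnitude entries as (index, value) pairs."""
    keep = sorted(range(len(values)), key=lambda i: abs(values[i]), reverse=True)[:k]
    return [(i, values[i]) for i in sorted(keep)]
```

In this sketch the message sent between devices would be `codes` plus two floats instead of full-precision activations, or `k` index-value pairs instead of the dense tensor; the reconstruction error of the quantizer is bounded by half a quantization step.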

# object as query: 任意の2dオブジェクト検出器に3d検出能力を備える

Object as Query: Equipping Any 2D Object Detector with 3D Detection Ability ( http://arxiv.org/abs/2301.02364v1 )

ライセンス: Link先を確認

Zitian Wang, Zehao Huang, Jiahui Fu, Naiyan Wang, Si Liu

(参考訳) マルチビュー画像からの3Dオブジェクト検出は、ここ数年で注目されている。既存の方法は、主に多視点画像から3D表現を確立し、オブジェクト検出に高密度な検出ヘッドを採用するか、オブジェクトをローカライズするために3D空間に分散されたオブジェクトクエリを使用する。本稿では,多視点3次元物体検出装置(MV2D)を設計し,任意の2次元物体検出装置を装備して,多視点3次元物体検出の促進を図る。 MV2Dは2D検出器を利用して、リッチな画像意味論に基づくオブジェクトクエリを生成する。これらの動的に生成されたクエリにより、MV2Dは計算コストを増大させることなくより大きな3D空間のオブジェクトを検出でき、3Dオブジェクトをローカライズする強力な能力を示す。生成したクエリに対して,分散クロスアテンションモジュールを設計し,特定のオブジェクトの特徴に注目させることにより,計算コストを低減し,ノイズによる干渉を抑制する。 nuScenesデータセットの評価結果は、動的オブジェクトクエリとスパース特徴集約が3次元検出能力を損なわないことを示す。 MV2Dは既存の手法の中でも最先端の性能を示している。 MV2Dが将来の研究の新たなベースラインになることを期待している。

3D object detection from multi-view images has drawn much attention over the past few years. Existing methods mainly establish 3D representations from multi-view images and adopt a dense detection head for object detection, or employ object queries distributed in 3D space to localize objects. In this paper, we design Multi-View 2D Objects guided 3D Object Detector (MV2D), which can be equipped with any 2D object detector to promote multi-view 3D object detection. Since 2D detections can provide valuable priors for object existence, MV2D exploits a 2D detector to generate object queries conditioned on the rich image semantics. These dynamically generated queries enable MV2D to detect objects in a larger 3D space without increased computational costs and show a strong capability of localizing 3D objects. For the generated queries, we design a sparse cross-attention module to force them to focus on the features of specific objects, which reduces the computational cost and suppresses interference from noise. The evaluation results on the nuScenes dataset demonstrate that dynamic object queries and sparse feature aggregation do not harm 3D detection capability. MV2D also exhibits state-of-the-art performance among existing methods. We hope MV2D can serve as a new baseline for future research.

翻訳日:2023-01-09 23:41:56 公開日:2023-01-06

# Anchor3DLane:モノクロ3Dレーン検出のための3Dアンカーの学習

Anchor3DLane: Learning to Regress 3D Anchors for Monocular 3D Lane Detection ( http://arxiv.org/abs/2301.02371v1 )

ライセンス: Link先を確認

Shaofei Huang, Zhenwei Shen, Zehao Huang, Zihan Ding, Jiao Dai, Jizhong Han, Naiyan Wang, Si Liu

(参考訳) 深さ情報がないため,単眼3次元レーン検出は難しい課題である。 3Dレーン検出の一般的な解決策は、まず正面視(FV)画像や特徴を逆視点マッピング(IPM)で鳥眼視(BEV)空間に変換し、BEV特徴から車線を検出することである。しかし、IPMが平らな地上での仮定やコンテキスト情報の喪失に依存しているため、BEV表現から3D情報を復元するには不正確である。 BEVを排除し、FV表現から直接3Dレーンを予測する試みがなされているが、3Dレーンの構造的表現が欠如していることから、他のBEVベースの方法よりも性能が低い。本稿では,3d空間における3dレーンアンカーを定義し,fv表現から直接3dレーンを予測するためのアンカー3dlane法を提案する。 3DレーンアンカーはFV機能に投影され、正確な予測を行うための優れた構造情報とコンテキスト情報の両方を含む特徴を抽出する。さらにanchor3dlaneをマルチフレーム設定に拡張し、パフォーマンス改善のために時間情報を取り込む。さらに,車線間の等幅特性を利用した大域的最適化手法も開発し,予測の側方誤差を低減する。 3つの人気のある3Dレーン検出ベンチマークの大規模な実験により、我々のAnchor3DLaneは従来のBEVベースの手法より優れ、最先端のパフォーマンスを実現しています。

Monocular 3D lane detection is a challenging task due to its lack of depth information. A popular solution to 3D lane detection is to first transform the front-view (FV) images or features into the bird's-eye-view (BEV) space with inverse perspective mapping (IPM) and then detect lanes from BEV features. However, the reliance of IPM on the flat-ground assumption and its loss of context information make it inaccurate to restore 3D information from BEV representations. An attempt has been made to get rid of BEV and predict 3D lanes from FV representations directly, but it still underperforms BEV-based methods given its lack of a structured representation for 3D lanes. In this paper, we define 3D lane anchors in the 3D space and propose a BEV-free method named Anchor3DLane to predict 3D lanes directly from FV representations. 3D lane anchors are projected to the FV features to extract their features, which contain both good structural and context information for making accurate predictions. We further extend Anchor3DLane to the multi-frame setting to incorporate temporal information for performance improvement. In addition, we develop a global optimization method that makes use of the equal-width property between lanes to reduce the lateral error of predictions. Extensive experiments on three popular 3D lane detection benchmarks show that our Anchor3DLane outperforms previous BEV-based methods and achieves state-of-the-art performance.

翻訳日:2023-01-09 23:41:35 公開日:2023-01-06

# codetalker: 個別動作を優先した音声駆動3d顔アニメーション

CodeTalker: Speech-Driven 3D Facial Animation with Discrete Motion Prior ( http://arxiv.org/abs/2301.02379v1 )

ライセンス: Link先を確認

Jinbo Xing, Menghan Xia, Yuechen Zhang, Xiaodong Cun, Jue Wang, Tien-Tsin Wong

(参考訳) 音声駆動の3D顔アニメーションは広く研究されているが、音声視覚データの極めて不適切な性質と不足のため、現実主義と鮮明さを達成するには依然としてギャップがある。既存の作業は、通常、回帰タスクへのクロスモーダルマッピングを定式化するが、これは回帰と平均の問題に悩まされ、過度に滑らかな顔の動きにつながる。本稿では,学習したコードブックの有限プロキシ空間において,音声による顔のアニメーションをコードクエリタスクとしてキャストすることを提案する。コードブックは、実際の顔の動きに対する自己再構成によって学習され、現実的な顔の動きに埋め込まれる。離散的動作空間上では、入力された音声信号から顔の動きを逐次合成する時間的自己回帰モデルが用いられ、口唇同期と多彩な表情が保証される。提案手法は, 定性的かつ定量的に, 現在の最先端手法よりも優れていることを示す。また、ユーザスタディは、知覚品質の優位性をさらに正当化する。

Speech-driven 3D facial animation has been widely studied, yet there is still a gap to achieving realism and vividness due to the highly ill-posed nature and scarcity of audio-visual data. Existing works typically formulate the cross-modal mapping into a regression task, which suffers from the regression-to-mean problem leading to over-smoothed facial motions. In this paper, we propose to cast speech-driven facial animation as a code query task in a finite proxy space of the learned codebook, which effectively promotes the vividness of the generated motions by reducing the cross-modal mapping uncertainty. The codebook is learned by self-reconstruction over real facial motions and thus embedded with realistic facial motion priors. Over the discrete motion space, a temporal autoregressive model is employed to sequentially synthesize facial motions from the input speech signal, which guarantees lip-sync as well as plausible facial expressions. We demonstrate that our approach outperforms current state-of-the-art methods both qualitatively and quantitatively. Also, a user study further justifies our superiority in perceptual quality.

翻訳日:2023-01-09 23:41:07 公開日:2023-01-06

# cyberloc: 正確な長期視定位を目指して

CyberLoc: Towards Accurate Long-term Visual Localization ( http://arxiv.org/abs/2301.02403v1 )

ライセンス: Link先を確認

Liu Liu, Yukai Lin, Xiao Liang, Qichao Xu, Miao Jia, Yangdong Liu, Yuxiang Wen, Wei Luo, Jiangwei Li

(参考訳) 本報告では,課題条件下でロバストかつ高精度なポーズ推定を行うための画像ベースビジュアルローカライズパイプラインであるcyberlocを紹介する。提案手法は4つのモジュールを連結して構成する。まず、異なる条件下で複数の参照シーケンスが存在する場合、各参照シーケンスに対する1つのマップであるシーンの正確な3Dマップを構築するためにマッピングモジュールを適用する。次に、単一の画像ベースのローカライゼーションパイプライン(retrieval--matching--PnP)を行い、クエリ画像毎に6-DoFカメラのポーズを3Dマップ毎に推定する。第3に、6-DoFカメラのポーズをフィルタし、1つの6-DoFカメラのポーズをクエリに出力するコンセンサスセット最大化モジュールを提案する。最後に、6-DoFのクエリポーズを最適化し、候補となるグローバルな6-DoFカメラポーズとその対応するグローバルな2D-3Dマッチング、連続的なクエリイメージとクエリシーケンスのSLAMポーズのスパース2D-2D特徴マッチングを入力として、ロバストなポーズ修正モジュールを提案する。 4シーズンデータセットを用いた実験により,本手法は高精度かつロバスト性が得られた。特に,本手法は,地図を用いた自律運転用ローカライゼーション(MLAD-ECCV2022)に関するECCV 2022ワークショップのローカライゼーション課題に勝っている。

This technical report introduces CyberLoc, an image-based visual localization pipeline for robust and accurate long-term pose estimation under challenging conditions. The proposed method comprises four modules connected in a sequence. First, a mapping module is applied to build accurate 3D maps of the scene, one map for each reference sequence if there exist multiple reference sequences under different conditions. Second, a single-image-based localization pipeline (retrieval--matching--PnP) is performed to estimate 6-DoF camera poses for each query image, one for each 3D map. Third, a consensus set maximization module is proposed to filter out outlier 6-DoF camera poses, and outputs one 6-DoF camera pose for a query. Finally, a robust pose refinement module is proposed to optimize 6-DoF query poses, taking candidate global 6-DoF camera poses and their corresponding global 2D-3D matches, sparse 2D-2D feature matches between consecutive query images and SLAM poses of the query sequence as input. Experiments on the 4seasons dataset show that our method achieves high accuracy and robustness. In particular, our approach wins the localization challenge of ECCV 2022 workshop on Map-based Localization for Autonomous Driving (MLAD-ECCV2022).

翻訳日:2023-01-09 23:40:48 公開日:2023-01-06

# 視覚変換器の効率よいFew-shot Adaptationの探索

Exploring Efficient Few-shot Adaptation for Vision Transformers ( http://arxiv.org/abs/2301.02419v1 )

ライセンス: Link先を確認

Chengming Xu, Siqian Yang, Yabiao Wang, Zhanxiong Wang, Yanwei Fu, Xiangyang Xue

(参考訳) FSL(Few-shot Learning)の課題は,ラベル付きトレーニングサンプルを豊富に含むベースカテゴリから学習した知識を利用して,ラベル付きサンプルを少数含む新規カテゴリの推論を行うことである。 FSLタスクには多くの研究があるが、視覚トランスフォーマー(ViT)がFSLのバックボーンとして採用されることは稀であり、バックボーン全体や分類層を微調整することに焦点を当てる試みはほとんどない。基本的に、ViTは、他のビジョンタスクで同等またはさらに優れたパフォーマンスを享受していることが示されているが、現実のFSLシナリオでViTを効率的に微調整することは、まだ非常に簡単ではない。そこで本研究では,FSLタスクの微調整を容易にするトランスフォーマーチューニング(eTT)手法を提案する。鍵となる新機能は、タスクとバックボーンチューニングのために新たに提示された注意プレフィックスチューニング(apt)とドメイン残差アダプタ(dra)から生まれます。具体的には、APTでは、プレフィックスを各自己保持層に取り付けられた新しいキーと値ペアに投影し、タスク固有の情報を提供する。さらに,学習可能なオフセットベクトルの形でdraを設計し,ベースデータと新規データの間の潜在的な領域ギャップを処理する。 aptが初期タスク固有の情報からあまり逸脱しないようにするため、我々はさらにプレフィックスと初期プロトタイプの射影分布の類似性を最大化し、更新手順を規則化する新しいプロトタイプ正規化を提案する。提案手法はメタデータセットの課題に対して優れた性能を発揮する。我々は,モデルの有効性を示す広範な実験を行った。

The task of Few-shot Learning (FSL) aims to perform inference on novel categories containing only a few labeled examples, with the help of knowledge learned from base categories containing abundant labeled training samples. While there are numerous works on the FSL task, Vision Transformers (ViTs) have rarely been taken as the backbone for FSL, with the few attempts focusing on naive finetuning of the whole backbone or the classification layer. Essentially, although ViTs have been shown to enjoy comparable or even better performance on other vision tasks, it is still very nontrivial to efficiently finetune ViTs in real-world FSL scenarios. To this end, we propose a novel efficient Transformer Tuning (eTT) method that facilitates finetuning ViTs in FSL tasks. The key novelties come from the newly presented Attentive Prefix Tuning (APT) and Domain Residual Adapter (DRA) for the task and backbone tuning, respectively. Specifically, in APT, the prefix is projected to new key and value pairs that are attached to each self-attention layer to provide the model with task-specific information. Moreover, we design the DRA in the form of learnable offset vectors to handle the potential domain gaps between base and novel data. To ensure that the APT does not deviate much from the initial task-specific information, we further propose a novel prototypical regularization, which maximizes the similarity between the projected distribution of the prefix and the initial prototypes, regularizing the update procedure. Our method achieves outstanding performance on the challenging Meta-Dataset. We conduct extensive experiments to show the efficacy of our model.

翻訳日:2023-01-09 23:40:25 公開日:2023-01-06

# シーケンシャル量子強化トレーニングを用いたトレーサブル量子機械学習に向けて

SEQUENT: Towards Traceable Quantum Machine Learning using Sequential Quantum Enhanced Training ( http://arxiv.org/abs/2301.02601v1 )

ライセンス: Link先を確認

Philipp Altmann, Leo Sünkel, Jonas Stein, Tobias Müller, Christoph Roch and Claudia Linnhoff-Popien

(参考訳) 量子コンピューティングのような新しいコンピューティングパラダイムを機械学習の分野に適用する動きが最近注目を集めている。しかし、高次元実世界の応用は純粋に量子ハードウェアで解決できないため、古典的および量子機械学習のパラダイムを用いたハイブリッド手法が提案されている。例えば、移動学習法はハイブリッド画像分類タスクに適用可能であることが示されている。それでも、有益な回路アーキテクチャを探求する必要がある。したがって、選択した回路アーキテクチャとパラメータ化の影響の追跡は、有効なハイブリッド手法の開発に不可欠である。しかし、現在の方法には、両方の部分を同時に訓練するプロセスが含まれているため、古典的および量子的な影響の厳密な分離性が認められない。したがって、これらのアーキテクチャは、最小限の量子インパクトを使用しながらより優れた予測精度をもたらすモデルを生成するかもしれない。本稿では,量子コンピューティング手法のハイブリッド機械学習へのトレーサブルな応用に向けて,逐次的量子強化トレーニング(sequent)により改良されたアーキテクチャとトレーニングプロセスを提案する。さらに,現在の手法の欠点と予備的な実験結果に対する形式的な証拠を,sequentの適用可能性の実証として提示する。

Applying new computing paradigms like quantum computing to the field of machine learning has recently gained attention. However, as high-dimensional real-world applications cannot yet be solved using purely quantum hardware, hybrid methods using both classical and quantum machine learning paradigms have been proposed. For instance, transfer learning methods have been shown to be successfully applicable to hybrid image classification tasks. Nevertheless, beneficial circuit architectures still need to be explored. Therefore, tracing the impact of the chosen circuit architecture and parameterization is crucial for the development of beneficially applicable hybrid methods. However, current methods include processes where both parts are trained concurrently, and therefore do not allow for a strict separability of classical and quantum impact. Thus, those architectures might produce models that yield a superior prediction accuracy whilst employing the least possible quantum impact. To tackle this issue, we propose Sequential Quantum Enhanced Training (SEQUENT), an improved architecture and training process for the traceable application of quantum computing methods to hybrid machine learning. Furthermore, we provide formal evidence for the disadvantage of current methods and preliminary experimental results as a proof-of-concept for the applicability of SEQUENT.

翻訳日:2023-01-09 23:33:15 公開日:2023-01-06

# 住宅負荷の迅速応答のためのマルチエージェント強化学習

Multi-Agent Reinforcement Learning for Fast-Timescale Demand Response of Residential Loads ( http://arxiv.org/abs/2301.02593v1 )

ライセンス: Link先を確認

Vincent Mai, Philippe Maisonneuve, Tianyu Zhang, Hadi Nekoei, Liam Paull, Antoine Lesage-Landry

(参考訳) 高量の再生可能エネルギー資源を統合するためには、電力グリッドは高振幅で高速な時間スケールの発電に対処できなければならない。需要応答による周波数規制は、空気調和機のような時間的に柔軟な負荷を調整し、これらの変動に対処する可能性がある。動的制約を伴う離散制御のための既存のアプローチは、数百のエージェントによる高速な時間スケールアクション選択に満足な性能を提供するのに苦労している。局所通信を用いたマルチエージェントポリシー最適化を訓練した分散エージェントを提案する。ハンドエンジニアリングとマルチエージェント通信による学習という,2つのコミュニケーションフレームワークについて検討する。結果として得られるポリシは、周波数規制に対して良好かつ堅牢に機能し、一定の処理時間の間、任意の数のハウスにシームレスにスケールする。

To integrate high amounts of renewable energy resources, electrical power grids must be able to cope with high amplitude, fast timescale variations in power generation. Frequency regulation through demand response has the potential to coordinate temporally flexible loads, such as air conditioners, to counteract these variations. Existing approaches for discrete control with dynamic constraints struggle to provide satisfactory performance for fast timescale action selection with hundreds of agents. We propose a decentralized agent trained with multi-agent proximal policy optimization with localized communication. We explore two communication frameworks: hand-engineered, or learned through targeted multi-agent communication. The resulting policies perform well and robustly for frequency regulation, and scale seamlessly to arbitrary numbers of houses for constant processing times.

翻訳日:2023-01-09 23:32:35 公開日:2023-01-06

# 楕円スライスサンプリング再訪の可逆性

Reversibility of elliptical slice sampling revisited ( http://arxiv.org/abs/2301.02426v1 )

ライセンス: Link先を確認

Mareike Hasenpflug, Viacheslav Natarovskii, Daniel Rudolf

(参考訳) マレー、アダムズ、マッケイが2010年に導入した後方分布の近似サンプリングのためのマルコフ連鎖法である楕円スライスサンプリングの明確性について考察する。我々は正規性要件を指摘し、可逆性特性の別の証明を提供する。特に、これは無限次元分離ヒルベルト空間上でもスライスサンプリングスキームの正しさを保証する。

We discuss the well-definedness of elliptical slice sampling, a Markov chain approach for approximate sampling of posterior distributions introduced by Murray, Adams and MacKay (2010). We point to a regularity requirement and provide an alternative proof of the reversibility property. In particular, this guarantees the correctness of the slice sampling scheme also on infinite-dimensional separable Hilbert spaces.

翻訳日:2023-01-09 23:32:23 公開日:2023-01-06
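
For readers unfamiliar with the scheme discussed above, the following is a minimal one-dimensional sketch of a single elliptical slice sampling transition in the form commonly attributed to Murray, Adams and MacKay. The function name, the zero-mean Gaussian prior with standard deviation `prior_std`, and the toy likelihood are assumptions made for illustration; the regularity requirement that the paper points to is not checked here.

```python
import math
import random


def elliptical_slice_step(f, log_likelihood, prior_std, rng):
    """One transition for a 1-D state with prior N(0, prior_std**2)."""
    # Auxiliary prior draw; together with f it defines the ellipse.
    nu = rng.gauss(0.0, prior_std)
    # Slice level: log-likelihood of the current state plus log of a
    # uniform draw from (0, 1].
    log_y = log_likelihood(f) + math.log(1.0 - rng.random())
    # Initial angle and the bracket around it.
    theta = rng.uniform(0.0, 2.0 * math.pi)
    theta_min, theta_max = theta - 2.0 * math.pi, theta
    while True:
        proposal = f * math.cos(theta) + nu * math.sin(theta)
        if log_likelihood(proposal) > log_y:
            return proposal
        # Rejected: shrink the bracket towards theta = 0 (the current state).
        if theta < 0.0:
            theta_min = theta
        else:
            theta_max = theta
        theta = rng.uniform(theta_min, theta_max)


def toy_log_likelihood(f):
    # Hypothetical Gaussian likelihood centred at 1 with unit variance.
    return -0.5 * (f - 1.0) ** 2


rng = random.Random(0)
state, samples = 0.0, []
for _ in range(1000):
    state = elliptical_slice_step(state, toy_log_likelihood, 1.0, rng)
    samples.append(state)
print(sum(samples) / len(samples))
```

With a standard normal prior and this toy likelihood the exact posterior is Gaussian with mean 0.5, so the printed sample mean should land near that value.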

# mask-then-fill: イベント抽出のための柔軟かつ効果的なデータ拡張フレームワーク

Mask-then-Fill: A Flexible and Effective Data Augmentation Framework for Event Extraction ( http://arxiv.org/abs/2301.02427v1 )

ライセンス: Link先を確認

Jun Gao, Changlong Yu, Wei Wang, Huan Zhao, Ruifeng Xu

(参考訳) イベント抽出のための柔軟かつ効果的なデータ拡張フレームワークであるmask-then-fillを提案する。このアプローチは、テキストのより柔軟な操作を可能にし、元のイベント構造を可能な限り変更することなく、より多様なデータを生成することができる。具体的には、まず随伴文の断片をランダムにマスキングし、それから可変長のテキストを細調整された埋め込みモデルで埋め込む。主な利点は、テキスト中の任意の長さの断片を、単一の単語または固定長の断片だけを置換できる既存の方法と比較して、可変長の別の断片に置き換えることができることである。トリガおよび引数抽出タスクにおいて,提案手法はベースライン手法よりも有効であり,低リソース設定において特に強い結果を示す。さらに分析した結果,多様性と分布的類似性のバランスが良好であることが判明した。

We present Mask-then-Fill, a flexible and effective data augmentation framework for event extraction. Our approach allows for more flexible manipulation of text and thus can generate more diverse data while keeping the original event structure unchanged as much as possible. Specifically, it first randomly masks out an adjunct sentence fragment and then infills a variable-length text span with a fine-tuned infilling model. The main advantage lies in that it can replace a fragment of arbitrary length in the text with another fragment of variable length, compared to the existing methods which can only replace a single word or a fixed-length fragment. On trigger and argument extraction tasks, the proposed framework is more effective than baseline methods and it demonstrates particularly strong results in the low-resource setting. Our further analysis shows that it achieves a good balance between diversity and distributional similarity.

翻訳日:2023-01-09 23:31:44 公開日:2023-01-06
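
The mask-then-fill idea described above can be sketched as follows. This is only an illustration under stated assumptions: `mask_then_fill`, the `<mask>` token, and `toy_infill` are hypothetical names, and `toy_infill` is a fixed stand-in for the fine-tuned infilling model mentioned in the abstract.

```python
def mask_then_fill(tokens, span, infill):
    """Replace tokens[start:end] with a fill of any length.

    `span` is the (start, end) range of the adjunct fragment to mask.
    `infill` is a hypothetical callable standing in for the infilling
    model: it receives the masked token list and returns new tokens.
    """
    start, end = span
    masked = tokens[:start] + ["<mask>"] + tokens[end:]
    filled = infill(masked)
    # Everything outside the masked span is kept unchanged.
    return tokens[:start] + list(filled) + tokens[end:]


def toy_infill(masked_tokens):
    # Stand-in for a real model: always returns the same three tokens.
    return ["late", "last", "night"]


sentence = "The army attacked the town yesterday".split()
augmented = mask_then_fill(sentence, (5, 6), toy_infill)
print(" ".join(augmented))  # The army attacked the town late last night
```

The one-token fragment is replaced by a three-token fragment, while the trigger ("attacked") and its arguments are left untouched.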

# OPD@NL4Opt:最適化問題のNERタスクに対するアンサンブルアプローチ

OPD@NL4Opt: An ensemble approach for the NER task of the optimization problem ( http://arxiv.org/abs/2301.02459v1 )

ライセンス: Link先を確認

Kangxu Wang, Ze Chen, Jiewen Zheng

(参考訳) 本稿では,NL4Optコンペティションサブタスク1(NERタスク)に対するアンサンブルアプローチを提案する。このタスクでは、まず、競合データセットに基づいて事前訓練された言語モデルを微調整する。そして,モデル一般化とロバスト性を高めるために,差分学習率と対角訓練戦略を採用する。さらに、モデルアンサンブル法を用いて最終予測を行い、マイクロ平均F1スコア93.3%を達成し、NERタスクにおいて第2位を獲得する。

In this paper, we present an ensemble approach for subtask 1 (the NER task) of the NL4Opt competition. For this task, we first fine-tune pretrained language models on the competition dataset. Then we adopt differential learning rates and adversarial training strategies to enhance model generalization and robustness. Additionally, we use a model ensemble method for the final prediction, which achieves a micro-averaged F1 score of 93.3% and attains the second prize in the NER task.

翻訳日:2023-01-09 23:31:30 公開日:2023-01-06
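
The abstract above does not say how the individual model predictions are combined. As one common choice, assumed here purely for illustration, per-token tags can be merged by majority vote; the tag names below are hypothetical.

```python
from collections import Counter


def majority_vote(predictions):
    """Combine per-token tag sequences from several models by majority vote."""
    ensembled = []
    # zip(*predictions) yields, for each token position, one tag per model.
    for tags_at_position in zip(*predictions):
        tag, _count = Counter(tags_at_position).most_common(1)[0]
        ensembled.append(tag)
    return ensembled


model_outputs = [
    ["O", "B-VAR", "O", "B-LIMIT"],
    ["O", "B-VAR", "B-VAR", "B-LIMIT"],
    ["O", "O", "O", "B-LIMIT"],
]
print(majority_vote(model_outputs))  # ['O', 'B-VAR', 'O', 'B-LIMIT']
```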

# トランスフォーマーを用いたメンタルヘルスポストの因果分類

Causal Categorization of Mental Health Posts using Transformers ( http://arxiv.org/abs/2301.02589v1 )

ライセンス: Link先を確認

Muskan Garg, Simranjeet Kaur, Ritika Bhardwaj, Aastha Jain, Chandni Saxena

(参考訳) 近年、臨床心理学のデジタル化が進み、NLP研究コミュニティはソーシャルメディアにおけるメンタルヘルス検出の分野に革命をもたらした。既存のメンタルヘルス分析研究は、ソーシャルメディアに対するユーザの意図を分類するための横断的研究を中心に展開されている。詳細な分析のために,既存の分類器について検討し,限られたトレーニングサンプルによる学習ベース手法の非効率性を示唆する因果分類問題を解く。この課題に対処するために、トランスフォーマーモデルを使用し、"CAMS"データセット上でトレーニング済みのトランスファー学習の有効性を実証する。実験結果により精度が向上し,基礎となるテキストにおける因果関係の同定の重要性が示された。

With recent developments in the digitization of clinical psychology, the NLP research community has revolutionized the field of mental health detection on social media. Existing research in mental health analysis revolves around cross-sectional studies that classify users' intent on social media. For in-depth analysis, we investigate existing classifiers for the problem of causal categorization; the results suggest that learning-based methods are inefficient because of limited training samples. To handle this challenge, we use transformer models and demonstrate the efficacy of pre-trained transfer learning on the "CAMS" dataset. The experimental results show improved accuracy and underline the importance of identifying cause-and-effect relationships in the underlying text.

翻訳日:2023-01-09 23:31:21 公開日:2023-01-06

# 私の必要なものを本当に理解している人:知識とペルソナを基盤とする知的で友好的な対話エージェント

You Truly Understand What I Need: Intellectual and Friendly Dialogue Agents grounding Knowledge and Persona ( http://arxiv.org/abs/2301.02401v1 )

ライセンス: Link先を確認

Jungwoo Lim, Myunghoon Kang, Yuna Hur, Seungwon Jung, Jinsung Kim, Yoonna Jang, Dongyub Lee, Hyesung Ji, Donghoon Shin, Seungryong Kim, and Heuiseok Lim

(参考訳) 人間と流動的に対話する会話エージェントを構築するため、事前学習された言語モデルに知識や個人のプロファイルをブレンドする。しかし、知識とペルソナを同時に考慮するモデルは依然として限定的であり、幻覚とペルソナの使用のパッシブな方法につながる。外部知識とペルソナを同時に活用する効果的な対話エージェントを提案する。エージェントは、ポリエンコーダで実装された候補スコアで回答を生成するために使用する適切な知識とペルソナを選択する。そして,本モデルでは,知識人格拡張クエリを用いた検索拡張により,より少なめの幻覚とより親しみのある発話を生成する。我々はペルソナ知識チャットの実験を行い、自動メトリクスに基づくグラウンドおよび生成タスクにおける最先端のパフォーマンスを達成する。さらに,人間の評価と質的結果を通して,幻覚とエンゲージメントに関するモデルからの回答を検証する。本稿では,他の検索者と比較して関連文書の抽出に有効であることを示すとともに,複数の候補スコアリング手法との比較を行った。コードはhttps://github.com/dlawjddn803/infoで入手できる。

To build a conversational agent that interacts fluently with humans, previous studies blend knowledge or personal profiles into a pre-trained language model. However, models that consider knowledge and persona at the same time are still limited, leading to hallucination and a passive way of using personas. We propose an effective dialogue agent that grounds external knowledge and persona simultaneously. The agent selects the proper knowledge and persona for generating answers using our candidate scoring, which is implemented with a poly-encoder. Then, our model generates utterances with less hallucination and more engagingness by utilizing retrieval-augmented generation with a knowledge-persona enhanced query. We conduct experiments on the persona-knowledge chat and achieve state-of-the-art performance in grounding and generation tasks on the automatic metrics. Moreover, we validate the answers from the models regarding hallucination and engagingness through human evaluation and qualitative results. We show our retriever's effectiveness in extracting relevant documents compared to previous retrievers, along with a comparison of multiple candidate scoring methods. Code is available at https://github.com/dlawjddn803/INFO

翻訳日:2023-01-09 23:25:36 公開日:2023-01-06

# SAIDS : 方言とサルカズムを指標とした感性分析の新しいアプローチ

SAIDS: A Novel Approach for Sentiment Analysis Informed of Dialect and Sarcasm ( http://arxiv.org/abs/2301.02521v1 )

ライセンス: Link先を確認

Abdelrahman Kaseb and Mona Farouk

(参考訳) 感情分析はあらゆるソーシャルネットワークで不可欠な部分となり、意思決定者がユーザーの意見をほぼあらゆる側面から知ることができる。その重要性にもかかわらず、感傷的テキストの感情のような、感情分析の主要な課題の1つである複数の問題に直面する。本稿では,アラビア語ツイートの感情,皮肉,方言を予測する新しいシステム(SAIDS)を導入することで,この問題に対処する。 SAIDSは、感情を予測するために既知の情報として、皮肉と方言の予測を使用する。言語モデルとしてMARBERTを使用して文の埋め込みを生成し、それをサルカズムと方言モデルに渡し、3つのモデルの出力を連結して感情分析モデルに渡す。複数のシステム設計が実験され、報告された。 SAIDSはArSarcasm-v2データセットに適用され、感情分析タスクの最先端モデルを上回った。すべてのタスクを一緒にトレーニングすることで、SAIDSはそれぞれ75.98 FPN、59.09 F1スコア、71.13 F1スコアの感情分析、sarcasm detection、弁別識別を行う。システム設計は、他のタスクに依存する任意のタスクのパフォーマンスを向上させるために使用できる。

Sentiment analysis has become an essential part of every social network, as it enables decision-makers to know more about users' opinions in almost all aspects of life. Despite its importance, it encounters multiple issues, such as the sentiment of sarcastic text, which is one of the main challenges of sentiment analysis. This paper tackles this challenge by introducing a novel system (SAIDS) that predicts the sentiment, sarcasm and dialect of Arabic tweets. SAIDS uses its prediction of sarcasm and dialect as known information to predict the sentiment. It uses MARBERT as a language model to generate sentence embeddings, passes them to the sarcasm and dialect models, and then concatenates the outputs of the three models and passes them to the sentiment analysis model. Multiple system design setups were experimented with and reported. SAIDS was applied to the ArSarcasm-v2 dataset, where it outperforms the state-of-the-art model for the sentiment analysis task. By training all tasks together, SAIDS achieves results of 75.98 FPN, 59.09 F1-score and 71.13 F1-score for sentiment analysis, sarcasm detection, and dialect identification, respectively. The system design can be used to enhance the performance of any task that depends on other tasks.

翻訳日:2023-01-09 23:25:21 公開日:2023-01-06

# エンティティクラスタとしてのトピック: 言語モデルとグラフニューラルネットワークによるエンティティベースのトピック

Topics as Entity Clusters: Entity-based Topics from Language Models and Graph Neural Networks ( http://arxiv.org/abs/2301.02458v1 )

ライセンス: Link先を確認

Manuel V. Loureiro, Steven Derby and Tri Kurniawan Wijaya

(参考訳) トピックモデルはコーパスの背後にある潜伏構造を明らかにすることを目的としている。トピックモデリングの文脈では、ほとんどの語彙は基礎となるトピックを明らかにするのに無関係であるか、関連する概念と強い関係を持ち、これらのトピックの解釈可能性に影響を与える。さらに、言語への依存や表現力の制限は、かなりの計算資源を必要とする。そこで本研究では,概念的実体を用いたクラスタベースのトピックモデリング手法を提案する。エンティティは、関係情報に富んだ現実世界の概念の言語に依存しない表現である。この目的のために、我々は実体のベクトル表現を抽出する。 (i)言語モデルを用いた百科事典 (ii)グラフニューラルネットワークを用いた知識ベース。我々は,この手法がコヒーレンシー指標の他の最先端トピックモデルより一貫して優れており,グラフベース埋め込みに符号化された明示的な知識は,言語モデルの文脈的埋め込みに符号化された暗黙的な知識よりも,より一貫性のあるトピックを提供することを示した。

Topic models aim to reveal the latent structure behind a corpus, typically conducted over a bag-of-words representation of documents. In the context of topic modeling, most vocabulary is either irrelevant for uncovering underlying topics or contains strong relationships with relevant concepts, impacting the interpretability of these topics. Furthermore, their limited expressiveness and dependency on language demand considerable computation resources. Hence, we propose a novel approach for cluster-based topic modeling that employs conceptual entities. Entities are language-agnostic representations of real-world concepts rich in relational information. To this end, we extract vector representations of entities from (i) an encyclopedic corpus using a language model; and (ii) a knowledge base using a graph neural network. We demonstrate that our approach consistently outperforms other state-of-the-art topic models across coherency metrics and find that the explicit knowledge encoded in the graph-based embeddings provides more coherent topics than the implicit knowledge encoded with the contextualized embeddings of language models.

翻訳日:2023-01-09 23:24:42 公開日:2023-01-06

# ハイブリッドディープラーニング技術(CNN+GRU)に基づく画像キャプションアルゴリズム

An Image captioning algorithm based on the Hybrid Deep Learning Technique (CNN+GRU) ( http://arxiv.org/abs/2301.02440v1 )

ライセンス: Link先を確認

Rana Adnan Ahmad, Muhammad Azhar, Hina Sattar

(参考訳) エンコーダ-デコーダフレームワークによる画像キャプションは,CNNを主にエンコーダとして,LSTMをデコーダとして使用した過去10年間で著しく進歩している。単純な画像の正確さという点では驚くべき成果があるが、時間的複雑さと空間的複雑さの効率性には欠ける。さらに,多くの情報やオブジェクトを持つ複雑な画像の場合,このCNN-LSTMペアの性能は,画像に提示されるシーンのセマンティックな理解が欠如していることから,指数関数的に低下した。そこで,これらの問題を考慮し,CNN-GRUエンコーダ・デコーダ・フレームワークを提案する。デコーダの隠れた状態を考慮して、入力画像とその類似意味表現を再構成し、モデルトレーニング中に意味再構築器からの再構成スコアを確率と共に使用して、生成されたキャプションの品質を評価する。その結果、デコーダは、改良された意味情報を受け取り、キャプション生成プロセスが向上する。モデルテストでは、復元スコアとログライクフッドを組み合わせることで、最も適切なキャプションを選択することもできる。提案モデルでは,画像キャプションのための最先端のLSTM-A5モデルよりも,時間的複雑性と精度が優れている。

Image captioning with the encoder-decoder framework has advanced tremendously in the last decade, with a CNN mainly used as the encoder and an LSTM as the decoder. Despite impressive accuracy on simple images, this approach is inefficient in terms of time and space complexity. In addition, for complex images with a lot of information and many objects, the performance of this CNN-LSTM pair degrades exponentially due to the lack of semantic understanding of the scenes presented in the images. Thus, to address these issues, we present a CNN-GRU encoder-decoder framework with a caption-to-image reconstructor that takes both the semantic context and the time complexity into consideration. By taking the hidden states of the decoder into consideration, the input image and its similar semantic representations are reconstructed, and reconstruction scores from a semantic reconstructor are used in conjunction with the likelihood during model training to assess the quality of the generated caption. As a result, the decoder receives improved semantic information, enhancing the caption production process. During model testing, the reconstruction score and the log-likelihood can also be combined to choose the most appropriate caption. The suggested model outperforms the state-of-the-art LSTM-A5 model for image captioning in terms of time complexity and accuracy.

翻訳日:2023-01-09 23:24:24 公開日:2023-01-06
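
The test-time selection step mentioned above, which combines the reconstruction score with the log-likelihood, can be sketched as a simple reranking. The abstract does not state how the two scores are combined; the weighted sum, the `weight` parameter, and the candidate values below are assumptions for illustration only.

```python
def select_caption(candidates, weight=1.0):
    """Pick the candidate maximizing log-likelihood + weight * reconstruction score."""
    return max(
        candidates,
        key=lambda c: c["log_likelihood"] + weight * c["reconstruction_score"],
    )


# Hypothetical candidates with made-up scores.
candidates = [
    {"caption": "a dog on grass", "log_likelihood": -2.0, "reconstruction_score": 0.5},
    {"caption": "a dog catching a frisbee", "log_likelihood": -2.5, "reconstruction_score": 1.5},
]
print(select_caption(candidates)["caption"])              # a dog catching a frisbee
print(select_caption(candidates, weight=0.0)["caption"])  # a dog on grass
```

Setting `weight` to zero falls back to choosing by likelihood alone, which makes the effect of the reconstruction term easy to see.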

# IMKGA-SM:シーケンスモデリングによる解釈可能なマルチモーダル知識グラフ回答予測

IMKGA-SM: Interpretable Multimodal Knowledge Graph Answer Prediction via Sequence Modeling ( http://arxiv.org/abs/2301.02445v1 )

ライセンス: Link先を確認

Yilin Wen, Biao Luo and Yuqian Zhao

(参考訳) マルチモーダル知識グラフリンク予測は,マルチモーダルデータに対するリンク予測タスクの精度と効率を向上させることを目的としている。しかし、複雑なマルチモーダル情報やスパーストレーニングデータの場合、ほとんどの手法では解釈可能性と高い精度を同時に達成することは困難である。そこで本稿では,この課題に対処するために,多変量知識グラフ応答予測(imkga-sm)という新しいモデルを開発した。まず,マルチモーダル微細粒度融合法を提案し,vgg16とocr(optical character recognition)技術を用いて画像や画像からテキスト情報を効果的に抽出する。次に、知識グラフリンク予測タスクをオフライン強化学習マルコフ決定モデルとしてモデル化し、統一シーケンスフレームワークに抽象化する。対話的な知覚に基づく報酬期待機構と特別な因果的マスキング機構が設計され、クエリを推論パスに`変換する。そこで,マルチモーダル最適化の問題点を軽減するために,自己回帰動的勾配調整機構を提案する。最後に、2つのデータセットが実験に採用され、一般的なSOTAベースラインが比較に使用される。その結果,開発したIMKGA-SMは,異なるサイズのマルチモーダルリンク予測データセット上でのSOTAベースラインよりもはるかに優れた性能が得られることがわかった。

Multimodal knowledge graph link prediction aims to improve the accuracy and efficiency of link prediction tasks for multimodal data. However, for complex multimodal information and sparse training data, it is usually difficult for most methods to achieve interpretability and high accuracy simultaneously. To address this difficulty, a new model is developed in this paper, namely Interpretable Multimodal Knowledge Graph Answer Prediction via Sequence Modeling (IMKGA-SM). First, a multimodal fine-grained fusion method is proposed, and VGG16 and Optical Character Recognition (OCR) techniques are adopted to effectively extract text information from images. Then, the knowledge graph link prediction task is modelled as an offline reinforcement learning Markov decision model, which is then abstracted into a unified sequence framework. An interactive perception-based reward expectation mechanism and a special causal masking mechanism are designed, which "convert" the query into an inference path. Then, an autoregressive dynamic gradient adjustment mechanism is proposed to alleviate the problem of insufficient multimodal optimization. Finally, two datasets are adopted for experiments, and popular SOTA baselines are used for comparison. The results show that the developed IMKGA-SM achieves much better performance than SOTA baselines on multimodal link prediction datasets of different sizes.

翻訳日:2023-01-09 23:23:32 公開日:2023-01-06

# 逐次依存型マルチタスク学習のためのタスク認識特徴抽出フレームワーク

Task Aware Feature Extraction Framework for Sequential Dependence Multi-Task Learning ( http://arxiv.org/abs/2301.02494v1 )

ライセンス: Link先を確認

Xuewen Tao and Mingming Ha and Xiaobo Guo and Qiongxu Ma and Hongwei Cheng and Wenfang Lin

(参考訳) マルチタスク学習(mtl)は多くの現実世界のアプリケーションでうまく実装されており、単一のモデルで複数のタスクを同時に解決することを目指している。マルチタスク学習の一般的な考え方は、全タスクのパフォーマンスを改善するために、グローバルパラメータ共有機構とタスク固有の特徴抽出器を設計することである。しかし、タスク間のシーケンシャルな依存は滅多に研究されていないが、オンラインのオンラインレコメンデーション(インプレッション、クリック、コンバージョンなど)で頻繁に発生する。この問題に関する理論的研究はほとんどなく、ほとんどのMTL手法で採用されているバイアス最適化オブジェクトはオンライン性能を劣化させる。さらに、さまざまなタスク間のトレードオフのバランスを保ち、共通および特定の表現を効果的に学習する上でも課題は残る。本稿では,まず,厳密な数学的観点から逐次依存度mtlを解析し,不偏最適化対象として依存度タスク学習損失を設計する。また,逐次依存型MLLのためのタスク認識特徴抽出(TAFE)フレームワークを提案する。オフラインデータセットとオンラインa/b実装に関する広範な実験により,提案手法の有効性が実証された。

Multi-task learning (MTL), which aims to solve multiple tasks simultaneously with a single model, has been successfully implemented in many real-world applications. The general idea of multi-task learning is to design global parameter-sharing mechanisms and task-specific feature extractors to improve the performance of all tasks. However, sequential dependence between tasks is rarely studied but frequently encountered in e-commerce online recommendation, e.g., impression, click and conversion on a displayed product. There is little theoretical work on this problem, and the biased optimization objective adopted in most MTL methods deteriorates online performance. Besides, challenges remain in balancing the trade-off between various tasks and effectively learning common and specific representations. In this paper, we first analyze sequential dependence MTL from a rigorous mathematical perspective and design a dependence task learning loss to provide an unbiased optimization objective. We then propose a Task Aware Feature Extraction (TAFE) framework for sequential dependence MTL, which makes it possible to selectively reconstruct implicit shared representations from a sample-wise view and to extract explicit task-specific information in a more efficient way. Extensive experiments on offline datasets and online A/B implementation demonstrate the effectiveness of our proposed TAFE.

翻訳日:2023-01-09 23:23:12 公開日:2023-01-06

# GNNによる乗客需要予測

GNN-based Passenger Request Prediction ( http://arxiv.org/abs/2301.02515v1 )

ライセンス: Link先を確認

Aqsa Ashraf Makhdomi and Iqra Altaf Gillani

(参考訳) 乗客の要求予測は、配車プラットフォームにおける運用計画、制御、管理に不可欠である。需要予測問題は広く研究されているが、乗客のOrigin-Destination(OD)フロー予測は研究コミュニティからはあまり注目されていない。本稿では,乗客のodフローを予測するための注意機構とともに,グラフニューラルネットワークフレームワークを開発した。提案フレームワークでは,異なる場所からの要求間で発生する線形および非線形のさまざまな依存関係を活用し,その場所の繰り返しパターンとコンテキストデータをキャプチャする。さらに、道路網を網羅し、モデルの複雑さと精度を維持するグリッドセルの最適サイズを決定する。提案手法の特徴と各種成分を明らかにするため,広範なシミュレーションを行った。その結果,提案モデルが既存のベースラインよりも優れた性能を示すことができた。

Passenger request prediction is essential for operations planning, control, and management in ride-sharing platforms. While the demand prediction problem has been studied extensively, the Origin-Destination (OD) flow prediction of passengers has received less attention from the research community. This paper develops a Graph Neural Network framework along with the Attention Mechanism to predict the OD flow of passengers. The proposed framework exploits various linear and non-linear dependencies that arise among requests originating from different locations and captures the repetition pattern and the contextual data of that place. Moreover, the optimal size of the grid cell that covers the road network and preserves the complexity and accuracy of the model is determined. Extensive simulations are conducted to examine the characteristics of our proposed approach and its various components. The results show the superior performance of our proposed model compared to the existing baselines.

翻訳日:2023-01-09 23:22:48 公開日:2023-01-06

# パールの反実的手法による反実的説明の評価

Evaluating counterfactual explanations using Pearl's counterfactual method ( http://arxiv.org/abs/2301.02499v1 )

ライセンス: Link先を確認

Bevan I. Smith

(参考訳) 対実的な説明 (CE) は、異なる望ましい結果を生み出す別のシナリオを生成する方法である。例えば、もし学生がコースを失敗すると予測された場合、反実的な説明は生徒が合格すると予測される別の方法を与えることができる。アプリケーションはたくさんあります。しかし、CEは必ずしもデータの真の因果構造を考慮していない機械学習モデルから現在生成されている。これにより、CE量にバイアスを導入することができる。本研究は,これまでのところ反事実説明(ce)文献に見られない,judea pearlの反事実計算法を用いてcesをテストするためのものである。さらに,これらのCEを3つの異なる因果構造上で評価し,その根底構造が生成するCEにどのように影響するかを示す。本研究は,pearl法を用いてcesを評価する方法を示し,(限られたサンプルサイズではあるが)cesの30%がpearl法で計算したものと矛盾していることを示した。このことは、CEを単に信頼できないことを示し、元の機械学習モデルを使ってカウンターファクトを盲目的に計算する前に、真の因果構造を知ることが不可欠であることを示している。

Counterfactual explanations (CEs) are methods for generating an alternative scenario that produces a different, desirable outcome. For example, if a student is predicted to fail a course, then counterfactual explanations can provide the student with alternative ways in which they would be predicted to pass. The applications are many. However, CEs are currently generated from machine learning models that do not necessarily take into account the true causal structure in the data. By doing this, bias can be introduced into the CE quantities. I propose in this study to test the CEs using Judea Pearl's method of computing counterfactuals, which has thus far, surprisingly, not been seen in the counterfactual explanation (CE) literature. I furthermore evaluate these CEs on three different causal structures to show how the true underlying causal structure affects the CEs that are generated. This study presents a method of evaluating CEs using Pearl's method and shows (although with a limited sample size) that thirty percent of the CEs conflicted with those computed by Pearl's method. This shows that we cannot simply trust CEs and that it is vital to know the true causal structure before we blindly compute counterfactuals using the original machine learning model.

翻訳日:2023-01-09 23:22:36 公開日:2023-01-06
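
The abstract above contrasts counterfactuals read off a predictive model with counterfactuals computed by Pearl's three-step procedure (abduction, action, prediction). The following is a minimal sketch of that contrast on a toy linear structural causal model. The model, its coefficients, the variable meanings, and the function names `pearl_counterfactual` and `model_only_counterfactual` are all hypothetical illustrations and are not taken from the paper.

```python
# Hypothetical toy structural causal model (an assumption for illustration):
#   M = 2*X + U_m          mediator, e.g. assignments completed
#   Y = 3*M + 1*X + U_y    outcome, e.g. course score
# X is the feature we intervene on; U_m and U_y are unobserved noise terms.


def pearl_counterfactual(x_obs, m_obs, y_obs, x_new):
    """Return (M, Y) in the world where X had been x_new for this individual."""
    # Step 1, abduction: recover this individual's noise terms from the evidence.
    u_m = m_obs - 2.0 * x_obs
    u_y = y_obs - 3.0 * m_obs - 1.0 * x_obs
    # Step 2, action: replace the equation for X by the constant x_new.
    # Step 3, prediction: propagate the change through the remaining equations.
    m_cf = 2.0 * x_new + u_m
    y_cf = 3.0 * m_cf + 1.0 * x_new + u_y
    return m_cf, y_cf


def model_only_counterfactual(x_obs, m_obs, y_obs, x_new):
    """Change X in the outcome equation only, holding M at its observed value.

    This mimics a counterfactual explanation that perturbs one input feature
    of a predictive model without modelling how other features respond.
    """
    u_y = y_obs - 3.0 * m_obs - 1.0 * x_obs
    return 3.0 * m_obs + 1.0 * x_new + u_y


if __name__ == "__main__":
    observed = dict(x_obs=1.0, m_obs=2.5, y_obs=8.7)
    print(pearl_counterfactual(x_new=2.0, **observed))
    print(model_only_counterfactual(x_new=2.0, **observed))
```

For the observation X=1, M=2.5, Y=8.7 and the intervention X=2, the structural computation gives an outcome of about 15.7, while the feature-perturbation shortcut gives about 9.7. The gap arises only because the mediator is allowed to respond in the first case, which is the kind of disagreement the abstract reports.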

# architect, regularize and replay (arr): 継続的学習のための柔軟なハイブリッドアプローチ

Architect, Regularize and Replay (ARR): a Flexible Hybrid Approach for Continual Learning ( http://arxiv.org/abs/2301.02464v1 )

ライセンス: Link先を確認

Vincenzo Lomonaco, Lorenzo Pellegrini, Gabriele Graffieti, Davide Maltoni

(参考訳) 近年,機械学習手法,特に深層表現学習に対する関心が高まり,基本的な仮定を克服し,様々な分布シフトやサンプル選択バイアスを受ける非定常環境への取り組みが見られた。この文脈では、アーキテクチャの優先順位、レギュラライザ、リプレイポリシーに基づくいくつかの計算アプローチが、それらが開発され評価された特定のシナリオによって異なる成功度で提案されている。しかし、柔軟かつ一般的に調整可能な効率効率性トレードオフに適用できる包括的なハイブリッドソリューションを設計することは、まだ遠い目標に思える。本稿では、古典的シナリオ(例えば、クラス増分学習)における最先端の成果を達成し、CIFAR-100、CORe50、ImageNet-1000などの実世界のデータセットから生成された任意のデータストリームに一般化できるAR1アルゴリズムとその変種をハイブリッドで一般化した「アーキテクチャ、規則化、再生」を提案する。

In recent years we have witnessed a renewed interest in machine learning methodologies, especially for deep representation learning, that could overcome basic i.i.d. assumptions and tackle non-stationary environments subject to various distributional shifts or sample selection biases. Within this context, several computational approaches based on architectural priors, regularizers and replay policies have been proposed with different degrees of success depending on the specific scenario in which they were developed and assessed. However, designing comprehensive hybrid solutions that can flexibly and generally be applied with tunable efficiency-effectiveness trade-offs still seems a distant goal. In this paper, we propose "Architect, Regularize and Replay" (ARR), a hybrid generalization of the renowned AR1 algorithm and its variants, that can achieve state-of-the-art results in classic scenarios (e.g. class-incremental learning) but also generalize to arbitrary data streams generated from real-world datasets such as CIFAR-100, CORe50 and ImageNet-1000.

翻訳日:2023-01-09 23:16:25 公開日:2023-01-06

# 大型類人猿行動のトリプルストリームディープメトリック学習

Triple-stream Deep Metric Learning of Great Ape Behavioural Actions ( http://arxiv.org/abs/2301.02642v1 )

ライセンス: Link先を確認

Otto Brookes, Majid Mirmehdi, Hjalmar Kühl, Tilo Burghardt

(参考訳) 本稿では,類人猿の行動認識のための最初のメトリック学習システムを提案する。提案手法は,DensePose-Cチンパンジーのボディー部分分割ストリームの利用により,従来のRGBの外観や光フローストリームを効果的に補完することを示す。異なる特徴融合手法と長い尾認識手法を用いてシステム変異を評価した。 PanAf-500データセットでは、9つの動作アクションに対して180,000のアノテートフレームが手作業で記述されているため、トップ1の精度が約12%向上した。さらに,本研究の結果を定性的に分析し,そのデータを用いた文献と比較して,クラス毎の平均精度が約23%向上できることを示すロングテール認識手法を用いて,メートル法学習システムを強化した。最後に、埋め込み空間はメートル法として構築されるので、新しい幾何学とトポロジーを示す巨大な猿の行動行動空間の最初のデータ駆動可視化を提供する。この研究が、絶滅危惧猿の利益のために、コンピュータビジョンのこの重要な応用分野へのさらなる関心を喚起することを願っている。

We propose the first metric learning system for the recognition of great ape behavioural actions. Our proposed triple stream embedding architecture works on camera trap videos taken directly in the wild and demonstrates that the utilisation of an explicit DensePose-C chimpanzee body part segmentation stream effectively complements traditional RGB appearance and optical flow streams. We evaluate system variants with different feature fusion techniques and long-tail recognition approaches. Results and ablations show performance improvements of ~12% in top-1 accuracy over previous results achieved on the PanAf-500 dataset containing 180,000 manually annotated frames across nine behavioural actions. Furthermore, we provide a qualitative analysis of our findings and augment the metric learning system with long-tail recognition techniques showing that average per class accuracy -- critical in the domain -- can be improved by ~23% compared to the literature on that dataset. Finally, since our embedding spaces are constructed as metric, we provide first data-driven visualisations of the great ape behavioural action spaces revealing emerging geometry and topology. We hope that the work sparks further interest in this vital application area of computer vision for the benefit of endangered great apes.

翻訳日:2023-01-09 23:15:24 公開日:2023-01-06

# tarvis: ターゲットベースのビデオセグメンテーションのための統一アプローチ

TarViS: A Unified Approach for Target-based Video Segmentation ( http://arxiv.org/abs/2301.02657v1 )

ライセンス: Link先を確認

Ali Athar, Alexander Hermans, Jonathon Luiten, Deva Ramanan, Bastian Leibe

(参考訳) ビデオセグメンテーションの一般的なドメインは、現在複数のベンチマークにまたがる異なるタスクに断片化されている。最先端技術の急速な進歩にもかかわらず、現在の手法は圧倒的にタスク固有であり、概念的には他のタスクに一般化できない。マルチタスク機能を備えた最近のアプローチにインスパイアされたTarViSは、ビデオ内の任意に定義された「ターゲット」の集合をセグメント化する必要のあるタスクに適用可能な、新しく統一されたネットワークアーキテクチャである。我々のアプローチは、タスクがこれらのターゲットをどのように定義するかに関して柔軟であり、後者を抽象的な「クエリ」としてモデル化し、ピクセル精度の高いターゲットマスクを予測するのに使用される。単一のTarViSモデルは、異なるタスクにまたがるデータセットのコレクションを共同でトレーニングすることができ、タスク固有のリトレーニングなしで、推論中にタスク間のホットスワップを行うことができる。有効性を示すために,ビデオインスタンスセグメンテーション(VIS),ビデオパノプティクスセグメンテーション(VPS),ビデオオブジェクトセグメンテーション(VOS),ポイントインテンプラ誘導トラッキング(PET)の4つのタスクにTarViSを適用した。これら4つのタスクにまたがる5/7ベンチマークの最先端性能と,残りの2つのタスクの競合性能を実現する。

The general domain of video segmentation is currently fragmented into different tasks spanning multiple benchmarks. Despite rapid progress in the state-of-the-art, current methods are overwhelmingly task-specific and cannot conceptually generalize to other tasks. Inspired by recent approaches with multi-task capability, we propose TarViS: a novel, unified network architecture that can be applied to any task that requires segmenting a set of arbitrarily defined 'targets' in video. Our approach is flexible with respect to how tasks define these targets, since it models the latter as abstract 'queries' which are then used to predict pixel-precise target masks. A single TarViS model can be trained jointly on a collection of datasets spanning different tasks, and can hot-swap between tasks during inference without any task-specific retraining. To demonstrate its effectiveness, we apply TarViS to four different tasks, namely Video Instance Segmentation (VIS), Video Panoptic Segmentation (VPS), Video Object Segmentation (VOS) and Point Exemplar-guided Tracking (PET). Our unified, jointly trained model achieves state-of-the-art performance on 5/7 benchmarks spanning these four tasks, and competitive performance on the remaining two.

翻訳日:2023-01-09 23:15:09 公開日:2023-01-06

# 深層学習駆動サルエント領域における有効なp値

Valid P-Value for Deep Learning-Driven Salient Region ( http://arxiv.org/abs/2301.02437v1 )

ライセンス: Link先を確認

Daiki Miwa, Vo Nguyen Le Duy, Ichiro Takeuchi

(参考訳) 深層学習モデルの予測を解釈し,説明するために,様々なサリエンシマップ手法が提案されている。精度マップにより、入力信号のどの部分が予測結果に強い影響を与えるかを解釈できる。しかし、深層学習モデルにおける複雑な計算によってサリエンシマップが得られるため、サリエンシマップ自体の信頼性を知ることはしばしば困難である。そこで本研究では,p値の形で有意な領域の信頼性を定量化する手法を提案する。本研究は,訓練された深層学習モデルによって選択された仮説としてサルエント領域を考察し,選択推論フレームワークを採用することを目的とする。提案手法は,有意な領域の偽陽性検出の確率を確実に制御できる。提案手法の有効性を,合成データセットと実データセットの数値例を用いて示す。さらに,提案するCNNに対して,実装コストを伴わずに選択推論を行うKerasベースのフレームワークを開発した。

Various saliency map methods have been proposed to interpret and explain predictions of deep learning models. Saliency maps allow us to interpret which parts of the input signals have a strong influence on the prediction results. However, since a saliency map is obtained by complex computations in deep learning models, it is often difficult to know how reliable the saliency map itself is. In this study, we propose a method to quantify the reliability of a salient region in the form of p-values. Our idea is to consider a salient region as a selected hypothesis by the trained deep learning model and employ the selective inference framework. The proposed method can provably control the probability of false positive detections of salient regions. We demonstrate the validity of the proposed method through numerical examples in synthetic and real datasets. Furthermore, we develop a Keras-based framework for conducting the proposed selective inference for a wide class of CNNs without additional implementation cost.

翻訳日:2023-01-09 23:14:43 公開日:2023-01-06

# "No, to the Right" -- 共有自律性によるロボット操作のためのオンライン言語修正

"No, to the Right" -- Online Language Corrections for Robotic Manipulation via Shared Autonomy ( http://arxiv.org/abs/2301.02555v1 )

ライセンス: Link先を確認

Yuchen Cui and Siddharth Karamcheti and Raj Palleti and Nidhya Shivakumar and Percy Liang and Dorsa Sadigh

(参考訳) 言語誘導型ロボットインタラクションのためのシステムは、適応性と学習効率の2つの重要なデシダータを満たす必要がある。残念ながら、既存のインストラクションフォローエージェントは適応できず、オンライン自然言語監視を組み込む能力が欠如している。本研究では,自然言語の修正を取り入れ,適応するためのフレームワークであるLanguage-Informed Latent Actions with Corrections (LILAC) を,実行中に「右へ」あるいは「右へ」あるいは「右へ」に向けて提示することで,これらの問題に対処する。我々は共有自律性パラダイムの中でリッチ操作ドメインを探求する。言語は、人間がロボットをガイドするために使用できる有意義で低次元の制御空間を生成する学習されたモデルへの入力です。それぞれのリアルタイム補正は、人間のコントロール空間を洗練し、正確で拡張された動作を可能にします。我々は,Franka Emika Pandaマニピュレータを用いて複雑な操作作業を行うユーザスタディを通じて,我々のアプローチを評価する。オープンループ指導とシングルターン共有自律の両方を対象とする既存の学習ベースラインと比較して,我々の修正認識アプローチはタスク完了率が高く,信頼性,正確性,使いやすさから,ユーザによって主観的に好まれることを示す。

Systems for language-guided human-robot interaction must satisfy two key desiderata for broad adoption: adaptivity and learning efficiency. Unfortunately, existing instruction-following agents cannot adapt, lacking the ability to incorporate online natural language supervision, and even if they could, require hundreds of demonstrations to learn even simple policies. In this work, we address these problems by presenting Language-Informed Latent Actions with Corrections (LILAC), a framework for incorporating and adapting to natural language corrections - "to the right," or "no, towards the book" - online, during execution. We explore rich manipulation domains within a shared autonomy paradigm. Instead of discrete turn-taking between a human and robot, LILAC splits agency between the human and robot: language is an input to a learned model that produces a meaningful, low-dimensional control space that the human can use to guide the robot. Each real-time correction refines the human's control space, enabling precise, extended behaviors - with the added benefit of requiring only a handful of demonstrations to learn. We evaluate our approach via a user study where users work with a Franka Emika Panda manipulator to complete complex manipulation tasks. Compared to existing learned baselines covering both open-loop instruction following and single-turn shared autonomy, we show that our corrections-aware approach obtains higher task completion rates, and is subjectively preferred by users because of its reliability, precision, and ease of use.

翻訳日:2023-01-09 23:14:27 公開日:2023-01-06

# フィードバックゲーテッド整流線形単位

Feedback-Gated Rectified Linear Units ( http://arxiv.org/abs/2301.02610v1 )

ライセンス: Link先を確認

Marco Kemmerling

(参考訳) フィードバック接続は人間の脳において重要な役割を果たすが、ニューラルネットワーク研究ではあまり注目されていない。ここでは、整流線形ユニットをゲートする、生物学に着想を得たフィードバック機構を提案する。MNISTデータセットでは、フィードバックのあるオートエンコーダは、フィードバックのないものに比べて、より高速な収束、パフォーマンスの向上、ノイズに対する堅牢性を示している。CIFAR-10データセットにフィードバックのあるネットワークを適用した場合にも、顕著さや一貫性は劣るものの、いくつかの利点が観察できる。

Feedback connections play a prominent role in the human brain but have not received much attention in artificial neural network research. Here, a biologically inspired feedback mechanism which gates rectified linear units is proposed. On the MNIST dataset, autoencoders with feedback show faster convergence, better performance, and more robustness to noise compared to their counterparts without feedback. Some benefits, although less pronounced and less consistent, can be observed when networks with feedback are applied on the CIFAR-10 dataset.

翻訳日:2023-01-09 23:13:59 公開日:2023-01-06

# TWR-MCAE: 壁面レーダーによる人体動作認識のためのデータ拡張手法

TWR-MCAE: A Data Augmentation Method for Through-the-Wall Radar Human Motion Recognition ( http://arxiv.org/abs/2301.02488v1 )

ライセンス: Link先を確認

Weicheng Gao, Xiaopeng Yang, Xiaodong Qu, Tian Lan

(参考訳) 壁面減衰,マルチパス効果,システム干渉による壁面レーダ(twr)の人間動作の精度低下と収束時間の延長という課題を解決するため,マルチリンク自動符号化ニューラルネットワーク(twr-mcae)データ拡張法を提案する。特に、TWR-MCAEアルゴリズムは、特異値分解(SVD)ベースのデータ前処理モジュール、改良された座標注意モジュール、圧縮検出可能な反復収縮しきい値再構成アルゴリズム(LISTA)モジュール、適応重みモジュールで共同構築される。データ前処理モジュールは、壁クラッタ、人の動き特徴、ノイズサブスペース分離を実現する。改良された座標注意モジュールは、クラッタおよびノイズ抑制を実現する。 LISTAモジュールはヒトの運動特徴増強を実現する。適応重み加群は重みを学び、3つの部分空間を融合する。 TWR-MCAEは壁クラッタの低ランク特性を抑制でき、同時に人の動きの空間特性を高めることができる。分類ステップの前にリンクすることで、他の事前知識を追加したり、より多くのデータを再収集することなく、特徴抽出能力を改善することができる。実験により,提案手法はピーク信号対雑音比(psnr)が向上し,認識精度が向上し,バックエンド分類器の学習プロセスを高速化することを示した。

To solve the problems of reduced accuracy and prolonged convergence time in through-the-wall radar (TWR) human motion recognition caused by wall attenuation, multipath effects, and system interference, we propose a multilink auto-encoding neural network (TWR-MCAE) data augmentation method. Specifically, the TWR-MCAE algorithm is jointly constructed by a singular value decomposition (SVD)-based data preprocessing module, an improved coordinate attention module, a compressed sensing learnable iterative shrinkage threshold reconstruction algorithm (LISTA) module, and an adaptive weight module. The data preprocessing module separates the wall clutter, human motion feature, and noise subspaces. The improved coordinate attention module achieves clutter and noise suppression. The LISTA module achieves human motion feature enhancement. The adaptive weight module learns the weights and fuses the three subspaces. The TWR-MCAE can suppress the low-rank characteristics of wall clutter and enhance the sparsity characteristics of human motion at the same time. It can be linked before the classification step to improve the feature extraction capability without adding other prior knowledge or recollecting more data. Experiments show that the proposed algorithm achieves a better peak signal-to-noise ratio (PSNR), which increases the recognition accuracy and speeds up the training process of the back-end classifiers.

翻訳日:2023-01-09 23:13:51 公開日:2023-01-06
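
The abstract above relies on wall clutter being low-rank, so that an SVD-based preprocessing step can separate it from human motion. The sketch below illustrates only that general idea under strong simplifying assumptions: it treats clutter as the single dominant singular component of a range-by-pulse matrix and removes it. It is not the TWR-MCAE preprocessing module. The function name `remove_dominant_component`, the all-ones starting vector, and the toy data are hypothetical, and power iteration is used here only to keep the example dependency-free (in practice a library SVD routine would be the usual choice).

```python
def remove_dominant_component(matrix, iters=50):
    """Subtract the dominant singular component (sigma * u * v^T) from matrix.

    matrix is a list of rows (range bins) by columns (slow-time pulses).
    The dominant singular triplet is estimated by power iteration, starting
    from an all-ones vector (an assumption; it must not be orthogonal to the
    dominant right singular vector, and the matrix must be non-zero).
    """
    rows, cols = len(matrix), len(matrix[0])
    v = [1.0] * cols
    sigma, u = 0.0, [0.0] * rows
    for _ in range(iters):
        # u <- normalize(M v)
        u = [sum(matrix[i][j] * v[j] for j in range(cols)) for i in range(rows)]
        norm_u = sum(x * x for x in u) ** 0.5
        u = [x / norm_u for x in u]
        # v <- normalize(M^T u); the norm before normalizing estimates sigma
        v = [sum(matrix[i][j] * u[i] for i in range(rows)) for j in range(cols)]
        sigma = sum(x * x for x in v) ** 0.5
        v = [x / sigma for x in v]
    # Residual after removing the rank-1 clutter estimate.
    return [[matrix[i][j] - sigma * u[i] * v[j] for j in range(cols)]
            for i in range(rows)]


if __name__ == "__main__":
    # Toy data: a static wall profile repeated over 6 pulses, plus a weak
    # alternating return in range bin 2 standing in for a moving target.
    wall_profile = [5.0, 3.0, 0.0, 1.0]
    target = [0.4, -0.4, 0.4, -0.4, 0.4, -0.4]
    data = [[wall_profile[i] + (target[j] if i == 2 else 0.0)
             for j in range(6)] for i in range(4)]
    for row in remove_dominant_component(data):
        print([round(x, 3) for x in row])
```

In this toy case the static profile is exactly rank one and orthogonal to the target return, so the residual keeps only the alternating values in range bin 2. Real radar data will not separate this cleanly, which is presumably why the paper adds attention, LISTA, and adaptive weighting on top of the decomposition.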

# semantic match: ヘルスケアのためのxaiのデバッギング機能帰属メソッド

Semantic match: Debugging feature attribution methods in XAI for healthcare ( http://arxiv.org/abs/2301.02080v2 )

ライセンス: Link先を確認

Giovanni Cinà, Tabea E. Röber, Rob Goedhart, Ş. İlker Birbil

(参考訳) 最近、医療用の認証人工知能(AI)ツールが急増し、この技術の採用に関する議論が再燃している。このような議論の1つのスレッドは、説明可能なAI(XAI)と、AIデバイスをより透明で信頼性の高いものにすることの約束に関するものだ。医療AI分野で活動している一部の声は、説明可能なAI技術、特に特徴帰属手法の信頼性に関する懸念を表明し、その使用とガイドラインや標準への含意を疑問視している。画像データに固有の問題を一般化することにより, 保温後の局所的説明可能性に関する既存の批判は, 浴水で赤ちゃんを投げ捨てるものである, と論じる。まず、その問題を説明と人間の理解のセマンティックマッチの欠如として特徴づける。機能の重要度がいつ確実に使用できるのかを理解するため、低レベルと高レベルの機能の重要度を区別する。 EHR(Electronic Health Records)のような表層データのような,低レベルの機能に明確なセマンティクスが付与されたデータタイプに対しては,セマンティクスマッチングが実現可能であるため,機能属性手法を有意義かつ有用な方法で使用することが可能である,と論じる。

The recent spike in certified Artificial Intelligence (AI) tools for healthcare has renewed the debate around adoption of this technology. One thread of such debate concerns Explainable AI (XAI) and its promise to render AI devices more transparent and trustworthy. A few voices active in the medical AI space have expressed concerns on the reliability of Explainable AI techniques and especially feature attribution methods, questioning their use and inclusion in guidelines and standards. Despite valid concerns, we argue that existing criticism on the viability of post-hoc local explainability methods throws away the baby with the bathwater by generalizing a problem that is specific to image data. We begin by characterizing the problem as a lack of semantic match between explanations and human understanding. To understand when feature importance can be used reliably, we introduce a distinction between feature importance of low- and high-level features. We argue that for data types where low-level features come endowed with a clear semantics, such as tabular data like Electronic Health Records (EHRs), semantic match can be obtained, and thus feature attribution methods can still be employed in a meaningful and useful way.

翻訳日:2023-01-09 23:06:15 公開日:2023-01-06

# 未知のユーティリティ関数によるネットワークユーティリティの最大化: 分散データ駆動バイレベル最適化アプローチ

Network Utility Maximization with Unknown Utility Functions: A Distributed, Data-Driven Bilevel Optimization Approach ( http://arxiv.org/abs/2301.01801v2 )

ライセンス: Link先を確認

Kaiyi Ji and Lei Ying

(参考訳) 公平なリソース割り当ては、通信ネットワークにおける最も重要なトピックの1つである。既存のソリューションはほとんどの場合、各ユーザユーティリティ関数が知られて凹凸であると仮定する。本稿では,ユーティリティ機能が未知である場合,ユーザに対してどのようにリソースを割り当てるか,という問いに答える。この答えは、ユーザユーティリティが複雑でクローズドフォームが入手困難である、次世代のAI対応通信ネットワークにおいてますます重要になっている。本稿では,分散およびデータ駆動の双方向最適化手法を用いて,分散ネットワークユーティリティ最大化(NUM)アルゴリズムと,データ駆動学習アルゴリズムを用いて,真のネットワークユーティリティの総和を最大化するための最良サロゲートユーティリティ関数を求める。提案アルゴリズムは、データサンプル(ユーティリティ値または勾配値)から学習し、サロゲートユーティリティ関数を自動チューニングして真のネットワークユーティリティを最大化するので、未知のユーティリティ関数で機能する。一般ネットワークでは,提案アルゴリズムの非凹凸汎関数による非漸近収束率を定式化する。提案手法の有効性を実世界のネットワークで検証し,提案手法の有効性を検証した。

Fair resource allocation is one of the most important topics in communication networks. Existing solutions almost exclusively assume that each user's utility function is known and concave. This paper seeks to answer the following question: how to allocate resources when utility functions are unknown, even to the users? The answer has become increasingly important in next-generation AI-aware communication networks, where user utilities are complex and their closed forms are hard to obtain. In this paper, we provide a new solution using a distributed and data-driven bilevel optimization approach, where the lower level is a distributed network utility maximization (NUM) algorithm with concave surrogate utility functions, and the upper level is a data-driven learning algorithm to find the best surrogate utility functions that maximize the sum of true network utility. The proposed algorithm learns from data samples (utility values or gradient values) to autotune the surrogate utility functions to maximize the true network utility, so it works for unknown utility functions. For the general network, we establish the nonasymptotic convergence rate of the proposed algorithm with nonconcave utility functions. The simulations validate our theoretical results and demonstrate the great effectiveness of the proposed method in a real-world network.

翻訳日:2023-01-09 23:05:52 公開日:2023-01-06
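
The lower level described in the abstract above is a distributed NUM algorithm with concave surrogate utilities. As a rough illustration of what such a lower level can look like, the sketch below runs the textbook dual-decomposition scheme with weighted-log surrogates: each user picks a rate from the prices on its route, and each link adjusts its price from its own load. The choice of log utilities, the step size, and the names `solve_num`, `weights`, `routes`, and `capacities` are assumptions made for this example. The paper's upper-level learning of the surrogate is not shown.

```python
def solve_num(weights, routes, capacities, step=0.01, iters=2000):
    """Maximize sum_i weights[i] * log(x_i) subject to link capacities.

    routes[i] lists the link indices used by user i (each route non-empty).
    Returns (rates, prices) after a fixed number of dual gradient steps.
    """
    prices = [1.0] * len(capacities)
    rates = [0.0] * len(weights)
    for _ in range(iters):
        # User update: with a log utility, the best response to the total
        # route price p is x = w / p. Each user needs only its own prices.
        rates = [w / sum(prices[l] for l in route)
                 for w, route in zip(weights, routes)]
        # Link update: raise the price when overloaded, lower it otherwise.
        # Each link needs only its own load. Prices are kept positive.
        for l, cap in enumerate(capacities):
            load = sum(r for r, route in zip(rates, routes) if l in route)
            prices[l] = max(1e-9, prices[l] + step * (load - cap))
    return rates, prices


if __name__ == "__main__":
    # Two users share one link of capacity 10 with weights 1 and 3.
    rates, prices = solve_num(weights=[1.0, 3.0], routes=[[0], [0]],
                              capacities=[10.0])
    print(rates, prices)
```

With a single shared link, the weighted-log objective splits capacity in proportion to the weights, so the rates settle near 2.5 and 7.5 with a link price near 0.4. In the bilevel setting of the abstract, the weights (or a richer surrogate parameterization) would be what the upper level tunes from observed utility samples.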

# 質問応答としての感情因果対抽出

Emotion-Cause Pair Extraction as Question Answering ( http://arxiv.org/abs/2301.01982v2 )

ライセンス: Link先を確認

Huu-Hiep Nguyen and Minh-Tien Nguyen

(参考訳) Emotion-Cause Pair extract (ECPE) のタスクは、感情や原因節のアノテーションなしで、文書の潜在的な感情のペアを抽出することを目的としている。従来のECPEのアプローチでは、複雑なアーキテクチャを用いて感情による相互作用をモデル化し、従来の2段階処理方式を改良しようと試みてきた。本稿では,質問応答(QA)問題にECPEタスクを投入し,それに取り組むための単純かつ効果的なBERTベースのソリューションを提案する。文書が与えられた場合、ガイド-QAモデルはまず、固定された質問を用いて最適な感情節を予測する。次に、予測された感情は、感情の最も潜在的な原因を予測する質問として使用される。我々は,標準ECPEコーパスでモデルを評価する。実験の結果, 単純性にもかかわらず, 有望な結果が得られ, 容易に再現できることが示唆された。 Guided-QAのコードも提供される。

The task of Emotion-Cause Pair Extraction (ECPE) aims to extract all potential emotion-cause pairs of a document without any annotation of emotion or cause clauses. Previous approaches to ECPE have tried to improve conventional two-step processing schemes by using complex architectures for modeling emotion-cause interaction. In this paper, we cast the ECPE task as a question answering (QA) problem and propose simple yet effective BERT-based solutions to tackle it. Given a document, our Guided-QA model first predicts the best emotion clause using a fixed question. Then the predicted emotion is used as a question to predict the most likely cause for the emotion. We evaluate our model on a standard ECPE corpus. The experimental results show that despite its simplicity, our Guided-QA achieves promising results and is easy to reproduce. The code of Guided-QA is also provided.

翻訳日:2023-01-09 23:05:34 公開日:2023-01-06

# 映像言語課題のための学習軌跡単語アライメント

Learning Trajectory-Word Alignments for Video-Language Tasks ( http://arxiv.org/abs/2301.01953v2 )

ライセンス: Link先を確認

Xu Yang, Zhangzikang Li, Haiyang Xu, Hanwang Zhang, Qinghao Ye, Chenliang Li, Ming Yan, Yu Zhang, Fei Huang, Songfang Huang

(参考訳) Image-Language BERT (IL-BERT) と Video-Language BERT (VDL-BERT) では、言葉でオブジェクトを調整することが重要な役割を果たす。オブジェクトがいくつかの空間的パッチをカバーしている場合とは異なり、ビデオ内のオブジェクトは通常、オブジェクトの軌道として現れる、すなわち、いくつかの空間的だがより長い時間的パッチにまたがるので、豊富な時空間的コンテキストを含む。しかしながら、現代のVDL-BERTは、通常、パッチ・トゥ・ワード(P2W)の注意を配置するためにIL-BERTに従うというこの軌跡を無視する一方、そのような注意は、自明な空間的コンテキストを過度に露出し、時間的文脈を無視する。そこで本稿では,ビデオ言語タスクを解くためのトラジェクティブ・ワードアライメントを学習するための新しいTW-BERTを提案する。このようなアライメントは、新しく設計されたT2Wの注意によって学習される。また,従来のVDL-BERTを追従して,モーダルエンコーダにワード・トゥ・パッチ(W2P)の注意を設定する。 T2WとW2Pの注意は多様であるため、我々のクロスモーダルエンコーダは非対称である。この非対称なクロスモーダルエンコーダが堅牢な視覚言語アソシエーションを構築するのに役立ち、ビデオやテキストエンコーダによって計算された埋め込み空間を閉じるための粒度の "align-before-fuse" 戦略を提案する。提案した戦略とT2Wの注目により、我々のTW-BERTは、テキストからビデオまでの検索タスクにおけるSOTAパフォーマンスと、より多くのデータで訓練されたVDL-BERTを用いたビデオ質問応答タスクにおける同等のパフォーマンスを達成する。コードは補足資料で入手できます。

Aligning objects with words plays a critical role in Image-Language BERT (IL-BERT) and Video-Language BERT (VDL-BERT). Different from the image case where an object covers some spatial patches, an object in a video usually appears as an object trajectory, i.e., it spans a few spatial but longer temporal patches and thus contains abundant spatiotemporal contexts. However, modern VDL-BERTs neglect this trajectory characteristic: they usually follow IL-BERTs in deploying patch-to-word (P2W) attention, which may over-exploit trivial spatial contexts and neglect significant temporal contexts. To amend this, we propose a novel TW-BERT to learn Trajectory-Word alignment for solving video-language tasks. Such alignment is learned by a newly designed trajectory-to-word (T2W) attention. Besides T2W attention, we also follow previous VDL-BERTs to set a word-to-patch (W2P) attention in the cross-modal encoder. Since T2W and W2P attentions have diverse structures, our cross-modal encoder is asymmetric. To further help this asymmetric cross-modal encoder build robust vision-language associations, we propose a fine-grained "align-before-fuse" strategy to pull close the embedding spaces calculated by the video and text encoders. With the proposed strategy and T2W attention, our TW-BERT achieves SOTA performance on text-to-video retrieval tasks, and performance comparable on video question answering tasks to some VDL-BERTs trained on much more data. The code will be available in the supplementary material.

翻訳日:2023-01-09 23:05:09 公開日:2023-01-06

# ハイパーパラメータ最適化による自律レースシステムのデータ駆動モデル同定

Data-Driven Model Identification via Hyperparameter Optimization for Autonomous Racing Systems ( http://arxiv.org/abs/2301.01470v2 )

ライセンス: Link先を確認

Hyunki Seong, Chanyoung Chung, and David Hyunchul Shim

(参考訳) 本稿では,ハイパーパラメータ最適化(MIHO)を用いたモデル同定手法を提案する。提案手法は,データ駆動最適化方式で動的モデルのパラメータを同定する効率的な探索探索戦略を採用する。フルスケールの自動運転車であるAV-21のモデルパラメータ同定にMIHOを利用する。次に、モデルベースの計画・制御システムの設計に最適化されたパラメータを組み込む。実験では、学習されたパラメトリックモデルは与えられたデータセットの適合性を示し、目に見えない動的シナリオにおける一般化能力を示す。さらに、モデルベースシステムを検証するために、広範囲なフィールドテストを実施します。テストの結果,インディアナポリス・モーター・スピードウェイとラスベガス・モーター・スピードウェイで,学習したモデル力学を活用し,障害物回避と200km/h以上の高速走行に成功した。 MIHOのソースコードとテストのビデオはhttps://github.com/hynkis/MIHOで公開されている。

In this letter, we propose a model identification method via hyperparameter optimization (MIHO). Our method adopts an efficient explore-exploit strategy to identify the parameters of dynamic models in a data-driven optimization manner. We utilize MIHO for model parameter identification of the AV-21, a full-scale autonomous race vehicle. We then incorporate the optimized parameters into the design of the model-based planning and control systems of our platform. In experiments, the learned parametric models fit the given datasets well and generalize to unseen dynamic scenarios. We further conduct extensive field tests to validate our model-based system. The tests show that our race systems leverage the learned model dynamics and successfully perform obstacle avoidance and high-speed driving over 200 km/h at the Indianapolis Motor Speedway and Las Vegas Motor Speedway. The source code for MIHO and videos of the tests are available at https://github.com/hynkis/MIHO.

翻訳日:2023-01-09 23:04:31 公開日:2023-01-06

# アンチスケーシングによる量子-光子相互作用の動的強化

Dynamically enhancing qubit-photon interactions with anti-squeezing ( http://arxiv.org/abs/2212.04991v1 )

ライセンス: Link先を確認

M. Villiers, W. C. Smith, A. Petrescu, A. Borgognoni, M. Delbecq, A. Sarlette, M. Mirrahimi, P. Campagne-Ibarcq, T. Kontos and Z. Leghtas

(参考訳) 発振器と量子ビットとの相互作用強度は、発振器の真空場変動とともに増大する。良く知られた縮退パラメトリック発振器は、その固有状態が押しつぶされたフォック状態である強い復調スクイージング状態への関心を復活させた。これらの増幅場ゆらぎにより、この振動子を絞ることで量子ビット-光子相互作用を動的に促進することが最近提案されている。超伝導回路実験において、スクイージングの5.5dBにおいて、キュービットと発振器の分散相互作用の2倍の増大を観測し、キュービット-光子相互作用のその場動的制御を示す。この研究は、励起された光子の振動子と量子ビットとの実験的カップリングを開始し、強化された相互作用を求める実験プラットフォームでの拡散を慎重に動機付ける。

The interaction strength of an oscillator to a qubit grows with the oscillator's vacuum field fluctuations. The well known degenerate parametric oscillator has revived interest in the regime of strongly detuned squeezing, where its eigenstates are squeezed Fock states. Owing to these amplified field fluctuations, it was recently proposed that squeezing this oscillator would dynamically boost qubit-photon interactions. In a superconducting circuit experiment, we observe a two-fold increase in the dispersive interaction between a qubit and an oscillator at 5.5 dB of squeezing, demonstrating in-situ dynamical control of qubit-photon interactions. This work initiates the experimental coupling of oscillators of squeezed photons to qubits, and cautiously motivates their dissemination in experimental platforms seeking enhanced interactions.

翻訳日:2023-01-09 17:30:45 公開日:2023-01-06

# 高振幅状態の連続可変量子トモグラフィ

Continuous-variable quantum tomography of high-amplitude states ( http://arxiv.org/abs/2212.07406v1 )

ライセンス: Link先を確認

Ekaterina Fedotova, Nikolai Kuznetsov, Egor Tiunov, A. E. Ulanov and A. I. Lvovsky

(参考訳) 量子状態トモグラフィーは現代の量子技術の重要な構成要素である。電磁場のような連続可変高調波オシレータシステムに適用する場合、既存のトモグラフィー法は一般に離散基底で状態を再構成し、したがって比較的低い振幅とエネルギーを持つ状態に制限される。そこで,この制限を克服するために,フィードフォワードニューラルネットワークを用いて,密度行列を直接連続位置で取得する。このアプローチの重要な利点は、詳細な再構築のために位相空間内の特定の領域を選択できることです。これにより、状態振幅による再構成に必要なリソース量のスケーリングが比較的遅くなり、その結果、本手法でアクセス可能な振幅の範囲を劇的に増加させることができる。

Quantum state tomography is an essential component of modern quantum technology. In application to continuous-variable harmonic-oscilator systems, such as the electromagnetic field, existing tomography methods typically reconstruct the state in discrete bases, and are hence limited to states with relatively low amplitudes and energies. Here we overcome this limitation by utilizing a feed-forward neural network to obtain the density matrix directly in the continuous position basis. An important benefit of our approach is the ability to choose specific regions in the phase space for detailed reconstruction. This results in relatively slow scaling of the amount of resources required for the reconstruction with the state amplitude, and hence allows us to dramatically increase the range of amplitudes accessible with our method.

翻訳日:2023-01-09 14:31:05 公開日:2023-01-06

PDF登録状況（公開日: 20230106）