Fugu-MT: Translations of arXiv Papers

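This site provides Japanese translations of arXiv papers that are 30 pages or fewer and released under a Creative Commons license (CC 0, CC BY, CC BY-SA). For papers whose body text is not CC-licensed, or that are too long, only the metadata is translated. (arXiv metadata is CC 0.) The translations are licensed under CC BY-SA 4.0. Translation is performed with Fugu-Machine Translator.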

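The operator of this site does not guarantee the quality of this site (including all information and translations) and accepts no responsibility for any consequences arising from the use of this site (including all information and translations).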

Papers listed with a publication date of 20230514.

Title	Authors	Abstract	Publication date / Translation date
# RoseMatcher: Identifying the Impact of User Reviews on App Updates ( http://arxiv.org/abs/2210.10223v4 ) License: see linked page	Tianyang Liu, Chong Wang, Kun Huang, Peng Liang, Beiqi Zhang, Maya Daneva, Marten van Sinderen	$\textbf{Context}$: The release planning of mobile apps has become an area of active research, with most studies centering on app analysis through release notes in the Apple App Store and tracking user reviews via issue trackers. However, the correlation between these release notes and user reviews in the App Store remains understudied. $\textbf{Objective}$: In this paper, we introduce $\textit{RoseMatcher}$, a novel automatic approach to match relevant user reviews with app release notes and identify matched pairs with high confidence. $\textbf{Methods}$: We collected 944 release notes and 1,046,862 user reviews from 5 mobile apps in the Apple App Store as research data to evaluate the effectiveness and accuracy of $\textit{RoseMatcher}$, and conducted a deep content analysis on matched pairs.
$\textbf{Results}$: Our evaluation shows that $\textit{RoseMatcher}$ can reach a hit ratio of 0.718 for identifying relevant matched pairs, and with the manual labeling and content analysis of 984 relevant pairs, we identify 8 roles that user reviews play in app updates according to the relationship between release notes and user reviews in the relevant matched pairs. $\textbf{Conclusions}$: Our findings indicate that both app development teams and users pay close attention to release notes and user reviews, with release notes typically addressing feature requests, bug reports, and complaints, and user reviews offering positive, negative, and constructive feedback. Overall, the study highlights the importance of the communication between app development teams and users in the release planning of mobile apps, with relevant reviews tending to be posted within a short period before and after the release of release notes, and with the average time interval between the post time of release notes and user reviews being approximately one year.	Translated: 2023-10-24 14:15:54 Published: 2023-05-14
# CLawK: Monitoring Business Processes in Smart Contracts ( http://arxiv.org/abs/2305.08254v1 ) License: see linked page	Mojtaba Eshghie, Wolfgang Ahrendt, Cyrille Artho, Thomas Troels Hildebrandt, Gerardo Schneider	Smart contracts embody complex business processes that can be difficult to analyze statically. In this paper, we present CLawK, a runtime monitoring tool that leverages business process specifications written in DCR graphs to provide runtime verification of smart contract execution. We demonstrate how CLawK can detect and flag deviations from specified behaviors in smart contracts deployed in the Ethereum network without code instrumentation or any additional gas costs.	Translated: 2023-10-24 08:55:36 Published: 2023-05-14
# Multi-View Interactive Collaborative Filtering ( http://arxiv.org/abs/2305.18306v1 ) License: see linked page	Maria Lentini, Umashanger Thayasivam	In many scenarios, recommender system user interaction data such as clicks or ratings is sparse, and item turnover rates (e.g., new articles, job postings) are high. Given this, the integration of contextual "side" information in addition to user-item ratings is highly desirable. Whilst there are algorithms that can handle both rating and contextual data simultaneously, these algorithms are typically limited to making only in-sample recommendations, suffer from the curse of dimensionality, and do not incorporate multi-armed bandit (MAB) policies for long-term cumulative reward optimization. We propose multi-view interactive topic regression (MV-ICTR), a novel partially online latent factor recommender algorithm that incorporates both rating and contextual information to model item-specific feature dependencies and users' personal preferences simultaneously, with multi-armed bandit policies for continued online personalization. The result is significantly increased performance on datasets with high percentages of cold-start users and items.	Translated: 2023-06-04 11:40:06 Published: 2023-05-14
# Graph Neural Networks-Based User Pairing in Wireless Communication Systems ( http://arxiv.org/abs/2306.00717v1 ) License: see linked page	Sharan Mourya, Pavan Reddy, SaiDhiraj Amuru, Kiran Kumar Kuchi	Recently, deep neural networks have emerged as a solution to solve NP-hard wireless resource allocation problems in real-time. However, multi-layer perceptron (MLP) and convolutional neural network (CNN) structures, which are inherited from image processing tasks, are not optimized for wireless network problems. As network size increases, these methods get harder to train and generalize. User pairing is one such essential NP-hard optimization problem in wireless communication systems that entails selecting users to be scheduled together while minimizing interference and maximizing throughput. In this paper, we propose an unsupervised graph neural network (GNN) approach to efficiently solve the user pairing problem. Our proposed method utilizes the Erdos goes neural pipeline to significantly outperform other scheduling methods such as k-means and semi-orthogonal user scheduling (SUS). At 20 dB SNR, our proposed approach achieves a 49% better sum rate than k-means and a staggering 95% better sum rate than SUS while consuming minimal time and resources.
The scalability of the proposed method is also explored, as our model can handle dynamic changes in network size without experiencing a substantial decrease in performance. Moreover, our model can accomplish this without being explicitly trained for larger or smaller networks, facilitating a dynamic functionality that cannot be achieved using CNNs or MLPs.	Translated: 2023-06-04 11:01:19 Published: 2023-05-14
# Hyper-automation-The next peripheral for automation in IT industries ( http://arxiv.org/abs/2305.11896v1 ) License: see linked page	Ayush Singh Rajput, Richa Gupta	The extension of legacy business process automation beyond the bounds of specific processes is known as hyperautomation. Hyperautomation provides automation for nearly any repetitive action performed by business users by combining AI tools with RPA. It automates complex IT business processes that a company's top brains might not be able to complete. This is an end-to-end automation of a standard business process deployment. It enables automation to perform task digitalization by combining a brain computer interface (BCI) with AI and RPA automation tools. BCI, in conjunction with automation tools, will advance the detection and generation of automation processes to the next level. It allows enterprises to combine business intelligence systems, address complex requirements, and enhance human expertise and automation experience. Hyperautomation and its importance in today's environment are briefly discussed in this paper. The article then goes on to discuss how BCI and sensors might aid Hyperautomation.
The specific sectors of solicitations were examined using a variety of flexible technologies associated with this concept, as well as dedicated workflow techniques, which are also diagrammatically illustrated. Hyperautomation is being utilized to dramatically improve the efficiency, accuracy, and human enhancement of automated tasks. It incorporates a number of automated tools in its discovery, implementation, and automation phases. As a result, it is well-suited to integrating cutting-edge technologies and experimenting with new methods of working. Keywords- Hyperautomation, Brain computer Interface (BCI), Technology, Used case, Sensors, Industries.	Translated: 2023-05-28 05:31:43 Published: 2023-05-14
# Optimizing Forest Fire Prevention: Intelligent Scheduling Algorithms for Drone-Based Surveillance System ( http://arxiv.org/abs/2305.10444v1 ) License: see linked page	Mahdi Jemmali, Loai Kayed B.Melhim, Wadii Boulila, Hajer Amdouni, Mafawez T. Alharbi	Given the importance of forests and their role in maintaining the ecological balance, which directly affects the planet, the climate, and the life on this planet, this research presents the problem of forest fire monitoring using drones. The forest monitoring process is performed continuously to track any changes in the monitored region within the forest. During fires, data captured by drones is used to increase the follow-up speed and enhance the control process of these fires to prevent their spread. The time factor in such problems determines the success rate of the fire extinguishing process, as appropriate data at the right time may be the decisive factor in controlling fires, preventing their spread, extinguishing them, and limiting their losses. Therefore, this research presents the problem of monitoring task scheduling for drones in the forest monitoring system. This problem is solved by developing several algorithms with the aim of minimizing the total completion time required to carry out all the drones' assigned tasks. System performance is measured by using 990 instances of three different classes.
The experimental results indicated the effectiveness of the proposed algorithms and their ability to act efficiently to achieve the desired goal. The algorithm $RID$ achieved the best performance with a percentage rate of up to 90.3% with a time of 0.088 seconds.	Translated: 2023-05-19 19:08:11 Published: 2023-05-14
# SuperDriverAI: Towards Design and Implementation for End-to-End Learning-based Autonomous Driving ( http://arxiv.org/abs/2305.10443v1 ) License: see linked page	Shunsuke Aoki, Issei Yamamoto, Daiki Shiotsuka, Yuichi Inoue, Kento Tokuhiro, and Keita Miwa	Fully autonomous driving has been widely studied and is becoming increasingly feasible. However, such autonomous driving has yet to be achieved on public roads, because of various uncertainties due to surrounding human drivers and pedestrians. In this paper, we present an end-to-end learning-based autonomous driving system named SuperDriver AI, where Deep Neural Networks (DNNs) learn the driving actions and policies from experienced human drivers and determine the driving maneuvers to take while guaranteeing road safety. In addition, to improve robustness and interpretability, we present a slit model and a visual attention module. We build a data-collection system and emulator with real-world hardware, and we also test the SuperDriver AI system with real-world driving scenarios. Finally, we have collected 150 runs for one driving scenario in Tokyo, Japan, and have demonstrated SuperDriver AI with a real-world vehicle.	Translated: 2023-05-19 19:07:51 Published: 2023-05-14
# Smart Home Energy Management: VAE-GAN synthetic dataset generator and Q-learning ( http://arxiv.org/abs/2305.08885v1 ) License: see linked page	Mina Razghandi, Hao Zhou, Melike Erol-Kantarci, and Damla Turgut	Recent years have seen an increasing interest among academia and industry towards analyzing the electrical consumption of residential buildings and employing smart home energy management systems (HEMS) to reduce household energy consumption and costs. HEMS has been developed to simulate the statistical and functional properties of actual smart grids. Access to publicly available datasets is a major challenge in this type of research. The potential of artificial HEMS applications will be further enhanced with the development of time series that represent different operating conditions of the synthetic systems. In this paper, we propose a novel variational auto-encoder-generative adversarial network (VAE-GAN) technique for generating time-series data on energy consumption in smart homes. We also explore how the generative model performs when combined with a Q-learning-based HEMS. We tested the online performance of Q-learning-based HEMS with real-world smart home data.
To test the generated dataset, we measure the Kullback-Leibler (KL) divergence, maximum mean discrepancy (MMD), and the Wasserstein distance between the probability distributions of the real and synthetic data. Our experiments show that VAE-GAN-generated synthetic data closely matches the real data distribution. Finally, we show that the generated data allows for the training of a higher-performance Q-learning-based HEMS compared to datasets generated with baseline approaches.	Translated: 2023-05-17 17:40:18 Published: 2023-05-14
# Watermarking Text Generated by Black-Box Language Models ( http://arxiv.org/abs/2305.08883v1 ) License: see linked page	Xi Yang, Kejiang Chen, Weiming Zhang, Chang Liu, Yuang Qi, Jie Zhang, Han Fang, Nenghai Yu	LLMs now exhibit human-like skills in various fields, leading to worries about misuse. Thus, detecting generated text is crucial. However, passive detection methods are stuck in domain specificity and limited adversarial robustness. To achieve reliable detection, a watermark-based method was proposed for white-box LLMs, allowing them to embed watermarks during text generation. The method involves randomly dividing the model vocabulary to obtain a special list and adjusting the probability distribution to promote the selection of words in the list. A detection algorithm aware of the list can identify the watermarked text. However, this method is not applicable in many real-world scenarios where only black-box language models are available.
For instance, third parties that develop API-based vertical applications cannot watermark text themselves because API providers only supply generated text and withhold probability distributions to shield their commercial interests. To allow third parties to autonomously inject watermarks into generated text, we develop a watermarking framework for black-box language model usage scenarios. Specifically, we first define a binary encoding function to compute a random binary encoding corresponding to a word. The encodings computed for non-watermarked text conform to a Bernoulli distribution, wherein the probability of a word representing bit-1 is approximately 0.5. To inject a watermark, we alter the distribution by selectively replacing words representing bit-0 with context-based synonyms that represent bit-1. A statistical test is then used to identify the watermark. Experiments demonstrate the effectiveness of our method on both Chinese and English datasets. Furthermore, results under re-translation, polishing, word deletion, and synonym substitution attacks reveal that it is arduous to remove the watermark without compromising the original semantics.	Translated: 2023-05-17 17:39:58 Published: 2023-05-14
# Calculating Pose with Vanishing Points of Visual-Sphere Perspective Model ( http://arxiv.org/abs/2004.08933v4 ) License: see linked page	Jakub Maksymilian Fober	The goal of the proposed method is to directly obtain a pose matrix of a known rectangular target, without estimation, using geometric techniques. This method is specifically tailored for real-time, extreme imaging setups exceeding 180° field of view, such as a fish-eye camera view. The introduced algorithm employs geometric algebra to determine the pose for a pair of coplanar parallel lines (ideally a tangent pair as in a rectangle). This is achieved by computing vanishing points on a visual unit sphere, which correspond to pose matrix vectors. The algorithm can determine pose for an extremely distorted view source without prior rectification, owing to a visual-sphere perspective model mapping of view coordinates. Mapping can be performed using either a perspective map lookup or a parametric universal perspective distortion model, which is also presented in this paper. The outcome is a robust pose matrix computation that can be executed on an embedded system using a microcontroller, offering high accuracy and low latency. This method can be further extended to a cubic target setup for comprehensive camera calibration. It may also prove valuable in other applications requiring low latency and extreme viewing angles.	Translated: 2023-05-17 02:16:48 Published: 2023-05-14
# Label-Assemble: Leveraging Multiple Datasets with Partial Labels ( http://arxiv.org/abs/2109.12265v4 ) License: see linked page	Mintong Kang, Bowen Li, Zengle Zhu, Yongyi Lu, Elliot K. Fishman, Alan L. Yuille, Zongwei Zhou	The success of deep learning relies heavily on large labeled datasets, but we often only have access to several small datasets associated with partial labels. To address this problem, we propose a new initiative, "Label-Assemble", that aims to unleash the full potential of partial labels from an assembly of public datasets. We discovered that learning from negative examples facilitates both computer-aided disease diagnosis and detection. This discovery will be particularly crucial in novel disease diagnosis, where positive examples are hard to collect, yet negative examples are relatively easier to assemble. For example, assembling existing labels from NIH ChestX-ray14 (available since 2017) significantly improves the accuracy of COVID-19 diagnosis from 96.3% to 99.3%. In addition to diagnosis, assembling labels can also improve disease detection, e.g., the detection of pancreatic ductal adenocarcinoma (PDAC) can greatly benefit from leveraging the labels of Cysts and PanNets (two other types of pancreatic abnormalities), increasing sensitivity from 52.1% to 84.0% while maintaining a high specificity of 98.0%.	Translated: 2023-05-17 01:41:50 Published: 2023-05-14
# TrAISformer-A generative transformer for AIS trajectory prediction ( http://arxiv.org/abs/2109.03958v3 ) License: see linked page	Duong Nguyen and Ronan Fablet	The prediction of vessel positions at a specified point in the future is a fundamental aspect of many maritime applications. While Automatic Identification System (AIS) provides a rich source of information to enable this task, vessel trajectory forecasting using AIS data remains formidably challenging, even for modern machine learning/deep learning, because of the complexity and multimodality inherent in motion data. In this paper, we address these challenges by introducing a novel discrete, high-dimensional representation of AIS data and a new loss function to explicitly account for heterogeneity and multimodality. The proposed model, referred to as TrAISformer, is a modified transformer network that extracts long-term correlations of AIS trajectories in the proposed enriched space to forecast the positions of vessels several hours into the future. We report experimental results on publicly available, real AIS data. TrAISformer significantly outperforms state-of-the-art methods, with an average prediction performance below 10 nautical miles up to ~10 hours.	Translated: 2023-05-17 01:41:30 Published: 2023-05-14
# Detecting Errors and Estimating Accuracy on Unlabeled Data with Self-training Ensembles ( http://arxiv.org/abs/2106.15728v4 ) License: see linked page	Jiefeng Chen, Frederick Liu, Besim Avci, Xi Wu, Yingyu Liang, Somesh Jha	When a deep learning model is deployed in the wild, it can encounter test data drawn from distributions different from the training data distribution and suffer a drop in performance. For safe deployment, it is essential to estimate the accuracy of the pre-trained model on the test data. However, the labels for the test inputs are usually not immediately available in practice, and obtaining them can be expensive. This observation leads to two challenging tasks: (1) unsupervised accuracy estimation, which aims to estimate the accuracy of a pre-trained classifier on a set of unlabeled test inputs; (2) error detection, which aims to identify mis-classified test inputs. In this paper, we propose a principled and practically effective framework that simultaneously addresses the two tasks. The proposed framework iteratively learns an ensemble of models to identify mis-classified data points and performs self-training to improve the ensemble with the identified points.
Theoretical analysis demonstrates that our framework enjoys provable guarantees for both accuracy estimation and error detection under mild conditions readily satisfied by practical deep learning models. Along with the framework, we proposed and experimented with two instantiations and achieved state-of-the-art results on 59 tasks. For example, on iWildCam, one instantiation reduces the estimation error for unsupervised accuracy estimation by at least 70% and improves the F1 score for error detection by at least 4.7% compared to existing methods.	Translated: 2023-05-17 01:39:38 Published: 2023-05-14
# CD-GAN: a robust fusion-based generative adversarial network for unsupervised remote sensing change detection with heterogeneous sensors ( http://arxiv.org/abs/2203.00948v3 ) License: see linked page	Jin-Ju Wang, Nicolas Dobigeon, Marie Chabert, Ding-Cheng Wang, Ting-Zhu Huang and Jie Huang	In the context of Earth observation, the detection of changes is performed from multitemporal images acquired by sensors with possibly different spatial and/or spectral resolutions or even different modalities (e.g. optical, radar). Even limiting to the optical modality, this task has proved to be challenging as soon as the sensors have different spatial and/or spectral resolutions. This paper proposes a novel unsupervised change detection method dedicated to images acquired with such so-called heterogeneous optical sensors. This method capitalizes on recent advances which frame the change detection problem into a robust fusion framework. More precisely, we show that a deep adversarial network designed and trained beforehand to fuse a pair of multiband optical images can be easily complemented by a network with the same architecture to perform change detection. The resulting overall architecture itself follows an adversarial strategy where the fusion network and the additional network are interpreted as essential building blocks of a generator. A comparison with state-of-the-art change detection methods demonstrates the versatility and the effectiveness of the proposed approach.	Translated: 2023-05-17 01:32:34 Published: 2023-05-14
# セマンティクスセグメンテーションのためのピラミッド融合トランスフォーマ Pyramid Fusion Transformer for Semantic Segmentation ( http://arxiv.org/abs/2201.04019v3 ) ライセンス: Link先を確認	Zipeng Qin, Jianbo Liu, Xiaolin Zhang, Maoqing Tian, Aojun Zhou, Shuai Yi, Shaoan Qi, Hongsheng Li	(参考訳) 最近提案されたmaskformerは、セマンティックセグメンテーションのタスクに関する新しい視点を提供している。本質的には、カテゴリセグメントに対応するペア確率とマスクを生成し、セグメンテーションマップの推論中にそれらを組み合わせます。本研究では,シングルスケール機能上のマスク分類デコーダは,信頼性の高い確率やマスクを抽出できるほど有効ではないことを見出した。特徴ピラミッド全体にわたって豊富な意味情報を求めるため,マルチスケール特徴を持つマスク・アプローチ・セマンティクスセグメンテーションのためのトランスフォーマーベースのピラミッド融合トランスフォーマ (pft) を提案する。提案するトランスフォーマーデコーダは,学習可能なクエリと特徴ピラミッドからのそれぞれの空間特徴との相互接続を並列に行い,補足情報交換にクロススケールクエリ間注意を使用する。広く使われている3つのセマンティックセグメンテーションデータセット上での競合性能を実現する。特にADE20Kの検証セットでは、Swin-Bのバックボーンはシングルスケールとマルチスケールの両方でMaskFormerのバックボーンよりも大きく、それぞれ54.1 mIoUと55.7 mIoUを達成した。 Swin-Lのバックボーンを使用して、単一スケールの56.1 mIoUとマルチスケールの57.4 mIoUを達成し、データセット上で最先端のパフォーマンスを得る。 3つの広く使われているセマンティックセグメンテーションデータセットの大規模な実験により,提案手法の有効性が検証された。 The recently proposed MaskFormer gives a refreshed perspective on the task of semantic segmentation: it shifts from the popular pixel-level classification paradigm to a mask-level classification method. In essence, it generates paired probabilities and masks corresponding to category segments and combines them during inference for the segmentation maps. In our study, we find that per-mask classification decoder on top of a single-scale feature is not effective enough to extract reliable probability or mask. To mine for rich semantic information across the feature pyramid, we propose a transformer-based Pyramid Fusion Transformer (PFT) for per-mask approach semantic segmentation with multi-scale features. The proposed transformer decoder performs cross-attention between the learnable queries and each spatial feature from the feature pyramid in parallel and uses cross-scale inter-query attention to exchange complimentary information. We achieve competitive performance on three widely used semantic segmentation datasets. 
In particular, on the ADE20K validation set, our result with a Swin-B backbone surpasses that of MaskFormer with a much larger Swin-L backbone in both single-scale and multi-scale inference, achieving 54.1 mIoU and 55.7 mIoU respectively. Using a Swin-L backbone, we achieve single-scale 56.1 mIoU and multi-scale 57.4 mIoU, obtaining state-of-the-art performance on the dataset. Extensive experiments on three widely used semantic segmentation datasets verify the effectiveness of our proposed method.	翻訳日:2023-05-17 01:31:13 公開日:2023-05-14
# 逆ベイズ分類器の存在について(拡張版) On the Existence of the Adversarial Bayes Classifier (Extended Version) ( http://arxiv.org/abs/2112.01694v3 ) ライセンス: Link先を確認	Pranjal Awasthi, Natalie S. Frank, Mehryar Mohri	(参考訳) 敵対的堅牢性は、現代の機械学習アプリケーションにおいて重要な特性である。近年のいくつかの理論的研究の対象となっているが、敵の強靭性に関する重要な疑問がまだ数多く残っている。本研究では,ベイズ最適性に関する基本的問題について考察する。ベイズ最適分類器の存在を敵の強靭性に対して保証できるような、一般的な十分条件を提供する。この結果は, 敵の強靭性とその整合性におけるサロゲート損失の研究に有用である。この写本は、NeurIPS 2021 で出版された論文 \emph{On the Existence of the Adversarial Bayes Classifier} の拡張と修正版である。元々の論文では定理ステートメントに2つの誤りがあった。1つは疑似証明可能ロバスト性の定義であり、もう1つは任意の距離空間に対して $a^\e$ の可測性の定義である。このバージョンではエラーを修正します。さらに、原論文の結果は、いくつかの非制限凸ノルムには適用されず、ここでは、結果を全ての可能なノルムにまで拡張する。 Adversarial robustness is a critical property in a variety of modern machine learning applications. While it has been the subject of several recent theoretical studies, many important questions related to adversarial robustness are still open. In this work, we study a fundamental question regarding Bayes optimality for adversarial robustness. We provide general sufficient conditions under which the existence of a Bayes optimal classifier can be guaranteed for adversarial robustness. Our results can provide a useful tool for a subsequent study of surrogate losses in adversarial robustness and their consistency properties. This manuscript is the extended and corrected version of the paper \emph{On the Existence of the Adversarial Bayes Classifier} published in NeurIPS 2021. There were two errors in theorem statements in the original paper -- one in the definition of pseudo-certifiable robustness and the other in the measurability of $A^\e$ for arbitrary metric spaces. In this version we correct the errors. Furthermore, the results of the original paper did not apply to some non-strictly convex norms and here we extend our results to all possible norms.	翻訳日:2023-05-17 01:30:45 公開日:2023-05-14
# AxoNN: 極大規模ディープラーニングのための非同期メッセージ駆動並列フレームワーク AxoNN: An asynchronous, message-driven parallel framework for extreme-scale deep learning ( http://arxiv.org/abs/2110.13005v5 ) ライセンス: Link先を確認	Siddharth Singh, Abhinav Bhatele	(参考訳) ここ数年、最先端のニューラルネットワークをトレーニングするためのメモリ要件は、現代のハードウェアアクセラレーターのDRAM能力を大きく超えてきた。これにより、大規模なGPUベースのクラスタ上でこれらのニューラルネットワークを並列にトレーニングする効率的なアルゴリズムの開発が必要になった。現代のgpuでは計算コストは比較的安価であるため、並列トレーニングアルゴリズムにおける極めて効率的な通信の設計と実装は、最大性能の抽出に不可欠である。並列ディープラーニングフレームワークであるAxoNNは、非同期およびメッセージ駆動実行を利用して、各GPU上でのニューラルネットワーク操作をスケジュールし、GPUアイドル時間を短縮し、ハードウェア効率を最大化する。トレーニング中に定期的にデータをオフロードするスクラッチスペースとしてCPUメモリを使用することで、AxoNNはGPUメモリ使用量を4倍削減することができる。これにより、GPUあたりのパラメータ数を4倍に増やすことができ、通信量と性能を13%以上向上させることができる。 48-384 NVIDIA Tesla V100 GPU上で12-1000億のパラメータを持つ大きなトランスフォーマーモデルに対してテストすると、AxoNNは理論ピークの49.4-54.78%のGPU当たりのスループットを達成し、最先端と比較して22-37日(15-25%のスピードアップ)のトレーニング時間を短縮する。 In the last few years, the memory requirements to train state-of-the-art neural networks have far exceeded the DRAM capacities of modern hardware accelerators. This has necessitated the development of efficient algorithms to train these neural networks in parallel on large-scale GPU-based clusters. Since computation is relatively inexpensive on modern GPUs, designing and implementing extremely efficient communication in these parallel training algorithms is critical for extracting the maximum performance. This paper presents AxoNN, a parallel deep learning framework that exploits asynchrony and message-driven execution to schedule neural network operations on each GPU, thereby reducing GPU idle time and maximizing hardware efficiency. By using the CPU memory as a scratch space for offloading data periodically during training, AxoNN is able to reduce GPU memory consumption by four times. This allows us to increase the number of parameters per GPU by four times, thus reducing the amount of communication and increasing performance by over 13%. 
When tested against large transformer models with 12-100 billion parameters on 48-384 NVIDIA Tesla V100 GPUs, AxoNN achieves a per-GPU throughput of 49.4-54.78% of theoretical peak and reduces the training time by 22-37 days (15-25% speedup) as compared to the state-of-the-art.	翻訳日:2023-05-17 01:30:04 公開日:2023-05-14
# Twitter会話スレッドのヘイトインテンシティ予測 Predicting Hate Intensity of Twitter Conversation Threads ( http://arxiv.org/abs/2206.08406v4 ) ライセンス: Link先を確認	Qing Meng and Tharun Suresh, Roy Ka-Wei Lee, Tanmoy Chakraborty	(参考訳) ツイートは、オンラインのソーシャルメディアにおける最も簡潔なコミュニケーション形態であり、一つのツイートが会話の会話を作り、破壊する可能性を秘めている。オンラインヘイトスピーチはかつてないほどアクセスしやすく、その拡散を抑制することは、ソーシャルメディア企業やユーザーにとって、コンジェニアルコミュニケーションにとって最も重要である。最近の少数の研究は、ツイートスレッド/コンテキストに関わらず、個々のツイートを分類することに重点を置いている。ヘイトスピーチを抑制する古典的なアプローチの1つは、ヘイトスピーチの投稿後にリアクティブ戦略を採用することである。ポストのファクト戦略は、ヘイトスピーチを自力で扇動する可能性を示さない微妙なポストを無視する結果となり、ポストの回答で続く議論に終止符を打つ可能性がある。本稿では,将来,ツイートが応答チェーンを通じてもたらす憎悪の強さを予測することを目的としたDRAGNET++を提案する。ツイートスレッドのセマンティックな構造と伝播構造を利用して、続く各ツイートにおけるヘイト強度の低下につながるコンテキスト情報を最大化する。Anti-Racismには、米国の政治や新型コロナウイルス(COVID-19)背景における人種差別的発言に関するソーシャルメディア談話の返信ツイート、新型コロナウイルス(COVID-19)のパンデミック中の4000万ツイートのデータセット、新型コロナウイルス(COVID-19)のパンデミック時の反Asian行動に基づくTwitterデータセットが含まれる。キュレートされたデータセットはすべて、ツイートスレッドの構造グラフ情報で構成されている。 DRAGNET++は最先端のすべてのベースラインを大幅に上回ることを示す。これは、Pearson相関係数の11%のマージンで最高のベースラインを上回り、他の2つのデータセットで同様のパフォーマンスを持つAnti-RacismデータセットのRMSEでは25%低下する。 Tweets are the most concise form of communication in online social media, wherein a single tweet has the potential to make or break the discourse of the conversation. Online hate speech is more accessible than ever, and stifling its propagation is of utmost importance for social media companies and users for congenial communication. Most of the research, barring a recent few, has focused on classifying an individual tweet regardless of the tweet thread/context leading up to that point. One of the classical approaches to curb hate speech is to adopt a reactive strategy after the hate speech has been posted. The ex-post facto strategy results in neglecting subtle posts that do not show the potential to instigate hate speech on their own but may portend it in the subsequent discussion ensuing in the post's replies. In this paper, we propose DRAGNET++, which aims to predict the intensity of hatred that a tweet can bring in through its reply chain in the future. 
It uses the semantic and propagating structure of the tweet threads to maximize the contextual information leading up to and the fall of hate intensity at each subsequent tweet. We explore three publicly available Twitter datasets -- Anti-Racism contains the reply tweets of a collection of social media discourse on racist remarks during US political and COVID-19 background; Anti-Social presents a dataset of 40 million tweets amidst the COVID-19 pandemic on anti-social behaviours; and Anti-Asian presents Twitter datasets collated based on anti-Asian behaviours during the COVID-19 pandemic. All the curated datasets consist of structural graph information of the tweet threads. We show that DRAGNET++ outperforms all the state-of-the-art baselines significantly. It beats the best baseline by an 11% margin on the Pearson correlation coefficient and achieves a 25% decrease in RMSE on the Anti-Racism dataset, with similar performance on the other two datasets.	翻訳日:2023-05-17 01:22:48 公開日:2023-05-14
# 一般化された量子シュタインの補題の証明のギャップとその量子資源の可逆性への帰結について On a gap in the proof of the generalised quantum Stein's lemma and its consequences for the reversibility of quantum resources ( http://arxiv.org/abs/2205.02813v3 ) ライセンス: Link先を確認	Mario Berta, Fernando G. S. L. Brand\~ao, Gilad Gour, Ludovico Lami, Martin B. Plenio, Bartosz Regula, Marco Tomamichel	(参考訳) 一般化された量子シュタインの補題 [Brand\~ao & Plenio, Commun] の証明を示す。数学 Phys 295, 791 (2010)] は、Lemma III.9 に至る議論のギャップのために正しくない。したがって、Brand\~ao & Plenioの達成可能性の主な成果は分かっていない。これは文学におけるいくつかの確立された結果、特に量子エンタングルメントの可逆性 [brand\~ao & plenio, commun] に疑問を呈する。数学 Phys 295, 829 (2010), Nat。 Phys 4, 873 (2008) および一般的な量子資源 [Brand\~ao & Gour, Phys. Rev. Lett. 115, 070503 (2015)] の漸近的資源非発生操作。提案手法では,新たな未解決結果の変種を他の手法を用いて復元する可能性について論じる。 We show that the proof of the generalised quantum Stein's lemma [Brand\~ao & Plenio, Commun. Math. Phys. 295, 791 (2010)] is not correct due to a gap in the argument leading to Lemma III.9. Hence, the main achievability result of Brand\~ao & Plenio is not known to hold. This puts into question a number of established results in the literature, in particular the reversibility of quantum entanglement [Brand\~ao & Plenio, Commun. Math. Phys. 295, 829 (2010); Nat. Phys. 4, 873 (2008)] and of general quantum resources [Brand\~ao & Gour, Phys. Rev. Lett. 115, 070503 (2015)] under asymptotically resource non-generating operations. We discuss potential ways to recover variants of the newly unsettled results using other approaches.	翻訳日:2023-05-17 01:21:36 公開日:2023-05-14
# 非ブール行列に対するIhara-Bass式とランダムCSPの強い反発 A Ihara-Bass Formula for Non-Boolean Matrices and Strong Refutations of Random CSPs ( http://arxiv.org/abs/2204.10881v2 ) ライセンス: Link先を確認	Tommaso d'Orsi, Luca Trevisan	(参考訳) 我々は、任意の対称行列に付随する ‘non-backtracking' 行列の新たな概念を定義し、それに対する ``ihara-bass'' 型式を証明する。この理論を用いて,制約当たり$k$変数 (k-csps) を持つ無作為制約満足度問題の多項式時間強い反論を証明した。代入分数$p$で満たされる制約で構築されたランダムk-CSPインスタンスに対して、もしインスタンスに$n$変数と$n^{k/2} / \epsilon^2$制約があるなら、最適値が少なくとも$p+O_k(\epsilon)$制約分で満足する証明書を効率的に計算できる。以前は$k$でも知られていたが、奇数$k$の場合、同じ結論を達成するために$n^{k/2} (\log n)^{O(1)} / \epsilon^2$ランダムな制約が必要であった。改善は多対数に過ぎませんが、この種の結果に対する大きな障壁を克服します。現在のアプローチに基づく強い反発の結果は、k-CSPインスタンスに関連するある行列が準ランダムであることの証明を構築する。そのような証明は、ファイゲ=オフェック型の引数、グロタンディークの不等式の適用、あるいはトレース引数で得られるスペクトル境界から得られる。最初の2つのアプローチでは、制約の数が$o(n^{\lceil k/2 \rceil})$であり、3番目のアプローチは、制約の数が$o(n^{k/2} \sqrt{\log n})$であるときに機能しないユニオン境界を必要とする。さらに,制約がランダムな半ランダム設定において,$k$-CSP インスタンスに対して $n^{k/2} / \epsilon^2$ 制約を付与する新たな PTAS 探索手法を提案する。 We define a novel notion of ``non-backtracking'' matrix associated to any symmetric matrix, and we prove a ``Ihara-Bass'' type formula for it. We use this theory to prove new results on polynomial-time strong refutations of random constraint satisfaction problems with $k$ variables per constraints (k-CSPs). For a random k-CSP instance constructed out of a constraint that is satisfied by a $p$ fraction of assignments, if the instance contains $n$ variables and $n^{k/2} / \epsilon^2$ constraints, we can efficiently compute a certificate that the optimum satisfies at most a $p+O_k(\epsilon)$ fraction of constraints. Previously, this was known for even $k$, but for odd $k$ one needed $n^{k/2} (\log n)^{O(1)} / \epsilon^2$ random constraints to achieve the same conclusion. Although the improvement is only polylogarithmic, it overcomes a significant barrier to these types of results. Strong refutation results based on current approaches construct a certificate that a certain matrix associated to the k-CSP instance is quasirandom. 
Such a certificate can come from a Feige-Ofek type argument, from an application of Grothendieck's inequality, or from a spectral bound obtained with a trace argument. The first two approaches require a union bound that cannot work when the number of constraints is $o(n^{\lceil k/2 \rceil})$ and the third one cannot work when the number of constraints is $o(n^{k/2} \sqrt{\log n})$. We further apply our techniques to obtain a new PTAS that finds assignments for $k$-CSP instances with $n^{k/2} / \epsilon^2$ constraints in the semi-random setting where the constraints are random, but the sign patterns are adversarial.	翻訳日:2023-05-17 01:21:07 公開日:2023-05-14
# 局所データ制限付きニューラルネットワークを用いた知的空間補間に基づく凍上予測手法 Intelligent Spatial Interpolation-based Frost Prediction Methodology using Artificial Neural Networks with Limited Local Data ( http://arxiv.org/abs/2204.08465v2 ) ライセンス: Link先を確認	Ian Zhou, Justin Lipman, Mehran Abolhasan and Negin Shariati	(参考訳) フロストの気象現象は農業に大きな脅威をもたらす。最近のフロスト予測は現場の履歴データとセンサーに基づいており、新しいサイトでのデータ収集には追加開発と展開時間が必要である。本論文の目的は,現場の履歴データと凍害予測のためのセンサへの依存を解消することである。本稿では,空間補間に基づく凍害予測手法を提案する。これらのモデルは、既存の気象観測所の気候データ、デジタル標高モデルサーベイ、および正規化差植生指数データを用いて、目標地点の次の1時間最低気温を推定する。提案手法は,モデルの精度を高めるためにアンサンブル学習を用いる。気候データセットは、ニューサウスウェールズ州とオーストラリアの首都圏の75の気象観測所から得られる。その結果,提案手法は検出率92.55%に達することがわかった。 The weather phenomenon of frost poses great threats to agriculture. As recent frost prediction methods are based on on-site historical data and sensors, extra development and deployment time are required for data collection in any new site. The aim of this article is to eliminate the dependency on on-site historical data and sensors for frost prediction methods. In this article, a frost prediction method based on spatial interpolation is proposed. The models use climate data from existing weather stations, digital elevation models surveys, and normalized difference vegetation index data to estimate a target site's next hour minimum temperature. The proposed method utilizes ensemble learning to increase the model accuracy. Climate datasets are obtained from 75 weather stations across New South Wales and Australian Capital Territory areas of Australia. The results show that the proposed method reached a detection rate up to 92.55%.	翻訳日:2023-05-17 01:20:10 公開日:2023-05-14
# 実ヒルベルト空間における正写像と絡み合い Positive maps and entanglement in real Hilbert spaces ( http://arxiv.org/abs/2207.02510v2 ) ライセンス: Link先を確認	Giulio Chiribella, Kenneth R. Davidson, Vern I. Paulsen and Mizanur Rahaman	(参考訳) 正写像の理論は作用素代数や関数解析において中心的な役割を果たし、量子情報科学において無数の応用がある。この理論はもともと複素ヒルベルト空間上で作用する作用素のために開発され、実ヒルベルト空間上の変種についてはほとんど知られていない。本稿では、実数体上の全行列代数に作用する正の写像について研究し、複素数体に対する多くの基本的な違いを指摘し、量子情報におけるそれらの意味について論じる。我々は、実写像が正の複素化を受け入れる必要十分条件を提供し、正の写像の存在と非正の複素化の存在と、実ヒルベルト空間の量子力学において絡み合っている混合状態の存在を結びつけるが、複素バージョンでは分離可能であり、写像と状態の両方に明確な例を提供する。最後に、エンタングルメント破れと PPT 写像について議論し、PPT-二乗予想の単純実版が次元 2 においても偽であることを示す。それでも、元の PPT-二乗予想は実写像に対して異なる予想を示し、PPT特性は部分転置(IPT)の下での不変性の強い性質に置き換えられることを示す。 IPT特性を仮定すると、予想の漸近バージョンが証明される。 The theory of positive maps plays a central role in operator algebras and functional analysis, and has countless applications in quantum information science. The theory was originally developed for operators acting on complex Hilbert spaces, and little is known about its variant on real Hilbert spaces. In this article we study positive maps acting on a full matrix algebra over the reals, pointing out a number of fundamental differences with the complex case and discussing their implications in quantum information. We provide a necessary and sufficient condition for a real map to admit a positive complexification, and connect the existence of positive maps with non-positive complexification with the existence of mixed states that are entangled in real Hilbert space quantum mechanics, but separable in the complex version, providing explicit examples both for the maps and for the states. Finally, we discuss entanglement breaking and PPT maps, and we show that a straightforward real version of the PPT-squared conjecture is false even in dimension 2. Nevertheless, we show that the original PPT-squared conjecture implies a different conjecture for real maps, in which the PPT property is replaced by a stronger property of invariance under partial transposition (IPT). 
When the IPT property is assumed, we prove an asymptotic version of the conjecture.	翻訳日:2023-05-17 01:12:47 公開日:2023-05-14
# 次世代衛星ネットワークのための人工知能技術 Artificial Intelligence Techniques for Next-Generation Mega Satellite Networks ( http://arxiv.org/abs/2207.00414v2 ) ライセンス: Link先を確認	Bassel Al Homssi, Kosta Dakic, Ke Wang, Tansu Alpcan, Ben Allen, Sithamparanathan Kandeepan, Akram Al-Hourani, and Walid Saad	(参考訳) 宇宙通信、特にメガ衛星ネットワークは、宇宙打ち上げ、エレクトロニクス、処理能力、小型化の大きな進歩により、次世代ネットワークの魅力ある候補として再燃した。しかし、メガ衛星ネットワークは、軌道速度、衛星間リンク、短距離通過、衛星フットプリントなどのダイナミックでユニークな特徴のために、従来のモデルでは真に捉えられない多くの基盤的および相互接続のプロセスに依存している。したがって、ネットワークがリンク内で急速に変化する条件に積極的に適応できるように、新しいアプローチが必要である。人工知能(AI)は、これらのプロセスを捕捉し、その振る舞いを分析し、ネットワーク上での効果をモデル化する経路を提供する。本稿では,統合衛星ネットワーク,特にメガ衛星ネットワーク通信におけるai技術の適用について紹介する。メガ衛星ネットワークのユニークな特徴と、現在の通信インフラへの統合に伴う全体的な課題を詳述している。さらに、このアーティクルは、コミュニケーションリンクのさまざまなレイヤにわたる最先端のAI技術に関する洞察を提供する。これは、高度にダイナミックな無線チャネルの予測、スペクトル検出と分類、信号検出と復調、衛星間リンクと衛星アクセスネットワークの最適化、ネットワークセキュリティのためのaiの適用を含む。さらに,今後のパラダイムと,それらの機構の実用ネットワークへのマッピングについて概説する。 Space communications, particularly mega satellite networks, re-emerged as an appealing candidate for next generation networks due to major advances in space launching, electronics, processing power, and miniaturization. However, mega satellite networks rely on numerous underlying and intertwined processes that cannot be truly captured using conventionally used models, due to their dynamic and unique features such as orbital speed, inter-satellite links, short time pass, and satellite footprint, among others. Hence, new approaches are needed to enable the network to proactively adjust to the rapidly varying conditions associated within the link. Artificial intelligence (AI) provides a pathway to capture these processes, analyze their behavior, and model their effect on the network. This article introduces the application of AI techniques for integrated terrestrial satellite networks, particularly mega satellite network communications. 
It details the unique features of mega satellite networks, and the overarching challenges concomitant with their integration into the current communication infrastructure. Moreover, the article provides insights into state-of-the-art AI techniques across various layers of the communication link. This entails applying AI for forecasting the highly dynamic radio channel, spectrum sensing and classification, signal detection and demodulation, inter-satellite link and satellite access network optimization, and network security. Moreover, future paradigms and the mapping of these mechanisms onto practical networks are outlined.	翻訳日:2023-05-17 01:12:00 公開日:2023-05-14
# 量子コンピュータの摂動理論法に向けて Towards Perturbation Theory Methods on a Quantum Computer ( http://arxiv.org/abs/2206.14955v2 ) ライセンス: Link先を確認	Junxu Li, Barbara A. Jones and Sabre Kais	(参考訳) 摂動理論(PT)は物理学者と化学者の両方にとって最も強力で実りの多い道具の1つであり、原子物理学と亜原子物理学の開花による応用の爆発を引き起こした。今日ptはよく使われているが、ptの技術は量子コンピューティングにおいて著しく不足している。本稿では,pt法を用いてエネルギーと固有状態の両方の補正を推定する量子回路を提案する。本手法は,qiskitに基づく数値シミュレーションが提案されている拡張ハバードモデルの適用によりさらに実証される。一般的な量子変動回路とは異なり、我々の回路にはトレーニングや最適化のプロセスはなく、全てのパラメータは未摂動ハミルトニアンから導かれる。我々の研究は、PTに基づくより複雑な手法の量子的実装に光を当てるかもしれない量子デバイスを用いた複雑なシステムの研究の新しいアプローチを提供する。 Perturbation theory (PT) might be one of the most powerful and fruitful tools for both physicists and chemists, which evoked an explosion of applications with the blooming of atomic and subatomic physics. Even though PT is well-used today, techniques for PT are significantly lacking in quantum computing. Here we present a quantum circuit estimating both the energy and eigenstates corrections with PT methods, which we claim is far superior to the classical version when estimating the second order energy correction. Our approach is further demonstrated with an application on the extended Hubbard model, where numerical simulation based on qiskit is also presented. Unlike the popular quantum variational circuit, there is no training or optimizing process in our circuit, and all parameters are derived from the unperturbed Hamiltonian. Our work offers a new approach to studying complex systems with quantum devices, which might shed light on the quantum implementation of the more intricate methods based on PT.	翻訳日:2023-05-17 01:11:39 公開日:2023-05-14
# 症例:共感反応生成における粗悪から細かな認知と愛情の一致 CASE: Aligning Coarse-to-Fine Cognition and Affection for Empathetic Response Generation ( http://arxiv.org/abs/2208.08845v2 ) ライセンス: Link先を確認	Jinfeng Zhou, Chujie Zheng, Bo Wang, Zheng Zhang, Minlie Huang	(参考訳) 共感的会話は、意識的なアライメントと共感の認知と感情の相互作用の結果であると考えられている。しかし、既存の共感的対話モデルは、通常、感情的側面のみを考慮し、孤立して認知と愛情を扱い、共感的反応生成の能力を制限する。本研究では,共感的対話生成のためのCASEモデルを提案する。まず、コモンセンス認知グラフと感情概念グラフの上に構築され、その後、粗粒度と細粒度の両方でユーザの認知と感情を調整します。自動的および手作業による評価により,情緒対話の最先端ベースラインを上回っており,共感的かつ情報的な反応を発生できることを示す。 Empathetic conversation is psychologically supposed to be the result of conscious alignment and interaction between the cognition and affection of empathy. However, existing empathetic dialogue models usually consider only the affective aspect or treat cognition and affection in isolation, which limits the capability of empathetic response generation. In this work, we propose the CASE model for empathetic dialogue generation. It first builds upon a commonsense cognition graph and an emotional concept graph and then aligns the user's cognition and affection at both the coarse-grained and fine-grained levels. Through automatic and manual evaluation, we demonstrate that CASE outperforms state-of-the-art baselines of empathetic dialogues and can generate more empathetic and informative responses.	翻訳日:2023-05-17 01:03:00 公開日:2023-05-14
# 統計力学のエントロピー特性と一致する量子無知に対する微分幾何学的アプローチ A Differential-Geometric Approach to Quantum Ignorance Consistent with Entropic Properties of Statistical Mechanics ( http://arxiv.org/abs/2208.04134v4 ) ライセンス: Link先を確認	Shannon Ray, Paul M. Alsing, Carlo Cafaro, Shelton Jacinto	(参考訳) 本稿では、任意の還元密度演算子$\rho_S$に付随する浄化多様体の計量テンソルと体積を構築する。また、マクロステートが無知の表面(SOI)と呼ばれる浄化の多様体である体積、およびマイクロステートが$\rho_S$の精製量を研究するために、CG(quantum coarse-graining)を定義する。この文脈では、ボリュームは$\rho_S$から欠落する情報の量を定量化するマクロ状態の多重度として機能する。 SOIが$SU(2)$、$SO(3)$、および$SO(N)$の表現を使って生成される例を用いて、CGの2つの特徴を示す。 1) より小さい体積の非定型マクロ状態から始まる系は, システムと環境が厳密に絡み合う過程において, 平衡マクロ状態に達するまで大きな体積のマクロ状態へと発展し, 2) 平衡マクロ状態は, 特に全体系の寸法が大きくなるにつれて, 粗粒空間の大部分を占める。ここで、平衡マクロ状態は、システムと環境の間の最大絡み合いに対応する。特徴(1) を述べるために、体積はフォン・ノイマンのエントロピーのように振る舞うことを示し、純状態はゼロ、最大混合状態は最大であり、$\rho_S$ の純度に関して凹関数であることを示す。これらの2つの特徴は、熱化とボルツマンのオリジナルのCGに関する典型的な議論に欠かせない。 In this paper, we construct the metric tensor and volume for the manifold of purifications associated with an arbitrary reduced density operator $\rho_S$. We also define a quantum coarse-graining (CG) to study the volume where macrostates are the manifolds of purifications, which we call surfaces of ignorance (SOI), and microstates are the purifications of $\rho_S$. In this context, the volume functions as a multiplicity of the macrostates that quantifies the amount of information missing from $\rho_S$. Using examples where the SOI are generated using representations of $SU(2)$, $SO(3)$, and $SO(N)$, we show two features of the CG. (1) A system beginning in an atypical macrostate of smaller volume evolves to macrostates of greater volume until it reaches the equilibrium macrostate in a process in which the system and environment become strictly more entangled, and (2) the equilibrium macrostate takes up the vast majority of the coarse-grained space, especially as the dimension of the total system becomes large. Here, the equilibrium macrostate corresponds to maximum entanglement between system and environment. 
To demonstrate feature (1) for the examples considered, we show that the volume behaves like the von Neumann entropy in that it is zero for pure states, maximal for maximally mixed states, and is a concave function with respect to the purity of $\rho_S$. These two features are essential to typicality arguments regarding thermalization and Boltzmann's original CG.	翻訳日:2023-05-17 01:01:30 公開日:2023-05-14
# 線形回帰の量子通信複雑性 Quantum communication complexity of linear regression ( http://arxiv.org/abs/2210.01601v2 ) ライセンス: Link先を確認	Ashley Montanaro and Changpeng Shao	(参考訳) 量子コンピュータは、線形代数問題を解くための古典的な問題よりも高速化することができる。しかし、例えば低ランク行列の場合のように、非量子化アルゴリズムは指数関数的な量子速度アップは存在できないことを証明している。本研究では, 量子コンピュータが, ランクに制限がない場合, 基本線形代数問題の通信複雑性の観点から, 証明可能な多項式と指数的高速化を持つことを示す。主に線形回帰とハミルトンシミュレーションの解法に焦点をあてる。量子の場合、タスクは結果の量子状態を準備することである。比較を公平にするために、古典的な場合、タスクは結果からサンプルを作ることである。本研究では,これら2つの問題を二元モデルと多元モデルで検討し,準最適量子プロトコルを提案し,量子・古典下界を証明した。本研究では,量子アルゴリズム設計のための強力な手法である量子特異値変換のための効率的な量子プロトコルを提案する。これは、他の多くの問題に対する効率的な量子プロトコルの開発に役立ちます。 Quantum computers may achieve speedups over their classical counterparts for solving linear algebra problems. However, in some cases -- such as for low-rank matrices -- dequantized algorithms demonstrate that there cannot be an exponential quantum speedup. In this work, we show that quantum computers have provable polynomial and exponential speedups in terms of communication complexity for some fundamental linear algebra problems if there is no restriction on the rank. We mainly focus on solving linear regression and Hamiltonian simulation. In the quantum case, the task is to prepare the quantum state of the result. To allow for a fair comparison, in the classical case, the task is to sample from the result. We investigate these two problems in two-party and multiparty models, propose near-optimal quantum protocols and prove quantum/classical lower bounds. In this process, we propose an efficient quantum protocol for quantum singular value transformation, which is a powerful technique for designing quantum algorithms. This will be helpful in developing efficient quantum protocols for many other problems.	翻訳日:2023-05-17 00:43:31 公開日:2023-05-14
# 事前学習された言語モデルがゼロショット学習に役立つ理由 What Makes Pre-trained Language Models Better Zero-shot Learners? ( http://arxiv.org/abs/2209.15206v2 ) ライセンス: Link先を確認	Jinghui Lu, Dongsheng Zhu, Weidong Han, Rui Zhao, Brian Mac Namee, Fei Tan	(参考訳) 本稿では,ゼロ/ファウショットシナリオにおける即時学習の有効性を説明する理論的枠組みを提案する。まず、従来の事前学習および微調整のパラダイムは、表現できないラベル付きデータに過度に適合するため、少数のシナリオで失敗することを証明する。そこで本研究では,大量のテキストコーパス上に構築された事前学習言語モデルと,ドメイン関連の人的知識を活用して予測にもっと参加し,小さなトレーニングセットによって提供される限定ラベル情報の影響を低減することにより,迅速な学習がより効果的であるという仮定を詳述する。さらに、言語不一致がプロンプトの質を測定することができると仮定する。仮定を検証するために包括的な実験が行われる。さらに,理論的な枠組みに触発されて,パープレキシティに基づくアノテーションに依存しないテンプレート選択手法を提案する。既存の作業は、まだテンプレートを評価するために開発セットに依存しているため、このアプローチは特に奨励されます。実験により、この手法は最先端のゼロショット法に比べて大きな予測効果をもたらすことが示された。 In this paper, we propose a theoretical framework to explain the efficacy of prompt learning in zero/few-shot scenarios. First, we prove that conventional pre-training and fine-tuning paradigm fails in few-shot scenarios due to overfitting the unrepresentative labelled data. We then detail the assumption that prompt learning is more effective because it empowers pre-trained language model that is built upon massive text corpora, as well as domain-related human knowledge to participate more in prediction and thereby reduces the impact of limited label information provided by the small training set. We further hypothesize that language discrepancy can measure the quality of prompting. Comprehensive experiments are performed to verify our assumptions. More remarkably, inspired by the theoretical framework, we propose an annotation-agnostic template selection method based on perplexity, which enables us to ``forecast'' the prompting performance in advance. This approach is especially encouraging because existing work still relies on development set to post-hoc evaluate templates. Experiments show that this method leads to significant prediction benefits compared to state-of-the-art zero-shot methods.	翻訳日:2023-05-17 00:42:58 公開日:2023-05-14
# 力学系のニューラルネットワーク積分器の厳密な保存則 Exact conservation laws for neural network integrators of dynamical systems ( http://arxiv.org/abs/2209.11661v2 ) ライセンス: Link先を確認	Eike Hermann M\"uller	(参考訳) 近年,ニューラルネットワークを用いた時間依存微分方程式の解法が注目されている。中心となる考え方は、ランダムノイズによって汚染される可能性のあるデータから解の進化を管理する法則を学ぶことである。しかし、他の機械学習アプリケーションとは対照的に、システムについては通常多くのことが知られている。例えば、多くの力学系において、エネルギーや(角運動量のような)物理量は正確に保存される。したがって、ニューラルネットワークはデータからこれらの保存則を学習しなければならず、有限なトレーニング時間とランダムノイズによってのみ満足できる。本稿では,ニューラルネットワークのアーキテクチャに保存則を内在的に組み込むために,ネーターの定理を用いた代替手法を提案する。これは3次元ニュートン重力ポテンシャルにおける非相対論的粒子の運動、シュワルツシルト計量における大規模相対論的粒子の運動、および4次元で相互作用する2つの粒子の系である。 The solution of time dependent differential equations with neural networks has attracted a lot of attention recently. The central idea is to learn the laws that govern the evolution of the solution from data, which might be polluted with random noise. However, in contrast to other machine learning applications, usually a lot is known about the system at hand. For example, for many dynamical systems physical quantities such as energy or (angular) momentum are exactly conserved. Hence, the neural network has to learn these conservation laws from data and they will only be satisfied approximately due to finite training time and random noise. In this paper we present an alternative approach which uses Noether's Theorem to inherently incorporate conservation laws into the architecture of the neural network. We demonstrate that this leads to better predictions for three model systems: the motion of a non-relativistic particle in a three-dimensional Newtonian gravitational potential, the motion of a massive relativistic particle in the Schwarzschild metric and a system of two interacting particles in four dimensions.	翻訳日:2023-05-17 00:41:59 公開日:2023-05-14
# ニューラルネットワーク検証におけるTighter Abstract Queries Tighter Abstract Queries in Neural Network Verification ( http://arxiv.org/abs/2210.12871v2 ) ライセンス: Link先を確認	Elazar Cohen, Yizhak Yisrael Elboher, Clark Barrett, Guy Katz	(参考訳) ニューラルネットワークは、コンピュータサイエンスにおけるさまざまな領域におけるリアクティブシステムの重要な構成要素となっている。優れたパフォーマンスにもかかわらず、ニューラルネットワークを使用することは、私たちの行動を理解し、判断する能力の欠如に起因する多くのリスクを伴います。これらのリスクのため、ニューラルネットワークの検証には様々な形式的手法が提案されているが、残念ながらスケーラビリティの障壁に苦しむことが多い。最近の試みでは、これらの制限を緩和する上で、抽象化-制限アプローチが重要な役割を果たすことが示されているが、これらのアプローチは、しばしば、非常に抽象的なネットワークを生成し、検証に適さないものとなる。この問題に対処するため,システムとプロパティを同時に抽象化・洗練する新しい検証機構であるCEGARETTEを提案する。このアプローチによって,小型かつ十分に正確な抽象ネットワークを作成でき,多数の改良ステップを回避しつつ,迅速な検証時間を確保できることがわかった。評価のために,最近提案された CEGAR-NN フレームワークの拡張として CEGARETTE を実装した。私たちの結果は有望であり、複数のベンチマークに対するパフォーマンスの大幅な改善を示しています。 Neural networks have become critical components of reactive systems in various domains within computer science. Despite their excellent performance, using neural networks entails numerous risks that stem from our lack of ability to understand and reason about their behavior. Due to these risks, various formal methods have been proposed for verifying neural networks; but unfortunately, these typically struggle with scalability barriers. Recent attempts have demonstrated that abstraction-refinement approaches could play a significant role in mitigating these limitations; but these approaches can often produce networks that are so abstract, that they become unsuitable for verification. To deal with this issue, we present CEGARETTE, a novel verification mechanism where both the system and the property are abstracted and refined simultaneously. We observe that this approach allows us to produce abstract networks which are both small and sufficiently accurate, allowing for quick verification times while avoiding a large number of refinement steps. For evaluation purposes, we implemented CEGARETTE as an extension to the recently proposed CEGAR-NN framework. 
Our results are very promising, and demonstrate a significant improvement in performance over multiple benchmarks.	翻訳日:2023-05-17 00:35:43 公開日:2023-05-14
# 男女のアニムスは、良好な影響下でも存続できる:オンラインp2pローンの注意書き Gender Animus Can Still Exist Under Favorable Disparate Impact: a Cautionary Tale from Online P2P Lending ( http://arxiv.org/abs/2210.07864v3 ) ライセンス: Link先を確認	Xudong Shen, Tianhui Tan, Tuan Q. Phan, Jussi Keppo	(参考訳) 本稿では,中国の著名なオンラインピアツーピア(p2p)レンディングプラットフォーム上での性差別とその基盤となるドライバについて検討する。 P2P貸与に関する既存の研究は、異種治療(DT)に焦点を当てているが、DTは直接的差別を狭く認識し、間接的および代理的差別を見落とし、不完全な画像を提供する。本研究では,実際のリターン率に合致しないローンの融資率の差を包含する,分散インパクト(di)と呼ばれる幅広い差別概念を測定した。観測データからdiを推定する2段階予測器置換手法を開発した。私たちの発見は (i)女性借り手は、同じ実利率で、資金を受け取る確率が3.97%高い。 (ii)このdiの少なくとも37.1%は、間接的又は代理的差別であり、 (iii)DTは女性全体の嗜好を44.6%過小評価している。また, 投資家が不完全な観察から期待したリターン率を正確に予測する「合理的統計的識別」によって, 女性の好意性全般が説明できることを示す。さらに、女性の借り手は資金確保に2%高いリターン率を必要としており、別のドライバーの味覚に基づく差別が共存し、女性に対するものであることを示している。これらの結果は、P2P貸与は、女性を支持する肯定的な行動が合理的な群衆から自然に現れる価値ある代替信用市場を提供する一方、全体的な差別効果(DIまたはDTの両方)が女性に有利である一方で、味に基づく差別は持続し、統計的差別など既存の他の差別ドライバーによって隠蔽される可能性がある。 This paper investigates gender discrimination and its underlying drivers on a prominent Chinese online peer-to-peer (P2P) lending platform. While existing studies on P2P lending focus on disparate treatment (DT), DT narrowly recognizes direct discrimination and overlooks indirect and proxy discrimination, providing an incomplete picture. In this work, we measure a broadened discrimination notion called disparate impact (DI), which encompasses any disparity in the loan's funding rate that does not commensurate with the actual return rate. We develop a two-stage predictor substitution approach to estimate DI from observational data. Our findings reveal (i) female borrowers, given identical actual return rates, are 3.97% more likely to receive funding, (ii) at least 37.1% of this DI favoring female is indirect or proxy discrimination, and (iii) DT indeed underestimates the overall female favoritism by 44.6%. 
However, we also identify the overall female favoritism can be explained by one specific discrimination driver, rational statistical discrimination, wherein investors accurately predict the expected return rate from imperfect observations. Furthermore, female borrowers still require 2% higher expected return rate to secure funding, indicating another driver taste-based discrimination co-exists and is against female. These results altogether tell a cautionary tale: on one hand, P2P lending provides a valuable alternative credit market where the affirmative action to support female naturally emerges from the rational crowd; on the other hand, while the overall discrimination effect (both in terms of DI or DT) favors female, concerning taste-based discrimination can persist and can be obscured by other co-existing discrimination drivers, such as statistical discrimination.	翻訳日:2023-05-17 00:34:06 公開日:2023-05-14
# KALM:長期文書理解のためのローカル・ドキュメント・グローバルコンテキストの知識認識統合 KALM: Knowledge-Aware Integration of Local, Document, and Global Contexts for Long Document Understanding ( http://arxiv.org/abs/2210.04105v2 ) ライセンス: Link先を確認	Shangbin Feng, Zhaoxuan Tan, Wenqian Zhang, Zhenyu Lei, Yulia Tsvetkov	(参考訳) 事前訓練された言語モデル(LM)の出現に伴い、下流タスクのためのLMを作成するために、コモンセンスとドメイン固有の知識を注入することに注力する研究が増えている。これらの研究は、事前訓練されたLMとともに、記号的知識表現のデファクト標準である知識グラフを活用する。既存のアプローチでは外部の知識を活用しているが、ローカル(例えば文)からドキュメントレベル、グローバル知識まで、さまざまなコンテキストを表す知識グラフを共同で組み込む方法が疑問視されている。このようなリッチな文脈化は、標準の事前訓練されたLMが典型的には入力シーケンス長によって拘束されるため、長い文書理解タスクに特に有用である。これらの課題を踏まえて,長文理解のためのローカル,文書レベル,グローバルコンテキストの知識を協調的に活用する知識認識言語モデルであるKALMを提案する。 KALMはまず、長いドキュメントと知識グラフを3つの知識認識コンテキスト表現にエンコードする。その後、各コンテキストをコンテキスト固有のレイヤで処理し、次いでコンテキスト融合層によって知識交換を促進し、包括的なドキュメント表現を導出する。大規模な実験により、KALMは6つの長い文書理解タスクとデータセットで最先端のパフォーマンスを達成する。さらなる分析により、3つの知識認識コンテキストは相補的であり、それらは全てモデルのパフォーマンスに寄与し、異なるコンテキストの重要度と情報交換パターンは異なるタスクとデータセットに関して異なることが判明した。 With the advent of pretrained language models (LMs), increasing research efforts have been focusing on infusing commonsense and domain-specific knowledge to prepare LMs for downstream tasks. These works attempt to leverage knowledge graphs, the de facto standard of symbolic knowledge representation, along with pretrained LMs. While existing approaches have leveraged external knowledge, it remains an open question how to jointly incorporate knowledge graphs representing varying contexts, from local (e.g., sentence), to document-level, to global knowledge, to enable knowledge-rich exchange across these contexts. Such rich contextualization can be especially beneficial for long document understanding tasks since standard pretrained LMs are typically bounded by the input sequence length. In light of these challenges, we propose KALM, a Knowledge-Aware Language Model that jointly leverages knowledge in local, document-level, and global contexts for long document understanding. 
KALM first encodes long documents and knowledge graphs into the three knowledge-aware context representations. It then processes each context with context-specific layers, followed by a context fusion layer that facilitates knowledge exchange to derive an overarching document representation. Extensive experiments demonstrate that KALM achieves state-of-the-art performance on six long document understanding tasks and datasets. Further analyses reveal that the three knowledge-aware contexts are complementary and they all contribute to model performance, while the importance and information exchange patterns of different contexts vary with respect to different tasks and datasets.	翻訳日:2023-05-17 00:33:34 公開日:2023-05-14
# 制約付きmin-max最適化のための高速化シングルコール法 Accelerated Single-Call Methods for Constrained Min-Max Optimization ( http://arxiv.org/abs/2210.03096v2 ) ライセンス: Link先を確認	Yang Cai, Weiqiang Zheng	(参考訳) 制約付きmin-max最適化のための一階法について検討する。既存の手法は、各反復で2回の勾配呼び出しまたは2回の射影を必要とし、応用によってはコストが高くなりうる。本稿ではまず,単一呼び出し・単一射影 (single-call single-projection) アルゴリズムである楽観的勾配法 (OG) の変種が,弱いMinty変分不等式 (MVI) を満たす作用素を持つ包含問題に対して $O(\frac{1}{\sqrt{T}})$ の最良反復 (best-iterate) 収束率を持つことを示す。第二の結果は、負のコモノトニック性を満たす包含問題に対して最適な $O(\frac{1}{T})$ の最終反復 (last-iterate) 収束率を達成する最初の単一呼び出し・単一射影アルゴリズム、Accelerated Reflected Gradient (ARG) 法である。弱いMVIと負のコモノトニック性はともによく研究された仮定であり、非凸非凹min-max最適化問題の豊富なクラスを捉えている。最後に,もう1つの単一呼び出し・単一射影アルゴリズムであるReflected Gradient (RG) 法が,制約付き凸凹min-max最適化に対して $O(\frac{1}{\sqrt{T}})$ の最終反復収束率を持つことを示し,[Hsieh et al., 2019] の未解決問題に答える。我々の収束率は、接残差や自然残差などの標準的な尺度に対して成り立つ。 We study first-order methods for constrained min-max optimization. Existing methods either require two gradient calls or two projections in each iteration, which may be costly in some applications. In this paper, we first show that a variant of the Optimistic Gradient (OG) method, a single-call single-projection algorithm, has $O(\frac{1}{\sqrt{T}})$ best-iterate convergence rate for inclusion problems with operators that satisfy the weak Minty variational inequality (MVI). Our second result is the first single-call single-projection algorithm -- the Accelerated Reflected Gradient (ARG) method that achieves the optimal $O(\frac{1}{T})$ last-iterate convergence rate for inclusion problems that satisfy negative comonotonicity. Both the weak MVI and negative comonotonicity are well-studied assumptions and capture a rich set of non-convex non-concave min-max optimization problems. Finally, we show that the Reflected Gradient (RG) method, another single-call single-projection algorithm, has $O(\frac{1}{\sqrt{T}})$ last-iterate convergence rate for constrained convex-concave min-max optimization, answering an open problem of [Hsieh et al., 2019]. Our convergence rates hold for standard measures such as the tangent residual and the natural residual.	
翻訳日:2023-05-17 00:33:06 公開日:2023-05-14
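The "single-call single-projection" idea in the abstract above can be illustrated with a small sketch. The code below uses the textbook projected reflected gradient update, z_{t+1} = P(z_t - step * F(2 z_t - z_{t-1})), on a toy bilinear saddle problem min_x max_y x*y over the box [-1, 1]^2. The toy problem, the step size 0.3, and all function names are assumptions made for illustration; this is not the paper's accelerated ARG method or its OG variant.

```python
import numpy as np


def reflected_gradient(operator, project, z0, step, iters):
    """Projected reflected gradient: one operator call and one projection per iteration."""
    z_prev = np.asarray(z0, dtype=float)
    z = z_prev.copy()
    for _ in range(iters):
        # Evaluate the operator once, at the reflected point 2*z - z_prev.
        z_next = project(z - step * operator(2.0 * z - z_prev))
        z_prev, z = z, z_next
    return z


def bilinear_operator(z):
    """F(x, y) = (df/dx, -df/dy) for f(x, y) = x * y (hypothetical toy problem)."""
    x, y = z
    return np.array([y, -x])


def project_box(z):
    """Euclidean projection onto the box [-1, 1]^2."""
    return np.clip(z, -1.0, 1.0)


# The unique saddle point of the toy problem is the origin.
solution = reflected_gradient(bilinear_operator, project_box, [0.8, -0.6], step=0.3, iters=1000)
```

Each iteration reuses the previous iterate instead of computing an extra gradient, which is what separates this family from extragradient-style methods that call the operator twice per step.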
# rita: インタラクティブな交通の流れで自動運転シミュレータを強化 RITA: Boost Autonomous Driving Simulators with Realistic Interactive Traffic Flow ( http://arxiv.org/abs/2211.03408v4 ) ライセンス: Link先を確認	Zhengbang Zhu, Shenyu Zhang, Yuzheng Zhuang, Yuecheng Liu, Minghuan Liu, Liyuan Mao, Ziqin Gong, Weinan Zhang, Shixiong Kai, Qiang Gu, Bin Wang, Siyuan Cheng, Xinyu Wang, Jianye Hao and Yong Yu	(参考訳) 高品質な交通フロー生成は、自動運転シミュレータ構築における中核モジュールである。しかし、利用可能なシミュレータのほとんどは、実世界のデータの様々な特徴を正確に反映したトラフィックパターンを複製することができず、テストされたオートパイロット駆動戦略に対する人間のような反応をシミュレートすることができない。このような問題に対処するために,既存の運転シミュレータの統合コンポーネントとしてRealistic Interactive TrAffic Flow (RITA)を提案する。 RITAは3つの重要な特徴、すなわち忠実さ、多様性、制御性を考慮して開発され、RITABackendとRITAKitと呼ばれる2つのコアモジュールで構成されている。 RITABackendは実世界のデータセットからトラフィック生成モデルを提供するために構築されており、RITAKitはRITABackendを介して制御可能なトラフィック生成のための使いやすいインターフェースで開発されている。本稿では,多種多様かつ高忠実な交通シミュレーションを実現するRITAの能力について述べる。実験の結果, 生成したRITAトラヒックフローは3つの重要な特徴を全て示し, 運転戦略評価の完全性を高めた。さらに、RITAトラフィックフローを用いたオンライン微調整によるベースライン戦略の改善の可能性を示す。 High-quality traffic flow generation is the core module in building simulators for autonomous driving. However, the majority of available simulators are incapable of replicating traffic patterns that accurately reflect the various features of real-world data while also simulating human-like reactive responses to the tested autopilot driving strategies. Taking one step forward to addressing such a problem, we propose Realistic Interactive TrAffic flow (RITA) as an integrated component of existing driving simulators to provide high-quality traffic flow for the evaluation and optimization of the tested driving strategies. RITA is developed with consideration of three key features, i.e., fidelity, diversity, and controllability, and consists of two core modules called RITABackend and RITAKit. RITABackend is built to support vehicle-wise control and provide traffic generation models from real-world datasets, while RITAKit is developed with easy-to-use interfaces for controllable traffic generation via RITABackend. 
We demonstrate RITA's capacity to create diversified and high-fidelity traffic simulations in several highly interactive highway scenarios. The experimental findings demonstrate that our produced RITA traffic flows exhibit all three key features, hence enhancing the completeness of driving strategy evaluation. Moreover, we showcase the possibility for further improvement of baseline strategies through online fine-tuning with RITA traffic flows.	翻訳日:2023-05-17 00:25:24 公開日:2023-05-14
# 量子コンピューティングにおける学生の強みと困難 Investigating students' strengths and difficulties in quantum computing ( http://arxiv.org/abs/2212.03726v3 ) ライセンス: Link先を確認	Tunde Kushimo and Beth Thacker	(参考訳) 量子コンピューティングは、情報理論、計算機科学、数学、量子物理学から情報を根本的に新しい方法で処理するエキサイティングな分野である。実用的な量子コンピュータを開発し、量子労働力を増やすための競争が進行中である。これは、量子コンピューティングプログラム、コース、カリキュラムの開発と、次世代の量子情報科学者の教育を支援するためのエビデンスに基づく教育材料の開発と相まって行う必要がある。量子コンピューティングの入門コースを大学生に導入し,入門コースを受講した学生の量子コンピューティングにおける強みと難易度について検討した。我々のゴールは、学生が理解し易いトピックと理解し難いトピックを理解しながら、量子コンピューティング教育の改善に貢献することである。我々は,学生の強みと難しさを明らかにするために,一連のインタビューを行った。我々は、これらのインタビューの結果と、量子コンピューティング入門コースを教えるためのエビデンスベース教材の開発に関する初期研究について報告する。 Quantum Computing is an exciting field that draws from information theory, computer science, mathematics, and quantum physics to process information in fundamentally new ways. There is an ongoing race to develop practical quantum computers and increase the quantum workforce. This needs to be accompanied by the development of quantum computing programs, courses, and curricula coupled with the development of evidence-based pedagogical materials to support the education of the next generation of quantum information scientists. We introduced an introductory course in quantum computing to undergraduate students and investigated the strengths and difficulties of these students in quantum computing after taking the introductory course. Our goal is to contribute to the improvement of quantum computing education while understanding the topics that the students find easy to comprehend and the topics that are difficult to comprehend. We conducted a series of interviews to identify these strengths and difficulties in the students. We report on the results of these interviews and our initial work on the development of evidence-based materials for teaching an introductory course in quantum computing.	翻訳日:2023-05-17 00:05:52 公開日:2023-05-14
# MUS-CDB:空中物体検出におけるアクティブアノテーションのためのクラス分散バランス付き混合不確かさサンプリング MUS-CDB: Mixed Uncertainty Sampling with Class Distribution Balancing for Active Annotation in Aerial Object Detection ( http://arxiv.org/abs/2212.02804v3 ) ライセンス: Link先を確認	Dong Liang and Jing-Wei Zhang and Ying-Peng Tang and Sheng-Jun Huang	(参考訳) 最近の航空物体検出モデルは、大量のラベル付き訓練データに依存しており、密集した物体を持つ大きな空中シーンでは、望ましくない手動ラベリングコストを必要とする。アクティブラーニングは、情報的および代表的未ラベルサンプルを選択的にクエリすることで、データラベリングコストの低減に有効である。しかし,既存のアクティブラーニング手法は,主にクラスバランスの設定と画像に基づく一般的な物体検出タスクのクエリが特徴であり,空域における長い尾のクラス分布や密集した小物体による空中物体検出のシナリオには適用できない。本稿では,コスト効率の高い空中物体検出のための新しい能動学習手法を提案する。具体的には、冗長で近視的なクエリを控えるために、オブジェクトの選択において、オブジェクトレベルとイメージレベルのインフォメーションの両方が考慮される。また、モデルトレーニングにおけるロングテールクラス分散問題を軽減するためにマイノリティオブジェクトを好むために、使いやすいクラスバランス基準が組み込まれている。問い合わせ情報を完全に活用するために,未発見画像領域における潜伏知識をマイニングするためのトレーニング損失を更に考案する。提案手法の有効性を検証するため,DOTA-v1.0およびDOTA-v2.0ベンチマークを用いて実験を行った。その結果,ラベリングコストの75%以上を削減でき,ベースラインや最先端のアクティブオブジェクト検出法と同等の性能が得られることがわかった。コードは \href{https://github.com/ZJW700/MUS-CDB}{\textit{https://github.com/ZJW700/MUS-CDB}} で公開されている。 Recent aerial object detection models rely on a large amount of labeled training data, which requires unaffordable manual labeling costs in large aerial scenes with dense objects. Active learning is effective in reducing the data labeling cost by selectively querying the informative and representative unlabelled samples. However, existing active learning methods are mainly with class-balanced setting and image-based querying for generic object detection tasks, which are less applicable to aerial object detection scenario due to the long-tailed class distribution and dense small objects in aerial scenes. In this paper, we propose a novel active learning method for cost-effective aerial object detection. Specifically, both object-level and image-level informativeness are considered in the object selection to refrain from redundant and myopic querying. 
Besides, an easy-to-use class-balancing criterion is incorporated to favor the minority objects to alleviate the long-tailed class distribution problem in model training. To fully utilize the queried information, we further devise a training loss to mine the latent knowledge in the undiscovered image regions. Extensive experiments are conducted on the DOTA-v1.0 and DOTA-v2.0 benchmarks to validate the effectiveness of the proposed method. The results show that it can save more than 75% of the labeling cost to reach the same performance compared to the baselines and state-of-the-art active object detection methods. Code is available at \href{https://github.com/ZJW700/MUS-CDB}{\textit{https://github.com/ZJW700/MUS-CDB}}.	翻訳日:2023-05-17 00:05:08 公開日:2023-05-14
# 物理インフォームドモデルに基づく強化学習 Physics-Informed Model-Based Reinforcement Learning ( http://arxiv.org/abs/2212.02179v4 ) ライセンス: Link先を確認	Adithya Ramesh, Balaraman Ravindran	(参考訳) ロボットのタスクに強化学習(RL)を適用する。従来のRLアルゴリズムの欠点の1つは、サンプル効率が悪いことである。サンプル効率を改善する1つのアプローチはモデルベースRLである。モデルに基づくRLアルゴリズムでは、その遷移力学と報酬関数のモデルを学び、それを仮想軌道生成に利用し、それらをバックプロパゲーションしてポリシーを更新し、モデルの微分可能性を活用する。直感的には、より正確なモデルを学ぶことで、モデルベースのrlパフォーマンスが向上するはずだ。近年,基礎となる物理構造を利用して,より深いニューラルネットワークに基づく物理系の力学モデル開発への関心が高まっている。接触なしで剛体運動を行うロボットシステムに焦点を当てる。モデルベースRLアルゴリズムの2つのバージョンを比較した。1つは標準のディープニューラルネットワークベースのダイナミックスモデル、もう1つはより正確な物理インフォームドニューラルネットワークベースのダイナミックスモデルである。モデルベースRLでは,数値誤差が急速に蓄積する初期条件に敏感な環境において,モデル精度が重要となることを示す。これらの環境では、物理に変形したアルゴリズムは平均回帰とサンプル効率が大幅に向上する。初期条件に敏感でない環境では、アルゴリズムのどちらのバージョンも同様の平均回帰を達成し、物理インフォームされたバージョンはより優れたサンプル効率を達成する。また, 困難な環境下では, 物理モデルに基づくrlは, ソフトアクタ-クリティックのような最先端のモデルフリーなrlアルゴリズムよりも, 平均回帰性能が向上することを示した。 We apply reinforcement learning (RL) to robotics tasks. One of the drawbacks of traditional RL algorithms has been their poor sample efficiency. One approach to improve the sample efficiency is model-based RL. In our model-based RL algorithm, we learn a model of the environment, essentially its transition dynamics and reward function, use it to generate imaginary trajectories and backpropagate through them to update the policy, exploiting the differentiability of the model. Intuitively, learning more accurate models should lead to better model-based RL performance. Recently, there has been growing interest in developing better deep neural network based dynamics models for physical systems, by utilizing the structure of the underlying physics. We focus on robotic systems undergoing rigid body motion without contacts. We compare two versions of our model-based RL algorithm, one which uses a standard deep neural network based dynamics model and the other which uses a much more accurate, physics-informed neural network based dynamics model. 
We show that, in model-based RL, model accuracy mainly matters in environments that are sensitive to initial conditions, where numerical errors accumulate fast. In these environments, the physics-informed version of our algorithm achieves significantly better average-return and sample efficiency. In environments that are not sensitive to initial conditions, both versions of our algorithm achieve similar average-return, while the physics-informed version achieves better sample efficiency. We also show that, in challenging environments, physics-informed model-based RL achieves better average-return than state-of-the-art model-free RL algorithms such as Soft Actor-Critic, as it computes the policy-gradient analytically, while the latter estimates it through sampling.	翻訳日:2023-05-17 00:04:10 公開日:2023-05-14
# 教師なし要約の再ランキング Unsupervised Summarization Re-ranking ( http://arxiv.org/abs/2212.09593v2 ) ライセンス: Link先を確認	Mathieu Ravaut, Shafiq Joty, Nancy Chen	(参考訳) タスク固有の事前学習目的の台頭に伴い、PEGASUSのような抽象型要約モデルは、下流の要約タスクにおいて魅力的なゼロショット性能を提供する。しかし、そのような教師なしモデルの性能は、教師ありモデルに比べて依然として大きく遅れている。教師あり設定と同様に、これらのモデルが生成する要約候補間の品質のばらつきは非常に大きいが、要約出力として保持される候補は1つだけである。本稿では、教師なしモデルと教師ありモデルの性能差を縮めることを目指し、教師なしの方法で要約候補を再ランキングすることを提案する。提案手法は、広く採用されている4つの要約ベンチマークにおいて、相対平均ROUGEで教師なしPEGASUSを最大7.27%、ChatGPTを最大6.86%改善する。また、30のゼロショット転移設定(あるデータセットでファインチューニングし、別のデータセットで評価)の平均で7.51%(XSumからWikiHowへの転移では最大23.73%)の相対的な向上を達成する。 With the rise of task-specific pre-training objectives, abstractive summarization models like PEGASUS offer appealing zero-shot performance on downstream summarization tasks. However, the performance of such unsupervised models still lags significantly behind their supervised counterparts. Similarly to the supervised setup, we notice a very high variance in quality among summary candidates from these models while only one candidate is kept as the summary output. In this paper, we propose to re-rank summary candidates in an unsupervised manner, aiming to close the performance gap between unsupervised and supervised models. Our approach improves the unsupervised PEGASUS by up to 7.27% and ChatGPT by up to 6.86% relative mean ROUGE across four widely-adopted summarization benchmarks; and achieves relative gains of 7.51% (up to 23.73% from XSum to WikiHow) averaged over 30 zero-shot transfer setups (finetuning on a dataset, evaluating on another).	翻訳日:2023-05-16 23:57:16 公開日:2023-05-14
# 時空間マップの直流および交流成分による顔ビデオからの血中酸素飽和度推定 Blood Oxygen Saturation Estimation from Facial Video via DC and AC components of Spatio-temporal Map ( http://arxiv.org/abs/2212.07116v2 ) ライセンス: Link先を確認	Yusuke Akamatsu, Yoshifumi Onishi, Hitoshi Imaoka	(参考訳) 血液中の酸素濃度の指標である末梢血酸素飽和度(SpO2)は、最も重要な生理的パラメータの1つである。 SpO2は通常、パルスオキシメータを用いて測定されるが、顔や手のビデオからの非接触SpO2推定方法が近年注目されている。本稿では,畳み込みニューラルネットワーク(CNN)を用いた顔ビデオからのSpO2推定手法を提案する。本手法は,SpO2推定の原理において重要な、顔ビデオのRGB信号から抽出した直流(DC)成分と交流(AC)成分を考慮したCNNモデルを構築する。具体的には,フィルタ処理を用いて時空間マップから直流および交流成分を抽出し,これらの成分からSpO2を予測するようにCNNモデルを訓練する。また,畳み込み層によって直流および交流成分を抽出し,時空間マップから直接SpO2を予測するエンドツーエンドモデルも提案する。 50名の被験者の顔ビデオとSpO2データを用いた実験により,提案手法は現在の最先端のSpO2推定法よりも優れた推定性能が得られることが示された。 Peripheral blood oxygen saturation (SpO2), an indicator of oxygen levels in the blood, is one of the most important physiological parameters. Although SpO2 is usually measured using a pulse oximeter, non-contact SpO2 estimation methods from facial or hand videos have been attracting attention in recent years. In this paper, we propose an SpO2 estimation method from facial videos based on convolutional neural networks (CNN). Our method constructs CNN models that consider the direct current (DC) and alternating current (AC) components extracted from the RGB signals of facial videos, which are important in the principle of SpO2 estimation. Specifically, we extract the DC and AC components from the spatio-temporal map using filtering processes and train CNN models to predict SpO2 from these components. We also propose an end-to-end model that predicts SpO2 directly from the spatio-temporal map by extracting the DC and AC components via convolutional layers. Experiments using facial videos and SpO2 data from 50 subjects demonstrate that the proposed method achieves a better estimation performance than current state-of-the-art SpO2 estimation methods.	翻訳日:2023-05-16 23:55:32 公開日:2023-05-14
# マルチエージェントネットワークシステムにおけるスケーラブルでサンプル効率の良い分散方策勾配アルゴリズム Scalable and Sample Efficient Distributed Policy Gradient Algorithms in Multi-Agent Networked Systems ( http://arxiv.org/abs/2212.06357v2 ) ライセンス: Link先を確認	Xin Liu, Honghao Wei, Lei Ying	(参考訳) 本稿では,エージェントが受け取る報酬は他のエージェントの状態に依存するが,次の状態はエージェント自身の現在の状態と行動のみに依存するマルチエージェント強化学習(MARL)問題のクラスについて検討する。これをREward-Coupled Multi-Agent Reinforcement Learningの略としてREC-MARLと名付ける。 REC-MARLには、無線ネットワークにおけるリアルタイムアクセス制御や分散電力制御など、様々な重要な応用がある。本稿では,REC-MARLのための分散方策勾配アルゴリズムを提案する。提案アルゴリズムは次の2つの点で分散的である。 (i)学習される方策は、エージェントのローカル状態をそのローカル行動に写像する分散方策である。 (ii)学習・訓練が分散されており、各エージェントは自身の情報と近傍エージェントの情報に基づいて方策を更新する。提案アルゴリズムは定常方策を達成し、その反復計算量の上界は局所状態と行動の次元に依存する。無線ネットワークにおけるリアルタイムアクセス制御と電力制御に対する実験結果から,得られた方策は最先端のアルゴリズムやよく知られたベンチマークを大きく上回ることがわかった。 This paper studies a class of multi-agent reinforcement learning (MARL) problems where the reward that an agent receives depends on the states of other agents, but the next state only depends on the agent's own current state and action. We name it REC-MARL standing for REward-Coupled Multi-Agent Reinforcement Learning. REC-MARL has a range of important applications such as real-time access control and distributed power control in wireless networks. This paper presents a distributed policy gradient algorithm for REC-MARL. The proposed algorithm is distributed in two aspects: (i) the learned policy is a distributed policy that maps a local state of an agent to its local action and (ii) the learning/training is distributed, during which each agent updates its policy based on its own and neighbors' information. The learned algorithm achieves a stationary policy and its iterative complexity bounds depend on the dimension of local states and actions. The experimental results of our algorithm for the real-time access control and power control in wireless networks show that our policy significantly outperforms the state-of-the-art algorithms and well-known benchmarks.	翻訳日:2023-05-16 23:54:38 公開日:2023-05-14
# ロバストなICP初期化へのアプローチ An approach to robust ICP initialization ( http://arxiv.org/abs/2212.05332v2 ) ライセンス: Link先を確認	Alexander Kolpakov, Michael Werman	(参考訳) 本稿では,剛体変換で関連付けられたラベルなし点群を照合するために,ICP (Iterative Closest Point) アルゴリズムを初期化する手法を提案する。この方法は、点群の共分散行列で定義される楕円体をマッチングし、有限鏡映群の要素の分だけ異なる主半軸の対応付けをそれぞれ試すことに基づいている。ノイズに対する本手法のロバスト性に関する評価(バウンド)を導出し,数値実験により理論的な知見を確認した。 In this note, we propose an approach to initialize the Iterative Closest Point (ICP) algorithm to match unlabelled point clouds related by rigid transformations. The method is based on matching the ellipsoids defined by the points' covariance matrices and then testing the various principal half-axes matchings that differ by elements of a finite reflection group. We derive bounds on the robustness of our approach to noise and numerical experiments confirm our theoretical findings.	翻訳日:2023-05-16 23:54:19 公開日:2023-05-14
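The ellipsoid-matching idea described in the abstract above can be sketched in a few lines of NumPy. The code aligns the principal axes of the two covariance matrices, then tries the sign flips of those axes (the reflection group generated by the three coordinate reflections) and keeps the proper rotation with the smallest mean nearest-neighbour distance. The function names, the brute-force nearest-neighbour search, and the selection criterion are assumptions made for illustration, not the authors' implementation. The sketch also assumes the covariance eigenvalues are distinct.

```python
import itertools

import numpy as np


def principal_axes(points):
    """Return the centroid and the covariance eigenvectors (columns, ascending eigenvalues)."""
    centroid = points.mean(axis=0)
    cov = np.cov((points - centroid).T)
    _, vecs = np.linalg.eigh(cov)
    return centroid, vecs


def mean_nn_distance(a, b):
    """Mean distance from each point of a to its nearest point in b (brute force)."""
    d2 = ((a[:, None, :] - b[None, :, :]) ** 2).sum(axis=2)
    return np.sqrt(d2.min(axis=1)).mean()


def ellipsoid_init(source, target):
    """Estimate (rotation, translation) such that target is roughly rotation @ p + translation."""
    c_s, u_s = principal_axes(source)
    c_t, u_t = principal_axes(target)
    best = None
    for signs in itertools.product((1.0, -1.0), repeat=3):
        rot = u_t @ np.diag(signs) @ u_s.T
        if np.linalg.det(rot) < 0:
            continue  # keep proper rotations only
        moved = (source - c_s) @ rot.T + c_t
        err = mean_nn_distance(moved, target)
        if best is None or err < best[0]:
            best = (err, rot, c_t - rot @ c_s)
    return best[1], best[2]
```

The returned pair would then serve as the starting transform for a standard ICP loop. When two eigenvalues are nearly equal the principal axes are ill-defined, and a sketch like this cannot be expected to work without further care.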
# Word-Graph2vec:ランダムウォークサンプリングを用いた単語共起グラフへの効率的な単語埋め込み手法 Word-Graph2vec: An efficient word embedding approach on word co-occurrence graph using random walk sampling ( http://arxiv.org/abs/2301.04312v4 ) ライセンス: Link先を確認	Wenting Li and Jiahong Xue and Xi Zhang and Huacan Chen and Zeyu Chen and Yuanzhe Cai	(参考訳) 単語の埋め込みはユビキタスになり、情報検索、意味分析、機械翻訳など、様々なテキストマイニングや自然言語処理(NLP)タスクで広く使われている。残念ながら、比較的大きなコーパスに埋め込まれた単語を訓練するのは極めて高価である。そこで本研究では,大小コーパスを単語共起グラフに変換し,ランダムに移動して単語列サンプルを取り,最後にこのサンプリングコーパスに埋め込まれた単語を訓練する,グラフベースの単語埋め込みアルゴリズムであるword-graph2vecを提案する。英語における安定語彙,相対イディオム,固定表現により,単語共起グラフの大きさと密度は,学習コーパスの増加とともにわずかに変化することが示唆された。したがって、Word-Graph2vecは大規模データセット上で安定したランタイムを持ち、そのパフォーマンス上の優位性は、トレーニングコーパスの成長とともにますます明確になる。実世界のデータセットを用いた広範囲な実験により,提案アルゴリズムは従来のスキップグラムを4～5倍効率で上回り,ランダムウォークサンプリングによる誤差は小さいことがわかった。 Word embedding has become ubiquitous and is widely used in various text mining and natural language processing (NLP) tasks, such as information retrieval, semantic analysis, and machine translation, among many others. Unfortunately, it is prohibitively expensive to train the word embedding in a relatively large corpus. We propose a graph-based word embedding algorithm, called Word-Graph2vec, which converts the large corpus into a word co-occurrence graph, then takes the word sequence samples from this graph by randomly traveling and trains the word embedding on this sampling corpus in the end. We posit that because of the stable vocabulary, relative idioms, and fixed expressions in English, the size and density of the word co-occurrence graph change slightly with the increase in the training corpus. So that Word-Graph2vec has stable runtime on the large scale data set, and its performance advantage becomes more and more obvious with the growth of the training corpus. 
Extensive experiments conducted on real-world datasets show that the proposed algorithm outperforms traditional Skip-Gram by four-five times in terms of efficiency, while the error generated by the random walk sampling is small.	翻訳日:2023-05-16 23:47:17 公開日:2023-05-14
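The pipeline described in the abstract above (corpus to word co-occurrence graph, then random walks, then embedding training on the sampled sequences) can be sketched with the standard library alone. The window size, the edge-weighted walk rule, and all function names below are assumptions made for illustration. The final Skip-Gram training step is left out, since the sampled walks could be handed to any word2vec-style trainer.

```python
import random
from collections import defaultdict


def build_cooccurrence_graph(sentences, window=2):
    """Undirected weighted graph: edge weight = number of co-occurrences within the window."""
    graph = defaultdict(lambda: defaultdict(int))
    for tokens in sentences:
        for i, word in enumerate(tokens):
            for j in range(i + 1, min(i + 1 + window, len(tokens))):
                other = tokens[j]
                if other != word:
                    graph[word][other] += 1
                    graph[other][word] += 1
    return graph


def random_walk(graph, start, length, rng):
    """Walk over the graph, choosing each next word with probability proportional to edge weight."""
    walk = [start]
    while len(walk) < length:
        neighbours = graph[walk[-1]]
        if not neighbours:
            break
        words = list(neighbours)
        weights = [neighbours[w] for w in words]
        walk.append(rng.choices(words, weights=weights, k=1)[0])
    return walk


def sample_corpus(graph, walks_per_word, walk_length, seed=0):
    """Generate the sampled 'sentences' that would replace the original corpus for training."""
    rng = random.Random(seed)
    return [
        random_walk(graph, word, walk_length, rng)
        for word in list(graph)
        for _ in range(walks_per_word)
    ]


toy_sentences = [["the", "cat", "sat"], ["the", "dog", "sat"]]
toy_graph = build_cooccurrence_graph(toy_sentences, window=1)
toy_walks = sample_corpus(toy_graph, walks_per_word=2, walk_length=5)
```

Because the number of walks depends on the vocabulary rather than on the corpus length, the sampled corpus stays roughly constant in size once the graph stops growing. That matches the stable runtime the abstract claims.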
# 注意ネットワークの解釈可能性について On the Interpretability of Attention Networks ( http://arxiv.org/abs/2212.14776v3 ) ライセンス: Link先を確認	Lakshmi Narayan Pandey, Rahul Vashisht and Harish G. Ramaswamy	(参考訳) 注意機構は、いくつかの成功したディープラーニングアーキテクチャのコアコンポーネントを形成し、「出力は入力の小さな(しかし未知の)セグメントにのみ依存する」というキーアイデアに基づいている。画像キャプション生成や言語翻訳などのいくつかの実用的応用では、これはほぼ正しい。注意機構を持つ訓練済みモデルでは、出力に寄与する入力のセグメントをエンコードする中間モジュールの出力が、ネットワークの「推論」を覗く手段としてしばしば使用される。我々は,注意モデルアーキテクチャと併用する場合について,選択依存分類 (SDC) と呼ぶ分類問題の変種に対して,このような概念をより正確に定式化する。このような設定の下で,注意モデルが正確でありながら解釈可能でなくなる様々なエラーモードを示し,そのようなモデルが訓練の結果として実際に生じることを示す。また,この挙動を助長または緩和しうる様々な状況を説明する。最後に,SDCタスクに対する解釈可能性の客観的定義を用いて,スパース性を促進するように設計されたいくつかの注意モデル学習アルゴリズムを評価し,これらのアルゴリズムが解釈可能性の向上に役立つことを示す。 Attention mechanisms form a core component of several successful deep learning architectures, and are based on one key idea: "The output depends only on a small (but unknown) segment of the input." In several practical applications like image captioning and language translation, this is mostly true. In trained models with an attention mechanism, the outputs of an intermediate module that encodes the segment of input responsible for the output is often used as a way to peek into the "reasoning" of the network. We make such a notion more precise for a variant of the classification problem that we term selective dependence classification (SDC) when used with attention model architectures. Under such a setting, we demonstrate various error modes where an attention model can be accurate but fail to be interpretable, and show that such models do occur as a result of training. We illustrate various situations that can accentuate and mitigate this behaviour. Finally, we use our objective definition of interpretability for SDC tasks to evaluate a few attention model learning algorithms designed to encourage sparsity and demonstrate that these algorithms help improve interpretability.	翻訳日:2023-05-16 23:46:00 公開日:2023-05-14
# 深い線形ネットワークによるベイズ補間 Bayesian Interpolation with Deep Linear Networks ( http://arxiv.org/abs/2212.14457v3 ) ライセンス: Link先を確認	Boris Hanin, Alexander Zlokapa	(参考訳) ニューラルネットワークの深さ、幅、データセットサイズがモデル品質にどう影響するかを特徴付けることは、ディープラーニング理論における中心的な問題である。ここでは、ガウス重み付きゼロノイズベイズ推定と負の対数類似度の平均二乗誤差を用いた出力次元1の線形ネットワークの特別な場合の完全な解を与える。任意のトレーニングデータセット、ネットワーク深さ、隠された層幅に対して、単一の複素変数のメロモルフィック特殊関数のクラスであるMeijer-G関数の観点から予測的後およびベイズモデル証拠の非漸近式を求める。これらのmeijer-g関数の新たな漸近展開を通じて、深さ、幅、データセットサイズの共同の役割に関するリッチな新しい図が現れる。線形ネットワークが無限深度で証明可能な最適予測を行うことを示す。データに依存しない無限深度線形ネットワークの後部は、データ依存を最大化する浅層ネットワークのそれと同じである。これは、前者がデータに依存しない場合、より深いネットワークを優先する原則的な理由をもたらす。さらに,データに依存しない先行例では,広域線形ネットワークにおけるベイズモデルエビデンスを無限深度で最大化し,モデル選択における深度増加の因果関係を明らかにする。ネットワーク幅で区切られたデータポイントの数の2倍の隠蔽層数で与えられる有効深度という新たな概念であり、これは大容量データ制限における後部構造を決定する。 Characterizing how neural network depth, width, and dataset size jointly impact model quality is a central problem in deep learning theory. We give here a complete solution in the special case of linear networks with output dimension one trained using zero noise Bayesian inference with Gaussian weight priors and mean squared error as a negative log-likelihood. For any training dataset, network depth, and hidden layer widths, we find non-asymptotic expressions for the predictive posterior and Bayesian model evidence in terms of Meijer-G functions, a class of meromorphic special functions of a single complex variable. Through novel asymptotic expansions of these Meijer-G functions, a rich new picture of the joint role of depth, width, and dataset size emerges. We show that linear networks make provably optimal predictions at infinite depth: the posterior of infinitely deep linear networks with data-agnostic priors is the same as that of shallow networks with evidence-maximizing data-dependent priors. This yields a principled reason to prefer deeper networks when priors are forced to be data-agnostic. 
Moreover, we show that with data-agnostic priors, Bayesian model evidence in wide linear networks is maximized at infinite depth, elucidating the salutary role of increased depth for model selection. Underpinning our results is a novel emergent notion of effective depth, given by the number of hidden layers times the number of data points divided by the network width; this determines the structure of the posterior in the large-data limit.	翻訳日:2023-05-16 23:45:44 公開日:2023-05-14
# 校正の豊かさについて On the Richness of Calibration ( http://arxiv.org/abs/2302.04118v2 ) ライセンス: Link先を確認	Benedikt H\"oltgen and Robert C Williamson	(参考訳) 確率的予測は、観測されたラベル周波数、すなわちキャリブレーションレンズによる比較によって評価することができる。アルゴリズムの公正性に関する最近の研究は、多校正の名のもと、様々なキャリブレーションに基づく目的に注目し始めているが、いまだにかなり制限されている。本稿では,キャリブレーションスコアの設計に関わる選択を明確化し,キャリブレーションによる評価形態を調査し分析する。これらを3つのグループ選択と,グループエラーの集約に関する選択にまとめる。これは、以前に提案されたキャリブレーションスコアを比較するためのフレームワークを提供し、望ましい数学的特性を持つ新しいスコアを定式化するのに役立つ。特に,予測ではなく,入力特徴に基づいてデータポイントをグループ化する可能性について検討し,その利点を正式に示している。また,予め提案した校正スコアを一般化し,グループ誤りに対する適切な凝集関数の空間を特徴付ける。このような集団レベルのスコアを補完し,個人レベルでのキャリブレーションスコアを調査し,グループ化の選択との関係を分析する。人口レベルのスコアに対する公平度逸脱対策の導入と公理化について考察する。グループ化の適切な選択により、これらの新しいグローバルフェアネススコアは(サブ)グループや個人フェアネスの概念を提供することができることを示す。 Probabilistic predictions can be evaluated through comparisons with observed label frequencies, that is, through the lens of calibration. Recent scholarship on algorithmic fairness has started to look at a growing variety of calibration-based objectives under the name of multi-calibration but has still remained fairly restricted. In this paper, we explore and analyse forms of evaluation through calibration by making explicit the choices involved in designing calibration scores. We organise these into three grouping choices and a choice concerning the agglomeration of group errors. This provides a framework for comparing previously proposed calibration scores and helps to formulate novel ones with desirable mathematical properties. In particular, we explore the possibility of grouping datapoints based on their input features rather than on predictions and formally demonstrate advantages of such approaches. We also characterise the space of suitable agglomeration functions for group errors, generalising previously proposed calibration scores. Complementary to such population-level scores, we explore calibration scores at the individual level and analyse their relationship to choices of grouping. 
We draw on these insights to introduce and axiomatise fairness deviation measures for population-level scores. We demonstrate that with appropriate choices of grouping, these novel global fairness scores can provide notions of (sub-)group or individual fairness.	翻訳日:2023-05-16 23:28:23 公開日:2023-05-14
# 不確かさの定量化を伴う物理制約付き運動予測 Physics Constrained Motion Prediction with Uncertainty Quantification ( http://arxiv.org/abs/2302.01060v2 ) ライセンス: Link先を確認	Renukanandan Tumu, Lars Lindemann, Truong Nghiem, Rahul Mangharam	(参考訳) 動的エージェントの動きを予測することは、自律システムの安全性を保証する上で重要なタスクである。特に難しいのは、運動予測アルゴリズムが力学の制約に従い、かつ信頼度の尺度として予測の不確かさを定量化しなければならない点である。本稿では, 代理(サロゲート)力学モデルを用いて, 予測軌道が力学的に実現可能であることを保証する、物理制約付きの運動予測手法を提案する。力学的制約の下での意図予測と軌道予測からなる2段階の統合を提案する。また,広く使われている統計的手法である共形予測 (conformal prediction) を用いて、不確かさを定量化し自動運転に適した予測領域を構築する。物理制約付き運動予測は、自律レーシングのデータセットを用いた実験において、ベースラインと比べてADEが41%、FDEが56%、IoUが19%向上した。 Predicting the motion of dynamic agents is a critical task for guaranteeing the safety of autonomous systems. A particular challenge is that motion prediction algorithms should obey dynamics constraints and quantify prediction uncertainty as a measure of confidence. We present a physics-constrained approach for motion prediction which uses a surrogate dynamical model to ensure that predicted trajectories are dynamically feasible. We propose a two-step integration consisting of intent and trajectory prediction subject to dynamics constraints. We also construct prediction regions that quantify uncertainty and are tailored for autonomous driving by using conformal prediction, a popular statistical tool. Physics Constrained Motion Prediction achieves a 41% better ADE, 56% better FDE, and 19% better IoU over a baseline in experiments using an autonomous racing dataset.	翻訳日:2023-05-16 23:26:29 公開日:2023-05-14
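The abstract above builds prediction regions with conformal prediction. A minimal sketch of the generic split-conformal recipe follows: compute a nonconformity score (here the Euclidean prediction error) on a held-out calibration set, then use the ceil((n + 1)(1 - alpha))-th smallest score as the radius of a ball around each new prediction. The score choice, the ball-shaped region, and the function name are assumptions for illustration. The paper's regions are tailored to driving and may be constructed differently.

```python
import math

import numpy as np


def conformal_radius(cal_pred, cal_true, alpha):
    """Radius r such that ||prediction - truth|| <= r holds with probability >= 1 - alpha.

    The guarantee is the usual split-conformal one and assumes that the calibration
    points and the test point are exchangeable.
    """
    scores = np.sort(np.linalg.norm(cal_pred - cal_true, axis=1))
    n = scores.size
    k = math.ceil((n + 1) * (1.0 - alpha))
    if k > n:
        # Too few calibration points for the requested coverage level.
        return float("inf")
    return float(scores[k - 1])
```

A predicted position `p` would then be reported together with the region of all points within `conformal_radius(...)` of `p`. More calibration data or a larger `alpha` yields a tighter region.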
# Metropolis-adjusted Langevin アルゴリズムによる制約の効率的な処理 Efficiently handling constraints with Metropolis-adjusted Langevin algorithm ( http://arxiv.org/abs/2302.11971v2 ) ライセンス: Link先を確認	Jinyuan Chang, Cheng Yong Tang, Yuanzheng Zhu	(参考訳) 本研究では,対象分布のサポートに制約がある設定において,メトロポリス調整ランジュバンアルゴリズム (MALA) の性能を検討する。得られるマルコフ連鎖を厳密に解析し、その収束を示すとともに、混合時間の上界を導出する。我々の結果は,メトロポリス調整ランジュバンアルゴリズムがこの困難な状況の処理に極めて有効であることを示している。すなわち,得られた混合時間の上界は,受理・棄却 (accept-reject) ステップを持たない競合アルゴリズムについて知られている最良の上界よりも優れている。数値実験もこれらの理論的知見を裏付けており,メトロポリス調整ランジュバンアルゴリズムは,対象分布のサポートに対する制約を扱う際に有望な性能を示す。 In this study, we investigate the performance of the Metropolis-adjusted Langevin algorithm in a setting with constraints on the support of the target distribution. We provide a rigorous analysis of the resulting Markov chain, establishing its convergence and deriving an upper bound for its mixing time. Our results demonstrate that the Metropolis-adjusted Langevin algorithm is highly effective in handling this challenging situation: the mixing time bound we obtain is superior to the best known bounds for competing algorithms without an accept-reject step. Our numerical experiments support these theoretical findings, indicating that the Metropolis-adjusted Langevin algorithm shows promising performance when dealing with constraints on the support of the target distribution.	翻訳日:2023-05-16 23:18:31 公開日:2023-05-14
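To make the role of the accept-reject step concrete, here is a minimal sketch of textbook MALA on a constrained support. The target is a standard Gaussian restricted to the non-negative orthant, and any Langevin proposal that leaves the support has zero target density and is therefore rejected. The target, the step size 0.3, and the function names are assumptions chosen for illustration. The paper's exact setting and analysis are not reproduced here.

```python
import numpy as np


def log_target(x):
    """Unnormalised log density: standard Gaussian restricted to x >= 0 (toy target)."""
    if np.any(x < 0.0):
        return -np.inf
    return -0.5 * float(x @ x)


def grad_log_target(x):
    """Gradient of the log density inside the support."""
    return -x


def log_proposal(y, x, step):
    """Log density (up to a constant) of the Langevin proposal y ~ N(x + step * grad, 2 * step * I)."""
    diff = y - (x + step * grad_log_target(x))
    return -float(diff @ diff) / (4.0 * step)


def mala(x0, step, n_samples, rng):
    """Metropolis-adjusted Langevin algorithm; out-of-support proposals are always rejected."""
    x = np.asarray(x0, dtype=float)
    samples = np.empty((n_samples, x.size))
    for i in range(n_samples):
        noise = rng.standard_normal(x.size)
        y = x + step * grad_log_target(x) + np.sqrt(2.0 * step) * noise
        log_py = log_target(y)
        if np.isfinite(log_py):
            log_ratio = (log_py + log_proposal(x, y, step)
                         - log_target(x) - log_proposal(y, x, step))
            if np.log(rng.uniform()) < log_ratio:
                x = y
        samples[i] = x
    return samples


rng = np.random.default_rng(0)
chain = mala(x0=[1.0, 1.0], step=0.3, n_samples=30000, rng=rng)
```

Because rejection keeps the chain inside the support, no projection or reflection is needed. An unadjusted Langevin scheme would need some such extra mechanism to handle the constraint.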
# Quantum Dueling: an Efficient Solution for Combinatorial Optimization ( http://arxiv.org/abs/2302.10151v3 ) License: see link	Letian Tang, Haorui Wang, Zhengyang Li, Haozhan Tang, Chi Zhang, Shujin Li	This paper presents a new strategy for quantum combinatorial optimization, which we term quantum dueling. In previous algorithms, potential solutions to the given optimization problems were encoded as basis states of the Hilbert space. Quantum dueling, however, doubles the number of qubits used, making the basis states represent a pair of potential solutions in the augmented Hilbert space. Under this representation, we realize that if we can construct quantum oracles that identify one element of such a pair over the other based on the objective function, quantum optimization can be achieved by quantum amplitude amplification. An additional set of parameters is required to compensate for reachability issues. We extensively use classical simulation evidence to test our designs. For intuitively chosen parameters, quantum dueling performs well, though reachability is highly dependent on solution distribution. In this case, the evolution of the success probability is highly regular. Thus, there might be ways to approximate the state evolution mathematically. For optimal parameters, data suggest that quantum dueling requires $O(\sqrt{N})$ accesses to the gates to reach a constant-level success probability threshold and performs well for almost all solution distributions. In addition to potential constant-level optimization compared with the fastest algorithms, quantum dueling can also be used as a subroutine for higher-level quantum algorithms. Moreover, quantum dueling shares similarities with many variational optimization algorithms, most notably QAOA. This suggests that the strategy of quantum dueling might be transplanted into a Hamiltonian setup. In that case, we might obtain viable optimization algorithms for near-term quantum computing, with the added advantage of easier training.	Translation date: 2023-05-16 23:18:18 Publication date: 2023-05-14
# Guided Deep Kernel Learning ( http://arxiv.org/abs/2302.09574v2 ) License: see link	Idan Achituve, Gal Chechik, Ethan Fetaya	Combining Gaussian processes with the expressive power of deep neural networks is commonly done nowadays through deep kernel learning (DKL). Unfortunately, due to the kernel optimization process, this often results in losing their Bayesian benefits. In this study, we present a novel approach for learning deep kernels by utilizing infinite-width neural networks. We propose to use the Neural Network Gaussian Process (NNGP) model as a guide to the DKL model in the optimization process. Our approach harnesses the reliable uncertainty estimation of the NNGPs to adapt the DKL target confidence when it encounters novel data points. As a result, we get the best of both worlds: we leverage the Bayesian behavior of the NNGP, namely its robustness to overfitting and accurate uncertainty estimation, while maintaining the generalization abilities, scalability, and flexibility of deep kernels. Empirically, we show on multiple benchmark datasets of varying sizes and dimensionality that our method is robust to overfitting, has good predictive performance, and provides reliable uncertainty estimations.	Translation date: 2023-05-16 23:17:54 Publication date: 2023-05-14
# Exploiting Sparsity in Pruned Neural Networks to Optimize Large Model Training ( http://arxiv.org/abs/2302.05045v3 ) License: see link	Siddharth Singh, Abhinav Bhatele	Parallel training of neural networks at scale is challenging due to significant overheads arising from communication. Recently, deep learning researchers have developed a variety of pruning algorithms that are capable of pruning (i.e. setting to zero) 80-90% of the parameters in a neural network to yield sparse subnetworks that equal the accuracy of the unpruned parent network. In this work, we propose a novel approach that exploits these sparse subnetworks to optimize the memory utilization and communication in two popular algorithms for parallel deep learning, namely data and inter-layer parallelism. We integrate our approach into AxoNN, a highly scalable framework for parallel deep learning that relies on data and inter-layer parallelism, and demonstrate the reduction in communication time and memory utilization. On 512 NVIDIA V100 GPUs, our optimizations reduce the memory consumption of a 2.7 billion parameter model by 74%, and the total communication time by 40%, thus providing an overall speedup of 34% over AxoNN, 32% over DeepSpeed-3D and 46% over Sputnik, a sparse matrix computation baseline.	Translation date: 2023-05-16 23:16:39 Publication date: 2023-05-14
# A Hybrid Tensor-Expert-Data Parallelism Approach to Optimize Mixture-of-Experts Training ( http://arxiv.org/abs/2303.06318v2 ) License: see link	Siddharth Singh, Olatunji Ruwase, Ammar Ahmad Awan, Samyam Rajbhandari, Yuxiong He, Abhinav Bhatele	Mixture-of-Experts (MoE) is a neural network architecture that adds sparsely activated expert blocks to a base model, increasing the number of parameters without impacting computational costs. However, current distributed deep learning frameworks are limited in their ability to train high-quality MoE models with large base models. In this work, we present DeepSpeed-TED, a novel, three-dimensional, hybrid parallel algorithm that combines data, tensor, and expert parallelism to enable the training of MoE models with 4 to 8x larger base models than the current state-of-the-art. We also describe memory optimizations in the optimizer step, and communication optimizations that eliminate unnecessary data movement. We implement our approach in DeepSpeed and achieve speedups of 26% over a baseline (i.e. without our communication optimizations) when training a 40 billion parameter MoE model (6.7 billion base model with 16 experts) on 128 V100 GPUs.	Translation date: 2023-05-16 23:07:55 Publication date: 2023-05-14
# Quantum Stochastic Thermodynamics: a Semiclassical Theory in Phase Space ( http://arxiv.org/abs/2303.05935v2 ) License: see link	Zhaoyu Fei	A formalism for quantum many-body systems is proposed through semiclassical treatment in phase space, allowing us to establish a stochastic thermodynamics incorporating quantum statistics. Specifically, we utilize a stochastic Fokker-Planck equation as the dynamics at the mesoscopic level. Here, the noise term characterizing the fluctuation of the flux density accounts for the finite-N effects of random collisions between the system and the reservoir. Accordingly, the stationary solution is a quasi-equilibrium state in a canonical system. We define stochastic thermodynamic quantities based on trajectories of phase-space distribution. The conservation law of energy, H-theorem and fluctuation theorems are therefore obtained. Our work sets an alternative formalism of quantum stochastic thermodynamics that is independent of the two-point measurement scheme. The numerous projective measurements of quantum systems are replaced by the sampling of the phase-space distribution, offering hope for experimental verifications in the future.	Translation date: 2023-05-16 23:07:21 Publication date: 2023-05-14
# Restoration based Generative Models ( http://arxiv.org/abs/2303.05456v2 ) License: see link	Jaemoo Choi, Yesom Park, Myungjoo Kang	Denoising diffusion models (DDMs) have recently attracted increasing attention by showing impressive synthesis quality. DDMs are built on a diffusion process that pushes data to the noise distribution and the models learn to denoise. In this paper, we establish the interpretation of DDMs in terms of image restoration (IR). Integrating IR literature allows us to use an alternative objective and diverse forward processes, not confining to the diffusion process. By imposing prior knowledge on the loss function grounded on MAP-based estimation, we eliminate the need for the expensive sampling of DDMs. Also, we propose a multi-scale training, which improves the performance compared to the diffusion process, by taking advantage of the flexibility of the forward process. Experimental results demonstrate that our model improves the quality and efficiency of both training and inference. Furthermore, we show the applicability of our model to inverse problems. We believe that our framework paves the way for designing a new type of flexible general generative model.	Translation date: 2023-05-16 23:07:09 Publication date: 2023-05-14
# CoolPINNs: A Physics-informed Neural Network Modeling of Active Cooling in Vascular Systems ( http://arxiv.org/abs/2303.05300v2 ) License: see link	N. V. Jagtap, M. K. Mudunuru, and K. B. Nakshatrala	Emerging technologies like hypersonic aircraft, space exploration vehicles, and batteries avail fluid circulation in embedded microvasculatures for efficient thermal regulation. Modeling is vital during these engineered systems' design and operational phases. However, many challenges exist in developing a modeling framework. What is lacking is an accurate framework that (i) captures sharp jumps in the thermal flux across complex vasculature layouts, (ii) deals with oblique derivatives (involving tangential and normal components), (iii) handles nonlinearity because of radiative heat transfer, (iv) provides a high-speed forecast for real-time monitoring, and (v) facilitates robust inverse modeling. This paper addresses these challenges by availing the power of physics-informed neural networks (PINNs). We develop a fast, reliable, and accurate Scientific Machine Learning (SciML) framework for vascular-based thermal regulation, called CoolPINNs: a PINNs-based modeling framework for active cooling. The proposed mesh-less framework elegantly overcomes all the mentioned challenges. The significance of the reported research is multi-fold. First, the framework is valuable for real-time monitoring of thermal regulatory systems because of rapid forecasting. Second, researchers can address complex thermoregulation designs inasmuch as the approach is mesh-less. Finally, the framework facilitates systematic parameter identification and inverse modeling studies, perhaps the current framework's most significant utility.	Translation date: 2023-05-16 23:06:56 Publication date: 2023-05-14
# QuickSRNet: Plain Single-Image Super-Resolution Architecture for Faster Inference on Mobile Platforms ( http://arxiv.org/abs/2303.04336v2 ) License: see link	Guillaume Berger and Manik Dhingra and Antoine Mercier and Yashesh Savani and Sunny Panchal and Fatih Porikli	In this work, we present QuickSRNet, an efficient super-resolution architecture for real-time applications on mobile platforms. Super-resolution clarifies, sharpens, and upscales an image to higher resolution. Applications such as gaming and video playback along with the ever-improving display capabilities of TVs, smartphones, and VR headsets are driving the need for efficient upscaling solutions. While existing deep learning-based super-resolution approaches achieve impressive results in terms of visual quality, enabling real-time DL-based super-resolution on mobile devices with compute, thermal, and power constraints is challenging. To address these challenges, we propose QuickSRNet, a simple yet effective architecture that provides better accuracy-to-latency trade-offs than existing neural architectures for single-image super resolution. We present training tricks to speed up existing residual-based super-resolution architectures while maintaining robustness to quantization. Our proposed architecture produces 1080p outputs via 2x upscaling in 2.2 ms on a modern smartphone, making it ideal for high-fps real-time applications.	Translation date: 2023-05-16 23:06:41 Publication date: 2023-05-14
# Learning Reward Machines in Cooperative Multi-Agent Tasks ( http://arxiv.org/abs/2303.14061v3 ) License: see link	Leo Ardon, Daniel Furelos-Blanco, Alessandra Russo	This paper presents a novel approach to Multi-Agent Reinforcement Learning (MARL) that combines cooperative task decomposition with the learning of reward machines (RMs) encoding the structure of the sub-tasks. The proposed method helps deal with the non-Markovian nature of the rewards in partially observable environments and improves the interpretability of the learnt policies required to complete the cooperative task. The RMs associated with each sub-task are learnt in a decentralised manner and then used to guide the behaviour of each agent. By doing so, the complexity of a cooperative multi-agent problem is reduced, allowing for more effective learning. The results suggest that our approach is a promising direction for future research in MARL, especially in complex environments with large state spaces and multiple agents.	Translation date: 2023-05-16 23:00:16 Publication date: 2023-05-14
# The Multiple-Access Channel with Entangled Transmitters ( http://arxiv.org/abs/2303.10456v4 ) License: see link	Uzi Pereg, Christian Deppe, and Holger Boche	Communication over a classical multiple-access channel (MAC) with entanglement resources is considered, whereby two transmitters share entanglement resources a priori before communication begins. Leditzky et al. (2020) presented an example of a classical MAC, defined in terms of a pseudo telepathy game, such that the sum rate with entangled transmitters is strictly higher than the best achievable sum rate without such resources. Here, we establish inner and outer bounds on the capacity region for the general MAC with entangled transmitters, and show that the previous result can be obtained as a special case. It has long been known that the capacity region of the classical MAC under a message-average error criterion can be strictly larger than with a maximal error criterion (Dueck, 1978). We observe that given entanglement resources, the regions coincide. Furthermore, we address the combined setting of entanglement resources and conferencing, where the transmitters can also communicate with each other over rate-limited links. Using superdense coding, entanglement can double the conferencing rate.	Translation date: 2023-05-16 22:58:41 Publication date: 2023-05-14
# Welfare Maximization Algorithm for Solving Budget-Constrained Multi-Component POMDPs ( http://arxiv.org/abs/2303.10302v2 ) License: see link	Manav Vora, Pranay Thangeda, Michael N. Grussing, Melkior Ornik	Partially Observable Markov Decision Processes (POMDPs) provide an efficient way to model real-world sequential decision making processes. Motivated by the problem of maintenance and inspection of a group of infrastructure components with independent dynamics, this paper presents an algorithm to find the optimal policy for a multi-component budget-constrained POMDP. We first introduce a budgeted-POMDP model (b-POMDP) which enables us to find the optimal policy for a POMDP while adhering to budget constraints. Next, we prove that the value function or maximal collected reward for a b-POMDP is a concave function of the budget for the finite horizon case. Our second contribution is an algorithm to calculate the optimal policy for a multi-component budget-constrained POMDP by finding the optimal budget split among the individual component POMDPs. The optimal budget split is posed as a welfare maximization problem and the solution is computed by exploiting the concave nature of the value function. We illustrate the effectiveness of the proposed algorithm by proposing a maintenance and inspection policy for a group of real-world infrastructure components with different deterioration dynamics, inspection and maintenance costs. We show that the proposed algorithm vastly outperforms the policy currently used in practice.	Translation date: 2023-05-16 22:58:23 Publication date: 2023-05-14
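The abstract says the budget split exploits concavity of each component's value function but does not spell out the procedure. One way concavity helps, shown here purely as an assumed illustration and not as the paper's algorithm, is that with discrete budget units and concave per-component values, handing out units one at a time to the component with the largest marginal gain maximizes the total. The tables and the name `split_budget` below are hypothetical.

```python
import heapq


def split_budget(value_tables, budget):
    """Greedy split of integer budget units across components.

    value_tables[i][b] is the (assumed concave, nondecreasing) value of
    component i when it receives b budget units.
    Returns (allocation per component, total value).
    """
    alloc = [0] * len(value_tables)
    heap = []  # max-heap on marginal gain, stored as negated values
    for i, table in enumerate(value_tables):
        if len(table) > 1:
            heapq.heappush(heap, (-(table[1] - table[0]), i))
    for _ in range(budget):
        if not heap:
            break  # every component is already at its maximum budget
        _, i = heapq.heappop(heap)
        alloc[i] += 1
        b, table = alloc[i], value_tables[i]
        if b + 1 < len(table):
            # Queue the next marginal gain of this component.
            heapq.heappush(heap, (-(table[b + 1] - table[b]), i))
    total = sum(table[a] for table, a in zip(value_tables, alloc))
    return alloc, total


# Two hypothetical components with diminishing returns.
tables = [[0, 5, 8, 9], [0, 4, 6.5, 8]]
allocation, total_value = split_budget(tables, budget=3)
```

Without concavity the greedy choice can be suboptimal, which is why a concavity result of the kind stated in the abstract matters for this style of allocation.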
# Interpretable statistical representations of neural population dynamics and geometry ( http://arxiv.org/abs/2304.03376v2 ) License: see link	Adam Gosztolai, Robert L. Peach, Alexis Arnaudon, Mauricio Barahona, Pierre Vandergheynst	The dynamics of neuron populations during diverse tasks often evolve on low-dimensional manifolds. However, it remains challenging to discern the contributions of geometry and dynamics for encoding relevant behavioural variables. Here, we introduce an unsupervised geometric deep learning framework for representing non-linear dynamical systems based on statistical distributions of local phase portrait features. Our method provides robust geometry-aware or geometry-agnostic representations for the unbiased comparison of dynamics based on measured trajectories. We demonstrate that our statistical representation can generalise across neural network instances to discriminate computational mechanisms, obtain interpretable embeddings of neural dynamics in a primate reaching task with geometric correspondence to hand kinematics, and develop a decoding algorithm with state-of-the-art accuracy. Our results highlight the importance of using the intrinsic manifold structure over temporal information to develop better decoding algorithms and assimilate data across experiments.	Translation date: 2023-05-16 22:49:49 Publication date: 2023-05-14
# Longitudinal Multimodal Transformer Integrating Imaging and Latent Clinical Signatures From Routine EHRs for Pulmonary Nodule Classification ( http://arxiv.org/abs/2304.02836v3 ) License: see link	Thomas Z. Li, John M. Still, Kaiwen Xu, Ho Hin Lee, Leon Y. Cai, Aravind R. Krishnan, Riqiang Gao, Mirza S. Khan, Sanja Antic, Michael Kammer, Kim L. Sandler, Fabien Maldonado, Bennett A. Landman, Thomas A. Lasko	The accuracy of predictive models for solitary pulmonary nodule (SPN) diagnosis can be greatly increased by incorporating repeat imaging and medical context, such as electronic health records (EHRs). However, clinically routine modalities such as imaging and diagnostic codes can be asynchronous and irregularly sampled over different time scales which are obstacles to longitudinal multimodal learning. In this work, we propose a transformer-based multimodal strategy to integrate repeat imaging with longitudinal clinical signatures from routinely collected EHRs for SPN classification. We perform unsupervised disentanglement of latent clinical signatures and leverage time-distance scaled self-attention to jointly learn from clinical signatures expressions and chest computed tomography (CT) scans. Our classifier is pretrained on 2,668 scans from a public dataset and 1,149 subjects with longitudinal chest CTs, billing codes, medications, and laboratory tests from EHRs of our home institution. Evaluation on 227 subjects with challenging SPNs revealed a significant AUC improvement over a longitudinal multimodal baseline (0.824 vs 0.752 AUC), as well as improvements over a single cross-section multimodal scenario (0.809 AUC) and a longitudinal imaging-only scenario (0.741 AUC). This work demonstrates significant advantages with a novel approach for co-learning longitudinal imaging and non-imaging phenotypes with transformers.	Translation date: 2023-05-16 22:49:27 Publication date: 2023-05-14
# Accelerate Support Vector Clustering via Spectrum-Preserving Data Compression ( http://arxiv.org/abs/2304.09868v3 ) License: see link	Yuxuan Song, Yongyu Wang	This paper proposes a novel framework for accelerating support vector clustering. The proposed method first computes much smaller compressed data sets while preserving the key cluster properties of the original data sets based on a novel spectral data compression approach. Then, the resultant spectrally-compressed data sets are leveraged for the development of a fast and high-quality algorithm for support vector clustering. We conducted extensive experiments using real-world data sets and obtained very promising results. The proposed method allows us to achieve 100X and 115X speedups over the state-of-the-art SVC method on the Pendigits and USPS data sets, respectively, while achieving even better clustering quality. To the best of our knowledge, this represents the first practical method for high-quality and fast SVC on large-scale real-world data sets.	Translation date: 2023-05-16 21:03:49 Publication date: 2023-05-14
# Differentiable Neural Networks with RePU Activation: with Applications to Score Estimation and Isotonic Regression ( http://arxiv.org/abs/2305.00608v2 ) License: see link	Guohao Shen, Yuling Jiao, Yuanyuan Lin, and Jian Huang	We study the properties of differentiable neural networks activated by rectified power unit (RePU) functions. We show that the partial derivatives of RePU neural networks can be represented by RePUs mixed-activated networks and derive upper bounds for the complexity of the function class of derivatives of RePUs networks. We establish error bounds for simultaneously approximating $C^s$ smooth functions and their derivatives using RePU-activated deep neural networks. Furthermore, we derive improved approximation error bounds when data has an approximate low-dimensional support, demonstrating the ability of RePU networks to mitigate the curse of dimensionality. To illustrate the usefulness of our results, we consider a deep score matching estimator (DSME) and propose a penalized deep isotonic regression (PDIR) using RePU networks. We establish non-asymptotic excess risk bounds for DSME and PDIR under the assumption that the target functions belong to a class of $C^s$ smooth functions. We also show that PDIR has a robustness property in the sense that it is consistent with vanishing penalty parameters even when the monotonicity assumption is not satisfied. Furthermore, if the data distribution is supported on an approximate low-dimensional manifold, we show that DSME and PDIR can mitigate the curse of dimensionality.	Translation date: 2023-05-16 20:54:40 Publication date: 2023-05-14
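A small sketch may help with the claim that derivatives of RePU networks are again expressible with RePU-type activations. Assuming the usual definition, RePU of order p is max(0, x) raised to the power p, and for p >= 2 its derivative is p times the RePU of order p - 1. The function names are illustrative; this shows only the scalar activation, not a network.

```python
def repu(x, p):
    """Rectified power unit of order p: max(0, x) ** p."""
    return max(0.0, x) ** p


def repu_grad(x, p):
    """Derivative of the order-p RePU, itself a scaled RePU of order p - 1."""
    if p < 2:
        # Order 1 is ReLU, which is not differentiable at 0; out of scope here.
        raise ValueError("this sketch assumes p >= 2")
    return p * repu(x, p - 1)


value = repu(2.0, 3)        # 2 ** 3
slope = repu_grad(2.0, 3)   # 3 * 2 ** 2
```

Because the derivative stays inside the same family with a lower order, differentiating a network built from these units yields a network that mixes RePU orders, which is consistent with the "mixed-activated" wording in the abstract.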
# Improve Video Representation with Temporal Adversarial Augmentation ( http://arxiv.org/abs/2304.14601v2 ) License: see link	Jinhao Duan, Quanfu Fan, Hao Cheng, Xiaoshuang Shi, Kaidi Xu	Recent works reveal that adversarial augmentation benefits the generalization of neural networks (NNs) if used in an appropriate manner. In this paper, we introduce Temporal Adversarial Augmentation (TA), a novel video augmentation technique that utilizes temporal attention. Unlike conventional adversarial augmentation, TA is specifically designed to shift the attention distributions of neural networks with respect to video clips by maximizing a temporal-related loss function. We demonstrate that TA will obtain diverse temporal views, which significantly affect the focus of neural networks. Training with these examples remedies the flaw of unbalanced temporal information perception and enhances the ability to defend against temporal shifts, ultimately leading to better generalization. To leverage TA, we propose Temporal Video Adversarial Fine-tuning (TAF) framework for improving video representations. TAF is a model-agnostic, generic, and interpretability-friendly training strategy. We evaluate TAF with four powerful models (TSM, GST, TAM, and TPN) over three challenging temporal-related benchmarks (Something-something V1&V2 and diving48). Experimental results demonstrate that TAF effectively improves the test accuracy of these models with notable margins without introducing additional parameters or computational costs. As a byproduct, TAF also improves the robustness under out-of-distribution (OOD) settings. Code is available at https://github.com/jinhaoduan/TAF.	Translation date: 2023-05-16 20:54:07 Publication date: 2023-05-14
# PoseVocab: Learning Joint-structured Pose Embeddings for Human Avatar Modeling ( http://arxiv.org/abs/2304.13006v2 ) License: see link	Zhe Li, Zerong Zheng, Yuxiao Liu, Boyao Zhou, Yebin Liu	Creating pose-driven human avatars is about modeling the mapping from the low-frequency driving pose to high-frequency dynamic human appearances, so an effective pose encoding method that can encode high-fidelity human details is essential to human avatar modeling. To this end, we present PoseVocab, a novel pose encoding method that encourages the network to discover the optimal pose embeddings for learning the dynamic human appearance. Given multi-view RGB videos of a character, PoseVocab constructs key poses and latent embeddings based on the training poses. To achieve pose generalization and temporal consistency, we sample key rotations in $so(3)$ of each joint rather than the global pose vectors, and assign a pose embedding to each sampled key rotation. These joint-structured pose embeddings not only encode the dynamic appearances under different key poses, but also factorize the global pose embedding into joint-structured ones to better learn the appearance variation related to the motion of each joint. To improve the representation ability of the pose embedding while maintaining memory efficiency, we introduce feature lines, a compact yet effective 3D representation, to model more fine-grained details of human appearances. Furthermore, given a query pose and a spatial position, a hierarchical query strategy is introduced to interpolate pose embeddings and acquire the conditional pose feature for dynamic human synthesis. Overall, PoseVocab effectively encodes the dynamic details of human appearance and enables realistic and generalized animation under novel poses. Experiments show that our method outperforms other state-of-the-art baselines both qualitatively and quantitatively in terms of synthesis quality. Code is available at https://github.com/lizhe00/PoseVocab.	Translation date: 2023-05-16 20:53:09 Publication date: 2023-05-14
# 量子ビットルーティングのアルゴリズム理論 Algorithmic Theory of Qubit Routing ( http://arxiv.org/abs/2305.02059v2 ) ライセンス: Link先を確認	Takehiro Ito, Naonori Kakimura, Naoyuki Kamiyama, Yusuke Kobayashi, Yoshio Okamoto	(参考訳) 量子ビットルーティング問題(qubit routing problem)またはスワップ最小化問題(swap minimization problem)は、量子プログラムのコンパイラの設計において生じる(古典的な)組合せ最適化問題である。既存の研究の多くは実用的側面を考察しているが,本研究では理論計算機科学の立場から量子ビットルーティング問題を研究する。我々は、グラフトポロジがパスである量子コンピュータの線形最近傍(LNN)アーキテクチャに焦点を当てる。結果は3つある。 (1) 量子ビットルーティング問題はNP困難であることを証明する。 (2) 2量子ビットゲートの数がパラメータである場合の固定パラメータアルゴリズムを与える。 (3) 各量子ビットが高々1つの2量子ビットゲートに関与している場合の多項式時間アルゴリズムを与える。 The qubit routing problem, also known as the swap minimization problem, is a (classical) combinatorial optimization problem that arises in the design of compilers of quantum programs. We study the qubit routing problem from the viewpoint of theoretical computer science, while most of the existing studies investigated the practical aspects. We concentrate on the linear nearest neighbor (LNN) architectures of quantum computers, in which the graph topology is a path. Our results are three-fold. (1) We prove that the qubit routing problem is NP-hard. (2) We give a fixed-parameter algorithm when the number of two-qubit gates is a parameter. (3) We give a polynomial-time algorithm when each qubit is involved in at most one two-qubit gate.	翻訳日:2023-05-16 20:43:56 公開日:2023-05-14
# 教師なし深部FCDDを用いた農村鉄道診断のための木造スリーパー劣化検出 Wooden Sleeper Deterioration Detection for Rural Railway Prognostics Using Unsupervised Deeper FCDDs ( http://arxiv.org/abs/2305.05103v3 ) ライセンス: Link先を確認	Takato Yasuno, Masahiro Okano, and Junichiro Fujii	(参考訳) 日々の鉄道運行における利用者の安全確保は、鉄道管理者にとって不可欠である。この取り組みを支援するため、トップカメラやサイドカメラ、GPS測位システムにより、欠陥の定期検査の自動化や鉄道部品の劣化状況の評価が進展している。しかし,劣化状態に関するデータ収集には時間を要する可能性があり,発生頻度が時間的に極端に不均衡であるため,データ取得の繰り返しが必要となる。教師付き学習では、欠陥のある生画像と注釈付きラベルを含む何千ものペアデータセットが必要である。一方,一クラス分類アプローチには,正常な特徴と異常な特徴を学習するパラメータの最適化に必要な画像が少ないという利点がある。より深いFCDD(fully-convolutional data description)は,構造物のコンクリート・鋼部材や,災害時の倒木・木造建築物倒壊など,いくつかの損傷データセットに適用可能であった。しかし,鉄道部品に適用可能かどうかはまだ分かっていない。本研究では, 欠陥のある鉄道部品に対して,より深いFCDDを用いた一クラス損傷分類を自動化するための予後識別器パイプラインを考案した。また,畳み込みニューラルネットワーク(CNN)に基づく,より深いバックボーンと受容野の感度解析を行った。さらに, 転置ガウスアップサンプリングを用いて, 欠陥のある鉄道の特徴を可視化した。農村鉄道における木製スリーパー劣化を含む前方視の鉄道線路の映像取得データセットを用いて,鉄道検査への適用を実証した。最後に, 予後モニタリングに対する本アプローチの有用性と,鉄道部品検査に関する今後の課題について検討した。 Maintaining high standards for user safety during daily railway operations is crucial for railway managers. To aid in this endeavor, top- or side-view cameras and GPS positioning systems have facilitated progress toward automating periodic inspections of defective features and assessing the deteriorating status of railway components. However, collecting data on deteriorated status can be time-consuming and requires repeated data acquisition because of the extreme temporal occurrence imbalance. In supervised learning, thousands of paired data sets containing defective raw images and annotated labels are required. However, the one-class classification approach offers the advantage of requiring fewer images to optimize parameters for training normal and anomalous features. The deeper fully-convolutional data descriptions (FCDDs) were applicable to several damage data sets of concrete/steel components in structures, as well as fallen trees and wooden building collapses in disasters. However, it is not yet known whether they are feasible for railway components. 
In this study, we devised a prognostic discriminator pipeline to automate one-class damage classification using the deeper FCDDs for defective railway components. We also performed sensitivity analysis of the deeper backbone and receptive field based on convolutional neural networks (CNNs). Furthermore, we visualized defective railway features by using transposed Gaussian upsampling. We demonstrated our application to railway inspection using a video acquisition dataset of railway track in forward view that contains wooden sleeper deterioration in rural railways. Finally, we examined the usability of our approach for prognostic monitoring and future work on railway component inspection.	翻訳日:2023-05-16 20:35:31 公開日:2023-05-14
# ANALOGICAL - 大規模言語モデルのための長文の類推に関する新しいベンチマーク ANALOGICAL - A New Benchmark for Analogy of Long Text for Large Language Models ( http://arxiv.org/abs/2305.05050v2 ) ライセンス: Link先を確認	Thilini Wijesiriwardene, Ruwan Wickramarachchi, Bimal G. Gajera, Shreeyash Mukul Gowaikar, Chandan Gupta, Aman Chadha, Aishwarya Naresh Reganti, Amit Sheth, Amitava Das	(参考訳) 過去10年間、単語レベルの類推という形での類推は、word2vecのような単語埋め込み手法の品質を評価する内在的尺度として重要な役割を果たしてきた。しかし、現代の大規模言語モデル(LLM)は、GLUEやSuperGLUEのようなベンチマークに基づく外在的尺度で主に評価されており、LLMが長いテキスト間の類推を導けるかどうかについての研究はわずかしかない。本稿では,6段階の複雑さを持つ長文の類推の分類体系にわたって,LLMを内在的に評価する新しいベンチマークであるANALOGICALを提案する。 (i)単語、 (ii)単語対文、 (iii)統語、 (iv)否定、 (v)含意、 (vi)メタファー。 13のデータセットと3つの異なる距離尺度を用いて、意味ベクトル空間における類推対を識別する8つのLLMの能力を評価する。我々の評価では,類推の分類体系を上がるにつれて,LLMが類推を識別することがますます困難になることがわかった。 Over the past decade, analogies, in the form of word-level analogies, have played a significant role as an intrinsic measure of evaluating the quality of word embedding methods such as word2vec. Modern large language models (LLMs), however, are primarily evaluated on extrinsic measures based on benchmarks such as GLUE and SuperGLUE, and there are only a few investigations on whether LLMs can draw analogies between long texts. In this paper, we present ANALOGICAL, a new benchmark to intrinsically evaluate LLMs across a taxonomy of analogies of long text with six levels of complexity -- (i) word, (ii) word vs. sentence, (iii) syntactic, (iv) negation, (v) entailment, and (vi) metaphor. Using thirteen datasets and three different distance measures, we evaluate the abilities of eight LLMs in identifying analogical pairs in the semantic vector space. Our evaluation finds that it is increasingly challenging for LLMs to identify analogies when going up the analogy taxonomy.	翻訳日:2023-05-16 20:35:07 公開日:2023-05-14
# 機械アンラーニングの展望を探る : 総合的な調査と分類学 Exploring the Landscape of Machine Unlearning: A Comprehensive Survey and Taxonomy ( http://arxiv.org/abs/2305.06360v2 ) ライセンス: Link先を確認	Thanveer Shaik, Xiaohui Tao, Haoran Xie, Lin Li, Xiaofeng Zhu, and Qing Li	(参考訳) 機械学習(ML)モデルによる予測の削除や修正の必要性から、機械アンラーニング(MU)が注目を集めている。モデルの訓練はより効率的で正確になっているが、過去に学習した情報をアンラーニングすることの重要性は、プライバシやセキュリティ、公正性といった分野でますます高まっている。本稿では,データ削除,摂動,モデル更新など,現在の最先端技術とアプローチを網羅したMUの包括的な調査を行う。また、一般的に使用されるメトリクスやデータセットも提示する。さらに、攻撃の高度化、標準化、転送可能性、解釈可能性、トレーニングデータ、リソース制約など、対処すべき課題を強調している。本稿の貢献には,MUの潜在的メリットとその今後の方向性についての議論が含まれる。さらに、機械学習モデルがユーザの信頼を維持しながら変化する状況に適応できるように、研究者や実践者がアンラーニングの技術を探求し、改善し続ける必要性を強調する。特に大量の個人データを含むさまざまな領域でAIの重要性が増す中、人工知能(AI)をより信頼性が高く透明なものにするうえで、アンラーニングの重要性はさらに強調される。 Machine unlearning (MU) is gaining increasing attention due to the need to remove or modify predictions made by machine learning (ML) models. While training models have become more efficient and accurate, the importance of unlearning previously learned information has become increasingly significant in fields such as privacy, security, and fairness. This paper presents a comprehensive survey of MU, covering current state-of-the-art techniques and approaches, including data deletion, perturbation, and model updates. In addition, commonly used metrics and datasets are also presented. The paper also highlights the challenges that need to be addressed, including attack sophistication, standardization, transferability, interpretability, training data, and resource constraints. The contributions of this paper include discussions about the potential benefits of MU and its future directions. Additionally, the paper emphasizes the need for researchers and practitioners to continue exploring and refining unlearning techniques to ensure that ML models can adapt to changing circumstances while maintaining user trust. 
The importance of unlearning is further highlighted in making Artificial Intelligence (AI) more trustworthy and transparent, especially with the increasing importance of AI in various domains that involve large amounts of personal user data.	翻訳日:2023-05-16 20:26:09 公開日:2023-05-14
# 事例依存ラベル雑音学習におけるラベルの価値の再考 Rethinking the Value of Labels for Instance-Dependent Label Noise Learning ( http://arxiv.org/abs/2305.06247v2 ) ライセンス: Link先を確認	Hanwen Deng, Weijia Zhang, Min-Ling Zhang	(参考訳) ラベルノイズは大規模データセットに広く存在し、ディープラーニングアルゴリズムの性能を著しく劣化させる。インスタンス依存ノイズ遷移行列の識別不能性のため、ほとんどの既存のアルゴリズムは、ノイズラベル生成プロセスがインスタンスの特徴とは独立であると仮定することでこの問題に対処する。残念ながら、実世界のアプリケーションにおけるノイズの多いラベルは、しばしば真のラベルと特徴の両方に依存する。本研究では,ノイズ遷移行列を明示的にモデル化することを避ける新しい深層生成モデルを用いて,インスタンス依存ラベルノイズに取り組む。本アルゴリズムは,因果表現学習を活用し,データから高レベルのコンテンツとスタイルの潜在要因を同時に識別する。構造的因果モデルを用いてノイズラベルの監督情報を活用することにより,幅広い合成および実世界のインスタンス依存ラベルノイズデータセットに対する実験的評価において,提案アルゴリズムが最先端の手法を大幅に上回ることを示す。 Label noise widely exists in large-scale datasets and significantly degrades the performance of deep learning algorithms. Due to the non-identifiability of the instance-dependent noise transition matrix, most existing algorithms address the problem by assuming the noisy label generation process to be independent of the instance features. Unfortunately, noisy labels in real-world applications often depend on both the true label and the features. In this work, we tackle instance-dependent label noise with a novel deep generative model that avoids explicitly modeling the noise transition matrix. Our algorithm leverages causal representation learning and simultaneously identifies the high-level content and style latent factors from the data. By exploiting the supervision information of noisy labels with structural causal models, our empirical evaluations on a wide range of synthetic and real-world instance-dependent label noise datasets demonstrate that the proposed algorithm significantly outperforms the state-of-the-art counterparts.	翻訳日:2023-05-16 20:25:52 公開日:2023-05-14
# 善意を超えて:社会善のためのNLPの研究ランドスケープを報告 Beyond Good Intentions: Reporting the Research Landscape of NLP for Social Good ( http://arxiv.org/abs/2305.05471v2 ) ライセンス: Link先を確認	Fernando Gonzalez, Zhijing Jin, Bernhard Schölkopf, Tom Hope, Mrinmaya Sachan, Rada Mihalcea	(参考訳) 自然言語処理(NLP)の最近の進歩により、様々なユースケースにまたがって多数のアプリケーションが登場した。 NLP応用の多さの中で、NLP for Social Good (NLP4SG) の最近の取り組みに則って、多くの学術研究者は、社会に良い影響を与える仕事を行う動機がある。しかし、研究者が今日の大きな社会問題にどのように取り組んでいるかは必ずしも明らかではない。そこで本稿では,NLP4SGPAPERSという,NLP4SG論文を識別し,NLP4SGランドスケープを特徴付ける3つの関連タスクを持つ科学データセットを紹介する。(1)社会問題に対処する論文を識別し,(2)対応する国連の持続可能な開発目標(SDGs)にマッピングし,(3)解決している課題と方法を特定する。現状のNLPモデルを用いて、これらのタスクに対処し、ACLアンソロジー全体で使用することにより、研究者がNLP4SGの分野を概観する可視化ワークスペースを提供する。私たちのwebサイトはhttps://nlp4sg.vercel.appで閲覧できます。私たちはデータをhttps://huggingface.co/datasets/feradauto/NLP4SGPapersで、コードをhttps://github.com/feradauto/nlp4sgでリリースした。 With the recent advances in natural language processing (NLP), a vast number of applications have emerged across various use cases. Among the plethora of NLP applications, many academic researchers are motivated to do work that has a positive social impact, in line with the recent initiatives of NLP for Social Good (NLP4SG). However, it is not always obvious to researchers how their research efforts are tackling today's big social problems. Thus, in this paper, we introduce NLP4SGPAPERS, a scientific dataset with three associated tasks that can help identify NLP4SG papers and characterize the NLP4SG landscape by: (1) identifying the papers that address a social problem, (2) mapping them to the corresponding UN Sustainable Development Goals (SDGs), and (3) identifying the task they are solving and the methods they are using. Using state-of-the-art NLP models, we address each of these tasks and use them on the entire ACL Anthology, resulting in a visualization workspace that gives researchers a comprehensive overview of the field of NLP4SG. Our website is available at https://nlp4sg.vercel.app . 
We released our data at https://huggingface.co/datasets/feradauto/NLP4SGPapers and code at https://github.com/feradauto/nlp4sg .	翻訳日:2023-05-16 20:24:17 公開日:2023-05-14
# 視覚変換器の堅牢性向上について:防御拡散 On enhancing the robustness of Vision Transformers: Defensive Diffusion ( http://arxiv.org/abs/2305.08031v1 ) ライセンス: Link先を確認	Raza Imam, Muhammad Huzaifa, and Mohammed El-Amine Azz	(参考訳) 医療データのプライバシーと機密性は、医療の現場において最も重要である。 SOTAビジョンモデルであるViTは、トレーニングのために大量の患者データに依存しており、データセキュリティと不正アクセスの可能性が懸念される。敵はViTの脆弱性を利用して機密性の高い患者情報を抽出し、患者のプライバシーを侵害する可能性がある。この研究は、医療応用におけるViTの信頼性と確実性を確保するために、これらの脆弱性に対処する。本研究では,元画像中の攻撃者による敵対的ノイズを除去する敵対的浄化器として,防御拡散手法を導入した。拡散モデルのデノイジング能力を利用することで、逆拡散過程を用いて、攻撃サンプルから敵対的ノイズを効果的に除去し、その結果、よりクリーンな画像がViTブロックに供給される。本研究は,画像からの攻撃非依存な敵対的ノイズ除去における拡散モデルの有効性を示す。さらに,知識蒸留とフレームワークを組み合わせることで,計算効率とグレーボックス攻撃に対する堅牢性を両立する軽量な生徒モデルを得ることを提案する。提案手法とSOTAベースライン法SEViTとの比較により,本手法がベースラインより優れていることを示す。公開されている結核X線データセットを用いた広範な実験により,提案アーキテクチャによる計算効率と堅牢性の向上が検証された。 Privacy and confidentiality of medical data are of utmost importance in healthcare settings. ViTs, the SOTA vision model, rely on large amounts of patient data for training, which raises concerns about data security and the potential for unauthorized access. Adversaries may exploit vulnerabilities in ViTs to extract sensitive patient information and compromise patient privacy. This work addresses these vulnerabilities to ensure the trustworthiness and reliability of ViTs in medical applications. In this work, we introduced a defensive diffusion technique as an adversarial purifier to eliminate adversarial noise introduced by attackers in the original image. By utilizing the denoising capabilities of the diffusion model, we employ a reverse diffusion process to effectively eliminate the adversarial noise from the attack sample, resulting in a cleaner image that is then fed into the ViT blocks. Our findings demonstrate the effectiveness of the diffusion model in eliminating attack-agnostic adversarial noise from images. Additionally, we propose combining knowledge distillation with our framework to obtain a lightweight student model that is both computationally efficient and robust against gray box attacks. 
Comparison of our method with a SOTA baseline method, SEViT, shows that our work is able to outperform the baseline. Extensive experiments conducted on a publicly available Tuberculosis X-ray dataset validate the computational efficiency and improved robustness achieved by our proposed architecture.	翻訳日:2023-05-16 18:15:02 公開日:2023-05-14
# SongDriver2:ソフト移行によるリアルタイム感情ベースの音楽アレンジメント SongDriver2: Real-time Emotion-based Music Arrangement with Soft Transition ( http://arxiv.org/abs/2305.08029v1 ) ライセンス: Link先を確認	Zihao Wang, Le Ma, Chen Zhang, Bo Han, Yikai Wang, Xinyi Chen, HaoRong Hong, Wenbo Liu, Xinda Wu, Kejun Zhang	(参考訳) リアルタイムの感情に基づく音楽アレンジメントは、与えられた楽曲を、リアルタイムでユーザーに特定の感情的共鳴を呼び起こす別の楽曲に変換することを目的としており、音楽療法、ビデオゲームのサウンドトラック、映画のスコアなど、様々なシナリオにおいて重要な応用価値を持っている。しかし、感情のリアルタイム適合とソフトな感情遷移のバランスをとることは、ターゲットの感情がきめ細かく変化しやすい性質を持つために困難である。既存の研究は主に感情のリアルタイム適合の達成に焦点を当てており、ソフトな遷移の問題はまだ十分に検討されておらず、音楽全体の感情的コヒーレンスに影響を与えている。本稿では,このバランスに対処するため,SongDriver2を提案する。具体的には、まず直前のタイムステップの音楽感情を認識し、次に現在のタイムステップのターゲット入力感情と融合する。そして、融合された感情がSongDriver2のガイダンスとなり、入力メロディデータに基づいて今後の音楽を生成する。音楽の類似性と感情のリアルタイム適合性を柔軟に調整するために、オリジナルメロディをダウンサンプリングし、生成モデルに入力する。さらに、4つの音楽理論特徴を設計し、ドメイン知識を活用して感情情報を強化し、半教師付き学習を用いて、手動データセットアノテーションによる主観的バイアスを軽減する。評価結果によると、SongDriver2は客観的および主観的メトリクスの両方において最先端の手法を上回っている。これらの結果は,SongDriver2がリアルタイムな適合性とソフトな遷移を同時に達成し,生成した音楽のコヒーレンスを高めることを実証している。 Real-time emotion-based music arrangement, which aims to transform a given music piece into another one that evokes specific emotional resonance with the user in real-time, holds significant application value in various scenarios, e.g., music therapy, video game soundtracks, and movie scores. However, balancing emotion real-time fit with soft emotion transition is a challenge due to the fine-grained and mutable nature of the target emotion. Existing studies mainly focus on achieving emotion real-time fit, while the issue of soft transition remains understudied, affecting the overall emotional coherence of the music. In this paper, we propose SongDriver2 to address this balance. Specifically, we first recognize the last timestep's music emotion and then fuse it with the current timestep's target input emotion. The fused emotion then serves as the guidance for SongDriver2 to generate the upcoming music based on the input melody data. 
To adjust music similarity and emotion real-time fit flexibly, we downsample the original melody and feed it into the generation model. Furthermore, we design four music theory features to leverage domain knowledge to enhance emotion information and employ semi-supervised learning to mitigate the subjective bias introduced by manual dataset annotation. According to the evaluation results, SongDriver2 surpasses the state-of-the-art methods in both objective and subjective metrics. These results demonstrate that SongDriver2 achieves real-time fit and soft transitions simultaneously, enhancing the coherence of the generated music.	翻訳日:2023-05-16 18:14:41 公開日:2023-05-14
# QAOA, Penalty Dephasing, Zeno効果を統合したハイブリッド量子アルゴリズムによる複数制約付き2値最適化問題の解法 Hybrid Quantum Algorithms integrating QAOA, Penalty Dephasing and Zeno Effect for Solving Binary Optimization Problems with Multiple Constraints ( http://arxiv.org/abs/2305.08056v1 ) ライセンス: Link先を確認	Ke Wan, Yiwen Liu	(参考訳) 量子アルゴリズムを用いてバイナリ最適化問題に取り組む場合、従来のIsing表現とQuantum Approximate Optimization Algorithm (QAOA)は、複数の制約を含む大規模問題のエラーを効率的に処理することに困難を抱える。これらの課題に対処するため,本論文では,制約の一部を標準的なIsing Hamiltonianで解き,残りの制約をIsing以外の定式化で表現して対処するハイブリッドフレームワークを提案する。これらの非Ising制約の解決は、ペナルティ・デフェージングまたは量子ゼノン効果によって達成される。この革新的なアプローチは、各制約に対して選択された表現に応じて構造を適応できる量子回路の集合をもたらす。さらに,制約フラグを頻繁に測定することで量子ゼノン効果を利用し,任意の最適化制約の解決を可能にする新しい手法を提案する。これらのアルゴリズムの理論的性質を考察し, 実用的な航空機積載問題に対するそれらの性能は非常に有望であり, 幅広い産業応用において大きな可能性を示している。 When tackling binary optimization problems using quantum algorithms, the conventional Ising representation and Quantum Approximate Optimization Algorithm (QAOA) encounter difficulties in efficiently handling errors for large-scale problems involving multiple constraints. To address these challenges, this paper presents a hybrid framework that combines the use of standard Ising Hamiltonians to solve a subset of the constraints, while employing non-Ising formulations to represent and address the remaining constraints. The resolution of these non-Ising constraints is achieved through either penalty dephasing or the quantum Zeno effect. This innovative approach leads to a collection of quantum circuits with adaptable structures, depending on the chosen representation for each constraint. Furthermore, this paper introduces a novel technique that utilizes the quantum Zeno effect by frequently measuring the constraint flag, enabling the resolution of any optimization constraint. Theoretical properties of these algorithms are discussed, and their performance in addressing practical aircraft loading problems is highly promising, showcasing significant potential for a wide range of industrial applications.	翻訳日:2023-05-16 18:03:36 公開日:2023-05-14
# SCRNet: 空間整合性に導かれたRetinex構造に基づく低照度画像強調モデル SCRNet: a Retinex Structure-based Low-light Enhancement Model Guided by Spatial Consistency ( http://arxiv.org/abs/2305.08053v1 ) ライセンス: Link先を確認	Miao Zhang, Yiqing Shen and Shenghui Zhong	(参考訳) 低照度条件下で撮影された画像は、コントラストの低下、ノイズの増加、細部の損失、不自然な色再現など、いくつかの課題に悩まされることが多い。これらの要因は、物体検出や画像分割といったコンピュータビジョンタスクのパフォーマンスを著しく損なう可能性がある。そのため、低照度画像の品質向上は、コンピュータビジョン分野の実用的応用において極めて重要である。これらの課題に効果的に対処するため、Retinexに基づく構造を活用し、空間整合性の原理に導かれる新しい低照度画像強調モデル、Spatial Consistency Retinex Network (SCRNet)を提案する。具体的には、提案モデルは空間整合性の原理に着想を得て、チャネルレベル、セマンティックレベル、テクスチャレベルの3つのレベルの整合性を取り入れている。これらの整合性により、モデルは画像特徴を適応的に強調し、より正確で視覚的に好ましい結果を得ることができる。様々な低照度画像データセットでの広範な実験的評価により、提案するSCRNetが既存の最先端手法を上回ることが示され、低照度画像強調の有効な解決策としてのSCRNetの可能性が示された。 Images captured under low-light conditions are often plagued by several challenges, including diminished contrast, increased noise, loss of fine details, and unnatural color reproduction. These factors can significantly hinder the performance of computer vision tasks such as object detection and image segmentation. 
As a result, improving the quality of low-light images is of paramount importance for practical applications in the computer vision domain. To effectively address these challenges, we present a novel low-light image enhancement model, termed Spatial Consistency Retinex Network (SCRNet), which leverages the Retinex-based structure and is guided by the principle of spatial consistency. Specifically, our proposed model incorporates three levels of consistency: channel level, semantic level, and texture level, inspired by the principle of spatial consistency. These levels of consistency enable our model to adaptively enhance image features, ensuring more accurate and visually pleasing results. Extensive experimental evaluations on various low-light image datasets demonstrate that our proposed SCRNet outshines existing state-of-the-art methods, highlighting the potential of SCRNet as an effective solution for enhancing low-light images.	翻訳日:2023-05-16 18:03:15 公開日:2023-05-14
# 驚くほど単純な連続アクションPOMDPソルバ:ポリシーツリー上の遅延クロスエントロピー探索 A Surprisingly Simple Continuous-Action POMDP Solver: Lazy Cross-Entropy Search Over Policy Trees ( http://arxiv.org/abs/2305.08049v1 ) ライセンス: Link先を確認	Marcus Hoerger, Hanna Kurniawati, Dirk Kroese, Nan Ye	(参考訳) 部分可観測マルコフ決定プロセス(POMDP)は確率的部分可観測環境における意思決定の原則的枠組みを提供する。しかし、連続行動空間を持つ問題に対する優れた解の計算は依然として困難である。この課題を緩和するために、Lazy Cross-Entropy Search Over Policy Trees (LCEOPT) と呼ばれるシンプルなオンラインPOMDPソルバを提案する。各計画ステップにおいて,本手法は遅延クロスエントロピー法を用いて,簡単なポリシー表現を与えるポリシーツリーの空間を探索する。具体的には、有望な有限ホライズンのポリシーツリー上の分布を維持する。この分布は、ポリシーをサンプリングし、モンテカルロシミュレーションで評価し、最高性能のものに再適合させることで反復的に更新される。本手法は,ポリシーツリー表現を利用して,ポリシーのサンプリング,評価,分布更新における冗長な計算を回避するという意味で遅延である。これにより、最大2桁の計算量削減が可能となる。LCEOPTは、既存の最先端手法と比較して驚くほど単純であるが、いくつかの連続行動POMDP問題、特に高次元の行動空間を持つ問題において、経験的にそれらを上回る。 The Partially Observable Markov Decision Process (POMDP) provides a principled framework for decision making in stochastic partially observable environments. However, computing good solutions for problems with continuous action spaces remains challenging. To ease this challenge, we propose a simple online POMDP solver, called Lazy Cross-Entropy Search Over Policy Trees (LCEOPT). At each planning step, our method uses a lazy Cross-Entropy method to search the space of policy trees, which provide a simple policy representation. Specifically, we maintain a distribution on promising finite-horizon policy trees. The distribution is iteratively updated by sampling policies, evaluating them via Monte Carlo simulation, and refitting the distribution to the top-performing ones. Our method is lazy in the sense that it exploits the policy tree representation to avoid redundant computations in policy sampling, evaluation, and distribution update. This leads to computational savings of up to two orders of magnitude. 
Our LCEOPT is surprisingly simple as compared to existing state-of-the-art methods, yet empirically outperforms them on several continuous-action POMDP problems, particularly for problems with higher-dimensional action spaces.	翻訳日:2023-05-16 18:02:59 公開日:2023-05-14
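The sample, evaluate, refit loop described in the LCEOPT abstract above has the general shape of the cross-entropy method. The following is a minimal Python sketch of that generic loop only, under stated assumptions: it searches over a plain real-valued action vector with an independent Gaussian per dimension, not over policy trees, and it has none of the lazy reuse the abstract describes. The names `cross_entropy_search` and `toy_return`, and all parameter values, are hypothetical; `toy_return` stands in for a Monte Carlo return estimate.

```python
import random
import statistics


def cross_entropy_search(score, dim, iterations=30, samples=64,
                         elite_frac=0.125, seed=0):
    """Generic cross-entropy method over a real-valued vector.

    Maintains an independent Gaussian per dimension, samples candidates,
    scores them, and refits the Gaussian to the top-scoring candidates.
    """
    rng = random.Random(seed)
    mean = [0.0] * dim
    std = [1.0] * dim
    n_elite = max(2, int(samples * elite_frac))
    for _ in range(iterations):
        # Sample candidates from the current distribution.
        candidates = [
            [rng.gauss(m, s) for m, s in zip(mean, std)]
            for _ in range(samples)
        ]
        # Evaluate and keep the top performers (higher score is better).
        candidates.sort(key=score, reverse=True)
        elites = candidates[:n_elite]
        # Refit the distribution to the elites; a small floor keeps std > 0.
        columns = list(zip(*elites))
        mean = [statistics.fmean(col) for col in columns]
        std = [statistics.pstdev(col) + 1e-6 for col in columns]
    return mean


def toy_return(action):
    """Hypothetical stand-in for a simulated return, peaked at (1, -2)."""
    target = (1.0, -2.0)
    return -sum((a - t) ** 2 for a, t in zip(action, target))


best = cross_entropy_search(toy_return, dim=2)
print(best)
```

In a planner, `score` would be replaced by an average over simulated rollouts, which is noisy; the elite fraction and sample count then trade off robustness against computation.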
# グラフニューラルネットワークの汎化の理解に向けて Towards Understanding the Generalization of Graph Neural Networks ( http://arxiv.org/abs/2305.08048v1 ) ライセンス: Link先を確認	Huayi Tang and Yong Liu	(参考訳) グラフニューラルネットワーク(GNN)は、グラフ構造化データ指向の学習と表現において最も広く採用されているモデルである。実世界のアプリケーションで並外れた成功を収めたにもかかわらず、理論による動作メカニズムの理解はまだ初期段階にある。本稿では,汎化の観点から,この目標に向かって進む。具体的には,まず,確率的最適化を考慮したトランスダクティブ学習における汎化ギャップと勾配の高確率境界を確立する。その後、代表的なGNNに対して、汎化ギャップの高確率境界を与える。理論的結果は、汎化ギャップに影響を与えるアーキテクチャ固有の要因を明らかにする。ベンチマークデータセットにおける実験結果は、理論的結果と経験的証拠の一貫性を示している。本研究は,GNNの汎化の理解に関する新たな知見を提供する。 Graph neural networks (GNNs) are the most widely adopted model in graph-structured data oriented learning and representation. Despite their extraordinary success in real-world applications, understanding their working mechanism by theory is still at a preliminary stage. In this paper, we move towards this goal from the perspective of generalization. To be specific, we first establish high probability bounds of generalization gap and gradients in transductive learning with consideration of stochastic optimization. After that, we provide high probability bounds of generalization gap for popular GNNs. The theoretical results reveal the architecture specific factors affecting the generalization gap. Experimental results on benchmark datasets show the consistency between theoretical results and empirical evidence. Our results provide new insights into understanding the generalization of GNNs.	翻訳日:2023-05-16 18:02:40 公開日:2023-05-14
# キャビティマグノメカニクスにおける量子増強メトロロジー Quantum-Enhanced Metrology in Cavity Magnomechanics ( http://arxiv.org/abs/2305.08045v1 ) ライセンス: Link先を確認	Qing-Kun Wan, Hai-Long Shi, Xi-Wen Guan	(参考訳) マグノンは、素励起であるスピン励起に現れる基本的な準粒子であり、情報の符号化と処理における量子技術の革新に大きな可能性を秘めている。ここでは, マグノンが弱磁場を感知し, 空洞場がその弱磁場の精密測定を行う, 実験的に実現可能なキャビティマグノメカニカルシステムに基づくメトロロジースキームにおいて, エンタングルメントの微妙な役割を見出す。フィッシャー情報とエンタングルメントの厳密な関係を確立することにより,弱結合の場合には測定精度がハイゼンベルク限界に達しうる一方,強結合の場合には量子臨界性によって測定精度を高められることを示す。特に,マグノンと光子のエンタングルメントは動的符号化過程において極めて重要であるが,測定過程におけるそのようなエンタングルメントの存在は,最終的な測定精度を劇的に低下させることも見出した。 Magnons, as fundamental quasiparticles emerging in elementary spin excitations, hold great promise for innovating quantum technologies in information coding and processing. Here we discover subtle roles of entanglement in a metrological scheme based on an experimentally feasible cavity magnomechanical system, where the magnons are responsible for sensing a weak magnetic field whereas the cavity field carries out a precision measurement of the weak field. By establishing exact relations between the Fisher information and entanglement, we show that for the weak coupling case the measurement precision can reach the Heisenberg limit, whereas quantum criticality enables us to enhance measurement precision for the strong coupling case. In particular, we also find that the entanglement between magnons and photons is of crucial importance during the dynamical encoding process, but the presence of such an entanglement in the measurement process dramatically reduces the final measurement precision.	翻訳日:2023-05-16 18:02:31 公開日:2023-05-14
# 実世界シナリオにおけるEEG信号を用いた記憶検索時の作業負荷評価 Using EEG Signals to Assess Workload during Memory Retrieval in a Real-world Scenario ( http://arxiv.org/abs/2305.08044v1 ) ライセンス: Link先を確認	Kuan-Jung Chiang, Steven Dong, Chung-Kuan Cheng, and Tzyy-Ping Jung	(参考訳) 目的:脳波(EEG)は、客観的であり、バイアスの影響を受けにくく、認知状態のダイナミクスを評価できるため、人間工学研究における神経人間工学の生理学的指標として人気を集めている。本研究は,シングルモニターとデュアルモニターの配置で参加者が典型的なオフィスタスクを行う際の,記憶負荷と脳波の関連を検討した。シングルモニターの配置では、より高い記憶負荷が予想される。アプローチ: 被験者がオフィスワークを行うシナリオを模した実験を設計し, 2つの異なるオフィス環境において, 被験者が異なるレベルの記憶負荷を経験したかどうかを検討した。 1)シングルモニターの設定と 2)デュアルモニターの設定。我々は,脳波バンドパワー,相互情報量,コヒーレンスを特徴として用い,高い記憶負荷状態と低い記憶負荷状態を分類する機械学習モデルを訓練した。主な結果: 研究結果から, これらの特徴は, 全参加者で一貫した有意差を示した。また、先行研究においてSternbergタスク中に収集された別のデータセットで、これらの脳波シグネチャの堅牢性と一貫性を検証した。意義:本研究は、個人間で共通する記憶負荷の脳波相関を見出し、実世界の神経人間工学研究における脳波分析の有効性を実証した。 Objective: The Electroencephalogram (EEG) is gaining popularity as a physiological measure for neuroergonomics in human factor studies because it is objective, less prone to bias, and capable of assessing the dynamics of cognitive states. This study investigated the associations between memory workload and EEG during participants' typical office tasks on a single-monitor and dual-monitor arrangement. We expect a higher memory workload for the single-monitor arrangement. Approach: We designed an experiment that mimics the scenario of a subject performing some office work and examined whether the subjects experienced various levels of memory workload in two different office setups: 1) a single-monitor setup and 2) a dual-monitor setup. We used EEG band power, mutual information, and coherence as features to train machine learning models to classify high versus low memory workload states. Main results: The study results showed that these characteristics exhibited significant differences that were consistent across all participants. We also verified the robustness and consistency of these EEG signatures in a different data set collected during a Sternberg task in a prior study. 
Significance: The study found the EEG correlates of memory workload across individuals, demonstrating the effectiveness of using EEG analysis in conducting real-world neuroergonomic studies.	翻訳日:2023-05-16 18:02:15 公開日:2023-05-14
# CHSEL: 接触データと自由空間データから多様で妥当なポーズ推定を生成する CHSEL: Producing Diverse Plausible Pose Estimates from Contact and Free Space Data ( http://arxiv.org/abs/2305.08042v1 ) ライセンス: Link先を確認	Sheng Zhong, Nima Fazeli, and Dmitry Berenson	(参考訳) 本稿では,各点が自由空間にあるか,あるいは物体の表面にあるかなどの体積情報を持つ点群から,剛体物体の妥当なポーズの集合を推定する新しい方法を提案する。特に,接触から生じる力や触覚データからポーズを推定する方法について検討した。接触から得られるデータは、本質的に視覚データよりも情報密度が低いため扱いが難しく、接触が少ない場合、ポーズ推定問題は著しく劣決定となる。多数の接触なしには扱えない物体の真のポーズを推定する代わりに,センサデータによって課される制約に従う妥当なポーズの集合を推定する。既存の手法は、単一のポーズ推定のために設計されているか、有効に機能するために情報量のある事前分布を必要とするため、この集合の推定に苦労する。この問題に対する我々のアプローチ、Constrained pose Hypothesis Set Elimination (CHSEL)には3つの重要な属性がある。 1) 体積情報を考慮し,既知の自由空間を扱える。 2)強力な勾配に基づく最適化ツールを活用するために,新しい微分可能な体積コスト関数を用いる。 3)品質多様性(QD)最適化文献からの手法を用いて,高品質なポーズの多様なセットを生成する。我々の知る限り、QD法はこれまでポーズ登録には使われていない。また、より多くのデータがロボットによって収集された場合に、妥当なポーズの推定をオンラインで更新する方法も示す。実験の結果,CHSELはシミュレーションデータと実世界のデータの両方に対して,複数のベースライン法よりも大きな性能向上を示すことが示唆された。 This paper proposes a novel method for estimating the set of plausible poses of a rigid object from a set of points with volumetric information, such as whether each point is in free space or on the surface of the object. In particular, we study how pose can be estimated from force and tactile data arising from contact. Using data derived from contact is challenging because it is inherently less information-dense than visual data, and thus the pose estimation problem is severely under-constrained when there are few contacts. Rather than attempting to estimate the true pose of the object, which is not tractable without a large number of contacts, we seek to estimate a plausible set of poses which obey the constraints imposed by the sensor data. Existing methods struggle to estimate this set because they are either designed for single pose estimates or require informative priors to be effective. 
Our approach to this problem, Constrained pose Hypothesis Set Elimination (CHSEL), has three key attributes: 1) It considers volumetric information, which allows us to account for known free space; 2) It uses a novel differentiable volumetric cost function to take advantage of powerful gradient-based optimization tools; and 3) It uses methods from the Quality Diversity (QD) optimization literature to produce a diverse set of high-quality poses. To our knowledge, QD methods have not been used previously for pose registration. We also show how to update our plausible pose estimates online as more data is gathered by the robot. Our experiments suggest that CHSEL shows large performance improvements over several baseline methods for both simulated and real-world data.	翻訳日:2023-05-16 18:01:54 公開日:2023-05-14
# 確率的プーリングを用いた証明可能なマルチインスタンス深層AUC最大化 Provable Multi-instance Deep AUC Maximization with Stochastic Pooling ( http://arxiv.org/abs/2305.08040v1 ) ライセンス: Link先を確認	Dixain Zhu, Bokun Wang, Zhi Chen, Yaxing Wang, Milan Sonka, Xiaodong Wu, Tianbao Yang	(参考訳) 本稿では,1つのクラスラベルをインスタンスの袋に割り当てるマルチインスタンス学習 (MIL) に対する深層AUC最大化 (DAM) の新たな応用について検討する。 DAMの文脈におけるMILの、見過ごされてきたが無視できない計算上の課題、すなわち、MILの標準的なプーリング手法が必要とするバックプロパゲーションのためにGPUメモリにロードするにはバッグサイズが大きすぎるという課題に対処する。この課題に対処するために,多レベル構成関数としてプールド予測上の損失関数を定式化することにより,確率最適化の精神における分散還元確率プール法を提案する。確率的合成最適化と非凸 min-max 最適化の手法を合成することにより,確率的スムーズドマックスプーリングや確率的アテンションベースプールを用いた統一的かつ証明可能なマルチインスタンスDAM (MIDAM) アルゴリズムを提案し,各バッグのいくつかのインスタンスのみをサンプリングし,確率的勾配推定器を計算し,モデルパラメータを更新する。我々は,提案したMIDAMアルゴリズムについて,最先端DAMアルゴリズムと同様の収束率を確立する。従来のMILデータセットと医療データセットに関する広範な実験は、MIDAMアルゴリズムの優位性を実証している。 This paper considers a novel application of deep AUC maximization (DAM) for multi-instance learning (MIL), in which a single class label is assigned to a bag of instances (e.g., multiple 2D slices of a CT scan for a patient). We address a neglected yet non-negligible computational challenge of MIL in the context of DAM, i.e., bag size is too large to be loaded into GPU memory for backpropagation, which is required by the standard pooling methods of MIL. To tackle this challenge, we propose variance-reduced stochastic pooling methods in the spirit of stochastic optimization by formulating the loss function over the pooled prediction as a multi-level compositional function. By synthesizing techniques from stochastic compositional optimization and non-convex min-max optimization, we propose a unified and provable multi-instance DAM (MIDAM) algorithm with stochastic smoothed-max pooling or stochastic attention-based pooling, which only samples a few instances for each bag to compute a stochastic gradient estimator and to update the model parameter. We establish a convergence rate for the proposed MIDAM algorithm similar to that of state-of-the-art DAM algorithms. 
Our extensive experiments on conventional MIL datasets and medical datasets demonstrate the superiority of our MIDAM algorithm.	翻訳日:2023-05-16 18:01:27 公開日:2023-05-14
# SyCo-AEによるカオスダイナミクスの低次モデリング:合成制約オートエンコーダ Small-data Reduced Order Modeling of Chaotic Dynamics through SyCo-AE: Synthetically Constrained Autoencoders ( http://arxiv.org/abs/2305.08036v1 ) ライセンス: Link先を確認	Andrey A. Popov, Renato Zanetti	(参考訳) データ駆動によるカオス力学の還元次数モデリングは、破滅的に散逸または散逸するシステムをもたらす可能性がある。本稿では,自動エンコーダの非線形次元低減とニューラルネットワークを用いた非線形演算子推論の自由を活かし,縮小順序空間に合成制約を課すことで,この問題を解決しようとする。合成制約により、自由度が完全に非線形で不安定でありながら、ばらつきを防止できる。従来の40変数のlorenz '96方程式を用いて手法を説明し,より少ないデータを用いて誤差の低い中から長距離の予測を行うことができることを示した。 Data-driven reduced order modeling of chaotic dynamics can result in systems that either dissipate or diverge catastrophically. Leveraging non-linear dimensionality reduction of autoencoders and the freedom of non-linear operator inference with neural-networks, we aim to solve this problem by imposing a synthetic constraint in the reduced order space. The synthetic constraint allows our reduced order model both the freedom to remain fully non-linear and highly unstable while preventing divergence. We illustrate the methodology with the classical 40-variable Lorenz '96 equations, showing that our methodology is capable of producing medium-to-long range forecasts with lower error using less data.	翻訳日:2023-05-16 18:01:00 公開日:2023-05-14
# DNN-Defender: 対向重み攻撃のためのDRAM内ディープニューラルネットワーク防御機構 DNN-Defender: An in-DRAM Deep Neural Network Defense Mechanism for Adversarial Weight Attack ( http://arxiv.org/abs/2305.08034v1 ) ライセンス: Link先を確認	Ranyang Zhou, Sabbir Ahmed, Adnan Siraj Rakin, Shaahin Angizi	(参考訳) 多くのセキュリティに敏感な分野にディープラーニングが展開されるにつれ、機械学習のセキュリティは徐々に重要になりつつある。近年の研究では、DRAMのRowHammer脆弱性を利用して、ディープニューラルネットワーク(DNN)モデルの重み付けを決定的かつ正確にフリップし、推論精度に影響を与えるシステムレベルのテクニックを攻撃者が活用できることが示されている。既存の防御機構はソフトウェアベースで、重量再構成には高価なトレーニングオーバーヘッドや性能の低下を必要とする。一方で、汎用的なハードウェアベースの被害者/攻撃者中心のメカニズムは、高価なハードウェアオーバーヘッドを課し、被害者と攻撃者列の間の空間的接続を維持する。本稿では,DNN-Defenderという名前の量子化DNNに適した,DRAMをベースとした最初の防御機構を提案する。以上の結果から,DNN-DefenderはターゲットRowHammer攻撃の性能をランダムな攻撃レベルに低下させる高いレベルの保護を提供することが可能であることが示唆された。さらに、提案されたディフェンスは、ソフトウェアトレーニングや追加のハードウェアオーバーヘッドを発生させずに、CIFAR-10とImageNetデータセットに精度低下はない。 With deep learning deployed in many security-sensitive areas, machine learning security is becoming progressively important. Recent studies demonstrate attackers can exploit system-level techniques exploiting the RowHammer vulnerability of DRAM to deterministically and precisely flip bits in Deep Neural Networks (DNN) model weights to affect inference accuracy. The existing defense mechanisms are software-based, such as weight reconstruction requiring expensive training overhead or performance degradation. On the other hand, generic hardware-based victim-/aggressor-focused mechanisms impose expensive hardware overheads and preserve the spatial connection between victim and aggressor rows. In this paper, we present the first DRAM-based victim-focused defense mechanism tailored for quantized DNNs, named DNN-Defender that leverages the potential of in-DRAM swapping to withstand the targeted bit-flip attacks. Our results indicate that DNN-Defender can deliver a high level of protection downgrading the performance of targeted RowHammer attacks to a random attack level. 
In addition, the proposed defense has no accuracy drop on CIFAR-10 and ImageNet datasets without requiring any software training or incurring additional hardware overhead.	翻訳日:2023-05-16 18:00:47 公開日:2023-05-14
# HiPerformer: 時系列予測のための階層的置換同変変換器 HiPerformer: Hierarchically Permutation-Equivariant Transformer for Time Series Forecasting ( http://arxiv.org/abs/2305.08073v1 ) ライセンス: Link先を確認	Ryo Umagami, Yu Ono, Yusuke Mukuta, Tatsuya Harada	(参考訳) 正確な予測のために複数の時系列の関係を識別することが不可欠である。特に株価については、同じ特性を持つグループに部品を分割することが多く、このグループ構造に整合した関係を抽出するモデルが有効であるべきである。そこで本研究では,このグループ構造を考慮したモデルの設計のために,グループ内およびグループ間の成分の指数スワップに着目した階層的置換等価性の概念を提案する。予測モデルが階層的置換同分散を持つ場合、その予測は成分の群関係と一致する。そこで,同群における成分間の関係と群間の関係を考慮した階層的置換同変モデルを提案する。実世界のデータを用いた実験により,提案手法が既存の最先端手法より優れていることを示す。 It is imperative to discern the relationships between multiple time series for accurate forecasting. In particular, for stock prices, components are often divided into groups with the same characteristics, and a model that extracts relationships consistent with this group structure should be effective. Thus, we propose the concept of hierarchical permutation-equivariance, focusing on index swapping of components within and among groups, to design a model that considers this group structure. When the prediction model has hierarchical permutation-equivariance, the prediction is consistent with the group relationships of the components. Therefore, we propose a hierarchically permutation-equivariant model that considers both the relationship among components in the same group and the relationship among groups. The experiments conducted on real-world data demonstrate that the proposed method outperforms existing state-of-the-art methods.	翻訳日:2023-05-16 17:54:37 公開日:2023-05-14
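A minimal sketch of what hierarchical permutation-equivariance means in practice, under stated assumptions: `hierarchical_mix` is a hypothetical toy layer invented for this illustration and is not the HiPerformer architecture. It only shows the property the abstract describes, namely that swapping components within a group, or swapping whole groups, permutes the output in the same way.

```python
def hierarchical_mix(groups):
    """Toy layer (hypothetical): combine each component with its group mean
    and with the mean of all group means."""
    group_means = [sum(g) / len(g) for g in groups]
    global_mean = sum(group_means) / len(group_means)
    return [[x + m + global_mean for x in g] for g, m in zip(groups, group_means)]


def is_equivariant_example():
    """Check the property on one small input with two groups."""
    base = hierarchical_mix([[1.0, 2.0], [3.0, 5.0, 7.0]])
    # Swap the two components inside the first group.
    within = hierarchical_mix([[2.0, 1.0], [3.0, 5.0, 7.0]])
    # Swap the two groups as a whole.
    among = hierarchical_mix([[3.0, 5.0, 7.0], [1.0, 2.0]])
    return (within[0] == base[0][::-1]
            and within[1] == base[1]
            and among == [base[1], base[0]])
```

The check passes because the layer only uses symmetric statistics (means) at each level of the hierarchy; any order-dependent operation would break the property.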
# 原始重力のデコヒーレンスについて On the Decoherence of Primordial Gravitons ( http://arxiv.org/abs/2305.08071v1 ) ライセンス: Link先を確認	Sirui Ning, Chon Man Sou, Yi Wang	(参考訳) 原始スカラー曲率とテンソル摂動の$\zeta$と$\gamma_{ij}$は、最小のインフレーションモデルにおける超水平スケールで保存されていることはよく知られている。しかし、それらの波動関数は急速に振動する位相を持ち、宇宙論的摂動の境界(現在の微分)やホイーラー・デウィット方程式のWKB近似から見てもわかるように、緩やかに回転しない。このような振動相は、スカラーとテンソルの摂動の間の重力非直線性を含む。観測されていないモードの追跡により、発振相は、バルク相互作用によるよりも早く原始重力子の脱コヒーレンスを引き起こす。以上の結果から, 収縮した原始重力場を探索する最近の提案に対して, 脱コヒーレンス効果はより低くなった。 It is well-known that the primordial scalar curvature and tensor perturbations, $\zeta$ and $\gamma_{ij}$, are conserved on super-horizon scales in minimal inflation models. However, their wave functional has a rapidly oscillating phase which is slow-roll unsuppressed, as can be seen either from boundary (total-derivative) terms of cosmological perturbations, or the WKB approximation of the Wheeler-DeWitt equation. Such an oscillatory phase involves gravitational non-linearity between scalar and tensor perturbations. By tracing out unobserved modes, the oscillatory phase causes faster decoherence of primordial gravitons compared to those by bulk interactions. Our results put a stronger lower bound of decoherence effect to the recent proposals probing squeezed primordial gravitons.	翻訳日:2023-05-16 17:54:24 公開日:2023-05-14
# フェデレーション学習におけるフェデレーション評価に関する調査 A Survey of Federated Evaluation in Federated Learning ( http://arxiv.org/abs/2305.08070v1 ) ライセンス: Link先を確認	Behnaz Soltani, Yipeng Zhou, Venus Haghighi, John C.S. Lui	(参考訳) 従来の機械学習では、すべてのデータサンプルがサーバによって中央管理されているため、モデル評価を行うのは簡単です。しかし、モデル評価は、この研究でフェデレーション評価と呼ばれるフェデレーション学習(FL)において難しい問題となっている。これは、クライアントがデータプライバシを保存するために元のデータを公開しないためです。フェデレーション評価は、クライアント選択、インセンティブ機構設計、悪意のある攻撃検出などにおいて重要な役割を果たす。本稿では,既存のフェデレーション評価手法の包括的調査を初めて実施する。さらに,FL性能向上のためのフェデレーション評価の様々な応用について検討し,いくつかの課題を想定して今後の研究の方向性を示す。 In traditional machine learning, it is trivial to conduct model evaluation since all data samples are managed centrally by a server. However, model evaluation becomes a challenging problem in federated learning (FL), which is called federated evaluation in this work. This is because clients do not expose their original data to preserve data privacy. Federated evaluation plays a vital role in client selection, incentive mechanism design, malicious attack detection, etc. In this paper, we provide the first comprehensive survey of existing federated evaluation methods. Moreover, we explore various applications of federated evaluation for enhancing FL performance and finally present future research directions by envisioning some challenges.	翻訳日:2023-05-16 17:54:10 公開日:2023-05-14
# 長距離物体検出のための事例認識繰り返し因子サンプリング Instance-Aware Repeat Factor Sampling for Long-Tailed Object Detection ( http://arxiv.org/abs/2305.08069v1 ) ライセンス: Link先を確認	Burhaneddin Yaman, Tanvir Mahmud, Chun-Hao Liu	(参考訳) 我々は,ロングテール物体検出における不均衡データ問題に対処するため,恥ずかしいほど簡単なインスタンス認識反復因子サンプリング(irfs)を提案する。実世界のオブジェクト検出における不均衡データセットは、しばしば各クラスに対するインスタンス数の大きな差に苦しむ。希少クラスにおける対象検出モデルの一般化性能を向上させるため,様々なデータサンプリング手法が提案されている。繰り返し因子サンプリング(RFS)はその単純さと有効性から有望である。 RFSはその効率にもかかわらず、インスタンスカウントを完全に無視し、再サンプリングプロセス中のイメージカウントにのみ依存する。しかし、同じ画像数を持つ異なるクラスでインスタンス数は大きく異なる可能性がある。このようなバリエーションは、ロングテール分布に対処するためのイメージとインスタンスの両方の重要性を強調している。そこで本研究では,ロングテールデータセットにおける不均衡の異なる視点を認識するために,再サンプリングプロセスのインスタンス数と画像数を統一するirfを提案する。提案手法は,様々なアーキテクチャやバックボーン上でのLVIS v1.0ベンチマークデータセットに対する有望な結果を示し,RFSに対する相対的な平均精度(AP)が+50\%である希少クラスにおけるオブジェクト検出モデルの性能向上に有効であることを示す。 IRFSは強力なベースラインとして機能し、既存のロングテールフレームワークに簡単に組み込める。 We propose an embarrassingly simple method -- instance-aware repeat factor sampling (IRFS) to address the problem of imbalanced data in long-tailed object detection. Imbalanced datasets in real-world object detection often suffer from a large disparity in the number of instances for each class. To improve the generalization performance of object detection models on rare classes, various data sampling techniques have been proposed. Repeat factor sampling (RFS) has shown promise due to its simplicity and effectiveness. Despite its efficiency, RFS completely neglects the instance counts and solely relies on the image count during re-sampling process. However, instance count may immensely vary for different classes with similar image counts. Such variation highlights the importance of both image and instance for addressing the long-tail distributions. Thus, we propose IRFS which unifies instance and image counts for the re-sampling process to be aware of different perspectives of the imbalance in long-tailed datasets. 
Our method shows promising results on the challenging LVIS v1.0 benchmark dataset across various architectures and backbones, demonstrating its effectiveness in improving the performance of object detection models on rare classes, with a relative $+50\%$ average precision (AP) improvement over the RFS counterpart. IRFS can serve as a strong baseline and be easily incorporated into existing long-tailed frameworks.	翻訳日:2023-05-16 17:53:59 公開日:2023-05-14
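A minimal sketch of the difference between the two sampling rules, under stated assumptions. The RFS repeat factor max(1, sqrt(t / f)) with image-level frequency f is the standard formulation; the way image and instance frequencies are combined here (their geometric mean) is an assumption made for illustration, since the abstract only says that IRFS unifies the two counts. All function and parameter names are hypothetical.

```python
import math


def repeat_factor(freq, threshold=0.001):
    """Standard repeat factor: oversample classes whose frequency is below the threshold."""
    return max(1.0, math.sqrt(threshold / freq))


def class_repeat_factors(image_counts, instance_counts, num_images, threshold=0.001):
    """Return {class: (rfs_factor, irfs_like_factor)}.

    image_counts[c]    -- number of images containing class c
    instance_counts[c] -- number of annotated instances of class c
    """
    total_instances = sum(instance_counts.values())
    factors = {}
    for cls, n_img in image_counts.items():
        f_img = n_img / num_images
        f_inst = instance_counts[cls] / total_instances
        rfs = repeat_factor(f_img, threshold)
        # Assumption: combine both frequencies through their geometric mean.
        irfs_like = repeat_factor(math.sqrt(f_img * f_inst), threshold)
        factors[cls] = (rfs, irfs_like)
    return factors
```

With two rare classes that appear in the same number of images but have very different instance counts, the image-only factor is identical for both, while the instance-aware variant oversamples the class with fewer instances more strongly.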
# 韻律的注意と蒸留による終端SLU性能の向上 Improving End-to-End SLU performance with Prosodic Attention and Distillation ( http://arxiv.org/abs/2305.08067v1 ) ライセンス: Link先を確認	Shangeth Rajaa	(参考訳) ほとんどのエンドツーエンドSLU法は、意図予測のための事前訓練されたASRまたは言語モデル機能に依存している。しかし、言論における他の重要な情報、例えば韻律はしばしば無視される。近年の研究では、韻律情報を組み込んだ対話行為の分類結果が改善されている。これらの方法の改善のマージンは最小限であり、神経モデルは韻律的特徴を無視している。本研究では,発話の時間枠にまたがる注意マップを生成するために,韻律的特徴が異なる韻律アテンションを提案する。次に,暗黙の韻律特徴を結合するのではなく,音響エンコーダの韻律情報を明示的に学習する韻律蒸留を提案する。提案手法はどちらもベースライン結果を改善し, プロソディ-蒸留法は, SLURP と STOP のデータセットに対して, 意図的分類精度を 8 %, 2 % 向上させる。 Most End-to-End SLU methods depend on the pretrained ASR or language model features for intent prediction. However, other essential information in speech, such as prosody, is often ignored. Recent research has shown improved results in classifying dialogue acts by incorporating prosodic information. The margins of improvement in these methods are minimal as the neural models ignore prosodic features. In this work, we propose prosody-attention, which uses the prosodic features differently to generate attention maps across time frames of the utterance. Then we propose prosody-distillation to explicitly learn the prosodic information in the acoustic encoder rather than concatenating the implicit prosodic features. Both the proposed methods improve the baseline results, and the prosody-distillation method gives an intent classification accuracy improvement of 8\% and 2\% on SLURP and STOP datasets over the prosody baseline.	翻訳日:2023-05-16 17:53:40 公開日:2023-05-14
# 視覚障害者がより高品質な写真を撮るための支援 Helping Visually Impaired People Take Better Quality Pictures ( http://arxiv.org/abs/2305.08066v1 ) ライセンス: Link先を確認	Maniratnam Mandal, Deepti Ghadiyaram, Danna Gurari, and Alan C. Bovik	(参考訳) 知覚に基づく画像分析技術は、自動ガイダンスを提供することで、視覚障害者がより高品質な写真を撮るのに役立つ。視覚障害者が撮影した写真は、技術的品質(歪み)と、フレーミングや美的構成といった意味的な品質の2つの品質問題の1つまたは両方に悩まされることが多い。ここでは,ぼやけや露出不良,ノイズなど,一般的な技術的歪みの発生を最小限に抑えるためのツールを開発した。我々は、セマンティック品質の相補的な問題に対処せず、その側面を将来の作業に残します。視覚障害者が撮影した画像の技術的品質を評価し、実用的なフィードバックを提供するという問題は、しばしば発生する重度の混在した歪みのため、十分に困難である。視覚障害者生成コンテンツ(VI-UGC)の技術的品質の分析と測定の課題を前進させるために,我々は,非常に大きくユニークな主観的画質と歪みデータセットを構築した。 LIVE-Meta VI-UGC Databaseと呼ばれるこの新しい知覚リソースには、4万枚の実世界の歪んだVI-UGC画像と4万個のパッチが含まれており、270万件の人間による知覚品質判断と270万件の歪みラベルが記録されている。この心理測定資源を用いて,局所から大域への空間的品質関係を学習し,VI-UGC画像における最先端の予測性能を達成し,このユニークな歪画像データにおいて既存の画像品質モデルを著しく上回る,ブラインド画質・歪み予測器を開発した。また,マルチタスク学習フレームワークを作成することで,ユーザによる品質問題の軽減とより高品質な写真の撮影を支援するプロトタイプフィードバックシステムを開発した。 Perception-based image analysis technologies can be used to help visually impaired people take better quality pictures by providing automated guidance, thereby empowering them to interact more confidently on social media. The photographs taken by visually impaired users often suffer from one or both of two kinds of quality issues: technical quality (distortions), and semantic quality, such as framing and aesthetic composition. Here we develop tools to help them minimize occurrences of common technical distortions, such as blur, poor exposure, and noise. We do not address the complementary problems of semantic quality, leaving that aspect for future work. The problem of assessing and providing actionable feedback on the technical quality of pictures captured by visually impaired users is hard enough, owing to the severe, commingled distortions that often occur. To advance progress on the problem of analyzing and measuring the technical quality of visually impaired user-generated content (VI-UGC), we built a very large and unique subjective image quality and distortion dataset. 
This new perceptual resource, which we call the LIVE-Meta VI-UGC Database, contains $40$K real-world distorted VI-UGC images and $40$K patches, on which we recorded $2.7$M human perceptual quality judgments and $2.7$M distortion labels. Using this psychometric resource we also created an automatic blind picture quality and distortion predictor that learns local-to-global spatial quality relationships, achieving state-of-the-art prediction performance on VI-UGC pictures, significantly outperforming existing picture quality models on this unique class of distorted picture data. We also created a prototype feedback system that helps to guide users to mitigate quality issues and take better quality pictures, by creating a multi-task learning framework.	翻訳日:2023-05-16 17:53:24 公開日:2023-05-14
# 結束効果モデリングによる大規模行動空間のオフポリシー評価 Off-Policy Evaluation for Large Action Spaces via Conjunct Effect Modeling ( http://arxiv.org/abs/2305.08062v1 ) ライセンス: Link先を確認	Yuta Saito, Qingyang Ren, Thorsten Joachims	(参考訳) 従来の重要度重み付けアプローチが過度なばらつきを被る大規模離散行動空間における文脈的バンディットポリシーのオフポリシー評価(ope)について検討した。この分散問題を回避すべく,結束効果モデル(cem)に基づく新たな推定器であるoffcemを提案し,因果効果のクラスター効果への新しい分解と残留効果を提案する。 OffCEMは、アクションクラスタのみに重み付けを適用し、モデルベースの報酬推定を通じて残留因果効果に対処する。提案した推定器は局所的正当性と呼ばれる新しい条件下では偏りがなく, 残差効果モデルが各クラスタ内の動作の相対的な報酬差を保持する必要がある。また,CEMと局所的正当性を最大限に活用するために,第1ステップのバイアスと第2ステップのばらつきを最小化するモデルベース推定法を提案する。その結果,従来の推定器に比べてバイアスやばらつきが大幅に改善されることがわかった。 OffCEMは、特に多くのアクションが存在する場合、OPEを大幅に改善することを示した。 We study off-policy evaluation (OPE) of contextual bandit policies for large discrete action spaces where conventional importance-weighting approaches suffer from excessive variance. To circumvent this variance issue, we propose a new estimator, called OffCEM, that is based on the conjunct effect model (CEM), a novel decomposition of the causal effect into a cluster effect and a residual effect. OffCEM applies importance weighting only to action clusters and addresses the residual causal effect through model-based reward estimation. We show that the proposed estimator is unbiased under a new condition, called local correctness, which only requires that the residual-effect model preserves the relative expected reward differences of the actions within each cluster. To best leverage the CEM and local correctness, we also propose a new two-step procedure for performing model-based estimation that minimizes bias in the first step and variance in the second step. We find that the resulting OffCEM estimator substantially improves bias and variance compared to a range of conventional estimators. Experiments demonstrate that OffCEM provides substantial improvements in OPE especially in the presence of many actions.	翻訳日:2023-05-16 17:52:52 公開日:2023-05-14
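A minimal sketch of the estimator structure the OffCEM abstract describes: importance weighting is applied only at the level of action clusters, and the residual effect is handled by a reward model. The data layout (policies as nested dictionaries, a fixed `cluster_of` mapping) and all names are assumptions made for this illustration, not the authors' implementation.

```python
def offcem_estimate(logged, target_policy, logging_policy, cluster_of, reward_model):
    """Estimate the value of target_policy from logged bandit data.

    logged         -- list of (context, action, reward) tuples
    target_policy  -- {context: {action: probability}}
    logging_policy -- {context: {action: probability}}
    cluster_of     -- {action: cluster id}
    reward_model   -- function (context, action) -> predicted reward
    """
    total = 0.0
    for x, a, r in logged:
        c = cluster_of[a]
        # Marginal probability of the logged action's cluster under each policy.
        pi_c = sum(p for b, p in target_policy[x].items() if cluster_of[b] == c)
        pi0_c = sum(p for b, p in logging_policy[x].items() if cluster_of[b] == c)
        cluster_weight = pi_c / pi0_c
        # Model-based term: expected predicted reward under the target policy.
        model_term = sum(p * reward_model(x, b) for b, p in target_policy[x].items())
        total += cluster_weight * (r - reward_model(x, a)) + model_term
    return total / len(logged)
```

Because the weight depends on cluster probabilities rather than on individual action probabilities, it stays moderate even when the number of actions is large, which is the variance issue the abstract targets.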
# デジタルの兄弟姉妹が自動運転テストを改善する Two is Better Than One: Digital Siblings to Improve Autonomous Driving Testing ( http://arxiv.org/abs/2305.08060v1 ) ライセンス: Link先を確認	Matteo Biagiola, Andrea Stocco, Vincenzo Riccio, Paolo Tonella	(参考訳) シミュレーションベースのテストは、自動運転ソフトウェアの信頼性を確保するための重要なステップである。実際には、企業が社内またはアウトソーステストのためにサードパーティの汎用シミュレータに依存している場合、実際の自動運転車に対するテスト結果の一般化が危ぶまれる。本稿では,異なる技術で構築された複数の汎用シミュレータ上でAVをテストするための新しい枠組みであるデジタル兄弟の概念を導入することで,シミュレーションに基づくテストを強化する。まず、個々のシミュレータごとにテストケースを自動的に生成する。次に、テストはシミュレータ間で移行され、特徴マップを使用して実行された運転条件を特徴づける。最後に、結合予測故障確率を算出し、兄弟間で合意した場合にのみ故障を報告する。このフレームワークを2つのオープンソースのシミュレータを用いて実装し,大規模なテストケースで、縮小スケールの実機自動運転車のデジタルツインと実証的に比較した。本研究は,デジタルツインの故障予測において,デジタル兄弟によるアンサンブル故障予測器が個々のシミュレータよりも優れていることを示す。我々は,自動運転ソフトウェアの自動テストに関心を持つ研究者に,我々のフレームワークが役立ついくつかの方法について論じる。 Simulation-based testing represents an important step to ensure the reliability of autonomous driving software. In practice, when companies rely on third-party general-purpose simulators, either for in-house or outsourced testing, the generalizability of testing results to real autonomous vehicles is at stake. In this paper, we strengthen simulation-based testing by introducing the notion of digital siblings, a novel framework in which the AV is tested on multiple general-purpose simulators, built with different technologies. First, test cases are automatically generated for each individual simulator. Then, tests are migrated between simulators, using feature maps to characterize the exercised driving conditions. Finally, the joint predicted failure probability is computed and a failure is reported only in cases of agreement among the siblings. We implemented our framework using two open-source simulators and we empirically compared it against a digital twin of a physical scaled autonomous vehicle on a large set of test cases. Our study shows that the ensemble failure predictor by the digital siblings is superior to each individual simulator at predicting the failures of the digital twin. 
We discuss several ways in which our framework can help researchers interested in automated testing of autonomous driving software.	翻訳日:2023-05-16 17:52:35 公開日:2023-05-14
# イベントレベルのビデオ質問応答に対する意味認識動的回顧・展望推論 Semantic-aware Dynamic Retrospective-Prospective Reasoning for Event-level Video Question Answering ( http://arxiv.org/abs/2305.08059v1 ) ライセンス: Link先を確認	Chenyang Lyu, Tianbo Ji, Yvette Graham, Jennifer Foster	(参考訳) Event-Level Video Question Answering (EVQA)は、最適な回答を提供するのに必要な視覚的情報を得るために、ビデオイベント全体にわたる複雑な推論を必要とする。しかしながら、モデル性能の大幅な向上にもかかわらず、質問と視覚的情報の間の、特にイベントレベルでの明示的な意味的関係の利用に焦点をあてた研究は少ない。ビデオフレーム間の複雑な推論を容易にするために、このようなセマンティック接続を使用する必要がある。そこで本稿では,ビデオによる質問応答に対する意味認識型の動的回顧・展望推論手法を提案する。具体的には、質問のSRL構造(エージェント、動詞、患者など)のどの部分に焦点を当てているかに基づいて次のフレームに移行することを決定する動的推論プロセスにおいて、質問のセマンティックロールラベル(SRL)構造を明示的に使用する。ベンチマークEVQAデータセットであるTrafficQAで実験を行う。その結果,提案手法は従来の最先端モデルと比較して優れた性能を示すことがわかった。私たちのコードは研究用に公開されます。 Event-Level Video Question Answering (EVQA) requires complex reasoning across video events to obtain the visual information needed to provide optimal answers. However, despite significant progress in model performance, few studies have focused on using the explicit semantic connections between the question and visual information, especially at the event level. There is a need to use such semantic connections to facilitate complex reasoning across video frames. Therefore, we propose a semantic-aware dynamic retrospective-prospective reasoning approach for video-based question answering. Specifically, we explicitly use the Semantic Role Labeling (SRL) structure of the question in the dynamic reasoning process where we decide to move to the next frame based on which part of the SRL structure (agent, verb, patient, etc.) of the question is being focused on. We conduct experiments on a benchmark EVQA dataset - TrafficQA. Results show that our proposed approach achieves superior performance compared to previous state-of-the-art models. Our code will be made publicly available for research use.	翻訳日:2023-05-16 17:52:18 公開日:2023-05-14
# CREMP: 機械学習のためのマクロ環状ペプチドのコンバータロータマーアンサンブル CREMP: Conformer-Rotamer Ensembles of Macrocyclic Peptides for Machine Learning ( http://arxiv.org/abs/2305.08057v1 ) ライセンス: Link先を確認	Colin A. Grambow, Hayley Weir, Christian N. Cunningham, Tommaso Biancalani, Kangway V. Chuang	(参考訳) 大環状ペプチドのコンフォメーションランドスケープをモデル化するための計算および機械学習アプローチは、合理的な設計と最適化を可能にする可能性がある。しかしながら、マクロサイクルジオメトリをモデリングするための正確で高速でスケーラブルな手法は、いまだに解明されていない。近年の深層学習はタンパク質構造の予測と小分子コンホメーションアンサンブルの生成を著しく促進しているが,その特異な性質から,大環状ペプチドに対する同様の進歩は得られていない。本稿では,マクロ環状ペプチドの機械学習モデルの開発と評価のための資源であるCREMPを紹介する。 CREMPは36,198個のマクロ環状ペプチドと、Conformer-Rotamer Ensemble Sampling Tool (CREST)を用いて生成された高品質な構造アンサンブルを含む。この新しいデータセットには3130万近いユニークなマクロサイクルジオメトリが含まれており、それぞれが半経験的拡張タイト結合(xtb)dft計算から得られるエネルギーをアノテートしている。このデータセットは、新しい治療のためのペプチド設計と最適化を改善する機械学習モデルの開発を可能にすることを期待する。 Computational and machine learning approaches to model the conformational landscape of macrocyclic peptides have the potential to enable rational design and optimization. However, accurate, fast, and scalable methods for modeling macrocycle geometries remain elusive. Recent deep learning approaches have significantly accelerated protein structure prediction and the generation of small-molecule conformational ensembles, yet similar progress has not been made for macrocyclic peptides due to their unique properties. Here, we introduce CREMP, a resource generated for the rapid development and evaluation of machine learning models for macrocyclic peptides. CREMP contains 36,198 unique macrocyclic peptides and their high-quality structural ensembles generated using the Conformer-Rotamer Ensemble Sampling Tool (CREST). Altogether, this new dataset contains nearly 31.3 million unique macrocycle geometries, each annotated with energies derived from semi-empirical extended tight-binding (xTB) DFT calculations. We anticipate that this dataset will enable the development of machine learning models that can improve peptide design and optimization for novel therapeutics.	
翻訳日:2023-05-16 17:52:00 公開日:2023-05-14
# 最適探索空間サイズ学習による非線形モデル予測制御の遺伝的最適化 Accelerating genetic optimization of nonlinear model predictive control by learning optimal search space size ( http://arxiv.org/abs/2305.08094v1 ) ライセンス: Link先を確認	Eslam Mostafa, Hussein A. Aly, Ahmed Elliethy	(参考訳) 非線形モデル予測制御(NMPC)は、制御サイクル毎にシステムの最適制御入力を推定するために多変量最適化問題を解く。このような最適化は、システム内で継承された非線形性、高度に結合された入力、システムの物理的制限に関連する様々な制約などによってより困難にされている。これらの要因により、最適化は非凸であり、伝統的に解決するのは難しい。遺伝的アルゴリズム(GA)は、解推定に差分計算や勾配評価を含まないため、いくつかのアプリケーション領域でそのような最適化に取り組むために一般的に広く使われている。しかし、GAが最適制御入力を探索する検索空間のサイズは、迅速な応答を必要とするシステムによるGAの適用性に不可欠である。本稿では,NMPCの遺伝的最適化を最適探索空間サイズを学習することで高速化する手法を提案する。提案手法は多変量回帰モデルを訓練し,制御サイクル毎に最小探索空間を適応的に予測する。この探索空間内の最適制御入力を探索できるように、推定最小の探索空間がGAに供給される。提案手法はgaの計算時間を短縮するだけでなく,各サイクルにおける最適制御入力を得る可能性を向上させる。提案手法は2つの非線形システム上で評価され、Nvidia Jetson TX2組み込みプラットフォームのGPU上に実装された他の2つの遺伝的NMPCアプローチと比較された。その結果,提案手法は計算時間を39-53\%削減できることがわかった。さらに、サイクル時間内の最適な制御入力に対する収束率を48-56\%増加させ、結果として大幅な性能向上をもたらす。ソースコードはgithubで公開されている。 Nonlinear model predictive control (NMPC) solves a multivariate optimization problem to estimate the system's optimal control inputs in each control cycle. Such optimization is made more difficult by several factors, such as nonlinearities inherited in the system, highly coupled inputs, and various constraints related to the system's physical limitations. These factors make the optimization to be non-convex and hard to solve traditionally. Genetic algorithm (GA) is typically used extensively to tackle such optimization in several application domains because it does not involve differential calculation or gradient evaluation in its solution estimation. However, the size of the search space in which the GA searches for the optimal control inputs is crucial for the applicability of the GA with systems that require fast response. This paper proposes an approach to accelerate the genetic optimization of NMPC by learning optimal search space size. 
The proposed approach trains a multivariate regression model to adaptively predict the best smallest search space in every control cycle. The estimated best smallest size of search space is fed to the GA to allow for searching the optimal control inputs within this search space. The proposed approach not only reduces the GA's computational time but also improves the chance of obtaining the optimal control inputs in each cycle. The proposed approach was evaluated on two nonlinear systems and compared with two other genetic-based NMPC approaches implemented on the GPU of a Nvidia Jetson TX2 embedded platform in a processor-in-the-loop (PIL) fashion. The results show that the proposed approach provides a 39-53\% reduction in computational time. Additionally, it increases the convergence percentage to the optimal control inputs within the cycle's time by 48-56\%, resulting in a significant performance enhancement. The source code is available on GitHub.	翻訳日:2023-05-16 17:45:08 公開日:2023-05-14
# アジャイル開発のためのAI:メタ分析 AI for Agile development: a Meta-Analysis ( http://arxiv.org/abs/2305.08093v1 ) ライセンス: Link先を確認	Beatriz Cabrero-Daniel	(参考訳) 本研究は,継続的インテグレーションとデリバリの改善に重点を置いた,人工知能とアジャイルソフトウェア開発方法論を統合することのメリットと課題について検討する。検索した研究の体系的な文献レビューと縦断的なメタ分析を行い、アジャイルソフトウェア開発における人工知能の役割と今後の応用について分析した。このレビューは、特別な社会技術専門知識の必要性など、重要な課題を特定するのに役立った。人工知能はソフトウェア開発プラクティスの改善を約束する一方で、プロセスや実践者への影響をより深く理解し、その実装に関連する間接的な課題に対処するためには、さらなる研究が必要である。 This study explores the benefits and challenges of integrating Artificial Intelligence with Agile software development methodologies, focusing on improving continuous integration and delivery. A systematic literature review and longitudinal meta-analysis of the retrieved studies were conducted to analyse the role of Artificial Intelligence and its future applications within Agile software development. The review helped identify critical challenges, such as the need for specialised socio-technical expertise. While Artificial Intelligence holds promise for improved software development practices, further research is needed to better understand its impact on processes and practitioners, and to address the indirect challenges associated with its implementation.	翻訳日:2023-05-16 17:44:42 公開日:2023-05-14
# Meta-DM: 少数ショット学習における拡散モデルの応用 Meta-DM: Applications of Diffusion Models on Few-Shot Learning ( http://arxiv.org/abs/2305.08092v1 ) ライセンス: Link先を確認	Wentao Hu, Xiurong Jiang, Jiarun Liu, Yuqi Yang, Hui Tian	(参考訳) 少数ショット学習(FSL)の分野では、ネットワーク構造とトレーニング戦略の改善に重点を置いた研究が広く行われている。しかし、データ処理モジュールの役割は十分に解明されていない。そこで本稿では,拡散モデルに基づくFSL問題の一般化データ処理モジュールであるMeta-DMを提案する。 Meta-DMはシンプルだが効果的なモジュールであり、既存のFSLメソッドと簡単に統合でき、教師あり設定と教師なし設定の両方で大幅なパフォーマンス向上をもたらす。Meta-DMの理論解析を行い,その性能をいくつかのアルゴリズムで評価する。実験の結果,Meta-DMと特定の手法を組み合わせることで,最先端の成果が得られることがわかった。 In the field of few-shot learning (FSL), extensive research has focused on improving network structures and training strategies. However, the role of data processing modules has not been fully explored. Therefore, in this paper, we propose Meta-DM, a generalized data processing module for FSL problems based on diffusion models. Meta-DM is a simple yet effective module that can be easily integrated with existing FSL methods, leading to significant performance improvements in both supervised and unsupervised settings. We provide a theoretical analysis of Meta-DM and evaluate its performance on several algorithms. Our experiments show that combining Meta-DM with certain methods achieves state-of-the-art results.	翻訳日:2023-05-16 17:44:33 公開日:2023-05-14
# プロンプトベースのブラックボックスチューニングをカラフルに:3つの直交する視点からのモデル一般化の促進 Make Prompt-based Black-Box Tuning Colorful: Boosting Model Generalization from Three Orthogonal Perspectives ( http://arxiv.org/abs/2305.08088v1 ) ライセンス: Link先を確認	Qiushi Sun, Chengcheng Han, Nuo Chen, Renyu Zhu, Jingyang Gong, Xiang Li, Ming Gao	(参考訳) 大規模言語モデル(LLM)は、様々な自然言語処理(NLP)タスクで力を増している。しかし、これらのモデルを下流タスクにチューニングするには、通常、法外なコストを必要とするか、商業的な考慮のために利用できない。近年,タスク固有のプロンプトを勾配や隠れ表現にアクセスせずに最適化することで,この問題に対処するブラックボックスチューニングが提案されている。しかし、既存の作品の多くは、少数ショット学習のシナリオで、勾配なし最適化の可能性をまだ完全には活用していない。本稿では,ブラックボックス最適化の効率性と性能を向上させるための,単純かつ補完的な手法群であるBBT-RGBについて述べる。具体的には,(1)高速収束と過剰フィッティングの緩和を容易にする二段階微分自由最適化戦略,(2)少数ショット設定における新しい使い方を伴う自動バーバライザ構築,(3)指示探索と自動選択デモンストレーションに基づくより良いプロンプト初期化ポリシー,の3つのプラグアンドプレイ要素を含む。自然言語の理解と推論に関する多岐にわたる実験により,本手法の有効性が示された。私たちのコードはhttps://github.com/QiushiSun/BBT-RGBで公開されています。 Large language models (LLMs) have shown increasing power on various natural language processing (NLP) tasks. However, tuning these models for downstream tasks usually needs exorbitant costs or is unavailable due to commercial considerations. Recently, black-box tuning has been proposed to address this problem by optimizing task-specific prompts without accessing the gradients and hidden representations. However, most existing works have yet to fully exploit the potential of gradient-free optimization under the scenario of few-shot learning. In this paper, we describe BBT-RGB, a suite of straightforward and complementary techniques for enhancing the efficiency and performance of black-box optimization. Specifically, our method includes three plug-and-play components: (1) Two-stage derivative-free optimization strategy that facilitates fast convergence and mitigates overfitting; (2) Automatic verbalizer construction with its novel usage under few-shot settings; (3) Better prompt initialization policy based on instruction search and auto-selected demonstration. 
Extensive experiments across various tasks on natural language understanding and inference demonstrate the effectiveness of our method. Our codes are publicly available at https://github.com/QiushiSun/BBT-RGB.	翻訳日:2023-05-16 17:44:23 公開日:2023-05-14
# deep learning empowered type-ii codebook: csiフィードバック強化のための新しい展望 Deep Learning Empowered Type-II Codebook: New Perspectives for Enhancing CSI Feedback ( http://arxiv.org/abs/2305.08081v1 ) ライセンス: Link先を確認	Ke Ma, Yiliang Sang, Yang Ming, Jin Lian, Chang Tian, Zhaocheng Wang	(参考訳) 周波数分割二重系における深層学習に基づくチャネル状態情報(CSI)フィードバックは、学術と産業の両方で広く注目を集めている。本稿では,CSIフィードバックの性能向上のため,無線通信規格におけるType-IIコーデックブックの深層学習への統合に焦点をあてる。 Release 16 Type-IIコードブックに関する既存のディープラーニングベースの研究とは対照的に、Release 17(R17)のType-IIコードブックでは、アップリンクとダウンリンクチャネルの間の角-遅延領域部分の相反性を、ダウンリンクCSIの測定とフィードバックのための角-遅延領域ポートの選択に利用している。この問題に対処するため、我々はR17 Type-IIコードブックを改善するためにディープラーニングを採用する2つの新しい視点を提案する。まず、アップリンクチャネルの信号対雑音比の低さを考慮して、深層学習を用いて、焦点損失を利用してクラス不均衡問題を解決する支配的な角遅延領域ポートを正確に選択する。第2に,基地局におけるR17 Type-IIコードブックのフィードバックに基づいて,深層学習を用いてダウンリンクCSIを再構築し,スパース構造の情報を効果的に活用することを提案する。さらに,重み付きショートカットモジュールを設計し,実際のマルチユーザシナリオに適応するために,平均2乗誤差と和率を組み合わせた2段階の損失関数を提案する。シミュレーションの結果,本提案手法は従来のr17 type-ii コードブックやdeep learning ベンチマークと比較して,総和率を向上できることがわかった。 Deep learning based channel state information (CSI) feedback in frequency division duplex systems has drawn widespread attention in both academia and industry. In this paper, we focus on integrating the Type-II codebook in the wireless communication standards with deep learning to enhance the performance of CSI feedback. In contrast to the existing deep learning based studies on the Release 16 Type-II codebook, the Type-II codebook in Release 17 (R17) exploits the angular-delay-domain partial reciprocity between uplink and downlink channels to select part of angular-delay-domain ports for measuring and feeding back the downlink CSI, where the performance of deep learning based conventional methods is limited due to the deficiency of sparse structures. To address this issue, we propose two new perspectives of adopting deep learning to improve the R17 Type-II codebook. 
Firstly, considering the low signal-to-noise ratio of uplink channels, deep learning is utilized to accurately select the dominant angular-delay-domain ports, where the focal loss is harnessed to solve the class imbalance problem. Secondly, we propose to adopt deep learning to reconstruct the downlink CSI based on the feedback of the R17 Type-II codebook at the base station, where the information of sparse structures can be effectively leveraged. Furthermore, a weighted shortcut module is designed to facilitate the accurate reconstruction, and a two-stage loss function that combines the mean squared error and sum rate is proposed for adapting to practical multi-user scenarios. Simulation results demonstrate that our proposed deep learning based port selection and CSI reconstruction methods can improve the sum rate performance compared with the traditional R17 Type-II codebook and deep learning benchmarks.	翻訳日:2023-05-16 17:44:01 公開日:2023-05-14
# 広視野眼底画像から複数の網膜疾患を認識するためのクロスドメイン協調学習 Cross-domain Collaborative Learning for Recognizing Multiple Retinal Diseases from Wide-Field Fundus Images ( http://arxiv.org/abs/2305.08078v1 ) ライセンス: Link先を確認	Qijie Wei, Jingyuan Yang, Bo Wang, Jinrui Wang, Jianchun Zhao, Xinyu Zhao, Sheng Yang, Niranchana Manivannan, Youxin Chen, Dayong Ding and Xirong Li	(参考訳) 本稿では,広視野 (WF) と超広視野 (UWF) の眼底画像から複数の網膜疾患を認識するための課題について述べる。既存のラベル付きカラーファンドス写真(CFP)データを効果的に再利用するために,クロスドメイン協調学習(CdCL)を提案する。教師なしドメイン適応における固定比に基づくミックスアップの成功に触発されて、我々はこの戦略を現在のタスクに再活用する。 CFP画像とWF/UWF画像の視野の違いにより,CFP画像の解剖学的構造がWF/UWF画像よりもかなり大きくなるという,スケールバイアスが自然に存在する。 CdCL法は,変圧器を用いたスケール・バイアス補正法により,スケール不変な特徴を生成できる。 wf画像とuwf画像の両方をカバーする複数のデータセットに関する広範囲な実験によって示されているように、提案手法は多くの競合ベースラインと比較できる。 This paper addresses the emerging task of recognizing multiple retinal diseases from wide-field (WF) and ultra-wide-field (UWF) fundus images. For an effective reuse of existing labeled color fundus photo (CFP) data, we propose Cross-domain Collaborative Learning (CdCL). Inspired by the success of fixed-ratio based mixup in unsupervised domain adaptation, we re-purpose this strategy for the current task. Due to the intrinsic disparity between the field-of-view of CFP and WF/UWF images, a scale bias naturally exists in a mixup sample that the anatomic structure from a CFP image will be considerably larger than its WF/UWF counterpart. The CdCL method resolves the issue by Scale-bias Correction, which employs Transformers for producing scale-invariant features. As demonstrated by extensive experiments on multiple datasets covering both WF and UWF images, the proposed method compares favorably against a number of competitive baselines.	翻訳日:2023-05-16 17:43:27 公開日:2023-05-14
# 温熱快適性とプライバシを考慮した住宅需要対応プログラムコストの最適化 Optimization of Residential Demand Response Program Cost with Consideration for Occupants Thermal Comfort and Privacy ( http://arxiv.org/abs/2305.08077v1 ) ライセンス: Link先を確認	Reza Nematirad, M. M. Ardehali, and Amir Khorsandi	(参考訳) 住宅利用者は、住宅エネルギー管理システム(HEMS)を利用することができれば需要対応プログラム(DRP)を使用でき、空調設定点(AC)を自動的に調整し、一部の機器をオフピーク時間にシフトすることで、消費者のコストを低減できる。 HEMSが占有状況を知っている場合、消費者はより多くの経済的利益と熱的快適さを得ることができる。しかし、建物の居住状況では、直接センシングは費用がかかり、不正確であり、住民にとって侵入的である。したがって、予測アルゴリズムは効果的な代替手段になり得る。本研究の目的は, スマート住宅におけるdrp活用のための多目的シミュレーションモデルの構築に向けて, 非誘惑的, 正確, かつ費用対効果の高い手法を提案することである。 (a)電気負荷の低減 (b)熱快適度(ac)温度設定点の調整及び (c)最悪のシナリオアプローチは非常に保守的です。なぜなら、不確実なパラメータが常に最悪の値を取る可能性は低いからです。そこで,不確かさを現実的に考慮するために,不確実性予算とともに柔軟なロバストな最適化手法を開発した。シミュレーションの結果,不確実性を考慮するとコストが36%増加し,交流温度設定値が低下することが示唆された。さらに、DRPを使用すると、一部の家電事業をオフピーク時間に切り替えて需要を減らし、コストを13.2%削減する。 Residential consumers can use the demand response program (DRP) if they can utilize the home energy management system (HEMS), which reduces consumer costs by automatically adjusting air conditioning (AC) setpoints and shifting some appliances to off-peak hours. If HEMS knows occupancy status, consumers can gain more economic benefits and thermal comfort. However, for the building occupancy status, direct sensing is costly, inaccurate, and intrusive for residents. So, forecasting algorithms could serve as an effective alternative. The goal of this study is to present a non-intrusive, accurate, and cost-effective approach, to develop a multi-objective simulation model for the application of DRPs in a smart residential house, where (a) electrical load demand reduction, (b) adjustment in thermal comfort (AC) temperature setpoints, and (c) , worst cases scenario approach is very conservative. Because that is unlikely all uncertain parameters take their worst values at all times. So, the flexible robust counterpart optimization along with uncertainty budgets is developed to consider uncertainty realistically. 
Simulated results indicate that considering uncertainty increases the costs by 36 percent and decreases the AC temperature setpoints. Besides, using DRPs reduces demand by shifting some appliance operations to off-peak hours and lowers costs by 13.2 percent.	翻訳日:2023-05-16 17:43:09 公開日:2023-05-14
# 教員助手による防衛蒸留の改善 Improving Defensive Distillation using Teacher Assistant ( http://arxiv.org/abs/2305.08076v1 ) ライセンス: Link先を確認	Maniratnam Mandal and Suna Gao	(参考訳) 敵の攻撃は、現代のアプリケーションに適用されるディープニューラルネットワークのセキュリティと安全性に重大な脅威をもたらす。より具体的には、コンピュータビジョンに基づくタスクでは、専門家はモデルアーキテクチャの知識を使って、人間の目には見えない敵対的なサンプルを作成することができる。これらの攻撃は、自動運転車や顔認識などの一般的なアプリケーションでセキュリティ上の問題を引き起こす可能性がある。したがって、このような攻撃に対して堅牢なネットワークの構築は非常に望ましいものであり、不可欠である。文献にみられる様々な方法のうち、近年は防御蒸留が期待されている。知識蒸留を用いて、研究者はこれらの攻撃に対して堅牢なモデルを作ることができた。しかし、防御蒸留の弱さを露呈する攻撃が増えている。本研究では,教師補助知識蒸留から着想を得て,補助ネットワークの導入により,蒸留モデルのロバスト性が向上することを示す。一連の実験を通じて, 蒸留温度の異なる蒸留モデルについて, 精度, 感度, 堅牢性の観点から評価した。実験の結果,提案仮説はほとんどの場合,ロバスト性が向上することが示された。さらに,多段蒸留はモデルの精度にほとんど影響を与えず,ロバスト性をさらに向上させることができることを示した。 Adversarial attacks pose a significant threat to the security and safety of deep neural networks being applied to modern applications. More specifically, in computer vision-based tasks, experts can use the knowledge of model architecture to create adversarial samples imperceptible to the human eye. These attacks can lead to security problems in popular applications such as self-driving cars, face recognition, etc. Hence, building networks which are robust to such attacks is highly desirable and essential. Among the various methods present in literature, defensive distillation has shown promise in recent years. Using knowledge distillation, researchers have been able to create models robust against some of those attacks. However, more attacks have been developed exposing weakness in defensive distillation. In this project, we derive inspiration from teacher assistant knowledge distillation and propose that introducing an assistant network can improve the robustness of the distilled model. Through a series of experiments, we evaluate the distilled models for different distillation temperatures in terms of accuracy, sensitivity, and robustness. Our experiments demonstrate that the proposed hypothesis can improve robustness in most cases. 
Additionally, we show that multi-step distillation can further improve robustness with very little impact on model accuracy.	翻訳日:2023-05-16 17:42:49 公開日:2023-05-14
# コンピュータビジョンのための圧縮技術の解析 Analyzing Compression Techniques for Computer Vision ( http://arxiv.org/abs/2305.08075v1 ) ライセンス: Link先を確認	Maniratnam Mandal and Imran Khan	(参考訳) 深層ネットワーク圧縮は、コンピュータビジョンアプリケーションにおける実用的なユースケースとして非常に望ましい。文献でいくつかの技術が研究され、それらを組み合わせるための効率的な戦略を見つける研究が進められている。本研究の目的は, 知識蒸留, プルーニング, 量子化の3つの基本的な圧縮手法について検討することであった。基本手法とともに、それらを逐次的に組み合わせることの有効性についても検証する。 MNIST と CIFAR-10 のデータセットを用いて解析し、その結果と、それらから推測される観測結果を提示する。 Compressing deep networks is highly desirable for practical use-cases in computer vision applications. Several techniques have been explored in the literature, and research has been done in finding efficient strategies for combining them. For this project, we aimed to explore three different basic compression techniques - knowledge distillation, pruning, and quantization for small-scale recognition tasks. Along with the basic methods, we also test the efficacy of combining them in a sequential manner. We analyze them using MNIST and CIFAR-10 datasets and present the results along with few observations inferred from them.	翻訳日:2023-05-16 17:42:33 公開日:2023-05-14
# カオスにおける直交多項式近似と拡張動的モード分解 Orthogonal polynomial approximation and Extended Dynamic Mode Decomposition in chaos ( http://arxiv.org/abs/2305.08074v1 ) ライセンス: Link先を確認	Caroline L. Wormell	(参考訳) extended dynamic mode decomposition (edmd) は、物理科学において広く取り上げられている、ダイナミクスの予測とモデル還元のためのデータ駆動ツールである。この手法は概念的には単純であるが、決定論的カオスでは、その性質が何であるか、何に収束するかは明らかではない。特に、EDMDの最小二乗近似がカオス力学を理解するのに必要な正規関数のクラスをどのように扱うかは明らかではない。本稿では、カオス写像の最も単純な例である円の膨張写像を解析する、EDMDの厳密な一般理論を開発する。単位円(OPUC)上の直交多項式の理論における新たな結果を示すと、無限データ極限において、最小二乗射影は多項式可観測辞書に対して指数関数的に効率的であることを示す。その結果,EDMDを用いて作成した予測値とクープマンスペクトルデータは,指数速度で物理的に有意な限界に収束することがわかった。これは、比較的小さな多項式辞書だけでは、サンプリング測度が均一でない場合でも、EDMDは非常に効果的であることを示す。さらに, OPUCの結果から, データに基づく最小二乗予測が極めて効果的な近似手法である可能性が示唆された。 Extended Dynamic Mode Decomposition (EDMD) is a data-driven tool for forecasting and model reduction of dynamics, which has been extensively taken up in the physical sciences. While the method is conceptually simple, in deterministic chaos it is unclear what its properties are or even what it converges to. In particular, it is not clear how EDMD's least-squares approximation treats the classes of regular functions needed to make sense of chaotic dynamics. In this paper we develop a general, rigorous theory of EDMD on the simplest examples of chaotic maps: analytic expanding maps of the circle. Proving a new result in the theory of orthogonal polynomials on the unit circle (OPUC), we show that in the infinite-data limit, the least-squares projection is exponentially efficient for polynomial observable dictionaries. As a result, we show that the forecasts and Koopman spectral data produced using EDMD in this setting converge to the physically meaningful limits, at an exponential rate. This demonstrates that with only a relatively small polynomial dictionary, EDMD can be very effective, even when the sampling measure is not uniform. Furthermore, our OPUC result suggests that data-based least-squares projections may be a very effective approximation strategy.	
翻訳日:2023-05-16 17:42:25 公開日:2023-05-14
# 行列積状態からの高次ベリー曲率 Higher Berry curvature from matrix product states ( http://arxiv.org/abs/2305.08109v1 ) ライセンス: Link先を確認	Ken Shiozaki, Niclas Heinsdorf, Shuhei Ohyama	(参考訳) 高いベリー曲率は、有限自由度を持つ量子力学系におけるベリー曲率の有限次元における量子多体系への拡張としてカプスティンとスポディニコによって導入された。本稿では,翻訳不変行列積状態を用いた高次ベリー曲率の代替定式化を提案する。これらは、離散化されたパラメータ空間を通して断熱的に進化するギャップ付きハミルトン多様体の基底状態である。行列積状態は射影表現の下で変換されるので、パラメータ空間を通る閉ループ上のベリー曲率の評価は、すべてのゲージの自由度を固定するのに十分ではない。ゲージ不変実量を得るため、パラメータ空間における小さなテトラヘドラ上で高次元ベリー曲率を評価する。数値計算により,Adiabatic進化を通じて高いベリー曲率が連続的に変化し,閉じた3次元パラメータ空間上で量子化されることを確認した。 The higher Berry curvature was introduced by Kapustin and Spodyneiko as an extension of the Berry curvature in quantum mechanical systems with finite degrees of freedom to quantum many-body systems in finite spatial dimensions. In this paper, we propose an alternative formulation of the higher Berry curvature using translationally invariant matrix product states. They are the ground states of a set of gapped Hamiltonians which are evolved adiabatically through a discretized parameter space. Because matrix product states transform under a projective representation, evaluating the Berry curvature on a closed loop through parameter space is not sufficient to fix all the gauge degrees of freedom. To obtain a gauge-invariant real quantity, the higher-dimensional Berry curvature is evaluated on small tetrahedra in parameter space. Our numerical calculations confirm that the higher Berry curvature varies continuously throughout an adiabatic evolution and becomes quantized over a closed 3-dimensional parameter space.	翻訳日:2023-05-16 17:37:01 公開日:2023-05-14
# 香港中学校地理カリキュラムへのGIS統合 Integrating GIS into Hong Kong Secondary School Geography Curriculum ( http://arxiv.org/abs/2305.08108v1 ) ライセンス: Link先を確認	Yin Ching Lai	(参考訳) 香港の地理カリキュラムには2000年代初頭からGISが含まれている。しかし、中等教育におけるGISは、香港の中等地理教育において重要な役割を果たさない。文献レビューによるGISのメリットを分析した結果、GISは上級・中等教育カリキュラムに含めるべきであると考えられる。さらに、香港教育局(EDB)からの明確な指導、香港の地理教師の低い準備、学界や教科書出版社からの支持の無い態度がなければ、GISは香港の中等教育では実施できないことを示している。そこで,EDB,地理教員,学界,教科書出版者に対して,地理教育におけるGISの関与を促進するための提案を行った。 EDBは、教師、アカデミア、教科書出版社の参照のための明確なガイドラインを開発し、教師のための学生中心のGIS教育コースを提供する。教師は高度なGIS技術の準備をし、学生と一緒に学ぶことも重要である。学術誌や教科書出版社は、香港の中学校および上級地理カリキュラムを対象とした無料のGISマップを提供することができる。本報告は、香港地理教育におけるGIS導入に関する簡単な情報を提供するが、香港中等教育におけるGISの利用を促進するため、他の学者による新たなアイデアを刺激することができる。 Hong Kong's senior geography curriculum has included GIS since the early 2000s. However, GIS in secondary schools does not play a significant role in Hong Kong secondary geography education. Analyzing GIS benefits by literature review, it is believed that GIS should be included in both the senior and junior geography curriculum. Moreover, the literature review indicates that without clear instruction from the Hong Kong Education Bureau (EDB), low preparedness of Hong Kong geography teachers, and unsupportive attitudes from academia and textbook publishers, GIS cannot be implemented in secondary schools of Hong Kong. Therefore, suggestions are made for the EDB, geography teachers, academia and textbook publishers to facilitate GIS involvement in senior and junior geography curriculums. The EDB can develop clear guidelines for teachers, academia and textbook publishers' references, and offer student-centered GIS educational courses for teachers. It is important for teachers to be prepared for advanced GIS technology and to even learn along with students. Academics and textbook publishers can provide free GIS maps targeted at Hong Kong's junior and senior geography curriculums. 
Although the report provides brief information towards the GIS implementation in Hong Kong geography education, it can inspire new ideas from other scholars to facilitate the usage of GIS in Hong Kong secondary school geography teaching.	翻訳日:2023-05-16 17:36:44 公開日:2023-05-14
# タクシー需要予測のための時空間データのプライバシーと有用性のバランス Balancing Privacy and Utility of Spatio-Temporal Data for Taxi-Demand Prediction ( http://arxiv.org/abs/2305.08107v1 ) ライセンス: Link先を確認	Yumeki Goto, Tomoya Matsumoto, Hamada Rizk, Naoto Yanai, Hirozumi Yamaguchi	(参考訳) タクシー需要予測は、タクシー提供施設が運転を最適化し、都市計画者が交通インフラやサービスを改善するための機械学習の重要な応用である。しかし、これらのシステムにおける機密データの使用は、プライバシーとセキュリティに関する懸念を引き起こす。本稿では,複数の当事者がデータをプライベートかつセキュアに保ちながら,自身のデータで機械学習モデルをトレーニングできる,タクシー需要予測のためのフェデレーション学習の利用を提案する。これにより、組織はアクセスできないデータに基づいてモデルを構築することができる。潜在的な利点にもかかわらず、タクシー需要予測のための連合学習は、クラス不均衡、一部の当事者間のデータ不足、多様な施設や地理的地域に対応するためのモデル一般化の必要性など、いくつかの技術的課題を提起している。これらの課題を効果的に解決するために,地理的な緯度経度座標に地域非依存エンコーディングを活用するシステムを提案する。これにより、提案するモデルは特定の領域に限らず、任意の領域で最適に実行することができる。さらに,コストに敏感な学習と様々な正規化手法を用いて,データ不足と過剰適合の問題を緩和する。 6か月間のタクシーサービス提供者16社から収集した実世界データによる評価から,本システムでは,統合データで学習した単一モデルと比較して,1%以内の誤差で需要レベルを正確に予測した。また、乗客データに対するメンバーシップ推論攻撃を効果的に防いだ。 Taxi-demand prediction is an important application of machine learning that enables taxi-providing facilities to optimize their operations and city planners to improve transportation infrastructure and services. However, the use of sensitive data in these systems raises concerns about privacy and security. In this paper, we propose the use of federated learning for taxi-demand prediction that allows multiple parties to train a machine learning model on their own data while keeping the data private and secure. This can enable organizations to build models on data they otherwise would not be able to access. Despite its potential benefits, federated learning for taxi-demand prediction poses several technical challenges, such as class imbalance, data scarcity among some parties, and the need to ensure model generalization to accommodate diverse facilities and geographic regions. To effectively address these challenges, we propose a system that utilizes region-independent encoding for geographic lat-long coordinates. 
By doing so, the proposed model is not limited to a specific region, enabling it to perform optimally in any area. Furthermore, we employ cost-sensitive learning and various regularization techniques to mitigate issues related to data scarcity and overfitting, respectively. Evaluation with real-world data collected from 16 taxi service providers in Japan over a period of six months showed the proposed system predicted demand level accurately within 1\% error compared to a single model trained with integrated data. The system also effectively defended against membership inference attacks on passenger data.	翻訳日:2023-05-16 17:36:23 公開日:2023-05-14
# ブロックチェーントランザクション料金予測:機械学習手法の比較 Blockchain Transaction Fee Forecasting: A Comparison of Machine Learning Methods ( http://arxiv.org/abs/2305.08105v1 ) ライセンス: Link先を確認	Conall Butler and Martin Crane	(参考訳) GasはEthereumネットワークのトランザクションフィー計測システムである。ネットワークのユーザは、取引を提出するためにガス価格を選択し、この選択において過払いまたは遅延/未処理の取引のリスクを生じさせる。本研究では,ロンドン・ハードフォークの余波に関するデータを調査し,この大規模フォーク後のネットワークのトランザクションダイナミクスについて考察した。そこで本稿では,EthUSD BitUSDとガス価格の関連について,2019年以前の作業状況について報告する。予測には, 直接再帰型ハイブリッドLSTM, CNNLSTM, Attention LSTMなどの機械学習手法の新たな組み合わせを比較する。これらはウェーブレットしきい値と行列プロファイルデータ処理を組み合わせ、ブロック最小のガス価格を5分間のタイムスケールで複数のルックアヘッドで予測する。本研究は, 行列プロファイルがガス価格データや予測データに適用された最初の応用として, ハードウェアの制約, ハイブリッドモデルの性能, CNNLSTMモデルを考えると, 行列プロファイルデータが注目に基づくモデルを強化することを実証する。入力のウェーブレットコヒーレンス(wavelet coherence)は、1日の時間スケールで複数の変数の相関を示す。直接再帰型ハイブリッドLSTM戦略は他のモデルよりも優れている。ハイブリッドモデルは20分間のルックアヘッドで、25/50分間の予測では注目モデルに匹敵するパフォーマンスである。さまざまなルックアヘッドでの予測によって、ユーザはガス価格選択に関するインフォームドな決定と、取引が拒否されることを恐れずに取引を提出する最適な窓口を選択できる。これは、既存の推奨者やオラクルや予測アプローチよりも、ガス価格のダイナミクスに関するより詳細な洞察を与え、単純なヒューリスティックスや限られた外観の地平線を提供する。 Gas is the transaction-fee metering system of the Ethereum network. Users of the network are required to select a gas price for submission with their transaction, creating a risk of overpaying or delayed/unprocessed transactions in this selection. In this work, we investigate data in the aftermath of the London Hard Fork and shed insight into the transaction dynamics of the network after this major fork. As such, this paper provides an update on work previous to 2019 on the link between EthUSD BitUSD and gas price. For forecasting, we compare a novel combination of machine learning methods such as Direct Recursive Hybrid LSTM, CNNLSTM, and Attention LSTM. These are combined with wavelet threshold denoising and matrix profile data processing toward the forecasting of block minimum gas price, on a 5-min timescale, over multiple lookaheads. 
As far as we are aware, this is the first application of the matrix profile to gas price data and forecasting. The study demonstrates that matrix profile data can enhance attention-based models; however, given the hardware constraints, hybrid models outperformed attention and CNNLSTM models. The wavelet coherence of inputs demonstrates correlation in multiple variables on a 1-day timescale, which is a deviation of base fee from gas price. A Direct-Recursive Hybrid LSTM strategy outperforms other models. Hybrid models have favourable performance up to a 20-min lookahead, with performance being comparable to attention models when forecasting 25/50 min ahead. Forecasts over a range of lookaheads allow users to make an informed decision on gas price selection and on the optimal window in which to submit their transaction without fear of it being rejected. This, in turn, gives more detailed insight into gas price dynamics than existing recommenders, oracles and forecasting approaches, which provide simple heuristics or limited lookahead horizons.	翻訳日:2023-05-16 17:35:58 公開日:2023-05-14
# 有限レート消去チャネル上のフェデレーションTD学習:マルコフサンプリングによる線形高速化 Federated TD Learning over Finite-Rate Erasure Channels: Linear Speedup under Markovian Sampling ( http://arxiv.org/abs/2305.08104v1 ) ライセンス: Link先を確認	Nicol\`o Dal Fabbro, Aritra Mitra and George J. Pappas	(参考訳) フェデレーテッド・ラーニング(FL)は、コミュニケーションとプライバシの制約の下で教師付き学習タスクを高速化する効果により、最近注目を集めている。しかし、強化学習に類似したスピードアップが確立できるかどうかは理論的には理解されていない。本研究では,共通政策の評価を迅速化するために,エージェントが中央アグリゲータを介してコミュニケーションするフェデレート政策評価問題について検討する。 FLにおける典型的な通信制約を捉えるために、ベルヌーイ消去モデルに基づいてパケットをドロップできる有限容量アップリンクチャネルを考える。そこで本稿では,線形関数近似を用いた量子化フェデレーション時間差分学習アルゴリズムQFedTDを提案する。我々の主な技術的貢献はQFedTDの有限サンプル解析を提供することである。 (i) 量子化及び消去が収束率に及ぼす影響を強調する。 (ii) マルコフサンプリング下のエージェント数を線形スピードアップ w.r.t. とする。特に,共役学習,分散最適化,ネットワーク制御系文献において,異なる量子化機構やパケットドロップモデルが広く研究されてきたが,マルチエージェント・共役強化学習における効果の非漸近的解析を初めて行った。 Federated learning (FL) has recently gained much attention due to its effectiveness in speeding up supervised learning tasks under communication and privacy constraints. However, whether similar speedups can be established for reinforcement learning remains much less understood theoretically. Towards this direction, we study a federated policy evaluation problem where agents communicate via a central aggregator to expedite the evaluation of a common policy. To capture typical communication constraints in FL, we consider finite capacity up-link channels that can drop packets based on a Bernoulli erasure model. Given this setting, we propose and analyze QFedTD - a quantized federated temporal difference learning algorithm with linear function approximation. Our main technical contribution is to provide a finite-sample analysis of QFedTD that (i) highlights the effect of quantization and erasures on the convergence rate; and (ii) establishes a linear speedup w.r.t. the number of agents under Markovian sampling. 
Notably, while different quantization mechanisms and packet drop models have been extensively studied in the federated learning, distributed optimization, and networked control systems literature, our work is the first to provide a non-asymptotic analysis of their effects in multi-agent and federated reinforcement learning.	翻訳日:2023-05-16 17:35:27 公開日:2023-05-14
# 水分含有エポキシナノコンポジットの機械学習による粘弾性・粘塑性モデル A machine learning-based viscoelastic-viscoplastic model for epoxy nanocomposites with moisture content ( http://arxiv.org/abs/2305.08102v1 ) ライセンス: Link先を確認	Betim Bahtiri, Behrouz Arash, Sven Scheffler, Maximilian Jux, Raimund Rolfes	(参考訳) 本研究では, ナノ粒子/エポキシナノコンポジットの循環粘弾性・粘塑性損傷挙動を水分含量で解析する, 深層学習に基づく構成モデルを提案する。このため、サンプリング技法と摂動法を組み合わせたフレームワークを用いて、長期短期記憶ネットワークを訓練する。実験で検証された粘弾性・粘塑性モデルによって生成されたトレーニングデータとともに、DLモデルが速度依存性の応力-ひずみ関係と一貫した接線係数を正確に捉えることができる。さらに、DLに基づく構成モデルは有限要素解析に実装される。ナノ粒子/エポキシ試料の力-変位応答に及ぼす荷重速度と水分量の影響を有限要素シミュレーションにより検討した。数値的な例は、DLモデルの計算効率が負荷条件に依存し、従来の構成モデルよりもかなり高いことを示している。さらに, 数値計算結果と実験データを比較すると, 異なるナノ粒子や水分量と良好な一致を示した。 In this work, we propose a deep learning (DL)-based constitutive model for investigating the cyclic viscoelastic-viscoplastic-damage behavior of nanoparticle/epoxy nanocomposites with moisture content. For this, a long short-term memory network is trained using a combined framework of a sampling technique and a perturbation method. The training framework, along with the training data generated by an experimentally validated viscoelastic-viscoplastic model, enables the DL model to accurately capture the rate-dependent stress-strain relationship and consistent tangent moduli. In addition, the DL-based constitutive model is implemented into finite element analysis. Finite element simulations are performed to study the effect of load rate and moisture content on the force-displacement response of nanoparticle/epoxy samples. Numerical examples show that the computational efficiency of the DL model depends on the loading condition and is significantly higher than the conventional constitutive model. Furthermore, comparing numerical results and experimental data demonstrates good agreement with different nanoparticle and moisture contents.	翻訳日:2023-05-16 17:35:07 公開日:2023-05-14
# 正定値核による条件付き平均埋め込みと最適特徴選択 Conditional mean embeddings and optimal feature selection via positive definite kernels ( http://arxiv.org/abs/2305.08100v1 ) ライセンス: Link先を確認	Palle E.T. Jorgensen, Myung-Sin Song, and James Tian	(参考訳) ここでは,条件付き平均埋め込み(CME)に対する新しい演算子理論的アプローチを考える。本稿では,スペクトル解析に基づく最適化手法とカーネル,確率過程,構成学習アルゴリズムの併用について述べる。当初与えられた非線形データに対しては、最適化に基づく特徴選択を検討する。これは、学習モデルからの回帰アルゴリズムによる最適な特徴選択を構築する際に、正定値(p.d.)カーネルの凸集合を使用する。したがって、(適切な学習アルゴリズムのための)トレーニングデータの初期入力により、p.d. kernel $K$ のそれぞれの選択は、様々なヒルベルト空間と特徴の実現をもたらす。ここでの新しい考え方は、選択されたカーネルの集合に対して$K$を凸集合$C$の正定値カーネルの$K$から最適化することである。したがって、特徴表現の「optimal」の選択は、指定された凸集合$C$内のp.d.カーネルの$K$に対する二次最適化に依存する。 Motivated by applications, we consider here new operator theoretic approaches to Conditional mean embeddings (CME). Our present results combine a spectral analysis-based optimization scheme with the use of kernels, stochastic processes, and constructive learning algorithms. For initially given non-linear data, we consider optimization-based feature selections. This entails the use of convex sets of positive definite (p.d.) kernels in a construction of optimal feature selection via regression algorithms from learning models. Thus, with initial inputs of training data (for a suitable learning algorithm), each choice of p.d. kernel $K$ in turn yields a variety of Hilbert spaces and realizations of features. A novel idea here is that we shall allow an optimization over selected sets of kernels $K$ from a convex set $C$ of positive definite kernels $K$. Hence our "optimal" choices of feature representations will depend on a secondary optimization over p.d. kernels $K$ within a specified convex set $C$.	翻訳日:2023-05-16 17:34:51 公開日:2023-05-14
# 遠距離発話レベルの表現のための自己教師型ニューラルファクター解析 Self-supervised Neural Factor Analysis for Disentangling Utterance-level Speech Representations ( http://arxiv.org/abs/2305.08099v1 ) ライセンス: Link先を確認	Weiwei Lin, Chenhang He, Man-Wai Mak, Youzhi Tu	(参考訳) wav2vecやHuBERTのような自己教師付き学習(SSL)音声モデルは、音声認識(ASR)における最先端の性能を示し、低ラベル・リソース設定において非常に有用であることが証明されている。しかし、sslモデルの成功はまだ話者、感情、言語認識といった発話レベルのタスクに移行しておらず、優れたパフォーマンスを得るためにはsslモデルの教師付き微調整が必要である。問題の原因は,異種表現の欠如と,これらの課題に対する発話レベルの学習目標にあると考える。 HuBERTがクラスタリングを使って隠れ音響ユニットを発見する方法に着想を得て、隠れ音響ユニットを用いてSSL機能を整列させる因子分析(FA)モデルを定式化した。下位の発話レベル表現は、一致した特徴に対する確率的推論を用いて、音声の内容から切り離される。さらに、faモデルから派生した変動下限は発話レベルの目標を提供し、エラー勾配をトランスフォーマ層にバックプロパゲーションし、高度に識別可能な音響単位を学ぶことができる。 HuBERTのマスク付き予測トレーニングと組み合わせて使用する場合、私たちのモデルは、ラベル付きデータの20%しか表示されないSUPERBベンチマークの発話レベル非意味タスクにおいて、現在の最高のモデルであるWavLMよりも優れています。 Self-supervised learning (SSL) speech models such as wav2vec and HuBERT have demonstrated state-of-the-art performance on automatic speech recognition (ASR) and proved to be extremely useful in low label-resource settings. However, the success of SSL models has yet to transfer to utterance-level tasks such as speaker, emotion, and language recognition, which still require supervised fine-tuning of the SSL models to obtain good performance. We argue that the problem is caused by the lack of disentangled representations and an utterance-level learning objective for these tasks. Inspired by how HuBERT uses clustering to discover hidden acoustic units, we formulate a factor analysis (FA) model that uses the discovered hidden acoustic units to align the SSL features. The underlying utterance-level representations are disentangled from the content of speech using probabilistic inference on the aligned features. Furthermore, the variational lower bound derived from the FA model provides an utterance-level objective, allowing error gradients to be backpropagated to the Transformer layers to learn highly discriminative acoustic units. 
When used in conjunction with HuBERT's masked prediction training, our models outperform the current best model, WavLM, on all utterance-level non-semantic tasks on the SUPERB benchmark with only 20% of labeled data.	翻訳日:2023-05-16 17:34:38 公開日:2023-05-14
# tao一般微分と差分:理論と応用 Tao General Differential and Difference: Theory and Application ( http://arxiv.org/abs/2305.08098v1 ) ライセンス: Link先を確認	Linmi Tao, Ruiyang Liu, Donglai Tao, Wu Xia, Feilong Ma, Jingmao Cui	(参考訳) 現代の数値解析は離散データ上で行われ、数値差分計算はコアの1つであり、必須である。それにもかかわらず、差分アルゴリズムはノイズに対する感受性に致命的な弱点を持ち、信号処理を含む様々な分野において長年の課題となっている。差は離散領域における微分の拡張あるいは一般化である。しかし、離散計算における有限区間のため、dy と dx がともに無限小(ライプニッツ)、dx の極限が 0(コーシー)であるような微分の最も基本的な定義を満たすことに失敗する。この点において、差分の一般化は成立しない。この問題に対処するため、元の微分アプローチから離れ、有限区間に基づく微分を構築し、さらに一般化して畳み込みによる差を求める。この理論に基づき、実用的信号処理に適した様々な差分演算子を提案する。実験の結果、これらの差分演算子は高ノイズ免疫を含む例外的な信号処理能力を有することがわかった。 Modern numerical analysis is executed on discrete data, of which numerical difference computation is one of the cores and is indispensable. Nevertheless, difference algorithms have a critical weakness in their sensitivity to noise, which has long posed a challenge in various fields including signal processing. Difference is an extension or generalization of differential in the discrete domain. However, due to the finite interval in discrete calculation, there is a failure in meeting the most fundamental definition of differential, where dy and dx are both infinitesimal (Leibniz) or the limit of dx is 0 (Cauchy). In this regard, the generalization of differential to difference does not hold. To address this issue, we depart from the original derivative approach, construct a finite interval-based differential, and further generalize it to obtain the difference by convolution. Based on this theory, we present a variety of difference operators suitable for practical signal processing. Experimental results demonstrate that these difference operators possess exceptional signal processing capabilities, including high noise immunity.	翻訳日:2023-05-16 17:34:12 公開日:2023-05-14
# 神経機械翻訳における知識蒸留の理解と改善に向けて Towards Understanding and Improving Knowledge Distillation for Neural Machine Translation ( http://arxiv.org/abs/2305.08096v1 ) ライセンス: Link先を確認	Songming Zhang, Yunlong Liang, Shuaibo Wang, Wenjuan Han, Jian Liu, Jinan Xu and Yufeng Chen	(参考訳) 知識蒸留(KD)はニューラルマシン翻訳におけるモデル圧縮の有望な技術である。しかし、kdに知識が隠されている場所はまだ明確ではなく、kdの開発を妨げる可能性がある。本研究では、まずこの謎を経験的観点から解き出し、その知識が教師のトップ1の予測から得られることを示し、また、単語とシーケンスレベルのKDの間の潜在的なつながりを構築するのにも役立ちます。さらに,この発見に基づいて,バニラ語レベルのkdに固有の2つの問題点を指摘する。第一に、kdの現在の目的は、知識を学ぶために全分布にその焦点を広げるが、最も重要なtop-1情報に対する特別な扱いを欠いている。第二に、この知識は、教師のtop-1予測のほとんどが、kdの可能性をさらに制限する地上トークンと重なり合うという事実から、金の情報によっておおむねカバーされている。これらの問題に対処するために、新しい方法である \textbf{T}op-1 \textbf{I}nformation \textbf{E}nhanced \textbf{K}nowledge \textbf{D}istillation (TIE-KD)を提案する。具体的には,教師からのtop-1情報の学習を強制するために,階層的ランキングロスを設計する。さらに, 地中標的を使わずにデータを蒸留し, さらなる知識を注入する反復的kd手法を開発した。 WMT'14英語-ドイツ語、WMT'14英語-フランス語、WMT'16英語-ルーマニア語の実験では、我々の手法がTransformer$_{base}$ studentsを+1.04, +0.60, +1.11BLEUスコアで向上させ、バニラ語レベルのKDベースラインを著しく上回ることを示した。さらに,本手法は,既存のKD手法よりも,教師と生徒の容量ギャップの多様さに対して高い一般化性を示す。 Knowledge distillation (KD) is a promising technique for model compression in neural machine translation. However, where the knowledge hides in KD is still not clear, which may hinder the development of KD. In this work, we first unravel this mystery from an empirical perspective and show that the knowledge comes from the top-1 predictions of teachers, which also helps us build a potential connection between word- and sequence-level KD. Further, we point out two inherent issues in vanilla word-level KD based on this finding. Firstly, the current objective of KD spreads its focus to whole distributions to learn the knowledge, yet lacks special treatment on the most crucial top-1 information. Secondly, the knowledge is largely covered by the golden information due to the fact that most top-1 predictions of teachers overlap with ground-truth tokens, which further restricts the potential of KD. 
To address these issues, we propose a novel method named \textbf{T}op-1 \textbf{I}nformation \textbf{E}nhanced \textbf{K}nowledge \textbf{D}istillation (TIE-KD). Specifically, we design a hierarchical ranking loss to enforce the learning of the top-1 information from the teacher. Additionally, we develop an iterative KD procedure to infuse more additional knowledge by distilling on the data without ground-truth targets. Experiments on WMT'14 English-German, WMT'14 English-French and WMT'16 English-Romanian demonstrate that our method can respectively boost Transformer$_{base}$ students by +1.04, +0.60 and +1.11 BLEU scores and significantly outperform the vanilla word-level KD baseline. Besides, our method shows higher generalizability on different teacher-student capacity gaps than existing KD techniques.	翻訳日:2023-05-16 17:33:55 公開日:2023-05-14
# Enhancing strength and range of atom-atom interaction in a coupled-cavity array via parametric drives ( http://arxiv.org/abs/2305.08127v1 ) License: see linked page	Ya-long Ren, Sheng-li Ma, Stefano Zippilli, David Vitali and Fu-li Li	Coherent long-range interactions between atoms are a prerequisite for numerous applications in the field of quantum information science, but they usually decrease exponentially with increasing atomic separation. Here we present an appealing method to dramatically enhance the long-range atom-atom interaction mediated by a coupled-cavity array that is subjected to two-photon (parametric) drives. Our method allows one to greatly amplify both the localization length of the single-photon bound-state wavefunction and the effective atom-photon coupling strength, resulting in a significant improvement of photon-mediated coherent interaction between two distant atoms. Additionally, we illustrate this effect by analyzing how it facilitates the transfer of information and the creation of entanglement between the atoms.	Translated: 2023-05-16 17:26:17 Published: 2023-05-14
# Semantic Communication of Learnable Concepts ( http://arxiv.org/abs/2305.08126v1 ) License: see linked page	Francesco Pase, Szymon Kobus, Deniz Gunduz, Michele Zorzi	We consider the problem of communicating a sequence of concepts, i.e., unknown and potentially stochastic maps, which can be observed only through examples, i.e., the mapping rules are unknown. The transmitter applies a learning algorithm to the available examples, and extracts knowledge from the data by optimizing a probability distribution over a set of models, i.e., known functions, which can better describe the observed data, and so potentially the underlying concepts. The transmitter then needs to communicate the learned models to a remote receiver through a rate-limited channel, to allow the receiver to decode the models that can describe the underlying sampled concepts as accurately as possible in their semantic space. After motivating our analysis, we propose the formal problem of communicating concepts, and provide its rate-distortion characterization, pointing out its connection with the concepts of empirical and strong coordination in a network. We also provide a bound for the distortion-rate function.	Translated: 2023-05-16 17:26:03 Published: 2023-05-14
# Theta sequences as eligibility traces: a biological solution to credit assignment ( http://arxiv.org/abs/2305.08124v1 ) License: see linked page	Tom M George	Credit assignment problems, for example policy evaluation in RL, often require bootstrapping prediction errors through preceding states or maintaining temporally extended memory traces; solutions which are unfavourable or implausible for biological networks of neurons. We propose theta sequences -- chains of neural activity during theta oscillations in the hippocampus, thought to represent rapid playthroughs of awake behaviour -- as a solution. By analysing and simulating a model for theta sequences we show they compress behaviour such that existing but short $\mathsf{O}(10)$ ms neuronal memory traces are effectively extended allowing for bootstrap-free credit assignment without long memory traces, equivalent to the use of eligibility traces in TD($\lambda$).	Translated: 2023-05-16 17:25:47 Published: 2023-05-14
# Quantum Scar States in Coupled Random-Graph Models ( http://arxiv.org/abs/2305.08123v1 ) License: see linked page	Bhilahari Jeevanesan	We analyze the Hilbert space connectivity of the $L$-site PXP model by constructing the Hamiltonian matrices via a Gray code numbering of basis states. Once constructed, the matrices reveal a simple structure: they are all formed out of a single Hamiltonian-path backbone and side-connections. The PXP model is known for the presence of scar states in the middle of the spectrum that have area-law entanglement. The understanding that we develop of the PXP model's adjacency graph equips us with a general instruction on how to construct a class of Hamiltonians with tunable constraint degree and variable network topology. We explore a version of this model where the network topology is constructed around a random-graph model. We find two classes of weakly-entangled mid-spectrum eigenstates. The first class are scars that are near-product eigenstates of the subsystems, while the second class has $\log 2$ entanglement entropy and is tied to the occurrence of special types of subgraphs. The latter states have some resemblance to the Lin-Motrunich $\sqrt{2}$-scars.	Translated: 2023-05-16 17:25:27 Published: 2023-05-14
# Unraveling Cold Start Enigmas in Predictive Analytics for OTT Media: Synergistic Meta-Insights and Multimodal Ensemble Mastery ( http://arxiv.org/abs/2305.08120v1 ) License: see linked page	K. Ganguly, A. Patra	The cold start problem is a common challenge in various domains, including media use cases such as predicting viewership for newly launched shows on Over-The-Top (OTT) platforms. In this study, we propose a generic approach to tackle cold start problems by leveraging metadata and employing multi-model ensemble techniques. Our methodology includes feature engineering, model selection, and an ensemble approach based on a weighted average of predictions. The performance of our proposed method is evaluated using various performance metrics. Our results indicate that the multi-model ensemble approach significantly improves prediction accuracy compared to individual models.	Translated: 2023-05-16 17:25:08 Published: 2023-05-14
# Cavity magnonics with easy-axis ferromagnet: critically enhanced magnon squeezing and light-matter interaction ( http://arxiv.org/abs/2305.08119v1 ) License: see linked page	Jongjun M. Lee, Hyun-Woo Lee, Myung-Joong Hwang	Generating and probing magnon squeezing is an important challenge in the field of quantum magnonics. In this work, we propose a cavity magnonics setup with an easy-axis ferromagnet to address this challenge. To this end, we first establish a mechanism for the generation of magnon squeezing in the easy-axis ferromagnet and show that the magnon squeezing can be critically enhanced by tuning an external magnetic field near the Ising phase transition point. When the magnet is coupled to the cavity field, the effective cavity-magnon interaction becomes proportional to the magnon squeezing, allowing one to enhance the cavity-magnon coupling strength using a static field. We demonstrate that the magnon squeezing can be probed by measuring the frequency shift of the cavity field. Moreover, a magnonic superradiant phase transition can be observed in our setup by tuning the static magnetic field, overcoming the challenge that the magnetic interaction between the cavity and the magnet is typically too weak to drive the superradiant transition. Our work paves the way to develop unique capabilities of cavity magnonics that go beyond conventional cavity QED physics by harnessing the intrinsic property of a magnet.	Translated: 2023-05-16 17:24:55 Published: 2023-05-14
# MultiQuant: A Novel Multi-Branch Topology Method for Arbitrary Bit-width Network Quantization ( http://arxiv.org/abs/2305.08117v1 ) License: see linked page	Yunshan Zhong, Mingbao Lin, Yuyao Zhou, Mengzhao Chen, Yuxin Zhang, Fei Chao, Rongrong Ji	Arbitrary bit-width network quantization has received significant attention due to its high adaptability to various bit-width requirements during runtime. However, in this paper, we investigate existing methods and observe a significant accumulation of quantization errors caused by frequent bit-width switching of weights and activations, leading to limited performance. To address this issue, we propose MultiQuant, a novel method that utilizes a multi-branch topology for arbitrary bit-width quantization. MultiQuant duplicates the network body into multiple independent branches and quantizes the weights of each branch to a fixed 2-bit while retaining the input activations in the expected bit-width. This approach keeps the computational cost the same while avoiding the switching of weight bit-widths, thereby substantially reducing errors in weight quantization. Additionally, we introduce an amortization branch selection strategy to distribute quantization errors caused by activation bit-width switching among branches to enhance performance.
Finally, we design an in-place distillation strategy that facilitates guidance between branches to further enhance MultiQuant's performance. Extensive experiments demonstrate that MultiQuant achieves significant performance gains compared to existing arbitrary bit-width quantization methods. Code is at https://github.com/zysxmu/MultiQuant.	Translated: 2023-05-16 17:24:30 Published: 2023-05-14
# The Structure and Dynamics of Knowledge Graphs, with Superficiality ( http://arxiv.org/abs/2305.08116v1 ) License: see linked page	Loïck Lhote, Béatrice Markhoff, Arnaud Soulet	Large knowledge graphs combine human knowledge garnered from projects ranging from academia and institutions to enterprises and crowdsourcing. Within such graphs, each relationship between two nodes represents a basic fact involving these two entities. The diversity of the semantics of relationships constitutes the richness of knowledge graphs, leading to the emergence of singular topologies, sometimes chaotic in appearance. However, this complex characteristic can be modeled in a simple way by introducing the concept of superficiality, which controls the overlap between relationships whose facts are generated independently. Superficiality also regulates the balance of the global distribution of knowledge by determining the proportion of misdescribed entities. This is the first model for the structure and dynamics of knowledge graphs. It leads to a better understanding of formal knowledge acquisition and organization.	Translated: 2023-05-16 17:24:06 Published: 2023-05-14
# Automatic Generation of Attention Rules For Containment of Machine Learning Model Errors ( http://arxiv.org/abs/2305.08115v1 ) License: see linked page	Samuel Ackerman, Axel Bendavid, Eitan Farchi, Orna Raz	Machine learning (ML) solutions are prevalent in many applications. However, many challenges exist in making these solutions business-grade. One example is maintaining the error rate of the underlying ML models at an acceptably low level. Typically, the true relationship between feature inputs and the target feature to be predicted is uncertain, and hence statistical in nature. The approach we propose is to separate the observations that are the most likely to be predicted incorrectly into 'attention sets'. These can directly aid model diagnosis and improvement, and be used to decide on alternative courses of action for these problematic observations. We present several algorithms ('strategies') for determining optimal rules to separate these observations. In particular, we prefer strategies that use feature-based slicing because they are human-interpretable, model-agnostic, and require minimal supplementary inputs or knowledge. In addition, we show that these strategies outperform several common baselines, such as selecting observations with prediction confidence below a threshold.
To evaluate strategies, we introduce metrics to measure various desired qualities, such as their performance, stability, and generalizability to unseen data; the strategies are evaluated on several publicly-available datasets. We use TOPSIS, a Multiple Criteria Decision Making method, to aggregate these metrics into a single quality score for each strategy, to allow comparison.	Translated: 2023-05-16 17:23:56 Published: 2023-05-14
# Quantum Operation of Affective Artificial Intelligence ( http://arxiv.org/abs/2305.08112v1 ) License: see linked page	V.I. Yukalov	The review analyzes the fundamental principles on which Artificial Intelligence should be based in order to imitate the realistic process of decision making by humans experiencing emotions. Two approaches are compared, one based on quantum theory and the other employing classical terms. Both these approaches have a number of similarities, being principally probabilistic. The analogies between quantum measurements under intrinsic noise and affective decision making are elucidated. It is shown that cognitive processes have many features that are formally similar to quantum measurements. This, however, in no way means that for the imitation of human decision making Affective Artificial Intelligence has necessarily to rely on the functioning of quantum systems. Appreciating the common features between quantum measurements and decision making helps in formulating an axiomatic approach employing only classical notions. Artificial Intelligence, following this approach, operates similarly to humans, by taking into account the utility of the considered alternatives as well as their emotional attractiveness.
Affective Artificial Intelligence, whose operation takes account of the cognition-emotion duality, avoids numerous behavioural paradoxes of traditional decision making. A society of intelligent agents, interacting through the repeated multistep exchange of information, forms a network accomplishing dynamic decision making. The considered intelligent networks can characterize the operation of either a human society of affective decision makers, or the brain composed of neurons, or a typical probabilistic network of an artificial intelligence.	Translated: 2023-05-16 17:23:34 Published: 2023-05-14
# Long-distance continuous-variable quantum key distribution over 100 km fiber with local local oscillator ( http://arxiv.org/abs/2305.08156v1 ) License: see linked page	Adnan A.E. Hajomer, Ivan Derkach, Nitin Jain, Hou-Man Chin, Ulrik L. Andersen and Tobias Gehring	Quantum key distribution (QKD) enables two remote parties to share encryption keys with security based on the laws of physics. Continuous variable (CV) QKD with coherent states and coherent detection integrates well with existing telecommunication networks. However, thus far, long-distance CV-QKD has only been demonstrated using a highly complex scheme where the local oscillator is transmitted, opening security loopholes for eavesdroppers and limiting its potential applications. Here, we report a long-distance CV-QKD experiment with a locally generated local oscillator over a 100 km fiber channel with a total loss of 15.4 dB. This record-breaking distance is achieved by controlling the phase-noise-induced excess noise through a machine-learning framework for carrier recovery and optimizing the modulation variance. We implement the full CV-QKD protocol and demonstrate the generation of keys secure against collective attacks in the finite-size regime. Our results mark a significant milestone for realizing CV quantum access networks with a high loss budget, and pave the way for large-scale deployment of secure QKD.	Translated: 2023-05-16 17:17:04 Published: 2023-05-14
# STORYWARS: A Dataset and Instruction Tuning Baselines for Collaborative Story Understanding and Generation ( http://arxiv.org/abs/2305.08152v1 ) License: see linked page	Yulun Du and Lydia Chilton	Collaborative stories, which are texts created through the collaborative efforts of multiple authors with different writing styles and intentions, pose unique challenges for NLP models. Understanding and generating such stories remains an underexplored area due to the lack of open-domain corpora. To address this, we introduce STORYWARS, a new dataset of over 40,000 collaborative stories written by 9,400 different authors from an online platform. We design 12 task types, comprising 7 understanding and 5 generation task types, on STORYWARS, deriving 101 diverse story-related tasks in total as a multi-task benchmark covering all fully-supervised, few-shot, and zero-shot scenarios. Furthermore, we present our instruction-tuned model, INSTRUCTSTORY, for the story tasks, showing that instruction tuning, in addition to achieving superior results in zero-shot and few-shot scenarios, can also obtain the best performance on the fully-supervised tasks in STORYWARS, establishing strong multi-task benchmark performances on STORYWARS.	Translated: 2023-05-16 17:16:46 Published: 2023-05-14
# A multipoint perturbation formula for eigenvalue problems ( http://arxiv.org/abs/2305.08151v1 ) License: see linked page	Geneviève Dusson, Louis Garrigue, Benjamin Stamm	Standard perturbation theory of eigenvalue problems consists of obtaining approximations of eigenmodes in the neighborhood of a Hamiltonian where the corresponding eigenmode is known. Nevertheless, if the corresponding eigenmodes of several nearby Hamiltonians are known, standard perturbation theory cannot simultaneously use all this knowledge to provide a better approximation. We derive a formula enabling such an approximation result, and provide numerical examples for which this method is more competitive than standard perturbation theory.	Translated: 2023-05-16 17:16:25 Published: 2023-05-14
# The effect of thermal photons on exceptional points in coupled resonators ( http://arxiv.org/abs/2305.08150v1 ) License: see linked page	Grzegorz Chimczak and Anna Kowalewska-Kudłaszyk and Ewelina Lange and Karol Bartkiewicz and Jan Peřina Jr	We analyse two quantum systems with hidden parity-time (PT) symmetry: one is an optical device, whereas the other is a superconducting microwave-frequency device. To investigate their symmetry, we introduce an equilibrium frame, in which loss and gain terms for a given Hamiltonian are balanced. We show that the non-Hermitian Hamiltonians of both systems can be tuned to reach an exceptional point (EP), i.e., the point in parameter space at which a transition from broken to unbroken hidden PT symmetry takes place. We calculate a degeneracy of a Liouvillian superoperator, which is called the Liouvillian exceptional point (LEP), and show that, in the optical domain, LEP is equivalent to EP obtained from the non-Hermitian Hamiltonian (HEP). We also report breaking the equivalence between LEP and HEP by a non-zero number of thermal photons for the microwave-frequency system.	Translated: 2023-05-16 17:16:17 Published: 2023-05-14
# Side-channel-secure quantum key distribution ( http://arxiv.org/abs/2305.08148v1 ) License: see linked page	Cong Jiang and Xiao-Long Hu and Zong-Wen Yu and Xiang-Bin Wang	We present a result of side-channel-secure (SCS) quantum key distribution (QKD) under fully realistic conditions. Our result is not only measurement-device independent but also effective with imperfect (and unstable) source devices including imperfect vacuum and imperfect coherent-state source. Applying the virtual mapping idea, we present a general security proof against any outside-lab attack, including any side-channel coherent attack. As a by-product, we also present an improved method for SCS protocols which can raise the key rate by 1-2 orders of magnitude. Using these results, we obtain a non-asymptotic key rate which is immediately useful under fully realistic conditions.	Translated: 2023-05-16 17:15:58 Published: 2023-05-14
# ParaLS: Lexical Substitution via Pretrained Paraphraser ( http://arxiv.org/abs/2305.08146v1 ) License: see linked page	Jipeng Qiang, Kang Liu, Yun Li, Yunhao Yuan, Yi Zhu	Lexical substitution (LS) aims at finding appropriate substitutes for a target word in a sentence. Recently, LS methods based on pretrained language models have made remarkable progress, generating potential substitutes for a target word through analysis of its contextual surroundings. However, these methods tend to overlook the preservation of the sentence's meaning when generating the substitutes. This study explores how to generate the substitute candidates from a paraphraser, as the paraphrases it generates contain variations in word choice and preserve the sentence's meaning. Since we cannot directly generate the substitutes via commonly used decoding strategies, we propose two simple decoding strategies that focus on the variations of the target word during decoding. Experimental results show that our methods outperform state-of-the-art LS methods based on pre-trained language models on three benchmarks.	Translated: 2023-05-16 17:15:47 Published: 2023-05-14
# Mobile-Env: A Universal Platform for Training and Evaluation of Mobile Interaction ( http://arxiv.org/abs/2305.08144v1 ) License: see linked page	Danyang Zhang, Lu Chen, Kai Yu	The interaction platform plays a crucial role in the recent advancement of control and decision domains like game playing and embodied intelligence. However, there is still a lack of a satisfactory platform for information user interface (InfoUI) interaction. The proposed InfoUI comprises not only plain text information, but also multimodal contents and a few spatial structures with styles. To help the research of InfoUI interaction, a novel platform, Mobile-Env, is presented in this paper. The Mobile-Env platform is designed to be flexible, adaptable, and easily extended. Based on Mobile-Env, an InfoUI task set is then built for demonstration and evaluation. An agent based on a large-scale language model (LLM) is tested on the task set. The experiment results demonstrate the great potential of the LLM to do text understanding and matching and, meanwhile, reveal the necessity of a better mechanism of interaction feedback and exploration. Several new discussions are conducted as well. A demo video is available at https://youtu.be/gKV6KZYwxGY.
The code repository is available at https://github.com/X-LANCE/Mobile-Env. The proposed WikiHow task set is made public at https://huggingface.co/datasets/zdy023/WikiHow-taskset.	Translated: 2023-05-16 17:15:33 Published: 2023-05-14
# Predicting Unplanned Readmissions in the Intensive Care Unit: A Multimodality Evaluation ( http://arxiv.org/abs/2305.08139v1 ) License: see linked page	Eitam Sheetrit, Menachem Brief, Oren Elisha	A hospital readmission is when a patient who was discharged from the hospital is admitted again for the same or related care within a certain period. Hospital readmissions are a significant problem in the healthcare domain, as they lead to increased hospitalization costs, decreased patient satisfaction, and increased risk of adverse outcomes such as infections, medication errors, and even death. The problem of hospital readmissions is particularly acute in intensive care units (ICUs), due to the severity of the patients' conditions, and the substantial risk of complications. Predicting unplanned readmissions in ICUs is a challenging task, as it involves analyzing different data modalities, such as static data, unstructured free text, sequences of diagnoses and procedures, and multivariate time-series. Here, we investigate the effectiveness of each data modality separately, then alongside the others, using state-of-the-art machine learning approaches in time-series analysis and natural language processing.
Using our evaluation process, we are able to determine the contribution of each data modality, and for the first time in the context of readmission, establish a hierarchy of their predictive value. Additionally, we demonstrate the impact of Temporal Abstractions in enhancing the performance of time-series approaches to readmission prediction. Due to conflicting definitions in the literature, we also provide a clear definition of the term Unplanned Readmission to enhance reproducibility and consistency of future research and to prevent any potential misunderstandings that could result from diverse interpretations of the term. Our experimental results on a large benchmark clinical data set show that Discharge Notes written by physicians have better capabilities for readmission prediction than all other modalities.	Translated: 2023-05-16 17:15:16 Published: 2023-05-14
# Distinguish Before Answer: Generating Contrastive Explanation as Knowledge for Commonsense Question Answering ( http://arxiv.org/abs/2305.08135v1 ) License: see linked page	Qianglong Chen, Guohai Xu, Ming Yan, Ji Zhang, Fei Huang, Luo Si and Yin Zhang	Existing knowledge-enhanced methods have achieved remarkable results in certain QA tasks via obtaining diverse knowledge from different knowledge bases. However, limited by the properties of retrieved knowledge, they still have trouble benefiting from both the knowledge relevance and distinguishment simultaneously. To address the challenge, we propose CPACE, a Concept-centric Prompt-bAsed Contrastive Explanation Generation model, which aims to convert obtained symbolic knowledge into a contrastive explanation for better distinguishing the differences among given candidates. Firstly, following previous works, we retrieve different types of symbolic knowledge with a concept-centric knowledge extraction module. After that, we generate corresponding contrastive explanations using acquired symbolic knowledge and explanation prompts as guidance for better modeling the knowledge distinguishment and interpretability. Finally, we regard the generated contrastive explanation as external knowledge for downstream task enhancement. We conduct a series of experiments on three widely-used question-answering datasets: CSQA, QASC, and OBQA.
Experimental results demonstrate that with the help of generated contrastive explanation, our CPACE model achieves new SOTA on CSQA (89.8% on the testing set, 0.9% higher than human performance), and gains impressive improvement on QASC and OBQA (4.2% and 3.5%, respectively).	Translated: 2023-05-16 17:14:48 Published: 2023-05-14
# 制約回復による逆強化学習 Inverse Reinforcement Learning With Constraint Recovery ( http://arxiv.org/abs/2305.08130v1 ) ライセンス: Link先を確認	Nirjhar Das and Arpan Chattopadhyay	(参考訳) 本研究では,制約付きマルコフ決定過程(CMDP)問題に対する新しい逆強化学習(IRL)アルゴリズムを提案する。標準IRL問題において、逆学習者またはエージェントは、最適ポリシーに対する一連の軌道実証から、MDPの報酬関数を回復しようとする。本研究では,cmdpの報酬関数だけでなく,制約についても推測する。最大エントロピーの原理を用いて、制約回復(irl-cr)問題を持つirlを制約付き非凸最適化問題としてキャストできることを示す。サブプロブレムが凸である交互に制約された最適化問題に還元する。我々はそれを解決するために指数勾配降下アルゴリズムを用いる。最後に,グリッド環境におけるアルゴリズムの有効性を示す。 In this work, we propose a novel inverse reinforcement learning (IRL) algorithm for constrained Markov decision process (CMDP) problems. In standard IRL problems, the inverse learner or agent seeks to recover the reward function of the MDP, given a set of trajectory demonstrations for the optimal policy. In this work, we seek to infer not only the reward functions of the CMDP, but also the constraints. Using the principle of maximum entropy, we show that the IRL with constraint recovery (IRL-CR) problem can be cast as a constrained non-convex optimization problem. We reduce it to an alternating constrained optimization problem whose sub-problems are convex. We use an exponentiated gradient descent algorithm to solve it. Finally, we demonstrate the efficacy of our algorithm for the grid world environment.	翻訳日:2023-05-16 17:14:26 公開日:2023-05-14
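The abstract above names exponentiated gradient descent as the solver for the convex sub-problems. The following is a minimal sketch of a generic exponentiated-gradient update on the probability simplex, not the authors' IRL-CR algorithm; the function names, the linear toy objective, and the step size are all assumptions chosen for illustration.

```python
import math


def exponentiated_gradient_step(w, grad, lr):
    """One exponentiated-gradient update.

    Each weight is multiplied by exp(-lr * gradient) and the result is
    renormalised, so w stays non-negative and sums to 1.
    """
    unnormalised = [wi * math.exp(-lr * gi) for wi, gi in zip(w, grad)]
    total = sum(unnormalised)
    return [u / total for u in unnormalised]


def minimise_linear_cost(costs, steps=200, lr=0.5):
    """Toy problem (hypothetical): minimise <w, costs> over the simplex.

    The gradient of a linear objective is the cost vector itself, so the
    weights concentrate on the cheapest coordinate.
    """
    n = len(costs)
    w = [1.0 / n] * n  # start from the uniform distribution
    for _ in range(steps):
        w = exponentiated_gradient_step(w, costs, lr)
    return w


if __name__ == "__main__":
    print(minimise_linear_cost([3.0, 1.0, 2.0]))
```

The multiplicative form is what distinguishes this from ordinary projected gradient descent: no explicit projection is needed to keep the iterate a valid distribution.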
# エンドツーエンド学習はフィットネスアクティビティ認識に十分か? Is end-to-end learning enough for fitness activity recognition? ( http://arxiv.org/abs/2305.08191v1 ) ライセンス: Link先を確認	Antoine Mercier and Guillaume Berger and Sunny Panchal and Florian Letsch and Cornelius Boehm and Nahua Kang and Ingo Bax and Roland Memisevic	(参考訳) エンド・ツー・エンド・ラーニングは、特に静止画像に関連する多くのコンピュータビジョンタスクをホールドしており、タスク固有の最適化は非常に高いパフォーマンスをもたらす。それでも、人間中心のアクション認識は依然として手作りのパイプラインで占められており、個々のコンポーネントだけが、通常個々のフレームで動作するニューラルネットワークに置き換えられている。このようなパイプラインの関連性を調べるためのテストベッドとして,フィットネス活動の完全注釈付きビデオデータセットを提案する。この領域の認識能力は、基本的に人間のポーズとその時間的ダイナミクスの関数であるので、ポーズベースのソリューションはうまく機能すべきである。このラベル付きデータにより、原画素でのエンドツーエンド学習が、ポーズ推定に基づく最先端のアクション認識パイプラインと競合することを示す。また、エンド・ツー・エンドの学習は、リアルタイム反復数などの時間的にきめ細かなタスクを支援できることを示す。 End-to-end learning has taken hold of many computer vision tasks, in particular, related to still images, with task-specific optimization yielding very strong performance. Nevertheless, human-centric action recognition is still largely dominated by hand-crafted pipelines, and only individual components are replaced by neural networks that typically operate on individual frames. As a testbed to study the relevance of such pipelines, we present a new fully annotated video dataset of fitness activities. Any recognition capabilities in this domain are almost exclusively a function of human poses and their temporal dynamics, so pose-based solutions should perform well. We show that, with this labelled data, end-to-end learning on raw pixels can compete with state-of-the-art action recognition pipelines based on pose estimation. We also show that end-to-end learning can support temporally fine-grained tasks such as real-time repetition counting.	翻訳日:2023-05-16 17:06:42 公開日:2023-05-14
# tsgn: 多エージェント動作予測のための投影ベクトル表現付きテンポラルシーングラフニューラルネットワーク TSGN: Temporal Scene Graph Neural Networks with Projected Vectorized Representation for Multi-Agent Motion Prediction ( http://arxiv.org/abs/2305.08190v1 ) ライセンス: Link先を確認	Yunong Wu, Thomas Gilles, Bogdan Stanciulescu, Fabien Moutarde	(参考訳) 近くのエージェントの将来の動きを予測することは、自動運転車が安全かつ効果的な行動を取るために不可欠である。本稿では,マルチエージェント軌道予測のためのベクトル表現を投影したテンポラルシーングラフニューラルネットワークを用いたフレームワークTSGNを提案する。投影ベクトル化表現は、トラフィックシーンをベクトルの集合によって構築されたグラフとしてモデル化する。これらのベクトルはエージェント、道路ネットワーク、およびそれらの空間的相対関係を表す。この表現のすべての相対的特徴は、変換と回転不変である。この表現に基づいて、TSGNはエージェント、道路ネットワーク、それら間の相互作用、時間的トラフィックシーンの時間的依存関係をキャプチャする。 TSGNは、全てのエージェントに対するマルチモーダルな将来の軌跡を、妥当かつ正確に同時に予測することができる。一方,エージェントと道路ネットワーク間の相互作用を捕捉する階層型レーントランスを提案する。これは周囲の道路ネットワークをフィルタし,対象エージェントの将来の行動に影響を与える可能性のある最も確率の高いレーンセグメントのみを保持する。予測性能を犠牲にすることなく、計算負担を大幅に削減する。実験により、TSGNはArgoverse運動予測ベンチマークで最先端のパフォーマンスを達成することが示された。 Predicting future motions of nearby agents is essential for an autonomous vehicle to take safe and effective actions. In this paper, we propose TSGN, a framework using Temporal Scene Graph Neural Networks with projected vectorized representations for multi-agent trajectory prediction. Projected vectorized representation models the traffic scene as a graph which is constructed by a set of vectors. These vectors represent agents, road network, and their spatial relative relationships. All relative features under this representation are both translation- and rotation-invariant. Based on this representation, TSGN captures the spatial-temporal features across agents, road network, interactions among them, and temporal dependencies of temporal traffic scenes. TSGN can predict multimodal future trajectories for all agents simultaneously, plausibly, and accurately. 
Meanwhile, we propose a Hierarchical Lane Transformer for capturing interactions between agents and road network, which filters the surrounding road network and only keeps the most probable lane segments which could have an impact on the future behavior of the target agent. Without sacrificing the prediction performance, this greatly reduces the computational burden. Experiments show TSGN achieves state-of-the-art performance on the Argoverse motion forecasting benchmark.	翻訳日:2023-05-16 17:06:30 公開日:2023-05-14
# crosentinews 2.0: 文レベルのニュース感情コーパス CroSentiNews 2.0: A Sentence-Level News Sentiment Corpus ( http://arxiv.org/abs/2305.08187v1 ) ライセンス: Link先を確認	Gaurish Thakkar, Nives Mikelic Preradović, Marko Tadić	(参考訳) 本稿ではクロアチアのニュースドメインの文レベルの感情データセットについて述べる。すでに存在する3Kアノテートテキストに加えて、5つのクラスでタグ付けされた14.5Kアノテート文がデータセットに含まれる。アノテーションプロセスとアノテーション間の合意に加えて,ベースラインスコアを提供する。 This article presents a sentence-level sentiment dataset for the Croatian news domain. In addition to the 3K annotated texts already present, our dataset contains 14.5K annotated sentence occurrences that have been tagged with 5 classes. We provide baseline scores in addition to the annotation process and inter-annotator agreement.	翻訳日:2023-05-16 17:06:14 公開日:2023-05-14
# モード切替点最適化を考慮した空中ロボットの経路計画 Path Planning for Air-Ground Robot Considering Modal Switching Point Optimization ( http://arxiv.org/abs/2305.08178v1 ) ライセンス: Link先を確認	Xiaoyu Wang and Kangyao Huang and Xinyu Zhang and Honglin Sun and Wenzhuo Liu and Huaping Liu and Jun Li and Pingping Lu	(参考訳) 運転も飛行もできる革新的なモビリティプラットフォームは、空飛ぶロボットだ。アジャイル飛行の必要性は、空中ロボットの伝統的な経路計画技術によって満足できない。以前の研究は、主に経路のエネルギー効率の向上、探索速度の低下、離着陸地点の最適化に重点を置いていた。フィールドアプリケーション環境のためのロボットを提案し, エネルギー効率, 探索速度, 実際の展開可能性に着目し, モード切替点最適化を考慮したグラフ探索アルゴリズムに基づく, 軽量なグローバル空間計画手法を提案する。基本的な概念は、平面探索と空間探索を組み合わせた交換可能な探索アプローチを採用することで計算量を減らすことである。さらに、電池の健全性とミッション実行の完全性を保護するため、トラップエスケープアプローチも提供された。シミュレーションは、フィールドdemマップに基づいた提案モデルの有効性をテストするために実行される。シミュレーションの結果,我々の技術は,高い信頼度で完成可能な3dパスを生成できることがわかった。さらに、モード切換点最適化法は、モード切換に許容される追加の場所を効率よく同定し、改良されたパスは時間とエネルギーを少なくする。 An innovative sort of mobility platform that can both drive and fly is the air-ground robot. The need for an agile flight cannot be satisfied by traditional path planning techniques for air-ground robots. Prior studies had mostly focused on improving the energy efficiency of paths, seldom taking the seeking speed and optimizing take-off and landing places into account. A robot for the field application environment was proposed, and a lightweight global spatial planning technique for the robot based on the graph-search algorithm taking mode switching point optimization into account, with an emphasis on energy efficiency, searching speed, and the viability of real deployment. The fundamental concept is to lower the computational burden by employing an interchangeable search approach that combines planar and spatial search. Furthermore, to safeguard the health of the power battery and the integrity of the mission execution, a trap escape approach was also provided. Simulations are run to test the effectiveness of the suggested model based on the field DEM map. 
The simulation results show that our technology is capable of producing finished, plausible 3D paths with a high degree of believability. Additionally, the mode-switching point optimization method efficiently identifies additional acceptable places for mode switching, and the improved paths use less time and energy.	翻訳日:2023-05-16 17:06:10 公開日:2023-05-14
# 凸損失関数下での雑音場に対する最適かつスケーラブルな行列機構 An Optimal and Scalable Matrix Mechanism for Noisy Marginals under Convex Loss Functions ( http://arxiv.org/abs/2305.08175v1 ) ライセンス: Link先を確認	Yingtai Xiao, Guanlin He, Danfeng Zhang, Daniel Kifer	(参考訳) ノイズ境界は機密性保護データリリースの一般的な形態であり、並行性テーブル解析、ベイズネットワークの構築、合成データ生成など多くの下流タスクに有用である。線形クエリ(例えば境界)に対するバイアスのないノイズ応答を提供するプライバシメカニズムは、行列メカニズムとして知られている。そこで本研究では,gaussian noiseを伴う辺縁系の行列機構であるResidualPlannerを提案する。 ResidualPlannerは、余分な分散の凸関数として記述できる多くの損失関数に対して最適化できる(事前の作業は1つの事前定義された目的関数に制限される)。 ResidualPlannerは、前回のHDMM(HDMM)がメモリが切れた場合でも、大規模な設定でマーサルの精度を数秒で最適化できる。数分で100の属性を持つデータセット上でも動作する。さらにResidualPlannerは、各辺の分散/共分散値を効率的に計算できる(比較的小さなデータセットであっても、適切なメソッドはすぐにメモリが切れる)。 Noisy marginals are a common form of confidentiality-protecting data release and are useful for many downstream tasks such as contingency table analysis, construction of Bayesian networks, and even synthetic data generation. Privacy mechanisms that provide unbiased noisy answers to linear queries (such as marginals) are known as matrix mechanisms. We propose ResidualPlanner, a matrix mechanism for marginals with Gaussian noise that is both optimal and scalable. ResidualPlanner can optimize for many loss functions that can be written as a convex function of marginal variances (prior work was restricted to just one predefined objective function). ResidualPlanner can optimize the accuracy of marginals in large scale settings in seconds, even when the previous state of the art (HDMM) runs out of memory. It even runs on datasets with 100 attributes in a couple of minutes. Furthermore, ResidualPlanner can efficiently compute variance/covariance values for each marginal (prior methods quickly run out of memory, even for relatively small datasets).	翻訳日:2023-05-16 17:05:49 公開日:2023-05-14
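To make the term "noisy marginal" in the abstract above concrete, here is a minimal sketch of the simplest baseline: compute one marginal (a contingency table over a subset of attributes) and add independent zero-mean Gaussian noise to every cell. This is not ResidualPlanner and performs no optimization across marginals. The record layout, the function names, and the noise scale `sigma` are assumptions, and the sketch makes no claim about how `sigma` should be calibrated to a privacy budget.

```python
import random
from collections import Counter


def marginal(records, attrs):
    """Exact counts of each value combination over the chosen attributes."""
    return Counter(tuple(r[a] for a in attrs) for r in records)


def noisy_marginal(records, attrs, domain, sigma, rng):
    """Add zero-mean Gaussian noise to every cell of one marginal.

    `domain` lists every cell explicitly so that empty cells also receive
    noise. Because the noise has mean zero, each answer is unbiased.
    """
    counts = marginal(records, attrs)
    return {cell: counts.get(cell, 0) + rng.gauss(0.0, sigma) for cell in domain}


if __name__ == "__main__":
    # Hypothetical toy data: two binary attributes "a" and "b".
    data = [{"a": 0, "b": 1}, {"a": 0, "b": 0}, {"a": 1, "b": 1}]
    cells = [(0,), (1,)]
    print(noisy_marginal(data, ["a"], cells, sigma=1.0, rng=random.Random(0)))
```

Answering every marginal independently like this is the naive strategy; a matrix mechanism instead chooses which linear queries to perturb so that the marginals reconstructed from them have lower variance.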
# クロアチア映画レビューデータセット (cro-fireda: a sentiment annotated dataset of film reviews) Croatian Film Review Dataset (Cro-FiReDa): A Sentiment Annotated Dataset of Film Reviews ( http://arxiv.org/abs/2305.08173v1 ) ライセンス: Link先を確認	Gaurish Thakkar, Nives Mikelic Preradovic and Marko Tadić	(参考訳) 本稿では,映画レビュー分野におけるクロアチア人の感情アノテートデータセットであるCro-FiReDaを紹介する。 1万以上の文を含むデータセットは、文レベルで注釈付けされている。アノテーション全体のプロセスを示すことに加えて、トランスフォーマティブに基づく微調整手法に基づくベンチマーク結果も提示する。 This paper introduces Cro-FiReDa, a sentiment-annotated dataset for Croatian in the domain of movie reviews. The dataset, which contains over 10,000 sentences, has been annotated at the sentence level. In addition to presenting the overall annotation process, we also present benchmark results based on the transformer-based fine-tuning approach.	翻訳日:2023-05-16 17:05:31 公開日:2023-05-14
# 制御の劣化を学べるか? ガウス過程に基づくイベントトリガーオンライン学習における計算遅延の解析 Can Learning Deteriorate Control? Analyzing Computational Delays in Gaussian Process-Based Event-Triggered Online Learning ( http://arxiv.org/abs/2305.08169v1 ) ライセンス: Link先を確認	Xiaobing Dai, Armin Lederer, Zewen Yang, Sandra Hirche	(参考訳) システムのダイナミクスが未知である場合、教師付き機械学習技術は一般にデータからモデルを推測するために使用される。ガウス過程(GP)回帰は、予測誤差境界が存在するため、この目的のために特に一般的な学習方法である。さらに、イベントトリガー付きオンライン学習戦略を追求して、特定トラッキングアキュラシーを確保するように、gpモデルをオンラインで効率的に更新することができる。しかし、既存のトリガー条件は任意のタイミングで評価できなければならず、不要な計算時間のために実際に達成できない。そこで,まず遅延認識型トラッキングエラーバウンドを導出し,精度と遅延のトレードオフを明らかにする。この結果に基づいて,計算遅延を伴うGPベースのオンライン学習における新たなイベントトリガを提案する。最後に,シミュレーションにおけるオンライン学習におけるイベントトリガの有効性を示す。 When the dynamics of systems are unknown, supervised machine learning techniques are commonly employed to infer models from data. Gaussian process (GP) regression is a particularly popular learning method for this purpose due to the existence of prediction error bounds. Moreover, GP models can be efficiently updated online, such that event-triggered online learning strategies can be pursued to ensure specified tracking accuracies. However, existing trigger conditions must be able to be evaluated at arbitrary times, which cannot be achieved in practice due to non-negligible computation times. Therefore, we first derive a delay-aware tracking error bound, which reveals an accuracy-delay trade-off. Based on this result, we propose a novel event trigger for GP-based online learning with computational delays, which we show to offer advantages over offline trained GP models for sufficiently small computation times. Finally, we demonstrate the effectiveness of the proposed event trigger for online learning in simulations.	翻訳日:2023-05-16 17:05:27 公開日:2023-05-14
# 多視点時系列からの潜在プロセス同定 Latent Processes Identification From Multi-View Time Series ( http://arxiv.org/abs/2305.08164v1 ) ライセンス: Link先を確認	Zenan Huang, Haobo Wang, Junbo Zhao, Nenggan Zheng	(参考訳) 時系列データのダイナミクスを理解するには、典型的にはデータ生成のためのユニークな潜在因子を識別する必要がある。独立した仮定に基づいて、既存の作業はシングルビューデータの処理に大きな進歩を遂げました。しかし、大きな課題が2つあるため、それをマルチビュー時系列データに拡張する非自明な問題である。 (i) 時間依存のような複雑なデータ構造は、独立した仮定に違反する可能性がある。 (ii) 異なる視点からの因子は概して重複しており、完全な集合に集約することは困難である。本研究では,データ生成過程を逆転させて識別性を高めるために,コントラスト学習技術を用いた新しいフレームワーク MuLTI を提案する。さらに、MuLTIは最適な輸送公式を確立することで、対応する重複変数をマージする置換機構を統合する。合成および実世界のデータセットに対する大規模な実験結果から,多視点時系列上での同定可能な潜伏変数の復元において,本手法の優位性が示された。 Understanding the dynamics of time series data typically requires identifying the unique latent factors for data generation, a.k.a. latent processes identification. Driven by the independence assumption, existing works have made great progress in handling single-view data. However, extending them to multi-view time series data is non-trivial because of two main challenges: (i) the complex data structure, such as temporal dependency, can result in violation of the independence assumption; (ii) the factors from different views generally overlap and are hard to aggregate into a complete set. In this work, we propose a novel framework MuLTI that employs the contrastive learning technique to invert the data generative process for enhanced identifiability. Additionally, MuLTI integrates a permutation mechanism that merges corresponding overlapped variables by the establishment of an optimal transport formula. Extensive experimental results on synthetic and real-world datasets demonstrate the superiority of our method in recovering identifiable latent variables on multi-view time series.	翻訳日:2023-05-16 17:05:16 公開日:2023-05-14
# アルツハイマー病に伴う機能的脳ネットワークの位相的特性の変化 Altered Topological Properties of Functional Brain Network Associated with Alzheimer's Disease ( http://arxiv.org/abs/2305.08159v1 ) ライセンス: Link先を確認	Yongcheng Yao	(参考訳) 機能的磁気共鳴イメージング(fMRI)は、神経変性疾患に関連する機能的異常を含む人間の脳活動を研究するために一般的に用いられる。本研究は,アルツハイマー病(AD)患者と正常コントロール者における機能的脳ネットワークのトポロジー特性の違いについて検討することを目的とする。対象者は,AD認知症175名,年齢415名,性別415名,手腕マッチング群590名であった。脳ネットワークのトポロジ的特性をグラフ理論に基づく分析により定量化した。その結果,adグループ内のネットワーク統合と分離が異常であった。これらの知見は、機能的脳ネットワーク構造の観点からAD病態の理解を深め、ADバイオマーカーの同定に役立つ可能性がある。我々はこの研究の検証をhttps://github.com/YongchengYAO/AD-FunctionalBrainNetwork で支援した。 Functional Magnetic Resonance Imaging (fMRI) is commonly utilized to study human brain activity, including abnormal functional properties related to neurodegenerative diseases. This study aims to investigate the differences in the topological properties of functional brain networks between individuals with Alzheimer's Disease (AD) and normal controls. A total of 590 subjects, consisting of 175 with AD dementia and 415 age-, gender-, and handedness-matched controls, were included. The topological properties of the brain network were quantified using graph-theory-based analyses. The results indicate abnormal network integration and segregation in the AD group. These findings enhance our understanding of AD pathophysiology from a functional brain network structure perspective and may aid in identifying AD biomarkers. We provide more information to assist the validation of this study at https://github.com/YongchengYAO/AD-FunctionalBrainNetwork.	翻訳日:2023-05-16 17:04:59 公開日:2023-05-14
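The abstract above reports network integration and segregation without naming the metrics used. As a hedged illustration only, the sketch below computes two measures that are commonly used for these notions on an unweighted, undirected graph: global efficiency (integration) and the local clustering coefficient (segregation). The choice of metrics, the adjacency-dictionary representation, and the function names are assumptions and are not taken from the paper or its repository.

```python
from collections import deque


def shortest_path_lengths(adj, source):
    """Breadth-first search: hop distance from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def global_efficiency(adj):
    """Mean inverse shortest-path length over all ordered node pairs.

    Unreachable pairs contribute 0. Higher values indicate stronger integration.
    """
    nodes = list(adj)
    n = len(nodes)
    total = 0.0
    for u in nodes:
        dist = shortest_path_lengths(adj, u)
        total += sum(1.0 / d for v, d in dist.items() if v != u)
    return total / (n * (n - 1))


def clustering_coefficient(adj, node):
    """Fraction of a node's neighbour pairs that are themselves connected."""
    nbrs = list(adj[node])
    k = len(nbrs)
    if k < 2:
        return 0.0
    links = sum(
        1
        for i in range(k)
        for j in range(i + 1, k)
        if nbrs[j] in adj[nbrs[i]]
    )
    return 2.0 * links / (k * (k - 1))


if __name__ == "__main__":
    # Hypothetical three-region network: a chain 0 - 1 - 2.
    chain = {0: {1}, 1: {0, 2}, 2: {1}}
    print(global_efficiency(chain), clustering_coefficient(chain, 1))
```

In a real fMRI analysis the graph would come from thresholding a region-by-region correlation matrix; that preprocessing step is outside the scope of this sketch.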
# アルゴリズム的多元主義 : 平等への構造的アプローチ Algorithmic Pluralism: A Structural Approach Towards Equal Opportunity ( http://arxiv.org/abs/2305.08157v1 ) ライセンス: Link先を確認	Shomik Jain, Vinith Suriyakumar, Ashia Wilson	(参考訳) 平等な機会というアイデアは、自由な機会が私たちの生活を形作ってくれるので、広く受け入れられる。しかし、多くの人は平等な機会の意味について深く反対している。平等機会に関する新しい理論は、意思決定が機会の構造においてボトルネックや狭い場所としてどのように機能するかを記述する構造的アプローチを採用する。この差別に対する見解は、平等な機会と形式的な公正な介入による達成による根本的な問題を強調し、より多くの人々に機会を開くことを優先するより多元的なアプローチを提唱する。我々はこのボトルネック理論をデータ駆動型意思決定に拡張し、アルゴリズムが機会構造において深刻なボトルネックを生じさせる範囲の懸念に対処する。アルゴリズムによる意思決定システムにおける重大度緩和の優先順位付けを推奨する。教育、医療、刑事司法の例から、この構造的アプローチがシステム設計と規制における平等な機会についての議論をいかに再編成し、アルゴリズム的多元主義が機会をよりポジティブな方法で拡大するかを示す。 The idea of equal opportunity enjoys wide acceptance because of the freedom opportunities provide us to shape our lives. Many disagree deeply, however, about the meaning of equal opportunity, especially in algorithmic decision-making. A new theory of equal opportunity adopts a structural approach, describing how decisions can operate as bottlenecks or narrow places in the structure of opportunities. This viewpoint on discrimination highlights fundamental problems with equal opportunity and its achievement through formal fairness interventions, and instead advocates for a more pluralistic approach that prioritizes opening up more opportunities for more people. We extend this theory of bottlenecks to data-driven decision-making, adapting it to center concerns about the extent to which algorithms can create severe bottlenecks in the opportunity structure. We recommend algorithmic pluralism: the prioritization of alleviating severity in systems of algorithmic decision-making. Drawing on examples from education, healthcare, and criminal justice, we show how this structural approach helps reframe debates about equal opportunity in system design and regulation, and how algorithmic pluralism could help expand opportunities in a more positive-sum way.	翻訳日:2023-05-16 17:04:44 公開日:2023-05-14
# 単光子円偏光単モード渦ビーム Single-photon circularly polarized single-mode vortex beams ( http://arxiv.org/abs/2305.08223v1 ) ライセンス: Link先を確認	Xujing Liu, Yinhui Kan, Shailesh Kumar, Danylo Komisar, Changying Zhao, Sergey I. Bozhevolnyi	(参考訳) スピンと軌道角モータ(SAMとOAM)を持つ単一光子の生成は、高次元量子系に対する複数の自由度を利用するためのエンテンシングの視点を開く。しかし、シングルモードSAM-OAM状態で符号化された単一光子のオンチップ生成は大きな課題である。ここでは、基板上に作製された異方性ナノ二量体を慎重に設計し、表面プラズモンポラリトン(SPP)伝搬をサポートし、量子エミッタ(QE)周辺を正確に位置決めすることにより、非放射性QE-SPP結合とSPP結合を、SAMとOAMを特徴とする自由空間伝播放射に展開させる。本研究は, 位相電荷 (l = 0, 1, 2) と高単光子純度 (g(0) < 0.15) を持つ単モード渦ビームを, 円偏光(キラル度 > 0.97) のオンチップ室温で生成することを示した。先進的な量子フォトニック技術のための高次元量子源の実現を可能にするために、開発されたアプローチは簡単に拡張でき、複数の異なる偏光単光子放射チャネルを生成することができる。 Generation of single photons carrying spin and orbital angular momenta (SAM and OAM) opens enticing perspectives for exploiting multiple degrees of freedom for high-dimensional quantum systems. However, on-chip generation of single photons encoded with single-mode SAM-OAM states has been a major challenge. Here, by utilizing carefully designed anisotropic nanodimers fabricated atop a substrate, supporting surface plasmon polariton (SPP) propagation, and accurately positioned around a quantum emitter (QE), we enable nonradiative QE-SPP coupling and the SPP outcoupling into free-space propagating radiation featuring the designed SAM and OAM. We demonstrate on-chip room-temperature generation of well-collimated (divergence < 7.5 degrees) circularly polarized (chirality > 0.97) single-mode vortex beams with different topological charges (l = 0, 1, and 2) and high single-photon purity, g(0) < 0.15. The developed approach can straightforwardly be extended to produce multiple, differently polarized, single-mode single-photon radiation channels, and enable thereby realization of high-dimensional quantum sources for advanced quantum photonic technologies.	翻訳日:2023-05-16 16:59:29 公開日:2023-05-14
# 線形偏光渦ビームの超コンパクト単一光子源 Ultracompact single-photon sources of linearly polarized vortex beams ( http://arxiv.org/abs/2305.08222v1 ) ライセンス: Link先を確認	Xujing Liu, Yinhui Kan, Shailesh Kumar, Liudmilla F. Kulikova, Valery A. Davydov, Viatcheslav N. Agafonov, Changying Zhao, Sergey I. Bozhevolnyi	(参考訳) 偏光状態を持つ超コンパクトチップ一体型単光子源は、集積量子技術にとって不可欠である。しかし、現在利用可能な単一光子源のほとんどは、放出された光子ビームの偏光と位相フロントを形成するために外部の偏光成分に依存している。量子エミッタのビーム整形と偏光符号化機能との効率的な統合は、いまだに解明されていない。本稿では,ナノブリックアレイ型メタサーフェスのポテンシャルを十分に活用した,チップ集積量子エミッタ結合型メタサーフェスに基づく線形偏極渦ビームの超コンパクト単一光子源を提案する。まず, 所定の位相電荷-1, 0, +1の高純度線形偏光渦ビームのオンチップ単光子生成を示す。さらに、位相電荷の異なる直交線形偏光を持つ単一光子放出チャネルの多重化を実現し、その絡み合いを示す。本研究は,チップ一体型高次元単一光子源を実現するための新しい量子光学プラットフォームとして,超コンパクト量子エミッタ結合型メタサーフェスの可能性と実現可能性を示す。 Ultracompact chip-integrated single-photon sources of collimated beams with polarization-encoded states are crucial for integrated quantum technologies. However, most currently available single-photon sources rely on external bulky optical components to shape the polarization and phase front of emitted photon beams. Efficient integration of quantum emitters with beam shaping and polarization encoding functionalities has so far remained elusive. Here, we present ultracompact single-photon sources of linearly polarized vortex beams based on chip-integrated quantum emitter-coupled metasurfaces, which are meticulously designed by fully exploiting the potential of nanobrick arrayed metasurfaces. We first demonstrate on-chip single-photon generation of high-purity linearly polarized vortex beams with prescribed topological charges of -1, 0, and +1. We further realize multiplexing of single-photon emission channels with orthogonal linear polarizations carrying different topological charges and demonstrate their entanglement. Our work illustrates the potential and feasibility of ultracompact quantum emitter-coupled metasurfaces as a new quantum optics platform for realizing chip-integrated high-dimensional single-photon sources.	翻訳日:2023-05-16 16:59:02 公開日:2023-05-14
# 深層スペクトル埋め込みを意識した学習構造 Learning Structure Aware Deep Spectral Embedding ( http://arxiv.org/abs/2305.08215v1 ) ライセンス: Link先を確認	Hira Yaseen and Arif Mahmood	(参考訳) スペクトル埋め込み(se)は、分類とクラスタリングのために、非線形多様体から線形部分空間へのデータポイントのマッピングにしばしば用いられる。重要な利点にもかかわらず、元の空間におけるデータの部分空間構造は埋め込み空間では保存されない。この問題に対処するために、SEグラフ親和性を自己表現行列に置き換えることで、サブスペースクラスタリングが提案されている。しかし、データが線型部分空間の結合にある場合、データが非線型多様体にまたがる実世界での性能は低下する可能性がある。この問題に対処するために,スペクトル埋め込み損失と構造保存損失を組み合わせた新しい構造認識深層スペクトル埋め込みを提案する。この目的のために、両タイプの情報を同時に符号化し、構造対応スペクトル埋め込みを生成するディープニューラルネットワークアーキテクチャを提案する。注意に基づく自己表現学習を用いて入力データの部分空間構造を符号化する。提案アルゴリズムは6つの実世界のデータセット上で評価される。その結果,既存の最先端手法と比較して,提案アルゴリズムのクラスタリング性能は優れていた。提案アルゴリズムは,データポイントの発見に優れた一般化を示し,膨大な計算資源を必要としない大規模データセットにスケーラブルである。 Spectral Embedding (SE) has often been used to map data points from non-linear manifolds to linear subspaces for the purpose of classification and clustering. Despite significant advantages, the subspace structure of data in the original space is not preserved in the embedding space. To address this issue subspace clustering has been proposed by replacing the SE graph affinity with a self-expression matrix. It works well if the data lies in a union of linear subspaces however, the performance may degrade in real-world applications where data often spans non-linear manifolds. To address this problem we propose a novel structure-aware deep spectral embedding by combining a spectral embedding loss and a structure preservation loss. To this end, a deep neural network architecture is proposed that simultaneously encodes both types of information and aims to generate structure-aware spectral embedding. The subspace structure of the input data is encoded by using attention-based self-expression learning. The proposed algorithm is evaluated on six publicly available real-world datasets. The results demonstrate the excellent clustering performance of the proposed algorithm compared to the existing state-of-the-art methods. 
The proposed algorithm has also exhibited better generalization to unseen data points and it is scalable to larger datasets without requiring significant computational resources.	翻訳日:2023-05-16 16:58:45 公開日:2023-05-14
# フェルミオン環境と相互作用する系の刺激ラマン断熱通路の巨大スピンモデル Giant Spin Model for Stimulated Raman Adiabatic Passage of systems interacting with a fermionic environment ( http://arxiv.org/abs/2305.08209v1 ) ライセンス: Link先を確認	Benedetto Militello and Anna Napoli	(参考訳) このような技術によって操作される物理系がスピン浴と相互作用する場合に、刺激ラマン断熱路を解析する。人口移動プロセスの効率性は, 環境との弱い強い結合や非共鳴など, いくつかの制度で検討されている。一般化された量子ゼノ効果の発生は、強い減衰状態における効率の低下を説明する。 Stimulated Raman Adiabatic Passage is analyzed in the case where the physical system manipulated by such technique is interacting with a spin bath. The efficiency of the population transfer process is investigated in several regimes, including the weak and strong coupling with the environment and the off-resonance. The occurrence of a generalized quantum Zeno effect explains the lowering of the efficiency in the strong damping regime.	翻訳日:2023-05-16 16:58:12 公開日:2023-05-14
# クロスドメインqaを一般化する学習 Learning to Generalize for Cross-domain QA ( http://arxiv.org/abs/2305.08208v1 ) ライセンス: Link先を確認	Yingjie Niu, Linyi Yang, Ruihai Dong, Yue Zhang	(参考訳) 自然言語処理(NLP)モデルのドメイン外一般化能力,特に質問応答(QA)タスクに対する懸念が高まっている。トレーニングコストの増大により、QAの現在の合成データ拡張方法が妨げられる。この問題に対処するため,提案手法と線形探索と微調整戦略を組み合わせた新しい手法を提案するが,追加コストは伴わない。本手法は, 生成モデルと識別モデルの両方の一般化能力の向上に有効であることが理論的, 実験的に証明されている。我々のアプローチは最先端のベースラインを上回り、F1のスコアは平均4.5%-7.9%上昇した。さらに,任意の事前学習モデルに容易に統合でき,未検討のクロスドメインqaタスクに対して有望な解決策を提供する。ソースコードはGitHubで公開しています。 There have been growing concerns regarding the out-of-domain generalization ability of natural language processing (NLP) models, particularly in question-answering (QA) tasks. Current synthesized data augmentation methods for QA are hampered by increased training costs. To address this issue, we propose a novel approach that combines prompting methods and linear probing then fine-tuning strategy, which does not entail additional cost. Our method has been theoretically and empirically shown to be effective in enhancing the generalization ability of both generative and discriminative models. Our approach outperforms state-of-the-art baselines, with an average increase in F1 score of 4.5%-7.9%. Furthermore, our method can be easily integrated into any pre-trained models and offers a promising solution to the under-explored cross-domain QA task. We release our source code at GitHub.	翻訳日:2023-05-16 16:57:58 公開日:2023-05-14
# 認知障害高齢者のための多元的知識融合を用いた認知刺激対話システム A Cognitive Stimulation Dialogue System with Multi-source Knowledge Fusion for Elders with Cognitive Impairment ( http://arxiv.org/abs/2305.08200v1 ) ライセンス: Link先を確認	Jiyue Jiang, Sheng Wang, Qintong Li, Lingpeng Kong, Chuan Wu	(参考訳) 認知障害のある高齢者とコミュニケーションする際、認知刺激(CS)は高齢者の認知健康を維持するのに役立つ。データ空間は、特に中国語でCSベースの対話システムを構築する上で大きな課題である。このギャップを埋めるために、CS原則と感情支援戦略ラベルとの対話の約2.6Kグループを含む中国のCS会話(CSConv)データセットを構築した。感情的なサポートを提供しながらチャットをするというのは、既存の認知対話システムの大半で見過ごされている。本稿では,CS の原理と感情支援戦略に導かれるオープンな応答を生成するための,CS 対話のためのマルチソース知識融合手法を提案する。まず,外部知識に基づくプログレッシブマスク法を用いて,エンコーダを効果的な分類法として学習する。そして、デコーダが認識されたCS原理と感情的支援戦略と相互作用して応答を生成する。 csconvデータセットで行った広範囲な実験により,提案手法の有効性が実証された。 When communicating with elders with cognitive impairment, cognitive stimulation (CS) helps to maintain the cognitive health of elders. Data sparsity is the main challenge in building CS-based dialogue systems, particularly in the Chinese language. To fill this gap, we construct a Chinese CS conversation (CSConv) dataset, which contains about 2.6K groups of dialogues with CS principles and emotional support strategy labels. Making chit chat while providing emotional support is overlooked by the majority of existing cognitive dialogue systems. In this paper, we propose a multi-source knowledge fusion method for CS dialogue (CSD), to generate open-ended responses guided by the CS principle and emotional support strategy. We first use a progressive mask method based on external knowledge to learn encoders as effective classifiers, which is the prerequisite to predict the CS principle and emotional support strategy of the target response. Then a decoder interacts with the perceived CS principle and emotional support strategy to generate responses. Extensive experiments conducted on the CSConv dataset demonstrate the effectiveness of the proposed method, while there is still a large space for improvement compared to human performance.	翻訳日:2023-05-16 16:57:35 公開日:2023-05-14
# 一様周期時系列データセットにおける一般化異常検出のためのデータセット融合アルゴリズム A Dataset Fusion Algorithm for Generalised Anomaly Detection in Homogeneous Periodic Time Series Datasets ( http://arxiv.org/abs/2305.08197v1 ) ライセンス: Link先を確認	Ayman Elhalwagy and Tatiana Kalganova	(参考訳) ニューラルネットワーク(NN)を複数のデータセットに一般化することは、NNが特定のデータソースに最適化されるため、文献でしばしば見過ごされる。これは、異なるセンサからのシーケンシャルデータとコレクション仕様の融合が困難であるため、時系列ベースのマルチデータセットモデルでは特に困難になる。しかし、商用環境では、AIモデルの持続可能な開発であるグリーンAIの文脈において不可欠である、利用可能なデータと計算能力を有効に活用することができる。本稿では,複数の均質なデータセットから周期的信号を単一のデータセットに融合する新しいデータセット合成アルゴリズム"dataset fusion"を提案する。提案手法は、教師なしLSTMCaps NNを用いた2種類の同種誘導電動機(IM)故障データセットの3相電流データをケーススタディで検証し、平均F1スコア0.879で従来のトレーニング手法を著しく上回り、全データセットにわたって効果的に一般化する。提案されたアプローチは、Green AIの原則に従って、トレーニングデータのさまざまなパーセンテージでテストされた。その結果、トレーニングデータの6.25\%しか使用せず、93.7\%の計算能力の低下に対応して、わずか4.04\%の性能低下となり、性能と計算効率の両方の観点から提案手法の利点が示された。さらに,非理想条件下でのアルゴリズムの有効性は,実世界への応用の可能性を強調している。 The generalisation of Neural Networks (NN) to multiple datasets is often overlooked in literature due to NNs typically being optimised for specific data sources. This becomes especially challenging in time-series-based multi-dataset models due to difficulties in fusing sequential data from different sensors and collection specifications. In a commercial environment, however, generalisation can effectively utilise available data and computational power, which is essential in the context of Green AI, the sustainable development of AI models. This paper introduces "Dataset Fusion," a novel dataset composition algorithm for fusing periodic signals from multiple homogeneous datasets into a single dataset while retaining unique features for generalised anomaly detection. The proposed approach, tested on a case study of 3-phase current data from 2 different homogeneous Induction Motor (IM) fault datasets using an unsupervised LSTMCaps NN, significantly outperforms conventional training approaches with an Average F1 score of 0.879 and effectively generalises across all datasets. 
The proposed approach was also tested with varying percentages of the training data, in line with the principles of Green AI. Results show that using only 6.25% of the training data, translating to a 93.7% reduction in computational power, results in a mere 4.04% decrease in performance, demonstrating the advantages of the proposed approach in terms of both performance and computational efficiency. Moreover, the algorithm's effectiveness under non-ideal conditions highlights its potential for practical use in real-world applications.	翻訳日:2023-05-16 16:57:06 公開日:2023-05-14
# 視覚・他領域のセグメンテーションモデルに関する総合的調査 A Comprehensive Survey on Segment Anything Model for Vision and Beyond ( http://arxiv.org/abs/2305.08196v1 ) ライセンス: Link先を確認	Chunhui Zhang, Li Liu, Yawen Cui, Guanjie Huang, Weilin Lin, Yiqian Yang, Yuehong Hu	(参考訳) 人工知能(AI)は、AIシステムが幅広いタスクを実行し、人間のものと似たレベルの知性を示す能力を指す人工知能へと進化している。これは、特定のタスクを高い効率で実行するように設計された、狭いあるいは特殊なAIとは対照的である。したがって、様々な下流タスクに適応可能な幅広いデータに基づいて訓練された基礎モデルと呼ばれる、一般的なモデルのクラスを設計することが急務である。最近提案されたセグメンテーションモデル (SAM) は、セグメンテーションの境界を画定し、コンピュータビジョンの基礎モデルの開発を大いに促進している。 SAMを完全に理解するために,我々は調査研究を行う。ビジョンのためのタスクのセグメンテーションの進捗を、samの基礎モデルに基づいて包括的にレビューするため、本研究は、その歴史的発展、最近の進歩、幅広いアプリケーションへの深い影響について議論することで、様々なタスクやデータタイプへの応用に焦点を当てている。まず、SAMを含む基礎モデルの背景と用語、およびタスクのセグメンテーションに重要なSAMと同等の最先端の手法について紹介する。そして,ソフトウェアシーン,現実世界シーン,複雑なシーンなど,様々な画像処理アプリケーションにおけるSAMの利点と限界を分析し,要約する。重要なのは、より汎用的な基礎モデルを開発し、samのアーキテクチャを改善するための将来の研究のガイドとなるいくつかの洞察である。また、SAMの視覚およびそれ以上の素晴らしい応用についてもまとめています。 Artificial intelligence (AI) is evolving towards artificial general intelligence, which refers to the ability of an AI system to perform a wide range of tasks and exhibit a level of intelligence similar to that of a human being. This is in contrast to narrow or specialized AI, which is designed to perform specific tasks with a high degree of efficiency. Therefore, it is urgent to design a general class of models, which we term foundation models, trained on broad data that can be adapted to various downstream tasks. The recently proposed segment anything model (SAM) has made significant progress in breaking the boundaries of segmentation, greatly promoting the development of foundation models for computer vision. To fully comprehend SAM, we conduct a survey study. As the first to comprehensively review the progress of segmenting anything task for vision and beyond based on the foundation model of SAM, this work focuses on its applications to various tasks and data types by discussing its historical development, recent progress, and profound impact on broad applications. 
We first introduce the background and terminology for foundation models including SAM, as well as state-of-the-art methods contemporaneous with SAM that are significant for the segment-anything task. Then, we analyze and summarize the advantages and limitations of SAM across various image processing applications, including software scenes, real-world scenes, and complex scenes. Importantly, we draw several insights to guide future research toward more versatile foundation models and an improved SAM architecture. We also summarize many other notable applications of SAM in vision and beyond.	翻訳日:2023-05-16 16:56:39 公開日:2023-05-14
# 対話型意味解析のための自然言語フィードバックのシミュレーション Learning to Simulate Natural Language Feedback for Interactive Semantic Parsing ( http://arxiv.org/abs/2305.08195v1 ) ライセンス: Link先を確認	Hao Yan, Saurabh Srivastava, Yintao Tai, Sida I. Wang, Wen-tau Yih, Ziyu Yao	(参考訳) 自然言語(nl)フィードバックに基づく対話的意味解析は、ユーザーがパーサーの間違いを修正するためのフィードバックを提供するもので、従来のワンショット意味解析よりも実用的なシナリオとして登場している。しかし、従来の作業は、対話型セマンティックパーサをトレーニングするために、人間に注釈付けされたフィードバックデータに大きく依存している。本研究では,対話型意味解析のためのNLフィードバックをシミュレーションするタスクを提案する。私たちはそのタスクに新しいフィードバック評価器を伴います。 evaluatorはシミュレーションされたフィードバックの品質を評価するために特別に設計されており、提案手法から最適なフィードバックシミュレータを決定する。テキストからSQLへのデータセットでは、フィードバックシミュレータが高品質なNLフィードバックを生成し、特定のパーサの誤り訂正能力を向上できることを示す。低データ設定で、私たちのフィードバックシミュレータは、コストがかかるフルヒューマンアノテーションを使用してトレーニングされたエラー修正のパフォーマンスを同等に達成できます。 Interactive semantic parsing based on natural language (NL) feedback, where users provide feedback to correct the parser mistakes, has emerged as a more practical scenario than the traditional one-shot semantic parsing. However, prior work has heavily relied on human-annotated feedback data to train the interactive semantic parser, which is prohibitively expensive and not scalable. In this work, we propose a new task of simulating NL feedback for interactive semantic parsing. We accompany the task with a novel feedback evaluator. The evaluator is specifically designed to assess the quality of the simulated feedback, based on which we decide the best feedback simulator from our proposed variants. On a text-to-SQL dataset, we show that our feedback simulator can generate high-quality NL feedback to boost the error correction ability of a specific parser. In low-data settings, our feedback simulator can help achieve comparable error correction performance as trained using the costly, full set of human annotations.	翻訳日:2023-05-16 16:56:12 公開日:2023-05-14
# 知覚不能かつ転送可能な敵対的攻撃のための拡散モデル Diffusion Models for Imperceptible and Transferable Adversarial Attack ( http://arxiv.org/abs/2305.08192v1 ) ライセンス: Link先を確認	Jianqi Chen, Hao Chen, Keyan Chen, Yilan Zhang, Zhengxia Zou, Zhenwei Shi	(参考訳) 既存の多くの敵攻撃は画像RGB空間上で$L_p$-norm摂動を生成する。移植性や攻撃成功率のいくつかの成果にもかかわらず、製作された敵の例は人間の目で容易に認識される。最近の研究では、$L_p$-norm制約なしで制限のない攻撃を探索しているが、ブラックボックスモデルに対する攻撃の転送性は欠如している。本研究では,拡散モデルの生成的・判別的パワーを活用し,新しい非受容的・移動可能攻撃を提案する。具体的には、ピクセル空間の直接操作の代わりに、拡散モデルの潜在空間で摂動を発生させる。適切に設計されたコンテンツ保存構造と組み合わせることで、意味的な手がかりが埋め込まれた人間非感受性の摂動を生成することができる。移動性を改善するため,対象領域から注意をそらすことにより,追加の認識サーロゲートと見なすことのできる拡散モデルをさらに「欺く」。我々の知る限り、提案手法であるDiffAttackは、敵の攻撃フィールドに拡散モデルを導入する最初の方法である。各種モデル構造(CNN, Transformer, MLPs など)と防御手法の多種多様な実験により,攻撃方法の優位性を実証した。 Many existing adversarial attacks generate $L_p$-norm perturbations on image RGB space. Despite some achievements in transferability and attack success rate, the crafted adversarial examples are easily perceived by human eyes. Towards visual imperceptibility, some recent works explore unrestricted attacks without $L_p$-norm constraints, yet they lack transferability when attacking black-box models. In this work, we propose a novel imperceptible and transferable attack by leveraging both the generative and discriminative power of diffusion models. Specifically, instead of direct manipulation in pixel space, we craft perturbations in the latent space of diffusion models. Combined with well-designed content-preserving structures, we can generate human-insensitive perturbations embedded with semantic clues. For better transferability, we further "deceive" the diffusion model, which can be viewed as an additional recognition surrogate, by distracting its attention away from the target regions. To our knowledge, our proposed method, DiffAttack, is the first that introduces diffusion models into the adversarial attack field. 
Extensive experiments on various model structures (including CNNs, Transformers, MLPs) and defense methods have demonstrated our superiority over other attack methods.	翻訳日:2023-05-16 16:55:56 公開日:2023-05-14
# MatSci-NLP:テキスト-スキーマモデリングを用いた材料科学言語課題における科学言語モデルの評価 MatSci-NLP: Evaluating Scientific Language Models on Materials Science Language Tasks Using Text-to-Schema Modeling ( http://arxiv.org/abs/2305.08264v1 ) ライセンス: Link先を確認	Yu Song, Santiago Miret, Bang Liu	(参考訳) 本研究では,自然言語処理(NLP)モデルの性能評価を行う自然言語ベンチマークであるMatSci-NLPを提案する。本ベンチマークは,7つの異なるnlpタスク(名前付きエンティティ認識や関係分類などの従来のnlpタスクや,素材の合成手順の作成に関連する合成行動検索など,材料科学特有のnlpタスクを含む)を包含する,利用可能な材料科学のテキストデータから構築する。本研究では,様々な理科テキストコーパスで事前学習したBERTモデルについて検討し,事前学習戦略が材料科学テキストの理解に与える影響を明らかにする。材料科学分野における高品質な注釈データの不足を考えると,我々はmatsci-nlpタスク間の一般化を促進するために,限られたトレーニングデータを用いて微調整実験を行う。この低リソース・トレーニング・セッティングにおける実験により,理科テキストで事前学習した言語モデルは,一般的なテキストで訓練したBERTより優れていることが示された。 MatBERTは、材料科学雑誌に特化して事前訓練されたモデルで、ほとんどのタスクに最適である。さらに,MatSci-NLP上でのマルチタスク学習のための統一テキストスキーマを提案し,その性能を従来の微調整手法と比較する。異なる学習方法の分析により,提案手法が単タスクと多タスクのnlpの微調整法を常に上回っており,質問応答法に着想を得た。コードとデータセットは https://github.com/BangLab-UdeM-Mila/NLP4MatSci-ACL23 で公開されている。 We present MatSci-NLP, a natural language benchmark for evaluating the performance of natural language processing (NLP) models on materials science text. We construct the benchmark from publicly available materials science text data to encompass seven different NLP tasks, including conventional NLP tasks like named entity recognition and relation classification, as well as NLP tasks specific to materials science, such as synthesis action retrieval which relates to creating synthesis procedures for materials. We study various BERT-based models pretrained on different scientific text corpora on MatSci-NLP to understand the impact of pretraining strategies on understanding materials science text. Given the scarcity of high-quality annotated data in the materials science domain, we perform our fine-tuning experiments with limited training data to encourage generalization across MatSci-NLP tasks. Our experiments in this low-resource training setting show that language models pretrained on scientific text outperform BERT trained on general text. 
MatBERT, a model pretrained specifically on materials science journals, generally performs best on most tasks. Moreover, we propose a unified text-to-schema for multitask learning on MatSci-NLP and compare its performance with traditional fine-tuning methods. In our analysis of different training methods, we find that our proposed text-to-schema methods inspired by question-answering consistently outperform single and multitask NLP fine-tuning methods. The code and datasets are publicly available at https://github.com/BangLab-UdeM-Mila/NLP4MatSci-ACL23.	翻訳日:2023-05-16 16:48:54 公開日:2023-05-14
# 医用画像解析のためのパラメーター効率の微調整:逃避機会 Parameter-Efficient Fine-Tuning for Medical Image Analysis: The Missed Opportunity ( http://arxiv.org/abs/2305.08252v1 ) ライセンス: Link先を確認	Raman Dutt, Linus Ericsson, Pedro Sanchez, Sotirios A. Tsaftaris, Timothy Hospedales	(参考訳) 本稿では,多種多様な医用画像解析タスクにおけるパラメータ効率向上技術(PEFT)の総合評価について述べる。 PEFTは、自然言語処理、ビジョン、スピーチ、そして視覚言語やテキスト・ツー・イメージ生成のようなモーダルなタスクにおいて、事前訓練されたモデルから知識を伝達するための貴重なアプローチとして、ますます活用されている。しかし、医用画像解析への応用はいまだに未解明である。基礎モデルが医学領域でますます活用されるようになるにつれて、ダウンストリームタスクの範囲を補強する知識伝達の様々な戦略を調査し、比較評価することが重要となる。コンボリューションとトランスフォーマーに基づくネットワークのために提案された16種類のPEFT手法を,サイズ,モダリティ,複雑性の6つの医学データセットを対象とした画像分類とテキスト・ツー・イメージ生成タスクに着目し,本研究で評価した。 600以上の制御された実験により,特定のシナリオ下では最大22%の性能向上を示し,医療用テキスト・画像生成におけるPEFTの有効性を示した。さらに, 従来の微調整手法よりもPEFT法が特に優位である事例を明らかにし, 下流データ量との関係について検討する。 We present a comprehensive evaluation of Parameter-Efficient Fine-Tuning (PEFT) techniques for diverse medical image analysis tasks. PEFT is increasingly exploited as a valuable approach for knowledge transfer from pre-trained models in natural language processing, vision, speech, and cross-modal tasks, such as vision-language and text-to-image generation. However, its application in medical image analysis remains relatively unexplored. As foundation models are increasingly exploited in the medical domain, it is crucial to investigate and comparatively assess various strategies for knowledge transfer that can bolster a range of downstream tasks. Our study, the first of its kind (to the best of our knowledge), evaluates 16 distinct PEFT methodologies proposed for convolutional and transformer-based networks, focusing on image classification and text-to-image generation tasks across six medical datasets ranging in size, modality, and complexity. Through a battery of more than 600 controlled experiments, we demonstrate performance gains of up to 22% under certain scenarios and demonstrate the efficacy of PEFT for medical text-to-image generation. 
Further, we reveal the instances where PEFT methods particularly dominate over conventional fine-tuning approaches by studying their relationship with downstream data volume.	翻訳日:2023-05-16 16:48:26 公開日:2023-05-14
# 言語能力の犠牲を伴わない非言語スキルの学習 Learning Non-linguistic Skills without Sacrificing Linguistic Proficiency ( http://arxiv.org/abs/2305.08246v1 ) ライセンス: Link先を確認	Mandar Sharma, Nikhil Muralidhar, Naren Ramakrishnan	(参考訳) 近年のMath-NLPの分野は、LLMのパフォーマンスを非言語的概念(数、そしてその後の算術的推論)の学習に拡大したいという願望に動機付けられて、著しい成長をみせている。しかし、非言語的スキルインジェクションは、一般的にLLMにとって代償を伴う:それは、コア言語スキルが壊滅的に忘れ去られてしまうことにつながる。 Math-NLP は、小学生の数学的スキルや計算機の算術的推論スキルを正確に近似できる LLM を作成することができるため、これらのモデルの実用性は、それらが言語能力を損なうと失敗する。本研究は, LLMに関する破滅的忘れの現象を深く考察し, 厳密な算術推論の学習を可能にする情報理論的介入とスキル特異的損失に基づく, LLMの非言語的スキル注入のための新しい枠組みを提供する。本モデルは,非言語的スキルと言語的知識の保持の両方において最先端を上回っており,非言語的訓練データ(1/4)およびゼロの追加的な言語的訓練データを用いている。 The field of Math-NLP has witnessed significant growth in recent years, motivated by the desire to expand LLM performance to the learning of non-linguistic notions (numerals, and subsequently, arithmetic reasoning). However, non-linguistic skill injection typically comes at a cost for LLMs: it leads to catastrophic forgetting of core linguistic skills, a consequence that often remains unaddressed in the literature. As Math-NLP has been able to create LLMs that can closely approximate the mathematical skills of a grade-schooler or the arithmetic reasoning skills of a calculator, the practicality of these models fails if they concomitantly shed their linguistic capabilities. In this work, we take a closer look into the phenomenon of catastrophic forgetting as it pertains to LLMs and subsequently offer a novel framework for non-linguistic skill injection for LLMs based on information theoretic interventions and skill-specific losses that enable the learning of strict arithmetic reasoning. Our model outperforms the state-of-the-art both on injected non-linguistic skills and on linguistic knowledge retention, and does so with a fraction of the non-linguistic training data (1/4) and zero additional synthetic linguistic training data.	翻訳日:2023-05-16 16:48:04 公開日:2023-05-14
# トリビュートAIコンペティションの物語の紹介 Introducing Tales of Tribute AI Competition ( http://arxiv.org/abs/2305.08234v1 ) ライセンス: Link先を確認	Jakub Kowalski, Rados{\l}aw Miernik, Katarzyna Polak, Dominik Budzki, Damian Kowalik	(参考訳) 本稿では,The Elder Scrolls OnlineのHigh Isle章でリリースされた2人のプレイヤーによるデッキビルディングカードゲームに基づいて,新たなAIチャレンジであるTOTAICを提案する。現在、CCG(Collectible Card Games)のジャンルをカバーするAIコンペティションは他になく、デッキビルディングゲームをターゲットにした大会は一度もない。したがって、ランダム性や隠れ情報、大きな分岐要因など、通常のCCG関連の障害を克服するためには、長期的な計画と汎用性も必要である。このゲームは、古典的な敵探索、シングルプレイヤー計画、ニューラルネットワークベースのアルゴリズムなど、複数のアプローチで対処できる。本稿では,競争の枠組みを紹介し,ゲームのルールを説明し,サンプルAIエージェント間のトーナメントの結果を示す。 TOTAICの最初のエディションはIEEE Conference on Games 2023で開催されている。 This paper presents a new AI challenge, the Tales of Tribute AI Competition (TOTAIC), based on a two-player deck-building card game released with the High Isle chapter of The Elder Scrolls Online. Currently, there is no other AI competition covering Collectible Card Games (CCG) genre, and there has never been one that targets a deck-building game. Thus, apart from usual CCG-related obstacles to overcome, like randomness, hidden information, and large branching factor, the successful approach additionally requires long-term planning and versatility. The game can be tackled with multiple approaches, including classic adversarial search, single-player planning, and Neural Networks-based algorithms. This paper introduces the competition framework, describes the rules of the game, and presents the results of a tournament between sample AI agents. The first edition of TOTAIC is hosted at the IEEE Conference on Games 2023.	翻訳日:2023-05-16 16:47:40 公開日:2023-05-14
# グラフエコー状態ネットワークを用いたノード分類におけるヘテロフォリーの対応 Addressing Heterophily in Node Classification with Graph Echo State Networks ( http://arxiv.org/abs/2305.08233v1 ) ライセンス: Link先を確認	Alessio Micheli, Domenico Tortorella	(参考訳) グラフ上のノード分類タスクは、ノード近傍の複数の集約を通してノード表現の階層を学習する、完全に訓練されたディープメッセージパッシングモデルによって処理される。クラス内エッジの比率が高いグラフでは有効であるが、このアプローチは反対のケース、すなわちヘテロフィリー(英語版)では、同じクラスに属するノードが通常はさらに離れている。ヘテロフィアの高いグラフでは、畳み込みモデルによって計算された近接近傍に基づく平滑化表現はもはや有効ではない。これまでのところ、入力グラフの過度な平滑化や切り替えを低減し、長距離メッセージパッシングを改善するためのメッセージパッシングモデルのアーキテクチャ上のバリエーションが提案されている。本稿では,ノード分類のためのグラフエコー状態ネットワーク(GESN)を用いた異種グラフの課題に対処する。 gesnはグラフの貯水池計算モデルであり、ノード埋め込みは未学習のメッセージパッシング関数によって再帰的に計算される。我々の実験では,アーキテクチャバイアスのアドホックなバリエーションを実装したり,インプットグラフの事前処理ステップとして再処理を行う,最も完全に訓練された深層モデルに対して,リザーバモデルの方が,効率/正確性のトレードオフという面で改善した。さらに,gesnは再帰的埋め込み関数の反復とグラフ内の最短経路の分布との相関を示すことにより,グラフノードの構造的関係を効果的にエンコードできることを示した。 Node classification tasks on graphs are addressed via fully-trained deep message-passing models that learn a hierarchy of node representations via multiple aggregations of a node's neighbourhood. While effective on graphs that exhibit a high ratio of intra-class edges, this approach poses challenges in the opposite case, i.e. heterophily, where nodes belonging to the same class are usually further apart. In graphs with a high degree of heterophily, the smoothed representations based on close neighbours computed by convolutional models are no longer effective. So far, architectural variations in message-passing models to reduce excessive smoothing or rewiring the input graph to improve longer-range message passing have been proposed. In this paper, we address the challenges of heterophilic graphs with Graph Echo State Network (GESN) for node classification. GESN is a reservoir computing model for graphs, where node embeddings are recursively computed by an untrained message-passing function. 
Our experiments show that reservoir models are able to achieve better or comparable accuracy with respect to most fully trained deep models that implement ad hoc variations in the architectural bias or perform rewiring as a preprocessing step on the input graph, with an improvement in terms of efficiency/accuracy trade-off. Furthermore, our analysis shows that GESN is able to effectively encode the structural relationships of a graph node, by showing a correlation between iterations of the recursive embedding function and the distribution of shortest paths in a graph.	翻訳日:2023-05-16 16:47:25 公開日:2023-05-14
# 街路画像からの物体の位置情報と高さ推定の組合せ Combining geolocation and height estimation of objects from street level imagery ( http://arxiv.org/abs/2305.08232v1 ) ライセンス: Link先を確認	Matej Ulicny, Vladimir A. Krylov, Julie Connelly, and Rozenn Dahyot	(参考訳) 本研究では,単一の入力データモダリティと見なされる道路レベルrgb画像から,多クラスオブジェクトの位置情報と高さ推定を組み合わせたパイプラインを提案する。我々の解はマルコフ確率場最適化によって定式化される。提案手法は、カスタムトレーニングされた畳み込みニューラルネットワークで検出された画像平面内の物体の座標とともに画像メタデータを使用する。対象位置の計算に加えて,本手法を用いた対象高さの計算は,全体の計算コストに悪影響を及ぼす。平均標高推定誤差が20cm未満となる排水路や道路標識の精度を実験的に実証した。 We propose a pipeline for combined multi-class object geolocation and height estimation from street level RGB imagery, which is considered as a single available input data modality. Our solution is formulated via Markov Random Field optimization with deterministic output. The proposed technique uses image metadata along with coordinates of objects detected in the image plane as found by a custom-trained Convolutional Neural Network. Computing the object height using our methodology, in addition to object geolocation, has negligible effect on the overall computational cost. Accuracy is demonstrated experimentally for water drains and road signs on which we achieve average elevation estimation error lower than 20cm.	翻訳日:2023-05-16 16:46:57 公開日:2023-05-14
# 海面の高さと速度場に基づくハイブリッド3次元渦検出技術 A Hybrid 3D Eddy Detection Technique Based on Sea Surface Height and Velocity Field ( http://arxiv.org/abs/2305.08229v1 ) ライセンス: Link先を確認	Weiping Hua, Karen Bemis, Dujuan Kang, Sedat Ozer, Deborah Silver	(参考訳) 渦検出は海洋科学者にとって海洋循環を理解し解析する重要な課題である。本稿では,海面の高さ (ssh) と速度場と渦の挙動を定義する幾何学的基準を組み合わせた渦検出手法を提案する。海洋学者がエディーズの中心に求めるSSHミニマとマキシマの探索を行った。幾何的基準は、各渦中心を囲む円形の経路に沿って速度成分を追従することにより、ネット回転や対称性などの期待される速度場特性の検証に使用される。プログレッシブな探索は、各エディの3D領域に影響を及ぼす。データセットから各渦構造を分離することで、水平速度、垂直速度、温度、塩分量を用いて内部渦構造の可視化が容易になる。大久保-ワイス渦性閾値(ow)、標準巻線角、およびこの新しいssh-速度ハイブリッド法による渦検出法を赤海データセットに適用した結果、検出結果は方法、閾値、基準の選定に大きく依存していることが示唆された。この新しいssh-velocityハイブリッド検出手法は, 回転特性が検証された渦構造を提供すること, 物性の内部構造の3次元可視化, 流線を計算せずに高速に渦足跡を推定できる。本手法は, 内部構造の可視化と全体移動の追跡を併用し, 栄養分布と海洋循環の相互作用を理解するための輸送機構の研究を支援する。本手法は3つの異なるデータセットに適用し,その一般性を示す。 Eddy detection is a critical task for ocean scientists to understand and analyze ocean circulation. In this paper, we introduce a hybrid eddy detection approach that combines sea surface height (SSH) and velocity fields with geometric criteria defining eddy behavior. Our approach searches for SSH minima and maxima, which oceanographers expect to find at the center of eddies. Geometric criteria are used to verify expected velocity field properties, such as net rotation and symmetry, by tracing velocity components along a circular path surrounding each eddy center. Progressive searches outward and into deeper layers yield each eddy's 3D region of influence. Isolation of each eddy structure from the dataset, using its cylindrical footprint, facilitates visualization of internal eddy structures using horizontal velocity, vertical velocity, temperature and salinity. A quantitative comparison of Okubo-Weiss vorticity (OW) thresholding, the standard winding angle, and this new SSH-velocity hybrid methods of eddy detection as applied to the Red Sea dataset suggests that detection results are highly dependent on the choices of method, thresholds, and criteria. 
Our new SSH-velocity hybrid detection approach has the advantages of providing eddy structures with verified rotation properties, 3D visualization of the internal structure of physical properties, and rapid efficient estimations of eddy footprints without calculating streamlines. Our approach combines visualization of internal structure and tracking overall movement to support the study of the transport mechanisms key to understanding the interaction of nutrient distribution and ocean circulation. Our method is applied to three different datasets to showcase the generality of its application.	翻訳日:2023-05-16 16:46:47 公開日:2023-05-14
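The eddy-detection abstract above starts from sea surface height (SSH) minima and maxima as candidate eddy centers. The following is a minimal sketch of only that first step, assuming a small regular 2D grid stored as nested lists; the function name `find_ssh_extrema`, the 8-neighbour rule, and the sample values are illustrative assumptions, not the authors' implementation. The velocity-based geometric checks are not shown.

```python
def find_ssh_extrema(ssh):
    """Return (minima, maxima) as lists of (row, col) grid indices.

    A cell counts as a strict local minimum (maximum) when its value is
    lower (higher) than every one of its up-to-8 neighbours.
    """
    rows, cols = len(ssh), len(ssh[0])
    minima, maxima = [], []
    for r in range(rows):
        for c in range(cols):
            # Collect the values of all neighbouring cells inside the grid.
            neighbours = [
                ssh[rr][cc]
                for rr in range(max(0, r - 1), min(rows, r + 2))
                for cc in range(max(0, c - 1), min(cols, c + 2))
                if (rr, cc) != (r, c)
            ]
            if all(ssh[r][c] < v for v in neighbours):
                minima.append((r, c))
            elif all(ssh[r][c] > v for v in neighbours):
                maxima.append((r, c))
    return minima, maxima


# Hypothetical SSH anomalies in metres: one bump and one depression.
ssh = [
    [0.0, 0.0, 0.0, 0.0, 0.0],
    [0.0, 0.3, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, -0.2, 0.0],
    [0.0, 0.0, 0.0, 0.0, 0.0],
]
minima, maxima = find_ssh_extrema(ssh)
print(minima)  # [(2, 3)]
print(maxima)  # [(1, 1)]
```

Each returned index would be only a candidate center; in the approach described above, a candidate is kept only if the surrounding velocity field also passes the rotation and symmetry checks.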
# 骨格グラフに基づく超音波CT非剛性レジストレーション Skeleton Graph-based Ultrasound-CT Non-rigid Registration ( http://arxiv.org/abs/2305.08228v1 ) ライセンス: Link先を確認	Zhongliang Jiang, Xuesong Li, Chenyu Zhang, Yuan Bi, Walter Stechele, Nassir Navab	(参考訳) 自律型超音波(US)スキャンは注目を集めており、術者間変動などの従来のアメリカの検査の限界を克服するための潜在的な解決策と見なされている。しかしながら、特に音響窓が制限された胸郭アプリケーションにおいて、ジェネリック・アトラス上で計画されたスキャン軌道を、他の患者のために現在の設定に自律的かつ正確に転送することは依然として困難である。この課題に対処するため,皮膚表面ではなく皮下骨表面の特徴を用いて患者固有の特性を適応する骨格グラフに基づく非剛性登録法を提案した。この目的のために、それぞれ入力点雲を統一し、キーポイントを抽出するために、自己組織化マッピングを2回連続して使用する。その後、最小のスパンニングツリーを使用して、抽出されたすべてのキーポイントを接続するツリーグラフを生成する。ソースおよびターゲットポイントクラウドに適合するリブ軟骨輪郭を適切に特徴付けるため、ツリーグラフから抽出されたパスは、リブ全体にわたって連続性を最大に維持することにより最適化される。提案手法を検証するために,1人のボランティアと7つのCT軟骨点群から,異なる患者からUS軟骨点群を手動で抽出した。以上の結果より,ICP (distance error mean/SD: 5.0/1.9 mm vs 8.6/6.7 mm on 7 CTs) よりも患者間変動に適応する上で,グラフベース登録の方が有効で堅牢であることが示唆された。 Autonomous ultrasound (US) scanning has attracted increased attention, and it has been seen as a potential solution to overcome the limitations of conventional US examinations, such as inter-operator variations. However, it is still challenging to autonomously and accurately transfer a planned scan trajectory on a generic atlas to the current setup for different patients, particularly for thorax applications with limited acoustic windows. To address this challenge, we proposed a skeleton graph-based non-rigid registration to adapt patient-specific properties using subcutaneous bone surface features rather than the skin surface. To this end, the self-organization mapping is successively used twice to unify the input point cloud and extract the key points, respectively. Afterward, the minimal spanning tree is employed to generate a tree graph to connect all extracted key points. To appropriately characterize the rib cartilage outline to match the source and target point cloud, the path extracted from the tree graph is optimized by maximally maintaining continuity throughout each rib. 
To validate the proposed approach, we manually extract the US cartilage point cloud from one volunteer and seven CT cartilage point clouds from different patients. The results demonstrate that the proposed graph-based registration is more effective and robust in adapting to the inter-patient variations than the ICP (distance error mean/SD: 5.0/1.9 mm vs 8.6/6.7 mm on seven CTs).	翻訳日:2023-05-16 16:46:19 公開日:2023-05-14
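The registration abstract above connects extracted key points with a minimal spanning tree. As a rough illustration of that one step, here is a sketch of Prim's algorithm over the complete Euclidean graph of a few 3D points. The function name and the sample coordinates are hypothetical, and the paper's self-organizing-map key point extraction and path optimization are not reproduced.

```python
import math


def minimum_spanning_tree(points):
    """Prim's algorithm on the complete Euclidean graph over `points`.

    Returns a list of (parent_index, child_index) edges; a connected set
    of n points yields n - 1 edges.
    """
    n = len(points)
    if n == 0:
        return []
    in_tree = [False] * n
    best_dist = [math.inf] * n   # cheapest known link into the tree
    best_parent = [-1] * n       # tree vertex providing that link
    best_dist[0] = 0.0           # grow the tree from point 0
    edges = []
    for _ in range(n):
        # Pick the not-yet-connected point that is closest to the tree.
        u = min((i for i in range(n) if not in_tree[i]),
                key=lambda i: best_dist[i])
        in_tree[u] = True
        if best_parent[u] >= 0:
            edges.append((best_parent[u], u))
        # Relax the distances of the remaining points through u.
        for v in range(n):
            if not in_tree[v]:
                d = math.dist(points[u], points[v])
                if d < best_dist[v]:
                    best_dist[v] = d
                    best_parent[v] = u
    return edges


# Hypothetical key points along one rib-like curve (millimetres).
key_points = [(0, 0, 0), (1, 0, 0), (2, 0, 0), (10, 0, 0)]
print(minimum_spanning_tree(key_points))  # [(0, 1), (1, 2), (2, 3)]
```

This O(n^2) form is adequate for a few hundred key points; a heap-based variant would be the usual choice for larger point sets.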
# deepfilternet: 知覚的動機付けによるリアルタイム音声強調 DeepFilterNet: Perceptually Motivated Real-Time Speech Enhancement ( http://arxiv.org/abs/2305.08227v1 ) ライセンス: Link先を確認	Hendrik Schr\"oter, Tobias Rosenkranz, Alberto N. Escalante-B., Andreas Maier	(参考訳) 単一チャンネル音声強調のためのマルチフレームアルゴリズムは、音声信号内の短時間相関を活用できる。周波数領域における複素フィルタを直接推定し,それらの相関性を利用するためにDF法を提案した。本稿では,DeepFilterNetを用いたリアルタイム音声強調デモを示す。 DeepFilterNetの効率性は、音声生成と心理音響知覚のドメイン知識を活用することで実現される。本モデルは,シングルスレッドノートブック cpu 上で 0.19 のリアルタイム係数を実現しつつ,最先端の音声強調ベンチマークと一致させることができる。フレームワークと事前トレーニングされた重み付けは、オープンソースライセンスで公開されている。 Multi-frame algorithms for single-channel speech enhancement are able to take advantage from short-time correlations within the speech signal. Deep Filtering (DF) was proposed to directly estimate a complex filter in frequency domain to take advantage of these correlations. In this work, we present a real-time speech enhancement demo using DeepFilterNet. DeepFilterNet's efficiency is enabled by exploiting domain knowledge of speech production and psychoacoustic perception. Our model is able to match state-of-the-art speech enhancement benchmarks while achieving a real-time-factor of 0.19 on a single threaded notebook CPU. The framework as well as pretrained weights have been published under an open source license.	翻訳日:2023-05-16 16:45:53 公開日:2023-05-14
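The DeepFilterNet abstract above reports a real-time factor (RTF) of 0.19. Under the usual definition, RTF is processing time divided by the duration of the audio processed, so any value below 1 means the model keeps up with the input. The sketch below only illustrates that arithmetic; the timings and the 10 ms hop size are made-up assumptions, not figures from the paper.

```python
def real_time_factor(processing_seconds, audio_seconds):
    """RTF = time spent processing / duration of the audio processed."""
    if audio_seconds <= 0:
        raise ValueError("audio_seconds must be positive")
    return processing_seconds / audio_seconds


def per_frame_budget_ms(rtf, hop_ms):
    """Average compute time per frame implied by an RTF at a given hop size."""
    return rtf * hop_ms


# Hypothetical timing: 1.9 s of compute for 10 s of audio.
rtf = real_time_factor(1.9, 10.0)
print(f"RTF = {rtf:.2f}")                                         # RTF = 0.19
print(f"{per_frame_budget_ms(rtf, 10.0):.1f} ms per 10 ms hop")   # 1.9 ms per 10 ms hop
print("real-time capable:", rtf < 1.0)                            # real-time capable: True
```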
# ファジィ生成ランタイムプロファイリングによるNLPベースのクロスレイヤ5G脆弱性検出 NLP-based Cross-Layer 5G Vulnerabilities Detection via Fuzzing Generated Run-Time Profiling ( http://arxiv.org/abs/2305.08226v1 ) ライセンス: Link先を確認	Zhuzhu Wang and Ying Wang	(参考訳) 5Gソフトウェアスタックの脆弱性と意図しない動作検出の有効性と効率性は、5G保証、特に重要なインフラにおけるその応用に不可欠である。スケーラビリティと自動化は、テストアプローチとサイバーセキュリティ研究の主要な課題である。本稿では,コードリポジトリのファズテストに対応する実行時プロファイリング文書を用いて,脆弱性,意図しない緊急動作,および5Gスタックの性能劣化を自動的に検出する革新的な手法を提案する。 srsRANをパイロットとして,ファジィテストによって生成されたログ情報を用いてリアルタイムのプロファイリングを高次元距離空間にマップし,そのタイムスタンプ情報に基づいて特徴空間を構築する。最後に,ロジスティック回帰,k-nearest近傍,ランダムフォレストなど,機械学習に基づく分類アルゴリズムを活用して,パフォーマンスとセキュリティ属性への影響を分類する。提案手法は、ファジングの影響の検出において93.4%から95.9%の高い精度を達成する。さらに、概念実証は5Gインフラストラクチャのリアルタイム脆弱性と、さまざまな分野のクリティカルアプリケーションを特定し、優先順位付けする可能性がある。 The effectiveness and efficiency of 5G software stack vulnerability and unintended behavior detection are essential for 5G assurance, especially for its applications in critical infrastructures. Scalability and automation are the main challenges in testing approaches and cybersecurity research. In this paper, we propose an innovative approach for automatically detecting vulnerabilities, unintended emergent behaviors, and performance degradation in 5G stacks via run-time profiling documents corresponding to fuzz testing in code repositories. Piloting on srsRAN, we map the run-time profiling via Logging Information (LogInfo) generated by fuzzing test to a high dimensional metric space first and then construct feature spaces based on their timestamp information. Lastly, we further leverage machine learning-based classification algorithms, including Logistic Regression, K-Nearest Neighbors, and Random Forest to categorize the impacts on performance and security attributes. The proposed approach achieves high accuracy, ranging from 93.4% to 95.9%, in detecting the fuzzing impacts. 
In addition, the proof of concept could identify and prioritize real-time vulnerabilities on 5G infrastructures and critical applications in various verticals.	翻訳日:2023-05-16 16:45:43 公開日:2023-05-14
# FactKB: ファクト知識で強化された言語モデルを用いた一般化可能なファクチュアリティ評価 FactKB: Generalizable Factuality Evaluation using Language Models Enhanced with Factual Knowledge ( http://arxiv.org/abs/2305.08281v1 ) ライセンス: Link先を確認	Shangbin Feng, Vidhisha Balachandran, Yuyang Bai, Yulia Tsvetkov	(参考訳) 自動生成された要約の事実整合性を評価することは、信頼できる要約システムの進展と導入に不可欠である。近年の進歩にもかかわらず、既存の事実性評価モデルは頑健ではなく、特に新しいドメインにおけるエンティティと関係エラーの傾向が強い。我々はfactkbを提案する。factuality evaluationに対する単純な新しいアプローチで、特にエンティティやリレーションに関して、ドメイン間で一般化できる。 FactKBは、外部知識ベースから抽出された事実を用いて事前訓練された言語モデルに基づいている。本稿では,直接実体事実に基づく相補的事実学習目標,実体に関する補助知識に基づく事実,知識ベースウォークによる構成的事実の3種類の相補的事実学習目標について紹介する。結果の事実性評価モデルは、2つのドメイン内ニュース要約ベンチマークと3つのドメイン外科学文献データセットに対して、最先端のパフォーマンスを達成する。 FactKBのさらなる分析は、要約における誤った実体や関係を検出する能力が改善され、ドメイン間で堅牢で一般化可能であることを示している。 Evaluating the factual consistency of automatically generated summaries is essential for the progress and adoption of reliable summarization systems. Despite recent advances, existing factuality evaluation models are not robust, being especially prone to entity and relation errors in new domains. We propose FactKB, a simple new approach to factuality evaluation that is generalizable across domains, in particular with respect to entities and relations. FactKB is based on language models pretrained using facts extracted from external knowledge bases. We introduce three types of complementary factuality pretraining objectives based on direct entity facts, facts grounded in auxiliary knowledge about entities, and facts constructed compositionally through knowledge base walks. The resulting factuality evaluation model achieves state-of-the-art performance on two in-domain news summarization benchmarks as well as on three out-of-domain scientific literature datasets. Further analysis of FactKB shows improved ability to detect erroneous entities and relations in summaries and is robust and generalizable across domains.	翻訳日:2023-05-16 16:39:04 公開日:2023-05-14
# Ship-D: 機械学習を用いた設計最適化のためのシップハルデータセット Ship-D: Ship Hull Dataset for Design Optimization using Machine Learning ( http://arxiv.org/abs/2305.08279v1 ) ライセンス: Link先を確認	Noah J. Bagazinski and Faez Ahmed	(参考訳) 機械学習は最近、複雑な製品の設計サイクル時間を短縮するために大きな進歩を遂げている。船体設計は現在、長いサイクルと小さなバッチ生産を含むが、これらの進歩の大きな恩恵を受ける可能性がある。様々な種類の船舶の設計から学習する船舶設計のための機械学習ツールを開発することで、船舶設計におけるトレードオフを特定し最適化することができる。しかし、現在公開されている船の設計データセットの欠如は、一般的な船の設計において機械学習を活用する可能性を制限している。このギャップに対処するために, パラメータ化, メッシュ, 点雲, 画像表現などの設計および機能性能情報と, 異なる動作条件下での32の流体抵抗測定値を含む, 3万個の船殻の大規模データセットを提案する。データセットは人間の入力を可能にするように構成されており、計算方法も設計されている。さらに,既存の船体を正確に再構成するパラメータ化機能を示すため,公開されているCADレポジトリから12種類の船体を紹介する。遺伝的アルゴリズムのケーススタディでは, 船体断面の形状と平行中間体の長さを保ちながら, 船体の総抗力を60パーセント削減するために, 32の波動抵抗係数を予測するために代理モデルが開発された。我々の研究は、他の研究者がデータ駆動船の設計を進めるために使用する包括的なデータセットとアプリケーションの例を提供します。 Machine learning has recently made significant strides in reducing design cycle time for complex products. Ship design, which currently involves years-long cycles and small batch production, could greatly benefit from these advancements. By developing a machine learning tool for ship design that learns from the design of many different types of ships, tradeoffs in ship design could be identified and optimized. However, the lack of publicly available ship design datasets currently limits the potential for leveraging machine learning in generalized ship design. To address this gap, this paper presents a large dataset of thirty thousand ship hulls, each with design and functional performance information, including parameterization, mesh, point cloud, and image representations, as well as thirty-two hydrodynamic drag measures under different operating conditions. The dataset is structured to allow human input and is also designed for computational methods. Additionally, the paper introduces a set of twelve ship hulls from publicly available CAD repositories to showcase the proposed parameterization's ability to accurately reconstruct existing hulls. 
A surrogate model was developed to predict the thirty-two wave drag coefficients, which was then implemented in a genetic algorithm case study to reduce the total drag of a hull by sixty percent while maintaining the shape of the hull's cross section and the length of the parallel midbody. Our work provides a comprehensive dataset and application examples for other researchers to use in advancing data-driven ship design.	翻訳日:2023-05-16 16:38:48 公開日:2023-05-14
# 勾配降下の局所収束-生成逆ネットワークの訓練 Local Convergence of Gradient Descent-Ascent for Training Generative Adversarial Networks ( http://arxiv.org/abs/2305.08277v1 ) ライセンス: Link先を確認	Evan Becker, Parthe Pandit, Sundeep Rangan, Alyson K. Fletcher	(参考訳) generative adversarial networks (gans) は複雑な高次元データの生成モデルを訓練するための一般的な定式化である。 GANをトレーニングする標準的な方法は、極小最適化問題に対する勾配降下度(GDA)手順を含む。この手順は、力学の非線形性のため、一般には解析が難しい。カーネルベースの判別器を用いてGANを訓練するためのGDAの局所力学について検討する。この収束解析は、[becker et al. 2022] から仮定された \textit{isolated points} モデルの下で gda 反復を記述する非線形力学系の線形化に基づいている。本研究では,カーネル識別器の学習率,正規化,帯域幅がgdaの局所収束率に及ぼす影響について検討した。重要なことは、システムがいつ収束するか、振動するか、分岐するかを示す相転移を示す。また,クレームを検証する数値シミュレーションも提供する。 Generative Adversarial Networks (GANs) are a popular formulation to train generative models for complex high dimensional data. The standard method for training GANs involves a gradient descent-ascent (GDA) procedure on a minimax optimization problem. This procedure is hard to analyze in general due to the nonlinear nature of the dynamics. We study the local dynamics of GDA for training a GAN with a kernel-based discriminator. This convergence analysis is based on a linearization of a non-linear dynamical system that describes the GDA iterations, under an \textit{isolated points model} assumption from [Becker et al. 2022]. Our analysis brings out the effect of the learning rates, regularization, and the bandwidth of the kernel discriminator, on the local convergence rate of GDA. Importantly, we show phase transitions that indicate when the system converges, oscillates, or diverges. We also provide numerical simulations that verify our claims.	翻訳日:2023-05-16 16:38:26 公開日:2023-05-14
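The GDA abstract above mentions regimes in which training converges, oscillates, or diverges depending on learning rate and regularization. A toy scalar game can make that behaviour concrete. The sketch below runs simultaneous gradient descent-ascent on f(x, y) = x*y - (lam/2)*y^2, which is an assumed stand-in chosen for illustration and not the kernel-discriminator GAN model analyzed in the paper. For this toy game the linear update has determinant 1 - eta*lam + eta^2, so the iterates contract toward (0, 0) when eta < lam and spiral outward when lam = 0.

```python
import math


def simultaneous_gda(eta, lam, steps, x=1.0, y=1.0):
    """Simultaneous GDA on f(x, y) = x*y - (lam / 2) * y**2.

    x takes a descent step and y an ascent step from the same iterate.
    Returns the final distance to the equilibrium at (0, 0).
    """
    for _ in range(steps):
        grad_x = y              # df/dx
        grad_y = x - lam * y    # df/dy
        x, y = x - eta * grad_x, y + eta * grad_y
    return math.hypot(x, y)


start = math.hypot(1.0, 1.0)
unregularized = simultaneous_gda(eta=0.1, lam=0.0, steps=200)
regularized = simultaneous_gda(eta=0.1, lam=0.5, steps=200)
print("diverges without regularization:", unregularized > start)  # True
print("converges with regularization:", regularized < start)      # True
```

Setting eta equal to lam puts the toy system on the boundary between the two regimes, where the iterates keep circling the equilibrium without settling.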
# ULIP-2:3D理解のためのスケーラブルなマルチモーダル事前学習を目指して ULIP-2: Towards Scalable Multimodal Pre-training For 3D Understanding ( http://arxiv.org/abs/2305.08275v1 ) ライセンス: Link先を確認	Le Xue, Ning Yu, Shu Zhang, Junnan Li, Roberto Mart\'in-Mart\'in, Jiajun Wu, Caiming Xiong, Ran Xu, Juan Carlos Niebles, Silvio Savarese	(参考訳) マルチモーダル事前学習法の最近の進歩は、3次元モダリティ、それらの2次元モダリティ、対応する言語モダリティを合わせた3次元表現学習において有望な効果を示している。しかし、3Dアプリケーションのためのマルチモーダルデータを収集するために既存のマルチモーダル事前学習フレームワークが使用している手法はスケーラビリティと包括性に欠けており、多モーダル学習の可能性を最大限に制限する可能性がある。主なボトルネックは、言語モダリティのスケーラビリティと包括性にある。このボトルネックに対処するため,我々は,最先端のマルチモーダル大規模言語モデル (LLM) を利用したマルチモーダル事前学習フレームワークULIP-2を導入する。我々は,ObjaverseとShapeNet55という2つの大規模データセットの実験を行い,生成した3次元三重項データセット(3D Point Cloud - Image - Language)をリリースする。 ULIP-2は、ModelNet40 (74% Top1 Accuracy) で、下流のゼロショット分類の大幅な改善を実現している。さらに、ULIP-2 は実世界の ScanObjectNN ベンチマーク (91.5% の総合精度) で新しい記録を樹立し、140万のパラメータ(現在の SOTA より10倍少ない)しか利用せず、人間のアノテーションなしでスケーラブルなマルチモーダル3D 表現学習のブレークスルーを示している。コードとデータセットはhttps://github.com/salesforce/ulipで入手できる。 Recent advancements in multimodal pre-training methods have shown promising efficacy in 3D representation learning by aligning features across 3D modality, their 2D counterpart modality, and corresponding language modality. However, the methods used by existing multimodal pre-training frameworks to gather multimodal data for 3D applications lack scalability and comprehensiveness, potentially constraining the full potential of multimodal learning. The main bottleneck lies in the language modality's scalability and comprehensiveness. To address this bottleneck, we introduce ULIP-2, a multimodal pre-training framework that leverages state-of-the-art multimodal large language models (LLMs) pre-trained on extensive knowledge to automatically generate holistic language counterparts for 3D objects. 
We conduct experiments on two large-scale datasets, Objaverse and ShapeNet55, and release our generated three-modality triplet datasets (3D Point Cloud - Image - Language), named "ULIP-Objaverse Triplets" and "ULIP-ShapeNet Triplets". ULIP-2 requires only 3D data itself and eliminates the need for any manual annotation effort, demonstrating its scalability; and ULIP-2 achieves remarkable improvements on downstream zero-shot classification on ModelNet40 (74% Top1 Accuracy). Moreover, ULIP-2 sets a new record on the real-world ScanObjectNN benchmark (91.5% Overall Accuracy) while utilizing only 1.4 million parameters(~10x fewer than current SOTA), signifying a breakthrough in scalable multimodal 3D representation learning without human annotations. The code and datasets are available at https://github.com/salesforce/ULIP.	翻訳日:2023-05-16 16:38:14 公開日:2023-05-14
# 大規模動的グラフのための分離グラフニューラルネットワーク Decoupled Graph Neural Networks for Large Dynamic Graphs ( http://arxiv.org/abs/2305.08273v1 ) ライセンス: Link先を確認	Yanping Zheng, Zhewei Wei, Jiajun Liu	(参考訳) ソーシャルネットワーク、金融取引、レコメンデーションシステムといった現実世界のグラフは、しばしば動的な振る舞いを示す。この現象はグラフストリームと呼ばれ、ノードの動的変化とエッジの出現と消失を含む。これらの動的グラフの構造的側面と時間的側面の両方を効果的に捉えるために、動的グラフニューラルネットワークが開発された。しかし、既存の手法は通常、連続時間または離散時間動的グラフの処理に適しており、一方から他方へ一般化することはできない。本稿では,連続と離散の両方の動的グラフの効率的な計算を支援する統一動的伝播を含む,大規模動的グラフのための分離グラフニューラルネットワークを提案する。グラフ構造関連計算は伝播過程においてのみ実行されるため、下流タスクの予測プロセスは高価なグラフ計算なしで個別に訓練できるため、任意のシーケンスモデルをプラグインして使用することができる。その結果,本アルゴリズムは拡張性と表現力に優れる。本アルゴリズムは連続時間と離散時間の両方の動的グラフの7つの実世界のデータセットで評価する。実験の結果,両種類の動的グラフにおいて最先端の性能が得られることがわかった。特に、我々のアルゴリズムのスケーラビリティは、最大10億の時間エッジと1億以上のノードを持つ巨大なグラフへの成功例によってよく示されています。 Real-world graphs, such as social networks, financial transactions, and recommendation systems, often demonstrate dynamic behavior. This phenomenon, known as graph stream, involves the dynamic changes of nodes and the emergence and disappearance of edges. To effectively capture both the structural and temporal aspects of these dynamic graphs, dynamic graph neural networks have been developed. However, existing methods are usually tailored to process either continuous-time or discrete-time dynamic graphs, and cannot be generalized from one to the other. In this paper, we propose a decoupled graph neural network for large dynamic graphs, including a unified dynamic propagation that supports efficient computation for both continuous and discrete dynamic graphs. Since graph structure-related computations are only performed during the propagation process, the prediction process for the downstream task can be trained separately without expensive graph computations, and therefore any sequence model can be plugged-in and used. As a result, our algorithm achieves exceptional scalability and expressiveness. We evaluate our algorithm on seven real-world datasets of both continuous-time and discrete-time dynamic graphs. 
The experimental results demonstrate that our algorithm achieves state-of-the-art performance in both kinds of dynamic graphs. Most notably, the scalability of our algorithm is well illustrated by its successful application to large graphs with up to over a billion temporal edges and over a hundred million nodes.	翻訳日:2023-05-16 16:37:40 公開日:2023-05-14
# $SmartProbe$: 市場調査のための仮想モデレーター $SmartProbe$: A Virtual Moderator for Market Research Surveys ( http://arxiv.org/abs/2305.08271v1 ) ライセンス: Link先を確認	Josh Seltzer, Jiahua (Fiona) Pan, Kathy Cheng, Yuxiao Sun, Santosh Kolagati, Jimmy Lin, Shi Zong	(参考訳) 市場調査は、消費者の視点を大規模に理解するための強力な方法論であるが、理解と洞察の深みによって制限されている。仮想モデレーターは、調査の質的研究の要素を導入し、調査参加者とのラプポートを開発し、探索的な質問を動的に行い、最終的には市場研究者により有用な情報を提供する。本研究では,大規模言語モデル(llm)の適応能力を活用したapiである${\tt smartprobe}$を導入し,市場調査における効果的な調査質問を生成するために,市場調査からドメイン知識を取り入れる。我々は,$\tt smartprobe$のモジュール処理フローを概説し,生成した調査質問の品質と有効性を評価する。当社の取り組みは、業界関係者にLLMの最新の進歩に基づいて、現実世界のアプリケーションを構築するよう促すだろうと考えています。私たちのデモはhttps://nexxt.in/smartprobe-demoで公開しています。 Market research surveys are a powerful methodology for understanding consumer perspectives at scale, but are limited by depth of understanding and insights. A virtual moderator can introduce elements of qualitative research into surveys, developing a rapport with survey participants and dynamically asking probing questions, ultimately to elicit more useful information for market researchers. In this work, we introduce ${\tt SmartProbe}$, an API which leverages the adaptive capabilities of large language models (LLMs), and incorporates domain knowledge from market research, in order to generate effective probing questions in any market research survey. We outline the modular processing flow of $\tt SmartProbe$, and evaluate the quality and effectiveness of its generated probing questions. We believe our efforts will inspire industry practitioners to build real-world applications based on the latest advances in LLMs. Our demo is publicly available at https://nexxt.in/smartprobe-demo	翻訳日:2023-05-16 16:37:18 公開日:2023-05-14
# Kochen-Specker の文脈性 Kochen-Specker Contextuality ( http://arxiv.org/abs/2305.08267v1 ) ライセンス: Link先を確認	Mladen Pavicic and Mordecai Waegell	(参考訳) 最近開発された、小さなベクトル成分から量子文脈集合を生成する手法は、理論的には任意の次元に普遍的に適用できる。しかし、8を超える次元でそのような任意に網羅的な集合を得るタスクは、スーパーコンピュータでも計算上の障壁に直面する。そこで本研究では、KS集合の最小複雑性が次元に応じて増大しないという事実を利用して、低次元の既知の集合から高次元の比較的小さなKS集合を構成する次元アップスケーリング手法を用いた。これにより、現在利用可能な計算資源を用いて、最大16次元の空間で単純なベクトル成分から多数の集合を生成できた。 A recently developed method of generating quantum contextual sets from small vector components is universally and theoretically applicable to any dimension. However, tasks of obtaining such arbitrarily exhaustive sets in dimensions higher than eight face a computational barrier even on supercomputers. Therefore, for this paper, we employed a dimensional upscaling method that exploits the fact that the minimal complexity of KS sets does not scale with dimension to construct relatively small KS sets in higher dimensions from known sets in lower dimensions. This enabled us to generate numerous sets from simple vector components in up to 16-dimensional spaces using presently available computational resources.	翻訳日:2023-05-16 16:37:03 公開日:2023-05-14
# 残差計算のない車両検出と分類:ランダム摂動注入によるHEVC画像デコーディングの高速化 Vehicle Detection and Classification without Residual Calculation: Accelerating HEVC Image Decoding with Random Perturbation Injection ( http://arxiv.org/abs/2305.08265v1 ) ライセンス: Link先を確認	Muhammet Sebul Beratoğlu and Behçet Uğur Töreyin	(参考訳) ビデオ分析,特に交通監視の分野では,映像データの処理と理解のための効率的かつ効果的な手法の必要性が高まっている。従来のフルビデオデコーディング技術は計算集約的で時間を要するため、研究者は圧縮された領域における代替アプローチを探求する。本研究では,高効率ビデオ符号化(HEVC)ビットストリームから画像を再構成する,ランダム摂動に基づく圧縮領域法を提案する。本手法は,映像理解タスクに関連する情報を保持しつつ,特に車両の検知・分類を重要なユースケースとして重視しながら,元の画像の凝縮表現を作成し,残差に対するランダムな摂動の置換を提案する最初の方法である。残差データを使用しないことにより,提案手法は画像再構成プロセスに必要なデータを大幅に削減し,より効率的な情報保存と送信を可能にする。これは、監視アプリケーションに関わる膨大なビデオデータを考える際に特に重要である。提案手法は,公開されているBIT-Vehicleデータセットに適用することで,従来のフルデコード法に比べて復元速度が著しく向上し,画素領域法よりも約56%高速であることを示す。さらに,検出精度は画素領域法と同等の99.9%,分類精度は96.84%であり,画素領域法よりわずか0.98%低い。さらに,データサイズが大幅に削減され,ストレージや送信の効率が向上することを示す。本研究は、速度とデータサイズが重要な要因である交通監視アプリケーションにおいて、圧縮領域手法の可能性を立証する。 In the field of video analytics, particularly traffic surveillance, there is a growing need for efficient and effective methods for processing and understanding video data. Traditional full video decoding techniques can be computationally intensive and time-consuming, leading researchers to explore alternative approaches in the compressed domain. This study introduces a novel random perturbation-based compressed domain method for reconstructing images from High Efficiency Video Coding (HEVC) bitstreams, specifically designed for traffic surveillance applications. To the best of our knowledge, our method is the first to propose substituting random perturbations for residual values, creating a condensed representation of the original image while retaining information relevant to video understanding tasks, particularly focusing on vehicle detection and classification as key use cases. 
By not using residual data, our proposed method significantly reduces the data needed in the image reconstruction process, allowing for more efficient storage and transmission of information. This is particularly important when considering the vast amount of video data involved in surveillance applications. Applied to the public BIT-Vehicle dataset, we demonstrate a significant increase in the reconstruction speed compared to the traditional full decoding approach, with our proposed method being approximately 56% faster than the pixel domain method. Additionally, we achieve a detection accuracy of 99.9%, on par with the pixel domain method, and a classification accuracy of 96.84%, only 0.98% lower than the pixel domain method. Furthermore, we showcase the significant reduction in data size, leading to more efficient storage and transmission. Our research establishes the potential of compressed domain methods in traffic surveillance applications, where speed and data size are critical factors.	翻訳日:2023-05-16 16:36:53 公開日:2023-05-14
# 実践的ロバスト強化学習について:実用的不確実性セットとダブルエージェントアルゴリズム On Practical Robust Reinforcement Learning: Practical Uncertainty Set and Double-Agent Algorithm ( http://arxiv.org/abs/2305.06657v2 ) ライセンス: Link先を確認	Ukjo Hwang, Songnam Hong	(参考訳) モデル不確実性を伴う頑健な強化学習(RL)について検討する。トレーニングのためのサンプルを生成する名目上のマルコフ決定プロセス(N-MDP)が与えられた場合、トレーニング(N-MDP)とテスト環境の間の潜在的なミスマッチを反映するために、N-MDPから摂動されたMDPを含む不確実性セットが定義される。堅牢なRLの目的は、不確実性セットに対する最悪のパフォーマンスを最適化する堅牢なポリシーを学ぶことである。本稿では,既存のものよりも現実的なMDPを含む新しい不確実性セットを提案する。この不確実性集合に対して,表ケースに対する頑健なrlアルゴリズム(arq-learning)を示し,その有限時間誤差境界を特徴付ける。また、ARQ-LearningはQ-Learningや最先端の堅牢なQ-Learningと同等の速度で収束し、実世界のアプリケーションにより良いロバスト性を確保することが証明された。次に,大規模あるいは連続的な状態空間を持つ場合において,ARQ学習の拡張の鍵となるボトルネックを効果的に解決する「悲観的」エージェントを提案する。 Q-Learning, Deep-Q Network (DQN), Deep Deterministic Policy gradient (DDPG) などの有名なRLアルゴリズムに悲観的エージェントのアイデアを取り入れ, PRQ-Learning, PR-DQN, PR-DDPGを提案する。特に、提案されたアイデアは、他のモデルなしRLアルゴリズム(ソフトアクター批評家など)に即座に適用することができる。実験により、モデル不確実性のあるRLアプリケーションにおけるアルゴリズムの優位性を示す。 We study a robust reinforcement learning (RL) with model uncertainty. Given nominal Markov decision process (N-MDP) that generate samples for training, an uncertainty set is defined, which contains some perturbed MDPs from N-MDP for the purpose of reflecting potential mismatched between training (i.e., N-MDP) and testing environments. The objective of robust RL is to learn a robust policy that optimizes the worst-case performance over an uncertainty set. In this paper, we propose a new uncertainty set containing more realistic MDPs than the existing ones. For this uncertainty set, we present a robust RL algorithm (named ARQ-Learning) for tabular case and characterize its finite-time error bound. Also, it is proved that ARQ-Learning converges as fast as Q-Learning and the state-of-the-art robust Q-Learning while ensuring better robustness to real-world applications. 
Next, we propose a pessimistic agent that efficiently tackles the key bottleneck in extending ARQ-Learning to larger or continuous state spaces. Incorporating the idea of pessimistic agents into well-known RL algorithms such as Q-Learning, deep Q-network (DQN), and deep deterministic policy gradient (DDPG), we present PRQ-Learning, PR-DQN, and PR-DDPG, respectively. Notably, the proposed idea can be immediately applied to other model-free RL algorithms (e.g., soft actor-critic). Via experiments, we demonstrate the superiority of our algorithms on various RL applications with model uncertainty.	翻訳日:2023-05-16 11:18:19 公開日:2023-05-14
# 大規模言語モデルにおけるオープンドメイン質問応答の評価 Evaluating Open-Domain Question Answering in the Era of Large Language Models ( http://arxiv.org/abs/2305.06984v2 ) ライセンス: Link先を確認	Ehsan Kamalloo, Nouha Dziri, Charles L. A. Clarke, Davood Rafiei	(参考訳) 語彙マッチングは、オープンドメイン質問応答(QA)のデファクト評価方法として残っている。残念なことに、論理的マッチングは、金の答えリストにプラウチブル候補の答えが現れない場合に完全に失敗し、抽出モデルから生成モデルへ移行するにつれて、ますますその傾向が増す。近年の大規模言語モデル (LLMs) の成功により、候補解が長くなると語彙的マッチングの失敗が増加し、ゴールド解とのマッチングはさらに困難になる。正確な評価がなければ、オープンドメインQAの真の進歩は分かっていない。本稿では,一般的なベンチマークであるNQ-openのサブセットを手動で評価することにより,LLMを含む様々なオープンドメインQAモデルの徹底的な分析を行う。私たちの評価では、すべてのモデルの真のパフォーマンスは著しく過小評価されているものの、instructgpt (zero-shot) llmのパフォーマンスは60%近く向上し、既存のトップモデルと同等になり、instructgpt (few-shot) モデルはnq-openの新たな最先端を実際に達成しています。また、語彙マッチング失敗の50%以上が意味論的に等価な答えによるものであることが判明した。さらに、不必要な厳密さに悩まされているにもかかわらず、人間の判断と整合したランクQAモデルを示す。最後に, 自動評価モデルは, LLM が生成する長文解に対してではなく, 語彙マッチングのための合理的なサロゲートであることを示す。自動モデルはLLM回答の幻覚を検出するのに苦労し、LLMを評価することができない。現段階では、人間の評価に代わるものはないようである。 Lexical matching remains the de facto evaluation method for open-domain question answering (QA). Unfortunately, lexical matching fails completely when a plausible candidate answer does not appear in the list of gold answers, which is increasingly the case as we shift from extractive to generative models. The recent success of large language models (LLMs) for QA aggravates lexical matching failures since candidate answers become longer, thereby making matching with the gold answers even more challenging. Without accurate evaluation, the true progress in open-domain QA remains unknown. In this paper, we conduct a thorough analysis of various open-domain QA models, including LLMs, by manually evaluating their answers on a subset of NQ-open, a popular benchmark. 
Our assessments reveal that while the true performance of all models is significantly underestimated, the performance of the InstructGPT (zero-shot) LLM increases by nearly +60%, making it on par with existing top models, and the InstructGPT (few-shot) model actually achieves a new state-of-the-art on NQ-open. We also find that more than 50% of lexical matching failures are attributed to semantically equivalent answers. We further demonstrate that regex matching ranks QA models consistent with human judgments, although still suffering from unnecessary strictness. Finally, we demonstrate that automated evaluation models are a reasonable surrogate for lexical matching in some circumstances, but not for long-form answers generated by LLMs. The automated models struggle in detecting hallucinations in LLM answers and are thus unable to evaluate LLMs. At this time, there appears to be no substitute for human evaluation.	翻訳日:2023-05-16 11:06:27 公開日:2023-05-14
# cockatiel: nlpタスクにおけるニューラルネット分類器の説明のための解釈可能な要素による帰属分類の連続概念 COCKATIEL: COntinuous Concept ranKed ATtribution with Interpretable ELements for explaining neural net classifiers on NLP tasks ( http://arxiv.org/abs/2305.06754v2 ) ライセンス: Link先を確認	Fanny Jourdan, Agustin Picard, Thomas Fel, Laurent Risser, Jean Michel Loubes, Nicholas Asher	(参考訳) トランスフォーマーアーキテクチャは複雑で、NLPで使用されるが、多くの成功をおさめ、解釈可能性や説明性は困難である。近年の議論では、注意地図と属性法は信頼できない(Pruthi et al., 2019; Brunner et al., 2019)。本稿では,その制限のいくつかを紹介するとともに,そのいくつかをうまく解決したcockatielを紹介する。 cockatielは、nlp分類タスクでトレーニングされたニューラルネットモデルの最終層から、非負行列分解(non-negative matrix factorization:nmf)を使用して、モデルが予測に利用する概念を発見し、感度分析を利用してモデルに対する各概念の重要性を正確に推定することで、意味のある説明を生成する、新しい、概念ベース、モデル非依存のxaiテクニックである。基礎となるモデルの精度を損なうことなく、新しいモデルをトレーニングする必要もない。我々は,単一および多視点の感情分析タスクで実験を行い,コッカティエルが人間のトランスフォーマーモデルと協調する概念を何の監督もせずに発見する能力を示し,その説明の忠実性を忠実度メトリクスで客観的に検証し,2つの異なるデータセットで有意義な説明を提供する能力を示す。 Transformer architectures are complex and their use in NLP, while it has engendered many successes, makes their interpretability or explainability challenging. Recent debates have shown that attention maps and attribution methods are unreliable (Pruthi et al., 2019; Brunner et al., 2019). In this paper, we present some of their limitations and introduce COCKATIEL, which successfully addresses some of them. COCKATIEL is a novel, post-hoc, concept-based, model-agnostic XAI technique that generates meaningful explanations from the last layer of a neural net model trained on an NLP classification task by using Non-Negative Matrix Factorization (NMF) to discover the concepts the model leverages to make predictions and by exploiting a Sensitivity Analysis to estimate accurately the importance of each of these concepts for the model. It does so without compromising the accuracy of the underlying model or requiring a new one to be trained. 
We conduct experiments in single and multi-aspect sentiment analysis tasks and we show COCKATIEL's superior ability to discover concepts that align with humans' on Transformer models without any supervision, we objectively verify the faithfulness of its explanations through fidelity metrics, and we showcase its ability to provide meaningful explanations in two different datasets.	翻訳日:2023-05-16 11:04:30 公開日:2023-05-14


# rosematcher: アプリの更新に対するユーザーレビューの影響を特定する

RoseMatcher: Identifying the Impact of User Reviews on App Updates ( http://arxiv.org/abs/2210.10223v4 )

ライセンス: Link先を確認

Tianyang Liu, Chong Wang, Kun Huang, Peng Liang, Beiqi Zhang, Maya Daneva, Marten van Sinderen

(参考訳) $\textbf{Context}$: モバイルアプリのリリース計画は活発な研究領域となっており、ほとんどの研究はApple App Storeのリリースノートによるアプリ分析と、イシュートラッカによるユーザレビューの追跡に重点を置いている。しかし、これらのリリースノートとApp Storeのユーザレビューの相関は十分に研究されていない。 $\textbf{Objective}$: 本論文では、関連するユーザレビューとアプリのリリースノートをマッチングし、高い信頼性を持つマッチしたペアを識別するための新しい自動アプローチである $\textit{RoseMatcher}$ を紹介する。 $\textbf{Methods}$: Apple App Storeの5つのモバイルアプリから、944のリリースノートと1,046,862のユーザレビューを収集して、$\textit{RoseMatcher}$ の有効性と正確性を評価し、マッチしたペアに関する深いコンテンツ分析を行った。 $\textbf{Results}$: 評価の結果、$\textit{RoseMatcher}$ は関連するマッチングペアの識別において0.718のヒット率に達することが示された。また、984の関連ペアの手動ラベリングとコンテンツ分析により、関連するマッチングペアにおけるリリースノートとユーザレビューの関係に基づいて、ユーザレビューがアプリのアップデートで果たす8つの役割を識別した。 $\textbf{Conclusions}$: 私たちの調査結果は、アプリ開発チームとユーザの両方がリリースノートやユーザレビューに細心の注意を払っていることを示している。リリースノートは通常、機能要求、バグ報告、苦情に対応し、ユーザレビューは肯定的、否定的、建設的なフィードバックを提供する。全体として、本研究は、モバイルアプリのリリース計画におけるアプリ開発チームとユーザ間のコミュニケーションの重要性を強調している。関連するレビューはリリースノート公開の前後の短い期間に投稿される傾向があり、リリースノートの投稿時刻とユーザレビューの間の平均的な時間間隔は約1年である。

$\textbf{Context}$: The release planning of mobile apps has become an area of active research, with most studies centering on app analysis through release notes in the Apple App Store and tracking user reviews via issue trackers. However, the correlation between these release notes and user reviews in the App Store remains understudied. $\textbf{Objective}$: In this paper, we introduce $\textit{RoseMatcher}$, a novel automatic approach to match relevant user reviews with app release notes and identify matched pairs with high confidence. $\textbf{Methods}$: We collected 944 release notes and 1,046,862 user reviews from 5 mobile apps in the Apple App Store as research data to evaluate the effectiveness and accuracy of $\textit{RoseMatcher}$, and conducted deep content analysis on matched pairs. $\textbf{Results}$: Our evaluation shows that $\textit{RoseMatcher}$ can reach a hit ratio of 0.718 for identifying relevant matched pairs, and with the manual labeling and content analysis of 984 relevant pairs, we identify 8 roles that user reviews play in app updates according to the relationship between release notes and user reviews in the relevant matched pairs. $\textbf{Conclusions}$: Our findings indicate that both app development teams and users pay close attention to release notes and user reviews, with release notes typically addressing feature requests, bug reports, and complaints, and user reviews offering positive, negative, and constructive feedback. Overall, the study highlights the importance of communication between app development teams and users in the release planning of mobile apps. Relevant reviews tend to be posted within a short period before and after the release of release notes, and the average time interval between the post time of release notes and user reviews is approximately one year.

翻訳日:2023-10-24 14:15:54 公開日:2023-05-14

# CLawK: スマートコントラクトにおけるビジネスプロセス監視

CLawK: Monitoring Business Processes in Smart Contracts ( http://arxiv.org/abs/2305.08254v1 )

ライセンス: Link先を確認

Mojtaba Eshghie, Wolfgang Ahrendt, Cyrille Artho, Thomas Troels Hildebrandt, Gerardo Schneider

(参考訳) スマートコントラクトは、静的解析が難しい複雑なビジネスプロセスを具現化する。本稿では,DCRグラフで記述されたビジネスプロセス仕様を利用して,スマートコントラクト実行のランタイム検証を行うランタイム監視ツールCLawKを提案する。我々は、コードインスツルメンテーションや追加のガスコストなしで、Ethereumネットワークにデプロイされたスマートコントラクトの特定の振る舞いから、CLawKがどのように逸脱を検出し、フラグを立てるかを実証する。

Smart contracts embody complex business processes that can be difficult to analyze statically. In this paper, we present CLawK, a runtime monitoring tool that leverages business process specifications written in DCR graphs to provide runtime verification of smart contract execution. We demonstrate how CLawK can detect and flag deviations from specified behaviors in smart contracts deployed in the Ethereum network without code instrumentation and any additional gas costs.

翻訳日:2023-10-24 08:55:36 公開日:2023-05-14
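
The abstract above does not describe how CLawK is implemented. As a loose illustration of what checking an observed event trace against a process specification can look like, the sketch below monitors a list of event names against two kinds of DCR-style relations: a condition (event B should not occur before A) and a response (after A, B is expected eventually). All function and event names are hypothetical, and the other DCR relations (include, exclude, milestone) are omitted.

```python
def monitor(trace, conditions, responses):
    """Check a trace of event names against simplified DCR-style relations.

    conditions: set of (a, b) pairs, meaning b requires a to have happened.
    responses:  set of (a, b) pairs, meaning a obliges b to happen later.
    Returns (violations, pending): condition violations found in the trace,
    and the events that are still owed when the trace ends.
    """
    executed, pending, violations = set(), set(), []
    for index, event in enumerate(trace):
        # Condition check: every prerequisite of this event must be executed.
        missing = [a for (a, b) in conditions if b == event and a not in executed]
        if missing:
            violations.append((index, event, missing))
        executed.add(event)
        # The event discharges its own pending obligation, if any ...
        pending.discard(event)
        # ... and creates new obligations for its response targets.
        pending.update(b for (a, b) in responses if a == event)
    return violations, pending


# Hypothetical escrow-like contract events, for illustration only.
violations, pending = monitor(
    ["withdraw", "deposit", "requestRefund"],
    conditions={("deposit", "withdraw")},
    responses={("requestRefund", "refund")},
)
print(violations, pending)
```

In this toy trace the early `withdraw` is flagged because `deposit` had not yet occurred, and `refund` remains pending at the end of the trace.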

# 多視点対話型協調フィルタリング

Multi-View Interactive Collaborative Filtering ( http://arxiv.org/abs/2305.18306v1 )

ライセンス: Link先を確認

Maria Lentini, Umashanger Thayasivam

(参考訳) 多くのシナリオでは、クリックや評価などのレコメンダシステムのユーザインタラクションデータは疎であり、アイテムの入れ替わり率(新しい記事や求人情報など)が高い。これを踏まえると、ユーザ・アイテム評価に加えて文脈的な「サイド」情報を統合することが極めて望ましい。評価データと文脈データの両方を同時に処理できるアルゴリズムは存在するが、これらのアルゴリズムは通常、サンプル内のレコメンデーションのみに制限され、次元の呪いに悩まされ、長期累積報酬最適化のためのマルチアームバンディット(MAB)ポリシーが組み込まれていない。本稿では、多視点対話型トピック回帰(MV-ICTR)を提案する。これは、評価情報と文脈情報の両方を取り入れてアイテム固有の特徴依存性とユーザの個人的嗜好を同時にモデル化し、継続的なオンラインパーソナライゼーションのためのマルチアームバンディットポリシーを備えた、新しい部分オンライン潜在因子レコメンデーションアルゴリズムである。その結果、コールドスタートのユーザとアイテムの割合が高いデータセットで性能が大幅に向上した。

In many scenarios, recommender system user interaction data such as clicks or ratings is sparse, and item turnover rates (e.g., new articles, job postings) are high. Given this, the integration of contextual "side" information in addition to user-item ratings is highly desirable. Whilst there are algorithms that can handle both rating and contextual data simultaneously, these algorithms are typically limited to making only in-sample recommendations, suffer from the curse of dimensionality, and do not incorporate multi-armed bandit (MAB) policies for long-term cumulative reward optimization. We propose multi-view interactive topic regression (MV-ICTR), a novel partially online latent factor recommender algorithm that incorporates both rating and contextual information to model item-specific feature dependencies and users' personal preferences simultaneously, with multi-armed bandit policies for continued online personalization. The result is significantly increased performance on datasets with high percentages of cold-start users and items.

翻訳日:2023-06-04 11:40:06 公開日:2023-05-14

# 無線通信システムにおけるグラフニューラルネットワークによるユーザペアリング

Graph Neural Networks-Based User Pairing in Wireless Communication Systems ( http://arxiv.org/abs/2306.00717v1 )

ライセンス: Link先を確認

Sharan Mourya, Pavan Reddy, SaiDhiraj Amuru, Kiran Kumar Kuchi

(参考訳) 近年,NP困難な無線リソース割り当て問題をリアルタイムに解決するためのソリューションとして,ディープニューラルネットワークが登場している。しかし、画像処理タスクから継承された多層パーセプトロン(MLP)と畳み込みニューラルネットワーク(CNN)の構造は、無線ネットワーク問題に最適化されていない。ネットワークサイズが大きくなるにつれて、これらの手法は訓練や一般化が難しくなる。ユーザペアリングは、干渉を最小限に抑えスループットを最大化しながら、同時にスケジュールするユーザを選択する、無線通信システムにおける重要なNP困難最適化問題の一つである。本稿では,ユーザペアリング問題を効率的に解くために,教師なしグラフニューラルネットワーク(GNN)アプローチを提案する。提案手法は,「Erdos goes neural」パイプラインを用いて,k平均や半直交ユーザスケジューリング(SUS)などの他のスケジューリング手法を大幅に上回っている。提案手法は20dBのSNRにおいて,消費する時間と資源を最小限に抑えつつ,k平均よりも49%,SUSよりも実に95%高い総和レートを達成する。提案手法のスケーラビリティについても検討し,性能が大幅に低下することなく,ネットワークサイズの変化を動的に処理できることを示す。さらに,本モデルは,より大きなネットワークやより小さなネットワーク向けに明示的に訓練することなくこれを実現でき,CNNやMLPでは実現できない動的な機能を可能にする。

Recently, deep neural networks have emerged as a solution to solve NP-hard wireless resource allocation problems in real-time. However, multi-layer perceptron (MLP) and convolutional neural network (CNN) structures, which are inherited from image processing tasks, are not optimized for wireless network problems. As network size increases, these methods get harder to train and generalize. User pairing is one such essential NP-hard optimization problem in wireless communication systems that entails selecting users to be scheduled together while minimizing interference and maximizing throughput. In this paper, we propose an unsupervised graph neural network (GNN) approach to efficiently solve the user pairing problem. Our proposed method utilizes the Erdos goes neural pipeline to significantly outperform other scheduling methods such as k-means and semi-orthogonal user scheduling (SUS). At 20 dB SNR, our proposed approach achieves a 49% better sum rate than k-means and a staggering 95% better sum rate than SUS while consuming minimal time and resources. The scalability of the proposed method is also explored as our model can handle dynamic changes in network size without experiencing a substantial decrease in performance. Moreover, our model can accomplish this without being explicitly trained for larger or smaller networks facilitating a dynamic functionality that cannot be achieved using CNNs or MLPs.

翻訳日:2023-06-04 11:01:19 公開日:2023-05-14

# ハイパーオートメーション-IT産業における自動化の次の周辺

Hyper-automation-The next peripheral for automation in IT industries ( http://arxiv.org/abs/2305.11896v1 )

ライセンス: Link先を確認

Ayush Singh Rajput, Richa Gupta

(参考訳) 特定のプロセスの境界を超えたレガシーなビジネスプロセス自動化の拡張は、ハイパーオートメーション(hyperautomation)と呼ばれる。ハイパーオートメーションは、AIツールとRPAを組み合わせることで、ビジネスユーザが行うほぼすべての反復作業の自動化を提供する。企業の優秀な人材でも完遂できないかもしれない複雑なITビジネスプロセスを自動化する。これは、標準的なビジネスプロセス展開のエンドツーエンドの自動化である。ブレインコンピュータインターフェース(BCI)とAIおよびRPA自動化ツールを組み合わせることで、タスクのデジタル化を可能にする。BCIは自動化ツールと連携して、自動化プロセスの検出と生成を次のレベルに進める。企業はビジネスインテリジェンスシステムを統合し、複雑な要件に対処し、人間の専門知識と自動化エクスペリエンスを向上させることができる。本稿では、ハイパーオートメーションと今日の環境におけるその重要性について概説する。続いて、BCIとセンサーがハイパーオートメーションにどのように役立つかについて論じる。この概念に関連する様々な柔軟な技術と、図で示される専用のワークフロー技術を用いて、特定の勧誘のセクタを調査した。ハイパーオートメーションは、自動化タスクの効率、正確性、人間の能力強化を劇的に改善するために利用されている。発見、実装、自動化の各フェーズには、多数の自動化ツールが組み込まれている。その結果、最先端技術の統合と新しい作業方法の実験に適している。キーワード - ハイパーオートメーション、ブレインコンピュータインターフェース(BCI)、テクノロジー、ユースケース、センサー、産業。

The extension of legacy business process automation beyond the bounds of specific processes is known as hyperautomation. Hyperautomation provides automation for nearly any repetitive action performed by business users by combining AI tools with RPA. It automates complex IT business processes that a company's top brains might not be able to complete. This is an end-to-end automation of a standard business process deployment. It enables automation to perform task digitalization by combining a brain-computer interface (BCI) with AI and RPA automation tools. BCI, in conjunction with automation tools, will advance the detection and generation of automation processes to the next level. It allows enterprises to combine business intelligence systems, address complex requirements, and enhance human expertise and automation experience. Hyperautomation and its importance in today's environment are briefly discussed in this paper. The article then goes on to discuss how BCI and sensors might aid hyperautomation. The specific sectors of solicitations were examined using a variety of flexible technologies associated with this concept, as well as dedicated workflow techniques, which are also diagrammatically illustrated. Hyperautomation is being utilized to improve the efficiency, accuracy, and human enhancement of automated tasks dramatically. It incorporates a number of automated tools in its discovery, implementation, and automation phases. As a result, it is well-suited to integrating cutting-edge technologies and experimenting with new methods of working. Keywords: Hyperautomation, Brain-Computer Interface (BCI), Technology, Use case, Sensors, Industries.

翻訳日:2023-05-28 05:31:43 公開日:2023-05-14

# 森林火災防止の最適化:ドローン監視システムのためのインテリジェントスケジューリングアルゴリズム

Optimizing Forest Fire Prevention: Intelligent Scheduling Algorithms for Drone-Based Surveillance System ( http://arxiv.org/abs/2305.10444v1 )

ライセンス: Link先を確認

Mahdi Jemmali, Loai Kayed B.Melhim, Wadii Boulila, Hajer Amdouni, Mafawez T. Alharbi

(参考訳) 本研究は、森林の重要性と、地球、気候、地球上の生命に直接影響を及ぼす生態系のバランス維持における森林の役割を踏まえ、ドローンによる森林火災監視の問題を取り上げる。森林モニタリングプロセスは、森林内の監視領域の変化を追跡するために連続的に行われる。火災の間、ドローンが取得したデータは、追跡速度を高め、延焼を防ぐための火災制御プロセスを強化するために使用される。このような問題では時間が消火プロセスの成功率を左右する。適切な時刻に適切なデータが得られることが、火災の制御、延焼の防止、消火、損失の抑制において決定的な要因となりうるためである。そこで本研究では,森林モニタリングシステムにおけるドローンの監視タスクスケジューリングの問題を提示する。この問題は、ドローンに割り当てられたすべてのタスクを実行するのに必要な総完了時間を最小化することを目的として、いくつかのアルゴリズムを開発することで解決される。システム性能は、3つの異なるクラスの990インスタンスを用いて測定される。実験の結果,提案アルゴリズムの有効性と,目的の達成に向けて効率的に動作できることが示された。アルゴリズム $RID$ は、最大90.3%の割合と0.088秒の時間で最高の性能を達成した。

Given the importance of forests and their role in maintaining the ecological balance, which directly affects the planet, the climate, and the life on this planet, this research presents the problem of forest fire monitoring using drones. The forest monitoring process is performed continuously to track any changes in the monitored region within the forest. During fires, data captured by drones is used to increase the follow-up speed and enhance the control process of these fires to prevent their spread. The time factor in such problems determines the success rate of the fire extinguishing process, as appropriate data at the right time may be the decisive factor in controlling fires, preventing their spread, extinguishing them, and limiting their losses. Therefore, this research presents the problem of monitoring task scheduling for drones in the forest monitoring system. This problem is solved by developing several algorithms with the aim of minimizing the total completion time required to carry out all the drones' assigned tasks. System performance is measured using 990 instances of three different classes. The experimental results indicate the effectiveness of the proposed algorithms and their ability to act efficiently to achieve the desired goal. The algorithm $RID$ achieved the best performance with a percentage rate of up to 90.3% with a time of 0.088 seconds.

翻訳日:2023-05-19 19:08:11 公開日:2023-05-14
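
The abstract above does not specify the proposed scheduling algorithms (including $RID$). As a rough illustration of the kind of objective it describes, the sketch below assigns monitoring tasks to drones with the classic longest-processing-time-first greedy rule, assuming that "total completion time" means the time at which the last drone finishes. The function name and the task durations are hypothetical.

```python
import heapq


def lpt_schedule(task_times, num_drones):
    """Greedy longest-processing-time-first assignment (illustrative only).

    Returns (makespan, assignment), where assignment[d] lists the task
    durations given to drone d and makespan is the largest drone load.
    """
    # Min-heap of (current load, drone index): least-loaded drone on top.
    heap = [(0.0, d) for d in range(num_drones)]
    heapq.heapify(heap)
    assignment = [[] for _ in range(num_drones)]
    # Place the longest remaining task on the currently least-loaded drone.
    for duration in sorted(task_times, reverse=True):
        load, drone = heapq.heappop(heap)
        assignment[drone].append(duration)
        heapq.heappush(heap, (load + duration, drone))
    makespan = max(load for load, _ in heap)
    return makespan, assignment


# Hypothetical task durations (arbitrary time units) for two drones.
makespan, plan = lpt_schedule([7, 5, 4, 3, 3, 2], num_drones=2)
print(makespan, plan)
```

The greedy rule is a standard heuristic for this class of problem and is used here only because it is short; it is not guaranteed to be optimal in general.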

# SuperDriverAI: エンドツーエンド学習型自動運転の設計と実装に向けて

SuperDriverAI: Towards Design and Implementation for End-to-End Learning-based Autonomous Driving ( http://arxiv.org/abs/2305.10443v1 )

ライセンス: Link先を確認

Shunsuke Aoki, Issei Yamamoto, Daiki Shiotsuka, Yuichi Inoue, Kento Tokuhiro, and Keita Miwa

(参考訳) 完全自動運転は広く研究され、ますます実現可能になっている。しかし、周囲のドライバーや歩行者による様々な不確実性のため、公道での自動運転はまだ実現されていない。本稿では,Deep Neural Networks(DNN)が経験豊富なドライバから運転動作とポリシーを学習し,道路安全を確保しながら運転操作を決定する,SuperDriver AIというエンドツーエンドの学習ベース自動運転システムを提案する。さらに,頑健性と解釈性を向上させるため,スリットモデルと視覚的注意モジュールを提案する。我々は、実世界のハードウェアでデータ収集システムとエミュレータを構築し、実世界の運転シナリオでSuperDriver AIシステムをテストする。最後に,日本の東京における1つの運転シナリオについて150回の走行データを収集し,実車を用いたSuperDriver AIのデモンストレーションを行った。

Fully autonomous driving has been widely studied and is becoming increasingly feasible. However, such autonomous driving has yet to be achieved on public roads, because of various uncertainties due to surrounding human drivers and pedestrians. In this paper, we present an end-to-end learning-based autonomous driving system named SuperDriver AI, where Deep Neural Networks (DNNs) learn the driving actions and policies from experienced human drivers and determine the driving maneuvers to take while guaranteeing road safety. In addition, to improve robustness and interpretability, we present a slit model and a visual attention module. We build a data-collection system and emulator with real-world hardware, and we also test the SuperDriver AI system with real-world driving scenarios. Finally, we have collected 150 runs for one driving scenario in Tokyo, Japan, and have demonstrated SuperDriver AI with a real-world vehicle.

翻訳日:2023-05-19 19:07:51 公開日:2023-05-14

# スマートホームエネルギーマネジメント:VAE-GAN合成データセットジェネレータとQラーニング

Smart Home Energy Management: VAE-GAN synthetic dataset generator and Q-learning ( http://arxiv.org/abs/2305.08885v1 )

ライセンス: Link先を確認

Mina Razghandi, Hao Zhou, Melike Erol-Kantarci, and Damla Turgut

(参考訳) 近年、学界や産業の間で住宅の電気消費を分析し、家庭用エネルギー消費とコストを削減するためにスマートホームエネルギー管理システム(hems)を採用することへの関心が高まっている。 HEMSは、実際のスマートグリッドの統計的および機能的性質をシミュレートするために開発された。公開データセットへのアクセスは、この種の研究において大きな課題である。人工HEMSの応用の可能性は、合成システムの異なる動作条件を表す時系列の開発によってさらに強化される。本稿では,家庭におけるエネルギー消費に関する時系列データを生成するための変分的自動エンコーダ生成逆ネットワーク(vae-gan)手法を提案する。また、Qラーニングに基づくHEMSと組み合わせることで、生成モデルがどのように機能するかについても検討する。実世界のスマートホームデータを用いて,Qラーニングに基づくHEMSのオンラインパフォーマンスを検証した。生成したデータセットをテストするために,実データと合成データの確率分布間のKullback-Leibler(KL)偏差,最大平均差(MMD)およびワッサーシュタイン距離を測定する。実験の結果, VAE-GAN生成合成データは実データ分布と密に一致していることがわかった。最後に、生成したデータにより、ベースラインアプローチで生成されたデータセットと比較して、高性能なQ-ラーニングベースのHEMSのトレーニングが可能になることを示す。

Recent years have seen an increasing interest in academia and industry in analyzing the electrical consumption of residential buildings and employing smart home energy management systems (HEMS) to reduce household energy consumption and costs. HEMS has been developed to simulate the statistical and functional properties of actual smart grids. Access to publicly available datasets is a major challenge in this type of research. The potential of artificial HEMS applications will be further enhanced with the development of time series that represent different operating conditions of the synthetic systems. In this paper, we propose a novel variational auto-encoder-generative adversarial network (VAE-GAN) technique for generating time-series data on energy consumption in smart homes. We also explore how the generative model performs when combined with a Q-learning-based HEMS. We tested the online performance of Q-learning-based HEMS with real-world smart home data. To test the generated dataset, we measure the Kullback-Leibler (KL) divergence, maximum mean discrepancy (MMD), and the Wasserstein distance between the probability distributions of the real and synthetic data. Our experiments show that VAE-GAN-generated synthetic data closely matches the real data distribution. Finally, we show that the generated data allows for the training of a higher-performance Q-learning-based HEMS compared to datasets generated with baseline approaches.

翻訳日:2023-05-17 17:40:18 公開日:2023-05-14

# ブラックボックス言語モデルによるテキストの透かし

Watermarking Text Generated by Black-Box Language Models ( http://arxiv.org/abs/2305.08883v1 )

ライセンス: Link先を確認

Xi Yang, Kejiang Chen, Weiming Zhang, Chang Liu, Yuang Qi, Jie Zhang, Han Fang, Nenghai Yu

(参考訳) 現在、LLMは様々な分野で人間のようなスキルを示しており、誤用が懸念されている。したがって、生成されたテキストの検出が不可欠である。しかし、受動的検出手法は領域特異性と限られた敵対的頑健性という問題を抱えている。信頼性の高い検出を実現するため、テキスト生成時に透かしを埋め込むことが可能なホワイトボックスLLMに対して、透かしベースの手法が提案された。この方法は、モデル語彙をランダムに分割して特殊リストを取得し、確率分布を調整し、リスト内の単語の選択を促進する。リストを認識する検出アルゴリズムは、透かし付きテキストを識別することができる。しかし、この方法はブラックボックス言語モデルのみが利用可能な現実世界のシナリオの多くでは適用できない。例えば、APIベースの垂直アプリケーションを開発するサードパーティは、APIプロバイダが生成テキストのみを提供し、商業的利益を守るために確率分布を公開しないため、自らテキストに透かしを入れることができない。サードパーティが生成されたテキストに自律的に透かしを注入できるようにするために,ブラックボックス言語モデル利用シナリオのための透かしフレームワークを開発した。具体的には、まず単語に対応するランダムなバイナリエンコーディングを計算するバイナリエンコーディング関数を定義する。非透かしテキストで計算された符号化はベルヌーイ分布に従い、単語がビット1を表す確率は約0.5である。透かしを注入するために、ビット0を表す単語を、ビット1を表す文脈に基づく同義語に選択的に置き換えることで、分布を変化させる。その後、統計的検定によって透かしを識別する。実験により,中国語と英語のデータセットにおける本手法の有効性が実証された。さらに, 再翻訳, 推敲, 単語削除, 同義語置換攻撃の下での結果から, 本来の意味を損なうことなく透かしを除去することは困難であることが明らかとなった。

LLMs now exhibit human-like skills in various fields, leading to worries about misuse. Thus, detecting generated text is crucial. However, passive detection methods are stuck in domain specificity and limited adversarial robustness. To achieve reliable detection, a watermark-based method was proposed for white-box LLMs, allowing them to embed watermarks during text generation. The method involves randomly dividing the model vocabulary to obtain a special list and adjusting the probability distribution to promote the selection of words in the list. A detection algorithm aware of the list can identify the watermarked text. However, this method is not applicable in many real-world scenarios where only black-box language models are available. For instance, third parties that develop API-based vertical applications cannot watermark text themselves because API providers only supply generated text and withhold probability distributions to shield their commercial interests. To allow third parties to autonomously inject watermarks into generated text, we develop a watermarking framework for black-box language model usage scenarios. Specifically, we first define a binary encoding function to compute a random binary encoding corresponding to a word. The encodings computed for non-watermarked text conform to a Bernoulli distribution, wherein the probability of a word representing bit-1 is approximately 0.5. To inject a watermark, we alter the distribution by selectively replacing words representing bit-0 with context-based synonyms that represent bit-1. A statistical test is then used to identify the watermark. Experiments demonstrate the effectiveness of our method on both Chinese and English datasets. Furthermore, results under re-translation, polishing, word deletion, and synonym substitution attacks reveal that it is arduous to remove the watermark without compromising the original semantics.
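
The encode, substitute and test pipeline described above can be sketched roughly as follows. Everything concrete here is an assumption made for illustration: the keyed SHA-256 hash standing in for the binary encoding function, the `key` value, and the fixed synonym table, which replaces the context-based synonym generation that the abstract mentions but does not specify.

```python
import hashlib
import math

def word_bit(word, key="demo-key"):
    """Pseudo-random bit for a word; `key` is a hypothetical shared secret."""
    digest = hashlib.sha256((key + ":" + word.lower()).encode("utf-8")).digest()
    return digest[0] & 1

def inject(words, synonyms):
    """Replace each bit-0 word by a bit-1 synonym when the table offers one."""
    out = []
    for w in words:
        if word_bit(w) == 0:
            candidates = [s for s in synonyms.get(w, []) if word_bit(s) == 1]
            if candidates:
                w = candidates[0]
        out.append(w)
    return out

def z_score(words):
    """z statistic for H0: bits ~ Bernoulli(0.5), i.e. unwatermarked text."""
    n = len(words)
    ones = sum(word_bit(w) for w in words)
    return (ones - 0.5 * n) / math.sqrt(0.25 * n)

# Hypothetical synonym table; a real system would pick synonyms from context.
table = {"big": ["large", "huge"], "quick": ["fast", "rapid"]}
text = "the quick dog chased a big cat".split()
marked = inject(text, table)
```

A detector that knows the key would flag a text when `z_score` exceeds a chosen threshold; the threshold trades false positives against missed detections and is not given in the abstract.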

翻訳日:2023-05-17 17:39:58 公開日:2023-05-14

# 視覚球面パースペクティブモデルの消失点を用いたポーズ計算

Calculating Pose with Vanishing Points of Visual-Sphere Perspective Model ( http://arxiv.org/abs/2004.08933v4 )

ライセンス: Link先を確認

Jakub Maksymilian Fober

(参考訳) 提案手法の目標は, 幾何学的手法を用いて, 推定することなく, 既知の矩形対象のポーズ行列を直接取得することである。この方法は、魚眼カメラビューのような180°の視野を超えるリアルタイムの極端な撮像装置に特化している。導入されたアルゴリズムは幾何代数を用いてコプラナー平行線(理想的には長方形のような接対)のポーズを決定する。これは、ポーズ行列ベクトルに対応する視覚単位球面上の消失点を計算することによって達成される。このアルゴリズムは、ビュー座標の視覚球面パースペクティブモデルによるマッピングにより、事前の補正なしに非常に歪んだビューソースのポーズを決定することができる。マッピングは、パースペクティブマップのルックアップ、または本稿で併せて提示するパラメトリックな普遍的パースペクティブ歪みモデルのいずれかを用いて行うことができる。その結果、マイクロコントローラを使用して組み込みシステムで実行でき、高い精度と低レイテンシを実現する、堅牢なポーズ行列計算が実現した。この方法は、包括的カメラキャリブレーションのための立方体ターゲット設定にさらに拡張することができる。また、低レイテンシと極端な視野角を必要とする他のアプリケーションでも有用である。

The goal of the proposed method is to directly obtain a pose matrix of a known rectangular target, without estimation, using geometric techniques. This method is specifically tailored for real-time, extreme imaging setups exceeding 180° field of view, such as a fish-eye camera view. The introduced algorithm employs geometric algebra to determine the pose for a pair of coplanar parallel lines (ideally a tangent pair as in a rectangle). This is achieved by computing vanishing points on a visual unit sphere, which correspond to pose matrix vectors. The algorithm can determine pose for an extremely distorted view source without prior rectification, owing to a visual-sphere perspective model mapping of view coordinates. Mapping can be performed using either a perspective map lookup or a parametric universal perspective distortion model, which is also presented in this paper. The outcome is a robust pose matrix computation that can be executed on an embedded system using a microcontroller, offering high accuracy and low latency. This method can be further extended to a cubic target setup for comprehensive camera calibration. It may also prove valuable in other applications requiring low latency and extreme viewing angles.
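
The idea of reading pose axes off vanishing points on a unit sphere can be sketched with plain vector algebra. This is a simplified illustration, not the paper's geometric-algebra formulation: it assumes the view rays are already mapped onto the unit sphere, uses a made-up rectangle, and leaves the sign of each axis unresolved.

```python
import math

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def normalize(v):
    n = math.sqrt(sum(c * c for c in v))
    return tuple(c / n for c in v)

def vanishing_point(line_a, line_b):
    """Each line is a pair of unit view rays towards two points on it.
    A 3-D line projects to a great circle on the sphere; two parallel
    lines give great circles that meet at the shared direction (up to sign)."""
    n_a = cross(*line_a)  # normal of the great-circle plane of line a
    n_b = cross(*line_b)  # normal of the great-circle plane of line b
    return normalize(cross(n_a, n_b))

# Hypothetical rectangle facing the camera at depth 2 (camera-space corners).
corners = [(-1, -1, 2), (1, -1, 2), (1, 1, 2), (-1, 1, 2)]
r = [normalize(p) for p in corners]

x_axis = vanishing_point((r[0], r[1]), (r[3], r[2]))  # horizontal edge pair
y_axis = vanishing_point((r[1], r[2]), (r[0], r[3]))  # vertical edge pair
z_axis = normalize(cross(x_axis, y_axis))             # completes the rotation
```

Because only cross products and one normalisation per axis are needed, a computation of this shape is cheap enough for a microcontroller, which is consistent with the low-latency claim in the abstract.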

翻訳日:2023-05-17 02:16:48 公開日:2023-05-14

# Label-Assemble: 部分ラベルによる複数データセットの活用

Label-Assemble: Leveraging Multiple Datasets with Partial Labels ( http://arxiv.org/abs/2109.12265v4 )

ライセンス: Link先を確認

Mintong Kang, Bowen Li, Zengle Zhu, Yongyi Lu, Elliot K. Fishman, Alan L. Yuille, Zongwei Zhou

(参考訳) ディープラーニングの成功は、大きなラベル付きデータセットに大きく依存していますが、部分ラベルに関連するいくつかの小さなデータセットにしかアクセスできません。この問題に対処するため,我々は,公開データセットのアセンブリから部分ラベルの可能性を解き放つことを目的とした,新たなイニシアティブ "label-assemble" を提案する。ネガティブな例から学ぶことで,コンピュータ支援型疾患の診断と検出が容易になることがわかった。この発見は、陽性例の収集が難しいが、陰性例の組み立てが比較的容易な、新しい疾患診断において特に重要である。例えば、NIH ChestX-ray14(2017年以降利用可能)から既存のラベルを組み込むことで、新型コロナウイルスの診断精度が96.3%から99.3%に大幅に向上する。例えば、膵管腺癌(PDAC)の検出は、CystsおよびPanNets(他の2種類の膵異常)のラベルを利用することで、感度を52.1%から84.0%に上昇させ、98.0%の高比性を維持しながら、疾患の検出を改善することができる。

The success of deep learning relies heavily on large labeled datasets, but we often only have access to several small datasets associated with partial labels. To address this problem, we propose a new initiative, "Label-Assemble", that aims to unleash the full potential of partial labels from an assembly of public datasets. We discovered that learning from negative examples facilitates both computer-aided disease diagnosis and detection. This discovery will be particularly crucial in novel disease diagnosis, where positive examples are hard to collect, yet negative examples are relatively easier to assemble. For example, assembling existing labels from NIH ChestX-ray14 (available since 2017) significantly improves the accuracy of COVID-19 diagnosis from 96.3% to 99.3%. In addition to diagnosis, assembling labels can also improve disease detection, e.g., the detection of pancreatic ductal adenocarcinoma (PDAC) can greatly benefit from leveraging the labels of Cysts and PanNets (two other types of pancreatic abnormalities), increasing sensitivity from 52.1% to 84.0% while maintaining a high specificity of 98.0%.
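
One generic way to train on an assembly of partially labelled datasets is to skip the loss for labels that a given dataset does not provide. The sketch below shows that masking idea only; it is a common technique swapped in for illustration, not the Label-Assemble method itself, and the `None` convention for missing labels is an assumption.

```python
import math

def masked_bce(probs, labels):
    """Mean binary cross-entropy over known labels.
    `None` marks a class that the source dataset did not annotate."""
    terms = []
    for p, y in zip(probs, labels):
        if y is None:
            continue  # no supervision available for this class
        p = min(max(p, 1e-7), 1 - 1e-7)  # avoid log(0)
        terms.append(-(y * math.log(p) + (1 - y) * math.log(1 - p)))
    return sum(terms) / len(terms) if terms else 0.0

# One image with three disease classes; only the first two are annotated.
loss = masked_bce([0.5, 0.9, 0.3], [1, 0, None])
```

With this convention a negative example from one dataset still contributes a useful gradient for the classes it does annotate, which matches the abstract's point that negatives are easier to assemble than positives.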

翻訳日:2023-05-17 01:41:50 公開日:2023-05-14

# TrAISformer-AIS軌道予測のための生成変換器

TrAISformer-A generative transformer for AIS trajectory prediction ( http://arxiv.org/abs/2109.03958v3 )

ライセンス: Link先を確認

Duong Nguyen and Ronan Fablet

(参考訳) 将来のある時点における船舶位置の予測は多くの海洋応用の基本的な側面である。自動識別システム(AIS)は、このタスクを実現するための豊富な情報を提供するが、動きデータに固有の複雑さとマルチモーダル性のため、現代の機械学習/深層学習においても、AISデータを用いた船舶軌道予測は極めて困難である。本稿では,AISデータの新しい離散的高次元表現と,不均一性とマルチモーダル性を明示的に考慮した新しい損失関数を導入することで,これらの課題に取り組む。提案したモデルであるTrAISformerは、改良されたトランスフォーマーネットワークであり、提案された拡張空間におけるAIS軌道の長期相関を抽出し、数時間先の船舶の位置を予測する。公開されている実AISデータに対する実験結果を報告する。 TrAISformerは最先端の手法を大きく上回り、最大約10時間先までの予測で平均予測性能は10海里未満である。

The prediction of vessel positions at a specified point in the future is a fundamental aspect of many maritime applications. While Automatic Identification System (AIS) provides a rich source of information to enable this task, vessel trajectory forecasting using AIS data remains formidably challenging, even for modern machine learning/deep learning, because of the complexity and multimodality inherent in motion data. In this paper, we address these challenges by introducing a novel discrete, high-dimensional representation of AIS data and a new loss function to explicitly account for heterogeneity and multimodality. The proposed model -- referred to as TrAISformer -- is a modified transformer network that extracts long-term correlations of AIS trajectories in the proposed enriched space to forecast the positions of vessels several hours into the future. We report experimental results on publicly available, real AIS data. TrAISformer significantly outperforms state-of-the-art methods, with an average prediction performance below 10 nautical miles up to ~10 hours.
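
A discrete representation of an AIS message can be illustrated by binning each continuous attribute into an integer index. The sketch below is one plausible reading of "discrete, high-dimensional representation": the choice of attributes (latitude, longitude, speed and course over ground), the value ranges and the bin counts in `SPEC` are all assumptions, not values taken from the paper.

```python
def to_bin(value, low, high, n_bins):
    """Map a continuous value to a bin index in [0, n_bins - 1], clamping outliers."""
    value = min(max(value, low), high)
    idx = int((value - low) / (high - low) * n_bins)
    return min(idx, n_bins - 1)

# Hypothetical region of interest and resolutions: (low, high, n_bins).
SPEC = {
    "lat": (55.0, 58.0, 300),  # degrees north
    "lon": (10.0, 13.0, 300),  # degrees east
    "sog": (0.0, 30.0, 30),    # speed over ground, knots
    "cog": (0.0, 360.0, 72),   # course over ground, degrees
}

def discretise(message):
    """Turn one AIS message (a dict of floats) into a tuple of bin indices."""
    return tuple(to_bin(message[name], *SPEC[name]) for name in ("lat", "lon", "sog", "cog"))

tokens = discretise({"lat": 56.5, "lon": 11.2, "sog": 12.3, "cog": 90.0})
```

Treating each attribute as a categorical variable lets a model output a full distribution over bins, which is one way to represent several plausible future routes instead of their average.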

翻訳日:2023-05-17 01:41:30 公開日:2023-05-14

# 自己学習アンサンブルを用いたラベルなしデータの誤り検出と推定精度

Detecting Errors and Estimating Accuracy on Unlabeled Data with Self-training Ensembles ( http://arxiv.org/abs/2106.15728v4 )

ライセンス: Link先を確認

Jiefeng Chen, Frederick Liu, Besim Avci, Xi Wu, Yingyu Liang, Somesh Jha

(参考訳) ディープラーニングモデルがワイルドにデプロイされると、トレーニングデータ分布とは異なる分布から引き出されたテストデータに遭遇し、パフォーマンスが低下する可能性がある。安全なデプロイメントには,事前トレーニングしたモデルの精度をテストデータ上で推定することが不可欠である。しかし、テスト入力のラベルは通常、すぐには利用できず、それらを取得するには費用がかかる可能性がある。この観察から,(1) ラベルなしテスト入力のセット上で事前学習済み分類器の精度を推定することを目的とした教師なし精度推定,(2) 誤分類されたテスト入力の同定を目的とした誤り検出,という2つの困難な課題が導かれる。本稿では,2つのタスクを同時に処理する原理的かつ効果的なフレームワークを提案する。提案手法は,誤分類されたデータポイントを識別するためのモデルのアンサンブルを反復的に学習し,同定されたポイントを用いてアンサンブルを改善するために自己学習を行う。理論解析により,本フレームワークは,実用的なディープラーニングモデルによって容易に満足できる軽度条件下での精度推定と誤り検出の両方について証明可能な保証を持つことが示される。このフレームワークとともに,2つのインスタンス化を提案して実験を行い,59のタスクで最先端の結果を達成した。例えば、iWildCamでは、一方のインスタンス化が、教師なし精度推定における推定誤差を少なくとも70%削減し、エラー検出のためのF1スコアを既存の方法と比較して少なくとも4.7%改善する。

When a deep learning model is deployed in the wild, it can encounter test data drawn from distributions different from the training data distribution and suffer a drop in performance. For safe deployment, it is essential to estimate the accuracy of the pre-trained model on the test data. However, the labels for the test inputs are usually not immediately available in practice, and obtaining them can be expensive. This observation leads to two challenging tasks: (1) unsupervised accuracy estimation, which aims to estimate the accuracy of a pre-trained classifier on a set of unlabeled test inputs; (2) error detection, which aims to identify mis-classified test inputs. In this paper, we propose a principled and practically effective framework that simultaneously addresses the two tasks. The proposed framework iteratively learns an ensemble of models to identify mis-classified data points and performs self-training to improve the ensemble with the identified points. Theoretical analysis demonstrates that our framework enjoys provable guarantees for both accuracy estimation and error detection under mild conditions readily satisfied by practical deep learning models. Along with the framework, we proposed and experimented with two instantiations and achieved state-of-the-art results on 59 tasks. For example, on iWildCam, one instantiation reduces the estimation error for unsupervised accuracy estimation by at least 70% and improves the F1 score for error detection by at least 4.7% compared to existing methods.
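
The core disagreement idea can be sketched in a few lines: points where the deployed model disagrees with an ensemble's majority vote are flagged as likely errors, and the agreement rate serves as the accuracy estimate. This covers a single round only; the iterative self-training step from the abstract is omitted, and the predictions below are made up.

```python
from collections import Counter

def majority_vote(ensemble_preds):
    """ensemble_preds[k][i] is the label model k assigns to test point i."""
    n_points = len(ensemble_preds[0])
    return [Counter(m[i] for m in ensemble_preds).most_common(1)[0][0]
            for i in range(n_points)]

def estimate_accuracy(model_preds, ensemble_preds):
    """Return (estimated accuracy, indices flagged as likely mis-classified)."""
    votes = majority_vote(ensemble_preds)
    flagged = [i for i, (p, v) in enumerate(zip(model_preds, votes)) if p != v]
    return 1.0 - len(flagged) / len(model_preds), flagged

# Hypothetical predictions on four unlabeled test inputs.
model_preds = [0, 1, 1, 0]
ensemble_preds = [[0, 1, 0, 0],
                  [0, 1, 0, 1],
                  [0, 0, 0, 0]]
acc, flagged = estimate_accuracy(model_preds, ensemble_preds)
```

The estimate is only as good as the ensemble's tendency to disagree with the model exactly where the model is wrong, which is what the self-training loop in the paper is meant to encourage.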

翻訳日:2023-05-17 01:39:38 公開日:2023-05-14

# CD-GAN : 不均一センサを用いた教師なしリモートセンシング変化検出のためのロバストな融合に基づく敵対的生成ネットワーク

CD-GAN: a robust fusion-based generative adversarial network for unsupervised remote sensing change detection with heterogeneous sensors ( http://arxiv.org/abs/2203.00948v3 )

ライセンス: Link先を確認

Jin-Ju Wang, Nicolas Dobigeon, Marie Chabert, Ding-Cheng Wang, Ting-Zhu Huang and Jie Huang

(参考訳) 地球観測の文脈では、変化の検出は、空間分解能やスペクトル分解能が異なる可能性のあるセンサー、あるいは異なるモダリティ(光学、レーダーなど)のセンサーによって取得されたマルチテンポラル画像から行われる。光学モダリティに限ったとしても、センサーの空間分解能やスペクトル分解能が異なる場合、この課題は困難であることが分かっている。本稿では,このような不均質光学センサで取得された画像に特化した新しい教師なし変化検出法を提案する。この手法は、変化検出問題を堅牢な融合フレームワークとして定式化する最近の進歩を生かしている。より正確には、一対のマルチバンド光学画像を融合するようあらかじめ設計・訓練された深層敵対的ネットワークを、同一アーキテクチャのネットワークで容易に補完することで、変化検出を行えることを示す。結果として得られる全体的なアーキテクチャ自体も、融合ネットワークと追加ネットワークがジェネレータの不可欠なビルディングブロックとして解釈される敵対的戦略に従う。最先端の変化検出手法との比較により,提案手法の汎用性と有効性を示す。

In the context of Earth observation, the detection of changes is performed from multitemporal images acquired by sensors with possibly different spatial and/or spectral resolutions or even different modalities (e.g. optical, radar). Even limiting to the optical modality, this task has proved to be challenging as soon as the sensors have different spatial and/or spectral resolutions. This paper proposes a novel unsupervised change detection method dedicated to images acquired with such so-called heterogeneous optical sensors. This method capitalizes on recent advances which frame the change detection problem into a robust fusion framework. More precisely, we show that a deep adversarial network designed and trained beforehand to fuse a pair of multiband optical images can be easily complemented by a network with the same architecture to perform change detection. The resulting overall architecture itself follows an adversarial strategy where the fusion network and the additional network are interpreted as essential building blocks of a generator. A comparison with state-of-the-art change detection methods demonstrates the versatility and the effectiveness of the proposed approach.

翻訳日:2023-05-17 01:32:34 公開日:2023-05-14

# セマンティクスセグメンテーションのためのピラミッド融合トランスフォーマ

Pyramid Fusion Transformer for Semantic Segmentation ( http://arxiv.org/abs/2201.04019v3 )

ライセンス: Link先を確認

Zipeng Qin, Jianbo Liu, Xiaolin Zhang, Maoqing Tian, Aojun Zhou, Shuai Yi, Shaoan Qi, Hongsheng Li

(参考訳) 最近提案されたMaskFormerは、セマンティックセグメンテーションのタスクに関する新しい視点を提供している。すなわち、一般的なピクセルレベルの分類パラダイムからマスクレベルの分類手法へと移行している。本質的には、カテゴリセグメントに対応する確率とマスクの対を生成し、セグメンテーションマップの推論中にそれらを組み合わせる。本研究では,シングルスケール特徴上のマスク単位の分類デコーダは,信頼性の高い確率やマスクを抽出できるほど有効ではないことを見出した。特徴ピラミッド全体にわたって豊富な意味情報を取り出すため,マルチスケール特徴を用いたマスク単位のアプローチによるセマンティックセグメンテーションのためのトランスフォーマーベースのPyramid Fusion Transformer (PFT) を提案する。提案するトランスフォーマーデコーダは,学習可能なクエリと特徴ピラミッドからのそれぞれの空間特徴との間のクロスアテンションを並列に行い,相補的な情報の交換にクロススケールのクエリ間アテンションを使用する。広く使われている3つのセマンティックセグメンテーションデータセット上で競争力のある性能を達成した。特にADE20Kの検証セットでは、Swin-Bバックボーンを用いた我々の結果が、はるかに大きなSwin-Lバックボーンを用いたMaskFormerの結果をシングルスケールとマルチスケールの両方の推論で上回り、それぞれ54.1 mIoUと55.7 mIoUを達成した。 Swin-Lのバックボーンを使用して、シングルスケールで56.1 mIoU、マルチスケールで57.4 mIoUを達成し、データセット上で最先端のパフォーマンスを得る。 3つの広く使われているセマンティックセグメンテーションデータセットの大規模な実験により,提案手法の有効性が検証された。

The recently proposed MaskFormer gives a refreshed perspective on the task of semantic segmentation: it shifts from the popular pixel-level classification paradigm to a mask-level classification method. In essence, it generates paired probabilities and masks corresponding to category segments and combines them during inference for the segmentation maps. In our study, we find that a per-mask classification decoder on top of a single-scale feature is not effective enough to extract reliable probability or mask. To mine for rich semantic information across the feature pyramid, we propose a transformer-based Pyramid Fusion Transformer (PFT) for per-mask approach semantic segmentation with multi-scale features. The proposed transformer decoder performs cross-attention between the learnable queries and each spatial feature from the feature pyramid in parallel and uses cross-scale inter-query attention to exchange complementary information. We achieve competitive performance on three widely used semantic segmentation datasets. In particular, on ADE20K validation set, our result with Swin-B backbone surpasses that of MaskFormer's with a much larger Swin-L backbone in both single-scale and multi-scale inference, achieving 54.1 mIoU and 55.7 mIoU respectively. Using a Swin-L backbone, we achieve single-scale 56.1 mIoU and multi-scale 57.4 mIoU, obtaining state-of-the-art performance on the dataset. Extensive experiments on three widely used semantic segmentation datasets verify the effectiveness of our proposed method.
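
The step of combining paired class probabilities and masks at inference time can be sketched as a weighted sum over queries followed by a per-pixel argmax. The tiny arrays are made up, and details such as how a "no object" class is handled are left out, so this should be read as an illustration of the per-mask paradigm rather than the PFT decoder.

```python
def segment(class_probs, masks):
    """class_probs[q][c]: probability that query q belongs to class c.
    masks[q][h][w]: soft mask in [0, 1] predicted by query q.
    Returns a label map[h][w] = argmax_c sum_q class_probs[q][c] * masks[q][h][w]."""
    n_q, n_c = len(masks), len(class_probs[0])
    height, width = len(masks[0]), len(masks[0][0])
    label_map = []
    for h in range(height):
        row = []
        for w in range(width):
            scores = [sum(class_probs[q][c] * masks[q][h][w] for q in range(n_q))
                      for c in range(n_c)]
            row.append(max(range(n_c), key=scores.__getitem__))
        label_map.append(row)
    return label_map

# Two queries, two classes, a 1x2 image (illustrative values).
class_probs = [[0.9, 0.1], [0.2, 0.8]]
masks = [[[0.8, 0.1]], [[0.1, 0.9]]]
labels = segment(class_probs, masks)
```

If either the probabilities or the masks are unreliable, the product is unreliable too, which is the weakness of single-scale decoding that the abstract sets out to address.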

翻訳日:2023-05-17 01:31:13 公開日:2023-05-14

# 逆ベイズ分類器の存在について(拡張版)

On the Existence of the Adversarial Bayes Classifier (Extended Version) ( http://arxiv.org/abs/2112.01694v3 )

ライセンス: Link先を確認

Pranjal Awasthi, Natalie S. Frank, Mehryar Mohri

(参考訳) 敵対的堅牢性は、現代の機械学習アプリケーションにおいて重要な特性である。近年のいくつかの理論的研究の対象となっているが、敵対的堅牢性に関する重要な疑問がまだ数多く残っている。本研究では,敵対的堅牢性に対するベイズ最適性に関する基本的問題について考察する。敵対的堅牢性に対してベイズ最適分類器の存在を保証できるような、一般的な十分条件を提供する。この結果は, 敵対的堅牢性におけるサロゲート損失とその一致性の研究に有用である。この写本は、NeurIPS 2021 で出版された論文 \emph{On the Existence of the Adversarial Bayes Classifier} の拡張・修正版である。元々の論文では定理ステートメントに2つの誤りがあった。1つは疑似証明可能ロバスト性の定義であり、もう1つは任意の距離空間に対する $A^\epsilon$ の可測性に関するものである。このバージョンではこれらの誤りを修正する。さらに、原論文の結果は、いくつかの狭義凸でないノルムには適用されなかったが、ここでは、結果を全ての可能なノルムにまで拡張する。

Adversarial robustness is a critical property in a variety of modern machine learning applications. While it has been the subject of several recent theoretical studies, many important questions related to adversarial robustness are still open. In this work, we study a fundamental question regarding Bayes optimality for adversarial robustness. We provide general sufficient conditions under which the existence of a Bayes optimal classifier can be guaranteed for adversarial robustness. Our results can provide a useful tool for a subsequent study of surrogate losses in adversarial robustness and their consistency properties. This manuscript is the extended and corrected version of the paper \emph{On the Existence of the Adversarial Bayes Classifier} published in NeurIPS 2021. There were two errors in theorem statements in the original paper -- one in the definition of pseudo-certifiable robustness and the other in the measurability of $A^\epsilon$ for arbitrary metric spaces. In this version we correct the errors. Furthermore, the results of the original paper did not apply to some non-strictly convex norms and here we extend our results to all possible norms.

翻訳日:2023-05-17 01:30:45 公開日:2023-05-14

# AxoNN: 極大規模ディープラーニングのための非同期メッセージ駆動並列フレームワーク

AxoNN: An asynchronous, message-driven parallel framework for extreme-scale deep learning ( http://arxiv.org/abs/2110.13005v5 )

ライセンス: Link先を確認

Siddharth Singh, Abhinav Bhatele

(参考訳) ここ数年、最先端のニューラルネットワークをトレーニングするためのメモリ要件は、現代のハードウェアアクセラレーターのDRAM容量を大きく超えてきた。これにより、大規模なGPUベースのクラスタ上でこれらのニューラルネットワークを並列にトレーニングする効率的なアルゴリズムの開発が必要になった。現代のGPUでは計算コストは比較的安価であるため、並列トレーニングアルゴリズムにおける極めて効率的な通信の設計と実装は、最大性能の抽出に不可欠である。本稿で提案する並列ディープラーニングフレームワークであるAxoNNは、非同期およびメッセージ駆動実行を利用して、各GPU上でのニューラルネットワーク操作をスケジュールし、GPUアイドル時間を短縮し、ハードウェア効率を最大化する。トレーニング中に定期的にデータをオフロードするスクラッチスペースとしてCPUメモリを使用することで、AxoNNはGPUメモリ使用量を4分の1に削減することができる。これにより、GPUあたりのパラメータ数を4倍に増やすことができ、通信量を削減して性能を13%以上向上させることができる。 48〜384基のNVIDIA Tesla V100 GPU上で120億〜1000億のパラメータを持つ大規模なトランスフォーマーモデルに対してテストすると、AxoNNは理論ピークの49.4〜54.78%のGPU当たりスループットを達成し、最先端と比較してトレーニング時間を22〜37日(15〜25%のスピードアップ)短縮する。

In the last few years, the memory requirements to train state-of-the-art neural networks have far exceeded the DRAM capacities of modern hardware accelerators. This has necessitated the development of efficient algorithms to train these neural networks in parallel on large-scale GPU-based clusters. Since computation is relatively inexpensive on modern GPUs, designing and implementing extremely efficient communication in these parallel training algorithms is critical for extracting the maximum performance. This paper presents AxoNN, a parallel deep learning framework that exploits asynchrony and message-driven execution to schedule neural network operations on each GPU, thereby reducing GPU idle time and maximizing hardware efficiency. By using the CPU memory as a scratch space for offloading data periodically during training, AxoNN is able to reduce GPU memory consumption by four times. This allows us to increase the number of parameters per GPU by four times, thus reducing the amount of communication and increasing performance by over 13%. When tested against large transformer models with 12-100 billion parameters on 48-384 NVIDIA Tesla V100 GPUs, AxoNN achieves a per-GPU throughput of 49.4-54.78% of theoretical peak and reduces the training time by 22-37 days (15-25% speedup) as compared to the state-of-the-art.

翻訳日:2023-05-17 01:30:04 公開日:2023-05-14

# Twitter会話スレッドのヘイトインテンシティ予測

Predicting Hate Intensity of Twitter Conversation Threads ( http://arxiv.org/abs/2206.08406v4 )

ライセンス: Link先を確認

Qing Meng and Tharun Suresh, Roy Ka-Wei Lee, Tanmoy Chakraborty

(参考訳) ツイートは、オンラインのソーシャルメディアにおける最も簡潔なコミュニケーション形態であり、一つのツイートが会話の流れを左右する可能性を秘めている。オンラインヘイトスピーチはかつてないほどアクセスしやすく、その拡散を抑制することは、ソーシャルメディア企業やユーザーにとって、良好なコミュニケーションのために最も重要である。最近の少数の研究を除き、ほとんどの研究は、そこに至るまでのツイートスレッド/コンテキストを考慮せずに、個々のツイートを分類することに重点を置いてきた。ヘイトスピーチを抑制する古典的なアプローチの1つは、ヘイトスピーチの投稿後にリアクティブ戦略を採用することである。このような事後的な戦略は、それ自体ではヘイトスピーチを扇動する可能性を示さないが、投稿への返信で続く議論においてその前兆となりうる微妙な投稿を見落とす結果となる。本稿では,将来,ツイートが返信チェーンを通じてもたらす憎悪の強さを予測することを目的としたDRAGNET++を提案する。ツイートスレッドのセマンティックな構造と伝播構造を利用して、後続の各ツイートに至るまでのコンテキスト情報とヘイト強度の低下を最大限に捉える。公開されている3つのTwitterデータセットを調査する。Anti-Racismは、米国の政治および新型コロナウイルス(COVID-19)を背景とした人種差別的発言に関するソーシャルメディア談話の返信ツイートを含み、Anti-Socialは、COVID-19パンデミック中の反社会的行動に関する4000万ツイートのデータセットであり、Anti-Asianは、COVID-19パンデミック時の反アジア的行動に基づいて収集されたTwitterデータセットである。キュレートされたデータセットはすべて、ツイートスレッドの構造グラフ情報で構成されている。 DRAGNET++は最先端のすべてのベースラインを大幅に上回ることを示す。Anti-Racismデータセットにおいて、Pearson相関係数で最良のベースラインを11%上回り、RMSEを25%低下させ、他の2つのデータセットでも同様の性能を示す。

Tweets are the most concise form of communication in online social media, wherein a single tweet has the potential to make or break the discourse of the conversation. Online hate speech is more accessible than ever, and stifling its propagation is of utmost importance for social media companies and users for congenial communication. Most of the research barring a recent few has focused on classifying an individual tweet regardless of the tweet thread/context leading up to that point. One of the classical approaches to curb hate speech is to adopt a reactive strategy after the hate speech is posted. The ex-post facto strategy results in neglecting subtle posts that do not show the potential to instigate hate speech on their own but may portend in the subsequent discussion ensuing in the post's replies. In this paper, we propose DRAGNET++, which aims to predict the intensity of hatred that a tweet can bring in through its reply chain in the future. It uses the semantic and propagating structure of the tweet threads to maximize the contextual information leading up to and the fall of hate intensity at each subsequent tweet. We explore three publicly available Twitter datasets -- Anti-Racism contains the reply tweets of a collection of social media discourse on racist remarks during US political and Covid-19 background; Anti-Social presents a dataset of 40 million tweets amidst the COVID-19 pandemic on anti-social behaviours; and Anti-Asian presents Twitter datasets collated based on anti-Asian behaviours during COVID-19 pandemic. All the curated datasets consist of structural graph information of the Tweet threads. We show that DRAGNET++ outperforms all the state-of-the-art baselines significantly. It beats the best baseline by an 11% margin on the Pearson correlation coefficient and a decrease of 25% on RMSE for the Anti-Racism dataset with a similar performance on the other two datasets.
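
The two evaluation metrics quoted at the end of this abstract, the Pearson correlation coefficient and RMSE, can be computed as in the sketch below. The hate-intensity values are made up for illustration and are unrelated to the paper's datasets.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def rmse(x, y):
    """Root mean squared error between predictions and targets."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)) / len(x))

# Hypothetical hate-intensity profile along a reply chain, and a prediction.
target = [0.1, 0.4, 0.7, 0.5, 0.2]
predicted = [0.2, 0.5, 0.6, 0.4, 0.2]
r = pearson(target, predicted)
err = rmse(target, predicted)
```

The two metrics answer different questions: correlation rewards getting the shape of the intensity profile right, while RMSE also penalises a constant offset.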

翻訳日:2023-05-17 01:22:48 公開日:2023-05-14

# 一般化された量子シュタインの補題の証明のギャップとその量子資源の可逆性への帰結について

On a gap in the proof of the generalised quantum Stein's lemma and its consequences for the reversibility of quantum resources ( http://arxiv.org/abs/2205.02813v3 )

ライセンス: Link先を確認

Mario Berta, Fernando G. S. L. Brandão, Gilad Gour, Ludovico Lami, Martin B. Plenio, Bartosz Regula, Marco Tomamichel

(参考訳) 一般化された量子シュタインの補題 [Brandão & Plenio, Commun. Math. Phys. 295, 791 (2010)] の証明は、Lemma III.9 に至る議論のギャップのために正しくないことを示す。したがって、Brandão & Plenioの達成可能性に関する主要な結果が成り立つかどうかは分かっていない。これは文献におけるいくつかの確立された結果、特に漸近的に資源を生成しない操作の下での量子エンタングルメントの可逆性 [Brandão & Plenio, Commun. Math. Phys. 295, 829 (2010); Nat. Phys. 4, 873 (2008)] および一般的な量子資源の可逆性 [Brandão & Gour, Phys. Rev. Lett. 115, 070503 (2015)] に疑問を投げかける。他のアプローチを用いて、新たに未解決となった結果の変種を復元する可能性のある方法について論じる。

We show that the proof of the generalised quantum Stein's lemma [Brandão & Plenio, Commun. Math. Phys. 295, 791 (2010)] is not correct due to a gap in the argument leading to Lemma III.9. Hence, the main achievability result of Brandão & Plenio is not known to hold. This puts into question a number of established results in the literature, in particular the reversibility of quantum entanglement [Brandão & Plenio, Commun. Math. Phys. 295, 829 (2010); Nat. Phys. 4, 873 (2008)] and of general quantum resources [Brandão & Gour, Phys. Rev. Lett. 115, 070503 (2015)] under asymptotically resource non-generating operations. We discuss potential ways to recover variants of the newly unsettled results using other approaches.

翻訳日:2023-05-17 01:21:36 公開日:2023-05-14

# 非ブール行列に対するIhara-Bass公式とランダムCSPの強い反駁

A Ihara-Bass Formula for Non-Boolean Matrices and Strong Refutations of Random CSPs ( http://arxiv.org/abs/2204.10881v2 )

ライセンス: Link先を確認

Tommaso d'Orsi, Luca Trevisan

(参考訳) 我々は、任意の対称行列に付随する「non-backtracking」行列の新たな概念を定義し、それに対する「Ihara-Bass」型の公式を証明する。この理論を用いて、制約当たり$k$変数を持つランダム制約充足問題(k-CSP)の多項式時間での強い反駁に関する新たな結果を証明する。割当ての$p$の割合で満たされる制約から構成されたランダムk-CSPインスタンスに対して、インスタンスが$n$変数と$n^{k/2} / \epsilon^2$個の制約を含むならば、最適値が高々$p+O_k(\epsilon)$の割合の制約しか満たさないことの証明書を効率的に計算できる。これは偶数の$k$については以前から知られていたが、奇数の$k$の場合、同じ結論を得るには$n^{k/2} (\log n)^{O(1)} / \epsilon^2$個のランダムな制約が必要であった。改善は多重対数的に過ぎないが、この種の結果に対する大きな障壁を克服している。現在のアプローチに基づく強い反駁の結果は、k-CSPインスタンスに関連するある行列が準ランダムであることの証明書を構築する。そのような証明書は、Feige-Ofek型の議論、グロタンディークの不等式の適用、あるいはトレース議論で得られるスペクトル境界から得られる。最初の2つのアプローチは、制約の数が$o(n^{\lceil k/2 \rceil})$のときには機能しないユニオンバウンドを必要とし、3番目のアプローチは、制約の数が$o(n^{k/2} \sqrt{\log n})$のときには機能しない。さらに、制約はランダムだが符号パターンが敵対的である半ランダム設定において、$n^{k/2} / \epsilon^2$個の制約を持つ$k$-CSPインスタンスの割当てを見つける新たなPTASを得るために、我々の手法を適用する。

We define a novel notion of "non-backtracking" matrix associated to any symmetric matrix, and we prove an "Ihara-Bass" type formula for it. We use this theory to prove new results on polynomial-time strong refutations of random constraint satisfaction problems with $k$ variables per constraint (k-CSPs). For a random k-CSP instance constructed out of a constraint that is satisfied by a $p$ fraction of assignments, if the instance contains $n$ variables and $n^{k/2} / \epsilon^2$ constraints, we can efficiently compute a certificate that the optimum satisfies at most a $p+O_k(\epsilon)$ fraction of constraints. Previously, this was known for even $k$, but for odd $k$ one needed $n^{k/2} (\log n)^{O(1)} / \epsilon^2$ random constraints to achieve the same conclusion. Although the improvement is only polylogarithmic, it overcomes a significant barrier to these types of results. Strong refutation results based on current approaches construct a certificate that a certain matrix associated to the k-CSP instance is quasirandom. Such a certificate can come from a Feige-Ofek type argument, from an application of Grothendieck's inequality, or from a spectral bound obtained with a trace argument. The first two approaches require a union bound that cannot work when the number of constraints is $o(n^{\lceil k/2 \rceil})$ and the third one cannot work when the number of constraints is $o(n^{k/2} \sqrt{\log n})$. We further apply our techniques to obtain a new PTAS finding assignments for $k$-CSP instances with $n^{k/2} / \epsilon^2$ constraints in the semi-random settings where the constraints are random, but the sign patterns are adversarial.
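
For orientation, the classical Ihara-Bass formula that this abstract generalises can be stated as follows. This is the standard statement for an unweighted graph, included as background only; the paper's version for arbitrary symmetric matrices is not reproduced here. The symbols $n$, $m$, $A$, $D$ and $B$ are the usual ones and do not come from the abstract.

```latex
% G: a finite simple graph with n vertices and m edges,
% A: adjacency matrix, D: diagonal degree matrix,
% B: the 2m x 2m non-backtracking matrix indexed by directed edges,
%    B_{(a \to b),(c \to d)} = 1 if b = c and d \neq a, and 0 otherwise.
\[
  \det\bigl(I_{2m} - uB\bigr)
  \;=\;
  (1 - u^{2})^{\,m-n}\,
  \det\bigl(I_{n} - uA + u^{2}(D - I_{n})\bigr)
\]
```

The formula relates the spectrum of the large matrix $B$ to a quadratic matrix polynomial in the much smaller $A$ and $D$, which is what makes spectral bounds on $B$ usable as certificates.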

翻訳日:2023-05-17 01:21:07 公開日:2023-05-14

# 局所データ制限付きニューラルネットワークを用いた知的空間補間に基づく霜予測手法

Intelligent Spatial Interpolation-based Frost Prediction Methodology using Artificial Neural Networks with Limited Local Data ( http://arxiv.org/abs/2204.08465v2 )

ライセンス: Link先を確認

Ian Zhou, Justin Lipman, Mehran Abolhasan and Negin Shariati

(参考訳) フロストの気象現象は農業に大きな脅威をもたらす。最近のフロスト予測は現場の履歴データとセンサーに基づいており、新しいサイトでのデータ収集には追加開発と展開時間が必要である。本論文の目的は,現場の履歴データと凍害予測のためのセンサへの依存を解消することである。本稿では,空間補間に基づく凍害予測手法を提案する。これらのモデルは、既存の気象観測所の気候データ、デジタル標高モデルサーベイ、および正規化差植生指数データを用いて、目標地点の次の1時間最低気温を推定する。提案手法は,モデルの精度を高めるためにアンサンブル学習を用いる。気候データセットは、ニューサウスウェールズ州とオーストラリアの首都圏の75の気象観測所から得られる。その結果,提案手法は検出率92.55%に達することがわかった。

The weather phenomenon of frost poses great threats to agriculture. As recent frost prediction methods are based on on-site historical data and sensors, extra development and deployment time are required for data collection in any new site. The aim of this article is to eliminate the dependency on on-site historical data and sensors for frost prediction methods. In this article, a frost prediction method based on spatial interpolation is proposed. The models use climate data from existing weather stations, digital elevation model surveys, and normalized difference vegetation index data to estimate a target site's next hour minimum temperature. The proposed method utilizes ensemble learning to increase the model accuracy. Climate datasets are obtained from 75 weather stations across New South Wales and Australian Capital Territory areas of Australia. The results show that the proposed method reached a detection rate of up to 92.55%.
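
As a point of reference for what spatial interpolation from existing stations means, the sketch below uses classical inverse distance weighting (IDW). This is a deliberately simpler technique than the neural network models in the paper and ignores elevation and vegetation index; the station coordinates, temperatures and the 0 °C frost threshold are assumptions for illustration.

```python
import math

def idw(target, stations, power=2.0):
    """Inverse-distance-weighted estimate at `target`.
    stations: list of ((x, y), value) pairs in a planar coordinate system."""
    num = den = 0.0
    for (x, y), value in stations:
        d = math.hypot(target[0] - x, target[1] - y)
        if d == 0.0:
            return value  # target coincides with a station
        w = 1.0 / d ** power
        num += w * value
        den += w
    return num / den

# Hypothetical stations: (position in km, next-hour minimum temperature in deg C).
stations = [((0.0, 0.0), -1.0), ((10.0, 0.0), 2.0), ((0.0, 10.0), 1.0)]
estimate = idw((2.0, 2.0), stations)
frost_expected = estimate <= 0.0  # assumed frost threshold
```

A learned model can in principle improve on this baseline by accounting for terrain and land cover, which is the role the elevation and vegetation inputs play in the abstract.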

翻訳日:2023-05-17 01:20:10 公開日:2023-05-14

# 実ヒルベルト空間における正写像と絡み合い

Positive maps and entanglement in real Hilbert spaces ( http://arxiv.org/abs/2207.02510v2 )

ライセンス: Link先を確認

Giulio Chiribella, Kenneth R. Davidson, Vern I. Paulsen and Mizanur Rahaman

(参考訳) 正写像の理論は作用素代数や関数解析において中心的な役割を果たし、量子情報科学において無数の応用がある。この理論はもともと複素ヒルベルト空間上で作用する作用素のために開発され、実ヒルベルト空間上の変種についてはほとんど知られていない。本稿では、実数体上の全行列代数に作用する正の写像について研究し、複素数体に対する多くの基本的な違いを指摘し、量子情報におけるそれらの意味について論じる。我々は、実写像が正の複素化を受け入れる必要十分条件を提供し、正の写像の存在と非正の複素化の存在と、実ヒルベルト空間の量子力学において絡み合っている混合状態の存在を結びつけるが、複素バージョンでは分離可能であり、写像と状態の両方に明確な例を提供する。最後に、エンタングルメント破れと PPT 写像について議論し、PPT-二乗予想の単純実版が次元 2 においても偽であることを示す。それでも、元の PPT-二乗予想は実写像に対して異なる予想を示し、PPT特性は部分転置(IPT)の下での不変性の強い性質に置き換えられることを示す。 IPT特性を仮定すると、予想の漸近バージョンが証明される。

The theory of positive maps plays a central role in operator algebras and functional analysis, and has countless applications in quantum information science. The theory was originally developed for operators acting on complex Hilbert spaces, and little is known about its variant on real Hilbert spaces. In this article we study positive maps acting on a full matrix algebra over the reals, pointing out a number of fundamental differences with the complex case and discussing their implications in quantum information. We provide a necessary and sufficient condition for a real map to admit a positive complexification, and connect the existence of positive maps with non-positive complexification with the existence of mixed states that are entangled in real Hilbert space quantum mechanics, but separable in the complex version, providing explicit examples both for the maps and for the states. Finally, we discuss entanglement breaking and PPT maps, and we show that a straightforward real version of the PPT-squared conjecture is false even in dimension 2. Nevertheless, we show that the original PPT-squared conjecture implies a different conjecture for real maps, in which the PPT property is replaced by a stronger property of invariance under partial transposition (IPT). When the IPT property is assumed, we prove an asymptotic version of the conjecture.

翻訳日:2023-05-17 01:12:47 公開日:2023-05-14

# 次世代衛星ネットワークのための人工知能技術

Artificial Intelligence Techniques for Next-Generation Mega Satellite Networks ( http://arxiv.org/abs/2207.00414v2 )

ライセンス: Link先を確認

Bassel Al Homssi, Kosta Dakic, Ke Wang, Tansu Alpcan, Ben Allen, Sithamparanathan Kandeepan, Akram Al-Hourani, and Walid Saad

(参考訳) 宇宙通信、特にメガ衛星ネットワークは、宇宙打ち上げ、エレクトロニクス、処理能力、小型化の大きな進歩により、次世代ネットワークの魅力ある候補として再燃した。しかし、メガ衛星ネットワークは、軌道速度、衛星間リンク、短距離通過、衛星フットプリントなどのダイナミックでユニークな特徴のために、従来のモデルでは真に捉えられない多くの基盤的および相互接続のプロセスに依存している。したがって、ネットワークがリンク内で急速に変化する条件に積極的に適応できるように、新しいアプローチが必要である。人工知能(AI)は、これらのプロセスを捕捉し、その振る舞いを分析し、ネットワーク上での効果をモデル化する経路を提供する。本稿では,統合衛星ネットワーク,特にメガ衛星ネットワーク通信におけるai技術の適用について紹介する。メガ衛星ネットワークのユニークな特徴と、現在の通信インフラへの統合に伴う全体的な課題を詳述している。さらに、このアーティクルは、コミュニケーションリンクのさまざまなレイヤにわたる最先端のAI技術に関する洞察を提供する。これは、高度にダイナミックな無線チャネルの予測、スペクトル検出と分類、信号検出と復調、衛星間リンクと衛星アクセスネットワークの最適化、ネットワークセキュリティのためのaiの適用を含む。さらに,今後のパラダイムと,それらの機構の実用ネットワークへのマッピングについて概説する。

Space communications, particularly mega satellite networks, re-emerged as an appealing candidate for next generation networks due to major advances in space launching, electronics, processing power, and miniaturization. However, mega satellite networks rely on numerous underlying and intertwined processes that cannot be truly captured using conventionally used models, due to their dynamic and unique features such as orbital speed, inter-satellite links, short time pass, and satellite footprint, among others. Hence, new approaches are needed to enable the network to proactively adjust to the rapidly varying conditions associated within the link. Artificial intelligence (AI) provides a pathway to capture these processes, analyze their behavior, and model their effect on the network. This article introduces the application of AI techniques for integrated terrestrial satellite networks, particularly mega satellite network communications. It details the unique features of mega satellite networks, and the overarching challenges concomitant with their integration into the current communication infrastructure. Moreover, the article provides insights into state-of-the-art AI techniques across various layers of the communication link. This entails applying AI for forecasting the highly dynamic radio channel, spectrum sensing and classification, signal detection and demodulation, inter-satellite link and satellite access network optimization, and network security. Moreover, future paradigms and the mapping of these mechanisms onto practical networks are outlined.

翻訳日:2023-05-17 01:12:00 公開日:2023-05-14

# 量子コンピュータの摂動理論法に向けて

Towards Perturbation Theory Methods on a Quantum Computer ( http://arxiv.org/abs/2206.14955v2 )

ライセンス: Link先を確認

Junxu Li, Barbara A. Jones and Sabre Kais

(参考訳) 摂動理論(PT)は物理学者と化学者の両方にとって最も強力で実りの多い道具の1つであり、原子物理学と亜原子物理学の開花による応用の爆発を引き起こした。今日ptはよく使われているが、ptの技術は量子コンピューティングにおいて著しく不足している。本稿では,pt法を用いてエネルギーと固有状態の両方の補正を推定する量子回路を提案する。本手法は,qiskitに基づく数値シミュレーションが提案されている拡張ハバードモデルの適用によりさらに実証される。一般的な量子変動回路とは異なり、我々の回路にはトレーニングや最適化のプロセスはなく、全てのパラメータは未摂動ハミルトニアンから導かれる。我々の研究は、PTに基づくより複雑な手法の量子的実装に光を当てるかもしれない量子デバイスを用いた複雑なシステムの研究の新しいアプローチを提供する。

Perturbation theory (PT) might be one of the most powerful and fruitful tools for both physicists and chemists, which evoked an explosion of applications with the blooming of atomic and subatomic physics. Even though PT is widely used today, techniques for PT are significantly lacking in quantum computing. Here we present a quantum circuit estimating both the energy and eigenstate corrections with PT methods, which we claim is far superior to the classical version when estimating the second-order energy correction. Our approach is further demonstrated with an application to the extended Hubbard model, where numerical simulation based on Qiskit is also presented. Unlike the popular quantum variational circuit, there is no training or optimizing process in our circuit, and all parameters are derived from the unperturbed Hamiltonian. Our work offers a new approach to studying complex systems with quantum devices, which might shed light on the quantum implementation of the more intricate methods based on PT.

翻訳日:2023-05-17 01:11:39 公開日:2023-05-14
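
For context, the second-order energy correction mentioned in the abstract above has a well-known textbook form in non-degenerate Rayleigh-Schrödinger perturbation theory, sketched below. The notation ($H_0$, $V$, $\lambda$, $|n^{(0)}\rangle$, $E_n^{(0)}$) is assumed here for illustration and is not taken from the paper; the paper's circuit construction is not reproduced.

```latex
H = H_0 + \lambda V, \qquad H_0 \, |n^{(0)}\rangle = E_n^{(0)} \, |n^{(0)}\rangle

E_n^{(1)} = \langle n^{(0)} | V | n^{(0)} \rangle

E_n^{(2)} = \sum_{m \neq n} \frac{\left| \langle m^{(0)} | V | n^{(0)} \rangle \right|^2}{E_n^{(0)} - E_m^{(0)}}
```

Evaluated directly, the second-order term is a sum over all unperturbed eigenstates $m \neq n$, which is presumably the classical cost the abstract compares against.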

# CASE: 共感的応答生成のための粗から細への認知と感情のアラインメント

CASE: Aligning Coarse-to-Fine Cognition and Affection for Empathetic Response Generation ( http://arxiv.org/abs/2208.08845v2 )

ライセンス: Link先を確認

Jinfeng Zhou, Chujie Zheng, Bo Wang, Zheng Zhang, Minlie Huang

(参考訳) 共感的会話は、意識的なアライメントと共感の認知と感情の相互作用の結果であると考えられている。しかし、既存の共感的対話モデルは、通常、感情的側面のみを考慮し、孤立して認知と愛情を扱い、共感的反応生成の能力を制限する。本研究では,共感的対話生成のためのCASEモデルを提案する。まず、コモンセンス認知グラフと感情概念グラフの上に構築され、その後、粗粒度と細粒度の両方でユーザの認知と感情を調整します。自動的および手作業による評価により,情緒対話の最先端ベースラインを上回っており,共感的かつ情報的な反応を発生できることを示す。

Empathetic conversation is psychologically supposed to be the result of conscious alignment and interaction between the cognition and affection of empathy. However, existing empathetic dialogue models usually consider only the affective aspect or treat cognition and affection in isolation, which limits the capability of empathetic response generation. In this work, we propose the CASE model for empathetic dialogue generation. It first builds upon a commonsense cognition graph and an emotional concept graph and then aligns the user's cognition and affection at both the coarse-grained and fine-grained levels. Through automatic and manual evaluation, we demonstrate that CASE outperforms state-of-the-art baselines of empathetic dialogues and can generate more empathetic and informative responses.

翻訳日:2023-05-17 01:03:00 公開日:2023-05-14

# 統計力学のエントロピー特性と一致する量子無知に対する微分幾何学的アプローチ

A Differential-Geometric Approach to Quantum Ignorance Consistent with Entropic Properties of Statistical Mechanics ( http://arxiv.org/abs/2208.04134v4 )

ライセンス: Link先を確認

Shannon Ray, Paul M. Alsing, Carlo Cafaro, Shelton Jacinto

(参考訳) 本稿では、任意の還元密度演算子$\rho_S$に付随する純粋化の多様体の計量テンソルと体積を構築する。また、この体積を調べるために量子粗視化(CG)を定義する。ここで、マクロ状態は無知の表面(SOI)と呼ぶ純粋化の多様体であり、ミクロ状態は$\rho_S$の純粋化である。この文脈では、体積は$\rho_S$から欠落している情報の量を定量化するマクロ状態の多重度として機能する。SOIが$SU(2)$、$SO(3)$、および$SO(N)$の表現を使って生成される例を用いて、CGの2つの特徴を示す。(1) より小さい体積の非典型的なマクロ状態から始まる系は、系と環境のエンタングルメントが厳密に増大する過程において、平衡マクロ状態に達するまでより大きな体積のマクロ状態へと発展し、(2) 平衡マクロ状態は、特に全体系の次元が大きくなるにつれて、粗視化された空間の大部分を占める。ここで、平衡マクロ状態は、系と環境の間の最大エンタングルメントに対応する。検討した例について特徴(1)を示すために、体積がフォン・ノイマンエントロピーのように振る舞うこと、すなわち純粋状態ではゼロ、最大混合状態では最大であり、$\rho_S$の純度に関して凹関数であることを示す。これら2つの特徴は、熱化とボルツマンのオリジナルのCGに関する典型性の議論に不可欠である。

In this paper, we construct the metric tensor and volume for the manifold of purifications associated with an arbitrary reduced density operator $\rho_S$. We also define a quantum coarse-graining (CG) to study the volume where macrostates are the manifolds of purifications, which we call surfaces of ignorance (SOI), and microstates are the purifications of $\rho_S$. In this context, the volume functions as a multiplicity of the macrostates that quantifies the amount of information missing from $\rho_S$. Using examples where the SOI are generated using representations of $SU(2)$, $SO(3)$, and $SO(N)$, we show two features of the CG. (1) A system beginning in an atypical macrostate of smaller volume evolves to macrostates of greater volume until it reaches the equilibrium macrostate in a process in which the system and environment become strictly more entangled, and (2) the equilibrium macrostate takes up the vast majority of the coarse-grained space, especially as the dimension of the total system becomes large. Here, the equilibrium macrostate corresponds to maximum entanglement between system and environment. To demonstrate feature (1) for the examples considered, we show that the volume behaves like the von Neumann entropy in that it is zero for pure states, maximal for maximally mixed states, and is a concave function with respect to the purity of $\rho_S$. These two features are essential to typicality arguments regarding thermalization and Boltzmann's original CG.

翻訳日:2023-05-17 01:01:30 公開日:2023-05-14

# 線形回帰の量子通信複雑性

Quantum communication complexity of linear regression ( http://arxiv.org/abs/2210.01601v2 )

ライセンス: Link先を確認

Ashley Montanaro and Changpeng Shao

(参考訳) 量子コンピュータは、線形代数問題の解法において古典計算機よりも高速化を達成できる可能性がある。しかし、例えば低ランク行列の場合のように、脱量子化アルゴリズムによって指数関数的な量子スピードアップが存在し得ないことが示されている場合もある。本研究では、ランクに制限がない場合、量子コンピュータがいくつかの基本的な線形代数問題に対して、通信複雑性の観点から証明可能な多項式および指数的高速化を持つことを示す。主に線形回帰とハミルトニアンシミュレーションの解法に焦点をあてる。量子の場合、タスクは結果の量子状態を準備することである。比較を公平にするために、古典的な場合、タスクは結果からサンプリングすることである。本研究では、これら2つの問題を二者モデルと多者モデルで検討し、準最適な量子プロトコルを提案し、量子・古典の下界を証明する。その過程で、量子アルゴリズム設計のための強力な手法である量子特異値変換のための効率的な量子プロトコルを提案する。これは、他の多くの問題に対する効率的な量子プロトコルの開発に役立つだろう。

Quantum computers may achieve speedups over their classical counterparts for solving linear algebra problems. However, in some cases, such as for low-rank matrices, dequantized algorithms demonstrate that there cannot be an exponential quantum speedup. In this work, we show that quantum computers have provable polynomial and exponential speedups in terms of communication complexity for some fundamental linear algebra problems if there is no restriction on the rank. We mainly focus on solving linear regression and Hamiltonian simulation. In the quantum case, the task is to prepare the quantum state of the result. To allow for a fair comparison, in the classical case, the task is to sample from the result. We investigate these two problems in two-party and multiparty models, propose near-optimal quantum protocols and prove quantum/classical lower bounds. In this process, we propose an efficient quantum protocol for quantum singular value transformation, which is a powerful technique for designing quantum algorithms. This will be helpful in developing efficient quantum protocols for many other problems.

翻訳日:2023-05-17 00:43:31 公開日:2023-05-14

# 事前学習された言語モデルがゼロショット学習に役立つ理由

What Makes Pre-trained Language Models Better Zero-shot Learners? ( http://arxiv.org/abs/2209.15206v2 )

ライセンス: Link先を確認

Jinghui Lu, Dongsheng Zhu, Weidong Han, Rui Zhao, Brian Mac Namee, Fei Tan

(参考訳) 本稿では,ゼロ/ファウショットシナリオにおける即時学習の有効性を説明する理論的枠組みを提案する。まず、従来の事前学習および微調整のパラダイムは、表現できないラベル付きデータに過度に適合するため、少数のシナリオで失敗することを証明する。そこで本研究では,大量のテキストコーパス上に構築された事前学習言語モデルと,ドメイン関連の人的知識を活用して予測にもっと参加し,小さなトレーニングセットによって提供される限定ラベル情報の影響を低減することにより,迅速な学習がより効果的であるという仮定を詳述する。さらに、言語不一致がプロンプトの質を測定することができると仮定する。仮定を検証するために包括的な実験が行われる。さらに,理論的な枠組みに触発されて,パープレキシティに基づくアノテーションに依存しないテンプレート選択手法を提案する。既存の作業は、まだテンプレートを評価するために開発セットに依存しているため、このアプローチは特に奨励されます。実験により、この手法は最先端のゼロショット法に比べて大きな予測効果をもたらすことが示された。

In this paper, we propose a theoretical framework to explain the efficacy of prompt learning in zero/few-shot scenarios. First, we prove that the conventional pre-training and fine-tuning paradigm fails in few-shot scenarios because it overfits the unrepresentative labelled data. We then detail the assumption that prompt learning is more effective because it empowers the pre-trained language model, which is built upon massive text corpora, as well as domain-related human knowledge, to participate more in prediction and thereby reduces the impact of the limited label information provided by the small training set. We further hypothesize that language discrepancy can measure the quality of prompting. Comprehensive experiments are performed to verify our assumptions. More remarkably, inspired by the theoretical framework, we propose an annotation-agnostic template selection method based on perplexity, which enables us to "forecast" the prompting performance in advance. This approach is especially encouraging because existing work still relies on a development set to evaluate templates post hoc. Experiments show that this method leads to significant prediction benefits compared to state-of-the-art zero-shot methods.

翻訳日:2023-05-17 00:42:58 公開日:2023-05-14
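
As a rough illustration of the perplexity-based template selection described in the abstract above, the following minimal sketch ranks prompt templates by the mean perplexity of their filled prompts and picks the lowest. It assumes that a lower-perplexity template is the better candidate and that per-token log-probabilities have already been obtained from some language model; the function names, template strings, and numbers are hypothetical and are not taken from the paper.

```python
import math


def perplexity(token_logprobs):
    """Perplexity = exp of the negative mean per-token (natural) log-probability."""
    if not token_logprobs:
        raise ValueError("need at least one token log-probability")
    return math.exp(-sum(token_logprobs) / len(token_logprobs))


def select_template(template_logprobs):
    """Return the template whose filled prompts have the lowest mean perplexity.

    template_logprobs maps a template string to a list of prompts, each prompt
    being a list of per-token log-probabilities (assumed precomputed).
    """
    def mean_ppl(prompts):
        return sum(perplexity(lp) for lp in prompts) / len(prompts)

    return min(template_logprobs, key=lambda name: mean_ppl(template_logprobs[name]))


# Hypothetical log-probabilities for two candidate templates (illustrative only).
scores = {
    "It was [MASK].": [[-1.2, -0.8, -1.0], [-0.9, -1.1]],
    "Sentiment: [MASK]": [[-2.5, -2.0, -3.0], [-2.2, -2.8]],
}
best = select_template(scores)
```

No labelled development set is needed for this ranking, which is the property the abstract emphasizes; how the log-probabilities are produced is left open here.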

# 力学系のニューラルネットワーク積分器の厳密な保存則

Exact conservation laws for neural network integrators of dynamical systems ( http://arxiv.org/abs/2209.11661v2 )

ライセンス: Link先を確認

Eike Hermann Müller

(参考訳) 近年,ニューラルネットワークを用いた時間依存微分方程式の解法が注目されている。中心となる考え方は、ランダムノイズによって汚染される可能性のあるデータから解の進化を管理する法則を学ぶことである。しかし、他の機械学習アプリケーションとは対照的に、システムについては通常多くのことが知られている。例えば、多くの力学系において、エネルギーや(角運動量のような)物理量は正確に保存される。したがって、ニューラルネットワークはデータからこれらの保存則を学習しなければならず、有限なトレーニング時間とランダムノイズによってのみ満足できる。本稿では,ニューラルネットワークのアーキテクチャに保存則を内在的に組み込むために,ネーターの定理を用いた代替手法を提案する。これは3次元ニュートン重力ポテンシャルにおける非相対論的粒子の運動、シュワルツシルト計量における大規模相対論的粒子の運動、および4次元で相互作用する2つの粒子の系である。

The solution of time dependent differential equations with neural networks has attracted a lot of attention recently. The central idea is to learn the laws that govern the evolution of the solution from data, which might be polluted with random noise. However, in contrast to other machine learning applications, usually a lot is known about the system at hand. For example, for many dynamical systems physical quantities such as energy or (angular) momentum are exactly conserved. Hence, the neural network has to learn these conservation laws from data and they will only be satisfied approximately due to finite training time and random noise. In this paper we present an alternative approach which uses Noether's Theorem to inherently incorporate conservation laws into the architecture of the neural network. We demonstrate that this leads to better predictions for three model systems: the motion of a non-relativistic particle in a three-dimensional Newtonian gravitational potential, the motion of a massive relativistic particle in the Schwarzschild metric and a system of two interacting particles in four dimensions.

翻訳日:2023-05-17 00:41:59 公開日:2023-05-14

# ニューラルネットワーク検証におけるTighter Abstract Queries

Tighter Abstract Queries in Neural Network Verification ( http://arxiv.org/abs/2210.12871v2 )

ライセンス: Link先を確認

Elazar Cohen, Yizhak Yisrael Elboher, Clark Barrett, Guy Katz

(参考訳) ニューラルネットワークは、コンピュータサイエンスにおけるさまざまな領域におけるリアクティブシステムの重要な構成要素となっている。優れたパフォーマンスにもかかわらず、ニューラルネットワークを使用することは、私たちの行動を理解し、判断する能力の欠如に起因する多くのリスクを伴います。これらのリスクのため、ニューラルネットワークの検証には様々な形式的手法が提案されているが、残念ながらスケーラビリティの障壁に苦しむことが多い。最近の試みでは、これらの制限を緩和する上で、抽象化-制限アプローチが重要な役割を果たすことが示されているが、これらのアプローチは、しばしば、非常に抽象的なネットワークを生成し、検証に適さないものとなる。この問題に対処するため,システムとプロパティを同時に抽象化・洗練する新しい検証機構であるCEGARETTEを提案する。このアプローチによって,小型かつ十分に正確な抽象ネットワークを作成でき,多数の改良ステップを回避しつつ,迅速な検証時間を確保できることがわかった。評価のために,最近提案された CEGAR-NN フレームワークの拡張として CEGARETTE を実装した。私たちの結果は有望であり、複数のベンチマークに対するパフォーマンスの大幅な改善を示しています。

Neural networks have become critical components of reactive systems in various domains within computer science. Despite their excellent performance, using neural networks entails numerous risks that stem from our lack of ability to understand and reason about their behavior. Due to these risks, various formal methods have been proposed for verifying neural networks; but unfortunately, these typically struggle with scalability barriers. Recent attempts have demonstrated that abstraction-refinement approaches could play a significant role in mitigating these limitations; but these approaches can often produce networks that are so abstract, that they become unsuitable for verification. To deal with this issue, we present CEGARETTE, a novel verification mechanism where both the system and the property are abstracted and refined simultaneously. We observe that this approach allows us to produce abstract networks which are both small and sufficiently accurate, allowing for quick verification times while avoiding a large number of refinement steps. For evaluation purposes, we implemented CEGARETTE as an extension to the recently proposed CEGAR-NN framework. Our results are very promising, and demonstrate a significant improvement in performance over multiple benchmarks.

翻訳日:2023-05-17 00:35:43 公開日:2023-05-14

# 男女のアニムスは、良好な影響下でも存続できる:オンラインp2pローンの注意書き

Gender Animus Can Still Exist Under Favorable Disparate Impact: a Cautionary Tale from Online P2P Lending ( http://arxiv.org/abs/2210.07864v3 )

ライセンス: Link先を確認

Xudong Shen, Tianhui Tan, Tuan Q. Phan, Jussi Keppo

(参考訳) 本稿では,中国の著名なオンラインピアツーピア(p2p)レンディングプラットフォーム上での性差別とその基盤となるドライバについて検討する。 P2P貸与に関する既存の研究は、異種治療(DT)に焦点を当てているが、DTは直接的差別を狭く認識し、間接的および代理的差別を見落とし、不完全な画像を提供する。本研究では,実際のリターン率に合致しないローンの融資率の差を包含する,分散インパクト(di)と呼ばれる幅広い差別概念を測定した。観測データからdiを推定する2段階予測器置換手法を開発した。私たちの発見は (i)女性借り手は、同じ実利率で、資金を受け取る確率が3.97%高い。 (ii)このdiの少なくとも37.1%は、間接的又は代理的差別であり、 (iii)DTは女性全体の嗜好を44.6%過小評価している。また, 投資家が不完全な観察から期待したリターン率を正確に予測する「合理的統計的識別」によって, 女性の好意性全般が説明できることを示す。さらに、女性の借り手は資金確保に2%高いリターン率を必要としており、別のドライバーの味覚に基づく差別が共存し、女性に対するものであることを示している。これらの結果は、P2P貸与は、女性を支持する肯定的な行動が合理的な群衆から自然に現れる価値ある代替信用市場を提供する一方、全体的な差別効果(DIまたはDTの両方)が女性に有利である一方で、味に基づく差別は持続し、統計的差別など既存の他の差別ドライバーによって隠蔽される可能性がある。

This paper investigates gender discrimination and its underlying drivers on a prominent Chinese online peer-to-peer (P2P) lending platform. While existing studies on P2P lending focus on disparate treatment (DT), DT narrowly recognizes direct discrimination and overlooks indirect and proxy discrimination, providing an incomplete picture. In this work, we measure a broadened discrimination notion called disparate impact (DI), which encompasses any disparity in the loan's funding rate that is not commensurate with the actual return rate. We develop a two-stage predictor substitution approach to estimate DI from observational data. Our findings reveal that (i) female borrowers, given identical actual return rates, are 3.97% more likely to receive funding, (ii) at least 37.1% of this DI favoring female borrowers is indirect or proxy discrimination, and (iii) DT indeed underestimates the overall female favoritism by 44.6%. However, we also find that the overall female favoritism can be explained by one specific discrimination driver, rational statistical discrimination, wherein investors accurately predict the expected return rate from imperfect observations. Furthermore, female borrowers still require a 2% higher expected return rate to secure funding, indicating that another driver, taste-based discrimination, co-exists and works against female borrowers. These results altogether tell a cautionary tale: on the one hand, P2P lending provides a valuable alternative credit market where affirmative action supporting female borrowers naturally emerges from the rational crowd; on the other hand, while the overall discrimination effect (in terms of both DI and DT) favors female borrowers, concerning taste-based discrimination can persist and can be obscured by other co-existing discrimination drivers, such as statistical discrimination.

翻訳日:2023-05-17 00:34:06 公開日:2023-05-14

# KALM:長期文書理解のためのローカル・ドキュメント・グローバルコンテキストの知識認識統合

KALM: Knowledge-Aware Integration of Local, Document, and Global Contexts for Long Document Understanding ( http://arxiv.org/abs/2210.04105v2 )

ライセンス: Link先を確認

Shangbin Feng, Zhaoxuan Tan, Wenqian Zhang, Zhenyu Lei, Yulia Tsvetkov

(参考訳) 事前訓練された言語モデル(LM)の出現に伴い、下流タスクのためのLMを作成するために、コモンセンスとドメイン固有の知識を注入することに注力する研究が増えている。これらの研究は、事前訓練されたLMとともに、記号的知識表現のデファクト標準である知識グラフを活用する。既存のアプローチでは外部の知識を活用しているが、ローカル(例えば文)からドキュメントレベル、グローバル知識まで、さまざまなコンテキストを表す知識グラフを共同で組み込む方法が疑問視されている。このようなリッチな文脈化は、標準の事前訓練されたLMが典型的には入力シーケンス長によって拘束されるため、長い文書理解タスクに特に有用である。これらの課題を踏まえて,長文理解のためのローカル,文書レベル,グローバルコンテキストの知識を協調的に活用する知識認識言語モデルであるKALMを提案する。 KALMはまず、長いドキュメントと知識グラフを3つの知識認識コンテキスト表現にエンコードする。その後、各コンテキストをコンテキスト固有のレイヤで処理し、次いでコンテキスト融合層によって知識交換を促進し、包括的なドキュメント表現を導出する。大規模な実験により、KALMは6つの長い文書理解タスクとデータセットで最先端のパフォーマンスを達成する。さらなる分析により、3つの知識認識コンテキストは相補的であり、それらは全てモデルのパフォーマンスに寄与し、異なるコンテキストの重要度と情報交換パターンは異なるタスクとデータセットに関して異なることが判明した。

With the advent of pretrained language models (LMs), increasing research efforts have been focusing on infusing commonsense and domain-specific knowledge to prepare LMs for downstream tasks. These works attempt to leverage knowledge graphs, the de facto standard of symbolic knowledge representation, along with pretrained LMs. While existing approaches have leveraged external knowledge, it remains an open question how to jointly incorporate knowledge graphs representing varying contexts, from local (e.g., sentence), to document-level, to global knowledge, to enable knowledge-rich exchange across these contexts. Such rich contextualization can be especially beneficial for long document understanding tasks since standard pretrained LMs are typically bounded by the input sequence length. In light of these challenges, we propose KALM, a Knowledge-Aware Language Model that jointly leverages knowledge in local, document-level, and global contexts for long document understanding. KALM first encodes long documents and knowledge graphs into the three knowledge-aware context representations. It then processes each context with context-specific layers, followed by a context fusion layer that facilitates knowledge exchange to derive an overarching document representation. Extensive experiments demonstrate that KALM achieves state-of-the-art performance on six long document understanding tasks and datasets. Further analyses reveal that the three knowledge-aware contexts are complementary and they all contribute to model performance, while the importance and information exchange patterns of different contexts vary with respect to different tasks and datasets.

翻訳日:2023-05-17 00:33:34 公開日:2023-05-14

# 制約付き最小値最適化のための高速化シングルコール法

Accelerated Single-Call Methods for Constrained Min-Max Optimization ( http://arxiv.org/abs/2210.03096v2 )

ライセンス: Link先を確認

Yang Cai, Weiqiang Zheng

(参考訳) 制約最小値最適化のための一階法について検討する。既存のメソッドは、各イテレーションで2つのグラデーションコールまたは2つのプロジェクションを必要とする。本稿では,単射単射影アルゴリズムである楽観的勾配法 (og) の変種が,弱いミント変分不等式 (mvi) を満たす演算子を持つ包含問題に対して$o(\frac{1}{\sqrt{t}})$ のベストイテレート収束率を持つことを示す。第二の結果は、最初の単呼単射アルゴリズムである Accelerated Reflected Gradient (ARG) 法であり、負のコモノトニック性を満たす包摂問題に対する最適$O(\frac{1}{T})$の最終点収束率を達成する。弱いMVIと負のコモノトニック性はともによく研究された仮定であり、非凸なmin-max最適化問題のリッチな集合を捉えている。最後に,single-call single-projectionアルゴリズムであるreflection gradient (rg)法は,制約付き凸凸凸min-max最適化に対して$o(\frac{1}{\sqrt{t}})$ last-iterate convergence rateを持つことを示す。我々の収束率は、接残差や自然残差などの標準測度を定めている。

We study first-order methods for constrained min-max optimization. Existing methods require either two gradient calls or two projections in each iteration, which may be costly in some applications. In this paper, we first show that a variant of the Optimistic Gradient (OG) method, a single-call single-projection algorithm, has an $O(\frac{1}{\sqrt{T}})$ best-iterate convergence rate for inclusion problems with operators that satisfy the weak Minty variational inequality (MVI). Our second result is the first single-call single-projection algorithm, the Accelerated Reflected Gradient (ARG) method, that achieves the optimal $O(\frac{1}{T})$ last-iterate convergence rate for inclusion problems that satisfy negative comonotonicity. Both the weak MVI and negative comonotonicity are well-studied assumptions and capture a rich set of non-convex non-concave min-max optimization problems. Finally, we show that the Reflected Gradient (RG) method, another single-call single-projection algorithm, has an $O(\frac{1}{\sqrt{T}})$ last-iterate convergence rate for constrained convex-concave min-max optimization, answering an open problem of [Hsieh et al., 2019]. Our convergence rates hold for standard measures such as the tangent residual and the natural residual.

翻訳日:2023-05-17 00:33:06 公開日:2023-05-14
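
To make the "single-call single-projection" idea in the abstract above concrete, here is a minimal sketch of one common textbook form of the projected optimistic gradient update, applied to a toy problem. It is not the specific OG variant, ARG, or RG method analyzed in the paper. The toy objective $f(x, y) = xy$ on the box $[-1, 1]^2$, the step size, and all function names are assumptions chosen for illustration.

```python
def clip(v, lo=-1.0, hi=1.0):
    """Euclidean projection of a scalar onto the interval [lo, hi]."""
    return max(lo, min(hi, v))


def operator(x, y):
    """F(x, y) = (df/dx, -df/dy) for the toy objective f(x, y) = x * y."""
    return (y, -x)


def optimistic_gradient(x0, y0, step=0.3, iters=500):
    """Projected optimistic gradient: z_{t+1} = Proj(z_t - step * (2 F(z_t) - F(z_{t-1})))."""
    x, y = x0, y0
    prev = operator(x, y)  # stands in for F(z_{-1}) on the first iteration
    for _ in range(iters):
        cur = operator(x, y)  # the only operator (gradient) call in this iteration
        # One projection per iteration (coordinate-wise clip onto the box).
        x, y = (
            clip(x - step * (2 * cur[0] - prev[0])),
            clip(y - step * (2 * cur[1] - prev[1])),
        )
        prev = cur  # reuse this evaluation next iteration instead of calling F again
    return x, y


x_final, y_final = optimistic_gradient(0.5, 0.5)
```

Each iteration reuses the previous operator evaluation rather than computing a second one, which is what distinguishes this style of method from extragradient-type methods that need two gradient calls per step. On this toy problem the iterates approach the saddle point at the origin.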

# RITA: インタラクティブな交通の流れで自動運転シミュレータを強化

RITA: Boost Autonomous Driving Simulators with Realistic Interactive Traffic Flow ( http://arxiv.org/abs/2211.03408v4 )

ライセンス: Link先を確認

Zhengbang Zhu, Shenyu Zhang, Yuzheng Zhuang, Yuecheng Liu, Minghuan Liu, Liyuan Mao, Ziqin Gong, Weinan Zhang, Shixiong Kai, Qiang Gu, Bin Wang, Siyuan Cheng, Xinyu Wang, Jianye Hao and Yong Yu

(参考訳) 高品質な交通フロー生成は、自動運転シミュレータ構築における中核モジュールである。しかし、利用可能なシミュレータのほとんどは、実世界のデータの様々な特徴を正確に反映したトラフィックパターンを複製することができず、テストされたオートパイロット駆動戦略に対する人間のような反応をシミュレートすることができない。このような問題に対処するために,既存の運転シミュレータの統合コンポーネントとしてRealistic Interactive TrAffic Flow (RITA)を提案する。 RITAは3つの重要な特徴、すなわち忠実さ、多様性、制御性を考慮して開発され、RITABackendとRITAKitと呼ばれる2つのコアモジュールで構成されている。 RITABackendは実世界のデータセットからトラフィック生成モデルを提供するために構築されており、RITAKitはRITABackendを介して制御可能なトラフィック生成のための使いやすいインターフェースで開発されている。本稿では,多種多様かつ高忠実な交通シミュレーションを実現するRITAの能力について述べる。実験の結果, 生成したRITAトラヒックフローは3つの重要な特徴を全て示し, 運転戦略評価の完全性を高めた。さらに、RITAトラフィックフローを用いたオンライン微調整によるベースライン戦略の改善の可能性を示す。

High-quality traffic flow generation is the core module in building simulators for autonomous driving. However, the majority of available simulators are incapable of replicating traffic patterns that accurately reflect the various features of real-world data while also simulating human-like reactive responses to the tested autopilot driving strategies. Taking one step forward to addressing such a problem, we propose Realistic Interactive TrAffic flow (RITA) as an integrated component of existing driving simulators to provide high-quality traffic flow for the evaluation and optimization of the tested driving strategies. RITA is developed with consideration of three key features, i.e., fidelity, diversity, and controllability, and consists of two core modules called RITABackend and RITAKit. RITABackend is built to support vehicle-wise control and provide traffic generation models from real-world datasets, while RITAKit is developed with easy-to-use interfaces for controllable traffic generation via RITABackend. We demonstrate RITA's capacity to create diversified and high-fidelity traffic simulations in several highly interactive highway scenarios. The experimental findings demonstrate that our produced RITA traffic flows exhibit all three key features, hence enhancing the completeness of driving strategy evaluation. Moreover, we showcase the possibility for further improvement of baseline strategies through online fine-tuning with RITA traffic flows.

翻訳日:2023-05-17 00:25:24 公開日:2023-05-14

# 量子コンピューティングにおける学生の強みと困難

Investigating students' strengths and difficulties in quantum computing ( http://arxiv.org/abs/2212.03726v3 )

ライセンス: Link先を確認

Tunde Kushimo and Beth Thacker

(参考訳) 量子コンピューティングは、情報理論、計算機科学、数学、量子物理学から情報を根本的に新しい方法で処理するエキサイティングな分野である。実用的な量子コンピュータを開発し、量子労働力を増やすための競争が進行中である。これは、量子コンピューティングプログラム、コース、カリキュラムの開発と、次世代の量子情報科学者の教育を支援するためのエビデンスに基づく教育材料の開発と相まって行う必要がある。量子コンピューティングの入門コースを大学生に導入し,入門コースを受講した学生の量子コンピューティングにおける強みと難易度について検討した。我々のゴールは、学生が理解し易いトピックと理解し難いトピックを理解しながら、量子コンピューティング教育の改善に貢献することである。我々は,学生の強みと難しさを明らかにするために,一連のインタビューを行った。我々は、これらのインタビューの結果と、量子コンピューティング入門コースを教えるためのエビデンスベース教材の開発に関する初期研究について報告する。

Quantum Computing is an exciting field that draws from information theory, computer science, mathematics, and quantum physics to process information in fundamentally new ways. There is an ongoing race to develop practical quantum computers and increase the quantum workforce. This needs to be accompanied by the development of quantum computing programs, courses, and curricula coupled with the development of evidence-based pedagogical materials to support the education of the next generation of quantum information scientists. We introduced an introductory course in quantum computing to undergraduate students and investigated the strengths and difficulties of these students in quantum computing after taking the introductory course. Our goal is to contribute to the improvement of quantum computing education while understanding the topics that the students find easy to comprehend and the topics that are difficult to comprehend. We conducted a series of interviews to identify these strengths and difficulties in the students. We report on the results of these interviews and our initial work on the development of evidence-based materials for teaching an introductory course in quantum computing.

翻訳日:2023-05-17 00:05:52 公開日:2023-05-14

# MUS-CDB:空中物体検出におけるアクティブアノテーションのためのクラス分散バランス付き混合不確かさサンプリング

MUS-CDB: Mixed Uncertainty Sampling with Class Distribution Balancing for Active Annotation in Aerial Object Detection ( http://arxiv.org/abs/2212.02804v3 )

ライセンス: Link先を確認

Dong Liang and Jing-Wei Zhang and Ying-Peng Tang and Sheng-Jun Huang

(参考訳) 最近の航空物体検出モデルは、大量のラベル付き訓練データに依存しており、密集した物体を持つ大きな空中シーンでは、望ましくない手動ラベリングコストを必要とする。アクティブラーニングは、情報的および代表的未ラベルサンプルを選択的にクエリすることで、データラベリングコストの低減に有効である。しかし,既存のアクティブラーニング手法は,主にクラスバランスの設定と画像に基づく一般的な物体検出タスクのクエリが特徴であり,空域における長い尾のクラス分布や密集した小物体による空中物体検出のシナリオには適用できない。本稿では,コスト効率の高い空中物体検出のための新しい能動学習手法を提案する。具体的には、冗長で近視的なクエリを控えるために、オブジェクトの選択において、オブジェクトレベルとイメージレベルのインフォメーションの両方が考慮される。また、モデルトレーニングにおけるロングテールクラス分散問題を軽減するためにマイノリティオブジェクトを好むために、使いやすいクラスバランス基準が組み込まれている。問い合わせ情報を完全に活用するために,未発見画像領域における潜伏知識をマイニングするためのトレーニング損失を更に考案する。提案手法の有効性を検証するため,DOTA-v1.0およびDOTA-v2.0ベンチマークを用いて実験を行った。その結果,ラベリングコストの75%以上を削減でき,ベースラインや最先端のアクティブオブジェクト検出法と同等の性能が得られることがわかった。コードは https://github.com/ZJW700/MUS-CDB で公開されている。

Recent aerial object detection models rely on a large amount of labeled training data, which requires unaffordable manual labeling costs in large aerial scenes with dense objects. Active learning is effective in reducing the data labeling cost by selectively querying the informative and representative unlabelled samples. However, existing active learning methods mainly assume a class-balanced setting and image-based querying for generic object detection tasks, which makes them less applicable to the aerial object detection scenario due to the long-tailed class distribution and dense small objects in aerial scenes. In this paper, we propose a novel active learning method for cost-effective aerial object detection. Specifically, both object-level and image-level informativeness are considered in the object selection to refrain from redundant and myopic querying. Besides, an easy-to-use class-balancing criterion is incorporated to favor the minority objects to alleviate the long-tailed class distribution problem in model training. To fully utilize the queried information, we further devise a training loss to mine the latent knowledge in the undiscovered image regions. Extensive experiments are conducted on the DOTA-v1.0 and DOTA-v2.0 benchmarks to validate the effectiveness of the proposed method. The results show that it can save more than 75% of the labeling cost to reach the same performance compared to the baselines and state-of-the-art active object detection methods. Code is available at https://github.com/ZJW700/MUS-CDB.

翻訳日:2023-05-17 00:05:08 公開日:2023-05-14

# 物理インフォームドモデルに基づく強化学習

Physics-Informed Model-Based Reinforcement Learning ( http://arxiv.org/abs/2212.02179v4 )

ライセンス: Link先を確認

Adithya Ramesh, Balaraman Ravindran

(参考訳) ロボットのタスクに強化学習(RL)を適用する。従来のRLアルゴリズムの欠点の1つは、サンプル効率が悪いことである。サンプル効率を改善する1つのアプローチはモデルベースRLである。モデルに基づくRLアルゴリズムでは、その遷移力学と報酬関数のモデルを学び、それを仮想軌道生成に利用し、それらをバックプロパゲーションしてポリシーを更新し、モデルの微分可能性を活用する。直感的には、より正確なモデルを学ぶことで、モデルベースのrlパフォーマンスが向上するはずだ。近年,基礎となる物理構造を利用して,より深いニューラルネットワークに基づく物理系の力学モデル開発への関心が高まっている。接触なしで剛体運動を行うロボットシステムに焦点を当てる。モデルベースRLアルゴリズムの2つのバージョンを比較した。1つは標準のディープニューラルネットワークベースのダイナミックスモデル、もう1つはより正確な物理インフォームドニューラルネットワークベースのダイナミックスモデルである。モデルベースRLでは,数値誤差が急速に蓄積する初期条件に敏感な環境において,モデル精度が重要となることを示す。これらの環境では、物理に変形したアルゴリズムは平均回帰とサンプル効率が大幅に向上する。初期条件に敏感でない環境では、アルゴリズムのどちらのバージョンも同様の平均回帰を達成し、物理インフォームされたバージョンはより優れたサンプル効率を達成する。また, 困難な環境下では, 物理モデルに基づくrlは, ソフトアクタ-クリティックのような最先端のモデルフリーなrlアルゴリズムよりも, 平均回帰性能が向上することを示した。

We apply reinforcement learning (RL) to robotics tasks. One of the drawbacks of traditional RL algorithms has been their poor sample efficiency. One approach to improve the sample efficiency is model-based RL. In our model-based RL algorithm, we learn a model of the environment, essentially its transition dynamics and reward function, use it to generate imaginary trajectories and backpropagate through them to update the policy, exploiting the differentiability of the model. Intuitively, learning more accurate models should lead to better model-based RL performance. Recently, there has been growing interest in developing better deep neural network based dynamics models for physical systems, by utilizing the structure of the underlying physics. We focus on robotic systems undergoing rigid body motion without contacts. We compare two versions of our model-based RL algorithm, one which uses a standard deep neural network based dynamics model and the other which uses a much more accurate, physics-informed neural network based dynamics model. We show that, in model-based RL, model accuracy mainly matters in environments that are sensitive to initial conditions, where numerical errors accumulate fast. In these environments, the physics-informed version of our algorithm achieves significantly better average-return and sample efficiency. In environments that are not sensitive to initial conditions, both versions of our algorithm achieve similar average-return, while the physics-informed version achieves better sample efficiency. We also show that, in challenging environments, physics-informed model-based RL achieves better average-return than state-of-the-art model-free RL algorithms such as Soft Actor-Critic, as it computes the policy-gradient analytically, while the latter estimates it through sampling.

翻訳日:2023-05-17 00:04:10 公開日:2023-05-14

# 教師なし要約の再ランク付け

Unsupervised Summarization Re-ranking ( http://arxiv.org/abs/2212.09593v2 )

ライセンス: Link先を確認

Mathieu Ravaut, Shafiq Joty, Nancy Chen

(参考訳) PEGASUSのような抽象的な要約モデルは、タスク固有の事前学習目標の台頭に伴い、下流の要約タスクにおいて魅力的なゼロショット性能を提供する。しかし、そのような教師なしモデルの性能は、教師ありモデルにまだ大きく遅れをとっている。教師あり設定と同様に、これらのモデルが生成する要約候補の間で品質のばらつきが非常に大きい一方で、要約出力として保持される候補は1つだけである。本稿では、教師なしモデルと教師ありモデルの性能差を縮めることを目指し、教師なし方式で要約候補を再ランク付けすることを提案する。提案手法は、広く採用されている4つの要約ベンチマークにおいて、教師なしPEGASUSの相対平均ROUGEを最大7.27%、ChatGPTを最大6.86%改善し、30通りのゼロショット転移設定(あるデータセットで微調整し、別のデータセットで評価)の平均で7.51%(XSumからWikiHowへの転移では最大23.73%)の相対的改善を達成する。

With the rise of task-specific pre-training objectives, abstractive summarization models like PEGASUS offer appealing zero-shot performance on downstream summarization tasks. However, the performance of such unsupervised models still lags significantly behind their supervised counterparts. Similarly to the supervised setup, we notice a very high variance in quality among summary candidates from these models while only one candidate is kept as the summary output. In this paper, we propose to re-rank summary candidates in an unsupervised manner, aiming to close the performance gap between unsupervised and supervised models. Our approach improves the unsupervised PEGASUS by up to 7.27% and ChatGPT by up to 6.86% relative mean ROUGE across four widely-adopted summarization benchmarks, and achieves relative gains of 7.51% (up to 23.73% from XSum to WikiHow) averaged over 30 zero-shot transfer setups (finetuning on one dataset, evaluating on another).

翻訳日:2023-05-16 23:57:16 公開日:2023-05-14

# 時空間マップの直流および交流成分による顔面映像からの血液酸素飽和度推定

Blood Oxygen Saturation Estimation from Facial Video via DC and AC components of Spatio-temporal Map ( http://arxiv.org/abs/2212.07116v2 )

ライセンス: Link先を確認

Yusuke Akamatsu, Yoshifumi Onishi, Hitoshi Imaoka

(参考訳) 血液中の酸素濃度の指標である末梢血酸素飽和度(SpO2)は、最も重要な生理的パラメータの1つである。 SpO2は通常、パルスオキシメータを用いて測定されるが、顔や手の映像からの非接触SpO2推定方法が近年注目されている。本稿では,畳み込みニューラルネットワーク(CNN)を用いた顔映像からのSpO2推定手法を提案する。本手法は,SpO2推定の原理において重要である,顔映像のRGB信号から抽出した直流(DC)成分と交流(AC)成分を考慮したCNNモデルを構築する。具体的には,フィルタ処理を用いて時空間マップから直流および交流成分を抽出し,これらの成分からSpO2を予測するようCNNモデルを訓練する。また,畳み込み層によって直流および交流成分を抽出し,時空間マップから直接SpO2を予測するエンドツーエンドモデルを提案する。 50名の被験者の顔映像とSpO2データを用いた実験により,提案手法は現在の最先端のSpO2推定法よりも優れた推定性能が得られることが示された。

Peripheral blood oxygen saturation (SpO2), an indicator of oxygen levels in the blood, is one of the most important physiological parameters. Although SpO2 is usually measured using a pulse oximeter, non-contact SpO2 estimation methods from facial or hand videos have been attracting attention in recent years. In this paper, we propose an SpO2 estimation method from facial videos based on convolutional neural networks (CNN). Our method constructs CNN models that consider the direct current (DC) and alternating current (AC) components extracted from the RGB signals of facial videos, which are important in the principle of SpO2 estimation. Specifically, we extract the DC and AC components from the spatio-temporal map using filtering processes and train CNN models to predict SpO2 from these components. We also propose an end-to-end model that predicts SpO2 directly from the spatio-temporal map by extracting the DC and AC components via convolutional layers. Experiments using facial videos and SpO2 data from 50 subjects demonstrate that the proposed method achieves a better estimation performance than current state-of-the-art SpO2 estimation methods.
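The DC/AC split described above can be illustrated on a single colour-channel time series (one row of a spatio-temporal map). The abstract does not name the filters used, so the centred moving average below is an assumption chosen for simplicity: the smoothed signal plays the role of the DC component and the residual plays the role of the AC component.

```python
import math


def moving_average(signal, window):
    """Centred moving average; the window is truncated at the signal edges."""
    half = window // 2
    smoothed = []
    for i in range(len(signal)):
        lo, hi = max(0, i - half), min(len(signal), i + half + 1)
        smoothed.append(sum(signal[lo:hi]) / (hi - lo))
    return smoothed


def split_dc_ac(signal, window):
    """Return (dc, ac): the slowly varying baseline and the pulsatile residual."""
    dc = moving_average(signal, window)
    ac = [s - d for s, d in zip(signal, dc)]
    return dc, ac


# Synthetic example: constant baseline of 100 plus a pulse with an 11-sample period.
demo_signal = [100.0 + math.sin(2.0 * math.pi * i / 11.0) for i in range(110)]
demo_dc, demo_ac = split_dc_ac(demo_signal, window=11)
```

In a model like the one described, the two components would be fed to the CNN as separate inputs; that wiring is not shown here.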

翻訳日:2023-05-16 23:55:32 公開日:2023-05-14

# マルチエージェントネットワークシステムにおけるスケーラブル・サンプル分散ポリシー勾配アルゴリズム

Scalable and Sample Efficient Distributed Policy Gradient Algorithms in Multi-Agent Networked Systems ( http://arxiv.org/abs/2212.06357v2 )

ライセンス: Link先を確認

Xin Liu, Honghao Wei, Lei Ying

(参考訳) 本稿では,エージェントが受ける報酬は他のエージェントの状態に依存するが,次の状態はエージェント自身の現在の状態と行動のみに依存する,マルチエージェント強化学習(MARL)問題のクラスについて検討する。これをREward-Coupled Multi-Agent Reinforcement Learningの略としてREC-MARLと名付ける。 REC-MARLは、無線ネットワークにおけるリアルタイムアクセス制御や分散電力制御など、様々な重要な応用がある。本稿では,REC-MARLのための分散ポリシー勾配アルゴリズムを提案する。提案アルゴリズムは,2つの側面で分散されている。 (i)学習されるポリシーは、エージェントのローカル状態をそのローカルアクションにマッピングする分散ポリシーである。 (ii)学習・訓練が分散されており、各エージェントは自身の情報と隣接エージェントの情報に基づいてポリシーを更新する。学習アルゴリズムは定常ポリシーを達成し、その反復複雑性の上界は局所状態と行動の次元に依存する。無線ネットワークにおけるリアルタイムアクセス制御と電力制御に関する実験結果から,本手法のポリシーは最先端のアルゴリズムやよく知られたベンチマークを大きく上回ることがわかった。

This paper studies a class of multi-agent reinforcement learning (MARL) problems where the reward that an agent receives depends on the states of other agents, but the next state only depends on the agent's own current state and action. We name it REC-MARL standing for REward-Coupled Multi-Agent Reinforcement Learning. REC-MARL has a range of important applications such as real-time access control and distributed power control in wireless networks. This paper presents a distributed policy gradient algorithm for REC-MARL. The proposed algorithm is distributed in two aspects: (i) the learned policy is a distributed policy that maps a local state of an agent to its local action and (ii) the learning/training is distributed, during which each agent updates its policy based on its own and neighbors' information. The learned algorithm achieves a stationary policy and its iterative complexity bounds depend on the dimension of local states and actions. The experimental results of our algorithm for the real-time access control and power control in wireless networks show that our policy significantly outperforms the state-of-the-art algorithms and well-known benchmarks.

翻訳日:2023-05-16 23:54:38 公開日:2023-05-14

# ロバストicp初期化へのアプローチ

An approach to robust ICP initialization ( http://arxiv.org/abs/2212.05332v2 )

ライセンス: Link先を確認

Alexander Kolpakov, Michael Werman

(参考訳) 本稿では,剛体変換で関連付けられたラベルなし点群をマッチングするために,ICP(Iterative Closest Point)アルゴリズムを初期化する手法を提案する。この方法は、点の共分散行列で定義される楕円体をマッチングし、有限鏡映群の要素だけ異なる様々な主半軸の対応付けをテストする。ノイズに対する本手法のロバスト性の境界を導出し,数値実験により理論的な知見を確認した。

In this note, we propose an approach to initialize the Iterative Closest Point (ICP) algorithm to match unlabelled point clouds related by rigid transformations. The method is based on matching the ellipsoids defined by the points' covariance matrices and then testing the various principal half-axes matchings that differ by elements of a finite reflection group. We derive bounds on the robustness of our approach to noise and numerical experiments confirm our theoretical findings.
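A two-dimensional sketch of this initialization is given below. It is a simplification under stated assumptions: the note works with ellipsoids in general dimension, whereas here the covariance ellipse is 2x2, so the only sign ambiguity compatible with a proper rotation is a half-turn. Scoring each candidate by a nearest-neighbour cost is also an assumption about how the candidate matchings are "tested". All function names are hypothetical.

```python
import math


def centroid(pts):
    n = len(pts)
    return (sum(x for x, _ in pts) / n, sum(y for _, y in pts) / n)


def principal_angle(pts):
    """Angle of the major axis of the 2x2 covariance ellipse (defined modulo pi)."""
    cx, cy = centroid(pts)
    sxx = sum((x - cx) ** 2 for x, _ in pts)
    syy = sum((y - cy) ** 2 for _, y in pts)
    sxy = sum((x - cx) * (y - cy) for x, y in pts)
    return 0.5 * math.atan2(2.0 * sxy, sxx - syy)


def transform(pts, angle, src_c, dst_c):
    """Rotate pts about src_c by angle, then move src_c onto dst_c."""
    c, s = math.cos(angle), math.sin(angle)
    return [(c * (x - src_c[0]) - s * (y - src_c[1]) + dst_c[0],
             s * (x - src_c[0]) + c * (y - src_c[1]) + dst_c[1]) for x, y in pts]


def nn_cost(a, b):
    """Mean squared distance from each point of a to its nearest point in b."""
    return sum(min((ax - bx) ** 2 + (ay - by) ** 2 for bx, by in b)
               for ax, ay in a) / len(a)


def initial_alignment(source, target):
    """Rotation angle for starting ICP, chosen among the axis-matching candidates."""
    base = principal_angle(target) - principal_angle(source)
    cs, ct = centroid(source), centroid(target)
    candidates = [base, base + math.pi]  # the axis direction is only known up to sign
    return min(candidates,
               key=lambda ang: nn_cost(transform(source, ang, cs, ct), target))
```

The returned angle, together with the centroid shift, would serve as the starting pose for a standard ICP loop. In three dimensions the candidate set is larger, since each principal half-axis can be flipped.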

翻訳日:2023-05-16 23:54:19 公開日:2023-05-14

# Word-Graph2vec:ランダムウォークサンプリングを用いた単語共起グラフへの効率的な単語埋め込み手法

Word-Graph2vec: An efficient word embedding approach on word co-occurrence graph using random walk sampling ( http://arxiv.org/abs/2301.04312v4 )

ライセンス: Link先を確認

Wenting Li and Jiahong Xue and Xi Zhang and Huacan Chen and Zeyu Chen and Yuanzhe Cai

(参考訳) 単語の埋め込みはユビキタスになり、情報検索、意味分析、機械翻訳など、様々なテキストマイニングや自然言語処理(NLP)タスクで広く使われている。残念ながら、比較的大きなコーパスに埋め込まれた単語を訓練するのは極めて高価である。そこで本研究では,大小コーパスを単語共起グラフに変換し,ランダムに移動して単語列サンプルを取り,最後にこのサンプリングコーパスに埋め込まれた単語を訓練する,グラフベースの単語埋め込みアルゴリズムであるword-graph2vecを提案する。英語における安定語彙,相対イディオム,固定表現により,単語共起グラフの大きさと密度は,学習コーパスの増加とともにわずかに変化することが示唆された。したがって、Word-Graph2vecは大規模データセット上で安定したランタイムを持ち、そのパフォーマンス上の優位性は、トレーニングコーパスの成長とともにますます明確になる。実世界のデータセットを用いた広範囲な実験により,提案アルゴリズムは従来のスキップグラムを4～5倍効率で上回り,ランダムウォークサンプリングによる誤差は小さいことがわかった。

Word embedding has become ubiquitous and is widely used in various text mining and natural language processing (NLP) tasks, such as information retrieval, semantic analysis, and machine translation, among many others. Unfortunately, it is prohibitively expensive to train word embeddings on a relatively large corpus. We propose a graph-based word embedding algorithm, called Word-Graph2vec, which converts the large corpus into a word co-occurrence graph, then samples word sequences from this graph by random walks, and finally trains the word embedding on this sampled corpus. We posit that because of the stable vocabulary, relative idioms, and fixed expressions in English, the size and density of the word co-occurrence graph change only slightly as the training corpus grows. As a result, Word-Graph2vec has a stable runtime on large-scale datasets, and its performance advantage becomes more and more obvious with the growth of the training corpus. Extensive experiments conducted on real-world datasets show that the proposed algorithm outperforms traditional Skip-Gram by four to five times in terms of efficiency, while the error generated by the random walk sampling is small.
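The first two stages of the pipeline (corpus to co-occurrence graph, graph to sampled word sequences) might look like the sketch below. The window size, the edge-weighted transition rule, and the function names are assumptions for illustration; the abstract does not specify them. The final stage, training Skip-Gram on the sampled sequences, is omitted.

```python
import random
from collections import defaultdict


def build_cooccurrence_graph(sentences, window=2):
    """Undirected weighted graph: edge weight = number of co-occurrences in the window."""
    graph = defaultdict(lambda: defaultdict(int))
    for tokens in sentences:
        for i, word in enumerate(tokens):
            for j in range(i + 1, min(i + 1 + window, len(tokens))):
                graph[word][tokens[j]] += 1
                graph[tokens[j]][word] += 1
    return graph


def random_walk(graph, start, length, rng):
    """Walk the graph, picking each next word with probability proportional to edge weight."""
    walk = [start]
    while len(walk) < length:
        neighbours = graph[walk[-1]]
        if not neighbours:
            break
        words = list(neighbours)
        weights = [neighbours[w] for w in words]
        walk.append(rng.choices(words, weights=weights, k=1)[0])
    return walk


def sample_corpus(graph, walks_per_node, length, seed=0):
    """Sampled word sequences that stand in for the original corpus."""
    rng = random.Random(seed)
    return [random_walk(graph, node, length, rng)
            for node in list(graph) for _ in range(walks_per_node)]
```

Because the number of walks depends on the graph rather than on the raw corpus length, the size of the sampled corpus stays roughly constant once the graph stops growing, which is the intuition behind the stable runtime claimed above.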

翻訳日:2023-05-16 23:47:17 公開日:2023-05-14

# 注意ネットワークの解釈可能性について

On the Interpretability of Attention Networks ( http://arxiv.org/abs/2212.14776v3 )

ライセンス: Link先を確認

Lakshmi Narayan Pandey, Rahul Vashisht and Harish G. Ramaswamy

(参考訳) 注意機構は、いくつかの成功したディープラーニングアーキテクチャのコアコンポーネントを形成しており、「出力は入力の小さな(しかし未知の)セグメントにのみ依存する」という1つの重要なアイデアに基づいている。画像キャプション生成や言語翻訳のようないくつかの実用的な応用では、これはほぼ正しい。注意機構を持つ訓練済みモデルでは、出力に責任を持つ入力のセグメントをエンコードする中間モジュールの出力が、ネットワークの「推論」を覗く手段としてしばしば使用される。我々は,注意モデルアーキテクチャとともに用いる場合について,選択的依存分類(SDC)と呼ぶ分類問題の変種に対して,このような概念をより正確に定式化する。このような設定下で,注意モデルが正確でありながら解釈可能でなくなる様々なエラーモードを示し,そのようなモデルが訓練の結果として実際に生じることを示す。この挙動を強めたり緩和したりする様々な状況を説明する。最後に,SDCタスクに対する解釈可能性の客観的定義を用いて,スパース性を促進するよう設計されたいくつかの注意モデル学習アルゴリズムを評価し,これらのアルゴリズムが解釈可能性の向上に役立つことを示す。

Attention mechanisms form a core component of several successful deep learning architectures, and are based on one key idea: "The output depends only on a small (but unknown) segment of the input." In several practical applications like image captioning and language translation, this is mostly true. In trained models with an attention mechanism, the outputs of an intermediate module that encodes the segment of input responsible for the output are often used as a way to peek into the "reasoning" of the network. We make such a notion more precise for a variant of the classification problem that we term selective dependence classification (SDC) when used with attention model architectures. Under such a setting, we demonstrate various error modes where an attention model can be accurate but fail to be interpretable, and show that such models do occur as a result of training. We illustrate various situations that can accentuate and mitigate this behaviour. Finally, we use our objective definition of interpretability for SDC tasks to evaluate a few attention model learning algorithms designed to encourage sparsity and demonstrate that these algorithms help improve interpretability.

翻訳日:2023-05-16 23:46:00 公開日:2023-05-14

# 深い線形ネットワークによるベイズ補間

Bayesian Interpolation with Deep Linear Networks ( http://arxiv.org/abs/2212.14457v3 )

ライセンス: Link先を確認

Boris Hanin, Alexander Zlokapa

(参考訳) ニューラルネットワークの深さ、幅、データセットサイズが共同でモデル品質にどう影響するかを特徴付けることは、ディープラーニング理論における中心的な問題である。ここでは、ガウス重み事前分布と、負の対数尤度としての平均二乗誤差を用いたゼロノイズベイズ推論で学習される、出力次元1の線形ネットワークという特別な場合の完全な解を与える。任意のトレーニングデータセット、ネットワーク深さ、隠れ層幅に対して、単一の複素変数の有理型特殊関数のクラスであるMeijer-G関数を用いて、予測事後分布とベイズモデルエビデンスの非漸近的な式を求める。これらのMeijer-G関数の新たな漸近展開を通じて、深さ、幅、データセットサイズの共同の役割に関する豊かな新しい描像が現れる。線形ネットワークが無限深度で証明可能な最適予測を行うことを示す。すなわち、データに依存しない事前分布を持つ無限に深い線形ネットワークの事後分布は、エビデンスを最大化するデータ依存の事前分布を持つ浅いネットワークの事後分布と同じである。これは、事前分布がデータに依存しないことを強いられる場合に、より深いネットワークを選ぶ原理的な理由を与える。さらに,データに依存しない事前分布の下では,幅の広い線形ネットワークにおけるベイズモデルエビデンスが無限深度で最大化されることを示し,モデル選択における深さの増加の有益な役割を明らかにする。これらの結果の基礎となるのは、隠れ層の数にデータ点の数を掛けてネットワーク幅で割った値として与えられる、有効深さという新たな概念であり、これが大規模データ極限における事後分布の構造を決定する。

Characterizing how neural network depth, width, and dataset size jointly impact model quality is a central problem in deep learning theory. We give here a complete solution in the special case of linear networks with output dimension one trained using zero noise Bayesian inference with Gaussian weight priors and mean squared error as a negative log-likelihood. For any training dataset, network depth, and hidden layer widths, we find non-asymptotic expressions for the predictive posterior and Bayesian model evidence in terms of Meijer-G functions, a class of meromorphic special functions of a single complex variable. Through novel asymptotic expansions of these Meijer-G functions, a rich new picture of the joint role of depth, width, and dataset size emerges. We show that linear networks make provably optimal predictions at infinite depth: the posterior of infinitely deep linear networks with data-agnostic priors is the same as that of shallow networks with evidence-maximizing data-dependent priors. This yields a principled reason to prefer deeper networks when priors are forced to be data-agnostic. Moreover, we show that with data-agnostic priors, Bayesian model evidence in wide linear networks is maximized at infinite depth, elucidating the salutary role of increased depth for model selection. Underpinning our results is a novel emergent notion of effective depth, given by the number of hidden layers times the number of data points divided by the network width; this determines the structure of the posterior in the large-data limit.

翻訳日:2023-05-16 23:45:44 公開日:2023-05-14

# 校正の豊かさについて

On the Richness of Calibration ( http://arxiv.org/abs/2302.04118v2 )

ライセンス: Link先を確認

Benedikt Höltgen and Robert C Williamson

(参考訳) 確率的予測は、観測されたラベル周波数、すなわちキャリブレーションレンズによる比較によって評価することができる。アルゴリズムの公正性に関する最近の研究は、多校正の名のもと、様々なキャリブレーションに基づく目的に注目し始めているが、いまだにかなり制限されている。本稿では,キャリブレーションスコアの設計に関わる選択を明確化し,キャリブレーションによる評価形態を調査し分析する。これらを3つのグループ選択と,グループエラーの集約に関する選択にまとめる。これは、以前に提案されたキャリブレーションスコアを比較するためのフレームワークを提供し、望ましい数学的特性を持つ新しいスコアを定式化するのに役立つ。特に,予測ではなく,入力特徴に基づいてデータポイントをグループ化する可能性について検討し,その利点を正式に示している。また,予め提案した校正スコアを一般化し,グループ誤りに対する適切な凝集関数の空間を特徴付ける。このような集団レベルのスコアを補完し,個人レベルでのキャリブレーションスコアを調査し,グループ化の選択との関係を分析する。人口レベルのスコアに対する公平度逸脱対策の導入と公理化について考察する。グループ化の適切な選択により、これらの新しいグローバルフェアネススコアは(サブ)グループや個人フェアネスの概念を提供することができることを示す。

Probabilistic predictions can be evaluated through comparisons with observed label frequencies, that is, through the lens of calibration. Recent scholarship on algorithmic fairness has started to look at a growing variety of calibration-based objectives under the name of multi-calibration but has still remained fairly restricted. In this paper, we explore and analyse forms of evaluation through calibration by making explicit the choices involved in designing calibration scores. We organise these into three grouping choices and a choice concerning the agglomeration of group errors. This provides a framework for comparing previously proposed calibration scores and helps to formulate novel ones with desirable mathematical properties. In particular, we explore the possibility of grouping datapoints based on their input features rather than on predictions and formally demonstrate advantages of such approaches. We also characterise the space of suitable agglomeration functions for group errors, generalising previously proposed calibration scores. Complementary to such population-level scores, we explore calibration scores at the individual level and analyse their relationship to choices of grouping. We draw on these insights to introduce and axiomatise fairness deviation measures for population-level scores. We demonstrate that with appropriate choices of grouping, these novel global fairness scores can provide notions of (sub-)group or individual fairness.
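The design space sketched in the abstract (how to group datapoints, and how to combine the per-group errors) can be made concrete with a small example. The particular group error (absolute gap between mean prediction and mean label) and the two combining rules are assumptions picked for illustration; the paper characterises a broader family. All names are hypothetical.

```python
from collections import defaultdict


def calibration_score(preds, labels, features, group_of, aggregate):
    """Group datapoints, compute one error per group, then combine the errors."""
    groups = defaultdict(list)
    for p, y, x in zip(preds, labels, features):
        groups[group_of(p, x)].append((p, y))
    errors, weights = [], []
    for members in groups.values():
        mean_pred = sum(p for p, _ in members) / len(members)
        mean_label = sum(y for _, y in members) / len(members)
        errors.append(abs(mean_pred - mean_label))
        weights.append(len(members) / len(preds))
    return aggregate(errors, weights)


def weighted_mean(errors, weights):
    """Size-weighted average of group errors (similar in spirit to expected calibration error)."""
    return sum(e * w for e, w in zip(errors, weights))


def worst_group(errors, weights):
    """Largest group error, ignoring group sizes."""
    return max(errors)


def by_prediction_bin(p, x, n_bins=10):
    """Group by the predicted probability, as in binned calibration scores."""
    return min(int(p * n_bins), n_bins - 1)


def by_feature(p, x):
    """Group by an input feature instead of by the prediction."""
    return x
```

Swapping `by_prediction_bin` for `by_feature` changes which miscalibration is visible: in the test data used below, the feature-based grouping exposes a subgroup error of 0.5 that prediction bins report as at most 0.3.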

翻訳日:2023-05-16 23:28:23 公開日:2023-05-14

# 不確かさの定量化を伴う物理制約付き運動予測

Physics Constrained Motion Prediction with Uncertainty Quantification ( http://arxiv.org/abs/2302.01060v2 )

ライセンス: Link先を確認

Renukanandan Tumu, Lars Lindemann, Truong Nghiem, Rahul Mangharam

(参考訳) 動的エージェントの動作を予測することは、自律システムの安全性を保証する上で重要なタスクである。特に、動き予測アルゴリズムはダイナミクスの制約に従い、信頼の尺度として予測の不確かさを定量化するべきである。本稿では, 代用動力学モデルを用いて, 予測軌道が動的に実現可能であることを保証する運動予測のための物理制約付きアプローチを提案する。動力学的制約を考慮したインテントと軌道予測からなる2段階の統合を提案する。また,不確実性を定量化し,共形予測を用いて自律運転に適した予測領域を構築した。物理制約運動予測は、自律的なレーシングデータセットを使用した実験において、ADEが41%、FDEが56%、IoUが19%向上した。

Predicting the motion of dynamic agents is a critical task for guaranteeing the safety of autonomous systems. A particular challenge is that motion prediction algorithms should obey dynamics constraints and quantify prediction uncertainty as a measure of confidence. We present a physics-constrained approach for motion prediction which uses a surrogate dynamical model to ensure that predicted trajectories are dynamically feasible. We propose a two-step integration consisting of intent and trajectory prediction subject to dynamics constraints. We also construct prediction regions that quantify uncertainty and are tailored for autonomous driving by using conformal prediction, a popular statistical tool. Physics Constrained Motion Prediction achieves a 41% better ADE, 56% better FDE, and 19% better IoU over a baseline in experiments using an autonomous racing dataset.
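As a rough illustration of how conformal prediction turns held-out errors into a prediction region, the sketch below uses standard split conformal prediction with the Euclidean position error as the nonconformity score. The circular region and the function names are assumptions; the abstract says the regions are tailored for autonomous driving but does not describe their shape.

```python
import math


def conformal_radius(calibration_errors, alpha):
    """Radius r such that a new error is <= r with probability >= 1 - alpha.

    Uses the ceil((n + 1) * (1 - alpha))-th smallest calibration error and
    assumes calibration and test errors are exchangeable.
    """
    n = len(calibration_errors)
    k = math.ceil((n + 1) * (1.0 - alpha))
    if k > n:
        return math.inf  # too few calibration points for this confidence level
    return sorted(calibration_errors)[k - 1]


def in_region(predicted_xy, actual_xy, radius):
    """True if the actual position lies in the disk around the predicted position."""
    return math.hypot(actual_xy[0] - predicted_xy[0],
                      actual_xy[1] - predicted_xy[1]) <= radius
```

The calibration errors would come from running the motion predictor on trajectories that were not used for training.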

翻訳日:2023-05-16 23:26:29 公開日:2023-05-14

# Metropolis-adjusted Langevin アルゴリズムによる制約の効率的な処理

Efficiently handling constraints with Metropolis-adjusted Langevin algorithm ( http://arxiv.org/abs/2302.11971v2 )

ライセンス: Link先を確認

Jinyuan Chang, Cheng Yong Tang, Yuanzheng Zhu

(参考訳) 本研究では,対象分布の台(サポート)に制約のある設定において,メトロポリス調整ランジュバンアルゴリズムの性能について検討する。得られるマルコフ連鎖の厳密な解析を行い、その収束を確立し、混合時間の上界を導出する。以上の結果から,メトロポリス調整ランジュバンアルゴリズムは,この困難な状況に対処する上で極めて有効であることが示される。すなわち,得られた混合時間の上界は,受理・棄却(accept-reject)ステップを持たない競合アルゴリズムの既知の最良の上界よりも優れている。我々の数値実験はこれらの理論的な知見を裏付けており,メトロポリス調整ランジュバンアルゴリズムが対象分布の台に制約がある場合に有望な性能を示すことを示している。

In this study, we investigate the performance of the Metropolis-adjusted Langevin algorithm in a setting with constraints on the support of the target distribution. We provide a rigorous analysis of the resulting Markov chain, establishing its convergence and deriving an upper bound for its mixing time. Our results demonstrate that the Metropolis-adjusted Langevin algorithm is highly effective in handling this challenging situation: the mixing time bound we obtain is superior to the best known bounds for competing algorithms without an accept-reject step. Our numerical experiments support these theoretical findings, indicating that the Metropolis-adjusted Langevin algorithm shows promising performance when dealing with constraints on the support of the target distribution.
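A one-dimensional sketch shows how the accept-reject step handles a constrained support: a proposal that leaves the support has zero target density, so it is simply rejected and the chain stays put. This is textbook MALA under that convention, offered as an assumption about the setting rather than as the exact scheme analysed in the paper. The function and parameter names are hypothetical.

```python
import math
import random


def mala_constrained(log_density, grad_log_density, in_support,
                     x0, step, n_steps, seed=0):
    """Metropolis-adjusted Langevin sampler for a 1-D target with restricted support."""
    rng = random.Random(seed)

    def log_q(to, frm):
        # Log density (up to a constant) of the Langevin proposal frm -> to.
        mean = frm + step * grad_log_density(frm)
        return -((to - mean) ** 2) / (4.0 * step)

    x, samples = x0, []
    for _ in range(n_steps):
        noise = rng.gauss(0.0, 1.0)
        y = x + step * grad_log_density(x) + math.sqrt(2.0 * step) * noise
        if in_support(y):  # outside the support the target is zero: always reject
            log_alpha = (log_density(y) + log_q(x, y)
                         - log_density(x) - log_q(y, x))
            if rng.random() < math.exp(min(0.0, log_alpha)):
                x = y
        samples.append(x)
    return samples


# Example target: standard normal restricted to the half-line x >= 0.
demo_samples = mala_constrained(
    log_density=lambda x: -0.5 * x * x,
    grad_log_density=lambda x: -x,
    in_support=lambda x: x >= 0.0,
    x0=1.0, step=0.1, n_steps=2000,
)
```

Without the accept-reject step, the plain Langevin update could leave the support, which is why unadjusted variants need projection or similar devices.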

翻訳日:2023-05-16 23:18:31 公開日:2023-05-14

# 組合せ最適化のための効率的なソリューションQuantum Dueling

Quantum Dueling: an Efficient Solution for Combinatorial Optimization ( http://arxiv.org/abs/2302.10151v3 )

ライセンス: Link先を確認

Letian Tang, Haorui Wang, Zhengyang Li, Haozhan Tang, Chi Zhang, Shujin Li

(参考訳) 本稿では、量子デュエルと呼ばれる量子組合せ最適化の新しい戦略を提案する。以前のアルゴリズムでは、与えられた最適化問題の潜在的な解はヒルベルト空間の基底状態として符号化された。しかし、量子デュエルは使用される量子ビットの数を2倍にし、基底状態は拡張ヒルベルト空間における1対のポテンシャル解を表す。この表現の下で、目的関数に基づいてそのようなペアの一方の要素を識別する量子オラクルを構築することができれば、量子振幅増幅により量子最適化が達成できることに気づく。到達性の問題を補うには、追加のパラメータセットが必要である。私たちは設計をテストするために古典的シミュレーションの証拠を広範囲に使います。直感的に選択されたパラメータでは、量子デュエルはうまく機能するが、到達性は解分布に大きく依存する。この場合、成功確率の進化は非常に規則的である。したがって、状態進化を数学的に近似する方法があるかもしれない。最適パラメータについては、量子デュエルは一定の成功確率閾値に達するためにゲートへのアクセスを$O(\sqrt{N})$で要求し、ほぼ全ての解分布に対して良好に動作することを示唆している。高速なアルゴリズムと比較して、潜在的な定数レベルの最適化に加えて、量子デュエルは高レベル量子アルゴリズムのサブルーチンとしても使用できる。さらに、量子デュエルは多くの変分最適化アルゴリズム、特にQAOAと類似している。これは量子デュエルの戦略がハミルトニアン配置に移植されることを示唆している。その場合、短期量子コンピューティングのための実行可能な最適化アルゴリズムが得られ、訓練が容易になる。

This paper presents a new strategy for quantum combinatorial optimization, which we term quantum dueling. In previous algorithms, potential solutions to the given optimization problems were encoded as basis states of the Hilbert space. Quantum dueling, however, doubles the number of qubits used, making the basis states represent a pair of potential solutions in the augmented Hilbert space. Under this representation, we realize that if we can construct quantum oracles that identify one element of such pair over the other based on the objective function, quantum optimization can be achieved by quantum amplitude amplification. An additional set of parameters are required to compensate for reachability issues. We extensively use classical simulation evidence to test our designs. For intuitively chosen parameters, quantum dueling performs well, though reachability is highly dependent on solution distribution. In this case, the evolution of the success probability is highly regular. Thus, there might be ways to approximate the state evolution mathematically. For optimal parameters, data suggest that quantum dueling requires $O(\sqrt{N})$ accesses to the gates to reach a constant-level success probability threshold and performs well for almost all solution distributions. In addition to potential constant-level optimization compared with the fastest algorithms, quantum dueling can also be used as a subroutine for higher-level quantum algorithms. Moreover, quantum dueling shares similarities with many variational optimization algorithms, most notably QAOA. This suggests that the strategy of quantum dueling might be transplanted into a Hamiltonian setup. In that case, we might obtain viable optimization algorithms for near-term quantum computing, with the added advantage of easier training.
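The $O(\sqrt{N})$ scaling mentioned above is the hallmark of amplitude amplification. The classical simulation below shows it for the simplest case, a Grover search with one marked item. It is not an implementation of quantum dueling: the doubled register and the pairwise-comparison oracle are not modelled, and the function name is hypothetical.

```python
import math


def amplitude_amplification(n_items, marked, n_iters):
    """Simulate Grover iterations on real amplitudes; return P(measuring `marked`)."""
    amp = [1.0 / math.sqrt(n_items)] * n_items  # uniform superposition
    for _ in range(n_iters):
        amp[marked] = -amp[marked]              # oracle: flip the marked amplitude
        mean = sum(amp) / n_items
        amp = [2.0 * mean - a for a in amp]     # diffusion: inversion about the mean
    return amp[marked] ** 2


def grover_iterations(n_items):
    """Roughly (pi / 4) * sqrt(N) iterations maximise the success probability."""
    return math.floor(math.pi / 4.0 * math.sqrt(n_items))
```

For 64 items, `grover_iterations(64)` gives 6 iterations, after which the marked item is measured with probability above 0.99, compared with 1/64 for a random guess.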

翻訳日:2023-05-16 23:18:18 公開日:2023-05-14

# 深層カーネル学習のガイド

Guided Deep Kernel Learning ( http://arxiv.org/abs/2302.09574v2 )

ライセンス: Link先を確認

Idan Achituve, Gal Chechik, Ethan Fetaya

(参考訳) ガウス過程とディープニューラルネットワークの表現力の組み合わせは、今日ではdkl(deep kernel learning)を通じて一般的に行われている。残念なことに、カーネル最適化プロセスのため、これはしばしばベイズ的な利点を失う。本研究では,無限幅ニューラルネットワークを用いて深層カーネルを学習する新しい手法を提案する。本稿では、最適化プロセスにおけるDKLモデルのガイドとしてニューラルネットワークガウス過程(NNGP)モデルを提案する。提案手法は,新しいデータポイントに遭遇した場合のDKL目標の信頼度に適応するために,NNGPの確実性評価を利用する。その結果、我々は、NNGPのベイズ的挙動、すなわち過度な適合に対する頑健さ、そして正確な不確実性推定を生かし、より深いカーネルの一般化能力、スケーラビリティ、柔軟性を維持できる。実験では, 様々なサイズと寸法のベンチマークデータセット上で, オーバーフィッティングに頑健であり, 予測性能が良好であり, 信頼性の高い不確実性推定を行う。

Combining Gaussian processes with the expressive power of deep neural networks is commonly done nowadays through deep kernel learning (DKL). Unfortunately, due to the kernel optimization process, this often results in losing their Bayesian benefits. In this study, we present a novel approach for learning deep kernels by utilizing infinite-width neural networks. We propose to use the Neural Network Gaussian Process (NNGP) model as a guide to the DKL model in the optimization process. Our approach harnesses the reliable uncertainty estimation of the NNGPs to adapt the DKL target confidence when it encounters novel data points. As a result, we get the best of both worlds, we leverage the Bayesian behavior of the NNGP, namely its robustness to overfitting, and accurate uncertainty estimation, while maintaining the generalization abilities, scalability, and flexibility of deep kernels. Empirically, we show on multiple benchmark datasets of varying sizes and dimensionality, that our method is robust to overfitting, has good predictive performance, and provides reliable uncertainty estimations.

翻訳日:2023-05-16 23:17:54 公開日:2023-05-14

# プルーニングニューラルネットワークにおけるスパーシティを活用した大規模モデルトレーニングの最適化

Exploiting Sparsity in Pruned Neural Networks to Optimize Large Model Training ( http://arxiv.org/abs/2302.05045v3 )

ライセンス: Link先を確認

Siddharth Singh, Abhinav Bhatele

(参考訳) 大規模ニューラルネットワークの並列トレーニングは、通信によるオーバーヘッドが大きいため困難である。近年,深層学習の研究者は,ニューラルネットワークのパラメータの80～90%をプルーニング(すなわちゼロに設定)しつつ,プルーニング前の親ネットワークと同等の精度を持つスパースなサブネットワークを得られる,様々なプルーニングアルゴリズムを開発している。本研究では,これらのスパースなサブネットワークを利用して,並列ディープラーニングのための2つの一般的なアルゴリズム,すなわちデータ並列と層間並列におけるメモリ使用量と通信を最適化する新しい手法を提案する。我々は、データ並列と層間並列に基づく並列ディープラーニングのための高度にスケーラブルなフレームワークであるAxoNNに本手法を統合し、通信時間とメモリ使用量の削減を実証する。 512台のNVIDIA V100 GPU上で,本手法による最適化は27億パラメータモデルのメモリ消費を74%,総通信時間を40%削減し,全体としてAxoNNに対して34%,DeepSpeed-3Dに対して32%,スパース行列計算のベースラインであるSputnikに対して46%の高速化を達成した。

Parallel training of neural networks at scale is challenging due to significant overheads arising from communication. Recently, deep learning researchers have developed a variety of pruning algorithms that are capable of pruning (i.e. setting to zero) 80-90% of the parameters in a neural network to yield sparse subnetworks that equal the accuracy of the unpruned parent network. In this work, we propose a novel approach that exploits these sparse subnetworks to optimize the memory utilization and communication in two popular algorithms for parallel deep learning, namely data and inter-layer parallelism. We integrate our approach into AxoNN, a highly scalable framework for parallel deep learning that relies on data and inter-layer parallelism, and demonstrate the reduction in communication time and memory utilization. On 512 NVIDIA V100 GPUs, our optimizations reduce the memory consumption of a 2.7 billion parameter model by 74%, and the total communication time by 40%, thus providing an overall speedup of 34% over AxoNN, 32% over DeepSpeed-3D and 46% over Sputnik, a sparse matrix computation baseline.
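Why sparsity helps memory and communication can be seen in a toy example: after pruning, only the surviving values and their positions need to be stored or sent. The magnitude-based pruning rule and the index/value storage format below are illustrative assumptions; the abstract does not describe how AxoNN represents the pruned tensors, and the function names are hypothetical.

```python
def magnitude_prune(weights, sparsity):
    """Zero out the given fraction of weights with the smallest magnitude."""
    n_pruned = int(len(weights) * sparsity)
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    pruned = list(weights)
    for i in order[:n_pruned]:
        pruned[i] = 0.0
    return pruned


def compress(pruned):
    """Keep only the nonzero entries and their positions."""
    indices = [i for i, w in enumerate(pruned) if w != 0.0]
    return indices, [pruned[i] for i in indices]


def decompress(indices, values, size):
    """Rebuild the dense vector, for example on the receiving worker."""
    dense = [0.0] * size
    for i, v in zip(indices, values):
        dense[i] = v
    return dense
```

At 80-90% sparsity the compressed form holds a small fraction of the original entries, which is the kind of saving the reported memory and communication reductions rely on.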

翻訳日:2023-05-16 23:16:39 公開日:2023-05-14

# Mixture-of-Experts訓練の最適化のためのハイブリッドなテンソル・エキスパート・データ並列化手法

A Hybrid Tensor-Expert-Data Parallelism Approach to Optimize Mixture-of-Experts Training ( http://arxiv.org/abs/2303.06318v2 )

ライセンス: Link先を確認

Siddharth Singh, Olatunji Ruwase, Ammar Ahmad Awan, Samyam Rajbhandari, Yuxiong He, Abhinav Bhatele

(参考訳) Mixture-of-Experts (MoE)は、ベースモデルに疎に活性化されるエキスパートブロックを追加し、計算コストに影響を与えることなくパラメータ数を増やすニューラルネットワークアーキテクチャである。しかし、現在の分散ディープラーニングフレームワークは、大規模なベースモデルを持つ高品質なMoEモデルをトレーニングする能力に制限がある。本研究では,データ並列,テンソル並列,エキスパート並列を組み合わせた新しい3次元ハイブリッド並列アルゴリズムであるDeepSpeed-TEDを提案し,現在の最先端技術よりも4～8倍大きなベースモデルを持つMoEモデルのトレーニングを可能にする。また、オプティマイザステップにおけるメモリ最適化と、不要なデータ移動をなくす通信最適化についても述べる。我々は本手法をDeepSpeedに実装し、128台のV100 GPU上で400億パラメータのMoEモデル(16個のエキスパートを持つ67億パラメータのベースモデル)をトレーニングする際に、ベースライン(すなわち通信最適化なし)に対して26%の高速化を達成した。

Mixture-of-Experts (MoE) is a neural network architecture that adds sparsely activated expert blocks to a base model, increasing the number of parameters without impacting computational costs. However, current distributed deep learning frameworks are limited in their ability to train high-quality MoE models with large base models. In this work, we present DeepSpeed-TED, a novel, three-dimensional, hybrid parallel algorithm that combines data, tensor, and expert parallelism to enable the training of MoE models with 4 to 8x larger base models than the current state-of-the-art. We also describe memory optimizations in the optimizer step, and communication optimizations that eliminate unnecessary data movement. We implement our approach in DeepSpeed and achieve speedups of 26% over a baseline (i.e. without our communication optimizations) when training a 40 billion parameter MoE model (6.7 billion base model with 16 experts) on 128 V100 GPUs.
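The phrase "sparsely activated expert blocks" can be illustrated with a minimal routing function: every expert adds parameters, but each token is processed by only one of them, so per-token compute does not grow with the number of experts. Top-1 gating with a linear gate is an assumption made for this sketch; the abstract does not state which gating rule is used, and `moe_forward` is a hypothetical name.

```python
def moe_forward(token, gate_rows, experts):
    """Route one token to the single highest-scoring expert (top-1 gating).

    gate_rows holds one weight vector per expert; experts are callables.
    """
    scores = [sum(w * t for w, t in zip(row, token)) for row in gate_rows]
    chosen = max(range(len(experts)), key=lambda i: scores[i])
    return experts[chosen](token), chosen


# Two toy experts standing in for feed-forward blocks.
demo_experts = [
    lambda x: [2.0 * v for v in x],
    lambda x: [-1.0 * v for v in x],
]
demo_gate = [[1.0, 0.0], [0.0, 1.0]]
```

In expert parallelism the experts live on different devices, so the routing decision also determines where each token has to be sent.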

翻訳日:2023-05-16 23:07:55 公開日:2023-05-14

# 量子確率熱力学:位相空間における半古典理論

Quantum Stochastic Thermodynamics: a Semiclassical Theory in Phase Space ( http://arxiv.org/abs/2303.05935v2 )

ライセンス: Link先を確認

Zhaoyu Fei

(参考訳) 位相空間における半古典的な取り扱いによる量子多体系の定式化を提案し、量子統計を取り入れた確率的熱力学を確立する。具体的には、メソスコピックレベルの力学として確率的フォッカー・プランク方程式を用いる。ここで、流束密度のゆらぎを特徴付ける雑音項は、系と熱浴の間のランダムな衝突による有限N効果を説明する。したがって、定常解は正準系における準平衡状態である。位相空間分布の軌跡に基づいて確率的熱力学量を定義する。これにより、エネルギー保存則、H定理およびゆらぎ定理が得られる。我々の研究は、2点測定スキームに依存しない量子確率熱力学の代替的な定式化を与える。量子系に対する多数回の射影測定は位相空間分布のサンプリングによって置き換えられ、将来の実験的検証に期待が持てる。

A formalism for quantum many-body systems is proposed through semiclassical treatment in phase space, allowing us to establish a stochastic thermodynamics incorporating quantum statistics. Specifically, we utilize a stochastic Fokker-Planck equation as the dynamics at the mesoscopic level. Here, the noise term characterizing the fluctuation of the flux density accounts for the finite-N effects of random collisions between the system and the reservoir. Accordingly, the stationary solution is a quasi-equilibrium state in a canonical system. We define stochastic thermodynamic quantities based on trajectories of the phase-space distribution. The conservation law of energy, the H-theorem, and fluctuation theorems are therefore obtained. Our work provides an alternative formalism of quantum stochastic thermodynamics that is independent of the two-point measurement scheme. The numerous projective measurements of quantum systems are replaced by the sampling of the phase-space distribution, offering hope for experimental verification in the future.

翻訳日:2023-05-16 23:07:21 公開日:2023-05-14

# 修復に基づく生成モデル

Restoration based Generative Models ( http://arxiv.org/abs/2303.05456v2 )

ライセンス: Link先を確認

Jaemoo Choi, Yesom Park, Myungjoo Kang

(参考訳) 近年, 高い合成品質を示すことで, デノイジング拡散モデル (DDM) が注目されている。 DDMは、データをノイズ分布へと押しやる拡散過程の上に構築され、モデルはノイズ除去を学習する。本稿では,画像復元(IR)の観点からDDMの解釈を確立する。 IRの文献を統合することで、拡散過程に限定されることなく、別の目的関数と多様な前進過程を使うことができる。 MAP推定に基づいて損失関数に事前知識を課すことにより,DDMの高価なサンプリングの必要性を解消する。また,前進過程の柔軟性を生かして,拡散過程と比較して性能を向上させるマルチスケールトレーニングを提案する。実験の結果,本モデルはトレーニングと推論の両方の品質と効率を改善することが示された。さらに, 逆問題に対するモデルの適用性を示す。我々のフレームワークは、新しいタイプの柔軟で汎用的な生成モデルを設計するための道を開くものだと考えている。

Denoising diffusion models (DDMs) have recently attracted increasing attention by showing impressive synthesis quality. DDMs are built on a diffusion process that pushes data to the noise distribution and the models learn to denoise. In this paper, we establish the interpretation of DDMs in terms of image restoration (IR). Integrating IR literature allows us to use an alternative objective and diverse forward processes, not confining to the diffusion process. By imposing prior knowledge on the loss function grounded on MAP-based estimation, we eliminate the need for the expensive sampling of DDMs. Also, we propose a multi-scale training, which improves the performance compared to the diffusion process, by taking advantage of the flexibility of the forward process. Experimental results demonstrate that our model improves the quality and efficiency of both training and inference. Furthermore, we show the applicability of our model to inverse problems. We believe that our framework paves the way for designing a new type of flexible general generative model.

翻訳日:2023-05-16 23:07:09 公開日:2023-05-14

# CoolPINNs: 血管系におけるアクティブ冷却の物理インフォームドニューラルネットワークモデリング

CoolPINNs: A Physics-informed Neural Network Modeling of Active Cooling in Vascular Systems ( http://arxiv.org/abs/2303.05300v2 )

ライセンス: Link先を確認

N. V. Jagtap, M. K. Mudunuru, and K. B. Nakshatrala

(参考訳) 超音速航空機、宇宙探査車、バッテリーなどの新興技術は、効率的な熱調節のために組込みマイクロ血管内での流体循環に有効である。これらのシステムの設計と運用においてモデリングは不可欠である。しかし、モデリングフレームワークの開発には多くの課題がある。欠けているのは正確な枠組みで (i)複雑な血管配置における熱流束の鋭い跳躍をキャプチャする。 (ii)斜め微分(接成分及び正規成分)を扱う。 (iii)放射熱伝達による非線形性を扱う。 (iv)リアルタイム監視のための高速予測を提供し、 (v)ロバストな逆モデリングを容易にする。本稿では,物理インフォームドニューラルネットワーク(PINN)のパワーを活用して,これらの課題に対処する。当社は、血管ベースの熱規制のための高速で信頼性が高く正確なSciML(SciML)フレームワークを開発しています -- CoolPINNsと呼ばれる、アクティブ冷却のためのPINNベースのモデリングフレームワークです。提案されたメッシュレスフレームワークは、前述のすべての課題をエレガントに克服する。報告された研究の意義は多岐にわたる。第一に、このフレームワークは急速な予測のため、熱規制システムのリアルタイム監視に有用である。第2に、アプローチがメッシュレスであるため、複雑な熱調節設計に対処できる。最後に、このフレームワークは、システマティックパラメータの識別と、おそらく現在のフレームワークの最も重要なユーティリティである逆モデリング研究を促進する。

Emerging technologies like hypersonic aircraft, space exploration vehicles, and batteries use fluid circulation in embedded microvasculatures for efficient thermal regulation. Modeling is vital during the design and operational phases of these engineered systems. However, many challenges exist in developing a modeling framework. What is lacking is an accurate framework that (i) captures sharp jumps in the thermal flux across complex vasculature layouts, (ii) deals with oblique derivatives (involving tangential and normal components), (iii) handles nonlinearity because of radiative heat transfer, (iv) provides a high-speed forecast for real-time monitoring, and (v) facilitates robust inverse modeling. This paper addresses these challenges by harnessing the power of physics-informed neural networks (PINNs). We develop a fast, reliable, and accurate Scientific Machine Learning (SciML) framework for vascular-based thermal regulation, called CoolPINNs: a PINNs-based modeling framework for active cooling. The proposed mesh-less framework elegantly overcomes all the mentioned challenges. The significance of the reported research is multi-fold. First, the framework is valuable for real-time monitoring of thermal regulatory systems because of rapid forecasting. Second, researchers can address complex thermoregulation designs because the approach is mesh-less. Finally, the framework facilitates systematic parameter identification and inverse modeling studies, perhaps the current framework's most significant utility.

翻訳日:2023-05-16 23:06:56 公開日:2023-05-14

# QuickSRNet: モバイルプラットフォームでの高速推論のための平易な単一イメージ超解法アーキテクチャ

QuickSRNet: Plain Single-Image Super-Resolution Architecture for Faster Inference on Mobile Platforms ( http://arxiv.org/abs/2303.04336v2 )

ライセンス: Link先を確認

Guillaume Berger and Manik Dhingra and Antoine Mercier and Yashesh Savani and Sunny Panchal and Fatih Porikli

(参考訳) 本稿では,モバイルプラットフォーム上でリアルタイムアプリケーションを実現するための,効率的な超解像アーキテクチャQuickSRNetを提案する。超解像度は画像の高解像度化、シャープ化、アップスケール化を行う。ゲームやビデオ再生などのアプリケーションや、テレビ、スマートフォン、VRヘッドセットのディスプレイ能力の向上は、効率的なアップスケーリングソリューションの必要性を喚起している。既存のディープラーニングベースの超高解像度アプローチは、視覚的品質の観点から見事な結果をもたらすが、計算、熱、電力制約のあるモバイルデバイスでリアルタイムDLベースの超高解像度を実現することは困難である。このような課題に対処するため,我々は,単一画像のスーパーレゾリューションのための既存のニューラルネットワークよりも精度とレイテンシのトレードオフを提供する,シンプルで効果的なアーキテクチャであるquicksrnetを提案する。量子化に対する堅牢性を維持しつつ,既存の残差ベース超解像アーキテクチャを高速化する訓練手法を提案する。提案するアーキテクチャは,最新のスマートフォンで2.2ミリ秒で2倍のアップスケーリングで1080pの出力を生成する。

In this work, we present QuickSRNet, an efficient super-resolution architecture for real-time applications on mobile platforms. Super-resolution clarifies, sharpens, and upscales an image to higher resolution. Applications such as gaming and video playback along with the ever-improving display capabilities of TVs, smartphones, and VR headsets are driving the need for efficient upscaling solutions. While existing deep learning-based super-resolution approaches achieve impressive results in terms of visual quality, enabling real-time DL-based super-resolution on mobile devices with compute, thermal, and power constraints is challenging. To address these challenges, we propose QuickSRNet, a simple yet effective architecture that provides better accuracy-to-latency trade-offs than existing neural architectures for single-image super resolution. We present training tricks to speed up existing residual-based super-resolution architectures while maintaining robustness to quantization. Our proposed architecture produces 1080p outputs via 2x upscaling in 2.2 ms on a modern smartphone, making it ideal for high-fps real-time applications.

翻訳日:2023-05-16 23:06:41 公開日:2023-05-14

# 協調型マルチエージェントタスクにおける学習報酬マシン

Learning Reward Machines in Cooperative Multi-Agent Tasks ( http://arxiv.org/abs/2303.14061v3 )

ライセンス: Link先を確認

Leo Ardon, Daniel Furelos-Blanco, Alessandra Russo

(参考訳) 本稿では,協調的なタスク分解と,サブタスクの構造を符号化した報酬機械(rms)の学習を組み合わせたマルチエージェント強化学習(marl)への新しいアプローチを提案する。提案手法は, 部分的に観察可能な環境における報酬の非マルコフ的性質に対処し, 協調作業の完了に必要な学習方針の解釈性を向上させる。各サブタスクに関連付けられたrmは分散的に学習され、各エージェントの振る舞いを導くのに使用される。これにより、協調的マルチエージェント問題の複雑さが減少し、より効果的な学習が可能となる。以上の結果から,本手法はMARL,特に大規模状態空間と複数エージェントを持つ複雑な環境での今後の研究の方向性として期待できると考えられる。

This paper presents a novel approach to Multi-Agent Reinforcement Learning (MARL) that combines cooperative task decomposition with the learning of reward machines (RMs) encoding the structure of the sub-tasks. The proposed method helps deal with the non-Markovian nature of the rewards in partially observable environments and improves the interpretability of the learnt policies required to complete the cooperative task. The RMs associated with each sub-task are learnt in a decentralised manner and then used to guide the behaviour of each agent. By doing so, the complexity of a cooperative multi-agent problem is reduced, allowing for more effective learning. The results suggest that our approach is a promising direction for future research in MARL, especially in complex environments with large state spaces and multiple agents.

翻訳日:2023-05-16 23:00:16 公開日:2023-05-14

# 絡み合った送信機を有するマルチアクセスチャネル

The Multiple-Access Channel with Entangled Transmitters ( http://arxiv.org/abs/2303.10456v4 )

ライセンス: Link先を確認

Uzi Pereg, Christian Deppe, and Holger Boche

(参考訳) エンタングルメント資源を伴う古典的マルチアクセスチャネル(MAC)上の通信を考える。ここでは、2つの送信機が通信開始前にエンタングルメント資源を共有する。 Leditzky et al. (2020) は、疑似テレパシーゲームで定義される古典的なMACの例を示し、エンタングルした送信機による和レートが、そのような資源のない場合に達成可能な最良の和レートよりも厳密に高いことを示した。ここでは、エンタングルした送信機を有する一般のMACの容量領域に対する内界と外界を定め、先行結果が特殊ケースとして得られることを示す。メッセージ平均誤り基準の下での古典的MACの容量領域は、最大誤り基準の場合よりも厳密に大きくなりうることが長年知られている(Dueck, 1978)。エンタングルメント資源が与えられた場合、両領域は一致することを示す。さらに、エンタングルメント資源と会議(conferencing)を組み合わせた設定を扱う。この設定では、送信機はレート制限付きリンクを介して相互に通信することもできる。超高密度符号化を用いると、エンタングルメントにより会議レートを2倍にできる。

Communication over a classical multiple-access channel (MAC) with entanglement resources is considered, whereby two transmitters share entanglement resources before communication begins. Leditzky et al. (2020) presented an example of a classical MAC, defined in terms of a pseudo-telepathy game, such that the sum rate with entangled transmitters is strictly higher than the best achievable sum rate without such resources. Here, we establish inner and outer bounds on the capacity region for the general MAC with entangled transmitters, and show that the previous result can be obtained as a special case. It has long been known that the capacity region of the classical MAC under a message-average error criterion can be strictly larger than under a maximal error criterion (Dueck, 1978). We observe that given entanglement resources, the regions coincide. Furthermore, we address the combined setting of entanglement resources and conferencing, where the transmitters can also communicate with each other over rate-limited links. Using superdense coding, entanglement can double the conferencing rate.
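
The doubling mentioned in the last sentence rests on the textbook superdense coding protocol: with a pre-shared Bell pair, sending one qubit conveys two classical bits. The following is a minimal state-vector sketch of that protocol only, not of the paper's conferencing scheme; the function names `encode` and `decode` and the amplitude indexing are assumptions made for illustration.

```python
import math

S = 1 / math.sqrt(2)

def encode(z_bit, x_bit):
    """Sender applies X and/or Z to their half of a shared Bell pair.

    Amplitudes are indexed by 2*a + b, where a is the sender's qubit
    and b is the receiver's qubit.
    """
    state = [S, 0.0, 0.0, S]  # (|00> + |11>) / sqrt(2)
    if x_bit:
        # Pauli X on qubit a flips a: swap amplitudes i and i XOR 2.
        state = [state[i ^ 2] for i in range(4)]
    if z_bit:
        # Pauli Z on qubit a negates every amplitude with a = 1.
        state = [-amp if i & 2 else amp for i, amp in enumerate(state)]
    return state

def decode(state):
    """Receiver holds both qubits and performs a Bell-basis measurement."""
    # CNOT with control a and target b swaps |10> and |11>.
    s = [state[0], state[1], state[3], state[2]]
    # Hadamard on qubit a.
    s = [(s[0] + s[2]) * S, (s[1] + s[3]) * S,
         (s[0] - s[2]) * S, (s[1] - s[3]) * S]
    # The result is a computational basis state; read off its index.
    idx = max(range(4), key=lambda i: abs(s[i]))
    return idx >> 1, idx & 1  # (z_bit, x_bit)

if __name__ == "__main__":
    for z in (0, 1):
        for x in (0, 1):
            print((z, x), "->", decode(encode(z, x)))
```

Each of the four two-bit messages is recovered from a single transmitted qubit, which is the sense in which a rate-limited link can carry twice as many classical bits when entanglement is available.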

翻訳日:2023-05-16 22:58:41 公開日:2023-05-14

# 予算制約付き多成分POMDPを解くための福祉最大化アルゴリズム

Welfare Maximization Algorithm for Solving Budget-Constrained Multi-Component POMDPs ( http://arxiv.org/abs/2303.10302v2 )

ライセンス: Link先を確認

Manav Vora, Pranay Thangeda, Michael N. Grussing, Melkior Ornik

(参考訳) 部分的に観測可能なマルコフ決定プロセス(POMDP)は、実世界のシーケンシャルな意思決定プロセスをモデル化する効率的な方法を提供する。本稿では,独立なダイナミクスを持つインフラストラクチャコンポーネント群の保守・検査の問題に動機づけられ,多成分予算制約付きPOMDPの最適ポリシーを求めるアルゴリズムを提案する。まず、予算制約を守りながらPOMDPの最適ポリシーを見つけることができる予算付きPOMDPモデル(b-POMDP)を導入する。次に、有限ホライズンの場合、b-POMDPの値関数(最大収集報酬)が予算の凹関数であることを証明する。第2の貢献は、各コンポーネントのPOMDP間で最適な予算分割を求めることで、多成分予算制約付きPOMDPの最適ポリシーを計算するアルゴリズムである。最適予算分割は福祉最大化問題として定式化され、その解は値関数の凹性を利用して計算される。本稿では, 劣化ダイナミクス, 検査コスト, 保守コストの異なる実世界のインフラコンポーネント群に対して保守・検査ポリシーを提案することにより, 提案手法の有効性を示す。提案アルゴリズムは,現在実務で用いられているポリシーを大幅に上回ることを示す。

Partially Observable Markov Decision Processes (POMDPs) provide an efficient way to model real-world sequential decision making processes. Motivated by the problem of maintenance and inspection of a group of infrastructure components with independent dynamics, this paper presents an algorithm to find the optimal policy for a multi-component budget-constrained POMDP. We first introduce a budgeted-POMDP model (b-POMDP) which enables us to find the optimal policy for a POMDP while adhering to budget constraints. Next, we prove that the value function or maximal collected reward for a b-POMDP is a concave function of the budget for the finite horizon case. Our second contribution is an algorithm to calculate the optimal policy for a multi-component budget-constrained POMDP by finding the optimal budget split among the individual component POMDPs. The optimal budget split is posed as a welfare maximization problem and the solution is computed by exploiting the concave nature of the value function. We illustrate the effectiveness of the proposed algorithm by proposing a maintenance and inspection policy for a group of real-world infrastructure components with different deterioration dynamics, inspection and maintenance costs. We show that the proposed algorithm vastly outperforms the policy currently used in practice.
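
Why concavity helps with the budget split can be seen in a small sketch. When each component's value is a concave function of its budget and the budget comes in discrete units, handing out units one at a time to the component with the largest marginal gain gives an optimal split. The code below assumes exactly that setting; `split_budget` and the three example value functions are hypothetical stand-ins, not the paper's b-POMDP value functions or its actual algorithm.

```python
import heapq
import math

def split_budget(value_fns, total_units):
    """Greedily assign integer budget units across components.

    Assumes every function in value_fns is concave in the budget, so
    marginal gains are non-increasing and the greedy choice is optimal.
    """
    alloc = [0] * len(value_fns)
    # Max-heap (via negated gains) of each component's next marginal gain.
    heap = [(-(f(1) - f(0)), i) for i, f in enumerate(value_fns)]
    heapq.heapify(heap)
    for _ in range(total_units):
        neg_gain, i = heapq.heappop(heap)
        if -neg_gain <= 0:
            break  # no component benefits from more budget
        alloc[i] += 1
        f = value_fns[i]
        heapq.heappush(heap, (-(f(alloc[i] + 1) - f(alloc[i])), i))
    return alloc

# Hypothetical concave value functions for three components.
example_fns = [
    lambda b: 10 * math.sqrt(b),
    lambda b: 4 * math.sqrt(b),
    lambda b: 2 * b,
]

if __name__ == "__main__":
    print(split_budget(example_fns, 6))
```

In the paper's setting, evaluating a component's value at a given budget would itself require solving a b-POMDP; here that step is replaced by closed-form toy functions.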

翻訳日:2023-05-16 22:58:23 公開日:2023-05-14

# 神経集団の動態と幾何学の解釈可能な統計表現

Interpretable statistical representations of neural population dynamics and geometry ( http://arxiv.org/abs/2304.03376v2 )

ライセンス: Link先を確認

Adam Gosztolai, Robert L. Peach, Alexis Arnaudon, Mauricio Barahona, Pierre Vandergheynst

(参考訳) 多様なタスク中のニューロン集団のダイナミクスは、しばしば低次元多様体上で進化する。しかし、関連する行動変数をエンコーディングするための幾何学と力学の貢献を理解することは依然として困難である。本稿では,局所相ポートレート特徴の統計的分布に基づく非線形力学系を表現するための教師なし幾何深層学習フレームワークを提案する。本手法は,計測軌跡に基づく力学の非バイアス比較のためのロバストな幾何認識あるいは幾何非依存表現を提供する。提案手法は,計算機構を識別するためにニューラルネットワークのインスタンスを一般化し,手指運動学と幾何学的対応を持つ霊長類到達課題における神経力学の解釈可能な組込みを求め,最先端精度の復号アルゴリズムを開発した。本研究は,時間的情報よりも本質的多様体構造を用い,より優れた復号アルゴリズムを開発し,実験間でデータを同化することの重要性を浮き彫りにする。

The dynamics of neuron populations during diverse tasks often evolve on low-dimensional manifolds. However, it remains challenging to discern the contributions of geometry and dynamics for encoding relevant behavioural variables. Here, we introduce an unsupervised geometric deep learning framework for representing non-linear dynamical systems based on statistical distributions of local phase portrait features. Our method provides robust geometry-aware or geometry-agnostic representations for the unbiased comparison of dynamics based on measured trajectories. We demonstrate that our statistical representation can generalise across neural network instances to discriminate computational mechanisms, obtain interpretable embeddings of neural dynamics in a primate reaching task with geometric correspondence to hand kinematics, and develop a decoding algorithm with state-of-the-art accuracy. Our results highlight the importance of using the intrinsic manifold structure over temporal information to develop better decoding algorithms and assimilate data across experiments.

翻訳日:2023-05-16 22:49:49 公開日:2023-05-14

# 肺結節分類のための縦断的マルチモーダルトランスフォーマー: 画像と日常的なEHRからの潜在的臨床シグネチャの統合

Longitudinal Multimodal Transformer Integrating Imaging and Latent Clinical Signatures From Routine EHRs for Pulmonary Nodule Classification ( http://arxiv.org/abs/2304.02836v3 )

ライセンス: Link先を確認

Thomas Z. Li, John M. Still, Kaiwen Xu, Ho Hin Lee, Leon Y. Cai, Aravind R. Krishnan, Riqiang Gao, Mirza S. Khan, Sanja Antic, Michael Kammer, Kim L. Sandler, Fabien Maldonado, Bennett A. Landman, Thomas A. Lasko

(参考訳) 孤立性肺結節(SPN)診断の予測モデルの精度は、反復画像と電子健康記録(EHR)などの医療コンテキストを取り入れることで大幅に向上しうる。しかし、画像や診断コードなどの臨床上日常的なモダリティは、異なる時間スケールで非同期かつ不規則にサンプリングされることがあり、縦断的マルチモーダル学習の障害となる。本研究では,SPN分類のために,日常的に収集されるEHRから得た縦断的な臨床シグネチャと反復画像を統合する,トランスフォーマーに基づくマルチモーダル戦略を提案する。潜在的臨床シグネチャの教師なし分離(disentanglement)を行い, 時間距離でスケーリングした自己注意を活用して, 臨床シグネチャの表現と胸部CTスキャンから共同学習する。我々の分類器は,公開データセットの2,668件のスキャンと,自施設のEHRから得た縦断的胸部CT,請求コード,薬剤,検査データを持つ1,149名の被験者で事前訓練されている。診断の難しいSPNを持つ227名の被験者に対する評価では、縦断的マルチモーダルベースラインに対するAUCの有意な改善(0.824 vs 0.752 AUC)と、単一断面のマルチモーダルシナリオ(0.809 AUC)および縦断的な画像のみのシナリオ(0.741 AUC)に対する改善が示された。本研究は、トランスフォーマーを用いて縦断的な画像と非画像の表現型を共学習する新しいアプローチの大きな利点を示す。

The accuracy of predictive models for solitary pulmonary nodule (SPN) diagnosis can be greatly increased by incorporating repeat imaging and medical context, such as electronic health records (EHRs). However, clinically routine modalities such as imaging and diagnostic codes can be asynchronous and irregularly sampled over different time scales, which are obstacles to longitudinal multimodal learning. In this work, we propose a transformer-based multimodal strategy to integrate repeat imaging with longitudinal clinical signatures from routinely collected EHRs for SPN classification. We perform unsupervised disentanglement of latent clinical signatures and leverage time-distance scaled self-attention to jointly learn from clinical signature expressions and chest computed tomography (CT) scans. Our classifier is pretrained on 2,668 scans from a public dataset and 1,149 subjects with longitudinal chest CTs, billing codes, medications, and laboratory tests from EHRs of our home institution. Evaluation on 227 subjects with challenging SPNs revealed a significant AUC improvement over a longitudinal multimodal baseline (0.824 vs 0.752 AUC), as well as improvements over a single cross-section multimodal scenario (0.809 AUC) and a longitudinal imaging-only scenario (0.741 AUC). This work demonstrates significant advantages with a novel approach for co-learning longitudinal imaging and non-imaging phenotypes with transformers.

翻訳日:2023-05-16 22:49:27 公開日:2023-05-14

# スペクトル保存データ圧縮によるサポートベクトルクラスタリングの高速化

Accelerate Support Vector Clustering via Spectrum-Preserving Data Compression ( http://arxiv.org/abs/2304.09868v3 )

ライセンス: Link先を確認

Yuxuan Song, Yongyu Wang

(参考訳) 本稿では,サポートベクトルクラスタリング(SVC)を高速化する新しいフレームワークを提案する。提案手法は,まず新しいスペクトルデータ圧縮手法に基づき,元のデータセットの主要なクラスタ特性を保ちながら,はるかに小さな圧縮データセットを計算する。得られたスペクトル圧縮データセットは,高速かつ高品質なサポートベクトルクラスタリングアルゴリズムの開発に活用される。実世界のデータセットを用いた広範な実験を行い,非常に有望な結果を得た。提案手法により,PendigitsおよびUSPSデータセット上で,最先端のSVC法に対してそれぞれ100倍と115倍の高速化を達成し,同時にクラスタリング品質も向上した。我々の知る限りでは、これは大規模な実世界のデータセットにおける高品質で高速なSVCのための最初の実用的な方法である。

This paper proposes a novel framework for accelerating support vector clustering (SVC). The proposed method first computes much smaller compressed data sets while preserving the key cluster properties of the original data sets based on a novel spectral data compression approach. Then, the resultant spectrally-compressed data sets are leveraged for the development of a fast, high-quality algorithm for support vector clustering. We conducted extensive experiments using real-world data sets and obtained very promising results. The proposed method allows us to achieve 100X and 115X speedups over the state-of-the-art SVC method on the Pendigits and USPS data sets, respectively, while achieving even better clustering quality. To the best of our knowledge, this represents the first practical method for high-quality and fast SVC on large-scale real-world data sets.

翻訳日:2023-05-16 21:03:49 公開日:2023-05-14

# RePU活性化を持つ微分可能なニューラルネットワーク: スコア推定と等張回帰への応用

Differentiable Neural Networks with RePU Activation: with Applications to Score Estimation and Isotonic Regression ( http://arxiv.org/abs/2305.00608v2 )

ライセンス: Link先を確認

Guohao Shen, Yuling Jiao, Yuanyuan Lin, and Jian Huang

(参考訳) 整流パワーユニット(RePU)関数によって活性化される微分可能なニューラルネットワークの特性について検討する。RePUニューラルネットワークの偏導関数がRePU混合活性化ネットワークで表現できることを示し,RePUネットワークの導関数からなる関数クラスの複雑性の上界を導出する。RePU活性化深層ニューラルネットワークを用いて,$C^s$ 級の滑らかな関数とその導関数を同時に近似するための誤差境界を確立する。さらに、データが近似的な低次元サポートを持つ場合の改善された近似誤差境界を導出し、RePUネットワークが次元の呪いを軽減できることを示す。結果の有用性を説明するために,RePUネットワークを用いた深層スコアマッチング推定器(DSME)とペナルティ付き深層等張回帰(PDIR)を考える。対象関数が $C^s$ 級の滑らかな関数のクラスに属するという仮定の下で、DSMEとPDIRの非漸近的な過剰リスク境界を確立する。また,PDIRが,単調性の仮定が満たされない場合でも,ペナルティパラメータを消失させたときに一致性を持つという意味で頑健であることを示す。さらに, データ分布が近似的な低次元多様体上に台を持つ場合, DSMEとPDIRは次元の呪いを緩和できることを示す。

We study the properties of differentiable neural networks activated by rectified power unit (RePU) functions. We show that the partial derivatives of RePU neural networks can be represented by RePU mixed-activated networks and derive upper bounds for the complexity of the function class of derivatives of RePU networks. We establish error bounds for simultaneously approximating $C^s$ smooth functions and their derivatives using RePU-activated deep neural networks. Furthermore, we derive improved approximation error bounds when data has an approximate low-dimensional support, demonstrating the ability of RePU networks to mitigate the curse of dimensionality. To illustrate the usefulness of our results, we consider a deep score matching estimator (DSME) and propose a penalized deep isotonic regression (PDIR) using RePU networks. We establish non-asymptotic excess risk bounds for DSME and PDIR under the assumption that the target functions belong to a class of $C^s$ smooth functions. We also show that PDIR has a robustness property in the sense that it is consistent with vanishing penalty parameters even when the monotonicity assumption is not satisfied. Furthermore, if the data distribution is supported on an approximate low-dimensional manifold, we show that DSME and PDIR can mitigate the curse of dimensionality.
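
The statement that derivatives of RePU networks are again RePU-type networks starts from a one-line fact about the activation itself. Using the standard definition $\sigma_p(x) = \max(0, x)^p$, the derivative for $p \ge 2$ is $p\,\sigma_{p-1}(x)$, a RePU of one order lower. The sketch below checks only this scalar identity; the function names are illustrative and nothing here reproduces the paper's network constructions or bounds.

```python
def repu(x, p):
    """Rectified power unit: max(0, x) ** p, for an integer p >= 1."""
    if p < 1:
        raise ValueError("p must be at least 1")
    return max(0.0, x) ** p

def repu_grad(x, p):
    """Derivative of repu(., p) for p >= 2: p times a RePU of order p - 1."""
    if p < 2:
        raise ValueError("this identity is stated for p >= 2")
    return p * repu(x, p - 1)

if __name__ == "__main__":
    x, p, h = 1.5, 3, 1e-6
    numeric = (repu(x + h, p) - repu(x - h, p)) / (2 * h)
    print(repu_grad(x, p), numeric)  # the two values should agree closely
```

Because the derivative is continuous for $p \ge 2$, a RePU network is differentiable everywhere, unlike a ReLU network ($p = 1$), which has a kink at zero.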

翻訳日:2023-05-16 20:54:40 公開日:2023-05-14

# 時間的敵意増強による映像表現の改善

Improve Video Representation with Temporal Adversarial Augmentation ( http://arxiv.org/abs/2304.14601v2 )

ライセンス: Link先を確認

Jinhao Duan, Quanfu Fan, Hao Cheng, Xiaoshuang Shi, Kaidi Xu

(参考訳) 近年の研究では、適切に使用すれば、敵対的拡張がニューラルネットワーク(NN)の一般化に役立つことが示されている。本稿では,時間的注意を利用する新しい映像拡張手法であるTemporal Adversarial Augmentation (TA)を提案する。従来の敵対的拡張とは異なり、TAは時間に関連する損失関数を最大化することにより、ビデオクリップに対するニューラルネットワークの注意分布をシフトするように特別に設計されている。 TAにより、ニューラルネットワークの焦点に大きな影響を与える多様な時間的ビューが得られることを実証する。これらの例によるトレーニングは、不均衡な時間的情報知覚の欠陥を修復し、時間的シフトに対する防御能力を高め、最終的にはより良い一般化をもたらす。 TAを活用するために,ビデオ表現を改善するためのTemporal Video Adversarial Fine-tuning (TAF)フレームワークを提案する。 TAFはモデルに依存しない、汎用的で、解釈しやすいトレーニング戦略である。 TSM, GST, TAM, TPNの4つの強力なモデルを用いて, 3つの時間関連ベンチマーク(Something-something V1&V2, diving48)でTAFを評価する。実験結果から,TAFは追加のパラメータや計算コストを導入することなく,これらのモデルのテスト精度を有意なマージンで効果的に向上させることが示された。副産物として、TAFはアウト・オブ・ディストリビューション(OOD)設定下での堅牢性も改善する。コードはhttps://github.com/jinhaoduan/TAFで入手できる。

Recent works reveal that adversarial augmentation benefits the generalization of neural networks (NNs) if used in an appropriate manner. In this paper, we introduce Temporal Adversarial Augmentation (TA), a novel video augmentation technique that utilizes temporal attention. Unlike conventional adversarial augmentation, TA is specifically designed to shift the attention distributions of neural networks with respect to video clips by maximizing a temporal-related loss function. We demonstrate that TA obtains diverse temporal views, which significantly affect the focus of neural networks. Training with these examples remedies the flaw of unbalanced temporal information perception and enhances the ability to defend against temporal shifts, ultimately leading to better generalization. To leverage TA, we propose the Temporal Video Adversarial Fine-tuning (TAF) framework for improving video representations. TAF is a model-agnostic, generic, and interpretability-friendly training strategy. We evaluate TAF with four powerful models (TSM, GST, TAM, and TPN) over three challenging temporal-related benchmarks (Something-something V1&V2 and diving48). Experimental results demonstrate that TAF effectively improves the test accuracy of these models with notable margins without introducing additional parameters or computational costs. As a byproduct, TAF also improves the robustness under out-of-distribution (OOD) settings. Code is available at https://github.com/jinhaoduan/TAF.

翻訳日:2023-05-16 20:54:07 公開日:2023-05-14

# PoseVocab:人間のアバターモデリングのための共同構造ポス埋め込み学習

PoseVocab: Learning Joint-structured Pose Embeddings for Human Avatar Modeling ( http://arxiv.org/abs/2304.13006v2 )

ライセンス: Link先を確認

Zhe Li, Zerong Zheng, Yuxiao Liu, Boyao Zhou, Yebin Liu

(参考訳) ポーズ駆動の人間アバターの作成とは、低周波の駆動ポーズから高周波で動的な人間の外観へのマッピングをモデル化することであり、そのため人間のアバターモデリングには、高忠実度な人間の詳細を符号化できる効果的なポーズ符号化法が不可欠である。そこで本研究では,動的な人間の外観を学習するための最適なポーズ埋め込みをネットワークが見つけるよう促す,新しいポーズ符号化手法であるPoseVocabを提案する。キャラクターのマルチビューRGBビデオが与えられると、PoseVocabはトレーニングポーズに基づいてキーポーズと潜在埋め込みを構築する。ポーズ一般化と時間的一貫性を達成するために,大域的なポーズベクトルではなく,各関節の$so(3)$でキー回転をサンプリングし,サンプリングした各キー回転にポーズ埋め込みを割り当てる。これらの関節構造化ポーズ埋め込みは、異なるキーポーズの下での動的な外観を符号化するだけでなく、大域的なポーズ埋め込みを関節構造化されたものに分解し、各関節の動きに関連する外観の変動をよりよく学習する。メモリ効率を保ちながらポーズ埋め込みの表現能力を向上するために,コンパクトで効果的な3D表現である特徴線(feature lines)を導入し,よりきめ細かな人間の外観をモデル化する。さらに、クエリポーズと空間的位置が与えられると、ポーズ埋め込みを補間し、動的な人間合成のための条件付きポーズ特徴を取得する階層的なクエリ戦略を導入する。全体として、PoseVocabは人間の外観の動的な詳細を効果的に符号化し、新しいポーズの下でリアルで一般化されたアニメーションを可能にする。実験により,本手法は合成品質の点で,定性的にも定量的にも他の最先端ベースラインを上回ることが示された。コードはhttps://github.com/lizhe00/PoseVocabで入手できる。

Creating pose-driven human avatars is about modeling the mapping from the low-frequency driving pose to high-frequency dynamic human appearances, so an effective pose encoding method that can encode high-fidelity human details is essential to human avatar modeling. To this end, we present PoseVocab, a novel pose encoding method that encourages the network to discover the optimal pose embeddings for learning the dynamic human appearance. Given multi-view RGB videos of a character, PoseVocab constructs key poses and latent embeddings based on the training poses. To achieve pose generalization and temporal consistency, we sample key rotations in $so(3)$ of each joint rather than the global pose vectors, and assign a pose embedding to each sampled key rotation. These joint-structured pose embeddings not only encode the dynamic appearances under different key poses, but also factorize the global pose embedding into joint-structured ones to better learn the appearance variation related to the motion of each joint. To improve the representation ability of the pose embedding while maintaining memory efficiency, we introduce feature lines, a compact yet effective 3D representation, to model more fine-grained details of human appearances. Furthermore, given a query pose and a spatial position, a hierarchical query strategy is introduced to interpolate pose embeddings and acquire the conditional pose feature for dynamic human synthesis. Overall, PoseVocab effectively encodes the dynamic details of human appearance and enables realistic and generalized animation under novel poses. Experiments show that our method outperforms other state-of-the-art baselines both qualitatively and quantitatively in terms of synthesis quality. Code is available at https://github.com/lizhe00/PoseVocab.

翻訳日:2023-05-16 20:53:09 公開日:2023-05-14

# 量子ビットルーティングのアルゴリズム理論

Algorithmic Theory of Qubit Routing ( http://arxiv.org/abs/2305.02059v2 )

ライセンス: Link先を確認

Takehiro Ito, Naonori Kakimura, Naoyuki Kamiyama, Yusuke Kobayashi, Yoshio Okamoto

(参考訳) 量子ビットルーティング問題(qubit routing problem)またはスワップ最小化問題(swap minimization problem)は、量子プログラムのコンパイラの設計において生じる(古典的な)組合せ最適化問題である。既存の研究の多くは実用的側面を調べているが、我々は理論計算機科学の立場から量子ビットルーティング問題を研究する。我々は、グラフトポロジがパスである量子コンピュータの線形最近傍(LNN)アーキテクチャに集中する。我々の結果は3つある。 (1) 量子ビットルーティング問題がNP困難であることを証明する。 (2) 2量子ビットゲートの数をパラメータとする固定パラメータアルゴリズムを与える。 (3) 各量子ビットが高々1つの2量子ビットゲートに関与する場合の多項式時間アルゴリズムを与える。

The qubit routing problem, also known as the swap minimization problem, is a (classical) combinatorial optimization problem that arises in the design of compilers of quantum programs. We study the qubit routing problem from the viewpoint of theoretical computer science, while most of the existing studies investigated the practical aspects. We concentrate on the linear nearest neighbor (LNN) architectures of quantum computers, in which the graph topology is a path. Our results are three-fold. (1) We prove that the qubit routing problem is NP-hard. (2) We give a fixed-parameter algorithm when the number of two-qubit gates is a parameter. (3) We give a polynomial-time algorithm when each qubit is involved in at most one two-qubit gate.
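
To make the problem concrete, here is a brute-force sketch for a simplified variant on a path (the LNN setting): the initial placement is fixed, the two-qubit gates must be executed in the given order, a gate can run only when its two qubits sit on adjacent positions, and the goal is to minimize the number of adjacent swaps. These simplifications, and the name `min_swaps_on_path`, are assumptions for illustration; the paper's formal definition may differ (for example in how gate dependencies are specified), and this exhaustive search is exponential and unrelated to the paper's algorithms.

```python
from collections import deque

def min_swaps_on_path(placement, gates):
    """Minimum adjacent swaps needed to run all gates in order on a path.

    placement[i] is the logical qubit stored at physical position i;
    gates is an ordered list of (q1, q2) pairs of logical qubits.
    """
    def advance(arr, k):
        # Run every gate that is already executable; this costs no swaps.
        pos = {q: i for i, q in enumerate(arr)}
        while k < len(gates) and abs(pos[gates[k][0]] - pos[gates[k][1]]) == 1:
            k += 1
        return k

    start = tuple(placement)
    k0 = advance(start, 0)
    if k0 == len(gates):
        return 0
    seen = {(start, k0)}
    queue = deque([(start, k0, 0)])
    # Breadth-first search over (arrangement, next gate index) states.
    while queue:
        arr, k, dist = queue.popleft()
        for i in range(len(arr) - 1):
            nxt = list(arr)
            nxt[i], nxt[i + 1] = nxt[i + 1], nxt[i]
            nxt = tuple(nxt)
            nk = advance(nxt, k)
            if nk == len(gates):
                return dist + 1
            if (nxt, nk) not in seen:
                seen.add((nxt, nk))
                queue.append((nxt, nk, dist + 1))
    return None  # unreachable for paths with at least two positions

if __name__ == "__main__":
    print(min_swaps_on_path([0, 1, 2, 3], [(0, 3)]))
```

The state space grows factorially with the number of qubits, which is consistent with the NP-hardness result and motivates the parameterized and special-case algorithms the abstract lists.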

翻訳日:2023-05-16 20:43:56 公開日:2023-05-14

# 教師なし深部FCDDを用いた農村鉄道診断のための木造スリーパー劣化検出

Wooden Sleeper Deterioration Detection for Rural Railway Prognostics Using Unsupervised Deeper FCDDs ( http://arxiv.org/abs/2305.05103v3 )

ライセンス: Link先を確認

Takato Yasuno, Masahiro Okano, and Junichiro Fujii

(参考訳) 日々の鉄道運行における利用者の安全を高い水準で維持することは、鉄道管理者にとって不可欠である。この取り組みを支援するため、上面・側面カメラやGPS測位システムにより、欠陥特徴の定期検査の自動化や、鉄道部品の劣化状況の評価が進展している。しかし,劣化状態に関するデータ収集には時間がかかることがあり,発生頻度の時間的な不均衡が極端であるため,データ取得の繰り返しが必要となる。教師付き学習では、欠陥のある生画像と注釈付きラベルを含む何千ものペアデータセットが必要である。一方、一クラス分類アプローチには、正常および異常な特徴を学習するパラメータの最適化に必要な画像が少ないという利点がある。より深いFCDD(fully-convolutional data description)は, 構造物のコンクリート・鋼部材や, 災害時の倒木, 木造建築物の倒壊など, いくつかの損傷データセットに適用可能であった。しかし、鉄道部品に適用可能かどうかはまだ分かっていない。本研究では, 欠陥のある鉄道部品に対して, より深いFCDDを用いた一クラス損傷分類を自動化する予測的な識別器パイプラインを考案した。また,畳み込みニューラルネットワーク(CNN)に基づく,より深いバックボーンと受容野の感度解析を行った。さらに, 転置ガウシアンアップサンプリングを用いて, 欠陥のある鉄道特徴を可視化した。農村鉄道における木製枕木の劣化を含む, 前方視点の鉄道線路の映像取得データセットを用いて, 鉄道検査への適用を実証した。最後に, 予測的モニタリングに対する本手法の有用性と, 鉄道部品検査に関する今後の課題について検討した。

Maintaining high standards for user safety during daily railway operations is crucial for railway managers. To aid in this endeavor, top- or side-view cameras and GPS positioning systems have facilitated progress toward automating periodic inspections of defective features and assessing the deteriorating status of railway components. However, collecting data on deteriorated status can be time-consuming and requires repeated data acquisition because of the extreme temporal occurrence imbalance. In supervised learning, thousands of paired data sets containing defective raw images and annotated labels are required. However, the one-class classification approach offers the advantage of requiring fewer images to optimize parameters for training normal and anomalous features. The deeper fully-convolutional data descriptions (FCDDs) have been applied to several damage data sets, covering concrete and steel components in structures as well as fallen trees and wooden building collapse in disasters. However, it is not yet known whether they are feasible for railway components. In this study, we devised a prognostic discriminator pipeline to automate one-class damage classification using the deeper FCDDs for defective railway components. We also performed sensitivity analysis of the deeper backbone and receptive field based on convolutional neural networks (CNNs). Furthermore, we visualized defective railway features by using transposed Gaussian upsampling. We demonstrated our application to railway inspection using a video acquisition dataset of railway track in forward view that contains wooden sleeper deterioration in rural railways. Finally, we examined the usability of our approach for prognostic monitoring and future work on railway component inspection.

翻訳日:2023-05-16 20:35:31 公開日:2023-05-14

# ANALOGICAL - 大規模言語モデルのための長文アナロジーの新しいベンチマーク

ANALOGICAL - A New Benchmark for Analogy of Long Text for Large Language Models ( http://arxiv.org/abs/2305.05050v2 )

ライセンス: Link先を確認

Thilini Wijesiriwardene, Ruwan Wickramarachchi, Bimal G. Gajera, Shreeyash Mukul Gowaikar, Chandan Gupta, Aman Chadha, Aishwarya Naresh Reganti, Amit Sheth, Amitava Das

(参考訳) 過去10年間、単語レベルの類推という形のアナロジーは、word2vecのような単語埋め込み手法の品質を評価する内在的尺度として重要な役割を果たしてきた。しかし、現代の大規模言語モデル(LLM)は、GLUEやSuperGLUEのようなベンチマークに基づく外在的尺度で主に評価されており、LLMが長いテキスト間の類推を引き出せるかどうかについての研究はわずかしかない。本稿では,6段階の複雑さを持つ長文アナロジーの分類体系にわたってLLMを内在的に評価する新しいベンチマークであるANALOGICALを提案する。6段階とは、(i) 単語、(ii) 単語対文、(iii) 統語、(iv) 否定、(v) 含意、(vi) メタファーである。 13のデータセットと3つの異なる距離尺度を用いて、意味ベクトル空間における類推ペアを識別する8つのLLMの能力を評価する。我々の評価では,アナロジーの分類体系を上がるにつれて,LLMが類推を識別することがますます困難になることがわかった。

Over the past decade, analogies, in the form of word-level analogies, have played a significant role as an intrinsic measure of evaluating the quality of word embedding methods such as word2vec. Modern large language models (LLMs), however, are primarily evaluated on extrinsic measures based on benchmarks such as GLUE and SuperGLUE, and there are only a few investigations on whether LLMs can draw analogies between long texts. In this paper, we present ANALOGICAL, a new benchmark to intrinsically evaluate LLMs across a taxonomy of analogies of long text with six levels of complexity -- (i) word, (ii) word vs. sentence, (iii) syntactic, (iv) negation, (v) entailment, and (vi) metaphor. Using thirteen datasets and three different distance measures, we evaluate the abilities of eight LLMs in identifying analogical pairs in the semantic vector space. Our evaluation finds that it is increasingly challenging for LLMs to identify analogies when going up the analogy taxonomy.

翻訳日:2023-05-16 20:35:07 公開日:2023-05-14

# 機械アンラーニングの展望を探る: 包括的な調査と分類

Exploring the Landscape of Machine Unlearning: A Comprehensive Survey and Taxonomy ( http://arxiv.org/abs/2305.06360v2 )

ライセンス: Link先を確認

Thanveer Shaik, Xiaohui Tao, Haoran Xie, Lin Li, Xiaofeng Zhu, and Qing Li

(参考訳) 機械学習(ML)モデルによる予測の削除や修正の必要性から、機械アンラーニング(MU)が注目を集めている。モデルの訓練はより効率的で正確になっているが、学習済みの情報を忘却(アンラーニング)させることの重要性は、プライバシ、セキュリティ、公平性といった分野でますます高まっている。本稿では,データ削除,摂動,モデル更新など,現在の最先端の技術とアプローチを網羅したMUの包括的な調査を行う。また、一般的に用いられるメトリクスやデータセットも提示する。さらに、攻撃の高度化、標準化、転送可能性、解釈可能性、トレーニングデータ、リソース制約など、対処すべき課題を強調する。本稿の貢献には,MUの潜在的メリットとその今後の方向性についての議論が含まれる。加えて、機械学習モデルがユーザの信頼を維持しながら変化する状況に適応できるように、研究者や実践者がアンラーニング技術を探求し、改善し続ける必要性を強調する。大量の個人ユーザデータを扱うさまざまな領域でAIの重要性が増す中、人工知能(AI)をより信頼性が高く透明なものにするうえで、アンラーニングの重要性はさらに強調される。

Machine unlearning (MU) is gaining increasing attention due to the need to remove or modify predictions made by machine learning (ML) models. While training models have become more efficient and accurate, the importance of unlearning previously learned information has become increasingly significant in fields such as privacy, security, and fairness. This paper presents a comprehensive survey of MU, covering current state-of-the-art techniques and approaches, including data deletion, perturbation, and model updates. In addition, commonly used metrics and datasets are also presented. The paper also highlights the challenges that need to be addressed, including attack sophistication, standardization, transferability, interpretability, training data, and resource constraints. The contributions of this paper include discussions about the potential benefits of MU and its future directions. Additionally, the paper emphasizes the need for researchers and practitioners to continue exploring and refining unlearning techniques to ensure that ML models can adapt to changing circumstances while maintaining user trust. The importance of unlearning is further highlighted in making Artificial Intelligence (AI) more trustworthy and transparent, especially with the increasing importance of AI in various domains that involve large amounts of personal user data.

翻訳日:2023-05-16 20:26:09 公開日:2023-05-14

# 事例依存ラベル雑音学習におけるラベルの価値の再考

Rethinking the Value of Labels for Instance-Dependent Label Noise Learning ( http://arxiv.org/abs/2305.06247v2 )

ライセンス: Link先を確認

Hanwen Deng, Weijia Zhang, Min-Ling Zhang

(参考訳) ラベルノイズは大規模データセットに広く存在し、ディープラーニングアルゴリズムの性能を著しく劣化させる。インスタンス依存ノイズ遷移行列の識別不能のため、ほとんどの既存のアルゴリズムは、ノイズラベル生成プロセスがインスタンスの特徴とは独立であると仮定することでこの問題に対処する。残念ながら、実世界のアプリケーションにおけるノイズの多いラベルは、しばしば真のラベルと機能の両方に依存します。本研究では,ノイズ遷移行列を明示的にモデル化することを避ける新しい深層生成モデルを用いて,インスタンス依存ラベルノイズに取り組む。本アルゴリズムは,カジュアル表現学習を活用し,データから高レベルコンテンツとスタイル潜在要因を同時に識別する。ノイズラベルの監視情報を構造的因果モデルを用いて活用することにより,提案手法が最先端の雑音データよりも大幅に優れていることを示す。

Label noise widely exists in large-scale datasets and significantly degenerates the performances of deep learning algorithms. Due to the non-identifiability of the instance-dependent noise transition matrix, most existing algorithms address the problem by assuming the noisy label generation process to be independent of the instance features. Unfortunately, noisy labels in real-world applications often depend on both the true label and the features. In this work, we tackle instance-dependent label noise with a novel deep generative model that avoids explicitly modeling the noise transition matrix. Our algorithm leverages casual representation learning and simultaneously identifies the high-level content and style latent factors from the data. By exploiting the supervision information of noisy labels with structural causal models, our empirical evaluations on a wide range of synthetic and real-world instance-dependent label noise datasets demonstrate that the proposed algorithm significantly outperforms the state-of-the-art counterparts.

翻訳日:2023-05-16 20:25:52 公開日:2023-05-14

# 善意を超えて:社会善のためのNLPの研究ランドスケープを報告

Beyond Good Intentions: Reporting the Research Landscape of NLP for Social Good ( http://arxiv.org/abs/2305.05471v2 )

ライセンス: Link先を確認

Fernando Gonzalez, Zhijing Jin, Bernhard Schölkopf, Tom Hope, Mrinmaya Sachan, Rada Mihalcea

(参考訳) 自然言語処理(NLP)の最近の進歩により、様々なユースケースにまたがって多数のアプリケーションが登場した。 多数のNLP応用の中で、多くの学術研究者は、NLP for Social Good (NLP4SG) の最近の取り組みに沿って、社会に良い影響を与える研究を行う動機を持っている。しかし、自分たちの研究が今日の大きな社会問題にどのように取り組んでいるかは、研究者にとって必ずしも明らかではない。そこで本稿では,NLP4SG論文を識別し,NLP4SGの研究ランドスケープを特徴付けるのに役立つ,3つの関連タスクを持つ科学データセットNLP4SGPAPERSを紹介する。タスクは、(1)社会問題に対処する論文を識別し,(2)対応する国連の持続可能な開発目標(SDG)にマッピングし,(3)解決している課題と使用している手法を特定する、の3つである。最先端のNLPモデルを用いてこれらのタスクに取り組み、ACLアンソロジー全体に適用することで、研究者がNLP4SG分野を包括的に概観できる可視化ワークスペースを提供する。私たちのwebサイトはhttps://nlp4sg.vercel.app で閲覧できます。データはhttps://huggingface.co/datasets/feradauto/NLP4SGPapers で、コードはhttps://github.com/feradauto/nlp4sg で公開しています。

With the recent advances in natural language processing (NLP), a vast number of applications have emerged across various use cases. Among the plethora of NLP applications, many academic researchers are motivated to do work that has a positive social impact, in line with the recent initiatives of NLP for Social Good (NLP4SG). However, it is not always obvious to researchers how their research efforts are tackling today's big social problems. Thus, in this paper, we introduce NLP4SGPAPERS, a scientific dataset with three associated tasks that can help identify NLP4SG papers and characterize the NLP4SG landscape by: (1) identifying the papers that address a social problem, (2) mapping them to the corresponding UN Sustainable Development Goals (SDGs), and (3) identifying the task they are solving and the methods they are using. Using state-of-the-art NLP models, we address each of these tasks and use them on the entire ACL Anthology, resulting in a visualization workspace that gives researchers a comprehensive overview of the field of NLP4SG. Our website is available at https://nlp4sg.vercel.app . We released our data at https://huggingface.co/datasets/feradauto/NLP4SGPapers and code at https://github.com/feradauto/nlp4sg .

翻訳日:2023-05-16 20:24:17 公開日:2023-05-14

# 視覚変換器の堅牢性向上について:防御拡散

On enhancing the robustness of Vision Transformers: Defensive Diffusion ( http://arxiv.org/abs/2305.08031v1 )

ライセンス: Link先を確認

Raza Imam, Muhammad Huzaifa, and Mohammed El-Amine Azz

(参考訳) 医療データのプライバシーと機密性は、医療現場において最も重要である。 SOTAの視覚モデルであるViTは、トレーニングのために大量の患者データに依存しており、データセキュリティと不正アクセスの可能性について懸念が生じる。敵はViTの脆弱性を利用して機密性の高い患者情報を抽出し、患者のプライバシーを侵害する可能性がある。この研究は、医療応用におけるViTの信頼性を確保するために、これらの脆弱性に対処する。本研究では,攻撃者が元画像に加えた敵対的ノイズを除去する敵対的浄化器として,防御拡散手法を導入した。拡散モデルのデノイジング能力を利用し、逆拡散過程を用いて攻撃サンプルから敵対的ノイズを効果的に除去することで、よりクリーンな画像がViTブロックに供給される。本研究の結果は,攻撃に依存しない敵対的ノイズを画像から除去するうえでの拡散モデルの有効性を示す。さらに,知識蒸留を本フレームワークと組み合わせることで,計算効率が高く,グレーボックス攻撃に対して堅牢な軽量の学生モデルを得ることを提案する。提案手法とSOTAベースライン法SEViTとの比較により,本手法がベースラインを上回ることを示す。公開されている結核X線データセットを用いた広範な実験により,提案アーキテクチャの計算効率と堅牢性の向上が検証された。

Privacy and confidentiality of medical data are of utmost importance in healthcare settings. ViTs, the SOTA vision model, rely on large amounts of patient data for training, which raises concerns about data security and the potential for unauthorized access. Adversaries may exploit vulnerabilities in ViTs to extract sensitive patient information and compromise patient privacy. This work addresses these vulnerabilities to ensure the trustworthiness and reliability of ViTs in medical applications. In this work, we introduce a defensive diffusion technique as an adversarial purifier to eliminate adversarial noise introduced by attackers in the original image. By utilizing the denoising capabilities of the diffusion model, we employ a reverse diffusion process to effectively eliminate the adversarial noise from the attack sample, resulting in a cleaner image that is then fed into the ViT blocks. Our findings demonstrate the effectiveness of the diffusion model in eliminating attack-agnostic adversarial noise from images. Additionally, we propose combining knowledge distillation with our framework to obtain a lightweight student model that is both computationally efficient and robust against gray box attacks. Comparison of our method with a SOTA baseline method, SEViT, shows that our work is able to outperform the baseline. Extensive experiments conducted on a publicly available Tuberculosis X-ray dataset validate the computational efficiency and improved robustness achieved by our proposed architecture.

翻訳日:2023-05-16 18:15:02 公開日:2023-05-14

# SongDriver2:ソフト移行によるリアルタイム感情ベースの音楽アレンジメント

SongDriver2: Real-time Emotion-based Music Arrangement with Soft Transition ( http://arxiv.org/abs/2305.08029v1 )

ライセンス: Link先を確認

Zihao Wang, Le Ma, Chen Zhang, Bo Han, Yikai Wang, Xinyi Chen, HaoRong Hong, Wenbo Liu, Xinda Wu, Kejun Zhang

(参考訳) リアルタイムの感情に基づく音楽アレンジメントは、特定の楽曲を別の曲に変換し、リアルタイムでユーザーと特定の感情共鳴を誘発することを目的としており、音楽療法、ビデオゲームのサウンドトラック、映画のスコアなど、様々なシナリオにおいて重要な応用価値を持っている。しかし、ソフトな感情遷移とリアルタイムに適合する感情のバランスは、ターゲットの感情のきめ細かい性質と可変性のために困難である。既存の研究は主に感情をリアルタイムに適合させることに焦点を当てているが、ソフト・トランジションの問題はまだ検討されており、音楽全体の感情的コヒーレンスに影響を与える。本稿では,このバランスに対処するため,SongDriver2を提案する。具体的には、まず最後のタイムステップの音楽感情を認識し、次に現在のタイムステップのターゲット入力感情と融合する。そして、融合感情がsongdriver2のガイダンスとなり、入力メロディデータに基づいて今後の音楽を生成する。音楽の類似性と感情のリアルタイム適合性を柔軟に調整するために、オリジナルメロディを分解し、生成モデルに入力する。さらに、4つの音楽理論を設計し、ドメイン知識を活用して感情情報を強化し、半教師付き学習を用いて、手動データセットアノテーションによる主観的バイアスを軽減する。評価結果によると、SongDriver2は客観的および主観的メトリクスの両方において最先端の手法を上回っている。これらの結果は,SongDriver2がリアルタイムな適合性とソフトな遷移を同時に達成し,生成した音楽のコヒーレンスを高めることを実証している。

Real-time emotion-based music arrangement, which aims to transform a given music piece into another one that evokes specific emotional resonance with the user in real-time, holds significant application value in various scenarios, e.g., music therapy, video game soundtracks, and movie scores. However, balancing emotion real-time fit with soft emotion transition is a challenge due to the fine-grained and mutable nature of the target emotion. Existing studies mainly focus on achieving emotion real-time fit, while the issue of soft transition remains understudied, affecting the overall emotional coherence of the music. In this paper, we propose SongDriver2 to address this balance. Specifically, we first recognize the last timestep's music emotion and then fuse it with the current timestep's target input emotion. The fused emotion then serves as the guidance for SongDriver2 to generate the upcoming music based on the input melody data. To adjust music similarity and emotion real-time fit flexibly, we downsample the original melody and feed it into the generation model. Furthermore, we design four music theory features to leverage domain knowledge to enhance emotion information and employ semi-supervised learning to mitigate the subjective bias introduced by manual dataset annotation. According to the evaluation results, SongDriver2 surpasses the state-of-the-art methods in both objective and subjective metrics. These results demonstrate that SongDriver2 achieves real-time fit and soft transitions simultaneously, enhancing the coherence of the generated music.

翻訳日:2023-05-16 18:14:41 公開日:2023-05-14

# QAOA, Penalty Dephasing, Zeno効果を統合したハイブリッド量子アルゴリズムによる複数制約付き2値最適化問題の解法

Hybrid Quantum Algorithms integrating QAOA, Penalty Dephasing and Zeno Effect for Solving Binary Optimization Problems with Multiple Constraints ( http://arxiv.org/abs/2305.08056v1 )

ライセンス: Link先を確認

Ke Wan, Yiwen Liu

(参考訳) 量子アルゴリズムを用いてバイナリ最適化問題に取り組む場合、従来のIsing表現とQuantum Approximate Optimization Algorithm (QAOA)は、複数の制約を含む大規模問題のエラーを効率的に処理するのに困難である。これらの課題に対処するため,本論文では,制約のサブセットを解決するために標準Ising Hamiltonianの使用と,残りの制約を表現および対処するためにIsing以外の定式化を併用したハイブリッドフレームワークを提案する。これらのノンイジング制約の解決は、ペナルティ・デファスメントまたは量子ゼノン効果によって達成される。この革新的なアプローチは、各制約に対する選択された表現に依存する、適応可能な構造を持つ量子回路の集合をもたらす。さらに,制約フラグを頻繁に測定し,任意の最適化制約の解決を可能にする量子ゼノ効果を利用した新しい手法を提案する。これらのアルゴリズムの理論的性質を考察し, 実機載荷問題に対するそれらの性能は高い有望であり, 幅広い産業応用において有意な可能性を示している。

When tackling binary optimization problems using quantum algorithms, the conventional Ising representation and Quantum Approximate Optimization Algorithm (QAOA) encounter difficulties in efficiently handling errors for large-scale problems involving multiple constraints. To address these challenges, this paper presents a hybrid framework that combines the use of standard Ising Hamiltonians to solve a subset of the constraints, while employing non-Ising formulations to represent and address the remaining constraints. The resolution of these non-Ising constraints is achieved through either penalty dephasing or the quantum Zeno effect. This innovative approach leads to a collection of quantum circuits with adaptable structures, depending on the chosen representation for each constraint. Furthermore, this paper introduces a novel technique that utilizes the quantum Zeno effect by frequently measuring the constraint flag, enabling the resolution of any optimization constraint. Theoretical properties of these algorithms are discussed, and their performance in addressing practical aircraft loading problems is highly promising, showcasing significant potential for a wide range of industrial applications.

翻訳日:2023-05-16 18:03:36 公開日:2023-05-14

# SCRNet: 空間整合性に導かれたRetinex構造に基づく低照度画像強調モデル

SCRNet: a Retinex Structure-based Low-light Enhancement Model Guided by Spatial Consistency ( http://arxiv.org/abs/2305.08053v1 )

ライセンス: Link先を確認

Miao Zhang, Yiqing Shen and Shenghui Zhong

(参考訳) 低照度条件下で撮影された画像は、コントラストの減少、ノイズの増加、細部の減少、不自然な色再現など、いくつかの課題に苦しめられている。これらの要因は、物体検出や画像分割といったコンピュータビジョンタスクのパフォーマンスを著しく損なう可能性がある。 As a result, improving the quality of low-light images is of paramount importance for practical applications in the computer vision domain. To effectively address these challenges, we present a novel low-light image enhancement model, termed Spatial Consistency Retinex Network (SCRNet), which leverages the Retinex-based structure and is guided by the principle of spatial consistency. Specifically, our proposed model incorporates three levels of consistency: channel level, semantic level, and texture level, inspired by the principle of spatial consistency. These levels of consistency enable our model to adaptively enhance image features, ensuring more accurate and visually pleasing results. Extensive experimental evaluations on various low-light image datasets demonstrate that our proposed SCRNet outshines existing state-of-the-art methods, highlighting the potential of SCRNet as an effective solution for enhancing low-light images.

Images captured under low-light conditions are often plagued by several challenges, including diminished contrast, increased noise, loss of fine details, and unnatural color reproduction. These factors can significantly hinder the performance of computer vision tasks such as object detection and image segmentation. As a result, improving the quality of low-light images is of paramount importance for practical applications in the computer vision domain. To effectively address these challenges, we present a novel low-light image enhancement model, termed Spatial Consistency Retinex Network (SCRNet), which leverages the Retinex-based structure and is guided by the principle of spatial consistency. Specifically, our proposed model incorporates three levels of consistency: channel level, semantic level, and texture level, inspired by the principle of spatial consistency. These levels of consistency enable our model to adaptively enhance image features, ensuring more accurate and visually pleasing results. Extensive experimental evaluations on various low-light image datasets demonstrate that our proposed SCRNet outshines existing state-of-the-art methods, highlighting the potential of SCRNet as an effective solution for enhancing low-light images.

翻訳日:2023-05-16 18:03:15 公開日:2023-05-14

# 驚くほど単純な連続アクションpomdpソルバ:ポリシーツリー上の遅延クロスエントロピー探索

A Surprisingly Simple Continuous-Action POMDP Solver: Lazy Cross-Entropy Search Over Policy Trees ( http://arxiv.org/abs/2305.08049v1 )

ライセンス: Link先を確認

Marcus Hoerger, Hanna Kurniawati, Dirk Kroese, Nan Ye

(参考訳) 部分可観測マルコフ決定プロセス(POMDP)は確率的部分可観測環境における意思決定の原則的枠組みを提供する。しかし、連続行動空間の問題に対する優れた解の計算は依然として困難である。この課題を解消するために、Lazy Cross-Entropy Search Over Policy Trees (LCEOPT) と呼ばれるシンプルなオンラインPOMDP解決器を提案する。各計画段階では,ポリシーツリーの空間を探索するために遅延クロスエントロピー法を用いて,簡単なポリシー表現を提供する。具体的には、有望な有限水平ポリシーツリーの分布を維持する。この分布はサンプリングポリシによって反復的に更新され、モンテカルロシミュレーションによって評価され、最高性能のものに再適合する。本手法はポリシツリー表現を利用して,ポリシーサンプリング,評価,分布更新における冗長な計算を回避するという意味では遅延である。これにより、最大2桁の計算節約が可能となる。我々のLCEOPTは、既存の最先端手法と比較して驚くほど単純であるが、特に高次元のアクション空間における問題に対して、いくつかの連続作用POMDP問題において、経験的に優れている。

The Partially Observable Markov Decision Process (POMDP) provides a principled framework for decision making in stochastic partially observable environments. However, computing good solutions for problems with continuous action spaces remains challenging. To ease this challenge, we propose a simple online POMDP solver, called Lazy Cross-Entropy Search Over Policy Trees (LCEOPT). At each planning step, our method uses a lazy Cross-Entropy method to search the space of policy trees, which provide a simple policy representation. Specifically, we maintain a distribution on promising finite-horizon policy trees. The distribution is iteratively updated by sampling policies, evaluating them via Monte Carlo simulation, and refitting them to the top-performing ones. Our method is lazy in the sense that it exploits the policy tree representation to avoid redundant computations in policy sampling, evaluation, and distribution update. This leads to computational savings of up to two orders of magnitude. Our LCEOPT is surprisingly simple as compared to existing state-of-the-art methods, yet empirically outperforms them on several continuous-action POMDP problems, particularly for problems with higher-dimensional action spaces.
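The sample, evaluate, refit loop described in the abstract is the core of the generic cross-entropy method. The following is a minimal sketch of that generic loop only, not of LCEOPT: it searches over a plain vector rather than policy trees, has no lazy evaluation, and `toy_return` is a hypothetical stand-in for a Monte Carlo return estimate. All names and default parameters are assumptions made for illustration.

```python
import random
import statistics


def cross_entropy_search(score, dim, iters=30, pop=64, elite=8,
                         init_std=2.0, seed=0):
    """Generic cross-entropy method over a diagonal Gaussian.

    `score` maps a list of `dim` floats to a number to be maximized.
    This is NOT LCEOPT: no policy trees and no lazy evaluation.
    """
    rng = random.Random(seed)
    mean = [0.0] * dim
    std = [init_std] * dim
    for _ in range(iters):
        # 1) sample candidates from the current distribution
        samples = [[rng.gauss(m, s) for m, s in zip(mean, std)]
                   for _ in range(pop)]
        # 2) evaluate them and keep the top performers
        samples.sort(key=score, reverse=True)
        top = samples[:elite]
        # 3) refit the distribution to the top performers
        for d in range(dim):
            column = [x[d] for x in top]
            mean[d] = statistics.fmean(column)
            std[d] = statistics.pstdev(column) + 1e-3  # small floor avoids collapse
    return mean


def toy_return(action):
    # Hypothetical objective with its maximum at (1, -2).
    return -((action[0] - 1.0) ** 2 + (action[1] + 2.0) ** 2)


best = cross_entropy_search(toy_return, dim=2)
```

In this toy setting the returned mean should land close to the maximizer; in a planner, `score` would instead be a noisy simulation-based estimate, which is where avoiding redundant evaluations starts to matter.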

翻訳日:2023-05-16 18:02:59 公開日:2023-05-14

# グラフニューラルネットワークの一般化の理解に向けて

Towards Understanding the Generalization of Graph Neural Networks ( http://arxiv.org/abs/2305.08048v1 )

ライセンス: Link先を確認

Huayi Tang and Yong Liu

(参考訳) グラフニューラルネットワーク(GNN)は、グラフ構造化データ指向学習と表現において最も広く採用されているモデルである。実世界のアプリケーションで並外れた成功を収めたにもかかわらず、理論による作業メカニズムの理解はまだ第一段階である。本稿では,一般化の観点から,この目標に向かって進む。具体的には,確率的最適化を考慮したトランスダクティブ学習における一般化ギャップと勾配の確率境界を確立する。その後、人気のあるGNNに対して、一般化ギャップの確率境界を提供する。理論的結果は、一般化ギャップに影響を与えるアーキテクチャ固有の要因を明らかにする。ベンチマークデータセットにおける実験結果は、理論的結果と経験的証拠の一貫性を示している。本研究は,GNNの一般化に関する新たな知見を提供する。

Graph neural networks (GNNs) are the most widely adopted models for learning and representation on graph-structured data. Despite their extraordinary success in real-world applications, the theoretical understanding of their working mechanism is still at an early stage. In this paper, we move towards this goal from the perspective of generalization. Specifically, we first establish high-probability bounds on the generalization gap and gradients in transductive learning, taking stochastic optimization into account. We then provide high-probability bounds on the generalization gap for popular GNNs. The theoretical results reveal the architecture-specific factors affecting the generalization gap. Experimental results on benchmark datasets show that the theoretical results are consistent with the empirical evidence. Our results provide new insights into the generalization of GNNs.

翻訳日:2023-05-16 18:02:40 公開日:2023-05-14

# キャビティマグノメカニクスにおける量子増強メトロロジー

Quantum-Enhanced Metrology in Cavity Magnomechanics ( http://arxiv.org/abs/2305.08045v1 )

ライセンス: Link先を確認

Qing-Kun Wan, Hai-Long Shi, Xi-Wen Guan

(参考訳) マグノンは、基本的な準粒子が初等スピン励起で現れ、情報符号化と処理における量子技術革新に大きな可能性を秘めている。ここでは, 空洞磁場が弱磁場を感知するのに対して, 空洞磁場が弱磁場の精密測定を行うような, 実験的に実現可能なキャビティマグノメカニカルシステムに基づくメトロロジースキームにおいて, 絡み合いの微妙な役割を見出す。フィッシャー情報と絡み合いの正確な関係を確立することにより,弱いカップリングの場合,測定精度はハイゼンベルク限界に達するが,量子臨界性は強いカップリングの場合の測定精度を高めることができることを示した。特に,マグノンと光子の絡み合いは動的符号化過程において重要であるが,測定過程におけるそのような絡み合いの存在は,最終的な測定精度を劇的に低下させる。

Magnons, fundamental quasiparticles that emerge from elementary spin excitations, hold great promise for innovating quantum technologies in information coding and processing. Here we discover subtle roles of entanglement in a metrological scheme based on an experimentally feasible cavity magnomechanical system, where the magnons are responsible for sensing a weak magnetic field whereas the cavity field carries out a precision measurement of the weak field. By establishing exact relations between the Fisher information and entanglement, we show that for the weak coupling case the measurement precision can reach the Heisenberg limit, whereas quantum criticality enables us to enhance measurement precision for the strong coupling case. In particular, we also find that the entanglement between magnons and photons is of crucial importance during the dynamical encoding process, but the presence of such entanglement in the measurement process dramatically reduces the final measurement precision.
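For readers unfamiliar with the terminology, the link between Fisher information and the Heisenberg limit can be summarized by the standard quantum Cramér-Rao bound. The notation below is assumed for illustration and is not taken from the paper: $\theta$ is the unknown parameter (here, the weak field), $\nu$ the number of repetitions, $N$ the amount of resources, and $F_Q$ the quantum Fisher information.

```latex
% Assumed notation: \theta parameter, \nu repetitions, N resources, F_Q quantum Fisher information
\delta\theta \;\ge\; \frac{1}{\sqrt{\nu\,F_Q(\theta)}},
\qquad
F_Q \propto N \;\Rightarrow\; \delta\theta \sim \frac{1}{\sqrt{N}} \ \text{(standard quantum limit)},
\qquad
F_Q \propto N^{2} \;\Rightarrow\; \delta\theta \sim \frac{1}{N} \ \text{(Heisenberg limit)}.
```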

翻訳日:2023-05-16 18:02:31 公開日:2023-05-14

# 実世界シナリオにおけるeeg信号を用いたメモリ検索時の作業負荷評価

Using EEG Signals to Assess Workload during Memory Retrieval in a Real-world Scenario ( http://arxiv.org/abs/2305.08044v1 )

ライセンス: Link先を確認

Kuan-Jung Chiang, Steven Dong, Chung-Kuan Cheng, and Tzyy-Ping Jung

(参考訳) 目的:脳波(EEG)は、客観的であり、バイアスの傾向が低く、認知状態のダイナミクスを評価することができるため、人間の因子研究における神経経済学の生理学的指標として人気を集めている。本研究は,参加者の典型的なオフィスタスクにおけるメモリ負荷と脳波の関係を,シングルモニターとデュアルモニターアレンジメントで検討した。シングルモニターアレンジメントでは、より高いメモリワークロードが期待できます。アプローチ: オフィスワークを行う被験者のシナリオを模倣した実験を設計し, 2つの異なるオフィスセットアップにおいて, 様々なレベルのメモリ負荷を経験したかどうかを検討した。 1)シングルモニターの設定と 2)デュアルモニターの設定。我々は,脳波バンドパワー,相互情報,コヒーレンスを,高メモリ負荷と低メモリ負荷を分類する機械学習モデルを訓練するための特徴として用いた。主な結果: 研究結果から, これらの特徴は, 全参加者で一貫した有意差を示した。また、Sternbergタスク中に収集した異なるデータセットにおいて、これらのEEGシグネチャの堅牢性と一貫性を検証する。意義:本研究は、脳波が個人間の記憶負荷の相関関係を示し、実世界の神経人間工学研究における脳波分析の有効性を実証した。

Objective: The Electroencephalogram (EEG) is gaining popularity as a physiological measure for neuroergonomics in human factor studies because it is objective, less prone to bias, and capable of assessing the dynamics of cognitive states. This study investigated the associations between memory workload and EEG during participants' typical office tasks on a single-monitor and dual-monitor arrangement. We expect a higher memory workload for the single-monitor arrangement. Approach: We designed an experiment that mimics the scenario of a subject performing some office work and examined whether the subjects experienced various levels of memory workload in two different office setups: 1) a single-monitor setup and 2) a dual-monitor setup. We used EEG band power, mutual information, and coherence as features to train machine learning models to classify high versus low memory workload states. Main results: The study results showed that these characteristics exhibited significant differences that were consistent across all participants. We also verified the robustness and consistency of these EEG signatures in a different data set collected during a Sternberg task in a prior study. Significance: The study found the EEG correlates of memory workload across individuals, demonstrating the effectiveness of using EEG analysis in conducting real-world neuroergonomic studies.

翻訳日:2023-05-16 18:02:15 公開日:2023-05-14

# chsel: 接触データと自由空間データから多岐にわたる多彩なポーズ推定を生成する

CHSEL: Producing Diverse Plausible Pose Estimates from Contact and Free Space Data ( http://arxiv.org/abs/2305.08042v1 )

ライセンス: Link先を確認

Sheng Zhong, Nima Fazeli, and Dmitry Berenson

(参考訳) 本稿では,各点が自由空間にあるか,あるいは物体の表面にあるかなどのボリューム情報を持つ点群から,剛体物体の有理なポーズの集合を推定する新しい方法を提案する。特に,接触から生じる力や触覚データからポーズを推定する方法について検討した。接触から派生したデータを使用することは、本質的に視覚データよりも情報密度が低いため、接触が少ない場合、ポーズ推定問題は過小評価される。多数の接触を伴わない被写体の真のポーズを推定する代わりに,センサデータによって課される制約に従わなければならないポーズの集合を推定する。既存の手法は、このセットを単一のポーズ推定のために設計するか、効果的に情報的優先順位を必要とするため、見積もりに苦労する。この問題に対する我々のアプローチ、制約付きポーズ仮説セット除去(CHSEL)には3つの重要な属性がある。 1) 既知の自由空間を考慮できる量的情報を考える。 2)強力な勾配に基づく最適化ツールを活用するために,新しい微分可能な体積コスト関数を用いる。 3)品質多様性(QD)最適化文献からの手法を用いて,高品質なポーズの多様なセットを生成する。我々の知る限り、QD法はポーズ登録には使われていない。また、より多くのデータがロボットによって収集された場合、推定したポーズをオンラインで更新する方法も示します。実験の結果,CHSELはシミュレーションデータと実世界のデータの両方に対して,複数のベースライン法よりも大きな性能向上を示した。

This paper proposes a novel method for estimating the set of plausible poses of a rigid object from a set of points with volumetric information, such as whether each point is in free space or on the surface of the object. In particular, we study how pose can be estimated from force and tactile data arising from contact. Using data derived from contact is challenging because it is inherently less information-dense than visual data, and thus the pose estimation problem is severely under-constrained when there are few contacts. Rather than attempting to estimate the true pose of the object, which is not tractable without a large number of contacts, we seek to estimate a plausible set of poses which obey the constraints imposed by the sensor data. Existing methods struggle to estimate this set because they are either designed for single pose estimates or require informative priors to be effective. Our approach to this problem, Constrained pose Hypothesis Set Elimination (CHSEL), has three key attributes: 1) It considers volumetric information, which allows us to account for known free space; 2) It uses a novel differentiable volumetric cost function to take advantage of powerful gradient-based optimization tools; and 3) It uses methods from the Quality Diversity (QD) optimization literature to produce a diverse set of high-quality poses. To our knowledge, QD methods have not been used previously for pose registration. We also show how to update our plausible pose estimates online as more data is gathered by the robot. Our experiments suggest that CHSEL shows large performance improvements over several baseline methods for both simulated and real-world data.

翻訳日:2023-05-16 18:01:54 公開日:2023-05-14

# 確率的プーリングを用いた証明可能なマルチインスタンス深層auc最大化

Provable Multi-instance Deep AUC Maximization with Stochastic Pooling ( http://arxiv.org/abs/2305.08040v1 )

ライセンス: Link先を確認

Dixain Zhu, Bokun Wang, Zhi Chen, Yaxing Wang, Milan Sonka, Xiaodong Wu, Tianbao Yang

(参考訳) 本稿では,1つのクラスラベルをインスタンスの袋に割り当てるマルチインスタンス学習 (mil) に対する深層auc最大化 (dam) の新たな応用について検討する。 milの標準的なプーリングメソッドが要求する、バックプロパゲーションのための {gpu} メモリにバッグサイズがロードするには大きすぎる、という文脈で、無視されているが無視できない計算上の課題に対処します。この課題に対処するために,多レベル構成関数としてプールド予測上の損失関数を定式化することにより,確率最適化の精神における分散還元確率プール法を提案する。確率的合成最適化と非凸 min-max 最適化の手法を合成することにより,確率的スムーズドマックスプーリングや確率的アテンションベースプールを用いた統一的かつ証明可能なMIDAM (MIDAM) アルゴリズムを提案し,各バッグのいくつかのインスタンスをサンプリングし,確率的勾配推定器を計算し,モデルパラメータを更新する。我々は,提案したMIDAMアルゴリズムと最先端DAMアルゴリズムとの類似の収束率を確立する。従来のMILデータセットと医療データセットに関する広範な実験は、MIDAMアルゴリズムの優位性を実証している。

This paper considers a novel application of deep AUC maximization (DAM) for multi-instance learning (MIL), in which a single class label is assigned to a bag of instances (e.g., multiple 2D slices of a CT scan for a patient). We address a neglected yet non-negligible computational challenge of MIL in the context of DAM, i.e., the bag size is too large for the bag to be loaded into GPU memory for backpropagation, which is required by the standard pooling methods of MIL. To tackle this challenge, we propose variance-reduced stochastic pooling methods in the spirit of stochastic optimization by formulating the loss function over the pooled prediction as a multi-level compositional function. By synthesizing techniques from stochastic compositional optimization and non-convex min-max optimization, we propose a unified and provable multi-instance DAM (MIDAM) algorithm with stochastic smoothed-max pooling or stochastic attention-based pooling, which only samples a few instances for each bag to compute a stochastic gradient estimator and to update the model parameter. We establish a convergence rate for the proposed MIDAM algorithm similar to that of the state-of-the-art DAM algorithms. Our extensive experiments on conventional MIL datasets and medical datasets demonstrate the superiority of our MIDAM algorithm.
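To make "smoothed-max pooling over a sampled subset of a bag" concrete, here is a minimal sketch. It assumes the common log-mean-exp form of smoothed max with temperature `tau`, and it uses naive subsampling, which gives a biased estimate of the full-bag value. It deliberately omits the variance-reduction machinery the abstract refers to, so it is not the MIDAM algorithm; the function names and the toy bag are hypothetical.

```python
import math
import random


def smoothed_max(scores, tau=0.1):
    """tau * log(mean(exp(s / tau))), computed stably.

    Lies between max(scores) - tau * log(len(scores)) and max(scores).
    """
    top = max(scores)
    mean_exp = sum(math.exp((s - top) / tau) for s in scores) / len(scores)
    return top + tau * math.log(mean_exp)


def sampled_smoothed_max(scores, k, tau=0.1, rng=random):
    """Naive estimate from k instances drawn without replacement."""
    return smoothed_max(rng.sample(scores, k), tau)


bag = [0.1, 0.3, 0.2, 2.0, 0.4, 0.0]   # per-instance scores of one bag
full = smoothed_max(bag)
estimate = sampled_smoothed_max(bag, k=3, rng=random.Random(0))
all_sampled = sampled_smoothed_max(bag, k=len(bag), rng=random.Random(0))
```

The gap between `full` and `estimate` depends on whether the high-scoring instance happens to be sampled, which illustrates why a plain subsample is not enough on its own and some correction across iterations is needed.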

翻訳日:2023-05-16 18:01:27 公開日:2023-05-14

# SyCo-AEによるカオスダイナミクスの低次モデリング:合成制約オートエンコーダ

Small-data Reduced Order Modeling of Chaotic Dynamics through SyCo-AE: Synthetically Constrained Autoencoders ( http://arxiv.org/abs/2305.08036v1 )

ライセンス: Link先を確認

Andrey A. Popov, Renato Zanetti

(参考訳) データ駆動によるカオス力学の還元次数モデリングは、破滅的に散逸または散逸するシステムをもたらす可能性がある。本稿では,自動エンコーダの非線形次元低減とニューラルネットワークを用いた非線形演算子推論の自由を活かし,縮小順序空間に合成制約を課すことで,この問題を解決しようとする。合成制約により、自由度が完全に非線形で不安定でありながら、ばらつきを防止できる。従来の40変数のlorenz '96方程式を用いて手法を説明し,より少ないデータを用いて誤差の低い中から長距離の予測を行うことができることを示した。

Data-driven reduced order modeling of chaotic dynamics can result in systems that either dissipate or diverge catastrophically. Leveraging non-linear dimensionality reduction of autoencoders and the freedom of non-linear operator inference with neural-networks, we aim to solve this problem by imposing a synthetic constraint in the reduced order space. The synthetic constraint allows our reduced order model both the freedom to remain fully non-linear and highly unstable while preventing divergence. We illustrate the methodology with the classical 40-variable Lorenz '96 equations, showing that our methodology is capable of producing medium-to-long range forecasts with lower error using less data.

翻訳日:2023-05-16 18:01:00 公開日:2023-05-14

# DNN-Defender: 対向重み攻撃のためのDRAM内ディープニューラルネットワーク防御機構

DNN-Defender: An in-DRAM Deep Neural Network Defense Mechanism for Adversarial Weight Attack ( http://arxiv.org/abs/2305.08034v1 )

ライセンス: Link先を確認

Ranyang Zhou, Sabbir Ahmed, Adnan Siraj Rakin, Shaahin Angizi

(参考訳) 多くのセキュリティに敏感な分野にディープラーニングが展開されるにつれ、機械学習のセキュリティは徐々に重要になりつつある。近年の研究では、DRAMのRowHammer脆弱性を利用して、ディープニューラルネットワーク(DNN)モデルの重み付けを決定的かつ正確にフリップし、推論精度に影響を与えるシステムレベルのテクニックを攻撃者が活用できることが示されている。既存の防御機構はソフトウェアベースで、重量再構成には高価なトレーニングオーバーヘッドや性能の低下を必要とする。一方で、汎用的なハードウェアベースの被害者/攻撃者中心のメカニズムは、高価なハードウェアオーバーヘッドを課し、被害者と攻撃者列の間の空間的接続を維持する。本稿では,DNN-Defenderという名前の量子化DNNに適した,DRAMをベースとした最初の防御機構を提案する。以上の結果から,DNN-DefenderはターゲットRowHammer攻撃の性能をランダムな攻撃レベルに低下させる高いレベルの保護を提供することが可能であることが示唆された。さらに、提案されたディフェンスは、ソフトウェアトレーニングや追加のハードウェアオーバーヘッドを発生させずに、CIFAR-10とImageNetデータセットに精度低下はない。

With deep learning deployed in many security-sensitive areas, machine learning security is becoming progressively important. Recent studies demonstrate that attackers can use system-level techniques that exploit the RowHammer vulnerability of DRAM to deterministically and precisely flip bits in deep neural network (DNN) model weights and thereby degrade inference accuracy. The existing defense mechanisms are software-based, such as weight reconstruction, and require expensive training overhead or incur performance degradation. On the other hand, generic hardware-based victim-/aggressor-focused mechanisms impose expensive hardware overheads and preserve the spatial connection between victim and aggressor rows. In this paper, we present the first DRAM-based victim-focused defense mechanism tailored for quantized DNNs, named DNN-Defender, which leverages the potential of in-DRAM swapping to withstand targeted bit-flip attacks. Our results indicate that DNN-Defender can deliver a high level of protection, downgrading the performance of targeted RowHammer attacks to the level of a random attack. In addition, the proposed defense causes no accuracy drop on the CIFAR-10 and ImageNet datasets, without requiring any software training or incurring additional hardware overhead.

翻訳日:2023-05-16 18:00:47 公開日:2023-05-14

# HiPerformer: 時系列予測のための階層的置換同変変換器

HiPerformer: Hierarchically Permutation-Equivariant Transformer for Time Series Forecasting ( http://arxiv.org/abs/2305.08073v1 )

ライセンス: Link先を確認

Ryo Umagami, Yu Ono, Yusuke Mukuta, Tatsuya Harada

(参考訳) 正確な予測のために複数の時系列の関係を識別することが不可欠である。特に株価については、同じ特性を持つグループに部品を分割することが多く、このグループ構造に整合した関係を抽出するモデルが有効であるべきである。そこで本研究では,このグループ構造を考慮したモデルの設計のために,グループ内およびグループ間の成分の指数スワップに着目した階層的置換等価性の概念を提案する。予測モデルが階層的置換同分散を持つ場合、その予測は成分の群関係と一致する。そこで,同群における成分間の関係と群間の関係を考慮した階層的置換同変モデルを提案する。実世界のデータを用いた実験により,提案手法が既存の最先端手法より優れていることを示す。

It is imperative to discern the relationships between multiple time series for accurate forecasting. In particular, for stock prices, components are often divided into groups with the same characteristics, and a model that extracts relationships consistent with this group structure should be effective. Thus, we propose the concept of hierarchical permutation-equivariance, focusing on index swapping of components within and among groups, to design a model that considers this group structure. When the prediction model has hierarchical permutation-equivariance, the prediction is consistent with the group relationships of the components. Therefore, we propose a hierarchically permutation-equivariant model that considers both the relationship among components in the same group and the relationship among groups. The experiments conducted on real-world data demonstrate that the proposed method outperforms existing state-of-the-art methods.
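The equivariance property can be checked numerically on a toy function. The sketch below assumes one reading of the abstract: a hierarchical permutation reorders the groups and, independently, the components inside each group, and a model is equivariant if permuting the input and then predicting gives the same result as predicting and then permuting. `toy_model` is a hypothetical stand-in, not HiPerformer.

```python
import random


def toy_model(groups):
    """groups: list of groups, each a list of floats (one value per series).

    Each output mixes the series itself, its group mean, and the mean of
    the group means, so it depends on within-group and across-group context.
    """
    group_means = [sum(g) / len(g) for g in groups]
    global_mean = sum(group_means) / len(group_means)
    return [[x + gm + global_mean for x in g]
            for g, gm in zip(groups, group_means)]


def permute(groups, group_order, member_orders):
    """Reorder the groups and the members inside each group.

    member_orders[i] is the member order applied to original group i.
    """
    return [[groups[gi][mi] for mi in member_orders[gi]] for gi in group_order]


rng = random.Random(0)
x = [[1.0, 2.0, 3.0], [10.0, 20.0], [5.0]]
group_order = rng.sample(range(len(x)), len(x))
member_orders = [rng.sample(range(len(g)), len(g)) for g in x]

permute_then_predict = toy_model(permute(x, group_order, member_orders))
predict_then_permute = permute(toy_model(x), group_order, member_orders)
```

The two results should agree up to floating-point rounding. A model that, for example, added the position index of each series to its output would fail this check.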

翻訳日:2023-05-16 17:54:37 公開日:2023-05-14

# 原始重力子のデコヒーレンスについて

On the Decoherence of Primordial Gravitons ( http://arxiv.org/abs/2305.08071v1 )

ライセンス: Link先を確認

Sirui Ning, Chon Man Sou, Yi Wang

(参考訳) 原始スカラー曲率とテンソル摂動の$\zeta$と$\gamma_{ij}$は、最小のインフレーションモデルにおける超水平スケールで保存されていることはよく知られている。しかし、それらの波動関数は急速に振動する位相を持ち、宇宙論的摂動の境界(現在の微分)やホイーラー・デウィット方程式のWKB近似から見てもわかるように、緩やかに回転しない。このような振動相は、スカラーとテンソルの摂動の間の重力非直線性を含む。観測されていないモードの追跡により、発振相は、バルク相互作用によるよりも早く原始重力子の脱コヒーレンスを引き起こす。以上の結果から, 収縮した原始重力場を探索する最近の提案に対して, 脱コヒーレンス効果はより低くなった。

It is well known that the primordial scalar curvature and tensor perturbations, $\zeta$ and $\gamma_{ij}$, are conserved on super-horizon scales in minimal inflation models. However, their wave functional has a rapidly oscillating phase which is slow-roll unsuppressed, as can be seen either from boundary (total-derivative) terms of cosmological perturbations, or the WKB approximation of the Wheeler-DeWitt equation. Such an oscillatory phase involves gravitational non-linearity between scalar and tensor perturbations. By tracing out unobserved modes, the oscillatory phase causes faster decoherence of primordial gravitons than that caused by bulk interactions. Our results put a stronger lower bound on the decoherence effect relevant to the recent proposals for probing squeezed primordial gravitons.

翻訳日:2023-05-16 17:54:24 公開日:2023-05-14

# フェデレーション学習におけるフェデレーション評価に関する調査

A Survey of Federated Evaluation in Federated Learning ( http://arxiv.org/abs/2305.08070v1 )

ライセンス: Link先を確認

Behnaz Soltani, Yipeng Zhou, Venus Haghighi, John C.S. Lui

(参考訳) 従来の機械学習では、すべてのデータサンプルがサーバによって中央管理されているため、モデル評価を行うのは簡単です。しかし、モデル評価は、この研究でフェデレーション評価と呼ばれるフェデレーション学習(FL)において難しい問題となっている。これは、クライアントがデータプライバシを保存するために元のデータを公開しないためです。フェデレーション評価は、クライアント選択、インセンティブ機構設計、悪意のある攻撃検出などにおいて重要な役割を果たす。本稿では,既存のフェデレーション評価手法の包括的調査を初めて実施する。さらに,FL性能向上のためのフェデレーション評価の様々な応用について検討し,いくつかの課題を想定して今後の研究の方向性を示す。

In traditional machine learning, it is trivial to conduct model evaluation since all data samples are managed centrally by a server. However, model evaluation becomes a challenging problem in federated learning (FL), which is called federated evaluation in this work. This is because clients do not expose their original data to preserve data privacy. Federated evaluation plays a vital role in client selection, incentive mechanism design, malicious attack detection, etc. In this paper, we provide the first comprehensive survey of existing federated evaluation methods. Moreover, we explore various applications of federated evaluation for enhancing FL performance and finally present future research directions by envisioning some challenges.

翻訳日:2023-05-16 17:54:10 公開日:2023-05-14

# ロングテール物体検出のための事例認識繰り返し因子サンプリング

Instance-Aware Repeat Factor Sampling for Long-Tailed Object Detection ( http://arxiv.org/abs/2305.08069v1 )

ライセンス: Link先を確認

Burhaneddin Yaman, Tanvir Mahmud, Chun-Hao Liu

(参考訳) 我々は,ロングテール物体検出における不均衡データ問題に対処するため,恥ずかしいほど簡単なインスタンス認識反復因子サンプリング(irfs)を提案する。実世界のオブジェクト検出における不均衡データセットは、しばしば各クラスに対するインスタンス数の大きな差に苦しむ。希少クラスにおける対象検出モデルの一般化性能を向上させるため,様々なデータサンプリング手法が提案されている。繰り返し因子サンプリング(RFS)はその単純さと有効性から有望である。 RFSはその効率にもかかわらず、インスタンスカウントを完全に無視し、再サンプリングプロセス中のイメージカウントにのみ依存する。しかし、同じ画像数を持つ異なるクラスでインスタンス数は大きく異なる可能性がある。このようなバリエーションは、ロングテール分布に対処するためのイメージとインスタンスの両方の重要性を強調している。そこで本研究では,ロングテールデータセットにおける不均衡の異なる視点を認識するために,再サンプリングプロセスのインスタンス数と画像数を統一するirfを提案する。提案手法は,様々なアーキテクチャやバックボーン上でのLVIS v1.0ベンチマークデータセットに対する有望な結果を示し,RFSに対する相対的な平均精度(AP)が+50\%である希少クラスにおけるオブジェクト検出モデルの性能向上に有効であることを示す。 IRFSは強力なベースラインとして機能し、既存のロングテールフレームワークに簡単に組み込める。

We propose an embarrassingly simple method, instance-aware repeat factor sampling (IRFS), to address the problem of imbalanced data in long-tailed object detection. Imbalanced datasets in real-world object detection often suffer from a large disparity in the number of instances for each class. To improve the generalization performance of object detection models on rare classes, various data sampling techniques have been proposed. Repeat factor sampling (RFS) has shown promise due to its simplicity and effectiveness. Despite its efficiency, RFS completely neglects the instance counts and relies solely on the image count during the re-sampling process. However, the instance count may vary immensely across classes with similar image counts. Such variation highlights the importance of both image and instance counts for addressing long-tail distributions. Thus, we propose IRFS, which unifies instance and image counts in the re-sampling process so that it is aware of different perspectives of the imbalance in long-tailed datasets. Our method shows promising results on the challenging LVIS v1.0 benchmark dataset over various architectures and backbones, demonstrating its effectiveness in improving the performance of object detection models on rare classes with a relative $+50\%$ average precision (AP) improvement over its RFS counterpart. IRFS can serve as a strong baseline and be easily incorporated into existing long-tailed frameworks.
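A small sketch may help show the difference between image-count-only and instance-aware repeat factors. Standard RFS assigns each class the factor max(1, sqrt(t / f_c)), where f_c is the fraction of images containing the class, and each image takes the largest factor among its classes. The abstract does not say how IRFS combines the two counts; the geometric mean used below is an assumption for illustration, as are the function name, the toy data, and the threshold (real long-tailed benchmarks use far smaller thresholds).

```python
import math
from collections import Counter


def repeat_factors(images, threshold=0.5, use_instances=False):
    """images: list of label lists, one label per annotated instance.

    Returns one repeat factor per image.
    """
    n_images = len(images)
    n_instances = sum(len(labels) for labels in images)
    image_count = Counter(c for labels in images for c in set(labels))
    instance_count = Counter(c for labels in images for c in labels)

    def class_factor(c):
        f_img = image_count[c] / n_images
        if use_instances:
            f_inst = instance_count[c] / n_instances
            f = math.sqrt(f_img * f_inst)  # assumed combination: geometric mean
        else:
            f = f_img                      # standard RFS: image frequency only
        return max(1.0, math.sqrt(threshold / f))

    # An image is repeated according to its rarest class.
    return [max((class_factor(c) for c in set(labels)), default=1.0)
            for labels in images]


toy = [["car"] * 5, ["car", "car"], ["car", "cat"], ["car"]]
rfs = repeat_factors(toy)
irfs = repeat_factors(toy, use_instances=True)
```

In this toy set "cat" appears in one of four images but accounts for only one of ten instances, so the instance-aware variant oversamples the third image more strongly than the image-count-only variant does.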

翻訳日:2023-05-16 17:53:59 公開日:2023-05-14

# 韻律的注意と蒸留による終端SLU性能の向上

Improving End-to-End SLU performance with Prosodic Attention and Distillation ( http://arxiv.org/abs/2305.08067v1 )

ライセンス: Link先を確認

Shangeth Rajaa

(参考訳) ほとんどのエンドツーエンドSLU法は、意図予測のための事前訓練されたASRまたは言語モデル機能に依存している。しかし、言論における他の重要な情報、例えば韻律はしばしば無視される。近年の研究では、韻律情報を組み込んだ対話行為の分類結果が改善されている。これらの方法の改善のマージンは最小限であり、神経モデルは韻律的特徴を無視している。本研究では,発話の時間枠にまたがる注意マップを生成するために,韻律的特徴が異なる韻律アテンションを提案する。次に,暗黙の韻律特徴を結合するのではなく,音響エンコーダの韻律情報を明示的に学習する韻律蒸留を提案する。提案手法はどちらもベースライン結果を改善し, プロソディ-蒸留法は, SLURP と STOP のデータセットに対して, 意図的分類精度を 8 %, 2 % 向上させる。

Most End-to-End SLU methods depend on the pretrained ASR or language model features for intent prediction. However, other essential information in speech, such as prosody, is often ignored. Recent research has shown improved results in classifying dialogue acts by incorporating prosodic information. The margins of improvement in these methods are minimal as the neural models ignore prosodic features. In this work, we propose prosody-attention, which uses the prosodic features differently to generate attention maps across time frames of the utterance. Then we propose prosody-distillation to explicitly learn the prosodic information in the acoustic encoder rather than concatenating the implicit prosodic features. Both the proposed methods improve the baseline results, and the prosody-distillation method gives an intent classification accuracy improvement of 8\% and 2\% on SLURP and STOP datasets over the prosody baseline.

翻訳日:2023-05-16 17:53:40 公開日:2023-05-14

# 視覚障害者の画質向上を助ける

Helping Visually Impaired People Take Better Quality Pictures ( http://arxiv.org/abs/2305.08066v1 )

ライセンス: Link先を確認

Maniratnam Mandal, Deepti Ghadiyaram, Danna Gurari, and Alan C. Bovik

(参考訳) 知覚に基づく画像分析技術は、自動ガイダンスを提供することで、視覚障害者がより高品質な写真を撮るのを支援できる。視覚障害者が撮影した写真は、技術的品質(歪み)と、フレーミングや美的構成といった意味的な品質の2つの品質問題の1つまたは両方に悩まされることが多い。ここでは,ぼやけや露出不良,ノイズなど,一般的な技術的歪みの発生を最小限に抑えるためのツールを開発した。我々は、セマンティック品質の相補的な問題に対処せず、その側面を将来の作業に残します。視覚障害者が捉えた画像の技術的品質に対する実用的なフィードバックを評価・提供することの問題は、しばしば発生する重篤な歪みのため、十分に困難である。視覚障がい者生成コンテンツ(VI-UGC)の技術的品質の分析と測定の課題を前進させるために,我々は,非常に大きくユニークな主観的画質と歪みデータセットを構築した。 LIVE-Meta VI-UGC Databaseと呼ばれるこの新しい知覚リソースには、実世界の歪んだVI-UGC画像4万枚とパッチ4万個が含まれており、270万件の人間による知覚品質判断と270万件の歪みラベルが記録されている。この心理測定資源を用いて,局所から大域への空間的品質関係を学習し,VI-UGC画像における最先端の予測性能を達成し,このユニークな歪画像データにおいて既存の画像品質モデルを著しく上回る,ブラインド画像品質および歪み予測器を開発した。また,マルチタスク学習フレームワークを作成することで,ユーザによる品質問題軽減と高品質な画像の取得を支援するプロトタイプフィードバックシステムを開発した。

Perception-based image analysis technologies can be used to help visually impaired people take better quality pictures by providing automated guidance, thereby empowering them to interact more confidently on social media. The photographs taken by visually impaired users often suffer from one or both of two kinds of quality issues: technical quality (distortions), and semantic quality, such as framing and aesthetic composition. Here we develop tools to help them minimize occurrences of common technical distortions, such as blur, poor exposure, and noise. We do not address the complementary problems of semantic quality, leaving that aspect for future work. The problem of assessing and providing actionable feedback on the technical quality of pictures captured by visually impaired users is hard enough, owing to the severe, commingled distortions that often occur. To advance progress on the problem of analyzing and measuring the technical quality of visually impaired user-generated content (VI-UGC), we built a very large and unique subjective image quality and distortion dataset. This new perceptual resource, which we call the LIVE-Meta VI-UGC Database, contains $40$K real-world distorted VI-UGC images and $40$K patches, on which we recorded $2.7$M human perceptual quality judgments and $2.7$M distortion labels. Using this psychometric resource we also created an automatic blind picture quality and distortion predictor that learns local-to-global spatial quality relationships, achieving state-of-the-art prediction performance on VI-UGC pictures, significantly outperforming existing picture quality models on this unique class of distorted picture data. We also created a prototype feedback system that helps to guide users to mitigate quality issues and take better quality pictures, by creating a multi-task learning framework.

翻訳日:2023-05-16 17:53:24 公開日:2023-05-14

# 結束効果モデリングによる大規模行動空間のオフポリシー評価

Off-Policy Evaluation for Large Action Spaces via Conjunct Effect Modeling ( http://arxiv.org/abs/2305.08062v1 )

ライセンス: Link先を確認

Yuta Saito, Qingyang Ren, Thorsten Joachims

(参考訳) 従来の重要度重み付けアプローチが過度なばらつきを被る大規模離散行動空間における文脈的バンディットポリシーのオフポリシー評価(OPE)について検討した。この分散問題を回避すべく,因果効果をクラスター効果と残留効果に分解する新しい手法である結束効果モデル(CEM)に基づく新たな推定器OffCEMを提案する。 OffCEMは、アクションクラスタのみに重要度重み付けを適用し、モデルベースの報酬推定を通じて残留因果効果に対処する。提案した推定器は局所的正当性と呼ばれる新しい条件下では偏りがなく, この条件は残差効果モデルが各クラスタ内の行動間の相対的な期待報酬差を保持することのみを要求する。また,CEMと局所的正当性を最大限に活用するために,第1ステップでバイアスを,第2ステップでばらつきを最小化する新しい2段階のモデルベース推定法を提案する。その結果,OffCEM推定器は従来の様々な推定器に比べてバイアスやばらつきを大幅に改善することがわかった。実験により,OffCEMは特に多くのアクションが存在する場合にOPEを大幅に改善することを示した。

We study off-policy evaluation (OPE) of contextual bandit policies for large discrete action spaces where conventional importance-weighting approaches suffer from excessive variance. To circumvent this variance issue, we propose a new estimator, called OffCEM, that is based on the conjunct effect model (CEM), a novel decomposition of the causal effect into a cluster effect and a residual effect. OffCEM applies importance weighting only to action clusters and addresses the residual causal effect through model-based reward estimation. We show that the proposed estimator is unbiased under a new condition, called local correctness, which only requires that the residual-effect model preserves the relative expected reward differences of the actions within each cluster. To best leverage the CEM and local correctness, we also propose a new two-step procedure for performing model-based estimation that minimizes bias in the first step and variance in the second step. We find that the resulting OffCEM estimator substantially improves bias and variance compared to a range of conventional estimators. Experiments demonstrate that OffCEM provides substantial improvements in OPE especially in the presence of many actions.
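
The estimator described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name `offcem_estimate`, the `cluster_of` mapping, and the way cluster probabilities are obtained (by summing action probabilities within a cluster) are all hypothetical choices made for the example. It only shows the structure named in the abstract: an importance weight computed at the cluster level, applied to the residual `r - f_hat(x, a)`, plus a model-based term.

```python
def offcem_estimate(logged, target_policy, logging_policy,
                    cluster_of, reward_model, actions):
    """Sketch of a cluster-weighted, model-assisted OPE estimate.

    logged: list of (context, action, reward) tuples from the logging policy.
    target_policy, logging_policy: functions (context, action) -> probability.
    cluster_of: function (context, action) -> cluster id (hypothetical).
    reward_model: function (context, action) -> estimated reward f_hat.
    actions: list of all available actions.
    """
    total = 0.0
    for x, a, r in logged:
        c = cluster_of(x, a)
        same_cluster = [b for b in actions if cluster_of(x, b) == c]
        # Cluster-level probabilities: sum action probabilities in cluster c.
        pi_c = sum(target_policy(x, b) for b in same_cluster)
        pi0_c = sum(logging_policy(x, b) for b in same_cluster)
        weight = pi_c / pi0_c  # importance weight on the cluster only
        # Model-based term: expected estimated reward under the target policy.
        model_term = sum(target_policy(x, b) * reward_model(x, b)
                         for b in actions)
        # Weighted residual corrects the model at the cluster level.
        total += weight * (r - reward_model(x, a)) + model_term
    return total / len(logged)
```

Because the weight depends only on the cluster, its range is governed by the number of clusters rather than the number of actions, which is the intuition behind the variance reduction claimed in the abstract.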

翻訳日:2023-05-16 17:52:52 公開日:2023-05-14

# デジタルの兄弟姉妹が自動運転テストを改善する

Two is Better Than One: Digital Siblings to Improve Autonomous Driving Testing ( http://arxiv.org/abs/2305.08060v1 )

ライセンス: Link先を確認

Matteo Biagiola, Andrea Stocco, Vincenzo Riccio, Paolo Tonella

(参考訳) シミュレーションベースのテストは、自動運転ソフトウェアの信頼性を確保するための重要なステップである。実際には、企業が社内またはアウトソーステストのためにサードパーティの汎用シミュレータに依存している場合、実際の自動運転車に対するテスト結果の一般化可能性が危ぶまれる。本稿では,異なる技術で構築された複数の汎用シミュレータ上で自動運転車(AV)をテストする新しい枠組みであるデジタル兄弟(digital siblings)の概念を導入することで,シミュレーションに基づくテストを強化する。まず、個々のシミュレータごとにテストケースを自動的に生成する。次に、特徴マップを用いて実行された運転条件を特徴づけながら、テストをシミュレータ間で移行する。最後に、結合予測故障確率を算出し、兄弟間で合意がある場合にのみ故障を報告する。このフレームワークを2つのオープンソースのシミュレータを用いて実装し,大規模なテストケース集合において,物理的な縮小スケールの自動運転車のデジタルツインと実証的に比較した。本研究は,デジタルツインの故障予測において,デジタル兄弟によるアンサンブル故障予測器が個々のシミュレータよりも優れていることを示す。我々は,自動運転ソフトウェアの自動テストに関心を持つ研究者に,我々のフレームワークが役立ついくつかの方法について論じる。

Simulation-based testing represents an important step to ensure the reliability of autonomous driving software. In practice, when companies rely on third-party general-purpose simulators, either for in-house or outsourced testing, the generalizability of testing results to real autonomous vehicles is at stake. In this paper, we strengthen simulation-based testing by introducing the notion of digital siblings, a novel framework in which the AV is tested on multiple general-purpose simulators, built with different technologies. First, test cases are automatically generated for each individual simulator. Then, tests are migrated between simulators, using feature maps to characterize the exercised driving conditions. Finally, the joint predicted failure probability is computed and a failure is reported only in cases of agreement among the siblings. We implemented our framework using two open-source simulators and we empirically compared it against a digital twin of a physical scaled autonomous vehicle on a large set of test cases. Our study shows that the ensemble failure predictor by the digital siblings is superior to each individual simulator at predicting the failures of the digital twin. We discuss several ways in which our framework can help researchers interested in automated testing of autonomous driving software.
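
The final step (a joint failure probability, with a failure reported only when the siblings agree) might look like the sketch below. The abstract does not say how the joint probability is formed, so the product rule (which treats the simulators as independent predictors) and the 0.5 agreement threshold are assumptions made purely for illustration, as is the name `joint_failure_report`.

```python
def joint_failure_report(sibling_probs, threshold=0.5):
    """Combine per-simulator failure probabilities for one test case.

    sibling_probs: failure probability predicted by each sibling simulator.
    Returns (joint_probability, report_failure).
    """
    joint = 1.0
    for p in sibling_probs:
        joint *= p  # assumption: siblings treated as independent predictors
    # Report a failure only if every sibling predicts one (agreement).
    report_failure = all(p >= threshold for p in sibling_probs)
    return joint, report_failure
```

Requiring agreement trades some recall for fewer simulator-specific false alarms, which matches the motivation stated in the abstract.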

翻訳日:2023-05-16 17:52:35 公開日:2023-05-14

# イベントレベルのビデオ質問応答に対する意味認識動的ふりかえり推論

Semantic-aware Dynamic Retrospective-Prospective Reasoning for Event-level Video Question Answering ( http://arxiv.org/abs/2305.08059v1 )

ライセンス: Link先を確認

Chenyang Lyu, Tianbo Ji, Yvette Graham, Jennifer Foster

(参考訳) Event-Level Video Question Answering (EVQA)は、最適な回答を提供するのに必要な視覚的情報を得るために、ビデオイベント全体にわたる複雑な推論を必要とする。しかしながら、モデル性能の大幅な向上にもかかわらず、質問と視覚的情報、特にイベントレベルでの明示的な意味関係の使用に焦点をあてた研究は少ない。ビデオフレーム間の複雑な推論を容易にするために、このようなセマンティック接続を使用する必要がある。そこで本稿では,ビデオによる質問応答に対する動的振り返り推論手法を提案する。具体的には、質問のSRL構造(エージェント、動詞、患者など)のどの部分に焦点を当てているかに基づいて次のフレームに移行することを決定した動的推論プロセスにおいて、質問のセマンティックロールラベル(SRL)構造を明示的に使用する。ベンチマークEVQAデータセット - TrafficQA で実験を行う。その結果,提案手法は従来の最先端モデルと比較して優れた性能を示すことがわかった。私たちのコードは研究用に公開されます。

Event-Level Video Question Answering (EVQA) requires complex reasoning across video events to obtain the visual information needed to provide optimal answers. However, despite significant progress in model performance, few studies have focused on using the explicit semantic connections between the question and the visual information, especially at the event level. Such semantic connections are needed to facilitate complex reasoning across video frames. Therefore, we propose a semantic-aware dynamic retrospective-prospective reasoning approach for video-based question answering. Specifically, we explicitly use the Semantic Role Labeling (SRL) structure of the question in the dynamic reasoning process, where we decide to move to the next frame based on which part of the SRL structure (agent, verb, patient, etc.) of the question is being focused on. We conduct experiments on a benchmark EVQA dataset, TrafficQA. Results show that our proposed approach achieves superior performance compared to previous state-of-the-art models. Our code will be made publicly available for research use.

翻訳日:2023-05-16 17:52:18 公開日:2023-05-14

# CREMP: 機械学習のための大環状ペプチドのコンフォーマー・ロータマーアンサンブル

CREMP: Conformer-Rotamer Ensembles of Macrocyclic Peptides for Machine Learning ( http://arxiv.org/abs/2305.08057v1 )

ライセンス: Link先を確認

Colin A. Grambow, Hayley Weir, Christian N. Cunningham, Tommaso Biancalani, Kangway V. Chuang

(参考訳) 大環状ペプチドのコンフォメーションランドスケープをモデル化するための計算および機械学習アプローチは、合理的な設計と最適化を可能にする可能性がある。しかしながら、マクロサイクルジオメトリをモデリングするための正確で高速でスケーラブルな手法は、いまだに解明されていない。近年の深層学習はタンパク質構造の予測と小分子コンホメーションアンサンブルの生成を著しく促進しているが,その特異な性質から,大環状ペプチドに対する同様の進歩は得られていない。本稿では,マクロ環状ペプチドの機械学習モデルの開発と評価のための資源であるCREMPを紹介する。 CREMPは36,198個のマクロ環状ペプチドと、Conformer-Rotamer Ensemble Sampling Tool (CREST)を用いて生成された高品質な構造アンサンブルを含む。この新しいデータセットには3130万近いユニークなマクロサイクルジオメトリが含まれており、それぞれが半経験的拡張タイト結合(xtb)dft計算から得られるエネルギーをアノテートしている。このデータセットは、新しい治療のためのペプチド設計と最適化を改善する機械学習モデルの開発を可能にすることを期待する。

Computational and machine learning approaches to model the conformational landscape of macrocyclic peptides have the potential to enable rational design and optimization. However, accurate, fast, and scalable methods for modeling macrocycle geometries remain elusive. Recent deep learning approaches have significantly accelerated protein structure prediction and the generation of small-molecule conformational ensembles, yet similar progress has not been made for macrocyclic peptides due to their unique properties. Here, we introduce CREMP, a resource generated for the rapid development and evaluation of machine learning models for macrocyclic peptides. CREMP contains 36,198 unique macrocyclic peptides and their high-quality structural ensembles generated using the Conformer-Rotamer Ensemble Sampling Tool (CREST). Altogether, this new dataset contains nearly 31.3 million unique macrocycle geometries, each annotated with energies derived from semi-empirical extended tight-binding (xTB) DFT calculations. We anticipate that this dataset will enable the development of machine learning models that can improve peptide design and optimization for novel therapeutics.

翻訳日:2023-05-16 17:52:00 公開日:2023-05-14

# 最適探索空間サイズの学習による非線形モデル予測制御の遺伝的最適化の高速化

Accelerating genetic optimization of nonlinear model predictive control by learning optimal search space size ( http://arxiv.org/abs/2305.08094v1 )

ライセンス: Link先を確認

Eslam Mostafa, Hussein A. Aly, Ahmed Elliethy

(参考訳) 非線形モデル予測制御(NMPC)は、制御サイクル毎にシステムの最適制御入力を推定するために多変量最適化問題を解く。このような最適化は、システム内で継承された非線形性、高度に結合された入力、システムの物理的制限に関連する様々な制約などによってより困難にされている。これらの要因により、最適化は非凸であり、伝統的に解決するのは難しい。遺伝的アルゴリズム(GA)は、解推定に差分計算や勾配評価を含まないため、いくつかのアプリケーション領域でそのような最適化に取り組むために一般的に広く使われている。しかし、GAが最適制御入力を探索する検索空間のサイズは、迅速な応答を必要とするシステムによるGAの適用性に不可欠である。本稿では,NMPCの遺伝的最適化を最適探索空間サイズを学習することで高速化する手法を提案する。提案手法は多変量回帰モデルを訓練し,制御サイクル毎に最小探索空間を適応的に予測する。この探索空間内の最適制御入力を探索できるように、推定最小の探索空間がGAに供給される。提案手法はgaの計算時間を短縮するだけでなく,各サイクルにおける最適制御入力を得る可能性を向上させる。提案手法は2つの非線形システム上で評価され、Nvidia Jetson TX2組み込みプラットフォームのGPU上に実装された他の2つの遺伝的NMPCアプローチと比較された。その結果,提案手法は計算時間を39-53\%削減できることがわかった。さらに、サイクル時間内の最適な制御入力に対する収束率を48-56\%増加させ、結果として大幅な性能向上をもたらす。ソースコードはgithubで公開されている。

Nonlinear model predictive control (NMPC) solves a multivariate optimization problem to estimate the system's optimal control inputs in each control cycle. Such optimization is made more difficult by several factors, such as nonlinearities inherent in the system, highly coupled inputs, and various constraints related to the system's physical limitations. These factors make the optimization non-convex and hard to solve with traditional methods. Genetic algorithms (GAs) are used extensively to tackle such optimization in several application domains because they do not involve differential calculation or gradient evaluation in their solution estimation. However, the size of the search space in which the GA searches for the optimal control inputs is crucial for the applicability of the GA to systems that require fast response. This paper proposes an approach to accelerate the genetic optimization of NMPC by learning the optimal search space size. The proposed approach trains a multivariate regression model to adaptively predict the smallest suitable search space in every control cycle. The estimated search space size is fed to the GA so that it searches for the optimal control inputs within this space. The proposed approach not only reduces the GA's computational time but also improves the chance of obtaining the optimal control inputs in each cycle. The proposed approach was evaluated on two nonlinear systems and compared with two other genetic-based NMPC approaches implemented on the GPU of an Nvidia Jetson TX2 embedded platform in a processor-in-the-loop (PIL) fashion. The results show that the proposed approach provides a 39-53% reduction in computational time. Additionally, it increases the convergence percentage to the optimal control inputs within the cycle's time by 48-56%, resulting in a significant performance enhancement. The source code is available on GitHub.
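
To make the role of the search space size concrete, here is a toy, single-input genetic search restricted to an interval `[center - half_width, center + half_width]`. It is a sketch under assumptions, not the paper's method: in the paper the size comes from a trained regression model, whereas here `half_width` is simply passed in, and the selection, crossover, and mutation operators are generic textbook choices.

```python
import random


def ga_minimize(cost, center, half_width,
                pop_size=30, generations=40, seed=0):
    """Minimize `cost` over a bounded 1-D search space with a simple GA."""
    rng = random.Random(seed)
    lo, hi = center - half_width, center + half_width
    pop = [rng.uniform(lo, hi) for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=cost)
        parents = pop[: pop_size // 2]  # elitist selection: keep better half
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            child = 0.5 * (a + b)  # arithmetic crossover
            child += rng.gauss(0.0, 0.1 * half_width)  # Gaussian mutation
            children.append(min(hi, max(lo, child)))  # clip to search space
        pop = parents + children
    return min(pop, key=cost)
```

A smaller `half_width` concentrates the same population budget on a narrower interval, so fewer generations are needed, provided the interval still contains the optimum. That trade-off is what a learned size predictor would have to manage.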

翻訳日:2023-05-16 17:45:08 公開日:2023-05-14

# アジャイル開発のためのAI:メタ分析

AI for Agile development: a Meta-Analysis ( http://arxiv.org/abs/2305.08093v1 )

ライセンス: Link先を確認

Beatriz Cabrero-Daniel

(参考訳) 本研究は,継続的インテグレーションとデリバリの改善に重点を置いた,人工知能とアジャイルソフトウェア開発方法論を統合することのメリットと課題について検討する。検索した研究の体系的な文献レビューと縦断的なメタ分析を行い、人工知能とアジャイルソフトウェア開発における今後の応用について分析した。このレビューは、特別な社会技術専門知識の必要性など、重要な課題を特定するのに役立った。人工知能はソフトウェア開発プラクティスの改善を約束する一方で、プロセスや実践者への影響をより深く理解し、その実装に関連する間接的な課題に対処するためには、さらなる研究が必要である。

This study explores the benefits and challenges of integrating Artificial Intelligence with Agile software development methodologies, focusing on improving continuous integration and delivery. A systematic literature review and longitudinal meta-analysis of the retrieved studies were conducted to analyse the role of Artificial Intelligence and its future applications within Agile software development. The review helped identify critical challenges, such as the need for specialised socio-technical expertise. While Artificial Intelligence holds promise for improved software development practices, further research is needed to better understand its impact on processes and practitioners, and to address the indirect challenges associated with its implementation.

翻訳日:2023-05-16 17:44:42 公開日:2023-05-14

# Meta-DM: 少数ショット学習における拡散モデルの応用

Meta-DM: Applications of Diffusion Models on Few-Shot Learning ( http://arxiv.org/abs/2305.08092v1 )

ライセンス: Link先を確認

Wentao Hu, Xiurong Jiang, Jiarun Liu, Yuqi Yang, Hui Tian

(参考訳) 数ショット学習(FSL)の分野では、ネットワーク構造とトレーニング戦略の改善に多くの研究が注力してきた。しかし、データ処理モジュールの役割は十分に解明されていない。そこで本稿では,拡散モデルに基づくFSL問題の一般化データ処理モジュールであるMeta-DMを提案する。 Meta-DMはシンプルだが効果的なモジュールであり、既存のFSLメソッドと簡単に統合でき、教師あり設定と教師なし設定の両方で大幅なパフォーマンス向上をもたらす。Meta-DMの理論解析を行い,その性能をいくつかのアルゴリズムで評価する。実験の結果,Meta-DMと特定の手法を組み合わせることで,最先端の成果が得られることがわかった。

In the field of few-shot learning (FSL), extensive research has focused on improving network structures and training strategies. However, the role of data processing modules has not been fully explored. Therefore, in this paper, we propose Meta-DM, a generalized data processing module for FSL problems based on diffusion models. Meta-DM is a simple yet effective module that can be easily integrated with existing FSL methods, leading to significant performance improvements in both supervised and unsupervised settings. We provide a theoretical analysis of Meta-DM and evaluate its performance on several algorithms. Our experiments show that combining Meta-DM with certain methods achieves state-of-the-art results.

翻訳日:2023-05-16 17:44:33 公開日:2023-05-14

# プロンプトベースのブラックボックスチューニングをカラフルに:3つの直交する視点からのモデル一般化の促進

Make Prompt-based Black-Box Tuning Colorful: Boosting Model Generalization from Three Orthogonal Perspectives ( http://arxiv.org/abs/2305.08088v1 )

ライセンス: Link先を確認

Qiushi Sun, Chengcheng Han, Nuo Chen, Renyu Zhu, Jingyang Gong, Xiang Li, Ming Gao

(参考訳) 大規模言語モデル(LLM)は、様々な自然言語処理(NLP)タスクで力を増している。しかし、これらのモデルを下流タスクにチューニングするには、通常、法外なコストを必要とするか、商業的な考慮のために利用できない。近年,タスク固有のプロンプトを勾配や隠れ表現にアクセスせずに最適化することで,この問題に対処するブラックボックスチューニングが提案されている。しかし、既存の作品の多くは、少数ショット学習のシナリオにおいて、勾配なし最適化の可能性をまだ完全には活用していない。本稿では,ブラックボックス最適化の効率性と性能を向上させるための,単純かつ相補的な手法群であるBBT-RGBについて述べる。具体的には,(1)高速収束を促進し過剰適合を緩和する二段階の微分フリー最適化戦略,(2)少数ショット設定での新しい使い方を伴う自動バーバライザ構築,(3)指示探索と自動選択されたデモンストレーションに基づく,より良いプロンプト初期化ポリシー,の3つのプラグアンドプレイコンポーネントを含む。自然言語の理解と推論に関する様々なタスクでの広範な実験により,本手法の有効性が示された。私たちのコードはhttps://github.com/QiushiSun/BBT-RGBで公開されています。

Large language models (LLMs) have shown increasing power on various natural language processing (NLP) tasks. However, tuning these models for downstream tasks usually needs exorbitant costs or is unavailable due to commercial considerations. Recently, black-box tuning has been proposed to address this problem by optimizing task-specific prompts without accessing the gradients and hidden representations. However, most existing works have yet to fully exploit the potential of gradient-free optimization under the scenario of few-shot learning. In this paper, we describe BBT-RGB, a suite of straightforward and complementary techniques for enhancing the efficiency and performance of black-box optimization. Specifically, our method includes three plug-and-play components: (1) Two-stage derivative-free optimization strategy that facilitates fast convergence and mitigates overfitting; (2) Automatic verbalizer construction with its novel usage under few-shot settings; (3) Better prompt initialization policy based on instruction search and auto-selected demonstration. Extensive experiments across various tasks on natural language understanding and inference demonstrate the effectiveness of our method. Our codes are publicly available at https://github.com/QiushiSun/BBT-RGB.

翻訳日:2023-05-16 17:44:23 公開日:2023-05-14

# deep learning empowered type-ii codebook: csiフィードバック強化のための新しい展望

Deep Learning Empowered Type-II Codebook: New Perspectives for Enhancing CSI Feedback ( http://arxiv.org/abs/2305.08081v1 )

ライセンス: Link先を確認

Ke Ma, Yiliang Sang, Yang Ming, Jin Lian, Chang Tian, Zhaocheng Wang

(参考訳) 周波数分割二重系における深層学習に基づくチャネル状態情報(CSI)フィードバックは、学術と産業の両方で広く注目を集めている。本稿では,CSIフィードバックの性能向上のため,無線通信規格におけるType-IIコーデックブックの深層学習への統合に焦点をあてる。 Release 16 Type-IIコードブックに関する既存のディープラーニングベースの研究とは対照的に、Release 17(R17)のType-IIコードブックでは、アップリンクとダウンリンクチャネルの間の角-遅延領域部分の相反性を、ダウンリンクCSIの測定とフィードバックのための角-遅延領域ポートの選択に利用している。この問題に対処するため、我々はR17 Type-IIコードブックを改善するためにディープラーニングを採用する2つの新しい視点を提案する。まず、アップリンクチャネルの信号対雑音比の低さを考慮して、深層学習を用いて、焦点損失を利用してクラス不均衡問題を解決する支配的な角遅延領域ポートを正確に選択する。第2に,基地局におけるR17 Type-IIコードブックのフィードバックに基づいて,深層学習を用いてダウンリンクCSIを再構築し,スパース構造の情報を効果的に活用することを提案する。さらに,重み付きショートカットモジュールを設計し,実際のマルチユーザシナリオに適応するために,平均2乗誤差と和率を組み合わせた2段階の損失関数を提案する。シミュレーションの結果,本提案手法は従来のr17 type-ii コードブックやdeep learning ベンチマークと比較して,総和率を向上できることがわかった。

Deep learning based channel state information (CSI) feedback in frequency division duplex systems has drawn widespread attention in both academia and industry. In this paper, we focus on integrating the Type-II codebook in the wireless communication standards with deep learning to enhance the performance of CSI feedback. In contrast to the existing deep learning based studies on the Release 16 Type-II codebook, the Type-II codebook in Release 17 (R17) exploits the angular-delay-domain partial reciprocity between uplink and downlink channels to select part of angular-delay-domain ports for measuring and feeding back the downlink CSI, where the performance of deep learning based conventional methods is limited due to the deficiency of sparse structures. To address this issue, we propose two new perspectives of adopting deep learning to improve the R17 Type-II codebook. Firstly, considering the low signal-to-noise ratio of uplink channels, deep learning is utilized to accurately select the dominant angular-delay-domain ports, where the focal loss is harnessed to solve the class imbalance problem. Secondly, we propose to adopt deep learning to reconstruct the downlink CSI based on the feedback of the R17 Type-II codebook at the base station, where the information of sparse structures can be effectively leveraged. Furthermore, a weighted shortcut module is designed to facilitate the accurate reconstruction, and a two-stage loss function that combines the mean squared error and sum rate is proposed for adapting to practical multi-user scenarios. Simulation results demonstrate that our proposed deep learning based port selection and CSI reconstruction methods can improve the sum rate performance compared with the traditional R17 Type-II codebook and deep learning benchmarks.
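
The abstract mentions the focal loss as the remedy for class imbalance in port selection. For readers unfamiliar with it, a per-sample binary form is sketched below. The default values `gamma=2.0` and `alpha=0.25` are common choices from the general focal-loss literature, not values reported for this paper, and how the loss is attached to the port-selection network is not shown.

```python
import math


def binary_focal_loss(p, y, gamma=2.0, alpha=0.25):
    """Focal loss for a single binary prediction.

    p: predicted probability of the positive class, strictly in (0, 1).
    y: true label, 1 (e.g. a dominant port) or 0.
    gamma: focusing parameter; gamma = 0 recovers weighted cross-entropy.
    alpha: weight of the positive class.
    """
    p_t = p if y == 1 else 1.0 - p          # probability of the true class
    alpha_t = alpha if y == 1 else 1.0 - alpha
    # (1 - p_t) ** gamma down-weights examples that are already easy.
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)
```

With many non-dominant ports and few dominant ones, the modulating factor keeps the abundant easy negatives from dominating the gradient.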

翻訳日:2023-05-16 17:44:01 公開日:2023-05-14

# 広視野眼底画像から複数の網膜疾患を認識するためのクロスドメイン協調学習

Cross-domain Collaborative Learning for Recognizing Multiple Retinal Diseases from Wide-Field Fundus Images ( http://arxiv.org/abs/2305.08078v1 )

ライセンス: Link先を確認

Qijie Wei, Jingyuan Yang, Bo Wang, Jinrui Wang, Jianchun Zhao, Xinyu Zhao, Sheng Yang, Niranchana Manivannan, Youxin Chen, Dayong Ding and Xirong Li

(参考訳) 本稿では,広視野 (WF) と超広視野 (UWF) の眼底画像から複数の網膜疾患を認識するための課題について述べる。既存のラベル付きカラーファンドス写真(CFP)データを効果的に再利用するために,クロスドメイン協調学習(CdCL)を提案する。教師なしドメイン適応における固定比に基づくミックスアップの成功に触発されて、我々はこの戦略を現在のタスクに再活用する。 CFP画像とWF/UWF画像の視野の違いにより,CFP画像の解剖学的構造がWF/UWF画像よりもかなり大きくなるという,スケールバイアスが自然に存在する。 CdCL法は,変圧器を用いたスケール・バイアス補正法により,スケール不変な特徴を生成できる。 wf画像とuwf画像の両方をカバーする複数のデータセットに関する広範囲な実験によって示されているように、提案手法は多くの競合ベースラインと比較できる。

This paper addresses the emerging task of recognizing multiple retinal diseases from wide-field (WF) and ultra-wide-field (UWF) fundus images. For an effective reuse of existing labeled color fundus photo (CFP) data, we propose Cross-domain Collaborative Learning (CdCL). Inspired by the success of fixed-ratio based mixup in unsupervised domain adaptation, we re-purpose this strategy for the current task. Due to the intrinsic disparity between the fields of view of CFP and WF/UWF images, a scale bias naturally exists in a mixup sample, in that the anatomic structure from a CFP image will be considerably larger than its WF/UWF counterpart. The CdCL method resolves the issue by Scale-bias Correction, which employs Transformers for producing scale-invariant features. As demonstrated by extensive experiments on multiple datasets covering both WF and UWF images, the proposed method compares favorably against a number of competitive baselines.
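
Fixed-ratio mixup itself is a simple pixel-wise blend of a source-domain image and a target-domain image. The sketch below shows only that blending step, on images represented as nested lists; the ratio value `0.7` is illustrative and not taken from the paper, and the Scale-bias Correction module is not modeled here.

```python
def fixed_ratio_mixup(src, tgt, ratio=0.7):
    """Blend two equally sized images with a fixed mixing ratio.

    src: source-domain image (e.g. CFP) as a list of rows of floats.
    tgt: target-domain image (e.g. WF/UWF) with the same shape.
    ratio: weight given to the source image; the target gets 1 - ratio.
    """
    return [
        [ratio * s + (1.0 - ratio) * t for s, t in zip(row_s, row_t)]
        for row_s, row_t in zip(src, tgt)
    ]
```

Because both images are resized to the same grid before blending, structures photographed with a narrower field of view appear larger in the mix, which is the scale bias the abstract refers to.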

翻訳日:2023-05-16 17:43:27 公開日:2023-05-14

# 温熱快適性とプライバシを考慮した住宅需要対応プログラムコストの最適化

Optimization of Residential Demand Response Program Cost with Consideration for Occupants Thermal Comfort and Privacy ( http://arxiv.org/abs/2305.08077v1 )

ライセンス: Link先を確認

Reza Nematirad, M. M. Ardehali, and Amir Khorsandi

(参考訳) 住宅利用者は、住宅エネルギー管理システム(HEMS)を利用することができれば需要対応プログラム(DRP)を使用でき、空調設定点(AC)を自動的に調整し、一部の機器をオフピーク時間にシフトすることで、消費者のコストを低減できる。 HEMSが占有状況を知っている場合、消費者はより多くの経済的利益と熱的快適さを得ることができる。しかし、建物の居住状況では、直接センシングは費用がかかり、不正確であり、住民にとって侵入的である。したがって、予測アルゴリズムは効果的な代替手段になり得る。本研究の目的は, スマート住宅におけるdrp活用のための多目的シミュレーションモデルの構築に向けて, 非誘惑的, 正確, かつ費用対効果の高い手法を提案することである。 (a)電気負荷の低減 (b)熱快適度(ac)温度設定点の調整及び (c)最悪のシナリオアプローチは非常に保守的です。なぜなら、不確実なパラメータが常に最悪の値を取る可能性は低いからです。そこで,不確かさを現実的に考慮するために,不確実性予算とともに柔軟なロバストな最適化手法を開発した。シミュレーションの結果,不確実性を考慮するとコストが36%増加し,交流温度設定値が低下することが示唆された。さらに、DRPを使用すると、一部の家電事業をオフピーク時間に切り替えて需要を減らし、コストを13.2%削減する。

Residential consumers can participate in a demand response program (DRP) if they can use a home energy management system (HEMS), which reduces consumer costs by automatically adjusting air conditioning (AC) setpoints and shifting some appliances to off-peak hours. If the HEMS knows the occupancy status, consumers can gain more economic benefits and thermal comfort. However, direct sensing of building occupancy status is costly, inaccurate, and intrusive for residents, so forecasting algorithms could serve as an effective alternative. The goal of this study is to present a non-intrusive, accurate, and cost-effective approach to developing a multi-objective simulation model for the application of DRPs in a smart residential house, where (a) electrical load demand reduction, (b) adjustment in thermal comfort (AC) temperature setpoints, and (c) , worst-case scenario approach is very conservative, because it is unlikely that all uncertain parameters take their worst values at all times. Therefore, a flexible robust counterpart optimization with uncertainty budgets is developed to account for uncertainty realistically. Simulated results indicate that considering uncertainty increases the costs by 36 percent and decreases the AC temperature setpoints. In addition, using DRPs reduces demand by shifting some appliance operations to off-peak hours and lowers costs by 13.2 percent.
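
The idea of an uncertainty budget can be illustrated with a small calculation: instead of letting every uncertain parameter take its worst value, only a limited number of them (the budget) are allowed to deviate. The function below is a generic sketch of that idea with an integer budget. The names and the additive cost structure are assumptions for illustration and do not reproduce the paper's optimization model.

```python
def budgeted_worst_case(nominal, deviation, budget):
    """Worst-case total cost when at most `budget` parameters deviate.

    nominal: nominal cost contribution of each uncertain parameter.
    deviation: maximum extra cost each parameter can add at its worst value.
    budget: number of parameters allowed to reach their worst value.
    """
    # The adversary picks the `budget` largest deviations.
    worst = sorted(deviation, reverse=True)[:budget]
    return sum(nominal) + sum(worst)
```

A budget of 0 gives the nominal cost and a budget equal to the number of parameters gives the fully conservative worst case, so the budget interpolates between the two.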

翻訳日:2023-05-16 17:43:09 公開日:2023-05-14

# 教員助手による防衛蒸留の改善

Improving Defensive Distillation using Teacher Assistant ( http://arxiv.org/abs/2305.08076v1 )

ライセンス: Link先を確認

Maniratnam Mandal and Suna Gao

(参考訳) 敵の攻撃は、現代のアプリケーションに適用されるディープニューラルネットワークのセキュリティと安全性に重大な脅威をもたらす。より具体的には、コンピュータビジョンに基づくタスクでは、専門家はモデルアーキテクチャの知識を使って、人間の目には見えない敵対的なサンプルを作成することができる。これらの攻撃は、自動運転車や顔認識などの一般的なアプリケーションでセキュリティ上の問題を引き起こす可能性がある。したがって、このような攻撃に対して堅牢なネットワークの構築は非常に望ましいものであり、不可欠である。文献にみられる様々な方法のうち、近年は防御蒸留が期待されている。知識蒸留を用いて、研究者はこれらの攻撃に対して堅牢なモデルを作ることができた。しかし、防御蒸留の弱さを露呈する攻撃が増えている。本研究では,教師補助知識蒸留から着想を得て,補助ネットワークの導入により,蒸留モデルのロバスト性が向上することを示す。一連の実験を通じて, 蒸留温度の異なる蒸留モデルについて, 精度, 感度, 堅牢性の観点から評価した。実験の結果,提案仮説はほとんどの場合,ロバスト性が向上することが示された。さらに,多段蒸留はモデルの精度にほとんど影響を与えず,ロバスト性をさらに向上させることができることを示した。

Adversarial attacks pose a significant threat to the security and safety of deep neural networks being applied to modern applications. More specifically, in computer vision-based tasks, experts can use the knowledge of model architecture to create adversarial samples imperceptible to the human eye. These attacks can lead to security problems in popular applications such as self-driving cars, face recognition, etc. Hence, building networks which are robust to such attacks is highly desirable and essential. Among the various methods present in literature, defensive distillation has shown promise in recent years. Using knowledge distillation, researchers have been able to create models robust against some of those attacks. However, more attacks have been developed exposing weakness in defensive distillation. In this project, we derive inspiration from teacher assistant knowledge distillation and propose that introducing an assistant network can improve the robustness of the distilled model. Through a series of experiments, we evaluate the distilled models for different distillation temperatures in terms of accuracy, sensitivity, and robustness. Our experiments demonstrate that the proposed hypothesis can improve robustness in most cases. Additionally, we show that multi-step distillation can further improve robustness with very little impact on model accuracy.
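
Distillation temperature, which the experiments vary, enters through a temperature-scaled softmax. The helper below is a minimal sketch of that one ingredient; the training loop, the teacher, assistant, and student networks, and the temperatures actually used in the paper are not shown.

```python
import math


def softmax_with_temperature(logits, temperature=1.0):
    """Softmax over logits divided by a distillation temperature."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]
```

Higher temperatures flatten the output distribution, so the soft labels passed from teacher to assistant to student carry more information about the relative similarity of classes.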

翻訳日:2023-05-16 17:42:49 公開日:2023-05-14

# コンピュータビジョンのための圧縮技術の解析

Analyzing Compression Techniques for Computer Vision ( http://arxiv.org/abs/2305.08075v1 )

ライセンス: Link先を確認

Maniratnam Mandal and Imran Khan

(参考訳) 深層ネットワーク圧縮は、コンピュータビジョンアプリケーションにおける実用的なユースケースとして非常に望ましい。文献でいくつかの技術が研究され、それらを組み合わせるための効率的な戦略を見つける研究が進められている。本研究の目的は, 知識蒸留, プルーニング, 量子化の3つの基本的な圧縮手法について検討することであった。基本手法とともに、それらを逐次的に組み合わせることの有効性についても検証する。 MNIST と CIFAR-10 のデータセットを用いて解析し、その結果と、それらから推測される観測結果を提示する。

Compressing deep networks is highly desirable for practical use-cases in computer vision applications. Several techniques have been explored in the literature, and research has been done in finding efficient strategies for combining them. For this project, we aimed to explore three different basic compression techniques (knowledge distillation, pruning, and quantization) for small-scale recognition tasks. Along with the basic methods, we also test the efficacy of combining them in a sequential manner. We analyze them using MNIST and CIFAR-10 datasets and present the results along with a few observations inferred from them.
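
Two of the three techniques, pruning and quantization, can be shown on a flat list of weights. The sketch below uses magnitude pruning and symmetric uniform quantization, which are common textbook variants; the abstract does not state which variants the authors used, so treat these as illustrative. Applying one function after the other mirrors the sequential combination mentioned above.

```python
def magnitude_prune(weights, sparsity):
    """Zero out the fraction `sparsity` of weights with smallest magnitude."""
    k = int(len(weights) * sparsity)
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    drop = set(order[:k])  # indices of the k smallest-magnitude weights
    return [0.0 if i in drop else w for i, w in enumerate(weights)]


def quantize_uniform(weights, num_bits=8):
    """Symmetric uniform quantization followed by dequantization."""
    max_abs = max(abs(w) for w in weights)
    if max_abs == 0:
        return list(weights)
    levels = 2 ** (num_bits - 1) - 1   # e.g. 127 for 8 bits
    scale = max_abs / levels           # step size between quantized values
    return [round(w / scale) * scale for w in weights]
```

For example, `quantize_uniform(magnitude_prune(w, 0.5), num_bits=8)` prunes half of the weights and then quantizes what remains.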

翻訳日:2023-05-16 17:42:33 公開日:2023-05-14

# カオスにおける直交多項式近似と拡張動的モード分解

Orthogonal polynomial approximation and Extended Dynamic Mode Decomposition in chaos ( http://arxiv.org/abs/2305.08074v1 )

ライセンス: Link先を確認

Caroline L. Wormell

(参考訳) extended dynamic mode decomposition (edmd) は、物理科学において広く取り上げられている、ダイナミクスの予測とモデル還元のためのデータ駆動ツールである。この手法は概念的には単純であるが、決定論的カオスでは、その性質が何であるか、何に収束するかは明らかではない。特に、EDMDの最小二乗近似がカオス力学を理解するのに必要な正規関数のクラスをどのように扱うかは明らかではない。本稿では、カオス写像の最も単純な例である円の膨張写像を解析する、EDMDの厳密な一般理論を開発する。単位円(OPUC)上の直交多項式の理論における新たな結果を示すと、無限データ極限において、最小二乗射影は多項式可観測辞書に対して指数関数的に効率的であることを示す。その結果,EDMDを用いて作成した予測値とクープマンスペクトルデータは,指数速度で物理的に有意な限界に収束することがわかった。これは、比較的小さな多項式辞書だけでは、サンプリング測度が均一でない場合でも、EDMDは非常に効果的であることを示す。さらに, OPUCの結果から, データに基づく最小二乗予測が極めて効果的な近似手法である可能性が示唆された。

Extended Dynamic Mode Decomposition (EDMD) is a data-driven tool for forecasting and model reduction of dynamics, which has been extensively taken up in the physical sciences. While the method is conceptually simple, in deterministic chaos it is unclear what its properties are or even what it converges to. In particular, it is not clear how EDMD's least-squares approximation treats the classes of regular functions needed to make sense of chaotic dynamics. In this paper we develop a general, rigorous theory of EDMD on the simplest examples of chaotic maps: analytic expanding maps of the circle. Proving a new result in the theory of orthogonal polynomials on the unit circle (OPUC), we show that in the infinite-data limit, the least-squares projection is exponentially efficient for polynomial observable dictionaries. As a result, we show that the forecasts and Koopman spectral data produced using EDMD in this setting converge to the physically meaningful limits, at an exponential rate. This demonstrates that with only a relatively small polynomial dictionary, EDMD can be very effective, even when the sampling measure is not uniform. Furthermore, our OPUC result suggests that data-based least-squares projections may be a very effective approximation strategy.
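
For orientation, the usual least-squares construction behind EDMD can be written as follows. The notation is an assumption made here, not taken from the abstract: $\psi_1,\dots,\psi_N$ is the dictionary of observables, and $(x_m, y_m)$ with $y_m = f(x_m)$, $m = 1,\dots,M$, are the data pairs generated by the map $f$.

```latex
G_{ij} = \frac{1}{M}\sum_{m=1}^{M} \overline{\psi_i(x_m)}\,\psi_j(x_m),
\qquad
A_{ij} = \frac{1}{M}\sum_{m=1}^{M} \overline{\psi_i(x_m)}\,\psi_j(y_m),
\qquad
K = G^{+} A
```

Here $G^{+}$ is the pseudoinverse of $G$, and $K$ is the matrix that minimizes $\sum_m \lVert \psi(y_m) - \psi(x_m) K \rVert^2$ with $\psi$ read as a row vector. In the infinite-data limit the sums become integrals against the sampling measure, so $K$ represents the Koopman operator followed by the $L^2$-orthogonal projection onto the span of the dictionary. This is the least-squares projection whose efficiency the abstract analyzes.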

翻訳日:2023-05-16 17:42:25 公開日:2023-05-14

# 行列積状態からの高次ベリー曲率

Higher Berry curvature from matrix product states ( http://arxiv.org/abs/2305.08109v1 )

ライセンス: Link先を確認

Ken Shiozaki, Niclas Heinsdorf, Shuhei Ohyama

(参考訳) 高次ベリー曲率は、有限自由度を持つ量子力学系におけるベリー曲率を有限空間次元の量子多体系へ拡張したものとして、KapustinとSpodyneikoによって導入された。本稿では,並進不変な行列積状態を用いた高次ベリー曲率の代替定式化を提案する。これらは、離散化されたパラメータ空間を通して断熱的に変化させたギャップ付きハミルトニアンの集合の基底状態である。行列積状態は射影表現の下で変換されるので、パラメータ空間内の閉ループ上でベリー曲率を評価するだけでは、すべてのゲージ自由度を固定するのに十分ではない。ゲージ不変な実数量を得るため、パラメータ空間内の小さな四面体上で高次元ベリー曲率を評価する。数値計算により,高次ベリー曲率は断熱的進化を通じて連続的に変化し,閉じた3次元パラメータ空間上で量子化されることを確認した。

The higher Berry curvature was introduced by Kapustin and Spodyneiko as an extension of the Berry curvature in quantum mechanical systems with finite degrees of freedom to quantum many-body systems in finite spatial dimensions. In this paper, we propose an alternative formulation of the higher Berry curvature using translationally invariant matrix product states. They are the ground states of a set of gapped Hamiltonians which are evolved adiabatically through a discretized parameter space. Because matrix product states transform under a projective representation, evaluating the Berry curvature on a closed loop through parameter space is not sufficient to fix all the gauge degrees of freedom. To obtain a gauge-invariant real quantity, the higher-dimensional Berry curvature is evaluated on small tetrahedra in parameter space. Our numerical calculations confirm that the higher Berry curvature varies continuously throughout an adiabatic evolution and becomes quantized over a closed 3-dimensional parameter space.

翻訳日:2023-05-16 17:37:01 公開日:2023-05-14

# 香港中学校地理カリキュラムへのGIS統合

Integrating GIS into Hong Kong Secondary School Geography Curriculum ( http://arxiv.org/abs/2305.08108v1 )

ライセンス: Link先を確認

Yin Ching Lai

(参考訳) 香港の地理カリキュラムには2000年代初頭からGISが含まれている。しかし、中等教育におけるGISは、香港の中等地理教育において重要な役割を果たさない。文献レビューによるGISのメリットを分析した結果、GISは上級・中等教育カリキュラムに含めるべきであると考えられる。さらに、香港教育局(EDB)からの明確な指導、香港の地理教師の低い準備、学界や教科書出版社からの支持の無い態度がなければ、GISは香港の中等教育では実施できないことを示している。そこで,edb,地理教員,学界,教科書出版者に対して,地理教育におけるgisの関与を促進するための提案を行った。 EDBは、教師、アカデミア、教科書出版社の参照のための明確なガイドラインを開発し、教師のための学生中心のGIS教育コースを提供する。教師は高度なGIS技術の準備をし、学生と一緒に学ぶことも重要である。学術誌や教科書出版社は、香港の中学校および上級地理カリキュラムを対象とした無料のGISマップを提供することができる。本報告は、香港地理教育におけるGIS導入に関する簡単な情報を提供するが、香港中等教育におけるGISの利用を促進するため、他の学者による新たなアイデアを刺激することができる。

Hong Kong's senior geography curriculum has included GIS since the early 2000s. However, GIS does not play a significant role in Hong Kong secondary geography education. Based on an analysis of the benefits of GIS through a literature review, it is believed that GIS should be included in both the senior and junior geography curriculum. Moreover, the literature review indicates that without clear instruction from the Hong Kong Education Bureau (EDB), and given the low preparedness of Hong Kong geography teachers and unsupportive attitudes from academia and textbook publishers, GIS cannot be implemented in secondary schools of Hong Kong. Therefore, suggestions are made for the EDB, geography teachers, academia and textbook publishers to facilitate GIS involvement in senior and junior geography curriculums. The EDB can develop clear guidelines for teachers, academia and textbook publishers to refer to, and offer student-centered GIS educational courses for teachers. It is important for teachers to be prepared for advanced GIS technology and to even learn along with students. Academics and textbook publishers can provide free GIS maps targeted at Hong Kong's junior and senior geography curriculums. Although the report provides only brief information on GIS implementation in Hong Kong geography education, it can inspire new ideas from other scholars to facilitate the usage of GIS in Hong Kong secondary school geography teaching.

翻訳日:2023-05-16 17:36:44 公開日:2023-05-14

# タクシー需要予測のための時空間データのプライバシーと有用性のバランス

Balancing Privacy and Utility of Spatio-Temporal Data for Taxi-Demand Prediction ( http://arxiv.org/abs/2305.08107v1 )

ライセンス: Link先を確認

Yumeki Goto, Tomoya Matsumoto, Hamada Rizk, Naoto Yanai, Hirozumi Yamaguchi

(参考訳) タクシー需要予測は、タクシー提供施設が運用を最適化し、都市計画者が交通インフラやサービスを改善するための機械学習の重要な応用である。しかし、これらのシステムにおける機密データの使用は、プライバシーとセキュリティに関する懸念を引き起こす。本稿では,複数の当事者がデータをプライベートかつセキュアに保ちながら,自身のデータで機械学習モデルをトレーニングできる,タクシー需要予測のためのフェデレーション学習の利用を提案する。これにより、組織は本来アクセスできないデータに基づいてモデルを構築することができる。潜在的な利点にもかかわらず、タクシー需要予測のための連合学習は、クラス不均衡、一部の当事者におけるデータ不足、多様な施設や地理的地域に対応するためのモデル一般化の必要性など、いくつかの技術的課題を提起している。これらの課題を効果的に解決するために,地理的な緯度・経度座標に対して地域非依存のエンコーディングを用いるシステムを提案する。これにより、提案するモデルは特定の領域に限定されず、任意の領域で最適に動作することができる。さらに,コストに敏感な学習と様々な正規化手法を用いて,それぞれデータ不足と過剰適合の問題を緩和する。日本のタクシーサービス提供者16社から6か月間にわたり収集した実世界データによる評価から,本システムは,統合データで学習した単一モデルと比較して,1%以内の誤差で需要レベルを正確に予測した。また、乗客データに対するメンバーシップ推論攻撃を効果的に防いだ。

Taxi-demand prediction is an important application of machine learning that enables taxi-providing facilities to optimize their operations and city planners to improve transportation infrastructure and services. However, the use of sensitive data in these systems raises concerns about privacy and security. In this paper, we propose the use of federated learning for taxi-demand prediction that allows multiple parties to train a machine learning model on their own data while keeping the data private and secure. This can enable organizations to build models on data they otherwise would not be able to access. Despite its potential benefits, federated learning for taxi-demand prediction poses several technical challenges, such as class imbalance, data scarcity among some parties, and the need to ensure model generalization to accommodate diverse facilities and geographic regions. To effectively address these challenges, we propose a system that utilizes region-independent encoding for geographic lat-long coordinates. By doing so, the proposed model is not limited to a specific region, enabling it to perform optimally in any area. Furthermore, we employ cost-sensitive learning and various regularization techniques to mitigate issues related to data scarcity and overfitting, respectively. Evaluation with real-world data collected from 16 taxi service providers in Japan over a period of six months showed the proposed system predicted demand level accurately within 1% error compared to a single model trained with integrated data. The system also effectively defended against membership inference attacks on passenger data.

翻訳日:2023-05-16 17:36:23 公開日:2023-05-14

# ブロックチェーントランザクション料金予測:機械学習手法の比較

Blockchain Transaction Fee Forecasting: A Comparison of Machine Learning Methods ( http://arxiv.org/abs/2305.08105v1 )

ライセンス: Link先を確認

Conall Butler and Martin Crane

(参考訳) GasはEthereumネットワークのトランザクションフィー計測システムである。ネットワークのユーザは、取引を提出するためにガス価格を選択し、この選択において過払いまたは遅延/未処理の取引のリスクを生じさせる。本研究では,ロンドン・ハードフォークの余波に関するデータを調査し,この大規模フォーク後のネットワークのトランザクションダイナミクスについて考察した。そこで本稿では,EthUSD BitUSDとガス価格の関連について,2019年以前の作業状況について報告する。予測には, 直接再帰型ハイブリッドLSTM, CNNLSTM, Attention LSTMなどの機械学習手法の新たな組み合わせを比較する。これらはウェーブレットしきい値と行列プロファイルデータ処理を組み合わせ、ブロック最小のガス価格を5分間のタイムスケールで複数のルックアヘッドで予測する。本研究は, 行列プロファイルがガス価格データや予測データに適用された最初の応用として, ハードウェアの制約, ハイブリッドモデルの性能, CNNLSTMモデルを考えると, 行列プロファイルデータが注目に基づくモデルを強化することを実証する。入力のウェーブレットコヒーレンス(wavelet coherence)は、1日の時間スケールで複数の変数の相関を示す。直接再帰型ハイブリッドLSTM戦略は他のモデルよりも優れている。ハイブリッドモデルは20分間のルックアヘッドで、25/50分間の予測では注目モデルに匹敵するパフォーマンスである。さまざまなルックヘッドでの予測によって、ユーザはガス価格選択に関するインフォームドな決定と、取引が拒否されることを恐れずに取引を提出する最適な窓口を選択できる。これは、既存の推奨者やオラクルや予測アプローチよりも、ガス価格のダイナミクスに関するより詳細な洞察を与え、単純なヒューリスティックスや限られた外観の地平線を提供する。

Gas is the transaction-fee metering system of the Ethereum network. Users of the network are required to select a gas price for submission with their transaction, creating a risk of overpaying or of delayed/unprocessed transactions. In this work, we investigate data in the aftermath of the London Hard Fork and shed light on the transaction dynamics of the network after this major fork. As such, this paper provides an update on work previous to 2019 on the link between EthUSD BitUSD and gas price. For forecasting, we compare a novel combination of machine learning methods such as Direct Recursive Hybrid LSTM, CNNLSTM, and Attention LSTM. These are combined with wavelet threshold denoising and matrix profile data processing toward the forecasting of block minimum gas price, on a 5-min timescale, over multiple lookaheads. As the first application of the matrix profile to gas price data and forecasting that we are aware of, this study demonstrates that matrix profile data can enhance attention-based models; however, given the hardware constraints, hybrid models outperformed attention and CNNLSTM models. The wavelet coherence of inputs demonstrates correlation in multiple variables on a 1 day timescale, which is a deviation of base fee from gas price. A Direct-Recursive Hybrid LSTM strategy outperforms other models. Hybrid models have favourable performance up to a 20 min lookahead, with performance being comparable to attention models when forecasting 25/50 min ahead. Forecasts over a range of lookaheads allow users to make an informed decision on gas price selection and the optimal window in which to submit their transaction without fear of their transaction being rejected. This, in turn, gives more detailed insight into gas price dynamics than existing recommenders, oracles and forecasting approaches, which provide simple heuristics or limited lookahead horizons.

翻訳日:2023-05-16 17:35:58 公開日:2023-05-14

# 有限レート消去チャネル上のフェデレーションTD学習:マルコフサンプリングによる線形高速化

Federated TD Learning over Finite-Rate Erasure Channels: Linear Speedup under Markovian Sampling ( http://arxiv.org/abs/2305.08104v1 )

ライセンス: Link先を確認

Nicolò Dal Fabbro, Aritra Mitra and George J. Pappas

(参考訳) フェデレーテッド・ラーニング(FL)は、コミュニケーションとプライバシの制約の下で教師付き学習タスクを高速化する効果により、最近注目を集めている。しかし、強化学習に類似したスピードアップが確立できるかどうかは理論的には理解されていない。本研究では,共通政策の評価を迅速化するために,エージェントが中央アグリゲータを介してコミュニケーションするフェデレート政策評価問題について検討する。 FLにおける典型的な通信制約を捉えるために、ベルヌーイ消去モデルに基づいてパケットをドロップできる有限容量アップリンクチャネルを考える。そこで本稿では,線形関数近似を用いた量子化フェデレーション時間差分学習アルゴリズムQFedTDを提案する。我々の主な技術的貢献はQFedTDの有限サンプル解析を提供することである。 (i) 量子化及び消去が収束率に及ぼす影響を強調する。 (ii) マルコフサンプリング下のエージェント数を線形スピードアップ w.r.t. とする。特に,共役学習,分散最適化,ネットワーク制御系文献において,異なる量子化機構やパケットドロップモデルが広く研究されてきたが,マルチエージェント・共役強化学習における効果の非漸近的解析を初めて行った。

Federated learning (FL) has recently gained much attention due to its effectiveness in speeding up supervised learning tasks under communication and privacy constraints. However, whether similar speedups can be established for reinforcement learning remains much less understood theoretically. Towards this direction, we study a federated policy evaluation problem where agents communicate via a central aggregator to expedite the evaluation of a common policy. To capture typical communication constraints in FL, we consider finite capacity up-link channels that can drop packets based on a Bernoulli erasure model. Given this setting, we propose and analyze QFedTD - a quantized federated temporal difference learning algorithm with linear function approximation. Our main technical contribution is to provide a finite-sample analysis of QFedTD that (i) highlights the effect of quantization and erasures on the convergence rate; and (ii) establishes a linear speedup w.r.t. the number of agents under Markovian sampling. Notably, while different quantization mechanisms and packet drop models have been extensively studied in the federated learning, distributed optimization, and networked control systems literature, our work is the first to provide a non-asymptotic analysis of their effects in multi-agent and federated reinforcement learning.
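The ingredients named in the abstract above (TD learning with linear function approximation, quantized uplinks, Bernoulli packet erasures, a central aggregator) can be illustrated with a minimal sketch. This is not the QFedTD algorithm itself: the uniform quantizer, the function names, and the choice to divide by the total number of agents even when packets are dropped are all assumptions made for illustration.

```python
import random


def quantize(vec, num_levels, scale):
    """Hypothetical uniform quantizer: clip to [-scale, scale], snap to a grid."""
    step = 2 * scale / (num_levels - 1)
    out = []
    for v in vec:
        v = max(-scale, min(scale, v))
        out.append(round((v + scale) / step) * step - scale)
    return out


def td_direction(theta, phi_s, reward, phi_next, gamma):
    """TD(0) update direction for the linear value estimate V(s) = theta . phi(s)."""
    v = sum(t * p for t, p in zip(theta, phi_s))
    v_next = sum(t * p for t, p in zip(theta, phi_next))
    delta = reward + gamma * v_next - v  # temporal-difference error
    return [delta * p for p in phi_s]


def aggregate(directions, erasure_prob, num_levels, scale, rng):
    """Server step: sum the quantized directions that survive the erasure
    channel, then divide by the total number of agents (an assumption)."""
    total = [0.0] * len(directions[0])
    for d in directions:
        if rng.random() < erasure_prob:
            continue  # packet dropped by the Bernoulli erasure channel
        q = quantize(d, num_levels, scale)
        total = [a + b for a, b in zip(total, q)]
    return [a / len(directions) for a in total]
```

Each agent would compute `td_direction` on its own Markovian trajectory, and the server would apply `theta[i] += step_size * aggregated[i]`; the step size and feature maps are left to the caller.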

翻訳日:2023-05-16 17:35:27 公開日:2023-05-14

# 水分含有エポキシナノコンポジットの機械学習に基づく粘弾性・粘塑性モデル

A machine learning-based viscoelastic-viscoplastic model for epoxy nanocomposites with moisture content ( http://arxiv.org/abs/2305.08102v1 )

ライセンス: Link先を確認

Betim Bahtiri, Behrouz Arash, Sven Scheffler, Maximilian Jux, Raimund Rolfes

(参考訳) 本研究では, ナノ粒子/エポキシナノコンポジットの循環粘弾性・粘弾性損傷挙動を水分含量で解析する, 深層学習に基づく構成モデルを提案する。このため、サンプリング技法と摂動法を組み合わせたフレームワークを用いて、長期短期記憶ネットワークを訓練する。実験で検証された粘弾性粘弾性粘塑性モデルによって生成されたトレーニングデータとともに、dlモデルが速度依存性の応力-ひずみ関係と一貫した接モジュラリティを正確に捉えることができる。さらに、dlに基づく構成モデルは有限要素解析に実装される。ナノ粒子/エポキシ試料の力-変位応答に及ぼす荷重速度と水分量の影響を有限要素シミュレーションにより検討した。数値的な例は、DLモデルの計算効率が負荷条件に依存し、従来の構成モデルよりもかなり高いことを示している。さらに, 数値計算結果と実験データを比較すると, 異なるナノ粒子や水分量と良好な一致を示した。

In this work, we propose a deep learning (DL)-based constitutive model for investigating the cyclic viscoelastic-viscoplastic-damage behavior of nanoparticle/epoxy nanocomposites with moisture content. For this, a long short-term memory network is trained using a combined framework of a sampling technique and a perturbation method. The training framework, along with the training data generated by an experimentally validated viscoelastic-viscoplastic model, enables the DL model to accurately capture the rate-dependent stress-strain relationship and consistent tangent moduli. In addition, the DL-based constitutive model is implemented into finite element analysis. Finite element simulations are performed to study the effect of load rate and moisture content on the force-displacement response of nanoparticle/epoxy samples. Numerical examples show that the computational efficiency of the DL model depends on the loading condition and is significantly higher than that of the conventional constitutive model. Furthermore, comparing numerical results and experimental data demonstrates good agreement for different nanoparticle and moisture contents.

翻訳日:2023-05-16 17:35:07 公開日:2023-05-14

# 正定値核による条件付き平均埋め込みと最適特徴選択

Conditional mean embeddings and optimal feature selection via positive definite kernels ( http://arxiv.org/abs/2305.08100v1 )

ライセンス: Link先を確認

Palle E.T. Jorgensen, Myung-Sin Song, and James Tian

(参考訳) ここでは,条件付き平均埋め込み(CME)に対する新しい演算子理論的アプローチを考える。本稿では,スペクトル解析に基づく最適化手法とカーネル,確率過程,構成学習アルゴリズムの併用について述べる。当初与えられた非線形データに対しては、最適化に基づく特徴選択を検討する。これは、学習モデルからの回帰アルゴリズムによる最適な特徴選択を構築する際に、正定値(p.d.)カーネルの凸集合を使用する。したがって、(適切な学習アルゴリズムのための)トレーニングデータの初期入力により、p.d.カーネル $K$ のそれぞれの選択は、様々なヒルベルト空間と特徴の実現をもたらす。ここでの新しい考え方は、凸集合$C$に属する正定値カーネル$K$の選択された集合にわたる最適化を許すことである。したがって、特徴表現の「最適な」選択は、指定された凸集合$C$内のp.d.カーネル$K$に対する二次的な最適化に依存する。

Motivated by applications, we consider here new operator theoretic approaches to Conditional mean embeddings (CME). Our present results combine a spectral analysis-based optimization scheme with the use of kernels, stochastic processes, and constructive learning algorithms. For initially given non-linear data, we consider optimization-based feature selections. This entails the use of convex sets of positive definite (p.d.) kernels in a construction of optimal feature selection via regression algorithms from learning models. Thus, with initial inputs of training data (for a suitable learning algorithm), each choice of p.d. kernel $K$ in turn yields a variety of Hilbert spaces and realizations of features. A novel idea here is that we shall allow an optimization over selected sets of kernels $K$ from a convex set $C$ of positive definite kernels $K$. Hence our "optimal" choices of feature representations will depend on a secondary optimization over p.d. kernels $K$ within a specified convex set $C$.

翻訳日:2023-05-16 17:34:51 公開日:2023-05-14

# 発話レベルの音声表現を分離するための自己教師型ニューラルファクター解析

Self-supervised Neural Factor Analysis for Disentangling Utterance-level Speech Representations ( http://arxiv.org/abs/2305.08099v1 )

ライセンス: Link先を確認

Weiwei Lin, Chenhang He, Man-Wai Mak, Youzhi Tu

(参考訳) wav2vecやHuBERTのような自己教師付き学習(SSL)音声モデルは、音声認識(ASR)における最先端の性能を示し、低ラベル・リソース設定において非常に有用であることが証明されている。しかし、sslモデルの成功はまだ話者、感情、言語認識といった発話レベルのタスクに移行しておらず、優れたパフォーマンスを得るためにはsslモデルの教師付き微調整が必要である。問題の原因は,異種表現の欠如と,これらの課題に対する発話レベルの学習目標にあると考える。 HuBERTがクラスタリングを使って隠れ音響ユニットを発見する方法に着想を得て、隠れ音響ユニットを用いてSSL機能を整列させる因子分析(FA)モデルを定式化した。下位の発話レベル表現は、一致した特徴に対する確率的推論を用いて、音声の内容から切り離される。さらに、faモデルから派生した変動下限は発話レベルの目標を提供し、エラー勾配をトランスフォーマ層にバックプロパゲーションし、高度に識別可能な音響単位を学ぶことができる。 HuBERTのマスク付き予測トレーニングと組み合わせて使用する場合、私たちのモデルは、ラベル付きデータの20%しか表示されないSUPERBベンチマークの発話レベル非意味タスクにおいて、現在の最高のモデルであるWavLMよりも優れています。

Self-supervised learning (SSL) speech models such as wav2vec and HuBERT have demonstrated state-of-the-art performance on automatic speech recognition (ASR) and proved to be extremely useful in low label-resource settings. However, the success of SSL models has yet to transfer to utterance-level tasks such as speaker, emotion, and language recognition, which still require supervised fine-tuning of the SSL models to obtain good performance. We argue that the problem is caused by the lack of disentangled representations and an utterance-level learning objective for these tasks. Inspired by how HuBERT uses clustering to discover hidden acoustic units, we formulate a factor analysis (FA) model that uses the discovered hidden acoustic units to align the SSL features. The underlying utterance-level representations are disentangled from the content of speech using probabilistic inference on the aligned features. Furthermore, the variational lower bound derived from the FA model provides an utterance-level objective, allowing error gradients to be backpropagated to the Transformer layers to learn highly discriminative acoustic units. When used in conjunction with HuBERT's masked prediction training, our models outperform the current best model, WavLM, on all utterance-level non-semantic tasks on the SUPERB benchmark with only 20% of labeled data.

翻訳日:2023-05-16 17:34:38 公開日:2023-05-14

# tao一般微分と差分:理論と応用

Tao General Differential and Difference: Theory and Application ( http://arxiv.org/abs/2305.08098v1 )

ライセンス: Link先を確認

Linmi Tao, Ruiyang Liu, Donglai Tao, Wu Xia, Feilong Ma, Jingmao Cui

(参考訳) 現代の数値解析は離散データ上で行われ、数値差分計算はコアの1つであり、必須である。それにもかかわらず、差分アルゴリズムはノイズに対する感受性に致命的な弱点を持ち、信号処理を含む様々な分野において長年の課題となっている。差は離散領域における微分の拡張あるいは一般化である。しかし、離散計算における有限区間のため、dy と dx がともに無限小(ライプニッツ)、dx の極限が 0(コーシー)であるような微分の最も基本的な定義を満たすことに失敗する。この点において、差分の一般化は成立しない。この問題に対処するため、元の微分アプローチから離れ、有限区間に基づく微分を構築し、さらに一般化して畳み込みによる差を求める。この理論に基づき、実用的信号処理に適した様々な差分演算子を提案する。実験の結果、これらの差分演算子は高ノイズ免疫を含む例外的な信号処理能力を有することがわかった。

Modern numerical analysis is executed on discrete data, of which numerical difference computation is one of the cores and is indispensable. Nevertheless, difference algorithms have a critical weakness in their sensitivity to noise, which has long posed a challenge in various fields including signal processing. Difference is an extension or generalization of differential in the discrete domain. However, due to the finite interval in discrete calculation, there is a failure in meeting the most fundamental definition of differential, where dy and dx are both infinitesimal (Leibniz) or the limit of dx is 0 (Cauchy). In this regard, the generalization of differential to difference does not hold. To address this issue, we depart from the original derivative approach, construct a finite interval-based differential, and further generalize it to obtain the difference by convolution. Based on this theory, we present a variety of difference operators suitable for practical signal processing. Experimental results demonstrate that these difference operators possess exceptional signal processing capabilities, including high noise immunity.
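To make the idea of a finite-interval difference computed by convolution concrete, here is a small sketch. It does not implement the Tao operators from the paper, whose kernels are not given in this abstract. It uses a standard least-squares slope kernel (the first-derivative case of a Savitzky-Golay filter) as a stand-in, and the function names are hypothetical.

```python
def central_difference(y):
    """Classic two-point central difference with unit sample spacing."""
    return [(y[i + 1] - y[i - 1]) / 2.0 for i in range(1, len(y) - 1)]


def smoothed_difference(y, half_width):
    """Difference over a finite interval of 2 * half_width + 1 samples.

    The kernel k / sum(k^2) is antisymmetric; sliding it over the signal gives
    the slope of the least-squares line fitted to each window. This is a
    stand-in kernel, not the operator proposed in the paper.
    """
    offsets = range(-half_width, half_width + 1)
    norm = sum(k * k for k in offsets)
    kernel = [k / norm for k in offsets]
    out = []
    for i in range(half_width, len(y) - half_width):
        window = y[i - half_width : i + half_width + 1]
        out.append(sum(c * v for c, v in zip(kernel, window)))
    return out
```

With `half_width=1` the kernel reduces to the central difference; larger windows average over more samples, trading bandwidth for lower sensitivity to sample-level noise.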

翻訳日:2023-05-16 17:34:12 公開日:2023-05-14

# 神経機械翻訳における知識蒸留の理解と改善に向けて

Towards Understanding and Improving Knowledge Distillation for Neural Machine Translation ( http://arxiv.org/abs/2305.08096v1 )

ライセンス: Link先を確認

Songming Zhang, Yunlong Liang, Shuaibo Wang, Wenjuan Han, Jian Liu, Jinan Xu and Yufeng Chen

(参考訳) 知識蒸留(KD)はニューラルマシン翻訳におけるモデル圧縮の有望な技術である。しかし、kdに知識が隠されている場所はまだ明確ではなく、kdの開発を妨げる可能性がある。本研究では、まずこの謎を経験的観点から解き出し、その知識が教師のトップ1の予測から得られることを示し、また、単語とシーケンスレベルのKDの間の潜在的なつながりを構築するのにも役立ちます。さらに,この発見に基づいて,バニラ語レベルのkdに固有の2つの問題点を指摘する。第一に、kdの現在の目的は、知識を学ぶために全分布にその焦点を広げるが、最も重要なtop-1情報に対する特別な扱いを欠いている。第二に、この知識は、教師のtop-1予測のほとんどが、kdの可能性をさらに制限する地上トークンと重なり合うという事実から、金の情報によっておおむねカバーされている。これらの問題に対処するために、新しい方法である Top-1 Information Enhanced Knowledge Distillation (TIE-KD) を提案する。具体的には,教師からのtop-1情報の学習を強制するために,階層的ランキングロスを設計する。さらに, 地中標的を使わずにデータを蒸留し, さらなる知識を注入する反復的kd手法を開発した。 WMT'14英語-ドイツ語、WMT'14英語-フランス語、WMT'16英語-ルーマニア語の実験では、我々の手法がTransformer$_{base}$ studentsを+1.04, +0.60, +1.11BLEUスコアで向上させ、バニラ語レベルのKDベースラインを著しく上回ることを示した。さらに,本手法は,既存のKD手法よりも,教師と生徒の容量ギャップの多様さに対して高い一般化性を示す。

Knowledge distillation (KD) is a promising technique for model compression in neural machine translation. However, where the knowledge hides in KD is still not clear, which may hinder the development of KD. In this work, we first unravel this mystery from an empirical perspective and show that the knowledge comes from the top-1 predictions of teachers, which also helps us build a potential connection between word- and sequence-level KD. Further, we point out two inherent issues in vanilla word-level KD based on this finding. Firstly, the current objective of KD spreads its focus to whole distributions to learn the knowledge, yet lacks special treatment on the most crucial top-1 information. Secondly, the knowledge is largely covered by the golden information due to the fact that most top-1 predictions of teachers overlap with ground-truth tokens, which further restricts the potential of KD. To address these issues, we propose a novel method named Top-1 Information Enhanced Knowledge Distillation (TIE-KD). Specifically, we design a hierarchical ranking loss to enforce the learning of the top-1 information from the teacher. Additionally, we develop an iterative KD procedure to infuse more additional knowledge by distilling on the data without ground-truth targets. Experiments on WMT'14 English-German, WMT'14 English-French and WMT'16 English-Romanian demonstrate that our method can respectively boost Transformer$_{base}$ students by +1.04, +0.60 and +1.11 BLEU scores and significantly outperform the vanilla word-level KD baseline. Besides, our method shows higher generalizability on different teacher-student capacity gaps than existing KD techniques.
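The contrast drawn above, between matching the whole teacher distribution and giving the teacher's top-1 token special treatment, can be sketched for a single target position. The KL term is the usual word-level KD objective. The ranking term is a plain pairwise hinge on the student logits, used here only as a simplified stand-in for the paper's hierarchical ranking loss, whose exact form is not given in this abstract; the function names and the `margin` parameter are assumptions.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    z = sum(exps)
    return [e / z for e in exps]


def word_level_kd(student_logits, teacher_logits):
    """KL(teacher || student) over the vocabulary at one target position."""
    p = softmax(teacher_logits)
    q = softmax(student_logits)
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0.0)


def top1_ranking_loss(student_logits, teacher_logits, margin=1.0):
    """Hinge penalty whenever the student fails to rank the teacher's top-1
    token above another token by at least `margin` (illustrative stand-in)."""
    top1 = max(range(len(teacher_logits)), key=teacher_logits.__getitem__)
    return sum(
        max(0.0, margin - (student_logits[top1] - s))
        for i, s in enumerate(student_logits)
        if i != top1
    )
```

A combined objective would add the two terms with a weighting coefficient chosen on validation data; that coefficient is not specified here.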

翻訳日:2023-05-16 17:33:55 公開日:2023-05-14

# パラメトリック駆動による結合キャビティアレイの強度と原子-原子相互作用範囲の増強

Enhancing strength and range of atom-atom interaction in a coupled-cavity array via parametric drives ( http://arxiv.org/abs/2305.08127v1 )

ライセンス: Link先を確認

Ya-long Ren, Sheng-li Ma, Stefano Zippilli, David Vitali and Fu-li Li

(参考訳) 原子間のコヒーレントな長距離相互作用は、量子情報科学の分野における多くの応用に必須であるが、通常は原子分離の増加に伴い指数関数的に減少する。ここでは、2光子(パラメトリック)駆動を受ける結合キャビティアレイによって媒介される長距離原子-原子相互作用を劇的に向上させる。本手法により, 単一光子束縛状態波動関数の局在長と有効原子-光子結合強度の両方を大幅に増幅し, 2つの遠方原子間の光子を介するコヒーレント相互作用を著しく改善することができる。さらに、情報伝達の促進と原子間の絡み合いの生成について分析することにより、この効果を説明する。

Coherent long-range interactions between atoms are a prerequisite for numerous applications in the field of quantum information science, but they usually decrease exponentially with the increase in atomic separation. Here we present an appealing method to dramatically enhance the long-range atom-atom interaction mediated by a coupled-cavity array that is subjected to two-photon (parametric) drives. Our method allows one to greatly amplify both the localization length of the single-photon bound-state wavefunction and the effective atom-photon coupling strength, resulting in a significant improvement of photon-mediated coherent interaction between two distant atoms. Additionally, we illustrate this effect by analyzing how it facilitates the transfer of information and the creation of entanglement between the atoms.

翻訳日:2023-05-16 17:26:17 公開日:2023-05-14

# 学習可能な概念のセマンティックコミュニケーション

Semantic Communication of Learnable Concepts ( http://arxiv.org/abs/2305.08126v1 )

ライセンス: Link先を確認

Francesco Pase, Szymon Kobus, Deniz Gunduz, Michele Zorzi

(参考訳) 我々は、未知および潜在的確率写像という一連の概念を伝達する問題を、例を通してのみ観察できる、すなわち、写像規則が未知である、と考える。送信機は、利用可能な例に学習アルゴリズムを適用し、一連のモデル、すなわち、観測されたデータをよりうまく記述できる既知の関数、および潜在的に基礎となる概念に対して確率分布を最適化することにより、データから知識を抽出する。送信機は、学習したモデルをレート制限されたチャネルを介してリモートレシーバーに通信し、受信機が、そのセマンティック空間において可能な限り正確なサンプル概念を記述できるモデルをデコードできるようにする。分析のモチベーションを得た後、ネットワークにおける経験的・強い協調の概念との関係を指摘し、コミュニケーション概念の形式的問題を提案し、その速度歪曲特性を提供する。また、歪み率関数のバウンドも提供する。

We consider the problem of communicating a sequence of concepts, i.e., unknown and potentially stochastic maps, which can be observed only through examples, i.e., the mapping rules are unknown. The transmitter applies a learning algorithm to the available examples, and extracts knowledge from the data by optimizing a probability distribution over a set of models, i.e., known functions, which can better describe the observed data, and so potentially the underlying concepts. The transmitter then needs to communicate the learned models to a remote receiver through a rate-limited channel, to allow the receiver to decode the models that can describe the underlying sampled concepts as accurately as possible in their semantic space. After motivating our analysis, we propose the formal problem of communicating concepts, and provide its rate-distortion characterization, pointing out its connection with the concepts of empirical and strong coordination in a network. We also provide a bound for the distortion-rate function.

翻訳日:2023-05-16 17:26:03 公開日:2023-05-14

# 資格トレースとしてのthetaシークエンス:クレジット割り当てに対する生物学的解決策

Theta sequences as eligibility traces: a biological solution to credit assignment ( http://arxiv.org/abs/2305.08124v1 )

ライセンス: Link先を確認

Tom M George

(参考訳) クレジット割り当て問題(例えば、RLにおけるポリシー評価)は、しばしば先行する状態を通じて予測誤差をブートストラップするか、あるいは時間的に拡張されたメモリトレースを維持する必要があるが、これらは生物学的なニューロンネットワークにとって不都合または非現実的な解法である。海馬のシータ振動における神経活動の連鎖であり、覚醒行動の迅速なプレイスルーを表すと考えられるシータ系列を解法として提案する。 シータ系列のモデルを解析してシミュレートすることにより、シータ系列が行動を圧縮し、既存のだが短い$\mathsf{O}(10)$ msのニューロンメモリトレースが効果的に拡張され、長いメモリトレースなしでブートストラップフリーのクレジット割り当てが可能になることを示す。これはTD($\lambda$)における適格度トレースの使用と等価である。

Credit assignment problems, for example policy evaluation in RL, often require bootstrapping prediction errors through preceding states or maintaining temporally extended memory traces; solutions which are unfavourable or implausible for biological networks of neurons. We propose theta sequences -- chains of neural activity during theta oscillations in the hippocampus, thought to represent rapid playthroughs of awake behaviour -- as a solution. By analysing and simulating a model for theta sequences we show they compress behaviour such that existing but short $\mathsf{O}(10)$ ms neuronal memory traces are effectively extended allowing for bootstrap-free credit assignment without long memory traces, equivalent to the use of eligibility traces in TD($\lambda$).
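For readers unfamiliar with the TD($\lambda$) reference point mentioned above, the following sketch shows tabular TD($\lambda$) with accumulating eligibility traces on a deterministic chain. It illustrates the classical algorithm only, not the theta-sequence model; the chain environment and the function name are assumptions chosen to keep the example small.

```python
def td_lambda_chain(num_states, episodes, alpha, gamma, lam):
    """Tabular TD(lambda) on a chain 0 -> 1 -> ... -> terminal.

    Every transition gives reward 0 except the last one, which gives reward 1.
    """
    values = [0.0] * num_states
    for _ in range(episodes):
        traces = [0.0] * num_states
        for s in range(num_states):
            terminal = s == num_states - 1
            reward = 1.0 if terminal else 0.0
            v_next = 0.0 if terminal else values[s + 1]
            delta = reward + gamma * v_next - values[s]  # TD error
            traces = [gamma * lam * e for e in traces]   # decay old traces
            traces[s] += 1.0                             # mark current state
            values = [v + alpha * delta * e for v, e in zip(values, traces)]
    return values
```

With `lam=0` a single episode credits only the final state, so earlier states must wait for bootstrapping in later episodes. With `lam=1` and `gamma=1` the same episode credits every visited state at once, which is the effect a temporally extended memory trace provides.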

翻訳日:2023-05-16 17:25:47 公開日:2023-05-14

# 結合ランダムグラフモデルにおける量子スカー状態

Quantum Scar States in Coupled Random-Graph Models ( http://arxiv.org/abs/2305.08123v1 )

ライセンス: Link先を確認

Bhilahari Jeevanesan

(参考訳) 我々は,l$site pxp-model のヒルベルト空間接続を,基底状態のグレイ符号によるハミルトニアン行列を構築して解析する。一度構築すると、行列は単純な構造を明らかにする:それらはすべて単一のハミルトンパスのバックボーンとサイド接続から形成される。 PXPモデルは、領域法的な絡み合いを持つスペクトルの中央に傷跡が存在することで知られている。 pxp-モデルの隣接グラフを開発するという理解は、可変制約次数と可変ネットワークトポロジーを持つハミルトニアンのクラスをどのように構築するかに関する一般的な指示を与えてくれる。ネットワークトポロジがランダムグラフモデルを中心に構築されるこのモデルのバージョンについて検討する。弱絡み合った中スペクトル固有状態の2つのクラスを見つける。第1のクラスはサブシステムの製品に近い固有状態である傷跡であり、第2のクラスは$\log 2$エンタングルメントエントロピーを持ち、特別なタイプのサブグラフの発生と結びついている。後者の状態は、Lin-Motrunich $\sqrt{2}$-scarsに似ている。

We analyze the Hilbert space connectivity of the $L$ site PXP-model by constructing the Hamiltonian matrices via a Gray code numbering of basis states. Once constructed, the matrices reveal a simple structure: they are all formed out of a single Hamiltonian-path backbone and side-connections. The PXP model is known for the presence of scar states in the middle of the spectrum that have area-law entanglement. The understanding that we develop of the PXP-model's adjacency graph equips us with a general instruction on how to construct a class of Hamiltonians with tunable constraint degree and variable network topology. We explore a version of this model where the network topology is constructed around a random-graph model. We find two classes of weakly-entangled mid-spectrum eigenstates. The first class are scars that are near-product eigenstates of the subsystems, while the second class has $\log 2$ entanglement entropy and is tied to the occurrence of special types of subgraphs. The latter states have some resemblance to the Lin-Motrunich $\sqrt{2}$-scars.
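A short sketch can make the "Gray code numbering of basis states" concrete. It assumes an open chain and the standard PXP constraint that no two neighbouring sites are both excited, so a site can flip only when both neighbours are in the ground state. The exact matrix construction used in the paper is not given in this abstract, so the function names and the ordering convention are illustrative.

```python
def gray_code(n):
    """Binary-reflected Gray code of n; consecutive codes differ in one bit."""
    return n ^ (n >> 1)


def is_allowed(state):
    """True if no two adjacent sites are both excited (open chain)."""
    return state & (state >> 1) == 0


def pxp_graph(num_sites):
    """Allowed basis states in Gray-code order, plus the adjacency edges.

    Two allowed states are connected when they differ by one spin flip,
    which on an open chain matches the nonzero PXP matrix elements.
    """
    states = [g for g in map(gray_code, range(2 ** num_sites)) if is_allowed(g)]
    index = {s: i for i, s in enumerate(states)}
    edges = set()
    for s in states:
        for site in range(num_sites):
            t = s ^ (1 << site)
            if t in index:
                a, b = index[s], index[t]
                edges.add((min(a, b), max(a, b)))
    return states, sorted(edges)
```

The number of allowed states grows as a Fibonacci number in the chain length, and the returned edge list is the adjacency graph whose connectivity the abstract discusses.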

翻訳日:2023-05-16 17:25:27 公開日:2023-05-14

# OTTメディアの予測分析におけるコールドスタートエニグマの展開 : 相乗的メタインサイトとマルチモーダルアンサンブルマスター

Unraveling Cold Start Enigmas in Predictive Analytics for OTT Media: Synergistic Meta-Insights and Multimodal Ensemble Mastery ( http://arxiv.org/abs/2305.08120v1 )

ライセンス: Link先を確認

K. Ganguly, A. Patra

(参考訳) コールドスタート問題は、Over-The-Top(OTT)プラットフォームで新たにローンチされたショーの視聴者数を予測するようなメディアユースケースを含む、さまざまな領域で一般的な課題である。本研究では,メタデータの活用とマルチモデルアンサンブルによるコールドスタート問題への汎用的アプローチを提案する。提案手法は,特徴工学,モデル選択,および重み付き予測平均に基づくアンサンブルアプローチを含む。提案手法の性能は,様々な性能指標を用いて評価する。その結果,マルチモデルアンサンブルアプローチは,個々のモデルと比較して予測精度が著しく向上することがわかった。

The cold start problem is a common challenge in various domains, including media use cases such as predicting viewership for newly launched shows on Over-The-Top (OTT) platforms. In this study, we propose a generic approach to tackle cold start problems by leveraging metadata and employing multi-model ensemble techniques. Our methodology includes feature engineering, model selection, and an ensemble approach based on a weighted average of predictions. The performance of our proposed method is evaluated using various performance metrics. Our results indicate that the multi-model ensemble approach significantly improves prediction accuracy compared to individual models.
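The "weighted average of predictions" mentioned above can be sketched as follows. The abstract does not say how the weights are chosen, so weighting each model by its inverse validation error is an assumption made for illustration, as are the function names.

```python
def inverse_error_weights(val_errors):
    """Normalized weights proportional to 1 / validation error (assumed scheme)."""
    inv = [1.0 / e for e in val_errors]
    total = sum(inv)
    return [w / total for w in inv]


def ensemble_predict(model_preds, weights):
    """Weighted average across models; model_preds[m][i] is the prediction
    of model m for item i."""
    num_items = len(model_preds[0])
    return [
        sum(w * preds[i] for w, preds in zip(weights, model_preds))
        for i in range(num_items)
    ]
```

For a newly launched show with no viewing history, each base model would predict from metadata features alone, and the ensemble would combine those predictions.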

翻訳日:2023-05-16 17:25:08 公開日:2023-05-14

# 容易軸強磁性体を用いたキャビティマグノニクス : 臨界的に強化されたマグノンスクイーズと光間相互作用

Cavity magnonics with easy-axis ferromagnet: critically enhanced magnon squeezing and light-matter interaction ( http://arxiv.org/abs/2305.08119v1 )

ライセンス: Link先を確認

Jongjun M. Lee, Hyun-Woo Lee, Myung-Joong Hwang

(参考訳) マグノンスクイージングの生成と探索は、量子マグノニクスの分野において重要な課題である。本研究では,この課題に対処するため,容易軸強磁性体を用いたキャビティマグノニクスのセットアップを提案する。この目的のために,我々はまず,容易軸強磁性体におけるマグノンスクイーズの発生機構を確立し,イジング相転移点近傍の外部磁場をチューニングすることにより、マグノンスクイーズを臨界的に向上させることができることを示す。磁石を空洞磁場に結合すると、有効キャビティ-マグノン相互作用はマグノンスクイーズに比例し、静磁場を用いてキャビティ-マグノン結合強度を高めることができる。キャビティフィールドの周波数シフトを測定することで,マグノンスクイーズを探査できることを実証した。さらに, 静磁場をチューニングすることで, マグネトロン超ラジアント相転移を観測することができ, キャビティとマグネットとの磁気相互作用が弱すぎて超ラジアント相転移を駆動できないという課題を克服できる。我々の研究は、磁石の内在的性質を利用して、従来の空洞QED物理を超える空洞マグノニクスのユニークな能力を開発する方法である。

Generating and probing magnon squeezing is an important challenge in the field of quantum magnonics. In this work, we propose a cavity magnonics setup with an easy-axis ferromagnet to address this challenge. To this end, we first establish a mechanism for the generation of magnon squeezing in the easy-axis ferromagnet and show that the magnon squeezing can be critically enhanced by tuning an external magnetic field near the Ising phase transition point. When the magnet is coupled to the cavity field, the effective cavity-magnon interaction becomes proportional to the magnon squeezing, allowing one to enhance the cavity-magnon coupling strength using a static field. We demonstrate that the magnon squeezing can be probed by measuring the frequency shift of the cavity field. Moreover, a magnonic superradiant phase transition can be observed in our setup by tuning the static magnetic field, overcoming the challenge that the magnetic interaction between the cavity and the magnet is typically too weak to drive the superradiant transition. Our work paves the way to develop unique capabilities of cavity magnonics that go beyond conventional cavity QED physics by harnessing the intrinsic property of a magnet.

翻訳日:2023-05-16 17:24:55 公開日:2023-05-14

# MultiQuant: 任意ビット幅ネットワーク量子化のための新しいマルチブランチトポロジー手法

MultiQuant: A Novel Multi-Branch Topology Method for Arbitrary Bit-width Network Quantization ( http://arxiv.org/abs/2305.08117v1 )

ライセンス: Link先を確認

Yunshan Zhong, Mingbao Lin, Yuyao Zhou, Mengzhao Chen, Yuxin Zhang, Fei Chao, Rongrong Ji

(参考訳) 任意のビット幅ネットワーク量子化は、実行時に様々なビット幅要求に高い適応性を持つため、大きな注目を集めている。しかし,本研究では,重みとアクティベーションの頻繁なビット幅切替による量子化誤差の顕著な蓄積を観測し,性能の限界を指摘した。この問題に対処するために,任意のビット幅量子化にマルチブランチトポロジを利用する新しい手法であるMultiQuantを提案する。 MultiQuantは、ネットワーク本体を複数の独立したブランチに複製し、期待ビット幅の入力活性化を維持しながら、各ブランチの重みを固定2ビットに量子化する。この手法は、重みビット幅の切り替えを回避しつつも計算コストを同じに維持し、重み量子化の誤差を実質的に低減する。また,分枝の活性化ビット幅切替による量子化誤差を分枝間で分散し,性能を向上させるための償却分枝選択戦略を提案する。最後に,MultiQuantの性能を高めるため,枝間誘導を容易にする蒸留方式を設計する。大規模な実験により、MultiQuantは既存の任意のビット幅量子化法と比較して大きな性能向上を達成した。コードは https://github.com/zysxmu/MultiQuant にある。

Arbitrary bit-width network quantization has received significant attention due to its high adaptability to various bit-width requirements during runtime. However, in this paper, we investigate existing methods and observe a significant accumulation of quantization errors caused by frequent bit-width switching of weights and activations, leading to limited performance. To address this issue, we propose MultiQuant, a novel method that utilizes a multi-branch topology for arbitrary bit-width quantization. MultiQuant duplicates the network body into multiple independent branches and quantizes the weights of each branch to a fixed 2 bits while retaining the input activations in the expected bit-width. This approach keeps the computational cost the same while avoiding the switching of weight bit-widths, thereby substantially reducing errors in weight quantization. Additionally, we introduce an amortization branch selection strategy to distribute quantization errors caused by activation bit-width switching among branches to enhance performance. Finally, we design an in-place distillation strategy that facilitates guidance between branches to further enhance MultiQuant's performance. Extensive experiments demonstrate that MultiQuant achieves significant performance gains compared to existing arbitrary bit-width quantization methods. Code is at https://github.com/zysxmu/MultiQuant.

翻訳日:2023-05-16 17:24:30 公開日:2023-05-14

# 超現実性を持つ知識グラフの構造とダイナミクス

The Structure and Dynamics of Knowledge Graphs, with Superficiality ( http://arxiv.org/abs/2305.08116v1 )

ライセンス: Link先を確認

Loïck Lhote, Béatrice Markhoff, Arnaud Soulet

(参考訳) 大規模な知識グラフは、学界や機関から企業、クラウドソーシングに至るまで、さまざまなプロジェクトから収集された人間の知識を組み合わせる。このようなグラフの中では、2つのノード間の関係は2つの実体を含む基本的な事実を表している。関係のセマンティクスの多様性は知識グラフの豊かさを構成しており、特異なトポロジーが出現し、時には外観が混乱することがある。しかし、この複雑な特徴は、事実が独立して生成される関係の重複を制御する超現実性の概念を導入することで、単純な方法でモデル化することができる。現実性はまた、誤解された実体の割合を決定することによって、知識のグローバルな分布のバランスを規制する。これは知識グラフの構造とダイナミクスに関する最初のモデルである。これは、正式な知識獲得と組織に関する理解を深めます。

Large knowledge graphs combine human knowledge garnered from projects ranging from academia and institutions to enterprises and crowdsourcing. Within such graphs, each relationship between two nodes represents a basic fact involving these two entities. The diversity of the semantics of relationships constitutes the richness of knowledge graphs, leading to the emergence of singular topologies, sometimes chaotic in appearance. However, this complex characteristic can be modeled in a simple way by introducing the concept of superficiality, which controls the overlap between relationships whose facts are generated independently. Superficiality also regulates the balance of the global distribution of knowledge by determining the proportion of misdescribed entities. This is the first model for the structure and dynamics of knowledge graphs. It leads to a better understanding of formal knowledge acquisition and organization.

翻訳日:2023-05-16 17:24:06 公開日:2023-05-14

# 機械学習モデルエラー封じ込めのための注意ルールの自動生成

Automatic Generation of Attention Rules For Containment of Machine Learning Model Errors ( http://arxiv.org/abs/2305.08115v1 )

ライセンス: Link先を確認

Samuel Ackerman, Axel Bendavid, Eitan Farchi, Orna Raz

(参考訳) 多くのアプリケーションで機械学習(ML)ソリューションが普及している。しかし、これらのソリューションをビジネスグレードにする上で、多くの課題が存在する。例えば、基盤となるMLモデルのエラー率を許容できる低いレベルに維持する。通常、特徴入力と予測対象特徴との間の真の関係は不確かであり、したがって自然界では統計的である。提案するアプローチは、誤って予測される可能性が最も高い観測を「注意セット」に分離することである。これらはモデル診断と改善を直接支援し、これらの問題のある観察のために別の行動経路を決定するのに使用できる。これらの観測を分離するために最適な規則を決定するアルゴリズムをいくつか提示する。特に,機能ベースのスライシングを利用する戦略は,人間の解釈可能で,モデル非依存であり,補足的な入力や知識を最小限に抑える必要がある。さらに,予測信頼度がしきい値を下回るような観測結果の選択など,これらの戦略がいくつかの一般的なベースラインを上回っていることを示す。戦略を評価するために,様々な望ましい品質(その性能,安定性,非認識データの一般化など)を測定するための指標を導入し,その戦略はいくつかの公開データセット上で評価される。 ToPSIS(Multiple Criteria Decision Making Method)を用いて、これらのメトリクスを戦略ごとに単一の品質スコアに集約し、比較を可能にする。

Machine learning (ML) solutions are prevalent in many applications. However, many challenges exist in making these solutions business-grade. One such challenge is maintaining the error rate of the underlying ML models at an acceptably low level. Typically, the true relationship between feature inputs and the target feature to be predicted is uncertain, and hence statistical in nature. The approach we propose is to separate the observations that are the most likely to be predicted incorrectly into 'attention sets'. These can directly aid model diagnosis and improvement, and be used to decide on alternative courses of action for these problematic observations. We present several algorithms ('strategies') for determining optimal rules to separate these observations. In particular, we prefer strategies that use feature-based slicing because they are human-interpretable, model-agnostic, and require minimal supplementary inputs or knowledge. In addition, we show that these strategies outperform several common baselines, such as selecting observations with prediction confidence below a threshold. To evaluate strategies, we introduce metrics to measure various desired qualities, such as their performance, stability, and generalizability to unseen data; the strategies are evaluated on several publicly-available datasets. We use TOPSIS, a Multiple Criteria Decision Making method, to aggregate these metrics into a single quality score for each strategy, to allow comparison.
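The final aggregation step can be illustrated with a compact version of the standard TOPSIS procedure. The metric columns, the weights, and which criteria count as benefits are placeholders, since the abstract does not list them; the sketch also assumes the alternatives are not all identical, as otherwise the closeness ratio is undefined.

```python
import math


def topsis(matrix, weights, benefit):
    """Closeness score in [0, 1] for each alternative (row); higher is better.

    matrix[i][j] is the value of criterion j for alternative i, and
    benefit[j] is True when larger values of criterion j are better.
    """
    num_criteria = len(weights)
    # Vector-normalize each criterion column, then apply the criterion weight.
    norms = [math.sqrt(sum(row[j] ** 2 for row in matrix)) for j in range(num_criteria)]
    weighted = [
        [weights[j] * row[j] / norms[j] for j in range(num_criteria)]
        for row in matrix
    ]
    columns = list(zip(*weighted))
    best = [max(c) if benefit[j] else min(c) for j, c in enumerate(columns)]
    worst = [min(c) if benefit[j] else max(c) for j, c in enumerate(columns)]
    scores = []
    for row in weighted:
        d_best = math.dist(row, best)    # distance to the ideal alternative
        d_worst = math.dist(row, worst)  # distance to the anti-ideal alternative
        scores.append(d_worst / (d_best + d_worst))
    return scores
```

Each row would hold one strategy's metric values (for example performance and stability), and the strategies would then be ranked by the returned score.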

翻訳日:2023-05-16 17:23:56 公開日:2023-05-14

# 感情的人工知能の量子的動作

Quantum Operation of Affective Artificial Intelligence ( http://arxiv.org/abs/2305.08112v1 )

ライセンス: Link先を確認

V.I. Yukalov

(参考訳) このレビューでは、感情を経験する人間による意思決定の現実的な過程を模倣するために、人工知能が基づくべき基本原則を分析している。 2つのアプローチを比較する。1つは量子論に基づいており、もう1つは古典的用語を用いる。これらのアプローチには多くの類似点があり、主に確率的である。固有雑音下での量子測定と感情的意思決定の類似性を明らかにする。認知過程は、量子測定と形式的に類似した多くの特徴を有することが示されている。しかしこれは、人間の意思決定を模倣するために、Affective Artificial Intelligenceが量子システムの機能に依存しなければならないことを意味するものでは決してない。量子測定と意思決定の共通性を評価することは、古典的な概念のみを用いた公理的アプローチの定式化に役立つ。このアプローチに従う人工知能は、考慮された選択肢の効用とその感情的な魅力を考慮し、人間と同じような動作をする。感情的人工知能は、その操作が認知と感情の二重性を考慮しており、伝統的な意思決定の多くの行動的パラドックスを避ける。知的エージェントの社会は、情報の繰り返し多段階の交換を通じて相互作用し、動的な意思決定を行うネットワークを形成する。知的ネットワークは、感情的意思決定者の人間社会、ニューロンからなる脳、または人工知能の典型的な確率的ネットワークのいずれかの動作を特徴付けることができる。

The review analyzes the fundamental principles which Artificial Intelligence should be based on in order to imitate the realistic process of taking decisions by humans experiencing emotions. Two approaches are compared, one based on quantum theory and the other employing classical terms. Both these approaches have a number of similarities, being principally probabilistic. The analogies between quantum measurements under intrinsic noise and affective decision making are elucidated. It is shown that cognitive processes have many features that are formally similar to quantum measurements. This, however, in no way means that for the imitation of human decision making Affective Artificial Intelligence has necessarily to rely on the functioning of quantum systems. Appreciating the common features between quantum measurements and decision making helps for the formulation of an axiomatic approach employing only classical notions. Artificial Intelligence, following this approach, operates similarly to humans, by taking into account the utility of the considered alternatives as well as their emotional attractiveness. Affective Artificial Intelligence, whose operation takes account of the cognition-emotion duality, avoids numerous behavioural paradoxes of traditional decision making. A society of intelligent agents, interacting through the repeated multistep exchange of information, forms a network accomplishing dynamic decision making. The considered intelligent networks can characterize the operation of either a human society of affective decision makers, or the brain composed of neurons, or a typical probabilistic network of an artificial intelligence.

翻訳日:2023-05-16 17:23:34 公開日:2023-05-14

# ローカル局所発振器を用いた100kmファイバー上の長距離連続変数量子鍵配送

Long-distance continuous-variable quantum key distribution over 100 km fiber with local local oscillator ( http://arxiv.org/abs/2305.08156v1 )

ライセンス: Link先を確認

Adnan A.E. Hajomer, Ivan Derkach, Nitin Jain, Hou-Man Chin, Ulrik L. Andersen and Tobias Gehring

(参考訳) 量子鍵分散(QKD)は、2つのリモートパーティが物理法則に基づいて暗号化キーをセキュリティと共有することを可能にする。連続変数(CV)QKDとコヒーレント状態とコヒーレント検出は、既存の通信ネットワークとよく統合される。しかし、これまでのところ、長距離のcv-qkdは、ローカル発振器が送信される非常に複雑なスキームを使用してのみ実証されており、盗聴者のためのセキュリティホールを開き、潜在的な用途を制限している。本稿では,100kmのファイバーチャネル上で局所的に発生する局所発振器を用いた長距離CV-QKD実験について報告する。この記録破断距離は、キャリア回復のための機械学習フレームワークを介して位相ノイズによる余剰ノイズを制御し、変調分散を最適化することで達成される。 CV-QKDプロトコルの完全な実装と,有限サイズシステムにおける集団攻撃に対する鍵生成の実証を行う。その結果,CV量子アクセスネットワークを実現する上で重要なマイルストーンを達成し,セキュアQKDの大規模展開の道を開いた。

Quantum key distribution (QKD) enables two remote parties to share encryption keys with security based on the laws of physics. Continuous variable (CV) QKD with coherent states and coherent detection integrates well with existing telecommunication networks. However, thus far, long-distance CV-QKD has only been demonstrated using a highly complex scheme where the local oscillator is transmitted, opening security loopholes for eavesdroppers and limiting its potential applications. Here, we report a long-distance CV-QKD experiment with a locally generated local oscillator over a 100 km fiber channel with a total loss of 15.4 dB. This record-breaking distance is achieved by controlling the phase-noise-induced excess noise through a machine-learning framework for carrier recovery and optimizing the modulation variance. We implement the full CV-QKD protocol and demonstrate the generation of keys secure against collective attacks in the finite-size regime. Our results mark a significant milestone for realizing CV quantum access networks with a high loss budget, and pave the way for large-scale deployment of secure QKD.

翻訳日:2023-05-16 17:17:04 公開日:2023-05-14

# STORYWARS:コラボレーティブなストーリー理解と生成のためのデータセットとインストラクションチューニングベースライン

STORYWARS: A Dataset and Instruction Tuning Baselines for Collaborative Story Understanding and Generation ( http://arxiv.org/abs/2305.08152v1 )

ライセンス: Link先を確認

Yulun Du and Lydia Chilton

(参考訳) 協力的なストーリーは、異なる執筆スタイルと意図を持つ複数の著者の協力によって作成されたテキストであり、NLPモデルに固有の課題を提起する。このようなストーリーの理解と生成は、オープンドメインコーパスが欠如しているため、未熟な領域である。これを解決するために、オンラインプラットフォームから9,400人の異なる著者によって書かれた4万以上のコラボレーティブなストーリーのデータセットであるSTORYWARSを紹介します。 STORYWARSでは7つの理解と5つの生成タスクからなる12のタスクタイプを設計し、全教師付き、少数ショット、ゼロショットシナリオをカバーするマルチタスクベンチマークとして、合計101のストーリー関連タスクを導出する。さらに,STORYWARSの完全教師付きタスクにおいて,命令チューニングがゼロショットおよび少数ショットシナリオにおいて優れた結果を得るとともに,優れたマルチタスクベンチマーク性能を確立できることを示すストーリータスクに対して,命令チューニングモデル INSTRUCTSTORY を提案する。

Collaborative stories, which are texts created through the collaborative efforts of multiple authors with different writing styles and intentions, pose unique challenges for NLP models. Understanding and generating such stories remains an underexplored area due to the lack of open-domain corpora. To address this, we introduce STORYWARS, a new dataset of over 40,000 collaborative stories written by 9,400 different authors from an online platform. We design 12 task types, comprising 7 understanding and 5 generation task types, on STORYWARS, deriving 101 diverse story-related tasks in total as a multi-task benchmark covering all fully-supervised, few-shot, and zero-shot scenarios. Furthermore, we present our instruction-tuned model, INSTRUCTSTORY, for the story tasks showing that instruction tuning, in addition to achieving superior results in zero-shot and few-shot scenarios, can also obtain the best performance on the fully-supervised tasks in STORYWARS, establishing strong multi-task benchmark performances on STORYWARS.

翻訳日:2023-05-16 17:16:46 公開日:2023-05-14

# 固有値問題に対する多点摂動公式

A multipoint perturbation formula for eigenvalue problems ( http://arxiv.org/abs/2305.08151v1 )

ライセンス: Link先を確認

Geneviève Dusson, Louis Garrigue, Benjamin Stamm

(参考訳) 固有値問題の標準摂動理論は、対応する固有モードが知られているハミルトニアン近傍での固有モードの近似を求めることである。それでも、近くのいくつかのハミルトニアンの対応する固有モードが知られているならば、標準摂動理論はこれらの知識を全て同時に使ってより良い近似を与えることはできない。このような近似結果を可能にする式を導出し、この手法が標準摂動理論よりも競争力のある数値例を提供する。

Standard perturbation theory of eigenvalue problems consists of obtaining approximations of eigenmodes in the neighborhood of a Hamiltonian where the corresponding eigenmode is known. Nevertheless, if the corresponding eigenmodes of several nearby Hamiltonians are known, standard perturbation theory cannot simultaneously use all this knowledge to provide a better approximation. We derive a formula enabling such an approximation result, and provide numerical examples for which this method is more competitive than standard perturbation theory.
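
For context, the single-point result that the abstract contrasts against can be written out. This is the textbook first-order Rayleigh-Schrödinger expansion, not the multipoint formula of the paper (which the abstract does not state). The symbols $H_0$, $V$, $\varepsilon$, $E_n$, and $\psi_n$ are notation assumed here, with $H_0 \psi_n = E_n \psi_n$, orthonormal $\psi_n$, and a non-degenerate $E_n$.

```latex
H(\varepsilon) = H_0 + \varepsilon V, \qquad
E_n(\varepsilon) = E_n + \varepsilon \,\langle \psi_n, V \psi_n \rangle + O(\varepsilon^2), \qquad
\psi_n(\varepsilon) = \psi_n + \varepsilon \sum_{m \neq n}
  \frac{\langle \psi_m, V \psi_n \rangle}{E_n - E_m}\, \psi_m + O(\varepsilon^2).
```

The expansion uses the eigenmodes of one reference Hamiltonian $H_0$ only, which is the limitation the abstract points to when eigenmodes of several nearby Hamiltonians are available.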

翻訳日:2023-05-16 17:16:25 公開日:2023-05-14

# 結合共振器の例外点に及ぼす熱光子の影響

The effect of thermal photons on exceptional points in coupled resonators ( http://arxiv.org/abs/2305.08150v1 )

ライセンス: Link先を確認

Grzegorz Chimczak and Anna Kowalewska-Kudłaszyk and Ewelina Lange and Karol Bartkiewicz and Jan Peřina Jr

(参考訳) 隠れたパリティ時間(PT)対称性を持つ2つの量子系を解析する。1つは光学デバイスであり、もう1つは超伝導マイクロ波周波数デバイスである。それらの対称性を調べるために、与えられたハミルトニアンの損失と利得項がバランスをとる平衡フレームを導入する。両系の非エルミート・ハミルトニアンは例外点(EP)、すなわち、破れた隠れPT対称性から破れていない隠れPT対称性への遷移が起こるパラメータ空間の点に到達するように調整可能であることを示す。我々は、Liouvillian exceptional point (LEP) と呼ばれるLiouvillian超演算子の縮退を計算し、光学領域において、LEPは非エルミート・ハミルトニアンから得られるEP(HEP)と等価であることを示す。また、マイクロ波周波数系では、非ゼロ数の熱光子によってLEPとHEPの等価性が破れることを報告する。

We analyse two quantum systems with hidden parity-time (PT) symmetry: one is an optical device, whereas another is a superconducting microwave-frequency device. To investigate their symmetry, we introduce an equilibrium frame, in which loss and gain terms for a given Hamiltonian are balanced. We show that the non-Hermitian Hamiltonians of both systems can be tuned to reach an exceptional point (EP), i.e., the point in parameter space at which a transition from broken to unbroken hidden PT symmetry takes place. We calculate a degeneracy of a Liouvillian superoperator, which is called the Liouvillian exceptional point (LEP), and show that, in the optical domain, LEP is equivalent to EP obtained from the non-Hermitian Hamiltonian (HEP). We also report breaking the equivalence between LEP and HEP by a non-zero number of thermal photons for the microwave-frequency system.

翻訳日:2023-05-16 17:16:17 公開日:2023-05-14

# サイドチャネルセキュア量子鍵分布

Side-channel-secure quantum key distribution ( http://arxiv.org/abs/2305.08148v1 )

ライセンス: Link先を確認

Cong Jiang and Xiao-Long Hu and Zong-Wen Yu and Xiang-Bin Wang

(参考訳) 完全現実的な条件下では、サイドチャネルセキュリティ(SCS)量子鍵分布(QKD)の結果を示す。本研究の結果は, 測定デバイスに依存しないだけでなく, 不完全真空および不完全コヒーレント状態源を含む不完全(かつ不安定な)ソースデバイスにも有効である。仮想マッピングのアイデアを応用して、サイドチャネルのコヒーレントな攻撃を含む、外部からの攻撃に対する一般的なセキュリティ証明を提示する。また, 副産物として, 鍵レートを1～2桁向上させるscsプロトコルの改良法を提案する。これらの結果を用いて, 完全現実的条件で即時に役立つ非漸近キーレートを求める。

We present a result of side-channel-secure (SCS) quantum key distribution (QKD) under fully realistic conditions. Our result is not only measurement-device independent but also effective with imperfect (and unstable) source devices including imperfect vacuum and imperfect coherent-state source. Applying the virtual mapping idea, we present a general security proof under whatever out-side-lab attack, including whatever side-channel coherent attack. As a by-product, we also present an improved method for SCS protocols which can raise the key rate by 1-2 orders of magnitude. Using these results, we obtain a non-asymptotic key rate which is instantly useful with full realistic conditions.

翻訳日:2023-05-16 17:15:58 公開日:2023-05-14

# ParaLS:プレトレーニングパラフラザーによる語彙置換

ParaLS: Lexical Substitution via Pretrained Paraphraser ( http://arxiv.org/abs/2305.08146v1 )

ライセンス: Link先を確認

Jipeng Qiang, Kang Liu, Yun Li, Yunhao Yuan, Yi Zhu

(参考訳) 語彙置換(LS)は、文中の対象単語の適切な置換を見つけることを目的としている。近年,事前訓練された言語モデルに基づくLS手法が顕著な進歩を遂げ,その文脈環境の分析を通じて,対象単語の潜在的代用を生成する。しかし、これらの方法は代用語を生成する際に文の意味の保存を過小評価する傾向がある。本研究では,代用候補をパラフレーズから生成する方法を検討する。パラフレーズから生成されたパラフレーズには,単語選択のバリエーションが含まれており,文の意味を保っている。一般的なデコード戦略では代替語を直接生成することはできないため,デコード中の対象単語のバリエーションに着目した2つの単純なデコード戦略を提案する。実験の結果,本手法は3つのベンチマークで事前学習した言語モデルに基づき,最先端ls法を上回った。

Lexical substitution (LS) aims at finding appropriate substitutes for a target word in a sentence. Recently, LS methods based on pretrained language models have made remarkable progress, generating potential substitutes for a target word through analysis of its contextual surroundings. However, these methods tend to overlook the preservation of the sentence's meaning when generating the substitutes. This study explores how to generate the substitute candidates from a paraphraser, as the generated paraphrases from a paraphraser contain variations in word choice and preserve the sentence's meaning. Since we cannot directly generate the substitutes via commonly used decoding strategies, we propose two simple decoding strategies that focus on the variations of the target word during decoding. Experimental results show that our methods outperform state-of-the-art LS methods based on pre-trained language models on three benchmarks.

翻訳日:2023-05-16 17:15:47 公開日:2023-05-14

# Mobile-Env: モバイルインタラクションのトレーニングと評価のためのユニバーサルプラットフォーム

Mobile-Env: A Universal Platform for Training and Evaluation of Mobile Interaction ( http://arxiv.org/abs/2305.08144v1 )

ライセンス: Link先を確認

Danyang Zhang, Lu Chen, Kai Yu

(参考訳) インタラクションプラットフォームは、ゲームプレイや身体性知能(embodied intelligence)などのコントロールおよび決定領域の最近の進歩において、重要な役割を果たす。しかし、情報ユーザインタフェース(InfoUI)インタラクションにはまだ満足のいくプラットフォームが欠けている。提案したInfoUIは、平易なテキスト情報だけでなく、マルチモーダルな内容と、スタイルを持つ空間構造も含んでいる。本稿ではInfoUIインタラクションの研究を支援するために,新しいプラットフォームであるMobile-Envを紹介する。 Mobile-Envプラットフォームは、柔軟で、適応可能で、容易に拡張できるように設計されている。 Mobile-EnvをベースにInfoUIタスクセットが構築され、デモと評価が行われる。大規模言語モデル(LLM)に基づくエージェントをタスクセット上でテストする。実験結果は、LLMがテキスト理解とマッチングを行う大きな可能性を実証し、一方で、対話フィードバックと探索のより良いメカニズムの必要性を明らかにした。新たな議論もいくつか行われている。デモビデオはhttps://youtu.be/gKV6KZYwxGYで見ることができる。コードリポジトリはhttps://github.com/X-LANCE/Mobile-Envで入手できる。提案されたWikiHowタスクセットはhttps://huggingface.co/datasets/zdy023/WikiHow-tasksetで公開されている。

The interaction platform plays a crucial role in the recent advancement of the control and decision domains like game playing and embodied intelligence. However, there is still a lack of a satisfactory platform for the information user interface (InfoUI) interaction. The proposed InfoUI comprises not only the plain text information, but the multimodal contents and a few spatial structures with styles as well. To help the research of InfoUI interaction, a novel platform Mobile-Env is presented in this paper. The Mobile-Env platform is designed to be flexible, adaptable, and easily-extended. Based on Mobile-Env, an InfoUI task set is then built for a demonstration and evaluation. An agent based on the large-scale language model (LLM) is tested on the task set. The experiment results demonstrate the great potential of the LLM to do text understanding and matching and, meanwhile, reveal the necessity of a better mechanism of interaction feedback and exploration. Several new discussions are conducted as well. A demo video is available at https://youtu.be/gKV6KZYwxGY. The code repository is available at https://github.com/X-LANCE/Mobile-Env. The proposed WikiHow task set is made public at https://huggingface.co/datasets/zdy023/WikiHow-taskset.

翻訳日:2023-05-16 17:15:33 公開日:2023-05-14

# 集中治療室における計画外再入院の予測 : マルチモーダリティ評価

Predicting Unplanned Readmissions in the Intensive Care Unit: A Multimodality Evaluation ( http://arxiv.org/abs/2305.08139v1 )

ライセンス: Link先を確認

Eitam Sheetrit, Menachem Brief, Oren Elisha

(参考訳) 再入院とは、退院した患者が一定期間内に、同じまたは関連するケアのために再び入院することである。再入院は、入院コストの上昇、患者の満足度の低下、感染症、投薬ミス、さらには死亡といった有害事象のリスクの増加につながるため、医療分野において重大な問題である。再入院の問題は、患者の症状の重症度と合併症のリスクの高さから、集中治療室(ICU)において特に深刻である。ICUにおける計画外再入院の予測は、静的データ、構造化されていないフリーテキスト、診断と処置のシーケンス、多変量時系列など、さまざまなデータモダリティの分析を伴うため、困難な課題である。本稿では,時系列解析と自然言語処理における最先端の機械学習手法を用いて,各データモダリティの有効性を個別に,次いで他のモダリティと組み合わせて検証する。この評価プロセスにより、各データモダリティの寄与を決定でき、再入院の文脈で初めて、それらの予測値の階層を確立できる。さらに,再入院予測に対する時系列アプローチの性能向上における時間的抽象化(Temporal Abstractions)の効果を示す。文献における定義が矛盾しているため、将来の研究の再現性と一貫性を高め、この用語の多様な解釈から生じうる誤解を防ぐために、計画外再入院(Unplanned Readmission)という用語を明確に定義する。大規模なベンチマーク臨床データセットでの実験結果から, 医師が作成した退院記録(Discharge Notes)は, 他のすべてのモダリティよりも再入院予測能力に優れることが示された。

A hospital readmission is when a patient who was discharged from the hospital is admitted again for the same or related care within a certain period. Hospital readmissions are a significant problem in the healthcare domain, as they lead to increased hospitalization costs, decreased patient satisfaction, and increased risk of adverse outcomes such as infections, medication errors, and even death. The problem of hospital readmissions is particularly acute in intensive care units (ICUs), due to the severity of the patients' conditions, and the substantial risk of complications. Predicting Unplanned Readmissions in ICUs is a challenging task, as it involves analyzing different data modalities, such as static data, unstructured free text, sequences of diagnoses and procedures, and multivariate time-series. Here, we investigate the effectiveness of each data modality separately, then alongside with others, using state-of-the-art machine learning approaches in time-series analysis and natural language processing. Using our evaluation process, we are able to determine the contribution of each data modality, and for the first time in the context of readmission, establish a hierarchy of their predictive value. Additionally, we demonstrate the impact of Temporal Abstractions in enhancing the performance of time-series approaches to readmission prediction. Due to conflicting definitions in the literature, we also provide a clear definition of the term Unplanned Readmission to enhance reproducibility and consistency of future research and to prevent any potential misunderstandings that could result from diverse interpretations of the term. Our experimental results on a large benchmark clinical data set show that Discharge Notes written by physicians, have better capabilities for readmission prediction than all other modalities.

翻訳日:2023-05-16 17:15:16 公開日:2023-05-14

# 回答の前に区別する:共通質問応答の知識としての対比的説明の生成

Distinguish Before Answer: Generating Contrastive Explanation as Knowledge for Commonsense Question Answering ( http://arxiv.org/abs/2305.08135v1 )

ライセンス: Link先を確認

Qianglong Chen, Guohai Xu, Ming Yan, Ji Zhang, Fei Huang, Luo Si and Yin Zhang

(参考訳) 既存の知識強化手法は、異なる知識ベースから多様な知識を得ることにより、特定のQAタスクにおいて顕著な成果を上げている。しかし、検索された知識の性質による制約のため、知識の関連性と識別性の両方から同時に利益を得ることは依然として難しい。この課題を解決するために,概念中心のPrompt-bAsed Contrastive Explanation GenerationモデルであるCPACEを提案する。CPACEは,得られた記号的知識を対比的説明に変換し,与えられた候補間の違いをよりよく区別することを目的とする。まず,先行研究に続いて,概念中心知識抽出モジュールを用いて,異なる種類の記号的知識を検索する。その後、獲得した記号的知識と説明プロンプトを用いて、対応する対比的説明を生成し、知識の識別と解釈性をよりよくモデル化するためのガイダンスとする。最後に,生成したコントラスト説明を,下流タスク強化のための外部知識として捉える。本稿では,CSQA,QASC,OBQAの3つの質問回答データセットについて実験を行った。実験結果から, CPACEモデルはCSQAの新しいSOTA(テストセットで89.8%, 人間の性能を0.9%上回る)を実現し, QASCとOBQAでも大幅な改善(それぞれ4.2%, 3.5%)が得られた。

Existing knowledge-enhanced methods have achieved remarkable results in certain QA tasks via obtaining diverse knowledge from different knowledge bases. However, limited by the properties of retrieved knowledge, they still have trouble benefiting from both the knowledge relevance and distinguishment simultaneously. To address the challenge, we propose CPACE, a Concept-centric Prompt-bAsed Contrastive Explanation Generation model, which aims to convert obtained symbolic knowledge into a contrastive explanation for better distinguishing the differences among given candidates. Firstly, following previous works, we retrieve different types of symbolic knowledge with a concept-centric knowledge extraction module. After that, we generate corresponding contrastive explanations using acquired symbolic knowledge and explanation prompts as guidance for better modeling the knowledge distinguishment and interpretability. Finally, we regard the generated contrastive explanation as external knowledge for downstream task enhancement. We conduct a series of experiments on three widely-used question-answering datasets: CSQA, QASC, and OBQA. Experimental results demonstrate that with the help of generated contrastive explanation, our CPACE model achieves new SOTA on CSQA (89.8% on the testing set, 0.9% higher than human performance), and gains impressive improvement on QASC and OBQA (4.2% and 3.5%, respectively).

翻訳日:2023-05-16 17:14:48 公開日:2023-05-14

# 制約回復による逆強化学習

Inverse Reinforcement Learning With Constraint Recovery ( http://arxiv.org/abs/2305.08130v1 )

ライセンス: Link先を確認

Nirjhar Das and Arpan Chattopadhyay

(参考訳) 本研究では,制約付きマルコフ決定過程(CMDP)問題に対する新しい逆強化学習(IRL)アルゴリズムを提案する。標準IRL問題において、逆学習者またはエージェントは、最適ポリシーに対する一連の軌道実証から、MDPの報酬関数を回復しようとする。本研究では,cmdpの報酬関数だけでなく,制約についても推測する。最大エントロピーの原理を用いて、制約回復(irl-cr)問題を持つirlを制約付き非凸最適化問題としてキャストできることを示す。サブプロブレムが凸である交互に制約された最適化問題に還元する。我々はそれを解決するために指数勾配降下アルゴリズムを用いる。最後に,グリッド環境におけるアルゴリズムの有効性を示す。

In this work, we propose a novel inverse reinforcement learning (IRL) algorithm for constrained Markov decision process (CMDP) problems. In standard IRL problems, the inverse learner or agent seeks to recover the reward function of the MDP, given a set of trajectory demonstrations for the optimal policy. In this work, we seek to infer not only the reward functions of the CMDP, but also the constraints. Using the principle of maximum entropy, we show that the IRL with constraint recovery (IRL-CR) problem can be cast as a constrained non-convex optimization problem. We reduce it to an alternating constrained optimization problem whose sub-problems are convex. We use exponentiated gradient descent algorithm to solve it. Finally, we demonstrate the efficacy of our algorithm for the grid world environment.
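
The exponentiated gradient descent update named in the abstract can be sketched in its generic form. The abstract does not say which variables the authors update or what their objective looks like, so the toy linear cost, the step size, and the function names below are illustrative assumptions; only the multiplicative-update-then-renormalise rule is the standard technique.

```python
import math


def exponentiated_gradient_step(weights, gradient, step_size):
    """One multiplicative update, renormalised onto the probability simplex."""
    unnormalised = [
        w * math.exp(-step_size * g) for w, g in zip(weights, gradient)
    ]
    total = sum(unnormalised)
    return [u / total for u in unnormalised]


def minimise_on_simplex(grad_fn, dim, step_size=0.5, iterations=200):
    """Run exponentiated gradient descent from the uniform distribution."""
    weights = [1.0 / dim] * dim
    for _ in range(iterations):
        weights = exponentiated_gradient_step(
            weights, grad_fn(weights), step_size
        )
    return weights


# Toy convex objective f(w) = sum_i costs[i] * w[i]; its gradient is `costs`.
# The minimiser on the simplex puts all mass on the cheapest coordinate.
costs = [3.0, 1.0, 2.0]
solution = minimise_on_simplex(lambda w: costs, dim=len(costs))
print([round(w, 4) for w in solution])
```

Because the update is multiplicative, the iterates stay positive and sum to one without an explicit projection step, which is the usual reason to prefer it over plain gradient descent for simplex-constrained sub-problems.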

翻訳日:2023-05-16 17:14:26 公開日:2023-05-14

# エンドツーエンド学習はフィットネスアクティビティ認識に十分か?

Is end-to-end learning enough for fitness activity recognition? ( http://arxiv.org/abs/2305.08191v1 )

ライセンス: Link先を確認

Antoine Mercier and Guillaume Berger and Sunny Panchal and Florian Letsch and Cornelius Boehm and Nahua Kang and Ingo Bax and Roland Memisevic

(参考訳) エンド・ツー・エンド・ラーニングは、特に静止画像に関連する多くのコンピュータビジョンタスクをホールドしており、タスク固有の最適化は非常に高いパフォーマンスをもたらす。それでも、人間中心のアクション認識は依然として手作りのパイプラインで占められており、個々のコンポーネントだけが、通常個々のフレームで動作するニューラルネットワークに置き換えられている。このようなパイプラインの関連性を調べるためのテストベッドとして,フィットネス活動の完全注釈付きビデオデータセットを提案する。この領域の認識能力は、基本的に人間のポーズとその時間的ダイナミクスの関数であるので、ポーズベースのソリューションはうまく機能すべきである。このラベル付きデータにより、原画素でのエンドツーエンド学習が、ポーズ推定に基づく最先端のアクション認識パイプラインと競合することを示す。また、エンド・ツー・エンドの学習は、リアルタイム反復数などの時間的にきめ細かなタスクを支援できることを示す。

End-to-end learning has taken hold of many computer vision tasks, in particular, related to still images, with task-specific optimization yielding very strong performance. Nevertheless, human-centric action recognition is still largely dominated by hand-crafted pipelines, and only individual components are replaced by neural networks that typically operate on individual frames. As a testbed to study the relevance of such pipelines, we present a new fully annotated video dataset of fitness activities. Any recognition capabilities in this domain are almost exclusively a function of human poses and their temporal dynamics, so pose-based solutions should perform well. We show that, with this labelled data, end-to-end learning on raw pixels can compete with state-of-the-art action recognition pipelines based on pose estimation. We also show that end-to-end learning can support temporally fine-grained tasks such as real-time repetition counting.

翻訳日:2023-05-16 17:06:42 公開日:2023-05-14

# tsgn: 多エージェント動作予測のための投影ベクトル表現付きテンポラルシーングラフニューラルネットワーク

TSGN: Temporal Scene Graph Neural Networks with Projected Vectorized Representation for Multi-Agent Motion Prediction ( http://arxiv.org/abs/2305.08190v1 )

ライセンス: Link先を確認

Yunong Wu, Thomas Gilles, Bogdan Stanciulescu, Fabien Moutarde

(参考訳) 近くのエージェントの将来の動きを予測することは、自動運転車が安全かつ効果的な行動を取るために不可欠である。本稿では,マルチエージェント軌道予測のためのベクトル表現を投影したテンポラルシーングラフニューラルネットワークを用いたフレームワークTSGNを提案する。投影ベクトル化表現は、トラフィックシーンをベクトルの集合によって構築されたグラフとしてモデル化する。これらのベクトルはエージェント、道路ネットワーク、およびそれらの空間的相対関係を表す。この表現のすべての相対的特徴は、変換と回転不変である。この表現に基づいて、TSGNはエージェント、道路ネットワーク、それら間の相互作用、時間的トラフィックシーンの時間的依存関係をキャプチャする。 TSGNは、全てのエージェントに対するマルチモーダルな将来の軌跡を、妥当かつ正確に同時に予測することができる。一方,エージェントと道路ネットワーク間の相互作用を捕捉する階層型レーントランスを提案する。これは周囲の道路ネットワークをフィルタし,対象エージェントの将来の行動に影響を与える可能性のある最も確率の高いレーンセグメントのみを保持する。予測性能を犠牲にすることなく、計算負担を大幅に削減する。実験により、TSGNはArgoverse運動予測ベンチマークで最先端のパフォーマンスを達成することが示された。

Predicting future motions of nearby agents is essential for an autonomous vehicle to take safe and effective actions. In this paper, we propose TSGN, a framework using Temporal Scene Graph Neural Networks with projected vectorized representations for multi-agent trajectory prediction. Projected vectorized representation models the traffic scene as a graph which is constructed by a set of vectors. These vectors represent agents, road network, and their spatial relative relationships. All relative features under this representation are both translation- and rotation-invariant. Based on this representation, TSGN captures the spatial-temporal features across agents, road network, interactions among them, and temporal dependencies of temporal traffic scenes. TSGN can predict multimodal future trajectories for all agents simultaneously, plausibly, and accurately. Meanwhile, we propose a Hierarchical Lane Transformer for capturing interactions between agents and road network, which filters the surrounding road network and only keeps the most probable lane segments which could have an impact on the future behavior of the target agent. Without sacrificing the prediction performance, this greatly reduces the computational burden. Experiments show TSGN achieves state-of-the-art performance on the Argoverse motion forecasting benchmark.

翻訳日:2023-05-16 17:06:30 公開日:2023-05-14

# crosentinews 2.0: 文レベルのニュース感情コーパス

CroSentiNews 2.0: A Sentence-Level News Sentiment Corpus ( http://arxiv.org/abs/2305.08187v1 )

ライセンス: Link先を確認

Gaurish Thakkar, Nives Mikelic Preradović, Marko Tadić

(参考訳) 本稿ではクロアチアのニュースドメインの文レベルの感情データセットについて述べる。すでに存在する3Kアノテートテキストに加えて、5つのクラスでタグ付けされた14.5Kアノテート文がデータセットに含まれる。アノテーションプロセスとアノテーション間の合意に加えて,ベースラインスコアを提供する。

This article presents a sentence-level sentiment dataset for the Croatian news domain. In addition to the 3K annotated texts already present, our dataset contains 14.5K annotated sentence occurrences that have been tagged with 5 classes. We provide baseline scores in addition to the annotation process and inter-annotator agreement.

翻訳日:2023-05-16 17:06:14 公開日:2023-05-14

# モード切替点最適化を考慮した空中ロボットの経路計画

Path Planning for Air-Ground Robot Considering Modal Switching Point Optimization ( http://arxiv.org/abs/2305.08178v1 )

ライセンス: Link先を確認

Xiaoyu Wang and Kangyao Huang and Xinyu Zhang and Honglin Sun and Wenzhuo Liu and Huaping Liu and Jun Li and Pingping Lu

(参考訳) 運転も飛行もできる革新的なモビリティプラットフォームは、空飛ぶロボットだ。アジャイル飛行の必要性は、空中ロボットの伝統的な経路計画技術によって満足できない。以前の研究は、主に経路のエネルギー効率の向上、探索速度の低下、離着陸地点の最適化に重点を置いていた。フィールドアプリケーション環境のためのロボットを提案し, エネルギー効率, 探索速度, 実際の展開可能性に着目し, モード切替点最適化を考慮したグラフ探索アルゴリズムに基づく, 軽量なグローバル空間計画手法を提案する。基本的な概念は、平面探索と空間探索を組み合わせた交換可能な探索アプローチを採用することで計算量を減らすことである。さらに、電池の健全性とミッション実行の完全性を保護するため、トラップエスケープアプローチも提供された。シミュレーションは、フィールドdemマップに基づいた提案モデルの有効性をテストするために実行される。シミュレーションの結果,我々の技術は,高い信頼度で完成可能な3dパスを生成できることがわかった。さらに、モード切換点最適化法は、モード切換に許容される追加の場所を効率よく同定し、改良されたパスは時間とエネルギーを少なくする。

An innovative sort of mobility platform that can both drive and fly is the air-ground robot. The need for an agile flight cannot be satisfied by traditional path planning techniques for air-ground robots. Prior studies had mostly focused on improving the energy efficiency of paths, seldom taking the seeking speed and optimizing take-off and landing places into account. A robot for the field application environment was proposed, and a lightweight global spatial planning technique for the robot based on the graph-search algorithm taking mode switching point optimization into account, with an emphasis on energy efficiency, searching speed, and the viability of real deployment. The fundamental concept is to lower the computational burden by employing an interchangeable search approach that combines planar and spatial search. Furthermore, to safeguard the health of the power battery and the integrity of the mission execution, a trap escape approach was also provided. Simulations are run to test the effectiveness of the suggested model based on the field DEM map. The simulation results show that our technology is capable of producing finished, plausible 3D paths with a high degree of believability. Additionally, the mode-switching point optimization method efficiently identifies additional acceptable places for mode switching, and the improved paths use less time and energy.

翻訳日:2023-05-16 17:06:10 公開日:2023-05-14

# 凸損失関数下でのノイズ付き周辺分布に対する最適かつスケーラブルな行列機構

An Optimal and Scalable Matrix Mechanism for Noisy Marginals under Convex Loss Functions ( http://arxiv.org/abs/2305.08175v1 )

ライセンス: Link先を確認

Yingtai Xiao, Guanlin He, Danfeng Zhang, Daniel Kifer

(参考訳) ノイズ付き周辺分布(noisy marginals)は機密性保護データリリースの一般的な形態であり、分割表解析、ベイズネットワークの構築、さらには合成データ生成など多くの下流タスクに有用である。線形クエリ(例えば周辺分布)に対してバイアスのないノイズ付き回答を提供するプライバシメカニズムは、行列メカニズムとして知られている。本研究では,最適かつスケーラブルな,ガウスノイズを伴う周辺分布のための行列メカニズムであるResidualPlannerを提案する。 ResidualPlannerは、周辺分布の分散の凸関数として記述できる多くの損失関数に対して最適化できる(先行研究は1つの事前定義された目的関数に制限されていた)。 ResidualPlannerは、従来の最先端手法(HDMM)がメモリ不足になる場合でも、大規模な設定で周辺分布の精度を数秒で最適化できる。100の属性を持つデータセット上でも数分で動作する。さらにResidualPlannerは、各周辺分布の分散/共分散値を効率的に計算できる(従来手法は、比較的小さなデータセットであっても、すぐにメモリ不足になる)。

Noisy marginals are a common form of confidentiality-protecting data release and are useful for many downstream tasks such as contingency table analysis, construction of Bayesian networks, and even synthetic data generation. Privacy mechanisms that provide unbiased noisy answers to linear queries (such as marginals) are known as matrix mechanisms. We propose ResidualPlanner, a matrix mechanism for marginals with Gaussian noise that is both optimal and scalable. ResidualPlanner can optimize for many loss functions that can be written as a convex function of marginal variances (prior work was restricted to just one predefined objective function). ResidualPlanner can optimize the accuracy of marginals in large scale settings in seconds, even when the previous state of the art (HDMM) runs out of memory. It even runs on datasets with 100 attributes in a couple of minutes. Furthermore ResidualPlanner can efficiently compute variance/covariance values for each marginal (prior methods quickly run out of memory, even for relatively small datasets).
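
To make the term concrete, the sketch below computes one marginal (a table of counts over a subset of attributes) and adds independent Gaussian noise to each cell. It is not ResidualPlanner and makes no privacy claim: the records, the attribute domains, and the noise scale `sigma` are hypothetical, and calibrating `sigma` to a privacy budget is outside what the abstract describes.

```python
import random
from collections import Counter
from itertools import product


def marginal(records, attributes, domains):
    """Exact counts for every combination of values of `attributes`."""
    counts = Counter(tuple(r[a] for a in attributes) for r in records)
    cells = product(*(domains[a] for a in attributes))
    return {cell: counts.get(cell, 0) for cell in cells}


def noisy_marginal(records, attributes, domains, sigma, rng):
    """Add zero-mean Gaussian noise to each cell, so answers stay unbiased."""
    exact = marginal(records, attributes, domains)
    return {
        cell: count + rng.gauss(0.0, sigma) for cell, count in exact.items()
    }


# Hypothetical toy dataset with two categorical attributes.
records = [
    {"age": "young", "smoker": "no"},
    {"age": "young", "smoker": "yes"},
    {"age": "old", "smoker": "no"},
    {"age": "old", "smoker": "no"},
]
domains = {"age": ["young", "old"], "smoker": ["no", "yes"]}

rng = random.Random(0)  # fixed seed so the sketch is repeatable
released = noisy_marginal(records, ["age", "smoker"], domains, 1.0, rng)
for cell, value in released.items():
    print(cell, round(value, 2))
```

Each cell of a marginal is a linear query on the data, which is why mechanisms that answer such queries with unbiased noise fall under the matrix-mechanism family the abstract refers to.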

翻訳日:2023-05-16 17:05:49 公開日:2023-05-14

# クロアチア映画レビューデータセット (cro-fireda: a sentiment annotated dataset of film reviews)

Croatian Film Review Dataset (Cro-FiReDa): A Sentiment Annotated Dataset of Film Reviews ( http://arxiv.org/abs/2305.08173v1 )

ライセンス: Link先を確認

Gaurish Thakkar, Nives Mikelic Preradovic and Marko Tadić

(参考訳) 本稿では,映画レビュー分野におけるクロアチア人の感情アノテートデータセットであるCro-FiReDaを紹介する。 1万以上の文を含むデータセットは、文レベルで注釈付けされている。アノテーション全体のプロセスを示すことに加えて、トランスフォーマティブに基づく微調整手法に基づくベンチマーク結果も提示する。

This paper introduces Cro-FiReDa, a sentiment-annotated dataset for Croatian in the domain of movie reviews. The dataset, which contains over 10,000 sentences, has been annotated at the sentence level. In addition to presenting the overall annotation process, we also present benchmark results based on the transformer-based fine-tuning approach.

翻訳日:2023-05-16 17:05:31 公開日:2023-05-14

# 学習は制御を劣化させうるか? ガウス過程に基づくイベントトリガーオンライン学習における計算遅延の解析

Can Learning Deteriorate Control? Analyzing Computational Delays in Gaussian Process-Based Event-Triggered Online Learning ( http://arxiv.org/abs/2305.08169v1 )

ライセンス: Link先を確認

Xiaobing Dai, Armin Lederer, Zewen Yang, Sandra Hirche

(参考訳) システムのダイナミクスが未知である場合、教師付き機械学習技術は一般にデータからモデルを推測するために使用される。ガウス過程(GP)回帰は、予測誤差境界が存在するため、この目的のために特に一般的な学習方法である。さらに、GPモデルはオンラインで効率的に更新できるため、指定された追従精度を確保するイベントトリガー型オンライン学習戦略をとることができる。しかし、既存のトリガー条件は任意のタイミングで評価できなければならず、これは無視できない計算時間のために実際には達成できない。そこで,まず遅延認識型トラッキングエラーバウンドを導出し,精度と遅延のトレードオフを明らかにする。この結果に基づいて,計算遅延を伴うGPベースのオンライン学習における新たなイベントトリガを提案し,計算時間が十分に小さい場合にはオフラインで学習したGPモデルよりも優れることを示す。最後に,提案するイベントトリガのオンライン学習における有効性をシミュレーションで示す。

When the dynamics of systems are unknown, supervised machine learning techniques are commonly employed to infer models from data. Gaussian process (GP) regression is a particularly popular learning method for this purpose due to the existence of prediction error bounds. Moreover, GP models can be efficiently updated online, such that event-triggered online learning strategies can be pursued to ensure specified tracking accuracies. However, existing trigger conditions must be able to be evaluated at arbitrary times, which cannot be achieved in practice due to non-negligible computation times. Therefore, we first derive a delay-aware tracking error bound, which reveals an accuracy-delay trade-off. Based on this result, we propose a novel event trigger for GP-based online learning with computational delays, which we show to offer advantages over offline trained GP models for sufficiently small computation times. Finally, we demonstrate the effectiveness of the proposed event trigger for online learning in simulations.

翻訳日:2023-05-16 17:05:27 公開日:2023-05-14

# 多視点時系列からの潜在プロセス同定

Latent Processes Identification From Multi-View Time Series ( http://arxiv.org/abs/2305.08164v1 )

ライセンス: Link先を確認

Zenan Huang, Haobo Wang, Junbo Zhao, Nenggan Zheng

(参考訳) 時系列データのダイナミクスを理解するには、典型的にはデータ生成のためのユニークな潜在因子を識別する必要がある。独立した仮定に基づいて、既存の作業はシングルビューデータの処理に大きな進歩を遂げました。しかし、大きな課題が2つあるため、それをマルチビュー時系列データに拡張する非自明な問題である。 (i) 時間依存のような複雑なデータ構造は、独立した仮定に違反する可能性がある。 (ii) 異なる視点からの因子は概して重複しており、完全な集合に集約することは困難である。本研究では,データ生成過程を逆転させて識別性を高めるために,コントラスト学習技術を用いた新しいフレームワーク MuLTI を提案する。さらに、MuLTIは最適な輸送公式を確立することで、対応する重複変数をマージする置換機構を統合する。合成および実世界のデータセットに対する大規模な実験結果から,多視点時系列上での同定可能な潜伏変数の復元において,本手法の優位性が示された。

Understanding the dynamics of time series data typically requires identifying the unique latent factors for data generation, a.k.a., latent processes identification. Driven by the independent assumption, existing works have made great progress in handling single-view data. However, it is a non-trivial problem that extends them to multi-view time series data because of two main challenges: (i) the complex data structure, such as temporal dependency, can result in violation of the independent assumption; (ii) the factors from different views are generally overlapped and are hard to be aggregated to a complete set. In this work, we propose a novel framework MuLTI that employs the contrastive learning technique to invert the data generative process for enhanced identifiability. Additionally, MuLTI integrates a permutation mechanism that merges corresponding overlapped variables by the establishment of an optimal transport formula. Extensive experimental results on synthetic and real-world datasets demonstrate the superiority of our method in recovering identifiable latent variables on multi-view time series.

Translated: 2023-05-16 17:05:16 Published: 2023-05-14

# Altered Topological Properties of the Functional Brain Network Associated with Alzheimer's Disease

Altered Topological Properties of Functional Brain Network Associated with Alzheimer's Disease ( http://arxiv.org/abs/2305.08159v1 )

License: see the linked page

Yongcheng Yao

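(Reference translation) Functional magnetic resonance imaging (fMRI) is commonly used to study human brain activity, including functional abnormalities related to neurodegenerative diseases. This study investigates differences in the topological properties of functional brain networks between patients with Alzheimer's disease (AD) and normal controls. A total of 590 subjects were included: 175 with AD dementia and 415 age-, gender-, and handedness-matched controls. The topological properties of the brain network were quantified with graph-theory-based analyses. The results show abnormal network integration and segregation in the AD group. These findings deepen the understanding of AD pathophysiology from the perspective of functional brain network structure and may help identify AD biomarkers. Further information to assist validation of this study is available at https://github.com/YongchengYAO/AD-FunctionalBrainNetwork.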

Functional Magnetic Resonance Imaging (fMRI) is commonly utilized to study human brain activity, including abnormal functional properties related to neurodegenerative diseases. This study aims to investigate the differences in the topological properties of functional brain networks between individuals with Alzheimer's Disease (AD) and normal controls. A total of 590 subjects, consisting of 175 with AD dementia and 415 age-, gender-, and handedness-matched controls, were included. The topological properties of the brain network were quantified using graph-theory-based analyses. The results indicate abnormal network integration and segregation in the AD group. These findings enhance our understanding of AD pathophysiology from a functional brain network structure perspective and may aid in identifying AD biomarkers. We provide more information to assist the validation of this study at https://github.com/YongchengYAO/AD-FunctionalBrainNetwork.
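
Network integration and segregation are commonly summarized by global efficiency and the clustering coefficient, respectively. The abstract does not say which measures the study used, so the following is only a minimal sketch of graph-theory-based quantification on an unweighted graph. The toy network and all function names are hypothetical and are not taken from the study's repository.

```python
from collections import deque
from itertools import combinations


def shortest_path_lengths(adj, source):
    """Breadth-first search: hop distance from `source` to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for nbr in adj[node]:
            if nbr not in dist:
                dist[nbr] = dist[node] + 1
                queue.append(nbr)
    return dist


def global_efficiency(adj):
    """Integration: mean of 1/d(i, j) over ordered node pairs (unreachable pairs add 0)."""
    n = len(adj)
    if n < 2:
        return 0.0
    total = 0.0
    for node in adj:
        dist = shortest_path_lengths(adj, node)
        total += sum(1.0 / d for other, d in dist.items() if other != node)
    return total / (n * (n - 1))


def average_clustering(adj):
    """Segregation: mean fraction of each node's neighbour pairs that are linked."""
    coeffs = []
    for nbrs in adj.values():
        k = len(nbrs)
        if k < 2:
            coeffs.append(0.0)
            continue
        links = sum(1 for a, b in combinations(nbrs, 2) if b in adj[a])
        coeffs.append(2.0 * links / (k * (k - 1)))
    return sum(coeffs) / len(coeffs)


# Hypothetical toy network: two triangles joined by a single bridge edge (2-3).
toy_network = {
    0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3},
    3: {2, 4, 5}, 4: {3, 5}, 5: {3, 4},
}
integration = global_efficiency(toy_network)
segregation = average_clustering(toy_network)
```

In a real analysis the graph would come from thresholded or weighted functional connectivity between brain regions, and the group comparison would be made on these per-subject values.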

Translated: 2023-05-16 17:04:59 Published: 2023-05-14

# Algorithmic Pluralism: A Structural Approach to Equal Opportunity

Algorithmic Pluralism: A Structural Approach Towards Equal Opportunity ( http://arxiv.org/abs/2305.08157v1 )

License: see the linked page

Shomik Jain, Vinith Suriyakumar, Ashia Wilson

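(Reference translation) The idea of equal opportunity is widely accepted because of the freedom that opportunities give us to shape our lives. Many, however, disagree deeply about what equal opportunity means. A new theory of equal opportunity takes a structural approach, describing how decisions can act as bottlenecks, or narrow places, in the structure of opportunities. This view of discrimination highlights fundamental problems with equal opportunity and with achieving it through formal fairness interventions, and instead advocates a more pluralistic approach that prioritizes opening up more opportunities to more people. We extend this bottleneck theory to data-driven decision-making, addressing concerns about the extent to which algorithms can create severe bottlenecks in the opportunity structure. We recommend algorithmic pluralism: prioritizing the alleviation of severity in algorithmic decision-making systems. Drawing on examples from education, healthcare, and criminal justice, we show how this structural approach helps reframe debates about equal opportunity in system design and regulation, and how algorithmic pluralism could help expand opportunities in a more positive-sum way.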

The idea of equal opportunity enjoys wide acceptance because of the freedom opportunities provide us to shape our lives. Many disagree deeply, however, about the meaning of equal opportunity, especially in algorithmic decision-making. A new theory of equal opportunity adopts a structural approach, describing how decisions can operate as bottlenecks or narrow places in the structure of opportunities. This viewpoint on discrimination highlights fundamental problems with equal opportunity and its achievement through formal fairness interventions, and instead advocates for a more pluralistic approach that prioritizes opening up more opportunities for more people. We extend this theory of bottlenecks to data-driven decision-making, adapting it to center concerns about the extent to which algorithms can create severe bottlenecks in the opportunity structure. We recommend algorithmic pluralism: the prioritization of alleviating severity in systems of algorithmic decision-making. Drawing on examples from education, healthcare, and criminal justice, we show how this structural approach helps reframe debates about equal opportunity in system design and regulation, and how algorithmic pluralism could help expand opportunities in a more positive-sum way.

Translated: 2023-05-16 17:04:44 Published: 2023-05-14

# Single-Photon Circularly Polarized Single-Mode Vortex Beams

Single-photon circularly polarized single-mode vortex beams ( http://arxiv.org/abs/2305.08223v1 )

License: see the linked page

Xujing Liu, Yinhui Kan, Shailesh Kumar, Danylo Komisar, Changying Zhao, Sergey I. Bozhevolnyi

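(Reference translation) Generating single photons that carry spin and orbital angular momenta (SAM and OAM) opens enticing perspectives for exploiting multiple degrees of freedom in high-dimensional quantum systems. However, on-chip generation of single photons encoded in single-mode SAM-OAM states remains a major challenge. Here, by carefully designing anisotropic nanodimers fabricated atop a substrate that supports surface plasmon polariton (SPP) propagation, and positioning them accurately around a quantum emitter (QE), we enable nonradiative QE-SPP coupling and SPP outcoupling into free-space propagating radiation with the designed SAM and OAM. We demonstrate on-chip, room-temperature generation of well-collimated (divergence < 7.5 degrees), circularly polarized (chirality > 0.97) single-mode vortex beams with different topological charges (l = 0, 1, and 2) and high single-photon purity (g(0) < 0.15). The approach can be extended straightforwardly to produce multiple, differently polarized, single-mode single-photon radiation channels, enabling high-dimensional quantum sources for advanced quantum photonic technologies.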

Generation of single photons carrying spin and orbital angular momenta (SAM and OAM) opens enticing perspectives for exploiting multiple degrees of freedom for high-dimensional quantum systems. However, on-chip generation of single photons encoded with single-mode SAM-OAM states has been a major challenge. Here, by utilizing carefully designed anisotropic nanodimers fabricated atop a substrate, supporting surface plasmon polariton (SPP) propagation, and accurately positioned around a quantum emitter (QE), we enable nonradiative QE-SPP coupling and the SPP outcoupling into free-space propagating radiation featuring the designed SAM and OAM. We demonstrate on-chip room-temperature generation of well-collimated (divergence < 7.5 degrees) circularly polarized (chirality > 0.97) single-mode vortex beams with different topological charges (l = 0, 1, and 2) and high single-photon purity, g(0) < 0.15. The developed approach can straightforwardly be extended to produce multiple, differently polarized, single-mode single-photon radiation channels, and enable thereby realization of high-dimensional quantum sources for advanced quantum photonic technologies.

Translated: 2023-05-16 16:59:29 Published: 2023-05-14

# Ultracompact Single-Photon Sources of Linearly Polarized Vortex Beams

Ultracompact single-photon sources of linearly polarized vortex beams ( http://arxiv.org/abs/2305.08222v1 )

License: see the linked page

Xujing Liu, Yinhui Kan, Shailesh Kumar, Liudmilla F. Kulikova, Valery A. Davydov, Viatcheslav N. Agafonov, Changying Zhao, Sergey I. Bozhevolnyi

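(Reference translation) Ultracompact chip-integrated single-photon sources of collimated beams with polarization-encoded states are essential for integrated quantum technologies. However, most currently available single-photon sources rely on bulky external optical components to shape the polarization and phase front of the emitted photon beams. Efficient integration of quantum emitters with beam-shaping and polarization-encoding functionality has so far remained elusive. Here, we present ultracompact single-photon sources of linearly polarized vortex beams based on chip-integrated quantum-emitter-coupled metasurfaces, designed to fully exploit the potential of nanobrick-array metasurfaces. We first demonstrate on-chip single-photon generation of high-purity linearly polarized vortex beams with prescribed topological charges of -1, 0, and +1. We then realize multiplexing of single-photon emission channels with orthogonal linear polarizations carrying different topological charges and demonstrate their entanglement. This work illustrates the potential and feasibility of ultracompact quantum-emitter-coupled metasurfaces as a new quantum optics platform for chip-integrated high-dimensional single-photon sources.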

Ultracompact chip-integrated single-photon sources of collimated beams with polarization-encoded states are crucial for integrated quantum technologies. However, most currently available single-photon sources rely on external bulky optical components to shape the polarization and phase front of emitted photon beams. Efficient integration of quantum emitters with beam shaping and polarization encoding functionalities has so far remained elusive. Here, we present ultracompact single-photon sources of linearly polarized vortex beams based on chip-integrated quantum emitter-coupled metasurfaces, which are meticulously designed by fully exploiting the potential of nanobrick arrayed metasurfaces. We first demonstrate on-chip single-photon generation of high-purity linearly polarized vortex beams with prescribed topological charges of -1, 0, and +1. We further realize multiplexing of single-photon emission channels with orthogonal linear polarizations carrying different topological charges and demonstrate their entanglement. Our work illustrates the potential and feasibility of ultracompact quantum emitter-coupled metasurfaces as a new quantum optics platform for realizing chip-integrated high-dimensional single-photon sources.

Translated: 2023-05-16 16:59:02 Published: 2023-05-14

# Learning Structure-Aware Deep Spectral Embedding

Learning Structure Aware Deep Spectral Embedding ( http://arxiv.org/abs/2305.08215v1 )

License: see the linked page

Hira Yaseen and Arif Mahmood

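(Reference translation) Spectral embedding (SE) is often used to map data points from non-linear manifolds to linear subspaces for classification and clustering. Despite its significant advantages, the subspace structure of the data in the original space is not preserved in the embedding space. To address this issue, subspace clustering has been proposed, which replaces the SE graph affinity with a self-expression matrix. It works well when the data lies in a union of linear subspaces, but performance may degrade in real-world applications where data often spans non-linear manifolds. To address this problem, we propose a novel structure-aware deep spectral embedding that combines a spectral embedding loss with a structure preservation loss. To this end, we propose a deep neural network architecture that encodes both types of information simultaneously and generates structure-aware spectral embeddings. The subspace structure of the input data is encoded using attention-based self-expression learning. The proposed algorithm is evaluated on six publicly available real-world datasets. The results show excellent clustering performance compared with existing state-of-the-art methods. The proposed algorithm also generalizes better to unseen data points and scales to larger datasets without requiring significant computational resources.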

Spectral Embedding (SE) has often been used to map data points from non-linear manifolds to linear subspaces for the purpose of classification and clustering. Despite significant advantages, the subspace structure of data in the original space is not preserved in the embedding space. To address this issue, subspace clustering has been proposed by replacing the SE graph affinity with a self-expression matrix. It works well if the data lies in a union of linear subspaces; however, the performance may degrade in real-world applications where data often spans non-linear manifolds. To address this problem, we propose a novel structure-aware deep spectral embedding by combining a spectral embedding loss and a structure preservation loss. To this end, a deep neural network architecture is proposed that simultaneously encodes both types of information and aims to generate structure-aware spectral embedding. The subspace structure of the input data is encoded by using attention-based self-expression learning. The proposed algorithm is evaluated on six publicly available real-world datasets. The results demonstrate the excellent clustering performance of the proposed algorithm compared to the existing state-of-the-art methods. The proposed algorithm has also exhibited better generalization to unseen data points, and it is scalable to larger datasets without requiring significant computational resources.

Translated: 2023-05-16 16:58:45 Published: 2023-05-14

# Giant Spin Model for Stimulated Raman Adiabatic Passage of Systems Interacting with a Fermionic Environment

Giant Spin Model for Stimulated Raman Adiabatic Passage of systems interacting with a fermionic environment ( http://arxiv.org/abs/2305.08209v1 )

License: see the linked page

Benedetto Militello and Anna Napoli

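(Reference translation) Stimulated Raman adiabatic passage is analyzed for the case in which the physical system manipulated by the technique interacts with a spin bath. The efficiency of the population transfer process is investigated in several regimes, including weak and strong coupling with the environment and off-resonance. The occurrence of a generalized quantum Zeno effect explains the reduced efficiency in the strong damping regime.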

Stimulated Raman Adiabatic Passage is analyzed in the case where the physical system manipulated by such technique is interacting with a spin bath. The efficiency of the population transfer process is investigated in several regimes, including the weak and strong coupling with the environment and the off-resonance. The occurrence of a generalized quantum Zeno effect explains the lowering of the efficiency in the strong damping regime.

Translated: 2023-05-16 16:58:12 Published: 2023-05-14

# Learning to Generalize for Cross-Domain QA

Learning to Generalize for Cross-domain QA ( http://arxiv.org/abs/2305.08208v1 )

License: see the linked page

Yingjie Niu, Linyi Yang, Ruihai Dong, Yue Zhang

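(Reference translation) There are growing concerns about the out-of-domain generalization ability of natural language processing (NLP) models, particularly in question-answering (QA) tasks. Current synthesized data augmentation methods for QA are hampered by increased training costs. To address this issue, we propose a novel approach that combines prompting methods with a linear-probing-then-fine-tuning strategy and entails no additional cost. The method is shown theoretically and empirically to improve the generalization ability of both generative and discriminative models. Our approach outperforms state-of-the-art baselines, with an average F1 score increase of 4.5%-7.9%. Furthermore, it can be easily integrated into any pre-trained model and offers a promising solution to the under-explored cross-domain QA task. The source code is released on GitHub.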

There have been growing concerns regarding the out-of-domain generalization ability of natural language processing (NLP) models, particularly in question-answering (QA) tasks. Current synthesized data augmentation methods for QA are hampered by increased training costs. To address this issue, we propose a novel approach that combines prompting methods with a linear-probing-then-fine-tuning strategy, which does not entail additional cost. Our method has been theoretically and empirically shown to be effective in enhancing the generalization ability of both generative and discriminative models. Our approach outperforms state-of-the-art baselines, with an average increase in F1 score of 4.5%-7.9%. Furthermore, our method can be easily integrated into any pre-trained model and offers a promising solution to the under-explored cross-domain QA task. We release our source code on GitHub.

Translated: 2023-05-16 16:57:58 Published: 2023-05-14

# A Cognitive Stimulation Dialogue System with Multi-Source Knowledge Fusion for Elders with Cognitive Impairment

A Cognitive Stimulation Dialogue System with Multi-source Knowledge Fusion for Elders with Cognitive Impairment ( http://arxiv.org/abs/2305.08200v1 )

License: see the linked page

Jiyue Jiang, Sheng Wang, Qintong Li, Lingpeng Kong, Chuan Wu

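(Reference translation) When communicating with elders with cognitive impairment, cognitive stimulation (CS) helps maintain their cognitive health. Data sparsity is the main challenge in building CS-based dialogue systems, particularly in Chinese. To fill this gap, we construct a Chinese CS conversation (CSConv) dataset containing about 2.6K groups of dialogues labeled with CS principles and emotional support strategies. Chatting while providing emotional support is overlooked by most existing cognitive dialogue systems. In this paper, we propose a multi-source knowledge fusion method for CS dialogue (CSD) that generates open-ended responses guided by the CS principle and emotional support strategy. We first use a progressive mask method based on external knowledge to train encoders as effective classifiers, a prerequisite for predicting the CS principle and emotional support strategy of the target response. A decoder then interacts with the perceived CS principle and emotional support strategy to generate responses. Extensive experiments on the CSConv dataset demonstrate the effectiveness of the proposed method, although a large gap to human performance remains.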

When communicating with elders with cognitive impairment, cognitive stimulation (CS) helps to maintain the cognitive health of elders. Data sparsity is the main challenge in building CS-based dialogue systems, particularly in the Chinese language. To fill this gap, we construct a Chinese CS conversation (CSConv) dataset, which contains about 2.6K groups of dialogues with CS principles and emotional support strategy labels. Making chit-chat while providing emotional support is overlooked by the majority of existing cognitive dialogue systems. In this paper, we propose a multi-source knowledge fusion method for CS dialogue (CSD) to generate open-ended responses guided by the CS principle and emotional support strategy. We first use a progressive mask method based on external knowledge to learn encoders as effective classifiers, which is the prerequisite for predicting the CS principle and emotional support strategy of the target response. Then a decoder interacts with the perceived CS principle and emotional support strategy to generate responses. Extensive experiments conducted on the CSConv dataset demonstrate the effectiveness of the proposed method, while there is still large room for improvement compared to human performance.

Translated: 2023-05-16 16:57:35 Published: 2023-05-14

# A Dataset Fusion Algorithm for Generalised Anomaly Detection in Homogeneous Periodic Time Series Datasets

A Dataset Fusion Algorithm for Generalised Anomaly Detection in Homogeneous Periodic Time Series Datasets ( http://arxiv.org/abs/2305.08197v1 )

License: see the linked page

Ayman Elhalwagy and Tatiana Kalganova

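(Reference translation) Generalizing neural networks (NNs) to multiple datasets is often overlooked in the literature because NNs are typically optimized for specific data sources. This is especially challenging for time-series-based multi-dataset models, since sequential data from different sensors and collection specifications is difficult to fuse. In a commercial environment, however, generalization can make effective use of available data and computational power, which is essential in the context of Green AI, the sustainable development of AI models. This paper introduces "Dataset Fusion", a novel dataset composition algorithm that fuses periodic signals from multiple homogeneous datasets into a single dataset while retaining unique features for generalized anomaly detection. Tested in a case study on 3-phase current data from 2 different homogeneous induction motor (IM) fault datasets using an unsupervised LSTMCaps NN, the approach significantly outperforms conventional training approaches with an average F1 score of 0.879 and generalizes effectively across all datasets. In line with Green AI principles, the approach was also tested with varying percentages of the training data. Using only 6.25% of the training data, corresponding to a 93.7% reduction in computational power, led to a performance decrease of only 4.04%, showing the advantages of the approach in both performance and computational efficiency. Moreover, the algorithm's effectiveness under non-ideal conditions highlights its potential for real-world use.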

The generalisation of Neural Networks (NN) to multiple datasets is often overlooked in literature due to NNs typically being optimised for specific data sources. This becomes especially challenging in time-series-based multi-dataset models due to difficulties in fusing sequential data from different sensors and collection specifications. In a commercial environment, however, generalisation can effectively utilise available data and computational power, which is essential in the context of Green AI, the sustainable development of AI models. This paper introduces "Dataset Fusion," a novel dataset composition algorithm for fusing periodic signals from multiple homogeneous datasets into a single dataset while retaining unique features for generalised anomaly detection. The proposed approach, tested on a case study of 3-phase current data from 2 different homogeneous Induction Motor (IM) fault datasets using an unsupervised LSTMCaps NN, significantly outperforms conventional training approaches with an Average F1 score of 0.879 and effectively generalises across all datasets. The proposed approach was also tested with varying percentages of the training data, in line with the principles of Green AI. Results show that using only 6.25% of the training data, translating to a 93.7% reduction in computational power, results in a mere 4.04% decrease in performance, demonstrating the advantages of the proposed approach in terms of both performance and computational efficiency. Moreover, the algorithm's effectiveness under non-ideal conditions highlights its potential for practical use in real-world applications.

Translated: 2023-05-16 16:57:06 Published: 2023-05-14

# A Comprehensive Survey on the Segment Anything Model for Vision and Beyond

A Comprehensive Survey on Segment Anything Model for Vision and Beyond ( http://arxiv.org/abs/2305.08196v1 )

License: see the linked page

Chunhui Zhang, Li Liu, Yawen Cui, Guanjie Huang, Weilin Lin, Yiqian Yang, Yuehong Hu

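(Reference translation) Artificial intelligence (AI) is evolving towards artificial general intelligence, which refers to the ability of an AI system to perform a wide range of tasks and exhibit a level of intelligence similar to that of a human. This contrasts with narrow or specialized AI, which is designed to perform specific tasks with high efficiency. It is therefore urgent to design a general class of models, called foundation models, that are trained on broad data and can be adapted to various downstream tasks. The recently proposed Segment Anything Model (SAM) has pushed the boundaries of segmentation and greatly promoted the development of foundation models for computer vision. To understand SAM fully, we conduct a survey. As the first comprehensive review of progress on the segment-anything task for vision and beyond based on the SAM foundation model, this work focuses on applications to various tasks and data types, discussing SAM's historical development, recent progress, and profound impact on a broad range of applications. We first introduce the background and terminology of foundation models, including SAM, as well as state-of-the-art methods contemporaneous with SAM that are important for the segment-anything task. We then analyze and summarize the advantages and limitations of SAM across various image processing applications, including software scenes, real-world scenes, and complex scenes. Importantly, we draw several insights to guide future research toward more versatile foundation models and improvements to SAM's architecture. We also summarize many other applications of SAM in vision and beyond.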

Artificial intelligence (AI) is evolving towards artificial general intelligence, which refers to the ability of an AI system to perform a wide range of tasks and exhibit a level of intelligence similar to that of a human being. This is in contrast to narrow or specialized AI, which is designed to perform specific tasks with a high degree of efficiency. Therefore, it is urgent to design a general class of models, which we term foundation models, trained on broad data that can be adapted to various downstream tasks. The recently proposed segment anything model (SAM) has made significant progress in breaking the boundaries of segmentation, greatly promoting the development of foundation models for computer vision. To fully comprehend SAM, we conduct a survey study. As the first to comprehensively review the progress of the segment-anything task for vision and beyond based on the foundation model of SAM, this work focuses on its applications to various tasks and data types by discussing its historical development, recent progress, and profound impact on broad applications. We first introduce the background and terminology for foundation models including SAM, as well as state-of-the-art methods contemporaneous with SAM that are significant for the segment-anything task. Then, we analyze and summarize the advantages and limitations of SAM across various image processing applications, including software scenes, real-world scenes, and complex scenes. Importantly, some insights are drawn to guide future research to develop more versatile foundation models and improve the architecture of SAM. We also summarize many other applications of SAM in vision and beyond.

Translated: 2023-05-16 16:56:39 Published: 2023-05-14

# Learning to Simulate Natural Language Feedback for Interactive Semantic Parsing

Learning to Simulate Natural Language Feedback for Interactive Semantic Parsing ( http://arxiv.org/abs/2305.08195v1 )

License: see the linked page

Hao Yan, Saurabh Srivastava, Yintao Tai, Sida I. Wang, Wen-tau Yih, Ziyu Yao

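(Reference translation) Interactive semantic parsing based on natural language (NL) feedback, in which users provide feedback to correct the parser's mistakes, has emerged as a more practical scenario than traditional one-shot semantic parsing. However, prior work has relied heavily on human-annotated feedback data to train the interactive semantic parser, which is prohibitively expensive and does not scale. In this work, we propose the new task of simulating NL feedback for interactive semantic parsing. We accompany the task with a novel feedback evaluator. The evaluator is designed specifically to assess the quality of simulated feedback, and we use it to select the best feedback simulator among our proposed variants. On a text-to-SQL dataset, we show that our feedback simulator can generate high-quality NL feedback that boosts the error correction ability of a specific parser. In low-data settings, the feedback simulator helps achieve error correction performance comparable to that obtained by training on the costly, full set of human annotations.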

Interactive semantic parsing based on natural language (NL) feedback, where users provide feedback to correct the parser's mistakes, has emerged as a more practical scenario than the traditional one-shot semantic parsing. However, prior work has heavily relied on human-annotated feedback data to train the interactive semantic parser, which is prohibitively expensive and not scalable. In this work, we propose a new task of simulating NL feedback for interactive semantic parsing. We accompany the task with a novel feedback evaluator. The evaluator is specifically designed to assess the quality of the simulated feedback, based on which we decide the best feedback simulator from our proposed variants. On a text-to-SQL dataset, we show that our feedback simulator can generate high-quality NL feedback to boost the error correction ability of a specific parser. In low-data settings, our feedback simulator can help achieve error correction performance comparable to that obtained by training on the costly, full set of human annotations.

Translated: 2023-05-16 16:56:12 Published: 2023-05-14

# Diffusion Models for Imperceptible and Transferable Adversarial Attack

Diffusion Models for Imperceptible and Transferable Adversarial Attack ( http://arxiv.org/abs/2305.08192v1 )

License: see the linked page

Jianqi Chen, Hao Chen, Keyan Chen, Yilan Zhang, Zhengxia Zou, Zhenwei Shi

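(Reference translation) Many existing adversarial attacks generate $L_p$-norm perturbations in the image RGB space. Despite some achievements in transferability and attack success rate, the crafted adversarial examples are easily perceived by human eyes. Towards visual imperceptibility, some recent works explore unrestricted attacks without $L_p$-norm constraints, but these lack transferability when attacking black-box models. In this work, we propose a novel imperceptible and transferable attack that leverages both the generative and discriminative power of diffusion models. Specifically, instead of manipulating pixel space directly, we craft perturbations in the latent space of diffusion models. Combined with well-designed content-preserving structures, this yields human-insensitive perturbations embedded with semantic clues. For better transferability, we further "deceive" the diffusion model, which can be viewed as an additional recognition surrogate, by distracting its attention away from the target regions. To our knowledge, the proposed method, DiffAttack, is the first to introduce diffusion models into the adversarial attack field. Extensive experiments on various model structures (including CNNs, Transformers, and MLPs) and defense methods demonstrate its superiority over other attack methods.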

Many existing adversarial attacks generate $L_p$-norm perturbations on image RGB space. Despite some achievements in transferability and attack success rate, the crafted adversarial examples are easily perceived by human eyes. Towards visual imperceptibility, some recent works explore unrestricted attacks without $L_p$-norm constraints, yet lacking transferability of attacking black-box models. In this work, we propose a novel imperceptible and transferable attack by leveraging both the generative and discriminative power of diffusion models. Specifically, instead of direct manipulation in pixel space, we craft perturbations in latent space of diffusion models. Combined with well-designed content-preserving structures, we can generate human-insensitive perturbations embedded with semantic clues. For better transferability, we further "deceive" the diffusion model which can be viewed as an additional recognition surrogate, by distracting its attention away from the target regions. To our knowledge, our proposed method, DiffAttack, is the first that introduces diffusion models into adversarial attack field. Extensive experiments on various model structures (including CNNs, Transformers, MLPs) and defense methods have demonstrated our superiority over other attack methods.

Translated: 2023-05-16 16:55:56 Published: 2023-05-14

# MatSci-NLP: Evaluating Scientific Language Models on Materials Science Language Tasks Using Text-to-Schema Modeling

MatSci-NLP: Evaluating Scientific Language Models on Materials Science Language Tasks Using Text-to-Schema Modeling ( http://arxiv.org/abs/2305.08264v1 )

License: see the linked page

Yu Song, Santiago Miret, Bang Liu

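(Reference translation) We present MatSci-NLP, a natural language benchmark for evaluating the performance of natural language processing (NLP) models on materials science text. We construct the benchmark from publicly available materials science text data to cover seven different NLP tasks, including conventional NLP tasks such as named entity recognition and relation classification, as well as tasks specific to materials science, such as synthesis action retrieval, which relates to creating synthesis procedures for materials. We study various BERT-based models pretrained on different scientific text corpora to understand the impact of pretraining strategies on understanding materials science text. Given the scarcity of high-quality annotated data in the materials science domain, we run our fine-tuning experiments with limited training data to encourage generalization across MatSci-NLP tasks. Experiments in this low-resource setting show that language models pretrained on scientific text outperform BERT trained on general text. MatBERT, a model pretrained specifically on materials science journals, generally performs best on most tasks. Moreover, we propose a unified text-to-schema approach for multitask learning on MatSci-NLP and compare its performance with traditional fine-tuning methods. In our analysis of different training methods, the proposed text-to-schema methods, inspired by question answering, consistently outperform single-task and multitask NLP fine-tuning methods. The code and datasets are publicly available at https://github.com/BangLab-UdeM-Mila/NLP4MatSci-ACL23.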

We present MatSci-NLP, a natural language benchmark for evaluating the performance of natural language processing (NLP) models on materials science text. We construct the benchmark from publicly available materials science text data to encompass seven different NLP tasks, including conventional NLP tasks like named entity recognition and relation classification, as well as NLP tasks specific to materials science, such as synthesis action retrieval which relates to creating synthesis procedures for materials. We study various BERT-based models pretrained on different scientific text corpora on MatSci-NLP to understand the impact of pretraining strategies on understanding materials science text. Given the scarcity of high-quality annotated data in the materials science domain, we perform our fine-tuning experiments with limited training data to encourage generalization across MatSci-NLP tasks. Our experiments in this low-resource training setting show that language models pretrained on scientific text outperform BERT trained on general text. MatBERT, a model pretrained specifically on materials science journals, generally performs best for most tasks. Moreover, we propose a unified text-to-schema for multitask learning on MatSci-NLP and compare its performance with traditional fine-tuning methods. In our analysis of different training methods, we find that our proposed text-to-schema methods inspired by question-answering consistently outperform single and multitask NLP fine-tuning methods. The code and datasets are publicly available at https://github.com/BangLab-UdeM-Mila/NLP4MatSci-ACL23.

Translated: 2023-05-16 16:48:54 Published: 2023-05-14

# Parameter-Efficient Fine-Tuning for Medical Image Analysis: The Missed Opportunity

Parameter-Efficient Fine-Tuning for Medical Image Analysis: The Missed Opportunity ( http://arxiv.org/abs/2305.08252v1 )

License: see the linked page

Raman Dutt, Linus Ericsson, Pedro Sanchez, Sotirios A. Tsaftaris, Timothy Hospedales

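(Reference translation) We present a comprehensive evaluation of Parameter-Efficient Fine-Tuning (PEFT) techniques for diverse medical image analysis tasks. PEFT is increasingly used as a valuable approach for transferring knowledge from pre-trained models in natural language processing, vision, speech, and cross-modal tasks such as vision-language and text-to-image generation. However, its application to medical image analysis remains relatively unexplored. As foundation models are increasingly used in the medical domain, it is crucial to investigate and comparatively assess strategies for knowledge transfer that can bolster a range of downstream tasks. Our study, to the best of our knowledge the first of its kind, evaluates 16 distinct PEFT methodologies proposed for convolutional and transformer-based networks, focusing on image classification and text-to-image generation tasks across six medical datasets ranging in size, modality, and complexity. Through more than 600 controlled experiments, we demonstrate performance gains of up to 22% under certain scenarios and show the efficacy of PEFT for medical text-to-image generation. Further, we reveal the cases in which PEFT methods particularly dominate conventional fine-tuning approaches by studying their relationship with downstream data volume.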

We present a comprehensive evaluation of Parameter-Efficient Fine-Tuning (PEFT) techniques for diverse medical image analysis tasks. PEFT is increasingly exploited as a valuable approach for knowledge transfer from pre-trained models in natural language processing, vision, speech, and cross-modal tasks, such as vision-language and text-to-image generation. However, its application in medical image analysis remains relatively unexplored. As foundation models are increasingly exploited in the medical domain, it is crucial to investigate and comparatively assess various strategies for knowledge transfer that can bolster a range of downstream tasks. Our study, the first of its kind (to the best of our knowledge), evaluates 16 distinct PEFT methodologies proposed for convolutional and transformer-based networks, focusing on image classification and text-to-image generation tasks across six medical datasets ranging in size, modality, and complexity. Through a battery of more than 600 controlled experiments, we demonstrate performance gains of up to 22% under certain scenarios and demonstrate the efficacy of PEFT for medical text-to-image generation. Further, we reveal the instances where PEFT methods particularly dominate over conventional fine-tuning approaches by studying their relationship with downstream data volume.

Translated: 2023-05-16 16:48:26 Published: 2023-05-14

# Learning Non-Linguistic Skills without Sacrificing Linguistic Proficiency

Learning Non-linguistic Skills without Sacrificing Linguistic Proficiency ( http://arxiv.org/abs/2305.08246v1 )

License: see the linked page

Mandar Sharma, Nikhil Muralidhar, Naren Ramakrishnan

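(Reference translation) The field of Math-NLP has grown significantly in recent years, motivated by the desire to extend LLM performance to the learning of non-linguistic notions (numerals and, subsequently, arithmetic reasoning). However, non-linguistic skill injection typically comes at a cost for LLMs: it leads to catastrophic forgetting of core linguistic skills, a consequence that often goes unaddressed in the literature. Although Math-NLP can now create LLMs that closely approximate the mathematical skills of a grade-schooler or the arithmetic reasoning skills of a calculator, these models lose their practicality if they shed their linguistic capabilities at the same time. In this work, we take a closer look at the phenomenon of catastrophic forgetting as it pertains to LLMs and offer a novel framework for non-linguistic skill injection based on information-theoretic interventions and skill-specific losses that enable the learning of strict arithmetic reasoning. Our model outperforms the state of the art both on the injected non-linguistic skills and on linguistic knowledge retention, and does so with a fraction of the non-linguistic training data (1/4) and no additional synthetic linguistic training data.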

The field of Math-NLP has witnessed significant growth in recent years, motivated by the desire to expand LLM performance to the learning of non-linguistic notions (numerals, and subsequently, arithmetic reasoning). However, non-linguistic skill injection typically comes at a cost for LLMs: it leads to catastrophic forgetting of core linguistic skills, a consequence that often remains unaddressed in the literature. As Math-NLP has been able to create LLMs that can closely approximate the mathematical skills of a grade-schooler or the arithmetic reasoning skills of a calculator, the practicality of these models fails if they concomitantly shed their linguistic capabilities. In this work, we take a closer look into the phenomenon of catastrophic forgetting as it pertains to LLMs and subsequently offer a novel framework for non-linguistic skill injection for LLMs based on information-theoretic interventions and skill-specific losses that enable the learning of strict arithmetic reasoning. Our model outperforms the state-of-the-art both on injected non-linguistic skills and on linguistic knowledge retention, and does so with a fraction of the non-linguistic training data (1/4) and zero additional synthetic linguistic training data.

Translated: 2023-05-16 16:48:04 Published: 2023-05-14

# Introducing Tales of Tribute AI Competition

Introducing Tales of Tribute AI Competition ( http://arxiv.org/abs/2305.08234v1 )

License: see the linked page

Jakub Kowalski, Radosław Miernik, Katarzyna Polak, Dominik Budzki, Damian Kowalik

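(Reference translation) This paper presents a new AI challenge, the Tales of Tribute AI Competition (TOTAIC), based on a two-player deck-building card game released with the High Isle chapter of The Elder Scrolls Online. Currently, no other AI competition covers the Collectible Card Games (CCG) genre, and there has never been one targeting a deck-building game. Thus, apart from the usual CCG-related obstacles, such as randomness, hidden information, and a large branching factor, a successful approach also requires long-term planning and versatility. The game can be tackled with multiple approaches, including classic adversarial search, single-player planning, and neural-network-based algorithms. This paper introduces the competition framework, describes the rules of the game, and presents the results of a tournament between sample AI agents. The first edition of TOTAIC is hosted at the IEEE Conference on Games 2023.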

This paper presents a new AI challenge, the Tales of Tribute AI Competition (TOTAIC), based on a two-player deck-building card game released with the High Isle chapter of The Elder Scrolls Online. Currently, there is no other AI competition covering the Collectible Card Games (CCG) genre, and there has never been one that targets a deck-building game. Thus, apart from the usual CCG-related obstacles to overcome, like randomness, hidden information, and a large branching factor, a successful approach additionally requires long-term planning and versatility. The game can be tackled with multiple approaches, including classic adversarial search, single-player planning, and Neural Networks-based algorithms. This paper introduces the competition framework, describes the rules of the game, and presents the results of a tournament between sample AI agents. The first edition of TOTAIC is hosted at the IEEE Conference on Games 2023.

Translated: 2023-05-16 16:47:40 Published: 2023-05-14

# Addressing Heterophily in Node Classification with Graph Echo State Networks

Addressing Heterophily in Node Classification with Graph Echo State Networks ( http://arxiv.org/abs/2305.08233v1 )

License: see the linked page

Alessio Micheli, Domenico Tortorella

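(Reference translation) Node classification tasks on graphs are addressed with fully-trained deep message-passing models that learn a hierarchy of node representations through multiple aggregations of a node's neighbourhood. While effective on graphs with a high ratio of intra-class edges, this approach struggles in the opposite case, heterophily, where nodes of the same class are usually further apart. In graphs with a high degree of heterophily, the smoothed representations that convolutional models compute from close neighbours are no longer effective. So far, architectural variations of message-passing models that reduce excessive smoothing, or rewiring of the input graph to improve longer-range message passing, have been proposed. In this paper, we address the challenges of heterophilic graphs with the Graph Echo State Network (GESN) for node classification. GESN is a reservoir computing model for graphs in which node embeddings are computed recursively by an untrained message-passing function. Our experiments show that reservoir models achieve better or comparable accuracy relative to most fully trained deep models that implement ad hoc architectural variations or rewire the input graph as a preprocessing step, with an improved efficiency/accuracy trade-off. Furthermore, our analysis shows that GESN effectively encodes the structural relationships of a graph node, as indicated by a correlation between the iterations of the recursive embedding function and the distribution of shortest paths in the graph.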

Node classification tasks on graphs are addressed via fully-trained deep message-passing models that learn a hierarchy of node representations via multiple aggregations of a node's neighbourhood. While effective on graphs that exhibit a high ratio of intra-class edges, this approach poses challenges in the opposite case, i.e. heterophily, where nodes belonging to the same class are usually further apart. In graphs with a high degree of heterophily, the smoothed representations based on close neighbours computed by convolutional models are no longer effective. So far, architectural variations in message-passing models to reduce excessive smoothing or rewiring the input graph to improve longer-range message passing have been proposed. In this paper, we address the challenges of heterophilic graphs with Graph Echo State Network (GESN) for node classification. GESN is a reservoir computing model for graphs, where node embeddings are recursively computed by an untrained message-passing function. Our experiments show that reservoir models are able to achieve better or comparable accuracy with respect to most fully trained deep models that implement ad hoc variations in the architectural bias or perform rewiring as a preprocessing step on the input graph, with an improvement in terms of efficiency/accuracy trade-off. Furthermore, our analysis shows that GESN is able to effectively encode the structural relationships of a graph node, by showing a correlation between iterations of the recursive embedding function and the distribution of shortest paths in a graph.
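
The recursive, untrained embedding described in this abstract can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: the update rule (a `tanh` of a random input projection plus a random recurrent projection of the summed neighbour states), the weight ranges, and every name such as `gesn_embeddings` are assumptions made for the example.

```python
import math
import random

def gesn_embeddings(adjacency, features, hidden=4, iterations=10, scale=0.3, seed=0):
    """Iterate an untrained message-passing map over all nodes.

    adjacency: dict mapping node -> list of neighbour nodes
    features:  dict mapping node -> list of input floats
    """
    rng = random.Random(seed)
    n_in = len(next(iter(features.values())))
    # Random weights that are never trained (the "reservoir"); ranges are illustrative.
    w_in = [[rng.uniform(-1, 1) for _ in range(n_in)] for _ in range(hidden)]
    w_rec = [[rng.uniform(-scale, scale) for _ in range(hidden)] for _ in range(hidden)]

    state = {v: [0.0] * hidden for v in adjacency}
    for _ in range(iterations):
        new_state = {}
        for v, neighbours in adjacency.items():
            # Sum the neighbours' states from the previous iteration.
            agg = [sum(state[u][j] for u in neighbours) for j in range(hidden)]
            new_state[v] = [
                math.tanh(
                    sum(w_in[i][k] * features[v][k] for k in range(n_in))
                    + sum(w_rec[i][j] * agg[j] for j in range(hidden))
                )
                for i in range(hidden)
            ]
        state = new_state
    return state

# Path graph 0-1-2-3 with one scalar feature per node.
adjacency = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
features = {0: [1.0], 1: [0.0], 2: [0.0], 3: [1.0]}
embeddings = gesn_embeddings(adjacency, features)
```

In a reservoir-computing setup, only a readout on top of `embeddings` would be trained. Each extra iteration lets information travel one hop further, which is the intuition behind relating iterations to shortest-path distances.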

翻訳日:2023-05-16 16:47:25 公開日:2023-05-14

# 街路画像からの物体の位置情報と高さ推定の組合せ

Combining geolocation and height estimation of objects from street level imagery ( http://arxiv.org/abs/2305.08232v1 )

ライセンス: Link先を確認

Matej Ulicny, Vladimir A. Krylov, Julie Connelly, and Rozenn Dahyot

(参考訳) 本研究では,利用可能な唯一の入力データモダリティと見なされるストリートレベルのRGB画像から,多クラスオブジェクトの位置情報と高さ推定を組み合わせたパイプラインを提案する。我々の解は,決定論的な出力を持つマルコフ確率場最適化によって定式化される。提案手法は,カスタムトレーニングされた畳み込みニューラルネットワークで検出された画像平面内の物体の座標とともに画像メタデータを使用する。本手法で物体の位置に加えて高さを計算しても,全体の計算コストへの影響は無視できる程度である。平均標高推定誤差が20cm未満となる排水口や道路標識について,精度を実験的に実証した。

We propose a pipeline for combined multi-class object geolocation and height estimation from street-level RGB imagery, which is considered as a single available input data modality. Our solution is formulated via Markov Random Field optimization with deterministic output. The proposed technique uses image metadata along with coordinates of objects detected in the image plane as found by a custom-trained Convolutional Neural Network. Computing the object height using our methodology, in addition to object geolocation, has a negligible effect on the overall computational cost. Accuracy is demonstrated experimentally for water drains and road signs, on which we achieve an average elevation estimation error lower than 20 cm.

翻訳日:2023-05-16 16:46:57 公開日:2023-05-14

# 海面の高さと速度場に基づくハイブリッド3次元渦検出技術

A Hybrid 3D Eddy Detection Technique Based on Sea Surface Height and Velocity Field ( http://arxiv.org/abs/2305.08229v1 )

ライセンス: Link先を確認

Weiping Hua, Karen Bemis, Dujuan Kang, Sedat Ozer, Deborah Silver

(参考訳) 渦検出は海洋科学者にとって海洋循環を理解し解析する重要な課題である。本稿では,海面の高さ (SSH) と速度場を,渦の挙動を定義する幾何学的基準と組み合わせたハイブリッド渦検出手法を提案する。本手法は,海洋学者が渦の中心にあると想定するSSHの極小値と極大値を探索する。幾何学的基準は、各渦中心を囲む円形の経路に沿って速度成分を追跡することにより、正味の回転や対称性などの期待される速度場特性の検証に使用される。外側およびより深い層へと段階的に探索することで、各渦の3次元影響領域が得られる。円筒状のフットプリントを用いてデータセットから各渦構造を分離することで、水平速度、垂直速度、温度、塩分を用いた内部渦構造の可視化が容易になる。大久保-ワイス渦度(OW)の閾値処理、標準的な巻き角法、およびこの新しいSSH-速度ハイブリッド法による渦検出を紅海データセットに適用して定量的に比較した結果、検出結果は手法、閾値、基準の選択に大きく依存することが示唆された。この新しいSSH-速度ハイブリッド検出手法には, 回転特性が検証された渦構造が得られること, 物理特性の内部構造を3次元で可視化できること, 流線を計算せずに渦のフットプリントを高速かつ効率的に推定できることという利点がある。本手法は, 内部構造の可視化と全体的な移動の追跡を組み合わせ, 栄養分布と海洋循環の相互作用を理解する鍵となる輸送機構の研究を支援する。本手法を3つの異なるデータセットに適用し,その適用の一般性を示す。

Eddy detection is a critical task for ocean scientists to understand and analyze ocean circulation. In this paper, we introduce a hybrid eddy detection approach that combines sea surface height (SSH) and velocity fields with geometric criteria defining eddy behavior. Our approach searches for SSH minima and maxima, which oceanographers expect to find at the center of eddies. Geometric criteria are used to verify expected velocity field properties, such as net rotation and symmetry, by tracing velocity components along a circular path surrounding each eddy center. Progressive searches outward and into deeper layers yield each eddy's 3D region of influence. Isolation of each eddy structure from the dataset, using its cylindrical footprint, facilitates visualization of internal eddy structures using horizontal velocity, vertical velocity, temperature, and salinity. A quantitative comparison of the Okubo-Weiss vorticity (OW) thresholding, standard winding angle, and new SSH-velocity hybrid methods of eddy detection, as applied to the Red Sea dataset, suggests that detection results are highly dependent on the choices of method, thresholds, and criteria. Our new SSH-velocity hybrid detection approach has the advantages of providing eddy structures with verified rotation properties, 3D visualization of the internal structure of physical properties, and rapid, efficient estimation of eddy footprints without calculating streamlines. Our approach combines visualization of internal structure and tracking overall movement to support the study of the transport mechanisms key to understanding the interaction of nutrient distribution and ocean circulation. Our method is applied to three different datasets to showcase the generality of its application.
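
For readers unfamiliar with the Okubo-Weiss baseline mentioned in this abstract, the sketch below computes the standard Okubo-Weiss parameter W = s_n^2 + s_s^2 - omega^2 (normal strain, shear strain, and vorticity) on a regular grid with central differences. Strongly negative W marks rotation-dominated regions. The grid layout, the function name `okubo_weiss`, and the test flow are assumptions for illustration; this is not the hybrid SSH-velocity method proposed in the paper.

```python
def okubo_weiss(u, v, dx=1.0, dy=1.0):
    """Return W at interior grid points (None on the border).

    u[j][i] and v[j][i] are velocity components, with j along y and i along x.
    """
    ny, nx = len(u), len(u[0])
    w = [[None] * nx for _ in range(ny)]
    for j in range(1, ny - 1):
        for i in range(1, nx - 1):
            # Central finite differences for the velocity gradients.
            du_dx = (u[j][i + 1] - u[j][i - 1]) / (2 * dx)
            du_dy = (u[j + 1][i] - u[j - 1][i]) / (2 * dy)
            dv_dx = (v[j][i + 1] - v[j][i - 1]) / (2 * dx)
            dv_dy = (v[j + 1][i] - v[j - 1][i]) / (2 * dy)
            normal_strain = du_dx - dv_dy
            shear_strain = dv_dx + du_dy
            vorticity = dv_dx - du_dy
            w[j][i] = normal_strain ** 2 + shear_strain ** 2 - vorticity ** 2
    return w

# Solid-body rotation (u = -y, v = x) on a 5x5 grid centred on the origin.
coords = [-2, -1, 0, 1, 2]
u_rot = [[-float(y) for _ in coords] for y in coords]
v_rot = [[float(x) for x in coords] for _ in coords]
w_rot = okubo_weiss(u_rot, v_rot)

# Pure strain (u = x, v = -y) on the same grid.
u_str = [[float(x) for x in coords] for _ in coords]
v_str = [[-float(y) for _ in coords] for y in coords]
w_str = okubo_weiss(u_str, v_str)
```

A threshold-based detector would flag cells where W falls below some negative cutoff. As the abstract notes, the result depends heavily on that choice.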

翻訳日:2023-05-16 16:46:47 公開日:2023-05-14

# 骨格グラフに基づく超音波CT非剛性レジストレーション

Skeleton Graph-based Ultrasound-CT Non-rigid Registration ( http://arxiv.org/abs/2305.08228v1 )

ライセンス: Link先を確認

Zhongliang Jiang, Xuesong Li, Chenyu Zhang, Yuan Bi, Walter Stechele, Nassir Navab

(参考訳) 自律型超音波(US)スキャンは注目を集めており、術者間変動などの従来のUS検査の限界を克服するための潜在的な解決策と見なされている。しかしながら、特に音響窓が制限された胸郭アプリケーションにおいて、ジェネリック・アトラス上で計画されたスキャン軌道を、異なる患者に対する現在の設定に自律的かつ正確に転送することは依然として困難である。この課題に対処するため,皮膚表面ではなく皮下骨表面の特徴を用いて患者固有の特性に適応する,骨格グラフに基づく非剛体レジストレーション手法を提案する。この目的のために、入力点群の統一とキーポイントの抽出のそれぞれに、自己組織化マッピングを2回連続して使用する。その後、最小全域木を使用して、抽出されたすべてのキーポイントを接続するツリーグラフを生成する。ソースおよびターゲットの点群を対応付けるために肋軟骨の輪郭を適切に特徴付けるため、ツリーグラフから抽出されたパスは、各肋骨全体にわたって連続性を最大限に維持することにより最適化される。提案手法を検証するために,1人のボランティアからUS軟骨点群を,異なる患者から7つのCT軟骨点群を手動で抽出した。以上の結果より,ICP (distance error mean/SD: 5.0/1.9 mm vs 8.6/6.7 mm on 7 CTs) よりも患者間変動に適応する上で,グラフベースのレジストレーションの方が有効で堅牢であることが示唆された。

Autonomous ultrasound (US) scanning has attracted increased attention, and it has been seen as a potential solution to overcome the limitations of conventional US examinations, such as inter-operator variations. However, it is still challenging to autonomously and accurately transfer a planned scan trajectory on a generic atlas to the current setup for different patients, particularly for thorax applications with limited acoustic windows. To address this challenge, we propose a skeleton graph-based non-rigid registration to adapt to patient-specific properties using subcutaneous bone surface features rather than the skin surface. To this end, the self-organization mapping is successively used twice to unify the input point cloud and extract the key points, respectively. Afterward, the minimal spanning tree is employed to generate a tree graph to connect all extracted key points. To appropriately characterize the rib cartilage outline to match the source and target point cloud, the path extracted from the tree graph is optimized by maximally maintaining continuity throughout each rib. To validate the proposed approach, we manually extract the US cartilage point cloud from one volunteer and seven CT cartilage point clouds from different patients. The results demonstrate that the proposed graph-based registration is more effective and robust in adapting to the inter-patient variations than the ICP (distance error mean/SD: 5.0/1.9 mm vs 8.6/6.7 mm on seven CTs).
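
The step in this abstract that connects extracted key points with a minimal spanning tree can be illustrated with Prim's algorithm on Euclidean distances. This is a generic sketch under stated assumptions: the abstract does not say which spanning-tree algorithm or distance the authors use, and `minimum_spanning_tree` and the sample `key_points` are hypothetical.

```python
import math

def minimum_spanning_tree(points):
    """Prim's algorithm on the complete Euclidean graph; returns a list of (i, j) edges."""
    n = len(points)
    if n == 0:
        return []
    in_tree = [False] * n
    best_dist = [math.inf] * n   # cheapest known connection of each point to the tree
    parent = [-1] * n
    best_dist[0] = 0.0
    edges = []
    for _ in range(n):
        # Pick the point outside the tree that is cheapest to attach.
        i = min((k for k in range(n) if not in_tree[k]), key=lambda k: best_dist[k])
        in_tree[i] = True
        if parent[i] >= 0:
            edges.append((parent[i], i))
        # Relax the connection cost of the remaining points.
        for j in range(n):
            if not in_tree[j]:
                d = math.dist(points[i], points[j])
                if d < best_dist[j]:
                    best_dist[j] = d
                    parent[j] = i
    return edges

# Hypothetical 3D key points (units are arbitrary).
key_points = [(0, 0, 0), (1, 0, 0), (2, 0, 0), (2, 1, 0), (5, 5, 0)]
tree_edges = minimum_spanning_tree(key_points)
total_length = sum(math.dist(key_points[i], key_points[j]) for i, j in tree_edges)
```

Paths through the resulting tree could then be extracted and refined, which is where the continuity-based optimization described in the abstract would come in.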

翻訳日:2023-05-16 16:46:19 公開日:2023-05-14

# DeepFilterNet: 知覚的動機付けによるリアルタイム音声強調

DeepFilterNet: Perceptually Motivated Real-Time Speech Enhancement ( http://arxiv.org/abs/2305.08227v1 )

ライセンス: Link先を確認

Hendrik Schröter, Tobias Rosenkranz, Alberto N. Escalante-B., Andreas Maier

(参考訳) 単一チャンネル音声強調のためのマルチフレームアルゴリズムは、音声信号内の短時間相関を活用できる。周波数領域における複素フィルタを直接推定し,それらの相関性を利用するためにDF法を提案した。本稿では,DeepFilterNetを用いたリアルタイム音声強調デモを示す。 DeepFilterNetの効率性は、音声生成と心理音響知覚のドメイン知識を活用することで実現される。本モデルは,シングルスレッドノートブック cpu 上で 0.19 のリアルタイム係数を実現しつつ,最先端の音声強調ベンチマークと一致させることができる。フレームワークと事前トレーニングされた重み付けは、オープンソースライセンスで公開されている。

Multi-frame algorithms for single-channel speech enhancement are able to take advantage of short-time correlations within the speech signal. Deep Filtering (DF) was proposed to directly estimate a complex filter in the frequency domain to take advantage of these correlations. In this work, we present a real-time speech enhancement demo using DeepFilterNet. DeepFilterNet's efficiency is enabled by exploiting domain knowledge of speech production and psychoacoustic perception. Our model is able to match state-of-the-art speech enhancement benchmarks while achieving a real-time factor of 0.19 on a single-threaded notebook CPU. The framework as well as pretrained weights have been published under an open source license.
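
As a reading aid for the real-time factor quoted in this abstract: by the usual definition it is processing time divided by the duration of the processed audio, so values below 1 mean the system keeps up with the incoming signal. The timings below are made-up numbers chosen only to reproduce a factor of 0.19; they are not measurements from the paper.

```python
def real_time_factor(processing_seconds, audio_seconds):
    """Processing time divided by audio duration; below 1.0 is faster than real time."""
    if audio_seconds <= 0:
        raise ValueError("audio_seconds must be positive")
    return processing_seconds / audio_seconds

# Illustrative values: 1.9 s of compute for 10 s of audio.
rtf = real_time_factor(processing_seconds=1.9, audio_seconds=10.0)
is_real_time = rtf < 1.0
```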

翻訳日:2023-05-16 16:45:53 公開日:2023-05-14

# ファジィ生成ランタイムプロファイリングによるNLPベースのクロスレイヤ5G脆弱性検出

NLP-based Cross-Layer 5G Vulnerabilities Detection via Fuzzing Generated Run-Time Profiling ( http://arxiv.org/abs/2305.08226v1 )

ライセンス: Link先を確認

Zhuzhu Wang and Ying Wang

(参考訳) 5Gソフトウェアスタックの脆弱性と意図しない動作検出の有効性と効率性は、5G保証、特に重要なインフラにおけるその応用に不可欠である。スケーラビリティと自動化は、テストアプローチとサイバーセキュリティ研究の主要な課題である。本稿では,コードリポジトリのファズテストに対応する実行時プロファイリング文書を用いて,脆弱性,意図しない創発的動作,および5Gスタックの性能劣化を自動的に検出する革新的な手法を提案する。 srsRANをパイロットとして,ファズテストによって生成されたログ情報(LogInfo)による実行時プロファイリングをまず高次元距離空間にマップし,次にそのタイムスタンプ情報に基づいて特徴空間を構築する。最後に,ロジスティック回帰,k近傍法,ランダムフォレストなど,機械学習に基づく分類アルゴリズムを活用して,パフォーマンスとセキュリティ属性への影響を分類する。提案手法は、ファジングの影響の検出において93.4%から95.9%という高い精度を示す。さらに、概念実証は5Gインフラストラクチャと、さまざまな分野のクリティカルアプリケーションにおけるリアルタイムの脆弱性を特定し、優先順位付けできる可能性がある。

The effectiveness and efficiency of 5G software stack vulnerability and unintended behavior detection are essential for 5G assurance, especially for its applications in critical infrastructures. Scalability and automation are the main challenges in testing approaches and cybersecurity research. In this paper, we propose an innovative approach for automatically detecting vulnerabilities, unintended emergent behaviors, and performance degradation in 5G stacks via run-time profiling documents corresponding to fuzz testing in code repositories. Piloting on srsRAN, we first map the run-time profiling, via Logging Information (LogInfo) generated by fuzz testing, to a high-dimensional metric space and then construct feature spaces based on their timestamp information. Lastly, we further leverage machine learning-based classification algorithms, including Logistic Regression, K-Nearest Neighbors, and Random Forest, to categorize the impacts on performance and security attributes. The proposed approach achieves high accuracy, ranging from 93.4% to 95.9%, in detecting the fuzzing impacts. In addition, the proof of concept could identify and prioritize real-time vulnerabilities on 5G infrastructures and critical applications in various verticals.

翻訳日:2023-05-16 16:45:43 公開日:2023-05-14

# FactKB: ファクト知識で強化された言語モデルを用いた一般化可能なファクチュアリティ評価

FactKB: Generalizable Factuality Evaluation using Language Models Enhanced with Factual Knowledge ( http://arxiv.org/abs/2305.08281v1 )

ライセンス: Link先を確認

Shangbin Feng, Vidhisha Balachandran, Yuyang Bai, Yulia Tsvetkov

(参考訳) 自動生成された要約の事実整合性を評価することは、信頼できる要約システムの進展と導入に不可欠である。近年の進歩にもかかわらず、既存の事実性評価モデルは頑健ではなく、特に新しいドメインにおけるエンティティと関係エラーの傾向が強い。我々はfactkbを提案する。factuality evaluationに対する単純な新しいアプローチで、特にエンティティやリレーションに関して、ドメイン間で一般化できる。 FactKBは、外部知識ベースから抽出された事実を用いて事前訓練された言語モデルに基づいている。本稿では,直接実体事実に基づく相補的事実学習目標,実体に関する補助知識に基づく事実,知識ベースウォークによる構成的事実の3種類の相補的事実学習目標について紹介する。結果の事実性評価モデルは、2つのドメイン内ニュース要約ベンチマークと3つのドメイン外科学文献データセットに対して、最先端のパフォーマンスを達成する。 FactKBのさらなる分析は、要約における誤った実体や関係を検出する能力が改善され、ドメイン間で堅牢で一般化可能であることを示している。

Evaluating the factual consistency of automatically generated summaries is essential for the progress and adoption of reliable summarization systems. Despite recent advances, existing factuality evaluation models are not robust, being especially prone to entity and relation errors in new domains. We propose FactKB, a simple new approach to factuality evaluation that is generalizable across domains, in particular with respect to entities and relations. FactKB is based on language models pretrained using facts extracted from external knowledge bases. We introduce three types of complementary factuality pretraining objectives based on direct entity facts, facts grounded in auxiliary knowledge about entities, and facts constructed compositionally through knowledge base walks. The resulting factuality evaluation model achieves state-of-the-art performance on two in-domain news summarization benchmarks as well as on three out-of-domain scientific literature datasets. Further analysis shows that FactKB has an improved ability to detect erroneous entities and relations in summaries and is robust and generalizable across domains.

翻訳日:2023-05-16 16:39:04 公開日:2023-05-14

# Ship-D: 機械学習を用いた設計最適化のためのシップハルデータセット

Ship-D: Ship Hull Dataset for Design Optimization using Machine Learning ( http://arxiv.org/abs/2305.08279v1 )

ライセンス: Link先を確認

Noah J. Bagazinski and Faez Ahmed

(参考訳) 機械学習は最近、複雑な製品の設計サイクル時間を短縮するために大きな進歩を遂げている。船舶設計は現在、数年にわたる長いサイクルと少量生産を伴うが、これらの進歩の大きな恩恵を受ける可能性がある。様々な種類の船舶の設計から学習する船舶設計のための機械学習ツールを開発することで、船舶設計におけるトレードオフを特定し最適化することができる。しかし、公開されている船舶設計データセットの欠如が、一般的な船舶設計において機械学習を活用する可能性を現在制限している。このギャップに対処するために, パラメータ化, メッシュ, 点群, 画像表現などの設計および機能性能情報と, 異なる動作条件下での32の流体抵抗測定値を含む, 3万個の船体の大規模データセットを提案する。データセットは人間による入力を可能にするように構成されており、計算手法向けにも設計されている。さらに,提案するパラメータ化が既存の船体を正確に再構成できることを示すため,公開されているCADレポジトリから12種類の船体を紹介する。32の造波抵抗係数を予測する代理モデルを開発し,これを遺伝的アルゴリズムのケーススタディに組み込んで,船体断面の形状と平行中間体の長さを保ちながら船体の全抵抗を60パーセント削減した。我々の研究は、他の研究者がデータ駆動の船舶設計を進めるために使用できる包括的なデータセットとアプリケーションの例を提供します。

Machine learning has recently made significant strides in reducing design cycle time for complex products. Ship design, which currently involves years-long cycles and small batch production, could greatly benefit from these advancements. By developing a machine learning tool for ship design that learns from the design of many different types of ships, tradeoffs in ship design could be identified and optimized. However, the lack of publicly available ship design datasets currently limits the potential for leveraging machine learning in generalized ship design. To address this gap, this paper presents a large dataset of thirty thousand ship hulls, each with design and functional performance information, including parameterization, mesh, point cloud, and image representations, as well as thirty-two hydrodynamic drag measures under different operating conditions. The dataset is structured to allow human input and is also designed for computational methods. Additionally, the paper introduces a set of twelve ship hulls from publicly available CAD repositories to showcase the proposed parameterization's ability to accurately reconstruct existing hulls. A surrogate model was developed to predict the thirty-two wave drag coefficients, which was then implemented in a genetic algorithm case study to reduce the total drag of a hull by sixty percent while maintaining the shape of the hull's cross section and the length of the parallel midbody. Our work provides a comprehensive dataset and application examples for other researchers to use in advancing data-driven ship design.

翻訳日:2023-05-16 16:38:48 公開日:2023-05-14

# 生成的敵対ネットワークの訓練における勾配降下上昇法の局所収束

Local Convergence of Gradient Descent-Ascent for Training Generative Adversarial Networks ( http://arxiv.org/abs/2305.08277v1 )

ライセンス: Link先を確認

Evan Becker, Parthe Pandit, Sundeep Rangan, Alyson K. Fletcher

(参考訳) Generative Adversarial Networks (GAN) は複雑な高次元データの生成モデルを訓練するための一般的な定式化である。 GANをトレーニングする標準的な方法は、ミニマックス最適化問題に対する勾配降下上昇法(GDA)の手順を含む。この手順は、力学の非線形性のため、一般には解析が難しい。カーネルベースの判別器を用いてGANを訓練する際のGDAの局所力学について検討する。この収束解析は、[Becker et al. 2022] の「isolated points model」の仮定の下で、GDA反復を記述する非線形力学系の線形化に基づいている。本解析は,学習率,正則化,カーネル判別器の帯域幅がGDAの局所収束率に及ぼす影響を明らかにする。重要なことに、システムがいつ収束するか、振動するか、発散するかを示す相転移を示す。また,主張を検証する数値シミュレーションも提供する。

Generative Adversarial Networks (GANs) are a popular formulation to train generative models for complex high-dimensional data. The standard method for training GANs involves a gradient descent-ascent (GDA) procedure on a minimax optimization problem. This procedure is hard to analyze in general due to the nonlinear nature of the dynamics. We study the local dynamics of GDA for training a GAN with a kernel-based discriminator. This convergence analysis is based on a linearization of a non-linear dynamical system that describes the GDA iterations, under an "isolated points model" assumption from [Becker et al. 2022]. Our analysis brings out the effect of the learning rates, regularization, and the bandwidth of the kernel discriminator, on the local convergence rate of GDA. Importantly, we show phase transitions that indicate when the system converges, oscillates, or diverges. We also provide numerical simulations that verify our claims.
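
The converge/oscillate/diverge behaviour of gradient descent-ascent can be seen even on a toy problem. The sketch below runs simultaneous GDA on f(x, y) = xy + (mu/2)x^2 - (mu/2)y^2, a textbook bilinear saddle with an optional regularizer. This objective, the step sizes, and the function names are assumptions chosen for illustration; it is not the kernel-discriminator GAN setting analysed in the paper.

```python
import math

def gda(x, y, lr, mu=0.0, steps=100):
    """Simultaneous gradient descent (on x) and ascent (on y) for
    f(x, y) = x*y + (mu/2)*x**2 - (mu/2)*y**2."""
    for _ in range(steps):
        grad_x = y + mu * x   # df/dx, used for the descent step
        grad_y = x - mu * y   # df/dy, used for the ascent step
        x, y = x - lr * grad_x, y + lr * grad_y
    return x, y

def norm(point):
    return math.hypot(*point)

start = (1.0, 1.0)
unregularized = gda(*start, lr=0.1, mu=0.0)  # spirals away from the saddle at (0, 0)
regularized = gda(*start, lr=0.1, mu=1.0)    # spirals into the saddle at (0, 0)
```

For this toy system each step multiplies the distance to the origin by sqrt((1 - lr*mu)^2 + lr^2). The iterates therefore rotate while shrinking or growing, depending on how the learning rate and the regularization strength trade off.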

翻訳日:2023-05-16 16:38:26 公開日:2023-05-14

# ULIP-2:3D理解のためのスケーラブルなマルチモーダル事前学習を目指して

ULIP-2: Towards Scalable Multimodal Pre-training For 3D Understanding ( http://arxiv.org/abs/2305.08275v1 )

ライセンス: Link先を確認

Le Xue, Ning Yu, Shu Zhang, Junnan Li, Roberto Martín-Martín, Jiajun Wu, Caiming Xiong, Ran Xu, Juan Carlos Niebles, Silvio Savarese

(参考訳) マルチモーダル事前学習法の最近の進歩は、3次元モダリティ、それらの2次元モダリティ、対応する言語モダリティを合わせた3次元表現学習において有望な効果を示している。しかし、3Dアプリケーションのためのマルチモーダルデータを収集するために既存のマルチモーダル事前学習フレームワークが使用している手法はスケーラビリティと包括性に欠けており、多モーダル学習の可能性を最大限に制限する可能性がある。主なボトルネックは、言語モダリティのスケーラビリティと包括性にある。このボトルネックに対処するため,我々は,最先端のマルチモーダル大規模言語モデル (LLM) を利用したマルチモーダル事前学習フレームワークULIP-2を導入する。我々は,ObjaverseとShapeNet55という2つの大規模データセットの実験を行い,生成した3次元三重項データセット(3D Point Cloud - Image - Language)をリリースする。 ULIP-2は、ModelNet40 (74% Top1 Accuracy) で、下流のゼロショット分類の大幅な改善を実現している。さらに、ULIP-2 は実世界の ScanObjectNN ベンチマーク (91.5% の総合精度) で新しい記録を樹立し、140万のパラメータ(現在の SOTA より10倍少ない)しか利用せず、人間のアノテーションなしでスケーラブルなマルチモーダル3D 表現学習のブレークスルーを示している。コードとデータセットはhttps://github.com/salesforce/ulipで入手できる。

Recent advancements in multimodal pre-training methods have shown promising efficacy in 3D representation learning by aligning features across the 3D modality, its 2D counterpart modality, and the corresponding language modality. However, the methods used by existing multimodal pre-training frameworks to gather multimodal data for 3D applications lack scalability and comprehensiveness, potentially constraining the full potential of multimodal learning. The main bottleneck lies in the language modality's scalability and comprehensiveness. To address this bottleneck, we introduce ULIP-2, a multimodal pre-training framework that leverages state-of-the-art multimodal large language models (LLMs) pre-trained on extensive knowledge to automatically generate holistic language counterparts for 3D objects. We conduct experiments on two large-scale datasets, Objaverse and ShapeNet55, and release our generated three-modality triplet datasets (3D Point Cloud - Image - Language), named "ULIP-Objaverse Triplets" and "ULIP-ShapeNet Triplets". ULIP-2 requires only the 3D data itself and eliminates the need for any manual annotation effort, demonstrating its scalability; it also achieves remarkable improvements on downstream zero-shot classification on ModelNet40 (74% Top1 Accuracy). Moreover, ULIP-2 sets a new record on the real-world ScanObjectNN benchmark (91.5% Overall Accuracy) while utilizing only 1.4 million parameters (~10x fewer than current SOTA), signifying a breakthrough in scalable multimodal 3D representation learning without human annotations. The code and datasets are available at https://github.com/salesforce/ULIP.
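
Zero-shot classification with aligned embeddings, as evaluated in this abstract, usually means picking the class whose text embedding is most similar to the object's embedding. The sketch below shows that nearest-label rule with cosine similarity. The three-dimensional vectors and label names are invented stand-ins; real embeddings would come from trained 3D and text encoders, which are not reproduced here.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

def zero_shot_classify(shape_embedding, text_embeddings):
    """Return the label whose text embedding is closest to the shape embedding."""
    return max(text_embeddings, key=lambda label: cosine(shape_embedding, text_embeddings[label]))

# Hypothetical embeddings in a shared space.
text_embeddings = {
    "chair": [0.9, 0.1, 0.0],
    "airplane": [0.0, 0.2, 0.9],
    "lamp": [0.1, 0.9, 0.1],
}
shape_embedding = [0.1, 0.3, 0.8]
predicted = zero_shot_classify(shape_embedding, text_embeddings)
```

No classifier is trained on the target labels. Adding a new class only requires embedding its text description.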

翻訳日:2023-05-16 16:38:14 公開日:2023-05-14

# 大規模動的グラフのための分離グラフニューラルネットワーク

Decoupled Graph Neural Networks for Large Dynamic Graphs ( http://arxiv.org/abs/2305.08273v1 )

ライセンス: Link先を確認

Yanping Zheng, Zhewei Wei, Jiajun Liu

(参考訳) ソーシャルネットワーク、金融取引、レコメンデーションシステムといった現実世界のグラフは、しばしば動的な振る舞いを示す。この現象はグラフストリームと呼ばれ、ノードの動的変化とエッジの出現と消失を含む。これらの動的グラフの構造的側面と時間的側面の両方を効果的に捉えるために、動的グラフニューラルネットワークが開発された。しかし、既存の手法は通常、連続時間または離散時間動的グラフの処理に適しており、一方から他方へ一般化することはできない。本稿では,連続と離散の両方の動的グラフの効率的な計算を支援する統一動的伝播を含む,大規模動的グラフのための分離グラフニューラルネットワークを提案する。グラフ構造関連計算は伝播過程においてのみ実行されるため、下流タスクの予測プロセスは高価なグラフ計算なしで個別に訓練できるため、任意のシーケンスモデルをプラグインして使用することができる。その結果,本アルゴリズムは拡張性と表現力に優れる。本アルゴリズムは連続時間と離散時間の両方の動的グラフの7つの実世界のデータセットで評価する。実験の結果,両種類の動的グラフにおいて最先端の性能が得られることがわかった。特に、我々のアルゴリズムのスケーラビリティは、最大10億の時間エッジと1億以上のノードを持つ巨大なグラフへの成功例によってよく示されています。

Real-world graphs, such as social networks, financial transactions, and recommendation systems, often demonstrate dynamic behavior. This phenomenon, known as graph stream, involves the dynamic changes of nodes and the emergence and disappearance of edges. To effectively capture both the structural and temporal aspects of these dynamic graphs, dynamic graph neural networks have been developed. However, existing methods are usually tailored to process either continuous-time or discrete-time dynamic graphs, and cannot be generalized from one to the other. In this paper, we propose a decoupled graph neural network for large dynamic graphs, including a unified dynamic propagation that supports efficient computation for both continuous and discrete dynamic graphs. Since graph structure-related computations are only performed during the propagation process, the prediction process for the downstream task can be trained separately without expensive graph computations, and therefore any sequence model can be plugged-in and used. As a result, our algorithm achieves exceptional scalability and expressiveness. We evaluate our algorithm on seven real-world datasets of both continuous-time and discrete-time dynamic graphs. The experimental results demonstrate that our algorithm achieves state-of-the-art performance in both kinds of dynamic graphs. Most notably, the scalability of our algorithm is well illustrated by its successful application to large graphs with up to over a billion temporal edges and over a hundred million nodes.
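
The decoupling idea in this abstract, where graph computation happens once in a propagation step and any downstream model then consumes the result, can be sketched on a static graph as follows. The weighting scheme (a truncated sum of alpha(1 - alpha)^k times the k-hop neighbour average) and all names are assumptions for illustration. The paper's unified dynamic propagation, which also handles graph updates, is not reproduced here.

```python
def propagate(adjacency, features, alpha=0.2, hops=3):
    """Precompute smoothed node features: sum over k of alpha*(1-alpha)^k * P^k X,
    where P averages over each node's neighbours."""
    nodes = list(adjacency)
    dim = len(next(iter(features.values())))
    current = {v: list(features[v]) for v in nodes}
    result = {v: [alpha * x for x in features[v]] for v in nodes}
    weight = alpha
    for _ in range(hops):
        nxt = {}
        for v in nodes:
            nbrs = adjacency[v]
            # Mean of the neighbours' current vectors (zeros for isolated nodes).
            nxt[v] = [
                sum(current[u][d] for u in nbrs) / len(nbrs) if nbrs else 0.0
                for d in range(dim)
            ]
        current = nxt
        weight *= 1 - alpha
        for v in nodes:
            result[v] = [r + weight * c for r, c in zip(result[v], current[v])]
    return result

# Triangle graph with a constant feature on every node.
adjacency = {"a": ["b", "c"], "b": ["a", "c"], "c": ["a", "b"]}
features = {v: [1.0] for v in adjacency}
propagated = propagate(adjacency, features)
```

Once `propagated` is stored, a per-node sequence model or classifier can be trained on it without touching the graph again. That is the property the abstract credits for scalability.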

翻訳日:2023-05-16 16:37:40 公開日:2023-05-14

# $SmartProbe$: 市場調査のための仮想モデレーター

$SmartProbe$: A Virtual Moderator for Market Research Surveys ( http://arxiv.org/abs/2305.08271v1 )

ライセンス: Link先を確認

Josh Seltzer, Jiahua (Fiona) Pan, Kathy Cheng, Yuxiao Sun, Santosh Kolagati, Jimmy Lin, Shi Zong

(参考訳) 市場調査は、消費者の視点を大規模に理解するための強力な方法論であるが、理解と洞察の深みによって制限されている。仮想モデレーターは、調査の質的研究の要素を導入し、調査参加者とのラプポートを開発し、探索的な質問を動的に行い、最終的には市場研究者により有用な情報を提供する。本研究では,大規模言語モデル(llm)の適応能力を活用したapiである${\tt smartprobe}$を導入し,市場調査における効果的な調査質問を生成するために,市場調査からドメイン知識を取り入れる。我々は,$\tt smartprobe$のモジュール処理フローを概説し,生成した調査質問の品質と有効性を評価する。当社の取り組みは、業界関係者にLLMの最新の進歩に基づいて、現実世界のアプリケーションを構築するよう促すだろうと考えています。私たちのデモはhttps://nexxt.in/smartprobe-demoで公開しています。

Market research surveys are a powerful methodology for understanding consumer perspectives at scale, but are limited by depth of understanding and insights. A virtual moderator can introduce elements of qualitative research into surveys, developing a rapport with survey participants and dynamically asking probing questions, ultimately to elicit more useful information for market researchers. In this work, we introduce ${\tt SmartProbe}$, an API which leverages the adaptive capabilities of large language models (LLMs), and incorporates domain knowledge from market research, in order to generate effective probing questions in any market research survey. We outline the modular processing flow of $\tt SmartProbe$, and evaluate the quality and effectiveness of its generated probing questions. We believe our efforts will inspire industry practitioners to build real-world applications based on the latest advances in LLMs. Our demo is publicly available at https://nexxt.in/smartprobe-demo

翻訳日:2023-05-16 16:37:18 公開日:2023-05-14

# Kochen-Specker の文脈性

Kochen-Specker Contextuality ( http://arxiv.org/abs/2305.08267v1 )

ライセンス: Link先を確認

Mladen Pavicic and Mordecai Waegell

(参考訳) 最近開発された小さなベクトル成分から量子文脈集合を生成する手法は、任意の次元に普遍的かつ理論的に適用できる。しかし、8以上の次元の任意の排他的集合を得るタスクは、スーパーコンピュータでも計算障壁に直面している。そこで本研究では,KS集合の最小複雑性が,低次元の既知の集合から高次元の比較的小さなKS集合を構成するために,次元にスケールしないという事実を生かした次元アップスケーリング手法を提案する。これにより、現在利用可能な計算資源を用いて16次元空間の単純ベクトル成分から多数の集合を生成できる。

A recently developed method of generating quantum contextual sets from small vector components is universally and theoretically applicable to any dimension. However, tasks of obtaining such arbitrarily exhaustive sets in dimensions higher than eight face a computational barrier even on supercomputers. Therefore, in this paper, we employed a dimensional upscaling method that exploits the fact that the minimal complexity of KS sets does not scale with dimension to construct relatively small KS sets in higher dimensions from known sets in lower dimensions. This enabled us to generate numerous sets from simple vector components in up to 16-dimensional spaces using presently available computational resources.

翻訳日:2023-05-16 16:37:03 公開日:2023-05-14

# 残差計算のない車両検出と分類:ランダム摂動注入によるHEVC画像デコーディングの高速化

Vehicle Detection and Classification without Residual Calculation: Accelerating HEVC Image Decoding with Random Perturbation Injection ( http://arxiv.org/abs/2305.08265v1 )

ライセンス: Link先を確認

Muhammet Sebul Beratoğlu and Behçet Uğur Töreyin

(参考訳) ビデオ分析,特に交通監視の分野では,映像データの処理と理解のための効率的かつ効果的な手法の必要性が高まっている。従来のフルビデオデコーディング技術は計算集約的で時間を要するため、研究者は圧縮領域における代替アプローチを探求している。本研究では,高効率ビデオ符号化(HEVC)ビットストリームから画像を再構成する,ランダム摂動に基づく新しい圧縮領域法を提案する。我々の知る限り,本手法は,残差値をランダムな摂動で置き換えることを提案する最初の方法であり,特に車両の検知・分類を重要なユースケースとして,映像理解タスクに関連する情報を保持しつつ,元の画像の凝縮表現を作成する。残差データを使用しないことにより,提案手法は画像再構成プロセスに必要なデータを大幅に削減し,より効率的な情報保存と送信を可能にする。これは、監視アプリケーションに関わる膨大なビデオデータを考える際に特に重要である。公開されているBIT-Vehicleデータセットに適用した結果,従来のフルデコード法に比べて復元速度が著しく向上し,提案手法は画素領域法よりも約56%高速であることを示す。さらに,画素領域法と同等の99.9%の検出精度を達成し,分類精度は96.84%で,画素領域法よりわずか0.98%低い。さらに,データサイズが大幅に削減され,ストレージや送信の効率が向上することを示す。本研究は、速度とデータサイズが重要な要因である交通監視アプリケーションにおいて、圧縮領域法の可能性を立証する。

In the field of video analytics, particularly traffic surveillance, there is a growing need for efficient and effective methods for processing and understanding video data. Traditional full video decoding techniques can be computationally intensive and time-consuming, leading researchers to explore alternative approaches in the compressed domain. This study introduces a novel random perturbation-based compressed domain method for reconstructing images from High Efficiency Video Coding (HEVC) bitstreams, specifically designed for traffic surveillance applications. To the best of our knowledge, our method is the first to propose substituting random perturbations for residual values, creating a condensed representation of the original image while retaining information relevant to video understanding tasks, particularly focusing on vehicle detection and classification as key use cases. By not using residual data, our proposed method significantly reduces the data needed in the image reconstruction process, allowing for more efficient storage and transmission of information. This is particularly important when considering the vast amount of video data involved in surveillance applications. Applied to the public BIT-Vehicle dataset, we demonstrate a significant increase in the reconstruction speed compared to the traditional full decoding approach, with our proposed method being approximately 56% faster than the pixel domain method. Additionally, we achieve a detection accuracy of 99.9%, on par with the pixel domain method, and a classification accuracy of 96.84%, only 0.98% lower than the pixel domain method. Furthermore, we showcase the significant reduction in data size, leading to more efficient storage and transmission. Our research establishes the potential of compressed domain methods in traffic surveillance applications, where speed and data size are critical factors.

翻訳日:2023-05-16 16:36:53 公開日:2023-05-14

# 実践的ロバスト強化学習について:実用的不確実性セットとダブルエージェントアルゴリズム

On Practical Robust Reinforcement Learning: Practical Uncertainty Set and Double-Agent Algorithm ( http://arxiv.org/abs/2305.06657v2 )

ライセンス: Link先を確認

Ukjo Hwang, Songnam Hong

(参考訳) モデル不確実性を伴う頑健な強化学習(RL)について検討する。トレーニングのためのサンプルを生成する名目上のマルコフ決定プロセス(N-MDP)が与えられた場合、トレーニング(N-MDP)とテスト環境の間の潜在的なミスマッチを反映するために、N-MDPから摂動されたMDPを含む不確実性セットが定義される。堅牢なRLの目的は、不確実性セットに対する最悪のパフォーマンスを最適化する堅牢なポリシーを学ぶことである。本稿では,既存のものよりも現実的なMDPを含む新しい不確実性セットを提案する。この不確実性集合に対して,表ケースに対する頑健なrlアルゴリズム(arq-learning)を示し,その有限時間誤差境界を特徴付ける。また、ARQ-LearningはQ-Learningや最先端の堅牢なQ-Learningと同等の速度で収束し、実世界のアプリケーションにより良いロバスト性を確保することが証明された。次に,大規模あるいは連続的な状態空間を持つ場合において,ARQ学習の拡張の鍵となるボトルネックを効果的に解決する「悲観的」エージェントを提案する。 Q-Learning, Deep-Q Network (DQN), Deep Deterministic Policy gradient (DDPG) などの有名なRLアルゴリズムに悲観的エージェントのアイデアを取り入れ, PRQ-Learning, PR-DQN, PR-DDPGを提案する。特に、提案されたアイデアは、他のモデルなしRLアルゴリズム(ソフトアクター批評家など)に即座に適用することができる。実験により、モデル不確実性のあるRLアプリケーションにおけるアルゴリズムの優位性を示す。

We study robust reinforcement learning (RL) with model uncertainty. Given a nominal Markov decision process (N-MDP) that generates samples for training, an uncertainty set is defined, which contains some MDPs perturbed from the N-MDP in order to reflect potential mismatches between the training (i.e., N-MDP) and testing environments. The objective of robust RL is to learn a robust policy that optimizes the worst-case performance over an uncertainty set. In this paper, we propose a new uncertainty set containing more realistic MDPs than the existing ones. For this uncertainty set, we present a robust RL algorithm (named ARQ-Learning) for the tabular case and characterize its finite-time error bound. Also, it is proved that ARQ-Learning converges as fast as Q-Learning and the state-of-the-art robust Q-Learning while ensuring better robustness to real-world applications. Next, we propose a pessimistic agent that efficiently tackles the key bottleneck for the extension of ARQ-Learning into the case with larger or continuous state spaces. Incorporating the idea of pessimistic agents into the famous RL algorithms such as Q-Learning, deep-Q network (DQN), and deep deterministic policy gradient (DDPG), we present PRQ-Learning, PR-DQN, and PR-DDPG, respectively. Noticeably, the proposed idea can be immediately applied to other model-free RL algorithms (e.g., soft actor-critic). Via experiments, we demonstrate the superiority of our algorithms on various RL applications with model uncertainty.
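
The robust objective in this abstract, optimizing worst-case performance over an uncertainty set, corresponds to a Bellman backup that takes a minimum over candidate transition models. The sketch below applies that backup with a small, finite set of known models. It is a model-based illustration under invented numbers, not ARQ-Learning (which is model-free) and not the uncertainty set proposed in the paper. `robust_q_iteration` and both transition tables are hypothetical.

```python
def robust_q_iteration(rewards, models, gamma=0.9, sweeps=200):
    """Worst-case Q-value iteration.

    rewards[s][a] is the immediate reward; models is a list of transition
    tables p with p[s][a][t] = probability of moving from s to t under a.
    """
    n_states, n_actions = len(rewards), len(rewards[0])
    q = [[0.0] * n_actions for _ in range(n_states)]
    for _ in range(sweeps):
        values = [max(row) for row in q]  # greedy state values
        q = [
            [
                rewards[s][a]
                # The adversary picks the model with the lowest expected next value.
                + gamma * min(
                    sum(p[s][a][t] * values[t] for t in range(n_states))
                    for p in models
                )
                for a in range(n_actions)
            ]
            for s in range(n_states)
        ]
    return q

rewards = [[0.0, 1.0], [2.0, 0.0]]
nominal = [[[0.9, 0.1], [0.1, 0.9]], [[0.1, 0.9], [0.9, 0.1]]]
perturbed = [[[0.5, 0.5], [0.5, 0.5]], [[0.5, 0.5], [0.5, 0.5]]]

q_nominal = robust_q_iteration(rewards, [nominal])
q_robust = robust_q_iteration(rewards, [nominal, perturbed])
```

Because the robust backup minimizes over a superset of models, `q_robust` can never exceed `q_nominal`. The gap is the price paid for guarding against a mismatch between training and testing environments.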

翻訳日:2023-05-16 11:18:19 公開日:2023-05-14

# 大規模言語モデル時代におけるオープンドメイン質問応答の評価

Evaluating Open-Domain Question Answering in the Era of Large Language Models ( http://arxiv.org/abs/2305.06984v2 )

ライセンス: Link先を確認

Ehsan Kamalloo, Nouha Dziri, Charles L. A. Clarke, Davood Rafiei

(参考訳) 語彙マッチングは、オープンドメイン質問応答(QA)のデファクト評価方法として残っている。残念なことに、語彙マッチングは、正解リストに妥当な候補解が現れない場合に完全に失敗し、抽出モデルから生成モデルへ移行するにつれて、ますますその傾向が増す。近年の大規模言語モデル (LLMs) の成功により、候補解が長くなると語彙マッチングの失敗が増加し、正解とのマッチングはさらに困難になる。正確な評価がなければ、オープンドメインQAの真の進歩は分からない。本稿では,一般的なベンチマークであるNQ-openのサブセットを手動で評価することにより,LLMを含む様々なオープンドメインQAモデルの徹底的な分析を行う。私たちの評価では、すべてのモデルの真のパフォーマンスは著しく過小評価されているものの、InstructGPT (zero-shot) LLMのパフォーマンスは60%近く向上し、既存のトップモデルと同等になり、InstructGPT (few-shot) モデルはNQ-openの新たな最先端を実際に達成しています。また、語彙マッチング失敗の50%以上が意味論的に等価な答えによるものであることが判明した。さらに、正規表現マッチングは、依然として不必要な厳密さに悩まされるものの、人間の判断と整合する形でQAモデルを順位付けすることを示す。最後に, 自動評価モデルは状況によっては語彙マッチングの妥当な代替となるが、LLMが生成する長文の回答に対してはそうではないことを示す。自動モデルはLLM回答の幻覚を検出するのに苦労し、LLMを評価することができない。現段階では、人間の評価に代わるものはないようである。

Lexical matching remains the de facto evaluation method for open-domain question answering (QA). Unfortunately, lexical matching fails completely when a plausible candidate answer does not appear in the list of gold answers, which is increasingly the case as we shift from extractive to generative models. The recent success of large language models (LLMs) for QA aggravates lexical matching failures since candidate answers become longer, thereby making matching with the gold answers even more challenging. Without accurate evaluation, the true progress in open-domain QA remains unknown. In this paper, we conduct a thorough analysis of various open-domain QA models, including LLMs, by manually evaluating their answers on a subset of NQ-open, a popular benchmark. Our assessments reveal that while the true performance of all models is significantly underestimated, the performance of the InstructGPT (zero-shot) LLM increases by nearly +60%, making it on par with existing top models, and the InstructGPT (few-shot) model actually achieves a new state-of-the-art on NQ-open. We also find that more than 50% of lexical matching failures are attributed to semantically equivalent answers. We further demonstrate that regex matching ranks QA models consistent with human judgments, although still suffering from unnecessary strictness. Finally, we demonstrate that automated evaluation models are a reasonable surrogate for lexical matching in some circumstances, but not for long-form answers generated by LLMs. The automated models struggle in detecting hallucinations in LLM answers and are thus unable to evaluate LLMs. At this time, there appears to be no substitute for human evaluation.
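
The failure mode described in this abstract is easy to reproduce. The sketch below implements a simple exact-match check after light normalization (lowercasing, stripping punctuation and articles). The normalization steps are an assumption in the spirit of common QA evaluation scripts, and the question, gold answer, and candidates are invented for illustration.

```python
import re
import string

def normalize(text):
    """Lowercase, drop punctuation and English articles, collapse whitespace."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())

def exact_match(candidate, gold_answers):
    """True if the normalized candidate equals any normalized gold answer."""
    return normalize(candidate) in {normalize(g) for g in gold_answers}

gold = ["Barack Obama"]
short_hit = exact_match("barack obama.", gold)
long_miss = exact_match("The 44th U.S. president, Barack Obama", gold)
```

The second candidate contains the gold answer and is arguably correct, yet it scores zero. Longer, generated answers trigger this more often, which is why the abstract argues that lexical matching underestimates generative models.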

Translated: 2023-05-16 11:06:27 Published: 2023-05-14
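
The failure mode described in the abstract above can be illustrated with a minimal sketch. It assumes a SQuAD-style normalization (lowercasing, stripping punctuation and articles) for exact match; the abstract does not specify the exact procedure, so the function names, the normalization, and the example answers are all hypothetical and are not taken from NQ-open or from the paper.

```python
import re
import string


def normalize_answer(text: str) -> str:
    """Lowercase, drop punctuation and English articles, collapse whitespace."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())


def exact_match(candidate: str, gold_answers: list[str]) -> bool:
    """Lexical matching: the candidate must equal some gold answer after normalization."""
    cand = normalize_answer(candidate)
    return any(cand == normalize_answer(gold) for gold in gold_answers)


def regex_match(candidate: str, gold_patterns: list[str]) -> bool:
    """Looser matching: some gold pattern must occur anywhere in the candidate."""
    return any(re.search(p, candidate, flags=re.IGNORECASE) for p in gold_patterns)


# Hypothetical gold list and candidate answers, for illustration only.
gold = ["Barack Obama"]
short_answer = "barack obama"
long_answer = "The 44th president, Barack Obama."

print(exact_match(short_answer, gold))   # short extractive-style answer
print(exact_match(long_answer, gold))    # longer generative-style answer
print(regex_match(long_answer, [r"barack\s+obama"]))
```

In this sketch the long answer is correct but fails exact match because it contains extra words, while the regex variant accepts it. Neither check can tell whether the extra words are themselves accurate, which is consistent with the abstract's point that human evaluation is still needed for long-form answers.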

# COCKATIEL: COntinuous Concept ranKed ATtribution with Interpretable ELements for explaining neural net classifiers on NLP tasks ( http://arxiv.org/abs/2305.06754v2 )

License: see the linked page

Fanny Jourdan, Agustin Picard, Thomas Fel, Laurent Risser, Jean Michel Loubes, Nicholas Asher

Transformer architectures are complex, and their use in NLP, while it has engendered many successes, makes their interpretability or explainability challenging. Recent debates have shown that attention maps and attribution methods are unreliable (Pruthi et al., 2019; Brunner et al., 2019). In this paper, we present some of their limitations and introduce COCKATIEL, which successfully addresses some of them. COCKATIEL is a novel, post-hoc, concept-based, model-agnostic XAI technique that generates meaningful explanations from the last layer of a neural net model trained on an NLP classification task by using Non-Negative Matrix Factorization (NMF) to discover the concepts the model leverages to make predictions and by exploiting a Sensitivity Analysis to estimate accurately the importance of each of these concepts for the model. It does so without compromising the accuracy of the underlying model or requiring a new one to be trained. We conduct experiments in single and multi-aspect sentiment analysis tasks and show COCKATIEL's superior ability to discover concepts that align with humans' on Transformer models without any supervision. We objectively verify the faithfulness of its explanations through fidelity metrics, and we showcase its ability to provide meaningful explanations on two different datasets.

Translated: 2023-05-16 11:04:30 Published: 2023-05-14
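
The two ingredients named in the abstract above, NMF over last-layer activations and a per-concept importance estimate, can be sketched roughly as follows. This is not the authors' implementation: the NMF uses generic Lee-Seung multiplicative updates, the importance score is a crude zero-ablation used here as a stand-in for the sensitivity analysis the abstract mentions, and the activations, the linear `head`, and all function names are hypothetical.

```python
import numpy as np


def nmf(A, n_concepts, n_iter=200, eps=1e-9, seed=0):
    """Factor a non-negative matrix A (n x d) as U @ W with U, W >= 0,
    using Lee-Seung multiplicative updates for the Frobenius objective."""
    rng = np.random.default_rng(seed)
    n, d = A.shape
    U = rng.random((n, n_concepts)) + eps  # per-example concept coefficients
    W = rng.random((n_concepts, d)) + eps  # concept directions in activation space
    for _ in range(n_iter):
        U *= (A @ W.T) / (U @ W @ W.T + eps)
        W *= (U.T @ A) / (U.T @ U @ W + eps)
    return U, W


def concept_importance(U, W, head):
    """Assumed stand-in for sensitivity analysis: zero out one concept at a
    time and measure the mean absolute change in the head's output."""
    base = head(U @ W)
    scores = []
    for k in range(U.shape[1]):
        U_ablated = U.copy()
        U_ablated[:, k] = 0.0
        scores.append(float(np.mean(np.abs(base - head(U_ablated @ W)))))
    return np.array(scores)


# Synthetic non-negative "activations" built from 3 hidden concepts (illustrative only).
rng = np.random.default_rng(42)
activations = rng.random((20, 3)) @ rng.random((3, 8))

U, W = nmf(activations, n_concepts=3)
weights = rng.normal(size=8)             # hypothetical linear classification head
scores = concept_importance(U, W, lambda X: X @ weights)
print(np.linalg.norm(activations - U @ W) / np.linalg.norm(activations))
print(scores)
```

Working on the factorized activations leaves the underlying classifier untouched, which matches the abstract's claim that no retraining is needed. The ablation score here only hints at how concepts could be ranked by their effect on the output.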

PDF registration status (publication date: 20230514)