Fugu-MT: arXiv paper translations

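This site provides Japanese translations of arXiv papers that are 30 pages or fewer and released under a Creative Commons license (CC 0, CC BY, CC BY-SA). For papers whose body text is not CC-licensed, or that are too long, only the metadata is translated. (arXiv metadata is CC 0.) The translations are licensed under CC BY-SA 4.0. Translation is performed with Fugu-Machine Translator.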

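The operator of this site does not guarantee the quality of the site (including all information and translations) and accepts no responsibility for any consequences arising from its use (including all information and translations).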

Papers listed with a publication date of 20231023.

Title	Authors	Abstract	Publication date / translation date
# アプリには何があるのか? 女性医療アプリケーションのプライバシーリスクを明らかにする What is in Your App? Uncovering Privacy Risks of Female Health Applications ( http://arxiv.org/abs/2310.14490v1 ) ライセンス: Link先を確認	Muhammad Hassan, Mahnoor Jameel, Tian Wang, Masooda Bashir,	(参考訳) FemTechまたはWomen Technologyは、健康と生殖のデータを監視する女性健康アプリケーションを通じて、女性に手頃で手頃な価格の医療ソリューションを提供することに特化した拡大分野である。トップアプリのダウンロード数は10億を超えており、これらのアプリケーションは広く普及している。しかし、女性の生殖権とプライバシに対する現代的な課題の中で、これらのアプリケーションのセキュリティとプライバシに関する包括的な研究が欠如している。この探索的研究は、人気のある7つのアプリケーションに関連するプライバシーリスクを掘り下げるものだ。最初の定量的静的解析では、さまざまなリスクのあるパーミッションと、多数のサードパーティのトラッカーが明らかになった。さらに、プライバシポリシーの予備審査は、基本的なデータプライバシ原則に準拠していないことを示している。これらの初期の発見は、FemTechアプリの堅牢なプライバシとセキュリティ保護を確立する上で重要なギャップを浮き彫りにした。 FemTech or Female Technology, is an expanding field dedicated to providing affordable and accessible healthcare solutions for women, prominently through Female Health Applications that monitor health and reproductive data. With the leading app exceeding 1 billion downloads, these applications are gaining widespread popularity. However, amidst contemporary challenges to women's reproductive rights and privacy, there is a noticeable lack of comprehensive studies on the security and privacy aspects of these applications. This exploratory study delves into the privacy risks associated with seven popular applications. Our initial quantitative static analysis reveals varied and potentially risky permissions and numerous third-party trackers. Additionally, a preliminary examination of privacy policies indicates non-compliance with fundamental data privacy principles. These early findings highlight a critical gap in establishing robust privacy and security safeguards for FemTech apps, especially significant in a climate where women's reproductive rights face escalating threats.	翻訳日:2024-03-25 14:05:29 公開日:2023-10-23
# PEPSI: アンバランス設定における事実上効率的なプライベート・セット・インターセクション PEPSI: Practically Efficient Private Set Intersection in the Unbalanced Setting ( http://arxiv.org/abs/2310.14565v1 ) ライセンス: Link先を確認	Rasoul Akhavan Mahdavi, Nils Lukas, Faezeh Ebrahimianghazani, Thomas Humphries, Bailey Kacsmar, John Premkumar, Xinda Li, Simon Oya, Ehsan Amjadian, Florian Kerschbaum,	(参考訳) プライベートデータセットを持つ2つのパーティは、交差点を越えて情報を公開することなく、プライベートセットインターセクション(PSI)プロトコルを使用して共有要素を見つけることができる。回路PSIプロトコルは、その濃度などの交叉の任意の関数をプライベートに計算し、一方が他方よりも多くのデータを持つ不均衡な環境でしばしば使用される。既存のプロトコルは計算的に非効率であるか、より大きなセットの順序で大規模なサーバ側通信を必要とする。本稿では,クライアントだけが暗号化されたデータを送信する非対話型ソリューションであるPSI(PEPSI)やPEPSIを紹介する。 PEPSIは1024のクライアントアイテムと100万のサーバアイテムを1秒未満で処理でき、通信量は5MB未満である。我々の作業は、既存の非インタラクティブ回路PSIプロトコルよりも4桁以上高速で、通信の10%しか必要としない。また、Ion et al の作業の最大20倍の速さで、関数の限られた集合を計算し、より大きな集合に比例する通信コストを持つ。我々の研究は、非干渉回路PSIが非平衡環境で実際に適用可能であることを示す最初のものである。 Two parties with private data sets can find shared elements using a Private Set Intersection (PSI) protocol without revealing any information beyond the intersection. Circuit PSI protocols privately compute an arbitrary function of the intersection - such as its cardinality, and are often employed in an unbalanced setting where one party has more data than the other. Existing protocols are either computationally inefficient or require extensive server-client communication on the order of the larger set. We introduce Practically Efficient PSI or PEPSI, a non-interactive solution where only the client sends its encrypted data. PEPSI can process an intersection of 1024 client items with a million server items in under a second, using less than 5 MB of communication. Our work is over 4 orders of magnitude faster than an existing non-interactive circuit PSI protocol and requires only 10% of the communication. It is also up to 20 times faster than the work of Ion et al., which computes a limited set of functions and has communication costs proportional to the larger set. 
Our work is the first to demonstrate that non-interactive circuit PSI can be practically applied in an unbalanced setting.	翻訳日:2024-03-25 14:05:29 公開日:2023-10-23
# 6Gネットワークにおける共同安全通信とセンシング Joint secure communication and sensing in 6G networks ( http://arxiv.org/abs/2310.14624v1 ) ライセンス: Link先を確認	Miroslav Mitev, Amitha Mayya, Arsenia Chorti,	(参考訳) ジョイントコミュニケーションとセンシングは,第6世代(6G)無線システムで導入された機能のひとつとして期待されている。これにより、多数の新しいアプリケーションが可能になるため、交換された情報をセキュアにするための適切なアプローチを見つけることが重要である。従来のセキュリティメカニズムは、新しい軽量セキュリティソリューションを見つけるという課題を開放する、厳格な遅延、パワー、複雑さの要件を満たすことができないかもしれない。物理層からの有望なアプローチは、チャネルフェージングからの秘密鍵生成(SKG)である。 SKGは数十年にわたって研究されてきたが、その完全なプロトコルの実装はいまだに不十分である。本章の目的は, 実生活環境におけるSKGレートを, 異なるシナリオのセットで評価することである。典型的なレーダ波形を考察し,SKGプロトコルの完全な実装を提案する。各ステップを評価して、物理層からキーを生成することが、将来のネットワークにとって実行可能なソリューションであることを実証する。しかし、すべてのケースで一般化できるソリューションは1つではなく、コンテキストに応じてパラメータを選択すべきであることを示す。 Joint communication and sensing is expected to be one of the features introduced by the sixth-generation (6G) wireless systems. This will enable a huge variety of new applications, hence, it is important to find suitable approaches to secure the exchanged information. Conventional security mechanisms may not be able to meet the stringent delay, power, and complexity requirements which opens the challenge of finding new lightweight security solutions. A promising approach coming from the physical layer is the secret key generation (SKG) from channel fading. While SKG has been investigated for several decades, practical implementations of its full protocol are still scarce. The aim of this chapter is to evaluate the SKG rates in real-life setups under a set of different scenarios. We consider a typical radar waveform and present a full implementation of the SKG protocol. Each step is evaluated to demonstrate that generating keys from the physical layer can be a viable solution for future networks. However, we show that there is not a single solution that can be generalized for all cases, instead, parameters should be chosen according to the context.	翻訳日:2024-03-25 14:05:29 公開日:2023-10-23
# CryptoVerif: 計算論的に健全なセキュリティプロトコル検証器 (チャネル上で通信を行う初期バージョン) CryptoVerif: a Computationally-Sound Security Protocol Verifier (Initial Version with Communications on Channels) ( http://arxiv.org/abs/2310.14658v1 ) ライセンス: Link先を確認	Bruno Blanchet,	(参考訳) この文書では、セキュリティプロトコル検証器CryptoVerifを紹介する。CryptoVerifはシンボリックなDolev-Yaoモデルではなく、計算モデルに基づいている。機密性、対応性(認証を含む)、識別不能性を検証できる。これは、暗号学者が手作業で書いたようなゲームのシーケンスとして提示される証明を生成し、これらのゲームは確率的プロセス計算で形式化されている。 CryptoVerifは、暗号プリミティブのセキュリティ特性を指定する汎用的な方法を提供する。プロトコルの任意のセッション数に有効な証明を生成し、各プリミティブを破る確率とセッション数の関数として、プロトコルに対する攻撃の成功確率の上限を提供する。自動で動作させることもできるし、ユーザが手動の証明指示でガイドすることもできる。 This document presents the security protocol verifier CryptoVerif. CryptoVerif does not rely on the symbolic, Dolev-Yao model, but on the computational model. It can verify secrecy, correspondence (which include authentication), and indistinguishability properties. It produces proofs presented as sequences of games, like those manually written by cryptographers; these games are formalized in a probabilistic process calculus. CryptoVerif provides a generic method for specifying security properties of the cryptographic primitives. It produces proofs valid for any number of sessions of the protocol, and provides an upper bound on the probability of success of an attack against the protocol as a function of the probability of breaking each primitive and of the number of sessions. It can work automatically, or the user can guide it with manual proof indications.	翻訳日:2024-03-25 14:05:29 公開日:2023-10-23
# 5G測位を用いた位置推定と回復:GNSSスプーフィング攻撃の阻止 Location Estimation and Recovery using 5G Positioning: Thwarting GNSS Spoofing Attacks ( http://arxiv.org/abs/2310.14885v1 ) ライセンス: Link先を確認	Aneet Kumar Dutta, Sebastian Brandt, Mridula Singh,	(参考訳) 安価なGNSSスプーファーが入手可能であることは、道路利用者の安全なナビゲーションや追跡を妨げる可能性がある。資産の喪失、不正確な運賃推定、誤った速度制限の適用、通行料の誤計算、乗客が誤った場所に到着することなどにつながる可能性がある。暗号ソリューションや、正規の信号と攻撃信号を識別できる受信機を用いてスプーフィングを防止・検出する技術は、道路利用者のGNSSスプーフィングを検出するには不十分である。近年の研究、テストベッド、3GPP標準では、GNSSデータと5G-NR測位を組み合わせ、測位の安全性と精度を高めるハイブリッド測位の可能性を探っている。 我々は、GNSSと5G測位を他の道路利用者と組み合わせて正しい絶対位置を推定するLocation Estimation and Recovery (LER)システムを設計する。ここでは、道路利用者の一部が悪意を持ち、スプーフィング検出を妨げるために共謀する可能性がある。我々のLocation Verification Protocolは、悪意のあるプローバーに対する攻撃を防ぐために、Message Time of Arrival Codes (MTAC)の理解を拡張する。新たなRecovery and Meta Protocolは、道路利用者の動的かつ予測不能な性質を利用して、GNSSスプーフィングを検出する。このプロトコルは、非常に低い偽陽性率でGNSSスプーフィングを高速に検出し、幅広い設定に合わせてカスタマイズできる。各利用者が最大0.3の確率で悪意を持つという(非常に非現実的な)最悪のシナリオであっても、最大20人の道路利用者との通信と測距の後に高い確率でGNSSスプーフィングを検出し、偽陽性率は0に近い。道路交通のSUMOシミュレーションにより、中程度の交通条件下でスプーフィング開始から2.6分で検出できることが示された。 The availability of cheap GNSS spoofers can prevent safe navigation and tracking of road users. It can lead to loss of assets, inaccurate fare estimation, enforcing the wrong speed limit, miscalculated toll tax, passengers reaching an incorrect location, etc. The techniques designed to prevent and detect spoofing by using cryptographic solutions or receivers capable of differentiating legitimate and attack signals are insufficient in detecting GNSS spoofing of road users. Recent studies, testbeds, and 3GPP standards are exploring the possibility of hybrid positioning, where GNSS data will be combined with the 5G-NR positioning to increase the security and accuracy of positioning. We design the Location Estimation and Recovery (LER) systems to estimate the correct absolute position using the combination of GNSS and 5G positioning with other road users, where a subset of road users can be malicious and collude to prevent spoofing detection. 
Our Location Verification Protocol extends the understanding of Message Time of Arrival Codes (MTAC) to prevent attacks against malicious provers. The novel Recovery and Meta Protocol uses road users' dynamic and unpredictable nature to detect GNSS spoofing. This protocol provides fast detection of GNSS spoofing with a very low rate of false positives and can be customized to a large family of settings. Even in a (highly unrealistic) worst-case scenario where each user is malicious with a probability of as large as 0.3, our protocol detects GNSS spoofing with high probability after communication and ranging with at most 20 road users, with a false positive rate close to 0. SUMO simulations for road traffic show that we can detect GNSS spoofing in 2.6 minutes since its start under moderate traffic conditions.	翻訳日:2024-03-25 14:05:29 公開日:2023-10-23
# SD-WAN over MPLS: SD-WANの将来に関する総合的パフォーマンス分析とセキュリティ SD-WAN over MPLS: A Comprehensive Performance Analysis and Security with Insights into the Future of SD-WAN ( http://arxiv.org/abs/2401.01344v1 ) ライセンス: Link先を確認	Abdellah Tahenni, Fatiha Merazka,	(参考訳) ソフトウェア定義広域ネットワーク(SD-WAN)はネットワークトラフィック管理を強化し、Multiprotocol Label Switching(MPLS)は効率的なデータ転送を提供する。本稿では,アルジェリアの大手金融機関であるハウジング銀行におけるMPLS上のSD-WANを分析した。 SD-WANソリューションのためにFortiGateをデプロイし、従来のMPLSと比較し、帯域幅、レイテンシ、ジッタ、パケット損失、スループット、サービス品質(QoS)といったメトリクス間で直接インターネットアクセスします。セキュリティ対策としては、暗号化、ファイアウォール、侵入防止、Webフィルタリング、アンチウイルス、スプーフィング、DoS攻撃、不正アクセスなどの脅威に対処する。 SASEアーキテクチャやAI/ML統合、新たなトランスポートメソッドなど、今後のトレンドについて検討する。 MPLS上のSD-WANは利点があり、性能、セキュリティ、柔軟性が向上している。推奨事項には、継続的なパフォーマンス監視と研究が含まれる。 Software-defined wide area network (SD-WAN) enhances network traffic management, while Multiprotocol Label Switching (MPLS) offers efficient data transmission. This paper analyzes SD-WAN over MPLS in the Housing Bank, a major Algerian financial institution. We deploy FortiGate for the SD-WAN solution, comparing it to traditional MPLS and direct internet access across metrics like bandwidth, latency, jitter, packet loss, throughput, and quality of service (QoS). Security measures include encryption, firewall, intrusion prevention, web filtering, antivirus, and addressing threats like spoofing, DoS attacks, and unauthorized access. We explore future trends such as SASE architecture, AI/ML integration, and emerging transport methods. SD-WAN over MPLS proves advantageous, offering enhanced performance, security, and flexibility. Recommendations include ongoing performance monitoring and research.	翻訳日:2024-03-25 12:57:08 公開日:2023-10-23
# 安全ナビゲーション:深層強化学習による自動運転車の訓練 Safe Navigation: Training Autonomous Vehicles using Deep Reinforcement Learning in CARLA ( http://arxiv.org/abs/2311.10735v1 ) ライセンス: Link先を確認	Ghadi Nehme, Tejas Y. Deo	(参考訳) 自動運転車は交通に革命をもたらす可能性があるが、公道に配備される前に安全に交通を航行できなければならない。このプロジェクトの目的は、CARLAシミュレーターを用いた深層強化学習技術を用いて、不確実な環境での走行を決定するための自動運転車の訓練である。シミュレータは、自動運転モデルのトレーニングとテストのための現実的で都市環境を提供する。ディープqネットワーク(dqn)は運転行動の予測に用いられる。この研究は、衝突センサー、セグメンテーション、深度カメラを統合し、より優れた物体検出と距離推定を行う。このモデルは、4輪車と歩行者の異なるタイプの存在下で、4つの異なる軌道でテストされている。セグメンテーションと深度カメラは、物体の正確な位置決めと距離測定のために使用された。提案手法は,他の車両や歩行者と衝突したり歩道を走行したりすることなく,自動運転車を最終目的地まで移動させることに成功した。複雑な交通シナリオをナビゲートする際の強化学習(RL)モデルの最適性能を確保するため,我々は状態空間を削減するための前処理ステップを実装した。これは、イメージとセンサーの出力をモデルに入力する前に処理する。状態空間を著しく削減したにも関わらず,高レベルの安全性と精度でトラフィックをナビゲートする頑健なモデルを構築した。 Autonomous vehicles have the potential to revolutionize transportation, but they must be able to navigate safely in traffic before they can be deployed on public roads. The goal of this project is to train autonomous vehicles to make decisions to navigate in uncertain environments using deep reinforcement learning techniques using the CARLA simulator. The simulator provides a realistic and urban environment for training and testing self-driving models. Deep Q-Networks (DQN) are used to predict driving actions. The study involves the integration of collision sensors, segmentation, and depth camera for better object detection and distance estimation. The model is tested on 4 different trajectories in presence of different types of 4-wheeled vehicles and pedestrians. The segmentation and depth cameras were utilized to ensure accurate localization of objects and distance measurement. Our proposed method successfully navigated the self-driving vehicle to its final destination with a high success rate without colliding with other vehicles, pedestrians, or going on the sidewalk. 
To ensure the optimal performance of our reinforcement learning (RL) models in navigating complex traffic scenarios, we implemented a pre-processing step to reduce the state space. This involved processing the images and sensor output before feeding them into the model. Despite significantly decreasing the state space, our approach yielded robust models that successfully navigated through traffic with high levels of safety and accuracy.	翻訳日:2023-11-27 01:00:36 公開日:2023-10-23
# ロボットマニピュレーションの強化:メタワールドにおけるマルチタスク強化学習とシングルライフ強化学習の力の調和 Enhancing Robotic Manipulation: Harnessing the Power of Multi-Task Reinforcement Learning and Single Life Reinforcement Learning in Meta-World ( http://arxiv.org/abs/2311.12854v1 ) ライセンス: Link先を確認	Ghadi Nehme, Ishan Sabane, Tejas Y. Deo	(参考訳) 現在、ロボットは通常、1つのタスクを成功させるために広範なトレーニングを必要とする。しかし、現実のシナリオで真に有用性を高めるためには、ロボットは複数のタスクを効率的に実行する能力を持つべきである。このニーズに対処するために、マルチタスク近位ポリシー最適化(PPO)、マルチタスク信頼領域ポリシー最適化(TRPO)、マルチタスクソフトアクター批判(SAC)など、様々なマルチタスク強化学習(RL)アルゴリズムが開発されている。しかしながら、これらのアルゴリズムは、同様の分布を示す環境や観測空間内でのみ最適な性能を示す。実際には、ロボットが訓練されたものと異なるシナリオや観察に遭遇する可能性があるため、そのような条件は普通ではないことが多い。この課題に対処するため、Q-Weighted Adversarial Learning (QWALE)のようなアルゴリズムは、特定のタスクに対してのみベースアルゴリズム(事前データを生成する)をトレーニングすることでこの問題に対処しようとする。そこでこのプロジェクトの目的は、ロボットアームがメタワールド環境内で7つの異なるタスクをうまく実行できるようにすることである。これを実現するために、ロボットアームの訓練にマルチタスクソフトアクタークリティカル(MT-SAC)が使用される。その後、訓練されたモデルはsingle-life rlアルゴリズムの事前データソースとして機能する。このMT-QWALEアルゴリズムの有効性は、様々な目標位置(ノーベル位置)での試験により評価される。最後に、訓練されたMT-SACとMT-QWALEがよりよく動作するMT-QWALEアルゴリズムの比較を行う。アブレーション研究では、MT-QWALEが最終ゴール位置を隠した後でも、わずかに多くのステップでタスクを完了できることが示されている。 At present, robots typically require extensive training to successfully accomplish a single task. However, to truly enhance their usefulness in real-world scenarios, robots should possess the capability to perform multiple tasks effectively. To address this need, various multi-task reinforcement learning (RL) algorithms have been developed, including multi-task proximal policy optimization (PPO), multi-task trust region policy optimization (TRPO), and multi-task soft-actor critic (SAC). Nevertheless, these algorithms demonstrate optimal performance only when operating within an environment or observation space that exhibits a similar distribution. In reality, such conditions are often not the norm, as robots may encounter scenarios or observations that differ from those on which they were trained. 
Addressing this challenge, algorithms like Q-Weighted Adversarial Learning (QWALE) attempt to tackle the issue by training the base algorithm (generating prior data) solely for a particular task, rendering it unsuitable for generalization across tasks. So, the aim of this research project is to enable a robotic arm to successfully execute seven distinct tasks within the Meta World environment. To achieve this, a multi-task soft actor-critic (MT-SAC) is employed to train the robotic arm. Subsequently, the trained model will serve as a source of prior data for the single-life RL algorithm. The effectiveness of this MT-QWALE algorithm will be assessed by conducting tests on various target positions (novel positions). In the end, a comparison is provided between the trained MT-SAC and the MT-QWALE algorithm where the MT-QWALE performs better. An ablation study demonstrates that MT-QWALE successfully completes tasks with a slightly larger number of steps even after hiding the final goal position.	翻訳日:2023-11-27 00:36:21 公開日:2023-10-23
# ニューラルネットワーク探索に基づく逐次的マルチタスク適応学習 Cascaded Multi-task Adaptive Learning Based on Neural Architecture Search ( http://arxiv.org/abs/2310.17664v1 ) ライセンス: Link先を確認	Yingying Gao, Shilei Zhang, Zihao Cui, Chao Deng, Junlan Feng	(参考訳) 複数の事前訓練されたモデルをカスケードすることは、エンドツーエンドシステムを構成する効果的な方法である。しかし,完全カスケードモデルの微調整はパラメータやメモリの効率が悪く,並列モデルにアダプタモジュールを適用するだけでは微調整ほど性能が向上しないことが明らかとなった。ニューラルネットワーク探索(NAS)フレームワークに基づくエンドツーエンドのマルチタスクモデルを最適化するための,自動かつ効果的な適応学習手法を提案する。各モジュール上の候補適応操作は、凍結し、アダプタを挿入し、微調整する。さらに,学習可能なパラメータの量を考慮した学習構造を制限するために,損失にペナルティ項目を追加する。ペナルティ項目は検索されたアーキテクチャをうまく制限し,提案手法は,SLURPの完全微調整に対応するパラメータを8.7%に圧縮し,より優れた性能で類似のチューニング手法を手作業で探索することができる。 Cascading multiple pre-trained models is an effective way to compose an end-to-end system. However, fine-tuning the full cascaded model is parameter and memory inefficient and our observations reveal that only applying adapter modules on cascaded model can not achieve considerable performance as fine-tuning. We propose an automatic and effective adaptive learning method to optimize end-to-end cascaded multi-task models based on Neural Architecture Search (NAS) framework. The candidate adaptive operations on each specific module consist of frozen, inserting an adapter and fine-tuning. We further add a penalty item on the loss to limit the learned structure which takes the amount of trainable parameters into account. The penalty item successfully restrict the searched architecture and the proposed approach is able to search similar tuning scheme with hand-craft, compressing the optimizing parameters to 8.7% corresponding to full fine-tuning on SLURP with an even better performance.	翻訳日:2023-11-05 14:14:39 公開日:2023-10-23
# インド株式市場におけるポートフォリオ最適化手法の比較研究 A Comparative Study of Portfolio Optimization Methods for the Indian Stock Market ( http://arxiv.org/abs/2310.14748v1 ) ライセンス: Link先を確認	Jaydip Sen, Arup Dasgupta, Partha Pratim Sengupta, and Sayantani Roy Choudhury	(参考訳) 本章では、インド株式市場におけるMVP、HRP、HERCの3つのポートフォリオ最適化手法の比較研究、特にインド証券取引所に上場している15部門から選択された株式について紹介する。各クラスタの上位株は、2022年7月1日に発行されたnse(nse webサイト)のレポートから、フリーフロー市場資本に基づいて特定される。各部門は、2019年7月1日から2022年6月30日までの3つのポートフォリオ最適化アプローチに従って、株価に基づいて3つのポートフォリオを設計する。ポートフォリオは2022年7月1日から2023年6月30日までの期間にテストされる。ポートフォリオのパフォーマンス評価には,3つの指標が使用される。これら3つの指標は累積リターン、年次ボラティリティ、シャープ比である。各セクタに対して、最高累積リターン、最低ボラティリティ、およびトレーニングとテスト期間における最大シャープ比を与えるポートフォリオを特定する。 This chapter presents a comparative study of the three portfolio optimization methods, MVP, HRP, and HERC, on the Indian stock market, particularly focusing on the stocks chosen from 15 sectors listed on the National Stock Exchange of India. The top stocks of each cluster are identified based on their free-float market capitalization from the report of the NSE published on July 1, 2022 (NSE Website). For each sector, three portfolios are designed on stock prices from July 1, 2019, to June 30, 2022, following three portfolio optimization approaches. The portfolios are tested over the period from July 1, 2022, to June 30, 2023. For the evaluation of the performances of the portfolios, three metrics are used. These three metrics are cumulative returns, annual volatilities, and Sharpe ratios. For each sector, the portfolios that yield the highest cumulative return, the lowest volatility, and the maximum Sharpe Ratio over the training and the test periods are identified.	翻訳日:2023-11-05 14:13:12 公開日:2023-10-23
# 名前付きエンティティ認識のための境界オフセット予測ネットワーク A Boundary Offset Prediction Network for Named Entity Recognition ( http://arxiv.org/abs/2310.18349v1 ) ライセンス: Link先を確認	Minghao Tang, Yongquan He, Yongxiu Xu, Hongbo Xu, Wenyuan Zhang, Yang Lin	(参考訳) 名前付きエンティティ認識(NER)は、名前付きエンティティをテキストで識別し分類することを目的とした自然言語処理の基本的なタスクである。しかしながら、NERのスパンベースのメソッドは、通常、エンティティタイプをテキストスパンに割り当て、不均衡なサンプルスペースとなり、非エンタリティとエンティティスパン間の接続を無視する。これらの問題に対処するため,我々は,候補スパンと最寄りエンティティスパンの境界オフセットを予測するバウンダリオフセット予測ネットワーク(bopn)という,nerの新しいアプローチを提案する。境界オフセットのガイドセマンティクスを活用することで、bopnは非エンティティとエンティティスパンの間の接続を確立し、非エンティティスパンをエンティティ検出のための追加のポジティブなサンプルとして機能させることができる。さらに,エンティティタイプとスパン表現を統合し,検出対象としてエンティティタイプを使用するのではなく,タイプ認識境界オフセットを生成する。我々は,8種類のnerデータセットについて実験を行い,提案手法が従来の最先端手法よりも優れていることを示す。 Named entity recognition (NER) is a fundamental task in natural language processing that aims to identify and classify named entities in text. However, span-based methods for NER typically assign entity types to text spans, resulting in an imbalanced sample space and neglecting the connections between non-entity and entity spans. To address these issues, we propose a novel approach for NER, named the Boundary Offset Prediction Network (BOPN), which predicts the boundary offsets between candidate spans and their nearest entity spans. By leveraging the guiding semantics of boundary offsets, BOPN establishes connections between non-entity and entity spans, enabling non-entity spans to function as additional positive samples for entity detection. Furthermore, our method integrates entity type and span representations to generate type-aware boundary offsets instead of using entity types as detection targets. We conduct experiments on eight widely-used NER datasets, and the results demonstrate that our proposed BOPN outperforms previous state-of-the-art methods.	翻訳日:2023-11-05 14:09:11 公開日:2023-10-23
# 生成型AIモデルによる健康格差:ドメイン特化大規模言語モデルを用いた比較研究 Health Disparities through Generative AI Models: A Comparison Study Using A Domain Specific large language model ( http://arxiv.org/abs/2310.18355v1 ) ライセンス: Link先を確認	Yohn Jairo Parra Bautista, Vinicious Lima, Carlos Theran, Richard Alo	(参考訳) 健康格差は、人種と民族のマイノリティ、低所得者、農村住民など、異なるグループ間の健康結果と医療へのアクセスの違いである。大規模言語モデル(LLM)と呼ばれる人工知能(AI)プログラムは、人間の言語を理解し、生成し、健康コミュニケーションを改善し、健康格差を減らすことができる。人間と医師の対話にLLMを使用するには、多様で代表的なデータの必要性、プライバシの懸念、医療提供者と技術専門家のコラボレーションなど、多くの課題がある。本稿では、SciBERTなどのドメイン固有の大規模言語モデルと、多目的LLMであるBERTとの比較研究を紹介する。診察室における健康格差に関するテキストクエリを、人種などの要因を単独で使用する場合について、コサイン類似性を用いて分析した。テキストクエリを使用すると、SciBERTは "race" 単独と "perpetuates health disparities" という2つのクエリを区別できず、失敗する。臨床医は、患者と非同期に通信する際に、生成AIを使用してドラフトレスポンスを作成することができると信じている。しかし、倫理的かつ公平に開発・実施されるためには注意が必要である。 Health disparities are differences in health outcomes and access to healthcare between different groups, including racial and ethnic minorities, low-income people, and rural residents. An artificial intelligence (AI) program called large language models (LLMs) can understand and generate human language, improving health communication and reducing health disparities. There are many challenges in using LLMs in human-doctor interaction, including the need for diverse and representative data, privacy concerns, and collaboration between healthcare providers and technology experts. We introduce the comparative investigation of domain-specific large language models such as SciBERT with a multi-purpose LLMs BERT. We used cosine similarity to analyze text queries about health disparities in exam rooms when factors such as race are used alone. Using text queries, SciBERT fails when it doesn't differentiate between queries text: "race" alone and "perpetuates health disparities." We believe clinicians can use generative AI to create a draft response when communicating asynchronously with patients. However, careful attention must be paid to ensure they are developed and implemented ethically and equitably.	翻訳日:2023-11-05 13:54:59 公開日:2023-10-23
# 自然言語処理のための強化学習の展望と医療への応用 A Review of Reinforcement Learning for Natural Language Processing, and Applications in Healthcare ( http://arxiv.org/abs/2310.18354v1 ) ライセンス: Link先を確認	Ying Liu, Haozhu Wang, Huixue Zhou, Mingchen Li, Yu Hou, Sicheng Zhou, Fang Wang, Rama Hoetzlein, Rui Zhang	(参考訳) 強化学習(Reinforcement Learning, RL)は, 治療計画やパーソナライズド医療, 手術スケジュールの最適化など, 複雑な医療意思決定問題に対処するための強力なアプローチとして登場した。自然言語処理(NLP)の分野では、対話システムや機械翻訳、質問応答といったタスクの最適戦略を学ぶ能力から、注目されている。本稿では,NLPにおけるRL技術について概説し,医療における重要な進歩,課題,応用について述べる。レビューは、機械学習とその医療応用のロードマップを視覚化することから始まる。さらに、RLとNLPタスクの統合についても検討している。本研究では,会話戦略の学習を可能にする対話システム,rlに基づく機械翻訳モデル,質問応答システム,テキスト要約,情報抽出について検討した。さらに、RL-NLPシステムの倫理的考察とバイアスに対処する。 Reinforcement learning (RL) has emerged as a powerful approach for tackling complex medical decision-making problems such as treatment planning, personalized medicine, and optimizing the scheduling of surgeries and appointments. It has gained significant attention in the field of Natural Language Processing (NLP) due to its ability to learn optimal strategies for tasks such as dialogue systems, machine translation, and question-answering. This paper presents a review of the RL techniques in NLP, highlighting key advancements, challenges, and applications in healthcare. The review begins by visualizing a roadmap of machine learning and its applications in healthcare. And then it explores the integration of RL with NLP tasks. We examined dialogue systems where RL enables the learning of conversational strategies, RL-based machine translation models, question-answering systems, text summarization, and information extraction. Additionally, ethical considerations and biases in RL-NLP systems are addressed.	翻訳日:2023-11-05 13:54:39 公開日:2023-10-23
# PRCA: プラガブル・リワード駆動コンテキストアダプタによる検索質問応答のためのブラックボックス大言語モデル PRCA: Fitting Black-Box Large Language Models for Retrieval Question Answering via Pluggable Reward-Driven Contextual Adapter ( http://arxiv.org/abs/2310.18347v1 ) ライセンス: Link先を確認	Haoyan Yang, Zhitao Li, Yong Zhang, Jianzong Wang, Ning Cheng, Ming Li, Jing Xiao	(参考訳) ReQA(Retrieval Question Answering)タスクでは、検索とジェネレータで構成される検索拡張フレームワークを採用している。生成者は、検索者が検索した文書に基づいて回答を定式化する。大きな言語モデル(LLM)をジェネレータとして組み込むことは、高度なQA機能のために有益であるが、一般的には予算制約で微調整するには大きすぎる。この問題に対処し、さらにReQA性能を向上させるために、トレーニング可能なPlugable Reward-Driven Contextual Adapter (PRCA)を提案し、ジェネレータをブラックボックスとして保持する。プラガブルな方法でレトリバーとジェネレータの間に位置するPRCAは、強化学習フェーズの報酬を最大化してトークン自己回帰戦略で操作することにより、検索情報を洗練する。実験では,3つのデータセット上でのReQA性能を最大20%向上し,既存のフレームワークにブラックボックスLEMを適合させることにより,PRCAの有効性を検証した。 The Retrieval Question Answering (ReQA) task employs the retrieval-augmented framework, composed of a retriever and generator. The generator formulates the answer based on the documents retrieved by the retriever. Incorporating Large Language Models (LLMs) as generators is beneficial due to their advanced QA capabilities, but they are typically too large to be fine-tuned with budget constraints while some of them are only accessible via APIs. To tackle this issue and further improve ReQA performance, we propose a trainable Pluggable Reward-Driven Contextual Adapter (PRCA), keeping the generator as a black box. Positioned between the retriever and generator in a Pluggable manner, PRCA refines the retrieved information by operating in a token-autoregressive strategy via maximizing rewards of the reinforcement learning phase. Our experiments validate PRCA's effectiveness in enhancing ReQA performance on three datasets by up to 20% improvement to fit black-box LLMs into existing frameworks, demonstrating its considerable potential in the LLMs era.	翻訳日:2023-11-05 13:54:23 公開日:2023-10-23
# インタラクティブAI設計におけるAIアライメント:仕様アライメント、プロセスアライメント、評価サポート AI Alignment in the Design of Interactive AI: Specification Alignment, Process Alignment, and Evaluation Support ( http://arxiv.org/abs/2311.00710v1 ) ライセンス: Link先を確認	Michael Terry, Chinmay Kulkarni, Martin Wattenberg, Lucas Dixon, Meredith Ringel Morris	(参考訳) AIアライメントは、AIが望ましい結果をもたらすことを保証するという全体的な問題を、望ましくない副作用なしに考慮している。安全と人的価値の観点から考えることが多いが、AIアライメントは、対話型AIシステムのためのインターフェースの設計と評価の文脈でも考慮できる。本稿では,AIアライメントの概念を基本的3段階のインタラクションサイクルにマッピングし,アライメント目標のセットを生成する。 1) 仕様の整合性: ユーザがAIに目的を効率的に確実に伝達できるようにすること。 2) プロセスアライメント:AIの実行プロセスを検証し、任意に制御する機能を提供する。 3) 評価サポート: ユーザがAIのアウトプットを検証して理解できるようにすること。また、AIの実際のプロセスの単純化された、分離された、しかし制御可能な表現として定義された代理プロセスの概念や、人間とAIプロセスの違いがAI制御の課題にどのように影響するかを強調するプロセスガルフの概念も導入する。このフレームワークの価値を説明するために,3つのアライメント次元のそれぞれに沿って,商業的および研究的なシステムを記述し,対話的なアライメント機構を提供するインターフェースが,質的に異なるユーザエクスペリエンスをもたらすことを示す。 AI alignment considers the overall problem of ensuring an AI produces desired outcomes, without undesirable side effects. While often considered from the perspectives of safety and human values, AI alignment can also be considered in the context of designing and evaluating interfaces for interactive AI systems. This paper maps concepts from AI alignment onto a basic, three step interaction cycle, yielding a corresponding set of alignment objectives: 1) specification alignment: ensuring the user can efficiently and reliably communicate objectives to the AI, 2) process alignment: providing the ability to verify and optionally control the AI's execution process, and 3) evaluation support: ensuring the user can verify and understand the AI's output. We also introduce the concepts of a surrogate process, defined as a simplified, separately derived, but controllable representation of the AI's actual process; and the notion of a Process Gulf, which highlights how differences between human and AI processes can lead to challenges in AI control. 
To illustrate the value of this framework, we describe commercial and research systems along each of the three alignment dimensions, and show how interfaces that provide interactive alignment mechanisms can lead to qualitatively different and improved user experiences.	翻訳日:2023-11-05 13:29:50 公開日:2023-10-23
# 機械学習と知識:なぜロバスト性が重要か Machine Learning and Knowledge: Why Robustness Matters ( http://arxiv.org/abs/2310.19819v1 ) ライセンス: Link先を確認	Jonathan Vandenburgh	(参考訳) 機械学習アルゴリズムを信頼するには、アウトプットに自信が必要である。信頼はモデル信頼性の観点から解釈されるのが一般的であり、モデルが正しい出力の比率を高い割合で生成した場合は信頼される。しかしながら、モデルの信頼性は、間違った特徴に依存するモデルやコンテキストに基づくパフォーマンスのバリエーションなど、マシンラーニングモデルの堅牢性に関する問題には対処しない。信頼という認識の次元は、アルゴリズムの信頼性は、ユーザーがそのアウトプットが正しいかどうかを知る立場にあるかどうかに依存する、知識の概念を通して理解することができる、と私は論じる。知識は正しい理由による信念の形成と、エラーに対する堅牢性を必要とするため、マシンラーニングアルゴリズムは、反事実的シナリオでうまく機能し、適切な機能に基づいて意思決定を行う場合にのみ、知識を提供することができる。これは、解釈可能性、因果的近道独立性、分散シフトのロバスト性といったモデル特性がモデル信頼性に必要でなくても考慮すべき理由を説明できます。 Trusting machine learning algorithms requires having confidence in their outputs. Confidence is typically interpreted in terms of model reliability, where a model is reliable if it produces a high proportion of correct outputs. However, model reliability does not address concerns about the robustness of machine learning models, such as models relying on the wrong features or variations in performance based on context. I argue that the epistemic dimension of trust can instead be understood through the concept of knowledge, where the trustworthiness of an algorithm depends on whether its users are in the position to know that its outputs are correct. Knowledge requires beliefs to be formed for the right reasons and to be robust to error, so machine learning algorithms can only provide knowledge if they work well across counterfactual scenarios and if they make decisions based on the right features. This, I argue, can explain why we should care about model properties like interpretability, causal shortcut independence, and distribution shift robustness even if such properties are not required for model reliability.	翻訳日:2023-11-05 13:28:01 公開日:2023-10-23
# Towards Costless Model Selection in Contextual Bandits: A Bias-Variance Perspective ( http://arxiv.org/abs/2106.06483v3 ) License: see linked page	Sanath Kumar Krishnamurthy, Adrienne Margaret Propp, Susan Athey	Model selection in supervised learning provides costless guarantees as if the model that best balances bias and variance were known a priori. We study the feasibility of similar guarantees for cumulative regret minimization in the stochastic contextual bandit setting. Recent work [Marinov and Zimmert, 2021] identifies instances where no algorithm can guarantee costless regret bounds. Nevertheless, we identify benign conditions where costless model selection is feasible: gradually increasing class complexity, and diminishing marginal returns for best-in-class policy value with increasing class complexity. Our algorithm is based on a novel misspecification test, and our analysis demonstrates the benefits of using model selection for reward estimation. Unlike prior work on model selection in contextual bandits, our algorithm carefully adapts to the evolving bias-variance trade-off as more data is collected. In particular, our algorithm and analysis go beyond adapting to the complexity of the simplest realizable class and instead adapt to the complexity of the simplest class whose estimation variance dominates the bias. For short horizons, this provides improved regret guarantees that depend on the complexity of simpler classes.	Translated: 2023-10-26 04:10:20 Published: 2023-10-23
# A revision for Heisenberg uncertainty relation based on environment variable in the QCPB theory ( http://arxiv.org/abs/2003.07203v3 ) License: see linked page	Gen Wang	The Heisenberg uncertainty principle and its extensions all still take the form of inequalities, which hold the superior approximate estimations. Based on quantum covariant Poisson bracket (QCPB) theory, we propose the quantum geomertainty relation to modify and explain the uncertainty relation, giving a complete description of reality that enhances the outcome of each measurement with certainty. It demonstrates that an entanglement term exists between the observable and the environment, and explains how the environment affects the measurement, causing unavoidable influences.	Translated: 2023-10-26 04:09:00 Published: 2023-10-23
# Text as Environment: A Deep Reinforcement Learning Text Readability Assessment Model ( http://arxiv.org/abs/1912.05957v4 ) License: see linked page	Hamid Mohammadi, Seyed Hossein Khasteh, Tahereh Firoozi, Taha Samavati	Evaluating the readability of a text can significantly facilitate the precise expression of information in written form. The formulation of text readability assessment involves the identification of meaningful properties of the text regardless of its length. Sophisticated features and models are used to evaluate the comprehensibility of texts accurately. Despite this, the problem of assessing texts' readability efficiently remains relatively untouched. The efficiency of state-of-the-art text readability assessment models can be further improved using deep reinforcement learning models. Using a hard attention-based active inference technique, the proposed approach makes efficient use of input text and computational resources. Through the use of semi-supervised signals, the reinforcement learning model uses the minimum amount of text needed to determine a text's readability. A comparison of the model on Weebit and Cambridge Exams with state-of-the-art models, such as the BERT text readability model, shows that it is capable of achieving state-of-the-art accuracy with a significantly smaller amount of input text than other models.	Translated: 2023-10-26 04:08:37 Published: 2023-10-23
# Fisher-Schultz Lecture: Generic Machine Learning Inference on Heterogenous Treatment Effects in Randomized Experiments, with an Application to Immunization in India ( http://arxiv.org/abs/1712.04802v8 ) License: see linked page	Victor Chernozhukov, Mert Demirer, Esther Duflo, and Iván Fernández-Val	We propose strategies to estimate and make inference on key features of heterogeneous effects in randomized experiments. These key features include best linear predictors of the effects using machine learning proxies, average effects sorted by impact groups, and average characteristics of most and least impacted units. The approach is valid in high dimensional settings, where the effects are proxied (but not necessarily consistently estimated) by predictive and causal machine learning methods. We post-process these proxies into estimates of the key features. Our approach is generic: it can be used in conjunction with penalized methods, neural networks, random forests, boosted trees, and ensemble methods, both predictive and causal. Estimation and inference are based on repeated data splitting to avoid overfitting and achieve validity. We use quantile aggregation of the results across many potential splits, in particular taking medians of p-values and medians and other quantiles of confidence intervals. We show that quantile aggregation lowers estimation risks over a single split procedure, and establish its principal inferential properties. Finally, our analysis reveals ways to build provably better machine learning proxies through causal learning: we can use the objective functions that we develop to construct the best linear predictors of the effects, to obtain better machine learning proxies in the initial step. We illustrate the use of both inferential tools and causal learners with a randomized field experiment that evaluates a combination of nudges to stimulate demand for immunization in India.	Translated: 2023-10-26 04:08:10 Published: 2023-10-23
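The Fisher-Schultz Lecture entry above describes aggregating results over many data splits by taking medians of p-values. The snippet below is only a minimal sketch of that idea, not the authors' implementation. The function name, the example p-values, and the choice to compare the median against `alpha / 2` are assumptions made here for illustration.

```python
import statistics


def aggregate_p_values(p_values_per_split, alpha=0.05):
    """Median-aggregate p-values obtained from repeated data splits.

    Assumption (not taken from the listing): the median p-value is compared
    against alpha / 2 as a conservative adjustment for aggregating splits.
    """
    median_p = statistics.median(p_values_per_split)
    reject = median_p <= alpha / 2
    return median_p, reject


# Hypothetical p-values from five random splits of the same data set.
split_p_values = [0.012, 0.030, 0.008, 0.021, 0.150]
median_p, reject = aggregate_p_values(split_p_values)
print(median_p, reject)
```

A single unlucky split (here the 0.150) does not change the aggregated decision, which is the stabilizing effect the abstract attributes to quantile aggregation.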
# Harnessing Hard Mixed Samples with Decoupled Regularizer ( http://arxiv.org/abs/2203.10761v3 ) License: see linked page	Zicheng Liu, Siyuan Li, Ge Wang, Cheng Tan, Lirong Wu, Stan Z. Li	Mixup is an efficient data augmentation approach that improves the generalization of neural networks by smoothing the decision boundary with mixed data. Recently, dynamic mixup methods have effectively improved on previous static policies (e.g., linear interpolation) by maximizing target-related salient regions in mixed samples, but their excessive additional time costs are not acceptable. These additional computational overheads mainly come from optimizing the mixed samples according to the mixed labels. However, we found that the extra optimizing step may be redundant because label-mismatched mixed samples are informative hard mixed samples for deep models to localize discriminative features. In this paper, we thus are not trying to propose a more complicated dynamic mixup policy but rather an efficient mixup objective function with a decoupled regularizer named Decoupled Mixup (DM). The primary effect is that DM can adaptively utilize those hard mixed samples to mine discriminative features without losing the original smoothness of mixup. As a result, DM enables static mixup methods to match or even exceed the performance of dynamic methods without any extra computation. This also leads to an interesting objective design problem for mixup training: we need to focus on both smoothing the decision boundaries and identifying discriminative features. Extensive experiments on supervised and semi-supervised learning benchmarks across seven datasets validate the effectiveness of DM as a plug-and-play module. Source code and models are available at https://github.com/Westlake-AI/openmixup	Translated: 2023-10-26 04:03:12 Published: 2023-10-23
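The Decoupled Mixup entry above refers to static mixup policies such as linear interpolation. As a rough illustration of that baseline only (the decoupled regularizer itself is not reproduced here), the following sketch mixes two samples and their one-hot labels. The function names and the toy inputs are hypothetical, and drawing the mixing ratio from a Beta distribution is a common convention assumed here rather than something stated in the listing.

```python
import random


def mixup(x1, y1, x2, y2, lam):
    """Linearly interpolate two samples and their one-hot labels."""
    x = [lam * a + (1.0 - lam) * b for a, b in zip(x1, x2)]
    y = [lam * a + (1.0 - lam) * b for a, b in zip(y1, y2)]
    return x, y


def sample_lambda(alpha=1.0, rng=random):
    """Draw a mixing ratio from Beta(alpha, alpha) (assumed convention)."""
    return rng.betavariate(alpha, alpha)


# Two toy feature vectors with one-hot labels for classes 0 and 1.
x_mixed, y_mixed = mixup([1.0, 0.0], [1.0, 0.0], [0.0, 2.0], [0.0, 1.0], lam=0.75)
print(x_mixed, y_mixed)
```

The mixed label keeps the same ratio as the mixed input, which is the "smoothing" behaviour the abstract says DM preserves.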
# Quantum Computation of Reactions on Surfaces Using Local Embedding ( http://arxiv.org/abs/2203.07536v3 ) License: see linked page	Tanvi P. Gujarati, Mario Motta, Triet Nguyen Friedhoff, Julia E. Rice, Nam Nguyen, Panagiotis Kl. Barkoutsos, Richard J. Thompson, Tyler Smith, Marna Kagele, Mark Brei, Barbara A. Jones, Kristen Williams	Modeling electronic systems is an important application for quantum computers. In the context of materials science, an important open problem is the computational description of chemical reactions on surfaces. In this work, we outline a workflow to model the adsorption and reaction of molecules on surfaces using quantum computing algorithms. We develop and compare two local embedding methods for the systematic determination of active spaces. These methods are automated and based on the physics of molecule-surface interactions and yield systematically improvable active spaces. Furthermore, to reduce the quantum resources required for the simulation of the selected active spaces using quantum algorithms, we introduce a technique for exact and automated circuit simplification. This technique is applicable to a broad class of quantum circuits and critical to enable demonstration on near-term quantum devices. We apply the proposed combination of active-space selection and circuit simplification to the dissociation of water on a magnesium surface using classical simulators and quantum hardware. Our study identifies reactions of molecules on surfaces, in conjunction with the proposed algorithmic workflow, as a promising research direction in the field of quantum computing applied to materials science.	Translated: 2023-10-26 04:02:44 Published: 2023-10-23
# JAMES: Normalizing Job Titles with Multi-Aspect Graph Embeddings and Reasoning ( http://arxiv.org/abs/2202.10739v2 ) License: see linked page	Michiharu Yamashita, Jia Tracy Shen, Thanh Tran, Hamoon Ekhtiari, Dongwon Lee	In online job marketplaces, it is important to establish a well-defined job title taxonomy for various downstream tasks (e.g., job recommendation, users' career analysis, and turnover prediction). Job Title Normalization (JTN) is a cleaning step that classifies user-created non-standard job titles into normalized ones. However, solving the JTN problem is non-trivial, with challenges including: (1) semantic similarity of different job titles, (2) non-normalized user-created job titles, and (3) large-scale and long-tailed job titles in real-world applications. To this end, we propose a novel solution, named JAMES, that constructs three unique embeddings (i.e., graph, contextual, and syntactic) of a target job title to effectively capture its various traits. We further propose a multi-aspect co-attention mechanism to attentively combine these embeddings, and employ neural logical reasoning representations to collaboratively estimate similarities between messy job titles and normalized job titles in a reasoning space. To evaluate JAMES, we conduct comprehensive experiments against ten competing models on a large-scale real-world dataset with over 350,000 job titles. Our experimental results show that JAMES significantly outperforms the best baseline by 10.06% in Precision@10 and by 17.52% in NDCG@10, respectively.	Translated: 2023-10-26 04:01:51 Published: 2023-10-23
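The JAMES entry above reports results in Precision@10 and NDCG@10. For readers unfamiliar with these ranking metrics, here is a small sketch using their usual binary-relevance definitions. The function names and the toy job titles are hypothetical, and the listing does not say exactly how the authors computed their scores.

```python
import math


def precision_at_k(ranked, relevant, k=10):
    """Fraction of the top-k ranked items that are relevant."""
    return sum(1 for item in ranked[:k] if item in relevant) / k


def ndcg_at_k(ranked, relevant, k=10):
    """Binary-relevance NDCG: DCG of the ranking divided by the ideal DCG."""
    dcg = sum(
        1.0 / math.log2(rank + 2)
        for rank, item in enumerate(ranked[:k])
        if item in relevant
    )
    ideal_hits = min(len(relevant), k)
    idcg = sum(1.0 / math.log2(rank + 2) for rank in range(ideal_hits))
    return dcg / idcg if idcg > 0 else 0.0


# Hypothetical ranking of normalized titles for one messy input title.
ranked_titles = ["software engineer", "developer", "data analyst"]
relevant_titles = {"software engineer"}
p_at_3 = precision_at_k(ranked_titles, relevant_titles, k=3)
n_at_3 = ndcg_at_k(ranked_titles, relevant_titles, k=3)
print(p_at_3, n_at_3)
```

Precision@k ignores where in the top k the correct title appears, while NDCG@k rewards placing it nearer the top, which is why the two numbers in the abstract can differ.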
# Symplectic Learning for Hamiltonian Neural Networks ( http://arxiv.org/abs/2106.11753v2 ) License: see linked page	Marco David and Florian Méhats	Machine learning methods are widely used in the natural sciences to model and predict physical systems from observation data. Yet, they are often used as poorly understood "black boxes," disregarding existing mathematical structure and invariants of the problem. Recently, the proposal of Hamiltonian Neural Networks (HNNs) took a first step towards a unified "gray box" approach, using physical insight to improve performance for Hamiltonian systems. In this paper, we explore a significantly improved training method for HNNs, exploiting the symplectic structure of Hamiltonian systems with a different loss function. This frees the loss from an artificial lower bound. We mathematically guarantee the existence of an exact Hamiltonian function which the HNN can learn. This allows us to prove and numerically analyze the errors made by HNNs which, in turn, renders them fully explainable. Finally, we present a novel post-training correction to obtain the true Hamiltonian only from discretized observation data, up to an arbitrary order.	Translated: 2023-10-26 03:59:52 Published: 2023-10-23
# Bellman-consistent Pessimism for Offline Reinforcement Learning ( http://arxiv.org/abs/2106.06926v6 ) License: see linked page	Tengyang Xie, Ching-An Cheng, Nan Jiang, Paul Mineiro, Alekh Agarwal	The use of pessimism when reasoning about datasets lacking exhaustive exploration has recently gained prominence in offline reinforcement learning. Despite the robustness it adds to the algorithm, overly pessimistic reasoning can be equally damaging in precluding the discovery of good policies, which is an issue for the popular bonus-based pessimism. In this paper, we introduce the notion of Bellman-consistent pessimism for general function approximation: instead of calculating a point-wise lower bound for the value function, we implement pessimism at the initial state over the set of functions consistent with the Bellman equations. Our theoretical guarantees only require Bellman closedness as standard in the exploratory setting, in which case bonus-based pessimism fails to provide guarantees. Even in the special case of linear function approximation where stronger expressivity assumptions hold, our result improves upon a recent bonus-based approach by $\mathcal{O}(d)$ in its sample complexity when the action space is finite. Remarkably, our algorithms automatically adapt to the best bias-variance tradeoff in hindsight, whereas most prior approaches require tuning extra hyperparameters a priori.	Translated: 2023-10-26 03:59:35 Published: 2023-10-23
# Simulating Heisenberg Interactions in the Ising Model with Strong Drive Fields ( http://arxiv.org/abs/2207.09438v5 ) License: see linked page	Anthony N. Ciavarella, Stephan Caspar, Hersh Singh, Martin J. Savage, Pavel Lougovski	The time-evolution of an Ising model with large driving fields over discrete time intervals is shown to be reproduced by an effective XXZ-Heisenberg model at leading order in the inverse field strength. For specific orientations of the drive field, the dynamics of the XXX-Heisenberg model is reproduced. These approximate equivalences, valid above a critical driving field strength set by dynamical phase transitions in the Ising model, are expected to enable quantum devices that natively evolve qubits according to the Ising model to simulate more complex systems.	Translated: 2023-10-26 03:50:40 Published: 2023-10-23
# Fate of density waves in the presence of a higher order van Hove singularity ( http://arxiv.org/abs/2205.08828v2 ) License: see linked page	Alkistis Zervou, Dmitriy V. Efremov and Joseph J. Betouras	Topological transitions in electronic band structures, resulting in van Hove singularities in the density of states, can considerably affect various types of orderings in quantum materials. Regular topological transitions (of neck formation or collapse) lead to a logarithmic divergence of the electronic density of states (DOS) as a function of energy in two dimensions. In addition to the regular van Hove singularities, there are higher order van Hove singularities (HOVHS) with power-law divergences in the DOS. By employing renormalization group (RG) techniques, we study the fate of a spin-density wave phase formed by nested parts of the Fermi surface, when a HOVHS appears in parallel. We find that the phase formation can be boosted by the presence of the singularity, with the critical temperature increasing by orders of magnitude. We discuss possible applications of our findings to a range of quantum materials such as Sr$_3$Ru$_2$O$_7$, Sr$_2$RuO$_4$ and transition metal dichalcogenides.	Translated: 2023-10-26 03:49:51 Published: 2023-10-23
# A Comprehensive Survey on Model Quantization for Deep Neural Networks in Image Classification ( http://arxiv.org/abs/2205.07877v5 ) License: see linked page	Babak Rokh, Ali Azarpeyvand, Alireza Khanteymoori	Recent advancements in machine learning achieved by Deep Neural Networks (DNNs) have been significant. While demonstrating high accuracy, DNNs are associated with a huge number of parameters and computations, which leads to high memory usage and energy consumption. As a result, deploying DNNs on devices with constrained hardware resources poses significant challenges. To overcome this, various compression techniques have been widely employed to optimize DNN accelerators. A promising approach is quantization, in which the full-precision values are stored in low bit-width precision. Quantization not only reduces memory requirements but also replaces high-cost operations with low-cost ones. DNN quantization offers flexibility and efficiency in hardware design, making it a widely adopted technique in various methods. Since quantization has been extensively utilized in previous works, there is a need for an integrated report that provides an understanding, analysis, and comparison of different quantization approaches. Consequently, we present a comprehensive survey of quantization concepts and methods, with a focus on image classification. We describe clustering-based quantization methods and explore the use of a scale factor parameter for approximating full-precision values. Moreover, we thoroughly review the training of a quantized DNN, including the use of a straight-through estimator and quantization regularization. We explain the replacement of floating-point operations with low-cost bitwise operations in a quantized DNN and the sensitivity of different layers in quantization. Furthermore, we highlight the evaluation metrics for quantization methods and important benchmarks in the image classification task. We also present the accuracy of the state-of-the-art methods on CIFAR-10 and ImageNet.	Translated: 2023-10-26 03:49:35 Published: 2023-10-23
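The quantization survey entry above mentions storing full-precision values at low bit width and using a scale factor to approximate them. One common way to do this is symmetric uniform quantization, sketched below. This is an illustrative example only and is not taken from the survey; the function names, the 8-bit default, and the toy weights are assumptions.

```python
def quantize(values, num_bits=8):
    """Symmetric uniform quantization with a single scale factor."""
    q_max = 2 ** (num_bits - 1) - 1  # 127 for 8 bits
    max_abs = max(abs(v) for v in values)
    scale = max_abs / q_max if max_abs > 0 else 1.0
    # Round to the nearest integer level and clamp to the representable range.
    q = [max(-q_max, min(q_max, round(v / scale))) for v in values]
    return q, scale


def dequantize(q, scale):
    """Approximate the original full-precision values."""
    return [scale * v for v in q]


weights = [0.5, -1.27, 0.0, 1.0]  # hypothetical full-precision weights
q_weights, scale = quantize(weights)
approx = dequantize(q_weights, scale)
print(q_weights, scale, approx)
```

Each reconstructed value differs from the original by at most half a quantization step (`scale / 2`), so a smaller bit width means a larger step and a coarser approximation.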
# A suite of diagnostic metrics for characterizing selection schemes ( http://arxiv.org/abs/2204.13839v3 ) License: see linked page	Jose Guadalupe Hernandez, Alexander Lalejini, Charles Ofria	Benchmark suites are crucial for assessing the performance of evolutionary algorithms, but the constituent problems are often too complex to provide clear intuition about an algorithm's strengths and weaknesses. To address this gap, we introduce DOSSIER ("Diagnostic Overview of Selection Schemes In Evolutionary Runs"), a diagnostic suite initially composed of eight handcrafted metrics. These metrics are designed to empirically measure specific capacities for exploitation, exploration, and their interactions. We consider exploitation both with and without constraints, and we divide exploration into two aspects: diversity exploration (the ability to simultaneously explore multiple pathways) and valley-crossing exploration (the ability to cross wider and wider fitness valleys). We apply DOSSIER to six popular selection schemes: truncation, tournament, fitness sharing, lexicase, nondominated sorting, and novelty search. Our results confirm that simple schemes (e.g., tournament and truncation) emphasized exploitation. For more sophisticated schemes, however, our diagnostics revealed interesting dynamics. Lexicase selection performed moderately well across all diagnostics that did not incorporate valley crossing, but faltered dramatically whenever valleys were present, performing worse than even random search. Fitness sharing was the only scheme to effectively contend with valley crossing but it struggled with the other diagnostics. Our study highlights the utility of using diagnostics to gain nuanced insights into selection scheme characteristics, which can inform the design of new selection methods.	Translated: 2023-10-26 03:49:13 Published: 2023-10-23
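The DOSSIER entry above names truncation and tournament selection as simple, exploitation-focused schemes. The sketch below shows textbook versions of both so the contrast is concrete. It is not the authors' code; the function names, the integer "population", and the identity fitness function are all made up for illustration.

```python
import random


def truncation_selection(population, fitness, num_parents):
    """Keep only the num_parents individuals with the highest fitness."""
    ranked = sorted(population, key=fitness, reverse=True)
    return ranked[:num_parents]


def tournament_selection(population, fitness, num_parents, tournament_size=2, rng=random):
    """Pick each parent as the fittest member of a random tournament."""
    parents = []
    for _ in range(num_parents):
        contestants = rng.sample(population, tournament_size)
        parents.append(max(contestants, key=fitness))
    return parents


# Toy population where each individual's fitness is its own value.
population = [3, 9, 1, 7, 5]
top = truncation_selection(population, fitness=lambda x: x, num_parents=2)
picked = tournament_selection(population, lambda x: x, num_parents=4, rng=random.Random(0))
print(top, picked)
```

Both schemes only ever favour currently fitter individuals, which is consistent with the abstract's finding that they emphasize exploitation over exploration.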
# Negligible effect of brain MRI data preprocessing for tumor segmentation ( http://arxiv.org/abs/2204.05278v4 ) License: see linked page	Ekaterina Kondrateva and Polina Druzhinina and Alexandra Dalechina and Svetlana Zolotova and Andrey Golanov and Boris Shirokikh and Mikhail Belyaev and Anvar Kurmukov	Magnetic resonance imaging (MRI) data is heterogeneous due to differences in device manufacturers, scanning protocols, and inter-subject variability. A conventional way to mitigate MR image heterogeneity is to apply preprocessing transformations such as anatomy alignment, voxel resampling, signal intensity equalization, image denoising, and localization of regions of interest. Although a preprocessing pipeline standardizes image appearance, its influence on the quality of image segmentation and on other downstream tasks in deep neural networks has never been rigorously studied. We conduct experiments on three publicly available datasets and evaluate the effect of different preprocessing steps in intra- and inter-dataset training scenarios. Our results demonstrate that most popular standardization steps add no value to the network performance; moreover, preprocessing can hamper model performance. We suggest that image intensity normalization approaches do not contribute to model accuracy because of the reduction of signal variance with image standardization. Finally, we show that the contribution of skull-stripping in data preprocessing is almost negligible if measured in terms of estimated tumor volume. We show that the only essential transformation for accurate deep learning analysis is the unification of voxel spacing across the dataset. In contrast, inter-subject anatomical alignment in the form of non-rigid atlas registration is not necessary and intensity equalization steps (denoising, bias-field correction and histogram matching) do not improve models' performance. The study code is accessible online at https://github.com/MedImAIR/brain-mri-processing-pipeline	Translated: 2023-10-26 03:48:12 Published: 2023-10-23
# SODA: Million-scale Dialogue Distillation with Social Commonsense Contextualization ( http://arxiv.org/abs/2212.10465v3 ) License: see linked page	Hyunwoo Kim, Jack Hessel, Liwei Jiang, Peter West, Ximing Lu, Youngjae Yu, Pei Zhou, Ronan Le Bras, Malihe Alikhani, Gunhee Kim, Maarten Sap, Yejin Choi	Data scarcity has been a long-standing issue in the field of open-domain social dialogue. To quench this thirst, we present SODA: the first publicly available, million-scale high-quality social dialogue dataset. By contextualizing social commonsense knowledge from a knowledge graph, we are able to distill an exceptionally broad spectrum of social interactions from a large language model. Human evaluation shows that conversations in SODA are more consistent, specific, and (surprisingly) natural than those in prior human-authored datasets. Using SODA, we train COSMO: a generalizable conversation model that is significantly more natural and consistent on unseen datasets than best-performing conversation models (e.g., GODEL, BlenderBot-1, Koala, Vicuna). Experiments reveal COSMO is sometimes even preferred to the original human-written gold responses. Additionally, our results shed light on the distinction between knowledge-enriched conversations and natural social chitchats. We plan to make our data, model, and code public.	Translated: 2023-10-26 03:29:48 Published: 2023-10-23
# Moûsai: Text-to-Music Generation with Long-Context Latent Diffusion ( http://arxiv.org/abs/2301.11757v3 ) License: see linked page	Flavio Schneider, Ojasv Kamal, Zhijing Jin, Bernhard Schölkopf	Recent years have seen the rapid development of large generative models for text; however, much less research has explored the connection between text and another "language" of communication: music. Music, much like text, can convey emotions, stories, and ideas, and has its own unique structure and syntax. In our work, we bridge text and music via a text-to-music generation model that is highly efficient, expressive, and can handle long-term structure. Specifically, we develop Moûsai, a cascading two-stage latent diffusion model that can generate multiple minutes of high-quality stereo music at 48kHz from textual descriptions. Moreover, our model features high efficiency, which enables real-time inference on a single consumer GPU with a reasonable speed. Through experiments and property analyses, we show our model's competence over a variety of criteria compared with existing music generation models. Lastly, to promote the open-source culture, we provide a collection of open-source libraries with the hope of facilitating future work in the field. We open-source the following: code: https://github.com/archinetai/audio-diffusion-pytorch; music samples for this paper: http://bit.ly/44ozWDH; all music samples for all models: https://bit.ly/audio-diffusion.	Translated: 2023-10-26 03:20:09 Published: 2023-10-23
# 共形e値を用いたFDR制御付きの脱ランダム化新規性検出 Derandomized Novelty Detection with FDR Control via Conformal E-values ( http://arxiv.org/abs/2302.07294v3 ) ライセンス: Link先を確認	Meshi Bashari, Amir Epstein, Yaniv Romano, Matteo Sesia	(参考訳) 共形推論は、新規性検出のための任意の機械学習アルゴリズムの出力を厳密に校正する、分布によらない汎用的な方法を提供する。このアプローチには多くの長所があるが、同じデータを2回分析すると異なる結果をもたらす可能性があるという意味でランダム化されているという限界があり、これは結果の解釈を妨げる可能性がある。統計的有意性を定量化するために、p値の代わりに適切な共形e値を用いることにより、共形推論をより安定させることを提案する。このソリューションでは、同一データの複数の解析から集めた証拠を効果的に集約しつつ、偽発見率を証明可能な形で制御することができる。さらに, 提案手法は, 同一データから慎重に抽出した追加のサイド情報に基づいて共形e値の重み付けを行う革新的な手法にも助けられ, 標準的な共形推論と比較して検出力をあまり損なうことなくランダム性を低減できることを示す。合成および実データによるシミュレーションにより、この解は最先端の代替技術で得られた推論におけるランダムノイズの除去に有効でありうること、そして時にはより高い検出力にもつながることを確認した。 Conformal inference provides a general distribution-free method to rigorously calibrate the output of any machine learning algorithm for novelty detection. While this approach has many strengths, it has the limitation of being randomized, in the sense that it may lead to different results when analyzing twice the same data, and this can hinder the interpretation of any findings. We propose to make conformal inferences more stable by leveraging suitable conformal e-values instead of p-values to quantify statistical significance. This solution allows the evidence gathered from multiple analyses of the same data to be aggregated effectively while provably controlling the false discovery rate. Further, we show that the proposed method can reduce randomness without much loss of power compared to standard conformal inference, partly thanks to an innovative way of weighting conformal e-values based on additional side information carefully extracted from the same data. Simulations with synthetic and real data confirm this solution can be effective at eliminating random noise in the inferences obtained with state-of-the-art alternative techniques, sometimes also leading to higher power.	翻訳日:2023-10-26 01:33:48 公開日:2023-10-23
# 量子ゲートの真の多体エンタングル生成能力における階層 Hierarchies among Genuine Multipartite Entangling Capabilities of Quantum Gates ( http://arxiv.org/abs/2302.06574v2 ) ライセンス: Link先を確認	Samir Kumar Hazra, Aditi Sen De	(参考訳) 多体分離可能状態の階層に基づき、真の多体エンタングルメントを生成する能力に応じて量子ゲートを分類する。特に、固定されたユニタリ演算子がk-分離可能な状態の集合に作用するとき、その演算子によって生成される最大(平均)の真の多体エンタングルメント(GME)量は、k-分離可能な入力状態の集合にわたって最大化することで決定される。入力状態がある2分割で絡み合っている場合に高いGMEの生成に有利なユニタリ演算子を特定するが、入力の絡み合いが役に立たないという逆の状況も起こりうる。一般化幾何測度(GGM)をGMEの定量化指標として計算することにより,特殊なクラスの量子ゲート,対角,置換,Haar一様に生成されたユニタリ演算子を含む様々なユニタリ演算子の最大エンタングル能力を特徴付ける。我々は,最大のGGMを持つ状態を生成できるユニタリ演算子とその対応する入力を決定する。 We categorize quantum gates according to their capability to generate genuine multipartite entanglement based on the hierarchy of multipartite separable states. In particular, when a fixed unitary operator acts on the set of k-separable states, the maximal (average) genuine multipartite entanglement (GME) content produced via that particular unitary operator is determined after maximizing over the set of k-separable input states. We identify unitary operators that are beneficial for generating high GME when the input states are entangled in some bipartition, although the picture can also be reversed in which entanglement in inputs does not help. We characterize maximum entangling power of a variety of unitary operators including special classes of quantum gates, diagonal, permutation and Haar uniformly generated unitary operators by computing generalized geometric measure (GGM) as GME quantifier. We determine the unitary operators and their corresponding inputs which can create the resulting states having maximum GGM.	翻訳日:2023-10-26 01:33:27 公開日:2023-10-23
# グラフニューラルネットワークの一般化:グラフ拡散によるPAC-Bayesian境界の改善 Generalization in Graph Neural Networks: Improved PAC-Bayesian Bounds on Graph Diffusion ( http://arxiv.org/abs/2302.04451v3 ) ライセンス: Link先を確認	Haotian Ju, Dongyue Li, Aneesh Sharma, and Hongyang R. Zhang	(参考訳) グラフニューラルネットワークは、グラフ予測タスクに広く使われている。経験的性能に動機づけられた先行研究は、最大次数の観点からグラフ構造にスケールするグラフニューラルネットワークの一般化境界を開発した。本稿では,グラフニューラルネットワークの特徴拡散行列の最大特異値に代えてスケールする一般化境界を提案する。これらの境界は実世界のグラフの事前境界よりも数値的に小さい。我々はまた、上界漸近的に一致する一般化ギャップの下界を構成する。これらの結果を達成するために,先行作業の設定(畳み込みネットワークとメッセージパッシングネットワーク)と新たな設定(グラフ同型ネットワーク)を含む統一モデルを分析する。我々のキーとなる考え方は、ヘシアンを用いたノイズ摂動に対するグラフニューラルネットワークの安定性を測定することである。実験により,Hessianによる測定は,観測されたグラフニューラルネットワークの一般化ギャップと相関することがわかった。微調整済みグラフニューラルネットワークの雑音安定性特性の最適化も、グラフレベルの分類タスクにおけるテスト性能を向上させる。 Graph neural networks are widely used tools for graph prediction tasks. Motivated by their empirical performance, prior works have developed generalization bounds for graph neural networks, which scale with graph structures in terms of the maximum degree. In this paper, we present generalization bounds that instead scale with the largest singular value of the graph neural network's feature diffusion matrix. These bounds are numerically much smaller than prior bounds for real-world graphs. We also construct a lower bound of the generalization gap that matches our upper bound asymptotically. To achieve these results, we analyze a unified model that includes prior works' settings (i.e., convolutional and message-passing networks) and new settings (i.e., graph isomorphism networks). Our key idea is to measure the stability of graph neural networks against noise perturbations using Hessians. Empirically, we find that Hessian-based measurements correlate with the observed generalization gaps of graph neural networks accurately. Optimizing noise stability properties for fine-tuning pretrained graph neural networks also improves test performance on several graph-level classification tasks.	翻訳日:2023-10-26 01:32:45 公開日:2023-10-23
# deforestvis:surrogate decision stumpsを用いた機械学習モデルの行動分析 DeforestVis: Behavior Analysis of Machine Learning Models with Surrogate Decision Stumps ( http://arxiv.org/abs/2304.00133v3 ) ライセンス: Link先を確認	Angelos Chatzimparmpas, Rafael M. Martins, Alexandru C. Telea, Andreas Kerren	(参考訳) 機械学習(ML)モデルの複雑さが増大し、異なる(そして重要な)ドメインでの応用が増加するにつれて、より解釈可能で信頼性の高いMLに対する強い需要がある。そのようなモデルを直接的にモデルに依存しない解釈の方法は、ルールセットや決定木といった、よりシンプルで説明しやすく、元のモデルに十分近似するサーロゲートモデルを訓練することである。しかし、ルールセットは非常に長くなり、多くのif-else文があり、複雑なMLモデルを正確にエミュレートすると決定木深さが急速に増加する。そのような場合、両方のアプローチはコア目標を達成できず、ユーザーにモデル解釈性を提供する。そこで本研究では,アダプティブ・ブースティング(adaboost)技術を用いて生成されたサーロゲート決定スランプ(一段階決定木)を提供することにより,複雑なmlモデルの挙動をユーザフレンドリに要約するビジュアル分析ツールであるdeforestvisを提案する。 DeforestVisは、より多くの切り株をインクリメンタルに生成し、決定を正当化するために重み付けされた切り株を使った属性ベースの説明を作成し、ルールオーバーライドが1つ以上の切り株間のトレーニングインスタンス割り当てに与える影響を分析することで、複雑さと忠実さのトレードオフを探索するのに役立つ。独立したテストセットにより、ユーザは手動のルール変更の有効性を監視し、ケースバイケース分析に基づいて仮説を形成することができる。 2つのユースケースでdeforestvisの適用可能性と有用性を示し,データアナリストとモデル開発者とのエキスパートインタビューを行った。 As the complexity of machine learning (ML) models increases and their application in different (and critical) domains grows, there is a strong demand for more interpretable and trustworthy ML. A direct, model-agnostic, way to interpret such models is to train surrogate models, such as rule sets and decision trees, that sufficiently approximate the original ones while being simpler and easier-to-explain. Yet, rule sets can become very lengthy, with many if-else statements, and decision tree depth grows rapidly when accurately emulating complex ML models. In such cases, both approaches can fail to meet their core goal, providing users with model interpretability. To tackle this, we propose DeforestVis, a visual analytics tool that offers user-friendly summarization of the behavior of complex ML models by providing surrogate decision stumps (one-level decision trees) generated with the adaptive boosting (AdaBoost) technique. 
DeforestVis helps users to explore the complexity vs fidelity trade-off by incrementally generating more stumps, creating attribute-based explanations with weighted stumps to justify decision making, and analyzing the impact of rule overriding on training instance allocation between one or more stumps. An independent test set allows users to monitor the effectiveness of manual rule changes and form hypotheses based on case-by-case analyses. We show the applicability and usefulness of DeforestVis with two use cases and expert interviews with data analysts and model developers.	翻訳日:2023-10-26 01:26:36 公開日:2023-10-23
# 機械心理学:心理学的手法を用いた大規模言語モデルにおける創発的能力と行動の調査 Machine Psychology: Investigating Emergent Capabilities and Behavior in Large Language Models Using Psychological Methods ( http://arxiv.org/abs/2303.13988v4 ) ライセンス: Link先を確認	Thilo Hagendorff	(参考訳) 大規模言語モデル(LLM)は、現在、人間のコミュニケーションと日常の生活を結び付けるAIシステムの最前線にある。急速な技術進歩と極端な汎用性により、LLMは今や数百万人のユーザを抱えており、情報検索、コンテンツ生成、問題解決などの主要なゴート技術になりつつある。そのため、その能力を徹底的に評価し、精査することが重要である。現在のllmでは、ますます複雑で新しい行動パターンがみられるため、もともと人間をテストするために設計された心理学実験の参加者として扱うことができる。そこで本研究では,「機械心理学」と呼ばれる新しい研究分野を紹介する。この論文は、心理学の異なるサブフィールドがLLMの行動テストにどのように影響するかを概説する。機械心理学研究の方法論的基準を定義しており、特にプロンプトデザインのポリシーに焦点を当てている。さらに、LLMで発見された行動パターンがどのように解釈されるかを記述する。要約すると、機械心理学は従来の自然言語処理ベンチマークでは検出できないLLMの創発的能力を発見することを目的としている。 Large language models (LLMs) are currently at the forefront of intertwining AI systems with human communication and everyday life. Due to rapid technological advances and their extreme versatility, LLMs nowadays have millions of users and are at the cusp of being the main go-to technology for information retrieval, content generation, problem-solving, etc. Therefore, it is of great importance to thoroughly assess and scrutinize their capabilities. Due to increasingly complex and novel behavioral patterns in current LLMs, this can be done by treating them as participants in psychology experiments that were originally designed to test humans. For this purpose, the paper introduces a new field of research called "machine psychology". The paper outlines how different subfields of psychology can inform behavioral tests for LLMs. It defines methodological standards for machine psychology research, especially by focusing on policies for prompt designs. Additionally, it describes how behavioral patterns discovered in LLMs are to be interpreted. In sum, machine psychology aims to discover emergent abilities in LLMs that cannot be detected by most traditional natural language processing benchmarks.	翻訳日:2023-10-26 01:25:21 公開日:2023-10-23
# ChatGPTの一貫性解析 Consistency Analysis of ChatGPT ( http://arxiv.org/abs/2303.06273v2 ) ライセンス: Link先を確認	Myeongjun Erik Jang, Thomas Lukasiewicz	(参考訳) ChatGPTは導入以来大きな人気を集めている。その肯定的な側面は、多くのメディアプラットフォームを通じて報告されており、いくつかの分析では、ChatGPTが専門試験でまずまずの成績を収めたことまで示され、AIが産業分野で人間を支援し、さらには置き換えられるという主張をさらに後押ししている。しかし、その信頼性と信用性を疑う者もいる。本稿では,ChatGPT と GPT-4 の論理的に一貫した振る舞いに関する信頼性について検討し,特に意味的一貫性と,否定,対称,推移の各一貫性の特性に着目する。両モデルとも言語理解能力と推論能力が向上しているように見えるが,論理的に一貫した予測の生成には依然としてしばしば失敗することが示唆された。また,プロンプト設計,少数ショット学習,より大規模な大規模言語モデル(LLM)の採用は,LLMの不整合問題を解決する決定的な解決策にはなりそうにないことを実験により確認した。 ChatGPT has gained a huge popularity since its introduction. Its positive aspects have been reported through many media platforms, and some analyses even showed that ChatGPT achieved a decent grade in professional exams, adding extra support to the claim that AI can now assist and even replace humans in industrial fields. Others, however, doubt its reliability and trustworthiness. This paper investigates the trustworthiness of ChatGPT and GPT-4 regarding logically consistent behaviour, focusing specifically on semantic consistency and the properties of negation, symmetric, and transitive consistency. Our findings suggest that while both models appear to show an enhanced language understanding and reasoning ability, they still frequently fall short of generating logically consistent predictions. We also ascertain via experiments that prompt designing, few-shot learning and employing larger large language models (LLMs) are unlikely to be the ultimate solution to resolve the inconsistency issue of LLMs.	翻訳日:2023-10-26 01:24:09 公開日:2023-10-23
# 生成検索エンジンにおける検証可能性の評価 Evaluating Verifiability in Generative Search Engines ( http://arxiv.org/abs/2304.09848v2 ) ライセンス: Link先を確認	Nelson F. Liu and Tianyi Zhang and Percy Liang	(参考訳) 生成検索エンジンは、インラインの引用とともに、ユーザークエリへの応答を直接生成する。信頼できる生成検索エンジンの前提条件は、総合的に引用すべきシステム(高い引用リコール、全ての文は引用によって完全に支持される)と正確に引用すべきシステム(高い引用精度、全ての引用が関連するステートメントをサポートする)である。 Bing Chat、NeevaAI、perplexity.ai、YouChatの4つの一般的な生成検索エンジンを、さまざまなソースからのさまざまなクエリ(例えば、過去のGoogleユーザクエリ、Redditで動的にコンパイルされたオープンエンド質問など)で評価する。既存の生成検索エンジンからの応答は、流動的で情報的に見えるが、しばしばサポートされていない文や不正確な引用を含む: 平均して、生成された文の51.5%は引用によって完全に支持され、引用の74.5%のみが関連する文をサポートする。これらの結果は、情報検索ユーザーにとって主要なツールとなる可能性のあるシステム、特に信頼性のファサードを考えると、かなり低いと我々は信じている。この結果が、信頼性の高い生成型検索エンジンの開発をさらに動機付け、研究者やユーザが既存の商用システムの欠点を理解するのに役立つことを願っています。 Generative search engines directly generate responses to user queries, along with in-line citations. A prerequisite trait of a trustworthy generative search engine is verifiability, i.e., systems should cite comprehensively (high citation recall; all statements are fully supported by citations) and accurately (high citation precision; every cite supports its associated statement). We conduct human evaluation to audit four popular generative search engines -- Bing Chat, NeevaAI, perplexity.ai, and YouChat -- across a diverse set of queries from a variety of sources (e.g., historical Google user queries, dynamically-collected open-ended questions on Reddit, etc.). We find that responses from existing generative search engines are fluent and appear informative, but frequently contain unsupported statements and inaccurate citations: on average, a mere 51.5% of generated sentences are fully supported by citations and only 74.5% of citations support their associated sentence. We believe that these results are concerningly low for systems that may serve as a primary tool for information-seeking users, especially given their facade of trustworthiness. 
We hope that our results further motivate the development of trustworthy generative search engines and help researchers and users better understand the shortcomings of existing commercial systems.	翻訳日:2023-10-26 01:14:42 公開日:2023-10-23
# サンプリングに基づく動き計画のための量子探索手法 Quantum Search Approaches to Sampling-Based Motion Planning ( http://arxiv.org/abs/2304.06479v4 ) ライセンス: Link先を確認	Paul Lathrop, Beth Boardman, Sonia Martínez	(参考訳) 本稿では,従来のサンプリングベースのモーションプランナーを,量子探索アルゴリズムで解くことのできるデータベース・オラクル構造として捉える新しい定式化を提案する。単純なスパース環境に対しては、ランダムな全経路解の重ね合わせを作成し、量子振幅増幅 (QAA) で確率振幅を操作し、障害物のない単一の全経路解を量子測定する量子全経路探索アルゴリズム (q-FPS) を定式化する。密な非構造環境に対しては,可能な親子接続の量子重ね合わせを生成し,QAAで確率振幅を操作し,単一の到達可能な状態を量子測定して木に追加する量子版 Rapidly Exploring Random Tree アルゴリズム q-RRT を定式化する。性能はオラクル呼び出しの数と良い量子状態を測定する確率に依存するため、これらの誤差がアルゴリズムの確率的完全性にどう影響するかを定量化する。次に,アルゴリズムにおける最適なオラクル呼び出し数を近似するために,データベース解の期待数を数値的に推定する。 q-RRTアルゴリズムを古典的実装と比較し、2次元の密なランダム格子の最大連結成分において実行時間の二次的な高速化を検証する。最後に、データベース解の期待数を制限し、それによって最適なオラクル呼び出し数を所定の数に抑える手法を評価する。 In this paper, we present a novel formulation of traditional sampling-based motion planners as database-oracle structures that can be solved via quantum search algorithms. We consider two complementary scenarios: for simpler sparse environments, we formulate the Quantum Full Path Search Algorithm (q-FPS), which creates a superposition of full random path solutions, manipulates probability amplitudes with Quantum Amplitude Amplification (QAA), and quantum measures a single obstacle free full path solution. For dense unstructured environments, we formulate the Quantum Rapidly Exploring Random Tree algorithm, q-RRT, that creates quantum superpositions of possible parent-child connections, manipulates probability amplitudes with QAA, and quantum measures a single reachable state, which is added to a tree. As performance depends on the number of oracle calls and the probability of measuring good quantum states, we quantify how these errors factor into the probabilistic completeness properties of the algorithm. We then numerically estimate the expected number of database solutions to provide an approximation of the optimal number of oracle calls in the algorithm. 
We compare the q-RRT algorithm with a classical implementation and verify quadratic run-time speedup in the largest connected component of a 2D dense random lattice. We conclude by evaluating a proposed approach to limit the expected number of database solutions and thus limit the optimal number of oracle calls to a given number.	翻訳日:2023-10-26 01:13:34 公開日:2023-10-23
# multi-annotator deep learning: 分類の確率的枠組み Multi-annotator Deep Learning: A Probabilistic Framework for Classification ( http://arxiv.org/abs/2304.02539v2 ) ライセンス: Link先を確認	Marek Herde, Denis Huseljic, Bernhard Sick	(参考訳) ディープニューラルネットワークを使って複雑な分類タスクを解くには、通常大量の注釈付きデータが必要である。しかし、エラーの多いアノテータ(例えば、crowdworkers)によって提供されると、対応するクラスラベルはうるさい。標準ディープニューラルネットワークのトレーニングは、このようなマルチアノテーションの学習設定におけるサブパーパフォーマンスをもたらす。本稿では,マルチアノテーション深層学習(MaDL)という確率的学習フレームワークを提案することでこの問題に対処する。下流の真実とアノテータのパフォーマンスモデルは、エンドツーエンドの学習アプローチで共同で訓練される。 ground truthモデルは、インスタンスの真のクラスラベルを予測することを学習し、annotatorパフォーマンスモデルは、アノテータのパフォーマンスの確率論的推定を推論する。モジュラーネットワークアーキテクチャにより、アノテータのパフォーマンス、例えばオプションのクラスやインスタンスの依存性に関する様々な仮定ができます。さらに,アノテーションが相互に関連付けられる可能性のプロキシとして,潜在空間内のアノテーションの密度を推定するために,アノテーション組込みを学習する。重み付き損失関数と共に、相関したアノテーションパターンから学習を改善する。総合評価では,マルチアノテーションによる教師あり学習に関する3つの研究課題について検討する。以上の結果から,madlの最先端のパフォーマンスと多数の関連したスパムアノテータに対するロバスト性が示された。 Solving complex classification tasks using deep neural networks typically requires large amounts of annotated data. However, corresponding class labels are noisy when provided by error-prone annotators, e.g., crowdworkers. Training standard deep neural networks leads to subpar performances in such multi-annotator supervised learning settings. We address this issue by presenting a probabilistic training framework named multi-annotator deep learning (MaDL). A downstream ground truth and an annotator performance model are jointly trained in an end-to-end learning approach. The ground truth model learns to predict instances' true class labels, while the annotator performance model infers probabilistic estimates of annotators' performances. A modular network architecture enables us to make varying assumptions regarding annotators' performances, e.g., an optional class or instance dependency. Further, we learn annotator embeddings to estimate annotators' densities within a latent space as proxies of their potentially correlated annotations. 
Together with a weighted loss function, we improve the learning from correlated annotation patterns. In a comprehensive evaluation, we examine three research questions about multi-annotator supervised learning. Our findings show MaDL's state-of-the-art performance and robustness against many correlated, spamming annotators.	翻訳日:2023-10-26 01:13:12 公開日:2023-10-23
# CoEdIT:タスク特化インストラクションチューニングによるテキスト編集 CoEdIT: Text Editing by Task-Specific Instruction Tuning ( http://arxiv.org/abs/2305.09857v2 ) ライセンス: Link先を確認	Vipul Raheja, Dhruv Kumar, Ryan Koo, Dongyeop Kang	(参考訳) 本稿では,執筆支援のための最先端テキスト編集システムであるCoEdITを紹介する。 CoEdIT は "Make the sentence simpler" や "Write it in a more neutral style" といった所望のテキストの属性を指定するユーザからの指示を受け、編集されたテキストを出力する。本稿では,テキスト編集のための多様なタスク特化命令群(合計82K命令)で微調整した大規模言語モデルを提案する。本モデルは,(1)様々なテキスト編集ベンチマークにおいて最先端のパフォーマンスを達成し,(2)命令で学習された公開済みの最大規模のLLMと比べて約60分の1のサイズでありながら競合する性能を示し,(3)未知の編集命令に一般化でき,(4)編集動作の異なる組合せを含む複合命令に一般化する能力を示す。広範な定性的および定量的な分析により、他の最先端テキスト編集モデルと比較して、書き手はCoEdITが提案する編集を好むことを示す。私たちのコード、データ、モデルはhttps://github.com/vipulraheja/coeditで公開されています。 We introduce CoEdIT, a state-of-the-art text editing system for writing assistance. CoEdIT takes instructions from the user specifying the attributes of the desired text, such as "Make the sentence simpler" or "Write it in a more neutral style," and outputs the edited text. We present a large language model fine-tuned on a diverse collection of task-specific instructions for text editing (a total of 82K instructions). Our model (1) achieves state-of-the-art performance on various text editing benchmarks, (2) is competitive with publicly available largest-sized LLMs trained on instructions while being nearly 60x smaller, (3) is capable of generalizing to unseen edit instructions, and (4) exhibits abilities to generalize to composite instructions containing different combinations of edit actions. Through extensive qualitative and quantitative analysis, we show that writers prefer the edits suggested by CoEdIT relative to other state-of-the-art text editing models. Our code, data, and models are publicly available at https://github.com/vipulraheja/coedit.	翻訳日:2023-10-26 01:06:46 公開日:2023-10-23
# 量子誤り緩和された古典シャドウ Quantum Error Mitigated Classical Shadows ( http://arxiv.org/abs/2305.04956v2 ) ライセンス: Link先を確認	Hamza Jnane, Jonathan Steinberg, Zhenyu Cai, H. Chau Nguyen, Bálint Koczor	(参考訳) 古典シャドウを用いると、量子状態$\rho$の多くの性質を非常に少ない測定で学習できる。しかし、近い将来および初期のフォールトトレラント量子コンピュータはノイズのある量子状態$\rho$しか準備できないため、理想的でノイズのない状態$\rho_{id}$の性質を効率的に学習することは大きな課題である。本研究では,単一の期待値測定の誤差を緩和するために開発された確率的エラーキャンセル (PEC),ゼロノイズ外挿 (ZNE),対称性検証 (SV) などの誤り緩和手法を検討し,古典シャドウにおける誤差の緩和へと一般化する。PECが最も自然な候補であることを見出し,次の厳密な理論的保証を持つPECシャドウの理論的枠組みを構築する。 PECシャドウは理想量子状態$\rho_{id}$の不偏推定量であり、$\rho_{id}$の多くの線形特性を同時に予測するサンプル複雑性は、誤り緩和によるサンプルオーバーヘッドである乗法的係数を除いて、従来のシャドウ手法のものと同一である。シャドウの効率的な後処理のため、このオーバーヘッドは量子ビット数に直接依存せず、ノイズのあるゲートの数とともに指数関数的に増加する。本研究で導入された幅広いツールセットは,近い将来および初期のフォールトトレラント量子コンピュータの活用に寄与する可能性がある。 Classical shadows enable us to learn many properties of a quantum state $\rho$ with very few measurements. However, near-term and early fault-tolerant quantum computers will only be able to prepare noisy quantum states $\rho$ and it is thus a considerable challenge to efficiently learn properties of an ideal, noise free state $\rho_{id}$. We consider error mitigation techniques, such as Probabilistic Error Cancellation (PEC), Zero Noise Extrapolation (ZNE) and Symmetry Verification (SV) which have been developed for mitigating errors in single expected value measurements and generalise them for mitigating errors in classical shadows. We find that PEC is the most natural candidate and thus develop a thorough theoretical framework for PEC shadows with the following rigorous theoretical guarantees: PEC shadows are an unbiased estimator for the ideal quantum state $\rho_{id}$; the sample complexity for simultaneously predicting many linear properties of $\rho_{id}$ is identical to that of the conventional shadows approach up to a multiplicative factor which is the sample overhead due to error mitigation. Due to efficient post-processing of shadows, this overhead does not depend directly on the number of qubits but rather grows exponentially with the number of noisy gates. 
The broad set of tools introduced in this work may be instrumental in exploiting near-term and early fault-tolerant quantum computers: We demonstrate in detailed numerical simulations a range of practical applications of quantum computers that will significantly benefit from our techniques.	翻訳日:2023-10-26 01:04:53 公開日:2023-10-23
# 音韻的多変量形態変化における構成データ増大の理解 Understanding Compositional Data Augmentation in Typologically Diverse Morphological Inflection ( http://arxiv.org/abs/2305.13658v2 ) ライセンス: Link先を確認	Farhan Samir and Miikka Silfverberg	(参考訳) データ拡張技術は、データ空間を克服するために、低リソースの自動モーフィックインフレクションに広く利用されている。しかし、これらの技法の完全な意味はいまだに理解されていない。本研究では,StemCorrupt (Silfverberg et al., 2017; Anastasopoulos and Neubig, 2019)の理論的側面を明らかにすることを目的とした。まず,情報理論的な分析を行い,ステムコラプトが,特にステムと接点間のスプリアス相関を排除し,構成的一般化を改善できると主張した。理論的解析により、stemcorruptがこれらのスプリアス相関を減少させるサンプル効率がさらに研究される。その結果,StemCorruptのデータ効率は,多種多様であり,予測の不確実性が高いデータポイントのサブセットを選択することで著しく向上することが示された。しかし,データ選択戦略の選択に類型的特徴が与える影響についても検討し,高いアロモルファスと音韻的変化を取り入れた言語は,高い不確実性を有する合成例の恩恵を受けにくいことを見出した。本研究は,自然言語形態のスペクトル全体にわたって最適な性能を確保するために,さらなる研究が必要であることを強調する。 Data augmentation techniques are widely used in low-resource automatic morphological inflection to overcome data sparsity. However, the full implications of these techniques remain poorly understood. In this study, we aim to shed light on the theoretical aspects of the prominent data augmentation strategy StemCorrupt (Silfverberg et al., 2017; Anastasopoulos and Neubig, 2019), a method that generates synthetic examples by randomly substituting stem characters in gold standard training examples. To begin, we conduct an information-theoretic analysis, arguing that StemCorrupt improves compositional generalization by eliminating spurious correlations between morphemes, specifically between the stem and the affixes. Our theoretical analysis further leads us to study the sample efficiency with which StemCorrupt reduces these spurious correlations. Through evaluation across seven typologically distinct languages, we demonstrate that selecting a subset of datapoints with both high diversity and high predictive uncertainty significantly enhances the data-efficiency of StemCorrupt. 
However, we also explore the impact of typological features on the choice of the data selection strategy and find that languages incorporating a high degree of allomorphy and phonological alternations derive less benefit from synthetic examples with high uncertainty. We attribute this effect to phonotactic violations induced by StemCorrupt, emphasizing the need for further research to ensure optimal performance across the entire spectrum of natural language morphology.	翻訳日:2023-10-26 00:56:06 公開日:2023-10-23
# ByteSized32: テキストゲームとして表現されるタスク固有の世界モデルを生成するためのコーパスとチャレンジタスク ByteSized32: A Corpus and Challenge Task for Generating Task-Specific World Models Expressed as Text Games ( http://arxiv.org/abs/2305.14879v2 ) ライセンス: Link先を確認	Ruoyao Wang, Graham Todd, Eric Yuan, Ziang Xiao, Marc-Alexandre Côté, Peter Jansen	(参考訳) 本研究では,科学・常識推論タスクの明示的で解釈可能な対話的世界モデルを生成する言語モデルの能力について検討する。私たちはこれを、数百行のPythonコードで表現されたテキストゲームを生成するタスクとして運用します。この作業を容易にするため、我々は、推論中心の32個のテキストゲーム(合計2万行のPythonコード)からなるコーパスByteSized32(コード:github.com/cognitiveailab/BYTESIZED32)を紹介する。 GPT-4はこれらのゲームをシングルショットのインコンテキスト学習のテンプレートとして使用でき、28%のケースで未知のトピックについて実行可能なゲームを生成できることを実証的に示す。プログラムエラーに対する自己反省を許すと、ゲームの実行可能率は57%に大きく向上する。シミュレーション忠実度の評価は労働集約的であるが,ゲーム忠実度,技術的妥当性,タスク仕様の遵守,クリア可能性(winnability)を評価するための一連の自動メトリクスを導入し,専門家による人手評価との高い一致を示した。我々はこれを、世界モデリングとコード生成の接点におけるさらなる開発を促すためのチャレンジタスクとして提示する。 In this work, we investigate the capacity of language models to generate explicit, interpretable, and interactive world models of scientific and common-sense reasoning tasks. We operationalize this as a task of generating text games, expressed as hundreds of lines of Python code. To facilitate this task, we introduce ByteSized32 (Code: github.com/cognitiveailab/BYTESIZED32), a corpus of 32 reasoning-focused text games totaling 20k lines of Python code. We empirically demonstrate that GPT-4 can use these games as templates for single-shot in-context learning, successfully producing runnable games on unseen topics in 28% of cases. When allowed to self-reflect on program errors, game runnability substantially increases to 57%. While evaluating simulation fidelity is labor-intensive, we introduce a suite of automated metrics to assess game fidelity, technical validity, adherence to task specifications, and winnability, showing a high degree of agreement with expert human ratings. We pose this as a challenge task to spur further development at the juncture of world modeling and code generation.	翻訳日:2023-10-26 00:46:17 公開日:2023-10-23
# GPT-4を活用した翻訳の自動後編集 Leveraging GPT-4 for Automatic Translation Post-Editing ( http://arxiv.org/abs/2305.14878v2 ) ライセンス: Link先を確認	Vikas Raunak, Amr Sharaf, Yiren Wang, Hany Hassan Awadallah, Arul Menezes	(参考訳) ニューラル機械翻訳(NMT)は機械翻訳(MT)の主要なアプローチであるが、NMTモデルの出力は依然として、エラーの修正と重要な設定下での品質向上のために翻訳後編集を必要とする。本研究では,Large Language Models (LLMs) を用いた直接的な翻訳後編集のタスクを形式化し,GPT-4を用いて複数の言語ペアにわたりNMT出力を自動的に後編集する方法について検討する。以上の結果から,GPT-4は翻訳後編集に長けており,翻訳の全体的な品質向上や,翻訳における様々な種類の重大な誤りの除去に役立つ,有意義で信頼できる編集を生成することが示された。特に,編集の信頼性を評価する人手評価では,GPT-4は従来の最先端LLMよりも大幅に改善されている。注目すべき点として,GPT-4に基づく後編集を用いて,WMT-22の英語-中国語,英語-ドイツ語,中国語-英語,ドイツ語-英語の言語ペアにおいて,最先端のMT品質指標による評価で最先端性能を上回った。しかし, GPT-4は幻覚を含む編集を生成しうることも示し, 専門家の翻訳ポストエディターとしての使用に注意を促す。 While Neural Machine Translation (NMT) represents the leading approach to Machine Translation (MT), the outputs of NMT models still require translation post-editing to rectify errors and enhance quality under critical settings. In this work, we formalize the task of direct translation post-editing with Large Language Models (LLMs) and explore the use of GPT-4 to automatically post-edit NMT outputs across several language pairs. Our results demonstrate that GPT-4 is adept at translation post-editing, producing meaningful and trustworthy edits to translations that help improve its general quality as well as remove different classes of major errors in translations. In particular, human evaluations on assessing edit trustworthiness show that GPT-4 exhibits a large improvement over the prior state-of-the-art LLM. Notably, we improve upon state-of-the-art performance on WMT-22 English-Chinese, English-German, Chinese-English and German-English language pairs using GPT-4 based post-editing, as evaluated by state-of-the-art MT quality metrics. However, we also show that GPT-4 could produce hallucinated edits, thereby urging caution in its use as an expert translation post-editor.	翻訳日:2023-10-26 00:45:54 公開日:2023-10-23
# SMT 2.0:階層および混合変数ガウスプロセスに焦点を当てた代理モデリングツールボックス SMT 2.0: A Surrogate Modeling Toolbox with a focus on Hierarchical and Mixed Variables Gaussian Processes ( http://arxiv.org/abs/2305.13998v3 ) ライセンス: Link先を確認	Paul Saves and Remi Lafage and Nathalie Bartoli and Youssef Diouane and Jasper Bussemaker and Thierry Lefebvre and John T. Hwang and Joseph Morlier and Joaquim R. R. A. Martins	(参考訳) Surrogate Modeling Toolbox (SMT)はオープンソースのPythonパッケージで、一連のサロゲートモデリングメソッド、サンプリング技術、サンプル問題の集合を提供する。本稿では、ツールボックスに大幅なアップグレードと新機能を導入したSMT 2.0について述べる。このリリースには、混合変数サロゲートモデルと階層変数を扱う機能が追加されている。これらのタイプの変数は、いくつかの代理モデリングアプリケーションでますます重要になっている。 SMT 2.0はサンプリング方法を拡張し、新しいサロゲートモデルを追加し、分散計算とKrigingのカーネルデリバティブを演算することでSMTを改善した。このリリースには、ノイズを処理し、マルチフィデリティデータを使用する新しい機能も含まれている。我々の知る限り、SMT 2.0は階層的および混合的な入力に対するサロゲートモデルを提案する最初のオープンソースサロゲートライブラリである。このオープンソースソフトウェアは、新しいbsdライセンスの下で配布される。 The Surrogate Modeling Toolbox (SMT) is an open-source Python package that offers a collection of surrogate modeling methods, sampling techniques, and a set of sample problems. This paper presents SMT 2.0, a major new release of SMT that introduces significant upgrades and new features to the toolbox. This release adds the capability to handle mixed-variable surrogate models and hierarchical variables. These types of variables are becoming increasingly important in several surrogate modeling applications. SMT 2.0 also improves SMT by extending sampling methods, adding new surrogate models, and computing variance and kernel derivatives for Kriging. This release also includes new functions to handle noisy and use multifidelity data. To the best of our knowledge, SMT 2.0 is the first open-source surrogate library to propose surrogate models for hierarchical and mixed inputs. This open-source software is distributed under the New BSD license.	翻訳日:2023-10-26 00:44:19 公開日:2023-10-23
# 文脈認識ニューラルマシン翻訳の課題 Challenges in Context-Aware Neural Machine Translation ( http://arxiv.org/abs/2305.13751v2 ) ライセンス: Link先を確認	Linghao Jin, Jacqueline He, Jonathan May, Xuezhe Ma	(参考訳) 文脈認識型ニューラルマシン翻訳は、文レベルのコンテキストを超えた情報を活用して、文間会話の依存関係を解決し、文書レベルの翻訳品質を改善する。しかし、よく理解された直感にもかかわらず、ほとんどの文脈対応翻訳モデルは、文レベルシステムよりもわずかに改善されている。本研究では,談話現象,文脈利用,モデルアーキテクチャ,文書レベルの評価など,この分野の進展を妨げるいくつかの課題について検討する。これらの問題に対処するために,パラパラグラフ(パラパラグラフ)翻訳という,より現実的な文書レベルの翻訳環境を提案し,今後の研究を促進するために,漢文小説の新しいデータセットを収集する。 Context-aware neural machine translation involves leveraging information beyond sentence-level context to resolve inter-sentential discourse dependencies and improve document-level translation quality, and has given rise to a number of recent techniques. However, despite well-reasoned intuitions, most context-aware translation models show only modest improvements over sentence-level systems. In this work, we investigate several challenges that impede progress within this field, relating to discourse phenomena, context usage, model architectures, and document-level evaluation. To address these problems, we propose a more realistic setting for document-level translation, called paragraph-to-paragraph (para2para) translation, and collect a new dataset of Chinese-English novels to promote future research.	翻訳日:2023-10-26 00:43:26 公開日:2023-10-23
# ToMChallenges: 心の理論を探求するための原則ガイド型データセットと多変量評価タスク ToMChallenges: A Principle-Guided Dataset and Diverse Evaluation Tasks for Exploring Theory of Mind ( http://arxiv.org/abs/2305.15068v2 ) ライセンス: Link先を確認	Xiaomeng Ma, Lingyu Gao, Qihui Xu	(参考訳) 異なる個人の精神状態を理解する能力である心の理論(ToM)は、多くの実践的応用に不可欠である。大規模言語モデル (LLM) の開発により,ToM のタスクの実行が可能であるかどうかが議論されている。従来の研究では、異なるタスクと、LSM上でToMをテストするためのプロンプトが用いられており、結果は矛盾している。本研究では,Sally-Anne and Smarties テストに基づく精神理論を多種多様なタスクで総合的に評価するためのデータセットであるToMChallengesを提案する。また,回答評価プロセスの合理化を図ったオートグレーダを提案する。 davinci、turbo、gpt-4の3機種をテストした。評価結果と誤差分析により,LLMはプロンプトやタスク間で不整合な挙動を示す。 ToMタスクの堅牢な実行は、LLMにとって依然として課題である。さらに,本論文では,LLMにおけるToM評価の意識を高めることを目的としており,LLMの能力を評価するために,ToMタスクのプロンプトやタスクの設計方法について,さらに議論したいと考えている。 Theory of Mind (ToM), the capacity to comprehend the mental states of distinct individuals, is essential for numerous practical applications. With the development of large language models (LLMs), there is a heated debate about whether they are able to perform ToM tasks. Previous studies have used different tasks and prompts to test the ToM on LLMs and the results are inconsistent: some studies asserted these models are capable of exhibiting ToM, while others suggest the opposite. In this study, We present ToMChallenges, a dataset for comprehensively evaluating the Theory of Mind based on the Sally-Anne and Smarties tests with a diverse set of tasks. In addition, we also propose an auto-grader to streamline the answer evaluation process. We tested three models: davinci, turbo, and gpt-4. Our evaluation results and error analyses show that LLMs have inconsistent behaviors across prompts and tasks. Performing the ToM tasks robustly remains a challenge for the LLMs. In addition, our paper wants to raise awareness in evaluating the ToM in LLMs and we want to invite more discussion on how to design the prompts and tasks for ToM tasks that can better assess the LLMs' ability.	翻訳日:2023-10-26 00:34:07 公開日:2023-10-23
# Dior-CVAE:変分ダイアログ生成のための事前学習言語モデルと拡散先行 Dior-CVAE: Pre-trained Language Models and Diffusion Priors for Variational Dialog Generation ( http://arxiv.org/abs/2305.15025v2 ) ライセンス: Link先を確認	Tianyu Yang and Thy Thy Tran and Iryna Gurevych	(参考訳) 現在の変分ダイアログモデルは、確率分布と後方分布をパラメータ化するために事前学習言語モデル(plm)を採用している。しかし、事前分布に基づくガウスの仮定はこれらの分布と相容れないため、生成された応答の多様性が制限される。これらのモデルはまた後方崩壊、すなわちデコーダは潜在変数を無視し、クロスアテンション機構を介してエンコーダでキャプチャされた情報に直接アクセスする傾向がある。本稿では,これらの課題に対処するために,拡散前の階層型条件付き変分オートエンコーダであるDior-CVAEを提案する。拡散モデルを用いて、従来の分布の複雑さと、PLMが生成した分布との整合性を高める。また,応答生成のための潜在変数の使用を積極的に奨励するクロスアテンション機構へのメモリドロップアウトを提案する。一般に使われている2つのオープンドメインダイアログデータセットを用いた実験により,大規模ダイアログ事前学習を必要とせずに,より多様な応答を生成できることがわかった。コードはhttps://github.com/ukplab/dior-cvaeで入手できる。 Current variational dialog models have employed pre-trained language models (PLMs) to parameterize the likelihood and posterior distributions. However, the Gaussian assumption made on the prior distribution is incompatible with these distributions, thus restricting the diversity of generated responses. These models also suffer from posterior collapse, i.e., the decoder tends to ignore latent variables and directly access information captured in the encoder through the cross-attention mechanism. In this work, we propose Dior-CVAE, a hierarchical conditional variational autoencoder (CVAE) with diffusion priors to address these challenges. We employ a diffusion model to increase the complexity of the prior distribution and its compatibility with the distributions produced by a PLM. Also, we propose memory dropout to the cross-attention mechanism, which actively encourages the use of latent variables for response generation. Overall, experiments across two commonly used open-domain dialog datasets show that our method can generate more diverse responses without large-scale dialog pre-training. Code is available at https://github.com/UKPLab/dior-cvae.	翻訳日:2023-10-26 00:33:48 公開日:2023-10-23
# HierarchicalEOM.jl: オープン量子系における階層的運動方程式のための効率的なJuliaフレームワーク HierarchicalEOM.jl: An efficient Julia framework for hierarchical equations of motion in open quantum systems ( http://arxiv.org/abs/2306.07522v4 ) ライセンス: Link先を確認	Yi-Te Huang, Po-Chen Kuo, Neill Lambert, Mauro Cirio, Simon Cross, Shen-Liang Yang, Franco Nori, Yueh-Nan Chen	(参考訳) 階層的運動方程式(HEOM)アプローチは、複数のボソニック環境とフェルミオン環境に同時に結合した系の縮約ダイナミクスを記述できる。 HEOM法で系と環境の相互作用を厳密に記述する際の複雑さは、通常、時間のかかる計算と大きなメモリコストをもたらす。本稿では、HEOMアプローチを統合するJuliaフレームワークであるHierarchicalEOM.jlというオープンソースのソフトウェアパッケージを紹介する。 HierarchicalEOM.jlは、ボソニックおよびフェルミオンのスペクトル、定常状態、およびすべての補助密度演算子(ADO)の拡張空間における完全なダイナミクスを計算するメソッド群を備えている。 ADOのマルチインデックスに必要な処理は、ユーザフレンドリーなインターフェースによって実現される。単一不純物アンダーソンモデルと、ボソニックおよびフェルミオンの貯留層と相互作用する超強結合電荷キャビティ系を解析し,パッケージの機能を例示する。 HierarchicalEOM.jlは、本パッケージの基礎となっているQuantum Toolbox in Python(QuTiP)の対応するメソッドに比べて、大幅な高速化を達成する。 The hierarchical equations of motion (HEOM) approach can describe the reduced dynamics of a system simultaneously coupled to multiple bosonic and fermionic environments. The complexity of exactly describing the system-environment interaction with the HEOM method usually results in time-consuming calculations and a large memory cost. Here, we introduce an open-source software package called HierarchicalEOM.jl: a Julia framework integrating the HEOM approach. HierarchicalEOM.jl features a collection of methods to compute bosonic and fermionic spectra, stationary states, and the full dynamics in the extended space of all auxiliary density operators (ADOs). The required handling of the ADOs multi-indexes is achieved through a user-friendly interface. We exemplify the functionalities of the package by analyzing a single impurity Anderson model, and an ultra-strongly coupled charge-cavity system interacting with bosonic and fermionic reservoirs. HierarchicalEOM.jl achieves a significant speedup with respect to the corresponding method in the Quantum Toolbox in Python (QuTiP), upon which this package is founded.	翻訳日:2023-10-26 00:25:02 公開日:2023-10-23
# ブラックボックス変分推論の線形収束:着陸を控えるべきか? Linear Convergence of Black-Box Variational Inference: Should We Stick the Landing? ( http://arxiv.org/abs/2307.14642v2 ) ライセンス: Link先を確認	Kyurae Kim, Yian Ma, and Jacob R. Gardner	(参考訳) 制御変数を持つブラックボックス変分推論(bbvi)、特にスティッキング・ザ・ランディング(stl)推定器は、完全変分族仕様の下で幾何学的(伝統的に「線形」と呼ばれる)に収束する。特に、不特定変分族を含むSTL推定器の勾配分散の2次境界を証明した。二次分散条件に関する以前の研究と組み合わさって、これはプロジェクテッド確率勾配勾配を用いたBBVIの収束を直接意味する。また,正規閉形式エントロピー勾配推定器の既存解析を改善し,stl推定器との比較を可能にし,その両方に対して明示的な非漸近的複雑性を保証する。 We prove that black-box variational inference (BBVI) with control variates, particularly the sticking-the-landing (STL) estimator, converges at a geometric (traditionally called "linear") rate under perfect variational family specification. In particular, we prove a quadratic bound on the gradient variance of the STL estimator, one which encompasses misspecified variational families. Combined with previous works on the quadratic variance condition, this directly implies convergence of BBVI with the use of projected stochastic gradient descent. We also improve existing analysis on the regular closed-form entropy gradient estimators, which enables comparison against the STL estimator and provides explicit non-asymptotic complexity guarantees for both.	翻訳日:2023-10-26 00:07:29 公開日:2023-10-23
# クラスタ対応半教師付き学習:クラスタリングを学習する関係知識蒸留 Cluster-aware Semi-supervised Learning: Relational Knowledge Distillation Provably Learns Clustering ( http://arxiv.org/abs/2307.11030v2 ) ライセンス: Link先を確認	Yijun Dong, Kevin Miller, Qi Lei, Rachel Ward	(参考訳) 教師と生徒のモデル間の特徴(関係)にマッチする(関係)知識蒸留の実証的成功と実用的意義にもかかわらず、対応する理論解釈は様々な知識蒸留パラダイムに限定されている。本研究では, 半教師付き分類問題に着目し, 関係知識蒸留(RKD)の理論的理解に向けて最初の一歩を踏み出した。まず,教師モデルによって示される集団誘発グラフ上で,rkdをスペクトルクラスタリングとしてキャスティングすることから始める。予測値と基底値のクラスタリングのばらつきを定量化するクラスタリングエラーの概念を用いて,人口を超えたrkdがクラスタリングエラーの低減につながることを示す。さらに,非ラベルサンプルを限定してrkdに限定したサンプル複雑性を提供する。半教師付き学習では,クラスタ認識型半教師付き学習の一般的なフレームワークを通じて,クラスタリングエラーを想定するRKDのラベル効率をさらに向上する。最後に、このクラスタ対応フレームワークにデータの強化一貫性の規則化を統一することにより、正確なクラスタリングを学習する共通の効果にもかかわらず、rkdはスペクトルクラスタリングを通じて「グローバル」な視点を促進するが、一貫性の規則化は拡張を通じた「ローカル」な視点に焦点を当てる。 Despite the empirical success and practical significance of (relational) knowledge distillation that matches (the relations of) features between teacher and student models, the corresponding theoretical interpretations remain limited for various knowledge distillation paradigms. In this work, we take an initial step toward a theoretical understanding of relational knowledge distillation (RKD), with a focus on semi-supervised classification problems. We start by casting RKD as spectral clustering on a population-induced graph unveiled by a teacher model. Via a notion of clustering error that quantifies the discrepancy between the predicted and ground truth clusterings, we illustrate that RKD over the population provably leads to low clustering error. Moreover, we provide a sample complexity bound for RKD with limited unlabeled samples. For semi-supervised learning, we further demonstrate the label efficiency of RKD through a general framework of cluster-aware semi-supervised learning that assumes low clustering errors. 
Finally, by unifying data augmentation consistency regularization into this cluster-aware framework, we show that despite the common effect of learning accurate clusterings, RKD facilitates a "global" perspective through spectral clustering, whereas consistency regularization focuses on a "local" perspective via expansion.	翻訳日:2023-10-26 00:04:48 公開日:2023-10-23
# 発作予測のためのパスシグネチャ Path Signatures for Seizure Forecasting ( http://arxiv.org/abs/2308.09312v2 ) ライセンス: Link先を確認	Jonas F. Haderlein, Andre D. H. Peterson, Parvin Zarei Eskikand, Mark J. Cook, Anthony N. Burkitt, Iven M. Y. Mareels, David B. Grayden	(参考訳) 過去の観測行動(時系列)から将来のシステム行動を予測することは、科学と工学の基礎である。計算神経科学において、脳波データを用いた脳活動測定による将来のてんかん発作の予測は、多くの研究努力にもかかわらずほとんど未解決のままである。てんかん患者の頭蓋間脳波計測値を用いた長手および最先端のデータセットに基づいて, 発作を患者固有の方法で予測するための予測特徴(バイオマーカー)の自動発見を検討する。この目的のために,データストリーム解析における最近の進展であるパスシグネチャを用いて,測定された時系列から発作予測へのマッピングを行う。予測器は線形分類に基づいており、ここではスパーシティの制約が加わり、差し迫った発作を伴わずとも時系列を識別できる。このアプローチは、現在の機械学習と同等な予測性能を維持しながら、シンプルさとカスタマイズの容易さを主な利点とする一般的なパターン認識パイプラインへの一歩と見なすことができる。それにもかかわらず、パスシグネチャ法にはいくつかの強力な理論的保証があるが、適切な時系列統計は、我々の発作予測の文脈で本質的に同じ結果が得られる。これは、本質的な複雑さと非定常性のため、脳のダイナミクスは利用可能な脳波測定データから識別できず、より具体的には、てんかん発作の予測は脳波測定データだけでは確実に達成されないことを示唆している。 Predicting future system behaviour from past observed behaviour (time series) is fundamental to science and engineering. In computational neuroscience, the prediction of future epileptic seizures from brain activity measurements, using EEG data, remains largely unresolved despite much dedicated research effort. Based on a longitudinal and state-of-the-art data set using intercranial EEG measurements from people with epilepsy, we consider the automated discovery of predictive features (or biomarkers) to forecast seizures in a patient-specific way. To this end, we use the path signature, a recent development in the analysis of data streams, to map from measured time series to seizure prediction. The predictor is based on linear classification, here augmented with sparsity constraints, to discern time series with and without an impending seizure. This approach may be seen as a step towards a generic pattern recognition pipeline where the main advantages are simplicity and ease of customisation, while maintaining forecasting performance on par with modern machine learning. 
Nevertheless, it turns out that although the path signature method has some powerful theoretical guarantees, appropriate time series statistics can achieve essentially the same results in our context of seizure prediction. This suggests that, due to their inherent complexity and non-stationarity, the brain's dynamics are not identifiable from the available EEG measurement data, and, more concretely, epileptic episode prediction is not reliably achieved using EEG measurement data alone.	翻訳日:2023-10-25 23:55:15 公開日:2023-10-23
# TARJAMAT:10種類のアラビア語の機械翻訳における Bard と ChatGPT の評価 TARJAMAT: Evaluation of Bard and ChatGPT on Machine Translation of Ten Arabic Varieties ( http://arxiv.org/abs/2308.03051v2 ) ライセンス: Link先を確認	Karima Kadaoui, Samar M. Magdy, Abdul Waheed, Md Tawkat Islam Khondaker, Ahmed Oumar El-Shangiti, El Moatez Billah Nagoudi, Muhammad Abdul-Mageed	(参考訳) chatgpt や bard のような命令に精通した大規模言語モデル (llm) の多言語習熟度は高いが、これらのモデルの言語的排他性は未だ不十分である。この制約を考慮し,10種類のアラビア語の機械翻訳能力について, Bard と ChatGPT (GPT-3.5 と GPT-4 を併用) を徹底的に評価した。本評価は,古典アラビア語 (ca) や現代標準アラビア語 (msa) など,様々なアラビア語の方言を対象とする。我々の分析によると、LLMは、最小の公開データセットが存在する方言では困難に直面する可能性があるが、平均的には、既存の商用システムよりも優れた方言翻訳者である。しかしCAとMSAでは、命令調整されたLLMがGoogle Translateなどの商用システムに遅れを取っている。最後に,比較的最近のモデルであるBardの有効性を,翻訳作業中の人間の指示に従って検討する。解析の結果,翻訳文脈における人間の指示と整合するbardの周辺的能力が明らかになった。総じて, LLMの普及は包括的ではなく, 多様な地域社会の言語的, 文化的な複雑さに対処する能力に限られていることが示唆された。 Despite the purported multilingual proficiency of instruction-finetuned large language models (LLMs) such as ChatGPT and Bard, the linguistic inclusivity of these models remains insufficiently explored. Considering this constraint, we present a thorough assessment of Bard and ChatGPT (encompassing both GPT-3.5 and GPT-4) regarding their machine translation proficiencies across ten varieties of Arabic. Our evaluation covers diverse Arabic varieties such as Classical Arabic (CA), Modern Standard Arabic (MSA), and several country-level dialectal variants. Our analysis indicates that LLMs may encounter challenges with dialects for which minimal public datasets exist, but on average are better translators of dialects than existing commercial systems. On CA and MSA, instruction-tuned LLMs, however, trail behind commercial systems such as Google Translate. Finally, we undertake a human-centric study to scrutinize the efficacy of the relatively recent model, Bard, in following human instructions during translation tasks. Our analysis reveals a circumscribed capability of Bard in aligning with human instructions in translation contexts. 
Collectively, our findings underscore that prevailing LLMs remain far from inclusive, with only limited ability to cater for the linguistic and cultural intricacies of diverse communities.	翻訳日:2023-10-25 23:53:47 公開日:2023-10-23
# 関係指向型:因果知識対応型AGIに向けて Relation-Oriented: Toward Causal Knowledge-Aligned AGI ( http://arxiv.org/abs/2307.16387v9 ) ライセンス: Link先を確認	Jia Li, Xiang Li	(参考訳) 現在、観察指向パラダイムは、AIベースのモデルを含む関係学習モデルを支配しているが、これらは本質的に、時間的に非線形な効果を持つ関係を考慮していない。代わりに、このパラダイムは「時間次元」を線形観測タイムラインとして単純化し、特定のタイムスタンプによる効果の事前識別を必要とする。このような制約は動的効果に対する識別可能性の難しさをもたらし、それによってモデル化された関係の潜在的に重要な時間的非線形性を見落としてしまう。さらに、時間的特徴空間の多次元的性質はほとんど無視されており、関係モデルの堅牢性と一般化性を著しく損なう固有のバイアスが導入されている。この制限は、大規模なAIベースの因果的応用において特に顕著である。これらの問題を次元的枠組みのレンズを通して調べると、関係を索引とする我々の知識理解と現在のモデリングパラダイムの間に根本的な不整合が同定される。これに対処するために,因果知識に整合した汎用人工知能(AGI)の開発を促進することを目的とした,新たな関係指向パラダイムが提案されている。その方法論的対応物として,提案したリレーショナルインデックス表現学習(RIRL)を有効性実験により検証した。 Observation-Oriented paradigm currently dominates relationship learning models, including AI-based ones, which inherently do not account for relationships with temporally nonlinear effects. Instead, this paradigm simplifies the "temporal dimension" to be a linear observational timeline, necessitating the prior identification of effects with specific timestamps. Such constraints lead to identifiability difficulties for dynamical effects, thereby overlooking the potentially crucial temporal nonlinearity of the modeled relationship. Moreover, the multi-dimensional nature of Temporal Feature Space is largely disregarded, introducing inherent biases that seriously compromise the robustness and generalizability of relationship models. This limitation is particularly pronounced in large AI-based causal applications. Examining these issues through the lens of a dimensionality framework, a fundamental misalignment is identified between our relation-indexing comprehension of knowledge and the current modeling paradigm. To address this, a new Relation-Oriented paradigm is raised, aimed at facilitating the development of causal knowledge-aligned Artificial General Intelligence (AGI). As its methodological counterpart, the proposed Relation-Indexed Representation Learning (RIRL) is validated through efficacy experiments.	翻訳日:2023-10-25 23:52:45 公開日:2023-10-23
# 2つのミキサーと不確実性定量化を備えた量子近似ベイズ最適化アルゴリズム Quantum Approximate Bayesian Optimization Algorithms with Two Mixers and Uncertainty Quantification ( http://arxiv.org/abs/2307.16335v2 ) ライセンス: Link先を確認	Jungin E. Kim and Yan Wang	(参考訳) 量子近似最適化アルゴリズムの探索効率は、アルゴリズムの古典的側面と量子的側面の両方に依存する。近年,2つのミキサーを含む量子近似ベイズ最適化アルゴリズム (QABOA) が開発され,古典最適化器のサンプリング効率を向上させるためにサロゲートベースのベイズ最適化が適用されている。連続時間型量子ウォークミキサーは探索の促進に使われ、一般化グローバーミキサーは活用の改善に使われている。本稿では,探索効率をさらに向上させるQABOAの拡張を提案する。探索効率は2つの側面により向上する。まず、探索用と活用用の2つのミキサーを交互に適用する。第二に、量子回路の不確実性を、基底状態分布の尖度に基づく新しい量子Matérnカーネルで定量化し、最適値を得る確率を高める。不確実性定量化の有無それぞれについて、提案する2ミキサーQABOAを、5つの離散問題と4つの混合整数問題で3つの単一ミキサーQABOAと比較する。その結果, 不確実性定量化を備えた2ミキサーQABOAは, 9つの問題のうち5つにおいて, 効率と一貫性の面で最も優れた性能を示した。また,一般化グローバーミキサーを用いたQABOAは,単一ミキサーアルゴリズムの中で最高の性能を示し,探索効率向上における活用の利点と、探索と活用の動的バランスの重要性を示した。 The searching efficiency of the quantum approximate optimization algorithm is dependent on both the classical and quantum sides of the algorithm. Recently a quantum approximate Bayesian optimization algorithm (QABOA) that includes two mixers was developed, where surrogate-based Bayesian optimization is applied to improve the sampling efficiency of the classical optimizer. A continuous-time quantum walk mixer is used to enhance exploration, and the generalized Grover mixer is also applied to improve exploitation. In this paper, an extension of QABOA is proposed to further improve its searching efficiency. The searching efficiency is enhanced through two aspects. First, two mixers, including one for exploration and the other for exploitation, are applied in an alternating fashion. Second, uncertainty of the quantum circuit is quantified with a new quantum Matérn kernel based on the kurtosis of the basis state distribution, which increases the chance of obtaining the optimum. The proposed new two-mixer QABOAs with and without uncertainty quantification are compared with three single-mixer QABOAs on five discrete and four mixed-integer problems. The results show that the proposed two-mixer QABOA with uncertainty quantification has the best performance in efficiency and consistency for five out of the nine tested problems. The results also show that QABOA with the generalized Grover mixer performs the best among the single-mixer algorithms, thereby demonstrating the benefit of exploitation and the importance of dynamic exploration-exploitation balance in improving searching efficiency.	翻訳日:2023-10-25 23:52:26 公開日:2023-10-23
# 粒子フロー再構成のためのスケーラブルニューラルネットワークモデルとテラスケールデータセット Scalable neural network models and terascale datasets for particle-flow reconstruction ( http://arxiv.org/abs/2309.06782v2 ) ライセンス: Link先を確認	Joosep Pata, Eric Wulff, Farouk Mokhtar, David Southwick, Mengke Zhang, Maria Girone, Javier Duarte	(参考訳) 高粒度検出器シミュレーションに基づいて,高エネルギー電子-陽電子衝突におけるフルイベント再構成のためのスケーラブルな機械学習モデルについて検討した。粒子フロー(PF)再構成は、飛跡とカロリメータのクラスタまたはヒットを用いた教師あり学習タスクとして定式化することができる。グラフニューラルネットワークとカーネルベースのトランスフォーマーを比較し,いずれも2次のメモリ割り当てと計算コストを回避しつつ,現実的なPF再構成を実現することを実証した。スーパーコンピュータ上でのハイパーパラメータチューニングは, モデルの物理性能を大幅に向上させ, ジェット横運動量分解能をベースラインに比べて最大50%向上させることを示した。その結果得られたモデルは、Nvidia、AMD、Intel Habanaカードをサポートし、ハードウェアプロセッサ間で高い可搬性を持つ。最後に,飛跡とカロリメータのヒットからなる高粒度入力でモデルをトレーニングできることを示し,その結果,ベースラインと競合する物理性能が得られることを示した。研究を再現するためのデータセットとソフトウェアは、findable、accessible、interoperable、reusable(FAIR)の原則に従って公開されている。 We study scalable machine learning models for full event reconstruction in high-energy electron-positron collisions based on a highly granular detector simulation. Particle-flow (PF) reconstruction can be formulated as a supervised learning task using tracks and calorimeter clusters or hits. We compare a graph neural network and kernel-based transformer and demonstrate that both avoid quadratic memory allocation and computational cost while achieving realistic PF reconstruction. We show that hyperparameter tuning on a supercomputer significantly enhances the physics performance of the models, improving the jet transverse momentum resolution by up to 50% compared to the baseline. The resulting model is highly portable across hardware processors, supporting Nvidia, AMD, and Intel Habana cards. Finally, we demonstrate that the model can be trained on highly granular inputs consisting of tracks and calorimeter hits, resulting in a competitive physics performance with the baseline. Datasets and software to reproduce the studies are published following the findable, accessible, interoperable, and reusable (FAIR) principles.	翻訳日:2023-10-25 23:43:58 公開日:2023-10-23
# テキストの曖昧さと主観性の測定: 記号的VAGOからニューラルVAGOへ Measuring vagueness and subjectivity in texts: from symbolic to neural VAGO ( http://arxiv.org/abs/2309.06132v2 ) ライセンス: Link先を確認	Benjamin Icard, Vincent Claveau, Ghislain Atemezing and Paul Égré	(参考訳) テキストにおける曖昧さと主観性の自動測定に対するハイブリッド手法を提案する。まず、エキスパートシステムVAGOを紹介し、事実文と意見文からなる小さなベンチマークで例示し、次に、より大規模なフランス語の報道コーパスFreSaDaでテストして、通常のテキストに比べて風刺テキストで主観的マーカーの出現頻度が高いことを確認する。次に、BERTのようなアーキテクチャに基づくVAGOのニューラルクローンを構築し,FreSaDa上で得られた記号的VAGOのスコアで学習する。説明可能性ツール(LIME)を用いて、記号版の語彙の拡充と他言語版の作成における、このニューラル版の有用性を示す。 We present a hybrid approach to the automated measurement of vagueness and subjectivity in texts. We first introduce the expert system VAGO, illustrate it on a small benchmark of fact vs. opinion sentences, and then test it on the larger French press corpus FreSaDa to confirm the higher prevalence of subjective markers in satirical vs. regular texts. We then build a neural clone of VAGO, based on a BERT-like architecture, trained on the symbolic VAGO scores obtained on FreSaDa. Using explainability tools (LIME), we show the interest of this neural version for the enrichment of the lexicons of the symbolic version, and for the production of versions in other languages.	翻訳日:2023-10-25 23:43:39 公開日:2023-10-23
# 深層学習と薬物動態の先行した治療に対する予測応答 Forecasting Response to Treatment with Deep Learning and Pharmacokinetic Priors ( http://arxiv.org/abs/2309.13135v2 ) ライセンス: Link先を確認	Willa Potosnak, Cristian Challu, Kin G. Olivares, Artur Dubrawski	(参考訳) 予後の早期発見や患者のモニタリングには,医療時系列の予測が不可欠である。しかし、ノイズや間欠的なデータのために予測が難しい場合がある。これらの課題は、薬物投与などの外因性要因によって引き起こされる変化点によって、しばしば悪化する。これらの課題に対処するために,患者固有の治療効果の深層学習モデルを示す,新しいグローバルローカルアーキテクチャと薬物動態エンコーダを提案する。現実的にシミュレーションされた実世界データと実世界データの両方を用いて,血糖予測タスクの精度向上に向けたアプローチの有効性を示す。我々のグローバルローカルアーキテクチャは患者固有のモデルよりも9.2-14.6%改善している。さらに、我々の薬物動態エンコーダは、シミュレーションデータでは4.4%、実世界のデータでは2.1%で代替符号化技術よりも改善されている。提案手法は, 予期せぬ治療反応に対する早期警告の発行や, 薬物吸収および除去特性の観点から, 患者固有の治療効果を特徴付けるなど, 臨床実践において有益である。 Forecasting healthcare time series is crucial for early detection of adverse outcomes and for patient monitoring. Forecasting, however, can be difficult in practice due to noisy and intermittent data. The challenges are often exacerbated by change points induced via extrinsic factors, such as the administration of medication. To address these challenges, we propose a novel hybrid global-local architecture and a pharmacokinetic encoder that informs deep learning models of patient-specific treatment effects. We showcase the efficacy of our approach in achieving significant accuracy gains for a blood glucose forecasting task using both realistically simulated and real-world data. Our global-local architecture improves over patient-specific models by 9.2-14.6%. Additionally, our pharmacokinetic encoder improves over alternative encoding techniques by 4.4% on simulated data and 2.1% on real-world data. The proposed approach can have multiple beneficial applications in clinical practice, such as issuing early warnings about unexpected treatment responses, or helping to characterize patient-specific treatment effects in terms of drug absorption and elimination characteristics.	翻訳日:2023-10-25 23:33:01 公開日:2023-10-23
# Detach-ROCKET: ランダム畳み込みカーネルを用いた時系列分類のための逐次特徴選択 Detach-ROCKET: Sequential feature selection for time series classification with random convolutional kernels ( http://arxiv.org/abs/2309.14518v2 ) ライセンス: Link先を確認	Gonzalo Uribarri, Federico Barone, Alessio Ansuini, Erik Fransén	(参考訳) 時系列分類(TSC)は、医学、環境科学、金融など多くの分野において必須であり、疾患診断、異常検出、株価分析などのタスクを可能にする。 Recurrent Neural NetworksやInceptionTimeのようなTSC用の機械学習モデルは、多くのアプリケーションで成功しているが、計算負荷が大きいためスケーラビリティの制限に直面することがある。これを解決するために、ROCKETとその派生モデルなどの効率的なモデルが登場し、時系列データからランダムに生成した多数の特徴を活用して、トレーニングを簡素化し、最先端の性能を達成している。しかし、そのランダムな性質のため、生成した特徴の多くは冗長あるいは情報量に乏しく、不要な計算負荷を加え、汎化を損なう。本稿では、これらの不要な特徴を特定して刈り込む方法として、逐次的特徴分離(Sequential Feature Detachment:SFD)を紹介する。 SFDは特徴の重要度の推定にモデル係数を使用し、従来のアルゴリズムとは異なり、複雑なハイパーパラメータチューニングを必要とせずに大きな特徴集合を処理できる。 UCRアーカイブでのテストでは、SFDは元の特徴の10%だけでモデルを構築しつつ、テストセットの精度を0.2%向上できることが示された。また,特徴数とモデル精度の最適なバランスを決定するためのエンドツーエンドの手法であるDetach-ROCKETを提案する。最大のバイナリ UCR データセットに適用すると、Detach-ROCKET は特徴数を98.9%削減しつつ、テスト精度を0.6%改善できる。したがって,提案手法はトレーニングが軽量で,モデルサイズの削減や汎化の向上に有効であるだけでなく,特徴数の大幅な削減により特徴解釈への道も開く。 Time Series Classification (TSC) is essential in many fields, such as medicine, environmental science and finance, enabling tasks like disease diagnosis, anomaly detection, and stock price analysis. Machine learning models for TSC like Recurrent Neural Networks and InceptionTime, while successful in numerous applications, can face scalability limitations due to intensive computational requirements. To address this, efficient models such as ROCKET and its derivatives have emerged, simplifying training and achieving state-of-the-art performance by utilizing a large number of randomly generated features from time series data. However, due to their random nature, most of the generated features are redundant or non-informative, adding unnecessary computational load and compromising generalization. Here, we introduce Sequential Feature Detachment (SFD) as a method to identify and prune these non-essential features. SFD uses model coefficients to estimate feature importance and, unlike previous algorithms, can handle large feature sets without the need for complex hyperparameter tuning. Testing on the UCR archive demonstrates that SFD can produce models with 10% of the original features while improving the accuracy by 0.2% on the test set. We also present an end-to-end procedure for determining an optimal balance between the number of features and model accuracy, called Detach-ROCKET. When applied to the largest binary UCR dataset, Detach-ROCKET is able to improve test accuracy by 0.6% while reducing the number of features by 98.9%. Thus, our proposed procedure is not only lightweight to train and effective in reducing model size and enhancing generalization, but its significant reduction in feature count also paves the way for feature interpretation.	翻訳日:2023-10-25 23:22:35 公開日:2023-10-23
# openpatch: 分散検出のための3dパッチワーク OpenPatch: a 3D patchwork for Out-Of-Distribution detection ( http://arxiv.org/abs/2310.03388v3 ) ライセンス: Link先を確認	Paolo Rabino, Antonio Alliegro, Francesco Cappio Borlino, Tatiana Tommasi	(参考訳) ラボ環境からオープンワールドへのディープラーニングモデル移行には、予期せぬ状況に対処する準備が伴う。いくつかのアプリケーションでは、デプロイ中に新しいクラスが発生することが重大な脅威となるため、効果的に検出することが不可欠である。理想的には、このスキルは必要なときに、新しいタスクごとにさらなる計算訓練を必要とせずに使用するべきである。分布外検出はここ数年で大きな注目を集めてきたが、研究の大半は現実世界の固有の3dの性質を無視し、しばしばドメインとセマンティックのノベルティを混同する2d画像を扱う。本研究では,各領域によらず3次元点雲によって捕捉される物体の幾何学的構造を考慮し,後者に焦点をあてる。我々は、大きな事前学習モデルの上に構築されたOpenPatchを導入し、その中間機能から、既知のクラスを記述したパッチ表現のセットを単純に抽出する。新たなサンプルについて,1つの既知のクラスのパッチによって,あるいは複数のクラスのコントリビューションによって再構成できるかどうかを評価することにより,新規性スコアを得る。本稿では,実世界の点雲サンプルにおける意味的新奇性検出の課題として,参照既知のデータが合成された場合のアプローチの広範な実験評価を行う。我々はopenpatchが既知の全例と少数例の両方で優れていることを実証し、トレーニング対象とネットワークバックボーンにまたがる堅牢性を示す。本手法の本質的なトレーニングフリーな性質は,実世界の幅広いタスクへの即時適用を可能にすると同時に,高価なリトレーニング作業を必要とするアプローチに対する説得力のあるアドバンテージを提供する。 Moving deep learning models from the laboratory setting to the open world entails preparing them to handle unforeseen conditions. In several applications the occurrence of novel classes during deployment poses a significant threat, thus it is crucial to effectively detect them. Ideally, this skill should be used when needed without requiring any further computational training effort at every new task. Out-of-distribution detection has attracted significant attention in the last years, however the majority of the studies deal with 2D images ignoring the inherent 3D nature of the real-world and often confusing between domain and semantic novelty. In this work, we focus on the latter, considering the objects geometric structure captured by 3D point clouds regardless of the specific domain. We advance the field by introducing OpenPatch that builds on a large pre-trained model and simply extracts from its intermediate features a set of patch representations that describe each known class. 
For any new sample, we obtain a novelty score by evaluating whether it can be recomposed mainly by patches of a single known class or rather via the contribution of multiple classes. We present an extensive experimental evaluation of our approach for the task of semantic novelty detection on real-world point cloud samples when the reference known data are synthetic. We demonstrate that OpenPatch excels in both the full and few-shot known sample scenarios, showcasing its robustness across varying pre-training objectives and network backbones. The inherent training-free nature of our method allows for its immediate application to a wide array of real-world tasks, offering a compelling advantage over approaches that need expensive retraining efforts.	翻訳日:2023-10-25 23:13:46 公開日:2023-10-23
# 大規模言語モデルのためのMetaToolベンチマーク: ツールを使うべきか、どのツールを使うべきかの判断 MetaTool Benchmark for Large Language Models: Deciding Whether to Use Tools and Which to Use ( http://arxiv.org/abs/2310.03128v3 ) ライセンス: Link先を確認	Yue Huang and Jiawen Shi and Yuan Li and Chenrui Fan and Siyuan Wu and Qihui Zhang and Yixin Liu and Pan Zhou and Yao Wan and Neil Zhenqiang Gong and Lichao Sun	(参考訳) 大規模言語モデル(LLM)は、その印象的な自然言語処理(NLP)能力のために大きな注目を集めている。近年,多くの研究がLLMのツール活用能力に着目している。それらは主に、LLMが与えられた特定のツールと効果的に連携する方法を調査した。しかしながら、AutoGPTやMetaGPTのようなアプリケーションで見られるような、LLMがインテリジェントなエージェントとして機能するシナリオでは、LLMは、ツールを採用するかどうかを決定し、ユーザ要求を満たすために利用可能なツールの集合から最も適切なツールを選択する、複雑な意思決定プロセスに関与することが期待されている。そこで本稿では,LLM がツール使用意識を持ち,ツールを正しく選択できるかどうかを評価するベンチマークである MetaTool を紹介する。具体的には、ベンチマーク内でToolEと呼ばれるデータセットを作成します。このデータセットには、シングルツールとマルチツールの両方のシナリオを含む、LLMがツールを使用するきっかけとなるプロンプトという形で、さまざまなタイプのユーザクエリが含まれている。その後、ツール使用意識とツール選択の両方にタスクを設定しました。ツール選択に関して,類似した選択肢を伴うツール選択,特定のシナリオにおけるツール選択,信頼性に問題がある可能性のあるツール選択,マルチツール選択など,さまざまな観点から4つのサブタスクを定義した。我々は、9つの人気のあるLLMを対象に実験を行い、その大多数は依然としてツールを効果的に選択するのに苦労しており、LLMと真の知的エージェントの間に存在するギャップを浮き彫りにしている。しかし, 誤り分析の結果, 改善の余地は依然として大きいことがわかった。最後に、ChatGPTに倣うツール開発者に向けて、LLMのツール選択性能を向上させる詳細な説明を提供するための洞察をまとめる。 Large language models (LLMs) have garnered significant attention due to their impressive natural language processing (NLP) capabilities. Recently, many studies have focused on the tool utilization ability of LLMs. They primarily investigated how LLMs effectively collaborate with given specific tools. However, in scenarios where LLMs serve as intelligent agents, as seen in applications like AutoGPT and MetaGPT, LLMs are expected to engage in intricate decision-making processes that involve deciding whether to employ a tool and selecting the most suitable tool(s) from a collection of available tools to fulfill user requests. Therefore, in this paper, we introduce MetaTool, a benchmark designed to evaluate whether LLMs have tool usage awareness and can correctly choose tools. Specifically, we create a dataset called ToolE within the benchmark. 
This dataset contains various types of user queries in the form of prompts that trigger LLMs to use tools, including both single-tool and multi-tool scenarios. Subsequently, we set the tasks for both tool usage awareness and tool selection. We define four subtasks from different perspectives in tool selection, including tool selection with similar choices, tool selection in specific scenarios, tool selection with possible reliability issues, and multi-tool selection. We conduct experiments involving nine popular LLMs and find that the majority of them still struggle to effectively select tools, highlighting the existing gaps between LLMs and genuine intelligent agents. However, through the error analysis, we found there is still significant room for improvement. Finally, we conclude with insights for tool developers that follow ChatGPT to provide detailed descriptions that can enhance the tool selection performance of LLMs.	翻訳日:2023-10-25 23:12:59 公開日:2023-10-23
# 全方位磁場センシングのためのナノチューブスピン欠陥 Nanotube spin defects for omnidirectional magnetic field sensing ( http://arxiv.org/abs/2310.02709v2 ) ライセンス: Link先を確認	Xingyu Gao, Sumukh Vaidya, Saakshi Dikshit, Peng Ju, Kunhong Shen, Yuanbin Jin, Shixiong Zhang, Tongcang Li	(参考訳) 3次元(3D)結晶と2次元(2D)ファンデルワールス(vdW)材料における光学的にアドレス可能なスピン欠陥は、ナノスケールの量子センシングに革命をもたらしている。 1次元(1D)vdWナノチューブのスピン欠陥は、2つの次元でサイズが小さいことと側壁にダングリングボンドがないことにより、ユニークな機会をもたらす。しかし、ナノチューブ中の局在スピン欠陥の光検出磁気共鳴はこれまで観測されていなかった。本稿では,室温における窒化ホウ素ナノチューブ(BNNT)中の単一スピン色中心の観測について報告する。これらのBNNTスピン欠陥は、固有の量子化軸を持たないスピン $S=1/2$ の基底状態を有しており、向きに依存しない磁場センシングをもたらすことが示唆された。この特異な特徴を利用して、直交する方向の磁場中における2次元磁石の異方的な磁化を観測する。これは、ダイヤモンド窒素空孔中心のような従来のスピン $S=1$ 欠陥では困難である。さらに、BNNTをカンチレバーに決定論的に転写する手法を開発し、それを用いて走査型プローブ磁気測定を実証する。このアプローチをさらに改良することで、任意の方向の磁場の原子スケール量子センシングが可能となる。 Optically addressable spin defects in three-dimensional (3D) crystals and two-dimensional (2D) van der Waals (vdW) materials are revolutionizing nanoscale quantum sensing. Spin defects in one-dimensional (1D) vdW nanotubes will provide unique opportunities due to their small sizes in two dimensions and absence of dangling bonds on side walls. However, optically detected magnetic resonance of localized spin defects in a nanotube has not been observed. Here, we report the observation of single spin color centers in boron nitride nanotubes (BNNTs) at room temperature. Our findings suggest that these BNNT spin defects possess a spin $S=1/2$ ground state without an intrinsic quantization axis, leading to orientation-independent magnetic field sensing. We harness this unique feature to observe anisotropic magnetization of a 2D magnet in magnetic fields along orthogonal directions, a challenge for conventional spin $S=1$ defects such as diamond nitrogen-vacancy centers. Additionally, we develop a method to deterministically transfer a BNNT onto a cantilever and use it to demonstrate scanning probe magnetometry. Further refinement of our approach will enable atomic scale quantum sensing of magnetic fields in any direction.	翻訳日:2023-10-25 23:12:19 公開日:2023-10-23
# Vendi ScoreのCousins:科学と機械学習のための類似性に基づく多様性メトリクスの家族 Cousins Of The Vendi Score: A Family Of Similarity-Based Diversity Metrics For Science And Machine Learning ( http://arxiv.org/abs/2310.12952v2 ) ライセンス: Link先を確認	Amey P. Pasarkar and Adji Bousso Dieng	(参考訳) 多様性を正確に測定することは、機械学習(ML)、生態学、化学など多くの科学分野において重要である。 vendiスコアは、量子統計力学のアイデアを活用し、q=1のヒル数を拡張する一般的な類似性に基づく多様性メトリックとして導入された。生態学における多くの多様性指標とは対照的に、ヴェンディスコアは類似性を考慮し、多様性を評価するためにコレクション内のカテゴリの有病率の知識を必要としない。しかしながら、Vendi Scoreは、アイテムの頻度に比例する感度のレベルで、所定のコレクション内の各アイテムを扱います。これはアイテムの頻度にかなりの不均衡がある設定では望ましくない。本稿では,類似性を用いて他のヒル数を拡張し,希少品や共通品に感度を割り当てる柔軟性を提供する。これにより、さまざまなアプリケーションで使用可能な、多様性指標のファミリー -- 異なるレベルの感度を持つ自動スコア -- が生まれます。基底真理の多様性が知られている合成制御環境におけるスコアの特性について検討する。次に、その有用性をテストし、ヴェンディサンプリングによる分子シミュレーションを改善する。最後に、記憶、重複、多様性、およびサンプル品質の観点から画像生成モデルの振る舞いをよりよく理解するために、vendiスコアを使用する。 Measuring diversity accurately is important for many scientific fields, including machine learning (ML), ecology, and chemistry. The Vendi Score was introduced as a generic similarity-based diversity metric that extends the Hill number of order q=1 by leveraging ideas from quantum statistical mechanics. Contrary to many diversity metrics in ecology, the Vendi Score accounts for similarity and does not require knowledge of the prevalence of the categories in the collection to be evaluated for diversity. However, the Vendi Score treats each item in a given collection with a level of sensitivity proportional to the item's prevalence. This is undesirable in settings where there is a significant imbalance in item prevalence. In this paper, we extend the other Hill numbers using similarity to provide flexibility in allocating sensitivity to rare or common items. This leads to a family of diversity metrics -- Vendi scores with different levels of sensitivity -- that can be used in a variety of applications. We study the properties of the scores in a synthetic controlled setting where the ground truth diversity is known. 
We then test their utility in improving molecular simulations via Vendi Sampling. Finally, we use the Vendi scores to better understand the behavior of image generative models in terms of memorization, duplication, diversity, and sample quality.	翻訳日:2023-10-25 22:55:24 公開日:2023-10-23
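The diversity scores described in the record above can be illustrated with a short sketch. It assumes the usual construction in which a positive semi-definite similarity matrix `K` with unit diagonal is normalised by the number of items and the Hill number of order `q` is evaluated on its eigenvalues; the function name `vendi_score` and the eigenvalue cutoff `1e-12` are illustrative choices, not taken from the authors' code.

```python
import numpy as np

def vendi_score(K, q=1.0):
    """Similarity-based diversity of order q for an n x n PSD matrix K with K[i, i] = 1."""
    K = np.asarray(K, dtype=float)
    n = K.shape[0]
    # Eigenvalues of K / n are non-negative and sum to 1 when diag(K) = 1.
    lam = np.linalg.eigvalsh(K / n)
    lam = lam[lam > 1e-12]  # drop numerically zero eigenvalues (assumed cutoff)
    if np.isclose(q, 1.0):
        # Order 1: exponential of the Shannon entropy of the eigenvalues.
        return float(np.exp(-np.sum(lam * np.log(lam))))
    if np.isinf(q):
        # Order infinity: only the largest eigenvalue matters.
        return float(1.0 / lam.max())
    # General order q: Hill number evaluated on the eigenvalues.
    return float(np.sum(lam ** q) ** (1.0 / (1.0 - q)))
```

With an identity similarity matrix (all items distinct) the score equals the number of items, and with an all-ones matrix (all items identical) it equals 1, for every order `q`. Larger `q` puts more weight on common items, smaller `q` on rare ones.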
# MLに基づくサロゲートのベイズ的アプローチによる逆問題への応用 Applications of ML-Based Surrogates in Bayesian Approaches to Inverse Problems ( http://arxiv.org/abs/2310.12046v2 ) ライセンス: Link先を確認	Pelin Ersin, Emma Hayes, Peter Matthews, Paramjyoti Mohapatra, Elisa Negrini and Karl Schulz	(参考訳) ニューラルネットワークはシミュレーションモデルとして強力なツールとなり、計算効率が向上する科学的な問題に対する数値解を提供する。この効率性は、解法に要する時間や同様の分析シナリオの評価が必要な場合の数値的に困難な問題に有利である。科学的関心の1つの領域は逆問題の設定であり、ある系の前方ダイナミクスを偏微分方程式で記述し、これらの力学の(潜在的にうるさい)観測によって与えられた系の特性を推測することである。 2次元音響波動方程式の雑音解を与えられた正方形領域上の波源の位置を推定する逆問題を考える。ガウス雑音を仮定すると、音源位置の確率関数を定式化することができ、評価毎にシステムの前方シミュレーションを行う必要がある。サーロゲートモデルとして標準ニューラルネットワークを使用することで、この可能性を数回計算的に評価することができるため、マルコフ連鎖モンテカルロ法を用いてソース位置の後方分布を評価することができる。本手法はノイズデータから音源位置を正確に推定できることを実証する。 Neural networks have become a powerful tool as surrogate models to provide numerical solutions for scientific problems with increased computational efficiency. This efficiency can be advantageous for numerically challenging problems where time to solution is important or when evaluation of many similar analysis scenarios is required. One particular area of scientific interest is the setting of inverse problems, where one knows the forward dynamics of a system are described by a partial differential equation and the task is to infer properties of the system given (potentially noisy) observations of these dynamics. We consider the inverse problem of inferring the location of a wave source on a square domain, given a noisy solution to the 2-D acoustic wave equation. Under the assumption of Gaussian noise, a likelihood function for source location can be formulated, which requires one forward simulation of the system per evaluation. Using a standard neural network as a surrogate model makes it computationally feasible to evaluate this likelihood several times, and so Markov Chain Monte Carlo methods can be used to evaluate the posterior distribution of the source location. We demonstrate that this method can accurately infer source-locations from noisy data.	
翻訳日:2023-10-25 22:54:51 公開日:2023-10-23
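The surrogate-plus-MCMC idea in the record above can be sketched as follows. This is a minimal illustration under stated assumptions: `surrogate_forward` is a cheap analytic stand-in for a trained neural surrogate (it is not the acoustic wave equation), the sensor layout `SENSORS` is invented, the prior is uniform on the unit square, and the sampler is a plain random-walk Metropolis scheme.

```python
import numpy as np

# Hypothetical sensor positions on the unit square (assumption for illustration).
SENSORS = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.5, 0.5]])

def surrogate_forward(src):
    """Stand-in for a trained surrogate: signal amplitude seen at each sensor."""
    d2 = np.sum((SENSORS - src) ** 2, axis=1)
    return 1.0 / (1.0 + 10.0 * d2)

def log_posterior(src, data, sigma):
    """Uniform prior on [0, 1]^2 combined with a Gaussian-noise likelihood."""
    if np.any(src < 0.0) or np.any(src > 1.0):
        return -np.inf
    resid = data - surrogate_forward(src)
    return -0.5 * np.sum(resid ** 2) / sigma ** 2

def metropolis(data, sigma, n_steps=5000, step=0.02, seed=0):
    """Random-walk Metropolis sampler over the source location."""
    rng = np.random.default_rng(seed)
    x = np.array([0.5, 0.5])
    lp = log_posterior(x, data, sigma)
    samples = np.empty((n_steps, 2))
    for i in range(n_steps):
        prop = x + step * rng.normal(size=2)
        lp_prop = log_posterior(prop, data, sigma)
        # Accept with probability min(1, exp(lp_prop - lp)).
        if np.log(rng.uniform()) < lp_prop - lp:
            x, lp = prop, lp_prop
        samples[i] = x
    return samples
```

Each likelihood evaluation calls the forward model once, which is why a fast surrogate makes the thousands of evaluations needed by the sampler affordable. In practice the early part of the chain would be discarded as burn-in before summarising the posterior.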
# 等角予測のためのベイズグラフニューラルネットワークの温度について On the Temperature of Bayesian Graph Neural Networks for Conformal Prediction ( http://arxiv.org/abs/2310.11479v2 ) ライセンス: Link先を確認	Seohyeon Cha, Honggu Kang, and Joonhyuk Kang	(参考訳) グラフニューラルネットワーク(GNN)における正確な不確実性定量化は、特にGNNが頻繁に使用される高い領域において不可欠である。コンフォーマル予測(CP)は、任意のブラックボックスモデルに対して$\textit{valid}$予測セットを提供することによって不確実性を定量化する有望なフレームワークを提供する。 CPは、予測セットが所望の確率を持つ真のラベルを含むことを保証する。しかし、$\textit{inefficiency}$として知られる予測セットのサイズは、基礎となるモデルとデータ生成プロセスの影響を受けている。一方、ベイズ学習は推定された後続分布に基づく信頼できる領域も提供するが、この領域はモデルが正しく指定されたときのみ$\textit{well-calibrated}$である。過去の推定値から有効信頼領域を構築するためのスケーリングパラメータを導入した最近の研究に基づいて, CP フレームワーク内にベイズ GNN に温度パラメータを組み込むことの利点について検討した。より効率的な予測セットをもたらす温度の存在を実証的に実証する。さらに,非効率に寄与する要因を明らかにするために分析を行い,cp性能とモデル校正の関係に関する貴重な知見を提供する。 Accurate uncertainty quantification in graph neural networks (GNNs) is essential, especially in high-stakes domains where GNNs are frequently employed. Conformal prediction (CP) offers a promising framework for quantifying uncertainty by providing $\textit{valid}$ prediction sets for any black-box model. CP ensures formal probabilistic guarantees that a prediction set contains a true label with a desired probability. However, the size of prediction sets, known as $\textit{inefficiency}$, is influenced by the underlying model and data generating process. On the other hand, Bayesian learning also provides a credible region based on the estimated posterior distribution, but this region is $\textit{well-calibrated}$ only when the model is correctly specified. Building on a recent work that introduced a scaling parameter for constructing valid credible regions from posterior estimate, our study explores the advantages of incorporating a temperature parameter into Bayesian GNNs within CP framework. We empirically demonstrate the existence of temperatures that result in more efficient prediction sets. 
Furthermore, we conduct an analysis to identify the factors contributing to inefficiency and offer valuable insights into the relationship between CP performance and model calibration.	翻訳日:2023-10-25 22:53:27 公開日:2023-10-23
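How a temperature can change the size of conformal prediction sets, as discussed in the record above, can be sketched with split conformal prediction. Note the assumption: here the temperature simply rescales softmax logits as a stand-in, whereas the paper applies it to a Bayesian GNN posterior. The function names and the score `1 - p(true label)` are illustrative choices.

```python
import numpy as np

def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax over the last axis of a 2-D array."""
    z = logits / temperature
    z = z - z.max(axis=1, keepdims=True)  # numerical stability
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def conformal_threshold(cal_probs, cal_labels, alpha):
    """Split-conformal quantile of the scores 1 - p(true label)."""
    n = len(cal_labels)
    scores = 1.0 - cal_probs[np.arange(n), cal_labels]
    k = int(np.ceil((n + 1) * (1.0 - alpha)))
    if k > n:
        return 1.0  # too few calibration points: every class is included
    return float(np.sort(scores)[k - 1])

def prediction_sets(test_probs, qhat):
    """Boolean mask (n_test, n_classes): True where a class is in the set."""
    return (1.0 - test_probs) <= qhat

def inefficiency(sets):
    """Average prediction-set size; smaller means more efficient."""
    return float(sets.sum(axis=1).mean())
```

Sweeping `temperature` over a grid and comparing `inefficiency` at a fixed `alpha` is one way to look for the more efficient settings the abstract refers to; the coverage guarantee comes from the calibration quantile and holds for any temperature.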
# CO2排出を最適化した深層強化学習に基づくインテリジェント交通信号制御 Deep Reinforcement Learning-based Intelligent Traffic Signal Controls with Optimized CO2 emissions ( http://arxiv.org/abs/2310.13129v2 ) ライセンス: Link先を確認	Pedram Agand, Alexey Iskrov, Mo Chen	(参考訳) 近年、交通ネットワークは、人間の健康や環境に悪影響を及ぼし、交通渋滞に寄与する準最適制御政策の課題に直面している。交通渋滞による大気汚染の増加と通勤時間の延長により、交差点信号管制官は近代交通インフラの重要な構成要素となっている。文学における適応交通信号制御装置はいくつかあるが、比較性能に関する限られた研究がなされている。さらに、二酸化炭素(CO2)排出量が世界的な問題であるにもかかわらず、文献はこの領域に限定的に注意を払っている。本稿では,CO2排出量を削減できるだけでなく,旅行時間などの指標で競合的な結果が得られる強化学習アルゴリズムの報酬形成手法であるEcoLightを提案する。我々は,旅行時間,CO2排出量,待ち時間,停止時間などの指標を用いて,表型Q-Learning,DQN,SARSA,A2Cアルゴリズムの性能を比較した。本評価では, 道路利用者(トラック, バス, 自動車)の様々な汚染レベルを考慮した複数のシナリオについて検討する。 Nowadays, transportation networks face the challenge of sub-optimal control policies that can have adverse effects on human health, the environment, and contribute to traffic congestion. Increased levels of air pollution and extended commute times caused by traffic bottlenecks make intersection traffic signal controllers a crucial component of modern transportation infrastructure. Despite several adaptive traffic signal controllers in literature, limited research has been conducted on their comparative performance. Furthermore, despite carbon dioxide (CO2) emissions' significance as a global issue, the literature has paid limited attention to this area. In this report, we propose EcoLight, a reward shaping scheme for reinforcement learning algorithms that not only reduces CO2 emissions but also achieves competitive results in metrics such as travel time. We compare the performance of tabular Q-Learning, DQN, SARSA, and A2C algorithms using metrics such as travel time, CO2 emissions, waiting time, and stopped time. Our evaluation considers multiple scenarios that encompass a range of road users (trucks, buses, cars) with varying pollution levels.	翻訳日:2023-10-25 22:43:17 公開日:2023-10-23
# 機械学習とサービス内データを用いた旅客船の燃費予測 : 比較検討 Fuel Consumption Prediction for a Passenger Ferry using Machine Learning and In-service Data: A Comparative Study ( http://arxiv.org/abs/2310.13123v2 ) ライセンス: Link先を確認	Pedram Agand, Allison Kennedy, Trevor Harris, Chanwoo Bae, Mo Chen, Edward J Park	(参考訳) 環境にやさしい輸送の重要性が増すにつれて、海洋船の運用に効率的なアプローチが不可欠である。気象状況を考慮した状態監視手法と船舶のサービス内データを利用した予測手法には,船舶のエネルギー効率を予測するための正確かつ完全なモデルが必要である。モデルは、すべての運用データをリアルタイムで効果的に処理する必要がある。本稿では,旅客船から収集したサービス内データを用いて,燃料消費を予測するモデルを提案する。モデルの適切な入力変数を選択するために統計的およびドメイン知識に基づく手法が用いられた。これらの手法は、過学習、データの欠損、多重共線性を防ぎつつ、実用性を提供する。検討した予測モデルには、多重線形回帰(MLR)、決定木アプローチ(DT)、人工ニューラルネットワーク(ANN)、アンサンブル手法などがある。最高の予測性能は、ブースティング型のアンサンブル手法であるXGBoostを用いて開発されたモデルから得られた。コードは、将来の研究のために GitHub の https://github.com/pagand/model_optimze_vessel/tree/OE で入手できる。 As the importance of eco-friendly transportation increases, providing an efficient approach for marine vessel operation is essential. Methods for status monitoring with consideration to the weather condition and forecasting with the use of in-service data from ships require accurate and complete models for predicting the energy efficiency of a ship. The models need to effectively process all the operational data in real-time. This paper presents models that can predict fuel consumption using in-service data collected from a passenger ship. Statistical and domain-knowledge methods were used to select the proper input variables for the models. These methods prevent over-fitting, missing data, and multicollinearity while providing practical applicability. Prediction models that were investigated include multiple linear regression (MLR), decision tree approach (DT), an artificial neural network (ANN), and ensemble methods. The best predictive performance was from a model developed using the XGBoost technique, which is a boosting ensemble approach. Our code is available on GitHub at https://github.com/pagand/model_optimze_vessel/tree/OE for future research.	翻訳日:2023-10-25 22:42:59 公開日:2023-10-23
# 変圧器の追加を理解する Understanding Addition in Transformers ( http://arxiv.org/abs/2310.13121v2 ) ライセンス: Link先を確認	Philip Quirke, Fazl Barez	(参考訳) Transformersのような機械学習モデルの内部動作を理解することは、安全で倫理的な使用に不可欠である。本稿では,整数加算を訓練した単層変圧器モデルの詳細な解析を行う。本モデルでは,タスクを並列な桁別ストリームに分割し,異なる桁位置の異なるアルゴリズムを用いる。我々の研究は、モデルが計算を遅く開始するが、迅速に実行することも見出した。高損失の稀なユースケースが同定され、説明される。全体として、モデルのアルゴリズムは詳細に説明されている。これらの発見は厳密なテストと数学的モデリングを通じて検証され、機械的解釈可能性、AI安全性、アライメントにおける幅広い研究に貢献した。我々のアプローチは、より複雑なタスクと多層トランスフォーマーモデルを分析するための扉を開く。 Understanding the inner workings of machine learning models like Transformers is vital for their safe and ethical use. This paper presents an in-depth analysis of a one-layer Transformer model trained for integer addition. We reveal that the model divides the task into parallel, digit-specific streams and employs distinct algorithms for different digit positions. Our study also finds that the model starts calculations late but executes them rapidly. A rare use case with high loss is identified and explained. Overall, the model's algorithm is explained in detail. These findings are validated through rigorous testing and mathematical modeling, contributing to the broader works in Mechanistic Interpretability, AI safety, and alignment. Our approach opens the door for analyzing more complex tasks and multi-layer Transformer models.	翻訳日:2023-10-25 22:42:42 公開日:2023-10-23
# CRoW: 実世界のタスクにおけるCommonsense Reasoningのベンチマーク CRoW: Benchmarking Commonsense Reasoning in Real-World Tasks ( http://arxiv.org/abs/2310.15239v1 ) ライセンス: Link先を確認	Mete Ismayilzada, Debjit Paul, Syrielle Montariol, Mor Geva, Antoine Bosselut	(参考訳) 最近の自然言語処理(NLP)の常識推論研究は、多くの新しいデータセットとベンチマークを生み出した。しかし、これらのデータセットの多くは、現実世界のNLPシステムが解決しようとするタスクを反映していない人工的なシナリオで常識推論の課題を定式化している。本研究では,6つの実世界のNLPタスクの文脈で常識推論を適用するモデルの能力を評価する,手作業でキュレーションされたマルチタスクベンチマークである CRoW を提案する。 CRoWはマルチステージのデータ収集パイプラインを使用して構築され、常識に違反する摂動を使って既存のデータセットのサンプルを書き換える。 CRoWを用いて,物理的,時間的,社会的推論などの常識知識の異なる次元にわたってNLPシステムがどのように機能するかを研究する。NLPシステムをCRoW上で評価すると人間との間に大きな性能差があり、実世界のタスク設定において常識推論が解決されるには程遠いことを示す。私たちはデータセットとリーダーボードを、https://github.com/mismayil/crow で研究コミュニティに公開しています。 Recent efforts in natural language processing (NLP) commonsense reasoning research have yielded a considerable number of new datasets and benchmarks. However, most of these datasets formulate commonsense reasoning challenges in artificial scenarios that are not reflective of the tasks which real-world NLP systems are designed to solve. In this work, we present CRoW, a manually-curated, multi-task benchmark that evaluates the ability of models to apply commonsense reasoning in the context of six real-world NLP tasks. CRoW is constructed using a multi-stage data collection pipeline that rewrites examples from existing datasets using commonsense-violating perturbations. We use CRoW to study how NLP systems perform across different dimensions of commonsense knowledge, such as physical, temporal, and social reasoning. We find a significant performance gap when NLP systems are evaluated on CRoW compared to humans, showcasing that commonsense reasoning is far from being solved in real-world task settings. We make our dataset and leaderboard available to the research community at https://github.com/mismayil/crow.	翻訳日:2023-10-25 22:35:14 公開日:2023-10-23
# 銀河カタログを用いたフィールドレベルシミュレーションに基づく推論:系統的効果の影響 Field-level simulation-based inference with galaxy catalogs: the impact of systematic effects ( http://arxiv.org/abs/2310.15234v1 ) ライセンス: Link先を確認	Natalí S. M. de Santi, Francisco Villaescusa-Navarro, L. Raul Abramo, Helen Shao, Lucia A. Perez, Tiago Castro, Yueying Ni, Christopher C. Lovell, Elena Hernandez-Martinez, Federico Marinacci, David N. Spergel, Klaus Dolag, Lars Hernquist, Mark Vogelsberger	(参考訳) 近年、銀河赤方偏移サーベイから宇宙論パラメータを制約する強力な方法は、グラフニューラルネットワークを訓練し、スケールにカットを課すことなく、フィールドレベルの尤度フリー推論を実行することであることが示されている。特に、de Santi et al. (2023) は、銀河の位置と視線速度のみを含むカタログから $\Omega_{\rm m}$ の値を正確に推測でき、天体物理学やサブグリッドモデルの不確実性に対して堅牢なモデルを開発した。しかし、観測は次のような多くの効果の影響を受ける。 1)マスキング、 2)特異速度と視線方向距離の不確実性、 3)銀河の選択の違い。さらに、観測で測定できるのは赤方偏移のみであり、銀河の視線方向の位置と速度が絡み合う。本稿では、CAMELSプロジェクトの異なるコードで実行された何千もの最先端の流体力学シミュレーションから生成され、これらの観測効果を組み込んだ銀河カタログ上で、我々のモデルを訓練し、テストする。これらの効果はモデルの精度と正確度を低下させ、モデルが破綻するカタログの割合を増加させるが、モデルが良好に機能する銀河カタログの割合は90%以上であり、実データに適用しても宇宙論パラメータを制約できる可能性を示している。 It has been recently shown that a powerful way to constrain cosmological parameters from galaxy redshift surveys is to train graph neural networks to perform field-level likelihood-free inference without imposing cuts on scale. In particular, de Santi et al. (2023) developed models that could accurately infer the value of $\Omega_{\rm m}$ from catalogs that only contain the positions and radial velocities of galaxies that are robust to uncertainties in astrophysics and subgrid models. However, observations are affected by many effects, including 1) masking, 2) uncertainties in peculiar velocities and radial distances, and 3) different galaxy selections. Moreover, observations only allow us to measure redshift, intertwining galaxies' radial positions and velocities. In this paper we train and test our models on galaxy catalogs, created from thousands of state-of-the-art hydrodynamic simulations run with different codes from the CAMELS project, that incorporate these observational effects.
We find that, although the presence of these effects degrades the precision and accuracy of the models, and increases the fraction of catalogs where the model breaks down, the fraction of galaxy catalogs where the model performs well is over 90 %, demonstrating the potential of these models to constrain cosmological parameters even when applied to real data.	翻訳日:2023-10-25 22:34:56 公開日:2023-10-23
# 高調波による重力波のテンプレート化への新しいアプローチ : マッチングフィルタのコストを1桁以上削減する A new approach to template banks of gravitational waves with higher harmonics: reducing matched-filtering cost by over an order of magnitude ( http://arxiv.org/abs/2310.15233v1 ) ライセンス: Link先を確認	Digvijay Wadekar, Tejaswi Venumadhav, Ajit Kumar Mehta, Javier Roulet, Seth Olsen, Jonathan Mushkin, Barak Zackay, Matias Zaldarriaga	(参考訳) 重力波イベントの探索では、対象とする信号のモデル、すなわちテンプレートを用いる。 LIGO-Virgo-Kagra(LVK)データの現在の探索で使用されるテンプレートは、信号の支配的な四重極モード $(\ell,m)=(2,2)$ をモデル化しており、一般相対性理論によって予測される $(\ell,m)=(3,3)$, $(4,4)$ のような副次的な高次モード(HM)を省略している。したがって、これらの探索は、高質量で非対称な質量比を持つ系のような、パラメータ空間の興味深い部分におけるブラックホール合体に対する感度を失う可能性がある。我々は,モード間の自然な関係を利用して,テンプレートバンクにHMを組み込む新たな戦略を開発する。ポストニュートン公式と機械学習ツールを組み合わせて,与えられた $(2,2)$ 波形に対応するアラインドスピンの $(3,3)$, $(4,4)$ 波形をモデル化する。これらのモードはそれぞれ個別にデータに対してフィルタリングでき、信号対雑音比(SNR)の時系列を別々に得ることができる。これらの時系列は、比較的安価な方法で組み合わせて、信号の外部パラメータについて周辺化できる。これにより、マッチドフィルタリングのコストが四重極のみの探索の $\approx 3\times$ にとどまるHM探索パイプラインが得られる(以前提案されたHM探索手法では $\approx 100\times$ であった)。本手法は effectual であり,確率的あるいは幾何学的配置技術を用いて構築したテンプレートバンクに一般に適用可能である。さらに,機械学習アルゴリズムを用いた,$(2,2)$ のみの幾何配置テンプレートバンクの圧縮についても議論する。 Searches for gravitational wave events use models, or templates, for the signals of interest. The templates used in current searches in the LIGO-Virgo-Kagra (LVK) data model the dominant quadrupole mode $(\ell,m)=(2,2)$ of the signals, and omit sub-dominant higher-order modes (HM) such as $(\ell,m)=(3,3)$, $(4,4)$, which are predicted by general relativity. Hence, these searches could lose sensitivity to black hole mergers in interesting parts of parameter space, such as systems with high masses and asymmetric mass ratios. We develop a new strategy to include HM in template banks that exploits the natural connection between the modes. We use a combination of post-Newtonian formulae and machine learning tools to model aligned-spin $(3,3)$, $(4,4)$ waveforms corresponding to a given $(2,2)$ waveform. Each of these modes can be individually filtered against the data to yield separate time series of signal-to-noise ratios (SNR), which can be combined in a relatively inexpensive way to marginalize over extrinsic parameters of the signals. This leads to an HM search pipeline whose matched-filtering cost is just $\approx 3\times$ that of a quadrupole-only search (in contrast to being $\approx 100\times$, as in previously proposed HM search methods). Our method is effectual and is generally applicable for template banks constructed with either stochastic or geometric placement techniques. Additionally, we discuss compression of $(2,2)$-only geometric-placement template banks using machine learning algorithms.	翻訳日:2023-10-25 22:34:34 公開日:2023-10-23
# 大規模言語モデルにおける関数ベクトル Function Vectors in Large Language Models ( http://arxiv.org/abs/2310.15213v1 ) ライセンス: Link先を確認	Eric Todd, Millicent L. Li, Arnab Sen Sharma, Aaron Mueller, Byron C. Wallace, David Bau	(参考訳) 自己回帰的トランスフォーマー言語モデル(LM)において、入力出力関数をベクトルとして表現する単純な神経機構の存在を報告する。多様なコンテキスト内学習(ICL)タスクに対する因果媒介分析を用いて、少数の注意ヘッドが、デモされたタスクのコンパクトな表現を伝達することを見出し、これを関数ベクトル(FV)と呼ぶ。 FVはコンテキストの変化に対して堅牢である。すなわち、FVを収集したICLコンテキストに似ていないゼロショットや自然テキストの設定などの入力に対しても、タスクの実行をトリガーする。さまざまなタスク、モデル、レイヤにわたってFVをテストし、中間層においてさまざまな設定で強力な因果効果を見つけた。我々はFVの内部構造を調査し、FVは関数の出力空間を符号化する情報をしばしば含んでいるが、この情報だけではFVを再構築するには不十分であることを見出した。最後に、FVで意味ベクトル合成をテストし、FVをある程度足し合わせることで、新しい複雑なタスクをトリガーするベクトルを生成できることを見出した。これらの結果から,LLMには様々なコンテキストで呼び出すことのできる汎用関数の内部抽象化が含まれていることが示唆された。 We report the presence of a simple neural mechanism that represents an input-output function as a vector within autoregressive transformer language models (LMs). Using causal mediation analysis on a diverse range of in-context-learning (ICL) tasks, we find that a small number of attention heads transport a compact representation of the demonstrated task, which we call a function vector (FV). FVs are robust to changes in context, i.e., they trigger execution of the task on inputs such as zero-shot and natural text settings that do not resemble the ICL contexts from which they are collected. We test FVs across a range of tasks, models, and layers and find strong causal effects across settings in middle layers. We investigate the internal structure of FVs and find that while they often contain information that encodes the output space of the function, this information alone is not sufficient to reconstruct an FV. Finally, we test semantic vector composition in FVs, and find that to some extent they can be summed to create vectors that trigger new complex tasks. Taken together, our findings suggest that LLMs contain internal abstractions of general-purpose functions that can be invoked in a variety of contexts.	翻訳日:2023-10-25 22:34:02 公開日:2023-10-23
# 効果的なアルツハイマー病薬物再資源化のための経路のモデリング Modeling Path Importance for Effective Alzheimer's Disease Drug Repurposing ( http://arxiv.org/abs/2310.15211v1 ) ライセンス: Link先を確認	Shunian Xiang, Patrick J. Lawrence, Bo Peng, ChienWei Chiang PhD, Dokyoon Kim PhD, Li Shen PhD, and Xia Ning	(参考訳) 近年,AD薬物発見のための有効かつ資源効率の高いパラダイムとして,薬物再資源化が出現している。薬物再生産の様々な方法のうち、ネットワークベースの手法は、タンパク質とタンパク質の相互作用のような複数の相互作用型を統合する複雑なネットワークを利用して、候補薬をより効果的に識別できるという有望な結果を示している。しかし、既存のアプローチでは、ネットワーク内の同じ長さの経路が薬物の治療効果を特定するのに等しく重要であると仮定している。他の領域では、同じ長さの経路が必ずしも同じ重要性を持つとは限らない。したがって、この仮定に依存することは、薬物再購入の試みに有害である可能性がある。そこで本研究では,新しいネットワークベースの広告薬剤再提案手法であるmpi(modeling path importance)を提案する。 MPIは学習ノードの埋め込みによって重要なパスを優先順位付けし、ネットワークの豊富な構造情報を効果的にキャプチャする。したがって、学習した埋め込みを活用することで、MPIはパス間の重要性を効果的に区別することができる。抗AD薬候補を同定するベースライン法として, ネットワーク内の薬剤とADの最も短い経路に基づいて, MPIを評価した。上位50の薬物のうち、MPIは、基準値よりも20.0%の薬物を抗AD抗体で優先している。最後に、保険請求データから生成されたコックス比例ハザードモデルは、エコドラ、ニコチン、およびBBB交差ACE-INHの使用をADのリスクが低いものとして識別するのに役立つ。 Recently, drug repurposing has emerged as an effective and resource-efficient paradigm for AD drug discovery. Among various methods for drug repurposing, network-based methods have shown promising results as they are capable of leveraging complex networks that integrate multiple interaction types, such as protein-protein interactions, to more effectively identify candidate drugs. However, existing approaches typically assume paths of the same length in the network have equal importance in identifying the therapeutic effect of drugs. Other domains have found that same length paths do not necessarily have the same importance. Thus, relying on this assumption may be deleterious to drug repurposing attempts. In this work, we propose MPI (Modeling Path Importance), a novel network-based method for AD drug repurposing. MPI is unique in that it prioritizes important paths via learned node embeddings, which can effectively capture a network's rich structural information. 
Thus, leveraging learned embeddings allows MPI to effectively differentiate the importance among paths. We evaluate MPI against a commonly used baseline method that identifies anti-AD drug candidates primarily based on the shortest paths between drugs and AD in the network. We observe that among the top-50 ranked drugs, MPI prioritizes 20.0% more drugs with anti-AD evidence compared to the baseline. Finally, Cox proportional-hazard models produced from insurance claims data aid us in identifying the use of etodolac, nicotine, and BBB-crossing ACE-INHs as having a reduced risk of AD, suggesting such drugs may be viable candidates for repurposing and should be explored further in future studies.	翻訳日:2023-10-25 22:33:41 公開日:2023-10-23
# DISC-FinLLM: 複数の専門家による微調整に基づく中国の金融大規模言語モデル DISC-FinLLM: A Chinese Financial Large Language Model based on Multiple Experts Fine-tuning ( http://arxiv.org/abs/2310.15205v1 ) ライセンス: Link先を確認	Wei Chen, Qiushi Wang, Zefei Long, Xianyin Zhang, Zhongtian Lu, Bingxuan Li, Siyuan Wang, Jiarong Xu, Xiang Bai, Xuanjing Huang, Zhongyu Wei	(参考訳) 金融大規模言語モデル (LLM) を構築するために, マルチエキスパートファインチューニングフレームワークを提案する。提案手法は,マルチターン質問応答能力,ドメインテキスト処理能力,数理計算能力,検索エンハンスド生成能力を用いて,一般的なllmを改善する。 DISC-FIN-SFT という金融インストラクションチューニングデータセットを構築し、4つのカテゴリ(コンサルト、NLPタスク、コンピューティング、検索強化生成)のインストラクションサンプルを含む。複数のベンチマークで評価した結果, 様々な財務シナリオにおいて, ベースラインモデルよりも優れた性能を示した。さらなるリソースはhttps://github.com/FudanDISC/DISC-FinLLMで見ることができる。 We propose Multiple Experts Fine-tuning Framework to build a financial large language model (LLM), DISC-FinLLM. Our methodology improves general LLMs by endowing them with multi-turn question answering abilities, domain text processing capabilities, mathematical computation skills, and retrieval-enhanced generation capabilities. We build a financial instruction-tuning dataset named DISC-FIN-SFT, including instruction samples of four categories (consulting, NLP tasks, computing and retrieval-augmented generation). Evaluations conducted on multiple benchmarks demonstrate that our model performs better than baseline models in various financial scenarios. Further resources can be found at https://github.com/FudanDISC/DISC-FinLLM.	翻訳日:2023-10-25 22:33:13 公開日:2023-10-23
# 区分的線形回帰と拡張因果cnnに基づく長期電力消費予測 Mid-Long Term Daily Electricity Consumption Forecasting Based on Piecewise Linear Regression and Dilated Causal CNN ( http://arxiv.org/abs/2310.15204v1 ) ライセンス: Link先を確認	Zhou Lan, Ben Liu, Yi Feng, Danhuang Dong, Peng Zhang	(参考訳) 日々の電力消費予測は古典的な問題である。既存の予測アルゴリズムは休日のような特別な日付で精度を低下させる傾向にある。本研究は, 日次電力消費系列を傾向, 季節, 残留の3つの成分に分解し, 分別線形回帰をフィルタとし, ディイル化因数CNNを予測器とする2段階予測法を構築した。具体的なステップは、時間軸にブレークポイントを設定し、断片的な線形回帰モデルに月、平日、休日などの1ホットエンコードされた情報を適用することである。春祭りの難解な予測のために、モデル内の3次多項式形式を用いて距離を変数として導入する。前ステップで得られた残差列はDilated Causal CNNを用いてモデル化し, 日中電力消費の最終的な予測は2段階予測の総和である。実験により,本手法は既存手法と比較して精度が高いことを示した。 Daily electricity consumption forecasting is a classical problem. Existing forecasting algorithms tend to have decreased accuracy on special dates like holidays. This study decomposes the daily electricity consumption series into three components: trend, seasonal, and residual, and constructs a two-stage prediction method using piecewise linear regression as a filter and Dilated Causal CNN as a predictor. The specific steps involve setting breakpoints on the time axis and fitting the piecewise linear regression model with one-hot encoded information such as month, weekday, and holidays. For the challenging prediction of the Spring Festival, distance is introduced as a variable using a third-degree polynomial form in the model. The residual sequence obtained in the previous step is modeled using Dilated Causal CNN, and the final prediction of daily electricity consumption is the sum of the two-stage predictions. Experimental results demonstrate that this method achieves higher accuracy compared to existing approaches.	翻訳日:2023-10-25 22:32:59 公開日:2023-10-23
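The second-stage predictor in the record above relies on dilated causal convolutions. The sketch below shows a single such filter in plain NumPy so the indexing is explicit; the function name and the zero left-padding are assumptions for illustration, and a real model would stack several learned layers with growing dilation.

```python
import numpy as np

def dilated_causal_conv1d(x, weights, dilation=1):
    """Compute y[t] = sum_k weights[k] * x[t - k * dilation], treating x before t = 0 as zero."""
    x = np.asarray(x, dtype=float)
    pad = (len(weights) - 1) * dilation
    # Left-pad with zeros so every output depends only on current and past inputs.
    xp = np.concatenate([np.zeros(pad), x])
    y = np.zeros_like(x)
    for k, w in enumerate(weights):
        start = pad - k * dilation  # window over x shifted back by k * dilation steps
        y += w * xp[start:start + len(x)]
    return y
```

For example, with `weights=[1.0, 1.0]` and `dilation=2` each output is the current value plus the value two steps earlier. In the two-stage scheme described in the abstract, the final forecast would be the piecewise linear regression output plus the network's prediction of the residual series.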
# Transformer-based Capsule Network を用いた転写因子結合サイトの予測 Predicting Transcription Factor Binding Sites using Transformer based Capsule Network ( http://arxiv.org/abs/2310.15202v1 ) ライセンス: Link先を確認	Nimisha Ghosh and Daniele Santoni and Indrajit Saha and Giovanni Felici	(参考訳) 転写因子の結合部位の予測は、遺伝子発現の制御方法と、この調節がどのように治療目的に調節されるかを理解するために重要である。過去数年間、この問題には大きな取り組みがあったが、改善の余地はまだ残っている。この場合、トランスベースのカプセルネットワークvizである。 DNABERT-Capは、ChIP-seqデータセットをマイニングする転写因子結合部位を予測するために提案されている。 DNABERT-Capは、多数のゲノムDNA配列が事前訓練された双方向エンコーダであり、最終予測にカプセル層が関与する。提案モデルは,双方向エンコーダとカプセル層を包含する特徴と,畳み込みおよび双方向の長期記憶層との協調最適化を用いて,転写因子結合部位の予測器を構築する。提案手法の有効性を評価するために,5つのセルラインvizのベンチマークChIP-seqデータセットを用いる。 A549, GM12878, Hep-G2, H1-hESC, Hela – ENCODEリポジトリで利用できる。その結果、受信機動作特性曲線スコアの下の平均面積は、これら5つのセルラインすべてで 0.91 を超えることがわかった。 DNABERT-Capは、最先端のディープラーニングベースの予測器vizと比較される。 DeepARC、DeepTF、CNN-Zeng、DeepBindはそれらを上回っている。 Prediction of binding sites for transcription factors is important to understand how they regulate gene expression and how this regulation can be modulated for therapeutic purposes. Although in the past few years there are significant works addressing this issue, there is still space for improvement. In this regard, a transformer based capsule network viz. DNABERT-Cap is proposed in this work to predict transcription factor binding sites mining ChIP-seq datasets. DNABERT-Cap is a bidirectional encoder pre-trained with large number of genomic DNA sequences, empowered with a capsule layer responsible for the final prediction. The proposed model builds a predictor for transcription factor binding sites using the joint optimisation of features encompassing both bidirectional encoder and capsule layer, along with convolutional and bidirectional long-short term memory layers. To evaluate the efficiency of the proposed approach, we use a benchmark ChIP-seq datasets of five cell lines viz. A549, GM12878, Hep-G2, H1-hESC and Hela, available in the ENCODE repository. 
The results show that the average area under the receiver operating characteristic curve score exceeds 0.91 for all such five cell lines. DNABERT-Cap is also compared with existing state-of-the-art deep learning based predictors viz. DeepARC, DeepTF, CNN-Zeng and DeepBind, and is seen to outperform them.	翻訳日:2023-10-25 22:32:44 公開日:2023-10-23
# AMP相互作用を有する皮膚微生物モデルと個体群動態の準安定性と安定性の解析 A Skin Microbiome Model with AMP interactions and Analysis of Quasi-Stability vs Stability in Population Dynamics ( http://arxiv.org/abs/2310.15201v1 ) ライセンス: Link先を確認	Éléa Thibault Greugny (Lifeware), François Fages (Lifeware), Ovidiu Radulescu (UM), Peter Szmolyan, Georgios Stamatas	(参考訳) 皮膚マイクロバイオームは健康な皮膚の維持に重要な役割を果たしている。これは複数の種からなる生態系であり、資源を競い合い、皮膚細胞と相互作用する。皮膚マイクロバイオームの不均衡(ディスバイオシスとも呼ばれる)は、ニキビやアトピー性皮膚炎などいくつかの皮膚疾患と相関している。一般的に、ディスバイオシスは日和見病原菌の集団による皮膚のコロニー化と関連している。皮膚微生物叢の非特異的除去による治療は相反する結果を示してきた。本稿では, 常微分方程式に基づく数理モデルを導入し, 2種類の細菌集団(皮膚常在菌と日和見病原体)と抗菌ペプチド(AMP)の産生を含めて, 一方の集団が他方に対して優勢となる機構を研究する。我々のモデルにおける安定状態の観測に対応すると仮定した公表済みの実験データを用いて、モデルのパラメータ数を13から5に削減する。次に、定量的時相論理の形式的仕様を用いて、大域的パラメータ最適化によりモデルを校正し、感度分析を行う。実験の2日間の時間スケールでは、皮膚表面pHの上昇のような環境の変化が、日和見病原体集団の出現と皮膚のコロニー形成に好適な条件を生じさせる一方、ヒトAMPの産生は病原体と常在菌のバランスに非線形な影響を及ぼすと予測する。驚くべきことに、より長い時間スケールでのシミュレーションにより、2日程度で到達する平衡が実際には準安定状態であり、12日以上後に逆の安定状態に到達しうることが明らかとなった。我々は,このモデルで観測された準安定性の条件を熱帯代数的手法を用いて解析し,slow-fast系とは対照的にそれが非ジェネリックな性質であることを示す。これらの条件は、任意の数の種からなる幅広いクラスの個体群動態モデルに一般化される。 The skin microbiome plays an important role in the maintenance of a healthy skin. It is an ecosystem, composed of several species, competing for resources and interacting with the skin cells. Imbalance in the cutaneous microbiome, also called dysbiosis, has been correlated with several skin conditions, including acne and atopic dermatitis. Generally, dysbiosis is linked to colonization of the skin by a population of opportunistic pathogenic bacteria. Treatments consisting in non-specific elimination of cutaneous microflora have shown conflicting results. In this article, we introduce a mathematical model based on ordinary differential equations, with 2 types of bacteria populations (skin commensals and opportunistic pathogens) and including the production of antimicrobial peptides to study the mechanisms driving the dominance of one population over the other.
By using published experimental data, assumed to correspond to the observation of stable states in our model, we reduce the number of parameters of the model from 13 to 5. We then use a formal specification in quantitative temporal logic to calibrate our model by global parameter optimization and perform sensitivity analyses. On the time scale of 2 days of the experiments, the model predicts that certain changes of the environment, like the elevation of skin surface pH, create favorable conditions for the emergence and colonization of the skin by the opportunistic pathogen population, while the production of human AMPs has non-linear effect on the balance between pathogens and commensals. Surprisingly, simulations on longer time scales reveal that the equilibrium reached around 2 days can in fact be a quasi-stable state followed by the reaching of a reversed stable state after 12 days or more. We analyse the conditions of quasi-stability observed in this model using tropical algebraic methods, and show their non-generic character in contrast to slow-fast systems. These conditions are then generalized to a large class of population dynamics models over any number of species.	翻訳日:2023-10-25 22:32:21 公開日:2023-10-23
# オープンセット認識のための画像タグ付けに意味概念を注入する Inject Semantic Concepts into Image Tagging for Open-Set Recognition ( http://arxiv.org/abs/2310.15200v1 ) ライセンス: Link先を確認	Xinyu Huang, Yi-Jie Huang, Youcai Zhang, Weiwei Tian, Rui Feng, Yuejie Zhang, Yanchun Xie, Yaqian Li, Lei Zhang	(参考訳) 本稿では,画像タグ学習フレームワークに意味概念を注入することにより,強力なオープンセット認識能力を持つ基本画像認識モデルである認識any plus model~(ram++)を提案する。従来のアプローチは、限定された意味論に制約された画像タグ付けモデルか、マルチタグ認識におけるサブ最適性能のための浅い相互作用を持つ視覚言語モデルである。対照的に、ram++は、画像タグテキストトリプレットに基づく統合きめ細かなインタラクションフレームワークに、画像-テキストアライメントと画像-タグ統合を統合する。この設計により、RAM++は定義済みのカテゴリを識別するだけでなく、オープンセットのカテゴリの認識能力を大幅に向上できる。さらに、RAM++は多種多様なビジュアルタグ記述を生成するために、大きな言語モデル~(LLM)を採用しており、LLMの知識をイメージタグトレーニングに統合する先駆者となっている。このアプローチにより、RAM++は推論中にオープンセット認識のためのビジュアル記述の概念を統合することができる。包括的な画像認識ベンチマークの評価では、RAM++は既存の最先端(SOTA)の基本画像認識モデルよりも多くの面において優れている。具体的には、事前に定義された共通タグカテゴリに対して、RAM++では、OpenImagesとImageNet上のCLIPよりも10.2mAPと15.4mAPの強化が紹介されている。事前定義された以上のオープンセットカテゴリでは、RAM++はCLIPとRAMに対する5mAPと6.4mAPの改善を記録している。多様なヒューマンオブジェクトのインタラクションフレーズに対して、RAM++はHICOベンチマークで7.8mAPと4.7mAPの改善を達成した。コード、データセット、事前学習されたモデルは \url{https://github.com/xinyu1205/recognize-anything} で利用可能である。 In this paper, we introduce the Recognize Anything Plus Model~(RAM++), a fundamental image recognition model with strong open-set recognition capabilities, by injecting semantic concepts into image tagging training framework. Previous approaches are either image tagging models constrained by limited semantics, or vision-language models with shallow interaction for suboptimal performance in multi-tag recognition. In contrast, RAM++ integrates image-text alignment and image-tagging within a unified fine-grained interaction framework based on image-tags-text triplets. This design enables RAM++ not only excel in identifying predefined categories, but also significantly augment the recognition ability in open-set categories. 
Moreover, RAM++ employs large language models (LLMs) to generate diverse visual tag descriptions, pioneering the integration of LLM knowledge into image tagging training. This approach empowers RAM++ to integrate visual description concepts for open-set recognition during inference. Evaluations on comprehensive image recognition benchmarks demonstrate that RAM++ exceeds existing state-of-the-art (SOTA) fundamental image recognition models in most aspects. Specifically, for predefined commonly used tag categories, RAM++ showcases 10.2 mAP and 15.4 mAP enhancements over CLIP on OpenImages and ImageNet. For open-set categories beyond the predefined ones, RAM++ records improvements of 5 mAP and 6.4 mAP over CLIP and RAM respectively on OpenImages. For diverse human-object interaction phrases, RAM++ achieves 7.8 mAP and 4.7 mAP improvements on the HICO benchmark. Code, datasets and pre-trained models are available at \url{https://github.com/xinyu1205/recognize-anything}.	翻訳日:2023-10-25 22:31:48 公開日:2023-10-23
# gradsim:多言語学習のための勾配型言語グループ化 GradSim: Gradient-Based Language Grouping for Effective Multilingual Training ( http://arxiv.org/abs/2310.15269v1 ) ライセンス: Link先を確認	Mingyang Wang, Heike Adel, Lukas Lange, Jannik Str\"otgen, Hinrich Sch\"utze	(参考訳) 世界中のほとんどの言語は、自然言語処理モデルに低リソースの課題をもたらす。多言語学習では、知識は言語間で共有できる。しかし、全ての言語が相互に肯定的な影響を与えている訳ではなく、多言語学習に最適な言語を選択し、特性やデータ分布が相容れない言語間の負の干渉を避けるかというオープンな研究課題である。本稿では,勾配類似度に基づく言語グループ化手法であるGradSimを提案する。 3つの多言語ベンチマークデータセットに対する実験により、他の類似度指標と比較して最大の性能向上につながることが示され、言語間モデルの性能との相関が良好である。その結果,アフリカの低資源言語における感情分析のためのベンチマークデータセットであるafrisentiに新たな最先端技術が設定された。広範な分析では、言語的特徴に加えて、データセットのトピックが言語グループ化において重要な役割を担っており、トランスフォーマーモデルの下位層が言語固有の特徴をエンコードし、上位層がタスク固有の情報をキャプチャする。 Most languages of the world pose low-resource challenges to natural language processing models. With multilingual training, knowledge can be shared among languages. However, not all languages positively influence each other and it is an open research question how to select the most suitable set of languages for multilingual training and avoid negative interference among languages whose characteristics or data distributions are not compatible. In this paper, we propose GradSim, a language grouping method based on gradient similarity. Our experiments on three diverse multilingual benchmark datasets show that it leads to the largest performance gains compared to other similarity measures and it is better correlated with cross-lingual model performance. As a result, we set the new state of the art on AfriSenti, a benchmark dataset for sentiment analysis on low-resource African languages. In our extensive analysis, we further reveal that besides linguistic features, the topics of the datasets play an important role for language grouping and that lower layers of transformer models encode language-specific features while higher layers capture task-specific information.	翻訳日:2023-10-25 22:26:42 公開日:2023-10-23
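The GradSim abstract above describes grouping languages by gradient similarity. The following is only a minimal sketch of that general idea, not the authors' procedure: it assumes each language is summarized by one flattened gradient vector, uses cosine similarity as the measure, and applies a simple greedy threshold rule. The function names, the threshold value, and the toy vectors are all hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def group_by_gradient_similarity(grads, threshold):
    """Greedy grouping (an illustrative rule, not taken from the paper).

    A language joins the first existing group in which its gradient is at
    least `threshold`-similar to every member; otherwise it starts a new group.
    """
    groups = []
    for lang in grads:
        for group in groups:
            if all(cosine(grads[lang], grads[m]) >= threshold for m in group):
                group.append(lang)
                break
        else:
            groups.append([lang])
    return groups


# Hypothetical per-language gradient vectors (toy data, not real measurements).
toy_grads = {
    "lang_a": [1.0, 0.9, 0.0],
    "lang_b": [0.9, 1.0, 0.1],
    "lang_c": [-0.1, 0.0, 1.0],
}
groups = group_by_gradient_similarity(toy_grads, threshold=0.5)
print(groups)
```

In this toy data the first two vectors point in nearly the same direction and the third is almost orthogonal to both, so the first two languages share a group and the third stands alone.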
# aiによるテキスト検出の可能性と不正確性:調査 Towards Possibilities & Impossibilities of AI-generated Text Detection: A Survey ( http://arxiv.org/abs/2310.15264v1 ) ライセンス: Link先を確認	Soumya Suvra Ghosal, Souradip Chakraborty, Jonas Geiping, Furong Huang, Dinesh Manocha, Amrit Singh Bedi	(参考訳) 大規模言語モデル(LLM)は、自然言語処理(NLP)の領域に革命をもたらし、人間のようなテキスト応答を生成する能力を持つ。しかし、これらの進歩にもかかわらず、既存の文献のいくつかは、誤報の拡散、偽ニュースの発生、学界の盗作、ウェブの汚染など、LCMの潜在的な誤用について深刻な懸念を提起している。これらの懸念に対処するために、研究コミュニティのコンセンサスは、AI生成テキストを検出するアルゴリズムソリューションを開発することである。基本的な考え方は、与えられたテキストが人間またはAIによって書かれたものであるかどうかを知ることができれば、上記の懸念に対処するためにこの情報を利用することができるということだ。そのために、AI生成したテキスト検出の可能性を強調する、多数の検出フレームワークが提案されている。しかし、検出フレームワークの開発と並行して、研究者は検出を省くための設計戦略、すなわちAI生成したテキスト検出の不確実性に焦点を当てている。これは検出フレームワークが十分に堅牢であり、検出器を騙すのが容易ではないことを保証するために重要なステップである。この領域における大きな関心と活発な研究にもかかわらず、コミュニティは現在、最近の開発に関する包括的な分析を欠いている。本調査では,AIによるテキスト検出の展望と限界の両方を包含した,簡潔な分類と現状の概観を提案する。集合的知識を豊かにするために、AI生成テキスト検出に関する現在進行中の研究に関連する、批判的で挑戦的なオープンな質問について、徹底的に議論する。 Large Language Models (LLMs) have revolutionized the domain of natural language processing (NLP) with remarkable capabilities of generating human-like text responses. However, despite these advancements, several works in the existing literature have raised serious concerns about the potential misuse of LLMs such as spreading misinformation, generating fake news, plagiarism in academia, and contaminating the web. To address these concerns, a consensus among the research community is to develop algorithmic solutions to detect AI-generated text. The basic idea is that whenever we can tell if the given text is either written by a human or an AI, we can utilize this information to address the above-mentioned concerns. To that end, a plethora of detection frameworks have been proposed, highlighting the possibilities of AI-generated text detection. But in parallel to the development of detection frameworks, researchers have also concentrated on designing strategies to elude detection, i.e., focusing on the impossibilities of AI-generated text detection. 
This is a crucial step in order to make sure the detection frameworks are robust enough and it is not too easy to fool a detector. Despite the huge interest and the flurry of research in this domain, the community currently lacks a comprehensive analysis of recent developments. In this survey, we aim to provide a concise categorization and overview of current work encompassing both the prospects and the limitations of AI-generated text detection. To enrich the collective knowledge, we engage in an exhaustive discussion on critical and challenging open questions related to ongoing research on AI-generated text detection.	翻訳日:2023-10-25 22:25:44 公開日:2023-10-23
# ワンホット一般化線形モデルによる脳状態発見の切り替え One-hot Generalized Linear Model for Switching Brain State Discovery ( http://arxiv.org/abs/2310.15263v1 ) ライセンス: Link先を確認	Chengrui Li, Soon Ho Kim, Chris Rodgers, Hannah Choi, Anqi Wu	(参考訳) 有意義で解釈可能な神経相互作用を明らかにすることは、神経回路を理解する上で重要である。神経信号からの推論された神経相互作用は、主に機能的相互作用を反映する。長い実験では、対象動物は実験、刺激、行動状態によって定義された異なる段階を経験し、したがって機能的相互作用は時間とともに変化する。動的に変化する機能的相互作用をモデル化するために、先行研究では隠れマルコフモデル(HMM-GLM)を持つ状態スイッチング一般化線形モデルを採用している。しかし、機能的相互作用は解剖学的コネクトームによって形作られ、閉じ込められているため、生物学的な可能性に欠けると主張する。本稿では, 先行インフォームド・ステートスイッチング GLM を提案する。各状態においてGLMよりもガウス先行と1ホット先行の両方を導入する。前科は学習可能。学習した前者は、状態と状態の相互作用を捉え、基礎となる解剖学的コネクトームに光を流し、より物理的な相互作用を示すべきであることを示す。各GLMによってモデル化された状態依存相互作用は、複数の脳状態にわたる機能的変動を捉えるトレーサビリティを提供する。本手法は,シミュレーションデータにおける真のインタラクション構造を効果的に復元し,実際のニューラルネットワークで最大予測可能性を達成し,実際のニューラルネットワークに適用した場合のインタラクション構造と隠れた状態をより解釈しやすいものにする。 Exposing meaningful and interpretable neural interactions is critical to understanding neural circuits. Inferred neural interactions from neural signals primarily reflect functional interactions. In a long experiment, subject animals may experience different stages defined by the experiment, stimuli, or behavioral states, and hence functional interactions can change over time. To model dynamically changing functional interactions, prior work employs state-switching generalized linear models with hidden Markov models (i.e., HMM-GLMs). However, we argue they lack biological plausibility, as functional interactions are shaped and confined by the underlying anatomical connectome. Here, we propose a novel prior-informed state-switching GLM. We introduce both a Gaussian prior and a one-hot prior over the GLM in each state. The priors are learnable. We will show that the learned prior should capture the state-constant interaction, shedding light on the underlying anatomical connectome and revealing more likely physical neuron interactions. 
The state-dependent interaction modeled by each GLM offers traceability to capture functional variations across multiple brain states. Our methods effectively recover true interaction structures in simulated data, achieve the highest predictive likelihood with real neural datasets, and render interaction structures and hidden states more interpretable when applied to real neural data.	翻訳日:2023-10-25 22:25:08 公開日:2023-10-23
# コード切り替わったテキストの機械翻訳のためのデータ拡張手法の比較研究 Data Augmentation Techniques for Machine Translation of Code-Switched Texts: A Comparative Study ( http://arxiv.org/abs/2310.15262v1 ) ライセンス: Link先を確認	Injy Hamed, Nizar Habash, Ngoc Thang Vu	(参考訳) コードスイッチング(CSW)テキスト生成は、データの不足に対処するソリューションとして注目されている。このような関心の高まりを踏まえて、異なる拡張アプローチを比較するもっと包括的な研究が必要です。本研究は,エジプト・アラビア・英語CSWの文脈において,語彙置換,言語理論,後方翻訳(BT)の3つの一般的なアプローチを比較した。機械翻訳におけるアプローチの有効性と人的評価による強化の質を評価する。 BTおよびCSW予測に基づく語彙置換は,CSW並列データに基づいて訓練され,両タスクにおいて最善であることを示す。言語理論とランダムな語彙置換はCSW並列データの欠如に有効であることが証明され、どちらも同様の結果が得られる。 Code-switching (CSW) text generation has been receiving increasing attention as a solution to address data scarcity. In light of this growing interest, we need more comprehensive studies comparing different augmentation approaches. In this work, we compare three popular approaches: lexical replacements, linguistic theories, and back-translation (BT), in the context of Egyptian Arabic-English CSW. We assess the effectiveness of the approaches on machine translation and the quality of augmentations through human evaluation. We show that BT and CSW predictive-based lexical replacement, being trained on CSW parallel data, perform best on both tasks. Linguistic theories and random lexical replacement prove to be effective in the absence of CSW parallel data, where both approaches achieve similar results.	翻訳日:2023-10-25 22:24:41 公開日:2023-10-23
# 言語的・非言語的特徴を用いたマルチモーダルデバイス指向音声検出のためのモーダリティドロップアウト Modality Dropout for Multimodal Device Directed Speech Detection using Verbal and Non-Verbal Features ( http://arxiv.org/abs/2310.15261v1 ) ライセンス: Link先を確認	Gautam Krishna, Sameer Dharur, Oggi Rudovic, Pranay Dighe, Saurabh Adya, Ahmed Hussen Abdelaziz, Ahmed H Tewfik	(参考訳) DDSD(Device-directed Speech Detection)は、音声アシスタントに向けられたクエリと、サイド会話やバックグラウンドスピーチを区別するバイナリ分類タスクである。最先端のDDSDシステムは、音声、テキスト、および/または自動音声認識システム(ASR)機能のような言語的手がかりを使用して、音声をデバイス指向またはその他の分類し、現実の設定でデプロイされた場合、これらのモダリティのうち1つ以上は使用できないとしばしば競合する。本稿では,DDSDシステムにおいて,欠落したモードに対してよりロバストにするための融合スキームについて検討する。同時に,DDSDの言語的手がかりに加えて,非言語的手がかり(特に韻律的特徴)の使用について検討した。提案手法は,非線形中間核融合による固定手術点における偽受入率(FA)において,韻律のスコアと埋め込みを対応する動詞の手がかりと組み合わせて最大8.5%向上させるとともに,モーダリティ・ドロップアウト手法を用いることで,推論時間中のモダリティの欠如を評価した場合において,これらのモデルの性能を7.4%向上させる。 Device-directed speech detection (DDSD) is the binary classification task of distinguishing between queries directed at a voice assistant versus side conversation or background speech. State-of-the-art DDSD systems use verbal cues, e.g., acoustic, text and/or automatic speech recognition system (ASR) features, to classify speech as device-directed or otherwise, and often have to contend with one or more of these modalities being unavailable when deployed in real-world settings. In this paper, we investigate fusion schemes for DDSD systems that can be made more robust to missing modalities. Concurrently, we study the use of non-verbal cues, specifically prosody features, in addition to verbal cues for DDSD. We present different approaches to combine scores and embeddings from prosody with the corresponding verbal cues, finding that prosody improves DDSD performance by up to 8.5% in terms of false acceptance rate (FA) at a given fixed operating point via non-linear intermediate fusion, while our use of modality dropout techniques improves the performance of these models by 7.4% in terms of FA when evaluated with missing modalities during inference time.	翻訳日:2023-10-25 22:24:29 公開日:2023-10-23
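The DDSD abstract above credits modality dropout with making fusion models more robust to missing modalities. The sketch below illustrates one common form of the technique under stated assumptions; the abstract does not say how the authors apply it. Here each modality embedding is zeroed independently with probability `drop_prob` during training, and at least one modality is always kept. The function name, the keep-one rule, and the toy embeddings are hypothetical.

```python
import random


def modality_dropout(features, drop_prob, rng):
    """Zero out whole modality embeddings at random (training-time only).

    features:  dict mapping a modality name to its embedding (list of floats)
    drop_prob: probability of dropping each modality independently
    rng:       a random.Random instance, passed in for reproducibility
    At least one modality is always kept (an assumption of this sketch).
    """
    names = list(features)
    dropped = {name for name in names if rng.random() < drop_prob}
    if len(dropped) == len(names):
        # Everything was dropped: restore one modality chosen at random.
        dropped.discard(rng.choice(names))
    return {
        name: ([0.0] * len(vec) if name in dropped else list(vec))
        for name, vec in features.items()
    }


rng = random.Random(0)
# Toy embeddings; the modality names follow the cues listed in the abstract.
example = {"acoustic": [0.2, 0.7], "text": [0.5, 0.1], "prosody": [0.9, 0.3]}
kept_all = modality_dropout(example, 0.0, rng)   # nothing is dropped
kept_one = modality_dropout(example, 1.0, rng)   # all but one are zeroed
print(kept_all)
print(kept_one)
```

Zeroing an embedding is only one way to signal a missing modality; a learned placeholder vector is another, and the abstract does not indicate which the authors use.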
# 質問特典付き雑音質問の翻訳のための参照自由領域適応 Reference Free Domain Adaptation for Translation of Noisy Questions with Question Specific Rewards ( http://arxiv.org/abs/2310.15259v1 ) ライセンス: Link先を確認	Baban Gain, Ramakrishna Appicharla, Soumya Chennabasavaraj, Nikesh Garera, Asif Ekbal, Muthusamy Chelliah	(参考訳) Community Question-Answering (CQA)ポータルは、組織内のユーザを支援する貴重なツールである。しかし、英語以外のユーザーにもアクセスできるようにすることは依然として課題である。質問を翻訳することは、コミュニティのリーチを広げ、様々な言語で同様の問い合わせを行う個人に利益をもたらす。ニューラルマシン翻訳(nmt)を用いた質問の翻訳は、特に、質問の文法的正確性が監視されていないノイズ環境において、さらに課題となる。これらの質問は、非ネイティブ話者による言葉として表現され、不正確な主語順と時には欠落する質問マークがある。このようなデータから合成並列コーパスを作成することも、ノイズの性質から難しい。そこで本研究では,ソース側データのみを用いてNMTシステムを微調整するトレーニング手法を提案する。提案手法は,BERTScore と Masked Language Model (MLM) Score を組み合わせた損失関数を利用することで,妥当性と流速のバランスをとる。提案手法は,1.9 bleuスコア改善を達成することにより,合成対象データに依存する従来のmle(maximum likelihood estimation)ベースの微調整手法を上回った。我々のモデルは、ベースラインにノイズを加えながら堅牢性を示し、なおも1.1BLEUの改善とTERおよびBLEURTメトリクスの大幅な改善を実現している。提案手法はモデル非依存であり,トレーニング段階でのみ必要である。さらなる研究を促進するため、コードとデータセットを \url{https://www.iitp.ac.in/~ai-nlp-ml/resources.html#DomainAdapt} で公開しています。 Community Question-Answering (CQA) portals serve as a valuable tool for helping users within an organization. However, making them accessible to non-English-speaking users continues to be a challenge. Translating questions can broaden the community's reach, benefiting individuals with similar inquiries in various languages. Translating questions using Neural Machine Translation (NMT) poses more challenges, especially in noisy environments, where the grammatical correctness of the questions is not monitored. These questions may be phrased as statements by non-native speakers, with incorrect subject-verb order and sometimes even missing question marks. Creating a synthetic parallel corpus from such data is also difficult due to its noisy nature. To address this issue, we propose a training methodology that fine-tunes the NMT system only using source-side data. 
Our approach balances adequacy and fluency by utilizing a loss function that combines BERTScore and Masked Language Model (MLM) Score. Our method surpasses the conventional Maximum Likelihood Estimation (MLE) based fine-tuning approach, which relies on synthetic target data, by achieving a 1.9 BLEU score improvement. Our model remains robust when we add noise to our baseline, still achieving a 1.1 BLEU improvement and large improvements on the TER and BLEURT metrics. Our proposed methodology is model-agnostic and is only necessary during the training phase. We make the code and datasets publicly available at \url{https://www.iitp.ac.in/~ai-nlp-ml/resources.html#DomainAdapt} to facilitate further research.	翻訳日:2023-10-25 22:24:03 公開日:2023-10-23
# 言語バリアを破る - 構造化自己認識による言語間推論の改善 Breaking the Language Barrier: Improving Cross-Lingual Reasoning with Structured Self-Attention ( http://arxiv.org/abs/2310.15258v1 ) ライセンス: Link先を確認	Negar Foroutan, Mohammadreza Banaei, Karl Aberer, Antoine Bosselut	(参考訳) 本研究では、多言語言語モデル(MultiLM)が、異なる言語での推論のために微調整された場合、論理推論能力を他の言語に伝達できるかどうかを検討する。 1) 文脈と質問の言語がテスト対象の新言語で同じままである場合(つまり、推論は単言語だが、学習された推論能力は言語間で伝達しなければならない)、(2) 文脈の言語と質問の言語が異なる場合(コード変更推論と呼ぶ)、の2つのスキームでMultiLMの言語間推論能力を評価する。 RuleTakerとLeapOfThoughtという2つの論理的推論データセットでは、MultiLMはモノリンガルな環境で言語間で推論能力を転送できるが、コードに切り替えられた環境では推論能力の転送に苦労している。そこで本研究では,ルールテイカーとleapofthoughtデータセットの推論性能をそれぞれ最大14%,4%向上させるコード切替シーケンスにおける言語横断的注意を促すためのパラメータセットを用いた新しい注意機構を提案する。 In this work, we study whether multilingual language models (MultiLMs) can transfer logical reasoning abilities to other languages when they are fine-tuned for reasoning in a different language. We evaluate the cross-lingual reasoning abilities of MultiLMs in two schemes: (1) where the language of the context and the question remain the same in the new languages that are tested (i.e., the reasoning is still monolingual, but the model must transfer the learned reasoning ability across languages), and (2) where the language of the context and the question is different (which we term code-switched reasoning). On two logical reasoning datasets, RuleTaker and LeapOfThought, we demonstrate that although MultiLMs can transfer reasoning ability across languages in a monolingual setting, they struggle to transfer reasoning abilities in a code-switched setting. Following this observation, we propose a novel attention mechanism that uses a dedicated set of parameters to encourage cross-lingual attention in code-switched sequences, which improves the reasoning performance by up to 14% and 4% on the RuleTaker and LeapOfThought datasets, respectively.	翻訳日:2023-10-25 22:23:36 公開日:2023-10-23
# SimBIG:Galaxy Clusteringのフィールドレベルシミュレーションに基づく推論 SimBIG: Field-level Simulation-Based Inference of Galaxy Clustering ( http://arxiv.org/abs/2310.15256v1 ) ライセンス: Link先を確認	Pablo Lemos, Liam Parker, ChangHoon Hahn, Shirley Ho, Michael Eickenberg, Jiamin Hou, Elena Massara, Chirag Modi, Azadeh Moradinezhad Dizgah, Bruno Regaldo-Saint Blancard, David Spergel	(参考訳) 本稿では,銀河クラスタリングのフィールドレベル解析による宇宙パラメータのシミュレーションベース推論(SBI)について述べる。標準銀河クラスタリング分析は、摂動理論に基づく解析モデルを用いて、パワースペクトルである$P_\ell$などの要約統計分析に依存する。したがって、それらは銀河分布の非線形および非ガウス的特徴を完全には活用しない。これらの制限に対処するために、我々はSimBIGフォワードモデリングフレームワークを使用して正規化フローを使用してSBIを実行する。我々は,BOSS CMASS銀河サンプルのサブセットにSimBIGを適用し,確率的重み付けを平均とした畳み込みニューラルネットワークを用いて,銀河場の大規模データ圧縮を行う。我々は、$\Omega_m = 0.267^{+0.033}_{-0.029}$と$\sigma_8=0.762^{+0.036}_{-0.035}$の制約を推論する。 $\Omega_m$の制約は標準の$P_\ell$分析と並んでいるが、$\sigma_8$の制約は$2.65\times$ tightである。解析はまた、銀河のクラスタリングだけでハッブル定数 $H_0=64.5 \pm 3.8 \ {\rm km / s / Mpc}$ の制約を与える。この高い制約パワーは、非ガウス宇宙情報($P_\ell$)から得られる。我々は、トレーニングデータセットで使用されるものと異なるフォワードモデルを用いて構築された一連のテストシミュレーションから、偏りのない宇宙論的制約を推測する能力を示すことにより、解析の堅牢性を示す。この研究は、競争的な宇宙論的制約だけでなく、DESI、PFS、Euclidのような今後の銀河調査で追加の宇宙学的情報を活用する新しい方法も導入している。 We present the first simulation-based inference (SBI) of cosmological parameters from field-level analysis of galaxy clustering. Standard galaxy clustering analyses rely on analyzing summary statistics, such as the power spectrum, $P_\ell$, with analytic models based on perturbation theory. Consequently, they do not fully exploit the non-linear and non-Gaussian features of the galaxy distribution. To address these limitations, we use the SimBIG forward modelling framework to perform SBI using normalizing flows. We apply SimBIG to a subset of the BOSS CMASS galaxy sample using a convolutional neural network with stochastic weight averaging to perform massive data compression of the galaxy field. We infer constraints on $\Omega_m = 0.267^{+0.033}_{-0.029}$ and $\sigma_8=0.762^{+0.036}_{-0.035}$.
While our constraints on $\Omega_m$ are in line with standard $P_\ell$ analyses, those on $\sigma_8$ are $2.65\times$ tighter. Our analysis also provides constraints on the Hubble constant $H_0=64.5 \pm 3.8 \ {\rm km / s / Mpc}$ from galaxy clustering alone. This higher constraining power comes from additional non-Gaussian cosmological information, inaccessible with $P_\ell$. We demonstrate the robustness of our analysis by showcasing our ability to infer unbiased cosmological constraints from a series of test simulations that are constructed using different forward models than the one used in our training dataset. This work not only presents competitive cosmological constraints but also introduces novel methods for leveraging additional cosmological information in upcoming galaxy surveys like DESI, PFS, and Euclid.	翻訳日:2023-10-25 22:23:11 公開日:2023-10-23
# 共有ランダム性は単一測定によるマクロ現実主義の破れを許容する Shared randomness allows violation of macroscopic realism using a single measurement ( http://arxiv.org/abs/2310.15253v1 ) ライセンス: Link先を確認	Shubhayan Sarkar	(参考訳) システムのマクロ現実主義的な記述は、古典的世界に関する2つの基本的な直観、すなわち、システムが常に異なる状態にあり、非侵襲的な測定、すなわち、測定がシステムを妨げないというものである。時間の符号付けを仮定すると、Leggett-Gargの不等式を利用して、少なくとも3つの測定を必要とするマクロ現実主義の違反を観測する。本研究は,共有ランダム性へのアクセスがある場合,時間内の信号が満たされていない場合でも,単一の測定でマクロ現実主義の違反を観察できることを示す。興味深いことに、提案されたスキームを用いることで、より大きなモデルのクラスを除外することが可能であり、これは時間条件における無符号を破ることができない「マクロスコープ無符号」理論(macroscopic no-signalling theory)と呼ばれる。我々はさらに、マクロなno-signallingの違反を観察する証人を構築した。 Macro-realistic description of systems is based majorly on two basic intuitions about the classical world, namely, macrorealism per se, that is, the system is always in a distinct state, and non-invasive measurements, that is, measurements do not disturb the system. Given the assumption of no-signalling in time, one utilizes Leggett-Garg inequalities to observe a violation of macroscopic realism which requires at least three measurements. In this work, we show that if one has access to shared randomness then one can observe a violation of macroscopic realism using a single measurement even if no signalling in time is satisfied. Interestingly, using the proposed scheme one can also rule out a larger class of models, which we term "macroscopic no-signalling" theories which can not violate the no-signalling in time conditions. We further construct a witness to observe the violation of macroscopic no-signalling.	翻訳日:2023-10-25 22:22:44 公開日:2023-10-23
# SyncFusion:マルチモーダルオンセット同期ビデオ音声合成 SyncFusion: Multimodal Onset-synchronized Video-to-Audio Foley Synthesis ( http://arxiv.org/abs/2310.15247v1 ) ライセンス: Link先を確認	Marco Comunit\`a, Riccardo F. Gramaccioni, Emilian Postolache, Emanuele Rodol\`a, Danilo Comminiello, Joshua D. Reiss	(参考訳) サウンドデザインには、映画、ビデオゲーム、バーチャル/拡張現実といった様々なメディアのサウンドエフェクトを創造的に選択、記録、編集することが含まれる。音を設計する際に最も時間がかかるステップは、音声とビデオの同期です。一部のケースでは、ビデオ撮影からの環境記録が利用可能であり、このプロセスに役立つ。しかし、ビデオゲームやアニメーションでは、参照音声は存在せず、ビデオからのイベントタイミングのマニュアルアノテーションを必要とする。そこで本研究では,映像からの繰り返し動作を音声やテキストの埋め込みと組み合わせて抽出し,新しい同期音響効果音声トラックを生成するように訓練した拡散モデルを条件付けるシステムを提案する。このようにして、ビデオとの同期の負担を取り除きながら、完全な創造的制御を音響デザイナーに任せる。さらに、オンセットトラックの編集やコンディショニング埋め込みの変更は、オーディオトラック自体の編集よりもはるかに手間がかかり、音化処理が簡単になる。再現性を高めるために、音の例、ソースコード、事前訓練されたモデルを提供する。 Sound design involves creatively selecting, recording, and editing sound effects for various media like cinema, video games, and virtual/augmented reality. One of the most time-consuming steps when designing sound is synchronizing audio with video. In some cases, environmental recordings from video shoots are available, which can aid in the process. However, in video games and animations, no reference audio exists, requiring manual annotation of event timings from the video. We propose a system to extract the onsets of repetitive actions from a video, which are then used, in conjunction with audio or textual embeddings, to condition a diffusion model trained to generate a new synchronized sound effects audio track. In this way, we leave complete creative control to the sound designer while removing the burden of synchronization with video. Furthermore, editing the onset track or changing the conditioning embedding requires much less effort than editing the audio track itself, simplifying the sonification process. We provide sound examples, source code, and pretrained models to facilitate reproducibility.	翻訳日:2023-10-25 22:22:27 公開日:2023-10-23
# 並列量子高速探索ランダムツリー Parallel Quantum Rapidly-Exploring Random Trees ( http://arxiv.org/abs/2310.15303v1 ) ライセンス: Link先を確認	Paul Lathrop, Beth Boardman, Sonia Mart\'inez	(参考訳) 本稿では,量子高速探索確率木 (q-rrt) アルゴリズムの並列版である並列量子高速探索確率木 (pq-rrt) アルゴリズムを提案する。並列量子RRT(Parallel Quantum RRT)は、サンプリングベースモーションプランナの並列量子アルゴリズムで、量子振幅増幅を用いて木に加えて到達可能な状態のデータベースを検索する。本研究では,並列量子デバイスがデータベースをより効率的に探索する方法について検討する。量子計測プロセスでは,重ね合わせが基底状態へ崩壊し,確率情報を消去し,複数の解を効率的に見つけることができる。 Pq-RRTは、従来の並列動作計画にインスパイアされたマネージャ/並列量子ワーカーの定式化を使用して、実現可能な状態データベースの同時量子検索を行う。本稿では,複数の並列ユニットが共有データベースに含まれる任意のソリューションを,到達可能性エラーの有無に関わらず発見する可能性について検討し,効率の予測を可能にする。我々は,Pq-RRTとq-RRT,古典的RRT,古典的並列RRTの効率,密度/熱マップ,および速度比較を高密度障害物環境でシミュレーションする。次に,pq-rrtとq-rrtのためのデータベース構築戦略であるquantum database annealingを提案する。 In this paper, we present the Parallel Quantum Rapidly-Exploring Random Tree (Pq-RRT) algorithm, a parallel version of the Quantum Rapidly-Exploring Random Trees (q-RRT) algorithm. Parallel Quantum RRT is a parallel quantum algorithm formulation of a sampling-based motion planner that uses Quantum Amplitude Amplification to search databases of reachable states for addition to a tree. In this work we investigate how parallel quantum devices can more efficiently search a database, as the quantum measurement process involves the collapse of the superposition to a base state, erasing probability information and therefore the ability to efficiently find multiple solutions. Pq-RRT uses a manager/parallel-quantum-workers formulation, inspired by traditional parallel motion planning, to perform simultaneous quantum searches of a feasible state database. We present results regarding likelihoods of multiple parallel units finding any and all solutions contained with a shared database, with and without reachability errors, allowing efficiency predictions to be made. We offer simulations in dense obstacle environments showing efficiency, density/heatmap, and speed comparisons for Pq-RRT against q-RRT, classical RRT, and classical parallel RRT. 
We then present Quantum Database Annealing, a database construction strategy for Pq-RRT and q-RRT that uses a temperature construct to define database creation over time for balancing exploration and exploitation.	翻訳日:2023-10-25 22:13:46 公開日:2023-10-23
# 仮想現実技術への包含:スコーピングレビュー Inclusion in Virtual Reality Technology: A Scoping Review ( http://arxiv.org/abs/2310.15289v1 ) ライセンス: Link先を確認	Xiaofeng Yong and Ali Arya	(参考訳) 仮想現実の応用と研究の著しい成長にもかかわらず、仮想現実への包摂の概念は十分に研究されていない。インクルージョンとは、vr技術やアプリケーションの採用、使用、設計、開発において、さまざまなグループの人々が積極的に関与することを指す。本稿では,既存の仮想現実研究文献のスコーピング分析について述べる。対象とするグループに基づく文献を,能力,性別,年齢に分類し,コミュニティによるVR体験のデザインを研究する。後者のグループでは、より明確でより重要な事例として、主に先住民に焦点をあてる。また,技術導入とデザインにおけるユーザの役割を包括研究の背景として,モデルへのアプローチを簡潔にレビューし,考察する。我々は、一連の一般的な障壁と研究ギャップと、各グループ固有のものを特定し、将来の研究の方向性を提案する。 Despite the significant growth in virtual reality applications and research, the notion of inclusion in virtual reality is not well studied. Inclusion refers to the active involvement of different groups of people in the adoption, use, design, and development of VR technology and applications. In this review, we provide a scoping analysis of existing virtual reality research literature about inclusion. We categorize the literature based on target group into ability, gender, and age, followed by those that study community-based design of VR experiences. In the latter group, we focus mainly on Indigenous Peoples as a clearer and more important example. We also briefly review the approaches to model and consider the role of users in technology adoption and design as a background for inclusion studies. We identify a series of generic barriers and research gaps and some specific ones for each group, resulting in suggested directions for future research.	翻訳日:2023-10-25 22:13:21 公開日:2023-10-23
# 人的フィードバックによる強化学習のためのアクティブ教師選択 Active teacher selection for reinforcement learning from human feedback ( http://arxiv.org/abs/2310.15288v1 ) ライセンス: Link先を確認	Rachel Freedman, Justin Svegliato, Kyle Wray, Stuart Russell	(参考訳) human feedback(rlhf)からの強化学習は、人間のフィードバックから目的を学習する機械学習システムを可能にする。これらのシステムの中核的な制限は、すべてのフィードバックが1人の人間教師からのものであるという仮定である。教師の合理性、専門性、コストの相違をモデル化し、複数の教師からの学習問題を定式化するHUB(Hidden Utility Bandit)フレームワークを提案する。我々は、様々なソリューションアルゴリズムを開発し、それらを2つの現実世界のドメイン、ペーパーレコメンデーションシステムとcovid-19ワクチンテストに適用する。アクティブ教師選択(ATS)アルゴリズムは,いつ,どの教師に問い合わせるかを積極的に選択することで,ベースラインアルゴリズムよりも優れていることがわかった。 HUBフレームワークとATSアルゴリズムは、教師間の差異を活用して正確な報酬モデルを学ぶことの重要性を示し、堅牢な報酬モデルのためのアクティブな教師選択に関する今後の研究を促進する。 Reinforcement learning from human feedback (RLHF) enables machine learning systems to learn objectives from human feedback. A core limitation of these systems is their assumption that all feedback comes from a single human teacher, despite querying a range of distinct teachers. We propose the Hidden Utility Bandit (HUB) framework to model differences in teacher rationality, expertise, and costliness, formalizing the problem of learning from multiple teachers. We develop a variety of solution algorithms and apply them to two real-world domains: paper recommendation systems and COVID-19 vaccine testing. We find that the Active Teacher Selection (ATS) algorithm outperforms baseline algorithms by actively selecting when and which teacher to query. The HUB framework and ATS algorithm demonstrate the importance of leveraging differences between teachers to learn accurate reward models, facilitating future research on active teacher selection for robust reward modeling.	翻訳日:2023-10-25 22:13:08 公開日:2023-10-23
# スパース強化学習への二重ロバストなアプローチ A Doubly Robust Approach to Sparse Reinforcement Learning ( http://arxiv.org/abs/2310.15286v1 ) ライセンス: Link先を確認	Wonyoung Kim and Garud Iyengar and Assaf Zeevi	(参考訳) 状態遷移分布が観測された特徴の線形関数であるエピソドックススパース線形マルコフ決定過程(SMDP)に対する新たな後悔最小化アルゴリズムを提案する。 SMDPの唯一の既知のアルゴリズムは、疎度パラメータと未知のポリシーへのオラクルアクセスの知識を必要とする。これらの制限を克服するために,各エピソードのすべての周期からのデータを利用する新しい解析手法と,\emph{all} アクションの特徴ベクトルを併用する2つのロバストな手法を提案する。提案アルゴリズムの後悔は$\tilde{O}(\sigma^{-1}_{\min} s_{\star} H \sqrt{N})$であり、ここで$\sigma_{\min}$ は特徴ベクトルの平均グラム行列の最小固有値、$s_\star$ はスパーシティパラメータ、$H$ はエピソードの長さ、$N$ はラウンド数である。新たに同定されたSMDPのサブクラスにおいて,上界から対数的因子までを一致させる低い後悔境界を提供する。我々の数値実験は理論的な結果をサポートし,アルゴリズムの優れた性能を示す。 We propose a new regret minimization algorithm for episodic sparse linear Markov decision process (SMDP) where the state-transition distribution is a linear function of observed features. The only previously known algorithm for SMDP requires the knowledge of the sparsity parameter and oracle access to an unknown policy. We overcome these limitations by combining the doubly robust method that allows one to use feature vectors of \emph{all} actions with a novel analysis technique that enables the algorithm to use data from all periods in all episodes. The regret of the proposed algorithm is $\tilde{O}(\sigma^{-1}_{\min} s_{\star} H \sqrt{N})$, where $\sigma_{\min}$ denotes the restrictive minimum eigenvalue of the average Gram matrix of feature vectors, $s_\star$ is the sparsity parameter, $H$ is the length of an episode, and $N$ is the number of rounds. We provide a lower regret bound that matches the upper bound up to logarithmic factors on a newly identified subclass of SMDPs. Our numerical experiments support our theoretical results and demonstrate the superior performance of our algorithm.	翻訳日:2023-10-25 22:12:53 公開日:2023-10-23
# 文埋め込みの次元性について On the Dimensionality of Sentence Embeddings ( http://arxiv.org/abs/2310.15285v1 ) ライセンス: Link先を確認	Hongwei Wang, Hongming Zhang, Dong Yu	(参考訳) 文埋め込みの学習は自然言語処理の基本的な問題である。既存の研究は主に文埋め込みの品質向上に焦点を当てているが、文埋め込み次元の探索は限られている。本稿では,文埋め込みの次元性に関する包括的かつ実証的な解析を行う。まず、文埋め込みの最適次元が通常デフォルト値よりも小さいことを示す。次に、文埋め込みの次元を最小性能劣化で圧縮するために、エンコーダのパフォーマンス損失とプーラーのパフォーマンス損失という、全体的なパフォーマンス損失に寄与する2つのコンポーネントを特定した。そこで本研究では,低次元シナリオにおける全体的な性能損失を軽減するために,エンコーダとプーラを別々に最適化した文表現学習モデルの2段階学習法を提案する。 7つのSTSタスクと7つの文分類タスクの実験結果から,本手法は低次元文埋め込みの性能を著しく向上させることが示された。 Learning sentence embeddings is a fundamental problem in natural language processing. While existing research primarily focuses on enhancing the quality of sentence embeddings, the exploration of sentence embedding dimensions is limited. Here we present a comprehensive and empirical analysis of the dimensionality of sentence embeddings. First, we demonstrate that the optimal dimension of sentence embeddings is usually smaller than the default value. Subsequently, to compress the dimension of sentence embeddings with minimum performance degradation, we identify two components contributing to the overall performance loss: the encoder's performance loss and the pooler's performance loss. Therefore, we propose a two-step training method for sentence representation learning models, wherein the encoder and the pooler are optimized separately to mitigate the overall performance loss in low-dimension scenarios. Experimental results on seven STS tasks and seven sentence classification tasks demonstrate that our method significantly improves the performance of low-dimensional sentence embeddings.	翻訳日:2023-10-25 22:12:27 公開日:2023-10-23
# uncertaintyplayground - 不確実性推定のための高速でシンプルなpythonライブラリ UncertaintyPlayground: A Fast and Simplified Python Library for Uncertainty Estimation ( http://arxiv.org/abs/2310.15281v1 ) ライセンス: Link先を確認	Ilia Azizi	(参考訳) 本稿では,PyTorchとGPyTorchをベースとしたPythonライブラリであるUncertaintyPlaygroundを紹介し,教師付き学習タスクにおける不確実性の評価を行う。このライブラリは、通常分散された結果に対するスパースおよび変分ガウスプロセス回帰(SVGPR)と混合分布のための混合密度ネットワーク(MDN)を通じて、ガウスおよびマルチモーダルな結果分布の高速トレーニングを提供する。さまざまなハイパーパラメータによるモデルトレーニングに加えて、UncertaintyPlaygroundは1つ以上のインスタンスの予測間隔を視覚化することができる。テンソル演算を使用するため、ライブラリはcpuとgpuの両方でトレーニングでき、速度最適化のための様々なpytorch固有の技術を提供する。ライブラリには各モジュールのユニットテストが含まれており、GitHub Workflows(オンライン統合)とTox(ローカル統合)とのマルチプラットフォーム継続的インテグレーションを保証する。最後に、コードはGoogleスタイルのドキュメントでドキュメント化され、MkDocsとMkDocStringsで作成されたドキュメントWebサイトを提供する。 This paper introduces UncertaintyPlayground, a Python library built on PyTorch and GPyTorch for uncertainty estimation in supervised learning tasks. The library offers fast training for Gaussian and multi-modal outcome distributions through Sparse and Variational Gaussian Process Regressions (SVGPRs) for normally distributed outcomes and Mixed Density Networks (MDN) for mixed distributions. In addition to model training with various hyperparameters, UncertaintyPlayground can visualize the prediction intervals of one or more instances. Due to using tensor operations, the library can be trained both on CPU and GPU and offers various PyTorch-specific techniques for speed optimization. The library contains unit tests for each module and ensures multi-platform continuous integration with GitHub Workflows (online integration) and Tox (local integration). Finally, the code is documented with Google-style docstrings and offers a documentation website created with MkDocs and MkDocStrings.	翻訳日:2023-10-25 22:12:12 公開日:2023-10-23
# 重み付き木接合言語認識のための効率的なアルゴリズム Efficient Algorithms for Recognizing Weighted Tree-Adjoining Languages ( http://arxiv.org/abs/2310.15276v1 ) ライセンス: Link先を確認	Alexandra Butoi, Tim Vieira, Ryan Cotterell, David Chiang	(参考訳) 木接合言語のクラスは、文脈自由文法(CFG)またはプッシュダウンオートマトン(PDA)が別のCFGまたはPDAを制御する、様々な2階層の形式体系によって特徴づけられる。これら4つの形式体系は、tree-adjoining grammars (TAG)、linear indexed grammars (LIG)、pushdown-adjoining automata (PAA)、embedded pushdown automata (EPDA)と等価である。上述の2階層形式体系の半環重み付きバージョンを定義し、それらのstringsum(ある文字列のすべての導出の重み)とallsum(すべての導出の重み)を計算するための新しいアルゴリズムを設計する。これらの結果から、TAG、LIG、PAA、EPDAに対するstringsumおよびallsumアルゴリズムも直ちに得られる。 LIGに対して、我々のアルゴリズムはVijay-Shanker and Weir (1989)のアルゴリズムよりも、時間効率が$\mathcal{O}(n\|\mathcal{N}\|)$倍(ここで $n$ は文字列の長さ、$\|\mathcal{N}\|$ は非終端記号集合のサイズ)、空間効率が$\mathcal{O}(\|\Gamma\|)$倍(ここで $\|\Gamma\|$ はスタックアルファベットのサイズ)良い。 EPDAの場合、我々のアルゴリズムは、Alonso et al. (2001)のアルゴリズムよりも空間効率と時間効率がそれぞれ$\mathcal{O}(\|\Gamma\|^2)$倍と$\mathcal{O}(\|\Gamma\|^3)$倍良い。最後に、初のPAA向けstringsumおよびallsumアルゴリズムを与える。 The class of tree-adjoining languages can be characterized by various two-level formalisms, consisting of a context-free grammar (CFG) or pushdown automaton (PDA) controlling another CFG or PDA. These four formalisms are equivalent to tree-adjoining grammars (TAG), linear indexed grammars (LIG), pushdown-adjoining automata (PAA), and embedded pushdown automata (EPDA). We define semiring-weighted versions of the above two-level formalisms, and we design new algorithms for computing their stringsums (the weight of all derivations of a string) and allsums (the weight of all derivations). From these, we also immediately obtain stringsum and allsum algorithms for TAG, LIG, PAA, and EPDA. For LIG, our algorithm is more time-efficient by a factor of $\mathcal{O}(n\|\mathcal{N}\|)$ (where $n$ is the string length and $\|\mathcal{N}\|$ is the size of the nonterminal set) and more space-efficient by a factor of $\mathcal{O}(\|\Gamma\|)$ (where $\|\Gamma\|$ is the size of the stack alphabet) than the algorithm of Vijay-Shanker and Weir (1989). 
For EPDA, our algorithm is both more space-efficient and time-efficient than the algorithm of Alonso et al. (2001) by factors of $\mathcal{O}(\|\Gamma\|^2)$ and $\mathcal{O}(\|\Gamma\|^3)$, respectively. Finally, we give the first PAA stringsum and allsum algorithms.	翻訳日:2023-10-25 22:11:55 公開日:2023-10-23
# 拡張予測のための三重単純行列補完 Triple Simplex Matrix Completion for Expense Forecasting ( http://arxiv.org/abs/2310.15275v1 ) ライセンス: Link先を確認	Cheng Qian and Lucas Glass and Nikos Sidiropoulos	(参考訳) 予算超過やプロジェクトの失敗を避けるためには,プロジェクト費用の予測が重要なステップである。伝統的に、これは金融アナリストや時系列分析のようなデータサイエンス技術によって行われてきた。しかし、これらのアプローチは不確実であり、特にデータポイントが限られたプロジェクトの開始時に計画された予算とは異なる結果を生み出す可能性がある。本稿では,潜在空間における特定の費用パターンに関連するプロジェクトの可能性を学習し,コストを予測する制約付き非負行列補完モデルを提案する。モデルは3つの確率的単純度に制約され、そのうちの2つは係数行列に、3つは欠落するエントリに制約される。さらに、予測経費値は、後処理を必要とせずに予算制約を満たすことが保証される。関連する最適化問題を解くために,不正確な交互最適化アルゴリズムが開発され,定常点に収束することが証明される。 2つの実データから得られた結果は,最先端アルゴリズムと比較して提案手法の有効性を示す。 Forecasting project expenses is a crucial step for businesses to avoid budget overruns and project failures. Traditionally, this has been done by financial analysts or data science techniques such as time-series analysis. However, these approaches can be uncertain and produce results that differ from the planned budget, especially at the start of a project with limited data points. This paper proposes a constrained non-negative matrix completion model that predicts expenses by learning the likelihood of the project correlating with certain expense patterns in the latent space. The model is constrained on three probability simplexes, two of which are on the factor matrices and the third on the missing entries. Additionally, the predicted expense values are guaranteed to meet the budget constraint without the need of post-processing. An inexact alternating optimization algorithm is developed to solve the associated optimization problem and is proven to converge to a stationary point. Results from two real datasets demonstrate the effectiveness of the proposed method in comparison to state-of-the-art algorithms.	翻訳日:2023-10-25 22:11:16 公開日:2023-10-23
# AGIのためのシステムAIアプローチ:アライメント、エネルギー、AGIグランドチャレンジへの取り組み Systematic AI Approach for AGI: Addressing Alignment, Energy, and AGI Grand Challenges ( http://arxiv.org/abs/2310.15274v1 ) ライセンス: Link先を確認	Eren Kurshan	(参考訳) AIは、エネルギーの壁、アライメント問題、ナローAIからAGIへの飛躍という三大課題に直面している。現代のAIソリューションは、モデルトレーニングと日々の運用の間、持続不可能な量のエネルギーを消費する。さらに悪いことに、2020年以降、新しいAIモデルをトレーニングするために必要な計算量は2ヶ月毎に倍増しており、これはエネルギー消費の増加に直結している。AIからAGIへの飛躍は、バランスの取れた方法で運用される複数の機能サブシステムを必要としており、システムアーキテクチャを必要とする。しかし、システム特性は情報を処理する方法から意思決定方法に至るまで人間の脳において重要な役割を果たすにもかかわらず、現在の人工知能のアプローチはシステム設計に欠けている。同様に、現在のアライメントとAI倫理のアプローチはシステム設計をほとんど無視しているが、研究では、脳のシステムアーキテクチャが健全な道徳的決定において重要な役割を果たすことが示されている。我々は,AGIのシステム設計原則を活用したAGIのためのシステムAIアプローチを提案し,エネルギーの壁とアライメントの課題を克服する方法を提供する。 AI faces a trifecta of grand challenges: the Energy Wall, the Alignment Problem and the Leap from Narrow AI to AGI. Contemporary AI solutions consume unsustainable amounts of energy during model training and daily operations. Making things worse, the amount of computation required to train each new AI model has been doubling every 2 months since 2020, directly translating to increases in energy consumption. The leap from AI to AGI requires multiple functional subsystems operating in a balanced manner, which requires a system architecture. However, the current approach to artificial intelligence lacks system design, even though system characteristics play a key role in the human brain, from the way it processes information to how it makes decisions. Similarly, current alignment and AI ethics approaches largely ignore system design, yet studies show that the brain's system architecture plays a critical role in healthy moral decisions. In this paper, we argue that system design is critically important in overcoming all three grand challenges. 
We posit that system design is the missing piece in overcoming the grand challenges. We present a Systematic AI Approach for AGI that utilizes system design principles for AGI, while providing ways to overcome the energy wall and the alignment challenges.	翻訳日:2023-10-25 22:11:01 公開日:2023-10-23
# フェルミオン共形場理論はより絡み合うか? Are fermionic conformal field theories more entangled? ( http://arxiv.org/abs/2310.15273v1 ) ライセンス: Link先を確認	Gilles Parez, William Witczak-Krempa	(参考訳) 量子臨界系における解離部分領域間の絡み合いを対数ネガティティティのレンズを用いて検討する。我々は一般次元における共形場理論(CFT)とその対応する格子ハミルトン理論を扱う。小さな分離では対数ネガティビティが大きく、普遍的な振る舞いを示すが、大きな分離ではどのパワーよりも速く崩壊する。これは既にシングルスピン部分領域の最小設定で見ることができる。大規模な分離における蒸留可能な絡み合いの欠如は1dの結果を一般化し、少なくともボソンにとって量子臨界基底状態が長い範囲の二分性絡み合いを持たないことを示す。フェルミオンを持つ系に対しては、フェルミオンパリティを考慮した対数否定性のより適切な定義が存在し、代数的に崩壊することを示す。その過程で、部分転位密度行列のモーメントに対する一般的な CFT 結果を得る。 We study the entanglement between disjoint subregions in quantum critical systems through the lens of the logarithmic negativity. We work with conformal field theories (CFTs) in general dimensions, and their corresponding lattice Hamiltonians. At small separations, the logarithmic negativity is big and shows universal behaviour, but we show non-perturbatively that it decays faster than any power at large separations. This can already be seen in the minimal setting of single-spin subregions. The corresponding absence of distillable entanglement at large separations generalises the 1d result, and indicates that quantum critical groundstates do not possess long range bipartite entanglement, at least for bosons. For systems with fermions, a more suitable definition of the logarithmic negativity exists that takes into account fermion parity, and we show that it decays algebraically. Along the way we obtain general CFT results for the moments of the partially transposed density matrix.	翻訳日:2023-10-25 22:10:36 公開日:2023-10-23
# フォールトトレラント量子コンピュータのためのデコーダの選択方法:スピードと正確さのトレードオフ How to choose a decoder for a fault-tolerant quantum computer? The speed vs accuracy trade-off ( http://arxiv.org/abs/2310.15313v1 ) ライセンス: Link先を確認	Nicolas Delfosse, Andres Paz, Alexander Vaschillo and Krysta M. Svore	(参考訳) 実用的な量子アドバンテージを達成するには、計算中に障害を特定し修正する古典的な復号アルゴリズムが必要である。この古典的な復号アルゴリズムは、精度と速度の両方を提供する必要がある。デコーダはいつ「十分速い」のか、それとも「十分正確な」のか? 表面符号の場合、数十の復号アルゴリズムが提案され、異なる精度と速度が提案されている。しかし、与えられた量子アーキテクチャの最適なデコーダをどのように選ぶかは定かではない。より高速なデコーダを精度の低い価格で使用すべきか? あるいはデコーダが精度を犠牲にして所定の時間内に収まるべきか? デコーダが遅すぎると、いくつかのタイムアウト障害の価格と失敗率の増加によって、タイムバウンドに達すると停止する可能性がある。その後、デコーダの最適停止時間はどうなるか? 速度と精度のトレードオフを解析することにより、異なるタスクに対するデコーダの最適停止時間を選択する戦略を提案する。論理ゲート当たりの時空コストを最小限に抑えるデコーダを選択するためのプロトコルを設計し、与えられた深さの論理計算を行う。本プロトコルは,異なるデコーダの比較と,所定のフォールトトレラント量子コンピューティングアーキテクチャに適したデコーダの選択を可能にする。我々はPyMatchingデコーダのデスクトップ実装を備えた表面符号のためのプロトコルについて述べる。 PyMatchingは物理量子ビットよりも精度良く数千の論理ゲートを実装するのに十分速いと推定する。しかし、量子ビットをアイドルさせ、アイドル中にエラーを蓄積させる復号遅延のため、ある仮定の下で10^5論理ゲートに到達するのが十分ではない。 PyMatchingのさらなる改善は、よりよいマシン上で実行したり、OSの干渉を減らすことで可能になります。 Achieving practical quantum advantage requires a classical decoding algorithm to identify and correct faults during computation. This classical decoding algorithm must deliver both accuracy and speed, but in what combination? When is a decoder "fast enough" or "accurate enough"? In the case of surface codes, tens of decoding algorithms have been proposed, with different accuracies and speeds. However, it has been unclear how to choose the best decoder for a given quantum architecture. Should a faster decoder be used at the price of reduced accuracy? Or should a decoder sacrifice accuracy to fit within a given time constraint? If a decoder is too slow, it may be stopped upon reaching a time bound, at the price of some time-out failures and an increased failure rate. What then is the optimal stopping time of the decoder? By analyzing the speed vs. accuracy tradeoff, we propose strategies to select the optimal stopping time for a decoder for different tasks. We design a protocol to select the decoder that minimizes the spacetime cost per logical gate, for logical computation of a given depth. Our protocol enables comparison of different decoders, and the selection of an appropriate decoder for a given fault-tolerant quantum computing architecture. We illustrate our protocol for the surface code equipped with a desktop implementation of the PyMatching decoder. We estimate PyMatching is fast enough to implement thousands of logical gates with a better accuracy than physical qubits. However, we find it is not sufficiently fast to reach 10^5 logical gates, under certain assumptions, due to the decoding delay which forces qubits to idle and accumulate errors while idling. We expect further improvements to PyMatching are possible by running it on a better machine or by reducing the OS interference.	翻訳日:2023-10-25 22:04:33 公開日:2023-10-23
# SAM-CLIP:意味的・空間的理解に向けた視覚基礎モデルの融合 SAM-CLIP: Merging Vision Foundation Models towards Semantic and Spatial Understanding ( http://arxiv.org/abs/2310.15308v1 ) ライセンス: Link先を確認	Haoxiang Wang, Pavan Kumar Anasosalu Vasu, Fartash Faghri, Raviteja Vemulapalli, Mehrdad Farajtabar, Sachin Mehta, Mohammad Rastegari, Oncel Tuzel, Hadi Pouransari	(参考訳) CLIP や Segment Anything Model (SAM) など,一般公開されたビジョンファウンデーションモデル (VFM) の展望は急速に拡大している。 vfmには、訓練前の目的から生じる異なる能力が与えられている。例えば、CLIPは意味理解に優れ、SAMはセグメンテーションのための空間理解に特化している。本稿では,vfmsを統一モデルに効率的に統合し,その専門性を統一する簡単なレシピを提案する。提案手法は, マルチタスク学習, 連続学習技術, 教師-学生蒸留を統合した。この戦略は、従来のマルチタスクトレーニングに比べ、計算コストを大幅に削減する。さらに、個々のモデルをトレーニングするために最初に使用されたトレーニング済みデータセットのごく一部しか必要としない。 SAM-CLIPはSAMとCLIPの強度を1つのバックボーンに統合し、エッジデバイスアプリケーションに適応する統一モデルである。 SAM-CLIPは、よりリッチな視覚表現を学習し、広範囲の視覚タスクに適した局所化と意味的特徴を持つことを示す。 SAM-CLIP は SAM や CLIP と比較して,複数の頭部探索タスクのパフォーマンス向上を実現している。さらに、SAM-CLIPは前駆体モデルの基礎的強みを保持するだけでなく、特にゼロショットセマンティックセマンティックセグメンテーションにおいて相乗的機能を導入し、SAM-CLIPは5つのベンチマークで新しい最先端結果を確立する。これは、pascal-voc と coco-stuff データセットでそれぞれ +6.8% と +5.9% の平均 iou 改善を含む、このタスク用に特別に設計された以前のモデルを上回る。 The landscape of publicly available vision foundation models (VFMs), such as CLIP and Segment Anything Model (SAM), is expanding rapidly. VFMs are endowed with distinct capabilities stemming from their pre-training objectives. For instance, CLIP excels in semantic understanding, while SAM specializes in spatial understanding for segmentation. In this work, we introduce a simple recipe to efficiently merge VFMs into a unified model that assimilates their expertise. Our proposed method integrates multi-task learning, continual learning techniques, and teacher-student distillation. This strategy entails significantly less computational cost compared to traditional multi-task training from scratch. Additionally, it only demands a small fraction of the pre-training datasets that were initially used to train individual models. 
By applying our method to SAM and CLIP, we derive SAM-CLIP: a unified model that amalgamates the strengths of SAM and CLIP into a single backbone, making it apt for edge device applications. We show that SAM-CLIP learns richer visual representations, equipped with both localization and semantic features, suitable for a broad range of vision tasks. SAM-CLIP obtains improved performance on several head probing tasks when compared with SAM and CLIP. We further show that SAM-CLIP not only retains the foundational strengths of its precursor models but also introduces synergistic functionalities, most notably in zero-shot semantic segmentation, where SAM-CLIP establishes new state-of-the-art results on 5 benchmarks. It outperforms previous models that are specifically designed for this task by a large margin, including +6.8% and +5.9% mean IoU improvement on Pascal-VOC and COCO-Stuff datasets, respectively.	翻訳日:2023-10-25 22:04:07 公開日:2023-10-23
# 名前付きエンティティ認識のための批判的トポニミーフレームワークに向けて--ニューヨーク市におけるAirbnbの事例研究 Toward a Critical Toponymy Framework for Named Entity Recognition: A Case Study of Airbnb in New York City ( http://arxiv.org/abs/2310.15302v1 ) ライセンス: Link先を確認	Mikael Brunila, Jack LaViolette, Sky CH-Wang, Priyanka Verma, Clara Féré, Grant McKenzie	(参考訳) 批判的トポニミーは、地名とそれが指し示す場所を通して、権力、資本、抵抗のダイナミクスを調べる。ここでの研究は伝統的に、トポニムのセマンティックな内容とそれらを生成するトップダウンの制度的プロセスに焦点を当ててきた。しかし、概して、日常の談話において一般の人々がトポニムを使う方法や、トポニム参照に付随し文脈を規定する地理空間記述の他の戦略は無視されている。そこで我々は,2010年代のニューヨーク市Airbnbリスティング47,440件の注釈付きデータセットを通じて,人々が場所を参照する方法を文化的・経済的資本がどう形作るかを測定するための計算方法を開発した。このデータセットに基づいて、場所の特徴づけに不可欠な重要な談話カテゴリを識別できる新しい名前付きエンティティ認識(NER)モデルを導入する。本研究の知見は、批判的トポニミーの新たな方向性と、地域のステータス、住宅・観光市場、ジェントリフィケーションの研究に関連する、これまで十分に研究されてこなかった様々な言語的シグナルを示している。 Critical toponymy examines the dynamics of power, capital, and resistance through place names and the sites to which they refer. Studies here have traditionally focused on the semantic content of toponyms and the top-down institutional processes that produce them. However, they have generally ignored the ways in which toponyms are used by ordinary people in everyday discourse, as well as the other strategies of geospatial description that accompany and contextualize toponymic reference. Here, we develop computational methods to measure how cultural and economic capital shape the ways in which people refer to places, through a novel annotated dataset of 47,440 New York City Airbnb listings from the 2010s. Building on this dataset, we introduce a new named entity recognition (NER) model able to identify important discourse categories integral to the characterization of place. Our findings point toward new directions for critical toponymy and to a range of previously understudied linguistic signals relevant to research on neighborhood status, housing and tourism markets, and gentrification.	翻訳日:2023-10-25 22:03:36 公開日:2023-10-23
# ADMarker:アルツハイマー病のデジタルバイオマーカーモニタリングのための多モードフェデレーション学習システム ADMarker: A Multi-Modal Federated Learning System for Monitoring Digital Biomarkers of Alzheimer's Disease ( http://arxiv.org/abs/2310.15301v1 ) ライセンス: Link先を確認	Xiaomin Ouyang, Xian Shuai, Yang Li, Li Pan, Xifan Zhang, Heming Fu, Xinyan Wang, Shihua Cao, Jiang Xin, Hazel Mok, Zhenyu Yan, Doris Sau Fung Yu, Timothy Kwok, Guoliang Xing	(参考訳) アルツハイマー病(AD)とそれに関連する認知症は高齢化による世界的な健康問題となっている。本稿では,マルチモーダルセンサと,自然な生活環境における多次元ADデジタルバイオマーカー検出のための新しいフェデレーション学習アルゴリズムを統合した初のエンドツーエンドシステムADMarkerを提案する。 ADMarkerは、プライバシーを守りながらデジタルバイオマーカーを正確に検出できる、新しい3段階のマルチモーダルフェデレーション学習アーキテクチャを特徴としている。提案手法は,データラベルの制限,データ不均一性,計算資源の制限など,現実的な課題をまとめて解決する。我々は,コンパクトなマルチモダリティハードウェアシステムを構築し,高齢者91名を対象に4週間の臨床試験を行った。その結果、ADMarkerは最大93.8%の精度で包括的なデジタルバイオマーカーを検出でき、平均88.9%の精度で早期ADを識別できることがわかった。 ADMarkerは、AD臨床医が多次元の解釈可能なデジタルバイオマーカー、患者の人口統計学的要因、AD診断の間の複雑な相関を縦断的に特徴づけ、追跡できる新しいプラットフォームを提供する。 Alzheimer's Disease (AD) and related dementia are a growing global health challenge due to the aging population. In this paper, we present ADMarker, the first end-to-end system that integrates multi-modal sensors and new federated learning algorithms for detecting multidimensional AD digital biomarkers in natural living environments. ADMarker features a novel three-stage multi-modal federated learning architecture that can accurately detect digital biomarkers in a privacy-preserving manner. Our approach collectively addresses several major real-world challenges, such as limited data labels, data heterogeneity, and limited computing resources. We built a compact multi-modality hardware system and deployed it in a four-week clinical trial involving 91 elderly participants. The results indicate that ADMarker can accurately detect a comprehensive set of digital biomarkers with up to 93.8% accuracy and identify early AD with an average of 88.9% accuracy. 
ADMarker offers a new platform that can allow AD clinicians to characterize and track the complex correlation between multidimensional interpretable digital biomarkers, demographic factors of patients, and AD diagnosis in a longitudinal manner.	翻訳日:2023-10-25 22:03:15 公開日:2023-10-23
# 非構造格子を持つ超音速流れ問題に対する局所収束入力(NNLCI)を用いたニューラルネットワーク Neural Network with Local Converging Input (NNLCI) for Supersonic Flow Problems with Unstructured Grids ( http://arxiv.org/abs/2310.15299v1 ) ライセンス: Link先を確認	Weiming Ding, Haoxiang Huang, Tzu Jung Lee, Yingjie Liu, Vigor Yang	(参考訳) 近年では、従来は数値シミュレーションで扱われてきた偏微分方程式の解法に、ディープニューラルネットワーク(DNN)に基づく代理モデルが広く用いられている。しかし、この種の代理モデルは、トレーニングデータセットのグローバルな補間にフォーカスしており、そのため大きなネットワーク構造を必要とする。このプロセスは時間と計算コストの両方を消費し、複雑な物理問題の高忠実度予測に使用することを制限している。本研究では,非構造データを用いた高忠実度予測のための局所収束入力(NNLCI)を用いたニューラルネットワークを開発した。このフレームワークは、収束する粗い解を入力とする局所的な依存領域を利用することで、計算資源とトレーニング時間を大幅に削減する。検証事例として、NNLCI法をバンプを有するチャネル内の非粘性超音速流の解析に適用した。提案手法の有効性と汎用性を評価するために、異なるバンプ形状と位置を考慮する。衝撃波相互作用を含む詳細な流れ構造を系統的に検討した。 In recent years, surrogate models based on deep neural networks (DNN) have been widely used to solve partial differential equations, which were traditionally handled by means of numerical simulations. This kind of surrogate model, however, focuses on global interpolation of the training dataset, and thus requires a large network structure. The process is both time-consuming and computationally costly, thereby restricting its use for high-fidelity prediction of complex physical problems. In the present study, we develop a neural network with local converging input (NNLCI) for high-fidelity prediction using unstructured data. The framework utilizes the local domain of dependence with converging coarse solutions as input, which greatly reduces computational resources and training time. As a validation case, the NNLCI method is applied to study inviscid supersonic flows in channels with bumps. Different bump geometries and locations are considered to benchmark the effectiveness and versatility of the proposed approach. Detailed flow structures, including shock-wave interactions, are examined systematically.	翻訳日:2023-10-25 22:02:55 公開日:2023-10-23
# TaskDiff: タスク指向の会話のための類似度メトリクス TaskDiff: A Similarity Metric for Task-Oriented Conversations ( http://arxiv.org/abs/2310.15298v1 ) ライセンス: Link先を確認	Ankita Bhaumik, Praveen Venkateswaran, Yara Rizk, Vatche Isahagian	(参考訳) 対話型デジタルアシスタントの普及により、ユーザエクスペリエンスの向上とパーソナライズされた応答生成に使用できる大量の会話データが利用可能になった。 ChatGPTのようなポピュラーな言語モデルを使ってこれらのアシスタントを構築するには、さらなるエンジニアリングと評価方法に重点を置く必要がある。テキストの類似度指標は、このような分析と評価の重要な要素である。文献では多くの類似度指標が提案されているが、ユニークな会話特徴を活かさないため、タスク指向の会話には有効ではない。このギャップに対処するために、異なる対話成分(発話、意図、スロット)とそれらの分布を利用して類似度を計算する新しい会話類似度指標TaskDiffを提案する。 TaskDiffのベンチマークデータセットに対する大規模な実験的評価は、他の関連するアプローチよりも優れたパフォーマンスと堅牢性を示している。 The popularity of conversational digital assistants has resulted in the availability of large amounts of conversational data which can be utilized for improved user experience and personalized response generation. Building these assistants using popular large language models like ChatGPT also require additional emphasis on prompt engineering and evaluation methods. Textual similarity metrics are a key ingredient for such analysis and evaluations. While many similarity metrics have been proposed in the literature, they have not proven effective for task-oriented conversations as they do not take advantage of unique conversational features. To address this gap, we present TaskDiff, a novel conversational similarity metric that utilizes different dialogue components (utterances, intents, and slots) and their distributions to compute similarity. Extensive experimental evaluation of TaskDiff on a benchmark dataset demonstrates its superior performance and improved robustness over other related approaches.	翻訳日:2023-10-25 22:02:38 公開日:2023-10-23
# DeTiME:エンコーダ・デコーダ型LLMを用いた拡散強化トピックモデリング DeTiME: Diffusion-Enhanced Topic Modeling using Encoder-decoder based LLM ( http://arxiv.org/abs/2310.15296v1 ) ライセンス: Link先を確認	Weijie Xu, Wenxiang Hu, Fanyou Wu, Srinivasan Sengamedu	(参考訳) 自然言語処理の急成長の分野では、ニューラルトピックモデル(NTM)と大規模言語モデル(LLM)が重要な研究分野として浮上している。それにもかかわらず、NTMは主にLLMからのコンテキスト埋め込みを利用しており、これはクラスタリングに最適ではなく、トピック生成もできない。本研究では,Diffusion-Enhanced Topic Modeling using Encoder-Decoder-based LLMs (DeTiME) という新しいフレームワークを導入することで,このギャップに対処する。 DeTiME は Encoder-Decoder ベースの LLM を利用して高度にクラスタ化可能な埋め込みを生成し、既存手法と比較して優れたクラスタ性と高い意味的一貫性を併せ持つトピックを生成できる。さらに,拡散のパワーを活用することで,特定されたトピックに関連するコンテンツを生成する機能も提供する。このデュアル機能は、高度にクラスタ化されたトピックと関連するコンテンツを同時に効率的に生成することを可能にする。 DeTiMEのポテンシャルは、クラスタ化された埋め込みの生成にも及んでいる。特に,提案するフレームワークはトレーニングに効率的であることが証明され,高い適応性を示し,広範囲のアプリケーションに対してその可能性を示す。 In the burgeoning field of natural language processing, Neural Topic Models (NTMs) and Large Language Models (LLMs) have emerged as areas of significant research interest. Despite this, NTMs primarily utilize contextual embeddings from LLMs, which are neither optimal for clustering nor capable of topic generation. Our study addresses this gap by introducing a novel framework named Diffusion-Enhanced Topic Modeling using Encoder-Decoder-based LLMs (DeTiME). DeTiME leverages Encoder-Decoder-based LLMs to produce highly clusterable embeddings that could generate topics that exhibit both superior clusterability and enhanced semantic coherence compared to existing methods. Additionally, by exploiting the power of diffusion, our framework also provides the capability to generate content relevant to the identified topics. This dual functionality allows users to efficiently produce highly clustered topics and related content simultaneously. DeTiME's potential extends to generating clustered embeddings as well. Notably, our proposed framework proves to be efficient to train and exhibits high adaptability, demonstrating its potential for a wide array of applications.	翻訳日:2023-10-25 22:02:24 公開日:2023-10-23
# ゼロショットクロスドメインスロットフィリングのための適応型エンドツーエンドメトリック学習 Adaptive End-to-End Metric Learning for Zero-Shot Cross-Domain Slot Filling ( http://arxiv.org/abs/2310.15294v1 ) ライセンス: Link先を確認	Yuanjun Shi, Linzhi Wu, Minglai Shao	(参考訳) 近年のスロットフィリングは,ディープラーニングと大規模アノテートデータの利用により,大きな発展を遂げている。しかし、トレーニング中にサンプルが見られない新しいドメインを扱うことは、非常に難しい課題である。重度のドメインシフトにより認識性能が大幅に低下する可能性がある。ほとんどの先行研究は、メトリック学習に基づく2パスパイプライン方式でこの問題に対処している。実際、これらの支配的パイプラインモデルは、非並列推論と文脈自由離散ラベル埋め込みのため、計算効率と一般化能力に制限がある。そこで,本研究では,一般的なメトリックベース手法を再検討し,挑戦的ゼロショットスロット充填のための新しい適応型エンドツーエンドメトリック学習手法を提案する。簡易性,効率性,一般化性を考慮して,コンテキスト認識型ソフトラベル表現とスロットレベルのコントラスト表現学習を併用したカスケード型共同学習フレームワークを提案する。公開ベンチマークに関する広範な実験は、一連の競合ベースラインよりも提案手法が優れていることを示している。 Recently slot filling has witnessed great development thanks to deep learning and the availability of large-scale annotated data. However, it poses a critical challenge to handle a novel domain whose samples are never seen during training. The recognition performance might be greatly degraded due to severe domain shifts. Most prior works deal with this problem in a two-pass pipeline manner based on metric learning. In practice, these dominant pipeline models may be limited in computational efficiency and generalization capacity because of non-parallel inference and context-free discrete label embeddings. To this end, we re-examine the typical metric-based methods, and propose a new adaptive end-to-end metric learning scheme for the challenging zero-shot slot filling. Considering simplicity, efficiency and generalizability, we present a cascade-style joint learning framework coupled with context-aware soft label representations and slot-level contrastive representation learning to mitigate the data and label shift problems effectively. Extensive experiments on public benchmarks demonstrate the superiority of the proposed approach over a series of competitive baselines.	翻訳日:2023-10-25 22:02:06 公開日:2023-10-23
# 拡散モデルによるEHR時系列の高速・信頼性生成 Fast and Reliable Generation of EHR Time Series via Diffusion Models ( http://arxiv.org/abs/2310.15290v1 ) ライセンス: Link先を確認	Muhang Tian, Bernie Chen, Allan Guo, Shiyi Jiang, Anru R. Zhang	(参考訳) 電子健康記録(ehrs)は、検査、医薬品、診断を含む患者レベルのデータの豊富な情報源であり、医療データ分析に有用なリソースを提供する。しかし、プライバシーに関する懸念はしばしばEHRへのアクセスを制限し、下流の分析を妨げる。研究者たちは、プライバシー保護のEHRデータを生成する様々な方法を模索してきた。本研究では,Denoising Diffusion Probabilistic Models (DDPM) を用いて,多種多様なリアルな合成EHR時系列データを生成する手法を提案する。提案手法と既存手法を6つのデータセットで比較検討した。以上の結果から,本手法はトレーニングの労力を少なくしながら,データユーティリティの観点から既存手法を著しく上回ります。本手法は,多様で現実的なehrデータを提供することにより,下流医療データ解析も強化する。 Electronic Health Records (EHRs) are rich sources of patient-level data, including laboratory tests, medications, and diagnoses, offering valuable resources for medical data analysis. However, concerns about privacy often restrict access to EHRs, hindering downstream analysis. Researchers have explored various methods for generating privacy-preserving EHR data. In this study, we introduce a new method for generating diverse and realistic synthetic EHR time series data using Denoising Diffusion Probabilistic Models (DDPM). We conducted experiments on six datasets, comparing our proposed method with seven existing methods. Our results demonstrate that our approach significantly outperforms all existing methods in terms of data utility while requiring less training effort. Our approach also enhances downstream medical data analysis by providing diverse and realistic synthetic EHR data.	翻訳日:2023-10-25 22:01:44 公開日:2023-10-23
# flwr-serverlessによるサーバレスフェデレーション学習 Serverless Federated Learning with flwr-serverless ( http://arxiv.org/abs/2310.15329v1 ) ライセンス: Link先を確認	Sanjeev V. Namjoshi, Reese Green, Krishi Sharma, Zhangzhang Si	(参考訳) 個人識別可能な情報の収集と保存が急増する中、連合学習は益々重要で人気が高まっている。これらの発展とともに、個人のデータ保護とデータプライバシ対策への関心を高めるため、世界中の政府から多くの提案がなされている。新たなドメインや既存ドメインでディープラーニングがより関連性を持つようになるにつれ、セキュリティやプライバシを損なうことなく、エッジデバイスなどのさまざまなソースからデータを効果的にトレーニング可能な、フェデレーション学習のような戦略を開発することが不可欠である。最近、Flower (\texttt{Flwr}) Pythonパッケージが導入され、フェデレート学習を実装するためのスケーラブルでフレキシブルで使いやすいフレームワークを提供している。しかし、これまでFlowerは同期フェデレーション学習しか実行できませんが、処理が遅くて脆弱なクライアントサイドのトレーニングジョブによってボトルネックになるため、実行にはコストと時間がかかります。ここでは、フラワーパッケージのラッパーである \texttt{flwr-serverless} を紹介し、その機能を拡張して、フラワーの設計パラダイムを最小限の修正で同期および非同期のフェデレーション学習を可能にする。さらに,フェデレートドラーニングのアプローチにより,中央サーバを使わずにプロセスを実行することが可能となり,アプリケーションのドメインと利用のアクセシビリティが向上する。本稿では,このアプローチの設計の詳細と利用について,公開データセットを用いた一連の実験を通じて述べる。全体として、当社のアプローチは、フェデレーショントレーニングの実行時間とコストを削減し、フェデレーション学習システムの実装と実験を簡単にする手段を提供します。 Federated learning is becoming increasingly relevant and popular as we witness a surge in data collection and storage of personally identifiable information. Alongside these developments there have been many proposals from governments around the world to provide more protections for individuals' data and a heightened interest in data privacy measures. As deep learning continues to become more relevant in new and existing domains, it is vital to develop strategies like federated learning that can effectively train data from different sources, such as edge devices, without compromising security and privacy. Recently, the Flower (\texttt{Flwr}) Python package was introduced to provide a scalable, flexible, and easy-to-use framework for implementing federated learning. However, to date, Flower is only able to run synchronous federated learning which can be costly and time-consuming to run because the process is bottlenecked by client-side training jobs that are slow or fragile. 
Here, we introduce \texttt{flwr-serverless}, a wrapper around the Flower package that extends its functionality to allow for both synchronous and asynchronous federated learning with minimal modification to Flower's design paradigm. Furthermore, our approach to federated learning allows the process to run without a central server, which increases the domains of application and accessibility of its use. This paper presents the design details and usage of this approach through a series of experiments that were conducted using public datasets. Overall, we believe that our approach decreases the time and cost to run federated training and provides an easier way to implement and experiment with federated learning systems.	翻訳日:2023-10-25 21:52:56 公開日:2023-10-23
# DeepVox と SAVE-CT を用いた胸部大動脈分割と動脈瘤予測のための造影・線量非依存型3次元ディープラーニング DeepVox and SAVE-CT: a contrast- and dose-independent 3D deep learning approach for thoracic aorta segmentation and aneurysm prediction using computed tomography scans ( http://arxiv.org/abs/2310.15328v1 ) ライセンス: Link先を確認	Matheus del-Valle, Lariza Laura de Oliveira, Henrique Cursino Vieira, Henrique Min Ho Lee, Lucas Lembrança Pinheiro, Maria Fernanda Portugal, Newton Shydeo Brandão Miyoshi, Nelson Wolosker	(参考訳) 胸部大動脈瘤(Thoracic aortic aneurysm, TAA)は、大動脈の進行性の拡大により解離または破裂を引き起こしうる致命的な疾患である。通常無症状であり、スクリーニングの推奨は限られている。ゴールドスタンダードの評価は、CT angiography (CTA) と、放射線科医による時間のかかる読影によって行われる。他の適応のために撮影されたスキャンはこのスクリーニングに役立つ可能性があるが、造影なし、または低線量プロトコルで取得された場合、放射線科医が読影するスキャン数を増加させるだけでなく、臨床評価を困難にする可能性がある。本研究では, 対照群とTAA患者を含み, 造影の有無にかかわらず低線量および標準線量プロトコルで取得された587件のCTスキャンを選択した。新しいセグメンテーションモデルであるDeepVoxは、開発セットとテストセットでそれぞれ0.932と0.897のDiceスコア係数を示し、文献で報告されたモデルと比較してトレーニング速度が速いことを示した。新規なTAA分類モデルSAVE-CTは,手作業で設計した特徴量を用いず,DeepVoxの2値セグメンテーションマスクのみを入力として,開発セットとテストセットでそれぞれ0.930と0.922の精度を示した。これらの2つのモデルは、入力として可変数のスライスを扱い、胸部および胸腹部のシーケンスを処理でき、完全に自動化された造影・線量非依存の評価を行えるため、TAAスクリーニングの潜在的アプローチである。これは、TAA死亡率の低下と、放射線科医に対する患者の評価キューの優先順位付けに役立つ可能性がある。 Thoracic aortic aneurysm (TAA) is a fatal disease which potentially leads to dissection or rupture through progressive enlargement of the aorta. It is usually asymptomatic, and screening recommendations are limited. The gold-standard evaluation is performed by computed tomography angiography (CTA) and a time-consuming assessment by radiologists. Scans acquired for other indications could help with this screening; however, if acquired without contrast enhancement or with a low-dose protocol, they can make the clinical evaluation difficult, besides increasing the number of scans for radiologists to review. In this study, 587 unique CT scans were selected, including control and TAA patients, acquired with low- and standard-dose protocols, with or without contrast enhancement. 
A novel segmentation model, DeepVox, exhibited dice score coefficients of 0.932 and 0.897 for development and test sets, respectively, with faster training speed in comparison to models reported in the literature. The novel TAA classification model, SAVE-CT, presented accuracies of 0.930 and 0.922 for development and test sets, respectively, using only the binary segmentation mask from DeepVox as input, without hand-engineered features. These two models together are a potential approach for TAA screening, as they can handle variable number of slices as input, handling thoracic and thoracoabdominal sequences, in a fully automated contrast- and dose-independent evaluation. This may assist to decrease TAA mortality and prioritize the evaluation queue of patients for radiologists.	翻訳日:2023-10-25 21:52:27 公開日:2023-10-23
# スペシャリストかジェネラリストか? 特定NLPタスクに対するインストラクションチューニング Specialist or Generalist? Instruction Tuning for Specific NLP Tasks ( http://arxiv.org/abs/2310.15326v1 ) ライセンス: Link先を確認	Chufan Shi, Yixuan Su, Cheng Yang, Yujiu Yang, Deng Cai	(参考訳) 広範囲の自然言語処理(NLP)タスクを同時に実行する大規模言語モデル(LLM)の可能性は、広範な研究の対象となっている。命令チューニングは、LLMをそのようなジェネラリストモデルに変換するためのデータ効率のよい方法であることが証明されているが、その性能は特定のタスクのために訓練された専門モデルよりも遅れている。本稿では,広範囲をカバーする汎用的な命令チューニングの導入がスペシャリストモデルの構築に寄与するかどうかを検討する。有効性はタスクの特異性とスキル要件に依存すると仮定する。本実験は,4つの目標タスクを異なるカバレッジレベルで評価し,汎用的な命令チューニングを統合することで,タスクカバレッジが広い場合にモデル性能を継続的に向上することを示した。タスク固有のトレーニングデータ量が制限された場合、その効果は特に顕著である。様々な能力に着目した3つの目標タスクのさらなる調査は、ジェネラリストの命令チューニングが理解と推論能力を改善することを示す。しかし、事実知識を必要とするタスクに対しては、幻覚情報を含む一般データがモデルの性能に悪影響を及ぼす可能性がある。全体として、我々の研究は、一般的な命令チューニングでスペシャリストモデルを開発するための体系的なガイドを提供します。私たちのコードと関連するリソースは https://github.com/DavidFanzz/Generalist_or_Specialist にある。 The potential of large language models (LLMs) to simultaneously perform a wide range of natural language processing (NLP) tasks has been the subject of extensive research. Although instruction tuning has proven to be a data-efficient method for transforming LLMs into such generalist models, their performance still lags behind specialist models trained exclusively for specific tasks. In this paper, we investigate whether incorporating broad-coverage generalist instruction tuning can contribute to building a specialist model. We hypothesize that its efficacy depends on task specificity and skill requirements. Our experiments assess four target tasks with distinct coverage levels, revealing that integrating generalist instruction tuning consistently enhances model performance when the task coverage is broad. The effect is particularly pronounced when the amount of task-specific training data is limited. Further investigation into three target tasks focusing on different capabilities demonstrates that generalist instruction tuning improves understanding and reasoning abilities. 
However, for tasks requiring factual knowledge, generalist data containing hallucinatory information may negatively affect the model's performance. Overall, our work provides a systematic guide for developing specialist models with general instruction tuning. Our code and other related resources can be found at https://github.com/DavidFanzz/Generalist_or_Specialist.	翻訳日:2023-10-25 21:52:00 公開日:2023-10-23
# 視覚的質問応答のためのLXMERTモデル圧縮 LXMERT Model Compression for Visual Question Answering ( http://arxiv.org/abs/2310.15325v1 ) ライセンス: Link先を確認	Maryam Hashemi, Ghazaleh Mahmoudi, Sara Kodeiri, Hadi Sheikhi, Sauleh Eetemadi	(参考訳) LXMERTのような大規模事前学習モデルは、視覚言語タスクのためのテキストイメージペア上でのクロスモーダル表現の学習に人気がある。抽選券仮説によれば、NLPとコンピュータビジョンのモデルには、単独で訓練しても完全な性能に到達できる小さなサブネットワークが含まれている。本稿では、これらの観測結果を組み合わせて、VQAタスクの微調整時にLXMERTにそのようなトレーニング可能なサブネットワークが存在するかどうかを評価する。また,精度の大幅な低下を伴わずにどの程度まで枝刈りできるかを調べることで,モデルサイズに関するコスト便益分析を行った。実験の結果,LXMERTはサイズを40%～60%効果的に枝刈りでき,精度の低下は3%であった。 Large-scale pretrained models such as LXMERT are becoming popular for learning cross-modal representations on text-image pairs for vision-language tasks. According to the lottery ticket hypothesis, NLP and computer vision models contain smaller subnetworks capable of being trained in isolation to full performance. In this paper, we combine these observations to evaluate whether such trainable subnetworks exist in LXMERT when fine-tuned on the VQA task. In addition, we perform a model size cost-benefit analysis by investigating how much pruning can be done without significant loss in accuracy. Our experimental results demonstrate that LXMERT can be effectively pruned by 40%-60% in size with 3% loss in accuracy.	翻訳日:2023-10-25 21:51:40 公開日:2023-10-23
# videoprompter:ゼロショットビデオ理解のための基礎モデルのアンサンブル Videoprompter: an ensemble of foundational models for zero-shot video understanding ( http://arxiv.org/abs/2310.15324v1 ) ライセンス: Link先を確認	Adeel Yousaf, Muzammal Naseer, Salman Khan, Fahad Shahbaz Khan, Mubarak Shah	(参考訳) 視覚言語モデル(VLM)は、視覚特徴とテキストベースのクラスラベル表現の類似点を計算することで、クエリビデオの分類を行う。近年,大言語モデル (LLM) は, クラス名の記述性を高めて, テキストベースのクラスラベルの強化に利用されている。しかし、これらの改善はテキストベースの分類器に限られており、クエリの視覚的特徴は考慮されていない。本稿では,事前学習した識別型VLMと,事前学習したビデオテキストとテキストテキストモデルを組み合わせたフレームワークを提案する。標準ゼロショット設定に2つの重要な変更を導入する。まず,言語誘導型視覚機能拡張を提案し,ビデオからテキストまでのモデルを用いて,クエリ映像を記述形式に変換する。得られた説明には、どのオブジェクトが存在するかや、時空間的相互作用など、クエリビデオの重要な視覚的手がかりが含まれている。これらの記述的手がかりは、ゼロショット性能を高めるためにVLMにさらなる意味知識を提供する。第2に、クラスラベル表現を豊かにするために、より意味のある記述を生成するためのビデオ固有プロンプトを提案する。具体的には、クラス名のカテゴリのツリー階層を作成するためのプロンプト手法を導入し、追加の視覚的手がかりに対して高レベルなアクションコンテキストを提供するとともに、3つの異なるゼロショット設定におけるビデオ理解における我々のアプローチの有効性を実証する。 1)ビデオアクション認識 2)ビデオ対テキスト、テキスト対ビデオ検索、及び 3)タイムセンシティブなビデオタスク。複数のベンチマークと様々なVLMで一貫した改善が提案するフレームワークの有効性を実証する。私たちのコードは公開されます。 Vision-language models (VLMs) classify the query video by calculating a similarity score between the visual features and text-based class label representations. Recently, large language models (LLMs) have been used to enrich the text-based class labels by enhancing the descriptiveness of the class names. However, these improvements are restricted to the text-based classifier only, and the query visual features are not considered. In this paper, we propose a framework which combines pre-trained discriminative VLMs with pre-trained generative video-to-text and text-to-text models. We introduce two key modifications to the standard zero-shot setting. First, we propose language-guided visual feature enhancement and employ a video-to-text model to convert the query video to its descriptive form. The resulting descriptions contain vital visual cues of the query video, such as what objects are present and their spatio-temporal interactions. 
These descriptive cues provide additional semantic knowledge to VLMs to enhance their zero-shot performance. Second, we propose video-specific prompts to LLMs to generate more meaningful descriptions to enrich class label representations. Specifically, we introduce prompt techniques to create a Tree Hierarchy of Categories for class names, offering a higher-level action context for additional visual cues. We demonstrate the effectiveness of our approach in video understanding across three different zero-shot settings: 1) video action recognition, 2) video-to-text and text-to-video retrieval, and 3) time-sensitive video tasks. Consistent improvements across multiple benchmarks and with various VLMs demonstrate the effectiveness of our proposed framework. Our code will be made publicly available.	翻訳日:2023-10-25 21:51:27 公開日:2023-10-23
# Conflating Point of interest(POI)データ:マッチング手法の体系的レビュー Conflating point of interest (POI) data: A systematic review of matching methods ( http://arxiv.org/abs/2310.15320v1 ) ライセンス: Link先を確認	Kai Sun, Yingjie Hu, Yue Ma, Ryan Zhenqi Zhou, Yunqiang Zhu	(参考訳) 関心のポイント(POI)データは、現実世界の場所のデジタル表現を提供し、人間と場所の相互作用を理解し、都市管理を支援し、スマートな都市を構築するためにますます利用されている。多くのPOIデータセットが開発されており、地理的カバレッジ、属性フォーカス、データ品質が異なることが多い。時折、研究者は研究領域の場所をよりよく表現するために、2つ以上のPOIデータセットを統合(conflate)する必要があるかもしれない。様々なPOI Conflation法が開発されているが、体系的なレビューが欠如しており、その結果、POI conflationに不慣れな研究者がこれらの既存手法を素早く把握し利用することは困難である。この論文はそのような隙間を埋める。 PRISMA (Preferred Reporting Items for Systematic Reviews and Meta-Analyses) のプロトコルに従って,再現可能な構文を用いて3つの書誌データベースを探索し,関連研究の同定を行う。続いて、POI Conflationの主なステップ、すなわち、POI マッチングに焦点を合わせ、特定されたメソッドを体系的に要約し分類する。その後、現在の限界と今後の機会について論じる。このレビューは、研究用のPOIデータセットの統合に関心のある研究者に、いくつかのガイダンスを提供することを期待しています。 Point of interest (POI) data provide digital representations of places in the real world, and have been increasingly used to understand human-place interactions, support urban management, and build smart cities. Many POI datasets have been developed, which often have different geographic coverages, attribute focuses, and data quality. From time to time, researchers may need to conflate two or more POI datasets in order to build a better representation of the places in the study areas. While various POI conflation methods have been developed, a systematic review is lacking, and consequently, it is difficult for researchers new to POI conflation to quickly grasp and use these existing methods. This paper fills such a gap. Following the protocol of Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA), we conduct a systematic review by searching through three bibliographic databases using reproducible syntax to identify related studies. We then focus on a main step of POI conflation, i.e., POI matching, and systematically summarize and categorize the identified methods. 
Current limitations and future opportunities are discussed afterwards. We hope that this review can provide some guidance for researchers interested in conflating POI datasets for their research.	翻訳日:2023-10-25 21:51:03 公開日:2023-10-23
# グラウンドドインストラクション生成のための幻覚検出 Hallucination Detection for Grounded Instruction Generation ( http://arxiv.org/abs/2310.15319v1 ) ライセンス: Link先を確認	Lingjun Zhao, Khanh Nguyen, Hal Daumé III	(参考訳) 本研究では,シミュレーションされた住宅環境において人間の移動を誘導する指示を生成する問題について考察する。現在のモデルの大きな問題は幻覚である。すなわち、人間の追随者が記述された経路に沿って実行したり遭遇したりするものと矛盾するアクションやオブジェクトへの参照を生成する。画像テキストペアの大規模なコーパスで事前学習したモデルを採用し,正しい指示と合成幻覚を含む指示を区別するコントラスト損失で微調整することにより,これらの幻覚的参照を検出するモデルを開発した。最終モデルは,命令生成モデルから推定される単語確率や,LSTMとTransformerに基づく教師付きモデルなど,いくつかのベースラインよりも優れている。 We investigate the problem of generating instructions to guide humans to navigate in simulated residential environments. A major issue with current models is hallucination: they generate references to actions or objects that are inconsistent with what a human follower would perform or encounter along the described path. We develop a model that detects these hallucinated references by adopting a model pre-trained on a large corpus of image-text pairs, and fine-tuning it with a contrastive loss that separates correct instructions from instructions containing synthesized hallucinations. Our final model outperforms several baselines, including using word probability estimated by the instruction-generation model, and supervised models based on LSTM and Transformer.	翻訳日:2023-10-25 21:50:41 公開日:2023-10-23
# HetGPT: 事前学習した不均一グラフニューラルネットワークにおけるプロンプトチューニングのパワーを活用する HetGPT: Harnessing the Power of Prompt Tuning in Pre-Trained Heterogeneous Graph Neural Networks ( http://arxiv.org/abs/2310.15318v1 ) ライセンス: Link先を確認	Yihong Ma, Ning Yan, Jiayu Li, Masood Mortazavi and Nitesh V. Chawla	(参考訳) グラフは、Webの複雑なパターンやリッチな情報を表現し分析するための自然な選択として登場し、オンラインページ分類やソーシャルレコメンデーションといったアプリケーションを可能にする。一般的な"pre-train, fine-tune"パラダイムは、グラフ機械学習タスク、特にラベル付きノードが制限されたシナリオで広く採用されている。しかしながら、このアプローチは、しばしば、プレテキストタスクのトレーニング目標と下流タスクのトレーニング目標との不整合を示す。このギャップは,事前トレーニングから得られた知識が下流タスクのパフォーマンスに悪影響を及ぼすという,"負の転移"問題を引き起こす可能性がある。自然言語処理(NLP)におけるプロンプトベースの学習の急増は、代替として"pre-train, prompt"パラダイムをグラフに適用する可能性を示唆している。しかし、既存のグラフプロンプト技術は均質グラフ向けに設計されており、Webグラフ固有の不均一性を無視している。このギャップを埋めるため,我々は,事前学習されたヘテロジニアスグラフニューラルネットワーク(HGNN)の予測性能を向上させる汎用的な事後学習プロンプトフレームワークHetGPTを提案する。キーとなるのは,仮想クラスプロンプトと異種特徴プロンプトを統合した,新しいプロンプト関数の設計であり,下流タスクをプレテキストタスクに近い形に再定式化することを狙う。さらに、HetGPTは多視点近傍集約機構を導入し、ヘテロジニアスグラフにおける複雑な近傍構造を捉える。 3つのベンチマークデータセットに対する大規模な実験は、半教師付きノード分類における最先端HGNNの性能を高めるHetGPTの能力を示す。 Graphs have emerged as a natural choice to represent and analyze the intricate patterns and rich information of the Web, enabling applications such as online page classification and social recommendation. The prevailing "pre-train, fine-tune" paradigm has been widely adopted in graph machine learning tasks, particularly in scenarios with limited labeled nodes. However, this approach often exhibits a misalignment between the training objectives of pretext tasks and those of downstream tasks. This gap can result in the "negative transfer" problem, wherein the knowledge gained from pre-training adversely affects performance in the downstream tasks. The surge in prompt-based learning within Natural Language Processing (NLP) suggests the potential of adapting a "pre-train, prompt" paradigm to graphs as an alternative. However, existing graph prompting techniques are tailored to homogeneous graphs, neglecting the inherent heterogeneity of Web graphs. 
To bridge this gap, we propose HetGPT, a general post-training prompting framework to improve the predictive performance of pre-trained heterogeneous graph neural networks (HGNNs). The key is the design of a novel prompting function that integrates a virtual class prompt and a heterogeneous feature prompt, with the aim to reformulate downstream tasks to mirror pretext tasks. Moreover, HetGPT introduces a multi-view neighborhood aggregation mechanism, capturing the complex neighborhood structure in heterogeneous graphs. Extensive experiments on three benchmark datasets demonstrate HetGPT's capability to enhance the performance of state-of-the-art HGNNs on semi-supervised node classification.	翻訳日:2023-10-25 21:50:26 公開日:2023-10-23
# プログラミング入門コース向けコードトレース問題の生成における大規模言語モデルの可能性を探る Exploring the Potential of Large Language Models in Generating Code-Tracing Questions for Introductory Programming Courses ( http://arxiv.org/abs/2310.15317v1 ) ライセンス: Link先を確認	Aysa Xuemo Fan, Ranran Haoran Zhang, Luc Paquette, Rui Zhang	(参考訳) 本稿では,入門プログラミングコースにおけるコードトレース問題生成のための大規模言語モデル(LLM)の適用について検討する。我々はGPT4のターゲットプロンプトを設計し、コードスニペットと記述に基づいてコードトレース問題を生成するように誘導した。我々は,モデルが生成する問題の質を,人間の専門家が作成した問題と比較して評価するための,人手による評価指標のセットを構築した。私たちの分析は、多様なコードトレース問題を生成する際のLLMの能力と可能性に関する洞察を提供します。さらに,人間とLLMが生成するコードトレース問題のユニークなデータセットを提示し,教育とNLP研究コミュニティの双方にとって貴重な資源となる。本研究は,LLMの教育的利用の可能性に関する対話の継続に寄与する。 In this paper, we explore the application of large language models (LLMs) for generating code-tracing questions in introductory programming courses. We designed targeted prompts for GPT4, guiding it to generate code-tracing questions based on code snippets and descriptions. We established a set of human evaluation metrics to assess the quality of questions produced by the model compared to those created by human experts. Our analysis provides insights into the capabilities and potential of LLMs in generating diverse code-tracing questions. Additionally, we present a unique dataset of human and LLM-generated tracing questions, serving as a valuable resource for both the education and NLP research communities. This work contributes to the ongoing dialogue on the potential uses of LLMs in educational settings.	翻訳日:2023-10-25 21:49:55 公開日:2023-10-23
# 文書レベルのイベント抽出のための表現の探索 Probing Representations for Document-level Event Extraction ( http://arxiv.org/abs/2310.15316v1 ) ライセンス: Link先を確認	Barry Wang and Xinya Du and Claire Cardie	(参考訳) Probing Classifiersフレームワークは、さまざまな自然言語処理(NLP)アプリケーションのためのディープニューラルネットワークモデルの解釈に使用されている。しかし、研究は主に文レベルのNLPタスクに焦点を当てている。この研究は、文書レベルの情報抽出(IE)で学んだ表現に探索パラダイムを適用した最初のものである。文書レベルのイベント抽出に関連するサーフェス,セマンティック,イベント理解機能を分析するために,8つの埋め込みプローブを設計した。標準データセット上の3つの LLM ベースの文書レベル IE アプローチの学習モデルから得られた表現に適用する。これらのモデルからトレーニングされたエンコーダは、引数の検出とラベル付けを適度に改善できるが、イベントレベルのタスクはわずかに強化するだけであり、一貫性とイベントタイプの予測に役立つ情報との間にトレードオフがある。さらに,エンコーダモデルは文書長とクロスセンテンス談話に苦しむことが分かった。 The probing classifiers framework has been employed for interpreting deep neural network models for a variety of natural language processing (NLP) applications. Studies, however, have largely focused on sentence-level NLP tasks. This work is the first to apply the probing paradigm to representations learned for document-level information extraction (IE). We designed eight embedding probes to analyze surface, semantic, and event-understanding capabilities relevant to document-level event extraction. We apply them to the representations acquired by learning models from three different LLM-based document-level IE approaches on a standard dataset. We found that trained encoders from these models yield embeddings that can modestly improve argument detection and labeling but only slightly enhance event-level tasks, albeit with trade-offs in information helpful for coherence and event-type prediction. We further found that encoder models struggle with document length and cross-sentence discourse.	翻訳日:2023-10-25 21:49:42 公開日:2023-10-23
# なぜLLMは幻覚するのか、どのように(証拠的)閉包を得るか: 忠実な自然言語生成のための知覚的・内包的・外延的学習 Why LLMs Hallucinate, and How to Get (Evidential) Closure: Perceptual, Intensional, and Extensional Learning for Faithful Natural Language Generation ( http://arxiv.org/abs/2310.15355v1 ) ライセンス: Link先を確認	Adam Bouyamourn	(参考訳) LLMは、その出力が証拠を持つ主張と同義であるように制約されていないため幻覚することを示す。この条件を証拠的閉包と呼ぶ。文の真偽に関する情報は、標準的なニューラル確率言語モデルでは統計的に識別されておらず、新しい文字列を生成するために条件付けできない。次に, LLM を制約して証拠的閉包を満たす出力を生成する方法を示す。マルチモーダル LLM は外部世界について学び(知覚学習)、文字列から世界の状態へのマッピングを学び(外延的学習)、さらに証拠の範囲を超えて一般化する際の流暢さを得るために、文字列からその同義語へのマッピングを学ばなければならない(内包的学習)。ユニモーダル LLM の出力は、検証された証拠集合の文字列と同義でなければならない。最後に, LLM が証拠を有する主張と同義ではない出力を拒絶することにより LLM から忠実な出力を得るヒューリスティックな手順である Learn-Babble-Prune を提案する。 We show that LLMs hallucinate because their output is not constrained to be synonymous with claims for which they have evidence: a condition that we call evidential closure. Information about the truth or falsity of sentences is not statistically identified in the standard neural probabilistic language model setup, and so cannot be conditioned on to generate new strings. We then show how to constrain LLMs to produce output that does satisfy evidential closure. A multimodal LLM must learn about the external world (perceptual learning); it must learn a mapping from strings to states of the world (extensional learning); and, to achieve fluency when generalizing beyond a body of evidence, it must learn mappings from strings to their synonyms (intensional learning). The output of a unimodal LLM must be synonymous with strings in a validated evidence set. Finally, we present a heuristic procedure, Learn-Babble-Prune, that yields faithful output from an LLM by rejecting output that is not synonymous with claims for which the LLM has evidence.	翻訳日:2023-10-25 21:43:50 公開日:2023-10-23
# ノイズチャネルモデルとしてのLandau-Streater Channel The Landau-Streater Channel as a Noisy Channel Model ( http://arxiv.org/abs/2310.15353v1 ) ライセンス: Link先を確認	Shayan Roofeh, Vahid Karimipour	(参考訳) 3次元では、ランダウ・ストリーター・チャネルはヴェルナー・ホレボ・チャネルにほかならない。このようなチャネルは連続パラメータを持たず、環境ノイズをモデル化することはできない。我々は、恒等チャネルとの凸結合を考え、これをクトリット上の1パラメータ雑音モデルとして適したものにする。さらに、元のWerner-Holevo チャネルは完全ユニタリ群 $SU(3)$ の下で共変性を示すが、拡張族は群 $SO(3)$ の下でのみ共変性を保持する。この対称性の低減により、元のチャネルの様々な特性に対する影響を調べることができる。具体的には, チャネルのスペクトル, 可分性, 相補的チャネル, 正確なあるいは近似的な分解性, および各種の容量への影響について検討する。特に, 量子容量に対する下界と上界の確立とともに, 単発古典容量と絡み合い支援容量の解析式を導出する。 In three dimensions, the Landau-Streater channel is nothing but the Werner-Holevo channel. Such a channel has no continuous parameter and hence cannot model an environmental noise. We consider its convex combination with the identity channel, making it suitable as a one-parameter noise model on qutrits. Moreover, whereas the original Werner-Holevo channel exhibits covariance under the complete unitary group $SU(3)$, the extended family maintains covariance only under the group $SO(3)$. This symmetry reduction allows us to investigate its impact on various properties of the original channel. Specifically, we examine its influence on the channel's spectrum, divisibility, complementary channel, and exact or approximate degradability, as well as its various kinds of capacities. In particular, we derive analytical expressions for the one-shot classical capacity and the entanglement-assisted capacity, accompanied by the establishment of lower and upper bounds for the quantum capacity.	翻訳日:2023-10-25 21:43:31 公開日:2023-10-23
# ベイズ最適化におけるランダム探索:オーダー最適リグレットと計算効率 Random Exploration in Bayesian Optimization: Order-Optimal Regret and Computational Efficiency ( http://arxiv.org/abs/2310.15351v1 ) ライセンス: Link先を確認	Sudeep Salgia, Sattar Vakili, Qing Zhao	(参考訳) ガウス過程モデルを用いたベイズ最適化を考える。これはカーネルベース帯域最適化とも呼ばれる。本研究では,分布から引き出されたランダムサンプルを用いて領域を探索する手法について検討する。このランダム探索手法が最適誤差率を達成することを示す。我々の解析は、この研究で確立された無限次元ヒルベルト空間における新しい濃度境界に基づいており、これは独立した関心の対象となりうる。さらに,領域縮小を伴うランダム探索に基づくアルゴリズムを開発し,ノイズのない環境と雑音環境の両方において,そのオーダー最適なリグレット保証を確立する。ノイズフリー環境では,既存のリグレット性能のギャップを解消し,COLT開放問題を解決する。提案アルゴリズムは,反復毎にクエリポイントを選択するための非凸取得関数の高価な最適化をランダム探索により回避するため,一般的な手法よりも計算上の優位性を持つ。 We consider Bayesian optimization using Gaussian Process models, also referred to as kernel-based bandit optimization. We study the methodology of exploring the domain using random samples drawn from a distribution. We show that this random exploration approach achieves the optimal error rates. Our analysis is based on novel concentration bounds in an infinite dimensional Hilbert space established in this work, which may be of independent interest. We further develop an algorithm based on random exploration with domain shrinking and establish its order-optimal regret guarantees under both noise-free and noisy settings. In the noise-free setting, our analysis closes the existing gap in regret performance and thereby resolves a COLT open problem. The proposed algorithm also enjoys a computational advantage over prevailing methods due to the random exploration that obviates the expensive optimization of a non-convex acquisition function for choosing the query points at each iteration.	翻訳日:2023-10-25 21:43:14 公開日:2023-10-23
# 最適制御フォトニック回路のためのスケーラブル機械学習支援クリアボックス特性 Scalable machine learning-assisted clear-box characterization for optimally controlled photonic circuits ( http://arxiv.org/abs/2310.15349v1 ) ライセンス: Link先を確認	Andreas Fyrillas, Olivier Faure, Nicolas Maring, Jean Senellart, Nadia Belabas	(参考訳) 光集積回路は、光の生成、操作、検出のためのコンパクトで安定したプラットフォームを提供する。これらは古典的および量子的応用に有効である。製造制約、耐性、動作波長から生じる欠陥は、現在のフォトニック集積装置の精度と有用性に制限を課す。これらの欠陥を緩和するには、典型的には基盤となる物理構造のモデルとアクセスが困難なパラメータの推定が必要である。現在、簡単なケースを越えて拡張されるメッシュ構成には、直接的なソリューションがない。我々は、反復的な機械学習支援手法によりフォトニックチップを特徴付けるスケーラブルで革新的な方法を提案する。提案手法は,フォトニックチップの完全モデル化された仮想レプリカを特徴とするクリアボックスアプローチに基づいている。このプロセスはサンプル効率が高く、連続波レーザーとパワーメータで実行することができる。モデルは、個々のパッシブフェーズ、クロストーク、ビームスプリッター反射率、相対入出力損失を推定する。精度の高いキャラクタリゼーション結果に基づいて、デバイスに対する制御の強化を可能にするために不完全さを緩和する。 12モードのクレメンツ干渉計に126相シフタを内蔵し、平均99.77%の振幅忠実性を有する最新チップ制御を100個のハールランダムユニタリ行列上で達成した。 Photonic integrated circuits offer a compact and stable platform for generating, manipulating, and detecting light. They are instrumental for classical and quantum applications. Imperfections stemming from fabrication constraints, tolerances and operation wavelength impose limitations on the accuracy and thus utility of current photonic integrated devices. Mitigating these imperfections typically necessitates a model of the underlying physical structure and the estimation of parameters that are challenging to access. Direct solutions are currently lacking for mesh configurations extending beyond trivial cases. We introduce a scalable and innovative method to characterize photonic chips through an iterative machine learning-assisted procedure. Our method is based on a clear-box approach that harnesses a fully modeled virtual replica of the photonic chip to characterize. The process is sample-efficient and can be carried out with a continuous-wave laser and powermeters. The model estimates individual passive phases, crosstalk, beamsplitter reflectivity values and relative input/output losses. 
Building upon the accurate characterization results, we mitigate imperfections to enable enhanced control over the device. We validate our characterization and imperfection mitigation methods on a 12-mode Clements-interferometer equipped with 126 phase shifters, achieving beyond state-of-the-art chip control with an average 99.77 % amplitude fidelity on 100 implemented Haar-random unitary matrices.	翻訳日:2023-10-25 21:42:59 公開日:2023-10-23
# 陰的オイラー転移学習を用いたBurgers方程式のPINN Burgers' pinns with implicit euler transfer learning ( http://arxiv.org/abs/2310.15343v1 ) ライセンス: Link先を確認	Vitória Biesek and Pedro Henrique de Almeida Konzen	(参考訳) バーガーズ方程式は流体力学、気体力学、衝撃理論、宇宙論などいくつかの現象の計算モデリングにおいて確立されたテストケースである。本稿では,バーガーズ方程式を解くために,陰的オイラー転移学習手法を用いた物理情報ニューラルネットワーク(PINN)の適用について述べる。提案されたアプローチは、一連のニューラルネットワーク(ANN)による時間離散解を求めることである。各時間ステップにおいて、前のANNはその知識を次のネットワークモデルに転送し、バーガーズ方程式の陰的オイラー近似に基づいて損失関数を最小化することにより現在の時間解を学習する。このアプローチは、2つのベンチマーク問題に対してテストされる。1つは厳密解を持ち、もう1つは別の解析解を持つ。通常のPINNモデルと比較して、提案手法は、同程度の精度の結果を保ちつつ、より小さなニューラルネットワークアーキテクチャで済み、計算コストを削減できる可能性があるという利点がある。 The Burgers equation is a well-established test case in the computational modeling of several phenomena such as fluid dynamics, gas dynamics, shock theory, cosmology, and others. In this work, we present the application of Physics-Informed Neural Networks (PINNs) with an implicit Euler transfer learning approach to solve the Burgers equation. The proposed approach consists in seeking a time-discrete solution by a sequence of Artificial Neural Networks (ANNs). At each time step, the previous ANN transfers its knowledge to the next network model, which learns the current time solution by minimizing a loss function based on the implicit Euler approximation of the Burgers equation. The approach is tested for two benchmark problems: the first with an exact solution and the other with an alternative analytical solution. In comparison to the usual PINN models, the proposed approach has the advantage of requiring smaller neural network architectures with similar accurate results and potentially decreasing computational costs.	翻訳日:2023-10-25 21:42:36 公開日:2023-10-23
# ディープスパースネットワークのためのハイブリッド粒度特徴相互作用選択に向けて Towards Hybrid-grained Feature Interaction Selection for Deep Sparse Network ( http://arxiv.org/abs/2310.15342v1 ) ライセンス: Link先を確認	Fuyuan Lyu, Xing Tang, Dugang Liu, Chen Ma, Weihong Luo, Liang Chen, Xiuqiang He, Xue Liu	(参考訳) ディープスパースネットワークは,高次元スパース特徴を有する予測タスクのためのニューラルネットワークアーキテクチャとして広く研究されており、特徴相互作用の選択はその重要な構成要素である。従来の手法は主に粗粒度空間における特徴相互作用の探索方法に重点を置いていたが、より細かい粒度にはあまり注意が払われていない。本研究では,深層スパースネットワークにおける特徴フィールドと特徴値の両方を対象とする,ハイブリッド粒度の特徴相互作用選択手法を提案する。このような広大な空間を探索するために,その場で計算される分解空間を提案する。そこで我々はOptFeatureと呼ばれる選択アルゴリズムを開発し,特徴フィールドと特徴値の両方から特徴相互作用を同時に効率的に選択する。 3つの大規模な実世界のベンチマークデータセットの実験の結果、OptFeatureは精度と効率の点でよく機能していることが示された。さらなる研究が我々の方法の実現性を支持している。 Deep sparse networks are widely investigated as a neural network architecture for prediction tasks with high-dimensional sparse features, with which feature interaction selection is a critical component. While previous methods primarily focus on how to search feature interaction in a coarse-grained space, less attention has been given to a finer granularity. In this work, we introduce a hybrid-grained feature interaction selection approach that targets both feature field and feature value for deep sparse networks. To explore such expansive space, we propose a decomposed space which is calculated on the fly. We then develop a selection algorithm called OptFeature, which efficiently selects the feature interaction from both the feature field and the feature value simultaneously. Results from experiments on three large real-world benchmark datasets demonstrate that OptFeature performs well in terms of accuracy and efficiency. Additional studies support the feasibility of our method.	翻訳日:2023-10-25 21:42:18 公開日:2023-10-23
# 大規模言語モデルの道徳的基礎 Moral Foundations of Large Language Models ( http://arxiv.org/abs/2310.15337v1 ) ライセンス: Link先を確認	Marwa Abdulhai, Gregory Serapio-Garcia, Clément Crepy, Daria Valter, John Canny, Natasha Jaques	(参考訳) モラル・ファンデーションズ理論(moral foundations theory, MFT)は、人間の道徳的推論をケア/ハーム、自由/抑圧、神聖/堕落を含む5つの要因に分解する心理学的評価ツールである(Graham et al., 2009)。道徳的な決定を行う際にこれらの次元に置く重みは、文化的な生い立ちや政治的イデオロギーなどにより人によって異なる。大規模な言語モデル(LLM)は、インターネットから収集されたデータセットに基づいて訓練されるため、そのようなコーパスに存在するバイアスを反映する可能性がある。本稿では、MFTをレンズとして用いて、人気のあるLLMが特定の道徳的価値観に対して偏見を得たかどうかを分析する。我々は、既知のLLMを分析し、それらが特定の道徳的基盤を示すことを発見し、それらが人間の道徳的基盤と政治的所属にどのように関係しているかを示す。また、これらのバイアスの一貫性、すなわちモデルへのプロンプトの文脈によって大きく変化するかどうかも測定します。最後に、モデルが特定の道徳的基盤のセットを示すように促すプロンプトを敵対的に選択でき、これが下流タスクにおけるモデルの振る舞いに影響を与える可能性があることを示します。これらの知見は、LLMが特定の道徳的スタンスをとることの潜在的なリスクと意図しない結果を示すのに役立つ。 Moral foundations theory (MFT) is a psychological assessment tool that decomposes human moral reasoning into five factors, including care/harm, liberty/oppression, and sanctity/degradation (Graham et al., 2009). People vary in the weight they place on these dimensions when making moral decisions, in part due to their cultural upbringing and political ideology. As large language models (LLMs) are trained on datasets collected from the internet, they may reflect the biases that are present in such corpora. This paper uses MFT as a lens to analyze whether popular LLMs have acquired a bias towards a particular set of moral values. We analyze known LLMs and find they exhibit particular moral foundations, and show how these relate to human moral foundations and political affiliations. We also measure the consistency of these biases, or whether they vary strongly depending on the context of how the model is prompted. Finally, we show that we can adversarially select prompts that encourage the model to exhibit a particular set of moral foundations, and that this can affect the model's behavior on downstream tasks. 
These findings help illustrate the potential risks and unintended consequences of LLMs assuming a particular moral stance.	翻訳日:2023-10-25 21:42:02 公開日:2023-10-23
# 残差ネットワークのためのadmmトレーニングアルゴリズム:収束、複雑性、並列トレーニング ADMM Training Algorithms for Residual Networks: Convergence, Complexity and Parallel Training ( http://arxiv.org/abs/2310.15334v1 ) ライセンス: Link先を確認	Jintao Xu, Yifei Li, Wenxun Xing	(参考訳) 本稿では,完全連結残留ネットワーク(FCResNets)トレーニング問題に対して,補助変数を導入することで,一連のシリアルおよび並列近位点ADMMを設計する。近点版の収束性は、クルディカ・ロジャシエヴィチ(KL)特性解析フレームワークに基づいて証明され、我々の目標を達成するために必要な補助関数が構築されるクルディカ・ロジャシエヴィチ(KL)指数の異なる範囲に依存する局所的なR-線形あるいはサブ線形収束率を確保することができる。さらに、並列実装の利点として、時間的複雑さの低減と(ノード単位の)メモリ消費の削減を理論的に分析する。我々の知る限りでは、FCResNetsのトレーニング問題に適用されるADMMの収束、収束率、時間複雑性、および(ノード毎)ランタイムメモリ要件を理論的に解析する最初の研究である。ディープネットワークトレーニングタスクにおいて、高速、パフォーマンス、堅牢性、潜在能力を示す実験が報告されている。最後に、大規模問題における並列トレーニングの利点と可能性を示す。 We design a series of serial and parallel proximal point (gradient) ADMMs for the fully connected residual networks (FCResNets) training problem by introducing auxiliary variables. Convergence of the proximal point version is proven based on a Kurdyka-Lojasiewicz (KL) property analysis framework, and we can ensure a locally R-linear or sublinear convergence rate depending on the different ranges of the Kurdyka-Lojasiewicz (KL) exponent, in which a necessary auxiliary function is constructed to realize our goal. Moreover, the advantages of the parallel implementation in terms of lower time complexity and less (per-node) memory consumption are analyzed theoretically. To the best of our knowledge, this is the first work analyzing the convergence, convergence rate, time complexity and (per-node) runtime memory requirement of the ADMM applied in the FCResNets training problem theoretically. Experiments are reported to show the high speed, better performance, robustness and potential in the deep network training tasks. Finally, we present the advantage and potential of our parallel training in large-scale problems.	翻訳日:2023-10-25 21:41:38 公開日:2023-10-23
# 信頼性と安全な治療基準の推定 Estimating Trustworthy and Safe Optimal Treatment Regimes ( http://arxiv.org/abs/2310.15333v1 ) ライセンス: Link先を確認	Harsh Parikh, Quinn Lanners, Zade Akras, Sahar F. Zafar, M. Brandon Westover, Cynthia Rudin, Alexander Volfovsky	(参考訳) 最近の統計・強化学習手法は患者のケア戦略を著しく進歩させた。しかし、これらのアプローチは、欠落データ、固有の確率性、解釈可能性と患者の安全性に対する重要な要件など、高い視点で大きな課題に直面している。我々の研究は、最適な治療体制を特定する安全かつ解釈可能な枠組みを運用している。本手法では, 同様の医療・薬理学的特徴を持つ患者をマッチングし, 補間により最適な方針を立案する。複雑な設定でも最適なポリシーを識別できるフレームワークの能力を示すために,包括的なシミュレーション研究を行う。最終的に我々は,重症患者に対する発作治療体制を研究するためのアプローチを運用する。本研究は患者の医療歴と薬理学的特徴からパーソナライズされた治療戦略を強く支持する。特に,集中治療室で重篤な発作を経験する患者に対して攻撃的治療を施し,軽度,短時間の発作エピソードに対する服用量を減らすことで,より良好な結果が得られた。 Recent statistical and reinforcement learning methods have significantly advanced patient care strategies. However, these approaches face substantial challenges in high-stakes contexts, including missing data, inherent stochasticity, and the critical requirements for interpretability and patient safety. Our work operationalizes a safe and interpretable framework to identify optimal treatment regimes. This approach involves matching patients with similar medical and pharmacological characteristics, allowing us to construct an optimal policy via interpolation. We perform a comprehensive simulation study to demonstrate the framework's ability to identify optimal policies even in complex settings. Ultimately, we operationalize our approach to study regimes for treating seizures in critically ill patients. Our findings strongly support personalized treatment strategies based on a patient's medical history and pharmacological features. Notably, we identify that reducing medication doses for patients with mild and brief seizure episodes while adopting aggressive treatment for patients in intensive care unit experiencing intense seizures leads to more favorable outcomes.	翻訳日:2023-10-25 21:41:16 公開日:2023-10-23
# 教師なしフェデレート学習: 対向攻撃に対するロバスト性をもつ不均一混合モデルに対するフェデレート・グラディエントEMアルゴリズム Unsupervised Federated Learning: A Federated Gradient EM Algorithm for Heterogeneous Mixture Models with Robustness against Adversarial Attacks ( http://arxiv.org/abs/2310.15330v1 ) ライセンス: Link先を確認	Ye Tian, Haolei Weng, Yang Feng	(参考訳) 教師あり連合学習アプローチは大きな成功を収めてきたが、教師なし連合学習の領域は比較的未開拓のままである。本稿では,タスク間で不均一な混合比率を持つ混合モデルの教師なし学習を目的とした,新しい連合勾配EMアルゴリズムを提案する。一般混合モデルに対して成り立つ包括的な有限サンプル理論から始め、次にこの一般理論をガウス混合モデル(GMM)と回帰の混合(MoR)に適用し、モデルパラメータと混合比率の明示的な推定誤差を特徴づける。提案アルゴリズムは,未知のタスク類似性への適応性,少数のデータソースに対する敵対的攻撃へのレジリエンス,ローカルデータプライバシ保護,計算および通信効率など,いくつかの重要な利点を示す。 While supervised federated learning approaches have enjoyed significant success, the domain of unsupervised federated learning remains relatively underexplored. In this paper, we introduce a novel federated gradient EM algorithm designed for the unsupervised learning of mixture models with heterogeneous mixture proportions across tasks. We begin with a comprehensive finite-sample theory that holds for general mixture models, then apply this general theory to Gaussian Mixture Models (GMMs) and Mixture of Regressions (MoRs) to characterize the explicit estimation error of model parameters and mixture proportions. Our proposed federated gradient EM algorithm demonstrates several key advantages: adaptability to unknown task similarity, resilience against adversarial attacks on a small fraction of data sources, protection of local data privacy, and computational and communication efficiency.	翻訳日:2023-10-25 21:40:53 公開日:2023-10-23
# 自己教師付き事前学習を用いた映像からのスマート環境における遠隔心拍モニタリング Remote Heart Rate Monitoring in Smart Environments from Videos with Self-supervised Pre-training ( http://arxiv.org/abs/2310.15388v1 ) ライセンス: Link先を確認	Divij Gupta, Ali Etemad	(参考訳) 近年のディープラーニングの進歩により、スマート環境においてビデオを分析して遠隔で心拍数を推定することが可能になりつつある。しかしながら、ディープラーニングの方法の注目すべき制限は、効果的なトレーニングのためにラベル付きデータの広範なセットに強く依存することだ。この問題に対処するために、自己教師型学習が有望な道として登場した。そこで本研究では,自己教師型コントラスト学習を用いて遠隔光電容積脈波(PPG)の推定と心拍モニタリングを行い,ラベル付きデータへの依存性を低減し,性能の向上を図る。本稿では,コントラストフレームワークによるエンコーダのトレーニングに3つの空間的拡張と3つの時間的拡張を使用し,その後に遠隔PPGおよび心拍数推定のためにエンコーダの後期中間埋め込みを利用することを提案する。 2つの公開データセットに関する実験では,提案手法がいくつかの関連研究や教師付き学習ベースラインに対して改善され,結果が最先端に接近していることを示す。また,映像表現学習法や事前学習段階で用いるデータ拡張など,異なる設計選択による効果を示すために,徹底的な実験を行った。また,ラベル付きデータ量を削減した条件において,教師付き学習手法に対する提案手法の頑健性を示す。 Recent advances in deep learning have made it increasingly feasible to estimate heart rate remotely in smart environments by analyzing videos. However, a notable limitation of deep learning methods is their heavy reliance on extensive sets of labeled data for effective training. To address this issue, self-supervised learning has emerged as a promising avenue. Building on this, we introduce a solution that utilizes self-supervised contrastive learning for the estimation of remote photoplethysmography (PPG) and heart rate monitoring, thereby reducing the dependence on labeled data and enhancing performance. We propose the use of 3 spatial and 3 temporal augmentations for training an encoder through a contrastive framework, followed by utilizing the late-intermediate embeddings of the encoder for remote PPG and heart rate estimation. Our experiments on two publicly available datasets showcase the improvement of our proposed approach over several related works as well as supervised learning baselines, as our results approach the state-of-the-art. We also perform thorough experiments to showcase the effects of using different design choices such as the video representation learning method, the augmentations used in the pre-training stage, and others. 
We also demonstrate the robustness of our proposed method over the supervised learning approaches on reduced amounts of labeled data.	翻訳日:2023-10-25 21:32:01 公開日:2023-10-23
# 生成的対向ネットワークの誤り解析 Error analysis of generative adversarial network ( http://arxiv.org/abs/2310.15387v1 ) ライセンス: Link先を確認	Mahmud Hasan and Hailin Sang	(参考訳) GAN(Generative Adversarial Network)は,近年,高次元分布学習のための重要なモデルである。しかし,誤差収束率を理解するための包括的手法の必要性が高まっている。本研究では,識別器とジェネレータニューラルネットワークを包含する関数のクラスに基づくGANモデルの誤差収束率について検討する。これらの関数は、我々の仮定の下で有界エンベロープ関数を持つVC型であり、タラグランド不等式の適用を可能にする。タラグランドの不等式とボレル・カンテッリ補題を用いることで、GANの誤差に対する厳密な収束率を確立する。この手法は既存のGANの誤差推定にも適用でき、収束率の向上をもたらす。特に,ニューラルネットワーク距離で定義される誤差は,我々の定義では特別な場合誤差である。 The generative adversarial network (GAN) is an important model developed for high-dimensional distribution learning in recent years. However, there is a pressing need for a comprehensive method to understand its error convergence rate. In this research, we focus on studying the error convergence rate of the GAN model that is based on a class of functions encompassing the discriminator and generator neural networks. These functions are VC type with bounded envelope function under our assumptions, enabling the application of the Talagrand inequality. By employing the Talagrand inequality and Borel-Cantelli lemma, we establish a tight convergence rate for the error of GAN. This method can also be applied on existing error estimations of GAN and yields improved convergence rates. In particular, the error defined with the neural network distance is a special case error in our definition.	翻訳日:2023-10-25 21:31:40 公開日:2023-10-23
# コープマン表現の修正コース Course Correcting Koopman Representations ( http://arxiv.org/abs/2310.15386v1 ) ライセンス: Link先を確認	Mahan Fathi and Clement Gehring and Jonathan Pilault and David Kanaa and Pierre-Luc Bacon and Ross Goroshin	(参考訳) クープマン表現は、潜在空間における線形力学をもたらす非線形力学系(NLDS)の特徴を学習することを目的としている。理論的には、これらの特徴はNLDSのモデリングと制御における多くの問題を単純化するために使用できる。本研究では, この問題のオートエンコーダの定式化と, ダイナミックスをモデル化するための様々な方法, 特に長期水平線上での将来の状態予測について検討する。我々は、潜在空間における将来の状態を予測するいくつかの制限を発見し、長期的ダイナミクスを忠実に捉えるために、周期的再符号化と呼ばれる推論時間機構を提案する。我々は,低次元および高次元NLDSの実験を通して解析的および経験的にこの手法を正当化する。 Koopman representations aim to learn features of nonlinear dynamical systems (NLDS) which lead to linear dynamics in the latent space. Theoretically, such features can be used to simplify many problems in modeling and control of NLDS. In this work we study autoencoder formulations of this problem, and different ways they can be used to model dynamics, specifically for future state prediction over long horizons. We discover several limitations of predicting future states in the latent space and propose an inference-time mechanism, which we refer to as Periodic Reencoding, for faithfully capturing long term dynamics. We justify this method both analytically and empirically via experiments in low and high dimensional NLDS.	翻訳日:2023-10-25 21:31:28 公開日:2023-10-23
# GD-COMET:ジオディバースコモンセンス推論モデル GD-COMET: A Geo-Diverse Commonsense Inference Model ( http://arxiv.org/abs/2310.15383v1 ) ライセンス: Link先を確認	Mehar Bhatia and Vered Shwartz	(参考訳) 日々の生活にAIが統合されるにつれて、文化的に認識することで、さまざまなバックグラウンドからユーザーを支援するAIシステムを設計することが重要になっています。本稿では,COMETコモンセンス推論モデルのジオディバースバージョンであるGD-COMETを提案する。 GD-COMETは西洋の常識的知識を超え、幅広い文化に関する推論を生成することができる。 GD-COMETの有効性は,5つの異なる文化にまたがる包括的人的評価と,地理多様性課題における外在的評価によって実証される。評価の結果、GD-COMETは文化的に曖昧なコモンセンス知識を捉え、生成し、NLPアプリケーションに利益をもたらす可能性を示し、NLPをより包括的にすることに貢献した。 With the increasing integration of AI into everyday life, it's becoming crucial to design AI systems that serve users from diverse backgrounds by making them culturally aware. In this paper, we present GD-COMET, a geo-diverse version of the COMET commonsense inference model. GD-COMET goes beyond Western commonsense knowledge and is capable of generating inferences pertaining to a broad range of cultures. We demonstrate the effectiveness of GD-COMET through a comprehensive human evaluation across 5 diverse cultures, as well as extrinsic evaluation on a geo-diverse task. The evaluation shows that GD-COMET captures and generates culturally nuanced commonsense knowledge, demonstrating its potential to benefit NLP applications across the board and contribute to making NLP more inclusive.	翻訳日:2023-10-25 21:31:17 公開日:2023-10-23
# 開量子系における作用素成長仮説 The operator growth hypothesis in open quantum systems ( http://arxiv.org/abs/2310.15376v1 ) ライセンス: Link先を確認	N. S. Srivatsa and Curt von Keyserlingk	(参考訳) 作用素成長仮説 (Operator Growth hypothesis, OGH) は、作用素の挙動、具体的にはランツォ係数の漸近的な成長に関する技術的予想である。十分に汎用的な閉多体系を保つことが期待されている。保持すると、局所相関関数の高周波挙動と(オトクのような)カオスの測度の境界を与える。また、応答関数を数値的に推定する経路も与える。ここでは、開量子系へのOGHの一般化について検討し、そこでは、リウビリアンをリンドブラディアンに置き換える。局所エルミートジャンプ演算子を持つ量子系では、OGHは変形し、Lanczos係数の一般化を定義し、元のOGHと同様に線形に成長するが、散逸強度によって決定されるスケールで指数関数的に増加する振動を経験することを示す。半解析的に解けるモデル(散逸を伴う大規模SYK)、エルゴードスピン鎖の数値計算、散逸の存在下での演算子成長のための解凍可能な玩具モデル(非エルミート粒子ホッピング法に類似)でこの挙動が現れる。最後に、修正されたOGHはリンドブラッドと閉系(高周波数では、前者崩壊のスペクトル関数は代数的に、後者では指数関数的に崩壊する)の基本的な違いに結びつくことを示す。これは実験的に検証可能なステートメントであり、平衡環境に接するシステムへのリンドブレディアンの適用性に制限を課す。 The operator growth hypothesis (OGH) is a technical conjecture about the behaviour of operators -- specifically, the asymptotic growth of their Lanczos coefficients -- under repeated action by a Liouvillian. It is expected to hold for a sufficiently generic closed many-body system. When it holds, it yields bounds on the high frequency behavior of local correlation functions and measures of chaos (like OTOCs). It also gives a route to numerically estimating response functions. Here we investigate the generalisation of OGH to open quantum systems, where the Liouvillian is replaced by a Lindbladian. For a quantum system with local Hermitian jump operators, we show that the OGH is modified: we define a generalisation of the Lanczos coefficient and show that it initially grows linearly as in the original OGH, but experiences exponentially growing oscillations on scales determined by the dissipation strength. We see this behavior manifested in a semi-analytically solvable model (large-q SYK with dissipation), numerically for an ergodic spin chain, and in a solvable toy model for operator growth in the presence of dissipation (which resembles a non-Hermitian single-particle hopping process). 
Finally, we show that the modified OGH connects to a fundamental difference between Lindblad and closed systems: at high frequencies, the spectral functions of the former decay algebraically, while in the latter they decay exponentially. This is an experimentally testable statement, which also places limitations on the applicability of Lindbladians to systems in contact with equilibrium environments.	翻訳日:2023-10-25 21:31:02 公開日:2023-10-23
# データレイクにおけるセマンティックデータ管理 Semantic Data Management in Data Lakes ( http://arxiv.org/abs/2310.15373v1 ) ライセンス: Link先を確認	Sayed Hoseini, Johannes Theissen-Lipp, Christoph Quix	(参考訳) 近年、現代のデータ分析のために大量の異種データを管理するために、データレイクが登場した。データレイクが運用不能なデータ沼になるのを防ぐ方法のひとつは、セマンティックデータ管理である。いくつかのアプローチでは、湖内のデータに対してより多くの意味とセマンティクスを与えるために、Linked Data原則に基づいた知識グラフへのメタデータのリンクを提案する。このようなセマンティクスレイヤは、データ管理だけでなく、異種ソースからのデータ統合の問題にも対処して、データアクセスをより表現豊かで相互運用可能なものにすることもできる。本調査では,データレイクシステム内のアプリケーションとビッグデータのスケーラビリティに着目した最近のアプローチについて概説する。アプローチを (i) 基本的な意味的データ管理、(ii) データレイクにおけるメタデータ強化のための意味モデリング手法、(iii) オントロジベースのデータアクセス手法に分類する。各カテゴリにおいて、主要な技術とその背景をカバーし、最新の研究を比較する。最後に、ビッグデータとセマンティックWeb技術のより緊密な統合を必要とするこの研究分野における今後の取り組みの課題を指摘する。 In recent years, data lakes emerged as a way to manage large amounts of heterogeneous data for modern data analytics. One way to prevent data lakes from turning into inoperable data swamps is semantic data management. Some approaches propose the linkage of metadata to knowledge graphs based on the Linked Data principles to provide more meaning and semantics to the data in the lake. Such a semantic layer may be utilized not only for data management but also to tackle the problem of data integration from heterogeneous sources, in order to make data access more expressive and interoperable. In this survey, we review recent approaches with a specific focus on the application within data lake systems and scalability to Big Data. We classify the approaches into (i) basic semantic data management, (ii) semantic modeling approaches for enriching metadata in data lakes, and (iii) methods for ontology-based data access. In each category, we cover the main techniques and their background, and compare the latest research. Finally, we point out challenges for future work in this research area, which needs a closer integration of Big Data and Semantic Web technologies.	翻訳日:2023-10-25 21:30:31 公開日:2023-10-23
# EpiK-Eval: 認識論的モデルとしての言語モデルの評価 EpiK-Eval: Evaluation for Language Models as Epistemic Models ( http://arxiv.org/abs/2310.15372v1 ) ライセンス: Link先を確認	Gabriele Prato, Jerry Huang, Prasannna Parthasarathi, Shagun Sodhani, Sarath Chandar	(参考訳) 人工知能の時代、大規模言語モデル(LLM)の役割はますます中心となってきています。普及が進む一方で、異なるトレーニングドキュメントから知識を統合する能力(多くのアプリケーションで重要な能力)は未解明のままである。本稿では,LLMがパラメータ空間内でそうした情報を効果的に組み合わせる能力を初めて検討する。セグメント化された物語から首尾一貫した知識表現を形成する上で,LLMの習熟度を評価するための新しい質問応答ベンチマークであるEpiK-Evalを紹介する。様々なLLMに対する評価は、この領域において重大な弱点を示す。これらの欠点は、一般的な訓練目的の本質的な性質に起因していると主張する。その結果,知識統合へのアプローチの洗練を提唱する。これはLLMの全体的な効果と性能を劇的に向上させる可能性を秘めている。本研究は, より堅牢で信頼性の高いLLMを開発するための知見を提供する。私たちのコードとベンチマークはhttps://github.com/chandar-lab/EpiK-Evalで利用可能です。 In the age of artificial intelligence, the role of large language models (LLMs) is becoming increasingly central. Despite their growing prevalence, their capacity to consolidate knowledge from different training documents - a crucial ability in numerous applications - remains unexplored. This paper presents the first study examining the capability of LLMs to effectively combine such information within their parameter space. We introduce EpiK-Eval, a novel question-answering benchmark tailored to evaluate LLMs' proficiency in formulating a coherent and consistent knowledge representation from segmented narratives. Evaluations across various LLMs reveal significant weaknesses in this domain. We contend that these shortcomings stem from the intrinsic nature of prevailing training objectives. Consequently, we advocate for refining the approach towards knowledge consolidation, as it harbors the potential to dramatically improve their overall effectiveness and performance. The findings from this study offer insights for developing more robust and reliable LLMs. Our code and benchmark are available at https://github.com/chandar-lab/EpiK-Eval	翻訳日:2023-10-25 21:30:15 公開日:2023-10-23
# フェデレーテッド3次元医用ボリュームセグメンテーションのための近傍特徴統計量拡張 Vicinal Feature Statistics Augmentation for Federated 3D Medical Volume Segmentation ( http://arxiv.org/abs/2310.15371v1 ) ライセンス: Link先を確認	Yongsong Huang, Wanqing Xie, Mingzhen Li, Mingmei Cheng, Jinzhou Wu, Weixiao Wang, Jane You, Xiaofeng Liu	(参考訳) FL(Federated Learning)は、複数のクライアント医療機関が、プライバシー保護を備えたディープラーニング(DL)モデルを共同でトレーニングすることを可能にする。しかしながら、FLの性能は、小さな機関におけるラベル付きデータの可用性の制限と、機関間での異種(非i.i.d.)データ分布によって制限される。データ拡張は、従来の集中型DLの汎化能力を高める「フリーランチ」の手法として実証されているが、FLでの応用はほとんど未検討である。特に、高価なラベル付けによって制約される3D医用セグメンテーションは、一般的にデータ拡張に依存する。本研究では,局所的な特徴シフトを効率的に緩和し,プライバシを意識したFLセグメンテーションのための協調トレーニングを容易にするために,VFDA(vicinal feature-level data augmentation)方式を開発することを目的とする。我々は、生データの施設間転送や混合(mixup)を必要とせず、施設内および施設間の乖離を考慮に入れている。具体的には、各機関におけるバッチワイド特徴統計(平均や標準偏差など)を利用してデータの差を抽象的に表現し、各特徴統計をガウスプロトタイプを用いて確率的にモデル化する。ここで平均は元の統計量に対応し、分散は拡張の範囲を定量化する。近傍リスク最小化(vicinal risk minimization)の観点からは、新しい特徴統計をガウス分布から引き出すことで拡張を実現できる。分散は、個々の機関のデータバイアスと、すべての参加機関が特徴づける基礎となる特徴統計によって明示的に導かれる。 追加したVFDAは、3D脳腫瘍と心臓のセグメンテーションの両方において6つの先進的なFL法を一貫して改善した。 Federated learning (FL) enables multiple client medical institutes to collaboratively train a deep learning (DL) model with privacy protection. However, the performance of FL can be constrained by the limited availability of labeled data in small institutes and the heterogeneous (i.e., non-i.i.d.) data distribution across institutes. Though data augmentation has been a proven technique to boost the generalization capabilities of conventional centralized DL as a "free lunch", its application in FL is largely underexplored. Notably, constrained by costly labeling, 3D medical segmentation generally relies on data augmentation. In this work, we aim to develop a vicinal feature-level data augmentation (VFDA) scheme to efficiently alleviate the local feature shift and facilitate collaborative training for privacy-aware FL segmentation. We take both the inner- and inter-institute divergence into consideration, without the need for cross-institute transfer of raw data or their mixup. 
Specifically, we exploit the batch-wise feature statistics (e.g., mean and standard deviation) in each institute to abstractly represent the discrepancy of data, and model each feature statistic probabilistically via a Gaussian prototype, with the mean corresponding to the original statistic and the variance quantifying the augmentation scope. From the vicinal risk minimization perspective, novel feature statistics can be drawn from the Gaussian distribution to fulfill augmentation. The variance is explicitly derived by the data bias in each individual institute and the underlying feature statistics characterized by all participating institutes. The added-on VFDA consistently yielded marked improvements over six advanced FL methods on both 3D brain tumor and cardiac segmentation.	翻訳日:2023-10-25 21:29:59 公開日:2023-10-23
# 深い統合的な説明 Deep Integrated Explanations ( http://arxiv.org/abs/2310.15368v1 ) ライセンス: Link先を確認	Oren Barkan, Yehonathan Elisha, Jonathan Weill, Yuval Asher, Amit Eshel, Noam Koenigstein	(参考訳) 本稿では,視覚モデルを説明する普遍的手法であるDeep Integrated Explanations (DIX)を提案する。 DIXは、モデルの中間表現から情報を統合することで説明写像を生成し、対応する勾配と結合する。多様なタスク,データセット,モデル構成にまたがる客観的および主観的評価の広範な配列を通じて,現状の手法を超越しつつ,忠実で正確な説明図を生成する上でのDIXの有効性を示す。 This paper presents Deep Integrated Explanations (DIX) - a universal method for explaining vision models. DIX generates explanation maps by integrating information from the intermediate representations of the model, coupled with their corresponding gradients. Through an extensive array of both objective and subjective evaluations spanning diverse tasks, datasets, and model configurations, we showcase the efficacy of DIX in generating faithful and accurate explanation maps, while surpassing current state-of-the-art methods.	翻訳日:2023-10-25 21:29:28 公開日:2023-10-23
# 高信頼保証による公平表現の学習 Learning Fair Representations with High-Confidence Guarantees ( http://arxiv.org/abs/2310.15358v1 ) ライセンス: Link先を確認	Yuhong Luo, Austin Hoag, Philip S. Thomas	(参考訳) 表現学習は、複数の下流タスクで予測される表現を生成するためにますます使われています。そのため、下流予測タスクにおいて不公平なグループに対する不公平を防止できるため、強い公平性を保証する表現学習アルゴリズムの開発が重要である。下流タスクにおける不当なグループに対する不公平さを防止するためには、公平性を保証する表現学習アルゴリズムを提供することが不可欠である。本稿では,信頼度の高い表現を学習する際の問題を正式に定義する。次に,すべての下流モデルとタスクに対して不公平を制限し,ユーザ定義の上界を持つ高信頼保証(frg)フレームワークを用いた公正表現学習を導入する。 FRGが全ての下流モデルとタスクに対して高い確率で公平性を保証することを証明した後、複数の下流モデルとタスクに対する上限不公平性におけるFRGの有効性を示す経験的評価を示す。 Representation learning is increasingly employed to generate representations that are predictive across multiple downstream tasks. The development of representation learning algorithms that provide strong fairness guarantees is thus important because it can prevent unfairness towards disadvantaged groups for all downstream prediction tasks. To prevent unfairness towards disadvantaged groups in all downstream tasks, it is crucial to provide representation learning algorithms that provide fairness guarantees. In this paper, we formally define the problem of learning representations that are fair with high confidence. We then introduce the Fair Representation learning with high-confidence Guarantees (FRG) framework, which provides high-confidence guarantees for limiting unfairness across all downstream models and tasks, with user-defined upper bounds. After proving that FRG ensures fairness for all downstream models and tasks with high probability, we present empirical evaluations that demonstrate FRG's effectiveness at upper bounding unfairness for multiple downstream models and tasks.	翻訳日:2023-10-25 21:29:19 公開日:2023-10-23
# ツィバコフ雑音を伴う効率的な能動学習半空間:非凸最適化手法 Efficient Active Learning Halfspaces with Tsybakov Noise: A Non-convex Optimization Approach ( http://arxiv.org/abs/2310.15411v1 ) ライセンス: Link先を確認	Yinan Li, Chicheng Zhang	(参考訳) 構造化されたラベルなしデータ分布の下で、Tsybakov雑音~\citep{tsybakov2004optimal}を伴う$d$次元半空間を、計算効率とラベル効率の両面で優れた形でPAC能動学習する問題を検討する。\cite{diakonikolas2020learning}に着想を得て、滑らかな非凸損失関数の任意の近似1次定常点が、超過誤差の小さい半空間を与えることを証明する。 We study the problem of computationally and label efficient PAC active learning of $d$-dimensional halfspaces with Tsybakov Noise~\citep{tsybakov2004optimal} under structured unlabeled data distributions. Inspired by~\cite{diakonikolas2020learning}, we prove that any approximate first-order stationary point of a smooth nonconvex loss function yields a halfspace with a low excess error guarantee. 
In light of the above structural result, we design a nonconvex optimization-based algorithm with a label complexity of $\tilde{O}(d (\frac{1}{\epsilon})^{\frac{8-6\alpha}{3\alpha-1}})$, under the assumption that the Tsybakov noise parameter $\alpha \in (\frac{1}{3}, 1]$, which narrows down the gap between the label complexities of the previously known efficient passive or active algorithms~\citep{diakonikolas2020polynomial,zhang2021improved} and the information-theoretic lower bound in this setting. (In the main body of this work, $\tilde{O}(\cdot)$ and $\tilde{\Theta}(\cdot)$ are used to hide factors of the form $\operatorname{polylog}(d, \frac{1}{\epsilon}, \frac{1}{\delta})$.)	翻訳日:2023-10-25 21:23:49 公開日:2023-10-23
# 視覚要素と認知バイアスが散乱プロットのトレンドの解釈に及ぼす影響 Visual Elements and Cognitive Biases Influence Interpretations of Trends in Scatter Plots ( http://arxiv.org/abs/2310.15406v1 ) ライセンス: Link先を確認	Alexandre Filipowicz, Scott Carter, Nayeli Bravo, Rumen Iliev, Shabnam Hakimi, David Ayman Shamma, Kent Lyons, Candice Hogan, Charlene Wu	(参考訳) 可視化は情報伝達の一般的な方法であるが、誤情報の拡散にもますます使われている。したがって、可視化の解釈に人々が使用する要因を理解することが重要である。本稿では,散乱プロットの解釈に影響を与える要因に着目し,散乱プロット(アウトリアーとトレンドライン)と認知バイアス(人々の信念)の共通視覚的側面が相関傾向の知覚に与える影響について検討する。アウトリーバーは傾向知覚を歪めるが、他のポイントよりも影響が少ないこと、トレンドラインは傾向をより強く見せるが、アウトリーバーの影響を緩和すること、そして人々の信念は弱いものの強い相関関係には影響しないことの3つの主な発見を強調する。これらの結果から,散乱プロットの解釈を歪ませる要因の影響を軽減するために,視覚要素の調整に関するガイドラインを導出する。これらのガイドラインを他の可視化タイプに一般化し、今後の研究に推奨する方法について検討する。 Visualizations are common methods to convey information but also increasingly used to spread misinformation. It is therefore important to understand the factors people use to interpret visualizations. In this paper, we focus on factors that influence interpretations of scatter plots, investigating the extent to which common visual aspects of scatter plots (outliers and trend lines) and cognitive biases (people's beliefs) influence perception of correlation trends. We highlight three main findings: outliers skew trend perception but exert less influence than other points; trend lines make trends seem stronger but also mitigate the influence of some outliers; and people's beliefs have a small influence on perceptions of weak, but not strong correlations. From these results we derive guidelines for adjusting visual elements to mitigate the influence of factors that distort interpretations of scatter plots. We explore how these guidelines may generalize to other visualization types and make recommendations for future studies.	翻訳日:2023-10-25 21:23:26 公開日:2023-10-23
# 科学図表キャプションのための効果的なゼロショット評価器としてのGPT-4 GPT-4 as an Effective Zero-Shot Evaluator for Scientific Figure Captions ( http://arxiv.org/abs/2310.15405v1 ) ライセンス: Link先を確認	Ting-Yao Hsu, Chieh-Yang Huang, Ryan Rossi, Sungchul Kim, C. Lee Giles and Ting-Hao K. Huang	(参考訳) 科学図表のキャプションを生成するシステムへの関心が高まっている。しかし、これらのシステムの出力を評価することは大きな課題となる。人手評価は学術的な専門知識を必要とし、費用がかかるが、自動評価はしばしば低品質の著者によるキャプションに依存する。本稿では,大言語モデル(LLM)をコスト効率のよい参照不要な図表キャプション評価手法として用いることを検討する。まず、600のarXiv図表に対する、オリジナルおよび機械生成の3,600件の科学図表キャプションへの人間の判断を含む人手評価データセットSCICAP-EVALを構築した。次に、図に言及する段落などの関連する文脈を与えたうえで、GPT-4やGPT-3などのLLMに、読者の理解を助ける可能性に基づいて各キャプションを1〜6で採点させた。その結果、ゼロショット評価器として用いたGPT-4は他のすべてのモデルを上回り、コンピュータサイエンスおよびインフォマティクスの学部生による評価さえも上回り、Ph.D.学生によるランキングとのケンドール相関スコア0.401を達成した。 There is growing interest in systems that generate captions for scientific figures. However, assessing these systems' output poses a significant challenge. Human evaluation requires academic expertise and is costly, while automatic evaluation depends on often low-quality author-written captions. This paper investigates using large language models (LLMs) as a cost-effective, reference-free method for evaluating figure captions. We first constructed SCICAP-EVAL, a human evaluation dataset that contains human judgments for 3,600 scientific figure captions, both original and machine-made, for 600 arXiv figures. We then prompted LLMs like GPT-4 and GPT-3 to score (1-6) each caption based on its potential to aid reader understanding, given relevant context such as figure-mentioning paragraphs. Results show that GPT-4, used as a zero-shot evaluator, outperformed all other models and even surpassed assessments made by Computer Science and Informatics undergraduates, achieving a Kendall correlation score of 0.401 with Ph.D. students' rankings.	翻訳日:2023-10-25 21:23:10 公開日:2023-10-23
# 脊髄のコントラスト非依存性ソフトセグメンテーションに向けて Towards contrast-agnostic soft segmentation of the spinal cord ( http://arxiv.org/abs/2310.15402v1 ) ライセンス: Link先を確認	Sandrine Bédard, Naga Karthik Enamundram, Charidimos Tsagkas, Emanuele Pravatà, Cristina Granziera, Andrew Smith, Kenneth Arnold Weber II, Julien Cohen-Adad	(参考訳) 脊髄セグメンテーションは臨床的に有用であり、脊髄圧迫や多発性硬化症などの神経変性疾患の診断・モニタリングのために、脊髄断面積(CSA)の計算に特に用いられる。半自動・自動の手法はいくつか存在するが、重要な制約が残っている。セグメンテーションはMRIのコントラストに依存し、コントラストによって異なるCSAとなる。これは、脊髄と髄液の境界が、シーケンスや撮像パラメータによって様々に現れることが一因である。このコントラストに敏感なCSAは、プロトコルが変化しうるマルチセンタの研究において変動性を付加し、微妙な萎縮を検出する感度を低下させる。さらに、既存の手法ではコントラストごとに1つのモデルをトレーニングすることでCSAの変動性を高めるとともに、部分ボリューム効果を考慮しないバイナリマスクも生成している。そこで本研究では,脊髄のソフトセグメンテーションを生成する深層学習ベースの手法を提案する。健常な参加者のSpine Generic Public Database($\text{n}=267$; $\text{contrasts}=6$)を用いて、まず6つのコントラストのバイナリセグメンテーションを平均化することにより、参加者ごとのソフトな正解(GT)を生成した。これらのソフトGTと回帰に基づく損失関数は、脊髄セグメンテーションのためのUNetモデルを訓練するために使用される。我々は,最先端手法に対するモデルを評価し,GTマスクの種類,損失関数,コントラスト固有モデルに関するアブレーション実験を行った。その結果, ソフト平均セグメンテーションと回帰損失関数を用いることで, CSAの変動性は低下する ($p < 0.05$, Wilcoxon signed-rank test)。提案する脊髄セグメンテーションモデルは,未知のデータセット,ベンダ,コントラスト,病態(圧迫,病変)において,部分ボリューム効果を考慮しつつ,最先端のコントラスト固有手法よりもよく一般化する。 Spinal cord segmentation is clinically relevant and is notably used to compute spinal cord cross-sectional area (CSA) for the diagnosis and monitoring of cord compression or neurodegenerative diseases such as multiple sclerosis. While several semi-automatic and automatic methods exist, one key limitation remains: the segmentation depends on the MRI contrast, resulting in different CSA across contrasts. This is partly due to the varying appearance of the boundary between the spinal cord and the cerebrospinal fluid that depends on the sequence and acquisition parameters. This contrast-sensitive CSA adds variability in multi-center studies where protocols can vary, reducing the sensitivity to detect subtle atrophies. 
Moreover, existing methods enhance the CSA variability by training one model per contrast, while also producing binary masks that do not account for partial volume effects. In this work, we present a deep learning-based method that produces soft segmentations of the spinal cord. Using the Spine Generic Public Database of healthy participants ($\text{n}=267$; $\text{contrasts}=6$), we first generated participant-wise soft ground truth (GT) by averaging the binary segmentations across all 6 contrasts. These soft GT, along with a regression-based loss function, were then used to train a UNet model for spinal cord segmentation. We evaluated our model against state-of-the-art methods and performed ablation studies involving different GT mask types, loss functions, and contrast-specific models. Our results show that using the soft average segmentations along with a regression loss function reduces CSA variability ($p < 0.05$, Wilcoxon signed-rank test). The proposed spinal cord segmentation model generalizes better than the state-of-the-art contrast-specific methods amongst unseen datasets, vendors, contrasts, and pathologies (compression, lesions), while accounting for partial volume effects.	翻訳日:2023-10-25 21:22:47 公開日:2023-10-23
# 「ワンサイズフィットオール」? NLGシステムのアイデンティティ関連言語特性の観察と期待 "One-size-fits-all"? Observations and Expectations of NLG Systems Across Identity-Related Language Features ( http://arxiv.org/abs/2310.15398v1 ) ライセンス: Link先を確認	Li Lucy, Su Lin Blodgett, Milad Shokouhi, Hanna Wallach, Alexandra Olteanu	(参考訳) 適切なNLGシステム行動を構成することの公平性に関する仮定は、システムが社会グループに同じ反応を期待されるような不変性から、適応性まで様々である。我々は,NLGシステム入力における識別関連言語の特徴(名前,役割,場所,方言,スタイル)を摂動させ,不変性や適応性に関する緊張感を照らす5つのケーススタディを設計・実施する。我々は、システムの振る舞いに対する人々の期待を概説し、これら2つの対照的に一般的な仮定に注意を払っている。適応の動機は,社会的規範,文化的差異,特徴特異的情報,調節などであり,非分散的モチベーションには規範主義を好む視点,nlgシステムにとって不必要あるいは難しすぎる視点,誤った仮定への注意などが含まれる。本研究は, 公正なNLGシステム動作を構成するものの定義に関して, オープンな課題を浮き彫りにした。 Fairness-related assumptions about what constitutes appropriate NLG system behaviors range from invariance, where systems are expected to respond identically to social groups, to adaptation, where responses should instead vary across them. We design and conduct five case studies, in which we perturb different types of identity-related language features (names, roles, locations, dialect, and style) in NLG system inputs to illuminate tensions around invariance and adaptation. We outline people's expectations of system behaviors, and surface potential caveats of these two contrasting yet commonly-held assumptions. We find that motivations for adaptation include social norms, cultural differences, feature-specific information, and accommodation; motivations for invariance include perspectives that favor prescriptivism, view adaptation as unnecessary or too difficult for NLG systems to do appropriately, and are wary of false assumptions. Our findings highlight open challenges around defining what constitutes fair NLG system behavior.	翻訳日:2023-10-25 21:22:09 公開日:2023-10-23
# スクイーズ推定のためのガウス状態のクラス Classes of Gaussian States for Squeezing Estimation ( http://arxiv.org/abs/2310.15397v1 ) ライセンス: Link先を確認	Leonardo A. M. Souza	(参考訳) 本研究では,1つのモードで符号化された未知のスクイーズパラメータの評価を対象とする,単一モードと2モードのガウス状態の様々なクラスを,推定過程のキー要素として詳細に検討する。各プローブの有効性を定量化するために、平均量子フィッシャー情報(avqfi)の概念をロバストメトリックとして、ガウス状態の特定のクラスに関連する最適性能を入力として定量化する。単モードプローブでは、純粋に圧縮された単モード状態が最適選択であり、コヒーレンスとAvQFIの相関について検討する。また, 純粋な2モード圧縮状態は, エンコードされたスクイーズパラメータを推定するための単一モードに類似した挙動を示し, エンタングルメントとAvQFIの相互作用について検討した。本稿では,すべての研究クラスを包含する解析的および数値的な結果を示し,量子推定プロセスに有用な洞察を与える。 This study explores a detailed examination of various classes of single- and two-mode Gaussian states as key elements for an estimation process, specifically targeting the evaluation of an unknown squeezing parameter encoded in one mode. To quantify the efficacy of each probe, we employ the concept of Average Quantum Fisher Information (AvQFI) as a robust metric to quantify the optimal performance associated with specific classes of Gaussian states as input. For single-mode probes, we identify pure squeezed single-mode states as the optimal choice and we explore the correlation between Coherence and AvQFI. Also, we show that pure two-mode squeezed states exhibit behavior resembling their single-mode counterparts for estimating the encoded squeezing parameter, and we studied the interplay between entanglement and AvQFI. This paper presents both analytical and numerical results that encompass all the studied classes, offering valuable insights for quantum estimation processes.	翻訳日:2023-10-25 21:21:50 公開日:2023-10-23
# DoGE: 一般化推定によるドメイン再重み付け DoGE: Domain Reweighting with Generalization Estimation ( http://arxiv.org/abs/2310.15393v1 ) ライセンス: Link先を確認	Simin Fan, Matteo Pagliardini, Martin Jaggi	(参考訳) 事前学習データコーパスのカバレッジと構成は、大規模言語モデルの一般化能力に大きな影響を及ぼす。従来、事前学習コーパスは、特定のサンプリング確率(ドメイン重み)に応じて、さまざまなソースドメイン(CommonCrawl、Wikipedia、GitHubなど)で構成されている。しかし、現在の手法には一般化という最終目標のためにドメイン重みを最適化する原則的な方法がない。本稿では,勾配ベースの一般化推定関数を用いて評価された最終一般化目標への寄与に基づいて,各ドメインからのサンプリング確率を再重み付けするDOmain reweighting with Generalization Estimation (DoGE)を提案する。まず、min-max最適化で小規模なプロキシモデルを訓練し、再重み付けされたドメイン重みを求める。各ステップでは、ミラー降下によって全体的な一般化ゲインを最大化するようにドメイン重みを更新する。最後に、得られたドメイン重みを使って、より大規模なフルサイズの言語モデルをトレーニングする。 SlimPajama-6Bデータセットでは、普遍的な一般化目標のもとで、DoGEはより良い平均パープレキシティとゼロショット推論精度を達成する。ドメイン外の一般化タスクでは、DoGEはターゲットドメインのパープレキシティを大きなマージンで削減する。さらに,一般化推定の効率を向上させるパラメータ選択手法を適用する。 The coverage and composition of the pretraining data corpus significantly impact the generalization ability of large language models. Conventionally, the pretraining corpus is composed of various source domains (e.g., CommonCrawl, Wikipedia, GitHub) according to certain sampling probabilities (domain weights). However, current methods lack a principled way to optimize domain weights for the ultimate goal of generalization. We propose DOmain reweighting with Generalization Estimation (DoGE), where we reweigh the sampling probability from each domain based on its contribution to the final generalization objective assessed by a gradient-based generalization estimation function. First, we train a small-scale proxy model with a min-max optimization to obtain the reweighted domain weights. At each step, the domain weights are updated to maximize the overall generalization gain by mirror descent. Finally, we use the obtained domain weights to train a larger-scale full-size language model. On the SlimPajama-6B dataset, with a universal generalization objective, DoGE achieves better average perplexity and zero-shot reasoning accuracy. 
On out-of-domain generalization tasks, DoGE reduces perplexity on the target domain by a large margin. We further apply a parameter-selection scheme which improves the efficiency of generalization estimation.	翻訳日:2023-10-25 21:21:32 公開日:2023-10-23
# MEMPSEP III. A machine learning-oriented multivariate data set for forecasting the Occurrence and Properties of Solar Energetic Particle Events using a Multivariate Ensemble Approach ( http://arxiv.org/abs/2310.15390v1 ) License: see link	Kimberly Moreland, Maher Dayeh, Hazel M. Bain, Subhamoy Chatterjee, Andres Munoz-Jaramillo, Samuel Hart	We introduce a new multivariate data set that utilizes multiple spacecraft collecting in-situ and remote sensing heliospheric measurements shown to be linked to physical processes responsible for generating solar energetic particles (SEPs). Using the Geostationary Operational Environmental Satellites (GOES) flare event list from Solar Cycle (SC) 23 and part of SC 24 (1998-2013), we identify 252 solar events (flares) that produce SEPs and 17,542 events that do not. For each identified event, we acquire the local plasma properties at 1 au, such as energetic proton and electron data, upstream solar wind conditions, and the interplanetary magnetic field vector quantities using various instruments onboard GOES and the Advanced Composition Explorer (ACE) spacecraft. We also collect remote sensing data from instruments onboard the Solar Dynamics Observatory (SDO), the Solar and Heliospheric Observatory (SoHO), and the Wind solar radio instrument WAVES. The data set is designed to allow for variations of the inputs and feature sets for machine learning (ML) in heliophysics and is specifically intended for forecasting the occurrence of SEP events and their subsequent properties. This paper describes a dataset created from multiple publicly available observation sources that is validated, cleaned, and carefully curated for our machine-learning pipeline. The dataset has been used to drive the newly-developed Multivariate Ensemble of Models for Probabilistic Forecast of Solar Energetic Particles (MEMPSEP; see MEMPSEP I (Chatterjee et al., 2023) and MEMPSEP II (Dayeh et al., 2023) for associated papers).	Translated: 2023-10-25 21:21:16 Published: 2023-10-23
# Irreducible Curriculum for Language Model Pretraining ( http://arxiv.org/abs/2310.15389v1 ) License: see link	Simin Fan, Martin Jaggi	Automatic data selection and curriculum design for training large language models is challenging, with only a few existing methods showing improvements over standard training. Furthermore, current schemes focus on domain-level selection, overlooking the more fine-grained contributions of each individual training point. It is difficult to apply traditional datapoint selection methods to large language models: most online batch selection methods perform forward or backward passes twice, which introduces considerable extra costs with large-scale models. To mitigate these obstacles, we propose irreducible curriculum as a curriculum learning algorithm for language model pretraining, which prioritizes samples with higher learnability. Specifically, to avoid prohibitive extra computation overhead, we simulate the sample loss along the main model's training trajectory using a small-scale proxy model. Our experiments on the RedPajama-1B dataset demonstrate a consistent improvement in validation perplexity across all 7 domains compared to a random uniform baseline and the anti-curriculum strategy. Our method also reduces the sharpness of the network and achieves better 5-shot accuracy on MMLU benchmarks.	Translated: 2023-10-25 21:20:33 Published: 2023-10-23
# Fast Projected Newton-like Method for Precision Matrix Estimation under Total Positivity ( http://arxiv.org/abs/2112.01939v4 ) License: see link	Jian-Feng Cai, José Vinícius de M. Cardoso, Daniel P. Palomar, Jiaxi Ying	We study the problem of estimating precision matrices in Gaussian distributions that are multivariate totally positive of order two ($\mathrm{MTP}_2$). The precision matrix in such a distribution is an M-matrix. This problem can be formulated as a sign-constrained log-determinant program. Current algorithms are designed using the block coordinate descent method or the proximal point algorithm, which become computationally challenging in high-dimensional cases due to the requirement to solve numerous nonnegative quadratic programs or large-scale linear systems. To address this issue, we propose a novel algorithm based on the two-metric projection method, incorporating a carefully designed search direction and variable partitioning scheme. Our algorithm substantially reduces computational complexity, and its theoretical convergence is established. Experimental results on synthetic and real-world datasets demonstrate that our proposed algorithm provides a significant improvement in computational efficiency compared to the state-of-the-art methods.	Translated: 2023-10-25 15:27:41 Published: 2023-10-23
# Fast Adaptive Non-Monotone Submodular Maximization Subject to a Knapsack Constraint ( http://arxiv.org/abs/2007.05014v3 ) License: see link	Georgios Amanatidis, Federico Fusco, Philip Lazos, Stefano Leonardi, Rebecca Reiffenhäuser	Constrained submodular maximization problems encompass a wide variety of applications, including personalized recommendation, team formation, and revenue maximization via viral marketing. The massive instances occurring in modern-day applications can render existing algorithms prohibitively slow, while frequently, those instances are also inherently stochastic. Focusing on these challenges, we revisit the classic problem of maximizing a (possibly non-monotone) submodular function subject to a knapsack constraint. We present a simple randomized greedy algorithm that achieves a $5.83$ approximation and runs in $O(n \log n)$ time, i.e., at least a factor $n$ faster than other state-of-the-art algorithms. The robustness of our approach allows us to further transfer it to a stochastic version of the problem. There, we obtain a 9-approximation to the best adaptive policy, which is the first constant approximation for non-monotone objectives. Experimental evaluation of our algorithms showcases their improved performance on real and synthetic data.	Translated: 2023-10-25 15:25:56 Published: 2023-10-23
# Hyperbolic Graph Neural Networks: A Review of Methods and Applications ( http://arxiv.org/abs/2202.13852v2 ) License: see link	Menglin Yang, Min Zhou, Zhihao Li, Jiahong Liu, Lujia Pan, Hui Xiong, Irwin King	Graph neural networks generalize conventional neural networks to graph-structured data and have received widespread attention due to their impressive representation ability. In spite of the remarkable achievements, the performance of Euclidean models in graph-related learning is still bounded and limited by the representation ability of Euclidean geometry, especially for datasets with highly non-Euclidean latent anatomy. Recently, hyperbolic space has gained increasing popularity in processing graph data with tree-like structure and power-law distribution, owing to its exponential growth property. In this survey, we comprehensively revisit the technical details of the current hyperbolic graph neural networks, unifying them into a general framework and summarizing the variants of each component. More importantly, we present various HGNN-related applications. Lastly, we identify several challenges, which may serve as guidelines for further advancing graph learning in hyperbolic spaces.	Translated: 2023-10-25 15:17:16 Published: 2023-10-23
# The Challenge of Understanding What Users Want: Inconsistent Preferences and Engagement Optimization ( http://arxiv.org/abs/2202.11776v3 ) License: see link	Jon Kleinberg, Sendhil Mullainathan, Manish Raghavan	Online platforms have a wealth of data, run countless experiments and use industrial-scale algorithms to optimize user experience. Despite this, many users seem to regret the time they spend on these platforms. One possible explanation is misaligned incentives: platforms are not optimizing for user happiness. We suggest the problem runs deeper, transcending the specific incentives of any particular platform, and instead stems from a mistaken foundational assumption: To understand what users want, platforms look at what users do. Yet research has demonstrated, and personal experience affirms, that we often make choices in the moment that are inconsistent with what we actually want. In this work, we develop a model of media consumption where users have inconsistent preferences. We consider a platform which simply wants to maximize user utility, but only observes user engagement. We show how our model of users' preference inconsistencies produces phenomena that are familiar from everyday experience, but difficult to capture in traditional user interaction models. A key ingredient in our model is a formulation for how platforms determine what to show users: they optimize over a large set of potential content (the content manifold) parametrized by underlying features of the content. Whether improving engagement improves user welfare depends on the direction of movement in the content manifold: for certain directions of change, increasing engagement makes users less happy, while in other directions, increasing engagement makes users happier. We characterize the structure of content manifolds for which increasing engagement fails to increase user utility. By linking these effects to abstractions of platform design choices, our model thus creates a theoretical framework and vocabulary in which to explore interactions between design, behavioral science, and social media.	Translated: 2023-10-25 15:17:00 Published: 2023-10-23
# Statistical Inference for the Dynamic Time Warping Distance, with Application to Abnormal Time-Series Detection ( http://arxiv.org/abs/2202.06593v3 ) License: see link	Vo Nguyen Le Duy, Ichiro Takeuchi	We study statistical inference on the similarity/distance between two time-series in an uncertain environment by considering a statistical hypothesis test on the distance obtained from the Dynamic Time Warping (DTW) algorithm. The sampling distribution of the DTW distance is too difficult to derive because it is obtained from the solution of the DTW algorithm, which is complicated. To circumvent this difficulty, we propose to employ the conditional selective inference framework, which enables us to derive a valid inference method on the DTW distance. To our knowledge, this is the first method that can provide a valid p-value to quantify the statistical significance of the DTW distance, which is helpful for high-stakes decision making such as abnormal time-series detection problems. We evaluate the performance of the proposed inference method on both synthetic and real-world datasets.	Translated: 2023-10-25 15:15:50 Published: 2023-10-23
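For readers unfamiliar with the quantity being tested in the entry above, the following is a minimal sketch of the classic DTW distance computed by dynamic programming. It is not the authors' selective-inference procedure; the function name `dtw_distance` and the absolute-difference local cost are assumptions made here for illustration.

```python
import math


def dtw_distance(x, y):
    """Classic DTW cost between two 1-D sequences (illustrative sketch).

    Assumes the local cost |a - b| and the usual three step directions;
    the paper's exact cost function may differ.
    """
    n, m = len(x), len(y)
    # cost[i][j] = minimal cumulative cost of aligning x[:i] with y[:j]
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(x[i - 1] - y[j - 1])
            cost[i][j] = local + min(
                cost[i - 1][j],      # advance in x only
                cost[i][j - 1],      # advance in y only
                cost[i - 1][j - 1],  # advance in both
            )
    return cost[n][m]
```

Because the returned value is the minimum over many alignment paths, its sampling distribution is hard to characterize directly, which is the difficulty the abstract describes.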
# Transfer-Learning Across Datasets with Different Input Dimensions: An Algorithm and Analysis for the Linear Regression Case ( http://arxiv.org/abs/2202.05069v3 ) License: see link	Luis Pedro Silvestrin, Harry van Zanten, Mark Hoogendoorn, Ger Koole	With the development of new sensors and monitoring devices, more sources of data become available to be used as inputs for machine learning models. These can on the one hand help to improve the accuracy of a model. On the other hand, combining these new inputs with historical data remains a challenge that has not yet been studied in enough detail. In this work, we propose a transfer learning algorithm that combines new and historical data with different input dimensions. This approach is easy to implement, efficient, with computational complexity equivalent to the ordinary least-squares method, and requires no hyperparameter tuning, making it straightforward to apply when the new data is limited. Unlike other approaches, we provide a rigorous theoretical study of its robustness, showing that it cannot be outperformed by a baseline that utilizes only the new data. Our approach achieves state-of-the-art performance on 9 real-life datasets, outperforming the linear DSFT, another linear transfer learning algorithm, and performing comparably to non-linear DSFT.	Translated: 2023-10-25 15:15:15 Published: 2023-10-23
# The Jaynes-Cummings model and its descendants ( http://arxiv.org/abs/2202.00330v3 ) License: see link	Jonas Larson and Th. K. Mavrogordatos	The Jaynes-Cummings (JC) model has been at the forefront of quantum optics for almost six decades to date, providing one of the simplest yet intricately nonlinear formulations of light-matter interaction in modern physics. Laying most of the emphasis on the omnipresence of the model across a range of disciplines, this monograph brings up the fundamental generality of its formalism, looking at a wide gamut of applications in specific physical systems among several realms, including atomic physics, quantum optics, solid-state physics and quantum information science. When bringing the various pieces together to assemble our narrative, we have primarily targeted researchers in quantum physics and quantum optics. The monograph also comprises an accessible introduction for graduate students engaged with non-equilibrium quantum phase transitions, quantum computing and simulation, and quantum many-body physics. In that framework, we aim to reveal the common ground between physics and applications scattered across the literature and different technological advancements. The exposition guides the reader through a vibrant field interlacing quantum optics and condensed-matter physics. All sections are devoted to the strong interconnection between theory and experiment, historically linked to the development of the various modern research directions stemming from JC physics. This is accompanied by a comprehensive list of references to the key publications that have shaped its evolution since the early 1960s. Finally, we have endeavored to keep the presentation of such multifaceted material as concise as possible, interspersing continuous text with various illustrations alongside an economical use of mathematical expressions.	Translated: 2023-10-25 15:14:57 Published: 2023-10-23
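For orientation, the standard textbook form of the JC Hamiltonian in the rotating-wave approximation is sketched below. The symbol names are notation chosen here rather than taken from the monograph: $\omega_c$ is the cavity mode frequency, $\omega_a$ the two-level transition frequency, $g$ the light-matter coupling strength, $a$ the mode annihilation operator, and $\sigma_\pm$, $\sigma_z$ the Pauli operators of the two-level system.

```latex
H_{\mathrm{JC}} = \hbar\omega_c\, a^\dagger a
  + \frac{\hbar\omega_a}{2}\,\sigma_z
  + \hbar g \left( a\,\sigma_+ + a^\dagger \sigma_- \right)
```

The interaction term exchanges a single excitation between the field and the two-level system, so the total excitation number is conserved.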
# Quantum Advantage Seeker with Kernels (QuASK): a software framework to speed up the research in quantum machine learning ( http://arxiv.org/abs/2206.15284v2 ) License: see link	Francesco Di Marcantonio, Massimiliano Incudini, Davide Tezza and Michele Grossi	Exploiting the properties of quantum information to the benefit of machine learning models is perhaps the most active field of research in quantum computation. This interest has supported the development of a multitude of software frameworks (e.g. Qiskit, PennyLane, Braket) to implement, simulate, and execute quantum algorithms. Most of them allow us to define quantum circuits, run basic quantum algorithms, and access low-level primitives depending on the hardware such software is supposed to run on. For most experiments, these frameworks have to be manually integrated within a larger machine learning software pipeline. The researcher is in charge of knowing different software packages, integrating them through the development of long code scripts, analyzing the results, and generating the plots. Long code often leads to erroneous applications, because the average number of bugs grows in proportion to the program length. Moreover, other researchers will struggle to understand and reproduce the experiment, due to the need to be familiar with all the different software frameworks involved in the code script. We propose QuASK, an open-source quantum machine learning framework written in Python that aids the researcher in performing their experiments, with particular attention to quantum kernel techniques. QuASK can be used as a command-line tool to download datasets, pre-process them, run quantum machine learning routines, and analyze and visualize the results. QuASK implements most state-of-the-art algorithms to analyze the data through quantum kernels, with the possibility to use projected kernels, (gradient-descent) trainable quantum kernels, and structure-optimized quantum kernels. Our framework can also be used as a library and integrated into pre-existing software, maximizing code reuse.	Translated: 2023-10-25 15:07:44 Published: 2023-10-23
# Non-Programmers Can Label Programs Indirectly via Active Examples: A Case Study with Text-to-SQL ( http://arxiv.org/abs/2205.12422v3 ) License: see link	Ruiqi Zhong, Charlie Snell, Dan Klein, Jason Eisner	Can non-programmers annotate natural language utterances with complex programs that represent their meaning? We introduce APEL, a framework in which non-programmers select among candidate programs generated by a seed semantic parser (e.g., Codex). Since they cannot understand the candidate programs, we ask them to select indirectly by examining the programs' input-output examples. For each utterance, APEL actively searches for a simple input on which the candidate programs tend to produce different outputs. It then asks the non-programmers only to choose the appropriate output, thus allowing us to infer which program is correct and could be used to fine-tune the parser. As a first case study, we recruited human non-programmers to use APEL to re-annotate SPIDER, a text-to-SQL dataset. Our approach achieved the same annotation accuracy as the original expert annotators (75%) and exposed many subtle errors in the original annotations.	Translated: 2023-10-25 15:06:17 Published: 2023-10-23
# 4Ward: a Relayering Strategy for Efficient Training of Arbitrarily Complex Directed Acyclic Graphs ( http://arxiv.org/abs/2209.02037v2 ) License: see link	Tommaso Boccato, Matteo Ferrante, Andrea Duggento, Nicola Toschi	Thanks to their ease of implementation, multilayer perceptrons (MLPs) have become ubiquitous in deep learning applications. The graph underlying an MLP is indeed multipartite, i.e. each layer of neurons only connects to neurons belonging to the adjacent layer. In contrast, in vivo brain connectomes at the level of individual synapses suggest that biological neuronal networks are characterized by scale-free degree distributions or exponentially truncated power law strength distributions, hinting at potentially novel avenues for the exploitation of evolution-derived neuronal networks. In this paper, we present "4Ward", a method and Python library capable of generating flexible and efficient neural networks (NNs) from arbitrarily complex directed acyclic graphs. 4Ward is inspired by layering algorithms drawn from the graph drawing discipline to implement efficient forward passes, and provides significant time gains in computational experiments with various Erdős-Rényi graphs. 4Ward not only overcomes the sequential nature of the learning matrix method, by parallelizing the computation of activations, but also addresses the scalability issues encountered in the current state-of-the-art and provides the designer with freedom to customize weight initialization and activation functions. Our algorithm can be of aid for any investigator seeking to exploit complex topologies in a NN design framework at the microscale.	Translated: 2023-10-25 14:57:27 Published: 2023-10-23
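To make the idea of layering a directed acyclic graph concrete, here is a minimal sketch of longest-path layering, one of the simplest layering schemes from graph drawing. It is an illustration only and is not claimed to be the algorithm 4Ward uses; the function name `longest_path_layers` and the integer node labels are assumptions made here.

```python
from collections import defaultdict, deque


def longest_path_layers(num_nodes, edges):
    """Group the nodes of a DAG into layers (illustrative sketch).

    A node's layer is the length of the longest path reaching it from any
    source node, so every edge goes from a lower layer to a higher one.
    Nodes are assumed to be labeled 0 .. num_nodes - 1.
    """
    successors = defaultdict(list)
    indegree = [0] * num_nodes
    for src, dst in edges:
        successors[src].append(dst)
        indegree[dst] += 1

    layer = [0] * num_nodes
    queue = deque(v for v in range(num_nodes) if indegree[v] == 0)
    visited = 0
    while queue:  # Kahn-style topological traversal
        v = queue.popleft()
        visited += 1
        for w in successors[v]:
            layer[w] = max(layer[w], layer[v] + 1)
            indegree[w] -= 1
            if indegree[w] == 0:
                queue.append(w)
    if visited != num_nodes:
        raise ValueError("graph contains a cycle")

    grouped = defaultdict(list)
    for v, depth in enumerate(layer):
        grouped[depth].append(v)
    return [grouped[d] for d in range(len(grouped))]
```

Nodes within one layer have no edges between them, so their activations can be computed together once all earlier layers are done. For example, `longest_path_layers(4, [(0, 1), (0, 2), (1, 3), (2, 3), (0, 3)])` groups the nodes as `[[0], [1, 2], [3]]`.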
# Learning Informative Health Indicators Through Unsupervised Contrastive Learning ( http://arxiv.org/abs/2208.13288v2 ) License: see link	Katharina Rombach, Gabriel Michau, Wilfried Bürzle, Stefan Koller and Olga Fink	Condition monitoring is essential to operate industrial assets safely and efficiently. To achieve this goal, the development of robust health indicators has recently attracted significant attention. These indicators, which provide quantitative real-time insights into the health status of industrial assets over time, serve as valuable tools for fault detection and prognostics. In this study, we propose a novel and universal approach to learn health indicators based on unsupervised contrastive learning. Operational time acts as a proxy for the asset's degradation state, enabling the learning of a contrastive feature space that facilitates the construction of a health indicator by measuring the distance to the healthy condition. To highlight the universality of the proposed approach, we assess the proposed contrastive learning framework in two distinct tasks - wear assessment and fault detection - across two different case studies: a milling machine case study and a real condition monitoring case study of railway wheels from operating trains. First, we evaluate whether the health indicator is able to learn the real health condition in a milling machine case study where the ground truth wear condition is continuously measured. Second, we apply the proposed method to a real case study of railway wheels where the ground truth health condition is not known. Here, we evaluate the suitability of the learned health indicator for fault detection of railway wheel defects. Our results demonstrate that the proposed approach is able to learn the ground truth health evolution of milling machines and that the learned health indicator is suited for fault detection of railway wheels operated under various operating conditions, outperforming state-of-the-art methods. Further, we demonstrate that our proposed approach is universally applicable to different systems and different health conditions.	Translated: 2023-10-25 14:56:58 Published: 2023-10-23
# Comparing Apples to Oranges: Learning Similarity Functions for Data Produced by Different Distributions ( http://arxiv.org/abs/2208.12731v2 ) License: see link	Leonidas Tsepenekas, Ivan Brugere, Freddy Lecue, Daniele Magazzeni	Similarity functions measure how comparable pairs of elements are, and play a key role in a wide variety of applications, e.g., notions of Individual Fairness abiding by the seminal paradigm of Dwork et al., as well as Clustering problems. However, access to an accurate similarity function should not always be considered guaranteed, and this point was even raised by Dwork et al. For instance, it is reasonable to assume that when the elements to be compared are produced by different distributions, or in other words belong to different "demographic" groups, knowledge of their true similarity might be very difficult to obtain. In this work, we present an efficient sampling framework that learns these across-groups similarity functions, using only a limited amount of experts' feedback. We show analytical results with rigorous theoretical bounds, and empirically validate our algorithms via a large suite of experiments.	Translated: 2023-10-25 14:56:29 Published: 2023-10-23
# Adaptive Asynchronous Control Using Meta-learned Neural Ordinary Differential Equations ( http://arxiv.org/abs/2207.12062v5 ) License: see link	Achkan Salehi, Steffen Rühl, Stephane Doncieux	Model-based Reinforcement Learning and Control have demonstrated great potential in various sequential decision making problem domains, including in robotics settings. However, real-world robotics systems often present challenges that limit the applicability of those methods. In particular, we note two problems that jointly happen in many industrial systems: 1) Irregular/asynchronous observations and actions and 2) Dramatic changes in environment dynamics from one episode to another (e.g. varying payload inertial properties). We propose a general framework that overcomes those difficulties by meta-learning adaptive dynamics models for continuous-time prediction and control. The proposed approach is task-agnostic and can be adapted to new tasks in a straightforward manner. We present evaluations in two different robot simulations and on a real industrial robot.	Translated: 2023-10-25 14:55:52 Published: 2023-10-23
# KinD-LCE Curve Estimation And Retinex Fusion On Low-Light Image ( http://arxiv.org/abs/2207.09210v3 ) License: see link	Xiaochun Lei, Weiliang Mai, Junlin Xie, He Liu, Zetao Jiang, Zhaoting Gong, Chang Lu, Linjun Lu	Low-light images often suffer from noise and color distortion. Object detection, semantic segmentation, instance segmentation, and other tasks are challenging when working with low-light images because of image noise and chromatic aberration. We also found that the conventional Retinex theory loses information in adjusting the image for low-light tasks. In response to these problems, this paper proposes an algorithm for low illumination enhancement. The proposed method, KinD-LCE, uses a light curve estimation module to enhance the illumination map in the Retinex decomposed image, improving the overall image brightness. An illumination map and reflection map fusion module is also proposed to restore the image details and reduce detail loss. Additionally, a TV (total variation) loss function is applied to eliminate noise. Our method was trained on the GladNet dataset, known for its diverse collection of low-light images, tested against the Low-Light dataset, and evaluated using the ExDark dataset for downstream tasks, demonstrating competitive performance with a PSNR of 19.7216 and SSIM of 0.8213.	Translated: 2023-10-25 14:54:50 Published: 2023-10-23
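The total variation (TV) term mentioned in the entry above can be illustrated with a small sketch. This is the generic anisotropic TV of a grayscale image, written in plain Python for clarity; the abstract does not give the exact formulation used in KinD-LCE, so the function name `total_variation` and the list-of-rows image format are assumptions made here.

```python
def total_variation(image):
    """Anisotropic total variation of a 2-D grayscale image (illustrative sketch).

    `image` is assumed to be a list of equally long rows of numbers.
    The result sums absolute differences between vertically and
    horizontally adjacent pixels.
    """
    height = len(image)
    width = len(image[0]) if height else 0
    tv = 0.0
    for r in range(height):
        for c in range(width):
            if r + 1 < height:  # difference to the pixel below
                tv += abs(image[r + 1][c] - image[r][c])
            if c + 1 < width:   # difference to the pixel on the right
                tv += abs(image[r][c + 1] - image[r][c])
    return tv
```

A flat image has zero TV while a checkerboard-like noisy patch has a high value, which is why penalizing TV tends to suppress pixel-level noise.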
# Hindsight Learning for MDPs with Exogenous Inputs ( http://arxiv.org/abs/2207.06272v3 ) License: see link	Sean R. Sinclair, Felipe Frujeri, Ching-An Cheng, Luke Marshall, Hugo Barbalho, Jingling Li, Jennifer Neville, Ishai Menache, Adith Swaminathan	Many resource management problems require sequential decision-making under uncertainty, where the only uncertainty affecting the decision outcomes comes from exogenous variables outside the control of the decision-maker. We model these problems as Exo-MDPs (Markov Decision Processes with Exogenous Inputs) and design a class of data-efficient algorithms for them termed Hindsight Learning (HL). Our HL algorithms achieve data efficiency by leveraging a key insight: having samples of the exogenous variables, past decisions can be revisited in hindsight to infer counterfactual consequences that can accelerate policy improvements. We compare HL against classic baselines in the multi-secretary and airline revenue management problems. We also scale our algorithms to a business-critical cloud resource management problem -- allocating Virtual Machines (VMs) to physical machines, and simulate their performance with real datasets from a large public cloud provider. We find that HL algorithms outperform domain-specific heuristics, as well as state-of-the-art reinforcement learning methods.	Translated: 2023-10-25 14:54:07 Published: 2023-10-23
# 対照的なふりかえり:RLにおける素早い学習と一般化のための重要なステップについて Contrastive Retrospection: honing in on critical steps for rapid learning and generalization in RL ( http://arxiv.org/abs/2210.05845v6 ) ライセンス: Link先を確認	Chen Sun, Wannan Yang, Thomas Jiralerspong, Dane Malenfant, Benjamin Alsbury-Nealy, Yoshua Bengio, Blake Richards	(参考訳) 実生活では、成功はしばしば、互いに時間的に、そして最終的な報酬から遠ざかる複数の重要なステップに付随する。これらの重要なステップは、信用代入のベルマン方程式に依存する従来の強化学習(RL)手法と同一視することが難しい。本稿では、オフラインのコントラスト学習を用いて、これらの重要なステップに注目する新しいRLアルゴリズムを提案する。 Contrastive Retrospection (ConSpec)と呼ばれるこのアルゴリズムは、既存のRLアルゴリズムに追加することができる。 conspecは、新しい対照的な損失によって、タスクのクリティカルステップのプロトタイプセットを学習し、現在の状態がプロトタイプの1つと一致したとき、本質的な報酬を与える。 ConSpecのプロトタイプは2つの重要な利点を提供している。 i) 全ての重要なステップの迅速な識別を可能にする。 (ii)容易に解釈可能で、感覚的特徴が変化した場合の分布の一般化を可能にする。クレジット・アサインに対する他の現代のRLアプローチとは違い、ConSpecは、成功が(そして他の状態を無視した)相反する小さなステップのセットを、取られたステップごとに前向きに予測することよりも、遡及的に特定することが容易であるという事実を生かしている。 ConSpecは多様なRLタスクの学習を大幅に改善する。 In real life, success is often contingent upon multiple critical steps that are distant in time from each other and from the final reward. These critical steps are challenging to identify with traditional reinforcement learning (RL) methods that rely on the Bellman equation for credit assignment. Here, we present a new RL algorithm that uses offline contrastive learning to hone in on these critical steps. This algorithm, which we call Contrastive Retrospection (ConSpec), can be added to any existing RL algorithm. ConSpec learns a set of prototypes for the critical steps in a task by a novel contrastive loss and delivers an intrinsic reward when the current state matches one of the prototypes. The prototypes in ConSpec provide two key benefits for credit assignment: (i) They enable rapid identification of all the critical steps. (ii) They do so in a readily interpretable manner, enabling out-of-distribution generalization when sensory features are altered. 
Distinct from other contemporary RL approaches to credit assignment, ConSpec takes advantage of the fact that it is easier to retrospectively identify the small set of steps that success is contingent upon (while ignoring other states) than it is to prospectively predict reward at every step taken. ConSpec greatly improves learning in a diverse set of RL tasks.	翻訳日:2023-10-25 14:47:52 公開日:2023-10-23
# 等変深大体積近似による多目的最適化 Multi-objective optimization via equivariant deep hypervolume approximation ( http://arxiv.org/abs/2210.02177v2 ) ライセンス: Link先を確認	Jim Boelrijk, Bernd Ensing, Patrick Forr\'e	(参考訳) 複数の競合する目標を最適化することは、科学と産業に共通する問題である。これらの目的間の本質的に不可分なトレードオフは、パレートフロントを探索するタスクにつながります。後者の目的の有意義な量は、ベイズ最適化(bo)と進化アルゴリズム(eas)で使用される超体積指標である。しかし、ハイパーボリュームの計算の計算の複雑さは、それらの共通の多目的最適化フレームワークの使用を制限する目的やデータポイントの数が増えると不利である。これらの制約を克服するため,我々はdeephvと呼ぶディープニューラルネットワークを用いてハイパーボリューム関数を近似する。より優れたサンプル効率と一般化のために、超体積がそれぞれの目的においてスケール同変であるという事実と、目的とサンプルの両方に置換不変なw.r.t.を、スケーリングと置換の組み合わせ群であるw.r.t.と等価なディープニューラルネットワークを用いて活用する。提案手法は,精度,計算時間,一般化の観点から,高精度で近似的な超体積法に対して評価する。また,本手法を,最先端の多目的BO法およびEAに対して,様々なベンチマークテストケースに適用し比較する。その結果,本手法はマルチ目的最適化タスクに有望であることがわかった。 Optimizing multiple competing objectives is a common problem across science and industry. The inherent inextricable trade-off between those objectives leads one to the task of exploring their Pareto front. A meaningful quantity for the purpose of the latter is the hypervolume indicator, which is used in Bayesian Optimization (BO) and Evolutionary Algorithms (EAs). However, the computational complexity for the calculation of the hypervolume scales unfavorably with increasing number of objectives and data points, which restricts its use in those common multi-objective optimization frameworks. To overcome these restrictions we propose to approximate the hypervolume function with a deep neural network, which we call DeepHV. For better sample efficiency and generalization, we exploit the fact that the hypervolume is scale-equivariant in each of the objectives as well as permutation invariant w.r.t. both the objectives and the samples, by using a deep neural network that is equivariant w.r.t. the combined group of scalings and permutations. We evaluate our method against exact, and approximate hypervolume methods in terms of accuracy, computation time, and generalization. 
We also apply and compare our methods to state-of-the-art multi-objective BO methods and EAs on a range of synthetic benchmark test cases. The results show that our methods are promising for such multi-objective optimization tasks.	翻訳日:2023-10-25 14:46:51 公開日:2023-10-23
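To make the symmetries named in the DeepHV abstract above concrete, the following sketch computes the exact hypervolume for the two-objective case and checks scale equivariance and permutation invariance numerically. It is not the DeepHV network itself; the function name `hypervolume_2d`, the maximization convention, and the reference point at the origin are assumptions chosen for illustration.

```python
def hypervolume_2d(points, ref=(0.0, 0.0)):
    """Area dominated by `points` relative to `ref`,
    assuming both objectives are maximized."""
    # Keep only points that strictly improve on the reference point.
    pts = [(x, y) for x, y in points if x > ref[0] and y > ref[1]]
    # Sweep from the largest first objective down to the smallest.
    pts.sort(key=lambda p: p[0], reverse=True)
    area, best_y = 0.0, ref[1]
    for x, y in pts:
        if y > best_y:  # only non-dominated points add a new strip
            area += (x - ref[0]) * (y - best_y)
            best_y = y
    return area


front = [(3.0, 1.0), (2.0, 2.0), (1.0, 3.0)]

hv = hypervolume_2d(front)
# Scaling the first objective by 2 scales the hypervolume by 2.
hv_scaled = hypervolume_2d([(2.0 * x, y) for x, y in front])
# Reordering the samples leaves the hypervolume unchanged.
hv_reordered = hypervolume_2d(list(reversed(front)))
# Swapping the two objectives leaves the hypervolume unchanged.
hv_swapped = hypervolume_2d([(y, x) for x, y in front])
```

With the reference point at the origin, scaling one objective by a positive constant scales the result by the same constant, which is the scale equivariance the abstract says the network is built to respect.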
# 線形RNNは線形力学系を証明可能な形で学習する Linear RNNs Provably Learn Linear Dynamic Systems ( http://arxiv.org/abs/2211.10582v2 ) ライセンス: Link先を確認	Lifu Wang, Tianyu Wang, Shengwei Yi, Bo Shen, Bo Hu, Xing Cao	(参考訳) 勾配降下を伴う線形リカレントニューラルネットワークの学習能力について検討した。幅広いクラスの損失関数を用いて、線形RNNが任意の安定な線形力学系を学習できることについて、最初の理論的保証を証明する。遷移行列 $C$ に関連するパラメータ $\rho_C$ を持つ任意の安定線形系に対して、RNN の幅が十分に大きい場合(かつ隠れ層における所要の幅は入力シーケンスの長さに依存しない)、パラメータ最適化損失の非凸性にもかかわらず、線形 RNN は $\frac{1}{1-\rho_C}$ の多項式となるサンプル複雑性と時間複雑性で任意の安定線形力学系を証明可能な形で学習できることを示す。その結果,線形RNNを学習するための理論的保証を初めて提供し,リカレント構造が動的システムの学習にどのように役立つかを実証した。 We study the learning ability of linear recurrent neural networks with Gradient Descent. We prove the first theoretical guarantee on linear RNNs to learn any stable linear dynamic system using a large class of loss functions. For an arbitrary stable linear system with a parameter $\rho_C$ related to the transition matrix $C$, we show that despite the non-convexity of the parameter optimization loss, if the width of the RNN is large enough (and the required width in hidden layers does not rely on the length of the input sequence), a linear RNN can provably learn any stable linear dynamic system with the sample and time complexity polynomial in $\frac{1}{1-\rho_C}$. Our results provide the first theoretical guarantee to learn a linear RNN and demonstrate how the recurrent structure can help to learn a dynamic system.	翻訳日:2023-10-25 14:37:37 公開日:2023-10-23
# 雑音型中間スケール量子時代の任意の量子ネットワークのキャラクタリゼーション Characterizing arbitrary quantum networks in the noisy intermediate-scale quantum era ( http://arxiv.org/abs/2210.13751v2 ) ライセンス: Link先を確認	Zhen-Peng Xu	(参考訳) 量子ネットワークは近年注目されている。要するに、エッジで表される量子ソースの分布を、ネットワーク内のノードで表される異なるパーティに記述する。理想の場合、ネットワークから量子状態を特徴づけるツールが最近開発されている。しかし、ノイズの多い中間スケール量子(NISQ)時代の量子ネットワークの特徴は、それらのほとんどを無効にし、実現可能なツールを要求する。量子ネットワークの純度、共分散、トポロジーを利用することで、ノイズがあり、中間スケールで、ランダムかつスパースとなりうるNISQ時代の任意の量子ネットワークに取り組むための体系的なアプローチを提供する。この手法の応用の一つは、多部分エンタングルメント源や量子メモリの品質など、量子ネットワークにおける必須要素の進歩を検証することである。 Quantum networks are of high interest nowadays. In short, they describe the distribution of quantum sources represented by edges to different parties represented by nodes in the networks. Bundles of tools have been developed recently to characterize quantum states from the network in the ideal case. However, features of quantum networks in the noisy intermediate-scale quantum (NISQ) era invalidate most of them and call for feasible tools. By utilizing purity, covariance, and topology of quantum networks, we provide a systematic approach to tackle arbitrary quantum networks in the NISQ era, which can be noisy, intermediate-scale, random, and sparse. One application of our method is to witness the progress of essential elements in quantum networks, like the quality of multipartite entangled sources and quantum memory.	翻訳日:2023-10-25 14:36:45 公開日:2023-10-23
# ル・カムの方程式の再検討:凸密度クラス上の厳密なミニマックスレート Revisiting Le Cam's Equation: Exact Minimax Rates over Convex Density Classes ( http://arxiv.org/abs/2210.11436v2 ) ライセンス: Link先を確認	Shamindra Shrotriya, Matey Neykov	(参考訳) 凸密度クラス上の密度推定のための最小値率を導出する古典的問題を考察する。ル・カム(1973)、バージ(1983, 1986)、ウォン・アンド・シェン(1995)、ヤン・アンド・バロン(1999)の先駆的な業績に基づいて、任意の凸密度クラスに対する(定数まで)最小値の正確な値を決定する。この研究はこれらの既知の結果を拡張し、密度クラスの局所計量エントロピーが常にそのような設定の下で最小値の最適速度を捉えることを示した。我々の境界はパラメトリックおよび非パラメトリック凸密度クラスをまたいだ統一的な視点を提供し、以前考えられていたよりも密度クラスのリッチ性に関する弱い仮定の下にある。提案した「マルチステージシーブ」 MLE は任意の凸密度クラスに適用できる。さらに, この推定器は, 真の利子密度にも適応することを示した。リスク境界を限定された全変動やホルダー密度クラスを含む既知のミニマックス率の再帰に適用する。さらに、研究の少ないクラス、例えば凸混合密度に対する上限を導出した結果の有用性について述べる。 We study the classical problem of deriving minimax rates for density estimation over convex density classes. Building on the pioneering work of Le Cam (1973), Birge (1983, 1986), Wong and Shen (1995), Yang and Barron (1999), we determine the exact (up to constants) minimax rate over any convex density class. This work thus extends these known results by demonstrating that the local metric entropy of the density class always captures the minimax optimal rates under such settings. Our bounds provide a unifying perspective across both parametric and nonparametric convex density classes, under weaker assumptions on the richness of the density class than previously considered. Our proposed `multistage sieve' MLE applies to any such convex density class. We further demonstrate that this estimator is also adaptive to the true underlying density of interest. We apply our risk bounds to rederive known minimax rates including bounded total variation, and Holder density classes. We further illustrate the utility of the result by deriving upper bounds for less studied classes, e.g., convex mixture of densities.	翻訳日:2023-10-25 14:35:22 公開日:2023-10-23
# InterFair: 公正な解釈可能な予測のための自然言語フィードバックの回避 InterFair: Debiasing with Natural Language Feedback for Fair Interpretable Predictions ( http://arxiv.org/abs/2210.07440v2 ) ライセンス: Link先を確認	Bodhisattwa Prasad Majumder, Zexue He, Julian McAuley	(参考訳) NLPモデルは伝統的に、センシティブな属性(例えば、性別や人種)に関連する情報の分離に焦点を当てている。むしろ、有利なデバイアス手法は、盲目的に排除するよりも、説明とともにセンシティブな情報を「公平に」使うべきだと論じている。このバランスはしばしば主観的であり、アルゴリズムの達成は困難である。凍結予測モデルを用いて2つのインタラクティブなセットアップを探索し、フィードバックをユーザに提供することで、タスクのパフォーマンスとバイアス軽減のバランスがより良く公平になることを示す。あるセットアップでは、ユーザはテスト例と対話することで、同じ予測精度を維持しながら説明のバイアス(5～8%)をさらに減らした。他の設定では、人間のフィードバックは、関連するバイアスと入力からの予測情報をアンタングルすることで、より優れたバイアス緩和とタスクパフォーマンス(4-5%)を同時に実現する。 Debiasing methods in NLP models traditionally focus on isolating information related to a sensitive attribute (e.g., gender or race). We instead argue that a favorable debiasing method should use sensitive information 'fairly,' with explanations, rather than blindly eliminating it. This fair balance is often subjective and can be challenging to achieve algorithmically. We explore two interactive setups with a frozen predictive model and show that users able to provide feedback can achieve a better and fairer balance between task performance and bias mitigation. In one setup, users, by interacting with test examples, further decreased bias in the explanations (5-8%) while maintaining the same prediction accuracy. In the other setup, human feedback was able to disentangle associated bias and predictive information from the input leading to superior bias mitigation and improved task performance (4-5%) simultaneously.	翻訳日:2023-10-25 14:34:41 公開日:2023-10-23
# ver: エンティティとリレーションを統一する VER: Unifying Verbalizing Entities and Relations ( http://arxiv.org/abs/2211.11093v3 ) ライセンス: Link先を確認	Jie Huang, Kevin Chen-Chuan Chang	(参考訳) 実体と実体の関係は現実世界において不可欠である。基本的には、実体と関係を理解することによって世界を理解する。例えば、コンピュータ科学などの分野を理解するためには、機械学習のような関連する概念と、機械学習や人工知能といった概念間の関係を理解する必要がある。人を理解するには、まず自分が誰で、どのように他人と関係があるかを知る必要がある。実体と関係を理解するために、人間は自然言語記述を参照することがある。例えば、新しい科学用語を学ぶとき、人々は辞書や百科事典でその定義を読むことから始める。 2つの実体の関係を知るために、人間はそれらをつなぐ文を作る傾向がある。本稿では, Verbalizing Entities and Relations のための統一モデル VER を提案する。具体的には,任意のエンティティやエンティティを入力として取り込んで,エンティティや関係を表現する文を生成するシステムの構築を試みる。広範な実験により,我々はエンティティとエンティティの関係を記述した高品質な文を生成でき,定義モデリングや関係モデリング,ジェネレーティブ・コモンセンス推論など,エンティティとリレーションに関する様々なタスクを促進できることを示した。 Entities and relationships between entities are vital in the real world. Essentially, we understand the world by understanding entities and relations. For instance, to understand a field, e.g., computer science, we need to understand the relevant concepts, e.g., machine learning, and the relationships between concepts, e.g., machine learning and artificial intelligence. To understand a person, we should first know who he/she is and how he/she is related to others. To understand entities and relations, humans may refer to natural language descriptions. For instance, when learning a new scientific term, people usually start by reading its definition in dictionaries or encyclopedias. To know the relationship between two entities, humans tend to create a sentence to connect them. In this paper, we propose VER: a unified model for Verbalizing Entities and Relations. Specifically, we attempt to build a system that takes any entity or entity set as input and generates a sentence to represent entities and relations. Extensive experiments demonstrate that our model can generate high-quality sentences describing entities and entity relationships and facilitate various tasks on entities and relations, including definition modeling, relation modeling, and generative commonsense reasoning.	
翻訳日:2023-10-25 14:28:29 公開日:2023-10-23
# cape: 大規模言語モデルを用いた前提条件エラーの修正動作 CAPE: Corrective Actions from Precondition Errors using Large Language Models ( http://arxiv.org/abs/2211.09935v2 ) ライセンス: Link先を確認	Shreyas Sundara Raman, Vanya Cohen, David Paulius, Ifrah Idrees, Eric Rosen, Ray Mooney and Stefanie Tellex	(参考訳) 大型言語モデル(LLM)からコモンセンス知識を抽出することは、インテリジェントなロボットを設計するための道筋を提供する。 LLMを計画に活用する既存のアプローチは、アクションが失敗したときに回復できず、エラーの根本原因を解決することなく、しばしば失敗したアクションを再試行する。計画中の前提条件エラーを解決するための修正措置を提案する新しいアプローチ(cape)を提案する。 CAPEは、アクション前提条件からの少数ショット推論を活用することにより、生成されたプランの品質を改善する。本手法は, エージェントがベースラインメソッドよりも多くのタスクを実行し, 意味的正確性を確保しつつ, 再プロポーティングを最小化することを可能にする。仮想ホームでは、ケープは人間の注釈による計画の正しさを28.89%から49.63%に改善しながら実行可能な計画を生成する。私たちの改良はboston dynamics spotロボットに(言語で特定された)一連のスキルと関連する前提条件で初期化され、capeはsaycanと比較して実行されたタスクプランの正しい測定基準を76.49%改善しました。我々のアプローチは、ロボットが自然言語コマンドに従い、失敗から頑健に回復することを可能にする。 Extracting commonsense knowledge from a large language model (LLM) offers a path to designing intelligent robots. Existing approaches that leverage LLMs for planning are unable to recover when an action fails and often resort to retrying failed actions, without resolving the error's underlying cause. We propose a novel approach (CAPE) that attempts to propose corrective actions to resolve precondition errors during planning. CAPE improves the quality of generated plans by leveraging few-shot reasoning from action preconditions. Our approach enables embodied agents to execute more tasks than baseline methods while ensuring semantic correctness and minimizing re-prompting. In VirtualHome, CAPE generates executable plans while improving a human-annotated plan correctness metric from 28.89% to 49.63% over SayCan. Our improvements transfer to a Boston Dynamics Spot robot initialized with a set of skills (specified in language) and associated preconditions, where CAPE improves the correctness metric of the executed task plans by 76.49% compared to SayCan. 
Our approach enables the robot to follow natural language commands and robustly recover from failures, which baseline approaches largely cannot resolve or can address only inefficiently.	翻訳日:2023-10-25 14:27:43 公開日:2023-10-23
# Qafny: タイプ誘導古典分離論理による量子プログラム検証 Qafny: Quantum Program Verification Through Type-guided Classical Separation Logic ( http://arxiv.org/abs/2211.06411v3 ) ライセンス: Link先を確認	Liyi Li, Mingwei Zhu, Rance Cleaveland, Alexander Nicolellis, Yi Lee, Le Chang, Xiaodi Wu	(参考訳) 形式的検証は、量子プログラムが仕様を実装していることを保証するのに役立っているが、しばしば時間と労力のかなりの投資を必要とする。この課題に対処するために,量子プログラムの検証用に設計された自動証明システムであるqafnyを提案する。 Qafnyの核心は、量子演算を古典的な配列演算に変換する型誘導量子証明システムである。これらの操作を古典的な分離論理フレームワーク内の証明ルールとしてモデル化することで、qafnyは従来の退屈で時間のかかる推論プロセスの多くを自動化する。我々は証明システムの健全性と完全性を証明し、qafnyプログラムをdafnyプログラミング言語と実行可能な量子回路に変換するプロトタイプコンパイラを実装した。 qafnyを用いて、量子ウォークアルゴリズム、グローバー探索アルゴリズム、ショアのファクタリングアルゴリズムなど重要な量子アルゴリズムを効率的に検証する方法を実証し、人間の労力を大幅に削減する。 Formal verification has been proven instrumental to ensure that quantum programs implement their specifications but often requires a significant investment of time and labor. To address this challenge, we present Qafny, an automated proof system designed for verifying quantum programs. At its core, Qafny uses a type-guided quantum proof system that translates quantum operations to classical array operations. By modeling these operations as proof rules within a classical separation logic framework, Qafny automates much of the traditionally tedious and time-consuming reasoning process. We prove the soundness and completeness of our proof system and implement a prototype compiler that transforms Qafny programs both into the Dafny programming language and into executable quantum circuits. Using Qafny, we demonstrate how to efficiently verify important quantum algorithms, including quantum-walk algorithms, Grover's search algorithm, and Shor's factoring algorithm, with significantly reduced human effort.	翻訳日:2023-10-25 14:26:58 公開日:2023-10-23
# 理解してる? きめ細かいビジュアルコモンセンスのマルチモーダル評価 Understanding ME? Multimodal Evaluation for Fine-grained Visual Commonsense ( http://arxiv.org/abs/2211.05895v2 ) ライセンス: Link先を確認	Zhecan Wang, Haoxuan You, Yicheng He, Wenhao Li, Kai-Wei Chang and Shih-Fu Chang	(参考訳) ビジュアルコモンセンス理解には、視覚言語(VL)モデルが画像とテキストを理解するだけでなく、記述された視覚シーンの理解を完全に統合し、達成するために相互参照することも必要である。近年,様々な手法が開発され,ビジュアルコモンセンスベンチマークで高い性能を実現している。しかし、評価データ資源が限られているため、これらのモデルが視覚的シーンと基礎となるコモンセンス知識を本当に理解しているかどうかは不明である。本研究では,詳細な分析を行うために,質問応答ペアを自動生成して視覚シーン,テキスト,関連知識に対するモデルの理解をテストするマルチモーダル評価(ME)パイプラインを提案する。次に、さらに一歩進んで、MEデータによるトレーニングが標準VCR評価におけるモデルの性能を高めることを示す。最後に,(1)意味的に低レベルな情報は高レベルな情報の学習を支援するが,その逆ではない,(2)視覚情報はテキストと比較して一般的に十分に活用されていない,という興味深い知見が得られた。 Visual commonsense understanding requires Vision Language (VL) models to not only understand image and text but also cross-reference in-between to fully integrate and achieve comprehension of the visual scene described. Recently, various approaches have been developed and have achieved high performance on visual commonsense benchmarks. However, it is unclear whether the models really understand the visual scene and underlying commonsense knowledge due to limited evaluation data resources. To provide an in-depth analysis, we present a Multimodal Evaluation (ME) pipeline to automatically generate question-answer pairs to test models' understanding of the visual scene, text, and related knowledge. We then take a step further to show that training with the ME data boosts the model's performance in standard VCR evaluation. Lastly, our in-depth analysis and comparison reveal interesting findings: (1) semantically low-level information can assist the learning of high-level information but not the opposite; (2) visual information is generally under-utilized compared with text.	翻訳日:2023-10-25 14:26:43 公開日:2023-10-23
# GANに基づくデータ合成によるフェデレーションクラスタリング Federated clustering with GAN-based data synthesis ( http://arxiv.org/abs/2210.16524v2 ) ライセンス: Link先を確認	Jie Yan, Jing Liu, Ji Qi and Zhong-Yuan Zhang	(参考訳) フェデレーションクラスタリング(FC)は、フェデレーション設定における集中クラスタリングの拡張である。ローカルな類似性は、ローカルなデータを正しくグループ化するには不十分であり、クライアント間でのサンプルの類似性はプライバシの制約のため直接測定できないため、プライベートデータを共有せずにグローバルな類似性尺度を構築する方法が鍵となる。 FCを分析する最も簡単な方法は、K-means (KM) やfuzzy c-means (FCM) のような集中型から拡張された手法を採用することである。しかし、クライアント間での非独立分散(非IID)データに対して脆弱である。そこで我々は,SDA-FC(Synthetic data aided Federated Clustering)と呼ばれる新しいフェデレーションクラスタリングフレームワークを提案する。各クライアントで生成する敵ネットワークをローカルにトレーニングし、生成した合成データをサーバにアップロードし、合成データ上でKMまたはFCMを実行する。合成データにより、非IID問題に対してモデルが免疫し、プライベートデータを共有することなく、より効率的にグローバルな類似性特性を捉えることができる。総合的な実験によりSDA-FCの利点が明らかとなり、非IID問題とデバイス故障に対処する際の優れた性能が示された。 Federated clustering (FC) is an extension of centralized clustering in federated settings. The key here is how to construct a global similarity measure without sharing private data, since the local similarity may be insufficient to group local data correctly and the similarity of samples across clients cannot be directly measured due to privacy constraints. Obviously, the most straightforward way to analyze FC is to employ the methods extended from centralized ones, such as K-means (KM) and fuzzy c-means (FCM). However, they are vulnerable to non independent-and-identically-distributed (non-IID) data among clients. To handle this, we propose a new federated clustering framework, named synthetic data aided federated clustering (SDA-FC). It trains generative adversarial network locally in each client and uploads the generated synthetic data to the server, where KM or FCM is performed on the synthetic data. The synthetic data can make the model immune to the non-IID problem and enable us to capture the global similarity characteristics more effectively without sharing private data. 
Comprehensive experiments reveal the advantages of SDA-FC, including superior performance in addressing the non-IID problem and device failures.	翻訳日:2023-10-25 14:26:07 公開日:2023-10-23
# 狭いs波フェシュバッハ共鳴近傍の3体衝突のスケーリング則 Scaling law for three-body collisions near a narrow s-wave Feshbach resonance ( http://arxiv.org/abs/2212.08257v2 ) ライセンス: Link先を確認	Jiaming Li, Shuai Peng, Yirou Xu, Shiyin Kuang, Le Luo	(参考訳) 超低温の原子ガスは、3体再結合率が散乱長のスケーリングに依存する3体系の非弾性過程を研究するための制御可能なシステムを提供する。このようなスケーリングは様々な相互作用強度を持つボゾン系で確認されているが、フェルミオン原子での存在は依然として解明されていない。本研究では,散乱長$a<0$の2成分$^6$Liフェルミガスにおける3体原子損失率$L_3$のスケーリング則について実験的に検討した。スケーリング則は、狭い$s$波Feshbach共鳴近傍の$a$の一定範囲で検証され、そこでは$L_3\propto T|a|^{2.60(5)}$（$T$は気体温度）となる。スケーリング則は、散乱長の点で上界と下界を有することが観察される。上界の場合、$a\rightarrow \infty$ では、強い三体衝突による共鳴のユニタリな振る舞いにより、べき乗則スケーリングが抑制される。下界の場合、$a\rightarrow 0$ では、有限範囲効果が有効散乱長 $L_e$ によってスケーリング則を変更する。これらの結果は、フェルミオン系における3体再結合率が、一般化されたエフィモフ物理学に関連するスケーリング則によって特徴づけられうることを示している。 Ultracold atomic gases provide a controllable system to study the inelastic processes for three-body systems, where the three-body recombination rate depends on the scattering length scaling. Such scalings have been confirmed in bosonic systems with various interaction strengths, but their existence with fermionic atoms remains elusive. In this work, we report on an experimental investigation of the scaling law for the three-body atomic loss rate $L_3$ in a two-component $^6$Li Fermi gas with the scattering length $a<0$. The scaling law is validated within a certain range of $a$ near the narrow $s$-wave Feshbach resonance, where $L_3\propto T|a|^{2.60(5)}$, and $T$ is the gas temperature. The scaling law is observed to have an upper and a lower bound in terms of the scattering length. For the upper bound, when $a\rightarrow \infty$, the power-law scaling is suppressed by the unitary behavior of the resonance caused by the strong three-body collisions. For the lower bound, $a\rightarrow 0$, the finite range effect modifies the scaling law by the effective scattering length $L_e$.
These results indicate that the three-body recombination rate in a fermionic system could be characterized by the scaling law associated with the generalized Efimov physics.	翻訳日:2023-10-25 14:17:15 公開日:2023-10-23
# IRRGN:マルチターン応答選択のための暗黙リレーショナル推論グラフネットワーク IRRGN: An Implicit Relational Reasoning Graph Network for Multi-turn Response Selection ( http://arxiv.org/abs/2212.00482v2 ) ライセンス: Link先を確認	Jingcheng Deng, Hengwei Dai, Xuewei Guo, Yuanchen Ju and Wei Peng	(参考訳) マルチターン対話における応答選択のタスクは、すべての候補から最適な選択肢を見つけることである。モデルの推論能力を向上させるために、これまでの研究では、決定論的で限定的で柔軟性に乏しい発話間の依存関係をモデル化するために、明示的なアルゴリズムを使うことに注意を払っている。加えて、推論前後の選択肢の違いを考慮する研究はほとんどない。本稿では,これらの問題に対処するImplicit Relational Reasoning Graph Networkを提案し,Utterance Relational Reasoner (URR) とOption Dual Comparator (ODC) から構成される。 URRは、発話間の依存関係を暗黙的に抽出し、発話とオプションを抽出し、リレーショナルグラフ畳み込みネットワークで推論することを目的としている。 ODCは、ノイズオプションの干渉を排除できる二重比較により、選択肢間の差異を知覚することに焦点を当てている。 2つのマルチターン対話推論ベンチマークデータセットにおける実験結果から,本手法は4つの事前学習言語モデルのベースラインを大幅に改善し,最先端の性能を実現する。このモデルは、MuTualデータセットで初めて人間のパフォーマンスを上回ります。 The task of response selection in multi-turn dialogue is to find the best option from all candidates. In order to improve the reasoning ability of the model, previous studies pay more attention to using explicit algorithms to model the dependencies between utterances, which are deterministic, limited and inflexible. In addition, few studies consider differences between the options before and after reasoning. In this paper, we propose an Implicit Relational Reasoning Graph Network to address these issues, which consists of the Utterance Relational Reasoner (URR) and the Option Dual Comparator (ODC). URR aims to implicitly extract dependencies between utterances, as well as utterances and options, and make reasoning with relational graph convolutional networks. ODC focuses on perceiving the difference between the options through dual comparison, which can eliminate the interference of the noise options. Experimental results on two multi-turn dialogue reasoning benchmark datasets MuTual and MuTual+ show that our method significantly improves the baseline of four pretrained language models and achieves state-of-the-art performance. 
The model surpasses human performance for the first time on the MuTual dataset.	翻訳日:2023-10-25 14:16:21 公開日:2023-10-23
# GANに基づくプライバシー保護型深層クラスタリング Privacy-Preserving Federated Deep Clustering based on GAN ( http://arxiv.org/abs/2211.16965v2 ) ライセンス: Link先を確認	Jie Yan, Jing Liu, Ji Qi and Zhong-Yuan Zhang	(参考訳) フェデレーションクラスタリング(FC)は、フェデレーション設定用に設計された集中クラスタリングの不可欠な拡張であり、プライベートデータを共有せずにグローバルな類似度尺度を構築することが課題である。 FCに対する従来のアプローチは、典型的にはK平均やファジィc平均のような集中的な方法の拡張を採用する。しかし,これらの手法はクライアント間の非独立分散(非IID)データに影響を受けやすいため,特に高次元データでは,最適以下の性能が得られる。本稿では,gans(generative adversarial network)に基づく,プライバシ保存型連合型深層クラスタリングを提案することにより,これらの制約に対処する新しいアプローチを提案する。各クライアントはローカルな生成敵ネットワーク(GAN)をローカルにトレーニングし、合成データをサーバにアップロードする。サーバは合成データに深いクラスタリングネットワークを適用して$k$のクラスタセントロイドを確立し、クラスタ割り当てのためにクライアントにダウンロードする。理論的分析によると、GANの生成したサンプルは、クライアント間で共有され、本質的に特定のプライバシー保証を守り、個々のデータの機密性を保護する。さらに,提案手法の有効性を実験的に検証し,精度とプライバシ保護性を考慮したフェデレーションクラスタリングを実現する。 Federated clustering (FC) is an essential extension of centralized clustering designed for the federated setting, wherein the challenge lies in constructing a global similarity measure without the need to share private data. Conventional approaches to FC typically adopt extensions of centralized methods, like K-means and fuzzy c-means. However, these methods are susceptible to non-independent-and-identically-distributed (non-IID) data among clients, leading to suboptimal performance, particularly with high-dimensional data. In this paper, we present a novel approach to address these limitations by proposing a Privacy-Preserving Federated Deep Clustering based on Generative Adversarial Networks (GANs). Each client trains a local generative adversarial network (GAN) locally and uploads the synthetic data to the server. The server applies a deep clustering network on the synthetic data to establish $k$ cluster centroids, which are then downloaded to the clients for cluster assignment. Theoretical analysis demonstrates that the GAN-generated samples, shared among clients, inherently uphold certain privacy guarantees, safeguarding the confidentiality of individual data. 
Furthermore, extensive experimental evaluations showcase the effectiveness and utility of our proposed method in achieving accurate and privacy-preserving federated clustering.	翻訳日:2023-10-25 14:15:46 公開日:2023-10-23
# 頂点間の相互作用をモデル化するグラフニューラルネットワークの能力について On the Ability of Graph Neural Networks to Model Interactions Between Vertices ( http://arxiv.org/abs/2211.16494v5 ) ライセンス: Link先を確認	Noam Razin, Tom Verbin, Nadav Cohen	(参考訳) グラフニューラルネットワーク(GNN)は、グラフの頂点として表されるエンティティ間の複雑な相互作用をモデル化するために広く使われている。近年のGNNの表現力を理論的に分析する試みにもかかわらず、相互作用をモデル化する能力の形式的特徴は欠如している。現在の論文は、このギャップに対処することを目的としている。分離ランクと呼ばれる確立された尺度による相互作用の形式化強度は、与えられた頂点の部分集合とその補集合の間の相互作用をモデル化する特定のGNNの能力を定量化する。この結果から, 相互作用をモデル化する能力は, 分割の境界から得られるウォーク数によって定義されるグラフ理論特性であるウォーク指数によって決定されることがわかった。一般的なgnnアーキテクチャを用いた実験はこの発見を裏付ける。本理論の実用的応用として,入力エッジの除去時にGNNが相互作用をモデル化する能力を保持するWIS(Walk Index Sparsification)というエッジスペーシフィケーションアルゴリズムを設計する。 wisは単純で計算効率が良く,本実験では誘導予測の精度で代替手法を著しく上回っている。より広義には、モデリング可能な相互作用を理論的に分析することで、GNNを改善する可能性を示している。 Graph neural networks (GNNs) are widely used for modeling complex interactions between entities represented as vertices of a graph. Despite recent efforts to theoretically analyze the expressive power of GNNs, a formal characterization of their ability to model interactions is lacking. The current paper aims to address this gap. Formalizing strength of interactions through an established measure known as separation rank, we quantify the ability of certain GNNs to model interaction between a given subset of vertices and its complement, i.e. between the sides of a given partition of input vertices. Our results reveal that the ability to model interaction is primarily determined by the partition's walk index -- a graph-theoretical characteristic defined by the number of walks originating from the boundary of the partition. Experiments with common GNN architectures corroborate this finding. As a practical application of our theory, we design an edge sparsification algorithm named Walk Index Sparsification (WIS), which preserves the ability of a GNN to model interactions when input edges are removed. 
WIS is simple, computationally efficient, and in our experiments has markedly outperformed alternative methods in terms of induced prediction accuracy. More broadly, it showcases the potential of improving GNNs by theoretically analyzing the interactions they can model.	翻訳日:2023-10-25 14:15:14 公開日:2023-10-23
# Program of Thoughtsプロンプティング:数値推論タスクにおいて計算を推論から切り離す Program of Thoughts Prompting: Disentangling Computation from Reasoning for Numerical Reasoning Tasks ( http://arxiv.org/abs/2211.12588v4 ) ライセンス: Link先を確認	Wenhu Chen, Xueguang Ma, Xinyi Wang, William W. Cohen	(参考訳) 近年,複雑な数値推論タスクを解くために,ステップバイステップ推論を行う言語モデルが大幅に進歩している。 CoT(Chain-of-thoughts prompting)は、これらのタスクに対する最先端の手法である。 CoTは言語モデルを使用して、多段階の 'thought' プロセスで推論と計算の両方を実行する。計算を推論から切り離すために,言語モデル(主にCodex)を用いて推論過程をプログラムとして表現する 'Program of Thoughts' (PoT) を提案する。計算は外部コンピュータに委譲され、生成されたプログラムを実行して回答を導出する。我々は,5つの算術文章題データセット(GSM,AQuA,SVAMP,TabMWP,MultiArith)と3つの財務QAデータセット(FinQA,ConvFinQA,TATQA)を用いて,少数ショットとゼロショットの両方の設定でPoTを評価する。少数ショットとゼロショットの両方の設定で、PoTは評価したデータセット全体でCoTに対して平均約12%の性能向上を示す。 PoTと自己整合性デコーディングを組み合わせることで、すべての数学問題データセットでSoTA性能、財務データセットでほぼSoTA性能を達成することができる。すべてのデータとコードはGitHub https://github.com/wenhuchen/Program-of-Thoughts で公開されています。 Recently, there has been significant progress in teaching language models to perform step-by-step reasoning to solve complex numerical reasoning tasks. Chain-of-thoughts prompting (CoT) is by far the state-of-the-art method for these tasks. CoT uses language models to perform both reasoning and computation in the multi-step 'thought' process. To disentangle computation from reasoning, we propose 'Program of Thoughts' (PoT), which uses language models (mainly Codex) to express the reasoning process as a program. The computation is relegated to an external computer, which executes the generated programs to derive the answer. We evaluate PoT on five math word problem datasets (GSM, AQuA, SVAMP, TabMWP, MultiArith) and three financial-QA datasets (FinQA, ConvFinQA, TATQA) for both few-shot and zero-shot setups. Under both few-shot and zero-shot settings, PoT can show an average performance gain over CoT of around 12% across all the evaluated datasets. By combining PoT with self-consistency decoding, we can achieve SoTA performance on all math problem datasets and near-SoTA performance on financial datasets.
All of our data and code are released on GitHub at https://github.com/wenhuchen/Program-of-Thoughts	翻訳日:2023-10-25 14:14:04 公開日:2023-10-23
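The division of labor described in the Program of Thoughts abstract above (the model writes a program, an external interpreter does the arithmetic) can be sketched as follows. No language model is called here: the `generated` string is a hand-written stand-in for model output, and the helper `run_program` and the convention of reading the result from a variable named `ans` are assumptions for illustration, not a description of the authors' released code.

```python
def run_program(program: str):
    """Execute generated code in a fresh namespace and
    return the value it assigns to `ans` (assumed convention)."""
    namespace = {}
    # Executing model-written code is unsafe outside a sandbox;
    # this is acceptable here only because the program is hand-written.
    exec(program, namespace)
    return namespace["ans"]


# Hypothetical question: pens cost 3 dollars each; what do 12 pens
# cost after a 25% discount? The reasoning is expressed as code,
# and the interpreter, not the model, performs the arithmetic.
generated = """
price_per_pen = 3
num_pens = 12
discount = 0.25
total = price_per_pen * num_pens
ans = total * (1 - discount)
"""

answer = run_program(generated)
```

Because the final number comes from the interpreter, arithmetic slips that can occur when a model computes inside free-form text are avoided, provided the generated program itself is correct.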
# 多出力学習可能性の特徴付け A Characterization of Multioutput Learnability ( http://arxiv.org/abs/2301.02729v5 ) ライセンス: Link先を確認	Vinod Raman, Unique Subedi, Ambuj Tewari	(参考訳) バッチおよびオンライン環境でマルチアウトプット関数クラスを学習する問題を考える。どちらの設定でも、関数クラスの各単一出力制限が学習可能である場合に限り、マルチアウトプット関数クラスが学習可能であることを示す。これは、バッチおよびオンライン設定の両方において、マルチラベル分類とマルチアウトプット回帰の学習可能性の完全な特徴付けを提供する。拡張として,バンディットフィードバック設定におけるマルチラベル学習可能性も考慮し,フルフィードバック設定と同様の特徴付けを示す。 We consider the problem of learning multioutput function classes in batch and online settings. In both settings, we show that a multioutput function class is learnable if and only if each single-output restriction of the function class is learnable. This provides a complete characterization of the learnability of multilabel classification and multioutput regression in both batch and online settings. As an extension, we also consider multilabel learnability in the bandit feedback setting and show a similar characterization as in the full-feedback setting.	翻訳日:2023-10-25 14:07:16 公開日:2023-10-23
# マルチクラス分類に基づく量子ニューラルネットワークのデミスティファイト問題依存力 Demystify Problem-Dependent Power of Quantum Neural Networks on Multi-Class Classification ( http://arxiv.org/abs/2301.01597v2 ) ライセンス: Link先を確認	Yuxuan Du, Yibo Yang, Dacheng Tao, Min-Hsiu Hsieh	(参考訳) 量子ニューラルネットワーク(QNN)は物理世界を理解する上で重要なツールとなっているが、その利点と限界は完全には理解されていない。特定の符号化方法を持つQNNの中には、古典的なサロゲートによって効率的にシミュレートできるものもあるが、量子メモリを持つものは古典的な分類器よりも優れている。本稿では,マルチクラス分類タスクにおける量子ニューラルネットワーク分類器(qcs)の問題依存パワーを体系的に検討する。予測リスクの分析により, 分類器の訓練損失と一般化誤差を共同で評価する指標として, 訓練損失が一般化能力よりもパワーを支配すること, 第二に, 深層神経分類器の二重発光リスク曲線とは対照的に, qcsはu字型のリスク曲線をとること, の2つの重要な知見を明らかにした。また、最適QCとヘルストローム境界と等角的タイトフレームとの固有接続を明らかにする。そこで本研究では,学習課題における古典的分類器よりもQCの方が有効かどうかを探索するために,損失ダイナミクスを用いた手法を提案する。画像データセットにおける多層パーセプトロン上のqcsの優位性と畳み込みニューラルネットワークの限界を説明するための手法の有効性を数値実験により証明した。我々の研究はQNNの課題依存力に光を当て、その潜在的なメリットを評価するための実践的なツールを提供する。 Quantum neural networks (QNNs) have become an important tool for understanding the physical world, but their advantages and limitations are not fully understood. Some QNNs with specific encoding methods can be efficiently simulated by classical surrogates, while others with quantum memory may perform better than classical classifiers. Here we systematically investigate the problem-dependent power of quantum neural classifiers (QCs) on multi-class classification tasks. Through the analysis of expected risk, a measure that weighs the training loss and the generalization error of a classifier jointly, we identify two key findings: first, the training loss dominates the power rather than the generalization ability; second, QCs undergo a U-shaped risk curve, in contrast to the double-descent risk curve of deep neural classifiers. We also reveal the intrinsic connection between optimal QCs and the Helstrom bound and the equiangular tight frame. Using these findings, we propose a method that uses loss dynamics to probe whether a QC may be more effective than a classical classifier on a particular learning task. 
Numerical results demonstrate the effectiveness of our approach to explain the superiority of QCs over multilayer Perceptron on parity datasets and their limitations over convolutional neural networks on image datasets. Our work sheds light on the problem-dependent power of QNNs and offers a practical tool for evaluating their potential merit.	翻訳日:2023-10-25 14:07:08 公開日:2023-10-23
# 断熱量子アニーリングがスピードアップをもたらす可能性が低い理由 Why adiabatic quantum annealing is unlikely to yield speed-up ( http://arxiv.org/abs/2212.13649v2 ) ライセンス: Link先を確認	Aarón Villanueva, Peyman Najafi, Hilbert J. Kappen	(参考訳) ハミルトニアン $H = z H_f + H_0$ を用いた組合せ最適化のための量子アニーリングについて検討する。ここで $H_f$ は対角的、$H_0=-|\phi \rangle \langle \phi|$ は等重ね合わせ状態への射影演算子、$z$ はアニーリングパラメータである。状態の総数を $N$ として、最小スペクトルギャップが $\mathcal{O}(1/\sqrt{N})$ であることとその位置 $z_*$ を解析的に計算する。量子スピードアップには $z_*$ の正確な知識を要するアニーリングスケジュールが必要であり、$z_*$ は最適化問題の状態密度が既知の場合にのみ計算できることを示す。しかし、一般に状態密度の計算は困難であり、実用的な組合せ最適化問題では二次的なスピードアップは実現できない。我々は、この否定的な結果が $H_0 = -\sum_{i=1}^n \sigma_i^x$ のような他のインスタンス非依存な横磁場ハミルトニアンにも当てはまる可能性が高いと推測する。 We study quantum annealing for combinatorial optimization with Hamiltonian $H = z H_f + H_0$ where $H_f$ is diagonal, $H_0=-|\phi \rangle \langle \phi|$ is the equal superposition state projector and $z$ the annealing parameter. We analytically compute the minimal spectral gap as $\mathcal{O}(1/\sqrt{N})$ with $N$ the total number of states and its location $z_*$. We show that quantum speed-up requires an annealing schedule which demands a precise knowledge of $z_*$, which can be computed only if the density of states of the optimization problem is known. However, in general the density of states is intractable to compute, making quadratic speed-up unfeasible for any practical combinatoric optimization problems. We conjecture that it is likely that this negative result also applies for any other instance independent transverse Hamiltonians such as $H_0 = -\sum_{i=1}^n \sigma_i^x$.	翻訳日:2023-10-25 14:06:06 公開日:2023-10-23
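The gap scaling in the entry above can be checked numerically on a toy instance. The sketch below assumes the simplest possible cost Hamiltonian, a single marked state with energy -1 and all other energies 0 (a Grover-type problem chosen here for illustration, not one taken from the paper), and scans the annealing parameter on a grid. The function name `min_gap` and the grid bounds are likewise assumptions.

```python
import numpy as np


def min_gap(n_states: int, z_grid: np.ndarray) -> tuple[float, float]:
    """Return (minimal gap, z at the minimum) of H(z) = z*H_f + H_0 over a grid of z."""
    # Assumed toy problem: one marked basis state with energy -1, all others 0.
    h_f = np.zeros((n_states, n_states))
    h_f[0, 0] = -1.0
    # H_0 = -|phi><phi| with |phi> the equal superposition of all basis states.
    phi = np.full(n_states, 1.0 / np.sqrt(n_states))
    h_0 = -np.outer(phi, phi)

    best_gap, best_z = np.inf, float("nan")
    for z in z_grid:
        energies = np.linalg.eigvalsh(z * h_f + h_0)  # eigenvalues in ascending order
        gap = energies[1] - energies[0]
        if gap < best_gap:
            best_gap, best_z = gap, z
    return float(best_gap), float(best_z)


z_grid = np.linspace(0.5, 1.5, 401)
for n in (16, 64, 256):
    gap, z_star = min_gap(n, z_grid)
    print(f"N={n:4d}  min gap={gap:.4f}  gap*sqrt(N)={gap * np.sqrt(n):.3f}  z*={z_star:.4f}")
```

For this toy instance the product of the minimal gap and the square root of N stays close to a constant as N grows, which is the $\mathcal{O}(1/\sqrt{N})$ behaviour the abstract describes. The location of the minimum is easy to find here only because the spectrum of the toy cost Hamiltonian is fully known.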
# Smooth Sailing:Representation Smoothness Analysisによる事前学習型言語モデルのアクティブラーニングの改善 Smooth Sailing: Improving Active Learning for Pre-trained Language Models with Representation Smoothness Analysis ( http://arxiv.org/abs/2212.11680v2 ) ライセンス: Link先を確認	Josip Jukić and Jan Šnajder	(参考訳) 法外なラベリングコストを軽減するために開発されたアクティブラーニング(AL)手法は、教師あり学習におけるラベルの複雑さを軽減することを目的としている。最近の研究は、大規模な事前学習言語モデル(PLM)と組み合わせてALを使用する利点を実証しているが、ALの有効性を妨げる実践的な課題を見落としていることが多い。これらの課題に対して,表現の滑らかさ分析を活用して,ALの実現可能性,すなわち有効性と実用性の両方を確保する。まず、現実的なAL条件でしばしば利用できない検証セットを必要としない早期停止手法を提案し、複数のデータセットやALメソッドにわたるランダムサンプリングに対する大幅な改善を観察する。さらに,タスク適応がALを改善するのに対し,ALにおける標準的な短い微調整はランダムサンプリングよりも改善しないことがわかった。本研究は,ALの表現滑らか性解析の有用性を実証し,ラベルの複雑さを低減するAL停止基準を導入する。 Developed to alleviate prohibitive labeling costs, active learning (AL) methods aim to reduce label complexity in supervised learning. While recent work has demonstrated the benefit of using AL in combination with large pre-trained language models (PLMs), it has often overlooked the practical challenges that hinder the effectiveness of AL. We address these challenges by leveraging representation smoothness analysis to ensure AL is feasible, that is, both effective and practicable. Firstly, we propose an early stopping technique that does not require a validation set -- often unavailable in realistic AL conditions -- and observe significant improvements over random sampling across multiple datasets and AL methods. Further, we find that task adaptation improves AL, whereas standard short fine-tuning in AL does not provide improvements over random sampling. Our work demonstrates the usefulness of representation smoothness analysis for AL and introduces an AL stopping criterion that reduces label complexity.	翻訳日:2023-10-25 14:05:46 公開日:2023-10-23
# 量子力学に対する非線形補正の存在下での重力相互作用の量子的性質のテストについて On tests of the quantum nature of gravitational interactions in presence of non-linear corrections to quantum mechanics ( http://arxiv.org/abs/2302.00365v2 ) ライセンス: Link先を確認	Giovanni Spaventa, Ludovico Lami, Martin B. Plenio	(参考訳) 2つの粒子が主に重力を通じて相互作用し、量子力学の法則に従うとき、エンタングルメントの生成は重力相互作用の量子的性質の要点と見なされる。しかし、重力相互作用が古典的あるいは短距離において欠如している場合でも、弱い量子相互作用や局所量子力学に対する非線形補正の存在下でも絡み合いのダイナミクスが生じることを示した。このことは、重力の量子特性を決定的にテストするために絡み合い検出を超えることの重要性を強調しており、大きな質量領域の量子力学に対する他の量子力の強さと潜在的な非線形補正を徹底的に検討する必要がある。 When two particles interact primarily through gravity and follow the laws of quantum mechanics, the generation of entanglement is considered a hallmark of the quantum nature of the gravitational interaction. However, we demonstrate that entanglement dynamics can also occur in the presence of a weak quantum interaction and non-linear corrections to local quantum mechanics, even if the gravitational interaction is classical or absent at short distances. This highlights the importance of going beyond entanglement detection to conclusively test the quantum character of gravity, and it requires a thorough examination of the strength of other quantum forces and potential non-linear corrections to quantum mechanics in the realm of large masses.	翻訳日:2023-10-25 13:56:14 公開日:2023-10-23
# フェアネスを考慮した多変量時系列予測のための情報表現の学習 Learning Informative Representation for Fairness-aware Multivariate Time-series Forecasting: A Group-based Perspective ( http://arxiv.org/abs/2301.11535v2 ) ライセンス: Link先を確認	Hui He, Qi Zhang, Shoujin Wang, Kun Yi, Zhendong Niu, Longbing Cao	(参考訳) 多変量時系列(MTS)予測モデルには、変数間の性能の不公平性が広く存在する。この不公平な問題に対処することは、すべての変数に等しく参加し、脆弱なモデルバイアス/リスクを避けるために重要である。しかし、公正なmts予測は困難であり、文献ではあまり研究されていない。このような大きなギャップを埋めるために、有利変数と不利変数の両方に対応する情報表現の学習としてフェアネスモデリング問題を定式化する。そこで,フェアネスを考慮したMTS予測のためのフレームワークFairForを提案する。 FairForは、下流予測のためのグループ非依存表現とグループ関連表現の両方を生成するための逆学習に基づいている。このフレームワークはまず、K平均目標のスペクトル緩和を利用して、変数相関を推論し、したがって群変数を推論する。次に、フィルタリング・フュージョン成分を用いて、群関連情報をフィルタリングし、直交正規化により群非依存表現を生成する。群独立かつ群関連表現は、有利な変数から不利な変数への知識の共有を容易にし、公正性を保証する。 4つの公開データセットに対する大規模な実験は、フェア予測と大幅な性能改善のために提案したFairForの有効性を示す。 Performance unfairness among variables widely exists in multivariate time series (MTS) forecasting models since such models may attend/bias to certain (advantaged) variables. Addressing this unfairness problem is important for equally attending to all variables and avoiding vulnerable model biases/risks. However, fair MTS forecasting is challenging and has been less studied in the literature. To bridge such significant gap, we formulate the fairness modeling problem as learning informative representations attending to both advantaged and disadvantaged variables. Accordingly, we propose a novel framework, named FairFor, for fairness-aware MTS forecasting. FairFor is based on adversarial learning to generate both group-independent and group-relevant representations for the downstream forecasting. The framework first leverages a spectral relaxation of the K-means objective to infer variable correlations and thus to group variables. Then, it utilizes a filtering&fusion component to filter the group-relevant information and generate group-independent representations via orthogonality regularization. 
The group-independent and group-relevant representations form highly informative representations, facilitating knowledge sharing from advantaged variables to disadvantaged variables to guarantee fairness. Extensive experiments on four public datasets demonstrate the effectiveness of our proposed FairFor for fair forecasting and significant performance improvement.	翻訳日:2023-10-25 13:55:41 公開日:2023-10-23
# 不完全なタイムキーピングが量子制御に及ぼす影響 The Impact of Imperfect Timekeeping on Quantum Control ( http://arxiv.org/abs/2301.10767v3 ) ライセンス: Link先を確認	Jake Xuereb, Florian Meier, Paul Erker, Mark T. Mitchison and Marcus Huber	(参考訳) 量子システムを一元的に進化させるためには、エージェントは時間に関する知識を必要とする。本稿では,時間知識の獲得に関する制限が,異なるパラダイムにおける制御量子演算にどのように影響するかを考察する。我々は,エージェントが回路ベースの量子計算で達成できる回路の複雑さを抑えるための時間管理の質を示す。我々は、ランダム回路の一般クラスに対する不完全なタイムキーピングの下で達成可能な平均ゲート忠実性の上界を導出することでこれを行う。量子制御が関連する別の領域は、量子熱力学である。その文脈において、量子ビットの冷却は任意の品質のタイマで達成できることを示す: タイムキーピングエラーは冷却速度にのみ影響し、達成可能な温度には影響しない。本解析は,自律的量子時計の研究と量子チャネルの理論を組み合わせることで,制御された量子ダイナミクスに対する不完全なタイムキーピングの効果を理解する。 In order to unitarily evolve a quantum system, an agent requires knowledge of time, a parameter which no physical clock can ever perfectly characterise. In this letter, we study how limitations on acquiring knowledge of time impact controlled quantum operations in different paradigms. We show that the quality of timekeeping an agent has access to limits the circuit complexity they are able to achieve within circuit-based quantum computation. We do this by deriving an upper bound on the average gate fidelity achievable under imperfect timekeeping for a general class of random circuits. Another area where quantum control is relevant is quantum thermodynamics. In that context, we show that cooling a qubit can be achieved using a timer of arbitrary quality for control: timekeeping error only impacts the rate of cooling and not the achievable temperature. Our analysis combines techniques from the study of autonomous quantum clocks and the theory of quantum channels to understand the effect of imperfect timekeeping on controlled quantum dynamics.	翻訳日:2023-10-25 13:55:17 公開日:2023-10-23
# 差分プライベートな自然言語モデル:最近の進歩と今後の方向性 Differentially Private Natural Language Models: Recent Advances and Future Directions ( http://arxiv.org/abs/2301.09112v2 ) ライセンス: Link先を確認	Lijie Hu, Ivan Habernal, Lei Shen and Di Wang	(参考訳) 近年のディープラーニングは,自然言語処理(NLP)タスクにおいて大きな成功を収めている。しかし、これらのアプリケーションは機密情報を含むデータを扱う可能性がある。したがって、機密データのプライバシーを保護しながら優れたパフォーマンスを実現することは、NLPにとって重要な課題である。プライバシーを守るために、復元攻撃を防ぎ、潜在的なサイド知識からも保護できる差分プライバシー(DP)は、プライベートデータ分析のデファクト技術になりつつある。近年,DPモデルにおけるNLP(DP-NLP)は,様々な観点から研究されている。本稿では,NLPにおけるDP深層学習モデルの最近の進歩を初めて体系的に検討する。特に,まずDP-NLPと標準的なDP深層学習との相違点と追加課題について論じる。次に, DP-NLPに関する既存の研究について検討し, 勾配摂動法, 埋め込みベクトル摂動法, アンサンブルモデルに基づく手法の3つの側面から最近の展開を述べる。課題や今後の方向性についても論じる。 Recent developments in deep learning have led to great success in various natural language processing (NLP) tasks. However, these applications may involve data that contain sensitive information. Therefore, how to achieve good performance while also protecting the privacy of sensitive data is a crucial challenge in NLP. To preserve privacy, Differential Privacy (DP), which can prevent reconstruction attacks and protect against potential side knowledge, is becoming a de facto technique for private data analysis. In recent years, NLP in DP models (DP-NLP) has been studied from different perspectives, which deserves a comprehensive review. In this paper, we provide the first systematic review of recent advances in DP deep learning models in NLP. In particular, we first discuss some differences and additional challenges of DP-NLP compared with the standard DP deep learning. Then, we investigate some existing work on DP-NLP and present its recent developments from three aspects: gradient perturbation based methods, embedding vector perturbation based methods, and ensemble model based methods. We also discuss some challenges and future directions.	翻訳日:2023-10-25 13:54:41 公開日:2023-10-23
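The first family named in the entry above, gradient perturbation, can be illustrated in a few lines. The sketch below follows the widely used clip-then-add-Gaussian-noise recipe known from DP-SGD. It is not taken from the survey itself; the function name and parameters are illustrative assumptions, and the privacy accounting that turns a noise multiplier into an actual privacy guarantee is deliberately left out.

```python
import numpy as np


def private_mean_gradient(per_example_grads, clip_norm, noise_multiplier, rng):
    """Clip each example's gradient to `clip_norm`, sum, add Gaussian noise, then average."""
    grads = np.asarray(per_example_grads, dtype=float)
    norms = np.linalg.norm(grads, axis=1, keepdims=True)
    # Scale down only those gradients whose L2 norm exceeds the clipping threshold.
    scale = np.minimum(1.0, clip_norm / np.maximum(norms, 1e-12))
    clipped = grads * scale
    # Noise standard deviation is proportional to the clipping threshold.
    noise = rng.normal(0.0, noise_multiplier * clip_norm, size=grads.shape[1])
    return (clipped.sum(axis=0) + noise) / grads.shape[0]


rng = np.random.default_rng(0)
grads = [[3.0, 4.0], [0.3, 0.4]]  # toy per-example gradients
print(private_mean_gradient(grads, clip_norm=1.0, noise_multiplier=1.0, rng=rng))
```

Clipping bounds how much any single training example can move the update, and the noise then masks whatever influence remains. The other two families in the survey perturb embeddings or aggregate ensembles instead of touching the gradients.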
# Qudit Dicke状態の準備 Qudit Dicke state preparation ( http://arxiv.org/abs/2301.04989v4 ) ライセンス: Link先を確認	Rafael I. Nepomechie and David Raveh	(参考訳) Qudit Dicke状態は、(qubit)Dicke状態として知られる、高度にエンタングルした完全対称な量子状態の重要なクラスの高次元アナログである。任意のqudit Dicke状態を決定論的に準備する回路を定式化する。基本ゲートによる回路の明示的な分解を示し、qubitおよびqutritの場合についてcirqで実装する。 Qudit Dicke states are higher-dimensional analogues of an important class of highly-entangled completely symmetric quantum states known as (qubit) Dicke states. A circuit for preparing arbitrary qudit Dicke states deterministically is formulated. An explicit decomposition of the circuit in terms of elementary gates is presented, and is implemented in cirq for the qubit and qutrit cases.	翻訳日:2023-10-25 13:53:44 公開日:2023-10-23
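To make the target of the entry above concrete, the sketch below builds the state vector of a qudit Dicke state classically, as the equal superposition of all distinct orderings of a fixed multiset of levels. It is not the paper's preparation circuit and does not use cirq. The function name and the occupation-number argument are illustrative assumptions.

```python
from itertools import permutations

import numpy as np


def qudit_dicke_state(occupations):
    """Equal superposition of all distinct orderings of a multiset of qudit levels.

    occupations[j] is the number of qudits in level j, so the qudit dimension is
    d = len(occupations) and the number of qudits is n = sum(occupations).
    """
    d = len(occupations)
    levels = [j for j, count in enumerate(occupations) for _ in range(count)]
    n = len(levels)
    state = np.zeros(d ** n)
    for ordering in set(permutations(levels)):
        # Read the ordering as a base-d number to get the basis-state index.
        index = 0
        for level in ordering:
            index = index * d + level
        state[index] = 1.0
    return state / np.linalg.norm(state)


qubit_state = qudit_dicke_state([2, 2])      # 4 qubits, two of them in level 1
qutrit_state = qudit_dicke_state([1, 1, 1])  # 3 qutrits, one in each level
print(np.count_nonzero(qubit_state), np.count_nonzero(qutrit_state))
```

Enumerating permutations scales factorially, so this is only useful for checking small cases, for example comparing against the output of a simulated circuit.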
# CodeLMSecベンチマーク:Black-Boxコード言語モデルにおけるセキュリティ脆弱性のシステム的評価と検出 CodeLMSec Benchmark: Systematically Evaluating and Finding Security Vulnerabilities in Black-Box Code Language Models ( http://arxiv.org/abs/2302.04012v2 ) ライセンス: Link先を確認	Hossein Hajipour, Keno Hassler, Thorsten Holz, Lea Schönherr, Mario Fritz	(参考訳) 自動コード生成のための大規模言語モデル(LLM)は、いくつかのプログラミングタスクでブレークスルーを達成した。競争レベルのプログラミング問題における彼らの進歩は、AI支援ペアプログラミングの重要な柱となり、GitHub Copilotのようなツールは、何百万もの開発者が毎日使っているプログラミングワークフローの一部として登場した。これらのモデルのトレーニングデータは、通常、インターネット(例えばオープンソースのリポジトリから)から収集され、障害やセキュリティ上の脆弱性を含む可能性がある。このサニタイズされていないトレーニングデータは、言語モデルにこれらの脆弱性を学習させ、コード生成手順中にそれを伝播させる可能性がある。これらのモデルは機能的に正しいプログラムを作成する能力について広範囲に評価されてきたが、これらのモデルのセキュリティ面に対処する包括的な調査やベンチマークはいまだに存在しない。本研究では,言語モデルのセキュリティ問題を系統的に研究し,脆弱なコード生成に対する感受性を評価する手法を提案する。この目的のために,ブラックボックスコード生成モデルの脆弱性を含む生成コードを自動的に発見する最初のアプローチを導入する。これを実現するために,少数ショットプロンプトに基づくブラックボックスコード生成モデルの近似インバージョンを提案する。リスクの高いセキュリティの弱点を生成するために,コード言語モデルを調べることで,アプローチの有効性を評価する。さらに,本手法を用いて,さまざまな脆弱性シナリオに対する安全でないプロンプトの多種多様な収集を行う。このデータセットは、コード言語モデルのセキュリティの弱点を評価し、比較するためのベンチマークを形成する。 Large language models (LLMs) for automatic code generation have achieved breakthroughs in several programming tasks. Their advances in competition-level programming problems have made them an essential pillar of AI-assisted pair programming, and tools such as GitHub Copilot have emerged as part of the daily programming workflow used by millions of developers. The training data for these models is usually collected from the Internet (e.g., from open-source repositories) and is likely to contain faults and security vulnerabilities. This unsanitized training data can cause the language models to learn these vulnerabilities and propagate them during the code generation procedure. While these models have been extensively assessed for their ability to produce functionally correct programs, there remains a lack of comprehensive investigations and benchmarks addressing the security aspects of these models. 
In this work, we propose a method to systematically study the security issues of code language models to assess their susceptibility to generating vulnerable code. To this end, we introduce the first approach to automatically find generated code that contains vulnerabilities in black-box code generation models. To achieve this, we present an approach to approximate inversion of the black-box code generation models based on few-shot prompting. We evaluate the effectiveness of our approach by examining code language models in generating high-risk security weaknesses. Furthermore, we establish a collection of diverse non-secure prompts for various vulnerability scenarios using our method. This dataset forms a benchmark for evaluating and comparing the security weaknesses in code language models.	翻訳日:2023-10-25 13:48:24 公開日:2023-10-23
# ロバストネスを考慮したコアセット選択による効率よい対人コントラスト学習 Efficient Adversarial Contrastive Learning via Robustness-Aware Coreset Selection ( http://arxiv.org/abs/2302.03857v4 ) ライセンス: Link先を確認	Xilie Xu, Jingfeng Zhang, Feng Liu, Masashi Sugiyama, Mohan Kankanhalli	(参考訳) ACL(Adversarial contrastive Learning)は、高価なデータアノテーションを必要としないが、敵攻撃に耐える堅牢な表現を出力し、幅広い下流タスクに一般化する。しかし、ACLは、すべてのトレーニングデータの逆の変種を生成するために、膨大な実行時間を必要とします。 ACLを高速化するために,ロバストネス対応コアセット選択法(RCS)を提案する。 RCSはラベル情報を必要とせず、自然なデータとその仮想逆数との表現の距離である表現の発散を最小限に抑える情報的サブセットを検索する。すべての可能な部分集合をトラバースするRCSのバニラ解は計算的に禁じられている。そこで, 理論上はrcsをサブモジュラー最大化のサロゲート問題に変換し, グリーディ探索は元の問題に対して最適性を保証する効率的な解である。実験的な結果から、RCSはロバスト性伝達性を著しく損なうことなく、大きなマージンでACLを高速化できることを示す。特に,我々の知る限りでは,大規模な ImageNet-1K データセット上で ACL を効率的に実行し,RCS による効率的なロバスト表現を実現するのは初めてである。ソースコードはhttps://github.com/GodXuxilie/Efficient_ACL_via_RCSにあります。 Adversarial contrastive learning (ACL) does not require expensive data annotations but outputs a robust representation that withstands adversarial attacks and also generalizes to a wide range of downstream tasks. However, ACL needs tremendous running time to generate the adversarial variants of all training data, which limits its scalability to large datasets. To speed up ACL, this paper proposes a robustness-aware coreset selection (RCS) method. RCS does not require label information and searches for an informative subset that minimizes a representational divergence, which is the distance of the representation between natural data and their virtual adversarial variants. The vanilla solution of RCS via traversing all possible subsets is computationally prohibitive. Therefore, we theoretically transform RCS into a surrogate problem of submodular maximization, of which the greedy search is an efficient solution with an optimality guarantee for the original problem. Empirically, our comprehensive results corroborate that RCS can speed up ACL by a large margin without significantly hurting the robustness transferability. 
Notably, to the best of our knowledge, we are the first to conduct ACL efficiently on the large-scale ImageNet-1K dataset to obtain an effective robust representation via RCS. Our source code is at https://github.com/GodXuxilie/Efficient_ACL_via_RCS.	翻訳日:2023-10-25 13:47:09 公開日:2023-10-23
# スコアベース条件モデルの概念代数 Concept Algebra for Score-Based Conditional Models ( http://arxiv.org/abs/2302.03693v3 ) ライセンス: Link先を確認	Zihao Wang, Lin Gui, Jeffrey Negrea, Victor Veitch	(参考訳) 本稿では,テキスト誘導生成モデルにおける学習表現の構造を,スコアベースモデルに焦点をあてる。そのようなモデルの鍵となる性質は、異なる概念を 'disentangled' な方法で構成できることである。これはこれらのモデルが、概念を 'disentangled' な方法でエンコードする内部表現を持っていることを示唆している。ここでは、概念がある表現空間の部分空間として符号化されるという考えに焦点を当てる。これは何を意味するのかを形式化し、表現に自然な選択があることを示し、与えられた概念に対応する表現の一部を識別する簡単な方法を開発する。特に、表現の代数的操作を通じてモデルによって表現される概念を操作することができる。このアイデアを安定拡散を用いて実例で示す。 This paper concerns the structure of learned representations in text-guided generative models, focusing on score-based models. A key property of such models is that they can compose disparate concepts in a `disentangled' manner. This suggests these models have internal representations that encode concepts in a `disentangled' manner. Here, we focus on the idea that concepts are encoded as subspaces of some representation space. We formalize what this means, show there's a natural choice for the representation, and develop a simple method for identifying the part of the representation corresponding to a given concept. In particular, this allows us to manipulate the concepts expressed by the model through algebraic manipulation of the representation. We demonstrate the idea with examples using Stable Diffusion.	翻訳日:2023-10-25 13:46:45 公開日:2023-10-23
# リンク予測を超えた推論のための2レベル知識グラフの学習表現 Learning Representations of Bi-level Knowledge Graphs for Reasoning beyond Link Prediction ( http://arxiv.org/abs/2302.02601v4 ) ライセンス: Link先を確認	Chanyoung Chung and Joyce Jiyoung Whang	(参考訳) 知識グラフは三重項を用いて既知の事実を表す。既存の知識グラフ埋め込み手法はエンティティ間の接続のみを考慮しているが、我々は三重項間の関係を考慮することを提案する。例えば、2つの三重項$T_1$と$T_2$を考え、$T_1$は(Academy_Awards, Nominates, Avatar)、$T_2$は(Avatar, Wins, Academy_Awards)であるとする。この2つのベースレベル三重項を考えると、$T_1$は$T_2$の前提条件である。本稿では,三重項間の関係を表す高次三重項を定義する。例えば,$\langle T_1$, PrerequisiteFor, $T_2\rangle$であり,ここでPrerequisiteForは高次関係である。ベースレベル三重項と高次三重項からなる二段階知識グラフを定義する。また,二段階知識グラフ上のランダムウォークに基づくデータ拡張戦略を提案し,もっともらしい三重項を追加する。我々のモデルであるBiVEは、ベースレベル三重項と高次三重項の構造を考慮し、さらに拡張された三重項も考慮に入れて埋め込みを学習する。三重項予測と条件付きリンク予測という2つの新しいタスクを提案する。三重項$T_1$と高次関係が与えられたとき、三重項予測は、その高次関係によって$T_1$と接続される可能性が高い三重項を予測する(例:$\langle T_1$, PrerequisiteFor, ?$\rangle$)。条件付きリンク予測は、別の三重項を条件として、三重項内の欠落したエンティティを予測する(例:$\langle T_1$, PrerequisiteFor, (Avatar, Wins, ?)$\rangle$)。実験の結果,BiVEは実世界の二段階知識グラフにおいて,2つの新しいタスクおよび典型的なベースレベルリンク予測で他のすべての手法を大きく上回っていることがわかった。 Knowledge graphs represent known facts using triplets. While existing knowledge graph embedding methods only consider the connections between entities, we propose considering the relationships between triplets. For example, let us consider two triplets $T_1$ and $T_2$ where $T_1$ is (Academy_Awards, Nominates, Avatar) and $T_2$ is (Avatar, Wins, Academy_Awards). Given these two base-level triplets, we see that $T_1$ is a prerequisite for $T_2$. In this paper, we define a higher-level triplet to represent a relationship between triplets, e.g., $\langle T_1$, PrerequisiteFor, $T_2\rangle$ where PrerequisiteFor is a higher-level relation. We define a bi-level knowledge graph that consists of the base-level and the higher-level triplets. We also propose a data augmentation strategy based on the random walks on the bi-level knowledge graph to augment plausible triplets. 
Our model called BiVE learns embeddings by taking into account the structures of the base-level and the higher-level triplets, with additional consideration of the augmented triplets. We propose two new tasks: triplet prediction and conditional link prediction. Given a triplet $T_1$ and a higher-level relation, the triplet prediction predicts a triplet that is likely to be connected to $T_1$ by the higher-level relation, e.g., $\langle T_1$, PrerequisiteFor, ?$\rangle$. The conditional link prediction predicts a missing entity in a triplet conditioned on another triplet, e.g., $\langle T_1$, PrerequisiteFor, (Avatar, Wins, ?)$\rangle$. Experimental results show that BiVE significantly outperforms all other methods in the two new tasks and the typical base-level link prediction in real-world bi-level knowledge graphs.	翻訳日:2023-10-25 13:46:34 公開日:2023-10-23
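The bi-level structure in the entry above maps naturally onto nested records. The sketch below encodes the abstract's own Academy Awards example as plain data and answers a triplet-prediction query by exact lookup. It only illustrates the data structure and the shape of the task. BiVE itself learns embeddings, which this sketch does not attempt, and the class and function names are assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Triplet:
    """Base-level fact: (head entity, relation, tail entity)."""
    head: str
    relation: str
    tail: str


@dataclass(frozen=True)
class HigherLevelTriplet:
    """Higher-level fact whose head and tail are themselves base-level triplets."""
    head: Triplet
    relation: str
    tail: Triplet


t1 = Triplet("Academy_Awards", "Nominates", "Avatar")
t2 = Triplet("Avatar", "Wins", "Academy_Awards")
base_level = {t1, t2}
higher_level = {HigherLevelTriplet(t1, "PrerequisiteFor", t2)}


def triplet_prediction_candidates(query: Triplet, relation: str) -> list[Triplet]:
    """Answer <query, relation, ?> by lookup over the known higher-level triplets."""
    return [h.tail for h in higher_level if h.head == query and h.relation == relation]


print(triplet_prediction_candidates(t1, "PrerequisiteFor"))
```

A learned model replaces the exact lookup with a scoring function over candidate triplets, which is what lets it rank connections that are plausible but not yet recorded in the graph.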
# SimMTM: Masked Time-Series Modelingのためのシンプルな事前トレーニングフレームワーク SimMTM: A Simple Pre-Training Framework for Masked Time-Series Modeling ( http://arxiv.org/abs/2302.00861v4 ) ライセンス: Link先を確認	Jiaxiang Dong, Haixu Wu, Haoran Zhang, Li Zhang, Jianmin Wang, Mingsheng Long	(参考訳) 時系列分析は広範囲で広く使われている。近年,ラベリング費用を削減し,様々な業務に利益をもたらすために,自己監督型事前研修が注目されている。メインストリームのパラダイムはマスクモデリングであり、マスクされていない部分に基づいてマスクされたコンテンツを再構築するために学習することで、深層モデルを事前学習することに成功している。しかし、時系列の意味情報は時間的変動に主に含まれているため、時間的変化をランダムにマスキングする標準的な方法は、時系列の重要な時間的変動を著しく損なうことになり、表現学習の指導が困難になる。そこで我々は,マスク付き時系列モデリングのための簡易事前学習フレームワークSimMTMを提案する。マスク付きモデリングと多様体学習を関連づけることで、SimMTMは、複数の隣人の重み付けによるマスク付き時間点の復元を提案する。 SimMTMはさらに、マスク付きモデリングに役立つ多様体の局所構造を明らかにすることを学ぶ。実験により、SimMTMは2つの標準時系列解析タスク(予測と分類)において、最も先進的な時系列事前学習手法と比較して、最先端の微調整性能を達成する。 Time series analysis is widely used in extensive areas. Recently, to reduce labeling expenses and benefit various tasks, self-supervised pre-training has attracted immense interest. One mainstream paradigm is masked modeling, which successfully pre-trains deep models by learning to reconstruct the masked content based on the unmasked part. However, since the semantic information of time series is mainly contained in temporal variations, the standard way of randomly masking a portion of time points will seriously ruin vital temporal variations of time series, making the reconstruction task too difficult to guide representation learning. We thus present SimMTM, a Simple pre-training framework for Masked Time-series Modeling. By relating masked modeling to manifold learning, SimMTM proposes to recover masked time points by the weighted aggregation of multiple neighbors outside the manifold, which eases the reconstruction task by assembling ruined but complementary temporal variations from multiple masked series. SimMTM further learns to uncover the local structure of the manifold, which is helpful for masked modeling. 
Experimentally, SimMTM achieves state-of-the-art fine-tuning performance compared to the most advanced time series pre-training methods in two canonical time series analysis tasks: forecasting and classification, covering both in- and cross-domain settings.	翻訳日:2023-10-25 13:45:10 公開日:2023-10-23
# 最適状態生成コストをもつ非単位力学に対するハミルトンシミュレーションの線形結合 Linear combination of Hamiltonian simulation for nonunitary dynamics with optimal state preparation cost ( http://arxiv.org/abs/2303.01029v2 ) ライセンス: Link先を確認	Dong An, Jin-Peng Liu, Lin Lin	(参考訳) 本稿では,非ユニタリダイナミクスの一般クラスを,ハミルトニアン・シミュレーション(lchs)問題の線形結合としてシミュレートする方法を提案する。 LCHSは、問題を拡張線形系問題に変換することやスペクトル写像定理に頼らない。後者は、量子特異値変換(qsvt)のような非ユニタリ過程を含む幅広いタスクを解決するための多くの量子アルゴリズムの数学的基礎である。 LCHS法は, 状態調製における最適コストを実現することができる。また、全てのパラメータにほぼ最適に依存する複素吸収ポテンシャル法によるオープン量子力学シミュレーションの応用を実証する。 We propose a simple method for simulating a general class of non-unitary dynamics as a linear combination of Hamiltonian simulation (LCHS) problems. LCHS does not rely on converting the problem into a dilated linear system problem, or on the spectral mapping theorem. The latter is the mathematical foundation of many quantum algorithms for solving a wide variety of tasks involving non-unitary processes, such as the quantum singular value transformation (QSVT). The LCHS method can achieve optimal cost in terms of state preparation. We also demonstrate an application for open quantum dynamics simulation using the complex absorbing potential method with near-optimal dependence on all parameters.	翻訳日:2023-10-25 13:36:15 公開日:2023-10-23
# 深い構造を持つガウス特徴モデルの学習曲線 Learning curves for deep structured Gaussian feature models ( http://arxiv.org/abs/2303.00564v3 ) ライセンス: Link先を確認	Jacob A. Zavatone-Veth and Cengiz Pehlevan	(参考訳) 近年、ディープラーニング理論における重要な関心は、トレーニングデータを補間するモデルが、まだ見知らぬ例によく一般化できるときの分析に向けられている。ガウス的ランダムな特徴の複数の層を持つモデルの研究から多くの洞察が得られ、それによって正確な一般化漸近を計算できる。しかし、ウェイト異方性の影響を考慮する研究はほとんどなく、ほとんどの場合、ランダムな特徴は独立かつ同一に分布するガウス重みによって生成され、入力データの構造のみを許すと仮定する。ここでは,統計物理学のレプリカ・トリックを用いて,ガウス的特徴の多層モデルに対する学習曲線を導出する。特徴層の最初の行間の相関を許容することは一般化に役立ち、後層の構造は一般に有害であることを示す。その結果,単純な可解モデルのクラスにおいて,重み構造が一般化にどのように影響するかが明らかになった。 In recent years, significant attention in deep learning theory has been devoted to analyzing when models that interpolate their training data can still generalize well to unseen examples. Many insights have been gained from studying models with multiple layers of Gaussian random features, for which one can compute precise generalization asymptotics. However, few works have considered the effect of weight anisotropy; most assume that the random features are generated using independent and identically distributed Gaussian weights, and allow only for structure in the input data. Here, we use the replica trick from statistical physics to derive learning curves for models with many layers of structured Gaussian features. We show that allowing correlations between the rows of the first layer of features can aid generalization, while structure in later layers is generally detrimental. Our results shed light on how weight structure affects generalization in a simple class of solvable models.	翻訳日:2023-10-25 13:36:06 公開日:2023-10-23
# 組合せ最適化のための効率的なソリューションQuantum Dueling Quantum Dueling: an Efficient Solution for Combinatorial Optimization ( http://arxiv.org/abs/2302.10151v4 ) ライセンス: Link先を確認	Letian Tang, Haorui Wang, Zhengyang Li, Haozhan Tang, Chi Zhang, Shujin Li	(参考訳) 本稿では,量子デュエル(quantum dueling)と呼ぶ汎用組合せ最適化のための新しいアルゴリズムを提案する。伝統的に、与えられた最適化問題に対する潜在的な解決策は、キュービットの「登録」に符号化された。様々な手法が測定時に最良の解を見つける確率を高めるために用いられる。しかし、量子デュエルでは、量子ビットの2番目のレジスタを導入し、元の表現の「競合」を表す。これは2人の候補者を表わす2つのレジスタを与える。毎回、1つのレジスタを「指数」として選択し、制御量子探索により最適化問題におけるより最適な候補を表す他のレジスタのコンポーネントを増幅する。このようなプロセスを繰り返すと、両方のレジスタ内の量子状態は最適にプッシュされる。状態ベクトルの進化の縮約を求める定量的解析の後、幅広いシナリオとハイパーパラメータ選択スキームの下での古典的シミュレーションは、量子コンピューティングの理論的限界である二次速度アップが達成されたことを示している。このような強力な効果を十分に理解することは、量子アルゴリズムの背後にある数学に新たな洞察を与える未解決の課題である。全体として、量子デュエルは、より多くの量子ビットを導入することで、従来考えられなかったアルゴリズムを開発できる興味深いデモンストレーションである。このような設計原理を他の問題に適用すると、新しい効率的な量子アルゴリズムが生まれるかもしれない。 In this paper, we present a new algorithm for generic combinatorial optimization, which we term quantum dueling. Traditionally, potential solutions to the given optimization problems were encoded in a "register" of qubits. Various techniques are used to increase the probability of finding the best solution upon measurement. In quantum dueling, however, we introduce a second register of qubits, representing a "competitor" for the original representation. This gives us two registers representing two candidates. Each time, we would select one register as the "opponent" and amplify the components in the other register representing more optimal candidates in our optimization problem via a controlled quantum search. With a repetition of such processes, the quantum state within both registers will be pushed towards optimal. After a quantitative analysis that finds a contraction for the evolution of the state vector, classical simulation under a broad range of scenarios and hyper-parameter selection schemes shows that a quadratic speedup is achieved -- the theoretical limit for quantum computing. 
Fully understanding how such potent efficacy arises remains an unsolved task that could provide new insights into the mathematics behind quantum algorithms. Overall, quantum dueling is a fascinating demonstration where the introduction of more qubits allows previously unthought-of algorithms to be developed. Applying such a design principle in other problems might give rise to new, efficient quantum algorithms.	翻訳日:2023-10-25 13:35:10 公開日:2023-10-23
# モーメントベース正定値部分多様体最適化の簡易化とディープラーニングへの応用 Simplifying Momentum-based Positive-definite Submanifold Optimization with Applications to Deep Learning ( http://arxiv.org/abs/2302.09738v8 ) ライセンス: Link先を確認	Wu Lin, Valentin Duruisseaux, Melvin Leok, Frank Nielsen, Mohammad Emtiyaz Khan, Mark Schmidt	(参考訳) モーメンタム付きのリーマン部分多様体最適化は、イテレートが部分多様体上に残ることを保証するために、しばしば難しい微分方程式を解く必要があるため、計算的に難しい。ここでは、アフィン不変計量を持つスパースあるいは構造化された対称正定値行列のクラスに対して、そのような困難を単純化する。我々は、計量を動的に正規直交化するリーマン正規座標の一般化バージョンを提案し、その問題をユークリッド空間の無制約問題へと局所的に変換する。提案手法を用いて,構造化共分散の既存手法を単純化し,行列乗算のみを用いることで,低精度深層学習のための逆行列計算を必要としない$2^\text{nd}$-order(2次)オプティマイザを開発する。コード: https://github.com/yorkerlin/StructuredNGD-DL Riemannian submanifold optimization with momentum is computationally challenging because, to ensure that the iterates remain on the submanifold, we often need to solve difficult differential equations. Here, we simplify such difficulties for a class of sparse or structured symmetric positive-definite matrices with the affine-invariant metric. We do so by proposing a generalized version of the Riemannian normal coordinates that dynamically orthonormalizes the metric and locally converts the problem into an unconstrained problem in the Euclidean space. We use our approach to simplify existing approaches for structured covariances and develop matrix-inverse-free $2^\text{nd}$-order optimizers for deep learning with low precision by using only matrix multiplications. Code: https://github.com/yorkerlin/StructuredNGD-DL	翻訳日:2023-10-25 13:34:50 公開日:2023-10-23
# リパラメトリゼーションによるニューラルネットのパラメータ空間の幾何学 The Geometry of Neural Nets' Parameter Spaces Under Reparametrization ( http://arxiv.org/abs/2302.07384v3 ) ライセンス: Link先を確認	Agustinus Kristiadi and Felix Dangel and Philipp Hennig	(参考訳) モデル再パラメータ化(model reparametrization)は、微積分の可変性規則に従い、ニューラルネットワークのトレーニングを改善する一般的な方法である。しかし、ヘッセン系平坦度測度、最適化軌道、確率密度のモードなどの矛盾を誘発できるため、問題となることもある。これは下流解析を複雑にする:例えば、任意の再パラメータ化がそれらの関係を変化させるので、平坦性と一般化を決定的に関連付けることはできない。本研究では,再パラメータ化下でのニューラルネットの不変性について,リーマン幾何学の観点から検討する。この観点から、不変性は、計量を明示的に表現し、正しい関連する変換規則を使用する場合、任意のニューラルネット固有の性質である。これは、計量は常に存在するが、しばしば暗黙的に同一視と見なされ、記法から外され、再パラメータ化によって失われる。ミニマムの平坦性の測定,最適化,確率密度の最大化について考察する。最後に,不変性が役に立つ興味深い方向について考察する。 Model reparametrization, which follows the change-of-variable rule of calculus, is a popular way to improve the training of neural nets. But it can also be problematic since it can induce inconsistencies in, e.g., Hessian-based flatness measures, optimization trajectories, and modes of probability densities. This complicates downstream analyses: e.g. one cannot definitively relate flatness with generalization since arbitrary reparametrization changes their relationship. In this work, we study the invariance of neural nets under reparametrization from the perspective of Riemannian geometry. From this point of view, invariance is an inherent property of any neural net if one explicitly represents the metric and uses the correct associated transformation rules. This is important since although the metric is always present, it is often implicitly assumed as identity, and thus dropped from the notation, then lost under reparametrization. We discuss implications for measuring the flatness of minima, optimization, and for probability-density maximization. Finally, we explore some interesting directions where invariance is useful.	翻訳日:2023-10-25 13:34:34 公開日:2023-10-23
# ConceptFusion: Open-set Multimodal 3D Mapping ( http://arxiv.org/abs/2302.07241v3 ) License: see link	Krishna Murthy Jatavallabhula and Alihusein Kuwajerwala and Qiao Gu and Mohd Omama and Tao Chen and Alaa Maalouf and Shuang Li and Ganesh Iyer and Soroush Saryazdi and Nikhil Keetha and Ayush Tewari and Joshua B. Tenenbaum and Celso Miguel de Melo and Madhava Krishna and Liam Paull and Florian Shkurti and Antonio Torralba	Building 3D maps of the environment is central to robot navigation, planning, and interaction with objects in a scene. Most existing approaches that integrate semantic concepts with 3D maps largely remain confined to the closed-set setting: they can only reason about a finite set of concepts, pre-defined at training time. Further, these maps can only be queried using class labels, or in recent work, using text prompts. We address both these issues with ConceptFusion, a scene representation that is (i) fundamentally open-set, enabling reasoning beyond a closed set of concepts and (ii) inherently multimodal, enabling a diverse range of possible queries to the 3D map, from language, to images, to audio, to 3D geometry, all working in concert. ConceptFusion leverages the open-set capabilities of today's foundation models pre-trained on internet-scale data to reason about concepts across modalities such as natural language, images, and audio. We demonstrate that pixel-aligned open-set features can be fused into 3D maps via traditional SLAM and multi-view fusion approaches. This enables effective zero-shot spatial reasoning, not needing any additional training or finetuning, and retains long-tailed concepts better than supervised approaches, outperforming them by a margin of more than 40% on 3D IoU. We extensively evaluate ConceptFusion on a number of real-world datasets, simulated home environments, a real-world tabletop manipulation task, and an autonomous driving platform. We showcase new avenues for blending foundation models with 3D open-set multimodal mapping. For more information, visit our project page https://concept-fusion.github.io or watch our 5-minute explainer video https://www.youtube.com/watch?v=rkXgws8fiDs	Translated: 2023-10-25 13:33:56 Published: 2023-10-23
# Completely Positive Map for Noisy Driven Quantum Systems Derived by Keldysh Expansion ( http://arxiv.org/abs/2303.11491v3 ) License: see link	Ziwen Huang, Yunwei Lu, Anna Grassellino, Alexander Romanenko, Jens Koch, Shaojiang Zhu	Accurate modeling of decoherence errors in quantum processors is crucial for analyzing and improving gate fidelities. To increase the accuracy beyond that of the Lindblad dynamical map, several generalizations have been proposed, and the exploration of simpler and more systematic frameworks is still ongoing. In this paper, we introduce a decoherence model based on the Keldysh formalism. This formalism allows us to include non-periodic drives and correlated quantum noise in our model. In addition to its wide range of applications, our method is also numerically simple, and yields a CPTP map. These features allow us to integrate the Keldysh map with quantum-optimal-control techniques. We demonstrate that this strategy generates pulses that mitigate correlated quantum noise in qubit state-transfer and gate operations.	Translated: 2023-10-25 13:27:54 Published: 2023-10-23
# Context-faithful Prompting for Large Language Models ( http://arxiv.org/abs/2303.11315v2 ) License: see link	Wenxuan Zhou, Sheng Zhang, Hoifung Poon, Muhao Chen	Large language models (LLMs) encode parametric knowledge about world facts and have shown remarkable performance in knowledge-driven NLP tasks. However, their reliance on parametric knowledge may cause them to overlook contextual cues, leading to incorrect predictions in context-sensitive NLP tasks (e.g., knowledge acquisition tasks). In this paper, we seek to assess and enhance LLMs' contextual faithfulness in two aspects: knowledge conflict and prediction with abstention. We demonstrate that LLMs' faithfulness can be significantly improved using carefully designed prompting strategies. In particular, we identify opinion-based prompts and counterfactual demonstrations as the most effective methods. Opinion-based prompts reframe the context as a narrator's statement and inquire about the narrator's opinions, while counterfactual demonstrations use instances containing false facts to improve faithfulness in knowledge conflict situations. Neither technique requires additional training. We conduct experiments on three datasets of two standard NLP tasks, machine reading comprehension and relation extraction, and the results demonstrate significant improvement in faithfulness to contexts. Code and data are released at https://github.com/wzhouad/context-faithful-llm.	Translated: 2023-10-25 13:27:44 Published: 2023-10-23
# Resource Saving via Ensemble Techniques for Quantum Neural Networks ( http://arxiv.org/abs/2303.11283v2 ) License: see link	Massimiliano Incudini, Michele Grossi, Andrea Ceschini, Antonio Mandarino, Massimo Panella, Sofia Vallecorsa and David Windridge	Quantum neural networks hold significant promise for numerous applications, particularly as they can be executed on the current generation of quantum hardware. However, due to limited qubits or hardware noise, conducting large-scale experiments often requires significant resources. Moreover, the output of the model is susceptible to corruption by quantum hardware noise. To address this issue, we propose the use of ensemble techniques, which involve constructing a single machine learning model based on multiple instances of quantum neural networks. In particular, we implement bagging and AdaBoost techniques, with different data loading configurations, and evaluate their performance on both synthetic and real-world classification and regression tasks. To assess the potential performance improvement under different environments, we conduct experiments on both simulated, noiseless software and IBM superconducting-based QPUs, suggesting these techniques can mitigate the quantum hardware noise. Additionally, we quantify the amount of resources saved using these ensemble techniques. Our findings indicate that these methods enable the construction of large, powerful models even on relatively small quantum devices.	Translated: 2023-10-25 13:27:07 Published: 2023-10-23
# PINNSim: A Simulator for Power System Dynamics based on Physics-Informed Neural Networks ( http://arxiv.org/abs/2303.10256v2 ) License: see link	Jochen Stiasny, Baosen Zhang, Spyros Chatzivasileiadis	The dynamic behaviour of a power system can be described by a system of differential-algebraic equations. Time-domain simulations are used to simulate the evolution of these dynamics. They often require the use of small time step sizes and therefore become computationally expensive. To accelerate these simulations, we propose a simulator, PINNSim, that allows taking significantly larger time steps. It is based on Physics-Informed Neural Networks (PINNs) for the solution of the dynamics of single components in the power system. To resolve their interaction we employ a scalable root-finding algorithm. We demonstrate PINNSim on a 9-bus system and show the increased time step size compared to a trapezoidal integration rule. We discuss key characteristics of PINNSim and important steps for developing PINNSim into a fully fledged simulator. As such, it could offer the opportunity for significantly increasing time step sizes and thereby accelerating time-domain simulations.	Translated: 2023-10-25 13:26:49 Published: 2023-10-23
# Dual skip connections in U-Net, ResUnet and U-Net3+ for remote extraction of buildings ( http://arxiv.org/abs/2303.09064v4 ) License: see link	Bipul Neupane, Jagannath Aryal, and Abbas Rajabifard	Urban buildings are extracted from high-resolution Earth observation (EO) images using semantic segmentation networks like U-Net and its successors. Each re-iteration aims to improve performance by employing a denser skip connection mechanism that harnesses multi-scale features for accurate object mapping. However, denser connections increase network parameters and do not necessarily contribute to precise segmentation. In this paper, we develop three dual skip connection mechanisms for three networks (U-Net, ResUnet, and U-Net3+) to selectively deepen the essential feature maps for improved performance. The three mechanisms are evaluated on feature maps of different scales, producing nine new network configurations. They are evaluated against their original vanilla configurations on four building footprint datasets of different spatial resolutions, including a multi-resolution (0.3+0.6+1.2m) dataset that we develop for complex urban environments. The evaluation revealed that densifying the large- and small-scale features in U-Net and U-Net3+ produces up to 0.905 F1, more than TransUnet (0.903) and Swin-Unet (0.882) in our new dataset with up to 19x fewer parameters. The results conclude that selectively densifying feature maps and skip connections enhances network performance without a substantial increase in parameters. The findings and the new dataset will contribute to the computer vision domain and urban planning decision processes.	Translated: 2023-10-25 13:26:33 Published: 2023-10-23
# Self-Supervised One-Shot Learning for Automatic Segmentation of StyleGAN Images ( http://arxiv.org/abs/2303.05639v3 ) License: see link	Ankit Manerikar and Avinash C. Kak	We propose a framework for the automatic one-shot segmentation of synthetic images generated by a StyleGAN. Our framework is based on the observation that the multi-scale hidden features in the GAN generator hold useful semantic information that can be utilized for automatic on-the-fly segmentation of the generated images. Using these features, our framework learns to segment synthetic images using a self-supervised contrastive clustering algorithm that projects the hidden features into a compact space for per-pixel classification. This contrastive learner is based on using a novel data augmentation strategy and a pixel-wise swapped prediction loss that leads to faster learning of the feature vectors for one-shot segmentation. We have tested our implementation on five standard benchmarks to yield a segmentation performance that not only outperforms the semi-supervised baselines by an average wIoU margin of 1.02% but also improves the inference speeds by a factor of 4.5. Finally, we also show the results of using the proposed one-shot learner in implementing BagGAN, a framework for producing annotated synthetic baggage X-ray scans for threat detection. This framework was trained and tested on the PIDRay baggage benchmark to yield a performance comparable to its baseline segmenter based on manual annotations.	Translated: 2023-10-25 13:25:19 Published: 2023-10-23
# DECN: Automated Evolutionary Algorithms via Evolution Inspired Deep Convolution Network ( http://arxiv.org/abs/2304.09599v3 ) License: see link	Kai Wu, Penghui Liu, Jing Liu	Evolutionary algorithms (EAs) have emerged as a powerful framework for optimization, especially for black-box optimization. This paper first focuses on automated EA: Automated EA exploits structure in the problem of interest to automatically generate update rules (optimization strategies) for generating and selecting potential solutions so that it can move a random population near the optimal solution. However, current EAs cannot achieve this goal due to the poor representation of the optimization strategy and the weak interaction between the optimization strategy and the target task. We design a deep evolutionary convolution network (DECN) to realize the move from hand-designed EAs to automated EAs without manual interventions. DECN has high adaptability to the target task and can obtain better solutions with less computational cost. DECN is also able to effectively utilize the low-fidelity information of the target task to form an efficient optimization strategy. The experiments on nine synthetic and two real-world cases show the advantages of learned optimization strategies over the state-of-the-art human-designed and meta-learning EA baselines. In addition, due to the tensorization of the operations, DECN is friendly to the acceleration provided by GPUs and runs 102 times faster than EA.	Translated: 2023-10-25 13:16:27 Published: 2023-10-23
# Revisiting the Evaluation of Image Synthesis with GANs ( http://arxiv.org/abs/2304.01999v2 ) License: see link	Mengping Yang, Ceyuan Yang, Yichi Zhang, Qingyan Bai, Yujun Shen, Bo Dai	A good metric, which promises a reliable comparison between solutions, is essential for any well-defined task. Unlike most vision tasks that have per-sample ground-truth, image synthesis tasks target generating unseen data and hence are usually evaluated through a distributional distance between one set of real samples and another set of generated samples. This study presents an empirical investigation into the evaluation of synthesis performance, with generative adversarial networks (GANs) as a representative of generative models. In particular, we make in-depth analyses of various factors, including how to represent a data point in the representation space, how to calculate a fair distance using selected samples, and how many instances to use from each set. Extensive experiments conducted on multiple datasets and settings reveal several important findings. Firstly, a group of models that include both CNN-based and ViT-based architectures serve as reliable and robust feature extractors for measurement evaluation. Secondly, Centered Kernel Alignment (CKA) provides a better comparison across various extractors and hierarchical layers in one model. Finally, CKA is more sample-efficient and enjoys better agreement with human judgment in characterizing the similarity between two internal data correlations. These findings contribute to the development of a new measurement system, which enables a consistent and reliable re-evaluation of current state-of-the-art generative models.	Translated: 2023-10-25 13:16:05 Published: 2023-10-23
# Beyond Unimodal: Generalising Neural Processes for Multimodal Uncertainty Estimation ( http://arxiv.org/abs/2304.01518v2 ) License: see link	Myong Chol Jung, He Zhao, Joanna Dipnall, Lan Du	Uncertainty estimation is an important research area to make deep neural networks (DNNs) more trustworthy. While extensive research on uncertainty estimation has been conducted with unimodal data, uncertainty estimation for multimodal data remains a challenge. Neural processes (NPs) have been demonstrated to be an effective uncertainty estimation method for unimodal data by providing the reliability of Gaussian processes with efficient and powerful DNNs. While NPs hold significant potential for multimodal uncertainty estimation, the adaptation of NPs for multimodal data has not been carefully studied. To bridge this gap, we propose Multimodal Neural Processes (MNPs) by generalising NPs for multimodal uncertainty estimation. Based on the framework of NPs, MNPs consist of several novel and principled mechanisms tailored to the characteristics of multimodal data. In extensive empirical evaluation, our method achieves state-of-the-art multimodal uncertainty estimation performance, showing its appealing robustness against noisy samples and reliability in out-of-distribution detection with faster computation time compared to the current state-of-the-art multimodal uncertainty estimation method.	Translated: 2023-10-25 13:15:43 Published: 2023-10-23
# Beyond Multilayer Perceptrons: Investigating Complex Topologies in Neural Networks ( http://arxiv.org/abs/2303.17925v2 ) License: see link	Tommaso Boccato, Matteo Ferrante, Andrea Duggento, Nicola Toschi	In this study, we explore the impact of network topology on the approximation capabilities of artificial neural networks (ANNs), with a particular focus on complex topologies. We propose a novel methodology for constructing complex ANNs based on various topologies, including Barabási-Albert, Erdős-Rényi, Watts-Strogatz, and multilayer perceptrons (MLPs). The constructed networks are evaluated on synthetic datasets generated from manifold learning generators, with varying levels of task difficulty and noise, and on real-world datasets from the UCI suite. Our findings reveal that complex topologies lead to superior performance in high-difficulty regimes compared to traditional MLPs. This performance advantage is attributed to the ability of complex networks to exploit the compositionality of the underlying target function. However, this benefit comes at the cost of increased forward-pass computation time and reduced robustness to graph damage. Additionally, we investigate the relationship between various topological attributes and model performance. Our analysis shows that no single attribute can account for the observed performance differences, suggesting that the influence of network topology on approximation capabilities may be more intricate than a simple correlation with individual topological attributes. Our study sheds light on the potential of complex topologies for enhancing the performance of ANNs and provides a foundation for future research exploring the interplay between multiple topological attributes and their impact on model performance.	Translated: 2023-10-25 13:15:22 Published: 2023-10-23
# Complete crystalline topological invariants from partial rotations in (2+1)D invertible fermionic states and Hofstadter's butterfly ( http://arxiv.org/abs/2303.16919v2 ) License: see link	Yuxuan Zhang, Naren Manjunath, Ryohei Kobayashi, Maissam Barkeshli	The theory of topological phases of matter predicts invariants protected only by crystalline symmetry, yet it has been unclear how to extract these from microscopic calculations in general. Here we show how to extract a set of many-body invariants $\{\Theta_{\text{o}}^{\pm}\}$, where ${\text{o}}$ is a high symmetry point, from partial rotations in (2+1)D invertible fermionic states. Our results apply in the presence of magnetic field and Chern number $C \neq 0$, in contrast to previous work. $\{\Theta_{\text{o}}^{\pm}\}$ together with $C$, chiral central charge $c_-$, and filling $\nu$ provide a complete many-body characterization of the topological state with symmetry group $G = \text{U}(1) \times_\phi [\mathbb{Z}^2 \rtimes \mathbb{Z}_M]$. Moreover, all these many-body invariants can be obtained from a single bulk ground state, without inserting additional defects. We perform numerical computations on the square lattice Hofstadter model. Remarkably, these match calculations from conformal and topological field theory, where $G$-crossed modular $S, T$ matrices of symmetry defects play a crucial role. Our results provide additional colorings of Hofstadter's butterfly, extending recently discovered colorings by the discrete shift and quantized charge polarization.	Translated: 2023-10-25 13:14:52 Published: 2023-10-23
# Conditional Generative Models are Provably Robust: Pointwise Guarantees for Bayesian Inverse Problems ( http://arxiv.org/abs/2303.15845v2 ) License: see link	Fabian Altekrüger, Paul Hagemann, Gabriele Steidl	Conditional generative models have become a very powerful tool to sample from Bayesian inverse problem posteriors. It is well-known in classical Bayesian literature that posterior measures are quite robust with respect to perturbations of both the prior measure and the negative log-likelihood, which includes perturbations of the observations. However, to the best of our knowledge, the robustness of conditional generative models with respect to perturbations of the observations has not been investigated yet. In this paper, we prove for the first time that appropriately learned conditional generative models provide robust results for single observations.	Translated: 2023-10-25 13:14:17 Published: 2023-10-23
# On the Importance of Feature Separability in Predicting Out-Of-Distribution Error ( http://arxiv.org/abs/2303.15488v2 ) License: see link	Renchunzi Xie, Hongxin Wei, Lei Feng, Yuzhou Cao, Bo An	Estimating the generalization performance is practically challenging on out-of-distribution (OOD) data without ground-truth labels. While previous methods emphasize the connection between distribution difference and OOD accuracy, we show that a large domain gap does not necessarily lead to a low test accuracy. In this paper, we investigate this problem from the perspective of feature separability empirically and theoretically. Specifically, we propose a dataset-level score based upon feature dispersion to estimate the test accuracy under distribution shift. Our method is inspired by desirable properties of features in representation learning: high inter-class dispersion and high intra-class compactness. Our analysis shows that inter-class dispersion is strongly correlated with the model accuracy, while intra-class compactness does not reflect the generalization performance on OOD data. Extensive experiments demonstrate the superiority of our method in both prediction performance and computational efficiency.	Translated: 2023-10-25 13:14:07 Published: 2023-10-23
# Self-supervised Meta-Prompt Learning with Meta-Gradient Regularization for Few-shot Generalization ( http://arxiv.org/abs/2303.12314v4 ) License: see link	Kaihang Pan, Juncheng Li, Hongye Song, Jun Lin, Xiaozhong Liu, Siliang Tang	Prompt tuning is a parameter-efficient method, which learns soft prompts and conditions frozen language models to perform specific downstream tasks. Though effective, prompt tuning under few-shot settings on the one hand heavily relies on a good initialization of soft prompts. On the other hand, it can easily overfit to few-shot training samples, thereby undermining generalizability. Existing works leverage pre-training or supervised meta-learning to initialize soft prompts but they fail to data-efficiently generalize to unseen downstream tasks. To address the above problems, this paper proposes a novel Self-sUpervised meta-Prompt learning framework with MEta-gradient Regularization for few-shot generalization (SUPMER). SUPMER leverages self-supervised meta-learning with a diverse set of well-designed meta-training tasks to learn a universal prompt initialization for efficient adaptation using only unlabeled data. Additionally, it jointly meta-learns a gradient regularization function to transform raw gradients into a domain-generalizable direction, thus alleviating the problem of overfitting. Extensive experiments show that SUPMER achieves better performance for different few-shot downstream tasks, and also exhibits a stronger domain generalization ability. The code for SUPMER will be available at https://github.com/beepkh/SUPMER.	Translated: 2023-10-25 13:13:08 Published: 2023-10-23
# Thorny Roses: Investigating the Dual Use Dilemma in Natural Language Processing ( http://arxiv.org/abs/2304.08315v2 ) License: see link	Lucie-Aimée Kaffee, Arnav Arora, Zeerak Talat, Isabelle Augenstein	Dual use, the intentional, harmful reuse of technology and scientific artefacts, is a problem yet to be well-defined within the context of Natural Language Processing (NLP). However, as NLP technologies continue to advance and become increasingly widespread in society, their inner workings have become increasingly opaque. Therefore, understanding dual use concerns and potential ways of limiting them is critical to minimising the potential harms of research and development. In this paper, we conduct a survey of NLP researchers and practitioners to understand the depth and their perspective of the problem as well as to assess existing available support. Based on the results of our survey, we offer a definition of dual use that is tailored to the needs of the NLP community. The survey revealed that a majority of researchers are concerned about the potential dual use of their research but only take limited action toward it. In light of the survey results, we discuss the current state and potential means for mitigating dual use in NLP and propose a checklist that can be integrated into existing conference ethics-frameworks, e.g., the ACL ethics checklist.	Translated: 2023-10-25 13:05:52 Published: 2023-10-23
# 時間的知識共有による過去と未来からのスパイキングニューラルネットワーク学習の実現 Temporal Knowledge Sharing enable Spiking Neural Network Learning from Past and Future ( http://arxiv.org/abs/2304.06540v2 ) ライセンス: Link先を確認	Yiting Dong, Dongcheng Zhao, Yi Zeng	(参考訳) スパイキングニューラルネットワーク(SNN)は、脳のような情報処理機構のため、様々な領域の研究者から注目されている。しかしながら、SNNは通常、拡張時間ステップ、低時間情報利用、テストとトレーニングの間の一貫した時間ステップの必要性といった課題に悩まされる。これらの課題により、SNNのレイテンシは高くなる。さらに、時間ステップの制約は、新しいデプロイメントのためのモデルの再トレーニングを必要とし、適応性を低減する。これらの問題に対処するため,本稿では,SNNを時間集約モデルとして見る新しい視点を提案する。時間的知識共有(TKS)手法を導入し、異なる時間点間の情報交換を容易にする。 TKSは時間的自己蒸留の一種と見なすことができる。 CIFAR10, CIFAR100, ImageNet-1kなどの静的データセットとDVS-CIFAR10, NCALTECH101などのニューロモルフィックデータセットでTKSの有効性を検証する。実験により,本手法が他のアルゴリズムと比較して最先端性能を実現することを示す。さらに、TKSは時間的整合性の問題に対処し、時間的一般化能力に優れたモデルを提供する。これにより、ネットワークは長い時間ステップでトレーニングでき、短い時間ステップでテスト中に高いパフォーマンスを維持することができる。このようなアプローチにより、エッジデバイスへのSNNのデプロイが大幅に加速する。最後に,細粒度タスクにおけるアブレーション実験とTKS試験を行い,TKSの高機能化による情報処理の効率化を実証した。 Spiking Neural Networks (SNNs) have attracted significant attention from researchers across various domains due to their brain-like information processing mechanism. However, SNNs typically grapple with challenges such as extended time steps, low temporal information utilization, and the requirement for a consistent time step between testing and training. These challenges leave SNNs with high latency. Moreover, the constraint on time steps necessitates the retraining of the model for new deployments, reducing adaptability. To address these issues, this paper proposes a novel perspective, viewing the SNN as a temporal aggregation model. We introduce the Temporal Knowledge Sharing (TKS) method, facilitating information interaction between different time points. TKS can be perceived as a form of temporal self-distillation. To validate the efficacy of TKS in information processing, we tested it on static datasets like CIFAR10, CIFAR100, ImageNet-1k, and neuromorphic datasets such as DVS-CIFAR10 and NCALTECH101. 
Experimental results demonstrate that our method achieves state-of-the-art performance compared to other algorithms. Furthermore, TKS addresses the temporal consistency challenge, endowing the model with superior temporal generalization capabilities. This allows the network to train with longer time steps and maintain high performance during testing with shorter time steps. Such an approach considerably accelerates the deployment of SNNs on edge devices. Finally, we conducted ablation experiments and tested TKS on fine-grained tasks, with results showcasing TKS's enhanced capability to process information efficiently.	翻訳日:2023-10-25 13:05:33 公開日:2023-10-23
# select without fear: ほぼすべてのミニバッチスケジュールが最適に一般化する Select without Fear: Almost All Mini-Batch Schedules Generalize Optimally ( http://arxiv.org/abs/2305.02247v2 ) ライセンス: Link先を確認	Konstantinos E. Nikolakakis, Amin Karbasi, Dionysis Kalogerias	(参考訳) 我々は、決定的、確率的、データ非依存、その他の任意のバッチ選択ルールを用いて、GDトレーニングのための上限と下限の一般化誤差境界を確立する。我々は滑らかなLipschitz-convex/nonconvex/strongly-convex損失関数を考察し、SGD(Stochastic GD)の古典的な上界が、任意の非適応バッチスケジュールに対して、すべての決定論的スケジュールを含む冗長性を持つことを示す。さらに、凸と強凸の損失に対して、上記のバッチスケジュールのクラス上での一般化誤差の均一性を直接証明し、これらのバッチスケジュールが全て最適に一般化されることを示す。最後に、スムーズな(非Lipschitz)非凸損失に対して、全バッチ(決定論的)GDが本質的に最適であることを示す。 We establish matching upper and lower generalization error bounds for mini-batch Gradient Descent (GD) training with either deterministic or stochastic, data-independent, but otherwise arbitrary batch selection rules. We consider smooth Lipschitz-convex/nonconvex/strongly-convex loss functions, and show that classical upper bounds for Stochastic GD (SGD) also hold verbatim for such arbitrary nonadaptive batch schedules, including all deterministic ones. Further, for convex and strongly-convex losses we prove matching lower bounds directly on the generalization error uniform over the aforementioned class of batch schedules, showing that all such batch schedules generalize optimally. Lastly, for smooth (non-Lipschitz) nonconvex losses, we show that full-batch (deterministic) GD is essentially optimal, among all possible batch schedules within the considered class, including all stochastic ones.	翻訳日:2023-10-25 12:55:33 公開日:2023-10-23
# ゼロショットテキスト分類におけるラベル記述訓練の利点 The Benefits of Label-Description Training for Zero-Shot Text Classification ( http://arxiv.org/abs/2305.02239v2 ) ライセンス: Link先を確認	Lingyu Gao, Debanjan Ghosh, Kevin Gimpel	(参考訳) 事前訓練された言語モデルは、下流タスクで特定のラベルセットを分類するために、トレーニングデータから意味的な知識を伝達することで、ゼロショットテキスト分類を改善した。最小限の努力でゼロショット精度をさらに向上する簡単な方法を提案する。タスクのラベルを記述するための小さな微調整データセットをキュレートする。ラベルでアノテートされたテキストを持つ一般的な微調整データとは異なり、我々のデータは、いくつかの関連用語、辞書/百科事典エントリ、短いテンプレートを使用して、単にラベルを言語で記述する。トピックと感情のデータセットの範囲で、我々の手法はゼロショットよりも17-19%精度が高い。また、ゼロショット分類に必要な選択、例えばモデルの語彙のラベルからトークンへの分類とマッピングを促すパターンに対して、より堅牢である。さらに,データにはラベルのみを記述するが入力テキストは使用しないため,入力文を微調整することで,与えられたラベルセットの複数のテキストドメインに対して強く動作し,複数設定で数ショットのドメイン外分類も改善するモデルが得られる。 Pretrained language models have improved zero-shot text classification by allowing the transfer of semantic knowledge from the training data in order to classify among specific label sets in downstream tasks. We propose a simple way to further improve zero-shot accuracies with minimal effort. We curate small finetuning datasets intended to describe the labels for a task. Unlike typical finetuning data, which has texts annotated with labels, our data simply describes the labels in language, e.g., using a few related terms, dictionary/encyclopedia entries, and short templates. Across a range of topic and sentiment datasets, our method is more accurate than zero-shot by 17-19% absolute. It is also more robust to choices required for zero-shot classification, such as patterns for prompting the model to classify and mappings from labels to tokens in the model's vocabulary. Furthermore, since our data merely describes the labels but does not use input texts, finetuning on it yields a model that performs strongly on multiple text domains for a given label set, even improving over few-shot out-of-domain classification in multiple settings.	翻訳日:2023-10-25 12:55:12 公開日:2023-10-23
# メタレビュー生成のための会話構造を持つ複数文書の要約 Summarizing Multiple Documents with Conversational Structure for Meta-Review Generation ( http://arxiv.org/abs/2305.01498v4 ) ライセンス: Link先を確認	Miao Li, Eduard Hovy, Jey Han Lau	(参考訳) 我々は,科学論文のメタレビューを生成するための新しいデータセットPeerSumを提案する。メタレビューは、レビュー、マルチターン議論、論文要約の抽象的な要約と解釈できる。これらのソース文書は、明示的な階層的な会話構造、相互参照、(時に)相反する情報を含む豊富な文書間関係を持つ。事前学習された言語モデルに構造的帰納的バイアスを導入するために,対話構造に基づくスパース注意と,メタデータ特徴(例えば,レビューレーティング)を予測するマルチタスクトレーニング目標を使用するモデルRammer (Relationship-aware Multi-task Meta-review Generator)を導入する。実験の結果,Rammerは他の強力なベースラインモデルよりも優れた自動評価指標が得られた。しかし、さらに分析した結果、Rammerや他のモデルがPeerSumのソース文書のコンフリクトを扱うのに苦労していることが判明し、メタレビュー生成は難しい課題であり、さらなる研究のための有望な道のりであることを示唆している。 We present PeerSum, a novel dataset for generating meta-reviews of scientific papers. The meta-reviews can be interpreted as abstractive summaries of reviews, multi-turn discussions and the paper abstract. These source documents have rich inter-document relationships with an explicit hierarchical conversational structure, cross-references and (occasionally) conflicting information. To introduce the structural inductive bias into pre-trained language models, we introduce Rammer (Relationship-aware Multi-task Meta-review Generator), a model that uses sparse attention based on the conversational structure and a multi-task training objective that predicts metadata features (e.g., review ratings). Our experimental results show that Rammer outperforms other strong baseline models in terms of a suite of automatic evaluation metrics. Further analyses, however, reveal that Rammer and other models struggle to handle conflicts in source documents of PeerSum, suggesting meta-review generation is a challenging task and a promising avenue for further research.	翻訳日:2023-10-25 12:53:46 公開日:2023-10-23
# 逆不変正規化による対数コントラスト学習の促進 Enhancing Adversarial Contrastive Learning via Adversarial Invariant Regularization ( http://arxiv.org/abs/2305.00374v2 ) ライセンス: Link先を確認	Xilie Xu, Jingfeng Zhang, Feng Liu, Masashi Sugiyama, Mohan Kankanhalli	(参考訳) adversarial contrastive learning(adversarial contrastive learning、acl)は、標準のコントラスト学習(scl)を強化する手法で、敵対的なデータを組み込んで、敵対的な攻撃や共通の腐敗に耐えうる堅牢な表現を学習する。転送性を改善するため、既存の研究は標準不変正規化(SIR)を導入し、標準表現におけるニュアンススタイル要素の影響を排除できるスタイル独立性をSCLに課した。しかし、スタイル独立性がACLが学習したロバスト表現にどのような恩恵をもたらすかは明らかでない。本稿では,aclを解釈するために因果推論の手法を活用し,スタイル要因からの独立を強制するために逆不変正規化(air)を提案する。 SIRとAIRの両方を用いてACLを規制し、ロバストな表現を出力する。理論的には、AIRは、自然データの異なるビューとそれらの逆のバリエーションの間の表現距離を、スタイル要因に依存しないように暗黙的に促す。実験の結果, 変分正規化は, 下流タスクにおける標準一般化とロバスト性の両方の観点から, 最先端の ACL 手法の性能を著しく向上させることが示された。私たちの知る限りでは、ACLを解釈するための因果推論を最初に適用し、ACLを学習した堅牢な表現を強化するためのAIRを開発します。ソースコードはhttps://github.com/GodXuxilie/Enhancing_ACL_via_AIRにあります。 Adversarial contrastive learning (ACL) is a technique that enhances standard contrastive learning (SCL) by incorporating adversarial data to learn a robust representation that can withstand adversarial attacks and common corruptions without requiring costly annotations. To improve transferability, the existing work introduced the standard invariant regularization (SIR) to impose style-independence property to SCL, which can exempt the impact of nuisance style factors in the standard representation. However, it is unclear how the style-independence property benefits ACL-learned robust representations. In this paper, we leverage the technique of causal reasoning to interpret the ACL and propose adversarial invariant regularization (AIR) to enforce independence from style factors. We regulate the ACL using both SIR and AIR to output the robust representation. Theoretically, we show that AIR implicitly encourages the representational distance between different views of natural data and their adversarial variants to be independent of style factors. 
Empirically, our experimental results show that invariant regularization significantly improves the performance of state-of-the-art ACL methods in terms of both standard generalization and robustness on downstream tasks. To the best of our knowledge, we are the first to apply causal reasoning to interpret ACL and develop AIR for enhancing ACL-learned robust representations. Our source code is at https://github.com/GodXuxilie/Enhancing_ACL_via_AIR.	翻訳日:2023-10-25 12:53:07 公開日:2023-10-23
# マルチバンド非エルミート系のグリーン関数 Green's functions of multiband non-Hermitian systems ( http://arxiv.org/abs/2304.14438v2 ) ライセンス: Link先を確認	Yu-Min Hu, Zhong Wang	(参考訳) 非エルミート系のグリーン関数は、様々な力学過程において基本的な役割を果たす。非エルミート系は非エルミートスキン効果による境界条件に敏感であるため、開有界グリーン函数は非ブロッホバンド理論と密接に関連している。単一バンド非エルミート系における開有界グリーン関数の正確な公式は、一般化されたブリルアンゾーン (GBZ) に沿った積分であることが証明されているが、一般的なマルチバンド系における適切な一般化はいまだ不明である。本研究では、リーマン面上の多重バンド GBZ を見ることにより、マルチバンド非エルミート系における開有界グリーン関数の式を導出する。この公式は、様々な実験プラットフォームで検証できるマルチバンドシステムの方向増幅を記述するために適用することができる。 Green's functions of non-Hermitian systems play a fundamental role in various dynamical processes. Because non-Hermitian systems are sensitive to boundary conditions due to the non-Hermitian skin effect, open-boundary Green's functions are closely related to the non-Bloch band theory. While the exact formula of open-boundary Green's functions in single-band non-Hermitian systems proves to be an integral along the generalized Brillouin zone (GBZ), the proper generalization in generic multiband systems remains unclear. In this work, we derive a formula of open-boundary Green's functions in multiband non-Hermitian systems by viewing the multiband GBZ on the Riemann surface. This formula can be applied to describe directional amplification in multiband systems, which can be verified at various experimental platforms.	翻訳日:2023-10-25 12:52:32 公開日:2023-10-23
# 機械学習の景観を探る : 総合的な調査と分類学 Exploring the Landscape of Machine Unlearning: A Comprehensive Survey and Taxonomy ( http://arxiv.org/abs/2305.06360v5 ) ライセンス: Link先を確認	Thanveer Shaik, Xiaohui Tao, Haoran Xie, Lin Li, Xiaofeng Zhu, and Qing Li	(参考訳) 機械学習(ML)モデルによる予測の削除や修正の必要性から、機械学習(MU)が注目を集めている。トレーニングモデルはより効率的で正確になっていますが、未学習の情報の重要性は、プライバシやセキュリティ、公正といった分野でますます重要になっています。本稿では,データ削除,摂動,モデル更新など,現在の最先端技術とアプローチを包括的に調査する。また、一般的なメトリクスやデータセットも提示される。また、攻撃の高度化、標準化、転送可能性、解釈可能性、トレーニングデータ、リソース制約など、対処すべき課題を強調している。本稿では,muの潜在的メリットとその今後の方向性について考察する。さらに、機械学習モデルがユーザの信頼を維持しながら変化する状況に適応できるように、研究者や実践者が未学習の技術を探求し、改善し続ける必要性を強調した。アンラーニングの重要性はさらに強調され、人工知能(AI)をより信頼性が高く透明なものにすること、特に大量の個人データを含むさまざまな領域におけるAIの重要性が増している。 Machine unlearning (MU) is gaining increasing attention due to the need to remove or modify predictions made by machine learning (ML) models. While training models have become more efficient and accurate, the importance of unlearning previously learned information has become increasingly significant in fields such as privacy, security, and fairness. This paper presents a comprehensive survey of MU, covering current state-of-the-art techniques and approaches, including data deletion, perturbation, and model updates. In addition, commonly used metrics and datasets are also presented. The paper also highlights the challenges that need to be addressed, including attack sophistication, standardization, transferability, interpretability, training data, and resource constraints. The contributions of this paper include discussions about the potential benefits of MU and its future directions. Additionally, the paper emphasizes the need for researchers and practitioners to continue exploring and refining unlearning techniques to ensure that ML models can adapt to changing circumstances while maintaining user trust. 
The importance of unlearning is further highlighted in making Artificial Intelligence (AI) more trustworthy and transparent, especially with the increasing importance of AI in various domains that involve large amounts of personal user data.	翻訳日:2023-10-25 12:46:50 公開日:2023-10-23
# 露出テキスト生成:模倣,検索,パラフレーズ Expository Text Generation: Imitate, Retrieve, Paraphrase ( http://arxiv.org/abs/2305.03276v2 ) ライセンス: Link先を確認	Nishant Balepur, Jie Huang, Kevin Chen-Chuan Chang	(参考訳) 展示資料は、複雑な情報を読者に伝えるための重要なリソースである。その有用性にもかかわらず、手書きの例示テキストを書くことは、注意深いコンテンツ計画、複数の情報源からの事実の取得、これらの事実を明確に合成する能力を必要とする難しいプロセスである。これらの負担を軽減するために,知識源をインテリジェントに検索することで,トピックに対して正確かつスタイリスト的に一貫性のある露出テキストを自動的に生成することを目指す,露出テキスト生成の課題を提案する。我々は、検索強化モデルの限界を克服し、コンテンツ計画、事実検索、言い換えを反復的に実行するIRPを開発することで、我々の課題を解決する。新たに収集された3つの多様なデータセットの実験を通して、IRPは、読者に正確に知らせる実例と組織的な説明文を生成する。 Expository documents are vital resources for conveying complex information to readers. Despite their usefulness, writing expository text by hand is a challenging process that requires careful content planning, obtaining facts from multiple sources, and the ability to clearly synthesize these facts. To ease these burdens, we propose the task of expository text generation, which seeks to automatically generate an accurate and stylistically consistent expository text for a topic by intelligently searching a knowledge source. We solve our task by developing IRP, a framework that overcomes the limitations of retrieval-augmented models and iteratively performs content planning, fact retrieval, and rephrasing. Through experiments on three diverse, newly-collected datasets, we show that IRP produces factual and organized expository texts that accurately inform readers.	翻訳日:2023-10-25 12:45:46 公開日:2023-10-23
# 適応マルチラウンドCUR分解を用いたクロスエンコーダによる効率的なk-NN探索 Efficient k-NN Search with Cross-Encoders using Adaptive Multi-Round CUR Decomposition ( http://arxiv.org/abs/2305.02996v2 ) ライセンス: Link先を確認	Nishant Yadav, Nicholas Monath, Manzil Zaheer, Andrew McCallum	(参考訳) クエリとアイテムのペアを共同で符号化・スコアリングするクロスエンコーダモデルは、k近傍(k-NN)の直接探索には極めて高価である。その結果、k-NNサーチでは高速な近似検索(BM25やデュアルエンコーダベクターなど)が採用され、その後クロスエンコーダによる再ランキングが行われるが、検索の近似はしばしばリコールを損なう。この問題はANNCUR (Yadav et al., 2022) によって取り組まれており、これはクロスエンコーダのみを使用し、比較的少数のアンカーアイテムとCUR行列の分解を用いて探索を効率化する。 ANNCURの1回限りのアンカーの選択は、平均してクロスエンコーダの距離を近似する傾向にあるが、クエリの近くのアイテムまでの距離を正確に推定する能力は失われ、重要なエンドタスクであるトップk項目のリコールを損なう。本稿では,実用上重要なトップk近傍の近似誤差を適応的に,反復的に,効率的に最小化するADACURを提案する。これまでに利用可能なアンカーを使用してk-NN検索を反復的に実行し、検索された近傍を次のラウンドのアンカーセットに追加する。 ANNCURやデュアルエンコーダベースのretrieve-and-rerankといった従来および最先端の手法と比較して,複数のデータセットにおいて,提案手法は,競合手法を超える計算量を使うことなく,重要なk = 1設定におけるリコールエラーを一貫して最大70%削減する。 Cross-encoder models, which jointly encode and score a query-item pair, are prohibitively expensive for direct k-nearest neighbor (k-NN) search. Consequently, k-NN search typically employs a fast approximate retrieval (e.g. using BM25 or dual-encoder vectors), followed by reranking with a cross-encoder; however, the retrieval approximation often has detrimental recall regret. This problem is tackled by ANNCUR (Yadav et al., 2022), a recent work that employs a cross-encoder only, making search efficient using a relatively small number of anchor items, and a CUR matrix factorization. While ANNCUR's one-time selection of anchors tends to approximate the cross-encoder distances on average, doing so forfeits the capacity to accurately estimate distances to items near the query, leading to regret in the crucial end-task: recall of top-k items. In this paper, we propose ADACUR, a method that adaptively, iteratively, and efficiently minimizes the approximation error for the practically important top-k neighbors. 
It does so by iteratively performing k-NN search using the anchors available so far, then adding these retrieved nearest neighbors to the anchor set for the next round. Empirically, on multiple datasets, in comparison to previous traditional and state-of-the-art methods such as ANNCUR and dual-encoder-based retrieve-and-rerank, our proposed approach ADACUR consistently reduces recall error, by up to 70% in the important k = 1 setting, while using no more compute than its competitors.	翻訳日:2023-10-25 12:44:43 公開日:2023-10-23
# ditto: 文埋め込みを改善するためのシンプルで効率的なアプローチ Ditto: A Simple and Efficient Approach to Improve Sentence Embeddings ( http://arxiv.org/abs/2305.10786v2 ) ライセンス: Link先を確認	Qian Chen, Wen Wang, Qinglin Zhang, Siqi Zheng, Chong Deng, Hai Yu, Jiaqing Liu, Yukun Ma, Chong Zhang	(参考訳) 先行研究は、未学習言語モデル(例えばBERT)の文表現における異方性問題を微調整なしで診断する。解析の結果,BERTの埋め込み文は非形式的単語に対する偏りに悩まされ,意味的テキスト類似性(STS)タスクのパフォーマンスが制限されることがわかった。このバイアスに対処するために、モデルに基づく重要度推定で単語を重み付けし、文埋め込みとして事前学習されたモデルからの単語表現の重み付け平均を計算する、シンプルで効率的な非教師付きアプローチであるDiagonal Attention Pooling (Ditto)を提案する。 Dittoは、任意のトレーニング済み言語モデルに対して、後処理操作として簡単に適用できる。先行文埋め込みアプローチと比較して、dittoはパラメータを追加せず、学習も必要としない。実験により,提案したDittoは異方性問題を緩和し,STSタスクにおける各種事前学習モデルを改善することができることが示された。 Prior studies diagnose the anisotropy problem in sentence representations from pre-trained language models, e.g., BERT, without fine-tuning. Our analysis reveals that the sentence embeddings from BERT suffer from a bias towards uninformative words, limiting the performance in semantic textual similarity (STS) tasks. To address this bias, we propose a simple and efficient unsupervised approach, Diagonal Attention Pooling (Ditto), which weights words with model-based importance estimations and computes the weighted average of word representations from pre-trained models as sentence embeddings. Ditto can be easily applied to any pre-trained language model as a postprocessing operation. Compared to prior sentence embedding approaches, Ditto does not add parameters nor requires any learning. Empirical evaluations demonstrate that our proposed Ditto can alleviate the anisotropy problem and improve various pre-trained models on STS tasks.	翻訳日:2023-10-25 12:35:55 公開日:2023-10-23
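The pooling step described in the Ditto abstract above (a weighted average of word representations, weighted by a model-based importance estimate) can be illustrated with a minimal Python sketch. The abstract does not spell out the importance score; taking it from the diagonal of a single attention matrix is an assumption suggested only by the method's name, and the function name and toy numbers are hypothetical.

```python
def diagonal_attention_pooling(hidden_states, attention):
    """Weighted average of token vectors, weighted by the attention diagonal.

    hidden_states: list of n token vectors (each a list of d floats).
    attention: n x n attention matrix from one head (list of lists).
    ASSUMPTION: using attention[i][i] as the importance of token i is a guess
    based on the name "Diagonal Attention Pooling", not a confirmed detail.
    """
    n = len(hidden_states)
    if n == 0:
        raise ValueError("need at least one token")
    # Importance weight of token i = how much it attends to itself.
    weights = [attention[i][i] for i in range(n)]
    total = sum(weights)
    if total == 0:
        raise ValueError("attention diagonal sums to zero")
    dim = len(hidden_states[0])
    # Normalised weighted sum over tokens, one output value per dimension.
    return [
        sum(weights[i] * hidden_states[i][j] for i in range(n)) / total
        for j in range(dim)
    ]


# Toy inputs, made up for illustration (not produced by any real model).
toy_hidden = [[1.0, 0.0], [0.0, 1.0]]
toy_attention = [[0.75, 0.25], [0.75, 0.25]]
toy_embedding = diagonal_attention_pooling(toy_hidden, toy_attention)
```

No parameters are learned in this sketch, which matches the abstract's claim that the approach adds no parameters and needs no training; with a real model, the hidden states and attention matrix would come from a forward pass.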
# StructGPT:構造化データを扱う大規模言語モデルのための汎用フレームワーク StructGPT: A General Framework for Large Language Model to Reason over Structured Data ( http://arxiv.org/abs/2305.09645v2 ) ライセンス: Link先を確認	Jinhao Jiang, Kun Zhou, Zican Dong, Keming Ye, Wayne Xin Zhao and Ji-Rong Wen	(参考訳) 本稿では,構造化データに対する大規模言語モデル(LLM)のゼロショット推論能力を統一的に向上させる方法について検討する。 LLMのツール強化の研究に触発されて、構造化データに基づく質問応答タスクを解くためのIterative Reading-then-Reasoning (IRR) アプローチ、StructGPTを開発した。本研究では,構造化データから関連する証拠を収集する特殊関数を構築し(すなわち reading),収集した情報に基づいてLLMを推論タスクに集中させる(すなわち reasoning)。特に,外部インタフェースの助けを借りて構造化データの推論において,LLMをサポートするための invoking-linearization-generation 手順を提案する。この手順をインターフェイスで反復することで、我々のアプローチは、所定のクエリに対するターゲットの回答に徐々にアプローチすることができる。 3種類の構造化データを用いて行った大規模な実験は,ChatGPTの性能を大幅に向上させ,全データ教師あり学習ベースラインに対して同等の性能が得られることを示す。私たちのコードとデータは、https://github.com/RUCAIBox/StructGPT で公開されています。 In this paper, we study how to improve the zero-shot reasoning ability of large language models (LLMs) over structured data in a unified way. Inspired by the study on tool augmentation for LLMs, we develop an Iterative Reading-then-Reasoning (IRR) approach for solving question answering tasks based on structured data, called StructGPT. In our approach, we construct the specialized function to collect relevant evidence from structured data (i.e., reading), and let LLMs concentrate the reasoning task based on the collected information (i.e., reasoning). Specifically, we propose an invoking-linearization-generation procedure to support LLMs in reasoning on the structured data with the help of the external interfaces. By iterating this procedure with provided interfaces, our approach can gradually approach the target answer to a given query. 
Extensive experiments conducted on three types of structured data demonstrate the effectiveness of our approach, which can significantly boost the performance of ChatGPT and achieve comparable performance against the full-data supervised-tuning baselines. Our code and data are publicly available at https://github.com/RUCAIBox/StructGPT.	翻訳日:2023-10-25 12:35:37 公開日:2023-10-23
# 量子信頼性 Quantum Reliability ( http://arxiv.org/abs/2305.08461v4 ) ライセンス: Link先を確認	L.X.Cui, Y-M.Du, and C.P.Sun	(参考訳) 量子技術はますます高度で複雑な量子デバイスを生み出した。信頼性(量子信頼性)を評価することは重要な問題です。古典機器の信頼性理論は産業や技術でよく研究されているが、量子信頼性と損失に関する適切な指標は体系的に研究されていない。信頼性損失はプロセスに依存するため、量子忠実性は必ずしもそれを完全に描写するとは限らない。本研究は,状態分散から軌道分離へ焦点を移すことで,量子信頼性の指標を提供する。従来の古典的信頼性の概念とは対照的に、二項論理変数の確率的測定を用いて評価される量子信頼性は、量子確率振幅や波動関数に基礎を置いている。この研究は、古典デバイスと量子デバイスの両方を含む信頼性理論の普遍的な枠組みを提供する。デバイスが実行している実際の量子プロセスがパフォーマンスにどの程度影響するかを解明することで、量子エンジニアリングに関する新たな視点を提供する。 Quantum technology has led to increasingly sophisticated and complex quantum devices. Assessing their reliability (quantum reliability) is an important issue. Although reliability theory for classical devices has been well developed in industry and technology, a suitable metric on quantum reliability and its loss has not been systematically investigated. Since reliability-loss depends on the process, quantum fidelity does not always fully depict it. This study provides a metric of quantum reliability by shifting the focus from state-distinguishing to trajectory-distinguishing. In contrast to the conventional notion of classical reliability, which is evaluated using probabilistic measurements of binary logical variables, quantum reliability is grounded in the quantum probability amplitude or wave function. This research provides a universal framework for reliability theory encompassing both classical and quantum devices. It offers a new perspective on quantum engineering by elucidating how intensely the real quantum process a device undergoes influences its performance.	翻訳日:2023-10-25 12:35:16 公開日:2023-10-23
# ZARA: 小型言語モデルのためのFew-Shot Self-Rationalizationの改善 ZARA: Improving Few-Shot Self-Rationalization for Small Language Models ( http://arxiv.org/abs/2305.07355v2 ) ライセンス: Link先を確認	Wei-Lin Chen, An-Zi Yen, Cheng-Kuang Wu, Hen-Hsen Huang, Hsin-Hsi Chen	(参考訳) エンドタスクの回答と自由テキストの有理性を生成する言語モデル(LM)は、自己有理化モデルとして知られている。近年の成果は, 有理拡張例によるLMの自己合理化による性能向上を示すものである。しかし、説明の恩恵を受ける能力は、アクセシビリティの低い大規模LMでのみ現れる。そこで本研究では,少人数の自己分類を改善するために,小人数のLMに対する説明の活用という,研究の少ない設定について検討する。まず、理性と答えの関係を再考する。人間が説明をどう評価するかという暗黙の精神的プロセスに触発されて、我々は、自己学習のための擬似並列データを自動的に構築するZARA(Zero-shot Augmentation of Rationale-Answer pairs)を提案する。実験結果から,ZARAはタスク精度と説明基準の両方において,FEBベンチマーク上でSOTA性能を達成できた。さらに,推定可能かつ正確な合理化・解答ペアを自動的に識別するzaraの能力を検証する,人間的かつ定量的な評価を行う。 Language models (LMs) that jointly generate end-task answers as well as free-text rationales are known as self-rationalization models. Recent works demonstrate great performance gain for self-rationalization by few-shot prompting LMs with rationale-augmented exemplars. However, the ability to benefit from explanations only emerges with large-scale LMs, which have poor accessibility. In this work, we explore the less-studied setting of leveraging explanations for small LMs to improve few-shot self-rationalization. We first revisit the relationship between rationales and answers. Inspired by the implicit mental process of how human beings assess explanations, we present a novel approach, Zero-shot Augmentation of Rationale-Answer pairs (ZARA), to automatically construct pseudo-parallel data for self-training by reducing the problem of plausibility judgement to natural language inference. Experimental results show ZARA achieves SOTA performance on the FEB benchmark, for both the task accuracy and the explanation metric. In addition, we conduct human and quantitative evaluation validating ZARA's ability to automatically identify plausible and accurate rationale-answer pairs.	翻訳日:2023-10-25 12:33:55 公開日:2023-10-23
# HaluEval: 大規模言語モデルのための大規模幻覚評価ベンチマーク HaluEval: A Large-Scale Hallucination Evaluation Benchmark for Large Language Models ( http://arxiv.org/abs/2305.11747v3 ) ライセンス: Link先を確認	Junyi Li, Xiaoxue Cheng, Wayne Xin Zhao, Jian-Yun Nie and Ji-Rong Wen	(参考訳) ChatGPTのような大規模言語モデル(LLM)は、ソースと矛盾したり、事実の知識によって検証できないコンテンツといった幻覚を生成する傾向にある。どのような種類のコンテンツで、LLMがどの程度幻覚を起こしやすいかを理解するため、大規模言語モデルのための幻覚評価ベンチマーク(HaluEval)を導入する。これは、幻覚認識におけるLLMの性能を評価するための、生成および人間の注釈付き幻覚サンプルの大規模なコレクションである。これらのサンプルを生成するために,ChatGPTに基づく2段階のフレームワーク,すなわちsampling-then-filteringを提案する。また、ChatGPT応答の幻覚に注釈を付けるために、人間のラベラーも採用している。実験結果から、ChatGPTは検証不能な情報を作成することで、特定のトピックの幻覚コンテンツを生成する可能性が示唆された(応答の約19.5%)。さらに、既存のLLMはテキストの幻覚を認識する上で大きな課題に直面している。しかし、我々の実験は、外部知識の提供や推論ステップの追加がLLMの幻覚認識に役立つことも証明している。私たちのベンチマークは https://github.com/RUCAIBox/HaluEval からアクセスできます。 Large language models (LLMs), such as ChatGPT, are prone to generate hallucinations, i.e., content that conflicts with the source or cannot be verified by the factual knowledge. To understand what types of content and to what extent LLMs are apt to hallucinate, we introduce the Hallucination Evaluation benchmark for Large Language Models (HaluEval), a large collection of generated and human-annotated hallucinated samples for evaluating the performance of LLMs in recognizing hallucination. To generate these samples, we propose a ChatGPT-based two-step framework, i.e., sampling-then-filtering. Besides, we also hire some human labelers to annotate the hallucinations in ChatGPT responses. The empirical results suggest that ChatGPT is likely to generate hallucinated content in specific topics by fabricating unverifiable information (i.e., in about 19.5% of responses). Moreover, existing LLMs face great challenges in recognizing the hallucinations in texts. However, our experiments also prove that providing external knowledge or adding reasoning steps can help LLMs recognize hallucinations. Our benchmark can be accessed at https://github.com/RUCAIBox/HaluEval.	翻訳日:2023-10-25 12:26:50 公開日:2023-10-23
# 表現レンズを用いた多言語機械翻訳における知識伝達 Viewing Knowledge Transfer in Multilingual Machine Translation Through a Representational Lens ( http://arxiv.org/abs/2305.11550v2 ) ライセンス: Link先を確認	David Stap, Vlad Niculae, Christof Monz	(参考訳) 翻訳品質だけでは多言語ニューラルマシン翻訳における知識伝達を測定するには十分ではない。この主張を支持するために,言語間の表現的類似度を測定するRepresentational Transfer potential (RTP)を導入する。本稿では,RTPが正と負の両方の転送(干渉)を計測できることを示し,RTPが翻訳品質の変化と強く相関していることを見出した。さらに,転送に関連するデータや言語特性を調査し,マルチ並列重なりが重要ではあるが未検討の機能であることを見出す。そこで我々は,複数並列データを活用することで,言語間での表現の不変性を向上する,補助的類似性損失を用いた新しい学習手法を開発した。提案手法は,複数のデータおよびモデル設定にまたがる低級・中級言語における翻訳品質の向上を示す。 We argue that translation quality alone is not a sufficient metric for measuring knowledge transfer in multilingual neural machine translation. To support this claim, we introduce Representational Transfer Potential (RTP), which measures representational similarities between languages. We show that RTP can measure both positive and negative transfer (interference), and find that RTP is strongly correlated with changes in translation quality, indicating that transfer does occur. Furthermore, we investigate data and language characteristics that are relevant for transfer, and find that multi-parallel overlap is an important yet under-explored feature. Based on this, we develop a novel training scheme, which uses an auxiliary similarity loss that encourages representations to be more invariant across languages by taking advantage of multi-parallel data. We show that our method yields increased translation quality for low- and mid-resource languages across multiple data and model setups.	翻訳日:2023-10-25 12:26:33 公開日:2023-10-23
# フラットネスアウェアプロンプト選択による精度向上とサンプル効率向上 Flatness-Aware Prompt Selection Improves Accuracy and Sample Efficiency ( http://arxiv.org/abs/2305.10713v2 ) ライセンス: Link先を確認	Lingfeng Shen, Weiting Tan, Boyuan Zheng, Daniel Khashabi	(参考訳) 大規模言語モデルの能力が増大するにつれ、それらにアクセスするための主要な方法となっている。これにより、効果的な言語プロンプトを自動選択する戦略の開発が動機となった。本稿では,言語プロンプトの期待される有用性を定量化するための新しい指標であるプロンプト平坦性を導入する。この計量は統計学習における平坦性正規化にインスパイアされ、モデルの頑健さをパラメータ摂動に向けて定量化する。我々は,この指標の理論的基礎と他の素早い選択指標との関係を提供し,既存の手法の包括的理解を提供する。実験により,既存の指標と即時平坦性を組み合わせることで,性能と試料効率が向上することを示した。我々の測定値は,6つの分類ベンチマークにおいて,5%の精度向上と10%のピアソン相関で,前回のプロンプト選択指標を上回った。 With growing capabilities of large language models, prompting them has become the dominant way to access them. This has motivated the development of strategies for automatically selecting effective language prompts. In this paper, we introduce prompt flatness, a new metric to quantify the expected utility of a language prompt. This metric is inspired by flatness regularization in statistical learning that quantifies the robustness of the model towards its parameter perturbations. We provide theoretical foundations for this metric and its relationship with other prompt selection metrics, providing a comprehensive understanding of existing methods. Empirically, we show that combining prompt flatness with existing metrics improves both performance and sample efficiency. Our metric outperforms the previous prompt selection metrics with an average increase of 5% in accuracy and 10% in Pearson correlation across 6 classification benchmarks.	翻訳日:2023-10-25 12:26:00 公開日:2023-10-23
# ミラジェス:対話システムにおける擬人化について Mirages: On Anthropomorphism in Dialogue Systems ( http://arxiv.org/abs/2305.09800v2 ) ライセンス: Link先を確認	Gavin Abercrombie, Amanda Cercas Curry, Tanvi Dinkar, Verena Rieser, Zeerak Talat	(参考訳) 自動対話システムや会話システムは、開発者によって人為化され、ユーザによって人格化される。人格化の度合いは、中程度の選択のため必然的であるが、意識的かつ無意識な設計選択は、ユーザーがそのようなシステムを様々な程度にパーソナライズするよう誘導することができる。ユーザが自動化システムに人間であるかのように関連付けることで、アウトプットの過度な信頼性に起因するリスクシナリオにつながる可能性がある。その結果、自然言語処理研究者は、人格化を誘導し、そのような効果を緩和する資源を開発する要因を調査した。しかし、これらの努力は断片化されており、擬人化の多くの側面はまだ研究されていない。本稿では,対話システムの擬人化に寄与する言語的要因と,ジェンダーのステレオタイプや許容される言語の概念の強化など,起こりうる害について論じる。対話システムの構築に向けた今後の取り組みは,その設計,開発,リリース,説明において特に注意を払うこと,ユーザによる人格化を誘発する多くの言語的手がかりに従うことを推奨する。 Automated dialogue or conversational systems are anthropomorphised by developers and personified by users. While a degree of anthropomorphism may be inevitable due to the choice of medium, conscious and unconscious design choices can guide users to personify such systems to varying degrees. Encouraging users to relate to automated systems as if they were human can lead to high risk scenarios caused by over-reliance on their outputs. As a result, natural language processing researchers have investigated the factors that induce personification and develop resources to mitigate such effects. However, these efforts are fragmented, and many aspects of anthropomorphism have yet to be explored. In this paper, we discuss the linguistic factors that contribute to the anthropomorphism of dialogue systems and the harms that can arise, including reinforcing gender stereotypes and notions of acceptable language. We recommend that future efforts towards developing dialogue systems take particular care in their design, development, release, and description; and attend to the many linguistic cues that can elicit personification by users.	翻訳日:2023-10-25 12:24:01 公開日:2023-10-23
# プロンプトは大規模言語モデルにおける確率測定の代用ではない Prompting is not a substitute for probability measurements in large language models ( http://arxiv.org/abs/2305.13264v2 ) ライセンス: Link先を確認	Jennifer Hu and Roger Levy	(参考訳) プロンプティングは、現在、大規模言語モデル(LLM)の言語知識を評価する主要な方法である。他の方法では、文字列上のモデルの確率分布を直接読み取るが、プロンプトでは、言語入力を処理することによって、モデルが内部情報にアクセスする必要がある。本研究では,モデルの言語知識を計測する方法として,メタリング的プロンシングと直接確率測定を比較した。概して、llmsのメタリング的判断は表現から直接導かれる量よりも劣っていることが分かる。さらに、プロンプトクエリが次の単語の確率の直接測定から逸脱するにつれて、一貫性が悪化する。以上の結果から, LLMが特定の言語的一般化を欠いているという決定的な証拠として, メタリング主義的プロンプトに依存する否定的な結果が認められないことが示唆された。また,確率分布へのアクセスが制限されたクローズドAPIへの移行によって失われる価値も強調した。 Prompting is now a dominant method for evaluating the linguistic knowledge of large language models (LLMs). While other methods directly read out models' probability distributions over strings, prompting requires models to access this internal information by processing linguistic input, thereby implicitly testing a new type of emergent ability: metalinguistic judgment. In this study, we compare metalinguistic prompting and direct probability measurements as ways of measuring models' linguistic knowledge. Broadly, we find that LLMs' metalinguistic judgments are inferior to quantities directly derived from representations. Furthermore, consistency gets worse as the prompt query diverges from direct measurements of next-word probabilities. Our findings suggest that negative results relying on metalinguistic prompts cannot be taken as conclusive evidence that an LLM lacks a particular linguistic generalization. Our results also highlight the value that is lost with the move to closed APIs where access to probability distributions is limited.	翻訳日:2023-10-25 12:15:44 公開日:2023-10-23
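The "direct probability measurement" contrasted with prompting in the abstract above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the per-token log-probabilities below are invented numbers, the function names are hypothetical, and in practice the values would be read from a model's output distribution rather than typed in.

```python
def sequence_logprob(token_logprobs):
    """Log-probability of a string: the sum of its per-token log-probabilities."""
    return sum(token_logprobs)


def direct_preference(logprobs_a, logprobs_b):
    """Return which of two strings the model assigns higher probability to.

    This reads the preference straight from the probabilities instead of
    asking the model a metalinguistic question in a prompt.
    """
    score_a = sequence_logprob(logprobs_a)
    score_b = sequence_logprob(logprobs_b)
    if score_a > score_b:
        return "a"
    if score_b > score_a:
        return "b"
    return "tie"


# Hypothetical per-token log-probabilities for a minimal pair of sentences
# (made-up values, not measured from any model).
grammatical = [-1.0, -2.0, -0.5]
ungrammatical = [-1.0, -4.0, -0.5]
preferred = direct_preference(grammatical, ungrammatical)
```

A prompting-based evaluation would instead ask the model which sentence is better and parse its answer; the abstract reports that such judgments tend to be less reliable than quantities derived directly from the model's representations.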
# Decomposed Prompting for Machine Translation Between Related Languages using Large Language Models ( http://arxiv.org/abs/2305.13085v2 ) License: see link	Ratish Puduppully, Anoop Kunchukuttan, Raj Dabre, Ai Ti Aw, Nancy F. Chen	This study investigates machine translation between related languages, i.e., languages within the same family that share linguistic characteristics such as word order and lexical similarity. Machine translation through few-shot prompting leverages a small set of translation pair examples to generate translations for test sentences. This procedure requires the model to learn how to generate translations while simultaneously ensuring that token ordering is maintained to produce a fluent and accurate translation. We propose that for related languages, the task of machine translation can be simplified by leveraging the monotonic alignment characteristic of such languages. We introduce DecoMT, a novel approach of few-shot prompting that decomposes the translation process into a sequence of word chunk translations. Through automatic and human evaluation conducted on multiple related language pairs across various language families, we demonstrate that our proposed approach of decomposed prompting surpasses multiple established few-shot baseline approaches. For example, DecoMT outperforms the strong few-shot prompting BLOOM model with an average improvement of 8 chrF++ scores across the examined languages.	Translated: 2023-10-25 12:15:02 Published: 2023-10-23
# Beyond Labels: Empowering Human Annotators with Natural Language Explanations through a Novel Active-Learning Architecture ( http://arxiv.org/abs/2305.12710v2 ) License: see link	Bingsheng Yao, Ishan Jindal, Lucian Popa, Yannis Katsis, Sayan Ghosh, Lihong He, Yuxuan Lu, Shashank Srivastava, Yunyao Li, James Hendler, Dakuo Wang	Real-world domain experts (e.g., doctors) rarely annotate only a decision label in their day-to-day workflow without providing explanations. Yet, existing low-resource learning techniques, such as Active Learning (AL), that aim to support human annotators mostly focus on the label while neglecting the natural language explanation of a data point. This work proposes a novel AL architecture to support experts' real-world need for label and explanation annotations in low-resource scenarios. Our AL architecture leverages an explanation-generation model to produce explanations guided by human explanations, a prediction model that faithfully uses the generated explanations for prediction, and a novel data diversity-based AL sampling strategy that benefits from the explanation annotations. Automated and human evaluations demonstrate the effectiveness of incorporating explanations into AL sampling and the improved human annotation efficiency and trustworthiness with our AL architecture. Additional ablation studies illustrate the potential of our AL architecture for transfer learning, generalizability, and integration with large language models (LLMs). While LLMs exhibit exceptional explanation-generation capabilities for relatively simple tasks, their effectiveness in complex real-world tasks warrants further in-depth study.	Translated: 2023-10-25 12:14:08 Published: 2023-10-23
# Evaluating Open-QA Evaluation ( http://arxiv.org/abs/2305.12421v4 ) License: see link	Cunxiang Wang, Sirui Cheng, Qipeng Guo, Yuanhao Yue, Bowen Ding, Zhikun Xu, Yidong Wang, Xiangkun Hu, Zheng Zhang, Yue Zhang	This study focuses on the evaluation of the Open Question Answering (Open-QA) task, which can directly estimate the factuality of large language models (LLMs). Current automatic evaluation methods have shown limitations, indicating that human evaluation remains the most reliable approach. We introduce a new task, Evaluating QA Evaluation (QA-Eval), and the corresponding dataset EVOUNA, designed to assess the accuracy of AI-generated answers in relation to standard answers within Open-QA. Our evaluation of these methods utilizes human-annotated results to measure their performance. Specifically, the work investigates methods that show high correlation with human evaluations, deeming them more reliable. We also discuss the pitfalls of current methods and methods to improve LLM-based evaluators. We believe this new QA-Eval task and corresponding dataset EVOUNA will facilitate the development of more effective automatic evaluation tools and prove valuable for future research in this area. All resources are available at https://github.com/wangcunxiang/QA-Eval under the Apache-2.0 License.	Translated: 2023-10-25 12:13:11 Published: 2023-10-23
# Grammar-Constrained Decoding for Structured NLP Tasks without Finetuning ( http://arxiv.org/abs/2305.13971v3 ) License: see link	Saibo Geng, Martin Josifoski, Maxime Peyrard, Robert West	Despite their impressive performance, large language models (LMs) still struggle with reliably generating complex output structures when not finetuned to follow the required output format exactly. To address this issue, grammar-constrained decoding (GCD) can be used to control the generation of LMs, guaranteeing that the output follows a given structure. Most existing GCD methods are, however, limited to specific tasks, such as parsing or code generation. In this work, we demonstrate that formal grammars can describe the output space for a much wider range of tasks and argue that GCD can serve as a unified framework for structured NLP tasks in general. For increased flexibility, we introduce input-dependent grammars, which allow the grammar to depend on the input and thus enable the generation of different output structures for different inputs. We then empirically demonstrate the power and flexibility of GCD-enhanced LMs on (1) information extraction, (2) entity disambiguation, and (3) constituency parsing. Our results indicate that grammar-constrained LMs substantially outperform unconstrained LMs or even beat task-specific finetuned models. Grammar constraints thus hold great promise for harnessing off-the-shelf LMs for a wide range of structured NLP tasks, especially where training data is scarce or finetuning is expensive. Code and data: https://github.com/epfl-dlab/GCD.	Translated: 2023-10-25 12:06:19 Published: 2023-10-23
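The general idea of grammar-constrained decoding can be illustrated with a toy Python sketch. Everything here is an assumption made for illustration: the grammar is a hand-written finite-state machine rather than a formal grammar from the paper, `toy_scores` stands in for a language model, and decoding is plain greedy search. It is not the implementation in the linked repository.

```python
# Toy grammar as a finite-state machine: state -> {allowed token: next state}.
# It accepts exactly strings of the form "( <name> knows <name> )".
GRAMMAR = {
    "start": {"(": "subject"},
    "subject": {"Alice": "relation", "Bob": "relation"},
    "relation": {"knows": "object"},
    "object": {"Alice": "close", "Bob": "close"},
    "close": {")": "done"},
}

def constrained_greedy_decode(score_fn, grammar=GRAMMAR, start="start", end="done"):
    """Greedy decoding that only ever considers tokens the grammar allows."""
    state, output = start, []
    while state != end:
        allowed = grammar[state]
        # Tokens outside `allowed` are never candidates, however high they score.
        token = max(allowed, key=lambda t: score_fn(output, t))
        output.append(token)
        state = allowed[token]
    return output

def toy_scores(prefix, token):
    """Hypothetical stand-in for model scores; ignores the prefix entirely."""
    preferences = {"hello": 9.0, "Bob": 2.0, "Alice": 1.0, "knows": 0.5}
    return preferences.get(token, 0.0)

print(" ".join(constrained_greedy_decode(toy_scores)))
```

Although `hello` has the highest raw score, it can never be emitted, because no grammar state allows it.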
# EDIS: Entity-Driven Image Search over Multimodal Web Content ( http://arxiv.org/abs/2305.13631v2 ) License: see link	Siqi Liu, Weixi Feng, Tsu-jui Fu, Wenhu Chen, William Yang Wang	Making image retrieval methods practical for real-world search applications requires significant progress in dataset scales, entity comprehension, and multimodal information fusion. In this work, we introduce Entity-Driven Image Search (EDIS), a challenging dataset for cross-modal image search in the news domain. EDIS consists of 1 million web images from actual search engine results and curated datasets, with each image paired with a textual description. Unlike datasets that assume a small set of single-modality candidates, EDIS reflects real-world web image search scenarios by including a million multimodal image-text pairs as candidates. EDIS encourages the development of retrieval models that simultaneously address cross-modal information fusion and matching. To achieve accurate ranking results, a model must: 1) understand named entities and events from text queries, 2) ground entities onto images or text descriptions, and 3) effectively fuse textual and visual representations. Our experimental results show that EDIS challenges state-of-the-art methods with dense entities and a large-scale candidate set. The ablation study also shows that fusing textual features with visual features is critical in improving retrieval results.	Translated: 2023-10-25 12:05:07 Published: 2023-10-23
# Look-back Decoding for Open-Ended Text Generation ( http://arxiv.org/abs/2305.13477v2 ) License: see link	Nan Xu, Chunting Zhou, Asli Celikyilmaz, Xuezhe Ma	Given a prefix (context), open-ended generation aims to decode texts that are coherent, which do not abruptly drift from previous topics, and informative, which do not suffer from undesired repetitions. In this paper, we propose Look-back, an improved decoding algorithm that leverages the Kullback-Leibler divergence to track the distribution distance between current and historical decoding steps. Thus Look-back can automatically predict potential repetitive phrases and topic drift, and remove tokens that may cause the failure modes, restricting the next-token probability distribution within a plausible distance to the history. We perform decoding experiments on document continuation and story generation, and demonstrate that Look-back is able to generate more fluent and coherent text, outperforming other strong decoding methods significantly in both automatic and human evaluations.	Translated: 2023-10-25 12:04:24 Published: 2023-10-23
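The core signal described in this abstract, the KL divergence between the current next-token distribution and those of earlier decoding steps, can be sketched as follows. This is a minimal illustration under stated assumptions: the direction of the divergence, the `threshold` value, and the three-token toy distributions are all hypothetical, and the token-removal step of the actual algorithm is not reproduced.

```python
import math

def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions over the same vocabulary."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q) if pi > 0)

def min_kl_to_history(current, history):
    """Smallest divergence between the current step and any earlier step."""
    return min(kl_divergence(current, past) for past in history)

def likely_repetition(current, history, threshold=0.1):
    """Flag a step whose distribution nearly duplicates an earlier one."""
    return bool(history) and min_kl_to_history(current, history) < threshold

# Hypothetical next-token distributions from two earlier decoding steps.
history = [[0.70, 0.20, 0.10], [0.10, 0.80, 0.10]]
near_copy = [0.69, 0.21, 0.10]   # almost identical to the first historical step
novel = [0.05, 0.05, 0.90]       # far from both historical steps

print(likely_repetition(near_copy, history), likely_repetition(novel, history))
```

A very small minimum divergence suggests the model is about to retrace an earlier step, which is the situation a repetition-aware decoder would intervene on.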
# DADA: Dialect Adaptation via Dynamic Aggregation of Linguistic Rules ( http://arxiv.org/abs/2305.13406v2 ) License: see link	Yanchen Liu, William Held, Diyi Yang	Existing large language models (LLMs) that mainly focus on Standard American English (SAE) often lead to significantly worse performance when being applied to other English dialects. While existing mitigations tackle discrepancies for individual target dialects, they assume access to high-accuracy dialect identification systems. The boundaries between dialects are inherently flexible, making it difficult to categorize language into discrete predefined categories. In this paper, we propose DADA (Dialect Adaptation via Dynamic Aggregation), a modular approach to imbue SAE-trained models with multi-dialectal robustness by composing adapters which handle specific linguistic features. The compositional architecture of DADA allows for both targeted adaptation to specific dialect variants and simultaneous adaptation to various dialects. We show that DADA is effective for both single-task and instruction-finetuned language models, offering an extensible and interpretable framework for adapting existing LLMs to different English dialects.	Translated: 2023-10-25 12:03:50 Published: 2023-10-23
# Automatic Model Selection with Large Language Models for Reasoning ( http://arxiv.org/abs/2305.14333v2 ) License: see link	James Xu Zhao, Yuxi Xie, Kenji Kawaguchi, Junxian He, Michael Qizhe Xie	Chain-of-Thought (CoT) and Program-Aided Language Models (PAL) represent two distinct reasoning methods, each with its own strengths. CoT employs natural language, offering flexibility and interpretability, while PAL utilizes programming language, yielding more structured and rigorous logic. We introduce a model selection method to combine the best of both worlds by employing a large language model (LLM) to dynamically select between them. Our theoretical analysis underscores the feasibility of this method, which is further corroborated by empirical results. Our proposed method demonstrates significant performance improvements across eight reasoning datasets with Codex, ChatGPT, and GPT-4. Additionally, our method is complementary to self-consistency; when integrated, it can further enhance performance while significantly reducing computation costs. Moreover, we achieve new state-of-the-art results on GSM8K and SVAMP, with respective accuracies of 96.8% and 93.7%. Our code, data and prompts are available at https://github.com/XuZhao0/Model-Selection-Reasoning	Translated: 2023-10-25 11:55:06 Published: 2023-10-23
# TalkUp: Paving the Way for Understanding Empowering Language ( http://arxiv.org/abs/2305.14326v2 ) License: see link	Lucille Njoo, Chan Young Park, Octavia Stappart, Marvin Thielk, Yi Chu and Yulia Tsvetkov	Empowering language is important in many real-world contexts, from education to workplace dynamics to healthcare. Though language technologies are growing more prevalent in these contexts, empowerment has seldom been studied in NLP, and moreover, it is inherently challenging to operationalize because of its implicit nature. This work builds from linguistic and social psychology literature to explore what characterizes empowering language. We then crowdsource a novel dataset of Reddit posts labeled for empowerment, reasons why these posts are empowering to readers, and the social relationships between posters and readers. Our preliminary analyses show that this dataset, which we call TalkUp, can be used to train language models that capture empowering and disempowering language. More broadly, TalkUp provides an avenue to explore implication, presuppositions, and how social context influences the meaning of language.	Translated: 2023-10-25 11:54:47 Published: 2023-10-23
# Hierarchical Prompting Assists Large Language Model on Web Navigation ( http://arxiv.org/abs/2305.14257v2 ) License: see link	Abishek Sridhar, Robert Lo, Frank F. Xu, Hao Zhu, Shuyan Zhou	Large language models (LLMs) struggle to process complicated observations in interactive decision-making tasks. To alleviate this issue, we propose a simple hierarchical prompting approach. Diverging from previous prompting approaches that always put the full observation (e.g., a web page) into the prompt, we propose to first construct an action-aware observation that is more condensed and relevant, using a dedicated Summarizer prompt. The Actor prompt then predicts the next action based on the summarized observation. While our method has broad applicability, we particularly demonstrate its efficacy in the complex domain of web navigation, where a full observation often contains redundant and irrelevant information. Our approach outperforms the previous state-of-the-art prompting mechanism by 6.2% on task success rate, demonstrating its potential on interactive decision-making tasks with long observation traces.	Translated: 2023-10-25 11:54:04 Published: 2023-10-23
# Multilingual Large Language Models Are Not (Yet) Code-Switchers ( http://arxiv.org/abs/2305.14235v2 ) License: see link	Ruochen Zhang, Samuel Cahyawijaya, Jan Christian Blaise Cruz, Genta Indra Winata and Alham Fikri Aji	Multilingual Large Language Models (LLMs) have recently shown great capabilities in a wide range of tasks, exhibiting state-of-the-art performance through zero-shot or few-shot prompting methods. While there have been extensive studies on their abilities in monolingual tasks, the investigation of their potential in the context of code-switching (CSW), the practice of alternating languages within an utterance, remains relatively uncharted. In this paper, we provide a comprehensive empirical analysis of various multilingual LLMs, benchmarking their performance across four tasks: sentiment analysis, machine translation, summarization and word-level language identification. Our results indicate that despite multilingual LLMs exhibiting promising outcomes in certain tasks using zero-shot or few-shot prompting, they still underperform in comparison to fine-tuned models of much smaller scales. We argue that current "multilingualism" in LLMs does not inherently imply proficiency with code-switching texts, calling for future research to bridge this discrepancy.	Translated: 2023-10-25 11:53:41 Published: 2023-10-23
# CompoundPiece: Evaluating and Improving Decompounding Performance of Language Models ( http://arxiv.org/abs/2305.14214v2 ) License: see link	Benjamin Minixhofer, Jonas Pfeiffer, Ivan Vulić	While many languages possess processes of joining two or more words to create compound words, previous studies have been typically limited only to languages with excessively productive compound formation (e.g., German, Dutch) and there is no public dataset containing compound and non-compound words across a large number of languages. In this work, we systematically study decompounding, the task of splitting compound words into their constituents, at a wide scale. We first address the data gap by introducing a dataset of 255k compound and non-compound words across 56 diverse languages obtained from Wiktionary. We then use this dataset to evaluate an array of Large Language Models (LLMs) on the decompounding task. We find that LLMs perform poorly, especially on words which are tokenized unfavorably by subword tokenization. We thus introduce a novel methodology to train dedicated models for decompounding. The proposed two-stage procedure relies on a fully self-supervised objective in the first stage, while the second, supervised learning stage optionally fine-tunes the model on the annotated Wiktionary data. Our self-supervised models outperform the prior best unsupervised decompounding models by 13.9% accuracy on average. Our fine-tuned models outperform all prior (language-specific) decompounding tools. Furthermore, we use our models to leverage decompounding during the creation of a subword tokenizer, which we refer to as CompoundPiece. CompoundPiece tokenizes compound words more favorably on average, leading to improved performance on decompounding over an otherwise equivalent model using SentencePiece tokenization.	Translated: 2023-10-25 11:53:20 Published: 2023-10-23
# Centering the Margins: Outlier-Based Identification of Harmed Populations in Toxicity Detection ( http://arxiv.org/abs/2305.14735v2 ) License: see link	Vyoma Raman, Eve Fleisig, Dan Klein	The impact of AI models on marginalized communities has traditionally been measured by identifying performance differences between specified demographic subgroups. Though this approach aims to center vulnerable groups, it risks obscuring patterns of harm faced by intersectional subgroups or shared across multiple groups. To address this, we draw on theories of marginalization from disability studies and related disciplines, which state that people farther from the norm face greater adversity, to consider the "margins" in the domain of toxicity detection. We operationalize the "margins" of a dataset by employing outlier detection to identify text about people with demographic attributes distant from the "norm". We find that model performance is consistently worse for demographic outliers, with mean squared error (MSE) up to 70.4% worse for outliers than for non-outliers across toxicity types. It is also worse for text outliers, with an MSE up to 68.4% higher for outliers than non-outliers. We also find text and demographic outliers to be particularly susceptible to errors in the classification of severe toxicity and identity attacks. Compared to analysis of disparities using traditional demographic breakdowns, we find that our outlier analysis frequently surfaces greater harms faced by a larger, more intersectional group, which suggests that outlier analysis is particularly beneficial for identifying harms against those groups.	Translated: 2023-10-25 11:46:58 Published: 2023-10-23
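One plausible reading of the "up to 70.4% worse" comparison is a relative MSE gap between the outlier and non-outlier groups. The sketch below shows that arithmetic in Python; the formula is an assumption about how such a gap could be computed, and the labels, predictions, and outlier flags are made-up toy values, not data from the paper.

```python
def mse(y_true, y_pred):
    """Mean squared error between two equal-length sequences."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

def relative_mse_gap(y_true, y_pred, is_outlier):
    """Percent by which MSE on outliers exceeds MSE on non-outliers."""
    rows = list(zip(y_true, y_pred, is_outlier))
    out_true = [t for t, _, o in rows if o]
    out_pred = [p for _, p, o in rows if o]
    rest_true = [t for t, _, o in rows if not o]
    rest_pred = [p for _, p, o in rows if not o]
    mse_out, mse_rest = mse(out_true, out_pred), mse(rest_true, rest_pred)
    return 100.0 * (mse_out - mse_rest) / mse_rest

# Toy toxicity scores: the model is off by 0.1 on typical examples
# and by 0.2 on examples flagged as outliers.
y_true = [0.0, 1.0, 0.0, 1.0]
y_pred = [0.1, 0.9, 0.2, 0.8]
is_outlier = [False, False, True, True]

print(round(relative_mse_gap(y_true, y_pred, is_outlier), 1))
```

Doubling the absolute error quadruples the squared error, so the toy gap comes out far larger than the difference in raw error might suggest.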
# COMET-M: Reasoning about Multiple Events in Complex Sentences ( http://arxiv.org/abs/2305.14617v2 ) License: see link	Sahithya Ravi, Raymond Ng, Vered Shwartz	Understanding the speaker's intended meaning often involves drawing commonsense inferences to reason about what is not stated explicitly. In multi-event sentences, it requires understanding the relationships between events based on contextual knowledge. We propose COMET-M (Multi-Event), an event-centric commonsense model capable of generating commonsense inferences for a target event within a complex sentence. COMET-M builds upon COMET (Bosselut et al., 2019), which excels at generating event-centric inferences for simple sentences, but struggles with the complexity of multi-event sentences prevalent in natural text. To overcome this limitation, we curate a multi-event inference dataset of 35K human-written inferences. We trained COMET-M on the human-written inferences and also created baselines using automatically labeled examples. Experimental results demonstrate the significant performance improvement of COMET-M over COMET in generating multi-event inferences. Moreover, COMET-M successfully produces distinct inferences for each target event, taking the complete context into consideration. COMET-M holds promise for downstream tasks involving natural text such as coreference resolution, dialogue, and story understanding.	Translated: 2023-10-25 11:45:19 Published: 2023-10-23
# Parameter-Efficient Language Model Tuning with Active Learning in Low-Resource Settings ( http://arxiv.org/abs/2305.14576v2 ) License: see link	Josip Jukić, Jan Šnajder	Pre-trained language models (PLMs) have ignited a surge in demand for effective fine-tuning techniques, particularly in low-resource domains and languages. Active learning (AL), a set of algorithms designed to decrease labeling costs by minimizing label complexity, has shown promise in confronting the labeling bottleneck. In parallel, adapter modules designed for parameter-efficient fine-tuning (PEFT) have demonstrated notable potential in low-resource settings. However, the interplay between AL and adapter-based PEFT remains unexplored. We present an empirical study of PEFT behavior with AL in low-resource settings for text classification tasks. Our findings affirm the superiority of PEFT over full fine-tuning (FFT) in low-resource settings and demonstrate that this advantage persists in AL setups. We further examine the properties of PEFT and FFT through the lens of forgetting dynamics and instance-level representations, where we find that PEFT yields more stable representations of early and middle layers compared to FFT. Our research underscores the synergistic potential of AL and PEFT in low-resource settings, paving the way for advancements in efficient and effective fine-tuning.	Translated: 2023-10-25 11:44:59 Published: 2023-10-23
# MathDial: A Dialogue Tutoring Dataset with Rich Pedagogical Properties Grounded in Math Reasoning Problems ( http://arxiv.org/abs/2305.14536v2 ) License: see link	Jakub Macina, Nico Daheim, Sankalan Pal Chowdhury, Tanmay Sinha, Manu Kapur, Iryna Gurevych, Mrinmaya Sachan	While automatic dialogue tutors hold great potential in making education personalized and more accessible, research on such systems has been hampered by a lack of sufficiently large and high-quality datasets. Collecting such datasets remains challenging, as recording tutoring sessions raises privacy concerns and crowdsourcing leads to insufficient data quality. To address this, we propose a framework to generate such dialogues by pairing human teachers with a Large Language Model (LLM) prompted to represent common student errors. We describe how we use this framework to collect MathDial, a dataset of 3k one-to-one teacher-student tutoring dialogues grounded in multi-step math reasoning problems. While models like GPT-3 are good problem solvers, they fail at tutoring because they generate factually incorrect feedback or are prone to revealing solutions to students too early. To overcome this, we let teachers provide learning opportunities to students by guiding them using various scaffolding questions according to a taxonomy of teacher moves. We demonstrate that MathDial and its extensive annotations can be used to finetune models to be more effective tutors (and not just solvers). We confirm this by automatic and human evaluation, notably in an interactive setting that measures the trade-off between student solving success and telling solutions. The dataset is released publicly.	Translated: 2023-10-25 11:44:16 Published: 2023-10-23
# NAIL: Lexical Retrieval Indices with Efficient Non-Autoregressive Decoders ( http://arxiv.org/abs/2305.14499v2 ) License: see link	Livio Baldini Soares, Daniel Gillick, Jeremy R. Cole, Tom Kwiatkowski	Neural document rerankers are extremely effective in terms of accuracy. However, the best models require dedicated hardware for serving, which is costly and often not feasible. To avoid this serving-time requirement, we present a method of capturing up to 86% of the gains of a Transformer cross-attention model with a lexicalized scoring function that only requires 10^-6% of the Transformer's FLOPs per document and can be served using commodity CPUs. When combined with a BM25 retriever, this approach matches the quality of a state-of-the-art dual encoder retriever that still requires an accelerator for query encoding. We introduce NAIL (Non-Autoregressive Indexing with Language models) as a model architecture that is compatible with recent encoder-decoder and decoder-only large language models, such as T5, GPT-3 and PaLM. This model architecture can leverage existing pre-trained checkpoints and can be fine-tuned for efficiently constructing document representations that do not require neural processing of queries.	Translated: 2023-10-25 11:43:51 Published: 2023-10-23
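A lexicalized scoring function of the general kind this abstract describes can be sketched as a lookup over precomputed per-document token weights, so that no neural model runs at query time. The index contents, the whitespace tokenizer, and the function names below are hypothetical and chosen for illustration; how NAIL actually produces the weights is not shown.

```python
def lexical_score(query_tokens, doc_token_weights):
    """Sum the document's precomputed weights for each query token."""
    return sum(doc_token_weights.get(tok, 0.0) for tok in query_tokens)

def rank(query, index):
    """Rank document ids by lexical score; only dictionary lookups are needed."""
    tokens = query.lower().split()  # naive whitespace tokenizer (an assumption)
    return sorted(index, key=lambda doc_id: lexical_score(tokens, index[doc_id]), reverse=True)

# Hypothetical index: token weights that an offline model would have
# written for each document ahead of time.
index = {
    "doc_a": {"neural": 1.2, "reranker": 0.9, "cpu": 0.1},
    "doc_b": {"lexical": 1.1, "index": 0.8, "cpu": 0.7},
}

print(rank("lexical index on cpu", index))
```

Because scoring reduces to lookups and additions, it can plausibly run on commodity CPUs, which is the serving-time property the abstract emphasizes.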
# 状況アライメントと説明可能なテキストによる社会文化的規範の類似性と差異 Sociocultural Norm Similarities and Differences via Situational Alignment and Explainable Textual Entailment ( http://arxiv.org/abs/2305.14492v2 ) ライセンス: Link先を確認	Sky CH-Wang, Arkadiy Saakyan, Oliver Li, Zhou Yu, Smaranda Muresan	(参考訳) 文化をまたいで推論できるシステムを設計するには、彼らが運用するコンテキストの規範に根ざす必要がある。しかし、社会規範の計算モデル開発に関する現在の研究は、主にアメリカ社会に焦点を当てている。本稿では,中国文化とアメリカ文化にまたがる記述的社会規範の発見と比較のための新しいアプローチを提案する。我々は,中国のQ&Aプラットフォーム(Zhihu)と既存のSocialChemistryデータセットの議論を,文化的軸を対比するプロキシとして活用し,社会的状況を文化的に整合させ,文脈内学習を用いてテキストから社会的規範を抽出することで,我々のアプローチを実証する。人間とaiのコラボレーティブなフレームワークにチェーン・オブ・マインド(chain-of-thought)プロンプトを組み込むことで、中国とアメリカの文化にまたがる社会規範3,069の高品質なデータセットを構築します。文化全体にわたる社会的規範を推論するモデルの能力をテストするために,3Bパラメータ未満の既存のモデルでは,自動評価と人的評価の両方において,大きな改善の余地があることが示される。我々のデータセットに基づく異文化間の規範差のさらなる分析は、社会指向の枠組みと実証的な一致を示し、これらの文化をまたがる規範における状況的および記述的ニュアンスを明らかにした。 Designing systems that can reason across cultures requires that they are grounded in the norms of the contexts in which they operate. However, current research on developing computational models of social norms has primarily focused on American society. Here, we propose a novel approach to discover and compare descriptive social norms across Chinese and American cultures. We demonstrate our approach by leveraging discussions on a Chinese Q&A platform (Zhihu) and the existing SocialChemistry dataset as proxies for contrasting cultural axes, align social situations cross-culturally, and extract social norms from texts using in-context learning. Embedding Chain-of-Thought prompting in a human-AI collaborative framework, we build a high-quality dataset of 3,069 social norms aligned with social situations across Chinese and American cultures alongside corresponding free-text explanations. 
To test the ability of models to reason about social norms across cultures, we introduce the task of explainable social norm entailment, showing that existing models under 3B parameters have significant room for improvement in both automatic and human evaluation. Further analysis of cross-cultural norm differences based on our dataset shows empirical alignment with the social orientations framework, revealing several situational and descriptive nuances in norms across these cultures.	翻訳日:2023-10-25 11:43:29 公開日:2023-10-23
# 評価メトリクスの評価:測定理論を用いたnlg評価メトリクス分析の枠組み Evaluating Evaluation Metrics: A Framework for Analyzing NLG Evaluation Metrics using Measurement Theory ( http://arxiv.org/abs/2305.14889v2 ) ライセンス: Link先を確認	Ziang Xiao, Susu Zhang, Vivian Lai, Q. Vera Liao	(参考訳) 我々は,自然言語生成(NLG)モデル評価において,評価指標の設計と評価という根本的な課題に対処する。既存の自動測定基準と騒音の限界を,現在の人間評価の方法から認識し,nlg評価基準の信頼性と妥当性を概念化し評価するための,計測理論に基づくフレームワークであるmetricevalを提案する。このフレームワークは測定誤差の原因を定式化し、経験的データに基づいて評価指標を評価する統計ツールを提供する。私たちのフレームワークでは、メトリクスの不確かさを定量化して結果をよりよく解釈できます。筆者らは,本フレームワークの実践的使用を実証するため,要約のための評価指標のセットを分析し,LLM測定値におけるヒトの時間的妥当性と信頼性に関する問題点を明らかにした。 MetricEvalを通じて、信頼性の高いメトリクスの設計、評価、解釈を促進し、堅牢で効果的なNLGモデルを推し進めることを目指している。 We address a fundamental challenge in Natural Language Generation (NLG) model evaluation -- the design and evaluation of evaluation metrics. Recognizing the limitations of existing automatic metrics and noises from how current human evaluation was conducted, we propose MetricEval, a framework informed by measurement theory, the foundation of educational test design, for conceptualizing and evaluating the reliability and validity of NLG evaluation metrics. The framework formalizes the source of measurement error and offers statistical tools for evaluating evaluation metrics based on empirical data. With our framework, one can quantify the uncertainty of the metrics to better interpret the result. To exemplify the use of our framework in practice, we analyzed a set of evaluation metrics for summarization and identified issues related to conflated validity structure in human-eval and reliability in LLM-based metrics. Through MetricEval, we aim to promote the design, evaluation, and interpretation of valid and reliable metrics to advance robust and effective NLG models.	翻訳日:2023-10-25 11:33:31 公開日:2023-10-23
# Debiasing made State-of-the-art: Revising the Simple Seed-based Weak Supervision for Text Classification Debiasing Made State-of-the-art: Revisiting the Simple Seed-based Weak Supervision for Text Classification ( http://arxiv.org/abs/2305.14794v2 ) ライセンス: Link先を確認	Chengyu Dong, Zihan Wang, Jingbo Shang	(参考訳) 弱教師付きテキスト分類の最近の進歩は、高レベルの人間のヒューリスティックを質の高い擬似ラベルに変換する洗練された手法の設計に主に焦点をあてている。本稿では,疑似ラベルを生成する最も簡単な方法であるシードマッチングに基づく手法を再検討し,そのパワーが極めて過小評価されたことを示す。シードマッチングの限定的な性能は,単純なシードマッチングルールによるラベルバイアスによるものであり,高品質な擬似ラベル選択に対する信頼性の学習を防止できることを示した。興味深いことに、マッチした入力テキストにあるシードワードを削除するだけでラベルバイアスが軽減され、信頼性が向上する。その後、シードマッチングによって達成されるパフォーマンスが大幅に向上し、最先端と同等、あるいはそれ以上に向上することができる。また、シード語が知られていない場合の処理には、入力テキスト中の単語トークンをランダムに削除し、削除率を高くすることを提案する。驚くべきことに、このランダムな削除方法を備えたシードマッチングは、しばしば、シード削除よりも優れた性能を達成できる。 Recent advances in weakly supervised text classification mostly focus on designing sophisticated methods to turn high-level human heuristics into quality pseudo-labels. In this paper, we revisit the seed matching-based method, which is arguably the simplest way to generate pseudo-labels, and show that its power was greatly underestimated. We show that the limited performance of seed matching is largely due to the label bias injected by the simple seed-match rule, which prevents the classifier from learning reliable confidence for selecting high-quality pseudo-labels. Interestingly, simply deleting the seed words present in the matched input texts can mitigate the label bias and help learn better confidence. Subsequently, the performance achieved by seed matching can be improved significantly, making it on par with or even better than the state-of-the-art. Furthermore, to handle the case when the seed words are not made known, we propose to simply delete the word tokens in the input text randomly with a high deletion ratio. Remarkably, seed matching equipped with this random deletion method can often achieve even better performance than that with seed deletion.	翻訳日:2023-10-25 11:33:03 公開日:2023-10-23
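The seed-matching abstract above describes three simple operations: assigning a pseudo-label by seed-word matching, deleting the matched seed words, and randomly deleting tokens at a high ratio. The following is a minimal sketch of those operations, not the authors' implementation. The function names, the whitespace tokenization, and the default `deletion_ratio=0.9` are assumptions made for illustration; the abstract only says the ratio is "high".

```python
import random
import re


def _normalize(token):
    # Strip punctuation and lowercase so "Goal." matches the seed "goal".
    return re.sub(r"\W+", "", token).lower()


def seed_match_label(text, class_seeds):
    """Return the class whose seed words occur most often, or None if none match."""
    tokens = [_normalize(t) for t in text.split()]
    counts = {
        label: sum(tokens.count(seed.lower()) for seed in seeds)
        for label, seeds in class_seeds.items()
    }
    best = max(counts, key=counts.get)
    return best if counts[best] > 0 else None


def delete_seed_words(text, seed_words):
    """Remove every token that matches a seed word (assumed debiasing step)."""
    seeds = {w.lower() for w in seed_words}
    return " ".join(t for t in text.split() if _normalize(t) not in seeds)


def delete_random_tokens(text, deletion_ratio=0.9, rng=None):
    """Drop each token independently with probability `deletion_ratio`."""
    rng = rng or random.Random(0)  # fixed seed only to keep the sketch reproducible
    return " ".join(t for t in text.split() if rng.random() >= deletion_ratio)


if __name__ == "__main__":
    seeds = {"sports": ["soccer", "goal"], "tech": ["software"]}
    doc = "The soccer team scored a late goal."
    print(seed_match_label(doc, seeds))
    print(delete_seed_words(doc, seeds["sports"]))
```

In this sketch a classifier would be trained on the pseudo-labeled texts after deletion, so that it cannot rely on the seed words that produced the labels.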
# コンテキストから外すな! 文脈モデルの必要性とスタイリスティック書き直しの評価について Don't Take This Out of Context! On the Need for Contextual Models and Evaluations for Stylistic Rewriting ( http://arxiv.org/abs/2305.14755v2 ) ライセンス: Link先を確認	Akhila Yerukola, Xuhui Zhou, Elizabeth Clark, Maarten Sap	(参考訳) 既存のスタイリスティックなテキスト書き換え手法や評価指標は文レベルで機能するが、テキストのより広い文脈を無視すると、汎用的で曖昧で一貫性のない書き直しが好まれる。本稿では、先行するテキストコンテキストを、スタイリスティックテキストの書き直しの段階である$\textit{rewriting}$と$\textit{evaluation}$の2つに統合することを検討するとともに、元の文との類似性と文脈的な結束性を組み合わせた新しいコンテクスト評価指標である$\texttt{CtxSimFit}$を導入する。形式性,毒性,感情伝達タスクの非文脈的および文脈的書き直しを比較検討した。実験の結果、人間は文脈を考慮した書き直しを、非文脈的な書き直しよりも適切で自然であるとして有意に好むが、既存の文レベルの自動メトリクス(例えば、ROUGE, SBERT)は、人間の好みとあまり相関しない($\rho$=0--0.3)。対照的に、人間の好みは、我々の新しい$\texttt{CtxSimFit}$($\rho$=0.7--0.9)と、コンテキストに影響を受けた共通メトリクス($\rho$=0.4--0.7)の両方によって、ずっとよく反映されている。総じて,スタイリスティックなテキスト書き換えにおいて,コンテクストを生成段階,特に評価段階に統合することの重要性を強調する。 Most existing stylistic text rewriting methods and evaluation metrics operate on a sentence level, but ignoring the broader context of the text can lead to preferring generic, ambiguous, and incoherent rewrites. In this paper, we investigate integrating the preceding textual context into both the $\textit{rewriting}$ and $\textit{evaluation}$ stages of stylistic text rewriting, and introduce a new composite contextual evaluation metric $\texttt{CtxSimFit}$ that combines similarity to the original sentence with contextual cohesiveness. We comparatively evaluate non-contextual and contextual rewrites in formality, toxicity, and sentiment transfer tasks. Our experiments show that humans significantly prefer contextual rewrites as more fitting and natural over non-contextual ones, yet existing sentence-level automatic metrics (e.g., ROUGE, SBERT) correlate poorly with human preferences ($\rho$=0--0.3). In contrast, human preferences are much better reflected by both our novel $\texttt{CtxSimFit}$ ($\rho$=0.7--0.9) as well as proposed context-infused versions of common metrics ($\rho$=0.4--0.7). 
Overall, our findings highlight the importance of integrating context into the generation and especially the evaluation stages of stylistic text rewriting.	翻訳日:2023-10-25 11:32:25 公開日:2023-10-23
# ECHo:人間中心推論による事象因果推論のためのビシオ言語データセット ECHo: A Visio-Linguistic Dataset for Event Causality Inference via Human-Centric Reasoning ( http://arxiv.org/abs/2305.14740v2 ) ライセンス: Link先を確認	Yuxi Xie and Guanzhen Li and Min-Yen Kan	(参考訳) 視覚言語社会シナリオに基づく事象因果推論の診断データセットであるECHo(Event Causality Inference via Human-Centric Reasoning)を紹介する。 ECHoは、テレビの犯罪ドラマに基づく、現実世界の人間中心の演繹的情報を用いている。 ECHoは、マルチモーダル情報に基づいて社会的相互作用を理解し、推論する、心の理論(Theory-of-Mind, ToM)能力を必要とする。筆者らはECHoを用いて,現在のAIシステムの推論能力を評価するために,統合型Chain-of-Thought(CoT)フレームワークを提案する。当社のToM強化CoTパイプラインは、ゼロショットと少数ショットのビジオ言語推論の両方において、さまざまな大きな基礎モデルに対応しています。 InstructGPTやMiniGPT-4といった最近の大規模基盤モデルを3つの診断的人間中心のタスクで精査するために,この枠組みを用いる。さらなる分析は、ECHoが推論における不完全性と矛盾を明らかにするための挑戦的なデータセットであることを示している。私たちのデータとコードは https://github.com/YuxiXie/ECHo で公開されています。 We introduce ECHo (Event Causality Inference via Human-Centric Reasoning), a diagnostic dataset of event causality inference grounded in visio-linguistic social scenarios. ECHo employs real-world human-centric deductive information building on a television crime drama. ECHo requires the Theory-of-Mind (ToM) ability to understand and reason about social interactions based on multimodal information. Using ECHo, we propose a unified Chain-of-Thought (CoT) framework to assess the reasoning capability of current AI systems. Our ToM-enhanced CoT pipeline accommodates various large foundation models in both zero-shot and few-shot visio-linguistic reasoning. We use this framework to scrutinize recent large foundation models such as InstructGPT and MiniGPT-4 on three diagnostic human-centric tasks. Further analysis demonstrates ECHo as a challenging dataset to expose imperfections and inconsistencies in reasoning. Our data and code are publicly available at https://github.com/YuxiXie/ECHo.	翻訳日:2023-10-25 11:31:50 公開日:2023-10-23
# Calc-X と Calcformers:シンボリックシステムとの相互作用による算術的連鎖の強化 Calc-X and Calcformers: Empowering Arithmetical Chain-of-Thought through Interaction with Symbolic Systems ( http://arxiv.org/abs/2305.15017v2 ) ライセンス: Link先を確認	Marek Kadlčík, Michal Štefánik, Ondřej Sotolář, Vlastimil Martinek	(参考訳) 多くのタスクにおける優れたパフォーマンスにもかかわらず、言語モデルは算術演算を必要とするタスクにおいて事実的誤りを犯す傾向にある。この欠陥に対処するために、連鎖推論における計算機の適切な使用を示すデータセットの集合であるCalc-Xを作成する。 Calc-Xは、シンボルシステムに計算をオフロードする言語モデルを教えるのに適している。既存のチェーン・オブ・ソート(chain-of-thought)データセットを調査し、提案フォーマットに統一することで、算術的推論を必要とする300,000以上のサンプルの標準的なコレクションを得る。最後に、新しいCalc-Xコレクションを使用して、私たちがCalcformersと呼ぶ、計算機を使用するオープンソースのモデルをトレーニングし、これらのモデルがバニラ言語モデルのベースラインと比べて正しい結果を生成する精度のおよそ2倍の精度を示す。すべてのCalc-Xデータセット、ソースコード、Calcformersモデルを公開しています。 Despite outstanding performance in many tasks, language models are notoriously inclined to make factual errors in tasks requiring arithmetic computation. We address this deficiency by creating Calc-X, a collection of datasets that demonstrates the appropriate use of a calculator in reasoning chains. Calc-X is suitable for teaching language models to offload computations to a symbolic system. We survey and unify several existing chain-of-thought datasets into a proposed format, resulting in a standard collection of over 300,000 samples requiring arithmetic reasoning. Finally, we use the new Calc-X collection to train open-source calculator-using models we call Calcformers and show that these models approximately double the accuracy of generating correct results compared to vanilla language model baselines. We make all Calc-X datasets, source code and Calcformers models publicly available.	翻訳日:2023-10-25 11:15:42 公開日:2023-10-23
# 言語モデルによる推論は世界モデルによる計画 Reasoning with Language Model is Planning with World Model ( http://arxiv.org/abs/2305.14992v2 ) ライセンス: Link先を確認	Shibo Hao, Yi Gu, Haodi Ma, Joshua Jiahua Hong, Zhen Wang, Daisy Zhe Wang, Zhiting Hu	(参考訳) 大規模言語モデル(LLM)は、特に中間推論ステップ(例えばChain-of-Thought, CoT)を生成するよう促されたときに顕著な推論能力を示す。しかしながら、LLMは、与えられた環境でタスクを実行するためのアクションプランの作成や、複雑な数学、論理的、常識的推論の実行など、人間にとって容易な問題に苦しむことができる。この不足は、LLMが世界$\textit{state}$(環境ステータス、中間変数値など)を予測し、アクションの長期的な結果をシミュレートするための内部の$\textit{world model}$を欠いていることに起因する。これは、LLMが人間の脳に似た計画を行うのを防ぐもので、そのような計画では、代替の推論経路を探索し、将来の状態と報酬を予測し、既存の推論手順を反復的に洗練する。この制限を克服するために、新しいLLM推論フレームワークである$\underline{R}$easoning vi$\underline{a}$ $\underline{P}$lanning $\textbf{(RAP)}$を提案する。 RAPは、LLMを世界モデルと推論エージェントの両方として再利用し、広大な推論空間における戦略的探索のための(Monte Carlo Tree Searchに基づく)原則的計画アルゴリズムを組み込んでいる。推論中、LLM(エージェント)は、LLM(ワールドモデル)とタスク固有報酬の指導の下で推論ツリーを漸進的に構築し、探索と活用(exploration $\textit{vs.}$ exploitation)の適切なバランスで、高報酬の推論パスを効率的に取得する。我々は、計画生成、数理推論、論理推論など、様々な困難な推論問題にRAPを適用する。これらの課題に対する実証的な結果は、CoTを含む様々な強固なベースラインに対するRAPの優越性を示す。 LLAMA-33BのRAPはGPT-4のCoTを33%の相対的な改善で上回っている。 Large language models (LLMs) have shown remarkable reasoning capabilities, especially when prompted to generate intermediate reasoning steps (e.g., Chain-of-Thought, CoT). However, LLMs can still struggle with problems that are easy for humans, such as generating action plans for executing tasks in a given environment, or performing complex math, logical, and commonsense reasoning. The deficiency stems from the key fact that LLMs lack an internal $\textit{world model}$ to predict the world $\textit{state}$ (e.g., environment status, intermediate variable values) and simulate long-term outcomes of actions. This prevents LLMs from performing deliberate planning akin to human brains, which involves exploring alternative reasoning paths, anticipating future states and rewards, and iteratively refining existing reasoning steps. 
To overcome the limitations, we propose a new LLM reasoning framework, $\underline{R}$easoning vi$\underline{a}$ $\underline{P}$lanning $\textbf{(RAP)}$. RAP repurposes the LLM as both a world model and a reasoning agent, and incorporates a principled planning algorithm (based on Monte Carlo Tree Search) for strategic exploration in the vast reasoning space. During reasoning, the LLM (as agent) incrementally builds a reasoning tree under the guidance of the LLM (as world model) and task-specific rewards, and obtains a high-reward reasoning path efficiently with a proper balance between exploration $\textit{vs.}$ exploitation. We apply RAP to a variety of challenging reasoning problems including plan generation, math reasoning, and logical inference. Empirical results on these tasks demonstrate the superiority of RAP over various strong baselines, including CoT and least-to-most prompting with self-consistency. RAP on LLAMA-33B surpasses CoT on GPT-4 with 33% relative improvement in a plan generation setting.	翻訳日:2023-10-25 11:15:26 公開日:2023-10-23
# LLMは暗号化プロンプトを理解できる:プライバシーに配慮したフレンドリーなトランスフォーマーを目指して LLMs Can Understand Encrypted Prompt: Towards Privacy-Computing Friendly Transformers ( http://arxiv.org/abs/2305.18396v2 ) ライセンス: Link先を確認	Xuanqi Liu and Zhuotao Liu	(参考訳) コミュニティは、サーバがモデルパラメータを保持し、クライアントが推論のためにプライベートデータ(またはプロンプト)を入力するサーバークライアント設定で、トランスフォーマーベースの大規模言語モデル(LLM)のためのプライベート推論フレームワークを構築することを模索した。しかし、これらのフレームワークは、プライベートインプットが元のLLMを通じて前方に伝播するときに大きなオーバーヘッドを課す。本稿では,プライバシ計算フレンドリー近似を用いたトランスフォーマアーキテクチャにおける計算・通信重演算子の置換により,モデル性能への影響が極めて小さい一方で,プライベート推論コストを大幅に削減できることを示す。最先端のIron(NeurIPS 2022)と比較して、当社のプライバシコンピューティングフレンドリーなモデル推論パイプラインは、ほぼ同じ精度を維持しながら、計算を$5\times$高速化し、通信オーバーヘッドを80%削減する。 The community has explored building private inference frameworks for transformer-based large language models (LLMs) in a server-client setting, where the server holds the model parameters and the client inputs its private data (or prompt) for inference. However, these frameworks impose significant overhead when the private inputs are forward propagated through the original LLMs. In this paper, we show that substituting the computation- and communication-heavy operators in the transformer architecture with privacy-computing friendly approximations can greatly reduce the private inference costs while incurring very minor impact on model performance. Compared to state-of-the-art Iron (NeurIPS 2022), our privacy-computing friendly model inference pipeline achieves a $5\times$ acceleration in computation and an 80% reduction in communication overhead, while retaining nearly identical accuracy.	翻訳日:2023-10-25 09:12:37 公開日:2023-10-23
# 反復的検索生成シナジーによる検索適応型大規模言語モデルの拡張 Enhancing Retrieval-Augmented Large Language Models with Iterative Retrieval-Generation Synergy ( http://arxiv.org/abs/2305.15294v2 ) ライセンス: Link先を確認	Zhihong Shao, Yeyun Gong, Yelong Shen, Minlie Huang, Nan Duan, Weizhu Chen	(参考訳) 大きな言語モデルは強力なテキストプロセッサと推論器であるが、いまだに時代遅れの知識や幻覚など、世界への接続を必要とする制限を被っている。検索型大規模言語モデルは,外部知識に基づくモデル生成の基盤として,広く注目を集めている。しかし、検索者は関連性、特に複雑な情報を必要とするクエリーを捉えるのに苦労する。近年の研究では、検索に積極的に関与する大きな言語モデル、すなわち、生成による検索を改善することで、関連性モデリングを改善することが提案されている。本稿では, iter-retgen と呼ばれる手法により, 検索と生成を反復的に相乗的に行うことで, 高い性能を実現することを示す。モデル出力は、タスクを完了するのに必要なものを示し、それゆえ、より関連する知識を取得するための情報的コンテキストを提供し、結果として次のイテレーションでより良いアウトプットを生成するのに役立つ。出力を生成するときに生成と検索をインターリーブする最近の研究と比較すると、iter-retgenプロセスはすべての知識を全体として取得し、構造的な制約なしに生成の柔軟性を保っている。マルチホップ質問応答、事実検証、コモンセンス推論に基づいてIter-RetGenを評価し、パラメトリック知識と非パラメトリック知識を柔軟に活用できることを示し、検索と生成のオーバーヘッドを少なくしつつ、最先端の検索強化ベースラインに勝ったり、競合することを示す。世代別検索適応によりさらに性能を向上させることができる。 Large language models are powerful text processors and reasoners, but are still subject to limitations including outdated knowledge and hallucinations, which necessitates connecting them to the world. Retrieval-augmented large language models have raised extensive attention for grounding model generation on external knowledge. However, retrievers struggle to capture relevance, especially for queries with complex information needs. Recent work has proposed to improve relevance modeling by having large language models actively involved in retrieval, i.e., to improve retrieval with generation. In this paper, we show that strong performance can be achieved by a method we call Iter-RetGen, which synergizes retrieval and generation in an iterative manner. A model output shows what might be needed to finish a task, and thus provides an informative context for retrieving more relevant knowledge which in turn helps generate a better output in the next iteration. 
Compared with recent work which interleaves retrieval with generation when producing an output, Iter-RetGen processes all retrieved knowledge as a whole and largely preserves the flexibility in generation without structural constraints. We evaluate Iter-RetGen on multi-hop question answering, fact verification, and commonsense reasoning, and show that it can flexibly leverage parametric knowledge and non-parametric knowledge, and is superior to or competitive with state-of-the-art retrieval-augmented baselines while causing fewer overheads of retrieval and generation. We can further improve performance via generation-augmented retrieval adaptation.	翻訳日:2023-10-25 09:12:19 公開日:2023-10-23
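The Iter-RetGen abstract above describes a loop in which each model output becomes context for the next retrieval, and the retrieved knowledge is then processed as a whole to produce the next output. A minimal sketch of that loop follows. It is not the authors' code: `retrieve` and `generate` are hypothetical callables supplied by the caller, and the prompt layout and the default of two iterations are assumptions for illustration.

```python
def iter_retgen(question, retrieve, generate, iterations=2):
    """Alternate retrieval and generation for a fixed number of iterations.

    retrieve: hypothetical callable, query string -> list of passage strings
    generate: hypothetical callable, prompt string -> completion string
    """
    output = ""
    for _ in range(iterations):
        # Use the previous output (empty on the first pass) to enrich the query.
        query = f"{output} {question}".strip()
        passages = retrieve(query)
        # All retrieved passages are given to the model together, as one context.
        context = "\n".join(passages)
        prompt = f"{context}\n\nQuestion: {question}\nAnswer:"
        output = generate(prompt).strip()
    return output
```

The design point the sketch tries to capture is that generation is never interrupted mid-output: retrieval happens only between complete generations.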
# シャープネス認識最小化における正規化の役割 The Crucial Role of Normalization in Sharpness-Aware Minimization ( http://arxiv.org/abs/2305.15287v2 ) ライセンス: Link先を確認	Yan Dai, Kwangjun Ahn, Suvrit Sra	(参考訳) Sharpness-Aware Minimization (SAM)は、ディープニューラルネットワークの予測性能を大幅に改善する勾配に基づく最適化(Foret et al., ICLR 2021)である。その結果、その実証的な成功を説明することへの関心が高まっている。特に、SAM更新の重要なコンポーネントである正規化による役割の理解に重点を置いています。我々は、SAMにおける凸関数と非凸関数の両方に対する正規化の効果を理論的に経験的に研究し、正規化が果たす2つの重要な役割を明らかにした。一アルゴリズムの安定化に役立ち、かつ ii) アルゴリズムがminimaの連続体(多様体)に沿ってドリフトすることを可能にする。さらに、正規化のこれらの2つの性質はSAMを超パラメータの選択に対して堅牢にし、SAMの実用性を支持することを主張する。我々の結論は様々な実験によって裏付けられている。 Sharpness-Aware Minimization (SAM) is a recently proposed gradient-based optimizer (Foret et al., ICLR 2021) that greatly improves the prediction performance of deep neural networks. Consequently, there has been a surge of interest in explaining its empirical success. We focus, in particular, on understanding the role played by normalization, a key component of the SAM updates. We theoretically and empirically study the effect of normalization in SAM for both convex and non-convex functions, revealing two key roles played by normalization: i) it helps in stabilizing the algorithm; and ii) it enables the algorithm to drift along a continuum (manifold) of minima -- a property identified by recent theoretical works that is the key to better performance. We further argue that these two properties of normalization make SAM robust against the choice of hyper-parameters, supporting the practicality of SAM. Our conclusions are backed by various experiments.	翻訳日:2023-10-25 09:11:54 公開日:2023-10-23
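To make the role of normalization in the SAM abstract above concrete, here is a small sketch of one SAM update in plain Python. With normalization, the ascent step has fixed length `rho` regardless of the gradient magnitude. The `normalize=False` branch is an illustrative un-normalized counterpart included only for comparison, and the function name, list-based vectors, and default step sizes are assumptions rather than anything taken from the paper.

```python
import math


def sam_step(w, grad_fn, lr=0.1, rho=0.05, normalize=True):
    """One SAM update: ascend to a perturbed point, then descend with its gradient."""
    g = grad_fn(w)
    if normalize:
        norm = math.sqrt(sum(gi * gi for gi in g))
        # Normalized ascent: the perturbation always has length rho.
        scale = rho / norm if norm > 0 else 0.0
    else:
        # Un-normalized variant: the perturbation length scales with the gradient.
        scale = rho
    w_perturbed = [wi + scale * gi for wi, gi in zip(w, g)]
    g_perturbed = grad_fn(w_perturbed)
    return [wi - lr * gi for wi, gi in zip(w, g_perturbed)]


if __name__ == "__main__":
    # Toy objective f(w) = 0.5 * ||w||^2, whose gradient is w itself.
    quadratic_grad = lambda w: list(w)
    print(sam_step([3.0, 4.0], quadratic_grad))
    print(sam_step([3.0, 4.0], quadratic_grad, normalize=False))
```

On this toy quadratic the two variants differ only slightly; the sketch is meant to show where normalization enters the update, not to reproduce the stability and drift behavior analyzed in the paper.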
# L-CAD:拡散述語を用いた任意のレベル記述による言語ベースの色付け L-CAD: Language-based Colorization with Any-level Descriptions using Diffusion Priors ( http://arxiv.org/abs/2305.15217v3 ) ライセンス: Link先を確認	Zheng Chang, Shuchen Weng, Peixuan Zhang, Yu Li, Si Li, Boxin Shi	(参考訳) 言語ベースのカラー化は、ユーザフレンドリーな自然言語記述の指導の下で、有意義で視覚的な色を生み出す。従来手法では、画像内のほとんどのオブジェクトに対して、ユーザが包括的な色記述を提供することを暗黙的に仮定していた。本稿では,任意のレベルの記述で言語ベースの色付けを行う統一モデルを提案する。我々は、その頑健な言語理解と豊かな色に事前訓練されたモダリティ生成モデルを活用し、あらゆるレベルの記述の本質的なあいまいさに対処する。さらに,局所的な空間構造を保ち,ゴースト効果を防止するために,入力条件と整合するモジュールを設計する。提案する新しいサンプリング戦略により,多様で複雑なシナリオでインスタンス対応のカラー化を実現する。広範な実験結果から,任意のレベル記述を効果的に処理し,言語ベースと自動カラー化手法を両立させる利点が示された。コードと事前訓練されたモデルは、https://github.com/changzheng123/L-CADで入手できる。 Language-based colorization produces plausible and visually pleasing colors under the guidance of user-friendly natural language descriptions. Previous methods implicitly assume that users provide comprehensive color descriptions for most of the objects in the image, which leads to suboptimal performance. In this paper, we propose a unified model to perform language-based colorization with any-level descriptions. We leverage the pretrained cross-modality generative model for its robust language understanding and rich color priors to handle the inherent ambiguity of any-level descriptions. We further design modules to align with input conditions to preserve local spatial structures and prevent the ghosting effect. With the proposed novel sampling strategy, our model achieves instance-aware colorization in diverse and complex scenarios. Extensive experimental results demonstrate our advantages of effectively handling any-level descriptions and outperforming both language-based and automatic colorization methods. The code and pretrained models are available at: https://github.com/changzheng123/L-CAD.	翻訳日:2023-10-25 09:11:36 公開日:2023-10-23
# LLMは十分に進歩したか? 大規模言語モデルのベンチマークを解く問題 Have LLMs Advanced Enough? A Challenging Problem Solving Benchmark For Large Language Models ( http://arxiv.org/abs/2305.15074v3 ) ライセンス: Link先を確認	Daman Arora, Himanshu Gaurav Singh, Mausam	(参考訳) 既存の推論ベンチマークにおける大規模言語モデル(LLM)の性能は、ここ数年で大幅に改善されている。これに対して我々は,LLMの問題解決能力を評価する上で,かなり難しいベンチマークデータセットであるJEEBenchを提案する。競争力の高いIIT JEE-Advanced試験から, 数学, 物理, 化学の課題を515点評価した。このベンチマークで問題を解くには、ドメイン内知識の深層に基づくロングホリゾン推論が不可欠です。さまざまなオープンソースおよびプロプライエタリなモデルに対する評価から,自己一貫性や自己定義,思考の連鎖といったテクニックを用いた場合においても,最も高いパフォーマンスが40%未満であることが分かりました。 GPT-4の典型的な失敗モードは、代数的操作における誤り、抽象的な概念を数学的方程式に正確に基底付けることの難しさ、関連するドメイン固有の概念の取得の失敗である。また,GPT-4は誤答に対する負のマーキングによって引き起こされるリスクを評価することができない。そこで本研究では,自己整合性に対する保温後信頼性保持手法を開発し,効果的な応答選択を実現する。 LLMを用いた問題解決における今後の研究を,我々の挑戦的なベンチマークが導くことを期待します。 The performance of large language models (LLMs) on existing reasoning benchmarks has significantly improved over the past years. In response, we present JEEBench, a considerably more challenging benchmark dataset for evaluating the problem solving abilities of LLMs. We curate 515 challenging pre-engineering mathematics, physics and chemistry problems from the highly competitive IIT JEE-Advanced exam. Long-horizon reasoning on top of deep in-domain knowledge is essential for solving problems in this benchmark. Our evaluation on various open-source and proprietary models reveals that the highest performance, even after using techniques like self-consistency, self-refinement and chain-of-thought prompting, is less than 40%. The typical failure modes of GPT-4, the best model, are errors in algebraic manipulation, difficulty in grounding abstract concepts into mathematical equations accurately and failure in retrieving relevant domain-specific concepts. We also observe that by mere prompting, GPT-4 is unable to assess risk introduced by negative marking for incorrect answers. For this, we develop a post-hoc confidence-thresholding method over self-consistency, which enables effective response selection. 
We hope that our challenging benchmark will guide future research in problem-solving using LLMs.	翻訳日:2023-10-25 09:10:58 公開日:2023-10-23
# GPT-4は良いデータアナリストか? Is GPT-4 a Good Data Analyst? ( http://arxiv.org/abs/2305.15038v2 ) ライセンス: Link先を確認	Liying Cheng, Xingxuan Li, Lidong Bing	(参考訳) 大規模言語モデル(llm)は、コンテキスト理解、コード生成、言語生成、データストーリテリングなど、多くのドメインやタスクで強力な能力を示しているため、多くのデータアナリストは、彼らの仕事が人工知能(ai)に置き換えられるかどうかを懸念するかもしれない。この論争を巻き起こす話題は大衆の注目を集めた。しかし、我々はまだ結論を出すことなく、異なる意見の段階にある。本研究は,「GPT-4は優れたデータ分析者か?」という研究課題を提起し,本研究の目的を,直接比較研究を行うことである。詳細は、GPT-4をデータアナリストとして、幅広い領域のデータベースでエンドツーエンドのデータ分析を行う。本稿では,gpt-4のプロンプトを慎重に設計し,課題に取り組むための枠組みを提案する。また,複数の専門家データアナリストとGPT-4のパフォーマンスを体系的に比較するために,タスク固有の評価指標を設計する。実験の結果, GPT-4はヒトに匹敵する性能を示した。我々はまた、GPT-4がデータアナリストを置き換えることができるという結論に達する前に、さらなる研究に光を当てるために、我々の結果についてより深く議論する。 As large language models (LLMs) have demonstrated their powerful capabilities in plenty of domains and tasks, including context understanding, code generation, language generation, data storytelling, etc., many data analysts may raise concerns if their jobs will be replaced by artificial intelligence (AI). This controversial topic has drawn great attention in public. However, we are still at a stage of divergent opinions without any definitive conclusion. Motivated by this, we raise the research question of "is GPT-4 a good data analyst?" in this work and aim to answer it by conducting head-to-head comparative studies. In detail, we regard GPT-4 as a data analyst to perform end-to-end data analysis with databases from a wide range of domains. We propose a framework to tackle the problems by carefully designing the prompts for GPT-4 to conduct experiments. We also design several task-specific evaluation metrics to systematically compare the performance between several professional human data analysts and GPT-4. Experimental results show that GPT-4 can achieve comparable performance to humans. We also provide in-depth discussions about our results to shed light on further studies before reaching the conclusion that GPT-4 can replace data analysts.	翻訳日:2023-10-25 09:10:24 公開日:2023-10-23
# 自己ICL:自己生成デモによるゼロショットインコンテキスト学習 Self-ICL: Zero-Shot In-Context Learning with Self-Generated Demonstrations ( http://arxiv.org/abs/2305.15035v2 ) ライセンス: Link先を確認	Wei-Lin Chen, Cheng-Kuang Wu, Yun-Nung Chen, Hsin-Hsi Chen	(参考訳) 大規模言語モデル(LLM)は、いくつかのインプット・アウトプット・デモでターゲットタスクに適応する重要なコンテキスト内学習(ICL)能力を示した。 iclの改善のために、既存のトレーニングコーパスから代表的なデモンストレーションを選択するための様々な方法が提案されている。しかし、このような設定は実世界のプラクティスとは一致せず、エンドユーザは通常、デモンストレーションプールにアクセスせずにlmsをクエリする。本稿では,ゼロショットICLを実行するために,LMの固有の機能をブートストラップするシンプルなフレームワークであるSelf-ICLを紹介する。テスト入力が与えられたら、Self-ICLはまずモデルに擬似入力を生成するよう促す。次に、ゼロショットプロンプトにより擬似入力の擬似ラベルを予測する。最後に、擬似インプット-ラベルペアをデモとしてテスト入力用のICLを実行する。 23のBIG-Bench Hardタスクの評価では、自己ICLは平均精度と頭部比較の両方でゼロショットベースラインを上回っている。さらに、ゼロショットチェーンでは、Self-ICLは実演に匹敵する結果が得られる。さらに,Self-ICLの有効性を検証し,異なる環境下での行動に対する洞察を提供するため,さまざまな分析を行った。 Large language models (LLMs) have exhibited striking in-context learning (ICL) ability to adapt to target tasks with a few input-output demonstrations. For better ICL, different methods are proposed to select representative demonstrations from existing training corpora. However, such settings are not aligned with real-world practices, as end-users usually query LMs without access to demonstration pools. In this work, we introduce Self-ICL -- a simple framework which bootstraps LMs' intrinsic capabilities to perform zero-shot ICL. Given a test input, Self-ICL first prompts the model to generate pseudo-inputs. Next, the model predicts pseudo-labels for the pseudo-inputs via zero-shot prompting. Finally, we perform ICL for the test input with the pseudo-input-label pairs as demonstrations. Evaluation on 23 BIG-Bench Hard tasks shows Self-ICL outperforms zero-shot baselines on both average accuracy and head-to-head comparison. Moreover, with zero-shot chain-of-thought, Self-ICL achieves results comparable to using real demonstrations. Additionally, we conduct a range of analyses to validate Self-ICL's effectiveness and provide insights for its behaviors under different settings.	
翻訳日:2023-10-25 09:10:01 公開日:2023-10-23
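The three Self-ICL steps in the abstract above (generate pseudo-inputs, predict pseudo-labels zero-shot, then run ICL with the pseudo pairs as demonstrations) can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: `generate` is a hypothetical callable wrapping whatever model is used, and the prompt wording and `num_pseudo=3` are invented for the example.

```python
def self_icl(test_input, task_instruction, generate, num_pseudo=3):
    """Zero-shot ICL with self-generated demonstrations.

    generate: hypothetical callable, prompt string -> completion string
    Returns the final answer and the final prompt that produced it.
    """
    # Step 1: prompt the model for pseudo-inputs modeled on the test input.
    pseudo_inputs = []
    for i in range(num_pseudo):
        prompt = (
            f"{task_instruction}\nExample input: {test_input}\n"
            f"Write new, different input #{i + 1} for this task:"
        )
        pseudo_inputs.append(generate(prompt).strip())

    # Step 2: predict a pseudo-label for each pseudo-input via zero-shot prompting.
    demos = []
    for x in pseudo_inputs:
        label = generate(f"{task_instruction}\nInput: {x}\nOutput:").strip()
        demos.append((x, label))

    # Step 3: use the pseudo-input-label pairs as in-context demonstrations.
    demo_text = "".join(f"Input: {x}\nOutput: {y}\n\n" for x, y in demos)
    final_prompt = f"{task_instruction}\n\n{demo_text}Input: {test_input}\nOutput:"
    return generate(final_prompt).strip(), final_prompt
```

Each test input costs `2 * num_pseudo + 1` model calls in this sketch, which is the price paid for not needing a demonstration pool.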
# 説明の活用: 拡張されたテキスト属性グラフ表現学習のためのllm-to-lmインタプリタ Harnessing Explanations: LLM-to-LM Interpreter for Enhanced Text-Attributed Graph Representation Learning ( http://arxiv.org/abs/2305.19523v3 ) ライセンス: Link先を確認	Xiaoxin He, Xavier Bresson, Thomas Laurent, Adam Perold, Yann LeCun, Bryan Hooi	(参考訳) 近年,テキスト対応グラフ(TAG)の表現学習が重要な研究課題となっている。 TAGの典型的な例は、各論文のテキストがノード属性として機能する論文引用グラフである。初期グラフニューラルネットワーク(GNN)パイプラインは、これらのテキスト属性を、スキップグラムや単語の袋など、浅いあるいは手作りの機能に変換することで処理した。近年の取り組みは、言語モデル(LM)によるパイプラインの強化に重点を置いている。 GPTやLlama2のような強力な大規模言語モデル(LLM)が出現し、推論能力と一般的な知識を活用できるようになり、LLMのテキストモデリング能力とGNNの構造学習能力を組み合わせた技術の必要性が高まっている。そこで本研究では,LLMを利用してテキスト情報を特徴として捉え,下流タスクにおけるGNNの性能向上に活用する。我々はLLMにゼロショット分類の実行を促し、意思決定プロセスのテキスト説明を要求し、LLM-to-LMインタプリタを設計して、これらの説明を下流のGNNを強化する情報的特徴に翻訳する。実験の結果,Cora,PubMed,ogbn-arxiv,新たに導入されたデータセットarXiv-2023など,確立されたTAGデータセットの最先端結果が得られた。さらに,本手法はトレーニングを著しく高速化し,ogbn-arxivのベースラインよりも2.88倍向上した。最後に、提案手法の汎用性はTAGを超えて拡張され、グラフテキストデータを含む他のタスクを強化する可能性を秘めていると信じている。コードとデータセットは https://github.com/XiaoxinHe/TAPE で入手できる。 Representation learning on text-attributed graphs (TAGs) has become a critical research problem in recent years. A typical example of a TAG is a paper citation graph, where the text of each paper serves as node attributes. Initial graph neural network (GNN) pipelines handled these text attributes by transforming them into shallow or hand-crafted features, such as skip-gram or bag-of-words features. Recent efforts have focused on enhancing these pipelines with language models (LMs), which typically demand intricate designs and substantial computational resources. With the advent of powerful large language models (LLMs) such as GPT or Llama2, which demonstrate an ability to reason and to utilize general knowledge, there is a growing need for techniques which combine the textual modelling abilities of LLMs with the structural learning capabilities of GNNs. Hence, in this work, we focus on leveraging LLMs to capture textual information as features, which can be used to boost GNN performance on downstream tasks. 
A key innovation is our use of explanations as features: we prompt an LLM to perform zero-shot classification, request textual explanations for its decision-making process, and design an LLM-to-LM interpreter to translate these explanations into informative features that enhance downstream GNNs. Our experiments demonstrate that our method achieves state-of-the-art results on well-established TAG datasets, including Cora, PubMed, ogbn-arxiv, as well as our newly introduced dataset, arXiv-2023. Furthermore, our method significantly speeds up training, achieving a 2.88 times improvement over the closest baseline on ogbn-arxiv. Lastly, we believe the versatility of the proposed method extends beyond TAGs and holds the potential to enhance other tasks involving graph-text data. Our codes and datasets are available at: https://github.com/XiaoxinHe/TAPE.	翻訳日:2023-10-25 09:03:53 公開日:2023-10-23
# 厳密なローカルUnion-Find Strictly local Union-Find ( http://arxiv.org/abs/2305.18534v3 ) ライセンス: Link先を確認	Tim Chan, Simon C. Benjamin	(参考訳) フォールトトレラント量子コンピューティングは、エラー訂正に必要なデコードを実行するために古典的なハードウェアを必要とする。Union-Findデコーダは最も優れた候補の1つである。これは、最近傍のステップを通じてデータ構造の成長とマージを伴う、非常に有機的な特徴を持ち、最近傍リンクを持つ単純なプロセッサの格子を用いた実現の可能性を自然に示唆している。このように計算負荷は、ほぼ理想的な並列性で分散することができる。ここでは、この厳密な(部分的ではない)局所性が実用的であることを初めて示し、最悪の場合のランタイムは $\mathcal O(d^3)$、平均ランタイムは表面コード距離 $d$ に対して2次未満である。従来提案されたアーキテクチャを単純化できる新しいパリティ計算方式を採用し,本手法は回路レベルの雑音に対して最適化されている。ローカル実現を長距離リンクで拡張したものと比較する。後者はもちろん高速ですが、ローカルな非同期ロジックはその差を打ち消す可能性があることに注意してください。 Fault-tolerant quantum computing requires classical hardware to perform the decoding necessary for error correction. The Union-Find decoder is one of the best candidates for this. It has remarkably organic characteristics, involving the growth and merger of data structures through nearest-neighbour steps; this naturally suggests the possibility of its realisation using a lattice of simple processors with nearest-neighbour links. In this way the computational load can be distributed with near-ideal parallelism. Here we show for the first time that this strict (rather than partial) locality is practical, with a worst-case runtime $\mathcal O(d^3)$ and mean runtime subquadratic in the surface code distance $d$. A novel parity-calculation scheme is employed which can simplify previously proposed architectures, and our approach is optimised for circuit-level noise. We compare our local realisation with one augmented by long-range links; while the latter is of course faster, we note that local asynchronous logic could negate the difference.	翻訳日:2023-10-25 09:02:50 公開日:2023-10-23
# LaFTer: 言語とラベルなしイメージコレクションを用いたゼロショット分類器のラベルなしチューニング LaFTer: Label-Free Tuning of Zero-shot Classifier using Language and Unlabeled Image Collections ( http://arxiv.org/abs/2305.18287v2 ) ライセンス: Link先を確認	M. Jehanzeb Mirza, Leonid Karlinsky, Wei Lin, Mateusz Kozinski, Horst Possegger, Rogerio Feris, Horst Bischof	(参考訳) 近年,大規模な事前学習型ビジョン・アンド・ランゲージ(VL)モデルでは,単純な言語プロンプトとして定義された潜在的に無制限なカテゴリの開語彙認識を可能にするゼロショット視覚分類において,新たな最先端(SOTA)が設定されている。しかし、これらの大きな進歩にもかかわらず、これらのゼロショット分類器の性能は、教師付き微調整で訓練された専用(閉圏集合)分類器の結果に及ばない。本稿では,ラベルのないVLデータとラベルなしのVLデータと,興味のあるカテゴリを記述したLarge Language Model (LLM) を用いて自動生成するテキストの集合を用いて,このギャップを初めて削減する方法を示し,それらのカテゴリのラベル付きビジュアルインスタンスを効果的に置換する。ラベルフリーアプローチを用いることで、ベースVLモデルのゼロショット性能や、さまざまなデータセット上での現代的な手法やベースラインよりも大幅にパフォーマンスが改善され、ラベルフリー環境では最大11.7%(平均3.8%)の絶対的な改善が示される。さらに,ラベルのないアプローチであっても,5ショットの監督を行うベースラインを先導する数ショットよりも平均1.3%向上する。 Recently, large-scale pre-trained Vision and Language (VL) models have set a new state-of-the-art (SOTA) in zero-shot visual classification enabling open-vocabulary recognition of potentially unlimited set of categories defined as simple language prompts. However, despite these great advances, the performance of these zeroshot classifiers still falls short of the results of dedicated (closed category set) classifiers trained with supervised fine tuning. In this paper we show, for the first time, how to reduce this gap without any labels and without any paired VL data, using an unlabeled image collection and a set of texts auto-generated using a Large Language Model (LLM) describing the categories of interest and effectively substituting labeled visual instances of those categories. Using our label-free approach, we are able to attain significant performance improvements over the zero-shot performance of the base VL model and other contemporary methods and baselines on a wide variety of datasets, demonstrating absolute improvement of up to 11.7% (3.8% on average) in the label-free setting. 
Moreover, despite our approach being label-free, we observe 1.3% average gains over leading few-shot prompting baselines that do use 5-shot supervision.	翻訳日:2023-10-25 09:02:09 公開日:2023-10-23
# im-promptu: イメージプロンプトからのコンテキスト内コンポジション Im-Promptu: In-Context Composition from Image Prompts ( http://arxiv.org/abs/2305.17262v3 ) ライセンス: Link先を確認	Bhishma Dedhia, Michael Chang, Jake C. Snell, Thomas L. Griffiths, Niraj K. Jha	(参考訳) 大規模な言語モデルは、少数のデモから様々なタスクを解決できる数少ない学習者です。この暗黙のタスクの理解は、単語トークンに対する注意のメカニズムが類推的推論に重要な役割を果たしていることを示唆している。本研究では,視覚刺激の構成可能な要素に対して,類似推論がコンテキスト内合成を可能にするかどうかを検討する。まず,視覚インコンテキスト学習者の一般化特性をテストするための3つのベンチマークスイートを提案する。アナロジーに基づくインコンテキスト学習の概念を定式化し,im-promptuと呼ばれるメタ学習フレームワークの設計に使用する。言語に必要なトークンの粒度は十分に確立されているが、視覚刺激における文脈内一般化を可能にするための適切な構成の粒度は、通常不明である。この目的のために、我々はim-promptuを使用して、ベクタ表現、パッチ表現、オブジェクトスロットなど、さまざまなレベルのコンポジション性を持つ複数のエージェントを訓練します。本実験は,合成規則を未知の領域に拡張する非構成的表現を用いて,外挿能力と構成性の程度とのトレードオフを明らかにする。パッチベースの表現は、堅牢な外挿のために全オブジェクトを含むパッチを必要とする。同時に、クロスアテンションモジュールと結合したオブジェクト中心のトークン化器は一貫性のある高忠実な解を生成し、これらの帰納的バイアスは合成の一般化に特に重要である。最後に,画像生成のための直感的なプログラミングインタフェースとしてim-promptuのユースケースを示す。 Large language models are few-shot learners that can solve diverse tasks from a handful of demonstrations. This implicit understanding of tasks suggests that the attention mechanisms over word tokens may play a role in analogical reasoning. In this work, we investigate whether analogical reasoning can enable in-context composition over composable elements of visual stimuli. First, we introduce a suite of three benchmarks to test the generalization properties of a visual in-context learner. We formalize the notion of an analogy-based in-context learner and use it to design a meta-learning framework called Im-Promptu. Whereas the requisite token granularity for language is well established, the appropriate compositional granularity for enabling in-context generalization in visual stimuli is usually unspecified. To this end, we use Im-Promptu to train multiple agents with different levels of compositionality, including vector representations, patch representations, and object slots. 
Our experiments reveal tradeoffs between extrapolation abilities and the degree of compositionality, with non-compositional representations extending learned composition rules to unseen domains but performing poorly on combinatorial tasks. Patch-based representations require patches to contain entire objects for robust extrapolation. At the same time, object-centric tokenizers coupled with a cross-attention module generate consistent and high-fidelity solutions, with these inductive biases being particularly crucial for compositional generalization. Lastly, we demonstrate a use case of Im-Promptu as an intuitive programming interface for image generation.	翻訳日:2023-10-25 09:01:49 公開日:2023-10-23
# 分割リカレント変圧器:効率的なシーケンス対シーケンスモデル Segmented Recurrent Transformer: An Efficient Sequence-to-Sequence Model ( http://arxiv.org/abs/2305.16340v3 ) ライセンス: Link先を確認	Yinghan Long, Sayeed Shafayet Chowdhury, Kaushik Roy	(参考訳) トランスフォーマーは、言語やビジョンを含むさまざまな領域で支配的なパフォーマンスを示している。しかし、計算コストはシーケンス長と二乗的に増大し、リソース制約のあるアプリケーションでは使用が禁止される。これに対応するために,本手法では,シーケンス全体をセグメントに分割し,個々のセグメントに注意を向ける。本稿では,セグメント化(局所的)注意と再帰的注意を組み合わせたセグメント化再帰変圧器(srformer)を提案する。注意窓の長さを減少させることによる損失は、繰り返し注目されるセグメント間で情報を集約することで補償される。 SRformerは、RAF(Recurrent Accumulate-and-Fire)ニューロン固有のメモリを利用して、キーと値の累積積積を更新する。分割された注意と軽量RAFニューロンは、提案したトランスの効率性を保証する。このようなアプローチは、より低い計算/メモリコストでシーケンシャルな処理能力を持つモデルにつながる。提案手法をT5およびBARTトランスに適用する。修正されたモデルは、CNN-dailymail、XSUM、ArXiv、MediaSUMなどの要約データセットでテストされる。特に、様々なサイズのセグメント入力を用いて、提案モデルは、セグメントトランスよりも6-22\%高いrouge1スコアを達成し、他の再帰トランスフォーマーアプローチよりも優れています。さらに,本モデルでは,全注意と比較してクロス注意の計算複雑性を約$40\%$削減する。 Transformers have shown dominant performance across a range of domains including language and vision. However, their computational cost grows quadratically with the sequence length, making their usage prohibitive for resource-constrained applications. To counter this, our approach is to divide the whole sequence into segments and apply attention to the individual segments. We propose a segmented recurrent transformer (SRformer) that combines segmented (local) attention with recurrent attention. The loss caused by reducing the attention window length is compensated by aggregating information across segments with recurrent attention. SRformer leverages Recurrent Accumulate-and-Fire (RAF) neurons' inherent memory to update the cumulative product of keys and values. The segmented attention and lightweight RAF neurons ensure the efficiency of the proposed transformer. Such an approach leads to models with sequential processing capability at a lower computation/memory cost. We apply the proposed method to T5 and BART transformers. 
The modified models are tested on summarization datasets including CNN-dailymail, XSUM, ArXiv, and MediaSUM. Notably, using segmented inputs of varied sizes, the proposed model achieves $6-22\%$ higher ROUGE1 scores than a segmented transformer and outperforms other recurrent transformer approaches. Furthermore, compared to full attention, the proposed model reduces the computational complexity of cross attention by around $40\%$.	翻訳日:2023-10-25 09:00:49 公開日:2023-10-23
# TEC-Net:医療画像分割のためのビジョントランスフォーマーエンブレス畳み込みニューラルネットワーク TEC-Net: Vision Transformer Embrace Convolutional Neural Networks for Medical Image Segmentation ( http://arxiv.org/abs/2306.04086v2 ) ライセンス: Link先を確認	Tao Lei, Rui Sun, Weichuan Zhang, Yong Wan, Yong Xia, Asoke K. Nandi	(参考訳) 畳み込みニューラルネットワーク(cnn)とトランスフォーマーのハイブリッドアーキテクチャは、医用画像セグメンテーションの最も一般的な方法である。しかし、ハイブリッドアーキテクチャに基づく既存のネットワークには2つの問題がある。第1に、cnnブランチは畳み込み操作によって画像局所的な特徴をキャプチャできるが、バニラ畳み込みは画像特徴の適応的な抽出を達成することができない。第2に、変圧器ブランチは画像のグローバル情報をモデル化できるが、従来のセルフアテンションは画像の空間的自己アテンションのみに焦点を当て、複雑な背景を持つ医療画像のセグメンテーション精度を低下させるチャンネルやクロス次元の自己アテンションを無視する。これらの問題を解決するために,医療画像セグメンテーション(TEC-Net)のための畳み込みニューラルネットワークを用いたビジョントランスフォーマーを提案する。我々のネットワークには2つの利点がある。まず、動的変形可能な畳み込み(DDConv)はCNNブランチで設計され、固定サイズの畳み込みカーネルを用いた適応的特徴抽出の難しさを克服するだけでなく、異なる入力が同じ畳み込みカーネルパラメータを共有する欠陥を解消し、CNNブランチの機能表現能力を効果的に改善する。第2に、Transformerブランチでは、パラメータや計算の少ない医用画像のクロス次元長距離依存性を完全に学習できるように、(シフト)ウィンドウ適応相補的注意モジュール((S)W-ACAM)とコンパクトな畳み込み投影を設計する。実験の結果,提案するTEC-Netは,CNNやTransformerネットワークを含むSOTA法よりも医用画像のセグメンテーションが優れていることがわかった。さらに、我々のTEC-Netはパラメータや計算コストを少なくし、事前学習に依存しない。コードはhttps://github.com/SR0920/TEC-Netで公開されている。 The hybrid architecture of convolution neural networks (CNN) and Transformer has been the most popular method for medical image segmentation. However, the existing networks based on the hybrid architecture suffer from two problems. First, although the CNN branch can capture image local features by using convolution operation, the vanilla convolution is unable to achieve adaptive extraction of image features. Second, although the Transformer branch can model the global information of images, the conventional self-attention only focuses on the spatial self-attention of images and ignores the channel and cross-dimensional self-attention leading to low segmentation accuracy for medical images with complex backgrounds. To solve these problems, we propose vision Transformer embrace convolutional neural networks for medical image segmentation (TEC-Net). Our network has two advantages. 
First, dynamic deformable convolution (DDConv) is designed in the CNN branch, which not only overcomes the difficulty of adaptive feature extraction using fixed-size convolution kernels, but also solves the defect that different inputs share the same convolution kernel parameters, effectively improving the feature expression ability of CNN branch. Second, in the Transformer branch, a (shifted)-window adaptive complementary attention module ((S)W-ACAM) and compact convolutional projection are designed to enable the network to fully learn the cross-dimensional long-range dependency of medical images with few parameters and calculations. Experimental results show that the proposed TEC-Net provides better medical image segmentation results than SOTA methods including CNN and Transformer networks. In addition, our TEC-Net requires fewer parameters and computational costs and does not rely on pre-training. The code is publicly available at https://github.com/SR0920/TEC-Net.	翻訳日:2023-10-25 08:52:08 公開日:2023-10-23
# 特定の問題や予算に最適なアクティブラーニング戦略を選択する方法 How to Select Which Active Learning Strategy is Best Suited for Your Specific Problem and Budget ( http://arxiv.org/abs/2306.03543v2 ) ライセンス: Link先を確認	Guy Hacohen, Daphna Weinshall	(参考訳) アクティブラーニング(AL)の領域では、学習者は事前に定義された予算制約の中で活動しながら、ラベルを探すためにラベルのない例を積極的に選択する。重要なことは、最近、異なるクエリ戦略が異なる条件と予算制約に適していることが示されている。実際には、与えられた状況に対する最も適切なal戦略の決定は未解決の問題である。そこで本研究では,予算の最適戦略を動的に識別する実用的なデリバティブベース手法を提案する。提案手法の直感的動機は, 簡易シナリオの理論解析によって得られる。次に,問題の特徴と利用可能な予算を考慮したAL戦略を動的に選択する手法を提案する。その結果,様々な予算やコンピュータビジョンタスクにまたがるアプローチの有効性が示された。 In the domain of Active Learning (AL), a learner actively selects which unlabeled examples to seek labels from an oracle, while operating within predefined budget constraints. Importantly, it has been recently shown that distinct query strategies are better suited for different conditions and budgetary constraints. In practice, the determination of the most appropriate AL strategy for a given situation remains an open problem. To tackle this challenge, we propose a practical derivative-based method that dynamically identifies the best strategy for a given budget. Intuitive motivation for our approach is provided by the theoretical analysis of a simplified scenario. We then introduce a method to dynamically select an AL strategy, which takes into account the unique characteristics of the problem and the available budget. Empirical results showcase the effectiveness of our approach across diverse budgets and computer vision tasks.	翻訳日:2023-10-25 08:51:11 公開日:2023-10-23
# 再帰フーリエ変換を用いた時間依存Schr\"{o}dinger方程式の解法 Decoupling the time dependent Schr\"{o}dinger equation using recursive Fourier transforms ( http://arxiv.org/abs/2306.03107v3 ) ライセンス: Link先を確認	Sky Nelson-Isaacs	(参考訳) 時間依存Schr\"{o}dinger方程式 (TDSE) や、より一般的にはダイソン級数 (Dyson Series) を再帰フーリエ変換を用いた畳み込み方程式として記述し、時間順序演算子を使わずに2次の積分を1次の積分から切り離す戦略を開発する。エネルギー分布は、1次と2次の標準的な摂動論の例で計算される。量子計算におけるボソニックサンプリングと4波混合のためのフォトニックスペクトルのキャラクタリゼーション、量子力学におけるバーディーントンネル振幅などの応用が考えられる。 A strategy is developed for writing the time-dependent Schr\"{o}dinger equation (TDSE), and more generally the Dyson Series, as a convolution equation using recursive Fourier transforms, thereby decoupling the second-order integral from the first without using the time ordering operator. The energy distribution is calculated for a number of standard perturbation theory examples at first and second order. Possible applications include characterization of photonic spectra for bosonic sampling and four-wave mixing in quantum computation, and Bardeen tunneling amplitude in quantum mechanics.	翻訳日:2023-10-25 08:50:58 公開日:2023-10-23
# cmexamによる大規模言語モデルのベンチマーク - 総合的な中国医学試験データセット Benchmarking Large Language Models on CMExam -- A Comprehensive Chinese Medical Exam Dataset ( http://arxiv.org/abs/2306.03030v3 ) ライセンス: Link先を確認	Junling Liu, Peilin Zhou, Yining Hua, Dading Chong, Zhongyu Tian, Andrew Liu, Helin Wang, Chenyu You, Zhenhua Guo, Lei Zhu, Michael Lingzhi Li	(参考訳) 大規模言語モデル(LLM)の最近の進歩は、質問応答(QA)の分野を変えている。しかし、標準化された包括的なデータセットがないため、医療分野におけるLCMの評価は困難である。このギャップに対処するため,中国国立医学ライセンス試験から得られたCMExamを紹介する。 CMExamは、標準化および客観的評価のための60K以上の多重選択質問と、オープンエンドなモデル推論評価のためのソリューション説明で構成されている。 llmsの詳細な分析のために、我々は医療専門家に、疾患グループ、臨床部門、医学分野、能力領域、質問難易度レベルを含む5つの追加の質問項目をラベル付けするよう求めた。データセットとともに,CMExam上で,代表LLMとQAアルゴリズムを用いた徹底的な実験を行った。その結果、GPT-4は61.6%、重み付きF1スコアは0.617であった。これらの結果は、人的精度が71.6%であったのに対して、大きな違いを示している。説明タスクでは、LCMは関連する推論を生成し、微調整後の性能向上を示すが、望ましい標準には達せず、改善の余地が十分にある。私たちの知る限り、CMExamは、包括的な医療アノテーションを提供する最初の中国の医学試験データセットです。 LLM評価の実験と結果はまた、中国の医療用QAシステムとLLM評価パイプラインの開発における課題と潜在的な解決策に関する貴重な知見を提供する。データセットと関連するコードはhttps://github.com/williamliujl/cmexamで入手できる。 Recent advancements in large language models (LLMs) have transformed the field of question answering (QA). However, evaluating LLMs in the medical field is challenging due to the lack of standardized and comprehensive datasets. To address this gap, we introduce CMExam, sourced from the Chinese National Medical Licensing Examination. CMExam consists of 60K+ multiple-choice questions for standardized and objective evaluations, as well as solution explanations for model reasoning evaluation in an open-ended manner. For in-depth analyses of LLMs, we invited medical professionals to label five additional question-wise annotations, including disease groups, clinical departments, medical disciplines, areas of competency, and question difficulty levels. Alongside the dataset, we further conducted thorough experiments with representative LLMs and QA algorithms on CMExam. The results show that GPT-4 had the best accuracy of 61.6% and a weighted F1 score of 0.617. 
These results highlight a great disparity when compared to human accuracy, which stood at 71.6%. For explanation tasks, while LLMs could generate relevant reasoning and demonstrate improved performance after finetuning, they fall short of a desired standard, indicating ample room for improvement. To the best of our knowledge, CMExam is the first Chinese medical exam dataset to provide comprehensive medical annotations. The experiments and findings of LLM evaluation also provide valuable insights into the challenges and potential solutions in developing Chinese medical QA systems and LLM evaluation pipelines. The dataset and relevant code are available at https://github.com/williamliujl/CMExam.	翻訳日:2023-10-25 08:50:43 公開日:2023-10-23
# 構造自由グラフ凝縮:大規模グラフから凝縮グラフ自由データへ Structure-free Graph Condensation: From Large-scale Graphs to Condensed Graph-free Data ( http://arxiv.org/abs/2306.02664v2 ) ライセンス: Link先を確認	Xin Zheng, Miao Zhang, Chunyang Chen, Quoc Viet Hung Nguyen, Xingquan Zhu, Shirui Pan	(参考訳) グラフ凝縮は、その置換として小さな凝縮グラフを合成することにより、大規模グラフのサイズを小さくするが、様々なグラフ学習タスクに即時利益をもたらす。しかし、既存のグラフ凝縮法は、凝縮グラフにおけるノードと構造の合同最適化に依存しており、有効性と一般化能力の重大な問題を見落としている。本稿では,大規模グラフを明示的なグラフ構造,すなわちグラフフリーなデータを持たない小さなグラフノードに抽出する,SFGCと呼ばれる新しい構造自由グラフ凝縮パラダイムを提案する。我々の考え方は、トポロジー構造情報を合成されたグラフフリーデータ内のノード属性に暗黙的にエンコードすることであり、トポロジーは同一性行列に還元される。具体的には,(1)小規模グラフフリーデータを効果的に合成する訓練軌道メタマッチングスキーム,(2)凝縮データの品質を動的に評価するグラフニューラルネットワーク特徴点スコアメトリックの2つの協調成分を含む。 SFGCはトラジェクトリメタマッチングのトレーニングを通じて、大規模グラフと縮合した小規模グラフフリーデータの間の長期GNN学習挙動を整合させ、グラフフリーデータへの情報的知識の包括的かつコンパクトな伝達を保証する。その後、基礎となる凝縮グラフ自由データは、凝縮グラフ自由データの優れた表現性を保証するための閉形式計量であるグラフ神経特徴スコアを用いて動的に評価される。拡張実験は、異なる凝縮比におけるSFGCの優越性を検証した。 Graph condensation, which reduces the size of a large-scale graph by synthesizing a small-scale condensed graph as its substitution, has immediate benefits for various graph learning tasks. However, existing graph condensation methods rely on the joint optimization of nodes and structures in the condensed graph, and overlook critical issues in effectiveness and generalization ability. In this paper, we advocate a new Structure-Free Graph Condensation paradigm, named SFGC, to distill a large-scale graph into a small-scale graph node set without explicit graph structures, i.e., graph-free data. Our idea is to implicitly encode topology structure information into the node attributes in the synthesized graph-free data, whose topology is reduced to an identity matrix. Specifically, SFGC contains two collaborative components: (1) a training trajectory meta-matching scheme for effectively synthesizing small-scale graph-free data; (2) a graph neural feature score metric for dynamically evaluating the quality of the condensed data. 
Through training trajectory meta-matching, SFGC aligns the long-term GNN learning behaviors between the large-scale graph and the condensed small-scale graph-free data, ensuring comprehensive and compact transfer of informative knowledge to the graph-free data. Afterward, the underlying condensed graph-free data would be dynamically evaluated with the graph neural feature score, which is a closed-form metric for ensuring the excellent expressiveness of the condensed graph-free data. Extensive experiments verify the superiority of SFGC across different condensation ratios.	翻訳日:2023-10-25 08:50:16 公開日:2023-10-23
# あらゆるものを高品質に分割する Segment Anything in High Quality ( http://arxiv.org/abs/2306.01567v2 ) ライセンス: Link先を確認	Lei Ke, Mingqiao Ye, Martin Danelljan, Yifan Liu, Yu-Wing Tai, Chi-Keung Tang, Fisher Yu	(参考訳) 最近のSegment Anything Model(SAM)は、セグメンテーションモデルをスケールアップする大きな飛躍であり、強力なゼロショット機能と柔軟なプロンプトを可能にする。 11億のマスクで訓練されているにもかかわらず、SAMのマスクの予測品質は多くの場合、特に複雑な構造を持つオブジェクトを扱う場合、不足している。本稿では,SAM の本来のプロンプト可能な設計,効率,ゼロショットの一般化性を維持しつつ,任意のオブジェクトを正確にセグメント化できる HQ-SAM を提案する。注意深い設計はSAMの事前訓練されたモデルの重みを再利用し保存し、最小限の追加パラメータと計算しか導入しない。 SAMのマスクデコーダに入力し,高品質なマスクを予測する学習可能な高品質出力トークンを設計する。マスクデコーダ機能にのみ適用する代わりに、マスクの詳細を改善するために、まず初期のViT機能と最後のViT機能を融合します。導入した学習可能なパラメータをトレーニングするために、複数のソースから44Kのきめ細かいマスクのデータセットを作成します。 HQ-SAMは、導入した44Kマスクのデータセットでのみトレーニングされており、8GPUで4時間しかかからない。ダウンストリームタスクにまたがる10種類のセグメンテーションデータセットでHQ-SAMの有効性を示し,そのうち8つをゼロショット転送プロトコルで評価した。私たちのコードと事前訓練されたモデルはhttps://github.com/SysCV/SAM-HQにある。 The recent Segment Anything Model (SAM) represents a big leap in scaling up segmentation models, allowing for powerful zero-shot capabilities and flexible prompting. Despite being trained with 1.1 billion masks, SAM's mask prediction quality falls short in many cases, particularly when dealing with objects that have intricate structures. We propose HQ-SAM, equipping SAM with the ability to accurately segment any object, while maintaining SAM's original promptable design, efficiency, and zero-shot generalizability. Our careful design reuses and preserves the pre-trained model weights of SAM, while only introducing minimal additional parameters and computation. We design a learnable High-Quality Output Token, which is injected into SAM's mask decoder and is responsible for predicting the high-quality mask. Instead of only applying it on mask-decoder features, we first fuse them with early and final ViT features for improved mask details. To train our introduced learnable parameters, we compose a dataset of 44K fine-grained masks from several sources. 
HQ-SAM is only trained on the introduced dataset of 44K masks, which takes only 4 hours on 8 GPUs. We show the efficacy of HQ-SAM in a suite of 10 diverse segmentation datasets across different downstream tasks, 8 of which are evaluated in a zero-shot transfer protocol. Our code and pretrained models are at https://github.com/SysCV/SAM-HQ.	翻訳日:2023-10-25 08:49:50 公開日:2023-10-23
# 時空間のレバレッジによる日頭太陽照度時系列予測の改善 Improving day-ahead Solar Irradiance Time Series Forecasting by Leveraging Spatio-Temporal Context ( http://arxiv.org/abs/2306.01112v2 ) ライセンス: Link先を確認	Oussama Boussif, Ghait Boukachab, Dan Assouline, Stefano Massaroli, Tianle Yuan, Loubna Benabbou, Yoshua Bengio	(参考訳) 太陽発電はCO$_{2}$の排出量を大幅に削減することで気候変動を緩和する大きな可能性を秘めている。それでも、太陽光の固有の変動は、太陽エネルギーを電力網にシームレスに統合する上で大きな課題となる。従来の研究の大半は、太陽の予測に時間的な時系列に基づく手法を採用することに集中してきたが、雲や周囲の物理的文脈などの要因を考慮に入れた研究はごく少数しかなかった。本稿では,衛星データを用いた時空間的コンテキストを活用した深層学習アーキテクチャを考案し,ghi(global horizontal irradiance)の予測に重点を置いた,任意の局に対する高精度な時系列予測を実現する。また,予測に付随する不確実性の指標として,各時間ステップ予測毎に分布を抽出する手法を提案する。モデルを評価する際には,重要な状況下でのモデル性能を捉えるために,特に困難な例を簡単な例から分離するテスト手法を提案する。さらに、複数の地理的に多様な太陽観測所から、太陽放射や関連する物理的変数を観測するための、大規模なゾーンと時系列に衛星画像を収集する新しいマルチモーダルデータセットを提案する。提案手法は、観測されていない太陽ステーションでのゼロショット一般化試験を含む太陽照射予測において堅牢な性能を示し、太陽エネルギーのグリッドへの効果的な統合を促進する上で非常に有望である。 Solar power harbors immense potential in mitigating climate change by substantially reducing CO$_{2}$ emissions. Nonetheless, the inherent variability of solar irradiance poses a significant challenge for seamlessly integrating solar power into the electrical grid. While the majority of prior research has centered on employing purely time series-based methodologies for solar forecasting, only a limited number of studies have taken into account factors such as cloud cover or the surrounding physical context. In this paper, we put forth a deep learning architecture designed to harness spatio-temporal context using satellite data, to attain highly accurate \textit{day-ahead} time-series forecasting for any given station, with a particular emphasis on forecasting Global Horizontal Irradiance (GHI). We also suggest a methodology to extract a distribution for each time step prediction, which can serve as a very valuable measure of uncertainty attached to the forecast. 
When evaluating models, we propose a testing scheme in which we separate particularly difficult examples from easy ones, in order to capture the model performances in crucial situations, which in the case of this study are the days suffering from varying cloudy conditions. Furthermore, we present a new multi-modal dataset gathering satellite imagery over a large zone and time series for solar irradiance and other related physical variables from multiple geographically diverse solar stations. Our approach exhibits robust performance in solar irradiance forecasting, including zero-shot generalization tests at unobserved solar stations, and holds great promise in promoting the effective integration of solar power into the grid.	翻訳日:2023-10-25 08:49:24 公開日:2023-10-23
# CLARA: 信頼できる対話型ロボットエージェントのためのユーザコマンドの分類と曖昧性解消 CLARA: Classifying and Disambiguating User Commands for Reliable Interactive Robotic Agents ( http://arxiv.org/abs/2306.10376v5 ) ライセンス: Link先を確認	Jeongeun Park, Seungwon Lim, Joonhyung Lee, Sangbeom Park, Minsuk Chang, Youngjae Yu and Sungjoon Choi	(参考訳) 本稿では,大規模言語モデル(LLM)を用いた対話型ロボットエージェントの文脈において,与えられたユーザコマンドが明確であるか,曖昧であるか,あるいは実行不可能であるかを推定することに焦点を当てる。この問題に対処するために,まず,コマンドが確実(明確)か不確実(曖昧または実行不可能)かを分類するためのLLMの不確実性推定法を提案する。コマンドが不確実であると分類されると、状況認識コンテキストを与えたLLMをゼロショット方式で活用し、曖昧なコマンドと実行不可能なコマンドとを区別する。曖昧なコマンドに対しては、LLMによる質問生成を通じてユーザと対話することで、コマンドの曖昧性を解消する。我々は、与えられたコマンドを適切に認識すると、ロボットの誤動作や望ましくない動作が減少し、対話型ロボットエージェントの信頼性が向上すると信じている。我々は,ロボットの状況認識のためのデータセットを提示する。これは高レベルコマンド,シーン記述,コマンドタイプのラベル(明確,曖昧,実行不可能)の組からなる。提案手法は,収集したデータセットとテーブルトップのピック・アンド・プレースシミュレーションを用いて検証した。最後に,実世界の人間とロボットのインタラクション実験,すなわちハンドオーバシナリオにおいて提案手法を実証する。 In this paper, we focus on inferring whether the given user command is clear, ambiguous, or infeasible in the context of interactive robotic agents utilizing large language models (LLMs). To tackle this problem, we first present an uncertainty estimation method for LLMs to classify whether the command is certain (i.e., clear) or not (i.e., ambiguous or infeasible). Once the command is classified as uncertain, we further classify it as ambiguous or infeasible, leveraging LLMs with situation-aware context in a zero-shot manner. For ambiguous commands, we disambiguate the command by interacting with users via question generation with LLMs. We believe that proper recognition of the given commands could lead to a decrease in malfunction and undesired actions of the robot, enhancing the reliability of interactive robot agents. We present a dataset for robotic situational awareness, consisting of pairs of high-level commands, scene descriptions, and labels of command type (i.e., clear, ambiguous, or infeasible). We validate the proposed method on the collected dataset, pick-and-place tabletop simulation. 
Finally, we demonstrate the proposed approach in real-world human-robot interaction experiments, i.e., handover scenarios.	翻訳日:2023-10-25 08:42:59 公開日:2023-10-23
# trojllm: 大きな言語モデルに対するブラックボックスのトロイの木馬攻撃 TrojLLM: A Black-box Trojan Prompt Attack on Large Language Models ( http://arxiv.org/abs/2306.06815v2 ) ライセンス: Link先を確認	Jiaqi Xue, Mengxin Zheng, Ting Hua, Yilin Shen, Yepeng Liu, Ladislau Boloni and Qian Lou	(参考訳) 大規模言語モデル(llm)は、様々なアプリケーションのための機械学習サービスやインターフェースツールとして徐々に利用されている。しかし、LLMのセキュリティへの影響、特に敵とトロイアの攻撃に関して、十分に検証されていない。本稿では,汎用かつステルス的なトリガを効果的に生成する自動ブラックボックスフレームワークであるTrojLLMを提案する。これらのトリガが入力データに組み込まれると、LSMの出力は悪意ある操作を行うことができる。さらに、フレームワークは個別のプロンプト内にトロイの木を埋め込むこともサポートし、トリガーの攻撃の全体的な効果と精度を高める。具体的には,少数のデータサンプルを用いて被害者llmベースのapiに問い合わせることで,様々な入力に対してユニバーサルトリガを生成するトリガー検出アルゴリズムを提案する。さらに,多種多様なモデルの有効性と伝達性を維持する毒素を発生させる新しいプログレッシブトロイの木馬毒アルゴリズムを導入する。 GPT-3.5 や GPT-4 などの実世界のブラックボックス LLM API において,TrojLLM をテキストプロンプトに効果的に挿入する能力を示すとともに,クリーンなテストセット上での例外的な性能を維持した。私たちの仕事は、現在のモデルの潜在的なセキュリティリスクに光を当て、潜在的な防御的アプローチを提供します。 TrojLLMのソースコードはhttps://github.com/UCF-ML-Research/TrojLLMで公開されている。 Large Language Models (LLMs) are progressively being utilized as machine learning services and interface tools for various applications. However, the security implications of LLMs, particularly in relation to adversarial and Trojan attacks, remain insufficiently examined. In this paper, we propose TrojLLM, an automatic and black-box framework to effectively generate universal and stealthy triggers. When these triggers are incorporated into the input data, the LLMs' outputs can be maliciously manipulated. Moreover, the framework also supports embedding Trojans within discrete prompts, enhancing the overall effectiveness and precision of the triggers' attacks. Specifically, we propose a trigger discovery algorithm for generating universal triggers for various inputs by querying victim LLM-based APIs using few-shot data samples. Furthermore, we introduce a novel progressive Trojan poisoning algorithm designed to generate poisoned prompts that retain efficacy and transferability across a diverse range of models. 
Our experiments and results demonstrate TrojLLM's capacity to effectively insert Trojans into text prompts in real-world black-box LLM APIs including GPT-3.5 and GPT-4, while maintaining exceptional performance on clean test sets. Our work sheds light on the potential security risks in current models and offers a potential defensive approach. The source code of TrojLLM is available at https://github.com/UCF-ML-Research/TrojLLM.	翻訳日:2023-10-25 08:41:34 公開日:2023-10-23
# 注意、コンパイル、そしてソルバーに基づくシンボリック分析は必要なすべて Attention, Compilation, and Solver-based Symbolic Analysis are All You Need ( http://arxiv.org/abs/2306.06755v2 ) ライセンス: Link先を確認	Prithwish Jana, Piyush Jha, Haoyang Ju, Gautham Kishore, Aryan Mahajan and Vijay Ganesh	(参考訳) 本稿では,大規模言語モデル(LLM)に基づくJava-to-Python (J2P) とPython-to-Java (P2J) のバックエンドコード変換手法と,CoTranと呼ばれる関連ツールを提案する。提案手法は,LLMの注意機構,コンパイル,シンボリックな実行ベーステスト生成を利用して,入力プログラムと出力プログラムの等価性テストを行う。より正確には、コンパイラとシンボリック実行損失を組み込むために、典型的なLLMトレーニングループを変更する。 CoTranと他の12のトランスパイラとLLMベースの翻訳ツールを57,000以上のJava-Python等価ペアのベンチマークで比較した広範な実験により、CoTranはコンパイルや実行時同値精度などの関連する指標において、それらよりも優れていることを示した。例えば、このツールはコンパイル精度97.43%、実行時等価精度49.66%、最も近いツールは92.84%、40.95%である。 In this paper, we present a Java-to-Python (J2P) and Python-to-Java (P2J) back-to-back code translation method, and an associated tool called CoTran, based on large language models (LLMs). Our method leverages the attention mechanism of LLMs, compilation, and symbolic execution-based test generation for equivalence testing between the input and output programs. More precisely, we modify the typical LLM training loop to incorporate compiler and symbolic execution loss. Via extensive experiments comparing CoTran with 12 other transpilers and LLM-based translation tools over a benchmark of more than 57,000 Java-Python equivalent pairs, we show that CoTran outperforms them on relevant metrics such as compilation and runtime equivalence accuracy. For example, our tool gets 97.43% compilation accuracy and 49.66% runtime equivalence accuracy for J2P translation, whereas the nearest competing tool only gets 92.84% and 40.95% respectively.	翻訳日:2023-10-25 08:40:48 公開日:2023-10-23
# 変分不均衡回帰:確率的平滑化による公平な不確かさの定量化 Variational Imbalanced Regression: Fair Uncertainty Quantification via Probabilistic Smoothing ( http://arxiv.org/abs/2306.06599v6 ) ライセンス: Link先を確認	Ziyan Wang, Hao Wang	(参考訳) 既存の回帰モデルは、ラベル分布が不均衡である場合、精度と不確実性の推定の両方において不足する傾向にある。本稿では,不均衡回帰でうまく機能するだけでなく,副産物として合理的な不確実性推定を行う,変分不均衡回帰(VIR)と呼ばれる確率的ディープラーニングモデルを提案する。 Different from typical variational autoencoders assuming I.I.D. representations (a data point's representation is not directly affected by other data points), our VIR borrows data with similar regression labels to compute the latent representation's variational distribution; furthermore, different from deterministic regression models producing point estimates, VIR predicts the entire normal-inverse-gamma distributions and modulates the associated conjugate distributions to impose probabilistic reweighting on the imbalanced data, thereby providing better uncertainty estimation. いくつかの実世界のデータセットにおける実験では、VIRは精度と不確実性推定の両方の観点から、最先端の不均衡回帰モデルよりも優れています。コードは間もなくhttps://github.com/Wang-ML-Lab/variational-imbalanced-regressionで公開される。 Existing regression models tend to fall short in both accuracy and uncertainty estimation when the label distribution is imbalanced. In this paper, we propose a probabilistic deep learning model, dubbed variational imbalanced regression (VIR), which not only performs well in imbalanced regression but naturally produces reasonable uncertainty estimation as a byproduct. Different from typical variational autoencoders assuming I.I.D. 
representations (a data point's representation is not directly affected by other data points), our VIR borrows data with similar regression labels to compute the latent representation's variational distribution; furthermore, different from deterministic regression models producing point estimates, VIR predicts the entire normal-inverse-gamma distributions and modulates the associated conjugate distributions to impose probabilistic reweighting on the imbalanced data, thereby providing better uncertainty estimation. Experiments in several real-world datasets show that our VIR can outperform state-of-the-art imbalanced regression models in terms of both accuracy and uncertainty estimation. Code will soon be available at https://github.com/Wang-ML-Lab/variational-imbalanced-regression.	翻訳日:2023-10-25 08:40:27 公開日:2023-10-23
# S-TLLR:STDPによるスパイクニューラルネットワークの時間的局所学習ルール S-TLLR: STDP-inspired Temporal Local Learning Rule for Spiking Neural Networks ( http://arxiv.org/abs/2306.15220v2 ) ライセンス: Link先を確認	Marco Paul E. Apolinario and Kaushik Roy	(参考訳) スパイキングニューラルネットワーク(SNN)は生物学的に妥当なモデルであり、特にシーケンシャルな学習タスクにおいて、エネルギー効率の高いインテリジェンスをエッジに展開するのに適していると認識されている。しかし、SNNの訓練は、正確な時間的および空間的信用割当を必要とするため、重大な課題となる。時間によるバックプロパゲーション (BPTT) アルゴリズムはこれらの問題に対処する最も広く使われている手法であるが、時間的依存のため計算コストが高い。本研究では,S-TLLRを提案する。S-TLLRは,Spike-Timing Dependent Plasticity(STDP)メカニズムにインスパイアされた,イベントベースの学習タスクにおける深層SNNのトレーニングを目的とした,3因子の時間的局所学習ルールである。さらに、S-TLLRは、時間ステップ数に依存しない低いメモリ複雑性と時間複雑性を持つように設計されており、低消費電力エッジデバイス上でのオンライン学習に適している。提案手法のスケーラビリティを実証するため,画像やジェスチャ認識,音声分類,光フロー推定など,幅広いアプリケーションを対象としたイベントベースデータセットの広範な評価を行った。全ての実験において、S-TLLRはBPTTに匹敵する高い精度を達成し、計算回数を$1.1\times$から$10\times$削減した。 Spiking Neural Networks (SNNs) are biologically plausible models that have been identified as potentially apt for deploying energy-efficient intelligence at the edge, particularly for sequential learning tasks. However, training of SNNs poses significant challenges due to the necessity for precise temporal and spatial credit assignment. The back-propagation through time (BPTT) algorithm, whilst the most widely used method for addressing these issues, incurs a high computational cost due to its temporal dependency. In this work, we propose S-TLLR, a novel three-factor temporal local learning rule inspired by the Spike-Timing Dependent Plasticity (STDP) mechanism, aimed at training deep SNNs on event-based learning tasks. Furthermore, S-TLLR is designed to have low memory and time complexities, which are independent of the number of time steps, rendering it suitable for online learning on low-power edge devices. To demonstrate the scalability of our proposed method, we have conducted extensive evaluations on event-based datasets spanning a wide range of applications, such as image and gesture recognition, audio classification, and optical flow estimation. In all the experiments, S-TLLR achieved high accuracy, comparable to BPTT, while reducing the number of computations by between $1.1\times$ and $10\times$.	翻訳日:2023-10-25 08:31:42 公開日:2023-10-23
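The S-TLLR abstract above describes a three-factor, temporally local learning rule whose memory cost does not grow with the number of time steps. The sketch below shows that general idea for a single synapse. It is a generic three-factor rule written for illustration, not the update rule from the paper: the function name `three_factor_update`, the parameters `lr` and `decay`, and the exact form of the eligibility term are all assumptions.

```python
def three_factor_update(pre_spikes, post_spikes, learning_signals, lr=0.1, decay=0.5):
    """Accumulate the weight change of one synapse over a spike sequence.

    Generic three-factor rule (an illustrative assumption, not the exact
    S-TLLR update): a local eligibility term multiplied by a learning signal.
    """
    pre_trace = 0.0  # running trace of presynaptic spikes; the only state kept across time
    delta_w = 0.0
    for pre, post, signal in zip(pre_spikes, post_spikes, learning_signals):
        # Factor 1: decaying trace of presynaptic activity.
        pre_trace = decay * pre_trace + pre
        # Factor 2: postsynaptic spike, giving an STDP-like pre-before-post coincidence.
        eligibility = pre_trace * post
        # Factor 3: learning signal that gates the local term.
        delta_w += lr * signal * eligibility
    return delta_w


# A presynaptic spike one step before a postsynaptic spike, with the learning
# signal active only at the postsynaptic spike time.
delta = three_factor_update([1, 0, 0], [0, 1, 0], [0.0, 1.0, 0.0])
```

Because only `pre_trace` and `delta_w` are carried from one step to the next, the memory used by this sketch stays constant however long the sequence is, which is the property the abstract emphasizes for online learning on edge devices.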
# GloptiNets: Certificatesによるスケーラブルな非凸最適化 GloptiNets: Scalable Non-Convex Optimization with Certificates ( http://arxiv.org/abs/2306.14932v2 ) ライセンス: Link先を確認	Gaspard Beugnot (PSL, DI-ENS), Julien Mairal, Alessandro Rudi (PSL, DI-ENS)	(参考訳) 本稿では,ハイパーキューブやトーラス上のスムーズな関数を扱う証明書を用いた非凸最適化手法を提案する。従来の代数的性質に依存する手法とは異なり、このアルゴリズムはフーリエスペクトルの減衰に内在する対象関数の正則性を利用する。抽出可能なモデルのファミリを定義することにより、正確な認証を取得し、ニューラルネットワークを最適化するために開発された高度な強力な計算技術を活用することができる。このように、我々のアプローチのスケーラビリティはGPUによる並列コンピューティングによって自然に向上します。我々のアプローチは、中等次元の多項式に適用されるが、数千の係数を持つ場合、ラッサールの階層に基づく証明による最先端の最適化手法よりも優れ、競合相手にとって難解な問題に対処する。 We present a novel approach to non-convex optimization with certificates, which handles smooth functions on the hypercube or on the torus. Unlike traditional methods that rely on algebraic properties, our algorithm exploits the regularity of the target function intrinsic in the decay of its Fourier spectrum. By defining a tractable family of models, we allow at the same time to obtain precise certificates and to leverage the advanced and powerful computational techniques developed to optimize neural networks. In this way the scalability of our approach is naturally enhanced by parallel computing with GPUs. Our approach, when applied to the case of polynomials of moderate dimensions but with thousands of coefficients, outperforms the state-of-the-art optimization methods with certificates, as the ones based on Lasserre's hierarchy, addressing problems intractable for the competitors.	翻訳日:2023-10-25 08:31:15 公開日:2023-10-23
# ポケット特異的分子生成とエレーボレーションのための機能群に基づく拡散 Functional-Group-Based Diffusion for Pocket-Specific Molecule Generation and Elaboration ( http://arxiv.org/abs/2306.13769v2 ) ライセンス: Link先を確認	Haitao Lin, Yufei Huang, Odin Zhang, Lirong Wu, Siyuan Li, Zhiyuan Chen, Stan Z. Li	(参考訳) 近年、標的タンパク質のポケットの構造から分子を生成するためにAIによる薬物設計法が提案されている。その多くは原子レベルに基づく手法であり、原子を基本成分とみなし、原子の位置と型を生成する。しかし、このように複雑な構造を持つ現実的な断片を生成することは困難である。そこで本稿では,ポケット特異的分子生成・創製のための関数群に基づく拡散モデルであるd3fgを提案する。 d3fgは分子を2つの構成要素に分解する: 剛体として定義される官能基と質量点としてリンカーである。そしてこの2種類の成分は、リガンドとタンパク質の相互作用を強化する複雑な断片を形成することができる。具体的には、拡散過程において、D3FGは、コンポーネントの位置、向き、タイプのデータ分布を事前分布に拡散させ、生成過程において、設計された同変グラフニューラルネットワークでパラメータ化されたデノイザーにより、3変数からノイズを徐々に除去する。実験では, より現実的な3次元構造, タンパク質標的に対する競合親和性, 薬物特性の良好な分子を生成できる。さらに、D3FGは分子の発見の新たな課題の解決策として、既存のリガンドと標的タンパク質のホットスポットに基づいて高い親和性を持つ分子を生成することができる。 In recent years, AI-assisted drug design methods have been proposed to generate molecules given the pockets' structures of target proteins. Most of them are atom-level-based methods, which consider atoms as basic components and generate atom positions and types. In this way, however, it is hard to generate realistic fragments with complicated structures. To solve this, we propose D3FG, a functional-group-based diffusion model for pocket-specific molecule generation and elaboration. D3FG decomposes molecules into two categories of components: functional groups defined as rigid bodies and linkers as mass points. And the two kinds of components can together form complicated fragments that enhance ligand-protein interactions. To be specific, in the diffusion process, D3FG diffuses the data distribution of the positions, orientations, and types of the components into a prior distribution; In the generative process, the noise is gradually removed from the three variables by denoisers parameterized with designed equivariant graph neural networks. 
In the experiments, our method generates molecules with more realistic 3D structures, competitive affinities toward the protein targets, and better drug properties. In addition, as a solution to the new task of molecule elaboration, D3FG can generate molecules with high affinities based on existing ligands and the hotspots of target proteins.	翻訳日:2023-10-25 08:31:01 公開日:2023-10-23
# 見えないモダリティインタラクションを学ぶ Learning Unseen Modality Interaction ( http://arxiv.org/abs/2306.12795v2 ) ライセンス: Link先を確認	Yunhua Zhang and Hazel Doughty and Cees G.M. Snoek	(参考訳) マルチモーダル学習は,クロスモーダルな対応関係を学習するために,関心のあるすべてのモダリティの組み合わせが学習中に利用可能であると仮定している。本稿では,このモダリティ完全仮定に挑戦し,推論時の未知のモダリティの組み合わせへの一般化を目指す。我々は,未知のモダリティ相互作用の問題を提起し,第1の解決法を提案する。異なるモダリティの多次元的特徴を、豊富な情報を保存した共通空間に投影するモジュールを利用する。これにより、情報は利用可能なモダリティにまたがる単純な和演算で蓄積される。トレーニング中に判別性の低いモダリティの組み合わせへ過学習することを抑えるために、モダリティ予測の信頼性を示す擬似スーパービジョンを用いてモデル学習をさらに改善する。本手法は,マルチモーダル映像分類,ロボット状態回帰,マルチメディア検索において,多様なタスクやモダリティに対して有効であることを示す。プロジェクトwebサイト: https://xiaobai1217.github.io/Unseen-Modality-Interaction/ Multimodal learning assumes all modality combinations of interest are available during training to learn cross-modal correspondences. In this paper, we challenge this modality-complete assumption for multimodal learning and instead strive for generalization to unseen modality combinations during inference. We pose the problem of unseen modality interaction and introduce a first solution. It exploits a module that projects the multidimensional features of different modalities into a common space with rich information preserved. This allows the information to be accumulated with a simple summation operation across available modalities. To reduce overfitting to less discriminative modality combinations during training, we further improve the model learning with pseudo-supervision indicating the reliability of a modality's prediction. We demonstrate that our approach is effective for diverse tasks and modalities by evaluating it for multimodal video classification, robot state regression, and multimedia retrieval. Project website: https://xiaobai1217.github.io/Unseen-Modality-Interaction/.	翻訳日:2023-10-25 08:30:12 公開日:2023-10-23
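The abstract above describes projecting the features of each modality into a common space and accumulating them by simple summation over whichever modalities are available. A minimal sketch of that fusion step follows. The linear projections, the helper names `project` and `fuse`, and the toy dimensions are assumptions made for illustration; the paper's projection module is learned and is not reproduced here.

```python
def project(features, weights):
    """Linear projection; `weights` holds one row per output dimension."""
    return [sum(w * x for w, x in zip(row, features)) for row in weights]


def fuse(sample, projections):
    """Sum the projected features of whichever modalities are present in `sample`."""
    fused = None
    for name, features in sample.items():
        z = project(features, projections[name])
        fused = z if fused is None else [a + b for a, b in zip(fused, z)]
    if fused is None:
        raise ValueError("sample contains no modalities")
    return fused


# Hypothetical projections: 3-d video features and 2-d audio features
# are both mapped into a shared 2-d space.
projections = {
    "video": [[1, 0, 0], [0, 1, 0]],
    "audio": [[1, 1], [0, 2]],
}

video_only = fuse({"video": [1, 2, 3]}, projections)
both = fuse({"video": [1, 2, 3], "audio": [1, 1]}, projections)
```

Since summation is defined for any subset of modalities, the same `fuse` call handles a combination that never appeared together during training, which is the setting the abstract targets.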
# 自然視覚シーンに対する神経反応の時間的コンディショニングスパイク潜在変数モデル Temporal Conditioning Spiking Latent Variable Models of the Neural Response to Natural Visual Scenes ( http://arxiv.org/abs/2306.12045v4 ) ライセンス: Link先を確認	Gehua Ma, Runhao Jiang, Rui Yan, Huajin Tang	(参考訳) 神経応答の計算モデルの開発は、感覚処理と神経計算を理解する上で重要である。現在の最先端のニューラルネットワーク手法は、時間的依存関係を処理するために時間的フィルタを使用し、非現実的で柔軟性に欠ける処理パラダイムをもたらす。一方、これらの方法は試行平均の発火率を目標とし、スパイク列の重要な特徴を捉えられなかった。本研究は, 時間条件付スパイキング潜在変数モデル(TeCoS-LVM)を提示し, 自然視覚刺激に対する神経応答をシミュレートする。我々はスパイキングニューロンを用いて、記録されたスパイク列と直接一致するスパイク出力を生成する。このアプローチは、オリジナルのスパイク列に埋め込まれた情報を失うのを避けるのに役立つ。モデルパラメータ空間から時間次元を除外し、時間条件付き操作を導入し、モデルが自然パラダイムにおける刺激配列の時間依存性を適応的に探索し活用できるようにする。 TeCoS-LVMモデルはより現実的なスパイクアクティビティを生成でき、強力な代替品よりもスパイク統計に正確に適合する。さらに、学習したTeCoS-LVMモデルは、より長い時間スケールでうまく一般化することができる。全体として、計算可能でありながら、我々のモデルは、神経符号化システムの重要な特徴を効果的に捉えている。これにより、様々な感覚知覚回路の正確な予測計算アカウントを構築するための有用なツールを提供する。 Developing computational models of neural response is crucial for understanding sensory processing and neural computations. Current state-of-the-art neural network methods use temporal filters to handle temporal dependencies, resulting in an unrealistic and inflexible processing paradigm. Meanwhile, these methods target trial-averaged firing rates and fail to capture important features in spike trains. This work presents the temporal conditioning spiking latent variable models (TeCoS-LVM) to simulate the neural response to natural visual stimuli. We use spiking neurons to produce spike outputs that directly match the recorded trains. This approach helps to avoid losing information embedded in the original spike trains. We exclude the temporal dimension from the model parameter space and introduce a temporal conditioning operation to allow the model to adaptively explore and exploit temporal dependencies in stimuli sequences in a natural paradigm. We show that TeCoS-LVM models can produce more realistic spike activities and fit spike statistics more accurately than powerful alternatives. Additionally, learned TeCoS-LVM models can generalize well to longer time scales. Overall, while remaining computationally tractable, our model effectively captures key features of neural coding systems. It thus provides a useful tool for building accurate predictive computational accounts for various sensory perception circuits.	翻訳日:2023-10-25 08:29:54 公開日:2023-10-23
# Bullying10K: プライバシ保護型いじめ認識のための大規模ニューロモルフィックデータセット Bullying10K: A Large-Scale Neuromorphic Dataset towards Privacy-Preserving Bullying Recognition ( http://arxiv.org/abs/2306.11546v2 ) ライセンス: Link先を確認	Yiting Dong, Yang Li, Dongcheng Zhao, Guobin Shen, Yi Zeng	(参考訳) 日常生活における暴力の流行は、個人の身体的および精神的健康に重大な脅威をもたらす。公共空間での監視カメラの使用は、このような事件を積極的に抑止し防止するのに有効であることが証明されている。しかし、広く展開されていることから、プライバシ侵害に関する懸念が生じている。この問題に対処するために、ダイナミックビジョンセンサー(DVS)カメラを使用して暴力的なインシデントを検出し、静的画像の代わりにピクセル輝度の変動をキャプチャするのでプライバシーを保護する。我々は,現実のシナリオから様々な行動や複雑な動き,オクルージョンを包含するBullying10Kデータセットを紹介する。アクション認識、時間的アクションローカライゼーション、ポーズ推定という3つのタスクを評価するためのベンチマークを提供する。 1万のイベントセグメント、合計120億のイベントと255GBのデータを持つBullying10Kは、暴力検出と個人のプライバシー保護のバランスをとることで大きく貢献する。またそれは、ニューロモルフィックなデータセットにも挑戦する。プライバシー保護ビデオシステムを訓練し、開発するための貴重なリソースとなる。 Bullying10Kは、これらの領域における革新的なアプローチの新たな可能性を開く。 The prevalence of violence in daily life poses significant threats to individuals' physical and mental well-being. Using surveillance cameras in public spaces has proven effective in proactively deterring and preventing such incidents. However, concerns regarding privacy invasion have emerged due to their widespread deployment. To address the problem, we leverage Dynamic Vision Sensor (DVS) cameras to detect violent incidents and preserve privacy, since they capture pixel brightness variations instead of static imagery. We introduce the Bullying10K dataset, encompassing various actions, complex movements, and occlusions from real-life scenarios. It provides three benchmarks for evaluating different tasks: action recognition, temporal action localization, and pose estimation. With 10,000 event segments, totaling 12 billion events and 255 GB of data, Bullying10K contributes significantly by balancing violence detection and the preservation of personal privacy. It also poses a challenge to the neuromorphic dataset. It will serve as a valuable resource for training and developing privacy-protecting video systems. Bullying10K opens new possibilities for innovative approaches in these domains.	翻訳日:2023-10-25 08:29:34 公開日:2023-10-23
# 雑音量子コンピューティングデバイスにおける高精度画像生成 Precise Image Generation on Current Noisy Quantum Computing Devices ( http://arxiv.org/abs/2307.05253v4 ) ライセンス: Link先を確認	Florian Rehm, Sofia Vallecorsa, Kerstin Borras, Dirk Krücker, Michele Grossi, Valle Varo	(参考訳) 量子アングルジェネレータ(QAG)は、現在のノイズのある中間スケール量子(NISQ)デバイス上で正確な画像を生成するために設計された、新しいフル量子機械学習モデルである。変動量子回路はQAGモデルのコアを形成し、様々な回路アーキテクチャを評価する。いわゆるMERA-upsamplingアーキテクチャと組み合わせて、QAGモデルは優れた結果を得ることができ、詳細な分析と評価を行う。我々の知る限り、量子モデルがそのような正確な結果を得たのはこれが初めてである。モデルの雑音に対するロバスト性を調べるために、広範囲な量子ノイズ研究を行う。本稿では,物理量子デバイスでトレーニングしたモデルがハードウェアのノイズ特性を学習し,優れた結果が得られることを示す。トレーニング中に最大8%の量子ハードウェアマシンキャリブレーションの変化があっても、十分に許容できることが確認された。実証として、このモデルは、粒子エネルギーを測定し、最終的にCERNの大型ハドロン衝突型加速器で未知の粒子を発見するために必要となる、高エネルギー物理学における不可欠なシミュレーションに用いられる。 The Quantum Angle Generator (QAG) is a new full Quantum Machine Learning model designed to generate accurate images on current Noisy Intermediate-Scale Quantum (NISQ) devices. Variational quantum circuits form the core of the QAG model, and various circuit architectures are evaluated. In combination with the so-called MERA-upsampling architecture, the QAG model achieves excellent results, which are analyzed and evaluated in detail. To our knowledge, this is the first time that a quantum model has achieved such accurate results. To explore the robustness of the model to noise, an extensive quantum noise study is performed. In this paper, it is demonstrated that the model trained on a physical quantum device learns the noise characteristics of the hardware and generates outstanding results. It is verified that even a quantum hardware machine calibration change during training of up to 8% can be well tolerated. For demonstration, the model is employed in indispensable simulations in high energy physics required to measure particle energies and, ultimately, to discover unknown particles at the Large Hadron Collider at CERN.	翻訳日:2023-10-25 08:22:23 公開日:2023-10-23
# テスト時間領域一般化のための変分隣接ラベルの学習 Learning Variational Neighbor Labels for Test-Time Domain Generalization ( http://arxiv.org/abs/2307.04033v2 ) ライセンス: Link先を確認	Sameer Ambekar, Zehao Xiao, Jiayi Shen, Xiantong Zhen, Cees G. M. Snoek	(参考訳) 本稿では,モデルが対象領域にデプロイされる前に,ソースドメインのみをトレーニングするドメインの一般化について述べる。我々は、ソーストレーニングとターゲットテストの厳密な分離に従うが、推論中にラベル付けされていないターゲットデータ自体の価値を利用する。我々は3つの貢献をした。まず,実験時に対象領域に学習したモデルを一般化するために,対象サンプルの確率論的擬似ラベル化を提案する。一般化中の不確実性を考慮した分布として擬似ラベルをモデル化し、不正確な擬似ラベルの誤解を招く信号を緩和することにより、テスト時の一般化を変分推論問題として定式化する。次に,より堅牢な擬似ラベルを生成するために,隣接する対象サンプルの情報を含む変分隣接ラベルを学習する。第3に、より代表的対象情報を組み込んで、より正確で頑健な近隣ラベルを生成する能力を学ぶために、一般化手順をシミュレートする訓練中にメタ一般化ステージを導入する。 6つの広く利用されているデータセットの実験は、提案の利点、能力、有効性を示している。 This paper strives for domain generalization, where models are trained exclusively on source domains before being deployed at unseen target domains. We follow the strict separation of source training and target testing but exploit the value of the unlabeled target data itself during inference. We make three contributions. First, we propose probabilistic pseudo-labeling of target samples to generalize the source-trained model to the target domain at test time. We formulate the generalization at test time as a variational inference problem by modeling pseudo labels as distributions to consider the uncertainty during generalization and alleviate the misleading signal of inaccurate pseudo labels. Second, we learn variational neighbor labels that incorporate the information of neighboring target samples to generate more robust pseudo labels. Third, to learn the ability to incorporate more representative target information and generate more precise and robust variational neighbor labels, we introduce a meta-generalization stage during training to simulate the generalization procedure. Experiments on six widely-used datasets demonstrate the benefits, abilities, and effectiveness of our proposal.	翻訳日:2023-10-25 08:22:03 公開日:2023-10-23
# ニューラルネットワークデコーダによる表面実験 Neural network decoder for near-term surface-code experiments ( http://arxiv.org/abs/2307.03280v2 ) ライセンス: Link先を確認	Boris M. Varbanov, Marc Serra-Peralta, David Byfield, Barbara M. Terhal	(参考訳) ニューラルネットワークデコーダは、表面コードをデコードする際に、従来のデコーダよりも低い論理エラー率を達成することができる。さらに、これらのデコーダは物理エラー率に関する事前情報を必要としないため、高度に適応可能である。本研究では,トランスモン量子ビットプロセッサから得られたシミュレーションデータと実験データの両方を用いて,小型表面符号に着目したデコーダの性能について検討する。最初に、ニューラルネットワークが典型的には、マッチするデコーダよりも優れた処理エラーにより、例えば$Y$エラーなど、複数の相関したシンドローム欠陥につながることが示される。 Google Quantum AI, Nature 614, 676 (2023)]の実験データに適用すると、ニューラルネットワークデコーダは最小ウェイト完全マッチングよりも約25\%$低い論理誤差率を達成し、最大ライクなデコーダのパフォーマンスにアプローチする。このデコーダの柔軟性を実証するために、トランスモン量子ビットのアナログ読み出しで利用できるソフト情報を組み込んで、対称ガウスノイズモデルを用いてシミュレーションにおいてこのデコーダの性能を評価する。ソフトな情報を考えると、測定誤差の確率に応じて、約10〜%の論理誤差率が低下する。優れた論理性能、柔軟性、計算効率により、ニューラルネットワークデコーダは量子メモリの短期的な実証に適している。 Neural-network decoders can achieve a lower logical error rate compared to conventional decoders, like minimum-weight perfect matching, when decoding the surface code. Furthermore, these decoders require no prior information about the physical error rates, making them highly adaptable. In this study, we investigate the performance of such a decoder using both simulated and experimental data obtained from a transmon-qubit processor, focusing on small-distance surface codes. We first show that the neural network typically outperforms the matching decoder due to better handling errors leading to multiple correlated syndrome defects, such as $Y$ errors. When applied to the experimental data of [Google Quantum AI, Nature 614, 676 (2023)], the neural network decoder achieves logical error rates approximately $25\%$ lower than minimum-weight perfect matching, approaching the performance of a maximum-likelihood decoder. To demonstrate the flexibility of this decoder, we incorporate the soft information available in the analog readout of transmon qubits and evaluate the performance of this decoder in simulation using a symmetric Gaussian-noise model. 
Considering the soft information leads to an approximately $10\%$ lower logical error rate, depending on the probability of a measurement error. The good logical performance, flexibility, and computational efficiency make neural network decoders well-suited for near-term demonstrations of quantum memories.	翻訳日:2023-10-25 08:21:26 公開日:2023-10-23
# 単一制約木を用いたマルチエージェントターゲット割り当てと経路探索の解法 Solving Multi-Agent Target Assignment and Path Finding with a Single Constraint Tree ( http://arxiv.org/abs/2307.00663v2 ) ライセンス: Link先を確認	Yimin Tang, Zhongqiang Ren, Jiaoyang Li, Katia Sycara	(参考訳) 目標割り当て問題と経路探索問題(tapf: target-assignment and path-finding problem)は、エージェントに対して同時にターゲットを割り当てることと、エージェントの開始位置から割り当てられたターゲットへの衝突のない経路を計画することである。 TAPFに対処するための主要なアプローチとして、CBS-TA(Conflict-Based Search with Target Assignment)は、K-bestターゲットの割り当てを利用して複数の検索ツリーを作成し、CBS(Conflict-Based Search)は各検索ツリーの衝突を解決する。最適解を見つけることができる一方で、cbs-taは複数の木で重複する衝突解決とk-best代入の高価な計算のためにスケーラビリティに苦しむ。そこで我々は,この2つの計算ボトルネックを回避するために,Incremental Target Assignment CBS (ITA-CBS) を開発した。 ITA-CBSは、単一の検索ツリーのみを生成し、検索中に新しい1-bestの割り当てをインクリメンタルに計算することで、K-bestの割り当ての計算を避ける。我々は,理論上,ITA-CBSは最適解を見つけることが保証され,実際は計算効率が高いことを示す。 Combined Target-Assignment and Path-Finding problem (TAPF) requires simultaneously assigning targets to agents and planning collision-free paths for agents from their start locations to their assigned targets. As a leading approach to address TAPF, Conflict-Based Search with Target Assignment (CBS-TA) leverages both K-best target assignments to create multiple search trees and Conflict-Based Search (CBS) to resolve collisions in each search tree. While being able to find an optimal solution, CBS-TA suffers from scalability due to the duplicated collision resolution in multiple trees and the expensive computation of K-best assignments. We therefore develop Incremental Target Assignment CBS (ITA-CBS) to bypass these two computational bottlenecks. ITA-CBS generates only a single search tree and avoids computing K-best assignments by incrementally computing new 1-best assignments during the search. We show that, in theory, ITA-CBS is guaranteed to find an optimal solution and, in practice, is computationally efficient.	翻訳日:2023-10-25 08:20:08 公開日:2023-10-23
# メタ学習生成モデルによるニューラルネットワークの正規化 Regularizing Neural Networks with Meta-Learning Generative Models ( http://arxiv.org/abs/2307.13899v2 ) ライセンス: Link先を確認	Shin'ya Yamaguchi, Daiki Chijiwa, Sekitoshi Kanai, Atsutoshi Kumagai, Hisashi Kashima	(参考訳) 本稿では,深層学習のための生成データ拡張を改善する手法について検討する。生成データ拡張は、生成モデルによって生成された合成サンプルを、小さなデータセット設定で分類するための追加データセットとして活用する。生成データ拡張の重要な課題は、合成データが精度を低下させる有益でないサンプルを含むことである。これは、合成サンプルが実際のデータのクラスカテゴリを完全に表現しておらず、一様サンプリングが必ずしもタスクに有用なサンプルを提供していないためである。本稿では,メタ生成正則化(Meta Generative regularization, MGR)と呼ばれる新しい生成データ拡張戦略を提案する。生成データ拡張の劣化を避けるため、MGRは損失関数(例えばクロスエントロピー)ではなく、特徴抽出器の正規化項で合成サンプルを利用する。これらの合成サンプルはメタラーニングによる検証損失を最小限に抑えるために動的に決定される。我々は,MGRが単純な生成データ拡張の性能劣化を回避し,ベースラインを向上できることを示した。 6つのデータセットに関する実験は、特にデータセットが小さい場合にMGRが有効であり、ベースラインを安定して上回ることを示した。 This paper investigates methods for improving generative data augmentation for deep learning. Generative data augmentation leverages the synthetic samples produced by generative models as an additional dataset for classification with small dataset settings. A key challenge of generative data augmentation is that the synthetic data contain uninformative samples that degrade accuracy. This is because the synthetic samples do not perfectly represent class categories in real data and uniform sampling does not necessarily provide useful samples for tasks. In this paper, we present a novel strategy for generative data augmentation called meta generative regularization (MGR). To avoid the degradation of generative data augmentation, MGR utilizes synthetic samples in the regularization term for feature extractors instead of in the loss function, e.g., cross-entropy. These synthetic samples are dynamically determined to minimize the validation losses through meta-learning. We observed that MGR can avoid the performance degradation of naive generative data augmentation and boost the baselines. Experiments on six datasets showed that MGR is effective particularly when datasets are smaller and stably outperforms baselines.	翻訳日:2023-10-25 08:12:06 公開日:2023-10-23
# 多様なオフライン模倣学習 Diverse Offline Imitation Learning ( http://arxiv.org/abs/2307.11373v2 ) ライセンス: Link先を確認	Marin Vlastelica, Jin Cheng, Georg Martius, Pavel Kolev	(参考訳) 多様な情報理論の目的を多様性の尺度として活用し、教師なしのスキル発見の領域では近年大きく進歩している。現在の方法は、重要なオンラインインタラクションを必要とし、膨大な量のタスクに依存しないデータを活用できず、一般的にはスキルの有用性の定量的指標が欠如している。我々は,非教師付きスキル発見のための原則付きオフラインアルゴリズムを提案することで,これらの課題に対処し,多様性を最大化するとともに,各学習スキルが状態限定のエキスパートデモンストレーションをある程度模倣することを保証する。本研究の主な分析的貢献は、フェンシェル双対性、強化学習、教師なしスキル発見を結合し、kl-divergence状態の制約を受ける相互情報目標を最大化することである。さらに,本手法の標準オフラインベンチマークD4RLと,シミュレーションで訓練されたポリシーを実際のロボットシステムに適切に伝達する12-DoF四足歩行ロボットから収集したカスタムオフラインデータセットに対する有効性を示す。 There has been significant recent progress in the area of unsupervised skill discovery, utilizing various information-theoretic objectives as measures of diversity. Despite these advances, challenges remain: current methods require significant online interaction, fail to leverage vast amounts of available task-agnostic data and typically lack a quantitative measure of skill utility. We address these challenges by proposing a principled offline algorithm for unsupervised skill discovery that, in addition to maximizing diversity, ensures that each learned skill imitates state-only expert demonstrations to a certain degree. Our main analytical contribution is to connect Fenchel duality, reinforcement learning, and unsupervised skill discovery to maximize a mutual information objective subject to KL-divergence state occupancy constraints. Furthermore, we demonstrate the effectiveness of our method on the standard offline benchmark D4RL and on a custom offline dataset collected from a 12-DoF quadruped robot for which the policies trained in simulation transfer well to the real robotic system.	翻訳日:2023-10-25 08:11:09 公開日:2023-10-23
# 大規模言語モデル研究におけるトピックス,著者,ネットワーク:17K arXiv論文の動向 Topics, Authors, and Networks in Large Language Model Research: Trends from a Survey of 17K arXiv Papers ( http://arxiv.org/abs/2307.10700v2 ) ライセンス: Link先を確認	Rajiv Movva, Sidhika Balachandar, Kenny Peng, Gabriel Agostini, Nikhil Garg, Emma Pierson	(参考訳) 大規模言語モデル(llm)の研究は社会に劇的に影響を与えており、その優先するトピックや価値、それを推進する著者や機関、コラボレーションのネットワークを理解することが不可欠である。この分野の最近の成長により、これらの基本的な特性の多くは体系的な記述を欠いている。我々は2023年と2018-2022年の変化に注目し,16,979 llm関連arxiv論文のデータセットを収集,注釈,分析した。 LLM研究は、社会への影響にますます焦点が当てられている。2023年には、コンピュータと社会のサブarXivは、LSM関連の論文の割合で20倍に成長している。 2023年の論文の大半は、これまでLSM関連の論文を書いていない研究者によって最初に書かれたものであり、これらの論文は特に応用と社会的考察に焦点を当てている。少数の企業が大きな影響力を持つ一方で、アカデミアは業界全体よりもはるかに大きな論文を出版しており、このギャップは2023年に拡大している。 LLMの研究は社会的ダイナミクスによっても形作られており、著者が優先するトピックには性別と学術的・産業的な違いがあり、コラボレーションネットワークでは米国と中国の分裂が激化している。概して,LLM研究は社会が形成・形成し,社会技術レンズの必要性を証明し,研究者や政策立案者への影響を論じる。 Large language model (LLM) research is dramatically impacting society, making it essential to understand the topics and values it prioritizes, the authors and institutions driving it, and its networks of collaboration. Due to the recent growth of the field, many of these fundamental attributes lack systematic description. We gather, annotate, and analyze a new dataset of 16,979 LLM-related arXiv papers, focusing on changes in 2023 vs. 2018-2022. We show that LLM research increasingly focuses on societal impacts: the Computers and Society sub-arXiv has seen 20x growth in its proportion of LLM-related papers in 2023. This change is driven in part by an influx of new authors: a majority of 2023 papers are first-authored by researchers who have not previously written an LLM-related paper, and these papers focus particularly on applications and societal considerations. While a handful of companies hold outsize influence, academia publishes a much larger fraction of papers than industry overall, and this gap widens in 2023. 
LLM research is also being shaped by social dynamics: there are gender and academic/industry differences in the topics authors prioritize, and a stark U.S./China schism in the collaboration network. Overall, our analysis documents how LLM research both shapes and is shaped by society, attesting to the necessity of sociotechnical lenses; we discuss implications for researchers and policymakers.	翻訳日:2023-10-25 08:09:54 公開日:2023-10-23
# 会話検索のためのゼロショットクエリ再構成 Zero-shot Query Reformulation for Conversational Search ( http://arxiv.org/abs/2307.09384v2 ) ライセンス: Link先を確認	Dayu Yang, Yue Zhang, Hui Fang	(参考訳) 音声アシスタントの人気が高まるにつれ、会話型検索は情報検索において注目を集めている。しかし、会話検索におけるデータのスパーシティ問題は、教師付き会話検索手法の進展を著しく妨げている。その結果、研究者はゼロショット会話検索のアプローチに注力している。しかしながら、既存のゼロショット法は、すべてのレトリバーに普遍的に適用できないこと、その有効性には十分な説明性がなく、欠落によって引き起こされる一般的な会話の曖昧さを解決するのに苦労していること、の3つの主要な制限に直面している。これらの制約に対処するために,会話検索データからの監視を必要とせず,従来の会話コンテキストに基づいてクエリを再構成するZeQR(Zero-shot Query Reformulation)フレームワークを導入する。具体的には,マシンリーディング理解タスク用に設計された言語モデルを用いて,生のクエリにおけるコレファレンスと省略という2つの共通曖昧さを明示的に解決する。既存のゼロショット法と比較して,本手法は適応やインデックス付けを伴わずに任意のレトリバーに適用可能である。さらに、曖昧さが明確かつ積極的に解決されているため、説明可能性も向上し、クエリ意図の理解を効果的に強化する。 4つのTREC会話データセットに関する広範な実験を通して、我々の手法の有効性を実証する。 As the popularity of voice assistants continues to surge, conversational search has gained increased attention in Information Retrieval. However, data sparsity issues in conversational search significantly hinder the progress of supervised conversational search methods. Consequently, researchers are focusing more on zero-shot conversational search approaches. Nevertheless, existing zero-shot methods face three primary limitations: they are not universally applicable to all retrievers, their effectiveness lacks sufficient explainability, and they struggle to resolve common conversational ambiguities caused by omission. To address these limitations, we introduce a novel Zero-shot Query Reformulation (ZeQR) framework that reformulates queries based on previous dialogue contexts without requiring supervision from conversational search data. Specifically, our framework utilizes language models designed for machine reading comprehension tasks to explicitly resolve two common ambiguities: coreference and omission, in raw queries. In comparison to existing zero-shot methods, our approach is universally applicable to any retriever without additional adaptation or indexing. 
It also provides greater explainability and effectively enhances query intent understanding because ambiguities are explicitly and proactively resolved. Through extensive experiments on four TREC conversational datasets, we demonstrate the effectiveness of our method, which consistently outperforms state-of-the-art baselines.	翻訳日:2023-10-25 08:09:06 公開日:2023-10-23
# 低レイテンシ同時音声翻訳におけるエンドツーエンド評価 End-to-End Evaluation for Low-Latency Simultaneous Speech Translation ( http://arxiv.org/abs/2308.03415v2 ) ライセンス: Link先を確認	Christian Huber, Tu Anh Dinh, Carlos Mullov, Ngoc Quan Pham, Thai Binh Nguyen, Fabian Retkowski, Stefan Constantin, Enes Yavuz Ugan, Danni Liu, Zhaolin Li, Sai Koneru, Jan Niehues and Alexander Waibel	(参考訳) 低遅延音声翻訳の課題は、最近、いくつかの出版物や共有タスクで示されているように、研究コミュニティにおいて大きな関心を集めている。したがって、これらの異なるアプローチを現実的なシナリオで評価することが不可欠である。しかし、現在、システムの特定の側面のみが評価されており、しばしば異なるアプローチを比較することはできない。本研究では,実環境下で低レイテンシ音声翻訳の様々な側面を遂行し,評価する最初の枠組みを提案する。評価はエンドツーエンドで行われる。これには、オーディオのセグメンテーションや、さまざまなコンポーネントの実行時間が含まれる。第2に,このフレームワークを用いた低遅延音声翻訳に対する異なるアプローチを比較する。我々は、出力を更新するオプションを持つモデルと、固定出力を持つメソッドを評価する。さらに、最先端のカスケードシステムとエンドツーエンドシステムを直接比較する。最後に、フレームワークは翻訳品質とレイテンシを自動的に評価すると同時に、低レイテンシモデルの出力をユーザに示すためのwebインターフェースも提供する。 The challenge of low-latency speech translation has recently drawn significant interest in the research community, as shown by several publications and shared tasks. Therefore, it is essential to evaluate these different approaches in realistic scenarios. However, currently only specific aspects of the systems are evaluated, and often it is not possible to compare different approaches. In this work, we propose the first framework to perform and evaluate the various aspects of low-latency speech translation under realistic conditions. The evaluation is carried out in an end-to-end fashion. This includes the segmentation of the audio as well as the run-time of the different components. Secondly, we compare different approaches to low-latency speech translation using this framework. We evaluate models with the option to revise the output as well as methods with fixed output. Furthermore, we directly compare state-of-the-art cascaded as well as end-to-end systems. Finally, the framework allows automatic evaluation of the translation quality as well as the latency, and also provides a web interface to show the low-latency model outputs to the user.	翻訳日:2023-10-25 08:01:31 公開日:2023-10-23
# BabyのCoThought:コンパクトモデルにおける推論強化のための大規模言語モデルの活用 Baby's CoThought: Leveraging Large Language Models for Enhanced Reasoning in Compact Models ( http://arxiv.org/abs/2308.01684v2 ) ライセンス: Link先を確認	Zheyu Zhang, Han Yang, Bolei Ma, David Rügamer, Ercong Nie	(参考訳) 大規模言語モデル(LLM)は、さまざまな自然言語理解(NLU)タスクにおいて、主にコンテキスト内学習能力によって、驚くべきパフォーマンスを示している。この能力は、小さなスケールでモデルを構築すること、訓練効率を向上させるために応用できる。本稿では,LLMのChain of Thoughtプロンプトを利用して,より小さな"baby"言語モデル(BabyLMs)を効率的に学習する"CoThought"パイプラインを提案する。我々のパイプラインは、GPT-3.5-turboを用いて、100M未満のデータセットを再構成し、言語学習者の学校テキストに匹敵するタスク指向の人間可読テキストに変換する。 BabyLMはRoBERTa方式で、この再構成データセットで事前トレーニングされる。 4つのベンチマークで評価したところ、BabyLMは10の言語、NLU、質問応答タスクにおいてバニラRoBERTaを3ポイント以上上回り、文脈情報を抽出する優れた能力を示している。これらの結果から,LLM再構成データ上に事前訓練されたコンパクトなLMは,タスクをよりよく理解し,性能を向上できる可能性が示唆された。 Large Language Models (LLMs) demonstrate remarkable performance on a variety of natural language understanding (NLU) tasks, primarily due to their in-context learning ability. This ability could be applied to building babylike models, i.e. models at small scales, improving training efficiency. In this paper, we propose a "CoThought" pipeline, which efficiently trains smaller "baby" language models (BabyLMs) by leveraging the Chain of Thought prompting of LLMs. Our pipeline restructures a dataset of less than 100M in size using GPT-3.5-turbo, transforming it into task-oriented, human-readable texts that are comparable to the school texts for language learners. The BabyLM is then pretrained on this restructured dataset in a RoBERTa fashion. In evaluations across 4 benchmarks, our BabyLM outperforms the vanilla RoBERTa in 10 linguistic, NLU, and question-answering tasks by more than 3 points, showing a superior ability to extract contextual information. These results suggest that compact LMs pretrained on small, LLM-restructured data can better understand tasks and achieve improved performance.	翻訳日:2023-10-25 08:01:12 公開日:2023-10-23
# 最適化された飛行型敵対的パッチを用いた深層学習型マルチロータのキドナッピング Kidnapping Deep Learning-based Multirotors using Optimized Flying Adversarial Patches ( http://arxiv.org/abs/2308.00344v2 ) ライセンス: Link先を確認	Pia Hanfeld, Khaled Wahba, Marina M.-C. Höhne, Michael Bussmann, Wolfgang Hönig	(参考訳) マルチローターのような自律飛行ロボットは、例えばポーズ推定のためにカメラ画像に基づいて予測を行うディープラーニングモデルに依存することが多い。これらのモデルは、トレーニング領域外の入力画像に適用した場合、驚くべき結果を予測することができる。この障害は、例えば、ニューラルネットワークの予測を操作するために環境に置かれる小さなイメージ、いわゆる敵パッチを計算することによって、敵攻撃によって悪用される。そこで本稿では,複数の画像が少なくとも1台の他の飛行ロボットに装着され,対象のマルチロータの視野内の任意の位置に配置できる空飛ぶ敵パッチを紹介する。攻撃ロボットを導入することで、システムは敵のマルチロボットシステムに拡張される。効果的な攻撃のために,複数の敵パッチと入力画像内でのその位置を同時に最適化する3つの手法を比較した。提案手法は, 敵パッチ数に対して良好にスケールすることを示す。さらに,人間に追従するはずのロボットを誘拐するために,計算された敵パッチを用いた新たな攻撃ポリシーを用いて,2つのロボットによる物理的飛行を実証する。 Autonomous flying robots, such as multirotors, often rely on deep learning models that make predictions based on a camera image, e.g. for pose estimation. These models can predict surprising results if applied to input images outside the training domain. This fault can be exploited by adversarial attacks, for example, by computing small images, so-called adversarial patches, that can be placed in the environment to manipulate the neural network's prediction. We introduce flying adversarial patches, where multiple images are mounted on at least one other flying robot and therefore can be placed anywhere in the field of view of a victim multirotor. By introducing the attacker robots, the system is extended to an adversarial multi-robot system. For an effective attack, we compare three methods that simultaneously optimize multiple adversarial patches and their position in the input image. We show that our methods scale well with the number of adversarial patches. Moreover, we demonstrate physical flights with two robots, where we employ a novel attack policy that uses the computed adversarial patches to kidnap a robot that was supposed to follow a human.	翻訳日:2023-10-25 08:00:33 公開日:2023-10-23
# Trie-NLG: パーソナライズされたクエリ自動補完を改善するためのコンテキスト拡張の試み Trie-NLG: Trie Context Augmentation to Improve Personalized Query Auto-Completion for Short and Unseen Prefixes ( http://arxiv.org/abs/2307.15455v2 ) ライセンス: Link先を確認	Kaushal Kumar Maurya, Maunendra Sankar Desarkar, Manish Gupta, Puneet Agrawal	(参考訳) query auto-completion (qac) は、与えられたクエリプレフィックスの適切な補完を提案することを目的としている。伝統的に、QACシステムは、最も一般的な完了を示唆するために、過去のクエリログからキュレートされた試みを活用している。この文脈では、どんなQACシステムでも扱うのが難しい2つの特定のシナリオがある:短いプレフィックス(本質的に曖昧である)と見えないプレフィックス。近年,この2つの課題に対処するためのコンテキストとして,これまでのセッションクエリを活用するために,パーソナライズド自然言語生成(nlg)モデルが提案されている。しかしながら,(1) 従来のセッションクエリのいくつかは,現在のプレフィックスに対するユーザの意図とは無関係であり,(2) NLGモデルは過去のクエリの人気を直接組み込むことはできない。これにより、従来のセッションクエリからの人気信号とパーソナライズ信号とを併用した、QACのための新しいNLGモデルであるTrie-NLGを提案する。我々は最近のセッションクエリとトップトライ補完からなるリッチコンテキストでプレフィックスを拡張することで、Trie-NLGモデルを訓練する。この単純なモデリングアプローチは、トリエベースおよびNLGベースのアプローチの限界を克服し、最先端のパフォーマンスをもたらす。 2つの大きなQACデータセットを用いてTrie-NLGモデルを評価する。提案モデルでは, 平均57%, 約14%のMRRが, 人気トレーベース・ルックアップおよびBARTベース・ベースライン法よりも大きく向上した。コードを公開しています。 Query auto-completion (QAC) aims to suggest plausible completions for a given query prefix. Traditionally, QAC systems have leveraged tries curated from historical query logs to suggest most popular completions. In this context, there are two specific scenarios that are difficult to handle for any QAC system: short prefixes (which are inherently ambiguous) and unseen prefixes. Recently, personalized Natural Language Generation (NLG) models have been proposed to leverage previous session queries as context for addressing these two challenges. However, such NLG models suffer from two drawbacks: (1) some of the previous session queries could be noisy and irrelevant to the user intent for the current prefix, and (2) NLG models cannot directly incorporate historical query popularity. This motivates us to propose a novel NLG model for QAC, Trie-NLG, which jointly leverages popularity signals from trie and personalization signals from previous session queries. 
We train the Trie-NLG model by augmenting the prefix with rich context comprising recent session queries and top trie completions. This simple modeling approach overcomes the limitations of trie-based and NLG-based approaches and leads to state-of-the-art performance. We evaluate the Trie-NLG model using two large QAC datasets. On average, our model achieves a large boost in MRR of ~57% over the popular trie-based lookup and ~14% over the strong BART-based baseline. We make our code publicly available.	翻訳日:2023-10-25 07:59:52 公開日:2023-10-23
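
The context augmentation described in the Trie-NLG abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the `QueryTrie` class, the `build_augmented_input` helper, the `[SEP]` separator, and the ordering of the context parts are all assumptions made for the example. It shows a popularity-ranked trie lookup and how its top completions might be concatenated with session queries and the prefix before being passed to a generation model.

```python
class QueryTrie:
    """Character trie storing query frequencies from a (toy) query log."""

    def __init__(self):
        self.children = {}
        self.count = 0  # > 0 only if a logged query ends at this node

    def insert(self, query, count=1):
        node = self
        for ch in query:
            node = node.children.setdefault(ch, QueryTrie())
        node.count += count

    def top_completions(self, prefix, k=3):
        """Return up to k logged queries starting with prefix, most frequent first."""
        node = self
        for ch in prefix:
            node = node.children.get(ch)
            if node is None:
                return []  # unseen prefix: the trie has nothing to offer
        found = []
        stack = [(node, prefix)]
        while stack:
            cur, text = stack.pop()
            if cur.count > 0:
                found.append((text, cur.count))
            for ch, child in cur.children.items():
                stack.append((child, text + ch))
        # Sort by descending frequency, then alphabetically for determinism.
        found.sort(key=lambda item: (-item[1], item[0]))
        return [text for text, _ in found[:k]]


def build_augmented_input(prefix, session_queries, trie, k=3, sep=" [SEP] "):
    """Hypothetical input format: session queries, trie completions, then the prefix."""
    completions = trie.top_completions(prefix, k)
    return sep.join(list(session_queries) + completions + [prefix])


trie = QueryTrie()
for query, freq in [("new york weather", 50), ("new york times", 80),
                    ("news today", 30), ("netflix", 100)]:
    trie.insert(query, freq)

print(build_augmented_input("new", ["flights to jfk"], trie, k=2))
```

For an unseen prefix the lookup returns an empty list, so in this sketch the model input falls back to the session queries and the prefix alone.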
# MISSRec:レコメンデーションのためのマルチモーダルな関心認識シーケンスの事前学習と転送 MISSRec: Pre-training and Transferring Multi-modal Interest-aware Sequence Representation for Recommendation ( http://arxiv.org/abs/2308.11175v2 ) ライセンス: Link先を確認	Jinpeng Wang, Ziyun Zeng, Yunxiao Wang, Yuting Wang, Xingyu Lu, Tianxiang Li, Jun Yuan, Rui Zhang, Hai-Tao Zheng, Shu-Tao Xia	(参考訳) シーケンシャルレコメンデーション(SR)の目標は、ユーザが興味を持つ可能性のある項目を、履歴的なインタラクションシーケンスに基づいて予測することである。既存のシーケンシャルレコメンデータは、広く使われているにもかかわらず、スパースIDが不足し、コールドスタート問題に苦慮することが多いID機能に基づいて開発されている。さらに、一貫性のないIDマッピングはモデルの転送可能性を妨げるため、共最適化可能な類似のレコメンデーションドメインを分離する。本稿では,多モード情報の可能性を探り,頑健で一般化可能なシーケンス表現を学習することを目的としている。 SRのためのマルチモーダル事前学習および転送学習フレームワークであるMISSRecを提案する。ユーザ側では、Transformerベースのエンコーダ-デコーダモデルを設計し、コンテクストエンコーダがシーケンスレベルのマルチモーダルユーザ興味を捉えることを学習し、新しい興味を意識したデコーダを開発し、アイテム-モダリティ-関心関係を把握してシーケンス表現を改善する。候補項目側では動的融合モジュールを用いてユーザ適応アイテム表現を生成し,ユーザとアイテム間のより正確なマッチングを実現する。コントラスト学習目標を用いてモデルを事前学習し,効率的に微調整する。広範囲な実験がmissrecの有効性と柔軟性を示し、実世界のレコメンデーションシナリオのための実用的なソリューションを約束している。データとコードは \url{https://github.com/gimpong/MM23-MISSRec} で入手できる。 The goal of sequential recommendation (SR) is to predict a user's potential interested items based on her/his historical interaction sequences. Most existing sequential recommenders are developed based on ID features, which, despite their widespread use, often underperform with sparse IDs and struggle with the cold-start problem. Besides, inconsistent ID mappings hinder the model's transferability, isolating similar recommendation domains that could have been co-optimized. This paper aims to address these issues by exploring the potential of multi-modal information in learning robust and generalizable sequence representations. We propose MISSRec, a multi-modal pre-training and transfer learning framework for SR. 
On the user side, we design a Transformer-based encoder-decoder model, where the contextual encoder learns to capture the sequence-level multi-modal user interests while a novel interest-aware decoder is developed to grasp item-modality-interest relations for better sequence representation. On the candidate item side, we adopt a dynamic fusion module to produce user-adaptive item representation, providing more precise matching between users and items. We pre-train the model with contrastive learning objectives and fine-tune it in an efficient manner. Extensive experiments demonstrate the effectiveness and flexibility of MISSRec, promising a practical solution for real-world recommendation scenarios. Data and code are available at https://github.com/gimpong/MM23-MISSRec.	翻訳日:2023-10-25 07:51:57 公開日:2023-10-23
# AgentVerse: マルチエージェントコラボレーションの実現と創発的行動の探索 AgentVerse: Facilitating Multi-Agent Collaboration and Exploring Emergent Behaviors ( http://arxiv.org/abs/2308.10848v3 ) ライセンス: Link先を確認	Weize Chen, Yusheng Su, Jingwei Zuo, Cheng Yang, Chenfei Yuan, Chi-Min Chan, Heyang Yu, Yaxi Lu, Yi-Hsin Hung, Chen Qian, Yujia Qin, Xin Cong, Ruobing Xie, Zhiyuan Liu, Maosong Sun, Jie Zhou	(参考訳) 大規模言語モデル(LLM)によって強化された自律エージェントは大幅に改善され、幅広いタスクを一般化できるようになった。しかし、現実のシナリオでは、タスク達成の効率と効果を高めるために個人間の協力がしばしば必要となる。そこで,人間の集団動力学に触発されて,部分の総和を超えるシステムとしてその構成を協調的かつ動的に調整できるマルチエージェントフレームワーク AgentVerse を提案する。実験により,単一のエージェントより優れたマルチエージェントグループを効果的にデプロイできることを示した。さらに,共同作業におけるグループ内の個々のエージェント間の社会的行動の出現について検討した。これらの行動から,複数エージェントグループの協調性向上のために,ポジティブな行動を活用し,ネガティブな行動を緩和するための戦略について考察する。 AgentVerse のコードは、もうすぐ https://github.com/OpenBMB/AgentVerse でリリースされます。 Autonomous agents empowered by Large Language Models (LLMs) have undergone significant improvements, enabling them to generalize across a broad spectrum of tasks. However, in real-world scenarios, cooperation among individuals is often required to enhance the efficiency and effectiveness of task accomplishment. Hence, inspired by human group dynamics, we propose a multi-agent framework, AgentVerse, that can collaboratively and dynamically adjust its composition as a greater-than-the-sum-of-its-parts system. Our experiments demonstrate that AgentVerse can effectively deploy multi-agent groups that outperform a single agent. Furthermore, we delve into the emergence of social behaviors among individual agents within a group during collaborative task accomplishment. In view of these behaviors, we discuss some possible strategies to leverage positive ones and mitigate negative ones for improving the collaborative potential of multi-agent groups. Our code for AgentVerse will soon be released at https://github.com/OpenBMB/AgentVerse.	翻訳日:2023-10-25 07:51:26 公開日:2023-10-23
# ベイズ流ネットワーク Bayesian Flow Networks ( http://arxiv.org/abs/2308.07037v3 ) ライセンス: Link先を確認	Alex Graves, Rupesh Kumar Srivastava, Timothy Atkinson, Faustino Gomez	(参考訳) 本稿では,独立した分布の集合のパラメータを,ノイズデータサンプルに照らしてベイズ推論によって修正し,第2の相互依存分布を出力するニューラルネットワークに入力として渡す,新たな階層生成モデルであるベイズフローネットワーク(bfns)を提案する。単純な事前および反復的に2つの分布を更新することから、拡散モデルの逆過程に類似した生成手順が得られるが、前方過程を必要としないという概念的には単純である。離散時間および連続時間損失関数は、サンプル生成手順とともに、連続、離散化、離散データに対して導出される。特に、離散データに対するネットワーク入力は確率単純度に基づいており、したがってネイティブに微分可能であり、勾配に基づくサンプルガイダンスや言語モデリングのような離散領域における数ステップ生成の道を開く。損失関数はデータ圧縮を直接最適化し、ネットワークアーキテクチャに制限を課さない。実験では,動的二項化MNISTとCIFAR-10を用いた画像モデリングにおいて,BFNは競合する対数類似度を実現し,テキスト8文字レベルの言語モデリングタスクにおいて,既知の離散拡散モデルよりも優れていた。 This paper introduces Bayesian Flow Networks (BFNs), a new class of generative model in which the parameters of a set of independent distributions are modified with Bayesian inference in the light of noisy data samples, then passed as input to a neural network that outputs a second, interdependent distribution. Starting from a simple prior and iteratively updating the two distributions yields a generative procedure similar to the reverse process of diffusion models; however it is conceptually simpler in that no forward process is required. Discrete and continuous-time loss functions are derived for continuous, discretised and discrete data, along with sample generation procedures. Notably, the network inputs for discrete data lie on the probability simplex, and are therefore natively differentiable, paving the way for gradient-based sample guidance and few-step generation in discrete domains such as language modelling. The loss function directly optimises data compression and places no restrictions on the network architecture. In our experiments BFNs achieve competitive log-likelihoods for image modelling on dynamically binarized MNIST and CIFAR-10, and outperform all known discrete diffusion models on the text8 character-level language modelling task.	翻訳日:2023-10-25 07:49:59 公開日:2023-10-23
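
The Bayesian update that the Bayesian Flow Networks abstract refers to can be illustrated for a single continuous variable. The sketch below assumes a Gaussian belief and Gaussian observation noise with known precision, which gives the standard conjugate update. The function name and the `accuracy` parameter are illustrative choices, and the neural network that maps the updated parameters to the output distribution is omitted entirely.

```python
def bayesian_update(mean, precision, sample, accuracy):
    """Conjugate Gaussian update of a belief N(mean, 1/precision).

    `sample` is a noisy observation of the data whose noise has precision
    `accuracy` (an assumed name for this sketch).
    """
    new_precision = precision + accuracy
    new_mean = (precision * mean + accuracy * sample) / new_precision
    return new_mean, new_precision


# Start from a simple prior and fold in repeated noisy observations of x = 2.0.
# The observations are noise-free here to keep the example deterministic.
mean, precision = 0.0, 1.0
for _ in range(100):
    mean, precision = bayesian_update(mean, precision, sample=2.0, accuracy=1.0)

print(mean, precision)
```

Each update sharpens the belief (the precision grows additively) and moves the mean toward the observed data, which is the sense in which the input distribution parameters become more informative over the generative steps.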
# 高忠実性2量子ゲートを持つスピン軌道相互作用 Spin-Orbit Interaction Enabled High-Fidelity Two-Qubit Gates ( http://arxiv.org/abs/2308.06986v4 ) ライセンス: Link先を確認	Jiaan Qi, Zhi-Hai Liu and H. Q. Xu	(参考訳) 半導体スピンキュービットプラットフォームにおける2量子ゲート(TQG)に対するスピン軌道相互作用(SOI)の影響について検討した。 SOIは、クォービット対を管理する交換相互作用を異方性とし、アイソトロピックなハイゼンベルク交換のために導かれる従来のTQGにとって深刻な課題となる。微視的レベルから始めると、soiの本質を捉える簡潔な計算ハミルトニアンを開発し、それを用いて回転フレーム時間発展の性質を導出する。 2つの重要な発見がある。まず,制御相/制御zゲートについて,ゲート時間と局所位相補正の観点で若干の修正を加えるだけで,忠実性が最適に向上できる ‘soiノード’ の存在を実証し,解析的に証明する。第二に、SOIなしではアクセスできない新しい2量子ダイナミクス(反射ゲートと直接制御ノットゲート)を発見し、議論する。直接制御しないゲートに対して, 関連する条件と達成可能なフィダリティについて検討する。 We study the implications of spin-orbit interaction (SOI) for two-qubit gates (TQGs) in semiconductor spin qubit platforms. SOI renders the exchange interaction governing qubit pairs anisotropic, posing a serious challenge for conventional TQGs derived for the isotropic Heisenberg exchange. Starting from microscopic level, we develop a concise computational Hamiltonian that captures the essence of SOI, and use it to derive properties of the rotating-frame time evolutions. Two key findings are made. First, for the controlled-phase/controlled-Z gate, we show and analytically prove the existence of ``SOI nodes'' where the fidelity can be optimally enhanced, with only slight modifications in terms of gate time and local phase corrections. Second, we discover and discuss novel two-qubit dynamics that are inaccessible without SOI -- the reflection gate and the direct controlled-not gate. The relevant conditions and achievable fidelities are studied for the direct controlled-not gate.	翻訳日:2023-10-25 07:49:32 公開日:2023-10-23
# 脳イメージングのためのエッジ対応ハードクラスタリンググラフポーリング Edge-aware Hard Clustering Graph Pooling for Brain Imaging ( http://arxiv.org/abs/2308.11909v5 ) ライセンス: Link先を確認	Cheng Zhu, Jiayi Zhu, Lijuan Zhang, Xi Wu, Shuqi Yang, Ping Liang, Honghan Chen, Ying Tan	(参考訳) グラフ畳み込みネットワーク(GCN)は、異なる脳領域間の非ユークリッド空間依存性を捉えることができる。 GCNの重要な要素であるグラフプーリング演算子は、表現学習能力を高め、異常な脳地図の取得を容易にする。しかし、既存の研究のほとんどは、グラフプーリングアプリケーションのシナリオを限定するだけでなく、重要なサブ構造をキャプチャする能力も低下させる方法で、ノードの観点からのみ、元のエッジ機能を無視してグラフプーリングオペレータを設計している。エッジ機能に合わせたグラフクラスタリングプーリング演算子を設計するために,エッジ対応のハードクラスタリンググラフプール(ehcpool)を提案し,グラフクラスタリングプロセスを再定義した。具体的には、エッジとノードの特徴の両方の重要性を評価するために、'Edge-to-node'基準を提案した。エッジスコアによってガイドされた我々は,グラフのスパースクラスタリングを適応的に学習することを目的とした,革命的イテレーションnトップ戦略を設計した。その後、各独立部分グラフのノードとエッジ情報を集約するために、新しいN-Eアグリゲーション戦略を導入する。多地点の公開データセットに関する大規模な実験は、提案モデルの優越性と堅牢性を示している。 EHCPoolは、データ駆動の観点から異なるタイプの機能不全脳ネットワークを探索する可能性がある。コアコードはhttps://github.com/swfen/ehcpool。 Graph Convolutional Networks (GCNs) can capture non-Euclidean spatial dependence between different brain regions. The graph pooling operator, a crucial element of GCNs, enhances the representation learning capability and facilitates the acquisition of abnormal brain maps. However, most existing research designs graph pooling operators solely from the perspective of nodes while disregarding the original edge features, in a way that not only confines graph pooling application scenarios, but also diminishes its ability to capture critical substructures. To design a graph clustering pooling operator that is tailored to dominant edge features, we proposed the edge-aware hard clustering graph pool (EHCPool) and redefined the graph clustering process. Specifically, the 'Edge-to-node' criterion was proposed to evaluate the significance of both edge and node features. Guided by edge scores, we designed a revolutionary Iteration n-top strategy, aimed at adaptively learning sparse hard clustering assignments for graphs. 
Subsequently, a novel N-E Aggregation strategy is introduced to aggregate node and edge information in each independent subgraph. Extensive experiments on the multi-site public datasets demonstrate the superiority and robustness of the proposed model. More notably, EHCPool has the potential to probe different types of dysfunctional brain networks from a data-driven perspective. Core code is at: https://github.com/swfen/EHCPool.	翻訳日:2023-10-25 07:39:22 公開日:2023-10-23
# フェデレーションバンドのためのインセンティブコミュニケーション Incentivized Communication for Federated Bandits ( http://arxiv.org/abs/2309.11702v2 ) ライセンス: Link先を確認	Zhepei Wei, Chuanhao Li, Haifeng Xu, Hongning Wang	(参考訳) フェデレートされたバンディットに関する既存の作業の多くは、すべてのクライアントが、必要に応じて、サーバとデータを共有することに利他的であることを当然に受け取っています。性能と通信効率に関する説得力のある理論的な保証にもかかわらず、この仮定は過度に理想主義的であり、特に明示的なメリットのないデータ共有を嫌う自己関心のクライアント上でアルゴリズムが運用されている場合、実際にしばしば違反される。このような自己利己的な行動の無視は、フェデレート・バンディット学習の学習効率や実用的操作性に多大な影響を与えうる。これを踏まえて,我々は,サーバがクライアントにインセンティブを提供することでデータ共有を動機付ける,フェデレートされた盗賊に対するインセンティブ付きコミュニケーション問題を導入することで,この未調査研究領域に対する新たな洞察を喚起することを目指している。一般性を失うことなく、この帯域問題を文脈線形設定でインスタンス化し、証明可能な通信とインセンティブコストの保証によってほぼ最適に後悔する最初のインセンティブ付き通信プロトコルであるInc-FedUCBを提案する。合成データと実世界のデータセットの両方に関する広範な実験により、様々な環境における提案手法の有効性がさらに検証された。 Most existing works on federated bandits take it for granted that all clients are altruistic about sharing their data with the server for the collective good whenever needed. Despite their compelling theoretical guarantee on performance and communication efficiency, this assumption is overly idealistic and oftentimes violated in practice, especially when the algorithm is operated over self-interested clients, who are reluctant to share data without explicit benefits. Negligence of such self-interested behaviors can significantly affect the learning efficiency and even the practical operability of federated bandit learning. In light of this, we aim to spark new insights into this under-explored research area by formally introducing an incentivized communication problem for federated bandits, where the server shall motivate clients to share data by providing incentives. Without loss of generality, we instantiate this bandit problem with the contextual linear setting and propose the first incentivized communication protocol, namely, Inc-FedUCB, that achieves near-optimal regret with provable communication and incentive cost guarantees. 
Extensive empirical experiments on both synthetic and real-world datasets further validate the effectiveness of the proposed method across various environments.	翻訳日:2023-10-25 07:31:27 公開日:2023-10-23
# Gold-YOLO: Gather-and-Distribute機構による効率的な物体検出器 Gold-YOLO: Efficient Object Detector via Gather-and-Distribute Mechanism ( http://arxiv.org/abs/2309.11331v5 ) ライセンス: Link先を確認	Chengcheng Wang, Wei He, Ying Nie, Jianyuan Guo, Chuanjian Liu, Kai Han, Yunhe Wang	(参考訳) 近年, リアルタイム物体検出の分野における主要なアプローチとして, YOLOシリーズモデルが登場している。多くの研究が、アーキテクチャを変更し、データを増やし、新しい損失を設計することで、ベースラインをより高いレベルに押し上げた。しかし,従来モデルでは,機能ピラミッドネットワーク (FPN) とパスアグリゲーションネットワーク (PANet) がこれを緩和しているが,情報融合問題に苦しんでいる。そこで本研究では,畳み込みと自己アテンション操作によって実現される高度なGather-and-Distribute(GD)機構を提案する。この新しい設計モデルはGold-YOLOと呼ばれ、マルチスケールの機能融合能力を高め、すべてのモデルスケールでレイテンシと精度の理想的なバランスを実現する。さらに, YOLOシリーズにMAEスタイルの事前トレーニングを初めて実装し, YOLOシリーズモデルが教師なし事前トレーニングの恩恵を受けられるようにした。 Gold-YOLO-Nは、COCO val2017データセットで39.9%のAP、T4 GPUで1030 FPSを達成した。 PyTorchコードはhttps://github.com/huawei-noah/Efficient-Computing/tree/master/Detection/Gold-YOLOで、MindSporeコードはhttps://gitee.com/mindspore/models/tree/master/research/cv/Gold_YOLOで入手できる。 In the past years, YOLO-series models have emerged as the leading approaches in the area of real-time object detection. Many studies pushed up the baseline to a higher level by modifying the architecture, augmenting data and designing new losses. However, we find previous models still suffer from an information fusion problem, although Feature Pyramid Network (FPN) and Path Aggregation Network (PANet) have alleviated this. Therefore, this study provides an advanced Gather-and-Distribute (GD) mechanism, which is realized with convolution and self-attention operations. The newly designed model, named Gold-YOLO, boosts the multi-scale feature fusion capabilities and achieves an ideal balance between latency and accuracy across all model scales. Additionally, we implement MAE-style pretraining in the YOLO-series for the first time, allowing YOLO-series models to benefit from unsupervised pretraining. 
Gold-YOLO-N attains an outstanding 39.9% AP on the COCO val2017 datasets and 1030 FPS on a T4 GPU, which outperforms the previous SOTA model YOLOv6-3.0-N with similar FPS by +2.4%. The PyTorch code is available at https://github.com/huawei-noah/Efficient-Computing/tree/master/Detection/Gold-YOLO, and the MindSpore code is available at https://gitee.com/mindspore/models/tree/master/research/cv/Gold_YOLO.	翻訳日:2023-10-25 07:30:25 公開日:2023-10-23
# スライディングモード制御とディープラーニングを用いたスリップ・スキッド補償を用いたスキッドステアリング移動ロボットの軌道追従制御 Trajectory Tracking Control of Skid-Steering Mobile Robots with Slip and Skid Compensation using Sliding-Mode Control and Deep Learning ( http://arxiv.org/abs/2309.08863v2 ) ライセンス: Link先を確認	Payam Nourizadeh, Fiona J Stevens McFadden, Will N Browne	(参考訳) スリップとスキッドの補償は、屋外の地形を移動する移動ロボットにとって重要である。このような困難な環境では、スリップとスキディングは軌道追跡システムに不確実性をもたらし、車両の安全性を損なう可能性がある。この分野での研究にもかかわらず、実世界のオンラインスリップとスキッド補償は、屋外環境におけるホイール・テライン相互作用の複雑さのため、依然として困難である。本稿では,屋外で動作する移動ロボットの車両レベルにおいて,現実的に実現可能なオンラインスリップとスキッド補償を備えた新たな軌道追跡手法を提案する。このアプローチでは、このタイプのロボットに固有の不確実性を考慮して、ロバストな軌道追跡システムの設計にスライディングモード制御を用いる。ロボットの滑りや望ましくないスキディングをリアルタイムで推定し、それらを補償するために、以前に開発された2つのディープラーニングモデルを制御フィードバックループに統合する。提案手法の主な利点は,(1)車輪に2つのスリップ成分とロボットのスキディングとを併用する従来のアプローチとは対照的に,ロボット全体のスリップ関連パラメータを2つ考慮し,(2)オンライン実世界のスリップとスキッド補償機能を備え,予期せぬ環境におけるトラッキングエラーを低減できる点である。実験の結果,軌道追跡システムの性能は27%以上向上した。 Compensating for slip and skid is crucial for mobile robots navigating outdoor terrains. In these challenging environments, slipping and skidding introduce uncertainties into trajectory tracking systems, potentially compromising the safety of the vehicle. Despite research in this field, having a real-world feasible online slip and skid compensation remains challenging due to the complexity of wheel-terrain interaction in outdoor environments. This paper proposes a novel trajectory tracking technique featuring real-world feasible online slip and skid compensation at the vehicle level for skid-steering mobile robots operating outdoors. The approach employs sliding-mode control to design a robust trajectory tracking system, accounting for the inherent uncertainties in this type of robot. To estimate the robot's slipping and undesired skidding and compensate for them in real-time, two previously developed deep learning models are integrated into the control-feedback loop. 
The main advantages of the proposed technique are that it (1) considers two slip-related parameters for the entire robot, as opposed to the conventional approach involving two slip components for each wheel along with the robot's skidding, and (2) has an online real-world feasible slip and skid compensator, reducing the tracking errors in unforeseen environments. Experimental results demonstrate a significant improvement, enhancing the trajectory tracking system's performance by over 27%.	翻訳日:2023-10-25 07:29:20 公開日:2023-10-23
# オンデマンド駆動ナビゲーションのための要求条件付きオブジェクト属性空間の学習 Find What You Want: Learning Demand-conditioned Object Attribute Space for Demand-driven Navigation ( http://arxiv.org/abs/2309.08138v2 ) ライセンス: Link先を確認	Hongcheng Wang, Andy Guan Hong Chen, Xiaoqi Li, Mingdong Wu, Hao Dong	(参考訳) 視覚オブジェクトナビゲーション(VON)のタスクは、特定のシーン内で特定のオブジェクトを特定できるエージェントの能力を含む。 VONタスクを成功させるためには、2つの必須条件を満たさなければならない:1) ユーザが希望するオブジェクトの名前を知る必要がある。 2) ユーザ指定オブジェクトは実際にシーン内に存在しなければならない。これらの条件を満たすために、シミュレータはシーンのメタデータに予め定義されたオブジェクト名と位置を組み込むことができる。しかし、現実のシナリオでは、これらの条件が常に満たされることを保証することはしばしば困難である。馴染みのない環境の人間は、どのオブジェクトがシーンに存在するのかを知らないかもしれないし、実際に存在しないオブジェクトを誤って特定するかもしれない。しかしながら、これらの課題にもかかわらず、人間は依然としてオブジェクトに対する要求があり、それは、シーン内に存在する他のオブジェクトと同等の方法で満たされる可能性がある。そこで本研究では,ユーザの要求をタスク命令として活用し,その要求にマッチするオブジェクトを見つけるようエージェントに促す,要求駆動ナビゲーション(DDN)を提案する。 DDNは、事前に定義されたオブジェクトのカテゴリや名前にのみ依存するのではなく、ユーザの要求を満たすことに集中することで、VONの厳しい条件を緩和することを目的としている。本稿では,大言語モデルから共通知識を抽出することにより,まずオブジェクトのテキスト属性特徴を取得する手法を提案する。これらのテキスト属性機能は、Contrastive Language-Image Pre-training (CLIP)を使用して視覚的属性特徴と整列する。視覚属性の特徴を事前知識として組み込むことで,ナビゲーションプロセスを強化する。 ProcThorデータセットによるAI2Thorの実験では、視覚特性の特徴がエージェントのナビゲーション性能を改善し、VONで一般的に使用されるベースラインメソッドよりも優れていた。 The task of Visual Object Navigation (VON) involves an agent's ability to locate a particular object within a given scene. In order to successfully accomplish the VON task, two essential conditions must be fulfilled: 1) the user must know the name of the desired object; and 2) the user-specified object must actually be present within the scene. To meet these conditions, a simulator can incorporate pre-defined object names and positions into the metadata of the scene. However, in real-world scenarios, it is often challenging to ensure that these conditions are always met. Humans in an unfamiliar environment may not know which objects are present in the scene, or they may mistakenly specify an object that is not actually present. 
Nevertheless, despite these challenges, humans may still have a demand for an object, which could potentially be fulfilled by other objects present within the scene in an equivalent manner. Hence, we propose Demand-driven Navigation (DDN), which leverages the user's demand as the task instruction and prompts the agent to find an object that matches the specified demand. DDN aims to relax the stringent conditions of VON by focusing on fulfilling the user's demand rather than relying solely on predefined object categories or names. We propose a method that first acquires textual attribute features of objects by extracting common knowledge from a large language model. These textual attribute features are subsequently aligned with visual attribute features using Contrastive Language-Image Pre-training (CLIP). By incorporating the visual attribute features as prior knowledge, we enhance the navigation process. Experiments on AI2Thor with the ProcThor dataset demonstrate that the visual attribute features improve the agent's navigation performance, outperforming the baseline methods commonly used in VON.	翻訳日:2023-10-25 07:28:50 公開日:2023-10-23
# 皮下アバター再建のためのグローバル関連3dデカップリングトランス Global-correlated 3D-decoupling Transformer for Clothed Avatar Reconstruction ( http://arxiv.org/abs/2309.13524v3 ) ライセンス: Link先を確認	Zechuan Zhang, Li Sun, Zongxin Yang, Ling Chen, Yi Yang	(参考訳) 3d服を着た人間のアバターを1枚の画像から再構築することは、特に複雑なポーズやゆるい衣服に遭遇する場合、難しい課題である。現在のメソッドは性能に制限があり、主に不十分な2d画像特徴と一貫性のないクエリメソッドに依存する。そこで我々は, モノクロ画像から人間のアバターを再構成する新しいトランスアーキテクチャであるGTA(Global-correlated 3D-decoupling Transformer for clothed Avatar reconstruction)を提案する。提案手法は,グローバルな関連画像特徴をキャプチャするエンコーダとしてビジョントランスフォーマーモデルを活用することで,トランスフォーマアーキテクチャを活用する。その後,3次元分離デコーダは,学習可能な埋め込みをクロスプレーン生成のためのクエリとして使用し,トライプレーン機能を分離するためにクロスアテンションを用いています。本稿では,三面体3次元特徴と人体との融合を効果的に促進するために,空間的局所化と人体的事前知識の利点を活かし,空間的問合せと先行的問合せを組み合わせたハイブリッド事前融合戦略を提案する。 CAPEとTHuman2.0データセットの総合的な実験により、我々の手法は、幾何学的およびテクスチャ的再構築における最先端のアプローチよりも優れており、挑戦的なポーズやゆるい衣服に対して高い堅牢性を示し、高分解能なテクスチャを生成する。コードはhttps://github.com/River-Zhang/GTAで入手できる。 Reconstructing 3D clothed human avatars from single images is a challenging task, especially when encountering complex poses and loose clothing. Current methods exhibit limitations in performance, largely attributable to their dependence on insufficient 2D image features and inconsistent query methods. Owing to this, we present the Global-correlated 3D-decoupling Transformer for clothed Avatar reconstruction (GTA), a novel transformer-based architecture that reconstructs clothed human avatars from monocular images. Our approach leverages transformer architectures by utilizing a Vision Transformer model as an encoder for capturing global-correlated image features. Subsequently, our innovative 3D-decoupling decoder employs cross-attention to decouple tri-plane features, using learnable embeddings as queries for cross-plane generation. To effectively enhance feature fusion with the tri-plane 3D feature and human body prior, we propose a hybrid prior fusion strategy combining spatial and prior-enhanced queries, leveraging the benefits of spatial localization and human body prior knowledge. 
Comprehensive experiments on CAPE and THuman2.0 datasets illustrate that our method outperforms state-of-the-art approaches in both geometry and texture reconstruction, exhibiting high robustness to challenging poses and loose clothing, and producing higher-resolution textures. Codes will be available at https://github.com/River-Zhang/GTA.	翻訳日:2023-10-25 07:20:34 公開日:2023-10-23
# Real3D-AD: ポイントクラウド異常検出のデータセット Real3D-AD: A Dataset of Point Cloud Anomaly Detection ( http://arxiv.org/abs/2309.13226v3 ) ライセンス: Link先を確認	Jiaqi Liu, Guoyang Xie, Ruitao Chen, Xinpeng Li, Jinbao Wang, Yong Liu, Chengjie Wang, Feng Zheng	(参考訳) 高精度点雲異常検出は、加工および精密製造の欠陥を特定するための金の標準である。この分野の方法論的な進歩にもかかわらず、データセットの不足と体系的なベンチマークの欠如は、その開発を妨げる。 real3d-adは,この分野の制約に対処し,高精度なクラウド異常検出データセットである。 1,254個の高解像度3dアイテムがそれぞれ4万点から数百万点まで、real3d-adは、これまでで最大の高精度3d産業用異常検出用データセットである。 Real3D-ADは、ポイントクラウド解像度(0.0010mm-0.0015mm)、360度カバレッジ、完璧なプロトタイプに関する既存の3D異常検出データセットを上回る。さらに,real3d-adの総合ベンチマークを行い,高精度点雲異常検出のためのベースライン手法の欠如を明らかにした。そこで,我々はreg3d-adを提案する。reg3d-adは,局所表現とグローバル表現を保存する新しい特徴記憶バンクを組み込んだ,登録に基づく3次元異常検出手法である。 Real3D-ADデータセットに関する大規模な実験は、Reg3D-ADの有効性を強調している。再現性とアクセシビリティのために、Real3D-ADデータセット、ベンチマークソースコード、Reg3D-ADをウェブサイトで提供します。 High-precision point cloud anomaly detection is the gold standard for identifying the defects of advancing machining and precision manufacturing. Despite some methodological advances in this area, the scarcity of datasets and the lack of a systematic benchmark hinder its development. We introduce Real3D-AD, a challenging high-precision point cloud anomaly detection dataset, addressing the limitations in the field. With 1,254 high-resolution 3D items from forty thousand to millions of points for each item, Real3D-AD is the largest dataset for high-precision 3D industrial anomaly detection to date. Real3D-AD surpasses existing 3D anomaly detection datasets available regarding point cloud resolution (0.0010mm-0.0015mm), 360 degree coverage and perfect prototype. Additionally, we present a comprehensive benchmark for Real3D-AD, revealing the absence of baseline methods for high-precision point cloud anomaly detection. To address this, we propose Reg3D-AD, a registration-based 3D anomaly detection method incorporating a novel feature memory bank that preserves local and global representations. Extensive experiments on the Real3D-AD dataset highlight the effectiveness of Reg3D-AD. 
For reproducibility and accessibility, we provide the Real3D-AD dataset, benchmark source code, and Reg3D-AD on our website: https://github.com/M-3LAB/Real3D-AD.	翻訳日:2023-10-25 07:20:06 公開日:2023-10-23
# マルチラベル産業セクター配置のためのプロンプトチューニング埋め込み分類 Prompt Tuned Embedding Classification for Multi-Label Industry Sector Allocation ( http://arxiv.org/abs/2309.12075v2 ) ライセンス: Link先を確認	Valentin Leonhard Buchner, Lele Cao, Jan-Christoph Kalo, Vilhelm von Ehrenheim	(参考訳) Prompt Tuningは、しばしばLLM(Large Language Models)と呼ばれるPLM(Pretrained Language Models)を微調整するためのスケーラブルで費用効率のよい方法として登場した。本研究は,マルチラベルテキスト分類のためのプロンプトチューニングとベースラインの性能と計算効率のベンチマークを行う。これは、企業を投資会社の独自産業分類に分類し、そのテーマ的投資戦略を支援するという課題に適用される。テキストからテキストへの分類はタスク固有の分類ヘッドよりも多く報告されるが、各ラベルが複数のトークンで構成されるマルチラベル分類問題に適用する場合、いくつかの制限がある。 a) 生成されたラベルは,ラベル分類上のラベルと一致しない。 b) 微調整プロセスは,変分不変性を欠き,提供ラベルの順序に敏感である。 (c) モデルは適切な信頼スコアではなく、二項決定を提供する。制限 (a) 分類性能をわずかに向上させるTrie Searchを用いた制約付きデコーディングを適用することで対処する。すべての制限 (a) (b)及び c) は PLM の言語ヘッドを Prompt Tuned Embedding Classification (PTEC) と呼ばれる分類ヘッドに置き換えることによって対処される。これにより性能が大幅に向上し、推論時の計算コストも低減される。当社の産業応用では、トレーニングデータはよく知られた企業に偏っている。このモデルのパフォーマンスは、よく知られた企業とあまり知られていない企業の両方で一貫していることを確認します。以上の結果から,高度な一般化能力を持つPLMの時代にも,最先端の手法をドメイン固有タスクに適用する必要性が続いていることが示唆された。コードベースとベンチマークデータセットをhttps://github.com/EQTPartners/PTECでリリースしています。 Prompt Tuning is emerging as a scalable and cost-effective method to fine-tune Pretrained Language Models (PLMs), which are often referred to as Large Language Models (LLMs). This study benchmarks the performance and computational efficiency of Prompt Tuning and baselines for multi-label text classification. This is applied to the challenging task of classifying companies into an investment firm's proprietary industry taxonomy, supporting their thematic investment strategy. 
Text-to-text classification is frequently reported to outperform task-specific classification heads, but has several limitations when applied to a multi-label classification problem where each label consists of multiple tokens: (a) Generated labels may not match any label in the label taxonomy; (b) The fine-tuning process lacks permutation invariance and is sensitive to the order of the provided labels; (c) The model provides binary decisions rather than appropriate confidence scores. Limitation (a) is addressed by applying constrained decoding using Trie Search, which slightly improves classification performance. All limitations (a), (b), and (c) are addressed by replacing the PLM's language head with a classification head, which is referred to as Prompt Tuned Embedding Classification (PTEC). This improves performance significantly, while also reducing computational costs during inference. In our industrial application, the training data is skewed towards well-known companies. We confirm that the model's performance is consistent across both well-known and less-known companies. Our overall results indicate the continuing need to adapt state-of-the-art methods to domain-specific tasks, even in the era of PLMs with strong generalization abilities. We release our codebase and a benchmarking dataset at https://github.com/EQTPartners/PTEC.	翻訳日:2023-10-25 07:18:57 公開日:2023-10-23
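
Limitation (a) in the PTEC abstract, generated labels that fall outside the taxonomy, is addressed there with constrained decoding via Trie Search. The idea can be sketched as below. This is a simplified illustration under stated assumptions: labels are split into whitespace tokens rather than model subword tokens, `score_fn` is a hypothetical stand-in for the language model's next-token scores, and the label names are invented for the example.

```python
def build_label_trie(labels):
    """Nested-dict trie over the tokens of each label; the key None marks a label end."""
    root = {}
    for label in labels:
        node = root
        for token in label.split():
            node = node.setdefault(token, {})
        node[None] = {}  # end-of-label marker
    return root


def allowed_next_tokens(trie, generated):
    """Tokens that keep the generated sequence inside the label taxonomy."""
    node = trie
    for token in generated:
        if token not in node:
            return set()
        node = node[token]
    return set(node.keys())


def constrained_greedy_decode(trie, score_fn, max_len=10):
    """Greedy decoding restricted at every step to tokens permitted by the trie."""
    generated = []
    for _ in range(max_len):
        allowed = allowed_next_tokens(trie, generated)
        if not allowed:
            break
        best = max(allowed, key=lambda tok: score_fn(generated, tok))
        if best is None:  # the model prefers to end the label here
            break
        generated.append(best)
    return " ".join(generated)


labels = ["renewable energy", "renewable materials", "digital health"]
label_trie = build_label_trie(labels)

# Toy scores standing in for model logits (all values are made up).
toy_scores = {"renewable": 0.9, "digital": 0.1, "energy": 0.2,
              "materials": 0.7, "health": 0.5, None: 1.0}
decoded = constrained_greedy_decode(label_trie, lambda prefix, tok: toy_scores[tok])
print(decoded)
```

Constraining the search guarantees a valid label but still yields a hard decision per label. Per the abstract, PTEC handles limitations (b) and (c) as well by replacing the language head with a classification head.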
# LanguageBind: 言語に基づくセマンティックアライメントによるN-モダリティへのビデオ言語事前学習 LanguageBind: Extending Video-Language Pretraining to N-modality by Language-based Semantic Alignment ( http://arxiv.org/abs/2310.01852v4 ) ライセンス: Link先を確認	Bin Zhu, Bin Lin, Munan Ning, Yang Yan, Jiaxi Cui, HongFa Wang, Yatian Pang, Wenhao Jiang, Junwu Zhang, Zongwei Li, Wancai Zhang, Zhifeng Li, Wei Liu, and Li Yuan	(参考訳) ビデオ言語(VL)プレトレーニングは、複数の下流タスクにおいて著しく改善されている。しかしながら、現在のVL事前学習フレームワークは、視覚や言語を超えた複数のモーダル(Nモダリティ、N>=3)にまで拡張するのは難しい。そこで我々は言語bindを提案し,言語モダリティは十分に探索され,豊富な意味論を含んでいるため,言語を異なるモダリティのバインドとして捉える。具体的には、VL事前学習によって得られた言語エンコーダを凍結し、コントラスト学習を伴う他のモダリティのためのエンコーダを訓練する。その結果、すべてのモダリティは共有機能空間にマッピングされ、マルチモーダルなセマンティックアライメントを実装する。 LanguageBindは、VLモダリティをNモダリティに拡張できることを保証する一方で、言語を中心としたデータペアをアライメントする高品質なデータセットも必要です。そこで我々は,VIDAL-10Mをビデオ,赤外線,深度,オーディオおよびそれに対応する言語として提案し,VIDAL-10Mと命名した。我々のVIDAL-10Mでは、すべてのビデオは長いビデオから切り離されたセグメントではなく、完全な意味を持った短いビデオプラットフォームから作成されています。 vidal-10mをプリトレーニングした後、ゼロショットビデオテキスト検索タスクのパラメータの15%しか持たないmsr-vttデータセットで、imagebindを5.8%r@1に上回った。さらに、LanguageBindはゼロショットビデオ、オーディオ、奥行き、赤外線理解タスクを大幅に改善しました。例えば、LanguageBindがInterVideoを1.9%、MSVDが8.8%、DiDeMoが6.3%、ActivityNetが4.4%上回った。 LLVIPとNYU-Dデータセットでは、LanguageBindがImageBindを23.8%、11.1%で上回っている。コードアドレスはhttps://github.com/PKU-YuanGroup/LanguageBind。 The video-language (VL) pretraining has achieved remarkable improvement in multiple downstream tasks. However, the current VL pretraining framework is hard to extend to multiple modalities (N modalities, N>=3) beyond vision and language. We thus propose LanguageBind, taking the language as the bind across different modalities because the language modality is well-explored and contains rich semantics. Specifically, we freeze the language encoder acquired by VL pretraining, then train encoders for other modalities with contrastive learning. As a result, all modalities are mapped to a shared feature space, implementing multi-modal semantic alignment. 
While LanguageBind ensures that we can extend VL modalities to N modalities, we also need a high-quality dataset with alignment data pairs centered on language. We thus propose VIDAL-10M, a dataset with Video, Infrared, Depth, Audio and their corresponding Language. In VIDAL-10M, all videos are from short video platforms with complete semantics rather than truncated segments from long videos, and all the video, depth, infrared, and audio modalities are aligned to their textual descriptions. After pretraining on VIDAL-10M, we outperform ImageBind by 5.8% R@1 on the MSR-VTT dataset with only 15% of the parameters in the zero-shot video-text retrieval task. Beyond this, LanguageBind greatly improves zero-shot video, audio, depth, and infrared understanding tasks. For instance, LanguageBind surpasses InterVideo by 1.9% on MSR-VTT, 8.8% on MSVD, 6.3% on DiDeMo, and 4.4% on ActivityNet. On the LLVIP and NYU-D datasets, LanguageBind outperforms ImageBind with 23.8% and 11.1% top-1 accuracy. Code address: https://github.com/PKU-YuanGroup/LanguageBind.	翻訳日:2023-10-25 07:10:25 公開日:2023-10-23
# データソースとしてのAI生成画像:合成時代の幕開け AI-Generated Images as Data Source: The Dawn of Synthetic Era ( http://arxiv.org/abs/2310.01830v3 ) ライセンス: Link先を確認	Zuhao Yang, Fangneng Zhan, Kunhao Liu, Muyu Xu, Shijian Lu	(参考訳) ビジュアルインテリジェンスの進歩は、大規模データの利用可能性と本質的に結びついている。並行して、生成的人工知能(AI)は、現実世界の写真によく似た合成画像を作成する可能性を解き放った。ここから、生成AIの進歩によってビジュアルインテリジェンスはどの程度の恩恵を受けられるのか、という問いが生じる。 本稿では、これらのAI生成画像を新たなデータソースとして活用する革新的な概念を探求し、ビジュアルインテリジェンスにおける従来のモデリングパラダイムを再構成する。実データとは対照的に、AI生成データには、比類のない豊富さとスケーラビリティ、膨大なデータセットの高速生成、エッジケースの容易なシミュレーションなど、大きな利点がある。生成AIモデルの成功に基づいて、機械学習モデルのトレーニングから、計算モデリング、テスト、検証のためのシナリオのシミュレーションまで、さまざまなアプリケーションにおける生成データの可能性を検証する。我々は、この変革的なパラダイムシフトに伴う倫理的、法的、実践的な考慮事項を深く議論しつつ、生成AIのこうした利用を支える技術基盤を探求する。本稿では,現在の技術と応用の徹底的な調査を通じて,ビジュアルインテリジェンスにおける合成時代の包括的展望を示す。この論文に関連するプロジェクトは、https://github.com/mwxely/AIGS で見ることができる。 The advancement of visual intelligence is intrinsically tethered to the availability of large-scale data. In parallel, generative Artificial Intelligence (AI) has unlocked the potential to create synthetic images that closely resemble real-world photographs. This prompts a compelling inquiry: how much visual intelligence could benefit from the advance of generative AI? This paper explores the innovative concept of harnessing these AI-generated images as new data sources, reshaping traditional modeling paradigms in visual intelligence. In contrast to real data, AI-generated data exhibit remarkable advantages, including unmatched abundance and scalability, the rapid generation of vast datasets, and the effortless simulation of edge cases. Built on the success of generative AI models, we examine the potential of their generated data in a range of applications, from training machine learning models to simulating scenarios for computational modeling, testing, and validation. We probe the technological foundations that support this groundbreaking use of generative AI, engaging in an in-depth discussion on the ethical, legal, and practical considerations that accompany this transformative paradigm shift. 
Through an exhaustive survey of current technologies and applications, this paper presents a comprehensive view of the synthetic era in visual intelligence. A project associated with this paper can be found at https://github.com/mwxely/AIGS .	翻訳日:2023-10-25 07:09:49 公開日:2023-10-23
# 対話管理を改善する: 品質データセット対モデル Improving Dialogue Management: Quality Datasets vs Models ( http://arxiv.org/abs/2310.01339v2 ) ライセンス: Link先を確認	Miguel Ángel Medina-Ramírez, Cayetano Guerra-Artal, Mario Hernández-Tejera	(参考訳) タスク指向対話システム(TODS)は,ユーザが自然言語を用いて機械やコンピュータと対話する上で重要になっている。その重要なコンポーネントの1つが対話マネージャ(DM)で、最善の応答を提供することで、会話をユーザにとって良い目標に向けて導く。従来,ルールベースシステム (RBS) や強化学習 (RL) ,教師付き学習 (SL) などが,適切な対話管理,すなわちユーザの入力に対して最良の応答を選択するためのソリューションとして提案されてきた。しかし本研究は、DMが最大性能を達成できない主な原因は、これまで採用されてきたモデルではなくデータセットの品質にあると主張する。すなわち、誤ラベル付けなどのデータセットの誤りが、対話管理における失敗の大部分を引き起こしている。この仮説を実証するために、最も広く使われているデータセットであるMultiwoz 2.1とSGDにおける主なエラーを調査した。そのために我々は,データセットに導入されるエラーの量と種類を完全に制御できる合成対話生成器を設計した。この生成器を用いて、データセット内の誤りがモデルの性能に比例して影響することを示した。 Task-oriented dialogue systems (TODS) have become crucial for users to interact with machines and computers using natural language. One of its key components is the dialogue manager (DM), which guides the conversation towards a good goal for the user by providing the best possible response. Previous works have proposed rule-based systems (RBS), reinforcement learning (RL), and supervised learning (SL) as solutions for correct dialogue management, in other words, selecting the best response given the user's input. However, this work argues that the leading cause of DMs not achieving maximum performance resides in the quality of the datasets rather than the models employed thus far; this means that dataset errors, like mislabeling, cause a large percentage of failures in dialogue management. We studied the main errors in the most widely used datasets, Multiwoz 2.1 and SGD, to demonstrate this hypothesis. To do this, we have designed a synthetic dialogue generator to fully control the amount and type of errors introduced in the dataset. Using this generator, we demonstrated that errors in the datasets contribute proportionally to the performance of the models.	翻訳日:2023-10-25 07:09:28 公開日:2023-10-23
# グラフ畳み込みネットワークを用いたロバストな心臓セグメンテーションに向けて Towards Robust Cardiac Segmentation using Graph Convolutional Networks ( http://arxiv.org/abs/2310.01210v2 ) ライセンス: Link先を確認	Gilles Van De Vyver, Sarina Thomas, Guy Ben-Yosef, Sindre Hellum Olaisen, Håvard Dalen, Lasse Løvstakken, and Erik Smistad	(参考訳) 完全自動の心臓セグメンテーションは、心エコー検査から臨床測定値を抽出する高速かつ再現可能な方法となりうる。 U-Netアーキテクチャは医用セグメンテーションのための現在の最先端のディープラーニングアーキテクチャであり、観察者間変動に匹敵する平均誤差で心臓構造をリアルタイムにセグメンテーションできる。しかし、このアーキテクチャは依然として、しばしば解剖学的に正しくない大きな外れ値を生成する。本研究はグラフ畳み込みニューラルネットワークの概念を用いて、各ピクセルをラベル付けするのではなく、関心のある構造の輪郭点を予測する。本研究では,心臓解剖学に基づく2つの畳み込みリングを用いたグラフアーキテクチャを提案し,公開されているCAMUSデータセットにおいて,解剖学的に正しくない多構造セグメンテーションが解消されることを示す。さらに、本研究はグラフ畳み込みアーキテクチャに関するアブレーション研究と、臨床HUNT4データセットにおける臨床測定値の評価を提供する。最後に,U-Netとグラフネットワークのモデル間合意を,入力品質とセグメンテーション品質の両方の予測器として用いることを提案する。この予測器は,分布外および不適切な入力画像をリアルタイムに検出できることを示す。ソースコード: https://github.com/gillesvntnu/GCN_multistructure Fully automatic cardiac segmentation can be a fast and reproducible method to extract clinical measurements from an echocardiography examination. The U-Net architecture is the current state-of-the-art deep learning architecture for medical segmentation and can segment cardiac structures in real-time with average errors comparable to inter-observer variability. However, this architecture still generates large outliers that are often anatomically incorrect. This work uses the concept of graph convolutional neural networks that predict the contour points of the structures of interest instead of labeling each pixel. We propose a graph architecture that uses two convolutional rings based on cardiac anatomy and show that this eliminates anatomically incorrect multi-structure segmentations on the publicly available CAMUS dataset. Additionally, this work contributes an ablation study on the graph convolutional architecture and an evaluation of clinical measurements on the clinical HUNT4 dataset. Finally, we propose to use the inter-model agreement of the U-Net and the graph network as a predictor of both the input and segmentation quality. 
We show this predictor can detect out-of-distribution and unsuitable input images in real-time. Source code is available online: https://github.com/gillesvntnu/GCN_multistructure	翻訳日:2023-10-25 07:08:39 公開日:2023-10-23
# 深層異常検出のために慣れ親しんだ特徴を超えて Going Beyond Familiar Features for Deep Anomaly Detection ( http://arxiv.org/abs/2310.00797v2 ) ライセンス: Link先を確認	Sarath Sivaprasad and Mario Fritz	(参考訳) 異常検出(AD)は、学習された正常性のモデルに適合しない観測を識別する重要なタスクである。深層ADにおける先行研究は主に親近性仮説(familiarity hypothesis)に基づいており、事前訓練された埋め込み空間において既知の特徴が参照として機能する。この戦略は非常に成功しているが、事前訓練された符号化ではうまく捉えられない真に新しい特徴からなる異常に対しては、一貫して偽陰性を引き起こすことが判明した。本稿では,説明可能性を用いて,新しい特徴を入力空間における説明されない観測として捉える新しいAD手法を提案する。類似性と新規性をハイブリッドアプローチで組み合わせることで,幅広い異常ベンチマークにおいて高い性能を実現する。提案手法は,複数のベンチマークにまたがって新たな最先端性能を確立し,多様な異常タイプを扱うとともに,高価なバックグラウンドモデルや密なマッチングを不要にする。特に,新しい特徴を考慮することで,挑戦的なベンチマークにおいて,最先端手法と比較して偽陰性の異常を最大40%削減できることを示す。本手法は,画素レベルの異常に対して視覚的に検証可能な説明を与える。 Anomaly Detection (AD) is a critical task that involves identifying observations that do not conform to a learned model of normality. Prior work in deep AD is predominantly based on a familiarity hypothesis, where familiar features serve as the reference in a pre-trained embedding space. While this strategy has proven highly successful, it turns out that it causes consistent false negatives when anomalies consist of truly novel features that are not well captured by the pre-trained encoding. We propose a novel approach to AD using explainability to capture novel features as unexplained observations in the input space. We achieve strong performance across a wide range of anomaly benchmarks by combining similarity and novelty in a hybrid approach. Our approach establishes a new state-of-the-art across multiple benchmarks, handling diverse anomaly types while eliminating the need for expensive background models and dense matching. In particular, we show that by taking account of novel features, we reduce false negative anomalies by up to 40% on challenging benchmarks compared to the state-of-the-art. Our method gives visually inspectable explanations for pixel-level anomalies.	翻訳日:2023-10-25 07:08:20 公開日:2023-10-23
# 通信ネットワークにおける情報ルーティングのための状態拡張方策の学習 Learning State-Augmented Policies for Information Routing in Communication Networks ( http://arxiv.org/abs/2310.00248v2 ) ライセンス: Link先を確認	Sourajit Das, Navid NaderiAlizadeh, Alejandro Ribeiro	(参考訳) 本稿では,大規模通信ネットワークにおける情報ルーティングの問題について検討する。この問題は,ローカル情報のみにアクセスできる制約付き統計学習問題として定式化できる。本稿では,通信ネットワークのトポロジカルリンク上にグラフ畳み込みを配置することにより,グラフニューラルネットワーク(GNN)アーキテクチャを用いて,ソースノードの集約情報を最大化する新しい状態拡張(SA)戦略を示す。提案手法では,各ノードで利用可能なローカル情報のみを利用し,所望の情報を効率的に宛先ノードにルーティングする。教師なし学習手法を利用して、GNNアーキテクチャの出力を最適な情報ルーティング戦略に変換する。実験では,実時間ネットワークトポロジ上で評価を行い,アルゴリズムの有効性を検証する。数値シミュレーションは,GNNパラメータ化の学習において,提案手法がベースラインアルゴリズムよりも性能が向上することを示している。 This paper examines the problem of information routing in a large-scale communication network, which can be formulated as a constrained statistical learning problem having access to only local information. We delineate a novel State Augmentation (SA) strategy to maximize the aggregate information at source nodes using graph neural network (GNN) architectures, by deploying graph convolutions over the topological links of the communication network. The proposed technique leverages only the local information available at each node and efficiently routes desired information to the destination nodes. We leverage an unsupervised learning procedure to convert the output of the GNN architecture to optimal information routing strategies. In the experiments, we perform the evaluation on real-time network topologies to validate our algorithms. Numerical simulations depict the improved performance of the proposed method in training a GNN parameterization as compared to baseline algorithms.	翻訳日:2023-10-25 07:07:46 公開日:2023-10-23
# 2体クーロン問題と隠れ$g^{(2)}$代数:超可積分性と立方多項式代数 Two-body Coulomb problem and hidden $g^{(2)}$ algebra: superintegrability and cubic polynomial algebra ( http://arxiv.org/abs/2309.16886v2 ) ライセンス: Link先を確認	Alexander V. Turbiner and Adrian M. Escobar-Ruiz	(参考訳) Sturm表現における2体クーロン問題が、$g^{(2)}$隠れ代数と積分の立方多項式代数を持つ、曲がった空間における新しい2次元の厳密に解ける超可積分量子系を導くことを示す。 2つの積分はそれぞれ2次と4次であり、前者は角運動量の2成分から、後者は修正されたラプラス・ルンゲ・レンツベクトルから構成される。立方多項式代数は普遍包絡代数 $U_{g^{(2)}}$ の無限次元部分代数であることが示されている。 It is shown that the two-body Coulomb problem in the Sturm representation leads to a new two-dimensional, exactly-solvable, superintegrable quantum system in curved space with a $g^{(2)}$ hidden algebra and a cubic polynomial algebra of integrals. The two integrals are of orders two and four; they are made from two components of the angular momentum and from the modified Laplace-Runge-Lenz vector, respectively. It is demonstrated that the cubic polynomial algebra is an infinite-dimensional subalgebra of the universal enveloping algebra $U_{g^{(2)}}$.	翻訳日:2023-10-25 07:07:31 公開日:2023-10-23
# GPT-Fathom: GPT-4以降への進化経路を理解するための大規模言語モデルのベンチマーク GPT-Fathom: Benchmarking Large Language Models to Decipher the Evolutionary Path towards GPT-4 and Beyond ( http://arxiv.org/abs/2309.16583v3 ) ライセンス: Link先を確認	Shen Zheng, Yuyu Zhang, Yijie Zhu, Chenguang Xi, Pengyang Gao, Xun Zhou, Kevin Chen-Chuan Chang	(参考訳) 大規模言語モデル(LLM)の急速な進歩により、その能力と限界を評価するための総合的な評価スイートの必要性が高まっている。既存のLLMのリーダーボードは、設定やプロンプトが統一されないまま他の論文で報告されたスコアを参照することが多く、より良い結果を得るために都合のよい設定やプロンプトを選別することを意図せず助長しかねない。本稿では, OpenAI Evals 上に構築されたオープンソースかつ再現可能な LLM 評価スイートである GPT-Fathom を紹介する。我々は,7つの能力カテゴリにまたがる20以上の厳選されたベンチマークにおいて,10以上の主要なLLMとOpenAIのレガシモデルを,統一した設定の下で体系的に評価した。 OpenAIの初期のモデルに関する我々の振り返り研究は、GPT-3からGPT-4への進化経路に関する貴重な洞察を提供する。現在コミュニティは、GPT-3がどのようにGPT-4へと段階的に進化したのかを知りたがっており、それには、コードデータを追加することでLLMの推論能力が改善されるかどうか、SFTとRLHFによってLLMの能力のどの面が改善されるのか、アライメント税はどの程度か、といった技術的な詳細が含まれる。我々の分析は、先進的なLLMの透明性向上を目的として、これらの疑問の多くに光を当てている。 With the rapid advancement of large language models (LLMs), there is a pressing need for a comprehensive evaluation suite to assess their capabilities and limitations. Existing LLM leaderboards often reference scores reported in other papers without consistent settings and prompts, which may inadvertently encourage cherry-picking favored settings and prompts for better results. In this work, we introduce GPT-Fathom, an open-source and reproducible LLM evaluation suite built on top of OpenAI Evals. We systematically evaluate 10+ leading LLMs as well as OpenAI's legacy models on 20+ curated benchmarks across 7 capability categories, all under aligned settings. Our retrospective study on OpenAI's earlier models offers valuable insights into the evolutionary path from GPT-3 to GPT-4. Currently, the community is eager to know how GPT-3 progressively improves to GPT-4, including technical details like whether adding code data improves LLM's reasoning capability, which aspects of LLM capability can be improved by SFT and RLHF, how large the alignment tax is, etc. Our analysis sheds light on many of these questions, aiming to improve the transparency of advanced LLMs.	翻訳日:2023-10-25 07:07:17 公開日:2023-10-23
# パーソナライズされた確率的オウムはより危険か? 対話システムにおけるペルソナバイアスの評価 Are Personalized Stochastic Parrots More Dangerous? Evaluating Persona Biases in Dialogue Systems ( http://arxiv.org/abs/2310.05280v4 ) ライセンス: Link先を確認	Yixin Wan, Jieyu Zhao, Aman Chadha, Nanyun Peng, Kai-Wei Chang	(参考訳) 大規模言語モデルの最近の進歩は、会話において一般的または特定の人口統計学的ペルソナを模倣するなど、フリーフォームの指示に従うことを可能にしている。我々は、一般的なペルソナを「アジア人」などの人口集団を表すものとして定義する一方、特定のペルソナは「Yumi」のような一般的なアジア系の名前の形をとることがある。ペルソナの採用は対話システムをより魅力的で親しみやすいものにしてユーザ体験を豊かにする一方で、モデル応答内の社会的バイアスを悪化させ、ユーザとのインタラクションを通じて社会的な危害をもたらすという潜在的なリスクの影も落とす。本稿では,採用するペルソナに応じて対話モデルの有害な振る舞いが変化する度合いとして定義する「ペルソナバイアス」を体系的に研究する。我々は,ペルソナバイアスを有害な表現におけるバイアスと有害な合意におけるバイアスに分類し,攻撃性(Offensiveness),有害な継続(Toxic Continuation),敬意(Regard),ステレオタイプへの合意(Stereotype Agreement),有害な合意(Toxic Agreement)の5つの側面でペルソナバイアスを測定する包括的な評価枠組みを確立する。さらに,一般的および特定のさまざまな種類のモデルペルソナを包含する,体系的に構築されたペルソナデータセットであるUNIVERSALPERSONAを用いて,ペルソナバイアスの調査を行う。 Blender、ChatGPT、Alpaca、Vicunaの4つの異なるモデルのベンチマークによって、対話システムにおける重大なペルソナバイアスが明らかになった。また,安全な利用を確保するために,対話エージェントにおけるペルソナの使用を再検討する差し迫った必要性も浮き彫りになった。 Recent advancements in Large Language Models empower them to follow freeform instructions, including imitating generic or specific demographic personas in conversations. We define generic personas to represent demographic groups, such as "an Asian person", whereas specific personas may take the form of specific popular Asian names like "Yumi". While the adoption of personas enriches user experiences by making dialogue systems more engaging and approachable, it also casts a shadow of potential risk by exacerbating social biases within model responses, thereby causing societal harm through interactions with users. In this paper, we systematically study "persona biases", which we define to be the sensitivity of dialogue models' harmful behaviors contingent upon the personas they adopt. We categorize persona biases into biases in harmful expression and harmful agreement, and establish a comprehensive evaluation framework to measure persona biases in five aspects: Offensiveness, Toxic Continuation, Regard, Stereotype Agreement, and Toxic Agreement. 
Additionally, we propose to investigate persona biases by experimenting with UNIVERSALPERSONA, a systematically constructed persona dataset encompassing various types of both generic and specific model personas. Through benchmarking on four different models -- including Blender, ChatGPT, Alpaca, and Vicuna -- our study uncovers significant persona biases in dialogue systems. Our findings also underscore the pressing need to revisit the use of personas in dialogue agents to ensure safe application.	翻訳日:2023-10-25 07:02:31 公開日:2023-10-23
# DialCoTがPPOに - より小さな言語モデルにおける推論パスの分解と探索 DialCoT Meets PPO: Decomposing and Exploring Reasoning Paths in Smaller Language Models ( http://arxiv.org/abs/2310.05074v3 ) ライセンス: Link先を確認	Chengcheng Han, Xiaowei Du, Che Zhang, Yixin Lian, Xiang Li, Ming Gao, Baoyuan Wang	(参考訳) CoT(Chain-of-Thought)プロンプトは、少なくとも1000億のパラメータを持つLLM(Large Language Models)の推論能力を高めるのに有効であることが証明されている。しかし、100億未満のパラメータを持つ小型言語モデル(slms)の推論タスクに適用されると、効果や有害性は失われる。この制限に対処するために,対話形式を用いて中間的推論ステップを生成し,モデルを最終回答へと導く対話ガイド付き連鎖思考 (dialcot) を導入する。さらに,ppo(proximal policy optimization)アルゴリズムを用いてモデルの推論パス選択を最適化し,推論能力をさらに向上させる。提案手法は従来の手法に比べていくつかの利点がある。まず、より単純なサブクエストに分解することで複雑な推論問題の解法を変換し、タスクの難易度を大幅に低減し、SLMに適したものにする。次に、PPOアルゴリズムを用いてモデルの推論経路の選択を最適化する。 4つの算術推論データセットについて包括的実験を行い,本手法が最先端の競争相手に比べて大幅な性能向上を実現することを実証した。 Chain-of-Thought (CoT) prompting has proven to be effective in enhancing the reasoning capabilities of Large Language Models (LLMs) with at least 100 billion parameters. However, it is ineffective or even detrimental when applied to reasoning tasks in Smaller Language Models (SLMs) with less than 10 billion parameters. To address this limitation, we introduce Dialogue-guided Chain-of-Thought (DialCoT) which employs a dialogue format to generate intermediate reasoning steps, guiding the model toward the final answer. Additionally, we optimize the model's reasoning path selection using the Proximal Policy Optimization (PPO) algorithm, further enhancing its reasoning capabilities. Our method offers several advantages compared to previous approaches. Firstly, we transform the process of solving complex reasoning questions by breaking them down into a series of simpler sub-questions, significantly reducing the task difficulty and making it more suitable for SLMs. Secondly, we optimize the model's reasoning path selection through the PPO algorithm. 
We conduct comprehensive experiments on four arithmetic reasoning datasets, demonstrating that our method achieves significant performance improvements compared to state-of-the-art competitors.	翻訳日:2023-10-25 07:01:34 公開日:2023-10-23
# 大規模言語モデルにおける幻覚の厄介な発生 -- 包括的定義、定量化、規範的修復 The Troubling Emergence of Hallucination in Large Language Models -- An Extensive Definition, Quantification, and Prescriptive Remediations ( http://arxiv.org/abs/2310.04988v2 ) ライセンス: Link先を確認	Vipula Rawte, Swagata Chakraborty, Agnibh Pathak, Anubhav Sarkar, S.M Towhidul Islam Tonmoy, Aman Chadha, Amit P. Sheth, Amitava Das	(参考訳) 最近のLLM(Large Language Models)の進歩は、その顕著な創発的能力によって広く称賛されている。しかし、幻覚の問題が副産物として並行して現れ、重大な懸念を呈している。近年、さまざまな種類の幻覚を特定・緩和する取り組みはいくつかあるが、幻覚の精緻な分類とそれに関連する緩和方法にはあまり重点が置かれていない。このギャップに対処するために、我々は、その程度、方向性、カテゴリに基づいて幻覚をプロファイリングするきめ細かい議論と、緩和戦略を提供する。まず、幻覚の2つの包括的な方向性を定義する。 (i)事実ミラージュ(FM)及び (ii)シルバーライニング(SL)である。より包括的に理解するために、両方向はさらに内在的と外在的に細分類され、3段階の重大度を持つ。 (i)軽度、 (ii)中程度、 (iii)警戒レベルである。また、幻覚を慎重に6種類に分類する。 (i)頭字語の曖昧さ(acronym ambiguity)、 (ii)数値の厄介さ(numeric nuisance)、 (iii)生成されたゴーレム(generated golem)、 (iv)仮想の声(virtual voice)、 (v)地理的誤り(geographic erratum)、及び (vi)タイムラップ(time wrap)である。さらに,HallucInation eLiciTation (HILT) を整備した。これは,15個の現代的なLLMを用いて生成した75,000個のサンプルと,前述のカテゴリに対する人間のアノテーションからなる公開データセットである。最後に,幻覚の生じやすさに基づいてLLMを評価・ランク付けできる定量化手法と比較尺度を確立するために,Hallucination Vulnerability Index (HVI)を提案する。私たちは、HVIが幅広いNLPコミュニティのツールとして重要な価値を持ち、AI関連の政策立案における基準としても役立つ可能性があると強く信じている。結論として,幻覚を緩和するための2つの解決戦略を提案する。 The recent advancements in Large Language Models (LLMs) have garnered widespread acclaim for their remarkable emerging capabilities. However, the issue of hallucination has emerged in parallel as a by-product, posing significant concerns. While some recent endeavors have been made to identify and mitigate different types of hallucination, there has been a limited emphasis on the nuanced categorization of hallucination and associated mitigation methods. To address this gap, we offer a fine-grained discourse on profiling hallucination based on its degree, orientation, and category, along with offering strategies for alleviation. As such, we define two overarching orientations of hallucination: (i) factual mirage (FM) and (ii) silver lining (SL). 
To provide a more comprehensive understanding, both orientations are further sub-categorized into intrinsic and extrinsic, with three degrees of severity: (i) mild, (ii) moderate, and (iii) alarming. We also meticulously categorize hallucination into six types: (i) acronym ambiguity, (ii) numeric nuisance, (iii) generated golem, (iv) virtual voice, (v) geographic erratum, and (vi) time wrap. Furthermore, we curate HallucInation eLiciTation (HILT), a publicly available dataset comprising 75,000 samples generated using 15 contemporary LLMs along with human annotations for the aforementioned categories. Finally, to quantify hallucination and offer a comparative spectrum for evaluating and ranking LLMs by their vulnerability to producing hallucinations, we propose the Hallucination Vulnerability Index (HVI). We firmly believe that HVI holds significant value as a tool for the wider NLP community, with the potential to serve as a rubric in AI-related policy-making. In conclusion, we propose two solution strategies for mitigating hallucinations.	翻訳日:2023-10-25 07:00:57 公開日:2023-10-23
# 中国語大言語モデルにおける幻覚評価 Evaluating Hallucinations in Chinese Large Language Models ( http://arxiv.org/abs/2310.03368v2 ) ライセンス: Link先を確認	Qinyuan Cheng, Tianxiang Sun, Wenwei Zhang, Siyin Wang, Xiangyang Liu, Mozhi Zhang, Junliang He, Mianqiu Huang, Zhangyue Yin, Kai Chen, Xipeng Qiu	(参考訳) 本稿では,中国語大規模言語モデルにおける幻覚現象を測定するために,HalluQA (Chinese Hallucination Question-Answering) というベンチマークを作成した。 HalluQAには450の綿密に設計された敵対的質問が含まれており、複数のドメインにまたがり、中国の歴史的文化、慣習、社会現象を考慮に入れている。 HalluQAの構築にあたり,模倣的虚偽と事実誤りの2種類の幻覚を考慮し,GLM-130B と ChatGPT に基づいて敵対的サンプルを構築した。評価のために,モデル出力が幻覚かどうかを判定する,GPT-4を用いた自動評価手法を設計する。 ERNIE-Bot、Baichuan2、ChatGLM、Qwen、SparkDeskなど、24の大規模言語モデルに関する広範な実験を行った。 24モデル中、18モデルは非幻覚率が50%未満であった。これはHalluQAが非常に難しいことを示している。さまざまな種類のモデルにおける幻覚の主なタイプとその原因を分析した。さらに,モデルの種類ごとにどの種類の幻覚を優先して対処すべきかについて議論する。 In this paper, we establish a benchmark named HalluQA (Chinese Hallucination Question-Answering) to measure the hallucination phenomenon in Chinese large language models. HalluQA contains 450 meticulously designed adversarial questions, spanning multiple domains, and takes into account Chinese historical culture, customs, and social phenomena. During the construction of HalluQA, we consider two types of hallucinations: imitative falsehoods and factual errors, and we construct adversarial samples based on GLM-130B and ChatGPT. For evaluation, we design an automated evaluation method using GPT-4 to judge whether a model output is hallucinated. We conduct extensive experiments on 24 large language models, including ERNIE-Bot, Baichuan2, ChatGLM, Qwen, SparkDesk, etc. Out of the 24 models, 18 achieved non-hallucination rates lower than 50%. This indicates that HalluQA is highly challenging. We analyze the primary types of hallucinations in different types of models and their causes. Additionally, we discuss which types of hallucinations should be prioritized for different types of models.	翻訳日:2023-10-25 06:59:37 公開日:2023-10-23
# なぜこの記事を削除するべきか? 多言語ウィキペディア編集者討論における透明スタンス検出 Why Should This Article Be Deleted? Transparent Stance Detection in Multilingual Wikipedia Editor Discussions ( http://arxiv.org/abs/2310.05779v2 ) ライセンス: Link先を確認	Lucie-Aimée Kaffee, Arnav Arora and Isabelle Augenstein	(参考訳) オンラインプラットフォーム上のコンテンツのモデレーションは通常透明ではない。しかし、ウィキペディアでは、この議論が公に行われ、編集者はモデレーション決定の説明としてコンテンツモデレーションポリシーを用いることが奨励されている。現在、これらのポリシーに明示的に言及するコメントはごくわずかで、英語のコメントでは20%だが、ドイツ語とトルコ語のコメントではわずか2%である。コンテンツがどのようにモデレーションされるかを理解する一助として、ウィキペディア編集者の議論とその理由付けからなる新たな多言語データセットを3つの言語で構築する。データセットには、編集上の決定ごとに、編集者のスタンス(keep、delete、merge、comment)と、述べられた理由、およびコンテンツモデレーションポリシーが含まれている。スタンスとそれに対応する理由(ポリシー)を高い精度で同時に予測でき、意思決定プロセスに透明性を付加できることを実証する。我々は,自動化された透明なコンテンツモデレーションのさらなる研究のために,同時予測モデルと多言語コンテンツモデレーションデータセットの両方を公開する。 The moderation of content on online platforms is usually non-transparent. On Wikipedia, however, this discussion is carried out publicly and the editors are encouraged to use the content moderation policies as explanations for making moderation decisions. Currently, only a few comments explicitly mention those policies -- 20% of the English ones, but as few as 2% of the German and Turkish comments. To aid in this process of understanding how content is moderated, we construct a novel multilingual dataset of Wikipedia editor discussions along with their reasoning in three languages. The dataset contains the stances of the editors (keep, delete, merge, comment), along with the stated reason, and a content moderation policy, for each edit decision. We demonstrate that stance and corresponding reason (policy) can be predicted jointly with a high degree of accuracy, adding transparency to the decision-making process. We release both our joint prediction models and the multilingual content moderation dataset for further research on automated transparent content moderation.	翻訳日:2023-10-25 06:49:22 公開日:2023-10-23
# Siameseエンコーダの帰属法 An Attribution Method for Siamese Encoders ( http://arxiv.org/abs/2310.05703v2 ) ライセンス: Link先を確認	Lucas Möller, Dmitry Nikolaev, Sebastian Padó	(参考訳) 文変換器(ST)のようなSiameseエンコーダモデルの成功にもかかわらず、それらが入力のどの側面に注意を払っているかについてはほとんど知られていない。障害となっているのは、これらのモデルが単一の入力を処理するのではなく2つの入力を比較するため、予測を個々の特徴に帰属させられないことである。本稿では,複数の入力を持つモデルに対して統合勾配の原理を一般化し,Siameseエンコーダの局所帰属法を導出する。この解は特徴対への帰属という形式をとり、ST ではトークン対トークンの行列に還元できる。我々の手法は、積分ヤコビアンを導入し、統合勾配の有利な形式的特性を継承する。すなわち、モデルの完全な計算グラフを考慮に入れ、実際の予測に収束することが保証される。パイロット研究では、STにおいてごく少数のトークンペアがしばしば予測の大部分を説明でき、STが名詞と動詞に焦点を当てていることが示された。ただし、正確な予測のためには、トークンと品詞の大部分に注意を向ける必要がある。 Despite the success of Siamese encoder models such as sentence transformers (ST), little is known about the aspects of inputs they pay attention to. A barrier is that their predictions cannot be attributed to individual features, as they compare two inputs rather than processing a single one. This paper derives a local attribution method for Siamese encoders by generalizing the principle of integrated gradients to models with multiple inputs. The solution takes the form of feature-pair attributions, and can be reduced to a token-token matrix for STs. Our method involves the introduction of integrated Jacobians and inherits the advantageous formal properties of integrated gradients: it accounts for the model's full computation graph and is guaranteed to converge to the actual prediction. A pilot study shows that in an ST few token-pairs can often explain large fractions of predictions, and it focuses on nouns and verbs. For accurate predictions, however, it needs to attend to the majority of tokens and parts of speech.	翻訳日:2023-10-25 06:49:04 公開日:2023-10-23
# スケーラブルなメタ学習を実践する Making Scalable Meta Learning Practical ( http://arxiv.org/abs/2310.05674v2 ) ライセンス: Link先を確認	Sang Keun Choe, Sanket Vaibhav Mehta, Hwijeen Ahn, Willie Neiswanger, Pengtao Xie, Emma Strubell, Eric Xing	(参考訳) 機械学習プログラムにおける多様な帰納バイアスを学習できる柔軟性にもかかわらず、メタ学習(すなわち学習することの学習)は、膨大な計算/メモリコスト、トレーニングの不安定性、効率的な分散トレーニングサポートの欠如により、スケーラビリティの低さに悩まされてきた。本研究では,暗黙微分アルゴリズムとシステムの両方の進歩を組み合わせたSAMAを導入することで,スケーラブルなメタ学習の実用化に注力する。具体的には,SAMAは,メタ学習プログラムのベースレベルにおいて幅広い適応型オプティマイザを柔軟にサポートしつつ,2階勾配情報の明示的な計算を回避し,一階勾配向けに実装された効率的な分散トレーニング技術を活用することで計算負担を低減するように設計されている。複数の大規模メタ学習ベンチマークで評価した結果、SAMAは、他のベースラインメタ学習アルゴリズムと比較して、シングルGPU/マルチGPUのセットアップでそれぞれ、スループットが最大1.7/4.8倍に向上し、メモリ消費が最大2.0/3.8倍削減されることを示した。さらに,SAMAに基づくデータ最適化により,BERT と RoBERTa の大規模言語モデルによるテキスト分類精度が一貫して向上し,画像分類タスクにおける小規模・大規模データプルーニングの両方で最先端の結果を達成することを示し,言語と視覚の領域にまたがるスケーラブルなメタ学習の実践的適用性を実証した。 Despite its flexibility to learn diverse inductive biases in machine learning programs, meta learning (i.e., learning to learn) has long been recognized to suffer from poor scalability due to its tremendous compute/memory costs, training instability, and a lack of efficient distributed training support. In this work, we focus on making scalable meta learning practical by introducing SAMA, which combines advances in both implicit differentiation algorithms and systems. Specifically, SAMA is designed to flexibly support a broad range of adaptive optimizers in the base level of meta learning programs, while reducing computational burden by avoiding explicit computation of second-order gradient information, and exploiting efficient distributed training techniques implemented for first-order gradients. Evaluated on multiple large-scale meta learning benchmarks, SAMA showcases up to 1.7/4.8x increase in throughput and 2.0/3.8x decrease in memory consumption respectively on single-/multi-GPU setups compared to other baseline meta learning algorithms. 
Furthermore, we show that SAMA-based data optimization leads to consistent improvements in text classification accuracy with BERT and RoBERTa large language models, and achieves state-of-the-art results in both small- and large-scale data pruning on image classification tasks, demonstrating the practical applicability of scalable meta learning across language and vision domains.	翻訳日:2023-10-25 06:48:46 公開日:2023-10-23
# InterroLang: 対話ベースの説明によるNLPモデルとデータセットの探索 InterroLang: Exploring NLP Models and Datasets through Dialogue-based Explanations ( http://arxiv.org/abs/2310.05592v2 ) ライセンス: Link先を確認	Nils Feldhus, Qianli Wang, Tatiana Anikina, Sahil Chopra, Cennet Oguz, Sebastian Möller	(参考訳) 最近開発されたNLP説明可能性手法により、さまざまな方法でブラックボックスを開けられるようになったが(Madsen et al., 2022)、この取り組みに欠けているのは、会話型インタフェースを提供する対話的ツールである。このような対話システムは、例えば明確化質問やフォローアップ質問、自然言語インタフェースを通じて、ユーザが文脈に即した形で説明付きでデータセットやモデルを探索するのに役立つ。我々は対話型説明フレームワークTalkToModel(Slack et al., 2022)をNLPドメインに適用し、自由文合理化などの新たなNLP固有の操作を追加し、3つのNLPタスク(対話行為分類、質問応答、ヘイトスピーチ検出)でその一般化可能性を示す。説明を求めるユーザクエリを認識するために,微調整モデルと少数ショットプロンプトモデルを評価し,新しいAdapterベースのアプローチを実装した。次に,(1)対話の正しさと有用性の主観的評価,(2)シミュレーション可能性,すなわちモデルの予測ラベルが示されていないときに,人間がそれを推測するうえで対話的説明が客観的にどの程度役立つか,に関する2つのユーザ研究を行う。モデル行動の説明には合理化と特徴帰属が有効であることがわかった。さらに、ユーザーは単発の説明よりも説明対話に基づくほうが、モデルの結果をより確実に予測できた。 While recently developed NLP explainability methods let us open the black box in various ways (Madsen et al., 2022), a missing ingredient in this endeavor is an interactive tool offering a conversational interface. Such a dialogue system can help users explore datasets and models with explanations in a contextualized manner, e.g. via clarification or follow-up questions, and through a natural language interface. We adapt the conversational explanation framework TalkToModel (Slack et al., 2022) to the NLP domain, add new NLP-specific operations such as free-text rationalization, and illustrate its generalizability on three NLP tasks (dialogue act classification, question answering, hate speech detection). To recognize user queries for explanations, we evaluate fine-tuned and few-shot prompting models and implement a novel Adapter-based approach. We then conduct two user studies on (1) the perceived correctness and helpfulness of the dialogues, and (2) the simulatability, i.e. how objectively helpful dialogical explanations are for humans in figuring out the model's predicted label when it's not shown. 
We found rationalization and feature attribution were helpful in explaining the model behavior. Moreover, users could more reliably predict the model outcome based on an explanation dialogue rather than one-off explanations.	翻訳日:2023-10-25 06:48:01 公開日:2023-10-23
# 信頼性の確立:課題再考とモデル評価 Establishing Trustworthiness: Rethinking Tasks and Model Evaluation ( http://arxiv.org/abs/2310.05442v2 ) ライセンス: Link先を確認	Robert Litschko, Max Müller-Eberstein, Rob van der Goot, Leon Weber, Barbara Plank	(参考訳) 言語理解は多面的な認知能力であり、自然言語処理(NLP)コミュニティは何十年もの間、その計算的なモデル化に取り組んできた。伝統的に、言語知能の諸側面は、特化したモデルアーキテクチャとそれに対応する評価プロトコルを備えたタスクに区画化されてきた。大規模言語モデル(LLM)の出現により、コミュニティは、生成モデルに基づく汎用的でタスク非依存のアプローチへの劇的なシフトを目撃した。結果として、従来の区画化された言語タスクの概念は崩れつつあり、それに伴って評価と分析の難しさが増している。同時に、LLMは、これまで想定されていなかったゼロショット設定を含む、より現実的なシナリオにデプロイされており、信頼できるシステムの必要性が高まっている。したがって、NLPにおけるタスクやモデル評価を構成するものを再考し、言語に関するより総合的な視点を追求し、その中心に信頼性を置くべき時であると論じる。この目標に向けて,モデルの機能的能力の起源を理解するための既存の区画化されたアプローチをレビューし,より多面的な評価プロトコルのための推奨事項を示す。 Language understanding is a multi-faceted cognitive capability, which the Natural Language Processing (NLP) community has striven to model computationally for decades. Traditionally, facets of linguistic intelligence have been compartmentalized into tasks with specialized model architectures and corresponding evaluation protocols. With the advent of large language models (LLMs) the community has witnessed a dramatic shift towards general purpose, task-agnostic approaches powered by generative models. As a consequence, the traditional compartmentalized notion of language tasks is breaking down, followed by an increasing challenge for evaluation and analysis. At the same time, LLMs are being deployed in more real-world scenarios, including previously unforeseen zero-shot setups, increasing the need for trustworthy and reliable systems. Therefore, we argue that it is time to rethink what constitutes tasks and model evaluation in NLP, and pursue a more holistic view on language, placing trustworthiness at the center. Towards this goal, we review existing compartmentalized approaches for understanding the origins of a model's functional capacity, and provide recommendations for more multi-faceted evaluation protocols.	翻訳日:2023-10-25 06:47:37 公開日:2023-10-23
# タスク適応型トークン化による長文生成効率の向上 Enhancing Long-form Text Generation Efficacy with Task-adaptive Tokenization ( http://arxiv.org/abs/2310.05317v4 ) ライセンス: Link先を確認	Siyang Liu, Naihao Deng, Sahand Sabour, Yilin Jia, Minlie Huang, Rada Mihalcea	(参考訳) 本稿では,ダウンストリームタスクの仕様に生成パイプラインを適用する方法としてタスク適応トークン化を提案し,メンタルヘルスにおける長期的生成の促進を図る。認知科学の知見に触発されて、タスク適応型トークンーザは複数の結果から可変セグメンテーションをサンプリングし、タスク固有データに基づいてサンプリング確率を最適化した。本稿では,専門用語構築のための戦略と,事前学習したモデルのトークン化ステップへのタスク固有のトークンの統合を可能にする語彙統合プロトコルを提案する。中国語と英語の心理学的質問応答タスクに関する広範な実験を通して、我々のタスク適応型トークン化アプローチは、最大60%のトークンを使用しながら、生成性能を大幅に改善することを発見した。予備実験は、非常に大きな言語モデルでトークン化アプローチを使用する場合に有望な結果を示す。 We propose task-adaptive tokenization as a way to adapt the generation pipeline to the specifics of a downstream task and enhance long-form generation in mental health. Inspired by insights from cognitive science, our task-adaptive tokenizer samples variable segmentations from multiple outcomes, with sampling probabilities optimized based on task-specific data. We introduce a strategy for building a specialized vocabulary and introduce a vocabulary merging protocol that allows for the integration of task-specific tokens into the pre-trained model's tokenization step. Through extensive experiments on psychological question-answering tasks in both Chinese and English, we find that our task-adaptive tokenization approach brings a significant improvement in generation performance while using up to 60% fewer tokens. Preliminary experiments point to promising results when using our tokenization approach with very large language models.	翻訳日:2023-10-25 06:47:19 公開日:2023-10-23
# Dual Radar: 自律走行のためのDual 4D Radarを備えたマルチモーダルデータセット Dual Radar: A Multi-modal Dataset with Dual 4D Radar for Autonomous Driving ( http://arxiv.org/abs/2310.07602v2 ) ライセンス: Link先を確認	Xinyu Zhang, Li Wang, Jian Chen, Cheng Fang, Lei Yang, Ziying Song, Guangqi Yang, Yichen Wang, Xiaofei Zhang, Qingshan Yang, Jun Li	(参考訳) radarは、広く採用されているカメラやライダーと比較して、自律運転環境認識の悪いシナリオに適応性が高い。一般的な3dレーダーと比較すると、最新の4dレーダーは正確な垂直解像度と高点の雲密度を持ち、複雑な環境知覚における自律運転のための非常に有望なセンサーである。しかし、LiDARよりもはるかに高いノイズのため、メーカーは異なるフィルタリング戦略を選択し、ノイズレベルと点雲密度の逆比をもたらす。自動運転における深層学習に基づく知覚アルゴリズムにとって、どの手法が有益かの比較分析がいまだに欠けている。主な理由の1つは、現在のデータセットが1種類の4Dレーダーのみを採用するため、同じシーンで異なる4Dレーダーを比較するのは困難である。そこで本研究では,2種類の4Dレーダを同時に撮影する大規模マルチモーダル・データセットを提案する。このデータセットは、有効な4Dレーダ認識アルゴリズムのさらなる研究を可能にし、我々のデータセットは151の連続するシリーズで構成され、そのほとんどは、正確に同期された10,007フレームを含む。さらに我々のデータセットは、多くの道路条件、天候条件、夜間と昼間の照明強度と期間を含む、様々な困難な運転シナリオをキャプチャします。私たちのデータセットは、3dオブジェクト検出とトラッキングに適用可能な連続フレームを注釈し、マルチモーダルタスクの研究もサポートする。我々はデータセットを実験的に検証し、異なる種類の4Dレーダーの研究に有用な結果を提供する。このデータセットはhttps://github.com/adept-thu/Dual-Radarで公開されている。 Radar has stronger adaptability in adverse scenarios for autonomous driving environmental perception compared to widely adopted cameras and LiDARs. Compared with commonly used 3D radars, the latest 4D radars have precise vertical resolution and higher point cloud density, making it a highly promising sensor for autonomous driving in complex environmental perception. However, due to the much higher noise than LiDAR, manufacturers choose different filtering strategies, resulting in an inverse ratio between noise level and point cloud density. There is still a lack of comparative analysis on which method is beneficial for deep learning-based perception algorithms in autonomous driving. One of the main reasons is that current datasets only adopt one type of 4D radar, making it difficult to compare different 4D radars in the same scene. 
Therefore, in this paper, we introduce a novel large-scale multi-modal dataset featuring, for the first time, two types of 4D radars captured simultaneously. This dataset enables further research into effective 4D radar perception algorithms. Our dataset consists of 151 consecutive series, most of which last 20 seconds and contain 10,007 meticulously synchronized and annotated frames. Moreover, our dataset captures a variety of challenging driving scenarios, including many road conditions, weather conditions, nighttime and daytime with different lighting intensities and periods. Our dataset annotates consecutive frames, which can be applied to 3D object detection and tracking, and also supports the study of multi-modal tasks. We experimentally validate our dataset, providing valuable results for studying different types of 4D radars. This dataset is released on https://github.com/adept-thu/Dual-Radar.	翻訳日:2023-10-25 06:41:02 公開日:2023-10-23
# 超流動性と固体:(5,5)カーボンナノチューブによる無摩擦物質輸送 Superfluidity meets the solid-state: frictionless mass-transport through a (5,5) carbon-nanotube ( http://arxiv.org/abs/2310.07476v2 ) ライセンス: Link先を確認	Alberto Ambrosetti, Pier Luigi Silvestrelli and Luca Salasnich	(参考訳) 超流動性(superfluidity)は、極低温の$^4$Heや希薄な原子気体のような超流体中をメソスコピック粒子が摩擦なく運動する、よく特性化された量子現象である。ランダウが示したように、超流体の基本的な励起スペクトルから生じるエネルギー-運動量保存の不適合性は、超流体と運動するメソスコピック粒子の間の量子散乱を臨界速度閾値以下に抑える。ここでは、He原子が狭い(5,5)炭素ナノチューブ(CNT)を通り抜けるとき、標準的な超流体が存在しない場合にも摩擦のない運動が起こりうると予測する。 Heと相互作用するプラズモンとフォノンモードの準線形分散のため、(5,5)CNTは超流動の固体類似体を具現化し、これによりランダウの超流動性の基準を容易に伝達することができる。その結果、ランダウの方程式はより広範な一般性を獲得し、これまでの記述が純粋に古典的である他のナノスケール摩擦現象にも適用することができる。 Superfluidity is a well-characterized quantum phenomenon which entails frictionless-motion of mesoscopic particles through a superfluid, such as $^4$He or dilute atomic-gases at very low temperatures. As shown by Landau, the incompatibility between energy- and momentum-conservation, which ultimately stems from the spectrum of the elementary excitations of the superfluid, forbids quantum-scattering between the superfluid and the moving mesoscopic particle, below a critical speed-threshold. Here we predict that frictionless-motion can also occur in the absence of a standard superfluid, i.e. when a He atom travels through a narrow (5,5) carbon-nanotube (CNT). Due to the quasi-linear dispersion of the plasmon and phonon modes that could interact with He, the (5,5) CNT embodies a solid-state analog of the superfluid, thereby enabling straightforward transfer of Landau's criterion of superfluidity. As a result, Landau's equations acquire broader generality, and may be applicable to other nanoscale friction phenomena, whose description has been so far purely classical.	翻訳日:2023-10-25 06:40:35 公開日:2023-10-23
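For readers unfamiliar with the criterion mentioned in the abstract above, the standard textbook form of Landau's argument can be written out as follows. This is background, not a formula quoted from the paper; the symbols $M$, $\mathbf{v}$, $\varepsilon(p)$, $c$ and $v_c$ are notation assumed here.

```latex
% A body of mass M moving with velocity v creates one excitation of
% momentum p and energy \varepsilon(p). Energy and momentum conservation give
\frac{M v^{2}}{2}
  = \frac{\lvert M\mathbf{v}-\mathbf{p}\rvert^{2}}{2M} + \varepsilon(p)
\quad\Longrightarrow\quad
\varepsilon(p) = \mathbf{p}\cdot\mathbf{v} - \frac{p^{2}}{2M}.
% For a heavy body (M \to \infty) this requires \varepsilon(p) \le p\,v,
% so no excitation can be created, and no friction arises, below
v_c = \min_{p}\,\frac{\varepsilon(p)}{p}.
% For a linear (or quasi-linear) dispersion \varepsilon(p) \simeq c\,p
% the critical speed is simply v_c \simeq c.
```

Under this reading, a medium whose relevant modes disperse quasi-linearly has a nonzero $v_c$, which is the property the abstract attributes to the plasmon and phonon modes of the (5,5) CNT.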
# 繰り返しテキストの予測において人間と言語モデルは乖離する Humans and language models diverge when predicting repeating text ( http://arxiv.org/abs/2310.06408v2 ) ライセンス: Link先を確認	Aditya R. Vaidya, Javier Turek, Alexander G. Huth	(参考訳) 次単語予測タスクで訓練された言語モデルは、単語予測と読み速度において人間の行動を正確にモデル化することが示されている。これらの結果とは対照的に,人間とLMの性能が乖離するシナリオを示す。テキストスパンの繰り返しによって構成される5つの刺激に対して,人間の次単語予測のデータセットを収集した。人間とGPT-2 LMの予測はテキストスパンの最初の提示では強く一致しているが、記憶(または文脈内学習)が役割を果たし始めると、その性能は急速に乖離する。我々はこの乖離の原因を中間層の特定の注意ヘッドまで追跡した。これらの注意ヘッドにべき乗則の新近性バイアスを加えることで、人間にずっと近い振る舞いをするモデルが得られた。このシナリオが、LMを人間の行動に近づける今後の取り組みを促すことを期待している。 Language models that are trained on the next-word prediction task have been shown to accurately model human behavior in word prediction and reading speed. In contrast with these findings, we present a scenario in which the performance of humans and LMs diverges. We collected a dataset of human next-word predictions for five stimuli that are formed by repeating spans of text. Human and GPT-2 LM predictions are strongly aligned in the first presentation of a text span, but their performance quickly diverges when memory (or in-context learning) begins to play a role. We traced the cause of this divergence to specific attention heads in a middle layer. Adding a power-law recency bias to these attention heads yielded a model that performs much more similarly to humans. We hope that this scenario will spur future work in bringing LMs closer to human behavior.	翻訳日:2023-10-25 06:38:22 publicly:2023-10-23
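One way to picture a "power-law recency bias" on an attention head is sketched below. This is only an illustration under stated assumptions, not the authors' implementation: the function name `power_law_recency_attention`, the exponent `alpha`, and the choice to apply the bias additively in logit space are all hypothetical.

```python
import numpy as np

def power_law_recency_attention(scores, alpha=1.0):
    """Causal attention weights with a power-law recency bias (illustrative).

    scores: (T, T) array of raw query-key logits for a single head.
    alpha:  hypothetical decay exponent; larger values favour recent tokens.
    """
    T = scores.shape[0]
    q = np.arange(T)[:, None]          # query positions
    k = np.arange(T)[None, :]          # key positions
    distance = q - k                   # >= 0 for keys a query may attend to
    causal = distance >= 0
    # Scaling a weight by (distance + 1) ** -alpha is the same as
    # subtracting alpha * log(distance + 1) from its logit.
    bias = -alpha * np.log(np.where(causal, distance, 0) + 1.0)
    logits = np.where(causal, scores + bias, -np.inf)
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    weights = np.exp(logits)
    return weights / weights.sum(axis=1, keepdims=True)
```

With `alpha=0` the function reduces to ordinary causal softmax attention, which makes it easy to compare a head with and without the bias on the same logits.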
# タスクとコンテキストがドライバーの視線配分に及ぼす影響の理解とモデル化 Understanding and Modeling the Effects of Task and Context on Drivers' Gaze Allocation ( http://arxiv.org/abs/2310.09275v2 ) ライセンス: Link先を確認	Iuliia Kotseruba and John K. Tsotsos	(参考訳) ドライバーが何を見ているかを理解することは、ドライバーのトレーニング、監視、支援、そして自動運転を含む多くのアプリケーションにとって重要である。伝統的に、人間の視覚的注意に影響する要因はボトムアップとトップダウン(タスクとコンテキスト駆動)に分けられている。どちらもドライバーの視線配分において役割を担っているが、既存のモデリングアプローチのほとんどはボトムアップ・サリエンシーのために開発された技術を適用しており、タスクやコンテキストの影響を明示的に考慮していない。同様に、共通運転注意ベンチマークは関連するタスクとコンテキストアノテーションを欠いている。そこで,運転者の視線予測のための因子の解析とモデル化を実現するために,以下のことを提案する。 1) 一般的なDR(eye)VEデータセットの欠点に対処し、タスクとコンテキストを駆動するためのフレーム単位のアノテーションで拡張する。 2) 精度とドライバーの視線予測のためのベースラインモデルとSOTAモデルをベンチマークし、それらを新しいアノテーションで解析する。 3) ドライバーの視線予測を明示的な行動と文脈情報で調整し, DR(eye)VE全体のSOTA性能(24\% KLD, 89\% NSS)と, 行動・安全クリティカルな交差点シナリオ(10-30\% KLD)のサブセットにおいて有意に向上させる新しいモデルを提案する。拡張アノテーション、モデルと評価のためのコードも公開されている。 Understanding what drivers look at is important for many applications, including driver training, monitoring, and assistance, as well as self-driving. Traditionally, factors affecting human visual attention have been divided into bottom-up (involuntary attraction to salient regions) and top-down (task- and context-driven). Although both play a role in drivers' gaze allocation, most of the existing modeling approaches apply techniques developed for bottom-up saliency and do not consider task and context influences explicitly. Likewise, common driving attention benchmarks lack relevant task and context annotations. Therefore, to enable analysis and modeling of these factors for drivers' gaze prediction, we propose the following: 1) address some shortcomings of the popular DR(eye)VE dataset and extend it with per-frame annotations for driving task and context; 2) benchmark a number of baseline and SOTA models for saliency and driver gaze prediction and analyze them w.r.t. 
the new annotations; and finally, 3) a novel model that modulates drivers' gaze prediction with explicit action and context information, and as a result significantly improves SOTA performance on DR(eye)VE overall (by 24\% KLD and 89\% NSS) and on a subset of action and safety-critical intersection scenarios (by 10--30\% KLD). Extended annotations, code for model and evaluation will be made publicly available.	翻訳日:2023-10-25 06:30:49 公開日:2023-10-23
# 二重ポートホモダイン検出による単一コヒーレント状態光MZIによる2パラメータ推定 Two-parameter estimation with single coherent-state light MZI via double-port homodyne detection ( http://arxiv.org/abs/2310.08856v2 ) ライセンス: Link先を確認	Li-li Hou, Jian-Dong Zhang, Shuai Wang	(参考訳) マッハ・ツェンダー干渉計は重要で基本的な光学装置であり、通常、異なる測定スキームと非古典量子状態を持つ単一パラメータ推定問題を研究するために用いられる。本論文では,二重ポートホモダイン検出による単一コヒーレント状態光マッハ・ツェンダー干渉計による2パラメータ推定を実現できることを示す。 2パラメータ推定の位相感度は古典的および量子的フィッシャー情報行列によって研究される。その結果,二重ポートホモダイン検出により得られた両位相シフトの全位相感度は,両位相シフトが最適作業点に近づくとQCRBとSNLに近づくことが判明した。また、単一の位相シフトについては、対応する位相感度は他方の推定位相シフトに依存せず、SNLを上回ることができる。 A Mach-Zehnder interferometer is an important and basic optical device, which is usually used to study the single-parameter estimation problem with different measurement schemes and nonclassical quantum states. In this paper, we show that two-parameter estimation can be realized with a single coherent-state light Mach-Zehnder interferometer via double-port homodyne detection. The phase sensitivity of the two-parameter estimation is studied by classical and quantum Fisher information matrices. As a result, we find that the total phase sensitivity of both phase shifts obtained by the double-port homodyne detection can approach the quantum Cramér-Rao bound (QCRB) and the shot-noise limit (SNL) when both phase shifts approach the optimal working point. In addition, for a single phase shift, the corresponding phase sensitivity does not depend on the other estimated phase shift, and can beat the SNL.	翻訳日:2023-10-25 06:30:07 公開日:2023-10-23
# LoftQ: 大規模言語モデルのための LoRA-Fine-Tuning-Aware 量子化 LoftQ: LoRA-Fine-Tuning-Aware Quantization for Large Language Models ( http://arxiv.org/abs/2310.08659v3 ) ライセンス: Link先を確認	Yixiao Li, Yifan Yu, Chen Liang, Pengcheng He, Nikos Karampatziakis, Weizhu Chen, Tuo Zhao	(参考訳) 量子化は、大規模言語モデル(LLM)のサービングに不可欠な技術であり、最近ではLoRA微調整にも利用されている。本研究では、事前学習モデルに量子化とLoRA微調整を併用するシナリオに焦点を当てる。このような場合、完全な微調整と、量子化とLoRA微調整を併用するアプローチとの間で、下流タスクの性能に一貫したギャップが観察されることが多い。そこで、LLMの量子化と、LoRA微調整のための適切な低ランク初期化の探索を同時に行う新しい量子化フレームワークであるLoftQ(LoRA-Fine-Tuning-Aware Quantization)を提案する。このような初期化は量子化モデルと完全精度モデルの相違を緩和し、下流タスクの一般化を大幅に改善する。本稿では,自然言語理解,質問応答,要約,自然言語生成タスクについて評価する。実験により,本手法は既存の量子化法,特に2ビットと2/4ビットの混合精度で高い性能を示した。私たちはコードを公開します。 Quantization is an indispensable technique for serving Large Language Models (LLMs) and has recently found its way into LoRA fine-tuning. In this work we focus on the scenario where quantization and LoRA fine-tuning are applied together on a pre-trained model. In such cases it is common to observe a consistent gap in the performance on downstream tasks between full fine-tuning and quantization plus LoRA fine-tuning approach. In response, we propose LoftQ (LoRA-Fine-Tuning-aware Quantization), a novel quantization framework that simultaneously quantizes an LLM and finds a proper low-rank initialization for LoRA fine-tuning. Such an initialization alleviates the discrepancy between the quantized and full-precision model and significantly improves the generalization in downstream tasks. We evaluate our method on natural language understanding, question answering, summarization, and natural language generation tasks. Experiments show that our method is highly effective and outperforms existing quantization methods, especially in the challenging 2-bit and 2/4-bit mixed precision regimes. We will release our code.	翻訳日:2023-10-25 06:29:54 公開日:2023-10-23
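The idea of jointly choosing a quantized weight and a low-rank LoRA initialization can be sketched as an alternating loop. The abstract does not spell out the procedure, so everything below is an assumption made for illustration: the alternating scheme itself, the toy round-to-nearest quantizer `uniform_quantize`, and the function name `quantize_with_low_rank_init` are hypothetical and are not the released LoftQ code.

```python
import numpy as np

def uniform_quantize(w, bits=2):
    """Toy round-to-nearest uniform quantizer (a stand-in, not the paper's)."""
    levels = 2 ** bits - 1
    lo, hi = w.min(), w.max()
    if hi == lo:
        return w.copy()
    scale = (hi - lo) / levels
    return np.round((w - lo) / scale) * scale + lo

def quantize_with_low_rank_init(w, rank=4, bits=2, steps=5):
    """Alternate between quantizing the residual and fitting a rank-r term.

    Returns (q, a, b) so that q + a @ b approximates the full-precision w.
    Assumes steps >= 1.
    """
    a = np.zeros((w.shape[0], rank))
    b = np.zeros((rank, w.shape[1]))
    for _ in range(steps):
        # Quantize what the current low-rank term does not explain.
        q = uniform_quantize(w - a @ b, bits)
        # Best rank-r fit (truncated SVD) to the remaining quantization error.
        u, s, vt = np.linalg.svd(w - q, full_matrices=False)
        a = u[:, :rank] * s[:rank]
        b = vt[:rank, :]
    return q, a, b
```

Because the final `a @ b` is the best rank-`rank` approximation of `w - q` in Frobenius norm, `q + a @ b` is never further from `w` than `q` alone, which is the sense in which such an initialization narrows the gap to the full-precision model.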
# 1次元イジングモデルのクエンチダイナミクスにおける臨界現象の古典シミュレーションにおけるマクロ状態とミクロ状態 Macrostates vs. Microstates in the Classical Simulation of Critical Phenomena in Quench Dynamics of 1D Ising Models ( http://arxiv.org/abs/2310.08567v2 ) ライセンス: Link先を確認	Anupam Mitra, Tameem Albash, Philip Daniel Blocher, Jun Takahashi, Akimasa Miyake, Grant W. Biedermann, Ivan H. Deutsch	(参考訳) 本研究では,一次元逆場イジングモデル (TFIM) のクエンチ力学における古典的臨界現象のトラクタビリティについて,高度に歪んだ行列積状態 (MPS) を用いて検討した。本稿では,非可積分な長距離TFIMにおいて発生する動的量子相転移(DQPT)と臨界点に焼成した場合の積分可能な近傍TFIMの無限時間相関長に着目した。 DQPT では,MPS 結合次元の驚くほど重い切り込みで順序パラメータを効率的にシミュレートできることが示されている。これは、完全多体状態が高忠実度でシミュレートされない場合でも、臨界指数を含む相転移の臨界特性を確実に抽出するために用いられる。臨界点近傍の長時間相関長は、完全な多体状態の忠実度に敏感であり、一般に大きな結合次元MPSを必要とする。それにもかかわらず、エンタングルメントが低いダイナミクスの短時間の挙動から抽出できるため、強切断されたmpsでも効率的にシミュレーションできることがわかった。以上の結果から,多体状態(マイクロステート)の正確な計算は,多体状態(マクロステート)の相をシミュレーションする際には,その正確なマイクロステートの正確な仕様は必要とされない可能性が示唆された。また,モデル内の量子カオスと平衡に基づく切断型mpsを用いたシミュレーションのトラクタビリティについて検討した。正多体状態が最も難易度が高いカオスシステムに対して,局所的な期待値を最も容易に近似できる反直感的逆関係を求める。 We study the tractability of classically simulating critical phenomena in the quench dynamics of one-dimensional transverse field Ising models (TFIMs) using highly truncated matrix product states (MPS). We focus on two paradigmatic examples: a dynamical quantum phase transition (DQPT) that occurs in nonintegrable long-range TFIMs, and the infinite-time correlation length of the integrable nearest-neighbor TFIM when quenched to the critical point. For the DQPT, we show that the order parameters can be efficiently simulated with surprisingly heavy truncation of the MPS bond dimension. This can be used to reliably extract critical properties of the phase transition, including critical exponents, even when the full many-body state is not simulated with high fidelity. The long-time correlation length near the critical point is more sensitive to the full many-body state fidelity, and generally requires a large bond dimension MPS. 
Nonetheless, we find that this can still be efficiently simulated with strongly truncated MPS because it can be extracted from the short-time behavior of the dynamics where entanglement is low. Our results demonstrate that while accurate calculation of the full many-body state (microstate) is typically intractable due to the volume-law growth of entanglement, a precise specification of an exact microstate may not be required when simulating phases of matter of many-body systems (macrostates). We also study the tractability of simulation using truncated MPS based on quantum chaos and equilibration in the models. We find a counterintuitive inverse relationship, whereby local expectation values are most easily approximated for chaotic systems whose exact many-body state is most intractable.	翻訳日:2023-10-25 06:29:32 公開日:2023-10-23
# GRASP: グラフアテンションによる最短パスアタックの高速化 GRASP: Accelerating Shortest Path Attacks via Graph Attention ( http://arxiv.org/abs/2310.07980v2 ) ライセンス: Link先を確認	Zohair Shafi, Benjamin A. Miller, Ayan Chatterjee, Tina Eliassi-Rad, Rajmonda S. Caceres	(参考訳) 機械学習(ML)の最近の進歩は、古典的な組合せ最適化アルゴリズムの補助と加速の可能性を示している。エンドツーエンドで学習することを目的としたMLベースのスピードアップ(すなわち、ソリューションを直接出力する)は、ソリューションの品質とランタイムをトレードオフする傾向がある。したがって、性能保証を維持しながら既存の問題解決を加速できるソリューションは非常に興味深い。本稿では,最小限のエッジ数を取り除き,グラフ内の最短経路を攻撃しようとするAPXハード問題を考える。グラフ注意促進経路攻撃(Graph Attention Accelerated Shortest Path Attack)は、MLが生成したソリューションの品質を維持しつつ、実行時間を最大10倍高速化する最適化アルゴリズムである。 GRASPはグラフアテンションネットワークを用いて組合せ解を含む小さなサブグラフを識別し、入力問題のサイズを効果的に削減する。さらに、最適化タスクとよく相関するノード機能を含む入力グラフの注意深い表現が、最適化ソリューションにおける重要な構造を如何に強調するかを示す。 Recent advances in machine learning (ML) have shown promise in aiding and accelerating classical combinatorial optimization algorithms. ML-based speed ups that aim to learn in an end to end manner (i.e., directly output the solution) tend to trade off run time with solution quality. Therefore, solutions that are able to accelerate existing solvers while maintaining their performance guarantees, are of great interest. We consider an APX-hard problem, where an adversary aims to attack shortest paths in a graph by removing the minimum number of edges. We propose the GRASP algorithm: Graph Attention Accelerated Shortest Path Attack, an ML aided optimization algorithm that achieves run times up to 10x faster, while maintaining the quality of solution generated. GRASP uses a graph attention network to identify a smaller subgraph containing the combinatorial solution, thus effectively reducing the input problem size. Additionally, we demonstrate how careful representation of the input graph, including node features that correlate well with the optimization task, can highlight important structure in the optimization solution.	翻訳日:2023-10-25 06:27:47 公開日:2023-10-23
# 文脈モデルを用いた半教師付き群衆数--群衆シーンの総合的理解を促進する Semi-Supervised Crowd Counting with Contextual Modeling: Facilitating Holistic Understanding of Crowd Scenes ( http://arxiv.org/abs/2310.10352v2 ) ライセンス: Link先を確認	Yifei Qian, Xiaopeng Hong, Ognjen Arandjelovi\'c, Zhongliang Guo, Carl R.Donovan	(参考訳) そこで本研究では,信頼度の高い群集数モデルの訓練に要する重度アノテーション負荷を軽減し,より多くのデータを活用することで,モデルをより実践的かつ正確にするため,教師の枠組みに基づいた新たな半教師方式を提案する。ラベル付きデータが不足している場合には、ローカルパッチに過度に適合する傾向にある。このような状況下では、ラベルなしデータによる局所パッチ予測の精度を単に改善する従来のアプローチは不十分である。そこで本研究では,モデル固有の「従属化」能力の育成という,よりニュアンスなアプローチを提案する。この能力により、モデルは群衆シーンの理解を活用し、人間の認知過程を反映することで、地域の数を正確に見積もることができる。この目的を達成するために、ラベルのないデータにマスキングを適用し、それらのマスキングされたパッチの予測をモデルに導く。さらに,特徴学習を支援するために,細粒度密度分類タスクを組み込んだ。本手法は,厳密な構造的制約や損失の制約がないため,既存手法の多くに適用可能である。さらに、このフレームワークでトレーニングされたモデルが‘サブシタイズ’的な振る舞いを示すことも観察します。高密度領域の予測に局所的な細部を取り入れつつ,"グレース"のみで低密度領域を正確に予測する。提案手法は,上海技術AやUCF-QNRFといった挑戦的なベンチマークにおいて,従来のアプローチをはるかに上回り,最先端の性能を実現する。コードはhttps://github.com/cha15yq/mrc-crowdで入手できる。 To alleviate the heavy annotation burden for training a reliable crowd counting model and thus make the model more practicable and accurate by being able to benefit from more data, this paper presents a new semi-supervised method based on the mean teacher framework. When there is a scarcity of labeled data available, the model is prone to overfit local patches. Within such contexts, the conventional approach of solely improving the accuracy of local patch predictions through unlabeled data proves inadequate. Consequently, we propose a more nuanced approach: fostering the model's intrinsic 'subitizing' capability. This ability allows the model to accurately estimate the count in regions by leveraging its understanding of the crowd scenes, mirroring the human cognitive process. To achieve this goal, we apply masking on unlabeled data, guiding the model to make predictions for these masked patches based on the holistic cues. Furthermore, to help with feature learning, herein we incorporate a fine-grained density classification task. 
Our method is general and applicable to most existing crowd counting methods as it doesn't have strict structural or loss constraints. In addition, we observe that the model trained with our framework exhibits a 'subitizing'-like behavior. It accurately predicts low-density regions with only a 'glance', while incorporating local details to predict high-density regions. Our method achieves the state-of-the-art performance, surpassing previous approaches by a large margin on challenging benchmarks such as ShanghaiTech A and UCF-QNRF. The code is available at: https://github.com/cha15yq/MRC-Crowd.	翻訳日:2023-10-25 06:21:50 公開日:2023-10-23
# 非エルミート二成分系における非欠陥縮退と線形回路における実現 Non-defective degeneracy in non-Hermitian bipartite system and the realization in linear circuit ( http://arxiv.org/abs/2310.10132v2 ) ライセンス: Link先を確認	Chen-Huan Wu, Yida Li	(参考訳) ランダム行列理論の観点では、ガウス直交のアンサンブルにおいて非エルミート系をシミュレートする。 2つの異なる固有値を持つエルミート作用素から始めて、ランダムな固有ケットを通して対角的でないゆらぎを導入し、また2つの$8\times 8$サブシステムを通して二部構造を実現する。一方のサブシステムはフルランクであり、他方はランク落ちである。後者のサブシステムでは、非線型対称性を含む非欠陥縮退と、隣接する固有ベクトルにおける線形写像の蓄積効果を検証する。実験では、この効果を非相互非エルミート線形回路で観測する。 In terms of the random matrix theory, we simulate a non-Hermitian system in Gaussian orthogonal ensemble. Starting from a Hermitian operator with two distinct eigenvalues, we introduce the off-diagonal fluctuations through the random eigenkets, and realize the bipartite nature through two $8\times 8$ subsystems, where one of them is full ranked, while the other is rank deficient. For the latter subsystem, we verify the non-defective degeneracy containing the non-linear symmetries, as well as the accumulation effect of the linear map in adjacent eigenvectors. Experimentally, we observe such effect in a non-reciprocal non-Hermitian linear circuit.	翻訳日:2023-10-25 06:21:01 公開日:2023-10-23
# GreatSplicing: セマンティックにリッチなスプライシングデータセット GreatSplicing: A Semantically Rich Splicing Dataset ( http://arxiv.org/abs/2310.10070v2 ) ライセンス: Link先を確認	Xiuli Bi and Jiaming Liang	(参考訳) 既存のスプライシングフォージェリーデータセットでは、スプライシング領域のセマンティックな多様性が不十分であり、トレーニングされた検出モデルがスプライシングの痕跡ではなくセマンティックな特徴に過度に適合するという問題を引き起こす。一方、合理的なデータセットがないため、提案された異なる検出方法が実験的な設定で合意に達することができない。本稿では,このような緊急問題に対処するために,手作業で作成した大規模かつ高品質なスプライシングデータセットである GreatSplicing を提案する。 GreatSplicingは5000のスプライシングイメージで構成され、スプライシングされた領域を335の異なるセマンティックカテゴリでカバーしており、ニューラルネットワークがスプライシングの痕跡をより良く捉えることを可能にする。 GreatSplicingでトレーニングされたモデルは、既存のデータセットと比較して、最小の誤識別率と優れたクロスデータセット検出能力を示す。 GreatSplicingはすべての研究目的で利用可能であり、www.greatsplicing.netからダウンロードできる。 In existing splicing forgery datasets, the insufficient semantic varieties of spliced regions cause a problem that trained detection models overfit semantic features rather than splicing traces. Meanwhile, because of the absence of a reasonable dataset, different detection methods proposed cannot reach a consensus on experimental settings. To address these urgent issues, GreatSplicing, a manually created splicing dataset with a considerable amount and high quality, is proposed in this paper. GreatSplicing comprises 5,000 spliced images and covers spliced regions with 335 distinct semantic categories, allowing neural networks to grasp splicing traces better. Extensive experiments demonstrate that models trained on GreatSplicing exhibit minimal misidentification rates and superior cross-dataset detection capabilities compared to existing datasets. Furthermore, GreatSplicing is available for all research purposes and can be downloaded from www.greatsplicing.net.	翻訳日:2023-10-25 06:20:49 公開日:2023-10-23
# NICE: カスケード協調学習によるPanoptic Narrative DetectionとSegmentationの改善 NICE: Improving Panoptic Narrative Detection and Segmentation with Cascading Collaborative Learning ( http://arxiv.org/abs/2310.10975v2 ) ライセンス: Link先を確認	Haowei Wang, Jiayi Ji, Tianyu Guo, Yilong Yang, Yiyi Zhou, Xiaoshuai Sun, Rongrong Ji	(参考訳) Panoptic Narrative Detection (PND) と Segmentation (PNS) は、長い物語記述に従って画像中の複数のターゲットを識別し、位置決めする2つの課題である。本稿では,これら2つのパノプティック物語認識タスクを共同で学習する,NICEと呼ばれる統一的で効果的なフレームワークを提案する。既存の視覚的接地タスクは2分岐パラダイムを用いるが、これをPNDやPNSに直接適用すると、本質的な多対多のアライメント特性のために予測競合が発生する。そこで、マスクの重心(barycenter)に基づく2つのカスケードモジュール、CGA(Coordinate Guided Aggregation)とBDL(Barycenter Driven Localization)を導入し、それぞれセグメンテーションと検出を担当させる。 PNSとPNDを直列に接続し、セグメンテーションの重心をアンカーとすることで,本手法は2つのタスクを自然に整列させ,相互に補完して性能を向上させる。具体的には、CGAは重心を検出の基準として提供し、BDLの多数の候補ボックスへの依存を減らす。 BDLはその優れた特性を利用して異なるインスタンスを区別し、セグメンテーションにおけるCGAの性能を向上させる。大規模な実験により、NICEは既存のすべての手法を大きなマージンで上回り、最先端手法に対してPNDで4.1%、PNSで2.9%の改善を達成した。これらの結果は,協調学習戦略の有効性を検証した。この作業のプロジェクトは https://github.com/Mr-Neko/NICE で公開されている。 Panoptic Narrative Detection (PND) and Segmentation (PNS) are two challenging tasks that involve identifying and locating multiple targets in an image according to a long narrative description. In this paper, we propose a unified and effective framework called NICE that can jointly learn these two panoptic narrative recognition tasks. Existing visual grounding tasks use a two-branch paradigm, but applying this directly to PND and PNS can result in prediction conflict due to their intrinsic many-to-many alignment property. To address this, we introduce two cascading modules based on the barycenter of the mask, which are Coordinate Guided Aggregation (CGA) and Barycenter Driven Localization (BDL), responsible for segmentation and detection, respectively. By linking PNS and PND in series with the barycenter of segmentation as the anchor, our approach naturally aligns the two tasks and allows them to complement each other for improved performance. 
Specifically, CGA provides the barycenter as a reference for detection, reducing BDL's reliance on a large number of candidate boxes. BDL leverages its excellent properties to distinguish different instances, which improves the performance of CGA for segmentation. Extensive experiments demonstrate that NICE surpasses all existing methods by a large margin, achieving 4.1% for PND and 2.9% for PNS over the state-of-the-art. These results validate the effectiveness of our proposed collaborative learning strategy. The project of this work is made publicly available at https://github.com/Mr-Neko/NICE.	翻訳日:2023-10-25 06:11:11 公開日:2023-10-23
# 創発的AI支援談話:ChatGPTを用いた第2言語作者の事例研究 Emergent AI-Assisted Discourse: Case Study of a Second Language Writer Authoring with ChatGPT ( http://arxiv.org/abs/2310.10903v2 ) ライセンス: Link先を確認	Sharin Jacob, Tamara Tate, Mark Warschauer	(参考訳) ChatGPTの急速な普及は、人間の文章に対する影響に関する議論を引き起こした。執筆基準の低下が懸念される中,特に言語学習者において,学術的文章作成の促進にchatgptが果たす役割について検討した。ケーススタディアプローチを用いて,ChatGPTを学術的執筆プロセスを通じて統合した博士課程生Kailingの経験を考察した。この研究は、活動理論を、生成的AIツールで書くことを理解するためのレンズとして利用し、分析されたデータには、半構造化インタビュー、筆記サンプル、GPTログが含まれる。その結果,カイリングは様々な執筆段階においてChatGPTと効果的に協力し,権威的な声とエージェンシーを保っていることがわかった。このことは、ChatGPTのようなAIツールが、個々の認証を覆すことなく、言語学習者の学術的記述を強化する可能性を浮き彫りにしている。本研究は,ChatGPTを学術書記プロセスでどのように活用するか,およびツールに係わる際の学生の真正声の保存について,批判的な考察を行う。 The rapid proliferation of ChatGPT has incited debates regarding its impact on human writing. Amid concerns about declining writing standards, this study investigates the role of ChatGPT in facilitating academic writing, especially among language learners. Using a case study approach, this study examines the experiences of Kailing, a doctoral student, who integrates ChatGPT throughout their academic writing process. The study employs activity theory as a lens for understanding writing with generative AI tools and data analyzed includes semi-structured interviews, writing samples, and GPT logs. Results indicate that Kailing effectively collaborates with ChatGPT across various writing stages while preserving her distinct authorial voice and agency. This underscores the potential of AI tools such as ChatGPT to enhance academic writing for language learners without overshadowing individual authenticity. This case study offers a critical exploration of how ChatGPT is utilized in the academic writing process and the preservation of a student's authentic voice when engaging with the tool.	翻訳日:2023-10-25 06:10:42 公開日:2023-10-23
# Stackageリポジトリ:その進化に関する探索的研究 The Stackage Repository: An Exploratory Study of its Evolution ( http://arxiv.org/abs/2310.10887v2 ) ライセンス: Link先を確認	Paul Leger and Felipe Ruiz and Nicol\'as Sep\'ulveda and Ismael Figueroa and Nicol\'as Cardozo	(参考訳) コンテキスト。プログラミング言語のパッケージリポジトリはますます一般的になっている。リポジトリはパッケージの進化の記録を保持することができる。モナドを特徴とするプログラミング言語Haskellには、Hackageリポジトリにある安定したHaskellパッケージを集めたキュレーション済みリポジトリであるStackageがある。 Stackageは対象とする産業分野で広く利用されているにもかかわらず、モナドの使用を含め、このリポジトリがどのように進化してきたかについての実証研究を我々はあまり知らない。目的。本稿では,2014～2023年における22の長期サポートリリースを通じて,モナドパッケージを考慮したStackageの進化に関する実証研究を行う。 5つの研究課題に焦点を当てて、この進化を、最もよく使われるモナドパッケージを含め、パッケージとその依存関係およびインポートの観点から分析する。私たちの知る限りでは、これは使用されるパッケージとモナドに関するStackageリポジトリの進化についての最初の大規模な分析である。方法。リポジトリの進化に関する6つの研究質問を定義し、22リリースにまたがる51,716パッケージ (17.05 GB) でそれらを分析した。各パッケージについてcabalファイルとソースコードを解析してデータを抽出し,Pandasスクリプトを使って依存関係とインポートの観点から分析する。結果。この方法論からさまざまな知見が得られた。例えば、Stackageの特定のリリースではバージョンが利用できない他のパッケージに依存するパッケージがあり、安定性の問題につながる可能性がある。 mtlとtransformersは、Stackageのリリース全体を通じて最もよく使用・インポートされるパッケージの上位10件に入っている。これらの調査結果をStackageのメンテナと議論し,研究課題の洗練に役立てた。 Context. Package repositories for a programming language are increasingly common. A repository can keep a register of the evolution of its packages. In the programming language Haskell, with its defining characteristic monads, we can find the Stackage repository, which is a curated repository for stable Haskell packages in the Hackage repository. Despite the widespread use of Stackage in its industrial target, we are not aware of much empirical research about how this repository has evolved, including the use of monads. Objective. This paper conducts empirical research about the evolution of Stackage considering monad packages through 22 Long-Term Support releases during the period 2014-2023. Focusing on five research questions, this evolution is analyzed in terms of packages with their dependencies and imports; including the most used monad packages. 
To the best of our knowledge, this is the first large-scale analysis of the evolution of the Stackage repository regarding packages used and monads. Method. We define six research questions regarding the repository's evolution, and analyze them on 51,716 packages (17.05 GB) spread over 22 releases. For each package, we parse its cabal file and source code to extract the data, which is analyzed in terms of dependencies and imports using Pandas scripts. Results. From the methodology we get different findings. For example, there are packages that depend on other packages whose versions are not available in a particular release of Stackage; opening a potential stability issue. The mtl and transformers are on the top 10 packages most used/imported across releases of the Stackage evolution. We discussed these findings with Stackage maintainers, which allowed us to refine the research questions.	翻訳日:2023-10-25 06:10:25 公開日:2023-10-23
# 効率的な変圧器用2層フィードフォワードネットワークの近似 Approximating Two-Layer Feedforward Networks for Efficient Transformers ( http://arxiv.org/abs/2310.10837v2 ) ライセンス: Link先を確認	R\'obert Csord\'as, Kazuki Irie, J\"urgen Schmidhuber	(参考訳) パフォーマンスを犠牲にすることなく、ニューラルネットワーク(NN)の計算とメモリ要件をいかに削減するか? 最近の多くの作品では、リソース効率の高い大言語モデル(lms)を構築するために、専門家のスパース混合物(moes)を使用している。ここでは,2層NN(例えば,トランスフォーマーのフィードフォワードブロック)を近似する様々な手法を統一する汎用フレームワークとして,製品キーメモリ(PKM)など,MoEに関するいくつかの新しい視点を紹介する。このフレームワークからの洞察を生かして,moesとpkmsの両方を改善する手法を提案する。計算方程式条件下でmoesと密接なベースラインを比較する先行研究とは異なり,本評価条件はパラメータ等式であり,lmsを適切に評価することが重要である。当社のmoesはwikitext-103とenwiki8のデータセットで2つの異なるスケールで高密度トランスフォーマーxlと競合するが、リソース効率ははるかに高い。このことは、MoE が極めて大きな LM だけでなく、資源効率の高い LM にも関係していることを示している。私たちのコードは公開されています。 How to reduce compute and memory requirements of neural networks (NNs) without sacrificing performance? Many recent works use sparse Mixtures of Experts (MoEs) to build resource-efficient large language models (LMs). Here we introduce several novel perspectives on MoEs, presenting a general framework that unifies various methods to approximate two-layer NNs (e.g., feedforward blocks of Transformers), including product-key memories (PKMs). Leveraging insights from this framework, we propose methods to improve both MoEs and PKMs. Unlike prior work that compares MoEs with dense baselines under the compute-equal condition, our evaluation condition is parameter-equal, which is crucial to properly evaluate LMs. We show that our MoEs are competitive with the dense Transformer-XL on both the WikiText-103 and enwiki8 datasets at two different scales, while being much more resource efficient. This demonstrates that MoEs are relevant not only to extremely large LMs but also to any-scale resource-efficient LMs. Our code is public.	翻訳日:2023-10-25 06:09:54 公開日:2023-10-23
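The view of a sparse Mixture of Experts as an approximation of a dense two-layer feedforward block can be illustrated with a small sketch. The names `dense_ffn` and `topk_moe_ffn`, the linear gate `w_gate`, and the unweighted sum over the selected experts are assumptions chosen so that selecting every expert recovers the dense block exactly; the routing and gating used in the paper may differ.

```python
import numpy as np

def dense_ffn(x, w1, w2):
    """Dense two-layer block: y = relu(x @ w1) @ w2."""
    return np.maximum(x @ w1, 0.0) @ w2

def topk_moe_ffn(x, w1, w2, w_gate, n_experts, k):
    """Approximate dense_ffn by evaluating only k of n_experts unit groups.

    x: (d_model,)   w1: (d_model, d_ff)   w2: (d_ff, d_model)
    w_gate: (d_model, n_experts), a hypothetical linear router.
    Assumes 1 <= k <= n_experts and d_ff divisible by n_experts.
    """
    d_ff = w1.shape[1]
    assert d_ff % n_experts == 0
    size = d_ff // n_experts             # hidden units per expert
    scores = x @ w_gate                  # one routing score per expert
    chosen = np.argsort(scores)[-k:]     # indices of the k highest scores
    y = np.zeros(w2.shape[1])
    for e in chosen:
        cols = slice(e * size, (e + 1) * size)
        # Only this expert's slice of the hidden layer is computed.
        y += np.maximum(x @ w1[:, cols], 0.0) @ w2[cols, :]
    return y
```

With `k` equal to `n_experts` the two functions agree, and smaller `k` trades accuracy for roughly `k / n_experts` of the hidden-layer compute while the parameter count stays the same, which is the setting a parameter-equal comparison looks at.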
# RegaVAE:言語モデリングのための検索型ガウス混合変分自動エンコーダ RegaVAE: A Retrieval-Augmented Gaussian Mixture Variational Auto-Encoder for Language Modeling ( http://arxiv.org/abs/2310.10567v2 ) ライセンス: Link先を確認	Jingcheng Deng, Liang Pang, Huawei Shen, Xueqi Cheng	(参考訳) 検索可能な言語モデルは、古い情報や言語モデル(LM)の幻覚といった問題に対処する上で有望である。しかし、現在の研究は2つの問題に直面している。 1)取得すべき情報を決定すること、及び 2) 生成中の検索情報を効果的に組み合わせること。将来的なトークンをモデル化するLMの性質を考えると,有効な検索情報は現在のソーステキストだけでなく,将来のターゲットテキストも考慮すべきである。さらに,コンパクトな潜在空間から派生した潜在変数を用いたアグリゲーションは,文脈長によって制限され雑音に影響を受けやすい明示的な原文の活用よりも効率的である。そこで本稿では,可変オートエンコーダ(VAE)に基づく検索拡張言語モデルRegaVAEを紹介する。テキストコーパスを潜在空間にエンコードし、ソーステキストとターゲットテキストの両方から現在および将来の情報をキャプチャする。さらに,vaeを用いて潜在空間を初期化し,ガウス前分布をガウス混合分布に拡張することにより,検索生成パラダイムの確率的形式を採用する。理論的解析はRegaVAEの最適化可能な上限を与える。各種データセットに対する実験結果から,テキスト生成品質と幻覚除去の大幅な改善が示された。 Retrieval-augmented language models show promise in addressing issues like outdated information and hallucinations in language models (LMs). However, current research faces two main problems: 1) determining what information to retrieve, and 2) effectively combining retrieved information during generation. We argue that valuable retrieved information should not only be related to the current source text but also consider the future target text, given the nature of LMs that model future tokens. Moreover, we propose that aggregation using latent variables derived from a compact latent space is more efficient than utilizing explicit raw text, which is limited by context length and susceptible to noise. Therefore, we introduce RegaVAE, a retrieval-augmented language model built upon the variational auto-encoder (VAE). It encodes the text corpus into a latent space, capturing current and future information from both source and target text. Additionally, we leverage the VAE to initialize the latent space and adopt the probabilistic form of the retrieval generation paradigm by expanding the Gaussian prior distribution into a Gaussian mixture distribution. 
Theoretical analysis provides an optimizable upper bound for RegaVAE. Experimental results on various datasets demonstrate significant improvements in text generation quality and hallucination removal.	翻訳日:2023-10-25 06:07:52 公開日:2023-10-23
# ブロックチェーン上でランダム、フェア、検証可能なゲームを構築する。 Suiネットワーク上のラッフルスマートコントラクト設計 Building Random, Fair, and Verifiable Games on Blockchain. Raffle smart contract designs on Sui Network ( http://arxiv.org/abs/2310.12305v2 ) ライセンス: Link先を確認	Eason Chen, Justa Liang, Ray Huang, Pierce Hung, Damien Chen, Ashley Hsu, Konstantinos Chalkias, Stefanos Pleros	(参考訳) 現代のオンラインゲームにおいてランダム性は重要な役割を担っているが、勝利確率の正確性をめぐって論争が持ち上がり、法的問題とゲーム会社に対する財政的欠点が生じた。幸いなことに、ブロックチェーンベースのゲームは、ランダム性に関する透明性と公平性の問題に対する解決策を提供する。さらに、Sui Networkのような新興のブロックチェーン技術は、非効率や高価な取引手数料といった従来のweb3障壁を排除することで、スマートコントラクトの効率を高める。これにより、大規模な分散ゲームアプリケーションの可能性が解き放たれる。本稿は,ブロックチェーン上での公正で検証可能な,効率的なスマートコントラクトゲームの設計に関する洞察を,Suiネットワーク上でのラッフル構築の例として提供することを目的とする。 DRAND委員会ベースの分散ランダムビーコンや,単一のプライベートキーベースの検証可能なランダム関数(VRF)など,スマートコントラクトにランダム性を実装する効率的な方法を検討する。そして、基本から包括的なスマートコントラクト設計へと前進する。データ入力やストレージスペースの制約など、ブロックチェーンゲーム全般の開発における制限に対処しました。本稿では,オブジェクトテーブル,デリゲートオブジェクト生成,ゼロ知識証明(ZKP)の利用を包含して,ストレージと入力効率を最適化する対応ソリューションを提案する。デザインをテストした結果、DRANDビーコンとプライベートキーベースのVRFの取引手数料は似ていることがわかった。さらに、オブジェクトテーブルは全体的な取引手数料を高くし、ZKPセットアップ料金は安く、検証プロセス中に非常に高価になる。さらに、異なるスマートコントラクト実装の長所と短所を比較して、異なるアプリケーションシナリオに適した設計を特定した。我々の発見は、スマートコントラクトでランダムで公正で検証可能なゲームを構築するための、将来の研究者や開発者にとって貴重なガイダンスを提供する。 Randomness plays a pivotal role in modern online gaming, but disputes have arisen over the accuracy of stated winning chances, resulting in legal issues and financial setbacks for gaming companies. Fortunately, blockchain-based games offer a solution to the transparency and fairness issue regarding randomness. Furthermore, emerging blockchain technology like Sui Network enhances the efficiency of smart contracts by eliminating traditional web3 barriers, such as inefficiencies and expensive transaction fees. This unlocks the potential for extensive decentralized gaming applications. This paper aims to provide insights into designing a fair, verifiable, and efficient smart contract game on blockchain by the example of building raffles on the Sui Network. 
We explore efficient methods for implementing randomness on smart contracts, including DRAND committee-based decentralized random beacons and single private-key-based verifiable random functions (VRF). We then progress from basic to comprehensive smart contract designs. We address general limitations in developing blockchain games, such as data input and storage space constraints, and propose corresponding solutions, encompassing the utilization of Object Tables, Delegate Object Creation, and Zero-Knowledge Proofs (ZKP) to optimize storage and input efficiency. After testing our designs, we found that the transaction fees for DRAND beacons and private-key-based VRFs are similar. Moreover, Object Tables incur higher overall transaction fees, while ZKP is cheap to set up but becomes very expensive during the verification process. Finally, we identified suitable designs for different application scenarios by comparing the pros and cons of different smart contract implementations. Our findings provide valuable guidance for future researchers and developers in building random, fair, and verifiable games with smart contracts.	翻訳日:2023-10-25 06:02:08 公開日:2023-10-23
# 安全対策に向けて:高圧ガス事故の専門的データセットによる今後の失敗防止 Towards Safer Operations: An Expert-involved Dataset of High-Pressure Gas Incidents for Preventing Future Failures ( http://arxiv.org/abs/2310.12074v2 ) ライセンス: Link先を確認	Shumpei Inoue, Minh-Tien Nguyen, Hiroki Mizokuchi, Tuan-Anh D. Nguyen, Huu-Hiep Nguyen, Dung Tien Le	(参考訳) 本稿では,安全対策のための新しいインシデントAIデータセットを提案する。通常、1つのタスクを含む以前のコーパスとは異なり、データセットは名前付きエンティティ認識、原因効果抽出、情報検索の3つのタスクで構成される。このデータセットは、高圧ガス保存マネージャとして少なくとも6年間の実践経験を持つドメインの専門家によってアノテートされている。安全対策のシナリオにおけるデータセットの貢献を検証した。 3つのタスクの予備的な結果から、NLP技術はインシデントレポートの分析に有用であり、将来の障害を防ぐことができる。このデータセットは、NLPとインシデント管理コミュニティにおける将来の研究を促進する。データセットへのアクセスも提供されている(incidentaiデータセットはhttps://github.com/cinnamon/incident-ai-dataset)。 This paper introduces a new IncidentAI dataset for safety prevention. Different from prior corpora that usually contain a single task, our dataset comprises three tasks: named entity recognition, cause-effect extraction, and information retrieval. The dataset is annotated by domain experts who have at least six years of practical experience as high-pressure gas conservation managers. We validate the contribution of the dataset in the scenario of safety prevention. Preliminary results on the three tasks show that NLP techniques are beneficial for analyzing incident reports to prevent future failures. The dataset facilitates future research in NLP and incident management communities. The access to the dataset is also provided (the IncidentAI dataset is available at: https://github.com/Cinnamon/incident-ai-dataset).	翻訳日:2023-10-25 06:01:29 公開日:2023-10-23
# ベトナム一般教育における多言語問題に対する大言語モデルの記号結合能力の評価 Evaluating the Symbol Binding Ability of Large Language Models for Multiple-Choice Questions in Vietnamese General Education ( http://arxiv.org/abs/2310.12059v2 ) ライセンス: Link先を確認	Duc-Vu Nguyen, Quoc-Nam Nguyen	(参考訳) 本稿では,大規模言語モデル(LLM)が複数選択質問応答(MCQA)タスクに対して,ゼロショット,ワンショット,少数ショット設定でMCSB(Multiple choice symbol binding)を実行する能力を評価する。ベトナム語に焦点を当てており、英語よりも難しいMCQAデータセットが少ない。既存の2つのデータセット、ViMMRC 1.0とViMMRC 2.0は文学に焦点を当てている。ベトナムの自然言語処理(NLP)の最近の研究は、ChatGPTを評価するために、2019年から2023年までベトナム国立高校卒業試験(VNHSGE)に焦点を当てている。しかしこれらの研究は主に、ChatGPTがVNHSGEを段階的に解く方法に焦点を当てている。我々は,数学,物理,化学,生物学のLaTeX式を入力するための構造化されたガイドラインを提供することで,新しい高品質なデータセットを作ることを目指している。このデータセットは、厳密なLaTeXスタイルでタイプされているため、LSMと小言語モデル(LM)のMCSB能力を評価するために使用できる。質問の文脈を考えると、質問に対する最も可能性の高い答えである文字(A、B、C、またはD)を予測することに集中する。 ViMMRC 1.0 と ViMMRC 2.0 ベンチマークを用いて, BLOOMZ-7.1B-MT, LLaMA-2-7B, LLaMA-2-70B, GPT-3, GPT-3.5, GPT-4.0 の6つの有名な LLM の評価を行った。データセットは研究目的でのみ利用できる。 In this paper, we evaluate the ability of large language models (LLMs) to perform multiple choice symbol binding (MCSB) for multiple choice question answering (MCQA) tasks in zero-shot, one-shot, and few-shot settings. We focus on Vietnamese, with fewer challenging MCQA datasets than in English. The two existing datasets, ViMMRC 1.0 and ViMMRC 2.0, focus on literature. Recent research in Vietnamese natural language processing (NLP) has focused on the Vietnamese National High School Graduation Examination (VNHSGE) from 2019 to 2023 to evaluate ChatGPT. However, these studies have mainly focused on how ChatGPT solves the VNHSGE step by step. We aim to create a novel and high-quality dataset by providing structured guidelines for typing LaTeX formulas for mathematics, physics, chemistry, and biology. This dataset can be used to evaluate the MCSB ability of LLMs and smaller language models (LMs) because it is typed in a strict LaTeX style. 
We focus on predicting the character (A, B, C, or D) that is the most likely answer to a question, given the context of the question. Our evaluation of six well-known LLMs, namely BLOOMZ-7.1B-MT, LLaMA-2-7B, LLaMA-2-70B, GPT-3, GPT-3.5, and GPT-4.0, on the ViMMRC 1.0 and ViMMRC 2.0 benchmarks and our proposed dataset shows promising results on the MCSB ability of LLMs for Vietnamese. The dataset is available for research purposes only.	翻訳日:2023-10-25 06:01:14 公開日:2023-10-23
# LoHoRavens: ロボットテーブルトップ操作のための長期言語仕様ベンチマーク LoHoRavens: A Long-Horizon Language-Conditioned Benchmark for Robotic Tabletop Manipulation ( http://arxiv.org/abs/2310.12020v2 ) ライセンス: Link先を確認	Shengqiang Zhang, Philipp Wicke, L\"utfi Kerem \c{S}enel, Luis Figueredo, Abdeldjallil Naceri, Sami Haddadin, Barbara Plank, Hinrich Sch\"utze	(参考訳) エンボディエージェントと大規模言語モデル(LLMs)の収束は、インボダイド命令に対する大幅な進歩をもたらした。特に、LSMの強力な推論能力により、ロボットは高価なアノテートデモなしで長距離タスクを実行できる。しかし,様々なシナリオにおける言語条件ロボットの長期推論能力をテストするための公開ベンチマークはいまだに欠落している。このギャップを埋めるために、この研究はテーブルトップ操作タスクに焦点をあて、色、サイズ、空間、算術、参照にまたがる様々な長距離推論の側面をカバーするシミュレーションベンチマークである \textit{LoHoRavens} をリリースする。さらに、LLMの閉ループ計画において、ロボットの実行中に観察フィードバックを組み込む方法について、LLMによる長期操作タスクにおいて重要なモダリティブリッジング問題がある。 LLMに明示的および暗黙的な観察フィードバックを組み込むためのキャプション生成と学習可能なインタフェースの2つの方法を検討した。これらの手法は,提案したベンチマークの2つの基準となる。実験により、どちらの手法もいくつかのタスクを解くのに苦労していることが示され、現在の一般的なモデルでは長い水平操作タスクが依然として難しいことが示されている。提案された公開ベンチマークとベースラインは、長期のテーブルトップ操作タスクのためのより良いモデル開発に役立つと期待している。 The convergence of embodied agents and large language models (LLMs) has brought significant advancements to embodied instruction following. Particularly, the strong reasoning capabilities of LLMs make it possible for robots to perform long-horizon tasks without expensive annotated demonstrations. However, public benchmarks for testing the long-horizon reasoning capabilities of language-conditioned robots in various scenarios are still missing. To fill this gap, this work focuses on the tabletop manipulation task and releases a simulation benchmark, \textit{LoHoRavens}, which covers various long-horizon reasoning aspects spanning color, size, space, arithmetics and reference. Furthermore, there is a key modality bridging problem for long-horizon manipulation tasks with LLMs: how to incorporate the observation feedback during robot execution for the LLM's closed-loop planning, which is however less studied by prior work. 
We investigate two methods of bridging the modality gap: caption generation and learnable interface for incorporating explicit and implicit observation feedback to the LLM, respectively. These methods serve as the two baselines for our proposed benchmark. Experiments show that both methods struggle to solve some tasks, indicating long-horizon manipulation tasks are still challenging for current popular models. We expect the proposed public benchmark and baselines can help the community develop better models for long-horizon tabletop manipulation tasks.	翻訳日:2023-10-25 06:00:39 公開日:2023-10-23
# 特権エスカレーションシナリオにおけるLLMの評価 Evaluating LLMs for Privilege-Escalation Scenarios ( http://arxiv.org/abs/2310.11409v2 ) ライセンス: Link先を確認	Andreas Happe, Aaron Kaplan, J\"urgen Cito	(参考訳) サイバーセキュリティの重要なコンポーネントである侵入テストは、システム内の脆弱性を積極的に識別し、修正することで、潜在的なサイバー攻撃に対する防御メカニズムを強化することができる。侵入テストの領域における最近の進歩の1つは大規模言語モデル(LLM)の利用である。 LLMと侵入テストの交わりを探索し、特権エスカレーションの文脈におけるそれらの能力と課題について考察する。ローカル仮想マシンを利用した自動Linux特権エスカレーションベンチマークを作成する。ベンチマークに対して異なるLLMとプロンプト戦略を評価することを目的として,LLM誘導型特権エスカレーションツールを提案する。我々は、異なるプロンプト設計の影響、文脈内学習の利点、LLMに高レベルのガイダンスを提供することの利点を分析する。テスト中のフォーカスの維持、エラーへの対処、そして最終的には確率的なオウムと人間のハッカーとの比較など、LLMの課題領域について論じる。 Penetration testing, an essential component of cybersecurity, allows organizations to proactively identify and remediate vulnerabilities in their systems, thus bolstering their defense mechanisms against potential cyberattacks. One recent advancement in the realm of penetration testing is the utilization of Large Language Models (LLMs). We explore the intersection of LLMs and penetration testing to gain insight into their capabilities and challenges in the context of privilege escalation. We create an automated Linux privilege-escalation benchmark utilizing local virtual machines. We introduce an LLM-guided privilege-escalation tool designed for evaluating different LLMs and prompt strategies against our benchmark. We analyze the impact of different prompt designs, the benefits of in-context learning, and the advantages of offering high-level guidance to LLMs. We discuss challenging areas for LLMs, including maintaining focus during testing, coping with errors, and finally comparing them with both stochastic parrots as well as with human hackers.	翻訳日:2023-10-25 05:59:12 公開日:2023-10-23
# 感度を意識したベイズ推定 Sensitivity-Aware Amortized Bayesian Inference ( http://arxiv.org/abs/2310.11122v2 ) ライセンス: Link先を確認	Lasse Elsem\"uller, Hans Olischl\"ager, Marvin Schmitt, Paul-Christian B\"urkner, Ullrich K\"othe, Stefan T. Radev	(参考訳) ベイズ推論は不確実性の下で確率的推論と決定を行うための強力なフレームワークである。現代のベイズワークフローの基本的選択は、可能性関数と事前分布、後部近似器、およびデータに関するものである。各選択はモデルに基づく推論とその後の決定に大きく影響し、感度分析を必要とする。本研究では,無形ベイズ推論(abi,すなわちニューラルネットワークを用いたシミュレーションベース推論)に感度解析を統合するための多面的手法を提案する。まず,計算オーバーヘッドを最小に抑えながら,学習プロセスにおける代替可能性と事前仕様との間の構造的類似性を符号化するために,重みの共有を利用する。第2に,ニューラルネットワークの迅速な推論を利用して,様々なデータ摂動や前処理に対する感度を評価する。他のほとんどのベイズ的アプローチとは対照的に、どちらのステップも、確率、事前、データセットの選択ごとにモデルを再フィッティングするコストのかかるボトルネックを回避する。最後に,ニューラルネットワークアンサンブルを用いて,未知データに対する信頼できない近似による結果のばらつきを評価することを提案する。本稿では,本手法の応用モデリング問題における有効性を示す。疫病の発生動態と地球温暖化閾値の推定から,人為的意思決定モデルの比較まで。実験では,モデル選択と推論的帰結の間の隠れた関係を効果的に明らかにする手法を示す。 Bayesian inference is a powerful framework for making probabilistic inferences and decisions under uncertainty. Fundamental choices in modern Bayesian workflows concern the specification of the likelihood function and prior distributions, the posterior approximator, and the data. Each choice can significantly influence model-based inference and subsequent decisions, thereby necessitating sensitivity analysis. In this work, we propose a multifaceted approach to integrate sensitivity analyses into amortized Bayesian inference (ABI, i.e., simulation-based inference with neural networks). First, we utilize weight sharing to encode the structural similarities between alternative likelihood and prior specifications in the training process with minimal computational overhead. Second, we leverage the rapid inference of neural networks to assess sensitivity to various data perturbations or pre-processing procedures. In contrast to most other Bayesian approaches, both steps circumvent the costly bottleneck of refitting the model(s) for each choice of likelihood, prior, or dataset. 
Finally, we propose to use neural network ensembles to evaluate variation in results induced by unreliable approximation on unseen data. We demonstrate the effectiveness of our method in applied modeling problems, ranging from the estimation of disease outbreak dynamics and global warming thresholds to the comparison of human decision-making models. Our experiments showcase how our approach enables practitioners to effectively unveil hidden relationships between modeling choices and inferential conclusions.	翻訳日:2023-10-25 05:58:57 公開日:2023-10-23
# 最先端大規模言語モデルのためのH2Oオープンエコシステム H2O Open Ecosystem for State-of-the-art Large Language Models ( http://arxiv.org/abs/2310.13012v2 ) ライセンス: Link先を確認	Arno Candel, Jon McKinney, Philipp Singer, Pascal Pfeiffer, Maximilian Jeblick, Chun Ming Lee, Marcos V. Conde	(参考訳) 大規模言語モデル(LLM)はAIの革命を表している。しかしそれらはまた、偏りのある、プライベートな、著作権のある、有害なテキストの存在など、多くの重大なリスクをもたらす。そのため、オープンで透明で安全なソリューションが必要です。 LLMの開発とテストのための完全なオープンソースエコシステムを導入します。このプロジェクトの目的は、クローズドソースアプローチに対するオープンな代替手段を強化することだ。 h2oGPTは様々なサイズの微調整LDMのファミリーである。 H2O LLM Studioは、最新の最先端技術を用いて、LLMの効率的な微調整、評価、デプロイのために設計されたフレームワークであり、ノーコードGUIである。私たちのコードとモデルは、完全にオープンソースです。この取り組みはAI開発を促進し、よりアクセスしやすく、効率的で、信頼できるものにするのに役立つと信じています。デモは以下の通り。 Large Language Models (LLMs) represent a revolution in AI. However, they also pose many significant risks, such as the presence of biased, private, copyrighted or harmful text. For this reason we need open, transparent and safe solutions. We introduce a complete open-source ecosystem for developing and testing LLMs. The goal of this project is to boost open alternatives to closed-source approaches. We release h2oGPT, a family of fine-tuned LLMs of diverse sizes. We also introduce H2O LLM Studio, a framework and no-code GUI designed for efficient fine-tuning, evaluation, and deployment of LLMs using the most recent state-of-the-art techniques. Our code and models are fully open-source. We believe this work helps to boost AI development and make it more accessible, efficient and trustworthy. The demo is available at: https://gpt.h2o.ai/	翻訳日:2023-10-25 05:46:51 公開日:2023-10-23
# DetectGPT-SC:マスケ予測による自己整合による大規模言語モデルによるテキストの検出の改善 DetectGPT-SC: Improving Detection of Text Generated by Large Language Models through Self-Consistency with Masked Predictions ( http://arxiv.org/abs/2310.14479v1 ) ライセンス: Link先を確認	Rongsheng Wang, Qi Li, Sihong Xie	(参考訳) ChatGPTのような一般的な大規模言語モデル(LLM)は目覚ましい成功を収めているが、AI生成テキストの誤用も懸念されている。したがって、重要な疑問は、テキストがChatGPTによって生成されるか、人間によって生成されるかを検出する方法である。既存の検出器は、人間が生成したテキストとAI生成したテキストの間に分配ギャップがあるという仮定に基づいて構築されている。これらのギャップは一般に統計情報や分類器を用いて識別される。従来の研究手法とは対照的に,ChatGPTのような大規模言語モデルは,テキスト生成や継続において強い自己整合性を示す。自己整合性(Self-Consistency)は、AIが生成したテキストは、人間の生成したテキストと異なり、テキストの一部が隠されている場合と同じ論理的推論を用いて、大きな言語モデルで推論できるという直感に乗じている。そこで本研究では, マスク付き予測を用いた自己整合性に基づくAI生成テキストの検出手法を提案し, LLMによってテキストが生成されるかどうかを判定する。この手法は detectiongpt-sc と呼ぶ。 detectiongpt-scの性能評価のための一連の実験を行った。これらの実験では,様々なマスクスキーム,ゼロショット,簡単なプロンプトを用いてマスクテキストの完成と自己一貫性の予測を行った。その結果, 検出gpt-scは, 異なるタスクにまたがる現在の状態よりも優れていた。 General large language models (LLMs) such as ChatGPT have shown remarkable success, but it has also raised concerns among people about the misuse of AI-generated texts. Therefore, an important question is how to detect whether the texts are generated by ChatGPT or by humans. Existing detectors are built on the assumption that there is a distribution gap between human-generated and AI-generated texts. These gaps are typically identified using statistical information or classifiers. In contrast to prior research methods, we find that large language models such as ChatGPT exhibit strong self-consistency in text generation and continuation. Self-consistency capitalizes on the intuition that AI-generated texts can still be reasoned with by large language models using the same logical reasoning when portions of the texts are masked, which differs from human-generated texts. Using this observation, we subsequently proposed a new method for AI-generated texts detection based on self-consistency with masked predictions to determine whether a text is generated by LLMs. 
We call this method DetectGPT-SC. We conducted a series of experiments to evaluate the performance of DetectGPT-SC. In these experiments, we employed various masking schemes, zero-shot settings, and simple prompts for completing masked texts and making self-consistency predictions. The results indicate that DetectGPT-SC outperforms the current state-of-the-art across different tasks.	翻訳日:2023-10-24 23:31:43 公開日:2023-10-23
# GeoLM:Geospatially Grounded Language Understandingのための言語モデル GeoLM: Empowering Language Models for Geospatially Grounded Language Understanding ( http://arxiv.org/abs/2310.14478v1 ) ライセンス: Link先を確認	Zekun Li, Wenxuan Zhou, Yao-Yi Chiang, Muhao Chen	(参考訳) 人間は記事を読む際に地空間的推論を行う。我々は、地名とその空間的関係をテキストで認識し、それらを地球上の物理的位置と精神的に関連付ける。事前訓練された言語モデルは、言語的文脈を用いてこの認知過程を模倣することができるが、大規模な地理的データベース、例えばOpenStreetMapでは、貴重な地理空間情報を利用できない。本稿では,自然言語におけるジオエンティリティの理解を深める,ジオスパティブな基盤言語モデルであるGeoLMを紹介する。 GeoLMは、地理データベースから抽出された地理空間情報とテキストコーパス内の言語情報を結びつけるアンカーとしてジオエンタリティの言及を利用する。 GeoLMは2種類のコンテキストをコントラスト学習とマスキング言語モデリングを通じて接続する。また、空間座標埋め込み機構を組み込んで距離と方向の関係を符号化し、地理空間コンテキストを捉える。実験では,自然言語処理と地理空間科学のギャップを埋める,頭字語認識,頭字語リンク,関係抽出,ジオエンティティ型付けのサポートにおいて,geolmが有望な能力を示すことを実証する。コードはhttps://github.com/knowledge-computing/geolmで公開されている。 Humans subconsciously engage in geospatial reasoning when reading articles. We recognize place names and their spatial relations in text and mentally associate them with their physical locations on Earth. Although pretrained language models can mimic this cognitive process using linguistic context, they do not utilize valuable geospatial information in large, widely available geographical databases, e.g., OpenStreetMap. This paper introduces GeoLM, a geospatially grounded language model that enhances the understanding of geo-entities in natural language. GeoLM leverages geo-entity mentions as anchors to connect linguistic information in text corpora with geospatial information extracted from geographical databases. GeoLM connects the two types of context through contrastive learning and masked language modeling. It also incorporates a spatial coordinate embedding mechanism to encode distance and direction relations to capture geospatial context. 
In the experiment, we demonstrate that GeoLM exhibits promising capabilities in supporting toponym recognition, toponym linking, relation extraction, and geo-entity typing, which bridge the gap between natural language processing and geospatial sciences. The code is publicly available at https://github.com/knowledge-computing/geolm.	翻訳日:2023-10-24 23:31:15 公開日:2023-10-23
# 身体部分外観を用いたプレイヤー再同定 Player Re-Identification Using Body Part Appearences ( http://arxiv.org/abs/2310.14469v1 ) ライセンス: Link先を確認	Mahesh Bhosale, Abhishek Kumar, David Doermann	(参考訳) サッカー選手の再識別のための身体部分の外観を学習するニューラルネットワークアーキテクチャを提案する。本モデルは,2ストリームネットワーク(外観マップ抽出用ストリームと身体部分マップ抽出用ストリーム)と,身体部分マップを空間的にプールするバイリニアプール層で構成されている。体部マップの各局所的特徴は、対応する局所的外観と体部記述子の双線型写像によって得られる。提案手法では, 画像マッチング特徴マップが頑健であり, 関連部位の局所的類似度と重み付けされた外観類似度を組み合わせて得られる。我々のモデルはネットワークをトレーニングするために SoccerNet-V3 の再識別データセットにいかなる部分アノテーションも必要としない。代わりに、既存のポーズ推定ネットワーク(openpose)のサブネットワークを使用して、部分サブストリームを初期化し、ネットワーク全体をトレーニングして三重項損失を最小限にします。外観ストリームはImageNetデータセットで事前トレーニングされ、その部分ストリームは SoccerNet-V3データセットのスクラッチからトレーニングされる。我々は,OsNetやInceptionNetといった最先端モデルよりも優れていることを示すことによって,モデルの有効性を示す。 We propose a neural network architecture that learns body part appearances for soccer player re-identification. Our model consists of a two-stream network (one stream for appearance map extraction and the other for body part map extraction) and a bilinear-pooling layer that generates and spatially pools the body part map. Each local feature of the body part map is obtained by a bilinear mapping of the corresponding local appearance and body part descriptors. Our novel representation yields a robust image-matching feature map, which results from combining the local similarities of the relevant body parts with the weighted appearance similarity. Our model does not require any part annotation on the SoccerNet-V3 re-identification dataset to train the network. Instead, we use a sub-network of an existing pose estimation network (OpenPose) to initialize the part substream and then train the entire network to minimize the triplet loss. The appearance stream is pre-trained on the ImageNet dataset, and the part stream is trained from scratch for the SoccerNet-V3 dataset. We demonstrate the validity of our model by showing that it outperforms state-of-the-art models such as OsNet and InceptionNet.	翻訳日:2023-10-24 23:30:55 公開日:2023-10-23
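The training objective named in this abstract, the triplet loss, can be sketched in a few lines. This is a generic illustration rather than the authors' implementation: the Euclidean distance, the margin value of 0.3, and the function names are assumptions, and real embeddings would come from the two-stream network rather than hand-written lists.

```python
import math


def euclidean(u, v):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def triplet_loss(anchor, positive, negative, margin=0.3):
    """Hinge-style triplet loss on one (anchor, positive, negative) triple.

    The loss is zero once the anchor is closer to the positive (same player)
    than to the negative (different player) by at least `margin`.
    The margin default is an illustrative assumption.
    """
    return max(0.0, euclidean(anchor, positive) - euclidean(anchor, negative) + margin)
```

Minimizing this quantity over many triples pulls images of the same player together in feature space and pushes images of different players apart, which is what makes nearest-neighbour matching usable for re-identification.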
# 最適制御における学習問題に対する暗黙微分の再検討 Revisiting Implicit Differentiation for Learning Problems in Optimal Control ( http://arxiv.org/abs/2310.14468v1 ) ライセンス: Link先を確認	Ming Xu, Timothy Molloy, Stephen Gould	(参考訳) 本稿では,非凸,制約付き離散時間最適制御(COC)問題から生じる最適軌道を暗黙関数定理(IFT)を用いて微分する新しい手法を提案する。従来の研究は、軌道微分のための微分カルーシュ・クーン・タッカー(KKT)システムを解き、補助線形二次レギュレータ(LQR)問題を解くことで効率よく実現している。対照的に、(微分)kkt系におけるラグランジュ乗算項に変数除去を適用することによって生じる行列方程式を直接評価する。結果方程式内の項の構造を適切に説明することにより、軌道微分は時間ステップの数とともに線形にスケールすることを示す。さらに,本手法により並列化が容易になり,モデルサイズによるスケーラビリティが大幅に向上し,ベクトルジャコビアン積の直接計算が可能となった。さらなる貢献として、IFTを用いたトラジェクトリ微分の計算は、時間ステップの数に対して2乗でスケールするという主張に対処する。本手法を合成ベンチマークと4つの挑戦ベンチマークで評価し,6自由度操縦クワッドローターと6自由度ロケット動力着陸を含む実演ベンチマークから学習した。 This paper proposes a new method for differentiating through optimal trajectories arising from non-convex, constrained discrete-time optimal control (COC) problems using the implicit function theorem (IFT). Previous works solve a differential Karush-Kuhn-Tucker (KKT) system for the trajectory derivative, and achieve this efficiently by solving an auxiliary Linear Quadratic Regulator (LQR) problem. In contrast, we directly evaluate the matrix equations which arise from applying variable elimination on the Lagrange multiplier terms in the (differential) KKT system. By appropriately accounting for the structure of the terms within the resulting equations, we show that the trajectory derivatives scale linearly with the number of timesteps. Furthermore, our approach allows for easy parallelization, significantly improved scalability with model size, direct computation of vector-Jacobian products and improved numerical stability compared to prior works. As an additional contribution, we unify prior works, addressing claims that computing trajectory derivatives using IFT scales quadratically with the number of timesteps. We evaluate our method on both a synthetic benchmark and four challenging learning-from-demonstration benchmarks, including a 6-DoF maneuvering quadrotor and 6-DoF rocket powered landing.	翻訳日:2023-10-24 23:30:33 公開日:2023-10-23
# 相互作用系における関係ポテンシャルの推論 Inferring Relational Potentials in Interacting Systems ( http://arxiv.org/abs/2310.14466v1 ) ライセンス: Link先を確認	Armand Comas-Massagu\'e, Yilun Du, Christian Fernandez, Sandesh Ghimire, Mario Sznaier, Joshua B. Tenenbaum, Octavia Camps	(参考訳) 相互作用エージェントからなるシステムは、物理学の力学系から複雑な生物学的ネットワークまで、世界に広く普及している。現実世界で堅牢に相互作用できるシステムを構築するためには、そのようなシステムを管理する正確な相互作用を推測できることが重要である。既存のアプローチは通常、軌道のフィードフォワードダイナミクスを明示的にモデル化することでそのような相互作用を発見する。本研究では, ニューラル・インタラクション・推論(NIIP)を, トラジェクトリ・モデリングの柔軟性を高めるための代替手法として, エネルギー関数として表現された関係ポテンシャルの集合を発見し, 元のトラジェクトリを最小限に再構築する。 NIIPは観測された関係制約を尊重する軌道のサブセットに低エネルギーを割り当てる。これらの表現により、NIIPはテスト時間内にユニークな機能を示す。第一に、別々に訓練されたモデル間での相互作用の型を交換したり、軌跡予測を行うような軌跡操作を可能にする。さらに、テスト時に外部の手作りのポテンシャルを追加できる。最後に、niipは、明示的なトレーニングなしで、分散サンプルや異常の検出を可能にする。 webサイト: https://energy-based-model.github.io/interaction-potentials。 Systems consisting of interacting agents are prevalent in the world, ranging from dynamical systems in physics to complex biological networks. To build systems which can interact robustly in the real world, it is thus important to be able to infer the precise interactions governing such systems. Existing approaches typically discover such interactions by explicitly modeling the feed-forward dynamics of the trajectories. In this work, we propose Neural Interaction Inference with Potentials (NIIP) as an alternative approach to discover such interactions that enables greater flexibility in trajectory modeling: it discovers a set of relational potentials, represented as energy functions, which when minimized reconstruct the original trajectory. NIIP assigns low energy to the subset of trajectories which respect the relational constraints observed. We illustrate that with these representations NIIP displays unique capabilities in test-time. First, it allows trajectory manipulation, such as interchanging interaction types across separately trained models, as well as trajectory forecasting. 
Additionally, it allows adding external hand-crafted potentials at test-time. Finally, NIIP enables the detection of out-of-distribution samples and anomalies without explicit training. Website: https://energy-based-model.github.io/interaction-potentials.	翻訳日:2023-10-24 23:30:08 公開日:2023-10-23
# 量子アドバンテージの検証可能性に関する暗号的展望 A Cryptographic Perspective on the Verifiability of Quantum Advantage ( http://arxiv.org/abs/2310.14464v1 ) ライセンス: Link先を確認	Nai-Hui Chia, Honghao Fu, Fang Song and Penghui Yao	(参考訳) 近年、NISQデバイス上での検証可能な量子優位性の実現は、量子情報において重要なオープン問題として浮上している。サンプリングに基づく量子アドバンテージは、効率的な検証方法を持つことは知られていない。本稿では,暗号の観点から量子優位性を検証する。量子優位性と暗号および複雑性プリミティブの検証可能性の間には、効率よくサンプリング可能で統計的に遠いが、(混合)量子状態(\mathsf{EFI}$)、擬ランダム状態(\mathsf{PRS}$)、最小回路サイズ問題(\mathsf{MCSP}$)の変種(\mathsf{MCSP}$)を含む強い関係を確立する。具体的に言えば a) サンプリングベースの量子優位性は検証可能であるか、$\mathsf{EFI}$および$\mathsf{PRS}$の構築に使用することができる。 b)$\mathsf{MCSP}$の変種に対する多項式時間アルゴリズムは、量子上の利点の効率的な検証を示唆する。我々の研究は、検証可能な量子優位性の探求が量子暗号の応用につながる可能性を示し、量子プリミティブの構築は、量子優位性の検証可能性に関する新たな洞察を与えることができる。 In recent years, achieving verifiable quantum advantage on a NISQ device has emerged as an important open problem in quantum information. The sampling-based quantum advantages are not known to have efficient verification methods. This paper investigates the verification of quantum advantage from a cryptographic perspective. We establish a strong connection between the verifiability of quantum advantage and cryptographic and complexity primitives, including efficiently samplable, statistically far but computationally indistinguishable pairs of (mixed) quantum states ($\mathsf{EFI}$), pseudorandom states ($\mathsf{PRS}$), and variants of minimum circuit size problems ($\mathsf{MCSP}$). Specifically, we prove that a) a sampling-based quantum advantage is either verifiable or can be used to build $\mathsf{EFI}$ and even $\mathsf{PRS}$ and b) polynomial-time algorithms for a variant of $\mathsf{MCSP}$ would imply efficient verification of quantum advantages. Our work shows that the quest for verifiable quantum advantages may lead to applications of quantum cryptography, and the construction of quantum primitives can provide new insights into the verifiability of quantum advantages.	翻訳日:2023-10-24 23:29:50 公開日:2023-10-23
# 最小作業変動の原理に関する実験的研究 Experimental study on the principle of minimal work fluctuations ( http://arxiv.org/abs/2310.14461v1 ) ライセンス: Link先を確認	Wei Cheng, Wenquan Liu, Yang Wu, Zhibo Niu, Chang-Kui Duan, Jiangbin Gong, Xing Rong, Jiangfeng Du	(参考訳) 有名な量子ジャジンスキーの等式の中心的な量は$e^{-\beta W}$であり、$W$は作用し、$\beta$は逆温度である。量子ランダム性が$e^{-\beta w}$のゆらぎに与える影響、したがってジャジンスキー推定器の予測力に与える影響は重要な問題である。ダイヤモンドの1つの窒素空洞中心に取り組み, 単発読み出しによる非平衡作業の2点測定の実施について検討し, $e^{-\beta w}$の変動と非平衡作業プロトコルの断熱性との関係について直接実験を行った。断熱過程は$e^{-\beta w}$の分散を最小限に抑え、初期の理論的な概念である極小作業変動の原理を検証することが観察されている。さらに、高速作業プロトコルにおける$e^{-\beta W}$の分散を最小限に抑えるために、ショートカット・トゥ・アディバティティティ制御を活用できることを実験的に実証した。我々の研究は、ジャージンスキー等式に基づく自由エネルギー差の推定におけるバイアスと誤差に対する量子効果のさらなる実験的研究を刺激すべきである。 The central quantity in the celebrated quantum Jarzynski equality is $e^{-\beta W}$, where $W$ is work and $\beta$ is the inverse temperature. The impact of quantum randomness on the fluctuations of $e^{-\beta W}$ and hence on the predictive power of the Jarzynski estimator is an important problem. Working on a single nitrogen-vacancy center in diamond and riding on an implementation of two-point measurement of non-equilibrium work with single-shot readout, we have conducted a direct experimental investigation of the relationship between the fluctuations of $e^{-\beta W}$ and adiabaticity of non-equilibrium work protocols. It is observed that adiabatic processes minimize the variance of $e^{-\beta W}$, thus verifying an early theoretical concept, the so-called principle of minimal work fluctuations. Furthermore, it is experimentally demonstrated that shortcuts-to-adiabaticity control can be exploited to minimize the variance of $e^{-\beta W}$ in fast work protocols. Our work should stimulate further experimental studies of quantum effects on the bias and error in the estimates of free energy differences based on the Jarzynski equality.	翻訳日:2023-10-24 23:29:22 公開日:2023-10-23
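For readers unfamiliar with the quantity studied in this abstract, the standard form of the Jarzynski equality and the variance whose minimization is discussed can be written out as follows. The symbol $\Delta F$ for the free energy difference and the angle brackets for the average over repeated realizations of the work protocol are notation introduced here, not taken from the abstract.

$$
\left\langle e^{-\beta W} \right\rangle = e^{-\beta \Delta F}
\quad\Longrightarrow\quad
\Delta F = -\frac{1}{\beta} \ln \left\langle e^{-\beta W} \right\rangle ,
$$

$$
\mathrm{Var}\!\left(e^{-\beta W}\right)
= \left\langle e^{-2\beta W} \right\rangle - \left\langle e^{-\beta W} \right\rangle^{2}
= \left\langle e^{-2\beta W} \right\rangle - e^{-2\beta \Delta F} .
$$

Since the second term is fixed by $\Delta F$, protocols differ only through $\langle e^{-2\beta W} \rangle$; a smaller variance means fewer repetitions are needed for the sample mean of $e^{-\beta W}$ to give a reliable estimate of $\Delta F$.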
# InstructExcel: A Benchmark for Natural Language Instruction in Excel ( http://arxiv.org/abs/2310.14495v1 ) License: see link	Justin Payan, Swaroop Mishra, Mukul Singh, Carina Negreanu, Christian Poelitz, Chitta Baral, Subhro Roy, Rasika Chakravarthy, Benjamin Van Durme, and Elnaz Nouri	With the evolution of Large Language Models (LLMs) we can solve increasingly complex NLP tasks across various domains, including spreadsheets. This work investigates whether LLMs can generate code (Excel OfficeScripts, a TypeScript API for executing many tasks in Excel) that solves Excel-specific tasks provided via natural language user instructions. To do so we introduce a new large-scale benchmark, InstructExcel, created by leveraging the 'Automate' feature in Excel to automatically generate OfficeScripts from users' actions. Our benchmark includes over 10k samples covering 170+ Excel operations across 2,000 publicly available Excel spreadsheets. Experiments across various zero-shot and few-shot settings show that InstructExcel is a hard benchmark for state-of-the-art models like GPT-4. We observe that (1) using GPT-4 over GPT-3.5, (2) providing more in-context examples, and (3) dynamic prompting can help improve performance on this benchmark.	Translated: 2023-10-24 23:22:35 Published: 2023-10-23
# Robotic Arm Manipulation to Perform Rock Skipping in Simulation ( http://arxiv.org/abs/2310.14492v1 ) License: see link	Nicholas Ramirez and Michael Burgess	Rock skipping is a highly dynamic and relatively complex task that can easily be performed by humans. This project aims to bring rock skipping into a robotic setting, utilizing the lessons we learned in Robotic Manipulation. Specifically, this project implements a system consisting of a robotic arm and dynamic environment to perform rock skipping in simulation. By varying important parameters such as release velocity, we hope to use our system to gain insight into the most important factors for maximizing the total number of skips. In addition, by implementing the system in simulation, we have a more rigorous and precise testing approach over these varied test parameters. However, this project experienced some limitations due to gripping inefficiencies and problems with release height trajectories, which are discussed further in our report.	Translated: 2023-10-24 23:22:14 Published: 2023-10-23
# Towards a Mechanistic Interpretation of Multi-Step Reasoning Capabilities of Language Models ( http://arxiv.org/abs/2310.14491v1 ) License: see link	Yifan Hou, Jiaoda Li, Yu Fei, Alessandro Stolfo, Wangchunshu Zhou, Guangtao Zeng, Antoine Bosselut, Mrinmaya Sachan	Recent work has shown that language models (LMs) have strong multi-step (i.e., procedural) reasoning capabilities. However, it is unclear whether LMs perform these tasks by cheating with answers memorized from the pretraining corpus or via a multi-step reasoning mechanism. In this paper, we try to answer this question by exploring a mechanistic interpretation of LMs for multi-step reasoning tasks. Concretely, we hypothesize that the LM implicitly embeds a reasoning tree resembling the correct reasoning process within it. We test this hypothesis by introducing a new probing approach (called MechanisticProbe) that recovers the reasoning tree from the model's attention patterns. We use our probe to analyze two LMs: GPT-2 on a synthetic task (k-th smallest element), and LLaMA on two simple language-based reasoning tasks (ProofWriter & AI2 Reasoning Challenge). We show that MechanisticProbe is able to detect the information of the reasoning tree from the model's attentions for most examples, suggesting that the LM indeed is going through a process of multi-step reasoning within its architecture in many cases.	Translated: 2023-10-24 23:22:04 Published: 2023-10-23
# MSFormer: A Skeleton-multiview Fusion Method For Tooth Instance Segmentation ( http://arxiv.org/abs/2310.14489v1 ) License: see link	Yuan Li, Huan Liu, Yubo Tao, Xiangyang He, Haifeng Li, Xiaohu Guo, Hai Lin	Recently, deep learning-based tooth segmentation methods have been limited by the expensive and time-consuming processes of data collection and labeling. Achieving high-precision segmentation with limited datasets is critical. A viable solution to this entails fine-tuning pre-trained multiview-based models, thereby enhancing performance with limited data. However, relying solely on two-dimensional (2D) images for three-dimensional (3D) tooth segmentation can produce suboptimal outcomes because of occlusion and deformation, i.e., incomplete and distorted shape perception. To improve this fine-tuning-based solution, this paper advocates 2D-3D joint perception. The fundamental challenge in employing 2D-3D joint perception with limited data is that the 3D-related inputs and modules must follow a lightweight policy instead of using huge 3D data and parameter-rich modules that require extensive training data. 
Following this lightweight policy, this paper selects skeletons as the 3D inputs and introduces MSFormer, a novel method for tooth segmentation. MSFormer incorporates two lightweight modules into existing multiview-based models: a 3D-skeleton perception module to extract 3D perception from skeletons and a skeleton-image contrastive learning module to obtain the 2D-3D joint perception by fusing both multiview and skeleton perceptions. The experimental results reveal that MSFormer paired with large pre-trained multiview models achieves state-of-the-art performance, requiring only 100 training meshes. Furthermore, the segmentation accuracy is improved by 2.4%-5.5% with the increasing volume of training data.	Translated: 2023-10-24 23:21:41 Published: 2023-10-23
# VQ-NeRF: Vector Quantization Enhances Implicit Neural Representations ( http://arxiv.org/abs/2310.14487v1 ) License: see link	Yiying Yang, Wen Liu, Fukun Yin, Xin Chen, Gang Yu, Jiayuan Fan, Tao Chen	Recent advancements in implicit neural representations have contributed to high-fidelity surface reconstruction and photorealistic novel view synthesis. However, the computational complexity inherent in these methodologies presents a substantial impediment, constraining the attainable frame rates and resolutions in practical applications. In response to this predicament, we propose VQ-NeRF, an effective and efficient pipeline for enhancing implicit neural representations via vector quantization. The essence of our method involves reducing the sampling space of NeRF to a lower resolution and subsequently reinstating it to the original size utilizing a pre-trained VAE decoder, thereby effectively mitigating the sampling time bottleneck encountered during rendering. Although the codebook furnishes representative features, reconstructing fine texture details of the scene remains challenging due to high compression rates. To overcome this constraint, we design an innovative multi-scale NeRF sampling scheme that concurrently optimizes the NeRF model at both compressed and original scales to enhance the network's ability to preserve fine details. 
Furthermore, we incorporate a semantic loss function to improve the geometric fidelity and semantic coherence of our 3D reconstructions. Extensive experiments demonstrate the effectiveness of our model in achieving the optimal trade-off between rendering quality and efficiency. Evaluation on the DTU, BlendMVS, and H3DS datasets confirms the superior performance of our approach.	Translated: 2023-10-24 23:21:10 Published: 2023-10-23
# Text Fact Transfer ( http://arxiv.org/abs/2310.14486v1 ) License: see link	Nishant Balepur, Jie Huang, Kevin Chen-Chuan Chang	Text style transfer is a prominent task that aims to control the style of text without inherently changing its factual content. To cover more text modification applications, such as adapting past news for current events and repurposing educational materials, we propose the task of text fact transfer, which seeks to transfer the factual content of a source text between topics without modifying its style. We find that existing language models struggle with text fact transfer, due to their inability to preserve the specificity and phrasing of the source text, and tendency to hallucinate errors. To address these issues, we design ModQGA, a framework that minimally modifies a source text with a novel combination of end-to-end question generation and specificity-aware question answering. Through experiments on four existing datasets adapted for text fact transfer, we show that ModQGA can accurately transfer factual content without sacrificing the style of the source text.	Translated: 2023-10-24 23:20:47 Published: 2023-10-23
# Intelligent Escape of Robotic Systems: A Survey of Methodologies, Applications, and Challenges ( http://arxiv.org/abs/2310.14485v1 ) License: see link	Junfei Li, Simon X. Yang	Intelligent escape is an interdisciplinary field that employs artificial intelligence (AI) techniques to give robots the capacity to react intelligently to potential dangers in dynamic, intricate, and unpredictable scenarios. As the emphasis on safety becomes increasingly paramount and robotic technologies continue to advance, a wide range of intelligent escape methodologies has been developed in recent years. This paper presents a comprehensive survey of state-of-the-art research work on intelligent escape of robotic systems. Four main methods of intelligent escape are reviewed, including planning-based methodologies, partitioning-based methodologies, learning-based methodologies, and bio-inspired methodologies. The strengths and limitations of existing methods are summarized. In addition, potential applications of intelligent escape are discussed in various domains, such as search and rescue, evacuation, military security, and healthcare. In an effort to develop new approaches to intelligent escape, this survey identifies current research challenges and provides insights into future research trends in intelligent escape.	Translated: 2023-10-24 23:20:28 Published: 2023-10-23
# "Why Should I Review This Paper?" Unifying Semantic, Topic, and Citation Factors for Paper-Reviewer Matching ( http://arxiv.org/abs/2310.14483v1 ) License: see link	Yu Zhang, Yanzhen Shen, Xiusi Chen, Bowen Jin, Jiawei Han	As many academic conferences are overwhelmed by a rapidly increasing number of paper submissions, automatically finding appropriate reviewers for each submission becomes a more urgent need than ever. Various factors have been considered by previous attempts on this task to measure the expertise relevance between a paper and a reviewer, including whether the paper is semantically close to, shares topics with, and cites previous papers of the reviewer. However, the majority of previous studies take only one of these factors into account, leading to an incomplete evaluation of paper-reviewer relevance. To bridge this gap, in this paper, we propose a unified model for paper-reviewer matching that jointly captures semantic, topic, and citation factors. In the unified model, a contextualized language model backbone is shared by all factors to learn common knowledge, while instruction tuning is introduced to characterize the uniqueness of each factor by producing factor-aware paper embeddings. 
Experiments on four datasets (one of which is newly contributed by us) across different fields, including machine learning, computer vision, information retrieval, and data mining, consistently validate the effectiveness of our proposed UniPR model in comparison with state-of-the-art paper-reviewer matching methods and scientific pre-trained language models.	Translated: 2023-10-24 23:20:11 Published: 2023-10-23
# Efficient Heterogeneous Graph Learning via Random Projection ( http://arxiv.org/abs/2310.14481v1 ) License: see link	Jun Hu, Bryan Hooi, Bingsheng He	Heterogeneous Graph Neural Networks (HGNNs) are powerful tools for deep learning on heterogeneous graphs. Typical HGNNs require repetitive message passing during training, limiting efficiency for large-scale real-world graphs. Recent pre-computation-based HGNNs use one-time message passing to transform a heterogeneous graph into regular-shaped tensors, enabling efficient mini-batch training. Existing pre-computation-based HGNNs can be mainly categorized into two styles, which differ in how much information loss is allowed and efficiency. We propose a hybrid pre-computation-based HGNN, named Random Projection Heterogeneous Graph Neural Network (RpHGNN), which combines the benefits of one style's efficiency with the low information loss of the other style. To achieve efficiency, the main framework of RpHGNN consists of propagate-then-update iterations, where we introduce a Random Projection Squashing step to ensure that complexity increases only linearly. 
To achieve low information loss, we introduce a Relation-wise Neighbor Collection component with an Even-odd Propagation Scheme, which aims to collect information from neighbors in a finer-grained way. Experimental results indicate that our approach achieves state-of-the-art results on seven small and large benchmark datasets while also being 230% faster compared to the most effective baseline. Surprisingly, our approach not only surpasses pre-processing-based baselines but also outperforms end-to-end methods.	Translated: 2023-10-24 23:19:45 Published: 2023-10-23
# Attention-Enhancing Backdoor Attacks Against BERT-based Models ( http://arxiv.org/abs/2310.14480v1 ) License: see link	Weimin Lyu, Songzhu Zheng, Lu Pang, Haibin Ling, Chao Chen	Recent studies have revealed that backdoor attacks can threaten the safety of natural language processing (NLP) models. Investigating the strategies of backdoor attacks will help to understand the model's vulnerability. Most existing textual backdoor attacks focus on generating stealthy triggers or modifying model weights. In this paper, we directly target the interior structure of neural networks and the backdoor mechanism. We propose a novel Trojan Attention Loss (TAL), which enhances the Trojan behavior by directly manipulating the attention patterns. Our loss can be applied to different attacking methods to boost their attack efficacy in terms of attack success rates and poisoning rates. It applies to not only traditional dirty-label attacks, but also the more challenging clean-label attacks. We validate our method on different backbone models (BERT, RoBERTa, and DistilBERT) and various tasks (Sentiment Analysis, Toxic Detection, and Topic Classification).	Translated: 2023-10-24 23:19:20 Published: 2023-10-23
# QUDEVAL: The Evaluation of Questions Under Discussion Discourse Parsing ( http://arxiv.org/abs/2310.14520v1 ) License: see link	Yating Wu, Ritika Mangla, Greg Durrett, Junyi Jessy Li	Questions Under Discussion (QUD) is a versatile linguistic framework in which discourse progresses by continuously asking questions and answering them. Automatic parsing of a discourse to produce a QUD structure thus entails a complex question generation task: given a document and an answer sentence, generate a question that satisfies linguistic constraints of QUD and can be grounded in an anchor sentence in prior context. These questions are known to be curiosity-driven and open-ended. This work introduces the first framework for the automatic evaluation of QUD parsing, instantiating the theoretical constraints of QUD in a concrete protocol. We present QUDeval, a dataset of fine-grained evaluation of 2,190 QUD questions generated from both fine-tuned systems and LLMs. Using QUDeval, we show that satisfying all constraints of QUD is still challenging for modern LLMs, and that existing evaluation metrics poorly approximate parser quality. Encouragingly, human-authored QUDs are scored highly by our human evaluators, suggesting that there is headroom for further progress on language modeling to improve both QUD parsing and QUD evaluation.	Translated: 2023-10-24 23:12:07 Published: 2023-10-23
# Iteratively Learn Diverse Strategies with State Distance Information ( http://arxiv.org/abs/2310.14509v1 ) License: see link	Wei Fu, Weihua Du, Jingwei Li, Sunli Chen, Jingzhao Zhang, Yi Wu	In complex reinforcement learning (RL) problems, policies with similar rewards may have substantially different behaviors. It remains a fundamental challenge to optimize rewards while also discovering as many diverse strategies as possible, which can be crucial in many practical applications. Our study examines two design choices for tackling this challenge, i.e., diversity measure and computation framework. First, we find that with existing diversity measures, visually indistinguishable policies can still yield high diversity scores. To accurately capture the behavioral difference, we propose to incorporate the state-space distance information into the diversity measure. In addition, we examine two common computation frameworks for this problem, i.e., population-based training (PBT) and iterative learning (ITR). We show that although PBT is the precise problem formulation, ITR can achieve comparable diversity scores with higher computation efficiency, leading to improved solution quality in practice. 
Based on our analysis, we further combine ITR with two tractable realizations of the state-distance-based diversity measures and develop a novel diversity-driven RL algorithm, State-based Intrinsic-reward Policy Optimization (SIPO), with provable convergence properties. We empirically examine SIPO across three domains from robot locomotion to multi-agent games. In all of our testing environments, SIPO consistently produces strategically diverse and human-interpretable policies that cannot be discovered by existing baselines.	Translated: 2023-10-24 23:11:49 Published: 2023-10-23
# EXPLAIN, EDIT, GENERATE: Rationale-Sensitive Counterfactual Data Augmentation for Multi-hop Fact Verification ( http://arxiv.org/abs/2310.14508v1 ) License: see link	Yingjie Zhu, Jiasheng Si, Yibo Zhao, Haiyang Zhu, Deyu Zhou, Yulan He	The automatic multi-hop fact verification task has gained significant attention in recent years. Despite impressive results, these well-designed models perform poorly on out-of-domain data. One possible solution is to augment the training data with counterfactuals, which are generated by minimally altering the causal features of the original data. However, current counterfactual data augmentation techniques fail to handle multi-hop fact verification due to their inability to preserve the complex logical relationships within multiple correlated texts. In this paper, we overcome this limitation by developing a rationale-sensitive method to generate linguistically diverse and label-flipping counterfactuals while preserving logical relationships. Specifically, the diverse and fluent counterfactuals are generated via an Explain-Edit-Generate architecture. Moreover, the checking and filtering modules are proposed to regularize the counterfactual data with logical relations and flipped labels. Experimental results show that the proposed approach outperforms the SOTA baselines and can generate linguistically diverse counterfactual data without disrupting their logical relationships.	Translated: 2023-10-24 23:11:24 Published: 2023-10-23
# Sentiment analysis with adaptive multi-head attention in Transformer ( http://arxiv.org/abs/2310.14505v1 ) License: see link	Fanfei Meng, David Demeter	We propose a novel framework based on the attention mechanism to identify the sentiment of a movie review document. Previous efforts on deep neural networks with attention mechanisms focus on encoder and decoder with fixed numbers of multi-head attention. Therefore, we need a mechanism to stop the attention process automatically if no more useful information can be read from the memory. In this paper, we propose an adaptive multi-head attention architecture (AdaptAttn) which varies the number of attention heads based on the length of sentences. AdaptAttn has a data preprocessing step where each document is classified into one of three bins (small, medium, or large) based on the length of the sentence. A document classified as small goes through two heads in each layer, the medium group passes through four heads, and the large group is processed by eight heads. We examine the merit of our model on the Stanford large movie review dataset. The experimental results show that the F1 score from our model is on par with the baseline model.	Translated: 2023-10-24 23:11:04 Published: 2023-10-23
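The length-based binning described in the AdaptAttn entry above can be sketched as follows. Only the three bins and the head counts (2, 4, 8) come from the abstract; the token thresholds, the function name, and the whitespace tokenization are hypothetical choices made for illustration, not the authors' settings.

```python
# Hypothetical length thresholds (in tokens); the abstract does not state them.
SMALL_MAX_TOKENS = 128
MEDIUM_MAX_TOKENS = 256


def choose_num_heads(num_tokens,
                     small_max=SMALL_MAX_TOKENS,
                     medium_max=MEDIUM_MAX_TOKENS):
    """Map a document length to a number of attention heads.

    small -> 2 heads, medium -> 4 heads, large -> 8 heads,
    following the head counts given in the abstract.
    """
    if num_tokens <= small_max:
        return 2
    if num_tokens <= medium_max:
        return 4
    return 8


# Whitespace tokenization is an assumption used only for this example.
review = "A short and pleasant film with a predictable ending"
print(choose_num_heads(len(review.split())))
```

In a full model, the returned value would select which attention configuration processes the document; how the paper wires this into the Transformer layers is not specified in the abstract.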
# ADoPT: LiDAR Spoofing Attack Detection Based on Point-Level Temporal Consistency ( http://arxiv.org/abs/2310.14504v1 ) License: see link	Minkyoung Cho, Yulong Cao, Zixiang Zhou, and Z. Morley Mao	Deep neural networks (DNNs) are increasingly integrated into LiDAR (Light Detection and Ranging)-based perception systems for autonomous vehicles (AVs), requiring robust performance under adversarial conditions. We aim to address the challenge of LiDAR spoofing attacks, where attackers inject fake objects into LiDAR data and fool AVs to misinterpret their environment and make erroneous decisions. However, current defense algorithms predominantly depend on perception outputs (i.e., bounding boxes) and thus face limitations in detecting attackers, given that the bounding boxes are generated by imperfect perception models processing limited points, acquired based on the ego vehicle's viewpoint. To overcome these limitations, we propose a novel framework, named ADoPT (Anomaly Detection based on Point-level Temporal consistency), which quantitatively measures temporal consistency across consecutive frames and identifies abnormal objects based on the coherency of point clusters. 
In our evaluation using the nuScenes dataset, our algorithm effectively counters various LiDAR spoofing attacks, achieving a low (< 10%) false positive ratio (FPR) and high (> 85%) true positive ratio (TPR), outperforming existing state-of-the-art defense methods, CARLO and 3D-TC2. Furthermore, our evaluation demonstrates the promising potential for accurate attack detection across various road environments.	Translated: 2023-10-24 23:10:49 Published: 2023-10-23
# Diversify Question Generation with Retrieval-Augmented Style Transfer ( http://arxiv.org/abs/2310.14503v1 ) License: see link	Qi Gou, Zehua Xia, Bowen Yu, Haiyang Yu, Fei Huang, Yongbin Li, Nguyen Cam-Tu	Given a textual passage and an answer, humans are able to ask questions with various expressions, but this ability is still challenging for most question generation (QG) systems. Existing solutions mainly focus on the internal knowledge within the given passage or the semantic word space for diverse content planning. These methods, however, have not considered the potential of external knowledge for expression diversity. To bridge this gap, we propose RAST, a framework for Retrieval-Augmented Style Transfer, where the objective is to utilize the style of diverse templates for question generation. For training RAST, we develop a novel Reinforcement Learning (RL) based approach that maximizes a weighted combination of diversity reward and consistency reward. Here, the consistency reward is computed by a Question-Answering (QA) model, whereas the diversity reward measures how much the final output mimics the retrieved template. Experimental results show that our method outperforms previous diversity-driven baselines on diversity while being comparable in terms of consistency scores. Our code is available at https://github.com/gouqi666/RAST.	Translated: 2023-10-24 23:10:19 Published: 2023-10-23
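The "weighted combination of diversity reward and consistency reward" in the RAST entry above might look like the sketch below. This is an assumption-laden toy, not the authors' implementation: the convex weighting, the function names, and the use of token-level Jaccard overlap as a stand-in for "how much the output mimics the retrieved template" are all hypothetical, and the consistency score is passed in as a plain number rather than computed by a QA model.

```python
def template_overlap(question, template):
    """Toy diversity reward: Jaccard overlap between the token sets of the
    generated question and the retrieved template (a hypothetical proxy)."""
    q_tokens = set(question.lower().split())
    t_tokens = set(template.lower().split())
    union = q_tokens | t_tokens
    if not union:
        return 0.0
    return len(q_tokens & t_tokens) / len(union)


def combined_reward(diversity_reward, consistency_reward, weight=0.5):
    """Convex combination of the two rewards; `weight` is a hypothetical
    hyperparameter controlling the trade-off."""
    return weight * diversity_reward + (1.0 - weight) * consistency_reward


# Hypothetical example: the consistency score would come from a QA model.
diversity = template_overlap("what is x", "what is y")
print(combined_reward(diversity, consistency_reward=0.8, weight=0.25))
```

A larger `weight` pushes the policy toward imitating the retrieved template's style, while a smaller one favors questions that a QA model can still answer correctly.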
# Coyote C++: 完全な自動ユニットテストツール Coyote C++: An Industrial-Strength Fully Automated Unit Testing Tool ( http://arxiv.org/abs/2310.14500v1 ) ライセンス: Link先を確認	Sanghoon Rho, Philipp Martens, Seungcheol Shin, Yeoneo Kim, Hoon Heo and SeungHyun Oh	(参考訳) Coyote C++は、CとC++の完全な自動ユニットテストを実現するために、洗練されたConcolic-executionベースのアプローチを使用する自動テストツールである。コンコリックテストはcやjavaなどの言語で効果的であることが証明されているが、構文上の複雑さと全体的な複雑さのため、c++の実用レベルの自動化を達成するのに苦労している。 Coyote C++は、この障壁を突破し、C++の自動ユニットテストを産業採用に適した実践レベルに引き上げる最初の自動テストツールである。特に、このテストプロセスは、ユーザの関与を必要とせず、"ワンクリック"自動化でテストハーネス生成、テストケース生成、テスト実行を実行する。本稿では,そのハイレベルな構造を概説し,そのconcolic実行エンジンの実装を形作ったコア設計決定を議論することで,coyote c++を紹介する。最後に,coyote c++は,オープンソースソフトウェアと産業ソフトウェアの両方における実験の結果を提示することにより,合理的なタイムスパン内で高いカバレッジを達成できることを実証する。 Coyote C++ is an automated testing tool that uses a sophisticated concolic-execution-based approach to realize fully automated unit testing for C and C++. While concolic testing has proven effective for languages such as C and Java, tools have struggled to achieve a practical level of automation for C++ due to its many syntactical intricacies and overall complexity. Coyote C++ is the first automated testing tool to breach the barrier and bring automated unit testing for C++ to a practical level suitable for industrial adoption, consistently reaching around 90% code coverage. Notably, this testing process requires no user involvement and performs test harness generation, test case generation and test execution with "one-click" automation. In this paper, we introduce Coyote C++ by outlining its high-level structure and discussing the core design decisions that shaped the implementation of its concolic execution engine. Finally, we demonstrate that Coyote C++ is capable of achieving high coverage results within a reasonable timespan by presenting the results from experiments on both open-source and industrial software.	翻訳日:2023-10-24 23:10:01 公開日:2023-10-23
# 可変形逆エンジニアリングによる高効率な分子の創製と検出 Highly Efficient Creation and Detection of Deeply-bound Molecules via Invariant-based Inverse Engineering with Feasible Modified Drivings ( http://arxiv.org/abs/2310.14499v1 ) ライセンス: Link先を確認	Jiahui Zhang	(参考訳) Stimulated Raman Adiabatic Passage (STIRAP)とその変異体、例えばMulti-state chainwise-STIRAPは、多状態系の個体群を効率的に移動させることを可能にし、超低温で深い結合を持つ分子の調製に広く用いられている。しかし、転送効率は一般的に不完全である。主な障害は、損失の存在と、ダイナミクスを断熱的にすることの必要性である。そこで本論文では, 深く結合した分子の効率的かつロバストな生成・検出のための理論的手法を提案する。光学場によって鎖状に結合された状態を持つ単純な3層および5層システムを考える。大規模な調律では、3レベルと5レベルの分子系のダイナミクスをそれぞれ有効2レベルと3レベルに縮小することにより、大きな分子損失が事前に抑制される。その結果、2レベル対応は2種類の「不変ベースの逆工学」 (iie) レシピと直接互換となり, 両プロトコルが同等の性能を示し, 実験可能性も良好であることが判明した。 5レベルの場合、入射パルス間の関係を考慮して、m型構造を最も単純な共振結合を持つ効果的な$lambda$型構造に一般化できることを示す。したがって、この一般化モデルは「IIE」レシピと直接互換性がある。数値計算により、弱い結合分子は強いレーザー強度を伴わずにその深い結合状態に効率的に移動でき、パラメータ変動に対する安定性はよく保存されている。最後に、超低温の深い結合分子の検出について論じ、全てのプロトコルが分子の効率的な検出を可能にすることを示す。 Stimulated Raman Adiabatic Passage (STIRAP) and its variants, such as multi-state chainwise-STIRAP allow efficiently transferring the populations in multi-state system and have been widely used to prepare ultracold deeply-bound molecules. However, their transfer efficiencies are generally imperfect. The main obstacle is the presence of losses and the requirement to make the dynamics adiabatic. To this end, in the present paper a theoretical method for the efficient and robust creation and detection of deeply-bound molecules is proposed. The simple three- and five-level systems with states chainwise coupled by optical fields are considered. In the regime of large detuning, the major molecular losses are pre-suppressed by reducing the dynamics of the three- and five-level molecular systems to those of effective two- and three-level counterparts, respectively. 
Consequently, the two-level counterpart is directly compatible with two kinds of "Invariant-based Inverse Engineering" (IIE) recipes; the results show that both protocols give comparable performance and have good experimental feasibility. For the five-level case, by considering a relation among the four incident pulses, we show that the M-type structure can be generalized into an effective $\Lambda$-type one with the simplest resonant coupling. Therefore, this generalized model is also directly compatible with the IIE recipe. Numerical calculations show that the weakly-bound molecules can be efficiently transferred to their deeply-bound states without strong laser intensity, and the stability against parameter variations is well preserved. Finally, the detection of ultracold deeply-bound molecules is discussed; the results show that all the protocols allow efficient detection of molecules.	翻訳日:2023-10-24 23:09:41 公開日:2023-10-23
# 生成AIを活用したオープンアクセシブル大等方的問題バンクを用いた物理エクササイズの改革 : 探索的研究 Reforming Physics Exams Using Openly Accessible Large Isomorphic Problem Banks created with the assistance of Generative AI: an Explorative Study ( http://arxiv.org/abs/2310.14498v1 ) ライセンス: Link先を確認	Zhongzhou Chen, Emily Frederick, Colleen Cui, Munaimah Khan, Christopher Klatt, Mercedith Huang, Shiyang Su	(参考訳) 本稿では、大規模な同型問題銀行を用いて、大規模なSTEMクラスにおける従来の試験の課題を克服し、特にコンテンツ共有サイトや生成AIが試験項目のセキュリティを脅かすことを考察する。まず, 大規模言語モデルgpt-3 と各種オープンソースツールを用いて, 多数の同型物理問題を作成するための効率的な手順を提案する。次に,問題バンクから試験項目がランダムに抽出された場合,試験前の問題バンクへのオープンアクセスを学生に与えることは,試験における学生の成績に劇的な影響を与えないか,あるいは幅広い解の暗記に繋がることを提案する。この仮説を2つの中等式物理試験で検証し,開同型問題バンクから引き出された問題と,試験前の学生がアクセスできない類似の伝達問題とを比較した。いずれの試験でも,オープンバンク問題とトランスファー問題はともに最も困難であった。正解率の差は5%から10%であり、同じ問題型の異型版の違いに匹敵するものであった。項目応答理論分析の結果,両問題とも高い識別率(>1.5)を示し,有意な差は認められなかった。オープンバンクとトランスファー問題における学生の成績は相互に強く相関しており, 試験における問題の平均的相関よりも相関が強い。探索的因子分析では、オープンバンクと転送問題は同じ要因に負担され、2回目の試験でそれぞれ独自の因子を形成していることも判明した。これらの結果は、学生が大きな同型問題銀行に開放されることは、学生の試験成績にわずかな影響しか与えず、従来の教室試験の改革に有意義な可能性を示唆している。 This paper explores using large isomorphic problem banks to overcome many challenges of traditional exams in large STEM classes, especially the threat of content sharing websites and generative AI to the security of exam items. We first introduce an efficient procedure for creating large numbers of isomorphic physics problems, assisted by the large language model GPT-3 and several other open-source tools. We then propose that if exam items are randomly drawn from large enough problem banks, then giving students open access to problem banks prior to the exam will not dramatically impact students' performance on the exam or lead to wide-spread rote-memorization of solutions. We tested this hypothesis on two mid-term physics exams, comparing students' performance on problems drawn from open isomorphic problem banks to similar transfer problems that were not accessible to students prior to the exam. 
We found that on both exams, both open-bank and transfer problems had the highest difficulty. The differences in percent correct were between 5% and 10%, which is comparable to the differences between different isomorphic versions of the same problem type. Item response theory analysis found that both types of problems have high discrimination (>1.5) with no significant differences. Student performance on open-bank and transfer problems is highly correlated, and the correlations are stronger than average correlations between problems on the exam. Exploratory factor analysis also found that open-bank and transfer problems load on the same factor, and even formed their own factor on the second exam. These observations all suggest that giving students open access to large isomorphic problem banks only had a small impact on students' performance on the exam but could have significant potential in reforming traditional classroom exams.	翻訳日:2023-10-24 23:09:12 公開日:2023-10-23
# s(CASP)を用いた実測的説明生成 Counterfactual Explanation Generation with s(CASP) ( http://arxiv.org/abs/2310.14497v1 ) ライセンス: Link先を確認	Sopam Dasgupta, Farhad Shakerin, Joaqu\'in Arias, Elmer Salazar, Gopal Gupta	(参考訳) 意思決定を自動化する機械学習モデルは、ローンの承認、プレトライアルの保釈、雇用など、連続した分野での利用が増えている。残念なことに、これらのモデルのほとんどはブラックボックスであり、これらの予測決定にどのように到達するかを明らかにすることができない。このような予測を正当化する透明性の必要性。影響を受ける個人は、なぜ意思決定が行われたのかを理解するために説明を求めるかもしれない。倫理的および法的考察は、望ましい結果をもたらすことができる入力属性の変化を個人に通知する必要があるかもしれない。本稿では, 逆実説明を自動生成する後者の問題に焦点をあてる。提案手法は,応答セットプログラミングと s(CASP) 目標指向ASP システムを利用する。 answer set programming (asp) はよく知られた知識表現と推論パラダイムである。 s(CASP)はゴール指向のASPシステムで、応答セットプログラムをトップダウンで実行します。 s(CASP)のクエリ駆動型の性質は、証明木として正当化を提供することを可能にし、生成した偽物の説明を分析することができる。反事実的説明は、ある、あるいは全ての事実的仮定が真実でない複数の可能な世界を想像し、さらに重要なことに、これらの世界をいかにナビゲートできるかを想像することによって、どのように計算され、正当化されるかを示す。また,我々のアルゴリズムを用いて,クエリの失敗に対する応答セットプログラムのクラスに対するCraig Interpolantを見つける方法を示す。 Machine learning models that automate decision-making are increasingly being used in consequential areas such as loan approvals, pretrial bail, hiring, and many more. Unfortunately, most of these models are black-boxes, i.e., they are unable to reveal how they reach these prediction decisions. A need for transparency demands justification for such predictions. An affected individual might desire explanations to understand why a decision was made. Ethical and legal considerations may further require informing the individual of changes in the input attribute that could be made to produce a desirable outcome. This paper focuses on the latter problem of automatically generating counterfactual explanations. Our approach utilizes answer set programming and the s(CASP) goal-directed ASP system. Answer Set Programming (ASP) is a well-known knowledge representation and reasoning paradigm. s(CASP) is a goal-directed ASP system that executes answer-set programs top-down without grounding them. 
The query-driven nature of s(CASP) allows us to provide justifications as proof trees, which makes it possible to analyze the generated counterfactual explanations. We show how counterfactual explanations are computed and justified by imagining multiple possible worlds where some or all factual assumptions are untrue and, more importantly, how we can navigate between these worlds. We also show how our algorithm can be used to find the Craig Interpolant for a class of answer set programs for a failing query.	翻訳日:2023-10-24 23:08:38 公開日:2023-10-23
# レストレスマルチアームバンドにおけるゼロショット学習に向けて Towards Zero Shot Learning in Restless Multi-armed Bandits ( http://arxiv.org/abs/2310.14526v1 ) ライセンス: Link先を確認	Yunfan Zhao, Nikhil Behari, Edward Hughes, Edwin Zhang, Dheeraj Nagaraj, Karl Tuyls, Aparna Taneja, Milind Tambe	(参考訳) レストレス・マルチアーム・バンディット (RMABs) は, 医療, オンライン広告, 密猟などの分野で広く応用されている資源配分問題のクラスであり, マルチエージェント強化学習の観点から最近研究されている。 RMAB以前の研究はいくつかの制限に悩まされており、例えば、連続状態に適切に対処できず、多くの現実世界で一般的な課題である腕のオプトインやオプトアウト時にスクラッチから再トレーニングする必要がある。これらの制限に対処するために、ニューラルネットワークベースの事前訓練モデル(PreFeRMAB)を開発し、これまで見つからなかったRMABの幅広い範囲で、一般的なゼロショット能力を持ち、スクラッチからリトレーニングするよりも、よりサンプル効率の良い方法で特定のインスタンスで微調整できる。このモデルは、一般的なマルチアクション設定や離散状態空間や連続状態空間も含む。迅速な一般化を実現するために,特徴情報を活用し,武器のオプトイン・アウトを経時的に行う新しい単一政策ネットワークモデルを学習する。理論的収束を保証する重要な$\lambda$-networkに対する新しい更新ルールを導き、いくつかの挑戦的で現実世界にインスパイアされた問題に対するアプローチの利点を実証的に示す。 Restless multi-arm bandits (RMABs), a class of resource allocation problems with broad application in areas such as healthcare, online advertising, and anti-poaching, have recently been studied from a multi-agent reinforcement learning perspective. Prior RMAB research suffers from several limitations, e.g., it fails to adequately address continuous states, and requires retraining from scratch when arms opt-in and opt-out over time, a common challenge in many real world applications. We address these limitations by developing a neural network-based pre-trained model (PreFeRMAB) that has general zero-shot ability on a wide range of previously unseen RMABs, and which can be fine-tuned on specific instances in a more sample-efficient way than retraining from scratch. Our model also accommodates general multi-action settings and discrete or continuous state spaces. To enable fast generalization, we learn a novel single policy network model that utilizes feature information and employs a training procedure in which arms opt-in and out over time. 
We derive a new update rule for a crucial $\lambda$-network with theoretical convergence guarantees and empirically demonstrate the advantages of our approach on several challenging, real-world inspired problems.	翻訳日:2023-10-24 23:02:22 公開日:2023-10-23
# グラフ表現にコントラスト学習は本当に必要か? Do We Really Need Contrastive Learning for Graph Representation? ( http://arxiv.org/abs/2310.14525v1 ) ライセンス: Link先を確認	Yulan Hu, Sheng Ouyang, Jingyu Liu, Ge Chen, Zhirui Yang, Junchen Wan, Fuzheng Zhang, Zhongyuan Wang, Yong Liu	(参考訳) 近年,グラフ学習の分野では,コントラスト学習が支配的な自己指導パラダイムとして出現し,多くの研究分野が注目されている。グラフコントラスト学習(gcl)は、拡張アンカーサンプルを互いに近接させ、他のサンプル(負のサンプル)の埋め込みを分離することを目的としている。しかし、既存のgcl法は埋め込みの品質を確保するために大きく多様な負のサンプルを必要とし、最近の研究ではアンカーと正のサンプルを除くサンプルを負のサンプルとして活用し、偽の負のサンプル(アンカーと同じクラスを共有する負のサンプル)をもたらす可能性がある。さらに、この慣行は計算の重荷と、o(n^2)$の高時間複雑性をもたらす可能性がある。これらの欠陥に対処するために、ランク学習を活用し、シンプルで効果的なモデルであるGraphRankを提案する。具体的には、腐敗を通じて2つのグラフビューを生成します。そして、両ビューにおいてペアワイズノード(アンカーノードと正ノード)の類似性を計算し、後者ビューにおける任意のノードを負ノードとして選択し、アンカーノードとの類似性を演算する。そこで本研究では,擬似負の証明を解消し,時間的複雑性を$O(N^2)$から$O(N)$に下げる類似度スコアをランクベースで測定する学習手法を提案する。さらに,複数のグラフタスクにまたがる広範な実験を行い,様々なタスクにおいて他の最先端のgcl手法に対してグラフランクが有利に働くことを示した。 In recent years, contrastive learning has emerged as a dominant self-supervised paradigm, attracting numerous research interests in the field of graph learning. Graph contrastive learning (GCL) aims to embed augmented anchor samples close to each other while pushing the embeddings of other samples (negative samples) apart. However, existing GCL methods require large and diverse negative samples to ensure the quality of embeddings, and recent studies typically leverage samples excluding the anchor and positive samples as negative samples, potentially introducing false negative samples (negatives that share the same class as the anchor). Additionally, this practice can result in heavy computational burden and high time complexity of $O(N^2)$, which is particularly unaffordable for large graphs. To address these deficiencies, we leverage rank learning and propose a simple yet effective model, GraphRank. Specifically, we first generate two graph views through corruption. 
Then, we compute the similarity of pairwise nodes (anchor node and positive node) in both views; an arbitrary node in the latter view is selected as a negative node, and its similarity with the anchor node is computed. Based on this, we introduce rank-based learning to measure similarity scores, which successfully relieves the false negative problem and decreases the time complexity from $O(N^2)$ to $O(N)$. Moreover, we conducted extensive experiments across multiple graph tasks, demonstrating that GraphRank performs favorably against other cutting-edge GCL methods in various tasks.	翻訳日:2023-10-24 23:01:59 公開日:2023-10-23
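As a rough illustration of the ranking idea in the GraphRank abstract above, the following minimal sketch scores each anchor against its positive (the same node in the other view) and against one arbitrarily chosen negative, so only O(N) similarities are evaluated. The hinge form, the `margin` value, the cosine similarity, and the function names `cosine` and `rank_loss` are assumptions made for this example; the abstract does not specify the loss actually used in the paper.

```python
import math
import random


def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def rank_loss(view_a, view_b, margin=0.5, seed=0):
    """Hypothetical margin ranking loss over two views of N >= 2 nodes.

    For anchor i in view_a, the positive is node i in view_b and the
    negative is one other node of view_b picked at random. The loss is
    zero when the positive outranks the negative by at least `margin`.
    """
    rng = random.Random(seed)
    n = len(view_a)
    total = 0.0
    for i in range(n):
        pos = cosine(view_a[i], view_b[i])
        # Draw an index uniformly from the n - 1 nodes other than i.
        j = rng.randrange(n - 1)
        if j >= i:
            j += 1
        neg = cosine(view_a[i], view_b[j])
        total += max(0.0, margin - (pos - neg))
    return total / n
```

Because each anchor contributes exactly two similarity computations, the cost grows linearly in the number of nodes, in contrast to the all-pairs comparison of standard contrastive objectives.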
# コンピュータ翻訳における単語レベル自動補完の再考 Rethinking Word-Level Auto-Completion in Computer-Aided Translation ( http://arxiv.org/abs/2310.14523v1 ) ライセンス: Link先を確認	Xingyu Chen and Lemao Liu and Guoping Huang and Zhirui Zhang and Mingming Yang and Shuming Shi and Rui Wang	(参考訳) Word-Level Auto-Completion (WLAC) はコンピュータ翻訳において重要な役割を果たす。人間の翻訳者に対して単語レベルの自動補完提案を提供することを目的としている。従来の研究は主に複雑なモデルアーキテクチャの設計に重点を置いてきたが、本論文は基本的な問題を再考することによって、異なる視点を採っている。この質問に答えるために測定可能な基準を導入し、既存のwlacモデルは、しばしばこの基準を満たさないことを発見します。本研究は, 基準の遵守を促進することによってWLAC性能を向上させる効果的な手法を提案する。特に,提案手法は汎用的であり,様々なエンコーダアーキテクチャに適用可能である。実験により,WMT2022におけるWLAC共有タスクの処理性能は,モデルサイズを大幅に小さくし,高い性能を示した。 Word-Level Auto-Completion (WLAC) plays a crucial role in Computer-Assisted Translation. It aims at providing word-level auto-completion suggestions for human translators. While previous studies have primarily focused on designing complex model architectures, this paper takes a different perspective by rethinking the fundamental question: what kind of words are good auto-completions? We introduce a measurable criterion to answer this question and discover that existing WLAC models often fail to meet this criterion. Building upon this observation, we propose an effective approach to enhance WLAC performance by promoting adherence to the criterion. Notably, the proposed approach is general and can be applied to various encoder-based architectures. Through extensive experiments, we demonstrate that our approach outperforms the top-performing system submitted to the WLAC shared tasks in WMT2022, while utilizing significantly smaller model sizes.	翻訳日:2023-10-24 23:01:31 公開日:2023-10-23
# K-Nearest-NeighborsによるcRNA配列解析のためのトポロジカルPCA K-Nearest-Neighbors Induced Topological PCA for scRNA Sequence Data Analysis ( http://arxiv.org/abs/2310.14521v1 ) ライセンス: Link先を確認	Sean Cottrell, Yuta Hozumi, Guo-Wei Wei	(参考訳) 単細胞RNAシークエンシング(scRNA-seq)は、細胞内の不均一性を明らかにするために広く用いられ、細胞間通信、細胞分化、および分化遺伝子発現に関する洞察を与えてくれた。しかし、scRNA-seqデータの解析は、スパーシリティと関連する多数の遺伝子によって困難である。したがって,スプリアス信号の除去と下流解析の促進には,次元化と特徴選択が重要である。従来のPCAは次元減少の主要な作業場であり、データに埋め込まれた幾何学的構造情報をキャプチャする能力に欠けており、以前のグラフラプラシア正規化は単一のスケールの分析によって制限されている。永続ラプラシアン(PL)手法とL$_{2,1}$ノルム正規化を組み合わせたトポロジカル・プライマリ・コンポーネント分析(tPCA)法を提案し,データ中のマルチスケールおよびマルチクラスの不均一性問題に対処する。さらに, k-Nearest-Neighbor (kNN) の永続ラプラス的手法を導入し, 永続ラプラス的手法の堅牢性を向上させる。提案する knn-pl は従来の永続ホモロジーの多くの制限に対処する新しい代数的位相技法である。距離しきい値の変化によってフィルタを誘導する代わりに、各ステップでkNNネットワーク内の隣人数を変動させることでフィルタを実現するkNN-tPCAを導入し、このフレームワークがハイパーパラメータチューニングに重大な影響を与えることを発見した。提案したtPCA法とkNN-tPCA法が,11種類のベンチマークscRNA-seqデータセットに対して有効であることを示すとともに,本手法が文献の他の教師なしPCA拡張よりも優れていることを示すとともに,Uniform Manifold Approximation (UMAP), t-Distributed Stochastic Neighbor Embedding (tSNE), およびProjection Non-Negative Matrix Factorization (NMF) を有意差で評価した。 Single-cell RNA sequencing (scRNA-seq) is widely used to reveal heterogeneity in cells, which has given us insights into cell-cell communication, cell differentiation, and differential gene expression. However, analyzing scRNA-seq data is a challenge due to sparsity and the large number of genes involved. Therefore, dimensionality reduction and feature selection are important for removing spurious signals and enhancing downstream analysis. Traditional PCA, a main workhorse in dimensionality reduction, lacks the ability to capture geometrical structure information embedded in the data, and previous graph Laplacian regularizations are limited by the analysis of only a single scale. 
We propose a topological Principal Components Analysis (tPCA) method by the combination of persistent Laplacian (PL) technique and L$_{2,1}$ norm regularization to address multiscale and multiclass heterogeneity issues in data. We further introduce a k-Nearest-Neighbor (kNN) persistent Laplacian technique to improve the robustness of our persistent Laplacian method. The proposed kNN-PL is a new algebraic topology technique which addresses the many limitations of the traditional persistent homology. Rather than inducing filtration via the varying of a distance threshold, we introduced kNN-tPCA, where filtrations are achieved by varying the number of neighbors in a kNN network at each step, and find that this framework has significant implications for hyper-parameter tuning. We validate the efficacy of our proposed tPCA and kNN-tPCA methods on 11 diverse benchmark scRNA-seq datasets, and showcase that our methods outperform other unsupervised PCA enhancements from the literature, as well as popular Uniform Manifold Approximation (UMAP), t-Distributed Stochastic Neighbor Embedding (tSNE), and Projection Non-Negative Matrix Factorization (NMF) by significant margins.	翻訳日:2023-10-24 23:01:16 公開日:2023-10-23
# 古典的および量子ネットワークにおける量子フェデレーション学習の基礎 Foundations of Quantum Federated Learning Over Classical and Quantum Networks ( http://arxiv.org/abs/2310.14516v1 ) ライセンス: Link先を確認	Mahdi Chehimi, Samuel Yen-Chi Chen, Walid Saad, Don Towsley, M\'erouane Debbah	(参考訳) 量子フェデレーション学習(quantum federated learning, qfl)は、古典的なフェデレーション学習(fl)の利点と量子技術の計算能力を統合する新しいフレームワークである。これには量子コンピューティングと量子機械学習(QML)が含まれており、QFLは高次元の複雑なデータを扱うことができる。 QFLは、従来のFLフレームワークを超える情報理論セキュリティレベルを享受するために、古典的および量子的通信ネットワークにデプロイすることができる。本稿では,qflの課題と機会について,初めて総合的な調査を行う。特に、QFLの重要なコンポーネントを調べ、古典的および量子的ネットワークにデプロイする際に生じるユニークな課題を特定します。そして、新しいソリューションを開発し、特定された課題に対処するための有望な研究方向を明確にする。また、qflの実践的実現を進めるための行動可能な勧告も提供します。 Quantum federated learning (QFL) is a novel framework that integrates the advantages of classical federated learning (FL) with the computational power of quantum technologies. This includes quantum computing and quantum machine learning (QML), enabling QFL to handle high-dimensional complex data. QFL can be deployed over both classical and quantum communication networks in order to benefit from information-theoretic security levels surpassing traditional FL frameworks. In this paper, we provide the first comprehensive investigation of the challenges and opportunities of QFL. We particularly examine the key components of QFL and identify the unique challenges that arise when deploying it over both classical and quantum networks. We then develop novel solutions and articulate promising research directions that can help address the identified challenges. We also provide actionable recommendations to advance the practical realization of QFL.	翻訳日:2023-10-24 23:00:31 公開日:2023-10-23
# 対話状態追跡のためのターンレベルアクティブ学習 Turn-Level Active Learning for Dialogue State Tracking ( http://arxiv.org/abs/2310.14513v1 ) ライセンス: Link先を確認	Zihan Zhang, Meng Fang, Fanghua Ye, Ling Chen, Mohammad-Reza Namazi-Rad	(参考訳) 対話状態追跡(DST)はタスク指向対話システムにおいて重要な役割を果たす。しかし、大量のターンバイターンの注釈付き対話データの収集は費用がかかり非効率である。本稿では,対話中のターンを積極的に選択して注釈を付ける,新しいターンレベルアクティブラーニングフレームワークを提案する。限定的なラベリング予算を考えると,実験の結果,対話ターンの選択的アノテーションの有効性が示された。さらに,より少ない注釈データを用いて,従来の訓練手法と同等のdst性能を効果的に達成し,新たな対話データの注釈化をより効率的に行うことができる。 Dialogue state tracking (DST) plays an important role in task-oriented dialogue systems. However, collecting a large amount of turn-by-turn annotated dialogue data is costly and inefficient. In this paper, we propose a novel turn-level active learning framework for DST to actively select turns in dialogues to annotate. Given the limited labelling budget, experimental results demonstrate the effectiveness of selective annotation of dialogue turns. Additionally, our approach can effectively achieve comparable DST performance to traditional training approaches with significantly less annotated data, which provides a more efficient way to annotate new dialogue data.	翻訳日:2023-10-24 23:00:19 公開日:2023-10-23
# corefprompt:イベントタイプと引数互換性の測定によるプロンプトベースのイベントコリファレンス解決 CorefPrompt: Prompt-based Event Coreference Resolution by Measuring Event Type and Argument Compatibilities ( http://arxiv.org/abs/2310.14512v1 ) ライセンス: Link先を確認	Sheng Xu, Peifeng Li, Qiaoming Zhu	(参考訳) event coreference resolution(ecr)は、同じ実世界のイベントをクラスタに参照するイベント言及をグループ化する。以前の研究のほとんどは"encoding first, then scoring"フレームワークを採用しており、コリファレンス判断はイベントエンコーディングに依存している。さらに、現在の手法では、モデルを導くために、coreferential eventsが同じイベントタイプを持つべきであるなど、人間によるecrルールの活用に苦労している。これら2つの問題に対処するため,我々は,ECRを閉鎖型MLM(masked language model)タスクに変換するプロンプトベースのアプローチであるCorefPromptを提案する。これにより、完全な共有コンテキストを持つ単一のテンプレート内で、イベントモデリングとコリファレンスの同時識別が可能になる。さらに、イベント型互換性と引数互換性という2つの補助的なプロンプトタスクを導入し、モデルが最終的な予測を行うのに役立つECRの推論過程を明確に示す。実験の結果,CorefPromptはSOTA(State-of-the-art)ベンチマークでよく動作することがわかった。 Event coreference resolution (ECR) aims to group event mentions referring to the same real-world event into clusters. Most previous studies adopt the "encoding first, then scoring" framework, making the coreference judgment rely on event encoding. Furthermore, current methods struggle to leverage human-summarized ECR rules, e.g., coreferential events should have the same event type, to guide the model. To address these two issues, we propose a prompt-based approach, CorefPrompt, to transform ECR into a cloze-style MLM (masked language model) task. This allows for simultaneous event modeling and coreference discrimination within a single template, with a fully shared context. In addition, we introduce two auxiliary prompt tasks, event-type compatibility and argument compatibility, to explicitly demonstrate the reasoning process of ECR, which helps the model make final predictions. Experimental results show that our method CorefPrompt performs well in a state-of-the-art (SOTA) benchmark.	翻訳日:2023-10-24 23:00:10 公開日:2023-10-23
# Poster:エッジコンピューティングによるモバイルディミネートリアリティのためのリアルタイムオブジェクト置換 Poster: Real-Time Object Substitution for Mobile Diminished Reality with Edge Computing ( http://arxiv.org/abs/2310.14511v1 ) ライセンス: Link先を確認	Hongyu Ke, Haoxin Wang	(参考訳) ディミニッシュト・リアリティ(DR)は拡張現実(AR)の概念に匹敵するものと考えられており、近年、産業と学術の両方から注目を集めている。現実世界に仮想オブジェクトを追加するARとは異なり、DRはユーザーが現実世界から物理的なコンテンツを削除できる。オブジェクト置換技術と組み合わせることで、メタバース内の探索にさらにエキサイティングな道が開かれる。オブジェクト置換とDRの交わりについていくつかの研究がなされているが、高品質な移動体減少現実アーキテクチャのリアルタイムオブジェクト置換は存在しない。本稿では,エッジコンピューティングを用いたモバイルデバイスの没入型・リアルタイムシーン構築を容易にするエンドツーエンドアーキテクチャを提案する。 Diminished Reality (DR) is considered the conceptual counterpart to Augmented Reality (AR), and has recently gained increasing attention from both industry and academia. Unlike AR, which adds virtual objects to the real world, DR allows users to remove physical content from the real world. When combined with object replacement technology, it presents a further exciting avenue for exploration within the metaverse. Although a few studies have been conducted on the intersection of object substitution and DR, there is no high-quality real-time object substitution architecture for mobile diminished reality. In this paper, we propose an end-to-end architecture to facilitate immersive and real-time scene construction for mobile devices with edge computing.	翻訳日:2023-10-24 22:59:51 公開日:2023-10-23
# CITB: 継続的インストラクションチューニングのためのベンチマーク CITB: A Benchmark for Continual Instruction Tuning ( http://arxiv.org/abs/2310.14510v1 ) ライセンス: Link先を確認	Zihan Zhang, Meng Fang, Ling Chen, Mohammad-Reza Namazi-Rad	(参考訳) 連続学習(continual learning、cl)は、以前の知識を忘れて新しいタスクに移すことなく、継続的に学習し蓄積する人間の能力を再現することを目的としたパラダイムである。近年のインストラクションチューニング (IT) では, NLP の解法に適応する微調整モデルが提案されている。しかし、CLタスクのコンテキストにおいて、命令チューニングがどのように機能するかはまだ不明である。この課題は継続命令チューニング(Continuous Instruction Tuning, CIT)として定式化されている。本研究では,学習プロトコルと評価プロトコルからなるCITベンチマークを確立する。 InstrDialog と InstrDialog++ の2種類の長い対話タスクストリームを体系的に学習するためにキュレートする。実験により,既存のcl手法は豊かな自然言語命令を効果的に活用せず,逐次的に命令調整されたモデルを微調整することで,類似あるいはよりよい結果が得られることを示した。さらに、citの学習に影響を与える可能性のあるさまざまな側面を探求する。このベンチマークがさらなる研究を促進することを期待している。 Continual learning (CL) is a paradigm that aims to replicate the human ability to learn and accumulate knowledge continually without forgetting previous knowledge and transferring it to new tasks. Recent instruction tuning (IT) involves fine-tuning models to make them more adaptable to solving NLP tasks in general. However, it is still uncertain how instruction tuning works in the context of CL tasks. This challenging yet practical problem is formulated as Continual Instruction Tuning (CIT). In this work, we establish a CIT benchmark consisting of learning and evaluation protocols. We curate two long dialogue task streams of different types, InstrDialog and InstrDialog++, to study various CL methods systematically. Our experiments show that existing CL methods do not effectively leverage the rich natural language instructions, and fine-tuning an instruction-tuned model sequentially can yield similar or better results. We further explore different aspects that might affect the learning of CIT. We hope this benchmark will facilitate more research in this direction.	翻訳日:2023-10-24 22:59:38 公開日:2023-10-23
# ランダム化による優先型フィードバック効率RLの作成 Making RL with Preference-based Feedback Efficient via Randomization ( http://arxiv.org/abs/2310.14554v1 ) ライセンス: Link先を確認	Runzhe Wu, Wen Sun	(参考訳) 人間のフィードバック(RLHF)から学習する強化学習アルゴリズムは、統計複雑性、計算複雑性、クエリ複雑性の点で効率的である必要がある。本研究では,RLHF設定において,軌道のペアよりも好みの形式でフィードバックが与えられることを考察する。線形mdpモデルでは、アルゴリズム設計にランダム化を用いることで、サンプル効率のよいアルゴリズム(すなわち、最適に近い最悪の場合の後悔領域を持つ)と多項式の実行時間(すなわち、計算複雑性は関連するパラメータに関して多項式である)を提案する。提案アルゴリズムは,新しいランダム化能動的学習手法により,クエリの複雑さを最小化する。特に,本アルゴリズムは,後悔境界とクエリ複雑性との最適に近いトレードオフを示す。より一般的な非線形関数近似に拡張するために、トンプソンサンプリングのアイデアに触発されたモデルベースランダム化アルゴリズムを設計する。我々のアルゴリズムはベイズ的後悔とクエリの複雑さを最小化し、これら2つの量間のほぼ最適なトレードオフを達成する。正規RL設定における従来のトンプソンサンプリングアルゴリズムと同様に,本アルゴリズムの主な計算プリミティブはベイズ教師あり学習オラクルであり,トンプソンサンプリングアルゴリズムをRLベンチマーク問題に適用する際の経験的側面について深く研究されている。 Reinforcement Learning algorithms that learn from human feedback (RLHF) need to be efficient in terms of statistical complexity, computational complexity, and query complexity. In this work, we consider the RLHF setting where the feedback is given in the format of preferences over pairs of trajectories. In the linear MDP model, by using randomization in algorithm design, we present an algorithm that is sample efficient (i.e., has near-optimal worst-case regret bounds) and has polynomial running time (i.e., computational complexity is polynomial with respect to relevant parameters). Our algorithm further minimizes the query complexity through a novel randomized active learning procedure. In particular, our algorithm demonstrates a near-optimal tradeoff between the regret bound and the query complexity. To extend the results to more general nonlinear function approximation, we design a model-based randomized algorithm inspired by the idea of Thompson sampling. Our algorithm minimizes Bayesian regret bound and query complexity, again achieving a near-optimal tradeoff between these two quantities. 
Computation-wise, similar to the prior Thompson sampling algorithms under the regular RL setting, the main computation primitives of our algorithm are Bayesian supervised learning oracles which have been heavily investigated on the empirical side when applying Thompson sampling algorithms to RL benchmark problems.	翻訳日:2023-10-24 22:52:07 公開日:2023-10-23
# スケーラブルガウス過程回帰のための三角四分法フーリエ特性 Trigonometric Quadrature Fourier Features for Scalable Gaussian Process Regression ( http://arxiv.org/abs/2310.14544v1 ) ライセンス: Link先を確認	Kevin Li, Max Balakirsky, Simon Mak	(参考訳) 拡張ガウス過程(GP)回帰の文献においてフーリエ特徴近似がうまく適用されている。特に、ガウスの二次規則から派生したQFF(Quardrature Fourier Features)は、Random Fourier Feature (RFF)法と比較して近似精度の向上とキャリブレーションの不確実性評価の改善により近年人気を博している。しかしながら、qffの鍵となる制限は、その性能が高振動二次数に関するよく知られた病理に苦しむ可能性があることである。提案手法は, 所望のフーリエ変換に特化された新しい非ガウス二次法則を用いて, 新たな三角四分法フーリエ特徴量(TQFF)法を用いて, この問題に対処する。我々は、TQFFの正確な二次規則と、得られた特徴写像のカーネル近似誤差境界を導出する。次に, RFF と Gaussian QFF による提案手法の性能向上を数値実験および応用で実証し, より少ない特徴量を用いて, TQFF が広い範囲にわたるGP近似を満足することを示す。 Fourier feature approximations have been successfully applied in the literature for scalable Gaussian Process (GP) regression. In particular, Quadrature Fourier Features (QFF) derived from Gaussian quadrature rules have gained popularity in recent years due to their improved approximation accuracy and better calibrated uncertainty estimates compared to Random Fourier Feature (RFF) methods. However, a key limitation of QFF is that its performance can suffer from well-known pathologies related to highly oscillatory quadrature, resulting in mediocre approximation with limited features. We address this critical issue via a new Trigonometric Quadrature Fourier Feature (TQFF) method, which uses a novel non-Gaussian quadrature rule specifically tailored for the desired Fourier transform. We derive an exact quadrature rule for TQFF, along with kernel approximation error bounds for the resulting feature map. We then demonstrate the improved performance of our method over RFF and Gaussian QFF in a suite of numerical experiments and applications, and show the TQFF enjoys accurate GP approximations over a broad range of length-scales using fewer features.	翻訳日:2023-10-24 22:51:46 公開日:2023-10-23
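For context on the baseline named in the TQFF abstract above, here is a minimal sketch of standard Random Fourier Features (RFF) for the squared-exponential kernel. It is not the TQFF quadrature rule, which the abstract does not spell out. The function names `make_rff`, `rbf_kernel`, and `approx_kernel`, and the parameters `num_features`, `lengthscale`, and `seed`, are hypothetical choices for illustration.

```python
import math
import random


def make_rff(dim, num_features, lengthscale=1.0, seed=0):
    """Return a feature map z(x) with z(x) . z(y) ~= exp(-|x-y|^2 / (2 l^2)).

    Frequencies are drawn from N(0, 1/l^2) per coordinate and phases from
    Uniform(0, 2*pi), the usual Monte Carlo construction for this kernel.
    """
    rng = random.Random(seed)
    freqs = [[rng.gauss(0.0, 1.0 / lengthscale) for _ in range(dim)]
             for _ in range(num_features)]
    phases = [rng.uniform(0.0, 2.0 * math.pi) for _ in range(num_features)]
    scale = math.sqrt(2.0 / num_features)

    def features(x):
        return [scale * math.cos(sum(w_k * x_k for w_k, x_k in zip(w, x)) + b)
                for w, b in zip(freqs, phases)]

    return features


def rbf_kernel(x, y, lengthscale=1.0):
    """Exact squared-exponential kernel, used here as the reference value."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * lengthscale ** 2))


def approx_kernel(features, x, y):
    """Kernel estimate as the inner product of the two feature vectors."""
    return sum(a * b for a, b in zip(features(x), features(y)))
```

The Monte Carlo error of this estimate shrinks only as the inverse square root of `num_features`, which is the kind of inefficiency that quadrature-based feature maps such as QFF and TQFF aim to reduce.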
# 制御された生成課題における大規模言語モデルの評価 Evaluating Large Language Models on Controlled Generation Tasks ( http://arxiv.org/abs/2310.14542v1 ) ライセンス: Link先を確認	Jiao Sun, Yufei Tian, Wangchunshu Zhou, Nan Xu, Qian Hu, Rahul Gupta, John Frederick Wieting, Nanyun Peng, Xuezhe Ma	(参考訳) 近年の研究では,質問生成,読解,多言語など,様々なベンチマークタスクにおける大規模言語モデルの能力について検討されているが,生成タスクにおける大規模言語モデルの制御性についての研究は少ない。粒度の異なる文計画ベンチマークを含む,様々なベンチマークの広範な分析を行う。大規模言語モデルと最先端の微調整された小型モデルを比較した後、大規模言語モデルが後方に落ちたり、比較されたり、小型モデルの能力を超えたりしたスペクトルを示す。我々は大きな言語モデルがきめ細かい制約を満たすのに苦労していると結論づける。 While recent studies have looked into the abilities of large language models in various benchmark tasks, including question generation, reading comprehension, and multilingual tasks, there have been few studies looking into the controllability of large language models on generation tasks. We present an extensive analysis of various benchmarks including a sentence planning benchmark with different granularities. After comparing large language models against state-of-the-art finetuned smaller models, we present a spectrum showing where large language models fall behind, are comparable to, or exceed the ability of smaller models. We conclude that large language models struggle to meet fine-grained hard constraints.	翻訳日:2023-10-24 22:51:27 公開日:2023-10-23
# 破滅的忘却を伴わない継続的名前付きエンティティ認識 Continual Named Entity Recognition without Catastrophic Forgetting ( http://arxiv.org/abs/2310.14541v1 ) ライセンス: Link先を確認	Duzhen Zhang, Wei Cong, Jiahua Dong, Yahan Yu, Xiuyi Chen, Yonggang Zhang, Zhen Fang	(参考訳) 継続的名前付きエンティティ認識(Continual Named Entity Recognition, CNER)は、新しいエンティティタイプを順次組み込むことで既存のモデルを更新する、急成長中の分野である。それでも、継続学習のアプローチは、しばしば破滅的忘却に深刻に悩まされる。この問題は、CNERにおいて、以前のステップの古いエンティティタイプが各ステップで非エンティティタイプに統合されることにより深刻化し、非エンティティタイプのセマンティックシフト問題と呼ばれる問題に繋がる。本稿では,古いエンティティタイプの知識の保持と新しいエンティティタイプの獲得とのトレードオフを巧みに調整し,破滅的忘却の問題をより効果的に緩和する,プールド特徴蒸留損失を導入する。さらに,非エンティティタイプに対する信頼度に基づく擬似ラベリング,すなわち,非エンティティタイプのセマンティックシフトを処理するために古いモデルを用いてエンティティタイプを予測する手法を開発する。擬似ラベリング処理に続いて,偏ったタイプ分布の問題に対処するための適応的再重み付けによるタイプバランス学習戦略を提案する。3つの異なるデータセットを用いて10個のCNER設定に関する総合的な実験を行った。以上の結果から,本手法は従来の最先端手法よりも有意に優れており,Micro F1スコアとMacro F1スコアでそれぞれ平均6.3%,8.0%の改善が認められている。 Continual Named Entity Recognition (CNER) is a burgeoning area, which involves updating an existing model by incorporating new entity types sequentially. Nevertheless, continual learning approaches are often severely afflicted by catastrophic forgetting. This issue is intensified in CNER due to the consolidation of old entity types from previous steps into the non-entity type at each step, leading to what is known as the semantic shift problem of the non-entity type. In this paper, we introduce a pooled feature distillation loss that skillfully navigates the trade-off between retaining knowledge of old entity types and acquiring new ones, thereby more effectively mitigating the problem of catastrophic forgetting. Additionally, we develop a confidence-based pseudo-labeling for the non-entity type, i.e., predicting entity types using the old model to handle the semantic shift of the non-entity type. Following the pseudo-labeling process, we suggest an adaptive re-weighting type-balanced learning strategy to handle the issue of biased type distribution. We carried out comprehensive experiments on ten CNER settings using three different datasets. 
The results illustrate that our method significantly outperforms prior state-of-the-art approaches, registering an average improvement of $6.3$\% and $8.0$\% in Micro and Macro F1 scores, respectively.	翻訳日:2023-10-24 22:51:17 公開日:2023-10-23
# 大規模言語モデルの空間理解の評価 Evaluating Spatial Understanding of Large Language Models ( http://arxiv.org/abs/2310.14540v1 ) ライセンス: Link先を確認	Yutaro Yamada, Yihan Bao, Andrew K. Lampinen, Jungo Kasai, Ilker Yildirim	(参考訳) 大規模言語モデル(LLM)は、様々なタスクにまたがる優れた能力を示している。モデルは訓練中にテキストしか見ていないにもかかわらず、最近のいくつかの研究は、LLM表現が、その基礎にあるグラウンディングされた概念の側面を暗黙的に捉えていることを示唆している。本稿では,グラウンディングされた知識のうち特に顕著な種類である空間的関係のLLM表現について考察する。自然言語ナビゲーションタスクを設計し,LLM,特にGPT-3.5-turbo,GPT-4,Llama2シリーズモデルが空間構造を表現・推論する能力を評価し,同じタスクにおける人間のパフォーマンスと比較する。これらのタスクは、正方形、六角形、三角形の格子、環、木など、異なる空間構造におけるLLM性能のかなりのばらつきを示す。また、LLMは人間と同様、空間地図を維持するためのランドマークとしてオブジェクト名を利用することがわかった。最後に,広範なエラー分析により,LLMの誤りは空間的要因と非空間的要因の両方を反映していることが判明した。これらのことから, LLMは空間構造の特定の側面を暗黙的に捉えているように見えるが, 改善の余地は残されている。 Large language models (LLMs) show remarkable capabilities across a variety of tasks. Despite the models only seeing text in training, several recent studies suggest that LLM representations implicitly capture aspects of the underlying grounded concepts. Here, we explore LLM representations of a particularly salient kind of grounded knowledge -- spatial relationships. We design natural-language navigation tasks and evaluate the ability of LLMs, in particular GPT-3.5-turbo, GPT-4, and Llama2 series models, to represent and reason about spatial structures, and compare these abilities to human performance on the same tasks. These tasks reveal substantial variability in LLM performance across different spatial structures, including square, hexagonal, and triangular grids, rings, and trees. We also discover that, similar to humans, LLMs utilize object names as landmarks for maintaining spatial maps. Finally, in extensive error analysis, we find that LLMs' mistakes reflect both spatial and non-spatial factors. These findings suggest that LLMs appear to capture certain aspects of spatial structure implicitly, but room for improvement remains.	翻訳日:2023-10-24 22:50:55 公開日:2023-10-23
# デコードインターベンションによるSeq2Seq文法誤り訂正の改善 Improving Seq2Seq Grammatical Error Correction via Decoding Interventions ( http://arxiv.org/abs/2310.14534v1 ) ライセンス: Link先を確認	Houquan Zhou, Yumeng Liu, Zhenghua Li, Min Zhang, Bo Zhang, Chen Li, Ji Zhang, Fei Huang	(参考訳) シークエンス・ツー・シークエンス(Seq2Seq)アプローチは近年,文法的誤り訂正(GEC)に広く使われ,有望な性能を示している。しかし、Seq2Seq GECアプローチには2つの問題がある。第一に、seq2seq gecモデルは並列データでしか訓練できないため、gecタスクではノイズが多く、量も限られることが多い。第2に、Seq2Seq GECモデルのデコーダは、生成されるトークンの正確性を明確に認識していない。本稿では,外部の批評家を駆使して,インクリメンタルに生成すべきトークンの適切性を評価し,次に次のトークンの選択に動的に影響を及ぼす統一復号処理フレームワークを提案する。予備訓練された左右言語モデル評論家と段階的目標側の文法的誤り検出評論家の2つのタイプの批評家を発見し,検討した。英語と中国語のデータセットに関する広範な実験を通じて、我々のフレームワークは一貫して強いベースラインを上回り、最先端の手法と競合する結果を得る。 The sequence-to-sequence (Seq2Seq) approach has recently been widely used in grammatical error correction (GEC) and shows promising performance. However, the Seq2Seq GEC approach still suffers from two issues. First, a Seq2Seq GEC model can only be trained on parallel data, which, in GEC task, is often noisy and limited in quantity. Second, the decoder of a Seq2Seq GEC model lacks an explicit awareness of the correctness of the token being generated. In this paper, we propose a unified decoding intervention framework that employs an external critic to assess the appropriateness of the token to be generated incrementally, and then dynamically influence the choice of the next token. We discover and investigate two types of critics: a pre-trained left-to-right language model critic and an incremental target-side grammatical error detector critic. Through extensive experiments on English and Chinese datasets, our framework consistently outperforms strong baselines and achieves results competitive with state-of-the-art methods.	翻訳日:2023-10-24 22:50:35 公開日:2023-10-23
# オンラインソーシャルプラットフォームにおけるユーザエンゲージメントのコンテキストアウェア予測 Context-Aware Prediction of User Engagement on Online Social Platforms ( http://arxiv.org/abs/2310.14533v1 ) ライセンス: Link先を確認	Heinrich Peters, Yozen Liu, Francesco Barbieri, Raiyan A. Baten, Sandra C. Matz, Maarten W. Bos	(参考訳) オンラインソーシャルプラットフォームの成功は、大規模なユーザーの行動を予測し、理解する能力にかかっている。本稿では,コンテキスト認識モデリング手法が,オンラインソーシャルプラットフォームにおけるユーザエンゲージメントの総合的かつ軽量で,潜在的にプライバシー保護的な表現を提供する可能性を示唆するデータを提示する。深層LSTMニューラルネットワークを活用して、約8万人のユーザによる1億以上のSnapchatセッションを分析し、アクティブおよびパッシブな使用パターンが過去の行動から予測可能であること(R2=0.345)、コンテキスト情報の統合が行動ベースラインモデルと比較して予測性能を大幅に向上させること(R2=0.522)を実証した。スマートフォンの接続状況,位置,時間的状況,天候に関連する特徴は,アプリ内行動の履歴から得られた特徴に対して,ユーザエンゲージメントの非冗長な分散を捉えていることがわかった。さらに,瞬間的な文脈情報を考慮すれば,最小限の行動履歴でも分散の大部分を説明できることを示す(R2=0.44)。これらの結果は、長いデータ履歴の必要性を減らし、モデルをより効率的かつプライバシー保護的にするためのコンテキスト認識アプローチの可能性を示している。最後に,モデル説明可能性手法を用いて,基礎となる行動メカニズムについて予備的な考察を行う。我々の知見は,文脈依存的で習慣駆動型のアクティブおよびパッシブな使用パターンという概念と一致しており,ソーシャルプラットフォーム上でのユーザエンゲージメントを予測するうえで,ユーザ行動の文脈化表現の価値を強調している。 The success of online social platforms hinges on their ability to predict and understand user behavior at scale. Here, we present data suggesting that context-aware modeling approaches may offer a holistic yet lightweight and potentially privacy-preserving representation of user engagement on online social platforms. Leveraging deep LSTM neural networks to analyze more than 100 million Snapchat sessions from almost 80,000 users, we demonstrate that patterns of active and passive use are predictable from past behavior (R2=0.345) and that the integration of context information substantially improves predictive performance compared to the behavioral baseline model (R2=0.522). Features related to smartphone connectivity status, location, temporal context, and weather were found to capture non-redundant variance in user engagement relative to features derived from histories of in-app behaviors. Further, we show that a large proportion of variance can be accounted for with minimal behavioral histories if momentary context information is considered (R2=0.44). 
These results indicate the potential of context-aware approaches for making models more efficient and privacy-preserving by reducing the need for long data histories. Finally, we employ model explainability techniques to glean preliminary insights into the underlying behavioral mechanisms. Our findings are consistent with the notion of context-contingent, habit-driven patterns of active and passive use, underscoring the value of contextualized representations of user behavior for predicting user engagement on social platforms.	翻訳日:2023-10-24 22:50:03 公開日:2023-10-23
# 同期・融合による実用的深分散透かし Practical Deep Dispersed Watermarking with Synchronization and Fusion ( http://arxiv.org/abs/2310.14532v1 ) ライセンス: Link先を確認	Hengchang Guo, Qilong Zhang, Junwei Luo, Feng Guo, Wenbin Zhang, Xiaodong Su, Minglei Li	(参考訳) 深層学習に基づくブラインド透かしの研究が徐々に現れ、印象的なパフォーマンスを達成している。しかし、従来のディープ透かし研究は主に固定された低解像度画像に焦点をあてる一方で、任意の解像度の画像、特に近年広く普及している高解像度画像にはあまり注意を払っていない。さらに、ほとんどの研究は典型的な非幾何学的攻撃(例えば、JPEG圧縮)に対する堅牢性を示すが、一般的な幾何学的攻撃(例えば、回転)やより挑戦的な複合攻撃は無視している。上記の制限を克服するため、実用的な深層透かし手法である Dispersed Watermarking with Synchronization and Fusion (DWSF) を提案する。具体的には,任意の解像度のカバー画像が与えられた場合,固定された小サイズのカバーブロックを疎かつランダムに複数選択し,十分に訓練されたエンコーダによって一貫した透かしメッセージを埋め込む分散埋め込み方式を採用する。抽出段階において、まず、ノイズが加わった透かし画像中の符号化されたブロックの位置特定と補正を行うための透かし同期モジュールを設計する。次に、デコーダを用いてこれらのブロックに埋め込まれたメッセージを取得し、類似性に基づいたメッセージ融合戦略を提案して、メッセージ間の一貫性を十分に活用し、信頼できるメッセージを決定する。異なるデータセットで実施した大規模な実験は,提案するDWSFの有効性を実証している。最先端の手法と比較して、単一攻撃と複合攻撃に対するビット精度をそれぞれ平均5.28%と5.93%改善し、ファイルサイズの増加が少なく、視覚品質も優れている。私たちのコードはhttps://github.com/bytedance/DWSFで利用可能です。 Deep learning based blind watermarking works have gradually emerged and achieved impressive performance. However, previous deep watermarking studies mainly focus on fixed low-resolution images while paying less attention to arbitrary resolution images, especially widespread high-resolution images nowadays. Moreover, most works usually demonstrate robustness against typical non-geometric attacks (e.g., JPEG compression) but ignore common geometric attacks (e.g., Rotate) and more challenging combined attacks. To overcome the above limitations, we propose a practical deep Dispersed Watermarking with Synchronization and Fusion, called DWSF. Specifically, given an arbitrary-resolution cover image, we adopt a dispersed embedding scheme which sparsely and randomly selects several fixed small-size cover blocks to embed a consistent watermark message by a well-trained encoder. 
In the extraction stage, we first design a watermark synchronization module to locate and rectify the encoded blocks in the noised watermarked image. We then utilize a decoder to obtain messages embedded in these blocks, and propose a message fusion strategy based on similarity to make full use of the consistency among messages, thus determining a reliable message. Extensive experiments conducted on different datasets convincingly demonstrate the effectiveness of our proposed DWSF. Compared with state-of-the-art approaches, our blind watermarking achieves better performance: on average, it improves bit accuracy by 5.28% and 5.93% against single and combined attacks, respectively, and shows a smaller file size increment and better visual quality. Our code is available at https://github.com/bytedance/DWSF.	翻訳日:2023-10-24 22:49:16 公開日:2023-10-23
# タスク指向対話システムのためのデュアルフィードバック知識検索 Dual-Feedback Knowledge Retrieval for Task-Oriented Dialogue Systems ( http://arxiv.org/abs/2310.14528v1 ) ライセンス: Link先を確認	Tianyuan Shi, Liangzhi Li, Zijian Lin, Tao Yang, Xiaojun Quan, Qifan Wang	(参考訳) 効率的な知識検索は、ユーザの要求を満たすために必要な関連情報の選択を容易にすることで、エンド・ツー・エンドのタスク指向対話システムの成功に重要な役割を果たす。しかし、現在のアプローチは一般的に知識検索と応答生成を統合しており、大規模な知識ベースを扱う際にスケーラビリティ上の課題を引き起こす。オープンドメインの質問応答から着想を得て,レトリバーを用いて関連する知識を検索し,ジェネレータを用いてシステム応答を生成するレトリバー-ジェネレータアーキテクチャを提案する。レトリバーの訓練ラベルが存在しないため、ジェネレータからのフィードバックを擬似ラベルとして用いてレトリバーを訓練することを提案する。これを実現するために, ジェネレータの出力に基づいて正と負の両方のフィードバックを生成するデュアルフィードバック機構を導入する。3つのベンチマークデータセットにおける実験結果が示すとおり,本手法はタスク指向対話タスクにおいて優れた性能を示す。 Efficient knowledge retrieval plays a pivotal role in ensuring the success of end-to-end task-oriented dialogue systems by facilitating the selection of relevant information necessary to fulfill user requests. However, current approaches generally integrate knowledge retrieval and response generation, which poses scalability challenges when dealing with extensive knowledge bases. Taking inspiration from open-domain question answering, we propose a retriever-generator architecture that harnesses a retriever to retrieve pertinent knowledge and a generator to generate system responses. Due to the lack of retriever training labels, we propose relying on feedback from the generator as pseudo-labels to train the retriever. To achieve this, we introduce a dual-feedback mechanism that generates both positive and negative feedback based on the output of the generator. Our method demonstrates superior performance in task-oriented dialogue tasks, as evidenced by experimental results on three benchmark datasets.	翻訳日:2023-10-24 22:48:36 公開日:2023-10-23
# 周辺ノードは重要である: グラフにおける構造フェアネスを目指して Marginal Nodes Matter: Towards Structure Fairness in Graphs ( http://arxiv.org/abs/2310.14527v1 ) ライセンス: Link先を確認	Xiaotian Han, Kaixiong Zhou, Ting-Hsiang Wang, Jundong Li, Fei Wang, Na Zou	(参考訳) 社会ネットワークでは、周辺部に位置する人(周辺ノード)は、中心にいる人と比較して不公平に扱われる可能性が高い。グラフ上の既存の公平性研究は主に機密属性(例えば年齢や性別)の保護に焦点を当てているが、グラフ構造によって引き起こされる公平性にも注意を払うべきである。一方、周辺ノードは他のノードから遠く離れている場合が多いため、グラフニューラルネットワークの情報集約機構はそのような構造的不公平性を増幅する。本稿では,グラフニューラルネットワークにおいてグラフ構造がもたらす新しい公平性,すなわち構造フェアネス(structure fairness)に着目する。具体的には、まず複数のグラフを分析し、グラフニューラルネットワークにおいて周辺ノードは他のノードよりも下流タスクの性能が低いことを観測した。この観測に動機づけられ,近傍拡張に基づく構造デバイアスとホップを考慮した注意的情報集約を組み合わせて構造フェアネスを実現する Structural Fair Graph Neural Network (SFairGNN) を提案する。実験の結果,SFairGNNは下流タスクにおける全体的な性能を維持しながら,構造フェアネスを大幅に改善できることがわかった。 In social networks, a person located at the periphery region (marginal node) is likely to be treated unfairly when compared with the persons at the center. While existing fairness works on graphs mainly focus on protecting sensitive attributes (e.g., age and gender), the fairness incurred by the graph structure should also be given attention. On the other hand, the information aggregation mechanism of graph neural networks amplifies such structure unfairness, as marginal nodes are often far away from other nodes. In this paper, we focus on novel fairness incurred by the graph structure on graph neural networks, named structure fairness. Specifically, we first analyzed multiple graphs and observed that marginal nodes in graphs have a worse performance of downstream tasks than others in graph neural networks. Motivated by the observation, we propose Structural Fair Graph Neural Network (SFairGNN), which combines neighborhood expansion based structure debiasing with hop-aware attentive information aggregation to achieve structure fairness. Our experiments show SFairGNN can significantly improve structure fairness while maintaining overall performance in the downstream tasks.	翻訳日:2023-10-24 22:48:14 公開日:2023-10-23
# S3Aug: アクション認識のためのセグメンテーション、サンプリング、シフト S3Aug: Segmentation, Sampling, and Shift for Action Recognition ( http://arxiv.org/abs/2310.14556v1 ) ライセンス: Link先を確認	Taiki Sugiura, Toru Tamaki	(参考訳) 行動認識はコンピュータビジョンの研究において確立された分野である。本稿では,アクション認識のためのビデオデータ拡張であるS3Augを提案する。2つのビデオから領域を切り貼りする従来のビデオデータ拡張手法とは異なり,提案手法ではセグメンテーションとラベル・ツー・イメージ変換により,単一のトレーニングビデオから新たなビデオを生成する。さらに,提案手法では,ラベル画像の特定のカテゴリをサンプリングによって変更して様々な映像を生成し,中間特徴をシフトすることで,生成映像のフレーム間の時間的コヒーレンスを高める。UCF101、HMDB51、Mimeticsデータセットの実験結果は、特にMimeticsデータセットの文脈外のビデオに対して、提案手法の有効性を示している。 Action recognition is a well-established area of research in computer vision. In this paper, we propose S3Aug, a video data augmentation for action recognition. Unlike conventional video data augmentation methods that involve cutting and pasting regions from two videos, the proposed method generates new videos from a single training video through segmentation and label-to-image transformation. Furthermore, the proposed method modifies certain categories of label images by sampling to generate a variety of videos, and shifts intermediate features to enhance the temporal coherency between frames of the generated videos. Experimental results on the UCF101, HMDB51, and Mimetics datasets demonstrate the effectiveness of the proposed method, particularly for out-of-context videos of the Mimetics dataset.	翻訳日:2023-10-24 22:42:21 公開日:2023-10-23
# 階層型ガウス過程とニューラルネットワーク回帰によるカリフォルニア・セントラルバレーの地下水位モデリング Modeling groundwater levels in California's Central Valley by hierarchical Gaussian process and neural network regression ( http://arxiv.org/abs/2310.14555v1 ) ライセンス: Link先を確認	Anshuman Pradhan, Kyra H. Adams, Venkat Chandrasekaran, Zhen Liu, John T. Reager, Andrew M. Stuart and Michael J. Turmon	(参考訳) カリフォルニアのセントラルバレー(cv)の地下水位を連続的にモデル化することは、低品質の井戸データによって困難である。 CV帯水層における3次元岩相テクスチャモデルから地下水位をモデル化するための新しい機械学習手法を提案する。提案法は,ガウス過程(GP)とディープニューラルネットワーク(DNN)を組み合わせて多変量回帰を行う。提案する階層的モデリング手法は、GPによる非パラメトリック回帰が実行されるリソロジー的に情報を得た潜在空間を学ぶためにDNNを訓練する。この手法は、2015年から2020年にかけてCVの地下水位をモデル化するために適用された。高速かつ確実な不確実性定量化を伴う井戸データの非定常特徴をモデル化するためのGP-DNN回帰の有効性を示す。以上の結果から,2017年と2019年のカリフォルニアの湿潤年は,過去の干ばつによる地下水損失の補充にはほとんど効果がなかったことが示唆された。 Modeling groundwater levels continuously across California's Central Valley (CV) hydrological system is challenging due to low-quality well data which is sparsely and noisily sampled across time and space. A novel machine learning method is proposed for modeling groundwater levels by learning from a 3D lithological texture model of the CV aquifer. The proposed formulation performs multivariate regression by combining Gaussian processes (GP) and deep neural networks (DNN). Proposed hierarchical modeling approach constitutes training the DNN to learn a lithologically informed latent space where non-parametric regression with GP is performed. The methodology is applied for modeling groundwater levels across the CV during 2015 - 2020. We demonstrate the efficacy of GP-DNN regression for modeling non-stationary features in the well data with fast and reliable uncertainty quantification. Our results indicate that the 2017 and 2019 wet years in California were largely ineffective in replenishing the groundwater loss caused during previous drought years.	翻訳日:2023-10-24 22:42:07 公開日:2023-10-23
# 部分観測環境における相手位置のノイズ除去 Denoising Opponents Position in Partial Observation Environment ( http://arxiv.org/abs/2310.14553v1 ) ライセンス: Link先を確認	Aref Sayareh, Aria Sardari, Vahid Khoddami, Nader Zare, Vinicius Prado da Fonseca, Amilcar Soares	(参考訳) ロボカップ大会では様々なリーグが開催されており、サッカーシミュレーション2Dリーグはその主要なリーグの一つである。サッカーシミュレーション2D (SS2D) の試合では、それぞれ11人の選手と1人のコーチからなる2チームが対戦する。プレイヤーは試合中、サッカーシミュレーションサーバとしか通信できない。チーム開発を容易にするため、いくつかのコードベースが公開されており、研究者は意思決定と機械学習手法の実装に容易に集中できる。SS2Dの行動と振る舞いは、ノイズや部分観測といった課題のために、部分的にしか正確でない。したがって、1つの戦略は、観測の不正確さに対処するためのノイズ除去手法を実装することである。我々のアイデアは、機械学習手法を用いて、有限数のサイクルにわたって相手が観測されていない間にその位置を予測し、パスのような行動をより正確に行えるようにすることである。本稿では,Long Short-Term Memory (LSTM) モデルとDeep Neural Networks (DNN) を用いた位置予測について説明する。その結果,LSTM と DNN は,Last-Seen 法のような標準アルゴリズムよりも,相手の位置を正確に予測できることがわかった。 The RoboCup competitions hold various leagues, and the Soccer Simulation 2D League is one of the major ones. A Soccer Simulation 2D (SS2D) match involves two teams, each with 11 players and a coach, competing against each other. The players can only communicate with the Soccer Simulation Server during the game. Several code bases have been released publicly to simplify team development, so researchers can easily focus on decision-making and implementing machine learning methods. SS2D actions and behaviors are only partially accurate due to different challenges, such as noise and partial observation. Therefore, one strategy is to implement alternative denoising methods to tackle observation inaccuracy. Our idea is to use machine learning methods to predict opponent positions while they have not been seen for a finite number of cycles, enabling more accurate actions such as passing. We will explain our position prediction idea powered by Long Short-Term Memory models (LSTM) and Deep Neural Networks (DNN). The results show that the LSTM and DNN predict the opponents' position more accurately than the standard algorithm, such as the last-seen method.	翻訳日:2023-10-24 22:41:47 公開日:2023-10-23
# kindmed: 薬の推奨のための知識誘導薬処方ネットワーク KindMed: Knowledge-Induced Medicine Prescribing Network for Medication Recommendation ( http://arxiv.org/abs/2310.14552v1 ) ライセンス: Link先を確認	Ahmad Wisnu Mulyadi, Heung-Il Suk	(参考訳) 電子健康記録(EHR)の広範囲な採用は、様々な臨床分析においてその利用の機会を提供する。我々は、EHRコホートを外部知識(例えば、標準化された医学オントロジーとWeb上でキュレーションされた豊かな意味論)に豊かにすることにより、より包括的な洞察を得ることができる。本稿では,EHRコホート上に無数の医療関連外部ソースから知識を誘導し,医療知識グラフ(KG)として表現することで,医療を推奨する新しい知識誘導医療処方ネットワーク(KindMed)フレームワークを提案する。このようなkgsを適切に組み込むための関係認識グラフ表現学習に加えて,階層的シーケンス学習を応用し,患者の過去の入院状況における臨床・医学的時間動態の検出と融合を行い,パーソナライズドレコメンデーションを奨励する。安全で正確でパーソナライズされた医薬の予測において、我々は、患者の3つの重要な側面、すなわち、ジョイント・ヒストリー・メディカル・レコードの要約、臨床状態の進展、そして現在の臨床状態について説明し、関連付ける注意深い処方を考案する。グラフ駆動の競合するベースラインに対する主要なパフォーマンスをエッチングし,実世界のEHRコホートに対するKindMedの有効性を示した。 Extensive adoption of electronic health records (EHRs) offers opportunities for its use in various clinical analyses. We could acquire more comprehensive insights by enriching an EHR cohort with external knowledge (e.g., standardized medical ontology and wealthy semantics curated on the web) as it divulges a spectrum of informative relations between observed medical codes. This paper proposes a novel Knowledge-Induced Medicine Prescribing Network (KindMed) framework to recommend medicines by inducing knowledge from myriad medical-related external sources upon the EHR cohort, rendering them as medical knowledge graphs (KGs). On top of relation-aware graph representation learning to unravel an adequate embedding of such KGs, we leverage hierarchical sequence learning to discover and fuse clinical and medicine temporal dynamics across patients' historical admissions for encouraging personalized recommendations. In predicting safe, precise, and personalized medicines, we devise an attentive prescribing that accounts for and associates three essential aspects, i.e., a summary of joint historical medical records, clinical condition progression, and the current clinical state of patients. 
We exhibited the effectiveness of our KindMed on the augmented real-world EHR cohorts, etching leading performances against graph-driven competing baselines.	翻訳日:2023-10-24 22:41:29 公開日:2023-10-23
# 一般関数近似を用いた汚損に頑健なオフライン強化学習 Corruption-Robust Offline Reinforcement Learning with General Function Approximation ( http://arxiv.org/abs/2310.14550v1 ) ライセンス: Link先を確認	Chenlu Ye, Rui Yang, Quanquan Gu, Tong Zhang	(参考訳) 一般関数近似を用いたオフライン強化学習(RL)における汚損に対する頑健性の問題を検討する。ここでは、敵がオフラインデータセット内の各サンプルを汚損でき,汚損レベル$\zeta\geq0$が$n$エピソードと$H$ステップにわたる累積汚損量を定量化する。我々のゴールは、このような汚損に対して頑健で、汚損のないマルコフ決定過程(MDP)の最適方策に対する準最適性ギャップを最小化する方策を見つけることである。頑健なオンラインRL設定 \citep{he2022nearly,ye2022corruptionrobust} における不確実性重み付け手法から着想を得て,バッチサンプル上で効率的に計算できる新しい不確実性重み反復手順を設計し,オフラインRLのための汚損に頑健なアルゴリズムを提案する。特に、単一方策カバレッジと$\zeta$が既知であるという仮定の下で、提案アルゴリズムは、汚損によって$\mathcal O(\zeta \cdot (\text{CC}(\lambda,\hat{\mathcal F},\mathcal Z_n^H))^{1/2} (C(\hat{\mathcal F},\mu))^{-1/2} n^{-1})$の加法的な項だけ悪化する準最適性境界を達成する。ここで、$\text{CC}(\lambda,\hat{\mathcal F},\mathcal Z_n^H)$は正則化パラメータ$\lambda$、信頼集合$\hat{\mathcal F}$、データセット$\mathcal Z_n^H$に依存するカバレッジ係数であり、$C(\hat{\mathcal F},\mu)$は$\hat{\mathcal F}$と基礎となるデータ分布$\mu$に依存する係数である。線形MDPに特化する場合、汚損に依存する誤差項は$\mathcal O(\zeta d n^{-1})$に帰着する。ここで$d$は特徴写像の次元であり、これは汚損のある線形MDPに対する既存の下界と一致する。このことは、我々の分析が汚損に依存する項に関してタイトであることを示唆している。 We investigate the problem of corruption robustness in offline reinforcement learning (RL) with general function approximation, where an adversary can corrupt each sample in the offline dataset, and the corruption level $\zeta\geq0$ quantifies the cumulative corruption amount over $n$ episodes and $H$ steps. Our goal is to find a policy that is robust to such corruption and minimizes the suboptimality gap with respect to the optimal policy for the uncorrupted Markov decision processes (MDPs). Drawing inspiration from the uncertainty-weighting technique from the robust online RL setting \citep{he2022nearly,ye2022corruptionrobust}, we design a new uncertainty weight iteration procedure to efficiently compute on batched samples and propose a corruption-robust algorithm for offline RL. 
Notably, under the assumption of single policy coverage and the knowledge of $\zeta$, our proposed algorithm achieves a suboptimality bound that is worsened by an additive factor of $\mathcal O(\zeta \cdot (\text{CC}(\lambda,\hat{\mathcal F},\mathcal Z_n^H))^{1/2} (C(\hat{\mathcal F},\mu))^{-1/2} n^{-1})$ due to the corruption. Here $\text{CC}(\lambda,\hat{\mathcal F},\mathcal Z_n^H)$ is the coverage coefficient that depends on the regularization parameter $\lambda$, the confidence set $\hat{\mathcal F}$, and the dataset $\mathcal Z_n^H$, and $C(\hat{\mathcal F},\mu)$ is a coefficient that depends on $\hat{\mathcal F}$ and the underlying data distribution $\mu$. When specialized to linear MDPs, the corruption-dependent error term reduces to $\mathcal O(\zeta d n^{-1})$ with $d$ being the dimension of the feature map, which matches the existing lower bound for corrupted linear MDPs. This suggests that our analysis is tight in terms of the corruption-dependent term.	翻訳日:2023-10-24 22:41:04 公開日:2023-10-23
# ビッグデータを用いたパンデミックモデルのためのマルチモーダルグラフ学習 Multimodal Graph Learning for Modeling Emerging Pandemics with Big Data ( http://arxiv.org/abs/2310.14549v1 ) ライセンス: Link先を確認	Khanh-Tung Tran, Truong Son Hy, Lili Jiang, Xuan-Son Vu	(参考訳) パンデミックの正確な予測と分析は、効果的な公衆衛生管理と意思決定において重要な役割を果たす。従来のアプローチは主に疫学的データに依存しており、パンデミックのパターンのセンサーや指標として機能する他の貴重な情報ソースを見渡す。本稿では,時間グラフニューラルネットワークとマルチモーダルデータを統合し,学習と予測を行う新しいフレームワークmgl4mepを提案する。特定の事前学習された言語モデルを利用して,ソーシャルメディアコンテンツを含むビッグデータソースを取り込み,ユーザ間のグラフ構造を探索する。この統合は、時間グラフニューラルネットワークによる学習を通じて、パンデミックダイナミクスの豊富な指標を提供する。広範な実験により,パンデミック予測と分析,各地域におけるベースライン手法,パンデミック状況,予測の地平線を上回って,その効果を実証した。時間的グラフ学習とマルチモーダルデータの融合により、パンデミックの風景をより少ない時間ラグ、安価なコスト、潜在的な情報指標で包括的に理解することができる。 Accurate forecasting and analysis of emerging pandemics play a crucial role in effective public health management and decision-making. Traditional approaches primarily rely on epidemiological data, overlooking other valuable sources of information that could act as sensors or indicators of pandemic patterns. In this paper, we propose a novel framework called MGL4MEP that integrates temporal graph neural networks and multi-modal data for learning and forecasting. We incorporate big data sources, including social media content, by utilizing specific pre-trained language models and discovering the underlying graph structure among users. This integration provides rich indicators of pandemic dynamics through learning with temporal graph neural networks. Extensive experiments demonstrate the effectiveness of our framework in pandemic forecasting and analysis, outperforming baseline methods across different areas, pandemic situations, and prediction horizons. The fusion of temporal graph learning and multi-modal data enables a comprehensive understanding of the pandemic landscape with less time lag, cheap cost, and more potential information indicators.	翻訳日:2023-10-24 22:40:10 公開日:2023-10-23
# Test Smell: ソフトウェアテストにおける寄生的エネルギー消費 Test Smell: A Parasitic Energy Consumer in Software Testing ( http://arxiv.org/abs/2310.14548v1 ) ライセンス: Link先を確認	Md Rakib Hossain Misu, Jiawei Li, Adithya Bhattiprolu, Yang Liu, Eduardo Almeida, and Iftekhar Ahmed	(参考訳) 伝統的に、エネルギー効率の研究はハードウェアレベルでのエネルギー消費を減らすことに焦点を当てており、最近ではソフトウェア開発ライフサイクルの設計とコーディングのフェーズにも広がっている。しかし、ソフトウェアテストがエネルギー消費に与える影響は研究コミュニティから注目されてこなかった。具体的には、テストコードの設計品質とテストの臭い(例えば、テストコードにおける準最適な設計や悪いプラクティス)がエネルギー消費に与える影響についてはまだ調査されていない。本研究は,テストの臭いとソフトウェアテストにおけるエネルギー消費への影響との関連を分析するために,Apacheの12のプロジェクトを調査した。ソフトウェア(Apacheプロジェクトにおけるデータマイニング)と開発者の視点(62人のソフトウェア実践者を対象とした調査)の2つの次元から,混合手法による実証分析を行った。我々の発見は次のとおりである。 1) テストの臭いはソフトウェアテストのエネルギー消費と関連している。具体的には、テストケースの臭いのある部分は、臭いのない部分に比べて10.92%多くエネルギーを消費する。 2) 特定のテストの臭いは, 他よりもエネルギーを多く消費する。 3) リファクタリングされたテストケースは、臭いのあるテストケースに比べてエネルギー消費が少ない傾向がある。 4) ほとんどの開発者は、テストの臭いがエネルギー消費に与える影響について知識を欠いている。本論文は,今後の研究と開発の指針となるいくつかの観察によって結論づける。 Traditionally, energy efficiency research has focused on reducing energy consumption at the hardware level and, more recently, in the design and coding phases of the software development life cycle. However, software testing's impact on energy consumption did not receive attention from the research community. Specifically, how test code design quality and test smell (e.g., sub-optimal design and bad practices in test code) impact energy consumption has not been investigated yet. This study examined 12 Apache projects to analyze the association between test smell and its effects on energy consumption in software testing. We conducted a mixed-method empirical analysis from two dimensions; software (data mining in Apache projects) and developers' views (a survey of 62 software practitioners). Our findings show that: 1) test smell is associated with energy consumption in software testing. Specifically, the smelly part of a test case consumes 10.92% more energy compared to the non-smelly part. 
2) certain test smells are more energy-hungry than others, 3) refactored test cases tend to consume less energy than their smelly counterparts, and 4) most developers lack knowledge about test smells' impact on energy consumption. We conclude the paper with several observations that can direct future research and developments.	翻訳日:2023-10-24 22:39:50 公開日:2023-10-23
# 最大独立集合に対する量子ハミルトニアンアルゴリズム Quantum Hamiltonian Algorithms for Maximum Independent Sets ( http://arxiv.org/abs/2310.14546v1 ) ライセンス: Link先を確認	Xianjue Zhao, Peiyun Ge, Hongye Yu, Li You, Frank Wilczek, Biao Wu	(参考訳) 最大独立集合問題を解くために、2つの量子ハミルトニアンアルゴリズムが提案されている。PKアルゴリズムは[Phys. Rev. A 101 (2020) 012318; Chin. Phys. Lett. 38, (2021) 030304] で導入され、HVアルゴリズムは[Science 376 (2022) 1209]で提示された。ここでは2つのアルゴリズムが数学的に等価であることを示す。具体的には、PKアルゴリズムのハミルトニアンは相互作用描像におけるHVハミルトニアンと見なすことができる。PKアルゴリズムの潜在的な実用的利点についても述べる。 Two quantum Hamiltonian algorithms have been proposed to solve the maximum independent set problem: the PK algorithm, introduced in [Phys. Rev. A 101 (2020) 012318; Chin. Phys. Lett. 38, (2021) 030304], and the HV algorithm, presented in [Science 376 (2022) 1209]. Here we demonstrate that the two algorithms are mathematically equivalent. Specifically, the Hamiltonian in the PK algorithm can be viewed as the HV Hamiltonian in the interaction picture. We remark on potential practical advantages of the PK algorithm.	翻訳日:2023-10-24 22:39:33 公開日:2023-10-23
# ChatGPTをテーマ分析に役立てる - 準備はいいか? Harnessing ChatGPT for thematic analysis: Are we ready? ( http://arxiv.org/abs/2310.14545v1 ) ライセンス: Link先を確認	V Vien Lee, Stephanie C. C. van der Lubbe, Lay Hoon Goh and Jose M. Valderas	(参考訳) ChatGPTは先進的な自然言語処理ツールであり、医学研究の様々な分野で応用が広がっている。データのパターンを識別し解釈するための定性的研究手法であるテーマ分析は、この技術の恩恵を受けうる応用のひとつである。この視点論文は、医学的文脈におけるテーマ分析の3つのコアフェーズにおけるChatGPTの利用を考察する。 1) トランスクリプトの直接コーディング、 2) 予め定義されたコードリストからのテーマ生成、及び 3) 原稿に掲載する引用文の前処理である。さらに,訓練目的に利用できるインタビュートランスクリプトをChatGPTで生成する可能性についても検討する。これらの役割におけるChatGPTの使用の強みと限界を評価し,人間の介入が依然として必要な領域を強調する。全体として、ChatGPTは分析において貴重なツールとして機能し、テーマ分析の効率を高め、定性的データにさらなる洞察を与えることができると論じる。 ChatGPT is an advanced natural language processing tool with growing applications across various disciplines in medical research. Thematic analysis, a qualitative research method to identify and interpret patterns in data, is one application that stands to benefit from this technology. This viewpoint explores the utilization of ChatGPT in three core phases of thematic analysis within a medical context: 1) direct coding of transcripts, 2) generating themes from a predefined list of codes, and 3) preprocessing quotes for manuscript inclusion. Additionally, we explore the potential of ChatGPT to generate interview transcripts, which may be used for training purposes. We assess the strengths and limitations of using ChatGPT in these roles, highlighting areas where human intervention remains necessary. Overall, we argue that ChatGPT can function as a valuable tool during analysis, enhancing the efficiency of the thematic analysis and offering additional insights into the qualitative data.	翻訳日:2023-10-24 22:39:21 公開日:2023-10-23
# 放射線学におけるGPT-4の境界の探求 Exploring the Boundaries of GPT-4 in Radiology ( http://arxiv.org/abs/2310.14573v1 ) ライセンス: Link先を確認	Qianchu Liu, Stephanie Hyland, Shruthi Bannur, Kenza Bouzid, Daniel C. Castro, Maria Teodora Wetscherek, Robert Tinn, Harshita Sharma, Fernando Pérez-García, Anton Schwaighofer, Pranav Rajpurkar, Sameer Tajdin Khanna, Hoifung Poon, Naoto Usuyama, Anja Thieme, Aditya V. Nori, Matthew P. Lungren, Ozan Oktay, Javier Alvarez-Valle	(参考訳) 汎用ドメインの大規模言語モデル(LLM)の最近の成功は、自然言語処理のパラダイムを、ドメインやアプリケーションをまたいだ統一的な基礎モデルへと大きく変えた。本稿では,これまでで最も有能なLLMであるGPT-4について,放射線学レポートのテキストベースの応用における性能を評価し,最先端(SOTA)の放射線学特化モデルと比較する。様々なプロンプト戦略を探求し,多様な一般的放射線学タスクでGPT-4を評価したところ, GPT-4は現在のSOTA放射線学モデルを上回るか,同等であることがわかった。ゼロショットプロンプトでも、GPT-4は、時間的文類似性分類(精度)と自然言語推論($F_1$)において、放射線学モデルに対してかなりの改善(絶対値で約10%)を得ている。データセット固有のスタイルやスキーマを学ぶ必要があるタスク(例えば、所見の要約)では、GPT-4はサンプルベースのプロンプトによって改善し、教師ありSOTAに匹敵する。認定放射線科医による広範なエラー分析により、GPT-4は十分なレベルの放射線学知識を有し、微妙なドメイン知識を必要とする複雑な文脈でのみ時折誤りを犯すことが示された。所見の要約では、GPT-4の出力は既存の手書きのインプレッションと全体的に同等であることがわかった。 The recent success of general-domain large language models (LLMs) has significantly changed the natural language processing paradigm towards a unified foundation model across domains and applications. In this paper, we focus on assessing the performance of GPT-4, the most capable LLM so far, on the text-based applications for radiology reports, comparing against state-of-the-art (SOTA) radiology-specific models. Exploring various prompting strategies, we evaluated GPT-4 on a diverse range of common radiology tasks and we found GPT-4 either outperforms or is on par with current SOTA radiology models. With zero-shot prompting, GPT-4 already obtains substantial gains ($\approx$ 10% absolute improvement) over radiology models in temporal sentence similarity classification (accuracy) and natural language inference ($F_1$). For tasks that require learning dataset-specific style or schema (e.g. findings summarisation), GPT-4 improves with example-based prompting and matches supervised SOTA. 
Our extensive error analysis with a board-certified radiologist shows GPT-4 has a sufficient level of radiology knowledge with only occasional errors in complex context that require nuanced domain knowledge. For findings summarisation, GPT-4 outputs are found to be overall comparable with existing manually-written impressions.	翻訳日:2023-10-24 22:30:19 公開日:2023-10-23
# マルチアノテーションプロセスの解明:アノテーション量とインスタンスの難易度がモデル性能に与える影響の検討 Unveiling the Multi-Annotation Process: Examining the Influence of Annotation Quantity and Instance Difficulty on Model Performance ( http://arxiv.org/abs/2310.14572v1 ) ライセンス: Link先を確認	Pritam Kadasi and Mayank Singh	(参考訳) NLPコミュニティは、言語解釈、主観性、曖昧性のニュアンスをよりよく捉えるために、マルチアノテータデータセットの構築を長年提唱してきた。本稿では,データセットがインスタンス毎にひとつのアノテーションから複数のアノテーションに拡張されると,パフォーマンススコアがどう変化しうるかを回顧的研究によって示す。アノテーション予算の異なるデータセットを生成するための,新しいマルチアノテータシミュレーションプロセスを提案する。同じアノテーション予算を持つ類似のデータセットでも、パフォーマンス向上の幅は異なりうることを示す。我々の発見は、マルチアノテーションの例でトレーニングされたモデルが、単一または少数アノテーションの例でトレーニングされたモデルよりも、常に優れたパフォーマンスをもたらすという一般的な信念に疑問を投げかける。 The NLP community has long advocated for the construction of multi-annotator datasets to better capture the nuances of language interpretation, subjectivity, and ambiguity. This paper conducts a retrospective study to show how performance scores can vary when a dataset expands from a single annotation per instance to multiple annotations. We propose a novel multi-annotator simulation process to generate datasets with varying annotation budgets. We show that similar datasets with the same annotation budget can lead to varying performance gains. Our findings challenge the popular belief that models trained on multi-annotation examples always lead to better performance than models trained on single or few-annotation examples.	翻訳日:2023-10-24 22:29:56 公開日:2023-10-23
# DICE: 軌道予測のためのスコアリング付き多様拡散モデル DICE: Diverse Diffusion Model with Scoring for Trajectory Prediction ( http://arxiv.org/abs/2310.14570v1 ) ライセンス: Link先を確認	Younwoo Choi, Ray Coden Mercurius, Soheil Mohamad Alizadeh Shabestary, Amir Rasouli	(参考訳) 動的環境における道路利用者の軌道予測は、自動運転など様々なアプリケーションにとって困難だが重要な課題である。この領域における主な課題の1つは、エージェントの未知かつ多様な意図に由来する、将来の軌道のマルチモーダル性である。拡散モデルは予測タスクにおけるそのような確率性を捉えるのに非常に効果的であることが示されている。しかしながら、これらのモデルには計算コストの高いデノイジングステップとサンプリング操作が多数含まれており、リアルタイム性が求められるセーフティクリティカルなアプリケーションでは望ましくない選択肢となっている。そこで本研究では, 拡散モデルを用いて将来の軌道を計算効率良く予測する新しい枠組みを提案する。反復サンプリングにおける計算ボトルネックを最小限に抑えるため,推論時間をリアルタイムに保ちつつ,精度向上のためにサンプルする軌道の数を最大化できる効率的なサンプリング機構を採用した。また,相対ランクを割り当てることにより,最も妥当な軌道を選択するスコアリング機構を提案する。一般的な歩行者(UCY/ETH)と自動運転(nuScenes)のベンチマークデータセットで実証評価を行い,いくつかのサブセットと指標で最先端の性能を達成することで,提案手法の有効性を示す。 Road user trajectory prediction in dynamic environments is a challenging but crucial task for various applications, such as autonomous driving. One of the main challenges in this domain is the multimodal nature of future trajectories stemming from the unknown yet diverse intentions of the agents. Diffusion models have been shown to be very effective in capturing such stochasticity in prediction tasks. However, these models involve many computationally expensive denoising steps and sampling operations that make them a less desirable option for real-time safety-critical applications. To this end, we present a novel framework that leverages diffusion models for predicting future trajectories in a computationally efficient manner. To minimize the computational bottlenecks in iterative sampling, we employ an efficient sampling mechanism that allows us to maximize the number of sampled trajectories for improved accuracy while maintaining inference time in real time. Moreover, we propose a scoring mechanism to select the most plausible trajectories by assigning relative ranks. 
We show the effectiveness of our approach by conducting empirical evaluations on common pedestrian (UCY/ETH) and autonomous driving (nuScenes) benchmark datasets on which our model achieves state-of-the-art performance on several subsets and metrics.	翻訳日:2023-10-24 22:29:45 公開日:2023-10-23
# HallusionBench: 考えたものが見えるのか? それとも見たものを考えるのか? GPT-4V(ision), LLaVA-1.5, その他のマルチモダリティモデルにとって困難な画像文脈推論ベンチマーク HallusionBench: You See What You Think? Or You Think What You See? An Image-Context Reasoning Benchmark Challenging for GPT-4V(ision), LLaVA-1.5, and Other Multi-modality Models ( http://arxiv.org/abs/2310.14566v1 ) ライセンス: Link先を確認	Fuxiao Liu, Tianrui Guan, Zongxia Li, Lichang Chen, Yaser Yacoob, Dinesh Manocha, Tianyi Zhou	(参考訳) 大規模言語モデル(LLM)は、視覚モデルと整合され、視覚言語モデル(VLM)に統合されると、画像推論タスクにおいて目覚ましい改善をもたらす。これは最近リリースされたGPT-4V(ision), LLaVA-1.5などによって示された。しかし、これらSOTA LVLMの強い言語的事前知識は諸刃の剣となりうる。すなわち、画像コンテキストを無視し、(矛盾していても)言語的事前知識のみに依存して推論する可能性がある。対照的に、VLM の視覚モジュールは LLM よりも弱いため、誤解を招く視覚表現をもたらし、それが LLM によって自信に満ちた誤りへと変換される可能性がある。言語幻覚と視覚錯覚という2種類のVLMの誤りを研究するために,GPT-4V や LLaVA-1.5 にとってもなお困難な画像コンテキスト推論ベンチマークである HallusionBench をキュレートした。本稿では, HallusionBenchの例を詳細に分析し,VLMの錯覚や幻覚,および将来の改善方法に関する新たな知見を示す。ベンチマークとコードベースはhttps://github.com/tianyi-lab/HallusionBenchでリリースされる。 Large language models (LLMs), after being aligned with vision models and integrated into vision-language models (VLMs), can bring impressive improvement in image reasoning tasks. This was shown by the recently released GPT-4V(ision), LLaVA-1.5, etc. However, the strong language prior in these SOTA LVLMs can be a double-edged sword: they may ignore the image context and solely rely on the (even contradictory) language prior for reasoning. In contrast, the vision modules in VLMs are weaker than LLMs and may result in misleading visual representations, which are then translated to confident mistakes by LLMs. To study these two types of VLM mistakes, i.e., language hallucination and visual illusion, we curated HallusionBench, an image-context reasoning benchmark that is still challenging even for GPT-4V and LLaVA-1.5. We provide a detailed analysis of examples in HallusionBench, which yields novel insights into the illusion or hallucination of VLMs and how to improve them in the future. 
The benchmark and codebase will be released at https://github.com/tianyi-lab/HallusionBench.	翻訳日:2023-10-24 22:29:24 公開日:2023-10-23
# 言語モデルは幻覚を起こすが、事実検証には優れているかもしれない Language Models Hallucinate, but May Excel at Fact Verification ( http://arxiv.org/abs/2310.14564v1 ) ライセンス: Link先を確認	Jian Guan, Jesse Dodge, David Wadden, Minlie Huang, Hao Peng	(参考訳) 自然言語処理(NLP)の最近の進歩は、大規模言語モデル(LLM)の著しい進歩に負うところが大きい。それでも、LLMはしばしば「幻覚」を起こし、事実に反する出力を生成する。慎重に設計した人間による評価は、幻覚の問題が深刻であることを裏付けており、GPT-3.5でさえ事実に即した出力を生成する割合は25%未満であることが明らかとなった。これは、進歩を測定し促進するうえでの事実検証器の重要性を強調する。我々の体系的な調査は、少なくともウィキペディア領域において、LLMを人間の判断と強い相関を持つ効果的な事実検証器として再利用できることを確認した。驚くべきことに、本研究で最も事実性の低い生成器であるFLAN-T5-11Bが事実検証器として最高の性能を発揮し、GPT3.5やChatGPTといったより有能なLLMさえも上回った。さらに、これらのLLMの高品質な証拠への依存度と、頑健性および一般化能力の欠如を分析した。本研究は,信頼できる生成モデルを開発するための知見を提示する。 Recent progress in natural language processing (NLP) owes much to remarkable advances in large language models (LLMs). Nevertheless, LLMs frequently "hallucinate," resulting in non-factual outputs. Our carefully designed human evaluation substantiates the serious hallucination issue, revealing that even GPT-3.5 produces factual outputs less than 25% of the time. This underscores the importance of fact verifiers in order to measure and incentivize progress. Our systematic investigation affirms that LLMs can be repurposed as effective fact verifiers with strong correlations with human judgments, at least in the Wikipedia domain. Surprisingly, FLAN-T5-11B, the least factual generator in our study, performs the best as a fact verifier, even outperforming more capable LLMs like GPT3.5 and ChatGPT. Delving deeper, we analyze the reliance of these LLMs on high-quality evidence, as well as their deficiencies in robustness and generalization ability. Our study presents insights for developing trustworthy generation models.	翻訳日:2023-10-24 22:29:03 公開日:2023-10-23
# NormDial: 社会規範の遵守と違反をモデル化するための、比較可能なバイリンガル合成対話データセット NormDial: A Comparable Bilingual Synthetic Dialog Dataset for Modeling Social Norm Adherence and Violation ( http://arxiv.org/abs/2310.14563v1 ) ライセンス: Link先を確認	Oliver Li, Mallika Subramanian, Arkadiy Saakyan, Sky CH-Wang, Smaranda Muresan	(参考訳) 社会規範は対人コミュニケーションを根本的に形作る。本稿では,中国とアメリカの文化における社会規範の遵守と違反について発話ターンごとのアノテーションを備えた,高品質な二者間対話データセットであるNormDialを提案する。社会規範の遵守検出というタスクを導入するとともに,専門家がアノテーションした少数の社会規範を大規模言語モデルにプロンプトとして与える human-in-the-loop パイプラインを用いて,中国語と英語の両方でデータセットを合成的に生成する。生成した対話が高品質であることを人間による評価で示し,さらにこのタスクにおける既存の大規模言語モデルの性能を評価する。本研究の知見は、言語や文化をまたぐ会話文脈に現れる社会規範のニュアンスを理解するための新たな方向性を示す。 Social norms fundamentally shape interpersonal communication. We present NormDial, a high-quality dyadic dialogue dataset with turn-by-turn annotations of social norm adherences and violations for Chinese and American cultures. Introducing the task of social norm observance detection, our dataset is synthetically generated in both Chinese and English using a human-in-the-loop pipeline by prompting large language models with a small collection of expert-annotated social norms. We show that our generated dialogues are of high quality through human evaluation and further evaluate the performance of existing large language models on this task. Our findings point towards new directions for understanding the nuances of social norms as they manifest in conversational contexts that span across languages and cultures.	翻訳日:2023-10-24 22:28:45 公開日:2023-10-23
# F$^2$AT:自然パターンと摂動パターンの分離による特徴集中型敵対的訓練 F$^2$AT: Feature-Focusing Adversarial Training via Disentanglement of Natural and Perturbed Patterns ( http://arxiv.org/abs/2310.14561v1 ) ライセンス: Link先を確認	Yaguan Qian, Chenyu Zhao, Zhaoquan Gu, Bin Wang, Shouling Ji, Wei Wang, Boyang Zhou, Pan Zhou	(参考訳) ディープニューラルネットワーク(DNN)は、巧妙に設計された摂動によって作られた敵対的サンプルに対して脆弱である。これは、自動運転車、監視セキュリティ、医療診断などの重要な応用に悲惨な結果をもたらす可能性がある。現在、敵対的訓練は敵対的サンプルに対する最も効果的な防御の1つである。しかし,従来の敵対的訓練では,DNNが依然として疑似的(spurious)な特徴を学習してしまうため,クリーン精度と頑健性との良好なトレードオフを達成することが難しい。その本質的な理由は、敵対的ノイズとクリーンなサンプルを分離できない場合、従来の敵対的訓練では敵対的サンプルから中核的特徴を十分に学習することが困難な点にある。本稿では,ビット平面スライシングにより,敵対的サンプルを自然パターンと摂動パターンに分離する。上位ビット平面は自然パターンを,下位ビット平面は摂動パターンを表すと仮定する。そこで,特徴集中型敵対的訓練(F$^2$AT)を提案する。これは,モデルが自然パターンの中核的特徴に注目し,摂動パターンの疑似的特徴の影響を低減するよう強制する点で従来の研究と異なる。実験結果から, F$^2$ATは, クリーン精度と敵対的頑健性において最先端の手法より優れていることが示された。 Deep neural networks (DNNs) are vulnerable to adversarial examples crafted by well-designed perturbations. This could lead to disastrous results on critical applications such as self-driving cars, surveillance security, and medical diagnosis. At present, adversarial training is one of the most effective defenses against adversarial examples. However, traditional adversarial training makes it difficult to achieve a good trade-off between clean accuracy and robustness since spurious features are still learned by DNNs. The intrinsic reason is that traditional adversarial training makes it difficult to fully learn core features from adversarial examples when adversarial noise and clean examples cannot be disentangled. In this paper, we disentangle the adversarial examples into natural and perturbed patterns by bit-plane slicing. We assume the higher bit-planes represent natural patterns and the lower bit-planes represent perturbed patterns, respectively. We propose a Feature-Focusing Adversarial Training (F$^2$AT), which differs from previous work in that it enforces the model to focus on the core features from natural patterns and reduce the impact of spurious features from perturbed patterns. 
The experimental results demonstrated that F$^2$AT outperforms state-of-the-art methods in clean accuracy and adversarial robustness.	翻訳日:2023-10-24 22:28:32 公開日:2023-10-23
# 多面体表面:多面体表面に基づく自己教師あり点群再構成 Polyhedral Surface: Self-supervised Point Cloud Reconstruction Based on Polyhedral Surface ( http://arxiv.org/abs/2310.14560v1 ) ライセンス: Link先を確認	Hui Tian, Kai Xu	(参考訳) 生の点群からの点群再構成は、特にモデリングやレンダリングの用途での需要が高いことから、コンピュータグラフィックスにおいて何十年も重要なトピックであった。この問題を解く重要な方法は、局所曲線に適合する局所幾何を構築することである。しかし、従来の手法は局所平面か多項式曲線のいずれかを構築する。局所平面は、鋭い特徴の損失と開曲面上の境界アーティファクトをもたらす。多項式曲線は、局所座標系の一貫性の問題のためにニューラルネットワークと組み合わせることが難しい。そこで本研究では,局所表面を表す新しい多面体表面を提案する。この方法は、鋭い特徴や開曲面上の表面境界をより柔軟に表現できる。局所座標系を必要とせず、これはニューラルネットワークを導入する際に重要である。具体的には,法線を用いて多面体表面を構成し,それぞれ2つおよび3つの法線を用いる二面体表面と三面体表面の両方を扱う。本手法は,一般的に使用される3つのデータセット (ShapeNetCore, ABC, ScanNet) において最先端の結果を得る。コードは受理時にリリースされる。 Point cloud reconstruction from raw point clouds has been an important topic in computer graphics for decades, especially due to its high demand in modeling and rendering applications. An important way to solve this problem is establishing a local geometry to fit the local curve. However, previous methods build either a local plane or polynomial curve. A local plane loses sharp features and introduces boundary artefacts on open surfaces. A polynomial curve is hard to combine with neural networks because of the local-coordinate consistency problem. To address this, we propose a novel polyhedral surface to represent local surface. This method provides more flexibility to represent sharp features and surface boundaries on open surfaces. It does not require any local coordinate system, which is important when introducing neural networks. Specifically, we use normals to construct the polyhedral surface, including both dihedral and trihedral surfaces using 2 and 3 normals, respectively. Our method achieves state-of-the-art results on three commonly used datasets (ShapeNetCore, ABC, and ScanNet). Code will be released upon acceptance.	翻訳日:2023-10-24 22:28:09 公開日:2023-10-23
# AlpaCare:医療応用のための命令チューニング済み大規模言語モデル AlpaCare:Instruction-tuned Large Language Models for Medical Application ( http://arxiv.org/abs/2310.14558v1 ) ライセンス: Link先を確認	Xinlu Zhang, Chenxin Tian, Xianjun Yang, Lichang Chen, Zekun Li, Linda Ruth Petzold	(参考訳) 大規模言語モデル(LLM)は、命令チューニングによって命令追従能力を大幅に向上させ、様々なタスクで顕著な性能を達成している。これまでの研究は、大量の医療特化データを用いた医療ドメイン特化LLMの微調整に重点を置き、医療能力を高めるために数百万件の生物医学文献を取り入れてきた。しかし,既存の医療向け命令チューニングLLMは,利用可能なタスクや命令の範囲が限られているため,命令チューニングの有効性が制限され,一般ドメインでの性能にも悪影響が及んでいる。本稿では,多様な機械生成の医療向け命令追従データ52k件からなる MedInstruct-52k を用いてLLaMA系モデルを微調整し,AlpaCare モデルを構築した。一般ドメインと医療特化ドメインの両方におけるフリーフォーム命令評価の包括的な実験結果は、AlpaCareが従来の命令チューニングモデルと比較して、医療と一般の両ドメインで高い医療能力と汎用性を持つことを示している。 MedInstruct-52kデータセットと、臨床医が作成したフリーフォーム命令テストセットであるMedInstruct-testを、コードベースとともに公開し、さらなる研究と開発を促進する。プロジェクトページはhttps://github.com/XZhang97666/AlpaCareで閲覧できる。 Large Language Models (LLMs) have demonstrated significant enhancements in instruction-following abilities through instruction tuning, achieving notable performances across various tasks. Previous research has focused on fine-tuning medical domain-specific LLMs using an extensive array of medical-specific data, incorporating millions of pieces of biomedical literature to augment their medical capabilities. However, existing medical instruction-tuned LLMs have been constrained by the limited scope of tasks and instructions available, restricting the efficacy of instruction tuning and adversely affecting performance in the general domain. In this paper, we fine-tune LLaMA-series models using 52k diverse, machine-generated, medical instruction-following data, MedInstruct-52k, resulting in the model AlpaCare. Comprehensive experimental results on both general and medical-specific domain free-form instruction evaluations showcase AlpaCare's strong medical proficiency and generalizability compared to previous instruction-tuned models in both medical and general domains. 
We provide public access to our MedInstruct-52k dataset and a clinician-crafted free-form instruction test set, MedInstruct-test, along with our codebase, to foster further research and development. Our project page is available at https://github.com/XZhang97666/AlpaCare.	翻訳日:2023-10-24 22:27:53 公開日:2023-10-23
# スキップされたビート:64言語におけるLLMの社会語用論的理解に関する研究 The Skipped Beat: A Study of Sociopragmatic Understanding in LLMs for 64 Languages ( http://arxiv.org/abs/2310.14557v1 ) ライセンス: Link先を確認	Chiyu Zhang, Khai Duy Doan, Qisheng Liao, Muhammad Abdul-Mageed	(参考訳) ChatGPTのような命令チューニング済みの大規模言語モデル(LLM)は、幅広いタスクにおいて顕著な性能を示す。様々な NLP ベンチマークにおける命令チューニング済み LLM の性能を調べる最近の研究は多いが、言語横断的な社会語用論的意味(SM)、すなわち社会的・対話的な文脈に埋め込まれた意味を理解する能力に関する包括的な調査は、いまだに不足している。この不足は、SMが既存のどのベンチマークでも十分に表現されていないことに一部起因する。このギャップに対処するため,SM理解に特化して設計された大規模な多言語ベンチマークであるSPARROWを提案する。 SPARROWは、6つの主要カテゴリ(例えば、反社会的言語検出、感情認識)にわたる13のタスクタイプをカバーする169のデータセットで構成されている。 SPARROWのデータセットは、16の文字体系で表記される12の語族に由来する64の異なる言語を含んでいる。本研究では,SPARROWにおける多言語事前訓練済み言語モデル(mT5など)と命令チューニング済みLLM(BLOOMZ, ChatGPTなど)の性能を,微調整,ゼロショット,少数ショット学習により評価する。包括的な分析から,既存のオープンソースの命令チューニング済みLLMは,さまざまな言語でSMを理解することに依然として苦労しており,場合によってはランダムベースラインに近い性能にとどまることが分かった。また、ChatGPTは多くのLLMを上回るものの、タスク固有の微調整モデルにはSPARROWスコアで12.19の差をつけられ、依然として及ばない。私たちのベンチマークは、https://github.com/UBC-NLP/SPARROWで公開されています。 Instruction tuned large language models (LLMs), such as ChatGPT, demonstrate remarkable performance in a wide range of tasks. Despite numerous recent studies that examine the performance of instruction-tuned LLMs on various NLP benchmarks, there remains a lack of comprehensive investigation into their ability to understand cross-lingual sociopragmatic meaning (SM), i.e., meaning embedded within social and interactive contexts. This deficiency arises partly from SM not being adequately represented in any of the existing benchmarks. To address this gap, we present SPARROW, an extensive multilingual benchmark specifically designed for SM understanding. SPARROW comprises 169 datasets covering 13 task types across six primary categories (e.g., anti-social language detection, emotion recognition). SPARROW datasets encompass 64 different languages originating from 12 language families representing 16 writing scripts. 
We evaluate the performance of various multilingual pretrained language models (e.g., mT5) and instruction-tuned LLMs (e.g., BLOOMZ, ChatGPT) on SPARROW through fine-tuning, zero-shot, and/or few-shot learning. Our comprehensive analysis reveals that existing open-source instruction tuned LLMs still struggle to understand SM across various languages, performing close to a random baseline in some cases. We also find that although ChatGPT outperforms many LLMs, it still falls behind task-specific finetuned models with a gap of 12.19 SPARROW score. Our benchmark is available at: https://github.com/UBC-NLP/SPARROW	翻訳日:2023-10-24 22:27:31 公開日:2023-10-23
# 非対称空洞に閉じ込められた$\Lambda$型3レベル原子による光子遮断 Photon blockade with a trapped $\Lambda$-type three-level atom in asymmetrical cavity ( http://arxiv.org/abs/2310.14594v1 ) ライセンス: Link先を確認	Xue-Chen Gao, Xiao-Jie Wu, Cheng-Hua Bai, Shao-Xiong Wu, and Chang-shui Yu	(参考訳) 我々は,$\Lambda$型3レベル原子を含む非対称ファブリペロー共振器において,強く非相反的な光子遮断を操作する手法を提案する。従来型と非従来型の両方の遮断機構を利用して、強い光子遮断は$\Lambda$型原子がもたらす非調和固有エネルギースペクトルと、マイクロ波場が誘起する破壊的量子干渉効果によって達成される。システムパラメータを最適化することにより、広い範囲のキャビティデチューニングにわたって強い光子遮断を操作できる。キャビティの非対称性によって導入される空間対称性の破れを用いて、方向依存の非相反光子遮断を達成でき、非相反性は最適なキャビティデチューニングで最大に達しうる。特に、キャビティデチューニングを調整するだけで、非相反光子遮断の発生位置を操作できる。提案方式は、高品質な非相反単一光子源を生成するための実現可能な手段を提供する。 We propose a scheme to manipulate strong and nonreciprocal photon blockades in asymmetrical Fabry-Perot cavity with a $\Lambda$-type three-level atom. Utilizing the mechanisms of both conventional and unconventional blockade, the strong photon blockade is achieved by the anharmonic eigenenergy spectrum brought by $\Lambda$-type atom and the destructive quantum interference effect induced by a microwave field. By optimizing the system parameters, the manipulation of strong photon blockade over a wide range of cavity detuning can be realized. Using spatial symmetry breaking introduced by the asymmetry of cavity, the direction-dependent nonreciprocal photon blockade can be achieved, and the nonreciprocity can reach the maximum at optimal cavity detuning. In particular, manipulating the occurring position of nonreciprocal photon blockade can be implemented by simply adjusting the cavity detuning. Our scheme provides feasible access for generating high-quality nonreciprocal single-photon sources.	翻訳日:2023-10-24 22:21:23 公開日:2023-10-23
# 圧縮真空場を用いた結合光機械システムの量子エンタングルメントとEPRステアリングの強化 Enhancing the quantum entanglement and EPR steering of a coupled optomechanical system with a squeezed vacuum field ( http://arxiv.org/abs/2310.14593v1 ) ライセンス: Link先を確認	Shao-Xiong Wu, Cheng-Hua Bai, Gang Li, Chang-shui Yu, and Tiancai Zhang	(参考訳) 量子エンタングルメントとEPR(Einstein-Podolsky-Rosen)ステアリングは量子情報処理において貴重な資源である。メカニカルモードが誘起するデチューニングの変位を考慮したうえで、弱い圧縮真空場を用いて結合オプトメカニカル系の量子エンタングルメントとEPRステアリングを強化する方法を検討した。システムが真空環境と相互作用する場合と比較して、圧縮真空場を印加すると量子エンタングルメントとEPRステアリングは強くなる。スクイーズ度の大きい圧縮真空場は、量子エンタングルメントとEPRステアリングの強化には有益でない。このモデルでは、圧縮真空場のスクイーズパラメータよりも、参照位相が重要な役割を果たす。 Quantum entanglement and Einstein-Podolsky-Rosen (EPR) steering are valuable resources in quantum information processing. How to enhance the quantum entanglement and EPR steering of coupled optomechanical systems with a weak squeezed vacuum field are studied when the displacement of detuning induced by the mechanical mode is considered. Compared with the condition that the system interacts with a vacuum environment, the quantum entanglement and EPR steering are stronger when the squeezed vacuum field is applied. A squeezed vacuum field with a large degree is not beneficial to enhance the quantum entanglement and EPR steering. Rather than the squeezing parameter of the squeezed vacuum field, the reference phase plays a vital role in this model.	翻訳日:2023-10-24 22:21:05 公開日:2023-10-23
# カラー化によるLiDARベース3次元物体検出器の事前学習 Pre-Training LiDAR-Based 3D Object Detectors Through Colorization ( http://arxiv.org/abs/2310.14592v1 ) ライセンス: Link先を確認	Tai-Yu Pan, Chenyang Ma, Tianle Chen, Cheng Perng Phoo, Katie Z Luo, Yurong You, Mark Campbell, Kilian Q. Weinberger, Bharath Hariharan, and Wei-Lun Chao	(参考訳) 自動運転車のための正確な3次元物体検出と理解はLiDAR点群に大きく依存しており、訓練には大量のラベル付きデータが必要となる。本研究では,モデルにLiDAR点群のカラー化を学習させて有用な意味的手がかりを獲得させることで,データとラベルのギャップを埋める革新的な事前学習手法 Grounded Point Colorization (GPC) を導入する。色のばらつきと選択バイアスから生じる課題に対処するため,カラー化の際に正解の色をヒントとして与えることで,色を「コンテキスト」として取り入れる。 KITTIとWaymoのデータセットにおける実験結果は、GPCの顕著な有効性を示している。ラベル付きデータが限られていても、GPCは微調整の性能を大きく向上させる。特に、KITTIデータセットのわずか20%を用いたGPCは、データセット全体を用いてスクラッチから訓練した場合を上回る。要約すると,3次元物体検出のための事前学習に新たな視点を導入し,事前学習の目的をモデルの本来の役割と整合させ,最終的に自動運転車における3次元物体検出の精度と効率を向上させる。 Accurate 3D object detection and understanding for self-driving cars heavily relies on LiDAR point clouds, necessitating large amounts of labeled data to train. In this work, we introduce an innovative pre-training approach, Grounded Point Colorization (GPC), to bridge the gap between data and labels by teaching the model to colorize LiDAR point clouds, equipping it with valuable semantic cues. To tackle challenges arising from color variations and selection bias, we incorporate color as "context" by providing ground-truth colors as hints during colorization. Experimental results on the KITTI and Waymo datasets demonstrate GPC's remarkable effectiveness. Even with limited labeled data, GPC significantly improves fine-tuning performance; notably, on just 20% of the KITTI dataset, GPC outperforms training from scratch with the entire dataset. In sum, we introduce a fresh perspective on pre-training for 3D object detection, aligning the objective with the model's intended role and ultimately advancing the accuracy and efficiency of 3D object detection for autonomous vehicles.	翻訳日:2023-10-24 22:20:52 公開日:2023-10-23
# 大規模検索モデル:LLM時代の検索スタックの再定義 Large Search Model: Redefining Search Stack in the Era of LLMs ( http://arxiv.org/abs/2310.14587v1 ) ライセンス: Link先を確認	Liang Wang, Nan Yang, Xiaolong Huang, Linjun Yang, Rangan Majumder, Furu Wei	(参考訳) 現代の検索エンジンは、クエリ理解、検索、多段階ランキング、質問応答など、さまざまなコンポーネントのスタック上に構築されている。これらのコンポーネントはしばしば最適化され、独立してデプロイされる。本稿では,従来の検索スタックを再定義し,検索タスクを1つの大規模言語モデル(llm)で統一する,大規模検索モデルと呼ばれる新しい概念的枠組みを提案する。全てのタスクは自動回帰テキスト生成問題として定式化され、自然言語プロンプトを使ってタスクをカスタマイズできる。提案フレームワークは,LLMの強力な言語理解と推論能力を活用し,既存の検索スタックを簡素化しつつ,検索結果の質を向上させる能力を提供する。この枠組みの実現可能性を明らかにするために,概念実証実験を複数実施し,実世界の検索システムにおけるこのアプローチの実装に伴う潜在的な課題について考察する。 Modern search engines are built on a stack of different components, including query understanding, retrieval, multi-stage ranking, and question answering, among others. These components are often optimized and deployed independently. In this paper, we introduce a novel conceptual framework called large search model, which redefines the conventional search stack by unifying search tasks with one large language model (LLM). All tasks are formulated as autoregressive text generation problems, allowing for the customization of tasks through the use of natural language prompts. This proposed framework capitalizes on the strong language understanding and reasoning capabilities of LLMs, offering the potential to enhance search result quality while simultaneously simplifying the existing cumbersome search stack. To substantiate the feasibility of this framework, we present a series of proof-of-concept experiments and discuss the potential challenges associated with implementing this approach within real-world search systems.	翻訳日:2023-10-24 22:20:33 公開日:2023-10-23
# GNNEvaluator:ラベルなしで見えないグラフ上でのGNNパフォーマンスの評価 GNNEvaluator: Evaluating GNN Performance On Unseen Graphs Without Labels ( http://arxiv.org/abs/2310.14586v1 ) ライセンス: Link先を確認	Xin Zheng, Miao Zhang, Chunyang Chen, Soheila Molaei, Chuan Zhou, Shirui Pan	(参考訳) グラフニューラルネットワーク(GNN)の性能評価は、トレーニング-テストグラフの分布のミスマッチのため、目立たないテストグラフとラベル付けされていないテストグラフを推測すると、デプロイされたGNNが重大なパフォーマンスの不確実性に直面しているため、実用的なGNNモデルのデプロイと提供にとって必須のタスクである。本稿では,ラベル付きおよび観測グラフ上で訓練された特定のGNNモデルの性能を評価することを目的とした,新しい問題であるGNNモデル評価について,ラベルのない未確認グラフ上での性能(ノード分類精度など)を正確に推定することを目的とした。具体的には,(1) DiscGraph セットの構成と(2) GNNEvaluator トレーニングと推論を含む2段階の GNN モデル評価フレームワークを提案する。 DiscGraphセットは、遅延ノード埋め込みとノードクラス予測に関連するGNNの出力を利用する、差分測定機能を通じて、広範囲で多様なグラフデータ分散の相違をキャプチャする。 DiscGraphセットからの効果的なトレーニング監督の下で、GNNEvaluatorは、評価対象であるGNNモデルのノード分類精度を正確に推定し、GNNモデルの性能を評価するための正確な推論を行う。実世界の未発見およびラベルのないテストグラフに関する広範囲な実験により,提案手法がgnnモデル評価に有効であることを実証した。 Evaluating the performance of graph neural networks (GNNs) is an essential task for practical GNN model deployment and serving, as deployed GNNs face significant performance uncertainty when inferring on unseen and unlabeled test graphs, due to mismatched training-test graph distributions. In this paper, we study a new problem, GNN model evaluation, that aims to assess the performance of a specific GNN model trained on labeled and observed graphs, by precisely estimating its performance (e.g., node classification accuracy) on unseen graphs without labels. Concretely, we propose a two-stage GNN model evaluation framework, including (1) DiscGraph set construction and (2) GNNEvaluator training and inference. The DiscGraph set captures wide-range and diverse graph data distribution discrepancies through a discrepancy measurement function, which exploits the outputs of GNNs related to latent node embeddings and node class predictions. 
Under the effective training supervision from the DiscGraph set, GNNEvaluator learns to precisely estimate node classification accuracy of the to-be-evaluated GNN model and makes an accurate inference for evaluating GNN model performance. Extensive experiments on real-world unseen and unlabeled test graphs demonstrate the effectiveness of our proposed method for GNN model evaluation.	翻訳日:2023-10-24 22:20:17 公開日:2023-10-23
# JointMatch: 半教師付きテキスト分類のための多様かつ協調的な擬似ラベリングへの統一的アプローチ JointMatch: A Unified Approach for Diverse and Collaborative Pseudo-Labeling to Semi-Supervised Text Classification ( http://arxiv.org/abs/2310.14583v1 ) ライセンス: Link先を確認	Henry Peng Zou, Cornelia Caragea	(参考訳) 半教師付きテキスト分類(SSTC)は、ラベルのないデータを活用できることから注目を集めている。しかしながら、擬似ラベリングに基づく既存のアプローチは、擬似ラベルバイアスと誤りの蓄積という問題を抱えている。本稿では,近年の半教師付き学習とノイズのあるラベルでの学習のアイデアを統合することで,これらの課題に対処するSSTCの総合的アプローチであるJointMatchを提案する。 JointMatchは、各クラスの学習状況に基づいてクラス別の閾値を適応的に調整し、その時点で学習が容易なクラスへのモデルバイアスを緩和する。さらに、JointMatchは、初期化の異なる2つのネットワークが交差ラベリング方式で互いに教え合うことで、誤りの蓄積を軽減する。相互学習のために2つのネットワーク間の相違を維持するため,不一致データをより重視しつつ,高品質な一致データも訓練に利用できる戦略を導入する。ベンチマークデータセットにおける実験結果は、JointMatchの優れた性能を示し、平均で5.13%の大幅な改善を達成した。特にJointMatchは、ラベルが極めて少ない設定でも優れた結果を示し、クラスあたり5つのラベルだけでAG Newsにおいて86%の精度を得た。コードはhttps://github.com/HenryPengZou/JointMatchで公開しています。 Semi-supervised text classification (SSTC) has gained increasing attention due to its ability to leverage unlabeled data. However, existing approaches based on pseudo-labeling suffer from the issues of pseudo-label bias and error accumulation. In this paper, we propose JointMatch, a holistic approach for SSTC that addresses these challenges by unifying ideas from recent semi-supervised learning and the task of learning with noise. JointMatch adaptively adjusts classwise thresholds based on the learning status of different classes to mitigate model bias towards current easy classes. Additionally, JointMatch alleviates error accumulation by utilizing two differently initialized networks to teach each other in a cross-labeling manner. To maintain divergence between the two networks for mutual learning, we introduce a strategy that weighs more disagreement data while also allowing the utilization of high-quality agreement data for training. Experimental results on benchmark datasets demonstrate the superior performance of JointMatch, achieving a significant 5.13% improvement on average. 
Notably, JointMatch delivers impressive results even in the extremely-scarce-label setting, obtaining 86% accuracy on AG News with only 5 labels per class. We make our code available at https://github.com/HenryPengZou/JointMatch.	翻訳日:2023-10-24 22:19:53 公開日:2023-10-23
# DataCompチャレンジにおける画像-テキスト類似性とキャプション修正の活用:フィルタリングトラックとBYODトラック Leveraging Image-Text Similarity and Caption Modification for the DataComp Challenge: Filtering Track and BYOD Track ( http://arxiv.org/abs/2310.14581v1 ) ライセンス: Link先を確認	Shuhei Yokoo, Peifei Zhu, Yuchi Ishikawa, Mikihiro Tanaka, Masayoshi Kondo, Hirokatsu Kataoka	(参考訳) 大規模なwebクローラデータセットは、高い一般化機能を持つマルチモーダル機能を学ぶ上で、すでに重要な役割を担っている。しかし、データ設計の詳細や改善についてはまだ研究が限られている。近年,固定モデルを用いた最良のトレーニングデータを提案するdatacomp challengeが提案されている。本稿では,DataComp チャレンジにおけるフィルタリングトラックと BYOD トラックの両方に対するソリューションを提案する。提案ソリューションでは,大規模なマルチモーダルモデルCLIPとBLIP-2を用いてWebクローラーデータのフィルタリングと修正を行い,外部データセットとトリックの袋を用いてデータ品質を向上させる。実験では、ソリューションがDataCompのベースライン(フィルタリングトラック:6.6%改善、BYODトラック:48.5%改善)を大きく上回っている。 Large web crawl datasets have already played an important role in learning multimodal features with high generalization capabilities. However, there are still very limited studies investigating the details or improvements of data design. Recently, a DataComp challenge has been designed to propose the best training data with the fixed models. This paper presents our solution to both filtering track and BYOD track of the DataComp challenge. Our solution adopts large multimodal models CLIP and BLIP-2 to filter and modify web crawl data, and utilize external datasets along with a bag of tricks to improve the data quality. Experiments show our solution significantly outperforms DataComp baselines (filtering track: 6.6% improvement, BYOD track: 48.5% improvement).	翻訳日:2023-10-24 22:19:32 公開日:2023-10-23
# FedSplitX: 計算制約のある異種クライアントのためのフェデレーションスプリット学習 FedSplitX: Federated Split Learning for Computationally-Constrained Heterogeneous Clients ( http://arxiv.org/abs/2310.14579v1 ) ライセンス: Link先を確認	Jiyun Shin, Jinhyun Ahn, Honggu Kang, Joonhyuk Kang	(参考訳) 基礎モデル(FM)は機械学習において顕著な性能を示しているが、膨大なトレーニングデータと計算資源を必要とする。フェデレーテッドラーニング(FL)は、FMがもたらす課題、特にデータのプライバシーと計算負荷に関する課題に対処する。しかし、FMに対するFLは、計算能力の異なる異種クライアントが存在する状況では課題に直面する。能力の限られたクライアントは、計算集約的なFMの訓練に苦労しうるためである。これらの課題に対処するため,システムの不均一性に対処する新しいFLフレームワークであるFedSplitXを提案する。 FedSplitXは大規模なモデルを複数の分割点でクライアント側とサーバ側のコンポーネントに分割し、多様なクライアント能力に対応する。このアプローチにより、クライアントはサーバの計算能力を活用しながら協調でき、最も能力の低いクライアントの要件に合わせてモデルサイズを制限するベースラインに比べて、モデル性能が向上する。さらに、FedSplitXは各分割点に補助ネットワークを組み込み、通信コストと遅延を低減しつつモデル性能を向上させる。実験の結果,FedSplitXは大規模モデルの訓練にサーバの能力を効果的に活用し,ベースラインアプローチを上回ることが示された。 Foundation models (FMs) have demonstrated remarkable performance in machine learning but demand extensive training data and computational resources. Federated learning (FL) addresses the challenges posed by FMs, especially related to data privacy and computational burdens. However, FL on FMs faces challenges in situations with heterogeneous clients possessing varying computing capabilities, as clients with limited capabilities may struggle to train the computationally intensive FMs. To address these challenges, we propose FedSplitX, a novel FL framework that tackles system heterogeneity. FedSplitX splits a large model into client-side and server-side components at multiple partition points to accommodate diverse client capabilities. This approach enables clients to collaborate while leveraging the server's computational power, leading to improved model performance compared to baselines that limit model size to meet the requirement of the poorest client. Furthermore, FedSplitX incorporates auxiliary networks at each partition point to reduce communication costs and delays while enhancing model performance. 
Our experiments demonstrate that FedSplitX effectively utilizes server capabilities to train large models, outperforming baseline approaches.	翻訳日:2023-10-24 22:19:18 公開日:2023-10-23
# DeCrisisMB: Debiased Semi-Supervised Learning for Crisis Tweet Classification via Memory Bank ( http://arxiv.org/abs/2310.14577v1 ) License: see link	Henry Peng Zou, Yue Zhou, Weizhi Zhang, Cornelia Caragea	During crisis events, people often use social media platforms such as Twitter to disseminate information about the situation, warnings, advice, and support. Emergency relief organizations leverage such information to obtain a timely picture of crisis circumstances and expedite rescue operations. While existing works utilize such information to build models for crisis event analysis, fully-supervised approaches require annotating vast amounts of data and are impractical due to limited response time. On the other hand, semi-supervised models can be biased, performing moderately well for certain classes while performing extremely poorly for others, resulting in substantially negative effects on disaster monitoring and rescue. In this paper, we first study two recent debiasing methods on semi-supervised crisis tweet classification. Then we propose a simple but effective debiasing method, DeCrisisMB, that utilizes a Memory Bank to store generated pseudo-labels and sample them equally from each class at each training iteration. Extensive experiments are conducted to compare the performance and generalization ability of different debiasing methods in both in-distribution and out-of-distribution settings. The results demonstrate the superior performance of our proposed method. Our code is available at https://github.com/HenryPengZou/DeCrisisMB.	Translated: 2023-10-24 22:18:55 Published: 2023-10-23
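The memory-bank idea in the abstract above (store pseudo-labeled examples per class, then draw the same number from every class) can be sketched as follows. This is a minimal illustration under assumptions, not the authors' implementation: the class name `MemoryBank`, the FIFO eviction policy, and sampling with replacement are all choices made here for brevity; the released code at the URL above is the authoritative reference.

```python
import random
from collections import defaultdict, deque

class MemoryBank:
    """Per-class FIFO store of pseudo-labeled examples (illustrative sketch)."""

    def __init__(self, capacity_per_class=100):
        # Each class keeps only its most recent `capacity_per_class` examples.
        self.bank = defaultdict(lambda: deque(maxlen=capacity_per_class))

    def add(self, example, pseudo_label):
        self.bank[pseudo_label].append(example)

    def sample_equal(self, per_class, rng=random):
        """Draw the same number of examples from every stored class."""
        batch = []
        for label, items in self.bank.items():
            pool = list(items)
            # Sample with replacement so rare classes can still fill their quota.
            batch.extend((rng.choice(pool), label) for _ in range(per_class))
        return batch

bank = MemoryBank(capacity_per_class=3)
for i in range(10):
    bank.add(f"tweet_{i}", "rescue")   # over-represented class
bank.add("tweet_x", "injured")         # rare class
batch = bank.sample_equal(per_class=2, rng=random.Random(0))
counts = {label: sum(1 for _, l in batch if l == label) for label in bank.bank}
```

Even though ten "rescue" pseudo-labels were generated against a single "injured" one, each class contributes two items to the batch, which is the balancing effect the abstract describes.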
# Tensor Decomposition Based Attention Module for Spiking Neural Networks ( http://arxiv.org/abs/2310.14576v1 ) License: see link	Haoyu Deng, Ruijie Zhu, Xuerui Qiu, Yule Duan, Malu Zhang, Liangjian Deng	The attention mechanism has been proven to be an effective way to improve spiking neural networks (SNNs). However, although the input data flow of current SNNs is split into tensors for processing on GPUs, none of the previous works consider the properties of tensors when implementing an attention module. This inspires us to rethink current SNNs from the perspective of tensor-relevant theories. Using tensor decomposition, we design the projected full attention (PFA) module, which demonstrates excellent results with linearly growing parameters. Specifically, PFA is composed of the linear projection of spike tensor (LPST) module and the attention map composing (AMC) module. In LPST, we start by compressing the original spike tensor into three projected tensors using a single property-preserving strategy with learnable parameters for each dimension. Then, in AMC, we exploit the inverse procedure of the tensor decomposition process to combine the three tensors into the attention map using a so-called connecting factor. To validate the effectiveness of the proposed PFA module, we integrate it into the widely used VGG and ResNet architectures for classification tasks. Our method achieves state-of-the-art performance on both static and dynamic benchmark datasets, surpassing the existing SNN models with Transformer-based and CNN-based backbones.	Translated: 2023-10-24 22:18:35 Published: 2023-10-23
# Efficient Cross-Task Prompt Tuning for Few-Shot Conversational Emotion Recognition ( http://arxiv.org/abs/2310.14614v1 ) License: see link	Yige Xu, Zhiwei Zeng, Zhiqi Shen	Emotion Recognition in Conversation (ERC) has been widely studied due to its importance in developing emotion-aware empathetic machines. The rise of pre-trained language models (PLMs) has further pushed the limit of ERC performance. However, most recent works on ERC using PLMs are heavily data-driven and require fine-tuning the entire PLM. To improve both sample and computational efficiency, we propose a derivative-free optimization method called Cross-Task Prompt Tuning (CTPT) for few-shot conversational emotion recognition. Unlike existing methods that learn independent knowledge from individual tasks, CTPT leverages sharable cross-task knowledge by exploiting external knowledge from other source tasks to improve learning performance under the few-shot setting. Moreover, CTPT only needs to optimize a vector of low intrinsic dimensionality without gradients, which is highly parameter-efficient compared with existing approaches. Experiments on five different contextual conversation datasets demonstrate that our CTPT method achieves superior results in both few-shot scenarios and zero-shot transfers.	Translated: 2023-10-24 22:10:18 Published: 2023-10-23
# That was the last straw, we need more: Are Translation Systems Sensitive to Disambiguating Context? ( http://arxiv.org/abs/2310.14610v1 ) License: see link	Jaechan Lee, Alisa Liu, Orevaoghene Ahia, Hila Gonen, Noah A. Smith	The translation of ambiguous text presents a challenge for translation systems, as it requires using the surrounding context to disambiguate the intended meaning as much as possible. While prior work has studied ambiguities that result from different grammatical features of the source and target language, we study semantic ambiguities that exist in the source (English in this work) itself. In particular, we focus on idioms that are open to both literal and figurative interpretations (e.g., goose egg), and collect TIDE, a dataset of 512 pairs of English sentences containing idioms with disambiguating context such that one is literal (it laid a goose egg) and the other is figurative (they scored a goose egg, as in a score of zero). In experiments, we compare MT-specific models and language models for (i) their preference when given an ambiguous subsentence, (ii) their sensitivity to disambiguating context, and (iii) the performance disparity between figurative and literal source sentences. We find that current MT models consistently translate English idioms literally, even when the context suggests a figurative interpretation. On the other hand, LMs are far more context-aware, although there remain disparities across target languages. Our findings underline the potential of LMs as a strong backbone for context-aware translation.	Translated: 2023-10-24 22:09:58 Published: 2023-10-23
# Long Short-Term Planning for Conversational Recommendation Systems ( http://arxiv.org/abs/2310.14609v1 ) License: see link	Xian Li, Hongguang Shi, Yunfei Wang, Yeqin Zhang, Xubin Li and Cam-Tu Nguyen	In Conversational Recommendation Systems (CRS), the central question is how the conversational agent can naturally ask for user preferences and provide suitable recommendations. Existing works mainly follow a hierarchical architecture, where a higher-level policy decides whether to invoke the conversation module (to ask questions) or the recommendation module (to make recommendations). This architecture prevents these two components from fully interacting with each other. In contrast, this paper proposes a novel architecture, the long short-term feedback architecture, to connect these two essential components in CRS. Specifically, the recommendation module predicts the long-term recommendation target based on the conversational context and the user history. Driven by the targeted recommendation, the conversational model predicts the next topic or attribute to verify whether the user preference matches the target. The balance feedback loop continues until the short-term planner output matches the long-term planner output, which is when the system should make the recommendation.	Translated: 2023-10-24 22:09:35 Published: 2023-10-23
# CAD-DA: Controllable Anomaly Detection after Domain Adaptation by Statistical Inference ( http://arxiv.org/abs/2310.14608v1 ) License: see link	Vo Nguyen Le Duy, Hsuan-Tien Lin, Ichiro Takeuchi	We propose a novel statistical method for testing the results of anomaly detection (AD) under domain adaptation (DA), which we call CAD-DA: controllable AD under DA. The distinct advantage of CAD-DA lies in its ability to control the probability of misidentifying anomalies under a pre-specified level $\alpha$ (e.g., 0.05). The challenge within this DA setting is the necessity to account for the influence of DA to ensure the validity of the inference results. Our solution to this challenge leverages the concept of conditional Selective Inference to handle the impact of DA. To our knowledge, this is the first work capable of conducting a valid statistical inference within the context of DA. We evaluate the performance of the CAD-DA method on both synthetic and real-world datasets.	Translated: 2023-10-24 22:09:19 Published: 2023-10-23
# Investigating the Fairness of Large Language Models for Predictions on Tabular Data ( http://arxiv.org/abs/2310.14607v1 ) License: see link	Yanchen Liu, Srishti Gautam, Jiaqi Ma, Himabindu Lakkaraju	Recent literature has suggested the potential of using large language models (LLMs) to make predictions for tabular tasks. However, LLMs have been shown to exhibit harmful social biases that reflect the stereotypes and inequalities present in society. Given this, and given the widespread use of tabular data in many high-stakes applications, it is imperative to explore the following questions: what sources of information do LLMs draw upon when making predictions for tabular tasks; whether and to what extent are LLM predictions for tabular tasks influenced by social biases and stereotypes; and what are the consequential implications for fairness? Through a series of experiments, we delve into these questions and show that LLMs tend to inherit social biases from their training data, which significantly impact their fairness in tabular prediction tasks. Furthermore, our investigations show that in the context of bias mitigation, though in-context learning and fine-tuning have a moderate effect, the fairness metric gap between different subgroups is still larger than that in traditional machine learning models, such as Random Forest and shallow Neural Networks. This observation emphasizes that the social biases are inherent within the LLMs themselves and inherited from their pre-training corpus, not only from the downstream task datasets. In addition, we demonstrate that label-flipping of in-context examples can significantly reduce biases, further highlighting the presence of inherent bias within LLMs.	Translated: 2023-10-24 22:09:07 Published: 2023-10-23
# M2DF: Multi-grained Multi-curriculum Denoising Framework for Multimodal Aspect-based Sentiment Analysis ( http://arxiv.org/abs/2310.14605v1 ) License: see link	Fei Zhao, Chunhui Li, Zhen Wu, Yawen Ouyang, Jianbing Zhang, Xinyu Dai	Multimodal Aspect-based Sentiment Analysis (MABSA) is a fine-grained sentiment analysis task that has attracted growing research interest recently. Existing work mainly utilizes image information to improve the performance of the MABSA task. However, most studies overestimate the importance of images, since the datasets contain many noisy images unrelated to the text, which have a negative impact on model learning. Although some work attempts to filter low-quality noisy images by setting thresholds, relying on thresholds will inevitably filter out a lot of useful image information. Therefore, in this work, we focus on whether the negative impact of noisy images can be reduced without modifying the data. To achieve this goal, we borrow the idea of Curriculum Learning and propose a Multi-grained Multi-curriculum Denoising Framework (M2DF), which can achieve denoising by adjusting the order of training data. Extensive experimental results show that our framework consistently outperforms state-of-the-art work on three sub-tasks of MABSA.	Translated: 2023-10-24 22:08:38 Published: 2023-10-23
# Generative Pre-trained Transformer for Vietnamese Community-based COVID-19 Question Answering ( http://arxiv.org/abs/2310.14602v1 ) License: see link	Tam Minh Vo and Khiem Vinh Tran	Recent studies have provided empirical evidence of the wide-ranging potential of Generative Pre-trained Transformer (GPT), a pretrained language model, in the field of natural language processing. GPT has been effectively employed as a decoder within state-of-the-art (SOTA) question answering systems, yielding exceptional performance across various tasks. However, research on GPT's application to Vietnamese remains limited. This paper aims to address this gap by presenting an implementation of GPT-2 for community-based question answering specifically focused on COVID-19 related queries in Vietnamese. We introduce a novel approach by conducting a comparative analysis of different Transformers and SOTA models on the community-based COVID-19 question answering dataset. The experimental findings demonstrate that the GPT-2 models exhibit highly promising outcomes, outperforming other SOTA models as well as previous community-based COVID-19 question answering models developed for Vietnamese.	Translated: 2023-10-24 22:08:15 Published: 2023-10-23
# Prefix-Tuning Based Unsupervised Text Style Transfer ( http://arxiv.org/abs/2310.14599v1 ) License: see link	Huiyu Mai, Wenhao Jiang, Zhihong Deng	Unsupervised text style transfer aims at training a generative model that can alter the style of the input sentence while preserving its content without using any parallel data. In this paper, we employ powerful pre-trained large language models and present a new prefix-tuning-based method for unsupervised text style transfer. We construct three different kinds of prefixes, i.e., a shared prefix, a style prefix, and a content prefix, to encode task-specific information, the target style, and the content information of the input sentence, respectively. Compared to the embeddings used by previous works, the proposed prefixes can provide richer information for the model. Furthermore, we adopt a recursive way of using language models in the process of style transfer. This strategy provides a more effective way for the input sentence and GPT-2 to interact, helps the model construct more informative prefixes, and thus helps improve the performance. Evaluations on well-known datasets show that our method outperforms the state-of-the-art baselines. Results, analysis of ablation studies, and subjective evaluations from humans are also provided for a deeper understanding of the proposed method.	Translated: 2023-10-24 22:07:57 Published: 2023-10-23
# Learning to Correct Noisy Labels for Fine-Grained Entity Typing via Co-Prediction Prompt Tuning ( http://arxiv.org/abs/2310.14596v1 ) License: see link	Minghao Tang, Yongquan He, Yongxiu Xu, Hongbo Xu, Wenyuan Zhang, Yang Lin	Fine-grained entity typing (FET) is an essential task in natural language processing that aims to assign semantic types to entities in text. However, FET poses a major challenge known as the noise labeling problem: current methods rely on estimating the noise distribution to identify noisy labels but are confused by diverse noise distribution deviation. To address this limitation, we introduce Co-Prediction Prompt Tuning for noise correction in FET, which leverages multiple prediction results to identify and correct noisy labels. Specifically, we integrate prediction results to recall labeled labels and utilize a differentiated margin to identify inaccurate labels. Moreover, we design an optimization objective concerning divergent co-predictions during fine-tuning, ensuring that the model captures sufficient information and maintains robustness in noise identification. Experimental results on three widely-used FET datasets demonstrate that our noise correction approach significantly enhances the quality of various types of training samples, including those annotated using distant supervision, ChatGPT, and crowdsourcing.	Translated: 2023-10-24 22:07:39 Published: 2023-10-23
# Online Auditing of Information Flow ( http://arxiv.org/abs/2310.14595v1 ) License: see link	Mor Oren-Loberman, Vered Azar, Wasim Huleihel	Modern social media platforms play an important role in facilitating rapid dissemination of information through their massive user networks. Fake news, misinformation, and unverifiable facts on social media platforms propagate disharmony and affect society. In this paper, we consider the problem of online auditing of information flow/propagation with the goal of classifying news items as fake or genuine. Specifically, driven by empirical studies on real-world social media platforms, we propose a probabilistic Markovian information spread model over networks modeled by graphs. We then formulate our inference task as a sequential detection problem with the goal of minimizing a combination of the error probability and the time it takes to reach a correct decision. For this model, we find the optimal detection algorithm minimizing the aforementioned risk and prove several statistical guarantees. We then test our algorithm on real-world datasets. To that end, we first construct an offline algorithm for learning the probabilistic information spreading model, and then apply our optimal detection algorithm. Experimental studies show that our algorithm outperforms state-of-the-art misinformation detection algorithms in terms of accuracy and detection time.	Translated: 2023-10-24 22:07:17 Published: 2023-10-23
# Rules and Meaning in Quantum Mechanics ( http://arxiv.org/abs/2310.14634v1 ) License: see link	Iulian D. Toader	This book concerns the metasemantics of quantum mechanics (QM). Roughly, it pursues an investigation at an intersection of the philosophy of physics and the philosophy of semantics, and it offers a critical analysis of rival explanations of the semantic facts of standard QM. Two problems for such explanations are discussed: categoricity and permanence of rules. New results include 1) a reconstruction of Einstein's incompleteness argument, which concludes that a local, separable, and categorical QM cannot exist, 2) a reinterpretation of Bohr's principle of correspondence, grounded in the principle of permanence, 3) a meaning-variance argument for quantum logic, which follows a line of critical reflections initiated by Weyl, and 4) an argument for semantic indeterminacy leveled against inferentialism about QM, inspired by Carnap's work in the philosophy of classical logic.	Translated: 2023-10-24 22:01:56 Published: 2023-10-23
# Extending Input Contexts of Language Models through Training on Segmented Sequences ( http://arxiv.org/abs/2310.14633v1 ) License: see link	Petros Karypis, Julian McAuley, George Karypis	Effectively training language models on long inputs poses many technical challenges. For cost reasons, language models are pretrained on a fixed sequence length before being adapted to longer sequences. We explore various methods for adapting models to longer inputs by training on segmented sequences, as well as an interpolation-based method for extending absolute positional embeddings. We develop a training procedure to extend the input context size of pretrained models with no architectural changes and no additional memory cost beyond that of training on the original input lengths. By sub-sampling segments from long inputs while maintaining their original positions, the model is able to learn new positional interactions. Our method benefits both models trained with absolute positional embeddings, by extending their input contexts, and popular relative positional embedding methods, which show reduced perplexity on sequences longer than those they were trained on. We demonstrate that our method can extend input contexts by a factor of 4x while improving perplexity.	Translated: 2023-10-24 22:01:42 Published: 2023-10-23
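To make the segment sub-sampling idea in the abstract above concrete, here is a small Python sketch that picks non-overlapping segments from a long sequence and returns their original position ids, so that a model trained on a short budget of tokens still sees large position values. The grid-aligned sampling scheme and the names `sample_segments`, `num_segments`, and `segment_len` are assumptions made for illustration; the abstract does not specify how the authors choose segments.

```python
import random

def sample_segments(seq_len, num_segments, segment_len, rng=random):
    """Pick non-overlapping segments and return their original position ids."""
    n_slots = seq_len // segment_len
    # Choose slots on a segment-sized grid so that segments never overlap.
    slots = sorted(rng.sample(range(n_slots), num_segments))
    positions = []
    for slot in slots:
        start = slot * segment_len
        positions.extend(range(start, start + segment_len))
    return positions

# 4 segments of 256 tokens drawn from a 4096-token input:
# the training length stays at 1024 tokens, but position ids span the long input.
position_ids = sample_segments(seq_len=4096, num_segments=4,
                               segment_len=256, rng=random.Random(0))
```

The tokens at `position_ids` would then be fed to the model together with these ids in place of the usual `0..1023`, which is one plausible reading of "maintaining their original position".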
# Making informed decisions in cutting tool maintenance in milling: A KNN based model agnostic approach ( http://arxiv.org/abs/2310.14629v1 ) License: see link	Aditya M. Rahalkar, Om M. Khare, Abhishek D. Patange	In machining processes, monitoring the condition of the tool is crucial to ensuring high productivity and product quality. Using different machine learning techniques in Tool Condition Monitoring (TCM) enables a better analysis of the large amount of data from the different signals acquired during machining. The real-time force signals encountered during the process were acquired by performing numerous experiments. Different tool wear conditions were considered during the experimentation. A comprehensive statistical analysis of the data and feature selection using decision trees was conducted, and the KNN algorithm was used to perform classification. Hyperparameter tuning was done to improve the model's performance. Much research has been done on employing machine learning approaches in tool condition monitoring systems; however, few studies implement a model agnostic approach that increases the interpretability of the process and gives an in-depth understanding of how decisions are made. This research paper presents a KNN based white box model, which allows us to dive deep into how the model performs the classification and how it prioritizes the different features included. This approach helps in detecting why the tool is in a certain condition and allows the manufacturer to make an informed decision about the tool's maintenance.	Translated: 2023-10-24 22:01:26 Published: 2023-10-23
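The KNN classification step described in the abstract above can be illustrated with a few lines of Python. This is a generic k-nearest-neighbours sketch, not the authors' model: the two features (mean and standard deviation of the cutting force), the numeric values, and the labels "healthy" and "worn" are invented for the example. Returning the neighbours alongside the prediction hints at why KNN lends itself to inspection: the evidence for each decision is a short list of concrete training samples.

```python
import math
from collections import Counter

def knn_predict(train, query, k=3):
    """train: list of (feature_vector, label). Returns (label, k nearest neighbours)."""
    neighbours = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in neighbours)
    return votes.most_common(1)[0][0], neighbours

# Hypothetical (mean force, force std) features per tool condition.
train = [
    ((10.0, 1.0), "healthy"), ((11.0, 1.2), "healthy"), ((12.0, 1.1), "healthy"),
    ((20.0, 3.0), "worn"), ((21.0, 3.5), "worn"), ((19.5, 2.8), "worn"),
]
label, neighbours = knn_predict(train, query=(19.0, 2.9), k=3)
```

In practice the features would be scaled before computing distances, and `k` would be chosen by the kind of hyperparameter tuning the abstract mentions.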
# Plan, Verify and Switch: Integrated Reasoning with Diverse X-of-Thoughts ( http://arxiv.org/abs/2310.14628v1 ) License: see link	Tengxiao Liu, Qipeng Guo, Yuqing Yang, Xiangkun Hu, Yue Zhang, Xipeng Qiu, Zheng Zhang	As large language models (LLMs) have shown effectiveness with different prompting methods, such as Chain of Thought and Program of Thought, we find that these methods complement each other well on math reasoning tasks. In this work, we propose XoT, an integrated problem solving framework that prompts LLMs with diverse reasoning thoughts. For each question, XoT always begins by selecting the most suitable method and then executes each method iteratively. Within each iteration, XoT actively checks the validity of the generated answer and incorporates feedback from external executors, allowing it to dynamically switch among different prompting methods. Through extensive experiments on 10 popular math reasoning datasets, we demonstrate the effectiveness of our proposed approach and thoroughly analyze the strengths of each module. Moreover, empirical results suggest that our framework is orthogonal to recent work that improves single reasoning methods and can further generalise to the logical reasoning domain. By allowing method switching, XoT provides a fresh perspective on the collaborative integration of diverse reasoning thoughts in a unified framework.	Translated: 2023-10-24 22:01:06 Published: 2023-10-23
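The plan, verify, and switch loop outlined in the abstract above can be sketched as a simple control flow. Everything below is a toy stand-in and an assumption of this sketch: the two "methods" are plain functions rather than LLM prompts, the planner is a stub that keeps the given order, and the verifier is a hand-written arithmetic check rather than an external executor.

```python
def solve_with_switching(question, methods, plan, verify):
    """Try methods in the planned order; return the first answer that passes verification."""
    order = plan(question, list(methods))
    for name in order:
        answer = methods[name](question)
        if verify(question, answer):
            return name, answer
    return None, None  # no method produced a verified answer

# Toy setup: a question is (a, b, c), asking for a + b * c.
methods = {
    # Deliberately applies the wrong operator precedence to simulate a reasoning slip.
    "chain_of_thought": lambda q: (q[0] + q[1]) * q[2],
    "program_of_thought": lambda q: q[0] + q[1] * q[2],
}
plan = lambda q, names: names                      # stub planner: keep the given order
verify = lambda q, ans: ans - q[0] == q[1] * q[2]  # stub checker
result = solve_with_switching((2, 3, 4), methods, plan, verify)
```

Here the first method's answer fails the check, so the loop switches to the second method, whose answer is accepted.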
# CrisisMatch: Semi-Supervised Few-Shot Learning for Fine-Grained Disaster Tweet Classification ( http://arxiv.org/abs/2310.14627v1 ) License: see link	Henry Peng Zou, Yue Zhou, Cornelia Caragea, and Doina Caragea	The real-time information about natural disasters shared on social media platforms like Twitter and Facebook plays a critical role in informing volunteers, emergency managers, and response organizations. However, supervised learning models for monitoring disaster events require large amounts of annotated data, making them unrealistic for real-time use in disaster events. To address this challenge, we present a fine-grained disaster tweet classification model under the semi-supervised, few-shot learning setting, where only a small amount of annotated data is required. Our model, CrisisMatch, effectively classifies tweets into fine-grained classes of interest using a small amount of labeled data and large amounts of unlabeled data, mimicking the early stage of a disaster. By integrating effective semi-supervised learning ideas and incorporating TextMixUp, CrisisMatch achieves an average performance improvement of 11.2% on two disaster datasets. Further analyses are also provided on the influence of the amount of labeled data and on out-of-domain results.	Translated: 2023-10-24 22:00:47 Published: 2023-10-23
# 電子商取引事前販売対話における対話型レコメンダシステムと大規模言語モデル Conversational Recommender System and Large Language Model Are Made for Each Other in E-commerce Pre-sales Dialogue ( http://arxiv.org/abs/2310.14626v1 ) ライセンス: Link先を確認	Yuanxing Liu, Wei-Nan Zhang, Yifan Chen, Yuchi Zhang, Haopeng Bai, Fan Feng, Hengbin Cui, Yongbin Li, Wanxiang Che	(参考訳) Eコマース事前販売対話は、ユーザーが求めるアイテムのニーズや好みを理解し、適切なレコメンデーションを提供することを目的としている。会話推薦システム(CRS)はユーザ表現を学習し、対話コンテキストに基づいて正確なレコメンデーションを提供するが、外部知識に依存している。大規模言語モデル(LLM)は、微調整後、事前販売対話を模倣する応答を生成するが、正確な推奨のためのドメイン固有の知識が欠如している。直感的には、Eコマース事前販売対話におけるLLMとCRSの強みは相補的であるが、それについてはこれまで研究されていない。本稿では,LLMとCRSを併用した電子商取引事前販売対話の有効性について検討し,CRSがLLMを支援する方法とLLMがCRSを支援する方法という2つの協調手法を提案する。実際のEコマース事前販売対話のデータセットについて広範な実験を行った。 2つのCRSと2つのLLMによる2つの協調的アプローチがEコマース事前販売対話の4つのタスクに与える影響を分析する。 CRS と LLM の協調作業は,いくつかのケースで非常に効果的であることが判明した。 E-commerce pre-sales dialogue aims to understand and elicit user needs and preferences for the items they are seeking so as to provide appropriate recommendations. Conversational recommender systems (CRSs) learn user representation and provide accurate recommendations based on dialogue context, but rely on external knowledge. Large language models (LLMs) generate responses that mimic pre-sales dialogues after fine-tuning, but lack domain-specific knowledge for accurate recommendations. Intuitively, the strengths of LLM and CRS in E-commerce pre-sales dialogues are complementary, yet no previous work has explored this. This paper investigates the effectiveness of combining LLM and CRS in E-commerce pre-sales dialogues, proposing two collaboration methods: CRS assisting LLM and LLM assisting CRS. We conduct extensive experiments on a real-world dataset of E-commerce pre-sales dialogues. We analyze the impact of two collaborative approaches with two CRSs and two LLMs on four tasks of E-commerce pre-sales dialogue. We find that collaborations between CRS and LLM can be very effective in some cases.	翻訳日:2023-10-24 22:00:29 公開日:2023-10-23
# CoF-CoT:マルチドメインNLUタスクのための粗いチェーン・オブ・ソートによる大規模言語モデルの強化 CoF-CoT: Enhancing Large Language Models with Coarse-to-Fine Chain-of-Thought Prompting for Multi-domain NLU Tasks ( http://arxiv.org/abs/2310.14623v1 ) ライセンス: Link先を確認	Hoang H. Nguyen, Ye Liu, Chenwei Zhang, Tao Zhang, Philip S. Yu	(参考訳) Chain-of-Thoughtのプロンプトは推論タスクで人気があるが、自然言語理解(NLU)におけるLarge Language Models(LLMs)への応用は未定である。 llmsの多段階推論に動機づけられ,nluタスクを複数の推論ステップに分解し,llmが様々な粒度からタスクを解決するための必須概念を習得し活用する,粗粒間連鎖(cof-cot)アプローチを提案する。さらに、意味に基づく抽象的意味表現(AMR)構造化知識を中間段階として活用して、発話のニュアンスや多様な構造を捉え、その粒度の異なる関係を理解することを提案する。提案手法は、ゼロショットと少数ショットの両方のマルチドメイン設定の下で、多粒性NLUタスクへのLLMの適応を支援するのに有効である。 While Chain-of-Thought prompting is popular in reasoning tasks, its application to Large Language Models (LLMs) in Natural Language Understanding (NLU) is under-explored. Motivated by multi-step reasoning of LLMs, we propose Coarse-to-Fine Chain-of-Thought (CoF-CoT) approach that breaks down NLU tasks into multiple reasoning steps where LLMs can learn to acquire and leverage essential concepts to solve tasks from different granularities. Moreover, we propose leveraging semantic-based Abstract Meaning Representation (AMR) structured knowledge as an intermediate step to capture the nuances and diverse structures of utterances, and to understand connections between their varying levels of granularity. Our proposed approach is demonstrated effective in assisting the LLMs adapt to the multi-grained NLU tasks under both zero-shot and few-shot multi-domain settings.	翻訳日:2023-10-24 22:00:10 公開日:2023-10-23
# スパイクモードに基づくニューラルネットワーク Spiking mode-based neural networks ( http://arxiv.org/abs/2310.14621v1 ) ライセンス: Link先を確認	Zhanghan Lin and Haiping Huang	(参考訳) スパイキングニューラルネットワークは、脳のようなニューロモルフィック計算や神経回路の動作機構の研究において重要な役割を果たす。大規模スパイクニューラルネットワークのトレーニングの欠点のひとつは、すべての重み値を更新するコストがかかることだ。さらに、トレーニング後、計算タスクに関連するすべての情報は重み行列に隠蔽され、回路機構の透過的な理解が禁止される。そこで本研究では,これらの課題に対して,スパイクモードベースのトレーニングプロトコルを提案する。第一の利点は、重みが各分解項の重要性を特徴付ける入出力モードとそれに伴うスコアによって解釈されることである。したがってモードの数は調整可能であり、実験データのモデリングの自由度を高めることができる。これにより、学習のスペースが大幅に削減されるため、トレーニングコストが大幅に削減される。第二の利点は、周囲空間の高次元の神経活動が典型的には低次元のモード空間に投影できることである。数値分類と選択的感覚統合タスクという2つの計算タスクでフレームワークを分析した。我々の研究は、ニューラルネットワークをスパイクするためのモードベースの学習ルールを導出する。 Spiking neural networks play an important role in brain-like neuromorphic computations and in studying working mechanisms of neural circuits. One drawback of training a large scale spiking neural network is that an expensive cost of updating all weights is required. Furthermore, after training, all information related to the computational task is hidden into the weight matrix, prohibiting us from a transparent understanding of circuit mechanisms. Therefore, in this work, we address these challenges by proposing a spiking mode-based training protocol. The first advantage is that the weight is interpreted by input and output modes and their associated scores characterizing importance of each decomposition term. The number of modes is thus adjustable, allowing more degrees of freedom for modeling the experimental data. This reduces a sizable training cost because of significantly reduced space complexity for learning. The second advantage is that one can project the high dimensional neural activity in the ambient space onto the mode space which is typically of a low dimension, e.g., a few modes are sufficient to capture the shape of the underlying neural manifolds. We analyze our framework in two computational tasks -- digit classification and selective sensory integration tasks. 
Our work thus derives a mode-based learning rule for spiking neural networks.	翻訳日:2023-10-24 21:59:49 公開日:2023-10-23
# 定常および周期的横磁場をもつイジングスピン系におけるスクランブル Scrambling in Ising spin systems with constant and periodic transverse magnetic fields ( http://arxiv.org/abs/2310.14620v1 ) ライセンス: Link先を確認	Rohit Kumar Shukla	(参考訳) 逆場イジングモデル (TFIM) やフロケスピン系を含む, 可積分系および非可積分系における量子情報のスクランブルについて検討した。本研究はTMIを用いており,TMIはスクランブルの指標として有効であり,より負の値はより高いスクランブルの度合いを示している。積分可能かつ非可積分なTFIMでは,パワー・ロー・パターンに従って初期成長が進行し,顕著なスクランブル動作が観察される。しかし、非可積分TFIMは可積分版に比べて高いスクランブル度を示す。 Floquetシステムでは、TMIは0$から$\pi/2$までの期間にわたって研究される。可積分系と非可積分フロッケ系は、小さな期間のパワーロー成長と大きな期間の急ジャンプを特徴とする$\tau=\pi/4$を除いて、すべての期間にわたってスクランブルな挙動を示す。非可積分フロケ系は、全ての周期で可積分系よりも顕著なスクランブルを示す。私たちが $\tau = \pi/4$ に移動するにつれて、ピークは $\tau = \pi/4$ に近づく(ただし $\tau = \pi/4$ にはならない)。 TMI飽和度はTFIMと比較してFloquet系では変動が少ない。 Floquetシステムにおけるスクランブルの成長は、TFIMを小さな期間で反映するが、より大きな期間で顕著に成長する。短い期間の間、フロッケシステムにおけるスクランブルの程度はtfimのそれと同等であるが、より大きな期間では著しく大きくなる。 Scrambling of quantum information in both integrable and nonintegrable systems, including the transverse field Ising model (TFIM) and Floquet spin systems are studied. Our study employs tripartite mutual information (TMI), with negative TMI serving as an indicator of scrambling, where a more negative value suggests a higher degree of scrambling. In the integrable and nonintegrable TFIM, we observe pronounced scrambling behavior, with the initial growth following a power-law pattern. However, nonintegrable TFIM exhibits a higher degree of scrambling compared to the integrable version. In the Floquet system, TMI is studied across periods from $0$ to $\pi/2$. Both integrable and nonintegrable Floquet systems display scrambling behavior across all periods, except at $\tau=\pi/4$, featuring power-law growth for small periods and abrupt jumps for larger ones. Nonintegrable Floquet systems exhibit more pronounced scrambling compared to integrable ones across all periods. The degree of scrambling increases as we move towards $\tau = \pi/4$, reaching its peak near $\tau = \pi/4$ (but not at $\tau = \pi/4$), regardless of the initial states. 
TMI saturation fluctuates less in the Floquet system in comparison to the TFIM. The growth of scrambling in the Floquet system mirrors TFIM for small periods but exhibits notably faster growth for larger periods. For a small period, the degree of scrambling in a Floquet system is comparable to that in the TFIM, but it becomes significantly greater for larger periods.	翻訳日:2023-10-24 21:59:31 公開日:2023-10-23
# SIGNトレーニングの再考:第1および第2次グラディエントリプシッツを伴わない非凸加速の可能性 Rethinking SIGN Training: Provable Nonconvex Acceleration without First- and Second-Order Gradient Lipschitz ( http://arxiv.org/abs/2310.14616v1 ) ライセンス: Link先を確認	Tao Sun, Congliang Chen, Peng Qiao, Li Shen, Xinwang Liu, Dongsheng Li	(参考訳) 符号に基づく確率的手法は, パラメータ更新に符号情報のみを用いるにもかかわらず, 頑健な性能を実現する能力から注目されている。しかし、符号ベースの手法の現在の収束解析は、高非滑らか性を含むディープニューラルネットワークトレーニングのような実践的なタスクでは役に立たない一階勾配リプシッツと二階勾配リプシッツの強い仮定に依存している。本稿では,符号に基づく手法を再検討し,その収束を,一階および二階の滑らかさのより現実的な仮定の下で解析する。まず, 弱一階リプシッツの下で符号ベース法を収束させる。弱一階リプシッツに動機づけられ,符号に基づく手法において非凸加速度を許容する緩和された二階条件を提案する。理論的な結果から,最近開発したlionアルゴリズムの計算性能について知見を得た。分散環境では、高速通信圧縮ゴシッププロトコルを利用する場合、この非凸加速度はノード数を線形に高速化することで持続する。我々の理論結果の新規性は、それらがより弱い仮定の下で導出され、手話ベースのアルゴリズムの証明可能な適用性を幅広い問題に拡張することにある。 Sign-based stochastic methods have gained attention due to their ability to achieve robust performance despite using only the sign information for parameter updates. However, the current convergence analysis of sign-based methods relies on the strong assumptions of first-order gradient Lipschitz and second-order gradient Lipschitz, which may not hold in practical tasks like deep neural network training that involve high non-smoothness. In this paper, we revisit sign-based methods and analyze their convergence under more realistic assumptions of first- and second-order smoothness. We first establish the convergence of the sign-based method under weak first-order Lipschitz. Motivated by the weak first-order Lipschitz, we propose a relaxed second-order condition that still allows for nonconvex acceleration in sign-based methods. Based on our theoretical results, we gain insights into the computational advantages of the recently developed LION algorithm. In distributed settings, we prove that this nonconvex acceleration persists with linear speedup in the number of nodes, when utilizing fast communication compression gossip protocols. 
The novelty of our theoretical results lies in that they are derived under much weaker assumptions, thereby expanding the provable applicability of sign-based algorithms to a wider range of problems.	翻訳日:2023-10-24 21:58:45 公開日:2023-10-23
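As a rough illustration of what "using only the sign information for parameter updates" means in the entry above, the following minimal Python sketch applies a plain sign-based step to a toy quadratic. It is not the authors' algorithm or the LION optimizer: the objective, the step size, and the function names (`sign_step`, `quadratic_grad`, `run`) are assumptions made for this example only.

```python
# Illustrative sketch only: a plain sign-based update on a toy objective.
# The objective, step size, and all names here are assumptions for the example,
# not taken from the paper.

def sign(x):
    """Return -1, 0, or 1 according to the sign of x."""
    return (x > 0) - (x < 0)


def sign_step(params, grads, lr):
    """Move every parameter by a fixed step lr against the sign of its gradient."""
    return [p - lr * sign(g) for p, g in zip(params, grads)]


def quadratic_grad(params):
    """Gradient of the toy objective f(x) = sum(x_i ** 2)."""
    return [2.0 * p for p in params]


def run(params, lr=0.1, steps=50):
    """Apply the sign-based step repeatedly; the gradient magnitude is never used."""
    for _ in range(steps):
        params = sign_step(params, quadratic_grad(params), lr)
    return params


if __name__ == "__main__":
    print(run([1.0, -2.0]))
```

Because the step length is fixed at `lr`, the iterates in this toy setting end up hovering within one step of the minimiser instead of converging exactly, which is one reason step-size schedules matter for sign-based methods.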
# 曖昧な明確な記述の理由 Reasoning about Ambiguous Definite Descriptions ( http://arxiv.org/abs/2310.14657v1 ) ライセンス: Link先を確認	Stefan F. Schouten, Peter Bloem, Ilia Markov, Piek Vossen	(参考訳) 自然言語推論は、複雑な言語理解タスクを解く言語モデルの能力を改善する上で、ますます重要な役割を担っている。推論の興味深いユースケースは、コンテキスト依存の曖昧さの解決である。しかし、Large Language Modelsが言語における曖昧さを解決するために明示的な推論をどの程度活用できるかを評価するためのリソースは存在しない。この目的のために曖昧な明確な記述を用い、そのような句からなる最初のベンチマークデータセットを作成し、公開することを提案する。提案手法には,プロンプトの曖昧さを解決するために必要なすべての情報が含まれている。これは最近のLLMにとって難しい課題である。 https://github.com/sfschouten/exploiting-ambiguity Natural language reasoning plays an increasingly important role in improving language models' ability to solve complex language understanding tasks. An interesting use case for reasoning is the resolution of context-dependent ambiguity. But no resources exist to evaluate how well Large Language Models can use explicit reasoning to resolve ambiguity in language. We propose to use ambiguous definite descriptions for this purpose and create and publish the first benchmark dataset consisting of such phrases. Our method includes all information required to resolve the ambiguity in the prompt, which means a model does not require anything but reasoning to do well. We find this to be a challenging task for recent LLMs. Code and data available at: https://github.com/sfschouten/exploiting-ambiguity	翻訳日:2023-10-24 21:50:42 公開日:2023-10-23
# 非平衡温度測定のための強結合フェルミオンプローブ Strongly coupled fermionic probe for nonequilibrium thermometry ( http://arxiv.org/abs/2310.14655v1 ) ライセンス: Link先を確認	Ricard Ravell Rodríguez, Mohammad Mehboudi, Michał Horodecki, and Martí Perarnau-Llobet	(参考訳) 温度 $T$ の対象試料と強く結合した単一フェルミオン温度測定プローブについて、量子フィッシャー情報 (QFI) によって定量化される測定感度を特徴付ける。プローブが試料と熱平衡に達すると(平衡温度測定)、QFIは弱いカップリング状態から離れることで極低温で指数関数的に改善され、ボゾンプローブ [PRA 96, 062103 (2017)] の同様の結果を補完する。試料との平衡に達する前にプローブを計測する非平衡プロトコルについて,非マルコフダイナミクスに起因する測定感度の新しい挙動を見出す。まず、QFIは、平衡まで単調に成長するマルコフのケースとは対照的に、時間に対して非常に非単調な振る舞いを示すので、非マルコフの回復はより高いQFIに到達するために活用できる。第2に、QFIレートは有限の尋問時間$t^*$で最大化され、これはマルコフ極限で知られている解$t^* \rightarrow 0$とは対照的である [Quantum 6, 869 (2022)]。最後に、数個のフェルミオンからなるプローブを考察し、測定精度のさまざまな集団的増強について論じる。 We characterise the measurement sensitivity, quantified by the Quantum Fisher Information (QFI), of a single-fermionic thermometric probe strongly coupled to the sample of interest at temperature $T$. When the probe reaches thermal equilibrium with the sample (equilibrium thermometry), we find that the QFI can be exponentially improved at ultralow temperatures by moving away from the weak coupling regime, complementing similar results for bosonic probes [PRA 96, 062103 (2017)]. For nonequilibrium protocols, in which the probe is measured before reaching equilibrium with the sample, we find new behaviour of the measurement sensitivity arising due to non-Markovian dynamics. First, we show that the QFI displays a highly non-monotonic behaviour in time, in contrast to the Markovian case where it grows monotonically until equilibrium, so that non-Markovian revivals can be exploited to reach a higher QFI. Second, the QFI rate is maximised at a finite interrogation time $t^*$, which we characterize, in contrast to the solution $t^* \rightarrow 0$ known in the Markovian limit [Quantum 6, 869 (2022)]. Finally, we consider probes made up of a few fermions and discuss different collective enhancements in the measurement precision.	翻訳日:2023-10-24 21:50:30 公開日:2023-10-23
# SPRING-INX: SPRING Lab, IIT Madrasによる多言語インド言語音声コーパス SPRING-INX: A Multilingual Indian Language Speech Corpus by SPRING Lab, IIT Madras ( http://arxiv.org/abs/2310.14654v1 ) ライセンス: Link先を確認	Nithya R, Malavika S, Jordan F, Arjun Gangwar, Metilda N J, S Umesh, Rithik Sarab, Akhilesh Kumar Dubey, Govind Divakaran, Samudra Vijaya K, Suryakanth V Gangashetty	(参考訳) インドには多くの言語があり、22の言語がインド憲法によって公式に承認されている。インド国民のための音声ベースのアプリケーションを構築することは、限られたデータと対応すべき言語やアクセントの数のために難しい問題である。言語技術コミュニティがインドの言語で音声ベースのアプリケーションを構築することを奨励するため、私たちはSPRING-INXデータをオープンソース化しています。これは、アッサム語、ベンガル語、グジャラート語、ヒンディー語、カンナダ語、マラヤーラム語、マラーティー語、オディア語、パンジャーブ語、タミル語のASRシステム構築のための、合法的に入手し手作業で書き起こした約2000時間の音声データです。この取り組みはインド工科大学マドラス校のSPRING Labが行い、インド政府電子情報技術省(MeitY)が出資したNLTM(National Language Translation Mission)の一部となっている。本稿では,データ収集とデータクリーニングのプロセスとデータ統計について述べる。 India is home to a multitude of languages of which 22 languages are recognised by the Indian Constitution as official. Building speech based applications for the Indian population is a difficult problem owing to limited data and the number of languages and accents to accommodate. To encourage the language technology community to build speech based applications in Indian languages, we are open sourcing SPRING-INX data which has about 2000 hours of legally sourced and manually transcribed speech data for ASR system building in Assamese, Bengali, Gujarati, Hindi, Kannada, Malayalam, Marathi, Odia, Punjabi and Tamil. This endeavor is by SPRING Lab, Indian Institute of Technology Madras and is a part of National Language Translation Mission (NLTM), funded by the Indian Ministry of Electronics and Information Technology (MeitY), Government of India. We describe the data collection and data cleaning process along with the data statistics in this paper.	翻訳日:2023-10-24 21:50:05 公開日:2023-10-23
# フェア顔認識のための不変特徴の正規化 Invariant Feature Regularization for Fair Face Recognition ( http://arxiv.org/abs/2310.14652v1 ) ライセンス: Link先を確認	Jiali Ma, Zhongqi Yue, Kagaya Tomoyuki, Suzuki Tomoki, Karlekar Jayashree, Sugiri Pranata, Hanwang Zhang	(参考訳) 公正な顔認識は、あらゆる人口集団で目に見えない顔に一般化する不変性を学ぶことだ。残念ながら、顔データセットは必然的に、現実世界の観察においてユビキタスな不均衡な人口統計特性を捉え、モデルは少数派グループであまり一般化しないバイアスのある特徴を学習する。この偏見は、人口統計学的特性の相違によるものであり、人口統計学的特徴を捉えるためにこのモデルを誤解させるものである。コンファウンディング効果は、コンファウンデーションアノテーションを必要とする因果的介入によってのみ取り除くことができる。しかし,このようなアノテーションは,属性の多様性から,非常に高価である。そこで本研究では,教師なし方式で多様なデータ分割を反復的に生成することを提案する。各データパーティションは自己アノテートされた共同創設者として機能し、Invariant Feature Regularization (INV-REG) のデコンウンドを可能にします。 INV-REGは既存の手法と直交しており、INV-REGと2つの強いベースライン(ArcfaceとCIFP)を組み合わせることで、さまざまな人口集団における顔認識を改善する新しい最先端技術がもたらされる。コードはhttps://github.com/panasonicconnect/invregで入手できる。 Fair face recognition is all about learning invariant feature that generalizes to unseen faces in any demographic group. Unfortunately, face datasets inevitably capture the imbalanced demographic attributes that are ubiquitous in real-world observations, and the model learns biased feature that generalizes poorly in the minority group. We point out that the bias arises due to the confounding demographic attributes, which mislead the model to capture the spurious demographic-specific feature. The confounding effect can only be removed by causal intervention, which requires the confounder annotations. However, such annotations can be prohibitively expensive due to the diversity of the demographic attributes. To tackle this, we propose to generate diverse data partitions iteratively in an unsupervised fashion. Each data partition acts as a self-annotated confounder, enabling our Invariant Feature Regularization (INV-REG) to deconfound. INV-REG is orthogonal to existing methods, and combining INV-REG with two strong baselines (Arcface and CIFP) leads to new state-of-the-art that improves face recognition on a variety of demographic groups. 
Code is available at https://github.com/PanasonicConnect/InvReg.	翻訳日:2023-10-24 21:49:45 公開日:2023-10-23
# $\Lambda$-Split: クラウドで動く生成AIのためのプライバシ保護スプリットコンピューティングフレームワーク $\Lambda$-Split: A Privacy-Preserving Split Computing Framework for Cloud-Powered Generative AI ( http://arxiv.org/abs/2310.14651v1 ) ライセンス: Link先を確認	Shoki Ohta, Takayuki Nishio	(参考訳) 生成人工知能(AI)サービスの急成長に伴い、これらの技術に固有の計算要求は、特にリソース制約のあるモバイルデバイスにおいて、クラウドによる計算オフロードを必要とすることが多い。これらのサービスは一般的に、生成プロセスの運営を促すプロンプトを使用し、テキストや画像などのプロンプトと結果のコンテンツの両方がプライバシーに敏感な情報や機密情報を保存し、セキュリティとプライバシーのリスクを高める。これらの懸念を軽減するために,計算オフロードを容易にする分割コンピューティングフレームワークである$\Lambda$-Splitを導入し,盗聴や不正アクセスなどのリスクに対してデータのプライバシを保護した。生成モデルである$\Lambda$-Splitでは、通常はディープニューラルネットワーク(DNN)が3つのサブモデルに分割され、ユーザのローカルデバイスとクラウドサーバに分散される。このアーキテクチャは、隠された層出力のみが送信されることを保証し、プライバシーに敏感な生入力および出力データの外部送信を防止する。 dnnのブラックボックスの性質を考えると、傍受された隠れレイヤ出力から元の入力や出力を推定することは、悪意のある盗聴者にとって大きな課題となる。さらに$\lambda$-splitは、従来の暗号化ベースのセキュリティメカニズムと直交し、同時にデプロイされた時のセキュリティ強化を提供する。 llama 2 を用いた $\lambda$-split フレームワークの有効性を実証的に検証し,meta と stability ai が開発した代表的な大規模言語モデルである stable diffusion xl の有効性を検証した。私たちの$\Lambda$-Splitの実装はhttps://github.com/nishio-laboratory/lambda_splitで公開されています。 In the wake of the burgeoning expansion of generative artificial intelligence (AI) services, the computational demands inherent to these technologies frequently necessitate cloud-powered computational offloading, particularly for resource-constrained mobile devices. These services commonly employ prompts to steer the generative process, and both the prompts and the resultant content, such as text and images, may harbor privacy-sensitive or confidential information, thereby elevating security and privacy risks. To mitigate these concerns, we introduce $\Lambda$-Split, a split computing framework to facilitate computational offloading while simultaneously fortifying data privacy against risks such as eavesdropping and unauthorized access. 
In $\Lambda$-Split, a generative model, usually a deep neural network (DNN), is partitioned into three sub-models and distributed across the user's local device and a cloud server: the input-side and output-side sub-models are allocated to the local, while the intermediate, computationally-intensive sub-model resides on the cloud server. This architecture ensures that only the hidden layer outputs are transmitted, thereby preventing the external transmission of privacy-sensitive raw input and output data. Given the black-box nature of DNNs, estimating the original input or output from intercepted hidden layer outputs poses a significant challenge for malicious eavesdroppers. Moreover, $\Lambda$-Split is orthogonal to traditional encryption-based security mechanisms, offering enhanced security when deployed in conjunction. We empirically validate the efficacy of the $\Lambda$-Split framework using Llama 2 and Stable Diffusion XL, representative large language and diffusion models developed by Meta and Stability AI, respectively. Our $\Lambda$-Split implementation is publicly accessible at https://github.com/nishio-laboratory/lambda_split.	翻訳日:2023-10-24 21:49:21 公開日:2023-10-23
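The three-way partition described in the entry above can be pictured with a small Python toy. This is a hedged sketch under stated assumptions, not the released `lambda_split` implementation: the four scalar "layers", the `CloudServer` and `LocalDevice` classes, and the split points are all hypothetical stand-ins for a real DNN such as Llama 2 or Stable Diffusion XL.

```python
# Toy sketch of a three-way model split. The layers, class names, and split
# points are hypothetical; a real deployment would split an actual DNN and
# send hidden activations over a network connection.

def make_layer(scale, shift):
    """Build a toy layer: elementwise affine map followed by ReLU."""
    def layer(vec):
        return [max(0.0, scale * v + shift) for v in vec]
    return layer


# A stand-in "model" with four layers.
LAYERS = [make_layer(s, b) for s, b in [(2.0, 0.5), (0.5, -0.1), (1.5, 0.0), (1.0, 0.2)]]


def run_layers(layers, vec):
    """Apply the given layers in order."""
    for layer in layers:
        vec = layer(vec)
    return vec


class CloudServer:
    """Holds only the middle sub-model and logs what it receives."""

    def __init__(self, middle_layers):
        self.middle_layers = middle_layers
        self.received = []  # everything that crossed the device/cloud boundary

    def forward(self, hidden):
        self.received.append(list(hidden))
        return run_layers(self.middle_layers, hidden)


class LocalDevice:
    """Holds the input-side (head) and output-side (tail) sub-models."""

    def __init__(self, head, tail, cloud):
        self.head, self.tail, self.cloud = head, tail, cloud

    def generate(self, raw_input):
        hidden_in = run_layers(self.head, raw_input)   # raw input stays local
        hidden_out = self.cloud.forward(hidden_in)     # only hidden activations leave
        return run_layers(self.tail, hidden_out)       # raw output is produced locally


if __name__ == "__main__":
    cloud = CloudServer(LAYERS[1:3])
    device = LocalDevice(LAYERS[:1], LAYERS[3:], cloud)
    print(device.generate([1.0, -2.0]), cloud.received)
```

In this toy the server log contains only the head's output, never the raw input or the final output, which is the property the abstract attributes to the split. How hard it is to invert those hidden activations in practice is a separate question that the sketch does not address.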
# 量子温度測定における温度-熱の不確かさ関係 Temperature-heat uncertainty relation for quantum thermometry ( http://arxiv.org/abs/2310.14645v1 ) ライセンス: Link先を確認	Ning Zhang, Si-Yuan Bai, and Chong Chen	(参考訳) 温度推定のための資源理論について検討する。温度-熱の不確かさ関係を通じて温度精度を根本的に決定するのは熱のゆらぎであることを示す。具体的には,熱は軌道熱と相関熱に分けられ,それぞれ温度計の進化経路に沿った熱交換と,温度計と試料との相関に関連することがわかった。 2種類の温度計に基づいて,両熱項が温度精度向上のための資源であることを示す。さらに,温度-熱の不確かさ関係は熱力学におけるよく知られた温度-エネルギーの不確かさ関係と一致することを示した。推定精度を高めるための資源を明確に識別することにより, 様々な量子特性が正確な温度検出に重要である理由を説明するだけでなく, 超高感度の量子温度計の設計に有用な知見を提供する。 We investigate the resource theory for temperature estimation. We demonstrate that it is the fluctuation of heat that fundamentally determines temperature precision through the temperature-heat uncertainty relation. Specifically, we find that heat is divided into trajectory heat and correlation heat, which are associated with the heat exchange along thermometer's evolution path and the correlation between the thermometer and the sample, respectively. Based on two types of thermometers, we show that both of these heat terms are resources for enhancing temperature precision. Additionally, we demonstrate that the temperature-heat uncertainty relation is consistent with the well known temperature-energy uncertainty relation in thermodynamics. By clearly distinguishing the resources for enhancing estimation precision, our findings not only explain why various quantum features are crucial for accurate temperature sensing but also provide valuable insights for designing ultrahigh-sensitive quantum thermometers.	翻訳日:2023-10-24 21:48:47 公開日:2023-10-23
# 多言語k-Nearest-Neighbor機械翻訳 Multilingual k-Nearest-Neighbor Machine Translation ( http://arxiv.org/abs/2310.14644v1 ) ライセンス: Link先を確認	David Stap, Christof Monz	(参考訳) k-nearest-neighborマシン翻訳は、キャッシュされたサンプルのデータストアを作成することにより、機械翻訳の品質が著しく向上した。しかし、これらの改善は大規模なデータストアを持つ高リソースの言語ペアに限られており、低リソースの言語では依然として課題である。本稿では,複数の言語からの表現を1つのデータストアに組み合わせることで,この問題に対処する。その結果,低リソースの翻訳品質(+3.6BLEUまで)だけでなく,高リソースの翻訳品質(+0.5BLEUまで)も大幅に向上した。実験により,データストア作成に言語的類似性を用いることで,4分の1の大きさの多言語データストアを作成でき,5.3倍の高速化が達成できることを示した。 k-nearest-neighbor machine translation has demonstrated remarkable improvements in machine translation quality by creating a datastore of cached examples. However, these improvements have been limited to high-resource language pairs, with large datastores, and remain a challenge for low-resource languages. In this paper, we address this issue by combining representations from multiple languages into a single datastore. Our results consistently demonstrate substantial improvements not only in low-resource translation quality (up to +3.6 BLEU), but also for high-resource translation quality (up to +0.5 BLEU). Our experiments show that it is possible to create multilingual datastores that are a quarter of the size, achieving a 5.3x speed improvement, by using linguistic similarities for datastore creation.	翻訳日:2023-10-24 21:48:33 公開日:2023-10-23
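For readers unfamiliar with the datastore idea in the entry above, the sketch below shows the usual kNN-MT recipe in miniature: cached (key vector, target token) pairs are searched by distance, the neighbours are turned into a distribution, and that distribution is interpolated with the translation model's own prediction. Merging per-language datastores is shown here as simple concatenation, which is an assumption for illustration. The toy vectors, the temperature, the weight `lam`, and all function names are likewise hypothetical and not taken from the paper.

```python
# Minimal kNN-MT-style sketch with a merged, multilingual datastore.
# Vectors, tokens, temperature, and interpolation weight are made-up
# illustrative values.
import math


def build_datastore(*per_language_stores):
    """Concatenate (key_vector, target_token) entries from several language pairs."""
    merged = []
    for store in per_language_stores:
        merged.extend(store)
    return merged


def search(datastore, query, k):
    """Brute-force k nearest neighbours by squared L2 distance."""
    scored = sorted(
        (sum((a - b) ** 2 for a, b in zip(key, query)), token)
        for key, token in datastore
    )
    return scored[:k]


def knn_distribution(neighbors, temperature=10.0):
    """Turn (distance, token) neighbours into a probability distribution."""
    weights = {}
    for dist, token in neighbors:
        weights[token] = weights.get(token, 0.0) + math.exp(-dist / temperature)
    total = sum(weights.values())
    return {token: w / total for token, w in weights.items()}


def interpolate(p_model, p_knn, lam=0.5):
    """Mix the model distribution with the retrieval distribution."""
    vocab = set(p_model) | set(p_knn)
    return {t: lam * p_knn.get(t, 0.0) + (1.0 - lam) * p_model.get(t, 0.0) for t in vocab}


if __name__ == "__main__":
    store = build_datastore(
        [((0.0, 0.0), "house"), ((1.0, 0.0), "home")],   # entries from one language pair
        [((0.0, 0.1), "house")],                          # entries from a related language pair
    )
    hits = search(store, (0.0, 0.0), k=2)
    print(interpolate({"house": 0.6, "home": 0.4}, knn_distribution(hits)))
```

A production system would replace the brute-force `search` with an approximate nearest-neighbour index. The brute-force version is used here only to keep the example self-contained.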
# Relit-NeuLF: ニューラルネットワークによる効率的なリライティングと新しいビュー合成 Relit-NeuLF: Efficient Relighting and Novel View Synthesis via Neural 4D Light Field ( http://arxiv.org/abs/2310.14642v1 ) ライセンス: Link先を確認	Zhong Li, Liangchen Song, Zhang Chen, Xiangyu Du, Lele Chen, Junsong Yuan, Yi Xu	(参考訳) 本稿では,光源数に制限のある多視点画像から複雑なシーンの同時リライティングと新規ビュー合成の問題に対処する。本稿ではRelit-NeuLFと呼ばれる分析合成手法を提案する。最近のニューラル4D光電場ネットワーク(NeuLF)に続いて、Relit-NeuLFはまず2面の光電場表現を利用して4D座標系の各光線をパラメータ化し、効率的な学習と推論を可能にする。次に、3次元シーンの空間変動双方向反射率分布関数(svbrdf)を自己教師ありで復元する。 decomposenetは各レイをalbedo, normal, roughnessといったsvbrdfコンポーネントにマップすることを学ぶ。分解されたBRDF成分と光方向の条件付けに基づいて、RenderNetは光線の色を合成する。 SVBRDF分解を自己監督するために,マイクロファセットモデルを用いて,予測光線色を物理ベースレンダリング結果に近い色にすることを推奨する。総合的な実験により,提案手法は合成データと実世界の顔データの両方において効率的かつ効果的であり,最先端の結果を上回っていることが示された。私たちはコードをgithubで公開しました。 https://github.com/oppo-us-research/RelitNeuLF In this paper, we address the problem of simultaneous relighting and novel view synthesis of a complex scene from multi-view images with a limited number of light sources. We propose an analysis-synthesis approach called Relit-NeuLF. Following the recent neural 4D light field network (NeuLF), Relit-NeuLF first leverages a two-plane light field representation to parameterize each ray in a 4D coordinate system, enabling efficient learning and inference. Then, we recover the spatially-varying bidirectional reflectance distribution function (SVBRDF) of a 3D scene in a self-supervised manner. A DecomposeNet learns to map each ray to its SVBRDF components: albedo, normal, and roughness. Based on the decomposed BRDF components and conditioning light directions, a RenderNet learns to synthesize the color of the ray. To self-supervise the SVBRDF decomposition, we encourage the predicted ray color to be close to the physically-based rendering result using the microfacet model. Comprehensive experiments demonstrate that the proposed method is efficient and effective on both synthetic data and real-world human face data, and outperforms the state-of-the-art results. 
We publicly released our code on GitHub. You can find it here: https://github.com/oppo-us-research/RelitNeuLF	翻訳日:2023-10-24 21:48:20 公開日:2023-10-23
# 信頼できるディープハッシュ検索のためのセマンティクスアウェア・アドバーサリートレーニング Semantic-Aware Adversarial Training for Reliable Deep Hashing Retrieval ( http://arxiv.org/abs/2310.14637v1 ) ライセンス: Link先を確認	Xu Yuan, Zheng Zhang, Xunguang Wang, Lin Wu	(参考訳) ディープハッシュは,その効率性と有効性から,大規模画像検索システムにおいて集中的に研究され,応用されている。近年の研究では、敵の例の存在は、深いハッシュモデル、すなわち敵の脆弱性に対するセキュリティ上の脅威をもたらすと認識されている。特に,ディープハッシュ化のための信頼性の高い意味表現を効率的に蒸留し,逆学習を導くことが困難であり,ディープハッシュに基づく検索モデルの逆ロバスト性の向上を阻害している。さらに, 深部ハッシュの対角訓練に関する最近の研究は, 統一されたミニマックス構造に定式化することは困難である。本稿では,深部ハッシュモデルの対角的堅牢性を向上させるために,セマンティック・アウェア・アドバサリアル・トレーニング(SAAT)を提案する。具体的には,深いハッシュ処理において,敵意学習を導くための意味表現を構築するためのdmflスキームを考案する。特に, 厳密な理論的保証を有するdmflは, 識別的性質と意味的性質を共に考慮した識別的学習方法で適応的に最適化されている。また、敵のサンプルのハッシュコードとメインステイ特徴とのハミング距離を最大化して敵の例を作成し、その効果を敵の攻撃試験で検証した。さらに,我々は初めて,生成されたメインステイ符号の指導の下で,ディープハッシュの形式化された逆訓練を統一ミニマックス最適化に定式化する。ベンチマークデータセットにおける広範囲な実験は、最先端アルゴリズムに対するスーパーブ攻撃性能を示す一方で、提案された敵対的トレーニングは、信頼できるディープハッシュベース検索のための敵対的摂動を効果的に排除することができる。私たちのコードはhttps://github.com/xandery-geek/saatで利用可能です。 Deep hashing has been intensively studied and successfully applied in large-scale image retrieval systems due to its efficiency and effectiveness. Recent studies have recognized that the existence of adversarial examples poses a security threat to deep hashing models, that is, adversarial vulnerability. Notably, it is challenging to efficiently distill reliable semantic representatives for deep hashing to guide adversarial learning, and thereby it hinders the enhancement of adversarial robustness of deep hashing-based retrieval models. Moreover, current researches on adversarial training for deep hashing are hard to be formalized into a unified minimax structure. In this paper, we explore Semantic-Aware Adversarial Training (SAAT) for improving the adversarial robustness of deep hashing models. Specifically, we conceive a discriminative mainstay features learning (DMFL) scheme to construct semantic representatives for guiding adversarial learning in deep hashing. 
Particularly, our DMFL with the strict theoretical guarantee is adaptively optimized in a discriminative learning manner, where both discriminative and semantic properties are jointly considered. Moreover, adversarial examples are fabricated by maximizing the Hamming distance between the hash codes of adversarial samples and mainstay features, the efficacy of which is validated in the adversarial attack trials. Further, we, for the first time, formulate the formalized adversarial training of deep hashing into a unified minimax optimization under the guidance of the generated mainstay codes. Extensive experiments on benchmark datasets show superb attack performance against the state-of-the-art algorithms, meanwhile, the proposed adversarial training can effectively eliminate adversarial perturbations for trustworthy deep hashing-based retrieval. Our code is available at https://github.com/xandery-geek/SAAT.	翻訳日:2023-10-24 21:48:00 公開日:2023-10-23
# 超音波画像における乳房病変分割のための多レベル知覚境界誘導ネットワーク Multilevel Perception Boundary-guided Network for Breast Lesion Segmentation in Ultrasound Images ( http://arxiv.org/abs/2310.14636v1 ) ライセンス: Link先を確認	Xing Yang, Jian Zhang, Qijian Chen, Li Wang and Lihui Wang	(参考訳) 超音波画像からの乳腺腫瘍の自動分割は,その後の臨床診断および治療計画に不可欠である。既存の深層学習法は乳腺腫瘍の自動分割において有意な進歩を遂げているが,正常組織と同等の強度を有する腫瘍に対する成績は,特に腫瘍境界において依然として不十分である。この問題を解決するため,超音波画像から乳腺腫瘍を分割する,多レベルグローバル認識モジュール(MGPM)と境界誘導モジュール(BGM)で構成されるPBNetを提案する。特にMGPMでは, 単一レベル特徴写像におけるボクセル間の長距離空間依存性をモデル化し, マルチレベル意味情報を融合することにより, 非増強腫瘍に対するモデルの認識能力を促進させる。 BGMでは,腫瘍境界を最大プーリングの膨張および収縮効果を用いて高レベルセマンティックマップから抽出し,低レベル特徴と高レベル特徴の融合を誘導する。さらに,腫瘍境界のセグメンテーション性能を向上させるために,マルチレベル境界強調セグメンテーション(BS)損失を提案する。公開データセットと社内データセットの比較実験により、PBNetは定性的な可視化結果と定量的評価指標の両方で最先端の手法より優れており、Diceスコア、Jaccard係数、特異度、HD95はそれぞれ0.70%、1.1%、0.1%、および2.5%の改善が見られた。さらに, アブレーション実験により, 提案したMGPMは非増強腫瘍の識別に有用であり, BGMおよびBS損失も腫瘍の分割輪郭の精緻化に有用であることが確認された。 Automatic segmentation of breast tumors from ultrasound images is essential for subsequent clinical diagnosis and treatment planning. Although existing deep learning-based methods have achieved significant progress in automatic segmentation of breast tumors, their performance on tumors with similar intensity to the normal tissues remains unsatisfactory, especially at the tumor boundaries. To address this issue, we propose a PBNet composed of a multilevel global perception module (MGPM) and a boundary guided module (BGM) to segment breast tumors from ultrasound images. Specifically, in MGPM, the long-range spatial dependence between the voxels in a single-level feature map is modeled, and then the multilevel semantic information is fused to promote the recognition ability of the model for non-enhanced tumors. In BGM, the tumor boundaries are extracted from the high-level semantic maps using the dilation and erosion effects of max pooling; such boundaries are then used to guide the fusion of low and high-level features. 
Moreover, to improve the segmentation performance for tumor boundaries, a multi-level boundary-enhanced segmentation (BS) loss is proposed. The extensive comparison experiments on both publicly available dataset and in-house dataset demonstrate that the proposed PBNet outperforms the state-of-the-art methods in terms of both qualitative visualization results and quantitative evaluation metrics, with the Dice score, Jaccard coefficient, Specificity and HD95 improved by 0.70%, 1.1%, 0.1% and 2.5% respectively. In addition, the ablation experiments validate that the proposed MGPM is indeed beneficial for distinguishing the non-enhanced tumors and the BGM as well as the BS loss are also helpful for refining the segmentation contours of the tumor.	翻訳日:2023-10-24 21:47:32 公開日:2023-10-23
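The phrase "dilation and erosion effects of max pooling" in the entry above refers to a standard trick: stride-1 max pooling dilates a mask, and max pooling the negated mask erodes it. The Python sketch below shows one plausible way to turn that into a boundary map, namely dilation minus erosion. That combination, the 3x3 window, and the function names are assumptions for illustration and are not confirmed as the exact BGM formulation.

```python
# Sketch of boundary extraction via stride-1 max pooling on a 2D mask.
# The "dilation minus erosion" combination and the window size are
# illustrative assumptions.

def max_pool_same(grid, radius=1):
    """Stride-1 max pooling with a (2*radius+1)^2 window, clipped at the borders."""
    h, w = len(grid), len(grid[0])
    out = []
    for i in range(h):
        row = []
        for j in range(w):
            window = [
                grid[a][b]
                for a in range(max(0, i - radius), min(h, i + radius + 1))
                for b in range(max(0, j - radius), min(w, j + radius + 1))
            ]
            row.append(max(window))
        out.append(row)
    return out


def dilate(grid, radius=1):
    """Max pooling grows the foreground region."""
    return max_pool_same(grid, radius)


def erode(grid, radius=1):
    """Max pooling the negated mask shrinks the foreground region."""
    negated = [[-v for v in row] for row in grid]
    return [[-v for v in row] for row in max_pool_same(negated, radius)]


def boundary(grid, radius=1):
    """Band around the contour: dilated mask minus eroded mask."""
    d, e = dilate(grid, radius), erode(grid, radius)
    return [[dv - ev for dv, ev in zip(dr, er)] for dr, er in zip(d, e)]


if __name__ == "__main__":
    mask = [[0] * 7 for _ in range(7)]
    for i in range(2, 5):
        for j in range(2, 5):
            mask[i][j] = 1
    for row in boundary(mask):
        print(row)
```

In a deep learning framework the same effect is typically obtained with the built-in 2D max-pooling operator using stride 1 and "same" padding, applied to soft probability maps instead of binary masks.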
# 自然言語理解のための合成スキャンパスを付加した事前学習言語モデル Pre-Trained Language Models Augmented with Synthetic Scanpaths for Natural Language Understanding ( http://arxiv.org/abs/2310.14676v1 ) ライセンス: Link先を確認	Shuwen Deng, Paul Prasse, David R. Reich, Tobias Scheffer, Lena A. Jäger	(参考訳) 人間の視線データは自然言語理解を反映した認知情報を提供する。実際、人間のスキャンパスで言語モデルを拡張することは、言語理解を含む様々なNLPタスクに有益であることが証明されている。しかし、テキストコーパスの豊富さは視線データの不足と対比されるため、このアプローチの適用性は阻害されている。読解中にヒト様のスキャンパスを生成するためのモデルが開発されているが、NLPタスクにまたがる人工視線データの可能性はほとんど解明されていない。本研究では,人間の視線データの必要性をなくし,合成スキャンパス生成とスキャンパス拡張言語モデルを統合するモデルを開発した。モデルのエラー勾配はモデルのすべての部分にわたって伝達されるため、スキャンパス生成器は下流タスクに微調整することができる。提案モデルは,基礎となる言語モデルに勝るだけでなく,実際の人間の視線データを付加した言語モデルに匹敵する性能を実現する。私たちのコードは公開されています。 Human gaze data offer cognitive information that reflects natural language comprehension. Indeed, augmenting language models with human scanpaths has proven beneficial for a range of NLP tasks, including language understanding. However, the applicability of this approach is hampered because the abundance of text corpora is contrasted by a scarcity of gaze data. Although models for the generation of human-like scanpaths during reading have been developed, the potential of synthetic gaze data across NLP tasks remains largely unexplored. We develop a model that integrates synthetic scanpath generation with a scanpath-augmented language model, eliminating the need for human gaze data. Since the model's error gradient can be propagated throughout all parts of the model, the scanpath generator can be fine-tuned to downstream tasks. We find that the proposed model not only outperforms the underlying language model, but achieves a performance that is comparable to a language model augmented with real human gaze data. Our code is publicly available.	翻訳日:2023-10-24 21:41:27 公開日:2023-10-23
# 自動走行のためのオンライン領域外検出 Online Out-of-Domain Detection for Automated Driving ( http://arxiv.org/abs/2310.14675v1 ) ライセンス: Link先を確認	Timo Sämann and Horst-Michael Groß	(参考訳) 自動運転における安全性の確保は、自動車業界にとって大きな課題である。人工知能、特にディープニューラルネットワーク(dnn)は、高度に自動化された運転の実現において重要な技術と考えられている。 DNNはトレーニングデータから学習するので、トレーニングデータの基盤となるデータ分布において、適切な精度しか達成できない。トレーニング領域を離れると、分布シフトが発生するため、精度が劇的に低下する可能性がある。本稿では,オンラインにドメインが残されていることを検出可能な安全機構,すなわち実行時にその概念実証を行う。 Synthiaデータセットを用いて行った実験では、入力データがドメイン内外にあるかどうかを100%正確に検出できることが示される。車両がドメインを離れたときに検出する能力は、認証の重要な要件である。 Ensuring safety in automated driving is a major challenge for the automotive industry. Special attention is paid to artificial intelligence, in particular to Deep Neural Networks (DNNs), which are considered a key technology in the realization of highly automated driving. DNNs learn from training data, which means that they only achieve good accuracy within the underlying data distribution of the training data. Leaving the training domain causes a distributional shift, which can lead to a drastic reduction in accuracy. In this work, we present a proof of concept for a safety mechanism that can detect online, i.e. at runtime, when the domain is left. In our experiments with the Synthia data set, we show that whether the input data is inside or outside the domain is detected 100% correctly. The ability to detect when the vehicle leaves the domain can be an important requirement for certification.	翻訳日:2023-10-24 21:41:10 公開日:2023-10-23
# 人口降下:自然選択型ハイパーパラメータチューニングフレームワーク Population Descent: A Natural-Selection Based Hyper-Parameter Tuning Framework ( http://arxiv.org/abs/2310.14671v1 ) ライセンス: Link先を確認	Abhinav Pomalapally, Bassel El Mabsout, Renato Mansuco	(参考訳) 一階勾配降下は、これまでに実装された最も成功した最適化アルゴリズムの基礎となっている。ニューラルネットワーク最適化のような非常に高次元性を持つ教師付き学習問題では、主にメモリと計算効率のために、ほとんど常に選択のアルゴリズムである。しかし、勾配降下が非凸関数上の局所ミニマに収束するという最適化の古典的な結果である。さらに重要なことに、ある高次元の場合、大きな鞍点の台地から逃れるのは難しい。一方、ブラックボックス最適化手法は、損失関数のランドスケープの局所構造に敏感ではなく、次元性の呪いを被る。代わりに、memeticアルゴリズムは両方の利点を組み合わせることを目指している。そこで我々は,超パラメータ最適化に着目したメメティックアルゴリズムであるPopulation Descentを提案する。適応的m-elitist選択手法と正規化適合性に基づくランダム化スキームを組み合わせることで、一般的なベンチマークタスクにおいて、より複雑な最先端アルゴリズムを最大13%上回ることを示した。 First-order gradient descent has been the base of the most successful optimization algorithms ever implemented. On supervised learning problems with very high dimensionality, such as neural network optimization, it is almost always the algorithm of choice, mainly due to its memory and computational efficiency. However, it is a classical result in optimization that gradient descent converges to local minima on non-convex functions. Even more importantly, in certain high-dimensional cases, escaping the plateaus of large saddle points becomes intractable. On the other hand, black-box optimization methods are not sensitive to the local structure of a loss function's landscape but suffer the curse of dimensionality. Instead, memetic algorithms aim to combine the benefits of both. Inspired by this, we present Population Descent, a memetic algorithm focused on hyperparameter optimization. We show that an adaptive m-elitist selection approach combined with a normalized-fitness-based randomization scheme outperforms more complex state-of-the-art algorithms by up to 13% on common benchmark tasks.	翻訳日:2023-10-24 21:40:57 公開日:2023-10-23
# 多眼視覚質問応答におけるデータセットバイアス軽減 Dataset Bias Mitigation in Multiple-Choice Visual Question Answering and Beyond ( http://arxiv.org/abs/2310.14670v1 ) ライセンス: Link先を確認	Zhecan Wang, Long Chen, Haoxuan You, Keyang Xu, Yicheng He, Wenhao Li, Noal Codella, Kai-Wei Chang, Shih-Fu Chang	(参考訳) 視覚言語(VL)理解タスクは、複数の質問を通じて複雑な視覚シーンの理解を評価する。しかし、モデルが様々なVLタスクを適切に理解せずに正しく解決するために、ショートカットとして活用できる2つのデータセットバイアスを特定した。最初のタイプのデータセットバイアスは \emph{unbalanced matching} バイアスであり、正しい答えは不正確な答えよりも質問と画像に重なる。データセットバイアスの第2のタイプは \emph{distractor similarity} バイアスであり、不正確な答えは正しい答えと過度に異なるが、同じサンプル内の他の不正確な回答と著しく似ている。これらのデータセットバイアスに対処するために,まずads(adversarial data synthesis)を提案する。次に, 合成訓練データ, 特に, サンプル内微分に着目して, モデルの活用を支援するために, サンプル内反事実訓練 (ict) を導入する。大規模な実験では、ADSとICTが、ドメインシフトシナリオであっても、異なるベンチマークでモデルパフォーマンスを継続的に改善する効果を実証している。 Vision-language (VL) understanding tasks evaluate models' comprehension of complex visual scenes through multiple-choice questions. However, we have identified two dataset biases that models can exploit as shortcuts to resolve various VL tasks correctly without proper understanding. The first type of dataset bias is \emph{Unbalanced Matching} bias, where the correct answer overlaps the question and image more than the incorrect answers. The second type of dataset bias is \emph{Distractor Similarity} bias, where incorrect answers are overly dissimilar to the correct answer but significantly similar to other incorrect answers within the same sample. To address these dataset biases, we first propose Adversarial Data Synthesis (ADS) to generate synthetic training and debiased evaluation data. We then introduce Intra-sample Counterfactual Training (ICT) to assist models in utilizing the synthesized training data, particularly the counterfactual data, via focusing on intra-sample differentiation. Extensive experiments demonstrate the effectiveness of ADS and ICT in consistently improving model performance across different benchmarks, even in domain-shifted scenarios.	翻訳日:2023-10-24 21:40:41 公開日:2023-10-23
# b^2sfl:セキュアなフェデレーション学習に基づくトラフィック予測のための2レベルブロックチェーンアーキテクチャ B^2SFL: A Bi-level Blockchained Architecture for Secure Federated Learning-based Traffic Prediction ( http://arxiv.org/abs/2310.14669v1 ) ライセンス: Link先を確認	Hao Guo, Collin Meese, Wanxin Li, Chien-Chung Shen, Mark Nejad	(参考訳) Federated Learning(FL)は、分散ローカルモデルの更新を集約したグローバルMLモデルの協調トレーニングと学習を可能にする、プライバシ保護機械学習(ML)テクノロジである。しかし、悪意のある参加者と集中型のflサーバによって、セキュリティとプライバシの保証が損なわれる可能性がある。本稿では,セキュアなフェデレート学習に基づくトラフィック予測のための,双方向ブロックチェーンアーキテクチャを提案する。ボトム層とトップ層ブロックチェーンは、ローカルモデルとグローバル集約パラメータをそれに従って格納し、分散ホモモルフィック暗号化フェデレーション平均化(DHFA)スキームは、セキュアな計算問題に対処する。本稿では,分散プライバシ保存型平均化モデルを実現するために,部分的秘密鍵分散プロトコルと部分準同型暗号/復号化スキームを提案する。我々は、DHFA操作の実行時間を測定し、ブロックチェーンネットワークの読み書き性能を定量化し、オンライントラフィックフロー予測タスクの予測精度に対する様々な地域グループサイズとモデル複雑度の影響を解明するための広範な実験を行った。提案システムは,現実の交通予測タスクに対して,セキュアかつ分散化されたフェデレーション学習を容易にすることを示唆している。 Federated Learning (FL) is a privacy-preserving machine learning (ML) technology that enables collaborative training and learning of a global ML model based on aggregating distributed local model updates. However, security and privacy guarantees could be compromised due to malicious participants and the centralized FL server. This article proposed a bi-level blockchained architecture for secure federated learning-based traffic prediction. The bottom and top layer blockchain store the local model and global aggregated parameters accordingly, and the distributed homomorphic-encrypted federated averaging (DHFA) scheme addresses the secure computation problems. We propose the partial private key distribution protocol and a partially homomorphic encryption/decryption scheme to achieve the distributed privacy-preserving federated averaging model. We conduct extensive experiments to measure the running time of DHFA operations, quantify the read and write performance of the blockchain network, and elucidate the impacts of varying regional group sizes and model complexities on the resulting prediction accuracy for the online traffic flow prediction task. 
The results indicate that the proposed system can facilitate secure and decentralized federated learning for real-world traffic prediction tasks.	翻訳日:2023-10-24 21:40:22 公開日:2023-10-23
# move-one-sample-out によるデータプルーニング Data Pruning via Moving-one-Sample-out ( http://arxiv.org/abs/2310.14664v1 ) ライセンス: Link先を確認	Haoru Tan, Sitong Wu, Fei Du, Yukang Chen, Zhibin Wang, Fan Wang, Xiaojuan Qi	(参考訳) 本稿では、トレーニングセットから最も情報に乏しいサンプルを識別・削除することを目的とした、移動単サンプルアウト(MoSo)と呼ばれる新しいデータ抽出手法を提案する。 MoSoの背後にある中核的な洞察は、最適な経験的リスクに対する影響を評価することで、各サンプルの重要性を決定することである。これは、特定のサンプルがトレーニングセットから除外された場合に経験的リスクが変動する程度を測定することで達成される。計算コストのかかる1次調整手順の代わりに、異なる訓練段階からの勾配情報のみを必要とする効率的な1次近似器を提案する。我々の近似の背景にある重要な考え方は、トレーニングセットの平均勾配に一貫した勾配を持つサンプルは、より情報的であり、より高いスコアを受け取るべきであるということであり、これは直感的に理解できる。実験の結果,mosoは高い刈り込み率で性能劣化を効果的に軽減し,様々な設定で良好な性能が得られることがわかった。 In this paper, we propose a novel data-pruning approach called moving-one-sample-out (MoSo), which aims to identify and remove the least informative samples from the training set. The core insight behind MoSo is to determine the importance of each sample by assessing its impact on the optimal empirical risk. This is achieved by measuring the extent to which the empirical risk changes when a particular sample is excluded from the training set. Instead of using the computationally expensive leaving-one-out-retraining procedure, we propose an efficient first-order approximator that only requires gradient information from different training stages. The key idea behind our approximation is that samples with gradients that are consistently aligned with the average gradient of the training set are more informative and should receive higher scores, which could be intuitively understood as follows: if the gradient from a specific sample is consistent with the average gradient vector, it implies that optimizing the network using the sample will yield a similar effect on all remaining samples. Experimental results demonstrate that MoSo effectively mitigates severe performance degradation at high pruning ratios and achieves satisfactory performance across various settings.	翻訳日:2023-10-24 21:40:00 公開日:2023-10-23
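The scoring idea in the MoSo abstract above (samples whose gradients stay aligned with the average training-set gradient across training stages score higher) can be sketched in plain Python. This is only an illustrative sketch under stated assumptions, not the authors' estimator: the function names `alignment_scores` and `prune`, the unweighted averaging over stages, and the toy gradients are all hypothetical, and the paper's approximator may weight stages or exclude the scored sample differently.

```python
def alignment_scores(grads_per_stage):
    """Score each sample by how well its gradient aligns with the mean gradient.

    grads_per_stage[t][i] is the gradient (a list of floats) of sample i at
    training stage t. The score of sample i is the average, over stages, of
    the dot product between its gradient and the mean gradient of the set.
    (Assumption: plain unweighted averaging over stages.)
    """
    n_samples = len(grads_per_stage[0])
    scores = [0.0] * n_samples
    for stage in grads_per_stage:
        dim = len(stage[0])
        mean = [sum(g[d] for g in stage) / n_samples for d in range(dim)]
        for i, g in enumerate(stage):
            scores[i] += sum(gd * md for gd, md in zip(g, mean))
    return [s / len(grads_per_stage) for s in scores]


def prune(scores, keep_ratio):
    """Return the sorted indices of the highest-scoring samples."""
    n_keep = max(1, round(len(scores) * keep_ratio))
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return sorted(ranked[:n_keep])


# Hypothetical gradients: 4 samples, 2 parameters, 2 training stages.
GRADS = [
    [[1.0, 0.0], [0.9, 0.1], [-1.0, 0.0], [0.8, 0.2]],
    [[0.5, 0.5], [0.4, 0.6], [-0.5, -0.5], [0.6, 0.4]],
]
SCORES = alignment_scores(GRADS)
KEPT = prune(SCORES, keep_ratio=0.5)
```

In this toy data, sample 2 points against the mean gradient at both stages, so it receives the lowest score and is pruned first.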
# DPP-TTS:決定点過程による音声の韻律的特徴の多様化 DPP-TTS: Diversifying prosodic features of speech via determinantal point processes ( http://arxiv.org/abs/2310.14663v1 ) ライセンス: Link先を確認	Seongho Joo, Hyukhun Koh, Kyomin Jung	(参考訳) 深層生成モデルの急速な進歩により、最近のニューラルテキスト音声(TTS)モデルは、人間に似た音声の合成に成功している。単調な韻律パターンを超えて様々な韻律を用いて音声を生成する試みがいくつかある。しかし、以前の作品にはいくつかの制限がある。まず、典型的なTSモデルは、韻律の多様性を高めるためにスケールしたサンプリング温度に依存する。高いサンプリング温度で生成された音声サンプルは知覚的韻律の多様性を欠くことが多く、音声の自然性に悪影響を及ぼす可能性がある。第2に、サンプル間の多様性は無視されるが、サンプリング手順は複数のサンプルではなく単一の音声サンプルに焦点を当てることが多い。本稿では, prosody diversifying module を用いた決定的点過程 (dpps) に基づくテキスト対音声モデル dpp-tts を提案する。 TTSモデルは,各サンプルおよび複数のサンプル間の知覚的多様性を同時に考慮した音声サンプルを生成することができる。 DPP-TTSは, 音声の自然性を考慮した左右比較試験において, ベースラインよりも多様な韻律を持つ音声サンプルを生成することを示した。 With the rapid advancement in deep generative models, recent neural Text-To-Speech(TTS) models have succeeded in synthesizing human-like speech. There have been some efforts to generate speech with various prosody beyond monotonous prosody patterns. However, previous works have several limitations. First, typical TTS models depend on the scaled sampling temperature for boosting the diversity of prosody. Speech samples generated at high sampling temperatures often lack perceptual prosodic diversity, which can adversely affect the naturalness of the speech. Second, the diversity among samples is neglected since the sampling procedure often focuses on a single speech sample rather than multiple ones. In this paper, we propose DPP-TTS: a text-to-speech model based on Determinantal Point Processes (DPPs) with a prosody diversifying module. Our TTS model is capable of generating speech samples that simultaneously consider perceptual diversity in each sample and among multiple samples. We demonstrate that DPP-TTS generates speech samples with more diversified prosody than baselines in the side-by-side comparison test considering the naturalness of speech at the same time.	翻訳日:2023-10-24 21:39:41 公開日:2023-10-23
# オープンアクセス型マルチセンサー衛星画像とgedilidarデータによる森林高とバイオマスの推定:フランス大都市圏の高解像度地図 Estimation of forest height and biomass from open-access multi-sensor satellite imagery and GEDI Lidar data: high-resolution maps of metropolitan France ( http://arxiv.org/abs/2310.14662v1 ) ライセンス: Link先を確認	David Morin (CESBIO), Milena Planells (CESBIO), Stéphane Mermoz (globeo), Florian Mouret (UO, CESBIO)	(参考訳) 森林資源と炭素のマッピングは、森林管理を改善し、炭素の貯蔵と環境保全の目的を満たすために重要である。宇宙からのリモートセンシングアプローチは、広範囲にわたる高空間解像度で繰り返し観測を行うことにより、森林高度モニタリングを支援する可能性がある。本研究は,かつて森林パラメータの局所的地図(ベース面積,高さ,直径など)を作成するために開発された機械学習アプローチを用いる。本稿の目的は,フランスの全国報道など,より広い範囲へのアプローチの展開について述べることである。我々はGEDI Lidarミッションを基準高度データとして,Sentinel-1,Sentinel-2,ALOS-2 PALSAR-2の衛星画像を用いて森林高度を推定し,2020年のフランス地図を作成する。高さマップは、アロメトリ方程式を用いて体積および地上バイオマス (agb) に導出される。 ALSデータからの局所地図による高さマップの検証は、平均絶対誤差(MAE)が4.3mである技術の状態に近い精度を示している。フランスの森林に代表される在庫計画の検証では、標高は3.7mである。針葉樹は広葉樹林より推定がやや優れている。高さから得られたボリュームマップとagbマップはそれぞれ75トン/haと93m$^3$/haである。 sylvo-ecoregionと森林種(所有者と種)によって集計された結果はさらに改善され、maesは23t/ha、30m$^3$/haである。これらの地図の正確さは、森林資源や炭素を地域規模や特定の種類の森林で分析し、地理的情報(行政区域、種、所有者の種類、保護地域、環境条件など)と地図を組み合わせることで、ローカルにモニタリングすることができる。本研究で作成した高さ,体積およびAGBマップは無償で利用可能である。 Mapping forest resources and carbon is important for improving forest management and meeting the objectives of storing carbon and preserving the environment. Spaceborne remote sensing approaches have considerable potential to support forest height monitoring by providing repeated observations at high spatial resolution over large areas. This study uses a machine learning approach that was previously developed to produce local maps of forest parameters (basal area, height, diameter, etc.). The aim of this paper is to present the extension of the approach to much larger scales such as the French national coverage. We used the GEDI Lidar mission as reference height data, and the satellite images from Sentinel-1, Sentinel-2 and ALOS-2 PALSAR-2 to estimate forest height and produce a map of France for the year 2020.
The height map is then converted into volume and aboveground biomass (AGB) using allometric equations. The validation of the height map with local maps from ALS data shows an accuracy close to the state of the art, with a mean absolute error (MAE) of 4.3 m. Validation on inventory plots representative of French forests shows an MAE of 3.7 m for the height. Estimates are slightly better for coniferous than for broadleaved forests. Volume and AGB maps derived from height show MAEs of 75 tons/ha and 93 m$^3$/ha respectively. The results aggregated by sylvo-ecoregion and forest types (owner and species) are further improved, with MAEs of 23 tons/ha and 30 m$^3$/ha. The precision of these maps makes it possible to monitor forests locally and helps to analyze forest resources and carbon on a territorial scale or for specific types of forests by combining the maps with geolocated information (administrative area, species, type of owner, protected areas, environmental conditions, etc.). Height, volume and AGB maps produced in this study are made freely available.	翻訳日:2023-10-24 21:39:22 公開日:2023-10-23
# 純粋およびガウス微分プライバシーを持つ個人学習のための扱いやすいmcmc Tractable MCMC for Private Learning with Pure and Gaussian Differential Privacy ( http://arxiv.org/abs/2310.14661v1 ) ライセンス: Link先を確認	Yingyu Lin, Yian Ma, Yu-Xiang Wang, Rachel Redberg	(参考訳) 後部サンプリング、すなわち後部分布からサンプリングする指数的なメカニズムは、$\varepsilon$-pure差分プライバシー(DP)保証を提供し、$(\varepsilon,\delta)$-approximate DPによってもたらされる潜在的に無拘束なプライバシー侵害に悩まされない。しかし実際には、マルコフ連鎖モンテカルロ(MCMC)のような近似サンプリング手法を適用する必要があるため、未適用の$\delta$-approximationエラーをプライバシー保証に再導入する必要がある。このギャップを埋めるために、純粋なDPまたは純粋なガウスDP(すなわち$\delta=0$)を満たす参照分布からワッサーシュタイン無限度(W_\infty$)に比例した雑音でMCMCサンプルを摂動する近似SAample摂動法(ASAP)アルゴリズムを提案する。次に、メトロポリス・ハスティングスアルゴリズムを用いてサンプルを生成し、アルゴリズムがW$_\infty$距離に収束することを証明する。本研究では,新しい手法と注意深い局所化ステップを組み合わせることにより,dp-erm問題において,強い凸と滑らかな損失を伴う最適レートを達成する最初の近似時間アルゴリズムを得る。 Posterior sampling, i.e., exponential mechanism to sample from the posterior distribution, provides $\varepsilon$-pure differential privacy (DP) guarantees and does not suffer from potentially unbounded privacy breach introduced by $(\varepsilon,\delta)$-approximate DP. In practice, however, one needs to apply approximate sampling methods such as Markov chain Monte Carlo (MCMC), thus re-introducing the unappealing $\delta$-approximation error into the privacy guarantees. To bridge this gap, we propose the Approximate SAample Perturbation (abbr. ASAP) algorithm which perturbs an MCMC sample with noise proportional to its Wasserstein-infinity ($W_\infty$) distance from a reference distribution that satisfies pure DP or pure Gaussian DP (i.e., $\delta=0$). We then leverage a Metropolis-Hastings algorithm to generate the sample and prove that the algorithm converges in W$_\infty$ distance. We show that by combining our new techniques with a careful localization step, we obtain the first nearly linear-time algorithm that achieves the optimal rates in the DP-ERM problem with strongly convex and smooth losses.	翻訳日:2023-10-24 21:38:50 公開日:2023-10-23
# 混合整数線形プログラムに対する正確なラグランジアン乗算器の予測 Predicting Accurate Lagrangian Multipliers for Mixed Integer Linear Programs ( http://arxiv.org/abs/2310.14659v1 ) ライセンス: Link先を確認	Francesco Demelas and Joseph Le Roux and Mathieu Lacroix and Axel Parmentier	(参考訳) ラグランジアン緩和は、難しい制約でMILP(Mixed Integer Linear Programs)を解く最も効率的な方法の一つである。ラグランジアン乗数 (LM) と呼ばれるこれらの制約の双対が与えられたとき、それはMILPの最適値の有界を返し、ラグランジアン法は最高の有界を与える LM を求める。しかし、これらの手法は一般に勾配降下に類似した反復アルゴリズムに頼り、凹凸の線形双対関数を最大化する: 計算の重みは緩和された制約の数とともに急速に増加する。我々は,下降をバイパスし,例えば局所的な最適化を効果的に償却する深層学習手法を導入する。グラフ畳み込みネットワークに基づく確率エンコーダは、MILPインスタンスにおける緩和制約の高次元表現を計算する。デコーダはこれらの表現をLMに変換する。予測した乗算器から得られる境界を直接最適化することにより、エンコーダとデコーダを共同で訓練する。数値実験により,本手法は連続緩和と最良ラグランジアンバウンドの間のギャップの最大85%を閉じることを示し,降下ラグランジアン法に対する高品質なウォームスタートを提供する。 Lagrangian relaxation stands among the most efficient approaches for solving Mixed Integer Linear Programs (MILPs) with difficult constraints. Given any duals for these constraints, called Lagrangian Multipliers (LMs), it returns a bound on the optimal value of the MILP, and Lagrangian methods seek the LMs giving the best such bound. But these methods generally rely on iterative algorithms resembling gradient descent to maximize the concave piecewise linear dual function: the computational burden grows quickly with the number of relaxed constraints. We introduce a deep learning approach that bypasses the descent, effectively amortizing the local, per-instance optimization. A probabilistic encoder based on a graph convolutional network computes high-dimensional representations of relaxed constraints in MILP instances. A decoder then turns these representations into LMs. We train the encoder and decoder jointly by directly optimizing the bound obtained from the predicted multipliers. Numerical experiments show that our approach closes up to 85% of the gap between the continuous relaxation and the best Lagrangian bound, and provides a high-quality warm start for descent-based Lagrangian methods.	翻訳日:2023-10-24 21:38:19 published:2023-10-23
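The Lagrangian bound described in the abstract above can be illustrated on a toy problem in plain Python. This is a minimal sketch under stated assumptions, not the paper's method: the covering problem (minimize c·x subject to a·x ≥ b with binary x), the function names, and the data are hypothetical, and a plain grid search over a single multiplier stands in for the learned encoder-decoder that predicts multipliers. Relaxing the constraint with any multiplier λ ≥ 0 yields a lower bound on the optimum, and the best multiplier gives the tightest such bound.

```python
from itertools import product


def lagrangian_bound(c, a, b, lam):
    """Lower bound on min c.x s.t. a.x >= b, x binary, for a multiplier lam >= 0.

    L(lam) = lam * b + sum_i min(0, c_i - lam * a_i): once the constraint is
    moved into the objective, each binary variable can be set independently.
    """
    return lam * b + sum(min(0.0, ci - lam * ai) for ci, ai in zip(c, a))


def brute_force_optimum(c, a, b):
    """Exact optimum by enumeration (only viable for tiny instances)."""
    best = float("inf")
    for x in product((0, 1), repeat=len(c)):
        if sum(ai * xi for ai, xi in zip(a, x)) >= b:
            best = min(best, sum(ci * xi for ci, xi in zip(c, x)))
    return best


# Hypothetical toy instance.
COSTS = [4.0, 3.0, 6.0, 5.0]
WEIGHTS = [2.0, 1.0, 3.0, 2.0]
DEMAND = 4.0

# Stand-in for the learned predictor: try a grid of multipliers, keep the best.
GRID = [k / 10 for k in range(0, 51)]
BEST_LAM = max(GRID, key=lambda lam: lagrangian_bound(COSTS, WEIGHTS, DEMAND, lam))
BEST_BOUND = lagrangian_bound(COSTS, WEIGHTS, DEMAND, BEST_LAM)
OPTIMUM = brute_force_optimum(COSTS, WEIGHTS, DEMAND)
```

Because the dual function is concave and piecewise linear in λ, an iterative method would climb it step by step; a good predicted multiplier lands near the maximizer directly, which is the warm-start role the abstract describes.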
# 3次元メッシュのノードデータ予測のためのハイブリッドGNNアプローチ A Hybrid GNN approach for predicting node data for 3D meshes ( http://arxiv.org/abs/2310.14707v1 ) ライセンス: Link先を確認	Shwetha Salimath and Francesca Bugiotti and Frederic Magoules	(参考訳) 金属鍛造は金型の製造に用いられる。プロセスが効率的になるためには、最適な入力パラメータセットが必要です。現在, 有限要素法を用いて, 時間を要する異なる初期条件のシミュレーションを生成することにより, 最適パラメータを予測している。本稿では,グラフ畳み込みに基づく代理グラフニューラルネットワークモデルを用いて,新しいデータシミュレーションの処理と生成を支援するハイブリッド手法を提案する。また,モデルを用いた新しいデータシミュレーションの処理と生成を支援するハイブリッド手法を提案する。メッシュを表すデータセットが与えられたら、利用可能な情報をグラフやポイントクラウド構造に変換することに注力します。この表現は深層学習を可能にする。予測結果は有限要素法で生成されたものと比較して誤差が低いのと似ている。新しいモデルは、シミュレーションを作成するために適用される場合、既存のpointnetや単純なグラフニューラルネットワークモデルよりも優れています。 Metal forging is used to manufacture dies. We require the best set of input parameters for the process to be efficient. Currently, we predict the best parameters using the finite element method by generating simulations for the different initial conditions, which is a time-consuming process. In this paper, we introduce a hybrid approach that helps in processing and generating new data simulations using a surrogate graph neural network model based on graph convolutions, at a lower time cost. We also introduce a hybrid approach that helps in processing and generating new data simulations using the model. Given a dataset representing meshes, our focus is on the conversion of the available information into a graph or point cloud structure. This new representation enables deep learning. The predicted result is similar to that produced using the finite element method, with a low error. The new models have outperformed existing PointNet and simple graph neural network models when applied to produce the simulations.	翻訳日:2023-10-24 21:30:34 公開日:2023-10-23
# 語彙テストの大規模言語モデル評価における継続的有用性 The continued usefulness of vocabulary tests for evaluating large language models ( http://arxiv.org/abs/2310.14703v1 ) ライセンス: Link先を確認	Gonzalo Martínez, Javier Conde, Elena Merino-Gómez, Beatriz Bermúdez-Margaretto, José Alberto Hernández, Pedro Reviriego, Marc Brysbaert	(参考訳) 意味ベクトルに関する論文の中で、Landauer と Dumais (1997) はAI言語モデルの品質を挑戦的な語彙テストでテストすることを提案した。いずれのモデルも完全ではなく, 相違点に誤りが生じたため, 現代の主要言語モデルでは, テスト・オブ・イングリッシュ・アズ・ア・外国語(TOEFL)テストが有益であることを示す。 TOEFLテストは、ターゲット語から選択する4つの代替語からなる。さらに、既存の単語と非単語の区別を必要とするYes/Noテストでモデルをテストした。モデルは、現在の主要言語モデルが存在しない情報を提供するという他の観察結果と一致して、非単語の項目で著しく悪化した。テストがスペイン語に一般化されたとき、状況は悪化した。ここでは、ほとんどのモデルはランダムな文字列の大多数に意味/翻訳を与えた。プラスの面では、最高のモデルは非常にうまく機能し始めており、また、テスト参加者に未知だが辞書で見られる非単語も指している。 In their seminal article on semantic vectors, Landauer and Dumais (1997) proposed testing the quality of AI language models with a challenging vocabulary test. We show that their Test of English as a Foreign Language (TOEFL) test remains informative for contemporary major language models, since none of the models was perfect and they made errors on divergent items. The TOEFL test consists of target words with four alternatives to choose from. We further tested the models on a Yes/No test that requires distinguishing between existing words and made-up nonwords. The models performed significantly worse on the nonword items, in line with other observations that current major language models provide non-existent information. The situation was worse when we generalized the tests to Spanish. Here, most models gave meanings/translations for the majority of random letter sequences. On the plus side, the best models began to perform quite well, and they also pointed to nonwords that were unknown to the test participants but can be found in dictionaries.	翻訳日:2023-10-24 21:30:09 公開日:2023-10-23
# BM2CP:LiDARカメラによる効率的な協調知覚 BM2CP: Efficient Collaborative Perception with LiDAR-Camera Modalities ( http://arxiv.org/abs/2310.14702v1 ) ライセンス: Link先を確認	Binyu Zhao, Wei Zhang, Zhaonian Zou	(参考訳) 協調的知覚により、エージェントは近くのエージェントと補完的な知覚情報を共有できる。これにより、知覚性能が向上し、オクルージョンやスパーシリティといった単一視点知覚の問題が緩和される。既存のアプローチのほとんどは、主に単一モダリティ(特にLiDAR)に焦点を当てており、マルチモーダル知覚の優位性を十分に活用していない。我々は,LiDARとカメラを用いた協調認識パラダイムBM2CPを提案し,効率的なマルチモーダル認識を実現する。 LiDAR-guided modal fusion, 協調深度生成, およびModality-guided intermediate fusionを用いて、異なるエージェントのモード間のディープインタラクションを取得し、また、任意のエージェントのセンサーの1つ、同一または異なるタイプのセンサーが欠落している特別なケースに対処することができる。シミュレーションおよび実世界の自動運転シナリオにおいて,本手法が50倍の通信量で最先端の手法より優れていることを示す。私たちのコードはhttps://github.com/byzhaoAI/BM2CPで利用可能です。 Collaborative perception enables agents to share complementary perceptual information with nearby agents. This improves perception performance and alleviates the issues of single-view perception, such as occlusion and sparsity. Most existing approaches focus mainly on a single modality (especially LiDAR) and do not fully exploit the superiority of multi-modal perception. We propose a collaborative perception paradigm, BM2CP, which employs LiDAR and camera to achieve efficient multi-modal perception. It utilizes LiDAR-guided modal fusion, cooperative depth generation and modality-guided intermediate fusion to acquire deep interactions among modalities of different agents. Moreover, it can cope with the special case where one of the sensors, of the same or a different type, of any agent is missing. Extensive experiments validate that our approach outperforms the state-of-the-art methods with 50X lower communication volumes in both simulated and real-world autonomous driving scenarios. Our code is available at https://github.com/byzhaoAI/BM2CP.	翻訳日:2023-10-24 21:29:28 公開日:2023-10-23
# 物体内部との相互作用駆動型能動3次元再構成 Interaction-Driven Active 3D Reconstruction with Object Interiors ( http://arxiv.org/abs/2310.14700v1 ) ライセンス: Link先を確認	Zihao Yan, Fubao Su, Mingyang Wang, Ruizhen Hu, Hao Zhang, Hui Huang	(参考訳) 本研究では,視覚,ロボットと物体の相互作用,および3Dスキャンを統合したアクティブな3次元再構成手法を提案する。カメラの視点を最適化して環境をよりよく調査する他の能動的視覚研究とは異なり, 再建の主目的は, 対象物の様々な部分の相互作用性の解析と, 隠蔽領域の走査を可能にするロボットによる操作である。その結果、完全な幾何学的取得の上に対象対象物の部分的記述の理解が得られる。本手法はrgbdセンサーを内蔵したフェッチロボットによって完全に自動で動作する。相互作用解析と相互作用駆動型再構成、検出された移動可能な部分の走査と再構成を繰り返すことで、関節部検出とメッシュ再構築の両方がニューラルネットワークによって行われる。最終段階では、前部操作で露出し、その後スキャンされたすべての内部構造を含む残りの全ての非係留部品が、取得を完了するために再構築される。本手法は, 質的, 定量的評価, アブレーション研究, 代替品との比較, 実環境実験を通じて, 性能を実証する。 We introduce an active 3D reconstruction method which integrates visual perception, robot-object interaction, and 3D scanning to recover both the exterior and interior, i.e., unexposed, geometries of a target 3D object. Unlike other works in active vision which focus on optimizing camera viewpoints to better investigate the environment, the primary feature of our reconstruction is an analysis of the interactability of various parts of the target object and the ensuing part manipulation by a robot to enable scanning of occluded regions. As a result, an understanding of part articulations of the target object is obtained on top of complete geometry acquisition. Our method operates fully automatically by a Fetch robot with built-in RGBD sensors. It iterates between interaction analysis and interaction-driven reconstruction, scanning and reconstructing detected moveable parts one at a time, where both the articulated part detection and mesh reconstruction are carried out by neural networks. In the final step, all the remaining, non-articulated parts, including all the interior structures that had been exposed by prior part manipulations and subsequently scanned, are reconstructed to complete the acquisition. 
We demonstrate the performance of our method via qualitative and quantitative evaluation, ablation studies, comparisons to alternatives, as well as experiments in a real environment.	翻訳日:2023-10-24 21:29:06 公開日:2023-10-23
# 明確化のツリー:検索強化大言語モデルによるあいまいな質問への回答 Tree of Clarifications: Answering Ambiguous Questions with Retrieval-Augmented Large Language Models ( http://arxiv.org/abs/2310.14696v1 ) ライセンス: Link先を確認	Gangwoo Kim, Sungdong Kim, Byeongguk Jeon, Joonsuk Park, Jaewoo Kang	(参考訳) オープンドメインの質問に答える質問はしばしば曖昧で、複数の解釈が可能である。それらを扱う1つのアプローチは、あいまいな質問(aq)の可能な全ての解釈を識別し、stlmakh et al. (2022) が提案したように、それら全てに対応するロングフォームな回答を生成することである。ユーザを悩ませることなく総合的な応答を提供するが、曖昧さの多次元を考慮し、対応する知識を収集することは依然として課題である。この課題に対処するために、我々は新しい枠組みであるtoc(tree of clarifications)を提案している。これは再帰的にaqの曖昧さのない木を構築する。 ToCは、Disambig-F1とDisambig-ROUGEでトレーニングされたトレーニングセット全体において、完全に教師されたベースラインを越えながら、ASQAの既存のベースラインを数ショットで上回っている。コードはhttps://github.com/gankim/tree-of-clarificationsで入手できる。 Questions in open-domain question answering are often ambiguous, allowing multiple interpretations. One approach to handling them is to identify all possible interpretations of the ambiguous question (AQ) and to generate a long-form answer addressing them all, as suggested by Stelmakh et al., (2022). While it provides a comprehensive response without bothering the user for clarification, considering multiple dimensions of ambiguity and gathering corresponding knowledge remains a challenge. To cope with the challenge, we propose a novel framework, Tree of Clarifications (ToC): It recursively constructs a tree of disambiguations for the AQ -- via few-shot prompting leveraging external knowledge -- and uses it to generate a long-form answer. ToC outperforms existing baselines on ASQA in a few-shot setup across the metrics, while surpassing fully-supervised baselines trained on the whole training set in terms of Disambig-F1 and Disambig-ROUGE. Code is available at https://github.com/gankim/tree-of-clarifications.	翻訳日:2023-10-24 21:28:39 公開日:2023-10-23
# CAwa-NeRF:圧縮型NeRF特徴のインスタント学習 CAwa-NeRF: Instant Learning of Compression-Aware NeRF Features ( http://arxiv.org/abs/2310.14695v1 ) ライセンス: Link先を確認	Omnia Mahmoud, Théo Ladune, Matthieu Gendrin	(参考訳) ボリューム特徴格子による3Dシーンのモデリングは、ニューラルレイディアンスフィールド(NeRF)を改善するための神経近似の有望な方向の1つである。 Instant-NGP (INGP) はトレーニング可能な機能グリッドのルックアップテーブルからマルチ解像度ハッシュコードを導入し、高品質なニューラルネットワークプリミティブを数秒で学習できるようにした。しかし、この改善はストレージサイズを高くするコストに繋がった。本稿では, 圧縮対応NeRF特徴量(CAwa-NeRF)の即時学習により, モデルトレーニングの終了時に, ストレージアーキテクチャや元のINGP論文で使用されるパラメータを変更せずに, 余分な時間オーバーヘッドでジップ圧縮された特徴格子をエクスポートすることが可能となる課題に対処する。しかし,提案手法はINGPに限らず,どのモデルにも適用可能である。シミュレーションにより,提案するインスタント・ラーニング・パイプラインは,単一の物体をマスクした背景シーンや,スタジオで撮影された実写シーンなど,さまざまな静的シーンで印象的な結果が得られる。特に、単一のオブジェクトをマスクした背景シーンでは、CAwa-NeRFはPSNR (33 dB) を損なうことなく、元のサイズの機能グリッドを6% (1.2 MB) まで圧縮し、わずかに仮想損失 (32.31 dB) の2.4% (0.53 MB) まで圧縮する。 Modeling 3D scenes by volumetric feature grids is one of the promising directions of neural approximations to improve Neural Radiance Fields (NeRF). Instant-NGP (INGP) introduced multi-resolution hash encoding from a lookup table of trainable feature grids which enabled learning high-quality neural graphics primitives in a matter of seconds. However, this improvement came at the cost of higher storage size. In this paper, we address this challenge by introducing instant learning of compression-aware NeRF features (CAwa-NeRF), which allows exporting zip-compressed feature grids at the end of model training with negligible extra time overhead, without changing either the storage architecture or the parameters used in the original INGP paper. Nonetheless, the proposed method is not limited to INGP but could also be adapted to any model. In extensive simulations, our proposed instant learning pipeline achieves impressive results on different kinds of static scenes such as single object masked background scenes and real-life scenes captured in our studio.
In particular, for single object masked background scenes CAwa-NeRF compresses the feature grids down to 6% (1.2 MB) of the original size without any loss in the PSNR (33 dB) or down to 2.4% (0.53 MB) with a slight virtual loss (32.31 dB).	翻訳日:2023-10-24 21:28:18 公開日:2023-10-23
# 軽量通信のためのフェデレーション学習圧縮 Federated learning compression designed for lightweight communications ( http://arxiv.org/abs/2310.14693v1 ) ライセンス: Link先を確認	Lucas Grativol Ribeiro (IMT Atlantique - MEE, Lab_STICC_BRAIn, Lab-STICC_2AI, LHC), Mathieu Leonardon (IMT Atlantique - MEE, Lab_STICC_BRAIn), Guillaume Muller, Virginie Fresse, Matthieu Arzel (IMT Atlantique - MEE, Lab-STICC_2AI)	(参考訳) フェデレーション・ラーニング(federated learning, fl)は、エッジレベルの機械学習のための有望な分散手法であり、特に軍や医療分野のプライバシに敏感なアプリケーションでは、クライアントデータをクラウドコンピューティングサーバに共有したり転送したりできない。多くのユースケースにおいて、通信コストは自然なネットワーク利用のためにflにとって大きな課題である。スマートフォンやIoT(Internet of Things)ノードなどのクライアントデバイスは、エネルギー、計算、メモリの面で限られたリソースを持つ。これらのハードウェア制約に対処するために、軽量モデルやプラニングや量子化といった圧縮技術が集中型パラダイムで一般的に採用されている。本稿では,典型的画像分類タスクにおけるflに対する圧縮技術の影響について検討する。さらに,単純な方法では最大50%の圧縮が可能で,精度損失の1%未満で,最先端技術と競合することを実証した。 Federated Learning (FL) is a promising distributed method for edge-level machine learning, particularly for privacy-sensitive applications such as those in military and medical domains, where client data cannot be shared or transferred to a cloud computing server. In many use cases, communication cost is a major challenge in FL because of its inherently intensive network usage. Client devices, such as smartphones or Internet of Things (IoT) nodes, have limited resources in terms of energy, computation, and memory. To address these hardware constraints, lightweight models and compression techniques such as pruning and quantization are commonly adopted in centralised paradigms. In this paper, we investigate the impact of compression techniques on FL for a typical image classification task. Going further, we demonstrate that a straightforward method can compress messages by up to 50% with less than 1% accuracy loss, competing with state-of-the-art techniques.	翻訳日:2023-10-24 21:27:47 公開日:2023-10-23
# 部分形状対応と関数写像について On Partial Shape Correspondence and Functional Maps ( http://arxiv.org/abs/2310.14692v1 ) ライセンス: Link先を確認	Amit Bracha, Thomas Dagès, Ron Kimmel	(参考訳) それらの部品に一致する形状を扱っている間、我々はしばしば機能写像と呼ばれる器具を利用する。この考え方は、形状マッチング問題を最小二乗問題を解いて代数的にマッチングを行う‘convenient’空間に変換することである。ここで、このような定式化は、この分野では人気があるものの、偏りが呼び出されたときに推定マッチに誤差をもたらすと論じる。このようなエラーは、高度な特徴抽出ネットワークを考慮しても避けられず、形状部分性の増大とともにエスカレートし、そのようなシステムの学習能力に悪影響を及ぼすことを示すことができる。これらの制約を回避するために, 部分形状マッチングに対する新しいアプローチを提案する。関数写像の研究により,関数写像中間空間の必要性を回避し,特徴マッチングにより部分形状と全体形状の直接対応を確立する新しい手法が確立された。距離空間間のグロモフ距離は、損失関数の最初の部分を構成することにつながる。正規化には、マッピングのプロパティを保存する領域に基づく用語と、関数マップを計算する必要なしに緩和されたバージョンの2つのオプションを使用する。提案手法はshrec'16データセットの性能が向上し,既存の非教師付き部分形状マッチング法を上回った。特に、SHREC'16 HOLESベンチマークで最先端の結果を達成し、教師付き手法よりも優れている。 When matching shapes to their parts, we often utilize an instrument known as functional maps. The idea is to translate the shape matching problem into "convenient" spaces in which matching is performed algebraically by solving a least squares problem. Here, we argue that such formulations, though popular in this field, introduce errors in the estimated match when partiality is invoked. Such errors are unavoidable even when considering advanced feature extraction networks, and they can be shown to escalate with increasing degrees of shape partiality, adversely affecting the learning capability of such systems. To circumvent these limitations, we propose a novel approach for partial shape matching. Our study of functional maps led us to a novel method that establishes direct correspondence between partial and full shapes through feature matching, bypassing the need for intermediate functional-map spaces. The Gromov distance between metric spaces leads to the construction of the first part of our loss functions. For regularization we use two options: a term based on the area preserving property of the mapping, and a relaxed version of it without the need to compute a functional map.
The proposed approach shows superior performance on the SHREC'16 dataset, outperforming existing unsupervised methods for partial shape matching. In particular, it achieves state-of-the-art results on the SHREC'16 HOLES benchmark, outperforming supervised methods as well.	翻訳日:2023-10-24 21:27:31 公開日:2023-10-23
# 多様な表構造に対する質問応答のためのapi支援コード生成 API-Assisted Code Generation for Question Answering on Varied Table Structures ( http://arxiv.org/abs/2310.14687v1 ) ライセンス: Link先を確認	Yihan Cao, Shuyi Chen, Ryan Liu, Zhiruo Wang, Daniel Fried	(参考訳) 実行可能プログラムの生成による質問応答(TableQA)のテーブル化への挑戦は、通常ドメイン固有の論理形式を必要とする様々なテーブル構造に適応している。そこで本研究では,(1)マルチインデックスPandasデータフレームとして構造化テーブルの統一表現を提供し,(2)Pythonを強力なクエリ言語として使用し,(3)NL質問をPythonプログラムに変換し,Pandasデータフレーム上で実行可能にする。さらに,プログラム機能の拡張と外部知識を備えた複雑なリレーショナルな質問に答えるため,pythonプログラムが呼び出すカスタマイズapiも提供する。異なる構造のテーブル(リレーショナル、マルチテーブル、階層行列)を含む4つのテーブルqaデータセットを実験し、過去の最先端システムに対して顕著な改善を達成します。アブレーション研究では,(1)LLMのみを使用するベースライン上のマルチインデックス表現とAPIの利点を示し,(2)アプローチがモジュール化されており,追加APIを組み込むことができることを示す。 A persistent challenge to table question answering (TableQA) by generating executable programs has been adapting to varied table structures, typically requiring domain-specific logical forms. In response, this paper introduces a unified TableQA framework that: (1) provides a unified representation for structured tables as multi-index Pandas data frames, (2) uses Python as a powerful querying language, and (3) uses few-shot prompting to translate NL questions into Python programs, which are executable on Pandas data frames. Furthermore, to answer complex relational questions with extended program functionality and external knowledge, our framework allows customized APIs that Python programs can call. We experiment with four TableQA datasets that involve tables of different structures -- relational, multi-table, and hierarchical matrix shapes -- and achieve prominent improvements over past state-of-the-art systems. In ablation studies, we (1) show benefits from our multi-index representation and APIs over baselines that use only an LLM, and (2) demonstrate that our approach is modular and can incorporate additional APIs.	翻訳日:2023-10-24 21:27:09 公開日:2023-10-23
# SpEL: エンティティリンクの構造化予測 SpEL: Structured Prediction for Entity Linking ( http://arxiv.org/abs/2310.14684v1 ) ライセンス: Link先を確認	Hassan S. Shavarani and Anoop Sarkar	(参考訳) エンティティリンクは、テキストのスパンをオントロジーや知識ソースにリンクすることで構造化データ生成に焦点を当てた、注目すべき研究スレッドである。各入力トークンをエンティティとして分類したエンティティリンクに対する構造化予測の使用を再検討し,トークン予測を集約する。 SpEL(Structured Prediction for Entity Linking)と呼ばれるこのシステムは、エンティティリンクのタスクに構造化予測を適用するために、いくつかの新しいアイデアを使用する最先端エンティティリンクシステムである。2つの洗練された微調整ステップ、コンテキスト依存型予測集約戦略、モデルの出力語彙の縮小、そしてトレーニングと推論トークン化ミスマッチがあるエンティティリンクシステムにおける一般的な問題に対処する。実験の結果,WikipediaへのエンティティリンクのためのAIDAベンチマークデータセットでは,最先端のアルゴリズムよりも優れていることがわかった。提案手法は,パラメータ数や推論速度の観点からも,非常に効率的な計算手法である。 Entity linking is a prominent thread of research focused on structured data creation by linking spans of text to an ontology or knowledge source. We revisit the use of structured prediction for entity linking which classifies each individual input token as an entity, and aggregates the token predictions. Our system, called SpEL (Structured prediction for Entity Linking) is a state-of-the-art entity linking system that uses some new ideas to apply structured prediction to the task of entity linking including: two refined fine-tuning steps; a context sensitive prediction aggregation strategy; reduction of the size of the model's output vocabulary, and; we address a common problem in entity-linking systems where there is a training vs. inference tokenization mismatch. Our experiments show that we can outperform the state-of-the-art on the commonly used AIDA benchmark dataset for entity linking to Wikipedia. Our method is also very compute efficient in terms of number of parameters and speed of inference.	翻訳日:2023-10-24 21:26:48 公開日:2023-10-23
# 特異点まわりの位相的に保護された絡み合い Topologically protected entanglement switching around exceptional points ( http://arxiv.org/abs/2310.14731v1 ) ライセンス: Link先を確認	Zan Tang, Tian Chen, Xing Tang, and Xiangdong Zhang	(参考訳) 量子エンタングルメント状態の堅牢な操作は、量子情報、計算、通信における応用に不可欠である。しかし、デコヒーレンスと乱れのため、このようなタスクを完了させることは常に大きな課題でした。本稿では、四重縮退の例外点を設計することにより、量子絡み合い状態の堅牢な操作を実現するための有効なスキームを理論的に提案し、実験的に実証する。 2つの重なり合うリーマンエネルギー面上の例外点を囲むことで、高い忠実度を持つ絡み合った状態に対するキラルスイッチを実現する。リーマン面構造によって与えられる位相的保護のため、このキラリティの切替は、囲む経路における摂動に対して強い堅牢性を示す。さらに,このようなスキームを量子ウォークプラットフォーム上で実験的に検証した。我々の研究は、量子情報分野における非エルミート物理学の新たな応用方法を開く。 The robust operation of quantum entanglement states is crucial for applications in quantum information, computing, and communications. However, it has always been a great challenge to complete such a task because of decoherence and disorder. Here, we propose theoretically and demonstrate experimentally an effective scheme to realize robust operation of quantum entanglement states by designing quadruple degeneracy exceptional points. By encircling the exceptional points on two overlapping Riemann energy surfaces, we have realized a chiral switch for entangled states with high fidelity. Owing to the topological protection conferred by the Riemann surface structure, this switching of chirality exhibits strong robustness against perturbations in the encircling path. Furthermore, we have experimentally validated such a scheme on a quantum walk platform. Our work opens up a new way for the application of non-Hermitian physics in the field of quantum information.	翻訳日:2023-10-24 21:21:26 公開日:2023-10-23
# MAS:2次元拡散を用いた3次元モーション生成のためのマルチビューアンセストラルサンプリング MAS: Multi-view Ancestral Sampling for 3D motion generation using 2D diffusion ( http://arxiv.org/abs/2310.14729v1 ) ライセンス: Link先を確認	Roy Kapon, Guy Tevet, Daniel Cohen-Or and Amit H. Bermano	(参考訳) 動作シーケンスの連続した多視点2Dサンプルを生成する手法であるMulti-view Ancestral Smpling (MAS)を導入し,その3Dサンプルの作成を可能にする。 MASは2Dデータのみに基づいて訓練された拡散モデルを活用し、3Dデータが乏しく、収集が難しいため、これまで探索されていない様々な動きの場に機会を開放する。 MASは、異なる角度から同じ動きを表す複数の2Dモーションシーケンスを同時に識別する。我々の整合性ブロックは、個々の世代を統一された3Dシーケンスに結合し、次のイテレーションで元のビューに投影することで、各拡散ステップにおけるすべてのビューの整合性を保証する。プロバスケットボールの操り方、ボール装置を備えたリズミカル体操、馬の障害物コースレースの映像から得られた2DポーズデータにMASを実演する。それぞれの領域において、3Dモーションキャプチャは困難であるが、MASはテキスト条件のない多種多様なリアルな3Dシーケンスを生成する。以下に示すように、我々の祖先サンプリングに基づくアプローチは、一般的な最適化に基づくアプローチと比較して拡散フレームワークとより自然な統合を提供し、ドメイン外サンプリング、詳細の欠如、モード崩壊といった一般的な問題を回避する。 https://guytevet.github.io/mas-page/ We introduce Multi-view Ancestral Sampling (MAS), a method for generating consistent multi-view 2D samples of a motion sequence, enabling the creation of its 3D counterpart. MAS leverages a diffusion model trained solely on 2D data, opening opportunities to exciting and diverse fields of motion previously under-explored as 3D data is scarce and hard to collect. MAS works by simultaneously denoising multiple 2D motion sequences representing the same motion from different angles. Our consistency block ensures consistency across all views at each diffusion step by combining the individual generations into a unified 3D sequence, and projecting it back to the original views for the next iteration. We demonstrate MAS on 2D pose data acquired from videos depicting professional basketball maneuvers, rhythmic gymnastic performances featuring a ball apparatus, and horse obstacle course races. In each of these domains, 3D motion capture is arduous, and yet, MAS generates diverse and realistic 3D sequences without textual conditioning. 
As we demonstrate, our ancestral sampling-based approach offers a more natural integration with the diffusion framework compared to popular denoising optimization-based approaches, and avoids common issues such as out-of-domain sampling, lack of details and mode-collapse. https://guytevet.github.io/mas-page/	翻訳日:2023-10-24 21:21:12 公開日:2023-10-23
# LLM-gernerated Text Detectionに関する調査:必要,方法,今後の方向性 A Survey on LLM-gernerated Text Detection: Necessity, Methods, and Future Directions ( http://arxiv.org/abs/2310.14724v1 ) ライセンス: Link先を確認	Junchao Wu, Shu Yang, Runzhe Zhan, Yulin Yuan, Derek F. Wong, Lidia S. Chao	(参考訳) 大きな言語モデル(LLM)から生まれた複雑な言語を理解し、追跡し、生成する強力な能力によって、LLMが生成したテキストは、私たちの日常生活の多くの領域を驚くほどの速さで浸水させ、人間に広く受け入れられる。 LLMが拡大を続けるにつれ、LLMが生成するテキストを検出する検出器を開発する必要がある。このことは、LLMの潜在的な誤用や、LLM生成コンテンツの有害な影響から芸術的表現やソーシャルネットワークのような保護領域を緩和するために重要である。 LLMの生成したテキスト検出は、LLMによってテキストが生成されるかどうかを識別することを目的としており、本質的には二値分類タスクである。検出器技術は最近、透かし技術、ゼロショット法、LMのファインチューニング法、対向学習法、LLMを検出器として使う方法、そして人力支援手法の革新によって、顕著な進歩が見られた。本調査では,この領域における最近の研究のブレークスルーと,検出器研究の推進の必要性を裏付けるものである。また、一般的なデータセットを掘り下げて、その制限と開発要件を明らかにします。さらに, LLM生成テキスト検出のパラダイムを分析し, アウト・オブ・ディストリビューション問題, 潜在的な攻撃, データのあいまいさといった課題に光を当てる。結論として,LLM生成テキスト検出における今後の研究の方向性に注目し,責任ある人工知能(AI)の実装を推し進める。本調査の目的は,新参者への明確かつ包括的な紹介と,LLM生成テキスト検出分野における有意義な更新を提供することである。 The powerful ability to understand, follow, and generate complex language emerging from large language models (LLMs) makes LLM-generated text flood many areas of our daily lives at an incredible speed and is widely accepted by humans. As LLMs continue to expand, there is an imperative need to develop detectors that can detect LLM-generated text. This is crucial to mitigate potential misuse of LLMs and safeguard realms like artistic expression and social networks from harmful influence of LLM-generated content. The LLM-generated text detection aims to discern if a piece of text was produced by an LLM, which is essentially a binary classification task. The detector techniques have witnessed notable advancements recently, propelled by innovations in watermarking techniques, zero-shot methods, fine-tuning LM methods, adversarial learning methods, LLMs as detectors, and human-assisted methods. In this survey, we collate recent research breakthroughs in this area and underscore the pressing need to bolster detector research. 
We also delve into prevalent datasets, elucidating their limitations and developmental requirements. Furthermore, we analyze various LLM-generated text detection paradigms, shedding light on challenges like out-of-distribution problems, potential attacks, and data ambiguity. Finally, we highlight interesting directions for future research in LLM-generated text detection to advance the implementation of responsible artificial intelligence (AI). Our aim with this survey is to provide a clear and comprehensive introduction for newcomers while also offering seasoned researchers a valuable update in the field of LLM-generated text detection.	翻訳日:2023-10-24 21:20:49 公開日:2023-10-23
# ニューラルネットワークのための時系列データ前処理のための拡張適応入力正規化 Extended Deep Adaptive Input Normalization for Preprocessing Time Series Data for Neural Networks ( http://arxiv.org/abs/2310.14720v1 ) ライセンス: Link先を確認	Marcus A. K. September, Francesco Sanna Passino, Leonie Goldmann, Anton Hinel	(参考訳) データの前処理は、あらゆる機械学習パイプラインの重要な部分であり、パフォーマンスとトレーニング効率の両方に大きな影響を与える可能性がある。時系列予測と分類にディープニューラルネットワークを使用する場合、特に顕著である:実世界の時系列データは、多モード性、歪性、外れ値などの不規則性を示すことが多く、これらの特性が適切に対処されていない場合、モデルの性能は急速に低下する。本研究では,与えられたタスクに対する不規則な時系列データを,固定正規化方式ではなく,エンドツーエンドで適切に正規化する方法を学ぶ新しい適応型ニューラルネットワーク層であるEDAIN(Extended Deep Adaptive Input Normalization)層を提案する。これは、バックプロパゲーションを使用して、未知のパラメータとディープニューラルネットワークを同時に最適化することで実現される。本実験は,従来の正規化手法や既存の適応時系列前処理方式と比較して,EDAIN層の優れた性能を示すために,合成データ,クレジットデフォルト予測データセット,大規模リミットオーダーブックベンチマークデータセットを用いて実施した。 Data preprocessing is a crucial part of any machine learning pipeline, and it can have a significant impact on both performance and training efficiency. This is especially evident when using deep neural networks for time series prediction and classification: real-world time series data often exhibit irregularities such as multi-modality, skewness and outliers, and the model performance can degrade rapidly if these characteristics are not adequately addressed. In this work, we propose the EDAIN (Extended Deep Adaptive Input Normalization) layer, a novel adaptive neural layer that learns how to appropriately normalize irregular time series data for a given task in an end-to-end fashion, instead of using a fixed normalization scheme. This is achieved by optimizing its unknown parameters simultaneously with the deep neural network using back-propagation. Our experiments, conducted using synthetic data, a credit default prediction dataset, and a large-scale limit order book benchmark dataset, demonstrate the superior performance of the EDAIN layer when compared to conventional normalization methods and existing adaptive time series preprocessing layers.	翻訳日:2023-10-24 21:20:20 公開日:2023-10-23
# 空中画像の半教師対象検出におけるスケール不均衡の再考 Rethinking Scale Imbalance in Semi-supervised Object Detection for Aerial Images ( http://arxiv.org/abs/2310.14718v1 ) ライセンス: Link先を確認	Ruixiang Zhang, Chang Xu, Fang Xu, Wen Yang, Guangjun He, Huai Yu, Gui-Song Xia	(参考訳) 本稿では,空中画像における半教師対象検出(SSOD)のスケール不均衡問題に焦点をあてる。自然画像と比較すると、空中画像のオブジェクトは画像あたりのサイズと量が小さく、手動のアノテーションの難しさが増す。一方、advanced ssod技術は限定されたラベル付きデータと大量のラベルなしデータを利用して優れた検出器を訓練し、アノテーションコストを節約できる。しかし、航空画像の未調査課題として、SSODは多数の小さな物体に直面すると、劇的な性能低下に悩まされる。大規模オブジェクト間の予測を解析することにより、スケールバイアス、すなわち擬似ラベルの不均衡、ラベル割り当ての不均衡、負の学習不均衡による3つの不均衡問題を同定する。これらの課題に対処するために,航空画像のためのS^3OD学習パイプラインを提案する。 S^3ODでは,3つの重要な要素,SAT(Size-aware Adaptive Thresholding),SLA(Size-re Balanced Label Assignment),TNL(Teacher-guided Negative Learning)を提案し,非バイアス学習を保証した。具体的には、SATは、異なるスケールでオブジェクトの擬似ラベルをフィルタリングする適切なしきい値を選択する。 SLAは、再サンプリングと再重み付けによって、さまざまなスケールでオブジェクトの正のサンプルのバランスをとる。 tnlは教師モデルによって生成された情報を利用して負のサンプルの不均衡を緩和する。 DOTA-v1.5ベンチマークで行った大規模な実験は、提案手法が最先端の競合相手よりも優れていることを示した。コードはまもなくリリースされる予定だ。 This paper focuses on the scale imbalance problem of semi-supervised object detection(SSOD) in aerial images. Compared to natural images, objects in aerial images show smaller sizes and larger quantities per image, increasing the difficulty of manual annotation. Meanwhile, the advanced SSOD technique can train superior detectors by leveraging limited labeled data and massive unlabeled data, saving annotation costs. However, as an understudied task in aerial images, SSOD suffers from a drastic performance drop when facing a large proportion of small objects. By analyzing the predictions between small and large objects, we identify three imbalance issues caused by the scale bias, i.e., pseudo-label imbalance, label assignment imbalance, and negative learning imbalance. To tackle these issues, we propose a novel Scale-discriminative Semi-Supervised Object Detection (S^3OD) learning pipeline for aerial images. 
In our S^3OD, three key components, Size-aware Adaptive Thresholding (SAT), Size-rebalanced Label Assignment (SLA), and Teacher-guided Negative Learning (TNL), are proposed to ensure scale-unbiased learning. Specifically, SAT adaptively selects appropriate thresholds to filter pseudo-labels for objects at different scales. SLA balances positive samples of objects at different scales through resampling and reweighting. TNL alleviates the imbalance in negative samples by leveraging information generated by a teacher model. Extensive experiments conducted on the DOTA-v1.5 benchmark demonstrate the superiority of our proposed methods over state-of-the-art competitors. Code will be released soon.	翻訳日:2023-10-24 21:20:00 公開日:2023-10-23
# BatteryML:バッテリ劣化による機械学習のためのオープンソースプラットフォーム BatteryML:An Open-source platform for Machine Learning on Battery Degradation ( http://arxiv.org/abs/2310.14714v1 ) ライセンス: Link先を確認	Han Zhang, Xiaofan Gui, Shun Zheng, Ziheng Lu, Yuqi Li, Jiang Bian	(参考訳) バッテリーの劣化は、エネルギーストレージ領域における重要な関心事であり、機械学習が先進的な洞察とソリューションを促進する強力なツールとして台頭している。しかし、この電気化学科学と機械学習の交わりは複雑な問題を引き起こす。機械学習の専門家はバッテリー科学の複雑さに苦しむことが多いが、バッテリー研究者は特定のデータセットに合わせた複雑なモデルに適応するハードルに直面している。これに加えて、データフォーマットと評価ベンチマークを包含する、バッテリー劣化モデリングの凝集度基準が目立って欠如している。このような障害を認識したbatterymlは,データの前処理,機能抽出,従来型モデルと最先端モデルの両方の実装を統一した,ワンステップの,オールエンコンパスなオープンソースプラットフォームです。この合理化されたアプローチは、研究アプリケーションの実用性と効率を高めることを約束する。 BatteryMLはこの空白を埋めようとしている。さまざまな専門分野の専門家が協力して貢献できる環境を育み、バッテリリサーチの全体的な理解と進歩を高める。プロジェクトのコードはGitHubでhttps://github.com/microsoft/BatteryMLで公開されている。 Battery degradation remains a pivotal concern in the energy storage domain, with machine learning emerging as a potent tool to drive forward insights and solutions. However, this intersection of electrochemical science and machine learning poses complex challenges. Machine learning experts often grapple with the intricacies of battery science, while battery researchers face hurdles in adapting intricate models tailored to specific datasets. Beyond this, a cohesive standard for battery degradation modeling, inclusive of data formats and evaluative benchmarks, is conspicuously absent. Recognizing these impediments, we present BatteryML - a one-step, all-encompass, and open-source platform designed to unify data preprocessing, feature extraction, and the implementation of both traditional and state-of-the-art models. This streamlined approach promises to enhance the practicality and efficiency of research applications. 
BatteryML seeks to fill this void, fostering an environment where experts from diverse specializations can collaboratively contribute, thus elevating the collective understanding and advancement of battery research. The code for our project is publicly available on GitHub at https://github.com/microsoft/BatteryML.	翻訳日:2023-10-24 21:19:34 公開日:2023-10-23
# 空飛ぶサイドキックトラベルセールスマン問題に対する自己適応型遺伝的アルゴリズム A self-adaptive genetic algorithm for the flying sidekick travelling salesman problem ( http://arxiv.org/abs/2310.14713v1 ) ライセンス: Link先を確認	Ted Pilcher	(参考訳) 本稿では,現在最先端の自己適応型遺伝的アルゴリズムを用いて,FSTSP(Flying Sidekick Travelling Salesman Problem)の解法を提案する。フライングサイドキックトラベルセールスマン問題(フライングサイドキックトラベルセールスマン問題、Flying Sidekick Travelling Salesman Problem)は、トラベルリングセールスマン問題(TSP)を拡張した、ドローンの導入による組合せ最適化問題である。 FSTSPの目的は、ドローンを戦略的に展開しながら、すべての場所を訪問するための総時間を最小化することである。また、私の知る限りでは、FSTSP問題を解決するために自己適応型遺伝的アルゴリズム(GA)が使用されるのはこれが初めてです。より小さな問題事例に対する実験結果から,本アルゴリズムは競合するアルゴリズムと比較して,最適な解の量が多く,最適解の比率が低いことを示す。さらに、より大規模な問題の場合、このアルゴリズムは各問題サイズで競合するアルゴリズムを全て上回り、計算時間は合理的に低い。 This paper presents a novel approach to solving the Flying Sidekick Travelling Salesman Problem (FSTSP) using a state-of-the-art self-adaptive genetic algorithm. The Flying Sidekick Travelling Salesman Problem is a combinatorial optimisation problem that extends the Travelling Salesman Problem (TSP) by introducing the use of drones. In FSTSP, the objective is to minimise the total time to visit all locations while strategically deploying a drone to serve hard-to-reach customer locations. Also, to the best of my knowledge, this is the first time a self-adaptive genetic algorithm (GA) has been used to solve the FSTSP problem. Experimental results on smaller-sized problem instances demonstrate that this algorithm can find a higher quantity of optimal solutions and a lower percentage gap to the optimal solution compared to rival algorithms. Moreover, on larger-sized problem instances, this algorithm outperforms all rival algorithms on each problem size while maintaining a reasonably low computation time.	翻訳日:2023-10-24 21:19:11 公開日:2023-10-23
# 高次元低サンプルサイズ分類のためのランダム森林の相違 Random Forest Dissimilarity for High-Dimension Low Sample Size Classification ( http://arxiv.org/abs/2310.14710v1 ) ライセンス: Link先を確認	Lucca Portes Cavalheiro, Simon Bernard, Jean Paul Barddal, Laurent Heutte	(参考訳) 高次元, 低サンプルサイズ (HDLSS) 問題は, 機械学習の現実的な応用に多い。医療画像からテキスト処理まで、従来の機械学習アルゴリズムは、そのようなデータから可能な最善の概念を学ぶのに失敗した。前報では,多視点分類のための相似性に基づくアプローチであるランダム森林相似性(Random Forest Dissimilarity,RFD)を提案した。本研究では、RF類似度尺度を学習前計算SVMカーネル(RFSVM)として使用することにより、HDLSS分類問題を解決するためのこのアプローチの中核となる原理を変換する。このような学習的類似度尺度は, この分類文脈に特に適しており, 正確であることを示す。厳密な統計分析によって支援された40の公的なHDLSS分類データセットによる実験により、RFSVM法はHDLSS問題の大部分において既存の手法よりも優れており、低あるいは非HDLSS問題に対して非常に競争力のあるままであることが示された。 High dimension, low sample size (HDLSS) problems are numerous among real-world applications of machine learning. From medical images to text processing, traditional machine learning algorithms are usually unsuccessful in learning the best possible concept from such data. In a previous work, we proposed a dissimilarity-based approach for multi-view classification, the Random Forest Dissimilarity (RFD), that achieves state-of-the-art results for such problems. In this work, we transpose the core principle of this approach to solving HDLSS classification problems, by using the RF similarity measure as a learned precomputed SVM kernel (RFSVM). We show that such a learned similarity measure is particularly suited and accurate for this classification context. Experiments conducted on 40 public HDLSS classification datasets, supported by rigorous statistical analyses, show that the RFSVM method outperforms existing methods for the majority of HDLSS problems and remains at the same time very competitive for low or non-HDLSS problems.	翻訳日:2023-10-24 21:18:52 公開日:2023-10-23
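To make the idea of a random-forest similarity used as a precomputed kernel more concrete, here is a minimal sketch of the classic same-leaf proximity: two samples are similar in proportion to the number of trees in which they reach the same leaf. This is only an assumption about the general family of measures; the RFD measure in the paper may be defined with further refinements. The input `leaf_ids` is hypothetical and is assumed to come from an already trained forest (for instance, scikit-learn's `RandomForestClassifier.apply` returns leaf indices in this shape).

```python
def rf_similarity_matrix(leaf_ids):
    """Same-leaf random-forest proximity.

    leaf_ids[i][t] is the index of the leaf that sample i reaches in tree t.
    Returns a symmetric n x n matrix whose entry (i, j) is the fraction of
    trees in which samples i and j fall into the same leaf.
    """
    n = len(leaf_ids)
    n_trees = len(leaf_ids[0])
    sim = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i, n):
            shared = sum(1 for a, b in zip(leaf_ids[i], leaf_ids[j]) if a == b)
            sim[i][j] = sim[j][i] = shared / n_trees
    return sim


if __name__ == "__main__":
    # Three samples routed through a hypothetical forest of three trees.
    leaves = [[0, 1, 2], [0, 1, 3], [4, 5, 6]]
    for row in rf_similarity_matrix(leaves):
        print(row)
```

A matrix of this form could then be handed to an SVM that accepts precomputed kernels (for example `SVC(kernel="precomputed")` in scikit-learn), which is the role the abstract describes for RFSVM.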
# Once Upon a $\textit{Time}$ in $\textit{Graph}$: Relative-Time Pretraining for Complex Temporal Reasoning Once Upon a $\textit{Time}$ in $\textit{Graph}$: Relative-Time Pretraining for Complex Temporal Reasoning ( http://arxiv.org/abs/2310.14709v1 ) ライセンス: Link先を確認	Sen Yang, Xin Li, Lidong Bing, Wai Lam	(参考訳) 私たちの物理的世界は時間とともに常に進化し続けており、事前訓練された言語モデルがテキストの時間的文脈を理解し、推論するための課題をレンダリングしています。既存の作業は、テキストとタイムスタンプの直接的な関連性を強化することに焦点を当てている。しかし、知識時間関連は通常、知識間の時間的依存関係の推論を必要とする下流タスクには不十分である。本研究では,時間の性質を利用し,時間軸に沿った事象の相対配置に基づくグラフ構造の構築を提案する。グラフビューに触発されて、私たちは rememo (\underline{re}$lative ti$\underline{me}$ $\underline{mo}$deling) を提案します。実験の結果,RemeMoは複数の時間的質問応答データセットのベースラインT5よりも優れていた。さらに分析すると、RemeMoは特に長距離複雑な時間的依存関係のモデリングに長けていることがわかる。私たちはコードと事前トレーニングされたチェックポイントを$\href{https://github.com/DAMO-NLP-SG/RemeMo}{\text{this url}}$でリリースします。 Our physical world is constantly evolving over time, rendering challenges for pre-trained language models to understand and reason over the temporal contexts of texts. Existing work focuses on strengthening the direct association between a piece of text and its time-stamp. However, the knowledge-time association is usually insufficient for the downstream tasks that require reasoning over temporal dependencies between knowledge. In this work, we make use of the underlying nature of time, all temporally-scoped sentences are strung together through a one-dimensional time axis, and suggest creating a graph structure based on the relative placements of events along the time axis. Inspired by the graph view, we propose RemeMo ($\underline{Re}$lative Ti$\underline{me}$ $\underline{Mo}$deling), which explicitly connects all temporally-scoped facts by modeling the time relations between any two sentences. Experimental results show that RemeMo outperforms the baseline T5 on multiple temporal question answering datasets under various settings. Further analysis suggests that RemeMo is especially good at modeling long-range complex temporal dependencies. 
We release our code and pre-trained checkpoints at $\href{https://github.com/DAMO-NLP-SG/RemeMo}{\text{this url}}$.	翻訳日:2023-10-24 21:18:35 公開日:2023-10-23
# Open Domain Conversational Question Answeringのための強力で効率的なベースライン Strong and Efficient Baselines for Open Domain Conversational Question Answering ( http://arxiv.org/abs/2310.14708v1 ) ライセンス: Link先を確認	Andrei C. Coman, Gianni Barlacchi, Adri\`a de Gispert	(参考訳) Open Domain Question Answering (ODQA)設定とは異なり、会話型ドメイン(ODConvQA)は、効率と有効性の両方のベースラインの再評価に関して、限定的な注目を集めている。本稿では,DPR(State-of-the-Art (SotA) Dense Passage Retrieval)レトリバーとFusion-in-Decoder (FiD) リーダパイプラインについて検討し,様々な制約によりODConvQAタスクに適用した場合に顕著に性能が低下することを示す。次に,レトリバーと読み手の間に高速な再ランキングコンポーネントを導入し,対象とする微調整ステップを行うことで,強力でシンプルで効率的なベースラインを提案する。 TopiOCQA と OR-QuAC という2つの ODConvQA タスクの実験により,本手法が SotA 結果を改善するとともに,読み出し遅延を60%削減することを示した。最後に,LLM(Large Language Models)の利用を含む,より複雑なアプローチのリファレンスとして機能する,挑戦的なベースラインの開発に関する,新たな価値ある洞察を提供する。 Unlike the Open Domain Question Answering (ODQA) setting, the conversational (ODConvQA) domain has received limited attention when it comes to reevaluating baselines for both efficiency and effectiveness. In this paper, we study the State-of-the-Art (SotA) Dense Passage Retrieval (DPR) retriever and Fusion-in-Decoder (FiD) reader pipeline, and show that it significantly underperforms when applied to ODConvQA tasks due to various limitations. We then propose and evaluate strong yet simple and efficient baselines, by introducing a fast reranking component between the retriever and the reader, and by performing targeted finetuning steps. Experiments on two ODConvQA tasks, namely TopiOCQA and OR-QuAC, show that our method improves the SotA results, while reducing reader's latency by 60%. Finally, we provide new and valuable insights into the development of challenging baselines that serve as a reference for future, more intricate approaches, including those that leverage Large Language Models (LLMs).	翻訳日:2023-10-24 21:18:09 公開日:2023-10-23
# 試行と観測データを組み合わせた外部妥当性評価 Externally Valid Policy Evaluation Combining Trial and Observational Data ( http://arxiv.org/abs/2310.14763v1 ) ライセンス: Link先を確認	Sofia Ek, Dave Zachariah	(参考訳) ランダム化試験は意思決定政策の効果を評価するための金の基準として広く考えられている。しかし、試行データは意図された対象人口と異なる集団から引き出されたものであり、これは外的妥当性(つまり一般化可能性)の問題を引き起こす。本稿では,対象人口に対する政策の結果について,有効な推測を行うために試行データを用いた。対象個体群からの追加の共変量データは、試験研究における個人のサンプリングをモデル化するために使用される。特定のモデルミスカバリレーションの範囲で検証可能な試行ベースの政策評価を行う手法を開発した。この方法は非パラメトリックであり、有限サンプルであっても妥当性が保証される。認証されたポリシー評価は、シミュレーションデータと実データの両方を用いて図示される。 Randomized trials are widely considered as the gold standard for evaluating the effects of decision policies. Trial data is, however, drawn from a population which may differ from the intended target population and this raises a problem of external validity (aka. generalizability). In this paper we seek to use trial data to draw valid inferences about the outcome of a policy on the target population. Additional covariate data from the target population is used to model the sampling of individuals in the trial study. We develop a method that yields certifiably valid trial-based policy evaluations under any specified range of model miscalibrations. The method is nonparametric and the validity is assured even with finite samples. The certified policy evaluations are illustrated using both simulated and real data.	翻訳日:2023-10-24 21:09:39 公開日:2023-10-23
# SuperTweetEval:ソーシャルメディアNLP研究のための混成、統一、不均一なベンチマーク SuperTweetEval: A Challenging, Unified and Heterogeneous Benchmark for Social Media NLP Research ( http://arxiv.org/abs/2310.14757v1 ) ライセンス: Link先を確認	Dimosthenis Antypas, Asahi Ushio, Francesco Barbieri, Leonardo Neves, Kiamehr Rezaee, Luis Espinosa-Anke, Jiaxin Pei, Jose Camacho-Collados	(参考訳) その関連性にもかかわらず、ソーシャルメディアにおけるNLPの成熟度は、汎用モデル、メトリクス、ベンチマークと比較すると未熟である。この断片化された状況は、例えば、最高のパフォーマンスモデルであるタスクと、それが他とどのように比較されるかを、コミュニティが知るのを難しくします。この問題を軽減するため,ソーシャルメディアにおけるNLP評価の統一ベンチマークであるSuperTweetEvalを導入する。 SuperTweetEvalで幅広いモデルのパフォーマンスをベンチマークした結果、最近の言語モデリングの進歩にもかかわらず、ソーシャルメディアは依然として困難な状態にあることが示唆された。 Despite its relevance, the maturity of NLP for social media pales in comparison with general-purpose models, metrics and benchmarks. This fragmented landscape makes it hard for the community to know, for instance, given a task, which is the best performing model and how it compares with others. To alleviate this issue, we introduce a unified benchmark for NLP evaluation in social media, SuperTweetEval, which includes a heterogeneous set of tasks and datasets combined, adapted and constructed from scratch. We benchmarked the performance of a wide range of models on SuperTweetEval and our results suggest that, despite the recent advances in language modelling, social media remains challenging.	翻訳日:2023-10-24 21:09:27 公開日:2023-10-23
# 二次元ランダム横場イジング強磁性体の有限サイズスケーリング解析 Finite-size scaling analysis of the two-dimensional random transverse-field Ising ferromagnet ( http://arxiv.org/abs/2310.14756v1 ) ライセンス: Link先を確認	Jiwon Choi, Seung Ki Baek	(参考訳) ランダム横磁場Ising ferromagnet (RTFIF) は、結合強度と横磁場強度のランダム性を含む高度に乱れた量子系である。ある次元において、臨界特性は無限ランダム不動点(IRFP)によって支配され、再正規化群の研究は、2次元(2D)モデルもIRFPによって支配されていると主張している。しかし、臨界点の位置でさえ、量子モンテカルロ (QMC) の研究では未定のままである。本研究では,量子臨界点を見つけるための広範囲なQMCシミュレーションを行い,有限スケールスケール解析を試み,臨界挙動を観察する。 2D RTFIF の臨界場強度を$\Gamma_c = 7.52(2)$, $\beta=1.5(3)$, $\nu = 1.6(3)$, $z=3.3(3)$, $\psi=0.50(3)$と推定する。また、強磁性結合強度はランダムであるが、横磁場強度はランダム性を持たないマコイ・ウーモデルも検討した。 qmc計算の結果, 2d mccoy-wuモデルの臨界挙動は2d rtfifよりも2d横場イジングスピングラスの臨界挙動に近いことがわかった。これらの数値的な知見は、乱れた2次元量子システムの理解を深める。 The random transverse-field Ising ferromagnet (RTFIF) is a highly disordered quantum system which contains randomness in the coupling strengths as well as in the transverse-field strengths. In one dimension, the critical properties are governed by an infinite-randomness fixed point (IRFP), and renormalization-group studies argue that the two-dimensional (2D) model is also governed by an IRFP. However, even the location of the critical point remains unsettled among quantum Monte Carlo (QMC) studies. In this work, we perform extensive QMC simulations to locate the quantum critical point and attempt a finite-size scaling analysis to observe the critical behavior. We estimate the critical field strength of the 2D RTFIF as $\Gamma_c = 7.52(2)$, together with critical exponents such as $\beta=1.5(3)$, $\nu = 1.6(3)$, and $z=3.3(3)$ or $\psi=0.50(3)$. We have also considered the McCoy-Wu model, which has randomness in the ferromagnetic coupling strengths but not in the transverse-field strength. Our QMC calculation shows that the critical behavior of the 2D McCoy-Wu model is closer to that of the 2D transverse-field Ising spin glass than to that of the 2D RTFIF. These numerical findings enhance our understanding of disordered 2D quantum systems.	
翻訳日:2023-10-24 21:09:16 公開日:2023-10-23
# 分子のマスクグラフモデリングにおけるトケナイザとデコーダの再考 Rethinking Tokenizer and Decoder in Masked Graph Modeling for Molecules ( http://arxiv.org/abs/2310.14753v1 ) ライセンス: Link先を確認	Zhiyuan Liu, Yaorui Shi, An Zhang, Enzhi Zhang, Kenji Kawaguchi, Xiang Wang, Tat-Seng Chua	(参考訳) マスクグラフモデリングは、分子グラフの自己教師あり表現学習において優れている。従来の研究では,(1) 分子グラフを小さな断片(サブグラフ)に分解してトークンに変換するグラフトークンライザ,(2) マスクでグラフを破損させるグラフマスキング,(3) マスクグラフにエンコーダを塗布して表現を生成するグラフオートエンコーダ, そして, その表現にデコーダを用いて, 元のグラフのトークンを復元する。しかし、以前のmgmの研究はグラフマスキングとエンコーダに焦点を当てているが、トークン化とデコーダの理解は限られている。このギャップを埋めるために、我々はまず、ノード、エッジ、モチーフ、グラフニューラルネットワーク(GNN)の粒度で一般的な分子トークン化装置を要約し、その役割をMGMの再構築ターゲットとして検討した。さらに,MGMに表現型デコーダを採用する可能性についても検討する。この結果から, サブグラフレベルのトークン化器とremask復号化デコーダがエンコーダの表現学習に大きな影響を与えることがわかった。最後に,単純なGNNベースのTokenizer(SGT)と効果的な復号化戦略を備えた,新しいMGM手法SimSGTを提案する。本手法が既存の分子自己教師学習法より優れていることを実証的に検証した。私たちのコードとチェックポイントはhttps://github.com/syr-cn/simsgtで利用可能です。 Masked graph modeling excels in the self-supervised representation learning of molecular graphs. Scrutinizing previous studies, we can reveal a common scheme consisting of three key components: (1) graph tokenizer, which breaks a molecular graph into smaller fragments (i.e., subgraphs) and converts them into tokens; (2) graph masking, which corrupts the graph with masks; (3) graph autoencoder, which first applies an encoder on the masked graph to generate the representations, and then employs a decoder on the representations to recover the tokens of the original graph. However, the previous MGM studies focus extensively on graph masking and encoder, while there is limited understanding of tokenizer and decoder. To bridge the gap, we first summarize popular molecule tokenizers at the granularity of node, edge, motif, and Graph Neural Networks (GNNs), and then examine their roles as the MGM's reconstruction targets. Further, we explore the potential of adopting an expressive decoder in MGM. 
Our results show that a subgraph-level tokenizer and a sufficiently expressive decoder with remask decoding have a large impact on the encoder's representation learning. Finally, we propose a novel MGM method SimSGT, featuring a Simple GNN-based Tokenizer (SGT) and an effective decoding strategy. We empirically validate that our method outperforms the existing molecule self-supervised learning methods. Our code and checkpoints are available at https://github.com/syr-cn/SimSGT.	翻訳日:2023-10-24 21:08:50 公開日:2023-10-23
# 効率的かつ解釈可能なバンディットアルゴリズム Efficient and Interpretable Bandit Algorithms ( http://arxiv.org/abs/2310.14751v1 ) ライセンス: Link先を確認	Subhojyoti Mukherjee, Ruihao Zhu, Branislav Kveton	(参考訳) 現代の機械学習における説明可能性の重要性に動機づけられ、効率的かつ解釈可能なバンディットアルゴリズムを設計した。バンディットアルゴリズムは、未知のモデルパラメータの不確実性を減らす目的で探索するときに解釈可能である。解釈可能性の定量化にあたり,不確実性低減率と理論的最適値を比較する新しい指標である不確実性損失(uncertainty loss)を導入する。我々は、解釈可能で最大に不確かさを低減できる、制約付き最適設計(Constrained Optimal DEsign)に基づくバンディットアルゴリズムであるCODEを提案する。 CODEのキーとなるアイデアは、統計的制約によって決定されるすべての可算的なアクションを探索し、解釈可能性を達成することである。我々は, 近似最適設計の最適基準を利用して, 多腕バンディットと線形バンディットの両方でCODEを効率的に実装し, ほぼ最適後悔境界を導出する。また、CODEは従来の位相除去の位相を除去するものと見なすことができ、より実用的で一般的なものである。合成問題と実世界問題の両方において数値実験により, CODEの利点を示す。 CODEは他の最先端の解釈可能な設計よりも優れており、高い信頼度境界アルゴリズムのような一般的なが解釈不能な設計の性能と一致している。 Motivated by the importance of explainability in modern machine learning, we design bandit algorithms that are efficient and interpretable. A bandit algorithm is interpretable if it explores with the objective of reducing uncertainty in the unknown model parameter. To quantify the interpretability, we introduce a novel metric of uncertainty loss, which compares the rate of the uncertainty reduction to the theoretical optimum. We propose CODE, a bandit algorithm based on a Constrained Optimal DEsign, that is interpretable and maximally reduces the uncertainty. The key idea in CODE is to explore among all plausible actions, determined by a statistical constraint, to achieve interpretability. We implement CODE efficiently in both multi-armed and linear bandits and derive near-optimal regret bounds by leveraging the optimality criteria of the approximate optimal design. CODE can also be viewed as removing phases in conventional phased elimination, which makes it more practical and general. We demonstrate the advantage of CODE by numerical experiments on both synthetic and real-world problems. 
CODE outperforms other state-of-the-art interpretable designs while matching the performance of popular but uninterpretable designs, such as upper confidence bound algorithms.	翻訳日:2023-10-24 21:08:22 公開日:2023-10-23
# MCC-KD: マルチCoT一貫性知識蒸留 MCC-KD: Multi-CoT Consistent Knowledge Distillation ( http://arxiv.org/abs/2310.14747v1 ) ライセンス: Link先を確認	Hongzhan Chen, Siyue Wu, Xiaojun Quan, Rui Wang, Ming Yan, Ji Zhang	(参考訳) 大規模言語モデル(LLM)は、思考の連鎖(CoT)プロンプトによる複雑な推論において顕著な能力を示した。近年、これらの推論能力をLLMからより小さなモデルへ転移することへの関心が高まっている。しかしながら、根拠 (rationale) における多様性と一貫性の両立が課題となっている。本稿では,これら2つの側面の強化に着目し,推論能力を効率的に蒸留するMCC-KD (Multi-CoT Consistent Knowledge Distillation) を提案する。 MCC-KDでは,各質問に対して複数の根拠を生成し,回答分布間の双方向KLダイバージェンスを最小化することにより,対応する予測間の一貫性を強制する。数学的推論と常識推論の両方のベンチマークにおいて,様々なモデルアーキテクチャ (LLaMA/FlanT5) と様々なモデルスケール (3B/7B/11B/13B) によるMCC-KDの有効性を検証した。実験の結果は、MCC-KDの分布内データセットにおける優れた性能を確認するだけでなく、分布外データセットに対する堅牢な一般化能力を強調している。 Large language models (LLMs) have showcased remarkable capabilities in complex reasoning through chain of thought (CoT) prompting. Recently, there has been a growing interest in transferring these reasoning abilities from LLMs to smaller models. However, achieving both the diversity and consistency in rationales presents a challenge. In this paper, we focus on enhancing these two aspects and propose Multi-CoT Consistent Knowledge Distillation (MCC-KD) to efficiently distill the reasoning capabilities. In MCC-KD, we generate multiple rationales for each question and enforce consistency among the corresponding predictions by minimizing the bidirectional KL-divergence between the answer distributions. We investigate the effectiveness of MCC-KD with different model architectures (LLaMA/FlanT5) and various model scales (3B/7B/11B/13B) on both mathematical reasoning and commonsense reasoning benchmarks. The empirical results not only confirm MCC-KD's superior performance on in-distribution datasets but also highlight its robust generalization ability on out-of-distribution datasets.	翻訳日:2023-10-24 21:07:55 公開日:2023-10-23
# 実世界1型糖尿病管理における深層学習の安全性の課題 The Safety Challenges of Deep Learning in Real-World Type 1 Diabetes Management ( http://arxiv.org/abs/2310.14743v1 ) ライセンス: Link先を確認	Harry Emerson, Ryan McConville and Matthew Guy	(参考訳) 血糖シミュレーションにより、患者に害を与えずに1型糖尿病(T1D)管理戦略の有効性を評価できる。深層学習アルゴリズムは、シミュレータ機能を拡張する有望な手段を提供するが、これらのアルゴリズムは、必ずしも生理的に正しいグルコースダイナミクスを学習せず、訓練データ中の交絡因子から不正確で潜在的に危険な関係を学習しうるという点で制限されている。これは、厳格な研究プロトコルの下でデータが収集されない現実のシナリオでは、より重要である可能性が高い。この研究は、現実世界のデータで訓練されたディープラーニングアルゴリズムを使用してグルコースのダイナミクスをモデル化することの意味を探求する。フリーリビングデータはOpenAPS Data Commonsから処理され、困難な糖尿病イベントに関する患者報告のタグが補足され、最も詳細な実世界T1Dデータセットの一つを構成する。このデータセットは、最先端のグルコースシミュレータのトレーニングと評価、安全上重要なシナリオ間での予測誤差の比較、Shapley Additive Explanations (SHAP)を用いた学習されたダイナミクスの生理的適切性の評価に使用された。深層学習の予測精度は、広く使われている数学的シミュレータのアプローチを上回ったが、安全上重要なシナリオにおいてモデルの性能は悪化し、自己申告された食事や運動情報を活用するのに苦労した。 SHAP値分析は、最も基本的なT1D管理原則の一つであるインスリンと炭水化物の役割をモデルが根本的に混同していたことも示している。本研究は,T1Dおよびより広く医療における実世界のシステムモデリングにディープラーニングを用いる場合に生理的適切性を検討することの重要性を強調し,実世界のデータ制約にロバストなモデルを構築するための推奨事項を示す。 Blood glucose simulation allows the effectiveness of type 1 diabetes (T1D) management strategies to be evaluated without patient harm. Deep learning algorithms provide a promising avenue for extending simulator capabilities; however, these algorithms are limited in that they do not necessarily learn physiologically correct glucose dynamics and can learn incorrect and potentially dangerous relationships from confounders in training data. This is likely to be more important in real-world scenarios, as data is not collected under strict research protocol. This work explores the implications of using deep learning algorithms trained on real-world data to model glucose dynamics. Free-living data was processed from the OpenAPS Data Commons and supplemented with patient-reported tags of challenging diabetes events, constituting one of the most detailed real-world T1D datasets.
This dataset was used to train and evaluate state-of-the-art glucose simulators, comparing their prediction error across safety critical scenarios and assessing the physiological appropriateness of the learned dynamics using Shapley Additive Explanations (SHAP). While deep learning prediction accuracy surpassed the widely-used mathematical simulator approach, the model deteriorated in safety critical scenarios and struggled to leverage self-reported meal and exercise information. SHAP value analysis also indicated the model had fundamentally confused the roles of insulin and carbohydrates, which is one of the most basic T1D management principles. This work highlights the importance of considering physiological appropriateness when using deep learning to model real-world systems in T1D and healthcare more broadly, and provides recommendations for building models that are robust to real-world data constraints.	翻訳日:2023-10-24 21:07:33 公開日:2023-10-23
# SAMCLR:ビューサンプリングにSAMを用いた複雑なシーンでのコントラスト事前トレーニング SAMCLR: Contrastive pre-training on complex scenes using SAM for view sampling ( http://arxiv.org/abs/2310.14736v1 ) ライセンス: Link先を確認	Benjamin Missaoui, Chongbin Yuan	(参考訳) コンピュータビジョンにおいて、自己教師ありコントラスト学習は、同じ画像の異なるビュー間で同様の表現を強制する。事前トレーニングは、画像が主に単一クラスのオブジェクトを含むImageNetのような画像分類データセット上で実施されることが多い。しかし、複数のアイテムを含む複雑なシーンを扱う場合、同じ画像の複数のビューが同じオブジェクトカテゴリを表す可能性は非常に低くなる。そこで本研究では,SAMを用いて画像をセマンティック領域に分割し,同じ領域から2つのビューをサンプリングするSimCLRのアドオンであるSAMCLRを提案する。予備的な結果は、Cityscapes と ADE20K で事前トレーニングを行い、CIFAR-10, STL10, ImageNette の分類で評価すると、SAMCLR が SimCLR だけでなく DINO や MoCo に対しても少なくとも同等の性能を示し、多くの場合は大幅に上回ることを実証的に示している。 In Computer Vision, self-supervised contrastive learning enforces similar representations between different views of the same image. The pre-training is most often performed on image classification datasets, like ImageNet, where images mainly contain a single class of objects. However, when dealing with complex scenes with multiple items, it becomes very unlikely for several views of the same image to represent the same object category. In this setting, we propose SAMCLR, an add-on to SimCLR which uses SAM to segment the image into semantic regions, then sample the two views from the same region. Preliminary results show empirically that when pre-training on Cityscapes and ADE20K, then evaluating on classification on CIFAR-10, STL10 and ImageNette, SAMCLR performs at least on par with, and most often significantly outperforms not only SimCLR, but also DINO and MoCo.	翻訳日:2023-10-24 21:07:05 公開日:2023-10-23
# 大規模言語モデルにおけるプロンプトエンジニアリングの可能性:包括的レビュー Unleashing the potential of prompt engineering in Large Language Models: a comprehensive review ( http://arxiv.org/abs/2310.14735v1 ) ライセンス: Link先を確認	Banghao Chen, Zhaofeng Zhang, Nicolas Langrené, Shengxin Zhu	(参考訳) 本稿では,Large Language Models (LLMs) の能力を解き放つ上で,プロンプトエンジニアリングが果たす重要な役割について述べる。 Prompt Engineering は LLM の入力テキストを構造化するプロセスであり、LLM の有効性を最適化するための技術である。この調査は、ロールプロンプトやワンショット、少数ショットプロンプトといったプロンプトエンジニアリングの基本原則と、チェーン・オブ・ソートやツリー・オブ・ソート・プロンプトのようなより高度な方法論を解明する。本論文は, プラグイン形式の外部支援がこの課題にどのように役立ち, 外部知識の検索によって機械の幻覚をどのように低減できるかを明らかにする。続いて,プロンプトエンジニアリング研究の今後の方向性を示し,AIGC (Artificial Intelligence-Generated Content) ツールにおける構造とエージェントの役割についてより深く理解することの必要性を強調する。本稿では,異なる視点からプロンプト手法の有効性を評価し,異なる手法を用いて評価する方法について議論する。最後に,教育やプログラミングといった分野におけるプロンプトエンジニアリングの適用に関する情報を集め,その変革の可能性を示す。この包括的な調査は、LLMとプロンプトエンジニアリングの広大な世界を探索する誰にとっても、フレンドリーなガイドになることを目的としている。 This paper delves into the pivotal role of prompt engineering in unleashing the capabilities of Large Language Models (LLMs). Prompt engineering is the process of structuring input text for LLMs and is a technique integral to optimizing the efficacy of LLMs. This survey elucidates foundational principles of prompt engineering, such as role-prompting, one-shot, and few-shot prompting, as well as more advanced methodologies such as the chain-of-thought and tree-of-thoughts prompting. The paper sheds light on how external assistance in the form of plugins can assist in this task, and reduce machine hallucination by retrieving external knowledge. We subsequently delineate prospective directions in prompt engineering research, emphasizing the need for a deeper understanding of structures and the role of agents in Artificial Intelligence-Generated Content (AIGC) tools. We discuss how to assess the efficacy of prompt methods from different perspectives and using different methods. Finally, we gather information about the application of prompt engineering in such fields as education and programming, showing its transformative potential.
This comprehensive survey aims to serve as a friendly guide for anyone venturing through the big world of LLMs and prompt engineering.	翻訳日:2023-10-24 21:06:47 公開日:2023-10-23
# 大規模言語モデルと言語規則を用いた矛盾検出のためのプロトタイプの作成 Generating Prototypes for Contradiction Detection Using Large Language Models and Linguistic Rules ( http://arxiv.org/abs/2310.14732v1 ) ライセンス: Link先を確認	Maren Pielka, Svetlana Schmidt, Rafet Sifa	(参考訳) 本稿では,大規模言語モデルの生成能力と言語規則を活用する,矛盾検出のための新しいデータ生成手法を提案する。我々のビジョンは、典型的な矛盾を凝縮したコーパスを提供することであり、より深い言語分析と効率的な言語モデル微調整を可能にする。この目的のために、特定の矛盾型の記述に関して矛盾するステートメントを作成するように生成モデルに指示する。さらに、このモデルは完全に新しい矛盾の類型を思いつくよう指示されている。補助的アプローチとして,否定や反義,数値の不一致などから生じる単純な矛盾を,言語規則を用いて構築する。我々の手法はデータの一貫性と多様性の観点から有望な結果をもたらす。機械学習のセットアップでこのデータを利用するには、さらなる研究と手作業による改良が必要である。 We introduce a novel data generation method for contradiction detection, which leverages the generative power of large language models as well as linguistic rules. Our vision is to provide a condensed corpus of prototypical contradictions, allowing for in-depth linguistic analysis as well as efficient language model fine-tuning. To this end, we instruct the generative models to create contradicting statements with respect to descriptions of specific contradiction types. In addition, the model is also instructed to come up with completely new contradiction typologies. As an auxiliary approach, we use linguistic rules to construct simple contradictions such as those arising from negation, antonymy and numeric mismatch. We find that our methods yield promising results in terms of coherence and variety of the data. Further studies, as well as manual refinement are necessary to make use of this data in a machine learning setup.	翻訳日:2023-10-24 21:06:21 公開日:2023-10-23
# 言語生成における地理的消去 Geographical Erasure in Language Generation ( http://arxiv.org/abs/2310.14777v1 ) ライセンス: Link先を確認	Pola Schwöbel, Jacek Golebiowski, Michele Donini, Cédric Archambeau, Danish Pruthi	(参考訳) 大規模言語モデル(LLM)は膨大な量の世界の知識を符号化する。しかし、これらのモデルは大量のインターネットデータに基づいて訓練されているため、支配的なグループに関する情報を過度に取り込むリスクがある。この不均衡は生成された言語に伝播しうる。本研究では,言語モデルが特定の国を過小に予測する,地理的消去の一形態を研究し,操作的に定義する。様々なLLMに対して一貫した消去例を示す。その結果, 消去は, トレーニングコーパスにおける国名の言及頻度の低さと強く相関していることが判明した。最後に,カスタム目的関数を用いた微調整により消去を緩和する。 Large language models (LLMs) encode vast amounts of world knowledge. However, since these models are trained on large swaths of internet data, they are at risk of inordinately capturing information about dominant groups. This imbalance can propagate into generated language. In this work, we study and operationalise a form of geographical erasure, wherein language models underpredict certain countries. We demonstrate consistent instances of erasure across a range of LLMs. We discover that erasure strongly correlates with low frequencies of country mentions in the training corpus. Lastly, we mitigate erasure by finetuning using a custom objective.	翻訳日:2023-10-24 20:59:18 公開日:2023-10-23
# 2チャネル近藤効果への多体量子干渉経路:分子接合の逆設計 Many-body quantum interference route to the two-channel Kondo effect: Inverse design for molecular junctions ( http://arxiv.org/abs/2310.14775v1 ) ライセンス: Link先を確認	Sudeshna Sen and Andrew K. Mitchell	(参考訳) 分子接合は、ナノワイヤブレークジャンクション中の実際の単一分子であれ、結合量子ドットデバイスで実現される人工分子であれ、軌道の複雑さ、強い電子相互作用、ゲート制御、外部電子回路とのハイブリダイゼーションによる多体効果などにより、ユニークな機能を持つ。逆設計では、望ましい機能を最適に実行する候補構造を見つける。ここでは、分子接合を記述する一般化量子不純物モデルのための逆設計戦略を開発し、その例として、多体量子干渉を利用して単純な4サイトまたは5サイトの分子部位で2チャンネル近藤臨界点を実現できることを実証する。極めて高い近藤温度を達成できることを示し,これはエントロピーと輸送のシグネチャが実験的に観測可能なはずであることを意味する。 Molecular junctions -- whether actual single molecules in nanowire break junctions or artificial molecules realized in coupled quantum dot devices -- offer unique functionality due to their orbital complexity, strong electron interactions, gate control, and many-body effects from hybridization with the external electronic circuit. Inverse design involves finding candidate structures that perform a desired function optimally. Here we develop an inverse design strategy for generalized quantum impurity models describing molecular junctions, and as an example, use it to demonstrate that many-body quantum interference can be leveraged to realize the two-channel Kondo critical point in simple 4- or 5-site molecular moieties. We show that remarkably high Kondo temperatures can be achieved, meaning that entropy and transport signatures should be experimentally accessible.	翻訳日:2023-10-24 20:59:12 公開日:2023-10-23
# 複数の専門家への委譲学習のための原則的アプローチ Principled Approaches for Learning to Defer with Multiple Experts ( http://arxiv.org/abs/2310.14774v1 ) ライセンス: Link先を確認	Anqi Mao, Mehryar Mohri, Yutao Zhong	(参考訳) 本稿では,複数の専門家への委譲学習 (learning to defer) という一般的な問題に対するサロゲート損失とアルゴリズムについて検討する。本稿では,まず,予測関数と委譲関数を同時に学習するマルチエキスパート設定に適したサロゲート損失の新たなファミリーを導入する。そして、これらのサロゲート損失が強い$H$-consistency boundsの恩恵を受けることを証明する。本解析の適用例として,実用的なサロゲート損失の例をいくつか示し,それらに対する明示的な保証を与える。これらの損失関数は、その最小化に基づく新しい委譲学習アルゴリズムの設計に直結する。本研究の主な焦点は理論解析であるが,SVHNとCIFAR-10データセットに関するいくつかの実験の結果も報告する。 We present a study of surrogate losses and algorithms for the general problem of learning to defer with multiple experts. We first introduce a new family of surrogate losses specifically tailored for the multiple-expert setting, where the prediction and deferral functions are learned simultaneously. We then prove that these surrogate losses benefit from strong $H$-consistency bounds. We illustrate the application of our analysis through several examples of practical surrogate losses, for which we give explicit guarantees. These loss functions readily lead to the design of new learning to defer algorithms based on their minimization. While the main focus of this work is a theoretical analysis, we also report the results of several experiments on SVHN and CIFAR-10 datasets.	翻訳日:2023-10-24 20:58:55 公開日:2023-10-23
# 予測器・リジェクタ・マルチクラスアブステンション:理論的解析とアルゴリズム Predictor-Rejector Multi-Class Abstention: Theoretical Analysis and Algorithms ( http://arxiv.org/abs/2310.14772v1 ) ライセンス: Link先を確認	Anqi Mao, Mehryar Mohri, Yutao Zhong	(参考訳) 本研究は,多クラス分類における棄権 (abstention) 付き学習という重要な枠組みについて検討する。この設定では、学習者は事前に定義されたコストを払って予測を棄権することを選択できる。本稿では,この学習問題に対する一連の新しい理論的,アルゴリズム的結果を予測器-リジェクタフレームワークで提示する。我々は,強い非漸近的かつ仮説集合固有の一貫性保証を証明できるサロゲート損失の新たなファミリーをいくつか導入し,それによって既存の2つの未解決問題を肯定的に解決する。これらの保証は、サロゲート損失の推定誤差の観点から、棄権損失関数の推定誤差の上限を与える。予測器とリジェクタを同時に学習するシングルステージ設定と,アプリケーションにおいて重要な2段階設定の両方を分析する。後者では,クロスエントロピーなどの標準的なサロゲート損失を用いて第1段階で予測器を学習する。これらの保証は、これらのサロゲート損失を最小化することに基づく、新しいマルチクラス棄権アルゴリズムを示唆する。また,これらのアルゴリズムをCIFAR-10,CIFAR-100,SVHNデータセット上で最先端アルゴリズムと比較した広範な実験の結果について報告する。その結果,新しいサロゲート損失の利点を実証し,広く適用可能な2段階棄権アルゴリズムの優れた性能を示すことができた。 We study the key framework of learning with abstention in the multi-class classification setting. In this setting, the learner can choose to abstain from making a prediction with some pre-defined cost. We present a series of new theoretical and algorithmic results for this learning problem in the predictor-rejector framework. We introduce several new families of surrogate losses for which we prove strong non-asymptotic and hypothesis set-specific consistency guarantees, thereby resolving positively two existing open questions. These guarantees provide upper bounds on the estimation error of the abstention loss function in terms of that of the surrogate loss. We analyze both a single-stage setting where the predictor and rejector are learned simultaneously and a two-stage setting crucial in applications, where the predictor is learned in a first stage using a standard surrogate loss such as cross-entropy. These guarantees suggest new multi-class abstention algorithms based on minimizing these surrogate losses. We also report the results of extensive experiments comparing these algorithms to the current state-of-the-art algorithms on CIFAR-10, CIFAR-100 and SVHN datasets.
Our results demonstrate empirically the benefit of our new surrogate losses and show the remarkable performance of our broadly applicable two-stage abstention algorithm.	翻訳日:2023-10-24 20:58:45 公開日:2023-10-23
# GPTの知識ベース補完可能性の評価 Evaluating the Knowledge Base Completion Potential of GPT ( http://arxiv.org/abs/2310.14771v1 ) ライセンス: Link先を確認	Blerta Veseli, Simon Razniewski, Jan-Christoph Kalo, Gerhard Weikum	(参考訳) 構造化知識ベース (KB) は検索エンジンや他のアプリケーションのための資産であるが、必然的に不完全である。言語モデル (LM) は教師なし知識ベース補完 (KBC) のために提案されているが, これを大規模かつ高精度に行う能力は未解決の問題のままである。以前の実験的研究は、人気のある主題のみを評価するか、KBに既に存在する事実をサンプリングしているため、ほとんどが不十分であった。本稿では,GPT が最大の公開 KB である Wikidata を補完する可能性について,慎重に評価する。 GPT-3 や ChatGPT や GPT-4 のようなモデルは,そのサイズや能力にもかかわらず,この課題に対して完全に納得できる結果を達成していない。それでも、より小さなLMを用いた以前のアプローチに対しては、堅実な改善を提供する。特に, GPT-3では, 適切なしきい値設定により, Wikidataを90%の精度で2700万件の事実分だけ拡張できることを示す。 Structured knowledge bases (KBs) are an asset for search engines and other applications, but are inevitably incomplete. Language models (LMs) have been proposed for unsupervised knowledge base completion (KBC), yet, their ability to do this at scale and with high accuracy remains an open question. Prior experimental studies mostly fall short because they only evaluate on popular subjects, or sample already existing facts from KBs. In this work, we perform a careful evaluation of GPT's potential to complete the largest public KB: Wikidata. We find that, despite their size and capabilities, models like GPT-3, ChatGPT and GPT-4 do not achieve fully convincing results on this task. Nonetheless, they provide solid improvements over earlier approaches with smaller LMs. In particular, we show that, with proper thresholding, GPT-3 enables to extend Wikidata by 27M facts at 90% precision.	翻訳日:2023-10-24 20:58:25 公開日:2023-10-23

